Skip to content

Codeguruu03/NL2SQL_Agent

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

9 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Multi-Agent SQL Query System 🧠

Python Streamlit OpenRouter Live Demo CI/CD Tests PostgreSQL

A sophisticated multi-agent AI system that converts natural language queries into SQL, executes them, and provides intelligent insights with automatic visualizations. Built with four specialized AI agents working collaboratively.

πŸ”— Try the Live Demo β†’


πŸ“‹ Table of Contents


🎯 Features

Multi-Agent Architecture

  • πŸ€– Chat Agent: Main orchestrator that coordinates the entire workflow
  • πŸ’Ύ SQL Agent: LLM-powered natural language to SQL converter
  • πŸ“Š Summary Agent: Generates human-friendly insights from query results

Intelligent Query Processing

  • Convert natural language to SQL queries using AI
  • Support for complex queries (joins, aggregations, filtering)
  • Automatic schema awareness and adaptation
  • Query validation and safety checks

Flexible Database Connectivity

  • Default SQLite database with sample organizational data
  • Connect to any custom SQLite database file
  • Auto-detect schema from connected databases
  • Real-time database switching

Advanced Features

  • AI-generated insights and summaries
  • Error handling with helpful messages
  • Retry logic for API failures
  • Modern tabbed UI for results display
  • Real-time query execution

πŸ—οΈ System Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    User Interface                        β”‚
β”‚              (Streamlit Web Application)                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                       β”‚
                       ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                   Chat Agent                             β”‚
β”‚              (Orchestrator & Router)                     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
               β”‚                       β”‚
               ↓                       ↓
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚   SQL Agent      β”‚    β”‚  Summary Agent   β”‚
    β”‚                  β”‚    β”‚                  β”‚
    β”‚ - Schema Analysisβ”‚    β”‚ - Result Analysisβ”‚
    β”‚ - SQL Generation β”‚    β”‚ - Insight Gen.   β”‚
    β”‚ - Query Executionβ”‚    β”‚ - Summarization  β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
             β”‚                      β”‚
             ↓                      ↓
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚      OpenRouter LLM API            β”‚
    β”‚  (Llama 3.1 8B Instruct - Free)    β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
             β”‚
             ↓
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚      SQLite Database               β”‚
    β”‚   (Default or Custom DB)           β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Workflow

1. User Input (Natural Language)
         ↓
2. Chat Agent receives and validates request
         ↓
3. SQL Agent:
   - Analyzes database schema
   - Generates appropriate SQL query
   - Validates query for safety
   - Executes against database
         ↓
4. Summary Agent:
   - Receives query results
   - Analyzes data patterns
   - Generates natural language summary
         ↓
5. UI displays:
   - Generated SQL query
   - Results table
   - AI-generated insights

πŸ€– Agent Details

1. Chat Agent (chat_agent.py)

Role: Main orchestrator and coordinator

Responsibilities:

  • Receives user queries from the UI
  • Routes requests to appropriate agents
  • Manages agent communication flow
  • Handles errors and edge cases
  • Returns unified responses to the UI

Key Methods:

  • process(user_query, schema): Main processing pipeline
  • handle_conversation(): Alternative conversational interface

2. SQL Agent (sql_agent.py)

Role: Natural language to SQL converter

Responsibilities:

  • Analyzes database schema
  • Converts natural language to SQL using LLM
  • Validates generated queries for safety
  • Executes SQL against the database
  • Returns structured results

Key Features:

  • Lower temperature (0.3) for consistent SQL generation
  • Query validation (SELECT-only policy)
  • SQL injection prevention
  • Automatic schema injection into prompts

Key Methods:

  • process(user_query, schema): Generate and execute SQL
  • validate_query(sql): Safety validation
  • _clean_sql_response(): Parse LLM output

3. Summary Agent (summary_agent.py)

Role: Result interpreter and insight generator

Responsibilities:

  • Receives SQL query results
  • Analyzes data patterns and trends
  • Generates human-friendly summaries
  • Highlights key findings and insights

Key Features:

  • Moderate temperature (0.7) for natural language
  • Handles large result sets (up to 20 rows)
  • Fallback summaries when LLM fails
  • Contextual analysis with original query

Key Methods:

  • process(sql_query, columns, rows): Generate summary
  • _format_results(): Prepare data for LLM
  • _generate_fallback_summary(): Backup summary generation

πŸ“ Project Structure

Assignment/
β”‚
β”œβ”€β”€ 🎯 Core Application
β”‚   β”œβ”€β”€ app.py                    # Streamlit UI and main application
β”‚   └── db.py                     # Database operations and schema detection
β”‚
β”œβ”€β”€ πŸ€– Multi-Agent System
β”‚   β”œβ”€β”€ agent_base.py             # Abstract base class for all agents
β”‚   β”œβ”€β”€ chat_agent.py             # Orchestrator agent
β”‚   β”œβ”€β”€ sql_agent.py              # SQL generation agent
β”‚   └── summary_agent.py          # Result interpretation agent
β”‚
β”œβ”€β”€ πŸ”§ Infrastructure
β”‚   └── openrouter_client.py      # OpenRouter API client wrapper
β”‚
β”œβ”€β”€ βš™οΈ Configuration
β”‚   β”œβ”€β”€ .env                      # Environment variables (API keys)
β”‚   β”œβ”€β”€ .env.example              # Environment template
β”‚   β”œβ”€β”€ requirements.txt          # Python dependencies
β”‚   └── .gitignore               # Git ignore rules
β”‚
β”œβ”€β”€ πŸ’Ύ Data
β”‚   └── test_db.sqlite           # Default SQLite database (auto-created)
β”‚
└── πŸ“š Documentation
    └── README.md                 # This file

File Descriptions

File Lines Purpose
app.py ~210 Main Streamlit application with UI and database connectivity
agent_base.py ~105 Base class providing common agent functionality
chat_agent.py ~165 Orchestrator coordinating SQL and Summary agents
sql_agent.py ~210 Natural language to SQL conversion and execution
summary_agent.py ~200 AI-powered result summarization
openrouter_client.py ~150 OpenRouter API client with retry logic
db.py ~250 Database operations, schema detection, connection management

πŸ“Š Database Schema

Default Database (test_db.sqlite)

The system includes a sample organizational database:

Tables

1. department

CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY AUTOINCREMENT,
    dept_name TEXT NOT NULL UNIQUE
);
  • Stores organizational departments
  • Sample: Engineering, HR, Finance

2. usermaster

CREATE TABLE usermaster (
    user_id INTEGER PRIMARY KEY AUTOINCREMENT,
    full_name TEXT NOT NULL,
    email TEXT UNIQUE NOT NULL,
    phone TEXT,
    dept_id INTEGER,
    FOREIGN KEY (dept_id) REFERENCES department(dept_id)
);
  • Stores user/employee information
  • Links to departments via foreign key
  • Sample: Alice Johnson, Bob Sharma, etc.

3. UserSkillAndRatings

CREATE TABLE UserSkillAndRatings (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL,
    skill_name TEXT NOT NULL,
    rating INTEGER NOT NULL,
    FOREIGN KEY (user_id) REFERENCES usermaster(user_id)
);
  • Stores user skills and proficiency ratings (1-5)
  • Sample: Python (5), SQL (4), etc.

Sample Data

Table Rows Description
department 3 Engineering, HR, Finance
usermaster 4 Alice, Bob, Charlie, Deepa
UserSkillAndRatings 7 Various skills with ratings

Custom Databases

The system supports any SQLite database through:

  • Auto-schema detection using PRAGMA queries
  • Dynamic table and column discovery
  • Foreign key relationship mapping
  • Automatic schema formatting for LLM context

πŸš€ Quick Start

Prerequisites

  • Python 3.11 or higher
  • Internet connection (for OpenRouter API)
  • OpenRouter API key (free tier available)

Installation

Step 1: Clone/Download the Project

cd Assignment

Step 2: Create Virtual Environment

# Create virtual environment
python -m venv venv

# Activate (Windows)
.\venv\Scripts\activate

# Activate (Linux/macOS)
source venv/bin/activate

Step 3: Install Dependencies

pip install -r requirements.txt

Dependencies installed:

  • streamlit - Web UI framework
  • pandas - Data manipulation
  • requests - HTTP client for API calls
  • python-dotenv - Environment variable management

Step 4: Configure API Key

  1. Get free API key from OpenRouter
  2. Copy the template:
    copy .env.example .env
  3. Edit .env and add your key:
    OPENROUTER_API_KEY=sk-or-v1-your_actual_key_here
    OPENROUTER_MODEL=meta-llama/llama-3.1-8b-instruct:free

Step 5: Run the Application

streamlit run app.py

The app opens automatically at http://localhost:8501


πŸ’¬ Usage Examples

Example 1: Basic User Query

Query: "list all users"

Generated SQL:
SELECT u.user_id, u.full_name, u.email, d.dept_name 
FROM usermaster u 
LEFT JOIN department d ON u.dept_id = d.dept_id;

Results: 4 rows

Summary: "The database contains 4 users across 3 departments. 
Engineering has the most employees with 2 users, while HR and 
Finance each have 1 user."

Example 2: Department Filter

Query: "show users in engineering department"

Generated SQL:
SELECT u.user_id, u.full_name, u.email, d.dept_name 
FROM usermaster u 
JOIN department d ON u.dept_id = d.dept_id 
WHERE d.dept_name = 'Engineering';

Results: 2 rows (Alice Johnson, Charlie Singh)

Summary: "The Engineering department has 2 employees: Alice Johnson 
and Charlie Singh. Both are active users in the system."

Example 3: Aggregation Query

Query: "what's the average rating for python skill"

Generated SQL:
SELECT s.skill_name, AVG(s.rating) AS avg_rating 
FROM UserSkillAndRatings s 
WHERE s.skill_name = 'Python' 
GROUP BY s.skill_name;

Results: 1 row (Python, 4.0)

Summary: "Python has an average rating of 4.0 across all users. 
This indicates strong proficiency in Python among the team."

Example 4: Complex Query

Query: "which department has the most skilled employees"

Generated SQL:
SELECT d.dept_name, COUNT(DISTINCT s.user_id) as skilled_count
FROM department d
JOIN usermaster u ON d.dept_id = u.dept_id
JOIN UserSkillAndRatings s ON u.user_id = s.user_id
GROUP BY d.dept_name
ORDER BY skilled_count DESC;

βš™οΈ Configuration

Environment Variables

Variable Required Default Description
OPENROUTER_API_KEY βœ… Yes - Your OpenRouter API key
OPENROUTER_MODEL ❌ No meta-llama/llama-3.1-8b-instruct:free LLM model to use

Agent Configuration

Modify these parameters in agent files:

SQL Agent (sql_agent.py):

temperature=0.3      # Lower for consistent SQL
max_tokens=500       # SQL query length limit

Summary Agent (summary_agent.py):

temperature=0.7      # Higher for natural language
max_tokens=500       # Summary length limit

Database Configuration

Default Database: test_db.sqlite (auto-created)

Custom Database: Use UI sidebar to connect to any SQLite file


πŸ“š Technical Documentation

Agent Communication Protocol

Agents communicate through structured dictionaries:

# SQL Agent Output
{
    "sql_query": str,      # Generated SQL
    "columns": List[str],  # Column names
    "rows": List[Tuple]    # Result rows
}

# Summary Agent Output
{
    "summary": str         # Natural language summary
}

# Chat Agent Response
{
    "success": bool,
    "sql_query": str,
    "columns": List[str],
    "rows": List[Tuple],
    "summary": str,
    "error": str | None
}

Error Handling

Levels:

  1. API Level: OpenRouter client with 3 retries
  2. Agent Level: Try-catch with fallback behaviors
  3. UI Level: User-friendly error messages

Example:

try:
    result = chat_agent.process(query, schema)
except ValueError as e:
    # Configuration error (missing API key)
except Exception as e:
    # Runtime error (API failure, etc.)

Safety Features

Query Validation:

  • Only SELECT queries allowed
  • Blocks: DROP, DELETE, INSERT, UPDATE, ALTER, CREATE
  • SQL injection prevention through validation

Rate Limiting:

  • OpenRouter free tier: Check current limits
  • Retry logic with exponential backoff
  • Graceful degradation to fallback summaries

πŸ”Œ API Integration

OpenRouter API

Endpoint: https://openrouter.ai/api/v1/chat/completions

Model Used: meta-llama/llama-3.1-8b-instruct:free

  • Free tier available
  • 8B parameters
  • Instruction-tuned for tasks

Request Format:

{
    "model": "meta-llama/llama-3.1-8b-instruct:free",
    "messages": [
        {"role": "system", "content": "System prompt..."},
        {"role": "user", "content": "User query..."}
    ],
    "temperature": 0.7,
    "max_tokens": 500
}

Alternative Models: Update in .env:

OPENROUTER_MODEL=anthropic/claude-3-haiku
OPENROUTER_MODEL=google/gemini-flash-1.5

πŸ› Troubleshooting

Common Issues

1. API Key Error

ValueError: OpenRouter API key is required

Solution: Add your API key to .env file

2. Summary Truncated

Summary: "The database contains..."  (cuts off)

Solution: Already fixed! max_tokens increased to 500

3. App Won't Start

ModuleNotFoundError: No module named 'streamlit'

Solution: Install dependencies: pip install -r requirements.txt

4. Database Connection Failed

Error: Database file not found

Solution: Check file path or use default database

5. Empty Results

No results found

Solution: Normal - query returned no matching records

Debug Mode

Check terminal output for detailed logs:

[OpenRouter] Calling API with model: meta-llama/...
[OpenRouter] Response status: 200
[OpenRouter] Success! Response length: 245 chars
[Summary Agent] Calling LLM with prompt length: 523
[Summary Agent] Successfully generated summary: 245 chars

πŸ›‘οΈ Security & Best Practices

Security Features

  • βœ… API keys stored in .env (gitignored)
  • βœ… SELECT-only query policy
  • βœ… SQL injection prevention
  • βœ… Input validation and sanitization
  • βœ… No direct database modification allowed

Best Practices

  • Keep API keys confidential
  • Use virtual environments
  • Regularly update dependencies
  • Test with sample databases first
  • Monitor API usage and costs

πŸŽ“ Learning Outcomes

This project demonstrates:

  • βœ… Multi-agent AI system design
  • βœ… LLM integration and prompt engineering
  • βœ… Natural language processing for SQL
  • βœ… Streamlit web application development
  • βœ… API client implementation with error handling
  • βœ… Database schema introspection
  • βœ… Agent coordination and orchestration
  • βœ… Singleton pattern for resource management

πŸ™ Credits

Technologies:

LLM Model:

  • Meta Llama 3.1 8B Instruct (via OpenRouter)

Developed By: Internship Assignment Project


πŸ“„ License

This is an educational project developed for learning purposes.


πŸ”— Quick Links


Made with ❀️ using AI and Multi-Agent Systems

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages