IdealQuery is a high-performance, teaching-grade query optimization engine designed to demystify database performance. It features a powerful Go backend that parses, analyzes, and optimizes SQL queries, coupled with an interactive React frontend to visualize and compare different execution plans. This tool provides invaluable insights into both rule-based and cost-based optimization strategies.
- SQL Parsing: Transforms raw SQL queries into a structured logical plan representation.
- Rule-Based Optimization: Applies a set of heuristic rules (like predicate pushdown) to improve the query plan.
- Cost-Based Optimization: Utilizes table statistics and a cost model to find the most efficient execution plan by exploring different join orders and physical operators.
- Execution Simulation: Simulates query execution against different database connectors (PostgreSQL, MongoDB) to provide realistic performance metrics without running the actual query.
- Interactive Plan Visualization: Renders query plans as interactive tree diagrams using D3.js, allowing for easy comparison and node inspection.
- Database Catalog Management: Manages table schemas and statistics, which are crucial for the cost-based optimizer.
| Technology | Description |
|---|---|
| Go | The core language for the high-performance backend API. |
| Gin | A lightweight and fast HTTP web framework for Go. |
| PostgreSQL | The primary relational database used for testing and simulation. |
| Docker | For containerizing and orchestrating the application services. |
| React | A JavaScript library for building the user interface. |
| Vite | A modern, fast frontend build tool. |
| D3.js | A JavaScript library for producing dynamic data visualizations. |
| Zustand | A small, fast, and scalable state-management solution for React. |
The simplest way to get IdealQuery running is with Docker. This will spin up the Go backend, the React frontend, and a PostgreSQL instance with pre-loaded sample data.
-
Clone the repository:
git clone https://github.com/your-username/IdealQuery.git cd IdealQuery -
Start the services using Docker Compose:
docker-compose up --build
-
Access the application:
- Frontend UI: http://localhost:5173
- Backend API: http://localhost:8080
Once the application is running, you can write a SQL query in the editor, click "Run Query", and explore the original, rule-based, and cost-based execution plans in the visualization panel.
This document outlines the API for the IdealQuery backend. The API is built with Go and the Gin framework, providing endpoints for parsing, optimizing, and simulating SQL queries, as well as managing the database catalog.
Create a .env file in the backend/ directory and populate it with the following variables. A backend/.env.example file is provided for reference.
# Database Configuration
DB_HOST=postgres
DB_PORT=5432
DB_NAME=IdealQuery_test
DB_USER=test_user
DB_PASSWORD=test_password
DB_SSLMODE=disable
# App Config
GIN_MODE=debug
PORT=8080
# Connection String (auto-generated if not provided)
# DATABASE_URL=postgres://test_user:test_password@postgres:5432/IdealQuery_test?sslmode=disable
# Logging
LOG_LEVEL=debug
LOG_FORMAT=json
# IdealQuery Configuration
MAX_QUERY_PLANS=1000
OPTIMIZATION_TIMEOUT=30s
ENABLE_COST_BASED_OPTIMIZER=true
ENABLE_RULE_BASED_OPTIMIZER=truehttp://localhost:8080
Checks the health of the API server.
Request: None.
Response:
{
"status": "healthy"
}Errors:
- None.
Parses a query string from a specific dialect into a logical plan.
Request:
{
"dialect": "sql",
"query": "SELECT name FROM customers WHERE age > 30"
}dialect(string, required): The query language. Must be one ofsql,mongo,athena.query(string, required): The query string to parse.
Response:
{
"logicalPlan": {
"id": "node_2",
"node_type": "project",
"children": [
{
"id": "node_1",
"node_type": "filter",
"children": [
{
"id": "node_0",
"node_type": "scan",
"table_name": "customers",
"metadata": {}
}
],
"predicate": {
"expression": {
"type": "binary_op",
"value": ">",
"left": { "type": "column", "value": "age" },
"right": { "type": "literal", "value": 30 }
}
},
"metadata": {}
}
],
"projections": [{ "name": "name" }],
"metadata": {}
}
}Errors:
- 400 Bad Request: If the request payload is invalid, the dialect is unsupported, or a parsing error occurs.
Optimizes a given logical plan using a specified strategy.
Request:
{
"logicalPlan": {
"id": "node_1",
"node_type": "filter",
"children": [
{
"id": "node_0",
"node_type": "scan",
"table_name": "customers"
}
],
"predicate": {
"expression": {
"type": "binary_op",
"value": ">",
"left": { "type": "column", "value": "age" },
"right": { "type": "literal", "value": 30 }
}
}
},
"strategy": "cost"
}logicalPlan(object, required): The logical plan structure to optimize.strategy(string, required): The optimization strategy. Must be one ofcost,rule.
Response:
{
"optimizedPlan": {
"id": "node_...",
"node_type": "scan",
"table_name": "customers",
"metadata": {
"physical_operator": "sequential"
}
},
"explain": {
"applied_rules": ["PredicatePushdown", "CostBasedOptimization"],
"steps": [...],
"statistics": { "total_rules_applied": 2 }
}
}Errors:
- 400 Bad Request: If the request payload is invalid or the strategy is unsupported.
- 500 Internal Server Error: If an error occurs during the optimization process.
Simulates the execution of a query plan for a specific data connector.
Request:
{
"plan": {
"id": "node_0",
"node_type": "scan",
"table_name": "customers",
"estimated_rows": 1000
},
"connector": "postgres",
"options": {}
}plan(object, required): The logical plan to simulate.connector(string, required): The target connector. Must be one ofpostgres,mongo.options(object, optional): Connector-specific simulation options.
Response:
{
"metrics": {
"execution_time": 10000000,
"rows_processed": 1000,
"rows_returned": 1000,
"cpu_time": 10000,
"io_operations": 10,
"memory_used": 100000,
"network_traffic": 0,
"operator_metrics": {},
"connector": "postgres",
"simulation_only": true
}
}Errors:
- 400 Bad Request: If the request payload is invalid.
- 500 Internal Server Error: If an error occurs during simulation.
Adds a new table schema to the catalog.
Request:
{
"name": "new_table",
"columns": [
{ "name": "id", "data_type": "int", "nullable": false },
{ "name": "data", "data_type": "string", "nullable": true }
],
"row_count": 5000
}Response:
{
"message": "Table added successfully"
}Errors:
- 400 Bad Request: Invalid schema format.
- 409 Conflict: If a table with the same name already exists.
Retrieves a list of all table names in the catalog.
Request: None.
Response:
{
"tables": [
"customers",
"orders",
"products",
"suppliers"
]
}Errors:
- None.
Retrieves the schema and statistics for a specific table.
Request:
None. URL parameter name is required. Example: /api/catalog/table/customers/stats.
Response:
{
"name": "customers",
"columns": [
{ "name": "customer_id", "data_type": "int", "nullable": false },
{ "name": "name", "data_type": "string", "nullable": false }
],
"row_count": 5000,
"indexes": []
}Errors:
- 404 Not Found: If the specified table does not exist.
Updates the statistics for an existing table.
Request:
URL parameter name is required. Example: /api/catalog/table/customers/stats.
{
"row_count": 6000,
"column_stats": {
"name": {
"ndv": 5500
}
}
}Response:
{
"message": "Statistics updated successfully"
}Errors:
- 400 Bad Request: If the update payload is invalid.
- 404 Not Found: If the specified table does not exist.