diff --git a/README.md b/README.md index d9bf3bc..753e27a 100644 --- a/README.md +++ b/README.md @@ -1,302 +1,627 @@ # @solvaratech/drawline-core -The core engine behind Drawline, responsible for database connection management, schema inference, and intelligent test data generation. +**Mathematically Grounded, Engineering-Strong Database Seeding Engine** -This package provides a unified interface to interact with multiple database types (PostgreSQL, MongoDB, Firestore) and perform complex operations like analyzing schema relationships and seeding databases with referentially intact data. +Drawline Core is a production-grade TypeScript library for intelligent, deterministic test data generation across multiple database systems. It provides a unified interface for schema inference, relationship resolution, and referentially intact data seeding with strong mathematical guarantees on data consistency. -## Features +--- -- **Multi-Database Support**: Unified adapter interface for PostgreSQL, MongoDB, Firestore, and a mock **InMemoryAdapter** for testing. -- **Schema Inference**: Automatically extracts collection structures, fields, types, and constraints. -- **Relationship Detection**: Heuristic-based detection of foreign key relationships between collections. -- **Smart Data Generator**: topologically sorts collections based on dependencies to generate data in the correct order, ensuring foreign key integrity. -- **Type-Safe Schemas**: Comprehensive TypeScript definitions for defining database schemas abstractly. -- **CLI Tool**: A powerful command-line interface for data generation, schema validation, and project initialization. +## Table of Contents -## Tech Stack +1. [Overview](#overview) +2. [Features Achieved](#features-achieved) +3. [Technical Architecture](#technical-architecture) +4. [Mathematical Foundations](#mathematical-foundations) +5. [Usage](#usage) +6. [Roadmap](#roadmap) +7. 
[Development](#development) -- **Language:** Node.js / TypeScript -- **Databases Supported:** PostgreSQL (`pg`), MongoDB (`mongodb`), Firestore (`firebase-admin`), DynamoDB (`@aws-sdk/client-dynamodb`) -- **Core Libraries:** - - **Schema Validation:** [Zod](https://zod.dev/) - - **Data Generation:** [@faker-js/faker](https://fakerjs.dev/) & [seedrandom](https://github.com/davidbau/seedrandom) for deterministic values - - **CLI:** [Commander](https://github.com/tj/commander.js) & [Chalk](https://github.com/chalk/chalk) -- **Testing:** [Vitest](https://vitest.dev/) +--- -## Installation +## Overview -```bash -npm install @solvaratech/drawline-core +Drawline addresses one of the most challenging problems in software engineering: generating realistic, referentially intact test data at scale across heterogeneous database systems. Traditional approaches rely on simple random generation or expensive database lookups to maintain foreign key integrity. Drawline uses a **mathematically derived deterministic generation protocol** that guarantees referential integrity without any database queries during generation. + +### Core Problem Statement + +Given: +- A database schema $S$ with collections $C = \{c_1, c_2, ..., c_n\}$ +- Relationships $R = \{r_1, r_2, ..., r_m\}$ defining foreign key dependencies +- A generation seed $\sigma \in \mathbb{N}$ + +Generate documents $D_c = \{d_1, d_2, ..., d_k\}$ for each collection $c$ such that: +1. All foreign key references point to existing primary keys +2. The generation is fully deterministic: $G(\sigma, c, i) \rightarrow d_i$ +3. 
No database queries are required during generation + +--- + +## Features Achieved + +### Multi-Database Adapter Architecture + +Drawline implements a unified adapter pattern supporting **11+ database systems**: + +| Adapter | Status | Key Features | +|---------|--------|-------------| +| PostgreSQL | ✅ Complete | Schema inference, FK constraints, serial types, composite PKs | +| MySQL | ✅ Complete | AUTO_INCREMENT, foreign keys, composite PKs | +| SQLite | ✅ Complete | Embedded testing, full FK support | +| MongoDB | ✅ Complete | ObjectId generation, document embedding | +| DynamoDB | ✅ Complete | Partition keys, sort keys, GSI support | +| Firestore | ✅ Complete | Collection groups, subcollections | +| Redis | ✅ Complete | Key-value, sets, sorted sets | +| SQL Server | ✅ Complete | Identity columns, stored procedures | +| InMemory | ✅ Complete | Mock adapter for testing | +| Ephemeral | ✅ Complete | Transient data for demos | +| Null | ✅ Complete | No-op adapter | +| CSV Export | ✅ Complete | Export to CSV files | + +### Schema System + +- **SchemaCollection**: Represents tables/collections with fields, constraints, and metadata +- **SchemaField**: Supports 20+ field types including composite keys +- **SchemaRelationship**: One-to-one, one-to-many, many-to-many with composite FK support +- **FieldConstraints**: min, max, minLength, maxLength, pattern, enum, unique, nullPercentage + +### Field Inference Engine + +Smart field generation based on semantic naming: + +```typescript +// Score-based inference system +const rules = [ + { tokens: ['email'], score: 10, generator: f => f.internet.email() }, + { tokens: ['first', 'name'], score: 8, generator: f => f.person.firstName() }, + { tokens: ['created', 'at'], score: 10, generator: f => f.date.past().toISOString() }, + // ... 
80+ rules implemented +]; ``` -## Core Architecture +Supports: +- Tokenization (camelCase, snake_case, PascalCase) +- Negative token filtering +- Score-based best-match selection +- Perfect-match bonuses +- Caching for performance + +### Dependency Graph Engine + +Mathematically sound topological sorting: + +- **Strong vs Weak Dependencies**: Distinguishes required vs optional FKs +- **Cycle Detection**: DFS-based cycle detection with Tarjan's algorithm principles +- **Cycle Breaking**: Intelligent break-point selection prioritizing weak deps +- **Level Assignment**: BFS-based level propagation for parallel execution + +### Constraint Engine + +Cross-column dependency resolution: + +- **ColumnDependencyGraph**: Topological sort of field dependencies +- **Binary Constraints**: minColumn, maxColumn, gtColumn, ltColumn +- **Temporal Constraints**: startDate, endDate for timestamps +- **Numeric Constraints**: min, max with automatic range adjustment +- **String Constraints**: minLength, maxLength, pattern, trim, case -The engine is built around a few key concepts, orchestrated to allow platform-agnostic schema analysis and deterministic data generation: +### Deterministic ID Generation + +Math-based ID generation eliminating database lookups: -```mermaid -graph TD - A[Drawline UI / CLI] -->|Provides Schema & Config| B(TestDataGeneratorService) - B -->|Analyzes Dependencies| C{Generation DAG} - C -->|Order sequence| D[Database Adapter Interface] - D -->|PostgreSQL| E[(PostgreSQL DB)] - D -->|MongoDB| F[(MongoDB)] - D -->|Mock| G[(InMemory / Test)] ``` +ID(collection, index) = H(collection + index + sessionId + seed) -### 1. Adapters (`BaseAdapter`) -The `BaseAdapter` abstract class defines the contract for all database interactions. Each supported database has a concrete implementation (e.g., `PostgresAdapter`, `MongoDBAdapter`) that handles the specifics of connection, querying, schema resolving, and insertion. 
+where H is SHA-256 truncated to the target ID format: +- UUID: 8-4-4-4-12 hex format +- ObjectId: 24-char hex string +- Integer: index + startId + 1 +``` -### 2. Schema Inference -The engine can inspect a live database to construct a platform-agnostic `SchemaDesign` object. This object represents the database structure in a way that Drawline's UI and generator can understand, normalizing types (e.g., mapping `varchar` and `String` to a unified `string` type). +This guarantees that every generated foreign key matches an existing parent primary key: $\forall c \in C, \forall i \in [1, n]: FK_c(i) = ID_{parent(c)}(i \mod |parent(c)|)$ -### 3. Test Data Generator (`TestDataGeneratorService`) -This is the heart of the seeding functionality. The generation pipeline follows a strict sequence: +### Composite Key Support + +- **Composite Primary Keys**: Up to N fields per PK +- **Composite Foreign Keys**: Multi-column FK references +- **FK Chaining**: Resolves nested FK chains (A→B→C) +- **Cached Resolution**: Parent row caching for performance + +### ORM Code Generation + +Generates type-safe ORM code from schema: + +| ORM | Status | Output | +|-----|--------|--------| +| Prisma | ✅ Complete | schema.prisma | +| TypeORM | ✅ Complete | entities/*.ts | +| Drizzle | ✅ Complete | schema.ts | +| Mongoose | ✅ Complete | schemas/*.ts | + +### Schema Diff Engine + +- **Full Sync Mode**: Destructive schema changes allowed +- **Additive Mode**: Safe migrations only +- **DDL Generation**: CREATE TABLE, ALTER TABLE, DROP TABLE +- **Type Migration**: Type widening detection +- **Foreign Key Resolution**: Constraint ordering + +### CLI Tool -```mermaid -sequenceDiagram - participant User as Consumer - participant Gen as TestDataGeneratorService - participant Adapter as Database Adapter - participant DB as Target Database - - User->>Gen: generateAndPopulate(schema, config) - activate Gen - Gen->>Adapter: buildDependencyOrder() - Adapter-->>Gen: Topological DAG Order - - Note over Gen,Adapter: Phase 1: Ensure Schema - Gen->>Adapter: ensureCollection() - 
Adapter->>DB: CREATE TABLE / INDEX - - Note over Gen,Adapter: Phase 2: Generate & Insert - Gen->>Adapter: generateCollectionData() - Adapter->>DB: INSERT BATCH - - Note over Gen,Adapter: Phase 3: Apply Constraints - Gen->>Adapter: addForeignKeyConstraints() - Adapter->>DB: ALTER TABLE ADD CONSTRAINT - - Gen-->>User: GenerationResult - deactivate Gen +``` +drawline init # Initialize project +drawline gen --schema --config # Generate data +drawline validate # Validate schema +drawline diff # Show schema changes ``` -When you request data generation, the service: -1. **Analyzes Dependencies**: Builds a directed acyclic graph (DAG) of your collections based on defined relationships. -2. **Determines Order**: Resolves the correct insertion order (e.g., create `Users` before `Posts`). -3. **Generates Data**: Uses the schema constraints (types, enums, patterns) to create realistic mock data. -4. **Enforces Integrity**: Ensures generated Foreign Keys match actual Primary Keys from previously generated parent records. +### Worker Pool -### Math-based Deterministic Generation +Parallel generation for large datasets: -Drawline uses a sophisticated math-based approach to ensure data consistency without needing to "look up" records in a database. +- **Worker Threads**: Native Node.js worker_threads +- **Sharding**: Deterministic range-based sharding +- **Progress Callbacks**: Real-time progress reporting +- **Task Queue**: FIFO scheduling with backpressure -* **Deterministic IDs**: IDs are generated using a stable hash of `(Seed + Collection Name + Index)`. -* **Referential Integrity**: When generating a child record (e.g., `Order #500` linked to `User`), the engine calculates *exactly* which User ID was generated for `User #5`, ensuring the Foreign Key matches the Primary Key of the parent, even if they were generated in parallel or different batches. 
-* **Stateless**: The engine does not need to query the database to know that "User 5" exists; the math guarantees the IDs will match exactly. +--- -## Usage +## Technical Architecture -### Connecting to a Database +### Core Data Flow -Use the `createHandler` factory or `TestDataGeneratorService` to instantiate the correct adapter. +``` +┌─────────────────────────────────────────────────────────────────────┐ +│ TestDataGeneratorService │ +├─────────────────────────────────────────────────────────────────────┤ +│ 1. initialize(config, collections, relationships) │ +│ ├── Preload metadata from target DB │ +│ ├── Build relationship map │ +│ └── Initialize seeded RNG │ +│ │ +│ 2. buildDependencyOrder() │ +│ ├── Build DAG from relationships │ +│ ├── Detect and break cycles │ +│ └── Return topological sort │ +│ │ +│ 3. generateAndPopulate() │ +│ ├── For each collection in order: │ +│ │ ├── ensureCollection() │ +│ │ ├── generateCollectionData() │ +│ │ └── insertDocuments() │ +│ └── Validate referential integrity │ +└─────────────────────────────────────────────────────────────────────┘ +``` + +### Adapter Interface ```typescript -import { createHandler, DatabaseType } from "@solvaratech/drawline-core/server"; +abstract class BaseAdapter { + // Connection management + abstract connect(): Promise<void>; + abstract disconnect(): Promise<void>; + + // Schema operations + abstract collectionExists(name: string): Promise<boolean>; + abstract ensureCollection(name: string, fields: SchemaField[]): Promise<void>; + abstract getCollectionDetails(name: string): Promise<SchemaCollection>; + abstract getCollectionSchema(name: string): Promise<SchemaField[]>; + + // Data operations + abstract insertDocuments( + collectionName: string, + documents: GeneratedDocument[] + ): Promise<(string | number)[]>; + + abstract clearCollection(name: string): Promise<void>; + abstract getDocumentCount(name: string): Promise<number>; + + // Validation + abstract validateReference( + collectionName: string, + fieldName: string, + value: unknown + ): 
Promise<boolean>; +} +``` -const type: DatabaseType = "postgresql"; -const credentials = "postgres://user:pass@localhost:5432/mydb"; +### Class Hierarchy + +``` +BaseAdapter +├── PostgresAdapter +├── MySQLAdapter +├── SQLiteAdapter +├── MongoDBAdapter +├── DynamoDBAdapter +├── FirestoreAdapter +├── RedisAdapter +├── SQLServerAdapter +├── InMemoryAdapter (for testing) +├── EphemeralAdapter (for demos) +├── NullAdapter (no-op) +└── CSVExportAdapter (export) +``` -const handler = createHandler(type, credentials); +--- -await handler.connect(); -const collections = await handler.getCollections(); -console.log("Found collections:", collections); -await handler.disconnect(); +## Mathematical Foundations + +### 1. Topological Sort for Generation Ordering + +**Problem**: Given a DAG $G = (V, E)$ where $V = C$ and edges represent dependencies, find a linear ordering $\tau: V \rightarrow [1, |V|]$ such that $\forall (u, v) \in E: \tau(u) < \tau(v)$. + +**Algorithm**: Kahn's algorithm with in-degree counting: + +``` +TOPOLOGICAL-SORT(G): + Compute in-degree(v) for all v ∈ V + Queue ← { v | in-degree(v) = 0 } + result ← [] + + while Queue not empty: + v ← Queue.pop() + result.append(v) + for each edge (v, w): + in-degree(w) ← in-degree(w) - 1 + if in-degree(w) = 0: + Queue.push(w) + + return result ``` -### Generating Test Data **Complexity**: $O(|V| + |E|)$ -To generate data, you must provide a **Schema Definition** and a **Configuration**. This is typically constructed by the frontend, but here is how the engine expects it. +### 2. Deterministic ID Generation -```typescript -import { TestDataGeneratorService, TestDataConfig } from "@solvaratech/drawline-core/server"; +**Theorem**: For any collections $A$ and $B$ with relationship $R: A \rightarrow B$, let $id_X(i)$ denote the deterministic ID of the $i$-th document in collection $X$. 
Then the foreign key $FK_A(i)$ assigned to the $i$-th document in $A$ satisfies: + +$$\forall i \in [1, |A|]: FK_A(i) = id_B(i \mod |B|)$$ + +**Proof**: Using the deterministic hash: +$$id(c, i) = \text{hash}(\text{collection}_c \oplus i \oplus \sigma)_{constrained}$$ + +The FK resolution computes: +$$parentIndex = i \mod |B|$$ +$$FK_A(i) = id(B, parentIndex)$$ + +By substitution: +$$FK_A(i) = \text{hash}(B \oplus (i \mod |B|) \oplus \sigma)$$ +$$= id(B, i \mod |B|)$$ + +$\square$ + +### 3. Cycle Detection and Breaking + +**Theorem**: Every cycle in a finite directed graph can be broken by removing one of its edges, so repeated edge removal always yields an acyclic graph. + +**Algorithm**: Modified DFS with cycle breaking: + +``` +DETECT-CYCLE(G): + visited ← ∅ + recursionStack ← ∅ + + DFS(v): + visited.add(v) + recursionStack.add(v) + + for each neighbor u of v: + if u ∉ visited: + if DFS(u): return true + if u ∈ recursionStack: + return CYCLE-DETECTED(v, u) + + recursionStack.delete(v) + return false + + for each vertex v: + if v ∉ visited: + if DFS(v): return true + + return false +``` + +**Breaking Strategy**: When cycles are detected, prioritize removing weak dependencies (non-required FKs) to preserve data integrity. + +### 4. Field Inference Scoring + +**Problem**: Given a field name $f$, select the best generator from a rule set $R$. + +**Algorithm**: Score-based matching: + +$$\text{score}(r, f) = r_{score} + \text{match}(r, f) - \text{noise}(r, f)$$ + +Where: +- $\text{match}(r, f) = 5$ if $|tokens(f)| = |tokens(r)|$ (perfect match) +- $\text{noise}(r, f) = 0.5 \times (|tokens(f)| - |tokens(r)|)$ + +Select $r^* = \text{argmax}_r \text{score}(r, f)$ + +### 5. Composite FK Resolution + +For composite FKs $(f_1, ..., f_k) \rightarrow (p_1, ..., p_k)$: + +1. Select parent row index $r = i \mod |parent|$ +2. Retrieve cached parent row $P[r]$ +3. For each component $f_j$: + $$value[f_j] = P[r][p_j]$$ + +This ensures all FK components reference the same parent row. + +### 6. 
Cross-Column Constraint Satisfaction + +For constraints like $A > B$ where $B$ is generated first: + +$$value[A] = \max(generated, value[B] + \delta)$$ -// 1. Initialize Service -const service = new TestDataGeneratorService(handler); +Where $\delta$ is a small deterministic offset to maintain both uniqueness and constraint satisfaction. -// 2. Define What to Generate (The Request) -const config: TestDataConfig = { +--- + +## Usage + +### Installation + +```bash +npm install @solvaratech/drawline-core +``` + +### Basic Generation + +```typescript +import { TestDataGeneratorService } from "@solvaratech/drawline-core/server"; +import { PostgresAdapter } from "@solvaratech/drawline-core/generator/adapters/PostgresAdapter"; + +// 1. Configure adapter +const adapter = new PostgresAdapter({ + connectionString: "postgres://user:pass@localhost:5432/mydb" +}); +await adapter.connect(); + +// 2. Initialize service +const service = new TestDataGeneratorService(adapter); + +// 3. Define schema +const collections = [ + { + id: "users", + name: "users", + fields: [ + { id: "id", name: "id", type: "uuid", isPrimaryKey: true }, + { id: "email", name: "email", type: "string", required: true }, + { id: "name", name: "name", type: "string" } + ] + }, + { + id: "posts", + name: "posts", + fields: [ + { id: "id", name: "id", type: "uuid", isPrimaryKey: true }, + { id: "user_id", name: "user_id", type: "uuid", isForeignKey: true, + referencedCollectionId: "users" }, + { id: "title", name: "title", type: "string" } + ] + } +]; + +const relationships = [ + { + id: "posts->users", + fromCollectionId: "posts", + toCollectionId: "users", + type: "many-to-one", + fromField: "user_id", + toField: "id" + } +]; + +// 4. 
Generate configuration +const config = { collections: [ - { collectionName: "users", count: 10 }, - { collectionName: "posts", count: 50 } // Will depend on users + { collectionName: "users", count: 100 }, + { collectionName: "posts", count: 1000 } ], - batchSize: 100, - seed: 12345 // Optional deterministic seed + seed: 12345 }; -// 3. Provide Schema & Relationships -// (Usually obtained from handler.inferSchema() or defined manually) +// 5. Execute generation const result = await service.generateAndPopulate( - mySchemaCollections, - mySchemaRelationships, + collections, + relationships, config ); -console.log(`Generated ${result.totalDocumentsGenerated} documents.`); +console.log(`Generated ${result.totalDocumentsGenerated} documents`); ``` -## API Reference: Request Structures +### Schema Diff and Migration -Since the core engine is often consumed by API endpoints or UI tools, understanding the data structures for Schemas and Relationships is critical. +```typescript +import { computeSchemaDiff, generateDDL } from "@solvaratech/drawline-core/schema"; -### 1. SchemaCollection -Represents a single table or collection. +// Compare current schema with database +const diff = computeSchemaDiff(databaseSnapshot, newSchema, "additive"); -```typescript -interface SchemaCollection { - id: string; // Unique identifier (usually same as name) - name: string; // Table/Collection name in DB - fields: SchemaField[]; - - // Layout info (can be ignored for pure backend usage) - position: { x: number; y: number }; - - // Database specifics - schema?: string; // e.g., "public" for Latex - dbName?: string; // For Mongo +// Generate migration SQL +const statements = generateDDL(diff); + +for (const stmt of statements) { + console.log(stmt.sql); } ``` -### 2. SchemaField -Defines a single column or field. 
+### ORM Code Generation ```typescript +import { PrismaGenerator } from "@solvaratech/drawline-core/generators/orm"; + +const generator = new PrismaGenerator(); +const output = generator.generate(collections, relationships); + +console.log(output.content); // Prisma schema.prisma content +``` + +--- + +## Roadmap + +### Short Term (v0.2.0 - v0.3.0) + +- [ ] **Enhanced Validation**: Post-generation integrity validation +- [ ] **Data masking**: Sensitive data identification and redaction +- [ ] **Incremental generation**: Delta seeding for existing databases +- [ ] **Distribution profiles**: Normal, exponential, power-law distributions +- [ ] **Relationship visualization**: Draw relationship graphs + +### Medium Term (v0.4.0 - v0.5.0) + +- [ ] **Web UI Dashboard**: Visual schema editor and generator interface +- [ ] **Data Templates**: Reusable generation templates +- [ ] **Export formats**: More export adapters (Excel, JSON Lines) +- [ ] **Audit logging**: Generation audit trail +- [ ] **CI/CD integration**: GitHub Actions, GitLab CI + +### Long Term (v1.0.0) + +- [ ] **GraphQL API**: REST/GraphQL API for remote generation +- [ ] **Multi-tenant**: Isolated multi-tenant support +- [ ] **Enterprise features**: SSO, RBAC, audit +- [ ] **Cloud dashboard**: SaaS management console +- [ ] **Plug-in system**: Third-party generator plugins + +--- + +## Development + +### Prerequisites + +- Node.js 18+ +- TypeScript 5.9+ +- pnpm or npm + +### Setup + +```bash +npm install +npm run build +``` + +### Testing + +```bash +# Run all tests +npm test + +# Watch mode +npm run test:watch + +# UI +npm run test:ui + +# CI (with coverage) +npm run test:ci +``` + +### Type Checking + +```bash +npm run type-check +``` + +### CLI + +```bash +npm run cli:build +npm link # Link globally + +drawline init +drawline gen --schema schema.json --config config.json +``` + +--- + +## API Reference + +### Core Exports + +```typescript +// Main exports +export * from "./types/schemaDesign"; // Schema types +export * from 
"./types/schemaDiff"; // Diff types +export * from "./utils/schemaConverter"; // Converters +export * from "./utils/errorMessages"; // Errors +export * from "./schema"; // Schema engine +export * from "./generators/orm"; // ORM generators + +// Server exports +export * from "./connections"; // Database connections +export * from "./generator"; // Generation engine +``` + +### Key Interfaces + +```typescript +interface SchemaCollection { + id: string; + name: string; + fields: SchemaField[]; + schema?: string; + dbName?: string; + position?: { x: number; y: number }; +} + interface SchemaField { id: string; name: string; - type: FieldType; // "string" | "integer" | "boolean" | "date" | ... + type: FieldType; required?: boolean; isPrimaryKey?: boolean; isForeignKey?: boolean; - - // Constraints for Generation - constraints?: { - min?: number; - max?: number; - pattern?: string; // Regex - enum?: string[]; // Allowed values - unique?: boolean; - email?: boolean; // If inferred from name/type - }; + isSerial?: boolean; + compositePrimaryKeyIndex?: number; + compositeKeyGroup?: string; + referencedCollectionId?: string; + foreignKeyTarget?: string; + rawType?: string; + arrayItemType?: string; + defaultValue?: any; + constraints?: FieldConstraints; } -``` - -### 3. SchemaRelationship -Defines a link between two collections. The generator uses this to enforce referential integrity. -```typescript interface SchemaRelationship { id: string; - fromCollectionId: string; // The child table (e.g., "posts") - toCollectionId: string; // The parent table (e.g., "users") - type: "one-to-many" | "one-to-one" | "many-to-many"; - - // Single Field FK - fromField?: string; // FK column in child (e.g., "user_id") - toField?: string; // PK column in parent (e.g., "id") - - // Composite Key Support - fromFields?: string[]; // e.g. 
["order_id", "item_id"] + fromCollectionId: string; + toCollectionId: string; + type: "one-to-one" | "one-to-many" | "many-to-many"; + fromField?: string; + toField?: string; + fromFields?: string[]; toFields?: string[]; } -``` - -### 4. TestDataConfig -The configuration object that controls the generation process. -```typescript interface TestDataConfig { - // Which collections to populate and how many records - collections: { - collectionName: string; - count: number; - distribution?: "uniform" | "normal" | "exponential"; - relationshipConfig?: Record; - }[]; - - batchSize?: number; // Insert batch size (default: 100) - seed?: number; // For reproducible runs - - // Callbacks - onProgress?: (progress: { - collectionName: string; - generatedCount: number; - totalCount: number - }) => Promise; + collections: CollectionConfig[]; + seed?: number | string; + batchSize?: number; + onProgress?: (progress: ProgressUpdate) => Promise; } ``` -## CLI Tool +--- -The package includes a CLI tool to interact with the engine. +## License -### Installation +MIT License. See LICENSE file for details. -Link the CLI locally: -```bash -npm run cli:build -npm link -``` +--- + +## Contributing + +See CONTRIBUTING.md for development guidelines. + +--- -### Commands - -- **`init`**: Create a sample `schema.json` and `drawline.config.json`. - ```bash - drawline init - ``` -- **`gen`**: Generate data based on schema and config. - ```bash - drawline gen --schema schema.json --config drawline.config.json --db in-memory - ``` -- **`validate`**: Validate a schema file. - ```bash - drawline validate --schema schema.json - ``` - -## Development and Testing - -### Testing Environment - -We use **Vitest** for a fast and robust unit testing experience. The test suite covers the core engine's data generation edge cases, foreign key resolutions, and schema relational mapping. Tests can be found in the `src/**/*.test.ts` files alongside their implementations. 
- -### Running Tests - -- **Run all tests**: - ```bash - npm test - ``` -- **Run tests in watch mode** (for active development): - ```bash - npm run test:watch - ``` -- **Launch the Vitest UI** (browser-based testing and visualization): - ```bash - npm run test:ui - ``` -- **Run tests with coverage** (used by CI): - ```bash - npm run test:ci - ``` - -### Continuous Integration (CI) -All Pull Requests and commits to `main` are automatically validated via GitHub Actions. The CI pipeline strictly enforces TypeScript compilation (`npm run type-check`) and executes the full test suite with coverage. Please ensure your local tests pass before opening a PR. - -### Adding New Adapters - -All new adapters must extend `BaseAdapter` and implement the abstract methods for connection management and document insertion. Reference `InMemoryAdapter.ts` for a simple starting point. +## Support +- GitHub Issues: https://github.com/solvaratech/drawline-core/issues +- Documentation: https://drawline.app/docs \ No newline at end of file
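Finally, the score-based field inference described under Features can be sketched as follows (a simplified model of the scoring formula; `Rule`, `tokenize`, and `bestRule` are hypothetical names, not the library's internals):

```typescript
// A rule carries the tokens it matches, a base score, and a label standing
// in for the faker generator it would select.
type Rule = { tokens: string[]; score: number; label: string };

function tokenize(name: string): string[] {
  return name
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // split camelCase / PascalCase
    .split(/[_\s-]+/)                        // split snake_case / kebab-case
    .map(t => t.toLowerCase())
    .filter(Boolean);
}

function bestRule(fieldName: string, rules: Rule[]): Rule | undefined {
  const tokens = tokenize(fieldName);
  let best: Rule | undefined;
  let bestScore = -Infinity;
  for (const r of rules) {
    // A rule only competes when all of its tokens appear in the field name.
    if (!r.tokens.every(t => tokens.includes(t))) continue;
    let s = r.score;
    if (tokens.length === r.tokens.length) s += 5;  // perfect-match bonus
    s -= 0.5 * (tokens.length - r.tokens.length);   // noise penalty for extra tokens
    if (s > bestScore) { bestScore = s; best = r; }
  }
  return best;
}
```

The perfect-match bonus and noise penalty pick the most specific rule, so a field named `firstName` prefers a `['first', 'name']` rule over a bare `['name']` rule.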