diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml new file mode 100644 index 0000000..b4283b4 --- /dev/null +++ b/.github/workflows/docs.yml @@ -0,0 +1,71 @@ +name: Deploy Docs + +on: + push: + branches: + - refactor + paths: + - 'docs/**' + pull_request: + branches: + - refactor + paths: + - 'docs/**' + workflow_dispatch: + +permissions: + contents: read + pages: write + id-token: write + +concurrency: + group: pages + cancel-in-progress: false + +jobs: + build: + name: Build Docs + runs-on: ubuntu-latest + steps: + - name: Checkout Repository + uses: actions/checkout@v4 + + - name: Setup Node.js + uses: actions/setup-node@v4 + with: + node-version: '20' + + - name: Install Mintlify CLI + run: npm install -g mintlify + + - name: Export Docs + working-directory: docs + run: mintlify export + + - name: Unzip Export + working-directory: docs + run: unzip export.zip -d site + + - name: Add SPA fallback + run: | + cp docs/site/index.html docs/site/404.html + + - name: Upload artifact + uses: actions/upload-pages-artifact@v3 + with: + path: docs/site + + deploy: + name: Deploy to GitHub Pages + if: github.event_name == 'push' + needs: build + runs-on: ubuntu-latest + + environment: + name: github-pages + url: ${{ steps.deployment.outputs.page_url }} + + steps: + - name: Deploy to GitHub Pages + id: deployment + uses: actions/deploy-pages@v4 diff --git a/docs/api-reference/grpc.mdx b/docs/api-reference/grpc.mdx new file mode 100644 index 0000000..bdae86e --- /dev/null +++ b/docs/api-reference/grpc.mdx @@ -0,0 +1,403 @@ +--- +title: "gRPC API" +description: "High-performance Protocol Buffer API reference" +--- + +# gRPC API Reference + +VortexDB's gRPC API provides high-performance vector operations using Protocol Buffers over HTTP/2. 
+ +## Connection + +| Parameter | Default | +|-----------|---------| +| Host | `localhost` | +| Port | `50051` | +| Protocol | HTTP/2 (plaintext) | + +```bash +# Test connection with grpcurl +grpcurl -plaintext localhost:50051 list +``` + +## Authentication + +All gRPC calls require the `authorization` header with your API key: + +```bash +-H "authorization: your-api-key" +``` + +The API key is set via the `GRPC_ROOT_PASSWORD` environment variable. + +--- + +## Service Definition + +```protobuf +syntax = "proto3"; +package vectordb; + +service VectorDB { + // Insert a vector with a payload and return the assigned PointID + rpc InsertVector(InsertVectorRequest) returns (PointID); + + // Delete a vector by its PointID + rpc DeletePoint(PointID) returns (google.protobuf.Empty); + + // Get a vector and its payload by PointID + rpc GetPoint(PointID) returns (Point); + + // Search for the k nearest vectors to a target vector + rpc SearchPoints(SearchRequest) returns (SearchResponse); +} +``` + +--- + +## Methods + +### InsertVector + +Insert a vector with its associated payload. + + + The vector to insert. Must match the configured `DIMENSION`. + + + + Metadata associated with the vector. + + +**Request:** + +```protobuf +message InsertVectorRequest { + DenseVector vector = 1; + Payload payload = 2; +} +``` + +**Response:** + +```protobuf +message PointID { + UUID id = 1; +} +``` + +**Example:** + +```bash +grpcurl -plaintext \ + -H "authorization: secret" \ + -d '{ + "vector": {"values": [0.1, 0.2, 0.3, 0.4]}, + "payload": {"content_type": 1, "content": "Hello world"} + }' \ + localhost:50051 vectordb.VectorDB/InsertVector +``` + +**Response:** + +```json +{ + "id": { + "value": "550e8400-e29b-41d4-a716-446655440000" + } +} +``` + +--- + +### GetPoint + +Retrieve a point by its ID. + + + The unique identifier of the point. 
+ + +**Request:** + +```protobuf +message PointID { + UUID id = 1; +} +``` + +**Response:** + +```protobuf +message Point { + PointID id = 1; + Payload payload = 2; + DenseVector vector = 3; +} +``` + +**Example:** + +```bash +grpcurl -plaintext \ + -H "authorization: secret" \ + -d '{"id": {"value": "550e8400-e29b-41d4-a716-446655440000"}}' \ + localhost:50051 vectordb.VectorDB/GetPoint +``` + +**Response:** + +```json +{ + "id": { + "id": { + "value": "550e8400-e29b-41d4-a716-446655440000" + } + }, + "payload": { + "contentType": "Text", + "content": "Hello world" + }, + "vector": { + "values": [0.1, 0.2, 0.3, 0.4] + } +} +``` + +--- + +### DeletePoint + +Delete a point by its ID. + + + The unique identifier of the point to delete. + + +**Request:** + +```protobuf +message PointID { + UUID id = 1; +} +``` + +**Response:** + +```protobuf +google.protobuf.Empty +``` + +**Example:** + +```bash +grpcurl -plaintext \ + -H "authorization: secret" \ + -d '{"id": {"value": "550e8400-e29b-41d4-a716-446655440000"}}' \ + localhost:50051 vectordb.VectorDB/DeletePoint +``` + +**Response:** + +```json +{} +``` + +--- + +### SearchPoints + +Search for the k nearest neighbors to a query vector. + + + The vector to search with. Must match the configured `DIMENSION`. + + + + The distance metric to use. + + + + Maximum number of results to return. 
+ + +**Request:** + +```protobuf +message SearchRequest { + DenseVector query_vector = 1; + Similarity similarity = 2; + uint64 limit = 3; +} +``` + +**Response:** + +```protobuf +message SearchResponse { + repeated PointID result_point_ids = 1; +} +``` + +**Example:** + +```bash +grpcurl -plaintext \ + -H "authorization: secret" \ + -d '{ + "query_vector": {"values": [0.1, 0.2, 0.3, 0.4]}, + "similarity": 3, + "limit": 5 + }' \ + localhost:50051 vectordb.VectorDB/SearchPoints +``` + +**Response:** + +```json +{ + "resultPointIds": [ + {"id": {"value": "550e8400-e29b-41d4-a716-446655440000"}}, + {"id": {"value": "6ba7b810-9dad-11d1-80b4-00c04fd430c8"}} + ] +} +``` + +--- + +## Message Types + +### UUID + +```protobuf +message UUID { + string value = 1; // UUID v4 string +} +``` + +### DenseVector + +```protobuf +message DenseVector { + repeated float values = 1; // Vector components +} +``` + +### Point + +```protobuf +message Point { + PointID id = 1; // Unique identifier + Payload payload = 2; // Associated metadata + DenseVector vector = 3; // Vector values +} +``` + +### PointID + +```protobuf +message PointID { + UUID id = 1; +} +``` + +### Payload + +```protobuf +message Payload { + ContentType content_type = 1; // Type of content + string content = 2; // Content string +} +``` + +--- + +## Enums + +### Similarity + +Distance/similarity metric for search operations. + +| Value | Name | Description | +|-------|------|-------------| +| `0` | `Euclidean` | L2 distance (straight line) | +| `1` | `Manhattan` | L1 distance (city block) | +| `2` | `Hamming` | Count of differing elements | +| `3` | `Cosine` | Angular distance | + +### ContentType + +Type of payload content. 
+ +| Value | Name | Description | +|-------|------|-------------| +| `0` | `Image` | Image reference or data | +| `1` | `Text` | Text content | + +--- + +## Error Codes + +| gRPC Code | Name | Description | +|-----------|------|-------------| +| `0` | `OK` | Success | +| `3` | `INVALID_ARGUMENT` | Invalid request (e.g., wrong dimensions) | +| `5` | `NOT_FOUND` | Point does not exist | +| `13` | `INTERNAL` | Server error | +| `16` | `UNAUTHENTICATED` | Invalid or missing API key | + +--- + +## Client Libraries + +### Python + +```python +from vortexdb import VortexDB, DenseVector, Payload, Similarity + +with VortexDB(grpc_url="localhost:50051", api_key="secret") as db: + # Insert + point_id = db.insert( + vector=DenseVector([0.1, 0.2, 0.3, 0.4]), + payload=Payload.text("Hello") + ) + + # Search + results = db.search( + vector=DenseVector([0.1, 0.2, 0.3, 0.4]), + similarity=Similarity.COSINE, + limit=5 + ) +``` + +### Generate Clients + +Use `protoc` to generate clients in any language: + +```bash +# Python +python -m grpc_tools.protoc \ + -I./crates/grpc/proto \ + --python_out=./client \ + --grpc_python_out=./client \ + ./crates/grpc/proto/vector-db.proto + +# Go +protoc --go_out=. --go-grpc_out=. vector-db.proto + +# TypeScript +protoc --plugin=protoc-gen-ts \ + --ts_out=./client \ + vector-db.proto +``` + +## Next Steps + + + + REST API reference + + + Python client documentation + + diff --git a/docs/api-reference/http.mdx b/docs/api-reference/http.mdx new file mode 100644 index 0000000..fe88632 --- /dev/null +++ b/docs/api-reference/http.mdx @@ -0,0 +1,364 @@ +--- +title: "HTTP API" +description: "RESTful JSON API reference" +--- + +# HTTP API Reference + +VortexDB's HTTP API provides a RESTful interface for vector operations using JSON over HTTP/1.1. + +## Base URL + +``` +http://localhost:3000 +``` + +The port can be configured via the `HTTP_PORT` environment variable. 
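A client can mirror this convention when building request URLs. A minimal sketch (the `base_url` helper is illustrative, not part of any VortexDB SDK):

```python
import os

def base_url(host: str = "localhost") -> str:
    """Build the HTTP base URL, honoring HTTP_PORT (default 3000)."""
    port = os.environ.get("HTTP_PORT", "3000")
    return f"http://{host}:{port}"
```

With `HTTP_PORT` unset, this yields `http://localhost:3000`, matching the default base URL above.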
+ + +The HTTP API does not require authentication and is designed for development and trusted internal networks. For production, use the gRPC API or deploy behind a reverse proxy with authentication. + + +--- + +## Endpoints + +### Health Check + +Check if the server is running. + + + Returns "OK" if the server is healthy. + + + +```bash Request +curl http://localhost:3000/health +``` + +```text Response +OK +``` + + +--- + +### Root + +Verify the server is running. + + +```bash Request +curl http://localhost:3000/ +``` + +```text Response +Vector Database server is running! +``` + + +--- + +### Insert Point + + + Array of floating-point numbers representing the vector. Must match the configured `DIMENSION`. + + + + Metadata object associated with the vector. + + + + Type of content: `"Text"` or `"Image"` + + + The content string + + + + + + The UUID of the created point. + + + +```bash Request +curl -X POST http://localhost:3000/points \ + -H "Content-Type: application/json" \ + -d '{ + "vector": [0.1, 0.2, 0.3, 0.4], + "payload": { + "content_type": "Text", + "content": "Hello, VortexDB!" + } + }' +``` + +```json Response (201 Created) +{ + "point_id": "550e8400-e29b-41d4-a716-446655440000" +} +``` + + +**Error Responses:** + +| Status | Description | +|--------|-------------| +| `400 Bad Request` | Invalid JSON or missing fields | +| `500 Internal Server Error` | Server error during insertion | + +--- + +### Get Point + +Retrieve a point by its ID. + + + The UUID of the point to retrieve. + + + + The point's UUID. + + + + The stored vector values. + + + + The associated payload metadata. + + + +```bash Request +curl http://localhost:3000/points/550e8400-e29b-41d4-a716-446655440000 +``` + +```json Response (200 OK) +{ + "id": "550e8400-e29b-41d4-a716-446655440000", + "vector": [0.1, 0.2, 0.3, 0.4], + "payload": { + "content_type": "Text", + "content": "Hello, VortexDB!" 
+ } +} +``` + + +**Error Responses:** + +| Status | Description | +|--------|-------------| +| `404 Not Found` | Point does not exist | +| `500 Internal Server Error` | Server error during retrieval | + +--- + +### Delete Point + +Delete a point by its ID. + + + The UUID of the point to delete. + + + +```bash Request +curl -X DELETE http://localhost:3000/points/550e8400-e29b-41d4-a716-446655440000 +``` + +```text Response (204 No Content) +(empty body) +``` + + +**Error Responses:** + +| Status | Description | +|--------|-------------| +| `500 Internal Server Error` | Server error during deletion | + +--- + +### Search Points + +Search for the k nearest neighbors to a query vector. + + + Query vector. Must match the configured `DIMENSION`. + + + + Distance metric: `"Euclidean"`, `"Manhattan"`, `"Hamming"`, or `"Cosine"` + + + + Maximum number of results to return. + + + + Array of point IDs ordered by similarity (closest first). + + + +```bash Request +curl -X POST http://localhost:3000/points/search \ + -H "Content-Type: application/json" \ + -d '{ + "vector": [0.1, 0.2, 0.3, 0.4], + "similarity": "Cosine", + "limit": 5 + }' +``` + +```json Response (200 OK) +{ + "results": [ + "550e8400-e29b-41d4-a716-446655440000", + "6ba7b810-9dad-11d1-80b4-00c04fd430c8", + "f47ac10b-58cc-4372-a567-0e02b2c3d479" + ] +} +``` + + +**Error Responses:** + +| Status | Description | +|--------|-------------| +| `400 Bad Request` | Invalid JSON or missing fields | +| `500 Internal Server Error` | Server error during search | + +--- + +## Data Types + +### Vector + +An array of floating-point numbers: + +```json +[0.1, 0.2, 0.3, 0.4] +``` + + +The vector length must match the `DIMENSION` environment variable configured on the server. 
+ + +### Payload + +```json +{ + "content_type": "Text", + "content": "Your content here" +} +``` + +| Field | Type | Values | +|-------|------|--------| +| `content_type` | string | `"Text"` or `"Image"` | +| `content` | string | Any string content | + +### Similarity + +| Value | Description | +|-------|-------------| +| `"Euclidean"` | L2 distance (straight line) | +| `"Manhattan"` | L1 distance (city block) | +| `"Hamming"` | Count of differing elements | +| `"Cosine"` | Angular distance (1 - cosine similarity) | + +--- + +## Error Format + +Errors are returned as plain text with an appropriate HTTP status code: + +```bash +curl -v http://localhost:3000/points/nonexistent-id +``` + +``` +< HTTP/1.1 404 Not Found +< content-type: text/plain; charset=utf-8 +< +Point not found +``` + +--- + +## Examples + +### Complete Workflow + +```bash +# 1. Check health +curl http://localhost:3000/health +# OK + +# 2. Insert a vector +POINT_ID=$(curl -s -X POST http://localhost:3000/points \ + -H "Content-Type: application/json" \ + -d '{ + "vector": [0.1, 0.2, 0.3, 0.4], + "payload": {"content_type": "Text", "content": "First document"} + }' | jq -r '.point_id') + +echo "Created point: $POINT_ID" + +# 3. Get the point +curl http://localhost:3000/points/$POINT_ID + +# 4. Search for similar vectors +curl -X POST http://localhost:3000/points/search \ + -H "Content-Type: application/json" \ + -d '{ + "vector": [0.15, 0.25, 0.35, 0.45], + "similarity": "Cosine", + "limit": 10 + }' + +# 5. 
Delete the point +curl -X DELETE http://localhost:3000/points/$POINT_ID +``` + +### Batch Insert (using shell loop) + +```bash +for i in {1..10}; do + curl -s -X POST http://localhost:3000/points \ + -H "Content-Type: application/json" \ + -d "{ + \"vector\": [$(echo "scale=2; $i/10" | bc), $(echo "scale=2; $i/10 + 0.1" | bc), 0.3, 0.4], + \"payload\": {\"content_type\": \"Text\", \"content\": \"Document $i\"} + }" +done +``` + +--- + +## OpenAPI Specification + +The complete OpenAPI specification is available at: + +``` +/docs/openapi.yaml +``` + +You can import this into tools like Postman or Swagger UI for interactive API exploration. + +--- + +## Next Steps + + + + High-performance gRPC API reference + + + Build applications with the Python client + + diff --git a/docs/api-reference/overview.mdx b/docs/api-reference/overview.mdx new file mode 100644 index 0000000..5a149c3 --- /dev/null +++ b/docs/api-reference/overview.mdx @@ -0,0 +1,173 @@ +--- +title: "API Overview" +description: "VortexDB exposes two APIs: gRPC and HTTP" +--- + +# API Overview + +VortexDB provides two complementary APIs for different use cases: + +## Available APIs + + + + High-performance Protocol Buffer interface for production workloads + + + RESTful JSON interface for quick testing and prototyping + + + +## Comparison + +| Feature | gRPC | HTTP | +|---------|------|------| +| **Protocol** | HTTP/2 + Protobuf | HTTP/1.1 + JSON | +| **Default Port** | 50051 | 3000 | +| **Authentication** | API key required | None | +| **Performance** | Higher throughput | Lower latency for simple requests | +| **Client Libraries** | Auto-generated | Any HTTP client | +| **Best For** | Production, SDKs | Debugging, curl | + +## Authentication + +### gRPC + +The gRPC API requires authentication via the `authorization` header: + +```bash +# Using grpcurl +grpcurl -plaintext \ + -H "authorization: your-api-key" \ + localhost:50051 vectordb.VectorDB/GetPoint +``` + +```python +# Python SDK +db = VortexDB( + 
grpc_url="localhost:50051", + api_key="your-api-key" +) +``` + +### HTTP + +The HTTP API does not require authentication by default. It's designed for trusted internal networks and development. + + +In production, always deploy the HTTP API behind a reverse proxy with authentication, or disable it entirely using `DISABLE_HTTP=true`. + + +## Common Operations + +Both APIs support the same core operations: + +| Operation | gRPC Method | HTTP Endpoint | +|-----------|-------------|---------------| +| Insert vector | `InsertVector` | `POST /points` | +| Get point | `GetPoint` | `GET /points/:id` | +| Delete point | `DeletePoint` | `DELETE /points/:id` | +| Search vectors | `SearchPoints` | `POST /points/search` | +| Health check | - | `GET /health` | + +## Error Handling + +### gRPC Error Codes + +| Code | Name | Description | +|------|------|-------------| +| `0` | OK | Success | +| `3` | INVALID_ARGUMENT | Invalid request parameters | +| `5` | NOT_FOUND | Point not found | +| `13` | INTERNAL | Server error | +| `16` | UNAUTHENTICATED | Invalid or missing API key | + +### HTTP Status Codes + +| Code | Description | +|------|-------------| +| `200` | Success | +| `201` | Created (for insert) | +| `204` | No Content (for delete) | +| `400` | Bad Request | +| `404` | Not Found | +| `500` | Internal Server Error | + +## Data Types + +### Vector + +A dense vector of floating-point values: + + + + ```protobuf + message DenseVector { + repeated float values = 1; + } + ``` + + + ```json + { + "vector": [0.1, 0.2, 0.3, 0.4] + } + ``` + + + +### Payload + +Metadata attached to vectors: + + + + ```protobuf + enum ContentType { + Image = 0; + Text = 1; + } + + message Payload { + ContentType content_type = 1; + string content = 2; + } + ``` + + + ```json + { + "payload": { + "content_type": "Text", + "content": "Hello, world!" 
+ } + } + ``` + + + +### Similarity Metric + +Distance function for search: + +| Value | gRPC Enum | HTTP String | +|-------|-----------|-------------| +| Euclidean (L2) | `0` | `"Euclidean"` | +| Manhattan (L1) | `1` | `"Manhattan"` | +| Hamming | `2` | `"Hamming"` | +| Cosine | `3` | `"Cosine"` | + +## Rate Limiting + +VortexDB does not implement built-in rate limiting. For production deployments, use a reverse proxy or API gateway to enforce limits. + +## Next Steps + + + + Complete gRPC API documentation + + + Complete HTTP API documentation + + diff --git a/docs/concepts/architecture.mdx b/docs/concepts/architecture.mdx new file mode 100644 index 0000000..2af8ee7 --- /dev/null +++ b/docs/concepts/architecture.mdx @@ -0,0 +1,217 @@ +--- +title: "Architecture" +description: "Understanding VortexDB's modular architecture" +--- + +# Architecture + +VortexDB is a high-performance vector database built in Rust, designed with modularity and flexibility at its core. This page explains how the components work together. + +## High-Level Overview + +```mermaid +flowchart TB + subgraph Clients + HTTP[HTTP Client] + GRPC[gRPC Client] + SDK[Python SDK
gRPC under the hood] end subgraph Server[VortexDB Server] subgraph Transport[Transport Layer] HTTPLayer[HTTP Layer
Axum] + GRPCLayer[gRPC Layer
tonic + prost] + end + + API[API Core] + + subgraph Backend[Backend Layer] + Index[Index Layer] + Storage[Storage Layer] + Snapshot[Snapshot Engine] + end + end + + HTTP --> HTTPLayer + GRPC --> GRPCLayer + SDK --> GRPCLayer + + HTTPLayer <--> GRPCLayer + HTTPLayer --> API + GRPCLayer --> API + + API --> Index + API --> Storage + API --> Snapshot +``` + +## Crate Structure + +VortexDB is organized as a Rust workspace with the following crates: + +| Crate | Purpose | +|-------|---------| +| `server` | Main entry point, configuration, server startup | +| `api` | Core database logic and error handling | +| `grpc` | Protocol Buffers definitions and gRPC service | +| `http` | REST API handlers using Axum | +| `index` | Vector indexing algorithms (Flat, KD-Tree, HNSW) | +| `storage` | Persistence backends (InMemory, RocksDB) | +| `snapshot` | Point-in-time backup and restore | +| `defs` | Shared type definitions | +| `tui` | Terminal user interface | + +## Transport Layers + +VortexDB exposes two APIs that share the same underlying database: + +### gRPC API + +The **gRPC layer** is the primary high-performance interface: + +- **Protocol**: HTTP/2 with Protocol Buffers +- **Port**: 50051 (default) +- **Authentication**: API key via `authorization` header +- **Use cases**: Production workloads, SDKs, high-throughput scenarios + +```protobuf +service VectorDB { + rpc InsertVector(InsertVectorRequest) returns (PointID); + rpc DeletePoint(PointID) returns (google.protobuf.Empty); + rpc GetPoint(PointID) returns (Point); + rpc SearchPoints(SearchRequest) returns (SearchResponse); +} +``` + +### HTTP API + +The **HTTP layer** provides a RESTful interface: + +- **Protocol**: HTTP/1.1 with JSON +- **Port**: 3000 (default) +- **Authentication**: None (designed for internal/trusted networks) +- **Use cases**: Quick testing, curl commands, prototyping + +| Method | Endpoint | Description | +|--------|----------|-------------| +| `GET` | `/` | Root endpoint | +| `GET` | `/health` | 
Health check | +| `POST` | `/points` | Insert a point | +| `GET` | `/points/:id` | Get a point by ID | +| `DELETE` | `/points/:id` | Delete a point | +| `POST` | `/points/search` | Search for similar vectors | + +### When to Use Which? + + + + - Building production applications + - Using the Python SDK (it uses gRPC) + - Need authentication + - Processing high volumes of requests + - Want strongly-typed client libraries + + + - Quick prototyping with curl + - Integrating with systems that don't support gRPC + - Debugging and testing + - Building browser-based tools + + + +## Storage Layer + +The storage layer persists vectors and their payloads using a trait-based design: + +```rust +pub trait StorageEngine: Send + Sync { + fn insert(&self, vector: DenseVector, payload: Payload) -> Result<PointId>; + fn get(&self, point_id: PointId) -> Result<Option<Point>>; + fn delete(&self, point_id: PointId) -> Result<bool>; + fn checkpoint_at(&self, path: &Path) -> Result<StorageCheckpoint>; + fn restore_checkpoint(&mut self, checkpoint: &StorageCheckpoint) -> Result<()>; +} +``` + +### Available Backends + +| Backend | Description | Use Case | +|---------|-------------|----------| +| **InMemory** | Stores data in RAM | Development, testing, ephemeral workloads | +| **RocksDB** | LSM-tree persistent storage | Production deployments | + +Set the backend via the `STORAGE_TYPE` environment variable: +```bash +STORAGE_TYPE=rocksdb # or 'inmemory' +``` + +## Index Layer + +The index layer provides fast similarity search. See [Indexers](/concepts/indexers) for details on choosing the right index. 
+ +```rust +pub trait VectorIndex: Send + Sync { + fn insert(&mut self, vector: IndexedVector) -> Result<()>; + fn delete(&mut self, point_id: PointId) -> Result<bool>; + fn search(&self, query: DenseVector, similarity: Similarity, k: usize) -> Result<Vec<PointId>>; +} +``` + +## Data Flow + +Here's what happens when you insert a vector: + + + + Client sends insert request via gRPC or HTTP + + + Server validates vector dimensions match configuration + + + Vector and payload are persisted to the storage backend + + + Vector is added to the index for fast searching + + + Server returns the generated point ID to client + + + +## Configuration + +VortexDB is configured via environment variables: + +| Variable | Required | Default | Description | +|----------|----------|---------|-------------| +| `GRPC_ROOT_PASSWORD` | Yes | - | Authentication password for gRPC | +| `DIMENSION` | Yes | - | Vector dimensionality | +| `DATA_PATH` | No | `/data` | Path for persistent storage | +| `HTTP_HOST` | No | `0.0.0.0` | HTTP server bind address | +| `HTTP_PORT` | No | `3000` | HTTP server port | +| `GRPC_HOST` | No | `0.0.0.0` | gRPC server bind address | +| `GRPC_PORT` | No | `50051` | gRPC server port | +| `STORAGE_TYPE` | No | `inmemory` | Storage backend | +| `INDEX_TYPE` | No | `flat` | Index algorithm | +| `LOGGING` | No | `true` | Enable logging | +| `DISABLE_HTTP` | No | `false` | Run gRPC only | + +## Thread Safety + +VortexDB is designed for concurrent access: + +- The **storage layer** uses `Arc` for thread-safe reference counting +- The **index layer** uses `RwLock` for concurrent reads with exclusive writes +- Both gRPC and HTTP handlers are fully async using Tokio + +## Next Steps + + + + Learn about index algorithms and when to use each + + + Understand backup and restore mechanisms + + diff --git a/docs/concepts/indexers.mdx b/docs/concepts/indexers.mdx new file mode 100644 index 0000000..423ed83 --- /dev/null +++ b/docs/concepts/indexers.mdx @@ -0,0 +1,206 @@ +--- +title: "Indexers" 
+description: "Choosing the right vector index for your use case" +--- + +# Indexers + +VortexDB supports multiple indexing algorithms, each optimized for different use cases. The index determines how vectors are organized for similarity search. + +## Overview + +| Index | Search Time | Build Time | Memory | Best For | +|-------|------------|------------|--------|----------| +| **Flat** | O(n) | O(1) | Low | Small datasets, exact results | +| **KD-Tree** | O(log n) avg | O(n log n) | Medium | Low-dimensional vectors | +| **HNSW** | O(log n) | O(n log n) | Higher | Large-scale approximate search | + +## Flat Index + +The **Flat** index performs brute-force exhaustive search by computing distances to every vector. + +### Characteristics + + + + - **100% recall**: Always returns the exact nearest neighbors + - **No build time**: Vectors are simply appended + - **Works in any dimension**: No curse of dimensionality + - **Simple and predictable**: Easy to reason about + + + - **Linear search time**: Scans every vector for each query + - **Not scalable**: Impractical for millions of vectors + + + +### When to Use + +- **Dataset size**: < 10,000 vectors +- **Requirements**: Need exact/guaranteed results +- **Use cases**: Testing, prototyping, small production workloads + +```bash +INDEX_TYPE=flat +``` + +## KD-Tree Index + +The **KD-Tree** (k-dimensional tree) is a space-partitioning data structure that recursively divides the vector space. + +At each level, the tree splits data along a different dimension (cycling through x, y, z, ...). 
+ +### Characteristics + + + + - **Fast for low dimensions**: O(log n) average search time when d < 20 + - **Exact results**: Returns true nearest neighbors + - **Efficient updates**: Supports incremental insertions + + + - **Curse of dimensionality**: Performance degrades above ~20 dimensions + - **Rebalancing needed**: Can become unbalanced with many deletions + + + +### When to Use + +- **Vector dimensions**: < 20 dimensions +- **Dataset size**: Thousands to hundreds of thousands of vectors +- **Use cases**: Geographic data, low-dimensional embeddings, spatial queries + +```bash +INDEX_TYPE=kdtree +``` + + +For high-dimensional embeddings (e.g., 384, 768, 1536 dimensions common in ML), KD-Tree is not recommended. Use HNSW instead. + + +## HNSW Index + +**HNSW** (Hierarchical Navigable Small World) is a state-of-the-art approximate nearest neighbor algorithm based on proximity graphs. It constructs a multi-layered graph where each layer represents a different level of granularity, enabling efficient navigation from coarse to fine-grained similarity search. + +Check out this [blog post](https://blog.sdslabs.co/2026/03/hnsw-index) for more theoretical details and [this blog](https://blog.sdslabs.co/2026/03/hnsw-indexp2) covering implementation in VortexDB. + +### Characteristics + + + + - **Sub-linear search**: O(log n) regardless of dimensions + - **High recall**: Typically 95-99%+ with proper tuning + - **Scales to billions**: Used by major vector databases + - **Handles high dimensions**: Works well with 768+ dimensional vectors + + + - **Higher memory**: Graph structure adds overhead + - **Approximate results**: May miss exact nearest neighbors + - **Build time**: Initial index construction is slower + + + +### Algorithm Reference + +VortexDB's HNSW implementation follows the original paper: + +> Malkov, Y. A., & Yashunin, D. A. (2018). *Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs*. 
[arXiv:1603.09320](https://arxiv.org/abs/1603.09320) + +### When to Use + +- **Vector dimensions**: Any, but especially > 20 dimensions +- **Dataset size**: 100,000+ vectors +- **Use cases**: Semantic search, recommendation systems, RAG applications + +```bash +INDEX_TYPE=hnsw +``` + +## Distance Metrics + +All indexes support four distance/similarity metrics: + + + + Measures the angle between two vectors, ignoring magnitude. + + $$d = 1 - \frac{A \cdot B}{\|A\| \|B\|}$$ + + - **Range**: 0 (identical) to 2 (opposite) + - **Best for**: Text embeddings, normalized vectors + + ```python + similarity=Similarity.COSINE + ``` + + + L2 distance—the straight-line distance between points. + + $$d = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}$$ + + - **Range**: 0 (identical) to ∞ + - **Best for**: General purpose + + ```python + similarity=Similarity.EUCLIDEAN + ``` + + + L1 distance—the sum of absolute differences. + + $$d = \sum_{i=1}^{n} |a_i - b_i|$$ + + - **Range**: 0 (identical) to ∞ + - **Best for**: Sparse vectors, grid-based data + + ```python + similarity=Similarity.MANHATTAN + ``` + + + Counts positions where elements differ. 
+ + - **Range**: 0 (identical) to n (completely different) + - **Best for**: Binary vectors, categorical data + + ```python + similarity=Similarity.HAMMING + ``` + + + +## Choosing an Index + +Use this decision tree: + +| Vectors | Dimensions | Recommended Index | +|---------|------------|-------------------| +| < 10,000 | Any | **Flat** | +| 10K - 100K | < 20 | **KD-Tree** | +| 10K - 100K | ≥ 20 | **HNSW** | +| > 100K | Any | **HNSW** | + +## Configuration Example + +```bash +# For a semantic search application with OpenAI embeddings +DIMENSION=1536 +INDEX_TYPE=hnsw +STORAGE_TYPE=rocksdb + +# For a small prototype with sentence-transformers +DIMENSION=384 +INDEX_TYPE=flat +STORAGE_TYPE=inmemory +``` + +## Next Steps + + + + Learn how to backup and restore your index + + + Explore the complete API + + diff --git a/docs/concepts/snapshots.mdx b/docs/concepts/snapshots.mdx new file mode 100644 index 0000000..ed8f890 --- /dev/null +++ b/docs/concepts/snapshots.mdx @@ -0,0 +1,235 @@ +--- +title: "Snapshots" +description: "Backup and restore your vector database" +--- + +# Snapshots + +VortexDB's snapshot system provides point-in-time backups of your entire database, including both the index topology and stored data. + +## Overview + +A snapshot captures: +- **Index topology**: The structure of your index (graph edges, tree nodes, etc.) 
+- **Index metadata**: Configuration specific to the index type +- **Storage data**: All vectors and their payloads +- **Database metadata**: Dimensions, timestamps, version info + +## How It Works + +### Creating a Snapshot + +The snapshot process involves two parallel operations: + + + + The index serializes its topology and metadata into binary format: + ```rust + pub trait SerializableIndex { + fn serialize_topology(&self) -> Result<Vec<u8>>; + fn serialize_metadata(&self) -> Result<Vec<u8>>; + fn snapshot(&self) -> Result<IndexSnapshot>; + } + ``` + + + The storage backend creates a consistent checkpoint: + ```rust + fn checkpoint_at(&self, path: &Path) -> Result<StorageCheckpoint>; + ``` + For RocksDB, this uses the native checkpoint API for efficiency. + + + A manifest file is created with: + - Snapshot UUID + - Timestamp + - Parser version (for compatibility) + - SHA-256 checksums of all files + + + All files are bundled into a compressed tarball (`.tar.gz`). + + + +### Restoring a Snapshot + + + + The tarball is extracted to a temporary directory. + + + All file checksums are verified against the manifest. + + + The parser version is checked for compatibility. + + + The storage checkpoint is restored first. + + + The index is deserialized from topology and metadata, then populated with vectors from storage. 
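The checksum-verification step of the restore sequence boils down to recomputing SHA-256 digests and comparing them against manifest entries. A sketch over in-memory bytes (the helper and the tiny manifest here are illustrative, not VortexDB's code):

```python
import hashlib

def verify_files(manifest: dict, read_bytes) -> None:
    """Fail fast if any file's SHA-256 digest disagrees with the manifest."""
    for name, entry in manifest["files"].items():
        digest = hashlib.sha256(read_bytes(entry["filename"])).hexdigest()
        if "sha256:" + digest != entry["checksum"]:
            raise ValueError(f"checksum mismatch for {name}")

# Illustrative in-memory "files" standing in for extracted tarball contents:
blobs = {"hnsw-index-topo.bin": b"topology-bytes"}
manifest = {
    "files": {
        "index_topology": {
            "filename": "hnsw-index-topo.bin",
            "checksum": "sha256:" + hashlib.sha256(b"topology-bytes").hexdigest(),
        }
    }
}
verify_files(manifest, blobs.__getitem__)  # silent on success
```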
+ +## Index-Specific Serialization + +Each index type has its own serialization format: + + + + - **Topology**: List of point IDs in insertion order + - **Metadata**: Minimal (just magic bytes) + - **Restore**: O(n) - simply rebuild the vector list + + + - **Topology**: Tree structure with node relationships + - **Metadata**: Dimension count, tree depth + - **Restore**: O(n) - rebuild tree structure, populate vectors + + + - **Topology**: Multi-layer graph with all edges + - **Metadata**: Level multiplier, max connections, entry point + - **Restore**: O(n) - rebuild graph layers, populate vectors + + + +## Snapshot Data Structures + +### Snapshot Object + +```rust +pub struct Snapshot { + pub id: Uuid, // Unique identifier + pub date: SystemTime, // Creation timestamp + pub sem_ver: Version, // Parser version + pub index_snapshot: IndexSnapshot, // Index data + pub storage_snapshot: StorageCheckpoint, // Storage data + pub dimensions: usize, // Vector dimensions +} +``` + +### Index Snapshot + +```rust +pub struct IndexSnapshot { + pub index_type: IndexType, // Flat, KDTree, or HNSW + pub magic: Magic, // 4-byte identifier + pub topology_b: Vec<u8>, // Serialized graph/tree + pub metadata_b: Vec<u8>, // Index-specific config +} +``` + +### Storage Checkpoint + +```rust +pub struct StorageCheckpoint { + pub path: PathBuf, // Path to checkpoint files + pub storage_type: StorageType, // InMemory or RocksDB +} +``` + +## Snapshot Engine + +The `SnapshotEngine` automates periodic snapshots: + +### Features + +- **Scheduled backups**: Automatically creates snapshots at configured intervals +- **Retention policy**: Keeps the `k` most recent snapshots +- **Background operation**: Runs in a separate thread without blocking queries + +### Registry Interface + +```rust +pub trait SnapshotRegistry: Send + Sync { + // Add a new snapshot to the registry + fn add_snapshot(&mut self, path: &Path) -> Result<Snapshot>; + + // List snapshots with pagination + fn list_snapshots(&mut self, limit: 
usize, offset: usize) -> Result<Vec<Snapshot>>;

    // Get the most recent snapshot
    fn get_latest_snapshot(&mut self) -> Result<Snapshot>;

    // Get metadata for a specific snapshot
    fn get_metadata(&mut self, id: String) -> Result<Snapshot>;

    // Remove a snapshot
    fn remove_snapshot(&mut self, id: String) -> Result<()>;

    // Load and restore a snapshot
    fn load(&mut self, id: String, path: &Path) -> Result<Snapshot>;
}
```

## Manifest Format

The manifest (`manifest.json`) contains all metadata needed for restore:

```json
{
  "snapshot_id": "550e8400-e29b-41d4-a716-446655440000",
  "created_at": "2026-03-31T12:00:00Z",
  "parser_version": "1.0.0",
  "dimensions": 384,
  "index_type": "HNSW",
  "files": {
    "index_metadata": {
      "filename": "hnsw-index-meta.bin",
      "checksum": "sha256:abc123..."
    },
    "index_topology": {
      "filename": "hnsw-index-topo.bin",
      "checksum": "sha256:def456..."
    },
    "storage": {
      "filename": "rocksdb-checkpoint/",
      "checksum": "sha256:789ghi..."
    }
  }
}
```

## Consistency Guarantees

VortexDB snapshots provide the following guarantees:

- **Atomicity**: A snapshot represents a consistent point-in-time view. Either the entire snapshot succeeds, or no partial state is written.
- **Integrity**: SHA-256 checksums verify that snapshot files haven't been corrupted or tampered with during storage or transfer.
- **Version safety**: Each snapshot includes a parser version. VortexDB will refuse to restore snapshots from incompatible versions to prevent data corruption.
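The restore-time checks can be sketched as follows. This is a hedged Python illustration of the two gates (version compatibility, then checksum verification), not VortexDB's Rust code; directory entries such as RocksDB checkpoints are skipped for brevity, and `PARSER_MAJOR` is an assumed constant:

```python
import hashlib
import json
from pathlib import Path

PARSER_MAJOR = 1  # assumed current major parser version

def verify_manifest(extract_dir: Path) -> None:
    """Raise ValueError if the extracted snapshot fails any check."""
    manifest = json.loads((extract_dir / "manifest.json").read_text())
    # Version gate: refuse snapshots from an incompatible major version.
    major = int(manifest["parser_version"].split(".")[0])
    if major != PARSER_MAJOR:
        raise ValueError(f"incompatible parser version {manifest['parser_version']}")
    # Integrity gate: every listed file must match its recorded checksum.
    for entry in manifest["files"].values():
        if entry["filename"].endswith("/"):
            continue  # directories (e.g. RocksDB checkpoints) are checked elsewhere
        data = (extract_dir / entry["filename"]).read_bytes()
        actual = "sha256:" + hashlib.sha256(data).hexdigest()
        if actual != entry["checksum"]:
            raise ValueError(f"checksum mismatch for {entry['filename']}")
```

Running the version gate first is deliberate: there is no point hashing gigabytes of checkpoint data if the snapshot cannot be restored anyway.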
+ + + +## Best Practices + + + + Configure automated snapshots based on your data change rate + + + Periodically verify that snapshots can be successfully restored + + + Copy snapshots to external storage for disaster recovery + + + Track snapshot sizes and adjust retention policies accordingly + + + +## Next Steps + + + + Deploy VortexDB with persistent snapshots + + + Learn how to trigger snapshots via API + + diff --git a/docs/getting-started/installation.mdx b/docs/getting-started/installation.mdx new file mode 100644 index 0000000..ef1a5c1 --- /dev/null +++ b/docs/getting-started/installation.mdx @@ -0,0 +1,155 @@ +--- +title: "Installation" +description: "Get VortexDB up and running in minutes" +--- + +# Installation + +VortexDB can be installed using Docker (recommended) or built from source. This guide covers both methods. + +## Prerequisites + + + + Docker 20.10+ and Docker Compose v2 + + + Rust 1.88+, protobuf-compiler, clang + + + +## Docker Installation (Recommended) + +The fastest way to get started is using Docker: + +```bash +# Clone the repository +git clone https://github.com/sdslabs/VortexDB.git +cd VortexDB + +# Copy the environment template +cp .env.example .env +``` + +### Configure Environment + +Edit the `.env` file with your settings: + +```bash +# Required settings +GRPC_ROOT_PASSWORD=your-secure-password +DIMENSION=384 # Vector dimension (e.g., 384 for MiniLM embeddings) +DATA_PATH=/data + +# Optional settings (with defaults) +HTTP_HOST=0.0.0.0 +HTTP_PORT=3000 +GRPC_HOST=0.0.0.0 +GRPC_PORT=50051 +STORAGE_TYPE=inmemory # inmemory | rocksdb +INDEX_TYPE=flat # flat | kdtree | hnsw +LOGGING=true +DISABLE_HTTP=false +``` + +### Start VortexDB + +```bash +docker compose up +``` + + +Use `docker compose up --build` after making code changes to rebuild the image. 
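Startup can take a few seconds while the container builds its index. A small helper like the sketch below polls the health endpoint until the server answers; the `probe` parameter is an assumption added purely to make the helper testable, and the default URL assumes the standard port:

```python
import time
import urllib.request

def wait_for_health(url: str = "http://localhost:3000/health",
                    timeout: float = 30.0,
                    probe=None) -> bool:
    """Poll `url` until it returns HTTP 200 or `timeout` seconds elapse."""
    def default_probe() -> bool:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                return resp.status == 200
        except OSError:
            return False

    probe = probe or default_probe
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(0.5)  # back off briefly between attempts
    return False
```

Call `wait_for_health()` after `docker compose up -d` and only proceed to inserting vectors once it returns `True`.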
+ + +VortexDB is now running: +- **HTTP API**: http://localhost:3000 +- **gRPC API**: localhost:50051 + +## Building from Source + +### Install Dependencies + + + + ```bash + sudo apt-get update && sudo apt-get install -y \ + protobuf-compiler \ + clang \ + libclang-dev \ + llvm-dev \ + build-essential + ``` + + + ```bash + brew install protobuf llvm + ``` + + + ```bash + sudo pacman -S protobuf clang llvm + ``` + + + +### Build and Run + +```bash +# Clone the repository +git clone https://github.com/sdslabs/VortexDB.git +cd VortexDB + +# Build in release mode +cargo build --release + +# Run the server +./target/release/server +``` + +## Python SDK Installation + +Install the Python client to interact with VortexDB: + +```bash +pip install vortexdb +``` + +Or install from source: + +```bash +cd client/python +pip install -e . +``` + + +The Python SDK requires Python 3.9+ and communicates with VortexDB via gRPC. + + +## Verify Installation + +Check that VortexDB is running correctly: + + +```bash Health Check (HTTP) +curl http://localhost:3000/health +# Response: OK +``` + +```python Health Check (Python) +from vortexdb import VortexDB + +db = VortexDB( + grpc_url="localhost:50051", + api_key="your-secure-password" +) +print("Connected successfully!") +db.close() +``` + + +## Next Steps + + + Insert your first vector in 60 seconds + diff --git a/docs/getting-started/quickstart.mdx b/docs/getting-started/quickstart.mdx new file mode 100644 index 0000000..b93090f --- /dev/null +++ b/docs/getting-started/quickstart.mdx @@ -0,0 +1,236 @@ +--- +title: "Quickstart" +description: "Insert your first vector in 60 seconds" +--- + +# Quickstart + +This guide will have you inserting and searching vectors in under 60 seconds. + +## Prerequisites + +Make sure VortexDB is running (see [Installation](/getting-started/installation)). 
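Before running the examples below, you can confirm the gRPC port is reachable. A minimal sketch (stdlib only, no VortexDB client required):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: check the default gRPC port before connecting the SDK.
# port_open("localhost", 50051)
```

A `False` result usually means the container isn't running yet or a different `GRPC_PORT` was configured.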
+ +## Insert Your First Vector + + + + ```python + from vortexdb import VortexDB, DenseVector, Payload, Similarity + + # Connect to VortexDB + db = VortexDB( + grpc_url="localhost:50051", + api_key="your-secure-password" + ) + + # Insert a vector with a text payload + point_id = db.insert( + vector=DenseVector([0.1, 0.2, 0.3, 0.4]), + payload=Payload.text("Hello, VortexDB!") + ) + print(f"Inserted point: {point_id}") + + # Clean up + db.close() + ``` + + + ```bash + curl -X POST http://localhost:3000/points \ + -H "Content-Type: application/json" \ + -d '{ + "vector": [0.1, 0.2, 0.3, 0.4], + "payload": { + "content_type": "Text", + "content": "Hello, VortexDB!" + } + }' + ``` + + Response: + ```json + { + "point_id": "550e8400-e29b-41d4-a716-446655440000" + } + ``` + + + ```protobuf + // Using grpcurl + grpcurl -plaintext \ + -H "authorization: your-secure-password" \ + -d '{ + "vector": {"values": [0.1, 0.2, 0.3, 0.4]}, + "payload": {"content_type": 1, "content": "Hello, VortexDB!"} + }' \ + localhost:50051 vectordb.VectorDB/InsertVector + ``` + + + +## Search for Similar Vectors + + + + ```python + from vortexdb import VortexDB, DenseVector, Similarity + + db = VortexDB( + grpc_url="localhost:50051", + api_key="your-secure-password" + ) + + # Search for 5 most similar vectors using cosine similarity + results = db.search( + vector=DenseVector([0.1, 0.2, 0.3, 0.4]), + similarity=Similarity.COSINE, + limit=5 + ) + + print(f"Found {len(results)} similar vectors:") + for point_id in results: + print(f" - {point_id}") + + db.close() + ``` + + + ```bash + curl -X POST http://localhost:3000/points/search \ + -H "Content-Type: application/json" \ + -d '{ + "vector": [0.1, 0.2, 0.3, 0.4], + "similarity": "Cosine", + "limit": 5 + }' + ``` + + Response: + ```json + { + "results": [ + "550e8400-e29b-41d4-a716-446655440000", + "6ba7b810-9dad-11d1-80b4-00c04fd430c8" + ] + } + ``` + + + +## Retrieve a Point + + + + ```python + # Get the point you just inserted + point = 
db.get(point_id=point_id) + + if point: + print(point.pretty()) + # Output: + # Point ID: 550e8400-e29b-41d4-a716-446655440000 + # Vector: [0.1, 0.2, 0.3, 0.4] + # Payload: Hello, VortexDB! + ``` + + + ```bash + curl http://localhost:3000/points/550e8400-e29b-41d4-a716-446655440000 + ``` + + Response: + ```json + { + "id": "550e8400-e29b-41d4-a716-446655440000", + "vector": [0.1, 0.2, 0.3, 0.4], + "payload": { + "content_type": "Text", + "content": "Hello, VortexDB!" + } + } + ``` + + + +## Delete a Point + + + + ```python + db.delete(point_id=point_id) + print("Point deleted successfully") + ``` + + + ```bash + curl -X DELETE http://localhost:3000/points/550e8400-e29b-41d4-a716-446655440000 + ``` + + + +## Complete Example + +Here's a complete example using the Python SDK with context manager: + +```python +from vortexdb import VortexDB, DenseVector, Payload, Similarity + +# Using context manager for automatic cleanup +with VortexDB(grpc_url="localhost:50051", api_key="secret") as db: + # Insert some vectors + vectors = [ + ([0.1, 0.2, 0.3, 0.4], "First document"), + ([0.2, 0.3, 0.4, 0.5], "Second document"), + ([0.9, 0.8, 0.7, 0.6], "Third document"), + ] + + point_ids = [] + for vec, text in vectors: + pid = db.insert( + vector=DenseVector(vec), + payload=Payload.text(text) + ) + point_ids.append(pid) + print(f"Inserted: {text} -> {pid}") + + # Search for vectors similar to the first one + results = db.search( + vector=DenseVector([0.15, 0.25, 0.35, 0.45]), + similarity=Similarity.COSINE, + limit=2 + ) + + print(f"\nTop 2 similar vectors:") + for pid in results: + point = db.get(point_id=pid) + print(f" - {point.payload.content}") +``` + +## Distance Metrics + +VortexDB supports four distance/similarity metrics: + +| Metric | Description | Best For | +|--------|-------------|----------| +| `COSINE` | Measures angle between vectors | Text embeddings, normalized vectors | +| `EUCLIDEAN` | L2 distance (straight line) | General purpose | +| `MANHATTAN` | L1 distance 
(city block) | Sparse vectors | +| `HAMMING` | Count of differing elements | Binary vectors | + +## Next Steps + + + + Understand how VortexDB works under the hood + + + Choose the right index for your use case + + + Build a semantic search application + + + Explore the complete API + + diff --git a/docs/guides/docker-compose.mdx b/docs/guides/docker-compose.mdx new file mode 100644 index 0000000..7e52b58 --- /dev/null +++ b/docs/guides/docker-compose.mdx @@ -0,0 +1,375 @@ +--- +title: "Deploy with Docker Compose" +description: "Production-ready VortexDB deployment guide" +--- + +# Deploy with Docker Compose + +This guide covers deploying VortexDB using Docker Compose for development and production environments. + +## Prerequisites + +- Docker 20.10+ +- Docker Compose v2 +- Git + +## Quick Start + +```bash +# Clone the repository +git clone https://github.com/sdslabs/VortexDB.git +cd VortexDB + +# Create environment file +cp .env.example .env + +# Start VortexDB +docker compose up +``` + +## Configuration + +### Required Settings + +Create a `.env` file with these required variables: + +```bash +# Authentication password for gRPC (required) +GRPC_ROOT_PASSWORD=your-secure-password-here + +# Vector dimensions - must match your embeddings (required) +DIMENSION=384 +``` + +### Optional Settings + +| Variable | Default | Description | +|----------|---------|-------------| +| `HTTP_HOST` | `0.0.0.0` | HTTP server bind address | +| `HTTP_PORT` | `3000` | HTTP server port | +| `GRPC_HOST` | `0.0.0.0` | gRPC server bind address | +| `GRPC_PORT` | `50051` | gRPC server port | +| `STORAGE_TYPE` | `inmemory` | `inmemory` or `rocksdb` | +| `INDEX_TYPE` | `flat` | `flat`, `kdtree`, or `hnsw` | +| `LOGGING` | `true` | Enable logging | +| `DISABLE_HTTP` | `false` | Disable HTTP API | +| `DATA_PATH` | `/data` | Container data directory | + +### Example Production Configuration + +```bash +# .env +GRPC_ROOT_PASSWORD=production-secure-password-change-me + +# Vector config (1536 for 
OpenAI ada-002) +DIMENSION=1536 + +# Server config +HTTP_HOST=0.0.0.0 +HTTP_PORT=3000 +GRPC_HOST=0.0.0.0 +GRPC_PORT=50051 + +# Production settings +STORAGE_TYPE=rocksdb +INDEX_TYPE=hnsw +LOGGING=true +DISABLE_HTTP=false +``` + +## Docker Compose File + +The default `docker-compose.yml`: + +```yaml +services: + vortexdb: + build: . + image: vortexdb:latest + container_name: vortexdb + restart: unless-stopped + + env_file: + - .env + + environment: + HTTP_HOST: ${HTTP_HOST:-0.0.0.0} + HTTP_PORT: ${HTTP_PORT:-3000} + GRPC_HOST: ${GRPC_HOST:-0.0.0.0} + GRPC_PORT: ${GRPC_PORT:-50051} + STORAGE_TYPE: ${STORAGE_TYPE:-inmemory} + INDEX_TYPE: ${INDEX_TYPE:-flat} + LOGGING: ${LOGGING:-true} + DISABLE_HTTP: ${DISABLE_HTTP:-false} + GRPC_ROOT_PASSWORD: ${GRPC_ROOT_PASSWORD} + DIMENSION: ${DIMENSION} + DATA_PATH: /data + + ports: + - "${HTTP_PORT:-3000}:3000" + - "${GRPC_PORT:-50051}:50051" + + volumes: + - vortexdb_data:/data + +volumes: + vortexdb_data: +``` + +## Common Operations + +### Start Services + +```bash +# Start in foreground (with logs) +docker compose up + +# Start in background +docker compose up -d + +# Rebuild and start (after code changes) +docker compose up --build +``` + +### View Logs + +```bash +# Follow logs +docker compose logs -f + +# Last 100 lines +docker compose logs --tail 100 +``` + +### Stop Services + +```bash +# Stop containers +docker compose stop + +# Stop and remove containers +docker compose down + +# Stop, remove containers, AND delete volumes (data loss!) 
+docker compose down -v +``` + +### Restart Services + +```bash +docker compose restart +``` + +## Deployment Scenarios + +### Development Mode + +For quick testing with in-memory storage: + +```bash +STORAGE_TYPE=inmemory \ +INDEX_TYPE=flat \ +DIMENSION=4 \ +GRPC_ROOT_PASSWORD=dev-password \ +docker compose up +``` + +### gRPC Only + +Disable the HTTP API for security: + +```bash +DISABLE_HTTP=true docker compose up +``` + +### Custom Ports + +```bash +HTTP_PORT=8080 \ +GRPC_PORT=9000 \ +docker compose up +``` + +This exposes HTTP on port 8080 and gRPC on port 9000. + +## Data Persistence + +### Volume Location + +Data is stored in a Docker volume called `vortexdb_data`. View its location: + +```bash +docker volume inspect vortexdb_vortexdb_data +``` + +### Backup Data + +```bash +# Create a backup +docker run --rm \ + -v vortexdb_vortexdb_data:/data \ + -v $(pwd)/backup:/backup \ + alpine tar czf /backup/vortexdb-backup.tar.gz -C /data . +``` + +### Restore Data + +```bash +# Restore from backup +docker run --rm \ + -v vortexdb_vortexdb_data:/data \ + -v $(pwd)/backup:/backup \ + alpine sh -c "cd /data && tar xzf /backup/vortexdb-backup.tar.gz" +``` + +## Health Checks + +### HTTP Health Check + +```bash +curl http://localhost:3000/health +# OK +``` + +### Using Docker Health Status + +Add a health check to your compose file: + +```yaml +services: + vortexdb: + # ... other config ... + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:3000/health"] + interval: 30s + timeout: 10s + retries: 3 + start_period: 40s +``` + +## Production Tips + + + + Always use `STORAGE_TYPE=rocksdb` in production to ensure data survives container restarts. + + + + For datasets over 100,000 vectors, use `INDEX_TYPE=hnsw` for optimal search performance. + + + + Either disable it with `DISABLE_HTTP=true` or place it behind a reverse proxy with authentication. 
+ + + + Add resource limits to prevent runaway memory usage: + + ```yaml + services: + vortexdb: + deploy: + resources: + limits: + memory: 4G + cpus: '2' + ``` + + + + Generate a secure password for `GRPC_ROOT_PASSWORD`: + + ```bash + openssl rand -base64 32 + ``` + + + +## Reverse Proxy Setup + +### Nginx Example + +```nginx +upstream vortexdb_http { + server localhost:3000; +} + +server { + listen 80; + server_name vortexdb.example.com; + + location / { + proxy_pass http://vortexdb_http; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + } +} + +# For gRPC (requires nginx 1.13.10+) +server { + listen 443 http2; + server_name grpc.vortexdb.example.com; + + location / { + grpc_pass grpc://localhost:50051; + } +} +``` + +### Traefik Example + +```yaml +# docker-compose.yml with Traefik labels +services: + vortexdb: + labels: + - "traefik.enable=true" + - "traefik.http.routers.vortexdb.rule=Host(`vortexdb.example.com`)" + - "traefik.http.services.vortexdb.loadbalancer.server.port=3000" +``` + +## Troubleshooting + + + + Check logs for missing required environment variables: + + ```bash + docker compose logs + ``` + + Ensure `GRPC_ROOT_PASSWORD` and `DIMENSION` are set. + + + + Change the ports in your `.env` file: + + ```bash + HTTP_PORT=8080 + GRPC_PORT=50052 + ``` + + + + Ensure `STORAGE_TYPE=rocksdb` is set and the volume is properly mounted. + + + + The container runs as user `vortexdb`. 
If you're using a bind mount, set permissions: + + ```bash + sudo chown -R 1000:1000 ./data + ``` + + + +## Next Steps + + + + Build a semantic search application + + + Learn about backup and restore + + diff --git a/docs/guides/python-sdk.mdx b/docs/guides/python-sdk.mdx new file mode 100644 index 0000000..55f6981 --- /dev/null +++ b/docs/guides/python-sdk.mdx @@ -0,0 +1,424 @@ +--- +title: "Build with Python SDK" +description: "Create a semantic search application using the VortexDB Python client" +--- + +# Build with the Python SDK + +This guide walks through building a semantic search application using VortexDB's Python SDK. + +## Installation + +```bash +pip install vortexdb +``` + +Or install from source: + +```bash +git clone https://github.com/sdslabs/VortexDB.git +cd VortexDB/client/python +pip install -e . +``` + + +The Python SDK requires Python 3.9+ and uses gRPC for communication. + + +## Quick Start + +```python +from vortexdb import VortexDB, DenseVector, Payload, Similarity + +# Connect to VortexDB +db = VortexDB( + grpc_url="localhost:50051", + api_key="your-api-key" +) + +# Insert a vector +point_id = db.insert( + vector=DenseVector([0.1, 0.2, 0.3, 0.4]), + payload=Payload.text("Hello, world!") +) + +# Search for similar vectors +results = db.search( + vector=DenseVector([0.1, 0.2, 0.3, 0.4]), + similarity=Similarity.COSINE, + limit=5 +) + +# Clean up +db.close() +``` + +## Client Configuration + +### Connection Options + +```python +db = VortexDB( + grpc_url="localhost:50051", # VortexDB server address + api_key="your-api-key", # Authentication key + timeout=30.0, # Request timeout in seconds +) +``` + +### Environment Variables + +The client can also be configured via environment variables: + +| Variable | Description | +|----------|-------------| +| `VORTEXDB_GRPC_URL` | Server address | +| `VORTEXDB_API_KEY` | Authentication key | +| `VORTEXDB_TIMEOUT` | Request timeout | + +```python +import os +os.environ["VORTEXDB_GRPC_URL"] = 
"localhost:50051" +os.environ["VORTEXDB_API_KEY"] = "secret" + +# Client uses environment variables +db = VortexDB() +``` + +### Priority Order + +Configuration is loaded in this order (later overrides earlier): +1. Default values +2. Environment variables +3. Constructor arguments + +## Using Context Managers + +The recommended way to use the client is with a context manager: + +```python +from vortexdb import VortexDB, DenseVector, Payload, Similarity + +# Connection is automatically closed when exiting the 'with' block +with VortexDB(grpc_url="localhost:50051", api_key="secret") as db: + point_id = db.insert( + vector=DenseVector([0.1, 0.2, 0.3]), + payload=Payload.text("My document") + ) + print(f"Inserted: {point_id}") +# Connection closed automatically +``` + +## Core Operations + +### Insert Vectors + +```python +from vortexdb import DenseVector, Payload + +# Insert a text document +point_id = db.insert( + vector=DenseVector([0.1, 0.2, 0.3, 0.4]), + payload=Payload.text("This is my document content") +) + +# Insert an image reference +image_id = db.insert( + vector=DenseVector([0.5, 0.6, 0.7, 0.8]), + payload=Payload.image("path/to/image.jpg") +) +``` + +### Retrieve Points + +```python +# Get a point by ID +point = db.get(point_id=point_id) + +if point: + print(f"ID: {point.id}") + print(f"Vector: {point.vector.to_list()}") + print(f"Content: {point.payload.content}") + print(f"Type: {point.payload.content_type}") + + # Pretty print + print(point.pretty()) +else: + print("Point not found") +``` + +### Search for Similar Vectors + +```python +from vortexdb import Similarity + +# Search with different metrics +results = db.search( + vector=DenseVector([0.1, 0.2, 0.3, 0.4]), + similarity=Similarity.COSINE, # or EUCLIDEAN, MANHATTAN, HAMMING + limit=10 +) + +# Results are point IDs ordered by similarity +for point_id in results: + point = db.get(point_id=point_id) + print(f" {point.payload.content}") +``` + +### Delete Points + +```python +# Delete by ID 
+db.delete(point_id=point_id) +``` + +## Data Types + +### DenseVector + +```python +from vortexdb import DenseVector + +# Create from list +vec = DenseVector([0.1, 0.2, 0.3, 0.4]) + +# Create from tuple +vec = DenseVector((0.1, 0.2, 0.3, 0.4)) + +# Access values +values = vec.to_list() # [0.1, 0.2, 0.3, 0.4] +``` + + +All vectors must have the same dimension as configured on the server (`DIMENSION` env var). + + +### Payload + +```python +from vortexdb import Payload + +# Text payload +text_payload = Payload.text("My document content") + +# Image payload +image_payload = Payload.image("path/to/image.jpg") + +# Manual construction +from vortexdb.models import ContentType +payload = Payload(ContentType.TEXT, "Custom content") +``` + +### Similarity Metrics + +```python +from vortexdb import Similarity + +# Available metrics +Similarity.COSINE # Best for normalized embeddings +Similarity.EUCLIDEAN # L2 distance +Similarity.MANHATTAN # L1 distance +Similarity.HAMMING # For binary vectors +``` + +## Error Handling + +```python +from vortexdb import VortexDB +from vortexdb.exceptions import ( + VortexDBError, + AuthenticationError, + NotFoundError, + InvalidArgumentError, + TimeoutError, + ServiceUnavailableError, + InternalServerError, +) + +try: + db = VortexDB(grpc_url="localhost:50051", api_key="wrong-key") + point = db.get(point_id="some-id") +except AuthenticationError: + print("Invalid API key") +except NotFoundError: + print("Point not found") +except TimeoutError: + print("Request timed out") +except ServiceUnavailableError: + print("Server is unavailable") +except VortexDBError as e: + print(f"VortexDB error: {e}") +``` + +## Building a Semantic Search App + +Here's a complete example that builds a semantic search application using sentence embeddings: + +```python +from vortexdb import VortexDB, DenseVector, Payload, Similarity + +# You'll need an embedding model (e.g., sentence-transformers) +# pip install sentence-transformers +from sentence_transformers 
import SentenceTransformer + +# Initialize embedding model +model = SentenceTransformer('all-MiniLM-L6-v2') # 384 dimensions + +# Sample documents +documents = [ + "VortexDB is a vector database built in Rust", + "Machine learning models convert text to embeddings", + "Semantic search finds conceptually similar content", + "Python is a popular programming language", + "Docker makes deployment easier", +] + +def embed(text: str) -> DenseVector: + """Convert text to vector embedding.""" + embedding = model.encode(text).tolist() + return DenseVector(embedding) + +# Connect to VortexDB (DIMENSION must be 384 for MiniLM) +with VortexDB(grpc_url="localhost:50051", api_key="secret") as db: + + # Index documents + print("Indexing documents...") + for doc in documents: + point_id = db.insert( + vector=embed(doc), + payload=Payload.text(doc) + ) + print(f" Indexed: {doc[:50]}...") + + # Search + query = "How do I store vectors?" + print(f"\nSearching for: '{query}'") + + results = db.search( + vector=embed(query), + similarity=Similarity.COSINE, + limit=3 + ) + + print("\nTop 3 results:") + for i, point_id in enumerate(results, 1): + point = db.get(point_id=point_id) + print(f" {i}. {point.payload.content}") +``` + +Output: +``` +Indexing documents... + Indexed: VortexDB is a vector database built in Rust... + Indexed: Machine learning models convert text to embe... + Indexed: Semantic search finds conceptually similar c... + Indexed: Python is a popular programming language... + Indexed: Docker makes deployment easier... + +Searching for: 'How do I store vectors?' + +Top 3 results: + 1. VortexDB is a vector database built in Rust + 2. Machine learning models convert text to embeddings + 3. 
Semantic search finds conceptually similar content +``` + +## Batch Operations + +For bulk inserts, process documents in batches: + +```python +def batch_insert(db, documents, batch_size=100): + """Insert documents in batches.""" + point_ids = [] + + for i in range(0, len(documents), batch_size): + batch = documents[i:i + batch_size] + + for doc in batch: + pid = db.insert( + vector=embed(doc), + payload=Payload.text(doc) + ) + point_ids.append(pid) + + print(f"Inserted {min(i + batch_size, len(documents))}/{len(documents)}") + + return point_ids + +# Usage +with VortexDB(grpc_url="localhost:50051", api_key="secret") as db: + ids = batch_insert(db, large_document_list) +``` + +## Integration Examples + +### FastAPI Endpoint + +```python +from fastapi import FastAPI, HTTPException +from pydantic import BaseModel +from vortexdb import VortexDB, DenseVector, Payload, Similarity + +app = FastAPI() +db = VortexDB(grpc_url="localhost:50051", api_key="secret") + +class SearchQuery(BaseModel): + vector: list[float] + limit: int = 10 + +@app.post("/search") +async def search(query: SearchQuery): + results = db.search( + vector=DenseVector(query.vector), + similarity=Similarity.COSINE, + limit=query.limit + ) + return {"results": results} + +@app.on_event("shutdown") +def shutdown(): + db.close() +``` + +### LangChain Integration + +```python +from langchain.vectorstores.base import VectorStore +from vortexdb import VortexDB, DenseVector, Payload, Similarity + +class VortexDBVectorStore(VectorStore): + def __init__(self, grpc_url: str, api_key: str, embeddings): + self.db = VortexDB(grpc_url=grpc_url, api_key=api_key) + self.embeddings = embeddings + + def add_texts(self, texts, metadatas=None): + ids = [] + for i, text in enumerate(texts): + embedding = self.embeddings.embed_query(text) + pid = self.db.insert( + vector=DenseVector(embedding), + payload=Payload.text(text) + ) + ids.append(pid) + return ids + + def similarity_search(self, query, k=4): + embedding = 
self.embeddings.embed_query(query) + results = self.db.search( + vector=DenseVector(embedding), + similarity=Similarity.COSINE, + limit=k + ) + return [self.db.get(point_id=pid) for pid in results] +``` + +## Next Steps + + + + Complete API documentation + + + Use the terminal interface + + diff --git a/docs/guides/tui-walkthrough.mdx b/docs/guides/tui-walkthrough.mdx new file mode 100644 index 0000000..1b33c59 --- /dev/null +++ b/docs/guides/tui-walkthrough.mdx @@ -0,0 +1,318 @@ +--- +title: "TUI Walkthrough" +description: "Using VortexDB's terminal user interface" +--- + +# TUI Walkthrough + +VortexDB includes an interactive Terminal User Interface (TUI) built with [Ratatui](https://github.com/ratatui/ratatui) for managing your vector database from the command line. + +## Overview + +The TUI provides: +- **Dashboard view** for database overview +- **Database management** operations +- **Vector operations** (insert, search, delete) +- **Modal dialogs** for user input + +``` +┌─────────────────────────────────────────────────────────────┐ +│ VortexDB TUI [?] 
Help │ +├─────────────────────────────────────────────────────────────┤ +│ │ +│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ +│ │ Dashboard │ │ Database │ │ Vectors │ │ +│ └─────────────┘ └─────────────┘ └─────────────┘ │ +│ │ +│ ┌───────────────────────────────────────────────────┐ │ +│ │ │ │ +│ │ Current View Content │ │ +│ │ │ │ +│ └───────────────────────────────────────────────────┘ │ +│ │ +│ [Tab] Switch View [Enter] Select [q] Quit │ +└─────────────────────────────────────────────────────────────┘ +``` + +## Installation + +### From Source + +```bash +# Clone the repository +git clone https://github.com/sdslabs/VortexDB.git +cd VortexDB + +# Build the TUI +cargo build --release --bin tui + +# Run +./target/release/tui +``` + +### Development Build + +```bash +cargo run --bin tui +``` + +## Getting Started + +### Launch the TUI + +```bash +# Make sure VortexDB server is running first +docker compose up -d + +# Then launch the TUI +./target/release/tui +``` + +### Connect to Server + +On launch, you'll be prompted to configure the connection: + +``` +┌──────────────────────────────────────┐ +│ Connect to VortexDB │ +├──────────────────────────────────────┤ +│ │ +│ gRPC URL: [localhost:50051 ] │ +│ API Key: [********************] │ +│ │ +│ [Connect] [Cancel] │ +└──────────────────────────────────────┘ +``` + +## Navigation + +### Keyboard Shortcuts + +| Key | Action | +|-----|--------| +| `Tab` | Switch between views | +| `Enter` | Select / Confirm | +| `Esc` | Cancel / Close modal | +| `q` | Quit application | +| `?` | Show help | +| `↑/↓` | Navigate lists | +| `j/k` | Navigate lists (vim-style) | + +## Views + +### Dashboard + +The dashboard provides an overview of your VortexDB instance: + +``` +┌─────────────────────────────────────────────────────────────┐ +│ Dashboard │ +├─────────────────────────────────────────────────────────────┤ +│ │ +│ Server Status: ● Connected │ +│ Server Address: localhost:50051 │ +│ │ +│ 
┌─────────────────────────────────────────────────────┐ │ +│ │ Database Stats │ │ +│ ├─────────────────────────────────────────────────────┤ │ +│ │ Total Points: 1,234 │ │ +│ │ Index Type: HNSW │ │ +│ │ Storage Type: RocksDB │ │ +│ │ Dimensions: 384 │ │ +│ └─────────────────────────────────────────────────────┘ │ +│ │ +└─────────────────────────────────────────────────────────────┘ +``` + +**Features:** +- Connection status indicator +- Database statistics +- Quick health check + +### Database View + +Manage database-level operations: + +``` +┌─────────────────────────────────────────────────────────────┐ +│ Database │ +├─────────────────────────────────────────────────────────────┤ +│ │ +│ Actions: │ +│ ──────── │ +│ > [ ] Refresh Stats │ +│ [ ] View Configuration │ +│ [ ] Export Data │ +│ [ ] Create Snapshot │ +│ │ +└─────────────────────────────────────────────────────────────┘ +``` + +**Operations:** +- Refresh database statistics +- View current configuration +- Export data (coming soon) +- Manage snapshots + +### Vector Operations + +The vector operations view lets you interact with individual vectors: + +``` +┌─────────────────────────────────────────────────────────────┐ +│ Vector Operations │ +├─────────────────────────────────────────────────────────────┤ +│ │ +│ Actions: │ +│ ──────── │ +│ > [ ] Insert Vector │ +│ [ ] Search Vectors │ +│ [ ] Get Point by ID │ +│ [ ] Delete Point │ +│ │ +└─────────────────────────────────────────────────────────────┘ +``` + +## Common Tasks + +### Insert a Vector + + + + Navigate to Vector Operations → Insert Vector + + + Input your vector values (comma-separated): + ``` + Vector: 0.1, 0.2, 0.3, 0.4 + ``` + + + Add your payload content: + ``` + Content Type: Text + Content: My document content + ``` + + + Press Enter to insert. You'll see the generated point ID. 
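The comma-separated vector input above can be validated the same way in a script. This is a hypothetical helper mirroring the TUI's input format, not code from the TUI itself:

```python
def parse_vector(text: str) -> list[float]:
    """Parse comma-separated vector input, e.g. "0.1, 0.2, 0.3, 0.4"."""
    parts = [p.strip() for p in text.split(",") if p.strip()]
    if not parts:
        raise ValueError("empty vector input")
    # float() raises ValueError on non-numeric entries, surfacing bad input early.
    return [float(p) for p in parts]
```

Remember that the parsed vector's length must equal the server's configured `DIMENSION`, or the insert will be rejected.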
+ + + +### Search for Vectors + + + + Navigate to Vector Operations → Search Vectors + + + ``` + Query: 0.1, 0.2, 0.3, 0.4 + ``` + + + Choose from: Cosine, Euclidean, Manhattan, Hamming + + + Enter the maximum number of results (e.g., 10) + + + Results are displayed as a list of point IDs with their payloads + + + +### Get a Point + + + + Navigate to Vector Operations → Get Point by ID + + + Input the UUID: + ``` + Point ID: 550e8400-e29b-41d4-a716-446655440000 + ``` + + + The point details are displayed including vector and payload + + + +### Delete a Point + + + + Navigate to Vector Operations → Delete Point + + + Input the UUID to delete + + + Confirm the deletion in the modal dialog + + + +## Connecting to Remote Instances + +To connect to a remote VortexDB instance: + +```bash +# Set environment variables before launching +export VORTEXDB_GRPC_URL="remote-host:50051" +export VORTEXDB_API_KEY="your-api-key" + +./target/release/tui +``` + +Or enter the details in the connection dialog when the TUI starts. 
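The precedence here matches the Python SDK's configuration order: explicit values beat environment variables, which beat defaults. A hedged sketch of that resolution (hypothetical helper; the SDK's internals may differ):

```python
import os

def resolve_config(grpc_url=None, api_key=None):
    """Resolve connection settings: explicit args > environment > defaults."""
    return {
        "grpc_url": grpc_url or os.environ.get("VORTEXDB_GRPC_URL", "localhost:50051"),
        "api_key": api_key or os.environ.get("VORTEXDB_API_KEY", ""),
    }
```

So exporting `VORTEXDB_GRPC_URL` before launching the TUI changes the default connection target, while typing a URL into the connection dialog still overrides it.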
+ +## Troubleshooting + + + + - Verify the VortexDB server is running + - Check the gRPC URL is correct + - Ensure the API key matches `GRPC_ROOT_PASSWORD` + - Check network connectivity + + + + - Ensure your terminal supports Unicode + - Try a different terminal emulator (iTerm2, Alacritty, kitty) + - Set `TERM=xterm-256color` + + + + - The TUI uses raw mode; some terminals may need configuration + - Try pressing `Esc` to close any open modals + - Use `Ctrl+C` to force quit if needed + + + +## Tips + + + + Use `Tab` to quickly switch between views without using arrow keys + + + Use `j`/`k` for navigation if you prefer vim-style keybindings + + + Point IDs can be copied from the display for use elsewhere + + + Use the refresh option to update stats after external changes + + + +## Next Steps + + + + Learn about the programmatic APIs + + + Build applications with Python + + diff --git a/docs/logo_horizontal.svg b/docs/logo_horizontal.svg new file mode 100644 index 0000000..89c7ecb --- /dev/null +++ b/docs/logo_horizontal.svg @@ -0,0 +1,8 @@ + + + + + + + + diff --git a/docs/mint.json b/docs/mint.json new file mode 100644 index 0000000..d3677f1 --- /dev/null +++ b/docs/mint.json @@ -0,0 +1,117 @@ +{ + "$schema": "https://mintlify.com/schema.json", + "name": "VortexDB", + "logo": { + "dark": "/logo_horizontal.svg", + "light": "/logo_horizontal.svg" + }, + "favicon": "/assets/favicon.svg", + "colors": { + "primary": "#00d4aa", + "light": "#00e6b8", + "dark": "#00b894", + "background": { + "dark": "#0a0a0f" + }, + "anchors": { + "from": "#00d4aa", + "to": "#00e6b8" + } + }, + "modeToggle": { + "default": "dark" + }, + "topbarLinks": [ + { + "name": "SDSLabs", + "url": "https://sdslabs.co" + }, + { + "name": "Discord", + "url": "https://discord.gg/dXkVEgTPu9" + } + ], + "topbarCtaButton": { + "name": "Github", + "url": "https://github.com/sdslabs/VortexDB" + }, + "tabs": [ + { + "name": "API Reference", + "url": "api-reference" + }, + { + "name": "SDK", + "url": "sdk" + 
} + ], + "anchors": [ + { + "name": "GitHub", + "icon": "github", + "url": "https://github.com/sdslabs/VortexDB" + }, + { + "name": "Community", + "icon": "discord", + "url": "https://discord.gg/dXkVEgTPu9" + }, + { + "name": "SDSLabs", + "icon": "globe", + "url": "https://sdslabs.co" + } + ], + "navigation": [ + { + "group": "Getting Started", + "pages": [ + "getting-started/installation", + "getting-started/quickstart" + ] + }, + { + "group": "Concepts", + "pages": [ + "concepts/architecture", + "concepts/indexers", + "concepts/snapshots" + ] + }, + { + "group": "Guides", + "pages": [ + "guides/docker-compose", + "guides/python-sdk", + "guides/tui-walkthrough" + ] + }, + { + "group": "API Reference", + "pages": [ + "api-reference/overview", + "api-reference/grpc", + "api-reference/http" + ] + }, + { + "group": "SDK", + "pages": [ + "sdk/reference", + "sdk/examples" + ] + } + ], + "footerSocials": { + "github": "https://github.com/sdslabs/VortexDB", + "discord": "https://discord.gg/dXkVEgTPu9", + "website": "https://sdslabs.co" + }, + "feedback": { + "thumbsRating": true + }, + "api": { + "baseUrl": "http://localhost:3000" + }, + "openapi": "openapi.yaml" +} diff --git a/docs/openapi.yaml b/docs/openapi.yaml new file mode 100644 index 0000000..3bfc6d8 --- /dev/null +++ b/docs/openapi.yaml @@ -0,0 +1,329 @@ +openapi: 3.0.3 +info: + title: VortexDB HTTP API + description: | + RESTful API for VortexDB - a high-performance vector database built in Rust. + + This API provides CRUD operations for vector points and similarity search. 
+ version: 1.0.0 + license: + name: Apache 2.0 + url: https://www.apache.org/licenses/LICENSE-2.0 + contact: + name: SDSLabs + url: https://github.com/sdslabs/VortexDB + +servers: + - url: http://localhost:3000 + description: Local development server + +tags: + - name: Health + description: Server health endpoints + - name: Points + description: Vector point operations + +paths: + /: + get: + tags: + - Health + summary: Root endpoint + description: Verify the server is running + operationId: getRoot + responses: + '200': + description: Server is running + content: + text/plain: + schema: + type: string + example: Vector Database server is running! + + /health: + get: + tags: + - Health + summary: Health check + description: Check if the server is healthy + operationId: getHealth + responses: + '200': + description: Server is healthy + content: + text/plain: + schema: + type: string + example: OK + + /points: + post: + tags: + - Points + summary: Insert a point + description: Insert a new vector with its associated payload + operationId: insertPoint + requestBody: + required: true + content: + application/json: + schema: + $ref: '#/components/schemas/InsertRequest' + example: + vector: [0.1, 0.2, 0.3, 0.4] + payload: + content_type: Text + content: Hello, VortexDB! 
+ responses: + '201': + description: Point created successfully + content: + application/json: + schema: + $ref: '#/components/schemas/InsertResponse' + example: + point_id: 550e8400-e29b-41d4-a716-446655440000 + '400': + description: Invalid request + content: + text/plain: + schema: + type: string + example: Invalid vector dimensions + '500': + description: Server error + content: + text/plain: + schema: + type: string + example: Failed to insert point + + /points/{point_id}: + get: + tags: + - Points + summary: Get a point + description: Retrieve a point by its ID + operationId: getPoint + parameters: + - name: point_id + in: path + required: true + description: UUID of the point + schema: + type: string + format: uuid + example: 550e8400-e29b-41d4-a716-446655440000 + responses: + '200': + description: Point found + content: + application/json: + schema: + $ref: '#/components/schemas/Point' + example: + id: 550e8400-e29b-41d4-a716-446655440000 + vector: [0.1, 0.2, 0.3, 0.4] + payload: + content_type: Text + content: Hello, VortexDB! 
+ '404': + description: Point not found + content: + text/plain: + schema: + type: string + example: Point not found + '500': + description: Server error + content: + text/plain: + schema: + type: string + example: Failed to get point + + delete: + tags: + - Points + summary: Delete a point + description: Delete a point by its ID + operationId: deletePoint + parameters: + - name: point_id + in: path + required: true + description: UUID of the point + schema: + type: string + format: uuid + example: 550e8400-e29b-41d4-a716-446655440000 + responses: + '204': + description: Point deleted successfully + '500': + description: Server error + content: + text/plain: + schema: + type: string + example: Failed to delete point + + /points/search: + post: + tags: + - Points + summary: Search for similar vectors + description: Find the k nearest neighbors to a query vector + operationId: searchPoints + requestBody: + required: true + content: + application/json: + schema: + $ref: '#/components/schemas/SearchRequest' + example: + vector: [0.1, 0.2, 0.3, 0.4] + similarity: Cosine + limit: 5 + responses: + '200': + description: Search results + content: + application/json: + schema: + $ref: '#/components/schemas/SearchResponse' + example: + results: + - 550e8400-e29b-41d4-a716-446655440000 + - 6ba7b810-9dad-11d1-80b4-00c04fd430c8 + '400': + description: Invalid request + content: + text/plain: + schema: + type: string + example: Invalid vector dimensions + '500': + description: Server error + content: + text/plain: + schema: + type: string + example: Search failed + +components: + schemas: + DenseVector: + type: array + items: + type: number + format: float + description: Array of floating-point values representing the vector + example: [0.1, 0.2, 0.3, 0.4] + + ContentType: + type: string + enum: + - Text + - Image + description: Type of payload content + example: Text + + Payload: + type: object + required: + - content_type + - content + properties: + content_type: + $ref: 
'#/components/schemas/ContentType' + content: + type: string + description: Content string + example: Hello, VortexDB! + + Similarity: + type: string + enum: + - Euclidean + - Manhattan + - Hamming + - Cosine + description: | + Distance/similarity metric: + - **Euclidean**: L2 distance (straight line) + - **Manhattan**: L1 distance (city block) + - **Hamming**: Count of differing elements + - **Cosine**: Angular distance (1 - cosine similarity) + example: Cosine + + Point: + type: object + required: + - id + - vector + - payload + properties: + id: + type: string + format: uuid + description: Unique identifier for the point + example: 550e8400-e29b-41d4-a716-446655440000 + vector: + $ref: '#/components/schemas/DenseVector' + payload: + $ref: '#/components/schemas/Payload' + + InsertRequest: + type: object + required: + - vector + - payload + properties: + vector: + $ref: '#/components/schemas/DenseVector' + payload: + $ref: '#/components/schemas/Payload' + + InsertResponse: + type: object + required: + - point_id + properties: + point_id: + type: string + format: uuid + description: UUID of the created point + example: 550e8400-e29b-41d4-a716-446655440000 + + SearchRequest: + type: object + required: + - vector + - similarity + - limit + properties: + vector: + $ref: '#/components/schemas/DenseVector' + similarity: + $ref: '#/components/schemas/Similarity' + limit: + type: integer + minimum: 1 + description: Maximum number of results to return + example: 5 + + SearchResponse: + type: object + required: + - results + properties: + results: + type: array + items: + type: string + format: uuid + description: Array of point IDs ordered by similarity (closest first) + example: + - 550e8400-e29b-41d4-a716-446655440000 + - 6ba7b810-9dad-11d1-80b4-00c04fd430c8 diff --git a/docs/sdk/examples.mdx b/docs/sdk/examples.mdx new file mode 100644 index 0000000..c26c5ab --- /dev/null +++ b/docs/sdk/examples.mdx @@ -0,0 +1,602 @@ +--- +title: "SDK Examples" +description: "Working 
code examples for the VortexDB Python SDK" +--- + +# SDK Examples + +Ready-to-run code examples demonstrating VortexDB Python SDK usage. + +## Basic Usage + +The simplest example showing all core operations: + +```python +from vortexdb import VortexDB, DenseVector, Payload, Similarity + +def main(): + # Initialize client + db = VortexDB( + grpc_url="localhost:50051", + api_key="my-secret-password", + ) + + # Insert a vector + point_id = db.insert( + vector=DenseVector([0.1, 0.2, 0.3]), + payload=Payload.text("hello world"), + ) + print(f"Inserted point: {point_id}") + + # Get the point + point = db.get(point_id=point_id) + print("Fetched point:", point.pretty()) + + # Search for similar vectors + results = db.search( + vector=DenseVector([0.1, 0.2, 0.3]), + similarity=Similarity.COSINE, + limit=3, + ) + print("Search results:", results) + + # Delete the point + db.delete(point_id=point_id) + print("Point deleted") + + # Close connection + db.close() + +if __name__ == "__main__": + main() +``` + +--- + +## Context Manager Usage + +Using the context manager for automatic resource cleanup: + +```python +from vortexdb import VortexDB, DenseVector, Payload, Similarity + +def main(): + # Connection is automatically closed when exiting the 'with' block + with VortexDB(grpc_url="localhost:50051", api_key="secret") as db: + # Insert some vectors + ids = [] + for i in range(5): + pid = db.insert( + vector=DenseVector([i * 0.1, i * 0.2, i * 0.3]), + payload=Payload.text(f"Document {i}"), + ) + ids.append(pid) + print(f"Inserted document {i}: {pid}") + + # Search + results = db.search( + vector=DenseVector([0.2, 0.4, 0.6]), + similarity=Similarity.COSINE, + limit=3, + ) + + print("\nSearch results:") + for pid in results: + point = db.get(point_id=pid) + print(f" - {point.payload.content}") + + # Connection automatically closed here + +if __name__ == "__main__": + main() +``` + +--- + +## Semantic Search with Embeddings + +Using sentence-transformers to build a semantic search 
system: + +```python +""" +Semantic search example using sentence-transformers. + +Requirements: + pip install sentence-transformers vortexdb + +Make sure VortexDB is running with DIMENSION=384 +""" + +from vortexdb import VortexDB, DenseVector, Payload, Similarity +from sentence_transformers import SentenceTransformer + +# Initialize embedding model (384 dimensions) +model = SentenceTransformer('all-MiniLM-L6-v2') + +# Sample documents +documents = [ + "The quick brown fox jumps over the lazy dog", + "Machine learning is a subset of artificial intelligence", + "Python is a popular programming language for data science", + "VortexDB is a vector database built in Rust", + "Docker containers package applications with their dependencies", + "Neural networks are inspired by biological brain structures", + "REST APIs use HTTP methods for communication", + "PostgreSQL is a powerful relational database", +] + +def embed(text: str) -> DenseVector: + """Convert text to vector embedding.""" + embedding = model.encode(text).tolist() + return DenseVector(embedding) + +def main(): + with VortexDB(grpc_url="localhost:50051", api_key="secret") as db: + # Index all documents + print("Indexing documents...") + point_ids = [] + for doc in documents: + pid = db.insert( + vector=embed(doc), + payload=Payload.text(doc), + ) + point_ids.append(pid) + print(f" ✓ {doc[:50]}...") + + # Interactive search + print("\n" + "=" * 50) + print("Semantic Search Demo") + print("=" * 50) + + queries = [ + "How do databases work?", + "Tell me about AI and machine learning", + "What programming languages are good for ML?", + ] + + for query in queries: + print(f"\nQuery: '{query}'") + print("-" * 40) + + results = db.search( + vector=embed(query), + similarity=Similarity.COSINE, + limit=3, + ) + + for i, pid in enumerate(results, 1): + point = db.get(point_id=pid) + print(f" {i}. 
{point.payload.content}") + +if __name__ == "__main__": + main() +``` + +--- + +## Error Handling + +Comprehensive error handling example: + +```python +from vortexdb import VortexDB, DenseVector, Payload, Similarity +from vortexdb.exceptions import ( + VortexDBError, + AuthenticationError, + NotFoundError, + InvalidArgumentError, + TimeoutError, + ServiceUnavailableError, +) + +def safe_insert(db: VortexDB, vector: list, content: str) -> str | None: + """Insert a vector with error handling.""" + try: + point_id = db.insert( + vector=DenseVector(vector), + payload=Payload.text(content), + ) + return point_id + except InvalidArgumentError as e: + print(f"Invalid input: {e}") + return None + except VortexDBError as e: + print(f"Database error: {e}") + return None + +def safe_get(db: VortexDB, point_id: str): + """Get a point with error handling.""" + try: + point = db.get(point_id=point_id) + if point: + return point + else: + print(f"Point {point_id} not found") + return None + except NotFoundError: + print(f"Point {point_id} does not exist") + return None + except VortexDBError as e: + print(f"Error fetching point: {e}") + return None + +def main(): + try: + db = VortexDB( + grpc_url="localhost:50051", + api_key="secret", + timeout=10.0, + ) + except ServiceUnavailableError: + print("Could not connect to VortexDB server") + return + except AuthenticationError: + print("Invalid API key") + return + + try: + # Try various operations + pid = safe_insert(db, [0.1, 0.2, 0.3, 0.4], "Test document") + if pid: + print(f"Inserted: {pid}") + point = safe_get(db, pid) + if point: + print(f"Content: {point.payload.content}") + + # Try to get a non-existent point + safe_get(db, "nonexistent-id-12345") + + finally: + db.close() + +if __name__ == "__main__": + main() +``` + +--- + +## Batch Processing + +Efficiently process large datasets: + +```python +from vortexdb import VortexDB, DenseVector, Payload, Similarity +from typing import List, Tuple +import time + +def 
batch_insert( + db: VortexDB, + items: List[Tuple[List[float], str]], + batch_size: int = 100, +) -> List[str]: + """Insert items in batches with progress reporting.""" + point_ids = [] + total = len(items) + + for i in range(0, total, batch_size): + batch = items[i:i + batch_size] + batch_ids = [] + + for vector, content in batch: + pid = db.insert( + vector=DenseVector(vector), + payload=Payload.text(content), + ) + batch_ids.append(pid) + + point_ids.extend(batch_ids) + + progress = min(i + batch_size, total) + print(f"Progress: {progress}/{total} ({100 * progress // total}%)") + + return point_ids + +def main(): + # Generate sample data + print("Generating sample data...") + items = [ + ([i * 0.001 for _ in range(4)], f"Document {i}") + for i in range(1000) + ] + + with VortexDB(grpc_url="localhost:50051", api_key="secret") as db: + print(f"\nInserting {len(items)} documents...") + start = time.time() + + ids = batch_insert(db, items, batch_size=100) + + elapsed = time.time() - start + print(f"\nInserted {len(ids)} documents in {elapsed:.2f}s") + print(f"Rate: {len(ids) / elapsed:.0f} inserts/second") + +if __name__ == "__main__": + main() +``` + +--- + +## FastAPI Integration + +Build a REST API wrapper around VortexDB: + +```python +""" +FastAPI wrapper for VortexDB. 
+ +Run with: uvicorn example:app --reload +""" + +from fastapi import FastAPI, HTTPException +from pydantic import BaseModel +from typing import List +from vortexdb import VortexDB, DenseVector, Payload, Similarity +from vortexdb.exceptions import NotFoundError, VortexDBError + +app = FastAPI(title="VortexDB API Wrapper") + +# Global client (initialized on startup) +db: VortexDB = None + +class InsertRequest(BaseModel): + vector: List[float] + content: str + +class InsertResponse(BaseModel): + point_id: str + +class SearchRequest(BaseModel): + vector: List[float] + limit: int = 10 + similarity: str = "cosine" + +class SearchResult(BaseModel): + point_id: str + content: str + +class SearchResponse(BaseModel): + results: List[SearchResult] + +@app.on_event("startup") +async def startup(): + global db + db = VortexDB( + grpc_url="localhost:50051", + api_key="secret", + ) + +@app.on_event("shutdown") +async def shutdown(): + global db + if db: + db.close() + +@app.post("/insert", response_model=InsertResponse) +async def insert_vector(request: InsertRequest): + try: + point_id = db.insert( + vector=DenseVector(request.vector), + payload=Payload.text(request.content), + ) + return InsertResponse(point_id=point_id) + except VortexDBError as e: + raise HTTPException(status_code=500, detail=str(e)) + +@app.get("/points/{point_id}") +async def get_point(point_id: str): + try: + point = db.get(point_id=point_id) + if not point: + raise HTTPException(status_code=404, detail="Point not found") + return { + "id": point.id, + "vector": point.vector.to_list(), + "content": point.payload.content, + } + except NotFoundError: + raise HTTPException(status_code=404, detail="Point not found") + +@app.post("/search", response_model=SearchResponse) +async def search_vectors(request: SearchRequest): + similarity_map = { + "cosine": Similarity.COSINE, + "euclidean": Similarity.EUCLIDEAN, + "manhattan": Similarity.MANHATTAN, + "hamming": Similarity.HAMMING, + } + + sim = 
similarity_map.get(request.similarity.lower()) + if not sim: + raise HTTPException(status_code=400, detail="Invalid similarity metric") + + results = db.search( + vector=DenseVector(request.vector), + similarity=sim, + limit=request.limit, + ) + + search_results = [] + for pid in results: + point = db.get(point_id=pid) + if point: + search_results.append(SearchResult( + point_id=pid, + content=point.payload.content, + )) + + return SearchResponse(results=search_results) + +@app.delete("/points/{point_id}") +async def delete_point(point_id: str): + db.delete(point_id=point_id) + return {"status": "deleted"} +``` + +--- + +## Environment Configuration + +Using environment variables for configuration: + +```python +""" +Configuration example using environment variables. + +Set these before running: + export VORTEXDB_GRPC_URL=localhost:50051 + export VORTEXDB_API_KEY=your-api-key + export VORTEXDB_TIMEOUT=30 +""" + +import os +from vortexdb import VortexDB, DenseVector, Payload, Similarity + +def main(): + # Method 1: Set env vars before import + os.environ.setdefault("VORTEXDB_GRPC_URL", "localhost:50051") + os.environ.setdefault("VORTEXDB_API_KEY", "secret") + os.environ.setdefault("VORTEXDB_TIMEOUT", "30") + + # Client reads from environment + with VortexDB() as db: + pid = db.insert( + vector=DenseVector([0.1, 0.2, 0.3, 0.4]), + payload=Payload.text("Test document"), + ) + print(f"Inserted: {pid}") + + # Method 2: Override specific settings + with VortexDB(timeout=60.0) as db: # Uses env for url/key, overrides timeout + results = db.search( + vector=DenseVector([0.1, 0.2, 0.3, 0.4]), + similarity=Similarity.COSINE, + limit=5, + ) + print(f"Found {len(results)} results") + +if __name__ == "__main__": + main() +``` + +--- + +## Testing Example + +Unit testing with a mock or real database: + +```python +""" +Testing example for VortexDB applications. 
+ +Run with: pytest test_example.py -v +""" + +import pytest +from vortexdb import VortexDB, DenseVector, Payload, Similarity + +# Fixtures +@pytest.fixture +def db(): + """Create a VortexDB connection for testing.""" + client = VortexDB( + grpc_url="localhost:50051", + api_key="test-key", + ) + yield client + client.close() + +@pytest.fixture +def sample_vector(): + """Create a sample vector.""" + return DenseVector([0.1, 0.2, 0.3, 0.4]) + +@pytest.fixture +def sample_payload(): + """Create a sample payload.""" + return Payload.text("Test document") + +# Tests +class TestVortexDB: + def test_insert_and_get(self, db, sample_vector, sample_payload): + """Test inserting and retrieving a point.""" + # Insert + point_id = db.insert( + vector=sample_vector, + payload=sample_payload, + ) + assert point_id is not None + assert len(point_id) > 0 + + # Get + point = db.get(point_id=point_id) + assert point is not None + assert point.id == point_id + assert point.payload.content == "Test document" + + # Cleanup + db.delete(point_id=point_id) + + def test_search(self, db, sample_vector, sample_payload): + """Test search functionality.""" + # Insert test data + point_id = db.insert( + vector=sample_vector, + payload=sample_payload, + ) + + # Search + results = db.search( + vector=sample_vector, + similarity=Similarity.COSINE, + limit=10, + ) + + assert isinstance(results, list) + assert point_id in results + + # Cleanup + db.delete(point_id=point_id) + + def test_delete(self, db, sample_vector, sample_payload): + """Test delete functionality.""" + # Insert + point_id = db.insert( + vector=sample_vector, + payload=sample_payload, + ) + + # Delete + db.delete(point_id=point_id) + + # Verify deleted + point = db.get(point_id=point_id) + assert point is None + + def test_dense_vector_validation(self): + """Test DenseVector validation.""" + # Valid vectors + assert DenseVector([1.0, 2.0, 3.0]) + assert DenseVector((1, 2, 3)) # Converted to floats + + # Invalid vectors + with 
pytest.raises(TypeError): + DenseVector("not a list") + + with pytest.raises(ValueError): + DenseVector([]) # Empty + + with pytest.raises(TypeError): + DenseVector(["a", "b", "c"]) # Non-numeric +``` + +--- + +## Next Steps + + + + Complete API documentation + + + gRPC and HTTP API docs + + diff --git a/docs/sdk/reference.mdx b/docs/sdk/reference.mdx new file mode 100644 index 0000000..fbec9ca --- /dev/null +++ b/docs/sdk/reference.mdx @@ -0,0 +1,588 @@ +--- +title: "SDK Reference" +description: "Complete Python SDK API documentation" +--- + +# Python SDK Reference + +Complete API documentation for the VortexDB Python client. + +## Installation + +```bash +pip install vortexdb +``` + +--- + +## VortexDB + +The main client class for interacting with VortexDB. + +```python +from vortexdb import VortexDB +``` + +### Constructor + +```python +VortexDB( + *, + grpc_url: str | None = None, + api_key: str | None = None, + timeout: float | None = None, +) +``` + + + The gRPC server address. Can also be set via `VORTEXDB_GRPC_URL` environment variable. + + + + Authentication key for the gRPC API. Can also be set via `VORTEXDB_API_KEY` environment variable. + + + + Request timeout in seconds. Can also be set via `VORTEXDB_TIMEOUT` environment variable. + + +**Example:** + +```python +# Explicit configuration +db = VortexDB( + grpc_url="localhost:50051", + api_key="secret", + timeout=60.0, +) + +# Using environment variables +import os +os.environ["VORTEXDB_GRPC_URL"] = "localhost:50051" +os.environ["VORTEXDB_API_KEY"] = "secret" +db = VortexDB() +``` + +--- + +### Methods + +#### insert + +Insert a vector with its payload into the database. + +```python +def insert( + self, + *, + vector: DenseVector, + payload: Payload, +) -> str +``` + + + The vector to insert. Must match the server's configured dimension. + + + + Metadata to associate with the vector. + + + + The UUID of the created point. 
+ + +**Raises:** +- `TypeError`: If vector is not a `DenseVector` +- `InvalidArgumentError`: If vector dimensions don't match +- `AuthenticationError`: If API key is invalid + +**Example:** + +```python +point_id = db.insert( + vector=DenseVector([0.1, 0.2, 0.3, 0.4]), + payload=Payload.text("My document"), +) +print(f"Created point: {point_id}") +``` + +--- + +#### get + +Retrieve a point by its ID. + +```python +def get( + self, + *, + point_id: str, +) -> Point | None +``` + + + The UUID of the point to retrieve. + + + + The point if found, `None` otherwise. + + +**Raises:** +- `AuthenticationError`: If API key is invalid +- `InternalServerError`: On server error + +**Example:** + +```python +point = db.get(point_id="550e8400-e29b-41d4-a716-446655440000") + +if point: + print(f"Vector: {point.vector.to_list()}") + print(f"Payload: {point.payload.content}") +else: + print("Point not found") +``` + +--- + +#### search + +Search for the k nearest neighbors to a query vector. + +```python +def search( + self, + *, + vector: DenseVector, + similarity: Similarity, + limit: int, +) -> List[str] +``` + + + The query vector. Must match the server's configured dimension. + + + + The distance metric to use for comparison. + + + + Maximum number of results to return. + + + + List of point IDs ordered by similarity (closest first). + + +**Raises:** +- `TypeError`: If vector is not a `DenseVector` +- `InvalidArgumentError`: If vector dimensions don't match +- `AuthenticationError`: If API key is invalid + +**Example:** + +```python +results = db.search( + vector=DenseVector([0.1, 0.2, 0.3, 0.4]), + similarity=Similarity.COSINE, + limit=10, +) + +for point_id in results: + point = db.get(point_id=point_id) + print(f" - {point.payload.content}") +``` + +--- + +#### delete + +Delete a point by its ID. + +```python +def delete( + self, + *, + point_id: str, +) -> None +``` + + + The UUID of the point to delete. 
+ + +**Raises:** +- `AuthenticationError`: If API key is invalid +- `InternalServerError`: On server error + +**Example:** + +```python +db.delete(point_id="550e8400-e29b-41d4-a716-446655440000") +``` + +--- + +#### close + +Close the gRPC connection. + +```python +def close(self) -> None +``` + +**Example:** + +```python +db = VortexDB(grpc_url="localhost:50051", api_key="secret") +# ... use the client ... +db.close() +``` + +--- + +### Context Manager + +The client supports the context manager protocol for automatic cleanup: + +```python +with VortexDB(grpc_url="localhost:50051", api_key="secret") as db: + point_id = db.insert( + vector=DenseVector([0.1, 0.2, 0.3]), + payload=Payload.text("Hello"), + ) +# Connection automatically closed +``` + +--- + +## DenseVector + +An immutable dense vector of floating-point values. + +```python +from vortexdb import DenseVector +``` + +### Constructor + +```python +DenseVector(values: List[float] | Tuple[float, ...]) +``` + + + The vector components. Must be non-empty and contain numeric values. + + +**Raises:** +- `TypeError`: If values is not a list or tuple +- `ValueError`: If values is empty +- `TypeError`: If any value is not numeric + +**Example:** + +```python +# From list +vec = DenseVector([0.1, 0.2, 0.3, 0.4]) + +# From tuple +vec = DenseVector((0.1, 0.2, 0.3, 0.4)) + +# Integers are converted to floats +vec = DenseVector([1, 2, 3, 4]) # -> [1.0, 2.0, 3.0, 4.0] +``` + +### Methods + +#### to_list + +Convert the vector to a Python list. + +```python +def to_list(self) -> list[float] +``` + +**Example:** + +```python +vec = DenseVector([0.1, 0.2, 0.3]) +values = vec.to_list() # [0.1, 0.2, 0.3] +``` + +### Properties + +#### values + +Access the vector values (read-only). + +```python +vec = DenseVector([0.1, 0.2, 0.3]) +print(vec.values) # (0.1, 0.2, 0.3) +``` + +--- + +## Payload + +Metadata associated with a vector. 
+ +```python +from vortexdb import Payload +``` + +### Factory Methods + +#### text + +Create a text payload. + +```python +@staticmethod +def text(content: str) -> Payload +``` + +**Example:** + +```python +payload = Payload.text("This is my document content") +``` + +#### image + +Create an image payload. + +```python +@staticmethod +def image(content: str) -> Payload +``` + +**Example:** + +```python +payload = Payload.image("path/to/image.jpg") +``` + +### Constructor + +```python +Payload(content_type: ContentType, content: str) +``` + + + The type of content (`ContentType.TEXT` or `ContentType.IMAGE`). + + + + The content string. + + +### Properties + +| Property | Type | Description | +|----------|------|-------------| +| `content_type` | `ContentType` | Type of payload | +| `content` | `str` | Content string | + +--- + +## Point + +A point returned from the database (vector + payload + ID). + +```python +from vortexdb.models import Point +``` + +### Properties + +| Property | Type | Description | +|----------|------|-------------| +| `id` | `str` | Point UUID | +| `vector` | `DenseVector` | The vector values | +| `payload` | `Payload` | Associated metadata | + +### Methods + +#### pretty + +Return a formatted string representation. + +```python +def pretty(self) -> str +``` + +**Example:** + +```python +point = db.get(point_id="...") +print(point.pretty()) +# Output: +# Point ID: 550e8400-e29b-41d4-a716-446655440000 +# Vector: [0.1, 0.2, 0.3, 0.4] +# Payload Type: Text +# Payload Content: My document +``` + +--- + +## Similarity + +Enum for distance/similarity metrics. 
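
For intuition, the four metrics can be sketched in a few lines of plain Python. This is a reference sketch of the math only — the actual computation happens server-side in VortexDB, not in the SDK:

```python
import math

def euclidean(a: list[float], b: list[float]) -> float:
    # L2 distance: straight-line distance between the two vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a: list[float], b: list[float]) -> float:
    # L1 distance: sum of absolute component-wise differences
    return sum(abs(x - y) for x, y in zip(a, b))

def hamming(a: list[float], b: list[float]) -> int:
    # Count of positions where the components differ
    return sum(1 for x, y in zip(a, b) if x != y)

def cosine_distance(a: list[float], b: list[float]) -> float:
    # Angular distance: 1 - cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return 1.0 - dot / norm

print(euclidean([0.0, 0.0], [3.0, 4.0]))  # 5.0
```

Note that `cosine_distance` is 0 for vectors pointing in the same direction (regardless of magnitude) and 1 for orthogonal vectors, which is why it is the usual choice for text embeddings.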
+ +```python +from vortexdb import Similarity +``` + +### Values + +| Value | Description | +|-------|-------------| +| `Similarity.EUCLIDEAN` | L2 distance (straight line) | +| `Similarity.MANHATTAN` | L1 distance (city block) | +| `Similarity.HAMMING` | Count of differing elements | +| `Similarity.COSINE` | Angular distance | + +**Example:** + +```python +from vortexdb import Similarity + +# Use in search +results = db.search( + vector=DenseVector([0.1, 0.2, 0.3]), + similarity=Similarity.COSINE, + limit=5, +) +``` + +--- + +## ContentType + +Enum for payload content types. + +```python +from vortexdb.models import ContentType +``` + +### Values + +| Value | Description | +|-------|-------------| +| `ContentType.TEXT` | Text content | +| `ContentType.IMAGE` | Image reference | + +--- + +## Exceptions + +All exceptions inherit from `VortexDBError`. + +```python +from vortexdb.exceptions import ( + VortexDBError, + AuthenticationError, + NotFoundError, + InvalidArgumentError, + TimeoutError, + ServiceUnavailableError, + InternalServerError, + ConfigurationError, +) +``` + +### Exception Hierarchy + +| Exception | Description | +|-----------|-------------| +| `VortexDBError` | Base exception for all errors | +| `AuthenticationError` | Invalid or missing API key | +| `NotFoundError` | Requested resource not found | +| `InvalidArgumentError` | Invalid input parameters | +| `TimeoutError` | Request timed out | +| `ServiceUnavailableError` | Server is unavailable | +| `InternalServerError` | Server-side error | +| `ConfigurationError` | Invalid client configuration | + +**Example:** + +```python +from vortexdb.exceptions import ( + AuthenticationError, + NotFoundError, + VortexDBError, +) + +try: + point = db.get(point_id="nonexistent") +except NotFoundError: + print("Point does not exist") +except AuthenticationError: + print("Check your API key") +except VortexDBError as e: + print(f"Unexpected error: {e}") +``` + +--- + +## Configuration + +### VortexDBConfig + 
+Internal configuration class (usually not used directly). + +```python +from vortexdb.config import VortexDBConfig + +config = VortexDBConfig.from_env( + grpc_url="localhost:50051", + api_key="secret", + timeout=30.0, +) +``` + +### Environment Variables + +| Variable | Description | Default | +|----------|-------------|---------| +| `VORTEXDB_GRPC_URL` | Server address | `localhost:50051` | +| `VORTEXDB_API_KEY` | Authentication key | None | +| `VORTEXDB_TIMEOUT` | Request timeout (seconds) | `30.0` | + +--- + +## Type Hints + +The SDK is fully typed. Import types for type hints: + +```python +from typing import List, Optional +from vortexdb import VortexDB, DenseVector, Payload, Similarity +from vortexdb.models import Point, ContentType + +def search_documents( + db: VortexDB, + query_vector: List[float], + limit: int = 10, +) -> List[Optional[Point]]: + results = db.search( + vector=DenseVector(query_vector), + similarity=Similarity.COSINE, + limit=limit, + ) + return [db.get(point_id=pid) for pid in results] +``` + +## Next Steps + + + + Working code examples + + + Build a semantic search app + +