4 changes: 3 additions & 1 deletion cmd/vmcp/README.md
@@ -6,7 +6,7 @@ The Virtual MCP Server (vmcp) is a standalone binary that aggregates multiple MC

## Features

### Implemented (Phase 1)
### Implemented
- ✅ **Group-Based Backend Management**: Automatic workload discovery from ToolHive groups
- ✅ **Tool Aggregation**: Combines tools from multiple MCP servers with conflict resolution (prefix, priority, manual)
- ✅ **Resource & Prompt Aggregation**: Unified access to resources and prompts from all backends
@@ -16,12 +16,14 @@ The Virtual MCP Server (vmcp) is a standalone binary that aggregates multiple MC
- ✅ **Health Endpoints**: `/health` and `/ping` for service monitoring
- ✅ **Configuration Validation**: `vmcp validate` command for config verification
- ✅ **Observability**: OpenTelemetry metrics and traces for backend operations and workflow executions
- ✅ **Composite Tools**: Multi-step workflows with elicitation support

### In Progress
- 🚧 **Incoming Authentication** (Issue #165): OIDC, local, anonymous authentication
- 🚧 **Outgoing Authentication** (Issue #160): RFC 8693 token exchange for backend API access
- 🚧 **Token Caching**: Memory and Redis cache providers
- 🚧 **Health Monitoring** (Issue #166): Circuit breakers, backend health checks
- 🚧 **Optimizer**: Support for the MCP optimizer in vMCP, providing context optimization for large toolsets

### Future (Phase 2+)
- 📋 **Authorization**: Cedar policy-based access control
4 changes: 2 additions & 2 deletions deploy/charts/operator-crds/README.md
@@ -1,6 +1,6 @@
# ToolHive Operator CRDs Helm Chart

![Version: 0.0.106](https://img.shields.io/badge/Version-0.0.106-informational?style=flat-square)
![Version: 0.0.104](https://img.shields.io/badge/Version-0.0.104-informational?style=flat-square)
![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square)

A Helm chart for installing the ToolHive Operator CRDs into Kubernetes.
@@ -51,7 +51,7 @@ However, placing CRDs in `templates/` means they would be deleted when the Helm
## Values

| Key | Type | Default | Description |
|-----|------|---------|-------------|
|-----|-------------|------|---------|
| crds | object | `{"install":{"registry":true,"server":true,"virtualMcp":true},"keep":true}` | CRD installation configuration |
| crds.install | object | `{"registry":true,"server":true,"virtualMcp":true}` | Feature flags for CRD groups |
| crds.install.registry | bool | `true` | Install Registry CRDs (mcpregistries) |
54 changes: 12 additions & 42 deletions docs/operator/crd-api.md

Some generated files are not rendered by default.

143 changes: 143 additions & 0 deletions pkg/vmcp/optimizer/README.md
@@ -0,0 +1,143 @@
# VMCPOptimizer Package

This package provides semantic tool discovery for Virtual MCP Server, reducing token usage by allowing LLMs to discover relevant tools on-demand instead of receiving all tool definitions upfront.

## Architecture

The optimizer exposes a clean interface-based architecture:

```
pkg/vmcp/optimizer/
├── optimizer.go      # Public Optimizer interface and EmbeddingOptimizer implementation
├── config.go         # Configuration types
├── README.md         # This file
└── internal/         # Implementation details (not part of public API)
    ├── embeddings/   # Embedding backends (Ollama, OpenAI-compatible, vLLM)
    ├── db/           # Database operations (chromem-go vectors, SQLite FTS5)
    ├── ingestion/    # Tool ingestion service
    ├── models/       # Internal data models
    └── tokens/       # Token counting utilities
```

## Public API

### Optimizer Interface

```go
type Optimizer interface {
	// FindTool searches for tools matching the description and keywords
	FindTool(ctx context.Context, input FindToolInput) (*FindToolOutput, error)

	// CallTool invokes a tool by name with parameters
	CallTool(ctx context.Context, input CallToolInput) (*mcp.CallToolResult, error)

	// Close cleans up optimizer resources
	Close() error

	// HandleSessionRegistration handles session setup for optimizer mode
	HandleSessionRegistration(...) (bool, error)

	// OptimizerHandlerProvider provides tool handlers for MCP integration
	adapter.OptimizerHandlerProvider
}
```

### Factory Pattern

```go
// Factory creates an Optimizer instance
type Factory func(
	ctx context.Context,
	cfg *Config,
	mcpServer *server.MCPServer,
	backendClient vmcp.BackendClient,
	sessionManager *transportsession.Manager,
) (Optimizer, error)

// NewEmbeddingOptimizer is the production implementation
func NewEmbeddingOptimizer(...) (Optimizer, error)
```

## Usage

### In vMCP Server

```go
import "github.com/stacklok/toolhive/pkg/vmcp/optimizer"

// Configure server with optimizer
serverCfg := &vmcpserver.Config{
	OptimizerFactory: optimizer.NewEmbeddingOptimizer,
	OptimizerConfig: &optimizer.Config{
		Enabled:           true,
		PersistPath:       "/data/optimizer",
		HybridSearchRatio: 70, // 70% semantic, 30% keyword
		EmbeddingConfig: &embeddings.Config{
			BackendType: "ollama",
			BaseURL:     "http://localhost:11434",
			Model:       "nomic-embed-text",
			Dimension:   768,
		},
	},
}
```

### MCP Tools Exposed

When the optimizer is enabled, vMCP exposes two tools instead of all backend tools:

1. **`optim_find_tool`**: Semantic search for tools
   - Input: `tool_description` (natural language), optional `tool_keywords`, `limit`
   - Output: Ranked tools with similarity scores and token metrics

2. **`optim_call_tool`**: Dynamic tool invocation
   - Input: `backend_id`, `tool_name`, `parameters`
   - Output: Tool execution result
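
To make the two tool inputs concrete, here is a minimal sketch of the argument payloads a client might send. The JSON keys are the input names listed above. The Go types (for example, `tool_keywords` as a single string), the struct names, and the `encode` helper are assumptions made for illustration, not the package's actual types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// findToolArgs mirrors the documented inputs of optim_find_tool.
// The JSON keys come from the list above; the Go types are assumptions.
type findToolArgs struct {
	ToolDescription string `json:"tool_description"`
	ToolKeywords    string `json:"tool_keywords,omitempty"`
	Limit           int    `json:"limit,omitempty"`
}

// callToolArgs mirrors the documented inputs of optim_call_tool.
// Again, only the JSON keys are taken from the README.
type callToolArgs struct {
	BackendID  string                 `json:"backend_id"`
	ToolName   string                 `json:"tool_name"`
	Parameters map[string]interface{} `json:"parameters"`
}

// encode is a hypothetical helper that renders an argument struct as JSON.
func encode(v interface{}) (string, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	find, err := encode(findToolArgs{ToolDescription: "create a GitHub issue", Limit: 5})
	if err != nil {
		panic(err)
	}
	call, err := encode(callToolArgs{
		BackendID:  "github",
		ToolName:   "create_issue",
		Parameters: map[string]interface{}{"title": "Bug report"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(find)
	fmt.Println(call)
}
```

The backend ID and tool name used in `main` are placeholder values; in practice they would come from the results of a prior `optim_find_tool` call.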

## Benefits

- **Token Savings**: Only relevant tools are sent to the LLM (typically 80-95% reduction)
- **Hybrid Search**: Combines semantic embeddings (70%) with BM25 keyword matching (30%)
- **Startup Ingestion**: Tools are indexed once at startup, not per-session
- **Clean Architecture**: Interface-based design allows easy testing and alternative implementations
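
The 70/30 hybrid weighting can be pictured as a weighted sum of two scores per tool. The sketch below is an illustration only: it assumes both scores are already normalized to the range 0 to 1, and the `scoredTool`, `blend`, and `rank` names are hypothetical rather than taken from the `internal/db` package, whose actual combination logic may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// scoredTool is a hypothetical record holding both scores for one tool.
type scoredTool struct {
	Name     string
	Semantic float64 // vector similarity, assumed to be in [0, 1]
	Keyword  float64 // BM25 score, assumed to be normalized to [0, 1]
}

// blend combines the two scores. ratio is the percentage weight (0-100)
// given to the semantic score, in the spirit of HybridSearchRatio: 70.
func blend(t scoredTool, ratio int) float64 {
	w := float64(ratio) / 100
	return w*t.Semantic + (1-w)*t.Keyword
}

// rank returns a copy of tools sorted by blended score, highest first.
// The input slice is left unmodified.
func rank(tools []scoredTool, ratio int) []scoredTool {
	out := append([]scoredTool(nil), tools...)
	sort.SliceStable(out, func(i, j int) bool {
		return blend(out[i], ratio) > blend(out[j], ratio)
	})
	return out
}

func main() {
	tools := []scoredTool{
		{Name: "create_issue", Semantic: 0.9, Keyword: 0.2},
		{Name: "list_issues", Semantic: 0.6, Keyword: 1.0},
	}
	for _, t := range rank(tools, 70) {
		fmt.Printf("%s %.2f\n", t.Name, blend(t, 70))
	}
}
```

With a ratio of 70, a strong keyword match can still outrank a tool with a higher semantic score, which is the point of mixing the two signals.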

## Implementation Details

The `internal/` directory contains implementation details that are not part of the public API:

- **embeddings/**: Pluggable embedding backends (Ollama, vLLM, OpenAI-compatible)
- **db/**: Hybrid search using chromem-go (vector DB) + SQLite FTS5 (BM25)
- **ingestion/**: Tool ingestion pipeline with background embedding generation
- **models/**: Internal data structures for backend tools and metadata
- **tokens/**: Token counting for metrics calculation

These packages live under a Go `internal/` directory, so the Go toolchain rejects any import of them from code outside the `pkg/vmcp/optimizer` directory tree.

## Testing

The interface-based design enables easy testing:

```go
// Mock the interface for unit tests
mockOpt := mocks.NewMockOptimizer(ctrl)
mockOpt.EXPECT().FindTool(...).Return(...)
mockOpt.EXPECT().Close()

// Use in server configuration
cfg.Optimizer = mockOpt
```

## Migration from Integration Pattern

Previous versions used an `Integration` interface. The current `Optimizer` interface provides the same functionality with cleaner separation of concerns:

**Before (Integration):**
- `OptimizerIntegration optimizer.Integration`
- `optimizer.NewIntegration(...)`

**After (Optimizer):**
- `Optimizer optimizer.Optimizer`
- `OptimizerFactory optimizer.Factory`
- `optimizer.NewEmbeddingOptimizer(...)`

The factory pattern allows the server to create the optimizer at startup with all necessary dependencies.