AlphaTrion is an open-source experiment tracking and agent orchestration framework for LLM application developers and AI engineers. Orchestrate multi-agent workflows, track LLM experiments, manage artifacts, and gain deep observability into your GenAI applications, all through an intuitive Python API and modern dashboard. It is named after Alpha Trion, the oldest and wisest Transformer.
- 🔬 Experiment Management - Hierarchical experiments and runs with smart checkpointing (save on best metrics, early stopping, target optimization)
- 📦 Artifact Registry - Version datasets and model checkpoints using OCI registries or S3, with native push/pull APIs
- 📊 Metrics & Observability - Built-in Prometheus metrics and distributed tracing (OpenTelemetry + ClickHouse) for LLM calls
- 🪝 Extensible Hooks - Pre/post-save hooks and post-run hooks for custom workflows
- 🎯 Modern Dashboard - Explore experiments, visualize metrics, and analyze traces through an intuitive web UI
- 🔌 Production-Ready - Async-first design, PostgreSQL metadata storage, and support for distributed workloads
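
The "save on best metrics" and early-stopping behaviour mentioned under Experiment Management can be pictured with a small, framework-independent sketch. The class `BestMetricTracker` and its parameters are hypothetical names chosen for illustration; they are not part of AlphaTrion's API, and the real checkpointing logic may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BestMetricTracker:
    """Hypothetical helper: decides when to checkpoint and when to stop."""

    patience: int = 3              # evaluations without improvement before stopping
    higher_is_better: bool = True  # set False for metrics such as loss
    best: Optional[float] = None   # best value seen so far
    stale: int = 0                 # consecutive evaluations without improvement

    def update(self, value: float) -> bool:
        """Record a metric value; return True if it is a new best (save now)."""
        if self.best is None:
            improved = True
        elif self.higher_is_better:
            improved = value > self.best
        else:
            improved = value < self.best

        if improved:
            self.best = value
            self.stale = 0
        else:
            self.stale += 1
        return improved

    @property
    def should_stop(self) -> bool:
        """True once `patience` evaluations have passed without improvement."""
        return self.stale >= self.patience
```

In a training loop, one would call `update()` after each evaluation, write a checkpoint whenever it returns `True`, and break out of the loop once `should_stop` becomes true.
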
- Organization - Top-level entity for grouping teams and users
- Team - Collaborative workspace for organizing experiments and runs
- User - Individual account with secure authentication and team memberships
- Experiment - Logical grouping of runs with shared purpose, organized by labels
- Run - Individual execution instance with configuration and metrics
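
To make the nesting of these concepts concrete, here is a rough sketch of the hierarchy as plain dataclasses. All field names, and the `runs_with_label` helper, are assumptions made for illustration; they do not reflect AlphaTrion's actual schema or Python API.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    config: dict                                  # configuration for this execution
    metrics: dict = field(default_factory=dict)   # metrics logged during the run


@dataclass
class Experiment:
    name: str
    labels: list[str] = field(default_factory=list)  # assumed: labels as plain strings
    runs: list[Run] = field(default_factory=list)


@dataclass
class Team:
    name: str
    experiments: list[Experiment] = field(default_factory=list)


@dataclass
class User:
    email: str
    team_names: list[str] = field(default_factory=list)  # team memberships


@dataclass
class Organization:
    name: str
    teams: list[Team] = field(default_factory=list)
    users: list[User] = field(default_factory=list)

    def runs_with_label(self, label: str) -> list[Run]:
        """Collect runs from every experiment carrying the given label."""
        return [
            run
            for team in self.teams
            for exp in team.experiments
            if label in exp.labels
            for run in exp.runs
        ]
```
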
# From PyPI
pip install alphatrion
# Or from source
git clone https://github.com/inftyai/alphatrion.git && cd alphatrion
source start.sh

# Start PostgreSQL, ClickHouse, and Registry
cp .env.example .env
make up
# Wait for services to be ready, then run migrations
make migrate-all
# Initialize your organization, team, and user account
alphatrion init

Optional Tools:
- pgAdmin: http://localhost:8081 (alphatrion@inftyai.com / alphatr1on)
- Registry UI: http://localhost:80
- Grafana: http://localhost:3000 (admin / admin) - LLM metrics dashboard
- Prometheus: http://localhost:9090 - Metrics explorer
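
Before opening these tools, it can help to confirm that each port is actually accepting connections. The following is a minimal sketch using only the Python standard library. The ports are the ones listed above, and `is_port_open` is a hypothetical helper, not an AlphaTrion command.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Ports taken from the Optional Tools list.
    tools = {"pgAdmin": 8081, "Registry UI": 80, "Grafana": 3000, "Prometheus": 9090}
    for name, port in tools.items():
        state = "up" if is_port_open("localhost", port) else "not reachable"
        print(f"{name:<12} localhost:{port:<5} {state}")
```

An open port only shows that something is listening; it does not prove the service behind it has finished starting.
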
import asyncio

import alphatrion as alpha
from alphatrion.experiment import CraftExperiment

# Initialize with your user ID
alpha.init(user_id="<your_user_id>")

async def my_task():
    # Your code here
    await alpha.log_metrics({"accuracy": 0.95, "loss": 0.12})

async def main():
    # `async with` is only valid inside a coroutine
    async with CraftExperiment.start(name="my_experiment") as exp:
        run = exp.run(my_task)
        await exp.wait()

asyncio.run(main())

# Start backend server (terminal 1)
alphatrion server
# Launch dashboard (terminal 2)
alphatrion dashboard

Access the dashboard at http://127.0.0.1:5173 and log in with your email and password to explore experiments, visualize metrics, and analyze traces.
AlphaTrion automatically captures distributed tracing data for all LLM calls, including latency, token usage, and span relationships.
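
To illustrate what latency, token usage, and span relationships look like as data, here is a small sketch that summarizes a list of spans. The `Span` fields and the `summarize` function are assumptions for illustration only; they are not AlphaTrion's storage schema or the OpenTelemetry data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]   # None marks the root span of a trace
    name: str
    latency_ms: float
    total_tokens: int = 0      # 0 for spans that are not LLM calls


def summarize(spans: list[Span]) -> dict:
    """Aggregate token usage and group child span IDs under their parents."""
    children: dict[str, list[str]] = {}
    for span in spans:
        if span.parent_id is not None:
            children.setdefault(span.parent_id, []).append(span.span_id)

    roots = [span for span in spans if span.parent_id is None]
    return {
        "total_tokens": sum(span.total_tokens for span in spans),
        # Root spans enclose their children, so only they count toward wall time.
        "root_latency_ms": sum(span.latency_ms for span in roots),
        "children": children,
    }
```

For example, a trace with one root span and two LLM-call children yields a `children` mapping from the root's ID to both child IDs, plus the combined token count.
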
Post-run hooks automatically sync metadata and status after a run completes.
import asyncio

from alphatrion.experiment import CraftExperiment
from alphatrion.run import PostRunHookFn

async def train_model():
    # Your training code
    return {
        "metadata": {"accuracy": 0.95, "loss": 0.05},
        "status": "COMPLETED",
    }

async def main():
    # `async with` is only valid inside a coroutine
    async with CraftExperiment.start("training") as exp:
        run = exp.run(
            train_model,
            post_run_hooks=[PostRunHookFn.sync_metadata, PostRunHookFn.sync_status],
        )
        await exp.wait()

asyncio.run(main())

# Stop the services
make down

- Architecture: Diagrams
- Dashboard: Setup Guide | CLI Reference | Architecture
- Development: Contributing Guide
- Claude Code Integration: Hooks Setup
We welcome contributions! Check out our development guide to get started.

