Diffusion Platform

A production-grade infrastructure platform for deploying and managing diffusion models (Stable Diffusion 1.5, SDXL, FLUX), with LoRA training, queue management, and comprehensive monitoring.

Features

  • Multi-Model Support: SD1.5, SDXL, and FLUX with unified interface
  • LoRA Training: Complete pipeline for custom style training with BLIP auto-captioning
  • Queue System: Redis-backed task queue with priority handling and batch processing
  • Model Registry: Version control and management for models and LoRAs
  • RESTful API: FastAPI with OpenAPI documentation
  • Monitoring: Prometheus metrics and Grafana dashboards
  • Production Ready: Docker deployment, health checks, structured logging
  • Cloud Deployment: RunPod and Vast.ai integration

Architecture

```
┌─────────────────────────────────────────────┐
│         API Gateway (FastAPI)               │
│  Authentication, Rate Limiting, Routing     │
└──────────┬──────────────────────────────────┘
           │
     ┌─────┴─────────────┐
     │                   │
┌────▼───────────┐  ┌────▼──────────────┐
│  Task Queue    │  │  Cache Layer      │
│  (Redis)       │  │  (Redis/Memcached)│
└────┬───────────┘  └───────────────────┘
     │
┌────▼────────────────────────────────────────┐
│  Generation Workers (GPU Instances)         │
│  ┌──────────────────────────────────┐       │
│  │ Model: SD1.5 | SDXL | FLUX       │       │
│  └──────────────────────────────────┘       │
└─────────────────────────────────────────────┘
```

Quick Start

Prerequisites

  • NVIDIA GPU with 11GB+ VRAM (24GB+ recommended for SDXL/FLUX)
  • Docker and Docker Compose
  • CUDA 12.1+
  • Python 3.10+

Local Development

1. Clone the repository:

   ```bash
   git clone https://github.com/Pedimathematician/diffusion-platform.git
   cd diffusion-platform
   ```

2. Set up environment:

   ```bash
   cp .env.example .env
   # Edit .env with your configuration
   ```

3. Start services with Docker Compose:

   ```bash
   docker-compose up -d
   ```

4. Access the API at http://localhost:8000 (interactive OpenAPI docs at http://localhost:8000/docs).

Without Docker

1. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. Start Redis:

   ```bash
   redis-server --daemonize yes
   ```

3. Run the API server:

   ```bash
   uvicorn backend.main:app --host 0.0.0.0 --port 8000
   ```

Usage

Generate Images via API

```python
import requests

# Submit generation request
response = requests.post("http://localhost:8000/api/generation/generate", json={
    "prompt": "a futuristic cityscape at sunset, highly detailed",
    "model_type": "sdxl",
    "num_inference_steps": 30,
    "guidance_scale": 7.5,
    "width": 1024,
    "height": 1024
})

task_id = response.json()["task_id"]

# Check status
status = requests.get(f"http://localhost:8000/api/generation/task/{task_id}")
print(status.json())

# Download image when complete
image = requests.get(f"http://localhost:8000/api/generation/task/{task_id}/image/0")
with open("output.png", "wb") as f:
    f.write(image.content)
```
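The one-shot status check above can be wrapped in a small polling helper. A minimal sketch; the terminal status names (`completed`, `failed`, `cancelled`) are assumptions about the API's response schema, so check the actual task-status payload before relying on them:

```python
import time

def poll_task(fetch_status, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Call fetch_status() until the task reaches a terminal state.

    fetch_status: zero-arg callable returning the task-status dict,
    e.g. lambda: requests.get(f".../task/{task_id}").json().
    The terminal status values below are assumed, not documented.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = fetch_status()
        if state.get("status") in ("completed", "failed", "cancelled"):
            return state
        sleep(interval)
    raise TimeoutError("task did not reach a terminal state in time")
```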

Using Presets

```python
response = requests.post("http://localhost:8000/api/generation/generate/preset", json={
    "preset_name": "detailed_concept_art",
    "prompt": "a magical forest with bioluminescent plants"
})
```

Train LoRA Adapter

1. Prepare dataset:

   ```bash
   python training/dataset_prep.py \
       /path/to/training/images \
       /path/to/prepared/dataset \
       --target-size 768 \
       --caption-prefix "in mystyle"
   ```

2. Train LoRA:

   ```bash
   python training/lora_trainer.py \
       /path/to/prepared/dataset \
       --model stabilityai/stable-diffusion-xl-base-1.0 \
       --output-dir ./my_lora \
       --epochs 100 \
       --lora-rank 32 \
       --use-wandb
   ```
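
Assuming `dataset_prep.py` follows the common image-plus-caption-file convention used by diffusers/kohya training scripts (an assumption, not confirmed by this repository), the prepared dataset directory would look roughly like:

```
/path/to/prepared/dataset/
├── image_0001.png
├── image_0001.txt    # "in mystyle, <BLIP caption>"
├── image_0002.png
├── image_0002.txt
└── ...
```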

API Endpoints

Generation

  • POST /api/generation/generate - Submit generation request
  • POST /api/generation/generate/preset - Generate with preset
  • GET /api/generation/task/{task_id} - Get task status
  • GET /api/generation/task/{task_id}/image/{index} - Download image
  • DELETE /api/generation/task/{task_id} - Cancel task

Models

  • GET /api/models/list - List registered models
  • GET /api/models/{model_id}/versions - List model versions
  • GET /api/models/{model_id}/compare - Compare versions

Health & Monitoring

  • GET /health - Health check
  • GET /metrics - Prometheus metrics
  • GET /stats - System statistics

Deployment

RunPod

```bash
# Use the provided startup script
bash deploy/runpod/startup.sh
```

Vast.ai

```bash
# Search for available GPUs
python deploy/vast_ai/launch.py --action search

# Create instance
python deploy/vast_ai/launch.py --action create --offer-id <OFFER_ID>
```

Kubernetes

```bash
kubectl apply -f deploy/kubernetes/
```

Configuration

Environment Variables

See `.env.example` for all available configuration options:

  • `MODEL_CACHE_DIR`: Directory for model weights
  • `OUTPUT_DIR`: Directory for generated images
  • `REDIS_URL`: Redis connection URL
  • `HF_TOKEN`: Hugging Face token for model downloads
  • `LOG_LEVEL`: Logging level (INFO, DEBUG, etc.)
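
A minimal `.env` sketch using these variables (all values are illustrative placeholders):

```ini
MODEL_CACHE_DIR=/data/models
OUTPUT_DIR=/data/outputs
REDIS_URL=redis://localhost:6379/0
HF_TOKEN=hf_xxxxxxxxxxxxxxxx
LOG_LEVEL=INFO
```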

Presets

Edit `config/presets.yaml` to add custom generation presets.
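
The schema of `config/presets.yaml` is not shown here; a hypothetical entry, assuming preset fields mirror the generation request parameters, might look like:

```yaml
detailed_concept_art:
  model_type: sdxl
  num_inference_steps: 30
  guidance_scale: 7.5
  width: 1024
  height: 1024
  negative_prompt: "blurry, low quality"   # hypothetical field
```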

Monitoring

The platform includes comprehensive monitoring:

  • Prometheus Metrics: Request rates, latency, queue depth, GPU memory
  • Grafana Dashboards: Real-time visualization
  • Structured Logging: JSON logs with request tracking
  • Health Checks: Automated health monitoring

Access Grafana at http://localhost:3000 (default credentials: admin/admin)
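
With Prometheus scraping `/metrics`, queries like the following could drive dashboard panels. The metric names here are hypothetical; inspect `/metrics` for the names the platform actually exports:

```promql
# Request rate over the last 5 minutes (hypothetical metric name)
rate(generation_requests_total[5m])

# 95th-percentile generation latency (hypothetical metric name)
histogram_quantile(0.95, rate(generation_duration_seconds_bucket[5m]))
```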

Development

Project Structure

```
diffusion-platform/
├── backend/             # FastAPI backend
│   ├── api/             # API routes and schemas
│   ├── core/            # Core services (models, queue, registry)
│   └── utils/           # Utilities (logging, metrics)
├── training/            # Training pipelines
├── deploy/              # Deployment configs
├── config/              # Configuration files
├── docs/                # Documentation
└── tests/               # Test suite
```

Running Tests

```bash
pytest backend/tests/ -v --cov
```

Code Quality

```bash
# Format code
black backend/ training/

# Lint
flake8 backend/ training/

# Type checking
mypy backend/
```

Performance

  • SD1.5: ~2-3s per image (512x512, 30 steps)
  • SDXL: ~5-8s per image (1024x1024, 30 steps)
  • FLUX: ~8-12s per image (1024x1024, 30 steps)

Benchmarked on an RTX 4090 with xformers and torch.compile optimizations enabled.

License

MIT License - see LICENSE file for details

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

Credits

Built by Semaka Mathunyane (@Pedimathematician)

Support

For questions or issues, please open an issue on the GitHub repository.
