A production-grade infrastructure platform for deploying and managing diffusion models (Stable Diffusion 1.5, SDXL, FLUX), with LoRA training, queue management, and comprehensive monitoring.
- Multi-Model Support: SD1.5, SDXL, and FLUX with unified interface
- LoRA Training: Complete pipeline for custom style training with BLIP auto-captioning
- Queue System: Redis-backed task queue with priority handling and batch processing
- Model Registry: Version control and management for models and LoRAs
- RESTful API: FastAPI with OpenAPI documentation
- Monitoring: Prometheus metrics and Grafana dashboards
- Production Ready: Docker deployment, health checks, structured logging
- Cloud Deployment: RunPod and Vast.ai integration
```
┌─────────────────────────────────────────────┐
│            API Gateway (FastAPI)            │
│   Authentication, Rate Limiting, Routing    │
└──────────┬──────────────────────────────────┘
           │
    ┌──────┴─────────────┐
    │                    │
┌───▼────────────┐  ┌────▼──────────────┐
│   Task Queue   │  │    Cache Layer    │
│    (Redis)     │  │ (Redis/Memcached) │
└───┬────────────┘  └───────────────────┘
    │
┌───▼─────────────────────────────────────────┐
│     Generation Workers (GPU Instances)      │
│  ┌──────────────────────────────────┐       │
│  │    Model: SD1.5 | SDXL | FLUX    │       │
│  └──────────────────────────────────┘       │
└─────────────────────────────────────────────┘
```
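The queue in the diagram above dispatches tasks to GPU workers by priority. As a minimal in-process sketch of those semantics (the real platform backs this with Redis; `TaskQueue`, `submit`, and `next_task` are hypothetical names used only for illustration):

```python
import heapq
import itertools

class TaskQueue:
    """Illustrative priority queue: lower number = higher priority,
    FIFO order among tasks that share a priority."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for stable FIFO order

    def submit(self, payload, priority=0):
        # The counter guarantees heap entries never compare payloads directly
        heapq.heappush(self._heap, (priority, next(self._counter), payload))

    def next_task(self):
        _priority, _seq, payload = heapq.heappop(self._heap)
        return payload

queue = TaskQueue()
queue.submit({"prompt": "background batch job"}, priority=10)
queue.submit({"prompt": "interactive request"}, priority=0)
print(queue.next_task()["prompt"])  # the interactive request is served first
```

A Redis-backed implementation would replace the in-process heap with a shared structure (e.g. a sorted set) so multiple GPU workers can pull from the same queue.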
- NVIDIA GPU with 11GB+ VRAM (24GB+ recommended for SDXL/FLUX)
- Docker and Docker Compose
- CUDA 12.1+
- Python 3.10+
- Clone the repository:

```bash
git clone https://github.com/Pedimathematician/diffusion-platform.git
cd diffusion-platform
```

- Set up environment:

```bash
cp .env.example .env
# Edit .env with your configuration
```

- Start services with Docker Compose:

```bash
docker-compose up -d
```

- Access the API:
  - API: http://localhost:8000
  - Docs: http://localhost:8000/docs
  - Prometheus: http://localhost:9090
  - Grafana: http://localhost:3000
- Install dependencies:

```bash
pip install -r requirements.txt
```

- Start Redis:

```bash
redis-server --daemonize yes
```

- Run the API server:

```bash
uvicorn backend.main:app --host 0.0.0.0 --port 8000
```

```python
import requests

# Submit generation request
response = requests.post("http://localhost:8000/api/generation/generate", json={
    "prompt": "a futuristic cityscape at sunset, highly detailed",
    "model_type": "sdxl",
    "num_inference_steps": 30,
    "guidance_scale": 7.5,
    "width": 1024,
    "height": 1024
})
task_id = response.json()["task_id"]

# Check status
status = requests.get(f"http://localhost:8000/api/generation/task/{task_id}")
print(status.json())

# Download image when complete
image = requests.get(f"http://localhost:8000/api/generation/task/{task_id}/image/0")
with open("output.png", "wb") as f:
    f.write(image.content)
```

```python
response = requests.post("http://localhost:8000/api/generation/generate/preset", json={
    "preset_name": "detailed_concept_art",
    "prompt": "a magical forest with bioluminescent plants"
})
```

- Prepare dataset:
```bash
python training/dataset_prep.py \
  /path/to/training/images \
  /path/to/prepared/dataset \
  --target-size 768 \
  --caption-prefix "in mystyle"
```

- Train LoRA:
```bash
python training/lora_trainer.py \
  /path/to/prepared/dataset \
  --model stabilityai/stable-diffusion-xl-base-1.0 \
  --output-dir ./my_lora \
  --epochs 100 \
  --lora-rank 32 \
  --use-wandb
```

- `POST /api/generation/generate` - Submit generation request
- `POST /api/generation/generate/preset` - Generate with preset
- `GET /api/generation/task/{task_id}` - Get task status
- `GET /api/generation/task/{task_id}/image/{index}` - Download image
- `DELETE /api/generation/task/{task_id}` - Cancel task

- `GET /api/models/list` - List registered models
- `GET /api/models/{model_id}/versions` - List model versions
- `GET /api/models/{model_id}/compare` - Compare versions

- `GET /health` - Health check
- `GET /metrics` - Prometheus metrics
- `GET /stats` - System statistics
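Clients typically poll the task status endpoint until generation finishes. A small helper for that loop might look like this; note that `poll_until_done`, the injected `fetch_status` callable, and the terminal `"completed"`/`"failed"` status values are assumptions about the response schema, not confirmed API behavior:

```python
import time

def poll_until_done(fetch_status, task_id, interval=1.0, timeout=300.0):
    """Poll until a task reaches a terminal state or the timeout elapses.

    fetch_status(task_id) should return the task's status dict; injecting
    it keeps the loop testable without a live server.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = fetch_status(task_id)
        if result.get("status") in ("completed", "failed"):
            return result
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish within {timeout}s")

# Against a live server, fetch_status would wrap the task endpoint, e.g.:
# fetch_status = lambda tid: requests.get(
#     f"http://localhost:8000/api/generation/task/{tid}").json()
```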
- RunPod:

```bash
# Use the provided startup script
bash deploy/runpod/startup.sh
```

- Vast.ai:

```bash
# Search for available GPUs
python deploy/vast_ai/launch.py --action search

# Create instance
python deploy/vast_ai/launch.py --action create --offer-id <OFFER_ID>
```

- Kubernetes:

```bash
kubectl apply -f deploy/kubernetes/
```

See `.env.example` for all available configuration options:

- `MODEL_CACHE_DIR`: Directory for model weights
- `OUTPUT_DIR`: Directory for generated images
- `REDIS_URL`: Redis connection URL
- `HF_TOKEN`: Hugging Face token for model downloads
- `LOG_LEVEL`: Logging level (INFO, DEBUG, etc.)
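An illustrative `.env` sketch using the variables listed above (all values are placeholders; consult `.env.example` for the authoritative list):

```
MODEL_CACHE_DIR=/data/models
OUTPUT_DIR=/data/outputs
REDIS_URL=redis://localhost:6379/0
HF_TOKEN=<your Hugging Face token>
LOG_LEVEL=INFO
```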
Edit config/presets.yaml to add custom generation presets.
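A preset entry might look like the following sketch; the field names mirror the generation API's parameters, but the actual schema is whatever `config/presets.yaml` defines, so treat every key here as hypothetical:

```yaml
# Hypothetical preset entry for config/presets.yaml
detailed_concept_art:
  model_type: sdxl
  num_inference_steps: 30
  guidance_scale: 7.5
  width: 1024
  height: 1024
```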
The platform includes comprehensive monitoring:
- Prometheus Metrics: Request rates, latency, queue depth, GPU memory
- Grafana Dashboards: Real-time visualization
- Structured Logging: JSON logs with request tracking
- Health Checks: Automated health monitoring
Access Grafana at http://localhost:3000 (default credentials: admin/admin)
diffusion-platform/
├── backend/ # FastAPI backend
│ ├── api/ # API routes and schemas
│ ├── core/ # Core services (models, queue, registry)
│ └── utils/ # Utilities (logging, metrics)
├── training/ # Training pipelines
├── deploy/ # Deployment configs
├── config/ # Configuration files
├── docs/ # Documentation
└── tests/ # Test suite
```bash
pytest backend/tests/ -v --cov
```

```bash
# Format code
black backend/ training/

# Lint
flake8 backend/ training/

# Type checking
mypy backend/
```

- SD1.5: ~2-3s per image (512x512, 30 steps)
- SDXL: ~5-8s per image (1024x1024, 30 steps)
- FLUX: ~8-12s per image (1024x1024, 30 steps)
Benchmarks measured on an RTX 4090 with xformers and torch.compile optimizations.
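The latency ranges above imply a rough per-worker throughput ceiling; as back-of-envelope arithmetic only (using the midpoint of each range and ignoring queueing, I/O, and model-load overhead):

```python
# Midpoints of the latency ranges quoted above, in seconds per image
latencies = {"sd15": 2.5, "sdxl": 6.5, "flux": 10.0}

# Implied single-worker throughput, in images per hour
throughput = {model: round(3600 / s) for model, s in latencies.items()}
print(throughput)  # → {'sd15': 1440, 'sdxl': 554, 'flux': 360}
```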
MIT License - see LICENSE file for details
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
Built by Semaka Mathunyane (@Pedimathematician)
For questions or issues:
- GitHub Issues: https://github.com/Pedimathematician/diffusion-platform/issues
- Documentation: docs/