A flexible authentication and authorization service designed to work with nginx's ngx_http_auth_request_module. Provides HMAC signature validation and API key authentication for microservices with fine-grained permission control.
- Multiple Authentication Methods: API Key (simple) or HMAC-SHA256 signatures (secure)
- Route-Based Protection: Define which endpoints require authentication
- Method-Level Permissions: Control access per HTTP method (GET, POST, DELETE, etc.)
- Domain-Based Routing: Multi-domain support with domain-specific access rules
- Rate Limiting: Per-client request limits with Redis backend (rolling 24-hour windows)
- Replay Protection: Redis-backed HMAC nonce storage for multi-instance deployments
- Client Management: Issue credentials and manage client lifecycle
- Flexible Permissions: Grant specific clients access to specific routes with specific methods
- Nginx Integration: Works seamlessly with the nginx auth_request directive
- Flask HTTP Endpoint: Production-ready authorization service on port 7843
- Database-Driven: All configuration in PostgreSQL - no code changes needed
- Complete Management Scripts: Interactive CLI tools for all operations
- Audit Logging: Comprehensive structured logging via Loki with full request context
- Prometheus Metrics: Built-in metrics endpoint for monitoring and alerting
- Trusted Forwarder Support: Correct client IP resolution when auth subrequests route through Cloudflare
- Aegis-Backed Admin Console: Bearer-token auth on /api/admin/*, public-auth proxy on /api/auth/*, webhook-driven admin provisioning, runtime config for the Next.js console
- Comprehensive Testing: 330 tests with 100% pass rate
The system is built around three core entities:
- Routes: Define protected API endpoints and their auth requirements per HTTP method
- Clients: API consumers with credentials (API keys and/or shared secrets)
- Permissions: Connect clients to routes with allowed HTTP methods
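As a rough illustration of how the three entities relate, the sketch below models them as plain dataclasses and evaluates a single access check. The class names, field names, auth-requirement strings, and the is_allowed helper are all hypothetical and chosen for this example; the project's real models and authorizer may be shaped differently.

```python
from dataclasses import dataclass, field


@dataclass
class Route:
    route_id: str
    pattern: str
    # Per-method auth requirement: "none", "api_key", or "hmac" (illustrative values).
    methods: dict = field(default_factory=dict)


@dataclass
class Client:
    client_id: str
    name: str
    status: str = "active"  # active, suspended, or revoked


@dataclass
class Permission:
    client_id: str
    route_id: str
    allowed_methods: frozenset


def is_allowed(client, route, method, permissions):
    """Return True if the client may call the route with the given HTTP method."""
    method = method.upper()
    if method not in route.methods:
        return False  # the route does not support this method at all
    if route.methods[method] == "none":
        return True  # public method, no credentials needed
    if client is None or client.status != "active":
        return False  # suspended and revoked clients are always denied
    return any(
        p.client_id == client.client_id
        and p.route_id == route.route_id
        and method in p.allowed_methods
        for p in permissions
    )


route = Route("r1", "/api/products/*", {"GET": "none", "POST": "api_key", "DELETE": "hmac"})
client = Client("c1", "Mobile App v2.1")
permissions = [Permission("c1", "r1", frozenset({"GET", "POST"}))]
```

With this data, POST is allowed for the client, DELETE is denied because the permission does not list it, and GET is allowed even without a client because the route marks it public.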
Routes can be configured with domain-specific access rules, enabling multi-domain support with a single gatekeeper instance:
- Exact domain: api.example.com - matches only this specific domain
- Wildcard subdomain: *.example.com - matches all subdomains (api.example.com, admin.example.com, etc.)
- Any domain: * - matches all domains (backward compatible)
Use cases:
- Different access rules for public API vs admin API on different subdomains
- Multi-tenant applications with domain-per-tenant
- Development/staging/production environments on different domains
Priority: When multiple routes match, exact domain matches take priority over wildcard subdomains, which take priority over any-domain wildcards.
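The priority rule can be sketched as a small scoring function. This is only an illustration: the function names and the (pattern, name) tuple shape are hypothetical, and it assumes that a wildcard such as *.example.com does not match the bare example.com, which the text above does not specify.

```python
def domain_specificity(pattern, host):
    """Score a match: 2 = exact, 1 = wildcard subdomain, 0 = any domain, None = no match."""
    pattern = pattern.lower()
    host = host.lower()
    if pattern == host:
        return 2
    # "*.example.com" matches any host ending in ".example.com" (assumed: not the apex itself).
    if pattern.startswith("*.") and host.endswith(pattern[1:]):
        return 1
    if pattern == "*":
        return 0
    return None


def pick_route(routes, host):
    """Pick the most specific route for a host; routes are (domain_pattern, name) pairs."""
    matches = []
    for pattern, name in routes:
        score = domain_specificity(pattern, host)
        if score is not None:
            matches.append((score, name))
    if not matches:
        return None
    # Highest score wins: exact beats wildcard subdomain, which beats any-domain.
    return max(matches, key=lambda item: item[0])[1]


routes = [
    ("*", "fallback"),
    ("*.example.com", "all-subdomains"),
    ("api.example.com", "public-api"),
]
```

Here pick_route(routes, "api.example.com") selects "public-api", while "admin.example.com" falls through to "all-subdomains" and any other host to "fallback".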
See ARCHITECTURE.md for detailed architecture documentation.
When upstream servers make auth subrequests to the gatekeeper through Cloudflare, Cloudflare sets CF-Connecting-IP to the upstream server's IP, not the original client's IP. As a result, the gatekeeper logs and rate-limits against the wrong IP.
The solution: Upstream servers forward the real client IP in an X-Original-Client-IP header, and the gatekeeper trusts it only from known server IPs.
How it works:
- Client (198.51.100.42) → Cloudflare → Upstream nginx (e.g., your app server at 203.0.113.10)
- Upstream nginx makes an auth subrequest back through Cloudflare to the gatekeeper
- Cloudflare sets CF-Connecting-IP to 203.0.113.10 (the upstream server, not the client)
- But the upstream nginx also sets X-Original-Client-IP: 198.51.100.42 (the real client)
- Gatekeeper sees the caller is 203.0.113.10, which is in TRUSTED_FORWARDER_IPS
- Gatekeeper uses X-Original-Client-IP (198.51.100.42) as the real client IP
Gatekeeper configuration (add the upstream server IPs to TRUSTED_FORWARDER_IPS):
TRUSTED_FORWARDER_IPS=203.0.113.10,203.0.113.20
Upstream nginx configuration (add this to the auth_request location block):
location = /auth {
    internal;
    proxy_pass https://gatekeeper.example.com/authz;
    # Forward the real client IP (from Cloudflare's CF-Connecting-IP on this server)
    proxy_set_header X-Original-Client-IP $http_cf_connecting_ip;
    # Standard headers
    proxy_set_header X-Original-URI $request_uri;
    proxy_set_header X-Original-Method $request_method;
    proxy_set_header X-Original-Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
Security: If the caller IP is not in TRUSTED_FORWARDER_IPS, the X-Original-Client-IP header is ignored entirely, which prevents spoofing by arbitrary clients.
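The trust decision can be sketched in a few lines of Python. This is an illustration under assumptions, not the code in src/utils/ip_resolver.py: the function names are hypothetical, headers are modeled as a plain dict with exact-case keys, and caller_ip stands for whatever address the gatekeeper already resolved for the immediate caller.

```python
import ipaddress


def parse_trusted_forwarders(raw):
    """Parse the comma-separated TRUSTED_FORWARDER_IPS value into a set of IP strings."""
    return frozenset(ip.strip() for ip in (raw or "").split(",") if ip.strip())


def resolve_client_ip(caller_ip, headers, trusted_forwarders):
    """Use X-Original-Client-IP only when the caller is a trusted forwarder."""
    if caller_ip not in trusted_forwarders:
        return caller_ip  # header is ignored entirely for untrusted callers
    forwarded = headers.get("X-Original-Client-IP", "").strip()
    try:
        ipaddress.ip_address(forwarded)  # reject empty or malformed values
    except ValueError:
        return caller_ip
    return forwarded


trusted = parse_trusted_forwarders("203.0.113.10, 203.0.113.20")
```

Validating the forwarded value with ipaddress is an extra precaution added for this sketch; it keeps a malformed header from ending up in logs as if it were an address.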
The gatekeeper itself is the auth/authz engine for services it protects via nginx auth_request. Its own admin console is a separate concern that uses ByteForge Aegis for human-user authentication. Four moving parts:
- Bearer-token auth on /api/admin/*: every admin endpoint (@require_console_admin) takes an Aegis auth_token from the Authorization header, calls Aegis's /api/auth/me, then checks that the resolved user is provisioned in console_admins. A small in-memory positive cache (60s TTL by default) keeps the per-request round-trip out of the hot path. See src/auth/aegis_authenticator.py.
- Public-auth proxy on /api/auth/*: login, register, verify-email, check-verification-token, request-password-reset, reset-password. Aegis requires X-Tenant-Api-Key on these six endpoints (so a public client can't impersonate a tenant). The browser must NEVER hold this key, so the gatekeeper proxies the calls and attaches the key server-side from AEGIS_TENANT_API_KEY. See src/auth/aegis_tenant.py and src/blueprints/auth.py.
- Webhook-driven admin provisioning at /api/webhooks/aegis: when Aegis fires a user.verified webhook, the gatekeeper inserts a row into console_admins if and only if the email is in the comma-separated AEGIS_ADMIN_EMAILS allowlist. The webhook signature is HMAC-SHA256-verified using AEGIS_WEBHOOK_SECRET. Default is closed (empty allowlist = nobody gets provisioned).
- Runtime config at /api/config: public, no auth, returns {aegisApiUrl, siteName, siteDomain} from env vars (AEGIS_API_URL, SITE_NAME, SITE_DOMAIN). The Next.js console fetches this at first paint instead of baking URLs into its bundle, so a single frontend image deploys across tenants. See src/blueprints/config.py.
The required environment variables for each piece are documented in env.example.
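To make the webhook provisioning rule concrete, here is a minimal sketch of the two checks it combines: HMAC-SHA256 verification and the default-closed allowlist. Several details are assumptions made for illustration only: that the signature is a hex digest over the raw request body, that the payload is JSON with "event" and "email" fields, and that emails are compared case-insensitively. The function names are hypothetical and do not come from the project.

```python
import hashlib
import hmac
import json


def parse_admin_allowlist(raw):
    """Parse a comma-separated email list; empty input yields an empty (closed) allowlist."""
    return frozenset(e.strip().lower() for e in (raw or "").split(",") if e.strip())


def verify_signature(secret, body, signature_hex):
    """Check an HMAC-SHA256 hex digest over the raw body using a constant-time compare."""
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode("utf-8"), signature_hex.encode("utf-8"))


def email_to_provision(secret, allowlist, body, signature_hex):
    """Return the email to provision, or None if the webhook must be ignored."""
    if not secret or not verify_signature(secret, body, signature_hex):
        return None  # missing secret or bad signature: reject
    event = json.loads(body)
    if event.get("event") != "user.verified":  # assumed payload field
        return None
    email = str(event.get("email", "")).strip().lower()  # assumed payload field
    return email if email in allowlist else None


secret = "example-webhook-secret"
body = json.dumps({"event": "user.verified", "email": "Admin@Example.com"}).encode("utf-8")
signature = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
allowlist = parse_admin_allowlist("admin@example.com, ops@example.com")
```

The signature is checked before the body is parsed, so unsigned input never reaches the JSON decoder, and an empty allowlist provisions nobody even when the signature is valid.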
- Python 3.13+
- PostgreSQL 12+
- Virtual environment
# Clone the repository
git clone <repository-url>
cd api-gatekeeper
# Create and activate virtual environment
python3.13 -m venv .
source bin/activate # On Windows: Scripts\activate
# Install dependencies
pip install -r requirements.txt
pip install -r dev-requirements.txt
# Install package in editable mode
pip install -e .
# Set up environment variables
cp env.example .env
# Edit .env with your PostgreSQL credentials
# Set required environment variables
export PG_PASSWORD="your_postgres_superuser_password"
export API_AUTH_ADMIN_PG_PASSWORD="your_app_password"
# Create and initialize database
source bin/activate
python dev_scripts/setup_database.py
# Create test database (for running tests)
python dev_scripts/setup_database.py --test-db
See DATABASE_SETUP.md for detailed setup instructions.
Start the Flask authorization service:
# Load environment variables and start server
set -a && source .env && set +a
source bin/activate
python src/app.py
The server starts on port 7843 and exposes:
For nginx + monitoring:
- /authz: Authorization endpoint (for nginx auth_request on protected services)
- /health: Health check endpoint (verifies database and Redis connectivity)
- /metrics: Prometheus metrics endpoint (for monitoring and alerting)
For the admin console (gated by Aegis bearer auth, except /api/config which is public):
- /api/admin/clients, /api/admin/routes, /api/admin/permissions: Admin-console list endpoints
- /api/auth/*: Public auth proxy (login, register, verify-email, password reset); attaches the per-tenant Aegis API key server-side so it never ships to the browser
- /api/webhooks/aegis: Aegis webhook receiver; provisions console_admins rows from user.verified events
- /api/config: Runtime config consumed by the console frontend on first paint (so a single frontend image deploys across tenants)
# Health check
curl http://localhost:7843/health
# Test authorization (requires X-Original-URI, X-Original-Method, and X-Original-Host headers)
curl -v http://localhost:7843/authz \
-H "X-Original-URI: /api/test" \
-H "X-Original-Method: GET" \
-H "X-Original-Host: api.example.com" \
  -H "Authorization: Bearer your-api-key"
Create sample routes and clients for testing:
source bin/activate
python scripts/setup_test_data.py
This creates:
- Public route (/api/public)
- API key protected route (/api/protected)
- HMAC protected route (/api/secure)
/api/secure) - Test clients with credentials
- Appropriate permissions
The script outputs curl commands for testing each scenario.
All management scripts support both interactive and command-line modes. Run without arguments for interactive mode.
python scripts/create_route.py
Interactive prompts guide you through:
- Route pattern (e.g., /api/users/*)
- Domain (exact domain, wildcard subdomain, or * for any domain)
- Service name
- HTTP methods to support
- Authentication requirements per method (none, API key, or HMAC)
Example:
Route pattern: /api/users/*
Domain: api.example.com (or * for any domain)
Service: user-service
Methods:
GET: Public (no auth required)
POST: Requires api_key
DELETE: Requires hmac
python scripts/list_routes.py
Displays all configured routes with their patterns, services, and supported methods.
# Interactive mode
python scripts/delete_route.py
# Direct mode with route ID
python scripts/delete_route.py <route_id>
Confirmation required. Cascade deletes associated permissions.
python scripts/create_client.py
Interactive prompts for:
- Client name
- Credential type (API key, shared secret, or both)
- Auto-generates secure credentials or accepts custom values
- Client status (active, suspended, revoked)
Features:
- Uses Python's secrets module for secure credential generation
- Credentials are displayed once - save them securely
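For readers who want to generate a custom credential themselves, the following sketch shows what generation with the secrets module can look like. The function names, the 32-byte length, the choice of URL-safe versus hex encoding, and the trailing "..." in the truncated display are all assumptions for this example; the script's actual formats may differ.

```python
import secrets


def generate_api_key(nbytes=32):
    """Return a URL-safe random API key backed by the OS CSPRNG."""
    return secrets.token_urlsafe(nbytes)


def generate_shared_secret(nbytes=32):
    """Return a hex-encoded random secret suitable as an HMAC key."""
    return secrets.token_hex(nbytes)


def truncate_for_display(credential, visible=8):
    """Show only the first few characters of a credential in listings."""
    return credential[:visible] + "..."


api_key = generate_api_key()
shared_secret = generate_shared_secret()
```

Because the full value is shown only once, anything displayed later would typically go through a helper like truncate_for_display.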
python scripts/list_clients.py
Shows all clients with truncated credentials (first 8 characters) for security.
Output includes:
- Client ID and name
- Credential types (API key and/or shared secret)
- Status (active/suspended/revoked)
# Interactive mode
python scripts/delete_client.py
# Direct mode with client ID
python scripts/delete_client.py <client_id>
Safety features:
- Shows associated permissions before deletion
- Requires typing 'delete' to confirm
- Cascade deletes all client permissions
# Interactive mode - select client and route visually
python scripts/grant_permission.py
# Direct mode - provide IDs
python scripts/grant_permission.py <client_id> <route_id>
Interactive mode walks you through:
- Select a client from list (shows credentials and status)
- Select a route from list (shows service and methods)
- Choose which HTTP methods to allow
- Review and confirm
Features:
- Detects existing permissions and offers to update
- Shows route auth requirements for each method
- Visual indicators for active/inactive clients
# All permissions grouped by client
python scripts/list_permissions.py
# Permissions for a specific client
python scripts/list_permissions.py --client <client_id>
# Permissions for a specific route (who can access it)
python scripts/list_permissions.py --route <route_id>
Output includes:
- Client information (name, credentials, status)
- Route patterns and services
- Allowed HTTP methods
- Permission IDs (for revocation)
By route view shows:
- Route's authentication requirements per method
- All clients with access and their allowed methods
- Client status indicators
# Interactive mode - select from all permissions
python scripts/revoke_permission.py
# By permission ID
python scripts/revoke_permission.py <permission_id>
# By client and route IDs
python scripts/revoke_permission.py <client_id> <route_id>
Safety features:
- Shows full permission details before deletion
- Displays affected client and route
- Requires typing 'revoke' to confirm
Here's a typical setup workflow:
# 1. Create a route for your API
python scripts/create_route.py
# Route: /api/products/*
# Service: product-service
# GET: Public, POST: API Key, DELETE: HMAC
# 2. Create a client
python scripts/create_client.py
# Name: Mobile App v2.1
# Generate API key: yes
# Generate shared secret: yes
# 3. Grant the client permission
python scripts/grant_permission.py
# Select: Mobile App v2.1
# Select: /api/products/*
# Allow: GET, POST (but not DELETE)
# 4. Verify configuration
python scripts/list_permissions.py --client <client_id>
# Shows: Mobile App can GET and POST to /api/products/*
# 5. View who can access a route
python scripts/list_permissions.py --route <route_id>
# Shows: All clients with access to /api/products/*
Once you've created a client and granted permissions, here's how to authenticate API requests.
Send the API key in the Authorization header using the Bearer scheme:
curl -X GET https://api.example.com/api/products \
  -H "Authorization: Bearer your-api-key-here"
Python example:
import requests
headers = {
    "Authorization": "Bearer your-api-key-here"
}
response = requests.get("https://api.example.com/api/products", headers=headers)
JavaScript example:
const response = await fetch("https://api.example.com/api/products", {
  headers: {
    "Authorization": "Bearer your-api-key-here"
  }
});
For HMAC-protected routes, generate a signature using your shared secret:
# Authorization header format:
# HMAC client_id="<id>",timestamp="<epoch>",nonce="<uuid>",signature="<hex>"
Signature computation:
- Create message: {METHOD}\n{PATH}\n{TIMESTAMP}\n{NONCE}\n{BODY}
- Compute HMAC-SHA256 with your shared secret
- Include all components in the Authorization header
See src/auth/request_signer.py for a reference implementation.
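As a hedged illustration of the three steps, the sketch below builds the message, signs it, and formats the Authorization header using only the standard library. It assumes UTF-8 encoding, a timestamp in whole seconds, an upper-cased method, and an empty string for requests without a body; none of these details are confirmed here, so treat src/auth/request_signer.py as authoritative. The function names are hypothetical.

```python
import hashlib
import hmac
import time
import uuid


def build_message(method, path, timestamp, nonce, body=""):
    """Join the signed components with newlines: METHOD, PATH, TIMESTAMP, NONCE, BODY."""
    return "\n".join([method.upper(), path, str(timestamp), nonce, body])


def sign_request(client_id, shared_secret, method, path, body="", timestamp=None, nonce=None):
    """Return an Authorization header value for an HMAC-protected route."""
    timestamp = int(time.time()) if timestamp is None else timestamp
    nonce = nonce or str(uuid.uuid4())  # fresh nonce per request for replay protection
    message = build_message(method, path, timestamp, nonce, body)
    signature = hmac.new(
        shared_secret.encode("utf-8"),
        message.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    return (
        f'HMAC client_id="{client_id}",timestamp="{timestamp}",'
        f'nonce="{nonce}",signature="{signature}"'
    )


header = sign_request("my-client-id", "my-shared-secret", "DELETE", "/api/products/42")
```

The timestamp and nonce parameters are optional so that a caller can pin them in tests; in normal use both are generated per request.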
On successful authentication, the backend receives these headers from nginx:
- X-Client-ID: The authenticated client's UUID
- X-Client-Name: The client's display name
- X-Route-ID: The matched route's UUID
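A backend can read these headers into a small context object. The sketch below is framework-neutral and purely illustrative: the helper name is hypothetical, and headers are modeled as a plain dict with exact-case keys, whereas real frameworks usually offer case-insensitive lookups.

```python
def client_context(headers):
    """Collect the identity headers set after successful authentication."""
    return {
        "client_id": headers.get("X-Client-ID"),
        "client_name": headers.get("X-Client-Name"),
        "route_id": headers.get("X-Route-ID"),
    }


ctx = client_context({
    "X-Client-ID": "example-client-uuid",
    "X-Client-Name": "Mobile App v2.1",
    "X-Route-ID": "example-route-uuid",
})
```

These headers are only trustworthy when the backend is reachable exclusively through nginx; a backend exposed directly could receive forged values.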
Run the test suite:
# Run all tests
source bin/activate
python -m pytest
# Run specific test file
python -m pytest tests/test_flask_app.py
# Run with coverage
python -m pytest --cov=src tests/
# Run with verbose output
python -m pytest -v
Test database:
- All tests use the api_auth_admin_test database
- Automatic cleanup between tests
- 330 tests covering the data-plane (route matching, HMAC, rate limiting, nonce storage, IP resolution) plus the admin-console plane (Aegis bearer-auth decorator, /api/admin/* endpoints, /api/auth/* proxy, /api/webhooks/aegis signature verification, /api/config wire format)
gatekeeper-backend/
├── src/
│ ├── app.py # Flask app factory; wires blueprints, DB, Redis, Aegis
│ ├── monitoring.py # Prometheus metrics
│ ├── rate_limiter.py # Redis-backed rate limiting
│ ├── auth/ # Authentication + authorization
│ │ ├── models.py # AuthResult model
│ │ ├── authorizer.py # Authorization engine (route + permission)
│ │ ├── hmac_handler.py # HMAC signature validation
│ │ ├── api_key_handler.py # API key extraction
│ │ ├── nonce_storage.py # Redis nonce storage for replay protection
│ │ ├── request_signer.py # Test utility for HMAC
│ │ ├── aegis_authenticator.py # Aegis bearer-token introspection + admin lookup
│ │ ├── aegis_tenant.py # Aegis tenant-key client for /api/auth/* proxy
│ │ └── console_admin_decorator.py # @require_console_admin decorator
│ ├── blueprints/ # Flask blueprints
│ │ ├── authz.py # /authz — nginx auth_request endpoint
│ │ ├── health.py # /health
│ │ ├── metrics.py # /metrics
│ │ ├── admin.py # /api/admin/* — clients, routes, permissions
│ │ ├── auth.py # /api/auth/* — public auth proxy to Aegis
│ │ ├── aegis_webhook.py # /api/webhooks/aegis — admin provisioning
│ │ └── config.py # /api/config — runtime config for the frontend
│ ├── database/ # Database layer
│ │ ├── schema.sql # PostgreSQL schema (routes, clients, permissions,
│ │ │ # rate_limits, console_admins)
│ │ └── driver.py # CRUD operations with connection pooling
│ └── utils/ # Utilities
│ ├── db_connection.py # Database connection helper
│ ├── ip_resolver.py # Client IP resolution w/ trusted forwarders
│ └── admin_allowlist.py # AEGIS_ADMIN_EMAILS parsing
├── docs/ # Documentation
│ ├── ARCHITECTURE.md # System architecture
│ ├── DATABASE_SETUP.md # Database setup guide
│ ├── DOCKER_DEPLOYMENT.md # Docker deployment guide
│ ├── HOW_TO_GATEKEEPER_AUTH_NGINX.md # Nginx auth_request integration
│ └── ROADMAP.md # Development roadmap
├── nginx/ # Nginx integration examples
│ ├── auth-example.conf # Example for protected services using auth_request
│ └── example-site.conf # Example for the gatekeeper console host itself
├── scripts/ # Management scripts (see env.example for db setup)
│ ├── create_route.py / list_routes.py / delete_route.py / update_route.py
│ ├── create_client.py / list_clients.py / delete_client.py / show_client.py
│ ├── grant_permission.py / list_permissions.py / revoke_permission.py
│ ├── set_rate_limit.py / list_rate_limits.py
│ └── setup_test_data.py
├── tests/ # Test suite (330 tests)
│ ├── conftest.py
│ ├── test_database_driver.py / test_client_operations.py / test_route_model.py
│ ├── test_authorizer.py / test_auth_handlers.py / test_flask_app.py
│ ├── test_rate_limiter.py / test_nonce_storage.py / test_ip_resolver.py
│ ├── test_admin_clients.py / test_admin_routes.py / test_admin_permissions.py
│ ├── test_admin_allowlist.py / test_console_admin_decorator.py
│ ├── test_aegis_authenticator.py / test_auth_proxy.py
│ ├── test_aegis_webhook.py / test_webhook_verifier.py
│ └── test_runtime_config.py
└── dev_scripts/ # Development utilities
├── setup_database.py # Idempotent: creates user/db, applies schema
├── recreate_database.py # Drops and recreates (destructive — dev only)
├── verify_schema.py # Inspects current schema state
└── setup_production_test_data.py
All configuration is database-driven:
- Add a protected endpoint: Create a route
- Grant access to a client: Create a permission
- Revoke access: Delete the permission or suspend the client
- Change auth requirements: Update the route
No code changes or service restarts required.
- API_AUTH_ADMIN_PG_PASSWORD: Password for application database user
- PORT: Flask server port (default: 7843)
- POSTGRES_HOST: PostgreSQL host (default: localhost)
- POSTGRES_PORT: PostgreSQL port (default: 5432)
- POSTGRES_USER: PostgreSQL superuser (default: postgres)
- PG_PASSWORD: Superuser password (only needed for setup)
- API_AUTH_ADMIN_PG_DB: Database name (default: api_auth_admin)
- API_AUTH_ADMIN_PG_USER: Application user (default: api_auth_admin)
- REDIS_HOST: Redis server for rate limiting (if not set, rate limiting is disabled)
- REDIS_PORT: Redis port (default: 6379)
- REDIS_PASSWORD: Redis password (optional)
- REDIS_DB: Redis database number (default: 0)
- TRUSTED_FORWARDER_IPS: Comma-separated list of upstream server IPs trusted to forward X-Original-Client-IP (default: none)
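The defaults above can be summarized as a small settings loader. This is a sketch for orientation only: the load_settings name and the dictionary keys are hypothetical, and the application's real startup code may read these variables differently.

```python
import os


def load_settings(env=None):
    """Read data-plane settings from the environment, applying the documented defaults."""
    env = os.environ if env is None else env
    redis_host = env.get("REDIS_HOST") or None
    raw_forwarders = env.get("TRUSTED_FORWARDER_IPS", "")
    return {
        "port": int(env.get("PORT", "7843")),
        "postgres_host": env.get("POSTGRES_HOST", "localhost"),
        "postgres_port": int(env.get("POSTGRES_PORT", "5432")),
        "db_name": env.get("API_AUTH_ADMIN_PG_DB", "api_auth_admin"),
        "db_user": env.get("API_AUTH_ADMIN_PG_USER", "api_auth_admin"),
        # Required: raises KeyError when missing, so misconfiguration fails fast.
        "db_password": env["API_AUTH_ADMIN_PG_PASSWORD"],
        "redis_host": redis_host,
        "redis_port": int(env.get("REDIS_PORT", "6379")),
        "redis_db": int(env.get("REDIS_DB", "0")),
        # Rate limiting is disabled when REDIS_HOST is not set.
        "rate_limiting_enabled": redis_host is not None,
        "trusted_forwarder_ips": frozenset(
            ip.strip() for ip in raw_forwarders.split(",") if ip.strip()
        ),
    }


settings = load_settings({"API_AUTH_ADMIN_PG_PASSWORD": "example-password"})
```

Passing a dict instead of relying on os.environ makes the loader easy to exercise in tests without touching the process environment.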
The Aegis variables below are documented in detail in env.example. Without them the data plane (/authz, HMAC, rate limiting) still works, but the admin console returns 503 or 500 from the relevant endpoints.
- AEGIS_API_URL: Aegis base URL (e.g. https://aegis.example.com). Required for /api/admin/*, /api/auth/*, and /api/config.
- AEGIS_SITE_ID, AEGIS_TENANT_API_KEY: Aegis site ID and per-tenant API key. Required for the /api/auth/* public proxy. Treat the key as a secret.
- AEGIS_WEBHOOK_SECRET: HMAC-SHA256 secret that Aegis signs user.verified webhooks with. Required for /api/webhooks/aegis (otherwise every webhook is rejected as unauthorized).
- AEGIS_ADMIN_EMAILS: Comma-separated allowlist of emails authorized to be provisioned as console admins. Default closed: empty means nobody is provisioned.
- AEGIS_AUTH_CACHE_TTL_SECONDS: Positive-cache TTL for resolved bearer tokens (default: 60). Set to 0 to disable caching.
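To show what a positive cache with a TTL means in practice, here is a minimal in-memory sketch. The class name and methods are hypothetical and are not taken from src/auth/aegis_authenticator.py; the sketch only mirrors the documented behavior that successful lookups are cached for the TTL and that a TTL of 0 disables caching.

```python
import time


class PositiveTTLCache:
    """Cache successful token lookups for a limited time; failures are never stored."""

    def __init__(self, ttl_seconds=60, clock=time.monotonic):
        self.ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._entries = {}

    def get(self, token):
        entry = self._entries.get(token)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[token]  # expired: force a fresh lookup
            return None
        return value

    def put(self, token, value):
        if self.ttl <= 0:
            return  # a TTL of 0 disables caching
        self._entries[token] = (value, self._clock() + self.ttl)


now = [0.0]
cache = PositiveTTLCache(ttl_seconds=60, clock=lambda: now[0])
cache.put("example-token", "admin@example.com")
```

Caching only positive results keeps a revoked or invalid token from being remembered as valid beyond one TTL, at the cost of re-checking every failed token.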
The Next.js console fetches these at first paint via /api/config. All defaults are display-only.
- SITE_NAME: Branding string in the console header (default: gatekeeper).
- SITE_DOMAIN: Domain shown in the console UI (default: empty).
- API keys and shared secrets are unique across all clients
- Use auto-generation for cryptographically secure credentials
- Credentials are displayed only at creation time
- Consider implementing periodic credential rotation
- Use suspended status for temporary access revocation
- Use revoked status for permanent termination
- Active clients can be filtered efficiently via database indexes
- Always use HTTPS for API key authentication
- HMAC provides request integrity but HTTPS is still recommended
- Implement rate limiting per client
- Monitor authentication failures
- Connection pooling prevents resource exhaustion
- All credentials are indexed for fast lookup
- Cascade deletes prevent orphaned permissions
- Unix timestamps avoid timezone-related bugs
- Data models go in the api-gatekeeper-models library (separate repo)
- Database operations in src/database/driver.py
- Update src/database/schema.sql for schema changes
- Add tests in tests/
- Run full test suite before committing
The schema is currently applied from schema.sql using CREATE TABLE IF NOT EXISTS. For production, implement proper migrations.
- Follow PEP 8 conventions
- Use type hints for parameters and return types
- Methods should be under 50 lines when possible
- No sys.path hacks - use proper package installation
- Unix timestamps for all date/time storage
Foundation (Complete)
- Route configuration with method-level auth requirements
- Client management with multiple credential types
- Permission system linking clients to routes
- Complete management scripts (CLI)
- Database schema with proper constraints
- Connection pooling and CRUD operations
Phase 1: Authorization Engine (Complete)
- AuthResult model with serialization
- Authorizer class with route matching and permission checking
- Route matching (exact and wildcard with priority rules)
- Public route support (no authentication)
- Client status verification (active/suspended/revoked)
Phase 2: Authentication Handlers (Complete)
- HMAC-SHA256 signature validation (byteforge-hmac integration)
- API key extraction (headers and query parameters)
- DatabaseSecretProvider for HMAC secret lookups
- RequestSigner utility for testing
Phase 3: Flask HTTP Endpoint (Complete)
- Flask application with /authz endpoint
- Nginx auth_request integration
- Health check endpoint (/health)
- Request/response header handling
- Comprehensive test suite (162 tests total)
- Nginx configuration example
- Test data setup utility
Phase 4: Production Readiness (Complete)
- Docker deployment setup
- Gunicorn production configuration
- Monitoring and metrics (Prometheus)
- Structured logging
- Production deployment and testing
Phase 5: Domain-Based Routing (Complete)
- Multi-domain support with domain field in routes
- Domain matching (exact, wildcard subdomain, any domain)
- Domain-aware authorization logic
- X-Original-Host header extraction
- Updated management scripts for domain configuration
- Comprehensive domain matching tests
Phase 6: Rate Limiting (Complete)
- Redis-based rate limiting backend
- Rolling 24-hour window limits
- Per-client request quotas
- Rate limit management scripts
- 429 status code for exceeded limits
- Automatic disabling if Redis not configured
Phase 7: Audit Logging (Complete)
- Comprehensive structured logging via Loki
- Full request context (client_id, client_name, route, domain, method)
- Decision tracking (allowed/denied with reason)
- Client IP and request duration logging
- Queryable audit trail in Grafana
Phase 8: Admin Console (Complete)
- Admin REST API (/api/admin/clients, /api/admin/routes, /api/admin/permissions)
- Web dashboard (gatekeeper-frontend, Next.js)
- Client SDKs (api-gatekeeper-api-js, api-gatekeeper-api-python)
- Aegis-based authentication (bearer-token introspection, webhook-driven admin provisioning, public-auth proxy with per-tenant API key)
- Runtime config endpoint (/api/config) so a single frontend image deploys across tenants
Phase 9: Polish
- Rate-limits dashboard panel (data exists; UI not wired)
- Per-tenant config script for new-host bring-up
- ARCHITECTURE.md - System architecture and design principles
- DATABASE_SETUP.md - Detailed database setup instructions
- DOCKER_DEPLOYMENT.md - Docker deployment guide (standalone and existing stacks)
- ROADMAP.md - Development roadmap with phase breakdown
- HOW_TO_GATEKEEPER_AUTH_NGINX.md - Nginx auth_request integration patterns
- nginx/auth-example.conf - Nginx integration example
- Fork the repository
- Create a feature branch
- Make your changes with tests
- Ensure all tests pass: python -m pytest
- Submit a pull request
This project is licensed under the O'Sassy License - see the LICENSE file for details.
For issues and questions:
- GitHub Issues: [repository-url]/issues
- Documentation: See docs/ARCHITECTURE.md for detailed system design
Status: Phases 1-8 complete. Production-ready data plane (multi-domain routing, rate limiting, HMAC replay protection, trusted forwarder IP resolution, comprehensive audit logging, Prometheus metrics) plus Aegis-backed admin console (admin REST API, web dashboard, client SDKs, runtime-config-driven multi-tenant deploy). 330 tests passing.