As GPU throughput keeps climbing, CPU and I/O paths increasingly become the hidden bottlenecks that drag down AI training and inference efficiency. Fluxon is built to aggressively consolidate the complexity of low-level storage and transport so more of the system budget can be spent on model work instead of data-plane plumbing.
Built on a unified Rust-based storage-and-transport foundation, Fluxon exposes three standardized interfaces that target the core bottlenecks in AI systems:
- KV/RPC (Unified key-value and RPC): Breaks data silos and enables efficient cross-process, cross-node reuse of inference-side KVCache and latent cache
- MQ (Elastic message queue): Decouples system dependencies and supports elastic message transport across heterogeneous resource pools
- FS (S3-compatible file, object, and cache acceleration system): Unifies multi-form storage so one system can cache key-value, file, and object data, while supporting remote access, S3 forwarding, and large-scale cross-cluster migration for AI data and model files
- Foundation Capabilities
- Interface Capabilities
- Benchmark
- Runtime Requirements
- Quick Start
- Repository Structure
- Contributing
- Contributors
- License
- Stargazers over time
- End-to-end Rust: moves connection handling, protocol encoding/decoding, state-machine progression, shared-memory management, and observability collection into Rust hot paths
- Integrated storage and transport: prioritizes the cross-process shared-memory fast path and optimizes storage and transport within one unified data plane
- High-performance inter-node transport: inside the cluster, RDMA is preferred, with automatic TCP fallback, and NICs can be enabled, disabled, and switched dynamically from the GUI
- Automatic inter-node relay: supports automatic relay / forwarding across nodes and sub-clusters, reducing the integration cost of complex network topologies
- Global memory allocation and governance: uniformly manages global memory allocation, object lifecycles, capacity boundaries, and reclamation policies to avoid fragmentation and uncontrolled growth
- Unified role model: master, owner_client, and external_client cooperate in layers, organizing control-plane and data-plane responsibilities into a scalable tree topology while decoupling business service processes from data-plane resource governance and low-level communication paths
- Unified object interface: lets the system organize multi-field objects uniformly, balancing API flexibility, ease of use, and room for low-level optimization
- Tensor-native zero-copy handoff path: better suited for reusing high-frequency tensor objects across caching and transport paths
- Unified observability: uses the Prometheus protocol and Greptime to consolidate metric / trace / log, and includes a built-in GUI for cluster member state, log information, key metrics, and topology
- Shared capabilities across all three interfaces: KV/RPC, MQ, and FS reuse the same caching, transport, lease, capacity-governance, and observability substrate
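
The RDMA-first, TCP-fallback behavior in the list above can be pictured as an ordered preference over the currently enabled NICs. The following is only a minimal sketch of that idea: `Nic`, its fields, and `pick_transport` are hypothetical names chosen for illustration and are not Fluxon's API.

```python
from dataclasses import dataclass


@dataclass
class Nic:
    """Hypothetical description of one network interface."""
    name: str
    supports_rdma: bool
    enabled: bool = True


def pick_transport(nics: list[Nic]) -> tuple[str, str]:
    """Return (transport, nic_name), preferring RDMA and falling back to TCP."""
    usable = [n for n in nics if n.enabled]
    if not usable:
        raise RuntimeError("no enabled NIC available")
    for nic in usable:
        if nic.supports_rdma:
            return ("rdma", nic.name)
    # No enabled RDMA-capable NIC: fall back to TCP on the first usable one.
    return ("tcp", usable[0].name)


nics = [Nic("mlx5_0", supports_rdma=True), Nic("eth0", supports_rdma=False)]
print(pick_transport(nics))  # ('rdma', 'mlx5_0')
nics[0].enabled = False      # simulate disabling the RDMA NIC at runtime
print(pick_transport(nics))  # ('tcp', 'eth0')
```

Re-running the selection after a NIC is toggled is what makes the fallback automatic in this toy model; a real data plane would also have to migrate in-flight connections.
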
Fluxon KV/RPC is designed for world-model inference caches, state sharing, service-to-service calls, and tensor object reuse. In scenarios such as multi-view latent-space prediction, state extrapolation, and prefix-cache reuse, it provides a more general AI data plane rather than a point solution for a single KVCache use case.
- Local cache replicas and eventually consistent read path: prioritizes local fast-path hits while synchronizing metadata asynchronously in the background
- Batched reclamation and hot-object management: advances invalid-object cleanup asynchronously through batch_delete, and combines it with TinyLFU to reuse hot objects more efficiently
- Simultaneous control over L2 and L3 in AI workloads: keeps global data objects indexed, discoverable, and reusable, reducing redundant memory waste from duplicate residency across cache tiers
- KV and RPC synergy: the same parameter organization, caching, and communication foundation serves both state storage and service-to-service calls
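
To make the pairing of TinyLFU with batched deletion more concrete, here is a toy, single-process sketch. It is not Fluxon's implementation: the class name, its methods, and the use of an exact `Counter` in place of TinyLFU's compact approximate frequency sketch are all simplifying assumptions.

```python
from collections import Counter


class TinyLfuCache:
    """Toy cache with TinyLFU-style admission (illustrative only)."""

    def __init__(self, capacity: int, sample_size: int = 1000):
        self.capacity = capacity
        self.sample_size = sample_size
        self.freq = Counter()   # stand-in for an approximate frequency sketch
        self.seen = 0
        self.data = {}

    def _record(self, key):
        self.freq[key] += 1
        self.seen += 1
        if self.seen >= self.sample_size:
            # Aging: halve every count so stale popularity decays over time.
            self.freq = Counter({k: c // 2 for k, c in self.freq.items() if c >= 2})
            self.seen = 0

    def get(self, key):
        self._record(key)
        return self.data.get(key)

    def put(self, key, value) -> bool:
        """Insert key; return False if the admission filter rejects it."""
        self._record(key)
        if key in self.data or len(self.data) < self.capacity:
            self.data[key] = value
            return True
        victim = min(self.data, key=lambda k: self.freq[k])
        if self.freq[key] <= self.freq[victim]:
            return False  # candidate is not hotter than the coldest resident
        del self.data[victim]
        self.data[key] = value
        return True

    def batch_delete(self, keys) -> int:
        """Drop many invalid keys in one pass; return how many were removed."""
        removed = 0
        for key in keys:
            if key in self.data:
                del self.data[key]
                removed += 1
        return removed


cache = TinyLfuCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
for _ in range(3):
    cache.get("a")
    cache.get("b")
print(cache.put("c", 3))                      # False: "c" is colder than both residents
print(cache.batch_delete(["a", "missing"]))   # 1
```

The admission check is the key design point: a new object replaces a resident one only if it has been seen more often, which keeps one-off scans from flushing hot entries.
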
Fluxon MQ is designed for heterogeneous training, data-processing pipelines, and intermediate-state handoff across resource pools. When the producer side and consumer side are split across different machines, different resource pools, or even different sub-clusters, Fluxon MQ consolidates message retention, capacity governance, and cross-cluster placement into one unified messaging layer.
- Lease-based retention semantics: binds message retention to the channel, ensuring data has bounded-time reliable retention before actual consumption
- channel-level prefix statistics and capacity governance: continuously tracks message counts and capacity usage boundaries for scaling and traffic control
- Cross-cluster load-aware placement: uses consumer-side location to decide payload placement, shortening prefetch paths and stabilizing throughput
- Co-designed with KV: message shells and member metadata stay on the control plane, while large payloads stay on the FluxonKV data plane, avoiding a second duplicated large-object transport stack
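
Lease-bound retention and per-channel accounting can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not Fluxon MQ's design: `LeasedChannel`, its methods, and the rule that an expired lease drops all retained messages are hypothetical choices made for the example.

```python
import time
from collections import deque


class LeasedChannel:
    """Toy channel that retains messages only while its lease is valid."""

    def __init__(self, lease_seconds: float, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock
        self.expires_at = clock() + lease_seconds
        self.messages = deque()
        self.bytes_used = 0

    def _reap(self):
        # Once the lease has lapsed, retained messages are no longer guaranteed.
        if self.clock() >= self.expires_at:
            self.messages.clear()
            self.bytes_used = 0

    def renew(self):
        self._reap()
        self.expires_at = self.clock() + self.lease_seconds

    def put(self, payload: bytes):
        self._reap()
        if self.clock() >= self.expires_at:
            raise RuntimeError("lease expired; renew the channel first")
        self.messages.append(payload)
        self.bytes_used += len(payload)

    def get(self):
        self._reap()
        if not self.messages:
            return None
        payload = self.messages.popleft()
        self.bytes_used -= len(payload)
        return payload

    def stats(self) -> dict:
        """Channel-level message count and byte usage."""
        self._reap()
        return {"count": len(self.messages), "bytes": self.bytes_used}


now = [0.0]  # fake clock so the example is deterministic
ch = LeasedChannel(lease_seconds=10, clock=lambda: now[0])
ch.put(b"hello")
ch.put(b"world")
print(ch.stats())  # {'count': 2, 'bytes': 10}
now[0] = 11.0      # the lease lapses without renewal
print(ch.stats())  # {'count': 0, 'bytes': 0}
```
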
Fluxon FS is an S3-compatible file and object cache for AI data and model files. It supports read/write acceleration, remote access, S3 forwarding, cache hits, and large-scale cross-cluster migration. In workloads with high-resolution video, trajectory samples, checkpoints, and other large file objects, these capabilities are unified in one file data plane.
- Unified caching system: directly reuses FluxonKV/RPC caching and communication capabilities, splits files into KeyValue shards, and lets one system support accelerated reads and writes for key-value, file, and object caching
- S3 forwarding access: supports object-storage access and forwarding for AI data and model files
- Transparent Python file semantics: preserves the upper-layer open() / read() / write() experience as much as possible while reducing system-call and cross-process overhead
- Specialized optimization for small-file / large-file reads and writes: optimizes concurrency and transport paths by file granularity and read / write path to improve bandwidth utilization and overall throughput
- Large-scale cross-cluster migration: supports PB-scale data migration and keeps caching, transport, and failure recovery in one unified path
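
The idea of splitting files into key-value shards can be sketched in a few lines. The key layout `<path>#<index>`, the shard size, and both function names below are assumptions made for illustration; the actual Fluxon FS shard format is not described here.

```python
def split_into_shards(path: str, data: bytes, shard_size: int) -> dict[str, bytes]:
    """Map a file's bytes to keys of the form '<path>#<index>' (hypothetical layout)."""
    if shard_size <= 0:
        raise ValueError("shard_size must be positive")
    shards = {}
    for index, offset in enumerate(range(0, len(data), shard_size)):
        shards[f"{path}#{index}"] = data[offset:offset + shard_size]
    return shards


def read_range(shards: dict[str, bytes], path: str, shard_size: int,
               offset: int, length: int) -> bytes:
    """Read [offset, offset + length) by touching only the shards that cover it."""
    out = bytearray()
    end = offset + length
    index = offset // shard_size
    while offset < end:
        shard = shards.get(f"{path}#{index}")
        if shard is None:
            break  # past the end of the file
        start = offset - index * shard_size
        piece = shard[start:start + (end - offset)]
        if not piece:
            break  # offset lies beyond the data in the last shard
        out += piece
        offset += len(piece)
        index += 1
    return bytes(out)


shards = split_into_shards("notes.txt", b"abcdefghij", shard_size=4)
print(sorted(shards))  # ['notes.txt#0', 'notes.txt#1', 'notes.txt#2']
print(read_range(shards, "notes.txt", 4, offset=2, length=5))  # b'cdefg'
```

Because each shard is an ordinary key-value entry, a ranged read only fetches the shards it overlaps, which is one way a file layer can reuse a key-value cache for both small and large files.
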
The benchmark section mainly covers the RPC, KV, and FS data planes, and the related scripts and configurations are primarily under fluxon_test_stack/.
The RPC benchmark mainly shows call latency and throughput across different message sizes and concurrency levels, to observe the stability and tail-latency behavior of the service-to-service call path.
The KV benchmark over TCP shows that Fluxon is significantly ahead of MooncakeStore and Redis on the two read-heavy workloads, Read-affinity and Read-Zipf. For put_only, the main constraint is currently the inflight metadata deduplication path rather than payload transport.
The FS benchmark results show that small-file reads and large-file writes are already significantly ahead of Alluxio, large-file reads are roughly comparable, and small-file writes still have room for further optimization.
MQ currently focuses mainly on scenario problems and data-plane design. The automated runtime entrypoints are test_runner.py and fluxon_test_stack/.
For Quick Start (Docker):
- Docker installed
- The Quick Start image bundles the middleware required by the demo flows
For production deployment or building from source:
- OS: Linux only
- Python: >= 3.10
- Rust: Toolchain pinned to 1.93.0; see fluxon_rs/rust-toolchain.toml
- External middleware:
  - The minimum service plane requires etcd and Greptime
  - FluxonFS features such as directory transfer and pre-scan that persist task state also require TiKV PD and TiKV
- Docker: Required for Quick Start image workflows and runtime packaging workflows
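
Before a source build, the OS and Python requirements above can be checked with a short preflight script. This is an optional, hypothetical helper and not part of the Fluxon repository; it deliberately avoids newer syntax so that it still runs on an interpreter that is too old.

```python
import platform
import sys

MIN_PYTHON = (3, 10)  # taken from the requirement list above


def check_runtime(system, python_version):
    """Return a list of human-readable problems; an empty list means the host looks fine."""
    problems = []
    found = tuple(python_version)[:2]
    if system != "Linux":
        problems.append("unsupported OS: %s (Linux only)" % system)
    if found < MIN_PYTHON:
        problems.append("Python %d.%d found, need >= %d.%d" % (found + MIN_PYTHON))
    return problems


if __name__ == "__main__":
    issues = check_runtime(platform.system(), sys.version_info[:2])
    print("ok" if not issues else "\n".join(issues))
```

The Rust toolchain and the external middleware are not covered by this check.
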
Quick Start is the shortest path to try Fluxon. For formal installation, deployment, and operations, see User Docs.
docker run --rm -it --network host \
hanbaoaaa/fluxon_quick_start:0.2.1 \
--mode kv \
--etcd-client-port 12379 \
--master-p2p-port 31000 \
--panel-port 18080 \
--greptime-http-port 14000 \
--kv-http-port 8083

Once inside, you can type:
put demo:hello world
get demo:hello
del demo:hello
Open the printed link to view the KV Web UI.
docker run --rm -it --network host \
hanbaoaaa/fluxon_quick_start:0.2.1 \
--mode mq \
--etcd-client-port 37379 \
--kv-master-port 34200 \
--greptime-http-port 14000 \
--panel-port 18080

Once inside, you can type:
put hello
put world
exit
The background consumer keeps printing received messages.
Startup also prints the MQ Web UI address.
docker run --rm -it --network host \
hanbaoaaa/fluxon_quick_start:0.2.1 \
--mode fs \
--etcd-client-port 36379 \
--kv-master-port 34100 \
--greptime-http-port 14000 \
--panel-port 34180

Once inside, you can type:
ls
echo "hello fs" > notes.txt
cat notes.txt
ui
FS Quick Start additionally prints:
- fs_s3 endpoint
- Basic Auth entry; the default username / password is admin / admin
Open the printed link to view the FS Web UI.
- fluxon_rs/: Rust core implementation and low-level capabilities
- fluxon_py/: Python interfaces, runtime, and bindings
- deployment/: deployment and operations toolchain
- scripts/: utility scripts and helper entrypoints
- setup_and_pack/: packaging and release resource preparation entrypoints
- examples/fluxon_quick_start/: minimal runnable environment entrypoint
- fluxon_test_stack/: test stack, benchmarks, and gitops entrypoint
Contributions are welcome. Before you start, please read the developer docs on GitHub Pages:
- Developer Docs
- Developer - 1 - Package core install artifacts
- Developer - 2 - Package middleware and images
- Developer - 4 - Publish a release
Some earlier contribution records are no longer fully reflected in the current commit history. Historical highlights:
- yxrxy: FluxonFS implementation and optimization
- zTz01: KVCache optimization
- pakkah: RDMA support, VLM exploration
- unity1263: KV shared-memory design integration, benchmark toolchain
- mumupika: Initial MQ implementation
- maplestarplayl: IPC integration, SPDK integration
- RuileLu: KV lease support
- Summage: Initial KV architecture optimization
Fluxon is open-sourced under the Apache License 2.0; see LICENSE.





















