Delta sync #4284

Closed
ashleychandy wants to merge 10 commits into cowprotocol:main from ashleychandy:delta-sync

@ashleychandy
Contributor

Description

Implements auction delta sync to reduce solver payload size by sending incremental updates instead of full orderbook snapshots. This lays the foundation for Unified Auction by introducing structured change events and a driver-side replica that can consume deltas while still serving full auctions to solvers.

Changes

  • Refactor solvable orders caching to emit structured change events used internally by autopilot.
  • Add delta-sync infrastructure and replica logic in the driver, plus metrics and health/readiness tweaks.
  • Extend solver/driver DTOs and headers to support delta sync protocol versions.
  • Add delta-sync e2e coverage and integration wiring.
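
As a rough illustration of the replica idea described above, the sketch below applies versioned change events to an in-memory order map and can still produce a full snapshot afterwards. Every name here (`Order`, `Change`, `Delta`, `Replica`, `ApplyError`, the `from_version` field) is hypothetical and chosen for illustration; the actual DTOs and versioning scheme in this PR are not shown in the description and may differ.

```rust
use std::collections::HashMap;

/// Hypothetical order identifier; the real type is not shown in this PR text.
type OrderUid = u64;

/// Hypothetical, heavily simplified order payload.
#[derive(Clone, Debug, PartialEq)]
struct Order {
    uid: OrderUid,
    sell_amount: u128,
}

/// One structured change event (variant names are assumptions).
#[derive(Clone, Debug)]
enum Change {
    Added(Order),
    Updated(Order),
    Removed(OrderUid),
}

/// A batch of changes that moves a replica from `from_version` to `from_version + 1`.
#[derive(Clone, Debug)]
struct Delta {
    from_version: u64,
    changes: Vec<Change>,
}

#[derive(Debug, PartialEq)]
enum ApplyError {
    /// The delta does not start at the replica's current version,
    /// so the caller would need to fall back to a full resync.
    VersionGap { expected: u64, got: u64 },
}

/// Driver-side copy of the solvable orders, kept up to date by deltas.
#[derive(Default)]
struct Replica {
    version: u64,
    orders: HashMap<OrderUid, Order>,
}

impl Replica {
    /// Applies one delta, rejecting it if it was built against another version.
    fn apply(&mut self, delta: &Delta) -> Result<(), ApplyError> {
        if delta.from_version != self.version {
            return Err(ApplyError::VersionGap {
                expected: self.version,
                got: delta.from_version,
            });
        }
        for change in &delta.changes {
            match change {
                Change::Added(order) | Change::Updated(order) => {
                    self.orders.insert(order.uid, order.clone());
                }
                Change::Removed(uid) => {
                    self.orders.remove(uid);
                }
            }
        }
        self.version += 1;
        Ok(())
    }

    /// Rebuilds a full snapshot from the replica, sorted by uid for determinism.
    fn snapshot(&self) -> Vec<Order> {
        let mut orders: Vec<Order> = self.orders.values().cloned().collect();
        orders.sort_by_key(|order| order.uid);
        orders
    }
}
```

Rejecting a delta whose `from_version` does not match is one simple way to detect a missed update; whether the PR uses sequence numbers, hashes, or something else for this is not stated here.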

How to Test

cargo nextest run -p e2e local_node_delta_sync_update_pipeline_integration \
  --run-ignored ignored-only \
  --test-threads 1 \
  --failure-output final

Fixes #4207

Resolved conflicts in:
- crates/autopilot/src/infra/api.rs
- crates/autopilot/src/run.rs
- crates/configs/src/autopilot/solver.rs
- crates/driver/Cargo.toml
- crates/e2e/src/setup/services.rs
- crates/shared/src/lib.rs

Kept main's simpler approach for autopilot configuration while integrating
feature branch changes for delta sync API and thin solve request support.
@ashleychandy ashleychandy requested a review from a team as a code owner March 22, 2026 23:25
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces delta sync functionality to reduce solver payload size and lays the foundation for Unified Auction. It refactors solvable orders caching to emit structured change events, adds delta-sync infrastructure and replica logic in the driver, extends DTOs and headers for protocol versions, and includes end-to-end coverage. The changes are comprehensive and well-tested, addressing various aspects of incremental updates and thin solve requests. The severity of the comments has been adjusted to medium.

} else {
None
};
Metrics::solve_request_body_size(full_request.body_size());

medium

The solve_request_body_size metric currently only records the size of the full_request. If a thin_request is selected and sent to a driver, this metric will be inaccurate, as it will still reflect the larger full request size. This can lead to misleading performance monitoring data regarding the actual payload sizes being sent.

To ensure accurate metrics, please record the body_size() of the request variable after it has been selected by Self::select_solve_request_for_driver.

        let selected_request = Self::select_solve_request_for_driver(
            driver.supports_thin_solve_request,
            full_request,
            thin_request,
        );
        Metrics::solve_request_body_size(selected_request.body_size());
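
To make the suggestion concrete outside the autopilot codebase, here is a self-contained sketch of the same idea: select first, then measure what was selected. `SolveRequest` and `select_and_measure` are hypothetical stand-ins, and the selection rule (use the thin request only when the driver supports it and one was built) is an assumption about what `select_solve_request_for_driver` does.

```rust
/// Hypothetical stand-in for a serialized solve request.
#[derive(Clone, Debug, PartialEq)]
struct SolveRequest {
    body: Vec<u8>,
}

impl SolveRequest {
    /// Size in bytes of the serialized body.
    fn body_size(&self) -> usize {
        self.body.len()
    }
}

/// Assumed selection rule: prefer the thin request only if the driver
/// supports it and a thin request is available; otherwise use the full one.
fn select_solve_request_for_driver(
    supports_thin_solve_request: bool,
    full_request: SolveRequest,
    thin_request: Option<SolveRequest>,
) -> SolveRequest {
    match (supports_thin_solve_request, thin_request) {
        (true, Some(thin)) => thin,
        _ => full_request,
    }
}

/// Returns the request that will be sent together with the size that
/// should be reported to the metric, so the two can never disagree.
fn select_and_measure(
    supports_thin_solve_request: bool,
    full_request: SolveRequest,
    thin_request: Option<SolveRequest>,
) -> (SolveRequest, usize) {
    let selected = select_solve_request_for_driver(
        supports_thin_solve_request,
        full_request,
        thin_request,
    );
    let size = selected.body_size();
    (selected, size)
}
```

Measuring after selection means a driver that receives a thin request contributes the thin size to the metric, which is the behavior the comment asks for.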

Comment on lines +1027 to +1049
#[test]
fn delta_sync_resync_retry_values_are_cached() {
    static LOCK: std::sync::Mutex<()> = std::sync::Mutex::new(());
    let _lock = LOCK.lock().expect("lock poisoned");

    let attempts_expected = DELTA_SYNC_RESYNC_RETRY_ATTEMPTS.get().copied().unwrap_or(7);
    let delay_expected = DELTA_SYNC_RESYNC_RETRY_DELAY
        .get()
        .copied()
        .unwrap_or_else(|| Duration::from_millis(123));

    let _ = DELTA_SYNC_RESYNC_RETRY_ATTEMPTS.set(7);
    let _ = DELTA_SYNC_RESYNC_RETRY_DELAY.set(Duration::from_millis(123));

    assert_eq!(delta_sync_resync_retry_attempts(), attempts_expected);
    assert_eq!(delta_sync_resync_retry_delay(), delay_expected);

    let _ = DELTA_SYNC_RESYNC_RETRY_ATTEMPTS.set(9);
    let _ = DELTA_SYNC_RESYNC_RETRY_DELAY.set(Duration::from_millis(999));

    assert_eq!(delta_sync_resync_retry_attempts(), attempts_expected);
    assert_eq!(delta_sync_resync_retry_delay(), delay_expected);
}

medium

The test delta_sync_resync_retry_values_are_cached attempts to test the caching behavior of OnceLock static variables. However, OnceLocks can only be set once. If this test is run as part of a larger suite, or if the OnceLocks are initialized by another test or previous execution, subsequent set calls within this test will fail silently (return Err). This can lead to flaky test results or incorrect assertions.

To ensure reliable testing, please ensure that the OnceLocks are in an uninitialized state before running this test, perhaps by using a test-specific setup/teardown or a dedicated test guard that resets static state if possible, or by only asserting the get() value without attempting to set() if the OnceLock is expected to be initialized elsewhere.
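
The set-once behavior can be seen in isolation with a local `OnceLock`, which also hints at one way to keep such a test independent of global state: pass the cell in rather than reading a `static`. The helper names `set_twice` and `cached_or_init` are hypothetical and not part of the PR; only `OnceLock::set`, `OnceLock::get`, and `OnceLock::get_or_init` are standard library calls.

```rust
use std::sync::OnceLock;

/// Shows that only the first `set` succeeds: the second call returns
/// `Err` carrying the rejected value, and the stored value is unchanged.
fn set_twice() -> (Result<(), u32>, Result<(), u32>, Option<u32>) {
    let cell = OnceLock::new();
    let first = cell.set(7);
    let second = cell.set(9);
    (first, second, cell.get().copied())
}

/// Reads a cached value from `cell`, running `init` only on the first call.
/// Taking the cell as a parameter lets a test use a fresh, local `OnceLock`
/// instead of a process-wide static shared with other tests.
fn cached_or_init<T: Copy>(cell: &OnceLock<T>, init: impl FnOnce() -> T) -> T {
    *cell.get_or_init(init)
}
```

With a helper shaped like `cached_or_init`, a test can create its own cell, call it twice with different initializers, and assert that the first value wins, without depending on test ordering.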

@jmg-duarte
Contributor

Closing this issue as it violates the contribution guide.

CONTRIBUTING.md:

Any non-trivial documentation must be first discussed with the maintainers in an issue (see Opening an issue). Only very minor changes are accepted without prior discussion.

@jmg-duarte jmg-duarte closed this Mar 24, 2026
@github-actions github-actions bot locked and limited conversation to collaborators Mar 24, 2026

Development

Successfully merging this pull request may close these issues.

tracker: Auction delta sync
