
perf: improve text checkout scalability #956

Open
lodyai[bot] wants to merge 8 commits into main from feat/scale-text-checkout-perf

lodyai Bot (Contributor) commented Apr 22, 2026

Summary

  • Add a text checkout benchmark with profile counters for frontier preparation, VV conversion, diff calculation, richtext tracker work, state apply, event emit, and future-sibling scanning.
  • Reduce per-change VersionVector work by passing a lightweight causal version view through diff calculation and teaching the richtext tracker to consume it directly.
  • Use a forward diff calculator for comparable checkout-to-latest paths so linear/import-greater updates avoid persistent CRDT checkout tracking.
  • Add plain-text no-style apply fast paths and conservative same-deps / same-parent fast paths for high-collaboration text checkout.
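
To illustrate the "comparable checkout-to-latest" idea from the summary, here is a minimal sketch of how a checkout could pick a forward diff when the target version is causally ahead of the current one. The `Vv` alias, `compare_vv`, `DiffStrategy`, and `choose_strategy` are hypothetical names for this example and are not taken from loro-internal; the PR's real conditions are not shown on this page and may be stricter.

```rust
use std::cmp::Ordering;
use std::collections::HashMap;

/// Hypothetical stand-in for a version vector: peer id -> number of ops seen.
type Vv = HashMap<u64, u32>;

/// Partial order on version vectors.
/// Returns `None` when the two versions are concurrent (not comparable).
fn compare_vv(a: &Vv, b: &Vv) -> Option<Ordering> {
    let mut a_ahead = false;
    let mut b_ahead = false;
    for peer in a.keys().chain(b.keys()) {
        let ca = a.get(peer).copied().unwrap_or(0);
        let cb = b.get(peer).copied().unwrap_or(0);
        if ca > cb {
            a_ahead = true;
        }
        if cb > ca {
            b_ahead = true;
        }
    }
    match (a_ahead, b_ahead) {
        (false, false) => Some(Ordering::Equal),
        (true, false) => Some(Ordering::Greater),
        (false, true) => Some(Ordering::Less),
        (true, true) => None,
    }
}

/// Hypothetical strategy names, used only in this sketch.
#[derive(Debug, PartialEq, Eq)]
enum DiffStrategy {
    /// Nothing to do: both versions are identical.
    Noop,
    /// Target is strictly ahead: replay only the missing ops forward.
    Forward,
    /// Target is behind or concurrent: fall back to tracker-based checkout.
    TrackerCheckout,
}

fn choose_strategy(from: &Vv, to: &Vv) -> DiffStrategy {
    match compare_vv(from, to) {
        Some(Ordering::Equal) => DiffStrategy::Noop,
        Some(Ordering::Less) => DiffStrategy::Forward,
        _ => DiffStrategy::TrackerCheckout,
    }
}
```

In this sketch only a strictly newer target takes the forward path; everything else keeps the conservative fallback, which mirrors the PR's stated goal of avoiding checkout tracking for linear or import-greater updates.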

Benchmarks

  • 1000 peer wide-causal checkout: about 5.13ms -> 1.61ms, diff_calc about 4.90ms -> 1.39ms, tracker checkout about 3.47ms -> 37.6us.
  • 300 peer same-position checkout: about 4.93ms -> 1.78ms, frontier_prepare about 3.04ms -> 37.8us.
  • 1000 peer same-position checkout after frontier fast path: about 16.6ms. Future-sibling scan profile after same-parent fast path: about 1.83ms -> 575us.
  • checkout-to-latest linear smoke: about 65us with richtext tracker checkout/diff calls at 0.
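
For readers who prefer ratios, the before/after figures above work out to roughly 3.2x (wide-causal total), 3.5x (diff_calc), 92x (tracker checkout), 2.8x (300 peer same-position total), 80x (frontier_prepare), and 3.2x (future-sibling scan). The small helper below reproduces that arithmetic; `speedup` and `REPORTED_US` are names made up for this example, and the values are the rounded "about" numbers from the list, converted to microseconds.

```rust
/// Speedup factor: how many times faster `after` is than `before`.
/// Both arguments must use the same unit (microseconds here).
fn speedup(before_us: f64, after_us: f64) -> f64 {
    before_us / after_us
}

/// (label, before in us, after in us), copied from the rounded figures above.
const REPORTED_US: [(&str, f64, f64); 6] = [
    ("wide-causal total", 5130.0, 1610.0),
    ("wide-causal diff_calc", 4900.0, 1390.0),
    ("wide-causal tracker checkout", 3470.0, 37.6),
    ("same-position total (300 peers)", 4930.0, 1780.0),
    ("same-position frontier_prepare", 3040.0, 37.8),
    ("future-sibling scan", 1830.0, 575.0),
];

/// Formats one line per reported measurement, e.g. for printing.
fn speedup_report() -> Vec<String> {
    REPORTED_US
        .iter()
        .map(|(label, before, after)| format!("{label}: {:.1}x", speedup(*before, *after)))
        .collect()
}
```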

Validation

  • cargo check -p loro-internal --features test_utils --bench text_checkout
  • cargo test -p loro-internal crdt_rope::test --features test_utils
  • cargo test -p loro-internal checkout --features test_utils
  • cargo test -p loro-internal richtext --features test_utils
  • cargo test -p loro-internal import --features test_utils
  • cargo check -p loro-internal
  • cargo test -p fuzz random_fuzz_1s -- --nocapture
  • cargo test -p fuzz update_text -- --nocapture
  • git diff --check

Notes

  • The local tracking file TMP_PLAN.md is intentionally not included in this PR.
  • Long-running libFuzzer targets were not run.

lodyai Bot (Contributor, Author) commented Apr 22, 2026

Benchmark / validation update after the TMP_PLAN cleanup.

Fixed-size checkout bench command shape:

LORO_TEXT_CHECKOUT_PROFILE=1 LORO_TEXT_CHECKOUT_PEERS=1000 LORO_TEXT_CHECKOUT_BASE_LEN=1024 LORO_TEXT_CHECKOUT_CHANGES=1000 cargo bench -p loro-internal --features test_utils --bench text_checkout -- <case> --warm-up-time 0.05 --measurement-time 0.1 --sample-size 10
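
The command above sizes the benchmark through environment variables. As a rough sketch of how a bench harness might read such variables, the code below parses them with fallbacks. The variable names come from the command; `BenchConfig`, `parse_or`, and the default values are assumptions for illustration and are not taken from the actual `text_checkout` bench.

```rust
use std::env;
use std::str::FromStr;

/// Parses `raw` if present and valid, otherwise returns `default`.
fn parse_or<T: FromStr>(raw: Option<&str>, default: T) -> T {
    raw.and_then(|s| s.trim().parse().ok()).unwrap_or(default)
}

/// Hypothetical benchmark configuration; field meanings are inferred
/// from the variable names only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BenchConfig {
    profile: bool,
    peers: usize,
    base_len: usize,
    changes: usize,
}

impl BenchConfig {
    /// Reads the configuration from the process environment.
    /// The fallback values (100, 1024, 100) are placeholders, not the
    /// defaults used by the real bench.
    fn from_env() -> Self {
        let get = |name: &str| env::var(name).ok();
        BenchConfig {
            profile: get("LORO_TEXT_CHECKOUT_PROFILE").as_deref() == Some("1"),
            peers: parse_or(get("LORO_TEXT_CHECKOUT_PEERS").as_deref(), 100),
            base_len: parse_or(get("LORO_TEXT_CHECKOUT_BASE_LEN").as_deref(), 1024),
            changes: parse_or(get("LORO_TEXT_CHECKOUT_CHANGES").as_deref(), 100),
        }
    }
}
```

Falling back on a parse error rather than panicking keeps a mistyped variable from aborting a long bench run, at the cost of silently using the default; a real harness might prefer to fail loudly.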

Results:

| case | criterion median-ish range | avg_total | main profile notes |
| --- | --- | --- | --- |
| plain/wide-causal-peer-checkout/1001 | 1.37-1.67 ms | 1.46 ms | diff_calc 1.27 ms; richtext_tracker_checkout 33 us; future_scan 0 |
| plain/same-position-peer-checkout/1001 | 12.0-16.9 ms | 15.89 ms | diff_calc 15.15 ms; future_scan 0.57 ms; avg_future_scan_visited 383, max 999 |
| code/checkout-to-latest-linear/1001 | 515-524 us | 521 us | diff_calc 358 us; state_apply 160 us; richtext tracker 0 |
| rich/overlap-mark-peer-checkout/subscribed/1001 | 11.2-17.1 ms | 15.20 ms | state_apply 7.66 ms; diff_calc 6.63 ms; emit_events 72 us; future_scan 31 us |

Phase 5 cost note: after the same-parent fast path, future sibling scanning is no longer the dominant cost in the measured 1000-peer cases. It is still visible in the worst same-position case (~0.57 ms, about 3-4% of total), but the remaining slow path is mostly replay/diff_calc. For subscribed rich overlap marks, future_scan is negligible (~31 us), while state apply and diff calc dominate.
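
The percentages in the cost note follow directly from the reported averages: 0.57 ms out of 15.89 ms is about 3.6%, 31 us out of 15.20 ms is about 0.2%, and state_apply plus diff_calc (7.66 ms + 6.63 ms) account for about 94% of the subscribed rich case. A small helper to check this, with `share_percent` being a name invented for this example:

```rust
/// Share of `part` in `total`, as a percentage.
/// Both arguments must use the same unit (milliseconds here).
fn share_percent(part_ms: f64, total_ms: f64) -> f64 {
    part_ms / total_ms * 100.0
}

/// Future-sibling scan share in the same-position case (values from the results).
fn same_position_future_scan_share() -> f64 {
    share_percent(0.57, 15.89)
}

/// Combined state_apply + diff_calc share in the subscribed rich overlap case.
fn rich_overlap_dominant_share() -> f64 {
    share_percent(7.66 + 6.63, 15.20)
}
```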

Additional validation now passed:

cargo fuzz run text-update -- -max_total_time=60
cargo fuzz run all -- -max_total_time=60

Both fuzz targets completed their 60s runs without crashes. The fuzz lockfile refresh was discarded because it was generated by the local run and is unrelated to this PR.

lodyai Bot (Contributor, Author) commented Apr 22, 2026

Follow-up phase 4 update after continuing the 1-4 plan.

Added commit 377afd4e (perf: batch rich text style event deltas):

  • Batch adjacent, non-overlapping retain-only rich text style event deltas before composing them.
  • Preserve the original compose order by flushing when a delta overlaps or is not retain-only.
  • This keeps the optimization local to event conversion and does not add checkout/cache state.
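
The batching rule described above can be sketched as follows. This is a simplified model, not the code from commit 377afd4e: `EventDelta`, `ComposeUnit`, and `batch_style_deltas` are hypothetical, "adjacent" is read as consecutive in the delta sequence, and the sketch additionally assumes ranges arrive in ascending order, flushing whenever a range starts before the previous one ends.

```rust
/// Hypothetical, simplified event delta; not the types used in loro-internal.
#[derive(Debug, Clone, PartialEq, Eq)]
enum EventDelta {
    /// Retain-only delta that applies `style` to the range `start..end`.
    Style { start: usize, end: usize, style: &'static str },
    /// Any delta that inserts or deletes text (stand-in payload).
    Edit(&'static str),
}

/// One unit handed to the compose step (which is not shown here).
#[derive(Debug, Clone, PartialEq, Eq)]
enum ComposeUnit {
    /// Several non-overlapping style ranges composed in a single pass.
    StyleBatch(Vec<(usize, usize, &'static str)>),
    Edit(&'static str),
}

/// Groups consecutive, non-overlapping style deltas into batches while
/// preserving the original order of everything else.
fn batch_style_deltas(deltas: &[EventDelta]) -> Vec<ComposeUnit> {
    let mut out = Vec::new();
    let mut pending: Vec<(usize, usize, &'static str)> = Vec::new();
    for delta in deltas {
        match delta {
            EventDelta::Style { start, end, style } => {
                // A range that starts before the previous one ends overlaps
                // (or goes backwards), so flush to keep compose order intact.
                let overlaps = pending
                    .last()
                    .map_or(false, |&(_, last_end, _)| *start < last_end);
                if overlaps {
                    out.push(ComposeUnit::StyleBatch(std::mem::take(&mut pending)));
                }
                pending.push((*start, *end, *style));
            }
            EventDelta::Edit(payload) => {
                // Not retain-only: flush the batch first, then pass it through.
                if !pending.is_empty() {
                    out.push(ComposeUnit::StyleBatch(std::mem::take(&mut pending)));
                }
                out.push(ComposeUnit::Edit(*payload));
            }
        }
    }
    if !pending.is_empty() {
        out.push(ComposeUnit::StyleBatch(pending));
    }
    out
}
```

Because the batch is a local buffer inside the conversion function, nothing outlives the call, which matches the note that the optimization adds no checkout or cache state.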

Validation passed after this commit:

cargo check -p loro-internal --features test_utils
cargo test -p loro-internal richtext --features test_utils
cargo test -p loro-internal checkout --features test_utils
cargo test -p loro-internal import --features test_utils
cargo test -p fuzz random_fuzz_1s -- --nocapture
git diff --check

Affected bench rerun:

LORO_TEXT_CHECKOUT_PROFILE=1 LORO_TEXT_CHECKOUT_PEERS=1000 LORO_TEXT_CHECKOUT_BASE_LEN=1024 LORO_TEXT_CHECKOUT_CHANGES=1000 cargo bench -p loro-internal --features test_utils --bench text_checkout -- rich/overlap-mark-peer-checkout/subscribed --warm-up-time 0.05 --measurement-time 0.1 --sample-size 10

Result: Criterion range 8.7449 ms - 13.709 ms, avg_total=12.692376ms, avg_state_apply=6.503316ms, avg_diff_calc=5.456606ms, avg_emit_events=50.446us, avg_richtext_insert_future_scan=27.024us. The sample is noisy (p = 0.17, no statistically significant change), but the profile is directionally better than the previous run (avg_total=15.199131ms, avg_state_apply=7.66148ms).

zxch3n (Member) commented Apr 22, 2026

Fuzz validation for the text checkout performance PR:

  • cargo fuzz run text-update -- -max_total_time=1200
    • Passed: 317,498 runs in 1201s
    • Final: cov: 4634, ft: 18712, corp: 1220/255Kb, rss: 507Mb
  • cargo fuzz run local_events -- -max_total_time=1200
    • Passed: 510,962 runs in 1201s
    • Final: cov: 15694, ft: 59080, corp: 2001/428Kb, rss: 566Mb
  • cargo fuzz run all -- -max_total_time=1200
    • Passed: 52,862 runs in 1201s
    • Final: cov: 26017, ft: 74208, corp: 625/22Kb, rss: 471Mb

No crashes, panics, sanitizer failures, or reproducer artifacts were reported. The workspace was clean after reverting the fuzz-generated refresh of crates/fuzz/fuzz/Cargo.lock.

zxch3n marked this pull request as ready for review on April 26, 2026, 16:39
…out-perf

# Conflicts:
#	crates/loro-internal/src/loro.rs
#	crates/loro-internal/src/oplog.rs

github-actions (Contributor)

WASM Size Report

  • Original size: 3087.93 KB
  • Gzipped size: 988.76 KB
  • Brotli size: 689.40 KB
