Skip to content

fix(apijson): avoid O(n²) array marshaling for large slices#4316

Open
Tonyqu123 wants to merge 1 commit into
cloudflare:mainfrom
Tonyqu123:fix/array-encoder-quadratic
Open

fix(apijson): avoid O(n²) array marshaling for large slices#4316
Tonyqu123 wants to merge 1 commit into
cloudflare:mainfrom
Tonyqu123:fix/array-encoder-quadratic

Conversation

@Tonyqu123

Copy link
Copy Markdown

Summary

encoder.newArrayTypeEncoder constructs JSON arrays by repeatedly calling sjson.SetRawBytes(json, "-1", item) inside a loop. sjson is designed for sparse field updates on existing JSON; appending element-by-element forces it to re-parse and reallocate the entire buffer per call, giving the array encoder O(n²) time and O(n²) memory.

This patch replaces the sjson loop with a single-pass bytes.Buffer concatenation, restoring O(n).

Fixes #4315.

Impact

We hit this in production while syncing ~30,000 IPs to a Cloudflare account-level IP list via Rules.Lists.Items.Update. A single Update call allocated ~6.2 GB and ran for ~4 seconds, OOM-killing our 1 GiB controller.

Benchmarks on Marshal([]string{...}) (Apple M4 Pro, go1.26):

N before B/op after B/op mem before ns/op after ns/op speed
100 92 KB 8 KB 11x 138 µs 24 µs 5.9x
1,000 6.9 MB 66 KB 105x 7.1 ms 0.21 ms 33x
10,000 692 MB 845 KB 819x 437 ms 1.18 ms 369x
30,000 6.2 GB 2.0 MB 3,070x 4,091 ms 3.10 ms 1320x

1k → 30k scaling: before grew ns/op 580x, B/op 893x — clear O(n²). After the fix, scaling is linear.

Compatibility

Output is byte-identical to the previous implementation (verified against encoding/json for n = 0, 1, 2, 100). All existing tests in internal/apijson pass.

Tests

Added internal/apijson/array_bench_test.go with a correctness check and benchmarks at N = 100, 1k, 5k, 10k, 30k as a regression guard. Happy to drop the benchmark file if you'd prefer to keep the test directory generated-only.

Notes

The same sjson.SetRawBytes pattern is used by encodeMapEntries (encoder.go:339) and newStructTypeEncoder (encoder.go:299) — also technically O(n²), but typically don't hit the pathological case because struct field counts and map sizes stay small. Out of scope for this PR; happy to address in a follow-up if useful.

newArrayTypeEncoder built JSON arrays by repeatedly calling
sjson.SetRawBytes(json, "-1", item) inside a loop. sjson is designed for
sparse field updates on existing JSON; appending element-by-element forces
it to re-parse and reallocate the entire buffer per call, giving the array
encoder O(n²) time and O(n²) memory.

Replaced with a single-pass bytes.Buffer concatenation, restoring O(n).

Output is byte-identical to the previous implementation. Added a benchmark
in internal/apijson/array_bench_test.go covering N = 100..30,000 as a
regression guard.

Benchmarks (Apple M4 Pro, go1.26):

  N       before B/op   after B/op   mem      before ns/op   after ns/op   speed
  100     92 KB         8 KB         11x      138 µs         24 µs         5.9x
  1,000   6.9 MB        66 KB        105x     7.1 ms         0.21 ms       33x
  10,000  692 MB        845 KB       819x     437 ms         1.18 ms       369x
  30,000  6.2 GB        2.0 MB       3,070x   4,091 ms       3.10 ms       1320x

1k → 30k scaling: ns/op grows 580x, B/op grows 893x — clear O(n²).

Real-world impact: an SAP BTP controller syncing ~30,000 IPs to a Cloudflare
account-level IP list via Rules.Lists.Items.Update was OOM-killed because a
single Update call allocated ~6.2 GB.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

O(n²) memory allocation when marshaling large slices via Marshal()

2 participants