
feat: Write transaction fragments directly to storage to reduce consumer memory footprint#3783

Merged
alco merged 33 commits into main from alco/write-txn-fragments-to-storage
Feb 26, 2026

Conversation


@alco alco commented Jan 27, 2026

Closes #3415

Summary

For shapes without subquery dependencies (the vast majority), the consumer now streams each transaction fragment directly to storage as it arrives, instead of buffering the entire transaction in memory. This significantly reduces per-shape-consumer memory usage for large transactions.

Shapes with subquery dependencies continue to buffer the full transaction (write_unit=txn) because they need the complete change set for move-in/move-out reasoning.

How it works

The consumer operates in one of two modes, determined at shape creation time:

  • write_unit=txn_fragment (default): Each fragment is filtered, converted to log items, and written to storage immediately. A lightweight PendingTxn struct tracks only metadata (xid, offset, byte count, change count) across fragments. When the final fragment with a commit arrives, storage is notified to advance its transaction boundary.

  • write_unit=txn (shapes with subquery dependencies): Fragments are accumulated into a complete transaction before writing, as before.

Key additions

Consumer.PendingTxn — tracks in-progress transaction metadata during fragment streaming. It is the lightweight counterpart to TransactionBuilder: where TransactionBuilder accumulates all changes in memory, PendingTxn only tracks offsets and counters while actual data is written to storage and discarded immediately.
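
As an illustration of how lightweight this tracking is, a PendingTxn-style struct might look like the following (a sketch only; field and function names are assumptions based on the description above, not the actual implementation):

```elixir
defmodule PendingTxnSketch do
  @moduledoc "Illustrative sketch; fields are assumptions based on the PR description."
  defstruct xid: nil, last_log_offset: nil, byte_count: 0, num_changes: 0

  # Fold one written fragment's stats into the running metadata.
  # The fragment's change data itself is written to storage and discarded.
  def track_fragment(%__MODULE__{} = txn, offset, bytes_written, changes_written) do
    %{
      txn
      | last_log_offset: offset,
        byte_count: txn.byte_count + bytes_written,
        num_changes: txn.num_changes + changes_written
    }
  end
end
```

The point of the sketch is that the struct holds only a handful of integers and an offset, regardless of how large the transaction is.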

New storage callbacks — append_fragment_to_log!/2 writes log items without advancing the persisted transaction offset (it can be called multiple times per transaction), and signal_txn_commit!/2 marks the transaction as complete so crash recovery sees the correct committed boundary. The existing full-transaction write path is refactored to share the same underlying logic.
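
The described callbacks would have signatures along these lines (a hedged sketch inferred from the callback names; the actual typespecs in the Storage behaviour may differ):

```elixir
# Sketch of the new Storage behaviour additions; argument types are assumptions.
@callback append_fragment_to_log!(log_items :: Enumerable.t(), storage_opts :: term()) :: :ok
@callback signal_txn_commit!(xid :: non_neg_integer(), storage_opts :: term()) :: :ok
```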

Support.StorageTracer — test helper using Erlang trace sessions to observe which storage functions are called from a process, enabling non-intrusive assertions on the write path taken without wrapper modules or function patching.
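
A rough sketch of how such a tracer can be built on OTP 27's trace sessions (the real Support.StorageTracer will differ in detail; the session name and traced module below are assumptions):

```elixir
# Sketch only: trace Storage calls made by a consumer process using an
# isolated trace session (:trace module, OTP 27+), without patching code.
session = :trace.session_create(:storage_tracer, self(), [])
:trace.process(session, consumer_pid, true, [:call])
:trace.function(session, {Electric.ShapeCache.Storage, :_, :_}, true, [])

# Traced calls arrive at the tracer as messages of the form
#   {:trace, pid, :call, {Electric.ShapeCache.Storage, fun, args}}
# which a test can collect with assert_receive before cleaning up:
:trace.session_destroy(session)
```

Because the session is isolated, this doesn't interfere with other tracing in the test run, which is what makes the assertions non-intrusive.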

Other notable changes

  • Initial snapshot filtering (xid-based skip) is now applied at the first fragment, avoiding accumulation of fragments for transactions that will be discarded.
  • The FlushTracker in ShapeLogCollector is now only invoked for fragments that include a commit, since mid-transaction fragments don't represent a flush boundary.
  • Removed CrashingFileStorage (dead code).

Expected memory footprint reduction

Previously, the consumer held an entire transaction's worth of change structs in memory until commit. Now, each fragment's changes are written to storage and released immediately. The consumer only retains the PendingTxn metadata struct (~100 bytes) per in-progress transaction, reducing peak memory by roughly the size of the buffered change data. The improvement scales linearly with transaction size.

Testing

Consumer and storage tests have been expanded to cover both write_unit modes, including tracing-based assertions on which storage functions are called.


coderabbitai bot commented Jan 27, 2026

Review skipped

Auto reviews are disabled on this repository.


netlify bot commented Jan 27, 2026

Deploy Preview for electric-next ready!

Name Link
🔨 Latest commit 846aa9f
🔍 Latest deploy log https://app.netlify.com/projects/electric-next/deploys/699dcb614dec2f00085735af
😎 Deploy Preview https://deploy-preview-3783--electric-next.netlify.app

codecov bot commented Jan 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.20%. Comparing base (8f2f7bd) to head (b4b89a2).
⚠️ Report is 11 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3783      +/-   ##
==========================================
+ Coverage   87.06%   87.20%   +0.13%     
==========================================
  Files          25       25              
  Lines        2343     2391      +48     
  Branches      592      595       +3     
==========================================
+ Hits         2040     2085      +45     
- Misses        301      304       +3     
  Partials        2        2              
Flag Coverage Δ
packages/experimental 87.73% <ø> (ø)
packages/react-hooks 86.48% <ø> (ø)
packages/start 82.83% <ø> (ø)
packages/typescript-client 91.87% <ø> (+0.05%) ⬆️
packages/y-electric 56.05% <ø> (ø)
typescript 87.20% <ø> (+0.13%) ⬆️
unit-tests 87.20% <ø> (+0.13%) ⬆️


@alco alco force-pushed the alco/write-txn-fragments-to-storage branch from d3dc0f4 to 1c882c2 Compare January 28, 2026 11:18
@alco alco force-pushed the alco/write-txn-fragments-to-storage branch from 1c882c2 to 71b61f5 Compare February 9, 2026 15:12
@alco alco force-pushed the alco/write-txn-fragments-to-storage branch from dba7fc3 to c7d9d17 Compare February 10, 2026 12:42
@alco alco changed the title from "Alco/write txn fragments to storage" to "feat: Write transaction fragments directly to storage to reduce consumer memory footprint" Feb 10, 2026
@alco alco force-pushed the alco/write-txn-fragments-to-storage branch from c7d9d17 to ccaed2d Compare February 10, 2026 13:54

alco commented Feb 10, 2026

Manual memory benchmark results

The tests below were done separately with electric@main and electric@this-branch. The database had a single table:

CREATE TABLE IF NOT EXISTS public.users (
  id bigserial PRIMARY KEY,
  name TEXT NOT NULL,
  email TEXT NOT NULL                                                  
);

and Electric was initialized with 200 shapes, each with a where clause of the form N=N, for N in 1..200.

After creating the shapes and firing up Observer, I executed the following INSERT in the database:

insert into                              
  public.users(name, email)
    select
      'username_' || generate_series::text || '_' || repeat('x', 2000),
      'username_' || generate_series::text || '@example.com'
    from
      generate_series(1,10001);

Memory charts

The total memory usage on electric@main grows almost to 9GB during transaction processing. For electric@this-branch the total memory usage stays below 500MB.

electric@main

electric-main-mem-chart

electric@this-branch

electric-fragment-mem-chart

System stats comparison

For each run, I screenshotted the system stats in Observer both before the INSERT and after the memory usage had fallen back down (after GC and consumer hibernation).

The key thing to notice from these is that electric@main has residual process memory usage after the load test, which is not the case for electric@this-branch.

electric@main

electric-main-stats

electric@this-branch

electric-fragment-stats

@alco alco marked this pull request as ready for review February 10, 2026 14:26

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ccaed2d3ca



alco commented Feb 10, 2026

@claude please review this PR.

  • Does it address the referenced issue?
  • Do you see any obvious issue in the code?
  • Lack of test coverage?
  • Any gnarly or hairy code?


alco commented Feb 11, 2026

I've run the replication throughput benchmark on our cloud fleet with the following params, comparing main (i.e. electricsql/electric:canary) with this PR (pushed to Docker Hub as alco/electric:fragment-direct-pr):

%{
  "row_size" => [150],
  "shape_count" => [5, 100, 1000, 2000],
  "tx_affects_shape" => [true, false],
  "tx_count" => [1000],
  "tx_row_count" => [90, 100, 120, 200, 220],
  "where_clause" => ["name = '\#{name}'", "name ILIKE '\#{name}%'"]
}
canvas


balegas commented Feb 11, 2026

Every microsecond counts!


@robacourt robacourt left a comment


LGTM. Great work! The interaction with the Materializer will need addressing; maybe consumers with materializers should use write_unit=txn.

@alco alco added the claude label Feb 11, 2026
claude bot commented Feb 11, 2026

Claude Code Review

Summary

Tenth review iteration of PR #3783 (2026-02-24). Since Review #9, a significant batch of commits landed that address all four remaining "What Remains" items plus two chatgpt-codex P1 concerns from Feb 10. The PR is now in excellent shape. One low-probability race condition from Review #9 remains open.


What's Working Well

Storage capability check (b5952c8837 + 596d828b25):

validate_storage_capabilities in State.initialize/3 is a clean implementation. It calls Storage.supports_txn_fragment_streaming?/1 and gracefully downgrades write_unit to :txn with a warning for unsupported backends. InMemoryStorage and TestStorage both implement supports_txn_fragment_streaming?/0 → false, closing the chatgpt-codex P1 concern about InMemoryStorage crashing on multi-fragment transactions.

Flush boundary offset fix (2198e59a6d):

maybe_complete_pending_txn now uses txn_fragment.last_log_offset (the WAL commit boundary) as the right-hand side of txn_offset_mapping, instead of the previous approach that could use the last-relevant-change offset. This correctly handles the case where the commit fragment has no relevant changes — the flush boundary now always advances to the true WAL commit offset, not just to the last relevant-change offset. This closes the chatgpt-codex P1 from Feb 10.

Split notification functions (b17c81fee2):

Splitting notify_new_changes into separate notify_materializer_of_new_changes and notify_clients_of_new_changes is a clean refactor. The :ok = ... assertion pattern at every call site is good defensive style. The materializer now correctly receives commit: false per fragment and a final commit: true commit signal (via the empty-list path in maybe_complete_pending_txn).

Expanded test coverage:

The new tests in consumer_test.exs (buffered fragments, snapshot filtering, idempotency, flush boundaries) provide substantial coverage that effectively replaces the "oracle test" concern from Review #9. Support.StorageTracer with Erlang :trace sessions is a well-designed non-intrusive tracing mechanism. The assert_receive_timeout-based timeout in collect_traced_calls/1 makes assertions reliable without flakiness.

is_subquery_shape? unit tests (e2b3dc8924):

state_test.exs now covers all three write_unit cases: standalone shape → :txn_fragment, shape with dependencies → :txn, and is_subquery_shape?: true → :txn. These tests directly prevent the regression risk noted in Review #9 suggestion #3.

Changeset (0f8f833d6b, b162c97291):

Now accurately documents the three new callbacks and the automatic fallback behavior for InMemoryStorage.


Issues Found

Important (Should Fix)

1. Pre-existing standalone consumer + new materializer subscription race

File: packages/sync-service/lib/electric/shapes/consumer.ex:210-215

Carried over from Review #9 — still unaddressed. subscribe_materializer returns state.latest_offset:

def handle_call({:subscribe_materializer, pid}, _from, state) do
  Process.monitor(pid, tag: :materializer_down)
  {:reply, {:ok, state.latest_offset}, %{state | materializer_subscribed?: true}, ...}
end

For a write_unit: :txn_fragment consumer, state.latest_offset is updated per-fragment in write_txn_fragment_to_storage. If a materializer subscribes while a multi-fragment transaction is in progress, state.latest_offset is a mid-transaction offset. The materializer's read_stream reads storage up to this offset, but PureFileStorage.get_log_stream is bounded by last_seen_txn_offset (only advanced by signal_txn_commit!/2). The materializer reads nothing for the in-flight transaction, and the per-fragment changes written before subscription are never re-delivered via notify_materializer_of_new_changes.

Impact: Incorrect move-in/out tracking in the outer shape for that one transaction.

Probability: Low — requires (a) inner shape created independently before outer shape, (b) outer shape created while a multi-fragment transaction is active on the inner shape.

Suggested mitigation: In subscribe_materializer, return the last committed offset from storage rather than state.latest_offset:

def handle_call({:subscribe_materializer, pid}, _from, state) do
  Process.monitor(pid, tag: :materializer_down)
  {:ok, safe_offset} = ShapeCache.Storage.fetch_latest_offset(state.storage)
  {:reply, {:ok, safe_offset}, %{state | materializer_subscribed?: true}, state.hibernate_after}
end

This ensures the materializer always starts from a committed boundary.


Suggestions (Nice to Have)

2. Changeset claim doesn't match behavior for backends missing supports_txn_fragment_streaming?/0

File: .changeset/brown-rice-float.md, packages/sync-service/lib/electric/shapes/consumer/state.ex:264-278

The changeset says: "Custom storage backends that do not implement these callbacks will automatically fall back to the default write_unit=txn mode". This is true for backends that implement supports_txn_fragment_streaming?/0 → false (like InMemoryStorage), but a backend that doesn't implement the callback at all would get an UndefinedFunctionError in validate_storage_capabilities rather than falling back.

The @callback definition will produce compile-time warnings for non-implementing backends, which mitigates the risk in practice. But if you want the changeset claim to be strictly accurate, a guard like function_exported?(mod, :supports_txn_fragment_streaming?, 0) would ensure clean fallback for older backends:

defp validate_storage_capabilities(%__MODULE__{write_unit: @write_unit_txn_fragment} = state, storage) do
  {mod, _opts} = storage
  supports? = function_exported?(mod, :supports_txn_fragment_streaming?, 0) and mod.supports_txn_fragment_streaming?()
  if supports? do
    state
  else
    # ... existing downgrade logic
  end
end

3. Pre-existing: pending_txn: nil crash on non-begin fragment

process_txn_fragment/2 at line ~616 accesses txn.consider_flushed? where txn = state.pending_txn. A non-begin fragment arriving with pending_txn: nil (malformed WAL stream or redelivery edge case) would crash with ArgumentError: argument error on nil.consider_flushed?. Very low probability in practice; still no change from Review #9.


Issue Conformance

Linked issue #3415: "Avoid holding whole transaction in consumer memory"

✅ Core objective delivered: memory improvement via direct fragment-to-storage streaming
✅ Fragment streaming for standalone shapes (no subquery deps, no materializer)
✅ Full-txn buffering for shapes in subquery hierarchy (is_subquery_shape?: true)
✅ Storage capability check: Falls back to :txn mode with warning for InMemoryStorage
✅ Materializer correctness: commit: false/true pattern ensures consistent transaction view
✅ Mid-transaction crash: Eliminated by setting write_unit at initialization
✅ Flush boundary fix: txn_offset_mapping now uses WAL commit offset
✅ Changeset: Documents new storage interface additions and fallback behavior
✅ Unit tests: Cover all write_unit modes, snapshot filtering, idempotency, flush boundaries
⚠️ Standalone materializer subscription race: Still present, low probability


Previous Review Status (Review #9 → Review #10)

What was addressed:

What remains:

  1. ⚠️ Standalone inner-shape materializer subscribe race — low probability, subscribe_materializer returns potentially mid-transaction offset

Overall Assessment

The PR is merge-ready from a correctness standpoint. The critical architecture decisions (write_unit set at init time, materializer commit deferral, flush boundary alignment) are all correct. The one remaining Important issue is a low-probability edge case that requires a specific timing window. If the team is comfortable with the probability, this can merge as-is and the fix can be a follow-up. If not, the mitigation is a one-line change in subscribe_materializer.


Review iteration: 10 | 2026-02-24

@alco alco force-pushed the alco/write-txn-fragments-to-storage branch 2 times, most recently from 5f18340 to 4714a7a Compare February 16, 2026 13:08

alco commented Feb 16, 2026

Update (Feb 16)

Rebased on current main and pushed new commits addressing review feedback:

  • Split materializer and client notifications (a93967e): Separated notify_new_changes into distinct functions for notifying the materializer vs. clients. This is needed to selectively notify the appropriate subscriber when processing individual txn fragments.

  • Use commit fragment offset for txn_offset_mapping and client notifications (78a629a): Fixes the flush alignment issue flagged in review — when the final commit fragment is filtered out, the mapping now uses the commit fragment's last_log_offset directly instead of the last relevant-change boundary. PendingTxn no longer needs to track last_log_offset.

  • Add supports_txn_fragment_streaming? callback to Storage behaviour (4e630ab): InMemoryStorage and TestStorage return false, PureFileStorage returns true. This allows runtime capability checking before enabling fragment streaming mode.

  • Downgrade write_unit to :txn when storage lacks fragment streaming support (4714a7a): On consumer init, validates that the storage backend supports txn fragment streaming. If unsupported (e.g. InMemoryStorage), falls back to full-transaction buffering with a warning log. This addresses the concern about InMemoryStorage crashing on fragment callbacks.

Still TODO: address @robacourt's feedback about shapes that are dependencies (innermost shapes with no dependencies themselves but that DO have a materializer) — these should probably use write_unit=txn to ensure the materializer gets the full transaction change set on commit.

Based on the current implementation, the Materializer sees changes from all txn fragments, but when it then notifies the consumer with a :materializer_changes message in return, the Consumer may still assume that a complete transaction has been processed.


alco commented Feb 16, 2026

Latest changes (3 new commits since last push)

Defer Materializer subscriber notifications until commit fragment

Previously, when write_unit=:txn_fragment, the inner Consumer called Materializer.new_changes() per fragment, and the Materializer immediately sent :materializer_changes (move-in/move-out events) to the outer Consumer after each fragment. This could cause redundant intermediate notifications within a single transaction.

Changes:

  • Added a commit: option to Materializer.new_changes/3 (defaults to true for backward compatibility). When commit: false, events are accumulated in a new pending_events field on the Materializer state without notifying subscribers. When commit: true, accumulated events are flushed to subscribers.
  • The fragment-streaming path in consumer.ex passes commit: false for every fragment. The flush happens via notify_new_changes in maybe_complete_pending_txn which uses commit: true (the default).
  • Events from multiple fragments are merged by prepending newer events to older ones, matching the existing reverse-chronological ordering convention used by increment_value/decrement_value.
  • Added a test verifying correct event ordering across uncommitted fragments.
  • Moved signal_txn_commit! inside the num_changes > 0 branch since there's no need to signal storage when no changes were written.
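
The accumulate-then-flush behaviour described in these bullets can be sketched roughly as follows (the state shape and helper names are assumptions, not the actual Materializer code):

```elixir
defmodule MaterializerSketch do
  @moduledoc "Hedged sketch of commit-deferred subscriber notification."

  def new_changes(state, events, opts \\ []) do
    commit? = Keyword.get(opts, :commit, true)
    # Newer events are prepended, matching the reverse-chronological
    # ordering convention mentioned above.
    state = %{state | pending_events: events ++ state.pending_events}

    if commit?, do: flush_to_subscribers(state), else: state
  end

  defp flush_to_subscribers(state) do
    for pid <- state.subscribers do
      send(pid, {:materializer_changes, state.pending_events})
    end

    %{state | pending_events: []}
  end
end
```

With commit: false the events only accumulate; the single flush at commit: true is what keeps the outer Consumer from seeing intermediate, per-fragment notifications.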


alco commented Feb 16, 2026

Regarding an earlier agentic comment about the potential impact of direct txn fragment writing to storage on PureFileStorage.trim_log!, I've asked Claude to analyse it; here's the result: https://gist.github.com/alco/588b3d7d7665f63d768d5f1b611499c7#file-fragment_storage_safety_analysis-md.

TL;DR: Fragment Streaming Storage Safety

All 7 crash/consistency scenarios are safe. The three-offset architecture (last_seen_offset / last_seen_txn_offset / last_persisted_txn_offset) provides the core safety guarantee:

  • trim_log!/2 correctly truncates uncommitted fragments — it uses last_persisted_txn_offset (only advanced after a complete txn is finalized + flushed) to remove both chunk index entries and log file bytes beyond the committed boundary.
  • Readers never see uncommitted datastream_main_log caps reads at last_seen_txn_offset, which isn't advanced until signal_txn_commit! calls finalize_txn.
  • ETS consistency is maintained — fragment entries stay in an in-memory buffer until finalize_txn; the disk/ETS split read strategy handles mid-transaction flushes correctly.
  • Compaction has one theoretical edge case (transaction spanning 30+ MB creating 3+ chunks) where the compacted file could retain uncommitted data after a crash, but readers are still bounded by the txn offset, so no data corruption occurs. Practically, this is a near-zero likelihood scenario.

No code changes are needed — the existing invariants hold correctly with the fragment write path.
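
As a reading aid, the ordering the analysis relies on can be stated as a simple invariant (offset names from the analysis above; the actual PureFileStorage state is more involved):

```elixir
# Sketch of the three-offset invariant. Each offset only moves forward,
# and they advance in this order:
#   last_seen_offset           - as each fragment is appended
#   last_seen_txn_offset       - when signal_txn_commit! finalizes the txn
#   last_persisted_txn_offset  - after the finalized txn is flushed to disk
defmodule OffsetInvariantSketch do
  def holds?(%{persisted: p, committed: c, seen: s}), do: p <= c and c <= s
end
```

Trimming back to `persisted` and capping reads at `committed` is what makes both crash recovery and concurrent reads safe against in-flight fragments.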


@robacourt robacourt left a comment


LGTM. To test the effect of this on subqueries, you could cherry-pick 0cc3b49 and run the tests with the number of mutations in the test transactions over the max txn size and see if there are any issues:

ELECTRIC_EXPERIMENTAL_MAX_TXN_SIZE=5 \
MUTATIONS_PER_TXN=15 \
mix test --only oracle test/integration/oracle_property_test.exs


alco commented Feb 16, 2026

@robacourt That's dope! It has found one failure already after running for ~10sec. I'll look into fixing it and see if it can discover anything else.

alco and others added 12 commits February 23, 2026 17:21
…rate functions

This is needed to be able to selectively notify the appropriate
subscriber when processing individual txn fragments.
…tions

PendingTxn no longer needs to track last_log_offset since the commit
fragment's offset (txn_fragment.last_log_offset) is now used directly in
maybe_complete_pending_txn. This fixes flush alignment lagging behind
when the final commit fragment is filtered out.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
InMemoryStorage and TestStorage return false, PureFileStorage returns
true. This allows runtime capability checking before enabling fragment
streaming mode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…pport

When initializing a consumer, validate that the storage backend
supports txn fragment streaming before using :txn_fragment mode.
If unsupported (e.g. InMemoryStorage), fall back to full-transaction
buffering with a warning log.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously, Materializer sent :materializer_changes to subscribers on
every new_changes call. When write_unit=:txn_fragment, this meant the
outer Consumer could receive multiple intermediate move-in/move-out
notifications within a single transaction, causing redundant work.

Add a commit: option to Materializer.new_changes/3 (default: true).
Changes are accumulated in pending_events and only flushed to
subscribers when commit: true. The fragment-streaming path passes
commit: not is_nil(fragment.commit) to defer until the final fragment.
maybe_complete_pending_txn sends an empty new_changes with commit: true
to handle the edge case where the commit fragment has no matching
changes but earlier fragments accumulated events.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Verifies that splitting inserts across two new_changes calls (one
uncommitted, one committed) produces the same event order as sending
them in a single call.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…_to_storage

The :includes_truncate and {[], 0} branches returned {state, []} instead
of bare state, causing a FunctionClauseError crash when piped into
maybe_complete_pending_txn which pattern-matches on %State{}.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…fications

maybe_complete_pending_txn was notifying clients with
txn_fragment.last_log_offset — the offset of the last change across ALL
tables in the commit fragment. But fetch_latest_offset returns the offset
of the last change written to THIS shape's storage. When a split
transaction has changes to tables not matching the shape, the client
receives a higher offset than what the shape's storage knows about,
causing "offset out of bounds" on the next poll.

Fix: use state.latest_offset (the last offset actually written to this
shape's log) instead of the fragment's cross-table offset.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@alco alco force-pushed the alco/write-txn-fragments-to-storage branch from 751cf24 to cdf0cb7 Compare February 23, 2026 16:23

alco commented Feb 23, 2026

I have pushed a fix for the issue described in #3783 (comment) and have set write_unit=txn even for leaf Consumer processes - those that are started for the innermost shape of a shape with subqueries.

Still, it's impossible to be sure this branch doesn't add new failures on top of those already observed in main until we figure out the root cause of those. This is just one example of the different failures that are readily reproducible with oracle tests in main.

@alco alco marked this pull request as ready for review February 24, 2026 09:21

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cdf0cb70ac


alco and others added 3 commits February 24, 2026 16:52
…hape-with-subqueries hierarchy

The write_txn_fragment_to_storage path in Consumer was designed with
simplifications that assumed the consumer has no inner shape dependencies:
it doesn't account for move-ins/move-outs, doesn't call State.track_change,
and evaluates WHERE clauses without extra_refs from inner materializers.

These simplifications are correct for true leaf consumers, but a consumer
can have a materializer subscribed to it by an outer shape even if it was
initially assigned write_unit=txn_fragment (based on having no dependencies
of its own). When such a consumer processes fragments directly, the outer
shape's do_handle_txn call to Materializer.get_all_as_refs sees intermediate
materializer state that hasn't been properly committed, leading to incorrect
WHERE clause evaluation in the outer shape.

Instead of reacting to materializer subscription events at runtime, we now
initialize the consumer with write_unit=txn upfront when its shape is known
to be part of a shape-with-subqueries hierarchy. This ensures that any
consumer whose state may be observed by an outer shape via extra_refs
buffers all fragments and processes the complete transaction atomically
from the start, guaranteeing the materializer's value_counts are updated
in a single consistent operation before the outer shape queries them.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
No need to pass the whole state, only stack_id is used from it. Makes
the data dependencies between functions clearer.
@alco alco force-pushed the alco/write-txn-fragments-to-storage branch from cdf0cb7 to 846aa9f Compare February 24, 2026 16:01

alco and others added 3 commits February 24, 2026 22:40
…angeset

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…config

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tack

This is a temporary measure to get the code changes made in this branch
so far out of the door before polishing interop between subqueries and
individual txn fragment writing to storage.

alco commented Feb 24, 2026

In order to get this PR merged, I have disabled individual txn fragment writing to storage when the allow_subqueries feature flag is present in the stack config.

I will undo that and will continue fixing edge cases for shapes with subqueries in this follow-up PR.


@robacourt robacourt left a comment


LGTM. I just left a couple of comments

|> build_shape_dependencies(known)
|> build_shape_dependencies(false, known)

# Any inner shape of a root shape with subqueries must pass the is_subquery_shape? option
Contributor

I'm not sure why being a root is a special case. Any dependency should have the is_subquery_shape?: true

Contributor

Oh, ignore me. You're just excluding the root from being is_subquery_shape?. Fair enough.

Contributor

The logic that seems more straightforward in my head is that you tell a shape whether it's a subquery shape or not: when building the root that's false, and when it builds its dependencies it's true. No need to even mention the root. Also, is_subquery_shape? could always be set rather than requiring additional conditional logic. But this is a very minor nit!

Enum.zip(shape.shape_dependencies_handles, shape.shape_dependencies)
|> Enum.reject(fn {handle, _shape} -> MapSet.member?(known, handle) end)
|> build_shape_dependencies(known)
|> build_shape_dependencies(false, known)
Contributor

Can a shape become a subquery shape after it's created? So if SELECT id FROM products was created as a shape first and then id IN (SELECT id FROM products) later, would the 2nd shape use the first as the subquery shape? If so, setting is_subquery_shape? on creation breaks. If this is the case, personally I wouldn't be against ensuring that shapes and subquery shapes are distinct to prevent this from happening; it's only an optimisation to reuse shapes.

Member Author

I can address both your comments here.

You are right, a shape may start out independent but then have a materializer subscribe to it at a later point in time. I attempted to identify inner shapes using the is_subquery_shape? option passed at consumer startup but then determined that it wouldn't be sufficient. If we want to enable txn fragment processing for stacks that have allow_subqueries, we need to make it work properly even for shapes with subqueries, at least in the case where shape.shape_dependencies == []. At that point, the is_subquery_shape? flag becomes superfluous and will be removed from the codebase.

Contributor

Ok, so your plan for this PR is to not enable txn fragment processing for stacks that have allow_subqueries?

Member Author

Exactly.

Once this is merged, direct txn fragment writing to storage will only be enabled for stacks that don't have subqueries.

@alco alco merged commit 48bbbe3 into main Feb 26, 2026
63 of 64 checks passed
@alco alco deleted the alco/write-txn-fragments-to-storage branch February 26, 2026 14:32


Development

Successfully merging this pull request may close these issues.

Avoid holding whole transaction in consumer memory

3 participants