chore(release): promote rc-2026.6.2 to 0.2.8 #129
Merged
The rc-2026.6.2 cut rewrote only ant-core's runtime `ant-protocol` dep to the git rc branch, leaving the optional `devnet` ant-node and the test-only ant-node/saorsa-core dev-deps on their released versions (ant-node 0.11.6 -> ant-protocol 2.1.2 / saorsa-core 0.24.5). That pulled a second protocol lineage into the graph, so any target bridging ant-core and ant-node (devnet, E2E, merkle-e2e tests) saw two incompatible copies of `ant_protocol::transport::P2PNode` and failed to compile with E0308. Point all three pins at the matching rc branches so the graph collapses to a single git-rc lineage. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The CLI merkle upload path stored each wave of 64 chunks through `merkle_store_with_retry` with up to 4 attempts and 30s jittered backoffs, and a hard barrier: wave N+1 could not start until wave N's retry loop fully drained. A handful of quorum-short chunks therefore parked the wave's other ~63 slots idle through multiple backoffs, the single biggest throughput sink on the PROD-UL-01 run (one wave alone burned 34 minutes).

Port the download path's deferred-retry design to the upload path:
- Store each wave in a single pass (`max_attempts = 1`, no backoff) so a wave never blocks on a slow chunk.
- Collect quorum-short chunks into a file-level deferred set and advance to the next wave immediately.
- After the last wave, retry the whole deferred set in concurrent rounds with `[0, 15, 45]s` delays (matching the download path), re-reading each chunk's body from the spill at retry time (peak RAM unchanged) and reusing its proof.

Failure semantics are preserved: chunks still short after the final round surface as `PartialUpload`; a non-quorum error aborts as `PartialUpload` while preserving earlier progress. Stats and progress numbering are carried across rounds, with each deferred round's successes recorded in its own histogram slot. Total per-chunk retry budget is unchanged (1 wave pass + 3 deferred rounds).

Adds `merkle_deferred_retry`, `DeferredRetryOutcome`, `deferred_round_histogram_slot`, `DEFERRED_ROUND_DELAYS_SECS`, and unit tests.

V2-466
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
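The deferred-retry flow described in this commit can be sketched roughly as below. This is a simplified, synchronous illustration and not the code in the PR: `StoreResult`, `upload_with_deferred_retry`, the `u32` chunk ids, and the injected `try_store` / `wait_secs` closures are all hypothetical stand-ins. Only the `[0, 15, 45]` delays and the "1 wave pass + 3 deferred rounds" budget come from the commit message.

```rust
/// Delays (seconds) before each deferred round, as listed in the commit message.
const DEFERRED_ROUND_DELAYS_SECS: [u64; 3] = [0, 15, 45];

/// Hypothetical result of a single store attempt for one chunk.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StoreResult {
    Stored,
    QuorumShort,
}

/// Stores every wave in one pass, defers quorum-short chunks, then retries
/// the deferred set in rounds. Returns the chunks still short at the end.
fn upload_with_deferred_retry(
    chunks: &[u32],
    wave_size: usize,
    mut try_store: impl FnMut(u32) -> StoreResult,
    mut wait_secs: impl FnMut(u64),
) -> Vec<u32> {
    let mut deferred: Vec<u32> = Vec::new();

    // Wave pass: one attempt per chunk, no backoff, so a slow chunk never
    // holds up the rest of its wave or the next wave.
    for wave in chunks.chunks(wave_size.max(1)) {
        for &id in wave {
            if try_store(id) == StoreResult::QuorumShort {
                deferred.push(id);
            }
        }
    }

    // Deferred rounds: retry only what is still short, after each delay.
    for delay in DEFERRED_ROUND_DELAYS_SECS {
        if deferred.is_empty() {
            break;
        }
        wait_secs(delay);
        deferred.retain(|&id| try_store(id) == StoreResult::QuorumShort);
    }

    deferred
}
```

In this sketch a chunk that never reaches quorum is attempted four times in total, which matches the unchanged per-chunk budget stated above. The real path runs stores concurrently and re-reads bodies from the spill, both of which are omitted here.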
… store AIMD
The adaptive store concurrency limiter never ramped from its cold-start of
8 and got crushed to a +1-per-window crawl because two non-capacity
signals polluted its health input on the merkle upload path:
- Node-side PUT latency is dominated by the ~28s synchronous merkle
closeness lookup, inflating client-observed p95/median to 3-6x and
tripping the latency-vs-baseline Decrease even though nothing about it
is local congestion.
- Remote application rejections (pool-rejected, disk-full, quote-stale)
arrived as Error::Protocol / flattened Error::InsufficientPeers and were
classified as NetworkError, counting against success_target and driving
multiplicative decrease. With the default slow_start_ramp_threshold of 0,
a single such Decrease permanently exited slow-start.
Apply the fetch-channel precedent to the store channel (the situation is
structurally identical — verification variance instead of retry variance),
plus preserve the structured remote rejection reason so it classifies
correctly. The cold-start floor of 8 is deliberately unchanged.
- adaptive.rs: store_cfg.latency_decrease_enabled = false and
store_cfg.slow_start_ramp_threshold = usize::MAX, so a transient Decrease
halves but the next healthy window re-doubles. Genuine store congestion
still surfaces via the timeout-rate ceiling.
- error.rs/chunk.rs: new Error::RemotePut { address, source: ProtocolError }
carrying the structured upstream discriminant instead of stringifying it
into Error::Protocol. A ChunkPutResponse::Error means the transport
round-trip succeeded and the node declined at the application layer.
- chunk.rs: chunk_put_to_close_group surfaces a representative RemotePut for
app-only quorum shortfalls; any genuine transport failure keeps it
InsufficientPeers so real congestion still cuts the cap.
- mod.rs: classify_error maps RemotePut to ApplicationError.
- merkle.rs: merkle_store_with_retry treats RemotePut as recoverable
(defer/retry) like InsufficientPeers, so transient rejections don't abort
the upload.
Adds unit coverage: store ramps/recovers under the new tuning while a
timeout burst still cuts it; remote app-rejections don't move the cap;
RemotePut is recoverable in the retry path.
Linear: V2-468
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
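To make the effect of the two tuning changes concrete, here is a toy limiter. It is a sketch under stated assumptions, not the `adaptive.rs` implementation: `Outcome`, `StoreLimiter`, the 10% failure ceiling, the floor of 1, and the ceiling of 64 are hypothetical, and the way `slow_start_ramp_threshold` gates re-entry into slow-start is my reading of the commit message. Only the cold-start of 8 and the threshold values `0` and `usize::MAX` are taken from the text.

```rust
/// Hypothetical per-request classification fed to the limiter.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Success,
    Timeout,
    NetworkError,
    /// Remote application-layer rejection; says nothing about local capacity.
    ApplicationError,
}

/// Toy AIMD limiter: doubles while in slow-start, otherwise adds 1 per window.
struct StoreLimiter {
    cap: usize,
    floor: usize,
    ceiling: usize,
    in_slow_start: bool,
    slow_start_ramp_threshold: usize,
    max_failure_rate: f64,
}

impl StoreLimiter {
    fn new(slow_start_ramp_threshold: usize) -> Self {
        Self {
            cap: 8, // cold-start value from the commit message
            floor: 1,
            ceiling: 64,
            in_slow_start: true,
            slow_start_ramp_threshold,
            max_failure_rate: 0.1,
        }
    }

    /// Folds one observation window into the cap and returns the new cap.
    fn on_window(&mut self, outcomes: &[Outcome]) -> usize {
        // Application rejections are excluded from the health input entirely.
        let relevant = outcomes
            .iter()
            .filter(|o| **o != Outcome::ApplicationError)
            .count();
        if relevant == 0 {
            return self.cap;
        }
        let failures = outcomes
            .iter()
            .filter(|o| matches!(o, Outcome::Timeout | Outcome::NetworkError))
            .count();

        if failures as f64 / relevant as f64 > self.max_failure_rate {
            // Multiplicative decrease; slow-start resumes only below the threshold.
            self.cap = (self.cap / 2).max(self.floor);
            self.in_slow_start = self.cap < self.slow_start_ramp_threshold;
        } else if self.in_slow_start {
            self.cap = (self.cap * 2).min(self.ceiling);
        } else {
            self.cap = (self.cap + 1).min(self.ceiling);
        }
        self.cap
    }
}
```

With a threshold of `usize::MAX` the limiter halves on a timeout burst and doubles again on the next healthy window. With a threshold of `0` a single decrease leaves it in additive mode for good, which is the crawl the commit describes.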
fix(client): stop verification latency and app-rejections suppressing store AIMD
…ounting
Addresses review on the deferred merkle-upload retry path.

1. Memory bound (high): the deferred pass read every quorum-short chunk in the whole file into one Vec per round before storing, so peak resident bodies scaled with the file-wide deferred count rather than the wave path's ~UPLOAD_WAVE_SIZE / ~256 MB bound. merkle_deferred_retry now takes a batch_size and processes each round in batches of that size, re-reading only one batch of bodies from the spill at a time. The CLI caller passes UPLOAD_WAVE_SIZE.
2. Fatal-abort accounting (medium): merkle_store_with_retry returned Err on a non-quorum error, discarding the successes already recorded in that pass; the wave/deferred callers then built PartialUpload from stale state (could report failed_count = 0 and omit same-pass stores). The store helper now preserves same-pass successes (stored/stored_addresses), records the fatal chunk as failed, and surfaces the error via a new MerkleStoreOutcome::fatal field instead of Err. The external-signer path re-raises fatal as Err to keep its all-or-nothing contract; the CLI wave and deferred paths fold it into a PartialUpload whose failed set is derived authoritatively as every input chunk not in stored_addresses (shared partial_upload_after_fatal helper), so stored_count + failed_count accounts for the whole file. This also fixes the pre-existing wave-path under-reporting the review noted.

Tests: same-pass successes preserved on fatal; deferred reads bounded to batch_size; updated the non-quorum-error test to assert fatal-in-outcome.
cargo test -p ant-core --lib -> 338 passed; clippy and fmt clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
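The accounting rule in item 2 (failed set = every input chunk not in the stored set) can be illustrated with a small set difference. This is a minimal sketch, not the `partial_upload_after_fatal` helper itself: `failed_after_fatal` and the `u64` address type are hypothetical.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a chunk address.
type ChunkAddr = u64;

/// Derives the failed set as "every input chunk that was not stored", so the
/// stored and failed counts always add up to the whole file.
fn failed_after_fatal(all_inputs: &[ChunkAddr], stored: &HashSet<ChunkAddr>) -> Vec<ChunkAddr> {
    all_inputs
        .iter()
        .filter(|addr| !stored.contains(*addr))
        .copied()
        .collect()
}
```

Deriving the failed set from the inputs, rather than accumulating it as errors arrive, means chunks that were never attempted after the fatal error are still counted as failed.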
…_uploads feat(client): download-style deferred retry for merkle uploads
Use ant-protocol via git instead of the local checkout path. BREAKING CHANGE: removes ant-core bootstrap-cache recording hooks and the bootstrap-cache E2E/dev-dependency surface.
SemVer: minor
SemVer: patch
SemVer: patch
Keep Client::connect exact to its supplied bootstrap peers while preserving CLI cache warm-start behavior. Filter cached bootstrap addresses for ipv4-only runs and update git dependency locks to the pushed timeout-removal branches. SemVer: patch
Retain cached bootstrap peers by peer-id keyspace coverage before recency while still enforcing IP diversity limits. Recency remains the tie-breaker among equally diverse candidates. SemVer: patch
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The cold-start-from-disk bootstrap cache test that used this dev-dependency was removed with the rest of the bootstrap cache integration, so the direct saorsa-core dev-dependency is now dead. Removing it keeps the manifest and lockfile consistent (the lock no longer carries the ant-core -> saorsa-core edge).
feat!: remove bootstrap cache integration
feat(cli): add peer count to file download
estimate_upload_cost sampled only the first ESTIMATE_SAMPLE_CAP chunk addresses, so a file whose leading chunks were already stored but whose tail was new returned CostEstimationInconclusive even though a real estimate was obtainable. Display consumers (the GUI) were left with no value to show.

- Distributed sampling: sample addresses spread evenly across the whole chunk list instead of the first N (distributed_sample_indices, unit-tested). Files with <= cap chunks still sample every chunk, preserving exact "whole file sampled" detection.
- The residual all-stored-but-incomplete case returns Ok with storage_cost_atto "0" instead of erroring, tagged with a new CostEstimateConfidence enum (PricedSample / VerifiedAllAlreadyStored / AllSamplesAlreadyStoredIncomplete). The CLI renders the confidence.

UploadCostEstimate is now #[non_exhaustive] with a #[serde(default)] confidence field. Error::CostEstimationInconclusive is retained (no longer produced) to avoid removing a public variant.

BREAKING CHANGE: UploadCostEstimate is #[non_exhaustive] and gained a `confidence` field; downstream code constructing or exhaustively destructuring it must update.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
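One way to spread sample indices evenly across a chunk list is shown below. The function name matches the one mentioned in the commit, but the body is an assumption about how such sampling could work, not the PR's implementation.

```rust
/// Returns up to `cap` indices spread evenly over `0..len`.
/// When `len <= cap` every index is returned, so the whole file is sampled.
fn distributed_sample_indices(len: usize, cap: usize) -> Vec<usize> {
    if len == 0 || cap == 0 {
        return Vec::new();
    }
    if len <= cap {
        return (0..len).collect();
    }
    if cap == 1 {
        return vec![0];
    }
    // Here len > cap >= 2, so the step (len - 1) / (cap - 1) exceeds 1 and
    // the indices are strictly increasing, from 0 up to len - 1 inclusive.
    (0..cap).map(|i| i * (len - 1) / (cap - 1)).collect()
}
```

For a 10-chunk file with a cap of 4 this picks indices 0, 3, 6, and 9, so a file whose head is already stored but whose tail is new still contributes unpriced tail chunks to the sample.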
Add `Client::file_download_to_sender`, which downloads + decrypts a file and streams the plaintext to a caller-provided `mpsc::Sender<Result<Bytes>>` instead of writing to disk. Constant memory (one decrypt batch resident at a time, same as `file_download`), and the caller receives bytes progressively as each batch decrypts, which suits forwarding to an HTTP chunked body or a gRPC response stream. The bounded sink applies backpressure; a dropped receiver (client disconnect) ends the download early.

Implemented by extracting the existing batched-fetch + streaming-decrypt loop out of `file_download_with_progress` into a private sink-parameterized core, `download_decrypted_chunks(.., on_chunk)`. `file_download_with_progress` is now a thin wrapper whose sink writes to the temp file + atomic-renames (behavior unchanged); the new method's sink forwards to the channel. No duplication of the fetch/retry logic, and `&self` is preserved (the caller spawns + owns the Receiver), so no `Client: Clone`/`'static` bound is required.

Adds an e2e round-trip test that streams a multi-batch (~1 MiB) file through the channel and asserts the reassembled bytes equal the source.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
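The sink-parameterized shape described here can be sketched with the standard library alone. This is an illustration, not the PR's code: it uses `std::sync::mpsc::sync_channel` in place of the async `mpsc::Sender`, `Vec<u8>` in place of `Bytes`, pre-decrypted in-memory batches in place of the fetch/decrypt loop, and hypothetical names `DownloadError`, `download_to_sender`, and `download_to_buffer`. The signature given to `download_decrypted_chunks` is also assumed.

```rust
use std::sync::mpsc::SyncSender;

/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq, Eq)]
enum DownloadError {
    /// The consumer went away, for example a dropped receiver.
    Cancelled,
}

/// Shared core: hands each batch to `on_chunk` and stops at the first error.
/// Returns how many batches were delivered.
fn download_decrypted_chunks(
    batches: &[Vec<u8>],
    mut on_chunk: impl FnMut(Vec<u8>) -> Result<(), DownloadError>,
) -> Result<usize, DownloadError> {
    let mut delivered = 0;
    for batch in batches {
        on_chunk(batch.clone())?;
        delivered += 1;
    }
    Ok(delivered)
}

/// Channel sink: a bounded sender blocks when full (backpressure) and
/// reports an error once the receiver has been dropped.
fn download_to_sender(
    batches: &[Vec<u8>],
    tx: SyncSender<Vec<u8>>,
) -> Result<usize, DownloadError> {
    download_decrypted_chunks(batches, |bytes| {
        tx.send(bytes).map_err(|_| DownloadError::Cancelled)
    })
}

/// Buffer sink standing in for the temp-file writer.
fn download_to_buffer(batches: &[Vec<u8>], out: &mut Vec<u8>) -> Result<usize, DownloadError> {
    download_decrypted_chunks(batches, |bytes| {
        out.extend_from_slice(&bytes);
        Ok(())
    })
}
```

Both wrappers reuse the same loop and differ only in the closure they pass, which is the point of the refactor: the fetch and retry logic lives in one place.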
Drop redundant .into() on already-Bytes decrypt result (clippy useless_conversion) and apply rustfmt reflows in file.rs + e2e_file.rs. No behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- TempDownload RAII guard: removes the staging file on every disk-path error AND on a panic unwind out of the block_in_place decrypt loop, replacing three duplicated cleanup arms (#1). drop(file) before rename for Windows.
- New Error::Cancelled variant for a dropped receiver; was misclassified as Error::Network (#3). Routed to ApplicationError in classify_error so caller-initiated cancellation is not retried as a transport failure.
- Doc the exact channel item type Result<Bytes, Error> on file_download_to_sender (#4).
- Drop now-stale #[allow(clippy::unused_async)] on file_download (#7).
- Harden e2e test: assert each streamed chunk is non-empty and >=2 segments arrive (multi-batch property), rename to test_file_download_to_sender_multibatch_round_trip (#6).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
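The RAII cleanup pattern in the first bullet looks roughly like this. It is a minimal sketch: the struct name comes from the commit, but the fields, `new`, and `commit` are assumptions, and the sketch holds no open file handle, so the `drop(file)` step before the rename is not shown.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Guard that deletes the staging file unless the download is committed.
struct TempDownload {
    path: PathBuf,
    committed: bool,
}

impl TempDownload {
    fn new(path: PathBuf) -> Self {
        Self { path, committed: false }
    }

    /// Moves the staging file to its final destination. If the rename fails,
    /// the guard is dropped uncommitted and the staging file is removed.
    fn commit(mut self, dest: &Path) -> io::Result<()> {
        fs::rename(&self.path, dest)?;
        self.committed = true;
        Ok(())
    }
}

impl Drop for TempDownload {
    fn drop(&mut self) {
        // Runs on early return, on `?` propagation, and during panic unwinding.
        if !self.committed {
            let _ = fs::remove_file(&self.path);
        }
    }
}
```

Because `Drop` runs on every exit path, one guard can replace separate cleanup arms for each error case.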
It was an intra-doc link from the public file_download_with_progress to the private TempDownload struct, tripping -D rustdoc::private_intra_doc_links. A plain code span conveys the same thing without the link. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-streaming feat(data): stream decrypted file download to a channel sink
…461)
The single-node payment path aborted the entire file on the first wave with any chunk short of quorum: `upload_spill_addresses_single` `?`-propagated the per-wave `PartialUpload` from `batch_upload_chunks_with_events`, so later waves (already self-encrypted, spilled, and sometimes already paid) were never attempted. In PROD-UL-02 this turned ~85% per-chunk success into 0% per-file success, killing every upload at wave 1 of N.

Align it with the merkle path (`upload_waves_merkle`): a wave short of quorum records its failed chunks and continues; after all waves are attempted the file returns a single `Error::PartialUpload` with the full stored/failed breakdown. Genuinely fatal errors (wallet/payment infrastructure, missing proofs, spill reads) still abort immediately. The recoverable-vs-fatal decision is factored into a pure `fold_single_wave` helper with unit tests. Because `UPLOAD_WAVE_SIZE == PAYMENT_WAVE_SIZE`, each batch call is exactly one payment wave, so folding its `PartialUpload` leaves nothing un-attempted within the wave.

Also surface on-chain spend on a partial upload: a partial still pays for the chunks it paid for, but the spend was silently dropped. Add a boxed `PartialUploadSpend` (storage_cost_atto + gas_cost_wei) to `Error::PartialUpload`, populate it at every raise site (single-node, merkle, external-signer), and report it in the CLI (human + JSON). Boxed to keep `Error` under clippy's `result_large_err` threshold.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
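The recoverable-vs-fatal fold can be sketched as a pure function over a per-wave result. The name `fold_single_wave` comes from the commit; the types `WaveError` and `FileProgress`, the `finish` helper, the `u32` chunk ids, and the signature are hypothetical.

```rust
/// Hypothetical per-wave error: either a recoverable partial or a fatal abort.
#[derive(Debug, PartialEq)]
enum WaveError {
    PartialUpload { stored: Vec<u32>, failed: Vec<u32> },
    Fatal(String),
}

/// File-level running totals across waves.
#[derive(Debug, Default, PartialEq)]
struct FileProgress {
    stored: Vec<u32>,
    failed: Vec<u32>,
}

/// Folds one wave's result into the file totals. A partial wave is recorded
/// and the upload continues; any other error aborts immediately.
fn fold_single_wave(
    progress: &mut FileProgress,
    wave: Result<Vec<u32>, WaveError>,
) -> Result<(), WaveError> {
    match wave {
        Ok(stored) => {
            progress.stored.extend(stored);
            Ok(())
        }
        Err(WaveError::PartialUpload { stored, failed }) => {
            progress.stored.extend(stored);
            progress.failed.extend(failed);
            Ok(())
        }
        Err(fatal) => Err(fatal),
    }
}

/// After the last wave: success only if nothing failed, otherwise a single
/// partial error carrying the full stored/failed breakdown.
fn finish(progress: FileProgress) -> Result<Vec<u32>, WaveError> {
    if progress.failed.is_empty() {
        Ok(progress.stored)
    } else {
        Err(WaveError::PartialUpload {
            stored: progress.stored,
            failed: progress.failed,
        })
    }
}
```

Keeping the fold free of I/O is what makes the decision unit-testable in isolation, as the commit notes.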
…V2-461) Large-file single-node (--no-merkle) uploads OOM'd on small hosts: store concurrency could ramp to the wave size (64) and the send path holds each ~4 MB chunk body in flight, so a wave of large chunks pinned several GB. Cap store concurrency in store_paid_chunks_with_events by combined in-flight body bytes (STORE_INFLIGHT_BYTE_BUDGET, 64 MB) instead of chunk count, so ~4 MB chunks drop to ~16 concurrent stores while small chunks are unaffected. This is the standalone memory fix; no saorsa-core change is required. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
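The arithmetic behind the byte budget is shown below as a deliberately simplified calculation. The constant name and the 64 MB figure come from the commit; treating 64 MB as 64 MiB, sizing by the largest chunk in the wave, and the function `store_concurrency_for` are assumptions. The actual change may track in-flight bytes dynamically rather than compute a fixed cap.

```rust
/// In-flight body budget; the commit says 64 MB, taken here as 64 MiB.
const STORE_INFLIGHT_BYTE_BUDGET: usize = 64 * 1024 * 1024;

/// Picks a store concurrency so that concurrently held bodies stay within the
/// byte budget, without exceeding the adaptive limiter's cap.
fn store_concurrency_for(chunk_sizes: &[usize], adaptive_cap: usize) -> usize {
    // Size by the largest body in the wave; treat empty input as 1 byte to
    // avoid dividing by zero.
    let largest = chunk_sizes.iter().copied().max().unwrap_or(0).max(1);
    (STORE_INFLIGHT_BYTE_BUDGET / largest).clamp(1, adaptive_cap.max(1))
}
```

Under these assumptions a wave of 4 MiB chunks is limited to 16 concurrent stores, while a wave of small chunks is still bounded only by the adaptive cap.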
fix: continue single-node uploads on partial waves and bound store memory (V2-461)
# Conflicts: # Cargo.lock
feat(client): use witnessed SNP quote selection
Consume the witnessed close-group transcript from saorsa-core and compute quorum, vote counts, and fallback quote candidates in ant-client. Quote collection now keeps all quorum-recognised candidates available for reachability fallback, then pays the closest successful close group. SemVer: feature change; no public ant-client API break expected.
Select the closest witnessed SNP quote set whose paid median issuer is recognised by a close-group majority of the selected peers. This keeps fallback quote candidates available without paying a median issuer that the PUT majority may reject. SemVer: bug fix; no public ant-client API break expected.
Keep proof quote order stable while ordering PUT targets so the initial store wave favours peers that voted for the paid median issuer. Wire the in-process E2E protocol through AntProtocol::attach_p2p_node and use ant-node's test-only paid close-group override for the local client/storage-node topology. SemVer: bug fix; no public ant-client API break expected.
…olicy feat(client): apply witnessed quote policy locally
Re-point ant-protocol + ant-node (runtime optional + test-utils dev-dep) from feat/witnessed-transcript-policy -> canonical rc-2026.6.2, refresh lock to saorsa-core 0.26.0-rc.1 / ant-protocol 2.2.0-rc.1 / ant-node 0.12.1-rc.7. Includes #119 (apply witnessed quote policy locally).
Store public upload DataMaps through the same file upload chunk set so wave and merkle payments cover the shareable DataMap address instead of paying for it in a second post-upload call. SemVer: bug fix; no public ant-client API break expected.
Remove the [patch."…saorsa-core"] override that pointed at mickvandijke/saorsa-core@feat/witnessed-view-count-rc-2026.6.2 (scaffold for building against saorsa-core #135 before it merged). #135 is now on canonical rc-2026.6.2, so the lock resolves saorsa-core there (79f5ad6). Verified: cargo check --all-targets --all-features passes.
…nt-rc-2026.6.2 feat(client): widen SNP witnessed quote views
…atch-rc-2026.6.2 fix(client): include public DataMap in upload payment
Re-pin saorsa-core (79f5ad6, #135), ant-node (8f8842a, #146/#147), and ant-protocol to their current rc-2026.6.2 commits so the lock references match the branches. Lock-only; no version bump, no tag.
…ss-support fix: use direct witness support for SNP median
…d-quotes-rc-2026.6.2 fix(client): fetch witnessed quotes concurrently
Revert PR #125 (fetch witnessed quotes concurrently)
fix(snp): lower witness quorum for partial transcripts
Strip -rc and pin upstreams to crates.io: ant-protocol 2.2.0, ant-node 0.13.0 (runtime optional + test-utils dev-dep), via the re-exported saorsa-core 0.26.0. Hand-rolled (helper doesn't cover the ant-node dev-deps).
Promotes ant-core + ant-cli to 0.2.8 (final), stripping `-rc` and pinning upstreams to their published crates.io versions:
- ant-protocol `2.2.0`, ant-node `0.13.0` (runtime optional + `test-utils` dev-dep)
- saorsa-core `0.26.0` comes through ant-protocol's re-exports

Verified: `cargo check --all-targets --all-features` passes against the published crates. ant-core publishes to crates.io; ant-cli ships a GitHub binary.