Skip to content

Conversation

@darkskygit
Copy link
Member

@darkskygit darkskygit commented Dec 6, 2025

Summary by CodeRabbit

  • New Features

    • Native document crawling to return structured crawl results (blocks, title, summary).
    • crawlDocData API exposed across desktop, mobile, iOS, and Android for native retrieval.
    • New cross-platform block and crawl result types for consistent data exchange.
  • Improvements

    • Native-first crawl with validation and fallback to existing flow.
    • Throttled indexer refreshes to reduce redundant refreshes.
  • Chores

    • Dependency feature flags updated (native/mobile/native packages).

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 6, 2025

Walkthrough

Adds a native crawlDocData feature: new CrawlResult/BlockInfo types, native document binary loading/parsing and a fast-path in the indexer, plus TypeScript, Electron, iOS, Android, Rust/uniffi and SQLite wiring and normalization; also dependency and error/type adjustments across several crates and packages.

Changes

Cohort / File(s) Summary
Cargo.toml dependency updates
packages/backend/native/Cargo.toml, packages/frontend/mobile-native/Cargo.toml, packages/frontend/native/Cargo.toml, packages/frontend/native/nbstore/Cargo.toml, packages/common/native/Cargo.toml
Adjusted affine_common features (e.g., enable hashcash), added/updated workspace deps and dev-deps (serde, serde_json, thiserror, chrono, uuid, y-octo) and feature flags.
Common storage types & interface
packages/common/nbstore/src/storage/doc.ts, packages/frontend/native/index.d.ts
Added BlockInfo and CrawlResult interfaces; extended DocStorage and DocStorageBase with optional crawlDocData method; added native crawl types in TS declarations and DocStoragePool.crawlDocData signature.
SQLite DB API surface
packages/common/nbstore/src/impls/sqlite/db.ts
Added crawlDocData to NativeDBApis interface (signature: (id: string, docId: string) => Promise).
SQLite implementation & normalization
packages/common/nbstore/src/impls/sqlite/doc.ts
Implemented SqliteDocStorage.crawlDocData(docId) plus normalization helpers and validation to convert raw DB rows into CrawlResult (BlockInfo mapping, field parsing, warnings, and null-on-invalid).
Indexer integration (fast-path + refresh throttling)
packages/common/nbstore/src/sync/indexer/index.ts
Added tryNativeCrawlDocData fast-path that uses native crawl result when available (with fallback to Y.Doc crawling), introduced refreshIfNeed throttling and a lastRefreshed field, and integrated native result into internal IndexerDocument model.
Indexer storage refresh API & implementations
packages/common/nbstore/src/storage/indexer.ts, packages/common/nbstore/src/impls/cloud/indexer.ts, packages/common/nbstore/src/impls/idb/indexer/index.ts, packages/common/nbstore/src/storage/dummy/indexer.ts
Added refreshIfNeed() to IndexerStorage interface and base class and implemented no-op or conditional refresh logic in cloud, idb, and dummy indexer storage implementations.
Frontend/native Rust indexer module
packages/frontend/native/nbstore/src/indexer.rs, packages/frontend/native/nbstore/src/lib.rs
New NativeBlockInfo and NativeCrawlResult structs, From conversions from BlockInfo/CrawlResult; implemented SqliteDocStorage::crawl_doc_data that loads/merges binaries and parses via parse_doc_from_binary; exposed via DocStoragePool.crawl_doc_data and pub mod indexer.
Native record/field and error updates
packages/frontend/native/nbstore/src/doc.rs, packages/frontend/native/nbstore/src/error.rs
Changed DocRecord.bin from Vec<u8> to bytes::Bytes; added Parse(#[from] ParseError) variant to Error enum with transparent display.
Native bindings & FFI surface
packages/frontend/mobile-native/src/lib.rs, packages/frontend/mobile-native/Cargo.toml, packages/common/native/src/doc_parser.rs
Added uniffi BlockInfo and CrawlResult records and conversions; added DocStoragePool.crawl_doc_data FFI method; removed public CrawlDocInput and changed parse_doc_from_binary signature from CrawlDocInput to (doc_bin: Vec<u8>, doc_id: String).
TypeScript/JS plugin & handlers
packages/frontend/native/index.d.ts, packages/frontend/apps/electron/src/helper/nbstore/handlers.ts
Exposed native crawl result types and added DocStoragePool.crawlDocData declaration; added crawlDocData handler wired to POOL.crawlDocData in Electron helper.
iOS plugin bindings
packages/frontend/apps/ios/src/plugins/nbstore/definitions.ts, packages/frontend/apps/ios/src/plugins/nbstore/index.ts
Imported CrawlResult type and added crawlDocData method declaration and native forwarding in iOS plugin surface.
Android plugin & Kotlin Uniffi bindings
packages/frontend/apps/android/src/plugins/nbstore/definitions.ts, packages/frontend/apps/android/src/plugins/nbstore/index.ts, packages/frontend/apps/android/App/app/src/main/java/uniffi/affine_mobile_native/affine_mobile_native.kt, packages/frontend/apps/android/App/app/src/main/java/app/affine/pro/plugin/NbStorePlugin.kt
Added crawlDocData plugin API and wiring; generated Uniffi FFI converters and types (BlockInfo, CrawlResult) and added NbStorePlugin.crawlDocData implementation.
Native nbstore Cargo & dev deps
packages/frontend/native/nbstore/Cargo.toml
Added/adjusted dependencies (affine_common with ydoc-loader, serde derive, y-octo) and dev-deps (serde_json, uuid).

Sequence Diagram(s)

sequenceDiagram
    participant Indexer
    participant NativeAPI as Native Storage (Rust)
    participant SQLite as SQLite DB
    participant Loader as Binary Loader/Merger
    participant Parser as parse_doc_from_binary
    participant Fallback as Y.Doc + crawlingDocData

    Indexer->>NativeAPI: tryNativeCrawlDocData(docId)
    alt native available & succeeds
        NativeAPI->>SQLite: load snapshot & ordered updates (universal_id, docId)
        SQLite-->>Loader: return binaries (snapshot + updates)
        Loader->>Loader: merge_updates -> merged doc binary
        Loader->>Parser: parse_doc_from_binary(doc_bin, doc_id)
        Parser-->>NativeAPI: CrawlResult (blocks, title, summary)
        NativeAPI-->>Indexer: NativeCrawlResult
    else native fails or unavailable
        NativeAPI-->>Indexer: null / error
        Indexer->>SQLite: fetch docBin
        SQLite-->>Indexer: doc binary
        Indexer->>Fallback: build Y.Doc & crawlingDocData
        Fallback-->>Indexer: crawlingDocData result
    end
    Indexer->>Indexer: integrate result and call refreshIfNeed()
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Areas needing extra attention:
    • Native binary loading, merge_updates correctness and error mapping (packages/frontend/native/nbstore/src/indexer.rs).
    • DB normalization, field parsing and validation hygiene (packages/common/nbstore/src/impls/sqlite/doc.ts).
    • Indexer fast-path integration and refreshIfNeed throttling interaction (packages/common/nbstore/src/sync/indexer/index.ts).
    • Cross-language FFI/type changes and generated converters (mobile native uniffi/Kotlin and TypeScript bindings).

Poem

🐰 I nibble bytes and stitch them tight,
Merging snapshots into morning light.
Blocks align, titles hum a tune,
Native hops make crawling zoom.
Hooray — cross-platform carrots soon! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 13.86% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(native): native reader for indexer' accurately describes the main change—adding native functionality to read and index document data. It directly relates to the core additions of crawlDocData methods, native indexer integration, and data type definitions across the codebase.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch darksky/native-reader

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
packages/frontend/native/nbstore/src/lib.rs (1)

6-6: Native crawl_doc_data exposure is correct; clone usage could be slightly tightened

The new DocStoragePool::crawl_doc_data correctly exposes the native crawl path and delegates to SqliteDocStorage with proper error propagation, and pub mod indexer; ensures the types are visible to JS.

You could avoid the extra universal_id.clone() by binding the storage first, e.g.:

let storage = self.get(universal_id.clone()).await?;
let result = storage.crawl_doc_data(&universal_id, &doc_id).await?;

or by adjusting get to take &str if you ever touch that API again, but this is purely a micro-optimization.

Also applies to: 121-133

packages/common/nbstore/src/sync/indexer/index.ts (1)

495-523: The title field is extracted but never used.

The tryNativeCrawlDocData method extracts and returns title from the crawl result, but the caller at line 416-419 only uses block and summary. Either remove the unused field or utilize it if needed.

   private async tryNativeCrawlDocData(docId: string) {
     try {
       const result = await this.doc.crawlDocData?.(docId);
       if (result) {
         return {
-          title: result.title,
           block: result.blocks.map(block =>
             IndexerDocument.from<'block'>(`${docId}:${block.blockId}`, {
               docId,
               blockId: block.blockId,
               content: block.content,
               flavour: block.flavour,
               blob: block.blob,
               refDocId: block.refDocId,
               ref: block.refInfo,
               parentFlavour: block.parentFlavour,
               parentBlockId: block.parentBlockId,
               additional: block.additional,
             })
           ),
           summary: result.summary,
         };
       }
       return null;
     } catch (error) {
       console.warn('[indexer] native crawlDocData failed', docId, error);
       return null;
     }
   }
packages/frontend/native/nbstore/src/indexer.rs (1)

90-95: Consider simplifying capacity calculation.

The current capacity calculation works but could be more readable.

-    let mut segments =
-      Vec::with_capacity(snapshot.as_ref().map(|_| 1).unwrap_or(0) + updates.len());
+    let mut segments =
+      Vec::with_capacity(snapshot.is_some() as usize + updates.len());
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 69cdeed and 1879c41.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (18)
  • packages/backend/native/Cargo.toml (1 hunks)
  • packages/common/native/Cargo.toml (2 hunks)
  • packages/common/nbstore/src/impls/sqlite/db.ts (2 hunks)
  • packages/common/nbstore/src/impls/sqlite/doc.ts (2 hunks)
  • packages/common/nbstore/src/storage/doc.ts (3 hunks)
  • packages/common/nbstore/src/sync/indexer/index.ts (3 hunks)
  • packages/frontend/apps/electron/src/helper/nbstore/handlers.ts (1 hunks)
  • packages/frontend/apps/ios/src/plugins/nbstore/definitions.ts (2 hunks)
  • packages/frontend/apps/ios/src/plugins/nbstore/index.ts (2 hunks)
  • packages/frontend/mobile-native/Cargo.toml (1 hunks)
  • packages/frontend/native/Cargo.toml (2 hunks)
  • packages/frontend/native/index.d.ts (2 hunks)
  • packages/frontend/native/nbstore/Cargo.toml (3 hunks)
  • packages/frontend/native/nbstore/src/blob.rs (3 hunks)
  • packages/frontend/native/nbstore/src/doc.rs (5 hunks)
  • packages/frontend/native/nbstore/src/error.rs (2 hunks)
  • packages/frontend/native/nbstore/src/indexer.rs (1 hunks)
  • packages/frontend/native/nbstore/src/lib.rs (2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
packages/frontend/apps/ios/**/*.{ts,tsx}

📄 CodeRabbit inference engine (packages/frontend/apps/ios/AGENTS.md)

packages/frontend/apps/ios/**/*.{ts,tsx}: Use TypeScript as the programming language
Expose JavaScript APIs to native iOS code through window object for Capacitor integration
Implement getCurrentServerBaseUrl() API for native iOS bridge
Implement getCurrentI18nLocale() API for native iOS bridge
Implement getAiButtonFeatureFlag() API for native iOS bridge
Implement getCurrentWorkspaceId() API for native iOS bridge
Implement getCurrentDocId() API for native iOS bridge
Implement getCurrentDocContentInMarkdown() API for native iOS bridge
Implement createNewDocByMarkdownInCurrentWorkspace() API for native iOS bridge
Enable TypeScript strict mode

Files:

  • packages/frontend/apps/ios/src/plugins/nbstore/definitions.ts
  • packages/frontend/apps/ios/src/plugins/nbstore/index.ts
packages/frontend/apps/ios/**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (packages/frontend/apps/ios/AGENTS.md)

Follow ESLint/Prettier configuration from workspace root

Files:

  • packages/frontend/apps/ios/src/plugins/nbstore/definitions.ts
  • packages/frontend/apps/ios/src/plugins/nbstore/index.ts
🧠 Learnings (4)
📚 Learning: 2025-11-27T03:23:11.880Z
Learnt from: CR
Repo: toeverything/AFFiNE PR: 0
File: packages/frontend/apps/ios/AGENTS.md:0-0
Timestamp: 2025-11-27T03:23:11.880Z
Learning: Applies to packages/frontend/apps/ios/**/*.{ts,tsx} : Implement getCurrentDocId() API for native iOS bridge

Applied to files:

  • packages/frontend/apps/ios/src/plugins/nbstore/definitions.ts
  • packages/frontend/apps/ios/src/plugins/nbstore/index.ts
  • packages/frontend/native/index.d.ts
📚 Learning: 2025-11-27T03:23:11.880Z
Learnt from: CR
Repo: toeverything/AFFiNE PR: 0
File: packages/frontend/apps/ios/AGENTS.md:0-0
Timestamp: 2025-11-27T03:23:11.880Z
Learning: Applies to packages/frontend/apps/ios/**/*.{ts,tsx} : Implement createNewDocByMarkdownInCurrentWorkspace() API for native iOS bridge

Applied to files:

  • packages/frontend/apps/ios/src/plugins/nbstore/index.ts
📚 Learning: 2025-11-27T03:23:11.880Z
Learnt from: CR
Repo: toeverything/AFFiNE PR: 0
File: packages/frontend/apps/ios/AGENTS.md:0-0
Timestamp: 2025-11-27T03:23:11.880Z
Learning: Applies to packages/frontend/apps/ios/**/*.{ts,tsx} : Implement getCurrentDocContentInMarkdown() API for native iOS bridge

Applied to files:

  • packages/frontend/apps/ios/src/plugins/nbstore/index.ts
📚 Learning: 2025-11-27T03:23:11.880Z
Learnt from: CR
Repo: toeverything/AFFiNE PR: 0
File: packages/frontend/apps/ios/AGENTS.md:0-0
Timestamp: 2025-11-27T03:23:11.880Z
Learning: Applies to packages/frontend/apps/ios/**/*.{ts,tsx} : Expose JavaScript APIs to native iOS code through window object for Capacitor integration

Applied to files:

  • packages/frontend/apps/ios/src/plugins/nbstore/index.ts
🧬 Code graph analysis (6)
packages/frontend/native/nbstore/src/lib.rs (2)
packages/frontend/native/nbstore/src/indexer.rs (1)
  • crawl_doc_data (59-78)
packages/frontend/native/index.d.ts (1)
  • NativeCrawlResult (131-135)
packages/common/nbstore/src/impls/sqlite/db.ts (1)
packages/common/nbstore/src/storage/doc.ts (1)
  • CrawlResult (22-26)
packages/frontend/apps/ios/src/plugins/nbstore/definitions.ts (1)
packages/common/nbstore/src/storage/doc.ts (1)
  • CrawlResult (22-26)
packages/frontend/native/nbstore/src/indexer.rs (2)
packages/common/native/src/doc_parser.rs (2)
  • parse_doc_from_binary (90-300)
  • base (42-60)
packages/common/nbstore/src/storage/doc.ts (2)
  • BlockInfo (10-20)
  • CrawlResult (22-26)
packages/common/nbstore/src/impls/sqlite/doc.ts (1)
packages/common/nbstore/src/storage/doc.ts (1)
  • CrawlResult (22-26)
packages/common/nbstore/src/sync/indexer/index.ts (1)
packages/common/nbstore/src/sync/indexer/crawler.ts (1)
  • crawlingDocData (6-53)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: y-octo binding test on aarch64-apple-darwin
  • GitHub Check: Build AFFiNE native (aarch64-apple-darwin)
  • GitHub Check: E2E Test (8)
  • GitHub Check: E2E Mobile Test (4)
  • GitHub Check: E2E Test (3)
  • GitHub Check: E2E Test (9)
  • GitHub Check: E2E Test (4)
  • GitHub Check: E2E Test (10)
  • GitHub Check: E2E Test (6)
  • GitHub Check: E2E Test (5)
  • GitHub Check: E2E Test (7)
  • GitHub Check: E2E Test (1)
  • GitHub Check: E2E Test (2)
  • GitHub Check: Typecheck
  • GitHub Check: Build Server native
  • GitHub Check: Analyze (typescript, affine)
  • GitHub Check: Analyze (typescript, blocksuite)
  • GitHub Check: Analyze (javascript, blocksuite)
  • GitHub Check: Analyze (javascript, affine)
  • GitHub Check: Lint
🔇 Additional comments (27)
packages/frontend/native/nbstore/src/error.rs (1)

1-1: ParseError integration into the central Error enum looks correct and idiomatic

Importing ParseError and adding the transparent Parse(#[from] ParseError) variant is a clean way to propagate parsing failures through the existing Result<T> = std::result::Result<T, Error> alias and enables ? at call sites without extra boilerplate. It does slightly tighten the public error surface’s dependency on affine_common::doc_parser, but that matches the PR’s intent around native parsing and doesn’t introduce correctness or ergonomics issues in this module.

Also applies to: 13-14

packages/common/native/Cargo.toml (2)

38-38: ydoc-loader feature correctly expanded.

All four dependencies (assert-json-diff, serde, serde_json, thiserror, y-octo) are properly declared as optional in the [dependencies] section and correctly referenced in the feature definition.


7-7: Empty default features is a safe choice.

All workspace crates that depend on affine_common explicitly specify the features they need: packages/frontend/native/Cargo.toml uses ["hashcash"], packages/frontend/native/nbstore/Cargo.toml uses ["ydoc-loader"], packages/backend/native/Cargo.toml uses ["doc-loader", "hashcash"], and packages/mobile-native/Cargo.toml uses ["hashcash"]. No crate implicitly relies on default feature activation, so the shift to empty defaults improves modularity without breaking dependencies.

packages/backend/native/Cargo.toml (1)

10-10: Feature enablement is appropriate for backend.

Backend correctly enables both doc-loader (for document processing) and hashcash (likely for proof-of-work related functionality). Both features and their dependencies are properly defined.

packages/frontend/mobile-native/Cargo.toml (1)

18-18: Minimal feature footprint appropriate for mobile.

Mobile native enables only hashcash, which is a lightweight feature suitable for mobile constraints.

packages/frontend/native/Cargo.toml (3)

10-12: Explicit feature selection is clean.

Disabling defaults and explicitly selecting only hashcash is a best practice for frontend native bindings.


30-33: Dev dependencies are reasonable.

Adding chrono, serde_json, and uuid as dev dependencies is appropriate for testing scenarios (e.g., test fixtures with timestamps and UUIDs).


27-27: thiserror is actively used and required.

The dependency is used across multiple error modules in the native crate:

  • packages/frontend/native/nbstore/src/error.rs — #[derive(Debug, thiserror::Error)]
  • packages/frontend/native/media_capture/src/macos/error.rs — #[derive(Error, Debug)]
  • packages/frontend/native/media_capture/src/windows/error.rs — #[derive(Error, Debug)]

The dependency is legitimate and necessary for error type definitions in the native module.

packages/frontend/native/nbstore/Cargo.toml (4)

13-15: ydoc-loader feature is correct for document store.

The ydoc-loader feature provides YAML/document loading with proper dependencies. Using default-features = false with selective enablement is appropriate.


52-54: Dev dependencies align with testing needs.

serde_json and uuid are appropriate for test fixtures and data serialization scenarios in the store layer.


32-32: y-octo dependency is properly integrated. Verified that y-octo is used in packages/frontend/native/nbstore/src/indexer.rs for document handling operations.


21-21: Serde dependency is intentionally used directly in nbstore code.

Confirmed: serde::Serialize is directly imported and used in packages/frontend/native/nbstore/src/indexer.rs with derive macros applied to the NativeBlockInfo and NativeCrawlResult structs. The non-optional dependency is justified.

packages/frontend/apps/electron/src/helper/nbstore/handlers.ts (1)

50-50: crawlDocData binding aligns with existing handler pattern

The new crawlDocData: POOL.crawlDocData.bind(POOL) is consistent with the other nbstore handlers and correctly wires the pool API through NativeDBApis.

If not already done, please run tsc to ensure the return type of POOL.crawlDocData is structurally compatible with CrawlResult expected by NativeDBApis.crawlDocData.

packages/common/nbstore/src/impls/sqlite/doc.ts (1)

3-8: SqliteDocStorage.crawlDocData delegation is straightforward; clarify nullability semantics

The override cleanly delegates to this.db.crawlDocData(docId) and the added CrawlResult import is correct for the return type.

Given the method is typed as Promise<CrawlResult | null> but the DB API appears to return a non-null CrawlResult, please confirm the intended behavior when a doc does not exist (null vs thrown error) and align either the DB API or this signature accordingly.

Also applies to: 84-86

packages/common/nbstore/src/impls/sqlite/db.ts (1)

4-5: NativeDBApis extended cleanly with crawlDocData

Adding CrawlResult to the imports and crawlDocData: (id: string, docId: string) => Promise<CrawlResult> to NativeDBApis fits the existing pattern and will be correctly wrapped by NativeDBApisWrapper.

Ensure the native bindings (Electron/iOS) now implement this method for all platforms before release to avoid runtime “method not implemented” failures.

Also applies to: 85-86

packages/frontend/native/nbstore/src/blob.rs (1)

94-96: Tests correctly adapted to Box<[u8]> SetBlob.data

Using vec![0, 0].into() to build SetBlob.data matches the new Box<[u8]> type and works with the existing deref()/len() usage in set_blob.

Also applies to: 134-136, 182-184

packages/frontend/native/index.d.ts (1)

58-59: Type declarations for crawlDocData and native crawl result look consistent

Exposing crawlDocData(universalId: string, docId: string): Promise<NativeCrawlResult> on DocStoragePool and defining NativeBlockInfo/NativeCrawlResult here aligns with the Rust NAPI exports and provides a clear, structured shape for crawl results.

Please double-check that the TS BlockInfo/CrawlResult shape in @affine/nbstore remains structurally compatible with NativeBlockInfo/NativeCrawlResult so that consumers relying on either type don’t hit subtle typing mismatches.

Also applies to: 119-135

packages/common/nbstore/src/sync/indexer/index.ts (1)

416-445: LGTM! Clean fast-path with fallback pattern.

The implementation correctly tries the native crawl first and falls back to the JS-based crawling when unavailable or on failure. The error handling gracefully degrades without disrupting the indexing flow.

packages/frontend/native/nbstore/src/doc.rs (2)

104-109: LGTM! Correct binding after type change.

The direct binding of update (which is &[u8]) is correct since the caller already converts via .as_ref() at line 77 before passing to this internal method.


374-386: LGTM! Test correctly adapted to Bytes type.

The use of vec![0, 0].into() and comparison via .as_ref() is idiomatic for working with the bytes::Bytes type.

packages/common/nbstore/src/storage/doc.ts (3)

10-26: LGTM! Well-defined interfaces aligned with native types.

The BlockInfo and CrawlResult interfaces correctly mirror the Rust structures in doc_parser.rs, ensuring type consistency across the TypeScript and native boundary.


115-117: LGTM! Optional method in interface.

Making crawlDocData optional with ? allows storage implementations to opt-in to native crawling support while maintaining backward compatibility.


197-199: LGTM! Safe default implementation.

The base class returns null by default, allowing callers to detect when native crawling is unavailable and fall back to alternative approaches.

packages/frontend/native/nbstore/src/indexer.rs (4)

10-30: LGTM! Clean N-API type definitions.

The NativeBlockInfo and NativeCrawlResult structs correctly expose the document crawl data to JavaScript with appropriate napi attributes and Serialize derives for test assertions.


58-78: LGTM! Well-structured crawl entry point.

The method correctly handles the root document case by conditionally loading the root binary, and properly propagates errors through the ? operator.


101-122: LGTM! Efficient merge with fast paths.

The function handles edge cases well: returning an error for empty input and avoiding unnecessary work when there's only one segment.


152-196: LGTM! Good test coverage.

The tests verify both the successful parsing path (with fixture comparison) and the error handling for missing documents. The temporary directory pattern with cleanup ensures test isolation.

@codecov
Copy link

codecov bot commented Dec 6, 2025

Codecov Report

❌ Patch coverage is 2.63158% with 37 lines in your changes missing coverage. Please review.
✅ Project coverage is 56.60%. Comparing base (69cdeed) to head (6f08cfe).

Files with missing lines Patch % Lines
packages/common/nbstore/src/sync/indexer/index.ts 3.22% 30 Missing ⚠️
...ages/common/nbstore/src/impls/idb/indexer/index.ts 0.00% 5 Missing ⚠️
packages/common/nbstore/src/storage/doc.ts 0.00% 1 Missing ⚠️
...ckages/common/nbstore/src/storage/dummy/indexer.ts 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           canary   #14055      +/-   ##
==========================================
- Coverage   56.60%   56.60%   -0.01%     
==========================================
  Files        2757     2757              
  Lines      137789   137812      +23     
  Branches    21041    21046       +5     
==========================================
+ Hits        77994    78003       +9     
- Misses      57596    58122     +526     
+ Partials     2199     1687     -512     
Flag Coverage Δ
server-test 77.24% <ø> (+0.01%) ⬆️
unittest 31.97% <2.63%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
packages/common/nbstore/src/sync/indexer/index.ts (1)

494-522: Verify native block mapping matches JS crawler semantics

This mapping is the new single source of truth for native crawl results; a few details are worth double‑checking against crawlingDocData in packages/common/nbstore/src/sync/indexer/crawler.ts:

  • In crawlingDocData, additional is serialized (via JSON.stringify) before passing into IndexerDocument.from<'block'>, suggesting the index schema expects a string. Here you pass block.additional through as‑is. If BlockInfo['additional'] is not already a serialized string, this will change the indexed type compared to the non‑native path.
  • The native path ignores result.title at the call site (only block and summary are used). If title is meant to override or enrich the doc index, consider either wiring it through or dropping it from the return type to avoid confusion.
  • If the native BlockInfo carries extra fields used by the JS crawler (e.g., a markdown preview) you may want to surface them here as well to keep search/snippet behavior aligned between native and fallback crawls.

Please confirm that the BlockInfo / CrawlResult types are intentionally shaped to keep the index schema compatible with the existing JS crawler; otherwise, it’s safer to normalize here so both paths produce identical IndexerDocument<'block'> documents.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1879c41 and 37efb73.

📒 Files selected for processing (4)
  • packages/common/nbstore/src/sync/indexer/index.ts (2 hunks)
  • packages/frontend/apps/android/src/plugins/nbstore/definitions.ts (2 hunks)
  • packages/frontend/apps/android/src/plugins/nbstore/index.ts (1 hunks)
  • packages/frontend/native/nbstore/src/indexer.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/frontend/native/nbstore/src/indexer.rs
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-27T03:23:11.880Z
Learnt from: CR
Repo: toeverything/AFFiNE PR: 0
File: packages/frontend/apps/ios/AGENTS.md:0-0
Timestamp: 2025-11-27T03:23:11.880Z
Learning: Applies to packages/frontend/apps/ios/**/*.{ts,tsx} : Implement getCurrentDocId() API for native iOS bridge

Applied to files:

  • packages/frontend/apps/android/src/plugins/nbstore/index.ts
📚 Learning: 2025-11-27T03:23:11.880Z
Learnt from: CR
Repo: toeverything/AFFiNE PR: 0
File: packages/frontend/apps/ios/AGENTS.md:0-0
Timestamp: 2025-11-27T03:23:11.880Z
Learning: Applies to packages/frontend/apps/ios/**/*.{ts,tsx} : Implement createNewDocByMarkdownInCurrentWorkspace() API for native iOS bridge

Applied to files:

  • packages/frontend/apps/android/src/plugins/nbstore/index.ts
📚 Learning: 2025-11-27T03:23:11.880Z
Learnt from: CR
Repo: toeverything/AFFiNE PR: 0
File: packages/frontend/apps/ios/AGENTS.md:0-0
Timestamp: 2025-11-27T03:23:11.880Z
Learning: Applies to packages/frontend/apps/ios/**/*.{ts,tsx} : Implement getCurrentDocContentInMarkdown() API for native iOS bridge

Applied to files:

  • packages/frontend/apps/android/src/plugins/nbstore/index.ts
🧬 Code graph analysis (2)
packages/frontend/apps/android/src/plugins/nbstore/index.ts (1)
packages/frontend/apps/android/App/app/src/main/java/app/affine/pro/service/WebService.kt (1)
  • docId (25-25)
packages/common/nbstore/src/sync/indexer/index.ts (1)
packages/common/nbstore/src/sync/indexer/crawler.ts (1)
  • crawlingDocData (6-53)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: E2E BlockSuite Test (9)
  • GitHub Check: E2E BlockSuite Test (6)
  • GitHub Check: E2E BlockSuite Test (3)
  • GitHub Check: E2E BlockSuite Test (5)
  • GitHub Check: E2E BlockSuite Test (2)
  • GitHub Check: E2E Mobile Test (5)
  • GitHub Check: E2E Mobile Test (1)
  • GitHub Check: E2E Mobile Test (2)
  • GitHub Check: E2E BlockSuite Test (1)
  • GitHub Check: E2E Mobile Test (4)
  • GitHub Check: E2E Mobile Test (3)
  • GitHub Check: E2E BlockSuite Cross Browser Test (1, chromium)
  • GitHub Check: E2E BlockSuite Cross Browser Test (2, webkit)
  • GitHub Check: E2E BlockSuite Cross Browser Test (2, firefox)
  • GitHub Check: E2E BlockSuite Cross Browser Test (1, webkit)
  • GitHub Check: Typecheck
  • GitHub Check: Analyze (javascript, blocksuite)
  • GitHub Check: Analyze (typescript, affine)
  • GitHub Check: Analyze (typescript, blocksuite)
  • GitHub Check: Lint
🔇 Additional comments (2)
packages/common/nbstore/src/sync/indexer/index.ts (1)

415-444: Native fast-path + fallback structure looks correct

The control-flow here keeps the previous behavior (Y.Doc decode + crawlingDocData) as a clean fallback and only short-circuits when native crawl succeeds, so delete/skip semantics and downstream indexing logic remain intact. No issues from a correctness standpoint.

packages/frontend/apps/android/src/plugins/nbstore/definitions.ts (1)

1-1: LGTM! Type import is correctly structured.

The type-only import of CrawlResult follows TypeScript best practices.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 37efb73 and a86ba75.

📒 Files selected for processing (9)
  • packages/common/nbstore/src/impls/sqlite/doc.ts (2 hunks)
  • packages/frontend/apps/android/App/app/src/main/java/app/affine/pro/plugin/NbStorePlugin.kt (1 hunks)
  • packages/frontend/apps/android/App/app/src/main/java/uniffi/affine_mobile_native/affine_mobile_native.kt (10 hunks)
  • packages/frontend/apps/android/src/plugins/nbstore/definitions.ts (2 hunks)
  • packages/frontend/apps/android/src/plugins/nbstore/index.ts (1 hunks)
  • packages/frontend/apps/ios/src/plugins/nbstore/definitions.ts (2 hunks)
  • packages/frontend/apps/ios/src/plugins/nbstore/index.ts (2 hunks)
  • packages/frontend/mobile-native/src/lib.rs (4 hunks)
  • packages/frontend/native/nbstore/src/indexer.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • packages/frontend/apps/ios/src/plugins/nbstore/index.ts
  • packages/frontend/native/nbstore/src/indexer.rs
  • packages/frontend/apps/android/src/plugins/nbstore/index.ts
  • packages/frontend/apps/android/src/plugins/nbstore/definitions.ts
🧰 Additional context used
📓 Path-based instructions (2)
packages/frontend/apps/ios/**/*.{ts,tsx}

📄 CodeRabbit inference engine (packages/frontend/apps/ios/AGENTS.md)

packages/frontend/apps/ios/**/*.{ts,tsx}: Use TypeScript as the programming language
Expose JavaScript APIs to native iOS code through window object for Capacitor integration
Implement getCurrentServerBaseUrl() API for native iOS bridge
Implement getCurrentI18nLocale() API for native iOS bridge
Implement getAiButtonFeatureFlag() API for native iOS bridge
Implement getCurrentWorkspaceId() API for native iOS bridge
Implement getCurrentDocId() API for native iOS bridge
Implement getCurrentDocContentInMarkdown() API for native iOS bridge
Implement createNewDocByMarkdownInCurrentWorkspace() API for native iOS bridge
Enable TypeScript strict mode

Files:

  • packages/frontend/apps/ios/src/plugins/nbstore/definitions.ts
packages/frontend/apps/ios/**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (packages/frontend/apps/ios/AGENTS.md)

Follow ESLint/Prettier configuration from workspace root

Files:

  • packages/frontend/apps/ios/src/plugins/nbstore/definitions.ts
🧠 Learnings (3)
📚 Learning: 2025-11-27T03:23:11.880Z
Learnt from: CR
Repo: toeverything/AFFiNE PR: 0
File: packages/frontend/apps/ios/AGENTS.md:0-0
Timestamp: 2025-11-27T03:23:11.880Z
Learning: Applies to packages/frontend/apps/ios/**/*.{ts,tsx} : Implement createNewDocByMarkdownInCurrentWorkspace() API for native iOS bridge

Applied to files:

  • packages/frontend/apps/ios/src/plugins/nbstore/definitions.ts
📚 Learning: 2025-11-27T03:23:11.880Z
Learnt from: CR
Repo: toeverything/AFFiNE PR: 0
File: packages/frontend/apps/ios/AGENTS.md:0-0
Timestamp: 2025-11-27T03:23:11.880Z
Learning: Applies to packages/frontend/apps/ios/**/*.{ts,tsx} : Implement getCurrentDocId() API for native iOS bridge

Applied to files:

  • packages/frontend/apps/ios/src/plugins/nbstore/definitions.ts
📚 Learning: 2025-11-27T03:23:11.880Z
Learnt from: CR
Repo: toeverything/AFFiNE PR: 0
File: packages/frontend/apps/ios/AGENTS.md:0-0
Timestamp: 2025-11-27T03:23:11.880Z
Learning: Applies to packages/frontend/apps/ios/**/*.{ts,tsx} : Implement getCurrentDocContentInMarkdown() API for native iOS bridge

Applied to files:

  • packages/frontend/apps/ios/src/plugins/nbstore/definitions.ts
🧬 Code graph analysis (3)
packages/frontend/mobile-native/src/lib.rs (3)
packages/frontend/native/index.d.ts (2)
  • NativeBlockInfo (119-129)
  • NativeCrawlResult (131-135)
packages/frontend/native/nbstore/src/indexer.rs (3)
  • from (33-45)
  • from (49-55)
  • crawl_doc_data (59-78)
packages/common/nbstore/src/storage/doc.ts (1)
  • CrawlResult (22-26)
packages/common/nbstore/src/impls/sqlite/doc.ts (1)
packages/common/nbstore/src/storage/doc.ts (2)
  • CrawlResult (22-26)
  • BlockInfo (10-20)
packages/frontend/apps/android/App/app/src/main/java/app/affine/pro/plugin/NbStorePlugin.kt (4)
blocksuite/affine/data-view/src/core/utils/lazy.ts (1)
  • lazy (1-11)
packages/frontend/apps/android/App/app/src/main/java/uniffi/affine_mobile_native/affine_mobile_native.kt (1)
  • run (1738-1744)
packages/frontend/apps/android/src/plugins/nbstore/definitions.ts (1)
  • SetBlob (12-17)
packages/frontend/apps/ios/src/plugins/nbstore/definitions.ts (1)
  • SetBlob (12-17)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: Build AFFiNE native (x86_64-pc-windows-msvc)
  • GitHub Check: Build AFFiNE native (aarch64-pc-windows-msvc)
  • GitHub Check: E2E BlockSuite Cross Browser Test (2, webkit)
  • GitHub Check: E2E BlockSuite Cross Browser Test (2, firefox)
  • GitHub Check: E2E BlockSuite Cross Browser Test (2, chromium)
  • GitHub Check: Build AFFiNE native (x86_64-unknown-linux-gnu)
  • GitHub Check: E2E BlockSuite Cross Browser Test (1, firefox)
  • GitHub Check: E2E BlockSuite Cross Browser Test (1, webkit)
  • GitHub Check: E2E BlockSuite Cross Browser Test (1, chromium)
  • GitHub Check: Analyze (javascript, affine)
  • GitHub Check: Analyze (typescript, affine)
  • GitHub Check: Analyze (javascript, blocksuite)
  • GitHub Check: Analyze (typescript, blocksuite)
  • GitHub Check: E2E Mobile Test (5)
  • GitHub Check: E2E Mobile Test (3)
  • GitHub Check: E2E Mobile Test (2)
  • GitHub Check: E2E Mobile Test (1)
  • GitHub Check: loom thread test
  • GitHub Check: fuzzing
  • GitHub Check: Lint Rust
🔇 Additional comments (6)
packages/frontend/apps/android/App/app/src/main/java/uniffi/affine_mobile_native/affine_mobile_native.kt (3)

809-810: crawl_doc_data checksum wiring matches existing pattern

The new checksum function and its validation branch for docstoragepool_crawl_doc_data are consistent with neighboring entries (signature, return type, and hard-coded checksum). Nothing stands out as problematic here.

Also applies to: 1116-1118


920-921: crawlDocData async FFI path is consistent with other DocStoragePool methods

  • FFI declaration uniffi_affine_mobile_native_fn_method_docstoragepool_crawl_doc_data matches the other async DocStoragePool methods (returns Long future handle, takes Pointer + RustBuffer arguments).
  • The DocStoragePoolInterface.crawlDocData signature and the DocStoragePool override follow the same uniffiRustCallAsync + callWithPointer pattern used by getBlob, getDocSnapshot, etc., including error handling via UniffiException.ErrorHandler.
  • Using the *_rust_buffer future functions and FfiConverterTypeCrawlResult.lift(it) is appropriate for a structured result type.

No functional or lifecycle issues spotted specific to this new method.

Also applies to: 1614-1615, 1801-1819


2459-2515: BlockInfo/CrawlResult models and converters align with UniFFI serialization conventions

  • BlockInfo and CrawlResult field order is consistent across data class, read, allocationSize, and write, which is critical for binary compatibility.
  • Optional fields (content, blob, refDocId, refInfo, parentFlavour, parentBlockId, additional) correctly use the new FfiConverterOptionalString / FfiConverterOptionalSequenceString helpers with a 1‑byte presence tag, mirroring existing optional converters (e.g., FfiConverterOptionalLong).
  • FfiConverterSequenceString and FfiConverterSequenceTypeBlockInfo mirror the existing SequenceLong / SequenceTypeDocClock implementations (length prefix + per‑item converter), so list serialization should be compatible with the Rust side.

Assuming the Rust definitions were regenerated from the same UniFFI schema, these bindings look mechanically sound.

Also applies to: 2519-2551, 2863-2890, 2991-3018, 3051-3074, 3079-3102

packages/frontend/apps/ios/src/plugins/nbstore/definitions.ts (1)

1-1: CrawlResult import and options-object crawlDocData signature look correct

Importing CrawlResult and switching crawlDocData to (options: { id: string; docId: string }) => Promise<CrawlResult> matches the rest of NbStorePlugin and the Capacitor options-object calling convention, resolving the earlier positional-argument issue. Based on learnings, this matches our iOS bridge conventions.

Also applies to: 154-157

packages/common/nbstore/src/impls/sqlite/doc.ts (1)

85-127: crawlDocData normalization and validation look solid

The new crawlDocData override and helpers (normalizeNativeCrawlResult, normalizeBlock, readStringField/readStringArrayField, safeAdditionalField, readField with snake_case fallback) provide a robust normalization layer from the raw DB/FFI result into the typed CrawlResult/BlockInfo, with sensible guards and warnings on malformed data. No functional issues spotted.

Also applies to: 129-205

packages/frontend/mobile-native/src/lib.rs (1)

188-232: BlockInfo/CrawlResult uniffi records and crawl_doc_data wiring look consistent; confirm identifier semantics

The new uniffi BlockInfo and CrawlResult records, plus the From<affine_nbstore::indexer::NativeBlockInfo> / From<NativeCrawlResult> impls, mirror the native/TS shapes field‑for‑field, so the data flows cleanly across the FFI boundary. The DocStoragePool::crawl_doc_data method also follows the same pattern as other methods: resolve the pool entry with inner.get(universal_id.clone()), call into the inner crawl_doc_data, and convert the result via Into.

One thing to double-check: the inner crawl_doc_data currently receives &universal_id as its first argument. Since the indexer’s crawl_doc_data API is defined as (space_id: &str, doc_id: &str), please confirm that universal_id is indeed the intended “space id” value here (and not a distinct per-connection identifier). If they differ, you may want to thread the actual space id instead.

Also applies to: 695-703

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
packages/frontend/native/nbstore/src/doc.rs (1)

361-361: Consider using .as_ref() instead of .to_vec() in test assertions.

The .to_vec() calls allocate new vectors for each comparison. For more efficient tests, you could compare using references:

-result.iter().map(|u| u.bin.to_vec()).collect::<Vec<_>>()
+result.iter().map(|u| u.bin.as_ref()).collect::<Vec<&[u8]>>()

Or for the single-value assertions:

-assert_eq!(result.unwrap().bin.to_vec(), vec![0, 0]);
+assert_eq!(&*result.unwrap().bin, &[0, 0][..]);

However, the current approach is clear and the overhead is negligible in tests.

Also applies to: 385-385, 403-403, 419-419

packages/frontend/native/nbstore/src/indexer.rs (1)

124-199: Consider additional test coverage for edge cases.

The existing tests cover basic success and error paths well. Consider adding tests for:

  1. Multi-update merge: Insert multiple updates and verify correct merging (tests the sort and merge logic on lines 88-97)
  2. Root doc scenarios: Test when doc_id == space_id (no root doc needed) and when root doc is required
  3. Update ordering: Verify updates with out-of-order timestamps are correctly sorted

These would increase confidence in the merge logic and root doc handling.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a86ba75 and 8f2720b.

📒 Files selected for processing (4)
  • packages/frontend/apps/android/App/app/src/main/java/app/affine/pro/plugin/NbStorePlugin.kt (1 hunks)
  • packages/frontend/mobile-native/src/lib.rs (2 hunks)
  • packages/frontend/native/nbstore/src/doc.rs (5 hunks)
  • packages/frontend/native/nbstore/src/indexer.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/frontend/apps/android/App/app/src/main/java/app/affine/pro/plugin/NbStorePlugin.kt
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: E2E Test (10)
  • GitHub Check: E2E Test (4)
  • GitHub Check: E2E Test (8)
  • GitHub Check: E2E Test (3)
  • GitHub Check: E2E Test (5)
  • GitHub Check: E2E Test (9)
  • GitHub Check: E2E Mobile Test (4)
  • GitHub Check: E2E Mobile Test (3)
  • GitHub Check: E2E Mobile Test (2)
  • GitHub Check: E2E Mobile Test (1)
  • GitHub Check: loom thread test
  • GitHub Check: miri code check
  • GitHub Check: Build @affine/electron renderer
  • GitHub Check: Build AFFiNE native (x86_64-pc-windows-msvc)
  • GitHub Check: Analyze (typescript, affine)
  • GitHub Check: Build AFFiNE native (aarch64-pc-windows-msvc)
  • GitHub Check: Analyze (javascript, blocksuite)
  • GitHub Check: Typecheck
  • GitHub Check: Analyze (javascript, affine)
  • GitHub Check: Lint
🔇 Additional comments (12)
packages/frontend/native/nbstore/src/doc.rs (2)

106-106: LGTM! Redundant .as_ref() call removed.

Since update is already &[u8] (per the function signature), binding it directly is correct and more concise.


149-149: LGTM! Correct use of .deref() for Bytes binding.

Explicitly dereferencing Bytes to &[u8] for sqlx binding is the correct approach.

packages/frontend/mobile-native/src/lib.rs (3)

186-213: LGTM!

The BlockInfo struct and its conversion from NativeBlockInfo are well-defined. The use of Option fields provides flexibility for representing sparse block metadata, and the straightforward field mapping is appropriate for this FFI boundary.


215-230: LGTM!

The CrawlResult struct and conversion implementation are correct. The vector transformation using into_iter().map(Into::into).collect() is idiomatic Rust for converting collections of types.


693-701: The universal_id parameter being cloned for get() (line 696) and then passed separately to crawl_doc_data() (line 698) is intentional and necessary. Unlike other storage methods that only require doc_id after the get() call, the underlying crawl_doc_data API signature requires both universal_id and doc_id as parameters. The clone is required because get() consumes its parameter. This pattern is correct as written.

packages/frontend/native/nbstore/src/indexer.rs (7)

1-8: LGTM!

The imports are well-organized and appropriate for the functionality. The FFI bindings via napi_derive, CRDT operations via y_octo, and document parsing utilities are all correctly imported.


10-22: LGTM!

The NativeBlockInfo struct correctly mirrors the BlockInfo type from the parser with appropriate FFI annotations. All fields are properly typed with Options where nullable.


24-30: LGTM!

The NativeCrawlResult struct correctly maps to the CrawlResult type with appropriate FFI annotations.


32-56: LGTM!

The conversion implementations are clean and idiomatic. The vector transformation using into_iter().map(Into::into).collect() on line 51 efficiently converts all blocks in a single pass.


80-98: LGTM!

The implementation is efficient and correct:

  • Proper sorting of updates by timestamp (line 88) ensures correct CRDT merge order
  • Smart capacity pre-allocation (line 90-91) avoids reallocation
  • Early return for missing data (line 84-86) avoids unnecessary processing

139-152: LGTM!

The test helper functions are well-designed with proper async handling and unique temp directory generation using UUIDs to avoid collisions.


101-122: No changes needed. The implementation correctly uses binary format v1 consistently throughout the codebase (all stored updates are in v1 format, and the merge operation applies and encodes them using v1 methods). The GUID parameter correctly uses the document ID as the identifier for CRDT merging. The code is well-structured with appropriate defensive checks and error handling.

@darkskygit darkskygit changed the title feat: native reader binding feat(native): native reader for indexer Dec 6, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
packages/frontend/native/nbstore/src/indexer.rs (1)

88-109: LGTM: Update merging correctly implemented.

The function properly:

  • Validates input defensively
  • Optimizes single-segment case
  • Applies updates sequentially to build final document state
  • Encodes the merged result

Optional: Line 94's ok_or is redundant since we just verified segments.len() == 1, guaranteeing pop() returns Some:

  if segments.len() == 1 {
-    return segments.pop().ok_or(ParseError::DocNotFound.into());
+    return Ok(segments.pop().unwrap());
  }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 8f2720b and 71424d8.

📒 Files selected for processing (6)
  • packages/common/native/src/doc_parser.rs (2 hunks)
  • packages/frontend/mobile-native/src/lib.rs (2 hunks)
  • packages/frontend/native/Cargo.toml (2 hunks)
  • packages/frontend/native/nbstore/Cargo.toml (3 hunks)
  • packages/frontend/native/nbstore/src/indexer.rs (1 hunks)
  • packages/frontend/native/nbstore/src/lib.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • packages/frontend/native/nbstore/src/lib.rs
  • packages/frontend/mobile-native/src/lib.rs
  • packages/frontend/native/nbstore/Cargo.toml
🧰 Additional context used
🧬 Code graph analysis (1)
packages/frontend/native/nbstore/src/indexer.rs (3)
packages/common/native/src/doc_parser.rs (3)
  • parse_doc_from_binary (82-285)
  • from (77-79)
  • base (34-52)
packages/common/nbstore/src/storage/doc.ts (2)
  • BlockInfo (10-20)
  • CrawlResult (22-26)
packages/frontend/native/index.d.ts (2)
  • NativeBlockInfo (119-129)
  • NativeCrawlResult (131-135)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: y-octo binding test on x86_64-apple-darwin
  • GitHub Check: Build AFFiNE native (aarch64-pc-windows-msvc)
  • GitHub Check: E2E Test (10)
  • GitHub Check: Build AFFiNE native (x86_64-apple-darwin)
  • GitHub Check: E2E Test (9)
  • GitHub Check: E2E Mobile Test (4)
  • GitHub Check: E2E Test (7)
  • GitHub Check: E2E Test (8)
  • GitHub Check: E2E Test (5)
  • GitHub Check: E2E Test (4)
  • GitHub Check: E2E Test (6)
  • GitHub Check: E2E Test (2)
  • GitHub Check: E2E Test (3)
  • GitHub Check: E2E Test (1)
  • GitHub Check: E2E BlockSuite Cross Browser Test (2, webkit)
  • GitHub Check: E2E BlockSuite Cross Browser Test (1, webkit)
  • GitHub Check: E2E BlockSuite Cross Browser Test (2, chromium)
  • GitHub Check: Build Server native
  • GitHub Check: Lint
🔇 Additional comments (11)
packages/frontend/native/Cargo.toml (3)

10-10: Hashcash feature consistently added across crates.

The hashcash feature on affine_common aligns with parallel changes in other native crates (backend/native, mobile-native), suggesting coordinated dependency management across the native module ecosystem.


25-25: Appropriate error handling library for FFI layer.

Adding thiserror is a solid practice for native bindings, as it enables clean error type definitions and automatic conversions—critical for propagating domain errors across the NAPI boundary to JavaScript.


28-31: Dev-dependencies well-selected for testing structured data.

The dev-dependencies (chrono, serde_json, uuid) are appropriately chosen for testing the crawlDocData API's structured results (blocks, title, summary), temporal handling, and identifier generation.

packages/common/native/src/doc_parser.rs (3)

82-82: LGTM: Function signature simplified.

The removal of the CrawlDocInput wrapper struct in favor of direct parameters improves API clarity and reduces indirection.


497-500: LGTM: Test updated correctly.

The test properly constructs separate doc_bin and doc_id parameters to match the new function signature.


83-83: Document the rationale for the [0, 0] binary validation.

The check for doc_bin == [0, 0] lacks documentation and isn't covered by test cases. No evidence suggests this pattern is a known invalid format marker in the yjs-compatible binary format. Either add a test case demonstrating when this occurs and why it's invalid, or replace this with a more general validation (e.g., checking minimum binary size) if that's the actual intent.

packages/frontend/native/nbstore/src/indexer.rs (5)

8-28: LGTM: DTOs properly defined.

The NativeBlockInfo and NativeCrawlResult structs correctly mirror their counterparts from the parser module with appropriate NAPI bindings for FFI exposure.


30-54: LGTM: Type conversions implemented correctly.

Both From implementations properly map all fields, with the blocks vector conversion using idiomatic iterator chaining.


56-65: LGTM: Crawl implementation is clean and correct.

The method properly:

  • Loads the document binary
  • Returns DocNotFound for missing documents
  • Delegates parsing to parse_doc_from_binary
  • Converts the result to the native DTO

The previous concern about unused root doc loading has been addressed.


67-86: LGTM: Binary loading logic is sound.

The method correctly:

  • Handles missing documents by returning None
  • Sorts updates chronologically (critical for CRDT semantics)
  • Assembles segments with snapshot first, followed by ordered updates
  • Pre-allocates capacity for efficiency

111-180: LGTM: Test coverage is comprehensive.

The tests properly validate:

  • Successful parsing of a snapshot against a known fixture
  • Error handling for missing documents
  • Cleanup of temporary resources

The use of fixtures and temporary directories follows best practices.

@darkskygit darkskygit requested review from EYHN and pengx17 December 6, 2025 16:12
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
packages/common/native/src/doc_parser.rs (1)

82-83: Clarify the intent of the doc_bin == [0, 0] check.

The condition doc_bin == [0, 0] only evaluates to true if doc_bin contains exactly the bytes [0, 0]. If the intent is to detect invalid binary formats more broadly, consider documenting why this specific pattern is invalid or revising the condition to match the intended validation logic.


Consider accepting a slice instead of owned Vec<u8>.

The function takes ownership of doc_bin but could accept &[u8] for better API flexibility. While the current approach is reasonable given that call sites may pass owned vectors, using a slice would be more idiomatic:

-pub fn parse_doc_from_binary(doc_bin: Vec<u8>, doc_id: String) -> Result<CrawlResult, ParseError> {
+pub fn parse_doc_from_binary(doc_bin: &[u8], doc_id: String) -> Result<CrawlResult, ParseError> {
   if doc_bin.is_empty() || doc_bin == [0, 0] {

All callers have been updated correctly.

The single call site in packages/frontend/native/nbstore/src/indexer.rs:63 has been updated to pass doc_bin and doc_id as separate parameters, matching the new signature.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 8ce7359 and 6f08cfe.

📒 Files selected for processing (9)
  • packages/common/native/src/doc_parser.rs (2 hunks)
  • packages/common/nbstore/src/impls/cloud/indexer.ts (1 hunks)
  • packages/common/nbstore/src/impls/idb/indexer/index.ts (1 hunks)
  • packages/common/nbstore/src/storage/dummy/indexer.ts (1 hunks)
  • packages/common/nbstore/src/storage/indexer.ts (2 hunks)
  • packages/common/nbstore/src/sync/indexer/index.ts (6 hunks)
  • packages/frontend/mobile-native/src/lib.rs (2 hunks)
  • packages/frontend/native/nbstore/src/indexer.rs (1 hunks)
  • packages/frontend/native/nbstore/src/lib.rs (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
packages/common/native/src/doc_parser.rs (1)
packages/common/nbstore/src/storage/doc.ts (1)
  • CrawlResult (22-26)
packages/common/nbstore/src/sync/indexer/index.ts (1)
packages/common/nbstore/src/sync/indexer/crawler.ts (1)
  • crawlingDocData (6-53)
packages/frontend/mobile-native/src/lib.rs (3)
packages/frontend/native/index.d.ts (2)
  • NativeBlockInfo (119-129)
  • NativeCrawlResult (131-135)
packages/frontend/native/nbstore/src/indexer.rs (3)
  • from (31-43)
  • from (47-53)
  • crawl_doc_data (57-65)
packages/common/nbstore/src/storage/doc.ts (1)
  • CrawlResult (22-26)
packages/frontend/native/nbstore/src/indexer.rs (3)
packages/common/native/src/doc_parser.rs (3)
  • parse_doc_from_binary (82-285)
  • from (77-79)
  • base (34-52)
packages/common/nbstore/src/storage/doc.ts (2)
  • BlockInfo (10-20)
  • CrawlResult (22-26)
packages/frontend/native/index.d.ts (2)
  • NativeBlockInfo (119-129)
  • NativeCrawlResult (131-135)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: y-octo binding test on x86_64-apple-darwin
  • GitHub Check: y-octo binding test on aarch64-apple-darwin
  • GitHub Check: E2E BlockSuite Cross Browser Test (2, webkit)
  • GitHub Check: E2E BlockSuite Cross Browser Test (2, chromium)
  • GitHub Check: E2E BlockSuite Cross Browser Test (1, firefox)
  • GitHub Check: Build AFFiNE native (aarch64-pc-windows-msvc)
  • GitHub Check: E2E Mobile Test (5)
  • GitHub Check: Build AFFiNE native (x86_64-unknown-linux-gnu)
  • GitHub Check: Build AFFiNE native (aarch64-apple-darwin)
  • GitHub Check: Build AFFiNE native (x86_64-apple-darwin)
  • GitHub Check: E2E Mobile Test (2)
  • GitHub Check: E2E Mobile Test (4)
  • GitHub Check: E2E Mobile Test (3)
  • GitHub Check: Analyze (typescript, affine)
  • GitHub Check: Analyze (typescript, blocksuite)
  • GitHub Check: Typecheck
  • GitHub Check: E2E Mobile Test (1)
  • GitHub Check: Analyze (javascript, blocksuite)
  • GitHub Check: Analyze (javascript, affine)
  • GitHub Check: Lint
🔇 Additional comments (22)
packages/common/native/src/doc_parser.rs (1)

497-500: LGTM! Test correctly updated to match the new signature.

The test has been properly refactored to construct doc_bin and doc_id separately and pass them as individual arguments to parse_doc_from_binary.

packages/common/nbstore/src/storage/dummy/indexer.ts (1)

88-90: LGTM!

The no-op implementation of refreshIfNeed is appropriate for the dummy storage, consistent with other stub methods in this class.

packages/common/nbstore/src/impls/cloud/indexer.ts (1)

143-145: LGTM!

The no-op implementation is appropriate for cloud storage since it's marked as readonly and doesn't maintain pending updates locally.

packages/common/nbstore/src/impls/idb/indexer/index.ts (1)

179-192: LGTM! Smart batching optimization.

The implementation correctly identifies tables with pending operations and selectively refreshes only those that need it, avoiding unnecessary IndexedDB writes.

packages/frontend/native/nbstore/src/lib.rs (2)

6-6: LGTM!

The new indexer module is appropriately exposed as public.


121-133: LGTM!

The crawl_doc_data method follows the established delegation pattern used by other methods in this class, properly propagating errors and converting the result.

packages/common/nbstore/src/storage/indexer.ts (2)

65-65: LGTM!

The new refreshIfNeed method properly extends the IndexerStorage interface, enabling conditional refresh behavior across implementations.


178-178: LGTM!

The abstract method declaration correctly enforces implementation in concrete storage classes.

packages/frontend/mobile-native/src/lib.rs (2)

186-230: LGTM! Clean FFI type definitions.

The BlockInfo and CrawlResult structs properly bridge the native and uniffi boundaries, with straightforward conversions that maintain field alignment.


693-701: LGTM!

The method follows the established pattern used throughout this file, properly delegating to the inner pool and converting the result.

packages/common/nbstore/src/sync/indexer/index.ts (7)

120-120: LGTM!

The lastRefreshed field enables throttling of refresh operations to reduce overhead.


383-383: LGTM!

Replacing direct refresh with refreshIfNeed enables batching of updates for better performance.


416-445: LGTM! Well-structured fast-path with fallback.

The native crawl path is attempted first, and on failure gracefully falls back to the existing YDoc-based crawling logic, ensuring robustness.


466-466: LGTM!

Consistent use of the throttled refresh mechanism after processing document blocks.


481-481: LGTM!

Ensuring a final refresh in the finally block properly flushes any remaining batched updates.


486-492: LGTM! Effective throttling mechanism.

The 100ms throttle window provides a good balance between responsiveness and batching efficiency.


503-531: LGTM! Robust error handling.

The try-catch properly handles failures in the native path, logging warnings for debugging while returning null to trigger the fallback mechanism.

packages/frontend/native/nbstore/src/indexer.rs (5)

8-54: LGTM! Clean type definitions and conversions.

The N-API types and conversions properly bridge the native parser types to the JavaScript boundary, with straightforward field mappings.


56-65: LGTM!

The crawl_doc_data method cleanly orchestrates loading and parsing, with appropriate error propagation for missing documents.


67-86: LGTM! Solid document reconstruction logic.

The method properly handles the common pattern of merging a snapshot with incremental updates, correctly returning None when no data exists.


88-109: LGTM! Efficient merge implementation.

The function correctly optimizes for the single-segment case while properly handling the general case of applying multiple updates to a YDoc.


111-180: LGTM!

The tests provide good coverage, validating both the happy path (parsing against fixtures) and error handling (missing documents).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

2 participants