Added support of ApacheArrow compression #599
Merged
alex268 merged 5 commits into ydb-platform:release_v2.4.0 from Feb 25, 2026
Conversation
Support of reading Revert backward compatibility
Pull request overview
This PR adds Apache Arrow compression support for query result streaming and for building compressed Arrow batches for bulk upserts, alongside some API refactoring around Arrow bulk-upsert data.
Changes:
- Add configurable Apache Arrow compression for query result parts (LZ4 frame / Zstd) and a helper handler to decode compressed Arrow record batches.
- Extend Arrow bulk upsert writer to build compressed batches and rename/move Arrow bulk-upsert payload types.
- Refactor `BulkUpsertData` from an interface + impl to a concrete base class, with Arrow payload extending it.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| table/src/test/java/tech/ydb/table/query/arrow/ApacheArrowWriterTest.java | Updates Arrow writer tests and adds a compression-related test. |
| table/src/test/java/tech/ydb/table/integration/ReadTableTest.java | Updates bulk upsert construction to new BulkUpsertData API. |
| table/src/test/java/tech/ydb/table/integration/BulkUpsertTest.java | Updates Arrow writer import after package move. |
| table/src/test/java/tech/ydb/table/integration/AllTypesRecord.java | Refactors random record generation and updates Arrow writer import. |
| table/src/main/java/tech/ydb/table/values/Value.java | Fixes Javadoc link formatting for a deprecated method. |
| table/src/main/java/tech/ydb/table/query/arrow/ApacheArrowWriter.java | Adds compressed batch building API and updates Arrow field creation. |
| table/src/main/java/tech/ydb/table/query/arrow/ApacheArrowData.java | Renames/moves Arrow bulk upsert payload and integrates with new BulkUpsertData base class. |
| table/src/main/java/tech/ydb/table/query/BulkUpsertProtoData.java | Removes proto-specific BulkUpsertData implementation (now folded into BulkUpsertData). |
| table/src/main/java/tech/ydb/table/query/BulkUpsertData.java | Converts BulkUpsertData into a concrete class holding proto rows. |
| table/src/main/java/tech/ydb/table/Session.java | Updates default bulk upsert path to use new BulkUpsertData constructor. |
| table/pom.xml | Adds arrow-compression as a test dependency. |
| query/src/test/java/tech/ydb/query/result/arrow/ArrowValueReaderTest.java | Updates Arrow field construction to use FieldType. |
| query/src/test/java/tech/ydb/query/impl/ApacheArrowTest.java | Adds integration tests covering compressed Arrow query results and binary copy scenarios. |
| query/src/test/java/tech/ydb/query/impl/AllTypesRecord.java | Updates Arrow writer import after package move. |
| query/src/main/java/tech/ydb/query/settings/ExecuteQuerySettings.java | Adds ApacheArrowFormatMode to configure Arrow result compression. |
| query/src/main/java/tech/ydb/query/settings/ApacheArrowFormatMode.java | Introduces Arrow compression mode configuration object. |
| query/src/main/java/tech/ydb/query/result/arrow/CompressedArrowPartsHandler.java | Adds handler that can decode compressed Arrow record batches. |
| query/src/main/java/tech/ydb/query/result/arrow/ArrowPartsHandler.java | Adds an overridable loader factory to support compressed loaders. |
| query/src/main/java/tech/ydb/query/impl/SessionImpl.java | Maps Arrow compression settings into gRPC request (ArrowFormatSettings). |
| query/pom.xml | Adds optional arrow-compression dependency for compressed Arrow support. |
| pom.xml | Adds Arrow compression dependency version management. |
Comments suppressed due to low confidence (3)
table/src/test/java/tech/ydb/table/query/arrow/ApacheArrowWriterTest.java:74
- The test method is named `zstdCompressionTest`, but it creates an `LZ4_FRAME` codec. This looks like a mismatch (either the test name or the codec type should be changed); also consider asserting that the compressed batch can be deserialized with a compression-aware `VectorLoader` and yields the original value, not just that the raw bytes differ.
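The stronger assertion suggested above can be sketched as a round-trip check. The snippet below is only an illustration of the pattern: it uses `java.util.zip` (DEFLATE) as a stand-in so it runs without Arrow on the classpath, and `RoundTripSketch` is a hypothetical name, not part of this PR. A real test would presumably load the compressed batch through Arrow's own decompression path instead.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper: DEFLATE stands in for the LZ4 frame / Zstd codecs used in the PR.
final class RoundTripSketch {
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress and nothing left to read means the stream is broken.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or invalid input");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // The check a test should make: decoding yields the original bytes.
    static boolean roundTrips(byte[] original) {
        return Arrays.equals(original, decompress(compress(original)));
    }
}
```

Checking only that the compressed bytes differ from the uncompressed ones would also pass for a corrupted buffer; the round-trip comparison rules that out.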
table/src/main/java/tech/ydb/table/query/arrow/ApacheArrowData.java:20
- `ApacheArrowData` has to call `super((TypedValue) null)`, leaving the base `BulkUpsertData` state invalid. Consider refactoring `BulkUpsertData` to avoid requiring a nullable `rows` value for non-proto upsert payloads (e.g., make `BulkUpsertData` abstract with `applyToRequest` abstract, or add a protected no-rows constructor).
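One possible shape for the abstract-base option is sketched below. All type names (`SketchRequest`, `SketchBulkUpsertData`, `SketchProtoData`, `SketchArrowData`) are hypothetical stand-ins chosen for a self-contained example; the real SDK classes, their fields, and the gRPC request type are not reproduced here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the request builder; not the SDK's real type.
final class SketchRequest {
    final List<String> parts = new ArrayList<>();
}

// Abstract base with no rows field, so no subclass has to pass null upward.
abstract class SketchBulkUpsertData {
    abstract void applyToRequest(SketchRequest request);
}

// Row-based payload: owns its rows and knows how to write them.
final class SketchProtoData extends SketchBulkUpsertData {
    private final List<String> rows;

    SketchProtoData(List<String> rows) {
        this.rows = Collections.unmodifiableList(new ArrayList<>(rows));
    }

    @Override
    void applyToRequest(SketchRequest request) {
        request.parts.add("rows:" + rows.size());
    }
}

// Arrow-style payload: carries serialized schema and batch bytes instead of rows.
final class SketchArrowData extends SketchBulkUpsertData {
    private final byte[] schema;
    private final byte[] batch;

    SketchArrowData(byte[] schema, byte[] batch) {
        this.schema = schema.clone();
        this.batch = batch.clone();
    }

    @Override
    void applyToRequest(SketchRequest request) {
        request.parts.add("arrow:" + schema.length + "+" + batch.length);
    }
}
```

With this layout each payload type validates only its own state. The trade-off is that callers can no longer instantiate the base class directly, which may matter if existing code relies on a public constructor.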
table/src/main/java/tech/ydb/table/query/arrow/ApacheArrowWriter.java:1
- Changing the package of a public API type (`tech.ydb.table.query.ApacheArrowWriter` -> `tech.ydb.table.query.arrow.ApacheArrowWriter`) is a breaking change for downstream consumers. If backward compatibility is required, consider leaving a deprecated shim type in the old package that delegates to the new implementation (or at least document this break in the changelog/release notes).
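A delegating shim of the kind mentioned above might look roughly like this. The example is a single-file sketch, so the two classes carry different hypothetical names (`RelocatedWriter`, `LegacyWriterShim`) rather than the same simple name in two packages, and the `addRow`/`build` methods are invented for illustration and do not reflect the actual `ApacheArrowWriter` API.

```java
// Stands in for the type at its new location; methods are illustrative only.
class RelocatedWriter {
    private final StringBuilder rows = new StringBuilder();

    RelocatedWriter addRow(String row) {
        rows.append(row).append('\n');
        return this;
    }

    String build() {
        return rows.toString();
    }
}

/**
 * Stands in for the type left behind at the old location.
 *
 * @deprecated use {@link RelocatedWriter} instead
 */
@Deprecated
final class LegacyWriterShim {
    // Every call is forwarded, so behavior cannot drift from the new type.
    private final RelocatedWriter delegate = new RelocatedWriter();

    LegacyWriterShim addRow(String row) {
        delegate.addRow(row);
        return this;
    }

    String build() {
        return delegate.build();
    }
}
```

Existing callers keep compiling and get a deprecation warning pointing at the new type, and the shim can be dropped in a later major release.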
af14b2f to d5c44ef
pnv1 reviewed Feb 25, 2026
query/src/main/java/tech/ydb/query/settings/ApacheArrowFormatMode.java
Outdated
pnv1 approved these changes Feb 25, 2026
No description provided.