chore(deps): bump the datafusion-arrow-parquet group across 1 directory with 22 updates #797

Merged
paleolimbot merged 23 commits into main from dependabot/cargo/datafusion-arrow-parquet-3fa1e37285
May 5, 2026

@dependabot dependabot Bot commented on behalf of github Apr 27, 2026

Bumps the datafusion-arrow-parquet group with 19 updates in the / directory:

| Package | From | To |
| --- | --- | --- |
| arrow | 57.1.0 | 58.1.0 |
| arrow-array | 57.1.0 | 58.1.0 |
| arrow-cast | 57.1.0 | 58.1.0 |
| arrow-ipc | 57.1.0 | 58.1.0 |
| arrow-json | 57.1.0 | 58.1.0 |
| datafusion | 51.0.0 | 52.5.0 |
| datafusion-catalog | 51.0.0 | 52.5.0 |
| datafusion-common | 51.0.0 | 52.5.0 |
| datafusion-common-runtime | 51.0.0 | 52.5.0 |
| datafusion-datasource | 51.0.0 | 52.5.0 |
| datafusion-datasource-parquet | 51.0.0 | 52.5.0 |
| datafusion-execution | 51.0.0 | 52.5.0 |
| datafusion-expr | 51.0.0 | 52.5.0 |
| datafusion-ffi | 51.0.0 | 52.5.0 |
| datafusion-optimizer | 51.0.0 | 52.5.0 |
| datafusion-physical-expr | 51.0.0 | 52.5.0 |
| datafusion-physical-plan | 51.0.0 | 52.5.0 |
| datafusion-pruning | 51.0.0 | 52.5.0 |
| parquet | 57.1.0 | 58.1.0 |

Updates arrow from 57.1.0 to 58.1.0

Changelog

Sourced from arrow's changelog.

58.1.0 (2026-03-20)

Full Changelog

Implemented enhancements:

  • Reuse compression dict lz4_block #9566
  • [Variant] Add variant_to_arrow Struct type support #9529
  • [Variant] Add unshred_variant support for Binary and LargeBinary types #9526
  • [Variant] Add shred_variant support for LargeUtf8 and LargeBinary types #9525
  • [Variant] variant_get tests clean up #9517
  • parquet_variant: Support LargeUtf8 typed value in unshred_variant #9513
  • parquet-variant: Support string view typed value in unshred_variant #9512
  • Deprecate ArrowTimestampType::make_value in favor of from_naive_datetime #9490 [arrow]
  • Followup for support ['fieldName'] in VariantPath #9478
  • Speedup DELTA_BINARY_PACKED decoding when bitwidth is 0 #9476 [parquet]
  • Support CSV files encoded with charsets other than UTF-8 #9465 [arrow]
  • Expose Avro writer schema when building the reader #9460 [arrow]
  • Python: avoid importing pyarrow classes ever time #9438
  • Add append_nulls to MapBuilder #9431 [arrow]
  • Add append_non_nulls to StructBuilder #9429 [arrow]
  • Add append_value_n to GenericByteBuilder #9425 [arrow]
  • Optimize from_bitwise_binary_op #9378 [arrow]
  • Configurable Arrow representation of UTC timestamps for Avro reader #9279 [arrow]

Fixed bugs:

  • MutableArrayData::extend does not copy child values for ListView arrays #9561 [arrow]
  • ListView interleave bug #9559 [arrow]
  • Flight encoding panics with "no dict id for field" with nested dict arrays #9555 [arrow] [arrow-flight]
  • "DeltaBitPackDecoder only supports Int32Type and Int64Type" but unsigned types are supported too #9551 [parquet]
  • Potential overflow when calling util::bit_mask::set_bits (soundness issue) #9543 [arrow]
  • handle Null type in try_merge for Struct, List, LargeList, and Union #9523 [arrow]
  • Invalid offset in sparse column chunk data for multiple predicates #9516 [parquet]
  • debug_assert_eq! in BatchCoalescer panics in debug mode when batch_size < 4 #9506 [arrow]
  • Parquet Statistics::null_count_opt wrongly returns Some(0) when stats are missing #9451 [parquet]
  • Error "Not all children array length are the same!" when decoding rows spanning across page boundaries in parquet file when using RowSelection #9370 [parquet]
  • Avro schema resolution not properly supported for complex types #9336 [arrow]

Documentation updates:

  • Update planned release schedule in README.md #9466 (alamb)

Performance improvements:

  • Introduce NullBuffer::try_from_unsliced to simplify array construction #9385 [parquet] [arrow]
  • perf: Coalesce page fetches when RowSelection selects all rows #9578 [parquet] (Dandandan)
  • Use chunks_exact for has_true/has_false to enable compiler unrolling #9570 [arrow] (adriangb)
  • pyarrow: Cache the imported classes to avoid importing them each time #9439 (Tpt)

... (truncated)

Commits
  • 6cadf3b Prepare for 58.1.0 Release (#9573)
  • 322f9ce [Variant] Add unshred_variant support for Binary and LargeBinary types (#9576)
  • bc74c71 feat(parquet): add content defined chunking for arrow writer (#9450)
  • 39dda22 Make Sbbf Constructers Public (#9569)
  • d53df60 feat: Optimize from_bitwise_binary_op with 64-bit alignment (#9441)
  • 44f5dfc perf: Coalesce page fetches when RowSelection selects all rows (#9578)
  • 14f1eb9 pyarrow: Cache the imported classes to avoid importing them each time (#9439)
  • 55a7768 [Variant] Add variant_to_arrow Struct type support (#9572)
  • 42ab0bc fix: Used checked_add for bounds checks to avoid UB (#9568)
  • 88422cb arrow-flight: generate dict_ids for dicts nested inside complex types (#9556)
  • Additional commits viewable in compare view

Updates arrow-array from 57.1.0 to 58.1.0

Updates arrow-buffer from 57.1.0 to 57.3.0

Changelog

Sourced from arrow-buffer's changelog.

57.3.0 (2026-02-02)

Full Changelog

Breaking changes:

Fixed bugs:

Commits
  • 7505005 [57_maintenance] Update version to 57.3.0, add changelog (#9333)
  • 6bbfb99 [maintenance_57] Fix string array equality when the values buffer is the same...
  • 505eb8e [57_maintenance] Revert "Seal Array trait (#9092)", mark Array as unsafe ...
  • 74cf914 [57_maintenance] Mark BufferBuilder::new_from_buffer as unsafe (#9292) (#9312)
  • 25cc1ac [57_maintenance] fix: ensure BufferBuilder::truncate doesn't overset length...
  • 9fc2fbb [57_maintenance[Parquet] Provide only encrypted column stats in plaintext foo...
  • 3df3157 [57_maintenance] [regression] Error with adaptive predicate pushdown: "Invali...
  • 9e822e0 Update version to 57.2.0, add CHANGELOG (#9103)
  • 28f66f9 Add Union encoding documentation (#9102)
  • a8346be Minor: make it clear cache array reader is not cloning arrays (#9057)
  • Additional commits viewable in compare view

Updates arrow-cast from 57.1.0 to 58.1.0

Updates arrow-data from 57.1.0 to 57.3.0

Updates arrow-ipc from 57.1.0 to 58.1.0

Updates arrow-json from 57.1.0 to 58.1.0

    Description has been truncated

@dependabot dependabot Bot added the dependencies (Pull requests that update a dependency file) and rust (Pull requests that update rust code) labels Apr 27, 2026
…ry with 22 updates

Bumps the datafusion-arrow-parquet group with 19 updates in the / directory:

| Package | From | To |
| --- | --- | --- |
| [arrow](https://github.com/apache/arrow-rs) | `57.1.0` | `58.1.0` |
| [arrow-array](https://github.com/apache/arrow-rs) | `57.1.0` | `58.1.0` |
| [arrow-cast](https://github.com/apache/arrow-rs) | `57.1.0` | `58.1.0` |
| [arrow-ipc](https://github.com/apache/arrow-rs) | `57.1.0` | `58.1.0` |
| [arrow-json](https://github.com/apache/arrow-rs) | `57.1.0` | `58.1.0` |
| [datafusion](https://github.com/apache/datafusion) | `51.0.0` | `52.5.0` |
| [datafusion-catalog](https://github.com/apache/datafusion) | `51.0.0` | `52.5.0` |
| [datafusion-common](https://github.com/apache/datafusion) | `51.0.0` | `52.5.0` |
| [datafusion-common-runtime](https://github.com/apache/datafusion) | `51.0.0` | `52.5.0` |
| [datafusion-datasource](https://github.com/apache/datafusion) | `51.0.0` | `52.5.0` |
| [datafusion-datasource-parquet](https://github.com/apache/datafusion) | `51.0.0` | `52.5.0` |
| [datafusion-execution](https://github.com/apache/datafusion) | `51.0.0` | `52.5.0` |
| [datafusion-expr](https://github.com/apache/datafusion) | `51.0.0` | `52.5.0` |
| [datafusion-ffi](https://github.com/apache/datafusion) | `51.0.0` | `52.5.0` |
| [datafusion-optimizer](https://github.com/apache/datafusion) | `51.0.0` | `52.5.0` |
| [datafusion-physical-expr](https://github.com/apache/datafusion) | `51.0.0` | `52.5.0` |
| [datafusion-physical-plan](https://github.com/apache/datafusion) | `51.0.0` | `52.5.0` |
| [datafusion-pruning](https://github.com/apache/datafusion) | `51.0.0` | `52.5.0` |
| [parquet](https://github.com/apache/arrow-rs) | `57.1.0` | `58.1.0` |



Updates `arrow` from 57.1.0 to 58.1.0
- [Changelog](https://github.com/apache/arrow-rs/blob/main/CHANGELOG.md)
- [Commits](apache/arrow-rs@57.1.0...58.1.0)

Updates `arrow-array` from 57.1.0 to 58.1.0
- [Changelog](https://github.com/apache/arrow-rs/blob/main/CHANGELOG.md)
- [Commits](apache/arrow-rs@57.1.0...58.1.0)

Updates `arrow-buffer` from 57.1.0 to 57.3.0
- [Changelog](https://github.com/apache/arrow-rs/blob/57.3.0/CHANGELOG.md)
- [Commits](apache/arrow-rs@57.1.0...57.3.0)

Updates `arrow-cast` from 57.1.0 to 58.1.0
- [Changelog](https://github.com/apache/arrow-rs/blob/main/CHANGELOG.md)
- [Commits](apache/arrow-rs@57.1.0...58.1.0)

Updates `arrow-data` from 57.1.0 to 57.3.0
- [Changelog](https://github.com/apache/arrow-rs/blob/57.3.0/CHANGELOG.md)
- [Commits](apache/arrow-rs@57.1.0...57.3.0)

Updates `arrow-ipc` from 57.1.0 to 58.1.0
- [Changelog](https://github.com/apache/arrow-rs/blob/main/CHANGELOG.md)
- [Commits](apache/arrow-rs@57.1.0...58.1.0)

Updates `arrow-json` from 57.1.0 to 58.1.0
- [Changelog](https://github.com/apache/arrow-rs/blob/main/CHANGELOG.md)
- [Commits](apache/arrow-rs@57.1.0...58.1.0)

Updates `arrow-schema` from 57.1.0 to 57.3.0
- [Changelog](https://github.com/apache/arrow-rs/blob/57.3.0/CHANGELOG.md)
- [Commits](apache/arrow-rs@57.1.0...57.3.0)

Updates `datafusion` from 51.0.0 to 52.5.0
- [Changelog](https://github.com/apache/datafusion/blob/main/CHANGELOG.md)
- [Commits](apache/datafusion@51.0.0...52.5.0)

Updates `datafusion-catalog` from 51.0.0 to 52.5.0
- [Changelog](https://github.com/apache/datafusion/blob/main/CHANGELOG.md)
- [Commits](apache/datafusion@51.0.0...52.5.0)

Updates `datafusion-common` from 51.0.0 to 52.5.0
- [Changelog](https://github.com/apache/datafusion/blob/main/CHANGELOG.md)
- [Commits](apache/datafusion@51.0.0...52.5.0)

Updates `datafusion-common-runtime` from 51.0.0 to 52.5.0
- [Changelog](https://github.com/apache/datafusion/blob/main/CHANGELOG.md)
- [Commits](apache/datafusion@51.0.0...52.5.0)

Updates `datafusion-datasource` from 51.0.0 to 52.5.0
- [Changelog](https://github.com/apache/datafusion/blob/main/CHANGELOG.md)
- [Commits](apache/datafusion@51.0.0...52.5.0)

Updates `datafusion-datasource-parquet` from 51.0.0 to 52.5.0
- [Changelog](https://github.com/apache/datafusion/blob/main/CHANGELOG.md)
- [Commits](apache/datafusion@51.0.0...52.5.0)

Updates `datafusion-execution` from 51.0.0 to 52.5.0
- [Changelog](https://github.com/apache/datafusion/blob/main/CHANGELOG.md)
- [Commits](apache/datafusion@51.0.0...52.5.0)

Updates `datafusion-expr` from 51.0.0 to 52.5.0
- [Changelog](https://github.com/apache/datafusion/blob/main/CHANGELOG.md)
- [Commits](apache/datafusion@51.0.0...52.5.0)

Updates `datafusion-ffi` from 51.0.0 to 52.5.0
- [Changelog](https://github.com/apache/datafusion/blob/main/CHANGELOG.md)
- [Commits](apache/datafusion@51.0.0...52.5.0)

Updates `datafusion-optimizer` from 51.0.0 to 52.5.0
- [Changelog](https://github.com/apache/datafusion/blob/main/CHANGELOG.md)
- [Commits](apache/datafusion@51.0.0...52.5.0)

Updates `datafusion-physical-expr` from 51.0.0 to 52.5.0
- [Changelog](https://github.com/apache/datafusion/blob/main/CHANGELOG.md)
- [Commits](apache/datafusion@51.0.0...52.5.0)

Updates `datafusion-physical-plan` from 51.0.0 to 52.5.0
- [Changelog](https://github.com/apache/datafusion/blob/main/CHANGELOG.md)
- [Commits](apache/datafusion@51.0.0...52.5.0)

Updates `datafusion-pruning` from 51.0.0 to 52.5.0
- [Changelog](https://github.com/apache/datafusion/blob/main/CHANGELOG.md)
- [Commits](apache/datafusion@51.0.0...52.5.0)

Updates `parquet` from 57.1.0 to 58.1.0
- [Changelog](https://github.com/apache/arrow-rs/blob/main/CHANGELOG.md)
- [Commits](apache/arrow-rs@57.1.0...58.1.0)

---
updated-dependencies:
- dependency-name: arrow
  dependency-version: 58.1.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: arrow-array
  dependency-version: 58.1.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: arrow-buffer
  dependency-version: 57.3.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: datafusion-arrow-parquet
- dependency-name: arrow-cast
  dependency-version: 58.1.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: arrow-data
  dependency-version: 57.3.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: datafusion-arrow-parquet
- dependency-name: arrow-ipc
  dependency-version: 58.1.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: arrow-json
  dependency-version: 58.1.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: arrow-schema
  dependency-version: 57.3.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: datafusion-arrow-parquet
- dependency-name: datafusion
  dependency-version: 52.5.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: datafusion-catalog
  dependency-version: 52.5.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: datafusion-common
  dependency-version: 52.5.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: datafusion-common-runtime
  dependency-version: 52.5.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: datafusion-datasource
  dependency-version: 52.5.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: datafusion-datasource-parquet
  dependency-version: 52.5.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: datafusion-execution
  dependency-version: 52.5.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: datafusion-expr
  dependency-version: 52.5.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: datafusion-ffi
  dependency-version: 52.5.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: datafusion-optimizer
  dependency-version: 52.5.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: datafusion-physical-expr
  dependency-version: 52.5.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: datafusion-physical-plan
  dependency-version: 52.5.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: datafusion-pruning
  dependency-version: 52.5.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
- dependency-name: parquet
  dependency-version: 58.1.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: datafusion-arrow-parquet
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot Bot force-pushed the dependabot/cargo/datafusion-arrow-parquet-3fa1e37285 branch from ca5495f to d20c41c Compare April 27, 2026 18:21
paleolimbot and others added 22 commits April 30, 2026 09:07
Co-authored-by: Copilot <copilot@github.com>
Comment on lines +394 to +401
let ctx = Arc::new(SessionContext::new()) as Arc<dyn TaskContextProvider>;
let ffi_provider = FFI_TableProvider::new(
    provider,
    true,
    Some(self.runtime.handle().clone()),
    &ctx,
    None,
);

A few FFI things changed. We don't really use this except, currently, for the situation where there are multiple SessionContexts and we pass data frames between them (and we should probably find a better way to do this).

Comment on lines +297 to +304
fn try_pushdown_projection(
    &self,
    projection: &ProjectionExprs,
) -> Result<Option<Arc<dyn FileSource>>> {
    // Use SplitProjection to handle any projection:
    // - file_indices provides column pruning (always works)
    // - ProjectionOpener handles reordering/expressions/renames after reading
    let split_projection = SplitProjection::new(self.table_schema.file_schema(), projection);

This projection pushdown bit is a key change in DataFusion 52. For the generic datasource (which we use for GDAL input) we can use the built-in pattern for bridging this with the previous approach (simple integer column indices).
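
To make the bridging idea concrete, here is a minimal std-only sketch of splitting a projection into pruned file column indices plus a post-read remap. It is an illustration under stated assumptions, not DataFusion's `SplitProjection` API: the names `SplitIndices`, `split_projection`, and `apply_remap` are hypothetical, and it only covers plain column references (no expressions or renames).

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a split projection (not the DataFusion type).
#[derive(Debug, PartialEq)]
struct SplitIndices {
    /// Sorted, de-duplicated file column indices to read (column pruning).
    file_indices: Vec<usize>,
    /// For each requested output column, its position in the pruned batch.
    remap: Vec<usize>,
}

/// Split a list of requested file column indices into the set of columns
/// to read and the reordering to apply after reading.
fn split_projection(requested: &[usize]) -> SplitIndices {
    // A BTreeSet yields the unique indices in ascending order.
    let file_indices: Vec<usize> = requested
        .iter()
        .copied()
        .collect::<BTreeSet<_>>()
        .into_iter()
        .collect();
    // Every requested index is in file_indices, so the search cannot fail.
    let remap = requested
        .iter()
        .map(|col| {
            file_indices
                .binary_search(col)
                .expect("requested index is present in file_indices")
        })
        .collect();
    SplitIndices { file_indices, remap }
}

/// Reorder (and possibly duplicate) the pruned columns into the requested order.
fn apply_remap<T: Clone>(pruned: &[T], remap: &[usize]) -> Vec<T> {
    remap.iter().map(|&i| pruned[i].clone()).collect()
}
```

Under these assumptions, requesting columns `[3, 0, 3]` reads only file columns `[0, 3]` and then applies the remap `[1, 0, 1]` to restore the requested order, which is roughly the division of labour the code comment describes between `file_indices` and `ProjectionOpener`.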

Comment on lines -477 to -481
/// Apply a [SchemaAdapterFactory] to the inner [ParquetSource]
pub fn with_schema_adapter_factory(
    &self,
    schema_adapter_factory: Arc<dyn SchemaAdapterFactory>,
) -> Self {

The schema adapter factory is no more (there is something called a physical expression adapter factory that we may need to add support for later).

Comment on lines +567 to +580
// DataFusion 52 has an issue where field metadata (like ARROW:extension:name)
// is stripped when evaluating embedded projections in ParquetOpener. This is
// because the batch schema comes from the parquet reader (which doesn't have
// extension metadata), and Column::return_field() looks up fields from that schema.
// This isn't a bug in DataFusion because we're the ones that advertised the table
// schema as having metadata'd expressions in the first place.
//
// We fix this by wrapping Column expressions with MetadataPreservingColumn,
// which stores the correct field from the file schema and returns it from
// return_field() regardless of the input schema.
let transformed_projection = wrap_columns_with_metadata_preserving(
    projection.clone(),
    self.inner.table_schema().table_schema(),
)?;

This is a somewhat involved change: Opus did a great job root-causing the problem but a very bad job solving it, and it took me some time to fix. Because we advertise a file schema that is not the underlying Parquet file schema, we have expressions whose columns refer to a schema that the inner opener assumes is identical to the file schema (but it is not!). I hope we can remove this wrapper at some point and/or simplify our Parquet wrapping strategy, but for now this is a minimally invasive strategy that ensures we don't lose the benefits of the wrapped Parquet implementation.
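
The wrapper idea can be sketched without any DataFusion types. The following is a hedged, std-only illustration and not the actual `MetadataPreservingColumn` implementation: `Field`, `ReturnField`, `PlainColumn`, `PreservingColumn`, and the `field` helper are hypothetical stand-ins, and the `geoarrow.wkb` extension name is only an example value.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for an Arrow field.
#[derive(Clone, Debug, PartialEq)]
struct Field {
    name: String,
    metadata: HashMap<String, String>,
}

/// Hypothetical stand-in for the part of a physical expression that
/// reports its output field given the input batch schema.
trait ReturnField {
    fn return_field(&self, input_schema: &[Field]) -> Option<Field>;
}

/// A plain column reference: whatever the input schema says, it returns.
struct PlainColumn {
    index: usize,
}

impl ReturnField for PlainColumn {
    fn return_field(&self, input_schema: &[Field]) -> Option<Field> {
        input_schema.get(self.index).cloned()
    }
}

/// A wrapper that remembers the advertised field and returns it
/// regardless of what the input schema carries.
struct PreservingColumn {
    inner: PlainColumn,
    field: Field,
}

impl ReturnField for PreservingColumn {
    fn return_field(&self, input_schema: &[Field]) -> Option<Field> {
        // Still require the column to exist, but substitute the stored field.
        self.inner
            .return_field(input_schema)
            .map(|_| self.field.clone())
    }
}

/// Build a field, optionally tagged with an extension name.
fn field(name: &str, extension: Option<&str>) -> Field {
    let mut metadata = HashMap::new();
    if let Some(ext) = extension {
        metadata.insert("ARROW:extension:name".to_string(), ext.to_string());
    }
    Field { name: name.to_string(), metadata }
}
```

In this sketch, a `PlainColumn` evaluated against a reader schema without extension metadata returns a field with empty metadata, while a `PreservingColumn` built from the advertised field returns that field with its metadata intact.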

Comment on lines +108 to +117
// Wrap with ProjectionOpener to handle reordering/expressions
if let Some(split_projection) = &self.split_projection {
    ProjectionOpener::try_new(
        split_projection.clone(),
        inner_opener,
        self.table_schema.file_schema(),
    )
} else {
    Ok(inner_opener)
}

cc @b4l: this is to keep up with the changes for DataFusion 52. The current tests pass, but I didn't add new ones with more complex projections to check (there are some for sedona-datasource that did trigger a failure, and this approach fixed the issue there).

@paleolimbot paleolimbot merged commit 6c7cd96 into main May 5, 2026
18 checks passed
@dependabot dependabot Bot deleted the dependabot/cargo/datafusion-arrow-parquet-3fa1e37285 branch May 5, 2026 20:52

Labels

dependencies Pull requests that update a dependency file rust Pull requests that update rust code


2 participants