Parquet row filter struct access tree #23217

Open
SubhamSinghal wants to merge 2 commits into apache:main from SubhamSinghal:parquet-row-filter-struct-access-tree
@SubhamSinghal

Contributor

Which issue does this PR close?

Rationale for this change

Are these changes tested?

Yes

Are there any user-facing changes?

No. This is an internal refactor.

@github-actions (bot) added the "datasource" label (Changes to the datasource crate) on Jun 27, 2026

@kosiew left a comment

Contributor

@SubhamSinghal
Thanks!
This LGTM. I have one small non-blocking suggestion for future cleanup.


- let pruned_data_type = prune_struct_type(field.data_type(), field_paths);
+ let pruned_data_type = prune_struct_type(field.data_type(), node);
  Arc::new(Field::new(


Small future cleanup suggestion: if this internal projected schema is expected to preserve wrapper field metadata for pruned structs, it might be worth copying the metadata when rebuilding the root and intermediate Fields, both here and in prune_struct_type.

I don't think this PR introduces the issue, and it is probably not visible in scan output, since the terminal fields are cloned and the decoder projection can restore the requested output schema. This is mostly a future-proofing and test-coverage suggestion.
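
To make the suggestion concrete, here is a minimal, self-contained sketch of pruning a struct with an access tree while carrying the wrapper's metadata over. It is a toy model, not the code in this PR: `DataType`, `Field`, `AccessNode`, and `prune_field` are hypothetical stand-ins written for illustration, and the real `prune_struct_type` operates on Arrow types with a different signature.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for an Arrow data type (hypothetical, not arrow's `DataType`).
#[derive(Clone, Debug, PartialEq)]
enum DataType {
    Int64,
    Utf8,
    Struct(Vec<Field>),
}

/// Toy stand-in for an Arrow field, including key/value metadata.
#[derive(Clone, Debug, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
    metadata: BTreeMap<String, String>,
}

/// One node of a struct-access tree (assumed shape, for illustration only).
/// `whole` marks a node that some access path ends at, so its entire subtree is kept.
#[derive(Default, Debug)]
struct AccessNode {
    whole: bool,
    children: BTreeMap<String, AccessNode>,
}

impl AccessNode {
    /// Record one access path, e.g. ["city"], relative to this node.
    fn insert(&mut self, path: &[&str]) {
        let mut node = self;
        for part in path {
            node = node.children.entry(part.to_string()).or_default();
        }
        node.whole = true;
    }
}

/// Rebuild `field`, keeping only the children named in `node`.
fn prune_field(field: &Field, node: &AccessNode) -> Field {
    match &field.data_type {
        DataType::Struct(children) if !node.whole => {
            let kept = children
                .iter()
                .filter_map(|c| node.children.get(&c.name).map(|n| prune_field(c, n)))
                .collect();
            Field {
                name: field.name.clone(),
                data_type: DataType::Struct(kept),
                nullable: field.nullable,
                // The step suggested above: copy the wrapper's metadata
                // instead of rebuilding the field with empty metadata.
                metadata: field.metadata.clone(),
            }
        }
        // Terminal fields (or fully requested structs) are cloned as-is,
        // so their metadata is preserved without extra work.
        _ => field.clone(),
    }
}
```

In this sketch, pruning an `address` struct with children `city` and `zip` against a tree containing only the path `["city"]` yields a struct with the single child `city` and the same metadata map as the original wrapper. A test along those lines would cover the case the comment describes.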

Development

Successfully merging this pull request may close these issues.

Refactor: Build a reusable struct-access path tree for Parquet row-filter planning

2 participants