Fix array_except nullability mismatch #4237

Open
yuboxx wants to merge 3 commits into apache:main from yuboxx:fix-array-except-nullability-3646

yuboxx (Contributor) commented May 6, 2026

Which issue does this PR close?

Closes #3646.

Rationale for this change

Spark can produce equivalent arrays with different containsNull values. DataFusion's array_except compares Arrow list types strictly, including child field nullability, so Comet could reject compatible Spark arrays whose types differ only in that flag, for example List(Int32) vs List(non-null Int32).

What changes are included in this PR?

This PR wraps DataFusion's array_except UDF to normalize list child nullability before delegation, and serializes an explicit normalized return type so Comet's planned schema matches the runtime Arrow result.
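
As a rough illustration of the normalization idea (not the code in this PR), the sketch below uses a toy `ToyType` enum in place of Arrow's real `DataType`. All names here (`ToyType`, `list_of`, `normalize_child_nullability`, `strict_compatible`, `compatible_after_normalization`) are hypothetical, and widening every list child to nullable is an assumption about the normalization direction.

```rust
// Toy stand-in for an Arrow data type. Hypothetical: this is NOT the arrow-rs API.
#[derive(Debug, Clone, PartialEq)]
enum ToyType {
    Int32,
    Utf8,
    // A list whose child field carries its own nullability flag,
    // playing the role of Spark's `containsNull`.
    List {
        child: Box<ToyType>,
        child_nullable: bool,
    },
}

// Convenience constructor for a list type.
fn list_of(child: ToyType, child_nullable: bool) -> ToyType {
    ToyType::List {
        child: Box::new(child),
        child_nullable,
    }
}

// Recursively mark every list child as nullable, so that two types that
// differ only in child nullability compare equal afterwards.
// Assumption: widening to nullable is the chosen normalization direction.
fn normalize_child_nullability(ty: &ToyType) -> ToyType {
    match ty {
        ToyType::List { child, .. } => ToyType::List {
            child: Box::new(normalize_child_nullability(child)),
            child_nullable: true,
        },
        other => other.clone(),
    }
}

// Strict check: the types must match exactly, nullability flags included.
fn strict_compatible(left: &ToyType, right: &ToyType) -> bool {
    left == right
}

// Relaxed check: normalize both sides first, then compare.
fn compatible_after_normalization(left: &ToyType, right: &ToyType) -> bool {
    normalize_child_nullability(left) == normalize_child_nullability(right)
}
```

With these definitions, `list_of(ToyType::Int32, true)` and `list_of(ToyType::Int32, false)` fail `strict_compatible` but pass `compatible_after_normalization`, while lists with different element types are still rejected. Reporting the normalized type as the return type is what would keep a planned schema and a runtime result in agreement in this toy model.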

It also re-enables the SQL regression case and adds coverage for nested arrays and mixed-nullability inputs.

How are these changes tested?

  • cargo test -p datafusion-comet-spark-expr normalizes_ --lib
  • ./mvnw test -Dsuites="org.apache.comet.CometSqlFileTestSuite array_except" -Dtest=none

yuboxx changed the title from "DRAFT - NOT READY: Fix array_except nullability mismatch" to "Fix array_except nullability mismatch" May 17, 2026
yuboxx marked this pull request as ready for review May 17, 2026 22:04

yuboxx (Contributor, Author) commented May 17, 2026

@comphead This PR should be ready for review.

Development

Successfully merging this pull request may close these issues.

arrays_except type mismatch