[fix](be) Fix concat_ws nullable array handling#64703
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
### What problem does this PR solve?
Issue Number: N/A
Related PR: N/A
Problem Summary: concat_ws has a BE execution path for a single array argument. When the array column row itself is NULL, the executor still walked the nested array data and could return values from nested storage instead of treating the NULL array row as empty input. Also, if the optimizer rewrite is disabled, multiple array arguments can reach this BE array path and were silently executed using only the first array argument. This change keeps concat_ws return nullability unchanged, skips nested data for NULL array rows, and rejects array-form concat_ws calls unless the executor receives exactly separator plus one array argument.
### Release note
Fix wrong concat_ws results for nullable array inputs and return an error for unsupported multiple-array execution without optimizer rewrite.
### Check List (For Author)
- Test: Regression test and build
- ./build.sh --be
- ./run-regression-test.sh --run --conf /tmp/doris-regression-conf-run-path.groovy -d nereids_function_p0/scalar_function -s nereids_scalar_fn_concat_ws -forceGenOut
- ./run-regression-test.sh --run --conf /tmp/doris-regression-conf-run-path.groovy -d nereids_function_p0/scalar_function -s nereids_scalar_fn_concat_ws
- Behavior changed: Yes. concat_ws skips nested data for NULL array rows, and unsupported multiple-array BE execution now returns INVALID_ARGUMENT instead of silently consuming only the first array.
- Does this need documentation: No
870dbc0 to
aa7bf6f
Compare
|
run buildall |
TPC-H: Total hot run time: 29245 ms |
TPC-DS: Total hot run time: 173283 ms |
ClickBench: Total hot run time: 25.33 s |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
/review |
There was a problem hiding this comment.
Automated review completed for PR #64703. I did not find a blocking correctness issue to raise inline.
Critical checkpoint conclusions:
- Goal/test coverage: The PR fixes nullable array-row handling in BE
concat_wsarray mode and adds regression coverage for the default rewritten path, a filtered nullable-array case, and the disabled-rewrite error path. - Scope: The code change is narrowly limited to BE
concat_wsarray execution and the matching regression test/output. - Concurrency/lifecycle/transactions/persistence: Not involved.
- Configuration/session behavior: No new config. The existing
disable_nereids_expression_rulespath was reviewed because the test uses it. - Compatibility/parallel paths: Multi-array
concat_wsremains supported through the normal NereidsCONCATWS_MULTI_ARRAY_TO_ONErewrite. The direct BE multi-array shape previously returned wrong data by ignoring later arrays; this PR turns that unsupported shape into an explicitInvalidArgument. - Error handling: The new BE arity guard returns
Statusthrough the existing function execution path. - Nullability/data correctness: The output remains separator-nullability driven, and NULL array rows now skip nested data and produce the existing Doris empty-string behavior.
- Tests/results/style: The added regression output matches the added queries, uses deterministic ordering where multiple rows are returned, and
git diff --checkis clean. - Performance/observability: No new hot-path concern or observability requirement found for this small branch.
Subagent conclusions:
optimizer-rewrite:OPT-NONE; no optimizer/rewrite candidate was substantiated.tests-session-config: proposedTEST-001about disabledCONCATWS_MULTI_ARRAY_TO_ONE; main review dismissed it with evidence because the disabled-rule direct BE path was already semantically wrong before this PR and now fails explicitly.- Convergence round 1: both live subagents replied
NO_NEW_VALUABLE_FINDINGSfor the same final ledger/comment set.
User focus: No additional user-provided review focus was supplied.
CI note: The listed macOS BE UT failure is from the runner using Java 25 while the job requires JDK 17; compile, Linux BE UT, formatter, CheckStyle, and P0 regression checks shown for the PR are passing or unrelated/pending.
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
Problem Summary: concat_ws has a BE execution path for a single array argument. When the array column row itself is NULL, the executor still walked the nested array data and could return values from nested storage instead of treating the NULL array row as empty input. Also, if the optimizer rewrite is disabled, multiple array arguments can reach this BE array path and were silently executed using only the first array argument. This change keeps concat_ws return nullability unchanged, skips nested data for NULL array rows, and rejects array-form concat_ws calls unless the executor receives exactly separator plus one array argument. ### Release note Fix wrong concat_ws results for nullable array inputs and return an error for unsupported multiple-array execution without optimizer rewrite.
What problem does this PR solve?
Issue Number: N/A
Related PR: N/A
Problem Summary: concat_ws has a BE execution path for a single array argument. When the array column row itself is NULL, the executor still walked the nested array data and could return values from nested storage instead of treating the NULL array row as empty input. Also, if the optimizer rewrite is disabled, multiple array arguments can reach this BE array path and were silently executed using only the first array argument. This change keeps concat_ws return nullability unchanged, skips nested data for NULL array rows, and rejects array-form concat_ws calls unless the executor receives exactly separator plus one array argument.
Release note
Fix wrong concat_ws results for nullable array inputs and return an error for unsupported multiple-array execution without optimizer rewrite.
Check List (For Author)