[fix](fe) Fix IS TRUE/ IS FALSE predicate null semantics#64696
[fix](fe) Fix IS TRUE/ IS FALSE predicate null semantics#64696morrySnow wants to merge 1 commit into
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
40fefe0 to
96f17d6
Compare
|
run buildall |
|
/review |
TPC-H: Total hot run time: 29013 ms |
|
Codex automated review failed and did not complete. Error: You've hit your usage limit. Visit https://chatgpt.com/codex/settings/usage to purchase more credits or try again at 11:44 AM. Please inspect the workflow logs and rerun the review after the underlying issue is resolved. |
TPC-DS: Total hot run time: 173277 ms |
ClickBench: Total hot run time: 25.34 s |
FE Regression Coverage ReportIncrement line coverage |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
FE Regression Coverage ReportIncrement line coverage |
1 similar comment
FE Regression Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
|
/review |
96f17d6 to
3cb3411
Compare
|
run buildall |
|
/review |
There was a problem hiding this comment.
Review result: request changes.
Critical checkpoint conclusions:
- Goal and tests: the PR targets Nereids
IS TRUE/IS FALSEnull semantics and adds parser, analyzer, and regression coverage. The runtime truth table is improved, but one analyzed-output metadata issue remains. - Scope and clarity: the authoritative PR patch is focused to 9 parser/analyzer/expression/test files.
- Concurrency, lifecycle, config, compatibility, transactions, data writes, and observability: not materially involved in this change.
- Parallel paths: parser nodes, expression visitor hooks, and analyzer lowering were reviewed. Legacy translation relies on analyzer lowering before execution.
- Test results: the new regression output still has a blank EOF line, but that is already covered by an existing inline thread and was not duplicated.
User focus: no additional user-provided focus was present.
Subagent conclusions: optimizer-rewrite proposed two candidates; ORT-2 became the inline comment below, while ORT-1 was dismissed because it is outside the authoritative GitHub PR file list. tests-session-config found no new valuable issue beyond the existing .out whitespace thread. Convergence round 1 ended with both live subagents reporting NO_NEW_VALUABLE_FINDINGS for the same ledger/comment set.
TPC-H: Total hot run time: 29027 ms |
TPC-DS: Total hot run time: 172721 ms |
ClickBench: Total hot run time: 25.23 s |
FE UT Coverage ReportIncrement line coverage |
Issue Number: None
Related PR: None
Problem Summary: Nereids parsed IS TRUE and IS FALSE directly as boolean casts. For IS NOT TRUE and IS NOT FALSE this lost SQL three-valued logic because NULL was propagated instead of being treated as not true or not false. This change adds IsTrue and IsFalse parser expressions and rewrites them during expression analysis to cast(expr as boolean) AND expr IS NOT NULL, or NOT cast(expr as boolean) AND expr IS NOT NULL, so the surrounding NOT handles NULL correctly.
Fix IS TRUE, IS FALSE, IS NOT TRUE, and IS NOT FALSE predicate semantics for NULL values in Nereids.
- Test: Unit Test and Regression test
- Unit Test: ./run-fe-ut.sh --run org.apache.doris.nereids.parser.NereidsParserTest
- Unit Test: ./run-fe-ut.sh --run org.apache.doris.nereids.rules.analysis.ExpressionAnalyzerTest
- Regression test: ./run-regression-test.sh --run -d query_p0/sql_functions -s test_is_true_false_predicate (attempted; local running FE was not rebuilt with this change and reproduced the old wrong result for NULL IS NOT TRUE)
- Behavior changed: Yes (Nereids IS TRUE/FALSE predicates now follow SQL three-valued logic for NULL)
- Does this need documentation: No
3cb3411 to
9ea9627
Compare
|
/review |
|
run buildall |
There was a problem hiding this comment.
Review result: comment only.
I completed another full pass over the current GitHub PR diff at head 9ea9627f10540e7699e9057794d29bbfdfadca11. I found no new non-duplicate inline issue to submit. The remaining substantiated concerns I found are already covered by existing inline threads:
- repeated evaluation in the
IS TRUE/IS FALSEanalyzer lowering: #64696 (comment) - nullable output metadata after lowering the
AlwaysNotNullablepredicates toAnd: #64696 (comment) - blank line at EOF in the generated regression output: #64696 (comment)
Critical checkpoint conclusions:
- Goal and tests: the PR targets Nereids
IS TRUE/IS FALSEnull semantics and adds parser, analyzer, and regression coverage. The value-level truth table coverage now includes nullable boolean, string cast-to-boolean values, SQL NULL, andIS NOTforms, but the existing inline threads still cover metadata/repeated-evaluation and output-format concerns. - Scope and clarity: the authoritative GitHub PR patch is focused to the ten Nereids/regression files listed by the PR API.
- Concurrency, lifecycle, config, compatibility, transactions, data writes, and observability: not materially involved.
- Parallel paths: parser construction, expression visitor dispatch, analyzer lowering, and translation assumptions were reviewed; execution relies on analyzer lowering before legacy translation.
- Test results: no local tests were run.
thirdparty/installed/bin/protocis missing in this runner, andgit diff --checkstill reports the already-threaded EOF whitespace issue.
User focus: no additional user-provided review focus was present.
Subagent conclusions:
optimizer-rewritereported OR-1 and OR-2, both duplicates of the existing analyzer inline threads above.tests-session-configreported TC-1 as a duplicate of the existing.outwhitespace thread. TC-2 was dismissed because it came from local base-branch drift; GitHub's current PR diff does not include the MaxCompute files.- Convergence round 1 ended with both live subagents replying
NO_NEW_VALUABLE_FINDINGSfor the same current ledger and proposed final comment set.
TPC-H: Total hot run time: 29347 ms |
What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary:
IS TRUE,IS FALSE,IS NOT TRUE, andIS NOT FALSEin Nereids did not preserve SQL three-valued logic correctly because the parser reused generic boolean expressions instead of representing these predicates explicitly. This patch adds NereidsIsTrueandIsFalseexpressions, parsesIS TRUEandIS FALSEinto those nodes, and rewrites them during expression analysis to boolean casts guarded byIS NOT NULL. With the existingNOTwrapper from parsing,IS NOT TRUEandIS NOT FALSEnow correctly include NULL rows.Release note
Fix Nereids null semantics for
IS TRUE,IS FALSE,IS NOT TRUE, andIS NOT FALSEpredicates.Check List (For Author)
./run-fe-ut.sh --run org.apache.doris.nereids.parser.NereidsParserTest./run-fe-ut.sh --run org.apache.doris.nereids.rules.analysis.ExpressionAnalyzerTest./run-regression-test.sh --run -d query_p0/sql_functions -s test_is_true_false_predicate -forceGenOutIS TRUE,IS FALSE,IS NOT TRUE, andIS NOT FALSE.