Skip to content

fix(tesseract): cast bound param on measure filters#11176

Open
waralexrom wants to merge 2 commits into
masterfrom
tesseract-cast-on-measure-filter
Open

fix(tesseract): cast bound param on measure filters#11176
waralexrom wants to merge 2 commits into
masterfrom
tesseract-cast-on-measure-filter

Conversation

@waralexrom

Copy link
Copy Markdown
Member

Summary

Under the Tesseract native SQL planner (CUBEJS_TESSERACT_SQL_PLANNER=true), filtering on an aggregate measure produced a HAVING comparison with a raw bound parameter and no numeric cast. On BigQuery the param stayed a string, so the engine rejected the comparison (No matching signature for operator < for argument types: INT64, STRING). The standard planner emits CAST(? AS FLOAT64) here; Tesseract did not. Fixes #11031.

Changes

  • Resolve a measure filter member to the number cast type (boolean for boolean measures), so the bound value is wrapped by the per-dialect cast template — e.g. CAST(? AS FLOAT64) on BigQuery, CAST(? AS DOUBLE) on Presto, and no cast where the dialect template is a no-op (Postgres).
  • Derive the cast type from the measure type rather than casting every measure numerically: string and time measures are not numeric scalars and now render without a numeric cast (the standard planner coerces them to float). Aggregations, count, rank and number measures stay numeric; min/max fall through to the numeric default since their operand type isn't known at this point.

Testing

  • cargo test -p cubesqlplanner — 1057 passed. New filter tests in tests/filter/to_sql.rs cover a numeric measure filter (cast applied) and a string measure filter (no cast).
  • Schema-compiler unit test asserting BigQuery measure filter renders CAST(? AS FLOAT64) under the native planner.

Measure filters rendered a HAVING comparison without the numeric cast
that the per-dialect templates emit, so on BigQuery a bound value stayed
a raw string and the engine rejected the comparison (INT64 vs STRING).

Resolve a measure filter member to the "number" cast type ("boolean"
for boolean measures) so the param is wrapped per dialect, e.g.
CAST(? AS FLOAT64) on BigQuery.

CORE-541
Instead of casting every non-boolean measure to a number, derive the
cast type from the measure type: string and time measures are not
numeric scalars and must not be coerced, so they render without a cast.
Aggregations, count, rank and number measures stay numeric; min/max fall
through to the numeric default since their operand type isn't known here.
@waralexrom waralexrom requested a review from a team as a code owner June 29, 2026 16:11
@github-actions github-actions Bot added rust Pull requests that update Rust code javascript Pull requests that update Javascript code labels Jun 29, 2026
@claude

claude Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Claude finished @waralexrom's task in 2m 22s —— View job


Code Review

Review: ✅ LGTM — small note on min/max fallback. 0 high, 0 medium, 1 low.
  • Read PR diff & changed files
  • Trace cast wiring (member_typeFilterSqlContext::cast_param)
  • Check new tests cover the fix and the no-cast cases
  • Spot-check measure-type coverage

Summary

The fix is correct and minimal. The bug was that member_type for a measure was being set to the raw measure_type() string (e.g. "count", "sum"), but FilterSqlContext::cast_param (rust/cube/cubesqlplanner/cubesqlplanner/src/physical_plan/filter/operators/filter_sql_context.rs:30-40) only matches on "boolean" / "number". Anything else fell through with no cast, leaving the bound param as a string on dialects whose number_param_cast template is non-trivial (BigQuery CAST(? AS FLOAT64), Presto CAST(? AS DOUBLE)). Mapping aggregations / number / count / rank → "number" lights up the dialect template and matches the standard planner's behavior. string / time correctly opt out (date comparisons go through the dedicated date operators).

Correctness

  • The mapping in typed_filter.rs:167-171 is consistent with how cast_param consumes the string — only boolean and number trigger a cast template; everything else is a no-op. The wildcard arm is the right default since aggregations (sum, count, countDistinct, countDistinctApprox, avg, rank) all compare numerically.
  • The boolean case is preserved through the wildcard collapse — "boolean" would also have routed correctly via the prior code, but keeping the explicit arm makes the intent obvious.

Tests

  • Rust tests in tests/filter/to_sql.rs:600-638 cover the three meaningful cases: numeric measure cast (count/sum) and string measure no-cast. The mock template tesseract/number_param_cast is $_N_$::numeric, so the assertions exercise the real path.
  • Schema-compiler test (base-query.test.ts:1109-1122) asserts the dialect-specific CAST(? AS FLOAT64) on BigQuery under useNativeSqlPlanner: true, which is exactly the regression from Tesseract bug: filtering by aggregate measures broken in BigQuery due to type casting problem #11031.
  • Yaml fixture adds source_label (string measure via MAX({CUBE}.source)) — sensible addition for the no-cast assertion.

Low — min/max of non-numeric operands still cast to number

typed_filter.rs:167-171min/max are caught by the wildcard arm and resolved to "number", which is correct for numeric operands but produces a CAST(? AS FLOAT64) wrapper around the bound value when the measure is min(some_string_column) or min(some_timestamp). The legacy planner has the same float-cast fallback, so this is not a regression and the PR description explicitly calls it out — leaving as a note for future work rather than blocking. If the operand type is reachable from the measure symbol it would be cleaner to resolve through it. (Fix this →)

Nits

  • _ => Some("number".to_string()) will silently coerce any new measure type added later. Consider listing the known numeric types explicitly so the next added measure type is a compile-time decision. Optional — the current wildcard is the same shape the legacy planner uses.
• [Branch](https://github.com/cube-js/cube/tree/tesseract-cast-on-measure-filter)

@codecov

codecov Bot commented Jun 29, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.70%. Comparing base (b8847e4) to head (95da472).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #11176      +/-   ##
==========================================
- Coverage   83.70%   83.70%   -0.01%     
==========================================
  Files         257      257              
  Lines       79465    79465              
==========================================
- Hits        66517    66513       -4     
- Misses      12948    12952       +4     
Flag Coverage Δ
cubesql 83.70% <ø> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@waralexrom waralexrom requested a review from ovr June 29, 2026 16:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

javascript Pull requests that update Javascript code rust Pull requests that update Rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Tesseract bug: filtering by aggregate measures broken in BigQuery due to type casting problem

2 participants