Feat: Add StarRocks engine support #5658
### What

- **Add StarRocks engine support to SQLMesh** via StarRocks’ MySQL-compatible protocol.
- Ship **engine adapter + docs + real integration tests** to ensure generated SQL works on StarRocks.

### Why

- **User demand / adoption**: StarRocks is a common OLAP choice; SQLMesh users want to run the same model lifecycle (build, incremental maintenance, views/MVs) on StarRocks without bespoke SQL.
- **Engine-specific semantics**: StarRocks differs from vanilla MySQL in DDL/DML constraints (e.g., key types, delete behavior, rename caveats). An adapter is needed to produce correct and predictable SQL.
- **Confidence & maintainability**: Documenting config patterns + codifying behavior with integration tests prevents regressions and makes support “real” (not just “it parses”).

### Scope (what’s supported)

- **Connectivity**: Connect through MySQL protocol (e.g., `pymysql`).
- **Table creation / DDL**:
  - Key table types via `physical_properties`: **DUPLICATE KEY (default)**, **PRIMARY KEY (recommended for incremental)**, **UNIQUE KEY**
  - **Partitioning**: simple `partitioned_by` and advanced `partition_by` (complex expression partitioning) + optional initial `partitions`
  - **Distribution**: `distributed_by` structured form or string fallback (HASH / RANDOM; buckets required)
  - **Ordering**: `order_by` / `clustered_by`
  - **Generic PROPERTIES passthrough** (string key/value)
- **Views**:
  - Regular views
  - **Materialized views** via `kind VIEW(materialized true)` with StarRocks-specific notes/constraints
- **DML / maintenance**:
  - Insert/select/update basics
  - Delete behavior handled with StarRocks compatibility constraints (PRIMARY KEY tables recommended for robust deletes)

### Changes

- **Engine adapter**: `sqlmesh/core/engine_adapter/starrocks.py`
- **Docs**: `docs/integrations/engines/starrocks.md`
- **Integration tests**: `tests/core/engine_adapter/integration/test_integration_starrocks.py`, and `tests/core/engine_adapter/test_starrocks.py`

### Verification

- **Integration tests require a running StarRocks** instance.
- Ran:
  - set `STARROCKS_HOST/PORT/USER/PASSWORD`
  - `pytest -m "starrocks and docker" tests/core/engine_adapter/integration/test_integration_starrocks.py`

### Known limitations / caveats

- **No sync MV support (currently)**
- **No tuple IN**: `(c1, c2) IN ((v1, v2), ...)`
- **No `SELECT ... FOR UPDATE`**
- **RENAME caveat**: rename target can’t be qualified with a database name

### Notes on compatibility

- **Changes are StarRocks-scoped** (adapter/docs/tests) and should not impact other engines.

Signed-off-by: jaogoy <jaogoy@gmail.com>
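For anyone reproducing the verification step, here is a minimal sketch of how the `STARROCKS_HOST/PORT/USER/PASSWORD` variables might be turned into MySQL-protocol connection arguments. The helper name `starrocks_connection_kwargs`, the expanded variable names (`STARROCKS_HOST`, `STARROCKS_PORT`, ...), and the fallback defaults are assumptions for illustration; this is not taken from the test suite in this PR.

```python
import os
from typing import Mapping, Optional


def starrocks_connection_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build MySQL-protocol connection arguments from STARROCKS_* variables.

    Hypothetical helper: the defaults are assumptions for a local test
    instance (9030 is the usual StarRocks FE query port).
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("STARROCKS_HOST", "127.0.0.1"),
        "port": int(env.get("STARROCKS_PORT", "9030")),  # env values are strings
        "user": env.get("STARROCKS_USER", "root"),
        "password": env.get("STARROCKS_PASSWORD", ""),
    }


if __name__ == "__main__":
    # Print the resolved settings without the password.
    cfg = starrocks_connection_kwargs()
    print({k: v for k, v in cfg.items() if k != "password"})
```

With `pymysql` installed and a StarRocks instance running, the result could be passed along as `pymysql.connect(**starrocks_connection_kwargs())`.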
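To illustrate the "No tuple IN" caveat: a predicate such as `(c1, c2) IN ((v1, v2), ...)` can be expressed as OR-ed groups of AND-ed equalities instead. The sketch below shows that rewrite as plain string building. The function names `sql_literal` and `expand_tuple_in` are hypothetical, and this is not necessarily how the adapter in this PR handles it.

```python
from typing import Sequence


def sql_literal(value: object) -> str:
    """Render a Python value as a SQL literal (illustrative, not injection-safe)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace("'", "''")
        return "'" + escaped + "'"
    raise TypeError(f"unsupported literal type: {type(value).__name__}")


def expand_tuple_in(columns: Sequence[str], rows: Sequence[Sequence[object]]) -> str:
    """Rewrite `(c1, c2) IN ((v1, v2), ...)` as `(c1 = v1 AND c2 = v2) OR ...`."""
    if not rows:
        return "FALSE"  # an empty IN list matches nothing
    groups = []
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("each row must have one value per column")
        pairs = [f"{col} = {sql_literal(val)}" for col, val in zip(columns, row)]
        groups.append("(" + " AND ".join(pairs) + ")")
    return " OR ".join(groups)


if __name__ == "__main__":
    print(expand_tuple_in(["c1", "c2"], [(1, "a"), (2, "b")]))
    # (c1 = 1 AND c2 = 'a') OR (c1 = 2 AND c2 = 'b')
```

The expanded form grows linearly with the number of value tuples, so for large key sets a join against a temporary table may be a better fit.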
@erindru Hi Erin, would you mind reviewing this PR? It is similar to #5033, but adds StarRocks support to SQLMesh. I'd be very glad to hear your comments. I'm currently working on fixing the CI failures and some test cases.
And optimize some test cases. Signed-off-by: jaogoy <jaogoy@gmail.com>
### Acknowledgement
This implementation was largely inspired by #5033 — thanks to @xinge-ji for the solid groundwork.