Skip to content

Refactor/freemine.tdengine.stmt2#35366

Open
freemine wants to merge 16 commits into
taosdata:mainfrom
freemine:refactor/freemine.tdengine.stmt2
Open

Refactor/freemine.tdengine.stmt2#35366
freemine wants to merge 16 commits into
taosdata:mainfrom
freemine:refactor/freemine.tdengine.stmt2

Conversation

@freemine

Copy link
Copy Markdown
Contributor

Description

Currently, taos_stmt2_prepare does not fully parse parameterized INSERT statements until taos_stmt2_get_fields or taos_stmt2_bind_param is called. It also lacks a clear way to handle literal SQL statements during the prepare/execute stages.

To fix this, this approach splits the logic based on the statement type:

  • For literal SQL statements: It reuses the internal logic of taos_query. The parsing part is handled during taos_stmt2_prepare, and the execution part is handled in taos_stmt2_exec.
  • For parameterized INSERTs: It forces the parsing to happen earlier by calling taos_stmt2_get_fields internally before taos_stmt2_prepare returns.

Benefits for Middleware (JDBC / ODBC):
This change makes implementing upper-level drivers (like JDBC and ODBC) much easier. Standard APIs like prepare, bind, execute, and executeDirect now map directly to the corresponding taos functions. Middleware developers no longer need to parse or inspect the SQL string type just to decide which underlying taos API to call.

Issue(s)

Checklist

Please check the items in the checklist if applicable.

  • Is the user manual updated?
  • Are the test cases passed and automated?
    -- both stmtTest and stmt2Test passed
    -- both synchronous/asynchronous passed
    -- both native/websocket passed
  • Is there no significant decrease in test coverage?

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for literal SQL statements in the TAOS_STMT2 interface, allowing direct execution without parameter binding. It adds parsing utilities (qIsLiteralSql, qPureParseInsert), state management (SStmt2LiteralCtx), and updates the statement preparation and execution flows. The review comments point out several critical issues: a misleading syntax error when parsing standard inserts on non-existent tables, a connection object reference leak in stmtPrepareLiteral2 due to a missing release call on early return, dead code in the parser header, an implicit variable dependency in the SET_ERR macro, and redundant preprocessor branches in thash.c.

Comment thread source/libs/parser/src/parser.c
Comment thread source/client/src/clientStmt2.c
Comment thread include/libs/parser/parser.h Outdated
Comment thread source/client/src/clientStmt2.c
Comment thread source/util/src/thash.c
Comment on lines 677 to 681
#ifdef _TD_LOONGARCH_64
SHashNode *pNewNode = taosMemoryCalloc(1, sizeof(SHashNode) + keyLen + dsize + 1);
#else
SHashNode *pNewNode = taosMemoryMalloc(sizeof(SHashNode) + keyLen + dsize + 1);
SHashNode *pNewNode = taosMemoryCalloc(1, sizeof(SHashNode) + keyLen + dsize + 1);
#endif

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Since both the #ifdef and #else branches now execute the exact same taosMemoryCalloc call, the preprocessor conditional block is redundant and can be removed entirely to simplify the code.

  SHashNode *pNewNode = taosMemoryCalloc(1, sizeof(SHashNode) + keyLen + dsize + 1);

@sheyanjie-qq

Copy link
Copy Markdown
Contributor

● Review summary

  1. [High/High] Compatibility: literal stmt2 breaks async execution semantics
  • Problem: stmtExecLiteral2 calls taosAsyncExecLiteral(stmt) and then immediately waits on tsem_wait(&pStmt->ctx.sem), so an async stmt2 execution becomes blocking. The async callback is then
    invoked synchronously after completion, unlike the normal stmtExec2 path.
  • Evidence: source/client/src/clientStmt2.c:3450-3665
  • why_target_related: this PR introduces the literal stmt2 exec-direct path for SQL without ?.
  • Fix direction: keep literal stmt2 on the same non-blocking async callback path as normal stmt2, or explicitly route only sync APIs through the wait path.
  • Publishability: inline
  1. [Medium/High] Compatibility: qPureParseInsert rejects valid INSERT values
  • Problem: the new lightweight parser only accepts a narrow subset of value tokens such as ?, integer, float, and string. Valid TDengine values like now(), NULL, booleans, negative numbers, or
    expressions can be rejected during prepare.
  • Evidence: source/libs/parser/src/parser.c:1-205
  • why_target_related: this PR adds qPureParseInsert as a fallback for stmt2 insert field inference.
  • Fix direction: reuse the official parser path for field inference, or extend the fallback parser to accept the full legal INSERT value grammar.
  • Publishability: summary
  1. [Medium/High] Compatibility: literal stmt2 bypasses queryTableNotExistAsEmpty
  • Problem: doRequestCallback skips the TSDB_CODE_PAR_TABLE_NOT_EXIST to empty-result conversion when pRequest->literal_by_stmt2 is true. The same query can therefore return empty results
    through normal query/stmt2 but fail through literal stmt2.
  • Evidence: source/client/src/clientImpl.c:3445-3505
  • why_target_related: this PR adds the literal_by_stmt2 request flag and excludes that path from existing option handling.
  • Fix direction: preserve queryTableNotExistAsEmpty behavior for literal stmt2 unless there is a documented reason to diverge.
  • Publishability: inline

Publish preview

  • Summary comment: disabled; publishing was not requested.
  • Inline comments: none published. Candidate inline locations exist for findings 1 and 3.

@Pengrongkun

Copy link
Copy Markdown
Contributor

🤖 Code Review by GitHub Copilot

PR: #35366 Refactor/freemine.tdengine.stmt2
Status: completed ✅


Change Summary

This PR refactors STMT2 prepare/exec semantics by classifying SQL into two categories: Literal SQL (no ? parameters — parsed during prepare, executed during exec) and parameterized INSERT (field inference forced eagerly inside prepare via an internal get_fields call). The primary goal is to let middleware (JDBC/ODBC) map prepare/bind/execute/executeDirect directly to taos APIs without needing to inspect the SQL type.


Findings (5 total)


🔴 [high/high] Portability: parser.h unconditionally includes <syslog.h>, breaking Windows builds

syslog.h is POSIX-only and does not exist on Windows. The include is placed outside the first #if 0 guard, so every Windows translation unit that includes parser.h will fail to compile. The debug macros D/A that use syslog are correctly guarded by #if 0, but the include itself is not.

Location: include/libs/parser/parser.h line ~27

Fix direction: Remove #include <syslog.h> (and clean up the entire #if 0 debug block before merging).


🔴 [high/high] Compatibility: stmtExecLiteral2 blocks the calling thread then fires asyncExecFn synchronously, violating async semantics

stmtExecLiteral2 calls taosAsyncExecLiteral(stmt) and then immediately calls tsem_wait(&pStmt->ctx.sem), blocking the caller until execution completes. It then invokes asyncExecFn synchronously from the same blocked thread. When a user sets asyncExecFn and calls taos_stmt2_exec, the expected contract is that the call returns immediately and the callback fires asynchronously — as the non-literal path (stmtExec2) does. The literal path breaks this contract and will stall JDBC/ODBC threads that rely on async execution.

Location: source/client/src/clientStmt2.c, stmtExecLiteral2

Fix direction: Route literal exec through the same async callback mechanism as the non-literal path, or explicitly detect a set asyncExecFn at the top of stmtExecLiteral2 and return a clear error.


🟡 [medium/high] Correctness: qPureParseInsert rejects valid INSERT VALUES tokens

The next_value branch only accepts ? / INTEGER / FLOAT / STRING, rejecting these legal TDengine INSERT values:

  • NULL
  • Negative numbers (-1 is tokenized as TK_NK_MINUS + TK_NK_INTEGER, matching neither branch)
  • Booleans (true / false)
  • Function calls (now(), today())

This function is used as a fallback in stmtGetStbColFields2, so any valid SQL containing the above values will erroneously return TSDB_CODE_PAR_SYNTAX_ERROR, causing taos_stmt2_get_fields (and the prepare-time field inference) to fail on otherwise legal SQL.

Location: source/libs/parser/src/parser.c, qPureParseInsert:next_value

Fix direction: Extend next_value to handle NULL, unary minus, and boolean tokens; or fall back to the full parser for field inference.


🟡 [medium/high] Compatibility: doRequestCallback skips queryTableNotExistAsEmpty for literal stmt2

When pRequest->literal_by_stmt2 is non-NULL, the tsQueryTbNotExistAsEmpty logic (converting TABLE_NOT_EXIST into an empty result set) is completely bypassed. The same SELECT query via taos_query returns an empty result set, but via literal stmt2 it returns an error — an observable behavioral difference that could break JDBC/ODBC applications relying on this option.

Location: source/client/src/clientImpl.c, doRequestCallback

Fix direction: Preserve the queryTableNotExistAsEmpty handling for literal stmt2 unless there is an explicit documented reason to diverge.


🔵 [medium/medium] Correctness: stmtPrepareLiteral2 calls releaseTscObj when acquireTscObj returned NULL

When acquireTscObj(connId) returns NULL (object not found), the condition pObj != pStmt->taos is true and the code calls releaseTscObj(connId) before returning an error — but the acquire never succeeded, so calling release on an un-acquired connId may corrupt the reference count.

Location: source/client/src/clientStmt2.c, stmtPrepareLiteral2

Fix direction: Only call releaseTscObj when acquireTscObj returned a non-NULL value:

STscObj *pObj = acquireTscObj(connId);
if (pObj == NULL || pObj != pStmt->taos) {
    if (pObj != NULL) releaseTscObj(connId);
    SET_ERR("internal logic error");
    STMT_ERR_RET(TSDB_CODE_TSC_STMT_API_ERROR);
}

This comment was generated automatically by GitHub Copilot CLI tsdb-code-review skill v0.2.0.

Comment thread include/libs/parser/parser.h Outdated
Comment thread source/client/src/clientStmt2.c Outdated
Comment thread source/libs/parser/src/parser.c
Comment thread source/client/src/clientImpl.c Outdated
if (pRequest->msgBuf != NULL && pRequest->msgBufLen > 0) {
pRequest->msgBuf[0] = '\0';

if (!pRequest->literal_by_stmt2) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[medium/high] queryTableNotExistAsEmpty is silently bypassed for literal stmt2

The tsQueryTbNotExistAsEmpty option (converting TABLE_NOT_EXIST into an empty result set) is skipped when pRequest->literal_by_stmt2 is non-NULL. The same SELECT query through taos_query returns an empty result set; through literal stmt2 it returns an error. This behavioral difference can silently break JDBC/ODBC applications that rely on this option.

Fix direction: Preserve queryTableNotExistAsEmpty handling for literal stmt2 inside the if (!pRequest->literal_by_stmt2) block, unless the divergence is intentional and documented.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

同上,stmt函数单独封装,不要和query耦合

Comment thread source/client/src/clientStmt2.c Outdated
Comment thread source/util/src/thash.c
@@ -677,7 +677,7 @@ SHashNode *doCreateHashNode(const void *key, size_t keyLen, const void *pData, s
#ifdef _TD_LOONGARCH_64
SHashNode *pNewNode = taosMemoryCalloc(1, sizeof(SHashNode) + keyLen + dsize + 1);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

不要修改无关代码

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个不是无关代码,valgrind 报告了read uninitialized data 错。本来我应该单独提一个PR的。

}

void taosAsyncQueryImplWithReqid(uint64_t connId, const char* sql, __taos_async_fn_t fp, void* param, bool validateOnly,
void taosAsyncQueryImplWithReqid(TAOS_STMT2 *stmt, uint64_t connId, const char* sql, __taos_async_fn_t fp, void* param, bool validateOnly,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stmt的函数单独封装,不要和query耦合在一起

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just don't get a clear solutions, otherwise i have to hard-copy the whole logic in the block.
or, what suggestions might be here?

Comment thread source/client/src/clientImpl.c Outdated
if (pRequest->msgBuf != NULL && pRequest->msgBufLen > 0) {
pRequest->msgBuf[0] = '\0';

if (!pRequest->literal_by_stmt2) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

同上,stmt函数单独封装,不要和query耦合

}

if (stmtIsLiteral(pStmt)) {
return stmtExecLiteral2(stmt, affected_rows);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里没有考虑async情况pStmt->options.asyncExecFn,用户会设置callback函数去接收结果

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will be handled in stmtExecLiteral2 for the next commit

Comment thread source/client/src/clientStmt2.c Outdated
Comment thread source/client/inc/clientInt.h
freemine added 6 commits May 30, 2026 07:40
1. remove `D` / `A` unused block
2. fully support asynchronous mode triggered by `taos_stmt2_init`
3. `qPureParseInsert` correctly parse VALUES clause
4. legacy support `queryTableNotExistAsEmpty`
1. reuse SRequestObj::msgBuf if possible
2. SRequestObj lifecycle management adjustment in `clientImpl.c::doRequestCallback`

@Pengrongkun Pengrongkun left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AI review了几个意见,请考虑重新async模式下重新prepare的情况

return pStmt->ctx.code;
}

// NOTE: what if taosAsyncExecLiteral failed prematurelly?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

问题stmtExecLiteral2 在异步模式(asyncExecFn != NULL)下立即返回,但 stmtLiteralCallback 仍会在之后某时刻触发并解引用 pStmt。若此期间用户调用 taos_stmt2_close() 或对同一 stmt 重新 preparestmtLiteralCallback 将访问已释放或已被重置的 pStmt 内存,触发 use-after-free 或 abort()

证据

  • clientStmt2.c:stmtExecLiteral2:async 分支直接 return TSDB_CODE_SUCCESS 而不等待回调
  • clientStmt2.c:stmtClose2:释放 ctx 但无 literal-specific 的 in-flight 等待屏障
  • clientStmt2.c:stmtLiteralCallback:回调中 pStmt->exec.pRequest != res 时调用 abort(),说明作者已意识到指针失效风险

与本 MR 的关联:本 MR 新增 async literal 执行路径并以 pStmt 作为回调状态,但未为该新路径添加对应的生命周期屏障。

修复建议:在 stmtClose2stmtDeepReset 中新增针对 literal path 的 in-flight 检测,若 ctx.executing == 1 则等待信号量后再继续销毁;或在 async 回调触发前持有 stmt 的引用计数。

code = prepareAndParseSqlSyntax(&pWrapper, pRequest, updateMetaForce);
}

if (pRequest->literal_by_stmt2) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

问题SRequestObj.literal_by_stmt2 是指向 TAOS_STMT2* 的裸指针(注释为 // reference only),无所有权/生命周期协议。在异步 exec 路径中,doAsyncQuery 会解引用该指针检查 ctx.prepared。若 stmt 在请求飞行期间被 close,该指针即变为悬空指针。

证据

  • clientInt.hTAOS_STMT2 *literal_by_stmt2; // reference only
  • clientMain.c:doAsyncQueryTAOS_STMT2* stmt = pRequest->literal_by_stmt2; STscStmt2* pStmt = (STscStmt2*)stmt;

与本 MR 的关联:本 MR 引入该请求→stmt 反向引用以实现 literal exec-direct 功能。

修复建议:在赋值 literal_by_stmt2 时对 stmt 加引用计数,请求完成后释放;或在 stmt reset/close 时显式将 pRequest->literal_by_stmt2 置 NULL(需要加锁)。

#define RETURN_EXPECTING(msg) do { \
(void)snprintf(pCtx->buf, sizeof(pCtx->buf), \
"expecting %s, but got `%.*s`", \
msg, token.n < 10 ? 10 : token.n, token.z); \

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里会发生越界读?'<'应该改成'>'

}

static inline int
stmt2LiteralCtxIsValid(SStmt2LiteralCtx *ctx) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

定义但未调用

static inline void
stmt2LiteralCtxRelease(SStmt2LiteralCtx *ctx) {
stmt2LiteralCtxReset(ctx);
if (ctx->sem_valid) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

问题stmtDeepReset 调用 stmt2LiteralCtxReset(仅重置 flags,不销毁/排干信号量),若此时有一个 async exec 的 stmtLiteralCallback 尚未触发,其后续的 tsem_post 会让下一次 stmtPrepareLiteral2tsem_wait 提前返回,携带的是上一次执行的 codepRequest,导致新 prepare 静默失败或状态混乱。

证据clientStmt2.c:stmt2LiteralCtxReset(不重置 sem_valid),stmtDeepReset(调用 reset 而非 release),stmtPrepareLiteral2stmt2LiteralCtxInit 看到 sem_valid == 1 直接复用旧信号量)。

修复建议:在 stmtDeepReset 中对 literal 已初始化的信号量先排干(或 destroy + re-init),确保下次 prepare 使用干净状态。

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants