@wangmm0220
Contributor

Description

Please briefly describe the code changes in this pull request.

Jira: https://jira.taosdata.com:18080/browse/TD-

Checklist

Please check the items in the checklist if applicable.

  • Is the user manual updated?
  • Are the test cases passed and automated?
  • Is there no significant decrease in test coverage?

@gemini-code-assist
Contributor

Summary of Changes

Hello @wangmm0220, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical race condition within the TMQ snapshot mechanism during data transformation. The changes primarily focus on enhancing thread safety by introducing metadata read locks, standardizing error handling with robust resource cleanup, and removing a function (tqReaderSetVtableInfo) that was likely contributing to the instability or was no longer required in the new, more robust design. These modifications aim to improve the reliability and stability of TMQ snapshot operations.

Highlights

  • Race Condition Fix: Implemented read locks (metaRLock/metaULock) for metadata operations during TMQ snapshot creation and retrieval to prevent race conditions, ensuring data consistency in concurrent environments.
  • Enhanced Error Handling and Resource Management: Standardized error checking and resource cleanup across several functions using TSDB_CHECK_NULL, TSDB_CHECK_CODE, and TSDB_CHECK_CONDITION macros, coupled with goto END blocks. This ensures consistent resource deallocation (e.g., tDecoderClear, tdbTbcClose, taosArrayDestroy) on all execution paths, improving robustness and preventing memory leaks.
  • Removed Obsolete Virtual Table Functionality: Eliminated the tqReaderSetVtableInfo function, its declaration, and its function pointer from the SStoreTqReader struct, along with the createStreamVtableBlock function. This streamlines the TMQ snapshot process by removing a component that was likely either problematic or no longer necessary.
  • Clarified Resource Ownership: Explicitly transferred ownership of the snapshot context (sContext) in createTmqRawScanOperatorInfo by setting the original handle's context to NULL, preventing potential double-free or use-after-free issues.
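
The first two highlights (a read lock held around metadata access, and check macros that jump to a single cleanup label) can be sketched roughly as below. This is an illustrative toy under stated assumptions, not the code from this pull request: ToyMeta, toyRLock, toyULock, the TOY_CHECK_* macros, and toyBuildSnapshot are hypothetical stand-ins for metaRLock/metaULock and the TSDB_CHECK_* macros, whose real definitions are not shown in this conversation.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the metadata store; "readers" counts held read locks. */
typedef struct { int readers; } ToyMeta;

static void toyRLock(ToyMeta *m) { m->readers++; } /* stand-in for metaRLock */
static void toyULock(ToyMeta *m) { m->readers--; } /* stand-in for metaULock */

enum { TOY_OK = 0, TOY_OOM = -1, TOY_INVALID = -2 };

/* Assumed shape of the check macros: record code and line, then jump to END. */
#define TOY_CHECK_NULL(p, err) \
  do { if ((p) == NULL) { code = (err); lino = __LINE__; goto END; } } while (0)
#define TOY_CHECK_CONDITION(c, err) \
  do { if (!(c)) { code = (err); lino = __LINE__; goto END; } } while (0)

/* Builds an array of `count` ids while holding the read lock. */
int toyBuildSnapshot(ToyMeta *meta, int count, int **out) {
  int  code = TOY_OK;
  int  lino = 0;
  int *ids = NULL;

  toyRLock(meta);
  TOY_CHECK_CONDITION(count > 0, TOY_INVALID);
  ids = malloc((size_t)count * sizeof(*ids));
  TOY_CHECK_NULL(ids, TOY_OOM);
  for (int i = 0; i < count; i++) ids[i] = i;
  *out = ids;
  ids = NULL; /* ownership moved to the caller, so END must not free it */

END:
  toyULock(meta); /* released on every path, success or failure */
  free(ids);      /* no-op on success because ids was reset to NULL */
  if (code != TOY_OK) {
    fprintf(stderr, "%s failed at line %d, code %d\n", __func__, lino, code);
  }
  return code;
}
```

Because every exit goes through END, the lock count returns to zero whether the call succeeds or fails, which is the property the standardized cleanup is meant to guarantee.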


@gemini-code-assist (bot) left a comment

Code Review

This pull request primarily addresses a race condition in the TMQ snapshot functionality by introducing locking around the creation and use of the snapshot context. The changes also include significant refactoring to standardize error handling with macros, which improves code consistency and readability. Additionally, there's some good code cleanup, such as removing an unused function and clarifying resource ownership transfer. My review includes a couple of points: one regarding a non-portable header include that should be changed, and a more critical issue about a potential memory leak caused by a variable being re-declared in a loop. Overall, the core logic for the race condition fix appears sound.
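
The kind of leak that a variable re-declared inside a loop can cause is easy to miss, so here is a minimal sketch of the general pattern. The actual loop flagged by the review is not shown in this conversation; leakyLoop, fixedLoop, and the small tracking allocator are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_TRACKED 16
static void *g_slots[MAX_TRACKED]; /* registry so the demo can count live blocks */

void *trackedAlloc(size_t n) {
  for (int i = 0; i < MAX_TRACKED; i++) {
    if (g_slots[i] == NULL) {
      g_slots[i] = malloc(n);
      return g_slots[i];
    }
  }
  return NULL; /* registry full */
}

void trackedFree(void *p) {
  if (p == NULL) return;
  for (int i = 0; i < MAX_TRACKED; i++) {
    if (g_slots[i] == p) {
      g_slots[i] = NULL;
      free(p);
      return;
    }
  }
}

int liveCount(void) {
  int n = 0;
  for (int i = 0; i < MAX_TRACKED; i++) n += (g_slots[i] != NULL);
  return n;
}

void releaseAll(void) {
  for (int i = 0; i < MAX_TRACKED; i++) {
    free(g_slots[i]);
    g_slots[i] = NULL;
  }
}

/* Buggy: the inner declaration shadows the outer buf, so END frees NULL. */
int leakyLoop(int rounds) {
  char *buf = NULL;
  for (int i = 0; i < rounds; i++) {
    char *buf = trackedAlloc(16); /* re-declared: a different variable */
    if (buf == NULL) goto END;
    buf[0] = (char)i;
  }
END:
  trackedFree(buf); /* outer buf is still NULL, nothing is released */
  return liveCount();
}

/* Fixed: one variable, released each round and again at END for early exits. */
int fixedLoop(int rounds) {
  char *buf = NULL;
  for (int i = 0; i < rounds; i++) {
    buf = trackedAlloc(16);
    if (buf == NULL) goto END;
    buf[0] = (char)i;
    trackedFree(buf);
    buf = NULL;
  }
END:
  trackedFree(buf);
  return liveCount();
}
```

Compiling with a shadowing warning enabled (for example -Wshadow in GCC and Clang) flags the inner declaration in leakyLoop.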


Copilot AI left a comment

Pull request overview

This PR fixes a race condition in the TMQ (TDengine Message Queue) snapshot mechanism for data transformation by refactoring ownership management of snapshot contexts and removing unused vtable-related functionality.

Key Changes

  • Snapshot Ownership Transfer: Modified createTmqRawScanOperatorInfo to properly transfer ownership of sContext from pHandle to the operator info structure, preventing double-free or use-after-free issues
  • VTable Code Removal: Removed the tqReaderSetVtableInfo function and related vtable source scan infrastructure that is no longer needed
  • Error Handling Improvements: Refactored error handling in metaSnapshot.c functions to use consistent error checking macros and proper resource cleanup with lock management
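
The ownership transfer described in the first bullet can be illustrated with a small self-contained sketch. The types and functions below (ToyContext, ToyHandle, ToyScanInfo, toyTakeContext, toyDestroyContext) are hypothetical stand-ins; only the two-line move of sContext mirrors what the pull request is described as doing.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct { int id; } ToyContext;               /* stand-in for the snapshot context */
typedef struct { ToyContext *sContext; } ToyHandle;   /* stand-in for pHandle */
typedef struct { ToyContext *sContext; } ToyScanInfo; /* stand-in for the operator info */

int g_toyDestroyed = 0; /* counts real destructions, for the demo only */

/* NULL-tolerant destructor, so a handle that gave up ownership is safe to clean. */
void toyDestroyContext(ToyContext *ctx) {
  if (ctx == NULL) return;
  g_toyDestroyed++;
  free(ctx);
}

/* Move the context from the handle to the operator info. */
void toyTakeContext(ToyScanInfo *info, ToyHandle *handle) {
  info->sContext = handle->sContext;
  handle->sContext = NULL; /* transfer the ownership */
}
```

After toyTakeContext, cleaning up both the handle and the operator info destroys the context exactly once: the handle's pointer is NULL, so its destructor call is a no-op.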

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| source/libs/executor/src/scanoperator.c | Transfers sContext ownership to operator info and removes createStreamVtableBlock function and vtable setup code |
| source/dnode/vnode/src/vnd/vnodeApi.c | Removes tqReaderSetVtableInfo function pointer initialization |
| source/dnode/vnode/src/tq/tqRead.c | Removes entire tqReaderSetVtableInfo function implementation (124 lines) |
| source/dnode/vnode/src/tq/tqMeta.c | Adds destroySnapshot call in cleanup path to prevent resource leaks |
| source/dnode/vnode/src/meta/metaSnapshot.c | Refactors error handling with proper lock management, adds NULL check in destroySnapContext, and fixes formatting |
| source/dnode/vnode/inc/vnode.h | Removes tqReaderSetVtableInfo function declaration |
| include/libs/executor/storageapi.h | Removes tqReaderSetVtableInfo function pointer from SStoreTqReader struct |

Comments suppressed due to low confidence (1)

source/libs/executor/src/scanoperator.c:2520

  • Potential resource leak: If an error occurs after line 2506 (where pInfo->sContext is set from pHandle->sContext) but before the operator is successfully created, the code jumps to _end where pInfo is freed (line 2520) without destroying pInfo->sContext. This will cause a memory leak of the snapshot context. Consider adding cleanup for pInfo->sContext in the error path, or restoring ownership to pHandle->sContext before the error return.
  pInfo->sContext = pHandle->sContext;
  pHandle->sContext = NULL;  // transfer the ownership
  setOperatorInfo(pOperator, "RawScanOperator", QUERY_NODE_PHYSICAL_PLAN_TABLE_SCAN, false, OP_NOT_OPENED, pInfo,
                  pTaskInfo);

  pOperator->fpSet = createOperatorFpSet(NULL, doRawScanNext, NULL, destroyTmqRawScanOperatorInfo, optrDefaultBufFn,
                                         NULL, optrDefaultGetNextExtFn, NULL);
  *pOptrInfo = pOperator;
  return code;

_end:
  if (code != TSDB_CODE_SUCCESS) {
    qError("%s failed at line %d since %s", __func__, lino, tstrerror(code));
  }
  taosMemoryFree(pInfo);
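
One way to address the suppressed comment, sketched under assumptions: if any failure can occur after the transfer, the error path destroys the context that the info struct now owns before freeing the struct. Whether the real function can actually fail after the transfer is not visible in the excerpt above. DemoContext, DemoHandle, DemoInfo, demoDestroyContext, demoCreateRawScan, and the failAfterTransfer flag are hypothetical names for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct { int id; } DemoContext;             /* stand-in for the snapshot context */
typedef struct { DemoContext *sContext; } DemoHandle;
typedef struct { DemoContext *sContext; } DemoInfo;

int g_demoCtxFreed = 0; /* counts destroyed contexts, for the demo only */

void demoDestroyContext(DemoContext *ctx) {
  if (ctx == NULL) return;
  g_demoCtxFreed++;
  free(ctx);
}

/* failAfterTransfer simulates an error between the transfer and success. */
int demoCreateRawScan(DemoHandle *handle, int failAfterTransfer, DemoInfo **out) {
  int       code = 0;
  DemoInfo *info = calloc(1, sizeof(*info));
  if (info == NULL) {
    code = -1;
    goto _end;
  }

  info->sContext = handle->sContext;
  handle->sContext = NULL; /* transfer the ownership */

  if (failAfterTransfer) {
    code = -2;
    goto _end;
  }

  *out = info;
  return code;

_end:
  if (info != NULL) {
    demoDestroyContext(info->sContext); /* info owns the context by now */
    free(info);
  }
  return code;
}
```

The alternative mentioned in the comment, handing the context back to the handle before returning the error, would also avoid the leak; destroying it in place keeps the error path independent of what the caller does with the handle afterwards.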


@guanshengliang merged commit 00e27e3 into main on Dec 8, 2025
22 of 24 checks passed