@wardlican
Contributor

Why are the changes needed?

TaskRuntime.stateLock and TableOptimizingProcess.lock can deadlock. When this happens, tables in the affected resource group malfunction, and OptimizerKeeper also fails to function properly.

Close #4001.

Brief change log

The following fixes have been implemented:

  1. Modified the completeTask method (lines 565-577):
  • First, call taskRuntime.complete() (acquires and releases the database lock).
  • After the database lock is released, acquire TableOptimizingProcess.this.lock and call acceptResult.
  • Keep the lock acquisition order consistent: acquire TableOptimizingProcess.this.lock first, then the database lock.
  2. Removed the whenCompleted callback registration (lines 910-922 and 932-945):
  • Avoid synchronously calling acceptResult while holding the database lock.
  • Instead, explicitly call acceptResult within completeTask.
  3. Modified the cancelTasks method (lines 897-902):
  • First, cancel all tasks (acquires and releases the database lock).
  • Then call acceptResult for each canceled task (while holding TableOptimizingProcess.this.lock).
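
The ordering described in the list above can be sketched roughly as follows. This is a minimal illustration, not the actual patch: the class and method names mirror those mentioned in this PR, but the bodies, the use of ReentrantLock, the status strings, and the helpers holdsStateLock and acceptedResults are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified stand-ins for the classes named in this PR.
class LockOrderSketch {

  /** Stand-in for TaskRuntime; its state is guarded by stateLock. */
  static class TaskRuntime {
    private final ReentrantLock stateLock = new ReentrantLock();
    private String status = "ACKED"; // illustrative status values

    /** Changes state under stateLock; no callback runs while the lock is held. */
    private void transition(String next) {
      stateLock.lock();
      try {
        status = next;
      } finally {
        stateLock.unlock();
      }
    }

    void complete(boolean success) { transition(success ? "SUCCESS" : "FAILED"); }

    void cancel() { transition("CANCELED"); }

    String getStatus() {
      stateLock.lock();
      try {
        return status;
      } finally {
        stateLock.unlock();
      }
    }

    boolean holdsStateLock() { return stateLock.isHeldByCurrentThread(); }
  }

  /** Stand-in for TableOptimizingProcess; results are guarded by lock. */
  static class TableOptimizingProcess {
    private final ReentrantLock lock = new ReentrantLock();
    private final List<String> accepted = new ArrayList<>();

    void completeTask(TaskRuntime task, boolean success) {
      task.complete(success); // stateLock acquired and released here
      lock.lock();            // process lock taken only afterwards
      try {
        acceptResult(task);
      } finally {
        lock.unlock();
      }
    }

    void cancelTasks(List<TaskRuntime> tasks) {
      for (TaskRuntime task : tasks) {
        task.cancel();        // cancel everything first
      }
      lock.lock();            // then accept results under the process lock
      try {
        for (TaskRuntime task : tasks) {
          acceptResult(task);
        }
      } finally {
        lock.unlock();
      }
    }

    /** Called with the process lock held and without stateLock held. */
    private void acceptResult(TaskRuntime task) {
      if (task.holdsStateLock()) {
        throw new IllegalStateException("acceptResult called while holding stateLock");
      }
      // Nested acquisition order here: process lock first, then stateLock.
      accepted.add(task.getStatus());
    }

    List<String> acceptedResults() {
      lock.lock();
      try {
        return new ArrayList<>(accepted);
      } finally {
        lock.unlock();
      }
    }
  }

  public static void main(String[] args) {
    TableOptimizingProcess process = new TableOptimizingProcess();
    process.completeTask(new TaskRuntime(), true);
    process.cancelTasks(Arrays.asList(new TaskRuntime(), new TaskRuntime()));
    System.out.println(process.acceptedResults());
  }
}
```

In this sketch no code path acquires the process lock while holding stateLock, so the two locks are only ever nested in one direction, which is the property the fixes above aim for.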

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@github-actions bot added the module:ams-server (Ams server module) label on Dec 9, 2025
@czy006
Contributor

czy006 commented Dec 26, 2025

cc @baiyangtx

@zhoujinsong
Contributor

I am not sure whether the deadlock issue still exists on the master branch.
So we may put this PR on hold and watch whether the issue still reproduces. @wardlican @czy006

@github-actions

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

@github-actions bot added the stale label on Jan 29, 2026

Successfully merging this pull request may close these issues.

[Bug]: TaskRuntime.stateLock and TableOptimizingProcess.lock can cause deadlocks.