Core: Expose commit retry exhaustion reason in failure messages #16757

Draft
shangeyao wants to merge 12 commits into apache:main from shangeyao:fix/16744-commit-retry-exhaustion-messages

@shangeyao

Summary

Fixes #16744.

When commit retries are exhausted, the underlying CommitFailedException is currently rethrown as-is, so the failure message does not indicate why the retry loop stopped. This makes it hard for operators to know which table property to tune (commit.retry.num-retries vs. commit.retry.total-timeout-ms).

This change:

  • Adds RetryExhaustedException with a Reason enum (RETRY_LIMIT_EXCEEDED / TIMEOUT_EXCEEDED) in Tasks.java
  • Throws RetryExhaustedException (preserving the original exception as cause) when retry is exhausted
  • Catches RetryExhaustedException at all 14 commit call sites and translates the reason into actionable messages, e.g.:
    • "Commit failed and retry timeout (60000 ms) reached. Consider increasing 'commit.retry.total-timeout-ms'"
    • "Commit failed and retry limit (4) reached. Consider increasing 'commit.retry.num-retries'"
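
For reviewers who want the shape of the change without opening the diff, here is a minimal sketch of a retry loop that classifies why it stopped. It is not the code in Tasks.java: only the names RetryExhaustedException, Reason, RETRY_LIMIT_EXCEEDED and TIMEOUT_EXCEEDED come from this description, while RetrySketch, runWithRetry and their parameters are hypothetical. The sketch also retries on every exception and omits backoff, which the real utility may not do.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for the retry utility; not the actual Tasks.java code.
final class RetrySketch {

  /** Why the retry loop gave up (enum constants as named in the PR description). */
  public enum Reason { RETRY_LIMIT_EXCEEDED, TIMEOUT_EXCEEDED }

  /** Carries the classification and keeps the last failure as the cause. */
  public static class RetryExhaustedException extends RuntimeException {
    private final Reason reason;

    public RetryExhaustedException(Reason reason, Throwable cause) {
      super("Retry exhausted: " + reason, cause);
      this.reason = reason;
    }

    public Reason reason() {
      return reason;
    }
  }

  /**
   * Runs the task up to (1 + maxRetries) times. Assumption: every exception is
   * retryable and there is no backoff, to keep the sketch short.
   */
  public static <T> T runWithRetry(Callable<T> task, int maxRetries, long totalTimeoutMs) {
    long startNanos = System.nanoTime();
    int attempt = 0;
    while (true) {
      attempt++;
      try {
        return task.call();
      } catch (Exception e) {
        long elapsedMs = (System.nanoTime() - startNanos) / 1_000_000L;
        if (attempt > maxRetries) {
          // The attempt budget ran out first.
          throw new RetryExhaustedException(Reason.RETRY_LIMIT_EXCEEDED, e);
        }
        if (elapsedMs > totalTimeoutMs) {
          // The time budget ran out first.
          throw new RetryExhaustedException(Reason.TIMEOUT_EXCEEDED, e);
        }
        // Otherwise fall through and try again.
      }
    }
  }

  private RetrySketch() {}
}
```

Keeping the original failure as the cause means existing handlers that inspect the underlying exception can still reach it through getCause().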

The design keeps the generic retry utility (Tasks) unaware of Iceberg property names: Tasks classifies why the retry loop stopped, and each commit call site translates that reason into a message naming the relevant table property.
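
The call-site half of that split could look roughly like the sketch below, which maps a reason to the two messages quoted above. The class CommitMessageSketch and the method describe are hypothetical names, and the enum is redeclared here only so the example stands alone. The property names and message wording are taken from this description.

```java
// Hypothetical call-site helper; the PR may structure the translation differently.
final class CommitMessageSketch {

  /** Redeclared so the sketch is self-contained. */
  public enum Reason { RETRY_LIMIT_EXCEEDED, TIMEOUT_EXCEEDED }

  // Table property names quoted in the PR description.
  static final String NUM_RETRIES = "commit.retry.num-retries";
  static final String TOTAL_TIMEOUT_MS = "commit.retry.total-timeout-ms";

  /** Turns the generic reason into a message that names the property to tune. */
  public static String describe(Reason reason, int numRetries, long totalTimeoutMs) {
    switch (reason) {
      case RETRY_LIMIT_EXCEEDED:
        return "Commit failed and retry limit (" + numRetries
            + ") reached. Consider increasing '" + NUM_RETRIES + "'";
      case TIMEOUT_EXCEEDED:
        return "Commit failed and retry timeout (" + totalTimeoutMs
            + " ms) reached. Consider increasing '" + TOTAL_TIMEOUT_MS + "'";
      default:
        throw new IllegalArgumentException("Unknown reason: " + reason);
    }
  }

  private CommitMessageSketch() {}
}
```

Because the property names live only in the helper, the retry utility stays reusable by callers that have nothing to do with table commits.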

Testing

./gradlew :iceberg-core:test --tests "org.apache.iceberg.util.TestTasks"
./gradlew :iceberg-core:compileJava

…he#16744)

When commit retry is exhausted, the failure message now indicates
whether the retry attempt limit or the total timeout was reached,
helping users know which table property to adjust.

- Add RetryExhaustedException with Reason enum in Tasks.java
- Translate retry exhaustion reason to actionable messages
  in all 14 commit call sites (SnapshotProducer, BaseTransaction,
  CatalogHandlers, etc.)
@github-actions github-actions Bot added the core label Jun 10, 2026
@shangeyao shangeyao marked this pull request as draft June 11, 2026 01:14