chore(e2e): widen CloudWatch query window in MetricsE2ET to fix flaky test #2460
Merged
Pad the CloudWatch query time window by -1min/+2min for standard resolution metric queries to account for CloudWatch eventual consistency. Metric timestamps can shift by up to a minute during batch processing, causing MetricDataNotFoundException when using a tight 1-minute window.

Closes #2440

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
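For readers following the thread, the padding can be sketched roughly as below. This is an illustration only, not the code from this PR: the class `PaddedQueryWindow`, its `pad` and `contains` methods, and the example timestamps are all hypothetical, and the half-open interval check is a modelling assumption. Only the -1 min / +2 min padding values come from the commit message.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper (not from the PR diff): widens a metric query window.
final class PaddedQueryWindow {
    // Padding values quoted in the commit message: -1 min before, +2 min after.
    static final Duration PAD_BEFORE = Duration.ofMinutes(1);
    static final Duration PAD_AFTER = Duration.ofMinutes(2);

    final Instant start;
    final Instant end;

    private PaddedQueryWindow(Instant start, Instant end) {
        this.start = start;
        this.end = end;
    }

    // Returns the original window extended on both sides.
    static PaddedQueryWindow pad(Instant originalStart, Instant originalEnd) {
        return new PaddedQueryWindow(
                originalStart.minus(PAD_BEFORE),
                originalEnd.plus(PAD_AFTER));
    }

    // Assumption: the window is treated as half-open, [start, end).
    boolean contains(Instant timestamp) {
        return !timestamp.isBefore(start) && timestamp.isBefore(end);
    }

    public static void main(String[] args) {
        // Made-up timestamps: a tight 1-minute window and a data point
        // whose timestamp landed 30 seconds after that window closed.
        Instant start = Instant.parse("2026-04-09T10:00:00Z");
        Instant end = start.plus(Duration.ofMinutes(1));
        Instant shifted = end.plusSeconds(30);

        PaddedQueryWindow padded = pad(start, end);
        System.out.println("padded window: " + padded.start + " to " + padded.end);
        System.out.println("shifted timestamp covered: " + padded.contains(shifted));
    }
}
```

A timestamp that drifts just outside the tight window still falls inside the padded one, which is the situation the commit message describes.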
svozza
previously approved these changes
Apr 9, 2026
The high-res metric query (period=1s) also fails with the original 1-minute window due to CloudWatch eventual consistency. Apply the same padded window and sum all returned data points, since only the high-resolution metric appears at 1-second granularity.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Verification runs (round 1)

Six E2E test runs were triggered to verify that the fix eliminates the flakiness. All 6 runs failed. The high-resolution metric query with the padded window returned both the standard (4) and high-res (8) product metrics, which sum to 12 instead of the expected 8. Fixed in a9eab07 by asserting that the high-res value (8.0) is contained in the returned data points.
svozza
previously approved these changes
Apr 9, 2026
With the padded window, both standard (4) and high-res (8) product metrics appear as separate 1-second buckets, so the sum is 12 not 8. Instead, assert that the high-res value (8.0) is contained in the returned data points.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
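The difference between the two assertions can be illustrated with a small sketch. It is not the test code from this PR: the class `HighResAssertionSketch` and its methods are hypothetical, and the list simply models the two buckets described in the commit message (4 for the standard metric, 8 for the high-res metric).

```java
import java.util.List;

// Hypothetical illustration (not from the PR diff) of sum-based versus
// containment-based checks on the returned data points.
final class HighResAssertionSketch {
    // Adds up every returned data point, as the earlier commit did.
    static double sum(List<Double> dataPoints) {
        double total = 0.0;
        for (double value : dataPoints) {
            total += value;
        }
        return total;
    }

    // Checks only that the expected value appears among the data points.
    static boolean containsValue(List<Double> dataPoints, double expected) {
        return dataPoints.contains(expected);
    }

    public static void main(String[] args) {
        // With the padded window, both metrics show up as separate buckets.
        List<Double> dataPoints = List.of(4.0, 8.0);

        System.out.println("sum = " + sum(dataPoints));
        System.out.println("contains 8.0 = " + containsValue(dataPoints, 8.0));
    }
}
```

The containment check does not depend on how many other buckets the wider window happens to return, which is why it tolerates the extra standard-resolution data point.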
Verification runs (round 2) ✅

2 E2E test runs were triggered against the latest fix (a9eab07). Both runs passed.
Verification runs (round 3)

2 additional E2E test runs were triggered to double-check. Both runs passed. The fix is confirmed to address the CloudWatch eventual consistency flakiness.
@svozza The PR is now ready. I ran several rounds of E2E tests (see the PR comments).
svozza
approved these changes
Apr 9, 2026



Summary
Changes
Pads the CloudWatch query time window in `MetricsE2ET` by -1 min (before) / +2 min (after) to account for CloudWatch eventual consistency. Metric timestamps can shift by up to a minute during batch processing, causing `MetricDataNotFoundException` or `DataNotReadyException` in CI.

This padding is applied to all metric queries, including the high-resolution (period=1s) query. Initial testing showed that the high-res query also fails with the original 1-minute window (see failed job run). For the high-res query, the assertion is changed from `get(0) == 8` to `contains(8.0)`, since the wider window may return both the standard (4) and high-res (8) product metrics as separate 1-second buckets.

No changes to shared test infrastructure (`MetricsFetcher`, `RetryUtils`, `LambdaInvoker`).

Issue number: #2440
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.