Fix more flakiness in Cosmos tests#38211
Open
AndriySvyryd wants to merge 10 commits intomainfrom
Open
Conversation
CosmosBulkConcurrencyTest and CosmosConcurrencyTest both inherited the
same StoreName ('CosmosConcurrencyTest'). Since xUnit runs test classes
in parallel, their CleanAsync calls raced against the same database —
one class could delete the database while the other was creating
containers, causing a 404 NotFound.
Override StoreName in the bulk fixture so each class uses its own
database.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Fixes flakiness in Cosmos concurrency functional tests by isolating the database used by CosmosBulkConcurrencyTest from CosmosConcurrencyTest, preventing parallel test runs from racing on CleanAsync (delete/recreate).
Changes:
- Override
StoreNameinCosmosBulkConcurrencyTest.ConcurrencyFixtureto use a unique Cosmos database name.
…yTest.cs Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…licts When the AsyncSeeder adds entities to the context and SaveChangesAsync hits a transient Cosmos error (429/503), the execution strategy retries CleanAsyncImpl with the same context. The retry's seeder then fails trying to Add entities already tracked from the previous attempt. Clear ChangeTracker at the start of each attempt so retries start clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The seeder invoked by SeedDataAsync adds entities to the context, so if EnsureCreatedAsync is called inside any retry loop and the seeder's SaveChangesAsync hits a transient error, the next attempt finds stale tracked entities and fails with an identity conflict. Clearing the tracker at the start of EnsureCreatedAsync makes the method inherently retry-safe rather than relying on callers to clear state. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
0179bf6 to
e235149
Compare
e235149 to
62293b4
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosTestStore.cs:529
- This keeps the first shared CosmosTestStore's _storeContext undisposed until ProcessExit (the 'reserved for cleanup' instance). That means a DbContext (and likely its underlying Cosmos client/service provider) can stay alive for the entire test process per StoreName, increasing resource usage and potentially affecting long-running test runs. Consider storing only the information needed for cleanup (e.g., store name/connection info) and creating a short-lived context at ProcessExit for deletion, so DisposeAsync can always dispose _storeContext deterministically.
private readonly CosmosTestStore _testStore = testStore;
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
if (TestEnvironment.UseTokenCredential)
{
optionsBuilder.UseCosmos(
_testStore.ConnectionUri, _testStore.TokenCredential, _testStore.Name, _testStore._configureCosmos);
}
else
62293b4 to
c302751
Compare
Multiple test fixtures can share the same Cosmos database (e.g. all Northwind query tests share 'Northwind'). Previously, the first fixture to dispose would delete the database, causing other fixtures still running in parallel to fail with NotFound or wrong data. Shared stores now register themselves in a static dictionary at construction time. A static constructor registers a ProcessExit handler that deletes all shared databases in parallel after the test run completes. Non-shared databases continue to be deleted immediately on dispose. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
c302751 to
de42341
Compare
roji
approved these changes
May 2, 2026
5de08b2 to
57e73cf
Compare
57e73cf to
59ebab1
Compare
- CosmosDatabaseCreator.EnsureCreatedAsync: wrap entire body in execution strategy. Use StrongBox to persist the 'created' flag across retries. Clear ChangeTracker only when AsyncSeeder is set. - RelationalDatabaseCreator.EnsureCreated/EnsureCreatedAsync: wrap seeding in the execution strategy. Clear ChangeTracker only when the seeder is set. - Migrator.MigrateImplementation/MigrateAsyncImplementation: clear the change tracker before invoking the seeder. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
59ebab1 to
03a6ac8
Compare
- Refactor CosmosDatabaseCreator: move ChangeTracker.Clear() into SeedDataAsync so both EnsureCreatedAsync and CosmosTestStore.SeedAsync benefit from the fix without duplication. - Add InternalServerError (500) to CosmosExecutionStrategy retryable errors. The Cosmos emulator returns 500 when overwhelmed with concurrent operations. - Fix ReadItemPartitionKeyQueryFixtureBase entity sorter for SinglePartitionKeyEntity to sort by (Id, PartitionKey) instead of just Id. Two entities share the same Id across partitions, causing non-deterministic assertion matching. - Dispose _storeContext for deferred shared stores in DisposeAsync. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
roji
approved these changes
May 3, 2026
…posal CosmosDatabaseCreator.EnsureCreatedAsync: - Track DataInserted flag so InsertDataAsync (not idempotent) is not re-run on retry. On retry, clear ChangeTracker instead. - Remove ChangeTracker.Clear from SeedDataAsync since callers handle it. CosmosTestStore: - CleanAsync: track retry state, only clear ChangeTracker on retry. - DisposeAsync: only dispose context for non-canonical instances in _deferredStores (the canonical instance stays alive for ProcessExit). - Add CosmosBulkExecutionTest to deferred deletion list (shared by CosmosBulkWarningTest). Remove F1Test (not actually shared). RelationalDatabaseCreator.EnsureCreated/Async: - Track retry state, only clear ChangeTracker on retry. Migrator.MigrateImplementation/Async: - Use MigrationExecutionState.SeedingAttempted flag to only clear ChangeTracker on retry. CosmosExecutionStrategy: - Revert InternalServerError (500) from retryable errors — the 500s were caused by leaked CosmosClient instances from improper disposal. ReadItemPartitionKeyQueryInheritanceFixtureBase: - Fix DerivedSinglePartitionKeyEntity entity sorter to sort by (Id, PartitionKey) to avoid non-deterministic matching. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ferred deletion - Merge CosmosBulkWarningTest tests into CosmosBulkExecutionTest using the non-shared test pattern to eliminate the shared database. Delete the CosmosBulkWarningTest file. - Inline deferred deletion for the only shared database (Northwind). Add a debug assertion that throws when an unexpected database is shared across multiple fixture types. - Refactor SeedDataAsync to accept a clearChangeTracker parameter, eliminating the duplicate clear logic in EnsureCreatedAsync. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| await store.EnsureDeletedAsync(store._storeContext).ConfigureAwait(false); | ||
| } | ||
| catch | ||
| { |
The test uses assertOrder: true without an OrderBy, which produces non-deterministic results on the Linux Cosmos emulator. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
CosmosBulkConcurrencyTestandCosmosConcurrencyTestboth used the same Cosmos DB database name ("CosmosConcurrencyTest") becauseCosmosBulkConcurrencyTest.ConcurrencyFixtureinheritedStoreNamefromCosmosConcurrencyTest.CosmosFixturewithout overriding it.Since xUnit runs test classes in parallel, both classes'
ConcurrencyTestAsyncmethods callCleanAsyncper-test, which deletes and recreates containers. When these calls race against the same database, one class can delete the database while the other is trying to create containers in it, resulting in a sporadic 404 NotFound.The fix overrides
StoreNamein the bulk fixture to"CosmosBulkConcurrencyTest"so each test class gets its own isolated database.Don't delete the Cosmos test databases immediately after the TestStore is disposed as that goes against the purpose of shared test databases and also creates race conditions.
Use execution strategy in EnsureCreated to make it resilient to transitive failures.
Also, clear the change tracker before calling SeedData delegate, so that when it's retried the old state isn't included.