[fix] created_at in migrated git entities, and add missing default environments
#3678
base: main
Conversation
```python
def run_migration(sqlalchemy_url: str):
    import concurrent.futures

    async def _start():
        engine = create_async_engine(url=sqlalchemy_url)
        async with engine.begin() as connection:
            await migration_create_default_environments(connection=connection)
        await engine.dispose()

    with concurrent.futures.ThreadPoolExecutor() as executor:
        future = executor.submit(asyncio.run, _start())
        future.result()
```
🔴 Migration writes bypass the migration's transaction — _create_default_environment uses global engine, not the migration connection
The run_migration function creates a dedicated async engine and wraps the migration in async with engine.begin() as connection, intending transactional safety. However, _create_default_environment calls environments_service → GitDAO, and GitDAO always uses the global engine singleton imported from oss.src.dbs.postgres.shared.engine (engine.core_session()). This means all writes (artifact, variant, revision creation) go through an entirely separate connection pool with independently-committed transactions.
Root Cause and Impact
The migration connection passed to migration_create_default_environments is used for reads (_fetch_target_project_ids, _fetch_project_owners_batch, _project_has_any_environment at lines 162-175), but _create_default_environment (line 110) calls environments_service.fetch_environment / create_environment / etc., which all go through GitDAO methods that use engine.core_session() from oss/src/dbs/postgres/git/dao.py:97.
Consequences:
- No rollback on failure: if the migration crashes partway through (e.g., after creating environments for 50 of 200 projects), the `engine.begin()` rollback does nothing because no writes went through the migration connection. The environments already created via the global engine remain permanently committed.
- Read-write visibility mismatch: `_project_has_any_environment` (lines 96-101) queries the migration connection, but the environments created by `_create_default_environment` are committed on the global engine's separate sessions. Depending on the PostgreSQL isolation level, the migration connection may not see those committed rows, potentially leading to duplicate creation attempts.
- The migration connection is essentially a no-op wrapper: it provides a false sense of transactional safety.
Prompt for agents
The root issue is that `_create_default_environment` calls `environments_service` which uses `GitDAO`, and `GitDAO` hardcodes the global `engine` singleton for all DB operations. The migration creates its own engine and connection, but these are never used for writes.
To fix this, you should not use the high-level `EnvironmentsService` / `GitDAO` in the migration at all. Instead, perform the environment creation directly using raw SQL on the migration's connection (similar to how the backfill migration in `b1c2d3e4f5a7` works), or use the connection's session to insert the SQLAlchemy model instances directly. This ensures all reads and writes happen on the same connection/transaction.
Alternatively, remove the `run_migration` wrapper with its own engine entirely, and run the migration synchronously using `op.get_bind()` like the other migrations in this PR, inserting rows directly via SQLAlchemy Core INSERT statements on that connection.
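A minimal sketch of that `op.get_bind()` alternative, not the PR's implementation: the table and column names (`projects`, `environments`, `id`, `project_id`, `slug`, `name`) are illustrative assumptions, and the real migration would also need the matching variant/revision rows. The point is only that reads and writes share the Alembic connection and therefore one transaction.

```python
import uuid

import sqlalchemy as sa
from alembic import op

DEFAULT_ENVIRONMENTS = ("development", "staging", "production")


def upgrade() -> None:
    connection = op.get_bind()

    # Projects that have no environment yet, read on the migration connection.
    # Table/column names here are placeholders, not taken from the PR.
    projects = connection.execute(
        sa.text(
            """
            SELECT p.id::text AS project_id
            FROM projects AS p
            WHERE NOT EXISTS (
                SELECT 1 FROM environments AS e WHERE e.project_id = p.id
            )
            """
        )
    ).fetchall()

    # Writes go through the same connection, so they roll back together with
    # the rest of the migration on failure.
    for row in projects:
        for slug in DEFAULT_ENVIRONMENTS:
            connection.execute(
                sa.text(
                    """
                    INSERT INTO environments (id, project_id, slug, name)
                    VALUES (CAST(:id AS uuid), CAST(:project_id AS uuid), :slug, :name)
                    """
                ),
                {
                    "id": str(uuid.uuid4()),
                    "project_id": row.project_id,
                    "slug": slug,
                    "name": slug,
                },
            )
```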
```python
def run_migration(sqlalchemy_url: str):
    import concurrent.futures

    async def _start():
        engine = create_async_engine(url=sqlalchemy_url)
        async with engine.begin() as connection:
            await migration_create_default_environments(connection=connection)
        await engine.dispose()

    with concurrent.futures.ThreadPoolExecutor() as executor:
        future = executor.submit(asyncio.run, _start())
        future.result()
```
🔴 Migration writes bypass the migration's transaction — EE version has the same global engine issue
The EE version of default_environments.py has the same transactional isolation bug as the OSS version. _create_default_environment calls environments_service which internally uses GitDAO with the global engine singleton, completely bypassing the migration's dedicated engine and connection.
Root Cause and Impact
Identical to the OSS version: run_migration at line 233 creates a local engine and wraps the migration in engine.begin(), but _create_default_environment at line 113 calls environments_service.fetch_environment / create_environment / create_environment_variant / commit_environment_revision, all of which route through GitDAO methods using engine.core_session() from oss/src/dbs/postgres/git/dao.py:97.
This means:
- Writes are committed independently on the global engine and cannot be rolled back if the migration fails partway.
- The migration connection used for reads (`_fetch_target_project_ids`, `_project_has_any_environment`) is on a different transaction than the writes, causing potential visibility issues.
Prompt for agents
Same fix as the OSS version: do not use the high-level `EnvironmentsService` / `GitDAO` in the migration. Instead, perform environment creation using raw SQL or SQLAlchemy Core INSERT statements directly on the migration's connection, so all reads and writes happen within the same transaction. Alternatively, remove the separate engine/connection wrapper and use `op.get_bind()` like the other migrations in this PR.
Pull request overview
Adds follow-up data migrations to correct created_at (and related date) fields for migrated workflow/environment “git entities”, and introduces a data migration to create default environments for projects missing them.
Changes:
- Backfill `created_at` for migrated workflow artifacts/variants/revisions (applications + evaluators).
- Backfill `created_at`/`date` for migrated environment artifacts/variants/revisions using legacy environment history.
- Add a migration + shared data-migration runner to create default environments (`development`, `staging`, `production`) for projects without any environments.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| api/oss/databases/postgres/migrations/core/versions/e1f2a3b4c5d6_backfill_created_at_for_migrated_evaluators.py | SQL backfill of workflow_artifacts.created_at for migrated evaluators. |
| api/oss/databases/postgres/migrations/core/versions/f1e2d3c4b5a6_backfill_created_at_for_migrated_applications.py | SQL backfill of workflow_*.created_at (and revisions date) for migrated applications. |
| api/oss/databases/postgres/migrations/core/versions/b1c2d3e4f5a7_backfill_created_at_for_migrated_environments.py | Batched backfill of environment_*.created_at plus revision date alignment. |
| api/oss/databases/postgres/migrations/core/data_migrations/default_environments.py | Implements async data migration to create default environments for projects missing any. |
| api/oss/databases/postgres/migrations/core/versions/c2d3e4f5a6b7_create_default_environments_for_projects_without_environments.py | Alembic entrypoint that runs the OSS default-environments data migration. |
| api/ee/databases/postgres/migrations/core/versions/e1f2a3b4c5d6_backfill_created_at_for_migrated_evaluators.py | EE counterpart of evaluator created_at backfill. |
| api/ee/databases/postgres/migrations/core/versions/f1e2d3c4b5a6_backfill_created_at_for_migrated_applications.py | EE counterpart of application workflow created_at/date backfill. |
| api/ee/databases/postgres/migrations/core/versions/b1c2d3e4f5a7_backfill_created_at_for_migrated_environments.py | EE counterpart of environment created_at/date backfill. |
| api/ee/databases/postgres/migrations/core/data_migrations/default_environments.py | EE implementation uses workspace membership to determine project owners. |
| api/ee/databases/postgres/migrations/core/versions/c2d3e4f5a6b7_create_default_environments_for_projects_without_environments.py | Alembic entrypoint that runs the EE default-environments data migration. |
```python
        return False

    environment = await environments_service.create_environment(
        project_id=project_id,
        user_id=owner_id,
        environment_create=EnvironmentCreate(
            slug=environment_slug,
            name=environment_slug,
        ),
    )

    if environment is None:
        return False
```
Copilot AI · Feb 9, 2026
If an environment artifact already exists, the function returns early and does not attempt to ensure the default variant and initial revision exist. Because create_environment, create_environment_variant, and commit_environment_revision each commit independently, a partial failure can leave a project with an environment but missing its default variant/revision, and rerunning the migration won’t repair it. Consider making this idempotent by checking/creating the variant and initial revision when the environment already exists (or by using a single transaction so partial states can’t be committed).
Suggested change:
```diff
-        return False
-    environment = await environments_service.create_environment(
-        project_id=project_id,
-        user_id=owner_id,
-        environment_create=EnvironmentCreate(
-            slug=environment_slug,
-            name=environment_slug,
-        ),
-    )
-    if environment is None:
-        return False
+        environment = existing
+    else:
+        environment = await environments_service.create_environment(
+            project_id=project_id,
+            user_id=owner_id,
+            environment_create=EnvironmentCreate(
+                slug=environment_slug,
+                name=environment_slug,
+            ),
+        )
+        if environment is None:
+            return False
```
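The suggestion above only reuses the existing environment; to make the rerun actually repair partial state, the early-return check could also confirm that the variant and initial revision exist. A minimal sketch, assuming the migration's async connection and assuming table/column names (`environment_variants`, `environment_revisions`, `artifact_id`) that are illustrative only:

```python
import sqlalchemy as sa


async def _environment_is_complete(connection, *, artifact_id: str) -> bool:
    # Assumed table/column names; the real schema may differ.
    has_variant = await connection.scalar(
        sa.text(
            "SELECT EXISTS (SELECT 1 FROM environment_variants "
            "WHERE artifact_id = CAST(:artifact_id AS uuid))"
        ),
        {"artifact_id": artifact_id},
    )
    has_revision = await connection.scalar(
        sa.text(
            "SELECT EXISTS (SELECT 1 FROM environment_revisions "
            "WHERE artifact_id = CAST(:artifact_id AS uuid))"
        ),
        {"artifact_id": artifact_id},
    )
    return bool(has_variant) and bool(has_revision)
```

`_create_default_environment` could then return early only when this check passes, and otherwise create the missing variant and/or initial revision.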
```sql
SELECT project_id::text
FROM (
    SELECT DISTINCT project_id
    FROM environments
    WHERE project_id IS NOT NULL
) AS project_ids
ORDER BY project_id
OFFSET :offset
LIMIT :limit
```
Copilot AI · Feb 9, 2026
Batching project IDs with OFFSET will get progressively slower as offset grows, which can make this migration very slow on large datasets. Prefer keyset pagination (e.g., keep the last seen project_id and use WHERE project_id > :last_project_id ORDER BY project_id LIMIT :limit) to keep per-batch query cost stable.
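A minimal sketch of that keyset approach, assuming the same `environments` table and a synchronous Alembic connection; the helper name `_fetch_project_ids_keyset` and the batch size are illustrative, not from the PR:

```python
import sqlalchemy as sa

BATCH_SIZE = 200  # illustrative value


def _fetch_project_ids_keyset(connection, *, last_project_id: str, limit: int):
    # Compare and order on the text cast so pagination and ordering agree;
    # per-batch cost stays flat because each scan starts after the last id seen.
    rows = connection.execute(
        sa.text(
            """
            SELECT project_id::text AS project_id
            FROM (
                SELECT DISTINCT project_id
                FROM environments
                WHERE project_id IS NOT NULL
            ) AS project_ids
            WHERE project_id::text > :last_project_id
            ORDER BY project_id::text
            LIMIT :limit
            """
        ),
        {"last_project_id": last_project_id, "limit": limit},
    )
    return [row.project_id for row in rows]


# The caller tracks the last id instead of an offset:
#   last_id = ""
#   while True:
#       batch = _fetch_project_ids_keyset(connection, last_project_id=last_id, limit=BATCH_SIZE)
#       if not batch:
#           break
#       ...  # process the batch
#       last_id = batch[-1]
```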
```sql
SELECT project_id::text
FROM (
    SELECT DISTINCT project_id
    FROM environments
    WHERE project_id IS NOT NULL
) AS project_ids
ORDER BY project_id
OFFSET :offset
```
Copilot AI · Feb 9, 2026
Batching project IDs with OFFSET will get progressively slower as offset grows, which can make this migration very slow on large datasets. Prefer keyset pagination (e.g., keep the last seen project_id and use WHERE project_id > :last_project_id ORDER BY project_id LIMIT :limit) to keep per-batch query cost stable.
Suggested change:
```diff
-SELECT project_id::text
-FROM (
-    SELECT DISTINCT project_id
-    FROM environments
-    WHERE project_id IS NOT NULL
-) AS project_ids
-ORDER BY project_id
-OFFSET :offset
+WITH distinct_projects AS (
+    SELECT DISTINCT project_id
+    FROM environments
+    WHERE project_id IS NOT NULL
+),
+ordered_projects AS (
+    SELECT
+        project_id::text AS project_id,
+        ROW_NUMBER() OVER (ORDER BY project_id) AS rn
+    FROM distinct_projects
+)
+SELECT project_id
+FROM ordered_projects
+WHERE rn > :offset
+ORDER BY rn
```
```python
while True:
    project_ids = _fetch_project_ids_batch(
        connection,
        offset=offset,
        limit=BATCH_SIZE,
    )

    if not project_ids:
        break
```
Copilot AI · Feb 9, 2026
This migration potentially executes many batches/updates but, under Alembic’s transaction_per_migration=True, all batches will run in a single transaction. On large datasets this can cause long-running locks and bloat. Consider committing per batch using an Alembic autocommit block (or otherwise splitting the work) to reduce transaction/lock duration.
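One way to do that, sketched under the assumption that the batch loop keeps its current shape; `_fetch_project_ids_batch` is the PR's own helper, while `_backfill_batch` is a hypothetical stand-in for the per-batch UPDATE logic. Alembic's `autocommit_block()` runs the enclosed statements outside the migration's single transaction:

```python
from alembic import op

BATCH_SIZE = 200  # illustrative value


def upgrade() -> None:
    connection = op.get_bind()
    offset = 0
    # autocommit_block() temporarily leaves the migration's outer transaction,
    # so each batch's statements commit on their own instead of all batches
    # holding one long transaction (and its locks) until the very end.
    with op.get_context().autocommit_block():
        while True:
            project_ids = _fetch_project_ids_batch(
                connection,
                offset=offset,
                limit=BATCH_SIZE,
            )
            if not project_ids:
                break
            _backfill_batch(connection, project_ids)  # hypothetical helper issuing the UPDATEs
            offset += BATCH_SIZE
```

The trade-off is that a failure mid-way leaves earlier batches committed, so the per-batch work must stay idempotent.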
```python
while True:
    project_ids = _fetch_project_ids_batch(
        connection,
        offset=offset,
        limit=BATCH_SIZE,
    )

    if not project_ids:
        break
```
Copilot AI · Feb 9, 2026
This migration potentially executes many batches/updates but, under Alembic’s transaction_per_migration=True, all batches will run in a single transaction. On large datasets this can cause long-running locks and bloat. Consider committing per batch using an Alembic autocommit block (or otherwise splitting the work) to reduce transaction/lock duration.
```python
environments_dao = GitDAO(
    ArtifactDBE=EnvironmentArtifactDBE,
    VariantDBE=EnvironmentVariantDBE,
    RevisionDBE=EnvironmentRevisionDBE,
)
environments_service = EnvironmentsService(
    environments_dao=environments_dao,
)
```
Copilot AI · Feb 9, 2026
This migration reads via the passed AsyncConnection, but creates/queries environments via EnvironmentsService/GitDAO, which uses the global engine.core_session() (separate connection/transaction) rather than this Alembic connection. This split can lead to confusing behavior (different isolation/transaction boundaries, potentially different DB URL in some environments) and makes the migration harder to reason about. Consider refactoring so both reads and writes use the same connection/session (e.g., pass a session/connection into the DAO/service, or perform the inserts/updates through the Alembic connection).
```python
        return False

    environment = await environments_service.create_environment(
        project_id=project_id,
        user_id=owner_id,
        environment_create=EnvironmentCreate(
            slug=environment_slug,
            name=environment_slug,
        ),
    )

    if environment is None:
        return False
```
Copilot AI · Feb 9, 2026
If an environment artifact already exists, the function returns early and does not attempt to ensure the default variant and initial revision exist. Because create_environment, create_environment_variant, and commit_environment_revision each commit independently, a partial failure can leave a project with an environment but missing its default variant/revision, and rerunning the migration won’t repair it. Consider making this idempotent by checking/creating the variant and initial revision when the environment already exists (or by using a single transaction so partial states can’t be committed).
Suggested change:
```diff
-        return False
-    environment = await environments_service.create_environment(
-        project_id=project_id,
-        user_id=owner_id,
-        environment_create=EnvironmentCreate(
-            slug=environment_slug,
-            name=environment_slug,
-        ),
-    )
-    if environment is None:
-        return False
+        environment = existing
+    else:
+        environment = await environments_service.create_environment(
+            project_id=project_id,
+            user_id=owner_id,
+            environment_create=EnvironmentCreate(
+                slug=environment_slug,
+                name=environment_slug,
+            ),
+        )
+        if environment is None:
+            return False
```
```python
environments_dao = GitDAO(
    ArtifactDBE=EnvironmentArtifactDBE,
    VariantDBE=EnvironmentVariantDBE,
    RevisionDBE=EnvironmentRevisionDBE,
)
environments_service = EnvironmentsService(
    environments_dao=environments_dao,
)
```
Copilot AI · Feb 9, 2026
This migration reads via the passed AsyncConnection, but creates/queries environments via EnvironmentsService/GitDAO, which uses the global engine.core_session() (separate connection/transaction) rather than this Alembic connection. This split can lead to confusing behavior (different isolation/transaction boundaries, potentially different DB URL in some environments) and makes the migration harder to reason about. Consider refactoring so both reads and writes use the same connection/session (e.g., pass a session/connection into the DAO/service, or perform the inserts/updates through the Alembic connection).