
Conversation

@junaway (Contributor) commented Feb 9, 2026

Copilot AI review requested due to automatic review settings February 9, 2026 15:10

vercel bot commented Feb 9, 2026

The latest updates on your projects.

Project: agenta-documentation | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): Feb 9, 2026 3:11pm


dosubot bot added the size:XXL (This PR changes 1000+ lines, ignoring generated files) label Feb 9, 2026
dosubot bot added the Backend label Feb 9, 2026

devin-ai-integration bot (Contributor) left a comment


Devin Review found 2 potential issues.

View 8 additional findings in Devin Review.


Comment on lines +230 to +241
def run_migration(sqlalchemy_url: str):
    import concurrent.futures

    async def _start():
        engine = create_async_engine(url=sqlalchemy_url)
        async with engine.begin() as connection:
            await migration_create_default_environments(connection=connection)
        await engine.dispose()

    with concurrent.futures.ThreadPoolExecutor() as executor:
        future = executor.submit(asyncio.run, _start())
        future.result()

🔴 Migration writes bypass the migration's transaction — _create_default_environment uses global engine, not the migration connection

The run_migration function creates a dedicated async engine and wraps the migration in async with engine.begin() as connection, intending transactional safety. However, _create_default_environment calls environments_service, which delegates to GitDAO, and GitDAO always uses the global engine singleton imported from oss.src.dbs.postgres.shared.engine (engine.core_session()). This means all writes (artifact, variant, revision creation) go through an entirely separate connection pool with independently-committed transactions.

Root Cause and Impact

The migration connection passed to migration_create_default_environments is used for reads (_fetch_target_project_ids, _fetch_project_owners_batch, _project_has_any_environment at lines 162-175), but _create_default_environment (line 110) calls environments_service.fetch_environment / create_environment / etc., which all go through GitDAO methods that use engine.core_session() from oss/src/dbs/postgres/git/dao.py:97.

Consequences:

  1. No rollback on failure: If the migration crashes partway through (e.g., after creating environments for 50 of 200 projects), the engine.begin() rollback does nothing because no writes went through the migration connection. The already-created environments via the global engine are permanently committed.
  2. Read-write visibility mismatch: _project_has_any_environment (lines 96-101) queries the migration connection, but the environments created by _create_default_environment are committed on the global engine's separate sessions. Depending on PostgreSQL isolation level, the migration connection may not see those committed rows, potentially leading to duplicate creation attempts.
  3. The migration connection is essentially a no-op wrapper — it provides a false sense of transactional safety.
Prompt for agents
The root issue is that `_create_default_environment` calls `environments_service` which uses `GitDAO`, and `GitDAO` hardcodes the global `engine` singleton for all DB operations. The migration creates its own engine and connection, but these are never used for writes.

To fix this, you should not use the high-level `EnvironmentsService` / `GitDAO` in the migration at all. Instead, perform the environment creation directly using raw SQL on the migration's connection (similar to how the backfill migration in `b1c2d3e4f5a7` works), or use the connection's session to insert the SQLAlchemy model instances directly. This ensures all reads and writes happen on the same connection/transaction.

Alternatively, remove the `run_migration` wrapper with its own engine entirely, and run the migration synchronously using `op.get_bind()` like the other migrations in this PR, inserting rows directly via SQLAlchemy Core INSERT statements on that connection.
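
For illustration, a minimal sketch of that last suggestion. The environment_artifacts table handle, its columns, and the projects/owners query below are hypothetical placeholders, not confirmed against the actual schema; the point is only that op.get_bind() returns the very connection Alembic wraps in the migration transaction, so reads and writes share one transaction:

import uuid

import sqlalchemy as sa
from alembic import op

# Hypothetical, minimal table handle; real table and column names may differ.
environment_artifacts = sa.table(
    "environment_artifacts",
    sa.column("id"),
    sa.column("project_id"),
    sa.column("slug"),
    sa.column("name"),
    sa.column("created_by_id"),
)


def upgrade() -> None:
    # The same connection Alembic wraps in the migration transaction:
    # a failure anywhere rolls back every row inserted below.
    connection = op.get_bind()

    # Placeholder read; however target projects and owners are actually
    # resolved, it must happen on this same connection.
    rows = connection.execute(
        sa.text("SELECT id, owner_id FROM projects")  # hypothetical query
    )

    for project_id, owner_id in rows:
        for slug in ("development", "staging", "production"):
            connection.execute(
                sa.insert(environment_artifacts).values(
                    id=uuid.uuid4(),
                    project_id=project_id,
                    slug=slug,
                    name=slug,
                    created_by_id=owner_id,
                )
            )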


Comment on lines +233 to +244
def run_migration(sqlalchemy_url: str):
    import concurrent.futures

    async def _start():
        engine = create_async_engine(url=sqlalchemy_url)
        async with engine.begin() as connection:
            await migration_create_default_environments(connection=connection)
        await engine.dispose()

    with concurrent.futures.ThreadPoolExecutor() as executor:
        future = executor.submit(asyncio.run, _start())
        future.result()

🔴 Migration writes bypass the migration's transaction — EE version has the same global engine issue

The EE version of default_environments.py has the same transactional isolation bug as the OSS version. _create_default_environment calls environments_service which internally uses GitDAO with the global engine singleton, completely bypassing the migration's dedicated engine and connection.

Root Cause and Impact

Identical to the OSS version: run_migration at line 233 creates a local engine and wraps the migration in engine.begin(), but _create_default_environment at line 113 calls environments_service.fetch_environment / create_environment / create_environment_variant / commit_environment_revision, all of which route through GitDAO methods using engine.core_session() from oss/src/dbs/postgres/git/dao.py:97.

This means:

  1. Writes are committed independently on the global engine and cannot be rolled back if the migration fails partway.
  2. The migration connection used for reads (_fetch_target_project_ids, _project_has_any_environment) is on a different transaction than the writes, causing potential visibility issues.
Prompt for agents
Same fix as the OSS version: do not use the high-level `EnvironmentsService` / `GitDAO` in the migration. Instead, perform environment creation using raw SQL or SQLAlchemy Core INSERT statements directly on the migration's connection, so all reads and writes happen within the same transaction. Alternatively, remove the separate engine/connection wrapper and use `op.get_bind()` like the other migrations in this PR.


Copilot AI left a comment


Pull request overview

Adds follow-up data migrations to correct created_at (and related date) fields for migrated workflow/environment “git entities”, and introduces a data migration to create default environments for projects missing them.

Changes:

  • Backfill created_at for migrated workflow artifacts/variants/revisions (applications + evaluators).
  • Backfill created_at/date for migrated environment artifacts/variants/revisions using legacy environment history.
  • Add a migration + shared data-migration runner to create default environments (development, staging, production) for projects without any environments.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 8 comments.

File | Description
api/oss/databases/postgres/migrations/core/versions/e1f2a3b4c5d6_backfill_created_at_for_migrated_evaluators.py | SQL backfill of workflow_artifacts.created_at for migrated evaluators.
api/oss/databases/postgres/migrations/core/versions/f1e2d3c4b5a6_backfill_created_at_for_migrated_applications.py | SQL backfill of workflow_*.created_at (and revisions date) for migrated applications.
api/oss/databases/postgres/migrations/core/versions/b1c2d3e4f5a7_backfill_created_at_for_migrated_environments.py | Batched backfill of environment_*.created_at plus revision date alignment.
api/oss/databases/postgres/migrations/core/data_migrations/default_environments.py | Implements async data migration to create default environments for projects missing any.
api/oss/databases/postgres/migrations/core/versions/c2d3e4f5a6b7_create_default_environments_for_projects_without_environments.py | Alembic entrypoint that runs the OSS default-environments data migration.
api/ee/databases/postgres/migrations/core/versions/e1f2a3b4c5d6_backfill_created_at_for_migrated_evaluators.py | EE counterpart of evaluator created_at backfill.
api/ee/databases/postgres/migrations/core/versions/f1e2d3c4b5a6_backfill_created_at_for_migrated_applications.py | EE counterpart of application workflow created_at/date backfill.
api/ee/databases/postgres/migrations/core/versions/b1c2d3e4f5a7_backfill_created_at_for_migrated_environments.py | EE counterpart of environment created_at/date backfill.
api/ee/databases/postgres/migrations/core/data_migrations/default_environments.py | EE implementation uses workspace membership to determine project owners.
api/ee/databases/postgres/migrations/core/versions/c2d3e4f5a6b7_create_default_environments_for_projects_without_environments.py | Alembic entrypoint that runs the EE default-environments data migration.


Comment on lines +119 to +132
        return False

    environment = await environments_service.create_environment(
        project_id=project_id,
        user_id=owner_id,
        environment_create=EnvironmentCreate(
            slug=environment_slug,
            name=environment_slug,
        ),
    )

    if environment is None:
        return False

Copilot AI Feb 9, 2026

If an environment artifact already exists, the function returns early and does not attempt to ensure the default variant and initial revision exist. Because create_environment, create_environment_variant, and commit_environment_revision each commit independently, a partial failure can leave a project with an environment but missing its default variant/revision, and rerunning the migration won’t repair it. Consider making this idempotent by checking/creating the variant and initial revision when the environment already exists (or by using a single transaction so partial states can’t be committed).

Suggested change
-        return False
-
-    environment = await environments_service.create_environment(
-        project_id=project_id,
-        user_id=owner_id,
-        environment_create=EnvironmentCreate(
-            slug=environment_slug,
-            name=environment_slug,
-        ),
-    )
-
-    if environment is None:
-        return False
+        environment = existing
+    else:
+        environment = await environments_service.create_environment(
+            project_id=project_id,
+            user_id=owner_id,
+            environment_create=EnvironmentCreate(
+                slug=environment_slug,
+                name=environment_slug,
+            ),
+        )
+
+        if environment is None:
+            return False

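For illustration, a hedged sketch of the repair path this comment asks for, reusing only the service calls the PR already names (fetch_environment, create_environment); the fetch_environment keyword arguments and the two _ensure_* helpers are assumptions, not the actual service API:

async def _create_default_environment(project_id, owner_id, environment_slug) -> bool:
    # Signature and kwargs assumed for illustration.
    environment = await environments_service.fetch_environment(
        project_id=project_id,
        environment_slug=environment_slug,
    )

    if environment is None:
        environment = await environments_service.create_environment(
            project_id=project_id,
            user_id=owner_id,
            environment_create=EnvironmentCreate(
                slug=environment_slug,
                name=environment_slug,
            ),
        )
        if environment is None:
            return False

    # Repair step: runs even when the environment already existed, so a rerun
    # fills in a missing default variant or initial revision instead of
    # returning early.
    variant = await _ensure_default_variant(environment)  # hypothetical helper
    if variant is None:
        return False

    revision = await _ensure_initial_revision(variant)  # hypothetical helper
    return revision is not None
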
Comment on lines +190 to +199
SELECT project_id::text
FROM (
    SELECT DISTINCT project_id
    FROM environments
    WHERE project_id IS NOT NULL
) AS project_ids
ORDER BY project_id
OFFSET :offset
LIMIT :limit
"""

Copilot AI Feb 9, 2026

Batching project IDs with OFFSET will get progressively slower as offset grows, which can make this migration very slow on large datasets. Prefer keyset pagination (e.g., keep the last seen project_id and use WHERE project_id > :last_project_id ORDER BY project_id LIMIT :limit) to keep per-batch query cost stable.

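For illustration, a sketch of the keyset variant, assuming the migration's existing connection and BATCH_SIZE; the :last_project_id parameter and the loop shape are illustrative. Ordering and comparing on the same project_id::text expression keeps the cursor consistent across batches:

import sqlalchemy as sa

# Hypothetical keyset query; column and table names taken from the snippet above.
_FETCH_PROJECT_IDS_KEYSET = """
    SELECT DISTINCT project_id::text AS project_id
    FROM environments
    WHERE project_id IS NOT NULL
      AND (:last_project_id IS NULL OR project_id::text > :last_project_id)
    ORDER BY project_id::text
    LIMIT :limit
"""

last_project_id = None
while True:
    project_ids = connection.execute(
        sa.text(_FETCH_PROJECT_IDS_KEYSET),
        {"last_project_id": last_project_id, "limit": BATCH_SIZE},
    ).scalars().all()

    if not project_ids:
        break

    # Each batch starts strictly after the last id seen, so per-batch query
    # cost stays flat no matter how far the migration has progressed.
    last_project_id = project_ids[-1]
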
Comment on lines +190 to +197
SELECT project_id::text
FROM (
    SELECT DISTINCT project_id
    FROM environments
    WHERE project_id IS NOT NULL
) AS project_ids
ORDER BY project_id
OFFSET :offset

Copilot AI Feb 9, 2026

Batching project IDs with OFFSET will get progressively slower as offset grows, which can make this migration very slow on large datasets. Prefer keyset pagination (e.g., keep the last seen project_id and use WHERE project_id > :last_project_id ORDER BY project_id LIMIT :limit) to keep per-batch query cost stable.

Suggested change
-SELECT project_id::text
-FROM (
-    SELECT DISTINCT project_id
-    FROM environments
-    WHERE project_id IS NOT NULL
-) AS project_ids
-ORDER BY project_id
-OFFSET :offset
+WITH distinct_projects AS (
+    SELECT DISTINCT project_id
+    FROM environments
+    WHERE project_id IS NOT NULL
+),
+ordered_projects AS (
+    SELECT
+        project_id::text AS project_id,
+        ROW_NUMBER() OVER (ORDER BY project_id) AS rn
+    FROM distinct_projects
+)
+SELECT project_id
+FROM ordered_projects
+WHERE rn > :offset
+ORDER BY rn

Comment on lines +380 to +388
while True:
    project_ids = _fetch_project_ids_batch(
        connection,
        offset=offset,
        limit=BATCH_SIZE,
    )

    if not project_ids:
        break

Copilot AI Feb 9, 2026

This migration potentially executes many batches/updates but, under Alembic’s transaction_per_migration=True, all batches will run in a single transaction. On large datasets this can cause long-running locks and bloat. Consider committing per batch using an Alembic autocommit block (or otherwise splitting the work) to reduce transaction/lock duration.

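For illustration, a sketch of the per-batch commit idea using Alembic's autocommit_block. It assumes the surrounding helpers from this migration (_fetch_project_ids_batch, BATCH_SIZE) plus a hypothetical _process_batch worker, and it only makes sense because each batch is independently idempotent; rows committed before a crash are simply skipped on rerun:

from alembic import op


def upgrade() -> None:
    # autocommit_block() commits the transaction Alembic opened, runs the
    # enclosed statements in autocommit mode, then begins a fresh transaction
    # afterwards, so locks are held per statement rather than for the whole
    # backfill.
    with op.get_context().autocommit_block():
        connection = op.get_bind()
        offset = 0
        while True:
            project_ids = _fetch_project_ids_batch(
                connection,
                offset=offset,
                limit=BATCH_SIZE,
            )
            if not project_ids:
                break
            _process_batch(connection, project_ids)  # hypothetical per-batch worker
            offset += BATCH_SIZE
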
Comment on lines +380 to +388
while True:
    project_ids = _fetch_project_ids_batch(
        connection,
        offset=offset,
        limit=BATCH_SIZE,
    )

    if not project_ids:
        break

Copilot AI Feb 9, 2026

This migration potentially executes many batches/updates but, under Alembic’s transaction_per_migration=True, all batches will run in a single transaction. On large datasets this can cause long-running locks and bloat. Consider committing per batch using an Alembic autocommit block (or otherwise splitting the work) to reduce transaction/lock duration.

Comment on lines +31 to +38
environments_dao = GitDAO(
    ArtifactDBE=EnvironmentArtifactDBE,
    VariantDBE=EnvironmentVariantDBE,
    RevisionDBE=EnvironmentRevisionDBE,
)
environments_service = EnvironmentsService(
    environments_dao=environments_dao,
)

Copilot AI Feb 9, 2026

This migration reads via the passed AsyncConnection, but creates/queries environments via EnvironmentsService/GitDAO, which uses the global engine.core_session() (separate connection/transaction) rather than this Alembic connection. This split can lead to confusing behavior (different isolation/transaction boundaries, potentially different DB URL in some environments) and makes the migration harder to reason about. Consider refactoring so both reads and writes use the same connection/session (e.g., pass a session/connection into the DAO/service, or perform the inserts/updates through the Alembic connection).

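For illustration, a sketch of the "pass a session into the DAO/service" direction: SQLAlchemy lets an AsyncSession bind directly to the migration's AsyncConnection, so ORM-style writes join the outer engine.begin() transaction. Whether GitDAO or EnvironmentsService can accept an external session is an open question, and the projects query and EnvironmentArtifactDBE constructor kwargs below are assumptions:

import sqlalchemy as sa
from sqlalchemy.ext.asyncio import AsyncConnection, AsyncSession


async def migration_create_default_environments(connection: AsyncConnection) -> None:
    # Placeholder read; the real project/owner resolution would run here,
    # on this same connection.
    result = await connection.execute(sa.text("SELECT id FROM projects"))  # hypothetical query
    project_ids = [row[0] for row in result]

    # The session reuses the migration connection, so flush() emits INSERTs
    # inside the outer engine.begin() transaction; commit or rollback stays
    # with that outer block.
    async with AsyncSession(bind=connection) as session:
        for project_id in project_ids:
            for slug in ("development", "staging", "production"):
                session.add(
                    EnvironmentArtifactDBE(  # model from this PR; kwargs assumed
                        project_id=project_id,
                        slug=slug,
                        name=slug,
                    )
                )
        await session.flush()
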
Comment on lines +116 to +129
        return False

    environment = await environments_service.create_environment(
        project_id=project_id,
        user_id=owner_id,
        environment_create=EnvironmentCreate(
            slug=environment_slug,
            name=environment_slug,
        ),
    )

    if environment is None:
        return False

Copilot AI Feb 9, 2026

If an environment artifact already exists, the function returns early and does not attempt to ensure the default variant and initial revision exist. Because create_environment, create_environment_variant, and commit_environment_revision each commit independently, a partial failure can leave a project with an environment but missing its default variant/revision, and rerunning the migration won’t repair it. Consider making this idempotent by checking/creating the variant and initial revision when the environment already exists (or by using a single transaction so partial states can’t be committed).

Suggested change
-        return False
-
-    environment = await environments_service.create_environment(
-        project_id=project_id,
-        user_id=owner_id,
-        environment_create=EnvironmentCreate(
-            slug=environment_slug,
-            name=environment_slug,
-        ),
-    )
-
-    if environment is None:
-        return False
+        environment = existing
+    else:
+        environment = await environments_service.create_environment(
+            project_id=project_id,
+            user_id=owner_id,
+            environment_create=EnvironmentCreate(
+                slug=environment_slug,
+                name=environment_slug,
+            ),
+        )
+
+        if environment is None:
+            return False

Comment on lines +31 to +38
environments_dao = GitDAO(
    ArtifactDBE=EnvironmentArtifactDBE,
    VariantDBE=EnvironmentVariantDBE,
    RevisionDBE=EnvironmentRevisionDBE,
)
environments_service = EnvironmentsService(
    environments_dao=environments_dao,
)

Copilot AI Feb 9, 2026

This migration reads via the passed AsyncConnection, but creates/queries environments via EnvironmentsService/GitDAO, which uses the global engine.core_session() (separate connection/transaction) rather than this Alembic connection. This split can lead to confusing behavior (different isolation/transaction boundaries, potentially different DB URL in some environments) and makes the migration harder to reason about. Consider refactoring so both reads and writes use the same connection/session (e.g., pass a session/connection into the DAO/service, or perform the inserts/updates through the Alembic connection).


Labels

Backend, size:XXL (This PR changes 1000+ lines, ignoring generated files)


2 participants