migration verifier support in mongosync_insight by DakshD7 · Pull Request #176 · mongodb/support-tools

DakshD7 · 2026-02-27T09:20:29Z

Adds migration-verifier support to mongosync_insights. The changes allow data verification while mongosync is running and the embedded verifier is not in use. They also help with data verification when source and sync connectors between clusters are used as part of a rollback strategy.

The first page shown.

While the initial sync runs.

During the recheck, the namespace colour displayed at the bottom reflects the final check.

Once it is green, the data matches between source and destination.

Copilot

Pull request overview

This PR adds migration verifier monitoring support to the mongosync_insights web dashboard. It introduces a new third form on the home page, a dedicated verifier_metrics.html template, and a new migration_verifier.py module that gathers and visualizes data from the migration-verifier tool's metadata database using Plotly charts.

Changes:

- New migration_verifier.py module with MongoDB query helpers and Plotly visualization for verifier task data
- New verifier_metrics.html template with auto-refreshing dashboard for verifier metrics
- Updates to home.html and mongosync_insights.py to add the verifier form and two new Flask routes (/renderVerifier, /get_verifier_data)

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| `migration_verifier.py` | New module implementing data gathering (generation history, namespace stats, task summaries) and Plotly figure construction for migration verifier metrics |
| `verifier_metrics.html` | New Jinja2 template extending `base.html` with a polling dashboard that fetches and renders verifier Plotly charts |
| `home.html` | Adds third form for verifier connection details and a new `validateVerifierForm()` JS function; renames `validateForm` to `validateMetadataForm` |
| `mongosync_insights.py` | Adds `/renderVerifier` POST route for page rendering and `/get_verifier_data` POST route for AJAX metric refresh; adds verifier connection string form logic to the home page handler |

Copilot · 2026-03-02T22:15:30Z

migration/mongosync_insights/mongosync_insights.py

+    # Get database name from session
+    session_id = request.cookies.get(SESSION_COOKIE_NAME)
+    session_data = session_store.get_session(session_id)
+    db_name = session_data.get('verifier_db_name', 'migration_verification_metadata')


The session lookup is duplicated. When CONNECTION_STRING is set (lines 325–326), the else branch that fetches session_id and session_data (lines 328–330) is skipped, but lines 337–338 unconditionally perform the same lookup again to read db_name. When CONNECTION_STRING is not set, the session store is therefore hit twice per request: once in the else branch and once on lines 337–338. When CONNECTION_STRING is set, only the second lookup runs and it still retrieves db_name correctly, so this is a performance and clarity issue rather than a functional bug. It does make the code harder to reason about and adds unnecessary work to the session store. Deduplicate by moving the session lookup before the if/else block, or by reusing the session_data variable fetched in the else branch.
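
One possible shape for the deduplicated lookup is sketched below as a pure function. `resolve_verifier_settings` and `DEFAULT_VERIFIER_DB` are hypothetical names that do not appear in the PR. The sketch assumes the route calls `session_store.get_session` exactly once and passes the result in, and that the result may be empty or missing.

```python
DEFAULT_VERIFIER_DB = "migration_verification_metadata"


def resolve_verifier_settings(env_connection_string, session_data):
    """Derive (connection_string, db_name) from a single session lookup.

    env_connection_string: value of CONNECTION_STRING (may be None or empty).
    session_data: dict already fetched from the session store (may be None).
    """
    # Assumption: a missing session is treated as an empty dict.
    session_data = session_data or {}

    # The environment variable wins; otherwise fall back to the session value.
    connection_string = env_connection_string or session_data.get(
        "verifier_connection_string"
    )

    # db_name is read from the same session_data, so no second lookup is needed.
    db_name = session_data.get("verifier_db_name", DEFAULT_VERIFIER_DB)
    return connection_string, db_name
```

In the route, the session would be fetched once before the if/else and handed to this helper, so both the connection string and the database name come from the same lookup.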

Copilot · 2026-03-02T22:15:31Z

migration/mongosync_insights/templates/home.html

+        <input type="submit" value="Monitor Verifier">
+        <p> Click the "Monitor Verifier" button to monitor the <a href="https://github.com/mongodb-labs/migration-verifier" target="_blank">migration-verifier</a> tool progress.</p>
+        <p><small><strong>Note:</strong> Connect to the MongoDB cluster where the migration-verifier writes its metadata (typically the destination cluster).</small></p>
+        <img src="{{ url_for('static', filename='migration_verifier.png') }}" width="300" alt="Migration Verifier Dashboard">


The migration_verifier.png image file is referenced here but does not exist in the images directory (the Flask static folder). This will result in a broken image icon displayed to users on the home page. Either add the image file or remove this <img> element.

Suggested change

<img src="{{ url_for('static', filename='migration_verifier.png') }}" width="300" alt="Migration Verifier Dashboard">

migration/mongosync_insights/templates/verifier_metrics.html

Copilot · 2026-03-02T22:15:31Z

migration/mongosync_insights/migration_verifier.py

+def get_latest_generation(db):
+    """Get the latest generation number from verification_tasks."""
+    pipeline = [
+        {"$sort": {"generation": -1}},
+        {"$limit": 1},
+        {"$project": {"generation": 1, "_id": 0}}
+    ]
+    result = list(db.verification_tasks.aggregate(pipeline))
+    if result:
+        return result[0].get("generation", 0)
+    return 0
+
+
+def get_verification_summary(db, generation):
+    """Get summary of verification tasks for a generation."""
+    pipeline = [
+        {"$match": {"generation": generation}},
+        {"$group": {
+            "_id": "$status",
+            "count": {"$sum": 1}
+        }}
+    ]
+    results = list(db.verification_tasks.aggregate(pipeline))
+    summary = {
+        "completed": 0,
+        "failed": 0,
+        "mismatch": 0,
+        "pending": 0,
+        "processing": 0
+    }
+    for r in results:
+        status = r["_id"]
+        if status in summary:
+            summary[status] = r["count"]
+        elif status == "added":
+            summary["pending"] += r["count"]
+    return summary
+
+
+def get_task_type_distribution(db, generation):
+    """Get distribution of task types for a generation."""
+    pipeline = [
+        {"$match": {"generation": generation}},
+        {"$group": {
+            "_id": "$type",
+            "count": {"$sum": 1}
+        }},
+        {"$sort": {"count": -1}}
+    ]
+    return list(db.verification_tasks.aggregate(pipeline))
+
+
+def get_failed_tasks(db, generation, limit=50):
+    """Get failed verification tasks with details including mismatch info."""
+    # First get basic task info (fast query)
+    pipeline = [
+        {"$match": {"generation": generation, "status": {"$in": ["failed", "mismatch"]}}},
+        {"$sort": {"begin_time": -1}},
+        {"$limit": limit},
+        {"$project": {
+            "_id": 1,
+            "type": 1,
+            "status": 1,
+            "query_filter": 1,
+            "_ids": 1,
+            "failed_docs": 1,
+            "begin_time": 1
+        }}
+    ]
+    tasks = list(db.verification_tasks.aggregate(pipeline))
+
+    # For first 20 tasks, try to get mismatch details (separate fast query)
+    if tasks:
+        task_ids = [t["_id"] for t in tasks[:20]]
+        try:
+            mismatches = list(db.mismatches.find(
+                {"task": {"$in": task_ids}},
+                {"task": 1, "detail": 1}
+            ).limit(20))
+
+            # Create lookup dict
+            mismatch_map = {m["task"]: m for m in mismatches}
+
+            # Attach mismatch info to tasks
+            for t in tasks:
+                if t["_id"] in mismatch_map:
+                    t["mismatch"] = mismatch_map[t["_id"]]
+        except Exception:
+            pass  # Skip mismatch lookup if it fails
+
+    return tasks
+
+
+def get_mismatches_with_details(db, generation, limit=100):
+    """Get verification tasks joined with mismatch details."""
+    pipeline = [
+        {"$match": {"generation": generation, "status": {"$in": ["failed", "mismatch"]}}},
+        {
+            "$lookup": {
+                "from": "mismatches",
+                "localField": "_id",
+                "foreignField": "task",
+                "as": "mismatch"
+            }
+        },
+        {"$unwind": {"path": "$mismatch", "preserveNullAndEmptyArrays": True}},
+        {"$limit": limit}
+    ]
+    return list(db.verification_tasks.aggregate(pipeline))
+
+
+def get_collection_mismatches(db, generation):
+    """Get collection/index metadata mismatches."""
+    return list(db.verification_tasks.find({
+        "generation": generation,
+        "status": "mismatch",
+        "type": "verifyCollection"
+    }))


The functions get_latest_generation, get_task_type_distribution, get_mismatches_with_details, and get_collection_mismatches are defined but never called anywhere in the codebase. These are dead code and should be removed to keep the module clean, or documented as reserved for future use.
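
A rough way to spot such candidates is sketched below using the standard `ast` module. This is a heuristic under stated assumptions, not the check the reviewer ran: it inspects a single source string, whereas "never called anywhere in the codebase" would require scanning every module, and it cannot see dynamic references such as `getattr`. The name `unused_functions` is hypothetical.

```python
import ast


def unused_functions(source):
    """Return names of functions defined in `source` but never referenced in it."""
    tree = ast.parse(source)

    # Every function defined anywhere in the module.
    defined = {
        node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)
    }

    # Every bare name (foo) or attribute name (module.foo) that is referenced.
    referenced = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            referenced.add(node.id)
        elif isinstance(node, ast.Attribute):
            referenced.add(node.attr)

    return sorted(defined - referenced)
```

Running it over the concatenated sources of the package, rather than one file, would reduce false positives for helpers that are imported elsewhere.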

migration/mongosync_insights/migration_verifier.py

Copilot · 2026-03-02T22:15:32Z

migration/mongosync_insights/mongosync_insights.py

+@app.route('/get_verifier_data', methods=['POST'])
+def getVerifierData():
+    """Get migration verifier metrics data for AJAX refresh."""
+    # Get connection string from env var or in-memory session store
+    if CONNECTION_STRING:
+        connection_string = CONNECTION_STRING
+    else:
+        session_id = request.cookies.get(SESSION_COOKIE_NAME)
+        session_data = session_store.get_session(session_id)
+        connection_string = session_data.get('verifier_connection_string')
+
+    if not connection_string:
+        logger.error("No connection string available for verifier metrics refresh")
+        return {"error": "No connection string available. Please refresh the page and re-enter your credentials."}, 400
+
+    # Get database name from session
+    session_id = request.cookies.get(SESSION_COOKIE_NAME)
+    session_data = session_store.get_session(session_id)
+    db_name = session_data.get('verifier_db_name', 'migration_verification_metadata')
+
+    return gatherVerifierMetrics(connection_string, db_name)


The get_mongo_client function in app_config.py uses @lru_cache(maxsize=1), which means it caches only one MongoDB client at a time. When a user uses the verifier feature with a different connection string than the main monitoring feature, alternating AJAX refresh calls between /get_metrics_data and /get_verifier_data will continuously evict and re-create the cached MongoDB client. This causes repeated reconnection overhead with each request, degrading performance and potentially exhausting connections. Consider increasing the maxsize to at least 2 to allow caching of both connections simultaneously.
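
The eviction pattern can be reproduced without MongoDB. The sketch below uses a stand-in factory that records each construction; `make_cached_factory`, `count_constructions`, and the two URIs are hypothetical and only illustrate how `functools.lru_cache` behaves when two keys alternate.

```python
from functools import lru_cache


def make_cached_factory(maxsize):
    """Return (get_client, created); `created` logs every real construction."""
    created = []

    @lru_cache(maxsize=maxsize)
    def get_client(connection_string):
        # Stand-in for building a real MongoDB client (assumption).
        created.append(connection_string)
        return object()

    return get_client, created


def count_constructions(maxsize, requests):
    """Count how many clients are built while serving `requests` in order."""
    get_client, created = make_cached_factory(maxsize)
    for uri in requests:
        get_client(uri)
    return len(created)


# Alternating AJAX refreshes from the metrics page and the verifier page.
requests = ["mongodb://metrics-host", "mongodb://verifier-host"] * 3

print(count_constructions(1, requests))  # 6: every call evicts the other client
print(count_constructions(2, requests))  # 2: both clients stay cached
```

With `maxsize=1` each request is a cache miss because the previous call evicted the other entry; with `maxsize=2` only the first request per URI constructs a client.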

migration/mongosync_insights/templates/verifier_metrics.html

Copilot · 2026-03-02T22:15:32Z

migration/mongosync_insights/mongosync_insights.py

+    response.set_cookie(
+        SESSION_COOKIE_NAME,
+        session_id,
+        httponly=True,
+        secure=SECURE_COOKIES,
+        samesite='Strict',
+        max_age=SESSION_TIMEOUT
+    )


Both renderMetrics and renderVerifier write to the same session cookie mi_session_id. The verifier session data only stores verifier_connection_string and verifier_db_name, while the metrics session stores connection_string and endpoint_url. If a user visits renderVerifier after renderMetrics, the shared cookie is overwritten with the verifier session ID. When the metrics page subsequently calls /get_metrics_data or /get_partitions_data, the session lookup finds a verifier session that lacks connection_string, causing a 400 error.

Consider using a separate cookie name for the verifier session (e.g., mi_verifier_session_id) to avoid this collision.
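
The collision and the suggested fix can be simulated with a small in-memory store. Everything below is an illustrative stand-in: `InMemorySessionStore`, its `create_session` method, and `simulate` are hypothetical, and `mi_verifier_session_id` is the name proposed above, not an existing setting.

```python
import uuid

METRICS_COOKIE = "mi_session_id"
VERIFIER_COOKIE = "mi_verifier_session_id"  # proposed name, not in the PR


class InMemorySessionStore:
    """Minimal stand-in for the app's session store."""

    def __init__(self):
        self._sessions = {}

    def create_session(self, data):
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = dict(data)
        return session_id

    def get_session(self, session_id):
        return self._sessions.get(session_id, {})


def simulate(verifier_cookie_name):
    """Visit the metrics page, then the verifier page, then refresh metrics."""
    store = InMemorySessionStore()
    cookies = {}  # the browser's cookie jar

    # renderMetrics stores its session under the metrics cookie.
    cookies[METRICS_COOKIE] = store.create_session(
        {"connection_string": "mongodb://metrics-host"}
    )
    # renderVerifier stores its session under the given cookie name.
    cookies[verifier_cookie_name] = store.create_session(
        {"verifier_connection_string": "mongodb://verifier-host"}
    )
    # The metrics refresh reads the metrics cookie and expects connection_string.
    return store.get_session(cookies[METRICS_COOKIE]).get("connection_string")


print(simulate(METRICS_COOKIE))   # None: the verifier session overwrote the cookie
print(simulate(VERIFIER_COOKIE))  # mongodb://metrics-host
```

An alternative with the same effect would be to keep one cookie and merge the verifier keys into the existing session instead of creating a new one.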

migration/mongosync_insights/mongosync_insights.py

Copilot · 2026-03-02T22:15:33Z

migration/mongosync_insights/mongosync_insights.py

+    # Migration verifier connection string form
+    if not CONNECTION_STRING:
+        verifier_connection_string_form = '''<label for="verifierConnectionString">Verifier MongoDB Connection String:</label>  
+                                    <input type="text" id="verifierConnectionString" name="verifierConnectionString" size="47" autocomplete="off"
+                                        placeholder="mongodb+srv://usr:pwd@cluster0.mongodb.net/"><br><br>'''
+    else:
+        sanitized_connection = sanitize_for_display(CONNECTION_STRING)
+        verifier_connection_string_form = f"<p><b>Connecting to Verifier DB at: </b>{sanitized_connection}</p>"


The verifier reuses the single shared CONNECTION_STRING environment variable (MI_CONNECTION_STRING) for the verifier connection. Since the PR description states the verifier supports scenarios where source and destination clusters are separate, and the docs note that the verifier typically connects to the destination cluster, users who want to monitor a verifier database on a different cluster than the main mongosync metadata database cannot do so when using environment variable configuration. Consider adding a dedicated MI_VERIFIER_CONNECTION_STRING environment variable for the verifier connection to support this use case, and update the CONFIGURATION.md documentation accordingly.
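
A minimal sketch of how such a variable could be resolved is shown below. `MI_VERIFIER_CONNECTION_STRING` is the variable proposed in this comment and does not exist yet; falling back to `MI_CONNECTION_STRING` when it is unset is an assumption made here to preserve the current behaviour, and `resolve_verifier_connection_string` is a hypothetical helper name.

```python
import os


def resolve_verifier_connection_string(environ=None):
    """Pick the verifier connection string from environment variables.

    Prefers the dedicated (proposed) variable and falls back to the shared one.
    Returns None when neither is set, so the form-based flow can take over.
    """
    if environ is None:
        environ = os.environ

    # Empty strings are treated the same as unset variables.
    return (
        environ.get("MI_VERIFIER_CONNECTION_STRING")
        or environ.get("MI_CONNECTION_STRING")
        or None
    )
```

Passing the mapping in as a parameter keeps the helper easy to test without touching the real process environment.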

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

keeping the file which they mentioned

09759a9

DakshD7 changed the title ~~keeping the file which they mentioned~~ migration verifier support in mongosync_insight Feb 27, 2026

jsmurphy requested a review from Copilot March 2, 2026 22:07

Copilot started reviewing on behalf of jsmurphy March 2, 2026 22:08

Copilot AI reviewed Mar 2, 2026

jsmurphy and others added 4 commits March 11, 2026 10:31

Update migration/mongosync_insights/mongosync_insights.py

d0bda56

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update migration/mongosync_insights/templates/verifier_metrics.html

924ab3e

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update migration/mongosync_insights/migration_verifier.py

aeb2f9d

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update migration/mongosync_insights/templates/verifier_metrics.html

ca04d3a

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

jsmurphy merged commit 65b4e0f into mongodb:master Mar 11, 2026
1 check passed

jsmurphy merged 5 commits into mongodb:master from DakshD7:feature/mongosync-insights-only

3 participants
