fix(web): preserve eval compare fields when hydrating session messages from eval history by Dsazz · Pull Request #402 · google/adk-web

Dsazz · 2026-03-01T20:24:01Z

Summary

This PR fixes a regression where failed eval cases loaded from Eval history did not show the Actual vs Expected comparison in the chat panel, even though backend responses contained the required data.

It also hardens eval error handling and adds test coverage to prevent silent regressions.

Problem

When opening a failed eval case from Eval history, the chat session loaded, but failed-message comparison details were missing:

Actual Response / Expected Response
Actual Tool Uses / Expected Tool Uses
Score / Threshold context

The API payload already included these fields.

Root Cause

EvalTabComponent annotates session events with eval metadata (e.g. evalStatus, failedMetric, actualFinalResponse, expectedFinalResponse), but in the session-hydration path the event-to-message mapping dropped those fields before rendering.

As a result, the chat panel's rendering conditions (e.g. the failed-eval compare section) were evaluated against incomplete message objects, so the corresponding UI sections were skipped.

What Changed

1) Preserve eval metadata in message hydration path

Updated chat message mapping so eval annotation fields survive event -> message conversion consistently across session load paths.
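
A minimal sketch of the idea, not the actual adk-web code: the interfaces, the toChatMessage helper, and the evalStatus type are hypothetical; only the four eval field names come from the description above. The point is that the mapper copies the eval annotations explicitly instead of building the message from the display fields alone.

```typescript
// Hypothetical shapes for illustration. Only the eval field names
// (evalStatus, failedMetric, actualFinalResponse, expectedFinalResponse)
// are taken from the PR description; everything else is assumed.
interface EvalAnnotations {
  evalStatus?: number; // assumed numeric; the real type may differ
  failedMetric?: string;
  actualFinalResponse?: string;
  expectedFinalResponse?: string;
}

interface SessionEvent extends EvalAnnotations {
  author: string;
  text: string;
}

interface ChatMessage extends EvalAnnotations {
  role: 'user' | 'bot';
  text: string;
}

// Converts a session event into a chat message and carries the eval
// annotations across, so the compare UI has the data it needs.
function toChatMessage(event: SessionEvent): ChatMessage {
  return {
    role: event.author === 'user' ? 'user' : 'bot',
    text: event.text,
    evalStatus: event.evalStatus,
    failedMetric: event.failedMetric,
    actualFinalResponse: event.actualFinalResponse,
    expectedFinalResponse: event.expectedFinalResponse,
  };
}
```

Routing every session load path through one mapper like this keeps the field list in a single place, which is one way to get the "consistently across session load paths" property.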

2) Keep compare rendering valid for empty-string actual response

Ensured compare rendering checks for presence (the value is neither null nor undefined) rather than truthiness, so an empty-string actual response still renders as valid compare content.
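
The difference between the two checks can be sketched as follows. The helper names and the CompareFields interface are hypothetical; the real condition lives in the chat panel and may be shaped differently.

```typescript
// Hypothetical message slice; only actualFinalResponse is named in the PR.
interface CompareFields {
  actualFinalResponse?: string | null;
}

// Presence check: true for any value except null and undefined.
function hasValue<T>(value: T | null | undefined): value is T {
  return value !== null && value !== undefined;
}

// Truthiness-based condition (the behavior being replaced): an empty
// string is falsy, so the compare section would be skipped.
function shouldRenderCompareTruthy(message: CompareFields): boolean {
  return Boolean(message.actualFinalResponse);
}

// Presence-based condition: an empty string still counts as a response.
function shouldRenderCompare(message: CompareFields): boolean {
  return hasValue(message.actualFinalResponse);
}
```

With an empty actual response, shouldRenderCompareTruthy returns false while shouldRenderCompare returns true; both return false when the field is null or missing.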

3) Harden EvalTab error handling

Improved EvalTab HTTP error behavior:

A 404 on eval results is treated as the expected “no history yet” case and yields an empty list ([])
Non-404 errors are no longer silently flattened into empty results
Removed brittle statusText === 'Not Found' checks in favor of status === 404
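
The rules above can be sketched as a small pure function. HttpErrorLike and resolveEvalResultsError are hypothetical names, and rethrowing on non-404 errors is an assumption: the description only says such errors are no longer turned into empty results. In an RxJS pipeline, logic like this would typically sit inside a catchError handler.

```typescript
// Hypothetical minimal shape of an HTTP error; only the numeric status
// is consulted, never statusText.
interface HttpErrorLike {
  status: number;
  statusText?: string;
}

// Maps an eval-results request error to a result:
// - 404 means "no history yet", so return an empty list;
// - anything else is surfaced to the caller (assumed behavior).
function resolveEvalResultsError(error: HttpErrorLike): string[] {
  if (error.status === 404) {
    return [];
  }
  throw new Error(`Eval results request failed with status ${error.status}`);
}
```

Keying on the numeric status avoids depending on statusText, which is a free-form string that can be empty or differ between servers and proxies.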

4) Add regression tests

Added/updated tests to cover:

stable eval history behavior
404 vs non-404 handling paths
tab visibility behavior for missing eval sets
failed compare data availability expectations

Why This Approach

Keeps backend contract unchanged
Fixes the bug at the data propagation boundary (root cause), not via template-only workaround
Preserves intended UX for real “no history yet” cases while avoiding hidden failures for real backend errors

Validation

Reproduced with failed eval history runs and confirmed compare sections appear
Confirmed score/threshold and tool-use compare fields display correctly
Confirmed 404/no-history behavior remains user-friendly
Added targeted test coverage for new behavior

Dsazz wants to merge 5 commits into google:main from Dsazz:fix/session-eval-combined-stability