feat: add truncate-results command by nvzhihanj · Pull Request #354 · mlcommons/endpoints

nvzhihanj · 2026-06-12T06:46:08Z

Split out from #353 for easier review. Targets release/v0.5.

What & why

In perf+accuracy mode, results.json stores every query's full response text under responses and can reach gigabytes — too large to attach to a submission or share as proof of work.

Changes

New command: inference-endpoint truncate-results <results.json> [--keep-n N] [--output PATH | --in-place].

Keeps the first --keep-n (default 5) responses verbatim.
Adds a truncation block: a sha256 of every response keyed by sample_uuid, plus responses_truncated, hash_algorithm, n_responses_total, n_responses_kept. This proves which outputs were produced without the bulk.
config / results / accuracy_scores / errors are preserved verbatim.
A perf-only results.json (no responses) passes through unchanged.
Output: <name>.truncated.json by default; --output PATH or --in-place to choose otherwise. The pure transform never mutates its input.

Tests

Unit tests cover: first-N-full + every-response-hashed + metadata; non-response sections preserved; input not mutated; keep_n exceeding total; perf-only passthrough; and the CLI writing a truncated copy (leaving the original intact) vs --in-place.

pre-commit green.

🤖 Generated with Claude Code

Perf+accuracy runs store every query's full response text under `responses` in results.json, which can reach gigabytes. `inference-endpoint truncate-results <results.json>` shrinks it: keep the first --keep-n (default 5) responses verbatim and replace the rest with a `truncation` block holding a sha256 of every response (proof of work) plus counts. Writes <name>.truncated.json by default, or --output PATH / --in-place. A perf-only results.json (no `responses`) passes through unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions · 2026-06-12T06:46:16Z

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

gemini-code-assist

Code Review

This pull request introduces a new truncate-results command to shrink large benchmark results.json files by keeping a specified number of full responses and replacing the rest with SHA-256 hashes. The changes include the command implementation, CLI registration, documentation updates, and unit tests. The review feedback suggests improving memory efficiency by using file-like objects with json.load and json.dump instead of reading the entire file into memory, and enhancing robustness by validating that the responses section is a dictionary and handling potential negative values for the keep count.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-12T06:47:06Z

+    data = json.loads(config.results.read_text())
+    truncated = truncate_results_dict(data, keep_n=config.keep_n)
+
+    if config.in_place:
+        out_path = config.results
+    elif config.output is not None:
+        out_path = config.output
+    else:
+        out_path = config.results.with_name(config.results.stem + ".truncated.json")
+
+    out_path.write_text(json.dumps(truncated, indent=2))


Since this command is specifically designed to handle potentially gigabyte-sized results.json files, reading the entire file into memory as a string with read_text() and then parsing/serializing with json.loads()/json.dumps() can lead to high memory usage or Out-Of-Memory (OOM) errors.

Using json.load() and json.dump() with file objects avoids loading the entire file as a single Python string, significantly reducing the memory footprint.

Suggested change

data = json.loads(config.results.read_text())

truncated = truncate_results_dict(data, keep_n=config.keep_n)

if config.in_place:

out_path = config.results

elif config.output is not None:

out_path = config.output

else:

out_path = config.results.with_name(config.results.stem + ".truncated.json")

out_path.write_text(json.dumps(truncated, indent=2))

with open(config.results, "r", encoding="utf-8") as f:

data = json.load(f)

truncated = truncate_results_dict(data, keep_n=config.keep_n)

if config.in_place:

out_path = config.results

elif config.output is not None:

out_path = config.output

else:

out_path = config.results.with_name(config.results.stem + ".truncated.json")

with open(out_path, "w", encoding="utf-8") as f:

json.dump(truncated, f, indent=2)

gemini-code-assist · 2026-06-12T06:47:06Z

+    responses = results.get("responses")
+    if not responses:
+        return dict(results)
+
+    uuids = list(responses.keys())
+    kept = uuids[:keep_n]


To make truncate_results_dict more robust and prevent potential runtime exceptions:

Ensure responses is actually a dictionary before calling .keys() or .items(). If it's of an unexpected type (e.g., a list or string due to malformed input), calling .keys() would raise an AttributeError.

Guard against negative values of keep_n. If keep_n is negative, Python's slice notation uuids[:keep_n] will slice from the end of the list (e.g., keep_n = -1 would keep all but the last response), which is likely unintended.

Suggested change

responses = results.get("responses")

if not responses:

return dict(results)

uuids = list(responses.keys())

kept = uuids[:keep_n]

responses = results.get("responses")

if not isinstance(responses, dict) or not responses:

return dict(results)

uuids = list(responses.keys())

kept = uuids[:max(0, keep_n)]

nvzhihanj · 2026-06-22T05:10:19Z

Converting to draft right now, we need to finalize the accuracy format before adding the truncation.

nvzhihanj requested a review from a team as a code owner June 12, 2026 06:46

github-actions Bot requested a review from arekay-nv June 12, 2026 06:46

gemini-code-assist Bot reviewed Jun 12, 2026

View reviewed changes

viraatc approved these changes Jun 21, 2026

View reviewed changes

nvzhihanj marked this pull request as draft June 22, 2026 05:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add truncate-results command#354

feat: add truncate-results command#354
nvzhihanj wants to merge 1 commit into
release/v0.5from
feat/truncate-results

nvzhihanj commented Jun 12, 2026

Uh oh!

github-actions Bot commented Jun 12, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 12, 2026

Uh oh!

gemini-code-assist Bot Jun 12, 2026

Uh oh!

nvzhihanj commented Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

nvzhihanj commented Jun 12, 2026

What & why

Changes

Tests

Uh oh!

github-actions Bot commented Jun 12, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

nvzhihanj commented Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants