Update MiniMax M2.5 FP8 H200 vLLM agg recipes #1298

Open
anish-shanbhag wants to merge 1 commit into SemiAnalysisAI:main from anish-shanbhag:ashan/port-inferencemax-53-minimax-h200-no-slurm-shared
Conversation

anish-shanbhag commented May 7, 2026

Update MiniMax-M2.5 FP8 H200 vLLM to vllm/vllm-openai:v0.20.1-ubuntu2404

Set vLLM serving knobs in benchmarks/single_node/minimaxm2.5_fp8_h200.sh: the benchmark-generated max-model-len, the previous eval max-model-len handling, the fp8 KV cache, FlashInfer attention and autotuning, the Triton MoE backend, and MiniMax QK-norm fusion.
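As a rough illustration of what these knobs look like, here is a minimal launch sketch. The `--max-model-len` and `--kv-cache-dtype` flags and the `VLLM_ATTENTION_BACKEND` variable are standard vLLM options; the model ID, port, and context-length value below are placeholders, not taken from this PR, and the FlashInfer autotune, Triton MoE, and QK-norm fusion switches are omitted because their exact flags are version-specific.

```shell
# FlashInfer attention backend (a standard vLLM environment variable).
export VLLM_ATTENTION_BACKEND=FLASHINFER

# Hypothetical invocation; the recipe's actual values live in
# benchmarks/single_node/minimaxm2.5_fp8_h200.sh.
vllm serve MiniMaxAI/MiniMax-M2.5 \
  --max-model-len 65536 \
  --kv-cache-dtype fp8 \
  --port 8000
```

The fp8 KV cache halves KV memory relative to fp16, which is typically what makes long max-model-len values fit on H200-class GPUs.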

anish-shanbhag marked this pull request as ready for review May 8, 2026 23:48
anish-shanbhag requested a review from a team May 8, 2026 23:48
Contributor

@claude (bot) left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

anish-shanbhag force-pushed the ashan/port-inferencemax-53-minimax-h200-no-slurm-shared branch from e601235 to d601a12 on May 9, 2026 00:10
Labels

None yet

Projects

Status: No status

1 participant