fix(patch): preserve native matmul backend on torch 2.9 #66

Open

froststeam wants to merge 1 commit into MooreThreads:main from froststeam:fix-torch29-matmul-backend

Conversation

@froststeam
Contributor

Summary

Fix a PyTorch 2.9 + torch.compile compatibility issue on MUSA.

torchada==0.1.55 replaces PyTorch's native CUDA matmul backend object with the MUSA matmul backend object:

torch.backends.cuda.matmul = torch.backends.musa.matmul

That makes CUDA-style matmul settings point to MUSA, but it also changes the object seen by PyTorch internals.

In PyTorch 2.9, torch.compile / Inductor expects torch.backends.cuda.matmul to remain PyTorch's native backend object and reads native attributes from it. After the replacement, torch.backends.cuda.matmul becomes torch_musa.core.musa.muBLASModule, which does not expose all PyTorch 2.9 native matmul attributes.

This can break SGLang service startup when torch.compile / Inductor is enabled.


Reproduced issue

With torchada==0.1.55:

torch 2.9.0
torchada 0.1.55
is_musa_platform True
cuda.matmul class <class 'torch_musa.core.musa.muBLASModule'>
cuda.matmul is musa.matmul True

A minimal Inductor compile fails with:

torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
AttributeError: Unknown attribute allow_fp16_reduced_precision_reduction

Relevant traceback:

File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/codecache.py", line 859, in __init__
    torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction,
AttributeError: Unknown attribute allow_fp16_reduced_precision_reduction

Root cause

Inductor expects this attribute to exist on PyTorch's native torch.backends.cuda.matmul object:

torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction

But after the replacement, torch.backends.cuda.matmul points to:

torch.backends.musa.matmul

whose object type is:

torch_musa.core.musa.muBLASModule

and that object does not provide the PyTorch 2.9 native matmul backend API.


Fix

Keep the native PyTorch object:

torch.backends.cuda.matmul

and patch attribute access on MUSA platforms so CUDA-style code still gets MUSA matmul semantics where appropriate.

After this change:

torch.backends.cuda.matmul is not torch.backends.musa.matmul

but CUDA-style settings still forward to MUSA, for example:

torch.backends.cuda.matmul.allow_tf32 = True

affects:

torch.backends.musa.matmul.allow_tf32

on MUSA.

At the same time, PyTorch 2.9 / Inductor can still access native attributes such as:

torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction
torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction
torch.backends.cuda.matmul.allow_fp16_accumulation
torch.backends.cuda.matmul.fp32_precision
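
The forwarding idea can be sketched in pure Python with stand-in objects; the class names, the `FORWARDED` set, and the attribute lists below are illustrative, not torchada's actual implementation:

```python
class NativeMatmul:
    """Stand-in for PyTorch's native torch.backends.cuda.matmul object."""
    allow_tf32 = False
    allow_fp16_reduced_precision_reduction = True
    allow_bf16_reduced_precision_reduction = True
    allow_fp16_accumulation = False

class MusaMatmul:
    """Stand-in for torch.backends.musa.matmul."""
    allow_tf32 = False

# CUDA-style settings that should take effect on the MUSA backend.
FORWARDED = {"allow_tf32"}

class ForwardingMatmul:
    """Keeps the native object reachable; forwards selected attributes to MUSA."""

    def __init__(self, native, musa):
        # Bypass our own __setattr__ to avoid recursing through the forwarder.
        object.__setattr__(self, "_native", native)
        object.__setattr__(self, "_musa", musa)

    def __getattr__(self, name):
        if name in FORWARDED:
            return getattr(self._musa, name)
        # Fall back to the native object, so Inductor still sees attributes
        # like allow_fp16_reduced_precision_reduction.
        return getattr(self._native, name)

    def __setattr__(self, name, value):
        target = self._musa if name in FORWARDED else self._native
        setattr(target, name, value)

native, musa = NativeMatmul(), MusaMatmul()
matmul = ForwardingMatmul(native, musa)

matmul.allow_tf32 = True                              # CUDA-style write...
print(musa.allow_tf32)                                # ...lands on MUSA: True
print(matmul.allow_fp16_reduced_precision_reduction)  # native attribute: True
```

The point of the pattern is that the object Inductor inspects keeps the full native attribute surface, while the handful of settings that must take effect on MUSA are routed there transparently.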

Validation

Minimal torch.compile reproduction

After installing the fixed editable torchada:

torch 2.9.0
torchada 0.1.56
cuda.matmul class <class 'torch.backends.cuda.cuBLASModule'>
cuda.matmul is musa.matmul False
allow_fp16_reduced_precision_reduction => True
allow_bf16_reduced_precision_reduction => True
allow_fp16_accumulation => False
allow_tf32 => False
fp32_precision => none

A minimal Inductor compile now succeeds:

compile inductor ok torch.Size([4, 4]) cpu

Targeted backend patch tests

$ python3 -m pytest /home/dist/qzg/0506/gitlab/mcc/torchada/tests/test_cuda_patching.py::TestIsCompiledAndBackends

============================= test session starts ==============================
platform linux -- Python 3.10.12, pytest-9.0.3, pluggy-1.6.0
rootdir: /home/dist/qzg/0506/gitlab/mcc/torchada
configfile: pyproject.toml
plugins: anyio-4.13.0
collected 7 items

torchada/tests/test_cuda_patching.py .......                             [100%]

============================== 7 passed in 0.02s ===============================

Full torchada test suite

$ python3 -m pytest /home/dist/qzg/0506/gitlab/mcc/torchada/tests

============================= test session starts ==============================
platform linux -- Python 3.10.12, pytest-9.0.3, pluggy-1.6.0
rootdir: /home/dist/qzg/0506/gitlab/mcc/torchada
configfile: pyproject.toml
plugins: anyio-4.13.0
collected 340 items

torchada/tests/test_cpp_extension.py ...............                     [  4%]
torchada/tests/test_cuda_patching.py ................................... [ 14%]
...............................s........................................ [ 35%]
.............................................s.........................  [ 56%]
torchada/tests/test_device_strings.py ................................   [ 66%]
torchada/tests/test_extension_build.py ..sssssssss                       [ 69%]
torchada/tests/test_mappings.py ........................................ [ 81%]
...........................................ssss                          [ 95%]
torchada/tests/test_platform.py .........                                [ 97%]
torchada/tests/test_python_compat.py ........                            [100%]

================= 325 passed, 15 skipped, 4 warnings in 3.14s ==================

SGLang startup validation

The SGLang service starts successfully after the fix:

[2026-05-14 08:41:34] Capture cuda graph end. Time elapsed: 135.94 s. mem usage=1.08 GB. avail mem=7.39 GB.
[2026-05-14 08:42:12] Capture draft cuda graph end. Time elapsed: 30.44 s. mem usage=0.18 GB. avail mem=3.62 GB.
[2026-05-14 08:42:14] INFO:     Started server process [1181362]
[2026-05-14 08:42:14] INFO:     Waiting for application startup.
[2026-05-14 08:42:14] INFO:     Application startup complete.
[2026-05-14 08:42:14] INFO:     Uvicorn running on http://127.0.0.1:30000 (Press CTRL+C to quit)
[2026-05-14 08:42:15] INFO:     127.0.0.1:60398 - "GET /model_info HTTP/1.1" 200 OK

Keep torch.backends.cuda.matmul as the native PyTorch backend object while forwarding its attribute access to MUSA matmul semantics on MUSA platforms.
@yeahdongcn
Collaborator

@popsiclexu Could you please take a look? Thanks!

@augmentcode

augmentcode Bot commented May 14, 2026

🤖 Augment PR Summary

Summary: Fixes a PyTorch 2.9 + torch.compile/Inductor incompatibility on MUSA by preserving PyTorch’s native torch.backends.cuda.matmul object.

Changes:

  • Stop replacing torch.backends.cuda.matmul with the MUSA backend object (which broke Inductor’s expected native attributes).
  • Patch torch.backends.cuda.matmul attribute access to forward “CUDA-style” matmul settings to torch.backends.musa.matmul when on MUSA, while falling back to native attributes when needed.
  • Keep fp32_precision behavior for older versions that lack the native attribute.
  • Add a targeted test ensuring allow_tf32 set via CUDA backend forwards to MUSA.
  • Update fp32_precision test expectations for PyTorch 2.9 (including the none precision mode).



@augmentcode augmentcode Bot left a comment


Review completed. 2 suggestions posted.


Comment thread src/torchada/_patch.py
try:
    _ = cuda_matmul.fp32_precision
    has_native_fp32_precision = True
except AttributeError:

@augmentcode augmentcode Bot May 14, 2026


src/torchada/_patch.py:1182: the probe cuda_matmul.fp32_precision may raise AssertionError (not just AttributeError) for unknown attributes (as noted in the tests), which would make _patch_backends_cuda() crash during import on some versions; consider catching AssertionError here as well.

Severity: high
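
The defensive probe the bot suggests can be sketched with a stand-in backend object; `StrictBackend` below is illustrative and only mimics a backend that asserts on unknown attributes:

```python
class StrictBackend:
    """Stand-in for a backend module that asserts on unknown attributes,
    as some torch_musa versions do instead of raising AttributeError."""

    def __getattr__(self, name):
        raise AssertionError(f"Unknown attribute {name}")

cuda_matmul = StrictBackend()

try:
    _ = cuda_matmul.fp32_precision
    has_native_fp32_precision = True
except (AttributeError, AssertionError):
    # Catch AssertionError too, so the probe cannot crash module import
    # on versions where unknown attributes assert rather than raise.
    has_native_fp32_precision = False

print(has_native_fp32_precision)  # False
```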




Comment thread tests/test_cuda_patching.py

import torchada

if not torchada.is_musa_platform() or not hasattr(torch.backends, "musa"):

@augmentcode augmentcode Bot May 14, 2026


tests/test_cuda_patching.py:1251: this skip guard checks hasattr(torch.backends, "musa") but the test immediately dereferences torch.backends.musa.matmul; if musa exists without matmul, this will error instead of skipping.

Severity: medium



@popsiclexu
Contributor

Could you run the full test suite on Torch 2.7?

@yeahdongcn
Collaborator

Could you run the full test suite on Torch 2.7?

# pytest -v ./tests/
================================================================ 310 passed, 30 skipped in 1.25s ================================================================
# pip list | grep torch
torch                              2.7.1
torch_musa                         2.7.1
torchada                           0.1.22          /ws
torchaudio                         2.7.1a0+95c61b4
torchvision                        0.22.1+6b25dcc
