@danieldk (Member) commented Dec 4, 2025

This change adds support for building noarch kernels. So far we have used the universal variant for kernels that do not have any AoT-compiled code. However, the universal variant has two important issues:

  1. A kernel without AoT-compiled code might still be backend-specific. E.g. NVIDIA CuTe-based kernels are not universal, since they do not work on non-NVIDIA GPUs.
  2. We cannot specify dependencies per backend.

To solve these issues, we introduce noarch variants to replace the universal variant. Noarch kernels have variants of the form torch-<backend> (e.g. torch-xpu), so a kernel without AoT-compiled code is tied to the backends it actually supports and can specify dependencies per backend.

To support noarch kernels, we update the build.toml format to v3, making the following changes:

  • general.universal is removed.
  • general.backends is introduced. This required option is used to list what backends the kernel supports.
  • general.cuda-{minver,maxver} have been moved to the new general.cuda section, as minver and maxver.
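
The three format changes above can be sketched as a small conversion in Rust. This is only an illustration under stated assumptions: the type names `GeneralV2`, `GeneralV3`, `CudaGeneral` and the function `upgrade` are hypothetical, versions are kept as plain strings, and the backend list is passed in by the caller because this description does not say how `update-build` derives it.

```rust
// Hypothetical, trimmed-down types for illustration; not the build2cmake definitions.
#[derive(Debug, Clone, PartialEq)]
struct GeneralV2 {
    name: String,
    universal: bool,             // removed in v3
    cuda_minver: Option<String>, // general.cuda-minver in v2
    cuda_maxver: Option<String>, // general.cuda-maxver in v2
}

#[derive(Debug, Clone, PartialEq)]
struct CudaGeneral {
    minver: Option<String>,
    maxver: Option<String>,
}

#[derive(Debug, Clone, PartialEq)]
struct GeneralV3 {
    name: String,
    backends: Vec<String>,     // required in v3
    cuda: Option<CudaGeneral>, // assumption: the section is optional
}

/// Converts the v2 general section to the v3 shape.
/// `backends` is supplied by the caller (assumption, see the text above).
fn upgrade(v2: GeneralV2, backends: Vec<String>) -> GeneralV3 {
    // `universal` has no v3 counterpart, so it is dropped explicitly.
    let GeneralV2 { name, universal: _, cuda_minver, cuda_maxver } = v2;
    // Only emit a cuda section when at least one bound was set.
    let cuda = if cuda_minver.is_some() || cuda_maxver.is_some() {
        Some(CudaGeneral { minver: cuda_minver, maxver: cuda_maxver })
    } else {
        None
    };
    GeneralV3 { name, backends, cuda }
}

fn main() {
    let v2 = GeneralV2 {
        name: "mykernel".to_string(),
        universal: false,
        cuda_minver: Some("12.4".to_string()),
        cuda_maxver: None,
    };
    let v3 = upgrade(v2, vec!["cuda".to_string()]);
    println!("{v3:?}");
}
```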

If a kernel supports backend X and has one or more kernel.* sections with backend = "X", then it is an AoT-compiled kernel for that backend. Otherwise, it is a noarch kernel for that backend. Suppose that we have:

[general]
# ...
backends = ["cuda", "xpu"]
# ...

[kernel.mykernel]
backend = "xpu"
# ...

then the XPU kernel will be AoT-compiled (e.g. build/torch29-cxx11-xpu20252-x86_64-linux), whereas the CUDA kernel will be noarch (torch-cuda).
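
The rule above can be sketched in a few lines of Rust. This is a minimal illustration, not the build2cmake implementation: the `Backend` and `VariantKind` types and the `classify` function are hypothetical, and the backend list is taken from the variant names that appear in this pull request.

```rust
use std::collections::BTreeSet;

// Hypothetical, simplified types for illustration; not the build2cmake definitions.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Backend {
    Cpu,
    Cuda,
    Metal,
    Rocm,
    Xpu,
}

impl Backend {
    fn as_str(self) -> &'static str {
        match self {
            Backend::Cpu => "cpu",
            Backend::Cuda => "cuda",
            Backend::Metal => "metal",
            Backend::Rocm => "rocm",
            Backend::Xpu => "xpu",
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
enum VariantKind {
    /// At least one kernel section targets this backend.
    Aot,
    /// Supported backend without kernel sections; carries `torch-<backend>`.
    Noarch(String),
}

/// `backends` mirrors `general.backends`; `kernel_backends` holds the
/// `backend` value of every kernel section in the build file.
fn classify(backends: &[Backend], kernel_backends: &[Backend]) -> Vec<(Backend, VariantKind)> {
    let compiled: BTreeSet<Backend> = kernel_backends.iter().copied().collect();
    backends
        .iter()
        .map(|&b| {
            let kind = if compiled.contains(&b) {
                VariantKind::Aot
            } else {
                VariantKind::Noarch(format!("torch-{}", b.as_str()))
            };
            (b, kind)
        })
        .collect()
}

fn main() {
    // Same situation as the build.toml fragment above:
    // backends = ["cuda", "xpu"], one kernel section with backend = "xpu".
    for (backend, kind) in classify(&[Backend::Cuda, Backend::Xpu], &[Backend::Xpu]) {
        println!("{backend:?}: {kind:?}");
    }
}
```

With the inputs from the fragment above, CUDA is classified as noarch (`torch-cuda`) and XPU as AoT-compiled. The full AoT variant name also encodes the Torch version, ABI, and platform, which this sketch does not model.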

An older build.toml can be updated automatically with build2cmake update-build build.toml.

@danieldk force-pushed the backend-noarch-kernels branch from 39adce0 to 01a0e2f on December 5, 2025 08:07
@danieldk marked this pull request as ready for review on December 5, 2025 09:20
pub struct General {
pub name: String,

pub backends: Vec<Backend>,
Member Author

v3 is mostly a copy of v2, so I'll mark the changes here.

In general, universal was removed and backends was added.

Comment on lines +60 to +61
pub minver: Option<Version>,
pub maxver: Option<Version>,
Member Author

cuda-minver and cuda-maxver have moved from general to this new general.cuda section.

pkgs.linkFarm "packages-for-cache" (
map (buildSet: {
- name = buildName (buildSet.buildConfig);
+ name = buildSet.torch.variant;
Member Author

I have moved generation of the variant strings (torch-...) to the Torch derivation itself. This makes them easier to access from everywhere.

Collaborator

Yes, makes sense.

@MekkCyber (Collaborator) left a comment

Nice work! I didn’t follow every detail 😅, but the logic seems sound. Thank you!

Comment on lines +61 to +65
- `torch-cpu`
- `torch-cuda`
- `torch-metal`
- `torch-rocm`
- `torch-xpu`
Collaborator

Can we add npu to the noarch variants?

@drbh (Collaborator) left a comment

Changes look good to me!

Small nit regarding the format of the config module: currently, when adding a new version, we need to make changes to the previous version's files. It may be helpful to limit the version files to parsing TOML only and to maintain a top-level config that is populated by the different versions.
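
One possible reading of this suggestion, sketched in Rust: each version module only describes what it parses, and a version-independent top-level `Build` is populated through `From` implementations. The module layout, type names, and fields here are hypothetical and heavily trimmed; the sketch assumes a v2 file's backends can be collected from its kernel sections.

```rust
// Hypothetical module layout; names and fields are illustrative only.
mod v2 {
    /// Parsed v2 data (trimmed): backends come from the kernel sections.
    pub struct Build {
        pub name: String,
        pub kernel_backends: Vec<String>,
    }
}

mod v3 {
    /// Parsed v3 data (trimmed): backends are listed explicitly.
    pub struct Build {
        pub name: String,
        pub backends: Vec<String>,
    }
}

/// Version-independent config consumed by the rest of the tool.
#[derive(Debug, PartialEq)]
struct Build {
    name: String,
    backends: Vec<String>,
}

impl From<v2::Build> for Build {
    fn from(b: v2::Build) -> Self {
        // Several kernel sections may share a backend, so deduplicate.
        let mut backends = b.kernel_backends;
        backends.sort();
        backends.dedup();
        Build { name: b.name, backends }
    }
}

impl From<v3::Build> for Build {
    fn from(b: v3::Build) -> Self {
        Build { name: b.name, backends: b.backends }
    }
}

fn main() {
    let old: Build = v2::Build {
        name: "mykernel".to_string(),
        kernel_backends: vec!["xpu".to_string(), "xpu".to_string()],
    }
    .into();
    let new: Build = v3::Build {
        name: "mykernel".to_string(),
        backends: vec!["xpu".to_string()],
    }
    .into();
    // Both versions end up as the same top-level config.
    assert_eq!(old, new);
    println!("{old:?}");
}
```

With this shape, adding a v4 would mean adding a new module and one more `From` implementation, without touching the v2 or v3 files.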

4 participants