Add support for building noarch kernels #319
base: main
Conversation
Universal kernels still exist in the project writers.
This still generates noarch variants with unnecessary bits (CUDA version, system, etc.).
Force-pushed from 39adce0 to 01a0e2f.
```rust
pub struct General {
    pub name: String,
    // ...
    pub backends: Vec<Backend>,
```
v3 is mostly a copy of v2, so I'll mark the changes here. In `general`, `universal` was removed and `backends` was added.
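For illustration, a minimal v3 `[general]` section might look like the following sketch (the kernel name and backend values are illustrative assumptions):

```toml
[general]
name = "relu"
# New in v3: required list of backends the kernel supports
# (replaces the old `universal` flag).
backends = ["cuda", "xpu"]
```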
```rust
pub minver: Option<Version>,
pub maxver: Option<Version>,
```
`cuda-minver` and `cuda-maxver` have moved from `general` to this new `general.cuda` section.
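A rough sketch of the corresponding `build.toml` change (the version bounds are illustrative assumptions, and exact key spellings may differ):

```toml
# v2 placed the bounds directly under [general] as
# cuda-minver / cuda-maxver; in v3 they move to [general.cuda].
[general]
name = "relu"

[general.cuda]
minver = "12.0"
maxver = "12.8"
```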
```diff
 pkgs.linkFarm "packages-for-cache" (
   map (buildSet: {
-    name = buildName (buildSet.buildConfig);
+    name = buildSet.torch.variant;
```
I have moved generation of the `torch-...` variant strings to the Torch derivation itself. This makes it easier to access them from everywhere.
Yes makes sense
MekkCyber left a comment:
Nice work! I didn’t follow every detail 😅, but the logic seems sound. Thank you!
- `torch-cpu`
- `torch-cuda`
- `torch-metal`
- `torch-rocm`
- `torch-xpu`
Can we add `npu` in the case of noarch variants?
drbh left a comment:
Changes look good to me!

Small nit regarding the format of the config module: currently, when adding a new version, we need to make changes to the previous version files. It may be helpful to limit the version files to parsing TOML only, and maintain a top-level config that is populated by the different versions.
This change adds support for building noarch kernels. So far we have used the universal variant for kernels that do not have any AoT-compiled code. However, the universal variant has two important issues:

To solve these issues, we introduce noarch variants to replace universal kernels. Noarch kernels have variants of the shape `torch-<backend>` (e.g. `torch-xpu`). This resolves the issues outlined above.

To support noarch kernels, we update the `build.toml` format to v3, making the following changes:

- `general.universal` is removed.
- `general.backends` is introduced. This required option lists the backends that the kernel supports.
- `general.cuda-{minver,maxver}` has been moved to the `general.cuda` section.

If a kernel supports backend X and has one or more `kernels.*` sections with `backend = "X"`, then the kernel is an AoT-compiled kernel for that backend. Otherwise, it is a noarch kernel for that backend. Suppose that we have:
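(The following `build.toml` is a minimal sketch; the kernel name, source files, and `kernels.*` section contents are illustrative assumptions.)

```toml
[general]
name = "relu"
backends = ["cuda", "xpu"]

# An AoT-compiled kernel is only declared for the xpu backend.
[kernels.relu]
backend = "xpu"
src = ["relu_kernel/relu_xpu.cpp"]
depends = ["torch"]

# There is no kernels.* section with backend = "cuda", so the
# cuda backend is built as a noarch (torch-cuda) kernel.
```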
Then the XPU kernel will be AoT-compiled (e.g. `build/torch29-cxx11-xpu20252-x86_64-linux`), whereas the CUDA kernel will be noarch (`torch-cuda`).

An older `build.toml` can be updated automatically with `build2cmake update-build build.toml`.