Add permutation that supports running on ROCm gfx1151 devices such as Strix Halo #431
+232
−13
This change is based on work done by @kprinssu at https://github.com/kprinssu/Kokoro-FastAPI
I don't necessarily expect this PR to get accepted, but I thought I'd throw it out there in case others found it useful.
To support AMD gfx1151 GPUs such as Strix Halo, I'm using the latest ROCm release, 7.10.0. This necessitated moving Python to 3.12.
In addition, ROCm isn't supported on aarch64, so I needed to make some build changes so that this permutation builds only for amd64.
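As a rough illustration of the amd64-only restriction (this is not the actual build configuration in this PR), the idea can be sketched as a mapping from each permutation to the platforms it builds for. The permutation names, the platform lists for `cpu` and `gpu`, and the `build_matrix` helper are all hypothetical and only serve to show the shape of the change.

```python
# Hypothetical permutation names and platform lists, for illustration only;
# the real build targets in the repository may be named and defined differently.
PLATFORMS_BY_PERMUTATION = {
    "cpu": ["linux/amd64", "linux/arm64"],
    "gpu": ["linux/amd64", "linux/arm64"],
    "rocm": ["linux/amd64"],  # ROCm isn't supported on aarch64, so no arm64 build
}


def build_matrix(permutations=PLATFORMS_BY_PERMUTATION):
    """Expand the mapping into one (permutation, platform) pair per build job."""
    return [
        (name, platform)
        for name, platforms in permutations.items()
        for platform in platforms
    ]


if __name__ == "__main__":
    for name, platform in build_matrix():
        print(f"{name}: {platform}")
```

Keeping the platform list per permutation, rather than one global list, means the ROCm permutation can skip arm64 without changing what the other permutations build.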
If you are interested in taking this change, I'd be happy to iterate on it, especially if you'd prefer to find a way to keep the CPU and GPU permutations on Python 3.10.