Skip to content

various fixes for scalable vectors#153286

Open
davidtwco wants to merge 4 commits intorust-lang:mainfrom
davidtwco:sve-intrinsics
Open

various fixes for scalable vectors#153286
davidtwco wants to merge 4 commits intorust-lang:mainfrom
davidtwco:sve-intrinsics

Conversation

@davidtwco
Copy link
Copy Markdown
Member

@davidtwco davidtwco commented Mar 2, 2026

View all comments

These are a handful of patches fixing bugs with the current #[rustc_scalable_vector] infrastructure so that we can upstream the intrinsics to stdarch:

  1. Instead of just using regular struct lowering for tuples of scalable vectors, which results in an incorrect ABI (e.g. returning indirectly), we change to using BackendRepr::ScalableVector which will lower to the correct type and be passed in registers.

  2. Clang changed from representing tuples of scalable vectors as structs rather than as wide vectors (that is, scalable vector types where the N part of the <vscale x N x ty> type was multiplied by the number of vectors). rustc mirrored this in the initial implementation of scalable vectors that didn't land.

    When our early patches used the wide vector representation, our intrinsic patches used the legacy llvm.aarch64.sve.tuple.{create,get,set}{2,3,4} intrinsics for creating these tuples/getting/setting the vectors, which were only supported due to LLVM's AutoUpgrade pass converting these intrinsics into llvm.vector.insert. AutoUpgrade only supports these legacy intrinsics with the wide vector representation.

    With the current struct representation, Clang has special handling in codegen for generating insertvalue/extractvalue instructions for these operations, which must be replicated by rustc's codegen for our intrinsics to use.

    We add a new core::intrinsics::scalable module (never intended to be stable, just used by the stdarch intrinsics, gated by core_intrinsics) and add new intrinsics which rustc lowers to the appropriate insertvalue/extractvalue instructions.

  3. Add generation of debuginfo for scalable vectors, following the DWARF that Clang generates.

  4. Some intrinsics need something like simd_cast, which will work for scalable vectors too, so this implements a new sve_cast intrinsic that uses the same internals as simd_cast. This seemed better than permitting scalable vectors to be used with the simd_* intrinsics more generally as I can't guarantee this would work for all of them.

This is a relatively large patch but most of it is tests, and each commit should be relatively standalone. It's a little bit easier to upstream them together to avoid needing to stack them. It's possible that some more compiler fixes will be forthcoming but it's looking like this might be all at the moment.

Depends on #153653

r? @workingjubilee (discussed on Zulip)

@davidtwco davidtwco added the F-scalable-vectors `#[rustc_scalable_vector]` label Mar 2, 2026
@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Mar 2, 2026

Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter
gets adapted for the changes, if necessary.

cc @rust-lang/miri, @RalfJung, @oli-obk, @lcnr

This PR changes rustc_public

cc @oli-obk, @celinval, @ouz-a, @makai410

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

@rustbot rustbot added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Mar 2, 2026
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Mar 2, 2026
@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Mar 2, 2026

workingjubilee is currently at their maximum review capacity.
They may take a while to respond.

@rust-log-analyzer

This comment has been minimized.

@rustbot

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment was marked as resolved.

@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Mar 2, 2026

triagebot.toml has been modified, there may have been changes to the review queue.

cc @wesleywiser

@rustbot rustbot added the A-meta Area: Issues & PRs about the rust-lang/rust repository itself label Mar 2, 2026
@rust-log-analyzer

This comment was marked as resolved.

@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Mar 2, 2026

Some changes occurred to the CTFE machinery

cc @RalfJung, @oli-obk, @lcnr

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

Some changes occurred to the CTFE / Miri interpreter

cc @rust-lang/miri

@rust-log-analyzer

This comment was marked as resolved.

@rust-log-analyzer

This comment has been minimized.

@rustbot

This comment has been minimized.

@davidtwco
Copy link
Copy Markdown
Member Author

@bors r=Amanieu

@rust-bors
Copy link
Copy Markdown
Contributor

rust-bors bot commented Apr 2, 2026

📌 Commit 73fdeed has been approved by Amanieu

It is now in the queue for this repository.

@rust-bors rust-bors bot added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 2, 2026
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Apr 2, 2026
various fixes for scalable vectors

These are a handful of patches fixing bugs with the current `#[rustc_scalable_vector]` infrastructure so that we can upstream the intrinsics to stdarch:

1. Instead of just using regular struct lowering for tuples of scalable vectors, which results in an incorrect ABI (e.g. returning indirectly), we change to using `BackendRepr::ScalableVector` which will lower to the correct type and be passed in registers.
2. Clang changed from representing tuples of scalable vectors as structs rather than as wide vectors (that is, scalable vector types where the `N` part of the `<vscale x N x ty>` type was multiplied by the number of vectors). rustc mirrored this in the initial implementation of scalable vectors that didn't land.

   When our early patches used the wide vector representation, our intrinsic patches used the legacy `llvm.aarch64.sve.tuple.{create,get,set}{2,3,4}` intrinsics for creating these tuples/getting/setting the vectors, which were only supported due to LLVM's `AutoUpgrade` pass converting these intrinsics into `llvm.vector.insert`. `AutoUpgrade` only supports these legacy intrinsics with the wide vector representation.

   With the current struct representation, Clang has special handling in codegen for generating `insertvalue`/`extractvalue` instructions for these operations, which must be replicated by rustc's codegen for our intrinsics to use.

   We add a new `core::intrinsics::scalable` module (never intended to be stable, just used by the stdarch intrinsics, gated by `core_intrinsics`) and add new intrinsics which rustc lowers to the appropriate `insertvalue`/`extractvalue` instructions.
3. Add generation of debuginfo for scalable vectors, following the DWARF that Clang generates.
4. Some intrinsics need something like `simd_cast`, which will work for scalable vectors too, so this implements a new `sve_cast` intrinsic that uses the same internals as `simd_cast`. This seemed better than permitting scalable vectors to be used with the `simd_*` intrinsics more generally as I can't guarantee this would work for all of them.

This is a relatively large patch but most of it is tests, and each commit should be relatively standalone. It's a little bit easier to upstream them together to avoid needing to stack them.  It's possible that some more compiler fixes will be forthcoming but it's looking like this might be all at the moment.

Depends on rust-lang#153653

r? @workingjubilee (discussed on Zulip)
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Apr 2, 2026
various fixes for scalable vectors

These are a handful of patches fixing bugs with the current `#[rustc_scalable_vector]` infrastructure so that we can upstream the intrinsics to stdarch:

1. Instead of just using regular struct lowering for tuples of scalable vectors, which results in an incorrect ABI (e.g. returning indirectly), we change to using `BackendRepr::ScalableVector` which will lower to the correct type and be passed in registers.
2. Clang changed from representing tuples of scalable vectors as structs rather than as wide vectors (that is, scalable vector types where the `N` part of the `<vscale x N x ty>` type was multiplied by the number of vectors). rustc mirrored this in the initial implementation of scalable vectors that didn't land.

   When our early patches used the wide vector representation, our intrinsic patches used the legacy `llvm.aarch64.sve.tuple.{create,get,set}{2,3,4}` intrinsics for creating these tuples/getting/setting the vectors, which were only supported due to LLVM's `AutoUpgrade` pass converting these intrinsics into `llvm.vector.insert`. `AutoUpgrade` only supports these legacy intrinsics with the wide vector representation.

   With the current struct representation, Clang has special handling in codegen for generating `insertvalue`/`extractvalue` instructions for these operations, which must be replicated by rustc's codegen for our intrinsics to use.

   We add a new `core::intrinsics::scalable` module (never intended to be stable, just used by the stdarch intrinsics, gated by `core_intrinsics`) and add new intrinsics which rustc lowers to the appropriate `insertvalue`/`extractvalue` instructions.
3. Add generation of debuginfo for scalable vectors, following the DWARF that Clang generates.
4. Some intrinsics need something like `simd_cast`, which will work for scalable vectors too, so this implements a new `sve_cast` intrinsic that uses the same internals as `simd_cast`. This seemed better than permitting scalable vectors to be used with the `simd_*` intrinsics more generally as I can't guarantee this would work for all of them.

This is a relatively large patch but most of it is tests, and each commit should be relatively standalone. It's a little bit easier to upstream them together to avoid needing to stack them.  It's possible that some more compiler fixes will be forthcoming but it's looking like this might be all at the moment.

Depends on rust-lang#153653

r? @workingjubilee (discussed on Zulip)
rust-bors bot pushed a commit that referenced this pull request Apr 2, 2026
…uwer

Rollup of 21 pull requests

Successful merges:

 - #153105 (Compute the result of a projection type with region errors)
 - #153286 (various fixes for scalable vectors)
 - #153532 (Attributes containing rustc)
 - #153960 (Make `layout_of` cycles fatal errors)
 - #154527 (Emit pre-expansion feature gate warnings for negative impls and specialization)
 - #154666 (Remove `StableHashContext` impls)
 - #154669 (Introduce #[diagnostic::on_move] on `Arc`)
 - #154710 (opaque_generic_const_args -> generic_const_args)
 - #154712 (Revert "`-Znext-solver` Remove the forced ambiguity hack from search graph")
 - #154713 (Stop compiling when we get resolving crate failure)
 - #154213 (tidy-alphabetical: fix line number in error message)
 - #154425 (Migrate transmute tests)
 - #154442 (Export `derive` at the crate root: `core::derive` and `std::derive`)
 - #154469 (mGCA: Lower spans for literal const args)
 - #154578 (Rename `probe_ty_var` to `try_resolve_ty_var`)
 - #154615 (Moving issues)
 - #154644 (rustdoc: seperate methods and associated functions in sidebar)
 - #154660 (Avoid creating async return opaques for foreign async fns)
 - #154671 (Add a test for a past ICE when calling a const fn of an unresolved type with the wrong number of args)
 - #154680 ([rustdoc] Replace `DocContext` with `TyCtxt` wherever possible)
 - #154709 (Revert `Ty` type alias in `rustc_type_ir`)
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Apr 2, 2026
various fixes for scalable vectors

These are a handful of patches fixing bugs with the current `#[rustc_scalable_vector]` infrastructure so that we can upstream the intrinsics to stdarch:

1. Instead of just using regular struct lowering for tuples of scalable vectors, which results in an incorrect ABI (e.g. returning indirectly), we change to using `BackendRepr::ScalableVector` which will lower to the correct type and be passed in registers.
2. Clang changed from representing tuples of scalable vectors as structs rather than as wide vectors (that is, scalable vector types where the `N` part of the `<vscale x N x ty>` type was multiplied by the number of vectors). rustc mirrored this in the initial implementation of scalable vectors that didn't land.

   When our early patches used the wide vector representation, our intrinsic patches used the legacy `llvm.aarch64.sve.tuple.{create,get,set}{2,3,4}` intrinsics for creating these tuples/getting/setting the vectors, which were only supported due to LLVM's `AutoUpgrade` pass converting these intrinsics into `llvm.vector.insert`. `AutoUpgrade` only supports these legacy intrinsics with the wide vector representation.

   With the current struct representation, Clang has special handling in codegen for generating `insertvalue`/`extractvalue` instructions for these operations, which must be replicated by rustc's codegen for our intrinsics to use.

   We add a new `core::intrinsics::scalable` module (never intended to be stable, just used by the stdarch intrinsics, gated by `core_intrinsics`) and add new intrinsics which rustc lowers to the appropriate `insertvalue`/`extractvalue` instructions.
3. Add generation of debuginfo for scalable vectors, following the DWARF that Clang generates.
4. Some intrinsics need something like `simd_cast`, which will work for scalable vectors too, so this implements a new `sve_cast` intrinsic that uses the same internals as `simd_cast`. This seemed better than permitting scalable vectors to be used with the `simd_*` intrinsics more generally as I can't guarantee this would work for all of them.

This is a relatively large patch but most of it is tests, and each commit should be relatively standalone. It's a little bit easier to upstream them together to avoid needing to stack them.  It's possible that some more compiler fixes will be forthcoming but it's looking like this might be all at the moment.

Depends on rust-lang#153653

r? @workingjubilee (discussed on Zulip)
rust-bors bot pushed a commit that referenced this pull request Apr 2, 2026
…uwer

Rollup of 21 pull requests

Successful merges:

 - #153105 (Compute the result of a projection type with region errors)
 - #153286 (various fixes for scalable vectors)
 - #153532 (Attributes containing rustc)
 - #153960 (Make `layout_of` cycles fatal errors)
 - #154527 (Emit pre-expansion feature gate warnings for negative impls and specialization)
 - #154666 (Remove `StableHashContext` impls)
 - #154669 (Introduce #[diagnostic::on_move] on `Arc`)
 - #154710 (opaque_generic_const_args -> generic_const_args)
 - #154712 (Revert "`-Znext-solver` Remove the forced ambiguity hack from search graph")
 - #153614 (`FindParamInClause` handle edge-cases)
 - #154213 (tidy-alphabetical: fix line number in error message)
 - #154425 (Migrate transmute tests)
 - #154442 (Export `derive` at the crate root: `core::derive` and `std::derive`)
 - #154469 (mGCA: Lower spans for literal const args)
 - #154578 (Rename `probe_ty_var` to `try_resolve_ty_var`)
 - #154615 (Moving issues)
 - #154644 (rustdoc: seperate methods and associated functions in sidebar)
 - #154660 (Avoid creating async return opaques for foreign async fns)
 - #154671 (Add a test for a past ICE when calling a const fn of an unresolved type with the wrong number of args)
 - #154680 ([rustdoc] Replace `DocContext` with `TyCtxt` wherever possible)
 - #154709 (Revert `Ty` type alias in `rustc_type_ir`)
@JonathanBrouwer
Copy link
Copy Markdown
Contributor

@bors r-
#154718 (comment)

@rust-bors rust-bors bot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Apr 2, 2026
@rust-bors
Copy link
Copy Markdown
Contributor

rust-bors bot commented Apr 2, 2026

This pull request was unapproved.

This PR was contained in a rollup (#154718), which was unapproved.

Instead of just using regular struct lowering for these types, which
results in an incorrect ABI (e.g. returning indirectly), use
`BackendRepr::ScalableVector` which will lower to the correct type and
be passed in registers.

This also enables some simplifications for generating alloca of scalable
vectors and greater re-use of `scalable_vector_parts`.

A LLVM codegen test demonstrating the changed IR this generates is
included in the next commit alongside some intrinsics that make these
tuples usable.
Clang changed to representing tuples of scalable vectors as
structs rather than as wide vectors (that is, scalable vector types
where the `N` part of the `<vscale x N x ty>` type was multiplied by
the number of vectors). rustc mirrored this in the initial implementation
of scalable vectors.

Earlier versions of our patches used the wide vector representation and
our intrinsic patches used the legacy
`llvm.aarch64.sve.tuple.{create,get,set}{2,3,4}` intrinsics for creating
these tuples/getting/setting the vectors, which were only supported
due to LLVM's `AutoUpgrade` pass converting these intrinsics into
`llvm.vector.insert`. `AutoUpgrade` only supports these legacy intrinsics
with the wide vector representation.

With the current struct representation, Clang has special handling in
codegen for generating `insertvalue`/`extractvalue` instructions for
these operations, which must be replicated by rustc's codegen for our
intrinsics to use. This patch implements new intrinsics in
`core::intrinsics::scalable` (mirroring the structure of
`core::intrinsics::simd`) which rustc lowers to the appropriate
`insertvalue`/`extractvalue` instructions.
Generate debuginfo for scalable vectors, following the structure that
Clang generates for scalable vectors.
Abstract over the existing `simd_cast` intrinsic to implement a new
`sve_cast` intrinsic - this is better than allowing scalable vectors to
be used with all of the generic `simd_*` intrinsics.
@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Apr 3, 2026

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@davidtwco
Copy link
Copy Markdown
Member Author

@bors r=Amanieu

@rust-bors
Copy link
Copy Markdown
Contributor

rust-bors bot commented Apr 3, 2026

📌 Commit 957320c has been approved by Amanieu

It is now in the queue for this repository.

@rust-bors rust-bors bot added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 3, 2026
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Apr 3, 2026
various fixes for scalable vectors

These are a handful of patches fixing bugs with the current `#[rustc_scalable_vector]` infrastructure so that we can upstream the intrinsics to stdarch:

1. Instead of just using regular struct lowering for tuples of scalable vectors, which results in an incorrect ABI (e.g. returning indirectly), we change to using `BackendRepr::ScalableVector` which will lower to the correct type and be passed in registers.
2. Clang changed from representing tuples of scalable vectors as structs rather than as wide vectors (that is, scalable vector types where the `N` part of the `<vscale x N x ty>` type was multiplied by the number of vectors). rustc mirrored this in the initial implementation of scalable vectors that didn't land.

   When our early patches used the wide vector representation, our intrinsic patches used the legacy `llvm.aarch64.sve.tuple.{create,get,set}{2,3,4}` intrinsics for creating these tuples/getting/setting the vectors, which were only supported due to LLVM's `AutoUpgrade` pass converting these intrinsics into `llvm.vector.insert`. `AutoUpgrade` only supports these legacy intrinsics with the wide vector representation.

   With the current struct representation, Clang has special handling in codegen for generating `insertvalue`/`extractvalue` instructions for these operations, which must be replicated by rustc's codegen for our intrinsics to use.

   We add a new `core::intrinsics::scalable` module (never intended to be stable, just used by the stdarch intrinsics, gated by `core_intrinsics`) and add new intrinsics which rustc lowers to the appropriate `insertvalue`/`extractvalue` instructions.
3. Add generation of debuginfo for scalable vectors, following the DWARF that Clang generates.
4. Some intrinsics need something like `simd_cast`, which will work for scalable vectors too, so this implements a new `sve_cast` intrinsic that uses the same internals as `simd_cast`. This seemed better than permitting scalable vectors to be used with the `simd_*` intrinsics more generally as I can't guarantee this would work for all of them.

This is a relatively large patch but most of it is tests, and each commit should be relatively standalone. It's a little bit easier to upstream them together to avoid needing to stack them.  It's possible that some more compiler fixes will be forthcoming but it's looking like this might be all at the moment.

Depends on rust-lang#153653

r? @workingjubilee (discussed on Zulip)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. A-meta Area: Issues & PRs about the rust-lang/rust repository itself F-scalable-vectors `#[rustc_scalable_vector]` S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants