Skip to content

Revert "nvptx: use simd_fmin and simd_fmax for minnum and maxnum"#2054

Closed
RalfJung wants to merge 1 commit intorust-lang:mainfrom
RalfJung:nvptx-fminmax
Closed

Revert "nvptx: use simd_fmin and simd_fmax for minnum and maxnum"#2054
RalfJung wants to merge 1 commit intorust-lang:mainfrom
RalfJung:nvptx-fminmax

Conversation

@RalfJung
Copy link
Member

Same as #2052, but for nvptx.

This leaves only s390x as remaining user of simd_fmin/max in this repo:
https://github.com/rust-lang/stdarch//blob/72f7d39ae9215e862ebe6f14c88b2e17de2599e7/crates/core_arch/src/s390x/vector.rs#L830-L831
This is in a test_impl! macro which makes it sound like it's only used for tests, but the macro doesn't look like it is test-only, so I am not sure. Already the initial implementation in https://github.com/rust-lang/stdarch//commit/d5a389f1dd6369cd87b36bac887a5c7a27a28b4c used simd_fmin/max so I am not sure what the LLVM intrinsics for this would even be.

r? @folkertdev

Comment on lines +11 to 14
#[link_name = "llvm.minnum.v2f16"]
fn llvm_f16x2_minnum(a: f16x2, b: f16x2) -> f16x2;
#[link_name = "llvm.minimum.v2f16"]
fn llvm_f16x2_minimum(a: f16x2, b: f16x2) -> f16x2;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, looking at this, this seems to still invoke the same LLVM intrinsic, not something nvptx-specific? That doesn't seem right...

Copy link
Member Author

@RalfJung RalfJung Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the nvptx docs, they don't mention signaling NaN but if we assume they treat all NaN the same then using llvm.minnum is wrong because LLVM can (and will) fold minnum(SNaN, x) to QNaN whereas the docs say that must return x.

They also don't mention singed zero handling, but the instruction is specified as (a < b) ? a : b which would mean that if both inputs are zero then the 2nd input is returned. None of the variants of min/max in LLVM makes that guarantee (i.e., llvm.minimum is also not a correct implementation for f16x2_min_nan).

@RalfJung
Copy link
Member Author

I'll close the PR as this doesn't actually change which LLVM intrinsic is used, it's minnum either way.

@RalfJung RalfJung closed this Mar 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants