Change default to JIT without fma #2076
lgritz wants to merge 1 commit into AcademySoftwareFoundation:main from
Conversation
The ShadingSystem option "llvm_jit_fma" seems to advertise that it switches between no fused multiply-add (the default, 0) and allowing fma to be generated in the JIT.

The underlying LLVM option is actually a 3-way switch about how to utilize fused ops: Strict (no fused ops), Standard (allow fma but not other fused ops), or Fast (allow other fused ops when it will speed things up). Aside: "Standard" is really confusing, as one might think it means the original standard IEEE 754 definitions that lacked fma. These are not the same!

We were using llvm_jit_fma to switch between Standard and Fast, but upon closer examination, we now believe that it should control a switch between Strict and Standard, and Strict should be the default.

This seems to directly change the pesky render-microfacet test, which has always seemed to have an unusual sensitivity to the specific processor. Hopefully, users will find that across the board, it leads to their renderers having fewer minor pixel differences from one platform to another.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
ThiagoIze left a comment
Looks good to me. I added a suggestion for cleaning up the reference images but that could be done in another PR another day.
I wonder, can we now perhaps remove some of these alt reference images after updating the main reference image? And if so, can we lower the fail thresholds back down? And if the answer is yes to all that, I wonder if the same could be said for other tests?
Possibly. I don't claim that with this change, this test or any other no longer has any minor platform-to-platform differences. But I hope that they're fewer now, or at least less prone to needing new reference images added in the future as minor changes to hardware or dependencies are rolled out to the GHA runner pool. (And even if not, this PR has merit by making the internal behavior of this option match what it's documented to do.)
For those tests with wider thresholds or multiple reference images, it's quite tedious to determine which ones can be eliminated or how much the threshold can be reduced. The task is basically trial and error of eliminating one ref image, pushing the change, and seeing if it breaks one of the 20-ish CI tests or not.