Description
Trying to build the new SME kernels fails with this error in llvm-21:
error: Incorrect size for ssyrk_direct_sme1_2VLx2VL prologue: 108 bytes of instructions in range, but .seh directives corresponding to 96 bytes
This seems to be fixed in llvm-22: the newer version moves the rdsvl after .seh_endprologue, which avoids the SEH metadata issue.
But it runs into this error in llvm-22:
LLVM ERROR: SME hazard padding is not supported on Windows
Further analysis of the PR suggests the combination of alloca with SME is not implemented on Windows:
| LLVM Version | Behavior |
|---|---|
| 21.1.7 (old) | Generates invalid .seh_save_reg x9/x0 directives |
| 22.0.0git (new) | Explicitly errors: "SME hazard padding is not supported on Windows" |
The newer LLVM "fixed" the assembler issue by rejecting the combination entirely rather than fixing the SEH generation. The check is in AArch64PrologueEpilogue.cpp and triggers when hasSVECalleeSavesAboveFrameRecord and hasStackHazardSlotIndex are both true.
The fatal usage error was added in https://github.com/llvm/llvm-project/pull/138609/files#diff-abc2806d934cd854992bdf139a4ab9405859556f01b2fc4ab17aa6b8a50e72c4R1902-R1918
$ /home/jameson/llvm-mingw/build/bin/aarch64-w64-mingw32-clang -g -DSMALL_MATRIX_OPT -DGEMM_GEMV_FORWARD -DSBGEMM_GEMV_FORWARD -DBGEMM_GEMV_FORWARD -DMS_ABI -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -DDYNAMIC_ARCH -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=512 -DMAX_PARALLEL_NUMBER=1 -DBUILD_BFLOAT16 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.30.dev\" -march=armv9-a+sve2+sme -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=ssyrk_direct_alpha_betaLT_ARMV9SME -DASMFNAME=ssyrk_direct_alpha_betaLT_ARMV9SME_ -DNAME=ssyrk_direct_alpha_betaLT_ARMV9SME_ -DCNAME=ssyrk_direct_alpha_betaLT_ARMV9SME -DCHAR_NAME=\"ssyrk_direct_alpha_betaLT_ARMV9SME_\" -DCHAR_CNAME=\"ssyrk_direct_alpha_betaLT_ARMV9SME\" -DNO_AFFINITY -DTS=_ARMV9SME -I.. -DBUILD_KERNEL -DTABLE_NAME=gotoblas_ARMV9SME -DHAVE_SME -march=armv9-a+sve2+sme -DBUILD_KERNEL -DTABLE_NAME=gotoblas_ARMV9SME -UDOUBLE -UCOMPLEX -UDOUBLE -UCOMPLEX -UUPPER -DTRANSA ../kernel/arm64/ssyrk_direct_alpha_beta_arm64_sme1.c -Wclang -o /dev/null -c
warning: unknown warning option '-Wclang'; did you mean '-Wvla'? [-Wunknown-warning-option]
In file included from ../kernel/arm64/ssyrk_direct_alpha_beta_arm64_sme1.c:6:
In file included from ../common.h:61:
../config_kernel.h:22:9: warning: 'HAVE_SME' macro redefined [-Wmacro-redefined]
22 | #define HAVE_SME
| ^
<command line>:35:9: note: previous definition is here
35 | #define HAVE_SME 1
| ^
In file included from ../kernel/arm64/ssyrk_direct_alpha_beta_arm64_sme1.c:6:
../common.h:378:9: warning: 'YIELDING' macro redefined [-Wmacro-redefined]
378 | #define YIELDING __asm__ __volatile__ ("nop;nop;nop;nop;nop;nop;nop;nop; \n");
| ^
../common.h:373:9: note: previous definition is here
373 | #define YIELDING SwitchToThread()
| ^
../kernel/arm64/ssyrk_direct_alpha_beta_arm64_sme1.c:191:35: warning: passing 'const float *' to parameter of type 'float *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
191 | kernel_2x2(&a_ptr[row_idx], &b_ptr[col_idx],
| ^~~~~~~~~~~~~~~
../kernel/arm64/ssyrk_direct_alpha_beta_arm64_sme1.c:35:35: note: passing argument to parameter 'B' here
35 | kernel_2x2(const float *A, float *B, float *C, size_t shared_dim,
| ^
../kernel/arm64/ssyrk_direct_alpha_beta_arm64_sme1.c:21:17: warning: unused function 'sve_cntw' [-Wunused-function]
21 | static uint64_t sve_cntw() {
| ^~~~~~~~
error: Incorrect size for ssyrk_direct_sme1_2VLx2VL prologue: 108 bytes of instructions in range, but .seh directives corresponding to 96 bytes
5 warnings and 1 error generated.
It appears the code may be treated as if it uses alloca because of its use of vscale functions.
This is clearly an upstream bug, but it might be necessary to disable those new AArch64 kernels on Windows with Clang?
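One possible shape for such a workaround is a compile-time guard, sketched below. This is only an illustration: `SME_KERNELS_DISABLED` and `sme_kernels_enabled()` are hypothetical names that do not exist in OpenBLAS, and the real fix would more likely live in the build system's kernel selection. The sketch relies only on the standard predefined macros `_WIN32`, `__clang__` and `__aarch64__`.

```c
/* Hypothetical guard: SME_KERNELS_DISABLED is an illustrative name,
 * not an existing OpenBLAS macro. */
#if defined(_WIN32) && defined(__clang__) && defined(__aarch64__)
   /* Clang targeting Windows on AArch64: skip the SME kernels until
    * the upstream SEH / hazard-padding issue is resolved. */
#  define SME_KERNELS_DISABLED 1
#else
#  define SME_KERNELS_DISABLED 0
#endif

/* Returns 1 when the SME kernels may be compiled in, 0 otherwise. */
int sme_kernels_enabled(void) {
    return !SME_KERNELS_DISABLED;
}
```

A kernel source file could then wrap its SME code in `#if !SME_KERNELS_DISABLED`, so other toolchains and platforms are unaffected.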