[ExecuTorch][WebGPU] linear_q4gsw test suite: Llama-1B shapes + 4k/8k sweep #20227
JulianCloudNTH wants to merge 2 commits into
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20227
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEVs
There are 1 currently active SEVs. If your PR is affected, please view them below:
❌ 3 New Failures, 5 Pending, 1 Unrelated Failure
As of commit cc8ef26 with merge base 5526971:
NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following job failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This PR needs a
@claude review
Claude finished @JulianCloudNTH's task in 1m 32s. Code Review: PR #20227
… sweep
Pull Request resolved: #20227
Adds the numerical test suite for `et_vk.linear_q4gsw` (stacked on the op diff), mirroring the SDPA test suite. A named CONFIGS sweep covers real Llama-3.2-1B linear shapes: q/o-proj (2048->2048), k/v-proj (2048->512), gate/up-proj (2048->8192), down-proj (8192->2048), and lm_head (2048->128256), plus 4k/8k large-token prefill (M=4096/8192 on the 2048->2048 and 2048->512 projections).
`test/ops/quantized_linear/test_quantized_linear.py` exports each config's `.pte` + an fp64 dequant-matmul "truth" golden; `test/test_webgpu_native.cpp` reconstructs the deterministic ramp input bit-for-bit, runs the op on the GPU, and compares per element; `scripts/test_webgpu_native_ci.sh` wires the fixtures into the Dawn(Tint)+SwiftShader CI.
ghstack-source-id: 392908895
@exported-using-ghexport
Differential Revision: [D108314849](https://our.internmc.facebook.com/intern/diff/D108314849/)
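The shape sweep described above can be pictured as a small table of named configs. The following is only a sketch: the `LinearConfig` type, the config names, `BASE_M`, and `packed_weight_bytes` are hypothetical and not taken from the PR. The (K, N) pairs and the prefill M values come from the description; M for the base shapes is not stated there, so `BASE_M = 1` is an assumption, as is the packing of two 4-bit weights per byte.

```python
from typing import NamedTuple


class LinearConfig(NamedTuple):
    name: str
    m: int  # number of tokens (rows of the activation matrix)
    k: int  # input features
    n: int  # output features


# (K, N) pairs as listed in the PR description for Llama-3.2-1B.
LLAMA_1B_SHAPES = {
    "q_o_proj": (2048, 2048),
    "k_v_proj": (2048, 512),
    "gate_up_proj": (2048, 8192),
    "down_proj": (8192, 2048),
    "lm_head": (2048, 128256),
}

BASE_M = 1  # assumption: the description does not give M for the base shapes
PREFILL_MS = (4096, 8192)  # 4k/8k large-token prefill
PREFILL_PROJS = ("q_o_proj", "k_v_proj")  # the 2048->2048 and 2048->512 projections

CONFIGS = [
    LinearConfig(name, BASE_M, k, n) for name, (k, n) in LLAMA_1B_SHAPES.items()
]
CONFIGS += [
    LinearConfig(f"{name}_prefill_{m}", m, *LLAMA_1B_SHAPES[name])
    for name in PREFILL_PROJS
    for m in PREFILL_MS
]


def packed_weight_bytes(cfg: LinearConfig) -> int:
    """Weight storage size, assuming two 4-bit values are packed per byte."""
    return cfg.k * cfg.n // 2


for cfg in CONFIGS:
    print(f"{cfg.name}: M={cfg.m} K={cfg.k} N={cfg.n} "
          f"weights={packed_weight_bytes(cfg)} bytes")
```

Keeping the sweep as named data lets the exporter and the native test iterate over the same list, so a failure can be reported by config name.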
SS-JIA left a comment
Review automatically exported from Phabricator review in Meta.
Stack from ghstack (oldest at bottom):
Adds the numerical test suite for `et_vk.linear_q4gsw` (stacked on the op diff), mirroring the SDPA test suite. A named CONFIGS sweep covers real Llama-3.2-1B linear shapes: q/o-proj (2048->2048), k/v-proj (2048->512), gate/up-proj (2048->8192), down-proj (8192->2048), and lm_head (2048->128256), plus 4k/8k large-token prefill (M=4096/8192 on the 2048->2048 and 2048->512 projections).

`test/ops/quantized_linear/test_quantized_linear.py` exports each config's `.pte` + an fp64 dequant-matmul "truth" golden; `test/test_webgpu_native.cpp` reconstructs the deterministic ramp input bit-for-bit, runs the op on the GPU, and compares per element; `scripts/test_webgpu_native_ci.sh` wires the fixtures into the Dawn(Tint)+SwiftShader CI.

Differential Revision: D108314849
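To make the golden-and-compare flow concrete, here is a minimal pure-Python sketch of the three pieces the description names: a deterministic ramp input, an fp64 dequant-matmul reference, and a per-element comparison. Everything in it is an assumption for illustration, not the PR's code: it assumes `q4gsw` means 4-bit, group-wise, symmetric weight quantization, stores weights as unpacked integers in [-8, 7], and uses a made-up ramp formula and made-up tolerances. Python floats are IEEE-754 doubles, so the arithmetic is fp64.

```python
import math


def ramp_input(m, k):
    """Hypothetical deterministic ramp: multiples of 1/8 in [-1, 1].

    Every value is exactly representable in binary floating point, so any
    other implementation of the same formula reproduces it bit-for-bit.
    """
    return [[((i * k + j) % 17 - 8) / 8.0 for j in range(k)] for i in range(m)]


def dequant_matmul_golden(x, q, scales, group_size):
    """fp64 reference: out[i][n] = sum_k x[i][k] * q[n][k] * scale[n][k // group_size].

    q is N x K with integer values in [-8, 7] (symmetric, no zero point);
    scales is N x (K // group_size).
    """
    out = []
    for row in x:
        out_row = []
        for n, q_row in enumerate(q):
            terms = [
                row[k] * q_row[k] * scales[n][k // group_size]
                for k in range(len(row))
            ]
            out_row.append(math.fsum(terms))  # accurately rounded sum
        out.append(out_row)
    return out


def mismatches(actual, golden, atol=1e-3, rtol=1e-3):
    """Per-element check; returns (row, col, actual, golden) for each failure.

    The tolerances are placeholders, not the values used by the PR.
    """
    bad = []
    for i, (a_row, g_row) in enumerate(zip(actual, golden)):
        for j, (a, g) in enumerate(zip(a_row, g_row)):
            if abs(a - g) > atol + rtol * abs(g):
                bad.append((i, j, a, g))
    return bad


# Tiny example: M=2, K=4, N=2, group_size=2.
x = ramp_input(2, 4)
q = [[1, -2, 3, -4], [7, -8, 0, 1]]
scales = [[0.5, 0.25], [1.0, 2.0]]
golden = dequant_matmul_golden(x, q, scales, group_size=2)
print(golden[0][0])  # 0.4375
```

In the real suite the `actual` values would come from the GPU run; comparing against a higher-precision reference rather than another low-precision implementation keeps the test from inheriting the rounding behaviour of whatever it is checked against.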