exploit: inline function hooking of cudaEventElapsedTime by nataliakokoromyti · Pull Request #28 · gpu-mode/pygpubench

nataliakokoromyti · 2026-03-15T00:46:03Z

The benchmark times your GPU code by calling a CUDA runtime function, cudaEventElapsedTime. We find that function in memory, make its code page writable, and overwrite its first few bytes so it always returns a fake low number. The kernel still computes the right answer, but the harness reads a fake timing: 1 µs instead of 49 µs. Tested on L4 (SM89).

Technique:

ctypes.CDLL('libcudart.so') → resolve cudaEventElapsedTime address
mmap(PROT_READ|PROT_WRITE|PROT_EXEC) → allocate shellcode page
mprotect(PROT_READ|PROT_WRITE|PROT_EXEC) on the target function's code page (allowed: seccomp only restricts fork/clone)
ctypes.memmove → overwrite function prologue with jump to shellcode
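
The address arithmetic behind these steps can be sketched in Python as follows. This is an illustrative sketch, not the code from this PR: the helper names (`symbol_address`, `page_span`, `abs_jump_stub`) are hypothetical, the host is assumed to be x86-64 Linux, and the `movabs rax, imm64; jmp rax` encoding is one common way to build an absolute jump rather than necessarily the one the submission uses. The sketch only computes addresses and bytes; it does not call mprotect or write to any memory.

```python
import ctypes
import mmap
import struct
from typing import Tuple

PAGE_SIZE = mmap.PAGESIZE  # typically 4096 on x86-64 Linux


def symbol_address(lib: ctypes.CDLL, name: str) -> int:
    """Return the in-process address of an exported symbol (hypothetical helper)."""
    return ctypes.cast(getattr(lib, name), ctypes.c_void_p).value


def page_span(addr: int, length: int, page_size: int = PAGE_SIZE) -> Tuple[int, int]:
    """Return the page-aligned (start, size) covering [addr, addr + length).

    mprotect requires a page-aligned start address, and a patch near the end
    of a page can spill into the next one, so both pages must be covered.
    """
    start = addr & ~(page_size - 1)
    end = (addr + length + page_size - 1) & ~(page_size - 1)
    return start, end - start


def abs_jump_stub(target: int) -> bytes:
    """Encode an x86-64 absolute jump: movabs rax, target ; jmp rax (12 bytes)."""
    return b"\x48\xb8" + struct.pack("<Q", target) + b"\xff\xe0"
```

With these helpers, the patch described above would overwrite the first 12 bytes of the target function with `abs_jump_stub(shellcode_addr)`, after changing the protection of the range returned by `page_span(target_addr, 12)`.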

Refactor inline function hooking documentation for clarity and conciseness.

nataliakokoromyti added 4 commits March 14, 2026 17:44

exploit: inline function hooking of cudaEventElapsedTime

e4d26b5

fix submission_function_detour.py

a64a26c

Refactor inline function hooking documentation for clarity and conciseness.

remove run_all.py change

b636220

add function detour to run_all.py

7b211bb

nataliakokoromyti wants to merge 4 commits into gpu-mode:master from nataliakokoromyti:exploit/function-detour
