exploit: inline function hooking of cudaEventElapsedTime #28

Open
nataliakokoromyti wants to merge 4 commits into gpu-mode:master from nataliakokoromyti:exploit/function-detour
nataliakokoromyti (Contributor) commented Mar 15, 2026

The benchmark times your GPU code by calling the CUDA runtime function cudaEventElapsedTime. We locate that function in memory, make its code page writable, and overwrite its first few bytes so that it always returns a fake, low elapsed time. The kernel still computes the right answer, but the harness reads the fake timing: 1 µs instead of 49 µs. Tested on an L4 (SM89).

Technique:

  1. ctypes.CDLL('libcudart.so') → resolve the address of cudaEventElapsedTime
  2. mmap(PROT_READ|PROT_WRITE|PROT_EXEC) → allocate a page for the shellcode
  3. mprotect(PROT_READ|PROT_WRITE|PROT_EXEC) on the target function's code page (allowed: seccomp only restricts fork/clone)
  4. ctypes.memmove → overwrite the function prologue with a jump to the shellcode
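
One detail the list leaves implicit is that mprotect only accepts page-aligned addresses, so the resolved function address has to be rounded down to a page boundary before step 3. Below is a minimal sketch of the address resolution in step 1 and that alignment arithmetic only; it does not change any protections or write any bytes. The helper names `resolve_symbol` and `page_span` and the 12-byte patch length are hypothetical, chosen for illustration, and are not taken from the PR's code.

```python
import ctypes
import mmap
from typing import Optional, Tuple


def resolve_symbol(lib_name: Optional[str], symbol: str) -> int:
    """Return the in-process address of `symbol` exported by `lib_name`.

    Hypothetical helper. Passing None for lib_name searches the libraries
    already loaded into the current process (POSIX behaviour of dlopen).
    """
    lib = ctypes.CDLL(lib_name)
    func = getattr(lib, symbol)
    # Casting the function pointer to void* exposes its raw address.
    return ctypes.cast(func, ctypes.c_void_p).value


def page_span(addr: int, length: int,
              page_size: int = mmap.PAGESIZE) -> Tuple[int, int]:
    """Return (page-aligned start, size in bytes) covering addr..addr+length.

    mprotect needs a page-aligned start address; a patch that straddles a
    page boundary needs both pages covered.
    """
    start = addr & ~(page_size - 1)
    end = (addr + length + page_size - 1) & ~(page_size - 1)
    return start, end - start


if __name__ == "__main__":
    # Assumption: libcudart.so is resolvable by the dynamic loader here.
    addr = resolve_symbol("libcudart.so", "cudaEventElapsedTime")
    # Assumption: a 12-byte patch length, used only for the arithmetic.
    start, size = page_span(addr, 12)
    print(hex(addr), hex(start), size)
```

With 4 KiB pages, a patch that starts 8 bytes before a page boundary yields a two-page span, which is why the length is passed in alongside the address.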
