(feat): Avoid GPU realloc on CuVector shrink #28
CUDA.jl's resize! triggers pool_alloc + copy + pool_free when shrinking below 25% of capacity (n < cap÷4). This is expensive for pool operations:
- Safety invalidation: resize!(vec, 0) on every released slot
- Acquire path: resize!(vec, smaller) when reusing slots

Solution: _resize_without_shrink!(A, n) uses setfield!(:dims) for shrink (zero-cost, GPU memory preserved via maxsize) and delegates to resize! for grow. Grow-back after shrink reuses the existing allocation, since CUDA.jl sees n ≤ cap from the preserved maxsize.

Changes:
- Add _resize_without_shrink! in CUDA ext acquire.jl
- Replace resize! in get_view! acquire path
- Add _resize_without_shrink!(vec, 0) to _invalidate_released_slots! (CUDA Level 1 now has structural invalidation matching CPU behavior)
- Update Level 1 safety tests: verify length==0 + poison via re-acquire
- Add _resize_without_shrink! pointer-preservation tests
…d guard
- Update debug.jl header: Level 1 now includes structural invalidation via _resize_without_shrink!(vec, 0), not just poisoning
- Fix acquire.jl cache invalidation comment: clarify grow vs shrink pointer stability semantics
- Update get_view! docstring to describe _resize_without_shrink! behavior
- Add @assert hasfield(CuArray, :dims) compile-time guard for CUDA.jl internal API compatibility
Codecov Report
✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

Coverage Diff
| | master | #28 | +/- |
|---|---|---|---|
| Coverage | 96.79% | 74.57% | -22.22% |
| Files | 14 | 14 | |
| Lines | 2620 | 2616 | -4 |
| Hits | 2536 | 1951 | -585 |
| Misses | 84 | 665 | +581 |
Pull request overview
Adds a CUDA-specific resize strategy to avoid costly GPU reallocations when pooled CuVectors are shrunk (especially to length 0), restoring the pool’s “no GPU allocation on reuse” behavior and making CUDA safety level 1 structurally invalidate released slots like the CPU backend.
Changes:
- Introduces `_resize_without_shrink!` for `CuVector` shrink operations (dims-only shrink; grow delegates to `resize!`) and uses it in the CUDA acquire path.
- Updates CUDA safety level 1 invalidation to shrink released vectors to length 0 after poisoning (without freeing GPU memory).
- Adjusts CUDA tests to assert length→0 after rewind and adds pointer-preservation tests for `_resize_without_shrink!`.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| ext/AdaptiveArrayPoolsCUDAExt/acquire.jl | Adds `_resize_without_shrink!` and uses it to avoid GPU realloc on shrink in `get_view!`. |
| ext/AdaptiveArrayPoolsCUDAExt/debug.jl | Level 1 invalidation now shrinks released vectors to length 0 using `_resize_without_shrink!` after poisoning. |
| test/cuda/test_cuda_safety.jl | Updates safety tests to reflect structural invalidation (length 0) and verifies poisoning via re-acquire. |
| test/cuda/test_allocation.jl | Adds tests ensuring shrink/grow-back preserves the GPU pointer when within maxsize. |
Background
Pool operations frequently `resize!` backing vectors. On CPU, shrinking preserves capacity, so this is not a problem. On CUDA, shrinking below 25% of capacity triggers a full GPU reallocation (alloc → copy → free). `resize!(v, 0)` in particular always reallocates, defeating the pool's zero-allocation guarantee.

Solution
`_resize_without_shrink!` changes only the logical size (`dims`) on shrink, preserving the GPU allocation. When the vector grows back to the same size, the existing memory is reused with no reallocation.

Impact
… `resize!(vec, 0)` behavior. Previously CUDA could only poison without structural invalidation due to the memory-free issue.

Compatibility
… `:dims`, `.maxsize`): `@assert hasfield` compile-time guard added (tested with CUDA.jl v5.x)
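
The shrink/grow behavior described under Solution can be illustrated with a small CPU-only mock. This is a sketch under stated assumptions, not the code from this PR: `MockDeviceVector`, `mock_resize!`, and `resize_without_shrink_sketch!` are hypothetical names, nothing here calls CUDA.jl, and `mock_resize!` only imitates the reallocation rule quoted above (reallocate when growing past capacity or shrinking below 25% of it).

```julia
# CPU-only mock; MockDeviceVector is NOT CUDA.jl's CuArray.
const ELSIZE = sizeof(Float32)   # pretend the element type is Float32
const REALLOCS = Ref(0)          # counts simulated GPU reallocations

mutable struct MockDeviceVector
    buffer_id::Int     # stands in for the device pointer
    maxsize::Int       # allocated capacity in bytes
    dims::Tuple{Int}   # logical length
end

MockDeviceVector(n::Integer) = MockDeviceVector(0, n * ELSIZE, (Int(n),))

Base.length(v::MockDeviceVector) = v.dims[1]
capacity(v::MockDeviceVector) = v.maxsize ÷ ELSIZE

# Guard in the spirit of the PR's hasfield check: fail early if the
# field this sketch mutates directly does not exist.
@assert hasfield(MockDeviceVector, :dims)

# Assumed stand-in for the resize! behavior the PR describes:
# a new buffer whenever n > cap or n < cap ÷ 4.
function mock_resize!(v::MockDeviceVector, n::Integer)
    cap = capacity(v)
    if n > cap || n < cap ÷ 4
        REALLOCS[] += 1
        v.buffer_id = REALLOCS[]   # "pointer" changes
        v.maxsize = n * ELSIZE
    end
    v.dims = (Int(n),)
    return v
end

# Shrink: touch only dims, keep buffer and maxsize.
# Grow: delegate to the regular resize path.
function resize_without_shrink_sketch!(v::MockDeviceVector, n::Integer)
    if n < length(v)
        setfield!(v, :dims, (Int(n),))
    elseif n > length(v)
        mock_resize!(v, n)
    end
    return v
end

# Shrink to 0 and grow back: same buffer, no simulated reallocation.
v = MockDeviceVector(1024)
resize_without_shrink_sketch!(v, 0)
resize_without_shrink_sketch!(v, 1024)

# For contrast, the plain resize path reallocates on shrink to 0.
w = MockDeviceVector(1024)
mock_resize!(w, 0)
```

In this mock, `v` keeps its original `buffer_id` and `maxsize` across the shrink and grow-back, while `w` receives a new buffer. Whether the real `CuArray` behaves the same way depends on CUDA.jl internals, which is why the PR pins the assumption with a `hasfield` guard.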