Releases: ProjectTorreyPines/AdaptiveArrayPools.jl
v0.3.1
What's New
Metal.jl Support (#34)
Full Apple Silicon GPU backend — same API as CUDA (`@with_pool :metal`), lazy/typed-lazy modes, full safety suite, and task-local multi-device pools. Requires Julia 1.11+. See 📖 Metal Backend.
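A minimal usage sketch. How the `:metal` backend selector composes with the pool binding is an assumption on my part; the surrounding form follows the `@with_pool pool function ...` example from the v0.1.2 notes, and `gpu_sum` is illustrative:

```julia
using AdaptiveArrayPools, Metal  # Julia 1.11+, Apple Silicon only

# Sketch only — `:metal` selects the task-local Metal pool, mirroring
# the CUDA backend's API as described in these notes.
@with_pool :metal pool function gpu_sum(n)
    tmp = acquire!(pool, Float32, n)  # pool-backed GPU buffer
    tmp .= 1.0f0
    return sum(tmp)                   # pool rewinds when the call returns
end
```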
Structural Mutation Detection (#35)
Catches `resize!`/`push!`/`append!` on pool-backed arrays:
- Compile-time: `PoolMutationError` at macro expansion (zero cost)
- Runtime: wrapper-vs-backing divergence check at rewind — advisory `@warn` with `maxlog=1` (pool self-heals on next `acquire!`)
Works across CPU, CUDA, and Metal. See 📖 Pool Safety.
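A sketch of the misuse these checks catch, assuming `acquire!` returns a pool-backed `Vector` as described above (`bad_idea` is illustrative):

```julia
using AdaptiveArrayPools

@with_pool pool function bad_idea()
    v = acquire!(pool, Float64, 16)
    # Structural mutation of a pool-backed array: raised as a
    # PoolMutationError at macro expansion when statically detectable;
    # otherwise flagged at rewind by the wrapper-vs-backing divergence
    # check (advisory @warn, maxlog=1), after which the pool self-heals
    # on the next acquire!.
    push!(v, 0.0)
    return sum(v)
end
```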
Other
- CUDA extension now gated behind Julia 1.11+ to match Metal
What's Changed
- (feat): Metal.jl (Apple Silicon GPU) backend support by @mgyoo86 in #34
- (feat): compile-time and runtime structural mutation detection by @mgyoo86 in #35
Full Changelog: v0.3.0...v0.3.1
v0.3.0
⚠️ Breaking Changes
Default `acquire!` returns `Array`
- `acquire!` now returns `Array{T,N}` instead of `SubArray`/`ReshapedArray`. On Julia 1.11+, `Array` is a mutable struct, enabling zero-allocation `setfield!` reuse — the same guarantee views had, with better FFI/`ccall` and type-constraint compatibility.
- New `acquire_view!` provides explicit opt-in to the old view behavior.
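A sketch contrasting the new default with the opt-in view path (return types per the notes above; `kernel` is illustrative):

```julia
@with_pool pool function kernel(n)
    a = acquire!(pool, Float64, n)       # v0.3.0+: returns Array{Float64,1}
    v = acquire_view!(pool, Float64, n)  # explicit opt-in to the old view behavior
    fill!(a, 0.0)                        # plain Array: FFI/ccall and type
    fill!(v, 1.0)                        # constraints accept it directly
    return sum(a) + sum(v)
end
```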
Remove `unsafe_*` API
- The entire `unsafe_*` API (`unsafe_acquire!`, `unsafe_zeros!`, `unsafe_ones!`, `unsafe_similar!`) is removed with no deprecation period.
Migration Guide
- `unsafe_acquire!` → `acquire!`
- `unsafe_zeros!` → `zeros!`
- `unsafe_ones!` → `ones!`
- `unsafe_similar!` → `similar!`
- If you relied on `acquire!` returning views: `acquire!` → `acquire_view!`
See the full 📖 Migration Guide for details.
What's Changed
Full Changelog: v0.2.6...v0.3.0
v0.2.6
Eliminate try-finally Overhead in @with_pool
@with_pool now uses direct rewind insertion instead of try-finally, enabling compiler inlining for ~15-25% speedup on hot-loop patterns.
New @safe_with_pool / @safe_maybe_with_pool macros preserve the old try-finally behavior for code that needs guaranteed exception cleanup.
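A sketch of when to prefer each variant, assuming both macros share the `@with_pool` calling form (`fast_step` and `may_throw` are illustrative):

```julia
# Hot loop: direct rewind insertion, inlinable (no try-finally).
@with_pool pool function fast_step(n)
    tmp = acquire!(pool, Float64, n)
    return sum(tmp)
end

# Code that may throw: @safe_with_pool keeps the old try-finally
# behavior, so the pool rewinds even on the error path.
@safe_with_pool pool function may_throw(x)
    tmp = acquire!(pool, Float64, 8)
    x < 0 && error("negative input")
    return sum(tmp)
end
```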
What's Changed
Full Changelog: v0.2.5...v0.2.6
v0.2.5
Fix Compile-Time Explosion in Nested @with_pool
Nested or @inline @with_pool could cause compile-time explosion due to exponential macro expansion. Fixed by replacing union splitting with a single compile-time type assertion.
Safety system simplified from 4-tier (0–3) to binary `RUNTIME_CHECK` (0 = off, 1 = on) via LocalPreferences.toml. `POOL_DEBUG`, `POOL_SAFETY_LV`, and `set_safety_level!` are removed.
See 📖 Safety and 📖 Configuration docs for details.
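A sketch of the corresponding LocalPreferences.toml entry. The section/key layout follows the standard Preferences.jl convention; only the `RUNTIME_CHECK` name is from these notes:

```toml
# LocalPreferences.toml (alongside your Project.toml)
[AdaptiveArrayPools]
RUNTIME_CHECK = 1  # 0 = off, 1 = on (replaces the old 4-tier POOL_SAFETY_LV)
```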
What's Changed
Full Changelog: v0.2.4...v0.2.5
v0.2.4
Hotfix: Eliminate Core.Box Allocation in @inline @with_pool
v0.2.3 introduced a Core.Box boxing regression (32–48 bytes) in @inline @with_pool functions, caused by closure-based dispatch interacting with try/finally boundaries after inlining.
Replaced with closureless union splitting on both CPU and CUDA — zero allocation on Julia 1.12+. On Julia <1.12, a minor 16-byte let-scope overhead may appear at top-level but disappears inside functions and hot loops.
What's Changed
Full Changelog: v0.2.3...v0.2.4
v0.2.3
New Features
Pool Safety System
Two-layer safety for catching pool misuse:
- Compile-time (`STATIC_POOL_CHECKS`): `@with_pool` AST analysis detects escaped pool-backed variables at macro-expansion time — zero runtime cost.
- Runtime (`POOL_SAFETY_LV`): progressive protection levels — L1 invalidates released slots via `resize!`/`setfield!`, L2 adds full borrow tracking with escape detection.
- Type-parameterized `AdaptiveArrayPool{S}`: encodes the safety level as a type parameter, enabling dead-code elimination at `S = 0` for true zero overhead. CPU and CUDA.
Unlimited Zero-Alloc Dimension Patterns
On Julia 1.11+, both CPU and CUDA now use setfield!-based wrapper reuse — unlimited dimension patterns per slot are zero-allocation on the hot path. The old 4-way set-associative cache (with its eviction limit) is removed on both backends.
Zero-Alloc reshape! Support (CPU & CUDA)
`reshape!(pool, A, dims...)` now works on both CPU and CUDA. Same-dim reshapes are in-place; cross-dim reshapes reuse cached wrappers — zero allocation after warmup.
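A sketch of the `reshape!` path described above (dims and the function name are illustrative):

```julia
@with_pool pool function reshape_demo()
    a  = acquire!(pool, Float64, 12)  # 1-D pool-backed buffer
    m  = reshape!(pool, a, 3, 4)      # cross-dim: reuses a cached wrapper
    m2 = reshape!(pool, m, 4, 3)      # zero allocation after warmup
    return size(m), size(m2)
end
```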
GPU Memory Preservation
_resize_to_fit! avoids GPU reallocation on CuVector shrink (CUDA resize! below 25% capacity triggers alloc→copy→free). Preserves VRAM across pool rewind cycles.
Limitations
The setfield!-based wrapper reuse and reshape! zero-allocation features require Julia 1.11+ on CPU. On Julia 1.10 (LTS), the CPU path retains the previous N-way set-associative cache behavior with CACHE_WAYS=4 eviction limit. CUDA is unaffected (always uses the new path).
What's Changed
- (refac): Slot-first architecture: remove view cache, isolate legacy by @mgyoo86 in #23
- (fix): `USE_POOLING=false` path fixes and 2-tier toggle rename by @mgyoo86 in #24
- (feat): add 2-tier pool safety — compile-time escape detection + runtime validation by @mgyoo86 in #25
- (feat): Type-parameterized safety dispatch (`Pool{S}`) — CPU path by @mgyoo86 in #26
- (feat): pool safety dispatch — CUDA path by @mgyoo86 in #27
- (feat): Avoid GPU realloc on CuVector shrink by @mgyoo86 in #28
- (feat): CUDA arr_wrappers — zero-alloc `CuArray` reuse via `setfield!` by @mgyoo86 in #29
Full Changelog: v0.2.2...v0.2.3
v0.2.2
New Feature
`reshape!(pool, A, dims...)` — Zero-allocation array reshaping via the pool wrapper cache. On Julia 1.11+, cross-dim reshapes reuse cached wrappers with no allocation after warmup. (docs 📖)
Code Quality
- Adopt Runic.jl formatter with CI enforcement
What's Changed
Full Changelog: v0.2.1...v0.2.2
v0.2.1
Bug Fix
Fallback Type Memory Leak
- Fixed a memory leak where non-fixed-slot types (e.g., `ForwardDiff.Dual`) were not properly reclaimed on `rewind!` during repeated `@with_pool` calls. This caused `n_active` to grow unboundedly in workloads like `ForwardDiff.gradient` with pooled interpolation.
- No performance regression — the fix is resolved entirely at compile time. CUDA extension included.
What's Changed
Full Changelog: v0.2.0...v0.2.1
v0.2.0
What's New
Lazy Selective Rewind
@with_pool now defers per-type checkpoints until first acquire! and rewinds only the pools actually touched, instead of all 8 fixed-slot types. Up to 6.5× faster in common patterns with helper functions.
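To make the claim concrete: with lazy selective rewind, a block that only ever acquires one element type checkpoints and rewinds only that type's pool (a sketch; `helper!` and `lazy_demo` are illustrative):

```julia
# Only Float64 is touched inside the block, so only the Float64
# fixed-slot pool is checkpointed (lazily, at first acquire!) and
# rewound — not all 8 fixed-slot types as before.
helper!(pool, n) = acquire!(pool, Float64, n)  # helper acquisitions count too

@with_pool pool function lazy_demo(n)
    a = helper!(pool, n)
    b = acquire!(pool, Float64, n)
    return sum(a) + sum(b)
end
```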
Bitmask-Aware Type Tracking
Per-type bitmask tracking replaces the boolean _untracked_flags system. When helper functions acquire types already tracked by the macro, the fast typed path is preserved via subset check (untracked ⊆ tracked).
Internal Naming Refactor
Internal identifiers renamed to reflect lazy-selective-rewind architecture (e.g. _mark_untracked! → _record_type_touch!, _depth_only_checkpoint! → _lazy_checkpoint!). Magic hex literals replaced with named constants. No user-facing API changes.
What's Changed
- (feat): Bitmask-Aware Untracked Tracking for `@with_pool` by @mgyoo86 in #16
- (perf): Dynamic Selective Rewind & Typed-Fallback Optimization by @mgyoo86 in #17
- (refactor): rename internals to match evolved architecture by @mgyoo86 in #18
Full Changelog: v0.1.2...v0.2.0
v0.1.2
What's New
⚠️ Breaking: Unified Bit Type API
`acquire!(pool, Bit, n)` now returns `BitVector` instead of `SubArray{Bool}`.
Why: native `BitVector` uses SIMD-optimized chunk algorithms, making operations like `count()`, `sum()`, and bitwise broadcasting 10×–100× faster than `SubArray{Bool}` views.
```julia
# Before (v0.1.1): returned SubArray{Bool}
# After  (v0.1.2): returns native BitVector (SIMD-optimized)
@with_pool pool function foo()
    bv = acquire!(pool, Bit, 10_000)
    # Operations on packed bits are significantly faster
    c = count(bv)  # 10x–100x speedup vs the view behavior
end
```
Migration: No code changes needed for typical usage. Only affects code explicitly type-checking for `SubArray`.
What's Changed
Full Changelog: v0.1.1...v0.1.2