(fix): Compile-Time Explosion in @inline @with_pool + Binary RUNTIME_CHECK #31

Merged: mgyoo86 merged 7 commits into master from fix/compile_time_explosion on Mar 12, 2026
@mgyoo86 (Member) commented Mar 12, 2026

Problem

Nesting @inline @with_pool 3+ levels deep caused an exponential compile-time blowup (4^N branches from union splitting). Three levels of nesting took 64 seconds to compile, against an expected ~1 s.

Root cause: the macro generated 4-branch dispatch (S=0,1,2,3) that got inlined and multiplied at each nesting level.
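
As a rough illustration of the arithmetic (not code from this PR): if a k-way dispatch is inlined at each of N nesting levels, the number of code paths the compiler may need to specialize grows as k^N. The helper name `inlined_paths` below is hypothetical.

```julia
# Hypothetical helper: code paths produced when a k-way branch is
# inlined into itself at each of n nesting levels.
inlined_paths(k::Integer, n::Integer) = k^n

inlined_paths(4, 3)  # 64 paths with the old 4-branch dispatch at 3 levels
inlined_paths(1, 3)  # 1 path when each level has a single branch
```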

Solution

Replace the 4-tier runtime-togglable safety system with a binary compile-time constant. One type parameter, one branch — no union splitting, no exponential blowup.

| | Before | After |
|---|---|---|
| Safety levels | 4 (0=off, 1=guard, 2=full, 3=debug) | 2 (0=off, 1=all) |
| Dispatch | Union splitting: 4 branches per @with_pool | Type assertion: pool::Pool{RUNTIME_CHECK} |
| 3-level nested TTFX | 64 seconds | ~1.4 seconds |
| Configuration | set_safety_level!(N) at runtime | LocalPreferences.toml + restart |

34 files changed, +815 -1530 (net -715 lines)

What Changed

Removed

  • set_safety_level!(), set_runtime_check!(), POOL_DEBUG — runtime toggles
  • Union splitting in macro codegen (the 4^N explosion source)
  • Multi-level thresholds (S >= 2 poisoning, S >= 3 borrow tracking)

Added

  • RUNTIME_CHECK::Int — compile-time const, 0 = off, 1 = all checks
    • Set via runtime_check in LocalPreferences.toml (accepts true/1)
  • _runtime_check(pool) -> Bool: {0} → false, {S>=1} → true
  • Macro type assertion: pool::PoolType{RUNTIME_CHECK} (single branch)
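
The items above can be sketched in a few lines. This is a minimal stand-in under stated assumptions, not the package's actual definitions: `DemoPool`, `DEMO_RUNTIME_CHECK`, `demo_runtime_check`, `make_demo_pool`, and `use_pool` are all hypothetical names chosen for illustration.

```julia
# Hypothetical pool type carrying the check level as a type parameter.
struct DemoPool{S}
    data::Vector{Float64}
end

# Stand-in for the compile-time constant; in the real package the value
# is described as coming from LocalPreferences.toml.
const DEMO_RUNTIME_CHECK = 1

# {0} -> false, any other S -> true, resolved by dispatch at compile time.
demo_runtime_check(::DemoPool{0}) = false
demo_runtime_check(::DemoPool{S}) where {S} = true

# Accepts Bool or Int and encodes it in the type parameter.
make_demo_pool(check::Integer) = DemoPool{Int(check)}(Float64[])

# A single type assertion against the constant: one concrete type,
# so there is nothing for the compiler to union-split.
function use_pool(p)
    pool = p::DemoPool{DEMO_RUNTIME_CHECK}
    return demo_runtime_check(pool)
end
```

In this sketch, `use_pool(make_demo_pool(true))` returns `true`, while passing a `DemoPool{0}` fails the assertion with a `TypeError`, since the constant fixes the expected type.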

Configuration

# LocalPreferences.toml
[AdaptiveArrayPools]
runtime_check = 1   # 0 = off, 1 = on (restart required)
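
Since the preference is described as accepting both `true` and `1`, the value has to be normalized to an `Int` somewhere. A minimal sketch of one way to do that follows; `normalize_check` is a hypothetical name, and the package's own `_normalize_runtime_check` may behave differently (in particular, how it treats values other than 0 and 1 is an assumption here).

```julia
# Hypothetical normalizer: map a Bool or Int preference value to an Int level.
normalize_check(x::Bool) = x ? 1 : 0

function normalize_check(x::Integer)
    # Assumption: negative levels are rejected, other integers pass through.
    x >= 0 || throw(ArgumentError("runtime_check must be non-negative, got $x"))
    return Int(x)
end

# Anything else (e.g. a String) is treated as a configuration error.
normalize_check(x) =
    throw(ArgumentError("runtime_check must be Bool or Int, got $(typeof(x))"))
```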

Migration (minor — safety API only)

These APIs were added recently and have no downstream users yet.

| Before | After |
|---|---|
| set_safety_level!(2) | runtime_check = 1 in LocalPreferences.toml + restart |
| POOL_DEBUG[] = true | Removed (use runtime_check = 1 instead) |
| 4 levels (0-3) | 2 levels (0/1) |

For full details on the new safety system, see the updated Safety documentation.

mgyoo86 added 5 commits March 12, 2026 10:08
Two-tier approach based on STATIC_POOL_CHECKS:

1. STATIC_POOL_CHECKS=false (production): generate S=0 only branch
   in _wrap_with_dispatch, eliminating 4-branch code duplication.
   Cuts single-function TTFX ~50% (2.0s → 0.8s) and prevents
   exponential 4^N blowup in nested @inline @with_pool (64s → 1.4s
   at 3 levels).

2. STATIC_POOL_CHECKS=true (debug): inject :noinline meta into
   generated @with_pool function definitions to prevent inlining
   of 4-branch dispatch bodies into callers. Changes 4^N (exponential)
   scaling to 4*N (linear) for nested calls. Runtime cost (~2-5ns/call)
   is negligible in debug mode.
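
For readers unfamiliar with the mechanism in item 2: a `:noinline` hint can be attached to a generated function by placing `Expr(:meta, :noinline)` at the top of its body. The sketch below shows the general technique only; `mark_noinline!` and `demo_scope` are hypothetical names, and this is not the macro code from the commit.

```julia
# Hypothetical helper: add a :noinline meta node to a long-form
# `function ... end` definition held as an Expr.
function mark_noinline!(fdef::Expr)
    fdef.head === :function ||
        throw(ArgumentError("expected a `function ... end` definition"))
    body = fdef.args[2]                            # the :block with the body
    pushfirst!(body.args, Expr(:meta, :noinline))  # same hint @noinline adds
    return fdef
end

def = :(function demo_scope(x)
    return x + 1
end)

mark_noinline!(def)
eval(def)        # defines demo_scope with the noinline hint attached
demo_scope(1)    # behaves like the unannotated function
```
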
…E_CHECK

Collapse the old 4-level safety system (off/guard/full/debug) into a
binary S=0 (off) / S=1 (full checks) system controlled by a single
RUNTIME_CHECK Bool flag.

Key changes:
- New RUNTIME_CHECK compile-time Bool (LocalPreferences: runtime_check)
- _make_pool / _make_cuda_pool: 4 branches → 2 branches
- _dispatch_pool_scope: 4 branches → 2 branches
- _wrap_with_dispatch: 4-branch loop → 2-branch (S=0 fast path + S=1)
- All S >= 2 / S >= 3 thresholds → S >= 1
- STATIC_POOL_CHECKS → RUNTIME_CHECK in macro codegen
- Backward-compatible aliases preserved (DEFAULT_SAFETY_LV, STATIC_POOL_CHECKS, POOL_SAFETY_LV)
- set_safety_level! range narrowed to 0-1
- Legacy path (Julia ≤1.10) fully mirrored
- CUDA extension updated (types.jl + debug.jl)

Motivation: intermediate safety levels (1=guard, 2=full) had negligible
overhead difference (~30ns) and limited practical value — users either
want zero overhead or full diagnostics. Binary simplification also
reduces worst-case TTFX from 4^N to 2^N for nested @with_pool.
…fety artifacts

Complete the binary RUNTIME_CHECK migration (S=0/1 only):

- Remove POOL_DEBUG Ref{Bool} toggle and all || POOL_DEBUG[] gates
- Remove set_runtime_check!, _dispatch_pool_scope, _set_cuda_runtime_check_hook!
- Eliminate union splitting in _wrap_with_dispatch — now emits compile-time
  const type assertion (pool::PoolType{RUNTIME_CHECK ? 1 : 0})
- Fix stale multi-level thresholds: S>=2 (poisoning) and S>=3 (borrow tracking)
  collapsed to S>=1 — all safety features activate together
- Update all tests to use _make_pool(true/false) with direct checkpoint/rewind
  instead of set_runtime_check! + @with_pool

BREAKING: POOL_DEBUG and set_runtime_check! removed from public API.
Use RUNTIME_CHECK preference (compile-time) or _make_pool(true) for testing.

- RUNTIME_CHECK: Bool → Int (0=off, 1=on), extensible like -O0/-O1/-O2
- LocalPreferences.toml accepts both Bool and Int, normalized via
  _normalize_runtime_check
- Macro simplified: RUNTIME_CHECK ? 1 : 0 → direct RUNTIME_CHECK
- _runtime_check: {0}→false, generic {S}→true (future-proof for S>1)
- _make_pool: accepts both Bool and Int
- CUDA test_cuda_safety.jl: full rewrite from 4-tier (S=0/1/2/3) to
  binary (S=0/1), removes all POOL_DEBUG/set_safety_level!/
  _safety_level references

Replace all references to removed APIs (POOL_DEBUG, set_safety_level!,
multi-level 0-3 system) with the new binary RUNTIME_CHECK (0=off, 1=on)
compile-time preference. Key changes across 5 doc files:

- safety.md: full rewrite — unified poisoning/invalidation/escape/borrow
  under RUNTIME_CHECK=1, removed 4-tier level table
- configuration.md: added runtime_check preference section with Bool/Int
  acceptance, removed POOL_DEBUG section, added restart warning
- safety-rules.md: replaced POOL_DEBUG debugging section with
  LocalPreferences.toml example
- multi-threading.md: updated debugging tips and summary
- api.md: POOL_DEBUG → RUNTIME_CHECK in configuration table

codecov bot commented Mar 12, 2026

Codecov Report

❌ Patch coverage is 84.74576% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.20%. Comparing base (a5783b8) to head (5a51974).
⚠️ Report is 3 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/types.jl | 73.33% | 4 Missing ⚠️ |
| src/legacy/types.jl | 80.00% | 3 Missing ⚠️ |
| src/utils.jl | 80.00% | 2 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master      #31      +/-   ##
==========================================
+ Coverage   96.97%   97.20%   +0.23%     
==========================================
  Files          14       14              
  Lines        2645     2613      -32     
==========================================
- Hits         2565     2540      -25     
+ Misses         80       73       -7     

| Files with missing lines | Coverage Δ |
|---|---|
| src/AdaptiveArrayPools.jl | 100.00% <ø> (ø) |
| src/bitarray.jl | 97.77% <ø> (ø) |
| src/debug.jl | 96.36% <100.00%> (-0.72%) ⬇️ |
| src/legacy/bitarray.jl | 100.00% <ø> (ø) |
| src/legacy/state.jl | 97.08% <100.00%> (+0.36%) ⬆️ |
| src/macros.jl | 98.78% <100.00%> (-0.12%) ⬇️ |
| src/state.jl | 97.05% <100.00%> (ø) |
| src/task_local_pool.jl | 100.00% <ø> (+35.71%) ⬆️ |
| src/utils.jl | 93.54% <80.00%> (+1.15%) ⬆️ |
| src/legacy/types.jl | 90.00% <80.00%> (ø) |

... and 1 more

Copilot AI left a comment

Pull request overview

This PR fixes exponential compile-time blowup (4^N branches from union splitting) when @inline @with_pool is nested 3+ levels deep, by replacing the 4-tier runtime-togglable safety system with a binary compile-time constant RUNTIME_CHECK.

Changes:

  • Replaced POOL_SAFETY_LV (runtime Ref{Int}, levels 0-3), POOL_DEBUG, set_safety_level!(), and _dispatch_pool_scope (closure-based union splitting) with a single RUNTIME_CHECK compile-time const Int (0=off, 1=on) and direct type assertion pool::PoolType{RUNTIME_CHECK} in macros.
  • Consolidated all safety features (structural invalidation, poisoning, escape detection, borrow tracking) into a single S=1 level, removing the multi-tiered threshold system.
  • Updated all tests to use direct _make_pool(true/false) construction and manual _lazy_checkpoint!/_lazy_rewind! instead of set_safety_level! + @with_pool macro.

Reviewed changes

Copilot reviewed 34 out of 34 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| src/types.jl, src/legacy/types.jl | Replace POOL_SAFETY_LV/DEFAULT_SAFETY_LV/STATIC_POOL_CHECKS with RUNTIME_CHECK const; simplify _make_pool to binary dispatch; replace _safety_level with _runtime_check returning Bool |
| src/macros.jl | Replace closure-based _dispatch_pool_scope union splitting with direct pool::PoolType{RUNTIME_CHECK} type assertion; replace _safety_level >= N \|\| POOL_DEBUG[] guards with _runtime_check(pool) |
| src/task_local_pool.jl | Remove _dispatch_pool_scope, set_safety_level!, _set_cuda_safety_level_hook! |
| src/state.jl, src/legacy/state.jl | Remove default S parameter from _rewind_typed_pool!, _invalidate_released_slots!, reset!; merge poisoning into S≥1 level |
| src/debug.jl | Remove POOL_DEBUG ref; simplify showerror to remove multi-level labels and "Tip: set LV=3" text |
| src/utils.jl | Replace _safety_label with _check_label; update display to use check=on/off |
| src/AdaptiveArrayPools.jl | Remove exports: POOL_DEBUG, POOL_SAFETY_LV, STATIC_POOL_CHECKS, DEFAULT_SAFETY_LV, set_safety_level!, deprecated aliases |
| ext/.../types.jl | CUDA: binary _make_cuda_pool, _runtime_check for CuAdaptiveArrayPool; remove _set_cuda_safety_level_hook! |
| ext/.../debug.jl | CUDA: replace S >= 2 \|\| POOL_DEBUG[] with S >= 1 |
| ext/.../state.jl | CUDA: pass S to reset! calls |
| ext/.../macros.jl | CUDA: update comments for direct type assertion |
| ext/.../task_local_pool.jl | Remove _set_cuda_safety_level_hook! |
| test/test_safety.jl | Rewrite tests for binary S=0/S=1; use _make_pool(true/false) |
| test/test_debug.jl | Remove POOL_DEBUG tests; rewrite escape detection tests to use direct _make_pool + _validate_pool_return |
| test/test_borrow_registry.jl | Rewrite borrow tracking tests for binary system |
| test/cuda/test_cuda_safety.jl | Consolidate CUDA safety tests to binary S=0/1 |
| test/test_task_local_pool.jl | Replace set_safety_level! tests with _make_pool tests |
| test/test_state.jl | Pass S argument to reset! and _rewind_typed_pool! |
| test/test_allocation.jl | Remove set_safety_level! save/restore |
| docs/ | Update safety, configuration, safety-rules docs for binary RUNTIME_CHECK |

@mgyoo86 mgyoo86 merged commit 0c3f22d into master Mar 12, 2026
13 of 14 checks passed
@mgyoo86 mgyoo86 deleted the fix/compile_time_explosion branch March 12, 2026 20:54