Issue description
Concurrent calls to `resolveModelFile()` for the same model trigger multiple simultaneous downloads, causing file handle leaks and Metal GPU crashes on macOS with Node 22+.
Expected Behavior
When multiple concurrent calls to resolveModelFile() happen for the same model path, it should detect the download is already in progress and wait for it to complete, then return the same file path to all callers.
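The standard fix for this class of bug is an in-flight promise map keyed by the target path: the first caller starts the download, and every concurrent caller for the same key awaits that same promise. A minimal sketch of the pattern (this is not node-llama-cpp's actual internals; `resolveOnce`, `inFlight`, and the `download` callback are illustrative names):

```typescript
// In-flight deduplication: concurrent callers for the same key share one promise.
const inFlight = new Map<string, Promise<string>>();

async function resolveOnce(
    key: string,
    download: (key: string) => Promise<string>
): Promise<string> {
    const existing = inFlight.get(key);
    if (existing != null)
        return existing; // a download for this key is already running

    const promise = download(key)
        .finally(() => inFlight.delete(key)); // allow retries after it settles
    inFlight.set(key, promise);
    return promise;
}
```

Because the map is checked and populated synchronously before any `await`, every caller that fires in the same tick (as in the repro below) observes the first caller's promise.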
Actual Behavior
Each concurrent call starts its own download. With 10 concurrent calls, 10 simultaneous downloads begin for the same 328MB file:
Full error log
10:42:38 (node:8119) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 SIGINT listeners added to [process]. MaxListeners is 10. Use emitter.setMaxListeners() to increase limit
(Use `node --trace-warnings ...` to show where the warning was created)
10:42:38 Downloading to ~/.node-llama-cpp/models
10:42:38 Downloading to ~/.node-llama-cpp/models
10:42:38 Downloading to ~/.node-llama-cpp/models
10:42:38 Downloading to ~/.node-llama-cpp/models
10:42:38 Downloading to ~/.node-llama-cpp/models
10:42:38 Downloading to ~/.node-llama-cpp/models
10:42:38 Downloading to ~/.node-llama-cpp/models
10:42:38 Downloading to ~/.node-llama-cpp/models
10:42:38 Downloading to ~/.node-llama-cpp/models
10:43:01 Downloading to ~/.node-llama-cpp/models
✔ hf_ggml-org_embeddinggemma-300m-qat-Q8_0.gguf downloaded 328.58MB in 35s
10:43:14 Downloaded to ~/.node-llama-cpp/models/hf_ggml-org_embeddinggemma-300m-qat-Q8_0.gguf
⏵ hf_ggml-org_embeddinggemma-300m-qat-Q8_0.gguf 100.0% (328.58MB/328.58MB) 3.73MB/s | 0s left
⏵ hf_ggml-org_embeddinggemma-300m-qat-Q8_0.gguf 78.04% (256.45MB/328.58MB) 6.13MB/s | 11s left
10:43:22 [openclaw] Uncaught exception: Error: A FileHandle object was closed during garbage collection. This used to be allowed with a deprecation warning but is now considered an error. Please close FileHandle objects explicitly. File descriptor: 86 (/Users/graciegould/.node-llama-cpp/models/hf_ggml-org_embeddinggemma-300m-qat-Q8_0.gguf.ipull)
/Users/runner/work/node-llama-cpp/node-llama-cpp/llama/llama.cpp/ggml/src/ggml-metal/ggml-metal-device.m:608: GGML_ASSERT([rsets->data count] == 0) failed
WARNING: Using native backtrace. Set GGML_BACKTRACE_LLDB for more info.
WARNING: GGML_BACKTRACE_LLDB may cause native MacOS Terminal.app to crash.
See: https://github.com/ggml-org/llama.cpp/pull/17869
0 libggml-base.dylib 0x00000001066fd44c ggml_print_backtrace + 276
1 libggml-base.dylib 0x00000001066fd638 ggml_abort + 156
2 libggml-metal.so 0x00000001390a47a4 ggml_metal_device_init + 0
3 libggml-metal.so 0x00000001390a50ac ggml_metal_device_free + 24
4 libggml-metal.so 0x00000001390a664c _ZNSt3__110unique_ptrI17ggml_metal_device25ggml_metal_device_deleterED1B8ne200100Ev + 32
5 libsystem_c.dylib 0x000000019e1a142c __cxa_finalize_ranges + 480
6 libsystem_c.dylib 0x000000019e1a11ec exit + 44
7 libnode.141.dylib 0x0000000106d59db4 _ZN4node33DefaultProcessExitHandlerInternalEPNS_11EnvironmentENS_8ExitCodeE + 0
8 libnode.141.dylib 0x0000000106d59e08 _ZN4node10V8Platform16StopTracingAgentEv + 0
9 libnode.141.dylib 0x0000000106d5af24 _ZNKSt3__18functionIFvPN4node11EnvironmentENS1_8ExitCodeEEEclES3_S4_ + 48
10 libnode.141.dylib 0x0000000106dab500 _ZN4node11Environment4ExitENS_8ExitCodeE + 84
11 libnode.141.dylib 0x0000000106b77298 Builtins_CallApiCallbackGeneric + 152
12 libnode.141.dylib 0x0000000106b755ec Builtins_InterpreterEntryTrampoline + 268
13 libnode.141.dylib 0x0000000106b755ec Builtins_InterpreterEntryTrampoline + 268
14 libnode.141.dylib 0x0000000106b755ec Builtins_InterpreterEntryTrampoline + 268
15 ??? 0x0000000129359c98 0x0 + 4986346648
16 ??? 0x00000001290942dc 0x0 + 4983440092
17 ??? 0x0000000129283560 0x0 + 4985468256
18 libnode.141.dylib 0x0000000106b755ec Builtins_InterpreterEntryTrampoline + 268
19 libnode.141.dylib 0x0000000106b7296c Builtins_JSEntryTrampoline + 172
20 libnode.141.dylib 0x0000000106b72610 Builtins_JSEntry + 176
21 libnode.141.dylib 0x00000001072248f8 _ZN2v88internal12_GLOBAL__N_16InvokeEPNS0_7IsolateERKNS1_12InvokeParamsE + 1544
22 libnode.141.dylib 0x00000001072242dc _ZN2v88internal9Execution4CallEPNS0_7IsolateENS0_12DirectHandleINS0_6ObjectEEES6_NS_4base6VectorIKS6_EE + 88
23 libnode.141.dylib 0x0000000107e307b0 _ZN2v88Function4CallEPNS_7IsolateENS_5LocalINS_7ContextEEENS3_INS_5ValueEEEiPS7_ + 172
24 libnode.141.dylib 0x0000000106dfb0fc _ZN4node6errors24TriggerUncaughtExceptionEPN2v87IsolateENS1_5LocalINS1_5ValueEEENS4_INS1_7MessageEEEb + 392
25 libnode.141.dylib 0x0000000107d8e7dc _ZN4node6errors24TriggerUncaughtExceptionEPN2v87IsolateERKNS1_8TryCatchE.cold.2 + 68
26 libnode.141.dylib 0x0000000106dfbed4 _ZN4node6errors24TriggerUncaughtExceptionEPN2v87IsolateERKNS1_8TryCatchE + 80
27 libnode.141.dylib 0x0000000106da9e0c _ZZN4node11Environment27RunAndClearNativeImmediatesEbENK3$_0clEPNS_13CallbackQueueIvJPS0_EEE + 256
28 libnode.141.dylib 0x0000000107d6beb8 _ZN4node11Environment27RunAndClearNativeImmediatesEb + 380
29 libnode.141.dylib 0x0000000107d6b9a4 _ZN4node11Environment14CheckImmediateEP10uv_check_s + 268
30 libuv.1.0.0.dylib 0x0000000102880124 uv__run_check + 136
31 libuv.1.0.0.dylib 0x000000010287aae8 uv_run + 324
32 libnode.141.dylib 0x0000000106d54dfc _ZN4node21SpinEventLoopInternalEPNS_11EnvironmentE + 252
33 libnode.141.dylib 0x0000000106e28728 _ZN4node16NodeMainInstance3RunEPNS_8ExitCodeEPNS_11EnvironmentE + 184
34 libnode.141.dylib 0x0000000106e2845c _ZN4node16NodeMainInstance3RunEv + 144
35 libnode.141.dylib 0x0000000106dcc858 _ZN4node5StartEiPPc + 676
36 dyld 0x000000019df25d54 start + 7184
The key errors are:
- `MaxListenersExceededWarning`: 11 SIGINT listeners proves concurrent downloaders
- "Downloading to" repeated 9x: 9 simultaneous downloads for the same file
- "FileHandle object was closed during garbage collection": fatal on Node 22+
- `GGML_ASSERT([rsets->data count] == 0) failed`: Metal GPU crash during exit cleanup
Steps to reproduce
1. Delete the existing model:
```shell
rm -rf ~/.node-llama-cpp/models/hf_ggml-org_embeddinggemma*
```
2. Run concurrent `resolveModelFile()` calls for the same model:
```typescript
import { resolveModelFile } from "node-llama-cpp";

const modelPath = "hf:ggml-org/embeddinggemma-300m-qat-q8_0-GGUF/embeddinggemma-300m-qat-Q8_0.gguf";

// Fire off 5+ concurrent calls to resolveModelFile
const promises = Array.from({ length: 5 }, () =>
    resolveModelFile(modelPath)
);

await Promise.all(promises); // Multiple downloads start; crashes with file handle/Metal errors
```
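Until the library deduplicates internally, a user-side workaround is to memoize the resolution so every caller shares one promise. A sketch under that assumption (`memoizeAsync` is a hypothetical helper, not part of node-llama-cpp's API):

```typescript
// Memoize a zero-argument async function so repeated and concurrent
// calls all share the promise from the first invocation.
function memoizeAsync<T>(fn: () => Promise<T>): () => Promise<T> {
    let cached: Promise<T> | undefined;
    return () => {
        if (cached === undefined)
            cached = fn(); // first call starts the work; later calls reuse it
        return cached;
    };
}
```

Usage would be along the lines of `const getModel = memoizeAsync(() => resolveModelFile(modelPath));`, with all call sites awaiting `getModel()` instead of calling `resolveModelFile()` directly. Note this caches forever, including rejections; the in-flight-map approach is better for a library fix.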
My Environment
| Dependency | Version |
|---|---|
| Operating System | |
| CPU | Intel i9 / Apple M1 |
| Node.js version | x.y.zzz |
| Typescript version | x.y.zzz |
| node-llama-cpp version | x.y.zzz |
Result of running `npx --yes node-llama-cpp inspect gpu`:
Additional Context
| Dependency | Version |
|---|---|
| Operating System | macOS (Apple Silicon) |
| CPU | Apple M-series |
| Node.js version | 22.22.0 |
| node-llama-cpp version | 3.15.1 |
Relevant Features Used
- Metal support
- CUDA support
- Vulkan support
- Grammar
- Function calling
Are you willing to resolve this issue by submitting a Pull Request?
Yes, I have the time, and I know how to start.