Releases: withcatai/node-llama-cpp

3.14.5 (2025-12-10)

Bug Fixes


Shipped with llama.cpp release `b7347`

To use the latest llama.cpp release available, run `npx -n node-llama-cpp source download --release latest`. (learn more)

3.14.4 (2025-12-08)

Bug Fixes

  • `create-node-llama-cpp` module package release (#530) (9a428e5)

Shipped with llama.cpp release `b7324`

To use the latest llama.cpp release available, run `npx -n node-llama-cpp source download --release latest`. (learn more)

3.14.3 (2025-12-08)

Features

  • `source download` CLI command: log the downloaded release when the release is set to `latest` (#522) (e37835c)

Bug Fixes

  • adapt to llama.cpp changes (#522) (e37835c)
  • pad the context size to align with the implementation in llama.cpp (#522) (e37835c) (see the sketch after this list)
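
The padding is applied transparently when a context is created, so code that assumes it gets the exact requested size should read the effective value back. A minimal sketch (the model path is hypothetical):

```typescript
import {getLlama} from "node-llama-cpp";

const llama = await getLlama();
// hypothetical path - any local GGUF model file
const model = await llama.loadModel({modelPath: "./models/model.gguf"});

// the requested size may be padded to align with llama.cpp's implementation,
// so read the effective size from the created context instead of assuming 4096
const context = await model.createContext({contextSize: 4096});
console.log("effective context size:", context.contextSize);
```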

Shipped with llama.cpp release `b7315`

To use the latest llama.cpp release available, run `npx -n node-llama-cpp source download --release latest`. (learn more)

3.14.2 (2025-10-26)

Bug Fixes

  • publish a new release to recover from a semantic-release failure in the previous release (#518) (e516e50)

Shipped with llama.cpp release `b6845`

To use the latest llama.cpp release available, run `npx -n node-llama-cpp source download --release latest`. (learn more)

3.14.1 (2025-10-26)

Bug Fixes

  • Vulkan: include integrated GPU memory (#516) (47475ac) (see the sketch after this list)
  • Vulkan: deduplicate the same device coming from different drivers (#516) (47475ac)
  • adapt Llama chat wrappers to breaking llama.cpp changes (#516) (47475ac)
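
The two Vulkan fixes change the numbers the backend reports for device memory. A minimal sketch of where that surfaces in the API, assuming a Vulkan-capable machine and the documented `getVramState()` shape:

```typescript
import {getLlama} from "node-llama-cpp";

// request the Vulkan backend explicitly (getLlama auto-detects by default)
const llama = await getLlama({gpu: "vulkan"});

// with the fixes above, integrated GPU memory is counted, and a device
// exposed through multiple drivers is reported only once
const {total, used, free} = await llama.getVramState();
console.log(`VRAM - total: ${total}, used: ${used}, free: ${free}`);
```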

Shipped with llama.cpp release `b6843`

To use the latest llama.cpp release available, run `npx -n node-llama-cpp source download --release latest`. (learn more)

3.14.0 (2025-10-02)

Features

  • Qwen3 Reranker support (#506) (00305f7) (see #506 for prequantized Qwen3 Reranker models you can use)
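
A minimal sketch of using a reranker through the ranking API, assuming `rankAndSort` returns documents paired with relevance scores (the model path is hypothetical; see #506 for actual prequantized models):

```typescript
import {getLlama} from "node-llama-cpp";

const llama = await getLlama();
// hypothetical path - use one of the prequantized Qwen3 Reranker models from #506
const model = await llama.loadModel({modelPath: "./models/qwen3-reranker.gguf"});
const rankingContext = await model.createRankingContext();

const query = "What is the capital of France?";
const documents = [
    "Paris is the capital and largest city of France.",
    "Mount Everest is the highest mountain on Earth.",
    "The Eiffel Tower is located in Paris."
];

// score every document against the query and sort by relevance (highest first)
const ranked = await rankingContext.rankAndSort(query, documents);
console.log(ranked[0]); // assumed shape: {document, score} for the best match
```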

Bug Fixes

  • handle HuggingFace rate limit responses (#506) (00305f7)
  • adapt to llama.cpp breaking changes (#506) (00305f7)

Shipped with llama.cpp release `b6673`

To use the latest llama.cpp release available, run `npx -n node-llama-cpp source download --release latest`. (learn more)

3.13.0 (2025-09-09)

Features

Bug Fixes


Shipped with llama.cpp release `b6431`

To use the latest llama.cpp release available, run `npx -n node-llama-cpp source download --release latest`. (learn more)

gpt-oss is here!

Read about the release in the blog post


3.12.4 (2025-08-28)

Bug Fixes


Shipped with llama.cpp release `b6301`

To use the latest llama.cpp release available, run `npx -n node-llama-cpp source download --release latest`. (learn more)

3.12.3 (2025-08-26)

Bug Fixes

  • Vulkan: context creation edge cases (#492) (12749c0)
  • CUDA 13 support in the prebuilt binaries (#494) (b10999d)
  • don't share loaded shared libraries between backends (#492) (12749c0)
  • split the prebuilt CUDA binaries into two npm modules (#495) (6e59160)
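
The module split is transparent at the API level; selecting the CUDA backend works the same as before. A minimal sketch, with a fallback to automatic backend selection:

```typescript
import {getLlama} from "node-llama-cpp";

let llama;
try {
    // request CUDA explicitly; this throws if no usable CUDA setup is found
    llama = await getLlama({gpu: "cuda"});
} catch {
    // fall back to automatic backend selection (Metal/Vulkan/CPU)
    llama = await getLlama();
}
console.log("active GPU backend:", llama.gpu);
```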

Shipped with llama.cpp release `b6294`

To use the latest llama.cpp release available, run `npx -n node-llama-cpp source download --release latest`. (learn more)

3.12.1 (2025-08-11)

Features

Bug Fixes

  • gpt-oss segment budgets (#489) (30eaa23)
  • add support for more gpt-oss variations (#489) (30eaa23)
  • default to using a model message for prompt completion on unsupported models (#489) (30eaa23)
  • prompt completion config (#490) (f849cd9)
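
A minimal sketch of the prompt-completion feature these fixes touch, assuming `completePrompt` keeps its documented signature (the model path is hypothetical):

```typescript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
// hypothetical path - any chat model; more gpt-oss variations are supported as of this release
const model = await llama.loadModel({modelPath: "./models/gpt-oss.gguf"});
const context = await model.createContext();
const session = new LlamaChatSession({contextSequence: context.getSequence()});

// suggest a completion for partially-typed user input; on models without
// dedicated support, a model message is now used as the fallback
const completion = await session.completePrompt("The three primary colors are", {
    maxTokens: 16
});
console.log(completion);
```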

Shipped with llama.cpp release `b6133`

To use the latest llama.cpp release available, run `npx -n node-llama-cpp source download --release latest`. (learn more)