
Commit 8def368

Prepare 5.0.3 release: drop -SNAPSHOT, finalize CHANGELOG
- pom.xml: 5.0.3-SNAPSHOT -> 5.0.3
- README.md: release dependency examples 5.0.2 -> 5.0.3 (snapshot example stays 5.0.3-SNAPSHOT)
- CHANGELOG.md: close [Unreleased] into [5.0.3] - 2026-06-29 and backfill the previously-missing [5.0.2] - 2026-06-08 section (reconstructed from the v5.0.2 tag), plus compare links.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01URUX3HiqQ1wzJnT8qn8c8E
1 parent 7889287 commit 8def368

3 files changed

Lines changed: 28 additions & 18 deletions

CHANGELOG.md

Lines changed: 19 additions & 9 deletions
@@ -9,10 +9,9 @@ from version 5.0.0 onward. Pre-fork releases (`1.x`–`4.2.0`) were authored by

 ## [Unreleased]

+## [5.0.3] - 2026-06-29
+
 ### Added
-- `CODE_OF_CONDUCT.md` (Contributor Covenant 2.0).
-- `docs/RELEASE.md` capturing the maintainer-facing release procedure (moved out of CHANGELOG).
-- OpenSSF Best Practices badge (project 12862) on README.
 - OpenAI-compatible `parallel_tool_calls` support: `ChatRequest.withParallelToolCalls(Boolean)` / `getParallelToolCalls()`, `InferenceParameters.withParallelToolCalls(boolean)`, and pass-through in the `/v1/chat/completions` server mapper.
 - Real-model tool-calling integration tests for blocking and streaming required tool calls (`ToolCallingIntegrationTest`, Qwen2.5-1.5B-Instruct), wired into CI and `validate-models`.
 - End-to-end vision input across blocking, typed `ChatRequest`, streaming, and OpenAI-compatible request mapping; real-model tests verify that distinct red and blue images produce the correct semantic answers.
@@ -22,13 +21,10 @@ from version 5.0.0 onward. Pre-fork releases (`1.x`–`4.2.0`) were authored by
 - `ModelParameters.enableSwaFull()` (`--swa-full`): keep full-size SWA KV cache to enable cross-request prompt-prefix reuse.
 - Typed cache observability through `Usage.getCachedTokens()`, `Usage.getProcessedPromptTokens()`, `SlotMetrics`, and `ServerMetrics.getSlotMetrics()`.
 - Authenticated JSON `GET /metrics` and `GET /slots` endpoints on the embedded server.
+- Windows GPU native classifiers: `cuda13-windows-x86-64`, `vulkan-windows-x86-64`, and `opencl-windows-x86-64`; the default Windows CPU JAR flipped to the Ninja Multi-Config generator with an `msvc-windows` classifier preserving the Visual Studio build.

 ### Changed
-- Unified `CONTRIBUTING.md` and `SECURITY.md` structure with sibling repositories in the project family.
-- Reconciled Java baseline to **11+** across `pom.xml`, README badge, `CLAUDE.md`, and `CONTRIBUTING.md`.
-- README license badge corrected from "Apache 2.0" to "MIT" (matches `LICENSE` file and `pom.xml`).
-- `pom.xml` SCM URL: `tree/master` → `tree/main` (default branch renamed).
-- Upgraded llama.cpp from b9151 to b9172.
+- Upgraded llama.cpp from b9172 to b9803 across multiple incremental upgrades.
 - Upgraded llama.cpp from b9803 to b9829. Compiles the new upstream `server-stream.cpp` (resumable-streaming SSE replay buffer) into `libjllama`, required because `server-context`/`server-http`/`server-models` now reference its symbols; refreshed `patches/0001` for the `tests/test-export-graph-ops.cpp` rename and the `server.cpp` GC-init context shift.
 - Upgraded llama.cpp from b9829 to b9839. Pure version bump with no project source changes: all four patches (`0001`–`0004`) apply unchanged against b9839, and every upstream change in the range is absorbed inside upstream-compiled translation units. Brings DFlash block-diffusion speculative decoding (`--spec-type draft-dflash`), the MiniCPM5 XML tool-call chat template, a server `--reasoning-preserve` flag (preserve reasoning trace across the full history when the template supports it), and Jinja `min`/`max` array filters; removes the now-unused `common/regex-partial.{cpp,h}` (partial-regex matching is fully inside the PEG parser), which the project never referenced.
 - Upgraded llama.cpp from b9839 to b9840. Pure version bump with no project source changes: the range is entirely the new **DeepSeek-V4** architecture (new `deepseek4` arch + dedicated `llama-kv-cache-dsv4` cache, `sqrtsoftplus` MoE gating, hyper-connection/compressor hparams + tensors, conversion scripts and embedded chat template), all absorbed inside upstream-compiled `libllama` and the Python converters. Upstream's `src/CMakeLists.txt` adds the new `llama-kv-cache-dsv4.cpp` itself (built via FetchContent). All four patches (`0001`–`0004`) apply unchanged; the project binds none of the new symbols.
@@ -43,9 +39,21 @@ from version 5.0.0 onward. Pre-fork releases (`1.x`–`4.2.0`) were authored by
 - `Session` now pins every inference request to its configured slot, so generation and slot save/restore/erase target the same KV state.
 - Cached-token usage is preserved through typed Java responses and OpenAI Responses/Anthropic blocking and streaming adapters.

+## [5.0.2] - 2026-06-08
+
 ### Added
+- `CODE_OF_CONDUCT.md` (Contributor Covenant 2.0).
+- `docs/RELEASE.md` capturing the maintainer-facing release procedure (moved out of CHANGELOG).
+- OpenSSF Best Practices badge (project 12862) on README.
 - Reasoning-budget tests (Qwen3-0.6B).

+### Changed
+- Unified `CONTRIBUTING.md` and `SECURITY.md` structure with sibling repositories in the project family.
+- Reconciled Java baseline to **11+** across `pom.xml`, README badge, `CLAUDE.md`, and `CONTRIBUTING.md`.
+- README license badge corrected from "Apache 2.0" to "MIT" (matches `LICENSE` file and `pom.xml`).
+- `pom.xml` SCM URL: `tree/master` → `tree/main` (default branch renamed).
+- Upgraded llama.cpp from b9151 to b9172.
+
 ## [5.0.1] - 2026-05-14

 ### Added
@@ -110,6 +118,8 @@ Releases `1.1.1` through `4.2.0` were authored by [@kherud](https://github.com/k

 For an architecture-level diff between the pre-fork baseline (`49be664`) and the first 5.0.0 candidate (`24918e4`), see [`docs/history/49be664_24918e4.md`](docs/history/49be664_24918e4.md). For the server-fork-deletion refactor that culminated in 5.0.0, see [`docs/history/REFACTORING.md`](docs/history/REFACTORING.md). For the chat-completion integration that landed in 5.0.0, see [`docs/history/CHAT_INTEGRATION_SUMMARY.md`](docs/history/CHAT_INTEGRATION_SUMMARY.md).

-[Unreleased]: https://github.com/bernardladenthin/java-llama.cpp/compare/v5.0.1...HEAD
+[Unreleased]: https://github.com/bernardladenthin/java-llama.cpp/compare/v5.0.3...HEAD
+[5.0.3]: https://github.com/bernardladenthin/java-llama.cpp/compare/v5.0.2...v5.0.3
+[5.0.2]: https://github.com/bernardladenthin/java-llama.cpp/compare/v5.0.1...v5.0.2
 [5.0.1]: https://github.com/bernardladenthin/java-llama.cpp/compare/v5.0.0...v5.0.1
 [5.0.0]: https://github.com/bernardladenthin/java-llama.cpp/releases/tag/v5.0.0
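
The compare links in this hunk follow a regular pattern: `[Unreleased]` compares the newest tag against `HEAD`, each release compares the previous tag against its own, and the oldest release links to its tag page. A minimal sketch of that pattern is shown here, assuming tags are named `v<version>`. The `ChangelogLinks` class and its `render` method are hypothetical and are not part of the project.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of java-llama.cpp. */
final class ChangelogLinks {

    /**
     * Renders the link-reference footer of a changelog.
     *
     * @param repoUrl  repository base URL without a trailing slash
     * @param versions released versions, newest first; must not be empty
     */
    static List<String> render(String repoUrl, List<String> versions) {
        if (versions.isEmpty()) {
            throw new IllegalArgumentException("at least one version is required");
        }
        List<String> lines = new ArrayList<>();
        // Unreleased work is everything after the newest tag.
        lines.add("[Unreleased]: " + repoUrl + "/compare/v" + versions.get(0) + "...HEAD");
        // Every release except the oldest compares against its predecessor.
        for (int i = 0; i < versions.size() - 1; i++) {
            String current = versions.get(i);
            String previous = versions.get(i + 1);
            lines.add("[" + current + "]: " + repoUrl + "/compare/v" + previous + "...v" + current);
        }
        // The oldest release has no predecessor, so it links to its tag page.
        String oldest = versions.get(versions.size() - 1);
        lines.add("[" + oldest + "]: " + repoUrl + "/releases/tag/v" + oldest);
        return lines;
    }

    public static void main(String[] args) {
        render("https://github.com/bernardladenthin/java-llama.cpp",
                List.of("5.0.3", "5.0.2", "5.0.1", "5.0.0"))
            .forEach(System.out::println);
    }
}
```

Generating the footer from one ordered version list would make a skipped entry, such as the `[5.0.2]` link backfilled in this commit, harder to miss.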

README.md

Lines changed: 8 additions & 8 deletions
@@ -119,7 +119,7 @@ Access this library via Maven (released versions on Maven Central):
 <dependency>
 <groupId>net.ladenthin</groupId>
 <artifactId>llama</artifactId>
-<version>5.0.2</version>
+<version>5.0.3</version>
 </dependency>
 ```

@@ -184,54 +184,54 @@ classifier — those are mutually exclusive — and optionally a CPU Windows bui
 <dependency>
 <groupId>net.ladenthin</groupId>
 <artifactId>llama</artifactId>
-<version>5.0.2</version>
+<version>5.0.3</version>
 </dependency>

 <!-- CUDA on Linux x86-64 (requires CUDA 13 runtime on the host) -->
 <dependency>
 <groupId>net.ladenthin</groupId>
 <artifactId>llama</artifactId>
-<version>5.0.2</version>
+<version>5.0.3</version>
 <classifier>cuda13-linux-x86-64</classifier>
 </dependency>

 <!-- OpenCL/Adreno on Android (requires device-provided OpenCL ICD) -->
 <dependency>
 <groupId>net.ladenthin</groupId>
 <artifactId>llama</artifactId>
-<version>5.0.2</version>
+<version>5.0.3</version>
 <classifier>opencl-android-aarch64</classifier>
 </dependency>

 <!-- CUDA on Windows x86-64 (requires CUDA 13 Toolkit on the host) -->
 <dependency>
 <groupId>net.ladenthin</groupId>
 <artifactId>llama</artifactId>
-<version>5.0.2</version>
+<version>5.0.3</version>
 <classifier>cuda13-windows-x86-64</classifier>
 </dependency>

 <!-- Vulkan on Windows x86-64 (NVIDIA/AMD/Intel; vulkan-1.dll from the driver) -->
 <dependency>
 <groupId>net.ladenthin</groupId>
 <artifactId>llama</artifactId>
-<version>5.0.2</version>
+<version>5.0.3</version>
 <classifier>vulkan-windows-x86-64</classifier>
 </dependency>

 <!-- OpenCL on Windows x86-64 (requires a driver-provided OpenCL ICD) -->
 <dependency>
 <groupId>net.ladenthin</groupId>
 <artifactId>llama</artifactId>
-<version>5.0.2</version>
+<version>5.0.3</version>
 <classifier>opencl-windows-x86-64</classifier>
 </dependency>

 <!-- Windows CPU natives built with the MSVC / Visual Studio generator -->
 <dependency>
 <groupId>net.ladenthin</groupId>
 <artifactId>llama</artifactId>
-<version>5.0.2</version>
+<version>5.0.3</version>
 <classifier>msvc-windows</classifier>
 </dependency>
 ```

pom.xml

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ SPDX-License-Identifier: MIT

 <groupId>net.ladenthin</groupId>
 <artifactId>llama</artifactId>
-<version>5.0.3-SNAPSHOT</version>
+<version>5.0.3</version>
 <packaging>jar</packaging>

 <name>${project.groupId}:${project.artifactId}</name>
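
The `pom.xml` change above is the usual Maven release step: a version ending in `-SNAPSHOT` marks a development build, and the release version is the same string without that suffix. A minimal sketch of the check and conversion is shown here. The `ReleaseVersion` class and its methods are hypothetical names used for illustration and do not exist in the project or in Maven.

```java
/** Hypothetical helper for illustration only; not part of java-llama.cpp or Maven. */
final class ReleaseVersion {

    private static final String SNAPSHOT_SUFFIX = "-SNAPSHOT";

    /** Returns true when the version denotes a development (snapshot) build. */
    static boolean isSnapshot(String version) {
        return version.endsWith(SNAPSHOT_SUFFIX);
    }

    /** Drops a trailing -SNAPSHOT suffix; release versions are returned unchanged. */
    static String toRelease(String version) {
        return isSnapshot(version)
                ? version.substring(0, version.length() - SNAPSHOT_SUFFIX.length())
                : version;
    }

    public static void main(String[] args) {
        System.out.println(toRelease("5.0.3-SNAPSHOT")); // prints 5.0.3
    }
}
```

A check like `isSnapshot` could guard a release job so that a snapshot version is never published as a release by accident.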
