You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: CLAUDE.md
+25-5Lines changed: 25 additions & 5 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -393,10 +393,28 @@ not track the loader's own Java package). This is the same
393
393
`spotbugs-exclude.xml`, PIT `targetClasses`, and `CMakeLists.txt` OSInfo repairs.
394
394
395
395
### Code Formatting
396
+
397
+
C++ formatting is **enforced in CI** (`.github/workflows/clang-format.yml`) with a **pinned**
398
+
clang-format — currently **22.1.5**, installed via `pip install clang-format==22.1.5`. Format with
399
+
that exact version before committing; a different clang-format version reflows code differently and
400
+
will fail the check.
401
+
396
402
```bash
397
-
clang-format -i src/main/cpp/*.cpp src/main/cpp/*.hpp # Format C++ code
403
+
pip install "clang-format==22.1.5"
404
+
clang-format -i src/main/cpp/*.cpp src/main/cpp/*.hpp src/test/cpp/*.cpp # Format C++ code
398
405
```
399
406
407
+
The generated JNI header `src/main/cpp/jllama.h` (produced by `javac -h`) is intentionally excluded.
408
+
To bump the enforced version, update the pin in **both** the workflow (`CLANG_FORMAT_VERSION`) and
409
+
this line, then reformat the whole tree with the new version in the same commit.
410
+
411
+
**`.clang-format` sets `SortIncludes: Never` — do not re-enable include sorting.** The project has
412
+
order-sensitive includes (see the "Include order rule" above): the upstream `server-*.h` headers and
413
+
`utils.hpp` must precede `json_helpers.hpp` / `jni_helpers.hpp`, which use the `json` alias those
414
+
headers define. Alphabetical sorting moves the helper headers first and breaks the build with
415
+
`'json' does not name a type` (it slips past a local build whose toolchain resolves `json` anyway,
416
+
but fails the manylinux/aarch64/Android CI compilers). Keep include order manual.
417
+
400
418
### Javadoc — must build cleanly before `mvn package`
401
419
402
420
The release packaging job runs `mvn package` with the `release` profile, which attaches
@@ -453,7 +471,9 @@ If the local check passes (`BUILD SUCCESS`), the `mvn package` job in
453
471
-`LlamaIterator` / `LlamaIterable` — Streaming generation via Java `Iterator`/`Iterable`.
454
472
-`LlamaLoader` — Extracts the platform-specific native library from the JAR to a temp directory, or finds it on `java.library.path`.
455
473
-`OSInfo` — Detects OS and architecture for library resolution.
456
-
-`server.LlamaServer` — Optional OpenAI-compatible HTTP server and the fat-jar `Main-Class`. `LlamaServerArgs` parses the CLI; `OaiRouter` / `OaiHttpServer` (NanoHTTPD) map `POST /v1/chat/completions`, `/v1/completions`, `/v1/embeddings` and `GET /v1/models` to the `LlamaModel.handle*` methods. NanoHTTPD is an `<optional>` dependency (bundled only in the fat jar, not inherited by library consumers). The `server` package is a dedicated top layer in the ArchUnit `layeredArchitecture` rule (the only layer allowed to access the root `Api`). See README "OpenAI-compatible HTTP server".
474
+
-**`server` package — OpenAI-compatible HTTP endpoint. NOTE: two implementations coexist on this branch pending a "best of both" consolidation (see [`TODO.md`](TODO.md)).**
475
+
-`server.OpenAiCompatServer` — built on the JDK's `com.sun.net.httpserver` (no new dependency). Serves `POST /v1/chat/completions` (streaming via SSE + non-streaming) and `GET /v1/models` by delegating to `LlamaModel.chatComplete` / `LlamaModel.streamChatCompletion`, so editors that speak the OpenAI protocol (e.g. VS Code Copilot "Custom Endpoint") can drive a local model. Streaming uses the native OAI chunk path (`requestChatCompletionStream` / `receiveChatCompletionChunk`), preserving `delta.tool_calls`.
476
+
-`server.LlamaServer` — an OpenAI-compatible HTTP server and the fat-jar `Main-Class`. `LlamaServerArgs` parses the CLI; `OaiRouter` / `OaiHttpServer` (NanoHTTPD) map `POST /v1/chat/completions`, `/v1/completions`, `/v1/embeddings` and `GET /v1/models` to the `LlamaModel.handle*` methods. NanoHTTPD is an `<optional>` dependency (bundled only in the fat jar, not inherited by library consumers). The `server` package is a dedicated top layer in the ArchUnit `layeredArchitecture` rule (the only layer allowed to access the root `Api`). See README "OpenAI-compatible HTTP server".
0 commit comments