Commit c10a6ec

build(server): compile upstream server-http.cpp + cpp-httplib into libjllama
First step toward driving the OpenAI-compatible server natively from JNI, shipped inside libjllama rather than as a standalone llama-server executable (a JNI .so/.dll/.dylib loads anywhere a JVM runs; a separate binary does not, which is the whole point of preferring the JNI path here). This commit only makes the HTTP layer build and link; there is no JNI route wiring yet.

What changed (CMakeLists.txt):
- Compile tools/server/server-http.cpp (the upstream server_http_context HTTP transport) and vendor/cpp-httplib/httplib.cpp directly into jllama, on all platforms (the getifaddrs API-24 gate cpp-httplib needs on Android is already satisfied by the existing __ANDROID_UNAVAILABLE_SYMBOLS_ARE_WEAK__ define).
- <cpp-httplib/httplib.h> already resolves via llama-common's vendor/ include dir, whose bundled nlohmann/json is the same 3.12.0 as our FetchContent copy, so nothing is shadowed and no extra include dir is required for it.
- Mirror upstream's cpp-httplib tuning defines (payload/URI/backlog limits, TCP_NODELAY) on jllama so httplib.cpp and the server-http.cpp that includes httplib.h agree on the inline behaviour those macros control.
- Silence httplib.cpp warnings (-w / /w), matching upstream's own target.
- Link ws2_32 on MinGW (MSVC auto-links it via a pragma in httplib.h).
- No SSL: CPPHTTPLIB_OPENSSL_SUPPORT is left undefined (plain HTTP for now; bind localhost or front with a TLS proxy).

WebUI stub (src/main/cpp/webui_stub/ui.h):
- server-http.cpp does #include "ui.h", the asset table that tools/ui (llama-ui) normally GENERATES via the llama-ui-embed host tool. We do not ship the Svelte WebUI (it needs npm or a prebuilt-asset download), so this header supplies the exact "empty asset table" interface embed.cpp emits for n_assets == 0: the llama_ui_asset struct plus llama_ui_find_asset / llama_ui_use_gzip / llama_ui_get_assets.
- LLAMA_UI_HAS_ASSETS is intentionally left undefined, so every static-asset-serving block in server-http.cpp compiles out; the single unguarded use iterates the (empty) asset list.
- Header-only (.h) so it is outside the clang-format glob, which only covers *.cpp/*.hpp.

server.cpp (standalone main() + route wiring) stays excluded; wiring those routes to a JNI entry point is the next step.

Verified locally (Linux x86_64):
- cmake --build --target jllama -> [100%] Built target jllama (clean).
- libjllama.so contains server_http_context::init/start/stop (T) and ~1.8k httplib symbols, with zero undefined server-http/httplib symbols.
- NativeLibraryLoadSmokeTest: Tests run: 1, Failures: 0, Skipped: 0 (the larger lib still loads and JNI_OnLoad resolves every referenced Java class).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01JdLpWD8nedY7LwNnHefZLF
1 parent ff614d2 commit c10a6ec

3 files changed

Lines changed: 100 additions & 2 deletions

CLAUDE.md

Lines changed: 1 addition & 1 deletion
@@ -483,7 +483,7 @@ If the local check passes (`BUILD SUCCESS`), the `mvn package` job in
 - `json_helpers.hpp` — Pure JSON transformation helpers (no JNI, no llama state). Independently unit-testable.
 - `jni_helpers.hpp` — JNI bridge helpers (handle management + server orchestration). Includes `json_helpers.hpp`.
 - Uses `nlohmann/json` for JSON deserialization of parameters.
-- The upstream server library (`server-context.cpp`, `server-queue.cpp`, `server-task.cpp`, `server-models.cpp`) is compiled directly into `jllama` via CMake — there is no hand-ported `server.hpp` fork.
+- The upstream server library (`server-context.cpp`, `server-queue.cpp`, `server-task.cpp`, `server-models.cpp`) is compiled directly into `jllama` via CMake — there is no hand-ported `server.hpp` fork. **Phase 2:** the upstream HTTP transport (`tools/server/server-http.cpp`) and its `cpp-httplib` backend (`vendor/cpp-httplib/httplib.cpp`) are now compiled into `jllama` too, so the OpenAI-compatible server can be driven natively from JNI *inside* `libjllama` — no separate `llama-server` executable (a JNI shared library loads anywhere a JVM runs, which a standalone binary does not). `server-http.cpp` does `#include "ui.h"` (the WebUI asset table that `tools/ui`/`llama-ui` normally generates); since the Svelte WebUI is not shipped, `src/main/cpp/webui_stub/ui.h` supplies the upstream **empty-asset** interface and leaves `LLAMA_UI_HAS_ASSETS` undefined (all static-asset-serving blocks compile out). `<cpp-httplib/httplib.h>` already resolves via `llama-common`'s `vendor/` include dir (same nlohmann/json 3.12.0 as the FetchContent copy). No SSL: `CPPHTTPLIB_OPENSSL_SUPPORT` is left undefined (plain-HTTP; bind localhost / front with a TLS proxy). Only `server.cpp` (the standalone `main()` + route wiring) remains excluded — wiring the routes to JNI is the next step.
 
 ### Native Helper Architecture
 

CMakeLists.txt

Lines changed: 47 additions & 1 deletion
@@ -261,7 +261,6 @@ add_library(jllama SHARED
 
 # Phase 1 refactoring: compile upstream server library units directly into jllama
 # server.hpp has been replaced by direct upstream includes in jllama.cpp.
-# server-http.cpp and server.cpp (main) are intentionally excluded.
 # server-context.cpp, server-queue.cpp, server-task.cpp compile on all platforms
 # including Android. server-models.cpp is excluded on Android because it pulls
 # in subprocess.h which calls posix_spawn_*, declared but not implemented by the
@@ -278,9 +277,49 @@ if(NOT ANDROID_ABI AND NOT OS_NAME MATCHES "Android")
     )
 endif()
 
+# Phase 2: also compile the upstream HTTP transport (server-http.cpp) and its
+# cpp-httplib backend directly into jllama, so the OpenAI-compatible server can be
+# driven natively from JNI — shipped inside libjllama, with no separate
+# llama-server executable (a JNI .so/.dll/.dylib loads everywhere a JVM runs,
+# unlike a standalone binary). Only server.cpp (the standalone main() + route
+# wiring) stays excluded for now; this first step just makes the HTTP layer build
+# and link.
+#
+# server-http.cpp does `#include "ui.h"` — the WebUI asset table that tools/ui
+# normally GENERATES. We do not ship the Svelte WebUI (it needs npm / a prebuilt
+# asset download), so src/main/cpp/webui_stub/ui.h supplies the upstream "empty
+# asset table" interface instead (see that file). <cpp-httplib/httplib.h> already
+# resolves via llama-common's vendor/ include dir, whose bundled nlohmann/json is
+# the same 3.12.0 as our FetchContent copy, so adding nothing there shadows it.
+target_sources(jllama PRIVATE
+    ${llama.cpp_SOURCE_DIR}/tools/server/server-http.cpp
+    ${llama.cpp_SOURCE_DIR}/vendor/cpp-httplib/httplib.cpp
+)
+
+# cpp-httplib is third-party: silence its warnings (matching upstream's own
+# cpp-httplib target, which compiles it with -w / /w). No SSL is enabled —
+# CPPHTTPLIB_OPENSSL_SUPPORT is left undefined — so the embedded server is
+# plain-HTTP for now (bind to localhost or front it with a TLS proxy).
+if(MSVC)
+    set_source_files_properties(
+        ${llama.cpp_SOURCE_DIR}/vendor/cpp-httplib/httplib.cpp
+        PROPERTIES COMPILE_FLAGS "/w")
+else()
+    set_source_files_properties(
+        ${llama.cpp_SOURCE_DIR}/vendor/cpp-httplib/httplib.cpp
+        PROPERTIES COMPILE_FLAGS "-w")
+endif()
+
+# MinGW needs ws2_32 explicitly; MSVC auto-links it via a #pragma in httplib.h.
+if(WIN32 AND NOT MSVC)
+    target_link_libraries(jllama PRIVATE ws2_32)
+endif()
+
 set_target_properties(jllama PROPERTIES POSITION_INDEPENDENT_CODE ON)
 target_include_directories(jllama PRIVATE
     src/main/cpp
+    # webui_stub/ui.h stands in for the generated llama-ui header (see Phase 2 above)
+    src/main/cpp/webui_stub
     ${JNI_INCLUDE_DIRS}
     ${llama.cpp_SOURCE_DIR}/tools/mtmd
     ${llama.cpp_SOURCE_DIR}/tools/server)
@@ -289,6 +328,13 @@ target_compile_features(jllama PRIVATE cxx_std_11)
 
 target_compile_definitions(jllama PRIVATE
     SERVER_VERBOSE=$<BOOL:${LLAMA_VERBOSE}>
+    # cpp-httplib tuning — mirror the defines upstream's cpp-httplib target sets so
+    # httplib.cpp and every TU that includes httplib.h (server-http.cpp) agree on
+    # the inline behaviour these macros control.
+    CPPHTTPLIB_FORM_URL_ENCODED_PAYLOAD_MAX_LENGTH=1048576
+    CPPHTTPLIB_LISTEN_BACKLOG=512
+    CPPHTTPLIB_REQUEST_URI_MAX_LENGTH=32768
+    CPPHTTPLIB_TCP_NODELAY=1
 )
 
 if(OS_NAME STREQUAL "Windows")
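
The cpp-httplib tuning defines in the hunk above are set on the whole jllama target so that every translation unit compiles the header with the same values. The following is a minimal sketch of that mechanism, assuming the header guards each tunable with an `#ifndef` default. The `DEMO_*` names and the fallback value are hypothetical stand-ins for illustration, not cpp-httplib identifiers.

```cpp
#include <cassert>

// Stand-in for a build-system define such as -DDEMO_LISTEN_BACKLOG=512.
// In a real build this would come from target_compile_definitions.
#define DEMO_LISTEN_BACKLOG 512

// ---- hypothetical header section ----
// The fallback only applies when the build did not set the macro.
#ifndef DEMO_LISTEN_BACKLOG
#define DEMO_LISTEN_BACKLOG 5
#endif

// Inline code bakes the macro value into every translation unit that
// includes the header, so two units built with different values would
// disagree about what this function returns.
inline int demo_listen_backlog() {
    return DEMO_LISTEN_BACKLOG;
}

// Small self-check: the build-provided value wins over the fallback.
inline void demo_self_check() {
    assert(demo_listen_backlog() == 512);
}
```

Setting the value once per target, rather than per source file, is what keeps httplib.cpp and server-http.cpp consistent in the commit above.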

src/main/cpp/webui_stub/ui.h

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
+// SPDX-FileCopyrightText: 2026 Bernard Ladenthin <bernard.ladenthin@gmail.com>
+//
+// SPDX-License-Identifier: MIT
+
+#pragma once
+
+// ui.h — minimal stand-in for the WebUI asset interface that llama.cpp's
+// tools/ui (CMake target "llama-ui") normally GENERATES into ui.h / ui.cpp at
+// build time via the llama-ui-embed host tool.
+//
+// The upstream HTTP transport (tools/server/server-http.cpp) does
+//     #include "ui.h"
+// and references llama_ui_get_assets() / llama_ui_find_asset() /
+// llama_ui_use_gzip(). We compile server-http.cpp directly into libjllama but do
+// NOT ship the Svelte WebUI assets (building them needs npm, or a prebuilt-asset
+// download from Hugging Face) — so we provide the exact "empty asset table"
+// interface that embed.cpp emits for its n_assets == 0 branch: the struct plus
+// the three functions, returning nothing.
+//
+// LLAMA_UI_HAS_ASSETS is intentionally left UNDEFINED. Every static-asset-serving
+// block in server-http.cpp is guarded by `#if defined(LLAMA_UI_HAS_ASSETS)`, so
+// all of them compile out; the single unguarded use — iterating the asset list to
+// collect public endpoint paths — simply iterates this empty array.
+//
+// To actually ship the WebUI later: remove this stub directory from jllama's
+// include path, build the real llama-ui target (assets on), and add its
+// generated-header directory instead.
+
+#include <array>
+#include <cstddef>
+#include <string>
+
+struct llama_ui_asset {
+    std::string name;
+    const unsigned char * data;
+    std::size_t size;
+    std::string etag;
+    std::string type;
+};
+
+inline const llama_ui_asset * llama_ui_find_asset(const std::string & /*name*/) {
+    return nullptr;
+}
+
+inline bool llama_ui_use_gzip() {
+    return false;
+}
+
+inline const std::array<llama_ui_asset, 0> & llama_ui_get_assets() {
+    static const std::array<llama_ui_asset, 0> empty{};
+    return empty;
+}
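
As a sketch of why the stub is enough for the one unguarded use, the loop below walks the empty table the same way a range-based `for` would walk real assets. The stub declarations are repeated so the example stands alone. `collect_public_paths` and `check_stub_is_empty` are hypothetical helpers written for illustration; they are not the upstream server-http.cpp code, and the path format is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Copy of the stub interface from webui_stub/ui.h (repeated for self-containment).
struct llama_ui_asset {
    std::string name;
    const unsigned char * data;
    std::size_t size;
    std::string etag;
    std::string type;
};

inline const llama_ui_asset * llama_ui_find_asset(const std::string & /*name*/) {
    return nullptr;
}

inline bool llama_ui_use_gzip() {
    return false;
}

inline const std::array<llama_ui_asset, 0> & llama_ui_get_assets() {
    static const std::array<llama_ui_asset, 0> empty{};
    return empty;
}

// Hypothetical caller: build one public path per embedded asset.
// With the stub, the loop body never runs and the result is empty.
inline std::vector<std::string> collect_public_paths() {
    std::vector<std::string> paths;
    for (const auto & asset : llama_ui_get_assets()) {
        paths.push_back("/" + asset.name);
    }
    return paths;
}

// Hypothetical self-check covering all three stub functions.
inline void check_stub_is_empty() {
    assert(collect_public_paths().empty());
    assert(llama_ui_find_asset("index.html") == nullptr);
    assert(!llama_ui_use_gzip());
}
```

Because the return type is a zero-length `std::array`, the caller needs no special case for the missing WebUI: iteration over it terminates immediately.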
