Add in-process fine-tuning proof of concept (LlamaTrainer) #287

Open
vaiju1981 wants to merge 1 commit into bernardladenthin:main from vaiju1981:training-poc
@vaiju1981

Summary

A proof of concept for in-process fine-tuning, wiring llama.cpp's ggml-opt training path into
the JNI layer. It mirrors upstream examples/training/finetune.cpp: load a model, tokenize a text
corpus into a ggml-opt dataset, run llama_opt_init + llama_opt_epoch for N epochs, and write the
fine-tuned GGUF via llama_model_save_to_file.

New API:

LlamaTrainer.finetune(Path model, String trainingText, Path output, int epochs, float learningRate);
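A minimal sketch of how a caller might guard this entry point, assuming nothing beyond the parameter list shown above. `FinetuneCall` and `FinetuneExample` are hypothetical names introduced only for illustration; the `void` return type and the specific argument checks are assumptions, not part of this PR.

```java
import java.nio.file.Path;

// Hypothetical stand-in with the same parameter list as LlamaTrainer.finetune.
// The void return type is an assumption; the PR text does not show it.
interface FinetuneCall {
    void finetune(Path model, String trainingText, Path output, int epochs, float learningRate);
}

final class FinetuneExample {
    private FinetuneExample() {}

    // Rejects obviously invalid arguments on the Java side, then delegates.
    // These checks are illustrative; the PR does not say what the native
    // layer validates on its own.
    static void run(FinetuneCall trainer, Path model, String trainingText, Path output,
                    int epochs, float learningRate) {
        if (trainingText.isEmpty()) {
            throw new IllegalArgumentException("training text must not be empty");
        }
        if (epochs < 1) {
            throw new IllegalArgumentException("epochs must be >= 1");
        }
        if (!(learningRate > 0f)) { // also rejects NaN
            throw new IllegalArgumentException("learning rate must be > 0");
        }
        trainer.finetune(model, trainingText, output, epochs, learningRate);
    }
}
```

With the real class on the classpath, a call might look like `FinetuneExample.run(LlamaTrainer::finetune, model, text, output, 1, 1e-5f)`; the epoch count and learning rate here are placeholder values, not recommendations.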

Why this is a small change

The open question for "can java-llama.cpp train like llama.cpp?" was whether the
ggml-opt / llama_opt machinery even links into our static libjllama. It does. Running
nm on the built library shows that llama_opt_init, llama_opt_epoch, ggml_opt_fit,
ggml_opt_dataset_init, common_opt_dataset_init, common_opt_lr_pars, and
llama_model_save_to_file are all defined (T) symbols, and CMake already links llama-common. So
this is pure JNI + C++ wiring; the only build-system change is adding train_engine.cpp to the
libjllama sources in CMakeLists.txt.

What's verified

  • train_engine.cpp compiles and links into libjllama (b9842).
  • The finetuneNative JNI symbol is exported; the library loads cleanly (NativeLibraryLoadSmokeTest).
  • LlamaTrainer compiles through the strict Error Prone / NullAway pipeline.
  • The actual training run is exercised by LlamaTrainerIntegrationTest, which self-skips unless
    -Dnet.ladenthin.llama.train.model=/path/to/small.gguf is set (full-model fine-tuning is
    compute/memory-heavy and should not run in a default build).
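The self-skip gate described in the last bullet can be sketched as follows. This is an illustration of the pattern, not the code in LlamaTrainerIntegrationTest; `TrainModelGate` and `configuredModel` are hypothetical names, and only the property key comes from this PR.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

final class TrainModelGate {
    // Property key as named in this PR.
    static final String PROPERTY = "net.ladenthin.llama.train.model";

    private TrainModelGate() {}

    // Returns the configured model path, or empty when the property is
    // unset or blank. A test would skip itself in the empty case.
    static Optional<Path> configuredModel() {
        String value = System.getProperty(PROPERTY);
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(Paths.get(value.trim()));
    }
}
```

A test could feed the result into its framework's assumption mechanism, for example JUnit 5's `Assumptions.assumeTrue(TrainModelGate.configuredModel().isPresent())`, so that a default build reports the test as skipped rather than failed.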

Design

  • train_engine.{h,cpp}: a self-contained native finetune(), independent of the inference
    server_context. It loads its own model + context and forces the two settings upstream training
    requires: mmap is disabled so the weights are writable, and the KV cache is f32 because
    OUT_PROD has no f16 support. It intentionally does not call llama_backend_free(), since other
    live contexts in the JVM may still depend on the initialized backend.
  • LlamaTrainer: a deliberately minimal Java surface so the native path can be exercised before a
    richer API is designed.

Scope / next steps

This is a POC, not a finished feature — upstream training support is itself experimental (full or
selective fine-tune, small models). A follow-up FineTuner API could add: dataset/file input and
batching, optimizer and LoRA-target selection, a learning-rate schedule, validation split, and
progress callbacks. Opening this to confirm the approach and the in-process feasibility.

Wire llama.cpp's ggml-opt training path into the JNI layer, mirroring upstream
examples/training/finetune.cpp: load a model, tokenize a text corpus into a
ggml-opt dataset, run llama_opt_init + llama_opt_epoch for N epochs, and write
the fine-tuned GGUF via llama_model_save_to_file.

- train_engine.{h,cpp} - self-contained native finetune(), independent of the
  inference server_context (loads its own model + context; forces no-mmap and an
  f32 KV cache, as training requires)
- LlamaTrainer - minimal Java entry point (static finetune(...) overloads)
- CMakeLists.txt - compile train_engine.cpp into libjllama

The ggml-opt / llama_opt symbols already link into the static libjllama with no
new link dependencies (verified with nm), so beyond compiling train_engine.cpp
this is pure JNI + C++ wiring. The finetuneNative symbol is exported, the library
links and loads cleanly, and the Java layer compiles through the strict Error
Prone / NullAway pipeline.

Scope is deliberately a proof of concept: full-model fine-tuning is compute- and
memory-intensive and upstream training support is experimental. The actual
training run is exercised by a model-gated integration test that self-skips
unless -Dnet.ladenthin.llama.train.model is set. A richer FineTuner API (dataset
handling, optimizer / LoRA options, progress callbacks) can build on this base.