Commit dca1b20
Expose sse_ping_interval + audited completion params on InferenceParameters
Wires the b9864 per-request sse_ping_interval into the Java API and, from a
completion-schema audit, adds the other already-parseable-but-unexposed plain
scalars so callers of the OAI-compat completion path can set them.
New withers (each emits a JSON key honored by eval_llama_cmpl_schema):
- withSsePingInterval(int) -> sse_ping_interval (b9864; -1 disables pings)
- withXtcProbability(float) / withXtcThreshold(float) -> XTC sampler
- withNDiscard(int) -> n_discard (context-shift discard)
- withNIndent(int) -> n_indent (infill indentation)
- withTMaxPredictMs(int) -> t_max_predict_ms (generation time budget)
- withPostSamplingProbs(boolean) -> post_sampling_probs
- withTimingsPerToken(boolean) -> timings_per_token
- withReturnTokens(boolean) -> return_tokens
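The wither-to-JSON-key mapping listed above can be sketched roughly as follows. This is a minimal illustration, not the library's code: `ParamsSketch` is a hypothetical stand-in for InferenceParameters, the mutate-and-return-this style and the hand-rolled JSON rendering are assumptions, and only keys named in this commit message are used.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical stand-in for InferenceParameters; NOT the library's real class. */
final class ParamsSketch {
    // Insertion-ordered map of JSON key -> already-serialized scalar value.
    private final Map<String, String> json = new LinkedHashMap<>();

    // Each wither stores one JSON key and returns this for chaining (assumed style).
    ParamsSketch withSsePingInterval(int interval) {
        json.put("sse_ping_interval", Integer.toString(interval)); // -1 disables pings
        return this;
    }

    ParamsSketch withNDiscard(int nDiscard) {
        json.put("n_discard", Integer.toString(nDiscard));
        return this;
    }

    ParamsSketch withTMaxPredictMs(int budgetMs) {
        json.put("t_max_predict_ms", Integer.toString(budgetMs));
        return this;
    }

    ParamsSketch withReturnTokens(boolean returnTokens) {
        json.put("return_tokens", Boolean.toString(returnTokens));
        return this;
    }

    /** Renders the collected scalars as a flat JSON object. */
    String toJson() {
        StringJoiner joiner = new StringJoiner(",", "{", "}");
        json.forEach((key, value) -> joiner.add("\"" + key + "\":" + value));
        return joiner.toString();
    }
}
```

Chaining `new ParamsSketch().withSsePingInterval(-1).withReturnTokens(true).toJson()` yields a flat object holding exactly the keys that were set, which is the shape the native schema parser is described as consuming.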
Audit method: extracted every field name from b9864's make_llama_cmpl_schema and
diffed them against the InferenceParameters keys. t_max_prompt_ms was deliberately
skipped (it is commented out upstream, so it is not parseable). The remaining
unexposed fields are either OAI aliases that are already covered (max_tokens/
max_completion_tokens -> n_predict) or OAI/server-internal, array-shaped, or
advanced knobs (n, logprobs, echo, verbose, include_usage, return_progress,
response_fields, lora, grammar_lazy/grammar_triggers/preserved_tokens,
chat_format, parse_tool_calls, reasoning_control, backend_sampling, adaptive_*);
these were left out on purpose and are documented in the breaking-changes history.
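The audit described above amounts to a set difference. A minimal sketch, assuming the schema field names and the exposed keys have already been collected into sets; `SchemaAuditSketch` and `unexposed` are hypothetical names, and how the names are extracted from make_llama_cmpl_schema is not shown here.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper illustrating the audit as a set difference. */
final class SchemaAuditSketch {
    /**
     * Returns the schema fields with no matching exposed key,
     * sorted alphabetically so the report is stable.
     */
    static Set<String> unexposed(Set<String> schemaFields, Set<String> exposedKeys) {
        Set<String> missing = new TreeSet<>(schemaFields);
        missing.removeAll(exposedKeys);
        return missing;
    }
}
```

For example, with an illustrative subset such as schema fields `sse_ping_interval`, `n_discard`, `n_predict`, `lora` and exposed key `n_predict`, the result is `[lora, n_discard, sse_ping_interval]`. Each entry is then triaged by hand: exposed as a wither, or left out on purpose.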
Tests: +2 Java wither tests (InferenceParametersTest -> 104 pass) and +3 C++
schema round-trip guards in test_server.cpp pinning that the native parser
honors sse_ping_interval (round-trip, -1 disables, below-hard-limit throws,
absent inherits the server default) -> full C++ suite 462 pass (was 459).
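The four sse_ping_interval behaviors pinned by the C++ guards can be modeled roughly as below. This is a Java sketch under stated assumptions, not the native parser: `PingIntervalSketch` is hypothetical, the hard limit is passed in as `hardMin` because its actual value is not given here, and treating -1 as exempt from the limit check is an assumption drawn from the test description.

```java
import java.util.OptionalInt;

/** Hypothetical model of the per-request sse_ping_interval resolution. */
final class PingIntervalSketch {
    static final int DISABLED = -1; // -1 disables pings, per the commit message

    /**
     * @param requested     per-request value, empty when the key is absent
     * @param serverDefault value inherited when the key is absent
     * @param hardMin       assumed lower bound; the real limit is not stated here
     */
    static int resolve(OptionalInt requested, int serverDefault, int hardMin) {
        if (!requested.isPresent()) {
            return serverDefault; // absent inherits the server default
        }
        int value = requested.getAsInt();
        if (value == DISABLED) {
            return DISABLED; // explicit opt-out
        }
        if (value < hardMin) {
            throw new IllegalArgumentException(
                    "sse_ping_interval " + value + " is below the hard limit " + hardMin);
        }
        return value; // round-trips unchanged
    }
}
```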
javadoc (llama module) + spotless + clang-format all clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01HL7d4uQ3cKR5HwYFPvZvv71
parent 8eb55ff commit dca1b20
5 files changed
Lines changed: 172 additions & 3 deletions
File tree
- docs/history
- llama/src
  - main/java/net/ladenthin/llama/parameters
  - test
    - cpp
    - java/net/ladenthin/llama/parameters