Commit 4f1fbd7

refactor(InferenceParameters): immutable + wither/append pattern
Convert InferenceParameters from a mutable fluent builder into a fully
immutable value class with a functional wither API:

    InferenceParameters params = InferenceParameters.of("two plus two?")
        .withNPredict(8)
        .withSeed(1)
        .withTemperature(0.2f);

The parent JsonParameters base is reshaped to match: parameters map is
final and Collections.unmodifiableMap-wrapped, and the helpers
putScalar / putEnum / putOptionalJson are replaced by withScalar /
withEnum / withOptionalJson / withRaw, all of which allocate a new
subclass instance through the abstract withParameters(Map) factory hook.

ModelParameters extends a different parent (CliParameters) and is
unchanged: it is constructed once and consumed once, so the immutability
refactor brings no correctness payoff there but ~250 test line changes;
deliberately skipped to keep this commit focused.

Hidden mutation bug fixed: LlamaModel.complete / completeWithStats /
chatComplete and the cancellable complete-with-token variant silently
called parameters.setStream(true|false) on the caller's instance. They
now bind a local derivation via withStream so the caller's parameters
object is never touched. Same pattern applied to LlamaIterator.

API breaks (intentional, alongside the parameter rename):
- Consumer<InferenceParameters> -> UnaryOperator<InferenceParameters> in
  Session and ChatRequest. The customiser must return its transformed
  result because the input is immutable; lambdas like
  `p -> p.withSeed(1).withNPredict(8)` keep working with the
  expression-form return.
- ChatRequest.applyCustomizer now returns InferenceParameters instead of
  being a void mutator; callers (only LlamaModel.chat) updated.

Tests:
- JsonParametersTest rewritten to cover the new wither helpers
  (withScalar / withEnum / withRaw / withOptionalJson) plus the
  unmodifiable-map invariant. The legacy CliParameters putScalar /
  putEnum tests are preserved unchanged because ModelParameters still
  uses them.
- Bulk-renamed setX -> withX across the entire test surface
  (InferenceParametersTest 71+, ChatAdvancedTest 84, LlamaModelTest 90,
  ChatScenarioTest 64, MemoryManagementTest 40, plus smaller files and
  all examples), preserving ModelParameters' fluent setX chains where the
  overlap methods (setSeed, setGrammar, setJsonSchema, setSamplers,
  setChatTemplate, setChatTemplateKwargs, setReasoningFormat) appear.
- A handful of tests that did `params.setX(...)` without capturing the
  return value were rewritten to `params = params.withX(...)`.

SpotBugs Max+Low: net unchanged at 6 findings. The one new
OCP_OVERLY_CONCRETE_PARAMETER on InferenceParameters.withReasoningFormat
is suppressed with the same design-intent rationale as the existing
ModelParameters OCP block (the narrow enum type is the API contract;
widening to CliArg would silently accept any enum and emit a nonsense
JSON value).
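
The wither pattern and the stream-flag fix described in the message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the `Sketch*` class names, the self-typed generic parameter, the `get` accessor, and the `forStreaming` helper are all assumptions made for the example; only `withParameters(Map)`, `withScalar`, `withSeed`, `withStream`, and the unmodifiable map come from the message above.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the JsonParameters base described in the commit message.
abstract class SketchJsonParameters<T extends SketchJsonParameters<T>> {
    // Final and unmodifiable: an existing instance can never change.
    protected final Map<String, String> parameters;

    protected SketchJsonParameters(Map<String, String> parameters) {
        this.parameters = Collections.unmodifiableMap(new LinkedHashMap<>(parameters));
    }

    // Factory hook: each subclass knows how to build itself from a new map.
    protected abstract T withParameters(Map<String, String> newParameters);

    // Copy the map, add one entry, and allocate a new instance through the hook.
    protected T withScalar(String key, Object value) {
        Map<String, String> copy = new LinkedHashMap<>(parameters);
        copy.put(key, String.valueOf(value));
        return withParameters(copy);
    }

    // Hypothetical read accessor, present only so the demo can inspect values.
    String get(String key) {
        return parameters.get(key);
    }
}

// Hypothetical stand-in for InferenceParameters.
final class SketchInferenceParameters extends SketchJsonParameters<SketchInferenceParameters> {
    private SketchInferenceParameters(Map<String, String> parameters) {
        super(parameters);
    }

    static SketchInferenceParameters of(String prompt) {
        return new SketchInferenceParameters(Collections.singletonMap("prompt", prompt));
    }

    @Override
    protected SketchInferenceParameters withParameters(Map<String, String> newParameters) {
        return new SketchInferenceParameters(newParameters);
    }

    SketchInferenceParameters withSeed(int seed) {
        return withScalar("seed", seed);
    }

    SketchInferenceParameters withStream(boolean stream) {
        return withScalar("stream", stream);
    }
}

class WitherDemo {
    // Mimics a library method that needs streaming on: it binds a local
    // derivation instead of flipping a flag on the caller's object.
    static SketchInferenceParameters forStreaming(SketchInferenceParameters callerParams) {
        return callerParams.withStream(true);
    }

    public static void main(String[] args) {
        SketchInferenceParameters base = SketchInferenceParameters.of("two plus two?").withSeed(1);
        SketchInferenceParameters streaming = forStreaming(base);
        System.out.println(base.get("stream"));      // null: the caller's instance is untouched
        System.out.println(streaming.get("stream")); // true
    }
}
```

In this sketch every `withX` call costs one map copy and one allocation; that is the price of never exposing a shared mutable instance to the callee.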
1 parent c42a2fc commit 4f1fbd7

35 files changed

Lines changed: 1074 additions & 950 deletions

spotbugs-exclude.xml

Lines changed: 14 additions & 0 deletions
@@ -88,6 +88,20 @@ SPDX-License-Identifier: MIT
     </Or>
   </Match>
 
+  <!--
+    Same design-intent rationale as the ModelParameters OCP block above:
+    InferenceParameters.withReasoningFormat(ReasoningFormat) intentionally
+    types its parameter to the specific ReasoningFormat enum rather than
+    the shared CliArg interface. The narrow type is the API contract;
+    widening it would silently accept any CliArg-implementing enum and
+    emit a nonsense JSON value the native code would reject.
+  -->
+  <Match>
+    <Class name="net.ladenthin.llama.InferenceParameters"/>
+    <Bug pattern="OCP_OVERLY_CONCRETE_PARAMETER"/>
+    <Method name="withReasoningFormat"/>
+  </Match>
+
   <!--
     InferenceParameters and ModelParameters are fluent builders whose
     parameters field is a Map<String, String> serving as the CLI / JSON
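
The rationale in the suppression comment can be illustrated with a small sketch. Everything here is hypothetical: the `SketchCliArg` interface, its `argValue()` method, both enums and their constants are invented for the example and are not taken from the library; only the idea (a narrow enum parameter versus a shared interface parameter) comes from the comment above.

```java
// Hypothetical stand-in for the shared CliArg interface.
interface SketchCliArg {
    String argValue();
}

// Hypothetical stand-in for the ReasoningFormat enum.
enum SketchReasoningFormat implements SketchCliArg {
    NONE("none"), AUTO("auto");

    private final String value;

    SketchReasoningFormat(String value) {
        this.value = value;
    }

    @Override
    public String argValue() {
        return value;
    }
}

// An unrelated hypothetical enum that happens to implement the same interface.
enum SketchPoolingType implements SketchCliArg {
    MEAN("mean");

    private final String value;

    SketchPoolingType(String value) {
        this.value = value;
    }

    @Override
    public String argValue() {
        return value;
    }
}

class NarrowTypeDemo {
    // Widened signature, as the SpotBugs detector would suggest:
    // it compiles for ANY enum that implements the interface.
    static String reasoningFormatJsonWide(SketchCliArg format) {
        return "{\"reasoning_format\":\"" + format.argValue() + "\"}";
    }

    // Narrow signature: the compiler rejects everything but SketchReasoningFormat.
    static String reasoningFormatJson(SketchReasoningFormat format) {
        return "{\"reasoning_format\":\"" + format.argValue() + "\"}";
    }

    public static void main(String[] args) {
        // Compiles, but the value is meaningless for this key.
        System.out.println(reasoningFormatJsonWide(SketchPoolingType.MEAN));
        // reasoningFormatJson(SketchPoolingType.MEAN) would not compile.
        System.out.println(reasoningFormatJson(SketchReasoningFormat.AUTO));
    }
}
```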

src/main/java/net/ladenthin/llama/ChatMessage.java

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@
  * Multimodal turns carry a non-null {@link #getParts()} list of {@link ContentPart}s
  * (text and image references). When parts are present they take precedence over
  * {@link #getContent()} during serialization; the upstream OAI chat path
- * (see {@link InferenceParameters#setMessages(java.util.List)}) emits an array-form
+ * (see {@link InferenceParameters#withMessages(java.util.List)}) emits an array-form
  * {@code content} field that the compiled-in {@code mtmd} pipeline understands.
  * </p>
  *

src/main/java/net/ladenthin/llama/ChatRequest.java

Lines changed: 19 additions & 15 deletions
@@ -12,7 +12,7 @@
 import java.util.Collections;
 import java.util.List;
 import java.util.Optional;
-import java.util.function.Consumer;
+import java.util.function.UnaryOperator;
 import lombok.EqualsAndHashCode;
 import lombok.ToString;
 import org.jspecify.annotations.Nullable;
@@ -25,8 +25,12 @@
  *
  * <p>The request carries the conversation messages, optional tool definitions,
  * an optional {@code tool_choice} hint, and an {@link InferenceParameters}
- * customiser applied to the underlying request just before invocation. The
- * type is consumed by {@link LlamaModel#chat(ChatRequest)} and
+ * customiser applied to the underlying request just before invocation. Because
+ * {@link InferenceParameters} is itself immutable, the customiser is a
+ * {@link UnaryOperator} that takes a parameter set and returns the transformed
+ * one — callers chain {@code withX(...)} calls on the input and return the
+ * resulting instance. The type is consumed by
+ * {@link LlamaModel#chat(ChatRequest)} and
  * {@link LlamaModel#chatWithTools(ChatRequest, java.util.Map)}.
  *
  * <p>All instances are <b>immutable</b>: every field is {@code final} and the
@@ -47,7 +51,7 @@
  *     .appendMessage("system", "be terse")
  *     .appendMessage("user", "two plus two?")
  *     .withMaxToolRounds(2)
- *     .withInferenceCustomizer(p -> p.setNPredict(8).setSeed(1));
+ *     .withInferenceCustomizer(p -> p.withNPredict(8).withSeed(1));
  * }</pre>
  *
  * <p>Each call allocates a new {@code ChatRequest}. The cost is intentional:
@@ -58,7 +62,7 @@
  *
  * <p>{@code @EqualsAndHashCode} compares messages, tools, {@code toolChoice},
  * and {@code maxToolRounds} by value. The {@code paramsCustomizer}
- * {@link Consumer} is <b>excluded</b> from equality: lambdas have
+ * {@link UnaryOperator} is <b>excluded</b> from equality: lambdas have
  * compiler-synthesised identity equality which is not value-shaped, so
  * including it would mean two structurally-identical requests with the same
  * customiser source code rarely compare equal — surprising for the typical
@@ -93,7 +97,7 @@ public final class ChatRequest {
     // equality is compiler-synthesised class identity, not value-shaped.
     @ToString.Exclude
     @EqualsAndHashCode.Exclude
-    private final @Nullable Consumer<InferenceParameters> paramsCustomizer;
+    private final @Nullable UnaryOperator<InferenceParameters> paramsCustomizer;
 
     /**
      * All-args constructor. Private because callers should enter via {@link #empty()}
@@ -105,7 +109,7 @@ private ChatRequest(
             List<ToolDefinition> tools,
             @Nullable String toolChoice,
             int maxToolRounds,
-            @Nullable Consumer<InferenceParameters> paramsCustomizer) {
+            @Nullable UnaryOperator<InferenceParameters> paramsCustomizer) {
         this.messages = messages;
         this.tools = tools;
         this.toolChoice = toolChoice;
@@ -212,7 +216,7 @@ public ChatRequest withMaxToolRounds(int newMaxToolRounds) {
      * @param newCustomizer the customiser; {@code null} clears any prior customiser
      * @return a new request with the customiser replaced; this request is unchanged
      */
-    public ChatRequest withInferenceCustomizer(@Nullable Consumer<InferenceParameters> newCustomizer) {
+    public ChatRequest withInferenceCustomizer(@Nullable UnaryOperator<InferenceParameters> newCustomizer) {
         return new ChatRequest(messages, tools, toolChoice, maxToolRounds, newCustomizer);
     }
 
@@ -319,14 +323,14 @@ public Optional<String> buildToolsJson() {
     }
 
     /**
-     * Apply the optional customiser to an {@link InferenceParameters} instance.
-     * Package-private; called by {@link LlamaModel}.
+     * Apply the optional customiser to an {@link InferenceParameters} instance and
+     * return the transformed result. Package-private; called by {@link LlamaModel}.
+     * When no customiser is set, returns {@code params} unchanged.
      *
-     * @param params the parameters to mutate
+     * @param params the parameters to transform
+     * @return the (possibly new) parameters produced by the customiser, or {@code params} when no customiser is set
      */
-    void applyCustomizer(InferenceParameters params) {
-        if (paramsCustomizer != null) {
-            paramsCustomizer.accept(params);
-        }
+    InferenceParameters applyCustomizer(InferenceParameters params) {
+        return paramsCustomizer == null ? params : paramsCustomizer.apply(params);
     }
 }
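
Why the customiser type had to change can be shown with a small self-contained sketch. It assumes a hypothetical immutable `SketchParams` class with a single `withSeed` wither standing in for InferenceParameters; `applyOperator` mirrors the shape of the new `applyCustomizer` in the hunk above, while `applyConsumer` mirrors the old void form.

```java
import java.util.function.Consumer;
import java.util.function.UnaryOperator;

// Hypothetical immutable value with one wither; stands in for InferenceParameters.
final class SketchParams {
    final int seed;

    SketchParams(int seed) {
        this.seed = seed;
    }

    SketchParams withSeed(int newSeed) {
        return new SketchParams(newSeed);
    }
}

class CustomizerDemo {
    // Old shape: a Consumer cannot hand back the derived instance, so with an
    // immutable input the result of the lambda body is silently discarded.
    static SketchParams applyConsumer(SketchParams params, Consumer<SketchParams> customizer) {
        if (customizer != null) {
            customizer.accept(params);
        }
        return params;
    }

    // New shape: the customiser returns the transformed instance;
    // a null customiser leaves the input unchanged.
    static SketchParams applyOperator(SketchParams params, UnaryOperator<SketchParams> customizer) {
        return customizer == null ? params : customizer.apply(params);
    }

    public static void main(String[] args) {
        SketchParams base = new SketchParams(0);
        // The same lambda text compiles against both functional interfaces.
        System.out.println(applyConsumer(base, p -> p.withSeed(1)).seed); // 0
        System.out.println(applyOperator(base, p -> p.withSeed(1)).seed); // 1
    }
}
```

The expression lambda `p -> p.withSeed(1)` is accepted as a `Consumer` because a method call is a valid statement, which is why the compiler would not flag a leftover `Consumer` customiser on its own.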

src/main/java/net/ladenthin/llama/CompletionResult.java

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@
  * <p>
  * Bundles the generated text with parsed {@link Usage}, {@link Timings},
  * per-token {@link TokenLogprob} entries (populated only when
- * {@link InferenceParameters#setNProbs(int)} &gt; 0), and the {@link StopReason}.
+ * {@link InferenceParameters#withNProbs(int)} &gt; 0), and the {@link StopReason}.
  * The raw native JSON is exposed via {@link #getRawJson()} as an escape hatch.
  * </p>
  *
