Commit eb55f58

refactor: extract ChatTranscript with two-phase commit semantics from Session
Eliminates the catch-cleanup-rethrow pattern in Session.send and Session.stream by moving transcript management into a new ChatTranscript class whose API surface enforces the two-phase commit invariant by construction.

The catch-rethrow pattern fb-contrib flagged (THROWS_METHOD_THROWS_RUNTIMEEXCEPTION) was the symptom of "mutate shared state, then if the call fails, undo the mutation". The ChatTranscript design eliminates the root cause: there is no API to commit half a round. The only round-commit method, appendRound(user, assistant), appends both turns atomically; the wire format sent to the model is built via messagesWithPendingUserTurn(...), which returns a fresh list WITHOUT mutating the transcript. On model failure, the caller never reaches appendRound, so no rollback logic is required.

Side benefit: testable as running documentation. Extracting the transcript management decouples the invariant from LlamaModel (whose static initializer loads the native library and is unmockable in test environments without the native library).

The new ChatTranscriptTest exercises the invariant with 12 tests across two categories.

@Nested "mechanical API behaviour" (8 tests):
- appendRound commits both turns atomically
- appendUserTurn + appendAssistantTurn match appendRound
- messagesWithPendingUserTurn does NOT mutate the transcript
- messagesWithPendingUserTurn returns a fresh list each call
- snapshot includes the system message when configured
- snapshot omits the system message when null or empty
- snapshot is unmodifiable
- getSystemMessage returns null when absent

@Nested "two-phase commit pattern - running documentation" (4 tests):
- fresh transcript untouched when model throws
- existing transcript byte-for-byte unchanged when model throws
- success commits user + assistant atomically
- stream() shape - user turn only, assistant follows via commitStreamedReply

Each two-phase test composes the ChatTranscript API the same way Session.send and Session.stream do, so reading the tests doubles as documentation of the design contract.

Net:
- New file: src/main/java/net/ladenthin/llama/ChatTranscript.java
- New file: src/test/java/net/ladenthin/llama/ChatTranscriptTest.java
- Session.java: 47 additions, 41 deletions; now delegates to ChatTranscript. The catch-rethrow blocks are gone; the buildParams helper is replaced by buildParamsWithPendingUserTurn, which delegates to ChatTranscript.messagesWithPendingUserTurn.

SpotBugs Max+Low:
- THROWS_METHOD_THROWS_RUNTIMEEXCEPTION goes 2 -> 0 by design, not by suppression. The cross-repo lifecycle TODO for BAF (PR #4087) can take inspiration from this refactor on AbstractProducer.produceKeys if/when that two-phase commit fits.
- IMC_IMMATURE_CLASS_NO_EQUALS goes 2 -> 3 (one new finding on the new ChatTranscript class, which is identity-by-design like the existing CancellationToken and ChatRequest sites; will fold into the existing class-level suppression block).

Total jllama: 13 -> 12.

Test slice green: 117 tests across ChatTranscriptTest (12), ChatMessageTest (2), ChatResponseTest (7), InferenceParametersTest (86), LlamaArchitectureTest (10).
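
The "fresh transcript untouched when model throws" idea can be illustrated with a small, self-contained sketch. This is not the ChatTranscriptTest source, which is not shown on this page: TranscriptSketch is a hypothetical stand-in that stores turns as plain strings, and the model is stubbed as a java.util.function.Function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class TwoPhaseCommitSketch {
    // Mirrors the shape of Session.send: build the wire format, call the model, then commit.
    static String send(TranscriptSketch t, Function<List<String>, String> model, String user) {
        List<String> wire = t.messagesWithPendingUserTurn(user);
        String reply = model.apply(wire); // may throw; nothing has been committed yet
        t.appendRound(user, reply);       // reached only on success
        return reply;
    }

    public static void main(String[] args) {
        TranscriptSketch t = new TranscriptSketch();
        send(t, wire -> "hello", "hi");
        try {
            send(t, wire -> { throw new IllegalStateException("model failed"); }, "again");
        } catch (IllegalStateException expected) {
            // No rollback code: the failed round never touched the transcript.
        }
        System.out.println(t.size()); // prints 2
    }
}

// Hypothetical stand-in for ChatTranscript; each turn is a "role: text" string.
final class TranscriptSketch {
    private final List<String> turns = new ArrayList<>();

    // Commits a full round (user + assistant) in one call.
    void appendRound(String user, String assistant) {
        turns.add("user: " + user);
        turns.add("assistant: " + assistant);
    }

    // Returns a fresh list of committed turns plus the pending user turn; does not mutate.
    List<String> messagesWithPendingUserTurn(String pendingUser) {
        List<String> wire = new ArrayList<>(turns);
        wire.add("user: " + pendingUser);
        return wire;
    }

    int size() {
        return turns.size();
    }
}
```

The catch block in main exists only so the demo can keep running; the send helper itself contains no try/catch, which is the point of the refactor.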
1 parent d31a906 commit eb55f58

3 files changed

Lines changed: 468 additions & 41 deletions

src/main/java/net/ladenthin/llama/ChatTranscript.java

Lines changed: 162 additions & 0 deletions
@@ -0,0 +1,162 @@
+// SPDX-FileCopyrightText: 2026 Bernard Ladenthin <bernard.ladenthin@gmail.com>
+//
+// SPDX-License-Identifier: MIT
+
+package net.ladenthin.llama;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import lombok.ToString;
+import org.jspecify.annotations.Nullable;
+
+/**
+ * Append-only transcript of a multi-turn chat conversation, with an optional
+ * leading {@code system} message. Extracted from {@link Session} so the
+ * transcript invariants — especially the <b>two-phase commit</b> shape — are
+ * testable independently of {@link LlamaModel} and its native library.
+ *
+ * <h2>Two-phase commit invariant</h2>
+ *
+ * <p>The append API only offers <b>atomic</b> turn commits:
+ *
+ * <ul>
+ * <li>{@link #appendRound(String, String)} appends a user turn AND an
+ * assistant turn in one synchronised operation — used by
+ * {@link Session#send(String)} on the model-success path. There is no
+ * way to commit only one half: if the model call throws, this method
+ * is simply never called and the transcript is untouched.</li>
+ * <li>{@link #appendUserTurn(String)} appends only the user turn — used
+ * by {@link Session#stream(String)} when the streaming iterable has
+ * been successfully created but the assistant reply is still being
+ * accumulated. The matching assistant turn is appended later via
+ * {@link #appendAssistantTurn(String)}.</li>
+ * </ul>
+ *
+ * <p>The wire-format the model sees is built by
+ * {@link #messagesWithPendingUserTurn(String)}, which returns a fresh list
+ * containing the committed turns plus a pending user turn — <b>without
+ * mutating</b> the underlying transcript. This is the mechanism by which the
+ * model receives the prompt before the user turn is committed.
+ *
+ * <h2>Thread safety</h2>
+ *
+ * <p>This class is <b>not</b> internally synchronised. {@link Session} owns
+ * the single instance and serialises access via its intrinsic lock, so the
+ * transcript itself does not need additional synchronisation. Callers that
+ * use {@code ChatTranscript} directly must provide their own synchronisation
+ * if shared across threads.
+ *
+ * <h2>{@code toString} contract</h2>
+ *
+ * <p>Lombok-generated over the system message and turns list. The turns list
+ * IS included because it is the operationally interesting state for log
+ * traces. {@code equals}/{@code hashCode} are intentionally NOT generated:
+ * a transcript instance is identified by its lifecycle owner ({@link Session}),
+ * not by its accumulated content.
+ */
+@ToString
+final class ChatTranscript {
+
+    private final @Nullable String systemMessage;
+    private final List<Pair<String, String>> turns = new ArrayList<Pair<String, String>>();
+
+    /**
+     * Create a new empty transcript with an optional system message.
+     *
+     * @param systemMessage the system prompt to prepend to every wire-format
+     * prompt; {@code null} or empty means "no system message"
+     */
+    ChatTranscript(@Nullable String systemMessage) {
+        this.systemMessage = systemMessage;
+    }
+
+    /**
+     * Append a user turn AND an assistant turn atomically. This is the only
+     * API that records both halves of a round, so the two-phase commit
+     * invariant is enforced by construction: callers that observe a model
+     * call failure simply never invoke this method.
+     *
+     * @param userMessage the user turn
+     * @param assistantMessage the assistant reply that completes the round
+     */
+    void appendRound(String userMessage, String assistantMessage) {
+        turns.add(new Pair<String, String>("user", userMessage));
+        turns.add(new Pair<String, String>("assistant", assistantMessage));
+    }
+
+    /**
+     * Append a user turn. Used by streaming flows where the assistant reply
+     * is accumulated incrementally and committed later via
+     * {@link #appendAssistantTurn(String)}.
+     *
+     * @param userMessage the user turn
+     */
+    void appendUserTurn(String userMessage) {
+        turns.add(new Pair<String, String>("user", userMessage));
+    }
+
+    /**
+     * Append an assistant turn. Used to complete a round that was begun
+     * with {@link #appendUserTurn(String)}.
+     *
+     * @param assistantMessage the assistant reply
+     */
+    void appendAssistantTurn(String assistantMessage) {
+        turns.add(new Pair<String, String>("assistant", assistantMessage));
+    }
+
+    /**
+     * Build the wire-format messages list with a pending user turn appended,
+     * <b>without mutating</b> this transcript. This is the snapshot a model
+     * call receives before the user turn is committed; if the model call
+     * fails, the pending turn evaporates and the transcript stays untouched.
+     *
+     * @param pendingUserMessage the user turn to include in the wire format
+     * @return a fresh list containing the committed turns followed by the
+     * pending user turn
+     */
+    List<Pair<String, String>> messagesWithPendingUserTurn(String pendingUserMessage) {
+        List<Pair<String, String>> wire = new ArrayList<Pair<String, String>>(turns.size() + 1);
+        wire.addAll(turns);
+        wire.add(new Pair<String, String>("user", pendingUserMessage));
+        return wire;
+    }
+
+    /**
+     * Return the system message, or {@code null} when none was configured.
+     *
+     * @return the system prompt, or {@code null}
+     */
+    @Nullable
+    String getSystemMessage() {
+        return systemMessage;
+    }
+
+    /**
+     * Return an unmodifiable {@link ChatMessage} snapshot of the transcript,
+     * including the system message if one was configured.
+     *
+     * @return the unmodifiable snapshot
+     */
+    List<ChatMessage> snapshot() {
+        List<ChatMessage> out = new ArrayList<ChatMessage>(turns.size() + 1);
+        if (systemMessage != null && !systemMessage.isEmpty()) {
+            out.add(new ChatMessage("system", systemMessage));
+        }
+        for (Pair<String, String> p : turns) {
+            out.add(new ChatMessage(p.getKey(), p.getValue()));
+        }
+        return Collections.unmodifiableList(out);
+    }
+
+    /**
+     * Return the number of committed turns (user + assistant). Does NOT
+     * include the system message.
+     *
+     * @return the turn count
+     */
+    int size() {
+        return turns.size();
+    }
+}
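
The read-side guarantees of the class above (a fresh wire list per call, an unmodifiable snapshot, and an empty system message treated as absent) can be tried out with the following sketch. It makes two assumptions so that it runs without the project's classes: MiniTranscript is a hypothetical reduction of ChatTranscript, and java.util.Map.Entry stands in for both the project's Pair and ChatMessage types.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class SnapshotSketch {
    public static void main(String[] args) {
        MiniTranscript t = new MiniTranscript(""); // empty system message counts as absent
        t.appendRound("hi", "hello");

        List<Map.Entry<String, String>> wire = t.messagesWithPendingUserTurn("next");
        wire.clear();                               // mutating the returned wire list...
        System.out.println(t.snapshot().size());    // ...leaves the transcript alone: prints 2

        try {
            t.snapshot().add(new SimpleImmutableEntry<>("user", "sneaky"));
        } catch (UnsupportedOperationException e) {
            System.out.println("snapshot is read-only");
        }
    }
}

// Hypothetical reduction of ChatTranscript; Map.Entry replaces Pair and ChatMessage.
final class MiniTranscript {
    private final String systemMessage; // may be null
    private final List<Map.Entry<String, String>> turns = new ArrayList<>();

    MiniTranscript(String systemMessage) {
        this.systemMessage = systemMessage;
    }

    void appendRound(String user, String assistant) {
        turns.add(new SimpleImmutableEntry<>("user", user));
        turns.add(new SimpleImmutableEntry<>("assistant", assistant));
    }

    // Fresh copy of the committed turns plus the pending user turn.
    List<Map.Entry<String, String>> messagesWithPendingUserTurn(String pending) {
        List<Map.Entry<String, String>> wire = new ArrayList<>(turns);
        wire.add(new SimpleImmutableEntry<>("user", pending));
        return wire;
    }

    // Unmodifiable view over a copy; the system message leads when it is non-empty.
    List<Map.Entry<String, String>> snapshot() {
        List<Map.Entry<String, String>> out = new ArrayList<>(turns.size() + 1);
        if (systemMessage != null && !systemMessage.isEmpty()) {
            out.add(new SimpleImmutableEntry<>("system", systemMessage));
        }
        out.addAll(turns);
        return Collections.unmodifiableList(out);
    }
}
```

Because snapshot() wraps a copy rather than the live list, a caller holding an older snapshot does not see turns appended afterwards.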

src/main/java/net/ladenthin/llama/Session.java

Lines changed: 47 additions & 41 deletions
@@ -4,8 +4,6 @@
 
 package net.ladenthin.llama;
 
-import java.util.ArrayList;
-import java.util.Collections;
 import java.util.List;
 import java.util.function.Consumer;
 import lombok.ToString;
@@ -45,8 +43,14 @@ public final class Session implements AutoCloseable {
     private final LlamaModel model;
 
     private final int slotId;
-    private final @Nullable String systemMessage;
-    private final List<Pair<String, String>> turns = new ArrayList<Pair<String, String>>();
+
+    /**
+     * Append-only transcript with two-phase commit semantics. See the
+     * {@link ChatTranscript} class Javadoc for the full invariant statement
+     * and the {@code ChatTranscriptTest} class for the running-documentation
+     * tests that pin the contract.
+     */
+    private final ChatTranscript transcript;
 
     // Lambda Consumer — toString is the implementation hash, not useful in logs.
     @ToString.Exclude
@@ -86,7 +90,7 @@ public Session(
             @Nullable Consumer<InferenceParameters> paramsCustomizer) {
         this.model = model;
         this.slotId = slotId;
-        this.systemMessage = systemMessage;
+        this.transcript = new ChatTranscript(systemMessage);
         this.paramsCustomizer = paramsCustomizer;
     }
 
@@ -101,19 +105,18 @@ public String send(String userMessage) {
             if (streamingActive) {
                 throw new IllegalStateException(
                         "stream in progress on slot " + slotId
-                                + " (transcript=" + turns.size() + " turns)"
+                                + " (transcript=" + transcript.size() + " turns)"
                                 + "; call commitStreamedReply(...) before send(...)");
             }
-            turns.add(new Pair<String, String>("user", userMessage));
-            InferenceParameters params = buildParams();
-            try {
-                String reply = model.chatCompleteText(params);
-                turns.add(new Pair<String, String>("assistant", reply));
-                return reply;
-            } catch (RuntimeException e) {
-                turns.remove(turns.size() - 1);
-                throw e;
-            }
+            // Two-phase commit: build the wire-format with the pending user turn
+            // outside the transcript via messagesWithPendingUserTurn(...). On
+            // model success, commit BOTH turns atomically through appendRound(...).
+            // On model failure, nothing was committed — no rollback logic needed.
+            // Invariant pinned by ChatTranscriptTest.
+            InferenceParameters params = buildParamsWithPendingUserTurn(userMessage);
+            String reply = model.chatCompleteText(params);
+            transcript.appendRound(userMessage, reply);
+            return reply;
         }
     }
 
@@ -131,18 +134,16 @@ public LlamaIterable stream(String userMessage) {
             if (streamingActive) {
                 throw new IllegalStateException(
                         "stream in progress on slot " + slotId
-                                + " (transcript=" + turns.size() + " turns)"
+                                + " (transcript=" + transcript.size() + " turns)"
                                 + "; call commitStreamedReply(...) before stream(...)");
             }
-            turns.add(new Pair<String, String>("user", userMessage));
-            try {
-                LlamaIterable iterable = model.generateChat(buildParams());
-                streamingActive = true;
-                return iterable;
-            } catch (RuntimeException e) {
-                turns.remove(turns.size() - 1);
-                throw e;
-            }
+            // Two-phase commit: see send(). The user turn is committed only after
+            // generateChat successfully returns the iterable; the assistant turn is
+            // committed separately by commitStreamedReply(...).
+            LlamaIterable iterable = model.generateChat(buildParamsWithPendingUserTurn(userMessage));
+            transcript.appendUserTurn(userMessage);
+            streamingActive = true;
+            return iterable;
         }
     }
 
@@ -157,10 +158,10 @@ public void commitStreamedReply(String assistantText) {
             if (!streamingActive) {
                 throw new IllegalStateException(
                         "no stream in progress on slot " + slotId
-                                + " (transcript=" + turns.size() + " turns)"
+                                + " (transcript=" + transcript.size() + " turns)"
                                 + "; call stream(...) first");
             }
-            turns.add(new Pair<String, String>("assistant", assistantText));
+            transcript.appendAssistantTurn(assistantText);
             streamingActive = false;
         }
     }
@@ -176,7 +177,7 @@ public String save(String filepath) {
             if (streamingActive) {
                 throw new IllegalStateException(
                         "stream in progress on slot " + slotId
-                                + " (transcript=" + turns.size() + " turns)"
+                                + " (transcript=" + transcript.size() + " turns)"
                                 + "; call commitStreamedReply(...) before save(...)");
             }
             return model.saveSlot(slotId, filepath);
@@ -194,7 +195,7 @@ public String restore(String filepath) {
             if (streamingActive) {
                 throw new IllegalStateException(
                         "stream in progress on slot " + slotId
-                                + " (transcript=" + turns.size() + " turns)"
+                                + " (transcript=" + transcript.size() + " turns)"
                                 + "; call commitStreamedReply(...) before restore(...)");
             }
             return model.restoreSlot(slotId, filepath);
@@ -207,14 +208,7 @@ public String restore(String filepath) {
      */
     public List<ChatMessage> getMessages() {
         synchronized (lock) {
-            List<ChatMessage> out = new ArrayList<ChatMessage>(turns.size() + 1);
-            if (systemMessage != null && !systemMessage.isEmpty()) {
-                out.add(new ChatMessage("system", systemMessage));
-            }
-            for (Pair<String, String> p : turns) {
-                out.add(new ChatMessage(p.getKey(), p.getValue()));
-            }
-            return Collections.unmodifiableList(out);
+            return transcript.snapshot();
         }
     }
 
@@ -226,9 +220,21 @@ public void close() {
         }
     }
 
-    private InferenceParameters buildParams() {
-        InferenceParameters params =
-                new InferenceParameters("").setMessages(systemMessage, new ArrayList<Pair<String, String>>(turns));
+    /**
+     * Build inference parameters with a pending user turn appended to the existing
+     * transcript — without mutating the underlying {@link ChatTranscript}. The
+     * actual transcript mutation happens AFTER the model call returns successfully,
+     * either via {@link ChatTranscript#appendRound(String, String)} (send path)
+     * or {@link ChatTranscript#appendUserTurn(String)} (stream path).
+     *
+     * @param pendingUserMessage the user turn to include in the wire format
+     * @return inference parameters carrying transcript + pending user turn
+     */
+    private InferenceParameters buildParamsWithPendingUserTurn(String pendingUserMessage) {
+        InferenceParameters params = new InferenceParameters("")
+                .setMessages(
+                        transcript.getSystemMessage(),
+                        transcript.messagesWithPendingUserTurn(pendingUserMessage));
         if (paramsCustomizer != null) {
             paramsCustomizer.accept(params);
         }
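
The stream path in the diff above commits in two steps: the user turn once the iterable exists, the assistant turn when commitStreamedReply is called. A minimal sketch of that shape follows. MiniSession is a hypothetical stand-in, not the real Session: the model is stubbed as a Function that returns an Iterator of tokens, turns are plain strings, and slot handling, save/restore and parameter customisation are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

final class StreamShapeSketch {
    public static void main(String[] args) {
        MiniSession s = new MiniSession(wire -> Arrays.asList("hel", "lo").iterator());
        StringBuilder reply = new StringBuilder();
        Iterator<String> tokens = s.stream("hi"); // user turn is committed here
        while (tokens.hasNext()) {
            reply.append(tokens.next());
        }
        s.commitStreamedReply(reply.toString());  // assistant turn is committed here
        System.out.println(s.turns());            // prints [user: hi, assistant: hello]
    }
}

// Hypothetical stand-in for Session, reduced to the streaming state machine.
final class MiniSession {
    private final Object lock = new Object();
    private final List<String> turns = new ArrayList<>();
    private final Function<List<String>, Iterator<String>> model;
    private boolean streamingActive;

    MiniSession(Function<List<String>, Iterator<String>> model) {
        this.model = model;
    }

    Iterator<String> stream(String user) {
        synchronized (lock) {
            if (streamingActive) {
                throw new IllegalStateException("stream in progress");
            }
            List<String> wire = new ArrayList<>(turns);
            wire.add("user: " + user);
            Iterator<String> it = model.apply(wire); // may throw; transcript still untouched
            turns.add("user: " + user);              // commit only after the model call returned
            streamingActive = true;
            return it;
        }
    }

    void commitStreamedReply(String assistant) {
        synchronized (lock) {
            if (!streamingActive) {
                throw new IllegalStateException("no stream in progress");
            }
            turns.add("assistant: " + assistant);
            streamingActive = false;
        }
    }

    // Defensive copy so callers cannot mutate the committed turns.
    List<String> turns() {
        synchronized (lock) {
            return new ArrayList<>(turns);
        }
    }
}
```

Between stream(...) and commitStreamedReply(...) the transcript holds a user turn without its reply, which is why the guard flag rejects a second stream(...) until the reply has been committed.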
