Fix javadoc build error + reduce warnings on new classes

claude · claude · commit 85b38955419a · 2026-05-23T06:23:41.000Z
The release packaging job (mvn package, release profile) runs
maven-javadoc-plugin's attach-javadocs which treats Javadoc tool errors as
build failures. PR #188 introduced one such error: TokenLogprob.java had a
</p> with no matching <p> (the prose was already enclosed by an outer
<p>...</p>, and the inner </p> was stray).

Fix the error and bring my new public APIs up to a clean shape:

- TokenLogprob: rebalance the <p>/</p> HTML and add @return / @param to
  public getters and constructor.
- Timings, Usage, ServerMetrics, ChatMessage, CancellationToken, Session,
  LlamaOutput: add @return / @param tags with a leading one-line description
  (the "no main description" warning fires on bare /** @return ... */ blocks).
- LlamaModel: restore the doc comment for complete(params, token) that was
  accidentally stripped during an earlier edit, and add one for
  getMetricsTyped(); remove a stray orphan doc block.

Local verification:

    mvn clean javadoc:jar -DskipTests=true -Dgpg.skip=true
    mvn -P release -Dmaven.test.skip=true -Dgpg.skip=true package

Both: BUILD SUCCESS (was: BUILD FAILURE, 1 error, 100 warnings). 60 warnings
remain, all from pre-existing files outside this PR.

Document the verification command and the failure categories (errors vs
warnings) in CLAUDE.md under "Javadoc — must build cleanly before mvn package".

https://claude.ai/code/session_01R4ZrEy3ptJDLuUgUKuM4Gy
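
For illustration only, a minimal sketch of the doc-comment shapes this commit moves toward. `DocShapeExample` and its members are hypothetical and do not appear in the PR. The commented-out shapes are examples of the two failure categories named in this message, not copies of the original code.

```java
// Shape reported as an error: a closing </p> with no matching <p>, for example
//   /** <p>First paragraph.</p> trailing prose</p> */

/**
 * Hypothetical value type used only to illustrate doc-comment shapes.
 *
 * <p>Every paragraph tag opened here is closed exactly once.</p>
 */
final class DocShapeExample {

    private final int tokenCount;

    /**
     * Construct an example holding a token count.
     *
     * @param tokenCount number of tokens to record
     */
    DocShapeExample(int tokenCount) {
        this.tokenCount = tokenCount;
    }

    // Shape that draws the "no main description" warning (tags only, no prose):
    //   /** @return the recorded token count */

    /**
     * Token count accessor.
     *
     * @return the recorded token count
     */
    int getTokenCount() {
        return tokenCount;
    }
}
```

The leading one-line sentence is what separates a clean getter comment from the bare `@return` form.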
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -514,6 +514,52 @@ into `models/` out-of-band.
 clang-format -i src/main/cpp/*.cpp src/main/cpp/*.hpp   # Format C++ code
 ```
 
+### Javadoc — must build cleanly before `mvn package`
+
+The release packaging job runs `mvn package` with the `release` profile, which attaches
+a javadoc jar via `maven-javadoc-plugin`. The plugin treats Javadoc tool **errors** as
+build failures (warnings are tolerated). After changing any public/protected Java API,
+verify the javadoc build succeeds locally:
+
+```bash
+mvn clean javadoc:jar -DskipTests=true -Dgpg.skip=true
+# expected: BUILD SUCCESS
+```
+
+Common Javadoc errors that fail the build (not warnings):
+
+- **Unbalanced HTML**: `</p>` without a matching `<p>`, mismatched `<ul>`/`<li>`, stray
+  closing tags. Symptom: `error: unexpected end tag: </p>`.
+- **Invalid `{@link …}` targets**: typo'd class, method, or parameter name.
+- **Self-closing void HTML elements written as `<br/>` inside `<pre>` blocks** in HTML5
+  mode (rare but seen).
+
+Common Javadoc *warnings* (do not fail the build, but should be cleaned up on new code):
+
+- `no main description` — a doc comment containing only `@param`/`@return`/`@throws`
+  tags with no leading prose. Fix: add a one-line description before the tags.
+- `no @return` / `no @param` — public method missing the tag. Fix: add it.
+- `no comment` — public method/field/enum constant has no doc comment at all.
+- `use of default constructor, which does not provide a comment` — public class with
+  no explicit constructor (the synthetic default has no Javadoc). Fix: add an explicit
+  no-arg constructor with a Javadoc comment.
+
+Preferred doc-comment shapes for getters and small value types:
+
+```java
+/**
+ * Brief one-line description of the value.
+ *
+ * @return the value
+ */
+public T getThing() { ... }
+```
+
+A bare `/** @return … */` triggers `no main description`; add a leading sentence.
+
+If the local check passes (`BUILD SUCCESS`), the `mvn package` job in
+`.github/workflows/publish.yml` will pass the `attach-javadocs` step.
+
 ## Architecture
 
 ### Two-Layer Design
diff --git a/src/main/java/net/ladenthin/llama/CancellationToken.java b/src/main/java/net/ladenthin/llama/CancellationToken.java
@@ -24,11 +24,15 @@ public final class CancellationToken {
 
     private volatile boolean cancelled;
 
+    /** Construct a fresh, not-cancelled token. */
     public CancellationToken() {
         // empty
     }
 
-    /** Returns {@code true} once {@link #cancel()} has been called and before {@link #reset()}. */
+    /**
+     * Cancellation flag accessor.
+     * @return {@code true} once {@link #cancel()} has been called and before {@link #reset()}
+     */
     public boolean isCancelled() {
         return cancelled;
     }
diff --git a/src/main/java/net/ladenthin/llama/ChatMessage.java b/src/main/java/net/ladenthin/llama/ChatMessage.java
@@ -14,15 +14,29 @@ public final class ChatMessage {
     private final String role;
     private final String content;
 
+    /**
+     * Construct a chat message.
+     *
+     * @param role    the message role: {@code "user"}, {@code "assistant"}, or {@code "system"}
+     * @param content the message text
+     */
     public ChatMessage(String role, String content) {
         this.role = role;
         this.content = content;
     }
 
+    /**
+     * Message role accessor.
+     * @return the message role string
+     */
     public String getRole() {
         return role;
     }
 
+    /**
+     * Message content accessor.
+     * @return the message text content
+     */
     public String getContent() {
         return content;
     }
diff --git a/src/main/java/net/ladenthin/llama/LlamaModel.java b/src/main/java/net/ladenthin/llama/LlamaModel.java
@@ -137,6 +137,16 @@ public CompletableFuture<String> chatCompleteTextAsync(InferenceParameters param
 		return CompletableFuture.supplyAsync(() -> chatCompleteText(parameters));
 	}
 
+	/**
+	 * Cancellable variant of {@link #complete(InferenceParameters)}. Runs in streaming mode
+	 * internally so the inference loop can observe a {@link CancellationToken#cancel()} call
+	 * from another thread between token boundaries and return early with whatever text was
+	 * accumulated so far.
+	 *
+	 * @param parameters the inference configuration (its {@code stream} flag is set to {@code true})
+	 * @param token cancellation handle observed at each token boundary
+	 * @return the text generated up to the point of stop or cancellation
+	 */
 	public String complete(InferenceParameters parameters, CancellationToken token) {
 		token.reset();
 		parameters.setStream(true);
@@ -453,14 +463,6 @@ public String getMetrics() {
 	private static final com.fasterxml.jackson.databind.ObjectMapper OBJECT_MAPPER =
 			new com.fasterxml.jackson.databind.ObjectMapper();
 
-	/**
-	 * Typed accessor for {@link #getMetrics()}. Parses the raw JSON into a
-	 * {@link ServerMetrics} view that exposes cumulative {@link Usage} and
-	 * {@link Timings}, slot counts, and a passthrough to the underlying JSON.
-	 *
-	 * @return parsed {@link ServerMetrics}
-	 * @throws LlamaException if the native call fails or the response cannot be parsed
-	 */
 	/**
 	 * Run {@link #complete(InferenceParameters)} constrained to the supplied JSON Schema
 	 * and deserialize the result into an instance of {@code type}. The schema is applied
@@ -506,6 +508,14 @@ public <T> T completeAsJson(Class<T> type, InferenceParameters parameters) throw
 		}
 	}
 
+	/**
+	 * Typed accessor for {@link #getMetrics()}. Parses the raw JSON into a
+	 * {@link ServerMetrics} view that exposes cumulative {@link Usage} and
+	 * {@link Timings}, slot counts, and a passthrough to the underlying JSON.
+	 *
+	 * @return parsed {@link ServerMetrics}
+	 * @throws LlamaException if the native call fails or the response cannot be parsed
+	 */
 	public ServerMetrics getMetricsTyped() throws LlamaException {
 		try {
 			return new ServerMetrics(OBJECT_MAPPER.readTree(getMetrics()));
diff --git a/src/main/java/net/ladenthin/llama/LlamaOutput.java b/src/main/java/net/ladenthin/llama/LlamaOutput.java
@@ -52,10 +52,27 @@ public final class LlamaOutput {
     @NotNull
     public final StopReason stopReason;
 
+    /**
+     * Backwards-compatible constructor that leaves {@link #logprobs} empty.
+     *
+     * @param text          generated text fragment
+     * @param probabilities token-to-probability map (may be empty)
+     * @param stop          whether this is the final token
+     * @param stopReason    the stop reason ({@link StopReason#NONE} on intermediate tokens)
+     */
     public LlamaOutput(@NotNull String text, @NotNull Map<String, Float> probabilities, boolean stop, @NotNull StopReason stopReason) {
         this(text, probabilities, Collections.<TokenLogprob>emptyList(), stop, stopReason);
     }
 
+    /**
+     * Construct an output with typed per-token logprobs in addition to the flat probability map.
+     *
+     * @param text          generated text fragment
+     * @param probabilities token-to-probability map (may be empty)
+     * @param logprobs      typed per-token logprob entries (may be empty)
+     * @param stop          whether this is the final token
+     * @param stopReason    the stop reason ({@link StopReason#NONE} on intermediate tokens)
+     */
     public LlamaOutput(@NotNull String text, @NotNull Map<String, Float> probabilities,
                        @NotNull List<TokenLogprob> logprobs, boolean stop, @NotNull StopReason stopReason) {
         this.text = text;
diff --git a/src/main/java/net/ladenthin/llama/ServerMetrics.java b/src/main/java/net/ladenthin/llama/ServerMetrics.java
@@ -32,39 +32,59 @@ public final class ServerMetrics {
         this.node = node;
     }
 
-    /** Number of slots currently idle. */
+    /**
+     * Idle slot count.
+     * @return number of slots currently idle
+     */
     public int getIdleSlots() {
         return node.path("idle").asInt(0);
     }
 
-    /** Number of slots currently processing a task. */
+    /**
+     * Busy slot count.
+     * @return number of slots currently processing a task
+     */
     public int getProcessingSlots() {
         return node.path("processing").asInt(0);
     }
 
-    /** Number of tasks waiting in the deferred queue. */
+    /**
+     * Backlog size.
+     * @return number of tasks waiting in the deferred queue
+     */
     public int getDeferredTasks() {
         return node.path("deferred").asInt(0);
     }
 
-    /** Server start timestamp (millis since epoch as reported by llama.cpp). */
+    /**
+     * Server start timestamp.
+     * @return millis since epoch as reported by llama.cpp
+     */
     public long getStartTimestamp() {
         return node.path("t_start").asLong(0L);
     }
 
-    /** Total decode invocations since server start. */
+    /**
+     * Lifetime decode counter.
+     * @return total decode invocations since server start
+     */
     public long getDecodeTotal() {
         return node.path("n_decode_total").asLong(0L);
     }
 
-    /** Cumulative number of busy-slot ticks (1 per decode per busy slot). */
+    /**
+     * Lifetime busy-slot counter.
+     * @return cumulative number of busy-slot ticks (1 per decode per busy slot)
+     */
     public long getBusySlotsTotal() {
         return node.path("n_busy_slots_total").asLong(0L);
     }
 
     /**
      * Maximum number of tokens any active slot is configured to hold. Absent in the
      * upstream JSON when no slot has been used yet; this getter returns {@code 0} then.
+     *
+     * @return the slot token-budget ceiling, or {@code 0} when no slot has been active
      */
     public int getTokensMax() {
         return node.path("n_tokens_max").asInt(0);
@@ -74,6 +94,8 @@ public int getTokensMax() {
      * Cumulative server-wide token usage since startup. Prompt tokens come from
      * {@code n_prompt_tokens_processed_total} and completion tokens from
      * {@code n_tokens_predicted_total}.
+     *
+     * @return cumulative {@link Usage} across all completions since server start
      */
     public Usage getCumulativeUsage() {
         return new Usage(
@@ -82,8 +104,10 @@ public Usage getCumulativeUsage() {
     }
 
     /**
-     * Usage counters from the most recent measurement window (current bucket) —
+     * Usage counters from the most recent measurement window (current bucket) &mdash;
      * {@code n_prompt_tokens_processed} and {@code n_tokens_predicted}.
+     *
+     * @return per-window {@link Usage} since the previous metrics emission
      */
     public Usage getWindowUsage() {
         return new Usage(
@@ -94,6 +118,8 @@ public Usage getWindowUsage() {
     /**
      * Cumulative throughput derived from the totals fields. Returns {@code 0.0} for
      * any rate where the corresponding ms total is zero.
+     *
+     * @return cumulative {@link Timings} aggregated from server-wide totals
      */
     public Timings getCumulativeTimings() {
         long promptN = node.path("n_prompt_tokens_processed_total").asLong(0L);
@@ -106,12 +132,18 @@ public Timings getCumulativeTimings() {
                 (int) predictedN, predictedMs, predictedPerSec, 0, 0);
     }
 
-    /** The {@code slots} array node, or a missing-node when no slots are reported. */
+    /**
+     * Slots array passthrough.
+     * @return the {@code slots} array node, or a missing-node when no slots are reported
+     */
     public JsonNode getSlots() {
         return node.path("slots");
     }
 
-    /** Underlying JSON for direct access to fields not yet exposed by typed getters. */
+    /**
+     * Raw passthrough escape hatch.
+     * @return underlying JSON for direct access to fields not yet exposed by typed getters
+     */
     public JsonNode asJson() {
         return node;
     }
diff --git a/src/main/java/net/ladenthin/llama/Session.java b/src/main/java/net/ladenthin/llama/Session.java
@@ -58,7 +58,12 @@ public Session(LlamaModel model, int slotId, String systemMessage,
         this.paramsCustomizer = paramsCustomizer;
     }
 
-    /** Send a user message and return the assistant's text reply, appending both to the transcript. */
+    /**
+     * Send a user message and return the assistant's text reply, appending both to the transcript.
+     *
+     * @param userMessage the user turn to append before invoking the model
+     * @return the assistant's reply text
+     */
     public String send(String userMessage) {
         turns.add(new Pair<String, String>("user", userMessage));
         InferenceParameters params = buildParams();
@@ -72,6 +77,9 @@ public String send(String userMessage) {
      * the assistant reply; consume it fully (or via try-with-resources) before calling
      * {@link #send(String)} again, because the assistant turn is only appended to the
      * transcript when the caller invokes {@link #commitStreamedReply(String)}.
+     *
+     * @param userMessage the user turn to append before starting the stream
+     * @return a {@link LlamaIterable} that yields assistant reply chunks
      */
     public LlamaIterable stream(String userMessage) {
         turns.add(new Pair<String, String>("user", userMessage));
@@ -81,22 +89,37 @@ public LlamaIterable stream(String userMessage) {
     /**
      * Record an assistant reply that was produced by a previous {@link #stream(String)}
      * call. Called by the caller after it has accumulated the streamed text.
+     *
+     * @param assistantText the assistant text accumulated from a prior {@link #stream(String)} call
      */
     public void commitStreamedReply(String assistantText) {
         turns.add(new Pair<String, String>("assistant", assistantText));
     }
 
-    /** Save this session's slot KV cache to {@code filepath}. */
+    /**
+     * Save this session's slot KV cache to {@code filepath}.
+     *
+     * @param filepath destination file path passed to {@link LlamaModel#saveSlot(int, String)}
+     * @return the JSON response from the native save action
+     */
     public String save(String filepath) {
         return model.saveSlot(slotId, filepath);
     }
 
-    /** Restore this session's slot KV cache from {@code filepath}. */
+    /**
+     * Restore this session's slot KV cache from {@code filepath}.
+     *
+     * @param filepath source file path passed to {@link LlamaModel#restoreSlot(int, String)}
+     * @return the JSON response from the native restore action
+     */
     public String restore(String filepath) {
         return model.restoreSlot(slotId, filepath);
     }
 
-    /** The accumulated turns so far, in order. */
+    /**
+     * Transcript accessor.
+     * @return the accumulated transcript so far, in order, including the system message if any
+     */
     public List<ChatMessage> getMessages() {
         List<ChatMessage> out = new ArrayList<ChatMessage>(turns.size() + 1);
         if (systemMessage != null && !systemMessage.isEmpty()) {
diff --git a/src/main/java/net/ladenthin/llama/Timings.java b/src/main/java/net/ladenthin/llama/Timings.java
diff --git a/src/main/java/net/ladenthin/llama/TokenLogprob.java b/src/main/java/net/ladenthin/llama/TokenLogprob.java
diff --git a/src/main/java/net/ladenthin/llama/Usage.java b/src/main/java/net/ladenthin/llama/Usage.java
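
The restored doc comment on `complete(parameters, token)` describes a loop that checks a cancellation flag between tokens and returns whatever text has accumulated. A minimal, self-contained sketch of that pattern follows. `SketchToken`, `CancellableLoopSketch`, and the `afterChunk` callback are hypothetical stand-ins, and where exactly the check sits in the loop is an assumption, since the real loop body is not shown in this diff.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for the token in the diff: a volatile flag with cancel and reset. */
final class SketchToken {
    private volatile boolean cancelled;

    void cancel() { cancelled = true; }

    void reset() { cancelled = false; }

    boolean isCancelled() { return cancelled; }
}

/** Hypothetical loop that accumulates chunks until exhaustion or cancellation. */
final class CancellableLoopSketch {

    static String collect(List<String> chunks, SketchToken token, Consumer<String> afterChunk) {
        token.reset(); // mirrors the token.reset() call at the top of complete(parameters, token)
        StringBuilder out = new StringBuilder();
        for (String chunk : chunks) {
            if (token.isCancelled()) {
                break; // assumed check position: before each chunk is consumed
            }
            out.append(chunk);
            afterChunk.accept(chunk); // lets a caller react, e.g. by cancelling
        }
        return out.toString();
    }

    public static void main(String[] args) {
        SketchToken token = new SketchToken();
        List<String> chunks = Arrays.asList("Hel", "lo", ", wor", "ld");
        // Cancel once the second chunk has been delivered; later chunks are skipped.
        String partial = collect(chunks, token, chunk -> {
            if (chunk.equals("lo")) {
                token.cancel();
            }
        });
        System.out.println(partial); // prints "Hello"
    }
}
```

Because the sketch, like the method in the diff, resets the token on entry, a `cancel()` issued before the call is discarded. Only cancellations made while the loop is running take effect.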
