Commit 65894f6

Fix package phase: exclude module-info.java from gpu and opencl-android compile passes
The release profile chains three compile executions:

- default-compile -> release 8, excludes module-info.java
- module-info-compile -> release 9, includes only module-info.java
- gpu / opencl-android -> classifier-specific passes that write to ${project.build.outputDirectory}_cuda / _opencl_android

The gpu and opencl-android executions inherited <release>8</release> from the plugin-level config but did NOT inherit default-compile's <excludes>module-info.java</excludes>; Maven plugin executions are independent. When triggered (mvn package -P release,cuda,opencl-android) either pass picked up module-info.java and failed with "modules are not supported in -source 8".

Added the same exclude block to both executions. Verified locally: mvn -P release,cuda,opencl-android -Dmaven.test.skip=true package now succeeds.

Also added a TODO for LlamaModel.getSuppressTokens() mirroring the new llama_vocab::get_suppress_tokens() accessor from b9490->b9495. The bias is auto-applied inside the model graph (Gemma4 Unified uses it to block <image|>/<audio|> placeholders); a Java mirror would only be an inspector for callers running their own sampling. Same posture as the existing b9444->b9490 follow-up TODOs - add only on real user request.
1 parent 03bcef1 commit 65894f6

2 files changed

Lines changed: 12 additions & 0 deletions

CLAUDE.md

Lines changed: 2 additions & 0 deletions
@@ -730,6 +730,8 @@ interim measure until that work lands.
 
 - **Expose Multi-Token Prediction toggle via `ModelParameters.setMtp(boolean)`.** Existed since the Qwen3.5 MTP work; b9444&#x2013;b9490 extends it to Step-3.5. CLI flags `--mtp`/`--no-mtp` (env `LLAMA_ARG_MTP`) control whether the draft head runs alongside the main model for accelerated decoding. Java setter would route to `common_params_speculative::type = COMMON_SPECULATIVE_TYPE_DRAFT_MTP`. Add only after a real user request &mdash; relevant only for MTP-trained models.
 
+- **Expose `llama_vocab::get_suppress_tokens()` via `LlamaModel.getSuppressTokens()`.** Added in b9490&#x2013;b9495 alongside the new `tokenizer.ggml.suppress_tokens` GGUF key and the `LLM_KV_TOKENIZER_SUPPRESS_TOKENS` constant. When a GGUF declares this array, upstream stores it on `llama_vocab::impl::suppress_tokens` and exposes it via the new `llama_vocab::get_suppress_tokens()` accessor. The bias is **applied automatically** inside the model forward graph &mdash; the Gemma4 Unified graph (`src/models/gemma4.cpp`) reads the list and adds a `-INFINITY` logit bias to those token IDs via a new `llm_graph_input_logits_bias` input so the model cannot emit them (used to block `<image|>` / `<audio|>` placeholders). A Java mirror would be `public int[] getSuppressTokens()` on `LlamaModel`: a read-only inspector returning the suppression list for debugging or for callers running their own sampling who want to replicate the same bias. Value is low (the bias is auto-applied, Java callers cannot change it; java-llama.cpp does not expose custom logit-bias hooks at this level); cost is trivial (one JNI passthrough + a `getSuppressTokens()` Java method). Add only after a real user request &mdash; same posture as the b9444&#x2013;b9490 follow-ups (`setReasoningControl`, `setMaxOutputs`, `setMtp`) queued above.
+
 - **`@VisibleForTesting` design-fit review.** Complement to the audit above: for every existing or planned `@VisibleForTesting` usage, ask whether widening access is the cleanest path to testability. Common alternatives that should be preferred when applicable: (a) inject the dependency through the constructor and have the test pass a stub or fake; (b) extract the tested behaviour into a separate testable helper class with public methods; (c) restructure the production API so what the test wants to verify is observable through normal public methods. Only keep the annotation where these alternatives are materially worse. `@VisibleForTesting` should be the last resort, not the first.
 
 - **Package hierarchy review.** Walk the full `src/main/java/.../` tree and assess whether the current package layout still expresses the design intent. Look for: classes that have drifted into the wrong package as the codebase grew; flat "kitchen-sink" packages that should be split (high class count, mixed concerns); deeply nested packages that fragment cohesive components; circular dependencies between packages; missing seams where a sub-package boundary would prevent leaking implementation details. Produce a target tree as a separate planning step BEFORE making any moves — large package refactors are expensive to review and easy to do twice if the target isn't clear up front.
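
The new TODO entry describes the suppression bias as a `-INFINITY` logit bias on the listed token IDs. For a caller running their own sampling, replicating it could look roughly like the sketch below. Everything here is illustrative: `SuppressTokenBias`, `apply`, and `argmax` are hypothetical names, and the suppression list is a hard-coded stand-in because `LlamaModel.getSuppressTokens()` does not exist yet.

```java
import java.util.Arrays;

/** Hypothetical helper; not part of java-llama.cpp. */
final class SuppressTokenBias {

    /**
     * Returns a copy of the logits with every suppressed token ID set to -infinity.
     * Adding -INFINITY to a finite logit also yields -INFINITY, so assignment is
     * equivalent to the additive bias for finite inputs. Out-of-range IDs are ignored.
     */
    static float[] apply(float[] logits, int[] suppressTokens) {
        float[] biased = Arrays.copyOf(logits, logits.length);
        for (int id : suppressTokens) {
            if (id >= 0 && id < biased.length) {
                biased[id] = Float.NEGATIVE_INFINITY;
            }
        }
        return biased;
    }

    /** Greedy pick: index of the largest logit (first one wins on ties). */
    static int argmax(float[] logits) {
        int best = 0;
        for (int i = 1; i < logits.length; i++) {
            if (logits[i] > logits[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        float[] logits = {0.1f, 2.5f, 1.7f, 0.3f};
        int[] suppress = {1}; // assumption: stand-in for the list a GGUF would declare

        System.out.println(argmax(logits));                  // prints 1
        System.out.println(argmax(apply(logits, suppress))); // prints 2
    }
}
```

Returning a copy keeps the caller's raw logits intact, which fits the read-only "inspector" role the TODO describes.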

pom.xml

Lines changed: 10 additions & 0 deletions
@@ -692,6 +692,11 @@ SPDX-License-Identifier: MIT
       <goal>compile</goal>
     </goals>
     <configuration>
+      <!-- Same rationale as default-compile: this pass targets release 8;
+           module-info.java is Java 9+ and is handled by module-info-compile. -->
+      <excludes>
+        <exclude>module-info.java</exclude>
+      </excludes>
       <compilerArgs>
         <arg>-h</arg>
         <arg>src/main/cpp</arg>
@@ -768,6 +773,11 @@ SPDX-License-Identifier: MIT
       <goal>compile</goal>
     </goals>
     <configuration>
+      <!-- Same rationale as default-compile: this pass targets release 8;
+           module-info.java is Java 9+ and is handled by module-info-compile. -->
+      <excludes>
+        <exclude>module-info.java</exclude>
+      </excludes>
       <compilerArgs>
         <arg>-h</arg>
         <arg>src/main/cpp</arg>
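
The failure these excludes avoid can be reproduced outside Maven. The sketch below drives `javac` through the standard `javax.tools` API and compiles a throwaway `module-info.java` at two release levels. It assumes a JDK (not a JRE) that still accepts `--release 8`; the class name, module name, and temp-directory layout are made up for the demo and have nothing to do with the project's build.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

/** Hypothetical demo class; not part of the project. */
final class ModuleInfoReleaseDemo {

    /**
     * Compiles a minimal module-info.java with the given --release value.
     * Returns javac's exit code (0 means success); diagnostics go to the given stream.
     */
    static int compileModuleInfo(String release, ByteArrayOutputStream diagnostics) {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null) {
            throw new IllegalStateException("No system compiler found; run on a JDK");
        }
        try {
            Path dir = Files.createTempDirectory("module-info-demo");
            Path src = dir.resolve("module-info.java");
            // Assumption: an empty module declaration is enough to trigger the check.
            Files.write(src, "module demo.example { }".getBytes(StandardCharsets.UTF_8));
            Path out = Files.createDirectory(dir.resolve("out"));
            return javac.run(null, null, diagnostics,
                    "--release", release, "-d", out.toString(), src.toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream diag8 = new ByteArrayOutputStream();
        int code8 = compileModuleInfo("8", diag8);
        System.out.println("release 8 exit code: " + code8);
        System.out.println(new String(diag8.toByteArray(), StandardCharsets.UTF_8));

        ByteArrayOutputStream diag9 = new ByteArrayOutputStream();
        int code9 = compileModuleInfo("9", diag9);
        System.out.println("release 9 exit code: " + code9);
    }
}
```

On a JDK that supports both levels, the release 8 run is expected to fail with a diagnostic about modules and the release 9 run to succeed, which is why a release 8 pass has to exclude `module-info.java`.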
