Commit 5741d7b

Merge pull request #79 from SKaiNET-developers/feature/gemma4
Gemma 4 epic: correctness, tool calling, multiplatform build
2 parents 3c19515 + 08f9415 commit 5741d7b

114 files changed

Lines changed: 14848 additions & 464 deletions

ISSUE-skainet-8b-oom.md

Lines changed: 0 additions & 113 deletions
This file was deleted.

ISSUE-skainet-spring-ai-adapter.md

Lines changed: 178 additions & 0 deletions
@@ -0,0 +1,178 @@
# Spring AI adapter for SKaiNET-transformers (separate companion repo)

**Repository (target):** `SKaiNET-developers/SKaiNET-spring-ai` (new, to be created)
**Depends on:** `SKaiNET-developers/SKaiNET-transformers` (modules `llm-api`, `llm-providers`)
**Labels:** enhancement, integration, spring
**Milestone:**

---

## Summary

Build a thin Spring AI adapter and Spring Boot starter that expose the existing
`sk.ainet.llm.api.ChatModel` / `EmbeddingModel` SPI through Spring AI's
`org.springframework.ai.chat.model.ChatModel` and
`org.springframework.ai.embedding.EmbeddingModel`. The adapter lives in a
**separate repository** so the SKaiNET core stays Kotlin Multiplatform and free
of transitive Spring dependencies.

This is the planned follow-up to the neutral SPI work that landed on
`feature/llm-api-neutral-spi` (modules `llm-api` and `llm-providers`).

## Motivation

- Spring AI's `ChatModel`/`EmbeddingModel` pair is the most familiar provider SPI on the
  JVM today. Many users will reach for `spring-ai-starter-model-*` first.
- We **do not** want Spring as a dependency in the SKaiNET core. The neutral SPI
  in `llm-api` already mirrors Spring AI's shape; the adapter is a translation layer
  of a few hundred lines.
- A separate repo keeps Spring AI's release cadence (1.1.x ↔ 2.0.x) decoupled from
  SKaiNET-transformers' release cadence.

## What's already done (in this repo)

On branch `feature/llm-api-neutral-spi`:

- `llm-api/`: KMP module with `ChatModel`, `StreamingChatModel`, `EmbeddingModel`,
  `ChatRequest`/`ChatResponse`/`ChatResponseChunk`, `ChatOptions`,
  `EmbeddingRequest`/`EmbeddingResponse`, `Message`/`Role`,
  `ToolDefinition`/`ToolCall`, `Usage`, `FinishReason`. Kotlin
  `Flow` for streaming. Deps: `kotlin-stdlib` + `kotlinx-coroutines` only.
- `llm-providers/`: JVM module with `SkaiNetChatModel<T>` (wraps any
  `InferenceRuntime<T>` + `Tokenizer` + `ChatTemplate`) and `SkaiNetEmbeddingModel<T>`
  (wraps `BertRuntime<T>`).
- BOM updated; binary-compat baseline (`api/`) generated; basic unit tests for
  mappers and stop-sequence helper.

## Scope (this issue)

A new repository `SKaiNET-spring-ai` with two artifacts.

### 1. `spring-ai-skainet` (adapter library)

```
sk.ainet.llm.spring/
  SpringSkaiNetChatModel        implements org.springframework.ai.chat.model.ChatModel
                                         , org.springframework.ai.chat.model.StreamingChatModel
  SpringSkaiNetEmbeddingModel   implements org.springframework.ai.embedding.EmbeddingModel
  PromptMapper                  Spring AI Prompt/Message ↔ sk.ainet.llm.api.ChatRequest/Message
  OptionsMapper                 Spring AI ChatOptions ↔ sk.ainet.llm.api.ChatOptions
  StreamingBridge               kotlinx.coroutines.flow.Flow → reactor.core.publisher.Flux
                                (via kotlinx-coroutines-reactor `asFlux`)
```

Translation rules:
- Spring `Prompt.getInstructions()` → neutral `List<Message>`. Map roles 1:1.
- Spring `ChatOptions` → neutral `ChatOptions`. `temperature`, `topK`, `topP`,
  `maxTokens`, `stopSequences`, `seed`, `model` map directly. Spring-only knobs
  (`frequencyPenalty`, `presencePenalty`) are dropped with a debug log
  (the underlying SKaiNET runtime does not honor them today).
- Spring `ChatResponse` ← neutral `ChatResponse`. Wrap each neutral `Generation` in a
  Spring `org.springframework.ai.chat.model.Generation`. Carry `Usage` into
  `ChatResponseMetadata`.
- Streaming: `stream(Prompt)` returns `Flux<ChatResponse>`. Each upstream
  `ChatResponseChunk` becomes a `ChatResponse` whose single `Generation` carries the
  delta as content. The final chunk includes `finishReason` and the final usage.
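
The options rule above can be sketched in plain Java. This is a minimal illustration, not the adapter's actual code: `SpringSideOptions`, `NeutralOptions` and `Mapped` are hypothetical stand-in records (the real Spring AI and `sk.ainet.llm.api` types are not reproduced here), and `seed` is left out to keep the sketch short. It copies the shared knobs and reports which Spring-only knobs were dropped, so the caller can emit the debug log.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the options rule: copy the shared knobs, drop the Spring-only ones. */
final class OptionsMapperSketch {

    /** Stand-in for the Spring-side options (assumption: not the real Spring AI interface). */
    record SpringSideOptions(String model, Double temperature, Integer topK, Double topP,
                             Integer maxTokens, List<String> stopSequences,
                             Double frequencyPenalty, Double presencePenalty) {}

    /** Stand-in for the neutral ChatOptions (assumption: real field names may differ). */
    record NeutralOptions(String model, Double temperature, Integer topK, Double topP,
                          Integer maxTokens, List<String> stopSequences) {}

    /** Mapping result: the neutral options plus the names of the knobs that were dropped. */
    record Mapped(NeutralOptions options, List<String> dropped) {}

    static Mapped toNeutral(SpringSideOptions in) {
        // Record every Spring-only knob the caller actually set, so it can be logged.
        List<String> dropped = new ArrayList<>();
        if (in.frequencyPenalty() != null) dropped.add("frequencyPenalty");
        if (in.presencePenalty() != null) dropped.add("presencePenalty");

        // Defensive copy; a missing stop list becomes an empty one.
        List<String> stops = in.stopSequences() == null
                ? List.of()
                : List.copyOf(in.stopSequences());

        NeutralOptions out = new NeutralOptions(in.model(), in.temperature(), in.topK(),
                in.topP(), in.maxTokens(), stops);
        return new Mapped(out, List.copyOf(dropped));
    }

    public static void main(String[] args) {
        SpringSideOptions in = new SpringSideOptions("qwen3-0.6b", 0.7, 40, 0.95, 512,
                List.of("</s>"), 0.5, null);
        Mapped mapped = toNeutral(in);
        System.out.println(mapped.options());
        System.out.println("dropped (debug): " + mapped.dropped());
    }
}
```

Returning the dropped names instead of logging inside the mapper keeps the function pure, which makes the "dropped with a debug log" behavior easy to unit-test.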

### 2. `spring-ai-starter-model-skainet` (Spring Boot starter)

```
sk.ainet.llm.spring.boot/
  SkaiNetAutoConfiguration
      @AutoConfiguration
      @ConditionalOnClass(SpringSkaiNetChatModel.class)
      @EnableConfigurationProperties(SkaiNetProperties.class)
      @ConditionalOnProperty(prefix = "spring.ai.skainet",
                             name = "enabled", havingValue = "true",
                             matchIfMissing = true)
  SkaiNetProperties             prefix = "spring.ai.skainet"
  resources/META-INF/spring/org.springframework.boot.autoconfigure.AutoConfiguration.imports
```

Properties:

```yaml
spring.ai.skainet:
  enabled: true
  chat:
    model-path: /models/qwen3-0.6b.gguf   # required if chat enabled
    model-format: gguf                    # gguf | safetensors
    chat-template: auto                   # auto | qwen | gemma | llama3 | chatml
    options:
      temperature: 0.7
      max-tokens: 512
      top-k: 40                           # accepted, ignored at runtime today
      top-p: 0.95                         # accepted, ignored at runtime today
      stop-sequences: ["</s>"]
  embedding:
    model-path: /models/bge-small-en      # required if embedding enabled
    options:
      model: bge-small-en
```

Conditional bean wiring:
- `@Bean @ConditionalOnMissingBean ChatModel skaiNetChatModel(...)`:
  builds an `OptimizedLLMRuntime` from the configured GGUF path, picks the
  `ChatTemplate` via `ModelRegistry.detect(...)` (or the explicit override),
  wraps in `SkaiNetChatModel`, then in `SpringSkaiNetChatModel`.
- `@Bean @ConditionalOnMissingBean EmbeddingModel skaiNetEmbeddingModel(...)`:
  same pattern with `BertRuntime` (or its eventual `OptimizedLLMRuntime`-based
  replacement) + `SkaiNetEmbeddingModel`.
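
The "detect, unless explicitly overridden" choice of chat template could look roughly like the sketch below. It is an assumption-laden illustration: `ChatTemplateResolverSketch` and `resolve` are hypothetical names, the `detector` supplier stands in for `ModelRegistry.detect(...)` (whose real signature is not shown in this issue), and templates are represented as plain strings taken from the `chat-template` property comment.

```java
import java.util.Locale;
import java.util.Set;
import java.util.function.Supplier;

/** Sketch of "auto-detect unless explicitly overridden" for the chat-template property. */
final class ChatTemplateResolverSketch {

    /** Template names taken from the chat-template property comment in this issue. */
    static final Set<String> KNOWN = Set.of("qwen", "gemma", "llama3", "chatml");

    /**
     * @param configured value of the chat-template property; null, blank or "auto" means detect
     * @param detector   hypothetical stand-in for ModelRegistry.detect(...)
     */
    static String resolve(String configured, Supplier<String> detector) {
        String name = configured == null ? "" : configured.trim().toLowerCase(Locale.ROOT);
        if (name.isEmpty() || name.equals("auto")) {
            return detector.get();          // no override: trust detection
        }
        if (!KNOWN.contains(name)) {
            throw new IllegalArgumentException("Unknown chat-template: " + configured);
        }
        return name;                        // explicit override wins over detection
    }

    public static void main(String[] args) {
        System.out.println(resolve("auto", () -> "qwen"));   // prints "qwen"
        System.out.println(resolve("Gemma", () -> "qwen"));  // prints "gemma"
    }
}
```

Failing fast on an unknown name surfaces a typo in the YAML at startup instead of silently falling back to detection.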

## Open dependencies / blockers

The SPI is callable today, but to make the **starter** usable we need a
"one-call model loader" inside SKaiNET. Two options:

1. Add a `ModelLoader.fromGguf(path): InferenceRuntime<*>` (or similar) inside
   `llm-providers` and call it from the starter. Cleanest.
2. Have the starter replicate the wiring from
   `llm-apps/skainet-cli/Main.kt` (Arena, ExecutionContext, weight loader,
   tokenizer). Works, but bigger surface to maintain.

Recommend option (1) as a small follow-up PR in this repo before standing up
the Spring repo. Track separately if needed.

## Acceptance Criteria

- [ ] `SKaiNET-spring-ai` repo exists with `spring-ai-skainet` and
      `spring-ai-starter-model-skainet` modules and CI green on JDK 17 + 21.
- [ ] Sample Spring Boot app: 30 lines of YAML + a `@RestController` injecting
      `org.springframework.ai.chat.client.ChatClient` returns a non-empty
      response for a Qwen3-0.6B GGUF.
- [ ] `ChatClient` streaming endpoint (`text/event-stream`) emits tokens
      progressively, not in one chunk.
- [ ] `EmbeddingModel.embed("hello").size == dimensions` for a small
      sentence-transformers model.
- [ ] Spring AI 1.1.x compatibility documented; build sets `spring-ai-bom` as
      the dependency-version pin.
- [ ] No Spring or Reactor classes appear anywhere in SKaiNET-transformers' core
      modules.

## Reproduction / Test Plan

Once the companion repo is up:

1. Publish SKaiNET-transformers `llm-api` + `llm-providers` artifacts (snapshot
   to mavenLocal is fine for first iteration).
2. In the new repo, depend on those + `org.springframework.ai:spring-ai-bom:1.1.x`.
3. Run the sample Boot app against `Qwen3-0.6B-Q8_0.gguf`:
   - `POST /chat` with body `"Hello, who are you?"` → 200, non-empty body
   - `GET /chat/stream?q=hello` → `text/event-stream`, multiple chunks
4. Run the embedding sample against a `bge-small-en` checkpoint and check that
   two semantically similar sentences yield cosine similarity > 0.7.
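
For the check in step 4, cosine similarity is the dot product of the two vectors divided by the product of their Euclidean norms. A minimal sketch, assuming the embeddings arrive as `float[]`; the class name `CosineSimilaritySketch` is illustrative and the vectors in `main` are toy values, not real `bge-small-en` output.

```java
/** Cosine similarity between two embedding vectors, as needed for step 4 of the test plan. */
final class CosineSimilaritySketch {

    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                    "dimension mismatch: " + a.length + " vs " + b.length);
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            // Accumulate in double to limit rounding error on long vectors.
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("zero vector has no direction");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] x = {1f, 2f, 3f};
        float[] y = {2f, 4f, 6f};   // same direction as x, similarity close to 1
        float[] z = {3f, 0f, -1f};  // orthogonal to x, similarity close to 0
        System.out.println(cosine(x, y));
        System.out.println(cosine(x, z));
    }
}
```

In the real test, `x` and `y` would be the embeddings of the two similar sentences, and the assertion would be `cosine(x, y) > 0.7`.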

## Reference

- Spring AI Ollama starter
  (`org.springframework.ai:spring-ai-starter-model-ollama`) is the closest
  structural analog for the autoconfig + properties layout.
- Streaming bridge: `kotlinx.coroutines.reactor.asFlux` (artifact
  `org.jetbrains.kotlinx:kotlinx-coroutines-reactor`).
- Memory note in this repo:
  `feedback_neutral_spi_over_framework_coupling.md` (keep Spring out of core).

## Related

- Plan file (this repo, this branch): `.claude/plans/spring-ai-s-own-docs-partitioned-toucan.md`
- Modules: `llm-api/`, `llm-providers/`
- BOM: `llm-bom/build.gradle.kts` already exposes both new modules.
