|
| 1 | +# Spring AI adapter for SKaiNET-transformers (separate companion repo) |
| 2 | + |
| 3 | +**Repository (target):** `SKaiNET-developers/SKaiNET-spring-ai` (new, to be created) |
| 4 | +**Depends on:** `SKaiNET-developers/SKaiNET-transformers` — modules `llm-api`, `llm-providers` |
| 5 | +**Labels:** enhancement, integration, spring |
| 6 | +**Milestone:** — |
| 7 | + |
| 8 | +--- |
| 9 | + |
| 10 | +## Summary |
| 11 | + |
| 12 | +Build a thin Spring AI adapter + Spring Boot starter that exposes the existing |
| 13 | +`sk.ainet.llm.api.ChatModel` / `EmbeddingModel` SPI through Spring AI's |
| 14 | +`org.springframework.ai.chat.model.ChatModel` and |
| 15 | +`org.springframework.ai.embedding.EmbeddingModel`. The adapter lives in a |
| 16 | +**separate repository** so the SKaiNET core stays Kotlin Multiplatform and free |
| 17 | +of Spring transitives. |
| 18 | + |
| 19 | +This is the planned follow-up to the neutral SPI work that landed on |
| 20 | +`feature/llm-api-neutral-spi` (modules `llm-api` and `llm-providers`). |
| 21 | + |
| 22 | +## Motivation |
| 23 | + |
| 24 | +- Spring AI's `ChatModel`/`EmbeddingModel` pair is the most familiar provider SPI on the |
| 25 | + JVM today. Many users will reach for `spring-ai-starter-model-*` first. |
| 26 | +- We **do not** want Spring as a dependency in the SKaiNET core. The neutral SPI |
| 27 | + in `llm-api` already mirrors Spring AI's shape; the adapter is a translation layer |
| 28 | + of a few hundred lines. |
| 29 | +- A separate repo keeps Spring's release cadence (1.1.x ↔ 2.0.x) decoupled from |
| 30 | + SKaiNET-transformers' release cadence. |
| 31 | + |
| 32 | +## What's already done (in this repo) |
| 33 | + |
| 34 | +On branch `feature/llm-api-neutral-spi`: |
| 35 | + |
| 36 | +- `llm-api/` — KMP module with `ChatModel`, `StreamingChatModel`, `EmbeddingModel`, |
| 37 | + `ChatRequest`/`Response`/`Chunk`, `ChatOptions`, `EmbeddingRequest`/`Response`, |
| 38 | + `Message`/`Role`, `ToolDefinition`/`ToolCall`, `Usage`, `FinishReason`. Kotlin |
| 39 | + `Flow` for streaming. Deps: `kotlin-stdlib` + `kotlinx-coroutines` only. |
| 40 | +- `llm-providers/` — JVM module with `SkaiNetChatModel<T>` (wraps any |
| 41 | + `InferenceRuntime<T>` + `Tokenizer` + `ChatTemplate`) and `SkaiNetEmbeddingModel<T>` |
| 42 | + (wraps `BertRuntime<T>`). |
| 43 | +- BOM updated; binary-compat baseline (`api/`) generated; basic unit tests for |
| 44 | + mappers and stop-sequence helper. |
| 45 | + |
| 46 | +## Scope (this issue) |
| 47 | + |
| 48 | +A new repository `SKaiNET-spring-ai` with two artifacts. |
| 49 | + |
| 50 | +### 1. `spring-ai-skainet` — adapter library |
| 51 | + |
| 52 | +``` |
| 53 | +sk.ainet.llm.spring/ |
| 54 | + SpringSkaiNetChatModel implements org.springframework.ai.chat.model.ChatModel |
| 55 | + , org.springframework.ai.chat.model.StreamingChatModel |
| 56 | + SpringSkaiNetEmbeddingModel implements org.springframework.ai.embedding.EmbeddingModel |
| 57 | + PromptMapper Spring AI Prompt/Message ↔ sk.ainet.llm.api.ChatRequest/Message |
| 58 | + OptionsMapper Spring AI ChatOptions ↔ sk.ainet.llm.api.ChatOptions |
| 59 | + StreamingBridge kotlinx.coroutines.flow.Flow → reactor.core.publisher.Flux |
| 60 | + (via kotlinx-coroutines-reactor `asFlux`) |
| 61 | +``` |
| 62 | + |
| 63 | +Translation rules: |
| 64 | +- Spring `Prompt.getInstructions()` → neutral `List<Message>`. Map roles 1:1. |
| 65 | +- Spring `ChatOptions` → neutral `ChatOptions`. `temperature`, `topK`, `topP`, |
| 66 | + `maxTokens`, `stopSequences`, `seed`, `model` map directly. Spring-only knobs |
| 67 | + (`frequencyPenalty`, `presencePenalty`) are dropped with a debug log |
| 68 | + (the underlying SKaiNET runtime does not honor them today). |
| 69 | +- `ChatResponse` ← neutral `ChatResponse`. Wrap each neutral `Generation` in a |
| 70 | + Spring `org.springframework.ai.chat.model.Generation`. Carry `Usage` into |
| 71 | + `ChatResponseMetadata`. |
| 72 | +- Streaming: `stream(Prompt)` returns `Flux<ChatResponse>`. Each upstream |
| 73 | + `ChatResponseChunk` becomes a `ChatResponse` whose single `Generation` carries the |
| 74 | + delta as content. Final chunk includes finishReason + final usage. |
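The options rule above can be sketched in plain Java. This is a minimal illustration, not the adapter's actual code: `SpringSideOptions`, `NeutralOptions`, and `Mapped` are hypothetical stand-in records (the real Spring AI and `sk.ainet.llm.api` types are not on the classpath here and may differ in shape), and the JDK's `System.Logger` stands in for whatever logging facade the adapter ends up using.

```java
import java.util.ArrayList;
import java.util.List;

final class OptionsMapperSketch {

    /** Hypothetical stand-in for the Spring AI side; every field is nullable ("not set"). */
    record SpringSideOptions(String model, Double temperature, Integer topK, Double topP,
                             Integer maxTokens, List<String> stopSequences, Long seed,
                             Double frequencyPenalty, Double presencePenalty) {}

    /** Hypothetical stand-in for the neutral options; the real class may differ. */
    record NeutralOptions(String model, Double temperature, Integer topK, Double topP,
                          Integer maxTokens, List<String> stopSequences, Long seed) {}

    /** Mapping result: the neutral options plus the names of the knobs that were dropped. */
    record Mapped(NeutralOptions options, List<String> dropped) {}

    private static final System.Logger LOG = System.getLogger("OptionsMapperSketch");

    static Mapped toNeutral(SpringSideOptions in) {
        // Spring-only knobs are not forwarded; remember which ones were set.
        List<String> dropped = new ArrayList<>();
        if (in.frequencyPenalty() != null) dropped.add("frequencyPenalty");
        if (in.presencePenalty() != null) dropped.add("presencePenalty");
        for (String name : dropped) {
            LOG.log(System.Logger.Level.DEBUG, "Dropping unsupported option: {0}", name);
        }
        // Everything else maps field by field; a missing stop list becomes an empty one.
        List<String> stops = in.stopSequences() == null
                ? List.of()
                : List.copyOf(in.stopSequences());
        NeutralOptions out = new NeutralOptions(in.model(), in.temperature(), in.topK(),
                in.topP(), in.maxTokens(), stops, in.seed());
        return new Mapped(out, List.copyOf(dropped));
    }

    public static void main(String[] args) {
        SpringSideOptions in = new SpringSideOptions("qwen3-0.6b", 0.7, 40, 0.95, 512,
                List.of("</s>"), null, 0.2, null);
        Mapped m = toNeutral(in);
        System.out.println(m.options());
        System.out.println("dropped: " + m.dropped());
    }
}
```

Returning the dropped names alongside the mapped options keeps the mapper easy to unit-test without capturing log output.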
| 75 | + |
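The per-chunk streaming rule can be illustrated the same way. The sketch below leaves out the `Flow` to `Flux` bridge and shows only the element mapping that would run inside it. `Chunk`, `Usage`, and `SingleGenerationResponse` are hypothetical stand-ins for the neutral `ChatResponseChunk` and the Spring-side `ChatResponse`, and it assumes the finish reason and usage are null on every chunk except the last.

```java
import java.util.List;
import java.util.stream.Collectors;

final class StreamingMappingSketch {

    /** Hypothetical stand-in for token usage. */
    record Usage(int promptTokens, int completionTokens) {}

    /** Stand-in for a neutral chunk; finishReason and usage are assumed null until the final chunk. */
    record Chunk(String delta, String finishReason, Usage usage) {}

    /** Stand-in for a Spring-side response that holds exactly one generation. */
    record SingleGenerationResponse(String content, String finishReason, Usage usage) {}

    /** One upstream chunk becomes one response whose content is the delta. */
    static SingleGenerationResponse toResponse(Chunk c) {
        return new SingleGenerationResponse(c.delta(), c.finishReason(), c.usage());
    }

    static List<SingleGenerationResponse> mapAll(List<Chunk> chunks) {
        return chunks.stream().map(StreamingMappingSketch::toResponse).toList();
    }

    public static void main(String[] args) {
        List<Chunk> upstream = List.of(
                new Chunk("Hel", null, null),
                new Chunk("lo", null, null),
                new Chunk("!", "stop", new Usage(5, 3)));
        List<SingleGenerationResponse> out = mapAll(upstream);
        // Concatenating the deltas reconstructs the full completion text.
        String text = out.stream()
                .map(SingleGenerationResponse::content)
                .collect(Collectors.joining());
        System.out.println(text);
        System.out.println(out.get(out.size() - 1));
    }
}
```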
| 76 | +### 2. `spring-ai-starter-model-skainet` — Spring Boot starter |
| 77 | + |
| 78 | +``` |
| 79 | +sk.ainet.llm.spring.boot/ |
| 80 | + SkaiNetAutoConfiguration |
| 81 | + @AutoConfiguration |
| 82 | + @ConditionalOnClass(SpringSkaiNetChatModel.class) |
| 83 | + @EnableConfigurationProperties(SkaiNetProperties.class) |
| 84 | + @ConditionalOnProperty(prefix = "spring.ai.skainet", |
| 85 | + name = "enabled", havingValue = "true", |
| 86 | + matchIfMissing = true) |
| 87 | + SkaiNetProperties prefix = "spring.ai.skainet" |
| 88 | + resources/META-INF/spring/org.springframework.boot.autoconfigure.AutoConfiguration.imports |
| 89 | +``` |
| 90 | + |
| 91 | +Properties: |
| 92 | + |
| 93 | +```yaml |
| 94 | +spring.ai.skainet: |
| 95 | + enabled: true |
| 96 | + chat: |
| 97 | + model-path: /models/qwen3-0.6b.gguf # required if chat enabled |
| 98 | + model-format: gguf # gguf | safetensors |
| 99 | + chat-template: auto # auto | qwen | gemma | llama3 | chatml |
| 100 | + options: |
| 101 | + temperature: 0.7 |
| 102 | + max-tokens: 512 |
| 103 | + top-k: 40 # accepted, ignored at runtime today |
| 104 | + top-p: 0.95 # accepted, ignored at runtime today |
| 105 | + stop-sequences: ["</s>"] |
| 106 | + embedding: |
| 107 | + model-path: /models/bge-small-en # required if embedding enabled |
| 108 | + options: |
| 109 | + model: bge-small-en |
| 110 | +``` |
| 111 | + |
| 112 | +Conditional bean wiring: |
| 113 | +- `@Bean @ConditionalOnMissingBean ChatModel skaiNetChatModel(...)` — |
| 114 | + builds an `OptimizedLLMRuntime` from the configured GGUF path, picks the |
| 115 | + `ChatTemplate` via `ModelRegistry.detect(...)` (or the explicit override), |
| 116 | + wraps in `SkaiNetChatModel`, then in `SpringSkaiNetChatModel`. |
| 117 | +- `@Bean @ConditionalOnMissingBean EmbeddingModel skaiNetEmbeddingModel(...)` — |
| 118 | + same pattern with `BertRuntime` (or its eventual `OptimizedLLMRuntime`-based |
| 119 | + replacement) + `SkaiNetEmbeddingModel`. |
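How the chat bean might choose its template (auto-detection unless `chat-template` names one explicitly) can be sketched without Spring. The `Template` enum and the `detector` supplier are hypothetical: the supplier stands in for `ModelRegistry.detect(...)`, whose signature this issue does not spell out, and the enum values simply mirror the property values listed above.

```java
import java.util.Locale;
import java.util.function.Supplier;

final class ChatTemplateResolutionSketch {

    /** Hypothetical enum mirroring the non-"auto" values of the chat-template property. */
    enum Template { QWEN, GEMMA, LLAMA3, CHATML }

    /**
     * Resolves the template to use. A missing, blank, or "auto" value defers to the
     * detector; anything else must name a known template.
     *
     * @throws IllegalArgumentException if the configured name is not a known template
     */
    static Template resolve(String configured, Supplier<Template> detector) {
        if (configured == null || configured.isBlank()
                || configured.trim().equalsIgnoreCase("auto")) {
            return detector.get();
        }
        return Template.valueOf(configured.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Pretend detection from the model file yields QWEN.
        Supplier<Template> detector = () -> Template.QWEN;
        System.out.println(resolve("auto", detector));
        System.out.println(resolve("gemma", detector));
    }
}
```

Failing fast on an unknown name surfaces a typo in `application.yml` at startup rather than at the first chat request.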
| 120 | + |
| 121 | +## Open dependencies / blockers |
| 122 | + |
| 123 | +The SPI is callable today, but to make the **starter** usable we need a |
| 124 | +"one-call model loader" inside SKaiNET. Two options: |
| 125 | + |
| 126 | +1. Add a `ModelLoader.fromGguf(path): InferenceRuntime<*>` (or similar) inside |
| 127 | + `llm-providers` and call it from the starter. Cleanest. |
| 128 | +2. Have the starter replicate the wiring from |
| 129 | + `llm-apps/skainet-cli/Main.kt` (Arena, ExecutionContext, weight loader, |
| 130 | + tokenizer). Works, but bigger surface to maintain. |
| 131 | + |
| 132 | +Recommend option (1) as a small follow-up PR in this repo before standing up |
| 133 | +the Spring repo. Track separately if needed. |
| 134 | + |
| 135 | +## Acceptance Criteria |
| 136 | + |
| 137 | +- [ ] `SKaiNET-spring-ai` repo exists with `spring-ai-skainet` and |
| 138 | + `spring-ai-starter-model-skainet` modules and CI green on JDK 17 + 21. |
| 139 | +- [ ] Sample Spring Boot app (30 lines of YAML plus a `@RestController` that injects |
| 140 | + `org.springframework.ai.chat.client.ChatClient`) returns a non-empty |
| 141 | + response for a Qwen3-0.6B GGUF. |
| 142 | +- [ ] `ChatClient` streaming endpoint (`text/event-stream`) emits tokens |
| 143 | + progressively, not in one chunk. |
| 144 | +- [ ] `EmbeddingModel.embed("hello")` returns a vector whose length equals |
| 145 | + `EmbeddingModel.dimensions()` for a small sentence-transformers model. |
| 146 | +- [ ] Spring AI 1.1.x compatibility documented; the build pins dependency |
| 147 | + versions through `spring-ai-bom`. |
| 148 | +- [ ] No Spring or Reactor classes appear anywhere in SKaiNET-transformers' core |
| 149 | + modules. |
| 150 | + |
| 151 | +## Reproduction / Test Plan |
| 152 | + |
| 153 | +Once the companion repo is up: |
| 154 | + |
| 155 | +1. Publish SKaiNET-transformers `llm-api` + `llm-providers` artifacts (snapshot |
| 156 | + to mavenLocal is fine for first iteration). |
| 157 | +2. In the new repo, depend on those + `org.springframework.ai:spring-ai-bom:1.1.x`. |
| 158 | +3. Run the sample Boot app against `Qwen3-0.6B-Q8_0.gguf`: |
| 159 | + - `POST /chat` with body `"Hello, who are you?"` → 200, non-empty body |
| 160 | + - `GET /chat/stream?q=hello` → `text/event-stream`, multiple chunks |
| 161 | +4. Run the embedding sample against a `bge-small-en` checkpoint and check that |
| 162 | + two semantically similar sentences yield cosine similarity > 0.7. |
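The similarity check in step 4 needs a cosine helper. The sketch below is a self-contained version over `float[]`, which is the vector type Spring AI's `EmbeddingModel.embed(String)` returns; the class name and the sample vectors are made up for illustration and are not real embeddings.

```java
final class CosineSimilaritySketch {

    /** Cosine similarity of two equal-length, non-zero vectors. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                    "dimension mismatch: " + a.length + " vs " + b.length);
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            // Accumulate in double to limit rounding error on long vectors.
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("cosine is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] x = {1f, 2f, 3f};
        float[] y = {2f, 4f, 6f};   // same direction as x
        float[] z = {3f, 0f, -1f};  // orthogonal to x
        System.out.println("x vs y above threshold: " + (cosine(x, y) > 0.7));
        System.out.println("x vs z above threshold: " + (cosine(x, z) > 0.7));
    }
}
```

In the real test the two inputs would be the embeddings of the two sentences, and the assertion would be `cosine(e1, e2) > 0.7`.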
| 163 | + |
| 164 | +## Reference |
| 165 | + |
| 166 | +- Spring AI Ollama starter |
| 167 | + (`org.springframework.ai:spring-ai-starter-model-ollama`) is the closest |
| 168 | + structural analog for the autoconfig + properties layout. |
| 169 | +- Streaming bridge: `kotlinx.coroutines.reactor.asFlux` (artifact |
| 170 | + `org.jetbrains.kotlinx:kotlinx-coroutines-reactor`). |
| 171 | +- Memory note in this repo: |
| 172 | + `feedback_neutral_spi_over_framework_coupling.md` — keep Spring out of core. |
| 173 | + |
| 174 | +## Related |
| 175 | + |
| 176 | +- Plan file (this repo, this branch): `.claude/plans/spring-ai-s-own-docs-partitioned-toucan.md` |
| 177 | +- Modules: `llm-api/`, `llm-providers/` |
| 178 | +- BOM: `llm-bom/build.gradle.kts` already exposes both new modules. |