---
title: "Embeddings"
description: "EmbeddingRuntime SPI — runtime-agnostic text embedding across Spring AI, LangChain4j, Semantic Kernel, Embabel, and the Built-in client"
---

# Embeddings

`EmbeddingRuntime` is the sibling SPI to `AgentRuntime` for text-embedding
generation. Each supported LLM framework ships an implementation discovered
through `ServiceLoader`, so your RAG pipeline can swap embedding backends by
changing one dependency — exactly the same contract as `@AiEndpoint` and
`@AiTool` across chat runtimes.

## Why a separate SPI

Prior to 4.0.36 the `modules/rag` package consumed provider-specific embedding
APIs directly (Spring AI `EmbeddingModel`, LangChain4j `EmbeddingModel`, etc.),
which meant RAG code locked you into one backend. `EmbeddingRuntime` lifts
that into a runtime-agnostic SPI with a uniform API:

- `float[] embed(String text)` — embed a single text into a vector
- `List<float[]> embedAll(List<String> texts)` — batch variant (preferred for
  amortizing per-request overhead)
- `int dimensions()` — the vector length, or `-1` if the runtime cannot
  answer without a network call
- `boolean isAvailable()` — whether this runtime's native API is on the
  classpath AND a concrete embedding model has been wired
- `String name()` — human-readable ID: `"spring-ai"`, `"langchain4j"`,
  `"semantic-kernel"`, `"embabel"`, `"built-in"`
- `int priority()` — selection priority when multiple runtimes are present
  (higher wins)

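The method list above can be sketched as a plain interface. This is an illustrative paraphrase, not the actual `org.atmosphere.ai` source, and the `HashingEmbeddingRuntime` toy implementation below is invented purely to show how the pieces fit together:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the SPI surface listed above; the real interface lives in
// org.atmosphere.ai and may differ in modifiers and default methods.
interface EmbeddingRuntimeSketch {
    float[] embed(String text);
    List<float[]> embedAll(List<String> texts);
    int dimensions();      // -1 when unknown without a network call
    boolean isAvailable();
    String name();
    int priority();        // higher wins when several runtimes are present
}

// Invented toy implementation: deterministic hash-bucket vectors,
// useful only to demonstrate the contract, never for real retrieval.
class HashingEmbeddingRuntime implements EmbeddingRuntimeSketch {
    private static final int DIM = 8;

    @Override public float[] embed(String text) {
        float[] v = new float[DIM];
        for (int i = 0; i < text.length(); i++) {
            v[i % DIM] += text.charAt(i);   // fold characters into DIM buckets
        }
        return v;
    }

    @Override public List<float[]> embedAll(List<String> texts) {
        List<float[]> out = new ArrayList<>(texts.size());
        for (String t : texts) out.add(embed(t));   // naive batch: one call per text
        return out;
    }

    @Override public int dimensions()      { return DIM; }
    @Override public boolean isAvailable() { return true; }
    @Override public String name()         { return "hashing-toy"; }
    @Override public int priority()        { return 0; }
}
```

A real adapter would delegate `embedAll` to the framework's native batch API instead of looping, which is exactly why the SPI exposes it separately.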
## Runtime priorities

The default resolver picks the highest-priority runtime whose
`isAvailable()` returns `true`. Priorities are stable across releases so
adapter wrappers always win over the zero-dependency fallback:

| Runtime | Priority | Status |
|---------|---------:|--------|
| Spring AI (`spring-ai`) | **200** | Available when a Spring-managed `EmbeddingModel` bean is wired |
| LangChain4j (`langchain4j`) | **190** | Available when a `dev.langchain4j.model.embedding.EmbeddingModel` instance is injected |
| Semantic Kernel (`semantic-kernel`) | **180** | Available when a `TextEmbeddingGenerationService` is supplied. Uses `Mono.block()` at the sync boundary and unwraps `List<Float>` → `float[]` |
| Embabel (`embabel`) | **170** | Thin pass-through over `com.embabel.common.ai.model.EmbeddingService` |
| Built-in (`built-in`) | **50** | Zero-dependency OpenAI-compatible `/v1/embeddings` client. Fallback used when no framework-native `EmbeddingModel` is wired. |

The Built-in runtime sits below every adapter so a framework-native
`EmbeddingModel` always wins when present. Direct `OpenAiCompatibleClient`
callers bypass the resolver and use the Built-in implementation unconditionally.

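The selection rule itself is small enough to sketch. The stand-in types below are invented for illustration; the real `EmbeddingRuntimeResolver` discovers its candidates through `ServiceLoader` rather than taking a list:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Minimal stand-in for the SPI, just enough to demonstrate resolution.
interface RuntimeCandidate {
    boolean isAvailable();
    int priority();
    String name();
}

class ResolverSketch {
    // In the real resolver the candidates come from
    // ServiceLoader.load(EmbeddingRuntime.class); here they are passed in.
    static Optional<RuntimeCandidate> resolve(List<RuntimeCandidate> discovered) {
        return discovered.stream()
                .filter(RuntimeCandidate::isAvailable)
                .max(Comparator.comparingInt(RuntimeCandidate::priority));
    }

    // Helper to build a fixed candidate for experimentation.
    static RuntimeCandidate candidate(String name, int priority, boolean available) {
        return new RuntimeCandidate() {
            @Override public boolean isAvailable() { return available; }
            @Override public int priority()        { return priority; }
            @Override public String name()         { return name; }
        };
    }
}
```

With both a wired Spring AI adapter (200) and the Built-in fallback (50) on the classpath, `spring-ai` wins; if the Spring model is never wired, its `isAvailable()` stays `false` and the fallback is chosen.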
## Auto-discovery

Add the corresponding `atmosphere-*` dependency to your project:

```xml
<properties>
  <atmosphere.version>4.0.36</atmosphere.version>
</properties>

<!-- Spring AI embedding runtime (priority 200) -->
<dependency>
  <groupId>org.atmosphere</groupId>
  <artifactId>atmosphere-spring-ai</artifactId>
  <version>${atmosphere.version}</version>
</dependency>

<!-- or LangChain4j (priority 190) -->
<dependency>
  <groupId>org.atmosphere</groupId>
  <artifactId>atmosphere-langchain4j</artifactId>
  <version>${atmosphere.version}</version>
</dependency>

<!-- or Semantic Kernel (priority 180) -->
<dependency>
  <groupId>org.atmosphere</groupId>
  <artifactId>atmosphere-semantic-kernel</artifactId>
  <version>${atmosphere.version}</version>
</dependency>
```

On application startup Atmosphere scans the classpath via
`ServiceLoader<EmbeddingRuntime>` and picks the highest-priority
`isAvailable()` runtime. No code changes are needed when swapping backends.

## Programmatic usage

```java
import java.util.List;

import org.atmosphere.ai.EmbeddingRuntime;
import org.atmosphere.ai.EmbeddingRuntimeResolver;

// Resolve the active runtime (highest-priority available)
EmbeddingRuntime runtime = EmbeddingRuntimeResolver.resolve()
    .orElseThrow(() -> new IllegalStateException(
        "No EmbeddingRuntime available — add atmosphere-spring-ai or " +
        "atmosphere-langchain4j to the classpath, or configure AiConfig " +
        "for the Built-in fallback"));

// Single embedding
float[] vector = runtime.embed("Atmosphere is the unified AI runtime abstraction on the JVM.");
System.out.println("Dimensions: " + vector.length);

// Batch embedding (preferred for multiple inputs)
List<float[]> vectors = runtime.embedAll(List.of(
    "First document",
    "Second document",
    "Third document"));
```

`EmbeddingRuntimeResolver.resolveAll()` returns all available runtimes in
priority order if you need to fan out or pick a specific backend manually.

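Picking a specific backend by name is then a simple filter over that priority-ordered list. The sketch below uses an invented `Rt` record in place of real runtimes; with the real API the list would come from `EmbeddingRuntimeResolver.resolveAll()`:

```java
import java.util.List;
import java.util.Optional;

class PickByName {
    // Invented stand-in for an EmbeddingRuntime, carrying just the
    // two fields this example needs.
    record Rt(String name, int priority) {}

    // resolveAll() is documented to return runtimes in priority order,
    // so the first name match is also the highest-priority match.
    static Optional<Rt> pick(List<Rt> inPriorityOrder, String wanted) {
        return inPriorityOrder.stream()
                .filter(rt -> rt.name().equals(wanted))
                .findFirst();
    }
}
```

An empty `Optional` means the requested backend was never discovered or never became available, which is worth surfacing as a configuration error rather than silently falling back.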
## Wiring a framework-native embedding model

The adapter runtimes (Spring AI, LangChain4j, Semantic Kernel, Embabel)
wrap a framework-managed `EmbeddingModel` / `EmbeddingService` /
`TextEmbeddingGenerationService` instance. Wire it via the adapter's
static setter during startup:

### Spring AI

```java
@Configuration
public class EmbeddingConfig {

    @Bean
    EmbeddingModel openAiEmbeddingModel() {
        EmbeddingModel model = new OpenAiEmbeddingModel(...);
        // Hand the Spring-managed model to Atmosphere's adapter at creation
        // time (@PostConstruct methods cannot take parameters).
        SpringAiEmbeddingRuntime.setEmbeddingModel(model);
        return model;
    }
}
```


### LangChain4j

```java
var lc4jEmbedder = OpenAiEmbeddingModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .build();
LangChain4jEmbeddingRuntime.setEmbeddingModel(lc4jEmbedder);
```

### Semantic Kernel

```java
var skService = OpenAITextEmbeddingGenerationService.builder()
    .withApiKey(System.getenv("OPENAI_API_KEY"))
    .withModelId("text-embedding-3-small")
    .build();
SemanticKernelEmbeddingRuntime.setEmbeddingService(skService);
```

### Embabel

```java
EmbabelEmbeddingRuntime.setEmbeddingService(embabelEmbeddingService);
```

### Built-in (zero-dep fallback)

No wiring needed — the Built-in runtime reads `AiConfig.baseUrl` + `apiKey`
at call time and POSTs to `/v1/embeddings` on any OpenAI-compatible endpoint
(OpenAI, Azure OpenAI, Gemini's OpenAI gateway, Ollama, LocalAI, etc.).

```java
// Override the default model name if needed
var builtIn = new BuiltInEmbeddingRuntime();
builtIn.setEmbeddingModel("text-embedding-3-large");
```
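To make the wire contract concrete, the request body such a client must produce can be hand-rolled in a few lines. This is an illustrative sketch of the public `/v1/embeddings` request shape, not Atmosphere's internal serialization:

```java
import java.util.List;
import java.util.stream.Collectors;

// Builds the {"model": ..., "input": [...]} body the OpenAI-compatible
// /v1/embeddings endpoint expects. Escaping here is minimal (backslashes
// and quotes only); a real client would use a proper JSON serializer.
class EmbeddingsRequestSketch {
    static String body(String model, List<String> inputs) {
        String joined = inputs.stream()
                .map(s -> "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"")
                .collect(Collectors.joining(", "));
        return "{\"model\": \"" + model + "\", \"input\": [" + joined + "]}";
    }
}
```

The response carries one `embedding` array per input under `data`, in input order, which maps directly onto `embedAll`'s `List<float[]>` return shape.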

## Contract tests — `AbstractEmbeddingRuntimeContractTest`

Every concrete `EmbeddingRuntime` ships with a contract-test subclass of
`AbstractEmbeddingRuntimeContractTest`. The base assertions exercise
`embed()`, `embedAll()`, `dimensions()`, and `isAvailable()` with a
deterministic fake embedder so the bridge plumbing is validated without
live network calls.

The six parity assertions are:

1. `runtimeHasStableName()` — `name()` returns a non-blank, stable identifier
2. `embedSingleTextReturnsVectorOfExpectedDimension()` — single-text round-trip
3. `embedAllReturnsVectorPerInputInOrder()` — batch round-trip preserves order
4. `embedAllWithEmptyListReturnsEmptyList()` — edge case
5. `runtimeIsAvailableAfterFakeInjection()` — availability gate flips on injection
6. `dimensionsAccessorIsNonNegativeOrMinusOne()` — dimension advertising contract

If you add a new `EmbeddingRuntime` implementation, subclass the base and
supply a deterministic fake embedder via `installFakeEmbedder()` — no
need to write the same assertions again.

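For a feel of what the base class checks, the six assertions can be approximated in plain Java. Everything below (the `Rt` stand-in, the fake, the checks) is invented for illustration; the real suite runs through JUnit via `AbstractEmbeddingRuntimeContractTest`:

```java
import java.util.List;

class ContractSketch {
    // Stand-in for the SPI surface the contract exercises.
    interface Rt {
        float[] embed(String text);
        List<float[]> embedAll(List<String> texts);
        int dimensions();
        boolean isAvailable();
        String name();
    }

    // Deterministic fake: one-dimensional vector holding the text length.
    static Rt fake() {
        return new Rt() {
            @Override public float[] embed(String t) { return new float[] { t.length() }; }
            @Override public List<float[]> embedAll(List<String> ts) {
                return ts.stream().map(this::embed).toList();
            }
            @Override public int dimensions()      { return 1; }
            @Override public boolean isAvailable() { return true; }
            @Override public String name()         { return "fake"; }
        };
    }

    static void runContract(Rt rt) {
        // 1. stable, non-blank name
        if (rt.name() == null || rt.name().isBlank()) throw new AssertionError("name");
        // 2. single-text round-trip matches the advertised dimension
        if (rt.embed("abc").length != rt.dimensions()) throw new AssertionError("embed");
        // 3. batch returns one vector per input, in order
        List<float[]> batch = rt.embedAll(List.of("a", "bb"));
        if (batch.size() != 2 || batch.get(0)[0] != 1 || batch.get(1)[0] != 2)
            throw new AssertionError("embedAll order");
        // 4. empty batch -> empty list
        if (!rt.embedAll(List.of()).isEmpty()) throw new AssertionError("empty");
        // 5. availability gate is open once a fake is injected
        if (!rt.isAvailable()) throw new AssertionError("available");
        // 6. dimensions() is non-negative or the -1 sentinel
        if (rt.dimensions() < -1) throw new AssertionError("dimensions");
    }
}
```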
## Capabilities matrix

| Runtime | `embed()` | `embedAll()` batched | `dimensions()` | Notes |
|---------|:---------:|:--------------------:|:--------------:|-------|
| Spring AI | ✅ | ✅ native batch | ✅ | `model.dimensions()` |
| LangChain4j | ✅ | ✅ native batch via `TextSegment` | ✅ | `model.dimension()` |
| Semantic Kernel | ✅ | ✅ native batch | `-1` | `Mono.block()` sync boundary |
| Embabel | ✅ | ✅ native batch | ✅ | 1:1 pass-through |
| Built-in | ✅ | ✅ via `/v1/embeddings` | ✅ from config | OpenAI-compatible wire format |

## See also

- [AI / LLM Reference](../../reference/ai/) — `AgentRuntime` SPI and capability matrix
- [RAG Module](../../reference/ai/#rag) — how RAG pipelines consume `EmbeddingRuntime`
- [Spring AI Integration](../../integrations/spring-ai/)
- [LangChain4j Integration](../../integrations/langchain4j/)