Commit 97170a2

feat: add TRT-LLM, Dynamo KVBM integrations + dynamo-semblend Rust crate
Major additions for SemBlend v0.2.0:

TRT-LLM integration (semblend/integration/trtllm/):
- TRTLLMPyTorchBackend implementing SemBlendBackend ABC
- KV cache layout adapter (stride computation for TRT-LLM's paged cache)
- Model engine hook with 3 approaches (token sub, radix patch, block inject)
- SemanticCacheLookupProvider + PostPrefixLoadHook upstream ABCs
- SemBlendProvider reference implementation
- Turnkey launcher (semblend-trtllm CLI)
- 54 tests passing

Dynamo KVBM integration (semblend/integration/dynamo/):
- SemBlendKvIndexerWrapper wrapping Dynamo's KvIndexer
- SemBlendEventPublisher for NATS semantic events
- 14 tests passing

dynamo-semblend Rust crate:
- SemanticKvIndexer implementing KvIndexerInterface trait
- DonorStore with SIMD cosine similarity
- EmbedClient for MiniLM sidecar
- 16 Rust tests passing

Deploy infrastructure:
- TRT-LLM: K8s manifests, Docker Compose, engine build scripts
- Dynamo: DynamoGraphDeployment configs, SemBlend proxy
- Benchmark results: TRT-LLM baseline + Dynamo baseline + SemBlend

Benchmark results (Dynamo + TRT-LLM, 736 samples):
- NarrativeQA: 19.3% baseline → 29.3% with SemBlend (+10pp)
- Full 5-dataset baseline established for TRT-LLM and Dynamo KVBM

Signed-off-by: Zach Bennett <zach@worldflowai.com>
1 parent a786902 commit 97170a2

3,692 files changed

Lines changed: 20140 additions & 0 deletions

deploy/dynamo/model-cache-pvc.yaml

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: model-cache
+  namespace: semblend-trtllm
+spec:
+  accessModes:
+    - ReadWriteOnce
+  resources:
+    requests:
+      storage: 50Gi
+  storageClassName: "gp3"
