Commit 97170a2
feat: add TRT-LLM, Dynamo KVBM integrations + dynamo-semblend Rust crate
Major additions for SemBlend v0.2.0:
TRT-LLM integration (semblend/integration/trtllm/):
- TRTLLMPyTorchBackend implementing SemBlendBackend ABC
- KV cache layout adapter (stride computation for TRT-LLM's paged cache)
- Model engine hook with three approaches (token substitution, radix patch, block injection)
- SemanticCacheLookupProvider + PostPrefixLoadHook upstream ABCs
- SemBlendProvider reference implementation
- Turnkey launcher (semblend-trtllm CLI)
- 54 tests passing
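The "stride computation" item for the KV cache layout adapter can be illustrated with a minimal sketch. The pool shape used here, `(num_blocks, 2, num_kv_heads, tokens_per_block, head_dim)`, is an assumption chosen for illustration and is not confirmed to match TRT-LLM's actual paged cache layout; `row_major_strides`, `kv_element_offset`, and `locate_token` are hypothetical names, not functions from this commit.

```python
def row_major_strides(shape):
    """Return element strides for a contiguous row-major array of `shape`."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def kv_element_offset(shape, block, kv, head, token, channel=0):
    """Flat element offset of one K (kv=0) or V (kv=1) entry in the pool."""
    index = (block, kv, head, token, channel)
    for i, dim in zip(index, shape):
        if not 0 <= i < dim:
            raise IndexError(f"index {index} out of range for shape {shape}")
    return sum(i * s for i, s in zip(index, row_major_strides(shape)))


def locate_token(block_table, tokens_per_block, position):
    """Map a logical token position to (physical block id, slot in block)."""
    return block_table[position // tokens_per_block], position % tokens_per_block


# Assumed pool: 4 blocks, K and V planes, 2 KV heads, 8 tokens per block, head_dim 16.
POOL_SHAPE = (4, 2, 2, 8, 16)
STRIDES = row_major_strides(POOL_SHAPE)

# A sequence whose logical blocks live in physical blocks 3, 0, 2 (made-up table).
block_id, slot = locate_token([3, 0, 2], tokens_per_block=8, position=10)
offset = kv_element_offset(POOL_SHAPE, block_id, kv=1, head=0, token=slot)
```

An adapter for a different layout would only need to change the shape tuple and the index order; the stride arithmetic stays the same.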
Dynamo KVBM integration (semblend/integration/dynamo/):
- SemBlendKvIndexerWrapper wrapping Dynamo's KvIndexer
- SemBlendEventPublisher for NATS semantic events
- 14 tests passing
dynamo-semblend Rust crate:
- SemanticKvIndexer implementing KvIndexerInterface trait
- DonorStore with SIMD cosine similarity
- EmbedClient for MiniLM sidecar
- 16 Rust tests passing
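As a rough illustration of what a donor lookup by cosine similarity involves, here is a scalar Python sketch. It is not the crate's Rust code and uses no SIMD; `ToyDonorStore`, its methods, and the 0.8 threshold are hypothetical and chosen only for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("embedding dimensions differ")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyDonorStore:
    """Linear-scan store mapping donor ids to embeddings (illustrative only)."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed cutoff, not taken from the commit
        self._donors = []  # list of (donor_id, embedding)

    def insert(self, donor_id, embedding):
        self._donors.append((donor_id, list(embedding)))

    def best_match(self, query):
        """Return (donor_id, score) for the most similar donor at or above the threshold, else None."""
        best_id, best_score = None, self.threshold
        for donor_id, emb in self._donors:
            score = cosine_similarity(query, emb)
            if score >= best_score:
                best_id, best_score = donor_id, score
        return (best_id, best_score) if best_id is not None else None


store = ToyDonorStore(threshold=0.8)
store.insert("a", [1.0, 0.0])
store.insert("b", [0.0, 1.0])
hit = store.best_match([0.9, 0.1])
miss = store.best_match([1.0, 1.0])
```

A SIMD version would compute the same dot product and norms over vector lanes; the thresholded best-match logic would be unchanged.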
Deploy infrastructure:
- TRT-LLM: K8s manifests, Docker Compose, engine build scripts
- Dynamo: DynamoGraphDeployment configs, SemBlend proxy
- Benchmark results: TRT-LLM baseline + Dynamo baseline + SemBlend
Benchmark results (Dynamo + TRT-LLM, 736 samples):
- NarrativeQA: 19.3% baseline → 29.3% with SemBlend (+10pp)
- Full 5-dataset baseline established for TRT-LLM and Dynamo KVBM
Signed-off-by: Zach Bennett <zach@worldflowai.com>
1 parent a786902 commit 97170a2
3,692 files changed
Lines changed: 20140 additions & 0 deletions
File tree
- deploy
- dynamo
- results
- trtllm
- k8s
- results
- scripts
- triton_model_repo
- ensemble
- tensorrt_llm
- dynamo-semblend
- src
- target
- debug
- .fingerprint
- anyhow-24ec003b790501cc
- anyhow-40fe6c014f4e8bb6
- anyhow-c8a37cef83c7ec44
- anyhow-d838341a95e3a987
- async-trait-e3fa7891f8ee9ae8
- atomic-waker-2ce81777cb728a88
- atomic-waker-b0bffc3cf38abbeb
- base64-80c574e62179e036
- base64-c78aae63be61ba3a
- bitflags-3167ba57ef9ec9ce
- bitflags-ccb9bc03a9edff55
- bytes-5544e1f3425bf39c
- bytes-8569d2878c2d2684
- cfg-if-910eb05bc9a9e024
- cfg-if-eed30869daa6a03e
- core-foundation-132092672ce4b94e
- core-foundation-865bbc324448bb6c
- core-foundation-f5b33ef795c6fce5
- core-foundation-fa0e608840733e1d
- core-foundation-sys-288a2d755650be3d
- core-foundation-sys-b58d4b6cd84d5e88
- displaydoc-08c1703951360da9
- dynamo-semblend-436abff61e5d50b5
- dynamo-semblend-8a3db464811faaf1
- dynamo-semblend-8df9a6f3a7e41d0e
- dynamo-semblend-ef33990e3bf6bb88
- encoding_rs-e6ec9b2b703df9c6
- encoding_rs-f1d790adecc69364
- equivalent-4d602d504de81892
- equivalent-a28faa80e44c431e
- errno-b5220538070da7cd
- errno-daf46fefe59e3dd4
- fastrand-31e8b3f7ae7829dc
- fastrand-f6659941a93351df
- fnv-0537d2c1a8f6eb12
- fnv-8eddd583da5e63ac
- form_urlencoded-72d53622942c245e
- form_urlencoded-d2229b62f38fcb2c
- futures-channel-817810f485379f99
- futures-channel-84e4509813baf595
- futures-core-4fd3c262167fc586
- futures-core-ea48297821933c39
- futures-sink-38966f85780c6434
- futures-sink-a39009d597e3d487
- futures-task-5182555bf0625e73
- futures-task-8616aae3358a58a9
- futures-util-5049453147f9424e
- futures-util-e0e935c9c8aad3a7
- getrandom-692e9cf8f9b8b84b
- getrandom-c7d2313fc8431040
- getrandom-e6caf4bc76cd301e
- getrandom-e7db038915b51532
- h2-88f316ca4e2009e2
- h2-c0bd869fdd0cd2f6
- hashbrown-0dfa14b90efbece1
- hashbrown-8708d95d28fcc90f
- http-044b52af4387ce25
- http-body-b030b6edd41b69ed
- http-body-b67e042efcbb114a
- http-body-util-7bacd188eb3b15ba
- http-body-util-b76628590fc2658a
- http-eb4a9f8369999e8e
- httparse-5934c055ab1614be
- httparse-839e23698d971916
- httparse-96aac2f1e6d62650
- httparse-e6e485a263eb8453
- hyper-01af7012f3e340cf
- hyper-665dfa3fc8b7298e
- hyper-tls-9cb84a312dba3fcf
- hyper-tls-c48c0785056079e6
- hyper-util-3aa0c1d22398e47a
- hyper-util-5365358d44bdf61c
- icu_collections-261a0442cfe468f6
- icu_collections-fa8a2ad9414c4f94
- icu_locale_core-6c98460dfbc62ebe
- icu_locale_core-9940ff653f9da1ed
- icu_normalizer-02fff7f062b995dc
- icu_normalizer-d3a7ee12b95f8249
- icu_normalizer_data-0254a55b90bc4928
- icu_normalizer_data-481cb15ff754b579
- icu_normalizer_data-9d2d20fbe9f5fd60
- icu_normalizer_data-bb95eba5e9e91f8f
- icu_properties-18c581916e1e3431
- icu_properties-41aad9f3e5df4425
- icu_properties_data-029d732ee25f1601
- icu_properties_data-0bba8c7835b6edac
- icu_properties_data-6e7712c3808d068d
- icu_properties_data-c859392eec28501c
- icu_provider-4a5b1289f0b6b368
- icu_provider-9b5bfcb7b90e6565
- idna-b2ba5eca896d4e8d
- idna-d3523920061ae3b0
- idna_adapter-55e98a64a0b9d091
- idna_adapter-a295af411c1a1728
- indexmap-6d5ae6c1f444dbf9
- indexmap-78de9202b4e693ab
- ipnet-0137913456c9a28c
- ipnet-d4a855ffc2aca3d2
- iri-string-9c5fa20c923c632b
- iri-string-db226e9c1b9bce3d
- itoa-b4706e056a146901
- itoa-d9a3701da09be60c
- libc-3194908faa4d976a
- libc-c7cb9a18f2fb75de
- libc-ed81bf86614af363
- libc-f5ee3ea939ddb0c1
- litemap-55809461120a4294
- litemap-cfeadd9b22a73434
- lock_api-b39d18939fd97e2b
- lock_api-bbf09bff5a42552d
- log-d11380b34016a3e7
- log-d50f360e7b74435a
- memchr-0e82c6227dd11c7a
- memchr-8f779190e318b158
- mime-4d10f548eb0f4473
- mime-75be28c2813a36fd
- mio-c28c1c78ffb0d825
- mio-f234a44cb9895929
- native-tls-0246543062ce5868
- native-tls-4d0a75f7d448fc58
- native-tls-84dfa73c46feb1ff
- native-tls-f9497e50439f411e
- once_cell-894824bd03fb1be8
- once_cell-c43e9fa2ed5f14b5
- parking_lot-6ed9ad954e5ac141
- parking_lot-f4207e0018bfbcb7
- parking_lot_core-017d24791fb0ca8d
- parking_lot_core-753eb33bc5f912c4
- parking_lot_core-9b33988b938a481f
- parking_lot_core-9e7ef84878f08830
- percent-encoding-0c5d2c6f010e1b7d
- percent-encoding-da65008b411bfe11
- pin-project-lite-90335a9c7138bd69
- pin-project-lite-f1efcbb9fac1edc6
- pin-utils-193c31fffae368ec
- pin-utils-a3c761c04bda9dae
- potential_utf-4487517ddace0fbb
- potential_utf-a9e94794ee8e28d5
- proc-macro2-913ca31eeed64f0b
- proc-macro2-e120ad67e7fe1db8
- proc-macro2-e7f548cc64fe319f
- quote-35043d4d5b4446ab
- quote-ac0e94b5558622ed
- quote-f7f1ef0f1b6a9d68
- reqwest-0a1009a219319c92
- reqwest-12d307d1a4d174b2
- rustc-hash-03fe8185a6d789cf
- rustc-hash-4a095f9996af3945
- rustix-b3986a3cad18da99
- rustix-beed7472d0c64620
- rustix-e18e14fff40ea462
- rustix-fa135af46cdca8b6
- rustls-pki-types-2a0085f96b63c70d
- rustls-pki-types-7055d7dcb8065f59
- ryu-0f2c5debc2326c7e
- ryu-8140ba8cd2919c44
- scopeguard-5ebacb81608016ec
- scopeguard-b48a43a5955c38e4
- security-framework-3b245d141ae58547
- security-framework-d4832a4c5251dac2
- security-framework-sys-23b0517919713b88
- security-framework-sys-da81c360c3b6d110
- serde-2beb76a2a4a45d11
- serde-75e2918ad8c40426
- serde-a2fa4bda50a7f79b
- serde-e3f9c65595ada2ff
- serde_core-26d0b4e83157361a
- serde_core-32eb346ac01e167b
- serde_core-9fcdc75afecb8d68
- serde_core-b69eb7896a73356b
- serde_derive-a58928edf83cf5b4
- serde_json-613e6835ab44a09a
- serde_json-8acd51b9d068b844
- serde_json-9948cbebddac7f91
- serde_json-e916d692b769d808
- serde_urlencoded-25cb4982da201fca
- serde_urlencoded-9ecc0979cfb2e103
- signal-hook-registry-0b3a871107a9fc27
- signal-hook-registry-f840e73606219ddb
- slab-e4d23bfe3e276562
- slab-f2c43c9bab461865
- smallvec-2c0b0ecccaf193ea
- smallvec-b9e83ce4ff41d5af
- socket2-0964e4de48acedc8
- socket2-d0a1fbc010fbcf10
- stable_deref_trait-2c3b0a7029402d41
- stable_deref_trait-69db5b374e509273
- syn-4881abf54188863a
- sync_wrapper-2a0b6afe7df42182
- sync_wrapper-d49544ad6a00e74e
- synstructure-43fe6628f768ef72
- system-configuration-2c75c150eb8206c6
- system-configuration-89382bed125ffff7
- system-configuration-sys-060e7bfc491f9837
- system-configuration-sys-5d76e8628b697ae2
- system-configuration-sys-dfd0cdfeab88e89a
- system-configuration-sys-f9b109be807945a2
- tempfile-3cd03c0abdb8bf02
- tempfile-7bb38b4c865195a3
- thiserror-1829b6bd59835608
- thiserror-90e1d99a1b4d7886
- thiserror-93fa8f1778ff6ab6
- thiserror-a76c80ec0f1e0fcb
- thiserror-impl-a96fb68cbd3da417
- tinystr-41fbc697d5b16123
- tinystr-afcedcfe66cf6c2c
- tokio-070e21c49121cb4c
- tokio-484b917a203329ed
- tokio-macros-797319545c2e5d0b
- tokio-native-tls-ba818cadcde8a3bb
- tokio-native-tls-d423a8ff6463e2e9
- tokio-util-76d68d1878bc352f
- tokio-util-eb3bb524602d3d8b
- tower-9a2f2fa0897bdc95
- tower-9ad27400641a4765
- tower-http-676176ddb6eb05b5
- tower-http-94312c2375aef6fb
- tower-layer-00ce058ba7161670
- tower-layer-b221d6b536670d08
- tower-service-4a60eee298fbe75b
- tower-service-c1ab1cd4f8f339d6
- tracing-4fdce086c72775b9
- tracing-attributes-adddac2b10498e1e
- tracing-core-1a0db7244484a37a
- tracing-core-d8b7e92155a9a22c
- tracing-d1a33be1e38e7bea
- try-lock-4f985d854b2af6be
- try-lock-da593a79b54e72ae
- unicode-ident-122d923a082262ca
- url-375bc7f83620833b
- url-ed5db3433e99a7de
- utf8_iter-9b322050c8e5831b
- utf8_iter-f397a552692e5ae1
- want-0b9bbb44a2ec9d8b
- want-98d99c66b918387a
- writeable-992582a4464a3c90
- writeable-9c04caa7a1d660b1
- xxhash-rust-1edbc59fcb272835
- xxhash-rust-f99f064f4fbbee9f
- yoke-338f6aacc97b1c40
- yoke-bb5cac821285fad3
- yoke-derive-562bd8a3be6e64a2
- zerofrom-0f84b858a83932b2
- zerofrom-1287109deddb7f55
- zerofrom-derive-ea72112116e27bfb
- zeroize-3032c9f30e41b51c
- zeroize-9cb5f108190666d6
- zerotrie-4d36004ee85dff35
- zerotrie-e6e6000a68818299
- zerovec-1bb6554a33192469
- zerovec-3bac6780b3326949
- zerovec-derive-6fd3ac718a193b10
- zmij-0c00a8e263317524
- zmij-5ebd53ef9124e758
- zmij-756d5f2ee4732c75
- zmij-c3387323a095706f
- deps