Switch embedding inference to ONNX Runtime and update all docs
- Replace PyTorch with ONNX Runtime for sentence-transformers inference
  (~50 MB vs ~5 GB, faster pod startup for KEDA scaling)
- Update dictionary with ONNX Runtime entry
- Update architecture, classification, and demo guide docs with ONNX details
- Update implementation plan with Stage 13
- Update journal with ONNX switch details
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>