@@ -71,18 +71,29 @@ Applied via `helm upgrade --install`. Chaos daemons restarted and all experiment
 names across all docs, updated implementation plan progress.
 - **CI fix:** `src/worker/store.py` had a ruff format issue (function args on one line).
 
-## Files changed
-
-- `k8s/chaos/pod-kill.yaml` — changed mode from `fixed`/`value: "2"` to `one`
-- `infra/helm-install.sh` — added containerd runtime settings for Chaos Mesh
-- `infra/setup.sh` — corrected storage account name
-- `docs/chaos-experiments.md` — added containerd prerequisite note and verified results
-- `docs/demo-guide.md` — updated rolling update section, removed stale Postgres checklist item
-- `docs/implementation-plan.md` — marked chaos, rolling update, demo rehearsal as DONE
-- `locust/locustfile.py` — reweighted for async pipeline, removed sync generate task
-- `src/worker/store.py` — simplified blob path (filename only, no UUID prefix)
-- `tests/test_store.py` — updated blob path assertions
-- `README.md` — corrected test count (51 → 92)
-- `CLAUDE.md` — removed chaos mesh and demo rehearsal from "Not yet done"
-- `.engineering-team/architecture-plan.md` — updated demo script and Azure resource names
-- `.github/workflows/deploy.yml` — corrected AKS cluster name and resource group
+## CI/CD pipeline
+
+- **docker.yml:** Now builds images once and pushes to both ghcr.io and ACR. ACR push
+  is gated on the `ACR_LOGIN_SERVER` variable + `ACR_CLIENT_ID`/`ACR_CLIENT_SECRET` secrets.
+- **deploy.yml:** Triggers via `workflow_run` after docker.yml completes (no duplicate builds).
+  Uses `azure/login@v2` with service principal creds JSON for AKS access.
+- **Secrets configured:** `ACR_CLIENT_ID`, `ACR_CLIENT_SECRET`, `AZURE_TENANT_ID`,
+  `AZURE_SUBSCRIPTION_ID`, `AZURE_CREDENTIALS`, plus the `ACR_LOGIN_SERVER` variable.
+- **Full pipeline working:** push → CI (lint+test) + Docker (build+push to ghcr.io+ACR) → Deploy (AKS).
+
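For reference, a minimal sketch of what the `workflow_run` wiring in deploy.yml can look like. This is an illustration, not the repo's actual file: the workflow name `"Docker"` and the job layout are assumptions, while `azure/login@v2` with the `AZURE_CREDENTIALS` creds JSON matches the setup described above.

```yaml
# Hypothetical deploy.yml trigger: run only after the Docker build
# workflow completes ("Docker" is an assumed workflow name).
on:
  workflow_run:
    workflows: ["Docker"]
    types: [completed]

jobs:
  deploy:
    # Guard so a failed upstream build does not deploy.
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
    steps:
      - uses: azure/login@v2
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
      # ... kubectl/helm deploy steps against AKS would follow here
```

The `conclusion == 'success'` guard matters because `workflow_run` fires on completion regardless of outcome.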
+## Remove torch: direct ONNX Runtime inference
+
+Replaced `sentence-transformers` (which requires torch, ~2GB) with direct ONNX Runtime
+inference. The worker image dropped from **~3GB to ~190MB** (93% reduction).
+
+- `src/worker/semantic.py` — replaced `SentenceTransformer.encode()` with manual
+  tokenization (HuggingFace `AutoTokenizer`) + ONNX inference + numpy mean pooling
+  + L2 normalization. Same model (all-MiniLM-L6-v2), same 384-dim output.
+- `pyproject.toml` — replaced `sentence-transformers[onnx]` with `onnxruntime`,
+  `transformers`, `huggingface-hub`. Removed 31 packages from the lock file.
+- ONNX model downloaded from HuggingFace Hub on first use via `hf_hub_download`.
+
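The mean-pooling + L2-normalization step above can be sketched in a few lines of numpy. The function name and exact shapes are illustrative (one row per token, as all-MiniLM-L6-v2 produces), not copied from `src/worker/semantic.py`:

```python
import numpy as np

def mean_pool_and_normalize(last_hidden_state: np.ndarray,
                            attention_mask: np.ndarray) -> np.ndarray:
    """Masked mean pooling over token embeddings, then L2 normalization.

    last_hidden_state: (seq_len, dim) token embeddings from the ONNX session
    attention_mask:    (seq_len,) 1 for real tokens, 0 for padding
    """
    mask = attention_mask.astype(np.float32)[:, None]   # (seq_len, 1)
    summed = (last_hidden_state * mask).sum(axis=0)     # masked sum per dim
    pooled = summed / max(float(mask.sum()), 1e-9)      # masked mean
    return pooled / np.linalg.norm(pooled)              # unit-length vector
```

Padding tokens are zeroed out by the mask before averaging, so padded and unpadded inputs yield the same sentence vector.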
+## Redis OOM fix
+
+Redis was OOMKilled with a 128Mi memory limit after accumulating a large backlog from
+Locust testing. Increased to a 512Mi limit / 256Mi request via `kubectl patch`.
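The resulting container resources, shown as a manifest fragment (the surrounding Deployment layout is assumed; only the 256Mi/512Mi values come from the fix above):

```yaml
# Illustrative resources block for the Redis container after the patch
resources:
  requests:
    memory: "256Mi"
  limits:
    memory: "512Mi"
```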