Commit 057409f
feat(celery T2.2): tenant-scoped quota + bulkhead limits + flaky race-test hardening
Per docs/modularization/indexing-redesign-design-pack.md §H.5 + §H.6
+ architect msg=8420f12a + ruling 2 simplification (msg=492315e8).
The Wave 2 runtime needs two layers of resource protection above
the per-modality worker pool: a quota layer that throttles upstream
LLM/embedding calls per (resource_class, tenant_scope_key), and a
bulkhead layer that places hard wall-time / size ceilings on every
worker call regardless of tenant. This commit lands both as
orchestrator-callable helpers; the orchestrator/reconciler integration
(invoke acquire() before LLM/embedding; wrap calls in
bulkhead_timeout) is chenyexuan's T2.1 follow-up scope per architect
msg=492315e8 ruling 2.
Surface (aperag/indexing/quota.py, ~411 lines):
* QuotaPolicy — frozen dataclass holding (capacity,
refill_rate_per_sec) with __post_init__ validation that both are
> 0. Standard token-bucket parameters; 60/1.0 matches §H.5
baseline of "60 LLM calls / minute sustained".
* QuotaPolicyRegistry — maps (resource_class, tenant_scope_key) →
QuotaPolicy with two-tier lookup: exact match first, then
the "default" tenant_scope_key fallback for that resource class.
Raises KeyError if neither is configured (a worker hitting an
unconfigured resource class is a deployment bug; surface it loudly).
Resource classes are independent — declaring an "llm" default
does NOT implicitly cap "embedding".
* QuotaBackend — async Protocol every backend implements. Single
acquire() method that blocks until one token is granted,
respecting the refill rate.
* InMemoryQuotaBackend — Python token bucket with one asyncio.Lock
per (resource_class, tenant_scope_key) bucket key. Suitable for
the §L Tier-1 single-process deployment (INDEXING_MODE=inline)
and as the canonical correctness oracle for unit tests. Uses an
injectable clock fixture so tests can advance time
deterministically.
* RedisQuotaBackend — Redis-backed implementation with an atomic
Lua script (HMGET → refill math → HMSET in one round-trip) so
concurrent multi-process workers never overshoot the bucket
capacity. The Lua script returns (acquired, wait_seconds);
callers retry after asyncio.sleep(wait_seconds) when refused.
Uses crc32 slot hashing on tenant_scope_key to bound Redis-key
cardinality. Compatible with both sync and async redis-py
(await-if-awaitable shim).
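To make the two-tier lookup and the token-bucket refill concrete, here is a minimal sketch and not the code in aperag/indexing/quota.py. The register()/lookup() method names, the acquire() parameters, the injectable sleep argument, and the choice to keep one bucket per (resource_class, tenant_scope_key) even when the policy comes from the "default" fallback are all assumptions made for illustration.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass(frozen=True)
class QuotaPolicy:
    capacity: float
    refill_rate_per_sec: float

    def __post_init__(self) -> None:
        # Both token-bucket parameters must be strictly positive.
        if self.capacity <= 0 or self.refill_rate_per_sec <= 0:
            raise ValueError("capacity and refill_rate_per_sec must be > 0")


class QuotaPolicyRegistry:
    def __init__(self) -> None:
        self._policies: dict[tuple[str, str], QuotaPolicy] = {}

    # Hypothetical method name; the commit message does not name it.
    def register(self, resource_class: str, tenant_scope_key: str, policy: QuotaPolicy) -> None:
        self._policies[(resource_class, tenant_scope_key)] = policy

    # Hypothetical method name; exact match first, then the "default" tenant.
    def lookup(self, resource_class: str, tenant_scope_key: str) -> QuotaPolicy:
        for key in ((resource_class, tenant_scope_key), (resource_class, "default")):
            if key in self._policies:
                return self._policies[key]
        raise KeyError(f"no quota policy for {resource_class!r}/{tenant_scope_key!r}")


class InMemoryBucketSketch:
    """Token bucket with an injectable clock and sleep (both assumptions)."""

    def __init__(
        self,
        registry: QuotaPolicyRegistry,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
    ) -> None:
        self._registry = registry
        self._clock = clock
        self._sleep = sleep
        # bucket key -> (tokens, timestamp of last refill)
        self._state: dict[tuple[str, str], tuple[float, float]] = {}
        self._locks: dict[tuple[str, str], asyncio.Lock] = {}

    async def acquire(self, resource_class: str, tenant_scope_key: str) -> None:
        policy = self._registry.lookup(resource_class, tenant_scope_key)
        key = (resource_class, tenant_scope_key)
        lock = self._locks.setdefault(key, asyncio.Lock())
        while True:
            async with lock:
                now = self._clock()
                # A bucket that has never been used starts full.
                tokens, last = self._state.get(key, (policy.capacity, now))
                # Refill for the elapsed time, never above capacity.
                tokens = min(policy.capacity, tokens + (now - last) * policy.refill_rate_per_sec)
                if tokens >= 1.0:
                    self._state[key] = (tokens - 1.0, now)
                    return
                self._state[key] = (tokens, now)
                wait = (1.0 - tokens) / policy.refill_rate_per_sec
            # Sleep outside the lock so other callers are not held up.
            await self._sleep(wait)
```

With a fake clock and a fake sleep that advances it, a drained bucket of capacity 2 at 1 token/sec makes the third acquire() wait exactly one second, which is the kind of deterministic timing assertion the unit tests describe.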
Surface (aperag/indexing/limits.py, ~162 lines):
* LLM_CALL_TIMEOUT_SECONDS = 60.0 (§H.6)
* EMBEDDING_CALL_TIMEOUT_SECONDS = 30.0 (§H.6)
* UPLOAD_MAX_BYTES = 50 * 1024 * 1024 (§H.6)
* bulkhead_timeout(seconds, label=...) — async context manager
wrapping asyncio.timeout; logs the label on TimeoutError so
per-call telemetry distinguishes timeouts at "graph.derive.llm"
from "vector.derive.embedding" without scattering log strings
across modality workers.
* reject_if_oversize(content_length, label=...) — boundary-time
ValueError when over UPLOAD_MAX_BYTES; called by upload handlers
before parser allocates.
Tests (tests/unit_test/indexing/test_t2_2_quota_limits.py, 20 cases):
* QuotaPolicy validation — capacity / refill_rate must be > 0;
fractional values accepted.
* QuotaPolicyRegistry — exact-match wins over default; KeyError
when neither configured; per-resource-class default isolation.
* InMemoryQuotaBackend — initial bucket starts at capacity;
drained bucket blocks until refill (under fake clock + monkey-
patched asyncio.sleep, deterministic timing assertion); per-tenant
isolation; per-resource-class isolation; refill capped at
capacity after long idle (1hr → 3 tokens not 3600); default
fallback routes unknown tenant through shared policy.
* Bulkhead — bulkhead_timeout completes within budget vs raises
TimeoutError when exceeded; reject_if_oversize accepts at
boundary, rejects strictly over cap; constants pinned to design
pack values.
* RedisQuotaBackend — construction-only smoke (Protocol surface),
Lua-script invocation against fake script (acquire when token
available, retry-loop when wait_seconds returned). Real-Redis
integration deferred to T2.3 load-test infra (real Redis fixture).
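The fake-script cases can be pictured with the sketch below, which exercises only the retry loop and the crc32 slot hashing without a Redis server. Everything named here is hypothetical: the SLOT_COUNT value, the "quota:..." key format, the argument order passed to the script, and the helper names. The only behaviour taken from the description above is that the script returns (acquired, wait_seconds) and that the caller sleeps and retries when refused.

```python
import asyncio
import inspect
import time
import zlib
from typing import Any, Awaitable, Callable

SLOT_COUNT = 1024  # hypothetical; the real slot count is not stated


def bucket_key(resource_class: str, tenant_scope_key: str) -> str:
    # crc32 is stable across processes, unlike Python's salted hash().
    # Tenants that land in the same slot share one bucket, which is the
    # price of bounding the number of Redis keys.
    slot = zlib.crc32(tenant_scope_key.encode("utf-8")) % SLOT_COUNT
    return f"quota:{resource_class}:{slot}"  # hypothetical key format


async def _maybe_await(value: Any) -> Any:
    # Await-if-awaitable shim: works with sync and async script callables.
    return await value if inspect.isawaitable(value) else value


async def acquire_with_retry(
    script: Callable[..., Any],
    resource_class: str,
    tenant_scope_key: str,
    capacity: float,
    refill_rate_per_sec: float,
    *,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
    clock: Callable[[], float] = time.time,
) -> None:
    key = bucket_key(resource_class, tenant_scope_key)
    while True:
        # One atomic round-trip: the script refills, then grants or refuses.
        acquired, wait_seconds = await _maybe_await(
            script(keys=[key], args=[capacity, refill_rate_per_sec, clock()])
        )
        if int(acquired):
            return
        await sleep(float(wait_seconds))


class FakeScript:
    """Test double that replays canned (acquired, wait_seconds) replies."""

    def __init__(self, replies: list[tuple[int, float]]) -> None:
        self._replies = list(replies)
        self.calls: list[list[str]] = []

    def __call__(self, keys: list[str], args: list[Any]) -> tuple[int, float]:
        self.calls.append(keys)
        return self._replies.pop(0)
```

Feeding FakeScript a refusal followed by a grant checks that the caller sleeps for the returned wait and then retries against the same key.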
Hardening for huangheng msg=2b20974b informational +
architect msg=8420f12a follow-up directive:
* tests/unit_test/indexing/test_t1_2_graph.py:_RaceProvocateurStore
now takes race_count parameter. The lock-protected test stays at
race_count=1 (no barrier; single-writer-at-a-time naturally).
The no-lock negative-control flips to race_count=2: an
asyncio.Event barrier holds both writers at the post-read /
pre-write window until BOTH have reached it, then releases. This
pins the race deterministically — the previous asyncio.sleep(0)
was scheduler-dependent and the test flaked under heavy CI load
(huangheng observed 1/2 fail in a full-suite run). Verified
5/5 deterministic passes locally post-fix.
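The barrier idea can be sketched as below. This is a simplified stand-in for _RaceProvocateurStore, not the test code itself: the class name, the read_modify_write method, and the integer counter are assumptions chosen to show why an asyncio.Event barrier pins the lost update where asyncio.sleep(0) only made it likely.

```python
import asyncio


class RaceProvocateurStoreSketch:
    """Counter whose read-modify-write has a controllable race window."""

    def __init__(self, race_count: int = 1) -> None:
        self.value = 0
        self._race_count = race_count
        self._arrived = 0
        self._all_arrived = asyncio.Event()

    async def read_modify_write(self) -> None:
        snapshot = self.value  # read
        if self._race_count > 1:
            # Barrier: hold every writer between read and write until all
            # race_count writers have taken their (stale) snapshot.
            self._arrived += 1
            if self._arrived >= self._race_count:
                self._all_arrived.set()
            await self._all_arrived.wait()
        else:
            # No barrier: a single yield point, safe only under a lock.
            await asyncio.sleep(0)
        self.value = snapshot + 1  # write


async def run_unlocked() -> int:
    # Negative control: both writers read 0, so one update is lost.
    store = RaceProvocateurStoreSketch(race_count=2)
    await asyncio.gather(store.read_modify_write(), store.read_modify_write())
    return store.value


async def run_locked() -> int:
    # Lock-protected path: writers are serialized, so no barrier is needed.
    store = RaceProvocateurStoreSketch(race_count=1)
    lock = asyncio.Lock()

    async def guarded() -> None:
        async with lock:
            await store.read_modify_write()

    await asyncio.gather(guarded(), guarded())
    return store.value
```

In this sketch the unlocked run always ends at 1 and the locked run always ends at 2, independent of how the event loop schedules the two tasks.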
aperag/indexing/__init__.py — re-export the T2.2 quota + limits
surfaces so the indexing package surface stays uniform across the
3 wave layers.
Lint + tests:
* uvx ruff check + ruff format --check across aperag/indexing/ +
tests/unit_test/indexing/: clean.
* pytest tests/unit_test/indexing/ + tests/unit_test/test_phase3_reexport_audit.py:
101 passed, 0 failed (62 pre-existing Wave 1+2 + 20 new T2.2 +
modified race-test path that now passes 5/5 deterministically).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 7f51d44 commit 057409f
5 files changed
Lines changed: 1091 additions & 17 deletions
File tree
- aperag/indexing
- tests/unit_test/indexing