|
1 | 1 | # Reference my-claw Project — Design Rinsing in Practice |
2 | 2 |
|
3 | | -**Project:** my-claw — Multi-agent war room command center with voice, Telegram, Discord, and WebSocket interfaces |
| 3 | +**Project:** my-claw — Autonomous, self-managing, multi-agent AI system with voice, Telegram, Discord, and WebSocket interfaces |
4 | 4 |
|
5 | 5 | **Tech Stack:** Python 3.11+, Pipecat (real-time frame-processing pipeline), litellm (provider-agnostic LLM gateway), FastAPI, SQLite |
6 | 6 |
|
7 | | -**Scale:** 5-agent architecture with 3-tier routing, voice integration via Deepgram STT + Cartesia TTS, 31 test files with 473 test functions (unit + stack + Docker stack + browser stack), room-based isolation with 3 templates, 10 worker roles, behavioral constitution, trust tiers, memory system, heartbeat, and scheduling |
| 7 | +**Scale:** 5-agent architecture with 3-tier routing, voice integration via Deepgram STT + Cartesia TTS, 32 test files with 492 test functions (unit + stack + Docker stack + browser stack + Discord stack), room-based isolation with 3 templates, 10 worker roles, behavioral constitution, trust tiers, memory system, heartbeat, and scheduling |
8 | 8 |
|
9 | 9 | This case study demonstrates design rinsing — the structured practice of extracting distilled architectural understanding from external sources and translating it into a project's design. The my-claw project evolved through three distinct rinsing phases, each building on the last. That compounding — where each rinsing phase leveraged and extended the previous — is itself an example of [compound engineering](https://github.com/EveryInc/compound-engineering-plugin): each unit of work making subsequent units easier. |
10 | 10 |
|
@@ -178,14 +178,15 @@ The testing infrastructure demonstrates rinsing at the practice level — the tr |
178 | 178 | | Trading Bot Pattern | my-claw Translation | |
179 | 179 | |---|---| |
180 | 180 | | StackTestUtils class | Per-test session management with real services | |
181 | | -| Sequential test ordering | ST1-ST12 ordered by dependency (startup → auth → routing → voice → rooms) | |
| 181 | +| Sequential test ordering | ST1-ST11 ordered by dependency (startup → routing → voice → rooms → tools → trust → heartbeat) | |
182 | 182 | | Real dependencies | Zero mocks in integration tests; real Deepgram, Cartesia, litellm APIs | |
183 | 183 | | Full-loop assertions | Tests verify entire user journeys, not individual functions | |
184 | | -| Docker stack tests | ST-D1-ST-D11 against Docker container | |
| 184 | +| Docker stack tests | ST-D1-ST-D10 against Docker container | |
185 | 185 | | Browser stack tests | ST-B1-ST-B8 via Playwright against running container | |
| 186 | +| Discord stack tests | ST-DS1-ST-DS6 against real Discord bot | |
186 | 187 | | Health endpoint test mode | Container readiness checks before domain tests | |
187 | | -| Room isolation tests | 4 tests verifying isolated pipelines don't interfere | |
188 | | -| Tool stack tests | 5 tests verifying delegation and tool execution | |
| 188 | +| Room isolation tests | ST-R1-ST-R4 verifying isolated pipelines don't interfere | |
| 189 | +| Tool stack tests | ST-T1-ST-T6 verifying delegation and tool execution with real LLM | |
189 | 190 |
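The "Health endpoint test mode" row describes gating domain tests on container readiness. A minimal sketch of such a gate follows, assuming the container exposes an HTTP health URL that returns 200 when ready. The names `wait_until_ready` and `http_health_probe` and any URL passed to them are hypothetical and do not come from the my-claw codebase.

```python
import time
import urllib.request
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    timeout: float = 30.0,
    interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `probe` until it returns True or `timeout` seconds elapse.

    Connection-level failures (OSError and its subclasses, which include
    urllib.error.URLError) are treated as "not ready yet", not as errors.
    """
    deadline = clock() + timeout
    while True:
        try:
            if probe():
                return True
        except OSError:
            pass  # e.g. connection refused while the container is still booting
        if clock() >= deadline:
            return False
        sleep(interval)


def http_health_probe(url: str) -> Callable[[], bool]:
    """Build a probe that reports True when `url` answers with HTTP 200.

    The URL is an assumption: pass whatever health route the container serves.
    """
    def probe() -> bool:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    return probe
```

A session-scoped fixture could call `wait_until_ready(http_health_probe(url))` once and fail fast if it returns `False`, so later stack tests never run against a container that did not come up.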
|
190 | 191 | Test markers: `pytest -m "not integration"` for unit (no network), `pytest -m integration` for real API tests (auto-skip if no .env). Unit tests use no mocks — they test against the module interfaces directly. |
191 | 192 |
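The marker split and the "auto-skip if no .env" behavior described above could be wired up in a `conftest.py` along these lines. This is a sketch under assumptions, not the project's actual configuration: the `.env` location, the helper name `integration_skip_reason`, and the skip message are all hypothetical. Only the `integration` marker name comes from the text.

```python
from pathlib import Path
from typing import Optional

# Assumed location of the credentials file, relative to where pytest is run.
ENV_FILE = Path(".env")


def integration_skip_reason(env_file: Path = ENV_FILE) -> Optional[str]:
    """Return a skip reason when the credentials file is missing, else None."""
    if env_file.is_file():
        return None
    return f"integration tests need API credentials; {env_file} not found"


def pytest_collection_modifyitems(config, items):
    """Standard pytest hook: mark integration tests as skipped without .env."""
    # Imported lazily so the helper above can be exercised without pytest.
    import pytest

    reason = integration_skip_reason()
    if reason is None:
        return
    skip = pytest.mark.skip(reason=reason)
    for item in items:
        if "integration" in item.keywords:
            item.add_marker(skip)
```

With this in place, `pytest -m "not integration"` deselects the marked tests entirely, while `pytest -m integration` collects them and reports them as skipped when the file is absent. Registering the marker under `markers` in the pytest configuration avoids unknown-marker warnings.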