|
| 1 | +# SMP Server Postgres: Slow Query Analysis |
| 2 | + |
| 3 | +Data from three production servers (A, B, C) over a multi-day observation window. |
| 4 | + |
| 5 | +## Verified fixes |
| 6 | + |
| 7 | +### 1. getEntityCounts: replace ③④ with SUM(queue_count) |
| 8 | + |
| 9 | +**Query** (`QueueStore/Postgres.hs:160-167`): |
| 10 | + |
| 11 | +```sql |
| 12 | +SELECT |
| 13 | + (SELECT COUNT(1) FROM msg_queues WHERE deleted_at IS NULL) AS queue_count, -- ① scan |
| 14 | + (SELECT COUNT(1) FROM msg_queues WHERE deleted_at IS NULL AND notifier_id IS NOT NULL) AS notifier_count, -- ② scan |
| 15 | + (SELECT COUNT(1) FROM services WHERE service_role = ?) AS rcv_service_count, -- trivial (<10 rows) |
| 16 | + (SELECT COUNT(1) FROM services WHERE service_role = ?) AS ntf_service_count, -- trivial (<10 rows) |
| 17 | + (SELECT COUNT(1) FROM msg_queues WHERE rcv_service_id IS NOT NULL AND deleted_at IS NULL) AS rcv_service_queues_count, -- ③ scan |
| 18 | + (SELECT COUNT(1) FROM msg_queues WHERE ntf_service_id IS NOT NULL AND deleted_at IS NULL) AS ntf_service_queues_count -- ④ scan |
| 19 | +``` |
| 20 | + |
| 21 | +**Performance**: ~315ms avg, ~2s max, ~2500 calls. #1 slow query by total time (~800s). |
| 22 | + |
| 23 | +**Problem**: 4 scans of `msg_queues`. Indexes exist for some subqueries |
| 24 | +(`idx_msg_queues_rcv_service_id(rcv_service_id, deleted_at)` for ③, |
| 25 | +`idx_msg_queues_ntf_service_id(ntf_service_id, deleted_at)` for ④, |
| 26 | +`idx_msg_queues_updated_at_recipient_id(deleted_at, ...)` potentially for ①), |
| 27 | +but whether PostgreSQL uses them for COUNT and the actual per-subquery cost is |
| 28 | +unknown without `EXPLAIN ANALYZE`. |
| 29 | + |
| 30 | +**Fix**: Replace ③ and ④: |
| 31 | + |
| 32 | +```sql |
| 33 | +COALESCE((SELECT SUM(queue_count) FROM services WHERE service_role = 'M'), 0) AS rcv_service_queues_count, |
| 34 | +COALESCE((SELECT SUM(queue_count) FROM services WHERE service_role = 'N'), 0) AS ntf_service_queues_count |
| 35 | +``` |
| 36 | + |
| 37 | +**Verification**: |
| 38 | +- Trigger logic traced for all transitions (NULL→set, change, soft-delete, physical delete) — correct. |
| 39 | +- FK `rcv_service_id REFERENCES services(service_id)` guarantees every non-NULL value maps to a row. |
| 40 | +- `queue_count + p_change` is atomic under READ COMMITTED — concurrent-safe. |
| 41 | +- `update_all_aggregates()` exists as repair mechanism. |
| 42 | +- `services` table has <10 rows — SUM is O(1) vs full table scan. |
| 43 | + |
| 44 | +**Savings**: Eliminates 2 of 4 msg_queues scans (③④ → trivial SUM on <10 rows). |
| 45 | +Exact savings unknown — if ③④ already use indexes efficiently, savings may be modest. |
| 46 | +`EXPLAIN ANALYZE` needed to measure actual per-subquery cost. |
| 47 | + |
| 48 | +--- |
| 49 | + |
| 50 | +### 2. expire_old_messages: remove trailing COUNTs |
| 51 | + |
| 52 | +**Stored procedure** (`Migrations.hs`), at the end of `expire_old_messages`: |
| 53 | + |
| 54 | +```sql |
| 55 | +r_expired_msgs_count := total_deleted; |
| 56 | +r_stored_msgs_count := (SELECT COUNT(1) FROM messages); -- 13-114ms per call (CSV) |
| 57 | +r_stored_queues := (SELECT COUNT(1) FROM msg_queues WHERE deleted_at IS NULL); -- 544-719ms per call (CSV) |
| 58 | +``` |
| 59 | + |
| 60 | +**Performance**: 10 calls per observation window. Per-call cost of these COUNTs (from CSV): |
| 61 | + |
| 62 | +| Server | expire_old_messages avg | r_stored_queues avg | r_stored_msgs avg | COUNTs combined | % of procedure | |
| 63 | +|---|---|---|---|---|---| |
| 64 | +| A | 11,798ms | 719ms | 114ms | 833ms | 7.1% | |
| 65 | +| B | 11,242ms | 544ms | 13ms | 557ms | 5.0% | |
| 66 | +| C | 7,296ms | 588ms | 67ms | 655ms | 9.0% | |
| 67 | + |
| 68 | +**How results are used** (`Server.hs:485-488`): |
| 69 | + |
| 70 | +```haskell |
| 71 | +Right msgStats@MessageStats {storedMsgsCount = stored, expiredMsgsCount = expired} -> do |
| 72 | + atomicWriteIORef (msgCount stats) stored -- resets msgCount from storedMsgsCount |
| 73 | + atomicModifyIORef'_ (msgExpired stats) (+ expired) |
| 74 | + printMessageStats "STORE: messages" msgStats -- logs all three fields |
| 75 | +``` |
| 76 | + |
| 77 | +- `expiredMsgsCount` — computed incrementally in the loop. **Needed, already cheap.** |
| 78 | +- `storedMsgsCount` — used to reset `msgCount` stat. But `msgCount` is also maintained |
| 79 | + incrementally (`+1` on send at line 1963, `-1` on ACK at line 1916). The reset corrects drift. |
| 80 | +- `storedQueues` — **used only for logging**. Same value available from `getEntityCounts` |
| 81 | + which runs every ~60s via Prometheus. |
| 82 | + |
| 83 | +**Fix**: Remove both COUNTs from the stored procedure. Return only `r_expired_msgs_count`. |
| 84 | + |
| 85 | +For `storedMsgsCount`: either trust the incremental `msgCount` counter, or query |
| 86 | +`SELECT COUNT(1) FROM messages` separately (in parallel, not blocking the procedure). |
| 87 | + |
| 88 | +For `storedQueues`: use the value from the most recent `getEntityCounts` call. |
| 89 | + |
| 90 | +**Verification**: Traced all usages of `MessageStats` fields from `expireOldMessages` in |
| 91 | +`Server.hs`. `storedQueues` is only logged. `storedMsgsCount` resets a counter that's already |
| 92 | +maintained incrementally — removing the reset means potential drift, but the counter is |
| 93 | +corrected on next server restart anyway. |
| 94 | + |
| 95 | +**Savings**: 560-830ms per expiration cycle × 10 cycles = **5.6-8.3s total** over observation window. |
| 96 | + |
| 97 | +--- |
| 98 | + |
| 99 | +### 3. Trigger WHEN clause: skip function call for non-service updates |
| 100 | + |
| 101 | +**Current trigger** (`Migrations.hs:566-568`): |
| 102 | + |
| 103 | +```sql |
| 104 | +CREATE TRIGGER tr_queue_update |
| 105 | +AFTER UPDATE ON msg_queues |
| 106 | +FOR EACH ROW EXECUTE PROCEDURE on_queue_update(); |
| 107 | +``` |
| 108 | + |
| 109 | +Fires on **every UPDATE** to `msg_queues`. From Server C data, ~1.2M updates per observation |
| 110 | +window, but only ~105K (8.7%) actually change service-related fields: |
| 111 | + |
| 112 | +| UPDATE pattern | Calls | Changes service fields? | |
| 113 | +|---|---|---| |
| 114 | +| SET updated_at | ~427K | No | |
| 115 | +| SET msg_can_write/size/expire (write_message) | ~336K | No | |
| 116 | +| SET msg_can_write/size/expire (try_del_*) | ~196K | No | |
| 117 | +| SET msg_can_write/size/expire (delete_expired) | ~136K | No | |
| 118 | +| SET sender_key | ~8K | No | |
| 119 | +| SET status | ~2K | No | |
| 120 | +| **SET rcv_service_id** | **~101K** | **Yes** | |
| 121 | +| **SET deleted_at** | **~5K** | **Yes** | |
| 122 | + |
| 123 | +The `on_queue_update()` PL/pgSQL function evaluates 8-12 boolean conditions on every call, |
| 124 | +then returns without calling `update_aggregates` for 91% of invocations. |
| 125 | + |
| 126 | +**Fix**: Add a `WHEN` clause to the trigger definition. PostgreSQL evaluates `WHEN` in C code |
| 127 | +before calling the PL/pgSQL function — no function entry overhead at all: |
| 128 | + |
| 129 | +```sql |
| 130 | +CREATE TRIGGER tr_queue_update |
| 131 | +AFTER UPDATE ON msg_queues |
| 132 | +FOR EACH ROW |
| 133 | +WHEN ( |
| 134 | + OLD.deleted_at IS DISTINCT FROM NEW.deleted_at |
| 135 | + OR OLD.rcv_service_id IS DISTINCT FROM NEW.rcv_service_id |
| 136 | + OR OLD.ntf_service_id IS DISTINCT FROM NEW.ntf_service_id |
| 137 | + OR OLD.notifier_id IS DISTINCT FROM NEW.notifier_id |
| 138 | +) |
| 139 | +EXECUTE PROCEDURE on_queue_update(); |
| 140 | +``` |
| 141 | + |
| 142 | +**Verification**: |
| 143 | +- PostgreSQL supports `OLD`/`NEW` in `WHEN` clauses for `AFTER UPDATE` triggers. |
| 144 | +- The 4 conditions match exactly the fields checked inside `on_queue_update()`. |
| 145 | +- When WHEN is false, the function is **never called** — zero PL/pgSQL overhead. |
| 146 | +- When WHEN is true, the function runs identically to today. |
| 147 | +- Behavioral change: **none** — same aggregates updated in same cases. |
| 148 | + |
| 149 | +**Savings**: ~1.1M PL/pgSQL function calls avoided. Each call has fixed overhead |
| 150 | +(function entry, OLD/NEW row extraction, condition evaluation, return). Exact savings |
| 151 | +need measurement, but function call overhead is non-trivial at this volume. |
| 152 | + |
| 153 | +--- |
| 154 | + |
| 155 | +## Fixes that need EXPLAIN ANALYZE |
| 156 | + |
| 157 | +### 4. Partial indexes for getEntityCounts ① and ② |
| 158 | + |
| 159 | +Subqueries ① and ② still scan msg_queues. ① may use |
| 160 | +`idx_msg_queues_updated_at_recipient_id(deleted_at, ...)` but ② has no index covering |
| 161 | +both `deleted_at IS NULL` and `notifier_id IS NOT NULL`. Actual plans unknown. |
| 162 | + |
| 163 | +Candidate indexes: |
| 164 | + |
| 165 | +```sql |
| 166 | +-- For ① queue_count: enables index-only COUNT |
| 167 | +CREATE INDEX idx_msg_queues_live ON msg_queues ((1)) WHERE deleted_at IS NULL; |
| 168 | + |
| 169 | +-- For ② notifier_count: enables index-only COUNT |
| 170 | +CREATE INDEX idx_msg_queues_live_notifier ON msg_queues ((1)) WHERE deleted_at IS NULL AND notifier_id IS NOT NULL; |
| 171 | +``` |
| 172 | + |
| 173 | +**Trade-off**: Each index adds write overhead on every INSERT/UPDATE/DELETE touching the |
| 174 | +filtered columns. Need `EXPLAIN ANALYZE` to confirm the COUNT actually uses the index |
| 175 | +(PostgreSQL may choose seq scan if the partial index covers most rows). |
| 176 | + |
| 177 | +--- |
| 178 | + |
| 179 | +## Not fixable (architectural) |
| 180 | + |
| 181 | +- **write_message / try_del_peek_msg max times (490-523ms)**: Lock contention on |
| 182 | + `FOR UPDATE` of the same `recipient_id` row. Inherent to concurrent queue access — cannot |
| 183 | + use `SKIP LOCKED` because these operations require the lock for correctness. |
| 184 | + |
| 185 | +- **UPDATE msg_queues SET updated_at (~430K calls, 83-90s total, 0.20ms avg)**: Per-call cost |
| 186 | + is already minimal. Trigger does zero aggregate work for this pattern (verified — all |
| 187 | + IS DISTINCT FROM checks fail, no `update_aggregates` called). Fix #3 eliminates even |
| 188 | + the function call overhead. |
| 189 | + |
| 190 | +--- |
| 191 | + |
| 192 | +## Summary |
| 193 | + |
| 194 | +| # | Fix | Before | After | Verified | |
| 195 | +|---|-----|--------|-------|----------| |
| 196 | +| 1 | getEntityCounts ③④ → `SUM(queue_count)` | ~315ms/call (4 msg_queues scans) | 2 scans eliminated; exact savings needs EXPLAIN ANALYZE | Correctness: yes. Savings estimate: no | |
| 197 | +| 2 | Remove trailing COUNTs from expire_old_messages | 7-12s/call (COUNTs add 560-830ms, ~5-9%) | 6.6-11.0s/call | Yes (CSV verified) | |
| 198 | +| 3 | Add WHEN clause to tr_queue_update | ~1.2M PL/pgSQL calls (91% no-op) | ~105K PL/pgSQL calls (only when needed) | Correctness: yes. Time savings: unmeasured | |
| 199 | +| 4 | Partial indexes for ①② | 2 msg_queues scans remain | Needs EXPLAIN ANALYZE | No | |
0 commit comments