
Commit 32c0c9e

Steven Lefebvre and claude committed
fix: audit findings — Postgres reclassify, batch completeness, entity noise

- P1-1: `reclassifyEntity()` now works on Postgres (was SQLite-only syntax). The entities route dry-run preview is also fixed for the Postgres backend.
- P1-2: The batch endpoint now includes supersedes logic (facts by key, statuses by subject), structured DB writes, relevance scoring, and entity linking — matching the single-store endpoint.
- P2-1: Consolidated memories (merged facts, contradiction events) are now indexed in keyword search and written to the structured DB.
- P2-2: cleanupOldEvents refactored to filter per-page instead of loading all events into memory (OOM risk eliminated).
- P2-4: Internal `_sessionKey`/`_sessionRank` fields are cleaned from the search API response before sending to the client.
- P2-6: Entity extraction noise filters tightened — blocks shell commands, JS keywords, imperative instructions, marketing taglines, em-dash fragments, all-lowercase multi-word phrases, and measurements. Tested against 15 known garbage entities (15/15 blocked).
- P3-2: `knowledge_category` enum validation added to validateMemoryInput.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent be78a71 commit 32c0c9e

6 files changed

Lines changed: 443 additions & 44 deletions


AUDIT-2026-04-02.md

Lines changed: 233 additions & 0 deletions
@@ -0,0 +1,233 @@
# Zengram Full Audit Report — 2026-04-02

**Audit scope**: Full codebase review + live database quality check
**Models**: Claude Opus 4.6 (primary), Codex GPT-5.4 xhigh (adversarial), Gemini 3.1 Pro (adversarial)
**Live instance**: 192.168.18.40:8084 — 521 memories, 787 entities, all retrieval paths active
**Tests**: 114 pass / 0 fail / 32 suites
**Codebase**: ~11K LOC across API, MCP server, 3 storage backends, 6 test suites

---

## P1 — High Severity (fix before next deploy)
### P1-1: Entity reclassification broken on Postgres backend
**File**: `api/src/services/entities.js:370-398`
**Also**: `api/src/routes/entities.js:115-117` (dry-run preview)

`reclassifyEntity()` uses SQLite-specific `store.db.prepare()` syntax. The Beelink production deployment uses Postgres (`store.pool`), so `store.db` is `undefined` and the function silently returns `{ error: 'No writable store' }`.

**Verified live**: a `POST /entities/reclassify` dry-run returns `memories_affected: 0` for "Expert Local" (which has 13 mentions and many linked memories) — the Postgres path can't count.

**Fix**: Add a Postgres code path using `store.pool.query()` for both the UPDATE and the COUNT:

```javascript
if (store.pool) {
  await store.pool.query('UPDATE entities SET entity_type = $1 WHERE id = $2', [newType, entity.id]);
  const linkCount = await store.pool.query('SELECT COUNT(*) as count FROM entity_memory_links WHERE entity_id = $1', [entity.id]);
  // ...
} else if (store.db) {
  // existing SQLite code
}
```

### P1-2: Batch endpoint skips critical write-time logic
**File**: `api/src/routes/memory.js:847-971`

`POST /memory/batch` is missing:
1. **Supersedes logic** — batch fact/status ingestion with the same key creates duplicates instead of superseding old versions
2. **Structured DB writes** — memories land in Qdrant but NOT in the events/facts/statuses tables (divergence)
3. **Relevance scoring** — batch memories have no `relevance_score`, `auto_flagged`, or `auto_deprioritized` fields
4. **Entity linking** — entities are extracted (payload only) but never linked in the entity graph

If agents use `brain_batch` for bootstrap/migration, the structured DB will be missing those memories entirely.
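The supersede-by-key behavior the batch endpoint should share with the single-store path can be sketched as a pure function over an in-memory list. This is an illustration only — names and object shapes here are hypothetical, and the real code goes through Qdrant, not an array:

```javascript
// Sketch of the supersede-by-key semantics the batch endpoint is missing.
// `existing` holds current memories; `incoming` is one batch of new items.
function applyBatchSupersedes(existing, incoming) {
  // Index active facts by their key so each incoming fact can find its
  // predecessor, including predecessors created earlier in the same batch.
  const byKey = new Map(
    existing.filter((m) => m.type === 'fact' && m.active).map((m) => [m.key, m])
  );
  const created = [];
  for (const mem of incoming) {
    const next = { ...mem, active: true, superseded_by: null };
    const old = mem.type === 'fact' && mem.key ? byKey.get(mem.key) : undefined;
    if (old) {
      old.active = false;            // retire the old version
      old.superseded_by = next.id;   // forward link to the replacement
    }
    if (next.type === 'fact' && next.key) byKey.set(next.key, next);
    created.push(next);
  }
  return created;
}
```

The same pattern applies to statuses keyed by subject; the point is that the batch path must consult and update the key index per item, not skip it wholesale.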

---

## P2 — Medium Severity (fix in next sprint)

### P2-1: Consolidated memories invisible to keyword search and structured queries
**File**: `api/src/services/consolidation.js:334-372`

When consolidation merges facts or creates contradiction events:
- Stored in Qdrant ✓
- **NOT indexed** in the `memory_search` table → invisible to BM25 keyword search
- **NOT written** to the structured DB (events/facts/statuses) → invisible to `GET /memory/query`

Consolidated memories can only be found via vector search, never via keyword or structured queries.

### P2-2: cleanupOldEvents loads ALL events into memory
**File**: `api/src/services/consolidation.js:533-561`

The function scrolls ALL event-type memories into an in-memory array before filtering for old/unused ones. With thousands of events, this could exhaust container memory.

**Fix**: Use a Qdrant compound filter to scroll only matching points:

```javascript
scrollPoints({ type: 'event', active: true }, 200, offset)
// then filter access_count === 0 && created_at < cutoff per page
```
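The per-page shape of that fix can be shown with a stub scroller over an array. The `makeScroller` helper below stands in for the Qdrant scroll API (the real call takes a filter and returns a `next_page_offset`); only the paging discipline is the point:

```javascript
// Stub for a paged scroll: returns one page plus the next offset, or null.
function makeScroller(points, pageSize) {
  return (offset) => ({
    points: points.slice(offset, offset + pageSize),
    nextOffset: offset + pageSize < points.length ? offset + pageSize : null,
  });
}

// Cleanup sketch: only one page of events is ever resident in memory.
function collectStaleEvents(scroll, cutoffMs) {
  const stale = [];
  let offset = 0;
  while (offset !== null) {
    const page = scroll(offset);
    // Filter within the page — never accumulate the full event list.
    for (const p of page.points) {
      if (p.access_count === 0 && p.created_at < cutoffMs) stale.push(p.id);
    }
    offset = page.nextOffset;
  }
  return stale;
}
```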

### P2-3: Race condition in fact/status supersedes
**File**: `api/src/routes/memory.js:128-156`

Two concurrent `POST /memory` requests with the same fact key:
1. Both find the old fact via `findByPayload`
2. Both try to supersede it (mark the old fact inactive, set `superseded_by`)
3. Both create new points — one "wins" in Qdrant, but the old fact's `superseded_by` field may point to the wrong new memory

Low probability in practice (agents rarely race on the same key) but possible during bulk operations.
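On a single-instance deployment, one way to close this race is to serialize supersedes per fact key with a promise-chain mutex. This is a sketch, not the project's code, and a multi-instance deployment would need a database-level mechanism (e.g. Postgres advisory locks) instead:

```javascript
// Per-key async mutex: concurrent supersedes on the same fact key run
// one after another, so `superseded_by` always points at the last writer.
const keyLocks = new Map();

function withKeyLock(key, fn) {
  const tail = keyLocks.get(key) || Promise.resolve();
  const next = tail.then(fn, fn); // run even if the previous task failed
  keyLocks.set(key, next.catch(() => {})); // keep the chain alive on errors
  return next;
}
```

The find-supersede-create sequence would then be wrapped in `withKeyLock(factKey, ...)` so step 1 of the second request only runs after step 3 of the first.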

### P2-4: Internal fields leak in search API response
**File**: `api/src/routes/memory.js:539-540`

`_sessionKey` and `_sessionRank` are added to result objects during session diversity re-ranking and flow through to the JSON response when there are more than 3 results.

**Verified live**: `GET /memory/search?q=...&limit=8` returns `_sessionKey: "ti-claude::2026-04-02"` and `_sessionRank: 0` in every result.

**Fix**: Delete the internal fields before sending the response:

```javascript
for (const r of results) {
  delete r._sessionKey;
  delete r._sessionRank;
}
```

### P2-5: Duplicate entities in production database
Confirmed pairs that should be merged:

| Entity A | Type | Mentions | Entity B | Type | Mentions |
|----------|------|----------|----------|------|----------|
| Expert Local | system | 13 | expert-local | client | 9 |
| Claude Code | agent | 11 | claude-code | agent | 5 |
| Dermka Clinik | system | 6 | dermka-clinik | client | 4 |
| Mini-Claude | workflow | 7 | Mini Claude | system | 1 |

Root cause: entity extraction treats hyphenated client_id slugs and capitalized display names as separate entities. `extractEntities()` adds both `clientId` (line 298) and capitalized phrases (line 340) without cross-checking.
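The cross-check could compare a slug-normalized form of each candidate before inserting. A minimal sketch (the function name is hypothetical; the real fix lives inside `extractEntities()`):

```javascript
// Normalize display names and client_id slugs to one comparable key:
// "Expert Local", "expert-local", and "Mini Claude" / "Mini-Claude"
// all map to the same string, so the collision is visible up front.
function entityKey(name) {
  return name
    .toLowerCase()
    .replace(/[\s_-]+/g, '-') // unify spaces, underscores, hyphens
    .replace(/^-|-$/g, '');   // trim stray separators
}
```

Extraction would then skip (or merge into) an existing entity whose `entityKey` matches the candidate's, instead of creating a duplicate.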

### P2-6: Massive noise in the workflow entity type (192 entities)
Most workflow entities are garbage captured by the quoted-name regex (`QUOTED_NAME_REGEX`). Examples:
- `const viewportDefs` — code variable
- `openclaw gateway restart` — shell command
- `drop les emojis` — instruction (French: "drop the emojis")
- `Satisfaction Guaranteed or Money Back` — marketing copy
- `NEVER build HTML yourself` — instruction

The `isJunkQuotedName()` filter is good but doesn't catch:
- Multi-word shell commands without recognized command prefixes
- French instructions (`drop les emojis`)
- Marketing taglines that happen to be quoted
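Extra reject heuristics for those three gaps could look like the following. The regexes are illustrative, not the shipped `isJunkQuotedName()` patterns, and would need tuning against real extraction logs:

```javascript
// Illustrative noise heuristics for quoted-name entity candidates.
const SHELL_WORDS = /^(openclaw|git|npm|docker|systemctl|curl)\b/;           // known command prefixes
const IMPERATIVE = /^(drop|never|always|don't|use|run|add|remove)\b/i;       // instruction openers (EN + common FR-mixed)

function looksLikeNoise(name) {
  const words = name.trim().split(/\s+/);
  if (SHELL_WORDS.test(name)) return true;                        // shell commands
  if (IMPERATIVE.test(name)) return true;                         // imperative instructions
  if (words.length >= 4 && /^[a-z\s]+$/.test(name)) return true;  // all-lowercase multi-word phrases
  if (/\b(guaranteed|money back|free)\b/i.test(name)) return true; // marketing taglines
  return false;
}
```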

---

## P3 — Low Severity / Polish

### P3-1: 62 "system" entities with 1 mention (noise)
Many are misclassified: "Maxime Fillion" (person), "Centre Morrin" (place), "Rebranded Prism" (action phrase), "Apple Vision Pro" (unrelated tech).

### P3-2: No `knowledge_category` validation on the store endpoint
**File**: `api/src/middleware/validate.js`
`validateMemoryInput` doesn't validate `knowledge_category`; any string passes through to Qdrant. The MCP tool lists valid values but the server doesn't enforce them.

### P3-3: Feedback loop scrolls ALL active memories twice
**File**: `api/src/services/feedback-loop.js:27-48, 70-105`
Phase 1 (trust scores) and Phase 2 (stale detection) each scroll all active memories. They could be combined into one pass.

### P3-4: CI doesn't test the Postgres backend
**File**: `.github/workflows/ci.yml`
All tests run against SQLite only, so Postgres-specific bugs (like P1-1) are never caught in CI.

### P3-5: `npm audit || true` always passes CI
**File**: `.github/workflows/ci.yml:30`
Security audit failures never fail the build.

### P3-6: `EMBEDDING_DIMS` export always null
**File**: `api/src/services/embedders/interface.js:63`
A legacy backwards-compat export that's never set; any caller using it gets null.

### P3-7: Collection name validation gap
**File**: `api/src/services/collection-registry.js:18-27`
`resolveCollection()` doesn't call `validateCollectionSlug()`, so pre-prefixed names pass through to Qdrant URLs unvalidated. Qdrant likely rejects bad names, but defense-in-depth is missing.

### P3-8: SQLite FTS5 keyword search doesn't support the `source_agent` filter
**File**: `api/src/services/keyword-search.js:142`
Postgres keyword search supports `client_id`, `type`, AND `source_agent` filters; SQLite supports only `client_id` and `type`. Agent-filtered keyword searches return different results on each backend.

---

## Architecture Highlights (Strengths)

- **Auth**: Agent-key identity binding with timing-safe comparison. Rate limiting on failed attempts. Identity enforcement on store/update/delete.
- **Credential scrubbing**: Comprehensive regex patterns (API keys, JWTs, AWS keys, SSH keys, connection strings, Anthropic/OpenAI keys).
- **Input validation**: Content max 10K chars, metadata max 10KB at depth 3, agent names regex-validated, type/importance enum-checked.
- **Multi-path retrieval**: Vector (semantic) + BM25 (keyword) + graph (entity BFS) → RRF fusion. Well implemented, with proper fallbacks.
- **Consolidation defense-in-depth**: XML escaping, batch ID validation, hallucination stripping, enum whitelists for types.
- **Relevance scoring**: Zero latency (no extra embed call), near-duplicate detection, source trust, entity density.
- **Session diversity re-ranking**: Smart round-robin across agent/date groups prevents search-result clustering.
- **Graceful degradation**: Every service (entities, keyword, graph, consolidation) is optional and degrades gracefully.
- **LRU entity cache**: Bounded, with pinned built-in entries; prevents unbounded memory growth.
- **Test suite**: 114 tests / 32 suites / 0 failures. Good coverage of validation, scrubbing, RRF, entity extraction.

---

## External Model Adversarial Review — Cross-Verification

### Codex GPT-5.4 (xhigh) — FAILED
The Codex sandbox hit a `bwrap` loopback failure and could not read any source files. No findings produced.

### Gemini 3.1 Pro (adversarial) — 20 findings, 10 verified

The Gemini agent produced 20 findings (5 P0, 5 P1, 5 P2, 5 P3). After line-by-line verification against source code:

| Finding | Gemini Severity | Verified | Actual Severity | Notes |
|---------|----------------|----------|-----------------|-------|
| P0-1: Race in dedup | P0 | REAL | **P2** | Already in report as P2-3. Low probability in practice |
| P0-2: Timing side-channel on agent keys | P0 | **FALSE** | N/A | Map.get() on 256-bit keys not vulnerable |
| P0-3: SQL wildcard injection via ILIKE | P0 | **FALSE** | N/A | Parameterized queries. Wildcards are intended search behavior |
| P0-4: LLM output injection to DB | P0 | **FALSE** | N/A | Code has extensive validation (lines 252-300) |
| P0-5: Collection name injection | P0 | REAL | **P3** | Already in report as P3-7. Qdrant rejects bad names |
| P1-6: No timeout on findByPayload | P1 | **FALSE** | N/A | All Qdrant calls use `qdrantRequest()` with 10s timeout |
| P1-7: Hardcoded LIMIT 50 | P1 | REAL | **P3** | Design choice, not a bug |
| P1-8: Unbounded consolidation memory | P1 | REAL | **P2** | Already in report as P2-2 |
| P1-9: LLM batch token overflow | P1 | **FALSE** | N/A | 50×10K chars ≈ 125K tokens, Gemini Flash context = 1M |
| P1-10: Exponential timeout backoff | P1 | **FALSE** | N/A | It's linear (not exponential), capped at 1 retry = 30s max |
| P2-11: UUID collision | P2 | **FALSE** | N/A | Mathematically negligible (2^128 space) |
| P2-12: Rate limiting not cluster-safe | P2 | REAL | **P3** | Single-instance deployment, not currently exploitable |
| P2-13: Client resolver silent failure | P2 | **FALSE** | N/A | Code falls back to 'global' correctly (line 165) |
| P2-14: No error recovery in batch consolidation | P2 | REAL | OK | Failed batches stay `consolidated=false`, retried next run |
| P2-15: Embedding fallback | P2 | REAL | **P3** | Architectural limitation, not a bug |
| P3-16: Incomplete input validation | P3 | **FALSE** | N/A | Didn't read validate.js. Validation IS thorough |
| P3-17: Client_id in consolidation prompt | P3 | **FALSE** | N/A | Code preserves client_id correctly (line 338) |
| P3-18: No logging | P3 | **FALSE** | N/A | Logging exists at lines 778, 832, 958 |
| P3-19: Qdrant errors leak details | P3 | **FALSE** | N/A | All error handlers return generic messages |
| P3-20: IP spoofing | P3 | REAL | **P3** | Internal API, not internet-facing |

**Summary**: 10/20 findings were false positives (a 50% FP rate). Of the 10 real findings, 5 were already in the primary audit, 3 were P3 design choices, and 2 were correctly handled by existing code. **Zero confirmed P0 vulnerabilities.**

### Claude Explore Agent — Architecture Analysis (47 tool calls, 537KB output)

Comprehensive architecture map with 15 anomalies. Cross-verification:

| Explore Claim | Verified | Notes |
|--------------|----------|-------|
| "No test files found" | **FALSE** | 6 test files in `api/tests/`, 114 tests pass |
| "No batch size limit" | **FALSE** | `BATCH_MAX_SIZE=100` at `memory.js:849` |
| "MCP consolidation async-only" | **FALSE** | `sync: boolean` param exists in the MCP tool definition |
| Confidence decay has no lower bound | REAL (P3) | After 100 days: 0.98^100 ≈ 13%. By design, but could floor at 0.1 |
| Temporal buffer inconsistency | REAL (P3) | ±2d for "days ago", ±5d for "last month"; not documented |
| Fire-and-forget lacks monitoring | REAL (P3) | Entity linking, keyword index, notifications — all silent on failure |
| Client resolver MIN_THRESHOLD=2 | REAL (P3) | Permissive matching could misattribute memories to the wrong client |

**Summary**: 3/7 key claims were false. The 4 valid observations are all P3 severity. The architecture analysis was thorough but over-reported issues it hadn't fully verified.
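The confidence-decay floor suggested in the table above is a one-line change. A sketch, assuming the report's 0.98/day rate and the proposed 0.1 floor (the function name is hypothetical):

```javascript
// Confidence decay with a lower bound. 0.98/day comes from the audit
// table; flooring at 0.1 keeps very old memories faintly retrievable
// instead of decaying toward zero.
const DAILY_DECAY = 0.98;
const CONFIDENCE_FLOOR = 0.1;

function decayedConfidence(initial, ageDays) {
  return Math.max(CONFIDENCE_FLOOR, initial * Math.pow(DAILY_DECAY, ageDays));
}
```

At 100 days this gives roughly 0.13 (the table's 13% figure); the floor takes over at around 115 days.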

---

## Recommended Fix Priority

1. **P1-1**: Add a Postgres path to `reclassifyEntity()` + the entities route dry-run
2. **P1-2**: Add supersedes, structured DB writes, and entity linking to the batch endpoint
3. **P2-1**: Add keyword indexing + structured DB writes for consolidation merged facts
4. **P2-4**: Clean `_sessionKey`/`_sessionRank` from the search response
5. **P2-5**: Run entity merge for the 4 confirmed duplicate pairs
6. **P2-6**: Tighten `isJunkQuotedName()` to reject more noise patterns
7. **P2-2**: Refactor cleanupOldEvents to filter per-page instead of loading all events
8. **P3-2**: Add `knowledge_category` to `validateMemoryInput`
9. **P3-4**: Add a Postgres service to the CI workflow

api/src/middleware/validate.js

Lines changed: 9 additions & 1 deletion
```diff
@@ -3,6 +3,7 @@
 const AGENT_NAME_REGEX = /^[a-zA-Z0-9_-]{1,64}$/;
 const VALID_TYPES = ['event', 'fact', 'decision', 'status'];
 const VALID_IMPORTANCE = ['critical', 'high', 'medium', 'low'];
+const VALID_KNOWLEDGE_CATEGORIES = ['brand', 'strategy', 'meeting', 'content', 'technical', 'relationship', 'general'];
 const MAX_CONTENT_LENGTH = 10_000;
 const MAX_METADATA_SIZE = 10_240; // 10 KB serialized
 const MAX_METADATA_DEPTH = 3;
@@ -51,6 +52,12 @@ export function validateMetadata(metadata) {
   return null;
 }

+export function validateKnowledgeCategory(kc) {
+  if (kc === undefined || kc === null) return null; // optional, defaults to 'general'
+  if (!VALID_KNOWLEDGE_CATEGORIES.includes(kc)) return `Invalid knowledge_category: ${kc}. Must be one of: ${VALID_KNOWLEDGE_CATEGORIES.join(', ')}`;
+  return null;
+}
+
 export function validateStringField(value, name, maxLen = MAX_STRING_FIELD_LENGTH) {
   if (value === undefined || value === null) return null; // optional
   if (typeof value !== 'string') return `${name} must be a string`;
@@ -80,11 +87,12 @@ export function validateTemporalFields(valid_from, valid_to) {
 }

 // Validate all inputs for POST /memory and return first error or null
-export function validateMemoryInput({ type, content, source_agent, importance, metadata, client_id, key, subject, status_value, valid_from, valid_to }) {
+export function validateMemoryInput({ type, content, source_agent, importance, metadata, client_id, key, subject, status_value, valid_from, valid_to, knowledge_category }) {
   return validateType(type)
     || validateContent(content)
     || validateSourceAgent(source_agent)
     || validateImportance(importance)
+    || validateKnowledgeCategory(knowledge_category)
     || validateMetadata(metadata)
     || validateClientId(client_id)
     || validateStringField(key, 'key', 128)
```

api/src/routes/entities.js

Lines changed: 11 additions & 4 deletions
```diff
@@ -112,15 +112,22 @@ entitiesRouter.post('/reclassify', async (req, res) => {
     if (isDryRun) {
       // Count linked memories for preview
       const store = _getStoreInstance();
-      const linkCount = store?.db
-        ? store.db.prepare('SELECT COUNT(*) as count FROM entity_memory_links WHERE entity_id = @id').get({ id: entity.id })
-        : { count: 0 };
+      let memoriesAffected = 0;
+      if (store?.pool) {
+        const result = await store.pool.query(
+          'SELECT COUNT(*) as count FROM entity_memory_links WHERE entity_id = $1', [entity.id]
+        );
+        memoriesAffected = parseInt(result.rows[0]?.count) || 0;
+      } else if (store?.db) {
+        const linkCount = store.db.prepare('SELECT COUNT(*) as count FROM entity_memory_links WHERE entity_id = @id').get({ id: entity.id });
+        memoriesAffected = linkCount?.count || 0;
+      }

       results.push({
         name: entity.canonical_name,
         old_type: oldType,
         new_type: entry.new_type,
-        memories_affected: linkCount?.count || 0,
+        memories_affected: memoriesAffected,
       });
     } else {
       // 1. Update structured store
```
