Commit 2535624

chore: execute P1-GET-ORGANIZED repo hygiene and MCP refactoring
- Enforce document frontmatter, folder promotion, and uppercase naming via project.sh
- Scrub client names and build repo metadata catalog
- Refactor MCP server to split env-probes and server-registry modules
- Add experimental live environment probe scripts
1 parent 45e772d commit 2535624

60 files changed

Lines changed: 6576 additions & 1775 deletions

CHANGELOG.md

Lines changed: 24 additions & 0 deletions
@@ -13,6 +13,30 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.htm
1313
- Do not edit a version block that has already been committed and pushed
1414
-->
1515

16+
## [2.1.1] - 2026-04-12
17+
18+
### Changed
19+
- **`tools/mcp-server/src/index.ts` + `src/registrations/*`** — MCP server registration is now split by domain so the entrypoint is composition-only and schemas live with the tools/resources they describe, reducing the maintenance cost of adding or changing MCP surface area.
20+
- **External JSON contract hardening** — env probes, pw-auth JSON flows, Query Monitor profile retrieval, LocalWP JSON list parsing, and WPCC scan artifact loading now validate parsed JSON with Zod-backed schemas before returning data to MCP callers, turning silent shape drift into explicit contract errors.
21+
- **Server registry source of truth** — moved the port registry backend from markdown parsing to canonical JSON at `tools/servers.registry.json`, with `tools/servers.md` generated from that data block so MCP reads/writes stop depending on markdown table formatting.
22+
- **MCP test workflow** — normal unit tests and socket-binding tests are now separated into `test:unit`, `test:socket`, and `test:all`, so sandboxed environments can run the stable contract suite without failing on localhost/TCP or Unix-socket bind restrictions.
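As a rough illustration of the contract hardening described above, the sketch below validates a parsed payload before handing it to a caller. The changelog says the real code uses Zod-backed schemas; this sketch substitutes a hand-written guard so it stays dependency-free. The `DevContextStatus` shape, the `port80Listeners` field name, and `ContractError` are assumptions made for illustration, not the project's actual types.

```typescript
// Hypothetical payload shape; the real schema is not shown in this changelog.
interface DevContextStatus {
  mode: "valet" | "localwp" | "conflict" | "none";
  port80Listeners: string[];
}

// Hypothetical error type used to signal shape drift explicitly.
class ContractError extends Error {}

const MODES = ["valet", "localwp", "conflict", "none"];

// Parse raw script output and reject any payload whose shape has drifted.
function parseDevContextStatus(raw: string): DevContextStatus {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    throw new ContractError("output is not valid JSON");
  }
  if (typeof data !== "object" || data === null) {
    throw new ContractError("expected a JSON object");
  }
  const record = data as Record<string, unknown>;
  const mode = record.mode;
  if (typeof mode !== "string" || MODES.indexOf(mode) === -1) {
    throw new ContractError("mode: expected one of " + MODES.join(", "));
  }
  const listeners = record.port80Listeners;
  if (!Array.isArray(listeners) || !listeners.every((l) => typeof l === "string")) {
    throw new ContractError("port80Listeners: expected string[]");
  }
  return {
    mode: mode as DevContextStatus["mode"],
    port80Listeners: listeners as string[],
  };
}
```

The point of the pattern is that an unexpected value such as `"mode": "apache"` fails at the boundary with a named field instead of flowing silently into MCP responses.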
23+
24+
## [2.1.0] - 2026-04-12
25+
26+
### Added
27+
- **`PROJECT/1-INBOX/P1-GET-ORGANIZED.md`** — added a new inbox plan for repo organization work. The plan sequences inventory, lifecycle rules, metadata cataloging, cleanup, and hybrid lexical plus semantic retrieval so embeddings are treated as a discovery layer after structure and retention rules are in place.
28+
- **MCP tools: `servers_monitor_check`, `dev_context_status`** — two new live environment probe tools in MCP server v0.11.0. `servers_monitor_check` wraps `experimental/servers-monitor.sh --json` to detect port conflicts, wildcard binds on 80/443, stale `/etc/hosts` entries, and missing services; returns severity-grouped counts and structured issue list. `dev_context_status` wraps `experimental/dev-context.sh status --json` to report current development mode (`valet` / `localwp` / `conflict` / `none`), service health, port 80 listeners, and Valet proxy registrations. Both fail soft when their backing scripts are missing or unconfigured (returning `not_installed` / `not_configured`) so they're safe to ship across machines without custom setup. Intended use: agents call these at session start to surface port 80 issues *before* the user asks "why can't I reach my site?".
29+
- **`handlers/env-probes.ts`** — new MCP handler module for live machine state probes. Split from `handlers/servers.ts` (which stays pure and only reads/writes the `tools/servers.md` registry) to keep file-parsing logic separate from shell-execution logic. Accepts dependency injection for script paths and exec implementation, enabling per-deployment configuration and test isolation. Gracefully degrades when scripts are missing — probes return structured "not installed" states rather than throwing.
30+
- **`experimental/servers-monitor.sh`** — 30-minute conflict monitor for local dev environments with Resend.com email alerts. Checks 5 categories: multiple-process port conflicts, `0.0.0.0:80/443` wildcard binds (breaks coexistence), stale `.local` `/etc/hosts` entries (cross-referenced against Local WP `sites.json`), expected service health (dnsmasq, MySQL), and Valet IP binding correctness. Dedupes by issue-set hash so the same steady-state conflict never re-alerts. New `--json` flag emits structured output to stdout (skips email, skips baseline update) for MCP consumption. Severity tiers: CRITICAL / HIGH / MEDIUM / LOW / FYI with project/service context per port. Config at `~/secrets/servers-monitor.conf` (outside repo, never committed); example template at `experimental/servers-monitor.conf.example`.
31+
- **`experimental/dev-context.sh`** — context switcher for the Local WP ↔ Valet port 80 mutex. `dev-context localwp` stops Valet nginx (frees `127.0.0.1:80` for Local WP's wildcard bind) while leaving dnsmasq running so `*.test` DNS still resolves. `dev-context valet` restores Valet (refuses if Local WP router is still up). `dev-context status` reports current mode, service health, and port 80 listeners. New `status --json` flag for MCP consumption. Docker Dify is never touched — it runs on dedicated high ports (8741/8742) and is orthogonal to the port 80 mutex.
32+
- **`experimental/local-nginx-shim`** + **`experimental/local-nginx-shim-install.sh`** — fallback workaround for Local WP's hard-coded `listen 80` (wildcard) router binding. The shim is a shell wrapper installed in place of Local WP's bundled nginx binary; at runtime it rewrites router-only config files to `listen 127.0.0.1:80` before exec'ing the real nginx (backed up as `nginx.real`). Only touches files under `run/router/nginx/conf/` — per-site configs on high ports are untouched. Installer is idempotent, refuses to run while Local WP is active, and supports clean uninstall. This enables true coexistence (Local WP + Valet + Docker Dify all on port 80 simultaneously on separate loopback IPs) as an alternative to the `dev-context` mutex approach. Re-run `install` after Local WP app updates.
33+
- **Preflight step 4: local server environment check.** `experimental/preflight.md` now instructs agents to call `servers_monitor_check` and `dev_context_status` at session start (when the task involves a live site), so critical/high severity conflicts are surfaced before diagnostic questions arise. Renumbered the subsequent steps accordingly.
34+
- **`~/bin/servers.md` (machine-specific) + `tools/servers.md` (generic template)** — forked the registry doc into a machine-specific snapshot (real ports, 21 Local WP site IDs, resolved-conflicts history, changelog) and a generic template used by `servers-audit.sh`. Both files include a **Troubleshooting Matrix** (11 symptom → fix rows) and an 8-branch **Decision Tree for LLM Agents** covering: `.test` reachability, `.local` reachability, `0.0.0.0:80/443` wildcard binds, false-positive port conflicts (nginx worker pattern, IPv4/IPv6 dual-stack, macOS system services), database shadowing, Valet config regeneration from stubs, `www.<site>` stale-entry false positives, and DNS cache staleness after `/etc/hosts` edits.
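The fail-soft, dependency-injected probe pattern described for `handlers/env-probes.ts` could look roughly like the sketch below. Every name here (`ProbeDeps`, `runProbe`, the `exists` and `exec` callbacks) is hypothetical, and the synchronous `exec` is a simplification for readability. Only the `not_installed` and `not_configured` states come from the changelog; the `error` state is an added assumption.

```typescript
// Hypothetical dependency bundle; the real handler's signatures are not shown here.
interface ProbeDeps {
  scriptPath: string | null;                          // null models "not configured"
  exists: (path: string) => boolean;                  // injected file-existence check
  exec: (scriptPath: string, args: string[]) => string; // injected script runner
}

type ProbeResult =
  | { status: "ok"; data: unknown }
  | { status: "not_configured" }
  | { status: "not_installed"; scriptPath: string }
  | { status: "error"; message: string };             // assumption, not from the changelog

// Run a probe script and return a structured state instead of throwing.
function runProbe(deps: ProbeDeps, args: string[]): ProbeResult {
  if (deps.scriptPath === null) {
    return { status: "not_configured" };
  }
  if (!deps.exists(deps.scriptPath)) {
    return { status: "not_installed", scriptPath: deps.scriptPath };
  }
  try {
    const stdout = deps.exec(deps.scriptPath, args);
    return { status: "ok", data: JSON.parse(stdout) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { status: "error", message };
  }
}
```

Because the file check and the runner are injected, a test can exercise all states without touching the filesystem or spawning a shell.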
35+
36+
### Changed
37+
- **MCP server version bump to v0.11.0** — new Environment Probes area added; tool count updated from 26 to 28 across the same 7 user-facing areas (probes are grouped under "Server Registry" thematically).
38+
- **`experimental/servers-monitor.sh` false-positive fixes** — three classes of false positives fixed: (1) dedupe port-conflict detection by unique process **command**, not PID, so nginx's 10 workers sharing one listening socket no longer register as 10 conflicts; (2) handle IPv4+IPv6 dual-stack binds (e.g., Postgres on `127.0.0.1:5432` and `[::1]:5432`) as a single listener; (3) strip `www.` prefix when comparing `/etc/hosts` entries against Local WP `sites.json` (Local WP writes `www.<domain>` aliases but only stores the bare domain). macOS system services (ControlCenter AirPlay on 5000/7000, rapportd on 64343) classified as FYI severity to prevent noise.
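The false-positive fixes above can be sketched in TypeScript as follows. The real logic lives in a shell script, so this is only an illustration of the idea under assumed names: the `Listener` record, `conflictingPorts`, and `bareDomain` are all hypothetical. Grouping by command rather than PID collapses nginx workers into one entry, and it also treats an IPv4 plus IPv6 bind by the same process as a single listener.

```typescript
// Hypothetical listener record; field names are illustrative only.
interface Listener {
  port: number;
  command: string; // process name, e.g. "nginx"
  pid: number;
  address: string; // e.g. "127.0.0.1" or "[::1]"
}

// A port is in conflict only when more than one distinct command listens on it.
function conflictingPorts(listeners: Listener[]): number[] {
  const commandsByPort = new Map<number, Set<string>>();
  for (const l of listeners) {
    const commands = commandsByPort.get(l.port) || new Set<string>();
    commands.add(l.command);
    commandsByPort.set(l.port, commands);
  }
  const result: number[] = [];
  commandsByPort.forEach((commands, port) => {
    if (commands.size > 1) result.push(port);
  });
  return result.sort((a, b) => a - b);
}

// Strip a leading "www." so hosts-file aliases compare equal to the stored bare domain.
function bareDomain(host: string): string {
  return host.slice(0, 4) === "www." ? host.slice(4) : host;
}
```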
39+
1640
## [2.0.0] - 2026-04-11
1741

1842
### Added
File renamed without changes.
Lines changed: 226 additions & 0 deletions
@@ -0,0 +1,226 @@
1+
---
2+
title: "P1: Get Organized"
3+
status: inbox
4+
priority: P1
5+
created: 2026-04-11
6+
updated: 2026-04-11
7+
author: "GitHub Copilot"
8+
goal: "Create a practical repo-organization plan for AI-DDTK that reduces sprawl first and adds semantic retrieval only where it materially improves discovery."
9+
---
10+
11+
<!-- TOC -->
12+
13+
- [Phased Checklist (High-Level Progress)](#phased-checklist-high-level-progress)
14+
- [Overview](#overview)
15+
- [Goals](#goals)
16+
- [Non-Goals](#non-goals)
17+
- [Guiding Principle](#guiding-principle)
18+
- [Phase 0 — Inventory and Classification](#phase-0--inventory-and-classification)
19+
- [Phase 1 — Canonical Structure and Retention Rules](#phase-1--canonical-structure-and-retention-rules)
20+
- [Phase 2 — Build a Repo Metadata Catalog](#phase-2--build-a-repo-metadata-catalog)
21+
- [Phase 3 — Cleanup, Promotion, and Archival Pass](#phase-3--cleanup-promotion-and-archival-pass)
22+
- [Phase 4 — Add Hybrid Retrieval Where It Helps](#phase-4--add-hybrid-retrieval-where-it-helps)
23+
- [Phase 5 — Operating Rhythm and Ownership](#phase-5--operating-rhythm-and-ownership)
24+
- [Success Criteria](#success-criteria)
25+
- [Open Questions](#open-questions)
26+
27+
<!-- /TOC -->
28+
29+
---
30+
31+
## Phased Checklist (High-Level Progress)
32+
33+
> This document should be updated as work is completed. Mark off items immediately rather than batching status updates later.
34+
35+
- [ ] **Phase 0 — Inventory and Classification**
36+
- [ ] **Phase 1 — Canonical Structure and Retention Rules**
37+
- [ ] **Phase 2 — Build a Repo Metadata Catalog**
38+
- [ ] **Phase 3 — Cleanup, Promotion, and Archival Pass**
39+
- [ ] **Phase 4 — Add Hybrid Retrieval Where It Helps**
40+
- [ ] **Phase 5 — Operating Rhythm and Ownership**
41+
42+
## Overview
43+
44+
AI-DDTK has grown into a toolkit repo with code, docs, recipes, project tracking, experiments, generated reports, and operational artifacts. The current problem is not only search. It is lifecycle clarity: what is canonical, what is in progress, what is generated, what is temporary, and what should be archived.
45+
46+
Embeddings can help with discovery, clustering, and semantic lookup across notes, reports, and docs. They do not replace naming rules, retention rules, or folder discipline. This plan treats semantic retrieval as a later layer added on top of a cleaner repo model.
47+
48+
## Goals
49+
50+
- [ ] Reduce ambiguity about where new files belong.
51+
- [ ] Separate source-of-truth files from generated artifacts and temporary outputs.
52+
- [ ] Create a machine-readable catalog of important files and directories.
53+
- [ ] Make cleanup repeatable instead of one-off.
54+
- [ ] Add semantic retrieval only after the repo has usable metadata and folder hygiene.
55+
56+
## Non-Goals
57+
58+
- [ ] Do not redesign the entire repo structure in one pass.
59+
- [ ] Do not build a full knowledge platform before cleanup basics exist.
60+
- [ ] Do not index every file blindly into a vector store.
61+
- [ ] Do not treat generated artifacts as equal to canonical docs or source code.
62+
63+
## Guiding Principle
64+
65+
Organize first, index second.
66+
67+
If the repo lacks clear file lifecycle rules, embeddings will help search the mess without reducing the mess. The right sequence is:
68+
69+
1. Inventory the repo.
70+
2. Classify files by lifecycle and purpose.
71+
3. Clean up and normalize the highest-noise areas.
72+
4. Add metadata-backed discovery.
73+
5. Add hybrid lexical + semantic retrieval where it clearly improves workflow.
74+
75+
## Phase 0 — Inventory and Classification
76+
77+
Purpose: build a factual snapshot of what exists before making structural changes.
78+
79+
### Checklist
80+
81+
- [ ] Generate a repo-wide file inventory using tracked files as the baseline.
82+
- [ ] Break the inventory down by top-level area: `tools/`, `experimental/`, `PROJECT/`, `docs/`, `recipes/`, `templates/`, `test/`, `bin/`, `temp/`.
83+
- [ ] For each area, label files into one of these classes: `canonical-source`, `documentation`, `project-tracking`, `generated-artifact`, `temporary`, `experimental`, `archive-candidate`.
84+
- [ ] Identify the noisiest zones by count and churn, not by intuition.
85+
- [ ] Flag directories that mix multiple lifecycles in one place.
86+
- [ ] Produce a first-pass list of files that are probably duplicated, stale, or misfiled.
87+
- [ ] Record which generated outputs are intentionally checked in versus accidentally lingering.
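A first-pass inventory along the lines of this checklist might start from something like the sketch below. It assumes the tracked-file list is already available as an array of paths (for example from `git ls-files`), and the default class per top-level area is a guess that would need per-file review. The `unclassified` label is not one of the plan's classes; it is a placeholder added here to flag paths for manual review.

```typescript
type LifecycleClass =
  | "canonical-source"
  | "documentation"
  | "project-tracking"
  | "generated-artifact"
  | "temporary"
  | "experimental"
  | "archive-candidate"
  | "unclassified"; // placeholder added for this sketch, not part of the plan

interface InventoryRow {
  path: string;
  area: string;
  lifecycleClass: LifecycleClass;
}

// Assumed defaults per top-level area; real rules would be refined file by file.
const AREA_DEFAULTS: { [area: string]: LifecycleClass } = {
  tools: "canonical-source",
  bin: "canonical-source",
  templates: "canonical-source",
  test: "canonical-source",
  docs: "documentation",
  recipes: "documentation",
  PROJECT: "project-tracking",
  experimental: "experimental",
  temp: "temporary",
};

// Label one tracked path with its top-level area and a default lifecycle class.
function classifyPath(path: string): InventoryRow {
  const slash = path.indexOf("/");
  const area = slash === -1 ? "(root)" : path.slice(0, slash);
  const known = Object.prototype.hasOwnProperty.call(AREA_DEFAULTS, area);
  return { path, area, lifecycleClass: known ? AREA_DEFAULTS[area] : "unclassified" };
}

// Count rows per area so the noisiest zones are found by count, not intuition.
function countByArea(rows: InventoryRow[]): { [area: string]: number } {
  const counts: { [area: string]: number } = {};
  for (const row of rows) {
    counts[row.area] = (counts[row.area] || 0) + 1;
  }
  return counts;
}
```

Serializing the resulting rows with `JSON.stringify` would give the machine-readable snapshot named in the deliverable.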
88+
89+
### Deliverable
90+
91+
- [ ] A first-pass inventory snapshot stored in a machine-readable format such as JSON or CSV.
92+
93+
### Exit Criteria
94+
95+
- [ ] We can answer, with evidence, which parts of the repo are source, working notes, generated output, and probable cleanup targets.
96+
97+
## Phase 1 — Canonical Structure and Retention Rules
98+
99+
Purpose: define what belongs where and how long it should live.
100+
101+
### Checklist
102+
103+
- [ ] Define the canonical purpose of each top-level directory in one sentence.
104+
- [ ] Confirm `PROJECT/` is only for planning, tracking, inbox, working, and done states.
105+
- [ ] Confirm `temp/` is for sensitive and disposable runtime artifacts, not long-term reference docs.
106+
- [ ] Define what qualifies for `experimental/` and what conditions trigger promotion out of it.
107+
- [ ] Define which report outputs belong in-repo versus gitignored runtime storage.
108+
- [ ] Review `.gitignore` coverage for reports, screenshots, scans, auth state, and logs.
109+
- [ ] Write simple retention rules for generated content: keep, archive, or purge.
110+
- [ ] Decide what must always have an owning doc or README in dense directories.
111+
112+
### Deliverable
113+
114+
- [ ] A short policy section or reference doc update that names the lifecycle rules for canonical, experimental, generated, and temporary files.
115+
116+
### Exit Criteria
117+
118+
- [ ] A contributor can decide where a new file belongs without guessing.
119+
120+
## Phase 2 — Build a Repo Metadata Catalog
121+
122+
Purpose: create a lightweight system of record for discovery and cleanup.
123+
124+
### Checklist
125+
126+
- [ ] Define the metadata schema for the catalog.
127+
- [ ] Include at minimum: `path`, `area`, `file_type`, `lifecycle_class`, `owner_tool`, `canonical`, `generated`, `last_modified`, `status`, `notes`.
128+
- [ ] Add optional tags for themes such as `mcp`, `wpcc`, `playwright`, `local-wp`, `query-monitor`, `servers`, `project-doc`.
129+
- [ ] Generate the initial catalog automatically from the repo rather than maintaining it by hand.
130+
- [ ] Add a rule for how manual overrides are stored when auto-detection is wrong.
131+
- [ ] Mark high-value files explicitly as canonical references.
132+
- [ ] Mark low-value files explicitly as cleanup or archive candidates.
133+
- [ ] Decide where the catalog lives and whether it is checked in or regenerated.
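One possible shape for the catalog, using the minimum field list from the checklist, is sketched below. The field types, the override mechanism keyed by path, and the helper names are assumptions; the plan leaves those decisions open.

```typescript
// Field names come from the checklist; the types are assumed for illustration.
interface CatalogEntry {
  path: string;
  area: string;
  file_type: string;
  lifecycle_class: string;
  owner_tool: string | null;
  canonical: boolean;
  generated: boolean;
  last_modified: string; // ISO 8601 date string, assumed
  status: string;
  notes: string;
  tags?: string[];       // optional themes such as "mcp" or "wpcc"
}

// Hypothetical override store: partial entries keyed by path win over auto-detection.
type Overrides = { [path: string]: Partial<CatalogEntry> };

function applyOverrides(entries: CatalogEntry[], overrides: Overrides): CatalogEntry[] {
  return entries.map((entry) => ({ ...entry, ...overrides[entry.path] }));
}

// Query by lifecycle instead of relying on folder names.
function byLifecycle(entries: CatalogEntry[], lifecycleClass: string): CatalogEntry[] {
  return entries.filter((entry) => entry.lifecycle_class === lifecycleClass);
}
```

Keeping overrides in a separate structure means the auto-generated catalog can be regenerated at any time without losing manual corrections.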
134+
135+
### Deliverable
136+
137+
- [ ] A machine-readable catalog that can drive cleanup reports, folder summaries, and later search indexing.
138+
139+
### Exit Criteria
140+
141+
- [ ] We can query the repo by lifecycle and ownership instead of relying only on folder names.
142+
143+
## Phase 3 — Cleanup, Promotion, and Archival Pass
144+
145+
Purpose: reduce noise using the inventory and catalog rather than ad hoc decisions.
146+
147+
### Checklist
148+
149+
- [ ] Triage `PROJECT/1-INBOX` items into active, done, or misc states using existing doc rules.
150+
- [ ] Move finished project docs out of inbox.
151+
- [ ] Review `experimental/` for tools or docs that have effectively graduated.
152+
- [ ] Move obsolete or superseded planning docs to the appropriate archive location instead of leaving duplicates in place.
153+
- [ ] Consolidate duplicate instructions where one doc clearly supersedes another.
154+
- [ ] Remove or archive tracked generated artifacts that do not belong in the main repo surface.
155+
- [ ] Add missing README or index guidance in dense directories only where it reduces ambiguity.
156+
- [ ] Re-run the inventory after cleanup and measure count reduction and clearer classification.
157+
158+
### Deliverable
159+
160+
- [ ] A visibly smaller and more legible repo surface, especially in project-tracking and experimental areas.
161+
162+
### Exit Criteria
163+
164+
- [ ] The highest-noise folders have fewer ambiguous files and clearer ownership.
165+
166+
## Phase 4 — Add Hybrid Retrieval Where It Helps
167+
168+
Purpose: improve discovery after structure exists.
169+
170+
### Checklist
171+
172+
- [ ] Start with hybrid retrieval, not embeddings alone.
173+
- [ ] Use lexical search for exact names, paths, commands, headings, and schema keys.
174+
- [ ] Use embeddings for semantic discovery across docs, reports, changelog entries, project notes, and scan outputs.
175+
- [ ] Index docs and operational artifacts first.
176+
- [ ] Add code chunks only when documentation is insufficient for the target workflow.
177+
- [ ] Exclude binaries, screenshots, lockfiles, auth state, and highly repetitive output unless there is a clear use case.
178+
- [ ] Test real queries against the index before expanding scope.
179+
- [ ] Define success queries such as: “show me all files related to LocalWP auth failures” or “find similar past WPCC performance investigations.”
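The hybrid approach in this checklist could be prototyped with a simple weighted blend of a lexical score and an embedding similarity, as in the sketch below. This is one common fusion strategy, not something the plan prescribes. The embeddings are assumed to be supplied by some external model, and `IndexedDoc`, `hybridSearch`, and the `alpha` weight are hypothetical names and parameters.

```typescript
interface IndexedDoc {
  path: string;
  text: string;
  embedding: number[]; // assumed to come from an external embedding model
}

// Fraction of query terms that appear in the text (case-insensitive substring match).
function lexicalScore(query: string, text: string): number {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  if (terms.length === 0) return 0;
  const haystack = text.toLowerCase();
  const hits = terms.filter((t) => haystack.indexOf(t) !== -1).length;
  return hits / terms.length;
}

// Cosine similarity between two vectors of equal length; 0 when either has zero norm.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Blend both signals; alpha weights the lexical side, (1 - alpha) the semantic side.
function hybridSearch(
  query: string,
  queryEmbedding: number[],
  docs: IndexedDoc[],
  alpha: number = 0.5
): { path: string; score: number }[] {
  return docs
    .map((doc) => ({
      path: doc.path,
      score:
        alpha * lexicalScore(query, doc.text) +
        (1 - alpha) * cosine(queryEmbedding, doc.embedding),
    }))
    .sort((x, y) => y.score - x.score);
}
```

Running the success queries from the checklist against a small index like this, with different `alpha` values, is a cheap way to check whether the semantic side adds anything over plain lexical search before expanding scope.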
180+
181+
### Deliverable
182+
183+
- [ ] A narrow, high-signal retrieval layer aimed at discovery and clustering, not as a substitute for repo organization.
184+
185+
### Exit Criteria
186+
187+
- [ ] Semantic lookup answers real questions faster than plain grep without pulling in obvious noise.
188+
189+
## Phase 5 — Operating Rhythm and Ownership
190+
191+
Purpose: keep the repo organized after the first cleanup pass.
192+
193+
### Checklist
194+
195+
- [ ] Assign an owner or review rule for repo hygiene changes.
196+
- [ ] Add a recurring cleanup cadence for inbox, experimental, and generated-output areas.
197+
- [ ] Add a lightweight checklist for “new file acceptance” so artifacts do not accumulate silently.
198+
- [ ] Require new generated-output directories to declare whether they are checked in or gitignored.
199+
- [ ] Review the metadata catalog on a schedule rather than only during cleanup crises.
200+
- [ ] Add a simple report that shows new files by lifecycle class since the last review.
201+
- [ ] Revisit the hybrid index scope after one or two cleanup cycles.
202+
203+
### Deliverable
204+
205+
- [ ] A repeatable maintenance loop that prevents the repo from drifting back into ambiguity.
206+
207+
### Exit Criteria
208+
209+
- [ ] Repo organization becomes an operating habit, not a one-time project.
210+
211+
## Success Criteria
212+
213+
- [ ] The top-level repo areas each have a clear and enforced purpose.
214+
- [ ] New files can be classified quickly as canonical, generated, temporary, experimental, or project-tracking.
215+
- [ ] The noisiest folders have been reduced and normalized.
216+
- [ ] A metadata catalog exists and can be regenerated.
217+
- [ ] Semantic retrieval is scoped to the parts of the repo where it genuinely improves discovery.
218+
- [ ] Contributors can find the right file faster without memorizing tribal knowledge.
219+
220+
## Open Questions
221+
222+
- [ ] Should the metadata catalog live under `PROJECT/`, `tools/`, or a new repo-maintenance location?
223+
- [ ] Which generated artifacts are intentionally committed because they provide durable value?
224+
- [ ] Which `experimental/` items are actually production-grade and just waiting for promotion?
225+
- [ ] Should repo hygiene checks become part of `preflight.sh`, `post-flight.sh`, or a separate maintenance command?
226+
- [ ] Is the first retrieval target this repo alone, or this repo plus neighboring WordPress project artifacts and reports?

PROJECT/1-INBOX/P1-POST-FLIGHT.md

Lines changed: 10 additions & 0 deletions
@@ -1,3 +1,13 @@
1+
---
2+
title: "Post-Flight Session Cleanup Script"
3+
status: inbox
4+
priority: P1
5+
created: 2026-04-11
6+
updated: 2026-04-11
7+
author: noelsaw
8+
goal:
9+
---
10+
111
# Post-Flight Session Cleanup Script
212

313
## Overview
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
