Commit df529b9

feat: implement harvest agent with pre-scan functionality for specs, docs, and code comments
1 parent 389c0e0 commit df529b9

3 files changed

Lines changed: 752 additions & 0 deletions

Lines changed: 347 additions & 0 deletions
@@ -0,0 +1,347 @@
1+
---
2+
description: Harvest knowledge from completed specs and stale docs into CHANGELOG and guides, clean stale code comments, then archive
3+
handoffs:
4+
- label: Run a site audit after harvesting
5+
agent: speckit.site-audit
6+
prompt: Run a site audit to confirm the project is in good health after harvesting
7+
- label: Archive remaining stale docs
8+
agent: speckit.archive
9+
prompt: Archive any remaining outdated documentation that was not covered by the harvest
10+
- label: Evolve the constitution
11+
agent: speckit.evolve-constitution
12+
prompt: Review the constitution in light of the cleaned-up documentation and harvested knowledge
13+
---
14+
15+
## User Input
16+
17+
```text
18+
$ARGUMENTS
19+
```
20+
21+
You **MUST** consider the user input before proceeding (if not empty).
22+
23+
---
24+
25+
## Goal
26+
27+
Harvest valuable knowledge from completed specs, stale documentation, and in-process drafts — then archive or remove them. This is a **knowledge-preserving cleanup**: information is captured in living documents (CHANGELOG, Guide.md, copilot-instructions) before source material moves to `.archive/`.
28+
29+
Additionally, scan source code for comments that reference completed specs, plans, or tasks and rewrite them as self-contained code documentation.
30+
31+
The output is:
32+
1. Updated living documents (CHANGELOG, Guide.md) with harvested knowledge
33+
2. Cleaned code comments (spec references → self-contained descriptions)
34+
3. Stale files moved to `.archive/` with preserved folder structure
35+
4. A harvest summary report at `.documentation/copilot/harvest-YYYY-MM-DD.md`
36+
37+
**CRITICAL: Never read from `.archive/` during this or any other command.** The archive is write-only from an operational perspective.
38+
39+
---
40+
41+
## Scope Options
42+
43+
By default (no arguments), perform a **full harvest**. Optional scope presets allow targeted runs:
44+
45+
| Scope Argument | Description |
46+
|----------------|-------------|
47+
| *(none)* | Full harvest — all phases enabled |
48+
| `--scope=specs` | Completed specs only — harvest and archive |
49+
| `--scope=docs` | Stale docs only — harvest and archive |
50+
| `--scope=comments` | Code comment cleanup only — no file moves |
51+
| `--scope=changelog` | CHANGELOG update only — no archival |
52+
| `--scope=scan` | Scan and report only — no modifications (dry run) |
53+
54+
Multiple scopes can be combined: `--scope=specs,comments`
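
The scope rules above could be parsed along these lines. This is a minimal sketch, not the harvest script's actual parser: `parse_scopes` and `VALID_SCOPES` are hypothetical names, and rejecting unknown scope names is an assumption. The value `full` is accepted because Phase 1 treats `--scope=full` the same as no arguments.

```python
VALID_SCOPES = {"specs", "docs", "comments", "changelog", "scan", "full"}


def parse_scopes(arguments: str) -> set[str]:
    """Return the scopes requested in a raw $ARGUMENTS string."""
    scopes: set[str] = set()
    for token in arguments.split():
        if not token.startswith("--scope="):
            continue  # other user input is not a scope flag
        for name in token[len("--scope="):].split(","):
            name = name.strip()
            if not name:
                continue
            if name not in VALID_SCOPES:
                # Assumption: an unknown scope is an error, not silently ignored.
                raise ValueError(f"unknown scope: {name}")
            scopes.add(name)
    # No scope flag at all means a full harvest.
    return scopes or {"full"}
```

For example, `parse_scopes("--scope=specs,comments")` yields the two scopes `specs` and `comments`, and an empty string yields `full`.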
55+
56+
---
57+
58+
## Operating Constraints
59+
60+
- **Constitution Authority**: `/.documentation/memory/constitution.md` governs this process — read it first
61+
- **Knowledge Preservation**: Information MUST be harvested into living documents BEFORE archival
62+
- **No Data Loss**: Files move to `.archive/` — never deleted outright
63+
- **CHANGELOG Is Append-Only**: New entries are prepended above existing content; existing entries are never modified
64+
- **Confirmation Required**: The harvest plan MUST be presented to the user for approval before any file moves or code edits
65+
66+
---
67+
68+
## Execution Phases
69+
70+
### Phase 1: Pre-Scan and Context Loading
71+
72+
1. **Run the harvest pre-scan script** to gather inventory:
73+
74+
```powershell
75+
.documentation/scripts/powershell/harvest.ps1 -Json
76+
```
77+
78+
Parse the JSON output. Key fields:
79+
- `harvest_date` — Use for the report filename
80+
- `specs` — All spec folders with status (`completed`, `completed-needs-changelog`, `in-progress`, `draft`)
81+
- `docs` — All doc files categorized (living_reference, completed_reviews, stale_drafts, session_notes, etc.)
82+
- `code_comments` — Spec/task references found in source files
83+
- `changelog_gaps` — Specs with all tasks done but no CHANGELOG entry
84+
- `bak_files` — Backup files to clean
85+
- `archive_existing` — What's already in `.archive/` (skip these)
86+
87+
2. **Load governance documents**:
88+
- Read `/.documentation/memory/constitution.md`
89+
- Read `CHANGELOG.md` (repo root) — identify what's already documented
90+
- Read `.documentation/Guide.md` — identify what may need updating
91+
92+
3. **Parse scope arguments** from `$ARGUMENTS`:
93+
- If empty or `--scope=full`: enable all phases
94+
- If `--scope=scan`: run phases 1–2 only, write summary report, stop before any modifications
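
As an illustration of consuming the pre-scan output, the sketch below filters specs by status. The source names the top-level fields but not their shape, so this assumes `specs` is a list of objects with `name` and `status` keys; both key names and the helper `specs_with_status` are hypothetical.

```python
import json


def specs_with_status(prescan_json: str, status: str) -> list[str]:
    """List spec names whose status matches, given the pre-scan JSON text."""
    report = json.loads(prescan_json)
    # Assumption: each entry looks like {"name": "...", "status": "..."}.
    return [
        spec["name"]
        for spec in report.get("specs", [])
        if spec.get("status") == status
    ]
```

Calling it with `"completed-needs-changelog"` would give the specs that need a CHANGELOG entry before archival.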
95+
96+
---
97+
98+
### Phase 2: Classify All Artifacts
99+
100+
#### 2a. Classify Specs
101+
102+
For each folder in `.documentation/specs/` (excluding `pr-review/`):
103+
104+
| Status | Criteria | Action |
105+
|--------|----------|--------|
106+
| **COMPLETED** | All tasks `[X]` in tasks.md AND the spec has a CHANGELOG entry | Harvest → Archive |
107+
| **COMPLETED-NEEDS-CHANGELOG** | All tasks `[X]` but not in CHANGELOG | Add CHANGELOG entry → Archive |
108+
| **IN-PROGRESS** | Some tasks incomplete | Leave in place |
109+
| **DRAFT** | No tasks.md, or no tasks started | Leave in place |
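
The status table above can be read as a small decision function. The following is a sketch under stated assumptions: tasks are Markdown checkboxes (`- [ ]` or `- [X]`), a lowercase `x` also counts as done, and a tasks.md with no checkboxes is treated as a draft. `classify_spec` is a hypothetical helper, not part of the pre-scan script.

```python
import re
from typing import Optional

# Matches "- [ ]", "- [x]" or "- [X]" at the start of a line.
TASK_RE = re.compile(r"^\s*[-*]\s+\[( |x|X)\]", re.MULTILINE)


def classify_spec(tasks_md: Optional[str], in_changelog: bool) -> str:
    """Map a spec's tasks.md text and CHANGELOG presence to a status."""
    if tasks_md is None:
        return "DRAFT"  # no tasks.md at all
    marks = TASK_RE.findall(tasks_md)
    done = sum(1 for mark in marks if mark.lower() == "x")
    if not marks or done == 0:
        return "DRAFT"  # no tasks started
    if done < len(marks):
        return "IN-PROGRESS"
    return "COMPLETED" if in_changelog else "COMPLETED-NEEDS-CHANGELOG"
```

Only the two `COMPLETED` results make a spec eligible for archival; the other two leave it in place.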
110+
111+
#### 2b. Classify Docs
112+
113+
For each file in `.documentation/` (recursive, never `.archive/`):
114+
115+
| Category | Path Pattern | Action |
116+
|----------|-------------|--------|
117+
| **Completed Reviews** | `.documentation/specs/pr-review/*.md` | Harvest → Archive |
118+
| **Completed Audits** | `.documentation/copilot/audit/*.md` | Harvest → Archive |
119+
| **Stale Drafts** | `.documentation/drafts/*.md` | Archive |
120+
| **Session Notes** | `.documentation/copilot/session*/` | Archive if work is merged |
121+
| **Impl Plans** | `*-implementation-plan.md`, `*-plan.md` (completed) | Harvest → Archive |
122+
| **Release Docs** | `.documentation/releases/` (superseded) | Archive |
123+
| **Quickfix Records** | `.documentation/quickfixes/` | Archive |
124+
| **Backup Files** | `*.bak`, `*.backup`, `*.old` | Archive |
125+
| **Living Reference** | Top-level `.documentation/*.md`, Guide.md, CHANGELOG.md | Keep — may update |
126+
127+
**Never archive:**
128+
- `/.documentation/memory/constitution.md`
129+
- `/.documentation/scripts/` and `/.documentation/templates/`
130+
- `/.documentation/Guide.md` — update it instead
131+
- `CHANGELOG.md` — append to it, never move it
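
A path-based classifier for part of the table above might look like the sketch below. It checks the never-archive list first, then a subset of the path patterns. The rule order, the `"Unclassified"` fallback, and the names `PROTECTED`, `RULES` and `classify_doc` are assumptions for illustration. Note that `fnmatchcase` lets `*` match `/`, so `drafts/*.md` also matches files in subfolders.

```python
from fnmatch import fnmatchcase

# Paths from the "Never archive" list; a trailing slash marks a directory.
PROTECTED = (
    ".documentation/memory/constitution.md",
    ".documentation/scripts/",
    ".documentation/templates/",
    ".documentation/Guide.md",
    "CHANGELOG.md",
)

# A subset of the classification table, checked in order.
RULES = [
    (".documentation/specs/pr-review/*.md", "Completed Reviews"),
    (".documentation/copilot/audit/*.md", "Completed Audits"),
    (".documentation/drafts/*.md", "Stale Drafts"),
    ("*.bak", "Backup Files"),
    ("*.backup", "Backup Files"),
    ("*.old", "Backup Files"),
]


def classify_doc(path: str) -> str:
    """Return the category for a repo-relative path."""
    path = path.lstrip("/")
    for protected in PROTECTED:
        is_dir = protected.endswith("/")
        if path == protected or (is_dir and path.startswith(protected)):
            return "Never archive"
    for pattern, category in RULES:
        if fnmatchcase(path, pattern):
            return category
    return "Unclassified"
```

Checking the protected list before any pattern means a stray `*.old` file under `.documentation/scripts/` would still be left alone.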
132+
133+
#### 2c. Classify Code Comments
134+
135+
Search source files for spec/task references. The pre-scan output's `code_comments` array lists these. For each:
136+
137+
| Pattern Example | Action |
138+
|----------------|--------|
139+
| `# spec 026 Phase 5` | Rewrite as behavior description |
140+
| `# FR-013: always send transcript` | Strip prefix, keep or rewrite the behavior description |
141+
| `# T006: audit trail` | Rewrite as self-contained comment |
142+
| `# Phase 3 implementation` | Remove or rewrite |
143+
| `# TODO(spec-018): migrate later` | Remove entirely if spec-018 is completed |
144+
145+
**Rewrite rule**: Replace any spec/task reference with a self-contained description of WHAT the code does and WHY, without referencing the spec document. The comment must make sense to a reader who has never seen any spec.
146+
147+
```python
148+
# BEFORE: # FR-013: transcript always sent regardless of CRM engagement
149+
# AFTER: # Always send transcript — CRM cases need complete clinical context for closure
150+
151+
# BEFORE: # spec 026 Phase 5 — isolated handler contract
152+
# AFTER: # Handler receives only (data, step_data, step, answer) — no document access
153+
```
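
To show how the reference patterns in the table could be located, here is a heuristic sketch. The regular expression covers only the example shapes listed above and will also match a `#` inside a string literal, so its hits are candidates for review rather than confirmed spec references. `SPEC_REF` and `find_spec_references` are hypothetical names; the real pre-scan script may detect references differently.

```python
import re

# Covers the example shapes: "spec 026", "FR-013", "T006", "Phase 3", "TODO(spec-018)".
SPEC_REF = re.compile(
    r"#\s*(?:TODO\(spec-\d+\)|spec\s+\d+|FR-\d+|T\d+\b|Phase\s+\d+)",
    re.IGNORECASE,
)


def find_spec_references(source: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for each candidate comment."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if SPEC_REF.search(line):
            hits.append((number, line.strip()))
    return hits
```

Each hit still needs the manual rewrite described by the rule above; the function only narrows down where to look.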
154+
155+
---
156+
157+
### Phase 3: Present Harvest Plan
158+
159+
**STOP and present the full plan to the user for approval before proceeding.** Format:
160+
161+
```markdown
162+
## Harvest Plan — YYYY-MM-DD
163+
164+
### Specs to Archive (N)
165+
| Spec | CHANGELOG? | Action |
166+
|------|------------|--------|
167+
| 001-example-spec | ✅ Yes | Archive |
168+
| 033-another-spec | ⚠️ No | Add CHANGELOG entry → Archive |
169+
170+
### Docs to Archive (N)
171+
| File | Category | Action |
172+
|------|----------|--------|
173+
| .documentation/specs/pr-review/pr-12345.md | Completed review | Archive |
174+
| .documentation/drafts/old-draft.md | Stale draft | Archive |
175+
176+
### CHANGELOG Updates Needed (N)
177+
| Spec | Summary |
178+
|------|---------|
179+
| 033-another-spec | Brief description of what was delivered |
180+
181+
### Code Comments to Rewrite (N)
182+
| File | Line | Current | Proposed |
183+
|------|------|---------|----------|
184+
| src/helpers.py | 142 | # FR-013 always send | # Always send transcript for CRM closure |
185+
186+
### Files to Clean (N)
187+
| File | Type |
188+
|------|------|
189+
| .documentation/drafts/scratch.bak | Backup file |
190+
191+
### Not Changing
192+
- .documentation/specs/019-in-progress/ (tasks incomplete)
193+
- .documentation/Guide.md (current — will update after archival)
194+
```
195+
196+
Wait for: **"Proceed with harvest? (yes/no/modify)"**
197+
198+
If the user says **modify**, apply their changes to the plan and re-present before executing.
199+
200+
---
201+
202+
### Phase 4: Harvest Knowledge → Living Documents
203+
204+
**Execute only after user approval.**
205+
206+
#### 4a. Update CHANGELOG.md
207+
208+
For each completed spec NOT already in CHANGELOG:
209+
1. Read the spec's `spec.md`, `plan.md`, and `tasks.md`
210+
2. Extract: spec number, key changes, what was delivered
211+
3. Prepend a new entry above existing content:
212+
213+
```markdown
214+
## [Spec NNN] Title — YYYY-MM-DD
215+
216+
One-paragraph summary of what was delivered and why.
217+
218+
- Key change bullet point
219+
- Another significant change
220+
- Tests added: +NN tests (if applicable)
221+
```
222+
223+
Never modify existing CHANGELOG entries.
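
The prepend step can be sketched as a pure string operation. This assumes existing entries begin with a `## ` heading and that any text before the first such heading (for example a `# Changelog` title) is a preamble to keep on top; `prepend_entry` is a hypothetical helper.

```python
def prepend_entry(changelog: str, entry: str) -> str:
    """Insert entry above the first existing '## ' section."""
    lines = changelog.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if line.startswith("## "):
            break
    else:
        index = len(lines)  # no existing entries yet
    head = "".join(lines[:index])
    tail = "".join(lines[index:])  # existing entries, left byte-for-byte intact
    if head and not head.endswith("\n"):
        head += "\n"
    return head + entry.rstrip("\n") + "\n\n" + tail
```

Because the existing entries are carried over unchanged as `tail`, the function cannot modify earlier history; it can only add above it.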
224+
225+
#### 4b. Update .documentation/Guide.md
226+
227+
If completed specs introduced:
228+
- New patterns or conventions → update the relevant section
229+
- New directories or files in `.documentation/` → update the directory map
230+
- Deprecated old patterns → note the deprecation
231+
232+
The guide describes the **current state only** — no historical content.
233+
234+
#### 4c. Update .github/copilot-instructions.md (if it exists)
235+
236+
If completed specs introduced:
237+
- New import paths, module names, or architectural patterns
238+
- New document schemas or field names
239+
- Deprecations of old patterns
240+
241+
---
242+
243+
### Phase 5: Clean Code Comments
244+
245+
For each spec/task reference identified in the pre-scan:
246+
247+
1. Read 10 lines of surrounding context to understand the code's purpose
248+
2. Rewrite the comment as a self-contained behavioral description
249+
3. Apply the edit
250+
251+
**Rules**:
252+
- Never remove a comment without understanding its purpose first
253+
- If the comment accurately describes behavior but just has a spec prefix, strip only the prefix
254+
- If the comment is a pure task marker with no behavioral value, remove it entirely
255+
- If uncertain about the code's behavior, leave the comment unchanged and note it in the harvest report
256+
257+
---
258+
259+
### Phase 6: Archive Files
260+
261+
1. Determine today's archive folder: `.archive/YYYY-MM-DD/`
262+
2. Create it if it does not exist
263+
3. Mirror the source directory structure inside the archive:
264+
- `.documentation/specs/NNN-name/` → `.archive/YYYY-MM-DD/.documentation/specs/NNN-name/`
265+
- `.documentation/drafts/foo.md` → `.archive/YYYY-MM-DD/.documentation/drafts/foo.md`
266+
4. Move (never copy or delete) each approved file
267+
5. Remove parent directories left behind only if they are now empty
268+
6. After moving, verify no active file references the archived paths
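
Steps 1 through 5 could be sketched in Python as follows. The prompt itself refers to PowerShell's `Move-Item`, so this is an illustration of the mirroring logic rather than the command the agent runs. `archive_file` is a hypothetical helper, and refusing to overwrite an existing archive target is an assumption.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_file(repo_root: Path, source: Path, day: date) -> Path:
    """Move source under .archive/YYYY-MM-DD/, mirroring its relative path."""
    repo_root = repo_root.resolve()
    source = source.resolve()
    relative = source.relative_to(repo_root)
    target = repo_root / ".archive" / day.isoformat() / relative
    if target.exists():
        # Assumption: never overwrite something archived earlier the same day.
        raise FileExistsError(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(target))
    # Remove parent directories only while they are empty; stop at the repo root.
    parent = source.parent
    while parent != repo_root and not any(parent.iterdir()):
        parent.rmdir()
        parent = parent.parent
    return target
```

The protected paths listed below would need to be filtered out before calling a helper like this.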
269+
270+
**Do not move:**
271+
- `/.documentation/memory/constitution.md`
272+
- `/.documentation/scripts/`
273+
- `/.documentation/templates/`
274+
- `/.documentation/Guide.md`
275+
- `CHANGELOG.md`
276+
277+
#### Update .archive/README.md
278+
279+
Create or append to `.archive/README.md`:
280+
281+
```markdown
282+
# Archive
283+
284+
Completed and historical documentation. **Do not reference files from here in prompts, scripts, or active docs.**
285+
286+
## Contents
287+
288+
| Folder | Date | Description |
289+
|--------|------|-------------|
290+
| YYYY-MM-DD/ | YYYY-MM-DD | Harvest run — N specs, N docs archived |
291+
```
292+
293+
---
294+
295+
### Phase 7: Write Harvest Report
296+
297+
Create `.documentation/copilot/harvest-YYYY-MM-DD.md`:
298+
299+
```markdown
300+
# Harvest Report — YYYY-MM-DD
301+
302+
## Summary
303+
- Specs archived: N
304+
- Docs archived: N
305+
- Code comments rewritten: N
306+
- CHANGELOG entries added: N
307+
- Guide.md updated: yes/no
308+
309+
## Specs Archived
310+
| Spec | CHANGELOG Entry Added |
311+
|------|-----------------------|
312+
| NNN-name | ✅ / ➕ Added now |
313+
314+
## Docs Archived
315+
| File | Category | Destination |
316+
|------|----------|-------------|
317+
| ... | ... | .archive/YYYY-MM-DD/... |
318+
319+
## Code Comments Rewritten
320+
| File | Line | Before | After |
321+
|------|------|--------|-------|
322+
| ... | ... | ... | ... |
323+
324+
## Knowledge Harvested Into
325+
| Document | Changes |
326+
|----------|---------|
327+
| CHANGELOG.md | Added Spec NNN entry |
328+
| .documentation/Guide.md | Updated directory map |
329+
330+
## Still Active (Not Archived)
331+
| Item | Reason |
332+
|------|--------|
333+
| .documentation/specs/NNN-name/ | In-progress |
334+
```
335+
336+
---
337+
338+
## Anti-Patterns (DO NOT)
339+
340+
1. **Archive without harvesting** — Capture knowledge in living docs first
341+
2. **Delete files** — Always use `Move-Item` to `.archive/`; never `Remove-Item`
342+
3. **Modify existing CHANGELOG entries** — Only append new entries
343+
4. **Archive in-progress specs** — Only specs with all tasks complete are eligible
344+
5. **Leave stale code comments** — Spec references must be rewritten or removed
345+
6. **Skip user confirmation** — The harvest plan must be approved before any execution
346+
7. **Update constitution directly** — Constitution changes require the `/speckit.evolve-constitution` process
347+
8. **Read from `.archive/`** — The archive is write-only from an operational perspective
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
---
2+
agent: speckit.harvest
3+
---
