Commit c6c76a0

feat(build): bring codex/bob SKILL.md content under build management
Migrates codex's six SKILL.md files and bob's seven SKILL.md files into plugin-source/ as Pattern B per-platform overlays. The overlay files live next to each skill's shared .j2 template:

    plugin-source/skills/<skill>/SKILL.md.j2     (claude+claw-code)
    plugin-source/skills/<skill>/SKILL.codex.md  (codex only)
    plugin-source/skills/<skill>/SKILL.bob.md    (bob only)

Manifest entries declare the per-platform target with platforms = ["codex"] or platforms = ["bob"]. Bob's targets use the post-rename evolve-lite-<skill>/ folder names so the renderer emits to the right on-disk locations.

This is mechanical scope-only work: the SKILL.md content is moved verbatim, no prose unification, no behavior change. Render is byte-identical to the previously committed copies, drift gate stays green, no test impact.

The benefit is operational: every file under platform-integrations/<platform>/ that ships is now sourced from plugin-source/. Editors and agents have a single, unambiguous "edit here" location for any plugin content. Before this commit, codex/bob SKILL.md were the only files outside build management, which created a drift risk and an editor footgun.

What is still NOT migrated (deliberately):
- README.md files at plugin roots
- platform-specific manifests (.claude-plugin/, .codex-plugin/)
- bob's commands/ directory
- bob's custom_modes.yaml

These are infrastructure files (per-platform manifest, bob mode definition, READMEs) rather than skill content. They could be migrated in a follow-up if there's value, but they have no unification opportunity and the manifest noise isn't worth it without one.

Refs #219
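The drift gate mentioned in the commit message is not part of this diff. As a rough sketch of the idea only, assuming a hypothetical `find_drift` helper and assuming the rendered output is available as a mapping from target path to bytes, a byte-identical check could look like this in Python:

```python
from pathlib import Path
from typing import Dict, List


def find_drift(rendered: Dict[str, bytes], committed_root: Path) -> List[str]:
    """Return the target paths whose committed bytes differ from the render.

    `rendered` maps a target path (relative to the platform root) to the
    bytes the renderer produced. Both names are illustrative assumptions,
    not the repository's actual API.
    """
    drifted = []
    for target, expected in sorted(rendered.items()):
        committed = committed_root / target
        # A missing file counts as drift, as does any byte-level difference.
        if not committed.is_file() or committed.read_bytes() != expected:
            drifted.append(target)
    return drifted
```

An empty result corresponds to the "drift gate stays green" claim: every rendered file matches its committed copy byte for byte.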
1 parent 354f035 commit c6c76a0

14 files changed

Lines changed: 1232 additions & 0 deletions

plugin-source/MANIFEST.toml

Lines changed: 71 additions & 0 deletions
@@ -149,3 +149,74 @@ platforms = ["claude"]
 source = "skills/save-trajectory/scripts/on_stop.py"
 target = "skills/save-trajectory/scripts/on_stop.py"
 platforms = ["claude"]
+
+# Codex SKILL.md overlays — codex's prose is tuned for that audience LLM and
+# diverges enough from claude/claw-code that a Pattern B per-platform overlay
+# is cleaner than a heavily-conditional shared .j2.
+[[files]]
+source = "skills/learn/SKILL.codex.md"
+target = "skills/learn/SKILL.md"
+platforms = ["codex"]
+
+[[files]]
+source = "skills/publish/SKILL.codex.md"
+target = "skills/publish/SKILL.md"
+platforms = ["codex"]
+
+[[files]]
+source = "skills/recall/SKILL.codex.md"
+target = "skills/recall/SKILL.md"
+platforms = ["codex"]
+
+[[files]]
+source = "skills/subscribe/SKILL.codex.md"
+target = "skills/subscribe/SKILL.md"
+platforms = ["codex"]
+
+[[files]]
+source = "skills/sync/SKILL.codex.md"
+target = "skills/sync/SKILL.md"
+platforms = ["codex"]
+
+[[files]]
+source = "skills/unsubscribe/SKILL.codex.md"
+target = "skills/unsubscribe/SKILL.md"
+platforms = ["codex"]
+
+# Bob SKILL.md overlays — same per-platform overlay pattern. Bob's on-disk
+# skill folder names take the evolve-lite- prefix (post commit 4 rename) so
+# the targets reflect that.
+[[files]]
+source = "skills/learn/SKILL.bob.md"
+target = "skills/evolve-lite-learn/SKILL.md"
+platforms = ["bob"]
+
+[[files]]
+source = "skills/publish/SKILL.bob.md"
+target = "skills/evolve-lite-publish/SKILL.md"
+platforms = ["bob"]
+
+[[files]]
+source = "skills/recall/SKILL.bob.md"
+target = "skills/evolve-lite-recall/SKILL.md"
+platforms = ["bob"]
+
+[[files]]
+source = "skills/subscribe/SKILL.bob.md"
+target = "skills/evolve-lite-subscribe/SKILL.md"
+platforms = ["bob"]
+
+[[files]]
+source = "skills/sync/SKILL.bob.md"
+target = "skills/evolve-lite-sync/SKILL.md"
+platforms = ["bob"]
+
+[[files]]
+source = "skills/unsubscribe/SKILL.bob.md"
+target = "skills/evolve-lite-unsubscribe/SKILL.md"
+platforms = ["bob"]
+
+[[files]]
+source = "skills/save-trajectory/SKILL.bob.md"
+target = "skills/evolve-lite-save-trajectory/SKILL.md"
+platforms = ["bob"]
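To illustrate how the `[[files]]` entries above select one overlay per platform, here is a small Python sketch. It assumes the manifest has already been parsed into a list of dictionaries (the shape a TOML parser typically yields for an array of tables). The `targets_for` function and the `FILES` sample are hypothetical and are not the project's renderer.

```python
from typing import Dict, List

# Three entries copied from the hunk above, as parsed [[files]] tables.
FILES = [
    {"source": "skills/learn/SKILL.codex.md",
     "target": "skills/learn/SKILL.md",
     "platforms": ["codex"]},
    {"source": "skills/learn/SKILL.bob.md",
     "target": "skills/evolve-lite-learn/SKILL.md",
     "platforms": ["bob"]},
    {"source": "skills/save-trajectory/SKILL.bob.md",
     "target": "skills/evolve-lite-save-trajectory/SKILL.md",
     "platforms": ["bob"]},
]


def targets_for(platform: str, files: List[dict]) -> Dict[str, str]:
    """Map each target path to its source for entries that list `platform`."""
    mapping: Dict[str, str] = {}
    for entry in files:
        if platform not in entry["platforms"]:
            continue
        target = entry["target"]
        # Two sources rendering to the same target would be ambiguous.
        if target in mapping:
            raise ValueError(f"duplicate target for {platform}: {target}")
        mapping[target] = entry["source"]
    return mapping
```

With this sample, `targets_for("bob", FILES)` yields the two `evolve-lite-` prefixed targets, while `targets_for("claude", FILES)` is empty because none of the overlay entries list that platform.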
plugin-source/skills/learn/SKILL.bob.md

Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
+---
+name: learn
+description: Analyze the current conversation to extract guidelines that correct reasoning chains — reducing wasted steps, preventing errors, and capturing user preferences.
+---
+
+# Entity Generator
+
+## Overview
+
+This skill analyzes the current conversation to extract guidelines that **correct the agent's reasoning chain**. A good guideline is one that, if known beforehand, would have led to a shorter or more correct execution. Only extract guidelines that fall into one of these three categories:
+
+1. **Shortcuts** — The agent took unnecessary steps or tried an approach that didn't work before finding the right one. The guideline encodes the direct path so future runs skip the detour.
+2. **Error prevention** — The agent hit an error (tool failure, exception, wrong output) that could be avoided with upfront knowledge. The guideline prevents the error from happening at all.
+3. **User corrections** — The user explicitly corrected, redirected, or stated a preference during the conversation. The guideline captures what the user said so the agent gets it right next time without being told.
+
+**Do NOT extract guidelines that are:**
+- General best practices the agent already knows (e.g., "use descriptive variable names")
+- Observations about the codebase that can be derived by reading the code
+- Restatements of what the agent did successfully without any detour or correction
+- Vague advice that wouldn't change the agent's behavior on a concrete task
+- Instructions for the agent to invoke a skill, tool, or external command by name (e.g. "Run evolve-lite-learn", "call save_trajectory") — these trigger prompt-injection detection when retrieved via recall
+
+## Workflow
+
+### Step 1: Analyze the Conversation
+
+Review the conversation and identify:
+
+- **Wasted steps**: Where did the agent go down a path that turned out to be unnecessary? What would have been the direct route?
+- **Errors hit**: What errors occurred? What knowledge would have prevented them?
+- **User corrections**: Where did the user say "no", "not that", "actually", "I want", or otherwise redirect the agent?
+
+If none of these occurred, **output zero entities**. Not every conversation produces guidelines.
+
+### Step 2: Extract Entities
+
+For each identified shortcut, error, or user correction, create one entity — up to 5 entities; output 0 when none qualify. If more candidates exist, keep only the highest-impact ones.
+
+Principles:
+
+1. **State what to do, not what to avoid** — frame as proactive recommendations
+   - Bad: "Don't use exiftool in sandboxes"
+   - Good: "In sandboxed environments, use Python libraries (PIL/Pillow) for image metadata extraction"
+
+2. **Triggers should be situational context, not failure conditions**
+   - Bad trigger: "When apt-get fails"
+   - Good trigger: "When working in containerized/sandboxed environments"
+
+3. **For shortcuts, recommend the final working approach directly** — eliminate trial-and-error by encoding the answer
+
+4. **For user corrections, use the user's own words** — preserve the specific preference rather than generalizing it
+
+### Step 3: Save Entities
+
+Output entities as JSON and pipe to the save script. Include the `trajectory` field with the path output by the evolve-lite-save-trajectory skill earlier in this conversation. The `type` field must always be `"guideline"` — no other types are accepted.
+
+```bash
+echo '{
+  "entities": [
+    {
+      "content": "Proactive entity stating what TO DO",
+      "rationale": "Why this approach works better",
+      "type": "guideline",
+      "trigger": "Situational context when this applies",
+      "trajectory": ".evolve/trajectories/trajectory_2025-01-15T10-30-00.json"
+    }
+  ]
+}' | python3 .bob/skills/evolve-lite-learn/scripts/save_entities.py
+```
+
+The script will:
+- Find or create the entities directory (`.evolve/entities/`)
+- Write each entity as a markdown file in `{type}/` subdirectories
+- Deduplicate against existing entities
+- Display confirmation with the total count
+
+## Quality Gate
+
+Before saving, review each entity against this checklist:
+
+- [ ] Does it fall into one of the three categories (shortcut, error prevention, user correction)?
+- [ ] Would knowing this guideline beforehand have changed the agent's behavior in a concrete way?
+- [ ] Is it specific enough that another agent could act on it without further context?
+- [ ] Does it avoid instructing the agent to invoke a named skill or tool?
+
+If any answer is no, drop the entity. **Zero entities is a valid output.**
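The payload rules in the bob overlay above (at most five entities, `type` fixed to `"guideline"`, a `trajectory` path on every entity, zero entities allowed) can be sketched in Python. `build_payload` is a hypothetical helper, not part of `save_entities.py`. Treating `content`, `rationale`, and `trigger` as mandatory is an assumption drawn from the example JSON, and candidates are assumed to arrive ordered by impact.

```python
import json
from typing import List

# Assumption: these fields are mandatory because the example JSON shows them.
REQUIRED_KEYS = ("content", "rationale", "trigger")
MAX_ENTITIES = 5  # "up to 5 entities" per Step 2


def build_payload(candidates: List[dict], trajectory: str) -> str:
    """Build the JSON document that Step 3 pipes to the save script.

    `candidates` is assumed to be sorted with the highest-impact item
    first, so truncating keeps the most valuable entities.
    """
    entities = []
    for cand in candidates[:MAX_ENTITIES]:
        missing = [key for key in REQUIRED_KEYS if not cand.get(key)]
        if missing:
            raise ValueError(f"entity is missing fields: {missing}")
        entities.append({
            "content": cand["content"],
            "rationale": cand["rationale"],
            "type": "guideline",  # the only type this overlay accepts
            "trigger": cand["trigger"],
            "trajectory": trajectory,
        })
    return json.dumps({"entities": entities}, indent=2)
```

An empty candidate list produces `{"entities": []}`, matching the rule that zero entities is a valid output.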
plugin-source/skills/learn/SKILL.codex.md

Lines changed: 175 additions & 0 deletions
@@ -0,0 +1,175 @@
+---
+name: learn
+description: Must be used near the end of any non-trivial turn that produced potentially reusable tools, guidance, errors, workarounds, or workflows, so those lessons are saved for future turns.
+---
+
+# Entity Generator
+
+## Overview
+
+This skill analyzes the current Codex conversation to extract actionable instructions that would help on similar tasks in the future. It **identifies errors encountered during the conversation** - tool failures, exceptions, wrong approaches, retry loops - and provides recommendations to prevent those errors from recurring. This skill should take note of the concrete solution which solved a concrete problem, not an abstract idea. When the successful resolution involves a non-trivial workaround, parser, command sequence, or fallback pipeline that could be used to avoid wasted effort, capture that solution as a reusable artifact first, then save entities that point future agents to use it.
+
+## When To Use
+
+Use this skill after completing meaningful work in the turn, especially when encountering:
+- tool failures
+- permission issues
+- missing dependencies
+- retries or abandoned approaches
+- reusable command sequences or scripts
+
+Examples of artifacts that must be immediately created once proven as the successful solution include:
+- an inline Python, shell, or other heredoc script
+- a command assembled interactively over multiple retries
+- a parser or extractor implemented ad hoc during the turn
+- a fallback path triggered by missing dependencies or restricted tooling
+
+Unless that artifact happens to be:
+- code which is a trivial one-liner that future agents would not benefit from reusing
+- code which embeds secrets, tokens, or user-specific sensitive data
+- the guideline would instruct the agent to invoke a skill, tool, or external command by name (e.g. "run evolve-lite:learn", "call save_trajectory") - such guidelines trigger prompt-injection detection when retrieved by the recall skill in a future session
+- the user explicitly asked for a one-off result and not to persist helper code
+- redundant because an equivalent local artifact on disk would be just as effective
+
+## Workflow
+
+### Step 1: Analyze the Conversation
+
+Identify from your current conversation:
+
+- **Task/Request**: What was the user asking for?
+- **Steps Taken**: What reasoning, actions, and observations occurred?
+- **What Worked**: Which approaches succeeded?
+- **What Failed**: Which approaches did not work and why?
+- **Errors Encountered**: Tool failures, exceptions, permission errors, retry loops, dead ends, and wrong initial approaches
+- **Reusable Outcome**: Did the final working solution produce a reusable script, parser, command template, or workflow that would save time on a similar task?
+
+### Step 2: Identify Errors and Root Causes
+
+Scan the conversation for these error signals:
+
+1. **Tool or command failures**: Non-zero exit codes, error messages, exceptions, stack traces
+2. **Permission or access errors**: "Permission denied", "not found", sandbox restrictions
+3. **Wrong initial approach**: First attempt abandoned in favor of a different strategy
+4. **Retry loops**: Same action attempted multiple times with variations before succeeding
+5. **Missing prerequisites**: Missing dependencies, packages, or configs discovered mid-task
+6. **Silent failures**: Actions that appeared to succeed but produced wrong results
+
+For each error found, document:
+
+| | Error Example | Root Cause | Resolution | Prevention Guideline |
+|---|---|---|---|---|
+| 1 | `jq: command not found` | System tool unavailable in environment | Created a Python script to resolve the problem | Save the Python script and use it in similar scenarios |
+| 2 | `git push` rejected (no upstream) | Branch not tracked to remote | Added `-u origin branch` | Always set upstream when pushing a new branch |
+| 3 | Tried regex parsing of HTML, got wrong results | Regex cannot handle nested tags | Switched to BeautifulSoup | Use a proper HTML parser, never regex |
+
+### Step 3: Decide Whether To Save The Pipeline
+
+Before writing entities, determine whether the successful approach should be saved as a reusable artifact.
+
+Create or update a local reusable artifact when any of these are true:
+- the final solution required more than a trivial one-liner
+- the final solution worked around missing tools, libraries, or permissions
+- the solution is likely to recur on similar tasks
+
+Prefer one of these artifact forms:
+- a small script, saved to a stable path in the workspace or plugin, such as `scripts/`, `tools/`, or another obvious helper location
+- a documented local workflow if code is not appropriate
+
+If you create an artifact, record:
+- its path
+- what it does
+- when future agents should use it first
+
+### Step 4: Extract Entities
+
+If Step 3 produced an artifact, at least one entity must explicitly point to that artifact; it is likely the only entity that needs to be produced.
+Otherwise, extract 3-5 proactive entities. Prioritize entities derived from errors identified in Step 2.
+
+Follow these principles:
+
+1. **Reframe failures as proactive recommendations**
+   - If an approach failed due to permissions, recommend the working permission-aware approach first
+   - If a system tool was unavailable, recommend the saved artifact or fallback workflow first
+   - If an approach hit environment constraints, recommend the constraint-aware approach
+
+2. **Prioritize known working local artifacts over general advice**
+   - If the successful solution produced or reused a concrete local artifact, at least one saved entity must:
+     - name the artifact by path
+     - state exactly when to use it
+     - state that it should be tried before generic tool discovery or fallback exploration
+     - describe the artifact by capability, not just by the original incident
+   - Bad: "Use Python to parse EXIF if exiftool is missing"
+   - Better: "Use `/abs/path/json_get.py` for JSON field extraction when `jq` is unavailable in minimal environments."
+
+3. **Triggers should describe the broad task context that the artifact solves, not the narrow details of the original request.**
+   - Bad trigger: "When jq fails"
+   - Good trigger: "When extracting fields from JSON in constrained shells or stripped-down environments"
+   The trigger should generalize the working solution without becoming vague.
+
+4. **For retry loops, recommend the final working approach as the starting point**
+   - Eliminate trial and error by creating a concrete local artifact out of the successful workflow or script
+
+5. **Prefer entities that save future time**
+   - A pointer to a saved working script is more valuable than a generic reminder if both are available
+
+### Step 5: Output Entities JSON
+
+Output entities in this JSON format:
+
+```json
+{
+  "entities": [
+    {
+      "content": "Proactive entity stating what TO DO",
+      "rationale": "Why this approach works better",
+      "type": "guideline",
+      "trigger": "Situational context when this applies"
+    }
+  ]
+}
+```
+
+Allowed type values:
+- guideline
+- workflow
+- script
+- command-template
+
+### Step 6: Save Entities
+
+After generating the entities JSON, save them using the helper script:
+
+
+#### Method 1: Direct Pipe (Recommended)
+
+```bash
+echo '<your-json-output>' | python3 "$(git rev-parse --show-toplevel 2>/dev/null || pwd)/plugins/evolve-lite/skills/learn/scripts/save_entities.py"
+```
+
+#### Method 2: From File
+
+```bash
+cat entities.json | python3 "$(git rev-parse --show-toplevel 2>/dev/null || pwd)/plugins/evolve-lite/skills/learn/scripts/save_entities.py"
+```
+
+#### Method 3: Interactive
+
+```bash
+python3 "$(git rev-parse --show-toplevel 2>/dev/null || pwd)/plugins/evolve-lite/skills/learn/scripts/save_entities.py"
+```
+
+The script will:
+- Find or create the entities directory at `.evolve/entities/`
+- Write each entity as a markdown file in `{type}/` subdirectories
+- Deduplicate against existing entities
+- Display confirmation with the total count
+
+## Best Practices
+1. Prioritize error-derived entities first.
+2. One distinct error should normally produce one prevention entity.
+3. Keep entities specific and actionable.
+4. Include rationale so the future agent understands why the guidance matters.
+5. Use situational triggers instead of failure-based triggers.
+6. Limit output to the 3-5 most valuable entities.
+7. If more than five distinct errors appear, merge entities with the same root cause or fix, then rank the rest by severity, frequency, user impact, and recency before dropping the weakest ones.
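Best practice 7 in the codex overlay describes a merge-then-rank procedure without giving a mechanism. The Python sketch below is one possible reading, not something the skill prescribes. It assumes each candidate carries a `root_cause` string plus numeric `severity`, `frequency`, `user_impact`, and `recency` scores (all hypothetical field names), merges only on `root_cause`, and ranks lexicographically in the order the sentence lists the criteria.

```python
from typing import Dict, List

RANK_FIELDS = ("severity", "frequency", "user_impact", "recency")


def select_entities(errors: List[dict], limit: int = 5) -> List[dict]:
    """Merge candidates sharing a root cause, then keep the top `limit`."""
    merged: Dict[str, dict] = {}
    for err in errors:
        key = err["root_cause"]
        if key not in merged:
            merged[key] = dict(err)  # copy so the caller's data is untouched
            continue
        kept = merged[key]
        # Occurrences add up; the other scores keep their highest value.
        kept["frequency"] += err["frequency"]
        for field in ("severity", "user_impact", "recency"):
            kept[field] = max(kept[field], err[field])
    ranked = sorted(
        merged.values(),
        key=lambda e: tuple(e[field] for field in RANK_FIELDS),
        reverse=True,  # highest scores first
    )
    return ranked[:limit]
```

A weighted score would be an equally reasonable choice; lexicographic ordering is used here only because it needs no tuning constants.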
