
Commit 824e4d9

fix(evolve-lite): tighten learn skill to only extract high-signal guidelines (#122)
* fix(evolve-lite): tighten learn skill to only extract high-signal guidelines

  Narrow guideline extraction to three categories: shortcuts (reducing wasted steps), error prevention, and user corrections. Add explicit exclusion list, quality gate checklist, and enforce that zero entities is a valid output when no corrections occurred.

  Also restrict entity_io.py to an allowlist of types (guideline, preference) so the LLM can no longer invent types like "observation" that store codebase facts instead of reasoning chain corrections.

* fix(bob): clarify entity count rule in learn skill

  Addresses CodeRabbit review finding: Clarify entity count rule to avoid ambiguity

* fix(evolve-lite): guard against non-string entity type values

  Add isinstance check before the allowlist membership test so non-string or unhashable types fall back to "guideline" instead of raising.

  Addresses CodeRabbit review finding: Handle non-string type values before allowlist check
1 parent 6945ce1 commit 824e4d9

2 files changed

Lines changed: 38 additions & 37 deletions

Lines changed: 34 additions & 34 deletions
@@ -1,57 +1,57 @@
 ---
 name: learn
-description: Analyze the current conversation to extract actionable entities — proactive recommendations derived from errors, failures, and successful patterns.
+description: Analyze the current conversation to extract guidelines that correct reasoning chains — reducing wasted steps, preventing errors, and capturing user preferences.
 ---
 
 # Entity Generator
 
 ## Overview
 
-This skill analyzes the current conversation to extract actionable entities that would help on similar tasks in the future. It **prioritizes errors** — tool failures, exceptions, wrong approaches, retry loops — and transforms them into proactive recommendations that prevent those errors from recurring.
+This skill analyzes the current conversation to extract guidelines that **correct the agent's reasoning chain**. A good guideline is one that, if known beforehand, would have led to a shorter or more correct execution. Only extract guidelines that fall into one of these three categories:
 
-## Workflow
-
-### Step 1: Analyze the Conversation
+1. **Shortcuts** — The agent took unnecessary steps or tried an approach that didn't work before finding the right one. The guideline encodes the direct path so future runs skip the detour.
+2. **Error prevention** — The agent hit an error (tool failure, exception, wrong output) that could be avoided with upfront knowledge. The guideline prevents the error from happening at all.
+3. **User corrections** — The user explicitly corrected, redirected, or stated a preference during the conversation. The guideline captures what the user said so the agent gets it right next time without being told.
 
-Identify from the current conversation:
+**Do NOT extract guidelines that are:**
+- General best practices the agent already knows (e.g., "use descriptive variable names")
+- Observations about the codebase that can be derived by reading the code
+- Restatements of what the agent did successfully without any detour or correction
+- Vague advice that wouldn't change the agent's behavior on a concrete task
 
-- **Task/Request**: What was the user asking for?
-- **What Worked**: Which approaches succeeded?
-- **What Failed**: Which approaches didn't work and why?
-- **Errors Encountered**: Tool failures, exceptions, permission errors, retry loops, dead ends, wrong initial approaches
+## Workflow
 
-### Step 2: Identify Errors and Root Causes
+### Step 1: Analyze the Conversation
 
-Scan for these error signals:
+Review the conversation and identify:
 
-1. **Tool/command failures**: Non-zero exit codes, error messages, exceptions
-2. **Permission/access errors**: "Permission denied", "not found", sandbox restrictions
-3. **Wrong initial approach**: First attempt abandoned for a different strategy
-4. **Retry loops**: Same action attempted multiple times before succeeding
-5. **Missing prerequisites**: Dependencies, packages, configs discovered mid-task
-6. **Silent failures**: Actions that appeared to succeed but produced wrong results
+- **Wasted steps**: Where did the agent go down a path that turned out to be unnecessary? What would have been the direct route?
+- **Errors hit**: What errors occurred? What knowledge would have prevented them?
+- **User corrections**: Where did the user say "no", "not that", "actually", "I want", or otherwise redirect the agent?
 
-If no errors are found, extract entities from successful patterns instead.
+If none of these occurred, **output zero entities**. Not every conversation produces guidelines.
 
-### Step 3: Extract Entities
+### Step 2: Extract Entities
 
-Extract 3-5 proactive entities. **Prioritize entities derived from errors.**
+For each identified shortcut, error, or user correction, create one entity — up to 5 entities; output 0 when none qualify. If more candidates exist, keep only the highest-impact ones.
 
 Principles:
 
-1. **Reframe failures as proactive recommendations** — recommend what worked, not what to avoid
-   - Bad: "If exiftool fails, use PIL instead"
+1. **State what to do, not what to avoid** — frame as proactive recommendations
+   - Bad: "Don't use exiftool in sandboxes"
    - Good: "In sandboxed environments, use Python libraries (PIL/Pillow) for image metadata extraction"
 
 2. **Triggers should be situational context, not failure conditions**
    - Bad trigger: "When apt-get fails"
    - Good trigger: "When working in containerized/sandboxed environments"
 
-3. **For retry loops, recommend the final working approach directly** — eliminate trial-and-error by encoding the answer
+3. **For shortcuts, recommend the final working approach directly** — eliminate trial-and-error by encoding the answer
+
+4. **For user corrections, use the user's own words** — preserve the specific preference rather than generalizing it
 
-### Step 4: Save Entities
+### Step 3: Save Entities
 
-Output entities as JSON and pipe to the save script:
+Output entities as JSON and pipe to the save script. The `type` field must always be `"guideline"` — no other types are accepted.
 
 ```bash
 echo '{
@@ -72,12 +72,12 @@ The script will:
 - Deduplicate against existing entities
 - Display confirmation with the total count
 
-## Best Practices
+## Quality Gate
+
+Before saving, review each entity against this checklist:
+
+- [ ] Does it fall into one of the three categories (shortcut, error prevention, user correction)?
+- [ ] Would knowing this guideline beforehand have changed the agent's behavior in a concrete way?
+- [ ] Is it specific enough that another agent could act on it without further context?
 
-1. **Prioritize error-derived entities**: Errors are the highest-signal source of learnings
-2. **One error, one entity**: Each distinct error should produce one prevention entity
-3. **Be specific and actionable**: State what to do, not what to avoid
-4. **Include rationale**: Explain why the approach works
-5. **Use situational triggers**: Context-based, not failure-based
-6. **Limit to 3-5 entities**: Focus on the most impactful learnings
-7. **When more than 5 errors exist**: Merge errors with the same root cause, rank by severity > frequency > user impact, then keep the top 3-5
+If any answer is no, drop the entity. **Zero entities is a valid output.**
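
The entity count rule and the quality gate in the revised skill can be read as a filter followed by a cap. The sketch below is only an illustration of that reading, not code from this commit: in the skill itself the model makes these judgments while reviewing the conversation. The `Candidate` class, its fields, the `impact` score, and both function names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical labels for the three categories named in the skill text.
CATEGORIES = {"shortcut", "error_prevention", "user_correction"}
MAX_ENTITIES = 5  # "up to 5 entities; output 0 when none qualify"


@dataclass
class Candidate:
    """A hypothetical candidate guideline with the checklist answers attached."""
    content: str
    category: str
    changes_behavior: bool  # would knowing it beforehand have changed behavior?
    self_contained: bool    # could another agent act on it without more context?
    impact: int             # assumed ranking score; higher means higher impact


def passes_quality_gate(candidate: Candidate) -> bool:
    """All three checklist answers must be yes, otherwise the entity is dropped."""
    return (
        candidate.category in CATEGORIES
        and candidate.changes_behavior
        and candidate.self_contained
    )


def select_entities(candidates: list[Candidate]) -> list[Candidate]:
    """Keep candidates that pass the gate, highest impact first, capped at five."""
    kept = [c for c in candidates if passes_quality_gate(c)]
    kept.sort(key=lambda c: c.impact, reverse=True)
    return kept[:MAX_ENTITIES]
```

Under this reading an empty result is not an error: a conversation with no detours, errors, or user corrections simply yields an empty list.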

platform-integrations/claude/plugins/evolve-lite/lib/entity_io.py

Lines changed: 4 additions & 3 deletions
@@ -262,9 +262,10 @@ def write_entity_file(directory, entity):
     Returns:
         Path to the written file.
     """
-    entity_type = entity.get("type", "general")
-    if not re.fullmatch(r"[a-z0-9][a-z0-9_-]*", entity_type):
-        entity_type = "general"
+    _ALLOWED_TYPES = {"guideline", "preference"}
+    entity_type = entity.get("type", "guideline")
+    if not isinstance(entity_type, str) or entity_type not in _ALLOWED_TYPES:
+        entity_type = "guideline"
     entity["type"] = entity_type
     type_dir = Path(directory) / entity_type
     type_dir.mkdir(parents=True, exist_ok=True)
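
To see what the `entity_io.py` change does in isolation, here is a minimal standalone sketch of the two checks. The function names `old_type_check` and `normalize_entity_type` are hypothetical; in the commit the logic sits inline in `write_entity_file`. The regex, the allowlist values, and both fallback strings are copied from the diff.

```python
import re

ALLOWED_TYPES = frozenset({"guideline", "preference"})
DEFAULT_TYPE = "guideline"


def old_type_check(entity: dict) -> str:
    """Pre-commit behavior: any slug-like string is accepted as a type."""
    entity_type = entity.get("type", "general")
    if not re.fullmatch(r"[a-z0-9][a-z0-9_-]*", entity_type):
        entity_type = "general"
    return entity_type


def normalize_entity_type(entity: dict) -> str:
    """Post-commit behavior: only allowlisted strings survive.

    The isinstance test runs first, so an unhashable value such as a list
    never reaches the set membership test, which would raise TypeError.
    """
    entity_type = entity.get("type", DEFAULT_TYPE)
    if not isinstance(entity_type, str) or entity_type not in ALLOWED_TYPES:
        return DEFAULT_TYPE
    return entity_type


if __name__ == "__main__":
    for value in ("observation", "preference", ["guideline"], None):
        print(repr(value), "->", normalize_entity_type({"type": value}))
```

The old regex accepted an invented type such as `"observation"` because it only validated the shape of the string. The allowlist instead maps anything outside `guideline` and `preference` back to `"guideline"`.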
