Commit d373e7e

jayaramkr and JAYARAM RADHAKRISHNAN authored
feat: improve tip generation prompt with richer guidance (#124)
* feat: improve tip generation prompt with richer guidance

  Restructures the prompt to produce more specific, actionable tips by adding explicit categories (strategy/recovery/optimization), step instructions, trigger conditions, and domain-specific pattern hints (API discovery, pagination, auth, error handling).

* fix: add optional implementation_steps to Tip model and prompt

  Aligns the Tip schema with the generate_tips prompt which already instructed the LLM to produce implementation steps. Field is optional (default empty list) for backward compatibility with existing stored entities and consumers.

* feat: propagate implementation_steps through tip storage and clustering

  Store implementation_steps in entity metadata across all three storage paths (phoenix sync, MCP save_trajectory, consolidation). Pass it through combine_cluster so the LLM sees prior steps when consolidating, and update combine_tips.jinja2 to render and emit the field.

* fix: normalize implementation_steps to list[str] in combine_cluster

  Handles legacy entities where implementation_steps may be None, a bare string, or a non-list type stored in metadata.

* fix: clarify task status context in tip generation prompt

  Replace placeholder 'Status: UNKNOWN' with an accurate description of the evaluation context: no ground truth or user feedback, only the agent's self-evaluation in the trajectory. Soften success/failure conditionals to reflect this uncertainty.

---------

Co-authored-by: JAYARAM RADHAKRISHNAN <jayaramkr@us.ibm.com>
1 parent a6b7b33 commit d373e7e

7 files changed

Lines changed: 44 additions & 14 deletions

evolve/frontend/client/evolve_client.py

Lines changed: 1 addition & 0 deletions
@@ -143,6 +143,7 @@ def consolidate_tips(self, namespace_id: str, threshold: float | None = None) ->
                     "rationale": tip.rationale,
                     "category": tip.category,
                     "trigger": tip.trigger,
+                    "implementation_steps": tip.implementation_steps,
                 },
             )
             for tip in consolidated_tips

evolve/frontend/mcp/mcp_server.py

Lines changed: 1 addition & 0 deletions
@@ -215,6 +215,7 @@ def save_trajectory(trajectory_data: str, task_id: str | None = None) -> list[Re
                 "category": tip.category,
                 "rationale": tip.rationale,
                 "trigger": tip.trigger,
+                "implementation_steps": tip.implementation_steps,
                 "task_description": result.task_description,
                 "source_task_id": task_id,
                 "creation_mode": "auto-mcp",

evolve/llm/tips/clustering.py

Lines changed: 10 additions & 0 deletions
@@ -157,12 +157,22 @@ def combine_cluster(entities: list[RecordedEntity]) -> list[Tip]:
         dict.fromkeys((e.metadata or {}).get("task_description", "") for e in entities if (e.metadata or {}).get("task_description"))
     )
 
+    def _normalize_steps(raw: object) -> list[str]:
+        if raw is None or raw == []:
+            return []
+        if isinstance(raw, str):
+            return [raw]
+        if isinstance(raw, list):
+            return [str(x) for x in raw]
+        return [str(raw)]
+
     tips = [
         {
             "content": str(e.content),
             "rationale": (e.metadata or {}).get("rationale", ""),
             "category": (e.metadata or {}).get("category", "strategy"),
             "trigger": (e.metadata or {}).get("trigger", ""),
+            "implementation_steps": _normalize_steps((e.metadata or {}).get("implementation_steps")),
         }
         for e in entities
     ]

evolve/llm/tips/prompts/combine_tips.jinja2

Lines changed: 3 additions & 1 deletion
@@ -13,6 +13,7 @@ These guidelines came from tasks like:
 - **Rationale:** {{ tip.rationale }}
 - **Category:** {{ tip.category }}
 - **Trigger:** {{ tip.trigger }}
+{% if tip.implementation_steps %}- **Implementation Steps:** {{ tip.implementation_steps | join('; ') }}{% endif %}
 
 {% endfor %}
 
@@ -35,7 +36,8 @@ Combine the above guidelines into a smaller set of HIGH-QUALITY, CONSOLIDATED, N
       "content": "Clear, actionable tip",
       "rationale": "Why this tip helps",
       "category": "strategy|recovery|optimization",
-      "trigger": "When to apply this tip"
+      "trigger": "When to apply this tip",
+      "implementation_steps": ["step 1", "step 2"]
     }
   ]
 }
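
The conditional line added to `combine_tips.jinja2` emits the steps only when the list is non-empty and joins them with `; `. The sketch below mirrors that behavior in plain Python so it runs without Jinja2 installed; `render_steps_line` is a hypothetical helper for illustration, not part of the repository, and the sample steps are made up.

```python
def render_steps_line(steps: list[str]) -> str:
    """Plain-Python stand-in for the template's conditional line (hypothetical)."""
    if not steps:
        # Mirrors {% if tip.implementation_steps %}: an empty list renders nothing.
        return ""
    # Mirrors {{ tip.implementation_steps | join('; ') }}.
    return "- **Implementation Steps:** " + "; ".join(steps)


with_steps = render_steps_line(["List available endpoints", "Follow the next-page cursor"])
without_steps = render_steps_line([])
```

Tips stored before this commit carry no steps, so under this reading they render exactly as they did previously.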
Lines changed: 27 additions & 13 deletions
@@ -1,23 +1,33 @@
-You are analyzing an AI agent's execution trajectory to extract actionable tips.
+Extract actionable, relevant tips from this trajectory that would help an AI agent perform similar tasks better in the future.
 
 # Task Information
 **Task:** {{task_instruction}}
-**Status:** UNKNOWN
+**Task Status:** There is no evaluation of the task's trajectory or output against any ground truth. There is also no user feedback to the AI agent. But the trajectory may contain the agent's self-evaluation of whether the task succeeded or failed.
 **Steps Taken:** {{num_steps}}
 
 # Agent Trajectory
 {{trajectory_summary}}
 
-# Your Task
-Extract 3-5 actionable tips from this trajectory that would help AI agents perform similar tasks better.
+**IMPORTANT TO REMEMBER:**
+1. Only generate tips if they are truly relevant and actionable
+2. Tips should be specific to patterns observed in this trajectory
+3. Include both positive patterns (what worked) and negative patterns (what to avoid)
+4. Each tip should have:
+   - A clear, concise description (content)
+   - The purpose/benefit of following it
+   - The category: "strategy", "recovery", or "optimization"
+   - Specific steps to implement the tip
+   - A trigger condition (when to apply this tip)
 
-**Guidelines:**
-1. Focus on patterns that worked or mistakes that were made
-2. Be specific to what you observed in this trajectory
-3. Each tip should have:
-   - Clear description of what to do (or avoid)
-   - Why it matters
-   - When to apply it
+5. If the task seems to have succeeded, focus on the successful strategies used
+6. If the task seems to have failed, focus on what went wrong and how to prevent/recover from it
+7. Do not generate generic tips - be specific to this task execution
+8. Look for patterns in how the agent:
+   - Discovered and used APIs
+   - Handled authentication and credentials
+   - Iterated through results (pagination)
+   - Structured its approach to the problem
+   - Handled errors or unexpected responses
 
 {% if not constrained_decoding_supported %}
 **Output Format (JSON):**
@@ -28,11 +38,15 @@ Extract 3-5 actionable tips from this trajectory that would help AI agents perfo
       "content": "Clear, actionable tip",
       "rationale": "Why this tip helps",
       "category": "strategy|recovery|optimization",
-      "trigger": "When to apply this tip"
+      "trigger": "When to apply this tip",
+      "implementation_steps": ["step 1", "step 2"]
     }
   ]
 }
 ```
 
 Generate tips now. Return ONLY the JSON, no other text.
-{% endif %}
+{% endif %}
+
+
+

evolve/schema/tips.py

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ class Tip(BaseModel):
     rationale: str = Field(description="Why this tip helps")
     category: Literal["strategy", "recovery", "optimization"]
     trigger: str = Field(description="When to apply this tip")
+    implementation_steps: list[str] = Field(default_factory=list, description="Specific steps to implement this tip")
 
 
 class TipGenerationResponse(BaseModel):
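
The backward-compatibility claim rests on the new field having a default, so payloads written before this commit still load. Below is a minimal sketch of that behavior using a stdlib dataclass as a stand-in for the Pydantic model; `TipSketch` and the sample payload are hypothetical, and the `content` field is assumed from the prompt's JSON format because it is not visible in this hunk.

```python
from dataclasses import dataclass, field


@dataclass
class TipSketch:
    """Stdlib stand-in for the Pydantic Tip model (hypothetical name)."""
    content: str  # assumed field, taken from the prompt's JSON format
    rationale: str
    category: str
    trigger: str
    # A fresh empty list per instance, analogous to Field(default_factory=list).
    implementation_steps: list[str] = field(default_factory=list)


# Hypothetical payload written before this commit: no implementation_steps key.
legacy = {
    "content": "Check pagination before aggregating results",
    "rationale": "Avoids truncated result sets",
    "category": "recovery",
    "trigger": "When an API response includes a next-page token",
}
old_tip = TipSketch(**legacy)
new_tip = TipSketch(**legacy, implementation_steps=["Read the token", "Request the next page"])
```

Using a factory rather than a shared literal default means each instance gets its own list, so appending steps to one tip cannot leak into another.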

evolve/sync/phoenix_sync.py

Lines changed: 1 addition & 0 deletions
@@ -480,6 +480,7 @@ def _process_trajectory(self, trajectory: dict) -> int:
                     "category": tip.category,
                     "rationale": tip.rationale,
                     "trigger": tip.trigger,
+                    "implementation_steps": tip.implementation_steps,
                     "source_task_id": trajectory["trace_id"],
                     "source_span_id": trajectory["span_id"],
                     "task_description": result.task_description,
