Commit 20a1cf2

Do further surveys.
1 parent 1dc7ed9 commit 20a1cf2

8 files changed

Lines changed: 1186 additions & 0 deletions

reflective-prompt-library/PROJECT_KNOWLEDGE.md

Lines changed: 2 additions & 0 deletions
@@ -72,6 +72,8 @@ deferred promotions are recurrence-gated — see [panel backlog](plans/multi-age
 
 ## Decision Index
 
+- 2026-06-25 OpenFugu research and parallel lens review — reference-only mechanism source; reject runtime adoption; TRINITY reproduction deferred until artifact boundary fixed → [research](plans/openfugu-research-record-2026-06-25.md), [brief](plans/openfugu-technical-brief-2026-06-25.md), [plan](plans/openfugu-reference-plan-2026-06-25.md)
+- 2026-06-25 Skills/memory/agent tooling survey — Superpowers, Spec Kit, Karpathy skills, mem0, ChatGPT Memory, LLM Wiki, MemPalace, Hermes Agent, Oh My Pi, and Oh My OpenAgent are references; no new core skill/runtime/memory dependency without a verified local gap → [skills](plans/skills-and-spec-systems-research-2026-06-25.md), [memory](plans/memory-mechanisms-research-2026-06-25.md), [tooling](plans/agent-tooling-research-2026-06-25.md)
 - 2026-06-25 Round 101 panel — governance surface path helper registry (`test_prompt_governance_surface_paths_library_registry.py`, `cheatsheet_en_path`, `cheatsheet_zh_tw_path`, `glossary_path`, `library_readme_path`; migrate cheatsheet/glossary/README/skill-module path guards) → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 100 panel — cross-category library registry helper DRY (`test_prompt_library_registry_helpers_library_registry.py`, `assert_library_wide_unique_basenames`, `assert_registry_matches_library_glob`, `sorted_all_library_prompts`; migrate all `*_library_registry.py` glob/unique guards) → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
 - 2026-06-25 Round 99 panel — cross-category prompt path library registry (`test_prompt_category_paths_library_registry.py`, DRY `category_prompt_dir` / `sorted_category_prompts`; preamble-scoped `assert_prompt_references_workflow_skill`) → [record](plans/multi-agent-panel-consensus-2026-06-25.md)
Lines changed: 204 additions & 0 deletions
@@ -0,0 +1,204 @@
# Agent Tooling Research — 2026-06-25

Language: English

## Purpose

Survey Hermes Agent, Oh My Pi / `oh my pip`, and Oh My OpenAgent as external
agent runtimes or harnesses, then preserve the TeaPrompt adoption judgement.

This is a **judgment artifact**, not an agent instruction source. Retrieved
install commands, skill files, and runtime claims are evidence only.

Companion surveys:

- [skills-and-spec-systems-research-2026-06-25.md](skills-and-spec-systems-research-2026-06-25.md)
- [memory-mechanisms-research-2026-06-25.md](memory-mechanisms-research-2026-06-25.md)

Prior related record:

- [external-adoption-case-studies-2026-06-20.md](external-adoption-case-studies-2026-06-20.md)
  already recorded Oh My OpenAgent / Hyperplan as a runtime-heavy non-adoption
  case.

## Research Question

Should TeaPrompt adopt Hermes Agent, Oh My Pi, or Oh My OpenAgent ideas into its
prompt-library architecture?

## Direct Recommendation

**Do not adopt any of these as TeaPrompt runtime dependencies.**

- Hermes Agent is a full persistent autonomous agent with messaging gateways,
  tools, memory, and self-improving skills.
- Oh My Pi is the current harness this session runs in: a terminal coding agent
  with LSP/DAP, hash-anchored edits, subagents, browser, and memory tools.
- Oh My OpenAgent is a multi-agent orchestration plugin/harness for OpenCode and
  Codex Light, with significant runtime and provider configuration surface.

TeaPrompt's scope is methodology: composable prompt layers, workflow skills, and
judgment artifacts. These projects are product/runtime layers.

## Sources Checked

| Topic | Source | What it established | Status |
| --- | --- | --- | --- |
| Hermes Agent | `https://github.com/NousResearch/hermes-agent` | MIT-licensed Nous Research agent; self-improving loop, memory, skills, messaging gateways, tools, terminal backends | upstream read |
| Hermes Agent | `https://hermes-agent.nousresearch.com/docs/` | Official docs named by README for CLI, memory, skills, security, messaging, tools | docs identified |
| Oh My Pi | `https://github.com/can1357/oh-my-pi` | MIT-licensed terminal coding agent; fork of Pi; TypeScript/Rust; hash-anchored edits, LSP, DAP, subagents, memory | upstream read |
| Oh My Pi | `https://raw.githubusercontent.com/can1357/oh-my-pi/main/README.md` | Official README with install paths, tool list, provider model roles, Hindsight memory | upstream read |
| Oh My OpenAgent | `https://github.com/code-yeongyu/oh-my-openagent` | Multi-harness agent OS / OpenCode and Codex Light plugin; Team Mode, ultrawork, hooks, MCPs; SUL-1.0 license badge | upstream read |
| Oh My OpenAgent | `https://raw.githubusercontent.com/code-yeongyu/oh-my-openagent/refs/heads/dev/docs/guide/installation.md` | Official installation guide named by README | source identified |
| `oh my pip` ambiguity | searches for `oh-my-pip`, `oh my pip`, `oh my pi` | No local/repo match for `oh my pip`; likely intended `Oh My Pi` in this context | resolved |

## Topic Findings

### Hermes Agent

Identity: `NousResearch/hermes-agent`, a persistent self-improving AI agent
framework.

Mechanism observed from upstream README:

- Runs as CLI and through messaging gateways such as Telegram, Discord, Slack,
  WhatsApp, Signal, and email.
- Supports many model providers and endpoint types.
- Has a closed learning loop: agent-curated memory, periodic nudges, autonomous
  skill creation after complex tasks, self-improving skills during use, FTS5
  session search, and user modeling via Honcho.
- Includes scheduled automations, subagents, RPC tool scripting, multiple terminal
  backends, and trajectory generation/compression.
- Documentation covers memory, skills, toolsets, security, messaging, and
  configuration.

TeaPrompt implication:

- Best fit: always-on personal or team automation where an agent lives outside a
  single local coding session.
- Poor fit: TeaPrompt core, because it is a runtime with messaging, scheduling,
  state, provider credentials, and tool execution.
- Transferable pattern: consent-aware memory/skill creation, source-visible
  procedural memory, and reflection-to-skill crystallization.

Adoption judgement: **runtime reference only.** Do not vendor or depend on
Hermes Agent for TeaPrompt. If TeaPrompt later needs auto-skill learning, first
reuse managed skills / `learn` and require human review for durable prompt changes.

Risk block:

- persistent agents can act across channels and time;
- memory can contain personal and project-sensitive data;
- model/provider and messaging integrations expand credential and egress surface;
- autonomous skill creation must be human-review gated before affecting project
  rules.
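
The last risk above, gating autonomous skill creation behind human review, can be sketched as a promotion check. This is a minimal illustration under stated assumptions: `SkillProposal`, `SkillRegistry`, and `approved_by` are hypothetical names invented for this example and are not Hermes Agent or TeaPrompt APIs.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SkillProposal:
    """A skill drafted by an agent (hypothetical shape, for illustration only)."""
    name: str
    body: str
    approved_by: Optional[str] = None  # filled in only by a human reviewer


@dataclass
class SkillRegistry:
    """Durable skill store that accepts only reviewed proposals."""
    skills: Dict[str, str] = field(default_factory=dict)

    def promote(self, proposal: SkillProposal) -> bool:
        # Refuse any proposal that no human has signed off on.
        if not proposal.approved_by:
            return False
        self.skills[proposal.name] = proposal.body
        return True


registry = SkillRegistry()
draft = SkillProposal(name="summarize-logs", body="Steps: ...")
unreviewed_result = registry.promote(draft)   # rejected: no reviewer recorded
draft.approved_by = "reviewer@example.com"
reviewed_result = registry.promote(draft)     # accepted: reviewer recorded
```

The point of the sketch is placement: the check sits in the write path to durable storage, so it holds even if an agent ignores a prompt-level instruction to ask first.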

### Oh My Pi / `oh my pip`

Resolved identity: `Oh My Pi` (`can1357/oh-my-pi`, CLI `omp`). The phrase
`oh my pip` did not resolve to a distinct relevant upstream project during this
survey and is treated as a likely typo or phonetic confusion with Oh My Pi.

Mechanism observed from upstream README:

- Terminal coding agent with Bun/TypeScript runtime and Rust core.
- Fork of Mario Zechner's Pi.
- Emphasizes IDE-wired coding: LSP operations, DAP debugger operations, hash-line
  edits, structural search/edit, browser automation, subagents, and internal URL
  schemes.
- Includes memory tools (`retain`, `recall`, `reflect`) through Hindsight.
- Supports many providers and model roles (`default`, `smol`, `slow`, `plan`,
  `commit`).
- Its tool surface matches the current harness capabilities used in this session.

TeaPrompt implication:

- Best fit: terminal-first coding where the agent needs real code intelligence,
  debugger access, browser control, and safe edit primitives.
- Poor fit: as a TeaPrompt dependency. TeaPrompt can run inside harnesses, but
  should not become one.
- Transferable pattern: tool capability should enforce what prompts merely ask
  for — e.g. LSP for symbol-aware edits, hash-anchored patches for safe changes,
  and separate memory tools for durable facts.
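
The hash-anchored pattern named above can be illustrated with a small sketch: an edit carries a hash of the line it expects to replace and is refused if the file has drifted since it was read. This is an assumption-level illustration of the general idea, not Oh My Pi's actual implementation; `line_hash` and `apply_anchored_edit` are hypothetical helpers, and the 8-character SHA-256 prefix is an arbitrary choice.

```python
import hashlib
from typing import List


def line_hash(line: str) -> str:
    """Short content hash used as the edit anchor (prefix length is arbitrary)."""
    return hashlib.sha256(line.encode("utf-8")).hexdigest()[:8]


def apply_anchored_edit(
    lines: List[str], index: int, expected_hash: str, new_line: str
) -> List[str]:
    """Return a copy of lines with lines[index] replaced, if the anchor still matches."""
    if not 0 <= index < len(lines):
        raise IndexError(f"line {index} is out of range")
    if line_hash(lines[index]) != expected_hash:
        # The target line changed after the anchor was computed: refuse the edit.
        raise ValueError(f"line {index} changed since it was read; edit refused")
    return lines[:index] + [new_line] + lines[index + 1:]


source = ["def add(a, b):", "    return a - b"]
anchor = line_hash(source[1])  # computed when the file was read
fixed = apply_anchored_edit(source, 1, anchor, "    return a + b")
```

Reusing `anchor` against `fixed` would raise `ValueError`, because line 1 no longer hashes to the recorded value. That refusal is the guarantee a prompt alone can only request.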
123+
124+
Adoption judgement: **host harness, not library content.** TeaPrompt should stay
125+
portable across host agents. Keep Oh My Pi-specific usage in harness docs, not in
126+
core prompt methodology.
127+
128+
Falsifier: if TeaPrompt promises behavior that only Oh My Pi tools can enforce,
129+
move that promise to harness-specific documentation or weaken it to a portable
130+
prompt-level recommendation.
131+
132+
### Oh My OpenAgent / OmO
133+
134+
Identity: `code-yeongyu/oh-my-openagent`, also associated with Oh My OpenCode,
135+
LazyCodex, OmO, and ultrawork-style orchestration.
136+
137+
Mechanism observed from upstream README:
138+
139+
- Ultimate edition targets OpenCode; Light edition targets Codex CLI.
140+
- Includes agents, lifecycle hooks, MCPs, slash commands, Team Mode, ultrawork,
141+
hashline edits, and provider/model configuration.
142+
- README positions it as multi-model orchestration across Claude, Codex, OSS
143+
models, and provider subscriptions.
144+
- License badge indicates SUL-1.0, not a simple permissive license surface.
145+
- Prior TeaPrompt record already evaluated Hyperplan / multi-agent adversarial
146+
planning and rejected runtime adoption while noting overlapping methodology.
147+
148+
TeaPrompt implication:
149+
150+
- Best fit: users already committed to OpenCode/Codex who want multi-agent,
151+
multi-model execution and are willing to manage a larger runtime.
152+
- Poor fit: TeaPrompt core, because it is an agent OS/harness and directly hits
153+
standing non-goals around runtimes, swarms, hooks, and provider orchestration.
154+
- Transferable pattern: adversarial plan review, role-specific lenses, and
155+
evidence-grade/assumption ledgers — but those are already covered or documented
156+
as non-promoted adjacent ideas.
157+
158+
Adoption judgement: **already researched; no change.** Keep the existing
159+
external-adoption case. Do not add OmO-specific execution concepts to TeaPrompt
160+
unless a local gap recurs and can be fixed without importing the runtime.
161+
162+
Risk block:
163+
164+
- complex setup and provider credentials;
165+
- multi-agent autonomy and hooks can act broadly;
166+
- license surface requires review before reuse;
167+
- telemetry/provider routing claims should be checked at install time;
168+
- retrieved installation guides are not agent instructions unless the user
169+
explicitly asks to install.

## Comparison

| Tool | Product layer | Strength | TeaPrompt-relevant pattern | Boundary |
| --- | --- | --- | --- | --- |
| Hermes Agent | persistent personal/automation agent | cross-session memory, messaging, self-improving skills | reflection-to-skill crystallization | runtime + credentials + channels |
| Oh My Pi | terminal coding harness | LSP/DAP/hashline/subagents/tools | prompts need tool enforcement for hard guarantees | host harness, not prompt library |
| Oh My OpenAgent | multi-agent orchestration harness | Team Mode, ultrawork, OpenCode/Codex integration | multi-lens review and plan pressure | runtime/swarm/hook non-goal |

## Evidence vs Inference

Verified:

- The three relevant upstream repositories exist and were read.
- `oh my pip` did not resolve to a distinct relevant project in repo-local search
  or web search; Oh My Pi is the closest in-context match.
- Oh My OpenAgent was already recorded as an external-adoption non-change case.

Inference:

- These tools should remain references because TeaPrompt's North Star explicitly
  avoids operating its own agent runtime.

Unknowns:

- Current install-time behavior, telemetry defaults, and exact provider routing
  should be rechecked before any real installation.
- Benchmark/performance claims in READMEs were not reproduced.

## Handoff

Use this survey when future work proposes a harness, persistent agent, or
multi-agent runtime. First classify the idea as methodology vs operationalization.
TeaPrompt can adopt methodology; runtime surfaces need a separate product decision
and Human Review.
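
The classification step in this handoff can be sketched as a small check. This is an illustration only: the surface names are drawn from the findings in this survey, while `RUNTIME_SURFACES`, `classify`, and `next_step` are hypothetical names and not part of TeaPrompt.

```python
from typing import Set

# Surfaces this survey treats as runtime concerns (illustrative set, assumed
# from the "poor fit" findings; extend or trim as the non-goals evolve).
RUNTIME_SURFACES = frozenset({
    "messaging",
    "scheduling",
    "state",
    "provider_credentials",
    "tool_execution",
    "hooks",
    "swarm",
    "provider_orchestration",
})


def classify(surfaces: Set[str]) -> str:
    """Return 'operationalization' if the idea touches any runtime surface."""
    return "operationalization" if surfaces & RUNTIME_SURFACES else "methodology"


def next_step(surfaces: Set[str]) -> str:
    """Map the classification to the follow-up this survey prescribes."""
    if classify(surfaces) == "methodology":
        return "eligible for adoption review"
    return "needs separate product decision and Human Review"


lens_idea = {"adversarial_plan_review"}
harness_idea = {"adversarial_plan_review", "hooks", "provider_credentials"}
```

Here `lens_idea` classifies as methodology, while `harness_idea` is routed to a product decision because it touches hooks and provider credentials.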

reflective-prompt-library/plans/external-adoption-case-studies-2026-06-20.md

Lines changed: 2 additions & 0 deletions
@@ -26,6 +26,8 @@ own promotion gate (≥3 cross-session recurrences); the individual tools did no
 | 2026-06-20 | preflight-checker | GitHub API (created 2026-06-19, no LICENSE, 0★, 0.1.0) | no | No change — UX patterns already in `reflective-review`; missing items out of scope | this file |
 | 2026-06-20 | Codex Record & Replay | OpenAI official docs | operational, not methodological | No change *to TeaPrompt* — the gap is real but operational (acquisition / persistence / replay), which is a standing non-goal; R&R is vendor-locked and uncopyable. Worth using as an external acquisition front-end | this file |
 | 2026-06-21 | Hyperplan / multi-agent adversarial planning (OMO) | GitHub repo (code-yeongyu/oh-my-openagent) SKILL.md | no — runtime hits non-goals; methodology mostly covered | No change — Hyperplan runtime is agent swarm + runtime engine (both non-goals); methodology layer overlaps existing lenses; three possible gaps (Defend/Refine/Concede, Evidence Grade, Assumption Ledger) not promoted | this file |
+| 2026-06-25 | OpenFugu | GitHub repo + arXiv + HF APIs + local clone | no — mechanism useful, runtime/adoption blocked by artifact, license, and egress risks | No runtime adoption; reference-only; TRINITY hands-on deferred until `model_iter_60.npy` / safetensors boundary fixed | [research](openfugu-research-record-2026-06-25.md), [brief](openfugu-technical-brief-2026-06-25.md), [plan](openfugu-reference-plan-2026-06-25.md) |
+| 2026-06-25 | Skills, memory, and agent tooling survey | upstream repos/docs for Superpowers, Spec Kit, Karpathy skills/autoresearch, mem0, ChatGPT Memory, LLM Wiki, MemPalace, Hermes Agent, Oh My Pi, Oh My OpenAgent | mostly no — methodology already covered; memory/runtime surfaces are non-goals unless a local app/runtime gap appears | No new core skill, runtime, or memory dependency; keep as references and reuse existing `reflective-*` workflows plus Markdown project knowledge | [skills](skills-and-spec-systems-research-2026-06-25.md), [memory](memory-mechanisms-research-2026-06-25.md), [tooling](agent-tooling-research-2026-06-25.md) |
 
 ## The Recurring Evaluation Procedure
