Commit e483efe

Clean up outdated references: qwen3-embedding default, fix model names, update dreaming to built-in, fix one-shot prompt
1 parent 0e9d13d commit e483efe

1 file changed: README.md

Lines changed: 35 additions & 24 deletions
@@ -20,7 +20,7 @@
 7. [Web Search](#part-7-web-search-give-your-agent-eyes-on-the-internet) - Tavily, Brave, Serper, Gemini grounding
 8. [One-Shotting Big Tasks](#part-8-one-shotting-big-tasks-stop-iterating-start-researching) - Research-first methodology
 9. [Vault Memory System](#part-9-vault-memory-system-stop-losing-knowledge-between-sessions) - Structured knowledge graph, MOCs, cross-session continuity
-10. [State-of-the-Art Embeddings](./part10-state-of-the-art-embeddings.md) - Upgrade from nomic to Qwen3-VL, Stark Edition server, Windows gotchas
+10. [State-of-the-Art Embeddings](./part10-state-of-the-art-embeddings.md) - Upgrade from nomic to qwen3-embedding, SOTA quality, Windows gotchas
 11. [Auto-Capture Hook](./part11-auto-capture-hook.md) - Automatic knowledge extraction after every session, no manual memory writes
 12. [Self-Improving System](./part12-self-improving-system.md) - Micro-learning loop that compounds forever, $0/day
 13. [Memory Bridge](./part13-memory-bridge.md) - Give coding agents (Codex/Claude Code) access to your vault knowledge
@@ -178,7 +178,7 @@ ollama ps # Check what's loaded
 ollama stop modelname # Unload idle big models
 ```
 
-The default model for memory search is `nomic-embed-text` (300 MB). If you have a GPU with 8GB+ VRAM, upgrade to Qwen3-Embedding-8B for dramatically better search quality — see [Part 10](./part10-state-of-the-art-embeddings.md). If you have 500+ vault files, also add [LightRAG (Part 18)](./part18-lightrag-graph-rag.md) for knowledge graph retrieval that blows away basic vector search.
+The default model for memory search should be `qwen3-embedding:0.6b` (500 MB, 1024 dims) — same Qwen3 family that holds #1 on MTEB, runs on anything, and blows away nomic on quality. Pull it: `ollama pull qwen3-embedding:0.6b`. If you have a GPU with 8GB+ VRAM, upgrade to Qwen3-Embedding-8B for dramatically better search quality — see [Part 10](./part10-state-of-the-art-embeddings.md). If you have 500+ vault files, also add [LightRAG (Part 18)](./part18-lightrag-graph-rag.md) for knowledge graph retrieval that blows away basic vector search.
 
 ---
 
@@ -385,9 +385,9 @@ _Pointers only. Search before answering._
 
 Every detailed document → vault/. Leave a one-liner pointer in MEMORY.md or memory/.
 
-**Step 5: Set up autoDream consolidation**
+**Step 5: Set up memory consolidation**
 
-Session memory files pile up fast — 200+ files in a month. [Part 16](./part16-autodream-memory-consolidation.md) adds automatic consolidation that extracts durable knowledge from session files into organized topic files, and rebuilds MEMORY.md as a clean index. No scripts needed — just instructions in AGENTS.md.
+Session memory files pile up fast — 200+ files in a month. OpenClaw 2026.4+ has built-in dreaming ([Part 22](#part-22-built-in-dreaming)) — enable it in memory-core config and it auto-consolidates on a daily schedule. For older versions, use the custom autoDream approach in [Part 16](./part16-autodream-memory-consolidation.md).
 
 ### The Golden Rule
 
@@ -548,12 +548,12 @@ This writes a `CONTEXT.md` that the coding agent reads automatically — giving
 | Role | What It Does | Best Model(s) | Why |
 |------|-------------|----------------|-----|
 | **Orchestrator** | Plans, judges, coordinates | Claude Opus 4.6 | Best complex reasoning + tool use |
-| **Sub-agents** | Execute delegated tasks | Gemini 3 Flash, Kimi K2.5, MiMo V2 Pro | Fast, cheap, capable enough |
+| **Sub-agents** | Execute delegated tasks | Kimi K2.5, MiMo V2 Pro, Gemini Flash | Fast, cheap, capable enough |
 | **Infrastructure** | Compaction, fallbacks, bulk work | Cerebras gpt-oss-120b | $0.60/M, 3000 tok/s, reliable |
 | **Knowledge Graph RAG** | Entity extraction, graph queries | Cerebras qwen-3-235b | 1400 tok/s, high accuracy for entity extraction |
 | **Coding (hard)** | Architecture, complex bugs | Claude Opus 4.6 | #1 SWE-bench (1549) — best coding model alive |
 | **Coding (batch)** | Scaffolding, CRUD, refactors | GPT-5.4 Codex | Fast, $0 on subscription, good with Memory Bridge |
-| **Research** | Web search, analysis | Gemini 3 Flash + Tavily | Built-in grounding |
+| **Research** | Web search, analysis | Kimi K2.5 + Tavily | Cheap, fast, good at research synthesis |
 | **Local inference** | $0 forever, private, no rate limits | QwOpus (27B), TerpBot (Nemotron 30B), Nemotron Nano 4B | Ollama on any GPU |
 | **Free tier** | Zero-cost operations | Gemini (all variants), Cerebras free tier, OpenRouter free models | $0 with generous limits |
 
@@ -565,7 +565,7 @@ This writes a `CONTEXT.md` that the coding agent reads automatically — giving
 - 1M context window with prompt caching (up to 90% savings on cached tokens)
 - **Cost:** $5/M input, $25/M output, $0.50/M cached | **Max ($100/mo):** included - best value for heavy use
 
-**Claude Sonnet 4.6** - Solid But Not the Best
+**Claude Sonnet 4** - Solid Workhorse
 - 80% of Opus quality at 20% of the cost. Strong at coding.
 - **Note:** Some power users (including the author) have dropped Sonnet entirely in favor of Opus for orchestration + Cerebras/Gemini for sub-agents. The quality gap matters when your agent makes architectural decisions.
 - **Cost:** $3/M input, $15/M output | **Pro ($20/mo):** included
@@ -635,12 +635,12 @@ Your Claude Pro/Max subscription includes API access. OpenClaw can use it direct
 
 **Budget ($0/month):**
 ```
-Main: Gemini 3.1 Pro (free) | Sub-agents: Gemini 3 Flash | Local: Nemotron Nano 4B
+Main: Gemini 3.1 Pro (free tier) | Sub-agents: Gemini Flash (free tier) | Local: Qwen 3.5 Opus Distilled
 ```
 
 **Balanced (~$20/month - Claude Pro):**
 ```
-Main: Sonnet 4.6 (membership) | Fallback: Gemini 3.1 Pro | Sub-agents: Flash / Kimi K2.5
+Main: Sonnet 4 (membership) | Fallback: Gemini 3.1 Pro | Sub-agents: MiMo V2 Pro / Kimi K2.5
 ```
 
 **Power (~$100/month - Claude Max):**
@@ -877,7 +877,7 @@ A MOC connects related notes with `[[wiki-links]]`. Example:
 
 ## Key Facts
 - 358 memory files in memory/, mostly date-named
-- Vector search (Qwen3-VL or nomic-embed-text, 45ms, $0) finds similar, not connected
+- Vector search (qwen3-embedding or nomic-embed-text, ~45ms local, $0) finds similar, not connected
 - MEMORY.md must stay under 5K - injected on every message
 
 ## Connected Topics
@@ -1139,9 +1139,10 @@ Check if Ollama is installed:
 - Linux: curl -fsSL https://ollama.com/install.sh | sh
 
 Pull the embedding model (pick ONE based on your hardware):
-- **16GB+ RAM (recommended):** ollama pull qwen3-embedding:0.6b (best quality-to-size ratio, 1024 dims, 32K context, same family as MTEB #1 model)
+- **Most setups (recommended):** ollama pull qwen3-embedding:0.6b (best quality-to-size ratio, 1024 dims, 32K context, same family as MTEB #1 model)
 - **32GB+ RAM or dedicated GPU:** ollama pull qwen3-embedding:4b (higher quality, ~3GB RAM)
-- **Low RAM or potato hardware:** ollama pull nomic-embed-text (768 dims, smallest footprint)
+- **RTX 3090+ or 5080+ with 16GB+ VRAM:** Use Qwen3-Embedding-8B via Fireworks or local vLLM (4096 dims, SOTA quality — see Part 10)
+- **Low RAM or potato hardware:** ollama pull nomic-embed-text (768 dims, smallest footprint — noticeably worse quality)
 
 Do NOT use cloud embeddings (Gemini, OpenAI, Voyage) as your primary — 2-5 second round-trip latency per search vs <100ms local. Cloud embeddings defeat the entire purpose of fast memory search.
 
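Besides quality, the 768/1024/4096-dim options trade off index size on disk. A back-of-envelope sketch, assuming float32 vectors and made-up file/chunk counts (the helper name and numbers are illustrative, not from any tool):

```python
def index_mb(n_chunks: int, dims: int, bytes_per_float: int = 4) -> float:
    # Raw vector storage: one float32 per dimension per indexed chunk.
    return n_chunks * dims * bytes_per_float / 1e6

# Assume 500 vault files x ~10 chunks each = 5,000 chunks.
chunks = 500 * 10
for model, dims in [
    ("nomic-embed-text", 768),
    ("qwen3-embedding:0.6b", 1024),
    ("Qwen3-Embedding-8B", 4096),
]:
    # roughly 15 MB / 20 MB / 82 MB respectively at 5,000 chunks
    print(f"{model}: {index_mb(chunks, dims):.1f} MB")
```

Even the 4096-dim option stays small at vault scale, so the real constraint is the VRAM needed to run the model, not the index.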
@@ -1187,20 +1188,30 @@ If editing same file 5+ times without progress, STOP and reconsider approach ent
 ### Multi-Session Projects
 One feature at a time. Create progress.txt (done/in-progress/next). Start sessions by reading it.
 
-## STEP 9: SET UP AUTODREAM MEMORY CONSOLIDATION (Part 16)
+## STEP 9: SET UP MEMORY CONSOLIDATION
 
-Create the dream state file:
+**OpenClaw 2026.4+ (recommended):** Enable built-in dreaming in openclaw.json:
+```json
+{
+  "plugins": {
+    "entries": {
+      "memory-core": {
+        "config": {
+          "dreaming": {
+            "enabled": true
+          }
+        }
+      }
+    }
+  }
+}
+```
+That's it. Dreaming runs daily at 3am automatically. See Part 22 for full config.
+
+**Older versions (< 2026.4):** Use the custom autoDream approach from Part 16:
 - Create memory/.dream-state.json with: {"lastDreamAt":null,"sessionsSinceDream":0,"lastScanAt":null,"totalDreams":0,"lastDreamResult":null,"lastProcessedFiles":[]}
 - Create memory/topics/ directory (or use vault/ if Part 9 is set up)
-
-Add autoDream protocol to AGENTS.md (insert after orchestrator rules):
-
-### autoDream — Memory Consolidation
-On every new session, check gates (cheapest first):
-1. TIME: ≥24h since lastDreamAt? SESSION: ≥5 sessions? USER: not urgent?
-2. If all pass: Orient (read MEMORY.md) → Gather (grep new files, don't read everything) → Consolidate (write topics/vault) → Prune (rebuild MEMORY.md as pure index, <200 lines, <25KB)
-3. Update dream-state.json. On failure, rollback lastDreamAt.
-4. Tell user: "🌙 Memory consolidated — processed N files"
+- Add autoDream protocol to AGENTS.md (see Part 16 for full instructions)
 
 ## STEP 10: CONFIG PROTECTION + SECURITY
 
@@ -1370,7 +1381,7 @@ Don't rely on manually starting services. Create a single `.cmd` or `.ps1` that
 **One-shot prompt struggles on your model:**
 Do these 3 things manually instead:
 1. Copy files from `/templates` into your workspace root
-2. Run `ollama pull nomic-embed-text`
+2. Run `ollama pull qwen3-embedding:0.6b`
 3. Restart gateway: `openclaw gateway stop && openclaw gateway start`
 
 ## FAQ
