|
| 1 | +# Agent Script: Playlist Generator |
| 2 | + |
| 3 | +`scripts/agent.py` is a self-contained PEP 723 script that simulates the |
| 4 | +Rust Genius agent's multi-turn tool-calling loop against a local Ollama |
| 5 | +instance and the mt.db SQLite database. It serves as a rapid prototyping |
| 6 | +environment for prompt engineering and tool design before changes are ported |
| 7 | +to the Rust backend (`crates/mt-tauri/src/agent/`). |
| 8 | + |
| 9 | +## Architecture |
| 10 | + |
| 11 | +```text |
| 12 | +┌──────────────┐     ┌──────────────┐     ┌──────────────┐ |
| 13 | +│  Ollama LLM  │◄───►│   agent.py   │◄───►│    mt.db     │ |
| 14 | +│  (qwen3.5)   │     │  tool loop   │     │   (SQLite)   │ |
| 15 | +└──────────────┘     └──────┬───────┘     └──────────────┘ |
| 16 | +                            │ |
| 17 | +                     ┌──────▼───────┐ |
| 18 | +                     │ Last.fm API  │ |
| 19 | +                     │ (httpx GET)  │ |
| 20 | +                     └──────────────┘ |
| 21 | +``` |
| 22 | + |
| 23 | +**Components:** |
| 24 | + |
| 25 | +- **System prompt**: `_build_system_prompt(max_tracks)` generates a dynamic |
| 26 | + prompt with strategy routing (mood, artist, regional, mixed) and |
| 27 | + interpolated track count bounds. |
| 28 | +- **8 tools**: 3 local (SQLite) + 5 Last.fm (httpx → cross-ref with library). |
| 29 | + All tools return actionable hints on empty results to guide the model's |
| 30 | + next action. |
| 31 | +- **JSONL logging**: Every session, turn, tool call, result, and parse |
| 32 | + outcome is logged to a structured JSONL file for analysis. |
| 33 | +- **Hard cap**: `parse_response()` deduplicates and truncates track IDs to |
| 34 | + `MAX_PLAYLIST_TRACKS` regardless of model output. |
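
The hard cap described in the last bullet can be sketched as follows. This is a minimal illustration rather than the actual body of `parse_response()`; the helper name `cap_track_ids` is hypothetical, and it assumes the model's track IDs have already been extracted into a list.

```python
MAX_PLAYLIST_TRACKS = 25  # mirrors the documented AGENT_MAX_PLAYLIST_TRACKS default


def cap_track_ids(raw_ids, max_tracks=MAX_PLAYLIST_TRACKS):
    """Hypothetical helper: dedupe in first-seen order, then truncate."""
    # dict.fromkeys keeps the first occurrence of each ID and preserves order.
    unique = list(dict.fromkeys(raw_ids))
    return unique[:max_tracks]


if __name__ == "__main__":
    # The repeated IDs 3 and 1 are dropped before the cap is applied.
    print(cap_track_ids([3, 1, 3, 2, 1], max_tracks=2))
```

Deduplicating before truncating matters: if the order were reversed, repeated IDs could use up slots and leave the playlist shorter than the cap allows.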
| 35 | + |
| 36 | +## Tools |
| 37 | + |
| 38 | +| Tool | Source | Purpose | |
| 39 | +|------|--------|---------| |
| 40 | +| `get_recently_played` | SQLite | Recent listening habits | |
| 41 | +| `get_top_artists` | SQLite | Most-played artists by time range | |
| 42 | +| `search_library` | SQLite | Keyword search on title/artist/album | |
| 43 | +| `get_track_tags` | Last.fm | Mood/genre tags for a track | |
| 44 | +| `get_similar_tracks` | Last.fm + SQLite | Similar tracks cross-referenced with library | |
| 45 | +| `get_similar_artists` | Last.fm + SQLite | Similar artists with sample tracks from library | |
| 46 | +| `get_top_artists_by_tag` | Last.fm + SQLite | Genre discovery: top artists in a tag, filtered to library |
| 47 | +| `get_top_tracks_by_country` | Last.fm + SQLite | Regional trending tracks in library | |
| 48 | + |
| 49 | +## Evolution: From Naive to Semantic |
| 50 | + |
| 51 | +### Problem: keyword matching is not semantic understanding |
| 52 | + |
| 53 | +The initial implementation used stub tools that returned `[]` for all Last.fm |
| 54 | +calls. The model fell back to `search_library` with mood words as keywords, |
| 55 | +which matches against title/artist/album text via SQL `LIKE`. This produced |
| 56 | +results like: |
| 57 | + |
| 58 | +- **"chill"** matched *Ladyhawke - Chills* (synth-pop, not chill) |
| 59 | +- **"calm"** matched *Rage Against the Machine - Calm Like A Bomb* (definitely not calm) |
| 60 | +- **"soft"** matched *Spoon - I Could See the Dude* (from album *Soft Effects*) |
| 61 | + |
| 62 | +The model exhausted all 5 turns doing keyword searches and never produced a |
| 63 | +playlist. |
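
The failure mode is easy to reproduce with an in-memory SQLite database. The sketch below is illustrative only: the `tracks` schema and this stand-in `search_library` are assumptions, not the real mt.db layout or tool code, and album values not given above are left empty.

```python
import sqlite3

# Assumed, simplified schema; the real mt.db layout may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tracks (id INTEGER PRIMARY KEY, title TEXT, artist TEXT, album TEXT)"
)
conn.executemany(
    "INSERT INTO tracks (title, artist, album) VALUES (?, ?, ?)",
    [
        ("Chills", "Ladyhawke", ""),
        ("Calm Like A Bomb", "Rage Against the Machine", ""),
        ("I Could See the Dude", "Spoon", "Soft Effects"),
    ],
)


def search_library(conn, query):
    """Substring match on title/artist/album, as a keyword search would do."""
    pattern = f"%{query}%"
    return conn.execute(
        "SELECT artist, title FROM tracks "
        "WHERE title LIKE ? OR artist LIKE ? OR album LIKE ? ORDER BY id",
        (pattern, pattern, pattern),
    ).fetchall()


if __name__ == "__main__":
    for mood in ("chill", "calm", "soft"):
        print(mood, search_library(conn, mood))
```

SQLite's `LIKE` is case-insensitive for ASCII by default, so every mood word matches on spelling alone; nothing in the query knows what the track sounds like.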
| 64 | + |
| 65 | +### Solution: Last.fm tools + strategy-based prompt + actionable hints |
| 66 | + |
| 67 | +Three changes transformed the results: |
| 68 | + |
| 69 | +**1. Real Last.fm tool implementations**: `get_top_artists_by_tag("dream pop")` |
| 70 | +now queries Last.fm's tag database and cross-references against the local |
| 71 | +library. This finds Beach House, Cocteau Twins, and Cigarettes After Sex: artists |
| 72 | +that are *actually* dreamy, rather than tracks that merely have the word "dream" in the title. |
| 73 | + |
| 74 | +**2. Strategy-based system prompt**: Instead of a flat list of tool |
| 75 | +descriptions, the prompt routes by request type: |
| 76 | + |
| 77 | +```text |
| 78 | +- Mood/vibe requests → get_top_artists_by_tag with genre tags IN PARALLEL |
| 79 | +- Artist-based requests → get_similar_artists + get_similar_tracks |
| 80 | +- Regional requests → get_top_tracks_by_country |
| 81 | +- search_library is for specific artist/album/title lookups only |
| 82 | +``` |
| 83 | + |
| 84 | +**3. Actionable empty-result hints**: When a tool returns no matches, it |
| 85 | +explains *why* and suggests *what to try next*: |
| 86 | + |
| 87 | +```json |
| 88 | +{ |
| 89 | + "matches": 0, |
| 90 | + "lastfm_count": 50, |
| 91 | + "hint": "Last.fm returned 50 artists for 'ambient' but none are in your library. Try a broader tag, or use get_similar_artists on an artist you've already found." |
| 92 | +} |
| 93 | +``` |
| 94 | + |
| 95 | +This prevents the model from blindly retrying the same approach. The idea is |
| 96 | +inspired by the [Manus agent design post](https://reddit.com/r/LocalLLaMA/comments/1rrisqn/) |
| 97 | +on error messages as navigation. |
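
A sketch of how a tool might combine the library cross-reference with the hint is shown below. The function name `filter_to_library`, its arguments, and the wording of the second hint are assumptions for illustration; only the `matches` / `lastfm_count` / `hint` payload shape comes from the example above.

```python
def filter_to_library(tag, lastfm_artists, library_artists):
    """Hypothetical helper: keep Last.fm artists present in the local library,
    and attach an actionable hint when nothing survives the filter."""
    library = {name.casefold() for name in library_artists}
    matches = [name for name in lastfm_artists if name.casefold() in library]

    result = {"matches": len(matches), "lastfm_count": len(lastfm_artists)}
    if matches:
        result["artists"] = matches
    elif lastfm_artists:
        # Last.fm knew the tag, but the library has none of those artists.
        result["hint"] = (
            f"Last.fm returned {len(lastfm_artists)} artists for '{tag}' but none "
            "are in your library. Try a broader tag, or use get_similar_artists "
            "on an artist you've already found."
        )
    else:
        # Assumed wording for the case where Last.fm itself returned nothing.
        result["hint"] = (
            f"Last.fm returned no artists for '{tag}'. Try a related genre tag."
        )
    return result


if __name__ == "__main__":
    print(filter_to_library("ambient", ["Brian Eno"], ["Beach House"]))
```

Separating the two empty cases lets the hint tell the model whether the tag or the library is the limiting factor.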
| 98 | + |
| 99 | +### Results comparison |
| 100 | + |
| 101 | +**Before** (stub tools, naive prompt, temp=0.45): |
| 102 | + |
| 103 | +```jsonl |
| 104 | +{"event":"session_start","data":{"temperature":0.45,"prompt":"make me a chill playlist"}} |
| 105 | +{"event":"tool_call","data":{"tool":"search_library","args":{"query":"chill"}}} |
| 106 | +{"event":"tool_result","data":{"tool":"search_library","count":2,"result":[{"title":"Chills","artist":"Ladyhawke"},{"title":"chill","artist":"deadmau5"}]}} |
| 107 | +{"event":"tool_call","data":{"tool":"search_library","args":{"query":"calm"}}} |
| 108 | +{"event":"tool_result","data":{"tool":"search_library","count":1,"result":[{"title":"Calm Like A Bomb","artist":"Rage Against the Machine"}]}} |
| 109 | +{"event":"session_end","data":{"reason":"exhausted","turns_used":5}} |
| 110 | +``` |
| 111 | + |
| 112 | +Exhausted 5 turns. Produced a 2-track playlist of keyword matches. |
| 113 | + |
| 114 | +**After** (Last.fm tools, strategy prompt, hints, temp=0.2, seed=42): |
| 115 | + |
| 116 | +```jsonl |
| 117 | +{"event":"session_start","data":{"temperature":0.2,"seed":42,"prompt":"make me a chill playlist"}} |
| 118 | +{"event":"tool_call","data":{"tool":"get_top_artists_by_tag","args":{"tag":"chillout","limit":50}}} |
| 119 | +{"event":"tool_call","data":{"tool":"get_top_artists_by_tag","args":{"tag":"dream pop","limit":50}}} |
| 120 | +{"event":"tool_call","data":{"tool":"get_top_artists_by_tag","args":{"tag":"shoegaze","limit":50}}} |
| 121 | +{"event":"tool_result","data":{"tool":"get_top_artists_by_tag","count":6}} |
| 122 | +{"event":"tool_call","data":{"tool":"get_similar_tracks","args":{"artist":"Cigarettes After Sex","track":"K."}}} |
| 123 | +{"event":"tool_call","data":{"tool":"get_similar_tracks","args":{"artist":"Beach House","track":"Sparks"}}} |
| 124 | +{"event":"parse_success","data":{"playlist_name":"Chill Vibes Collection","track_ids":[69727,70192,71486,"...21 more"],"valid_count":25}} |
| 125 | +{"event":"session_end","data":{"reason":"success","turns_used":4}} |
| 126 | +``` |
| 127 | + |
| 128 | +25/25 valid tracks in 4 turns. Artists: Beach House, Cocteau Twins, |
| 129 | +Cigarettes After Sex, Alvvays, girl in red, The Radio Dept., M83, Grimes. |
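
Because each log line is a standalone JSON object, runs like the two above can be compared with a few lines of Python. This is a minimal sketch that assumes only the `event` and `data` fields visible in the excerpts; the function name `summarize_session` is hypothetical.

```python
import json
from collections import Counter


def summarize_session(lines):
    """Count tool calls per tool and capture the session outcome."""
    calls = Counter()
    outcome = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        event = record.get("event")
        data = record.get("data", {})
        if event == "tool_call":
            calls[data.get("tool")] += 1
        elif event == "session_end":
            outcome = (data.get("reason"), data.get("turns_used"))
    return calls, outcome


if __name__ == "__main__":
    sample = [
        '{"event":"tool_call","data":{"tool":"search_library","args":{"query":"chill"}}}',
        '{"event":"tool_call","data":{"tool":"search_library","args":{"query":"calm"}}}',
        '{"event":"session_end","data":{"reason":"exhausted","turns_used":5}}',
    ]
    print(summarize_session(sample))
```

To analyze a real run, pass an open file handle for the JSONL log instead of the `sample` list.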
| 130 | + |
| 131 | +## Determinism Controls |
| 132 | + |
| 133 | +| Lever | Default | Effect | |
| 134 | +|-------|---------|--------| |
| 135 | +| `AGENT_TEMPERATURE` | 0.2 | Lower = more deterministic token sampling | |
| 136 | +| `top_p` | 0.9 | Nucleus sampling cutoff (hardcoded) | |
| 137 | +| `AGENT_SEED` | 0 (random) | Fixed seed for reproducible output | |
| 138 | +| `AGENT_MAX_PLAYLIST_TRACKS` | 25 | Hard cap on output track count | |
| 139 | +| `parse_response()` | n/a | Deduplicates + truncates regardless of model output |
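
One way the sampling levers could be assembled into an Ollama `options` dictionary is sketched below. The helper name `build_ollama_options` is hypothetical, and omitting the seed when it is `0` is an assumption about how "0 (random)" is handled; the script may do this differently.

```python
def build_ollama_options(temperature=0.2, seed=0, top_p=0.9):
    """Hypothetical helper: map the documented levers to Ollama sampling options."""
    options = {"temperature": temperature, "top_p": top_p}
    # Assumption: a seed of 0 means "random", so the key is left out entirely.
    if seed:
        options["seed"] = seed
    return options


if __name__ == "__main__":
    print(build_ollama_options())          # default run, unseeded
    print(build_ollama_options(seed=42))   # reproducible run
```

The resulting dictionary would be passed as the `options` field of the chat request.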
| 140 | + |
| 141 | +## Applying to Rust Backend |
| 142 | + |
| 143 | +The script mirrors the Rust agent in `crates/mt-tauri/src/agent/`: |
| 144 | + |
| 145 | +| Python (`scripts/agent.py`) | Rust (`crates/mt-tauri/src/agent/`) | |
| 146 | +|------------------------------|--------------------------------------| |
| 147 | +| `_build_system_prompt()` | `prompt.rs::SYSTEM_PROMPT` | |
| 148 | +| `TOOLS` list | `tools.rs` (8 `impl Tool` structs) | |
| 149 | +| `tool_get_similar_tracks()` | `tools.rs::GetSimilarTracks::call()` | |
| 150 | +| `_lastfm_get()` | `lastfm/client.rs::api_call()` | |
| 151 | +| `parse_response()` | `mod.rs::parse_agent_response()` | |
| 152 | +| `run_agent()` loop | `mod.rs::agent_generate_playlist()` | |
| 153 | + |
| 154 | +Changes validated in the Python script should be ported to Rust: |
| 155 | + |
| 156 | +1. **System prompt**: Copy the strategy-based prompt to `prompt.rs` |
| 157 | +2. **Actionable hints**: Add hint metadata to Rust tool `Output` types |
| 158 | +3. **Default limits**: Increase `get_top_artists_by_tag` default from 10 to 50 |
| 159 | +4. **Hard cap**: Add dedup + truncation to `parse_agent_response()` |
| 160 | +5. **Temperature/seed**: Pass through Ollama options in `build_agent()` |
| 161 | + |
| 162 | +## Usage |
| 163 | + |
| 164 | +```bash |
| 165 | +# Basic |
| 166 | +uv run scripts/agent.py "make me a chill playlist" |
| 167 | + |
| 168 | +# With options |
| 169 | +uv run scripts/agent.py --model qwen3.5:9b --seed 42 --temperature 0.1 "shoegaze deep cuts" |
| 170 | + |
| 171 | +# Extended thinking |
| 172 | +uv run scripts/agent.py --think --max-turns 8 "jazz from my library" |
| 173 | +``` |
| 174 | + |
| 175 | +## Configuration |
| 176 | + |
| 177 | +All env vars are read from `.env` via `python-decouple`. CLI flags override |
| 178 | +env var defaults. |
| 179 | + |
| 180 | +| Env Var | Default | CLI Flag | |
| 181 | +|---------|---------|----------| |
| 182 | +| `OLLAMA_MODEL` | `qwen3.5:9b` | `--model` | |
| 183 | +| `OLLAMA_HOST` | `http://localhost:11434` | `--host` | |
| 184 | +| `AGENT_MAX_TURNS` | `5` | `--max-turns` | |
| 185 | +| `AGENT_TEMPERATURE` | `0.2` | `--temperature` | |
| 186 | +| `AGENT_THINK` | `false` | `--think` | |
| 187 | +| `AGENT_SEED` | `0` | `--seed` | |
| 188 | +| `AGENT_MAX_PLAYLIST_TRACKS` | `25` | (none) |
| 189 | +| `AGENT_LOG_FILE` | `/tmp/ollama_python_agent.jsonl` | `--log-file` |
| 190 | +| `LASTFM_API_KEY` | (none) | (none) |
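
The precedence rule (CLI flag over env var over built-in default) can be sketched with `argparse` alone. This is an illustration under assumptions: it reads from a plain mapping such as `os.environ` as a stand-in for `python-decouple`, covers only three of the settings, and `build_parser` is a hypothetical name.

```python
import argparse
import os


def build_parser(env=os.environ):
    """Hypothetical parser: env values become defaults, CLI flags override them."""
    parser = argparse.ArgumentParser(description="Playlist agent (sketch)")
    parser.add_argument("prompt")
    parser.add_argument("--model", default=env.get("OLLAMA_MODEL", "qwen3.5:9b"))
    parser.add_argument(
        "--max-turns", type=int, default=int(env.get("AGENT_MAX_TURNS", "5"))
    )
    parser.add_argument(
        "--temperature",
        type=float,
        default=float(env.get("AGENT_TEMPERATURE", "0.2")),
    )
    return parser


if __name__ == "__main__":
    # Env supplies max-turns, the flag supplies temperature, model falls back.
    args = build_parser({"AGENT_MAX_TURNS": "7"}).parse_args(
        ["--temperature", "0.1", "shoegaze deep cuts"]
    )
    print(args.model, args.max_turns, args.temperature)
```

Resolving env values at parser construction time keeps the override logic in one place: whatever the flag leaves unset falls through to the env value, then to the hardcoded default.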