| 1 | +--- |
| 2 | +name: download-from-swanlab-url |
| 3 | +description: Download per-step time-series metric data (reward, entropy, response length, etc.) from a SwanLab cloud run URL as a pandas.DataFrame. Use when the user provides a SwanLab URL and wants to fetch or analyze training curves. |
| 4 | +--- |
| 5 | + |
| 6 | +# Skill: Download metric data from a SwanLab run URL |
| 7 | + |
| 8 | +## Goal |
| 9 | + |
| 10 | +Given a SwanLab cloud URL of the form |
| 11 | + |
| 12 | +``` |
| 13 | +https://swanlab.cn/@<username>/<project>/runs/<exp_cuid>/chart |
| 14 | +``` |
| 15 | + |
| 16 | +fetch the per-step time-series data for one or more metrics (e.g. reward, entropy, response length) as a `pandas.DataFrame` for downstream plotting or analysis. |
| 17 | + |
| 18 | +## Trial-and-error log (read first; saves hours) |
| 19 | + |
| 20 | +The shortcuts you might try **do not work**: |
| 21 | + |
| 22 | +1. **WebFetch / `curl` of the chart page** — the page is a Vue SPA. The HTML body is just `<div id="app"></div>`; no chart data is embedded. Don't bother scraping HTML. |
| 23 | + |
| 24 | +2. **Direct REST probe (`curl https://api.swanlab.cn/api/v1/runs/<cuid>`)**: returns `404 Not Found` even with a valid API key. That v1 path is not part of the public REST surface; use the Python SDK instead. |
| 25 | + |
| 26 | +3. **`swanlab verify` says you're logged in** — but the env may point to a private cloud. On this host: |
| 27 | + ``` |
| 28 | + $ swanlab verify |
| 29 | + swanlab: You are logged into https://cloud-20.agent-matrix.com as fuqingxu |
| 30 | + ``` |
| 31 | + This is **not** swanlab.cn. The shell exports `SWANLAB_API_HOST` and `SWANLAB_WEB_HOST` in `~/.bashrc`, redirecting the SDK to a different deployment. Calling `OpenApi(api_key=...)` then fails with `Login failed: 404 Not Found` because `/login/api_key` only exists on swanlab.cn's API host. |
| 32 | + |
| 33 | +4. **Metric keys are NOT what the UI shows.** The chart card titled "reward" is logged as `critic/rewards/mean`. Asking `get_metrics(keys=['reward'])` returns code 404. You **must** probe candidate keys (see the cheat sheet below). |
| 34 | + |
| 35 | +5. **`get_summary` to enumerate keys** — does not work for non-cloned runs. When `rootExpId` / `rootProId` are `None` (i.e. the run was not cloned from another project), `experiment.get_summary` returns HTTP 400 "Bad Request". Skip it; probe candidate keys directly with `get_metrics`. |
| 36 | + |
| 37 | +## The working recipe |
| 38 | + |
| 39 | +### 1. Prerequisites |
| 40 | + |
| 41 | +- `swanlab` Python package (>= 0.7.4 confirmed). |
| 42 | +- An API key for `swanlab.cn`. On this machine it is stored in `~/.swanlab/.netrc`: |
| 43 | + ``` |
| 44 | + machine https://api.swanlab.cn |
| 45 | + login https://swanlab.cn |
| 46 | + password <API_KEY> |
| 47 | + ``` |
| 48 | + Read the password field from there if you don't have it on hand. |
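Reading the key by hand is error-prone, so here is a minimal sketch that pulls it out with the standard-library `netrc` module. It assumes the file follows ordinary netrc syntax, as in the excerpt above. The helper name `read_swanlab_api_key` is hypothetical and not part of the SwanLab SDK.

```python
import netrc
from pathlib import Path


def read_swanlab_api_key(path=None, machine="https://api.swanlab.cn"):
    """Return the password stored for `machine`, or None if there is no entry.

    `path` defaults to ~/.swanlab/.netrc (the location named above).
    """
    netrc_path = Path(path) if path else Path.home() / ".swanlab" / ".netrc"
    auth = netrc.netrc(str(netrc_path)).authenticators(machine)
    # authenticators() returns a (login, account, password) tuple, or None.
    return auth[2] if auth else None
```

Calling `read_swanlab_api_key()` with no arguments reads the default location. Pass a different `machine` string if the entry in your file is keyed differently.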
| 49 | + |
| 50 | +### 2. Parse the URL |
| 51 | + |
| 52 | +```python |
| 53 | +import re |
| 54 | + |
| 55 | +URL = "https://swanlab.cn/@binaryhusky/spy-game-rl/runs/zku3ujg2k3unvt61jbu0s/chart" |
| 56 | +m = re.match(r"https?://swanlab\.cn/@([^/]+)/([^/]+)/runs/([^/]+)", URL) |
| 57 | +username, project, exp_id = m.group(1), m.group(2), m.group(3) |
| 58 | +``` |
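The snippet above raises an opaque `AttributeError` when the URL does not match, because `re.match` returns `None`. A slightly more defensive sketch is shown below. The function name `parse_run_url` is hypothetical, and the only change to the pattern is that the run id stops at `?` or `#` so a trailing query string is not captured.

```python
import re

# Same shape as the pattern above; the run id additionally stops at "?" or "#".
RUN_URL = re.compile(r"https?://swanlab\.cn/@([^/]+)/([^/]+)/runs/([^/?#]+)")


def parse_run_url(url):
    """Return (username, project, exp_id) or raise ValueError on a non-run URL."""
    m = RUN_URL.match(url.strip())
    if m is None:
        raise ValueError(f"not a swanlab.cn run URL: {url!r}")
    return m.groups()
```

Usage: `username, project, exp_id = parse_run_url(URL)`.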
| 59 | + |
| 60 | +### 3. Override env vars BEFORE importing `swanlab` |
| 61 | + |
| 62 | +Critical: if your shell has `SWANLAB_API_HOST` / `SWANLAB_WEB_HOST` / `SWANLAB_API_KEY` pointing to a private cloud, the SDK silently uses them. Either run via `env -i` or unset them: |
| 63 | + |
| 64 | +```bash |
| 65 | +unset SWANLAB_API_KEY SWANLAB_API_HOST SWANLAB_WEB_HOST |
| 66 | +SWANLAB_API_HOST=https://api.swanlab.cn/api \ |
| 67 | +SWANLAB_WEB_HOST=https://swanlab.cn \ |
| 68 | +python your_script.py |
| 69 | +``` |
| 70 | + |
| 71 | +Note: `SWANLAB_API_HOST` for swanlab.cn ends with `/api` (the SDK default in `swanlab/env.py`); without that suffix login also returns 404. |
| 72 | + |
| 73 | +### 4. Open the API and fetch metadata |
| 74 | + |
| 75 | +```python |
| 76 | +import swanlab |
| 77 | +api = swanlab.OpenApi(api_key="<API_KEY>")  # raises ValidationError if the hosts are wrong |
| 78 | + |
| 79 | +exp = api.get_experiment(project=project, exp_id=exp_id, username=username) |
| 80 | +assert exp.code == 200, exp.errmsg |
| 81 | +print(exp.data.name, exp.data.state) |
| 82 | +# exp.data.profile['config'] contains the full training config (verl-style nested dict) |
| 83 | +``` |
| 84 | + |
| 85 | +### 5. Discover the right metric keys |
| 86 | + |
| 87 | +The keys you ask for must match what was logged. For a verl-based AgentJet run, the working names are: |
| 88 | + |
| 89 | +| You might think | Actual logged key | |
| 90 | +| --------------- | ----------------------- | |
| 91 | +| `reward` | `critic/rewards/mean` | |
| 92 | +| `entropy` | `actor/entropy` | |
| 93 | +| `response_length` | `response_length/mean` | |
| 94 | +| `pg_loss` | `actor/pg_loss` | |
| 95 | + |
| 96 | +Other common ones: `response_length/max`, `response_length/min`, `critic/score/mean`. Probe defensively — `get_metrics` returns `code=200` and N rows on hit, `code=404, rows=0` on miss: |
| 97 | + |
| 98 | +```python |
| 99 | +candidates = [ |
| 100 | + "critic/rewards/mean", "critic/score/mean", |
| 101 | + "actor/entropy", "actor/entropy_loss", "actor/pg_loss", |
| 102 | + "response_length/mean", "response_length/max", "response_length/min", |
| 103 | +] |
| 104 | +for k in candidates: |
| 105 | + r = api.get_metrics(exp_id=exp_id, keys=k) |
| 106 | + print(f"{k:40s} code={r.code} rows={0 if r.data is None else len(r.data)}") |
| 107 | +``` |
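If you want the surviving keys as a list rather than printed output, the probe loop can be wrapped in a small helper. This is a sketch under the assumption stated above: a hit is `code == 200` with non-empty `data`, and a miss is `code == 404` with no data. The name `probe_keys` and the `fetch` callable are hypothetical. Passing the fetcher in as an argument keeps the helper independent of the SDK.

```python
def probe_keys(fetch, candidates):
    """Return the candidates for which `fetch(key)` reports a hit.

    `fetch` is any callable that takes one key and returns an object with
    `.code` and `.data` attributes, like the responses described above.
    """
    hits = []
    for key in candidates:
        r = fetch(key)
        if r.code == 200 and r.data is not None and len(r.data) > 0:
            hits.append(key)
    return hits
```

With the objects from the previous steps this would be called as `probe_keys(lambda k: api.get_metrics(exp_id=exp_id, keys=k), candidates)`, and the returned list can be passed straight to the fetch in the next step.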
| 108 | + |
| 109 | +### 6. Fetch and save |
| 110 | + |
| 111 | +```python |
| 112 | +keys = ["critic/rewards/mean", "actor/entropy", "response_length/mean"] |
| 113 | +r = api.get_metrics(exp_id=exp_id, keys=keys) |
| 114 | +df = r.data # indexed by `step`; one column per key + one `<key>_timestamp` column |
| 115 | +df.to_csv("metrics.csv") |
| 116 | +``` |
| 117 | + |
| 118 | +The DataFrame layout is: |
| 119 | + |
| 120 | +``` |
| 121 | + actor/entropy actor/entropy_timestamp critic/rewards/mean ... |
| 122 | +step |
| 123 | +1 0.5569 1774003810000 0.7271 |
| 124 | +2 0.5732 1774004325000 0.7589 |
| 125 | +... |
| 126 | +``` |
| 127 | + |
| 128 | +The `_timestamp` columns are unix millis; usually you can drop them and plot against the `step` index. |
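To make that concrete, here is a minimal pandas sketch that separates the value columns from the `_timestamp` columns and converts the latter to datetimes. It assumes the layout shown above: a `step` index and a `_timestamp` suffix on columns holding unix milliseconds. The helper name `split_timestamps` is hypothetical.

```python
import pandas as pd


def split_timestamps(df):
    """Split a metrics frame into (values, times).

    `values` keeps the metric columns; `times` holds the `*_timestamp`
    columns converted from unix milliseconds to pandas datetimes.
    """
    is_ts = df.columns.str.endswith("_timestamp")
    values = df.loc[:, ~is_ts]
    times = df.loc[:, is_ts].apply(lambda s: pd.to_datetime(s, unit="ms"))
    return values, times
```

`values` can be plotted directly against the `step` index. `times` is only needed if you want wall-clock time on the x-axis.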
| 129 | + |
| 130 | +## End-to-end runnable snippet |
| 131 | + |
| 132 | +```python |
| 133 | +"""Fetch reward/entropy/response_length curves from a swanlab.cn run URL.""" |
| 134 | +import os, re, sys |
| 135 | +# Strip any private-cloud overrides BEFORE importing swanlab. |
| 136 | +for v in ("SWANLAB_API_KEY", "SWANLAB_API_HOST", "SWANLAB_WEB_HOST"): |
| 137 | + os.environ.pop(v, None) |
| 138 | +os.environ["SWANLAB_API_HOST"] = "https://api.swanlab.cn/api" |
| 139 | +os.environ["SWANLAB_WEB_HOST"] = "https://swanlab.cn" |
| 140 | + |
| 141 | +import swanlab |
| 142 | + |
| 143 | +URL = sys.argv[1] |
| 144 | +API_KEY = sys.argv[2] # or read from ~/.swanlab/.netrc |
| 145 | +m = re.match(r"https?://swanlab\.cn/@([^/]+)/([^/]+)/runs/([^/]+)", URL) |
| 146 | +username, project, exp_id = m.groups() |
| 147 | + |
| 148 | +api = swanlab.OpenApi(api_key=API_KEY) |
| 149 | +keys = ["critic/rewards/mean", "actor/entropy", "response_length/mean"] |
| 150 | +r = api.get_metrics(exp_id=exp_id, keys=keys) |
| 151 | +assert r.code == 200, r.errmsg |
| 152 | +df = r.data.rename(columns={ |
| 153 | + "critic/rewards/mean": "reward", |
| 154 | + "actor/entropy": "entropy", |
| 155 | + "response_length/mean": "response_length", |
| 156 | +}) |
| 157 | +df.to_csv("metrics.csv") |
| 158 | +print(df.head()) |
| 159 | +``` |
| 160 | + |
| 161 | +## Plot recipe (seaborn, optional) |
| 162 | + |
| 163 | +```python |
| 164 | +import seaborn as sns, matplotlib.pyplot as plt, pandas as pd |
| 165 | +df = pd.read_csv("metrics.csv").sort_values("step") |
| 166 | +sns.set_theme(context="paper", style="whitegrid") |
| 167 | +fig, axes = plt.subplots(1, 3, figsize=(11.5, 3.0)) |
| 168 | +palette = sns.color_palette("deep") |
| 169 | +for ax, (col, title, c) in zip(axes, [ |
| 170 | + ("reward", "Reward", palette[0]), |
| 171 | + ("entropy", "Policy Entropy", palette[3]), |
| 172 | + ("response_length", "Response Length", palette[2]), |
| 173 | +]): |
| 174 | + sns.lineplot(data=df, x="step", y=col, ax=ax, color=c, linewidth=1.6) |
| 175 | + ax.set_title(title); ax.set_xlabel("Training step"); ax.set_ylabel("") |
| 176 | +fig.tight_layout() |
| 177 | +fig.savefig("curves.pdf", bbox_inches="tight") |
| 178 | +``` |
| 179 | + |
| 180 | +## Troubleshooting cheat sheet |
| 181 | + |
| 182 | +| Symptom | Likely cause | Fix | |
| 183 | +| --- | --- | --- | |
| 184 | +| `ValidationError: Login failed: 404 Not Found` on `OpenApi(api_key=...)` | `SWANLAB_API_HOST` points to a private cloud, or missing `/api` suffix | Unset the env vars and explicitly set `SWANLAB_API_HOST=https://api.swanlab.cn/api` | |
| 185 | +| `get_metrics` returns `code=404, "No data found"` | Wrong key name (UI label != log key) | Probe with the cheat sheet in §5; remember verl prefixes (`actor/`, `critic/`, `response_length/...`) | |
| 186 | +| `get_summary` returns `code=400, "Bad Request"` | Run is not a clone; `rootExpId`/`rootProId` are None | Don't use `get_summary` for non-cloned runs; just probe `get_metrics` | |
| 187 | +| `OpenApi.login_info` is not callable | It's a property, not a method | Access as `api.login_info` (no `()`) | |
| 188 | +| WebFetch returns "no data" / SPA shell | Chart page is Vue-rendered client-side | Use the SDK; do not scrape HTML | |
| 189 | +| `swanlab verify` shows wrong host | `~/.bashrc` exports redirect SDK to a private cloud | Override env at script start, before `import swanlab` | |