Update tutorials off models slated for retirement on 2026-06-12 #732

Open
nealwu wants to merge 16 commits into main from neal/tutorials-model-deprecations

nealwu commented May 22, 2026

Swaps each tutorial's model (and any hardcoded paired renderer) to the recommended replacement from the deprecation page:

  • Qwen3-4B-Instruct-2507 -> Qwen3.5-4B (qwen3_5_disable_thinking)
  • Llama-3.x base -> Qwen3.5-9B-Base (role_colon)
  • Qwen3-30B-A3B -> Qwen3.6-35B-A3B (qwen3_5)
  • Qwen3-VL-235B-A22B-Instruct -> Qwen3.5-397B-A17B (vision-capable)
  • Kimi-K2.5 -> Kimi-K2.6 (kimi_k26_disable_thinking)

Also refreshes the renderer table and supported-models prose in 201 and 102 so they don't reference retired families.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
nealwu requested a review from derek-tml May 22, 2026 18:54
nealwu and others added 15 commits May 23, 2026 01:00
Qwen3.5-4B is a chat-tuned thinking model; feeding it raw tokens
(the intentional pattern in 101) drops it into thinking mode and
produces garbled output. Switch to Qwen3.5-9B-Base -- base models
are the natural match for the raw tokenizer.encode -> sample API
the tutorial demonstrates. Drop the "\n" stop arg and tune
temperatures so the completion reads naturally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Remove the 'name=' argument from save_weights_and_get_sampling_client_async()
calls in tutorials 102 and 302 -- the parameter is now ignored and warns
in newer Tinker SDK versions.

Add a 'tutorials' optional-deps extra ('marimo', 'matplotlib') so users
can install everything needed for the tutorials with
'uv pip install "tinker-cookbook[tutorials]"' instead of remembering each
package by hand. Update tutorials/README.md to use the new install line.

Includes marimo auto-formatting noise from a 0.21.1 -> 0.23.8 session
(expanded function signatures, reordered cell-return tuples) and a small
prose addition to 102 noting that the Kimi K2.6 stage is slower than the
Qwen3.5 stage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The narration described the synchronous sample() + .result() futures
API, but every code cell uses await sample_async() + asyncio.gather.
Rewrite the prose (intro, sequential/concurrent sections, batch-eval,
summary, and the section header) to describe the actual mechanism:
asyncio.gather schedules the sample_async() coroutines onto the event
loop so all requests are in flight before any completes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
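
The difference between awaiting each request in turn and handing them all to asyncio.gather can be sketched with the standard library alone. The `sample_async` below is a hypothetical stand-in that only sleeps; it is not the Tinker SDK method, and the delay and return value are assumptions made for illustration.

```python
import asyncio
import time


async def sample_async(prompt: str, delay: float = 0.1) -> str:
    # Hypothetical stand-in for a network-bound sampling call.
    await asyncio.sleep(delay)
    return f"completion for {prompt!r}"


async def run_sequential(prompts: list[str]) -> list[str]:
    # Each request finishes before the next one starts.
    return [await sample_async(p) for p in prompts]


async def run_concurrent(prompts: list[str]) -> list[str]:
    # gather wraps each coroutine in a task, so all requests are
    # in flight before any of them completes. Results keep input order.
    return await asyncio.gather(*(sample_async(p) for p in prompts))


if __name__ == "__main__":
    prompts = ["a", "b", "c", "d"]
    for runner in (run_sequential, run_concurrent):
        start = time.perf_counter()
        asyncio.run(runner(prompts))
        print(runner.__name__, round(time.perf_counter() - start, 2), "s")
```

With four prompts, the sequential version should take roughly four times the per-request delay, while the concurrent version should take about one.
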
Metadata-only change from running the notebook locally; brings 104 in
line with the other tutorials already at 0.23.8.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Fix parse_response demo: the hardcoded fake_tokens were written for the
  old Qwen3-30B-A3B vocab, where 151645 was <|im_end|>. After the tokenizer
  swap that id decodes to unrelated text and the parse returns malformed.
  Build the tokens by encoding a plausible reply + the stop token instead.
- Vision section: stay on Qwen3.6-35B-A3B and use the same qwen3_5 renderer
  with an image processor, rather than switching models and framing VL as a
  separate renderer (Qwen3.5/3.6 are natively vision-capable).
- Renderer-switching cell now actually switches: loops over Qwen3.6 and
  Kimi K2.6 so the "each model family needs its own tokenizer" point is real.
- Note that qwen3_5 is a thinking renderer (open <think> tag) and document
  the qwen3_5_disable_thinking alternative.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
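
Why hardcoded token ids break after a tokenizer swap can be shown with a toy example. The two vocabularies and the helper functions below are hypothetical; they are not the real Qwen tokenizers or the cookbook's parse_response, only a sketch of the failure mode and of the "encode instead of hardcode" fix.

```python
# Toy vocabularies: the stop token's id changes between them.
OLD_VOCAB = {"Hello": 0, " there": 1, "<|im_end|>": 2}
NEW_VOCAB = {"Hello": 0, " there": 1, " unrelated": 2, "<|im_end|>": 3}


def encode(pieces: list[str], vocab: dict[str, int]) -> list[int]:
    # Map each text piece to its id in the given vocabulary.
    return [vocab[p] for p in pieces]


def decode(ids: list[int], vocab: dict[str, int]) -> str:
    # Invert the vocabulary and join the decoded pieces.
    inverse = {i: p for p, i in vocab.items()}
    return "".join(inverse[i] for i in ids)


def ends_with_stop(ids: list[int], vocab: dict[str, int]) -> bool:
    # A well-formed reply ends with the vocabulary's own stop token id.
    return bool(ids) and ids[-1] == vocab["<|im_end|>"]


# Ids written by hand against the old vocabulary.
hardcoded = [0, 1, 2]
# Ids rebuilt by encoding the reply plus the stop token.
rebuilt = encode(["Hello", " there", "<|im_end|>"], NEW_VOCAB)
```

Under the new vocabulary, `hardcoded` decodes to unrelated text and no longer ends with the stop token, while `rebuilt` stays correct whichever vocabulary is passed in.
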
loss_fn_outputs[...]["logprobs"] is a TensorData wrapper, not a plain
list, so slicing it (ce_logprobs[-3:]) raised "TensorData object is not
subscriptable". Convert with .tolist() before slicing. Also aligns the
loss:sum print columns; remainder is marimo formatting churn (version
bump, trimmed cell-return tuples).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
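
The slicing failure can be reproduced with a small stand-in class. `FakeTensorData` below is hypothetical and only mimics the two properties the commit message describes (no subscripting, a `.tolist()` method); it is not the SDK's TensorData.

```python
class FakeTensorData:
    """Hypothetical wrapper: holds values but defines no __getitem__."""

    def __init__(self, data: list[float]) -> None:
        self.data = data

    def tolist(self) -> list[float]:
        # Return a plain Python list that supports slicing.
        return list(self.data)


def last_three(logprobs: FakeTensorData) -> list[float]:
    # Convert first, then slice; slicing the wrapper raises TypeError.
    return logprobs.tolist()[-3:]


ce_logprobs = FakeTensorData([-1.0, -0.5, -0.25, -0.125])
print(last_three(ce_logprobs))
```
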
asyncio.run() can't be called inside marimo's running event loop. Switch
the five completer calls to top-level await in async cells, and drop the
now-unused asyncio / MessageCompleter / TokenCompleter / TokensWithLogprobs
imports. Remainder is marimo serialization churn (version bump, trimmed
cell-return tuples).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
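
The underlying restriction is standard asyncio behavior and can be reproduced without marimo. In this sketch, `cell_body` stands in for an async notebook cell and `fetch_value` for a completer call; both names are hypothetical.

```python
import asyncio


async def fetch_value() -> int:
    # Stand-in for an async completer call.
    await asyncio.sleep(0)
    return 42


async def cell_body() -> tuple[str, int]:
    # Inside a running loop, a nested asyncio.run() raises RuntimeError.
    coro = fetch_value()
    try:
        asyncio.run(coro)
        error = ""
    except RuntimeError as exc:
        error = str(exc)
        coro.close()  # avoid a "coroutine was never awaited" warning

    # Awaiting directly reuses the loop that is already running.
    value = await fetch_value()
    return error, value


if __name__ == "__main__":
    print(asyncio.run(cell_body()))
```
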
The triple-quoted judge prompt picked up 4-space indentation from marimo
serialization, which became part of the string sent to the model. Restore
the original flush-left text.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
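
The commit fixes this by restoring flush-left text. A different option, shown here only as a sketch, is to keep the string indented in source and strip the indentation with textwrap.dedent. The prompt wording below is invented for illustration and is not the tutorial's judge prompt.

```python
import textwrap


def build_indented_prompt() -> str:
    # Mimics a prompt that picked up 4-space indentation in serialization.
    return """
    You are a judge.
    Score the answer from 1 to 5.
    """


def build_clean_prompt() -> str:
    # dedent removes the common leading whitespace; strip drops the
    # blank first and last lines left by the triple quotes.
    return textwrap.dedent(build_indented_prompt()).strip()


if __name__ == "__main__":
    print(repr(build_indented_prompt()))
    print(repr(build_clean_prompt()))
```
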
- save_weights_for_sampler / save_state: drop _async (those return the
  future un-awaited; the sync method awaited once is the SDK's idiom).
- RestClient list/TTL/publish cells: .result() -> await *_async in async
  cells (sync-from-async warning / deadlock risk in marimo's loop).
- list_checkpoints: pass the full model_id, not model_id.split(":")[0]
  (the endpoint wants a ModelID; truncating caused 404 Model not found).
- weights.download: run via asyncio.to_thread so the sync helper's
  internal .result() doesn't block marimo's event loop.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
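
The asyncio.to_thread point in the last bullet can be sketched with the standard library. `blocking_download` is a hypothetical stand-in for a synchronous helper that blocks internally; it is not the actual weights.download function, and the heartbeat task exists only to show that the event loop keeps running.

```python
import asyncio
import time


def blocking_download(path: str) -> str:
    # Hypothetical sync helper that blocks its calling thread.
    time.sleep(0.2)
    return f"downloaded:{path}"


async def heartbeat(ticks: list[int]) -> None:
    # Records progress while the download runs in a worker thread.
    for i in range(3):
        await asyncio.sleep(0.05)
        ticks.append(i)


async def main() -> tuple[str, list[int]]:
    ticks: list[int] = []
    # to_thread moves the blocking call off the event loop thread,
    # so the heartbeat coroutine is not starved.
    result, _ = await asyncio.gather(
        asyncio.to_thread(blocking_download, "weights"),
        heartbeat(ticks),
    )
    return result, ticks


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Calling `blocking_download` directly inside `main` would instead hold the loop for the full duration of the sleep, delaying every other coroutine.
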
- NLLEvaluator: output["logprobs"] and datum.loss_fn_inputs["weights"]
  are TensorData, so torch.tensor(...) raised "Could not infer dtype".
  Use TensorData.to_torch() instead.
- Fix dead Inspect AI link: https://inspect.ai -> https://inspect.aisi.org.uk/
  (the inspect-ai package's actual docs site, by the UK AI Security Institute).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

nealwu commented Jun 1, 2026

Going to split this up into multiple PRs for simplicity, one for each section of the tutorials (100s, 200s, etc.)
