- 2026-03-14: Replaced original aspirational draft with as-built specification.
- 2026-03-14: Documented current GitHub Pages/static-host runtime approach, including COI service worker integration.
- 2026-03-14: Added implemented UX/accessibility features (theme toggle persistence, splitter, About modal, keyboard/focus behaviors).
- 2026-03-14: Captured actual run/session lifecycle behavior and OpenAI-compatible wrapper support as implemented in code.
Model Coder is a browser-based educational sandbox for learning OpenAI-style Python coding patterns against a local LLM.
Users write Python code in a PyScript editor and execute it in a terminal pane. The app provides a local openai-compatible wrapper so students can practice common syntax for:
OpenAI().chat.completions.create(...)OpenAI().responses.create(...)- Streaming responses
- Async usage with
AsyncOpenAI
The model runtime is local and in-browser with dual-mode support: WebLLM (Phi-3-mini) when WebGPU is available, falling back to wllama (Phi-2) for CPU-only environments.
- Run fully in-browser on static hosting (including GitHub Pages).
- Provide an OpenAI-like Python authoring experience without external API calls.
- Support interactive terminal workflows (
print,input, loop-based chat samples). - Provide beginner-friendly sample templates that can be edited and rerun quickly.
The implementation is split across these files:
index.html: app shell, toolbar, editor/terminal containers, accessibility landmarks, About modal.styles.css: responsive layout, light/dark themes, terminal/editor styling, focus-visible and a11y styles, modal styles.app.js: UI controller, templates, run lifecycle, session cleanup, theme/splitter/accessibility behavior.llm.js: local model lifecycle and request handling with dual-mode support (WebLLM for GPU, wllama for CPU).nopenai.py: Python-side OpenAI-compatible wrapper and stream abstractions.coi-serviceworker.js: COOP/COEP service-worker bootstrap for cross-origin isolation support on static hosting.
- Designed for static hosting and GitHub Pages.
- Includes
coi-serviceworker.jsbootstrap inindex.htmlto support cross-origin isolation patterns needed by advanced browser runtime features. - Terminal execution uses PyScript script runners targeted at
terminal-container. - Runtime mode selection is managed in
app.jsbyshouldUseTerminalWorker().
- Top toolbar with:
- sample picker
- Run, Stop, Reset Layout, Theme toggle, About button, Retry Model button
- Status pills for runtime/model state
- Two-pane workspace:
- editor pane (top)
- terminal pane (bottom)
- draggable splitter with keyboard support
- Light and dark themes
- Theme preference persisted in
localStorage - Editor and terminal themed consistently with app shell
- Labeled toolbar controls and landmarks
aria-livestatus updates- keyboard-resizable splitter
- skip link to workspace
- focus-visible styling for keyboard navigation
- About dialog with proper dialog semantics and keyboard close behavior
The wrapper exposed via nopenai.py supports:
OpenAI(base_url="http://localwllama", api_key="...")AsyncOpenAI(base_url="http://localwllama", api_key="...")- Chat Completions:
chat.completions.create(model="local-llm", messages=[...], stream=False|True)
- Responses API:
responses.create(model="local-llm", input=..., instructions=..., previous_response_id=..., stream=False|True)
- Synchronous and asynchronous stream iterators
Validation behavior:
base_urlmust behttp://localwllamamodelmust belocal-llm- message role/content types are validated
- user prompts are checked against
moderation/mod.txtreversed-term filters in JS before ChatML assembly - matched prompts return
I'm sorry. I can't help with thatas the model response
- Uses
@wllama/wllamafrom CDN. - Loads model
Felladrin/gguf-sharded-phi-2-orange-v2(phi-2-orange-v2.Q5_K_M.shard-00001-of-00025.gguf). - Converts request message structures to ChatML prompt format.
- Supports full and streaming completion paths.
- Maintains response/session maps for continuation semantics and stream retrieval.
- Editor is used for authoring only; execution is routed through terminal run button.
- Running code injects
nopenai.pysource and aliases it as bothnopenaiandopenai. - Run sessions are tracked with run IDs to prevent stale async completion events from corrupting UI state.
- Stop and completion both reset active session state while preserving expected terminal behavior.
- Switching to a different sample clears terminal output and resets model session context.
Current sample set:
- Blank Page
- Simple chat (ChatCompletions API)
- Simple chat (Responses API)
- Conversation Tracking (ChatCompletions API)
- Conversation Tracking (Responses API)
- Streaming (Responses API)
- Async chat (Responses API)
- Async streaming (Responses API)
- The app is designed for local, in-browser execution.
- User code, prompts, and outputs are processed locally in the browser runtime.
- No server-side inference pipeline is used by this app.
- This document is intentionally as-built, not aspirational.
- For implementation walkthrough and function-by-function flow, see
implementation_details.md.