Model Coder (As-Built Specification)

Changelog

2026-03-14: Replaced original aspirational draft with as-built specification.
2026-03-14: Documented current GitHub Pages/static-host runtime approach, including COI service worker integration.
2026-03-14: Added implemented UX/accessibility features (theme toggle persistence, splitter, About modal, keyboard/focus behaviors).
2026-03-14: Captured actual run/session lifecycle behavior and OpenAI-compatible wrapper support as implemented in code.

Overview

Model Coder is a browser-based educational sandbox for learning OpenAI-style Python coding patterns against a local LLM.

Users write Python code in a PyScript editor and execute it in a terminal pane. The app provides a local openai-compatible wrapper so students can practice common syntax for:

OpenAI().chat.completions.create(...)
OpenAI().responses.create(...)
Streaming responses
Async usage with AsyncOpenAI

The model runtime is local and in-browser with dual-mode support: WebLLM (Phi-3-mini) when WebGPU is available, falling back to wllama (Phi-2) for CPU-only environments.

Core Goals

Run fully in-browser on static hosting (including GitHub Pages).
Provide an OpenAI-like Python authoring experience without external API calls.
Support interactive terminal workflows (print, input, loop-based chat samples).
Provide beginner-friendly sample templates that can be edited and rerun quickly.

Architecture

The implementation is split across these files:

index.html: app shell, toolbar, editor/terminal containers, accessibility landmarks, About modal.
styles.css: responsive layout, light/dark themes, terminal/editor styling, focus-visible and a11y styles, modal styles.
app.js: UI controller, templates, run lifecycle, session cleanup, theme/splitter/accessibility behavior.
llm.js: local model lifecycle and request handling with dual-mode support (WebLLM for GPU, wllama for CPU).
nopenai.py: Python-side OpenAI-compatible wrapper and stream abstractions.
coi-serviceworker.js: COOP/COEP service-worker bootstrap for cross-origin isolation support on static hosting.

Hosting and Runtime Model

Designed for static hosting and GitHub Pages.
Includes coi-serviceworker.js bootstrap in index.html to support cross-origin isolation patterns needed by advanced browser runtime features.
Terminal execution uses PyScript script runners targeted at terminal-container.
Runtime mode selection is managed in app.js by shouldUseTerminalWorker().

UI and Interaction Requirements

Layout

Top toolbar with:
- sample picker
- Run, Stop, Reset Layout, Theme toggle, About button, Retry Model button
Status pills for runtime/model state
Two-pane workspace:
- editor pane (top)
- terminal pane (bottom)
- draggable splitter with keyboard support

Theming

Light and dark themes
Theme preference persisted in localStorage
Editor and terminal themed consistently with app shell

Accessibility

Labeled toolbar controls and landmarks
aria-live status updates
keyboard-resizable splitter
skip link to workspace
focus-visible styling for keyboard navigation
About dialog with proper dialog semantics and keyboard close behavior

Python API Compatibility (Implemented)

The wrapper exposed via nopenai.py supports:

OpenAI(base_url="http://localwllama", api_key="...")
AsyncOpenAI(base_url="http://localwllama", api_key="...")
Chat Completions:
- chat.completions.create(model="local-llm", messages=[...], stream=False|True)
Responses API:
- responses.create(model="local-llm", input=..., instructions=..., previous_response_id=..., stream=False|True)
Synchronous and asynchronous stream iterators

Validation behavior:

base_url must be http://localwllama
model must be local-llm
message role/content types are validated
user prompts are checked against moderation/mod.txt reversed-term filters in JS before ChatML assembly
matched prompts return I'm sorry. I can't help with that as the model response

Local Model Runtime

Uses @wllama/wllama from CDN.
Loads model Felladrin/gguf-sharded-phi-2-orange-v2 (phi-2-orange-v2.Q5_K_M.shard-00001-of-00025.gguf).
Converts request message structures to ChatML prompt format.
Supports full and streaming completion paths.
Maintains response/session maps for continuation semantics and stream retrieval.

Run Lifecycle Behavior

Editor is used for authoring only; execution is routed through terminal run button.
Running code injects nopenai.py source and aliases it as both nopenai and openai.
Run sessions are tracked with run IDs to prevent stale async completion events from corrupting UI state.
Stop and completion both reset active session state while preserving expected terminal behavior.
Switching to a different sample clears terminal output and resets model session context.

Built-in Samples

Current sample set:

Blank Page
Simple chat (ChatCompletions API)
Simple chat (Responses API)
Conversation Tracking (ChatCompletions API)
Conversation Tracking (Responses API)
Streaming (Responses API)
Async chat (Responses API)
Async streaming (Responses API)

Privacy and Data Flow

The app is designed for local, in-browser execution.
User code, prompts, and outputs are processed locally in the browser runtime.
No server-side inference pipeline is used by this app.

Notes for Contributors

This document is intentionally as-built, not aspirational.
For implementation walkthrough and function-by-function flow, see implementation_details.md.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Model Coder (As-Built Specification)

Changelog

Overview

Core Goals

Architecture

Hosting and Runtime Model

UI and Interaction Requirements

Layout

Theming

Accessibility

Python API Compatibility (Implemented)

Local Model Runtime

Run Lifecycle Behavior

Built-in Samples

Privacy and Data Flow

Notes for Contributors

Uh oh!

FilesExpand file tree

spec.md

Latest commit

History

spec.md

File metadata and controls

Model Coder (As-Built Specification)

Changelog

Overview

Core Goals

Architecture

Hosting and Runtime Model

UI and Interaction Requirements

Layout

Theming

Accessibility

Python API Compatibility (Implemented)

Local Model Runtime

Run Lifecycle Behavior

Built-in Samples

Privacy and Data Flow

Notes for Contributors