Skip to content

Latest commit

 

History

History
79 lines (60 loc) · 4.6 KB

File metadata and controls

79 lines (60 loc) · 4.6 KB

Edge Python

A compact single-pass SSA bytecode compiler and stack VM for a sandboxed Python subset. Hand-written lexer, Pratt parser that emits bytecode directly (no AST), and a threaded-code interpreter with dual inline caching (scalar + instance-dunder), super-instruction fusion, and pure-function template memoization. Deterministic execution; around 170 KB WASM release.

Architecture

Single-pass pipeline: source -> SSA bytecode chunk; stack interpreter with adaptive inline caching and pure-function memoization.

  • Lexer (modules/lexer/) LUT-driven, offset-based tokens.
  • Parser (modules/parser/) Pratt precedence, SSA-versioned bytecode with Phi at joins, no AST.
  • Optimizer (modules/vm/optimizer.rs) constant folding, Phi-noop elimination, dead-code compaction.
  • VM (modules/vm/) flat-match dispatch, scalar + instance-dunder inline caches, pure-function template memoization, NaN-boxed 64-bit Val with a mark-and-sweep arena.
  • Resolver (modules/packages/) host-injected; native imports register for CallExtern dispatch.

Full rationale, NaN-box patterns, IC thresholds, GC roots, and intentional omissions: Design. Lexer and parser internals: Lexical, Syntax.

Layout

├── src
│   ├── main
│   ├── modules
│   │   ├── lexer
│   │   ├── packages
│   │   ├── parser
│   │   └── vm
│   └── util
└── tests
    └── cases

Quick start

cargo wasm # release WASM artifact -> target/wasm32-unknown-unknown/release/compiler.wasm
cargo test --release --no-default-features # host-side test suite (skips the prebuilt wasm download, only needed by external consumers)

cargo wasm is a workspace alias (.cargo/config.toml) for cargo build --release --target wasm32-unknown-unknown -p edge-python. Plain cargo build --release produces host artifacts (.rlib + cdylib) for embedders linking compiler. To add native modules from your own crate, implement the Resolver trait, see Writing modules.

The host runtime owns I/O, network, and module fetching; there is no native CLI. Browser hosts use the runtime/ JS package; server/edge runtimes use wasmtime, wasmer, Cloudflare Workers, Fastly Compute, Spin.

Consuming the release from another Rust crate

This crate declares links = "compiler" and its build.rs downloads the matching compiler.wasm from the GitHub Release for CARGO_PKG_VERSION into OUT_DIR. Downstream crates read the absolute path through DEP_COMPILER_LIB_WASM.

# Downstream Cargo.toml
[dependencies]
edge-python = { git = "https://github.com/dylan-sutton-chavez/edge-python", tag = "v0.1.0" }
// Downstream build.rs
fn main() {
    println!("cargo::rerun-if-changed=build.rs");
    let wasm = std::env::var("DEP_COMPILER_LIB_WASM").expect("`DEP_COMPILER_LIB_WASM` unset, upstream `edge-python` must declare `links = \"compiler\"`");
    std::fs::copy(&wasm, "runtime/compiler.wasm").expect("copy failed");
}

The download URL is derived from CARGO_PKG_VERSION, so a tag bump is the only retarget. Use branch = "main" for unreleased work. Requires curl on PATH; gated by the default-on prebuilt feature.

References

  1. Aho, Sethi & Ullman, Compilers: Principles, Techniques and Tools (1986). LUT-based lexer.
  2. Pratt, Top Down Operator Precedence (POPL 1973). Precedence climbing parser.
  3. Cytron et al., Efficiently Computing Static Single Assignment Form (TOPLAS 1991). SSA, φ-nodes.
  4. Gudeman, Representing Type Information in Dynamically Typed Languages (1993). NaN-boxing.
  5. Deutsch & Schiffman, Efficient Implementation of the Smalltalk-80 System (POPL 1984). Inline caching.
  6. Ertl & Gregg, The Structure and Performance of Efficient Interpreters (JILP 2003). Threaded dispatch.
  7. Hölzle & Ungar, Optimizing Dynamically-Dispatched Calls with Run-Time Type Feedback (PLDI 1994).
  8. Casey et al., Towards Superinstructions for Java Interpreters (SCOPES 2003). LoadAttr+Call fusion.
  9. Michie, Memo Functions and Machine Learning (Nature 1968). Pure-function memoization.
  10. McCarthy, Recursive Functions of Symbolic Expressions (CACM 1960). Mark-sweep GC.
  11. Backus, Can Programming Be Liberated from the von Neumann Style? (CACM 1978). Function-level paradigm.