Quadtrix.cpp is an experimental local LLM project that combines training, inference, and a chat interface in a single repository across three execution paths: a C++ versions, a PyTorch backend, and a React frontend.The C++ version is a decoder-only transformer - no external tensor libraries, no automatic differentiation. It includes character-level tokenization, backpropagation, AdamW optimization, checkpointing, and autoregressive generationd. The PyTorch version offers faster training and inference with BPE tokenization via tiktoken, while the React frontend provides a web UI on top of python backend.You can train small character-level models on CPU or Colab, or move to BPE-based models with the PyTorch backend. All version are designed to be hackable and runnable on consumer hardware without requiring massive GPU clusters.
technical notes: docs
Runs are ranked by validation loss. Lower is better.
| # | Val Loss | Parameters | Time | Hardware | Description |
|---|---|---|---|---|---|
| 1 | 0.7176 | 10.82M | 61.3 min | NVIDIA T4 | Large-scale run, coherent paragraphs, strong convergence |
| 2 | 0.9250 | 1.99M | 6.1 min | NVIDIA T4 | Optimised run, fast training, stable learning |
| 3 | 1.3145 | 0.82M | 39.4 min | x64 CPU | Baseline, small data |
| 4 | 1.6371 | 0.82M | 76.2 min | x64 CPU | Extended CPU training, 3000 iterations |
All runs: @Eamon2009, 2026.
| Metric | Character-Level | Small Scale | Large Scale |
|---|---|---|---|
| Parameters | 0.83M | 2.00M | 19.17M |
| Layers | 4 | 4 | 4 |
| Embedding dim | 128 | 200 | 200 |
| Attention heads | 4 | 4 | 4 |
| Context length | 64 | 200 | 200 |
| Vocab | 105 char | 110 char | ~50K BPE |
| Corpus | TinyStories | TinyStories | Children's Stories |
| Iterations | 3,000 | 5,000 | 5,000 |
| Train loss | 1.5632 | 0.9045 | — |
| Val loss | 1.6371 | 0.9301 | — |
| Gen. gap | 0.0739 | 0.0256 | — |
| Project | Focus | Language | Autograd |
|---|---|---|---|
| nanoGPT | Minimal GPT training | Python | PyTorch |
| minGPT | Educational GPT | Python | PyTorch |
| llama2.c | Inference-oriented C | C | None |
| Quadtrix.cpp | Training + inference + web UI + multi-backend | C++ / Python / TypeScript | Manual C++ + PyTorch |
Create a Python virtual environment
python -m venv .venv
.\.venv\Scripts\python.exe -m pip install --upgrade pip** Install backend and PyTorch dependencies**
cd backend
..\.venv\Scripts\python.exe -m pip install -r requirements.txt
cd ..Train
cd C:\Users\Admin\Documents\GitHub\Quadtrix.cpp
.\.venv\Scripts\python.exe engine\main.pyBuild the C++ executable
Skip if quadtrix.exe already exists. To rebuild:
g++ -std=c++17 -O2 -I. -Iinclude -o quadtrix.exe main.cppFor maximum CPU throughput:
g++ -std=c++17 -O3 -march=native -I. -Iinclude -o quadtrix.exe main.cppQuadtrix is a decoder-only transformer. The architecture follows the standard GPT design with pre-layer normalization, causal self-attention, and residual connections. The C++ version is :
- Character-level tokenizer
- Manual tensor operations
- Analytical backpropagation
- AdamW optimizer with bias correction
- Checkpoint save/load
- Autoregressive generation
The PyTorch path uses torch.nn and tiktoken for faster experimentation and GPU acceleration.
Quadtrix.cpp/
├── main.cpp
├── config/
│ └── config.h
├── include/
│ ├── tensor.h
│ ├── gpt.h
│ └── backward.h
│ └── ...
│
├── data/
│ └── input.txt
├── engine/
│ ├── main.py
│ ├── inference.py
│ └── best_model.pt
├── backend/
│ ├── main.py
│ ├── inference.py
│ └── requirements.txt
├── frontend/
│ ├── src/
│ └── ...
└── quadtrix.exe (after build)
- Vaswani et al., "Attention Is All You Need", NeurIPS 2017.
- Radford et al., "Language Models are Unsupervised Multitask Learners", 2019.
- Karpathy, nanoGPT
- Karpathy, minGPT
MIT

