feat: add toy character-level transformer language model by jbrown1618 · Pull Request #161 · jbrown1618/vector

jbrown1618 · 2026-04-12T22:45:26Z

Summary

Implements a minimal GPT-style transformer language model operating at character level, with deliberately tiny dimensions suitable for this library's toy scale.

New Components

| File | Description |
| --- | --- |
| `Tokenizer.ts` | Character-level tokenizer (a-z + space → indices 0-26) |
| `TransformerBlock.ts` | Single-head self-attention with causal masking, feed-forward network (ReLU), residual connections |
| `LanguageModel.ts` | Full model: embeddings → transformer → output projection; `train()` via finite-difference gradients; `generate()` with greedy decoding |
| `NumberUtilities.ts` | Added numerically stable `softmax()` utility |
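
For readers unfamiliar with these pieces, here is a minimal sketch of the tokenizer, the stable softmax, and causal masking. It is not the code in this PR. The function names (`encode`, `decode`, `stableSoftmax`, `causalAttentionWeights`) are hypothetical, and the vocabulary order (space at index 0, then a-z) is an assumption, since the description only says that the 27 characters map to indices 0-26.

```typescript
// ASSUMPTION: space is index 0 and a-z are 1-26; the PR does not state the order.
const VOCAB = " abcdefghijklmnopqrstuvwxyz";

// Map each character to its vocabulary index, rejecting unknown characters.
function encode(text: string): number[] {
  return Array.from(text, (ch) => {
    const index = VOCAB.indexOf(ch);
    if (index < 0) throw new Error(`Character not in vocabulary: "${ch}"`);
    return index;
  });
}

// Map vocabulary indices back to a string.
function decode(indices: number[]): string {
  return indices
    .map((i) => {
      if (!Number.isInteger(i) || i < 0 || i >= VOCAB.length) {
        throw new Error(`Index out of range: ${i}`);
      }
      return VOCAB.charAt(i);
    })
    .join("");
}

// Numerically stable softmax: subtracting the maximum logit before
// exponentiating avoids overflow without changing the result.
function stableSoftmax(logits: number[]): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Causal masking: row i (the query position) may only attend to key
// positions j <= i. Masked scores become -Infinity, so their weight is 0.
function causalAttentionWeights(scores: number[][]): number[][] {
  return scores.map((row, i) =>
    stableSoftmax(row.map((s, j) => (j <= i ? s : -Infinity)))
  );
}
```

With large logits such as `[1000, 1000]`, a naive `Math.exp(1000)` overflows to `Infinity` and the division yields `NaN`. The shifted version returns equal probabilities instead.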

Architecture

Vocabulary: 27 characters (space + a-z)
Default dimensions: embedding dim = 8, feed-forward dim = 16, context length = 16
Training: Finite-difference gradient estimation (suitable for toy scale only)
Generation: Autoregressive with greedy decoding
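
The training and generation strategies above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation. It uses a central difference, although the description does not say which finite-difference scheme `train()` uses. The names `finiteDifferenceGradient`, `trainStep`, `argmax`, and `generateGreedy` are hypothetical, and the model is stood in for by a plain function from context tokens to logits.

```typescript
type Loss = (params: number[]) => number;

// Central-difference gradient estimate (ASSUMPTION: the PR may use a
// different scheme). Costs two loss evaluations per parameter, which is
// why this approach only suits toy-sized models.
function finiteDifferenceGradient(loss: Loss, params: number[], epsilon = 1e-4): number[] {
  return params.map((value, i) => {
    const plus = params.slice();
    const minus = params.slice();
    plus[i] = value + epsilon;
    minus[i] = value - epsilon;
    return (loss(plus) - loss(minus)) / (2 * epsilon);
  });
}

// One plain gradient-descent update using the estimated gradient.
function trainStep(loss: Loss, params: number[], learningRate = 0.1): number[] {
  const grad = finiteDifferenceGradient(loss, params);
  return params.map((p, i) => p - learningRate * (grad[i] ?? 0));
}

// Index of the largest value; ties resolve to the first occurrence.
function argmax(values: number[]): number {
  let best = 0;
  values.forEach((v, i) => {
    if (v > (values[best] ?? -Infinity)) best = i;
  });
  return best;
}

// Hypothetical stand-in for the model: context tokens in, next-token logits out.
type NextTokenLogits = (context: number[]) => number[];

// Greedy autoregressive decoding: repeatedly append the most likely next
// token, feeding back only the last `contextLength` tokens.
function generateGreedy(
  nextLogits: NextTokenLogits,
  prompt: number[],
  steps: number,
  contextLength = 16
): number[] {
  const tokens = prompt.slice();
  for (let s = 0; s < steps; s++) {
    const context = tokens.slice(-contextLength);
    tokens.push(argmax(nextLogits(context)));
  }
  return tokens;
}
```

Because greedy decoding always takes the argmax, the same prompt produces the same continuation every time. That determinism makes it easy to test, at the cost of variety in the output.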

Tests

36 new tests across 4 test files cover all new components. All 694 tests pass and the build succeeds.

Implement a minimal GPT-style transformer operating at character level with deliberately tiny dimensions, suitable for this library's toy scale:

- Tokenizer: character-level encoding (a-z + space, vocab size 27)
- softmax: numerically stable softmax utility function
- TransformerBlock: single-head self-attention with causal masking, feed-forward network with ReLU, and residual connections
- LanguageModel: full model composing embeddings, positional encodings, transformer block, and output projection. Supports train() via finite-difference gradients and generate() with greedy decoding.

Includes comprehensive tests for all new components (36 new tests).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

jbrown1618 wants to merge 1 commit into main from feat/toy-language-model
