
feat: add toy character-level transformer language model #161

Open
jbrown1618 wants to merge 1 commit into main from feat/toy-language-model

Conversation

@jbrown1618
Owner

Summary

Implements a minimal GPT-style transformer language model operating at character level, with deliberately tiny dimensions suitable for this library's toy scale.

New Components

| File | Description |
| --- | --- |
| `Tokenizer.ts` | Character-level tokenizer (a-z + space → indices 0-26) |
| `TransformerBlock.ts` | Single-head self-attention with causal masking, feed-forward network (ReLU), residual connections |
| `LanguageModel.ts` | Full model: embeddings → transformer → output projection; `train()` via finite-difference gradients; `generate()` with greedy decoding |
| `NumberUtilities.ts` | Added numerically stable `softmax()` utility |
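As a rough sketch of what a numerically stable softmax looks like (function name matches the description above, but the exact signature in `NumberUtilities.ts` is an assumption):

```typescript
// Sketch of a numerically stable softmax. Subtracting the max logit before
// exponentiating keeps Math.exp() from overflowing on large inputs without
// changing the result, since softmax is shift-invariant.
function softmax(logits: number[]): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}
```

With the max-subtraction, even `softmax([1000, 1000])` returns `[0.5, 0.5]` instead of `NaN` from an `Infinity / Infinity` division.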

Architecture

  • Vocabulary: 27 characters (space + a-z)
  • Default dimensions: embedding dim = 8, feed-forward dim = 16, context length = 16
  • Training: Finite-difference gradient estimation (suitable for toy scale only)
  • Generation: Autoregressive with greedy decoding
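Finite-difference training, as used here, estimates each partial derivative by perturbing one parameter at a time, which is why it only scales to toy models. A minimal sketch of the idea (the helper name and loss-function shape are illustrative, not the PR's API):

```typescript
// Forward-difference gradient estimate: dL/dθ_i ≈ (L(θ + ε·e_i) − L(θ)) / ε.
// This costs one full loss evaluation per parameter, so it is only viable
// at toy scale, as the PR notes.
function finiteDifferenceGradient(
  loss: (params: number[]) => number,
  params: number[],
  epsilon = 1e-4
): number[] {
  const base = loss(params);
  return params.map((p, i) => {
    const perturbed = params.slice();
    perturbed[i] = p + epsilon; // nudge one parameter
    return (loss(perturbed) - base) / epsilon;
  });
}
```

For a model with P parameters, one gradient step needs P + 1 forward passes, versus a single backward pass for backpropagation.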

Tests

36 new tests across 4 test files cover all new components. All 694 tests pass and the build succeeds.

Implement a minimal GPT-style transformer operating at character level
with deliberately tiny dimensions, suitable for this library's toy scale:

- Tokenizer: character-level encoding (a-z + space, vocab size 27)
- softmax: numerically stable softmax utility function
- TransformerBlock: single-head self-attention with causal masking,
  feed-forward network with ReLU, and residual connections
- LanguageModel: full model composing embeddings, positional encodings,
  transformer block, and output projection. Supports train() via
  finite-difference gradients and generate() with greedy decoding.
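The generate-with-greedy-decoding loop described above can be sketched as follows; `nextTokenLogits` stands in for the model's forward pass, and the names here are assumptions rather than the PR's actual API:

```typescript
// Autoregressive greedy decoding: repeatedly feed the last `contextLength`
// tokens to the model and append the argmax of the predicted logits.
function generateGreedy(
  nextTokenLogits: (context: number[]) => number[],
  prompt: number[],
  maxNewTokens: number,
  contextLength = 16
): number[] {
  const tokens = prompt.slice();
  for (let step = 0; step < maxNewTokens; step++) {
    // A fixed-window model only sees the most recent contextLength tokens.
    const context = tokens.slice(-contextLength);
    const logits = nextTokenLogits(context);
    // Greedy: always pick the highest-scoring token.
    let best = 0;
    for (let i = 1; i < logits.length; i++) {
      if (logits[i] > logits[best]) best = i;
    }
    tokens.push(best);
  }
  return tokens;
}
```

Greedy decoding is deterministic: the same prompt always yields the same continuation, which keeps tests reproducible at the cost of less varied output.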

Includes comprehensive tests for all new components (36 new tests).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>