Tok Developer Quick Reference

Library API

Tok is a Go library for LLM token compression. Import it directly:

import "github.com/GrayCodeAI/tok"

Basic Compression

// One-shot compression
compressed, stats := tok.Compress(text, tok.Aggressive)

// Reusable compressor instance
c := tok.NewCompressor(tok.Adaptive)
output, stats := c.Compress(text)

Token Estimation

// Fast estimate
count := tok.EstimateTokens(text)

// Precise BPE-based count
precise := tok.EstimateTokensPrecise(text)
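
For intuition about why a fast estimate is cheaper than a BPE-based count, a heuristic estimator can be as simple as a character count divided by a fixed ratio. The sketch below is not tok's implementation: `roughTokenEstimate` is a hypothetical helper, and the ratio of about 4 characters per token is an assumed rule of thumb for English text.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// roughTokenEstimate is a hypothetical helper, not part of the tok API.
// It assumes roughly 4 characters per token and rounds up.
func roughTokenEstimate(text string) int {
	n := utf8.RuneCountInString(text)
	return (n + 3) / 4
}

func main() {
	fmt.Println(roughTokenEstimate("hello world")) // 11 characters -> 3
}
```

A heuristic like this runs in a single pass with no vocabulary lookups, which is why it suits quick checks, while an exact tokenizer count is the safer choice when the number must match what a model provider reports.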

Compression Modes

tok.Minimal    // ~20% savings, preserve most content
tok.Aggressive // ~60% savings, aggressive filtering
tok.Adaptive   // Auto-selects based on input characteristics
tok.Surface    // Light surface-level cleanup
tok.Trim       // Remove whitespace and formatting
tok.Extract    // Keep only key information
tok.Core       // Distill to essential meaning
tok.Code       // Code-aware compression
tok.Log        // Optimized for log output
tok.Thread     // Optimized for conversation threads
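
The savings figures quoted above can be read as the fraction of tokens removed by compression. A minimal sketch of that arithmetic, assuming you already have token counts before and after compression; `savingsPercent` is a hypothetical helper, not a tok function.

```go
package main

import "fmt"

// savingsPercent returns the percentage of tokens removed by compression.
// Hypothetical helper for illustration; not part of the tok API.
func savingsPercent(before, after int) float64 {
	if before <= 0 {
		return 0 // avoid dividing by zero on empty input
	}
	return 100 * float64(before-after) / float64(before)
}

func main() {
	// 1000 tokens compressed to 400 tokens removes 600 of 1000.
	fmt.Printf("%.0f%%\n", savingsPercent(1000, 400)) // 60%
}
```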

Pipeline Coordinator Pooling

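// Go only allows internal/ packages to be imported from inside the tok module,
// so this applies to code within the repository.
import "github.com/GrayCodeAI/tok/internal/filter"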

// Use default pool for repeated compression
pool := filter.GetDefaultPool()
coord := pool.Get()
defer pool.Put(coord)

output, stats := coord.Process(input)

// Or create custom pool
customPool := filter.NewCoordinatorPool(myConfig)
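
The Get/Put pattern above is the usual object-pooling idiom: build an expensive object once, then hand it out and take it back instead of rebuilding it. The sketch below shows that idiom with the standard library's `sync.Pool`. It is not tok's implementation; `coordinator`, `coordinatorPool`, and `newCoordinatorPool` are hypothetical stand-ins, and whether `internal/filter` uses `sync.Pool` internally is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"sync"
)

// coordinator is a hypothetical stand-in for an object that is costly to build.
type coordinator struct {
	ws *regexp.Regexp // compiled once per coordinator, reused on every call
}

// process collapses whitespace runs; a placeholder for real pipeline work.
func (c *coordinator) process(input string) string {
	return c.ws.ReplaceAllString(strings.TrimSpace(input), " ")
}

// coordinatorPool wraps sync.Pool so callers get typed Get and Put methods.
type coordinatorPool struct{ p sync.Pool }

func newCoordinatorPool() *coordinatorPool {
	return &coordinatorPool{p: sync.Pool{New: func() any {
		// The expensive setup happens here, only when the pool is empty.
		return &coordinator{ws: regexp.MustCompile(`\s+`)}
	}}}
}

func (cp *coordinatorPool) Get() *coordinator  { return cp.p.Get().(*coordinator) }
func (cp *coordinatorPool) Put(c *coordinator) { cp.p.Put(c) }

func main() {
	pool := newCoordinatorPool()
	coord := pool.Get()
	defer pool.Put(coord) // return it so later callers can reuse it

	fmt.Println(coord.process("a   b \n c")) // a b c
}
```

Pairing `Get` with a deferred `Put` keeps the coordinator from leaking out of the pool on early returns.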

Code-Aware Chunking

// Get language-specific patterns for semantic chunking
patterns := tok.GetLanguagePatterns("go")

Internal Packages

| Package | Purpose |
|---|---|
| internal/filter | 31-layer compression pipeline |
| internal/core | Token estimation, pipeline runner |
| internal/cache | Multi-level caching with git-aware watcher |
| internal/config | Configuration management |
| internal/simd | SIMD optimizations (Go 1.26+) |
| internal/utils | Shared utilities |

Testing

# Run all tests
make test

# Run with race detector
make test-race

# Run benchmarks
make benchmark

# Run adaptive benchmark comparison
make benchmark-adaptive

# Run scenario benchmark suite
make benchmark-suite

# Run ablation tests
make ablation

Performance Tips

  1. Always use coordinator pooling: 10-20x faster than creating new coordinators
  2. Choose the right mode: Adaptive auto-selects; picking a mode manually avoids the selection overhead
  3. Reuse Compressor instances: amortizes setup cost across calls
  4. Use EstimateTokens for budgeting: a fast heuristic token count
  5. Use EstimateTokensPrecise for billing: an accurate BPE-based count
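
As an illustration of the budgeting tip, the sketch below checks whether a set of prompt parts fits a token budget using any estimator function. `fitsBudget` is a hypothetical helper, not part of tok, and the stand-in estimator exists only so the example runs on its own. Passing `tok.EstimateTokens` in its place assumes that function has the signature `func(string) int`, which the snippets above suggest but do not confirm.

```go
package main

import "fmt"

// fitsBudget reports whether the summed estimates of parts stay within budget.
// estimate can be any token counter. Hypothetical helper; not part of tok.
func fitsBudget(estimate func(string) int, budget int, parts ...string) bool {
	total := 0
	for _, p := range parts {
		total += estimate(p)
		if total > budget {
			return false // stop early once the budget is exceeded
		}
	}
	return true
}

func main() {
	// Stand-in estimator (bytes / 4, rounded up) so the example runs without tok.
	estimate := func(s string) int { return (len(s) + 3) / 4 }

	// Each part is 13 bytes, so each estimates to 4 tokens: 8 total, over 5.
	fmt.Println(fitsBudget(estimate, 5, "system prompt", "user question")) // false
}
```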