docsite/docs/api/jit.md
4 additions & 4 deletions
@@ -4,7 +4,7 @@ sidebar_position: 9
# JIT Compilation API
-
[VSA](/docs/concepts/glossary) operations run in loops over thousands of vector elements. The JIT compiler replaces these loops with native SIMD instructions, processing 16--32 elements per CPU cycle. Result: **15--260x speedup** on hot paths. You do not need to understand JIT internals -- just create an engine and call the same operations.
+
[VSA](/concepts/glossary) operations run in loops over thousands of vector elements. The JIT compiler replaces these loops with native SIMD instructions, processing 16--32 elements per CPU cycle. Result: **15--260x speedup** on hot paths. You do not need to understand JIT internals -- just create an engine and call the same operations.
The JIT system compiles specialized machine code for your exact vector dimension at runtime. The first call for a given dimension compiles the function. Every subsequent call reuses the cached native code.
@@ -67,7 +67,7 @@ Frees all compiled functions, executable memory, and caches.
-
Computes the [dot product](/docs/concepts/glossary) of two hypervectors using JIT-compiled SIMD code. Vectors are automatically unpacked before the operation. The function compiles on first use for the given dimension and caches for reuse.
+
Computes the [dot product](/concepts/glossary) of two hypervectors using JIT-compiled SIMD code. Vectors are automatically unpacked before the operation. The function compiles on first use for the given dimension and caches for reuse.
-
Element-wise ternary multiplication ([binding](/docs/concepts/glossary)). **Modifies `a` in place.** The result vector `a` is marked dirty so the packed representation recomputes on next access.
+
Element-wise ternary multiplication ([binding](/concepts/glossary)). **Modifies `a` in place.** The result vector `a` is marked dirty so the packed representation recomputes on next access.
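As a point of comparison for what the JIT-compiled dot product and binding compute, a scalar reference might look like the sketch below. `dotScalar` and `bindScalar` are hypothetical names that work on plain `i8` slices of unpacked trits; they are not Trinity's API, and the engine's own types and signatures may differ.

```zig
const std = @import("std");

/// Hypothetical scalar reference: dot product over unpacked trits
/// (each element is -1, 0, or +1).
fn dotScalar(a: []const i8, b: []const i8) i64 {
    std.debug.assert(a.len == b.len);
    var acc: i64 = 0;
    var i: usize = 0;
    while (i < a.len) : (i += 1) {
        acc += @as(i64, a[i]) * @as(i64, b[i]);
    }
    return acc;
}

/// Hypothetical scalar reference: binding as element-wise ternary
/// multiplication. Overwrites `a`, mirroring the in-place contract.
fn bindScalar(a: []i8, b: []const i8) void {
    std.debug.assert(a.len == b.len);
    var i: usize = 0;
    while (i < a.len) : (i += 1) {
        a[i] = a[i] * b[i]; // a product of two trits is again a trit
    }
}

pub fn main() void {
    var a = [_]i8{ 1, -1, 0, 1 };
    const b = [_]i8{ -1, -1, 1, 1 };
    std.debug.print("dot = {d}\n", .{dotScalar(&a, &b)}); // prints "dot = 1"
    bindScalar(&a, &b); // a is now { -1, 1, 0, 1 }
    std.debug.print("bound = {any}\n", .{a});
}
```

A loop of this shape is the kind of per-element work that the documentation says the JIT replaces with SIMD instructions.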
```zig
try engine.bind(&vec_a, &vec_b); // vec_a now holds the bound result
@@ -88,7 +88,7 @@ try engine.bind(&vec_a, &vec_b); // vec_a now holds the bound result
-
Element-wise sum with ternary threshold ([bundling](/docs/concepts/glossary)). **Modifies `a` in place.** For each position: positive sum becomes `+1`, negative sum becomes `-1`, zero stays `0`.
+
Element-wise sum with ternary threshold ([bundling](/concepts/glossary)). **Modifies `a` in place.** For each position: positive sum becomes `+1`, negative sum becomes `-1`, zero stays `0`.
```zig
try engine.bundle(&vec_a, &vec_b); // vec_a now holds the bundled result
docsite/docs/api/sequence-hdc.md
4 additions & 4 deletions
@@ -6,7 +6,7 @@ sidebar_position: 8
This module turns text into vectors. Feed it strings like "hello world", and it produces compact numeric vectors that capture the text's pattern. Similar texts produce similar vectors. Use it for language detection, text classification, or semantic search -- without training a neural network.
-
Under the hood, the module uses [Hyperdimensional Computing](/docs/concepts/glossary) (HDC). It maps characters to high-dimensional [ternary vectors](/docs/concepts/glossary) (\{-1, 0, +1\}), then combines them to represent words, phrases, and documents. The key insight: texts that share character patterns produce vectors that point in similar directions.
+
Under the hood, the module uses [Hyperdimensional Computing](/concepts/glossary) (HDC). It maps characters to high-dimensional [ternary vectors](/concepts/glossary) (\{-1, 0, +1\}), then combines them to represent words, phrases, and documents. The key insight: texts that share character patterns produce vectors that point in similar directions.
**Source:** `src/sequence_hdc.zig`
@@ -22,7 +22,7 @@ graph LR
D --> E["Compare via<br/>cosine similarity"]
```
-
1. **Split** the input into overlapping character [n-grams](/docs/concepts/glossary) (e.g., trigrams).
+
1. **Split** the input into overlapping character [n-grams](/concepts/glossary) (e.g., trigrams).
2. **Encode** each n-gram by looking up character vectors and combining them.
3. **Bundle** all n-gram vectors into a single vector using majority vote.
4. **Compare** the result to stored vectors using cosine similarity.
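Steps 1 and 3 of the list above can be sketched in a few lines. This is a minimal illustration on plain slices, not the module's implementation: `ngramCount`, `ngramAt`, and `bundleMajority` are hypothetical names, and treating text as raw bytes is an assumption.

```zig
const std = @import("std");

/// Step 1: number of overlapping n-grams in a text of `len` bytes
/// (0 when the text is shorter than n).
fn ngramCount(len: usize, n: usize) usize {
    if (n == 0 or len < n) return 0;
    return len - n + 1;
}

/// Step 1: the i-th overlapping n-gram, as a slice into `text`.
fn ngramAt(text: []const u8, n: usize, i: usize) []const u8 {
    std.debug.assert(i < ngramCount(text.len, n));
    return text[i .. i + n];
}

/// Step 3: majority vote across trit vectors. For each position the
/// result is the sign of the column sum (ties become 0).
fn bundleMajority(vectors: []const []const i8, out: []i8) void {
    var i: usize = 0;
    while (i < out.len) : (i += 1) {
        var sum: i32 = 0;
        var j: usize = 0;
        while (j < vectors.len) : (j += 1) {
            sum += @as(i32, vectors[j][i]);
        }
        const s: i8 = if (sum > 0) 1 else if (sum < 0) -1 else 0;
        out[i] = s;
    }
}

pub fn main() void {
    const text = "hello";
    var i: usize = 0;
    while (i < ngramCount(text.len, 3)) : (i += 1) {
        std.debug.print("{s}\n", .{ngramAt(text, 3, i)}); // hel, ell, llo
    }
}
```

A text of length `L` yields `L - n + 1` overlapping n-grams, so `"hello"` with trigrams gives three windows.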
@@ -103,7 +103,7 @@ The n-gram size controls how much local context each encoding captures.
## ItemMemory
-
Maps symbol IDs (or ASCII characters) to deterministically generated random [hypervectors](/docs/concepts/glossary). Vectors are lazily created on first access and cached in a `HashMap`.
+
Maps symbol IDs (or ASCII characters) to deterministically generated random [hypervectors](/concepts/glossary). Vectors are lazily created on first access and cached in a `HashMap`.
Each trit in a generated vector is uniformly random from \{-1, 0, +1\}, seeded by `symbol_id * 2654435761 + seed` using the standard PRNG.
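The seed expression can be written out as a small helper. This is a sketch under stated assumptions: `symbolSeed` is a hypothetical name, and the 64-bit width and wrapping operators (`*%`, `+%`) are guesses, since the docs give the formula but not the integer type or overflow behaviour.

```zig
const std = @import("std");

/// Derives a per-symbol PRNG seed as `symbol_id * 2654435761 + seed`.
/// Assumption: u64 operands with wrapping arithmetic, so the result is
/// well defined for any input.
fn symbolSeed(symbol_id: u64, seed: u64) u64 {
    return symbol_id *% 2654435761 +% seed;
}

pub fn main() void {
    // The same (symbol, seed) pair always yields the same derived seed,
    // which is what makes lazily generated vectors reproducible.
    std.debug.print("{d}\n", .{symbolSeed('a', 42)});
}
```

The multiplier 2654435761 is a constant commonly used in multiplicative hashing; it spreads consecutive symbol IDs far apart before the base seed is added.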
@@ -147,7 +147,7 @@ Encodes an entire string as an array of character hypervectors. Returns a newly
## NGramEncoder
-
Encodes character [n-grams](/docs/concepts/glossary) using position-encoded binding. Each character in an n-gram shifts by its position index, then all characters bind together. This preserves order: "abc" and "bac" produce different vectors.
+
Encodes character [n-grams](/concepts/glossary) using position-encoded binding. Each character in an n-gram shifts by its position index, then all characters bind together. This preserves order: "abc" and "bac" produce different vectors.
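To see why shifting by position makes the encoding order-sensitive, here is a minimal sketch. It assumes that "shift" means a cyclic rotation by the position index and that binding is element-wise trit multiplication; `shifted` and `encodeNGram` are hypothetical names, not the encoder's actual methods.

```zig
const std = @import("std");

/// Element `i` of `v` after a cyclic rotation by `k` positions.
/// Assumption: the encoder's "shift" is a rotation of this kind.
fn shifted(v: []const i8, k: usize, i: usize) i8 {
    return v[(i + v.len - (k % v.len)) % v.len];
}

/// Hypothetical n-gram encoding: rotate character j by j positions,
/// then bind (multiply) all rotated characters element-wise.
fn encodeNGram(chars: []const []const i8, out: []i8) void {
    var i: usize = 0;
    while (i < out.len) : (i += 1) {
        var acc: i8 = 1;
        var j: usize = 0;
        while (j < chars.len) : (j += 1) {
            std.debug.assert(chars[j].len == out.len);
            acc *= shifted(chars[j], j, i);
        }
        out[i] = acc;
    }
}

pub fn main() void {
    const a = [_]i8{ 1, -1, 1, 1 };
    const b = [_]i8{ 1, 1, -1, 1 };
    const ab = [_][]const i8{ &a, &b };
    const ba = [_][]const i8{ &b, &a };
    var out_ab = [_]i8{0} ** 4;
    var out_ba = [_]i8{0} ** 4;
    encodeNGram(&ab, &out_ab); // { 1, -1, 1, -1 }
    encodeNGram(&ba, &out_ba); // { 1, 1, 1, 1 }
    std.debug.print("{any}\n{any}\n", .{ out_ab, out_ba });
}
```

Without the rotation, multiplication is commutative and both orders would produce the same vector; the position-dependent shift is what separates them.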
docsite/docs/api/sparse.md
1 addition & 1 deletion
@@ -6,7 +6,7 @@ sidebar_position: 10
When most elements in your vector are zero, storing all of them wastes memory. `SparseVector` stores only the non-zero elements with their positions. For a 10,000-element vector with 90% zeros, only about 1,000 entries are stored, so memory use and per-operation work can drop by up to roughly 10x.
-
Trinity uses [ternary vectors](/docs/concepts/glossary) (\{-1, 0, +1\}). Many operations -- masking, gating, thresholding -- produce vectors dominated by zeros. `SparseVector` exploits this by keeping two sorted arrays: indices (where non-zero elements live) and values (what those elements are). All lookups use binary search. All VSA operations use merge-join algorithms that skip zeros entirely.
+
Trinity uses [ternary vectors](/concepts/glossary) (\{-1, 0, +1\}). Many operations -- masking, gating, thresholding -- produce vectors dominated by zeros. `SparseVector` exploits this by keeping two sorted arrays: indices (where non-zero elements live) and values (what those elements are). All lookups use binary search. All VSA operations use merge-join algorithms that skip zeros entirely.
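The merge-join idea can be illustrated with a sparse dot product. This is a sketch, not `SparseVector` itself: the `Sparse` struct, its field names, the `u32` index type, and `sparseDot` are all assumptions made for the example.

```zig
const std = @import("std");

/// Hypothetical sparse layout: two parallel arrays, with `indices`
/// strictly ascending and `values` holding the non-zero trits.
const Sparse = struct {
    indices: []const u32,
    values: []const i8,
};

/// Merge-join dot product: walk both index arrays in lockstep and
/// multiply only where the same index appears on both sides.
fn sparseDot(a: Sparse, b: Sparse) i64 {
    var i: usize = 0;
    var j: usize = 0;
    var acc: i64 = 0;
    while (i < a.indices.len and j < b.indices.len) {
        if (a.indices[i] < b.indices[j]) {
            i += 1; // index only in a: contributes zero
        } else if (a.indices[i] > b.indices[j]) {
            j += 1; // index only in b: contributes zero
        } else {
            acc += @as(i64, a.values[i]) * @as(i64, b.values[j]);
            i += 1;
            j += 1;
        }
    }
    return acc;
}

pub fn main() void {
    const a = Sparse{ .indices = &[_]u32{ 1, 4, 7 }, .values = &[_]i8{ 1, -1, 1 } };
    const b = Sparse{ .indices = &[_]u32{ 4, 5, 7 }, .values = &[_]i8{ -1, 1, 1 } };
    std.debug.print("{d}\n", .{sparseDot(a, b)}); // prints 2
}
```

The loop runs in time proportional to the number of stored entries rather than the full dimension, which is where the benefit for mostly-zero vectors comes from.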
-
These are kernel benchmark numbers measuring raw computation speed, not end-to-end text generation. See [GPU Inference Benchmarks](/docs/benchmarks/gpu-inference) for methodology.
+
These are kernel benchmark numbers measuring raw computation speed, not end-to-end text generation. See [GPU Inference Benchmarks](/benchmarks/gpu-inference) for methodology.
---
@@ -99,4 +99,4 @@ Trinity is positioned as the **green computing leader** in LLM inference. The te
- GPT-4/Claude: Estimated from API response times
- All coherence verified with standard prompts (12/12 coherent responses for Trinity)
-
See [BitNet Coherence Report](/docs/research/bitnet-report) for detailed test methodology.
+
See [BitNet Coherence Report](/research/bitnet-report) for detailed test methodology.
The numbers above are for the BitNet b1.58-2B-4T model (2.4 billion parameters) using the bitnet.cpp inference engine with I2_S quantization. Actual throughput depends on batch size, sequence length, and system configuration.
:::caution
-
These throughput figures represent bitnet.cpp kernel benchmark results (measuring raw computation speed), not end-to-end text generation throughput. End-to-end generation speed is substantially lower due to sequential token generation, memory transfers, and tokenizer overhead. See the [BitNet Coherence Report](/docs/research/bitnet-report) for measured end-to-end generation speeds.
+
These throughput figures represent bitnet.cpp kernel benchmark results (measuring raw computation speed), not end-to-end text generation throughput. End-to-end generation speed is substantially lower due to sequential token generation, memory transfers, and tokenizer overhead. See the [BitNet Coherence Report](/research/bitnet-report) for measured end-to-end generation speeds.
docsite/docs/benchmarks/index.md
4 additions & 4 deletions
@@ -33,19 +33,19 @@ Ternary \{-1, 0, +1\} weights eliminate the need for multiplication in matrix-ve
### GPU Inference
-
BitNet b1.58 models running on consumer and datacenter GPUs achieve throughput measured in hundreds of thousands of tokens per second for small models. Performance varies by GPU type, model size, and batch configuration. See [GPU Inference Benchmarks](/docs/benchmarks/gpu-inference) for detailed numbers.
+
BitNet b1.58 models running on consumer and datacenter GPUs achieve throughput measured in hundreds of thousands of tokens per second for small models. Performance varies by GPU type, model size, and batch configuration. See [GPU Inference Benchmarks](/benchmarks/gpu-inference) for detailed numbers.
### JIT Compilation
-
Trinity includes a custom JIT compiler with backends for ARM64 (Apple Silicon, Raspberry Pi, etc.) and x86-64 (Intel/AMD). VSA operations such as bind, bundle, dot product, and permute are compiled to native machine code at runtime, with compiled functions cached for reuse. See [JIT Compilation Performance](/docs/benchmarks/jit-performance) for architecture-specific results.
+
Trinity includes a custom JIT compiler with backends for ARM64 (Apple Silicon, Raspberry Pi, etc.) and x86-64 (Intel/AMD). VSA operations such as bind, bundle, dot product, and permute are compiled to native machine code at runtime, with compiled functions cached for reuse. See [JIT Compilation Performance](/benchmarks/jit-performance) for architecture-specific results.
### Memory Efficiency
-
The framework provides multiple memory representations optimized for different use cases: HybridBigInt with lazy packed/unpacked conversion, bit-packed trit arrays, and sparse COO-format vectors for data with many zeros. A 10,000-dimensional vector that would consume 40KB in float32 fits in roughly 2.5KB using packed ternary encoding. See [Memory Efficiency](/docs/benchmarks/memory-efficiency) for a detailed breakdown.
+
The framework provides multiple memory representations optimized for different use cases: HybridBigInt with lazy packed/unpacked conversion, bit-packed trit arrays, and sparse COO-format vectors for data with many zeros. A 10,000-dimensional vector that would consume 40KB in float32 fits in roughly 2.5KB using packed ternary encoding. See [Memory Efficiency](/benchmarks/memory-efficiency) for a detailed breakdown.
### Competitor Comparison
-
How does Trinity stack up against Groq, GPT-4, and other LLM providers? Trinity offers 35-52 tok/s on CPU with self-hosted costs of $0.01-0.35/hr, compared to cloud providers charging per-token fees. See [Competitor Comparison](/docs/benchmarks/competitor-comparison) for detailed benchmarks and cost analysis.
+
How does Trinity stack up against Groq, GPT-4, and other LLM providers? Trinity offers 35-52 tok/s on CPU with self-hosted costs of $0.01-0.35/hr, compared to cloud providers charging per-token fees. See [Competitor Comparison](/benchmarks/competitor-comparison) for detailed benchmarks and cost analysis.
docsite/docs/concepts/balanced-ternary.md
8 additions & 8 deletions
@@ -77,7 +77,7 @@ Trinity represents trits in memory using a compact **packed encoding** that stor
This encoding uses 2 bits per trit, achieving an effective density of 1.585 / 2 = 79.3% of the theoretical maximum. While not perfectly optimal (the theoretical minimum is log2(3) = 1.585 bits per trit), the 2-bit encoding enables fast bitwise operations and aligns naturally with byte boundaries.
-
The [HybridBigInt](/docs/api/hybrid) type in Trinity manages this encoding transparently. It maintains two representations: a **packed** form for memory-efficient storage and an **unpacked** form (an array of individual trit values) for fast computation. Conversions between the two are performed lazily -- only when needed -- and are cached to avoid redundant work.
+
The [HybridBigInt](/api/hybrid) type in Trinity manages this encoding transparently. It maintains two representations: a **packed** form for memory-efficient storage and an **unpacked** form (an array of individual trit values) for fast computation. Conversions between the two are performed lazily -- only when needed -- and are cached to avoid redundant work.
With this encoding, a 256-trit vector (a common dimension in Trinity's VSA operations) occupies just 64 bytes in packed form, compared to 256 bytes if each trit were stored in a full byte, or 1024 bytes if stored as 32-bit floats.
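The size figures follow from storing four trits per byte. The sketch below shows one possible 2-bit layout; the code assignment (`0 -> 0b00`, `+1 -> 0b01`, `-1 -> 0b10`) and the names `packedBytes`, `setTrit`, and `getTrit` are assumptions for illustration and are not taken from Trinity's source.

```zig
const std = @import("std");

/// Bytes needed for `dim` trits at 2 bits each (4 trits per byte).
fn packedBytes(dim: usize) usize {
    return (dim + 3) / 4;
}

/// Bit offset of trit `i` inside its byte.
fn shiftFor(i: usize) u3 {
    return switch (i % 4) {
        0 => 0,
        1 => 2,
        2 => 4,
        else => 6,
    };
}

/// Assumed code assignment: 0 -> 0b00, +1 -> 0b01, -1 -> 0b10,
/// so a zeroed buffer decodes to all-zero trits.
fn setTrit(buf: []u8, i: usize, t: i8) void {
    const code: u8 = switch (t) {
        0 => 0,
        1 => 1,
        -1 => 2,
        else => unreachable, // not a trit
    };
    const shift = shiftFor(i);
    const mask: u8 = @as(u8, 3) << shift;
    buf[i / 4] = (buf[i / 4] & ~mask) | (code << shift);
}

fn getTrit(buf: []const u8, i: usize) i8 {
    const code: u8 = (buf[i / 4] >> shiftFor(i)) & 3;
    return switch (code) {
        0 => 0,
        1 => 1,
        2 => -1,
        else => unreachable, // 0b11 is unused in this layout
    };
}

pub fn main() void {
    var buf = [_]u8{0} ** 2; // room for 8 trits
    setTrit(&buf, 5, -1);
    std.debug.print("{d} {d}\n", .{ getTrit(&buf, 5), packedBytes(256) }); // prints "-1 64"
}
```

With this layout `packedBytes(256)` is 64, matching the figure above, against 256 bytes at one byte per trit and 256 x 4 = 1024 bytes as 32-bit floats.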
@@ -97,13 +97,13 @@ With this encoding, a 256-trit vector (a common dimension in Trinity's VSA opera
The balanced ternary representation is the foundation of every subsystem in Trinity:
-
- **VSA operations** ([bind, unbind, bundle](/docs/api/vsa)) operate element-wise on ternary vectors. Binding uses trit multiplication; unbinding is identical to binding (the operation is its own inverse for non-zero trits).
-
- **BitNet inference** ([Firebird](/docs/api/firebird)) quantizes LLM weights to \{-1, 0, +1\}, turning matrix multiplications into accumulations.
-
- **The Ternary VM** ([VM](/docs/api/vm)) executes bytecode with a ternary instruction set, operating on ternary stack values.
+
- **VSA operations** ([bind, unbind, bundle](/api/vsa)) operate element-wise on ternary vectors. Binding uses trit multiplication; unbinding is identical to binding (the operation is its own inverse for non-zero trits).
+
- **BitNet inference** ([Firebird](/api/firebird)) quantizes LLM weights to \{-1, 0, +1\}, turning matrix multiplications into accumulations.
+
- **The Ternary VM** ([VM](/api/vm)) executes bytecode with a ternary instruction set, operating on ternary stack values.
## Further Reading
-
- [Ternary Computing Concepts](/docs/concepts) -- overview and motivation
-
- [The Trinity Identity](/docs/concepts/trinity-identity) -- why the golden ratio connects to base-3
-
- [VSA API Reference](/docs/api/vsa) -- ternary vector operations
-
- [HybridBigInt API Reference](/docs/api/hybrid) -- packed trit storage
+
- [Ternary Computing Concepts](/concepts) -- overview and motivation
+
- [The Trinity Identity](/concepts/trinity-identity) -- why the golden ratio connects to base-3
+
- [VSA API Reference](/api/vsa) -- ternary vector operations
+
- [HybridBigInt API Reference](/api/hybrid) -- packed trit storage