Commit 0ee1655

session: reverse-engineer reasoning via causal edge diffing (NARS + attention + MoE gate)
1 parent 0168de9 commit 0ee1655

1 file changed

Lines changed: 224 additions & 0 deletions

@@ -0,0 +1,224 @@
# SESSION: Reverse-Engineer Reasoning via Causal Edge Diffing

## MISSION

Extract the structural geometry of "how to think" from:
1. Llama 4 Maverick MoE gate projections (routing topology)
2. Qwen3.5 base→distilled attention diffs (reasoning circuit)
3. Cross-model comparison (scale-invariant reasoning atoms)

Feed into NARS truth values on causal edges. First real training data
for the NARS stack.

## READ FIRST

```bash
cat src/hpc/gguf_indexer.rs   # stream_index_gguf_bf16, classify_tensor
cat src/hpc/nars.rs           # TruthValue, revision, evidence
cat src/hpc/bgz17_bridge.rs   # Base17 type, L1 distance
cat src/hpc/causality.rs      # CausalEdge if it exists
```

## PHASE 1: Index All Models (Q8_0, streaming)

Five GGUF files, all single-shard, ~105 GB total:

```
unsloth/Qwen3.5-27B-GGUF
  → Qwen3.5-27B-Q8_0.gguf    28.59 GB  (base)

Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-GGUF
  → Qwen3.5-27B.Q8_0.gguf    28.59 GB  (distilled v1)

Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2-GGUF
  → Qwen3.5-27B.Q8_0.gguf    28.59 GB  (distilled v2)

unsloth/Qwen3.5-9B-GGUF
  → Qwen3.5-9B-Q8_0.gguf      9.52 GB  (base 9B)

Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-GGUF
  → Qwen3.5-9B.Q8_0.gguf      9.52 GB  (distilled 9B)
```

Use `stream_index_gguf` (the f32 path; Q8_0 needs actual dequantization).
Output: five bgz7 files, one per model, with per-tensor, per-row Base17 projections.

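The "actual dequantization" step can be sketched as follows. In the GGML Q8_0 layout, each block holds one f16 scale followed by 32 signed 8-bit quants (34 bytes), and each value is `scale * quant`. This is a minimal sketch, not the code in `gguf_indexer.rs`: the names `f16_to_f32` and `dequantize_q8_0` are hypothetical, and little-endian block bytes are assumed.

```rust
/// Number of quantized values per Q8_0 block (GGML's QK8_0).
const QK8_0: usize = 32;
/// Bytes per block: a 2-byte f16 scale followed by 32 i8 quants.
const BLOCK_BYTES: usize = 2 + QK8_0;

/// Convert IEEE 754 half-precision bits to f32.
/// (Hypothetical helper; stable std has no f16 type.)
fn f16_to_f32(bits: u16) -> f32 {
    let sign = if bits & 0x8000 != 0 { -1.0f32 } else { 1.0f32 };
    let exp = ((bits >> 10) & 0x1f) as i32;
    let frac = (bits & 0x03ff) as f32;
    match exp {
        // Zero and subnormals: frac * 2^-24
        0 => sign * frac * 2f32.powi(-24),
        // Infinity or NaN
        0x1f => {
            if frac == 0.0 {
                sign * f32::INFINITY
            } else {
                f32::NAN
            }
        }
        // Normal numbers: (1 + frac/1024) * 2^(exp - 15)
        _ => sign * (1.0 + frac / 1024.0) * 2f32.powi(exp - 15),
    }
}

/// Dequantize a buffer of Q8_0 blocks into f32 values.
/// Returns None if the buffer is not a whole number of blocks.
fn dequantize_q8_0(raw: &[u8]) -> Option<Vec<f32>> {
    if raw.len() % BLOCK_BYTES != 0 {
        return None;
    }
    let mut out = Vec::with_capacity(raw.len() / BLOCK_BYTES * QK8_0);
    for block in raw.chunks_exact(BLOCK_BYTES) {
        // Assumption: the scale is stored little-endian.
        let scale = f16_to_f32(u16::from_le_bytes([block[0], block[1]]));
        out.extend(block[2..].iter().map(|&q| scale * f32::from(q as i8)));
    }
    Some(out)
}
```

In a streaming indexer this would run per tensor row, so only one row of f32 values needs to be resident before it is projected to Base17.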
## PHASE 2: Attention Diff (the reasoning circuit)

For each tensor pair (base vs distilled), matched by name:

```rust
// Pseudocode; the actual implementation belongs in causal_diff.rs.
// `threshold`, `max_l1`, `row_count` (rows in this tensor), `layer` and `head`
// are assumed to be in scope; `layer`/`head` would be derived from `name`
// and `row_idx`.
for (name, base_rows, dist_rows) in matched_tensors(base_bgz7, dist_bgz7) {
    let layer_type = classify_tensor(name);

    for (row_idx, (b, d)) in base_rows.zip(dist_rows).enumerate() {
        let distance = b.l1(&d);

        if distance > threshold {
            let edge = CausalEdge64 {
                subject: palette_index(b),  // base archetype
                verb: BECOMES,              // structural transformation
                object: palette_index(d),   // distilled archetype
                truth: TruthValue {
                    frequency: distance as f32 / max_l1 as f32,
                    // NARS evidence: confidence grows with the number of rows,
                    // c = n / (n + 1) = 1 - 1 / (1 + n), matching Phase 4
                    confidence: row_count as f32 / (row_count as f32 + 1.0),
                },
            };

            // Tag with attention-specific metadata
            match classify_projection(name) {
                Q    => emit_q_edge(edge, layer, head),
                K    => emit_k_edge(edge, layer, head),
                V    => emit_v_edge(edge, layer, head),
                O    => emit_o_edge(edge, layer, head),
                Gate => emit_gate_edge(edge, layer),
                _    => emit_generic(edge),
            }
        }
    }
}
```

### What Each Projection Shift Means

```
Q shifted, K stable  → model asks NEW questions of SAME information
                       = learned to LOOK for reasoning structure
                       NARS: high frequency, high confidence

K shifted            → model EXPOSES different features to attention
                       = deeper change, new token-level signals
                       NARS: moderate frequency, lower confidence (rarer)

V shifted            → WHAT gets retrieved changed
                       = content-level reasoning substrate
                       NARS: varies by layer depth

O shifted            → HOW multi-head outputs COMBINE
                       = synthesis/integration change
                       NARS: if high → distillation core is integration

Q+O shift, K stable  → REASONING SCAFFOLD CIRCUIT
                       = the minimal structural change for reasoning
                       These heads ARE the distillation's value
```

### Attention Head Clustering

```
Cluster 1: Q+O shift, K stable  → "reasoning scaffold" heads
Cluster 2: K+V shift            → "representation change" heads
Cluster 3: all stable           → "unchanged capability" heads
Cluster 4: Q shift only         → "query refinement" heads

Each cluster → one Sigma concept node
Cross-model same cluster      → SUPPORTS edge (scale-invariant)
Cross-model different cluster → CONTRADICTS edge (scale-dependent)
```

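The four clusters and the cross-model rule can be sketched as a small classifier. This is an illustration under stated assumptions, not the contents of `attention_cluster.rs`: `HeadShift`, `HeadCluster`, `classify_head` and `cross_model_edge` are hypothetical names, each projection is assumed to have already been reduced to a shifted/stable flag, V is treated as a don't-care for Cluster 1, and Q and O as don't-cares for Cluster 2. Patterns the list does not name fall into an explicit `Unclassified` bucket.

```rust
/// Which projections of one head shifted between base and distilled weights.
/// (Hypothetical type; how each flag is thresholded is left open.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct HeadShift {
    q: bool,
    k: bool,
    v: bool,
    o: bool,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HeadCluster {
    ReasoningScaffold,    // Cluster 1: Q+O shift, K stable
    RepresentationChange, // Cluster 2: K+V shift
    Unchanged,            // Cluster 3: all stable
    QueryRefinement,      // Cluster 4: Q shift only
    Unclassified,         // any pattern the four clusters do not name
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CrossModelEdge {
    Supports,    // same cluster in both models (scale-invariant)
    Contradicts, // different clusters (scale-dependent)
}

/// Map a head's shift pattern to one of the clusters above.
fn classify_head(s: HeadShift) -> HeadCluster {
    match (s.q, s.k, s.v, s.o) {
        (false, false, false, false) => HeadCluster::Unchanged,
        (true, false, false, false) => HeadCluster::QueryRefinement,
        // Assumption: V may or may not shift in a scaffold head.
        (true, false, _, true) => HeadCluster::ReasoningScaffold,
        // Assumption: Q and O may or may not shift alongside K+V.
        (_, true, true, _) => HeadCluster::RepresentationChange,
        _ => HeadCluster::Unclassified,
    }
}

/// Compare the same head position across two models.
fn cross_model_edge(a: HeadCluster, b: HeadCluster) -> CrossModelEdge {
    if a == b {
        CrossModelEdge::Supports
    } else {
        CrossModelEdge::Contradicts
    }
}
```

Keeping `Unclassified` separate avoids silently forcing heads such as "V shift only" into one of the four named clusters.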
## PHASE 3: MoE Gate Topology (from Maverick bgz7)

The Maverick bgz7 already has gate projections indexed.
Extract the gate tensor Base17 patterns separately:

```
blk.{N}.ffn_gate_inp → router gate [n_experts, hidden_dim]
  Each ROW = one expert's activation pattern
  Base17 of that row = expert's structural identity
```

Expert identity in Base17 space:
- Experts with similar Base17 → structurally redundant (SUPPORTS)
- Experts with distant Base17 → specialized (distinct concept nodes)
- Cluster the 128 expert fingerprints → find natural expert groups

Cross-reference with attention: which attention heads' Q projections align
with which expert gate patterns? That alignment is the routing circuit.

```
head_17_Q_pattern ──CAUSES──→ expert_37_gate_pattern
  (this head's queries activate this expert)
  truth: cosine(head_Q_base17, expert_gate_base17)
```

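The `cosine(head_Q_base17, expert_gate_base17)` score can be sketched as below. The real `Base17` type lives in `bgz17_bridge.rs` and is not shown here, so this sketch assumes a hypothetical layout of 17 signed 16-bit coefficients. Cosine similarity ranges over [-1, 1] while a NARS frequency lies in [0, 1]; the linear rescaling in `cosine_to_frequency` is one possible mapping and is an assumption, not something the plan specifies.

```rust
/// Hypothetical stand-in for the Base17 type: 17 signed coefficients.
type Base17 = [i16; 17];

/// Cosine similarity between two Base17 patterns.
/// Returns None when either vector is all zeros (direction undefined).
fn cosine(a: &Base17, b: &Base17) -> Option<f32> {
    let (mut dot, mut norm_a, mut norm_b) = (0.0f64, 0.0f64, 0.0f64);
    for (&x, &y) in a.iter().zip(b.iter()) {
        let (x, y) = (f64::from(x), f64::from(y));
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some((dot / (norm_a.sqrt() * norm_b.sqrt())) as f32)
}

/// Assumed mapping from cosine in [-1, 1] to a frequency in [0, 1].
fn cosine_to_frequency(cos: f32) -> f32 {
    ((cos + 1.0) / 2.0).clamp(0.0, 1.0)
}
```

Under this mapping, orthogonal patterns land at frequency 0.5 rather than 0, which may or may not be the intended reading of "no alignment".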
## PHASE 4: NARS Truth Population

Every edge from Phases 2 and 3 carries a TruthValue:

```rust
struct TruthValue {
    frequency: f32,   // how often this transformation occurs
                      // = proportion of rows in this tensor that shifted
    confidence: f32,  // evidence strength
                      // = 1 - 1/(1+n) = n/(n+1), where n = number of observed rows
                      //   (NARS c = w/(w+k) with evidential horizon k = 1)
}
```

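The two definitions above can be sketched as a per-tensor computation. This is an illustration under stated assumptions: `Base17` is again a hypothetical `[i16; 17]`, and `l1`, `tensor_truth` and the `threshold` parameter are illustrative names, not the API of `nars.rs` or `bgz17_bridge.rs`.

```rust
/// Hypothetical stand-in for the Base17 type: 17 signed coefficients.
type Base17 = [i16; 17];

#[derive(Debug, Clone, Copy, PartialEq)]
struct TruthValue {
    frequency: f32,
    confidence: f32,
}

/// L1 (Manhattan) distance between two Base17 patterns.
fn l1(a: &Base17, b: &Base17) -> u32 {
    a.iter()
        .zip(b.iter())
        .map(|(&x, &y)| (i32::from(x) - i32::from(y)).unsigned_abs())
        .sum()
}

/// Truth value for one tensor pair:
/// frequency  = shifted rows / observed rows,
/// confidence = n / (n + 1) with n = observed rows.
/// Rows are paired by index; surplus rows in the longer slice are ignored.
fn tensor_truth(base: &[Base17], dist: &[Base17], threshold: u32) -> Option<TruthValue> {
    let n = base.len().min(dist.len());
    if n == 0 {
        return None; // no evidence, no truth value
    }
    let shifted = base
        .iter()
        .zip(dist.iter())
        .filter(|&(b, d)| l1(b, d) > threshold)
        .count();
    let n = n as f32;
    Some(TruthValue {
        frequency: shifted as f32 / n,
        confidence: n / (n + 1.0),
    })
}
```

With thousands of rows per tensor, `n / (n + 1)` is very close to 1, so under this definition confidence mainly separates small tensors from large ones.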
NARS revision across models:
```
evidence_27b_v1: (f=0.7, c=0.92)  // 70% of Q rows shifted in 27B v1
evidence_27b_v2: (f=0.8, c=0.92)  // 80% shifted in v2 (more distillation)
evidence_9b:     (f=0.5, c=0.88)  // only 50% shifted in 9B (capacity limit)

revised = nars_revision(evidence_27b_v1, evidence_9b)
  → (f=0.62, c=0.95)  // integrated belief about reasoning scaffold
```

The revised truth tells you: "reasoning scaffold changes affect ~62% of
Q projection rows, at NARS confidence 0.95, scale-dependent (27B > 9B)."

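The revised values in the example follow from the standard NARS revision rule: convert each confidence to an evidence weight `w = k * c / (1 - c)`, add the weights, and take the weight-averaged frequency. A minimal sketch, assuming evidential horizon `k = 1` and inputs with confidence strictly between 0 and 1; this `nars_revision` is an illustration and is not claimed to match the signature in `nars.rs`.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct TruthValue {
    frequency: f32,
    confidence: f32,
}

/// Evidential horizon k (assumed to be 1).
const HORIZON: f32 = 1.0;

/// Evidence weight implied by a confidence: w = k * c / (1 - c).
/// Requires 0 < c < 1.
fn evidence_weight(t: TruthValue) -> f32 {
    HORIZON * t.confidence / (1.0 - t.confidence)
}

/// NARS revision: pool two independent bodies of evidence.
fn nars_revision(a: TruthValue, b: TruthValue) -> TruthValue {
    let (wa, wb) = (evidence_weight(a), evidence_weight(b));
    let w = wa + wb;
    TruthValue {
        // Frequencies are averaged, weighted by their evidence.
        frequency: (wa * a.frequency + wb * b.frequency) / w,
        // Pooled evidence maps back to confidence: c = w / (w + k).
        confidence: w / (w + HORIZON),
    }
}
```

For the inputs above, `w` is 11.5 for 27B v1 and about 7.33 for 9B, giving a revised frequency of about 0.622 and confidence of about 0.950, which rounds to the `(f=0.62, c=0.95)` shown.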
## PHASE 5: Sigma Concept Nodes (Ada Integration)

Each cluster of edges becomes a concept in Ada's graph:

```
Σ.reasoning_scaffold = {
    evidence: [27b_v1_edges, 27b_v2_edges, 9b_edges],
    truth: revised_truth,
    composition: {Q_shift: 0.73, O_shift: 0.82, K_stable: 0.95},
    heads: [17, 23, 24, 31],  // discovered by clustering
    scale_invariant: false,   // 9B diverges
    source: "Qwen3.5 → Claude-Opus distillation"
}

Σ.expert_redundancy = {
    evidence: [maverick_gate_similarities],
    truth: (f=0.96, c=0.99),  // 96% structurally interchangeable
    meaning: "MoE expert weights are commodity, routing is intelligence"
}

Σ.reasoning_scaffold ──CAUSES──→ Σ.expert_redundancy
    // reasoning heads SHAPE what the router sees
    // truth: to be discovered by cross-model alignment
```

## DELIVERABLES

1. `causal_diff.rs`: load two bgz7 files, emit CausalEdge64 per shifted row
2. `attention_cluster.rs`: cluster edges by projection type per head
3. Test: `test_qwen35_reasoning_diff`: run the full 5-model pipeline
4. Test: `test_maverick_gate_topology`: extract gate patterns from existing bgz7
5. Output: `.claude/knowledge/reasoning_reverse_eng_results.md`

## WHY THIS MATTERS

The NARS stack has:
- TruthValue with frequency + confidence ✓
- Revision (evidence integration) ✓
- Inference rules ✓
- Graph storage ✓

What it's MISSING: real evidence. Every truth value is currently
manufactured. This pipeline generates the first OBSERVED truth values
from actual model weight differences. The NARS stack goes from
theoretical to empirical in one session.

The thinking orchestration atoms (mcp-orchestrator-vsa) can then
CONSTRUCT reasoning patterns from the observed evidence:
"To add structured reasoning to a model, shift Q+O projections
in heads [17,23,24,31] by palette distance 3-7. Expected improvement:
f=0.62±0.15 at c=0.95."

That's not prompt engineering. That's weight-space surgery
informed by causal evidence. Programming AGI by observation.
