feat: spider-web logo + Fibonacci spiral formulas + zero-gravity physics
- Inverted logo: black petals with white outlines (spider-web effect)
- White hover highlight on logo petals with white tooltip (black text)
- 42 sacred formulas orbit in Fibonacci golden-angle spiral
- Alternating rotation directions per ring layer
- Formulas stop on mouse hover (instead of scattering away)
- Click formula to expand description
- Sacred world panels use a fully opaque, pure black background
- Panel slot reuse (fixes count overflow after many opens)
- Fixed applyMouse() vertex rotation to match draw()
- ESC hides panels (instead of exiting); Cmd+Q quits
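The golden-angle layout mentioned above can be sketched as follows. This is a minimal illustration, not the code from this commit: the function name `spiral_positions`, the `spacing` parameter, and the square-root radius law (Vogel's model) are assumptions chosen for the example.

```python
import math

# Golden angle in radians (about 137.5 degrees): pi * (3 - sqrt(5)).
GOLDEN_ANGLE = math.pi * (3.0 - math.sqrt(5.0))


def spiral_positions(count: int, spacing: float = 1.0) -> list[tuple[float, float]]:
    """Place `count` points on a golden-angle spiral around the origin.

    Assumption: radius grows with sqrt(index), which spreads points evenly;
    the actual commit may use a different radius law or ring layout.
    """
    positions = []
    for i in range(count):
        radius = spacing * math.sqrt(i)
        theta = i * GOLDEN_ANGLE  # each point is rotated by one golden angle
        positions.append((radius * math.cos(theta), radius * math.sin(theta)))
    return positions


formulas = spiral_positions(42)  # one slot per formula
```

Alternating rotation per ring could then be layered on top by negating the per-frame angular velocity for every other ring, though how rings are defined here is not stated in the commit message.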
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
 - **autoSplitN()**: Divides any model's layers evenly across N nodes
+- **Cross-platform**: macOS arm64 coordinator + Linux x86_64 worker, zero dependencies
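An even layer split of the kind `autoSplitN()` is described as doing might look like the sketch below. The name `auto_split_n`, its signature, and the choice to give leftover layers to the earliest nodes are assumptions for illustration; the project's actual implementation is not shown in this diff.

```python
def auto_split_n(num_layers: int, num_nodes: int) -> list[range]:
    """Split layer indices 0..num_layers-1 into contiguous, near-equal shards.

    Assumption: when the layers do not divide evenly, the first
    `num_layers % num_nodes` nodes each take one extra layer.
    """
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    if num_layers < num_nodes:
        raise ValueError("need at least one layer per node")

    base, extra = divmod(num_layers, num_nodes)
    shards = []
    start = 0
    for node in range(num_nodes):
        size = base + (1 if node < extra else 0)
        shards.append(range(start, start + size))
        start += size
    return shards


# Example: 28 layers over 3 nodes gives shards of 10, 9, and 9 layers.
shards = auto_split_n(28, 3)
```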
### Key Finding
The dominant bottleneck on localhost was **CPU contention**, not the network. When each node has its own CPU, pipeline parallelism delivers the expected parallel speedup. The network adds ~100 ms of RTT overhead per decode step, but this is dwarfed by the compute savings from eliminating contention.
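The trade-off in this finding can be expressed with a toy cost model. Everything here is an assumption for illustration: the function names, the multiplicative `contention_factor`, and the 400 ms compute figure are placeholders, not measurements from this project. Only the ~100 ms RTT comes from the text above.

```python
def decode_step_ms(compute_ms: float,
                   contention_factor: float = 1.0,
                   rtt_ms: float = 0.0) -> float:
    """Toy model: step time = compute inflated by CPU contention + network RTT."""
    return compute_ms * contention_factor + rtt_ms


def break_even_contention(compute_ms: float, rtt_ms: float) -> float:
    """Contention factor above which separate machines win despite the RTT."""
    return 1.0 + rtt_ms / compute_ms


# Illustrative placeholder: 400 ms of compute per decode step.
local = decode_step_ms(400.0, contention_factor=2.0)   # shared CPU, no network
remote = decode_step_ms(400.0, rtt_ms=100.0)           # own CPU, ~100 ms RTT
threshold = break_even_contention(400.0, 100.0)
```

Under these placeholder numbers, separate machines come out ahead whenever contention slows the shared-CPU run by more than the computed threshold.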
### Next Steps
 1. ~~**Multi-machine test**: Deploy on 2 separate machines to measure real parallel speedup~~ **DONE**
-2. **Tokenizer integration**: GGUF tokenizer for coherent text output
-3. **Larger models**: Qwen2.5 7B Q4_K_M (requires download, ~4GB per shard)
-4. **N-way pipeline**: Extend for >2 nodes
-5. **Tensor parallelism**: Split matmul across nodes (complementary to pipeline)
+2. ~~**N-way pipeline**: Extend for >2 nodes~~ **DONE** (PipelineRelay)
+3. **3 separate machines**: Deploy on 3 VPS to measure real 3-way parallel speedup
+4. **Tokenizer integration**: GGUF tokenizer for coherent text output
+5. **Larger models**: Qwen2.5 7B Q4_K_M (requires download, ~4GB per shard)
+6. **Tensor parallelism**: Split matmul across nodes (complementary to pipeline)
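For the tensor-parallelism item, one common way to split a matmul is column-wise: each node holds a slice of the weight columns, computes its part of the output, and the parts are concatenated. The sketch below simulates that in a single process. The function names and the column-wise strategy are assumptions, since the diff does not say which split the project plans to use.

```python
def matmul_vec(x: list[float], w: list[list[float]]) -> list[float]:
    """Compute x @ w for a vector x (length k) and a k-by-n matrix w."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def split_columns(w: list[list[float]], parts: int) -> list[list[list[float]]]:
    """Cut w into `parts` contiguous column blocks of near-equal width."""
    base, extra = divmod(len(w[0]), parts)
    shards, start = [], 0
    for p in range(parts):
        width = base + (1 if p < extra else 0)
        shards.append([row[start:start + width] for row in w])
        start += width
    return shards


def tensor_parallel_matmul(x: list[float], w: list[list[float]], parts: int) -> list[float]:
    """Simulate each node multiplying by its own column shard, then concatenate."""
    out: list[float] = []
    for shard in split_columns(w, parts):  # on real hardware, one shard per node
        out.extend(matmul_vec(x, shard))
    return out


x = [1.0, 2.0]
w = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
result = tensor_parallel_matmul(x, w, parts=2)
```

Because every node needs the full input vector but only its own weight columns, this split composes with pipeline parallelism: a pipeline stage can itself be spread over several nodes.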