Commit cf505bb
Fix K-frozen attention bug + substrate-attention variants (L0-L3)
Two parallel wins from the K-frozen bug discovery:
1. tape_transpose Rust builtin fixes the original bug — K's
gradient now flows through the score path.
2. Substrate-K, substrate-K+Q, and fully-parameter-free attention
variants ship as Prometheus layers, testing the deeper
hypothesis: can the substrate REPLACE attention?
Rust additions (omnimcode-core/src/interpreter.rs):
tape_transpose(a_id) -> [cols, rows]
Forward: swap dimensions.
Backward: transpose upstream gradient. ~25 lines total.
TapeOp::Transpose variant added.
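As an illustration of the forward and backward rules described above, here is a minimal standalone sketch. It is not the code in interpreter.rs: the `Matrix` struct and the `transpose_forward` / `transpose_backward` names are hypothetical stand-ins, and the real builtin works on tape node ids rather than on owned matrices.

```rust
/// Row-major dense matrix. Hypothetical stand-in for whatever the tape
/// stores per node; not the actual interpreter type.
#[derive(Debug, Clone, PartialEq)]
pub struct Matrix {
    pub rows: usize,
    pub cols: usize,
    pub data: Vec<f64>,
}

impl Matrix {
    pub fn new(rows: usize, cols: usize, data: Vec<f64>) -> Self {
        assert_eq!(data.len(), rows * cols, "data length must equal rows * cols");
        Matrix { rows, cols, data }
    }

    /// Swap dimensions: element (r, c) moves to (c, r).
    pub fn transpose(&self) -> Matrix {
        let mut out = vec![0.0; self.data.len()];
        for r in 0..self.rows {
            for c in 0..self.cols {
                out[c * self.rows + r] = self.data[r * self.cols + c];
            }
        }
        Matrix::new(self.cols, self.rows, out)
    }
}

/// Forward pass of a transpose op: [rows, cols] -> [cols, rows].
pub fn transpose_forward(a: &Matrix) -> Matrix {
    a.transpose()
}

/// Backward pass: the upstream gradient has the output shape [cols, rows],
/// so transposing it gives the gradient with respect to the input.
pub fn transpose_backward(upstream: &Matrix) -> Matrix {
    upstream.transpose()
}
```

Because transposition only permutes elements, the backward rule is the same permutation applied in reverse, which is why the op can stay small.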
Prometheus additions (examples/lib/prometheus.omc):
L0 — prom_attention_forward (existing, FIXED)
Switched from tape_value/arr_transpose/tape_const hack to
tape_transpose. Q·K^T now backpropagates to K properly.
L1 — prom_attention_substrate_k_*
K replaced by CRT-PE position memory. Q, V still learned.
2 params per layer instead of 3.
L2 — prom_attention_substrate_kq_*
K AND Q both CRT-PE. Only V is learned.
1 param per layer.
L3 — prom_attention_substrate_full_*
All three substrate-derived. Q = K = CRT-PE; V = identity
(pass-through). ZERO learnable params in the attention block.
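For the L0 fix, the gradient algebra can be written out as follows. This is a sketch under two assumptions: the symbols S, G, and C are introduced here for illustration, and K is assumed to reach the loss only through the score path.

```latex
% Scores and upstream gradient (G is whatever arrives from softmax and later ops)
S = Q K^{\top}, \qquad G = \frac{\partial L}{\partial S}

% With a differentiable transpose, both operands receive gradient:
\frac{\partial L}{\partial Q} = G K, \qquad
\frac{\partial L}{\partial K} = G^{\top} Q

% With K^T read out and re-inserted as a constant C (the old hack),
% S = Q C has no dependence on K as far as the tape is concerned:
\frac{\partial L}{\partial Q} = G C^{\top}, \qquad
\frac{\partial L}{\partial K} = 0
```

The forward values agree in both cases, which is consistent with the bug being silent until K's gradient was inspected.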
Regression tests (examples/tests/test_prometheus.omc):
test_attention_backward_flows_to_QKV
Locks the K-fix: ALL of Q, K, V get non-zero gradient after
one backward pass. Would have caught the original bug.
test_tape_transpose_forward
test_tape_transpose_backward
test_substrate_k_has_no_K_params
test_substrate_full_has_zero_params
All 15 Prometheus tests pass.
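The idea behind the gradient-flow regression test can be sketched outside OMC as a finite-difference check on the score path. This is not the contents of test_prometheus.omc. All function names below are hypothetical, and the loss is a simple weighted sum of scores chosen so that the upstream gradient is known exactly.

```rust
/// Scores S = Q * K^T for row-major Q [n, d] and K [m, d]; returns S [n, m].
pub fn scores(q: &[f64], k: &[f64], n: usize, m: usize, d: usize) -> Vec<f64> {
    let mut s = vec![0.0; n * m];
    for i in 0..n {
        for j in 0..m {
            for t in 0..d {
                s[i * m + j] += q[i * d + t] * k[j * d + t];
            }
        }
    }
    s
}

/// Scalar loss: weighted sum of scores, so the upstream gradient dL/dS is `w`.
pub fn loss(q: &[f64], k: &[f64], w: &[f64], n: usize, m: usize, d: usize) -> f64 {
    scores(q, k, n, m, d).iter().zip(w).map(|(s, g)| s * g).sum()
}

/// Analytic gradient dL/dK = G^T * Q, shape [m, d], with G = `w`.
pub fn grad_k(q: &[f64], w: &[f64], n: usize, m: usize, d: usize) -> Vec<f64> {
    let mut g = vec![0.0; m * d];
    for j in 0..m {
        for t in 0..d {
            for i in 0..n {
                g[j * d + t] += w[i * m + j] * q[i * d + t];
            }
        }
    }
    g
}

/// Central finite-difference estimate of dL/dK, one element at a time.
pub fn grad_k_numeric(
    q: &[f64], k: &[f64], w: &[f64],
    n: usize, m: usize, d: usize, eps: f64,
) -> Vec<f64> {
    let mut g = vec![0.0; k.len()];
    let mut kp = k.to_vec();
    for idx in 0..k.len() {
        let orig = kp[idx];
        kp[idx] = orig + eps;
        let up = loss(q, &kp, w, n, m, d);
        kp[idx] = orig - eps;
        let down = loss(q, &kp, w, n, m, d);
        kp[idx] = orig; // restore before perturbing the next element
        g[idx] = (up - down) / (2.0 * eps);
    }
    g
}
```

A check in this style compares `grad_k` against `grad_k_numeric` and also asserts that at least one entry is non-zero. A frozen K would fail the second assertion even though the forward pass still looks right.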
A/B experiment (examples/prometheus_attention_4way.omc):
Trains all four variants (L0, L1, L2, L3) on the same task with
the same seeds. Three seeds × 250 steps each. Tests the deepest
question OMC can pose: "is attention's expressivity actually
needed at this scale, or can the substrate provide enough
inductive prior?"
Result will land in a separate commit when the run finishes.
Strategic significance:
The user's reframe ("why not invent a new K primitive?") shifted
the work from "patch the bug" to "ship 4 architecturally distinct
attention layers." Either way the bug is gone, but the substrate
variants test the harder hypothesis that OMC's substrate doesn't
just augment transformer primitives — it can REPLACE them.
Whichever variant wins (or loses) the A/B is a real empirical
result on the substrate's expressive power at the attention layer.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 8fcf7ee commit cf505bb
4 files changed
Lines changed: 541 additions & 27 deletions
File tree
- examples
  - lib
  - tests
- omnimcode-core/src