Aiur kernel: u8_mul gadget + multiplication/arithmetic cleanups#410
Open
arthurpaulino wants to merge 3 commits into
Open
Aiur kernel: u8_mul gadget + multiplication/arithmetic cleanups#410arthurpaulino wants to merge 3 commits into
u8_mul gadget + multiplication/arithmetic cleanups#410arthurpaulino wants to merge 3 commits into
Conversation
- Check.lean safe_refs_only: replace `match u { 0=>1, 1=>0 }` with
`1 - is_unsafe_ci(ci)` — pure arithmetic, drops the match's selectors.
- ByteStream.u64_add now returns (U64, G), exposing the final carry-out;
removes the kernel's duplicate u64_add_with_carry. klimbs_succ /
klimbs_add_carry / klimbs_mul_single call u64_add directly. u64_add
had no prior callers, so no ripple to Blake3 / IxonSerialize.
New `u8_mul(a, b)` Aiur primitive: byte * byte -> (low, high), with low + 256*high = a*b. Slots into the existing Bytes2 preprocessed 256x256 table as two extra columns plus a lookup channel, so its cost class matches u8_add (one lookup, two auxiliaries). Wires the op through the Rust prover (bytecode, bytes2 gadget, constraints, trace, execute, FFI) and every Lean Aiur stage (Source through Bytecode), compiler, semantics, and Meta syntax. FFI ctor tag 18 is u8Mul; subsequent Op tags shift by one. Verified via the aiur and aiur-cross suites: prove/verify passes for 45*131 = (7, 23) and 255*255 = (1, 254).
u64_mul's byte schoolbook now splits each byte product via the u8_mul gadget instead of a field mul, so column accumulators are sums of bytes (< ~4096) rather than sums of products (< ~520k). divmod_256 therefore carry-propagates only small values: for Nat.mul 1000000 1000000 its trace height drops 86 -> 8 (FFT -29k). u64_mul itself widens (186 -> 506) from the u8_mul lookups -- a fixed +320 -- so the net is a win for any non-trivial multiplication. Adds IxVMPrim.nat_mul_big as a multi-byte mul check target.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Optimization pass over the Aiur kernel. Adds a
u8_mulAiur gadget, routesthe kernel's primitive
Natmultiplication through it, and cleans up twosmall arithmetic redundancies.
Commits
perf(kernel): boolean-inversion opt + dedup u64_addCheck.lean::safe_refs_only: replacematch u { 0 => 1, 1 => 0 }with1 - is_unsafe_ci(ci)— pure field arithmetic, drops the match'sselectors and branch blocks.
ByteStream.u64_addnow returns(U64, G), exposing the finalcarry-out, and the kernel's duplicate
u64_add_with_carryis removed.klimbs_succ/klimbs_add_carry/klimbs_mul_singlecallu64_adddirectly.
u64_addhad no prior callers, so nothing else is affected.feat(aiur): add u8_mul gadgetNew
u8_mul(a, b)Aiur primitive: byte × byte →(low, high)withlow + 256·high = a·b. It slots into the existingBytes2preprocessed256×256 table as two extra columns plus a lookup channel, so its circuit
cost class matches
u8_add(one lookup, two auxiliaries).The op is wired through the Rust prover (bytecode,
Bytes2gadget,constraints, trace, execute, FFI) and every Lean Aiur stage (Source
through Bytecode), the compiler, the semantics, and
Metasyntax. TheFFI constructor tag
18isu8Mul; subsequentOptags shift by one.Verified via the
aiurandaiur-crosssuites: prove/verify passes for45·131 = (7, 23)and255·255 = (1, 254).perf(kernel): use u8_mul gadget in u64_mul, shrink divmod_256u64_mul's byte schoolbook now splits each byte product via theu8_mulgadget instead of a field multiply, so each column accumulator is a sum
of bytes (
< ~4096) rather than a sum of products (< ~520k).divmod_256therefore only carry-propagates small values.Effect on
Nat.mul 1000000 1000000:divmod_256u64_muldivmod_256trace height drops 86 → 8.u64_mulwidens (186 → 506) fromthe
u8_mullookups — a fixed +320 — so the net is a win for anynon-trivial multiplication; only trivial single-byte products (e.g.
6·7) pay the width without an offsettingdivmod_256saving. Largeroperands win by more: the old
divmod_256height scaled with operandsize (~1000+ rows for 32-bit-sized factors) while the new one stays
bounded near 16.
Adds
IxVMPrim.nat_mul_big(Nat.mul 1000000 1000000) as a multi-bytemul check target.
Testing
lake test -- --ignored ixvm— passes (kernel arena, incl.natMul*,natPow*).lake test -- aiur/aiur-cross— passes, including the newu8_mulprove/verify cases.
cargo build) and full Lean (lake build) build clean.