fix(rope): traceable interleaved RoPE for graph export (unblocks TinyLlama→IREE)
Interleaved RoPE's raw-array path (copyToFloatArray → scalar rotate →
fromFloatArray) records the rotated Q/K as a disconnected constant under graph
tracing, severing the link to the projection weights. After the GQA
head-broadcast, that constant lowers to an insert_slice-into-tensor.empty()
constant cascade that segfaults iree-compile (in
iree-dispatch-creation-convert-tensor-to-flow: null ElementsAttr::getType
in the greedy fold; seqLen>=2 only).
Add applyRoPEInterleavedOps: a pure-tensor-op interleaved rotation (reshape
[headDim]->[halfRotary,2], narrow even/odd, rotate with cos/sin tables,
re-interleave), numerically identical to the raw path. Gated on
input.ops is KspTensorOps so it runs only under tracing; eager keeps the
raw fast path byte-identical (no perf/correctness change). Full-rotary only.
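
For illustration only, the rotation described above can be sketched in
NumPy. This is not the Kotlin implementation from this change: the function
names, the [seqLen, nHeads, headDim] layout, the rope_tables helper, and the
base of 10000 are all assumptions made for the example. It compares a scalar
pair-by-pair loop (standing in for the raw-array path) against a
reshape/slice/re-interleave version built only from whole-tensor ops.

```python
import numpy as np

def rope_tables(seq_len, head_dim, base=10000.0):
    # Hypothetical helper: conventional RoPE cos/sin tables,
    # each of shape [seq_len, head_dim // 2].
    half = head_dim // 2
    inv_freq = base ** (-np.arange(half, dtype=np.float64) * 2.0 / head_dim)
    angles = np.arange(seq_len, dtype=np.float64)[:, None] * inv_freq[None, :]
    return np.cos(angles), np.sin(angles)

def rope_interleaved_scalar(x, cos, sin):
    # Stand-in for the raw-array path: rotate each (even, odd) pair in a loop.
    seq_len, n_heads, head_dim = x.shape
    out = np.empty_like(x)
    for t in range(seq_len):
        for h in range(n_heads):
            for i in range(head_dim // 2):
                a, b = x[t, h, 2 * i], x[t, h, 2 * i + 1]
                out[t, h, 2 * i] = a * cos[t, i] - b * sin[t, i]
                out[t, h, 2 * i + 1] = a * sin[t, i] + b * cos[t, i]
    return out

def rope_interleaved_ops(x, cos, sin):
    # Tensor-op path: reshape [head_dim] -> [half, 2], take the even/odd
    # columns, rotate with the cos/sin tables, then re-interleave.
    seq_len, n_heads, head_dim = x.shape
    pairs = x.reshape(seq_len, n_heads, head_dim // 2, 2)
    even, odd = pairs[..., 0], pairs[..., 1]
    c, s = cos[:, None, :], sin[:, None, :]  # broadcast over the head axis
    rot_even = even * c - odd * s
    rot_odd = even * s + odd * c
    stacked = np.stack([rot_even, rot_odd], axis=-1)
    return stacked.reshape(seq_len, n_heads, head_dim)
```

Because the second function never leaves tensor ops, a tracer can keep the
output connected to its input, which is the property the fix relies on.
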
Verified via the skainet-tinyllama-iree composite build: real TinyLlama
exports + compiles to aarch64 .vmfb at seq=2 and seq=8; eager-jvm still
coherent (matches llama.cpp); LlamaDslPipelineTest green.
Perf-Tag: perf/b1-rope-traceable
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>