Skip to content

Commit 2b980ea

Browse files
committed
fix(inline): correct recursive inline miscompile (fib returned 38)
The depth-1 recursive inline pass produced a module that miscompiled at the LLVM tier: `main()` returned `Int(38)` instead of `Int(102334155)`. Two coupled bugs. First, `substitute_operands` only rewrote operand HirIds. For the standard inliner (caller != callee) this was invisible — the callee's HirIds are disjoint from the caller's, so cloned instruction results landed on unique ids either way. For the recursive inliner the snapshot IS a clone of the live function, so cloned instructions carried the live function's HirIds as results, producing duplicate SSA definitions. The LLVM backend's value_map keyed on HirId then overwrote the live definition with the cloned one. Fix: substitute results too. No behaviour change for the standard inliner — the result is the same fresh id that was already pre-registered in `caller.values`. Second, when the original call block was split twice (once per inline, reverse-inst-idx order so earlier indices stay valid), the post-block of the LATER inline ended up holding instructions that the EARLIER inline's cloned blocks consumed — but the EARLIER inline's blocks were inserted first into the function's `IndexMap`. The LLVM backend processes blocks in IndexMap order and silently falls back to `const_zero` for any HirValue it hasn't yet seen, so the cloned `Lt(n - 2, 2)` cond folded to `Lt(0, 2) = true`, the body collapsed to `return n - 2`, and `fib(40) = 40 - 2 = 38`. Fix: a CFG-DFS reorder of `func.blocks` at the end of the pass, so every block follows its predecessors in iteration order. SSA dominance then guarantees value_map is populated before any use. Bench impact unchanged from the original landing — fib(40) ~344 ms exec, mandelbrot ~409 ms, all kernels return the right values. 248 compiler tests pass.
1 parent d24d5c3 commit 2b980ea

1 file changed

Lines changed: 127 additions & 0 deletions

File tree

crates/compiler/src/inline.rs

Lines changed: 127 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -336,11 +336,93 @@ pub fn run_module_recursive(module: &mut HirModule) -> RecursiveInlineStats {
336336
stats.self_calls_inlined += 1;
337337
}
338338
rebuild_cfg_edges(live);
339+
// The LLVM backend processes blocks in `func.blocks.iter()`
340+
// order (IndexMap insertion order) and falls back to
341+
// `const_zero` placeholders in `value_map` for any HirValue
342+
// it hasn't yet defined. When recursive inlining splits the
343+
// original call block twice, the post-block of the LATER
344+
// inline ends up holding instructions used by the cloned
345+
// blocks of the EARLIER inline — but the earlier inline's
346+
// blocks were inserted first, so they're compiled before
347+
// those instructions exist. Reorder the IndexMap so every
348+
// block follows its predecessors in iteration order; SSA
349+
// dominance then guarantees value_map is populated before
350+
// any use. Without this, fib(40)'s `Lt(n - 2, 2)` cond
351+
// folds to `Lt(0, 2) = true`, the body collapses to
352+
// `return n - 2`, and main() returns 38 instead of
353+
// 102334155.
354+
reorder_blocks_by_cfg(live);
339355
}
340356

341357
stats
342358
}
343359

360+
/// Rebuild `func.blocks` (an `IndexMap` whose iteration order the
361+
/// LLVM backend honours) so every block appears AFTER all of its
362+
/// predecessors. A DFS pre-order from the entry block produces
363+
/// such an ordering: SSA dominance means dominators are visited
364+
/// before dominated blocks.
365+
///
366+
/// Used at the end of [`run_module_recursive`] — the apply order
367+
/// (reverse-inst-idx so earlier indices stay valid as splicing
368+
/// inserts content) puts the LATER inline's blocks AFTER the
369+
/// EARLIER inline's blocks in insertion order, even though
370+
/// execution flows the other way. The downstream LLVM backend's
371+
/// value_map then gets `const_zero` placeholders for any
372+
/// instruction it hasn't yet processed, masking the mis-ordering
373+
/// as a silent miscompile.
374+
fn reorder_blocks_by_cfg(func: &mut HirFunction) {
375+
use indexmap::IndexMap;
376+
377+
let entry = func.entry_block;
378+
if !func.blocks.contains_key(&entry) {
379+
return;
380+
}
381+
let mut visited: HashSet<HirId> = HashSet::new();
382+
let mut order: Vec<HirId> = Vec::with_capacity(func.blocks.len());
383+
let mut stack: Vec<HirId> = vec![entry];
384+
while let Some(bid) = stack.pop() {
385+
if !visited.insert(bid) {
386+
continue;
387+
}
388+
order.push(bid);
389+
if let Some(block) = func.blocks.get(&bid) {
390+
// Push successors in reverse so the DFS visits them
391+
// in declaration order — keeps the diff against the
392+
// pre-pass ordering minimal when nothing has changed.
393+
let mut succs: Vec<HirId> = block.successors.clone();
394+
succs.reverse();
395+
for s in succs {
396+
if !visited.contains(&s) {
397+
stack.push(s);
398+
}
399+
}
400+
}
401+
}
402+
// Blocks unreachable from `entry` — orphans created mid-pass,
403+
// dead branches whose CFG edges were rebuilt out — go at the
404+
// tail so we don't lose anything the rest of the pipeline
405+
// might still touch.
406+
let tail: Vec<HirId> = func
407+
.blocks
408+
.keys()
409+
.copied()
410+
.filter(|b| !visited.contains(b))
411+
.collect();
412+
for bid in tail {
413+
order.push(bid);
414+
}
415+
416+
let mut new_blocks: IndexMap<HirId, crate::hir::HirBlock> =
417+
IndexMap::with_capacity(func.blocks.len());
418+
for bid in &order {
419+
if let Some(blk) = func.blocks.shift_remove(bid) {
420+
new_blocks.insert(*bid, blk);
421+
}
422+
}
423+
func.blocks = new_blocks;
424+
}
425+
344426
/// Recursive-inline classifier. Allows direct self-`Call`s inside the
345427
/// body (those become the depth-1 residual after inlining), but
346428
/// still rejects every other Call kind (`IndirectCall`,
@@ -1200,6 +1282,51 @@ fn replace_uses_across_function(func: &mut HirFunction, subs: &HashMap<HirId, Hi
12001282
}
12011283

12021284
fn substitute_operands(inst: &mut HirInstruction, subs: &HashMap<HirId, HirId>) {
1285+
// Substitute the instruction's RESULT (defining HirId) as well as
1286+
// its operands. Both are necessary for cloned instructions: the
1287+
// subs map's image (`fresh_id`s minted up-front) is what the
1288+
// caller's `values` table was populated with, so the cloned
1289+
// instruction's result has to land at the fresh id to match.
1290+
//
1291+
// For the standard inliner this is a no-op-equivalent — the
1292+
// callee's HirIds are disjoint from the caller's, so the result
1293+
// gets a fresh id either way (the difference was invisible
1294+
// because downstream codegen iterates instructions in
1295+
// declaration order and didn't notice the duplicate caller-side
1296+
// HirId).
1297+
//
1298+
// For the recursive inliner this is load-bearing: the snapshot
1299+
// shares every HirId with the live function (it IS a clone), so
1300+
// without result-substitution the cloned instructions land in
1301+
// the caller carrying the live function's HirIds — duplicate
1302+
// SSA defs. The LLVM backend's value_map keyed on HirId then
1303+
// overwrites the live definition with the cloned one, every
1304+
// operand referencing the original now resolves to the cloned
1305+
// value, and fib(40) returns Int(38) instead of Int(102334155).
1306+
let map_result = |id: &mut HirId| {
1307+
if let Some(&new) = subs.get(id) {
1308+
*id = new;
1309+
}
1310+
};
1311+
match inst {
1312+
HirInstruction::Binary { result, .. }
1313+
| HirInstruction::Unary { result, .. }
1314+
| HirInstruction::Cast { result, .. }
1315+
| HirInstruction::Load { result, .. }
1316+
| HirInstruction::GetElementPtr { result, .. }
1317+
| HirInstruction::ExtractValue { result, .. }
1318+
| HirInstruction::InsertValue { result, .. }
1319+
| HirInstruction::Alloca { result, .. }
1320+
| HirInstruction::Select { result, .. }
1321+
| HirInstruction::Atomic { result, .. } => map_result(result),
1322+
HirInstruction::Call { result, .. } | HirInstruction::IndirectCall { result, .. } => {
1323+
if let Some(r) = result {
1324+
map_result(r);
1325+
}
1326+
}
1327+
_ => {}
1328+
}
1329+
12031330
let map = |id: &mut HirId| {
12041331
if let Some(&new) = subs.get(id) {
12051332
*id = new;

0 commit comments

Comments
 (0)