Fix reverse custom-rule rooting for nested sparse wrappers #3073

Open
AshtonSBradley wants to merge 2 commits into EnzymeAD:main from AshtonSBradley:asb/issue-3072-sparse-wrapper-rta

Conversation

@AshtonSBradley AshtonSBradley commented May 7, 2026

Summary

Explanation

The failure in #3072 came from reverse custom-rule setup reusing the original rooted argument value when boxing a Duplicated{SparseMatrixCSC} argument. In the nested wrapper reproducer, that root pointer was defined along the forward path, so the reverse-generated load for loaded.roots.primal.SparseMatrixCSC{Float64, Int64} failed LLVM verification because the definition did not dominate the use.

This updates rooted argument handling to call lookup_value for roots_val in reverse mode, matching how the primal argument value is remapped before being boxed. That keeps the root load anchored to the reverse-pass value map instead of the original forward-path value.
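
The dominance problem described above can be illustrated with a toy model. This is only a sketch: `ToyValue`, `TOY_DOMINATORS`, `toy_dominates`, and `toy_lookup` are hypothetical names invented for illustration, and Enzyme's actual value map and `lookup_value` operate on LLVM IR rather than on Julia structs like these.

```julia
# Toy model only: Enzyme's real value map and lookup_value operate on LLVM IR,
# not on these hypothetical Julia structs.
struct ToyValue
    name::String
    block::Symbol  # basic block in which the value is defined
end

# Hypothetical dominator sets: each block maps to the blocks that dominate it.
# The forward region branches (fwd_then / fwd_else) before reaching the reverse block.
const TOY_DOMINATORS = Dict(
    :entry    => Set([:entry]),
    :fwd_then => Set([:entry, :fwd_then]),
    :fwd_else => Set([:entry, :fwd_else]),
    :reverse  => Set([:entry, :reverse]),
)

# A definition may be used in `use_block` only if its block dominates that block.
toy_dominates(def::ToyValue, use_block::Symbol) = def.block in TOY_DOMINATORS[use_block]

# Stand-in for remapping: return the reverse-pass equivalent when one is recorded.
toy_lookup(value_map, v::ToyValue) = get(value_map, v, v)

root_forward = ToyValue("roots", :fwd_then)
root_reverse = ToyValue("roots_remapped", :reverse)
value_map = Dict(root_forward => root_reverse)

# Using the forward-path root directly in the reverse block violates dominance.
stale_ok = toy_dominates(root_forward, :reverse)
# Remapping the root first yields a value defined where the reverse block can see it.
remapped_ok = toy_dominates(toy_lookup(value_map, root_forward), :reverse)
println((stale_ok, remapped_ok))
```

In this toy, the unmapped root fails the dominance check while the remapped one passes, which mirrors the shape of the fix: the root is looked up the same way the primal argument already is.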

The new sparse regression exercises the nested active wrapper case from the issue: a scalar wrapper plus a sparse matrix wrapper passed through scaled mul! under set_runtime_activity(Reverse). The test checks both the primal result and the expected gradient dp ≈ [-4.0].
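
As a rough picture of what such a regression checks, here is a minimal sketch that needs only the standard libraries. The types `ToyScale` and `ToyScaledSparse`, the matrix values, and the finite-difference helper are all assumptions made for illustration; they are not the fixture from the test file, and the numbers are not chosen to reproduce `dp ≈ [-4.0]`.

```julia
using LinearAlgebra, SparseArrays

# Hypothetical fixture types; names, fields, and values are illustrative only.
struct ToyScale
    value::Float64
end

struct ToyScaledSparse
    scale::ToyScale
    mat::SparseMatrixCSC{Float64, Int}
end

const TOY_A = sparse([1, 2], [1, 2], [1.0, 3.0], 2, 2)
const TOY_X = [1.0, 1.0]

# Build the nested wrapper from the parameter, then run a scaled mul!.
function toy_loss(p::Vector{Float64})
    w = ToyScaledSparse(ToyScale(p[1]), TOY_A)
    y = zeros(2)
    mul!(y, w.mat, TOY_X, w.scale.value, 0.0)  # y = scale * mat * x + 0 * y
    return sum(y)
end

# Central finite difference; stands in for the Enzyme gradient in this sketch.
function toy_fd_gradient(f, p; h = 1.0e-6)
    g = similar(p)
    for i in eachindex(p)
        step = zeros(length(p))
        step[i] = h
        g[i] = (f(p .+ step) - f(p .- step)) / (2h)
    end
    return g
end

p = [2.0]
primal = toy_loss(p)                   # 2.0 * (1.0 + 3.0)
dp_fd = toy_fd_gradient(toy_loss, p)   # analytically sum(TOY_A * TOY_X)
```

In the actual test the gradient comes from `autodiff` under `set_runtime_activity(Reverse)` with a `Duplicated` vector argument, as the stack trace later in this thread shows; the finite difference here only supplies a reference value to compare against.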

Notes

  • The regression fixture uses a concrete nested wrapper type so it exercises the sparse rooting bug without also triggering the unrelated parametric-constructor _compute_sparams path under ParallelTestRunner.

Tests

  • julia --project=test -e 'include("test/rules/internal_rules/sparsearrays_rules.jl")'
    • SparseArrays spmatvec reverse rule: 4002 passed
    • SparseArrays nested wrapper reverse rule: 2 passed
    • SparseArrays spmatmat reverse rule: 1709 passed
  • julia --project=test test/runtests.jl rules/internal_rules/sparsearrays_rules --jobs=1 --quickfail
    • Overall: 5713 passed, 5713 total, SUCCESS
  • julia --project=@runic -e 'using Runic; exit(Runic.main(["--inplace", "test/rules/internal_rules/sparsearrays_rules.jl"]))'
  • git diff --check

Contributor

github-actions Bot commented May 7, 2026

Your PR no longer requires formatting changes. Thank you for your contribution!

@AshtonSBradley AshtonSBradley force-pushed the asb/issue-3072-sparse-wrapper-rta branch from 6484864 to 9b72c2c Compare May 7, 2026 08:03
@vchuravy vchuravy force-pushed the asb/issue-3072-sparse-wrapper-rta branch from 9b72c2c to fe14659 Compare May 7, 2026 11:51
@vchuravy vchuravy requested a review from wsmoses May 7, 2026 11:51

codecov Bot commented May 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 69.85%. Comparing base (179b608) to head (d2de4ca).
⚠️ Report is 65 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3073      +/-   ##
==========================================
+ Coverage   66.76%   69.85%   +3.08%     
==========================================
  Files          65       66       +1     
  Lines       21522    21721     +199     
==========================================
+ Hits        14369    15173     +804     
+ Misses       7153     6548     -605     

Member

vchuravy commented May 7, 2026

Error in testset SparseArrays nested wrapper reverse rule:
Error During Test at /home/runner/work/Enzyme.jl/Enzyme.jl/test/rules/internal_rules/sparsearrays_rules.jl:93
  Got exception outside of a @test
  EnzymeRuntimeException: Enzyme execution failed within
  
  (::Main.var"##rules/internal_rules/sparsearrays_rules#142909".var"#f#f##0")(::Vector{Float64})
       @ Main.var"##rules/internal_rules/sparsearrays_rules#142909" ~/work/Enzyme.jl/Enzyme.jl/test/rules/internal_rules/sparsearrays_rules.jl:94
  
  Hint: catch this exception as `err` and call `code_typed(err)` to inspect the surrounding code.
  
  Enzyme: jl_call calling convention not implemented in aug_forward for   %jl_f__compute_sparams_ret = call nonnull "enzyme_type"="{[-1]:Pointer}" ptr addrspace(10) (ptr, ptr addrspace(10), ...) @julia.call(ptr nonnull @jl_f__compute_sparams, ptr addrspace(10) null, ptr addrspace(10) @"ejl_inserted$_Main___rules_internal_rules_sparsearrays_rules_142909_SparseWrapperScaledMatrix_167324$false$139975272843472", ptr addrspace(10) @"ejl_inserted$jl_global_167325$false$139975308452240", ptr addrspace(10) nonnull %"box::SparseWrapperScale", ptr addrspace(10) nonnull %34) #20, !dbg !75
  
  Stacktrace:
    [1] f
      @ ~/work/Enzyme.jl/Enzyme.jl/test/rules/internal_rules/sparsearrays_rules.jl:97 [inlined]
    [2] diffejulia_f_167313wrap
      @ ~/work/Enzyme.jl/Enzyme.jl/test/rules/internal_rules/sparsearrays_rules.jl:0
    [3] macro expansion
      @ ~/work/Enzyme.jl/Enzyme.jl/src/compiler.jl:6703 [inlined]
    [4] enzyme_call
      @ ~/work/Enzyme.jl/Enzyme.jl/src/compiler.jl:6182 [inlined]
    [5] CombinedAdjointThunk
      @ ~/work/Enzyme.jl/Enzyme.jl/src/compiler.jl:6066 [inlined]
    [6] autodiff
      @ ~/work/Enzyme.jl/Enzyme.jl/src/Enzyme.jl:528 [inlined]
    [7] autodiff(mode::EnzymeCore.ReverseMode{false, true, false, EnzymeCore.FFIABI, false, false}, f::Main.var"##rules/internal_rules/sparsearrays_rules#142909".var"#f#f##0", ::Type{EnzymeCore.Active}, args::EnzymeCore.Duplicated{Vector{Float64}})
      @ Enzyme ~/work/Enzyme.jl/Enzyme.jl/src/Enzyme.jl:549
    [8] top-level scope
      @ ~/work/Enzyme.jl/Enzyme.jl/test/rules/internal_rules/sparsearrays_rules.jl:94
    [9] macro expansion
      @ /opt/hostedtoolcache/julia/1.12.6/x64/share/julia/stdlib/v1.12/Test/src/Test.jl:1777 [inlined]
   [10] macro expansion
      @ ~/work/Enzyme.jl/Enzyme.jl/test/rules/internal_rules/sparsearrays_rules.jl:106 [inlined]
   [11] include(mapexpr::Function, mod::Module, _path::String)
      @ Base ./Base.jl:307
   [12] macro expansion
      @ ~/.julia/packages/ParallelTestRunner/xjQuD/src/ParallelTestRunner.jl:385 [inlined]
   [13] macro expansion
      @ /opt/hostedtoolcache/julia/1.12.6/x64/share/julia/stdlib/v1.12/Test/src/Test.jl:1777 [inlined]
   [14] macro expansion
      @ ~/.julia/packages/ParallelTestRunner/xjQuD/src/ParallelTestRunner.jl:385 [inlined]
   [15] macro expansion
      @ /opt/hostedtoolcache/julia/1.12.6/x64/share/julia/stdlib/v1.12/Test/src/Test.jl:1777 [inlined]
   [16] macro expansion
      @ ~/.julia/packages/ParallelTestRunner/xjQuD/src/ParallelTestRunner.jl:384 [inlined]
   [17] macro expansion
      @ ./timing.jl:697 [inlined]
   [18] top-level scope
      @ ~/.julia/packages/ParallelTestRunner/xjQuD/src/ParallelTestRunner.jl:383
   [19] eval(m::Module, e::Any)
      @ Core ./boot.jl:489
   [20] execute(::Type{ParallelTestRunner.TestRecord}, mod::Module, f::Expr, name::String, start_time::Float64, custom_args::@NamedTuple{})
      @ ParallelTestRunner ~/.julia/packages/ParallelTestRunner/xjQuD/src/ParallelTestRunner.jl:376
   [21] (::ParallelTestRunner.var"#inner#runtest##0"{Type{ParallelTestRunner.TestRecord}, Expr, String, Expr, Float64, @NamedTuple{}})()
      @ ParallelTestRunner ~/.julia/packages/ParallelTestRunner/xjQuD/src/ParallelTestRunner.jl:413
   [22] runtest(RecordType::Type{ParallelTestRunner.TestRecord}, f::Expr, name::String, init_code::Expr, start_time::Float64, custom_args::@NamedTuple{})
      @ ParallelTestRunner ~/.julia/packages/ParallelTestRunner/xjQuD/src/ParallelTestRunner.jl:424
   [23] (::var"#handle##0#handle##1"{Sockets.TCPSocket, UInt64, Bool, @Kwargs{}, Tuple{typeof(ParallelTestRunner.runtest), DataType, Expr, String, Expr, Float64, @NamedTuple{}}, typeof(invokelatest)})()
      @ Main ~/.julia/packages/Malt/OteGQ/src/worker.jl:120

Looks related

@AshtonSBradley AshtonSBradley force-pushed the asb/issue-3072-sparse-wrapper-rta branch from afe8f10 to d2de4ca Compare May 7, 2026 21:58
Author

Yes, related. The compiler fix still targets the root-dominance failure, but my first regression test used a parametric SparseWrapperScaledMatrix{S, M} constructor inside the differentiated function. Under ParallelTestRunner's generated test module, that introduced an unrelated _compute_sparams jl_call path.

I changed the test fixture to use a concrete nested wrapper type instead. That keeps the sparse wrapper/rooting coverage for #3072 without also testing the separate parametric-constructor limitation. I also verified the file through the same test-runner entry point:

julia --project=test test/runtests.jl rules/internal_rules/sparsearrays_rules --jobs=1 --quickfail

Overall 5713/5713 SUCCESS
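
For readers unfamiliar with the distinction, the difference between a parametric and a concrete wrapper can be sketched as follows. The type names below are hypothetical stand-ins, not the definitions in the test file, and the sketch only shows the two type shapes; it does not reproduce the `_compute_sparams` path itself.

```julia
using SparseArrays

# Hypothetical stand-ins for the fixture types; the real definitions are in the test file.
struct ToyScale
    value::Float64
end

# Parametric wrapper: S and M are resolved from the arguments at construction.
struct ToyParametricWrapper{S, M}
    scale::S
    mat::M
end

# Concrete wrapper: field types are fixed, so no type parameters need resolving.
struct ToyConcreteWrapper
    scale::ToyScale
    mat::SparseMatrixCSC{Float64, Int}
end

A = sparse([1, 2], [1, 2], [1.0, 3.0], 2, 2)
s = ToyScale(2.0)

wp = ToyParametricWrapper(s, A)  # type parameters inferred from s and A
wc = ToyConcreteWrapper(s, A)    # nothing to infer

println(typeof(wp))
println(isconcretetype(ToyParametricWrapper), " ", isconcretetype(ToyConcreteWrapper))
```

Both wrappers hold the same data, so switching to the concrete form keeps the nested sparse structure that the rooting fix needs while avoiding type-parameter resolution inside the differentiated function.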

Author

I checked the current CI logs after the fixture update. The _compute_sparams / jl_call failure from the first regression test no longer appears.

The sparse regression path now passes in the CI-shaped jobs I inspected:

  • Julia 1.10 x86 ubuntu: rules/internal_rules/sparsearrays_rules passes 5713/5713
  • Julia 1.12 windows: rules/internal_rules/sparsearrays_rules passes 5713/5713

The remaining failures I saw look unrelated to this patch: several jobs end with Malt.TerminatedWorkerException() in other testsets or a self-hosted runner/container failure in the Lux integration job. I do not see the sparse nested-wrapper Enzyme error anymore.

Development

Successfully merging this pull request may close these issues.

Reverse mode verifier failure for nested wrapper around SparseMatrixCSC mul!

2 participants