KLU: add thread-safe solve entry points #1038
luke-kiernan wants to merge 1 commit into DrTimothyAldenDavis:dev
Conversation
Legacy klu_solve / klu_tsolve race on Numeric->Xwork when called concurrently against a shared Numeric. Add klu_*_solve_ws and klu_*_tsolve_ws variants that take caller-supplied scratch, plus klu_*_solve_worksize() to query the size. Legacy entry points unchanged; no ABI break. Includes Demo/klu_thread_demo.c regression test. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
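The pattern this commit message describes can be sketched in isolation. The following is a minimal toy, not KLU code and not the patch itself: `toy_numeric`, `toy_worksize`, `toy_solve_core`, `toy_solve` and `toy_solve_ws` are hypothetical names, and the "solve" is only a diagonal scaling. It shows a size query with an overflow check, a core routine that receives its scratch as `void *`, a legacy-style wrapper that passes the scratch stored in the shared object, and a `_ws`-style wrapper that passes the caller's buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a factorization; NOT the real klu_numeric. */
typedef struct {
    size_t n;            /* dimension */
    const double *diag;  /* read-only factor data (a diagonal here) */
    double *xwork;       /* shared scratch, playing the role of Numeric->Xwork */
} toy_numeric;

/* Bytes of scratch for one solve: 4 * n * entry_size, or 0 on overflow.
   The factor 4 follows the PR description; returning 0 is an assumption. */
size_t toy_worksize(size_t n, size_t entry_size)
{
    assert(entry_size > 0);
    if (n > SIZE_MAX / 4) return 0;               /* 4 * n would overflow */
    if (4 * n > SIZE_MAX / entry_size) return 0;  /* the byte count would overflow */
    return 4 * n * entry_size;
}

/* Core: reads num, writes only b and the scratch it was handed. */
static int toy_solve_core(const toy_numeric *num, double *b, void *work)
{
    double *w = (double *) work;
    if (num == NULL || b == NULL || w == NULL) return 0;
    for (size_t i = 0; i < num->n; i++) w[i] = b[i] / num->diag[i];
    for (size_t i = 0; i < num->n; i++) b[i] = w[i];
    return 1;
}

/* Legacy-style entry point: uses the scratch inside the shared object,
   so two concurrent callers on the same object would race on it. */
int toy_solve(const toy_numeric *num, double *b)
{
    return (num == NULL) ? 0 : toy_solve_core(num, b, num->xwork);
}

/* _ws-style entry point: the caller owns the scratch, so the shared
   object is only read. */
int toy_solve_ws(const toy_numeric *num, double *b, void *work)
{
    return toy_solve_core(num, b, work);
}
```

Because both wrappers funnel into the same core, the two paths produce identical results on a single thread; only the ownership of the scratch differs.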
Above was 80% AI. But in principle it seems like a straightforward patch. If there are things I've overlooked, please let me know and I'll do my best to address them.
I can't include any AI-generated code.
Understood. I'll close the PR. I may rewrite by hand and reopen, but that will take a while. Related question: is there a concrete reason why the library doesn't already allow for thread-parallel calls to
I wrote the library a long time ago, when parallelism was not a concern.
Here's why I don't want to include AI-generated code:
You don't have to justify your policy to me. The root problem I'm trying to address is 2 steps removed: I'll close this PR. I may rewrite by hand, but I can't really justify taking the time to do so right now.
No problem. I will add this to my TODO list. I definitely agree it's something that needs to be done. I'll try to get to it over the summer.
Summary
Add thread-safe variants of klu_solve / klu_tsolve so multiple threads can solve against a single shared Numeric.

The legacy entry points use Numeric->Xwork as scratch, so concurrent solves on the same factorization race on that buffer. This PR adds:

- klu_*_solve_ws / klu_*_tsolve_ws: same as the legacy entry points, but take a caller-supplied void *Work instead of writing to Numeric->Xwork.
- klu_*_solve_worksize(Symbolic, Common): returns the required size in bytes (4 * n * sizeof(Entry), with overflow check).

The legacy klu_solve / klu_tsolve are unchanged (they still use Numeric->Xwork), so this is a pure addition. No struct fields touched, no existing signature changed, no SOVERSION bump.

Threading contract: any number of concurrent klu_*_solve_ws / klu_*_tsolve_ws calls against a single Numeric, provided each thread supplies its own Work buffer and its own klu_common. (As before, klu_refactor mutates the Numeric and is never concurrent-safe with solves.)

Implementation: each of klu_solve.c / klu_tsolve.c is split into a static _core helper (takes void *Work) plus public klu_solve / klu_solve_ws (respectively klu_tsolve / klu_tsolve_ws) wrappers.

Test plan

Demo/klu_thread_demo.c is included as a permanent regression test: 8 pthreads × 200 iters × (solve + tsolve) against a shared Numeric, each with its own Work and Common, results compared bit-exact to a serial reference. Wired into the SUITESPARSE_DEMOS build, gated on non-Windows.

- nm verified)
- klu_*_solve_ws / klu_*_tsolve_ws produce bit-identical output to the legacy entry points (single-threaded smoke test)
- klu_thread_demo passes: 8 threads × 200 iters × 2 solves on n=200 tridiag
- klu_solve (sharing Numeric->Xwork) reliably fails on iter 0; this confirms the test really exercises the race and the new path resolves it

Notes

- _ws (workspace) over _2 for descriptive clarity. Open to bikeshed.
- klu_refactor and klu_condest aren't the concurrent-target use case (refactor mutates Numeric regardless).
- NOT WIN32; on Windows the test is skipped (the library change itself is portable).

🤖 Generated with Claude Code
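The shape of the regression test in the test plan can be illustrated without KLU. The sketch below is not Demo/klu_thread_demo.c: `shared_factor`, `solve_ws`, `worker` and `run_thread_check` are hypothetical names, and the "factorization" is a toy diagonal. Only the structure is taken from the description: a serial reference is computed once, then 8 threads each run 200 solves against the same read-only object with their own scratch, and every result is compared bit-for-bit with the reference.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define N_THREADS 8
#define N_ITERS   200
#define N         200

/* Toy shared "factorization": only read during solves. */
typedef struct { const double *diag; } shared_factor;

typedef struct {
    const shared_factor *f;  /* shared by all threads, never written */
    const double *rhs;       /* read-only input */
    const double *expected;  /* serial reference result */
    int mismatches;          /* per-thread result; -1 on allocation failure */
} thread_arg;

/* Solve into x using caller-supplied scratch; no shared writable state. */
static void solve_ws(const shared_factor *f, const double *rhs,
                     double *x, double *work)
{
    assert(f != NULL && rhs != NULL && x != NULL && work != NULL);
    for (int i = 0; i < N; i++) work[i] = rhs[i] / f->diag[i];
    memcpy(x, work, N * sizeof(double));
}

static void *worker(void *p)
{
    thread_arg *a = (thread_arg *) p;
    double *work = malloc(N * sizeof(double));  /* this thread's own scratch */
    double *x = malloc(N * sizeof(double));
    a->mismatches = 0;
    if (work == NULL || x == NULL) {
        a->mismatches = -1;
    } else {
        for (int it = 0; it < N_ITERS; it++) {
            solve_ws(a->f, a->rhs, x, work);
            /* bit-exact comparison against the serial reference */
            if (memcmp(x, a->expected, N * sizeof(double)) != 0) a->mismatches++;
        }
    }
    free(work);
    free(x);
    return NULL;
}

/* Returns the total number of mismatches (0 means pass), or -1 on failure. */
int run_thread_check(void)
{
    static double diag[N], rhs[N], expected[N], work[N];
    for (int i = 0; i < N; i++) { diag[i] = 2.0 + i; rhs[i] = 1.0 + 3.0 * i; }
    shared_factor f = { diag };
    solve_ws(&f, rhs, expected, work);  /* serial reference */

    pthread_t tid[N_THREADS];
    thread_arg arg[N_THREADS];
    int started = 0, total = 0;
    for (int t = 0; t < N_THREADS; t++) {
        arg[t] = (thread_arg){ &f, rhs, expected, 0 };
        if (pthread_create(&tid[t], NULL, worker, &arg[t]) != 0) break;
        started++;
    }
    for (int t = 0; t < started; t++) {
        pthread_join(tid[t], NULL);
        if (arg[t].mismatches < 0) total = -1;
        else if (total >= 0) total += arg[t].mismatches;
    }
    return (started == N_THREADS) ? total : -1;
}
```

On toolchains that keep pthreads in a separate library, this needs `-pthread` at compile and link time. Pointing every worker at one shared scratch buffer instead of its own would turn this into the negative control: the threads would then race on that buffer, which is the failure the test plan reports for the legacy path.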