Skip to content

Commit 960de52

Browse files
MarcusMNoackclaude
andcommitted
Integrate PR #31: preconditioner framework, warm-starts, kernel sparse blocks
Integrates the capabilities from #31 on top of the recent gp_data/gp_prior/gp_kv refactor, omitting the changes that conflicted with the new structure or removed safety/diagnostics. gp_lin_alg.py - New sparse preconditioner framework: ILU, IC(0), block Jacobi, additive Schwarz, AMG; alias resolution via `resolve_gp2scale_linalg_mode` and `normalize_sparse_preconditioner_type`; dispatcher `calculate_sparse_preconditioner(KV, args) -> (factor, operator)`. - Block conjugate gradient (`_block_conjugate_gradient`) with multi-column x0 normalization (`_normalize_rhs`, `_normalize_initial_guess`, `_column_initial_guess`) so warm-starts survive an append. - maxiter support in MINRES/CG (`sparse_minres_maxiter`, `sparse_cg_maxiter`, `sparse_krylov_maxiter`). - GPU detection: `_torch_gpu_device` (CUDA + MPS + device index), `_cupy_gpu_available`, `_imate_gpu_enabled`; `get_gpu_engine` now actually verifies a usable GPU before returning torch/cupy. - CPU fallback with UserWarning in every GPU branch that previously set `res = None`; cupy added to `calculate_logdet`, `calculate_inv`, `solve`, `matmul`, `matmul3`. - `args=args` propagation through `update_Chol_factor`, `cholesky_update_rank_n`, `update_logdet`, `calculate_inv_from_chol`, `update_inv`, `cholesky_update_rank_1_torch`. - Preserves NonPositiveDefiniteError and its diagnostics; preserves user dtype on GPU paths (rejected float32 downcasts); keeps `.tocsc()` in `calculate_sparse_solve`; preserves docstrings. gp_kv.py (training-path preconditioner cache + warm-start plumbing) - Mode aliases `sparseCGpre_<type>` / `sparseMINRESpre_<type>` resolved at __init__; resolved args written back to data.args. - Preconditioner cache (factor/operator/signature/shape/reuse counter) used by solve(), set_KV (force-refresh), update_KV (reuse-with-interval), and compute_new_*. Refresh interval default 1 = always rebuild (no change to prior behavior). Invalidated on shape or `sparse_preconditioner_*` arg change. Cached factor/operator dropped from pickle state. - compute_new_KVinvY / compute_new_KVlogdet_KVinvY gain optional x0 for warm-start; replaces inline `sparse.linalg.spilu` with the unified preconditioner framework. gp_marginal_likelihood.py - `_warm_start_KVinvY` cache; `_iterative_initial_guess` returns a usable x0 (last training iteration's solution, fallback to committed kv.KVinvY) when `args["sparse_krylov_warm_start"]=True`. Default off. kernels.py - Support-aware sparse Wendland blocks for gp2Scale: `wendland_anisotropic_gp2Scale_cpu_sparse` and its GPU counterpart, output-sensitive cKDTree neighbor search in whitened coordinates. - Multi-backend GPU Wendland with torch/cupy/CPU fallback. - Preserved existing docstrings and `polynomial_kernel` (collaborator's version returned `p` instead of the kernel value). gp.py / fvgp.py - Docstrings updated: `linalg_mode` documents the new `_<type>` aliases, `args` section split into themed groups with the new `sparse_preconditioner_*`, `sparse_krylov_*`, `sparse_*_maxiter`, `GPU_device`, `GPU_device_index` keys. pyproject.toml - `pyamg` added to tests extras so CI exercises the AMG preconditioner. CLAUDE.md - New 'Iterative-solver acceleration' section documenting the cache, preconditioner choices, warm-start, and the default-off semantics; `GPkv` and `GPMarginalLikelihood` rows updated. Tests - All linalg modes (canonical + every `_<type>` alias for the two preconditioned solvers) exercised end-to-end in `test_linalg_modes`. - Per-preconditioner correctness via CG convergence (`test_calculate_sparse_preconditioner_{ilu,ic0,block_jacobi, additive_schwarz,amg}`). - Block CG, multi-column x0, maxiter, legacy tolerance keys, GPU engine detection, CPU fallbacks for both linalg and kernel paths. - New training-path tests: preconditioner cache reuse counter and refresh interval, signature invalidation, set_KV force-refresh, warm-start on/off, build-failure fallback, end-to-end agreement of cached + warm-started compute_new_KVlogdet against the cold-start baseline. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 7682a99 commit 960de52

9 files changed

Lines changed: 2107 additions & 173 deletions

File tree

CLAUDE.md

Lines changed: 11 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -39,7 +39,7 @@ Both classes are composed of internal specialist objects created at `__init__` t
3939
| `GPprior` | [gp_prior.py](fvgp/gp_prior.py) | Kernel and mean function (default: anisotropic Matérn with ARD). In gp2Scale mode also owns `x_data_scatter_future` (the persistent dask scatter of `x_data`) |
4040
| `GPlikelihood` | [gp_likelihood.py](fvgp/gp_likelihood.py) | Noise model (variances or callable) |
4141
| `GPkv` | [gp_kv.py](fvgp/gp_kv.py) | Owns K+V matrix state and all factorizations; dispatches solves/logdets across linalg modes |
42-
| `GPMarginalLikelihood` | [gp_marginal_likelihood.py](fvgp/gp_marginal_likelihood.py) | Log marginal likelihood and its gradient; delegates factorization to `GPkv` |
42+
| `GPMarginalLikelihood` | [gp_marginal_likelihood.py](fvgp/gp_marginal_likelihood.py) | Log marginal likelihood and its gradient; delegates factorization to `GPkv`. Maintains `_warm_start_KVinvY` for iterative training solves when `args["sparse_krylov_warm_start"]=True`. |
4343
| `GPposterior` | [gp_posterior.py](fvgp/gp_posterior.py) | Posterior mean/covariance; information-theoretic quantities |
4444
| `GPtraining` | [gp_training.py](fvgp/gp_training.py) | Hyperparameter optimization (scipy, hgdl async, MCMC, Adam) |
4545

@@ -64,7 +64,7 @@ Gotchas:
6464
### Key supporting modules
6565

6666
- **[gp_lin_alg.py](fvgp/gp_lin_alg.py)** — CPU/GPU linear algebra primitives; Cholesky, LU, sparse solvers; defines `NonPositiveDefiniteError`
67-
- **[gp_kv.py](fvgp/gp_kv.py)**`GPkv` manages all K+V state across linalg modes: `"Chol"`, `"CholInv"`, `"Inv"`, `"sparseLU"`, `"sparseCG"`, `"sparseMINRES"`, and preconditioned variants. The mode is set at init and determines which factorization is updated when data or hyperparameters change. Custom solvers can be injected as a 3-tuple of callables.
67+
- **[gp_kv.py](fvgp/gp_kv.py)**`GPkv` manages all K+V state across linalg modes: `"Chol"`, `"CholInv"`, `"Inv"`, `"sparseLU"`, `"sparseCG"`, `"sparseMINRES"`, and preconditioned variants. The mode is set at init and determines which factorization is updated when data or hyperparameters change. Custom solvers can be injected as a 3-tuple of callables. For `sparseMINRESpre`/`sparseCGpre`, `GPkv` caches the preconditioner across `update_KV` / `compute_new_*` calls and rebuilds when `Preconditioner_reuse_counter``args["sparse_preconditioner_refresh_interval"] - 1` or when the shape/`sparse_preconditioner_*` args fingerprint changes. `set_KV` always force-refreshes. Aliases like `"sparseCGpre_amg"` are resolved at `__init__` into the canonical mode plus `args["sparse_preconditioner_type"]`.
6868
- **[kernels.py](fvgp/kernels.py)** — 15+ built-in kernels including Matérn, squared exponential, Wendland (compactly supported)
6969
- **[gp_mcmc.py](fvgp/gp_mcmc.py)** — Adaptive Metropolis–Hastings sampler used for Bayesian hyperparameter inference
7070
- **[gp_actor.py](fvgp/gp_actor.py)**`AsyncOptimizer` wraps `_MCMCActor` and `_AdamActor` for non-blocking background training; used by `GPtraining` for async MCMC and Adam modes
@@ -92,6 +92,15 @@ client.run(lambda: None) # flush pending releases
9292

9393
The `test_gp2Scale` test uses exactly this pattern between linalg-mode iterations.
9494

95+
### Iterative-solver acceleration (sparseCG / sparseMINRES / *pre modes)
96+
97+
For `sparseCG`, `sparseMINRES`, `sparseCGpre`, and `sparseMINRESpre`, the user can opt into two orthogonal accelerators via `args` on the `GP` constructor:
98+
99+
- **Preconditioner caching** (`sparseCGpre`/`sparseMINRESpre` only): `args["sparse_preconditioner_refresh_interval"] = N` reuses a single preconditioner for up to N consecutive `update_KV` / `compute_new_*` calls before rebuilding. Default `N=1` rebuilds on every call (same as no caching). `args["sparse_preconditioner_type"]` selects the kernel — `"ilu"` (default), `"ic"`/`"incomplete_cholesky"`, `"block_jacobi"`, `"schwarz"`/`"additive_schwarz"`, `"amg"` (requires pyamg). Mode aliases `"sparseCGpre_<type>"` / `"sparseMINRESpre_<type>"` set the type as a shortcut. Cache is invalidated automatically when `KV.shape` or any `sparse_preconditioner_*` arg changes.
100+
- **Warm-start** (all iterative modes): `args["sparse_krylov_warm_start"] = True` makes `GPMarginalLikelihood` pass the previous training iteration's `KVinvY` as `x0` to the next iterative solve. Cuts iteration counts substantially when successive hyperparameter trials are close. Stored in `marginal_likelihood._warm_start_KVinvY`; reset to `None` on pickling.
101+
102+
Both default off so existing behavior is preserved.
103+
95104
### Customization API
96105

97106
Kernels, mean functions, and noise models are all plain Python callables with standardized signatures. Users pass them as arguments to `GP`/`fvGP` constructors. The full hyperparameter vector is shared across kernel, mean, and noise callables, but each callable must only read its reserved index range. Kernel gradients can be user-supplied or computed via finite differences.

fvgp/fvgp.py

Lines changed: 54 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -180,8 +180,13 @@ class fvGP(GP):
180180
* ``"sparseCG"`` — sparse conjugate-gradient iterative solver.
181181
* ``"sparseMINRES"`` — sparse MINRES iterative solver.
182182
* ``"sparseSolve"`` — direct sparse solve via scipy.
183-
* ``"sparseCGpre"`` — conjugate-gradient with an incomplete-LU preconditioner.
184-
* ``"sparseMINRESpre"`` — MINRES with an incomplete-LU preconditioner.
183+
* ``"sparseCGpre"`` — preconditioned conjugate-gradient. The preconditioner type
184+
is selected by ``args["sparse_preconditioner_type"]`` (default ``"ilu"``;
185+
also ``"ic"``/``"incomplete_cholesky"``, ``"block_jacobi"``,
186+
``"schwarz"``/``"additive_schwarz"``, or ``"amg"`` (requires pyamg)).
187+
* ``"sparseMINRESpre"`` — preconditioned MINRES; same preconditioner choices.
188+
* ``"sparseCGpre_<type>"`` / ``"sparseMINRESpre_<type>"`` — shortcut that sets
189+
``args["sparse_preconditioner_type"]`` to ``<type>`` (e.g. ``"sparseCGpre_amg"``).
185190
186191
**Custom solver (any GP):**
187192
@@ -207,18 +212,62 @@ class fvGP(GP):
207212
args: dict, optional
208213
Advanced options. Recognized keys are:
209214
215+
Stochastic-Lanczos logdet (sparse modes):
216+
210217
- "random_logdet_lanczos_degree" : int; default = 20
211218
- "random_logdet_error_rtol" : float; default = 0.01
212219
- "random_logdet_verbose" : True/False; default = False
213220
- "random_logdet_print_info" : True/False; default = False
214-
- "sparse_minres_tol" : float
215-
- "sparse_cg_tol" : float
216221
- "random_logdet_lanczos_compute_device" : str; default = "cpu"/"gpu"
222+
223+
Sparse iterative solver tolerances and iteration limits:
224+
225+
- "sparse_cg_tol" : float; default = 1e-5
226+
- "sparse_minres_tol" : float; default = 1e-5
227+
- "sparse_cg_maxiter" : int; default = None (use scipy default)
228+
- "sparse_minres_maxiter" : int; default = None (use scipy default)
229+
- "sparse_krylov_maxiter" : int; default = None (applies to both if the
230+
solver-specific key is not set)
231+
- "sparse_block_krylov" : True/False; default = False — use a block CG
232+
variant when there are multiple RHS columns
233+
- "sparse_krylov_mode" : "single"/"block"; equivalent toggle
234+
- "sparse_krylov_block_size" : int — RHS block size for block CG
235+
236+
Iterative-solver acceleration (``sparseCG``/``sparseMINRES`` and the
237+
``*pre`` variants):
238+
239+
- "sparse_krylov_warm_start" : True/False; default = False — feed the
240+
previous training iteration's ``KVinvY`` as ``x0`` to the next solve
241+
- "sparse_preconditioner_type" : str; default = "ilu". One of "ilu",
242+
"ic"/"ichol"/"incomplete_cholesky", "block_jacobi", "schwarz"/
243+
"additive_schwarz", "amg" (requires pyamg)
244+
- "sparse_preconditioner_refresh_interval" : int; default = 1 —
245+
reuse the cached preconditioner for up to N consecutive solves
246+
before rebuilding. ``set_KV`` always force-refreshes.
247+
- "sparse_preconditioner_block_size" : int — block size for block_jacobi
248+
and additive_schwarz partitions
249+
- "sparse_preconditioner_schwarz_overlap" : int — overlap layers for
250+
additive Schwarz
251+
- "sparse_preconditioner_drop_tol" / "sparse_preconditioner_fill_factor"
252+
— forwarded to scipy ``spilu`` for "ilu"
253+
- "sparse_preconditioner_amg_*" — forwarded to pyamg
254+
(``max_levels``, ``max_coarse``, ``strength``, ``cycle``, etc.)
255+
- "sparse_preconditioner_shift" / "_growth" / "_attempts" — diagonal
256+
shift retry knobs for "ic" / "block_jacobi" / "additive_schwarz" when
257+
a local Cholesky encounters a non-PD block
258+
259+
Cholesky compute-device routing:
260+
217261
- "Chol_factor_compute_device" : str; default = "cpu"/"gpu"
218262
- "update_Chol_factor_compute_device": str; default = "cpu"/"gpu"
219263
- "Chol_solve_compute_device" : str; default = "cpu"/"gpu"
220264
- "Chol_logdet_compute_device" : str; default = "cpu"/"gpu"
221-
- "GPU_engine" : str; default = "torch"/"cupy"
265+
266+
GPU backend:
267+
268+
- "GPU_engine" : "torch"/"cupy"; default = first available
269+
- "GPU_device" : str; e.g. "cuda:1" or "mps"
270+
- "GPU_device_index" : int — explicit CUDA device index
222271
223272
All other keys will be stored and are available as part of the object instance and
224273
in kernel, mean, and noise functions.

fvgp/gp.py

Lines changed: 54 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -180,8 +180,13 @@ class GP:
180180
* ``"sparseCG"`` — sparse conjugate-gradient iterative solver.
181181
* ``"sparseMINRES"`` — sparse MINRES iterative solver.
182182
* ``"sparseSolve"`` — direct sparse solve via scipy.
183-
* ``"sparseCGpre"`` — conjugate-gradient with an incomplete-LU preconditioner.
184-
* ``"sparseMINRESpre"`` — MINRES with an incomplete-LU preconditioner.
183+
* ``"sparseCGpre"`` — preconditioned conjugate-gradient. The preconditioner type
184+
is selected by ``args["sparse_preconditioner_type"]`` (default ``"ilu"``;
185+
also ``"ic"``/``"incomplete_cholesky"``, ``"block_jacobi"``,
186+
``"schwarz"``/``"additive_schwarz"``, or ``"amg"`` (requires pyamg)).
187+
* ``"sparseMINRESpre"`` — preconditioned MINRES; same preconditioner choices.
188+
* ``"sparseCGpre_<type>"`` / ``"sparseMINRESpre_<type>"`` — shortcut that sets
189+
``args["sparse_preconditioner_type"]`` to ``<type>`` (e.g. ``"sparseCGpre_amg"``).
185190
186191
**Custom solver (any GP):**
187192
@@ -207,18 +212,62 @@ class GP:
207212
args: dict, optional
208213
Advanced options. Recognized keys are:
209214
215+
Stochastic-Lanczos logdet (sparse modes):
216+
210217
- "random_logdet_lanczos_degree" : int; default = 20
211218
- "random_logdet_error_rtol" : float; default = 0.01
212219
- "random_logdet_verbose" : True/False; default = False
213220
- "random_logdet_print_info" : True/False; default = False
214-
- "sparse_minres_tol" : float
215-
- "sparse_cg_tol" : float
216221
- "random_logdet_lanczos_compute_device" : str; default = "cpu"/"gpu"
222+
223+
Sparse iterative solver tolerances and iteration limits:
224+
225+
- "sparse_cg_tol" : float; default = 1e-5
226+
- "sparse_minres_tol" : float; default = 1e-5
227+
- "sparse_cg_maxiter" : int; default = None (use scipy default)
228+
- "sparse_minres_maxiter" : int; default = None (use scipy default)
229+
- "sparse_krylov_maxiter" : int; default = None (applies to both if the
230+
solver-specific key is not set)
231+
- "sparse_block_krylov" : True/False; default = False — use a block CG
232+
variant when there are multiple RHS columns
233+
- "sparse_krylov_mode" : "single"/"block"; equivalent toggle
234+
- "sparse_krylov_block_size" : int — RHS block size for block CG
235+
236+
Iterative-solver acceleration (``sparseCG``/``sparseMINRES`` and the
237+
``*pre`` variants):
238+
239+
- "sparse_krylov_warm_start" : True/False; default = False — feed the
240+
previous training iteration's ``KVinvY`` as ``x0`` to the next solve
241+
- "sparse_preconditioner_type" : str; default = "ilu". One of "ilu",
242+
"ic"/"ichol"/"incomplete_cholesky", "block_jacobi", "schwarz"/
243+
"additive_schwarz", "amg" (requires pyamg)
244+
- "sparse_preconditioner_refresh_interval" : int; default = 1 —
245+
reuse the cached preconditioner for up to N consecutive solves
246+
before rebuilding. ``set_KV`` always force-refreshes.
247+
- "sparse_preconditioner_block_size" : int — block size for block_jacobi
248+
and additive_schwarz partitions
249+
- "sparse_preconditioner_schwarz_overlap" : int — overlap layers for
250+
additive Schwarz
251+
- "sparse_preconditioner_drop_tol" / "sparse_preconditioner_fill_factor"
252+
— forwarded to scipy ``spilu`` for "ilu"
253+
- "sparse_preconditioner_amg_*" — forwarded to pyamg
254+
(``max_levels``, ``max_coarse``, ``strength``, ``cycle``, etc.)
255+
- "sparse_preconditioner_shift" / "_growth" / "_attempts" — diagonal
256+
shift retry knobs for "ic" / "block_jacobi" / "additive_schwarz" when
257+
a local Cholesky encounters a non-PD block
258+
259+
Cholesky compute-device routing:
260+
217261
- "Chol_factor_compute_device" : str; default = "cpu"/"gpu"
218262
- "update_Chol_factor_compute_device": str; default = "cpu"/"gpu"
219263
- "Chol_solve_compute_device" : str; default = "cpu"/"gpu"
220264
- "Chol_logdet_compute_device" : str; default = "cpu"/"gpu"
221-
- "GPU_engine" : str; default = "torch"/"cupy"
265+
266+
GPU backend:
267+
268+
- "GPU_engine" : "torch"/"cupy"; default = first available
269+
- "GPU_device" : str; e.g. "cuda:1" or "mps"
270+
- "GPU_device_index" : int — explicit CUDA device index
222271
223272
All other keys will be stored and are available as part of the object instance and
224273
in kernel, mean, and noise functions.

0 commit comments

Comments
 (0)