Skip to content

Commit 74d3823

Browse files
committed
document iterative solver and Monte Carlo caching options
- add README and Sphinx docs for the jacobi-gmres solver workflow - document aggregate_by and postprocess_multiprocessing behavior - update examples and API docstrings to reflect aggregated MC outputs
1 parent bccf95e commit 74d3823

11 files changed

Lines changed: 639 additions & 40 deletions

File tree

README.md

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -154,6 +154,44 @@ If not specified, all the methods, years, regions and scenarios
154154
defined in the datapackage.json file are used, which can be very
155155
time-consuming.
156156

157+
For larger Monte Carlo studies, ``Pathways.calculate(...)`` also supports an
158+
experimental iterative solver and cache-shaping options:
159+
160+
```python
161+
162+
p.calculate(
163+
methods=methods,
164+
models=models,
165+
scenarios=scenarios,
166+
regions=regions,
167+
years=years,
168+
variables=variables,
169+
use_distributions=300,
170+
solver="jacobi-gmres",
171+
iterative_rtol=1e-8,
172+
aggregate_by=["act_category", "location"],
173+
multiprocessing=True,
174+
postprocess_multiprocessing=True,
175+
)
176+
177+
```
178+
179+
- `solver="direct"` is the default and uses `bw2calc.MultiLCA`.
180+
- `solver="jacobi-gmres"` uses an experimental iterative `MultiLCA` backend.
181+
It can reuse previous supply arrays as warm starts and falls back to the
182+
direct solve if GMRES does not converge.
183+
- `aggregate_by` currently supports `act_category` and `location`. These
184+
dimensions are collapsed to a single `"aggregated"` label before Monte Carlo
185+
iteration arrays are cached, which reduces cache size and shortens
186+
post-processing.
187+
- `postprocess_multiprocessing=True` parallelizes final cached-result assembly.
188+
During this phase, Pathways logs per-year assembly progress so long runs do
189+
not appear stalled.
190+
191+
Use `aggregate_by` only when you do not need detailed attribution along those
192+
dimensions in the final results. If you relax `iterative_rtol` for speed, check
193+
the result against a direct-solver reference first.
194+
157195
Once calculated, the results of the LCA calculations are stored in the `.lcia_results`
158196
attribute of the `Pathways` object as an ``xarray.DataArray``.
159197

docs/api_reference.rst

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -14,12 +14,14 @@ Pathways
1414
- :py:meth:`pathways.Pathways.calculate` —
1515
Compute LCA results for selected methods, models, scenarios, regions, years, and variables.
1616
Parameters include ``demand_cutoff``, ``use_distributions``, ``subshares``,
17-
``remove_uncertainty``, ``seed``, ``multiprocessing``,
17+
``remove_uncertainty``, ``seed``, ``solver``, the ``iterative_*`` tuning
18+
arguments, ``aggregate_by``, ``multiprocessing``,
1819
``postprocess_multiprocessing``, and ``double_accounting``.
20+
``solver="direct"`` uses ``bw2calc.MultiLCA`` and
21+
``solver="jacobi-gmres"`` selects the experimental iterative backend.
1922

2023
- :py:meth:`pathways.Pathways.aggregate_results` —
2124
Aggregate low-contribution activity categories under ``"other"``; optional interpolation.
2225

2326
- :py:meth:`pathways.Pathways.export_results` —
2427
Export **non-zero** results to compressed Parquet (``.gzip``).
25-

docs/examples.rst

Lines changed: 14 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -45,9 +45,11 @@ Monte Carlo sampling and export
4545
-------------------------------
4646

4747
Set ``use_distributions`` to a non-zero integer to trigger Monte Carlo draws
48-
for the technosphere uncertainty parameters stored in the datapackage. The
49-
example below runs five iterations and writes both the aggregated results and
50-
per-iteration outputs to disk.
48+
for the technosphere uncertainty parameters stored in the datapackage. For
49+
larger runs, you can switch to the experimental iterative solver and collapse
50+
selected dimensions before cached iteration arrays are written. The example
51+
below runs five iterations, aggregates ``act_category`` and ``location`` during
52+
caching, and exports the final tensor.
5153

5254
.. code-block:: python
5355
@@ -59,6 +61,11 @@ per-iteration outputs to disk.
5961
years=[2030],
6062
variables=["Electricity|Generation"],
6163
use_distributions=5,
64+
solver="jacobi-gmres",
65+
iterative_rtol=1e-8,
66+
aggregate_by=["act_category", "location"],
67+
multiprocessing=True,
68+
postprocess_multiprocessing=True,
6269
remove_uncertainty=False,
6370
)
6471
@@ -74,3 +81,7 @@ per-iteration outputs to disk.
7481
7582
mc_book = STATS_DIR / "ModelX_baseline_2030.xlsx"
7683
print(mc_book.exists())
84+
85+
In this example, ``pw.lca_results`` keeps the same dimension names, but the
86+
``act_category`` and ``location`` coordinates each contain only the single value
87+
``"aggregated"``.

docs/index.rst

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,10 @@ Pathways provides tools for **prospective life cycle assessment (LCA)** driven b
1111
You can then compute multi-year, multi-region impact results with a single call to
1212
:py:meth:`pathways.Pathways.calculate`, aggregate the results with
1313
:py:meth:`pathways.Pathways.aggregate_results`, and export them to a compact
14-
Parquet file with :py:meth:`pathways.Pathways.export_results`.
14+
Parquet file with :py:meth:`pathways.Pathways.export_results`. For larger Monte
15+
Carlo studies, ``calculate`` can also use an experimental iterative solver,
16+
collapse selected dimensions before caching, and parallelize the final
17+
cache-assembly step.
1518

1619
Contents
1720
--------

docs/quickstart.rst

Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -50,3 +50,32 @@ and export the non-zero entries to Parquet:
5050
# 5) Export non-zero cells to compressed Parquet
5151
out = pw.export_results("results_baseline")
5252
print("Wrote:", out) # e.g., results_baseline.gzip
53+
54+
Large Monte Carlo runs
55+
----------------------
56+
57+
When ``use_distributions`` is greater than zero, you can switch from the
58+
default direct solver to the experimental iterative backend and reduce cache
59+
size by collapsing dimensions you do not need later:
60+
61+
.. code-block:: python
62+
63+
pw.calculate(
64+
methods=["AWARE"],
65+
models=["REMIND"],
66+
scenarios=["SSP2-NPi"],
67+
regions=["World"],
68+
years=[2050],
69+
variables=["Electricity|Generation"],
70+
use_distributions=300,
71+
solver="jacobi-gmres",
72+
iterative_rtol=1e-8,
73+
aggregate_by=["act_category", "location"],
74+
multiprocessing=True,
75+
postprocess_multiprocessing=True,
76+
)
77+
78+
``solver="direct"`` remains the default. ``aggregate_by`` currently supports
79+
``"act_category"`` and ``"location"`` and replaces those coordinates with a
80+
single ``"aggregated"`` label in the resulting array, so use it only when you
81+
do not need detailed attribution along those dimensions.

docs/user_guide.rst

Lines changed: 39 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -54,6 +54,43 @@ Pathways workflow
5454

5555
``(act_category, variable, year, region, location, model, scenario, impact_category)``
5656

57+
When ``aggregate_by`` is used during Monte Carlo runs, the dimensions stay in
58+
the output, but the collapsed coordinates become ``"aggregated"``.
59+
60+
**Advanced Monte Carlo options**
61+
62+
.. code-block:: python
63+
64+
pw.calculate(
65+
methods=["AWARE"],
66+
models=["REMIND"],
67+
scenarios=["SSP2-NPi"],
68+
regions=["World"],
69+
years=[2050],
70+
variables=["Electricity|Generation"],
71+
use_distributions=300,
72+
solver="jacobi-gmres",
73+
iterative_rtol=1e-8,
74+
aggregate_by=["act_category", "location"],
75+
multiprocessing=True,
76+
postprocess_multiprocessing=True,
77+
)
78+
79+
- ``solver="direct"`` is the default and uses ``bw2calc.MultiLCA``.
80+
- ``solver="jacobi-gmres"`` uses an experimental iterative backend. It
81+
reuses prior supply arrays as warm starts when possible and falls back to
82+
the direct solve if GMRES does not converge.
83+
- The iterative backend exposes ``iterative_rtol``, ``iterative_atol``,
84+
``iterative_restart``, ``iterative_maxiter``, and
85+
``iterative_use_guess`` on :meth:`pathways.Pathways.calculate`.
86+
- ``aggregate_by`` currently supports ``"act_category"`` and
87+
``"location"``. It collapses those dimensions before Monte Carlo
88+
iteration arrays are cached, which reduces cache size and shortens
89+
post-processing.
90+
- ``postprocess_multiprocessing=True`` parallelizes final cached-result
91+
assembly after the solver stage. Pathways also logs yearly assembly
92+
progress during this step.
93+
5794
3. **Aggregate for display**
5895

5996
.. code-block:: python
@@ -74,3 +111,5 @@ Notes
74111
- Supported ``ecoinvent_version`` values are ``"3.10"``, ``"3.11"``, and ``"3.12"``.
75112
- Matrix files ``A_matrix.csv`` and ``B_matrix.csv`` are semicolon-delimited and may include a header row.
76113
- Index files ``A_matrix_index.csv`` and ``B_matrix_index.csv`` may include a header row.
114+
- If you loosen ``iterative_rtol`` for speed, compare the result against a
115+
direct-solver run before using it for production Monte Carlo analysis.

pathways/jacobi_gmres_multi_lca.py

Lines changed: 196 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,196 @@
1+
"""Experimental iterative ``MultiLCA`` backend for Pathways.
2+
3+
This backend keeps Pathways on the standard ``bw2calc.MultiLCA`` lifecycle,
4+
but swaps the direct sparse solve for GMRES with a Jacobi preconditioner.
5+
Whenever available, it reuses the matrix-preparation and preconditioner
6+
helpers from ``bw2calc.JacobiGMRESLCA``.
7+
"""
8+
9+
from __future__ import annotations
10+
11+
import logging
12+
from typing import Optional
13+
14+
import bw2calc as bc
15+
import matrix_utils as mu
16+
import numpy as np
17+
from scipy import sparse
18+
from scipy.sparse.linalg import LinearOperator, gmres
19+
20+
logger = logging.getLogger(__name__)
21+
22+
try:
23+
_BW_JACOBI_GMRES_LCA = bc.JacobiGMRESLCA
24+
except AttributeError: # pragma: no cover - older bw2calc versions
25+
_BW_JACOBI_GMRES_LCA = None
26+
27+
28+
class JacobiGMRESMultiLCA(bc.MultiLCA):
29+
"""Solve multi-demand LCI systems with GMRES and Jacobi preconditioning."""
30+
31+
def __init__(
32+
self,
33+
*args,
34+
rtol: float = 1e-8,
35+
atol: float = 0.0,
36+
restart: Optional[int] = 50,
37+
maxiter: Optional[int] = 300,
38+
use_guess: bool = True,
39+
direct_fallback: bool = True,
40+
**kwargs,
41+
):
42+
super().__init__(*args, **kwargs)
43+
self.rtol = rtol
44+
self.atol = atol
45+
self.restart = restart
46+
self.maxiter = maxiter
47+
self.use_guess = use_guess
48+
self.direct_fallback = direct_fallback
49+
50+
self._matrix_prepared = False
51+
self._cached_preconditioner: Optional[LinearOperator] = None
52+
self.guesses: dict[str, np.ndarray] = {}
53+
54+
def __next__(self) -> None:
55+
# Matrix values can change on each Monte Carlo draw.
56+
self._matrix_prepared = False
57+
self._cached_preconditioner = None
58+
self.guesses = {}
59+
super().__next__()
60+
61+
def load_lci_data(self, nonsquare_ok=False) -> None:
62+
super().load_lci_data(nonsquare_ok=nonsquare_ok)
63+
self._matrix_prepared = False
64+
self._cached_preconditioner = None
65+
self.guesses = {}
66+
67+
def _prepare_matrix(self) -> None:
68+
# ``MappedMatrix`` updates ``technosphere_mm.matrix`` across MC draws.
69+
# Rebind here so GMRES always sees the current technosphere values instead
70+
# of a stale CSC conversion from an earlier draw.
71+
if hasattr(self, "technosphere_mm"):
72+
self.technosphere_matrix = self.technosphere_mm.matrix
73+
74+
if _BW_JACOBI_GMRES_LCA is not None:
75+
_BW_JACOBI_GMRES_LCA._prepare_matrix(self)
76+
return
77+
78+
if self._matrix_prepared:
79+
return
80+
81+
self.technosphere_matrix = self.technosphere_matrix.tocsc(copy=False)
82+
self.technosphere_matrix.sum_duplicates()
83+
self.technosphere_matrix.eliminate_zeros()
84+
self.technosphere_matrix.sort_indices()
85+
self._matrix_prepared = True
86+
87+
def _build_jacobi_preconditioner(self) -> Optional[LinearOperator]:
88+
if _BW_JACOBI_GMRES_LCA is not None:
89+
return _BW_JACOBI_GMRES_LCA._build_jacobi_preconditioner(self)
90+
91+
if self._cached_preconditioner is not None:
92+
return self._cached_preconditioner
93+
94+
diagonal = self.technosphere_matrix.diagonal()
95+
if np.any(diagonal == 0):
96+
return None
97+
98+
inverse_diagonal = 1.0 / diagonal
99+
self._cached_preconditioner = LinearOperator(
100+
shape=self.technosphere_matrix.shape,
101+
matvec=lambda x: inverse_diagonal * x,
102+
dtype=self.technosphere_matrix.dtype,
103+
)
104+
return self._cached_preconditioner
105+
106+
def _solve_with_gmres(
107+
self,
108+
demand: np.ndarray,
109+
*,
110+
x0: np.ndarray | None = None,
111+
demand_name: str | None = None,
112+
) -> np.ndarray:
113+
self._prepare_matrix()
114+
preconditioner = self._build_jacobi_preconditioner()
115+
116+
try:
117+
solution, info = gmres(
118+
self.technosphere_matrix,
119+
demand,
120+
x0=x0,
121+
rtol=self.rtol,
122+
atol=self.atol,
123+
restart=self.restart,
124+
maxiter=self.maxiter,
125+
M=preconditioner,
126+
)
127+
except TypeError: # pragma: no cover - SciPy compatibility fallback
128+
solution, info = gmres(
129+
self.technosphere_matrix,
130+
demand,
131+
x0=x0,
132+
tol=self.rtol,
133+
atol=self.atol,
134+
restart=self.restart,
135+
maxiter=self.maxiter,
136+
M=preconditioner,
137+
)
138+
139+
solution = np.asarray(solution, dtype=np.float64)
140+
if not solution.shape:
141+
solution = solution.reshape((1,))
142+
143+
if info != 0:
144+
if not self.direct_fallback:
145+
raise RuntimeError(
146+
"GMRES failed to converge "
147+
f"(demand={demand_name!r}, info={info}, rtol={self.rtol}, maxiter={self.maxiter})"
148+
)
149+
150+
logger.warning(
151+
"GMRES failed to converge for demand %s; falling back to direct solve.",
152+
demand_name,
153+
)
154+
solution = np.asarray(bc.spsolve(self.technosphere_matrix, demand))
155+
if not solution.shape:
156+
solution = solution.reshape((1,))
157+
158+
return solution
159+
160+
def lci_calculation(self) -> None:
161+
"""Calculate inventories for many demands using iterative solves."""
162+
count = len(self.dicts.activity)
163+
demand_items = list(self.demand_arrays.items())
164+
if not demand_items:
165+
self.supply_arrays = {}
166+
self.inventories = mu.SparseMatrixDict([])
167+
return
168+
169+
supply_arrays: dict[str, np.ndarray] = {}
170+
previous_solution: np.ndarray | None = None
171+
172+
for name, demand in demand_items:
173+
x0 = None
174+
if self.use_guess:
175+
x0 = self.guesses.get(name)
176+
if x0 is None:
177+
x0 = previous_solution
178+
179+
solution = self._solve_with_gmres(demand, x0=x0, demand_name=name)
180+
supply_arrays[name] = solution
181+
previous_solution = solution
182+
183+
if self.use_guess:
184+
self.guesses[name] = solution
185+
186+
self.supply_arrays = supply_arrays
187+
self.inventories = mu.SparseMatrixDict(
188+
[
189+
(
190+
name,
191+
self.biosphere_matrix
192+
@ sparse.spdiags([arr], [0], count, count),
193+
)
194+
for name, arr in self.supply_arrays.items()
195+
]
196+
)

0 commit comments

Comments
 (0)