Add the DC-EGM solver (discrete-continuous endogenous grid method) by hmgaudecker · Pull Request #390 · OpenSourceEconomics/pylcm

hmgaudecker · 2026-06-19T16:11:38Z

Draft. Stacked on #391 (feat/type-local-continuation-v) → main. (The earlier
#388/#389 stack has been consolidated into this branch.)

Adds the DC-EGM solver (discrete-continuous endogenous grid method) behind the
Solver ABC seam. It inverts the Euler equation on an exogenous end-of-period grid
instead of grid-searching the continuous action, with a FUES upper envelope to drop
dominated candidates. Selected per regime via solver=DCEGM(...); GridSearch (the
default) is untouched, and the engine dispatches polymorphically — no fork on solver
type.

Commit structure (reviewable in steps)

DC-EGM solver implementation (squashed) — the kernel: Euler inversion, the FUES
envelope, the closed-form constrained segment, multi-target continuation, and
asset-row mode; wired through the ABC DCEGM.validate / build_period_kernels.
Stochastic-node expectation in lax.scan blocks — a memory knob for the child
stochastic-node mesh.
FUES provenance + credits (3 commits) — NOTICE/license for the FUES derivation and
an acknowledgments page.
Modular split of step.py (5 pure-move commits) — regime_introspection →
kernel_scope → continuation → step_core → asset_row, taking the monolith from
~2900 to ~800 lines so the simple single-post-state path reads cleanly apart from the
asset-row and continuation machinery.
bind_continuation — collapse the continuation behind one bound
savings → (expected_value, expected_marginal) callable (ContinuationPlan).
Beartype + test fixes (2 commits) — the contract dataclasses are beartyped by the
package claw, so cyclic field types use the two-definition alias pattern (precise for
ty, bare container at runtime); plus a valid-DCEGM seam build test.
Seam Layer 2 — a uniform PeriodKernel/KernelResult adapter; the solve loop
calls one kernel and branches only on optional generic outputs, not on solver type.
Streaming refine-to-query — fold the asset-row single-query interpolation into the
FUES scan (refine_to_bracket + publish_node_from_bracket), removing the n_pad
intra-node envelope scratch. Single-post-state still publishes the full refined
envelope as its carry, unchanged.

Solvers and accuracy work built on the DC-EGM seam

On top of the kernel, this branch adds the prime-time solvers and the Dobrescu–Shanker
benchmark reproductions:

NEGM (nested EGM) and the 1-D RFC / MSS / LTM upper-envelope backends
(DCEGM(upper_envelope=...)), plus the 2-D RFC/G2EGM two-asset foundation for the
DS-2024 pension comparison.
DS-2026 Applications 1/2/3 reproduced (retirement, continuous-housing NEGM +
EGM-FUES, discrete-housing-with-tax); full paper-scale matrix run on gpu-01.
Jit internal functions on a per-period level #129 — branch-aware VFI oracle (tests/solution/_branch_aware_vfi_oracle.py): pins
the App.2 (S,s) EGM-vs-VFI gap to a comparator-ordering term that is identically zero
on the scored interior, correcting the earlier attribution; the tolerated interior gap
is DC-EGM durable-corner discretization. Standard-mode oracle == brute to 8e-5.
DS-2024 housing RFC-vs-NEGM comparison (ds2024_housing*.py,
benchmarks/ds_replication/ds2024_housing_comparison.py): NEGM (nested continuous
housing) vs RFC/FUES (discrete-housing DC-EGM, the source's per-housing-column 1-D
rooftop cut). Reproduces the paper headline — RFC runtime < NEGM runtime — against a
grid-search VFI oracle (faithful at delta = 0; the paper's delta = 0.10 keeper
depreciation awaits a shared housing-axis carry extension).

Validation

Full gpu-01 oracle battery (Tesla V100, -n 4, on a free GPU): 1405 passed, 15
skipped, 2 xfailed. Covers the DC-EGM-vs-brute value-function-parity battery, the
19-case refine-to-query unit equivalence (streamed ≡ row-then-interp), and the Layer-2
distinct-compiled-cores regression.
ty and prek clean.

Pending (does not block reviewing the engine)

DS-2024 housing delta = 0.10 fidelity: the keeper depreciates the held stock off
the housing grid; the strict-identity NEGM keeper needs a housing-axis carry extension.
ACA-scale memory (Gate C): refine-to-query's n_pad runtime win is only visible at
ACA scale — the in-suite oracles are too small to exercise it; handed to the aca side.
Throughput vs the brute-force solver at matched accuracy (Gate E) is a downstream
experiment this work unlocks, not settled here.

🤖 Generated with Claude Code

Route the existing brute-force grid search through a per-regime solver configuration and a builder registry, with no change to the numerics. - `Regime.solver: BruteForce | DCEGM` (default `BruteForce()`), exported from `lcm` alongside the `DCEGM` configuration class. - `_lcm.solution.registry`: `SolverBuildContext`, `SolverKernels`, the `SolverKernelBuilder` protocol, and `_build_brute_force_kernels` (the former `_build_max_Q_over_a_per_period`), dispatched on `type(regime.solver)` via `SOLVER_KERNEL_BUILDERS`. - `DCEGM` is published as the final configuration surface (fields + field validation), but its engine is not yet wired in; a regime requesting it is rejected at model build with `NotImplementedError`. Behavior-preserving: the full test suite passes unchanged and an explicit `BruteForce()` yields the same value function as the default. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Points the benchmark feature's aca-model at feat/dcegm-solver's tip — main's audit fixes + the DC-EGM solver + smooth-share eligibility — the version the solver-seam work is developed against. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Replace the `type(solver)`-keyed builder registry with a polymorphic `Solver` ABC. The engine now calls `solver.validate(context)` then `solver.build_period_kernels(context)` — no `SOLVER_KERNEL_BUILDERS` dict, no `BruteForce | DCEGM` union, no standalone DC-EGM guard. - `_lcm/solution/contract.py` (new): the `Solver` ABC (abstract `build_period_kernels`, default no-op `validate`), `SolverBuildContext`, and `SolverKernels`. An engine leaf — imports nothing that reaches `lcm.solvers`, so the façade can re-export it without an import cycle. - `_lcm/solution/solvers.py` (new): `GridSearch(Solver)` (the relocated grid-search builder, with function-local `jax`/`get_max_Q_over_a` imports) and `DCEGM(Solver)` (the published config; `validate` raises the not-yet-available guard, so a regime requesting it is rejected at model build). - `lcm/solvers.py` → thin re-export façade; `registry.py` deleted; the `processing` dispatch and `Regime.solver` field updated. - Rename the default solver `BruteForce` → `GridSearch` (more descriptive; alpha permits the break). Faithful to dcegm-solver-seam-abc-design.md (ABC, not Protocol). Layer 2 (generic KernelResult) does not apply here — the stub seam has no EGM fork. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

review-notebook-app · 2026-06-19T16:11:43Z

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

read-the-docs-community · 2026-06-19T16:14:56Z

Documentation build overview

📚 pylcm | 🛠️ Build #33360521 | 📁 Comparing e362009 against latest (dc59124)

🔍 Preview build

38 files changed · + 1 added · ± 37 modified

+ Added

credits/index.html

± Modified

github-actions · 2026-06-19T22:02:24Z

Benchmark comparison (main → HEAD)

Comparing 7d8d2aac (main) → e3620099 (HEAD)

Benchmark	Statistic	before	after	Ratio
aca-baseline	execution time	16.099 s	15.333 s	0.95
	peak GPU mem	588 MB	588 MB	1.00
	compilation time	521.91 s	444.98 s	0.85
	peak CPU mem	7.64 GB	7.42 GB	0.97
aca-baseline-debug	execution time	103.913 s	102.546 s	0.99
	peak GPU mem	587 MB	587 MB	1.00
	compilation time	658.61 s	648.49 s	0.98
	peak CPU mem	8.13 GB	8.15 GB	1.00
Mahler-Yum	execution time	4.612 s	4.533 s	0.98
	peak GPU mem	520 MB	520 MB	1.00
	compilation time	11.78 s	11.21 s	0.95
	peak CPU mem	1.59 GB	1.59 GB	1.00
Precautionary Savings - Solve	execution time	26.9 ms	23.8 ms	0.89
	peak GPU mem	8 MB	8 MB	1.00
	compilation time	1.55 s	1.57 s	1.02
	peak CPU mem	1.16 GB	1.16 GB	1.00
Precautionary Savings - Simulate	execution time	65.9 ms	60.3 ms	0.91
	peak GPU mem	157 MB	157 MB	1.00
	compilation time	3.79 s	3.57 s	0.94
	peak CPU mem	1.32 GB	1.32 GB	1.00
Precautionary Savings - Solve & Simulate	execution time	89.6 ms	89.2 ms	1.00
	peak GPU mem	566 MB	566 MB	1.00
	compilation time	4.81 s	4.74 s	0.99
	peak CPU mem	1.30 GB	1.31 GB	1.00
Precautionary Savings - Solve & Simulate (irreg)	execution time	214.0 ms	208.1 ms	0.97
	peak GPU mem	2.18 GB	2.18 GB	1.00
	compilation time	5.55 s	5.00 s	0.90
	peak CPU mem	1.37 GB	1.37 GB	1.00
IskhakovEtAl2017DCEGMSimulate	execution time		192.9 ms
	compilation time		4.38 s
	peak CPU mem		1.70 GB
IskhakovEtAl2017DCEGMSolve	execution time		1.966 s
	compilation time		7.23 s
	peak CPU mem		1.59 GB
IskhakovEtAl2017Simulate	execution time	243.7 ms	201.2 ms	0.83
	compilation time	5.19 s	4.33 s	0.83
	peak CPU mem	1.29 GB	1.30 GB	1.01
IskhakovEtAl2017Solve	execution time	53.9 ms	50.9 ms	0.94
	compilation time	0.89 s	0.68 s	0.76
	peak CPU mem	1.15 GB	1.15 GB	1.00
IskhakovEtAl2017DCEGMSimulateGpuPeakMem	peak GPU mem		282 MB
IskhakovEtAl2017DCEGMSolveGpuPeakMem	peak GPU mem		0 MB
IskhakovEtAl2017SimulateGpuPeakMem	peak GPU mem	281 MB	281 MB	1.00
IskhakovEtAl2017SolveGpuPeakMem	peak GPU mem	67 MB	67 MB	1.00

The full discrete-continuous endogenous grid method, re-rooted as a single commit on top of the solver-selection seam (PR #388). The tree reproduces the reviewed feat/dcegm tip exactly, minus two root-level audit scratch files. On top of the seam this adds the `_lcm.egm` engine, DC-EGM build-time validation, the `_build_dcegm_kernels` builder and the EGM-specific SolverBuildContext / SolverKernels / SolutionPhase fields, and replaces the seam's build-time `NotImplementedError` guard with the real solver. To be split into reviewable sub-PRs at review time. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The child stochastic-node expectation built the full node-stacked `node_values`/`node_marginals` arrays and reduced them with a weighted sum downstream of the map. So `stochastic_node_batch_size` (a `lax.map`) shed only the per-node gather working-set while still materialising the full `(..., n_nodes)` output — half a fix, and at large scale the residual stack can bind and even raise peak. Fold the weighted sum into a `lax.scan` carry instead: each block reads only its nodes and accumulates the running expectation, so neither the per-node gather nor the full node-stacked output is materialised. `0` (or a size >= the mesh) keeps the single fused vmap + reduction. The block reduction reorders the floating-point adds, so the value function matches the fused solve to tight numerical tolerance rather than bit-identically; the node-batch invariance test asserts `allclose` at rtol/atol 1e-9. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

- New `docs/credits.md` (linked into the docs nav and the README), crediting the methods pylcm builds on (Carroll EGM; Iskhakov–Jørgensen–Rust–Schjerning DC-EGM; Dobrescu–Shanker FUES; Fella generalized EGM; the Druedahl / Druedahl–Jørgensen / Dobrescu–Shanker multidimensional-EGM methods behind the reserved upper-envelope backend), the discretization methods (Tauchen, Rouwenhorst, Kopecky–Suen, Fella–Gallipoli–Pan), the replicated example models shipped in pylcm (IJRS 2017, Mahler–Yum 2024), QuantEcon, and the open-source ecosystem (OSE, NumEconCopenhagen, akshayshanker, JeppeDruedahl, fediskhakov). - Add the missing BibTeX entries (Carroll 2006, Fella 2014, Dobrescu–Shanker FUES 2022 and RFC 2024, Druedahl 2021, Druedahl–Jørgensen 2017). - Fix the Mahler & Yum (2024) page range in references.bib: 1697–1733 (Econometrica 92(5)), not 1307–1343. Every citation verified against the journal/DOI or the authors' repositories. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

pylcm's Fast Upper-Envelope Scan was adapted from the `OpenSourceEconomics/upper-envelope` package (Apache-2.0, © The Upper-Envelope Authors) and since substantially modified. Both projects are Apache-2.0, so the derivation is permitted; this records the upstream attribution and the modification notice (Apache-2.0 §4(b)/(c)) in the `fues.py` header and the Credits page, which previously credited only the method's paper. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

pylcm's Fast Upper-Envelope Scan is adapted from OpenSourceEconomics/upper-envelope (Apache-2.0). Add a root `NOTICE` recording that derivation and the upstream copyright, and vendor the upstream license text verbatim at `licenses/upper-envelope-LICENSE`. The in-source attribution in `fues.py` (which ships in the wheel via `src/`) already carries the modification notice; these files add the conventional repo/sdist-level attribution. No other part of the codebase derives from third-party source — the DC-EGM kernel and the NEGM Phase-0 toys are written against pylcm's own API — so only the upper-envelope derivation is recorded. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Move the pure spec-introspection helpers (discrete/passive/process state names, child state name, child discrete actions, child resources function and its arg names, regime-function concatenation) out of step.py into egm/regime_introspection.py. These read only a regime's static spec and hold no kernel state, so they form the dependency leaf that the kernel build, the continuation subsystem, and the kernel-scope checks all import from — breaking the otherwise-circular dependency between step.py and the scope checks that move next. Behavior-preserving pure move. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Move the four `_find_unsupported_*` functions — the build-time checks that name features outside the kernel's current scope — out of step.py into egm/kernel_scope.py. They read the processed kernel-build context (period carry targets, qualified flat params, the regime's VInterpolationInfo), which the model-time `validation` module does not have, so they belong with the kernel build rather than with the model-construction validators. step.py keeps calling `_find_unsupported_feature` to install the raising step. Behavior- preserving pure move. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Move the multi-target continuation machinery — target splitting, the per-child carry reader, per-row resources queries and gradients, the stochastic-node expectation, passive-state blending, discrete-choice aggregation (hard max or EV1 logsum), and the child-read build — out of step.py into egm/continuation.py. The core EGM step and the asset-row step keep calling `_get_child_carry_reader` / `_build_child_reads` / `get_egm_continuation_targets`, so step.py shrinks by ~870 lines and the textbook Euler inversion no longer sits in the same file as the multi-target, passive-state, taste-shock, and stochastic-node logic. Behavior-preserving pure move; the bound continuation contract is a follow-up. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Move the certifiable core algorithm — _EgmKernelPieces, _get_solve_one_combo, _compute_nodes_over_savings, _get_compute_node, _compute_constrained_candidates, _publish_V_and_carry_rows, and the constrained-offset constant — out of step.py into egm/step_core.py. It reads the expected continuation through continuation's per-target carry reader and depends on nothing in asset-row mode or the kernel orchestration. step.py keeps the orchestration (build, dispatch, combo map) and the asset-row path (extracted next). The one test importing _publish_V_and_carry_rows from step is repointed to step_core. Behavior- preserving pure move. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Move the per-Euler-node solve — _get_solve_one_combo_asset_rows, its differentiated continuation closure _get_expected_continuation_value, and the scalar-query publisher _publish_node_V_and_policy — out of step.py into egm/asset_row.py. It maps step_core's single-post-state pipeline over the Euler grid and consumes continuation's per-target reader; step.py keeps only the orchestration (build, dispatch, combo map, feasibility, raising step). This completes the split: the textbook core (step_core) and the asset-row extension (asset_row) are now separate, certifiable modules. Behavior-preserving pure move; the streaming refine-to-query feature lands here next. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Collapse the multi-target continuation aggregation into `ContinuationPlan` (build-time statics) plus `bind_continuation`, which binds the plan to one combo pool and the next period's carries and returns a `savings -> (expected value, expected marginal)` callable. The textbook step (`_get_compute_node`) and the asset-row step (`_get_expected_continuation_value`) read only that bound callable instead of each re-deriving the regime-transition probabilities, child carry readers, and per-target aggregation. Behavior-preserving: the bound callable performs the identical aggregation, and it evaluates the probabilities and child reads from the combo pool *inside* the builder, so the asset-row Euler-slot gradient still carries the direct dP/da . EV terms (precomputing them outside the differentiated closure would silently drop them). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…the GridSearch rename. The re-rooted `SolverBuildContext` / `SolverKernels` are beartyped by the package claw, so under PEP 649 their field annotations resolve to real objects when a context is constructed at model build. `UserRegime` and `VInterpolationInfo` (whose module imports `lcm.regime`) close an import cycle through the `lcm.solvers` façade, so each is referenced via a two-form alias — precise for ty under `TYPE_CHECKING`, a bare container for the beartype claw at runtime. The other engine types import normally. The `BruteForce` → `GridSearch` rename had not reached the DC-EGM oracle test modules; propagate it, including the validation-message `match=` string. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

`test_model_with_dcegm_solver_builds` asserts that selecting the DC-EGM solver builds the model rather than being rejected, but it placed the solver on the stock iskhakov `working_life` regime, which lacks the resources, post-decision, and inverse-marginal-utility functions the DC-EGM contract requires — so validation correctly rejected it. Point the test at `build_dcegm_model`, whose regimes satisfy the contract. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…r 2) Remove the residual solver-type fork in the backward-induction loop. The loop no longer branches `if regime.solution.egm_step is not None`; every regime — grid search or DC-EGM — exposes one uniform per-period adapter that wraps its shared jitted core, calls it with the solver's own argument layout, and assembles a `KernelResult(V_arr, carry, sim_policy)` outside JIT. The only branches left in the loop are on the optional generic outputs (`result.carry is not None` / `result.sim_policy is not None`), never on solver type. Contract (`_lcm/solution/contract.py`): - Add `KernelResult` (frozen dataclass) and `PeriodKernel` (runtime_checkable structural Protocol exposing `core`, `build_lower_args`, `with_fixed_params`, `__call__`). - Reshape `SolverKernels` → `SolutionKernels(period_kernels, continuation_template)`, dropping the `max_Q_over_a` / `egm_step` / `egm_carry_template` / `egm_reachable_targets` quartet. - Name the continuation channel solver-agnostically via `ContinuationPayload` (= `EGMCarry` today). Solvers (`_lcm/solution/solvers.py`): - `GridSearch`/`DCEGM.build_period_kernels` build per-period adapter closures (`_GridSearchPeriodKernel` / `_DCEGMPeriodKernel`) over the existing jitted core; the EGM core's reachable-carry filter and param projection move onto the DC-EGM adapter. Engine wiring: - `SolutionPhase` carries `period_kernels` + `continuation_template` (plus a `solves_via_dcegm` property); the cyclic `PeriodKernel` annotation uses the TYPE_CHECKING-precise / runtime-widened two-definition pattern so the beartype claw resolves the dataclass field at construction. - The terminal-continuation publisher is composed engine-side (`_TerminalCarryPeriodKernel` in regime_building/processing.py) as an output decorator, not inside `GridSearch`. - `model_processing._partial_fixed_params_into_regimes` binds fixed params via each adapter's `with_fixed_params`; `simulation/additional_targets.py` reads `solves_via_dcegm`. Compilation reuse and the loop's memory/host-offload/gc discipline are preserved: only the shared core is jitted and identity-deduped (by `id(Q_and_F)` / function identity), no adapter is jitted per period, and the solve loop's device eviction / carry-buffer release / reachable-carry subset logic is untouched (dispatch re-routed only). Add `test_period_kernels_sharing_a_config_reuse_one_compiled_core` asserting the distinct-compiled-core count. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

In asset-row mode the per-node solve refines a full NaN-padded upper envelope of static length n_pad and interpolates it at exactly one query (resources_at_node) to publish a scalar (V_node, policy_node), then discards it. Batched over combos x Euler-nodes, that n_pad scratch axis is the binder that OOMs large solves. Fold the single-query interpolation into the upper-envelope scan so the n_pad rows never materialize: - Extract `_interp_between_nodes` in interp.py — the pure two-bracket-node arithmetic plus the Hermite correction, shared by `interp_on_prepared_grid` and the streamed path. The existing interp unit tests pin it (behavior- preserving). This guarantees the streamed value cannot diverge from row-then-interp: only which two nodes differs. - Add `refine_to_bracket` + `QueryBracket` in fues.py — a new scan driver reusing `_inspect_candidate` verbatim, changing only the emission sink: each step's up-to-3 emitted points fold into an O(1) bracket-capture carry (first/second, rolling prev/last, lo = latest emitted with grid <= q, hi = first with grid > q, running counts), no [n_input,3] stack and no [n_pad] rows. Post-scan, the bracket reproduces clip(searchsorted(side="right"), 1, max(n_kept-1, 1)) node-for-node: the side="right" tie-break at a duplicated kink (right copy wins the lower slot), the below-first clamp (first, second), and the at-or-above-last clamp (second-last, last; single-live falls back to the NaN-padded slot like the row path). Geometry only — no utility, borrowing limit, or floor. - Add `publish_node_from_bracket` in asset_row.py — the scalar-bracket counterpart of `_publish_node_V_and_policy`: the shared `_interp_between_nodes` with the value Hermite slope = grad(utility) at the two bracket policies only, plus the constrained floor and the n_kept > n_pad overflow poison, preserved identically. Wire the asset-row node solve to refine_to_bracket -> publish_node_from_bracket and drop the n_pad envelope from that path. Scope: asset-row mode only. Single-post-state mode publishes the refined envelope AS its inter-period carry (queried later at many parent points), so step_core's single-post-state path keeps calling `refine` unchanged. The new equivalence test pins the streamed pair against refine + _publish_node_V_and_policy for the same candidates + query, to fp tolerance: smooth, kinked, multi-crossing, all-dead, single-live, overflow, and query below-first / above-last / exactly on a duplicated kink abscissa. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The backward-induction loop forced a `gc.collect()` every period to return the device pool from a rolled-off continuation carry (a registered pytree CPython frees only on a cyclic collection). A grid-search period rolls no such carry, so the collection bought nothing there while its fixed cost dominated the warm solve of small / many-period models — deteriorating those benchmarks by up to ~25x (the Iskhakov-et-al grid-search warm solve: 1.01s -> 0.35s). Gate the collection on the loop's own per-period carry output (`period_egm_carries`), empty exactly when no carry rolled. The backward- induction loop stays backend-agnostic — it does not fork on solver type or read `solution.solves_via_dcegm`; the clean separation of the solve backends is a separate design task. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The Dobrescu-Shanker housing keeper (no-house-trade) regime is a plain 1-D DC-EGM model: liquid assets is the Euler state, consumption the continuous action, and housing a passive continuous state whose service flow `alpha * log(housing)` is read directly by utility. The cornerstone test locks in that the DC-EGM envelope-condition guard admits utility reading a *passive* continuous state (it forbids only the Euler state) by calling `validate_dcegm_regimes` directly on the finalized keeper regimes — no kernel trace, no solve. The full GPU solve is gated off the local box. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Phase-0 of the NEGM / multidimensional DC-EGM build (general 2-asset capability): - `negm_phase0/brute_feasibility_probe.py`: sweeps a 2-continuous-state (liquid + illiquid) + 2-continuous-action + income-shock lifecycle solved with BruteForce, reporting wall-clock and GPU peak. Finding: brute is single-solve feasible at fine grids (~4-9 s, <600 MB on a V100 up to 2.5 B combos/period) — the "brute breaks at Laibson scale" premise does not hold; NEGM is an accuracy/throughput/generality play, not a feasibility rescue. - `negm_phase0/kinked_toy_oracle.py`: the smallest 2-asset toy carrying the Laibson frictions (credit-card rate kink at a^X=0, withdrawal penalty at Iz=0, Z>=0 floor, u(C+iota*Z)), brute-solved as the V-parity oracle for a future NEGM prototype. - `negm_phase0/negm-phase0-findings.md`: brute-feasibility table, the NEGM nesting derivation with the P1 kink-candidate handling, build-vs-buy, and the gate recommendation (build for the general capability; reframe the motivation). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…del. Implements P2 of the Dobrescu–Shanker replication: the NEGM solver as a third `Solver` on the ABC seam, composing an inner 1-D `DCEGM` solve of the consumption-savings problem with an outer deterministic grid search over a durable/illiquid margin. - `NEGM(Solver)` in `_lcm/solution/solvers.py` (re-exported from `lcm.solvers` and `lcm`): frozen dataclass with `inner: DCEGM`, `outer_action`, `outer_post_decision`, `outer_grid`, optional `outer_no_adjustment_candidate`. `__post_init__` guards reject a stochastic outer grid and margins that coincide with the inner ones (`RegimeInitializationError`), mirroring DCEGM's `_fail_if_*` helpers. - `_NEGMPeriodKernel`: non-jitted outer adapter over the shared jitted inner DC-EGM core; sweeps the outer grid plus the per-node no-adjustment candidate, binds the outer post-decision into the regime's flat params per node, and collapses the outer axis by `max`. Static only (ty + prek green); not run. - `validate_negm_regimes` in `_lcm/egm/negm_validation.py`, wired into the same model-build path as `validate_dcegm_regimes`. Fail-loudly contract (`ModelInitializationError`, each message naming the alternative solver): no outer margin → use `DCEGM`; outer margin coincides with the inner margin → reject; coupled-2-Euler (outer post-decision feeds the inner Euler law / savings-stage pool, or a non-additively-separable utility cross-term) → use the 2-D EGM foundation; taste-shocked discrete choice → reject (ordering). - Kinked-toy NEGM model (`tests/test_models/negm_kinked_toy.py`) and the G1 parity test (`tests/solution/test_negm_kinked_toy.py`, skipped — GPU-only). - Unit tests: 6 config-guard cases, 8 contract-validator cases (construct and assert-raise, no solve). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The DS housing-keeper DC-EGM probe gains a brute-force (GridSearch) twin and a DC-EGM-vs-brute value-parity test, mirroring tests/solution/test_egm_passive. The brute twin solves the same keeper economics without the Euler/passive split: housing is a regular fixed continuous state, consumption a grid-searched action, the liquid-asset law of motion reads consumption directly, and a borrowing constraint floors post-decision liquid assets at the borrowing limit (the floor of the DC-EGM savings grid). build_working_regime, build_model, and build_params gain a variant arg ("dcegm" | "brute"). The parity test asserts DC-EGM >= brute and allclose at the passive- interpolation tolerance, excluding the lowest liquid-asset nodes where the coarse consumption grid makes brute itself unreliable. It is skip-marked (gpu-01 only) since it solves and OOMs the local box. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The splay/batch-size invariance assertions hard-coded float64 tolerances (rtol/atol 1e-12 for the combo axes, 1e-9 for the stochastic-node reduction), which the 32-bit CI job cannot meet: rescheduling the lax.map / reordering the weighted-sum adds shifts results at single-precision scale. Key the tolerance off conftest's X64_ENABLED, matching the established pattern in test_economic_validation / test_grids. Also stop test_egm_interp forcing dtype=jnp.float64 for its interpolation grid; under --precision=32 that raises a float64-truncated-to-float32 UserWarning that CI treats as an error. Use the canonical default float dtype instead. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The solve module's job is backward induction; name it for that. Renames src/_lcm/solution/solve_brute.py → backward_induction.py and tests/solution/test_solve_brute.py → test_backward_induction.py, updates every import (model.py, simulation/compile.py, test_beartype_claw.py), the two test-function names, and the doc/docstring references. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Add `envelope_at_query`: the query-side counterpart of the full-row refiners. It evaluates the branch-aware upper envelope directly at a set of query abscissae — a fixed-shape (n_query, n_segment) bracket-and-reduce with no sequential scan and no NaN-padded refined row. For each query the value is the maximum over the live bracketing segments; the policy and marginal are the winner's. A folded branch contributes several bracketing segments, so the maximum is exact for the piecewise-linear correspondence, and a value tie breaks right-continuously (the larger-slope segment) to match the kernel's side="right" read. Topology is explicit: a segment is a consecutive-pair link whose endpoints share a segment_id, so unrelated branches are never bridged. Verified against the host oracle on a clean crossing, a folded branch, a non-bridging branch, and an out-of-range query. This is the backend asset-row mode wants — one query per Euler node, no full envelope to refine — and the machinery the NEGM outer envelope migrates onto so a sub-grid adjustment island is never sampled away. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The nested-EGM outer fold read every adjuster candidate onto the keeper's shared cash-on-hand grid and took the running maximum at those nodes — a sampled maximum that silently drops an adjuster branch winning only on a coh interval strictly between two shared nodes (a thin (S, s) adjustment island). Recover the island with a bounded, fixed-size merge: each fold also reads the running envelope at the candidate's own coh nodes and records that adjuster's single best interior win in its own pre-allocated column, so the published carry grows by the adjuster count (additive), never the grid product. A win must clear a relative noise floor — the shared-grid read is a cubic-Hermite interpolant, so a smooth branch's apparent excess is O(h^2) representation noise, whereas a genuine island beats the envelope by an economically meaningful margin (measured separation ~1000x). 'finalize' splices the peaks in and sorts into ascending coh order. The published carry is therefore n_adjusters wider than the keeper's. That width is an AOT-compilation invariant — the parent period's kernel is lowered against the regime's continuation template — so widen the NEGM regime's template by the same bounded count. Flips the strict-xfail (S,s) island red gate to green; the synthetic build_outer_envelope_carry battery and all NEGM solve tests stay green. The kinked-toy CPU simulate near-tie (a documented cross-backend grid-search argmax difference) is unchanged — the splice is transparent to it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The LTM/MSS topology fix (consume `segment_id`) lets the upper-envelope backends and the host oracle avoid bridging unrelated branches when an explicit topology is supplied. The production single-post-state kernel supplies none, and that is correct, not an omission: the candidate chain it refines is one connected branch (the constrained segment joins the unconstrained Euler branch continuously at the borrowing limit), within-period discrete choices are maximized across combos in the V array rather than stacked into one refine, and a future-kink-induced fold is data-emergent (a decrease in the inverted endogenous grid) rather than known a priori. So there is no explicit multi-branch topology for the kernel to emit; explicit `segment_id` is the oracle/test path. Document the contract at the call site. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…RFC spatial-index) The four 2-D region clouds (ucon/dcon/acon/con) each invert their FOCs assuming their own constraints bind, so a node where the region's inequalities fail is a finite but KKT-inconsistent candidate. These were computed (`valid_region`) but not dropped before the rooftop cut and publish, because doing so regressed the pension RFC solver: masking thins the valid survivors at coarse grids, and the publisher preferred the *largest-area* containing simplex, whose linear interpolant spans a wide arc of the curved value surface and errs at the low-pension corner (p90 0.20 -> 0.246 on `test_ds_pension_rfc_solver`). Root cause was the simplex selection, not the mask: barycentric interpolation error scales with triangle area x surface curvature, and the degeneracy floor already rejects ill-conditioned triangles, so the *smallest* non-degenerate containing simplex is both well-conditioned and most accurate. Prefer it. With that publish, applying the KKT mask now improves accuracy instead of regressing: `test_rfc_two_asset_parity` p90 0.044 -> 0.015, and the pension RFC solver passes (p90 < 0.20). 80 two-asset RFC/G2EGM/pension/publisher/oracle tests stay green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The segment mesh marked a node valid on finiteness and positive consumption alone, so a finite-but-KKT-inconsistent cloud node (a region's FOCs solved where its inequalities fail) entered the triangulation. AND the region's `valid_region` mask into `valid_node` so only genuine KKT candidates form the mesh. Unlike the RFC interpolating publish, G2EGM's across-segment maximum already dominates these spurious nodes (they carry a suboptimal, lower value), so the mask does not regress: a target a masked node used to cover becomes a hole and falls to the accurate direct-Bellman fill. 22 G2EGM/two-asset parity tests stay green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

EGM needs (u')^{-1} to recover the action from the Euler equation; a model whose marginal utility has no closed-form inverse could not be solved. Add a bracketed bisection that inverts any continuous, strictly decreasing u' numerically. The load-bearing piece is the gradient: a fixed-iteration bisection returns correct values but, differentiated naively, wrong or iteration-dependent gradients. The forward solve is detached (stop_gradient) and the derivative is supplied by the implicit-function theorem through one corrector step, so dc*/dm = 1/u''(c*), dc*/dθ = -∂_θ u'(c*) / u''(c*) exactly, independent of the iteration count. The unrolled where-based loop is jittable and vmap-safe. Tests: CRRA value parity to rtol 1e-7; the parameter derivative matched against the analytic CRRA derivative AND a finite-difference reference (the contract that guards against the wrong-gradient failure mode); residual convergence on u'(c)=c^{-2}+e^{-c} (no closed-form inverse); vmap+jit. Kernel auto-selection (use this when no analytic inverse_marginal_utility is supplied) is the follow-on wiring. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Make 'inverse_marginal_utility' optional for the DCEGM solver: a regime that omits it is solved by the iEGM path, which inverts u' numerically from 'utility'. - Validation: drop 'inverse_marginal_utility' from the required-functions set, and skip the (u')^-1-vs-jax.grad(utility) consistency check when it is absent. - step.py: extract the analytic inverse only when present, else None. - step_core.py: '_EgmKernelPieces.inverse_marginal_utility_func' is now optional; when None, build the inverter from the action-derivative of the combo-bound 'utility_of_action' (jax.grad) over a savings-scaled bracket. The analytic path is unchanged (byte-identical) when the inverse is supplied. End-to-end test: the IJRS CRRA retirement model solved with the analytic inverse removed reproduces the analytic-inverse value function to rtol 1e-6, so the numeric root finder recovers the same (u')^-1 through the full solve. Existing DC-EGM/NEGM analytic-path and validation tests stay green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Generalize the linear continuation E[V'] over the one-step stochastic lottery to a certainty-equivalent g_inv(E[g(V')]) — the form the Epstein-Zin / power-mean and entropic-risk operators take. A regime opts in by supplying a value_transform (g) / inverse_value_transform (g_inv) pair in its functions; both read continuation_value (the next value array) from the engine plus any params. With neither supplied the operator is the identity and the expectation stays the plain weighted average — byte-identical, default fixtures unchanged. - Q_and_F seam: the stochastic-state average becomes inverse_value_transform(average(value_transform(V'), weights)) when a pair is present; regime mixing stays linear (the documented first implementation). - Build: _build_continuation_operator concatenates the pair from the regime functions (excluding H); supplying only one half is a ModelInitializationError. Their non-continuation_value args join the Q_and_F signature so params reach them at solve time. - Params template: continuation_value is engine-provided, so it is dropped from the discovered params (alongside the existing marginal_continuation handling, refactored into _drop_engine_provided_args). Tests: identity pair reproduces the linear solution byte-for-byte; the entropic CE on a known lottery equals the hand value and lies below the mean (risk penalty); a level-shift operator changes the solved value function (end-to-end wiring + param read); supplying one transform without its inverse is rejected. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…dness caveat From Pro's adjudication of the roadmap-tail work: - W1 (A.4): the implicit-function corrector is valid only at an interior bracketed root; outside the bracket the detached bisection clamps to a bound and the interior 1/u'' Jacobian would silently extrapolate a wrong gradient. Guard it: the target is bracketed iff u'(c_upper) <= m <= u'(c_lower); the unbracketed branch returns the clamped root detached, so the binding-bound active-set derivative is 0, not the bogus interior slope. New test pins value-at-bound + zero gradient. - RFC publish (A.3): the docstring still claimed largest-area simplex selection while the code selects smallest; correct it, and document that the area floor is not a shape gate (a sliver can pass it) — the robust refinement is a most-local well-conditioned simplex, deferred since the area gate suffices on the models exercised. - W4a (A.5): document the soundness boundary — wrapping only the per-target stochastic average with linear regime mixing equals the joint certainty equivalent only for affine operators or a single reachable target; a non-linear operator (EZ/entropic) over multiple targets is unsound and needs the joint-lottery form (with is_linear capability metadata), not yet implemented. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Pro's review flagged that wrapping only the per-target stochastic average and then mixing target regimes linearly is unsound for a non-linear operator: sum_r p_r g_inv(E_z[g(V_rz)]) != g_inv(sum_rz p_r p_z g(V_rz)) unless g is affine or there is one reachable target. Fix it properly (not a guard): split the operator into _transform_and_average (transform + stochastic average per target, no inverse) and _invert_joint (apply g_inv once). The seam now accumulates the regime-mixed transformed continuation sum_r p_r E_z[g(V_rz)] and inverts once, giving the genuine joint certainty equivalent g_inv(sum_rz p_r p_z g(V_rz)) — sound across both regime and shock uncertainty. The diagnostic compute_intermediates path mirrors the seam so its reported E_next_V / Q match the solved value for an operator model. Default (no operator) and affine operators are byte-identical (g_inv outside the linear mix collapses to the old form). New test pins the joint CE against the hand value and shows it differs from the per-target form when the two regimes' continuations differ. _apply_continuation_operator is kept as the single-lottery composition for the unit tests. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Pro: the area floor (DEGENERATE_AREA_FLOOR) rejects collinear triangles by area only, not shape, so an ill-conditioned sliver can pass it and give an unstable affine interpolant. Add a normalized mean-ratio shape gate Q = sqrt(3)*2A / (sum of squared edges) in (0,1] (1=equilateral, ->0 sliver) and select the smallest-area *well-conditioned* containing simplex (Q > SHAPE_QUALITY_FLOOR), falling back to the smallest containing one only when every container is a sliver (coverage over conditioning), then the nearest survivor. Replaces the bare smallest-area rule. 25 RFC/G2EGM/publisher/parity tests stay green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…lity Replace the fixed 60-iteration bisection forward solve with a safeguarded Newton iteration in log-consumption. In log c the CRRA family is exactly linear (log u' = -crra log c), so Newton converges in a single step and stays well-conditioned for any power-law-like u' whose steep-near-zero, flat-at-large-c curvature makes a plain-c Newton overshoot into a slow bisection tail. The bisection fallback (taken when a Newton step leaves the live bracket) preserves robustness. The detached-root + analytic IFT corrector gradient contract and the bracket/active-set guard are unchanged. A 15-iteration convergence test on the wide [1e-8, 1e6] bracket — where 15 bisection steps cannot reach the root — pins the solve as Newton-driven. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Thread a scan_unroll factor through refine_envelope and refine_to_bracket into both FUES jax.lax.scan call sites, exposed as DCEGM.fues_scan_unroll (validated >= 1) and wired through the get_upper_envelope / get_bracket_finder dispatch. The FUES scan is sequential and latency-bound on accelerators; unrolling trades compile time for fewer loop-carry round trips. Pure performance knob — the refined envelope is bit-identical across unroll values (numerical-invariance test at unroll in {2,4,8}); config rejects unroll < 1. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The query-side backend's dense (n_query, n_segment) bracket-and-reduce is itself the memory wall at scale. Add an opt-in segment_block_size that swaps it for a two-pass blocked scan over segment blocks, peaking at (n_query, block): - pass 1 accumulates the running per-query max (the envelope value); - pass 2 re-scans, keeping the global max-slope winner among segments within the tie tolerance of that fixed envelope value (the right-continuous tie-break). Both passes are exact associative folds against a fixed target, so the result matches the dense path up to floating-point reassociation between the two XLA lowerings (parity test at block sizes {1,2,3,4}, rtol 1e-12). Default (segment_block_size=0) keeps the dense path unchanged. Refactored the shared link construction into a _SegmentLinks struct. The memory/GPU payoff is gpu-01-validated; this lands the correct, opt-in mechanism. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ming doc A.3 (rfc_2d.py): the shape-quality formula returned half the standard mean ratio (0.5 at equilateral, not 1). Use the standard Q = 4*sqrt(3)*A / sum(l^2) so Q=1 is equilateral; the gate is unchanged (SHAPE_QUALITY_FLOOR 0.05 -> 0.10 to match the doubled scale). Docstring corrected. W1 (numeric_inverse.py): the log-consumption solve needs u' > 0 and m > 0. Add a log_well_defined gate (m>0, u'(c_lower)>0, u'(c_upper)>0, finite log m); when it fails (bliss point, marginal touching zero, non-finite branch) return NaN so the kernel's NaN diagnostics name the (regime, period) instead of the log path silently returning a clamped bound. Byte-identical for valid CRRA inputs; two fail-loud tests added. A.5 (Q_and_F.py): document the joint-unresolved-lottery timing assumption on the continuation operator — g_inv applies once over the full joint regime+shock lottery; staged resolution (regime revealed before continuation risk, or shock-dependent regime probs) would need an explicit operator-scope option. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Generalize the keeper's injected durable transition from a hard-coded identity to the regime's outer_no_adjustment_candidate function: the keeper holds the durable at next_<durable> = keep(<durable>) instead of always = <durable>. With no candidate (None) the auto-identity is kept, so the default path is unchanged; a depreciating keep(d) = d(1-delta) lands the kept stock off the durable grid and the inner DC-EGM's passive read blends it over the grid's nodes. Byte-identical at delta=0: keep_housing returns the stock unchanged, so the DS-2024 and App.2 housing keeper solves are unaffected (3 + 12 build/parity tests green). Enables the DS-2024 delta=0.10 keeper (next). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Thread `delta` into `build_model` and bake it into the keeper's no-adjustment map via `_make_keep_housing(delta)`: `keep_housing(h) = h(1 - delta)`. With the keeper-durable wiring (outer_no_adjustment_candidate), the NEGM keeper holds the house at the depreciated level for free and the inner DC-EGM's passive read blends the continuation off the housing grid, so the model is faithful at any delta. At delta=0 the map returns the stock unchanged (h * 1.0 == h), so the keeper solves are byte-identical and the 3 build/parity tests stay green. The brute grid-search twin remains a valid oracle only at delta=0 (it searches next_housing on the grid, where the free-keep level h(1-delta) lands only when delta=0); the delta>0 keeper is validated against a dense host VFI oracle (next). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…oracle Add a dense host VFI oracle (`tests/solution/_ds2024_housing_vfi_oracle.py`) that mirrors the NEGM nest and searches the free-keep level `h(1-delta)` as an explicit candidate alongside the outer house grid. At delta>0 that level is off the house grid, so the on-grid brute twin cannot represent it — the oracle is the only valid reference for the depreciating keeper. Two interior tests: - the oracle reproduces the brute solve at delta=0 (pins its economics); - the NEGM solve at delta=0.10 reproduces the oracle (mean interior |dV| ~0.08, the same grid-resolution agreement as the delta=0 brute pair) — the keeper's depreciated hold is faithful. Exclude the oracle helper from the test-naming hook, matching the other tests/solution/_*_oracle.py helpers. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…durable A reviewer flagged that `_build_coh_shift_function` lifts adjusters using the identity keeper reference (`next = durable`) rather than the keeper core's no-adjustment level `keep(durable) = durable(1-delta)`, which the outer-envelope design doc says should be the credited-cost reference. Switching to `keep(durable)` in isolation is the design-exact shift, but a differential VFI-oracle test shows it *regresses* DS-2024 delta=0.10: the corrected lift places each adjuster at a coh the fixed keeper-grid envelope reads by extrapolation, and the one-island-per-adjuster splice over-counts, moving the converging ~0.08 interior agreement to a plateaued ~0.18. The identity reference and the island splice are tuned together; the shipped solver converges (mean 0.0825 -> 0.0759 as n_grid 10 -> 18). Correcting both belongs with the segment-topology outer-envelope redesign, so document the coupling here rather than ship a regression. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The Australia-2015-16 capital-income tax brackets are not printed in the DS-2026 paper text (only Figure 9); they are transcribed from the co-authors' replication repo. Record the exact provenance so upstream drift is detectable: GitHub akshayshanker/FUES, examples/housing_renting/config_HR/STD_RES_SETTINGS_4_TAXES/ master.yml, pinned at commit 00961e0b (blob 41661779), and a sha256[:16] checksum (4e8d1bf0748f6933) of the resolved (LOWER, OFFSET, RATE) arrays. All 10 brackets verified to match the source tax_table.brackets exactly. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The GPU peak-memory tracker spawns `python -m benchmarks.asv._gpu_mem` with cwd=_PROJECT_ROOT, but _PROJECT_ROOT resolved to the benchmarks/ directory, not the repo root, so the subprocess could not import `benchmarks` and every GpuPeakMem benchmark failed with `ModuleNotFoundError: No module named 'benchmarks'` (the whole benchmark-pr workflow red). The file moved one level deeper in 7dc5887 (benchmarks/ -> benchmarks/asv/) but the `.parent.parent` chain was not updated. Point it at the true repo root (three parents up). Verified locally: `python -m benchmarks.asv._gpu_mem` resolves from the corrected cwd and fails only on missing argv, no longer on import. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

W1 made `inverse_marginal_utility` optional: when a regime omits it, EGM derives the inverse numerically (`numeric_inverse.py`; validation.py:298 documents this, and tests/solution/test_iegm_wiring.py covers a model solving without it). So removing that function is no longer a contract violation, and the negative case in test_dcegm_contract_violation_raises stopped raising — failing `main` deterministically on every platform. Drop the now-contradictory case; the iEGM path's positive coverage already lives in test_iegm_wiring.py. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The DC-EGM solve/oracle battery added on this branch makes the serial suite ~86 min on CI runners. Bake `-n 4` into the tests / tests-32bit / tests-with-cov tasks so every CI job (and local `pixi run tests`) distributes across 4 workers; `--dist loadfile` (already in addopts) keeps each file on one worker. Set `XLA_PYTHON_CLIENT_PREALLOCATE=false` on the tasks — a no-op on CPU, but required on GPU so the 4 worker processes don't each preallocate ~75% of the device and OOM; they grow to their actual small per-test need, and loadfile keeps the one memory-heavy file (Mahler-Yum) isolated to a single worker. Verified locally: 4/4 workers spin up and pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The state-action-param bundle threaded through `Q_and_F`, `max_Q_over_a`, and their `_transform_and_average` / `_invert_joint` helpers carries every flat param, including lookup-table params represented as `MappingLeaf` / `SequenceLeaf` (tax schedules, pension accrual tables). The bundle was annotated `FloatND | IntND | BoolND`, which excludes those leaf types, so the package beartype claw rejected a `MappingLeaf` param whenever its O(1) sampling happened to draw that key — an intermittent failure independent of the model's correctness. Widen the bundle annotation to the canonical `_ParamsLeaf` union. The two concrete taste-shock extractions (`taste_shocks__scale`, `taste_shock_key`) are narrowed back at their call sites with `cast`, since `logsum_and_softmax` and `draw_taste_shock_noise` require the scalar/array types. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…iles) The single self-hosted GPU runner is starved of host CPU/memory when four xdist workers each AOT-compile the DC-EGM oracle suite in parallel, and the runner loses communication with the server (~60 min, then death) instead of finishing the suite in roughly a quarter of that serially. Append `-n0` to the two GPU steps so pytest-xdist's last `-n` wins and the suite runs in-process. The CPU matrix keeps `-n 4`, where the four workers are a clear speedup. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The macOS runner has 3 vCPUs and 7 GB of RAM; four xdist workers each AOT-compiling the DC-EGM solve/simulate/oracle tests peak past its memory and swap-thrash the suite to ~4 hours. A `pytest_collection_modifyitems` hook marks that battery `slow` (all of `tests/solution/`, plus the DC-EGM-family modules elsewhere), and the macOS matrix leg runs `-m "not slow"`. Linux and the GPU runner still run the full suite, so the platform-independent kernel stays covered; macOS keeps the lighter engine/regime/grid/simulation tests. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The committed `pixi.lock` is written in the format pixi 0.71.x emits; pixi 0.70.1 cannot parse it and aborts every job at install with `expected a string, found table`. Pin the `setup-pixi` steps to the version that produced the lock so the frozen install succeeds. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

`test_simulate_using_raw_inputs` calls `simulate` directly, bypassing the public `Model.simulate` canonicalization, so its `flat_params` must already carry canonical dtypes. `jnp.asarray(0)` is `int64` under x64, but the params-leaf contract is `int32`; `_invert_joint`'s beartype check rejected it whenever its O(1) sampling drew that key. Pin the leaf to `int32` — the dtype `cast_params_to_canonical_dtypes` produces on the public path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

hmgaudecker and others added 3 commits June 17, 2026 15:01

hmgaudecker force-pushed the feat/solver-seam branch from e0fecdd to 44e8522 Compare June 21, 2026 15:29

hmgaudecker and others added 21 commits June 21, 2026 18:09

hmgaudecker force-pushed the feat/dcegm branch from 5c4aeff to a4b3023 Compare June 21, 2026 16:35

hmgaudecker and others added 30 commits June 25, 2026 13:17

Update lock file using correct syntax.

fa32a01

Merge branch 'feat/type-local-continuation-v' into feat/dcegm

e362009

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add the DC-EGM solver (discrete-continuous endogenous grid method)#390

Add the DC-EGM solver (discrete-continuous endogenous grid method)#390
hmgaudecker wants to merge 186 commits into
feat/type-local-continuation-vfrom
feat/dcegm

hmgaudecker commented Jun 19, 2026 •

edited

Loading

Uh oh!

review-notebook-app Bot commented Jun 19, 2026

Uh oh!

read-the-docs-community Bot commented Jun 19, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 19, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

hmgaudecker commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Commit structure (reviewable in steps)

Solvers and accuracy work built on the DC-EGM seam

Validation

Pending (does not block reviewing the engine)

Uh oh!

review-notebook-app Bot commented Jun 19, 2026

Uh oh!

read-the-docs-community Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Documentation build overview

Uh oh!

github-actions Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmark comparison (main → HEAD)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

hmgaudecker commented Jun 19, 2026 •

edited

Loading

read-the-docs-community Bot commented Jun 19, 2026 •

edited

Loading

github-actions Bot commented Jun 19, 2026 •

edited

Loading