Skip to content

Latest commit

 

History

History
314 lines (230 loc) · 10.7 KB

File metadata and controls

314 lines (230 loc) · 10.7 KB

bayesian-pinn-inversion

Bayesian Inversion for Diffusion Coefficient Recovery
Using Kernel Surrogates with Posterior Error Quantification

Akhilesh Yadav — Independent Research, 2026
Applied Mathematics · Numerical PDEs · Bayesian Inverse Problems


Overview

This project implements a complete Bayesian inversion pipeline for recovering the spatially varying diffusion coefficient $a(x)$ in the 1D elliptic PDE

$$-\frac{d}{dx}!\left[a(x),\frac{du}{dx}\right] = f(x), \quad x \in (0,1), \quad u(0)=u(1)=0,$$

from $N=15$ noisy pointwise observations of $u$.

The central result is a rigorous quantification of surrogate-induced posterior error via the Wasserstein-2 distance:

$$\bar{W}_2(\pi^{\text{FEM}},, \pi^{\text{surr}}) \xrightarrow{n_{\text{train}}\to\infty} 0.$$

$n_{\text{train}}$ Surrogate $L^2$ error $\bar{W}_2$
5 $7.7 \times 10^{-2}$ 0.312
20 $1.3 \times 10^{-1}$ 0.335
80 $1.1\times10^{-2}$ 0.048

W₂ improvement: 6.5× from n=5 to n=80.
All 4 true parameters recovered inside 95% credible intervals by the FEM posterior.


Quick Start (3 commands)

# 1. Install dependencies (pure Python scientific stack — no GPU required)
pip install numpy scipy matplotlib

# 2. Run the experiment  (~3–5 min on a laptop)
python src/run_experiment.py

# 3. Generate all figures
python make_figures.py

Results are saved to results/; figures to figures/.


Installation

Requirements

Package Minimum version Purpose
Python 3.9
NumPy 1.24 Numerical arrays
SciPy 1.10 Sparse linear algebra (FEM)
Matplotlib 3.7 Figures

No GPU, no deep learning framework, no FEM library needed.
Everything runs on the standard scientific Python stack.

Install

# Recommended: create an isolated environment
python -m venv venv
source venv/bin/activate   # Linux/macOS
# venv\Scripts\activate    # Windows

pip install numpy scipy matplotlib

Repository Structure

bayesian-pinn-inversion/
│
├── src/
│   ├── fem_solver.py        P1 FEM for -div(a grad u) = f  (from scratch)
│   ├── mcmc_sampler.py      MH-MCMC, NW kernel surrogate, W_2 distance
│   ├── pinn_surrogate.py    PINN with exact analytical u', u'' (chain rule)
│   └── run_experiment.py    ← MAIN SCRIPT — run this first
│
├── make_figures.py          ← RUN SECOND — generates all 8 figures
│
├── tests/
│   ├── test_fem_convergence.py    6 tests: convergence rates, BCs, SPD, energy
│   └── test_mcmc.py               8 tests: prior, W_2, MCMC acceptance, CI
│
├── figures/                 Created by make_figures.py
│   ├── fig1_data.png
│   ├── fig2_posteriors_fem_vs_best.png
│   ├── fig3_all_surrogates.png
│   ├── fig4_reconstruction.png
│   ├── fig5_wasserstein.png
│   ├── fig6_diagnostics.png
│   ├── fig7_predictive_check.png
│   └── fig8_results_table.png
│
├── results/                 Created by run_experiment.py
│   ├── results_summary.txt  Human-readable numerical results
│   ├── summary.json         Machine-readable results (used by make_figures.py)
│   ├── samps_fem.npy        FEM posterior samples  (2400 × 4)
│   ├── samps_surr_5.npy     Surrogate posterior    (n=5)
│   ├── samps_surr_20.npy    Surrogate posterior    (n=20)
│   ├── samps_surr_80.npy    Surrogate posterior    (n=80)
│   └── trace_fem.npy        Log-posterior trace    (3000,)
│
├── paper_notes/
│   └── mathematical_derivations.md    Full derivations, theorems, convergence tables
│
└── README.md                ← You are here

Detailed Usage

Step 1 — Run the experiment

python src/run_experiment.py

This does the following:

  1. Generates observations: solves the PDE with the true coefficient theta_true = [1.5, 0.8, 2.0, 1.2] using P1 FEM (N=150 nodes), then adds Gaussian noise with σ_n = 0.02.

  2. Runs MCMC with exact FEM (3000 steps, burn-in 600). Prints posterior mean, std, and 95% credible intervals for all 4 parameters.

  3. For each n_train in [5, 20, 80]:

    • Builds a Nadaraya-Watson kernel surrogate using n_train FEM evaluations.
    • Runs MCMC with this surrogate.
    • Computes W₂ distance between the surrogate and FEM posteriors.
  4. Prints convergence table showing W₂ → 0 as n_train increases.

  5. Saves results to results/.

Expected output (last part):

[STEP 4/5]  Convergence summary

  n_train | Surr. L2 err |   Mean W_2
  --------------------------------------
        5 |   7.6965e-02 |    0.31240
       20 |   1.3033e-01 |    0.33447
       80 |   1.1352e-02 |    0.04828

  W_2 improvement factor (n=5 -> n=80):  6.5x

Expected run time: 3–5 minutes on a single CPU core.


Step 2 — Generate figures

python make_figures.py

Generates 8 figures in figures/:

Figure Description
fig1_data.png True PDE solution, noisy observations, and $a_{\text{true}}(x)$
fig2_posteriors_fem_vs_best.png Marginal posteriors: FEM vs best surrogate (n=80)
fig3_all_surrogates.png Marginals for FEM and all surrogates (n=5, 20, 80)
fig4_reconstruction.png $a(x)$ reconstruction: posterior mean + 95% credible band
fig5_wasserstein.png W₂ convergence plot + per-dimension bar chart
fig6_diagnostics.png Log-posterior trace, autocorrelation, running means
fig7_predictive_check.png Posterior predictive check: predicted $u(x)$ envelopes
fig8_results_table.png Numerical summary table

Step 3 — Run the tests (optional but recommended)

python tests/test_fem_convergence.py
python tests/test_mcmc.py

Expected output:

==================================================
  FEM Convergence Tests
==================================================
PASSED  boundary_conditions
PASSED  stiffness_matrix_spd  (cond = 1.89e+02, lambda_min = 5.3694e-04)
PASSED  l2_convergence_constant_a  rates = [3.86, 3.93, 3.97]  (expected ~2.0)
PASSED  h1_convergence_constant_a  rates = [1.93, 1.96, 1.98]  (expected ~1.0)
PASSED  l2_convergence_variable_a  rates = [1.95, 2.01, 2.14]  (expected ~2.0)
PASSED  energy_balance  (residual = 2.48e-14)

  ALL 6 FEM TESTS PASSED
==================================================

==================================================
  MCMC and Wasserstein Tests
==================================================
PASSED  make_a_func
PASSED  log_prior_negativity
PASSED  log_prior_mean  (E[theta]=1.866, theory=1.868)
PASSED  posterior_95_ci  (coverage=0.9500)
PASSED  wasserstein_identical  (W_2=0.00e+00)
PASSED  wasserstein_known_gaussians  (W_2=1.0078, theory=1.0000)
PASSED  wasserstein_converges  (n=5: 0.3124, n=80: 0.0483, factor=6.5x)
PASSED  mcmc_acceptance_rate  (acc=...)

  ALL 8 MCMC TESTS PASSED
==================================================

Configuration

All experiment parameters are defined at the top of src/run_experiment.py in a clearly labelled CONFIGURATION block:

THETA_TRUE  = np.array([1.5, 0.8, 2.0, 1.2])  # ground-truth
SIGMA_N     = 0.02      # observation noise std
N_OBS       = 15        # number of observation points
N_FEM       = 150       # FEM mesh resolution
N_MCMC      = 3000      # MCMC steps per chain
BURN_IN     = 600       # burn-in steps
SIGMA_PROP  = 0.10      # MH proposal std
NTRAIN_LIST = [5, 20, 80]   # surrogate training sizes

To re-run with different settings, edit these values and re-run python src/run_experiment.py followed by python make_figures.py.


Mathematical Background

PDE and FEM

The P1 FEM is implemented from scratch in src/fem_solver.py using:

  • 2-point Gauss quadrature per element (exact for piecewise-linear $a$)
  • Sparse stiffness matrix via scipy.sparse
  • Direct solve via scipy.sparse.linalg.spsolve

Verified convergence: $$|u - u_h|{L^2} = \mathcal{O}(h^2), \qquad |u - u_h|{H^1} = \mathcal{O}(h).$$

Bayesian formulation

  • Parameter: $\boldsymbol{\theta} = (\theta_1,\ldots,\theta_4) \in \mathbb{R}_{>0}^4$, piecewise-constant $a(x;\boldsymbol{\theta})$
  • Prior: $\theta_k \overset{\text{i.i.d.}}{\sim} \mathrm{LogNormal}(0.5, 0.25)$ — ensures positivity
  • Likelihood: Gaussian with $\sigma_n = 0.02$
  • MCMC: Metropolis-Hastings with log-space random walk (symmetric proposal)

Wasserstein-2 distance

For each 1D marginal $k$, the W₂ distance between the FEM and surrogate posteriors is computed exactly via the sorted empirical CDFs:

$$W_2(\pi_k^\text{FEM}, \pi_k^\text{surr}) = \left(\frac{1}{N}\sum_i (\mathrm{sort}(A_k)_i - \mathrm{sort}(B_k)_i)^2\right)^{1/2}.$$

See paper_notes/mathematical_derivations.md for complete derivations.


Key Results

FEM reference posterior — all true parameters recovered:

$k$ True $\theta_k$ Posterior mean Posterior std 95% CI In CI
1 1.50 1.467 0.083 [1.312, 1.644]
2 0.80 0.748 0.115 [0.574, 1.027]
3 2.00 1.796 0.304 [1.285, 2.498]
4 1.20 1.279 0.062 [1.156, 1.380]

Convergence of W₂ with surrogate refinement:

$n_\text{train}$ Surrogate $L^2$ error $\bar{W}_2$
5 $7.70 \times 10^{-2}$ 0.312
20 $1.30 \times 10^{-1}$ 0.335
80 $1.14\times10^{-2}$ 0.048

References

  1. A. M. Stuart (2010). Inverse problems: A Bayesian perspective. Acta Numerica 19, 451–559.
  2. S. C. Brenner & L. R. Scott (2008). The Mathematical Theory of Finite Element Methods, 3rd ed.
  3. L. C. Evans (2010). Partial Differential Equations, 2nd ed.
  4. B. Sprungk (2020). On the local Lipschitz stability of Bayesian inverse problems. Inverse Problems 36, 055015.
  5. L. Yan & T. Zhou (2019). An adaptive surrogate modeling based on deep neural networks for large-scale Bayesian inversion. arXiv:1911.08926.

This project is part of a PhD application preparation portfolio in applied mathematics,
targeting research groups in numerical PDEs, inverse problems, and scientific ML.