BUG: gp.TP default prior is not a multivariate Student-t process #8290

@rx18-eng

Description

Describe the issue:

I think pm.gp.TP.prior() gives the wrong joint prior when reparameterize=True, which is the default.

In TP._build_prior, the default path samples independent univariate Student-t variables:

v = pm.StudentT(name + "_rotated_", mu=0.0, sigma=1.0, nu=self.nu, size=size)
f = pm.Deterministic(name, mu + cholesky(cov).dot(v))

The non-reparameterized path uses the correct multivariate distribution:

f = pm.MvStudentT(name, nu=self.nu, mu=mu, scale=cov, **kwargs)

These are not equivalent for finite nu: a multivariate Student-t has a single shared tail scale, which couples the tails of all dimensions, whereas the current default gives each rotated coordinate its own independent tail scale.

So the default TP prior silently changes the joint prior, posterior, and conditional predictions.
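
For reference, here is a minimal sketch of a reparameterization that would match MvStudentT, based on the Gaussian scale-mixture representation of the multivariate Student-t (the helper name and variable suffixes are illustrative only, not proposed API):

import pymc as pm

def mvt_reparam_sketch(name, mu, chol, nu, size):
    # MvStudentT(nu=nu, mu=mu, scale=K) can be written as
    # mu + L @ z * sqrt(nu / w), where z ~ Normal(0, I), w ~ ChiSquared(nu),
    # and L = cholesky(K). A single scalar w rescales every coordinate
    # jointly, which independent StudentT draws in the rotated space do not.
    z = pm.Normal(name + "_rotated_", mu=0.0, sigma=1.0, size=size)
    w = pm.ChiSquared(name + "_scale_", nu=nu)
    return pm.Deterministic(name, mu + chol.dot(z) * pm.math.sqrt(nu / w))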

Reproducible code example:

I created a temporary venv, installed the runtime dependencies, imported this checkout, and compared:

  • default gp.TP.prior(..., reparameterize=True)
  • correct gp.TP.prior(..., reparameterize=False) / MvStudentT

import numpy as np
import pymc as pm
import scipy.stats as st

nu = 3.0
X = np.array([[0.0], [1.0]])
ls = np.sqrt(-0.5 / np.log(0.8))
jitter = 1e-6
cov_func = pm.gp.cov.ExpQuad(1, ls=ls)

K = cov_func(X).eval() + jitter * np.eye(2)
L = np.linalg.cholesky(K)

points = [
    np.array([0.0, 0.0]),
    np.array([1.0, 1.0]),
    np.array([2.0, -1.0]),
]

with pm.Model() as m_reparam:
    tp = pm.gp.TP(scale_func=cov_func, nu=nu)
    tp.prior("f", X, reparameterize=True, jitter=jitter)
    reparam_logp_fn = m_reparam.compile_logp()

with pm.Model() as m_mvt:
    tp = pm.gp.TP(scale_func=cov_func, nu=nu)
    tp.prior("f", X, reparameterize=False, jitter=jitter)
    mvt_logp_fn = m_mvt.compile_logp()

print("point | default TP induced logp | MvStudentT TP logp | diff")
for x in points:
    # Map f = x back to the rotated coordinates and apply the change-of-variables
    # correction: logp_f(x) = logp_v(v) - log|det L| for f = mu + L v (mu = 0 here).
    v = np.linalg.solve(L, x)
    default_tp = float(reparam_logp_fn({"f_rotated_": v}) - np.log(np.linalg.det(L)))
    expected = float(mvt_logp_fn({"f": x}))
    print(x, default_tp, expected, default_tp - expected)

Output:

[0. 0.] -1.490954846072 -1.327054216825 -0.163900629248
[1. 1.] -2.139054025350 -2.114756458049 -0.024297567300
[ 2. -1.] -7.150095032011 -6.704294555484 -0.445800476528

The default reparameterized TP and the correct MvStudentT prior give measurably different log densities for the same finite-dimensional values.
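
As an independent cross-check (continuing the script above; this assumes SciPy >= 1.6, which provides scipy.stats.multivariate_t), the reparameterize=False path should agree with SciPy's multivariate Student-t density:

# Cross-check: scipy's multivariate_t with zero mean, shape matrix K, and df=nu
# should reproduce the MvStudentT log densities above.
mvt = st.multivariate_t(loc=np.zeros(2), shape=K, df=nu)
for x in points:
    print(x, mvt.logpdf(x), float(mvt_logp_fn({"f": x})))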

PyMC version information:

PyMC/PyMC3 Version: current main / v6 branch
PyTensor/Aesara Version: not available in my local checkout
Python Version: Python 3
Operating system: Linux
How did you install PyMC/PyMC3: source checkout

Context for the issue:

This affects anyone using pm.gp.TP(...).prior(...) with the default reparameterize=True, especially with small or moderate nu.

The current tests only compare the reparameterized TP against a GP with nu=10000, a regime where the difference almost vanishes (see the sketch below). For finite nu, the default prior is not the same distribution as MvStudentT, so inference can be silently biased without any warning.
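
One way to see the regime the existing tests exercise is to rerun the comparison with a large nu (continuing the repro script above; nu_large is an illustrative name). As nu grows, both parameterizations approach the same multivariate normal, so the diffs should shrink toward zero:

# Sanity check: with a very large nu, the reparameterized prior and MvStudentT
# become nearly indistinguishable, which is why the current nu=10000-vs-GP
# tests do not catch the finite-nu discrepancy.
nu_large = 1e4

with pm.Model() as m_reparam_large:
    pm.gp.TP(scale_func=cov_func, nu=nu_large).prior("f", X, reparameterize=True, jitter=jitter)
    reparam_large_fn = m_reparam_large.compile_logp()

with pm.Model() as m_mvt_large:
    pm.gp.TP(scale_func=cov_func, nu=nu_large).prior("f", X, reparameterize=False, jitter=jitter)
    mvt_large_fn = m_mvt_large.compile_logp()

for x in points:
    v = np.linalg.solve(L, x)
    induced = float(reparam_large_fn({"f_rotated_": v})) - np.log(np.linalg.det(L))
    print(x, induced - float(mvt_large_fn({"f": x})))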
