Describe the issue:
I think pm.gp.TP.prior() gives the wrong joint prior when reparameterize=True, which is the default.
In TP._build_prior, the default path samples independent univariate Student-t variables:
v = pm.StudentT(name + "_rotated_", mu=0.0, sigma=1.0, nu=self.nu, size=size)
f = pm.Deterministic(name, mu + cholesky(cov).dot(v))
The non-reparameterized path uses the correct multivariate distribution:
f = pm.MvStudentT(name, nu=self.nu, mu=mu, scale=cov, **kwargs)
These are not equivalent for finite nu: a multivariate Student-t couples all dimensions through a single shared scale-mixing variable, so extreme values tend to occur jointly, whereas the default path gives each rotated coordinate its own independent tail behaviour.
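Concretely, with Sigma = L L^T, v = L^{-1} f, n points, zero mean, and normalizing constants dropped, the two joint densities are

p_mvt(f)   ∝ (1 + f^T Sigma^{-1} f / nu)^(-(nu + n)/2)
p_indep(f) ∝ prod_i (1 + v_i^2 / nu)^(-(nu + 1)/2)

which agree only in the nu -> infinity (Gaussian) limit.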
So the default TP prior silently changes the joint prior, posterior, and conditional predictions.
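For reference, the reparameterization that does recover MvStudentT is the usual normal / chi-squared scale-mixture construction. A minimal sketch (variable names like f_mixing_ and the constant L_const are my own, not PyMC's):

import numpy as np
import pymc as pm
import pytensor.tensor as pt

# Sketch only, not PyMC's implementation: a single chi^2 mixing variable is shared
# across all rotated coordinates, so f = mu + sqrt(nu / u) * L @ z is exactly
# MvStudentT(nu=nu, mu=mu, scale=L @ L.T) when z ~ N(0, I) and u ~ ChiSquared(nu).
nu = 3.0
mu = np.zeros(2)
L_const = np.linalg.cholesky(np.array([[1.0, 0.8], [0.8, 1.0]]))  # stand-in for cholesky(cov)

with pm.Model():
    z = pm.Normal("f_rotated_", mu=0.0, sigma=1.0, size=2)
    u = pm.ChiSquared("f_mixing_", nu=nu)
    f = pm.Deterministic("f", mu + pt.sqrt(nu / u) * pt.dot(L_const, z))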
Reproducible code example:
I created a temporary venv, installed the runtime dependencies, imported this checkout, and compared:
- default: gp.TP.prior(..., reparameterize=True)
- correct: gp.TP.prior(..., reparameterize=False), i.e. MvStudentT
import numpy as np
import pymc as pm

nu = 3.0
X = np.array([[0.0], [1.0]])
ls = np.sqrt(-0.5 / np.log(0.8))  # lengthscale chosen so that corr(f(0), f(1)) = 0.8
jitter = 1e-6

cov_func = pm.gp.cov.ExpQuad(1, ls=ls)
K = cov_func(X).eval() + jitter * np.eye(2)
L = np.linalg.cholesky(K)

points = [
    np.array([0.0, 0.0]),
    np.array([1.0, 1.0]),
    np.array([2.0, -1.0]),
]

# Default (reparameterized) TP prior
with pm.Model() as m_reparam:
    tp = pm.gp.TP(scale_func=cov_func, nu=nu)
    tp.prior("f", X, reparameterize=True, jitter=jitter)
reparam_logp_fn = m_reparam.compile_logp()

# Non-reparameterized TP prior, i.e. MvStudentT directly
with pm.Model() as m_mvt:
    tp = pm.gp.TP(scale_func=cov_func, nu=nu)
    tp.prior("f", X, reparameterize=False, jitter=jitter)
mvt_logp_fn = m_mvt.compile_logp()

print("point | default TP induced logp | MvStudentT TP logp | diff")
for x in points:
    # Map f = x back to the rotated coordinates v = L^{-1} x, and correct the
    # rotated-space logp by the Jacobian log|det L| to get the density of f itself.
    v = np.linalg.solve(L, x)
    default_tp = float(reparam_logp_fn({"f_rotated_": v}) - np.log(np.linalg.det(L)))
    expected = float(mvt_logp_fn({"f": x}))
    print(x, default_tp, expected, default_tp - expected)
output:
[0. 0.] -1.490954846072 -1.327054216825 -0.163900629248
[1. 1.] -2.139054025350 -2.114756458049 -0.024297567300
[ 2. -1.] -7.150095032011 -6.704294555484 -0.445800476528
The default reparameterized TP and the correct MvStudentT prior give measurably different log densities for the same finite-dimensional values.
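As a sanity check that does not go through PyMC's logp machinery, the same gap can be reproduced directly with scipy (reusing nu, K, L, and points from the script above; scipy.stats.multivariate_t requires scipy >= 1.6):

import numpy as np
import scipy.stats as st

for x in points:
    v = np.linalg.solve(L, x)
    # density of f implied by independent StudentT's in the rotated coordinates,
    # corrected by the Jacobian of f = L v
    indep_logp = st.t.logpdf(v, df=nu).sum() - np.log(np.linalg.det(L))
    # joint density of the actual zero-mean MvStudentT prior
    mvt_logp = st.multivariate_t.logpdf(x, loc=np.zeros(2), shape=K, df=nu)
    print(x, indep_logp, mvt_logp, indep_logp - mvt_logp)

This should reproduce the same differences as the compiled-logp comparison above.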
PyMC version information:
PyMC/PyMC3 Version: current main / v6 branch
PyTensor/Aesara Version: not available in my local checkout
Python Version: Python 3
Operating system: Linux
How did you install PyMC/PyMC3: source checkout
Context for the issue:
This affects anyone using pm.gp.TP(...).prior(...) with the default reparameterize=True, especially with small or moderate nu.
The current tests compare the reparameterized TP against a GP with nu=10000, where the difference almost disappears. For finite nu, the default prior is not the same as MvStudentT, so inference can be biased without any warning.