Hi John — I have been using Claude to port the Fast Template
Periodogram to a C implementation in
vartools. Claude worked
through the math in Hoffman et al. (2021,
arXiv:2101.12348) and traced it
against the pyftp code. It looks like there's a bug in the
root-finding polynomial that explains the documented ~5×10⁻³ precision
floor against direct summation. Fix is a one-character swap. Details
below.
Where
pyftp/pseudo_poly.py:281–292 — PseudoPolynomial.root_finding_poly
pyftp/pseudo_poly.py:466 — compute_zeros (uses the same construction inline)
The math
PseudoPolynomial represents
PP(b) = (1−b²)^r · [ p(b) + sqrt(1−b²) · q(b) ]
with r ≤ 0 and p, q real-coefficient polynomials in b. The class
docstring (line 144), __call__ (line 196), __mul__ (line 230), and
deriv (line 262) are all consistent with this convention: the "p" slot
is the bare polynomial and the "q" slot is the sqrt(1−b²)-multiplied
polynomial.
To find the zeros of PP, divide by (1−b²)^r (valid for |b| < 1) and
square the resulting condition p(b) = −sqrt(1−b²)·q(b):
So the polynomial whose real roots in [−1, 1] are the zeros of PP is:
F(b) = p²(b) − (1 − b²) · q²(b)
What the code computes
root_finding_poly instead returns
return pol.polysub(pol.polymul(pol.polymul(self.p, self.p), (1, 0, -1)),
pol.polymul(self.q, self.q))
which is
F_pyftp(b) = (1 − b²) · p²(b) − q²(b)
— the same expression with p and q swapped. compute_zeros builds the
same expression inline at line 466. The docstring on line 282 claims to
return (1 − x²)^(−(2r+1)) · p² − q², but the code drops the r-dependent
prefactor regardless of r (which is consistent with the squared form
only if p and q were swapped in the class definition, which they
aren't).
These two polynomials have different roots in general. They coincide for
a few degenerate cases (e.g. p = ±q) but disagree everywhere else.
Concrete H=1 demonstration
Take template (c₁, s₁) = (1, 0) (a pure cosine). Deriving the
optimality condition 2·MM·∂(YM) − YM·∂(MM) = 0 by hand for that
template gives
b · (YS·CC − YC·CS) + sgn·sqrt(1−b²) · (YS·CS − YC·SS) = 0
So Pp(b) = b·(YS·CC − YC·CS) and Pq(b) = (YS·CS − YC·SS). Let
A = (YS·CC − YC·CS)² and B = (YC·SS − YS·CS)². Then:
| polynomial |
equation in b |
real solution |
Correct Pp² − (1−b²)·Pq² |
b²·A − (1−b²)·B = 0 |
b² = B / (A + B) |
pyftp's (1−b²)·Pp² − Pq² |
(1−b²)·b²·A − B = 0 |
b⁴ − b² + B/A = 0 |
For A = 2, B = 1 the correct polynomial gives b² = 1/3, while the
pyftp polynomial gives b² = (1 ± √(1 − 2))/2 — the discriminant is
negative, so no real roots at all in a case where the correct
answer is a clean b = 1/√3. (The post-processing in compute_zeros
filters to roots with |imag| < 1e-5 and clips |real| to ≤ 1, so it
returns whatever happens to fall through that net — but those aren't
the optimal θ₂ values.)
Empirical verification
Claude implemented a brute-force θ₂ scan (gold standard) and ran it
alongside the corrected polynomial and pyftp's current polynomial, on
several synthetic LCs with known templates. 50 trial frequencies per
case, 2880-point θ₂ scan for the brute-force baseline:
| template |
corrected Pp² − (1−b²)·Pq² |
pyftp (1−b²)·Pp² − Pq² |
| H=1 pure cosine |
max 8.0e-07, mean 3.2e-08 |
max 1.8e-03, mean 2.0e-04 |
| H=2 mixed |
max 2.0e-06, mean 8.7e-08 |
max 1.4e-02, mean 6.2e-04 |
| H=3 RR-Lyrae-ish |
max 5.2e-06, mean 1.4e-07 |
max 3.9e-02, mean 1.0e-03 |
| H=5 multi-harmonic |
max 8.7e-07, mean 6.8e-08 |
max 1.1e-02, mean 5.3e-04 |
The corrected polynomial matches the brute-force scan to within
the scan's finite resolution (~1e-7). pyftp's polynomial diverges by
4–5 orders of magnitude — consistent with the assert(max(dp) < 0.5)
and assert(median(dp) < 5e-3) tolerances in
pyftp/tests/test_modeler.py:158-160, and with commit
5b17c7e's
note about "still not perfect agreement between NFFT and direct
summation periodograms, possibly has to do with small errors in the
polynomial coefficients".
The verification script is below — drop-in, no NFFT dependency.
Suggested fix
In pseudo_poly.py, swap the multiplications by (1, 0, -1) so it
multiplies q² instead of p²:
def root_finding_poly(self):
- """ (1 - x^2)^(-(2 * r + 1)) * p^2 - q^2
+ """ p^2 - (1 - x^2) * q^2
Returns
-------
coef : np.ndarray
Coefficients of a polynomial that has the
same number of roots as the PP
"""
- return pol.polysub(pol.polymul(pol.polymul(self.p, self.p), (1, 0, -1)),
- pol.polymul(self.q, self.q))
+ return pol.polysub(pol.polymul(self.p, self.p),
+ pol.polymul(pol.polymul(self.q, self.q), (1, 0, -1)))
And the corresponding line in compute_zeros:
- P = pol.polysub(pol.polymul((1, 0, -1), pol.polymul(Pp, Pp)),
- pol.polymul(Pq, Pq))
+ P = pol.polysub(pol.polymul(Pp, Pp),
+ pol.polymul((1, 0, -1), pol.polymul(Pq, Pq)))
(Or simpler: replace compute_zeros's inline build with a single call
to root_finding_poly on a PseudoPolynomial(Pp, Pq, r=-1) — same
result, less duplication.)
I expect pyftp/tests/test_modeler.py will then be able to assert
much tighter tolerances against direct_periodogram.
Reproduction script
"""Compare brute-force theta_2 scan, corrected polynomial, and pyftp's polynomial."""
import numpy as np
import numpy.polynomial.polynomial as npp
from numpy.polynomial.polynomial import polyroots
from pyftp.pseudo_poly import compute_polynomial_tensors, get_polynomial_vectors
from pyftp.utils import Summations
def synthetic_lc(N=300, T=30.0, period=1.234, cn=(0.5,), sn=(0.1,), noise=0.02, rseed=42):
rng = np.random.RandomState(rseed)
t = np.sort(rng.uniform(0, T, N))
omega = 2 * np.pi / period
y = np.zeros_like(t)
for n, (c, s) in enumerate(zip(cn, sn), start=1):
y += c * np.cos(n * omega * t) + s * np.sin(n * omega * t)
y += noise * rng.randn(N)
return t, y, np.full_like(t, noise)
def compute_sums(t, y, yerr, H, omega):
w = 1.0 / yerr**2; w /= w.sum()
ybar = (w * y).sum()
YY = (w * (y - ybar)**2).sum()
Cn = np.array([(w * np.cos((n+1) * omega * t)).sum() for n in range(2*H)])
Sn = np.array([(w * np.sin((n+1) * omega * t)).sum() for n in range(2*H)])
YC = np.array([(w * (y - ybar) * np.cos((n+1) * omega * t)).sum() for n in range(H)])
YS = np.array([(w * (y - ybar) * np.sin((n+1) * omega * t)).sum() for n in range(H)])
Cf = np.concatenate([[1.0], Cn]); Sf = np.concatenate([[0.0], Sn])
CC = np.zeros((H, H)); CS = np.zeros((H, H)); SS = np.zeros((H, H))
for n in range(1, H+1):
for m in range(1, H+1):
d = n - m; s = n + m; ad = abs(d)
sgn_mn = -np.sign(d) if d != 0 else 0.0
CC[n-1, m-1] = 0.5 * (Cf[ad] + Cf[s]) - Cf[n] * Cf[m]
CS[n-1, m-1] = 0.5 * (sgn_mn * Sf[ad] + Sf[s]) - Cf[n] * Sf[m]
SS[n-1, m-1] = 0.5 * (Cf[ad] - Cf[s]) - Sf[n] * Sf[m]
return YY, YC, YS, CC, CS, SS, Cn, Sn
def evaluate_P(theta, cn, sn, YY, YC, YS, CC, CS, SS):
H = len(cn)
nvec = np.arange(1, H+1)
ct, st = np.cos(nvec * theta), np.sin(nvec * theta)
A = cn * ct - sn * st
B = sn * ct + cn * st
YM = (A * YC + B * YS).sum()
MM = A @ CC @ A + 2 * A @ CS @ B + B @ SS @ B
return (YM * YM) / (YY * MM) if MM > 1e-30 and YY > 1e-30 else 0.0
def brute_force_P(t, y, yerr, cn, sn, omega, ntheta=2880):
YY, YC, YS, CC, CS, SS, _, _ = compute_sums(t, y, yerr, len(cn), omega)
best = 0.0
for i in range(ntheta):
P = evaluate_P(2 * np.pi * i / ntheta, cn, sn, YY, YC, YS, CC, CS, SS)
if P > best: best = P
return best
def polynomial_P(t, y, yerr, cn, sn, omega, use_correct=True):
cn = np.asarray(cn, dtype=float); sn = np.asarray(sn, dtype=float)
H = len(cn)
YY, YC, YS, CC, CS, SS, Cn_, Sn_ = compute_sums(t, y, yerr, H, omega)
sums = Summations(C=Cn_[:H], S=Sn_[:H], YC=YC, YS=YS, CC=CC, CS=CS, SS=SS)
AAdAp, AAdAq, AAdBp, AAdBq, ABdAp, ABdAq, ABdBp, ABdBq, \
BBdAp, BBdAq, BBdBp, BBdBq = compute_polynomial_tensors(*get_polynomial_vectors(cn, sn, sgn=1))
Kaada = np.einsum('i,jk->ijk', sums.YC[:H], sums.CC[:H,:H]) - np.einsum('k,ij->ijk', sums.YC, sums.CC[:H,:H])
Kaadb = np.einsum('i,jk->ijk', sums.YC[:H], sums.CS[:H,:H]) - np.einsum('k,ij->ijk', sums.YS, sums.CC[:H,:H])
Kabda = np.einsum('i,kj->ijk', sums.YC[:H], sums.CS[:H,:H]) + np.einsum('j,ik->ijk', sums.YS, sums.CC[:H,:H])
Kabdb = np.einsum('i,jk->ijk', sums.YC[:H], sums.SS[:H,:H]) + np.einsum('j,ik->ijk', sums.YS, sums.CS[:H,:H])
Kbbda = np.einsum('i,kj->ijk', sums.YS[:H], sums.CS[:H,:H]) - np.einsum('k,ij->ijk', sums.YC, sums.SS[:H,:H])
Kbbdb = np.einsum('i,jk->ijk', sums.YS[:H], sums.SS[:H,:H]) - np.einsum('k,ij->ijk', sums.YS, sums.SS[:H,:H])
Pp = sum(np.einsum('ijkl,ijk->l', t, k) for t, k in zip(
[AAdAp, AAdBp, ABdAp, ABdBp, BBdAp, BBdBp],
[Kaada, Kaadb, Kabda, Kabdb, Kbbda, Kbbdb]))
Pq = sum(np.einsum('ijkl,ijk->l', t, k) for t, k in zip(
[AAdAq, AAdBq, ABdAq, ABdBq, BBdAq, BBdBq],
[Kaada, Kaadb, Kabda, Kabdb, Kbbda, Kbbdb]))
if use_correct:
P = npp.polysub(npp.polymul(Pp, Pp), npp.polymul((1, 0, -1), npp.polymul(Pq, Pq)))
else:
P = npp.polysub(npp.polymul((1, 0, -1), npp.polymul(Pp, Pp)), npp.polymul(Pq, Pq))
roots = polyroots(P)
cands = [min(1.0, abs(r.real)) * np.sign(r.real) for r in roots
if abs(r.imag) < 1e-5 and abs(r.real) < 1 + 1e-5]
cands.extend([-1.0, 1.0])
best = 0.0
for b in cands:
theta = np.arccos(b)
for sgn in [-1, 1]:
theta_eff = (2 * np.pi - theta) if sgn < 0 else theta
P_val = evaluate_P(theta_eff, cn, sn, YY, YC, YS, CC, CS, SS)
if P_val > best: best = P_val
return best
for label, H, cn, sn, period in [
("H=1 pure cosine", 1, [1.0], [0.0], 1.234),
("H=3 RR-Lyrae-ish", 3, [-0.181, -0.075, -0.020], [-0.110, 0.000, 0.030], 0.514),
]:
t, y, yerr = synthetic_lc(N=300, T=30, period=period, cn=cn, sn=sn)
errs_correct, errs_buggy = [], []
for f in np.linspace(0.3, 3.0, 50):
omega = 2 * np.pi * f
Pb = brute_force_P(t, y, yerr, np.asarray(cn), np.asarray(sn), omega)
errs_correct.append(abs(polynomial_P(t, y, yerr, cn, sn, omega, True) - Pb))
errs_buggy.append( abs(polynomial_P(t, y, yerr, cn, sn, omega, False) - Pb))
print(f"{label:25s} correct max={max(errs_correct):.2e} buggy max={max(errs_buggy):.2e}")
Output:
H=1 pure cosine correct max=8.02e-07 buggy max=1.78e-03
H=3 RR-Lyrae-ish correct max=5.24e-06 buggy max=3.89e-02
Hi John — I have been using Claude to port the Fast Template
Periodogram to a C implementation in
vartools. Claude worked
through the math in Hoffman et al. (2021,
arXiv:2101.12348) and traced it
against the pyftp code. It looks like there's a bug in the
root-finding polynomial that explains the documented ~5×10⁻³ precision
floor against direct summation. Fix is a one-character swap. Details
below.
Where
pyftp/pseudo_poly.py:281–292—PseudoPolynomial.root_finding_polypyftp/pseudo_poly.py:466—compute_zeros(uses the same construction inline)The math
PseudoPolynomialrepresentswith
r ≤ 0andp,qreal-coefficient polynomials inb. The classdocstring (line 144),
__call__(line 196),__mul__(line 230), andderiv(line 262) are all consistent with this convention: the "p" slotis the bare polynomial and the "q" slot is the sqrt(1−b²)-multiplied
polynomial.
To find the zeros of
PP, divide by(1−b²)^r(valid for|b| < 1) andsquare the resulting condition
p(b) = −sqrt(1−b²)·q(b):So the polynomial whose real roots in
[−1, 1]are the zeros ofPPis:What the code computes
root_finding_polyinstead returnswhich is
— the same expression with
pandqswapped.compute_zerosbuilds thesame expression inline at line 466. The docstring on line 282 claims to
return
(1 − x²)^(−(2r+1)) · p² − q², but the code drops ther-dependentprefactor regardless of
r(which is consistent with the squared formonly if
pandqwere swapped in the class definition, which theyaren't).
These two polynomials have different roots in general. They coincide for
a few degenerate cases (e.g.
p = ±q) but disagree everywhere else.Concrete
H=1demonstrationTake template
(c₁, s₁) = (1, 0)(a pure cosine). Deriving theoptimality condition
2·MM·∂(YM) − YM·∂(MM) = 0by hand for thattemplate gives
So
Pp(b) = b·(YS·CC − YC·CS)andPq(b) = (YS·CS − YC·SS). LetA = (YS·CC − YC·CS)²andB = (YC·SS − YS·CS)². Then:Pp² − (1−b²)·Pq²b²·A − (1−b²)·B = 0b² = B / (A + B)(1−b²)·Pp² − Pq²(1−b²)·b²·A − B = 0b⁴ − b² + B/A = 0For
A = 2, B = 1the correct polynomial givesb² = 1/3, while thepyftp polynomial gives
b² = (1 ± √(1 − 2))/2— the discriminant isnegative, so no real roots at all in a case where the correct
answer is a clean
b = 1/√3. (The post-processing incompute_zerosfilters to roots with
|imag| < 1e-5and clips|real|to≤ 1, so itreturns whatever happens to fall through that net — but those aren't
the optimal θ₂ values.)
Empirical verification
Claude implemented a brute-force
θ₂scan (gold standard) and ran italongside the corrected polynomial and pyftp's current polynomial, on
several synthetic LCs with known templates. 50 trial frequencies per
case, 2880-point θ₂ scan for the brute-force baseline:
Pp² − (1−b²)·Pq²(1−b²)·Pp² − Pq²The corrected polynomial matches the brute-force scan to within
the scan's finite resolution (~1e-7). pyftp's polynomial diverges by
4–5 orders of magnitude — consistent with the
assert(max(dp) < 0.5)and
assert(median(dp) < 5e-3)tolerances inpyftp/tests/test_modeler.py:158-160, and with commit5b17c7e's
note about "still not perfect agreement between NFFT and direct
summation periodograms, possibly has to do with small errors in the
polynomial coefficients".
The verification script is below — drop-in, no NFFT dependency.
Suggested fix
In
pseudo_poly.py, swap the multiplications by(1, 0, -1)so itmultiplies
q²instead ofp²:And the corresponding line in
compute_zeros:(Or simpler: replace
compute_zeros's inline build with a single callto
root_finding_polyon aPseudoPolynomial(Pp, Pq, r=-1)— sameresult, less duplication.)
I expect
pyftp/tests/test_modeler.pywill then be able to assertmuch tighter tolerances against
direct_periodogram.Reproduction script
Output: