Fix D5: match Clojure prop_test formula (Wilson-score-like with +1 pseudocount)
Replace Python's standard one-proportion z-test `prop_test(p, n, p0)` with
Clojure's Wilson-score-like formula `prop_test(succ, n)` from `stats.clj:10-15`:
```
2 * sqrt(n+1) * ((succ+1)/(n+1) - 0.5)
```
The Clojure formula has a built-in +1 pseudocount (Laplace smoothing / Beta(1,1)
prior) that regularizes extreme values for small Polis groups. This is separate
from the `PSEUDO_COUNT=2.0` used for `pa`/`pd` estimation (Beta(2,2) prior):
- `pa = (na + 1) / (ns + 2)` — Beta(2,2) prior for probability estimation
- `pat = 2 * sqrt(ns+1) * ((na+1)/(ns+1) - 0.5)` — Beta(1,1) prior for significance testing
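As a rough illustration of the two formulas above, the following sketch computes both quantities from raw counts. It is not the code in `repness.py`: the helper name `smoothed_prob` and the way `pseudo_count` is split between numerator and denominator are assumptions chosen so that `pseudo_count=2.0` reproduces `(na + 1) / (ns + 2)`.

```python
import math


def prop_test(succ: float, n: float) -> float:
    """Proportion test with a built-in +1 pseudocount (formula quoted above)."""
    return 2.0 * math.sqrt(n + 1) * ((succ + 1) / (n + 1) - 0.5)


def smoothed_prob(succ: float, n: float, pseudo_count: float = 2.0) -> float:
    """Hypothetical helper: with pseudo_count=2.0 this equals (succ + 1) / (n + 2)."""
    return (succ + pseudo_count / 2) / (n + pseudo_count)


# Example counts: 8 agrees out of 10 votes in a group.
na, ns = 8, 10
pa = smoothed_prob(na, ns)   # probability estimate, (8 + 1) / (10 + 2)
pat = prop_test(na, ns)      # significance score, 2 * sqrt(11) * (9/11 - 0.5)
print(f"pa={pa:.4f} pat={pat:.4f}")
```

Note that the test statistic is not centered at an even split: with 5 agrees out of 10 votes the smoothed proportion is `6/11`, so the score is small but positive.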
**What changed in the output**: `pat` and `pdt` (the proportion-test z-scores)
and the downstream `agree_metric` / `disagree_metric` values. The z-scores are
now slightly different because the scale factor is `sqrt(n+1)` instead of
`sqrt(n)` and the smoothed proportion is `(succ+1)/(n+1)` instead of
`(na+1)/(n+2)`.
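To make the difference concrete, here is a minimal side-by-side sketch. The names `old_prop_test` and `new_prop_test` are hypothetical, and the old function is assumed to be the textbook one-proportion z-test `(p - p0) / sqrt(p0 * (1 - p0) / n)` applied to the smoothed `pa`; the actual previous implementation may have differed in detail.

```python
import math


def old_prop_test(p: float, n: float, p0: float = 0.5) -> float:
    """Assumed previous behaviour: standard one-proportion z-test."""
    return (p - p0) / math.sqrt(p0 * (1 - p0) / n)


def new_prop_test(succ: float, n: float) -> float:
    """Clojure-style formula with the built-in +1 pseudocount."""
    return 2.0 * math.sqrt(n + 1) * ((succ + 1) / (n + 1) - 0.5)


def compare(succ: float, n: float) -> tuple[float, float]:
    """Return (old, new) z-scores for the same raw counts."""
    pa = (succ + 1) / (n + 2)  # smoothed proportion that was passed to the old test
    return old_prop_test(pa, n), new_prop_test(succ, n)


for succ, n in [(8, 10), (80, 100), (800, 1000)]:
    old, new = compare(succ, n)
    print(f"succ={succ} n={n} old={old:.4f} new={new:.4f}")
```

With `p0 = 0.5` the old formula reduces to `2 * sqrt(n) * (p - 0.5)`, which is why the two versions differ only in the `sqrt` argument and in the smoothed proportion.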
- `repness.py`: `prop_test(p, n, p0)` → `prop_test(succ, n)` with Clojure formula
- `repness.py`: `prop_test_vectorized(p, n, p0)` → `prop_test_vectorized(succ, n)`
- `repness.py`: Callers updated to pass raw counts `(na, ns)` instead of `(pa, ns, 0.5)`
- `test_discrepancy_fixes.py`: Removed xfail from D5 formula test, added 8 test cases + edge case
- `test_repness_unit.py`, `test_old_format_repness.py`: Updated for new signature
- Golden snapshots re-recorded for all datasets
- [x] D5 formula tests pass (8 input pairs + edge cases)
- [x] D5 Clojure blob consistency check passes (all datasets)
- [x] Full test suite passes (public + private, 19/19 regression tests)
- [x] Only pre-existing failure: pakistan-incremental D2 (unrelated)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
commit-id:48b77ba3