Commit be320d5
Fix D6: match Clojure two-proportion test formula (+1 pseudocount)
## Summary
The Python `two_prop_test` used a standard two-proportion z-test with no pseudocounts,
while Clojure's `stats/two-prop-test` (stats.clj:18-33) adds +1 to all four inputs
(`succ-in`, `succ-out`, `pop-in`, `pop-out`) via `(map inc ...)` before computing
the pooled z-test. This Laplace smoothing regularizes z-scores for small group sizes,
which are common in Polis conversations.
## Changes
- **Signature change**: `two_prop_test(p1, n1, p2, n2)` (proportions) →
`two_prop_test(succ_in, succ_out, pop_in, pop_out)` (raw counts)
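- **Formula**: Standard pooled z-test on the pseudocount-adjusted values
`s1 = succ_in+1`, `s2 = succ_out+1`, `p1 = pop_in+1`, `p2 = pop_out+1`
(here `p1`/`p2` are adjusted populations, not the proportions of the old signature):
`pi1 = s1/p1`, `pi2 = s2/p2`, `pi_hat = (s1+s2)/(p1+p2)`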
- **Callers updated**: Both scalar (`add_comparative_stats`) and vectorized
(`compute_group_comment_stats_df`) now pass raw counts matching Clojure's
`(stats/two-prop-test (:na in-stats) (sum :na rest-stats) (:ns in-stats) (sum :ns rest-stats))`
(repness.clj:97-100)
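
The formula described above can be sketched in a few lines of Python. This is an illustrative reconstruction from the description in this commit message, not the code in the diff: the name `two_prop_test_sketch` is hypothetical, and returning `0.0` when the pooled variance is zero is an assumption of the sketch rather than something stated here.

```python
import math


def two_prop_test_sketch(succ_in, succ_out, pop_in, pop_out):
    """Pooled two-proportion z-score on +1-adjusted raw counts.

    Hypothetical helper for illustration only; the argument order follows
    the new signature described in this commit message.
    """
    # Add the +1 pseudocount to all four inputs, as `(map inc ...)` does.
    s1, s2, p1, p2 = (x + 1 for x in (succ_in, succ_out, pop_in, pop_out))

    pi1 = s1 / p1                    # adjusted in-group proportion
    pi2 = s2 / p2                    # adjusted out-group proportion
    pi_hat = (s1 + s2) / (p1 + p2)   # pooled proportion

    variance = pi_hat * (1 - pi_hat) * (1 / p1 + 1 / p2)
    if variance <= 0:
        # Assumption of this sketch: a degenerate pooled proportion
        # (for example, every count equal to its population) yields 0.0.
        return 0.0
    return (pi1 - pi2) / math.sqrt(variance)


z_agree = two_prop_test_sketch(succ_in=8, succ_out=2, pop_in=10, pop_out=10)
print(round(z_agree, 3))
```

Because the adjusted populations are always at least 1, the divisions by `p1` and `p2` cannot fail for non-negative counts; only the pooled variance needs a guard.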
## Affected output fields
- `rat` (agree representativeness test z-score)
- `rdt` (disagree representativeness test z-score)
- `agree_metric`, `disagree_metric` (downstream of rat/rdt)
## Test plan
- [x] Targeted D6 tests pass (formula, edge cases, regularization effect)
- [x] Full test suite passes (excluding DynamoDB/MinIO tests)
- [x] Private dataset tests pass (--include-local)
- [x] Golden snapshots re-recorded for all 7 datasets
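
As a rough illustration of the kind of edge case such tests might cover, the sketch below compares the pooled z-score with and without the pseudocount. It does not reproduce the repository's D6 tests: `pooled_z` and its `pseudocount` parameter are hypothetical names, the sample counts are made up, and the zero-variance guard is an assumption.

```python
import math


def pooled_z(succ_in, succ_out, pop_in, pop_out, pseudocount=1):
    """Pooled two-proportion z-score with a configurable pseudocount.

    Hypothetical helper: pseudocount=0 gives the plain z-test on raw
    counts, pseudocount=1 gives the +1-adjusted variant described above.
    """
    s1, s2 = succ_in + pseudocount, succ_out + pseudocount
    p1, p2 = pop_in + pseudocount, pop_out + pseudocount
    pi_hat = (s1 + s2) / (p1 + p2)
    variance = pi_hat * (1 - pi_hat) * (1 / p1 + 1 / p2)
    if variance <= 0:
        return 0.0  # assumed guard for a degenerate pooled proportion
    return (s1 / p1 - s2 / p2) / math.sqrt(variance)


# Edge case: an empty in-group. Without the pseudocount the formula
# divides by zero; with it, the score is still defined.
try:
    pooled_z(0, 3, 0, 10, pseudocount=0)
    plain_defined = True
except ZeroDivisionError:
    plain_defined = False
adjusted_empty = pooled_z(0, 3, 0, 10, pseudocount=1)

# Large groups: the +1 adjustment barely moves the score.
large_plain = pooled_z(500, 300, 1000, 1000, pseudocount=0)
large_adjusted = pooled_z(500, 300, 1000, 1000, pseudocount=1)

print(plain_defined, adjusted_empty, large_plain, large_adjusted)
```

The adjustment matters most when counts are tiny or zero and fades as group sizes grow, which is why changing the formula shifts `rat` and `rdt` mainly for small groups.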
🤖 Generated with [Claude Code](https://claude.com/claude-code)
## Squashed commits
- RED: add D6 blob injection test (two_prop_test vs Clojure repness-test)
- Fix D6: match Clojure two-proportion test formula (+1 pseudocount)
- Plan: add D6 PR number and stack position to cross-reference
commit-id:23c03d701
parent a387b9e
6 files changed
Lines changed: 314 additions & 100 deletions
File tree
- delphi
- docs
- polismath/pca_kmeans_rep
- tests