Commit bc30aa4
committed
Vectorize participant info computation (3-15x speedup)
## Summary
- Replaces the O(N×G×C) per-participant Python loop in `_compute_participant_info_optimized` with bulk NumPy operations: matrix-wide vote counting (`np.sum` over axis) and per-group Pearson correlation via `P @ g` matrix multiply
- Adds 31 unit tests covering vote counts, group correlations, edge cases (small groups, zero-std, NaN handling, missing members), and golden snapshot regression
- Correlations now return Python `float` instead of `numpy.float64`
- Includes a benchmark script (`scripts/benchmark_participant_info.py`) that runs old vs new on the same data
## Benchmark results
Measured on real datasets (5 runs, median), old per-participant loop vs new vectorized:
| Dataset | Size | Old | New | Speedup |
|---------|------|-----|-----|---------|
| vw | 69p × 125c × 4g | 0.011s | 0.001s | **14.6x** |
| biodiversity | 536p × 314c × 2g | 0.047s | 0.006s | **8.1x** |
| _(larger private datasets)_ | | | | **3–6x** |
Speedup is higher on smaller datasets (loop overhead dominates) and lower on very large ones (matrix materialization dominates). Overall **3–15x** depending on size.
## Test plan
- [x] 31 unit tests pass (pre-vectorization baseline established first, then re-run post)
- [x] Golden snapshot regression passes for biodiversity + vw
- [x] Full regression test suite passes (40/40)
- [x] Benchmark run on all datasets including private (results above)
- [x] Max correlation diff across all datasets: < 2e-15
🤖 Generated with [Claude Code](https://claude.com/claude-code)
commit-id:ea7471961 parent 1f86ab3 commit bc30aa4
12 files changed
Lines changed: 1022 additions & 277 deletions
File tree
- delphi
- docs
- notebooks
- polismath
- conversation
- pca_kmeans_rep
- scripts
- tests
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
7 | | - | |
| 7 | + | |
8 | 8 | | |
9 | | - | |
| 9 | + | |
10 | 10 | | |
11 | | - | |
| 11 | + | |
12 | 12 | | |
13 | | - | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
14 | 25 | | |
15 | 26 | | |
16 | 27 | | |
| |||
38 | 49 | | |
39 | 50 | | |
40 | 51 | | |
| 52 | + | |
41 | 53 | | |
42 | 54 | | |
43 | 55 | | |
| |||
399 | 411 | | |
400 | 412 | | |
401 | 413 | | |
402 | | - | |
403 | | - | |
404 | | - | |
405 | | - | |
406 | | - | |
407 | | - | |
408 | | - | |
409 | | - | |
410 | | - | |
411 | | - | |
412 | | - | |
413 | | - | |
414 | | - | |
415 | | - | |
416 | | - | |
417 | | - | |
418 | | - | |
419 | | - | |
420 | | - | |
421 | | - | |
422 | | - | |
| 414 | + | |
| 415 | + | |
| 416 | + | |
| 417 | + | |
| 418 | + | |
| 419 | + | |
| 420 | + | |
| 421 | + | |
| 422 | + | |
| 423 | + | |
| 424 | + | |
| 425 | + | |
| 426 | + | |
| 427 | + | |
| 428 | + | |
| 429 | + | |
| 430 | + | |
| 431 | + | |
| 432 | + | |
| 433 | + | |
| 434 | + | |
| 435 | + | |
| 436 | + | |
| 437 | + | |
| 438 | + | |
| 439 | + | |
| 440 | + | |
423 | 441 | | |
424 | 442 | | |
425 | 443 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
45 | 45 | | |
46 | 46 | | |
47 | 47 | | |
48 | | - | |
| 48 | + | |
49 | 49 | | |
50 | 50 | | |
51 | 51 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
20 | 20 | | |
21 | 21 | | |
22 | 22 | | |
23 | | - | |
| 23 | + | |
24 | 24 | | |
25 | 25 | | |
26 | 26 | | |
| |||
792 | 792 | | |
793 | 793 | | |
794 | 794 | | |
795 | | - | |
796 | | - | |
| 795 | + | |
797 | 796 | | |
798 | 797 | | |
799 | | - | |
| 798 | + | |
800 | 799 | | |
801 | 800 | | |
802 | 801 | | |
803 | 802 | | |
804 | | - | |
805 | | - | |
| 803 | + | |
806 | 804 | | |
807 | 805 | | |
808 | 806 | | |
809 | 807 | | |
810 | 808 | | |
811 | 809 | | |
812 | | - | |
813 | | - | |
| 810 | + | |
| 811 | + | |
814 | 812 | | |
815 | | - | |
816 | | - | |
817 | | - | |
818 | | - | |
| 813 | + | |
| 814 | + | |
| 815 | + | |
| 816 | + | |
| 817 | + | |
| 818 | + | |
| 819 | + | |
| 820 | + | |
| 821 | + | |
| 822 | + | |
| 823 | + | |
| 824 | + | |
| 825 | + | |
| 826 | + | |
| 827 | + | |
| 828 | + | |
| 829 | + | |
819 | 830 | | |
820 | | - | |
821 | | - | |
822 | | - | |
823 | | - | |
824 | | - | |
825 | | - | |
826 | | - | |
827 | | - | |
828 | | - | |
829 | | - | |
830 | | - | |
831 | | - | |
832 | | - | |
833 | | - | |
834 | | - | |
835 | | - | |
836 | | - | |
837 | | - | |
838 | | - | |
839 | | - | |
840 | | - | |
841 | | - | |
| 831 | + | |
| 832 | + | |
| 833 | + | |
| 834 | + | |
| 835 | + | |
| 836 | + | |
842 | 837 | | |
843 | | - | |
844 | | - | |
845 | | - | |
846 | | - | |
847 | | - | |
848 | | - | |
849 | | - | |
850 | | - | |
851 | | - | |
852 | | - | |
853 | | - | |
854 | | - | |
855 | | - | |
856 | | - | |
857 | | - | |
858 | | - | |
859 | | - | |
860 | | - | |
861 | | - | |
862 | | - | |
863 | | - | |
864 | | - | |
865 | | - | |
866 | | - | |
867 | | - | |
868 | | - | |
869 | | - | |
870 | | - | |
871 | | - | |
872 | | - | |
873 | | - | |
874 | | - | |
875 | | - | |
876 | | - | |
877 | | - | |
878 | | - | |
879 | | - | |
880 | | - | |
881 | | - | |
882 | | - | |
883 | | - | |
884 | | - | |
885 | | - | |
886 | | - | |
887 | | - | |
888 | | - | |
| 838 | + | |
| 839 | + | |
| 840 | + | |
| 841 | + | |
| 842 | + | |
| 843 | + | |
| 844 | + | |
| 845 | + | |
| 846 | + | |
| 847 | + | |
| 848 | + | |
| 849 | + | |
| 850 | + | |
| 851 | + | |
| 852 | + | |
| 853 | + | |
| 854 | + | |
| 855 | + | |
| 856 | + | |
| 857 | + | |
| 858 | + | |
| 859 | + | |
| 860 | + | |
| 861 | + | |
| 862 | + | |
| 863 | + | |
| 864 | + | |
| 865 | + | |
| 866 | + | |
| 867 | + | |
| 868 | + | |
| 869 | + | |
| 870 | + | |
| 871 | + | |
| 872 | + | |
889 | 873 | | |
890 | | - | |
891 | | - | |
892 | | - | |
893 | | - | |
894 | | - | |
895 | | - | |
| 874 | + | |
| 875 | + | |
| 876 | + | |
| 877 | + | |
| 878 | + | |
| 879 | + | |
| 880 | + | |
| 881 | + | |
| 882 | + | |
896 | 883 | | |
897 | | - | |
| 884 | + | |
898 | 885 | | |
899 | 886 | | |
900 | 887 | | |
901 | 888 | | |
902 | | - | |
| 889 | + | |
903 | 890 | | |
904 | 891 | | |
905 | 892 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
10 | 10 | | |
11 | 11 | | |
12 | 12 | | |
13 | | - | |
| 13 | + | |
14 | 14 | | |
15 | 15 | | |
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
19 | 19 | | |
20 | 20 | | |
21 | | - | |
22 | 21 | | |
23 | 22 | | |
0 commit comments