@@ -740,11 +740,33 @@ ld = ts.ld_matrix(sites=[[0, 1], [1, 2, 3]])
740 740 print(ld)
741 741 ```
742 742
743- Because we implement two-locus statistics for multi-allelic data, we require
744- a method for combining the statistical results from each pair of alleles into
745- one summary for a pair of sites. There two methods for combining results from
746- multiple alleles, `hap_weighted` and `total_weighted`, which are
747- statistic-specific and not chosen by the user:
743+ Because two-locus statistics can be computed for multi-allelic data, we need
744+ a way to combine the results from each pair of alleles into one summary for
745+ a pair of sites. We use two weighting schemes for combining results from
746+ multiple alleles: `hap_weighted` and `total_weighted`. The scheme is
747+ determined by the statistic and cannot be chosen by the user.
748+
749+ Briefly, consider a pair of sites with {math}`n` alleles at the first locus and
750+ {math}`m` alleles at the second. Let {math}`f_{ij}` denote the statistic
751+ computed for focal alleles {math}`A_i` and {math}`B_j`, with haplotype weights
752+ {math}`(A_i B_j, A_i b_j, a_i B_j)`, where {math}`a_i` and {math}`b_j` denote
753+ the collections of alleles other than the focal alleles {math}`A_i` and
754+ {math}`B_j`, respectively. Then the weighting schemes are defined as:
755+
756+ - `hap_weighted`: {math}`\sum_{i=1}^{n}\sum_{j=1}^{m}p(A_{i}B_{j})f_{ij}`,
757+ where {math}`p(A_{i}B_{j})` is the frequency of haplotype {math}`A_{i}B_{j}`.
758+ This method was first introduced in [Karlin
759+ (1981)](https://doi.org/10.1111/j.1469-1809.1981.tb00308.x) and reviewed in
760+ [Zhao (2007)](https://doi.org/10.1017/S0016672307008634).
761+
762+ - `total_weighted`: {math}`\frac{1}{n m}\sum_{i=1}^{n}\sum_{j=1}^{m}f_{ij}`.
763+ This method assigns equal weight to each of the possible pairs of focal
764+ alleles at the two sites, taking the arithmetic mean of statistics over
765+ focal haplotypes.
766+
767+ Of the available summary functions, only {math}`r^2` uses
768+ `hap_weighted` weighting; the remainder use uniform weighting
769+ (`total_weighted`).
748 770
749 771 (sec_stats_two_locus_branch)=
750 772
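To make the two weighting schemes in the added text concrete, here is a minimal pure-Python sketch. It is an illustration only, not tskit's implementation: the function names `per_pair_r2`, `hap_weighted` and `total_weighted`, the haplotype frequency table `p`, and the choice of the standard biallelic {math}`r^2` as the per-pair statistic {math}`f_{ij}` are all assumptions made for this example.

```python
# Illustrative sketch only; names and data are hypothetical, not tskit API.

def per_pair_r2(p):
    """Return f[i][j]: r^2 for focal alleles A_i and B_j, treating all other
    alleles at each site as a single non-focal class (a_i, b_j).
    Assumes every allele frequency is strictly between 0 and 1."""
    n, m = len(p), len(p[0])
    p_a = [sum(p[i]) for i in range(n)]                        # p(A_i)
    p_b = [sum(p[i][j] for i in range(n)) for j in range(m)]   # p(B_j)
    f = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = p[i][j] - p_a[i] * p_b[j]                      # D for (A_i, B_j)
            f[i][j] = d * d / (p_a[i] * (1 - p_a[i]) * p_b[j] * (1 - p_b[j]))
    return f

def hap_weighted(f, p):
    """sum_i sum_j p(A_i B_j) * f_ij"""
    return sum(p[i][j] * f[i][j]
               for i in range(len(f)) for j in range(len(f[0])))

def total_weighted(f):
    """(1 / (n * m)) * sum_i sum_j f_ij"""
    n, m = len(f), len(f[0])
    return sum(sum(row) for row in f) / (n * m)

# Hypothetical haplotype frequencies p[i][j] = p(A_i B_j): n = 2 alleles at the
# first site, m = 3 alleles at the second. Entries sum to 1.
p = [[0.3, 0.1, 0.1],
     [0.1, 0.1, 0.3]]
f = per_pair_r2(p)

print("hap_weighted:  ", hap_weighted(f, p))
print("total_weighted:", total_weighted(f))
```

With this table the two summaries differ, because `hap_weighted` gives more weight to the common haplotypes while `total_weighted` treats all {math}`n m` allele pairs equally.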