Commit b5efa32

committed
try to clarify branch mode
1 parent ece62a9 commit b5efa32

1 file changed

Lines changed: 13 additions & 13 deletions

docs/stats.md

@@ -776,18 +776,18 @@ Out of all of the available summary functions, only {math}`r^2` uses
 
 #### Branch
 
-The `branch` mode computes two-locus statistics between pairs of trees. The
-resulting statistic is a summary depending on marginal tree topologies and the
-branch lengths of the trees at a pair of positions. To perform this
-computation, we consider the counts of all possible two-locus haplotypes that
-could be generated by mutations on each pair of branches between the two
-trees, which is summarized using an LD statistic. We then sum each pairwise
-comparison weighted by the product of the two branch lengths above the nodes on
-each tree.
-
-Similar to the single-site statistics computed in `branch` mode, this results
-in a statistic that is proportional to the expected statistic under an infinite
-sites model (with mutation rate 1), conditioned on the pair of trees.
+The `branch` mode computes expected two-locus statistics between pairs of
+trees, conditioned on the marginal topologies and branch lengths of those
+trees. The trees for which we compute statistics are specified by positions,
+and for a pair of positions we consider all possible haplotypes that could be
+generated by a single mutation occurring at the two trees.
+
+For two trees, one with {math}`n` branches and the other with {math}`m`
+branches, there are {math}`nm` possible pairs of branches that may carry the
+pair of mutations. For each pair, we compute the two-locus statistic, and then
+sum these values weighted by the product of the two branch lengths. Given the
+two mutations occur, this accounts for the relative probability that the two
+mutations fall on any pair of branches.
 
 The time complexity of this method is quadratic in the number of samples,
 due to the pairwise comparisons of branches from each pair of trees.
@@ -892,7 +892,7 @@ input. Each of our summary functions has the signature
 : {math}`f(w_{Ab}, w_{aB}, w_{AB}, n) = p_{ab} - p_{a}p_{b}`
 
 This statistic is polarised, as the unpolarised result, which averages over
-allele labelings, is zero. Uses the `total` normalisation method.
+allele labelings, is zero. Uses the `total` weighting method.
 
 `D_prime`
 : {math}`f(w_{Ab}, w_{aB}, w_{AB}, n) = \frac{D}{D_{\max}}`,
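
The reworded `branch` paragraph in this commit describes a sum over all {math}`nm` branch pairs, weighted by the product of branch lengths. A minimal Python sketch of that weighting is given here for illustration only. It is not the library's implementation: the representation of a tree as a list of `(branch_length, samples_below)` tuples and the names `d_stat` and `branch_two_locus` are assumptions made for this example, and no normalisation or weighting method such as `total` is applied.

```python
from itertools import product


def d_stat(w_Ab, w_aB, w_AB, n):
    """Summary function with the signature used in the docs: D = p_AB - p_A * p_B."""
    p_AB = w_AB / n
    p_A = (w_AB + w_Ab) / n
    p_B = (w_AB + w_aB) / n
    return p_AB - p_A * p_B


def branch_two_locus(tree_a, tree_b, n, summary=d_stat):
    """Sum a two-locus summary over every pair of branches from two trees.

    Each tree is a list of (branch_length, samples_below) tuples, a
    hypothetical stand-in for a real tree structure. A mutation on a branch
    is carried by exactly the samples below that branch.
    """
    total = 0.0
    # n branches in tree_a times m branches in tree_b gives n * m pairs.
    for (len_a, below_a), (len_b, below_b) in product(tree_a, tree_b):
        w_AB = len(below_a & below_b)   # samples carrying both mutations
        w_Ab = len(below_a) - w_AB      # samples carrying only the first
        w_aB = len(below_b) - w_AB      # samples carrying only the second
        # Weight the pair by the product of the two branch lengths.
        total += len_a * len_b * summary(w_Ab, w_aB, w_AB, n)
    return total


# Two samples, each on its own branch of length 1.
two_leaf = [(1.0, frozenset({0})), (1.0, frozenset({1}))]


def d_squared(w_Ab, w_aB, w_AB, n):
    return d_stat(w_Ab, w_aB, w_AB, n) ** 2


d_total = branch_two_locus(two_leaf, two_leaf, n=2)
d2_total = branch_two_locus(two_leaf, two_leaf, n=2, summary=d_squared)
```

In this toy case the signed `D` contributions cancel across the four branch pairs, while the squared version does not. The nested iteration over both branch lists is also where the quadratic cost mentioned in the docs comes from.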
