Commit 7e4152c

Apply suggestions from code review
Co-authored-by: Peter Ralph <petrel.harp@gmail.com>
1 parent c86f85e commit 7e4152c

2 files changed

Lines changed: 24 additions & 22 deletions

docs/stats.md

Lines changed: 19 additions & 20 deletions
@@ -805,10 +805,10 @@ To compute a two-way two-locus statistic, the `index` argument must be
805805
provided. The statistics are selected in the same way (with the `stat`
806806
argument), but we provide a restricted set of two-way statistics (see
807807
{ref}`sec_stats_two_locus_summary_functions_two_way`). The dimension-dropping
808-
rules for the result follow the rest of the tskit stats API in that scalar
809-
indexes will produce a single two-dimensional matrix, while list of indexes
808+
rules for the result follow the rest of the tskit stats API in that a single list
809+
or tuple will produce a single two-dimensional matrix, while a list of these
810810
will produce a three-dimensional array, with the outer dimension of length
811-
equal to the length of the list of indexes.
811+
equal to the length of the list.
812812

813813
For concreteness, we would expect the following dimensions with the specified
814814
`sample_sets` and `indexes` arguments.
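
The dimension-dropping rule described in this hunk can be sketched in plain Python. This is only an illustration of the rule as stated in the text, not tskit code: `expected_shape`, `n_rows`, and `n_cols` are hypothetical names introduced here.

```python
def expected_shape(indexes, n_rows, n_cols):
    """Hypothetical helper: output shape implied by the dimension-dropping rule.

    `indexes` is either a single pair such as (0, 1) or [0, 1], or a list of
    such pairs. `n_rows` and `n_cols` are the numbers of row and column sites
    (or positions) in the LD matrix.
    """
    # A single pair of integers drops the outer dimension: one 2-D matrix.
    if len(indexes) == 2 and all(isinstance(i, int) for i in indexes):
        return (n_rows, n_cols)
    # A list of pairs keeps the outer dimension, one matrix per pair.
    return (len(indexes), n_rows, n_cols)


print(expected_shape((0, 1), 5, 5))
print(expected_shape([(0, 1)], 5, 5))
print(expected_shape([(0, 1), (1, 2), (0, 2)], 4, 6))
```

Note that a one-element list of pairs still yields a three-dimensional result with an outer dimension of length 1, which is the distinction the reworded paragraph is drawing.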
@@ -869,47 +869,46 @@ input. Each of our summary functions has the signature
869869
allele labelings, is zero. Uses the `total` normalisation method.
870870

871871
`D_prime`
872-
: {math}`f(w_{Ab}, w_{aB}, w_{AB}, n) = \frac{D}{D_{\max}}`
872+
: {math}`f(w_{Ab}, w_{aB}, w_{AB}, n) = \frac{D}{D_{\max}}`,
873873

874-
Where {math}```
875-
D_{\max} = \begin{cases}
874+
where {math}```D_{\max} = \begin{cases}
876875
\min\{p_A (1-p_B), (1-p_A) p_B\} & \textrm{if }D \geq 0 \\
877876
\min\{p_A p_B, (1-p_A) (1-p_B)\} & \textrm{otherwise}
878877
\end{cases}```
879878

880-
and {math}`D` is defined above.
879+
and {math}`D` is defined above. Polarised, `total` weighted.
881880

882881
`D2`
883882
: {math}`f(w_{Ab}, w_{aB}, w_{AB}, n) = (p_{ab} - p_{a} p_{b})^2`
884883

885-
Unpolarised, total weighted.
884+
Unpolarised, `total` weighted.
886885

887886
`Dz`
888-
: {math}`f(w_{Ab}, w_{aB}, w_{AB}, n) = D (1 - 2 p_{a})(1 - 2p_{b})`
887+
: {math}`f(w_{Ab}, w_{aB}, w_{AB}, n) = D (1 - 2 p_{a})(1 - 2p_{b})`,
889888

890-
Where {math}`D` is defined above. Unpolarised, total weighted.
889+
where {math}`D` is defined above. Unpolarised, `total` weighted.
891890

892891
`pi2`
893892
: {math}`f(w_{Ab}, w_{aB}, w_{AB}, n) = p_{a}p_{b}(1-p_{a})(1-p_{b})`
894893

895-
Unpolarised, total weighted.
894+
Unpolarised, `total` weighted.
896895

897896
`r`
898-
: {math}`f(w_{Ab}, w_{aB}, w_{AB}, n) = \frac{D}{\sqrt{p_{a}p_{b}(1-p_{a})(1-p_{b})}}`
897+
: {math}`f(w_{Ab}, w_{aB}, w_{AB}, n) = \frac{D}{\sqrt{p_{a}p_{b}(1-p_{a})(1-p_{b})}}`,
899898

900-
Where {math}`D` is defined above. Polarised, total weighted.
899+
where {math}`D` is defined above. Polarised, `total` weighted.
901900

902901
`r2`
903-
: {math}`f(w_{Ab}, w_{aB}, w_{AB}, n) = \frac{D^{2}}{p_{a}p_{b}(1-p_{a})(1-p_{b})}`
902+
: {math}`f(w_{Ab}, w_{aB}, w_{AB}, n) = \frac{D^{2}}{p_{a}p_{b}(1-p_{a})(1-p_{b})}`,
904903

905-
Where {math}`D` is defined above. Unpolarised, haplotype weighted.
904+
where {math}`D` is defined above. Unpolarised, `haplotype` weighted.
906905

907906
Unbiased two-locus statistics from the Hill-Robertson (1968) system are
908-
computed from haplotype counts. Derivations for these unbiased estimators can
907+
computed from haplotype counts. Definitions of these unbiased estimators can
909908
be found in [Ragsdale and Gravel
910909
(2020)](https://doi.org/10.1093/molbev/msz265). They require at least 4 samples
911-
to be valid and are specified as `stat=D2_unbiased`, `Dz_unbiased`, or
912-
`pi2_unbiased`.
910+
to be valid and are specified as `stat="D2_unbiased"`, `"Dz_unbiased"`, or
911+
`"pi2_unbiased"`.
913912

914913
(sec_two_locus_summary_functions_two_way)=
915914

@@ -921,9 +920,9 @@ Two-way statistics are indexed by sample sets {math}`i, j` and compute values
921920
using haplotype counts within pairs of sample sets.
922921

923922
`D2`
924-
: {math}`f(w_{Ab}, w_{aB}, w_{AB}, n) = D_i * D_j`
923+
: {math}`f(w_{Ab}, w_{aB}, w_{AB}, n) = D_i * D_j`,
925924

926-
Where {math}`D` is defined above.
925+
where {math}`D` is defined above.
927926

928927
`r2`
929928
: {math}`f(w_{Ab}, w_{aB}, w_{AB}, n) = \frac{(p_{AB_i} - (p_{A_i} p_{B_i})) (p_{AB_j} - (p_{A_j} p_{B_j}))}{\sqrt{p_{A_i} (1 - p_{A_i}) p_{B_i} (1 - p_{B_i})}\sqrt{p_{A_j} (1 - p_{A_j}) p_{B_j} (1 - p_{B_j})}}`
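
To make the summary-function signature `f(w_Ab, w_aB, w_AB, n)` concrete, here is a minimal plain-Python sketch of a few of the statistics above, computed from haplotype counts. It is not the tskit implementation. The function names are hypothetical, frequencies are assumed to be `p_AB = w_AB / n`, `p_A = (w_AB + w_Ab) / n`, `p_B = (w_AB + w_aB) / n`, and `D_prime` assumes the standard Lewontin bound `min{p_A (1 - p_B), (1 - p_A) p_B}` for `D >= 0` and `min{p_A p_B, (1 - p_A)(1 - p_B)}` otherwise.

```python
import math


def freqs(w_Ab, w_aB, w_AB, n):
    """Convert haplotype counts to (p_AB, p_A, p_B). Assumed convention."""
    return w_AB / n, (w_AB + w_Ab) / n, (w_AB + w_aB) / n


def D(w_Ab, w_aB, w_AB, n):
    p_AB, p_A, p_B = freqs(w_Ab, w_aB, w_AB, n)
    return p_AB - p_A * p_B


def D_prime(w_Ab, w_aB, w_AB, n):
    # Undefined (division by zero) when either site is monomorphic.
    _, p_A, p_B = freqs(w_Ab, w_aB, w_AB, n)
    d = D(w_Ab, w_aB, w_AB, n)
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    return d / d_max


def r2(w_Ab, w_aB, w_AB, n):
    _, p_A, p_B = freqs(w_Ab, w_aB, w_AB, n)
    d = D(w_Ab, w_aB, w_AB, n)
    return d**2 / (p_A * (1 - p_A) * p_B * (1 - p_B))


def D2_two_way(counts_i, counts_j):
    """Two-way D2: product of D in sample sets i and j."""
    return D(*counts_i) * D(*counts_j)


def r2_two_way(counts_i, counts_j):
    """Two-way r2: D_i * D_j over the product of the per-set denominators."""
    denom = 1.0
    for counts in (counts_i, counts_j):
        _, p_A, p_B = freqs(*counts)
        denom *= math.sqrt(p_A * (1 - p_A) * p_B * (1 - p_B))
    return D2_two_way(counts_i, counts_j) / denom


# Illustrative counts only: 10 haplotypes, 4 AB, 1 Ab, 2 aB (so 3 ab).
counts = (1, 2, 4, 10)
print(D(*counts), D_prime(*counts), r2(*counts))
print(r2_two_way(counts, counts))
```

When both sample sets have the same counts, the two-way `r2` reduces to the one-way `r2`, which is a quick sanity check on the formula.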

python/tskit/trees.py

Lines changed: 5 additions & 2 deletions
@@ -10967,7 +10967,8 @@ def ld_matrix(
1096710967
Some LD statistics are defined for two sample sets instead of within a
1096810968
single set of samples. If the ``indexes`` argument is specified, at
1096910969
least two sample sets must also be specified. ``indexes`` specifies the
10970-
sample set indexes between which to compute LD.
10970+
indexes of the sample sets in the ``sample_sets`` list
10971+
between which to compute LD.
1097110972
1097210973
For more on how the ``indexes`` and ``sample_sets`` interact with the
1097310974
output dimensions, see the :ref:`sec_stats_two_locus_sample_sets`
@@ -10985,7 +10986,7 @@ def ld_matrix(
1098510986
:math:`D` y n "D"
1098610987
:math:`D'` y n "D_prime"
1098710988
:math:`D_z` n n "Dz"
10988-
:math:`\pi2` n n "pi2"
10989+
:math:`\pi_2` n n "pi2"
1098910990
:math:`\widehat{D^2}` n y "D2_unbiased"
1099010991
:math:`\widehat{D_z}` n n "Dz_unbiased"
1099110992
:math:`\widehat{\pi_2}` n n "pi2_unbiased"
@@ -11002,10 +11003,12 @@ def ld_matrix(
1100211003
specified as a list of lists to control the row and column sites.
1100311004
Only applicable in site mode. Specify as
1100411005
``[[row_sites], [col_sites]]`` or ``[all_sites]``.
11006+
Defaults to all sites.
1100511007
:param list positions: A list of genomic positions where expected LD is
1100611008
computed. Only applicable in branch mode. Can be specified as a list
1100711009
of lists to control the row and column positions. Specify as
1100811010
``[[row_positions], [col_positions]]`` or ``[all_positions]``.
11011+
Defaults to the leftmost coordinates of all trees.
1100911012
:param list indexes: A list of 2-tuples or a single 2-tuple, specifying
1101011013
the indexes of two sample sets over which to compute a two-way LD
1101111014
statistic. Only :math:`r^2`, :math:`D^2`, and :math:`\widehat{D^2}`
