Commit f8f4d51

Clarify individual table sorting requirements before subset
1 parent 25a26a7 commit f8f4d51

2 files changed

Lines changed: 16 additions & 9 deletions

python/tskit/tables.py

Lines changed: 7 additions & 4 deletions
@@ -3807,7 +3807,8 @@ def sort_individuals(self):
  """
  Sorts the individual table in place, so that parents come before children,
  and the parent column is remapped as required. Node references to individuals
- are also updated.
+ are also updated. This is a stricter order than is required for a valid tree
+ sequence.
  """
  self._ll_tables.sort_individuals()
  # TODO add provenance
@@ -3816,9 +3817,11 @@ def canonicalise(self, remove_unreferenced=None):
  """
  This puts the tables in *canonical* form, imposing a stricter order on the
  tables than :ref:`required <sec_valid_tree_sequence_requirements>` for
- a valid tree sequence. In particular, the individual
- and population tables are sorted by the first node that refers to each
- (see :meth:`TreeSequence.subset`). Then, the remaining tables are sorted
+ a valid tree sequence. In particular, the population table is sorted to
+ place populations with the lowest node IDs first, and the individual table
+ is sorted firstly as in :meth:`.sort_individuals` and secondarily
+ by the lowest ID of the nodes that refer to each individual
+ (see :meth:`TreeSequence.subset`). The remaining tables are sorted
  as in :meth:`.sort`, with the modification that mutations are sorted by
  site, then time (if known), then the mutation's node's time, then number
  of descendant mutations (ensuring that parent mutations occur before
python/tskit/trees.py

Lines changed: 9 additions & 5 deletions
@@ -7513,11 +7513,15 @@ def subset(
  the ancestry of these nodes - for that, see :meth:`.simplify`.

  This has the side effect that it may change the order of the nodes,
- individuals, populations, and migrations in the tree sequence: the nodes
- in the new tree sequence will be in the order provided in ``nodes``, and
- both individuals and populations will be ordered by the earliest retained
- node that refers to them. (However, ``reorder_populations`` may be set to
- False to keep the population table unchanged.)
+ populations, individuals, and migrations in the tree sequence. Nodes
+ in the new tree sequence will be in the order provided in ``nodes``.
+ Populations will be ordered in ascending order of the lowest ID of
+ the nodes that refer to them. Individuals will be not only ordered
+ so that :attr:`~Individual.parents` come before children (see
+ :meth:`~TableCollection.sort_individuals`) but in addition
+ will be secondarily sorted in ascending order of the lowest ID of
+ their referring nodes. (However, ``reorder_populations`` may be set
+ to ``False`` to keep the population table unchanged.)

  By default, the method removes all individuals and populations not
  referenced by any nodes, and all sites not referenced by any mutations.
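
The population ordering described in the `subset` docstring can be sketched in plain Python. This is an illustration only, not tskit code: the helper name `reorder_populations` and its list-based inputs are hypothetical, and it assumes that "lowest ID" means the node's ID in the new tree sequence (its position in ``nodes``) and that unreferenced populations are dropped, as the default behaviour describes.

```python
NULL = -1  # stand-in for tskit.NULL in this sketch


def reorder_populations(node_population, nodes):
    """Return (pop_order, new_node_population) for the retained nodes.

    node_population[u] is the old population ID of old node u (or NULL);
    nodes lists the old node IDs to keep, in their new order.
    """
    pop_order = []  # old population IDs, listed in their new order
    new_id = {}  # old population ID -> new population ID
    new_node_population = []
    # New node IDs are positions in `nodes`, so the first time a population
    # is seen corresponds to its lowest referring (new) node ID.
    for old_node in nodes:
        pop = node_population[old_node]
        if pop == NULL:
            new_node_population.append(NULL)
            continue
        if pop not in new_id:
            new_id[pop] = len(pop_order)
            pop_order.append(pop)
        new_node_population.append(new_id[pop])
    return pop_order, new_node_population
```

With `node_population = [0, 1, 1, 2, NULL]` and `nodes = [3, 1, 4, 0]`, old population 2 becomes population 0 because the first retained node refers to it. Passing ``reorder_populations=False`` to the real method would skip this remapping entirely.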
