Commit 8912fe4

Add some documentation notes on lineages
Closes #2294
1 parent 1c3e143 commit 8912fe4

2 files changed

Lines changed: 32 additions & 0 deletions

algorithms.py

Lines changed: 11 additions & 0 deletions
@@ -734,6 +734,17 @@ def __repr__(self):
 
 @dataclasses.dataclass
 class Lineage:
+    """
+    A lineage represents a single genome in a coalescent model simulation,
+    and keeps track of the head and tail of the ancestry segment lists.
+    For the SMC(k) model, we also keep a Hull object which represents the
+    information required to implement the search indexes for that model.
+
+    Note that the situation with the DTWF and pedigree models is confusing
+    because we use segment chains to represent ancestry, which have lineages
+    associated with them, but they're not used in any meaningful way.
+    """
+
     head: Segment
     tail: Segment
     population: int = -1
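
The docstring above describes a lineage as the holder of the head and tail of a segment chain. The following is a minimal, self-contained sketch of that relationship, not the code from algorithms.py. The `head`, `tail`, and `population` fields follow the diff; the `Segment` fields shown here, the `segments()` method, and the `make_lineage` helper are assumptions made for illustration only.

```python
from __future__ import annotations

import dataclasses
from typing import Iterator, Optional


@dataclasses.dataclass
class Segment:
    # Genomic interval carried by this segment (field names assumed).
    left: float
    right: float
    prev: Optional[Segment] = None
    next: Optional[Segment] = None
    # Back-pointer to the owning lineage, mirroring the C `lineage` member.
    lineage: Optional[Lineage] = None


@dataclasses.dataclass
class Lineage:
    head: Segment
    tail: Segment
    population: int = -1

    def segments(self) -> Iterator[Segment]:
        """Walk the chain from head to tail (hypothetical convenience method)."""
        seg = self.head
        while seg is not None:
            yield seg
            seg = seg.next


def make_lineage(intervals, population=0) -> Lineage:
    """Hypothetical helper: build a linked chain from sorted (left, right) pairs."""
    segs = [Segment(left, right) for left, right in intervals]
    # Link neighbours in both directions.
    for a, b in zip(segs, segs[1:]):
        a.next, b.prev = b, a
    lineage = Lineage(head=segs[0], tail=segs[-1], population=population)
    # Every segment in the chain points back at its lineage.
    for seg in segs:
        seg.lineage = lineage
    return lineage
```

Holding both ends of the chain lets code reach the first or last segment without a traversal, while the back-pointer lets code that starts from a segment find the lineage it belongs to.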

lib/msprime.h

Lines changed: 21 additions & 0 deletions
@@ -83,9 +83,30 @@ typedef struct segment_t_t {
     double right;
     struct segment_t_t *prev;
     struct segment_t_t *next;
+    /* NOTE: In the DTWF model we don't really use the lineage and it would be
+     * better if we explicitly reserved the concept as something only used in
+     * the coalescent models. One thing we could do is to make this member a
+     * union, which was either a lineage (for the coalescent models) or an
+     * "individual" for the DTWF/pedigree code. Thus, we could then separate
+     * the DTWF and coalescent main loops and population storage (it's
+     * pointless using AVL trees for the DTWF code) while keeping the low-level
+     * segment merging code the same. This would require a significant
+     * refactoring (rewriting, really) of the DTWF code, though.
+     */
     struct lineage_t_t *lineage;
 } segment_t;
 
+/* A lineage represents a single ancestral (or sample) genome in the coalescent
+ * models, and keeps track of the head and tail of the segment chains. These
+ * lineages are what are stored in the populations. For the SMC(k) model,
+ * we also store a hull_t object, which keeps track of the information required
+ * to implement the indexes for that model.
+ *
+ * Note that the situation is quite confusing for the DTWF and pedigree models,
+ * which sort-of use the same structures as the coalescent models, but don't
+ * really use them in a meaningful way. So, while we need to define lineages
+ * as well as segments here, they don't actually do anything.
+ */
 typedef struct lineage_t_t {
     population_id_t population;
     label_id_t label;
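
The NOTE in segment_t floats the idea of making the owner member a union of a lineage and an "individual". The sketch below illustrates that idea in Python rather than C, purely to show the shape of the proposal; it is not part of this commit and nothing like it exists in the code. The `Individual` class, the `owner` field name, and both functions are hypothetical.

```python
from __future__ import annotations

import dataclasses
from typing import Union


@dataclasses.dataclass
class Lineage:
    """Owner of a segment chain in the coalescent models."""
    population: int = -1
    label: int = 0


@dataclasses.dataclass
class Individual:
    """Hypothetical owner of a segment chain in the DTWF/pedigree code."""
    id: int = -1


@dataclasses.dataclass
class Segment:
    left: float
    right: float
    # One kind of owner per simulation model, playing the role of the
    # proposed C union member (field name is an assumption).
    owner: Union[Lineage, Individual, None] = None


def describe_owner(seg: Segment) -> str:
    """Dispatch on the owner type, as model-specific main loops would."""
    if isinstance(seg.owner, Lineage):
        return f"lineage in population {seg.owner.population}"
    if isinstance(seg.owner, Individual):
        return f"individual {seg.owner.id}"
    return "unowned"


def span(seg: Segment) -> float:
    """Owner-agnostic code only needs the interval, whatever the owner is."""
    return seg.right - seg.left
```

Under this arrangement only the model-specific code inspects the owner, which is the separation the NOTE is after: low-level segment handling such as `span` stays identical for both kinds of owner.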
