Commit 8e43436

Describe algorithm for graph layout (#112)
This is a start on documentation of the main algorithm (issue #6). I have been poking around in the code for some time and keep discovering more nuances. I think the full algorithm documentation will take a significant amount of time, so it seems more sensible to write it gradually. The goal of this PR is to describe the main steps of the algorithm in broad terms.
1 parent 529681b commit 8e43436

3 files changed

Lines changed: 130 additions & 1 deletion

docs/branch_assignment.md

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+
+# Overview
+
+To generate a graph, [GitGraph::new()] reads the repository
+and assigns every commit to a single branch.
+
+It takes the following steps to generate the graph:
+
+- Identify branches
+- Sort branches by persistence
+- Trace branches to commits
+- Filtering and indexing
+
+## Identify branches
+Local and remote git-branches and tags are used as candidates for branches.
+A branch can also be identified by a merge commit, even though no current git-branch
+refers to it.
+
+## Sort branches by persistence
+Each branch is assigned a persistence, which can be configured in the settings.
+Think of persistence as a z-order where lower values take precedence.
+**TODO** Merge branch
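The text does not say how a persistence value is derived from the settings, so the following is only an illustrative sketch under an assumed rule: a hypothetical list of name prefixes, where the position of the first matching prefix is the persistence and unmatched branches come last. The names `persistence` and `sort_by_persistence` are made up for this example.

```rust
/// Hypothetical rule: persistence is the position of the first prefix that
/// matches the branch name. Unmatched names get the largest value, which
/// means the lowest precedence.
fn persistence(name: &str, prefixes: &[&str]) -> usize {
    prefixes
        .iter()
        .position(|prefix| name.starts_with(*prefix))
        .unwrap_or(prefixes.len())
}

/// Sort branch names so that lower persistence values come first.
/// The sort is stable, so ties keep their original order.
fn sort_by_persistence(names: &mut [&str], prefixes: &[&str]) {
    names.sort_by_key(|name| persistence(name, prefixes));
}
```

Because lower values take precedence, branches sorted this way get to pick their commits first in the tracing step.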
+
+## Trace branches to commits
+The branches now get to pick their commits, in order of persistence. Each
+branch starts at its head and follows the primary parent while it is
+available. It stops when the parent is a commit already assigned to a branch.
+**TODO** Duplicate branch names
+**TODO** Handle visual artifacts on merge
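The tracing step described above can be sketched roughly as follows. This is a simplified model, not the code in `src/graph.rs`: the `Commit` struct, its fields, and `trace_branches` are hypothetical, and commits are referenced by index rather than by git object id.

```rust
/// Hypothetical, simplified commit: only the primary (first) parent is kept,
/// as an index into the commit list.
struct Commit {
    primary_parent: Option<usize>,
    branch: Option<usize>,
}

/// Let each branch claim commits, starting at its head and following the
/// primary parent. `heads` holds one head commit index per branch and is
/// assumed to be sorted by persistence already.
fn trace_branches(commits: &mut [Commit], heads: &[usize]) {
    for (branch_idx, &head) in heads.iter().enumerate() {
        let mut current = Some(head);
        while let Some(idx) = current {
            if commits[idx].branch.is_some() {
                // Already claimed by a branch with higher precedence: stop.
                break;
            }
            commits[idx].branch = Some(branch_idx);
            current = commits[idx].primary_parent;
        }
    }
}
```

In this model a branch whose head is already claimed ends up with no commits at all, which matches the later remark that branches without a defined range are filtered out before column assignment.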
+
+## Filtering and indexing
+Commits that have not been assigned a branch are filtered out.
+An *index_map* is created to map from the original commit index to the
+filtered commit index.
+**TODO** what? why? Would it not be better to track from child/heads instead of every single commit in repo?
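As a rough illustration of the index map, the sketch below renumbers the surviving commits and records `None` for removed ones. It is a simplification: `build_index_map` and the `assigned` flags are hypothetical, whereas the code in this commit goes through git object ids and a second `HashMap`.

```rust
use std::collections::HashMap;

/// Map each original commit index to its index after filtering.
/// `assigned[i]` is true if commit `i` was assigned to a branch.
/// Removed commits map to `None`.
fn build_index_map(assigned: &[bool]) -> HashMap<usize, Option<usize>> {
    let mut next = 0;
    assigned
        .iter()
        .enumerate()
        .map(|(old, &keep)| {
            let new = if keep {
                let n = next;
                next += 1;
                Some(n)
            } else {
                None
            };
            (old, new)
        })
        .collect()
}
```

The `None` entries matter later: a branch range whose endpoint was removed has to be shrunk to the nearest surviving commit.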
+
+
+
+
+# Branch sorting
+The goal of this algorithm is to assign a column number to each tracked branch so that branches can be drawn as lines that do not overlap in the graph. It uses a shortest-first scheduling strategy; longest-first scheduling and forward or backward sorting by start commit are optional.
+
+## Initialization
+- occupied: A vector of vectors of vectors of tuples.
+The outer vector is indexed by the branch's order_group (determined by branch_order based on settings.branches.order).
+Each inner vector represents a column within that order group,
+and the tuples (start, end) store the range of commits occupied by a branch in that column.
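To make the nesting of `occupied` concrete, here is a small sketch of one order group and a first-fit search for a free column. The first-fit rule, the inclusive treatment of ranges, and the names `Column`, `Group` and `claim_column` are assumptions for illustration; the text above only describes the shape of the data.

```rust
/// Commit ranges (start, end) occupied in one column.
type Column = Vec<(usize, usize)>;
/// All columns of one order group.
type Group = Vec<Column>;

/// Find the first column of `group` in which `range` overlaps nothing,
/// appending a new column if none is free, and mark the range as occupied.
/// Ranges are treated as inclusive (an assumption).
fn claim_column(group: &mut Group, range: (usize, usize)) -> usize {
    let free = group.iter().position(|column| {
        column
            .iter()
            .all(|&(start, end)| range.1 < start || end < range.0)
    });
    let idx = match free {
        Some(idx) => idx,
        None => {
            group.push(Vec::new());
            group.len() - 1
        }
    };
    group[idx].push(range);
    idx
}
```

The full `occupied` structure would then be a `Vec<Group>` with one entry per order group, which gives the "vector of vectors of vectors of tuples" described above.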
+
+## Preparing Branches for Sorting
+- It creates branches_sort, a vector of tuples containing the branch index, its start commit index (range.0), its end commit index (range.1), its source order group, and its target order group.
+- It filters out branches that don't have a defined range (meaning they weren't associated with any commits).
+## Sorting Branches
+- The branches_sort vector is sorted by a key that prioritizes:
+  1. The maximum of the source and target order groups. This likely aims to keep related branches (e.g., those involved in merges) closer together.
+  2. The length of the branch's lifespan (end - start commit index), either shortest-first or longest-first depending on the shortest_first setting.
+  3. The starting commit index, either forward or backward depending on the forward setting.
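The three-part sort key can be sketched as a tuple comparison. The tuple layout follows the description of branches_sort above; the `BranchSort` alias and the `sort_branches` function are illustrative, and the sketch assumes `end >= start` for every branch.

```rust
/// (branch index, start, end, source order group, target order group)
type BranchSort = (usize, usize, usize, usize, usize);

/// Sort by (max order group, lifespan length, start index). The sign flips
/// implement longest-first and backward ordering.
fn sort_branches(branches: &mut [BranchSort], shortest_first: bool, forward: bool) {
    branches.sort_by_cached_key(|&(_, start, end, source, target)| {
        let group = source.max(target);
        let length = (end - start) as i64;
        let start = start as i64;
        (
            group,
            if shortest_first { length } else { -length },
            if forward { start } else { -start },
        )
    });
}
```

Tuples compare lexicographically, so the order group dominates, the lifespan breaks ties within a group, and the start index breaks the remaining ties.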

src/graph.rs

Lines changed: 8 additions & 0 deletions
@@ -43,11 +43,13 @@ pub struct GitGraph {
 }
 
 impl GitGraph {
+    /// Generate a branch graph for a repository
     pub fn new(
         mut repository: Repository,
         settings: &Settings,
         max_count: Option<usize>,
     ) -> Result<Self, String> {
+        #![doc = include_str!("../docs/branch_assignment.md")]
         let mut stashes = HashSet::new();
         repository
             .stash_foreach(|_, _, oid| {
@@ -72,6 +74,8 @@ impl GitGraph {
 
         let head = HeadInfo::new(&repository.head().map_err(|err| err.message().to_string())?)?;
 
+        // commits will hold the CommitInfo for all commits covered
+        // indices maps git object id to an index into commits.
         let mut commits = Vec::new();
         let mut indices = HashMap::new();
         let mut idx = 0;
@@ -112,22 +116,26 @@ impl GitGraph {
             forward,
         );
 
+        // Remove commits not on a branch. This will give all commits a new index.
         let filtered_commits: Vec<CommitInfo> = commits
             .into_iter()
             .filter(|info| info.branch_trace.is_some())
             .collect();
 
+        // Create indices from git object id into the filtered commits
         let filtered_indices: HashMap<Oid, usize> = filtered_commits
             .iter()
             .enumerate()
             .map(|(idx, info)| (info.oid, idx))
             .collect();
 
+        // Map from old index to new index. None, if old index was removed
         let index_map: HashMap<usize, Option<&usize>> = indices
             .iter()
             .map(|(oid, index)| (*index, filtered_indices.get(oid)))
             .collect();
 
+        // Update branch.range from old to new index. Shrink if endpoints were removed.
         for branch in all_branches.iter_mut() {
             if let Some(mut start_idx) = branch.range.0 {
                 let mut idx0 = index_map[&start_idx];

src/print/unicode.rs

Lines changed: 66 additions & 1 deletion
@@ -83,6 +83,8 @@ pub fn print_unicode(graph: &GitGraph, settings: &Settings) -> Result<UnicodeGra
         None
     };
 
+    // Compute commit text into text_lines and add blank rows
+    // if needed to match branch graph inserts.
     let mut index_map = vec![];
     let mut text_lines = vec![];
     let mut offset = 0;
@@ -133,6 +135,7 @@ pub fn print_unicode(graph: &GitGraph, settings: &Settings) -> Result<UnicodeGra
         [SPACE, WHITE, settings.branches.persistence.len() as u8 + 2],
     );
 
+    // Compute branch lines in grid
     for (idx, info) in graph.commits.iter().enumerate() {
         if let Some(trace) = info.branch_trace {
             let branch = &graph.all_branches[trace];
@@ -418,11 +421,38 @@ fn hline(
     }
 }
 
-/// Calculates required additional rows
+/// Calculates required additional rows to visually connect commits that
+/// are not direct descendants in the main commit list. These "inserts"
+/// represent the horizontal lines in the graph.
+///
+/// # Arguments
+///
+/// * `graph`: A reference to the `GitGraph` structure containing the
+///   commit and branch information.
+/// * `compact`: A boolean indicating whether to use a compact layout,
+///   potentially merging some insertions with commits.
+///
+/// # Returns
+///
+/// A `HashMap` where the keys are the indices of commits in the
+/// `graph.commits` vector, and the values are vectors of vectors
+/// of `Occ`. Each inner vector represents a potential row of
+/// insertions needed *before* the commit at the key index. The
+/// `Occ` enum describes what occupies a cell in that row
+/// (either a commit or a range representing a connection).
+///
 fn get_inserts(graph: &GitGraph, compact: bool) -> HashMap<usize, Vec<Vec<Occ>>> {
+    // Initialize an empty HashMap to store the required insertions. The key is the commit
+    // index, and the value is a vector of rows, where each row is a vector of Occupations (`Occ`).
     let mut inserts: HashMap<usize, Vec<Vec<Occ>>> = HashMap::new();
 
+    // First, for each commit, we initialize an entry in the `inserts`
+    // map with a single row containing the commit itself. This ensures
+    // that every commit has a position in the grid.
     for (idx, info) in graph.commits.iter().enumerate() {
+        // Get the visual column assigned to the branch of this commit. Unwrap is safe here
+        // because `branch_trace` should always point to a valid branch with an assigned column
+        // for commits that are included in the filtered graph.
         let column = graph.all_branches[info.branch_trace.unwrap()]
             .visual
             .column
@@ -431,30 +461,56 @@ fn get_inserts(graph: &GitGraph, compact: bool) -> HashMap<usize, Vec<Vec<Occ>>>
         inserts.insert(idx, vec![vec![Occ::Commit(idx, column)]]);
     }
 
+    // Now, iterate through the commits again to identify connections
+    // needed between a commit and any parent that is not directly
+    // adjacent to it in the `graph.commits` list.
     for (idx, info) in graph.commits.iter().enumerate() {
+        // If the commit has a branch trace (meaning it belongs to a visualized branch).
         if let Some(trace) = info.branch_trace {
+            // Get the `BranchInfo` for the current commit's branch.
             let branch = &graph.all_branches[trace];
+            // Get the visual column of the current commit's branch. Unwrap is safe as explained above.
             let column = branch.visual.column.unwrap();
 
+            // Iterate through the two possible parents of the current commit.
             for p in 0..2 {
+                // If the commit has a parent at this index (0 for the first parent, 1 for the second).
                 if let Some(par_oid) = info.parents[p] {
+                    // Try to find the index of the parent commit in the `graph.commits` vector.
                     if let Some(par_idx) = graph.indices.get(&par_oid) {
                         let par_info = &graph.commits[*par_idx];
                         let par_branch = &graph.all_branches[par_info.branch_trace.unwrap()];
                         let par_column = par_branch.visual.column.unwrap();
+                        // Determine the sorted range of columns between the current commit and its parent.
                         let column_range = sorted(column, par_column);
 
+                        // If the column of the current commit is different from the column of its parent,
+                        // it means we need to draw a horizontal line (an "insert") to connect them.
                         if column != par_column {
+                            // Find the index in the `graph.commits` list where the visual connection
+                            // should deviate from the parent's line. This helps in drawing the graph
+                            // correctly when branches diverge or merge.
                             let split_index = super::get_deviate_index(graph, idx, *par_idx);
+                            // Access the entry in the `inserts` map for the `split_index`.
                             match inserts.entry(split_index) {
+                                // If there's already an entry at this `split_index` (meaning other
+                                // insertions might be needed before this commit).
                                 Occupied(mut entry) => {
+                                    // Find the first available row in the existing vector of rows
+                                    // where the new range doesn't overlap with existing occupations.
                                     let mut insert_at = entry.get().len();
                                     for (insert_idx, sub_entry) in entry.get().iter().enumerate() {
                                         let mut occ = false;
+                                        // Check for overlaps with existing `Occ` in the current row.
                                         for other_range in sub_entry {
+                                            // Check if the current column range overlaps with the other range.
                                             if other_range.overlaps(&column_range) {
                                                 match other_range {
+                                                    // If the other occupation is a commit.
                                                     Occ::Commit(target_index, _) => {
+                                                        // In compact mode, we might allow overlap with the commit itself
+                                                        // for merge commits (specifically the second parent) to keep the
+                                                        // graph tighter.
                                                         if !compact
                                                             || !info.is_merge
                                                             || idx != *target_index
@@ -464,7 +520,9 @@ fn get_inserts(graph: &GitGraph, compact: bool) -> HashMap<usize, Vec<Vec<Occ>>>
                                                             break;
                                                         }
                                                     }
+                                                    // If the other occupation is a range (another connection).
                                                     Occ::Range(o_idx, o_par_idx, _, _) => {
+                                                        // Avoid overlap with connections between the same commits.
                                                         if idx != *o_idx && par_idx != o_par_idx {
                                                             occ = true;
                                                             break;
@@ -473,12 +531,15 @@ fn get_inserts(graph: &GitGraph, compact: bool) -> HashMap<usize, Vec<Vec<Occ>>>
                                                 }
                                             }
                                         }
+                                        // If no overlap is found in this row, we can insert here.
                                         if !occ {
                                             insert_at = insert_idx;
                                             break;
                                         }
                                     }
+                                    // Get a mutable reference to the vector of rows for this `split_index`.
                                     let vec = entry.get_mut();
+                                    // If no suitable row was found, add a new row.
                                     if insert_at == vec.len() {
                                         vec.push(vec![Occ::Range(
                                             idx,
@@ -487,6 +548,7 @@ fn get_inserts(graph: &GitGraph, compact: bool) -> HashMap<usize, Vec<Vec<Occ>>>
                                             column_range.1,
                                         )]);
                                     } else {
+                                        // Otherwise, insert the new range into the found row.
                                         vec[insert_at].push(Occ::Range(
                                             idx,
                                             *par_idx,
@@ -495,7 +557,9 @@ fn get_inserts(graph: &GitGraph, compact: bool) -> HashMap<usize, Vec<Vec<Occ>>>
                                         ));
                                     }
                                 }
+                                // If there's no entry at this `split_index` yet.
                                 Vacant(entry) => {
+                                    // Create a new entry with a single row containing the range.
                                     entry.insert(vec![vec![Occ::Range(
                                         idx,
                                         *par_idx,
@@ -511,6 +575,7 @@ fn get_inserts(graph: &GitGraph, compact: bool) -> HashMap<usize, Vec<Vec<Occ>>>
         }
     }
 
+    // Return the map of required insertions.
     inserts
 }
 
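The row search inside `get_inserts` is a first-fit placement: scan the existing rows for one where the new connection overlaps nothing, and append a new row otherwise. The sketch below isolates that idea. It is a simplification with hypothetical names (`Span`, `overlaps`, `place_in_rows`): it keeps only the column span, assumes inclusive spans, and omits the special cases for compact mode and for connections that share a commit.

```rust
/// Hypothetical stand-in for `Occ::Range`: only the column span is kept.
type Span = (usize, usize);

/// Inclusive overlap test between two column spans (an assumption).
fn overlaps(a: Span, b: Span) -> bool {
    a.0 <= b.1 && b.0 <= a.1
}

/// Put `span` into the first row where it overlaps nothing, or append a
/// new row. Returns the index of the row that was used.
fn place_in_rows(rows: &mut Vec<Vec<Span>>, span: Span) -> usize {
    let row = rows
        .iter()
        .position(|row| row.iter().all(|&other| !overlaps(other, span)));
    match row {
        Some(idx) => {
            rows[idx].push(span);
            idx
        }
        None => {
            rows.push(vec![span]);
            rows.len() - 1
        }
    }
}
```

First-fit keeps the number of extra rows small without any backtracking, which is presumably why the horizontal connections are packed this way.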
