Commit 9aa529c

commit ordering
Create but-workspace commit ordering function that sorts the commits by their parentage, according to the workspace appearance.
1 parent dc1ef0b commit 9aa529c

9 files changed

Lines changed: 545 additions & 22 deletions

crates/but-workspace/src/commit/squash_commits.rs

Lines changed: 3 additions & 22 deletions
@@ -3,7 +3,7 @@
 pub(crate) mod function {
     use anyhow::{Result, bail};
     use but_core::RefMetadata;
-    use but_graph::{SegmentIndex, SegmentRelation, projection::Workspace};
+    use but_graph::{SegmentRelation, projection::Workspace};
     use but_rebase::{
         commit::DateMode,
         graph_rebase::{
@@ -12,33 +12,14 @@ pub(crate) mod function {
         },
     };

+    use crate::workspace_graph::find_commit_segment_index;
+
     #[derive(Debug, Clone, Copy, Eq, PartialEq)]
     enum ReorderDirection {
         MoveSubjectAboveTarget,
         MoveSubjectBelowTarget,
     }

-    fn find_commit_segment_index(
-        workspace: &Workspace,
-        commit_id: gix::ObjectId,
-    ) -> Option<SegmentIndex> {
-        let (_, stack_segment, _) = workspace.find_commit_and_containers(commit_id)?;
-        let commit_offset = stack_segment
-            .commits
-            .iter()
-            .position(|c| c.id == commit_id)?;
-
-        let mut owning_segment = stack_segment.id;
-        for (segment_id, offset) in &stack_segment.commits_by_segment {
-            if *offset > commit_offset {
-                break;
-            }
-            owning_segment = *segment_id;
-        }
-
-        Some(owning_segment)
-    }
-
     fn determine_reorder_direction(
         workspace: &Workspace,
         repo: &gix::Repository,

crates/but-workspace/src/lib.rs

Lines changed: 4 additions & 0 deletions
@@ -35,6 +35,9 @@ pub mod legacy;
 /// Types specifically for the user-interface.
 pub mod ui;

+/// Utilities for deterministic ordering operations.
+pub mod ordering;
+
 pub mod commit_engine;
 /// Tools for manipulating trees
 pub mod tree_manipulation;
@@ -45,6 +48,7 @@ pub use tree_manipulation::discard_worktree_changes::discard_workspace_changes;
 pub mod branch;

 mod changeset;
+mod workspace_graph;

 /// Utility types for the [`WorkspaceCommit`].
 pub mod commit;
Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
# Commit Parentage Ordering

This document explains how commit selector ordering works in [commit_parentage.rs](commit_parentage.rs), and why the implementation is structured the way it is.

## Goal

Given a set of commit selectors, produce a deterministic order such that:

- Parent commits come before child commits.
- Unrelated commits still have a stable, deterministic order.
- Duplicate selectors are deduplicated by commit id (first occurrence wins).

This ordering is useful for operations that must apply commits in dependency-safe order.

## Inputs and Output

Function: `order_commit_selectors_by_parentage(editor, selectors) -> Result<Vec<Selector>>`

- Input selectors can be any type implementing `ToCommitSelector`.
- The output is a list of normalized `Selector` values.

## Preconditions and Errors

The function returns an error if a selected commit cannot be found in the workspace traversal represented by `editor.workspace`.

Why this is required:

- Deterministic tie-breaking depends on workspace traversal rank.
- Segment-based ancestry checks also depend on the workspace graph/projection.

## High-Level Pipeline

The algorithm has five phases.

1. Normalize and deduplicate input
   - Resolve each incoming selector to `(Selector, CommitOwned)` with `editor.find_selectable_commit`.
   - Keep only the first occurrence of each commit id.
   - Resolve and store each commit's owning `SegmentIndex`.

2. Compute deterministic fallback rank
   - Build a map `commit_id -> rank` from the workspace's parent-to-child traversal order.
   - This rank is used only when ancestry does not constrain the order.

3. Build ancestry constraint graph
   - For every selected pair `(left, right)`, determine their relation.
   - If `left` is an ancestor of `right`, add the directed edge `left -> right`.
   - If `right` is an ancestor of `left`, add the directed edge `right -> left`.
   - If they are unrelated, add no edge.

4. Topological sort with stable tie-breaking
   - Use Kahn's algorithm over indegrees.
   - Keep all currently ready nodes in a min-priority structure keyed by:
     - `(workspace_rank, input_order)`
   - Repeatedly pop the best ready node, emit it, and reduce the indegree of its children.

5. Validate completeness
   - If the output is shorter than the list of selected commits, the constraints were cyclic or inconsistent.
   - Return an explicit error in that case.
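
Phases 4 and 5 can be illustrated in isolation. The following is a minimal sketch, not the code from `commit_parentage.rs`: `Node`, `topo_order`, and the index-based edge list are hypothetical names chosen for the example, and the ancestry edges are assumed to have been computed already by phase 3.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// One selected commit, reduced to what the sort needs.
/// Hypothetical stand-in for the data gathered in phases 1 and 2.
#[derive(Debug, Clone, Copy)]
pub struct Node {
    pub workspace_rank: usize,
    pub input_order: usize,
}

/// Kahn's algorithm with a min-heap as the ready set.
/// `edges` holds `(ancestor, descendant)` pairs as indices into `nodes`
/// (out-of-range indices would panic; this sketch does not validate them).
/// Returns node indices in emission order, or an error on cyclic constraints.
pub fn topo_order(nodes: &[Node], edges: &[(usize, usize)]) -> Result<Vec<usize>, String> {
    let mut indegree = vec![0usize; nodes.len()];
    let mut children: Vec<Vec<usize>> = vec![Vec::new(); nodes.len()];
    for &(parent, child) in edges {
        children[parent].push(child);
        indegree[child] += 1;
    }

    // `BinaryHeap` is a max-heap, so keys are wrapped in `Reverse`
    // to pop the smallest `(workspace_rank, input_order)` first.
    let key = |i: usize| Reverse((nodes[i].workspace_rank, nodes[i].input_order, i));
    let mut ready: BinaryHeap<_> = (0..nodes.len())
        .filter(|&i| indegree[i] == 0)
        .map(|i| key(i))
        .collect();

    let mut out = Vec::with_capacity(nodes.len());
    while let Some(Reverse((_, _, i))) = ready.pop() {
        out.push(i);
        for &child in &children[i] {
            indegree[child] -= 1;
            if indegree[child] == 0 {
                ready.push(key(child));
            }
        }
    }

    // Phase 5: nodes that never became ready indicate a cycle.
    if out.len() < nodes.len() {
        return Err("ancestry constraints are cyclic or inconsistent".to_string());
    }
    Ok(out)
}
```

With three nodes whose ranks are 2, 0, and 1 (in input order) and a single edge `0 -> 1`, the sketch emits `[2, 0, 1]`: node 1 has the best rank but must wait for its ancestor, so the unconstrained node 2 goes first.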

## How Ancestry Is Determined

The implementation prefers segment-level relation checks first, then falls back to commit-level merge-base logic only when needed.

### Segment-first classification

For selected commits `left` and `right`, call:

- `editor.workspace.graph.relation_between(left.segment_id, right.segment_id)`

Mapping used:

- `Ancestor` -> `LeftIsAncestorOfRight`
- `Descendant` -> `RightIsAncestorOfLeft`
- `Disjoint` or `Diverged` -> `Unrelated`
- `Identity` -> unresolved at segment level, so use commit-level fallback
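
The mapping above can be written as a single `match`. This is a sketch under assumptions: the `SegmentRelation` enum below is a local stand-in that only mirrors the variant names listed in this document (the real type lives in `but_graph`), and `PairRelation` and `classify_by_segment` are hypothetical names.

```rust
/// Local stand-in for the segment-level relation; variant names are taken
/// from the mapping in this document, not from the `but_graph` source.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SegmentRelation {
    Ancestor,
    Descendant,
    Disjoint,
    Diverged,
    Identity,
}

/// Hypothetical commit-pair relation used to decide which edge to add.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PairRelation {
    LeftIsAncestorOfRight,
    RightIsAncestorOfLeft,
    Unrelated,
}

/// Returns `None` when the segment level cannot decide, which signals
/// that the caller should run the commit-level fallback.
pub fn classify_by_segment(relation: SegmentRelation) -> Option<PairRelation> {
    match relation {
        SegmentRelation::Ancestor => Some(PairRelation::LeftIsAncestorOfRight),
        SegmentRelation::Descendant => Some(PairRelation::RightIsAncestorOfLeft),
        SegmentRelation::Disjoint | SegmentRelation::Diverged => Some(PairRelation::Unrelated),
        SegmentRelation::Identity => None,
    }
}
```

Returning `Option` rather than a fourth "unresolved" variant keeps the fallback decision visible at the call site.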

### Same-segment fallback

When both commits are in the same segment (`Identity`), they can still be in a parent-child relation. In that case:

- Compute the merge-base of the two commit ids.
- If the merge-base is `left`, then `left` is an ancestor of `right`.
- If the merge-base is `right`, then `right` is an ancestor of `left`.
- Otherwise, treat them as unrelated.

This hybrid approach keeps common cases cheap and explicit while preserving correctness inside a single segment.
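
The fallback rule is small enough to show directly. The sketch below assumes the merge-base has already been computed by the repository layer and is passed in. `PairRelation` and `classify_by_merge_base` are hypothetical names, and the generic `Id` stands in for a commit id type.

```rust
/// Hypothetical commit-pair relation, as in the segment-level mapping.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PairRelation {
    LeftIsAncestorOfRight,
    RightIsAncestorOfLeft,
    Unrelated,
}

/// Classifies two distinct commits from their (precomputed) merge-base.
/// `merge_base` is `None` when the commits share no history.
pub fn classify_by_merge_base<Id: PartialEq>(
    left: &Id,
    right: &Id,
    merge_base: Option<&Id>,
) -> PairRelation {
    match merge_base {
        // The merge-base of an ancestor and its descendant is the ancestor.
        Some(base) if base == left => PairRelation::LeftIsAncestorOfRight,
        Some(base) if base == right => PairRelation::RightIsAncestorOfLeft,
        _ => PairRelation::Unrelated,
    }
}
```

Because phase 1 deduplicates by commit id, `left` and `right` are assumed to differ; the sketch does not treat the equal case specially.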

## Why Not Pure Commit Merge-Base For Everything?

Pure commit-level checks for every pair would work, but they are less explicit about workspace/segment intent and would duplicate logic that is now captured in `Graph::relation_between`.

Using segment relations first gives:

- clearer semantics aligned with the graph model,
- faster short-circuiting for many pairs,
- one shared place for relationship semantics.

## Complexity

Let `n` be the number of unique selected commits.

- Pairwise relation discovery: `O(n^2)` comparisons.
- Topological processing:
  - each push/pop on the ready queue: `O(log n)`
  - overall typically `O((n + e) log n)`, where `e` is the number of ancestry edges.

The total cost is dominated by the pairwise relation checks plus the heap operations.

## Determinism Guarantees

Determinism is achieved by:

- deduping by first occurrence,
- using workspace rank for unrelated commits,
- using `input_order` as secondary tiebreaker.

So repeated runs with the same inputs and workspace state produce the same output.
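
The first guarantee, deduplication by first occurrence, might look like the sketch below. It is illustrative only: `dedup_first_occurrence` is a hypothetical helper, and recording `input_order` as the original input index is an assumption, since the document does not say how that value is assigned.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Keeps the first entry for each id and records its original input index.
/// Returns `(input_order, id, payload)` triples in input order.
pub fn dedup_first_occurrence<Id, T>(items: Vec<(Id, T)>) -> Vec<(usize, Id, T)>
where
    Id: Eq + Hash + Clone,
{
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    for (input_order, (id, payload)) in items.into_iter().enumerate() {
        // `insert` returns false when the id was already present,
        // so later duplicates are dropped.
        if seen.insert(id.clone()) {
            out.push((input_order, id, payload));
        }
    }
    out
}
```

A `HashSet` is used only for membership; the output order comes from the input `Vec`, so hash iteration order never affects the result.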

## Notes for Future Changes

If the behavior ever needs to tolerate commits that are not present in the workspace traversal, one possible policy is:

- assign such commits a rank after all ranked commits,
- preserve their relative order by `input_order`.

The current implementation intentionally errors to keep assumptions strict and explicit.
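
For illustration, the tolerant policy described above (which is not what the current implementation does) could be expressed as a sort key. In this sketch `sort_key` is a hypothetical helper and `u32` is a placeholder for the real commit id type.

```rust
use std::collections::HashMap;

/// Builds the `(workspace_rank, input_order)` key for one commit.
/// Commits missing from `ranks` get `usize::MAX`, so they sort after every
/// ranked commit and are ordered among themselves by `input_order`.
pub fn sort_key(ranks: &HashMap<u32, usize>, commit_id: u32, input_order: usize) -> (usize, usize) {
    let rank = ranks.get(&commit_id).copied().unwrap_or(usize::MAX);
    (rank, input_order)
}
```

Tuple comparison in Rust is lexicographic, so the rank decides first and `input_order` only breaks ties.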
