| 1 | +# Commit Parentage Ordering |
| 2 | + |
| 3 | +This document explains how commit selector ordering works in [commit_parentage.rs](commit_parentage.rs), and why the implementation is structured the way it is. |
| 4 | + |
| 5 | +## Goal |
| 6 | + |
| 7 | +Given a set of commit selectors, produce a deterministic order such that: |
| 8 | + |
| 9 | +- Parent commits come before child commits. |
| 10 | +- Unrelated commits still have a stable, deterministic order. |
| 11 | +- Duplicate selectors are deduplicated by commit id (first occurrence wins). |
| 12 | + |
| 13 | +This ordering is useful for operations that must apply commits in dependency-safe order. |
| 14 | + |
| 15 | +## Inputs and Output |
| 16 | + |
| 17 | +Function: `order_commit_selectors_by_parentage(editor, selectors) -> Result<Vec<Selector>>` |
| 18 | + |
| 19 | +- Input selectors can be any type implementing `ToCommitSelector`. |
| 20 | +- The output is a list of normalized `Selector` values. |
| 21 | + |
| 22 | +## Preconditions and Errors |
| 23 | + |
| 24 | +The function returns an error if a selected commit cannot be found in the workspace traversal represented by `editor.workspace`. |
| 25 | + |
| 26 | +Why this is required: |
| 27 | + |
| 28 | +- Deterministic tie-breaking depends on workspace traversal rank. |
| 29 | +- Segment-based ancestry checks also depend on the workspace graph/projection. |
| 30 | + |
| 31 | +## High-Level Pipeline |
| 32 | + |
| 33 | +The algorithm has five phases. |
| 34 | + |
| 35 | +1. Normalize and deduplicate input |
| 36 | +- Resolve each incoming selector to `(Selector, CommitOwned)` with `editor.find_selectable_commit`. |
| 37 | +- Keep only the first occurrence of each commit id. |
| 38 | +- Resolve and store each commit's owning `SegmentIndex`. |
| 39 | + |
| 40 | +2. Compute deterministic fallback rank |
| 41 | +- Build a map: `commit_id -> rank` from workspace parent-to-child traversal order. |
| 42 | +- This rank is used only when ancestry does not constrain order. |
| 43 | + |
| 44 | +3. Build ancestry constraint graph |
| 45 | +- For every selected pair `(left, right)`, determine their relation. |
| 46 | +- If `left` is an ancestor of `right`, add a directed edge `left -> right`. |
| 47 | +- If `right` is an ancestor of `left`, add a directed edge `right -> left`. |
| 48 | +- If they are unrelated, add no edge. |
| 49 | + |
| 50 | +4. Topological sort with stable tie-breaking |
| 51 | +- Use Kahn's algorithm over indegrees. |
| 52 | +- Keep all currently ready nodes in a min-priority structure keyed by: |
| 53 | + - `(workspace_rank, input_order)` |
| 54 | +- Repeatedly pop the best ready node, emit it, and reduce indegree of its children. |
| 55 | + |
| 56 | +5. Validate completeness |
| 57 | +- If the output is shorter than the deduplicated selection, some nodes never became ready, which means the constraints were cyclic or inconsistent. |
| 58 | +- Return an explicit error in that case. |
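Phases 3 to 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `commit_parentage.rs`: the function name `order_by_parentage` is hypothetical, commits are represented by their index in the deduplicated input (so the index doubles as `input_order`), `ranks[i]` stands in for the workspace rank, and the ancestry edges are assumed to have been computed already.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Orders nodes `0..ranks.len()` so that every edge `(parent, child)` places
/// the parent first. Among nodes that are ready at the same time, the one
/// with the smallest `(rank, index)` key is emitted first.
/// Returns `None` if the edges contain a cycle.
/// Hypothetical helper: the name and signature are assumptions for illustration.
fn order_by_parentage(ranks: &[usize], edges: &[(usize, usize)]) -> Option<Vec<usize>> {
    let n = ranks.len();
    let mut children: Vec<Vec<usize>> = vec![Vec::new(); n];
    let mut indegree = vec![0usize; n];
    for &(parent, child) in edges {
        children[parent].push(child);
        indegree[child] += 1;
    }

    // `BinaryHeap` is a max-heap, so keys are wrapped in `Reverse` to pop the minimum.
    let mut ready: BinaryHeap<Reverse<(usize, usize)>> = (0..n)
        .filter(|&i| indegree[i] == 0)
        .map(|i| Reverse((ranks[i], i)))
        .collect();

    let mut out = Vec::with_capacity(n);
    while let Some(Reverse((_, node))) = ready.pop() {
        out.push(node);
        for &child in &children[node] {
            indegree[child] -= 1;
            if indegree[child] == 0 {
                ready.push(Reverse((ranks[child], child)));
            }
        }
    }

    // Completeness check: a short output means some node never reached indegree 0.
    (out.len() == n).then_some(out)
}
```

With `ranks = [2, 0, 1]` and the single edge `(0, 1)`, node 2 is emitted first because it has the lowest rank among the initially ready nodes, then node 0, and node 1 only after its parent, even though node 1 has the lowest rank overall.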
| 59 | + |
| 60 | +## How Ancestry Is Determined |
| 61 | + |
| 62 | +The implementation tries segment-level relation checks first and falls back to commit-level merge-base logic only when needed. |
| 63 | + |
| 64 | +### Segment-first classification |
| 65 | + |
| 66 | +For selected commits `left` and `right`, call: |
| 67 | + |
| 68 | +- `editor.workspace.graph.relation_between(left.segment_id, right.segment_id)` |
| 69 | + |
| 70 | +Mapping used: |
| 71 | + |
| 72 | +- `Ancestor` -> `LeftIsAncestorOfRight` |
| 73 | +- `Descendant` -> `RightIsAncestorOfLeft` |
| 74 | +- `Disjoint` or `Diverged` -> `Unrelated` |
| 75 | +- `Identity` -> unresolved at segment level, so use commit-level fallback |
| 76 | + |
| 77 | +### Same-segment fallback |
| 78 | + |
| 79 | +When both commits are in the same segment (`Identity`), they can still have a parent-child relation. In that case: |
| 80 | + |
| 81 | +- Compute the merge-base of the two commit ids. |
| 82 | +- If the merge-base is `left`, then `left` is an ancestor of `right`. |
| 83 | +- If the merge-base is `right`, then `right` is an ancestor of `left`. |
| 84 | +- Otherwise, treat as unrelated. |
| 85 | + |
| 86 | +This hybrid approach keeps common cases cheap and explicit while preserving correctness inside a single segment. |
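The two-level classification described above might look roughly like the sketch below. The variant names come from the mapping in this document, but the enum names `SegmentRelation` and `PairRelation`, the function `classify`, and the closure standing in for the merge-base lookup are assumptions made for illustration; the real types in the crate may differ.

```rust
/// Result of the segment-level check (variant names taken from the mapping above;
/// the enum name itself is hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SegmentRelation {
    Ancestor,
    Descendant,
    Disjoint,
    Diverged,
    Identity,
}

/// Commit-level relation used to build ancestry edges (enum name is hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PairRelation {
    LeftIsAncestorOfRight,
    RightIsAncestorOfLeft,
    Unrelated,
}

/// Classifies a pair of commits. `merge_base` is a stand-in for the commit-level
/// lookup and is only called when both commits share a segment.
fn classify<Id: PartialEq + Copy>(
    segment_relation: SegmentRelation,
    left: Id,
    right: Id,
    merge_base: impl FnOnce(Id, Id) -> Option<Id>,
) -> PairRelation {
    match segment_relation {
        SegmentRelation::Ancestor => PairRelation::LeftIsAncestorOfRight,
        SegmentRelation::Descendant => PairRelation::RightIsAncestorOfLeft,
        SegmentRelation::Disjoint | SegmentRelation::Diverged => PairRelation::Unrelated,
        // Same segment: the segment graph cannot decide, so ask at commit level.
        SegmentRelation::Identity => match merge_base(left, right) {
            Some(base) if base == left => PairRelation::LeftIsAncestorOfRight,
            Some(base) if base == right => PairRelation::RightIsAncestorOfLeft,
            _ => PairRelation::Unrelated,
        },
    }
}
```

Passing the merge-base lookup as a closure keeps the more expensive commit-level query lazy: it runs only for the `Identity` case.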
| 87 | + |
| 88 | +## Why Not Pure Commit Merge-Base For Everything? |
| 89 | + |
| 90 | +Running commit-level merge-base checks for every pair would work, but it is less explicit about workspace/segment intent and would duplicate logic that now lives in `Graph::relation_between`. |
| 91 | + |
| 92 | +Using segment relations first gives: |
| 93 | + |
| 94 | +- clearer semantics aligned with the graph model, |
| 95 | +- faster short-circuiting for many pairs, |
| 96 | +- one shared place for relationship semantics. |
| 97 | + |
| 98 | +## Complexity |
| 99 | + |
| 100 | +Let `n` be the number of unique selected commits. |
| 101 | + |
| 102 | +- Pairwise relation discovery: `O(n^2)` comparisons. |
| 103 | +- Topological processing: |
| 104 | + - each push/pop on the ready queue: `O(log n)` |
| 105 | + - overall `O((n + e) log n)` as an upper bound, where `e` is the number of ancestry edges (at most `n(n - 1) / 2`). |
| 106 | + |
| 107 | +The total cost is dominated by the `O(n^2)` pairwise relation checks plus the heap operations. |
| 108 | + |
| 109 | +## Determinism Guarantees |
| 110 | + |
| 111 | +Determinism is achieved by: |
| 112 | + |
| 113 | +- deduping by first occurrence, |
| 114 | +- using workspace rank for unrelated commits, |
| 115 | +- using `input_order` as secondary tiebreaker. |
| 116 | + |
| 117 | +So repeated runs with the same inputs and workspace state produce the same output. |
| 118 | + |
| 119 | +## Notes for Future Changes |
| 120 | + |
| 121 | +If the function ever needs to tolerate commits that are not present in the workspace traversal, one possible policy is: |
| 122 | + |
| 123 | +- assign such commits rank after all ranked commits, |
| 124 | +- preserve relative order by `input_order`. |
| 125 | + |
| 126 | +The current implementation intentionally returns an error instead, to keep its assumptions strict and explicit. |
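The tolerant policy above could be sketched like this. It is an illustration only and not current behavior: the function name `order_with_fallback` is hypothetical, commit ids are modeled as plain `u32` values, and ancestry constraints are deliberately left out so that only the `(rank, input_order)` tie-break key is shown.

```rust
use std::collections::HashMap;

/// Sorts commit ids by `(workspace_rank, input_order)`. Commits missing from
/// `workspace_rank` get `usize::MAX`, so they land after all ranked commits
/// and keep their relative input order.
/// Hypothetical helper: ancestry edges are ignored in this sketch.
fn order_with_fallback(selected: &[u32], workspace_rank: &HashMap<u32, usize>) -> Vec<u32> {
    let mut keyed: Vec<(usize, usize, u32)> = selected
        .iter()
        .enumerate()
        .map(|(input_order, id)| {
            let rank = workspace_rank.get(id).copied().unwrap_or(usize::MAX);
            (rank, input_order, *id)
        })
        .collect();
    // `input_order` is unique per entry, so the id never decides the order.
    keyed.sort();
    keyed.into_iter().map(|(_, _, id)| id).collect()
}
```

In a real change, the same key would feed the ready queue of the topological sort rather than a plain sort, so that ancestry edges still take precedence.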