| 1 | +# Deviations from Paper |
| 2 | + |
| 3 | +This document tracks implementation deviations from the research |
| 4 | +paper [Context-Selection for Git Diff](https://nikolay-eremeev.com/blog/context-selection-git-diff/). |
| 5 | + |
| 6 | +## 1. Caller Importance Weighting for Impact Needs |
| 7 | + |
| 8 | +**Paper reference:** Section 4.2.1 (Impact need scoring) |
| 9 | + |
| 10 | +**Problem:** For `impact` needs, the paper's `m(f, n)` assigns a |
| 11 | +flat 0.8 to any fragment that mentions the symbol. This cannot |
| 12 | +distinguish production callers (`handler.ts`) from peripheral |
| 13 | +code (`examples/parsing.ts`): both receive identical scores. |
| 14 | + |
| 15 | +**Extension:** For impact needs only, the match strength is |
| 16 | +scaled by a file importance factor: |
| 17 | + |
| 18 | +```text |
| 19 | +m'(f, n) = m(f, n) * I(f) where n.type == "impact" |
| 20 | +``` |
| 21 | + |
| 22 | +`I(f)` is computed from three path-based layers, with graph topology as the fallback: |
| 23 | + |
| 24 | +| Layer | Signal | Importance | |
| 25 | +|-------|--------|------------| |
| 26 | +| Path patterns | `examples/`, `demo/`, `vendor/`, etc. | 0.15 | |
| 27 | +| Generated code | `generated/`, `__generated__/` paths | 0.10 | |
| 28 | +| Script dirs | `scripts/`, `tools/`, `bin/` | 0.40 | |
| 29 | +| Graph topology | Leaf node (in=0, out>0) | 0.25 | |
| 30 | +| Graph topology | Isolated (in=0, out=0) | 0.50 | |
| 31 | +| Graph topology | Production (in>0) | min(1.0, 0.7 + 0.1*in) | |
| 32 | + |
| 33 | +Path-based layers take priority over graph topology. |
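The layered lookup above can be sketched in TypeScript. This is a minimal illustration, not the project's actual code: the `FileNode` shape, the function names, and the reading of `in`/`out` as incoming and outgoing edge counts are assumptions. The path rules cover only the directories named in the table (the table's "etc." is left unexpanded). Checking the rules in table order when a path matches more than one is also an assumption.

```typescript
// Hypothetical node shape; `inDegree`/`outDegree` are assumed to be the
// `in` and `out` counts from the table.
interface FileNode {
  path: string;
  inDegree: number;
  outDegree: number;
}

// Path-based layers, limited to the directory names listed in the table.
// Rules are checked in table order (an assumption for overlapping matches).
const PATH_RULES: ReadonlyArray<{ dirs: readonly string[]; importance: number }> = [
  { dirs: ["examples", "demo", "vendor"], importance: 0.15 },
  { dirs: ["generated", "__generated__"], importance: 0.10 },
  { dirs: ["scripts", "tools", "bin"], importance: 0.40 },
];

// Returns the importance of the first matching path rule, or undefined.
function pathImportance(path: string): number | undefined {
  // Only directory segments are inspected; the file name itself is dropped.
  const segments = path.split("/").slice(0, -1);
  for (const rule of PATH_RULES) {
    if (segments.some((s) => rule.dirs.indexOf(s) !== -1)) {
      return rule.importance;
    }
  }
  return undefined;
}

// Graph-topology fallback, following the last three table rows.
function topologyImportance(inDegree: number, outDegree: number): number {
  if (inDegree > 0) {
    return Math.min(1.0, 0.7 + 0.1 * inDegree); // production
  }
  return outDegree > 0 ? 0.25 : 0.5; // leaf vs. isolated
}

// Path-based layers take priority; topology is used only when none match.
function fileImportance(node: FileNode): number {
  return pathImportance(node.path) ?? topologyImportance(node.inDegree, node.outDegree);
}
```

Under these assumptions, a file at `examples/parsing.ts` gets 0.15 regardless of its edge counts, while a hypothetical `src/handler.ts` with two incoming edges gets roughly 0.9, and any file with three or more incoming edges is capped at 1.0.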
| 34 | + |
| 35 | +**Submodularity preservation:** Since `I(f) in [0, 1]` is a |
| 36 | +constant per-fragment multiplier, `m'(f, n) <= m(f, n)`. The |
| 37 | +objective built on the augmented score |
| 38 | +`a(f, n) = m'(f, n) + eta * R(f)` remains monotone submodular: |
| 39 | +scaling a nonnegative input to `phi(max(...))` by a constant in |
| 40 | +[0, 1] preserves concavity of the max-of-concave composition. |
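The preservation claim can be spelled out as a short derivation. It is a sketch under an assumption this document does not state explicitly: that each need contributes a term of the form $\phi(\max_{f \in S} a(f, n))$ with $\phi$ nondecreasing, as the `phi(max(...))` notation suggests. The symbols $S$, $T$, $g$, $x$, $y$, $c$ and $\Delta$ are introduced here for illustration only.

```latex
% Assumption: per-need utility U_n(S) = \phi(\max_{f \in S} a(f, n)),
% with \phi nondecreasing and a(f, n) = m'(f, n) + \eta R(f) \ge 0.
Let $S \subseteq T$, $g \notin T$, and for a fixed need $n$ write
\[
  x = \max_{f \in S} a(f, n), \qquad
  y = \max_{f \in T} a(f, n), \qquad
  c = a(g, n), \qquad x \le y .
\]
The marginal gains of adding $g$ are
\begin{align*}
  \Delta(g \mid S) &= \phi(\max(x, c)) - \phi(x), \\
  \Delta(g \mid T) &= \phi(\max(y, c)) - \phi(y).
\end{align*}
Three cases cover all orderings of $c$:
\[
  \begin{cases}
    c \le x:     & \Delta(g \mid S) = 0 = \Delta(g \mid T), \\
    x < c \le y: & \Delta(g \mid S) = \phi(c) - \phi(x) \ge 0 = \Delta(g \mid T), \\
    c > y:       & \Delta(g \mid S) = \phi(c) - \phi(x) \ge \phi(c) - \phi(y) = \Delta(g \mid T).
  \end{cases}
\]
Hence $\Delta(g \mid S) \ge \Delta(g \mid T) \ge 0$ for every need $n$, and the
sum over needs is monotone submodular.
```

Nothing in the case analysis depends on the particular values of $a(f, n)$, only on their being fixed per fragment and need. Replacing $m(f, n)$ with $m(f, n) \cdot I(f)$ changes those values but not the structure of the argument.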
| 41 | + |
| 42 | +**Scope:** Only impact needs are affected. Definition, signature, |
| 43 | +test, invariant, and background needs use unmodified `m(f, n)`. |
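A small TypeScript sketch of the scoping rule, assuming the importance factor `I(f)` has already been computed elsewhere. The function name `scaledMatch` and the lowercase string spellings of the need types are illustrative assumptions, and the range check is an added safeguard rather than something the paper or this document specifies.

```typescript
// Need types as named in this document; the string spellings are assumed.
type NeedType =
  | "definition"
  | "signature"
  | "test"
  | "invariant"
  | "background"
  | "impact";

// Computes m'(f, n): m(f, n) * I(f) for impact needs, m(f, n) otherwise.
function scaledMatch(m: number, needType: NeedType, importance: number): number {
  // I(f) is expected to lie in [0, 1]; reject anything else early.
  if (importance < 0 || importance > 1) {
    throw new RangeError(`importance must be in [0, 1], got ${importance}`);
  }
  return needType === "impact" ? m * importance : m;
}
```

With the paper's flat 0.8 for a symbol mention, a caller in `examples/` (importance 0.15) scores about 0.12 for an impact need, while a production caller with importance 0.9 scores about 0.72. For a definition need, both keep the unmodified 0.8.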