Respect superproject .gitattributes eol rule when fetching git dependencies#1260
Respect superproject .gitattributes eol rule when fetching git dependencies#1260spoorcc wants to merge 5 commits into
Conversation
…encies
When a superproject has a .gitattributes file with a global eol setting
(e.g. `* text=auto eol=lf` or `* text=auto eol=crlf`), dfetch now uses
`git check-attr` to read the effective eol preference for the destination
path and applies it to the temporary git checkout before `git reset --hard`,
so vendored files always have the line endings the superproject enforces.
- Add `eol` field to `CheckoutOptions` in `git_types.py`
- Add `effective_eol()` (queries superproject via git check-attr) and
`_configure_eol()` (writes .git/info/attributes + core.eol) to
`GitLocalRepo` in `git.py`
- Wire up in `GitSubProject._fetch_impl()` in `gitsubproject.py`
- Add BDD feature test with CRLF-forcing and LF-forcing scenarios
- Add `'<path>' has {ending} line endings` step for binary byte checks
- Add step for committing CRLF content into a test git repo
- Document the feature in doc/howto/updating-projects.rst
https://claude.ai/code/session_01WAGxAB7145LZ2CEvGbAKYu
|
Warning Review limit reached
More reviews will be available in 52 minutes and 36 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more credits in the billing tab to continue. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (4)
WalkthroughDfetch queries the effective ChangesGit EOL Handling via gitattributes
Sequence Diagram(s)sequenceDiagram
participant GitSubProject
participant GitLocalRepo
participant CheckoutVersion
participant WorkingTree
GitSubProject->>GitLocalRepo: effective_eol(target_path)
GitLocalRepo-->>GitSubProject: "lf" / "crlf" / None
GitSubProject->>CheckoutVersion: checkout_version(options with eol)
CheckoutVersion->>GitLocalRepo: _configure_eol(eol) (if provided)
CheckoutVersion->>GitLocalRepo: git fetch & reset to FETCH_HEAD
GitLocalRepo->>WorkingTree: rewrite files to LF or CRLF (skip .git, binaries)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Action performedReview finished.
|
|
@coderabbitai review |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@dfetch/project/gitsubproject.py`:
- Line 5: The file imports pathlib twice; resolve the redundancy by keeping one
style and updating usages: either remove the top-level import pathlib and use
the existing from pathlib import Path everywhere (replace any occurrences of
pathlib.Path with Path, e.g., where gitsubproject references pathlib.Path), or
keep import pathlib and remove from pathlib import Path and change any bare Path
uses to pathlib.Path; update the symbol usage around the code that currently
calls pathlib.Path (reference: pathlib.Path and Path in this module) so imports
and usages are consistent.
In `@dfetch/vcs/git.py`:
- Around line 364-369: The _configure_eol method currently writes any string to
.git/info/attributes and sets git core.eol without validation; add input
validation at the start of _configure_eol(eol: str) to only accept "lf" or
"crlf" (case-normalize if desired) and raise a ValueError (with a clear message
referencing the invalid eol) before touching Path(".git/info/attributes") or
calling run_on_cmdline(logger, ["git", "config", "core.eol", eol]) when the
value is not recognized.
In `@features/steps/generic_steps.py`:
- Around line 430-440: The CRLF check in step_impl is too permissive (it only
checks that CRLF appears and no lone CR), so change the CRLF branch to assert
that every newline byte is preceded by a CR: read content as bytes and assert
not re.search(rb'(?<!\r)\n', content) (i.e. no LF that isn't preceded by CR) and
optionally assert b"\r\n" in content to ensure at least one newline exists; keep
the LF branch as-is (assert b"\r" not in content) and raise the same ValueError
for unknown endings.
In `@features/steps/git_steps.py`:
- Around line 162-168: The step_impl test currently strips context.text before
CRLF conversion which can remove the trailing newline and prevent EOL
normalization from being exercised; change step_impl to stop using strip(),
convert all '\n' to '\r\n' from the original context.text, and ensure the
resulting bytes end with a '\r\n' (append one if missing) before calling
pathlib.Path(path).write_bytes(...) and commit_all("Add CRLF file") so the file
always has a trailing CRLF for the EOL conversion test.
In `@features/superproject-line-ending-attrs.feature`:
- Around line 9-26: The scenario "Superproject forces CRLF line endings" is
platform-dependent because the "a git repository" setup step leaves README.md
with implicit system line endings; change the test fixture so the repository
created by the step that yields SomeProject.git includes an explicit README.md
with LF line endings (not relying on the system default) before committing, so
that when the test sets .gitattributes (in MyProject) and runs "dfetch update"
the conversion to CRLF is exercised deterministically; locate the
repository-creation helper used by the "a git repository" step or the test data
for SomeProject.git and add an explicit README.md file with LF-only line endings
to ensure consistent behavior across platforms.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: fcf68fd8-5611-4904-95e2-af874b03fdcb
📒 Files selected for processing (8)
CHANGELOG.rstdfetch/project/gitsubproject.pydfetch/vcs/git.pydfetch/vcs/git_types.pydoc/howto/updating-projects.rstfeatures/steps/generic_steps.pyfeatures/steps/git_steps.pyfeatures/superproject-line-ending-attrs.feature
- Remove duplicate `import pathlib` from gitsubproject.py; use `Path` throughout - Replace pre-checkout git-attributes approach (_configure_eol) with Python-based post-checkout conversion (_apply_eol): walks every text file (NULL-byte binary detection), normalises CRLF→LF then optionally LF→CRLF; this reliably handles CRLF-stored objects where git's own eol=lf smudge filter leaves CRLF unchanged - Add input validation (ValueError) to _apply_eol for unrecognised eol values - Tighten CRLF assertion in generic_steps.py: use (?<!\r)\n lookbehind to ensure every LF is preceded by CR, not just that CRLF appears - Fix CRLF commit step in git_steps.py: use splitlines()+join instead of strip() to preserve trailing CRLF - Add explicit `with LF content` / `with CRLF content` git-repo setup steps so line-ending test fixtures are deterministic across platforms - Restructure both feature scenarios to use the new explicit setup steps, fixing a bug where the CRLF scenario was referencing the wrong git tag https://claude.ai/code/session_01WAGxAB7145LZ2CEvGbAKYu
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
dfetch/project/gitsubproject.py (1)
81-81: 🧹 Nitpick | 🔵 Trivial | 💤 Low valueConsider documenting the dummy filename suffix.
The
/asuffix appended toself.local_pathqueries the effective EOL for a hypothetical file inside the destination directory (since.gitattributesrules are path-based). While functionally correct, this pattern is not immediately obvious. A brief comment would improve clarity.📝 Suggested documentation
+ # Query effective eol for a file within the destination path + # (gitattributes rules are path-specific, so we append a dummy filename) eol = GitLocalRepo(Path.cwd()).effective_eol(f"{self.local_path}/a")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dfetch/project/gitsubproject.py` at line 81, Add a short inline comment explaining why a dummy file suffix is appended when calling GitLocalRepo.effective_eol — clarify that eol = GitLocalRepo(Path.cwd()).effective_eol(f"{self.local_path}/a") uses "/a" to query a hypothetical file inside the destination directory because .gitattributes rules are path-based; place the comment near the call that references self.local_path and the effective_eol method so future readers understand the intent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@dfetch/vcs/git.py`:
- Around line 364-381: The binary-detection heuristic in _apply_eol currently
checks only the first 8192 bytes for a NUL byte; add a brief inline comment
above that check explaining this is an intentional, common heuristic (matching
Git’s behavior) to avoid scanning whole files for performance and that it may
misclassify some binaries but is acceptable for this use case. Reference the
_apply_eol method and the "if b\"\\x00\" in raw[:8192]:" check so the maintainer
knows exactly where to add the explanatory comment.
---
Outside diff comments:
In `@dfetch/project/gitsubproject.py`:
- Line 81: Add a short inline comment explaining why a dummy file suffix is
appended when calling GitLocalRepo.effective_eol — clarify that eol =
GitLocalRepo(Path.cwd()).effective_eol(f"{self.local_path}/a") uses "/a" to
query a hypothetical file inside the destination directory because
.gitattributes rules are path-based; place the comment near the call that
references self.local_path and the effective_eol method so future readers
understand the intent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: ba0cc472-4400-4852-a1bc-65b71088ddb5
📒 Files selected for processing (5)
dfetch/project/gitsubproject.pydfetch/vcs/git.pyfeatures/steps/generic_steps.pyfeatures/steps/git_steps.pyfeatures/superproject-line-ending-attrs.feature
- _apply_eol: note that NUL-in-first-8KiB is an intentional binary detection heuristic that mirrors git's own approach - _fetch_impl: explain that the "/a" suffix on the effective_eol path query is needed so path-based .gitattributes rules are evaluated against the destination directory https://claude.ai/code/session_01WAGxAB7145LZ2CEvGbAKYu
Move _configure_eol (writes .git/info/attributes + sets core.eol/autocrlf) to run BEFORE git fetch so the smudge filter is active when git checks out files, rather than converting after the fact in Python. Scenario 2 now uses LF-content remote (git-native eol=lf doesn't convert CRLF blobs; that smudge direction is handled by eol=crlf in Scenario 1). https://claude.ai/code/session_01WAGxAB7145LZ2CEvGbAKYu
When the superproject is a git repo whose .gitattributes enforces an eol style, dfetch now also applies that preference to SVN subprojects. Since svn export is a one-shot operation with no built-in line-ending config, the conversion uses git after the export: git init in the destination, configure .git/info/attributes + core.eol, run git-add (normalises text files to LF in the index) + git-checkout-index --all --force (re-emits files with the requested eol), then remove .git/. Adds two BDD scenarios: - Git superproject eol=crlf + SVN subproject with LF content → CRLF - Git superproject eol=lf + SVN subproject with CRLF content → LF https://claude.ai/code/session_01WAGxAB7145LZ2CEvGbAKYu
When a superproject has a .gitattributes file with a global eol setting
(e.g.
* text=auto eol=lfor* text=auto eol=crlf), dfetch now usesgit check-attrto read the effective eol preference for the destinationpath and applies it to the temporary git checkout before
git reset --hard,so vendored files always have the line endings the superproject enforces.
eolfield toCheckoutOptionsingit_types.pyeffective_eol()(queries superproject via git check-attr) and_configure_eol()(writes .git/info/attributes + core.eol) toGitLocalRepoingit.pyGitSubProject._fetch_impl()ingitsubproject.py'<path>' has {ending} line endingsstep for binary byte checkshttps://claude.ai/code/session_01WAGxAB7145LZ2CEvGbAKYu
Summary by CodeRabbit
New Features
Documentation
Tests