Skip to content

Respect superproject .gitattributes eol rule when fetching git dependencies#1260

Open
spoorcc wants to merge 5 commits into
mainfrom
claude/superproject-line-ending-attrs-wz5647
Open

Respect superproject .gitattributes eol rule when fetching git dependencies#1260
spoorcc wants to merge 5 commits into
mainfrom
claude/superproject-line-ending-attrs-wz5647

Conversation

@spoorcc

@spoorcc spoorcc commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

When a superproject has a .gitattributes file with a global eol setting
(e.g. * text=auto eol=lf or * text=auto eol=crlf), dfetch now uses
git check-attr to read the effective eol preference for the destination
path and applies it to the temporary git checkout before git reset --hard,
so vendored files always have the line endings the superproject enforces.

  • Add eol field to CheckoutOptions in git_types.py
  • Add effective_eol() (queries superproject via git check-attr) and
    _configure_eol() (writes .git/info/attributes + core.eol) to
    GitLocalRepo in git.py
  • Wire up in GitSubProject._fetch_impl() in gitsubproject.py
  • Add BDD feature test with CRLF-forcing and LF-forcing scenarios
  • Add '<path>' has {ending} line endings step for binary byte checks
  • Add step for committing CRLF content into a test git repo
  • Document the feature in doc/howto/updating-projects.rst

https://claude.ai/code/session_01WAGxAB7145LZ2CEvGbAKYu

Summary by CodeRabbit

  • New Features

    • Git dependencies now respect .gitattributes line-ending rules (eol=lf / eol=crlf), so vendored files match the project's enforced style across platforms.
  • Documentation

    • Added guidance explaining how line endings are determined and applied when fetching dependencies, including resolution behavior.
  • Tests

    • Added acceptance and step tests to verify and enforce correct line-ending handling during dependency fetches.

…encies

When a superproject has a .gitattributes file with a global eol setting
(e.g. `* text=auto eol=lf` or `* text=auto eol=crlf`), dfetch now uses
`git check-attr` to read the effective eol preference for the destination
path and applies it to the temporary git checkout before `git reset --hard`,
so vendored files always have the line endings the superproject enforces.

- Add `eol` field to `CheckoutOptions` in `git_types.py`
- Add `effective_eol()` (queries superproject via git check-attr) and
  `_configure_eol()` (writes .git/info/attributes + core.eol) to
  `GitLocalRepo` in `git.py`
- Wire up in `GitSubProject._fetch_impl()` in `gitsubproject.py`
- Add BDD feature test with CRLF-forcing and LF-forcing scenarios
- Add `'<path>' has {ending} line endings` step for binary byte checks
- Add step for committing CRLF content into a test git repo
- Document the feature in doc/howto/updating-projects.rst

https://claude.ai/code/session_01WAGxAB7145LZ2CEvGbAKYu
@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Warning

Review limit reached

@spoorcc, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 52 minutes and 36 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more credits in the billing tab to continue.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f87725ba-6a1c-42e5-a906-aaed6bbe28f0

📥 Commits

Reviewing files that changed from the base of the PR and between ac2a27f and cb17a51.

📒 Files selected for processing (4)
  • dfetch/project/svnsubproject.py
  • dfetch/vcs/git.py
  • features/steps/svn_steps.py
  • features/superproject-line-ending-attrs.feature

Walkthrough

Dfetch queries the effective .gitattributes EOL for a dependency destination, threads that eol into checkout options, configures the repo checkout to honor it, and normalizes working-tree files; tests, docs, and changelog were added.

Changes

Git EOL Handling via gitattributes

Layer / File(s) Summary
EOL contract and GitLocalRepo implementation
dfetch/vcs/git_types.py, dfetch/vcs/git.py
CheckoutOptions adds optional eol. GitLocalRepo adds effective_eol(path) to run git check-attr eol -- <path> and _configure_eol(eol) to write local .git/info/attributes and config before fetch; checkout_version invokes configuration when options.eol is set.
EOL integration in GitSubProject fetch
dfetch/project/gitsubproject.py
_fetch_impl computes the destination's effective EOL via GitLocalRepo(Path.cwd()).effective_eol(...), passes it into CheckoutOptions(eol=...), and standardizes Path usage for mkdir/rglob operations.
BDD feature scenarios and test steps
features/superproject-line-ending-attrs.feature, features/steps/generic_steps.py, features/steps/git_steps.py
Adds two feature scenarios verifying vendored files honor superproject .gitattributes EOL, a step that asserts file endings (LF/CRLF), and steps that create remote repos/files with explicit LF/CRLF content.
Documentation and changelog
doc/howto/updating-projects.rst, CHANGELOG.rst
Documents the line-ending behavior in the update workflow and adds a changelog entry stating .gitattributes EOL rules are respected during fetch.

Sequence Diagram(s)

sequenceDiagram
  participant GitSubProject
  participant GitLocalRepo
  participant CheckoutVersion
  participant WorkingTree
  GitSubProject->>GitLocalRepo: effective_eol(target_path)
  GitLocalRepo-->>GitSubProject: "lf" / "crlf" / None
  GitSubProject->>CheckoutVersion: checkout_version(options with eol)
  CheckoutVersion->>GitLocalRepo: _configure_eol(eol) (if provided)
  CheckoutVersion->>GitLocalRepo: git fetch & reset to FETCH_HEAD
  GitLocalRepo->>WorkingTree: rewrite files to LF or CRLF (skip .git, binaries)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • dfetch-org/dfetch#1080: Extends the CheckoutOptions/checkout_version wiring previously refactored in #1080 by adding the eol field and EOL application logic.
  • dfetch-org/dfetch#1144: Related changes touching CheckoutOptions dataclass; earlier PR introduced the dataclass shape used here.

Suggested labels

development, documentation

Suggested reviewers

  • ben-edna
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 46.15% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately and specifically summarizes the main change: implementing support for respecting a superproject's .gitattributes eol rule when fetching git dependencies, which is the core objective of all changes across the codebase.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/superproject-line-ending-attrs-wz5647

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@spoorcc

spoorcc commented Jun 12, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@spoorcc

spoorcc commented Jun 12, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dfetch/project/gitsubproject.py`:
- Line 5: The file imports pathlib twice; resolve the redundancy by keeping one
style and updating usages: either remove the top-level import pathlib and use
the existing from pathlib import Path everywhere (replace any occurrences of
pathlib.Path with Path, e.g., where gitsubproject references pathlib.Path), or
keep import pathlib and remove from pathlib import Path and change any bare Path
uses to pathlib.Path; update the symbol usage around the code that currently
calls pathlib.Path (reference: pathlib.Path and Path in this module) so imports
and usages are consistent.

In `@dfetch/vcs/git.py`:
- Around line 364-369: The _configure_eol method currently writes any string to
.git/info/attributes and sets git core.eol without validation; add input
validation at the start of _configure_eol(eol: str) to only accept "lf" or
"crlf" (case-normalize if desired) and raise a ValueError (with a clear message
referencing the invalid eol) before touching Path(".git/info/attributes") or
calling run_on_cmdline(logger, ["git", "config", "core.eol", eol]) when the
value is not recognized.

In `@features/steps/generic_steps.py`:
- Around line 430-440: The CRLF check in step_impl is too permissive (it only
checks that CRLF appears and no lone CR), so change the CRLF branch to assert
that every newline byte is preceded by a CR: read content as bytes and assert
not re.search(rb'(?<!\r)\n', content) (i.e. no LF that isn't preceded by CR) and
optionally assert b"\r\n" in content to ensure at least one newline exists; keep
the LF branch as-is (assert b"\r" not in content) and raise the same ValueError
for unknown endings.

In `@features/steps/git_steps.py`:
- Around line 162-168: The step_impl test currently strips context.text before
CRLF conversion which can remove the trailing newline and prevent EOL
normalization from being exercised; change step_impl to stop using strip(),
convert all '\n' to '\r\n' from the original context.text, and ensure the
resulting bytes end with a '\r\n' (append one if missing) before calling
pathlib.Path(path).write_bytes(...) and commit_all("Add CRLF file") so the file
always has a trailing CRLF for the EOL conversion test.

In `@features/superproject-line-ending-attrs.feature`:
- Around line 9-26: The scenario "Superproject forces CRLF line endings" is
platform-dependent because the "a git repository" setup step leaves README.md
with implicit system line endings; change the test fixture so the repository
created by the step that yields SomeProject.git includes an explicit README.md
with LF line endings (not relying on the system default) before committing, so
that when the test sets .gitattributes (in MyProject) and runs "dfetch update"
the conversion to CRLF is exercised deterministically; locate the
repository-creation helper used by the "a git repository" step or the test data
for SomeProject.git and add an explicit README.md file with LF-only line endings
to ensure consistent behavior across platforms.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: fcf68fd8-5611-4904-95e2-af874b03fdcb

📥 Commits

Reviewing files that changed from the base of the PR and between e2c0992 and a573f1a.

📒 Files selected for processing (8)
  • CHANGELOG.rst
  • dfetch/project/gitsubproject.py
  • dfetch/vcs/git.py
  • dfetch/vcs/git_types.py
  • doc/howto/updating-projects.rst
  • features/steps/generic_steps.py
  • features/steps/git_steps.py
  • features/superproject-line-ending-attrs.feature

Comment thread dfetch/project/gitsubproject.py
Comment thread dfetch/vcs/git.py
Comment thread features/steps/generic_steps.py
Comment thread features/steps/git_steps.py
Comment thread features/superproject-line-ending-attrs.feature
- Remove duplicate `import pathlib` from gitsubproject.py; use `Path` throughout
- Replace pre-checkout git-attributes approach (_configure_eol) with
  Python-based post-checkout conversion (_apply_eol): walks every text
  file (NULL-byte binary detection), normalises CRLF→LF then optionally
  LF→CRLF; this reliably handles CRLF-stored objects where git's own
  eol=lf smudge filter leaves CRLF unchanged
- Add input validation (ValueError) to _apply_eol for unrecognised eol values
- Tighten CRLF assertion in generic_steps.py: use (?<!\r)\n lookbehind to
  ensure every LF is preceded by CR, not just that CRLF appears
- Fix CRLF commit step in git_steps.py: use splitlines()+join instead of
  strip() to preserve trailing CRLF
- Add explicit `with LF content` / `with CRLF content` git-repo setup steps
  so line-ending test fixtures are deterministic across platforms
- Restructure both feature scenarios to use the new explicit setup steps,
  fixing a bug where the CRLF scenario was referencing the wrong git tag

https://claude.ai/code/session_01WAGxAB7145LZ2CEvGbAKYu

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
dfetch/project/gitsubproject.py (1)

81-81: 🧹 Nitpick | 🔵 Trivial | 💤 Low value

Consider documenting the dummy filename suffix.

The /a suffix appended to self.local_path queries the effective EOL for a hypothetical file inside the destination directory (since .gitattributes rules are path-based). While functionally correct, this pattern is not immediately obvious. A brief comment would improve clarity.

📝 Suggested documentation
+        # Query effective eol for a file within the destination path
+        # (gitattributes rules are path-specific, so we append a dummy filename)
         eol = GitLocalRepo(Path.cwd()).effective_eol(f"{self.local_path}/a")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dfetch/project/gitsubproject.py` at line 81, Add a short inline comment
explaining why a dummy file suffix is appended when calling
GitLocalRepo.effective_eol — clarify that eol =
GitLocalRepo(Path.cwd()).effective_eol(f"{self.local_path}/a") uses "/a" to
query a hypothetical file inside the destination directory because
.gitattributes rules are path-based; place the comment near the call that
references self.local_path and the effective_eol method so future readers
understand the intent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dfetch/vcs/git.py`:
- Around line 364-381: The binary-detection heuristic in _apply_eol currently
checks only the first 8192 bytes for a NUL byte; add a brief inline comment
above that check explaining this is an intentional, common heuristic (matching
Git’s behavior) to avoid scanning whole files for performance and that it may
misclassify some binaries but is acceptable for this use case. Reference the
_apply_eol method and the "if b\"\\x00\" in raw[:8192]:" check so the maintainer
knows exactly where to add the explanatory comment.

---

Outside diff comments:
In `@dfetch/project/gitsubproject.py`:
- Line 81: Add a short inline comment explaining why a dummy file suffix is
appended when calling GitLocalRepo.effective_eol — clarify that eol =
GitLocalRepo(Path.cwd()).effective_eol(f"{self.local_path}/a") uses "/a" to
query a hypothetical file inside the destination directory because
.gitattributes rules are path-based; place the comment near the call that
references self.local_path and the effective_eol method so future readers
understand the intent.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: ba0cc472-4400-4852-a1bc-65b71088ddb5

📥 Commits

Reviewing files that changed from the base of the PR and between a573f1a and 95619c7.

📒 Files selected for processing (5)
  • dfetch/project/gitsubproject.py
  • dfetch/vcs/git.py
  • features/steps/generic_steps.py
  • features/steps/git_steps.py
  • features/superproject-line-ending-attrs.feature

Comment thread dfetch/vcs/git.py Outdated
claude added 2 commits June 12, 2026 08:49
- _apply_eol: note that NUL-in-first-8KiB is an intentional binary
  detection heuristic that mirrors git's own approach
- _fetch_impl: explain that the "/a" suffix on the effective_eol path
  query is needed so path-based .gitattributes rules are evaluated
  against the destination directory

https://claude.ai/code/session_01WAGxAB7145LZ2CEvGbAKYu
Move _configure_eol (writes .git/info/attributes + sets core.eol/autocrlf)
to run BEFORE git fetch so the smudge filter is active when git checks out
files, rather than converting after the fact in Python.

Scenario 2 now uses LF-content remote (git-native eol=lf doesn't convert
CRLF blobs; that smudge direction is handled by eol=crlf in Scenario 1).

https://claude.ai/code/session_01WAGxAB7145LZ2CEvGbAKYu
When the superproject is a git repo whose .gitattributes enforces an eol
style, dfetch now also applies that preference to SVN subprojects.

Since svn export is a one-shot operation with no built-in line-ending
config, the conversion uses git after the export: git init in the
destination, configure .git/info/attributes + core.eol, run git-add
(normalises text files to LF in the index) + git-checkout-index --all
--force (re-emits files with the requested eol), then remove .git/.

Adds two BDD scenarios:
- Git superproject eol=crlf + SVN subproject with LF content → CRLF
- Git superproject eol=lf  + SVN subproject with CRLF content → LF

https://claude.ai/code/session_01WAGxAB7145LZ2CEvGbAKYu
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants