Commit ddae79f

Add support for git archive generated git info files
The user is expected to write a template file and set git attributes so that git fills in the template with repository information when building the archive. This mechanism is used by GitHub's `Download as ZIP` function. The implementation largely follows setuptools-scm's, with some minor caveats:

* This tool supports branch in the template. It defaults to HEAD when there is no branch, just like the case where we have branch info.
* The precedence of version resolution is a bit different. Here we check PKG-INFO, then the archive file, then git info, while setuptools-scm puts git info first, assuming any git info must be the current repo's git info. This does mean that dirty clones that contain the archival file will accidentally use it, but that should be fairly easy to diagnose.
1 parent 929e0d6 commit ddae79f

8 files changed

Lines changed: 771 additions & 1 deletion

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+Add support for ``git archive`` builds via a tracked ``.git_archival.txt``
+file. When the file is present and its ``$Format:...$`` placeholders have
+been substituted, the version is derived from its contents instead of
+running ``git``. See ``git_archive`` documentation for the required
+``.git_archival.txt`` and ``.gitattributes`` setup.

docs/comparison.rst

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ VCS support
 +---------------------------+-----+-----------+-------------------------------+-----------------------------+
 | Package                   | Git | Mercurial | Can be used in git submodules | Support for ``git-archive`` |
 +===========================+=====+===========+===============================+=============================+
-| setuptools-git-versioning | yes | no        | yes                           | no                          |
+| setuptools-git-versioning | yes | no        | yes                           | yes                         |
 +---------------------------+-----+-----------+-------------------------------+-----------------------------+
 | setuptools-scm            | yes | yes       | yes                           | yes                         |
 +---------------------------+-----+-----------+-------------------------------+-----------------------------+

docs/git_archive.rst

Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
+.. _git-archive:
+
+Supporting ``git archive`` builds
+---------------------------------
+
+By default ``setuptools-git-versioning`` reads version information by running
+``git`` against the project's ``.git`` directory. When the project is built
+from a ``git archive`` tarball (for example, GitHub's "Download ZIP", or a
+manual ``git archive HEAD -o release.tar``), no ``.git`` directory exists and
+``git`` cannot be invoked.
+
+To make ``git archive`` builds work, add a ``.git_archival.txt`` file to your
+repository whose contents will be rewritten by git at archive time. The
+project will read the rewritten file when building from the archive.
+
+Setup
+~~~~~
+
+1. Create ``.git_archival.txt`` in the repository root:
+
+   .. code-block:: text
+       :caption: .git_archival.txt
+
+       node: $Format:%H$
+       describe-name: $Format:%(describe:tags=true,match=*[0-9]*)$
+
+2. Tell git to substitute the ``$Format:...$`` placeholders by adding the
+   following line to ``.gitattributes`` in the repository root (creating the
+   file if it does not exist):
+
+   .. code-block:: text
+       :caption: .gitattributes
+
+       .git_archival.txt export-subst
+
+3. Commit both files:
+
+   .. code-block:: bash
+
+       git add .git_archival.txt .gitattributes
+       git commit -m "add git archive support"
+
+When ``git archive`` runs, the placeholders are expanded into the actual
+commit SHA and ``git describe`` output for the archived commit. When the
+package is later built from the extracted archive,
+``setuptools-git-versioning`` reads the file and resolves the version using
+the configured ``template`` / ``dev_template`` / ``dirty_template``.
+
+The same file format is used by ``setuptools-scm``, so a single
+``.git_archival.txt`` works with both tools.
+
+Optional: include branch information
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If your templates reference ``{branch}``, also add a ``ref-names`` line:
+
+.. code-block:: text
+    :caption: .git_archival.txt (with branch info)
+
+    node: $Format:%H$
+    describe-name: $Format:%(describe:tags=true,match=*[0-9]*)$
+    ref-names: $Format:%D$
+
+.. warning::
+
+    Including ``ref-names`` causes the archive's contents to change every
+    time a new ref points at the archived commit (for example, when a new
+    branch is created). This breaks archive checksum stability across
+    re-archivals of the same commit. Only opt in if you actually need
+    ``{branch}`` substitution.
+
+If ``ref-names`` is not present (or is present but indicates a detached
+``HEAD``) and a template references ``{branch}``, the literal string
+``HEAD`` is substituted - matching the output of
+``git rev-parse --abbrev-ref HEAD`` in detached-HEAD state.
+
+Priority and interaction with other schemas
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The version source is selected in this order:
+
+1. ``PKG-INFO`` (sdist install) - wins whenever present.
+2. ``.git_archival.txt`` - used when the file exists and its placeholders
+   have been substituted.
+3. The normal flow: ``version_callback``, ``version_file``, live ``git``
+   commands, ``starting_version``.
+
+This means ``.git_archival.txt`` only takes effect when there is no
+``PKG-INFO`` (so a normal sdist install still wins) and is opportunistic in
+working checkouts: a stray un-substituted file logs a warning and is
+ignored, falling through to the live ``git`` flow.
+
+Limitations
+~~~~~~~~~~~
+
+- ``tag_filter``, ``tag_formatter``, and ``sort_by`` have no effect on
+  archive builds. The tag is whatever ``git describe`` chose at archive
+  time.
+- ``count_commits_from_version_file`` and ``version_file`` are not consulted
+  in the archive flow.
+- Older git versions (<2.32) do not understand the ``%(describe...)``
+  placeholder. In that case the file is left with the literal text
+  ``%(describe...)`` and ``setuptools-git-versioning`` will warn and fall
+  back to the ``ref-names`` field for the tag (which only succeeds when
+  ``HEAD`` is exactly on a tag).
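
Reviewer illustration for the documentation above: the sketch below parses a hand-written sample of what `.git_archival.txt` might contain after `git archive` has expanded the placeholders, and contrasts it with the unexpanded file found in a working checkout. The sample SHA, tag, and branch are invented, and the helper names `read_fields` and `looks_substituted` are hypothetical; only the use of `email.parser.HeaderParser`, the lowercased keys, and the `$Format` marker check are taken from the diff.

```python
from __future__ import annotations

from email.parser import HeaderParser

# Hypothetical sample of a substituted file; all values are invented.
SUBSTITUTED = (
    "node: 0123456789abcdef0123456789abcdef01234567\n"
    "describe-name: v1.2.3-4-g0123456\n"
    "ref-names: HEAD -> main, tag: nightly\n"
)

# What the tracked file looks like in a working checkout (never expanded).
UNSUBSTITUTED = (
    "node: $Format:%H$\n"
    "describe-name: $Format:%(describe:tags=true,match=*[0-9]*)$\n"
)


def read_fields(text: str) -> dict[str, str]:
    """Parse 'key: value' lines into a dict with lowercased keys."""
    message = HeaderParser().parsestr(text)
    return {key.lower(): value for key, value in message.items()}


def looks_substituted(fields: dict[str, str]) -> bool:
    """Return True when no value still carries a literal '$Format' marker."""
    return not any("$Format" in value for value in fields.values())


fields = read_fields(SUBSTITUTED)
print(fields["describe-name"])
print(looks_substituted(fields))
print(looks_substituted(read_fields(UNSUBSTITUTED)))
```

Because the parser treats the first blank line as the end of the headers, every field in the file needs to sit in one contiguous block, which is the situation the warning in `parse_archival_file` is meant to catch.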

docs/index.rst

Lines changed: 1 addition & 0 deletions
@@ -18,6 +18,7 @@
     command
     ci
     runtime_version
+    git_archive
     schemas/index
     options/index
     substitutions/index

setuptools_git_versioning/__init__.py

Lines changed: 3 additions & 0 deletions
@@ -2,6 +2,7 @@
 
 from typing import TYPE_CHECKING, Any
 
+from setuptools_git_versioning.archival import parse_archival_file, version_from_archival
 from setuptools_git_versioning.git import (
     count_since,
     get_all_tags,
@@ -37,5 +38,7 @@ def parse_config(dist: Distribution, attr: Any, value: Any) -> None:
     "get_version",
     "infer_version",
     "is_dirty",
+    "parse_archival_file",
+    "version_from_archival",
     "version_from_git",
 ]
Lines changed: 209 additions & 0 deletions
@@ -0,0 +1,209 @@
+from __future__ import annotations
+
+import logging
+import os  # noqa: TC003
+import re
+from dataclasses import dataclass
+from email.parser import HeaderParser
+from pathlib import Path
+from typing import TYPE_CHECKING
+
+from setuptools_git_versioning.defaults import (
+    DEFAULT_DEV_TEMPLATE,
+    DEFAULT_DIRTY_TEMPLATE,
+    DEFAULT_TEMPLATE,
+)
+from setuptools_git_versioning.log import DEBUG, INFO
+from setuptools_git_versioning.subst import resolve_substitutions
+
+if TYPE_CHECKING:
+    from packaging.version import Version
+
+ARCHIVAL_FILENAME = ".git_archival.txt"
+DESCRIBE_UNSUPPORTED = "%(describe"
+FORMAT_UNSUBSTITUTED = "$Format"
+DESCRIBE_PARTS = 3  # tag-N-gSHA
+
+REF_TAG_RE = re.compile(r"(?<=\btag: )([^,]+)\b")
+REF_HEAD_RE = re.compile(r"HEAD\s*->\s*([^,]+)")
+FULL_SHA_RE = re.compile(r"^([0-9a-f]{40}|[0-9a-f]{64})$")  # SHA-1 or SHA-256
+
+log = logging.getLogger(__name__)
+
+
+@dataclass
+class ArchivalData:
+    tag: str
+    ccount: int
+    sha: str
+    full_sha: str
+    dirty: bool
+    branch: str | None
+
+
+def parse_archival_file(path: str | os.PathLike) -> dict[str, str]:
+    """Read a .git_archival.txt file and return its key/value pairs.
+
+    Keys are normalized to lowercase so lookups behave consistently
+    regardless of whether the file uses `node:` or `Node:` etc.
+    """
+    content = Path(path).read_text(encoding="utf-8")
+    log.log(DEBUG, "'%s' content:\n%s", ARCHIVAL_FILENAME, content)
+    message = HeaderParser().parsestr(content)
+
+    # HeaderParser treats the first blank line as the end of headers.
+    # Anything after it ends up in the message body and is silently
+    # dropped from .items(). Warn the user instead of losing fields.
+    payload = message.get_payload()
+    if isinstance(payload, str) and payload.strip():
+        log.warning(
+            "'%s' contains content after a blank line; those fields will be ignored",
+            ARCHIVAL_FILENAME,
+        )
+
+    return {key.lower(): value for key, value in message.items()}
+
+
+def _parse_describe(describe: str) -> tuple[str, int, str | None, bool]:
+    """Parse a `git describe`-style string into (tag, ccount, short_sha, dirty)."""
+    dirty = False
+    if describe.endswith("-dirty"):
+        dirty = True
+        describe = describe[: -len("-dirty")]
+
+    parts = describe.rsplit("-", 2)
+    if len(parts) < DESCRIBE_PARTS:
+        return describe, 0, None, dirty
+
+    tag, ccount_str, gnode = parts
+    try:
+        ccount = int(ccount_str)
+    except ValueError:
+        return describe, 0, None, dirty
+
+    short_sha = gnode[1:] if gnode.startswith("g") else gnode
+    return tag, ccount, short_sha, dirty
+
+
+def _branch_from_ref_names(ref_names: str) -> str | None:
+    match = REF_HEAD_RE.search(ref_names)
+    if match:
+        return match.group(1).strip()
+    return None
+
+
+def archival_to_version_data(data: dict[str, str]) -> ArchivalData | None:
+    """Convert parsed archival data into structured version info, or None.
+
+    Returns None when the file looks unsubstituted or otherwise unusable so
+    the caller can fall through to live git.
+    """
+    if any(FORMAT_UNSUBSTITUTED in value for value in data.values()):
+        log.warning(
+            "'%s' contains unprocessed '$Format:...$' placeholders, skipping",
+            ARCHIVAL_FILENAME,
+        )
+        return None
+
+    node = data.get("node", "").strip()
+    full_sha = node if FULL_SHA_RE.match(node) else ""
+    ref_names = data.get("ref-names", "")
+    branch = _branch_from_ref_names(ref_names)
+    describe = data.get("describe-name", "").strip()
+
+    describe_tag: str | None = None
+    ccount = 0
+    short_sha = ""
+    dirty = False
+
+    if describe and DESCRIBE_UNSUPPORTED not in describe:
+        describe_tag, ccount, parsed_sha, dirty = _parse_describe(describe)
+        if parsed_sha:
+            short_sha = parsed_sha
+    elif describe:
+        log.warning(
+            "git archive did not expand %(describe...) (git <2.32), falling back to ref-names",
+        )
+
+    if describe_tag is not None:
+        tag = describe_tag
+    else:
+        tags = REF_TAG_RE.findall(ref_names)
+        if not tags:
+            log.log(
+                INFO,
+                "'%s' has no usable describe-name or tag in ref-names",
+                ARCHIVAL_FILENAME,
+            )
+            return None
+        tag = tags[0].strip()
+
+    # Prefer the full SHA when available so {sha} matches the live-git
+    # path's `full_sha[:8]` rendering. Fall back to the short SHA from
+    # describe-name only when no valid `node` field is present.
+    if full_sha:
+        short_sha = full_sha[:8]
+    elif short_sha:
+        full_sha = short_sha
+
+    return ArchivalData(
+        tag=tag,
+        ccount=ccount,
+        sha=short_sha[:8],
+        full_sha=full_sha,
+        dirty=dirty,
+        branch=branch,
+    )
+
+
+def version_from_archival(
+    project_root: str | os.PathLike,
+    *,
+    template: str = DEFAULT_TEMPLATE,
+    dev_template: str = DEFAULT_DEV_TEMPLATE,
+    dirty_template: str = DEFAULT_DIRTY_TEMPLATE,
+) -> Version | None:
+    """Return a Version derived from .git_archival.txt, or None if unavailable."""
+    archival_path = Path(project_root).joinpath(ARCHIVAL_FILENAME)
+    if not archival_path.exists():
+        log.log(DEBUG, "No '%s' present at '%s'", ARCHIVAL_FILENAME, project_root)
+        return None
+
+    log.log(INFO, "File '%s' is found, reading its content", archival_path)
+    data = parse_archival_file(archival_path)
+    info = archival_to_version_data(data)
+    if info is None:
+        return None
+
+    log.log(DEBUG, "Parsed archival data: %r", info)
+
+    if info.dirty:
+        log.log(INFO, "Using template from 'dirty_template' option")
+        chosen = dirty_template
+    elif info.ccount > 0:
+        log.log(INFO, "Using template from 'dev_template' option")
+        chosen = dev_template
+    else:
+        log.log(INFO, "Using template from 'template' option")
+        chosen = template
+
+    # When ref-names is absent or doesn't reveal a current branch, default
+    # to the literal "HEAD" so `{branch}` substitution mirrors what
+    # `git rev-parse --abbrev-ref HEAD` produces in detached-HEAD state.
+    branch = info.branch if info.branch is not None else "HEAD"
+
+    rendered = resolve_substitutions(
+        chosen,
+        sha=info.sha,
+        tag=info.tag,
+        ccount=info.ccount,
+        branch=branch,
+        full_sha=info.full_sha,
+    )
+    log.log(INFO, "Version number after resolving substitutions: %r", rendered)
+
+    # Deferred to avoid a top-level circular import:
+    # `version.py` imports `version_from_archival` from this module.
+    from setuptools_git_versioning.version import sanitize_version
+
+    return sanitize_version(rendered)
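
Reviewer illustration for the parsing rules in this file: the standalone sketch below re-implements the right-split on dashes and reuses the two `ref-names` regular expressions on invented inputs. `split_describe` is a hypothetical name written for this note, not the function from the diff (it omits the `-dirty` handling), and every sample string is made up.

```python
from __future__ import annotations

import re

# Same patterns as in the diff, copied here so the sketch is self-contained.
REF_TAG_RE = re.compile(r"(?<=\btag: )([^,]+)\b")
REF_HEAD_RE = re.compile(r"HEAD\s*->\s*([^,]+)")


def split_describe(describe: str) -> tuple[str, int, str | None]:
    """Split 'tag-N-gSHA' from the right; treat anything else as a bare tag."""
    parts = describe.rsplit("-", 2)
    if len(parts) < 3:
        return describe, 0, None
    tag, count, node = parts
    try:
        distance = int(count)
    except ValueError:
        return describe, 0, None
    return tag, distance, node[1:] if node.startswith("g") else node


print(split_describe("v1.2.3-4-g0123456"))       # tag, distance, short SHA
print(split_describe("v1.2.3"))                  # exactly on a tag
print(split_describe("release-1.0-2-gabcdef0"))  # dashes inside the tag survive
print(split_describe("release-2024-01"))         # ambiguous bare tag, see note

ref_names = "HEAD -> main, tag: v1.2.3, origin/main"
print(REF_HEAD_RE.search(ref_names).group(1))    # branch HEAD points at
print(REF_TAG_RE.findall(ref_names))             # tags on the commit
print(REF_HEAD_RE.search("HEAD, tag: v1.2.3"))   # detached HEAD: no branch
```

The fourth call shows a limitation that comes with splitting from the right: a bare tag such as `release-2024-01` is read as tag `release` with a distance of 2024. The function in the diff appears to follow the same steps, so projects with tag names of that shape may want to check the resolved version of an archive built from an exactly tagged commit.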

setuptools_git_versioning/version.py

Lines changed: 11 additions & 0 deletions
@@ -11,6 +11,7 @@
     # where 'packaging' is not installed yet
     from packaging.version import Version
 
+from setuptools_git_versioning.archival import version_from_archival
 from setuptools_git_versioning.defaults import (
     DEFAULT_DEV_TEMPLATE,
     DEFAULT_DIRTY_TEMPLATE,
@@ -114,6 +115,16 @@ def version_from_git( # noqa: PLR0915, PLR0912, PLR0913, C901
         # running on sdist package, do not sanitize
         return Version(version_str)
 
+    archival_version = version_from_archival(
+        project_root,
+        template=template,
+        dev_template=dev_template,
+        dirty_template=dirty_template,
+    )
+    if archival_version is not None:
+        log.log(INFO, "Resolved version from '.git_archival.txt': %s", archival_version)
+        return archival_version
+
     if version_callback is not None:
         if version_file is not None:
             msg = "Either 'version_file' or 'version_callback' can be passed, but not both at the same time"
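
Reviewer illustration for the ordering this hunk establishes (`PKG-INFO` first, then `.git_archival.txt`, then the existing flow): it can be pictured as a chain in which the first source that yields a value wins. The sketch below is a generic illustration only; `resolve_first` and the three stand-in callables are hypothetical and are not part of `setuptools-git-versioning`, and the version strings are invented.

```python
from __future__ import annotations

from typing import Callable, Optional

# A source is any zero-argument callable that returns a version or None.
Source = Callable[[], Optional[str]]


def resolve_first(sources: list[tuple[str, Source]]) -> tuple[str, str] | None:
    """Return (name, version) from the first source that yields a value."""
    for name, source in sources:
        version = source()
        if version is not None:
            return name, version
    return None


# Stand-ins: no PKG-INFO, a substituted archival file, and live git.
chain = [
    ("PKG-INFO", lambda: None),
    (".git_archival.txt", lambda: "1.2.3.post4"),
    ("git", lambda: "1.2.4"),
]
resolved = resolve_first(chain)
print(resolved)
```

This is the scenario the commit message warns about: when a checkout contains an already substituted archival file, that file answers before live git is consulted, so the later source is never reached.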
