fix(install): skip cache for unpinned virtual deps when on-disk hash matches lockfile (closes #1548)#1553
Conversation
…matches lockfile When a git/virtual-file dependency was recorded in the lockfile without a resolved_commit (e.g. ADO partial-clone fallback could not pin a SHA, or a virtual-file dep was carved out without a commit anchor), the second install bypassed the content-hash cache-skip and re-downloaded the package every time. Re-clones of floating refs can yield a slightly different fresh hash, which then trips the supply-chain mismatch check in sources.py with a false-positive 'supply-chain attack' alert. Add cache-skip parity with the existing resolved_commit and registry branches: when the lockfile recorded a content_hash for an unpinned dep and the on-disk content still hashes to that value, skip re-download. Supply-chain detection is preserved -- the new branch only fires when on-disk content is verified unchanged; any real divergence falls through to the fresh-download flow where the existing mismatch check still runs. closes #1548 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Fixes a false-positive supply-chain Content hash mismatch failure on the second apm install run for unpinned git/virtual-file dependencies by allowing cache reuse when the on-disk content still matches the lockfile-recorded content_hash.
Changes:
- Extend
_resolve_download_strategycache-skip logic to handle lockfile entries that have acontent_hashbut noresolved_commit(and no ref drift / update mode). - Add unit tests covering the new skip/no-skip behavior for unpinned git dependencies based on
verify_package_hash. - Add a changelog entry describing the fix.
Show a summary per file
| File | Description |
|---|---|
src/apm_cli/install/phases/integrate.py |
Adds a cache-skip branch for unpinned deps when on-disk content matches the lockfile content_hash. |
tests/unit/install/test_integrate_phase3w5.py |
Adds regression tests for the new unpinned-dependency cache-skip behavior (positive + negative + guard). |
CHANGELOG.md |
Documents the fix under Unreleased. |
Copilot's findings
- Files reviewed: 3/3 changed files
- Comments generated: 1
|
|
||
| ### Fixed | ||
|
|
||
| - `apm install` no longer fails the second run with a false-positive `Content hash mismatch ... supply-chain attack` error for unpinned git/virtual-file dependencies whose lockfile entry could not record a `resolved_commit` (e.g. ADO partial-clone fallback paths). When the on-disk content still hashes to the lockfile-recorded value, the redundant re-download is now skipped, mirroring the existing cache-skip parity with pinned and registry deps. Real content divergence still falls through to the supply-chain verification path. (closes #1548) -- by @OskarKlintrot |
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 0 | 0 | The state handoff through InstallContext.content_hash_verified_deps keeps the optimization explicit and localized. |
| CLI Logging Expert | 0 | 0 | 0 | No user-facing output changes; existing mismatch messaging remains intact. |
| DevX UX Expert | 0 | 0 | 0 | Reinstall behavior now matches package-manager expectations: unchanged locked content is reused quietly. |
| Supply Chain Security Expert | 0 | 0 | 0 | The content hash remains the only trust signal for no-commit lockfile entries, and mismatches still fail closed. |
| OSS Growth Hacker | 0 | 0 | 0 | The changelog now leads with the user-visible false-positive error. |
| Doc Writer | 0 | 0 | 0 | Changelog coverage is sufficient; no Starlight page drift found for this internal install-path fix. |
| Test Coverage Expert | 0 | 0 | 0 | Added integration roundtrip coverage plus mutation-break evidence for the no-resolved_commit content-hash replay path. |
B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.
Recommendation
Ship when the maintainer is ready. All foldable panel follow-ups were addressed in the pushed commit, Copilot has no inline comments, and all PR checks are successful.
Full per-persona findings
Python Architect
No findings.
CLI Logging Expert
No findings.
DevX UX Expert
No findings.
Supply Chain Security Expert
No findings.
OSS Growth Hacker
No findings.
Auth Expert -- inactive
No authentication or credential-resolution behavior changed.
Doc Writer
No findings.
Test Coverage Expert
No findings.
This panel is advisory. It does not block merge. Re-apply the panel-review label after addressing feedback to re-run.
Fold shepherd follow-ups for PR #1553 by adding an integration regression test for content-hash-only lockfile replay, teaching pre-download to honor verified content_hash entries with no resolved_commit, and avoiding a second tree hash after a verified skip-redownload decision. Addresses panel follow-ups for issue #1548. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Closes #1548.
apm installwas failing on the second run with a false-positiveContent hash mismatch ... supply-chain attackerror for unpinned git/virtual-file dependencies. This PR adds the missing cache-skip branch so the redundant re-download (and the false alert it triggered) goes away, without weakening supply-chain detection.Problem
When a git or virtual-file dep was recorded in
apm.lock.yamlwithout aresolved_commit-- typically because the underlying transport fell back from partial-clone to full bare-clone (observed on Azure DevOps repos whose server does not support partial-clone filter v2) and APM had no SHA to pin against the floating ref -- the cache-skip logic in_resolve_download_strategy(src/apm_cli/install/phases/integrate.py:113-160) had no branch covering the case. The two existing branches required either:locked_dep.resolved_committo be set and not"cached", orlocked_dep.source == "registry".Unpinned git deps fell through both,
lockfile_matchstayedFalse, and the package was re-downloaded on every install. A re-clone of a floating ref can produce a fresh hash that does not byte-match the lockfile-recordedcontent_hash, so the supply-chain check atsrc/apm_cli/install/sources.py:766-787thensys.exit(1)with "supply-chain attack" wording -- the false positive reported in #1548.Fix
Add cache-skip parity for the unpinned class. In
_resolve_download_strategy, after the existing registry branch, add aneliffor(locked_dep, locked_dep.content_hash, not ref_changed, not update_refs). When_should_skip_redownloadconfirms the on-disk content still hashes to the lockfile-recorded value, setlockfile_match = True(with the_via_content_hash_onlyflag for the existing self-heal pipeline). The package is intact, so re-downloading it is wasted work and the source of the false positive.This matches Option 1 from the triage brief. Option 2 (record a
resolved_commiteven on partial-clone-fallback paths) is more architecturally correct but ripples through the lockfile schema and self-heal chain; this PR keeps the change isolated and reversible.Supply-chain boundary
The supply-chain check is preserved. The new branch only fires when
verify_package_hash(install_path, locked_dep.content_hash)returnsTrue-- i.e. on-disk bytes already hash to the lockfile-recorded value. We never suppress a fresh download that would otherwise have produced a different hash than the lockfile; we only suppress a redundant download whose result we have already verified against the lockfile via the on-disk content. Real divergence (file actually tampered) still falls through to the fresh-download path wheresources.pyruns the existing mismatch check.Tests
tests/unit/install/test_integrate_phase3w5.py-- three new cases onTestResolveDownloadStrategy:test_unpinned_git_dep_skips_when_on_disk_hash_matches-- positive: withresolved_commit=None,content_hash="deadbeef",source="git"andverify_package_hash -> True, assertsskip_download is True. Fails at HEAD without the fix; passes after. Verified the mutation-break gate bygit stashof the integrate.py change: this test fails, confirming the fix is the cause of the green.test_unpinned_git_dep_redownloads_when_on_disk_hash_differs-- negative gate: same setup,verify_package_hash -> False, assertsskip_download is False. Proves a real on-disk mismatch is NOT masked by the new branch.test_unpinned_git_dep_no_skip_when_lockfile_lacks_content_hash-- guard: when neitherresolved_commitnorcontent_hashare present there is no integrity anchor, so the branch must not fire.Full suite:
1638 passedacrosstests/unit/install/,tests/unit/test_install_command.py,tests/unit/test_drift_registry.py,tests/unit/test_drift_detection.py.How to test locally
apm.ymlwith a git dep on a repo whose server cannot serve a partial-clone (or any virtual-file dep that lands without aresolved_commit).apm installonce -- this succeeds and writes a lockfile entry withcontent_hashset andresolved_commitempty.apm installa second time. Before this PR:[x] Content hash mismatch ... supply-chain attack. After this PR: install succeeds, the cached package is reused, no re-download, no false alert.Validation evidence
uv run --extra dev ruff check src/ tests/-- silent.uv run --extra dev ruff format --check src/ tests/-- silent.uv run --extra dev python -m pylint --disable=all --enable=R0801 --min-similarity-lines=10 --fail-on=R0801 src/apm_cli/-- 10.00/10.bash scripts/lint-auth-signals.sh-- clean.test_integrate_phase3w5.py; broader: 1638 / 1638 across install + drift modules.ASCII-only.
Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com