Improved/aggressive checksum caching #1548

Open
timw wants to merge 2 commits into chains-project:main from indexity-io:feature/checksum-caching

Contributor

@timw timw commented Apr 16, 2026

This PR proposes two related changes:

Improved/cross project caching

Applying caching (similar to that already implemented for the remote checksum calculator) to all checksum calculators, and sharing a single cache across all projects in a multi-project/reactor build. This is achieved by stashing the underlying caches in the Maven session.
This has a significant impact on the performance of :generate, because deep transitive dependencies are evaluated repeatedly, and increasingly often, as a reactor build progresses through the project dependency tree.
There are limitations to the cross-project caching, namely when the reactor creates project sets with distinct classloaders from the initiating/parent project. In that case the caching falls back to a project-scoped cache.
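
For reviewers who want a picture of the idea without reading the diff, here is a minimal sketch of a caching decorator backed by a cache shared between projects. It is not the code in this PR: `ChecksumCalculator`, `CachingChecksumCalculator`, `SessionCaches` and the `"checksum-cache"` key are hypothetical names, and a plain `Map<String, Object>` stands in for whatever session-scoped storage the plugin actually uses.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface; the plugin's real calculator API is likely richer.
interface ChecksumCalculator {
    String checksum(String coordinate);
}

// Decorator: answers from the cache, delegating only on a miss.
final class CachingChecksumCalculator implements ChecksumCalculator {
    private final ChecksumCalculator delegate;
    private final Map<String, String> cache;

    CachingChecksumCalculator(ChecksumCalculator delegate, Map<String, String> cache) {
        this.delegate = delegate;
        this.cache = cache;
    }

    @Override
    public String checksum(String coordinate) {
        // computeIfAbsent calls the delegate at most once per coordinate.
        return cache.computeIfAbsent(coordinate, delegate::checksum);
    }
}

// Stand-in for session-scoped storage (assumption: a simple shared map).
final class SessionCaches {
    private static final String KEY = "checksum-cache"; // hypothetical key

    @SuppressWarnings("unchecked")
    static Map<String, String> sharedCache(Map<String, Object> sessionData) {
        // Every project that sees the same sessionData gets the same cache instance.
        return (Map<String, String>) sessionData.computeIfAbsent(
                KEY, k -> new ConcurrentHashMap<String, String>());
    }
}
```

Two projects that wrap their calculators with `SessionCaches.sharedCache(sessionData)` from the same session share hits; a project that cannot see the shared storage would simply be handed its own map, which corresponds to the project-scoped fallback described above.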

In my test project, this results in 58,000 total checksum calculation requests (including plugin dependencies) making only 1070 requests to the underlying remote checksum calculator. There's a more modest gain with the local calculator (about 3 seconds on my large test project), with some of the benefit coming from the multi-threaded prewarm code now being applied to all calculators.

Incremental generation mode

An opt-in feature (i.e. behind a config flag) that avoids re-calculating checksums on subsequent :generate calls.
This is achieved by simply pre-loading the checksums in the lockfile into the checksum calculator cache (which is now known to decorate all checksum calculators).
In my test project, this reduces the remaining 1070 remote calculator calls to ~30 (there are a bunch of glitches in the dependency tree, like missing/relocated poms etc., that still seem to cause calculations).
Removing the lockfile and re-generating (or changing the config flag) forces a full re-generation for cases where a checksum changes for valid reasons.
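
A minimal sketch of the pre-loading idea, under stated assumptions: `IncrementalChecksums` is a hypothetical class, the existing lockfile is modelled as a plain coordinate-to-checksum map, and a `Function<String, String>` stands in for the real calculator. The actual implementation in this PR also carries repository information and may be structured differently.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical illustration of incremental generation via cache pre-population.
final class IncrementalChecksums {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> calculator;

    IncrementalChecksums(Function<String, String> calculator,
                         Map<String, String> existingLockfile,
                         boolean incremental) {
        this.calculator = calculator;
        if (incremental) {
            // Trust the checksums already recorded in the lockfile.
            cache.putAll(existingLockfile);
        }
    }

    String checksum(String coordinate) {
        // Only new or changed coordinates miss the cache and reach the calculator.
        return cache.computeIfAbsent(coordinate, calculator);
    }
}
```

With the flag off (or with an empty map, the equivalent of a deleted lockfile), every coordinate reaches the calculator again, which matches the forced re-generation path described above.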

I think that this doesn't change the security guarantees of the lockfile...
We 'trust' the checksums in the lockfile, so re-calculating them has two possible outcomes:

  1. The checksum is unchanged, and we could have avoided recalculating it.
  2. The checksum is changed, which can be detected by a :validate run either before or after the incremental :generate.
    The only difference in behaviour compared with a full regeneration is that a full regeneration would generate a new checksum for an artifact that wasn't expected to change - I think the incremental behaviour is actually more desirable TBH (should this even be an opt-in feature?).
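
To make the second outcome concrete, here is a rough sketch of the kind of comparison a validation pass performs: recompute each locked checksum and report the coordinates that differ. `LockfileCheck` and `changedCoordinates` are hypothetical names, and this is not a description of how :validate is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical helper: finds coordinates whose fresh checksum differs from the locked one.
final class LockfileCheck {
    static List<String> changedCoordinates(Map<String, String> locked,
                                           Function<String, String> freshCalculator) {
        List<String> changed = new ArrayList<>();
        // TreeMap gives a stable, sorted report order.
        for (Map.Entry<String, String> entry : new TreeMap<>(locked).entrySet()) {
            String fresh = freshCalculator.apply(entry.getKey());
            if (!entry.getValue().equals(fresh)) {
                changed.add(entry.getKey());
            }
        }
        return changed;
    }
}
```

Because this check bypasses the pre-populated cache, a changed artifact is still reported whether the check runs before or after an incremental :generate.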

Happy to discuss/split/rework, since this landed without an initial discussion issue (the bulk of the implementation was in a stash from a year ago that I'd forgotten about and noticed while doing other patches).

timw added 2 commits April 16, 2026 20:52
Apply caching as decorator to all checksum calculators, and allow it to be maintained across projects in a reactor build.
Opt-in feature to generate lockfile incrementally - only calculating checksums/repository information for new/changed coordinates when a lockfile already exists.
This is implemented by pre-populating the cache with the details from the existing lockfile.