Improved/aggressive checksum caching #1548
Open
timw wants to merge 2 commits into
Apply caching as decorator to all checksum calculators, and allow it to be maintained across projects in a reactor build.
Opt-in feature to generate lockfile incrementally - only calculating checksums/repository information for new/changed coordinates when a lockfile already exists. This is implemented by pre-populating the cache with the details from the existing lockfile.
This PR proposes two related changes:
Improved/cross project caching
Applies caching (similar to that already implemented for the remote checksum calculator) to all checksum calculators, and shares a single cache across all projects in a multi-project/reactor build. This is achieved by stashing the underlying caches in the Maven session.
This has a significant impact on the performance of :generate, as deep transitive dependencies are evaluated repeatedly, and increasingly often, as a reactor build progresses through the project dependency tree.
There are limitations to the cross-project caching, namely when the reactor creates project sets with classloaders distinct from the initiating/parent project. In that case the caching falls back to a project-scoped cache.
In my test project, this results in a total of 58,000 checksum calculation requests (including plugin dependencies) making only 1070 requests to the underlying remote checksum calculator. There's a more modest gain with the local calculator (about 3 seconds on my large test project), with some of the benefit coming from the multi-threaded prewarm code now being applied to all calculators.
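To illustrate the shape of the idea, here is a minimal sketch of a caching decorator whose backing map is shared between projects. It is not the code in this PR: `ChecksumCalculator`, `CachingCalculator`, and `Session` are hypothetical stand-ins (the real change stashes the caches in the Maven session), and the checksum values are fake.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the plugin's checksum calculator abstraction. */
interface ChecksumCalculator {
    String checksum(String coordinate);
}

/** Decorator: consults the supplied cache before delegating to the real calculator. */
final class CachingCalculator implements ChecksumCalculator {
    private final ChecksumCalculator delegate;
    private final Map<String, String> cache;

    CachingCalculator(ChecksumCalculator delegate, Map<String, String> cache) {
        this.delegate = delegate;
        this.cache = cache;
    }

    @Override
    public String checksum(String coordinate) {
        // Only calls the delegate when the coordinate has not been seen before.
        return cache.computeIfAbsent(coordinate, delegate::checksum);
    }
}

/** Hypothetical stand-in for session-scoped storage shared by all projects in a build. */
final class Session {
    private final Map<String, Map<String, String>> caches = new ConcurrentHashMap<>();

    Map<String, String> sharedCache(String key) {
        // Every caller asking for the same key gets the same map instance.
        return caches.computeIfAbsent(key, k -> new ConcurrentHashMap<>());
    }
}

class CachingSketch {
    public static void main(String[] args) {
        AtomicInteger underlyingCalls = new AtomicInteger();
        // Fake "expensive" calculator that counts how often it is invoked.
        ChecksumCalculator slow = c -> {
            underlyingCalls.incrementAndGet();
            return "fake:" + Integer.toHexString(c.hashCode());
        };

        Session session = new Session();
        // Two projects in the same build each get their own decorator over one shared cache.
        ChecksumCalculator projectA = new CachingCalculator(slow, session.sharedCache("checksums"));
        ChecksumCalculator projectB = new CachingCalculator(slow, session.sharedCache("checksums"));

        projectA.checksum("org.example:lib:1.0");
        projectB.checksum("org.example:lib:1.0");

        System.out.println("underlying calls: " + underlyingCalls.get()); // prints "underlying calls: 1"
    }
}
```

A project-scoped fallback would amount to passing a fresh map to the decorator instead of the one held by the session.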
Incremental generation mode
An opt-in feature (i.e. enabled with a config flag) that avoids re-calculating checksums on subsequent :generate calls.
This is achieved by simply pre-loading the checksums from the lockfile into the checksum calculator cache (which now decorates all checksum calculators).
In my test project, this reduces the remaining 1070 remote calculator calls to ~30 (there are a bunch of glitches in the dependency tree like missing/relocated poms etc. that still seem to cause calculations).
Removing the lockfile and re-generating (or changing the config flag) forces a full re-generation when a checksum changes for valid reasons.
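The pre-loading step can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `ChecksumSource` and `IncrementalGenerator` are hypothetical names, the lockfile is reduced to a coordinate-to-checksum map, and the checksum values are fake.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for any checksum calculator. */
interface ChecksumSource {
    String checksum(String coordinate);
}

/** Hypothetical generator that seeds its cache from an existing lockfile. */
final class IncrementalGenerator {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final ChecksumSource delegate;

    IncrementalGenerator(ChecksumSource delegate) {
        this.delegate = delegate;
    }

    /** Trust the checksums already recorded in the lockfile (coordinate -> checksum). */
    void preload(Map<String, String> existingLockfile) {
        cache.putAll(existingLockfile);
    }

    /** Resolve checksums, calling the delegate only for coordinates not in the cache. */
    Map<String, String> generate(List<String> coordinates) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String coordinate : coordinates) {
            result.put(coordinate, cache.computeIfAbsent(coordinate, delegate::checksum));
        }
        return result;
    }
}

class IncrementalSketch {
    public static void main(String[] args) {
        List<String> recomputed = new ArrayList<>();
        // Fake remote calculator that records which coordinates it was asked about.
        ChecksumSource remote = c -> {
            recomputed.add(c);
            return "fake:" + Integer.toHexString(c.hashCode());
        };

        IncrementalGenerator generator = new IncrementalGenerator(remote);
        generator.preload(Map.of("org.example:a:1.0", "fake:aaaa"));

        Map<String, String> lockfile =
                generator.generate(List.of("org.example:a:1.0", "org.example:b:2.0"));

        System.out.println(lockfile.get("org.example:a:1.0")); // prints "fake:aaaa"
        System.out.println("recomputed: " + recomputed);       // prints "recomputed: [org.example:b:2.0]"
    }
}
```

Skipping the `preload` call gives the full-regeneration behaviour, which corresponds to removing the lockfile or turning the flag off.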
I think that this doesn't change the security guarantees of the lockfile...
We 'trust' the checksums in the lockfile, so re-calculating them has two possible outcomes:
:validate run either before or after the incremental :generate.
The only difference in behaviour from a full regeneration would be generating a new checksum for an artifact that wasn't expected to change. I think the incremental behaviour is actually more desirable TBH (should this even be an opt-in feature?).
Happy to discuss/split/rework, since this landed without an initial discussion issue (the bulk of the implementation was in a stash from a year ago that I'd forgotten about and noticed while doing other patches).