Rapidly updating a parent image can cause a child image to use stale base #6784
Open
metronom72 wants to merge 1 commit into
A child image resolves its base (FROM) tag from the base ImageMap. That tag is published over two channels: the engine's build result, which is written synchronously, and the apiserver ImageMap status, which is flushed asynchronously by the image reconcilers.

When a base image is reused (not rebuilt) in a build pass -- e.g. it was just rebuilt in another manifest that shares it -- BuildAndDeploy read the FROM tag from the apiserver ImageMap status, which can still lag the propagated result. The child was then built against a stale base tag and never self-corrected, because the engine already considered it up-to-date.

Seed the imageMapSet status from the reused build result instead of the apiserver status, so reused base images inject the latest tag.

Fixes tilt-dev#6634

Signed-off-by: Mikhail Dorokhovich <mikhail@dorokhovich.com>
Problem
Fixes #6634.
Under concurrent rebuilds of a shared base image, a child image can be built
against a stale parent base image and never self-correct. The parent
deployment runs the latest base, but a child keeps an older base baked into its
FROM, so files inherited from the base diverge across images. The larger the
project and the more image relations there are, the easier it is to trigger.
Root Cause
A child image resolves its base (FROM) tag from the base ImageMap. That tag
is published over two channels with different timing:
- The engine's build result: written synchronously.
- ImageMap.Status: flushed asynchronously by the dockerimage/cmdimage reconcilers.
In internal/engine/buildcontrol/image_build_and_deployer.go, BuildAndDeploy
builds its imageMapSet from the apiserver (ctrlClient.Get) for every image
target, including ones that are reused (not rebuilt) in this pass. Tilt
de-dups redundant base rebuilds across manifests that share a base image, so when
a base is rebuilt in one manifest and reused in another, the second manifest's
child reads its FROM from the apiserver ImageMap, which can still lag the
propagated result. The engine already considers the child up-to-date, so nothing
re-triggers it and the stale tag becomes permanent.
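The timing gap described above can be modeled with a minimal, deterministic sketch. It is not Tilt's code: the apiserver type, its queue/flush methods, and childFROM are hypothetical stand-ins, and the asynchronous reconciler is reduced to an explicit flush call so the ordering is easy to follow.

```go
package main

import "fmt"

// apiserver is a hypothetical stand-in for the ImageMap objects stored in the
// apiserver. Its visible status only changes when flush is called.
type apiserver struct {
	status  map[string]string // image name -> tag visible to readers
	pending map[string]string // status writes not yet flushed
}

func newAPIServer() *apiserver {
	return &apiserver{status: map[string]string{}, pending: map[string]string{}}
}

// queue records a status update that the reconciler will flush later.
func (a *apiserver) queue(name, tag string) { a.pending[name] = tag }

// flush models the asynchronous reconciler catching up.
func (a *apiserver) flush() {
	for name, tag := range a.pending {
		a.status[name] = tag
	}
	a.pending = map[string]string{}
}

// childFROM returns the base tag a child would be built against when it
// reads the apiserver status (the pre-fix behavior described above).
func childFROM(a *apiserver, base string) string { return a.status[base] }

func main() {
	api := newAPIServer()
	api.status["base"] = "base:tilt-1" // tag left over from an earlier build

	// Manifest A rebuilds the base. The engine's result is known right away,
	// but the apiserver status update is only queued.
	engineResult := map[string]string{"base": "base:tilt-2"}
	api.queue("base", engineResult["base"])

	// Manifest B reuses the base and resolves its child's FROM before the flush.
	fmt.Println("child FROM: ", childFROM(api, "base")) // base:tilt-1 (stale)
	fmt.Println("engine has: ", engineResult["base"])   // base:tilt-2

	api.flush()
	fmt.Println("after flush:", childFROM(api, "base")) // base:tilt-2
}
```

In this model the child only sees the new tag if it happens to read after the flush; nothing forces a rebuild when it reads too early.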
(Within a single manifest this can't happen: UpdateImageMap mutates the shared
imageMapSet in place and RunBuilds is topologically ordered, so a child
always sees its freshly-built base. The bug is specific to the cross-manifest
reuse path.)
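Why the single-manifest path is safe can be sketched as follows. This is an illustration under assumptions, not the real RunBuilds or UpdateImageMap: the target type, runBuilds, and the tag format are hypothetical. The point is only that building in dependency order while writing each result back into the same map lets a child read the tag its base just produced.

```go
package main

import "fmt"

// target is a hypothetical image target: a name plus the base it builds FROM.
type target struct {
	name string
	base string // empty for a root image
}

// runBuilds builds targets in the given, already topologically sorted, order.
// Each build reads its base tag from tags and writes its own new tag back
// into the same map, so later targets observe earlier results.
// It returns the base tag each child was built against.
func runBuilds(order []target, tags map[string]string, rev int) map[string]string {
	builtFrom := map[string]string{}
	for _, t := range order {
		if t.base != "" {
			builtFrom[t.name] = tags[t.base]
		}
		tags[t.name] = fmt.Sprintf("%s:tilt-%d", t.name, rev)
	}
	return builtFrom
}

func main() {
	// Tags from a previous pass.
	tags := map[string]string{"base": "base:tilt-1", "child": "child:tilt-1"}

	// Base comes before child, as a topological order guarantees.
	order := []target{{name: "base"}, {name: "child", base: "base"}}

	builtFrom := runBuilds(order, tags, 2)
	fmt.Println(builtFrom["child"]) // base:tilt-2
}
```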
Solution
When seeding imageMapSet, for an image that is reused in this pass, take its
status from the authoritative build result the engine already computed
(TargetQueue.ReusedResults()) instead of the asynchronously-flushed apiserver
ImageMap.Status. Rebuilt targets are still overwritten by UpdateImageMap as
before.
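The seeding rule can be illustrated with a minimal sketch. It does not reproduce the actual patch: imageStatus and seedImageMapSet are hypothetical, and the two input maps stand in for what ctrlClient.Get and TargetQueue.ReusedResults() would provide.

```go
package main

import "fmt"

// imageStatus is a hypothetical, reduced stand-in for an ImageMap status.
type imageStatus struct{ tag string }

// seedImageMapSet builds the initial name -> status map for one build pass.
// For targets reused in this pass it prefers the engine's build result;
// every other target falls back to the status read from the apiserver.
func seedImageMapSet(names []string, fromAPI, reused map[string]imageStatus) map[string]imageStatus {
	set := make(map[string]imageStatus, len(names))
	for _, n := range names {
		if st, ok := reused[n]; ok {
			set[n] = st // authoritative: already computed by the engine
			continue
		}
		set[n] = fromAPI[n] // may lag, but the target is rebuilt this pass
	}
	return set
}

func main() {
	// The apiserver still reports the old base tag.
	fromAPI := map[string]imageStatus{
		"base":  {tag: "base:tilt-stale"},
		"child": {tag: "child:tilt-1"},
	}
	// The base was rebuilt by another manifest and is reused here.
	reused := map[string]imageStatus{"base": {tag: "base:tilt-2"}}

	set := seedImageMapSet([]string{"base", "child"}, fromAPI, reused)
	fmt.Println(set["base"].tag)  // base:tilt-2
	fmt.Println(set["child"].tag) // child:tilt-1
}
```

The child's own entry still comes from the apiserver in this sketch; that matches the description above, since a target that is rebuilt in the pass has its entry overwritten afterwards.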
Changes
- internal/engine/buildcontrol/image_build_and_deployer.go: prefer the reused build result's status over the apiserver ImageMap status when building imageMapSet.
- internal/engine/buildcontrol/image_build_and_deployer_test.go: add TestMultiStageDockerBuildReusedBaseWithStaleImageMap.
Testing
- TestMultiStageDockerBuildReusedBaseWithStaleImageMap (new): the apiserver ImageMap status is deliberately stale, and the child must be built FROM the latest base tag. It fails without the fix (child built FROM …:tilt-stale) and passes with it.
- TestManifestsWithCommonAncestorAndTrigger, TestManifestsWithTwoCommonAncestors, TestTwoK8sTargetsWithBaseImage* and the multi-stage build tests: dedup and trigger-spillover behavior unchanged.
- No changes to buildcontrols/reducers.go, so the infinite-build behavior referenced by #3542 (engine: fix bugs in image build caching) is unaffected.