Skip to content

Discard rule results from runs superseded by Shake restarts #4988

Open
crtschin wants to merge 2 commits into
haskell:masterfrom
crtschin:crtschin/stale-diagnostics
Open

Discard rule results from runs superseded by Shake restarts #4988
crtschin wants to merge 2 commits into
haskell:masterfrom
crtschin:crtschin/stale-diagnostics

Conversation

@crtschin

@crtschin crtschin commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator

Closes #4985.

When any rule is (re)ran, do to being invalidated (in the above issue, due to documentDidChange notifications from typing), we track in our shake setup which shake session rule and rule results belong to. The step bookkeeping is an implementation detail of our shake setup to track whether keys are dirty.

What went wrong here is that a rule result being put in the database for a rule build demand, can interleave with the step incrementing done on shake restarts. This has the consequence that a shake rule may write it's result under a new step value, or leave a now outdated result.

For a single rule R, the following can occur:
    step 1   build A starts rule R, reading its inputs at step 1
    step 2   a restart bumps the step to 2 and marks R dirty
     ...     A finally finishes and, unguarded, writes Clean@2,
               a stale result, but its timestamp claims it is fresh

This PR adds a guard against that by checking that the step value is that same as when the rule started, implying no restart occurred. If one did, we should not overwrite it with the old result, as a new build may have raced and written a new one.

I added the (in the previous setup) flaky test, and reran it multiple times.

@fendor fendor requested a review from soulomoon June 24, 2026 07:14
A restart cancels the running build with an async exception, but the computation
it spawned can outlive it and finish under a later build. 'compute' stamps the
result with whatever the step is *now*, so a value read from stale inputs is
recorded as freshly built and dependents skip recomputing it:

@soulomoon soulomoon Jun 24, 2026

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought asyncWithCleanUp would cleanup all the chilren ? So the spawned one won't live after we stop the old session ? Where does it go wrong 🤔 should we fix it instead ?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought asyncWithCleanUp would cleanup all the chilren ?

That's the intention, and it does.

But notice that incDatabase is called in a separate thread relative to all of the reads in compute, which means it can interleave and cause a rule's computation to assume the step of a latter rule build action. This is what causes the superceded diagnostic rule result to survive.

I don't think this is savable from the level of asyncWithCleanup. It's compute that writes the incorrect Clean with the new step value. A check has to occur in the same transaction that sets the status.

@soulomoon soulomoon Jun 25, 2026

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By design, incDatabase should only be called after we cancelShakeSession and it should stop all the session's running threads. I think it is bug if any compute interleave incDatabase.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm that sounds reasonable, I'll try this again later today.

@crtschin crtschin changed the title WIP: Discard rule results from runs superseded by Shake restarts Discard rule results from runs superseded by Shake restarts Jun 24, 2026
@crtschin crtschin marked this pull request as ready for review June 24, 2026 19:00
@crtschin crtschin requested a review from wz1000 as a code owner June 24, 2026 19:00
@crtschin crtschin requested a review from soulomoon June 24, 2026 19:00
@crtschin crtschin added the status: needs review This PR is ready for review label Jun 24, 2026
@crtschin crtschin force-pushed the crtschin/stale-diagnostics branch from d09584c to 42382e0 Compare June 24, 2026 19:03
@crtschin

Copy link
Copy Markdown
Collaborator Author

Another data point for consideration my prefrontal cortex whipt up. If some rules survive past restarts attempts, due to (intended) laziness in shake database values. It's conceivable these thunks which capture parts of the state, never get evaluated or cleaned up. Which would also explain some memory usage/space leaks we've been observing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

status: needs review This PR is ready for review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Diagnostics don't always refresh on edits.

2 participants