fix(ui): stabilize workflow node execution state updates#9029

Open
JPPhoto wants to merge 3 commits into invoke-ai:main from JPPhoto:workflow-node-execution-event-ordering

Conversation

Collaborator

@JPPhoto JPPhoto commented Apr 8, 2026

Summary

Fixes workflow node execution state updates in the frontend event layer.

This change fixes nodes getting stuck in IN_PROGRESS or showing duplicate outputs when socket events arrive out of order or are repeated. The fix moves the event-ordering logic into shared helpers and uses a listener-local completed-invocation key set so late invocation_started / invocation_progress events cannot overwrite a completed node state.
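The guard described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `createListener`, `keyOf`, `InvocationEvent`, `NodeState`, and the `sessionId:invocationId` key format are all hypothetical names and assumptions; only the event names and the idea of a listener-local completed-invocation key set come from the description.

```typescript
// Hypothetical types: the real event payloads and state shape are not shown in this PR text.
type Status = 'PENDING' | 'IN_PROGRESS' | 'COMPLETED';

interface NodeState {
  status: Status;
  outputs: string[];
}

interface InvocationEvent {
  type: 'invocation_started' | 'invocation_progress' | 'invocation_complete';
  sessionId: string;
  invocationId: string; // assumed to be unique per node execution
  nodeId: string;
  output?: string;
}

// Assumed key format: one key per (session, invocation) pair.
const keyOf = (e: InvocationEvent): string => `${e.sessionId}:${e.invocationId}`;

function createListener() {
  // Listener-local: the set lives and dies with this listener instance.
  const completedKeys = new Set<string>();
  const states = new Map<string, NodeState>();

  const handle = (e: InvocationEvent): void => {
    const key = keyOf(e);
    if (completedKeys.has(key)) {
      // Late started/progress events and repeated complete events are ignored,
      // so a completed node cannot revert and its output is recorded once.
      return;
    }
    const prev: NodeState = states.get(e.nodeId) ?? { status: 'PENDING', outputs: [] };
    if (e.type === 'invocation_complete') {
      completedKeys.add(key);
      states.set(e.nodeId, {
        status: 'COMPLETED',
        outputs: e.output === undefined ? prev.outputs : [...prev.outputs, e.output],
      });
      return;
    }
    // invocation_started and invocation_progress both mark the node as running.
    states.set(e.nodeId, { ...prev, status: 'IN_PROGRESS' });
  };

  return { handle, states };
}
```

With this shape, a duplicated `invocation_complete` or a trailing `invocation_progress` for the same invocation is a no-op, which corresponds to the two symptoms listed above (duplicate outputs and nodes reverting to IN_PROGRESS).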

Related Issues / Discussions

QA Instructions

  1. On main, run a workflow in the Workflow Editor and examine the Outputs pane for a node that executes. You should see two outputs even when the node is executed once.
  2. After pulling and building (or running in dev mode), open the Workflow Editor and run a workflow with visible node progress.
    • Confirm nodes transition from pending to in progress to completed.
    • Confirm completed nodes do not revert back to in progress during or after the run.
    • Confirm the Outputs pane does not show duplicate outputs for a single node execution.
  3. Execute pnpm vitest run to run regression tests.

Merge Plan

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

@JPPhoto JPPhoto added the frontend PRs that change frontend files label Apr 8, 2026
@JPPhoto JPPhoto force-pushed the workflow-node-execution-event-ordering branch 9 times, most recently from 23fae17 to 894df8a Compare April 14, 2026 00:39
@lstein lstein self-assigned this Apr 14, 2026
@lstein lstein added the v6.13.x label Apr 14, 2026
@lstein lstein moved this to 6.13.x Theme: MODELS in Invoke - Community Roadmap Apr 14, 2026
@JPPhoto JPPhoto force-pushed the workflow-node-execution-event-ordering branch 12 times, most recently from 057032b to 780663a Compare April 20, 2026 21:02
@JPPhoto JPPhoto force-pushed the workflow-node-execution-event-ordering branch 2 times, most recently from 36894df to f2068cf Compare April 21, 2026 00:40
Collaborator

@lstein lstein left a comment

It's a funny coincidence that I noticed the doubled output just a few days ago on my own, and was puzzled by it, not knowing whether it was a feature or a bug.

In any case, I'm having difficulty testing this PR. I set up the very simple integer operation workflow shown below. When I run it, the nodes go into permanent pending state and continue showing pending even after cancelling. On the other hand, when using an image generation workflow that previously doubled the output, the doubled output is gone. However, the nodes at the very beginning and end of the workflow get stuck in PENDING. Is there something I'm doing wrong?

(Yes, I rebuilt the front end)

Image

Collaborator Author

JPPhoto commented Apr 21, 2026

I think there are multiple issues here and this PR addresses double results while #9043 addresses status issues. Can you locally merge #9043 into your checkout and see if everything is better with both applied?

@JPPhoto JPPhoto force-pushed the workflow-node-execution-event-ordering branch from e0b1e7d to 4afe259 Compare April 21, 2026 02:00
Collaborator

lstein commented Apr 21, 2026

Will do. I'm traveling for a bunch of business meetings this week so it may be a couple days before I get back to this, but I'm anxious to get it pushed through.

Collaborator

lstein commented Apr 21, 2026

I've created a branch that contains both #9029 and #9043. However, the problem of stuck workflows persists.

When I create the simplest workflow of them all, a single Add Integers node, and press the Invoke button, about 90% of the time it gets stuck in the PENDING state. If I create a slightly more complex workflow, such as feeding the Add Integers output into Integer Range of Size, the workflow completes about 80% of the time and gets stuck in PENDING the rest of the time.

This suggests to me that there is still a race condition of some sort. Let me know if I'm testing this the wrong way.

Collaborator

lstein commented Apr 21, 2026

Oh, wait, I merged #9042, not #9043. Trying that as well.

No, the problem persists. This is with all three PRs (#9029, #9042 and #9043) merged into a local branch. I also note that the single-node Add Integers workflow sometimes appears to run to completion, but produces no output.

@JPPhoto JPPhoto force-pushed the workflow-node-execution-event-ordering branch from 4afe259 to e65a67f Compare April 22, 2026 00:56
Collaborator Author

JPPhoto commented Apr 22, 2026

I think I have the issue isolated. The node's initial execution state has not been put into $nodeExecutionStates before the first socket event for that node arrives. For very fast workflows, invocation_started, invocation_progress, or even invocation_complete can win that race, and the handler was previously dropping the event because there was no existing execution-state entry to update. Try this PR again now and see if that resolves the problem.
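To make the race concrete, here is a rough sketch under stated assumptions. `$nodeExecutionStates` is modeled as a plain `Map`, and `applyDropping`, `applyUpserting`, `initNode`, `initialState`, and `ExecState` are hypothetical names for illustration. The upsert variant is one plausible way to stop losing early events; it is not necessarily what the updated commits do.

```typescript
// Illustrative model only: the real store and state shape are not reproduced here.
type Status = 'PENDING' | 'IN_PROGRESS' | 'COMPLETED';

interface ExecState {
  nodeId: string;
  status: Status;
}

type Store = Map<string, ExecState>;

const initialState = (nodeId: string): ExecState => ({ nodeId, status: 'PENDING' });

// Behavior as described above: with no existing entry, the event is silently dropped.
function applyDropping(store: Store, nodeId: string, status: Status): void {
  const existing = store.get(nodeId);
  if (!existing) {
    return;
  }
  store.set(nodeId, { ...existing, status });
}

// Assumed alternative: create the entry on demand so an early event is never lost.
function applyUpserting(store: Store, nodeId: string, status: Status): void {
  const existing = store.get(nodeId) ?? initialState(nodeId);
  store.set(nodeId, { ...existing, status });
}

// Initialization must not overwrite state that an early event already wrote.
function initNode(store: Store, nodeId: string): void {
  if (!store.has(nodeId)) {
    store.set(nodeId, initialState(nodeId));
  }
}

// Fast workflow: invocation_complete arrives before the node's state is initialized.
const dropping: Store = new Map();
applyDropping(dropping, 'add_integers', 'COMPLETED'); // lost
initNode(dropping, 'add_integers'); // node now sits in PENDING indefinitely

const upserting: Store = new Map();
applyUpserting(upserting, 'add_integers', 'COMPLETED'); // recorded
initNode(upserting, 'add_integers'); // no-op, COMPLETED is preserved
```

In the dropping variant the node is left in PENDING after the run, which matches the symptom reported for the single Add Integers workflow.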

@JPPhoto JPPhoto force-pushed the workflow-node-execution-event-ordering branch from 0af9f1f to 46cc494 Compare April 22, 2026 01:32
Labels

frontend PRs that change frontend files v6.13.x

Projects

Status: 6.13.x Theme: MODELS
