[CPU] Allow Gather on KV cache paths in MatchSdpaKvCache #35162

Closed
CuriousPanCake wants to merge 1 commit into openvinotoolkit:master from CuriousPanCake:CVS-183493_wa

Conversation

@CuriousPanCake
Contributor

Details:

MatchSdpaKvCache is skipped when a Gather appears among the Concat's children on KV cache paths after StatefulSDPAFusion. As a result, MemoryInputSDPA is never created, the KV states are left unassigned, and inference crashes on a null state.

          ┌─────────┐               
    ┌─────┤ReadValue├────┐          
    │     └────┬────┘    │          
    │          │         │          
    ▼          ▼         ▼          
┌────────┐ ┌───────┐ ┌──────┐       
│SDPAWith│ │ShapeOf│ │Gather│       
│KVCache │ └───────┘ └──────┘       
└────────┘            //Not Expected
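
To make the failing condition concrete, here is a minimal, self-contained sketch of the consumer check. It is not the plugin's code: the `Type` enum, `consumersAllowed`, the `kAllowed*` lists, and `checkDiagramCase` are hypothetical stand-ins, and only the three node type names come from this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the plugin's node type enum; only the
// types mentioned in this PR (plus a catch-all) are modeled.
enum class Type { ScaledDotProductAttention, ShapeOf, Gather, Other };

// Returns true when every consumer is in the allowed list, i.e. when a
// pass such as MatchSdpaKvCache would not bail out on this node.
bool consumersAllowed(const std::vector<Type>& consumers,
                      const std::vector<Type>& allowed) {
    return std::all_of(consumers.begin(), consumers.end(), [&](Type t) {
        return std::find(allowed.begin(), allowed.end(), t) != allowed.end();
    });
}

// Allowed consumer types before and after this PR (assumed from the diff).
const std::vector<Type> kAllowedBefore = {Type::ScaledDotProductAttention,
                                          Type::ShapeOf};
const std::vector<Type> kAllowedAfter = {Type::ScaledDotProductAttention,
                                         Type::ShapeOf, Type::Gather};

// Consumers from the diagram: SDPA with KV cache, ShapeOf, and the extra Gather.
const std::vector<Type> kDiagramConsumers = {Type::ScaledDotProductAttention,
                                             Type::ShapeOf, Type::Gather};

// Usage: the diagram's consumer set is rejected by the old list and
// accepted by the new one.
void checkDiagramCase() {
    assert(!consumersAllowed(kDiagramConsumers, kAllowedBefore));
    assert(consumersAllowed(kDiagramConsumers, kAllowedAfter));
}
```

In this toy model, the extra Gather is the only reason the old check fails. Any other unexpected consumer type would still be rejected after the change.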

Tickets:

AI Assistance:

  • AI assistance used: yes

Signed-off-by: Andrii Staikov <andrii.staikov@intel.com>

@CuriousPanCake CuriousPanCake requested review from a team as code owners April 3, 2026 12:21
@github-actions github-actions Bot added the category: CPU OpenVINO CPU plugin label Apr 3, 2026
@CuriousPanCake CuriousPanCake changed the title [CPU] Allow Gather in MatchSdpaKvCache [CPU] Allow Gather on KV cache paths in MatchSdpaKvCache Apr 3, 2026
@mryzhov mryzhov requested a review from Copilot April 8, 2026 13:03
Contributor

Copilot AI left a comment

Pull request overview

Fixes a CPU plugin graph-optimizer pattern match so MatchSdpaKvCache still runs when a Gather node appears as an additional consumer on KV-cache paths after StatefulSDPAFusion, preventing missing KV state assignment and a null-state crash during inference.

Changes:

  • Extends MatchSdpaKvCache’s eligibility check to allow Type::Gather among MemoryInput’s children (in addition to ScaledDotProductAttention and ShapeOf).

 for (auto&& item : childEdges) {
     auto childNode = item->getChild();
-    if (none_of(childNode->getType(), Type::ScaledDotProductAttention, Type::ShapeOf)) {
+    if (none_of(childNode->getType(), Type::ScaledDotProductAttention, Type::ShapeOf, Type::Gather)) {
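
For readers unfamiliar with the helper, `none_of(value, candidates...)` is read here as "value equals none of the candidates". The sketch below is an assumed, simplified equivalent and not the CPU plugin's implementation. The `Type` enum and the `rejectedBefore`, `rejectedAfter`, and `checkGatherHandling` names are hypothetical and introduced only for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the plugin's node type enum.
enum class Type { ScaledDotProductAttention, ShapeOf, Gather, Concatenation };

// Base case: with no candidates left, nothing matched.
template <typename T>
constexpr bool none_of(const T&) {
    return true;
}

// True when `value` compares unequal to every candidate.
template <typename T, typename First, typename... Rest>
constexpr bool none_of(const T& value, const First& first, const Rest&... rest) {
    return !(value == first) && none_of(value, rest...);
}

// Condition from the removed line: any type outside {SDPA, ShapeOf} is rejected.
constexpr bool rejectedBefore(Type t) {
    return none_of(t, Type::ScaledDotProductAttention, Type::ShapeOf);
}

// Condition from the added line: Gather is now tolerated as well.
constexpr bool rejectedAfter(Type t) {
    return none_of(t, Type::ScaledDotProductAttention, Type::ShapeOf, Type::Gather);
}

// Usage: Gather flips from rejected to accepted; other types are unaffected.
void checkGatherHandling() {
    assert(rejectedBefore(Type::Gather));
    assert(!rejectedAfter(Type::Gather));
    assert(rejectedAfter(Type::Concatenation));
}
```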
Copilot AI Apr 8, 2026

[HIGH] This change fixes a crash-prone behavioral bug in KV-cache state assignment, but the PR doesn’t add a regression test covering the newly accepted pattern (MemoryInput having an additional Gather consumer alongside SDPA/ShapeOf). Please add a CPU plugin unit/functional test that builds a model where ReadValue/MemoryInput feeds the fused SDPA-with-KV-cache path and also has a sibling Gather consumer, and asserts compilation + inference works (no null-state crash) and the KV states are assigned/usable.

Copilot generated this review using guidance from repository custom instructions.
@CuriousPanCake
Contributor Author

This PR is a workaround (WA). The root cause was investigated separately, and a proper fix was merged in bcbe69a.

Labels

category: CPU OpenVINO CPU plugin

2 participants