[SM6.10][Exec][Bugfix] Fix OuterProduct/AccumulateToDescriptor Smoke Tests for Thread Matrices#8387

Open
V-FEXrt wants to merge 2 commits into microsoft:main from V-FEXrt:linalg-outeropt-layout

@V-FEXrt
Collaborator

@V-FEXrt V-FEXrt commented Apr 17, 2026

Fixes #8386

Comment thread tools/clang/unittests/HLSLExec/LinAlgTests.cpp Outdated
    Device, DxcSupport, std::move(Op),
    [NumElements, Params, FillValue](LPCSTR Name, std::vector<BYTE> &Data,
                                     st::ShaderOp *) {
      VERIFY_IS_TRUE(fillInputBuffer(Name, Data, Params.CompType, NumElements,
Member

If the layout isn't RowMajor, this needs to run a ConvertLinearAlgebraMatrix conversion.

Collaborator Author

@V-FEXrt V-FEXrt Apr 17, 2026

<edit: the comment here was in the wrong place>

Comment thread tools/clang/unittests/HLSLExec/LinAlgTests.cpp
@github-project-automation github-project-automation bot moved this from New to In progress in HLSL Roadmap Apr 17, 2026
@anupamachandra
Collaborator

The title is a little misleading: it is AccumulateToDescriptor on Thread Scope matrices that requires OuterProductOptimal layouts, not all thread matrices.

@V-FEXrt
Collaborator Author

V-FEXrt commented Apr 17, 2026

@anupamachandra yep, my bad. I threw the first draft of this together a bit too quickly. I'm working on updating it now. Thanks for pointing that out!

@V-FEXrt V-FEXrt changed the title from "[SM6.10][Exec][Bugfix] Thread mats should be OuterProductOptimal layout" to "[SM6.10][Exec][Bugfix] AccumulateToDescriptor requiresx OuterProductOptimal for Thread mats" Apr 17, 2026
@V-FEXrt V-FEXrt changed the title from "[SM6.10][Exec][Bugfix] AccumulateToDescriptor requiresx OuterProductOptimal for Thread mats" to "[SM6.10][Exec][Bugfix] AccumulateToDescriptor requires OuterProductOptimal for Thread Mats" Apr 17, 2026
@V-FEXrt V-FEXrt changed the title from "[SM6.10][Exec][Bugfix] AccumulateToDescriptor requires OuterProductOptimal for Thread Mats" to "[SM6.10][Exec][Bugfix] Fix OuterProduct/AccumulateToDescriptor Smoke Tests for Thread Matrices" Apr 18, 2026
  SS << " -DUSE=" << static_cast<int>(Params.Use);
  SS << " -DSCOPE=" << static_cast<int>(Params.Scope);
- SS << " -DSTRIDE=" << Params.strideBytes();
+ SS << " -DSTRIDE=" << Params.rowStride();

The stride is a problem for group shared load and store. Per the spec, the stride for group shared memory is a count of elements, so it should be N or M for group shared.

These calls need to be fixed:
    __builtin_LinAlg_MatrixLoadFromMemory(
        Mat, GsData, OFFSET, STRIDE, LAYOUT);
    __builtin_LinAlg_MatrixStoreToMemory(
        Mat, GsData, OFFSET, STRIDE, LAYOUT);

Also, the test sets the group shared offset to 0, which is fine here, but I guess the offset for group shared is also a count of elements?
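
To make the unit mismatch in this comment concrete, here is a minimal C++ sketch. It assumes, as the comment reads the spec, that group shared strides and offsets are counted in elements, and it further assumes that the byte-based value (the `strideBytes()` name in the diff suggests one) is meant for byte-addressed buffers. The helper names `rowStrideInBytes`, `rowStrideInElements`, and `elementIndex` are hypothetical and do not appear in LinAlgTests.cpp.

```cpp
#include <cstddef>

// Hypothetical helpers for illustration; these names are not from LinAlgTests.cpp.

// Row stride of a row-major matrix with numCols columns, measured in bytes.
// Assumed here to be the unit a byte-addressed buffer would expect.
constexpr std::size_t rowStrideInBytes(std::size_t numCols, std::size_t elemSize) {
  return numCols * elemSize;
}

// The same stride measured in elements, the unit the review comment says the
// spec uses for group shared memory.
constexpr std::size_t rowStrideInElements(std::size_t numCols) {
  return numCols;
}

// Index of element (row, col) in a typed array, with the offset and the
// stride both counted in elements.
constexpr std::size_t elementIndex(std::size_t row, std::size_t col,
                                   std::size_t offsetElems,
                                   std::size_t strideElems) {
  return offsetElems + row * strideElems + col;
}

// An 8-column matrix of 2-byte elements: the two strides differ by the
// element size.
static_assert(rowStrideInBytes(8, 2) == 16, "byte stride");
static_assert(rowStrideInElements(8) == 8, "element stride");
```

With these numbers, element (1, 2) sits at index 10 when the stride is given in elements, but a byte stride passed to the same element-indexed array would land on index 18, which belongs to a different row. Whether the element stride is N or M would depend on the layout: the column count for row-major, the row count for column-major.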

// flatten the 2D index into a 1D index then scale by element size
// Always store row-major and work it out in the test runner
uint coordToByteOffset(uint2 coord) {
  return (coord.y * N_DIM + coord.x) * ELEM_SIZE;
}

This is not related to this PR, but I guess coordToByteOffset should be this?
return (coord.x * N_DIM + coord.y) * ELEM_SIZE;
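
The two formulas differ only in which component of `coord` is treated as the row. The thread does not say which convention the coordinate uses, so the C++ sketch below simply models both, assuming a row-major matrix with `nDim` columns. The `Coord` struct (standing in for HLSL `uint2`) and both function names are hypothetical.

```cpp
#include <cstdint>

// Stand-in for HLSL uint2; hypothetical, for illustration only.
struct Coord {
  std::uint32_t x;
  std::uint32_t y;
};

// Convention A (the form currently in the shader): x is the column, y is the row.
constexpr std::uint32_t offsetWhenXIsColumn(Coord c, std::uint32_t nDim,
                                            std::uint32_t elemSize) {
  return (c.y * nDim + c.x) * elemSize;
}

// Convention B (the form suggested in the comment): x is the row, y is the column.
constexpr std::uint32_t offsetWhenXIsRow(Coord c, std::uint32_t nDim,
                                         std::uint32_t elemSize) {
  return (c.x * nDim + c.y) * elemSize;
}

// A 2x3 matrix of 4-byte elements: row 1, column 2 is element 5, at byte 20.
static_assert(offsetWhenXIsColumn(Coord{2, 1}, 3, 4) == 20, "x = column");
static_assert(offsetWhenXIsRow(Coord{1, 2}, 3, 4) == 20, "x = row");
```

Both forms agree when the coordinate is packed the way each one expects. Applying the wrong one to a non-square matrix can step outside the buffer: `offsetWhenXIsRow(Coord{2, 1}, 3, 4)` gives byte 28 in a 24-byte matrix. So the right formula depends on how the coordinate is defined, which would need checking against the spec.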

Development

Successfully merging this pull request may close these issues.

[SM6.10] linAlgMatrixAccumulateToDescriptor: Incorrect Matrix definition in LinAlgTests

5 participants