
[DRAFT][SM6.10][BugFix] Update calculation of loaded bias vector size in MultiplyAdd #8389

Draft
hekota wants to merge 1 commit into microsoft:users/hekota/pr8388-fix-size-check from hekota:muladd-convert-in-op

Conversation

Member

@hekota hekota commented Apr 18, 2026

For a bias vector with a packed type, when the vector is loaded from memory, the number of components of the loaded vector should equal the number of packed components divided by the number of elements per scalar, rounded up. For example, VectorRef<ComponentType::F8_E4M3FN, 7> is loaded into vector<uint, 2>: each 8-bit F8_E4M3FN element packs four to a 32-bit uint, so 7 packed elements require 2 uints.

In the LinAlg MultiplyAdd function, the loaded bias vector and its interpretation are passed into the `__builtin_LinAlg_MatrixVectorMultiplyAdd` built-in function, which is lowered to the `dx.op.linAlgMatVecMulAdd` op. The expectation is that the `dx.op.linAlgMatVecMulAdd` op will convert the bias vector to the output vector type based on the bias interpretation.

Since the sizes of the output, input, and bias vectors can all differ, the HLSL intrinsic support had to be extended to allow vectors of three different sizes in a single built-in function.

