Description

    typename hlsl::enable_if<
        InterpretedVector<InputElTy, VecK, InputInterp>::Size == K,
        vector<OutputElTy, M> >::type
    // clang-format on
    MultiplyAdd(Matrix<MatrixDT, M, K, MatrixUse::A, MatrixScope::Thread> MatrixA,

    typename hlsl::enable_if<
        InterpretedVector<InputElTy, VecK, InputInterp>::Size == K,
        vector<OutputElTy, M> >::type
    // clang-format on
    MultiplyAdd(Matrix<MatrixDT, M, K, MatrixUse::A, MatrixScope::Thread> MatrixA,

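After DXC fixed the `linalg::Convert` `DstN` calculation in #8359, the following issue appears. In the F16->FP8 case with `K = 15`, the input type is `InterpretedVector<uint, 4, ComponentEnum::F8_E4M3FN>`, whose `Size = ElementsPerScalar * N = 4 * 4 = 16`. Since 16 is not equal to 15, the `Size == K` condition in the `enable_if` constraint above cannot hold.
I believe the correct logic should be to ensure `Size >= K`.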
Steps to Reproduce
Actual Behavior
Environment
- DXC version
- Host Operating System
Referenced code
- DirectXShaderCompiler/tools/clang/lib/Headers/hlsl/dx/linalg.h, lines 508 to 512 in 4ad9834
- DirectXShaderCompiler/tools/clang/lib/Headers/hlsl/dx/linalg.h, lines 544 to 548 in 4ad9834