impr: Use `f16` and better subgroup shader in MNIST example by reczkok · Pull Request #2412 · software-mansion/TypeGPU

reczkok · 2026-04-27T16:43:08Z

No description provided.

github-actions · 2026-04-27T16:44:07Z

pkg.pr.new

packages
Ready to be installed by your favorite package manager ⬇️

https://pkg.pr.new/software-mansion/TypeGPU/typegpu@e8fec053c5a2b80438c3f69b8c0294093bc1626b

https://pkg.pr.new/software-mansion/TypeGPU/@typegpu/noise@e8fec053c5a2b80438c3f69b8c0294093bc1626b

https://pkg.pr.new/software-mansion/TypeGPU/unplugin-typegpu@e8fec053c5a2b80438c3f69b8c0294093bc1626b

github-actions · 2026-04-27T16:44:52Z

📊 Bundle Size Comparison

🟢 Decreased	➖ Unchanged	🔴 Increased	❔ Unknown
0	355	0	0

👀 Notable results

Static test results:

No major changes.

Dynamic test results:

No major changes.

📋 All results

Results table (354 entries):

Test	tsdown
dataImportEverything.ts	87.17 kB (➖)
dataImportOneDirect.ts	22.59 kB (➖)
dataImportOneStar.ts	22.59 kB (➖)
functionWithUseGpu.ts	282 B (➖)
functionWithoutUseGpu.ts	24 B (➖)
importEntireLibrary.ts	283.91 kB (➖)
stdImportEverything.ts	102.83 kB (➖)
stdImportOneDirect.ts	46.17 kB (➖)
stdImportOneStar.ts	46.17 kB (➖)
tgpuImportEverything.ts	256.39 kB (➖)
tgpuImportOne.ts	256.40 kB (➖)
MissingBindGroupsError from typegpu.ts	1.34 kB (➖)
MissingLinksError from typegpu.ts	201 B (➖)
MissingSlotValueError from typegpu.ts	151 B (➖)
MissingVertexBuffersError from typegpu.ts	1.35 kB (➖)
NotUniformError from typegpu.ts	1.30 kB (➖)
ResolutionError from typegpu.ts	1.55 kB (➖)
ShaderGenerator from typegpu.ts	552 B (➖)
Void from typegpudata.ts	734 B (➖)
WgslGenerator from typegpu.ts	106.41 kB (➖)
abs from typegpustd.ts	63.62 kB (➖)
acos from typegpustd.ts	63.62 kB (➖)
acosh from typegpustd.ts	63.62 kB (➖)
add from typegpustd.ts	46.17 kB (➖)
align from typegpudata.ts	24.28 kB (➖)
alignmentOf from typegpudata.ts	19.82 kB (➖)
allEq from typegpustd.ts	49.11 kB (➖)
all from typegpustd.ts	49.11 kB (➖)
and from typegpustd.ts	49.11 kB (➖)
any from typegpustd.ts	49.12 kB (➖)
arrayLength from typegpustd.ts	12.32 kB (➖)
arrayOf from typegpudata.ts	24.08 kB (➖)
asin from typegpustd.ts	63.62 kB (➖)
asinh from typegpustd.ts	63.62 kB (➖)
atan2 from typegpustd.ts	63.62 kB (➖)
atan from typegpustd.ts	63.62 kB (➖)
atanh from typegpustd.ts	63.62 kB (➖)
atomicAdd from typegpustd.ts	13.68 kB (➖)
atomicAnd from typegpustd.ts	13.68 kB (➖)
atomicLoad from typegpustd.ts	13.67 kB (➖)
atomicMax from typegpustd.ts	13.68 kB (➖)
atomicMin from typegpustd.ts	13.68 kB (➖)
atomicOr from typegpustd.ts	13.68 kB (➖)
atomicStore from typegpustd.ts	13.68 kB (➖)
atomicSub from typegpustd.ts	13.68 kB (➖)
atomicXor from typegpustd.ts	13.68 kB (➖)
atomic from typegpudata.ts	779 B (➖)
bitShiftLeft from typegpustd.ts	46.17 kB (➖)
bitShiftRight from typegpustd.ts	46.17 kB (➖)
bitcastU32toF32 from typegpustd.ts	41.99 kB (➖)
bitcastU32toI32 from typegpustd.ts	42.00 kB (➖)
bool from typegpudata.ts	10.86 kB (➖)
builtin from typegpudata.ts	26.46 kB (➖)
ceil from typegpustd.ts	63.62 kB (➖)
clamp from typegpustd.ts	63.62 kB (➖)
common from typegpu.ts	55.83 kB (➖)
comparisonSampler from typegpudata.ts	753 B (➖)
copy from typegpustd.ts	12.31 kB (➖)
cos from typegpustd.ts	63.62 kB (➖)
cosh from typegpustd.ts	63.62 kB (➖)
countLeadingZeros from typegpustd.ts	63.62 kB (➖)
countOneBits from typegpustd.ts	63.62 kB (➖)
countTrailingZeros from typegpustd.ts	63.62 kB (➖)
cross from typegpustd.ts	63.62 kB (➖)
d from typegpu.ts	84.85 kB (➖)
deepEqual from typegpudata.ts	2.19 kB (➖)
degrees from typegpustd.ts	63.62 kB (➖)
determinant from typegpustd.ts	63.62 kB (➖)
disarrayOf from typegpudata.ts	12.85 kB (➖)
discard from typegpustd.ts	12.07 kB (➖)
distance from typegpustd.ts	63.61 kB (➖)
div from typegpustd.ts	46.17 kB (➖)
dot4I8Packed from typegpustd.ts	63.62 kB (➖)
dot4U8Packed from typegpustd.ts	63.61 kB (➖)
dot from typegpustd.ts	63.61 kB (➖)
dpdxCoarse from typegpustd.ts	12.92 kB (➖)
dpdxFine from typegpustd.ts	12.92 kB (➖)
dpdx from typegpustd.ts	12.91 kB (➖)
dpdyCoarse from typegpustd.ts	12.92 kB (➖)
dpdyFine from typegpustd.ts	12.92 kB (➖)
dpdy from typegpustd.ts	12.92 kB (➖)
eq from typegpustd.ts	49.11 kB (➖)
exp2 from typegpustd.ts	63.62 kB (➖)
exp from typegpustd.ts	63.62 kB (➖)
extensionEnabled from typegpustd.ts	12.41 kB (➖)
extractBits from typegpustd.ts	63.62 kB (➖)
f16 from typegpudata.ts	10.86 kB (➖)
f32 from typegpudata.ts	10.86 kB (➖)
faceForward from typegpustd.ts	63.62 kB (➖)
firstLeadingBit from typegpustd.ts	63.62 kB (➖)
firstTrailingBit from typegpustd.ts	63.62 kB (➖)
float16 from typegpudata.ts	18.78 kB (➖)
float16x2 from typegpudata.ts	18.78 kB (➖)
float16x4 from typegpudata.ts	18.78 kB (➖)
float32 from typegpudata.ts	18.78 kB (➖)
float32x2 from typegpudata.ts	18.78 kB (➖)
float32x3 from typegpudata.ts	18.78 kB (➖)
float32x4 from typegpudata.ts	18.78 kB (➖)
floor from typegpustd.ts	63.62 kB (➖)
fma from typegpustd.ts	63.62 kB (➖)
formatToWGSLType from typegpudata.ts	18.77 kB (➖)
fract from typegpustd.ts	63.61 kB (➖)
frexp from typegpustd.ts	63.61 kB (➖)
fwidthCoarse from typegpustd.ts	12.92 kB (➖)
fwidthFine from typegpustd.ts	12.92 kB (➖)
fwidth from typegpustd.ts	12.92 kB (➖)
ge from typegpustd.ts	49.12 kB (➖)
getLongestContiguousPrefix from typegpudata.ts	22.61 kB (➖)
gt from typegpustd.ts	49.12 kB (➖)
i32 from typegpudata.ts	10.86 kB (➖)
identity2 from typegpustd.ts	24.80 kB (➖)
identity3 from typegpustd.ts	24.80 kB (➖)
identity4 from typegpustd.ts	24.80 kB (➖)
insertBits from typegpustd.ts	63.62 kB (➖)
interpolate from typegpudata.ts	24.28 kB (➖)
invariant from typegpudata.ts	24.65 kB (➖)
inverseSqrt from typegpustd.ts	63.62 kB (➖)
isAccessor from typegpu.ts	65 B (➖)
isAlignAttrib from typegpudata.ts	755 B (➖)
isAtomic from typegpudata.ts	755 B (➖)
isBufferShorthand from typegpu.ts	1.81 kB (➖)
isBuffer from typegpu.ts	86.76 kB (➖)
isBuiltinAttrib from typegpudata.ts	757 B (➖)
isBuiltin from typegpudata.ts	22.54 kB (➖)
isCloseTo from typegpustd.ts	49.12 kB (➖)
isComparisonSampler from typegpu.ts	61.21 kB (➖)
isContiguous from typegpudata.ts	22.60 kB (➖)
isData from typegpudata.ts	1.81 kB (➖)
isDecorated from typegpudata.ts	758 B (➖)
isDisarray from typegpudata.ts	1.11 kB (➖)
isInterpolateAttrib from typegpudata.ts	761 B (➖)
isLazy from typegpu.ts	61 B (➖)
isLocationAttrib from typegpudata.ts	758 B (➖)
isLooseData from typegpudata.ts	1.16 kB (➖)
isLooseDecorated from typegpudata.ts	1.12 kB (➖)
isMutableAccessor from typegpu.ts	73 B (➖)
isPackedData from typegpudata.ts	18.84 kB (➖)
isPtr from typegpudata.ts	752 B (➖)
isSampler from typegpu.ts	61.20 kB (➖)
isSizeAttrib from typegpudata.ts	754 B (➖)
isSlot from typegpu.ts	61 B (➖)
isTexture from typegpu.ts	61.19 kB (➖)
isTgpuComputeFn from typegpu.ts	69 B (➖)
isTgpuFn from typegpu.ts	765 B (➖)
isTgpuFragmentFn from typegpu.ts	70 B (➖)
isTgpuVertexFn from typegpu.ts	68 B (➖)
isUnstruct from typegpudata.ts	1.11 kB (➖)
isUsableAsRender from typegpu.ts	55 B (➖)
isUsableAsSampled from typegpu.ts	56 B (➖)
isUsableAsStorage from typegpu.ts	56 B (➖)
isUsableAsUniform from typegpu.ts	61.18 kB (➖)
isUsableAsVertex from typegpu.ts	86.75 kB (➖)
isVariable from typegpu.ts	63.00 kB (➖)
isWgslArray from typegpudata.ts	754 B (➖)
isWgslData from typegpudata.ts	1.31 kB (➖)
isWgslStruct from typegpudata.ts	755 B (➖)
ldexp from typegpustd.ts	63.62 kB (➖)
le from typegpustd.ts	49.12 kB (➖)
length from typegpustd.ts	63.61 kB (➖)
location from typegpudata.ts	24.28 kB (➖)
log2 from typegpustd.ts	63.62 kB (➖)
log from typegpustd.ts	63.62 kB (➖)
lt from typegpustd.ts	49.11 kB (➖)
mat2x2f from typegpudata.ts	24.80 kB (➖)
mat3x3f from typegpudata.ts	24.80 kB (➖)
mat4x4f from typegpudata.ts	24.80 kB (➖)
matToArray from typegpudata.ts	24.93 kB (➖)
max from typegpustd.ts	63.62 kB (➖)
memoryLayoutOf from typegpudata.ts	39.91 kB (➖)
min from typegpustd.ts	63.62 kB (➖)
mix from typegpustd.ts	63.61 kB (➖)
mod from typegpustd.ts	46.17 kB (➖)
modf from typegpustd.ts	63.61 kB (➖)
mul from typegpustd.ts	46.17 kB (➖)
ne from typegpustd.ts	49.11 kB (➖)
neg from typegpustd.ts	46.16 kB (➖)
normalize from typegpustd.ts	63.62 kB (➖)
not from typegpustd.ts	49.11 kB (➖)
or from typegpustd.ts	49.11 kB (➖)
pack2x16float from typegpustd.ts	33.81 kB (➖)
pack4x8unorm from typegpustd.ts	33.81 kB (➖)
packedFormats from typegpudata.ts	18.79 kB (➖)
patchArrayBuffer from typegpu.ts	49.05 kB (➖)
pow from typegpustd.ts	63.62 kB (➖)
ptrFn from typegpudata.ts	859 B (➖)
ptrHandle from typegpudata.ts	851 B (➖)
ptrPrivate from typegpudata.ts	858 B (➖)
ptrStorage from typegpudata.ts	856 B (➖)
ptrUniform from typegpudata.ts	852 B (➖)
ptrWorkgroup from typegpudata.ts	860 B (➖)
quantizeToF16 from typegpustd.ts	63.62 kB (➖)
radians from typegpustd.ts	63.62 kB (➖)
range from typegpustd.ts	12.68 kB (➖)
readFromArrayBuffer from typegpu.ts	49.59 kB (➖)
ref from typegpudata.ts	4.18 kB (➖)
reflect from typegpustd.ts	63.62 kB (➖)
refract from typegpustd.ts	63.62 kB (➖)
reverseBits from typegpustd.ts	63.62 kB (➖)
rotateX4 from typegpustd.ts	46.93 kB (➖)
rotateY4 from typegpustd.ts	46.93 kB (➖)
rotateZ4 from typegpustd.ts	46.93 kB (➖)
rotationX4 from typegpustd.ts	24.80 kB (➖)
rotationY4 from typegpustd.ts	24.80 kB (➖)
rotationZ4 from typegpustd.ts	24.80 kB (➖)
round from typegpustd.ts	63.62 kB (➖)
sampler from typegpudata.ts	742 B (➖)
saturate from typegpustd.ts	63.62 kB (➖)
scale4 from typegpustd.ts	46.93 kB (➖)
scaling4 from typegpustd.ts	24.80 kB (➖)
select from typegpustd.ts	49.12 kB (➖)
sign from typegpustd.ts	63.62 kB (➖)
sin from typegpustd.ts	63.62 kB (➖)
sinh from typegpustd.ts	63.62 kB (➖)
sint16 from typegpudata.ts	18.78 kB (➖)
sint16x2 from typegpudata.ts	18.78 kB (➖)
sint16x4 from typegpudata.ts	18.78 kB (➖)
sint32 from typegpudata.ts	18.78 kB (➖)
sint32x2 from typegpudata.ts	18.78 kB (➖)
sint32x3 from typegpudata.ts	18.78 kB (➖)
sint32x4 from typegpudata.ts	18.78 kB (➖)
sint8 from typegpudata.ts	18.78 kB (➖)
sint8x2 from typegpudata.ts	18.78 kB (➖)
sint8x4 from typegpudata.ts	18.78 kB (➖)
sizeOf from typegpudata.ts	22.59 kB (➖)
size from typegpudata.ts	24.28 kB (➖)
smoothstep from typegpustd.ts	63.62 kB (➖)
snorm16 from typegpudata.ts	18.78 kB (➖)
snorm16x2 from typegpudata.ts	18.78 kB (➖)
snorm16x4 from typegpudata.ts	18.78 kB (➖)
snorm8 from typegpudata.ts	18.78 kB (➖)
snorm8x2 from typegpudata.ts	18.78 kB (➖)
snorm8x4 from typegpudata.ts	18.78 kB (➖)
sqrt from typegpustd.ts	63.62 kB (➖)
std from typegpu.ts	100.05 kB (➖)
step from typegpustd.ts	63.62 kB (➖)
storageBarrier from typegpustd.ts	13.68 kB (➖)
struct from typegpudata.ts	3.51 kB (➖)
sub from typegpustd.ts	46.17 kB (➖)
subgroupAdd from typegpustd.ts	21.84 kB (➖)
subgroupAll from typegpustd.ts	21.85 kB (➖)
subgroupAnd from typegpustd.ts	21.85 kB (➖)
subgroupAny from typegpustd.ts	21.85 kB (➖)
subgroupBallot from typegpustd.ts	21.85 kB (➖)
subgroupBroadcastFirst from typegpustd.ts	21.85 kB (➖)
subgroupBroadcast from typegpustd.ts	21.85 kB (➖)
subgroupElect from typegpustd.ts	21.85 kB (➖)
subgroupExclusiveAdd from typegpustd.ts	21.85 kB (➖)
subgroupExclusiveMul from typegpustd.ts	21.85 kB (➖)
subgroupInclusiveAdd from typegpustd.ts	21.85 kB (➖)
subgroupInclusiveMul from typegpustd.ts	21.85 kB (➖)
subgroupMax from typegpustd.ts	21.85 kB (➖)
subgroupMin from typegpustd.ts	21.85 kB (➖)
subgroupMul from typegpustd.ts	21.85 kB (➖)
subgroupOr from typegpustd.ts	21.85 kB (➖)
subgroupShuffleDown from typegpustd.ts	21.85 kB (➖)
subgroupShuffleUp from typegpustd.ts	21.85 kB (➖)
subgroupShuffleXor from typegpustd.ts	21.85 kB (➖)
subgroupShuffle from typegpustd.ts	21.85 kB (➖)
subgroupXor from typegpustd.ts	21.85 kB (➖)
tan from typegpustd.ts	63.62 kB (➖)
tanh from typegpustd.ts	63.62 kB (➖)
texture1d from typegpudata.ts	11.32 kB (➖)
texture2dArray from typegpudata.ts	11.34 kB (➖)
texture2d from typegpudata.ts	11.32 kB (➖)
texture3d from typegpudata.ts	11.32 kB (➖)
textureBarrier from typegpustd.ts	13.68 kB (➖)
textureCubeArray from typegpudata.ts	11.34 kB (➖)
textureCube from typegpudata.ts	11.32 kB (➖)
textureDepth2dArray from typegpudata.ts	11.33 kB (➖)
textureDepth2d from typegpudata.ts	11.31 kB (➖)
textureDepthCubeArray from typegpudata.ts	11.33 kB (➖)
textureDepthCube from typegpudata.ts	11.31 kB (➖)
textureDepthMultisampled2d from typegpudata.ts	11.33 kB (➖)
textureDimensions from typegpustd.ts	23.60 kB (➖)
textureExternal from typegpudata.ts	873 B (➖)
textureGather from typegpustd.ts	23.60 kB (➖)
textureLoad from typegpustd.ts	23.61 kB (➖)
textureMultisampled2d from typegpudata.ts	11.34 kB (➖)
textureSampleBaseClampToEdge from typegpustd.ts	23.61 kB (➖)
textureSampleBias from typegpustd.ts	23.61 kB (➖)
textureSampleCompareLevel from typegpustd.ts	23.61 kB (➖)
textureSampleCompare from typegpustd.ts	23.61 kB (➖)
textureSampleGrad from typegpustd.ts	23.61 kB (➖)
textureSampleLevel from typegpustd.ts	23.61 kB (➖)
textureSample from typegpustd.ts	23.61 kB (➖)
textureStorage1d from typegpudata.ts	1.01 kB (➖)
textureStorage2dArray from typegpudata.ts	1.03 kB (➖)
textureStorage2d from typegpudata.ts	1.01 kB (➖)
textureStorage3d from typegpudata.ts	1.01 kB (➖)
textureStore from typegpustd.ts	23.61 kB (➖)
tgpu.accessor from typegpu.ts	256.40 kB (➖)
tgpu.bindGroupLayout from typegpu.ts	256.40 kB (➖)
tgpu.comptime from typegpu.ts	256.40 kB (➖)
tgpu.computeFn from typegpu.ts	256.40 kB (➖)
tgpu.const from typegpu.ts	256.39 kB (➖)
tgpu.fn from typegpu.ts	256.39 kB (➖)
tgpu.fragmentFn from typegpu.ts	256.40 kB (➖)
tgpu.initFromDevice from typegpu.ts	256.40 kB (➖)
tgpu.init from typegpu.ts	256.39 kB (➖)
tgpu.lazy from typegpu.ts	256.39 kB (➖)
tgpu.mutableAccessor from typegpu.ts	256.40 kB (➖)
tgpu.privateVar from typegpu.ts	256.40 kB (➖)
tgpu.resolveWithContext from typegpu.ts	256.41 kB (➖)
tgpu.resolve from typegpu.ts	256.39 kB (➖)
tgpu.slot from typegpu.ts	256.39 kB (➖)
tgpu.unroll from typegpu.ts	256.39 kB (➖)
tgpu.vertexFn from typegpu.ts	256.40 kB (➖)
tgpu.vertexLayout from typegpu.ts	256.40 kB (➖)
tgpu.workgroupVar from typegpu.ts	256.40 kB (➖)
tgpu from typegpu.ts	256.39 kB (➖)
translate4 from typegpustd.ts	46.93 kB (➖)
translation4 from typegpustd.ts	24.80 kB (➖)
transpose from typegpustd.ts	63.62 kB (➖)
trunc from typegpustd.ts	63.62 kB (➖)
u16 from typegpudata.ts	10.89 kB (➖)
u32 from typegpudata.ts	10.86 kB (➖)
uint16 from typegpudata.ts	18.78 kB (➖)
uint16x2 from typegpudata.ts	18.78 kB (➖)
uint16x4 from typegpudata.ts	18.78 kB (➖)
uint32 from typegpudata.ts	18.78 kB (➖)
uint32x2 from typegpudata.ts	18.78 kB (➖)
uint32x3 from typegpudata.ts	18.78 kB (➖)
uint32x4 from typegpudata.ts	18.78 kB (➖)
uint8 from typegpudata.ts	18.77 kB (➖)
uint8x2 from typegpudata.ts	18.78 kB (➖)
uint8x4 from typegpudata.ts	18.78 kB (➖)
unorm10 10 10 2 from typegpudata.ts	18.78 kB (➖)
unorm16 from typegpudata.ts	18.78 kB (➖)
unorm16x2 from typegpudata.ts	18.78 kB (➖)
unorm16x4 from typegpudata.ts	18.78 kB (➖)
unorm8 from typegpudata.ts	18.78 kB (➖)
unorm8x2 from typegpudata.ts	18.78 kB (➖)
unorm8x4 bgra from typegpudata.ts	18.78 kB (➖)
unorm8x4 from typegpudata.ts	18.78 kB (➖)
unpack2x16float from typegpustd.ts	33.81 kB (➖)
unpack4x8unorm from typegpustd.ts	33.81 kB (➖)
unstruct from typegpudata.ts	1.65 kB (➖)
vec2b from typegpudata.ts	17.28 kB (➖)
vec2f from typegpudata.ts	17.28 kB (➖)
vec2h from typegpudata.ts	17.28 kB (➖)
vec2i from typegpudata.ts	17.28 kB (➖)
vec2u from typegpudata.ts	17.28 kB (➖)
vec3b from typegpudata.ts	17.28 kB (➖)
vec3f from typegpudata.ts	17.28 kB (➖)
vec3h from typegpudata.ts	17.28 kB (➖)
vec3i from typegpudata.ts	17.28 kB (➖)
vec3u from typegpudata.ts	17.28 kB (➖)
vec4b from typegpudata.ts	17.28 kB (➖)
vec4f from typegpudata.ts	17.28 kB (➖)
vec4h from typegpudata.ts	17.28 kB (➖)
vec4i from typegpudata.ts	17.28 kB (➖)
vec4u from typegpudata.ts	17.28 kB (➖)
workgroupBarrier from typegpustd.ts	13.68 kB (➖)
writeToArrayBuffer from typegpu.ts	48.85 kB (➖)

If you wish to run a comparison for other, slower bundlers, run the 'Tree-shake test' from the GitHub Actions menu.

github-actions · 2026-04-27T16:46:03Z

Resolution Time Benchmark

---
config:
  themeVariables:
    xyChart:
      plotColorPalette: "#E63946, #3B82F6, #059669"
---
xychart
  title "Random Branching (🔴 PR | 🔵 main | 🟢 release)"
  x-axis "max depth" [1, 2, 3, 4, 5, 6, 7, 8]
  y-axis "time (ms)"
  line [0.95, 1.93, 4.42, 6.49, 7.69, 11.01, 22.11, 22.52]
  line [0.99, 2.06, 4.40, 6.69, 7.60, 11.03, 21.56, 24.80]
  line [0.91, 1.89, 4.39, 6.48, 7.36, 10.73, 21.20, 22.31]

---
config:
  themeVariables:
    xyChart:
      plotColorPalette: "#E63946, #3B82F6, #059669"
---
xychart
  title "Linear Recursion (🔴 PR | 🔵 main | 🟢 release)"
  x-axis "max depth" [1, 2, 3, 4, 5, 6, 7, 8]
  y-axis "time (ms)"
  line [0.34, 0.53, 0.64, 0.81, 1.13, 1.18, 1.40, 1.61]
  line [0.29, 0.54, 0.71, 0.87, 1.18, 1.25, 1.47, 1.58]
  line [0.31, 0.52, 0.71, 0.87, 1.12, 1.14, 1.38, 1.53]

---
config:
  themeVariables:
    xyChart:
      plotColorPalette: "#E63946, #3B82F6, #059669"
---
xychart
  title "Full Tree (🔴 PR | 🔵 main | 🟢 release)"
  x-axis "max depth" [1, 2, 3, 4, 5, 6, 7, 8]
  y-axis "time (ms)"
  line [0.89, 2.07, 4.26, 6.69, 12.73, 26.66, 56.97, 119.53]
  line [0.82, 2.17, 3.63, 6.59, 12.54, 26.94, 55.85, 114.15]
  line [0.83, 2.14, 4.31, 6.51, 13.08, 27.06, 55.75, 113.60]

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

Copilot · 2026-04-29T10:59:49Z

+export function downloadLayers(
+  root: TgpuRoot,
+  floatShcema: d.F32 | d.F16,
+): Promise<[LayerData, LayerData][]> {


Typo in parameter name `floatShcema` (should be `floatSchema`). Keeping the misspelling makes the API harder to read and search, and increases the chance of propagating the typo to call sites.

Copilot · 2026-04-29T10:59:49Z

+      const outputCount = buffers[i].biases.dataType.elementCount;
+      boundPipeline.dispatchWorkgroups(
+        subgroupPipeline ? outputCount : Math.ceil(outputCount / WORKGROUP_SIZE),
+      );


`dispatchWorkgroups` uses `outputCount` when the subgroup pipeline is selected, but `subgroupCompute` computes `num_subgroups` outputs per workgroup (`neuronIndex = wid.x * nsg + sgid`). This over-dispatches workgroups by a factor of `nsg` (e.g., 2x for 64 threads with 32-wide subgroups), doing unnecessary work for larger layers. Consider either dispatching `ceil(outputCount / outputsPerWorkgroup)` (if you can determine `outputsPerWorkgroup`) or adjusting the shader/work mapping so each workgroup corresponds to exactly one output when the dispatch count must be `outputCount`.
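The dispatch arithmetic suggested in this comment could look roughly like the sketch below. It is only an illustration: `workgroupsToDispatch` and `assumedSubgroupSize` are hypothetical names that do not appear in the PR, and the subgroup size is hard-coded here because the exact value used at run time is device-dependent and may not be known on the host.

```typescript
// Hypothetical helper (not from the PR): number of workgroups needed when
// every workgroup produces `outputsPerWorkgroup` outputs.
function workgroupsToDispatch(
  outputCount: number,
  outputsPerWorkgroup: number,
): number {
  if (!Number.isInteger(outputsPerWorkgroup) || outputsPerWorkgroup < 1) {
    throw new RangeError('outputsPerWorkgroup must be a positive integer');
  }
  // Round up so that a partially filled last workgroup is still dispatched.
  return Math.ceil(outputCount / outputsPerWorkgroup);
}

// The case from the comment: 64 threads per workgroup and 32-wide subgroups
// give 2 subgroups, so each workgroup computes 2 outputs.
const WORKGROUP_SIZE = 64;
const assumedSubgroupSize = 32; // assumption: varies between devices
const outputsPerWorkgroup = WORKGROUP_SIZE / assumedSubgroupSize;

// A layer with 10 outputs then needs 5 workgroups rather than 10.
const count = workgroupsToDispatch(10, outputsPerWorkgroup);
console.log(count);
```

Because the count is rounded up, the shader would still need to skip work when `neuronIndex` is not smaller than the number of outputs.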

aleksanderkatan

Nice!

aleksanderkatan · 2026-05-14T10:19:14Z

 const context = canvas.getContext('2d') as CanvasRenderingContext2D;

 const bars = Array.from(document.querySelectorAll('.bar')) as HTMLDivElement[];
 const subgroupsEl = document.getElementById('subgroups-status') as HTMLSpanElement;


Could we also include a status for `f16`? I don't think there is currently a way to know whether the shader runs on `f16` or `f32`.
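One possible way to derive such a status is sketched below. `'shader-f16'` and `'subgroups'` are WebGPU feature names, but `describeCapabilities` and `FeatureSet` are hypothetical, and the sketch assumes the example chooses `f16` whenever the feature is available, which this thread does not confirm.

```typescript
// Minimal structural type so the sketch runs without WebGPU typings;
// GPUSupportedFeatures (adapter.features / device.features) is set-like
// and exposes the same `has` method.
interface FeatureSet {
  has(name: string): boolean;
}

interface CapabilityStatus {
  precision: 'f16' | 'f32';
  subgroups: boolean;
}

// Hypothetical helper (not from the PR): maps supported features to the
// labels a status element could display.
function describeCapabilities(features: FeatureSet): CapabilityStatus {
  return {
    // Assumption: the shader uses f16 whenever 'shader-f16' is available.
    precision: features.has('shader-f16') ? 'f16' : 'f32',
    subgroups: features.has('subgroups'),
  };
}

// Usage with a plain Set standing in for the device's feature set.
const status = describeCapabilities(new Set(['shader-f16']));
console.log(`precision: ${status.precision}, subgroups: ${status.subgroups}`);
```

In the example page, the result could be written to an element next to the existing `subgroups-status` span.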

reczkok requested a review from iwoplaza April 27, 2026 16:44

reczkok requested review from aleksanderkatan and cieplypolar April 27, 2026 16:44

reczkok requested a review from Copilot April 28, 2026 08:53

Copilot started reviewing on behalf of reczkok April 28, 2026 08:53

reczkok force-pushed the impr/cleaner-mnist branch from 79b1935 to 4328747 Compare April 29, 2026 10:53

reczkok requested a review from Copilot April 29, 2026 10:54

Copilot started reviewing on behalf of reczkok April 29, 2026 10:54

Copilot AI reviewed Apr 29, 2026

aleksanderkatan approved these changes May 14, 2026

reczkok added 2 commits May 26, 2026 17:09

use f16 and better subgroup shader

7d1a682

do not over-dispatch

82d84e8

reczkok force-pushed the impr/cleaner-mnist branch from 4328747 to 82d84e8 Compare May 26, 2026 15:09

fixes

e8fec05

reczkok wants to merge 3 commits into release from impr/cleaner-mnist
