Commit 59d3097
Add MobileNet/ResNet8 layer-accurate ConvGrad kernel regression tests
15 single-op ConvGradX/W test directories that mirror actual backward
shapes in MobileNetV1 0.25x (stem conv, DW/PW blocks 0/1/3/7/11 incl.
stride-2 variants) and ResNet8 layer3_conv2. Each has inputs.npz
(seeded random), outputs.npz (PyTorch
`torch.nn.grad.conv2d_{input,weight}` reference), and a single-op ONNX model.
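
As a rough sketch of how such a reference pair could be produced (this is not the code in debug/gen_kernel_tests.py; the function names, the npz key names, and the depthwise stride-2 shape used below are illustrative assumptions):

```python
import os

import numpy as np
import torch


def make_conv_grad_reference(x_shape, w_shape, stride=1, padding=1, groups=1, seed=0):
    """Return (inputs, outputs) dicts of NumPy arrays for one conv backward case."""
    gen = torch.Generator().manual_seed(seed)  # seeded so the case is reproducible
    x = torch.randn(x_shape, generator=gen)
    w = torch.randn(w_shape, generator=gen)
    # The forward pass is only used to learn the shape of grad_output.
    y = torch.nn.functional.conv2d(x, w, stride=stride, padding=padding, groups=groups)
    dy = torch.randn(y.shape, generator=gen)
    # Reference gradients w.r.t. the input (ConvGradX) and the weight (ConvGradW).
    dx = torch.nn.grad.conv2d_input(
        x.shape, w, dy, stride=stride, padding=padding, groups=groups)
    dw = torch.nn.grad.conv2d_weight(
        x, w.shape, dy, stride=stride, padding=padding, groups=groups)
    inputs = {"x": x.numpy(), "w": w.numpy(), "dy": dy.numpy()}
    outputs = {"dx": dx.numpy(), "dw": dw.numpy()}
    return inputs, outputs


def save_case(dirpath, inputs, outputs):
    """Write inputs.npz and outputs.npz into dirpath (key names are assumptions)."""
    os.makedirs(dirpath, exist_ok=True)
    np.savez(os.path.join(dirpath, "inputs.npz"), **inputs)
    np.savez(os.path.join(dirpath, "outputs.npz"), **outputs)
```

For example, `make_conv_grad_reference((1, 8, 16, 16), (8, 1, 3, 3), stride=2, padding=1, groups=8)` would describe a hypothetical depthwise stride-2 case; passing the explicit input size to `conv2d_input` is what resolves the output-size ambiguity of strided convolutions.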
After the parent commit (ConvGradX global-pad + ConvGradW accum fix),
13/15 pass bit-exact. The failures:
- ConvGradX_DW_block_0 / ConvGradW_DW_block_0: tiler "single minimal
outer shape" legalization failure in standalone mode only; the same
shape tiles fine inside the full MobileNet training pipeline.
- ConvGradW_R8_L3_conv2: fails 100% due to the mixed C_out+HW tile
memset bug (same L1 dW slice re-memset'd across HW subtiles). ResNet8
training doesn't actually hit this case, so it's parked as a
known-issue regression test.
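
The memset failure mode can be illustrated with a toy model. The sketch below is not the actual kernel or tiler code; it assumes a 1x1 (pointwise) convolution and uses a hypothetical `memset_every_subtile` flag to contrast zeroing the dW slice once per C_out tile with zeroing it again before every HW subtile:

```python
def pointwise_dw(x, dy, co_tile, hw_tile, memset_every_subtile):
    """Tiled dW[co][ci] = sum over hw of dy[co][hw] * x[ci][hw] for a 1x1 conv."""
    c_in, c_out, hw = len(x), len(dy), len(x[0])
    dw = [[0.0] * c_in for _ in range(c_out)]
    for co0 in range(0, c_out, co_tile):
        co_range = range(co0, min(co0 + co_tile, c_out))
        for hw0 in range(0, hw, hw_tile):
            # Correct: clear the dW slice only before the first HW subtile.
            # Buggy: clear it before every HW subtile, discarding earlier sums.
            if memset_every_subtile or hw0 == 0:
                for co in co_range:
                    dw[co] = [0.0] * c_in
            for co in co_range:
                for ci in range(c_in):
                    for p in range(hw0, min(hw0 + hw_tile, hw)):
                        dw[co][ci] += dy[co][p] * x[ci][p]
    return dw


# All-ones data: every dW entry should equal the number of HW positions (8).
x = [[1.0] * 8 for _ in range(2)]    # C_in = 2, HW = 8
dy = [[1.0] * 8 for _ in range(4)]   # C_out = 4, HW = 8
reference = pointwise_dw(x, dy, co_tile=4, hw_tile=8, memset_every_subtile=False)
fixed = pointwise_dw(x, dy, co_tile=2, hw_tile=4, memset_every_subtile=False)
buggy = pointwise_dw(x, dy, co_tile=2, hw_tile=4, memset_every_subtile=True)
```

In this model `fixed` matches `reference`, while `buggy` keeps only the contribution of the last HW subtile, which is consistent with a test that fails on every element whenever both C_out and HW are tiled.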
The debug/gen_kernel_tests.py generator builds all shapes via
torch.nn.grad APIs for reproducibility.
1 parent a329c70 commit 59d3097
52 files changed
Lines changed: 98 additions & 0 deletions
File tree
- DeeployTest/Tests/Kernels/FP32
- ConvGradW_DW_block_0
- ConvGradW_DW_block_11_s2
- ConvGradW_DW_block_1_s2
- ConvGradW_DW_block_3_s2
- ConvGradW_DW_block_7_s2
- ConvGradW_PW_block_0
- ConvGradW_PW_block_11
- ConvGradW_R8_L3_conv2
- ConvGradW_Stem
- ConvGradX_DW_block_0
- ConvGradX_DW_block_11_s2
- ConvGradX_DW_block_1_s2
- ConvGradX_DW_block_3_s2
- ConvGradX_DW_block_7_s2
- ConvGradX_PW_block_0
- ConvGradX_PW_block_11
- ConvGradX_R8_L3_conv2
- debug
Binary file not shown.