Commit ecab71e

[NE16] enable --enableStrides for model tests: stride-2 convs on NE16
With the engine-aware DW NHWC fixup pass in place, the previous saturation failure on stride-2 NE16 dispatch is gone: the root cause was the global NHWC swap forcing the wrong layout, not a HAL-level bug.

Add --enableStrides alongside --enable-3x3 to the model fixture so all 27 MobileNetV1 convs go to NE16 (no cluster fallback).

gvsoc gap9.evk:
- PW-only:                            1 847 256 cyc   MAC/Cyc 4.05
- PW + DW-s1 (--enable-3x3):          1 190 437 cyc   MAC/Cyc 6.29
- All convs (--enable-3x3 + Strides):   845 217 cyc   MAC/Cyc 8.86

Final speedup vs PW-only baseline: 2.19x (-54.2% cycles).

NE16 dispatch count goes from 14 -> 28 (all 27 Convs + the final Gemm-as-PW); the cluster path runs only the residual MaxPool.

All 10 NE16 tests still pass (9 kernels + MobileNetV1).
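The headline figures can be re-derived from the three cycle counts reported for gvsoc gap9.evk. The sketch below is only a sanity check on the arithmetic; the dictionary names are illustrative and not part of the repository. It also multiplies each cycle count by its MAC/Cyc value, which should give roughly the same total MAC count for all three configurations since the network is unchanged.

```python
# Cycle counts and MAC/Cyc values as reported in the commit message.
cycles = {
    "PW-only": 1_847_256,
    "PW + DW-s1": 1_190_437,
    "All convs": 845_217,
}
mac_per_cycle = {"PW-only": 4.05, "PW + DW-s1": 6.29, "All convs": 8.86}

baseline = cycles["PW-only"]

# Speedup and cycle reduction of each configuration relative to PW-only.
speedup = {name: baseline / cyc for name, cyc in cycles.items()}
reduction_pct = {
    name: 100.0 * (baseline - cyc) / baseline for name, cyc in cycles.items()
}

# Implied total MACs (cycles * MAC/Cyc); expected to be nearly constant.
implied_macs = {name: cyc * mac_per_cycle[name] for name, cyc in cycles.items()}

for name in cycles:
    print(
        f"{name:12s} speedup {speedup[name]:.2f}x  "
        f"cycles -{reduction_pct[name]:.1f}%  "
        f"~{implied_macs[name] / 1e6:.2f} M MACs"
    )
```

The last row reproduces the 2.19x speedup and the 54.2% cycle reduction quoted above.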
1 parent b8a518a commit ecab71e

1 file changed

Lines changed: 1 addition & 1 deletion

DeeployTest/test_platforms.py
@@ -1057,7 +1057,7 @@ def test_gap9_w_ne16_tiled_models_l2_singlebuffer(test_params, deeploy_test_dir,
         l1 = l1,
         default_mem_level = "L2",
         double_buffer = False,
-        gen_args = ["--enable-3x3"],
+        gen_args = ["--enable-3x3", "--enableStrides"],
     )
     run_and_assert_test(test_name, config, skipgen, skipsim)
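How the generator consumes `gen_args` is not shown in this commit. As a rough illustration only, the sketch below assumes the two flags are plain boolean switches parsed with `argparse`; the parser and its help strings are hypothetical stand-ins, not Deeploy's actual code.

```python
import argparse

# Hypothetical stand-in for the generator's CLI; not Deeploy's actual parser.
parser = argparse.ArgumentParser(description="hypothetical generator options")
parser.add_argument("--enable-3x3", action="store_true",
                    help="assumed meaning: offload 3x3 depthwise convs to NE16")
parser.add_argument("--enableStrides", action="store_true",
                    help="assumed meaning: offload stride-2 convs to NE16")

# gen_args before and after this commit.
old_args = parser.parse_args(["--enable-3x3"])
new_args = parser.parse_args(["--enable-3x3", "--enableStrides"])

# argparse maps "--enable-3x3" to the attribute name "enable_3x3".
print(old_args.enable_3x3, old_args.enableStrides)
print(new_args.enable_3x3, new_args.enableStrides)
```

Under that assumption, the one-line change only flips the second switch on for the tiled L2 single-buffer model test and leaves the 3x3 setting as it was.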
