Commit 757fc94

vksnk authored and xnnpack-bot committed
Prevent scheduling of ki/ko loops in packing.
To keep the rest of the loops fused, I added "identity" splits where step == extent. As a side effect, the packed buffer is now stored inside of the loop.

PiperOrigin-RevId: 918002726
1 parent ce14e18 commit 757fc94

1 file changed

Lines changed: 18 additions & 0 deletions

ynnpack/subgraph/dot.cc

@@ -501,6 +501,24 @@ uint32_t define_pack_b(ynn_subgraph_t subgraph, const dot_type& type,
   if (num_k_dims > 1) {
     sched->force_root = true;
   }
+  // Use "identity" splits where step == extent which will always fuse into
+  // parent loop as long as extents match.
+  // TODO(vksnk): Ideally we should select better steps, so the packing is
+  // parallelized even if it's not fused.
+  for (int i = 0; i < output.extents.size(); i++) {
+    sched->loop_splits.push_back({dims[i], output.physical_extent(i),
+                                  slinky::loop::serial,
+                                  output.physical_extent(i),
+                                  /*step_is_required=*/false});
+  }
+  // ki (dim 0) and ko (dim 2) must not be split.
+  // We enforce this by requiring their step to be equal to their extent.
+  sched->loop_splits[0].step_is_required = true;
+  sched->loop_splits[2].step_is_required = true;
+  // We split the n into no and ni, so in order to be able to fuse it with
+  // n-loop we trick scheduler into thinking that extent is matching n-extent.
+  sched->loop_splits[3].extent = input.extent(0);
+
   func.user_data() = sched.get();
   runtime.scheduling_info_storage.push_back(std::move(sched));
