Fuse Convolution + Activation node as a part of graph transformations pipeline#276
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #276 +/- ##
==========================================
+ Coverage 84.42% 84.95% +0.53%
==========================================
Files 57 59 +2
Lines 3274 3710 +436
Branches 1989 2287 +298
==========================================
+ Hits 2764 3152 +388
- Misses 245 271 +26
- Partials 265 287 +22 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…_2023 into neiroyt/feature_fuseconvrelu e enter a commit message to explain why this merge is necessary, iall if it merges an updated upstream into a topic branch. starting with '#' will be ignored, and an empty message aborts
| } | ||
| std::shared_ptr<Layer> layer = layer_based_shared_copy(layer_to, options); | ||
| std::shared_ptr<Layer> layer; | ||
| if (layer_to->getName() == kConvRelu && |
There was a problem hiding this comment.
Can't we use layer_based_shared_copy() for kConvRelu case as well? This looks cumbersome
There was a problem hiding this comment.
layer_based_shared_copy() copies layer (shared_ptr layer_to in this case), also this func is used in graph.clone().
However any ConvRelu in new graph needs fields from ConvLayer+Relu, so here in general case a new function can be implemented, which constructs layer depending on what we want.
| start_ = layer->getID(); | ||
| } | ||
|
|
||
| void setInput(Tensor& vec) { |
There was a problem hiding this comment.
Is it possible to avoid introducing setInput/setOutput just for the one function call?
| void build_graph_linear(it_lab_ai::Graph& graph, it_lab_ai::Tensor& input, | ||
| it_lab_ai::Tensor& output, RuntimeOptions options, | ||
| bool comments) { | ||
| bool comments, bool enable_postops) { |
There was a problem hiding this comment.
I remember you mentioned that you need ito set enable_postops = false. Where is that?
| conv.run(input, output, options); | ||
| switch (input[0].get_type()) { | ||
| case Type::kInt: { | ||
| relu<int>(output[0]); | ||
| break; | ||
| } | ||
| case Type::kFloat: { | ||
| relu<float>(output[0]); | ||
| break; | ||
| } | ||
| default: { | ||
| throw std::runtime_error("Unsupported tensor type"); | ||
| } |
There was a problem hiding this comment.
Now I clearly see why conv + relu exec time is the same as with fused layers. You need to implement fsed implementation here to see the time difference instead of calling two layers
Closes #272