support emitting a generated function #667
Benchmark Results
Benchmark Plots
A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR. |
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##        release-0.9     #667      +/-   ##
===============================================
- Coverage     71.85%   71.36%   -0.49%
===============================================
  Files            14       14
  Lines           906      915       +9
===============================================
+ Hits            651      653       +2
- Misses          255      262       +7
View full report in Codecov by Sentry.
|
; ...
L33: ; preds = %L29
; │ @ none within `macro expansion` @ /home/vchuravy/src/KernelAbstractions/src/macros.jl:321
; │┌ @ multidimensional.jl:417 within `iterate`
br label %L22, !dbg !87, !llvm.loop !100
; ...
!100 = distinct !{!100, !101}
!101 = !{!"llvm.loop.unroll.count", i64 17}
Seems to work! |
|
Homework for myself: Write the kernel below with |
|
@vchuravy can this be merged? We would like to publish the package soon 😄 |
|
@fjwillemsen nobody expressed to me after my proposal that they actually wanted this, and I don't like to merge functionality without at least a clear user. |
|
I understand, I'll get @evelyne-ringoot in the loop and get back to you on this. Thank you! |
|
Hi, the original motivation for this was the performance difference between kernels where input values are used for loop unrolling and kernels where those values are defined as `const`. The scope is therefore not limited to specifying the loop unroll count; it is about making loop-unrolling constants available early in compilation, in particular for more complex functions. However, I can no longer reproduce a substantial performance difference in any kernel but one, and there the difference swings both ways depending on hyperparameters (sometimes `::Val` is better, sometimes `const`). Considering that, the master branch works fine for our application! |
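For readers following along, here is a minimal sketch of the three ways a loop bound can reach a function, as compared in the comment above: as a runtime argument, as a `::Val` type parameter, or as a global `const`. It is plain Julia rather than a KernelAbstractions kernel, and the function names (`sum_runtime`, `sum_val`, `sum_const`) and the constant `N_CONST` are made up for illustration. It only shows where the bound becomes known to the compiler and makes no claim about which variant is faster.

```julia
# Hypothetical helper names; plain Julia, no KernelAbstractions required.

# Variant 1: the trip count is an ordinary runtime argument, so the
# compiler has to treat it as unknown when compiling the loop.
function sum_runtime(x::AbstractVector, n::Int)
    acc = zero(eltype(x))
    for i in 1:n
        @inbounds acc += x[i]
    end
    return acc
end

# Variant 2: the trip count is lifted into the type domain with Val,
# so each distinct N gets its own specialization with a known bound.
function sum_val(x::AbstractVector, ::Val{N}) where {N}
    acc = zero(eltype(x))
    for i in 1:N
        @inbounds acc += x[i]
    end
    return acc
end

# Variant 3: the trip count is a global const, known when the
# function is compiled but fixed for the whole session.
const N_CONST = 4

function sum_const(x::AbstractVector)
    acc = zero(eltype(x))
    for i in 1:N_CONST
        @inbounds acc += x[i]
    end
    return acc
end

x = collect(1.0:8.0)

# All three variants compute the same result for the first four elements.
@assert sum_runtime(x, 4) == sum_val(x, Val(4)) == sum_const(x) == 10.0
```

Comparing the output of `@code_llvm` for each variant is one way to check whether the known bound actually changes the generated loop on a given Julia version.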
Motivated by #665
Proposed syntax
Sadly this doesn't quite work yet, since I need to handle the `$N` correctly.
Currently:
Whereas
Note that the QuoteNode got broken into smaller pieces, with the interpolated variable being left alone and the rest being passed to `_expr` and `QuoteNode`.