Replay constant propagation on top of rebased DFA framework#663
Replay constant propagation on top of rebased DFA framework#663mlange05 wants to merge 28 commits into
Conversation
|
Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/663/index.html |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #663 +/- ##
==========================================
+ Coverage 94.20% 94.22% +0.01%
==========================================
Files 288 290 +2
Lines 47625 47920 +295
==========================================
+ Hits 44867 45154 +287
- Misses 2758 2766 +8
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
353e011 to
6e43e62
Compare
|
Happy to help improve this where required. Some things that come to mind are:
|
928ac9f to
e127e15
Compare
d04ba05 to
450b313
Compare
|
Ok, following the first stage agentic adaptation of the original feature implementation I went back to refactor this entire thing a little and bring it more in line with our more widespread conventions in the
And final comment, I don't think this feature is entirely complete or fully tested yet - but I believe it's in a good enough condition to merge. Therefore, @reuterbal @awnawab I'd appreciate a review to get this over the line (with potential for more corner-case massaging afterwards. Also, many thanks to @Andrew-Beggs-ECMWF for the original implementation. I don't think a strict review is required for this, but if you have time for a quick look, comments are always welcome. |
awnawab
left a comment
There was a problem hiding this comment.
Many thanks @Andrew-Beggs-ECMWF and @mlange05 for this very powerful transformation 🙏 I've left a bunch of nitpicky comments in the implementation and highlighted a few instances of missing coverage. Going by the PR description, these can probably be patched afterwards. However there is one corner case I believe where the constant propagation can be wrong for a loop with constant bounds and a loop variant assignment in the body. If I'm mistaken then please ignore and merge anyway, otherwise it would be good to fix that corner case before this is merged.
| def __init__(self, **kwargs): | ||
| super().__init__(**kwargs) | ||
|
|
||
| self.constants_map = {} |
There was a problem hiding this comment.
This appears unused, and we seem to instead rely on looking up constants_map from the kwargs.
|
|
||
|
|
||
| def _array_indices_to_accesses(dimensions, shape): | ||
| accesses = functools.partial(itertools.product) |
There was a problem hiding this comment.
This is very fancy, and I learned a new thing, but perhaps this would be more readable:
arg_lists = []
for dimension in dimensions:
if isinstance(dimension, sym.RangeIndex):
arg_lists.append([sym.IntLiteral(v) for v in get_pyrange(...)])
else:
arg_lists.append([dimension])
return itertools.product(*arg_lists)
| accesses = functools.partial(itertools.product) | ||
| for count, dimension in enumerate(dimensions): | ||
| if isinstance(dimension, sym.RangeIndex): | ||
| start = dimension.start if dimension.start is not None else sym.IntLiteral(1) |
There was a problem hiding this comment.
Couldn't these checks just be if dimension.start?
| new_condition = mapper(o.condition, constants_map=constants_map) | ||
|
|
||
| # Pass two copies of the constants map forward ... | ||
| with dict_override(kwargs, {'constants_map': deepcopy(constants_map)}): |
There was a problem hiding this comment.
Might be worth adding a TODO here to check if we know whether the condition evaluates to true or false, in which case we can make the constant propagation a bit less conservative. Classic example would be after inlining when if(present(arg)) checks are actually resolved.
There was a problem hiding this comment.
Ah, yes, but we already have a dead-code removal transformation that would do this if applied straight after the constant propagation. Arguably this could be added as an optional step in the outer wrapper (much like the loop unrolling) to capture the cross-cutting effects (remove one if branch, all assignments in the other can be propagated). However, I think this could be left for a follow-up PR and I propose to raise an issue for this at this point?
| if not non_literals and not isinstance(new_lhs, sym.Array): | ||
| update_constants_map(new_lhs, new_rhs, constants_map) | ||
| else: | ||
| invalidate_constants_map(new_lhs, constants_map) |
There was a problem hiding this comment.
I could be mistaken, but I don't think we are hitting this line in the case of an array with all literal dimensions and literal rhs.
| mapper = ConstantPropagationMapper() | ||
|
|
||
| assert mapper(sym.Sum((sym.IntLiteral(1), sym.IntLiteral(2)))) == sym.IntLiteral(3) | ||
| assert mapper(sym.Quotient(sym.IntLiteral(7), sym.IntLiteral(2))) == sym.IntLiteral(3) |
There was a problem hiding this comment.
Might be worth adding a test here for negative integer division.
| loop_constants_map = deepcopy(constants_map) | ||
|
|
||
| for assign in assignments: | ||
| if not set(FindVariables().visit(assign.rhs)).intersection(lhs_vars): |
There was a problem hiding this comment.
Should we pop the relevant entry if o.variable is in the rhs? Specifically for a case like:
integer :: d = 0
do i = 1, 5 ! const bounds
d = i * 2 ! loop-variant: RHS uses i which is in lhs_vars
end do
! outer map still has d = 0 here — but at runtime d = 10
There was a problem hiding this comment.
Good point, but again, I think this feature extension could be added in a follow-on PR. I propose parking this in an issue again?
There was a problem hiding this comment.
Edit: I think the "advanced" analysis of using the last-known value for i as a constant that could be added to the constant map would be a future extension - the incorrect invalidation is indeed a bug. Will fix ...
| assign_kwargs['constants_map'] = loop_constants_map | ||
| self.visit_Assignment(assign, **assign_kwargs) | ||
| else: | ||
| for assign in assignments: |
There was a problem hiding this comment.
This branch is also currently untested I believe.
|
|
||
| c = 0 | ||
| do i = 1, a | ||
| c = c + b |
There was a problem hiding this comment.
The in loop accumutation is currently untested for the non unroll case, that would be a nice test to add.
|
|
||
|
|
||
| def sum_int_literals(expr): | ||
| def sum_int_literals(expr, integer_arithmetic=True, floating_point=False): |
There was a problem hiding this comment.
[optional] Should this now be called sum_literals?
Introduce a ConstantPropagationAnalysis scaffold with declaration-map handling, expression-folding helpers, and initial analysis tests so the PR #515 replay can build incrementally on the rebased DFA framework.
Add the transformer entry point and smoke tests for the PR #515 replay, wiring the current constant-propagation analysis scaffold into the transformations package without bringing in the full loop semantics yet.
Port the expression-mapper improvements from PR #515 onto the rebased analysis scaffold, including binary helper cleanup and boolean short-circuit coverage, while keeping the broader loop and conditional propagation work for later commits.
Bring over the key constant-propagation tests from PR #515 and mark the not-yet-replayed behaviors as expected failures so the remaining implementation work is captured as executable targets on the new replay branch.
Update the constant-propagation analysis to rewrite assignments, maintain the running constants map, and persist transformed routine bodies so the replayed basic propagation tests now pass while loop-specific behavior remains for later commits.
Track constant maps separately across conditional branches and merge them conservatively so branch-local propagation works without leaking inconsistent values across the join point.
Handle constant loop bounds in the replayed constant-propagation analysis by unrolling when possible and conservatively invalidating loop-written values otherwise, while avoiding circular imports through a local loop-transform import.
Carry constant values out of non-unrolled loops when they are derived solely from loop invariants and the loop is guaranteed to execute, while still treating induction-variable-dependent values conservatively.
Remove the loop-related expected-failure markers that now pass on the replay branch so the test suite reflects the current constant-propagation behavior explicitly.
Use partial matching for array writes with dynamic or non-literal indices so constant propagation invalidates only affected accesses instead of discarding the full array state.
Prefer a cheaper depth-first unroll path when a constant-bound loop has no sibling loops and the induction variable is not used in nested loop bounds, while preserving the existing breadth-first behavior for more complex nests.
Replace top-level Loki imports in the constant-propagation modules with lower-level imports and align the transformer visit signature with the base Transformer API so full-codebase pylint stays clean without changing behavior.
Instead of re-implementing various symbolic reductions, we should add the missing one to the core simplification mapper and inteherit from it.
450b313 to
4996fe0
Compare
Summary
ConstantPropagationAnalysisandConstantPropagationTransformerImportant
This PR is based on top of #662 and therefore requires #662 to be merged first.
Background
This PR is a modern replay of the original constant-propagation work proposed in #515.
The original branch for #515 was stacked on top of the older DFA refactor from #521. Since that historical branch diverged, the underlying DFA implementation and related infrastructure changed substantially on
main. Replaying the feature on top of the rebased DFA framework from #662 was therefore safer and easier to review than attempting a literal rebase of the original branch.What This PR Changes
loki/analyse/constant_propagation_analysis.pyloki/transformations/constant_propagation.pyloki/transformations/tests/test_constant_propagation.pyWhy
The goal of this PR is to restore the constant-propagation feature originally developed in #515, but on top of the current Loki code base and the rebased DFA architecture from #662.
With #662 in place, constant propagation can build on explicit DFA extension points and a clearer analysis structure, instead of depending on the older stacked branch layout.
Relationship To Earlier Work
This PR is a replay of the ideas and functionality from:
Acknowledgements
This PR is based on the original constant-propagation work in #515 by @Andrew-Beggs-ECMWF.
It also depends directly on the rebased DFA framework introduced in #662, which itself replays the architectural direction of #521.
This replay does not preserve the original commit history directly, but it intentionally carries forward the core functionality and design goals of the original contribution.
Many thanks to @Andrew-Beggs-ECMWF for the original implementation and for the stacked development that motivated this replay.