Skip to content

Replay constant propagation on top of rebased DFA framework#663

Open
mlange05 wants to merge 28 commits into
mainfrom
naml-replay-pr515-const-prop
Open

Replay constant propagation on top of rebased DFA framework#663
mlange05 wants to merge 28 commits into
mainfrom
naml-replay-pr515-const-prop

Conversation

@mlange05
Copy link
Copy Markdown
Collaborator

@mlange05 mlange05 commented Apr 1, 2026

Summary

  • replay the constant-propagation work from Constant Propagation Transform #515 on top of the rebased DFA framework from Replay abstract DFA refactor on current main #662
  • add a ConstantPropagationAnalysis and ConstantPropagationTransformer
  • support constant propagation through assignments, conditionals, loops, and dynamic array invalidation
  • include the loop-unroll optimization path needed by the replayed implementation
  • bring over and validate the relevant constant-propagation test coverage

Important

This PR is based on top of #662 and therefore requires #662 to be merged first.

Background

This PR is a modern replay of the original constant-propagation work proposed in #515.

The original branch for #515 was stacked on top of the older DFA refactor from #521. Since that historical branch diverged, the underlying DFA implementation and related infrastructure changed substantially on main. Replaying the feature on top of the rebased DFA framework from #662 was therefore safer and easier to review than attempting a literal rebase of the original branch.

What This PR Changes

  • add loki/analyse/constant_propagation_analysis.py
  • add loki/transformations/constant_propagation.py
  • add and expand loki/transformations/tests/test_constant_propagation.py
  • implement constant propagation through:
    • declaration initializers
    • assignment rewriting
    • conditional branch merging
    • loop handling, including no-unroll cases
    • dynamic array access invalidation
  • add the optional loop-unroll fast path needed by the replayed constant-propagation implementation

Why

The goal of this PR is to restore the constant-propagation feature originally developed in #515, but on top of the current Loki code base and the rebased DFA architecture from #662.

With #662 in place, constant propagation can build on explicit DFA extension points and a clearer analysis structure, instead of depending on the older stacked branch layout.

Relationship To Earlier Work

This PR is a replay of the ideas and functionality from:

Acknowledgements

This PR is based on the original constant-propagation work in #515 by @Andrew-Beggs-ECMWF.

It also depends directly on the rebased DFA framework introduced in #662, which itself replays the architectural direction of #521.

This replay does not preserve the original commit history directly, but it intentionally carries forward the core functionality and design goals of the original contribution.

Many thanks to @Andrew-Beggs-ECMWF for the original implementation and for the stacked development that motivated this replay.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 1, 2026

Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/663/index.html

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 1, 2026

Codecov Report

❌ Patch coverage is 97.37705% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.22%. Comparing base (8315ebd) to head (4373f7d).

Files with missing lines Patch % Lines
loki/transformations/constant_propagation.py 95.00% 8 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #663      +/-   ##
==========================================
+ Coverage   94.20%   94.22%   +0.01%     
==========================================
  Files         288      290       +2     
  Lines       47625    47920     +295     
==========================================
+ Hits        44867    45154     +287     
- Misses       2758     2766       +8     
Flag Coverage Δ
lint_rules 96.40% <ø> (ø)
loki 94.20% <97.37%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@mlange05 mlange05 force-pushed the naml-replay-pr515-const-prop branch from 353e011 to 6e43e62 Compare April 1, 2026 14:32
@Andrew-Beggs-ECMWF
Copy link
Copy Markdown
Contributor

Happy to help improve this where required. Some things that come to mind are:

  1. loki/analyse/constant_propagation_analysis.py should ideally have 100% test coverage because of how easy it is to introduce subtle bugs (ask me how I know)
  2. There is currently a bug in this somewhere. I can't remember exactly what it was, but when I used this on cloudsc, there was some loop where it propagated something it shouldn't have. I think it was ZEXPLICIT loop where it propagated ZEXPLICIT=0 inside the loop, so it never accumulated, but I'd have to check.
  3. Not all Fortran constructs are supported. I'm not sure it's worth waiting on these to be implemented and to have branches diverge again. Perhaps ship it as is with unimplemented nodes raising exceptions?

@mlange05 mlange05 force-pushed the naml-replay-pr515-const-prop branch from 928ac9f to e127e15 Compare April 28, 2026 09:06
@mlange05 mlange05 force-pushed the naml-replay-pr515-const-prop branch 2 times, most recently from d04ba05 to 450b313 Compare May 28, 2026 04:21
@mlange05 mlange05 requested a review from reuterbal May 28, 2026 04:22
@mlange05
Copy link
Copy Markdown
Collaborator Author

Ok, following the first stage agentic adaptation of the original feature implementation I went back to refactor this entire thing a little and bring it more in line with our more widespread conventions in the loki.transformation package. Importantly, this changes a few fundamental things about this feature:

  • Instead of implementing yet-another-expression-simplifier, the ConstantPrropgationMapper now inherits from the base SimplifyMapper and overrides the integer-division convention. This, however, requires the fixes in Expression: Further enhance and fix simplify #688 to get the dedicated mapper tests to pass.
  • The analysis and synthesis of this transformation have now been merged into a pure Transformer, meaning we do not need to attach the constant map onto IR nodes anymore - and thus we do not need to hijack the dataflow analysis either. Note, this might allow us to revert so of the recent encapsulation changes - but I leave that discussion for a follow-on PR.
  • In addition to full-scale Transformer, I've also added an entry-point method that takes the unroll_loops flag, as before. This will ensure that loop unrolling is triggered after an initial constant propagation pass, and triggers a second one afterwards, to ensure that unrolled assignments also inherit the right constants, if these are known.

And final comment, I don't think this feature is entirely complete or fully tested yet - but I believe it's in a good enough condition to merge. Therefore, @reuterbal @awnawab I'd appreciate a review to get this over the line (with potential for more corner-case massaging afterwards.

Also, many thanks to @Andrew-Beggs-ECMWF for the original implementation. I don't think a strict review is required for this, but if you have time for a quick look, comments are always welcome.

@mlange05 mlange05 marked this pull request as ready for review May 28, 2026 04:34
@mlange05 mlange05 requested a review from awnawab May 28, 2026 04:34
Copilot stopped work on behalf of mlange05 due to an error May 28, 2026 09:07
Copy link
Copy Markdown
Contributor

@awnawab awnawab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks @Andrew-Beggs-ECMWF and @mlange05 for this very powerful transformation 🙏 I've left a bunch of nitpicky comments in the implementation and highlighted a few instances of missing coverage. Going by the PR description, these can probably be patched afterwards. However there is one corner case I believe where the constant propagation can be wrong for a loop with constant bounds and a loop variant assignment in the body. If I'm mistaken then please ignore and merge anyway, otherwise it would be good to fix that corner case before this is merged.

def __init__(self, **kwargs):
super().__init__(**kwargs)

self.constants_map = {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears unused, and we seem to instead rely on looking up constants_map from the kwargs.



def _array_indices_to_accesses(dimensions, shape):
accesses = functools.partial(itertools.product)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very fancy, and I learned a new thing, but perhaps this would be more readable:

  arg_lists = []
  for dimension in dimensions:
      if isinstance(dimension, sym.RangeIndex):
          arg_lists.append([sym.IntLiteral(v) for v in get_pyrange(...)])
      else:
          arg_lists.append([dimension])
  return itertools.product(*arg_lists)

accesses = functools.partial(itertools.product)
for count, dimension in enumerate(dimensions):
if isinstance(dimension, sym.RangeIndex):
start = dimension.start if dimension.start is not None else sym.IntLiteral(1)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't these checks just be if dimension.start?

new_condition = mapper(o.condition, constants_map=constants_map)

# Pass two copies of the constants map forward ...
with dict_override(kwargs, {'constants_map': deepcopy(constants_map)}):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth adding a TODO here to check if we know whether the condition evaluates to true or false, in which case we can make the constant propagation a bit less conservative. Classic example would be after inlining when if(present(arg)) checks are actually resolved.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, but we already have a dead-code removal transformation that would do this if applied straight after the constant propagation. Arguably this could be added as an optional step in the outer wrapper (much like the loop unrolling) to capture the cross-cutting effects (remove one if branch, all assignments in the other can be propagated). However, I think this could be left for a follow-up PR and I propose to raise an issue for this at this point?

if not non_literals and not isinstance(new_lhs, sym.Array):
update_constants_map(new_lhs, new_rhs, constants_map)
else:
invalidate_constants_map(new_lhs, constants_map)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could be mistaken, but I don't think we are hitting this line in the case of an array with all literal dimensions and literal rhs.

mapper = ConstantPropagationMapper()

assert mapper(sym.Sum((sym.IntLiteral(1), sym.IntLiteral(2)))) == sym.IntLiteral(3)
assert mapper(sym.Quotient(sym.IntLiteral(7), sym.IntLiteral(2))) == sym.IntLiteral(3)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth adding a test here for negative integer division.

loop_constants_map = deepcopy(constants_map)

for assign in assignments:
if not set(FindVariables().visit(assign.rhs)).intersection(lhs_vars):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we pop the relevant entry if o.variable is in the rhs? Specifically for a case like:

integer :: d = 0
do i = 1, 5     ! const bounds
  d = i * 2     ! loop-variant: RHS uses i which is in lhs_vars
end do
! outer map still has d = 0 here — but at runtime d = 10

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, but again, I think this feature extension could be added in a follow-on PR. I propose parking this in an issue again?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit: I think the "advanced" analysis of using the last-known value for i as a constant that could be added to the constant map would be a future extension - the incorrect invalidation is indeed a bug. Will fix ...

assign_kwargs['constants_map'] = loop_constants_map
self.visit_Assignment(assign, **assign_kwargs)
else:
for assign in assignments:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch is also currently untested I believe.


c = 0
do i = 1, a
c = c + b
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The in loop accumutation is currently untested for the non unroll case, that would be a nice test to add.

Comment thread loki/expression/symbolic.py Outdated


def sum_int_literals(expr):
def sum_int_literals(expr, integer_arithmetic=True, floating_point=False):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[optional] Should this now be called sum_literals?

mlange05 added 15 commits June 1, 2026 08:44
Introduce a ConstantPropagationAnalysis scaffold with declaration-map handling, expression-folding helpers, and initial analysis tests so the PR #515 replay can build incrementally on the rebased DFA framework.
Add the transformer entry point and smoke tests for the PR #515 replay, wiring the current constant-propagation analysis scaffold into the transformations package without bringing in the full loop semantics yet.
Port the expression-mapper improvements from PR #515 onto the rebased analysis scaffold, including binary helper cleanup and boolean short-circuit coverage, while keeping the broader loop and conditional propagation work for later commits.
Bring over the key constant-propagation tests from PR #515 and mark the not-yet-replayed behaviors as expected failures so the remaining implementation work is captured as executable targets on the new replay branch.
Update the constant-propagation analysis to rewrite assignments, maintain the running constants map, and persist transformed routine bodies so the replayed basic propagation tests now pass while loop-specific behavior remains for later commits.
Track constant maps separately across conditional branches and merge them conservatively so branch-local propagation works without leaking inconsistent values across the join point.
Handle constant loop bounds in the replayed constant-propagation analysis by unrolling when possible and conservatively invalidating loop-written values otherwise, while avoiding circular imports through a local loop-transform import.
Carry constant values out of non-unrolled loops when they are derived solely from loop invariants and the loop is guaranteed to execute, while still treating induction-variable-dependent values conservatively.
Remove the loop-related expected-failure markers that now pass on the replay branch so the test suite reflects the current constant-propagation behavior explicitly.
Use partial matching for array writes with dynamic or non-literal indices so constant propagation invalidates only affected accesses instead of discarding the full array state.
Prefer a cheaper depth-first unroll path when a constant-bound loop has no sibling loops and the induction variable is not used in nested loop bounds, while preserving the existing breadth-first behavior for more complex nests.
Replace top-level Loki imports in the constant-propagation modules with lower-level imports and align the transformer visit signature with the base Transformer API so full-codebase pylint stays clean without changing behavior.
Instead of re-implementing various symbolic reductions, we should add
the missing one to the core simplification mapper and inteherit from it.
@mlange05 mlange05 force-pushed the naml-replay-pr515-const-prop branch from 450b313 to 4996fe0 Compare June 1, 2026 08:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants