Skip to content

IR: Refactoring and introduction of explicitly names "statement nodes"#656

Open
mlange05 wants to merge 19 commits into
mainfrom
naml-ir-intrinsic-nodes
Open

IR: Refactoring and introduction of explicitly names "statement nodes"#656
mlange05 wants to merge 19 commits into
mainfrom
naml-ir-intrinsic-nodes

Conversation

@mlange05
Copy link
Copy Markdown
Collaborator

@mlange05 mlange05 commented Mar 9, 2026

Note: This PR technically introduces an externally API-changing change, as it renames ir.Intrinsic -> ir.GenericStmt

This PR refactors the loki.ir.nodes source into a dedicated sub-package and introduces a new set of statement nodes that all inherit from loki.ir.nodes.GenericStmt. These replace previous ir.Intrinsic nodes (a misnomer from the early days 😉 ), which in turn allows us to refer to specific statement nodes explicitly by type (eg. ir.ImplicitStmt or ir.ContainsStmt). It is worth pointing out that I've only specialised CONTAINS, IMPLICIT, CYCLE, RETURN, GO TO, PUBLIC, PRIVATE thus far, but others could follow.

Pleas note also that the latter two now exist as dedicated node types, but I've not changed (yet) the internal mechanics of the Module that use this information.

In some more detail:

  • Moved IR utilities (including dataclass_strict to loki.tools
  • Separated loki.ir.nodes into sub-package with sources for abstract_nodes, internal_nodes and leaf_nodes
  • Renamed loki.ir.Intrinsic => loki.ir.GenericStmt
  • Added source for stmt_nodes to loki.ir.nodes and implemented ImplicitStmt, ContainsStmt, ReturnStmt, CycleStmt, GotoStmt, PrivateStmt and PublicStmt
  • Adjusted used of generic statement nodes in both code and test base
  • Consolidated some statement tests in ir.nodes.tests to capture new node types and avoid JIT
  • Some more import massaging in a test files that I had to touch anyway

@mlange05 mlange05 requested a review from reuterbal March 9, 2026 09:43
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 9, 2026

Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/656/index.html

@mlange05 mlange05 force-pushed the naml-ir-intrinsic-nodes branch 7 times, most recently from cf50b27 to c705e00 Compare March 9, 2026 15:03
@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 9, 2026

Codecov Report

❌ Patch coverage is 89.60302% with 55 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.16%. Comparing base (03c986f) to head (0abab4d).

Files with missing lines Patch % Lines
loki/frontend/omni.py 0.00% 25 Missing ⚠️
loki/ir/nodes/stmt_nodes.py 88.88% 17 Missing ⚠️
loki/backend/tests/test_stringifier.py 20.00% 4 Missing ⚠️
loki/frontend/fparser.py 96.55% 2 Missing ⚠️
loki/ir/tests/test_control_flow.py 94.87% 2 Missing ⚠️
loki/expression/tests/test_expression.py 83.33% 1 Missing ⚠️
loki/frontend/tests/test_nodes_statements.py 92.30% 1 Missing ⚠️
loki/frontend/tests/test_nodes_subroutines.py 0.00% 1 Missing ⚠️
loki/frontend/util.py 0.00% 1 Missing ⚠️
loki/tests/test_modules.py 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #656      +/-   ##
==========================================
- Coverage   94.20%   94.16%   -0.04%     
==========================================
  Files         288      286       -2     
  Lines       47546    47274     -272     
==========================================
- Hits        44790    44516     -274     
- Misses       2756     2758       +2     
Flag Coverage Δ
lint_rules ?
loki 94.16% <89.60%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@mlange05 mlange05 force-pushed the naml-ir-intrinsic-nodes branch from 51a2cda to da12fe7 Compare March 9, 2026 16:52
@mlange05 mlange05 marked this pull request as ready for review March 9, 2026 18:43
@mlange05 mlange05 force-pushed the naml-ir-intrinsic-nodes branch from da12fe7 to 3055d61 Compare April 11, 2026 07:48
@mlange05
Copy link
Copy Markdown
Collaborator Author

@reuterbal I've rebased this over recent main (0.3.6) for now and added a few more (less commonly used) statement node types. Seeing this, it might make sense to wait for PR #665 and then expand the explicit frontend use of these uncommon types?

@mlange05 mlange05 marked this pull request as draft April 28, 2026 08:48
@mlange05 mlange05 force-pushed the naml-ir-intrinsic-nodes branch 5 times, most recently from 57f25ff to 0fef097 Compare April 29, 2026 12:27
@mlange05 mlange05 marked this pull request as ready for review April 29, 2026 14:18
@mlange05
Copy link
Copy Markdown
Collaborator Author

Ok, @reuterbal I've rebased and added some bits to the new frontend test structure. There's more to be done there, but this one was already getting big, so I'll keep it to what it is now. Ready for another look.

Copy link
Copy Markdown
Collaborator

@reuterbal reuterbal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

The refactoring of IR nodes into multiple files is a very welcome change. Unfortunately, the Github diff view doesn't make it very easy to see where things have simply moved or where other changes have been incorporated, too - but I trust that this has been executed with caution and that tests would have failed if something was out of the ordinary.

I have flagged a few cases where I think the frontends are not correctly translating to the new statement node types but instead create "old-fashioned" intrinsic/generic nodes. That should be straightforward to fix.

The only conceptual question for me is the fact, that the new statement nodes are not capturing execution/programming concepts in the way we used to do this as much as possible in the Loki IR, but are very much aligned with the specific Fortran-way of expressing such concepts. I flagged this in the respective places and would be interested in your thoughts on this. I am not fundamentally opposed to doing it in the way this is implement now, but wanted to point out the change this constitutes and ensure that it is a concious decision to design it that way.

code = module.to_fortran()
code_lines = code.splitlines()
assert len(code_lines) in (35, 36) # OMNI produces an extra line
assert len(code_lines) in (35, 37) # OMNI produces an extra line
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤨



@dataclass_strict(frozen=True)
class PublicStmt(GenericStmt):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I 100% like the very Fortran-string-equivalent representation of statement nodes, as it abandons the user-friendly view onto execution concepts that Loki encapsulated so far. It removes some of the higher-level approach over an AST representation that we tried to maintain so far.

Specifically, I'm concerned about PublicStmt vs. PrivateStmt - both of which could be classified as a VisibilityStmt with a positive vs. negative default. A transformation or lint rule that is interested in visibility of symbols will now have to look for both node types instead of searching directly for one type only.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline: #690



@dataclass_strict(frozen=True)
class ContainsStmt(GenericStmt):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly for ContainsStmt: Semantically, this does not add any additional information other than delineating the beginning of the contains part. Loki's IR already captures this through the contains property and I'm wondering if a ContainsSection as an overload of the internal node type Section wouldn't be a more intuitive way of representing this concept without the need to explicitly represent this as a single-word-string-only node?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline: #689



@dataclass_strict(frozen=True)
class PrintStmt(GenericStmt):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we'd want a WriteStmt, too, at some point, but that would constitute a larger impact due to it's use in transformations?

Also here, wondering if a generic OutputStmt might not be a better fit?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, in principle yes, but due to the various flavours of string and output formatting it's quite hard to generalise. So, we might look into generalising output (and file handling) intrinsics at a later stage, but defer to a future PR for this.

Comment thread loki/frontend/fparser.py Outdated
return ir.StopStmt(**kwargs)

visit_Intrinsic_Stmt = visit_Generic_Stmt
visit_Format_Stmt = visit_Generic_Stmt
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did I not spot a FormatStmt class in stmt_nodes.py?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's more involved than expected, but yes - please see top commit after rebase.

Comment thread loki/frontend/fparser.py Outdated
# the usual `Output_Item_List` entity.
return ir.Intrinsic(text=f'PRINT {", ".join(str(i) for i in o.items if i is not None)}',
source=kwargs.get('source'), label=kwargs.get('label'))
return ir.GenericStmt(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this not generate a PrintStmt?

Comment thread loki/frontend/omni.py Outdated
args = f", {args}" if values else ""
fmt = o.attrib['format']
return ir.Intrinsic(text=f'print {fmt}{args}', source=kwargs['source'])
return ir.GenericStmt(text=f'print {fmt}{args}', source=kwargs['source'])
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be a PrintStmt?

Comment thread loki/frontend/omni.py Outdated
def visit_FformatDecl(self, o, **kwargs):
fmt = f'FORMAT{o.attrib["format"]}'
return ir.Intrinsic(text=fmt, source=kwargs['source'])
return ir.GenericStmt(text=fmt, source=kwargs['source'])
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be a FormatStmt?

Instead of taking the `text` argument in the constructor, we suppress
it and derived the string from a tuple of expression/str arguments in
the `values` argument. This allows us to cleanly use sub-expressions
as part of output statements.
@mlange05 mlange05 force-pushed the naml-ir-intrinsic-nodes branch from 0abab4d to a2cfbb7 Compare May 31, 2026 07:18
mlange05 added 2 commits June 1, 2026 06:32
Make OMNI FORMAT declaration parsing quote-aware so commas inside
quoted format strings are retained as part of the string literal
instead of splitting into separate descriptors. Preserve the leading
numeric FORMAT statement label from source metadata when OMNI does not
attach it directly to the declaration node.

Add regression coverage for labelled FORMAT statements with quoted
comma-containing items in the OMNI frontend, update frontend output
statement assertions to check typed values, and cover PRINT/FORMAT
string literal generation in fgen tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants