Add support for parsing digraphs#158

Closed
HaydenMichel8 wants to merge 5 commits into robotpy:main from HaydenMichel8:add_digraphs

@HaydenMichel8

Contributor

C++ allows digraphs as alternative spellings for certain tokens. Trigraphs were removed in C++17, but digraphs are still valid in modern C++. They exist because some keyboards and character sets lack these characters. Namely, the following replacements can be used:

  • <: replaces [
  • :> replaces ]
  • <% replaces {
  • %> replaces }

So this means that something like

namespace myns
<%
    int value;
%>

is valid C++ code (verified to compile on https://godbolt.org/, but it gives a parse error on https://robotpy.github.io/cxxheaderparser/).

This change adds support for parsing these tokens, which will allow cxxheaderparser to parse legacy C or C++ code that uses these unusual spellings.
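To make the idea concrete, here is a minimal standalone sketch of mapping the four replacements listed above back to their primary tokens. It is not the code in this PR and does not use cxxheaderparser's lexer; the names `DIGRAPHS`, `_TOKEN_RE`, and `normalize_digraphs` are hypothetical. It assumes a heavily simplified token pattern that only knows about ordinary string and character literals and comments, so raw string literals, digit separators, and the special `<::` lexing rule are not handled.

```python
import re

# Digraph spellings from the list above, mapped to their primary tokens.
DIGRAPHS = {"<:": "[", ":>": "]", "<%": "{", "%>": "}"}

# Hypothetical, simplified token pattern. Literals and comments are matched
# first so that digraph-like text inside them is left untouched.
_TOKEN_RE = re.compile(
    r"""
    (?P<literal>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')
  | (?P<comment>//[^\n]*|/\*.*?\*/)
  | (?P<digraph><:|:>|<%|%>)
  | (?P<other>.)
    """,
    re.VERBOSE | re.DOTALL,
)


def normalize_digraphs(source: str) -> str:
    """Return source with the four digraphs replaced by [, ], { and }."""
    out = []
    for match in _TOKEN_RE.finditer(source):
        text = match.group(0)
        if match.lastgroup == "digraph":
            text = DIGRAPHS[text]
        out.append(text)
    return "".join(out)


if __name__ == "__main__":
    print(normalize_digraphs("namespace myns\n<%\n    int value;\n%>\n"))
```

Doing the replacement at the token level rather than with a plain text substitution keeps a string such as "<% %>" intact, which is why a real implementation would more likely live in the lexer itself.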

@virtuald virtuald left a comment

Member

Thanks for the contribution! Looking at https://en.cppreference.com/cpp/language/operator_alternative, they have this example which your PR fails to parse:

%:include <iostream>

struct X
<%
    compl X() <%%> // destructor
    X() <%%>
    X(const X bitand) = delete; // copy constructor
    // X(X and) = delete; // move constructor
    
    bool operator not_eq(const X bitand other)
    <%
       return this not_eq bitand other;
    %>
%>;

int main(int argc, char* argv<::>) 
<%
    // lambda with reference-capture:
    auto greet = <:bitand:>(const char* name)
    <%
        std::cout << "Hello " << name
                  << " from " << argv<:0:> << '\n';
    %>;
    
    if (argc > 1 and argv<:1:> not_eq nullptr)
        greet(argv<:1:>);
    else
        greet("Anon");
%>

I think it's fine not to support the %:include form or the named versions of operators like bitand, but even if you replace those, it still gets stuck on the char* argv<::>.

I asked GPT to solve the digraph problem; I think its approach is slightly better, and it parses these cases (see #160).
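The `argv<::>` case depends on how the character sequence `<::` is lexed. In C++11 and later, when the next three characters are `<::` and the character after them is neither `:` nor `>`, the `<` is treated as a token by itself rather than as the start of the digraph `<:`. The sketch below illustrates that rule in isolation. It is not the approach taken in this PR or in #160, `split_tokens` is a hypothetical helper, and it only understands identifiers, numbers, `::`, the four digraphs, and single-character punctuators.

```python
DIGRAPHS = {"<:": "[", ":>": "]", "<%": "{", "%>": "}"}


def split_tokens(text: str) -> list[str]:
    """Split a small C++ fragment into tokens, normalizing digraphs."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        two = text[i:i + 2]
        if ch.isspace():
            i += 1
        elif (
            two == "<:"
            and text[i + 2:i + 3] == ":"
            and text[i + 3:i + 4] not in (":", ">")
        ):
            # `<::` followed by neither `:` nor `>`: `<` stands alone, so
            # `vector<::std::string>` is not read as `vector[:std::string>`.
            tokens.append("<")
            i += 1
        elif two in DIGRAPHS:
            tokens.append(DIGRAPHS[two])
            i += 2
        elif two == "::":
            tokens.append("::")
            i += 2
        elif ch.isalnum() or ch == "_":
            j = i
            while j < len(text) and (text[j].isalnum() or text[j] == "_"):
                j += 1
            tokens.append(text[i:j])
            i = j
        else:
            tokens.append(ch)
            i += 1
    return tokens


if __name__ == "__main__":
    print(split_tokens("char* argv<::>"))
    print(split_tokens("std::vector<::std::string> v;"))
```

With this rule, `argv<::>` is read as `argv`, `[`, `]` because the character after `<::` is `>`, while a template argument that begins with `::` keeps its plain `<`.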

Comment thread tests/test_digraphs.py
"""<% %> should work as { } in a function body context (body is skipped)."""
# The parser skips function bodies but must recognise the braces.
content = """\
#include <iostream>

Member

For any future contributions, please use python -m cxxheaderparser.gentest to generate tests.

@virtuald virtuald closed this Jun 26, 2026