Skip to content

Migrate LexFlags parsing to GrmtoolsSectionParser and Header.#556

Merged
ltratt merged 3 commits into
softdevteam:masterfrom
ratmice:lrlex_header_lexflags_on_pr554
Apr 23, 2025
Merged

Migrate LexFlags parsing to GrmtoolsSectionParser and Header.#556
ltratt merged 3 commits into
softdevteam:masterfrom
ratmice:lrlex_header_lexflags_on_pr554

Conversation

@ratmice
Copy link
Copy Markdown
Collaborator

@ratmice ratmice commented Apr 20, 2025

This rebuilds my previous PR #553 on top of the current master branch.
I didn't manage to avoid all problems associated with Span/Location conversion.

In particular, because we need to synthesize a LexFlags for default values, and merge those in with the %grmtools section in the .l file, we have to synthesize a made up span for said default values.

It isn't to difficult to see though that when the LexFlags::try_from in from_str encounters a default value with a synthesized span, the hard coded values in the default shouldn't cause an error, and therefore the span should go unused. After which the header parameter to try_from gets dropped. So as much as I dislike just making up spans, in this case it doesn't really seem like it can materialize into a problem where the user could see the synthesized span.

Comment thread lrlex/src/lib/ctbuilder.rs Outdated
Comment thread lrlex/src/lib/lexer.rs Outdated
Comment thread lrlex/src/lib/lexer.rs Outdated

pub(crate) fn default_lex_flags_header() -> Header<Span> {
// Because this is called in `from_str`, which needs a span to convert to `LexBuildError`
// there doesn't seem to be any way to avoid making up this span out of thin air.
Copy link
Copy Markdown
Collaborator Author

@ratmice ratmice Apr 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, One thing we could do here is just remove this function and implement the defaults manually on LexFlags so we don't have to synthesize Span like this.

It unfortunately gets called from a few places, so perhaps pub(crate) fn setup_default_lex_flags(&mut LexFlags).

Edit: I suppose that with that we'll still need to add to the Header in CTLexerBuilder if only so that mark_required will work, however we don't have any required arguments for LRLex, and so we aren't actually checking that.

Anyhow I haven't really formed an opinion on what the right thing to do here is, but lean towards

  • pushing the default initialization further down possibly as far down as the Regex initialization in Rule.
  • documenting the things which have default initialization such that they should not be marked required.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The best I could think of is in 82069cc

We couldn't do this so late as when constructing rules, because they affect parsing of the .l file.
Also I didn't document these shouldn't be marked required, since the constructors already document their
default values it seemed redundant.

Comment thread lrlex/src/lib/ctbuilder.rs Outdated
.map_err(|e| format!("When reading '{}': {e}", lexerp.display()))?;
let mut header = self.header;
let defaults = default_lex_flags_header();
defaults.merge_into(&mut header)?;
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the only call to merge_into so we could remove that function if we want to?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might we need it again later or is this likely to be the only user of that function ever?

Copy link
Copy Markdown
Collaborator Author

@ratmice ratmice Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot see us needing to call it later, in the sense that I think this series has all the merge_from calls that the repo should currently need. In order to need to call it we would need to add another tool, otherwise it would be that a user needs to call it if we decide to make this pub + unhidden.

I needed it for approximately two attempts of various patch series, there are two reasons merge_into is usable in more cases than merge_from:

  1. It is more flexible when merging the DEFAULT_LEX_FLAGS since it requires no knowledge about the destination header to know that the from header will be overridden by the destinations values via Theirs.
  2. It allows merging the defaults into another structure without moving that struct since it takes other as an owned value.

Since we now know the contents of the source header, the first is not necessary,
and since we're not passing headers by mutable reference, the second is no longer necessary.
Thus we currently fall into the more narrow band where merge_from can be used to the same effect.

It is a fairly simple derivative of copy/paste merge_from and reverse the src/destination values so it isn't the end of the world to just add it at that time, if we find we need it either.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so should we remove it or keep it?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think since it is actually introduced in this series, we should remove it if only because the series is too big already. If it was already in the tree I'd probably lean towards leaving it be just for API completeness.
Because of course we can always add it in a separate patch later if it's needed.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did that in 1a43c84

@ltratt
Copy link
Copy Markdown
Member

ltratt commented Apr 23, 2025

Please squash.

ratmice added 3 commits April 23, 2025 03:58
This unifies the code in lrlex with the rest of the crates by using the
`GrmtoolsSectionParser`, for parsing the `%grmtools` section and using
`Header` for merging values from the builder with those from the parsed header.
@ratmice ratmice force-pushed the lrlex_header_lexflags_on_pr554 branch from 1a43c84 to 707bfd7 Compare April 23, 2025 11:16
@ratmice ratmice marked this pull request as ready for review April 23, 2025 11:17
@ratmice
Copy link
Copy Markdown
Collaborator Author

ratmice commented Apr 23, 2025

Squashed, I tried to pull out the few free-standing commits I could find.

@ltratt ltratt added this pull request to the merge queue Apr 23, 2025
Merged via the queue into softdevteam:master with commit 3997be1 Apr 23, 2025
2 checks passed
@ratmice
Copy link
Copy Markdown
Collaborator Author

ratmice commented Apr 23, 2025

Okay, well I think unless anything comes up that should conclude that patch series.
It turned out much more involved than I had expected, I feel like I made
some painful choices along the way and paid the price, but that has largely been rectified.

I don't really have a whole lot in the way of breaking changes left on my todo list,
with most of the stuff I'm looking towards implementing I think should be non-breaking.

@ratmice ratmice deleted the lrlex_header_lexflags_on_pr554 branch April 23, 2025 11:37
@ltratt
Copy link
Copy Markdown
Member

ltratt commented Apr 24, 2025

Looking forward to it!

@ratmice ratmice mentioned this pull request Apr 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants