-
Notifications
You must be signed in to change notification settings - Fork 41
Add source snippets to missing_from_lexer/parser errors #570
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from 1 commit
Commits
Show all changes
3 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I guess I would suggest a good first step is to focus on what to do about the duplication of
rule_idshere.Since it actually impacts semver, but also contributes to the other code duplication.
The thing to note is that this patch is adding
yacc_grammarfield to this struct,and
rule_idsfield is justSo we are essentially storing 2 copies of the
rule_ids, one with the&strrefs convertedto_owned().It is worth noting that in lrlex we make another
owned_mapclone of this value inCTLexerBuilder::build(),as part of the existing code path.
The main issue I see is the semver changes to
CTParser::token_map, it isn't actually possible to reimplement thetoken_mapfunction usingself.yacc_grammar, because of the ref-to-owned vs ref-to-ref differences.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dumb idea (off the top of my head): does this suggest that
rule_idsshould be wrapped in something like anRc?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alternative to changing this is just keeping both the
rule_idsandyacc_grammarfield duplication,and rewriting the
set_rule_ids_spannedcodepath fromnimbleparseto use therule_idsfield,making this new code path conform to the existing
CTLexerBuildercode paths instead.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we were going to use
Rcwhile avoiding duplication thatRcwould still need to be withinYaccGrammar, the particular thing we want out of theYaccGrammaris we need to calltoken_spanThen because we need that, it means we also already have
YaccGrammar::token_mapwithtoken_mapitself based uponiter_tidxsand also allocating aHashMap.I suppose that would leave us with:
YaccGrammar::token_maprule_ids: Rc<HashMap<...>>shared byCTParserBuilder/CTLexerBuilderSo I guess this is also a variation of keeping both the
rule_idsandyacc_grammarfields.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I guess another thing to take into account, we have these two functions here:
https://docs.rs/lrlex/latest/lrlex/trait.LexerDef.html#tymethod.set_rule_ids
https://docs.rs/lrlex/latest/lrlex/trait.LexerDef.html#tymethod.set_rule_ids_spanned
Which are the ultimate destination for both these variations, so another question I have is if we had versions
of those functions which accepted an
IntoIterator<Item=(&str, LexerTypesT::StorageT)>as a parameter instead of the&HashMap<&str, LexerTypesT::StorageT>, maybe it would be possible to get rid of all these allocations entirely?Edit: This may be a dumb idea, it looks like both
set_rule_idsvariations use the hashmaps to actually do lookups, so the simplest implementation of these functions taking anIntoIteratorwould actually useHashMap::from_iter()There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In f4a42cb I tried messing with the iterator API...
Returning a
Box<HashMap>fromtoken_mapin place of the rule_id field, I'm not sure if the iterator based API pulls it's weight really, we still end up with clones for backwards compatibility of return values, and theCTLexerBuilder::rule_ids_mapfunction.That said it seems like a bit of an improvement over the original PR implementation, but maybe it's not worth trying to remove the duplicated values which we can't seem to benefit from fully without more drastic semver breakage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, where we break semver, I'd hope to keep it to "nice" parts of the API. This one does feel like it will intrude on pretty much every compile-time user? I suspect, though, there is a nice API here, and it's just eluding us right now...
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the actual breakage is pretty small, because of the signature for
CTLexerBuilder::rule_ids_map
has
std::borrow::Borrow<HashMap<String, LexerTypesT::StorageT>,The only change I had to make to any compile-time stuff was the manual lexer,
because ct_token_map uses
&HashMapin it's signature rather than theBorrow<...>. Perhaps we can switch that to useBorrow<>to further reduce breakage.Also, I hadn't noticed that all of
CTLexer's fields and methods are private so changing the types of those fields isn't actually a semver break like I had thought.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I did 8b973f2, but I'm starting to think that in this PR I should retreat from removing duplicate fields, and reducing allocations.
I think if we allow the duplicate
rule_idfield, we maybe can get something working without any semver breakage, or code duplication, very similar to how this patch is now, but without thetoken_maporLexerDef::set_rule_ids_spanned_iterchange.Then we can potentially revisit all that in a separate patch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Oops, s/nice/niche/. That changed the meaning of that sentence quite profoundly. Sorry!]