Skip to content

getScopingInfo utility added#3118

Merged
Cole-Greer merged 2 commits into
apache:masterfrom
vaibhavm99:getScopeInfo_utility
Jun 13, 2025
Merged

getScopingInfo utility added#3118
Cole-Greer merged 2 commits into
apache:masterfrom
vaibhavm99:getScopeInfo_utility

Conversation

@vaibhavm99

Copy link
Copy Markdown
Contributor

Added a utility function getScopingInfo() in all classes implementing Scoping and a method getScopingInfoForTraversal to get the labels needed by a traversal, which returns a Set of ScopingInfo class which contains the label and Pop info.

@Cole-Greer

Copy link
Copy Markdown
Contributor

Hi @vaibhavm99, thanks for opening a PR! I should have time to give it a full review by Monday at the latest. Could I ask a bit about the motivation for the new getScopingInfo() method? The label functionality is mostly covered by getScopeKeys(), is the main objective to give some common interface for finding the Pop value of "select" type steps? Is this change in response to any difficulties with the current interface?

@codecov-commenter

codecov-commenter commented May 9, 2025

Copy link
Copy Markdown

Codecov Report

Attention: Patch coverage is 89.42308% with 11 lines in your changes missing coverage. Please review.

Project coverage is 78.22%. Comparing base (cfd6889) to head (38fc28f).
Report is 266 commits behind head on master.

Files with missing lines Patch % Lines
.../gremlin/process/traversal/step/PopContaining.java 63.15% 5 Missing and 2 partials ⚠️
...rocess/traversal/step/map/TraversalSelectStep.java 50.00% 3 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #3118      +/-   ##
============================================
+ Coverage     77.87%   78.22%   +0.35%     
- Complexity    13578    13799     +221     
============================================
  Files          1015     1025      +10     
  Lines         59308    60179     +871     
  Branches       6835     7013     +178     
============================================
+ Hits          46184    47074     +890     
+ Misses        10817    10746      -71     
- Partials       2307     2359      +52     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@vkagamlyk

Copy link
Copy Markdown
Contributor

Hi @vaibhavm99, will this change allow to save not the whole Path, but only the parts that are actually used?

@vaibhavm99

Copy link
Copy Markdown
Contributor Author

Hi @vaibhavm99, thanks for opening a PR! I should have time to give it a full review by Monday at the latest. Could I ask a bit about the motivation for the new getScopingInfo() method? The label functionality is mostly covered by getScopeKeys(), is the main objective to give some common interface for finding the Pop value of "select" type steps? Is this change in response to any difficulties with the current interface?

getScopeKeys only gives us a List of labels needed, but does not contain the Pop info. This is a common interface to get both label and the Pop value for all the labels needed. This is to provide label information inside the repeat step during the Neptune FastTranslation.

@vaibhavm99

Copy link
Copy Markdown
Contributor Author

Hi @vaibhavm99, will this change allow to save not the whole Path, but only the parts that are actually used?

Hi @vkagamlyk , I'm sorry I don't understand what you mean by "the whole Path." This change should not have any breaking problems with older code, as it is just adding a new API. This would take in a traversal and provide us with the label and Pop info for the labels that are needed in that traversal.

@vkagamlyk

Copy link
Copy Markdown
Contributor

Hi @vaibhavm99, will this change allow to save not the whole Path, but only the parts that are actually used?

Hi @vkagamlyk , I'm sorry I don't understand what you mean by "the whole Path." This change should not have any breaking problems with older code, as it is just adding a new API. This would take in a traversal and provide us with the label and Pop info for the labels that are needed in that traversal.

Hi @vaibhavm99!
In some cases it is necessary to transfer Traversers between servers. Part of Traverser can be Path object that is larger than other Traverser parts. If we know in advance which parts of the path are used, then we can transfer only them.
Path is list of pairs <label, value>, getScopeKeys can be used to get list of used labels.

@Cole-Greer

Copy link
Copy Markdown
Contributor

Hi @vaibhavm99 I'm happy with extending convenience interfaces for providers. I am curious since the SelectStep, SelectOneStep, and TraversalSelectStep are the only steps which are not hardcoded to Pop.last, if you also considered unifying these steps with some sort of Selecting interface which could extend Scoping and add public Pop getPop();

@Cole-Greer

Copy link
Copy Markdown
Contributor

I see this PR is currently targeting the master branch, this branch is currently accumulating changes for TinkerPop 4.0.0-beta.2, and ultimately TinkerPop 4. I would recommend targeting this change to 3.7-dev, which is accumulating changes for 3.7.4 (non-breaking only), or 3.8-dev which is for TinkerPop 3.8.0 (breaking changes permitted). Any changes introduced to an older development branch will also get introduced to the newer branches when the PR is merged.

*
* @return A Set of {@link ScopingInfo} values which contain the label and Pop value
*/
public default Set<ScopingInfo> getScopingInfo() {return null; }

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a default implementation here? If something implements Scoping shouldn't it have a proper implementation? or are there cases where you would return getScopeKeys but not getScopingInfo? and if so, would return null be the right default?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there are a few classes that do not implement Pop at all, so for them we do not need getScopingInfo() but they do have getScopeKeys(). For example, AddEdgeStep and AddPropertyStep. For those I added a default. Would changing it to return an empty Set be better?

@vaibhavm99

Copy link
Copy Markdown
Contributor Author

Hi @vaibhavm99, will this change allow to save not the whole Path, but only the parts that are actually used?

Hi @vkagamlyk , I'm sorry I don't understand what you mean by "the whole Path." This change should not have any breaking problems with older code, as it is just adding a new API. This would take in a traversal and provide us with the label and Pop info for the labels that are needed in that traversal.

Hi @vaibhavm99! In some cases it is necessary to transfer Traversers between servers. Part of Traverser can be Path object that is larger than other Traverser parts. If we know in advance which parts of the path are used, then we can transfer only them. Path is list of pairs <label, value>, getScopeKeys can be used to get list of used labels.

This API would not cause any breaking change in getScopeKeys, this would give us the Pop info for those labels as well if you'd need it. When some labels are overloaded, Pop can be used to distinguish which mapping of that label is needed. For example, if only first is needed we would just need to have the <label, value1> instead of all the <label, value> pairs. This API gives additional info along with what getScopeKeys provides.

@vaibhavm99

Copy link
Copy Markdown
Contributor Author

Hi @vaibhavm99 I'm happy with extending convenience interfaces for providers. I am curious since the SelectStep, SelectOneStep, and TraversalSelectStep are the only steps which are not hardcoded to Pop.last, if you also considered unifying these steps with some sort of Selecting interface which could extend Scoping and add public Pop getPop();

Yes, getScopingInfo() is overloaded in every class that implements Scoping, if that class hardcodes Pop to Pop.last that is given out, and if it gives the user option to select Pop, I used that value using the getPop() function.

@upadhyay-prashant

Copy link
Copy Markdown
Contributor

@vkagamlyk / @Cole-Greer ,

Thank you for the detailed review. You got it right, that we're working on optimizing label identification in prefix traversals, with the goal of only injecting necessary labels rather than the entire path. This approach could indeed be valuable for Traverser injection in TinkerPop.

You've made a good point about the current limitations of the Scoping interface, particularly regarding complete label information requirements. Let me explain why the current POJO structure might be more suitable than the proposed alternative:

Consider this example:

gremlin> g.V().as('a').
......1>   out().as('b', 'a').
......2>   in().as('b').
......3>   where(select(first, 'a').where(eq('b')))

==>v[1]
==>v[1]
==>v[1]
==>v[4]
==>v[4]

In this case, the outermost where step requires [['a',Pop.first], ['b',Pop.last]], which wouldn't fit well with the proposed structure:

public static class ScopingInfo {
    public Set<String> labels;
    public Pop pop;
}

@spmallette : I hope above example explains why an simple getPop API wont be enough. Let me know if you have better suggestions.

Regarding the implementation strategy, our idea was to merge to master first and then cherry-pick to 3.7 and 3.8. Would you like to suggest any specific concerns about this approach?

Looking forward to your thoughts on this.

@Cole-Greer

Copy link
Copy Markdown
Contributor

Hi @upadhyay-prashant, I'm not sure I understand the limitation arising from the above example with a simple getPop() API.

In the case of where(select(first, 'a').where(eq('b'))), you should be able to get all of the relevant data via:

selectStep.getScopeKeys() -> {'a'}
selectStep.getPop() -> Pop.first
whereStep.getScopeKeys() -> {'b'}
whereStep.getPop() -> Pop.last

There is never more than 1 Pop in a single step instance, I don't see why the same data as getScopingInfo() cannot be extracted by a call to getScopeKeys() and getPop().

@Cole-Greer

Copy link
Copy Markdown
Contributor

Regarding the implementation strategy, our idea was to merge to master first and then cherry-pick to 3.7 and 3.8. Would you like to suggest any specific concerns about this approach?

The typical approach for this in TinkerPop would be to target the initial PR to 3.7-dev. Whichever committer ends up merging the PR will then manually merge the changes up to 3.8-dev and master on your behalf. Typically the only time we ask contributors to open separate PRs for each branch is if each branch requires substantially different solutions for some reason.

@vaibhavm99 vaibhavm99 force-pushed the getScopeInfo_utility branch from 277612e to d2ec3a4 Compare May 22, 2025 23:40
@xiazcy

xiazcy commented May 23, 2025

Copy link
Copy Markdown
Contributor

LGTM, thanks for the change! VOTE +1

@Cole-Greer Cole-Greer left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it looks good to me. Left a couple of comments for small tweaks. There are a couple of steps which give their own implementation for getPopInstructions() which do not have tests in this PR, which would be nice to have (TraversalSelectStep, AddVertexStep, AddVertexStartStep, AddEdgeStep, AddEdgeStartStep). It would also be nice to have some testing for steps which directly use the default implementations in Scpoing or TraversalParent.

For all of the steps which are TraversalParents, it would be nice if the tests also validated that getPopInstructions() is successfully producing the PopInstruction's from child traversals, as well as from the step itself.

@vaibhavm99 vaibhavm99 force-pushed the getScopeInfo_utility branch from d2ec3a4 to 591e667 Compare June 6, 2025 18:52
Comment thread CHANGELOG.asciidoc Outdated
and TraversalParent.java, and classes that use these interfaces.
@vaibhavm99 vaibhavm99 force-pushed the getScopeInfo_utility branch from 591e667 to 39c11b2 Compare June 11, 2025 21:24
@Cole-Greer

Cole-Greer commented Jun 13, 2025

Copy link
Copy Markdown
Contributor

Thanks for coming through and making all of these changes @vaibhavm99. My vote is VOTE +1.

Note that I intend to cherry pick this back to 3.7-dev when merging as this is completely non-breaking and can probably deliver some value sooner from that branch.

Comment thread gremlin-core/src/test/java/org/apache/tinkerpop/gremlin/TestDataBuilder.java Outdated
@Cole-Greer Cole-Greer merged commit 2ac1eb9 into apache:master Jun 13, 2025
6 checks passed
Cole-Greer pushed a commit that referenced this pull request Jun 13, 2025
Added PopContaining interface, and its implementation in Scoping.java
and TraversalParent.java, and classes that use these interfaces.

Co-authored-by: Vaibhav Malhotra <vaimalho@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants