Python: Modernize py/mixed-tuple-returns#19136
Merged
Conversation
Removes the dependence on points-to in favour of an approach based on (local) data-flow. I first tried a version that used type tracking, as this more accurately mimics the behaviour of the old query. However, I soon discovered that there were _many_ false positives in this setup. The main bad pattern I saw was a helper function somewhere deep inside the code that both receives and returns an argument that can be tuples with different sizes and origins. In this case, global flow produces something akin to a cartesian product of "n-tuples that flow into the function" and "m-tuples that flow into the function" where m < n. To combat this, I decided to instead focus on only flow _within_ a given function (and so local data-flow was sufficient). Additionally, another class of false positives I saw was cases where the return type actually witnessed that the function in question could return tuples of varying sizes. In this case it seems reasonable to not flag these instances, since they are already (presumably) being checked by a type checker. More generally, if you've annotated the return type of the function with anything (not just `Tuple[...]`), then there's probably little need to flag it.
As we're no longer tracking tuples across function boundaries, we lose the result that related to this setup (which, as the preceding commit explains, lead to a lot of false positives).
Contributor
There was a problem hiding this comment.
Pull Request Overview
This PR modernizes the py/mixed-tuple-returns query by removing the dependency on points-to analysis and shifting to a local data-flow approach to reduce false positives.
- Removed false positives by no longer flagging tuples passed as function arguments.
- Enhanced handling of functions with annotated return types to avoid unnecessary warnings.
Files not reviewed (2)
- python/ql/src/Functions/ReturnConsistentTupleSizes.ql: Language not supported
- python/ql/test/query-tests/Functions/return_values/ReturnConsistentTupleSizes.expected: Language not supported
Tip: Leave feedback on Copilot's review comments with the 👎 and 👍 buttons to help improve review quality. Learn more
python/ql/src/change-notes/2025-03-27-modernize-mixed-tuple-returns-query.md
Outdated
Show resolved
Hide resolved
tausbn
commented
Mar 27, 2025
python/ql/src/change-notes/2025-03-27-modernize-mixed-tuple-returns-query.md
Outdated
Show resolved
Hide resolved
joefarebrother
previously approved these changes
Mar 28, 2025
Contributor
joefarebrother
left a comment
There was a problem hiding this comment.
Looks good 👍
Just one minor comment.
| @@ -1,2 +1 @@ | |||
| | functions_test.py:306:1:306:39 | Function returning_different_tuple_sizes | returning_different_tuple_sizes returns $@ and $@. | functions_test.py:308:16:308:18 | Tuple | tuple of size 2 | functions_test.py:310:16:310:20 | Tuple | tuple of size 3 | | |||
Contributor
There was a problem hiding this comment.
I might like to put some comments in the tests explaining the expected results.
(in particular that this case no longer gives an alert)
Adds a comment explaining why we no longer flag the indirect tuple example. Also adds a test case which _would_ be flagged if not for the type annotation.
joefarebrother
approved these changes
Mar 28, 2025
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Removes the dependence on points-to in favour of an approach based on (local) data-flow.
I first tried a version that used type tracking, as this more accurately mimics the behaviour of the old query. However, I soon discovered that there were many false positives in this setup. The main bad pattern I saw was a helper function somewhere deep inside the code that both receives and returns an argument that can be tuples with different sizes and origins. In this case, global flow produces something akin to a cartesian product of "n-tuples that flow into the function" and "m-tuples that flow into the function" where m < n.
To combat this, I decided to instead focus on only flow within a given function (and so local data-flow was sufficient).
Additionally, another class of false positives I saw was cases where the return type actually witnessed that the function in question could return tuples of varying sizes. In this case it seems reasonable to not flag these instances, since they are already (presumably) being checked by a type checker.
More generally, if you've annotated the return type of the function with anything (not just
Tuple[...]), then there's probably little need to flag it.