SNOW-3489633: remove redundant LIMIT 1 on each COUNT slot subquery#4251
SNOW-3489633: remove redundant LIMIT 1 on each COUNT slot subquery#4251sfc-gh-yuwang wants to merge 4 commits into
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4251 +/- ##
==========================================
+ Coverage 95.38% 95.49% +0.10%
==========================================
Files 171 171
Lines 44243 44187 -56
Branches 7557 7535 -22
==========================================
- Hits 42203 42198 -5
+ Misses 1252 1226 -26
+ Partials 788 763 -25 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
| # add limit 1 because aggregate may be on non-aggregate function in a scalar aggregation | ||
| # for example, df.agg(lit(1)) |
There was a problem hiding this comment.
is this assumption legit in the context of SCOS? what cases will be broken if we always remove limit_expression(1) for SCOS?
given the complexity of the implementation (extra tree walk, code maintainability, etc.) and the minimal perf improvement it can help, I'm wondering if it's worth the efforts.
There was a problem hiding this comment.
locally verified that the scenario mentioned in the comment would not break SCOS with current change in snowpark
There was a problem hiding this comment.
Do you have any metrics on the improvements from this change? I agree with Adam that this change seems risky for little benefit.
There was a problem hiding this comment.
I'm curious about why the no-aggregation function in SCOS avoids the issue after we remove limit 1
However, if the underlying logic is overly complex or confusing, let's skip it. Without a clear understanding, we won't have high confidence against regressions.
There was a problem hiding this comment.
I did a few re-test, it turns out that this change is also not safe for scos, so we will have to add a lot of logic to make it safe for scos like the first version of this change.
At this point, I think it does not worth it to have this change just to remove limit 1.
Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
Fixes SNOW-NNNNNNN
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
Please write a short description of how your code change solves the related issue.