GROOVY-11649: add partitionPoint to Groovy Array and List(missed in java and groovy)#2210
Conversation
|
I created this issue to track this potential enhancement: |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #2210 +/- ##
==================================================
+ Coverage 68.9576% 68.9784% +0.0208%
- Complexity 29591 29625 +34
==================================================
Files 1423 1423
Lines 114102 114217 +115
Branches 19793 19818 +25
==================================================
+ Hits 78682 78785 +103
- Misses 28800 28803 +3
- Partials 6620 6629 +9
🚀 New features to boost your workflow:
|
|
This is an interesting proposal. One immediate point of feedback is I would change the IntRange semantics to follow normal Groovy idioms. I know that means it might be slightly different to rust and other languages, but it follows what we do for all other Groovy methods. So, basically you'd get rid of the "- 1" when accessing getToInt() and change your examples from "0..arr.size()" to "0..<arr.size()" or "0..-1". Groovy let's you pick between (l, r), [l, r), [l, r] etc by using 0<..<4, 1..<4, 1..3 etc. I also haven't checked yet whether reverse ranges are handled properly, but perhaps you've already considered that. Also, Java doesn't have a sorted list but offers Collections.sort(). Groovy doesn't have partition (currently) but does have groupBy. So, I think it makes sense to discuss in more detail how this would typically be used. Is it less useful than in other languages given the JDK collection semantics or is other functionality needed (like a partition extension method) to reap the full benefits. I think discussing such things is best on the mailing list or the issue. |
…ew, method name to binaryPartitionPoint
@paulk-asert Thank you
after all, so I change the method name to binaryPartitionPoint to be more precise. |
|
I actually think i prefer just |
|
Also, was there anything in particular you used from |
cause partition_point is O(logn), suppose that element.get(i) is O(1). so here we need a "random access" data store (such as an array). the java doc says for AbstractList:
though there no actual limit in java collection framework. |
partitionPoint is ok for me. |
|
what about the usage, my q: why rust vec.partition_point is useful if there is binary_search already? Let me help you understand the difference between Let me explain why
let s = [0, 1, 1, 1, 1, 2, 3, 5, 8];
// To find the range of all 1s:
let low = s.partition_point(|x| x < &1); // Returns 1
let high = s.partition_point(|x| x <= &1); // Returns 5
// Now s[low..high] contains all the 1s
assert_eq!(&s[low..high], &[1, 1, 1, 1]);
Here's a practical example showing the difference: let vec = vec![1, 2, 2, 2, 3, 4];
// Using binary_search to find a 2:
match vec.binary_search(&2) {
Ok(idx) => println!("Found a 2 at {}", idx), // Will find *some* 2, but which one?
Err(_) => println!("Not found")
}
// Using partition_point to find ALL 2s:
let start = vec.partition_point(|x| x < &2);
let end = vec.partition_point(|x| x <= &2);
println!("All 2s are in positions {}..{}", start, end); // Gives you the exact rangeSo while both functions are useful,
|
|
I was thinking a bit more about usage overnight. For arrays, the "clean insertion" point argument is less compelling but somewhat useful none-the-less. Given that it isn't easy to partition primitive arrays, I think partitionPoint is slightly less compelling. We could get by with both bisectLeft and bisectRight for instance. But even for natural ordering scenarios, the convenience of a single method is nicer than the two separate methods in my view. I looked at the grades example from Python bisect: https://docs.python.org/3/library/bisect.html#examples With binary search we have something like this: We have to artificially subtract one from each of the breakpoints compared to the Python bisect example. With partitionPoint it would become: This is much cleaner. |
In general, we would use List here. That makes it work for all lists. The default is ArrayList which should have good performance but if the user wishes to supply some other kind of list, they will be responsible for ensuring they pick a list matching their performance requirements. |
|
Can you rename back to partitionPoint, and use List instead of AbstractList. I am still pondering the benefits of Predicate vs Closure for the DGM List and AGM T[] variants. Did you make them different for a reason? |
…artitionPoint, prefer Predicate<T> to Closure<?>, DGM prefer List to AbstractList
@paulk-asert ok, 3 have done:
|
|
I am not sure if "prefer Predicate to Closure<?>" is correct yet. I think that will be the case for primitive arrays, still thinking about List and T[]. It might be best to align those with how we'd sort those two things but I am just thinking through the cases whether it turns out to be the same. I'll writer another comment in a few hours. |
|
Predicate looks fine in all cases. |
|
Merged! Thanks! |
|
Apologies, I meant to squash when merging. |
partition_point missed in java and groovy like in