Feature or enhancement
Overview
collections.Counter.most_common() currently has two execution paths:
if n is None:
return sorted(self.items(), key=itemgetter(1), reverse=True)
else:
return heapq.nlargest(n, self.items(), key=itemgetter(1))
While this is correct and efficient in general, there are common cases (especially n == 1 or very small n) where heapq.nlargest introduces avoidable overhead (heap allocation, tuple comparisons, Python-level callbacks).
This proposal is to investigate and potentially introduce small fast paths for common n values, without changing semantics or public API behavior.
Counter.most_common(1) is a very common usage pattern (e.g., “find the mode”), but it still uses the full capability of heapq.nlargest.
For example:
c = Counter(data)
c.most_common(1)
This could potentially be implemented more efficiently via a single-pass max selection:
max(self.items(), key=itemgetter(1))
or an equivalent loop-based approach, avoiding heap construction entirely.
Similarly, for very small n, a specialized strategy may outperform heapq.nlargest, especially for large counters.
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response
Linked PRs
Feature or enhancement
Overview
collections.Counter.most_common()currently has two execution paths:While this is correct and efficient in general, there are common cases (especially
n == 1or very smalln) whereheapq.nlargestintroduces avoidable overhead (heap allocation, tuple comparisons, Python-level callbacks).This proposal is to investigate and potentially introduce small fast paths for common
nvalues, without changing semantics or public API behavior.Counter.most_common(1)is a very common usage pattern (e.g., “find the mode”), but it still uses the full capability ofheapq.nlargest.For example:
This could potentially be implemented more efficiently via a single-pass max selection:
or an equivalent loop-based approach, avoiding heap construction entirely.
Similarly, for very small
n, a specialized strategy may outperformheapq.nlargest, especially for large counters.Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response
Linked PRs
Counter.most_common()Optimization for Smalln+ Test Cases #144129