gh-136599: Improve long_hash #136600
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -1693,5 +1693,22 @@ class MyInt(int): | |||||
| # GH-117195 -- This shouldn't crash | ||||||
| object.__sizeof__(1) | ||||||
|
|
||||||
| def test_hash(self): | ||||||
| # gh-136599 | ||||||
| self.assertEqual(hash(-1), -2) | ||||||
| self.assertEqual(hash(0), 0) | ||||||
| self.assertEqual(hash(10), 10) | ||||||
|
|
||||||
| self.assertEqual(hash(sys.hash_info.modulus - 2), sys.hash_info.modulus - 2) | ||||||
| self.assertEqual(hash(sys.hash_info.modulus - 1), sys.hash_info.modulus - 1) | ||||||
| self.assertEqual(hash(sys.hash_info.modulus), 0) | ||||||
| self.assertEqual(hash(sys.hash_info.modulus + 1), 1) | ||||||
|
|
||||||
| self.assertEqual(hash(-sys.hash_info.modulus - 2), -2) | ||||||
| self.assertEqual(hash(-sys.hash_info.modulus - 1), -2) | ||||||
| self.assertEqual(hash(-sys.hash_info.modulus), 0) | ||||||
| self.assertEqual(hash(-sys.hash_info.modulus + 1), - (sys.hash_info.modulus - 1)) | ||||||
|
Member
Suggested change
|
||||||
|
|
||||||
|
|
||||||
| if __name__ == "__main__": | ||||||
| unittest.main() | ||||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| Improve performance of :class:`int` hash calculations. |
| Original file line number | Diff line number | Diff line change | ||
|---|---|---|---|---|
|
|
@@ -3676,7 +3676,20 @@ long_hash(PyObject *obj) | |||
| } | ||||
| i = _PyLong_DigitCount(v); | ||||
| sign = _PyLong_NonCompactSign(v); | ||||
| x = 0; | ||||
|
|
||||
| // unroll first two digits | ||||
| #if ( PyHASH_BITS > PyLong_SHIFT ) | ||||
|
Member
PyHASH_BITS can be either 31 or 61, while PyLong_SHIFT is either 15 or 30, so this condition is always true. BTW, it seems the 15-bit digit configuration is untested by regular CI. Is there at least one buildbot with such settings? I'll open an issue.
Suggested change
Contributor
Author
The case is indeed untested, but I added it because of a comment by Serhiy: #136600 (comment)
Member
FYI: #138336
Then fine. Though an assert might be an option.
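The "always true" claim can be checked from Python. This is a minimal sketch, assuming that `sys.hash_info.modulus.bit_length()` corresponds to `PyHASH_BITS` and `sys.int_info.bits_per_digit` to `PyLong_SHIFT`; the helper name `unroll_conditions` is made up for illustration and is not part of the PR.

```python
import sys


def unroll_conditions(hash_bits, digit_bits):
    """Evaluate the two preprocessor conditions from the diff (hypothetical helper)."""
    first = hash_bits > digit_bits        # PyHASH_BITS > PyLong_SHIFT
    second = hash_bits > 2 * digit_bits   # PyHASH_BITS > (2 * PyLong_SHIFT)
    return first, second


# The four combinations named in the review comment above.
COMBINATIONS = {
    (hb, db): unroll_conditions(hb, db)
    for hb in (31, 61)
    for db in (15, 30)
}

# The same check for the running interpreter (the mapping is an assumption).
CURRENT = unroll_conditions(
    sys.hash_info.modulus.bit_length(),
    sys.int_info.bits_per_digit,
)
```

Under these assumptions the first condition holds for every combination, and only the second one can be false (31-bit hash with 30-bit digits).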
Member
You can replace the test with a build assertion: |
||||
| assert(i>=2); | ||||
|
Member
Maybe just use Otherwise LGTM. |
||||
| --i; | ||||
| x = v->long_value.ob_digit[i]; | ||||
| assert(x < PyHASH_MODULUS); | ||||
| #endif | ||||
|
|
||||
| #if ( PyHASH_BITS > (2 * PyLong_SHIFT) ) | ||||
|
|
||||
| --i; | ||||
| x = ((x << PyLong_SHIFT)); | ||||
|
|
||||
| x += v->long_value.ob_digit[i]; | ||||
| assert(x < PyHASH_MODULUS); | ||||
| #endif | ||||
|
|
||||
| while (--i >= 0) { | ||||
| /* Here x is a quantity in the range [0, _PyHASH_MODULUS); we | ||||
| want to compute x * 2**PyLong_SHIFT + v->long_value.ob_digit[i] modulo | ||||
|
|
||||
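To illustrate why the first two digits can be taken without a modular reduction, here is a rough Python model of the loop in the hunk above. It is a sketch, not the C implementation: `to_digits` and `model_long_hash` are hypothetical names, the mapping of `PyHASH_BITS` and `PyLong_SHIFT` to `sys` attributes is an assumption, and the rotate-and-subtract step in the main loop is written from my understanding of the existing CPython loop rather than from this diff.

```python
import sys

P = sys.hash_info.modulus              # assumed to match PyHASH_MODULUS
HASH_BITS = P.bit_length()             # assumed to match PyHASH_BITS
SHIFT = sys.int_info.bits_per_digit    # assumed to match PyLong_SHIFT


def to_digits(n):
    """Split abs(n) into base-2**SHIFT digits, least significant first."""
    n = abs(n)
    digits = []
    while n:
        digits.append(n & ((1 << SHIFT) - 1))
        n >>= SHIFT
    return digits or [0]


def model_long_hash(n, unroll=True):
    """Model of the digit loop, with or without the unrolled prefix."""
    digits = to_digits(n)
    i = len(digits)
    x = 0
    if unroll and i >= 2:
        if HASH_BITS > SHIFT:
            # The top digit is below 2**SHIFT, hence already below P.
            i -= 1
            x = digits[i]
            assert x < P
        if HASH_BITS > 2 * SHIFT:
            # Two digits fit in 2*SHIFT bits, still below P.
            i -= 1
            x = (x << SHIFT) + digits[i]
            assert x < P
    while i > 0:
        i -= 1
        # Rotating left by SHIFT within HASH_BITS bits multiplies by
        # 2**SHIFT modulo P, because P == 2**HASH_BITS - 1.
        x = ((x << SHIFT) & P) | (x >> (HASH_BITS - SHIFT))
        x += digits[i]
        if x >= P:
            x -= P
    result = -x if n < 0 else x
    return -2 if result == -1 else result
```

For example, `model_long_hash(2**200 + 12345)` and `model_long_hash(2**200 + 12345, unroll=False)` should both agree with the built-in `hash()`.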
I think that part could be in a separate PR, merged first.
The docs describe the algorithm in detail:
https://docs.python.org/3/library/stdtypes.html#hashing-of-numeric-types
Maybe we could test it against a pure-Python implementation, also using hypothesis?
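A pure-Python reference along those lines could look like the sketch below. It follows the integer rule from the linked docs (reduce modulo `sys.hash_info.modulus`, negate for negative inputs, replace -1 with -2). The names `reference_int_hash` and `check_random` are hypothetical, and the standard-library `random` module stands in for hypothesis here so the snippet has no third-party dependency.

```python
import random
import sys


def reference_int_hash(n):
    """Pure-Python hash of an int, following the documented rule."""
    modulus = sys.hash_info.modulus
    h = abs(n) % modulus
    if n < 0:
        h = -h
    # -1 is reserved as an error marker at the C level, so it maps to -2.
    return -2 if h == -1 else h


def check_random(count=1000, seed=0):
    """Compare the reference against the built-in hash on random ints."""
    rng = random.Random(seed)
    for _ in range(count):
        bits = rng.randrange(1, 400)
        n = rng.getrandbits(bits)
        if rng.random() < 0.5:
            n = -n
        if reference_int_hash(n) != hash(n):
            return False
    return True
```

With hypothesis, the body of `check_random` would become a property test over `st.integers()`; the reference function itself would stay the same.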