gh-90533: Implement BytesIO.peek() #30808
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -67,6 +67,10 @@ def readinto(barrier, b, into, *ignore):` |
| | | `    barrier.wait()` |
| | | `    b.readinto(into)` |
| | | |
| | | `def peek(barrier, b, into, *ignore):` |
Contributor
Need a bit more than this. These helpers define "how do we make the many threads race as much as possible" / help synchronize their start. This needs to also get added to the So
| | | `    barrier.wait()` |
| | | `    b.peek(into)` |
| | | |
| | | `def close(barrier, b, *ignore):` |
| | | `    barrier.wait()` |
| | | `    b.close()` |

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -0,0 +1 @@` |
| | | `` Add :meth:`io.BytesIO.peek`. `` |

Why not default to `size=0` like `BufferedReader`?
Read through the comments. In general anything returning a `bytes` rather than a `bytes-like` memoryview is going to require allocating and copying potentially many bytes... If the copy is really concerning I'd lean toward returning a memoryview rather than a `bytes`, which is a mandatory copy.
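
To make the copy-versus-view trade-off concrete, here is a minimal sketch that uses only the existing `io.BytesIO.getbuffer()` API. The helper names `peek_copy` and `peek_view` are hypothetical and are not part of this PR; they only illustrate the difference in cost being discussed.

```python
import io


def peek_copy(buf: io.BytesIO, size: int) -> bytes:
    """Hypothetical helper: up to `size` bytes at the current position, as new bytes."""
    pos = buf.tell()
    with buf.getbuffer() as view:
        # bytes(...) allocates a new object and copies the slice into it.
        return bytes(view[pos:pos + size])


def peek_view(buf: io.BytesIO, size: int) -> memoryview:
    """Hypothetical helper: the same range as a memoryview slice, without copying."""
    pos = buf.tell()
    # While the returned view is alive, the BytesIO cannot be resized or closed.
    return buf.getbuffer()[pos:pos + size]


buf = io.BytesIO(b"hello world")
buf.read(6)                     # advance the position to "world"
copied = peek_copy(buf, 5)      # independent copy of the next 5 bytes
view = peek_view(buf, 5)        # zero-copy window onto the same 5 bytes
same = bytes(view) == copied
view.release()                  # drop the export so buf can be resized again
```

Neither helper moves the stream position. The view variant avoids the allocation, but it pins the underlying buffer until it is released, which is one reason a `bytes` return value may be the simpler contract.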
`memoryview` came up in this comment and the three following it: #30808 (comment)

I don’t know what the conclusion is given your comment. Should a memoryview be returned instead? Most important to me is compatibility with what `BufferedReader.peek()` returns.

I am not too concerned about the extra memory for a `bytes` object as long as the default is `size=1`.
👍 to not using memoryview / definitely a good rationale in that thread.
On 3.14.0 BufferedReader (intentionally limiting buffer size) gives:
Which doesn't match the behavior here or defaults. That in particular concerns me because it seems like people will build and test I/O stack pieces (ex. file parsing) expecting the `.peek` behavior of `BytesIO` then get something different when they read actual files/data...

Looking for more alternatives, the documentation page says (https://docs.python.org/3/library/io.html#io.BufferedReader.peek):
To me that leaves an intentional gap where we can always return less data than the total amount available. Peek gives no guarantees. That makes me wonder: Could we default to 0 and just always slice `[:DEFAULT_BUFFER_SIZE]`?

That would mean you always get `DEFAULT_BUFFER_SIZE` which matches default `BufferedReader` behavior unless there is less data available. If called in a loop yes that's a DEFAULT_BUFFER_SIZE repeated copy (BufferedIO also does that / the `bytes` requires a copy), but it's a lot less than "the whole buffer".
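
A rough sketch of what that suggestion could look like, written in pure Python on top of the existing `getbuffer()` API. The function name `peek_sketch` is hypothetical, and treating `size` purely as a hint that is ignored is an assumption made for this illustration, not the behavior implemented in the PR.

```python
import io


def peek_sketch(buf: io.BytesIO, size: int = 0) -> bytes:
    """Hypothetical sketch: return at most io.DEFAULT_BUFFER_SIZE bytes without advancing.

    Assumption: `size` is only a hint and is ignored here.
    """
    pos = buf.tell()
    with buf.getbuffer() as view:
        # Cap the copy at DEFAULT_BUFFER_SIZE instead of copying the whole remainder.
        return bytes(view[pos:pos + io.DEFAULT_BUFFER_SIZE])


data = b"x" * (io.DEFAULT_BUFFER_SIZE * 3)
stream = io.BytesIO(data)
peeked = peek_sketch(stream)

# For comparison: a default-sized BufferedReader wrapped around the same data.
buffered = io.BufferedReader(io.BytesIO(data)).peek()
```

With this cap, each call copies a bounded amount regardless of how large the underlying buffer is, and a short stream simply yields whatever is left.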