BufWriter: Make the soundness of BorrowedBuf usage clearer.#155321
BufWriter: Make the soundness of BorrowedBuf usage clearer.#155321briansmith wants to merge 2 commits intorust-lang:mainfrom
BufWriter: Make the soundness of BorrowedBuf usage clearer.#155321Conversation
|
rustbot has assigned @Mark-Simulacrum. Use Why was this reviewer chosen?The reviewer was selected based on:
|
858db5d to
6f35180
Compare
There was a problem hiding this comment.
I think this seems like a good clarification with the comment addressed. Could you also improve https://doc.rust-lang.org/nightly/std/io/struct.BorrowedBuf.html#method.is_init to note confusingly say it returns the length when it's a boolean?
| // A less conservative alternative, if this causes performance issues, would be to | ||
| // enhance `flush_buf` to guarantee that it doesn't cause `self.buf` to reallocate | ||
| // or to de-initialize its spare capacity. (That may be possible as `flush_buf` | ||
| // never grows the buffer.) |
There was a problem hiding this comment.
Hm, I think that guarantee is almost upheld today. Vec::drain(..written) probably is allowed to semantically de-initialize the tail by moving the memory to the front, I'm not quite sure what our impl under the hood looks like with respect to that property.
I think as-is this will cause re-initialization of the buffer in many cases on every read, though. Can you look into whether flush_buf can be made to uphold that stronger guarantee? I think we would largely just need to ensure that we can guarantee no uninit bytes are written during the drain (maybe replacing it with a lower-level operation to do so)?
There was a problem hiding this comment.
Good idea. I updated the PR to replace the use of Vec::drain with two simpler operations like you suggested.
|
Reminder, once the PR becomes ready for a review, use |
6f35180 to
504e0f0
Compare
This comment has been minimized.
This comment has been minimized.
dac3f14 to
cc602b7
Compare
Done. I copied the documentation from |
|
@rustbot ready. |
| // preserving the spare capacity; see note above. | ||
| let new_len = self.buffer.len() - self.written; | ||
| // SAFETY: Assumes `<&mut [u8]>::copy_within` never writes | ||
| // outside of the slice, so it preserves the spare capacity. |
There was a problem hiding this comment.
I think it's safe to assume that the copy_within code can't create uninitialized bytes -- it's a safe method on &mut [u8] here, after all.
The trickier, unmentioned, safety assumption is that as_mut_slice and truncate don't de-initialize any bytes outside the Vec's len. I think the current wording (especially out of context in the SAFETY comments you added) can be read as preserving the allocation, but we need a stronger condition of not de-initializing the tail IIUC. Can we update to say something like "preserve spare capacity, potentially newly created (for truncate), as initialized if it was before"?
I'm not actually sure if we need the capacity to still exist -- a reallocating truncate might actually be OK? -- but that doesn't matter much since we have a public guarantee that truncate doesn't change the allocation:
Note that this method has no effect on the allocated capacity of the vector.
as_mut_slice has no such guarantee, but I think it's strongly implied since &mut s[..] calling into an allocator would be deeply weird.
There was a problem hiding this comment.
The trickier, unmentioned, safety assumption is that as_mut_slice and truncate don't de-initialize any bytes outside the Vec's len. I think the current wording (especially out of context in the SAFETY comments you added) can be read as preserving the allocation, but we need a stronger condition of not de-initializing the tail IIUC. Can we update to say something like "preserve spare capacity, potentially newly created (for truncate), as initialized if it was before"?
Thanks. I updated the comments to be more in line with your suggestion to mention that the concern is about de-initialization. I think that if/when truncate/as_mut_slice are changed in the future where an even more specific policy is needed, then the comment will get updated then with precisely what that change requires.
I'm not actually sure if we need the capacity to still exist -- a reallocating truncate might actually be OK? but that doesn't matter much since we have a public guarantee that truncate doesn't change the allocation:
The guarantee isn't that truncate won't change the allocation. The guarantee is that truncate won't change the capacity. There is probably a lot of code that assumes that the allocation won't be moved (i.e. as_ptr() and friends don't change in surprising ways) and moving wouldn't be allowed by implicit performance requirements, but there isn't a guarantee about that yet, AFAICT.
Regardless, I think the updated comments address the main risk; Vec reserves the right to do whatever it wants with the spare capacity (except shrink its size) but these methods are constrained by BufWriters usage to guarantee slightly more.
cc602b7 to
b2e0a3a
Compare
This comment has been minimized.
This comment has been minimized.
|
@rustbot ready. Just to make sure we're on the same page regarding the precise wording, I won't merge this without another review. |
b2e0a3a to
ffa1c5d
Compare
The previous safety comment was outdated as it was written before `BorrowedBuf::set_init` was changed to be a boolean. It also failed to address the possibility that `flush_buf` invalidated the assumption. Simplify the implementation of `flush_buf` to more obviously preserve the spare capacity, and document the need to preserve the spare capacity where necessary.
ffa1c5d to
2261684
Compare
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
The previous safety comment was outdated as it was written before
BorrowedBuf::set_initwas changed to be a boolean. It also failed to address the possibility thatflush_bufinvalidated the assumption.