optimize put_back_char#3317
Conversation
Always encode the char directly into the buffer. Only shift the buffer content if there isn't enough room in the front.
98d3664 to
af54fb2
Compare
|
Rebased on top of #3316 to prevent conflicts |
There was a problem hiding this comment.
Note to self: see if these tests are still slow
There was a problem hiding this comment.
Unexpected results for running the following test script
./test.sh
#!/usr/bin/bash
set -x
commit=$(git describe)
cargo clean --workspace
cargo +nightly miri test --no-run -- --ignored lorem_ipsum
run()
{
time cargo +nightly miri test -- --ignored lorem_ipsum
}
run &> "${commit}.log"
| commit | real | user | sys |
|---|---|---|---|
| 8dffd72 (master) | 0m39.583s | 0m39.757s | 0m0.553s |
| b2dad66 (PR 3316) | 17m26.012s | 17m24.759s | 0m0.804s |
| af54fb2 (this PR) | 6m56.416s | 6m56.457s | 0m0.570s |
|
|
||
| // Raise the error. If we still have data in | ||
| // the buffer, it will be returned on the next | ||
| // loop. |
There was a problem hiding this comment.
As far as I can tell this comment is a lie.
Even before my change there is nothing that consumes the invalid bytes from the buffer.
The test I added has to manually consume the invalid bytes
There was a problem hiding this comment.
The comment can be false. It would be a lie if it is deliberately false, which I think we can safely exclude. Very likely it is an error due to a wrong assumption or mental model of the situation.
There was a problem hiding this comment.
Sorry, lie might have been a bit strong of a word.
Could also be that the code was changed but the comment was overlooked and not updated in tandem.
There was a problem hiding this comment.
The question is: Is the current behavior, the comment or neither correct?
The comment reads to me as if peek_char should consume the invalid bytes.
Currently implemented is that neither peek_char nor read_char consume the invalid bytes when they return the error.
It might be reasonable for only read_char to consume the invalid bytes when it returns the error similar to how it consumes the valid bytes when it returns a char.
There was a problem hiding this comment.
@mthom you wrote the comment in question, could you shine some light on what the expected behavior is. Currently it appears to me that the implementation and the comment disagree.
Always encode the char directly into the buffer.
Only shift the buffer content if there isn't enough room in the front.
I noticed in #3314 that we were copying the buffer content into a temp buffer just to copy it back.
Always doing two copies seamed unnecessary.
This now does zero copies if there is enough space at the front and otherwise inserts the missing amount of bytes at the front copying the existing content once.
So this changes from always two copies to at most one1.
Footnotes
the two copies were to a separate allocation so non-overlapping, the new copy is most definitely an overlapping copy. ↩