optimize put_back_char by Skgland · Pull Request #3317 · mthom/scryer-prolog

Skgland · 2026-04-26T11:48:25Z

Always encode the char directly into the buffer.
Only shift the buffer content if there isn't enough room in the front.

I noticed in #3314 that we were copying the buffer content into a temp buffer just to copy it back.
Always doing two copies seamed unnecessary.
This now does zero copies if there is enough space at the front and otherwise inserts the missing amount of bytes at the front copying the existing content once.
So this changes from always two copies to at most one¹.

the two copies were to a separate allocation so non-overlapping, the new copy is most definitely an overlapping copy. ↩

Always encode the char directly into the buffer. Only shift the buffer content if there isn't enough room in the front.

Skgland · 2026-04-26T12:15:04Z

Rebased on top of #3316 to prevent conflicts

Skgland · 2026-04-29T10:46:56Z

Note to self: see if these tests are still slow

Unexpected results for running the following test script

./test.sh

#!/usr/bin/bash set -x commit=$(git describe) cargo clean --workspace cargo +nightly miri test --no-run -- --ignored lorem_ipsum run() { time cargo +nightly miri test -- --ignored lorem_ipsum } run &> "${commit}.log"

commit real user sys

8dffd72 (master) 0m39.583s 0m39.757s 0m0.553s

b2dad66 (PR 3316) 17m26.012s 17m24.759s 0m0.804s

af54fb2 (this PR) 6m56.416s 6m56.457s 0m0.570s

Commit f8ddd78 improves the performance to a point better than before 8dffd72

commit real user sys

f8ddd78 0m27.786s 0m26.686s 0m0.616s

What I meant to write was

Commit f8ddd78 improves the performance to a point better than before b2dad66 (PR 3316)
i.e better than 8dffd72 (master)

Skgland · 2026-04-30T18:01:13Z


            // Raise the error. If we still have data in
            // the buffer, it will be returned on the next
            // loop.


As far as I can tell this comment is a lie.
Even before my change there is nothing that consumes the invalid bytes from the buffer.

The test I added has to manually consume the invalid bytes

https://github.com/mthom/scryer-prolog/pull/3317/changes#diff-2a865146e51bf3d68c6f674a0abdcf0da9c8a2c25b0d04fa040375443bdb2c86R402

https://github.com/mthom/scryer-prolog/pull/3317/changes#diff-2a865146e51bf3d68c6f674a0abdcf0da9c8a2c25b0d04fa040375443bdb2c86R421

The comment can be false. It would be a lie if it is deliberately false, which I think we can safely exclude. Very likely it is an error due to a wrong assumption or mental model of the situation.

Sorry, lie might have been a bit strong of a word.
Could also be that the code was changed but the comment was overlooked and not updated in tandem.

The question is: Is the current behavior, the comment or neither correct?

The comment reads to me as if peek_char should consume the invalid bytes.
Currently implemented is that neither peek_char nor read_char consume the invalid bytes when they return the error.
It might be reasonable for only read_char to consume the invalid bytes when it returns the error similar to how it consumes the valid bytes when it returns a char.

@mthom you wrote the comment in question, could you shine some light on what the expected behavior is. Currently it appears to me that the implementation and the comment disagree.

josd and others added 2 commits April 26, 2026 13:42

Improve CharReader buffering

b2dad66

optimize put_back_char

af54fb2

Always encode the char directly into the buffer. Only shift the buffer content if there isn't enough room in the front.

Skgland force-pushed the optimize-put_back_char branch from 98d3664 to af54fb2 Compare April 26, 2026 12:14

Skgland commented Apr 29, 2026

View reviewed changes

Skgland added 4 commits April 30, 2026 19:01

add a test with bad utf-8

48a34aa

use a function rather than a closure

327a677

optimize CharReader

f8ddd78

un-ignore lorem_ipsum tests as they are no longer slow

89dde3f

Skgland commented Apr 30, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

optimize put_back_char#3317

optimize put_back_char#3317
Skgland wants to merge 6 commits into
mthom:masterfrom
Skgland:optimize-put_back_char

Skgland commented Apr 26, 2026

Uh oh!

Skgland commented Apr 26, 2026

Uh oh!

Skgland Apr 29, 2026

Uh oh!

Skgland Apr 29, 2026

Uh oh!

Skgland Apr 30, 2026

Uh oh!

Skgland Apr 30, 2026 •

edited

Loading

Uh oh!

Skgland Apr 30, 2026

Uh oh!

triska Apr 30, 2026

Uh oh!

Skgland Apr 30, 2026 •

edited

Loading

Uh oh!

Skgland Apr 30, 2026

Uh oh!

Skgland Apr 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

commit	real	user	sys
`8dffd72` (master)	0m39.583s	0m39.757s	0m0.553s
`b2dad66` (PR 3316)	17m26.012s	17m24.759s	0m0.804s
`af54fb2` (this PR)	6m56.416s	6m56.457s	0m0.570s

Conversation

Skgland commented Apr 26, 2026

Footnotes

Uh oh!

Skgland commented Apr 26, 2026

Uh oh!

Skgland Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Skgland Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Skgland Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

Skgland Apr 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Skgland Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

triska Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

Skgland Apr 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Skgland Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

Skgland Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Skgland Apr 30, 2026 •

edited

Loading

Skgland Apr 30, 2026 •

edited

Loading