Skip to content

Added the ability to process streams that use CR or CRLF as newlines#25

Merged
clue merged 1 commit into
clue:masterfrom
boenrobot:eol-processing
Jan 26, 2022
Merged

Added the ability to process streams that use CR or CRLF as newlines#25
clue merged 1 commit into
clue:masterfrom
boenrobot:eol-processing

Conversation

@boenrobot

Copy link
Copy Markdown
Contributor

Fixes #6

@SimonFrings

Copy link
Copy Markdown
Contributor

@boenrobot thanks for looking into this, definitely an interesting solution 👍

I was thinking, what if there's the case a chunk ends with \r\n\r and the last \n is missing because it didn't fit. This would lead to the next message starting with \n and result in the parser failing to parse the whole message.

What do you think about this one?

@SimonFrings SimonFrings added bug Something isn't working new feature New feature or request labels Jan 21, 2022
@boenrobot

Copy link
Copy Markdown
Contributor Author

Interesting edge case... I guess it should be safe to just remove the leading space from the buffer? I'll prepare a test and see if that addresses it.

@boenrobot

boenrobot commented Jan 21, 2022

Copy link
Copy Markdown
Contributor Author

I added a test case, and it seems like it passes.

And on closer examination, that seems to make sense thanks to the fact that empty lines in a message are ignored. The first message is cut at the second \r, and the next \n is treated as an empty line in another message, and that empty line is effectively skipped due to $name being false (since the line doesn't contain :). That way, the end result is the same.

@boenrobot boenrobot force-pushed the eol-processing branch 2 times, most recently from 5a75f95 to 683805f Compare January 24, 2022 09:55
@boenrobot

boenrobot commented Jan 24, 2022

Copy link
Copy Markdown
Contributor Author

I've reverted what I thought was a performance improvement in the building of the message, after doing some synthetic benchmarks, and added an "/S" flag to the regex, since the optimizations it's supposed to make are applicable exactly in scenarios like this one.

And I've squashed all commits into one, to not pollute the commit log.

@clue clue left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@boenrobot Thanks for looking into this, changes LGTM!

Looks like an excellent improvement over the current situation, we can also build on top of this if we find that a broken up CRLF shows up again!

Keep it up! :shipit:

@clue clue added this to the v1.0.0 milestone Jan 26, 2022
@clue clue merged commit 09bc48a into clue:master Jan 26, 2022
@boenrobot boenrobot deleted the eol-processing branch January 26, 2022 17:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working new feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Also accept CR in place of LF (or CRLF)

3 participants