This tutorial explains the basic mental model of the core BinaryParsec surface.
It starts with one of the simplest real binary layouts:
- the first 4 bytes store the payload length
- the next
Nbytes are the payload
If you are new to the library, this is the place to start before the more package-specific guides.
Suppose the bytes are:
00 00 00 05 48 65 6C 6C 6F
Read as:
00 00 00 05The payload length, encoded as a big-endian unsigned 32-bit integer.48 65 6C 6C 6FThe payload bytes.
That payload happens to be ASCII "Hello".
A ContiguousParser<'T> is a value that knows how to read a 'T from a contiguous block of bytes.
So when you see:
let message = ...that usually means:
"message" is the parser for one message
not:
"message" is a parsed message that already exists
This repo tends to name parsers after the thing they parse:
messageframefieldheader
That is why the parser name is often not parseMessage. Both styles are valid, but the repo usually prefers the shorter noun form inside parser modules.
Here is a minimal parser for the example protocol:
open BinaryParsec
open BinaryParsec.Syntax
let message =
parse {
let! size = u32be
let! payload = takeSlice (int size)
return payload
}Read it line by line:
parse { ... }Build a parser using the computation-expression syntax.let! size = u32beRead 4 bytes as a big-endianuint32.let! payload = takeSlice (int size)Now that the size is known, carve out exactly that many bytes as a zero-copy slice.return payloadReturn the payload portion.
This is the main parser pattern to remember:
read a field -> use its value to decide the next read
takeSlice does not copy bytes into a new array.
It returns a ByteSlice, which is a small descriptor:
type ByteSlice =
{
Offset: int
Length: int
}That means:
- where the bytes start in the original input
- how many bytes belong to that region
For the example input, the payload slice would effectively mean:
{ Offset = 4; Length = 5 }That is a zero-copy view of the payload. The parser keeps pointing into the original input buffer instead of allocating a new payload array.
Think of ByteSlice as a bookmark into the input.
To use the parser, run it against a ReadOnlySpan<byte>:
let input =
[|
0x00uy; 0x00uy; 0x00uy; 0x05uy
0x48uy; 0x65uy; 0x6Cuy; 0x6Cuy; 0x6Fuy
|]
match Contiguous.run message (ReadOnlySpan<byte>(input)) with
| Ok payloadSlice ->
let payload = ByteSlice.asSpan (ReadOnlySpan<byte>(input)) payloadSlice
let bytes = payload.ToArray()
()
| Error err ->
()What happens here:
Contiguous.runstarts the parser at byte0- on success, it returns the parsed value
- in this case, the parsed value is a
ByteSlice ByteSlice.asSpanresolves that slice back to real bytes over the original input
If you want a parser that returns both the length and the payload slice, make that explicit:
open BinaryParsec
open BinaryParsec.Syntax
type SizedMessage =
{
Size: uint32
Payload: ByteSlice
}
let message =
parse {
let! size = u32be
let! payload = takeSlice (int size)
return
{
Size = size
Payload = payload
}
}This is often easier to understand than returning only the payload.
For simple fixed sequences, you can also use the .>>. (zip) operator:
let transaction = u16be .>>. u16be // returns (uint16 * uint16)And the |>> (map) operator to transform the result:
let point = u16be .>>. u16be |>> fun (x, y) -> {| X = x; Y = y |}When a parser fails, it reports the position and a message. You can add a label to a parser to make the error message clearer using the <?> operator:
let message =
parse {
let! size = u32be <?> "Message size prefix"
let! payload = takeSlice (int size) <?> "Message payload"
return { Size = size; Payload = payload }
} <?> "Sized message"If the input ends abruptly while reading the size prefix, the error message will be prefixed with "Message size prefix".
There are two common choices:
- Keep
ByteSlicein the parser result when you want zero-copy tokenization. - Convert the slice to
byte[]later when you want a stable owned model.
There is a third common case:
- Parse the bounded payload immediately when the format reads more clearly that way.
For example:
open BinaryParsec
open BinaryParsec.Syntax
let payload =
parse {
let! command = ``byte``
let! argument = u16be
return command, argument
}
let message =
parse {
let! payloadLength = ``byte``
let! parsedPayload = parseExactly (int payloadLength) payload
return payloadLength, parsedPayload
}That reads more like a format spec because the parser makes the boundary and the nested structure both explicit.
That distinction is common in this repo:
- slice types point into the original input
- materialized types own their data
The core Contiguous runner expects one complete contiguous input buffer.
So if the bytes are arriving from a socket or stream, the usual outer flow is:
- read the 4-byte length prefix
- decode the size
- keep buffering until the full payload is available
- run the parser over the complete message bytes
The parser handles message structure. The transport layer handles incomplete reads and buffering.