This document describes the binary layout for each data type in the Bogo serialization format.
┌─────────────┬─────────────────────────────────────┐
│ Version │ Encoded Data │
│ (1 byte) │ (variable) │
└─────────────┴─────────────────────────────────────┘
- Version: Always
0x00(version 0) - appears only once at the start - Encoded Data: The actual encoded value using type-specific format below
┌─────────────┐
│ 0x00 │
│ (null) │
└─────────────┘
Total: 1 byte
┌─────────────┐
│ 0x01 │
│ (true) │
└─────────────┘
Total: 1 byte
┌─────────────┐
│ 0x02 │
│ (false) │
└─────────────┘
Total: 1 byte
┌─────────────┬─────────────┬─────────────┬─────────────────┐
│ 0x03 │ Len Size │ Length │ String Data │
│ (string) │ (1 byte) │ (varint) │ (len bytes) │
└─────────────┴─────────────┴─────────────┴─────────────────┘
- Len Size (X): Number of bytes used for the varint length encoding
- Length (Y): String length as varint, occupying exactly X bytes
- String Data: UTF-8 string bytes, exactly Y bytes
Parsing Steps:
- Read 1 byte → X (number of bytes for length)
- Read X bytes → decode as varint to get Y (actual string length)
- Read Y bytes → decode as UTF-8 string
┌─────────────┬─────────────┬─────────────────┐
│ 0x05 │ Len Size │ Varint Data │
│ (int) │ (1 byte) │ (len bytes) │
└─────────────┴─────────────┴─────────────────┘
- Len Size (X): Number of bytes used for the varint encoding
- Varint Data: Signed varint (zigzag encoded), occupying exactly X bytes
Parsing Steps:
- Read 1 byte → X (number of bytes for varint)
- Read X bytes → decode as signed varint
┌─────────────┬─────────────┬─────────────────┐
│ 0x06 │ Len Size │ Uvarint Data │
│ (uint) │ (1 byte) │ (len bytes) │
└─────────────┴─────────────┴─────────────────┘
- Len Size (X): Number of bytes used for the uvarint encoding
- Uvarint Data: Unsigned varint, occupying exactly X bytes
Parsing Steps:
- Read 1 byte → X (number of bytes for uvarint)
- Read X bytes → decode as unsigned varint
┌─────────────┬─────────────┬─────────────────┬─────────────────┐
│ 0x07 │ Len Size │ Sign+Exponent │ Mantissa │
│ (float) │ (1 byte) │ (2 bytes) │ (varint) │
└─────────────┴─────────────┴─────────────────┴─────────────────┘
- Len Size (X): Total size of sign+exponent+mantissa data
- Sign+Exponent: 2 bytes little-endian (sign bit + 11-bit exponent)
- Mantissa: 52-bit mantissa as uvarint (only if non-zero)
Parsing Steps:
- Read 1 byte → X (total size of float data)
- Read X bytes → first 2 bytes are sign+exponent, remaining bytes are mantissa varint
┌─────────────┬─────────────┬─────────────┬─────────────────┐
│ 0x08 │ Len Size │ Data Size │ Binary Data │
│ (blob) │ (1 byte) │ (varint) │ (size bytes) │
└─────────────┴─────────────┴─────────────┴─────────────────┘
- Len Size (X): Number of bytes used for the data size varint
- Data Size (Y): Size of binary data as varint, occupying exactly X bytes
- Binary Data: Raw bytes, exactly Y bytes
Parsing Steps:
- Read 1 byte → X (number of bytes for data size)
- Read X bytes → decode as varint to get Y (binary data size)
- Read Y bytes → raw binary data
Use Cases: Images, files, UUIDs, hashes, encrypted data, certificates
┌─────────────┬─────────────┬─────────────┬─────────────┬───────────────────────────────────────┐
│ 0x00 │ 0x08 │ 0x01 │ 0x10 │ UUID Bytes │
│ (version) │ (blob) │ (X=1) │ (Y=16) │ (16 bytes) │
└─────────────┴─────────────┴─────────────┴─────────────┴───────────────────────────────────────┘
16 bytes of raw UUID data (e.g., 550e8400-e29b-41d4-a716-446655440000)
┌─────────────┬─────────────────────────────────────┐
│ 0x09 │ Timestamp │
│(timestamp) │ (8 bytes) │
└─────────────┴─────────────────────────────────────┘
- Timestamp: Unix timestamp in milliseconds as int64, little-endian, exactly 8 bytes
Parsing Steps:
- Read 8 bytes → decode as int64 little-endian to get timestamp (milliseconds since Unix epoch)
Notes:
- Fixed 8-byte encoding (no length header needed)
- Millisecond precision
- Always UTC timezone
- Little-endian byte order
- Supports dates from 1970 to ~292 million years in the future
- Negative values for dates before 1970
┌─────────────┬─────────────┬─────────────────────────────────────────┐
│ 0x00 │ 0x09 │ Timestamp │
│ (version) │(timestamp) │ (1705317045123) │
└─────────────┴─────────────┴─────────────────────────────────────────┘
Timestamp 1705317045123 as 8-byte int64 little-endian = Jan 15, 2024 10:30:45.123 UTC
┌─────────────┬─────────────┬─────────────┬─────────────────┐
│ 0x0A │ Len Size │ Data Size │ Element Data │
│ (untyped) │ (1 byte) │ (varint) │ (size bytes) │
└─────────────┴─────────────┴─────────────┴─────────────────┘
- Len Size (X): Number of bytes used for the data size varint
- Data Size (Y): Total size of all encoded elements as varint, occupying exactly X bytes
- Element Data: Concatenated encoded elements, exactly Y bytes total
Parsing Steps:
- Read 1 byte → X (number of bytes for data size)
- Read X bytes → decode as varint to get Y (total size of all elements)
- Read Y bytes → parse as concatenated encoded elements (each with full type headers)
Note: Each element includes its own type byte and length information for boundary detection.
┌─────────────┬─────────────┬─────────────┬─────────────┬─────────────────┐
│ 0x0B │ Element Type│ Len Size │ List Count │ Element Data │
│(typed list) │ (1 byte) │ (1 byte) │ (varint) │ (size bytes) │
└─────────────┴─────────────┴─────────────┴─────────────┴─────────────────┘
┌─────────────┬─────────────┬─────────────┬─────────────┬─────────────────┐
│ 0x0B │ Element Type│ Len Size │ Data Size │ Element Data │
│(typed list) │ (1 byte) │ (1 byte) │ (varint) │ (size bytes) │
└─────────────┴─────────────┴─────────────┴─────────────┴─────────────────┘
- Element Type: The type of all elements
- Len Size (X): Number of bytes used for the count/size varint
- List Count (N): Number of elements (for fixed-size types only)
- Data Size (Y): Total size of all element data (for variable-size types)
- Element Data: Concatenated element data
Parsing Steps:
For Fixed-Size Types (TypeBoolTrue, TypeBoolFalse, TypeByte, TypeTimestamp):
- Read 1 byte → element type
- Read 1 byte → X (number of bytes for count)
- Read X bytes → decode as varint to get N (element count)
- For bools/bytes: Read N bytes → parse as N elements (1 byte each)
- For timestamps: Read N × 8 bytes → parse as N timestamps (8 bytes each)
For Variable-Size Types (TypeString, TypeInt, TypeUint, TypeFloat, TypeBlob):
- Read 1 byte → element type
- Read 1 byte → X (number of bytes for data size)
- Read X bytes → decode as varint to get Y (total data size)
- Read Y bytes → parse elements sequentially (each with type-specific length headers)
Predetermined Sizes:
- TypeBoolTrue/TypeBoolFalse: 1 byte each (no length header needed)
- TypeByte: 1 byte each (no length header needed)
- TypeTimestamp: 8 bytes each (no length header needed)
- TypeString/TypeInt/TypeUint/TypeFloat/TypeBlob: Variable (need length headers per element)
┌─────────────┬─────────────┬─────────────┬─────────────────┐
│ 0x0C │ Len Size │ Data Size │ Field Data │
│ (object) │ (1 byte) │ (varint) │ (size bytes) │
└─────────────┴─────────────┴─────────────┴─────────────────┘
- Len Size (X): Number of bytes used for the data size varint
- Data Size (Y): Total size of all key-value pairs as varint, occupying exactly X bytes
- Field Data: Concatenated key-value pairs, exactly Y bytes total
Each field entry in the Field Data follows this format:
┌─────────────┬─────────────┬─────────────┬─────────────┬─────────────────┐
│ Len Size │ Entry Size │ Key Length │ Key │ Value │
│ (1 byte) │ (varint) │ (1 byte) │ (key bytes) │ (encoded) │
└─────────────┴─────────────┴─────────────┴─────────────┴─────────────────┘
- Len Size (X): Number of bytes for the entry size varint
- Entry Size (Z): Total size of this key-value entry (key length + key + value), occupying X bytes
- Key Length (K): Length of key string (0-255 bytes)
- Key: UTF-8 key string, exactly K bytes
- Value: Encoded value using standard type encoding
Parsing Steps:
- Read 1 byte → X (number of bytes for data size)
- Read X bytes → decode as varint to get Y (total size of all fields)
- For each field entry within Y bytes: a. Read 1 byte → X₂ (bytes for entry size) b. Read X₂ bytes → decode as varint to get Z (this entry size) c. Read 1 byte → K (key length) d. Read K bytes → key string e. Read remaining bytes of entry → decode value using standard encoding
Key Constraints:
- Keys are limited to 255 bytes (single-byte length encoding)
- Keys must be valid UTF-8 strings
- Duplicate keys are allowed (implementation-defined behavior)
┌─────────────┬─────────────┬─────────────┬─────────────┬─────────────────────────┐
│ 0x00 │ 0x03 │ 0x01 │ 0x05 │ 'h' 'e' 'l' 'l' 'o' │
│ (version) │ (string) │ (X=1) │ (Y=5) │ (5 bytes) │
└─────────────┴─────────────┴─────────────┴─────────────┴─────────────────────────┘
X=1 means "read next 1 byte for length", Y=5 means "string is 5 bytes long"
┌─────────────┬─────────────┬─────────────┬─────────────┬─────────────────────────────────────────────┐
│ 0x00 │ 0x0A │ 0x01 │ 0x07 │ Element Data │
│ (version) │ (untyped) │ (X=1) │ (Y=7) │ (7 bytes) │
└─────────────┴─────────────┴─────────────┴─────────────┴─────────────────────────────────────────────┘
│
├─ 0x05 0x01 0x01 (int: 1)
└─ 0x03 0x01 0x02 'h' 'i' (string: "hi")
X=1 means "read next 1 byte for data size", Y=7 means "element data is 7 bytes total"
┌─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────────┐
│ 0x00 │ 0x0B │ 0x01 │ 0x01 │ 0x03 │ Element Data │
│ (version) │ (typed) │ (bool) │ (X=1) │ (N=3) │ (3 bytes) │
└─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────────┘
│
├─ 0x01 (true)
├─ 0x00 (false)
└─ 0x01 (true)
- Element Type=0x01 (TypeBoolTrue, but represents bool type)
- X=1, N=3 (3 elements)
- Each bool is 1 byte: 0x01=true, 0x00=false
- Total: 3 × 1 = 3 bytes (no length headers needed)
┌─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────────┐
│ 0x00 │ 0x0B │ 0x05 │ 0x01 │ 0x09 │ Element Data │
│ (version) │ (typed) │ (int) │ (X=1) │ (Y=9) │ (9 bytes) │
└─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────────┘
│
├─ 0x01 0x64 (100)
├─ 0x01 0xC8 (200)
└─ 0x02 0x2C 0x01 (300)
- Element Type=0x05 (TypeInt)
- X=1, Y=9 (total data size is 9 bytes)
- Each int has its own length header (varint encoding)
- Parser reads Y bytes total, parsing elements sequentially
┌─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────────┐
│ 0x00 │ 0x0B │ 0x03 │ 0x01 │ 0x0C │ Element Data │
│ (version) │ (typed) │ (string) │ (X=1) │ (Y=12) │ (12 bytes) │
└─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────────┘
│
├─ 0x01 0x01 'a' (len=1)
├─ 0x01 0x02 'bb' (len=2)
└─ 0x01 0x03 'ccc' (len=3)
- Element Type=0x03 (TypeString)
- X=1, Y=12 (total data size is 12 bytes)
- Each string has its own length header since strings are variable-length
┌─────────────┬─────────────┬─────────────┬─────────────┬───────────────────────────────────────────────┐
│ 0x00 │ 0x0C │ 0x01 │ 0x17 │ Field Data │
│ (version) │ (object) │ (X=1) │ (Y=23) │ (23 bytes) │
└─────────────┴─────────────┴─────────────┴─────────────┴───────────────────────────────────────────────┘
│
├─ Entry 1: "name":"John"
│ 0x01 0x0C 0x04 'name' 0x03 0x01 0x04 'John'
│ (1)(12)(4)(name)(string)(1)(4)(John)
│
└─ Entry 2: "age":25
0x01 0x07 0x03 'age' 0x05 0x01 0x19
(1)(7)(3)(age)(int)(1)(25)
Breakdown:
- Object has Y=23 bytes of field data total
- Entry 1: len_size=1, entry_size=12, key="name"(1+4=5 bytes), value="John"(string: 1+1+1+4=7 bytes)
- Total entry: 1 + 1 + (1+4) + (1+1+1+4) = 14 bytes
- Entry 2: len_size=1, entry_size=7, key="age"(1+3=4 bytes), value=25(int: 1+1+1=3 bytes)
- Total entry: 1 + 1 + (1+3) + (1+1+1) = 9 bytes
- Field data total: 14 + 9 = 23 bytes
- All multi-byte integers use little-endian encoding for fixed-width fields
- Varints follow Go's binary.Varint/Uvarint encoding
- Strings are UTF-8 encoded
- Lists and objects store total payload size for efficient parsing
- Float encoding separates IEEE 754 components for space efficiency