
Commit 5b81ba1

polybassa, Copilot, and Nils Weiss authored
Add CBOR implementation following the ASN.1 implementation paradigm (by Copilot) (#4916)
* Implement CBOR parser following ASN.1 paradigm
  Address code review comments: improve error messages and implement proper half-float decoding
  Add cbor2 interoperability tests (cbor2 used ONLY in tests)
  Document cbor2 as test-only dependency in test file header
  Add CBOR documentation to advanced_usage.rst following ASN.1 pattern
  Add cbor2 to tox.ini testenv deps for CBOR interoperability tests
* Reorder imports in `cbor.py` and `cborcodec.py` for better organization; remove trailing whitespaces.
* Add adapted tests from PR #4875 for CBOR encoding edge cases
* Remove `advanced_usage.rst` documentation.
* Fix codacy issues
* Fix UTF-8 encoding test failures on Windows by specifying encoding in file opens
* Reorganize imports in `cbor.py` to improve readability.
* Add RandCBORObject for fuzzing and comprehensive unit tests
* Fix extremely slow CBOR fuzzing tests by optimizing recursive structure generation
* Fix syntax errors in CBOR unit tests by adding blank lines before assertions
* cleanup CBOR docs
* fix flake8
* fix ci issues
* Reformat CBOR code for improved readability and compliance with style guidelines.
* Update import path in CBOR documentation for consistency
* Update import path in `cbor.py` for consistency with module structure
* Adjust warning level in CBOR logging and suppress additional Sphinx warnings

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: polybassa <1676055+polybassa@users.noreply.github.com>
Co-authored-by: Nils Weiss <nils.weiss@dissecto.com>
1 parent e986fbc commit 5b81ba1

8 files changed

Lines changed: 2445 additions & 4 deletions


doc/scapy/advanced_usage/cbor.rst

Lines changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
CBOR
====

What is CBOR?
-------------

.. note::

    This section provides a practical introduction to CBOR from Scapy's perspective. For the complete specification, see RFC 8949.

CBOR (Concise Binary Object Representation) is a data format whose goal is to provide a compact, self-describing binary data interchange format based on the JSON data model. It is defined in RFC 8949 and is designed to be small in code size, reasonably small in message size, and extensible without the need for version negotiation.

CBOR provides basic data types including:

* **Unsigned integers** (major type 0): Non-negative integers
* **Negative integers** (major type 1): Negative integers
* **Byte strings** (major type 2): Raw binary data
* **Text strings** (major type 3): UTF-8 encoded strings
* **Arrays** (major type 4): Ordered sequences of values
* **Maps** (major type 5): Unordered key-value pairs
* **Semantic tags** (major type 6): Tagged values with additional semantics
* **Simple values and floats** (major type 7): Booleans, null, undefined, and floating-point numbers

Each CBOR data item begins with an initial byte that encodes the major type (in the top 3 bits) and additional information (in the low 5 bits). This design allows for compact encoding while maintaining self-describing properties.

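The split of the initial byte described above can be illustrated with a few lines of plain Python. This is a minimal sketch that does not use Scapy; the names ``MAJOR_TYPE_NAMES`` and ``split_initial_byte`` are made up for this example and are not part of the Scapy API.

```python
# Illustrative sketch only; these names are hypothetical, not Scapy's API.
MAJOR_TYPE_NAMES = {
    0: "unsigned integer",
    1: "negative integer",
    2: "byte string",
    3: "text string",
    4: "array",
    5: "map",
    6: "semantic tag",
    7: "simple value / float",
}


def split_initial_byte(initial_byte):
    """Return (major_type, additional_info) for one CBOR initial byte."""
    major_type = initial_byte >> 5         # top 3 bits
    additional_info = initial_byte & 0x1F  # low 5 bits
    return major_type, additional_info


for byte in (0x18, 0x85, 0xF5):
    major, info = split_initial_byte(byte)
    print(hex(byte), MAJOR_TYPE_NAMES[major], info)
```

For instance, ``0x85`` splits into major type 4 (array) with additional information 5, which is how a five-element array starts.
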
Scapy and CBOR
--------------


Creating CBOR objects
^^^^^^^^^^^^^^^^^^^^^

CBOR objects can be easily created and composed::

    >>> from scapy.cbor import CBOR_UNSIGNED_INTEGER, CBOR_TEXT_STRING, CBOR_BYTE_STRING, CBOR_ARRAY
    >>> # Create basic types
    >>> num = CBOR_UNSIGNED_INTEGER(42)
    >>> text = CBOR_TEXT_STRING("Hello, CBOR!")
    >>> data = CBOR_BYTE_STRING(b'\x01\x02\x03')
    >>>
    >>> # Create collections
    >>> arr = CBOR_ARRAY([CBOR_UNSIGNED_INTEGER(1),
    ...                   CBOR_UNSIGNED_INTEGER(2),
    ...                   CBOR_TEXT_STRING("three")])
    >>> arr
    <CBOR_ARRAY[[<CBOR_UNSIGNED_INTEGER[1]>, <CBOR_UNSIGNED_INTEGER[2]>, <CBOR_TEXT_STRING['three']>]]>
    >>>
    >>> # Prepare a dict to encode as a map (see CBORcodec_MAP.enc under "Encoding and decoding")
    >>> from scapy.cbor.cborcodec import CBORcodec_MAP
    >>> mapping = {"name": "Alice", "age": 30, "active": True}

Encoding and decoding
^^^^^^^^^^^^^^^^^^^^^

CBOR objects are encoded using their ``.enc()`` method; the example here calls ``bytes()`` on the object, which yields the encoded form. All codecs are referenced in the ``CBOR_Codecs`` object. The default codec is ``CBOR_Codecs.CBOR``::

    >>> from scapy.cbor import CBOR_Codecs, CBOR_UNSIGNED_INTEGER
    >>> num = CBOR_UNSIGNED_INTEGER(42)
    >>> encoded = bytes(num)
    >>> encoded.hex()
    '182a'
    >>>
    >>> # Decode back
    >>> decoded, remainder = CBOR_Codecs.CBOR.dec(encoded)
    >>> decoded.val
    42
    >>> isinstance(decoded, CBOR_UNSIGNED_INTEGER)
    True

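To see why 42 encodes as ``182a``, the following sketch encodes unsigned integers by hand, following the major type 0 rules of RFC 8949. It is plain Python written for illustration; ``encode_uint`` is a hypothetical helper, not the function Scapy uses internally.

```python
import struct


def encode_uint(value):
    """Encode a non-negative integer as a CBOR major type 0 item (sketch)."""
    if value < 0:
        raise ValueError("major type 0 only holds non-negative integers")
    if value < 24:
        # The value fits directly in the 5-bit additional information.
        return bytes([value])
    # Additional information 24..27 announces a 1-, 2-, 4- or 8-byte argument.
    for info, fmt, limit in ((24, ">B", 1 << 8), (25, ">H", 1 << 16),
                             (26, ">I", 1 << 32), (27, ">Q", 1 << 64)):
        if value < limit:
            return bytes([info]) + struct.pack(fmt, value)
    raise ValueError("value does not fit in 64 bits")


print(encode_uint(42).hex())    # 182a
print(encode_uint(10).hex())    # 0a
print(encode_uint(1000).hex())  # 1903e8
```

Values below 24 take a single byte; 42 needs the one-byte argument form, hence the ``0x18`` prefix followed by ``0x2a``.
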
Encoding collections::

    >>> from scapy.cbor import CBOR_Codecs, CBOR_MAP, CBORcodec_ARRAY, CBORcodec_MAP
    >>> # Encode an array
    >>> encoded = CBORcodec_ARRAY.enc([1, 2, 3, 4, 5])
    >>> encoded.hex()
    '850102030405'
    >>>
    >>> # Decode the array
    >>> decoded, _ = CBOR_Codecs.CBOR.dec(encoded)
    >>> [item.val for item in decoded.val]
    [1, 2, 3, 4, 5]
    >>>
    >>> # Encode a map
    >>> encoded = CBORcodec_MAP.enc({"x": 100, "y": 200})
    >>> decoded, _ = CBOR_Codecs.CBOR.dec(encoded)
    >>> isinstance(decoded, CBOR_MAP)
    True

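The byte string ``850102030405`` from the array example can be walked through by hand. The sketch below is a deliberately small decoder that only understands unsigned integers and definite-length arrays; ``decode_item`` is a hypothetical name used for illustration and says nothing about how Scapy's codec is implemented.

```python
def decode_item(data, offset=0):
    """Decode one unsigned integer or definite-length array (sketch).

    Returns (value, next_offset).
    """
    initial = data[offset]
    major, info = initial >> 5, initial & 0x1F
    offset += 1
    if info < 24:
        argument = info
    elif info <= 27:
        size = 1 << (info - 24)  # 1, 2, 4 or 8 argument bytes
        argument = int.from_bytes(data[offset:offset + size], "big")
        offset += size
    else:
        raise ValueError("unsupported additional information: %d" % info)
    if major == 0:
        return argument, offset
    if major == 4:
        # For arrays the argument is the number of items that follow.
        items = []
        for _ in range(argument):
            item, offset = decode_item(data, offset)
            items.append(item)
        return items, offset
    raise ValueError("unsupported major type: %d" % major)


value, end = decode_item(bytes.fromhex("850102030405"))
print(value, end)  # [1, 2, 3, 4, 5] 6
```

The first byte ``0x85`` announces an array of five items, and each following byte is a small unsigned integer stored directly in the additional information.
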
Working with different types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

CBOR supports various data types::

    >>> from scapy.cbor import (CBOR_TRUE, CBOR_FALSE, CBOR_NULL, CBOR_UNDEFINED,
    ...                         CBOR_FLOAT, CBOR_NEGATIVE_INTEGER)
    >>> # Booleans
    >>> true_val = CBOR_TRUE()
    >>> false_val = CBOR_FALSE()
    >>> bytes(true_val).hex()
    'f5'
    >>> bytes(false_val).hex()
    'f4'
    >>>
    >>> # Null and undefined
    >>> null_val = CBOR_NULL()
    >>> undef_val = CBOR_UNDEFINED()
    >>> bytes(null_val).hex()
    'f6'
    >>> bytes(undef_val).hex()
    'f7'
    >>>
    >>> # Floating point
    >>> float_val = CBOR_FLOAT(3.14159)
    >>> bytes(float_val).hex()
    'fb400921f9f01b866e'
    >>>
    >>> # Negative integers
    >>> neg = CBOR_NEGATIVE_INTEGER(-100)
    >>> bytes(neg).hex()
    '3863'

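The hex strings above follow directly from the RFC 8949 encoding rules. As a rough sketch in plain Python (the helper names are hypothetical and this is not Scapy's code), negative integers store ``-1 - n`` under major type 1, double-precision floats are the byte ``0xfb`` followed by a big-endian IEEE 754 double, and the four simple values are major type 7 with additional information 20 to 23.

```python
import struct

# Major type 7 simple values: additional information 20..23.
SIMPLE_VALUES = {"false": 0xF4, "true": 0xF5, "null": 0xF6, "undefined": 0xF7}


def encode_small_negative(value):
    """Encode a negative integer in the range -256..-1 (major type 1)."""
    argument = -1 - value  # CBOR stores -1 - n, so -100 becomes 99
    if not 0 <= argument < 256:
        raise ValueError("this sketch only covers -256..-1")
    if argument < 24:
        return bytes([0x20 | argument])
    return bytes([0x38, argument])  # 0x38 = major type 1, additional info 24


def encode_double(value):
    """Encode a float as a 64-bit IEEE 754 double (initial byte 0xfb)."""
    return b"\xfb" + struct.pack(">d", value)


print(encode_small_negative(-100).hex())  # 3863
print(encode_double(3.14159).hex())       # fb400921f9f01b866e
```

CBOR also defines half- and single-precision floats; this sketch only covers the double-precision form shown in the example.
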
Complex structures
^^^^^^^^^^^^^^^^^^

CBOR supports nested structures::

    >>> # Nested arrays
    >>> nested = CBORcodec_ARRAY.enc([1, [2, 3], [4, [5, 6]]])
    >>> decoded, _ = CBOR_Codecs.CBOR.dec(nested)
    >>> isinstance(decoded, CBOR_ARRAY)
    True
    >>>
    >>> # Complex maps with mixed types
    >>> data = {
    ...     "name": "Bob",
    ...     "age": 25,
    ...     "active": True,
    ...     "tags": ["user", "admin"]
    ... }
    >>> encoded = CBORcodec_MAP.enc(data)
    >>> decoded, _ = CBOR_Codecs.CBOR.dec(encoded)
    >>> len(decoded.val)
    4

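Nesting works because every array or map simply contains complete CBOR items. The recursive encoder below is a simplified sketch of that idea for plain Python values. The functions ``encode_head`` and ``encode`` are hypothetical names for this illustration; the sketch emits map entries in dictionary insertion order and makes no claim about the byte-for-byte output of Scapy's ``CBORcodec_MAP``.

```python
def encode_head(major_type, argument):
    """Encode the initial byte plus argument for a given major type."""
    prefix = major_type << 5
    if argument < 24:
        return bytes([prefix | argument])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            return bytes([prefix | info]) + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode(value):
    """Recursively encode a Python value as CBOR (sketch, no floats or tags)."""
    # bool is a subclass of int, so the simple values are checked first.
    if value is False:
        return b"\xf4"
    if value is True:
        return b"\xf5"
    if value is None:
        return b"\xf6"
    if isinstance(value, int):
        if value >= 0:
            return encode_head(0, value)
        return encode_head(1, -1 - value)
    if isinstance(value, bytes):
        return encode_head(2, len(value)) + value
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return encode_head(3, len(raw)) + raw
    if isinstance(value, list):
        return encode_head(4, len(value)) + b"".join(encode(v) for v in value)
    if isinstance(value, dict):
        body = b"".join(encode(k) + encode(v) for k, v in value.items())
        return encode_head(5, len(value)) + body
    raise TypeError("unsupported type: %s" % type(value).__name__)


print(encode([1, [2, 3], [4, [5, 6]]]).hex())  # 83018202038204820506
print(encode({"x": 100, "y": 200}).hex())      # a261781864617918c8
```
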
Semantic tags
^^^^^^^^^^^^^

CBOR supports semantic tags (major type 6) for providing additional meaning to data items::

    >>> from scapy.cbor import CBOR_SEMANTIC_TAG
    >>> # Tag 1 is for Unix epoch timestamps
    >>> import time
    >>> timestamp = int(time.time())
    >>> tagged = CBOR_SEMANTIC_TAG((1, CBOR_UNSIGNED_INTEGER(timestamp)))
    >>> encoded = bytes(tagged)
    >>> decoded, _ = CBOR_Codecs.CBOR.dec(encoded)
    >>> decoded.val[0]  # Tag number
    1


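On the wire, a tagged item is a major type 6 head carrying the tag number, followed by the enclosed data item. The sketch below decodes the RFC 8949 example ``c11a514b67b0`` (tag 1 wrapping the integer 1363896240) by hand and converts it to a UTC datetime. ``read_head`` and ``decode_epoch_tag`` are hypothetical helpers written for this illustration, not Scapy functions.

```python
from datetime import datetime, timezone


def read_head(data, offset=0):
    """Return (major_type, argument, next_offset) for the head at offset."""
    initial = data[offset]
    major, info = initial >> 5, initial & 0x1F
    offset += 1
    if info < 24:
        return major, info, offset
    if info > 27:
        raise ValueError("unsupported additional information: %d" % info)
    size = 1 << (info - 24)  # 1, 2, 4 or 8 argument bytes
    argument = int.from_bytes(data[offset:offset + size], "big")
    return major, argument, offset + size


def decode_epoch_tag(data):
    """Decode tag 1 wrapping an unsigned integer into a UTC datetime."""
    major, tag, offset = read_head(data)
    if major != 6 or tag != 1:
        raise ValueError("expected semantic tag 1")
    major, seconds, _ = read_head(data, offset)
    if major != 0:
        raise ValueError("this sketch only handles unsigned integer content")
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(decode_epoch_tag(bytes.fromhex("c11a514b67b0")).isoformat())
# 2013-03-21T20:04:00+00:00
```
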
Error handling
^^^^^^^^^^^^^^

Scapy provides safe decoding with error handling::

    >>> from scapy.cbor import CBOR_DECODING_ERROR
    >>> # Safe decoding returns error objects for invalid data
    >>> invalid_data = b'\xff\xff\xff'
    >>> obj, remainder = CBOR_Codecs.CBOR.safedec(invalid_data)
    >>> isinstance(obj, CBOR_DECODING_ERROR)
    True

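The general pattern behind a safe decoder is to catch the decoding exception and hand back an error object that still carries the raw bytes. The sketch below shows that pattern in plain Python for unsigned integers only. Every name in it (``DecodeError``, ``DecodingErrorResult``, ``strict_decode_uint``, ``safe_decode_uint``) is hypothetical, and the choice to return an empty remainder on failure is an assumption of this sketch rather than a statement about Scapy's ``safedec``.

```python
class DecodeError(ValueError):
    """Raised by the strict decoder on malformed input (sketch)."""


class DecodingErrorResult:
    """Placeholder object returned instead of raising (sketch)."""

    def __init__(self, raw, reason):
        self.raw = raw
        self.reason = reason


def strict_decode_uint(data):
    """Decode one unsigned integer or raise DecodeError."""
    if not data:
        raise DecodeError("empty input")
    major, info = data[0] >> 5, data[0] & 0x1F
    if major != 0:
        raise DecodeError("not an unsigned integer (major type %d)" % major)
    if info < 24:
        return info, data[1:]
    if info > 27:
        raise DecodeError("reserved or indefinite-length marker")
    size = 1 << (info - 24)
    if len(data) < 1 + size:
        raise DecodeError("truncated argument")
    return int.from_bytes(data[1:1 + size], "big"), data[1 + size:]


def safe_decode_uint(data):
    """Like strict_decode_uint, but returns an error object on failure."""
    try:
        return strict_decode_uint(data)
    except DecodeError as exc:
        # Assumption of this sketch: the whole input is kept in the error
        # object and nothing is left as remainder.
        return DecodingErrorResult(data, str(exc)), b""


obj, rest = safe_decode_uint(b"\xff\xff\xff")
print(type(obj).__name__, obj.reason)
```

The byte ``0xff`` is major type 7 with additional information 31, the "break" stop code, which is not a valid data item on its own.
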

doc/scapy/conf.py

Lines changed: 1 addition & 1 deletion
@@ -194,4 +194,4 @@
      'Miscellaneous'),
 ]

-suppress_warnings = ["app.add_directive"]
+suppress_warnings = ["app.add_directive", "ref.python"]

scapy/cbor/__init__.py

Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
# SPDX-License-Identifier: GPL-2.0-only
# This file is part of Scapy
# See https://scapy.net/ for more information

"""
Package holding CBOR (Concise Binary Object Representation) related modules.
Follows the same paradigm as ASN.1 implementation.
"""

from scapy.cbor.cbor import (
    CBOR_Error,
    CBOR_Encoding_Error,
    CBOR_Decoding_Error,
    CBOR_BadTag_Decoding_Error,
    CBOR_Codecs,
    CBOR_MajorTypes,
    CBOR_Object,
    CBOR_UNSIGNED_INTEGER,
    CBOR_NEGATIVE_INTEGER,
    CBOR_BYTE_STRING,
    CBOR_TEXT_STRING,
    CBOR_ARRAY,
    CBOR_MAP,
    CBOR_SEMANTIC_TAG,
    CBOR_SIMPLE_VALUE,
    CBOR_FALSE,
    CBOR_TRUE,
    CBOR_NULL,
    CBOR_UNDEFINED,
    CBOR_FLOAT,
    CBOR_DECODING_ERROR,
    RandCBORObject,
)

from scapy.cbor.cborcodec import (
    CBORcodec_Object,
    CBORcodec_UNSIGNED_INTEGER,
    CBORcodec_NEGATIVE_INTEGER,
    CBORcodec_BYTE_STRING,
    CBORcodec_TEXT_STRING,
    CBORcodec_ARRAY,
    CBORcodec_MAP,
    CBORcodec_SEMANTIC_TAG,
    CBORcodec_SIMPLE_AND_FLOAT,
)

__all__ = [
    # Exceptions
    "CBOR_Error",
    "CBOR_Encoding_Error",
    "CBOR_Decoding_Error",
    "CBOR_BadTag_Decoding_Error",
    # Codecs
    "CBOR_Codecs",
    "CBOR_MajorTypes",
    # Objects
    "CBOR_Object",
    "CBOR_UNSIGNED_INTEGER",
    "CBOR_NEGATIVE_INTEGER",
    "CBOR_BYTE_STRING",
    "CBOR_TEXT_STRING",
    "CBOR_ARRAY",
    "CBOR_MAP",
    "CBOR_SEMANTIC_TAG",
    "CBOR_SIMPLE_VALUE",
    "CBOR_FALSE",
    "CBOR_TRUE",
    "CBOR_NULL",
    "CBOR_UNDEFINED",
    "CBOR_FLOAT",
    "CBOR_DECODING_ERROR",
    # Random/Fuzzing
    "RandCBORObject",
    # Codec classes
    "CBORcodec_Object",
    "CBORcodec_UNSIGNED_INTEGER",
    "CBORcodec_NEGATIVE_INTEGER",
    "CBORcodec_BYTE_STRING",
    "CBORcodec_TEXT_STRING",
    "CBORcodec_ARRAY",
    "CBORcodec_MAP",
    "CBORcodec_SEMANTIC_TAG",
    "CBORcodec_SIMPLE_AND_FLOAT",
]
