
Commit e86740d

docs: accurately document ZstdDecompressor.decompress()

Its behavior around multiple frames and extra input was under-specified. Let's explicitly call out the currently implemented behavior. Related to #59 and #181 and probably other issues.

1 parent: cdf9c92

1 file changed: zstandard/backend_cffi.py

Lines changed: 20 additions & 8 deletions
@@ -3717,18 +3717,28 @@ def memory_size(self):
 
     def decompress(self, data, max_output_size=0):
         """
-        Decompress data in its entirety in a single operation.
+        Decompress data in a single operation.
 
-        This method will decompress the entirety of the argument and return the
-        result.
+        This method will decompress the input data in a single operation and
+        return the decompressed data.
 
-        The input bytes are expected to contain a full Zstandard frame
+        The input bytes are expected to contain at least 1 full Zstandard frame
         (something compressed with :py:meth:`ZstdCompressor.compress` or
         similar). If the input does not contain a full frame, an exception will
         be raised.
 
+        If the input contains multiple frames, only the first frame will be
+        decompressed. If you need to decompress multiple frames, use an API
+        like :py:meth:`ZstdCompressor.stream_reader` with
+        ``read_across_frames=True``.
+
+        If the input contains extra data after a full frame, that extra input
+        data is silently ignored. This behavior is undesirable in many scenarios
+        and will likely be changed or controllable in a future release (see
+        #181).
+
         If the frame header of the compressed data does not contain the content
-        size ``max_output_size`` must be specified or ``ZstdError`` will be
+        size, ``max_output_size`` must be specified or ``ZstdError`` will be
         raised. An allocation of size ``max_output_size`` will be performed and an
         attempt will be made to perform decompression into that buffer. If the
         buffer is too small or cannot be allocated, ``ZstdError`` will be
@@ -3737,9 +3747,11 @@ def decompress(self, data, max_output_size=0):
         Uncompressed data could be much larger than compressed data. As a result,
         calling this function could result in a very large memory allocation
         being performed to hold the uncompressed data. This could potentially
-        result in ``MemoryError`` or system memory swapping. Therefore it is
-        **highly** recommended to use a streaming decompression method instead
-        of this one.
+        result in ``MemoryError`` or system memory swapping. If you don't need
+        the full output data in a single contiguous array in memory, consider
+        using streaming decompression for more resilient memory behavior.
+
+        Usage:
 
         >>> dctx = zstandard.ZstdDecompressor()
         >>> decompressed = dctx.decompress(data)
