compress2/decompress2 corrupt all-zeros buffer when length is not a multiple of `typesize`

`blosc2.compress2()` followed by `blosc2.decompress2()` fails to round-trip an
**all-zeros** input whose **byte length is not a multiple of `typesize`**.
`compress2` emits a 32-byte "all-zeros special value" frame that `decompress2`
then refuses to decode, raising:

```
ValueError: Error while decompressing, check the src data and/or the dparams
```

The data is silently lost at compress time (the 32-byte frame does not encode
the true length correctly), so this is a data-corruption bug, not merely a
decode-side error.

## Environment

- python-blosc2: **4.5.1**
- c-blosc2: **3.1.4** (2026-06-17)
- Python: 3.12.11
- numpy: 2.1.2 (not required to reproduce)
- OS: Linux 6.17 x86_64 (glibc 2.39)

Also reproduced on python-blosc2 4.3.3 (with an earlier c-blosc2 release), so
this is not a recent regression.

## Minimal reproduction (no numpy)

```python
import blosc2

data = bytes(707658)            # all zeros; 707658 % 8 == 2  (NOT a multiple of typesize 8)
c = blosc2.compress2(data, typesize=8)
print(len(c))                   # -> 32  (all-zeros special-value frame)
blosc2.decompress2(c)           # -> ValueError: Error while decompressing, ...
```

## Trigger conditions (both required)

1. The input is **all zeros** (triggers blosc2's zero special-value frame; the
   compressed output is 32 bytes regardless of input size).
2. The input **byte length is not a multiple of `typesize`**.

The codec does not matter: reproduced with `ZSTD`, `LZ4`, and `BLOSCLZ`.

If either condition does not hold, the round-trip succeeds.
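For triage, the two conditions on the input (all zeros, length not a multiple of `typesize`) can be checked without calling blosc2 at all. The sketch below is only an illustration; the helper name `hits_zero_frame_case` is hypothetical and not part of python-blosc2.

```python
def hits_zero_frame_case(data: bytes, typesize: int) -> bool:
    """Return True when `data` matches both input-side conditions of this report.

    Hypothetical helper for triage; it does not call blosc2.
    """
    if typesize < 1:
        raise ValueError("typesize must be a positive integer")
    # bytes.count(0) counts zero bytes, so equality means every byte is zero.
    all_zeros = data.count(0) == len(data)
    # An empty buffer has length 0, which is a multiple of any typesize.
    return all_zeros and len(data) % typesize != 0


print(hits_zero_frame_case(bytes(707658), 8))  # failing case from the repro
print(hits_zero_frame_case(bytes(707656), 8))  # length is a multiple of 8
```

A check like this could be run over stored payloads to estimate how many of them would be affected before changing any compression settings.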

## Controls (all behave correctly)

```python
import os

import blosc2

# Length IS a multiple of typesize -> OK
blosc2.decompress2(blosc2.compress2(bytes(707656), typesize=8))      # OK (707656 % 8 == 0)

# typesize=1 -> every length is a multiple -> OK
blosc2.decompress2(blosc2.compress2(bytes(707658), typesize=1))      # OK

# Non-zero data at the same (non-multiple) length -> OK
blosc2.decompress2(blosc2.compress2(b"\x07" * 707658, typesize=8))   # OK (clen=86, not the 32-byte zero frame)

# Random/incompressible data at the same length -> OK
blosc2.decompress2(blosc2.compress2(os.urandom(707658), typesize=8)) # OK
```

## Divisibility sweep (all-zeros, `typesize=8`)

| length | length % 8 | result |
|-------:|:----------:|:------:|
| 80000  | 0 | OK |
| 80001  | 1 | **FAIL** |
| 80007  | 7 | **FAIL** |
| 80008  | 0 | OK |
| 707656 | 0 | OK |
| 707658 | 2 | **FAIL** |
| 707664 | 0 | OK |

Same pattern for `typesize=4` (fails unless `len % 4 == 0`) and `typesize=2`
(fails unless `len % 2 == 0`). `typesize=1` always succeeds.
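A sweep like the one in the table can be regenerated with a small harness. This is a sketch, not the exact script used for the table: `sweep` and `blosc2_roundtrip` are hypothetical names, and the harness treats any exception raised by the round-trip as a failure (the report above observed `ValueError`).

```python
from typing import Callable, Iterable


def sweep(
    roundtrip: Callable[[bytes, int], bytes],
    lengths: Iterable[int],
    typesize: int,
) -> list[tuple[int, int, str]]:
    """Round-trip an all-zeros buffer of each length; return (length, remainder, result)."""
    rows = []
    for n in lengths:
        data = bytes(n)  # all-zeros input of n bytes
        try:
            ok = roundtrip(data, typesize) == data
        except Exception:  # any decode error counts as a failed round-trip
            ok = False
        rows.append((n, n % typesize, "OK" if ok else "FAIL"))
    return rows


def blosc2_roundtrip(data: bytes, typesize: int) -> bytes:
    """Round-trip through compress2/decompress2, called as in the repro above."""
    import blosc2  # imported lazily so `sweep` itself has no hard dependency

    return bytes(blosc2.decompress2(blosc2.compress2(data, typesize=typesize)))
```

Calling `sweep(blosc2_roundtrip, [80000, 80001, 80007, 80008, 707656, 707658, 707664], typesize=8)` on an affected install should yield rows matching the table; passing `typesize=4` or `typesize=2` covers the other cases mentioned.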

## Related observation

The blosc1-compatibility API guards against this by rejecting non-multiple
lengths up front:

```python
blosc2.compress(bytes(707658), typesize=8)
# ValueError: len(src) can only be a multiple of typesize (8).
```

`compress2` instead accepts the same input and produces a frame that cannot be
decompressed. It should either apply the same validation, or (preferably)
correctly handle a trailing partial element in the all-zeros special-value path.

## Impact

Real-world hit: we compress arbitrary numpy arrays as raw byte streams. An
all-zeros region (e.g. a cleared/blank segmentation tile) of `166*49*87 =
707658` bytes silently produced an undecodable frame, surfacing only at
decompress time on the receiving end. Passing `typesize=1` is a safe workaround
for byte-stream payloads, but the underlying `compress2`/`decompress2`
inconsistency looks like a genuine bug.

## Workaround

Pass `typesize=1` when compressing a raw byte stream (or otherwise ensure the
length is a multiple of `typesize`).
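One way to apply the workaround without touching every call site is a thin wrapper that falls back to `typesize=1` only when the length is not a multiple of the requested `typesize`. This is a minimal sketch under that assumption; `effective_typesize` and `compress_bytes` are hypothetical names, and `compress2` is called the same way as in the reproduction above.

```python
def effective_typesize(nbytes: int, typesize: int) -> int:
    """Return `typesize` if it divides `nbytes` evenly, otherwise fall back to 1."""
    if typesize < 1:
        raise ValueError("typesize must be a positive integer")
    return typesize if nbytes % typesize == 0 else 1


def compress_bytes(data: bytes, typesize: int = 8, **cparams) -> bytes:
    """Compress a raw byte stream, avoiding non-multiple lengths (hypothetical wrapper)."""
    import blosc2  # imported lazily so the helper above can be used on its own

    safe = effective_typesize(len(data), typesize)
    return blosc2.compress2(data, typesize=safe, **cparams)
```

Falling back to `typesize=1` keeps the requested `typesize` for buffers whose length is a multiple of it, so only the affected inputs change behavior.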

---

*Repo to file against: https://github.com/Blosc/python-blosc2 (route to
c-blosc2 if the fault is in the special-value frame codec).*
