Security: eval() used on benchmark data allows code injection #52

@CrepuscularIRIS

Bug Description

The MMLongBench-Doc benchmark loader uses eval() to parse the evidence_pages and evidence_sources fields of each dataset item. If the benchmark dataset is tampered with (e.g., via a man-in-the-middle attack on the download, or a malicious dataset fork), eval() will execute any Python code embedded in those fields.

Location

OmniSimpleMem/omni_memory/evaluation/benchmarks.py:1035,1043

Reproduction

# If an attacker modifies the benchmark dataset JSON to include:
# {"evidence_pages": "__import__('os').system('id')", ...}

# The eval() on line 1035 will execute: __import__('os').system('id')
# This runs arbitrary system commands
# To trigger, run the benchmark evaluation:
cd OmniSimpleMem
python -c "from omni_memory.evaluation.benchmarks import MMLongBenchDocBenchmark; b = MMLongBenchDocBenchmark('/path/to/malicious_data')"
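
The same behavior can be observed without the real loader or a real system command. The snippet below is a minimal sketch: the payload string and the `executed` list are hypothetical stand-ins, not values taken from the dataset or from benchmarks.py. It shows that eval() runs a side effect embedded in a field value, while ast.literal_eval() rejects the same string.

```python
import ast

# Hypothetical tampered field value. Instead of a system command, the
# payload only appends to a list so the side effect is easy to observe.
executed = []
payload = "executed.append('payload ran') or [1, 2]"

# eval() evaluates the whole expression, so the append() call runs.
pages = eval(payload, {"executed": executed})

# ast.literal_eval() accepts only literal syntax and refuses the call.
try:
    ast.literal_eval(payload)
    rejected = False
except (ValueError, SyntaxError):
    rejected = True

print(pages, executed, rejected)
```

With eval() the field still parses to a plausible-looking list, so a tampered dataset would not necessarily cause a visible failure.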

Impact

Arbitrary code execution when loading benchmark data from untrusted sources.

Suggested Fix

# Replace eval() with ast.literal_eval(), which accepts only Python literals
# (strings, bytes, numbers, tuples, lists, dicts, sets, booleans, None)
import ast

# Line 1035
evidence_pages = ast.literal_eval(evidence_pages)

# Line 1043
evidence_sources = ast.literal_eval(evidence_sources)

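ast.literal_eval() parses Python literal expressions and raises an exception (e.g., ValueError or SyntaxError) for anything else, so function calls such as the payload in the reproduction are never executed.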
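
Since ast.literal_eval() raises on malformed input and can return any literal type, the loader may also want to validate the result. A minimal sketch follows, assuming both fields are expected to be lists (either already parsed or string-encoded). The helper name `parse_list_field` is hypothetical and does not exist in the repository.

```python
import ast
from typing import Any, List


def parse_list_field(raw: Any, field_name: str = "field") -> List[Any]:
    """Parse a dataset field that is either a list or a string-encoded list.

    Hypothetical helper: assumes the field is meant to hold a list.
    """
    if isinstance(raw, list):
        return raw  # already parsed, nothing to do
    if not isinstance(raw, str):
        raise ValueError(f"{field_name} has unexpected type {type(raw).__name__}")
    try:
        value = ast.literal_eval(raw)
    except (ValueError, SyntaxError, TypeError, MemoryError, RecursionError) as exc:
        # literal_eval rejects anything that is not a plain literal
        raise ValueError(f"{field_name} is not a valid literal: {raw!r}") from exc
    if not isinstance(value, list):
        raise ValueError(f"{field_name} must be a list, got {type(value).__name__}")
    return value
```

Usage would look like `evidence_pages = parse_list_field(item["evidence_pages"], "evidence_pages")`, where `item` stands for one dataset record. Raising a single exception type keeps the decision of whether to skip or abort on a bad record with the caller.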


Found via automated codebase analysis. Happy to submit a PR if confirmed.
