Skip to content

[Security] Quadratic complexity DoS when parsing dotted keys with shared prefix (CWE-400) #479

@netpack

Description

@netpack

Summary

tomlkit exhibits super-linear time complexity when parsing TOML documents with multiple dotted keys sharing a common prefix. A valid 1.3 KB document causes 8+ seconds of CPU consumption.

Affected version: 0.15.0 (latest)

Reproduction:

import tomlkit
import time

80 dotted keys with shared prefix — valid TOML, only 1,339 bytes

payload = "\n".join([f"a.b.c.key{i} = {i}" for i in range(80)])

start = time.time()
tomlkit.parse(payload)
print(f"Elapsed: {time.time() - start:.2f}s") # ~8 seconds

Expected behavior: Parsing should complete in milliseconds (tomli handles this in 0.3ms).

Actual behavior: Parsing takes 8+ seconds for 80 keys, with super-linear growth:

10 keys → 0.008s
40 keys → 0.64s
80 keys → 8.15s
Growth factor: 8x more input → 1,036x more time (O(n³) or worse).

Comparison with tomli:

import tomli
tomli.loads(payload) # 0.0003 seconds (25,407x faster)

Impact:

Poetry (30M+ monthly downloads) uses tomlkit to parse pyproject.toml
A malicious pyproject.toml in a git repository could DoS CI/CD pipelines
Any application parsing untrusted TOML with tomlkit is vulnerable
Root cause: The internal table merging logic for dotted keys appears to re-traverse the table hierarchy for each new key, causing polynomial time complexity when many keys share a prefix path.

Workaround: Use tomli/tomllib for parsing untrusted TOML input.

CWE: CWE-400 (Uncontrolled Resource Consumption)

Discovered by: Frédéric Bogaerts, University of Coimbra (VAITP Research Project)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions