Commit ee24ee6

feat(website): update the website wiki pages
1 parent c2ef23d commit ee24ee6

91 files changed

Lines changed: 2970 additions & 6340 deletions

website/img/primitives.png

53.9 KB

website/img/procedure.png

29.3 KB

website/index.html

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ <h1>(All You Ever Wanted To Know About) <br> Python Class Pollution</h1>
1818
<a href="wiki/docs/">Wiki</a>
1919
<a href="https://jackfromeast.github.io/assets/Pyrl.pdf">Paper</a>
2020
<a href="wiki/docs/tool/pyrl/">Tool</a>
21-
<a href="wiki/docs/reference/cve-index/">Dataset</a>
21+
<a href="wiki/docs/collection/showcases/">Dataset</a>
2222
</div>
2323
<h2>What is Python class pollution?</h2>
2424
<img class="hero-icon" src="img/icon.png" alt="Python Class Pollution">

website/source-landing/content/_index.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ title: "Python Class Pollution"
88
<a href="wiki/docs/">Wiki</a>
99
<a href="https://jackfromeast.github.io/assets/Pyrl.pdf">Paper</a>
1010
<a href="wiki/docs/tool/pyrl/">Tool</a>
11-
<a href="wiki/docs/reference/cve-index/">Dataset</a>
11+
<a href="wiki/docs/collection/showcases/">Dataset</a>
1212
</div>
1313

1414
## What is Python class pollution?

website/source/content/docs/_index.md

Lines changed: 24 additions & 113 deletions
@@ -7,129 +7,40 @@ bookFlatSection: false
77

88
# Python Class Pollution
99

10-
**Class pollution** is a vulnerability pattern in which an attacker traverses Python's
11-
runtime object graph through dunder attributes &mdash; `__class__`, `__init__`,
12-
`__globals__`, `sys.modules`, and so on &mdash; and overwrites attributes in unintended
13-
classes, functions, or modules. The traversal is driven by a reflective
14-
`getattr`/`setattr` (or `__getitem__`/`__setitem__`) loop whose path or keys come from
15-
untrusted input.
10+
**Class pollution** is a vulnerability class where an attacker traverses Python's runtime object graph through dunder attributes such as `__class__`, `__init__`, `__globals__`, and `sys.modules`, and overwrites attributes in unintended classes, functions, or modules. The traversal is driven by a reflective attribute or item access loop whose path or keys come from untrusted input.
1611

17-
It is the Python analogue of JavaScript prototype pollution[^silvanovich2021], but the
18-
primitives are richer: because Python's object model is class-based with a flexible
19-
reflection layer, pollution can reach classes, functions, modules, and even descriptor
20-
slots &mdash; not just a single root prototype.
12+
It is the Python analogue of [JavaScript prototype pollution][jsproto], but the primitives are richer: Python's class-based object model with a flexible reflection layer lets pollution reach classes, functions, modules, and descriptor slots.
2113

22-
## A motivating example
14+
## Roadmap
2315

24-
```python
25-
def update(user, data):
26-
for key in data:
27-
val = data[key]
28-
if isinstance(val, dict):
29-
update(getattr(user, key), val)
30-
else:
31-
setattr(user, key, val)
32-
```
16+
This wiki is organized into the following sections. Most readers can pick the entry point that matches their goal:
3317

34-
The function looks like a routine deep-merge of nested form data onto a model object. But
35-
because `getattr` does not distinguish between developer-defined attributes and dunder
36-
attributes, an attacker-controlled `data` can step through Python's object graph:
18+
<!-- - **[Taxonomy]({{< relref "taxonomy" >}})**: the systematic taxonomy of class pollution along three aspects: pollution primitives, vulnerability types, and consequences.
19+
- **[Pollution Targets]({{< relref "targets" >}})**: runtime objects (classes, modules, functions, globals) that are reachable via reflection and meaningfully change program behavior when modified.
20+
- **[Gadgets]({{< relref "gadgets" >}})**: concrete target + value combinations that turn a pollution primitive into RCE, XSS, authentication bypass, DoS, or token leakage.
21+
- **[Tool]({{< relref "tool" >}})**: documentation for *Pyrl* (the detection tool, built on operational taint analysis over CodeQL) and *Polluter* (an exploitation/testing helper).
22+
- **[Collection]({{< relref "collection" >}})**: a curated database of confirmed vulnerable Python packages with end-to-end PoCs, including the assigned CVEs and showcase walkthroughs.
23+
- **[Defense]({{< relref "defense" >}})**: mitigations along the object resolution path: key sanitization at the "get" primitive and guards at the "set" primitive. -->
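
The Defense entry above names two mitigation points: key sanitization at the "get" primitive and guards at the "set" primitive. A minimal sketch of how the two could be combined in a recursive update is shown below. The names `is_safe_key`, `safe_update`, `GUARDED_TYPES`, and `User` are hypothetical and do not come from the wiki or from Pyrl, and the dunder check is only one possible sanitization policy, not a complete defense.

```python
import types

# Assumption: objects of these kinds should never be written to by a merge.
GUARDED_TYPES = (type, types.ModuleType, types.FunctionType, types.MethodType)


def is_safe_key(key):
    """Key sanitization for the "get" step: reject dunder names such as __class__."""
    return isinstance(key, str) and not (key.startswith("__") and key.endswith("__"))


def safe_update(obj, data):
    """Recursive merge that refuses dunder traversal and writes to guarded objects."""
    for key, val in data.items():
        if not is_safe_key(key):
            raise ValueError(f"refusing special attribute: {key!r}")
        if isinstance(val, dict):
            safe_update(getattr(obj, key), val)
        else:
            # Guard for the "set" step: never write to classes, modules, or functions.
            if isinstance(obj, GUARDED_TYPES):
                raise ValueError(f"refusing to set {key!r} on {type(obj).__name__}")
            setattr(obj, key, val)


class User:
    """Hypothetical model object used only for this illustration."""

    def __init__(self, name):
        self.name = name


user = User("alice")
safe_update(user, {"name": "bob"})  # ordinary update still works

try:
    safe_update(user, {"__class__": {"is_admin": True}})
    blocked = False
except ValueError:
    blocked = True
```

Raising an error rather than silently skipping the key is a design choice in this sketch; it makes rejected input visible to the caller.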
3724

38-
```json
39-
{"__class__": {"__getattribute__": "1337"}}
40-
```
25+
## About this wiki
4126

42-
After this call, `type(user).__getattribute__` is the string `"1337"`. Any attribute
43-
access on any instance of the `User` class now raises `TypeError: 'str' object is not
44-
callable` &mdash; a denial-of-service primitive. Extending the path through
45-
`__init__.__globals__.sys.modules` reaches any imported module, which is where the
46-
primitive becomes RCE ([gadgets/rce]({{< relref "gadgets/rce" >}})), stored XSS
47-
([gadgets/xss]({{< relref "gadgets/xss" >}})), or authentication bypass
48-
([gadgets/auth-bypass]({{< relref "gadgets/auth-bypass" >}})).
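
The removed motivating example can be reproduced end to end with the sketch below. It reuses the `update` function and the JSON payload from the removed lines; the `User` class and the `victim` and `bystander` instances are illustrative assumptions, not code from any real package.

```python
import json


class User:
    """Hypothetical model object used only for this illustration."""

    def __init__(self, name):
        self.name = name


def update(user, data):
    # Same reflective deep-merge as in the removed example.
    for key in data:
        val = data[key]
        if isinstance(val, dict):
            update(getattr(user, key), val)
        else:
            setattr(user, key, val)


victim = User("alice")
bystander = User("bob")  # a second instance that is never passed to update()

# Attacker-controlled input: step from the instance to its class, then
# overwrite the class-level __getattribute__ with a non-callable string.
payload = json.loads('{"__class__": {"__getattribute__": "1337"}}')
update(victim, payload)

try:
    bystander.name  # attribute access now goes through the polluted slot
    dos_triggered = False
except TypeError:
    dos_triggered = True
```

The write lands on the `User` class rather than on `victim`, so `bystander` is affected even though it was never merged with untrusted data.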
27+
This wiki accompanies our IEEE S&P 2026 paper [*The First Large-Scale Systematic Study of Python Class Pollution Vulnerability*][paper]. Its goal is to be a living reference for the vulnerability class. Concretely, we want it to:
4928

50-
## Reading guide
29+
- Document the taxonomy, targets, and gadgets in a way that is easier to extend than a PDF.
30+
- Track new CVEs, gadgets, and showcases as they are discovered.
31+
- Provide actionable defense guidance for library and application maintainers.
5132

52-
Different audiences read this wiki differently. Start here:
33+
## Contributions
5334

54-
- **Security researchers** looking to understand the vulnerability class:
55-
[Taxonomy]({{< relref "taxonomy" >}}) → [Targets]({{< relref "targets" >}}) →
56-
[Gadgets]({{< relref "gadgets" >}}).
57-
- **Bug hunters and CTF players** looking for exploits to adapt:
58-
[Showcases]({{< relref "collection/showcases" >}}) are end-to-end PoCs;
59-
[Gadgets]({{< relref "gadgets" >}}) catalogues the building blocks.
60-
- **Library maintainers** with a reflective update function in their codebase:
61-
[Defense]({{< relref "defense" >}}) is the shortest path, then
62-
[Pyrl]({{< relref "tool/pyrl" >}}) to scan your own code.
63-
- **Readers of the paper** looking to map claims onto artifacts:
64-
[Tools]({{< relref "tool" >}}) documents Pyrl and Polluter;
65-
[Collection]({{< relref "collection" >}}) lists every confirmed finding.
66-
67-
## Key differences from JavaScript prototype pollution
68-
69-
| Aspect | JS prototype pollution | Python class pollution |
70-
|--------|------------------------|------------------------|
71-
| Object model | Prototype-based | Class-based + descriptor protocol |
72-
| Pollution path | Uniform prototype chain (`__proto__`) | Multiple: attribute, item, variable |
73-
| Canonical target | `Object.prototype` | Classes, modules, functions, closures |
74-
| Namespace | Single (properties) | Two (attribute vs. item) |
75-
| Resolution | Prototype chain lookup | MRO + descriptor protocol |
76-
| Typical sink | `{}` merged from user input | `setattr` / `obj[k]=v` over a dotted path |
77-
78-
The second-to-last row is the important one for exploitation. Python's two namespaces
79-
(`obj.attr` vs. `obj[key]`) give rise to three distinct "set" primitives (attr-only,
80-
item-only, or dual), which combined with two "get" primitives (agnostic or constrained)
81-
produce the six vulnerability types in the [taxonomy]({{< relref "taxonomy" >}}).
82-
83-
## Threat model
84-
85-
The vulnerable Python package processes input from one of three channels:
86-
87-
1. **Remote input** &mdash; HTTP body, query string, WebSocket message, RPC argument
88-
reaching a server-side reflective update. Example:
89-
[django-unicorn]({{< relref "collection/showcases/django-unicorn" >}}) (WebSocket).
90-
2. **Local input** &mdash; command-line arguments, configuration files, LLM tool outputs
91-
reaching a CLI's reflective setter. Example:
92-
[Azure CLI]({{< relref "collection/showcases/azure-cli" >}}) (`--set`).
93-
3. **Package-level input** &mdash; a public API of a library that accepts a dotted path
94-
and a value, reachable from another package that trusts its caller. Example:
95-
`pydash.set_`, `glom.assign`, `mo_dots.set_attr`.
96-
97-
In all three cases, the attacker controls the `name` (dotted path) and/or the `value` that
98-
reach a reflective sink. The attacker does **not** need to control imports: any
99-
`sys.modules` entry reached by any code path in the victim process is in scope.
100-
101-
## Scale of the problem
102-
103-
The analysis behind this research[^paper] scanned **671,475** real-world Python programs
104-
with Pyrl and produced:
105-
106-
- **868** unique vulnerability reports,
107-
- **47** confirmed zero-day exploitable vulnerabilities,
108-
- **7** CVE identifiers assigned from this research (see
109-
[CVE index]({{< relref "reference/cve-index" >}})),
110-
- Critical findings in Microsoft Azure CLI, Google Mesop, Taipy, django-unicorn, ComfyUI,
111-
Hugging Face Diffusers, and others.
112-
113-
## Related work
114-
115-
- **JavaScript prototype pollution** was first documented by Olivier Arteau in 2018 and
116-
systematized by Silvanovich and others[^silvanovich2021]. The object-model differences
117-
above mean the Python variant is not a mechanical port.
118-
- **`pydash` gadget** (2022): [@abdulrah33m] published the first public demonstration of a
119-
dunder-walk gadget in Python via `pydash.set_`.
120-
- **`deepdiff` advisory** ([CVE-2024-5254][deepdiff-cve], by [@chilaxan][chilaxan]): the
121-
first CVE issued for a Python reflective-merge sink.
122-
- **Pyrl** (this work, IEEE S&P 2025[^paper]): the first automated detector, built on an
123-
operational taint-analysis extension of CodeQL's Python support.
35+
Contributions are welcome: new gadgets, additional showcases, corrections, and translations. The site is built with Hugo from markdown sources under [`website/source/`](https://github.com/jackfromeast/python-class-pollution/tree/main/website/source). To propose a change, open an [issue](https://github.com/jackfromeast/python-class-pollution/issues) or a [pull request](https://github.com/jackfromeast/python-class-pollution/pulls) on the repo: https://github.com/jackfromeast/python-class-pollution.
12436

12537
## References
12638

127-
[^silvanovich2021]: Natalie Silvanovich. *The Risks of JavaScript Prototype Pollution*.
128-
Project Zero, 2021. <https://googleprojectzero.blogspot.com/>
129-
[^paper]: Zhengyu Liu, Jiacheng Zhong, Jianjia Yu, Muxi Lyu, Zifeng Kang, Yinzhi Cao.
130-
*The First Large-Scale Systematic Study of Python Class Pollution Vulnerability*.
131-
IEEE S&P 2025. <https://jackfromeast.github.io/assets/Pyrl.pdf>
39+
1. Abdulraheem Khaled, *"Prototype Pollution in Python."* 2023. [Link](https://blog.abdulrah33m.com/prototype-pollution-in-python/). Also presented at Black Hat MEA 2023, [Link](https://blackhatmea.com/session/prototype-pollution-bug-python).
40+
2. Ziyi Ouyang, *"Research and Explore of Prototype Pollution Attack in Python."* ACCTCS 2023. [Link](https://ieeexplore.ieee.org/abstract/document/10145365).
41+
3. Qingyun Zhang, *"Exploitation and prevention of Python prototype chain pollution."* Applied and Computational Engineering,43,229-236. [Link](https://doi.org/10.54254/2755-2721/43/20230839).
42+
4. Zhengyu Liu, Jiacheng Zhong, Jianjia Yu, Muxi Lyu, Zifeng Kang, Yinzhi Cao, *"The First Large-Scale Systematic Study of Python Class Pollution Vulnerability."* IEEE S&P 2026. [Link](https://jackfromeast.github.io/assets/Pyrl.pdf).
13243

133-
[deepdiff-cve]: https://nvd.nist.gov/vuln/detail/CVE-2024-5254
134-
[chilaxan]: https://github.com/chilaxan
135-
[@abdulrah33m]: https://github.com/abdulrah33m
44+
[jsproto]: https://portswigger.net/web-security/prototype-pollution
45+
[pydash]: https://github.com/dgilland/pydash
46+
[paper]: https://jackfromeast.github.io/assets/Pyrl.pdf
