fix: sanitize user_id to prevent path traversal in Flask web app (CWE-22)#64
Open
sebastiondev wants to merge 1 commit intoBAI-LAB:mainfrom
Open
fix: sanitize user_id to prevent path traversal in Flask web app (CWE-22)#64sebastiondev wants to merge 1 commit intoBAI-LAB:mainfrom
sebastiondev wants to merge 1 commit intoBAI-LAB:mainfrom
Conversation
…-22) Add _is_safe_id() validation to the /init_memory endpoint to reject user_id values containing path separators, ".." components, null bytes, or other characters that could escape the intended data directory. Without this check, an attacker can POST user_id="../../etc/cron.d" to create arbitrary directories (via os.makedirs in the Memoryos constructor) or delete arbitrary directory trees (via shutil.rmtree in /clear_memory). The fix uses an allowlist regex (alphanumeric, hyphens, underscores, dots) applied at the HTTP boundary before the value reaches any filesystem operation.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Vulnerability Summary
CWE-22: Improper Limitation of a Pathname to a Restricted Directory (Path Traversal)
Severity: High — unauthenticated arbitrary directory deletion via
shutil.rmtreeData Flow
/init_memorywith a JSON body containinguser_id.user_idis read fromrequest.jsonwith zero validation (only an empty-string check exists).Memoryos(user_id=...)→os.path.join(data_storage_path, "users", user_id)→ stored asself.user_data_dir./clear_memory, the storedmemory_system.user_data_diris passed directly toshutil.rmtree().user_idsuch as../../important_datacausesshutil.rmtreeto delete arbitrary directories on the filesystem.Preconditions
0.0.0.0(all interfaces)login_required, no API keys, no Bearer tokensMemoryos.__init__completes with fake credentials (no API calls happen during construction)Exploit Sketch
Fix Description & Rationale
Changes (2 files, minimal diff)
memoryos-playground/memdemo/app.py_is_safe_id()validator + apply it at/init_memoryinput boundarytests/test_cwe22_app_flask.pyDefense Strategy
Primary defense — strict allowlist regex on
user_idat the only input point (/init_memory):A-Za-z0-9,.,_,-only/,\,.., null bytes, and empty stringsassistant_idis derived asf"assistant_{user_id}"so it is safe by construction after validationDefense-in-depth checks within
_is_safe_id():.and..path components/,\, or..as a belt-and-suspenders guard\x00)Why this approach
user_identers the systemMemoryoslibrary itself are needed for the web-facing fix (though library-level validation would be a good follow-up for defense in depth)Test Results
All tests pass. Run via
python3 tests/test_cwe22_app_flask.py:Attack payloads tested:
../etc/cron.d,..\\windows\\system32,foo/../../etc/passwd,foo/../../../tmp/evil,....//....//etc,/absolute/path,.,..Safe payloads verified (no false positives):
alice,bob_123,user-2024,CamelCaseUserDisprove Analysis
We attempted to disprove this finding through multiple angles:
login_required, no API keys, no Bearer tokens. Endpoints are fully unauthenticated.0.0.0.0:5019. No CORS restrictions, no reverse proxy config found.user_idin original code — only an empty-string check.request.json→data.get("user_id")→Memoryos(user_id=...)→os.path.join(...)→shutil.rmtree(). No sanitization anywhere in the chain.SECURITY.mdexists.Verdict: CONFIRMED_VALID — High confidence
Additional Notes
The same vulnerable
os.path.join(data_storage_path, "users", user_id)pattern exists in the library modules:memoryos-pypi/memoryos.py:71memoryos-chromadb/storage_provider.py:35memoryos-mcp/memoryos/memoryos.py:71These are not directly web-exposed, but any application passing untrusted input as
user_idto theMemoryosclass would be vulnerable at the library level. Adding validation in the library constructor would be a valuable follow-up.This PR focuses on the immediately exploitable web-facing endpoint, which is the highest priority.
Thank you for your time reviewing this. Happy to address any feedback.