contain JobOperations.download output paths within download_path#47834
contain JobOperations.download output paths within download_path#47834nabhan06 wants to merge 1 commit into
Conversation
|
Thank you for your contribution @nabhan06! We will review the pull request and get back to you soon. |
|
@nabhan06 please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.
Contributor License AgreementContribution License AgreementThis Contribution License Agreement (“Agreement”) is agreed to by the party signing below (“You”),
|
1 similar comment
|
@nabhan06 please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.
Contributor License AgreementContribution License AgreementThis Contribution License Agreement (“Agreement”) is agreed to by the party signing below (“You”),
|
There was a problem hiding this comment.
Pull request overview
This PR fixes a path-traversal vulnerability in JobOperations.download (sdk/ml/azure-ai-ml). Output names are taken verbatim from the run-history service response and used to build local destinations as download_path / "named-outputs" / item_name. A name containing .. segments or an absolute path could cause pathlib to place artifacts outside the caller-supplied download_path. The fix resolves each destination and skips any output that would fall outside download_path, and adds a unit test asserting malicious output names are not written outside the download directory.
Changes:
- Added a containment check in the download loop that resolves the destination and skips (with a warning) any output whose resolved path escapes
download_path. - Added
test_download_skips_output_name_path_traversalverifying..-based and absolute output names are skipped while the legitimatedefault/logs output is still downloaded. - Added a CHANGELOG entry under the 1.35.0 (unreleased) Bugs Fixed section.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
sdk/ml/azure-ai-ml/azure/ai/ml/operations/_job_operations.py |
Adds resolve-and-contain check so outputs resolving outside download_path are skipped instead of escaping the directory. |
sdk/ml/azure-ai-ml/tests/job_common/unittests/test_job_operations.py |
New unit test returning malicious output names and asserting nothing is written outside download_path. |
sdk/ml/azure-ai-ml/CHANGELOG.md |
Documents the bug fix under 1.35.0 (unreleased) Bugs Fixed. |
The fix is correct and well-targeted: download_path is normalized to a Path earlier in the method (line 994), the containment check uses resolve() on both sides (correctly collapsing .. and normalizing absolute overrides and symlinks), and the batch/default branches are not vulnerable since they use fixed directory names rather than the untrusted item_name. The test's mocked status (JobStatus.COMPLETED) is a valid terminal status, the mocks bypass unrelated code paths correctly, and the assertions cover both the traversal and absolute-path cases plus the legitimate download. No async duplicate of this logic exists that would need the same change.
Description
JobOperations.downloadbuilds each output's local destination asdownload_path / "named-outputs" / item_name, whereitem_nameis a job output name taken verbatim from the run-history service response (get_job_output_uris_from_dataplane->run_metadata.outputskeys). A name containing..climbs out ofdownload_path, and an absolute name makespathlibdropdownload_pathentirely, so artifacts land in a directory the job author picked rather than the one the caller asked for. The blob/gen2/fileshare helpers already contain each blob relative to theirdestination, but that check runs against the already-escapeddestination, so it can't stop this. The download loop now resolves each destination and skips any output whose path would fall outsidedownload_path.Added a unit test in
test_job_operations.pythat returns malicious output names from the service and asserts nothing is written outsidedownload_path.All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines