fix(cli): make spacy download --url actually use the custom URL#13964
Open
shaun0927 wants to merge 2 commits intoexplosion:masterfrom
Open
fix(cli): make spacy download --url actually use the custom URL#13964shaun0927 wants to merge 2 commits intoexplosion:masterfrom
spacy download --url actually use the custom URL#13964shaun0927 wants to merge 2 commits intoexplosion:masterfrom
Conversation
The custom-URL feature added in explosion#13848 has been broken since release: any value passed via `--url` is silently dropped (when the URL has no trailing slash) or rejected by the post-construction guard (when it does), so the flag cannot succeed under any input. Two interlocking defects in `download_model`: 1. When `base_url` (= the user's custom URL) lacked a trailing slash, line 183 reassigned it to `about.__download_url__ + "/"`, discarding the user's choice instead of just appending `/`. 2. The line-185 `startswith(about.__download_url__)` guard then rejected any URL that *was* preserved, because a custom mirror by definition does not start with the GitHub URL. This change: * Appends `/` to the actual `base_url` instead of swapping it for the default. * Skips the GitHub-origin guard when `custom_url` is provided. The user has explicitly opted into a custom source via `--url`; the relative- path guard is still in force when no `--url` is passed, preserving the protection from explosion#13848's review thread. Adds a regression test exercising both trailing-slash variants of `--url` plus the default-URL path so the relative-path rejection from `test_download_rejects_relative_urls` keeps working.
…test Adds the signed SCA at .github/contributors/shaun0927.md as required for first-time contributors. Reformats the new test_download_uses_custom_url block so `python -m ruff format spacy --check` passes (the for-loop tuple exceeded the project's 88-char line length).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #13963.
Description
The
--urlflag added in #13848 is broken: any value passed via--urlis either silently dropped (no trailing slash) or rejected by the post-construction guard (with trailing slash), so the flag cannot succeed under any input. See #13963 for the full reproduction.spacy/cli/download.py:180-186had two interlocking defects:base_url(= the user's custom URL) lacked a trailing slash, line 183 reassigned it toabout.__download_url__ + "/", discarding the user's choice.startswith(about.__download_url__)guard then rejected any URL that was preserved, because a custom mirror by definition does not start with the GitHub URL.This change:
/to the actualbase_urlinstead of swapping it for the default.custom_urlis provided. The user has explicitly opted into a custom source via--url; the relative-path guard is still in force when no--urlis passed (thetest_download_rejects_relative_urlscase continues to pass).Adds a regression test (
test_download_uses_custom_url) exercising both trailing-slash variants of--urlplus the default-URL path so the relative-path rejection from #13848's review thread is preserved.No new dependencies. No behavior change for users who do not pass
--url.Types of change
Bug fix.
Checklist