Skip to content

fix(cli): make spacy download --url actually use the custom URL#13964

Open
shaun0927 wants to merge 2 commits intoexplosion:masterfrom
shaun0927:fix/download-custom-url
Open

fix(cli): make spacy download --url actually use the custom URL#13964
shaun0927 wants to merge 2 commits intoexplosion:masterfrom
shaun0927:fix/download-custom-url

Conversation

@shaun0927
Copy link
Copy Markdown

Fixes #13963.

Description

The --url flag added in #13848 is broken: any value passed via --url is either silently dropped (no trailing slash) or rejected by the post-construction guard (with trailing slash), so the flag cannot succeed under any input. See #13963 for the full reproduction.

spacy/cli/download.py:180-186 had two interlocking defects:

  1. When base_url (= the user's custom URL) lacked a trailing slash, line 183 reassigned it to about.__download_url__ + "/", discarding the user's choice.
  2. The line-185 startswith(about.__download_url__) guard then rejected any URL that was preserved, because a custom mirror by definition does not start with the GitHub URL.

This change:

  • Appends / to the actual base_url instead of swapping it for the default.
  • Skips the GitHub-origin guard when custom_url is provided. The user has explicitly opted into a custom source via --url; the relative-path guard is still in force when no --url is passed (the test_download_rejects_relative_urls case continues to pass).

Adds a regression test (test_download_uses_custom_url) exercising both trailing-slash variants of --url plus the default-URL path so the relative-path rejection from #13848's review thread is preserved.

No new dependencies. No behavior change for users who do not pass --url.

Types of change

Bug fix.

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

The custom-URL feature added in explosion#13848 has been broken since release: any
value passed via `--url` is silently dropped (when the URL has no trailing
slash) or rejected by the post-construction guard (when it does), so the
flag cannot succeed under any input.

Two interlocking defects in `download_model`:

1. When `base_url` (= the user's custom URL) lacked a trailing slash, line
   183 reassigned it to `about.__download_url__ + "/"`, discarding the
   user's choice instead of just appending `/`.
2. The line-185 `startswith(about.__download_url__)` guard then rejected
   any URL that *was* preserved, because a custom mirror by definition
   does not start with the GitHub URL.

This change:

  * Appends `/` to the actual `base_url` instead of swapping it for the
    default.
  * Skips the GitHub-origin guard when `custom_url` is provided. The user
    has explicitly opted into a custom source via `--url`; the relative-
    path guard is still in force when no `--url` is passed, preserving
    the protection from explosion#13848's review thread.

Adds a regression test exercising both trailing-slash variants of `--url`
plus the default-URL path so the relative-path rejection from
`test_download_rejects_relative_urls` keeps working.
…test

Adds the signed SCA at .github/contributors/shaun0927.md as required for
first-time contributors. Reformats the new test_download_uses_custom_url
block so `python -m ruff format spacy --check` passes (the for-loop tuple
exceeded the project's 88-char line length).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

spacy download --url is silently ignored or rejected — the custom URL flag never works

1 participant