Bump spacy from 3.7.6 to 3.8.14 #915

Open
dependabot[bot] wants to merge 1 commit into development from dependabot/pip/spacy-3.8.14

@dependabot dependabot Bot commented on behalf of github Mar 30, 2026

Bumps spacy from 3.7.6 to 3.8.14.

Release notes

Sourced from spacy's releases.

v3.8.14: Bug fix for model downloading in environments without pip on PATH

  • Fix spacy download failing in environments where pip is not on PATH but is available as a Python module (e.g., some virtual environments and containers)
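
As a rough illustration of the distinction this fix draws, the sketch below checks whether pip can be imported as a module before falling back to a `pip` executable on PATH. The helper name `pip_command` is hypothetical and this is not spaCy's actual implementation; it only shows why a PATH lookup alone can miss a usable pip.

```python
import importlib.util
import shutil
import sys


def pip_command():
    """Return an argv prefix for invoking pip, or None if pip is unavailable.

    Hypothetical helper for illustration only; not taken from spaCy.
    """
    # Prefer the module form: it works even when no `pip` binary is on PATH.
    if importlib.util.find_spec("pip") is not None:
        return [sys.executable, "-m", "pip"]
    # Otherwise fall back to a `pip` executable found on PATH.
    binary = shutil.which("pip")
    if binary is not None:
        return [binary]
    return None


cmd = pip_command()
print(cmd)
```

Invoking pip as `python -m pip` with the running interpreter also ensures packages are installed into the same environment that is executing the code.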

v3.8.13: Pin confection to new version

The v3.8.12 release didn't update the confection pin, which meant that if you did an upgrade-install models wouldn't load.

v3.8.12

Use confection v1.3 and Thinc v8.3.13, which implement custom validation logic in place of Pydantic, allowing us to properly adopt Pydantic v2 and provide full Python 3.14 support.

Our dependency tree used Pydantic v1 in unusual ways, and relied on behaviours that Pydantic v2 reformed. In the time since Pydantic v2 was released there were a few attempts to migrate over to it, but the task has been complicated by the fact that the confection library has a fairly tangled implementation and I had reduced availability for open-source work in 2024 and 2025.

Specifically, our library confection provides the extensible configuration system we use in spaCy and Thinc. The config system allows you to refer to values that will be supplied by arbitrary functions, that e.g. define some neural network model or its sublayers. The functionality in confection is complicated because we aggressively prioritised user experience in the specification, even if it required increased implementation complexity.

Confection's original implementation built a dynamic Pydantic v1 schema for function-supplied values ("promises"). We validate the schema before calling any promises, and then validate the schema again after calling all the promises and substituting in their values. The variable-interpolation system adds further difficulties to the implementation, and we have to do it all subclassing the Python built-in configparser, which ties us to implementation choices I'd do differently if I had a clean slate.
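
The validate, resolve, validate-again flow described above can be sketched in a few lines of plain Python. This is a toy under stated assumptions: `Promise`, `validate` and `resolve` are hypothetical names, the schema is a simple mapping from keys to types, and none of it reflects confection's real code or Pydantic's API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Promise:
    """Hypothetical marker for a value that a function supplies later."""
    func: Callable[[], Any]


def validate(config: Dict[str, Any], schema: Dict[str, type]) -> None:
    """Check literal values against the schema; leave promises unchecked."""
    for key, expected in schema.items():
        if key not in config:
            raise ValueError(f"missing key: {key}")
        value = config[key]
        if isinstance(value, Promise):
            continue  # cannot be checked until the promise has been called
        if not isinstance(value, expected):
            raise TypeError(f"{key}: expected {expected.__name__}")


def resolve(config: Dict[str, Any], schema: Dict[str, type]) -> Dict[str, Any]:
    validate(config, schema)  # pass 1: before calling any promise
    filled = {
        key: value.func() if isinstance(value, Promise) else value
        for key, value in config.items()
    }
    validate(filled, schema)  # pass 2: after substituting the results
    return filled


schema = {"width": int, "dropout": float}
config = {"width": 64, "dropout": Promise(lambda: 0.1)}
resolved = resolve(config, schema)
print(resolved)
```

The first pass catches mistakes in literal values before any potentially expensive function runs; the second pass catches functions that return a value of the wrong type.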

Here's one summary of the Pydantic v1-specific behaviours that made the migration to v2 particularly difficult for us. This particular summary was produced during a session with Claude Code Opus 4.6, so nuances of it might be wrong. The full history of attempts at doing this spans different refactors separated by a few months at a time, so I don't have a full record of all the things that I struggled with.

The core problem we kept hitting: Pydantic v2 compiles validation schemas upfront and has much stricter immutability. The whole session has been a series of workarounds for this:

 1. Schema mutation — v1 let you mutate __fields__ in place; v2 needs model_rebuild() which loses forward ref namespaces, or create_model subclasses which don't propagate to parent schemas.
 2. model_dump vs dict — v2 converts dataclasses to dicts, breaking resolved objects. Needed a custom _model_to_dict helper.
 3. model_construct drops extras — v2 silently drops fields with extra="forbid", needed manual workarounds.
 4. Strict coercion — v2 coerces ndarray to List[Floats1d] via iteration, needed strict=True.
 5. Forward refs — Every schema with TYPE_CHECKING imports needs model_rebuild() with the right namespace, and that breaks when confection re-rebuilds later.

To adjust for behavioural differences like these, I'd refactored confection to build the different versions of the schema in multiple passes, instead of building all the representations together as we'd been doing. However, this refactor itself had problems, further complicating the migration.

I've now bitten the bullet and rolled back the refactor I'd been attempting of confection, and instead replaced the Pydantic validation with custom logic. This allows Confection to remove Pydantic as a dependency entirely. Update: Actually, I went back and got the refactor working. All much nicer now.

I've gone to some lengths to explain this because migrating off a dependency after breaking changes can be a sensitive topic. I want to stress that the changes Pydantic made from v1 to v2 are very good, and I greatly appreciate them as a user of FastAPI in our services. It would be very bad for the ecosystem if Pydantic pinned themselves to exactly matching the behaviours they had in v1 just to avoid breaking support for the sort of thing we'd been doing. Instead, users like us who were relying on those behaviours should just find some way to adapt: either vendor the v1 version we need, or change our behaviours, or implement an alternative. I would have liked to do this sooner but we've ultimately gone with the third option.

v3.8.11: Add Windows ARM wheels

Add wheels for Python 3.11, 3.12, 3.13 and 3.14 for Windows ARM. Windows ARM wheels for Python 3.10 and earlier are not available in numpy, so aren't provided.

v3.8.3: Improve memory zone stability

Fix a bug in memory zones that occurred when non-transient strings were added to the StringStore inside a memory zone. It caused "string not found" errors in the morphological analyser when the analyser was applied inside a memory zone.

v3.8: Memory management for persistent services, numpy 2.0 support

Optional memory management for persistent services

Support a new context manager method Language.memory_zone(), to allow long-running services to avoid growing memory usage from cached entries in the Vocab or StringStore. Once the memory zone block ends, spaCy will evict Vocab and StringStore entries that were added during the block, freeing up memory. Doc objects created inside a memory zone block should not be accessed outside the block.

The current implementation disables population of the tokenizer cache inside the memory zone, resulting in some performance impact. The performance difference will likely be negligible if you're running a full pipeline, but if you're only running the tokenizer, it'll be much slower. If this is a problem, you can mitigate it by warming the cache first, by processing the first few batches of text without creating a memory zone. Support for memory zones in the tokenizer will be added in a future update.

The Language.memory_zone() context manager also checks for a memory_zone() method on pipeline components, so that components can perform similar memory management if necessary. None of the built-in components currently require this.

If your component needs to add non-transient entries to the StringStore or Vocab, you can pass the allow_transient=False flag to the Vocab.add() or StringStore.add() methods.
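
To make the eviction behaviour concrete, here is a toy model of the idea using only the standard library. `ToyStringStore` is a hypothetical stand-in and is not spaCy's StringStore or its implementation; it only mimics the described semantics: entries added inside a zone are evicted when the block ends, unless they are added with allow_transient=False.

```python
from contextlib import contextmanager


class ToyStringStore:
    """Hypothetical stand-in for a string cache; not spaCy's StringStore."""

    def __init__(self):
        self._permanent = set()
        self._transient = set()
        self._in_zone = False

    def add(self, text, allow_transient=True):
        # Inside a zone, entries are transient unless the caller opts out.
        if self._in_zone and allow_transient:
            if text not in self._permanent:
                self._transient.add(text)
        else:
            self._transient.discard(text)
            self._permanent.add(text)
        return text

    def __contains__(self, text):
        return text in self._permanent or text in self._transient

    def __len__(self):
        return len(self._permanent) + len(self._transient)

    @contextmanager
    def memory_zone(self):
        self._in_zone = True
        try:
            yield self
        finally:
            # Evict everything that was added transiently during the block.
            self._transient.clear()
            self._in_zone = False


store = ToyStringStore()
store.add("apple")  # added outside a zone, so it is kept
with store.memory_zone():
    store.add("banana")  # transient: evicted when the block ends
    store.add("LABEL", allow_transient=False)  # kept after the block
    inside = len(store)

print(inside, len(store))
```

The toy also shows why objects created inside the block should not be used afterwards: anything that still refers to an evicted entry would no longer find it in the store.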

Example usage:

... (truncated)

Commits
  • 0069cf9 Set version to 3.8.14
  • 5603226 fix: check pip module availability instead of PATH binary in download (#13947)
  • d4bb796 Add least-privilege permissions to CI workflow
  • 9d29209 Pass github context via stdin instead of CLI arg
  • 4216738 Pin GitHub Actions to commit SHAs for supply chain security
  • 297938e Add smoke test and upgrade test to release build workflow
  • fdca647 Set version to 3.8.13
  • 0d94a9d Pin confection>=1.3.2 — older versions crash with pydantic v2
  • f175a51 Fully migrate to Pydantic v2 (#13940)
  • 24255bd Fix import sorting for ruff isort compliance
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [spacy](https://github.com/explosion/spaCy) from 3.7.6 to 3.8.14.
- [Release notes](https://github.com/explosion/spaCy/releases)
- [Commits](explosion/spaCy@release-v3.7.6...release-v3.8.14)

---
updated-dependencies:
- dependency-name: spacy
  dependency-version: 3.8.14
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot Bot added the dependencies (Pull requests that update a dependency file) and python (Pull requests that update Python code) labels Mar 30, 2026