Commit 0d3cf78

maxwbuckley and claude committed
Use HF idiom for variant probing: try-and-recover, cache-first
Replace the ``HfApi().list_repo_files`` probe in ``_variant_available`` with the ``transformers`` / ``huggingface_hub`` idiom: cache lookup via ``try_to_load_from_cache`` first, then ``hf_hub_download`` with an ``EntryNotFoundError`` catch.

Resolution order (cheapest first):

1. Local directory: ``Path.exists()`` (unchanged).
2. File in the local HF cache for this repo+revision: ``try_to_load_from_cache`` returns the cached path. **Pure local, no network.** This is the new fast path: repeat loads of cached models pay zero API calls.
3. ``local_files_only=True``: return ``None`` (uncertain) without touching the network.
4. ``hf_hub_download(filename=variant_file)``: success means the file exists (and is now cached, so the subsequent ``snapshot_download`` reuses it, with no double download). ``EntryNotFoundError`` means the publisher hasn't uploaded a variant.

Net effects:

- **Warm cache** (most common case): zero Hub API round-trips. Previously paid one ``list_repo_files`` call per load.
- **Cold cache, file present**: equivalent. One HEAD + one GET, vs one ``list_repo_files`` + one ``snapshot_download`` fetch. Same bytes, fewer distinct calls.
- **Cold cache, file absent**: one HEAD that returns 404 → the soft-fallback warning fires and the standard download path runs. Same outward behavior.

Drops the dependency on ``HfApi`` (removed from imports). Adds defensive catches around ``try_to_load_from_cache`` so a malformed repo_id (e.g. a non-existent local path that wasn't caught by step 1) falls through to "uncertain" instead of raising ``HFValidationError``.

Also adds a TODO(strict-variant) comment in ``_resolve_variant``: once half-precision variant files have been published for the flagship GLiNER models on the Hub, flip the soft-fallback branch to raise ``EntryNotFoundError`` instead of warning. That matches ``transformers.PreTrainedModel.from_pretrained(variant=...)`` strictness. The soft-fallback was a transitional choice: at PR-merge time, no GLiNER repo on the Hub shipped a variant file, so a strict surface would have been broken-on-arrival for every caller.

Tests pass (88), ruff clean. End-to-end smoke against ``urchade/gliner_medium-v2.1`` confirms ``_variant_available`` returns ``False`` for the missing bf16 variant via ``EntryNotFoundError``, and the soft-fallback warning still fires from the load path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
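The resolution order described in the commit message can be sketched without any network access. This is a minimal illustration, not the code in ``gliner/model.py``: ``probe_variant``, ``EntryNotFound``, and the fake callables are hypothetical names, and the two injected callables merely stand in for ``try_to_load_from_cache`` and ``hf_hub_download``.

```python
from pathlib import Path
from typing import Callable, Optional


class EntryNotFound(Exception):
    """Hypothetical stand-in for huggingface_hub's EntryNotFoundError."""


def probe_variant(
    model_id: str,
    target: str,
    cache_lookup: Callable[[str, str], object],
    download: Callable[[str, str], str],
    local_files_only: bool = False,
) -> Optional[bool]:
    """Return True/False when availability is known, None when uncertain."""
    # 1. Local directory: a plain disk check.
    model_dir = Path(model_id)
    if model_dir.is_dir():
        return (model_dir / target).exists()

    # 2. Cache lookup: purely local. Any failure means "uncertain".
    try:
        cached = cache_lookup(model_id, target)
    except Exception:
        return None
    if isinstance(cached, str):
        return True

    # 3. Offline mode: stop before touching the network.
    if local_files_only:
        return None

    # 4. Try-and-recover: attempt the download, map "not found" to False.
    try:
        download(model_id, target)
        return True
    except EntryNotFound:
        return False
    except Exception:
        return None


# Fakes used only for this illustration (assumptions, not library calls).
def cache_hit(repo_id: str, filename: str) -> str:
    return "/fake-cache/" + filename


def cache_miss(repo_id: str, filename: str) -> None:
    return None


def download_404(repo_id: str, filename: str) -> str:
    raise EntryNotFound(filename)


def download_offline(repo_id: str, filename: str) -> str:
    raise ConnectionError("no network")


REPO = "example-org/example-model"  # hypothetical repo id
FILE = "model.bf16.safetensors"

warm = probe_variant(REPO, FILE, cache_hit, download_404)        # cache answers first
absent = probe_variant(REPO, FILE, cache_miss, download_404)     # 404 maps to False
offline = probe_variant(REPO, FILE, cache_miss, download_404, local_files_only=True)
flaky = probe_variant(REPO, FILE, cache_miss, download_offline)  # other errors: uncertain
print(warm, absent, offline, flaky)
```

Injecting the cache and download steps keeps the ordering testable in isolation; the tri-state return (``True`` / ``False`` / ``None``) is the part callers depend on.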
1 parent 0635381 commit 0d3cf78

1 file changed

Lines changed: 52 additions & 12 deletions

gliner/model.py

@@ -15,9 +15,15 @@
 from packaging import version
 from safetensors import safe_open
 from transformers import AutoTokenizer
-from huggingface_hub import HfApi, PyTorchModelHubMixin, snapshot_download
+from huggingface_hub import (
+    PyTorchModelHubMixin,
+    hf_hub_download,
+    snapshot_download,
+    try_to_load_from_cache,
+)
 from torch.utils.data import DataLoader
 from safetensors.torch import save_file
+from huggingface_hub.errors import EntryNotFoundError
 
 try:
     from onnxruntime.quantization import QuantType, quantize_dynamic
@@ -339,33 +345,58 @@ def _variant_available(
     ) -> Optional[bool]:
         """Probe whether ``model.{variant}.safetensors`` is published.
 
-        For local paths the probe is a disk check. For Hub repos it calls
-        ``HfApi().list_repo_files``.
+        Resolution order (matches the ``transformers`` / ``huggingface_hub``
+        idiom — cheapest checks first, no list-files round-trip):
+
+        1. ``model_id`` is a local directory → ``Path.exists()``.
+        2. The file is in the local HF cache for this repo+revision →
+           ``try_to_load_from_cache``. Pure local lookup, no network.
+        3. ``local_files_only=True`` → return ``None`` (uncertain) without
+           hitting the network.
+        4. ``hf_hub_download`` for the variant filename: success means the
+           file exists (and is now cached, so the subsequent
+           ``snapshot_download`` reuses it); ``EntryNotFoundError`` means
+           the publisher hasn't uploaded a variant.
 
         Returns:
             ``True`` / ``False`` when known, or ``None`` when availability
-            cannot be determined (offline, transient API failure, gated
-            repo without auth). Callers should treat ``None`` as
-            "uncertain — try the narrow download and let it fail loudly".
+            cannot be determined (offline + uncached, transient API failure,
+            gated repo without auth). ``None`` is treated as "try the narrow
+            download and let it fail loudly".
         """
         target = f"model.{variant}.safetensors"
 
+        # 1. Local directory.
         model_dir = Path(model_id)
         if model_dir.exists() and model_dir.is_dir():
             return (model_dir / target).exists()
 
+        # 2. Already cached for this repo+revision? Pure local — no network.
+        # try_to_load_from_cache validates the repo_id format; an
+        # HFValidationError here means the input isn't a valid repo_id at
+        # all (e.g. a non-existent local path), so treat as uncertain.
+        try:
+            cached = try_to_load_from_cache(repo_id=model_id, filename=target, revision=revision)
+        except Exception:
+            return None
+        if isinstance(cached, str):
+            return True
+
+        # 3. Offline mode: can't probe further.
         if local_files_only:
             return None
 
+        # 4. Try-and-recover via hf_hub_download. Success caches the file so
+        # the subsequent snapshot_download reuses it (no double download).
         try:
-            files = HfApi().list_repo_files(repo_id=model_id, revision=revision, token=token)
+            hf_hub_download(repo_id=model_id, filename=target, revision=revision, token=token)
+            return True
+        except EntryNotFoundError:
+            return False
         except Exception:
-            # The list of possible huggingface_hub errors here is wide
-            # (network, auth, repo not found, gated). Treat any failure as
-            # "availability unknown" and let the subsequent download path
-            # produce the canonical error if there really is a problem.
+            # Auth / network / repo-not-found / gated — surface as uncertain
+            # and let the subsequent download path produce the canonical error.
             return None
-        return target in files
 
     @classmethod
     def _resolve_variant(
@@ -392,6 +423,15 @@ def _resolve_variant(
             model_id, variant, revision=revision, token=token, local_files_only=local_files_only
         )
         if available is False:
+            # TODO(strict-variant): once half-precision variant files have been
+            # published for the flagship GLiNER models on the Hub, flip this
+            # branch to raise ``EntryNotFoundError`` (or a wrapped equivalent)
+            # instead of warning + falling back. That matches
+            # ``transformers.PreTrainedModel.from_pretrained(variant=...)``,
+            # which is strict — explicit > implicit. The soft-fallback was a
+            # transitional choice taken because, at PR-merge time, no GLiNER
+            # repo on the Hub shipped a variant file, so a strict surface
+            # would have been broken-on-arrival for every caller.
             warnings.warn(
                 f"variant={variant!r} requested but 'model.{variant}.safetensors' is not "
                 f"published in {model_id!r}. Falling back to the default fp32 file with "
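
The TODO(strict-variant) note describes flipping the ``available is False`` branch from a warning to an error. A rough sketch of how such a switch might look follows. Everything in it is an assumption for illustration: ``choose_variant``, the ``strict`` flag, ``VariantNotPublished``, the message text, and the return convention (``None`` meaning "load the default file") are hypothetical and not part of this commit.

```python
import warnings
from typing import Optional


class VariantNotPublished(FileNotFoundError):
    """Hypothetical error type; the TODO itself names EntryNotFoundError."""


def choose_variant(variant: str, available: Optional[bool], strict: bool = False) -> Optional[str]:
    """Return the variant to load, or None to fall back to the default file."""
    if available is False:
        message = (
            f"variant={variant!r} requested but "
            f"'model.{variant}.safetensors' is not published."
        )
        if strict:
            # Strict surface: an explicit request that cannot be met fails.
            raise VariantNotPublished(message)
        # Soft fallback: warn and let the caller load the default weights.
        warnings.warn(message + " Falling back to the default weights.", stacklevel=2)
        return None
    # True (known present) or None (uncertain): attempt the narrow download
    # and let it fail loudly if the file is not there.
    return variant


# Soft mode: a missing variant produces one warning and a None result.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    soft_result = choose_variant("bf16", available=False)

# Strict mode: the same situation raises instead.
strict_raised = False
try:
    choose_variant("bf16", available=False, strict=True)
except VariantNotPublished:
    strict_raised = True
```

Keeping the ``None`` (uncertain) case on the "try the download" path in both modes mirrors the docstring of ``_variant_available``, which leaves the canonical error to the download step.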
