Skip to content

Avoid reading self source code via file open()#46152

Open
rasmi wants to merge 1 commit into
huggingface:mainfrom
rasmi:importlib-resources-source
Open

Avoid reading self source code via file open()#46152
rasmi wants to merge 1 commit into
huggingface:mainfrom
rasmi:importlib-resources-source

Conversation

@rasmi

@rasmi rasmi commented May 22, 2026

Copy link
Copy Markdown
Contributor

What does this PR do?

Avoid open()-reading transformers' own .py source in create_import_structure_from_path (alongside #46207). This is helpful for hermetic Python builds (e.g. with Bazel) and other non-standard environments (as opposed to open() which can vary based on the precise environment and filesystem).

Changes

  • Refactors create_import_structure_from_path into a path-based wrapper plus _create_import_structure_from_traversable, which walks via Traversable.iterdir() / read_text(). _resolve_traversable obtains the Traversable once from the already-imported root transformers package and recurses through it (avoiding files(subpackage), which would force-import the subpackage and defeat the _LazyModule bootstrap). Paths outside the transformers package (e.g. test fixtures) fall back to pathlib.Path, which exposes the same interface. (I left the API surface as-is so as to not have to refactor callers to here from __init__.py -- but can clean up the method and update callers if preferred).

Code Agent Policy

The Transformers repo is currently being overwhelmed by a large number of PRs and issue comments written by
code agents. We are currently bottlenecked by our ability to review and respond to them. As a result,
we ask that new users do not submit pure code agent PRs at this time.
You may use code agents in drafting or to help you diagnose issues. We'd also ask autonomous "OpenClaw"-like agents
not to open any PRs or issues for the moment.

PRs that appear to be fully agent-written will probably be closed without review, and we may block users who do this
repeatedly or maliciously.

This is a rapidly-evolving situation that's causing significant shockwaves in the open-source community. As a result,
this policy is likely to be updated regularly in the near future. For more information, please read CONTRIBUTING.md.

  • I confirm that this is not a pure code agent PR.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@zucchini-nlp @Rocketknight1

@zucchini-nlp zucchini-nlp left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, tbh the "open file to infer if impl can be set dynamically" is a dirty hack made in the past. I wonder if we can instead rely on self._supports_attention_backend to check if dynamic impl are supported, since checking the file content is not bullet proof anyway cc @Cyrilvallez

@vasqu

vasqu commented May 25, 2026

Copy link
Copy Markdown
Collaborator

_supports_attention_backend sadly alone is not viable because of some unique circumstances here and there (i.e. encoder-decoder attentions where we do not have proper kwargs handling atm) or just some models being legacy and having missing flags or similar.

So in the end, we are still kind of forced to check this imo. But maybe I'm overlooking something

@rasmi rasmi force-pushed the importlib-resources-source branch 2 times, most recently from 0729274 to 8ac17c2 Compare May 25, 2026 17:35
@rasmi

rasmi commented May 25, 2026

Copy link
Copy Markdown
Contributor Author

Thanks @zucchini-nlp @vasqu. I updated _can_set_attn_implementation and _can_set_experts_implementation to inspect the classes directly instead of reading the source (I didn't realize it was a hack to begin with). This tests the same properties as before, just by reading the class attributes instead of the file text.

I checked the behavior of the new class-based checks against the old source-reading approach for all 2150 PreTrainedModel classes / 457 modeling files:

  • _can_set_experts_implementation: agrees on every class.
  • _can_set_attn_implementation: agrees on every class except 12 in legacy models -- FSMT (3), OpenAIGPT (5), SLANet (4) -- where the old check returned True as a false positive because its regex (class \w+Attention(nn.Module)) didn't match either bare class Attention(nn.Module): (FSMT, GPT) or attention-named non-transformer modules (SLANet's GRU cell).

For those 12 mismatches, the new approach yielding False seems more correct to me. None of the FSMT / OpenAIGPT / SLANet models actually use _attn_implementation, so set_attn_implementation would silently appear to succeed without doing anything. Now it logs a warning instead, aligning with other models that don't use the modern interface.

LMK if this is okay, or if a different approach is preferred!

@rasmi rasmi changed the title Use importlib.resources for self-source reads Avoid reading self source code via file open() May 25, 2026
@Cyrilvallez

Copy link
Copy Markdown
Member

Hey @rasmi. No matter what we decide here, we really need to break this PR in two different PRs, one for can_set_xxx, and the other for create_import_structure_from_path. This is because one is much much more sensitive than the other, and we need to be able to properly review and backtrack it later

Comment thread src/transformers/modeling_utils.py Outdated
Comment on lines +2105 to +2108
return any(
"ALL_ATTENTION_FUNCTIONS" in getattr(getattr(attn_cls, "forward", None), "__globals__", {})
for attn_cls in attention_classes
)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be much better/easier with inspect.getsource or something

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cyrilvallez -- split this out into #46207 as requested, but we can keep the discussion here if easier. WDYT about the false positives in the source-reading approach vs. class inspection? Any preference in one approach over the other?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in #46207.

@rasmi

rasmi commented May 26, 2026

Copy link
Copy Markdown
Contributor Author

@Cyrilvallez -- moved the _can_set_*_implementation to #46207 and left this one for create_import_structure_from_path!

@rasmi rasmi requested a review from Cyrilvallez May 28, 2026 02:35
@rasmi

rasmi commented May 28, 2026

Copy link
Copy Markdown
Contributor Author

(@Cyrilvallez I requested a review here instead of #46207 by accident but it would be appreciated here too!)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants