fix: avoid leading separator in user_agent when no caller-supplied agent is provided #510

Merged
danieldk merged 2 commits into
huggingface:mainfrom
MedChaouch:fix/leading-semicolon-in-user-agent
May 1, 2026

Conversation

@MedChaouch
Contributor

Summary

_get_hf_api() in kernels/src/kernels/utils.py produces a malformed user agent when no user_agent argument is passed. In that case user_agent_str starts out empty, and the system info is appended with += "; kernels/...", which leaves a leading "; ". When the resulting string is forwarded to HfApi(...), huggingface_hub joins its own fields with another "; ", producing an empty token in the middle of the User-Agent header.

After the dedup/format step in huggingface_hub, this leaves a trailing "; " in the final header, which strict HTTP clients (httpx ≥ 0.25) reject with LocalProtocolError: Illegal header value.
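
To illustrate the mechanism described above, here is a minimal sketch of how unconditional appending leaves an empty first segment. The function name buggy_user_agent and the version numbers are placeholders for illustration, not the actual code in utils.py.

```python
def buggy_user_agent(caller_agent: str = "") -> str:
    """Hypothetical stand-in for the pre-fix logic (not the real utils.py code)."""
    user_agent_str = caller_agent
    # The separator is always prepended, even when nothing precedes it.
    user_agent_str += "; kernels/0.0.0; python/3.12"  # placeholder versions
    return user_agent_str


# With no caller-supplied agent, the first "; "-separated segment is empty.
segments = buggy_user_agent().split("; ")
print(segments)

# With a caller-supplied agent, the same code happens to produce a clean string.
print(buggy_user_agent("myapp/1.0"))
```

The bug is therefore only visible on the code paths that do not pass a user_agent, which matches the report above.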

This breaks any flow that triggers _get_hf_api() without supplying a user_agent — most notably _get_available_versions(), which is hit when transformers loads finegrained-fp8 / deep-gemm kernels.

Repro

from kernels._versions import _get_available_versions
_get_available_versions("kernels-community/finegrained-fp8")
# httpx.LocalProtocolError: Illegal header value
# b'kernels/0.13.0; hf_hub/1.12.2; python/3.12.3; '

End-to-end, the same bug surfaces as a crash when running a transformers Qwen3-VL-*-FP8 model — the FP8 kernel fetch never completes.

Fix

Build the system info as a separate sys_info string with no leading separator, then concatenate it onto any caller-supplied user_agent with a single "; ", and only when the caller-supplied part is non-empty.
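
The conditional join can be sketched as follows. The helper name combine_user_agent and the sample values are assumptions made for illustration. In the PR the same expression is applied inline inside _get_hf_api().

```python
def combine_user_agent(caller_agent: str, sys_info: str) -> str:
    """Hypothetical helper: join the two parts only when the caller part is non-empty."""
    return f"{caller_agent}; {sys_info}" if caller_agent else sys_info


sys_info = "kernels/0.0.0; python/3.12"  # placeholder versions

print(combine_user_agent("", sys_info))           # no leading separator
print(combine_user_agent("myapp/1.0", sys_info))  # single "; " between the parts
```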

Tests

Added test_user_agent_no_leading_or_empty_segment to kernels/tests/test_user_agent.py covering the three input shapes that previously triggered the bug (None, "", {}).
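
As a rough idea of what such a regression check can look like, the sketch below exercises the three empty input shapes against a hypothetical formatter. Both format_user_agent and check_no_leading_or_empty_segment are illustrative names. The real test in test_user_agent.py may be structured differently and runs against the actual _get_hf_api().

```python
from collections.abc import Mapping
from typing import Optional, Union


def format_user_agent(
    user_agent: Union[str, Mapping, None], sys_info: str
) -> str:
    """Hypothetical formatter accepting None, a string, or a mapping."""
    if isinstance(user_agent, Mapping):
        caller = "; ".join(f"{k}/{v}" for k, v in user_agent.items())
    else:
        caller = user_agent or ""
    return f"{caller}; {sys_info}" if caller else sys_info


def check_no_leading_or_empty_segment(value: str) -> None:
    """Fail if the value has stray whitespace, a leading ';', or an empty segment."""
    assert value == value.strip()
    assert not value.startswith(";")
    assert all(segment.strip() for segment in value.split(";"))


# The three input shapes named in the PR description.
for empty_input in (None, "", {}):
    result = format_user_agent(empty_input, "kernels/0.0.0; python/3.12")
    check_no_leading_or_empty_segment(result)
```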

_get_hf_api() built user_agent_str = "" then did
user_agent_str += "; kernels/...", producing a leading "; ". When the
resulting string was forwarded to HfApi(...), huggingface_hub joined
its own fields with another "; ", producing an empty token in the
final User-Agent. After dedup, this left a trailing "; " in the
header, which strict HTTP clients (httpx >= 0.25) reject with
LocalProtocolError: Illegal header value.

This breaks any code path that triggers _get_hf_api() without
supplying a user_agent — most notably _get_available_versions(),
which transformers hits when resolving finegrained-fp8 / deep-gemm
kernel versions for FP8 models.

Build the system info as a separate sys_info string with no leading
separator, then join it onto any caller-supplied user_agent with a
single "; " only when the caller-supplied part is non-empty. Adds
a regression test in test_user_agent.py covering None, "", and {}
inputs.
Comment thread kernels/src/kernels/utils.py Outdated
Comment on lines +638 to +657
  backend = _select_backend(None).variant_str
- user_agent_str += (
-     f"; kernels/{__version__}; python/{python}; backend/{backend}; platform/{_platform()}; file_type/kernel"
+ sys_info = (
+     f"kernels/{__version__}; python/{python}; backend/{backend}; platform/{_platform()}; file_type/kernel"
  )

  if has_torch:
      import torch

-     user_agent_str += f"; torch/{torch.__version__}"
+     sys_info += f"; torch/{torch.__version__}"
  if has_tvm_ffi:
      import tvm_ffi

-     user_agent_str += f"; tvm-ffi/{tvm_ffi.__version__}"
+     sys_info += f"; tvm-ffi/{tvm_ffi.__version__}"

  # Add glibc version if available
  glibc = glibc_version()
  if glibc is not None:
-     user_agent_str += f"; glibc/{glibc}"
+     sys_info += f"; glibc/{glibc}"

+ user_agent_str = f"{user_agent_str}; {sys_info}" if user_agent_str else sys_info
Member

I think rather than juggling with a lot of strings, it might be better to push everything to a list and then '; '.join it?

Contributor Author

Good call — refactored in 3208d97 to build a list of parts and '; '.join at the end. Behavior is unchanged, the test still passes, just less branchy.
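
For readers following the thread, the list-based approach can be sketched roughly like this. The function name join_user_agent_parts and the components argument are assumptions made for illustration. Commit 3208d97 itself is not reproduced here.

```python
from typing import Dict, List, Optional


def join_user_agent_parts(
    caller_agent: Optional[str], components: Dict[str, Optional[str]]
) -> str:
    """Hypothetical sketch: collect non-empty parts, then join them once."""
    parts: List[str] = []
    if caller_agent:
        parts.append(caller_agent)
    for name, version in components.items():
        # Optional components (e.g. torch, glibc) are skipped when unavailable.
        if version is not None:
            parts.append(f"{name}/{version}")
    return "; ".join(parts)


print(join_user_agent_parts(None, {"kernels": "0.0.0", "glibc": None}))
print(join_user_agent_parts("myapp/1.0", {"kernels": "0.0.0", "torch": "2.0"}))
```

Because str.join only places separators between elements, a leading or trailing "; " cannot appear as long as no empty string is pushed onto the list.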

Address review feedback: replace the user_agent_str string-juggling with a
single list of parts that gets joined at the end. Behavior is identical
to the previous commit (existing test_user_agent_no_leading_or_empty_segment
still passes), the code is just less branchy.
Member

@danieldk danieldk left a comment

Looks great, thanks!

@danieldk danieldk merged commit c1f8527 into huggingface:main May 1, 2026
57 of 58 checks passed