feat: tag S3 client user agent for request attribution #9795
Draft
goanpeca wants to merge 1 commit into
Signed-off-by: Gonzalo Peña-Castellanos <goanpeca@gmail.com>
Reason for change
Label Studio talks to object storage through boto3 in label_studio/io_storages/s3/utils.py. When it builds the S3 client and resource, it does not set a user_agent_extra, so every request goes out with only the default botocore user agent.
This makes it hard to identify Label Studio traffic against Amazon S3 or any S3-compatible object store (for example Backblaze B2, Cloudflare R2, or MinIO, all reached through the same code path via a custom endpoint_url). A stable client marker helps storage-side observability and support when diagnosing connection or throughput issues, without changing any request behavior.
The change attaches a label-studio/<version> token to user_agent_extra on the S3 client and resource. It follows the common framework/version user-agent convention, resolves the version from the installed package metadata, and falls back to label-studio/dev when the package is not installed (source checkout).
Screenshots
N/A. Backend-only change, no UI impact.
Rollout strategy
No feature flag required. The change is purely additive: it appends a token to the S3 client and resource user agent and leaves signature_version (s3v4) and the optional custom endpoint_url untouched. The client and resource now share a single boto3.session.Config instead of constructing two identical ones. No new configuration, environment variable, or dependency is introduced (importlib.metadata is standard library).
Testing
Added a unit test module label_studio/io_storages/s3/tests/test_user_agent.py that verifies:
- _get_user_agent_extra() returns a label-studio/ prefixed token.
- The client and resource carry that token in user_agent_extra.
- signature_version stays s3v4.
- endpoint_url is preserved end to end (using an S3-compatible endpoint as one example, alongside the default AWS S3 path).
Acceptance criteria: when Label Studio issues an S3 request (against AWS S3 or an S3-compatible endpoint), the outgoing user agent contains a label-studio/<version> token, and signature version and endpoint configuration are unchanged.
Run locally:
Risks
Low. The token only extends the user-agent string that boto3 already sends; it does not alter authentication, signing, or the request path. The existing test_resolve_s3_url.py behavior is unaffected. If package metadata cannot be resolved, the code degrades to label-studio/dev rather than raising.
Reviewer notes
Scope is intentionally minimal and additive (two files, one small helper plus the shared Config), in line with the contributing guide's preference for small, single-purpose PRs. The PR title uses the feat: prefix per the repository's title convention. Happy to guard this behind a flag or adjust the token format if you would prefer a different convention.