Commit c9749ef

feat: add HuggingFace Buckets API support (#22)
* docs: add buckets support design spec
  Design for adding HuggingFace Buckets API to the Rust SDK, covering HFBucket handle, API methods, xet integration, error handling improvements, blocking API, and CLI commands matching the Python hf buckets interface.
* docs: add buckets implementation plan
  13-task plan covering error handling, types, HFBucket handle, xet refactoring, API methods, blocking wrapper, and CLI commands.
* feat: add BucketNotFound, Forbidden, Conflict, RateLimited error variants
  Extend HFError with dedicated variants for HTTP 403, 409, and 429 status codes, plus BucketNotFound for bucket-specific 404s. Update check_response to map these status codes to the new variants instead of the generic Http error. Remove now-redundant inner status code matching in format_hf_error.
* feat: add bucket data types and param structs
* feat: add HFBucket handle with factory method and URL helpers
* refactor: extract xet token URL builders for repo and bucket
* feat: add bucket lifecycle API methods on HFClient
* feat: add bucket scoped API, xet upload/download methods
  Add impl HFBucket with info, list_tree, get_paths_info, get_file_metadata, batch, delete_files, upload_files, and download_files methods. Xet integration uses bucket-specific token URLs.
* feat: add HFBucketSync blocking wrapper
* feat: add hfrs buckets CLI scaffolding with create, info, delete, move
* feat: add hfrs buckets list, remove, and cp commands
* fix: add missing filename field to XetBatchFile in bucket downloads
* fix: address code review findings for buckets implementation
  - Restore RepoNotFound error context in fetch_xet_connection_info by passing NotFoundContext and identifier through to check_response
  - Add trailing newline to NDJSON batch body per spec
  - Use keyed HashMap lookup in download_files instead of positional zip to prevent silent data corruption on out-of-order server responses
  - Simplify get_file_metadata by calling check_response directly, removing unreachable!() and redundant status checks
  - Update CLAUDE.md project layout with new bucket modules
  - Add buckets example demonstrating create/list/info workflows
* docs: add bucket sync design spec
* docs: move progress into BucketSyncParams
* docs: add bucket sync implementation plan
* feat: add BucketSyncParams and SyncDirection types
* feat: add SyncPlan, SyncOperation, SyncAction types
* test: add unit tests for SyncPlan summary methods
* feat: implement HFBucket::sync() with plan computation and execution
* feat: add sync to blocking API wrapper
* feat: add hfrs buckets sync CLI subcommand
* test: add bucket sync integration tests
* feat: add buckets cargo feature gate
  All bucket-related code (types, API, blocking, xet impl, CLI) is now behind the "buckets" feature which requires "xet". The "cli" feature includes "buckets" automatically.
* test: switch test repos to hf-internal-testing owned repos
  - Models: openai-community/gpt2 → hf-internal-testing/tiny-gemma3
  - Datasets: rajpurkar/squad, xet-team/xet-spec-reference-files → hf-internal-testing/cats_vs_dogs_sample
  - Xet repo: mcpotato/42-xet-test-repo → hf-internal-testing/tiny-gemma3 (model.safetensors)
  - Keep openai-community as test_model_author for list_models search tests (hf-internal-testing models aren't returned by the listing API)
* test: add 1MB random binary file to bucket sync seed data
* fix: address code review findings for sync implementation
  - Pass path as parameter to compare_files instead of returning empty placeholder that callers must set after the fact
  - Remove unused bucket_resolve_url from client.rs
  - Wire SyncPlan.download_entries into execute_download_plan to avoid redundant get_paths_info API call and ensure plan integrity
  - Add structured tracing to HFBucket::xet_download_batch
* rm docs superpowers
* re-org
* try to fix rate limiting
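
The download_files review fix above replaces a positional zip with a keyed HashMap lookup. The sketch below illustrates that general technique only; it is not the code from this commit. `ResolvedFile`, its fields, and `match_by_path` are hypothetical names invented for the illustration, and it assumes each requested path appears at most once.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one entry of a batch response (not the SDK's type).
#[derive(Debug, Clone, PartialEq)]
struct ResolvedFile {
    path: String,
    download_url: String,
}

/// Pair each requested path with its response entry by key rather than by
/// position, so an out-of-order response cannot attach the wrong entry.
/// Returns an error naming the first requested path missing from the response.
fn match_by_path(
    requested: &[&str],
    responses: Vec<ResolvedFile>,
) -> Result<Vec<ResolvedFile>, String> {
    // Index the response entries by their path.
    let mut by_path: HashMap<String, ResolvedFile> = responses
        .into_iter()
        .map(|entry| (entry.path.clone(), entry))
        .collect();

    // Walk the request order and pull the matching entry for each path.
    requested
        .iter()
        .map(|p| {
            by_path
                .remove(*p)
                .ok_or_else(|| format!("missing entry for {p}"))
        })
        .collect()
}

fn main() {
    let requested = ["a.bin", "b.bin"];
    // The response order differs from the request order.
    let responses = vec![
        ResolvedFile { path: "b.bin".into(), download_url: "url-b".into() },
        ResolvedFile { path: "a.bin".into(), download_url: "url-a".into() },
    ];
    let matched = match_by_path(&requested, responses).expect("all paths present");
    assert_eq!(matched[0].download_url, "url-a");
    assert_eq!(matched[1].download_url, "url-b");
}
```

With a positional zip, the same input would have paired a.bin with the entry for b.bin without raising any error; the keyed lookup either returns the right entry or fails loudly.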
1 parent 54ea2a4 commit c9749ef

44 files changed

Lines changed: 3923 additions & 2162 deletions

AGENTS.md

Lines changed: 19 additions & 1 deletion
@@ -83,6 +83,7 @@ huggingface_hub_rust/
 │ │ ├── lib.rs # Public re-exports, crate docs
 │ │ ├── client.rs # HFClient, HFClientBuilder, HFClientInner, auth headers, URL builders
 │ │ ├── repository.rs # HFRepository/HFRepo handle, repo-scoped params, repo-bound methods
+│ │ ├── bucket.rs # HFBucket handle with factory method and URL helpers
 │ │ ├── constants.rs # Env var names, default URLs, repo type helpers
 │ │ ├── error.rs # HFError enum, Result alias, NotFoundContext
 │ │ ├── pagination.rs # Generic paginate<T>() with Link header parsing
@@ -96,6 +97,10 @@ huggingface_hub_rust/
 │ │ │ ├── user.rs # User, Organization, OrgMembership
 │ │ │ ├── commit.rs # CommitInfo, GitCommitInfo, GitRefs, CommitOperation, AddSource
 │ │ │ ├── params.rs # All *Params structs with TypedBuilder
+│ │ │ ├── buckets/
+│ │ │ │ ├── mod.rs # BucketInfo, BucketUrl, BucketTreeEntry, BucketFileMetadata types
+│ │ │ │ └── sync.rs # SyncPlan, SyncOperation, SyncAction types
+│ │ │ ├── bucket_params.rs # Bucket parameter structs with TypedBuilder
 │ │ │ └── spaces.rs # SpaceRuntime, SpaceVariable (behind "spaces" feature)
 │ │ └── api/
 │ │ ├── mod.rs # Module declarations
@@ -104,7 +109,20 @@ huggingface_hub_rust/
 │ │ ├── files.rs # File listing, download, upload, create_commit, snapshot_download
 │ │ ├── commits.rs # Commit listing, diffs, branch/tag management
 │ │ ├── users.rs # whoami, auth_check, user/org info, followers
-│ │ └── spaces.rs # Space runtime, secrets, variables, hardware, pause/restart
+│ │ ├── spaces.rs # Space runtime, secrets, variables, hardware, pause/restart
+│ │ └── buckets/
+│ │ ├── mod.rs # All bucket API methods (create, delete, list, move, tree, batch, download)
+│ │ └── sync.rs # HFBucket::sync() — plan computation and execution
+│ ├── src/bin/hfrs/commands/buckets/ # CLI bucket subcommands
+│ │ ├── mod.rs
+│ │ ├── create.rs
+│ │ ├── list.rs
+│ │ ├── info.rs
+│ │ ├── delete.rs
+│ │ ├── remove.rs
+│ │ ├── move_bucket.rs
+│ │ ├── cp.rs
+│ │ └── sync.rs
 │ └── tests/
 │ └── integration_test.rs # Integration tests against live Hub API
 ```

Cargo.lock

Lines changed: 12 additions & 14 deletions
Generated file; diff not rendered by default.
