
Commit 2f09539

stephantul and Pringled authored
Make token optional and private an argument, add template (#39)
* Fix modelcard comments
* Make token optional, add private as an optional argument which is set to False by default
* Update cite
* Resolved comment

---------

Co-authored-by: Pringled <thomas123@live.nl>
1 parent 62cba14 commit 2f09539

3 files changed

Lines changed: 12 additions & 10 deletions

model2vec/model.py

Lines changed: 4 additions & 2 deletions
@@ -209,15 +209,17 @@ def _batch(sentences: list[str], batch_size: int) -> Iterator[list[str]]:
         """Batch the sentences into equal-sized."""
         return (sentences[i : i + batch_size] for i in range(0, len(sentences), batch_size))
 
-    def push_to_hub(self, repo_id: str, token: str | None) -> None:
+    def push_to_hub(self, repo_id: str, private: bool = False, token: str | None = None) -> None:
         """
         Push the model to the huggingface hub.
 
         NOTE: you need to pass a token if you are pushing a private model.
 
         :param repo_id: The repo id to push to.
+        :param private: Whether the repo, if created is set to private.
+            If the repo already exists, this doesn't change the visibility.
         :param token: The huggingface token to use.
         """
         with TemporaryDirectory() as temp_dir:
             self.save_pretrained(temp_dir, model_name=repo_id)
-            push_folder_to_hub(Path(temp_dir), repo_id, token)
+            push_folder_to_hub(Path(temp_dir), repo_id, private, token)
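
Review note: `private` is inserted before `token`, so a caller that passed the token positionally against the old signature would now bind it to `private`. The sketch below illustrates this with two stand-in functions that only copy the parameter lists from the diff. `push_to_hub_old`, `push_to_hub_new`, the repo id, and the token value are hypothetical and are not part of model2vec; nothing is uploaded.

```python
from __future__ import annotations


def push_to_hub_old(repo_id: str, token: str | None) -> dict:
    """Stand-in with the pre-commit parameter order (no upload is performed)."""
    return {"repo_id": repo_id, "private": False, "token": token}


def push_to_hub_new(repo_id: str, private: bool = False, token: str | None = None) -> dict:
    """Stand-in with the post-commit parameter order (no upload is performed)."""
    return {"repo_id": repo_id, "private": private, "token": token}


# Placeholder value, not a real token.
TOKEN = "hf_example"

# A positional call that was correct against the old signature.
old_call = push_to_hub_old("my-org/my-model", TOKEN)

# The same positional call against the new signature:
# the token string is bound to `private`, and `token` stays None.
legacy_call = push_to_hub_new("my-org/my-model", TOKEN)

# Keyword arguments are unambiguous under the new signature.
keyword_call = push_to_hub_new("my-org/my-model", private=True, token=TOKEN)
```

Passing `private` and `token` by keyword avoids the ambiguity, as in the last call.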

model2vec/model_card_template.md

Lines changed: 5 additions & 6 deletions
@@ -4,9 +4,7 @@
 
 # {{ model_name }} Model Card
 
-Model2Vec distills a Sentence Transformer into a small, static model.
-This model is ideal for applications requiring fast, lightweight embeddings.
-
+This [Model2Vec](https://github.com/MinishLab/model2vec) model is a distilled version of {% if base_model %}the [{{ base_model }}](https://huggingface.co/{{ base_model }}){% else %}a{% endif %} Sentence Transformer. It uses static embeddings, allowing text embeddings to be computed orders of magnitude faster on both GPU and CPU. It is designed for applications where computational resources are limited or where real-time performance is critical.
 
 
 ## Installation
@@ -17,7 +15,7 @@ pip install model2vec
 ```
 
 ## Usage
-A StaticModel can be loaded using the `from_pretrained` method:
+Load this model using the `from_pretrained` method:
 ```python
 from model2vec import StaticModel
 
@@ -50,19 +48,20 @@ It works by passing a vocabulary through a sentence transformer model, then redu
 
 ## Additional Resources
 
+- [All Model2Vec models on the hub](https://huggingface.co/models?library=model2vec)
 - [Model2Vec Repo](https://github.com/MinishLab/model2vec)
 - [Model2Vec Results](https://github.com/MinishLab/model2vec?tab=readme-ov-file#results)
 - [Model2Vec Tutorials](https://github.com/MinishLab/model2vec/tree/main/tutorials)
 
-## Model Authors
+## Library Authors
 
 Model2Vec was developed by the [Minish Lab](https://github.com/MinishLab) team consisting of [Stephan Tulkens](https://github.com/stephantul) and [Thomas van Dongen](https://github.com/Pringled).
 
 ## Citation
 
 Please cite the [Model2Vec repository](https://github.com/MinishLab/model2vec) if you use this model in your work.
 ```
-@software{minishlab2024word2vec,
+@software{minishlab2024model2vec,
 authors = {Stephan Tulkens, Thomas van Dongen},
 title = {Model2Vec: Turn any Sentence Transformer into a Small Fast Model},
 year = {2024},

model2vec/utils.py

Lines changed: 3 additions & 2 deletions
@@ -165,16 +165,17 @@ def load_pretrained(
     return embeddings, tokenizer, config
 
 
-def push_folder_to_hub(folder_path: Path, repo_id: str, token: str | None) -> None:
+def push_folder_to_hub(folder_path: Path, repo_id: str, private: bool, token: str | None) -> None:
     """
     Push a model folder to the huggingface hub, including model card.
 
     :param folder_path: The path to the folder.
     :param repo_id: The repo name.
+    :param private: Whether the repo is private.
     :param token: The huggingface token.
     """
     if not huggingface_hub.repo_exists(repo_id=repo_id, token=token):
-        huggingface_hub.create_repo(repo_id, token=token, private=private)
+        huggingface_hub.create_repo(repo_id, token=token, private=private)
 
     # Push model card and all model files to the Hugging Face hub
     huggingface_hub.upload_folder(repo_id=repo_id, folder_path=folder_path, token=token)
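
Review note: because `create_repo` is only reached when `repo_exists` returns false, `private` affects a repository only at creation time, which matches the `push_to_hub` docstring. The sketch below reproduces that guard against an in-memory stand-in. `FakeHub`, `ensure_repo`, and the repo id are hypothetical names used for illustration; no `huggingface_hub` call is made.

```python
from __future__ import annotations


class FakeHub:
    """In-memory stand-in for the hub (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        # Maps repo_id to its visibility flag (True means private).
        self.repos: dict[str, bool] = {}

    def repo_exists(self, repo_id: str) -> bool:
        return repo_id in self.repos

    def create_repo(self, repo_id: str, private: bool = False) -> None:
        self.repos[repo_id] = private


def ensure_repo(hub: FakeHub, repo_id: str, private: bool) -> None:
    """Create the repo only when it is missing, like the guard in the diff."""
    if not hub.repo_exists(repo_id):
        hub.create_repo(repo_id, private=private)


hub = FakeHub()

# First push: the repo does not exist, so it is created as public.
ensure_repo(hub, "my-org/my-model", private=False)

# Second push with private=True: the repo exists, so the flag is ignored.
ensure_repo(hub, "my-org/my-model", private=True)

still_public = hub.repos["my-org/my-model"] is False
```

Under this guard, changing the visibility of an existing repository has to happen outside `push_to_hub`.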
