The embedding code makes multiple requests to the Hugging Face API every time we load the model. It appears to be probing for the configuration files it needs before loading the model weights. We should work out which configuration files are actually required and cache them on the first run. (Ideally we would also load the model only once per session, but I think that should be tracked separately.)
Example logging output:
[2026-05-15 08:52:22] INFO:httpx::HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/sentence-transformers/paraphrase-multilingual-mpnet-base-v2/4328cf26390c98c5e3c738b4460a05b95f4911f5/modules.json "HTTP/1.1 200 OK"
[2026-05-15 08:52:22] INFO:httpx::HTTP Request: HEAD https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2/resolve/main/sentence_bert_config.json "HTTP/1.1 307 Temporary Redirect"
[2026-05-15 08:52:22] INFO:httpx::HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/sentence-transformers/paraphrase-multilingual-mpnet-base-v2/4328cf26390c98c5e3c738b4460a05b95f4911f5/sentence_bert_config.json "HTTP/1.1 200 OK"
[2026-05-15 08:52:22] INFO:httpx::HTTP Request: HEAD https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2/resolve/main/adapter_config.json "HTTP/1.1 404 Not Found"
[2026-05-15 08:52:22] INFO:httpx::HTTP Request: HEAD https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2/resolve/main/config.json "HTTP/1.1 307 Temporary Redirect"
[2026-05-15 08:52:22] INFO:httpx::HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/sentence-transformers/paraphrase-multilingual-mpnet-base-v2/4328cf26390c98c5e3c738b4460a05b95f4911f5/config.json "HTTP/1.1 200 OK"
Loading weights: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 199/199 [00:00<00:00, 7989.38it/s]
[2026-05-15 08:52:22] INFO:httpx::HTTP Request: HEAD https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2/resolve/main/processor_config.json "HTTP/1.1 404 Not Found"
[2026-05-15 08:52:22] INFO:httpx::HTTP Request: HEAD https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2/resolve/main/preprocessor_config.json "HTTP/1.1 404 Not Found"
[2026-05-15 08:52:22] INFO:httpx::HTTP Request: HEAD https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2/resolve/main/video_preprocessor_config.json "HTTP/1.1 404 Not Found"
[2026-05-15 08:52:22] INFO:httpx::HTTP Request: HEAD https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2/resolve/main/preprocessor_config.json "HTTP/1.1 404 Not Found"
[2026-05-15 08:52:22] INFO:httpx::HTTP Request: HEAD
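One possible direction, as a minimal sketch rather than a confirmed fix: try loading from the local Hugging Face cache first and fall back to the network only when the cache is cold. This assumes the embedding code uses `sentence_transformers.SentenceTransformer` and that the installed version accepts the `local_files_only` argument. `load_embedding_model` and `_default_loader` are hypothetical names, and the exception raised on a cache miss varies by library version, so the sketch catches broadly.

```python
from typing import Any, Callable

MODEL_NAME = "sentence-transformers/paraphrase-multilingual-mpnet-base-v2"


def _default_loader(name: str, local_files_only: bool) -> Any:
    # Imported lazily so this module can be imported without the dependency.
    # Assumption: the installed sentence-transformers supports local_files_only.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(name, local_files_only=local_files_only)


def load_embedding_model(
    name: str = MODEL_NAME,
    loader: Callable[[str, bool], Any] = _default_loader,
) -> Any:
    """Load from the local cache if possible; contact the Hub only on a miss."""
    try:
        # Warm cache: no requests to huggingface.co should be needed.
        return loader(name, True)
    except Exception:
        # Cold cache (first run): the exact exception type is an assumption,
        # so we catch broadly and retry with network access allowed.
        return loader(name, False)
```

Setting the environment variable `HF_HUB_OFFLINE=1` before the libraries are imported is another option once the files are cached. Whether either approach removes every `HEAD` request shown above would need to be checked against the log.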