docs: update READMEs for the params-at-init encoder API

MechaCritter · claude · MechaCritter · commit 6bd934273cbe · 2026-06-13T10:26:55.000+02:00
Quickstart now configures the encoder from parameters, calls learn()
and shows save_to_disk/load_from_disk with .encoder files. Document the
kmeans_params/gmm_params/pca_params dictionaries in the encoders README
and mark KMeansWeights/GMMWeights loading as deprecated.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
diff --git a/README.md b/README.md
@@ -55,6 +55,13 @@ similarity_score = encoder.similarity_score(image1, image2)
 
 print(f"Similarity Score: {similarity_score}")
 ```
+
+A fitted encoder can be saved to a `.encoder` file and restored later:
+
+```python
+path = encoder.save_to_disk("vlad_oxford102")  # writes vlad_oxford102.encoder
+encoder = VLADEncoder.load_from_disk(path)
+```
 You can also visit the [introduction notebook](examples/getting_started.ipynb) for more examples.
 
 I also provided various notebooks for different use-cases. Feel free to check them out, and let me know if you
@@ -109,14 +116,19 @@ For more details on the dataset, please refer to the [documentation](pyvisim/dat
 
 ## Pretrained Models
 
+> [!CAUTION]
+> **Deprecated:** Loading pretrained models via the `KMeansWeights`/`GMMWeights` enums is deprecated
+> and will be removed in a future release. Instead, train an encoder with `learn()` and persist it with
+> `save_to_disk()`/`load_from_disk()` (`.encoder` files).
+
 The following pretrained models are provided for clustering and dimensionality reduction. All clustering
 models were trained with `k=256`. The choice of `k` was made arbitrarily
 based on the paper <sup>[5](#references)</sup>, where the authors tested with `k=32`, `64`, `128`, `256`, `512`, and so on.
 Since higher values would take too long, I chose `k=256` as a balance between performance and computational cost.
 
 ### KMeans Models
 
-You can access these weights by importing `KMWeights` from the `pyvisim.encoders` module.
+You can access these weights by importing `KMeansWeights` from the `pyvisim.encoders` module.
 
 | Model Name                             | Features Extracted From | PCA Applied | Feature Dimensions |
 |----------------------------------------|-------------------------|-------------|--------------------|
diff --git a/pyvisim/encoders/README.md b/pyvisim/encoders/README.md
@@ -29,6 +29,40 @@ differ in the way they aggregate these descriptors and the underlying clustering
 After the feature extraction step, the local features are aggregated to their respective cluster centers. The final
 encoding matrix is then flattened and normalized to produce the final feature vector representation of the image.
 
+## Configuring Encoders
+
+The encoders build their clustering models internally: VLAD always uses K-Means and the Fisher Vector encoder always
+uses a Gaussian Mixture Model (both implemented in `pyvisim.clustering`).
+
+```python
+from pyvisim.encoders import VLADEncoder, FisherVectorEncoder
+
+vlad = VLADEncoder(
+    n_clusters=256,
+    kmeans_params={"random_state": 42},
+    pca_params={"n_components": 64},
+)
+fisher = FisherVectorEncoder(
+    n_components=256,
+    gmm_params={"random_state": 42},
+)
+```
+
+Calling `learn(images)` fits the configured PCA (if any) and the clustering model. A fitted encoder can be saved to
+disk and restored later:
+
+```python
+vlad.learn(images)
+path = vlad.save_to_disk("vlad")  # writes vlad.encoder
+vlad = VLADEncoder.load_from_disk(path)
+```
+
+The `.encoder` file stores the fitted clustering model, the PCA model and the normalization hyperparameters. The
+feature extractor and the similarity function are not serialized; provide them again when loading.
+
+Loading pretrained models via the `KMeansWeights`/`GMMWeights` enums is deprecated and will be removed in a future
+release. Train an encoder with `learn()` and persist it with `save_to_disk()`/`load_from_disk()` instead.
+
 ## Similarity Metric Pipeline
 The _Pipeline_ class is designed to handle multiple encoders simultaneously to compute feature vectors. It takes
 a list of encoders (instances of the ImageEncoderBase class defined in the '_base_encoder.py' file) and a function