Commit a3e6126

MechaCritter and claude authored
Feat/clustering models (#19)

* feat(clustering): add clustering and PCA models backed by scikit-learn

  Introduce pyvisim/clustering with KMeans, GaussianMixtureModel and PCA, models that own the underlying scikit-learn estimator and expose the attributes the encoders need (cluster_centers, weights, means, covariances, n_components, n_features_in, ...) through typed getters. The models take the scikit-learn constructor parameters directly and are created unfitted; this prepares for removing scikit-learn objects from the encoder constructors.

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* refactor(encoders)!: configure clustering and PCA via params, not sklearn objects

  Breaking change: VLADEncoder and FisherVectorEncoder no longer accept scikit-learn estimators (kmeans_model/gmm_model/pca) in their constructors. VLAD always uses K-Means and Fisher Vectors always use a GMM, so the encoders now build the matching pyvisim.clustering models themselves from the parameters passed at initialization: n_clusters/n_components plus the optional kmeans_params/gmm_params and pca_params dictionaries, whose entries are forwarded verbatim to the underlying scikit-learn estimators.

  - learn() no longer takes n_clusters/kwargs; it fits the models that were configured at initialization. A configured PCA is now applied (and fitted first if necessary) before fitting the clustering model; previously it was silently reset with a warning.
  - All scikit-learn attribute access (cluster_centers_, weights_, means_, covariances_, n_features_in_, ...) goes through the clustering and PCA model getters.
  - Dimension validation is skipped for unfitted models and applies once the models are fitted.
  - The default RootSIFT feature extractor moved into ImageEncoderBase.
  - Loading pretrained KMeansWeights/GMMWeights still works; the loaded estimators are adopted by the corresponding pyvisim models.

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(encoders): add save_to_disk/load_from_disk with .encoder files

  Encoders can now persist their learned state to a versioned .encoder file (fitted clustering model, PCA model and normalization hyperparameters) and be restored from it via the load_from_disk classmethod. The feature extractor and similarity function are not serialized and are provided again at load time; dimension validation runs on restore. This is the designated replacement for loading pretrained models via the KMeansWeights/GMMWeights enums.

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(encoders): deprecate loading from KMeansWeights/GMMWeights

  Passing the weights enums to the encoder constructors now emits a DeprecationWarning; the enums and the loading path will be removed in a future release in favor of save_to_disk()/load_from_disk() with .encoder files. The enum docstrings carry the same notice.

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs: update READMEs for the params-at-init encoder API

  Quickstart now configures the encoder from parameters, calls learn() and shows save_to_disk/load_from_disk with .encoder files. Document the kmeans_params/gmm_params/pca_params dictionaries in the encoders README and mark KMeansWeights/GMMWeights loading as deprecated.

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* in image encoder base, dim_reduction_factor is runtime-checked (has to be an integer greater than 0)

* n_components in PCA has to be greater than zero

* added _ENCODER_FILE_FORMAT_VERSION_COMPATIBILITY in case future updates use a different format

* now raise an error when a covariance type other than `diag` is passed to GMM (instead of warning and mutating it, as was done previously)

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
1 parent e9d3076 commit a3e6126

11 files changed

Lines changed: 682 additions & 139 deletions

README.md

Lines changed: 13 additions & 1 deletion
@@ -57,6 +57,13 @@ similarity_score = encoder.similarity_score(image1, image2)
 
 print(f"Similarity Score: {similarity_score}")
 ```
+
+A fitted encoder can be saved to a `.encoder` file and restored later:
+
+```python
+path = encoder.save_to_disk("vlad_oxford102") # writes vlad_oxford102.encoder
+encoder = VLADEncoder.load_from_disk(path)
+```
 You can also visit the [introduction notebook](examples/getting_started.ipynb) for more examples.
 
 I also provided various notebooks for different use-cases. Feel free to check them out, and let me know if you
@@ -111,14 +118,19 @@ For more details on the dataset, please refer to the [documentation](pyvisim/dat
 
 ## Pretrained Models
 
+> [!CAUTION]
+> **Deprecated:** Loading pretrained models via the `KMeansWeights`/`GMMWeights` enums is deprecated
+> and will be removed in a future release. Train an encoder with `learn()` and persist it with `save_to_disk()`/
+> `load_from_disk()` (`.encoder` files) instead.
+
 The following pretrained models are provided for clustering and dimensionality reduction. All clustering
 models were trained with `k=256`. The choice of `k` was made arbitrarily
 based on the paper <sup>[5](#references)</sup>, where the authors tested with `k=32`, `64`, `128`, `256`, `512`, and so on.
 Since higher values would take too long, I chose `k=256` as a balance between performance and computational cost.
 
 ### KMeans Models
 
-You can access these weights by importing `KMWeights` from the `pyvisim.encoders` module.
+You can access these weights by importing `KMeansWeights` from the `pyvisim.encoders` module.
 
 | Model Name | Features Extracted From | PCA Applied | Feature Dimensions |
 |----------------------------------------|-------------------------|-------------|--------------------|
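The deprecation notice in the README diff implies that the weights enums are still accepted but now trigger a warning. A minimal sketch of that pattern follows, assuming a stand-in enum `LegacyWeights` and a hypothetical helper `resolve_weights`. Neither name comes from pyvisim, and the real check sits in the encoder constructors, which this page does not show.

```python
import enum
import warnings
from typing import Optional


class LegacyWeights(enum.Enum):
    """Stand-in for a pretrained-weights enum (hypothetical member)."""

    EXAMPLE = "example_weights.pkl"


def resolve_weights(weights: Optional[LegacyWeights]) -> Optional[str]:
    """Return the weight file name, warning when the legacy path is used."""
    if weights is None:
        return None
    warnings.warn(
        "Loading pretrained weights via enums is deprecated; "
        "use save_to_disk()/load_from_disk() with .encoder files instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this helper
    )
    return weights.value
```

Python hides `DeprecationWarning` by default unless it is triggered from `__main__`, so running with `-W default` (or under a test runner) is usually needed to see it.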

pyvisim/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -2,4 +2,4 @@
 PyVisim: A Python library for image similarity analysis using Image Encoders and Neural Networks.
 """
 
-__all__ = ["datasets", "encoders", "features", "eval"]
+__all__ = ["clustering", "datasets", "encoders", "features", "eval"]

pyvisim/clustering/__init__.py

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+from ._base_clustering import ClusteringModelBase
+from .gmm import GaussianMixtureModel
+from .kmeans import KMeans
+from .pca import PCA
+
+__all__ = [
+    "ClusteringModelBase",
+    "KMeans",
+    "GaussianMixtureModel",
+    "PCA",
+]
Lines changed: 108 additions & 0 deletions
@@ -0,0 +1,108 @@
+"""
+Base classes for the scikit-learn-backed models used by the image encoders.
+"""
+
+import abc
+from typing import Any, ClassVar, TypeVar
+
+import numpy as np
+from sklearn.exceptions import NotFittedError
+from sklearn.utils.validation import check_is_fitted
+
+_SklearnModelT = TypeVar("_SklearnModelT", bound="_SklearnModelBase")
+
+
+class _SklearnModelBase(abc.ABC):
+    """
+    Base class for models backed by a scikit-learn estimator.
+
+    :param model: Underlying scikit-learn estimator instance.
+    """
+
+    _sklearn_class: ClassVar[type[Any]]
+
+    def __init__(self, model: Any) -> None:
+        self._model = model
+
+    @property
+    def is_fitted(self) -> bool:
+        """Whether the underlying estimator has been fitted."""
+        try:
+            check_is_fitted(self._model)
+        except NotFittedError:
+            return False
+        return True
+
+    def _check_is_fitted(self) -> None:
+        """
+        Ensures the underlying estimator is fitted before accessing
+        fitted-only attributes.
+
+        :raises NotFittedError: If the underlying estimator is not fitted.
+        """
+        if not self.is_fitted:
+            raise NotFittedError(
+                f"This {type(self).__name__} instance is not fitted yet. "
+                "Call 'fit' with appropriate data before using this attribute."
+            )
+
+    @property
+    def n_features_in(self) -> int:
+        """
+        Number of features the fitted estimator expects as input.
+
+        :raises NotFittedError: If the underlying estimator is not fitted.
+        """
+        self._check_is_fitted()
+        return int(self._model.n_features_in_)
+
+    def fit(self, features: np.ndarray) -> None:
+        """
+        Fits the underlying estimator on the given feature matrix.
+
+        :param features: Feature matrix of shape (n_samples, n_features).
+        """
+        self._model.fit(features)
+
+    def _validate_sklearn_model(self) -> None:  # noqa: B027
+        """
+        Hook for subclasses to validate (or coerce) an estimator that was
+        passed in directly via :meth:`_from_sklearn`.
+        """
+
+    @classmethod
+    def _from_sklearn(cls: type[_SklearnModelT], model: Any) -> _SklearnModelT:
+        """
+        Creates a model from an existing scikit-learn estimator.
+
+        This is an internal constructor used to adopt pretrained estimators
+        (loaded from legacy weight files).
+
+        :param model: Estimator instance of the underlying scikit-learn class.
+        :return: A model backed by the given estimator.
+        :raises TypeError: If ``model`` is not an instance of the underlying class.
+        """
+        if not isinstance(model, cls._sklearn_class):
+            raise TypeError(
+                f"{cls.__name__} can only be created from instances of "
+                f"{cls._sklearn_class.__name__}, not {type(model).__name__}."
+            )
+        instance = cls.__new__(cls)
+        _SklearnModelBase.__init__(instance, model)
+        instance._validate_sklearn_model()
+        return instance
+
+    def __repr__(self) -> str:
+        return f"{type(self).__name__}(model={self._model!r})"
+
+
+class ClusteringModelBase(_SklearnModelBase):
+    """
+    Base class for clustering models.
+    """
+
+    @property
+    @abc.abstractmethod
+    def n_clusters(self) -> int:
+        """Number of clusters (or mixture components) of the estimator."""
+        raise NotImplementedError
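The `_from_sklearn` classmethod in this file adopts an existing estimator by allocating the instance with `cls.__new__` and skipping the subclass `__init__`, which would otherwise build a fresh estimator from parameters. The sketch below isolates that pattern with stand-in classes. `FakeEstimator`, `Wrapper` and `adopt` are hypothetical names used only for illustration and are not part of pyvisim or scikit-learn.

```python
from typing import Any, ClassVar


class FakeEstimator:
    """Stand-in for a scikit-learn estimator (hypothetical)."""

    def __init__(self, n_clusters: int = 8) -> None:
        self.n_clusters = n_clusters


class Wrapper:
    """Simplified analogue of a model that owns its backend estimator."""

    _backend_class: ClassVar[type[Any]] = FakeEstimator

    def __init__(self, n_clusters: int = 8) -> None:
        # Normal path: build a fresh backend from constructor parameters.
        self._model = FakeEstimator(n_clusters=n_clusters)

    @classmethod
    def adopt(cls, model: Any) -> "Wrapper":
        # Alternate path: wrap an existing backend without running __init__,
        # so the given object is kept instead of being replaced by a new one.
        if not isinstance(model, cls._backend_class):
            raise TypeError(
                f"{cls.__name__} can only adopt {cls._backend_class.__name__}, "
                f"not {type(model).__name__}."
            )
        instance = cls.__new__(cls)
        instance._model = model
        return instance
```

The type check runs before allocation, so a mismatched object never ends up inside a half-initialized wrapper.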

pyvisim/clustering/gmm.py

Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
+"""Gaussian Mixture Model used by the Fisher Vector encoder."""
+
+from typing import Any
+
+import numpy as np
+from sklearn.mixture import GaussianMixture
+
+from ._base_clustering import ClusteringModelBase
+
+
+class GaussianMixtureModel(ClusteringModelBase):
+    """
+    Gaussian Mixture clustering model, used by the Fisher Vector encoder.
+    It is backed by :class:`sklearn.mixture.GaussianMixture`.
+
+    Only diagonal covariance matrices are supported: the Fisher Vector
+    computation relies on them, and training is much faster.
+
+    :param n_components: Number of mixture components.
+    :param gmm_params: Additional keyword arguments forwarded verbatim to
+        :class:`sklearn.mixture.GaussianMixture` (e.g. ``random_state``).
+    :raises ValueError: If a ``covariance_type`` other than ``"diag"`` is requested.
+    """
+
+    _sklearn_class = GaussianMixture
+
+    def __init__(self, n_components: int = 256, **gmm_params: Any) -> None:
+        covariance_type = gmm_params.pop("covariance_type", "diag")
+        if covariance_type != "diag":
+            raise ValueError(
+                f"{type(self).__name__} only supports covariance_type='diag', "
+                f"got {covariance_type!r}."
+            )
+        super().__init__(
+            GaussianMixture(
+                n_components=n_components, covariance_type="diag", **gmm_params
+            )
+        )
+
+    def _validate_sklearn_model(self) -> None:
+        if self._model.covariance_type != "diag":
+            raise ValueError(
+                f"{type(self).__name__} only supports covariance_type='diag', "
+                f"got {self._model.covariance_type!r}."
+            )
+
+    @property
+    def n_clusters(self) -> int:
+        """Number of mixture components of the GMM."""
+        return int(self._model.n_components)
+
+    @property
+    def weights(self) -> np.ndarray:
+        """
+        Mixture weights of each component, shape (n_components,).
+
+        :raises NotFittedError: If the underlying estimator is not fitted.
+        """
+        self._check_is_fitted()
+        return np.asarray(self._model.weights_)
+
+    @property
+    def means(self) -> np.ndarray:
+        """
+        Mean of each mixture component, shape (n_components, n_features).
+
+        :raises NotFittedError: If the underlying estimator is not fitted.
+        """
+        self._check_is_fitted()
+        return np.asarray(self._model.means_)
+
+    @property
+    def covariances(self) -> np.ndarray:
+        """
+        Diagonal covariance of each component, shape (n_components, n_features).
+
+        :raises NotFittedError: If the underlying estimator is not fitted.
+        """
+        self._check_is_fitted()
+        return np.asarray(self._model.covariances_)
+
+    def predict_proba(self, features: np.ndarray) -> np.ndarray:
+        """
+        Evaluates the components' posterior probability for each feature vector.
+
+        :param features: Feature matrix of shape (n_samples, n_features).
+        :return: Posterior probabilities, shape (n_samples, n_components).
+        :raises NotFittedError: If the underlying estimator is not fitted.
+        """
+        self._check_is_fitted()
+        return np.asarray(self._model.predict_proba(features))
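To illustrate why the diagonal restriction keeps things simple, the sketch below computes component posteriors by hand from arrays shaped like the `weights`, `means` and `covariances` getters above. It is a plain NumPy illustration of the standard diagonal-Gaussian formulas, not the code scikit-learn or pyvisim runs, and the function name `diag_gmm_posteriors` is made up for this example.

```python
import numpy as np


def diag_gmm_posteriors(
    features: np.ndarray,
    weights: np.ndarray,
    means: np.ndarray,
    covariances: np.ndarray,
) -> np.ndarray:
    """
    Posterior probability of each component for each sample.

    features: (n_samples, n_features); weights: (n_components,);
    means and covariances: (n_components, n_features), where covariances
    holds the per-dimension variances of each component.
    """
    diff = features[:, None, :] - means[None, :, :]  # (n, k, d)
    # With diagonal covariances the quadratic form is a per-dimension sum.
    quad = (diff**2 / covariances[None, :, :]).sum(axis=2)  # (n, k)
    log_det = np.log(covariances).sum(axis=1)  # (k,)
    n_features = features.shape[1]
    log_density = -0.5 * (n_features * np.log(2.0 * np.pi) + log_det + quad)
    log_joint = np.log(weights) + log_density  # (n, k)
    # Normalize in log space for numerical stability.
    log_joint = log_joint - log_joint.max(axis=1, keepdims=True)
    joint = np.exp(log_joint)
    return joint / joint.sum(axis=1, keepdims=True)
```

No matrix inversion or determinant routine is needed: both reduce to element-wise operations on the `(n_components, n_features)` variance array.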

pyvisim/clustering/kmeans.py

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
+"""K-Means clustering class used by the VLAD encoder."""
+
+from typing import Any
+
+import numpy as np
+from sklearn.cluster import KMeans as _SklearnKMeans
+
+from ._base_clustering import ClusteringModelBase
+
+
+class KMeans(ClusteringModelBase):
+    """
+    K-Means clustering model, used by the VLAD encoder. It is
+    backed by :class:`sklearn.cluster.KMeans`.
+
+    :param n_clusters: Number of clusters to form.
+    :param kmeans_params: Additional keyword arguments forwarded verbatim to
+        :class:`sklearn.cluster.KMeans` (e.g. ``random_state``, ``n_init``).
+    """
+
+    _sklearn_class = _SklearnKMeans
+
+    def __init__(self, n_clusters: int = 256, **kmeans_params: Any) -> None:
+        super().__init__(_SklearnKMeans(n_clusters=n_clusters, **kmeans_params))
+
+    @property
+    def n_clusters(self) -> int:
+        """Number of clusters of the K-Means model."""
+        return int(self._model.n_clusters)
+
+    @property
+    def cluster_centers(self) -> np.ndarray:
+        """
+        Coordinates of the cluster centers, shape (n_clusters, n_features).
+
+        :raises NotFittedError: If the underlying estimator is not fitted.
+        """
+        self._check_is_fitted()
+        return np.asarray(self._model.cluster_centers_)
+
+    def predict(self, features: np.ndarray) -> np.ndarray:
+        """
+        Predicts the closest cluster for each feature vector.
+
+        :param features: Feature matrix of shape (n_samples, n_features).
+        :return: Cluster index of each sample, shape (n_samples,).
+        :raises NotFittedError: If the underlying estimator is not fitted.
+        """
+        self._check_is_fitted()
+        return np.asarray(self._model.predict(features))
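For readers unfamiliar with what "closest cluster" means in `predict`, here is a small NumPy sketch of nearest-center assignment by squared Euclidean distance. It works on arrays shaped like `cluster_centers` above; the helper name `assign_to_centers` is an assumption for this example and does not exist in pyvisim.

```python
import numpy as np


def assign_to_centers(features: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """
    Index of the nearest center for each feature vector.

    features: (n_samples, n_features); centers: (n_clusters, n_features).
    Returns an integer array of shape (n_samples,).
    """
    # Pairwise squared distances via broadcasting, shape (n_samples, n_clusters).
    diff = features[:, None, :] - centers[None, :, :]
    squared_distances = (diff**2).sum(axis=2)
    return squared_distances.argmin(axis=1)


centers = np.array([[0.0, 0.0], [10.0, 10.0]])
features = np.array([[1.0, 1.0], [9.0, 8.0], [0.0, -1.0]])
labels = assign_to_centers(features, centers)  # array([0, 1, 0])
```

The broadcasted difference allocates an `(n_samples, n_clusters, n_features)` array, which is fine for a demonstration but can be memory-hungry for large descriptor sets.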

pyvisim/clustering/pca.py

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+"""Principal Component Analysis model used by the image encoders."""
+
+from typing import Any
+
+import numpy as np
+from sklearn.decomposition import PCA as _SklearnPCA
+
+from ._base_clustering import _SklearnModelBase
+
+
+class PCA(_SklearnModelBase):
+    """
+    Principal Component Analysis model, used by the image encoders to
+    reduce the dimensionality of local features. It is backed by
+    :class:`sklearn.decomposition.PCA`.
+
+    :param n_components: Number of components to keep.
+    :param pca_params: Additional keyword arguments forwarded verbatim to
+        :class:`sklearn.decomposition.PCA` (e.g. ``whiten``, ``random_state``).
+    """
+
+    _sklearn_class = _SklearnPCA
+
+    def __init__(self, n_components: int, **pca_params: Any) -> None:
+        super().__init__(_SklearnPCA(n_components=n_components, **pca_params))
+
+    @property
+    def n_components(self) -> int:
+        """
+        Number of components of the fitted PCA.
+
+        :raises NotFittedError: If the underlying estimator is not fitted.
+        """
+        self._check_is_fitted()
+        return int(self._model.n_components_)
+
+    def transform(self, features: np.ndarray) -> np.ndarray:
+        """
+        Projects the given features onto the principal components.
+
+        :param features: Feature matrix of shape (n_samples, n_features).
+        :return: Reduced features of shape (n_samples, n_components).
+        :raises NotFittedError: If the underlying estimator is not fitted.
+        """
+        self._check_is_fitted()
+        return np.asarray(self._model.transform(features))
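The projection that `transform` delegates to can be sketched in a few lines of NumPy: center the data, then multiply by the transposed component matrix. The helpers `fit_pca` and `project` below are hypothetical and use a plain SVD without whitening. The positive-integer check mirrors the `n_components` requirement mentioned in the commit message, though where pyvisim enforces it is not visible in this diff.

```python
import numpy as np


def fit_pca(features: np.ndarray, n_components: int) -> tuple[np.ndarray, np.ndarray]:
    """Return (mean, components); components has shape (n_components, n_features)."""
    if not isinstance(n_components, int) or n_components <= 0:
        raise ValueError(f"n_components must be a positive integer, got {n_components!r}.")
    mean = features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(features: np.ndarray, mean: np.ndarray, components: np.ndarray) -> np.ndarray:
    """Project features onto the components, shape (n_samples, n_components)."""
    return (features - mean) @ components.T
```

Because the rows of `components` are orthonormal, `reduced @ components + mean` maps the reduced features back into the original space, exactly so when the discarded components carry no variance.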

pyvisim/encoders/README.md

Lines changed: 34 additions & 0 deletions
@@ -29,6 +29,40 @@ differ in the way they aggregate these descriptors and the underlying clustering
 After the feature extraction step, the local features are aggregated to their respective cluster centers. The final
 encoding matrix is then flattened and normalized to produce the final feature vector representation of the image.
 
+## Configuring Encoders
+
+The encoders build their clustering models internally: VLAD always uses K-Means and the Fisher Vector encoder always
+uses a Gaussian Mixture Model (both implemented in `pyvisim.clustering`).
+
+```python
+from pyvisim.encoders import VLADEncoder, FisherVectorEncoder
+
+vlad = VLADEncoder(
+    n_clusters=256,
+    kmeans_params={"random_state": 42},
+    pca_params={"n_components": 64},
+)
+fisher = FisherVectorEncoder(
+    n_components=256,
+    gmm_params={"random_state": 42},
+)
+```
+
+Calling `learn(images)` fits the configured PCA (if any) and the clustering model. A fitted encoder can be saved to
+disk and restored later:
+
+```python
+vlad.learn(images)
+path = vlad.save_to_disk("vlad") # writes vlad.encoder
+vlad = VLADEncoder.load_from_disk(path)
+```
+
+The `.encoder` file stores the fitted clustering model, the PCA model and the normalization hyperparameters. The
+feature extractor and the similarity function are not serialized; provide them again when loading.
+
+Loading pretrained models via the `KMeansWeights`/`GMMWeights` enums is deprecated and will be removed in a future
+release.
+
 ## Similarity Metric Pipeline
 The _Pipeline_ class is designed to handle multiple encoders simultaneously to compute feature vectors. It takes
 a list of encoders (instances of the ImageEncoderBase class defined in the '_base_encoder.py' file) and a function
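The README text above describes a versioned `.encoder` file, but the on-disk format itself is not shown on this page. The following is only a sketch of how a version-checked save/load round trip could look, assuming a pickle payload with a `format_version` field. The names `save_state`, `load_state`, `FORMAT_VERSION` and `COMPATIBLE_VERSIONS` are hypothetical and not pyvisim's actual implementation.

```python
import pickle
from pathlib import Path
from typing import Any

FORMAT_VERSION = 1                      # hypothetical current format version
COMPATIBLE_VERSIONS = frozenset({1})    # hypothetical set of readable versions
FILE_SUFFIX = ".encoder"


def save_state(state: dict[str, Any], path: str) -> Path:
    """Write the state with a version tag and return the final path."""
    target = Path(path)
    if target.suffix != FILE_SUFFIX:
        # Append rather than replace, so names containing dots stay intact.
        target = target.with_name(target.name + FILE_SUFFIX)
    payload = {"format_version": FORMAT_VERSION, "state": state}
    with target.open("wb") as handle:
        pickle.dump(payload, handle)
    return target


def load_state(path: Any) -> dict[str, Any]:
    """Read a state file, rejecting versions this code cannot interpret."""
    with Path(path).open("rb") as handle:
        payload = pickle.load(handle)
    version = payload.get("format_version")
    if version not in COMPATIBLE_VERSIONS:
        raise ValueError(f"Unsupported encoder file format version: {version!r}.")
    return payload["state"]
```

Checking the version before touching the rest of the payload lets a future release change the layout and still fail with a clear message on old readers. As with any pickle-based format, such files should only be loaded from trusted sources.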
