- Added date filtering for queries
- Updated indexers to sync dates with MediaStore
- Improved efficiency of large batch handling in add method in FileEmbeddingStore
- Added model manager with tests
- Added clustering package, with incremental clustering
- Added `HNSWIndex` to embeddings/ package
- Added classification package
- Added token max length to TextEmbeddingProvider
- Made tokenizers internal
- Added SmartScanException
- Added update method to `IEmbeddingStore`
- Added save method to `IEmbeddingStore`
- Renamed ModelSource to ModelAssetSource and added core/models package
- Renamed updatePrototype to updatePrototypeEmbedding
- Renamed filterIds to ids in `IEmbeddingStore` query method and use Set instead of Long
- Return number of items removed in `IEmbeddingStore` remove method
- Return number of items added in `IEmbeddingStore` add method
- Refactor `FileEmbeddingStore` to use a persistent file offset index with in-place updates
- `FileEmbeddingStore` remove method now only removes items from the cache. Removals are not automatically persisted; call the save method to persist them. Add and update methods both remain write-through.
- MinSDK changed to 28 (previously 30).
- Removed embedBatch method from `IEmbeddingProvider` and added it as a util method in embeddings/ package instead
- Removed detector/face package and added its contents to detectors/
- Filter ids before running query in `FileEmbeddingStore`
- Made `FileEmbeddingStore` thread-safe
- Removed `FileEmbeddingStore` pagination query overload
- Added `updatePrototype` method in EmbeddingUtils.kt
- Renamed `Embedding` to `StoredEmbedding` for clarity (breaking change)
- `fewShotClassification` now uses cohesionScores instead of manual threshold and conf margin
- Removed `PrototypeEmbedding` class. For classification, StoredEmbedding is now used and id: Long represents the classId. The user is expected to store a mapping to string if needed.
- Added support for filterIds when querying
- Fixed duplication bug in `add` method of `FileEmbeddingStore`
- Fixed get(ids) returning no results due to empty cache
- Queries now return List
- Indexer calls `onBatchComplete` on the listener if provided
- Removed `useCache`
- Added new models: DinoV2 and InceptionResnet
- Removed `FileEmbeddingRetriever` and moved its query methods to `FileEmbeddingStore` (breaking)
- Removed `isCached` from `FileEmbeddingStore` (breaking)
- Removed `ProcessOptions` (breaking)
- Removed data packages for both core and ml modules and moved the files into relevant packages
- Added new text embedding provider, Mini-LM
- Added `initialized` and `isInitialized` to `IEmbeddingProvider`
- `IEmbeddingProvider` is now required to provide the `embeddingDim` variable (used to be optional)
- Renamed `embeddingLength` to `embeddingDim` for `FileEmbeddingStore` constructor param
- Moved interfaces:
  - Moved to core/embeddings: `IEmbeddingStore`, `IRetriever`, `IEmbeddingProvider`
  - Moved to core/processor: `IProcessorListener`
  - Moved to ml/models: `IModelLoader`
- Project structure refactored from core + extensions to core + ml.
- Imports updated accordingly:
  - core → minimal runtime: shared interfaces, data classes, embeddings, media helpers, processor execution, and efficient batch/concurrent processing.
  - ml → ML infrastructure and models: model loaders, base models, embedding providers (e.g., CLIP), and few-shot classifiers. Optional or experimental ML-related features can be added under `ml/providers`.
  - Both modules organize contracts and data classes under their own `data/` packages.
- All `IEmbeddingProviders` must now implement `embedBatch`
- `ClipImageEmbedder` and `ClipTextEmbedder` now accept context instead of resources
- `BatchProcessor` now accepts a `Context` (uses `applicationContext` internally).
- Fix `ClipTextEmbedder`: prevent IllegalCapacity in embed
- `Organiser` class removed.
This release replaces the old core and extensions structure.
If you are upgrading from ≤1.0.4, update imports and Gradle dependencies.
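
The `FileEmbeddingStore` persistence note above (removals stay in the cache until save is called, while add and update are write-through) can be illustrated with a minimal sketch. This is not the library's implementation: `SketchStore`, its method signatures, and the in-memory `persisted` map standing in for the file are all assumptions made for illustration.

```kotlin
// Hypothetical stand-in for a store with write-through add/update and deferred removal.
// Not the real FileEmbeddingStore; every name and signature here is an assumption.
class SketchStore {
    // Stands in for the on-disk file.
    private val persisted = LinkedHashMap<Long, FloatArray>()
    // In-memory cache that serves reads.
    private val cache = LinkedHashMap<Long, FloatArray>()

    // Write-through: new ids go to both the cache and the "file". Returns the number added.
    fun add(items: Map<Long, FloatArray>): Int {
        var added = 0
        for ((id, vec) in items) {
            if (id in cache) continue // skip ids that are already stored
            cache[id] = vec
            persisted[id] = vec
            added++
        }
        return added
    }

    // Write-through: an existing id is overwritten in both places.
    fun update(id: Long, vec: FloatArray): Boolean {
        if (id !in cache) return false
        cache[id] = vec
        persisted[id] = vec
        return true
    }

    // Removal touches only the cache. Returns the number removed.
    fun remove(ids: Set<Long>): Int {
        var removed = 0
        for (id in ids) if (cache.remove(id) != null) removed++
        return removed
    }

    // Persists pending removals by dropping ids that are no longer cached.
    fun save() {
        persisted.keys.retainAll(cache.keys)
    }

    fun cachedCount(): Int = cache.size
    fun persistedCount(): Int = persisted.size
}

fun main() {
    val store = SketchStore()
    store.add(mapOf(1L to floatArrayOf(0.1f), 2L to floatArrayOf(0.2f)))
    store.remove(setOf(1L))
    println("${store.cachedCount()} cached, ${store.persistedCount()} persisted")
    store.save()
    println("${store.cachedCount()} cached, ${store.persistedCount()} persisted")
}
```

In this sketch, a removal that is never followed by `save()` leaves the persisted copy untouched, which is the behaviour callers need to account for after upgrading.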
- Pass file directly in `FileEmbeddingStore` constructor instead of dir and filename
- Update batch processor to ensure progress is tracked correctly regardless of errors
- Update batch processor to call onComplete even if items is empty
- `FileEmbeddingRetriever` now supports batch retrieval via `start` and `end` indices with a new `query` overload.
- `FileEmbeddingStore` `getAll` method renamed to `get`, and two new overloads added:
  - `get(ids: List<Long>)` – fetch multiple embeddings by ID.
  - `get(id: Long)` – fetch a single embedding by ID.
- Tests added to verify correct behavior and boundary handling for the new query overload.
- Moved MemoryUtils into processor
- Moved IProcessorListener to its own file
- Moved MemoryOptions into ProcessorData.kt
- Update indexers to use the correctly named parameter item instead of id to prevent issues with named parameters
- Fixed typo in getScaledDimension function
- IEmbeddingStore interface - getAll, isCached, exist
- Use linked hashmap for cache instead of list
- Pass store to Indexers
- Update tests
- Initial release
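
The two `get` overloads listed above, together with the linked hashmap cache, can be sketched roughly as follows. This is an illustration only: `SketchEmbedding`, `SketchLookup`, and the choice to skip missing ids are assumptions, not the library's actual types or behaviour.

```kotlin
// Hypothetical sketch of id-based lookups over a LinkedHashMap cache; not the library's code.
class SketchEmbedding(val id: Long, val vector: FloatArray)

class SketchLookup(items: List<SketchEmbedding>) {
    // A LinkedHashMap keeps insertion order and allows direct lookup by id.
    private val cache = LinkedHashMap<Long, SketchEmbedding>().apply {
        items.forEach { put(it.id, it) }
    }

    // Fetch a single embedding by id, or null when it is absent.
    fun get(id: Long): SketchEmbedding? = cache[id]

    // Fetch several embeddings by id, skipping ids that are absent (an assumed policy).
    fun get(ids: List<Long>): List<SketchEmbedding> = ids.mapNotNull { cache[it] }
}

fun main() {
    val lookup = SketchLookup(
        listOf(
            SketchEmbedding(1L, floatArrayOf(0.1f)),
            SketchEmbedding(2L, floatArrayOf(0.2f)),
        )
    )
    println(lookup.get(2L)?.id)
    println(lookup.get(listOf(1L, 3L)).map { it.id })
}
```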