v2.0.1 - 01/05/2026

Added

  • Added date filtering for queries

Changed

  • Updated indexers to sync dates with MediaStore
  • Improved efficiency of large batch handling in add method in FileEmbeddingStore

v2.0.0 - 19/04/2026

Added

  • Added model manager with tests
  • Added clustering package, with incremental clustering
  • Added HNSWIndex to embeddings/ package
  • Added classification package
  • Added token max length to TextEmbeddingProvider
  • Made tokenizers internal
  • Added SmartScanException
  • Added update method to IEmbeddingStore
  • Added save method to IEmbeddingStore

Changed

  • Renamed ModelSource to ModelAssetSource and added core/models package
  • Renamed updatePrototype to updatePrototypeEmbedding
  • Renamed filterIds to ids in IEmbeddingStore query method and use Set instead of Long
  • Return number of items removed in IEmbeddingStore remove method
  • Return number of items added in IEmbeddingStore add method
  • Refactor FileEmbeddingStore to use a persistent file offset index with in-place updates
  • FileEmbeddingStore remove method now only removes items from the cache; removals are not persisted automatically, so call the save method to persist them. The add and update methods both remain write-through.
  • MinSDK changed to 28 (previously 30).
  • Removed embedBatch method from IEmbeddingProvider and added as util method instead in embeddings/ package
  • Removed detector/face package and moved its contents to detectors/
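
The changed removal semantics above (cache-only remove, explicit save, write-through add) can be illustrated with a minimal sketch. Everything here is hypothetical: `SketchStore`, its two maps, and `sketchUsage` are stand-ins written for illustration and are not the library's FileEmbeddingStore or its actual signatures.

```kotlin
// Hypothetical stand-in for a file-backed store; NOT the library's FileEmbeddingStore.
class SketchStore {
    // Simulates the contents of the file on disk (assumption for illustration).
    private val persisted = LinkedHashMap<Long, FloatArray>()
    // In-memory view served to callers.
    private val cache = LinkedHashMap<Long, FloatArray>()

    /** Write-through add: updates cache and "disk", returns the number of new items. */
    fun add(items: Map<Long, FloatArray>): Int {
        var added = 0
        for ((id, vector) in items) {
            if (id !in cache) {
                cache[id] = vector
                persisted[id] = vector
                added++
            }
        }
        return added
    }

    /** Cache-only removal: returns the number of items actually removed. */
    fun remove(ids: Set<Long>): Int {
        var removed = 0
        for (id in ids) {
            if (cache.remove(id) != null) removed++
        }
        return removed
    }

    /** Persists pending removals by dropping every entry that is no longer cached. */
    fun save() {
        persisted.keys.retainAll(cache.keys)
    }

    fun cachedCount(): Int = cache.size
    fun persistedCount(): Int = persisted.size
}

/** Walks through add, remove, then save, and reports the counts at each step. */
fun sketchUsage(): String {
    val store = SketchStore()
    val added = store.add(mapOf(1L to floatArrayOf(0.1f), 2L to floatArrayOf(0.2f)))
    val removed = store.remove(setOf(2L, 99L)) // id 99 is unknown, so it is not counted
    val beforeSave = store.persistedCount()    // removal has not reached "disk" yet
    store.save()
    val afterSave = store.persistedCount()
    return "added=$added removed=$removed persistedBefore=$beforeSave persistedAfter=$afterSave"
}
```

In this sketch a removal that is never followed by `save()` leaves the persisted copy untouched, which is the behavior callers upgrading from 1.x would most likely need to account for.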

Fixed

  • Filter ids before running query in FileEmbeddingStore
  • Made FileEmbeddingStore thread-safe

Removed

  • Removed FileEmbeddingStore pagination query overload

v1.3.0 - 21/12/2025

Added

  • Added updatePrototype method in EmbeddingUtils.kt

Changed

  • Renamed Embedding to StoredEmbedding for clarity (breaking change)
  • fewShotClassification now uses cohesionScores instead of a manual threshold and confidence margin

Removed

  • Removed PrototypeEmbedding class. For classification, StoredEmbedding is now used and its id: Long represents the classId. Users are expected to store a mapping from id to string label if needed.

v1.2.1 - 17/12/2025

Added

  • Added support for filterIds when querying

Fixed

  • Fixed duplication bug in add method of FileEmbeddingStore
  • Fixed get(ids) returning no results due to an empty cache

Changed

  • Queries now return List
  • Indexers now call onBatchComplete on the listener if one is provided

Removed

  • Removed useCache

v1.2.0 - 26/11/2025

Added

  • Added new models: DinoV2 and InceptionResnet

Removed

  • Removed FileEmbeddingRetriever and moved its query methods to FileEmbeddingStore (breaking)
  • Removed isCached from FileEmbeddingStore (breaking)
  • Removed ProcessOptions (breaking)
  • Removed data packages for both core and ml modules and moved the files into relevant packages

v1.1.1 - 04/11/2025

Added

  • Added new text embedding provider, Mini-LM
  • Added initialized and isInitialized to IEmbeddingProvider

Changed

  • IEmbeddingProvider is now required to provide the embeddingDim variable (previously optional)
  • Renamed embeddingLength to embeddingDim for FileEmbeddingStore constructor param
  • Move interfaces:
    • Moved to core/embeddings: IEmbeddingStore, IRetriever, IEmbeddingProvider
    • Moved to core/processor: IProcessorListener
    • Moved to ml/models: IModelLoader

v1.1.0 - 30/10/2025

Changed

  • Project structure refactored from core + extensions to core + ml.

  • Imports updated accordingly:

    • core → minimal runtime: shared interfaces, data classes, embeddings, media helpers, processor execution, and efficient batch/concurrent processing.
    • ml → ML infrastructure and models: model loaders, base models, embedding providers (e.g., CLIP), and few-shot classifiers. Optional or experimental ML-related features can be added under ml/providers.
    • Both modules organize contracts and data classes under their own data/ packages.
  • All IEmbeddingProviders must now implement embedBatch

  • ClipImageEmbedder and ClipTextEmbedder now accept context instead of resources

  • BatchProcessor now accepts a Context (uses applicationContext internally).

Fixed

  • Fixed ClipTextEmbedder to prevent IllegalCapacity in embed

Removed

  • Organiser class removed.

Notes

This release replaces the old core and extensions structure.
If you are upgrading from ≤1.0.4, update imports and Gradle dependencies.


v1.0.4 – 19/10/2025

Changed

  • Pass file directly in FileEmbeddingStore constructor instead of dir and filename
  • Update batch processor to ensure progress is tracked correctly regardless of errors
  • Update batch processor to call onComplete even if items is empty

v1.0.3 – 14/10/2025

Added

  • FileEmbeddingRetriever now supports batch retrieval via start and end indices with a new query overload.
  • FileEmbeddingStore getAll method renamed to get, and two new overloads added:
    • get(ids: List<Long>) – fetch multiple embeddings by ID.
    • get(id: Long) – fetch a single embedding by ID.
  • Tests added to verify correct behavior and boundary handling for the new query overload.
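
As a rough illustration of how the three `get` overloads listed above relate to each other, here is a minimal sketch. The types `SketchEmbedding` and `SketchGetStore`, the nullable return of the single-ID lookup, and the choice to skip unknown IDs are all assumptions made for this example, not the library's confirmed behavior.

```kotlin
// Hypothetical record type; the library's own embedding class may differ.
data class SketchEmbedding(val id: Long, val vector: FloatArray)

// Hypothetical in-memory store exposing the three overload shapes.
class SketchGetStore(items: List<SketchEmbedding>) {
    // Insertion-ordered lookup table keyed by embedding ID.
    private val byId = LinkedHashMap<Long, SketchEmbedding>().apply {
        items.forEach { put(it.id, it) }
    }

    /** Returns every stored embedding, in insertion order. */
    fun get(): List<SketchEmbedding> = byId.values.toList()

    /** Returns the embeddings whose IDs are present; unknown IDs are skipped (assumption). */
    fun get(ids: List<Long>): List<SketchEmbedding> = ids.mapNotNull { byId[it] }

    /** Returns a single embedding, or null when the ID is unknown (assumption). */
    fun get(id: Long): SketchEmbedding? = byId[id]
}

/** Exercises each overload and returns the IDs it produced. */
fun sketchGetUsage(): List<List<Long>> {
    val store = SketchGetStore(
        listOf(
            SketchEmbedding(1L, floatArrayOf(0.1f)),
            SketchEmbedding(2L, floatArrayOf(0.2f)),
            SketchEmbedding(3L, floatArrayOf(0.3f)),
        )
    )
    val all = store.get().map { it.id }
    val some = store.get(listOf(3L, 42L)).map { it.id }
    val one = listOfNotNull(store.get(2L)?.id)
    return listOf(all, some, one)
}
```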

v1.0.2 – 05/10/2025

Changed

  • Moved MemoryUtils into processor
  • Moved IProcessorListener to its own file
  • Moved MemoryOptions into ProcessorData.kt
  • Updated indexers to use the correctly named parameter item instead of id, to prevent issues with named arguments

Fixed

  • Fixed typo in getScaledDimension function

v1.0.1 – 26/09/2025

Changed

  • IEmbeddingStore interface - getAll, isCached, exist
  • Use linked hashmap for cache instead of list
  • Pass store to Indexers
  • Update tests

v1.0.0 – 23/09/2025

  • Initial release