feat(arcadedb): add metadata query methods to ArcadeDBDocumentStore #3013
Merged
davidsbatista merged 11 commits into deepset-ai:main on Mar 24, 2026
Conversation
Contributor (Author):
Thank you @davidsbatista for the help with the Mixin refactoring and the 'meta' prefix fixes. I'm still learning the internal patterns of the repo; I really appreciate the guidance and the polish.
Contributor:
Thank you @ria-19 for your contribution, I did a few last adjustments as you noticed.
davidsbatista approved these changes on Mar 24, 2026
Related Issues
ArcadeDBDocumentStore #2980

Proposed Changes:
Implements the five metadata query methods for `ArcadeDBDocumentStore` listed in issue #2980:
- `count_documents_by_filter`
- `count_unique_metadata_by_filter`
- `get_metadata_fields_info`
- `get_metadata_field_min_max`
- `get_metadata_field_unique_values`

Also adds:
- `_infer_metadata_field_type` and `_extract_distinct_values` static helpers.
- `SCHEMA_SAMPLING_LIMIT` class constant (default 1000).

Implementation notes
Schema sampling:
`get_metadata_fields_info` uses `LIMIT 1000` via `SCHEMA_SAMPLING_LIMIT` to prevent the OOM risks and latency issues associated with full-table scans on large stores.
Search term embedding:
`get_metadata_field_unique_values` embeds the search term via `_sql_str()` rather than `positional_params`. This is because `_command` currently sends params as a JSON array, but ArcadeDB's HTTP API expects a named params map `{"key": value}` with `:key` placeholders. No existing method uses `positional_params`, so this has not caused failures elsewhere.

How did you test it?
Integration tests added for all five methods covering: happy path, no matches, empty filter, empty field list, pagination, and case-insensitive search. All run against real ArcadeDB via the existing Docker service in CI.
Bug Fix: Resolved a FrozenInstanceError in assert_documents_are_equal using dataclasses.replace for document comparison.
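A minimal sketch of the pattern behind the fix above, assuming a frozen dataclass. `Doc`, `normalized`, and `docs_equal` are hypothetical stand-ins for illustration, not Haystack's `Document` or the actual `assert_documents_are_equal` helper. Assigning to a field of a frozen instance raises `FrozenInstanceError`, whereas `dataclasses.replace` returns a modified copy that can be compared without touching the original.

```python
import dataclasses
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Doc:
    """Hypothetical frozen document type, used only for this illustration."""

    id: str
    content: str
    score: Optional[float] = None
    meta: dict = field(default_factory=dict)


def normalized(doc: Doc) -> Doc:
    """Return a copy with the store-assigned score cleared; `doc` is not mutated."""
    # `doc.score = None` would raise dataclasses.FrozenInstanceError here.
    return dataclasses.replace(doc, score=None)


def docs_equal(received: List[Doc], expected: List[Doc]) -> bool:
    """Compare two document lists while ignoring the score field."""
    return [normalized(d) for d in received] == [normalized(d) for d in expected]
```

Which fields the real helper clears before comparing is not stated in this PR; `score` is an assumption chosen for the example.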
Known follow-up items
- `_command` `positional_params` sends a JSON array, but ArcadeDB expects a named map. Will raise as a separate `bug` issue after further verification.
- `AstraDocumentStore._get_metadata_projection_documents` fetches all documents with no limit for schema inference. Will raise as an `enhancement` for Astra after further verification.

Notes for the reviewer
SQL Constraints: Used SELECT DISTINCT + Python len() for unique counts because the current ArcadeDB SQL parser has limitations with COUNT(DISTINCT ...).
Sampling Default: 1000 is the current default for schema inference; let me know if the team prefers a different threshold.
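The two notes above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: the function names, the query shape, the type labels, and the idea that rows arrive as a list of dicts are all hypothetical. Only `SCHEMA_SAMPLING_LIMIT` and its default of 1000 come from the description.

```python
from typing import Any, Dict, List

SCHEMA_SAMPLING_LIMIT = 1000  # default named in the PR description


def build_distinct_query(type_name: str, field_name: str) -> str:
    """Hypothetical query shape: fetch distinct values instead of COUNT(DISTINCT ...)."""
    return f"SELECT DISTINCT {field_name} FROM {type_name}"


def count_unique(rows: List[Dict[str, Any]], field_name: str) -> int:
    """Count distinct values client-side with len(), skipping null results."""
    values = [row[field_name] for row in rows if row.get(field_name) is not None]
    return len(values)


def infer_type(values: List[Any]) -> str:
    """Map sampled Python values to a coarse type label (labels are illustrative)."""
    labels = set()
    for value in values:
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            labels.add("bool")
        elif isinstance(value, int):
            labels.add("int")
        elif isinstance(value, float):
            labels.add("float")
        elif isinstance(value, str):
            labels.add("str")
        else:
            labels.add("object")
    if len(labels) == 1:
        return labels.pop()
    return "mixed" if labels else "unknown"


def infer_fields_info(
    metas: List[Dict[str, Any]], limit: int = SCHEMA_SAMPLING_LIMIT
) -> Dict[str, str]:
    """Infer field types from at most `limit` metadata dicts.

    The slice stands in for a SQL LIMIT; a real store would cap the query itself.
    """
    by_field: Dict[str, List[Any]] = {}
    for meta in metas[:limit]:
        for key, value in meta.items():
            if value is not None:
                by_field.setdefault(key, []).append(value)
    return {key: infer_type(values) for key, values in by_field.items()}
```

The trade-off with a sampling cap is that fields appearing only in documents beyond the sample are not reported, which is why the threshold is worth agreeing on.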
AI assistance disclaimer
Developed with AI assistance for syntax review and code audit. I authored the underlying logic, verified all implementations against the ArcadeDB HTTP API, and confirmed all tests pass locally.