Commit 6414430

fix(pinecone): handle float32 precision loss in assert_documents_are_equal
Pinecone stores embedding vectors as float32 internally, so values round-tripped through the store may differ from the original float64 values in filterable_docs. Document.__eq__ uses to_dict(), which includes the embedding list, causing test_filter_simple_async and test_filter_compound_async to fail even when all other fields match.

Fix: compare documents sorted by ID with embedding fields stripped, verifying only that embedding presence (None vs non-None) matches. Exact float values are not the invariant under test in filter tests.
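The precision loss described in the commit message can be reproduced without Pinecone. The following is a minimal sketch using only the standard library; `roundtrip_float32` is a hypothetical helper, not part of the integration, and it assumes the store simply narrows each vector component to IEEE 754 single precision.

```python
import struct


def roundtrip_float32(value: float) -> float:
    """Narrow a Python float (float64) to float32 and widen it back."""
    # struct's "f" format packs an IEEE 754 single-precision value.
    return struct.unpack("<f", struct.pack("<f", value))[0]


original = [0.1, 0.5, 0.3]
stored = [roundtrip_float32(v) for v in original]

# 0.5 is exactly representable in float32, so it survives unchanged;
# 0.1 and 0.3 are not, so strict list equality fails.
print(stored == original)  # False
print(stored[1] == original[1])  # True
# The difference is tiny compared with the values themselves.
print(max(abs(s - o) for s, o in zip(stored, original)) < 1e-7)  # True
```

Any equality check that includes the raw embedding list, as a `to_dict()`-based comparison does, would therefore fail for most real-valued vectors even though nothing about the document changed.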
1 parent cc71560 commit 6414430

1 file changed

Lines changed: 11 additions & 3 deletions

integrations/pinecone/tests/test_document_store_async.py

@@ -3,6 +3,7 @@
 # SPDX-License-Identifier: Apache-2.0
 
 import os
+from dataclasses import replace
 
 import numpy as np
 import pytest
@@ -52,9 +53,16 @@ async def document_store(self, document_store_async: PineconeDocumentStore):
 
     @staticmethod
     def assert_documents_are_equal(received: list[Document], expected: list[Document]):
-        # Pinecone returns documents ordered by similarity to the dummy vector, not by insertion
-        # order. Sort both lists by id before comparing to make the assertion order-independent.
-        assert sorted(received, key=lambda d: d.id) == sorted(expected, key=lambda d: d.id)
+        # Pinecone stores vectors as float32; round-tripped embeddings may differ by a small
+        # floating-point epsilon from the float64 originals. Compare sorted by ID (Pinecone
+        # returns by similarity, not insertion order) and ignore exact embedding values,
+        # verifying only that embedding presence matches.
+        received_sorted = sorted(received, key=lambda d: d.id)
+        expected_sorted = sorted(expected, key=lambda d: d.id)
+        assert len(received_sorted) == len(expected_sorted)
+        for recv, exp in zip(received_sorted, expected_sorted):
+            assert (recv.embedding is not None) == (exp.embedding is not None)
+            assert replace(recv, embedding=None) == replace(exp, embedding=None)
 
     @pytest.mark.asyncio
     async def test_count_not_empty_async(self, document_store: PineconeDocumentStore):
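
The comparison introduced in this diff can be tried in isolation. The sketch below is an illustration under stated assumptions: `Doc` is a hypothetical, simplified stand-in for the real `Document` class, and `assert_docs_equal_ignoring_embeddings` is an illustrative name, not a function from the test suite. It relies on the default dataclass `__eq__`, whereas the commit message notes that the real class compares via `to_dict()`.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a document; not the real Document class."""

    id: str
    content: str
    embedding: Optional[list[float]] = None


def assert_docs_equal_ignoring_embeddings(received: list[Doc], expected: list[Doc]) -> None:
    # Sort by id so the check does not depend on retrieval order.
    received_sorted = sorted(received, key=lambda d: d.id)
    expected_sorted = sorted(expected, key=lambda d: d.id)
    # zip() stops at the shorter list, so check the lengths explicitly first.
    assert len(received_sorted) == len(expected_sorted)
    for recv, exp in zip(received_sorted, expected_sorted):
        # Embedding presence must match; exact values need not.
        assert (recv.embedding is not None) == (exp.embedding is not None)
        # replace() returns a copy, so the input documents are left untouched.
        assert replace(recv, embedding=None) == replace(exp, embedding=None)


expected = [Doc("a", "first", [0.1, 0.3]), Doc("b", "second")]
# Same documents in a different order, with slightly perturbed embedding values.
received = [Doc("b", "second"), Doc("a", "first", [0.10000000149011612, 0.30000001192092896])]

assert_docs_equal_ignoring_embeddings(received, expected)  # passes
```

A difference in any non-embedding field, or an embedding that is present on one side and missing on the other, still fails the assertion, so the filter tests keep checking the fields they are meant to check.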
