
Commit 5e5e7a7

Authored by julian-risch and claude
chore: enforce ruff docstring rules in integrations 31-40 (openrouter, opensearch, optimum, paddleocr, pgvector, pinecone, pyversity, qdrant, ragas, snowflake) (#3011)
* chore: enforce ruff docstring rules (D102/D103/D205/D209/D213/D417/D419) in integrations 31-40

  Adds D102, D103, D205, D209, D213, D417, and D419 ruff rules to pyproject.toml for: openrouter, opensearch, optimum, paddleocr, pgvector, pinecone, pyversity, qdrant, ragas, snowflake. Fixes all resulting docstring violations.

  Part of #2947

* fix: add sleep wrapper for delete_all_documents_async in Pinecone conftest

  Pinecone's eventual consistency requires a delay after deletions before counts reflect the change. The conftest already wrapped delete_documents_async with a sleep, but missed delete_all_documents_async, causing a flaky test.

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
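As a rough illustration of the style these rules enforce (the function below is hypothetical, not taken from the commit), a docstring that passes all seven checks looks like:

```python
def count_tokens(text: str) -> int:
    """
    Count the whitespace-separated tokens in a text string.

    The summary sits on the second physical line (D213), a blank line
    separates it from this description (D205), every argument is
    documented (D417), and the closing quotes get their own line (D209).

    :param text: The text to split into tokens.
    :returns: The number of tokens found.
    """
    return len(text.split())
```

The same shape recurs in every docstring fix in this commit: summary line, blank line, description, then the parameter section.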
1 parent 277d123 commit 5e5e7a7

35 files changed: 212 additions & 74 deletions
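The Pinecone conftest fix described in the commit message can be sketched as a generic wrapper; the wrapper name and the delay value here are assumptions for illustration, not taken from the diff:

```python
import asyncio
from functools import wraps


def with_consistency_delay(async_fn, delay: float = 3.0):
    """Wrap an async document-store method so each call is followed by a short
    sleep, giving an eventually consistent backend (such as Pinecone) time to
    reflect deletions before document counts are checked."""

    @wraps(async_fn)
    async def wrapper(*args, **kwargs):
        result = await async_fn(*args, **kwargs)
        await asyncio.sleep(delay)  # let the backend converge before the test continues
        return result

    return wrapper


# In a conftest, both delete methods would be wrapped the same way, e.g.:
#   store.delete_documents_async = with_consistency_delay(store.delete_documents_async)
#   store.delete_all_documents_async = with_consistency_delay(store.delete_all_documents_async)
```

The flaky test arose precisely because only the first of the two delete methods had this treatment.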


integrations/openrouter/pyproject.toml

Lines changed: 9 additions & 2 deletions
@@ -85,6 +85,13 @@ select = [
     "ARG",
     "B",
     "C",
+    "D102", # Missing docstring in public method
+    "D103", # Missing docstring in public function
+    "D205", # 1 blank line required between summary line and description
+    "D209", # Closing triple quotes go to new line
+    "D213", # summary lines must be positioned on the second physical line of the docstring
+    "D417", # Missing argument descriptions in the docstring
+    "D419", # Docstring is empty
     "DTZ",
     "E",
     "EM",
@@ -134,9 +141,9 @@ ban-relative-imports = "parents"
 
 [tool.ruff.lint.per-file-ignores]
 # Tests can use magic values, assertions, and relative imports
-"tests/**/*" = ["PLR2004", "S101", "TID252", "ANN"]
+"tests/**/*" = ["D", "PLR2004", "S101", "TID252", "ANN"]
 # Examples can print their output and don't need type annotations
-"examples/**/*" = ["T201", "ANN"]
+"examples/**/*" = ["D", "T201", "ANN"]
 
 [tool.coverage.run]
 source = ["haystack_integrations"]

integrations/openrouter/src/haystack_integrations/components/generators/openrouter/chat/chat_generator.py

Lines changed: 2 additions & 2 deletions
@@ -18,6 +18,7 @@
 class OpenRouterChatGenerator(OpenAIChatGenerator):
     """
     Enables text generation using OpenRouter generative models.
+
     For supported models, see [OpenRouter docs](https://openrouter.ai/models).
 
     Users can pass any text generation parameters valid for the OpenRouter chat completion API
@@ -71,8 +72,7 @@ def __init__(
         http_client_kwargs: dict[str, Any] | None = None,
     ) -> None:
         """
-        Creates an instance of OpenRouterChatGenerator. Unless specified otherwise,
-        the default model is `openai/gpt-5-mini`.
+        Creates an instance of OpenRouterChatGenerator.
 
         :param api_key:
             The OpenRouter API key.

integrations/opensearch/pyproject.toml

Lines changed: 8 additions & 1 deletion
@@ -94,6 +94,13 @@ select = [
     "ARG",
     "B",
     "C",
+    "D102", # Missing docstring in public method
+    "D103", # Missing docstring in public function
+    "D205", # 1 blank line required between summary line and description
+    "D209", # Closing triple quotes go to new line
+    "D213", # summary lines must be positioned on the second physical line of the docstring
+    "D417", # Missing argument descriptions in the docstring
+    "D419", # Docstring is empty
     "DTZ",
     "E",
     "EM",
@@ -145,7 +152,7 @@ ban-relative-imports = "parents"
 
 [tool.ruff.lint.per-file-ignores]
 # Tests can use magic values, assertions, and relative imports
-"tests/**/*" = ["PLR2004", "S101", "TID252", "ANN"]
+"tests/**/*" = ["D", "PLR2004", "S101", "TID252", "ANN"]
 
 [tool.coverage.run]
 source = ["haystack_integrations"]

integrations/opensearch/src/haystack_integrations/components/retrievers/opensearch/open_search_hybrid_retriever.py

Lines changed: 10 additions & 4 deletions
@@ -114,8 +114,9 @@ def __init__(
         **kwargs: Any,
     ) -> None:
         """
-        Initialize the OpenSearchHybridRetriever, a super component to retrieve documents from OpenSearch using
-        both embedding-based and keyword-based retrieval methods.
+        Initialize the OpenSearchHybridRetriever using both embedding-based and keyword-based retrieval methods.
+
+        This is a super component to retrieve documents from OpenSearch using both retrieval methods.
 
         We don't explicitly define all the init parameters of the components in the constructor, for each
         of the components, since that would be around 20+ parameters. Instead, we define the most important ones
@@ -242,7 +243,9 @@ def __init__(
 
     if TYPE_CHECKING:
 
-        def warm_up(self) -> None: ...
+        def warm_up(self) -> None:
+            """Warm up the underlying pipeline components."""
+            ...
 
     def run(
         self,
@@ -251,7 +254,9 @@ def run(
         filters_embedding: dict[str, Any] | None = None,
         top_k_bm25: int | None = None,
         top_k_embedding: int | None = None,
-    ) -> dict[str, list[Document]]: ...
+    ) -> dict[str, list[Document]]:
+        """Run the hybrid retrieval pipeline and return retrieved documents."""
+        ...
 
     def _create_pipeline(self, data: dict[str, Any]) -> Pipeline:
         """
@@ -328,6 +333,7 @@ def to_dict(self) -> dict[str, Any]:
 
     @classmethod
     def from_dict(cls, data: dict[str, Any]) -> "OpenSearchHybridRetriever":
+        """Deserialize an OpenSearchHybridRetriever from a dictionary."""
         # deserialize the document store
         doc_store = OpenSearchDocumentStore.from_dict(data["init_parameters"]["document_store"])
         data["init_parameters"]["document_store"] = doc_store
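The `if TYPE_CHECKING:` stubs changed above need docstrings because D102 flags public methods even when the body is just `...`. A minimal sketch of that pattern (class name simplified; the runtime delegation machinery of the real retriever is omitted):

```python
from typing import TYPE_CHECKING


class HybridRetriever:
    """Sketch of a component whose public methods are provided dynamically at runtime."""

    if TYPE_CHECKING:
        # Visible to type checkers only; at runtime TYPE_CHECKING is False,
        # so this block never executes and the stub is never defined.
        def warm_up(self) -> None:
            """Warm up the underlying pipeline components."""
            ...
```

Because the stub exists only for static analysis, adding the docstring costs nothing at runtime while satisfying the linter.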

integrations/opensearch/src/haystack_integrations/document_stores/opensearch/auth.py

Lines changed: 1 addition & 0 deletions
@@ -41,6 +41,7 @@ def _get_aws_session(
 ) -> "boto3.Session":
     """
     Creates an AWS Session with the given parameters.
+
     Checks if the provided AWS credentials are valid and can be used to connect to AWS.
 
     :param aws_access_key_id: AWS access key ID.

integrations/opensearch/src/haystack_integrations/document_stores/opensearch/document_store.py

Lines changed: 9 additions & 6 deletions
@@ -559,6 +559,7 @@ async def write_documents_async(
     def _deserialize_document(hit: dict[str, Any]) -> Document:
         """
         Creates a Document from the search hit provided.
+
         This is mostly useful in self.filter_documents().
         """
         data = hit["_source"]
@@ -1482,6 +1483,7 @@ def _embedding_retrieval(
     ) -> list[Document]:
         """
         Retrieves documents that are most similar to the query embedding using a vector similarity metric.
+
         It uses the OpenSearch's Approximate k-Nearest Neighbors search algorithm.
 
         This method is not meant to be part of the public interface of
@@ -1513,8 +1515,9 @@ async def _embedding_retrieval_async(
         search_kwargs: dict[str, Any] | None = None,
     ) -> list[Document]:
         """
-        Asynchronously retrieves documents that are most similar to the query embedding using a vector similarity
-        metric. It uses the OpenSearch's Approximate k-Nearest Neighbors search algorithm.
+        Asynchronously retrieves documents most similar to the query embedding using a vector similarity metric.
+
+        It uses the OpenSearch's Approximate k-Nearest Neighbors search algorithm.
 
         This method is not meant to be part of the public interface of
         `OpenSearchDocumentStore` nor called directly.
@@ -1641,8 +1644,7 @@ def _extract_distinct_counts_from_aggregations(
 
     def count_unique_metadata_by_filter(self, filters: dict[str, Any], metadata_fields: list[str]) -> dict[str, int]:
         """
-        Returns the number of unique values for each specified metadata field of the documents
-        that match the provided filters.
+        Returns the number of unique values for each specified metadata field of the documents that match the filters.
 
         :param filters: The filters to apply to count documents.
             For filter syntax, see [Haystack metadata filtering](https://docs.haystack.deepset.ai/docs/metadata-filtering)
@@ -1685,8 +1687,7 @@ async def count_unique_metadata_by_filter_async(
         self, filters: dict[str, Any], metadata_fields: list[str]
     ) -> dict[str, int]:
         """
-        Asynchronously returns the number of unique values for each specified metadata field of the documents
-        that match the provided filters.
+        Asynchronously returns the number of unique values for each specified metadata field matching the filters.
 
         :param filters: The filters to apply to count documents.
             For filter syntax, see [Haystack metadata filtering](https://docs.haystack.deepset.ai/docs/metadata-filtering)
@@ -1862,6 +1863,7 @@ def get_metadata_field_unique_values(
     ) -> tuple[list[str], dict[str, Any] | None]:
         """
         Returns unique values for a metadata field, optionally filtered by a search term in the content.
+
         Uses composite aggregations for proper pagination beyond 10k results.
 
         :param metadata_field: The metadata field to get unique values for.
@@ -1927,6 +1929,7 @@ async def get_metadata_field_unique_values_async(
     ) -> tuple[list[str], dict[str, Any] | None]:
         """
         Asynchronously returns unique values for a metadata field, optionally filtered by a search term in the content.
+
         Uses composite aggregations for proper pagination beyond 10k results.
 
         :param metadata_field: The metadata field to get unique values for.

integrations/optimum/pyproject.toml

Lines changed: 9 additions & 2 deletions
@@ -108,6 +108,13 @@ select = [
     "ARG",
     "B",
     "C",
+    "D102", # Missing docstring in public method
+    "D103", # Missing docstring in public function
+    "D205", # 1 blank line required between summary line and description
+    "D209", # Closing triple quotes go to new line
+    "D213", # summary lines must be positioned on the second physical line of the docstring
+    "D417", # Missing argument descriptions in the docstring
+    "D419", # Docstring is empty
     "DTZ",
     "E",
     "EM",
@@ -153,9 +160,9 @@ ban-relative-imports = "parents"
 
 [tool.ruff.lint.per-file-ignores]
 # Tests can use magic values, assertions, and relative imports
-"tests/**/*" = ["PLR2004", "S101", "TID252", "ANN"]
+"tests/**/*" = ["D", "PLR2004", "S101", "TID252", "ANN"]
 # Examples can print their output
-"examples/**" = ["T201"]
+"examples/**" = ["D", "T201"]
 "tests/**" = ["T201"]
 
 [tool.coverage.run]

integrations/optimum/src/haystack_integrations/components/embedders/optimum/optimization.py

Lines changed: 4 additions & 2 deletions
@@ -11,8 +11,10 @@
 
 class OptimumEmbedderOptimizationMode(Enum):
     """
-    [ONXX Optimization modes](https://huggingface.co/docs/optimum/onnxruntime/usage_guides/optimization)
-    support by the Optimum Embedders.
+    ONNX Optimization modes supported by the Optimum Embedders.
+
+    See [Optimum ONNX optimization docs](https://huggingface.co/docs/optimum/onnxruntime/usage_guides/optimization)
+    for more details.
     """
 
     #: Basic general optimizations.

integrations/optimum/src/haystack_integrations/components/embedders/optimum/optimum_document_embedder.py

Lines changed: 5 additions & 3 deletions
@@ -17,9 +17,10 @@
 @component
 class OptimumDocumentEmbedder:
     """
-    A component for computing `Document` embeddings using models loaded with the
-    [HuggingFace Optimum](https://huggingface.co/docs/optimum/index) library,
-    leveraging the ONNX runtime for high-speed inference.
+    A component for computing `Document` embeddings using models loaded with the HuggingFace Optimum library.
+
+    Uses the [HuggingFace Optimum](https://huggingface.co/docs/optimum/index) library and leverages the ONNX
+    runtime for high-speed inference.
 
     The embedding of each Document is stored in the `embedding` field of the Document.
 
@@ -199,6 +200,7 @@ def _prepare_texts_to_embed(self, documents: list[Document]) -> list[str]:
     def run(self, documents: list[Document]) -> dict[str, list[Document]]:
         """
         Embed a list of Documents.
+
         The embedding of each Document is stored in the `embedding` field of the Document.
 
         :param documents:

integrations/optimum/src/haystack_integrations/components/embedders/optimum/optimum_text_embedder.py

Lines changed: 4 additions & 3 deletions
@@ -16,9 +16,10 @@
 @component
 class OptimumTextEmbedder:
     """
-    A component to embed text using models loaded with the
-    [HuggingFace Optimum](https://huggingface.co/docs/optimum/index) library,
-    leveraging the ONNX runtime for high-speed inference.
+    A component to embed text using models loaded with the HuggingFace Optimum library.
+
+    Uses the [HuggingFace Optimum](https://huggingface.co/docs/optimum/index) library and leverages the ONNX
+    runtime for high-speed inference.
 
     Usage example:
     ```python
