-
Notifications
You must be signed in to change notification settings - Fork 45
tests: Add parametrized validation test for manifest-only connectors #706
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 1 commit
9465284
473f237
617c64f
ec5a6b1
8a3a248
104aac2
dc2416e
fcaf435
6aeb97b
5841083
79e9118
9282e0b
0a4b66e
a87f360
7f92efa
0ecc73d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,226 @@ | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Unit tests for validating manifest.yaml files from the connector registry against the CDK schema. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| This test suite fetches all manifest-only connectors from the Airbyte connector registry, | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| downloads their manifest.yaml files from public endpoints, and validates them against | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| the current declarative component schema defined in the CDK. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| import json | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| import logging | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| from pathlib import Path | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| from typing import Any, Dict, List, Tuple | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| from unittest.mock import patch | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| import pytest | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| import requests | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| import yaml | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| from airbyte_cdk.sources.declarative.validators.validate_adheres_to_schema import ( | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ValidateAdheresToSchema, | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| logger = logging.getLogger(__name__) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
aaronsteers marked this conversation as resolved.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| EXCLUDED_CONNECTORS = [ | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| CONNECTOR_REGISTRY_URL = "https://connectors.airbyte.com/files/registries/v0/oss_registry.json" | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| MANIFEST_URL_TEMPLATE = "https://connectors.airbyte.com/files/metadata/airbyte/{connector_name}/latest/manifest.yaml" | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| VALIDATION_SUCCESSES = [] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| VALIDATION_FAILURES = [] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| DOWNLOAD_FAILURES = [] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
coderabbitai[bot] marked this conversation as resolved.
Outdated
Comment on lines
+441
to
+106
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Session attributes are never attached to the session object. The fixtures return lists for tracking results, but Would you consider either:
For option 1, you could add something like: @pytest.fixture(scope="session", autouse=True)
def setup_session_tracking(request):
"""Set up session-level result tracking."""
request.session._validation_successes = []
request.session._validation_failures = []
request.session._download_failures = []
return {
'successes': request.session._validation_successes,
'failures': request.session._validation_failures,
'download_failures': request.session._download_failures
}wdyt? 🤖 Prompt for AI Agents
coderabbitai[bot] marked this conversation as resolved.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| # Use pytest fixtures to provide thread-safe lists for test results. | |
| @pytest.fixture | |
| def validation_successes(): | |
| return [] | |
| @pytest.fixture | |
| def validation_failures(): | |
| return [] | |
| @pytest.fixture | |
| def download_failures(): | |
| return [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider more robust schema file loading
The current path navigation (going up 4 parent directories) is brittle and assumes a specific directory structure. What if we used a more robust approach like importlib.resources or tried multiple potential locations?
Also, should we add error handling for when the schema file doesn't exist? This would provide a clearer error message than a generic FileNotFoundError.
Example:
def load_declarative_component_schema() -> Dict[str, Any]:
"""Load the declarative component schema from the CDK."""
schema_path = (
Path(__file__).resolve().parent.parent.parent.parent
/ "airbyte_cdk/sources/declarative/declarative_component_schema.yaml"
)
if not schema_path.exists():
raise FileNotFoundError(
f"Declarative component schema not found at {schema_path}. "
"Please ensure the CDK is properly installed."
)
with open(schema_path, "r") as file:
return yaml.safe_load(file)wdyt?
🤖 Prompt for AI Agents
In unit_tests/sources/declarative/test_manifest_registry_validation.py around
lines 37 to 44, replace the brittle Path(...).parent.parent.parent.parent
approach with a more robust loader: attempt to load
"declarative_component_schema.yaml" via importlib.resources (or
importlib.resources.files().joinpath/read_text in py3.9+) from the
airbyte_cdk.sources.declarative package first, fall back to the existing
relative path only if the resource API fails, and if neither location yields the
file raise a clear FileNotFoundError that includes attempted locations and a
short instruction to ensure the CDK is installed; ensure the function returns
the parsed yaml as before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Make schema loading robust (prefer importlib.resources, with clear fallback and error).
The current path traversal is brittle and will break if the repo layout changes or when running from an installed package. Shall we try importlib.resources first and fall back to the relative path with a clear error if both fail, wdyt?
Apply this diff:
def load_declarative_component_schema() -> Dict[str, Any]:
- """Load the declarative component schema from the CDK."""
- schema_path = (
- Path(__file__).resolve().parent.parent.parent.parent
- / "airbyte_cdk/sources/declarative/declarative_component_schema.yaml"
- )
- with open(schema_path, "r") as file:
- schema = yaml.safe_load(file)
- if not isinstance(schema, dict):
- raise ValueError("Schema must be a dictionary")
- return schema
+ """Load the declarative component schema from the CDK.
+
+ Attempts importlib.resources first (installed CDK), falling back to repo-relative path.
+ """
+ data: str
+ errors: List[str] = []
+ # Try to read the schema from the installed package resources
+ try:
+ from importlib import resources as importlib_resources # py>=3.9
+ pkg = "airbyte_cdk.sources.declarative"
+ resource = importlib_resources.files(pkg).joinpath("declarative_component_schema.yaml") # type: ignore[attr-defined]
+ data = resource.read_text(encoding="utf-8")
+ except Exception as e:
+ errors.append(f"importlib.resources failed: {e}")
+ # Fallback to repo-relative file path
+ schema_path = (
+ Path(__file__).resolve().parent.parent.parent.parent
+ / "airbyte_cdk/sources/declarative/declarative_component_schema.yaml"
+ )
+ if not schema_path.exists():
+ raise FileNotFoundError(
+ f"Declarative component schema not found. Tried package resources and path {schema_path}. "
+ f"Errors: {'; '.join(errors)}"
+ )
+ data = schema_path.read_text(encoding="utf-8")
+
+ schema = yaml.safe_load(data)
+ if not isinstance(schema, dict):
+ raise ValueError("Schema must be a dictionary")
+ return schemaAdditionally, at the top of the module, import typing for List if needed by this function. wdyt?
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def load_declarative_component_schema() -> Dict[str, Any]: | |
| """Load the declarative component schema from the CDK.""" | |
| schema_path = ( | |
| Path(__file__).resolve().parent.parent.parent.parent | |
| / "airbyte_cdk/sources/declarative/declarative_component_schema.yaml" | |
| ) | |
| with open(schema_path, "r") as file: | |
| schema = yaml.safe_load(file) | |
| if not isinstance(schema, dict): | |
| raise ValueError("Schema must be a dictionary") | |
| return schema | |
| def load_declarative_component_schema() -> Dict[str, Any]: | |
| - """Load the declarative component schema from the CDK.""" | |
| - schema_path = ( | |
| - Path(__file__).resolve().parent.parent.parent.parent | |
| - / "airbyte_cdk/sources/declarative/declarative_component_schema.yaml" | |
| - ) | |
| - with open(schema_path, "r") as file: | |
| - schema = yaml.safe_load(file) | |
| - if not isinstance(schema, dict): | |
| - raise ValueError("Schema must be a dictionary") | |
| - return schema | |
| + """Load the declarative component schema from the CDK. | |
| + | |
| + Attempts importlib.resources first (installed CDK), falling back to repo-relative path. | |
| + """ | |
| + data: str | |
| + errors: List[str] = [] | |
| + # Try to read the schema from the installed package resources | |
| + try: | |
| + from importlib import resources as importlib_resources # py>=3.9 | |
| + pkg = "airbyte_cdk.sources.declarative" | |
| + resource = importlib_resources.files(pkg).joinpath( | |
| + "declarative_component_schema.yaml" | |
| + ) # type: ignore[attr-defined] | |
| + data = resource.read_text(encoding="utf-8") | |
| + except Exception as e: | |
| + errors.append(f"importlib.resources failed: {e}") | |
| + # Fallback to repo-relative file path | |
| + schema_path = ( | |
| + Path(__file__).resolve().parent.parent.parent.parent | |
| + / "airbyte_cdk/sources/declarative/declarative_component_schema.yaml" | |
| + ) | |
| + if not schema_path.exists(): | |
| + raise FileNotFoundError( | |
| + f"Declarative component schema not found. Tried package resources and path {schema_path}. " | |
| + f"Errors: {'; '.join(errors)}" | |
| + ) | |
| + data = schema_path.read_text(encoding="utf-8") | |
| + | |
| + schema = yaml.safe_load(data) | |
| + if not isinstance(schema, dict): | |
| + raise ValueError("Schema must be a dictionary") | |
| + return schema |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Type annotation and error handling improvements
A few suggestions for this function:
-
The return type
List[Tuple[str, str]]is misleading since the second element is alwaysNone. Should this beList[Tuple[str, None]]orList[Tuple[str, Optional[str]]]? -
Using
pytest.fail()in a helper function makes it less reusable outside of pytest context. Would it be better to raise an exception and let the caller decide how to handle it? -
Given the network dependency, should we add retry logic for transient failures? The PR objectives mention considering "network dependency flakiness (e.g., retries or circuit breakers)".
Example with retries:
from typing import Optional
import time
def get_manifest_only_connectors() -> List[Tuple[str, Optional[str]]]:
"""Fetch manifest-only connectors from the registry."""
max_retries = 3
for attempt in range(max_retries):
try:
response = requests.get(CONNECTOR_REGISTRY_URL, timeout=30)
response.raise_for_status()
# ... rest of the logic
except Exception as e:
if attempt == max_retries - 1:
raise RuntimeError(f"Failed to fetch connector registry after {max_retries} attempts: {e}")
time.sleep(2 ** attempt) # Exponential backoffwdyt?
🤖 Prompt for AI Agents
In unit_tests/sources/declarative/test_manifest_registry_validation.py around
lines 47-70, the helper get_manifest_only_connectors has an inaccurate return
type, calls pytest.fail inside a utility (reducing reusability), and lacks
retry/backoff for transient network errors; change the signature to return
List[Tuple[str, Optional[str]]], import Optional from typing, replace
pytest.fail with raising a clear exception (e.g., RuntimeError) so callers
decide handling, and add retry logic with a small max_retries and exponential
backoff (sleep between attempts) before raising the final error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Avoid pytest.fail in helpers; add retries and accurate type for CDK version.
- Returning
List[Tuple[str, str]]but filling with"unknown"is misleading. - Calling
pytest.fail()inside a utility reduces reusability and makes collection fail hard. - Given network flakiness, adding a small retry/backoff helps. Shall we adjust accordingly, wdyt?
Apply this diff:
-def get_manifest_only_connectors() -> List[Tuple[str, str]]:
+def get_manifest_only_connectors() -> List[Tuple[str, Optional[str]]]:
@@
- try:
- response = requests.get(CONNECTOR_REGISTRY_URL, timeout=30)
- response.raise_for_status()
- registry = response.json()
-
- manifest_connectors: List[Tuple[str, str]] = []
- for source in registry.get("sources", []):
- if source.get("language") == "manifest-only":
- connector_name = source.get("dockerRepository", "").replace("airbyte/", "")
- if connector_name:
- manifest_connectors.append((connector_name, "unknown"))
-
- return manifest_connectors
- except Exception as e:
- pytest.fail(f"Failed to fetch connector registry: {e}")
+ max_retries = 3
+ for attempt in range(max_retries):
+ try:
+ response = requests.get(CONNECTOR_REGISTRY_URL, timeout=30)
+ response.raise_for_status()
+ registry = response.json()
+
+ manifest_connectors: List[Tuple[str, Optional[str]]] = []
+ for source in registry.get("sources", []):
+ if source.get("language") == "manifest-only":
+ connector_name = str(source.get("dockerRepository", "")).replace("airbyte/", "")
+ if connector_name:
+ manifest_connectors.append((connector_name, None))
+ return manifest_connectors
+ except Exception as e:
+ if attempt == max_retries - 1:
+ raise RuntimeError(f"Failed to fetch connector registry after {max_retries} attempts: {e}") from e
+ time.sleep(2 ** attempt) # Exponential backoffAdd these imports at the top of the file:
from typing import Optional
import timeThis keeps the helper reusable and resilient while preserving current behavior for callers. wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Global cache might affect test isolation
The _git_manifest_cache global variable could cause issues with test isolation, especially if tests are run multiple times in the same process or in parallel. The cache persists across test runs and could lead to stale data.
Would you consider using a fixture-based cache or clearing the cache between test sessions? For example:
@pytest.fixture(scope="session")
def manifest_cache():
"""Session-scoped cache for downloaded manifests."""
return {}Then pass the cache as a parameter rather than using global state. wdyt?
🤖 Prompt for AI Agents
In unit_tests/sources/declarative/test_manifest_registry_validation.py around
lines 521 to 561, the use of the module-level _git_manifest_cache breaks test
isolation; replace the global with an injectable test fixture or ensure it is
cleared between tests: add a pytest fixture (session or function scoped as
appropriate) that returns an empty dict and pass that cache into
download_manifest (or make download_manifest accept an optional cache
parameter); inside the function use the provided cache instead of the global,
update callers/tests to supply the fixture, and ensure any tests that rely on a
fresh cache either use function scope or explicitly clear the cache in
setup/teardown so no stale data persists across runs or parallel tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add retries/backoff to HTTP downloads; improve error propagation
Given registry flakiness, can we add simple retry/backoff to manifest downloads and only append a single terminal failure to download_failures, wdyt?
Apply this diff:
def download_manifest(
connector_name: str, download_failures: List[Tuple[str, str]]
) -> Tuple[str, str]:
@@
- url = MANIFEST_URL_TEMPLATE.format(connector_name=connector_name)
- try:
- response = requests.get(url, timeout=30)
- response.raise_for_status()
- manifest_content = response.text
-
- manifest_dict = yaml.safe_load(manifest_content)
- cdk_version = manifest_dict.get("version", "unknown")
-
- return manifest_content, cdk_version
- except Exception as e:
- download_failures.append((connector_name, str(e)))
- raise
+ url = MANIFEST_URL_TEMPLATE.format(connector_name=connector_name)
+ max_retries = 3
+ for attempt in range(max_retries):
+ try:
+ response = requests.get(url, timeout=30)
+ response.raise_for_status()
+ manifest_content = response.text
+ manifest_dict = yaml.safe_load(manifest_content) or {}
+ cdk_version = manifest_dict.get("version", "unknown")
+ return manifest_content, cdk_version
+ except Exception as e:
+ if attempt == max_retries - 1:
+ download_failures.append((connector_name, str(e)))
+ raise
+ time.sleep(2 ** attempt) # Exponential backoffCommittable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In unit_tests/sources/declarative/test_manifest_registry_validation.py around
lines 145-181, the HTTP manifest download is flaky and currently appends an
entry to download_failures on every exception without retrying; implement a
retry loop with exponential backoff (e.g., 3 attempts with increasing delays)
for the requests.get call and only append to download_failures once after all
retries fail; ensure response.raise_for_status() is checked on each attempt,
catch and sleep between attempts, and re-raise the final exception after
recording the single terminal failure.
Copilot
AI
Aug 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function calls get_manifest_only_connectors() which makes a network request to the registry. When used in @pytest.mark.parametrize, this network call happens during test collection, potentially slowing down test discovery. Consider caching the result or using a different approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider adding CI-friendly connector limits
The PR objectives mention "considering a CI-friendly limit on processed connectors" since there are ~475 connectors. Running all of them might cause CI timeouts or excessive resource usage.
Would you consider adding an optional limit mechanism? For example:
import os
def get_manifest_only_connector_names() -> List[str]:
"""Get manifest-only connector names with optional limit for CI."""
connectors = get_manifest_only_connectors()
connector_names = [name for name, _ in connectors]
# Allow limiting connectors for CI runs
max_connectors = os.getenv("MAX_CONNECTORS_TO_TEST")
if max_connectors:
limit = int(max_connectors)
if limit > 0:
connector_names = connector_names[:limit]
logger.info(f"Limited to testing {limit} connectors (CI mode)")
return connector_namesThis would allow CI to run with MAX_CONNECTORS_TO_TEST=50 while still allowing full runs locally. wdyt?
🤖 Prompt for AI Agents
In unit_tests/sources/declarative/test_manifest_registry_validation.py around
lines 95 to 104, add an optional CI-friendly limit to
get_manifest_only_connector_names by reading an environment variable (e.g.,
MAX_CONNECTORS_TO_TEST) and slicing the returned list of connector names if the
variable is set to a positive integer; import os at the top, defensively parse
the env var to int (ignore or log and treat as unset on parse errors or
non-positive values), log when a limit is applied, and keep the default behavior
unchanged when the env var is absent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid network during test collection: move parametrization to pytest_generate_tests
Calling get_manifest_only_connector_names() at import time makes test collection flaky/slow and can fail entirely if the registry is down. Shall we move parametrization to pytest_generate_tests and keep collection side-effect free, wdyt?
Apply this diff to remove import-time network access:
-@pytest.mark.parametrize("connector_name", get_manifest_only_connector_names())
+# Parametrization moved to pytest_generate_tests to avoid network at collection time.Add this hook at the end of the file to perform parametrization safely at runtime:
def pytest_generate_tests(metafunc):
if "connector_name" in metafunc.fixturenames:
try:
names = get_manifest_only_connector_names()
except Exception as e:
names = []
tr = metafunc.config.pluginmanager.get_plugin("terminalreporter")
if tr:
tr.write_line(f"[manifest-validator] Skipping connector parametrization due to registry error: {e}")
metafunc.parametrize("connector_name", names)🤖 Prompt for AI Agents
In unit_tests/sources/declarative/test_manifest_registry_validation.py around
line 281, the test currently calls get_manifest_only_connector_names() at import
time via @pytest.mark.parametrize which causes network access during collection;
remove that decorator usage and instead add a pytest_generate_tests hook at the
end of the file that checks if "connector_name" is in metafunc.fixturenames,
calls get_manifest_only_connector_names() inside a try/except (falling back to
an empty list on error), logs the exception to the terminal reporter if present,
and calls metafunc.parametrize("connector_name", names) so parametrization
happens at collection/runtime without import-time network access.
Copilot
AI
Aug 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The schema is loaded and validator is created for every test iteration. Since the schema is static, consider loading it once and reusing it across all test iterations using a module-level fixture or cached function.
| validator.validate(manifest_dict) | |
| # Use the module-level validator to validate the manifest | |
| try: | |
| VALIDATOR.validate(manifest_dict) |
Uh oh!
There was an error while loading. Please reload this page.