Skip to content

fix(rest): make REST connector E2E + unit tests hermetic, harden OpenAPI parser#27835

Open
mohittilala wants to merge 5 commits intomainfrom
fix/rest-source-test-network-mock
Open

fix(rest): make REST connector E2E + unit tests hermetic, harden OpenAPI parser#27835
mohittilala wants to merge 5 commits intomainfrom
fix/rest-source-test-network-mock

Conversation

@mohittilala
Copy link
Copy Markdown
Contributor

@mohittilala mohittilala commented Apr 30, 2026

Describe your changes:

Make the REST connector tests stop depending on petstore3.swagger.io and reject malformed OpenAPI responses up front.

  • Playwright AutoPilot Rest fills the form with /home/airflow/ingestion/examples/openapi/sample.json (a fixture bundled into the ingestion container) instead of a live URL.
  • REST source unit tests patch get_connection so RestSource.create no longer hits the network.
  • parse_openapi_schema, parse_openapi_schema_from_file, and parse_openapi_schema_from_s3 raise OpenAPIParseError when parsing returns a value that isn't a dict (yaml.safe_load returns a bare string for HTML error pages, which previously leaked through to callers).

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Bug fix

  • I have added a test that covers the exact scenario we are fixing. For complex issues, comment the issue number in the test for future reference.

Summary by Gitar

  • Error handling improvements:
    • Updated parse_openapi_schema_from_file and parse_openapi_schema_from_s3 to include the file path or S3 URI in OpenAPIParseError messages.
  • Test coverage:
    • Added unit tests for _ensure_mapping to verify validation of input types like list, str, and None.
    • Added integration tests in test_openapi_parser.py covering file path reporting for invalid JSON/YAML schemas.

This will update automatically on new commits.

@mohittilala mohittilala self-assigned this Apr 30, 2026
Copilot AI review requested due to automatic review settings April 30, 2026 04:02
@mohittilala mohittilala requested review from a team as code owners April 30, 2026 04:02
@github-actions github-actions Bot added Ingestion safe to test Add this label to run secure Github workflows on PRs labels Apr 30, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to make REST connector E2E + unit tests hermetic (no dependency on external petstore3.swagger.io) and to harden OpenAPI parsing by rejecting non-object parse results early.

Changes:

  • Updated Playwright REST service form automation to use a bundled OpenAPI schema fixture file path instead of a live URL.
  • Updated REST unit tests to patch get_connection so RestSource.create(...) doesn’t perform network calls.
  • Hardened OpenAPI parsing helpers to raise OpenAPIParseError when parsed content is not a mapping/object.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
openmetadata-ui/src/main/resources/ui/playwright/support/entity/ingestion/ApiIngestionClass.ts Switch REST connector E2E form fill from external URL to local OpenAPI fixture path.
ingestion/tests/unit/topology/api/test_rest.py Patch get_connection in unit tests to avoid network during RestSource.create(...).
ingestion/src/metadata/ingestion/source/api/rest/parser.py Add _ensure_mapping guard and apply it across HTTP/file/S3 OpenAPI parsing paths.
ingestion/examples/openapi/sample.json Add a bundled OpenAPI fixture used by the E2E flow.
Comments suppressed due to low confidence (1)

ingestion/tests/unit/topology/api/test_rest.py:441

  • The new get_connection patch in RESTTest.__init__ only affects construction of self.rest_source. This test module still has other RestSource.create(...) calls (e.g., in test_endpoint_filter_pattern) that are only patching ApiServiceSource.test_connection and will still execute the real get_connection (and hit https://petstore3.swagger.io/...) during RestSource.create, so the suite is not fully hermetic yet. Consider patching metadata.ingestion.source.api.api_service.get_connection for those tests as well (or overriding the config to use the local fixture).
    @patch(
        "metadata.ingestion.source.api.api_service.get_connection",
        return_value=MOCK_JSON_RESPONSE,
    )
    @patch("metadata.ingestion.source.api.api_service.ApiServiceSource.test_connection")
    def __init__(self, methodName, test_connection, get_connection) -> None:  # noqa: N803
        super().__init__(methodName)
        test_connection.return_value = False
        self.config = OpenMetadataWorkflowConfig.model_validate(mock_rest_config)
        self.rest_source = RestSource.create(
            mock_rest_config["source"],
            self.config.workflowConfig.openMetadataServerConfig,
        )

Comment thread ingestion/src/metadata/ingestion/source/api/rest/parser.py Outdated
Comment thread ingestion/src/metadata/ingestion/source/api/rest/parser.py Outdated
Comment thread ingestion/src/metadata/ingestion/source/api/rest/parser.py
@github-actions
Copy link
Copy Markdown
Contributor

Jest test Coverage

UI tests summary

Lines Statements Branches Functions
Coverage: 61%
62.06% (62310/100391) 42.25% (33402/79049) 45.3% (9904/21862)

@mohittilala mohittilala moved this to In Review / QA 👀 in Shipping Apr 30, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 30, 2026

🟡 Playwright Results — all passed (12 flaky)

✅ 3986 passed · ❌ 0 failed · 🟡 12 flaky · ⏭️ 86 skipped

Shard Passed Failed Flaky Skipped
🟡 Shard 1 298 0 1 4
🟡 Shard 2 751 0 3 8
🟡 Shard 3 744 0 2 7
🟡 Shard 4 772 0 3 18
🟡 Shard 5 686 0 1 41
🟡 Shard 6 735 0 2 8
🟡 12 flaky test(s) (passed on retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/ActivityAPI.spec.ts › Activity event is created when description is updated (shard 2, 1 retry)
  • Features/ActivityAPI.spec.ts › Activity event is created when owner is added (shard 2, 1 retry)
  • Features/DataProductRename.spec.ts › should rename data product and verify assets are still associated (shard 2, 1 retry)
  • Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
  • Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 3, 1 retry)
  • Pages/CustomProperties.spec.ts › Should display custom properties for apiCollection in right panel (shard 4, 1 retry)
  • Pages/DataContracts.spec.ts › Create Data Contract and validate for MlModel (shard 4, 1 retry)
  • Pages/DataContractsSemanticRules.spec.ts › Validate Description Rule Is_Not_Set (shard 4, 1 retry)
  • Pages/EntityDataSteward.spec.ts › Tier Add, Update and Remove (shard 5, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
  • Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab IS visible for supported type: searchIndex (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Comment thread ingestion/src/metadata/ingestion/source/api/rest/parser.py
Copilot AI review requested due to automatic review settings April 30, 2026 10:25
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

logger.debug("Successfully parsed OpenAPI schema as JSON")
return parsed # noqa: TRY300
except json.JSONDecodeError:
except (json.JSONDecodeError, OpenAPIParseError):
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In parse_openapi_schema's fallback path, _ensure_mapping(json.loads(...), "JSON") raising OpenAPIParseError (e.g., body is a valid JSON array/scalar) is currently caught and treated the same as JSONDecodeError, which then logs "Content is not valid JSON" and tries YAML. Since the JSON parse already succeeded, retrying YAML can’t fix the shape and also discards the more specific error message. Consider letting OpenAPIParseError bubble up here (or at least logging a more accurate message like "JSON is not an object") and skipping the YAML fallback in this case.

Suggested change
except (json.JSONDecodeError, OpenAPIParseError):
except json.JSONDecodeError:

Copilot uses AI. Check for mistakes.
@gitar-bot
Copy link
Copy Markdown

gitar-bot Bot commented May 1, 2026

Code Review ✅ Approved 1 resolved / 1 findings

Hardens the OpenAPI parser by introducing robust error handling and makes REST connector tests fully hermetic by replacing network dependencies with local fixtures. Resolved an issue where OpenAPIParseError could escape fallback blocks.

✅ 1 resolved
Edge Case: OpenAPIParseError escapes fallback try/except blocks

📄 ingestion/src/metadata/ingestion/source/api/rest/parser.py:81-95 📄 ingestion/src/metadata/ingestion/source/api/rest/parser.py:147-158 📄 ingestion/src/metadata/ingestion/source/api/rest/parser.py:236-247
_ensure_mapping raises OpenAPIParseError, but the fallback except clauses only catch json.JSONDecodeError and yaml.YAMLError. This means:

  1. In parse_openapi_schema (line 82): if json.loads succeeds but returns a non-dict (e.g. a JSON array [...]), _ensure_mapping raises OpenAPIParseError which is not caught by except json.JSONDecodeError on line 85. The YAML fallback is never attempted and the descriptive final error on line 97 is never reached.
  2. Same issue on line 90: if yaml.safe_load returns a non-dict (bare string from an HTML page), OpenAPIParseError escapes except yaml.YAMLError on line 93.
  3. Identical pattern in parse_openapi_schema_from_file (lines 148-156) and parse_openapi_schema_from_s3 (lines 237-245).

Practically, case 1 is benign (YAML is a JSON superset, so would produce the same non-dict), but case 2 is reachable: an HTML error page parses as a bare YAML string, _ensure_mapping raises, and the user sees "YAML parsing produced str, expected an object/mapping" instead of the clearer final error. Adding OpenAPIParseError to the except clauses keeps the fallback chain intact.

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 1, 2026

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 1, 2026

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Ingestion safe to test Add this label to run secure Github workflows on PRs

Projects

Status: In Review / QA 👀

Development

Successfully merging this pull request may close these issues.

2 participants