fix(catalogs): validate extension and preset catalog payload shape#2621
Open
Quratulain-bilal wants to merge 2 commits into
Open
fix(catalogs): validate extension and preset catalog payload shape#2621Quratulain-bilal wants to merge 2 commits into
Quratulain-bilal wants to merge 2 commits into
Conversation
`ExtensionCatalog._fetch_single_catalog` and
`PresetCatalog._fetch_single_catalog` only check that the `extensions` /
`presets` key is *present* in the parsed catalog JSON. They don't check
that the value is a JSON object, and they don't check that the root is
a JSON object at all. A malformed (or compromised) upstream catalog
returning:
{"schema_version": "1.0", "extensions": []}
passes both `"extensions" not in catalog_data` and the subsequent
`response.read()` JSON parse, gets cached on disk, and then crashes
deep inside `_get_merged_extensions` (resp. `_get_merged_packs`) with:
AttributeError: 'list' object has no attribute 'items'
instead of the existing user-facing
`ExtensionError("Invalid catalog format from <url>")` /
`PresetError("Invalid preset catalog format")` that the surrounding
code is clearly trying to produce.
The sibling integration-catalog reader already validates this — see
`src/specify_cli/integrations/catalog.py` where the fetch path
explicitly checks both `isinstance(catalog_data, dict)` and
`isinstance(catalog_data.get("integrations"), dict)` before returning.
This change mirrors that pattern in the extension and preset readers so
the three catalog fetchers stay consistent and a malformed upstream
surfaces as the user-facing error instead of a raw Python traceback.
Adds parametrized regression tests covering:
- root payload is not a JSON object (list, str, int, null)
- root is a dict but `extensions` / `presets` value is the wrong type
(list, str, null, int)
All eight bad-payload shapes now raise the expected catalog error.
Contributor
There was a problem hiding this comment.
Pull request overview
Hardens the extension and preset catalog fetchers so malformed upstream JSON payloads fail fast with the intended user-facing ExtensionError / PresetError instead of crashing later with raw Python exceptions during merge/iteration.
Changes:
- Add root-object and nested
extensions/presetstype validation in_fetch_single_catalog()for extensions and presets. - Add regression tests covering non-dict roots and wrong-type
extensions/presetsvalues to ensure the expected catalog errors are raised. - Align error behavior more closely with the existing integration catalog reader’s payload-shape validation approach.
Show a summary per file
| File | Description |
|---|---|
src/specify_cli/extensions.py |
Adds JSON payload shape validation for extension catalogs before caching/merging. |
src/specify_cli/presets.py |
Adds JSON payload shape validation for preset catalogs before caching/merging. |
tests/test_extensions.py |
Adds parametrized regression tests for malformed extension catalog payload shapes. |
tests/test_presets.py |
Adds parametrized regression tests for malformed preset catalog payload shapes. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 4/4 changed files
- Comments generated: 2
| raise ExtensionError( | ||
| f"Invalid catalog format from {entry.url}: " | ||
| "'extensions' must be a JSON object" | ||
| ) |
Comment on lines
+2048
to
+2072
| # Validate payload shape before iteration. Checking only key | ||
| # presence would let a payload like ``{"presets": []}`` or | ||
| # ``{"presets": null}`` slip through here and then crash with | ||
| # ``AttributeError: 'list' object has no attribute 'items'`` deep | ||
| # inside ``_get_merged_packs``. The sibling integration catalog | ||
| # reader already guards both the root object and the nested | ||
| # mapping (see ``integrations/catalog.py``); the preset catalog | ||
| # must stay consistent so a malformed upstream surfaces as the | ||
| # user-facing ``Invalid preset catalog format`` error instead of | ||
| # a raw Python traceback. | ||
| if not isinstance(catalog_data, dict): | ||
| raise PresetError( | ||
| f"Invalid preset catalog format from {entry.url}: " | ||
| "expected a JSON object" | ||
| ) | ||
| if ( | ||
| "schema_version" not in catalog_data | ||
| or "presets" not in catalog_data | ||
| ): | ||
| raise PresetError("Invalid preset catalog format") | ||
| if not isinstance(catalog_data.get("presets"), dict): | ||
| raise PresetError( | ||
| f"Invalid preset catalog format from {entry.url}: " | ||
| "'presets' must be a JSON object" | ||
| ) |
…erge
Addresses Copilot review feedback on this PR.
`_fetch_single_catalog` now validates that the ``extensions`` / ``presets``
value is a mapping, but it doesn't (and shouldn't) validate every entry
inside that mapping. A payload like:
{"schema_version": "1.0", "extensions": {"good": {...}, "bad": []}}
passes the fetch-level guard, then later crashes inside
``_get_merged_extensions`` (resp. ``_get_merged_packs``) at
``{**ext_data, ...}`` with ``TypeError: 'list' object is not a mapping``.
The sibling integration-catalog reader at
``src/specify_cli/integrations/catalog.py:245`` handles this with a
per-entry ``isinstance(integ_data, dict)`` skip during merge, so one
malformed entry doesn't poison an otherwise valid catalog. This change
mirrors that pattern in the extension and preset mergers and adds
regression tests asserting that valid entries continue to merge while
malformed siblings are silently dropped.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
ExtensionCatalog._fetch_single_catalogandPresetCatalog._fetch_single_catalogonly check that theextensions/presetskey is present in the parsed catalog JSON. They don't check that the value is a JSON object, and they don'tcheck that the root is a JSON object at all. A malformed (or compromised) upstream catalog returning:
{"schema_version": "1.0", "extensions": []}passes both the
"extensions" not in catalog_datacheck and the JSON parse, gets cached on disk, and then crashes deepinside
_get_merged_extensions(resp._get_merged_packs) with:instead of the existing user-facing
ExtensionError("Invalid catalog format from <url>")/PresetError("Invalid preset catalog format")that the surrounding code is clearly trying to produce.Reproducer
Why this matters
AttributeError) instead of the existingExtensionError/PresetError. Users see a Python stack trace and assume the CLI itself is broken, not their catalog.SPECKIT_CATALOG_URLenv var, project-levelextension-catalogs.yml) can includecommunity-hosted catalogs. A single misedited entry on any catalog in the stack crashes the whole resolve path.
src/specify_cli/integrations/catalog.py:175,186already validates both theroot object and the nested mapping — the three catalog fetchers had drifted out of sync.
The change
src/specify_cli/extensions.py— addisinstance(catalog_data, dict)andisinstance(catalog_data.get("extensions"), dict)guards in_fetch_single_catalogbefore iteration, raising the existingExtensionError("Invalid catalog format from ...")with a more specific suffix when the type is wrong.src/specify_cli/presets.py— same guards in_fetch_single_catalog, raisingPresetError.integrations/catalog.pypattern exactly, with a short comment explaining theprecedent so the rationale is visible to the next reader.
Tests
Parametrized regression tests in both
tests/test_extensions.py::TestExtensionCatalogandtests/test_presets.py::TestPresetCatalogcover:[],"oops",42,nullextensions/presetsvalue is the wrong type:[],"oops",null,42All 8 bad-payload shapes per file (16 total) now raise the expected catalog error.
All checks passed!