Add type annotations and configure mypy - claude web next try#167
Add type annotations and configure mypy - claude web next try#167RonnyPfannschmidt wants to merge 5 commits intomainfrom
Conversation
This commit adds complete type annotations to all Python files in the prance codebase and configures mypy with strict type checking settings. Key changes: - Configure mypy in pyproject.toml with strict settings: - disallow_untyped_defs=true - disallow_untyped_calls=true - disallow_incomplete_defs=true - warn_return_any=true - Other strict type checking options - Add type annotations to all modules: - prance/util/exceptions.py: Type-annotated raise_from function - prance/util/iterators.py: Added JsonValue and PathElement type aliases - prance/util/path.py: Complete type annotations for path operations - prance/util/__init__.py: Typed utility functions - prance/util/fs.py: File system operation types - prance/util/formats.py: Format parsing and serialization types - prance/util/url.py: URL handling with proper ParseResult types - prance/util/resolver.py: Reference resolution with full typing - prance/util/translator.py: Reference translation types - prance/mixins.py: Mixin classes with proper typing - prance/__init__.py: Core parser classes fully typed - prance/convert.py: Conversion functions typed - prance/cli.py: CLI interface types - Type design principles: - Used specific types instead of Any wherever possible - Defined JsonValue type alias for JSON-like structures - Properly handled optional dependencies with type: ignore comments - Added type narrowing with isinstance checks - Used Union, Optional, Sequence, Mapping appropriately - Mypy configuration: - Added CLI module override to allow untyped decorators from click - All files now pass mypy strict checking All type checking now passes: mypy prance/ returns success.
Reviewer's GuideThis PR introduces a strict mypy configuration and retrofits the entire prance codebase with comprehensive type annotations—from utility and core parser modules to CLI commands and conversion logic—ensuring all files pass strict mypy checking. Class diagram for updated type annotations in prance.util.pathclassDiagram
class PathElement {
<<type alias>>
str | int
}
class JsonValue {
<<type alias>>
Mapping[str, JsonValue]
Sequence[JsonValue]
str
int
float
bool
None
}
class prance.util.path {
+_json_ref_escape(path: PathElement) str
+_str_path(path: Sequence[PathElement]) str
+path_get(obj: JsonValue, path: Optional[Sequence[PathElement]], defaultvalue: JsonValue = None, path_of_obj: Tuple[PathElement, ...] = ()) JsonValue
+path_set(obj: JsonValue, path: Sequence[PathElement], value: JsonValue, **options: bool) JsonValue
}
Class diagram for updated type annotations in prance.init.py (BaseParser and ResolvingParser)classDiagram
class BaseParser {
+url: ParseResult
+_spec_string: Optional[str]
+specification: Optional[JsonValue]
+version: Optional[str]
+version_name: Optional[str]
+version_parsed: tuple
+valid: bool
+options: Dict[str, Any]
+backend: str
+__init__(url: Optional[str] = None, spec_string: Optional[str] = None, lazy: bool = False, **kwargs: Any) None
+parse() None
+_validate() None
+_validate_flex(spec_version: Version) None
+_validate_swagger_spec_validator(spec_version: Version) None
+_validate_openapi_spec_validator(spec_version: Version) None
+_strict_warning() str
}
class ResolvingParser {
+__reference_cache: Dict[Union[str, tuple], JsonValue]
+__init__(url: Optional[str] = None, spec_string: Optional[str] = None, lazy: bool = False, **kwargs: Any) None
+_validate() None
}
ResolvingParser --|> BaseParser
Class diagram for updated type annotations in prance.util.resolver.RefResolverclassDiagram
class RefResolver {
+specs: JsonValue
+url: Optional[Union[str, ParseResult]]
+parsed_url: Optional[ParseResult]
+_url_key: Optional[Tuple[str, bool]]
+__reclimit: int
+__reclimit_handler: Callable[[int, ParseResult, Tuple[Tuple[str, Tuple[PathElement, ...]], ...]], Any]
+__reference_cache: Dict[Union[str, Tuple[str, bool]], JsonValue]
+__resolve_types: int
+__resolve_method: int
+__encoding: Optional[str]
+__strict: bool
+__soft_dereference_objs: Dict[str, JsonValue]
+__init__(specs: JsonValue, url: Optional[Union[str, ParseResult]] = None, **options: Any) None
+resolve_references() None
+_dereferencing_iterator(base_url: Optional[ParseResult], partial: JsonValue, path: Tuple[PathElement, ...], recursions: Tuple[Tuple[str, Tuple[PathElement, ...]], ...]) Iterator[Tuple[Tuple[PathElement, ...], JsonValue]]
+_collect_soft_refs(ref_url: ParseResult, item_path: List[PathElement], value: JsonValue) str
+_skip_reference(base_url: ParseResult, ref_url: ParseResult) bool
+_dereference(ref_url: ParseResult, obj_path: List[PathElement], recursions: Tuple[Tuple[str, Tuple[PathElement, ...]], ...]) JsonValue
+_resolve_partial(base_url: Optional[ParseResult], partial: JsonValue, recursions: Tuple[Tuple[PathElement, ...], ...]) JsonValue
}
Class diagram for updated type annotations in prance.util.translator._RefTranslatorclassDiagram
class _RefTranslator {
+specs: JsonValue
+url: Optional[ParseResult]
+__strict: bool
+__reference_cache: Dict[Tuple[str, bool], JsonValue]
+__collected_references: Dict[str, Optional[JsonValue]]
+__init__(specs: JsonValue, url: Optional[str]) None
+translate_references() None
+_dereference(ref_url: ParseResult, obj_path: List[PathElement]) JsonValue
+_translate_partial(base_url: ParseResult, partial: JsonValue) JsonValue
+_translating_iterator(base_url: ParseResult, partial: JsonValue, path: Tuple[PathElement, ...]) Iterator[Tuple[Tuple[PathElement, ...], Dict[str, str]]]
}
Class diagram for updated type annotations in prance.util.formatsclassDiagram
class prance.util.formats {
+ParseError(ValueError)
+__format_preferences(filename: Optional[str], content_type: Optional[str]) Tuple[str, ...]
+__parse_yaml(spec_str: str) JsonValue
+__parse_json(spec_str: str) JsonValue
+__serialize_yaml(specs: JsonValue) str
+__serialize_json(specs: JsonValue) str
+format_info(format_name: str) Tuple[Optional[str], Optional[str]]
+parse_spec_details(spec_str: str, filename: Optional[str] = None, **kwargs: Optional[str]) Tuple[JsonValue, Optional[str], Optional[str]]
+parse_spec(spec_str: str, filename: Optional[str] = None, **kwargs: Optional[str]) JsonValue
+serialize_spec(specs: JsonValue, filename: Optional[str] = None, **kwargs: Optional[str]) str
}
Class diagram for updated type annotations in prance.util.fsclassDiagram
class prance.util.fs {
+is_pathname_valid(pathname: str) bool
+from_posix(fname: str) str
+to_posix(fname: str) str
+abspath(filename: str, relative_to: Optional[str] = None) str
+canonical_filename(filename: str) str
+detect_encoding(filename: str, default_to_utf8: bool = True, **kwargs: bool) str
+read_file(filename: str, encoding: Optional[str] = None) str
+write_file(filename: str, contents: str, encoding: Optional[str] = None) None
}
Class diagram for updated type annotations in prance.util.iteratorsclassDiagram
class prance.util.iterators {
+item_iterator(value: JsonValue, path: Tuple[PathElement, ...] = ()) Iterator[Tuple[Tuple[PathElement, ...], JsonValue]]
+reference_iterator(specs: JsonValue, path: Tuple[PathElement, ...] = ()) Iterator[Tuple[PathElement, JsonValue, Tuple[PathElement, ...]]]
}
Class diagram for updated type annotations in prance.util.exceptionsclassDiagram
class prance.util.exceptions {
+raise_from(klass: Type[BaseException], from_value: Optional[BaseException], extra_message: Optional[str] = None) None
}
Class diagram for updated type annotations in prance.util.initclassDiagram
class prance.util.__init__ {
+stringify_keys(data: MappingT) MappingT
+validation_backends() Tuple[str, ...]
+default_validation_backend() str
}
Class diagram for updated type annotations in prance.mixinsclassDiagram
class CacheSpecsMixin {
+specification: Any
+specs_updated() bool
}
class YAMLMixin {
+yaml() str
}
class JSONMixin {
+json() str
}
YAMLMixin --|> CacheSpecsMixin
JSONMixin --|> CacheSpecsMixin
Class diagram for updated type annotations in prance.convertclassDiagram
class prance.convert {
+ConversionError(ValueError)
+convert_str(spec_str: str, filename: Optional[str] = None, **kwargs: Optional[str]) Tuple[str, str]
+convert_url(url: Union[str, ParseResult], cache: Optional[Dict[str, Tuple[str, Optional[str]]]] = None) Tuple[str, str]
+convert_spec(parser_or_spec: Union[JsonValue, BaseParser], parser_klass: Optional[Type[BaseParser]] = None, *args: Any, **kwargs: Any) BaseParser
}
Class diagram for updated type annotations in prance.cliclassDiagram
class prance.cli {
+__write_to_file(filename: str, specs: JsonValue) None
+__parser_for_url(url: str, resolve: bool, backend: str, strict: bool, encoding: Optional[str]) Tuple[prance.BaseParser, str]
+__validate(parser: prance.BaseParser, name: str) None
+cli() None
+GroupWithCommandOptions.add_command(cmd: click.Command, name: Optional[str] = None) None
+GroupWithCommandOptions.build_command_invoke(original_invoke: Any) Any
+backend_options(ctx: click.Context, resolve: bool, backend: str, strict: bool, encoding: Optional[str]) None
+validate(ctx: click.Context, output_file: Optional[str], urls: Tuple[str, ...]) None
+compile(ctx: click.Context, url_or_path: str, output_file: Optional[str]) None
+convert(url_or_path: str, output_file: Optional[str]) None
}
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `prance/util/url.py:52-56` </location>
<code_context>
- parsed = url
- if not isinstance(parsed, tuple):
+ parsed: ParseResult
+ if isinstance(url, tuple):
+ parsed = url
+ else:
from .fs import is_pathname_valid
</code_context>
<issue_to_address>
**suggestion:** Type checking for tuple in absurl is fragile and may not cover all ParseResult cases.
Use isinstance(url, ParseResult) instead of checking for tuple, as ParseResult is a namedtuple subclass. This will improve clarity and correctness.
```suggestion
if isinstance(url, ParseResult):
parsed = url
else:
from .fs import is_pathname_valid
```
</issue_to_address>
### Comment 2
<location> `prance/util/path.py:37` </location>
<code_context>
-def path_get(obj, path, defaultvalue=None, path_of_obj=()):
+def path_get(obj: JsonValue, path: Optional[Sequence[PathElement]], defaultvalue: JsonValue = None, path_of_obj: Tuple[PathElement, ...] = ()) -> JsonValue:
"""
Retrieve the value from obj indicated by path.
</code_context>
<issue_to_address>
**issue (bug_risk):** String conversion for dict keys in path_get may cause unexpected KeyErrors.
Accessing dict keys with obj[str(path[0])] can fail if the keys are not strings. Use obj[path[0]] or ensure key types are consistent.
</issue_to_address>
### Comment 3
<location> `prance/util/path.py:146` </location>
<code_context>
seq.append({})
- def safe_idx(seq, index):
+ def safe_idx(seq: Sequence[PathElement], index: int) -> Optional[Type[int]]:
"""
Safely index a sequence.
</code_context>
<issue_to_address>
**suggestion:** safe_idx's return type is misleading and may cause confusion.
The annotation should be Optional[type] to accurately reflect that the return value can be the type of any element in the sequence, not just int.
```suggestion
def safe_idx(seq: Sequence[PathElement], index: int) -> Optional[type]:
```
</issue_to_address>
### Comment 4
<location> `prance/__init__.py:144-147` </location>
<code_context>
if self._spec_string:
from .util.formats import parse_spec
- self.specification = parse_spec(self._spec_string, self.url)
+ self.specification = parse_spec(self._spec_string, self.url.path)
# If we have a parsed spec, convert it to JSON. Then we can validate
</code_context>
<issue_to_address>
**issue (bug_risk):** Passing self.url.path to parse_spec may not be sufficient for remote URLs.
Passing only the path may omit necessary information for remote URLs. Use the full URL or ParseResult to preserve context for accurate format detection.
</issue_to_address>
### Comment 5
<location> `prance/util/url.py:19` </location>
<code_context>
-def urlresource(url):
+def urlresource(url: ParseResult) -> str:
"""
Return the resource part of a parsed URL.
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying urlresource() using ParseResult._replace and unifying the cache logic in fetch_url and fetch_url_text.
- Simplify urlresource() by using `ParseResult._replace()` instead of list-slicing and casts
- Unify your “text” and “parsed” caches: always pass the same cache dict to both `fetch_url_text` and `fetch_url`, so you can drop the `text_cache` splitting logic entirely
```python
from urllib.parse import ParseResult
def urlresource(url: ParseResult) -> str:
"""Keep only scheme, netloc, path; drop params, query, fragment."""
return url._replace(params='', query='', fragment='').geturl()
```
```python
def fetch_url_text(
url: ParseResult,
cache: Optional[Dict[str, Tuple[str, Optional[str]]]] = None,
encoding: Optional[str] = None
) -> Tuple[str, Optional[str]]:
if cache is None:
cache = {}
text_key = "text_" + urlresource(url)
if text_key in cache:
return cache[text_key]
# … load content, set content_type …
cache[text_key] = (content, content_type)
return content, content_type
```
```python
def fetch_url(
url: ParseResult,
cache: Optional[Dict[Union[str, Tuple[str, bool]], JsonValue]] = None,
encoding: Optional[str] = None,
strict: bool = True
) -> JsonValue:
if cache is None:
cache = {}
parsed_key = (urlresource(url), strict)
if parsed_key in cache:
return cache[parsed_key].copy()
content, content_type = fetch_url_text(url, cache, encoding)
result = parse_spec(content, url.path, content_type=content_type)
if not strict:
result = stringify_keys(result)
cache[parsed_key] = result
return result.copy()
```
This removes the cast noise in `urlresource` and the local `text_cache` + filtering loop in `fetch_url` while preserving identical behavior.
</issue_to_address>
### Comment 6
<location> `prance/util/resolver.py:123` </location>
<code_context>
# If there are any objects collected when using TRANSLATE_EXTERNAL, add
# them to components/schemas
if self.__soft_dereference_objs:
- if "components" not in self.specs:
- self.specs["components"] = {}
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring nested type checks and repeated guards into early exits and helper functions to simplify control flow.
Here are two small refactorings that retain all type checks but collapse the nesting and remove a lot of boilerplate.
1) Flattening the “add soft‐refs” logic in `resolve_references` using `dict.setdefault` and a guard upfront:
Before:
```python
if self.__soft_dereference_objs:
# Type narrow specs to MutableMapping for safe indexing
if isinstance(self.specs, MutableMapping):
if "components" not in self.specs:
self.specs["components"] = {}
components = self.specs["components"]
if isinstance(components, MutableMapping):
if "schemas" not in components:
components.update({"schemas": {}})
schemas = components["schemas"]
if isinstance(schemas, MutableMapping):
schemas.update(self.__soft_dereference_objs)
```
After:
```python
from collections.abc import MutableMapping
if not isinstance(self.specs, MutableMapping) or not self.__soft_dereference_objs:
return
components = self.specs.setdefault("components", {})
if not isinstance(components, MutableMapping):
return
schemas = components.setdefault("schemas", {})
if isinstance(schemas, MutableMapping):
schemas.update(self.__soft_dereference_objs)
```
2) Extracting the early‐exit checks in `_dereferencing_iterator` into a tiny helper:
Before:
```python
for _, refstring, item_path in reference_iterator(partial):
# Type narrow refstring to str for split_url_reference
if not isinstance(refstring, str):
continue
# base_url must be a ParseResult for split_url_reference
if base_url is None:
continue
ref_url, obj_path = _url.split_url_reference(base_url, refstring)
…
```
After:
```python
def _split_ref(self, base_url, refstring):
if base_url is None or not isinstance(refstring, str):
return None
return _url.split_url_reference(base_url, refstring)
…
for _, refstring, item_path in reference_iterator(partial):
split = self._split_ref(base_url, refstring)
if split is None:
continue
ref_url, obj_path = split
…
```
These two changes collapse the repeated `isinstance` boilerplate into guard clauses and tiny helpers, making the control flow much clearer without dropping any type safety.
</issue_to_address>
### Comment 7
<location> `prance/util/path.py:61` </location>
<code_context>
+ obj[str(path[0])], path[1:], defaultvalue, path_of_obj=path_of_obj + (path[0],)
)
- elif isinstance(obj, Sequence):
+ elif isinstance(obj, AbcSequence):
if path is None or len(path) < 1:
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring the sequence branch in path_set to remove nested helpers and use direct type-driven logic for clarity and reduced indirection.
```markdown
You can flatten out the nested helpers (`fill_sequence` / `safe_idx`) and drive decisions directly from the next path‐element’s type. That removes ~50 LOC of indirection and makes your `path_set` sequence‐branch much easier to follow:
1. pull your `collections.abc` imports (and any small `_validate_path`) to module scope
2. rewrite the `elif isinstance(obj, AbcSequence):` branch like so
```python
elif isinstance(obj, AbcSequence):
# ensure int index
idx = int(path[0])
if not isinstance(obj, MutableSequence):
raise TypeError(f"Sequence is not mutable: {type(obj)}")
if idx < 0:
raise KeyError(f"Negative index: {idx}")
# optionally extend to that index
if create and idx >= len(obj):
obj.extend([None] * (idx - len(obj) + 1))
elif idx >= len(obj):
raise IndexError(f"Index {idx} out of range")
# last segment? set and return
if len(path) == 1:
obj[idx] = value
return obj
# ensure intermediate container
if obj[idx] is None:
obj[idx] = [] if isinstance(path[1], int) else {}
return path_set(obj[idx], path[1:], value, create=create)
```
Advantages:
- no nested `fill_sequence` / `safe_idx`
- all sequence‐extension logic in one place
- next‐element type guides list/dict choice
- keeps full feature parity (lazy‐create + immutability checks)
```
</issue_to_address>
### Comment 8
<location> `prance/util/resolver.py:123-134` </location>
<code_context>
if self.__soft_dereference_objs:
# Type narrow specs to MutableMapping for safe indexing
if isinstance(self.specs, MutableMapping):
if "components" not in self.specs:
self.specs["components"] = {}
components = self.specs["components"]
if isinstance(components, MutableMapping):
if "schemas" not in components:
components.update({"schemas": {}})
schemas = components["schemas"]
if isinstance(schemas, MutableMapping):
schemas.update(self.__soft_dereference_objs)
</code_context>
<issue_to_address>
**suggestion (code-quality):** Merge nested if conditions ([`merge-nested-ifs`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/merge-nested-ifs))
```suggestion
if self.__soft_dereference_objs and isinstance(self.specs, MutableMapping):
if "components" not in self.specs:
self.specs["components"] = {}
components = self.specs["components"]
if isinstance(components, MutableMapping):
if "schemas" not in components:
components.update({"schemas": {}})
schemas = components["schemas"]
if isinstance(schemas, MutableMapping):
schemas.update(self.__soft_dereference_objs)
```
<br/><details><summary>Explanation</summary>Too much nesting can make code difficult to understand, and this is especially
true in Python, where there are no brackets to help out with the delineation of
different nesting levels.
Reading deeply nested code is confusing, since you have to keep track of which
conditions relate to which levels. We therefore strive to reduce nesting where
possible, and the situation where two `if` conditions can be combined using
`and` is an easy win.
</details>
</issue_to_address>
### Comment 9
<location> `prance/util/translator.py:88-99` </location>
<code_context>
if self.__collected_references:
# Type narrow specs to MutableMapping for safe indexing
if isinstance(self.specs, MutableMapping):
if "components" not in self.specs:
self.specs["components"] = {}
components = self.specs["components"]
if isinstance(components, MutableMapping):
if "schemas" not in components:
components.update({"schemas": {}})
schemas = components["schemas"]
if isinstance(schemas, MutableMapping):
schemas.update(self.__collected_references)
</code_context>
<issue_to_address>
**suggestion (code-quality):** Merge nested if conditions ([`merge-nested-ifs`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/merge-nested-ifs))
```suggestion
if self.__collected_references and isinstance(self.specs, MutableMapping):
if "components" not in self.specs:
self.specs["components"] = {}
components = self.specs["components"]
if isinstance(components, MutableMapping):
if "schemas" not in components:
components.update({"schemas": {}})
schemas = components["schemas"]
if isinstance(schemas, MutableMapping):
schemas.update(self.__collected_references)
```
<br/><details><summary>Explanation</summary>Too much nesting can make code difficult to understand, and this is especially
true in Python, where there are no brackets to help out with the delineation of
different nesting levels.
Reading deeply nested code is confusing, since you have to keep track of which
conditions relate to which levels. We therefore strive to reduce nesting where
possible, and the situation where two `if` conditions can be combined using
`and` is an easy win.
</details>
</issue_to_address>
### Comment 10
<location> `prance/util/url.py:249` </location>
<code_context>
entry = cache.get(url_key_tuple, None)
</code_context>
<issue_to_address>
**suggestion (code-quality):** Replace `dict.get(x, None)` with `dict.get(x)` ([`remove-none-from-default-get`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/remove-none-from-default-get))
```suggestion
entry = cache.get(url_key_tuple)
```
<br/><details><summary>Explanation</summary>When using a dictionary's `get` method you can specify a default to return if
the key is not found. This defaults to `None`, so it is unnecessary to specify
`None` if this is the required behaviour. Removing the unnecessary argument
makes the code slightly shorter and clearer.
</details>
</issue_to_address>
### Comment 11
<location> `prance/util/formats.py:71-75` </location>
<code_context>
def __parse_yaml(spec_str: str) -> JsonValue: # noqa: N802
from ruamel.yaml import YAML, parser # type: ignore[import-not-found]
try:
yaml = YAML(typ="safe")
return yaml.load(str(spec_str)) # type: ignore[no-any-return]
except parser.ParserError as err:
raise ParseError(str(err))
</code_context>
<issue_to_address>
**issue (code-quality):** We've found these issues:
- Remove unnecessary casts to int, str, float or bool ([`remove-unnecessary-cast`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/remove-unnecessary-cast/))
- Explicitly raise from a previous error ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))
</issue_to_address>
### Comment 12
<location> `prance/util/formats.py:81-84` </location>
<code_context>
def __parse_json(spec_str: str) -> JsonValue: # noqa: N802
import json
try:
return json.loads(str(spec_str)) # type: ignore[no-any-return]
except ValueError as err:
raise ParseError(str(err))
</code_context>
<issue_to_address>
**issue (code-quality):** We've found these issues:
- Remove unnecessary casts to int, str, float or bool ([`remove-unnecessary-cast`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/remove-unnecessary-cast/))
- Explicitly raise from a previous error ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))
</issue_to_address>
### Comment 13
<location> `prance/util/path.py:28-29` </location>
<code_context>
def _json_ref_escape(path: PathElement) -> str:
"""JSON-reference escape object path."""
path_str = str(path) # Could be an int, etc.
path_str = path_str.replace("~", "~0")
path_str = path_str.replace("/", "~1")
return path_str
</code_context>
<issue_to_address>
**suggestion (code-quality):** Inline variable that is immediately returned ([`inline-immediately-returned-variable`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/inline-immediately-returned-variable/))
```suggestion
return path_str.replace("/", "~1")
```
</issue_to_address>
### Comment 14
<location> `prance/util/resolver.py:33-35` </location>
<code_context>
def default_reclimit_handler(limit: int, parsed_url: ParseResult, recursions: Tuple[Tuple[str, Tuple[PathElement, ...]], ...] = ()) -> None:
"""Raise prance.util.url.ResolutionError."""
path: List[str] = []
for rc in recursions:
path.append("{}#/{}".format(rc[0], "/".join(str(p) for p in rc[1])))
path_str = "\n".join(path)
raise _url.ResolutionError(
"Recursion reached limit of %d trying to "
'resolve "%s"!\n%s' % (limit, parsed_url.geturl(), path_str)
)
</code_context>
<issue_to_address>
**suggestion (code-quality):** We've found these issues:
- Convert for loop into list comprehension ([`list-comprehension`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/list-comprehension/))
- Replace call to format with f-string ([`use-fstring-for-formatting`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-fstring-for-formatting/))
```suggestion
path: List[str] = [
f'{rc[0]}#/{"/".join(str(p) for p in rc[1])}' for rc in recursions
]
```
</issue_to_address>
### Comment 15
<location> `prance/util/url.py:30` </location>
<code_context>
def urlresource(url: ParseResult) -> str:
"""
Return the resource part of a parsed URL.
The resource part is defined as the part without query, parameters or
fragment. Just the scheme, netloc and path remains.
:param tuple url: A parsed URL
:return: The resource part of the URL
:rtype: str
"""
res_list: List[Optional[str]] = list(url)[0:3] + [None, None, None]
return parse.ParseResult(*cast(Tuple[str, str, str, str, str, str], res_list)).geturl()
</code_context>
<issue_to_address>
**suggestion (code-quality):** Replace a[0:x] with a[:x] and a[x:len(a)] with a[x:] ([`remove-redundant-slice-index`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/remove-redundant-slice-index/))
```suggestion
res_list: List[Optional[str]] = list(url)[:3] + [None, None, None]
```
</issue_to_address>
### Comment 16
<location> `prance/util/url.py:181` </location>
<code_context>
def fetch_url_text(url: ParseResult, cache: Optional[Dict[str, Tuple[str, Optional[str]]]] = None, encoding: Optional[str] = None) -> Tuple[str, Optional[str]]:
"""
Fetch the URL.
If the URL is a file URL, the format used for parsing depends on the file
extension. Otherwise, YAML is assumed.
The URL may also use the `python` scheme. In this scheme, the netloc part
refers to an importable python package, and the path part to a path relative
to the package path, e.g. `python://some_package/path/to/file.yaml`.
:param tuple url: The url, parsed as returned by `absurl` above.
:param Mapping cache: An optional cache. If the URL can be found in the
cache, return the cache contents.
:param str encoding: Provide an encoding for local URLs to override
encoding detection, if desired. Defaults to None.
:return: The resource text of the URL, and the content type.
:rtype: tuple
"""
if cache is None:
cache = {}
url_key = "text_" + urlresource(url)
entry = cache.get(url_key, None)
if entry is not None:
return entry
# Fetch contents according to scheme. We assume requests can handle all the
# non-file schemes, or throw otherwise.
content: str
content_type: Optional[str] = None
if url.scheme in (None, "", "file"):
from .fs import read_file, from_posix
try:
content = read_file(from_posix(url.path), encoding)
except FileNotFoundError as ex:
from .exceptions import raise_from
raise_from(ResolutionError, ex, f"File not found: {url.path}")
elif url.scheme == "python":
# Resolve package path
package = url.netloc
path = url.path
if path[0] == "/":
path = path[1:]
from importlib.resources import files
path_traversable = files(package).joinpath(path)
from .fs import read_file, from_posix
content = read_file(from_posix(str(path_traversable)), encoding)
else:
import requests # type: ignore[import-untyped]
response = requests.get(url.geturl())
if not response.ok: # pragma: nocover
raise ResolutionError(
'Cannot fetch URL "%s": %d %s'
% (url.geturl(), response.status_code, response.reason)
)
content_type = response.headers.get("content-type", "text/plain")
content = response.text
cache[url_key] = (content, content_type)
return content, content_type
</code_context>
<issue_to_address>
**issue (code-quality):** We've found these issues:
- Use f-string instead of string concatenation ([`use-fstring-for-concatenation`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-fstring-for-concatenation/))
- Extract code out into function ([`extract-method`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/extract-method/))
- Inline variable that is immediately returned ([`inline-immediately-returned-variable`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/inline-immediately-returned-variable/))
</issue_to_address>
### Comment 17
<location> `prance/util/url.py:230` </location>
<code_context>
def fetch_url(url: ParseResult, cache: Optional[Dict[Union[str, Tuple[str, bool]], JsonValue]] = None, encoding: Optional[str] = None, strict: bool = True) -> JsonValue:
"""
Fetch the URL and parse the contents.
Same as fetch_url_text(), but also parses the content and only
returns the parse results.
:param tuple url: The url, parsed as returned by `absurl` above.
:param Mapping cache: An optional cache. If the URL can be found in the
cache, return the cache contents.
:param str encoding: Provide an encoding for local URLs to override
encoding detection, if desired. Defaults to None.
:return: The parsed file.
:rtype: dict
"""
# Return from cache, if parsed result is already present.
if cache is None:
cache = {}
url_key_tuple: Tuple[str, bool] = (urlresource(url), strict)
entry = cache.get(url_key_tuple, None)
if entry is not None:
if isinstance(entry, Mapping):
return entry.copy() # type: ignore[no-any-return, attr-defined]
return entry
# Fetch URL text
text_cache: Dict[str, Tuple[str, Optional[str]]] = {}
for key, value in cache.items():
if isinstance(key, str) and isinstance(value, tuple):
text_cache[key] = value
content, content_type = fetch_url_text(url, text_cache, encoding=encoding)
cache.update(text_cache)
# Parse the result
from .formats import parse_spec
result = parse_spec(content, url.path, content_type=content_type)
# Perform some sanitization in lenient mode.
if not strict:
from collections.abc import MutableMapping
from . import stringify_keys
if isinstance(result, MutableMapping):
result = stringify_keys(result)
# Cache and return result
cache[url_key_tuple] = result
if isinstance(result, Mapping):
return result.copy() # type: ignore[no-any-return, attr-defined]
return result
</code_context>
<issue_to_address>
**issue (code-quality):** We've found these issues:
- Convert for loop into dictionary comprehension ([`dict-comprehension`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/dict-comprehension/))
- Lift code into else after jump in control flow [×2] ([`reintroduce-else`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/reintroduce-else/))
- Replace if statement with if expression [×2] ([`assign-if-exp`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/assign-if-exp/))
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| parsed: ParseResult | ||
| if isinstance(url, tuple): | ||
| parsed = url | ||
| else: | ||
| from .fs import is_pathname_valid |
There was a problem hiding this comment.
suggestion: Type checking for tuple in absurl is fragile and may not cover all ParseResult cases.
Use isinstance(url, ParseResult) instead of checking for tuple, as ParseResult is a namedtuple subclass. This will improve clarity and correctness.
| parsed: ParseResult | |
| if isinstance(url, tuple): | |
| parsed = url | |
| else: | |
| from .fs import is_pathname_valid | |
| if isinstance(url, ParseResult): | |
| parsed = url | |
| else: | |
| from .fs import is_pathname_valid |
|
|
||
|
|
||
| def path_get(obj, path, defaultvalue=None, path_of_obj=()): | ||
| def path_get(obj: JsonValue, path: Optional[Sequence[PathElement]], defaultvalue: JsonValue = None, path_of_obj: Tuple[PathElement, ...] = ()) -> JsonValue: |
There was a problem hiding this comment.
issue (bug_risk): String conversion for dict keys in path_get may cause unexpected KeyErrors.
Accessing dict keys with obj[str(path[0])] can fail if the keys are not strings. Use obj[path[0]] or ensure key types are consistent.
| seq.append({}) | ||
|
|
||
| def safe_idx(seq, index): | ||
| def safe_idx(seq: Sequence[PathElement], index: int) -> Optional[Type[int]]: |
There was a problem hiding this comment.
suggestion: safe_idx's return type is misleading and may cause confusion.
The annotation should be Optional[type] to accurately reflect that the return value can be the type of any element in the sequence, not just int.
| def safe_idx(seq: Sequence[PathElement], index: int) -> Optional[Type[int]]: | |
| def safe_idx(seq: Sequence[PathElement], index: int) -> Optional[type]: |
| if self._spec_string: | ||
| from .util.formats import parse_spec | ||
|
|
||
| self.specification = parse_spec(self._spec_string, self.url) | ||
| self.specification = parse_spec(self._spec_string, self.url.path) |
There was a problem hiding this comment.
issue (bug_risk): Passing self.url.path to parse_spec may not be sufficient for remote URLs.
Passing only the path may omit necessary information for remote URLs. Use the full URL or ParseResult to preserve context for accurate format detection.
|
|
||
|
|
||
| def urlresource(url): | ||
| def urlresource(url: ParseResult) -> str: |
There was a problem hiding this comment.
issue (complexity): Consider simplifying urlresource() using ParseResult._replace and unifying the cache logic in fetch_url and fetch_url_text.
- Simplify urlresource() by using
ParseResult._replace()instead of list-slicing and casts - Unify your “text” and “parsed” caches: always pass the same cache dict to both
fetch_url_text and fetch_url, so you can drop thetext_cache splitting logic entirely
from urllib.parse import ParseResult
def urlresource(url: ParseResult) -> str:
"""Keep only scheme, netloc, path; drop params, query, fragment."""
return url._replace(params='', query='', fragment='').geturl()def fetch_url_text(
url: ParseResult,
cache: Optional[Dict[str, Tuple[str, Optional[str]]]] = None,
encoding: Optional[str] = None
) -> Tuple[str, Optional[str]]:
if cache is None:
cache = {}
text_key = "text_" + urlresource(url)
if text_key in cache:
return cache[text_key]
# … load content, set content_type …
cache[text_key] = (content, content_type)
return content, content_typedef fetch_url(
url: ParseResult,
cache: Optional[Dict[Union[str, Tuple[str, bool]], JsonValue]] = None,
encoding: Optional[str] = None,
strict: bool = True
) -> JsonValue:
if cache is None:
cache = {}
parsed_key = (urlresource(url), strict)
if parsed_key in cache:
return cache[parsed_key].copy()
content, content_type = fetch_url_text(url, cache, encoding)
result = parse_spec(content, url.path, content_type=content_type)
if not strict:
result = stringify_keys(result)
cache[parsed_key] = result
return result.copy()This removes the cast noise in urlresource and the local text_cache + filtering loop in fetch_url while preserving identical behavior.
| path_str = path_str.replace("/", "~1") | ||
| return path_str |
There was a problem hiding this comment.
suggestion (code-quality): Inline variable that is immediately returned (inline-immediately-returned-variable)
| path_str = path_str.replace("/", "~1") | |
| return path_str | |
| return path_str.replace("/", "~1") |
| path: List[str] = [] | ||
| for rc in recursions: | ||
| path.append("{}#/{}".format(rc[0], "/".join(rc[1]))) | ||
| path = "\n".join(path) | ||
| path.append("{}#/{}".format(rc[0], "/".join(str(p) for p in rc[1]))) |
There was a problem hiding this comment.
suggestion (code-quality): We've found these issues:
- Convert for loop into list comprehension (
list-comprehension) - Replace call to format with f-string (
use-fstring-for-formatting)
| path: List[str] = [] | |
| for rc in recursions: | |
| path.append("{}#/{}".format(rc[0], "/".join(rc[1]))) | |
| path = "\n".join(path) | |
| path.append("{}#/{}".format(rc[0], "/".join(str(p) for p in rc[1]))) | |
| path: List[str] = [ | |
| f'{rc[0]}#/{"/".join(str(p) for p in rc[1])}' for rc in recursions | |
| ] |
| """ | ||
| res_list = list(url)[0:3] + [None, None, None] | ||
| return parse.ParseResult(*res_list).geturl() | ||
| res_list: List[Optional[str]] = list(url)[0:3] + [None, None, None] |
There was a problem hiding this comment.
suggestion (code-quality): Replace a[0:x] with a[:x] and a[x:len(a)] with a[x:] (remove-redundant-slice-index)
| res_list: List[Optional[str]] = list(url)[0:3] + [None, None, None] | |
| res_list: List[Optional[str]] = list(url)[:3] + [None, None, None] |
- Update JsonValue type alias to allow Mapping[Union[str, int], ...]
instead of just Mapping[str, ...] to support integer keys
- Remove str() cast from path_get that was breaking integer key access
- Fix translator.py type annotation for value assignment
- Remove unused type:ignore comments that are no longer needed
This fixes test failures in test_util_path.py where mappings have
integer keys like {3: 'value'}.
All mypy checks still pass with strict configuration.
1573de5 to
a1c9379
Compare
- Allow base_url to be None in split_url_reference and _skip_reference to properly handle RefResolver without a URL context - Remove early return when base_url is None to allow errors to propagate as expected by test_resolver_noname - Add None check before accessing base_url.path in _skip_reference - Fix test_valid_pathname to use 300 characters instead of 256 to ensure it fails on all platforms (some systems allow 256-char names) All tests now pass: 134 passed, 18 skipped (excluding network tests)
for more information, see https://pre-commit.ci
This commit adds complete type annotations to all Python files in the prance codebase and configures mypy with strict type checking settings.
Key changes:
Configure mypy in pyproject.toml with strict settings:
Add type annotations to all modules:
Type design principles:
Mypy configuration:
All type checking now passes: mypy prance/ returns success.
Fixes # .
Changes proposed in this PR:
Summary by Sourcery
Add comprehensive type annotations across the prance codebase and enforce strict static type checking with MyPy
Enhancements:
Build: