Extend as JSON schema to include list[Any]#12
Conversation
bentsku
left a comment
There was a problem hiding this comment.
I think this looks good, just added some question/comments but I might have missed something. If that's the case I'll just approve, I'm happy to be contradicted here 😄
| @register_schema | ||
| class DictAsJSONSchema(Schema): | ||
| """An Avro string schema representing a Python Dict[str, Any] or List[Dict[str, Any]] assuming JSON serialization""" | ||
| """ |
There was a problem hiding this comment.
question: should we rename the DictAsJSONSchema to something like TypeAsJSONSchema to indicates it can map more?
There was a problem hiding this comment.
Good suggestion! Done it.
|
|
||
| def is_logically_json(py_type: Type) -> bool: | ||
| """Returns whether a given type is logically a JSON and can be serialized as such""" | ||
| return _is_list_any(py_type) or _is_list_dict_str_any(py_type) or _is_dict_str_any(py_type) |
There was a problem hiding this comment.
question: is it worth keeping the 3 different functions as they are? Here, if we're to match the last condition, we will call origin = get_origin(py_type) three times. Would it be worth to call those things only once and pass it to those functions? Or are they used elsewhere?
If they're not used, I could see them being merged into one and we could use recursion on them?
Something like:
def is_logically_json(py_type: Type) -> bool:
origin = get_origin(py_type)
if not inspect.isclass(origin):
return False
args = get_args(py_type)
if issubclass(origin, list):
if not args:
return False
elif args == (Any,):
return True
return is_logically_json(args[0])
elif issubclass(origin, dict):
return args == (str, Any)Or something like that? I'm not sure it's really clearer, with some comments it would be better but at least we're calling the methods only once every time and we can add more case more easily, and the recursion would handle all cases of "logically JSON"? it might be the case already though
I feel like this would match the Dict[str, List[Any]] for example which I'm not sure currently it would work
bentsku
left a comment
There was a problem hiding this comment.
LGTM, after discussing offline the recursion method is more permissive and we'd like to have more control over that.
I think if we need to add another check in is_logically_json we should look into refactoring that code, as we're calling the same methods over py_type a lot of type, and we could probably improve that, but for now that's good 👍
This framework has the feature of reducing specific types (
dict[str, Any]andlist[dict[str, Any]) to astring(orbytes), assuming the fields are dumped to JSON.With this PR, we also extend the same behavior to
list[Any], as we did find many occurrences in LocalStack when this would be very useful!