This change will break the API
Currently, the type looks like this:
    class CreateArcRequest(BaseModel):
        rdi: Annotated[str, Field(description="Research Data Infrastructure identifier")]
        arc: Annotated[dict, Field(description="ARC definition in RO-Crate JSON format")]
It is proposed to change the arc type from dict to str.
We're currently using dict as the arc type because we cannot use ARCtrl.ARC here directly. Out of the box, the fields of a pydantic BaseModel can only be other BaseModels or basic types. The dict type is the closest pydantic-compatible approximation to ARC that we can achieve. With this approach, pydantic can at least check whether the incoming JSON can be parsed into a dict.
But this choice makes it cumbersome and expensive to create an object of type CreateArcRequest: we need the ARC as a dict. Typically this requires the conversion chain arc->str->dict. Only then can we construct the CreateArcRequest, which is immediately converted back into a JSON string:
    arc_dict = json.loads(arc.ToROCrateJsonString())
    request = CreateArcRequest(rdi="test", arc=arc_dict)
    body = request.model_dump_json()
If arc were a str, the JSON parsing step would disappear:
    arc_json = arc.ToROCrateJsonString()
    request = CreateArcRequest(rdi="test", arc=arc_json)
    body = request.model_dump_json()
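Because the field type determines the shape of the request body, the change is visible on the wire. The following is a minimal sketch of the two payload shapes, assuming pydantic v2. The model names CreateArcRequestDict and CreateArcRequestStr and the tiny stand-in RO-Crate string are hypothetical; the string takes the place of a real arc.ToROCrateJsonString() result so that the sketch runs without ARCtrl.

```python
import json
from typing import Annotated

from pydantic import BaseModel, Field, ValidationError

# Stand-in for arc.ToROCrateJsonString(); a real RO-Crate document is much larger.
arc_json = '{"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": []}'


class CreateArcRequestDict(BaseModel):
    """Current shape: arc is a parsed JSON object."""

    rdi: Annotated[str, Field(description="Research Data Infrastructure identifier")]
    arc: Annotated[dict, Field(description="ARC definition in RO-Crate JSON format")]


class CreateArcRequestStr(BaseModel):
    """Proposed shape: arc is the raw RO-Crate JSON string."""

    rdi: Annotated[str, Field(description="Research Data Infrastructure identifier")]
    arc: Annotated[str, Field(description="ARC definition in RO-Crate JSON format")]


# Current: the client has to parse the string before it can build the request.
body_dict = CreateArcRequestDict(rdi="test", arc=json.loads(arc_json)).model_dump_json()

# Proposed: the string is passed through unchanged.
body_str = CreateArcRequestStr(rdi="test", arc=arc_json).model_dump_json()

# In the current body "arc" is a nested JSON object ...
assert isinstance(json.loads(body_dict)["arc"], dict)
# ... in the proposed body it is a JSON string that itself contains JSON.
assert json.loads(body_str)["arc"] == arc_json

# A body in the old shape is rejected by the new model: this is the API break.
try:
    CreateArcRequestStr.model_validate_json(body_dict)
    old_body_accepted = True
except ValidationError:
    old_body_accepted = False
```

Clients that still send arc as a JSON object would therefore get a validation error after the change, so client and server have to be updated together.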
This could also simplify things on the server side. Currently the ARC JSON string is in fact parsed twice: first by pydantic/FastAPI, which converts it into a dict, and then by ARCtrl, which converts the dict into an ARC. There are several server-side approaches to avoid this:
1. Just change dict into str, thus skipping any automatic pydantic validation of the field, and parse the string manually into an ARC.
2. Don't use str, but ARCtrl.ARC itself as the type of the arc field. This requires a pydantic field_validator and field_serializer:
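    class ArcModel(BaseModel):
        # pydantic needs this to accept a non-pydantic class such as ARCtrl.ARC as a field type
        model_config = {"arbitrary_types_allowed": True}

        rdi: str
        arc: ARCtrl.ARC

        @field_validator("arc", mode="before")
        @classmethod
        def parse_arc(cls, v: Any):
            # accept an existing ARC object or an RO-Crate JSON string
            if isinstance(v, ARCtrl.ARC):
                return v
            if isinstance(v, str):
                return ARCtrl.ARC.FromROCrateJsonString(v)
            raise TypeError(f"Unsupported type for arc: {type(v)}")

        @field_serializer("arc")
        def serialize_arc(self, arc: ARCtrl.ARC):
            # same RO-Crate serializer as in the client-side snippets
            return json.loads(arc.ToROCrateJsonString())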
This has the downside that we always have to deal with ARCtrl.ARC objects. The development of SQL-to-ARC showed that in practice we may already have the JSON string at hand. So this would again introduce an unnecessary ARC->JSON->ARC conversion. For that reason I am against this solution.
3. It might also be possible to combine the str and ARCtrl.ARC approaches in the model by introducing a union: ARCtrl.ARC | str. This should be investigated further.
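As a starting point for that investigation, the union approach could look roughly like the sketch below, assuming pydantic v2. FakeArc is a hypothetical stand-in for ARCtrl.ARC that only mimics the two methods used in this text, so the sketch runs without ARCtrl. The as_arc helper is likewise an assumption, not an existing API. Whether FastAPI can generate an OpenAPI schema for such a field is not covered here and would need to be checked.

```python
import json
from typing import Any, Union

from pydantic import BaseModel, ConfigDict, field_serializer


class FakeArc:
    """Hypothetical stand-in for ARCtrl.ARC."""

    def __init__(self, graph: list[Any]) -> None:
        self.graph = graph

    @classmethod
    def FromROCrateJsonString(cls, s: str) -> "FakeArc":
        return cls(json.loads(s)["@graph"])

    def ToROCrateJsonString(self) -> str:
        return json.dumps({"@graph": self.graph})


class CreateArcRequest(BaseModel):
    # Needed because FakeArc (like any plain class) has no pydantic schema.
    model_config = ConfigDict(arbitrary_types_allowed=True)

    rdi: str
    # Equivalent to FakeArc | str on Python 3.10+.
    arc: Union[FakeArc, str]

    @field_serializer("arc")
    def serialize_arc(self, arc: Union[FakeArc, str]) -> str:
        # Always put the JSON string on the wire, whatever was stored.
        return arc if isinstance(arc, str) else arc.ToROCrateJsonString()

    def as_arc(self) -> FakeArc:
        # Parse lazily, only when a caller actually needs the object.
        if isinstance(self.arc, FakeArc):
            return self.arc
        return FakeArc.FromROCrateJsonString(self.arc)


# A caller that already holds the JSON string passes it through without parsing ...
from_str = CreateArcRequest(rdi="test", arc='{"@graph": []}')
# ... and a caller that holds an object passes the object.
from_obj = CreateArcRequest(rdi="test", arc=FakeArc([]))

# Both produce the same request body.
assert from_str.model_dump_json() == from_obj.model_dump_json()
```

With this shape, neither side is forced into a conversion it does not need: the string stays a string until as_arc is called, and an object is serialized only once, when the body is built.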