Use ontoportal-client classes in BioPortal and AgroPortal implementations#204
Use ontoportal-client classes in BioPortal and AgroPortal implementations#204pkalita-lbl merged 5 commits intomainfrom
Conversation
…ions These changes introduce an `OntoPortalInterface` base class. It expects subclasses to populate its `OntoPortalClientClass` and `api_key_name` class variables. `OntoPortalInterface` then handles all of the implementation details of constructing the `OntoPortalClient` instance and using it to make requests.
cmungall
left a comment
There was a problem hiding this comment.
I wasn't thinking of OntoPortal as an interface, but rather as a parent class for some implementations.
A few other things to bear in mind:
- ontoportal-client is still young, I figure the methods it exposes may vary over time, which will mean a bit more coordinating and perhaps some version pinning
- we don't have a great testing strategy for bioportal and friends yet. I run the tests locally using my API key, but gh actions will skip them (I think). We need to set up Mock unit tests, and have separate integration tests. This will all come in time but just be aware that there is more potential to temporarily break things here!
|
I agree there is some potential for confusion here about Definitely on the same page as you about testing. I also run them locally using my API key, and you're right that they are skipped when run in the GitHub workflow. Mocking can be a headache but it's better than nothing. I can open an issue about that. |
|
The ontoportal client has some pretty slick testing set up, you can see how the combination of api keys set as GitHub actions secrets and the unified lookup with pystow simplifies it pretty nicely |
|
That's also a good idea @cthoyt. I'll take a closer look at that when I work on #205. I've updated this branch to replace Maybe long term we want to put the BioPortal, AgroPortal, etc implementations into the |
|
@pkalita-lbl just pushed a new ontoportal-client release (https://github.com/cthoyt/ontoportal-client/releases/tag/v0.0.3), that also does more of the things that you have in your get_response and get_json wrappers to reduce more code |
| response = self._get_response(*args, **kwargs) | ||
| return response.json() | ||
|
|
||
| def ontologies(self) -> Iterable[CURIE]: |
There was a problem hiding this comment.
the type annotation here for the result is misleading. the acronyms are prefixes, not CURIEs
There was a problem hiding this comment.
good point, I think this is something that may need a deeper fix as part of a separate PR through
There was a problem hiding this comment.
I think I'm with Chris on this one. It's not good, but it's also not an isolated thing and would probably be handled better in a separate PR.
| self.client = self.OntoPortalClientClass(api_key=api_key) | ||
| if not self.ontoportal_client_class: | ||
| raise NotImplementedError("ontoportal_client_class not specified") | ||
| api_key = get_apikey_value(self.ontoportal_client_class.name) |
There was a problem hiding this comment.
the API key getting is handled by the class itself, so you can skip this completely from the OAK side
There was a problem hiding this comment.
Yeah I saw that ontoportal_client manages API keys as well but as I understand it that would require the user to create either an environment variable or a file under ~/.config. The OAK command line interface has a set-apikey subcommand that I think we need to continue to support. So I thought it was better to continue to let OAK manage its own key storage.
There was a problem hiding this comment.
In #142 I made it much more integrated with PyStow, since it's a pain to have to use several competing systems when there's already something ready to go out of the box. If someone wants to put effort into unmucking the testing (#173), then it will be possible to finish that PR, since right now the logging is too much of a mess to even figure it out. If I had to guess though, it might be because PRs from forks don't get access to the secrets in the GHA workflows
There was a problem hiding this comment.
I can talk to Sujay about #173 since he currently is assigned and see if has any progress on it. If not maybe I can take it over from him. Also after I commented on #142 Chris gave me admin rights on this repo, so I'd be happy to work with you on that one. But as far as these changes go are you okay with decoupling them from the PyStow work? I'm not sure how much effort that will be, but does it make sense to just get this in now?
There was a problem hiding this comment.
All of the code is simple, the only real effort is coordination :p
There was a problem hiding this comment.
I can clean this up later, it’s not a blocker for your PR
There was a problem hiding this comment.
Definitely appreciate all your feedback. I'm going to merge this now, but please get in touch with me next week (obo-community slack maybe?) and we can try to coordinate on getting #142 going.
|
|
||
| def _get_response(self, path: str, *args, raise_for_status=True, **kwargs): | ||
| def _get_response(self, *args, **kwargs): | ||
| check_limit() |
There was a problem hiding this comment.
do you know if the rate limit is the same for all instances? we could also push the limit checking upstream
There was a problem hiding this comment.
It's basically a global rate limiter: https://github.com/INCATools/ontology-access-kit/blob/main/src/oaklib/utilities/rate_limiter.py which seems perhaps too conservative, but there may be some good reasons for it that I'm not aware of.
There was a problem hiding this comment.
By instances, I mean bioportal, agoportal, etc. I know BioPortal says 15 requests per second on https://www.bioontology.org/wiki/Annotator_Optimizing_and_Troublehooting, but finding this kind of specific documentation for all portals is annoying
There was a problem hiding this comment.
Ah I see, no I don't know off hand if all ontoportal instances have the same rate limits or if it's configurable.
|
Yes, consider this a first pass. Globally it's conservative as some of the
individual queries launched at the triplestore endpoints can be
mega-queries, but it could be tuned for each category of endpoint, then by
endpoint specifics
…On Fri, Jul 22, 2022 at 10:07 AM Patrick Kalita ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In src/oaklib/implementations/ontoportal/ontoportal_implementation_base.py
<#204 (comment)>
:
>
def prefix_map(self) -> PREFIX_MAP:
# TODO
return {}
- def _get_response(self, path: str, *args, raise_for_status=True, **kwargs):
+ def _get_response(self, *args, **kwargs):
check_limit()
It's basically a global rate limiter:
https://github.com/INCATools/ontology-access-kit/blob/main/src/oaklib/utilities/rate_limiter.py
which seems perhaps too conservative, but there may be some good reasons
for it that I'm not aware of.
—
Reply to this email directly, view it on GitHub
<#204 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAMMOKZKLQMO4MWAN6R37TVVLIMDANCNFSM54JDSJRQ>
.
You are receiving this because your review was requested.Message ID:
***@***.***>
|
|
I think using pystow universally for api keys is good. Up to you both how
best to go about it ie merge other pr, redo work on this one
…On Fri, Jul 22, 2022 at 12:40 PM Charles Tapley Hoyt < ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In src/oaklib/implementations/ontoportal/ontoportal_implementation_base.py
<#204 (comment)>
:
> @@ -50,22 +49,18 @@ def __post_init__(self):
if self.focus_ontology is None:
if self.resource:
self.focus_ontology = self.resource.slug
- if not self.OntoPortalClientClass:
- raise NotImplementedError("OntoPortalClientClass not specified")
- api_key = get_apikey_value(self.api_key_name)
- self.client = self.OntoPortalClientClass(api_key=api_key)
+ if not self.ontoportal_client_class:
+ raise NotImplementedError("ontoportal_client_class not specified")
+ api_key = get_apikey_value(self.ontoportal_client_class.name)
In #142 <#142> I
made it much more integrated with PyStow, since it's a pain to have to use
several competing systems when there's already something ready to go out of
the box. If someone wants to put effort into unmucking the testing (#173
<#173>), then it
will be possible to finish that PR, since right now the logging is too much
of a mess to even figure it out. If I had to guess though, it might be
because PRs from forks don't get access to the secrets in the GHA workflows
—
Reply to this email directly, view it on GitHub
<#204 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAMMOMOVKUNGBHMEY4E6WDVVL2MBANCNFSM54JDSJRQ>
.
You are receiving this because your review was requested.Message ID:
***@***.***>
|
These changes introduce an
OntoPortalInterfacebase class. It expects subclasses to populate theOntoPortalClientClassandapi_key_nameclass variables.OntoPortalInterfacethen handles all of the implementation details of constructing theOntoPortalClientinstance and using it to make requests.Fixes #168