Skip to content

Use ontoportal-client classes in BioPortal and AgroPortal implementations#204

Merged
pkalita-lbl merged 5 commits intomainfrom
issue-168
Jul 22, 2022
Merged

Use ontoportal-client classes in BioPortal and AgroPortal implementations#204
pkalita-lbl merged 5 commits intomainfrom
issue-168

Conversation

@pkalita-lbl
Copy link
Copy Markdown
Collaborator

These changes introduce an OntoPortalInterface base class. It expects subclasses to populate the OntoPortalClientClass and api_key_name class variables. OntoPortalInterface then handles all of the implementation details of constructing the OntoPortalClient instance and using it to make requests.

Fixes #168

…ions

These changes introduce an `OntoPortalInterface` base class. It expects subclasses to populate its `OntoPortalClientClass` and `api_key_name` class variables. `OntoPortalInterface` then handles all of the implementation details of constructing the `OntoPortalClient` instance and using it to make requests.
@pkalita-lbl pkalita-lbl marked this pull request as ready for review July 21, 2022 22:42
@pkalita-lbl pkalita-lbl requested review from cmungall and cthoyt July 21, 2022 22:42
Copy link
Copy Markdown
Collaborator

@cmungall cmungall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't thinking of OntoPortal as an interface, but rather as a parent class for some implementations.

A few other things to bear in mind:

  • ontoportal-client is still young, I figure the methods it exposes may vary over time, which will mean a bit more coordinating and perhaps some version pinning
  • we don't have a great testing strategy for bioportal and friends yet. I run the tests locally using my API key, but gh actions will skip them (I think). We need to set up Mock unit tests, and have separate integration tests. This will all come in time but just be aware that there is more potential to temporarily break things here!

@pkalita-lbl
Copy link
Copy Markdown
Collaborator Author

I agree there is some potential for confusion here about OntoPortalInterface; it's not exactly an interface in the same way the other classes in the interfaces package are. I considered putting it in the implementations package, but it's also not quite an implementation that can can be used on its own. It's a bit neither-here-nor-there, but thinking about it now it probably belongs more in implementations. I'll make that change before merging.

Definitely on the same page as you about testing. I also run them locally using my API key, and you're right that they are skipped when run in the GitHub workflow. Mocking can be a headache but it's better than nothing. I can open an issue about that.

@cthoyt
Copy link
Copy Markdown
Collaborator

cthoyt commented Jul 22, 2022

The ontoportal client has some pretty slick testing set up, you can see how the combination of api keys set as GitHub actions secrets and the unified lookup with pystow simplifies it pretty nicely

@pkalita-lbl
Copy link
Copy Markdown
Collaborator Author

That's also a good idea @cthoyt. I'll take a closer look at that when I work on #205.

I've updated this branch to replace oaklib.interfaces.ontoportal_interface.OntoPortalInterface with oaklib.implementations.ontoportal.ontoportal_implementation_base.OntoPortalImplementationBase. I think that makes it more clear what the purpose of that class is.

Maybe long term we want to put the BioPortal, AgroPortal, etc implementations into the oaklib.implementations.ontoportal package as well?

@cthoyt
Copy link
Copy Markdown
Collaborator

cthoyt commented Jul 22, 2022

@pkalita-lbl just pushed a new ontoportal-client release (https://github.com/cthoyt/ontoportal-client/releases/tag/v0.0.3), that also does more of the things that you have in your get_response and get_json wrappers to reduce more code

Comment thread src/oaklib/implementations/bioportal/agroportal_implementation.py Outdated
Comment thread src/oaklib/implementations/bioportal/agroportal_implementation.py Outdated
response = self._get_response(*args, **kwargs)
return response.json()

def ontologies(self) -> Iterable[CURIE]:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the type annotation here for the result is misleading. the acronyms are prefixes, not CURIEs

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, I think this is something that may need a deeper fix as part of a separate PR through

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm with Chris on this one. It's not good, but it's also not an isolated thing and would probably be handled better in a separate PR.

self.client = self.OntoPortalClientClass(api_key=api_key)
if not self.ontoportal_client_class:
raise NotImplementedError("ontoportal_client_class not specified")
api_key = get_apikey_value(self.ontoportal_client_class.name)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the API key getting is handled by the class itself, so you can skip this completely from the OAK side

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I saw that ontoportal_client manages API keys as well but as I understand it that would require the user to create either an environment variable or a file under ~/.config. The OAK command line interface has a set-apikey subcommand that I think we need to continue to support. So I thought it was better to continue to let OAK manage its own key storage.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #142 I made it much more integrated with PyStow, since it's a pain to have to use several competing systems when there's already something ready to go out of the box. If someone wants to put effort into unmucking the testing (#173), then it will be possible to finish that PR, since right now the logging is too much of a mess to even figure it out. If I had to guess though, it might be because PRs from forks don't get access to the secrets in the GHA workflows

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can talk to Sujay about #173 since he currently is assigned and see if has any progress on it. If not maybe I can take it over from him. Also after I commented on #142 Chris gave me admin rights on this repo, so I'd be happy to work with you on that one. But as far as these changes go are you okay with decoupling them from the PyStow work? I'm not sure how much effort that will be, but does it make sense to just get this in now?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the code is simple, the only real effort is coordination :p

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can clean this up later, it’s not a blocker for your PR

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely appreciate all your feedback. I'm going to merge this now, but please get in touch with me next week (obo-community slack maybe?) and we can try to coordinate on getting #142 going.


def _get_response(self, path: str, *args, raise_for_status=True, **kwargs):
def _get_response(self, *args, **kwargs):
check_limit()
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you know if the rate limit is the same for all instances? we could also push the limit checking upstream

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's basically a global rate limiter: https://github.com/INCATools/ontology-access-kit/blob/main/src/oaklib/utilities/rate_limiter.py which seems perhaps too conservative, but there may be some good reasons for it that I'm not aware of.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By instances, I mean bioportal, agoportal, etc. I know BioPortal says 15 requests per second on https://www.bioontology.org/wiki/Annotator_Optimizing_and_Troublehooting, but finding this kind of specific documentation for all portals is annoying

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, no I don't know off hand if all ontoportal instances have the same rate limits or if it's configurable.

@cmungall
Copy link
Copy Markdown
Collaborator

cmungall commented Jul 22, 2022 via email

@cmungall
Copy link
Copy Markdown
Collaborator

cmungall commented Jul 22, 2022 via email

@pkalita-lbl pkalita-lbl merged commit 0d9f355 into main Jul 22, 2022
@pkalita-lbl pkalita-lbl deleted the issue-168 branch July 22, 2022 23:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Externalize OntoPortal client functions (e.g., for BioPortal, AgroPortal, EcoPortal)

3 participants