Conversation
Summary of Changes

Hello @safaricd, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed: this pull request expands the telemetry data collected during model operations.
Code Review
This pull request adds tracking for torch and sklearn versions, as well as the GPU type, to the telemetry events. The introduction of a ModelCallEvent base class is a nice refactoring that improves code structure by reducing duplication between FitEvent and PredictEvent. My review includes a suggestion for the new _get_gpu_type function to make its return values less ambiguous and to improve its robustness. Overall, these are good additions for richer telemetry.
Very nice! Minor suggestions:
- [med] Add TPU detection?
- [high] Add package versions: `tabpfn-extensions`, `numpy`, `pandas`, maybe `autogluon` (would help us understand if people run us from AG as well?)
- [med] Detect Colab / Kaggle / notebook context
- [important] `torch.version.cuda`, `torch.backends.cudnn.version()`
- [important] Python version!
- [important] `"os": platform.system(),`
noahho left a comment
see previous feedback, adding more info
Added these too.
We already have methods for detecting whether someone runs inside a notebook or not. However, how would this information actually be used, from an analytics point of view?
We have tracked this since day 1.
> How would this information be used after all, from an analytics point of view?

I think that having information about e.g. packages/dependencies is very useful, but is information about whether someone runs TabPFN on a Mac vs. Windows relevant to us?
The system (Windows / Mac / Unix) is very important for development; we already have a bunch of code to support these, and it's good to know how far we should go. Colab tells us about the kind of workload (production or experimentation?). Detecting Kaggle specifically would be great for GTM.
@noahho just added the following additional properties tracked across all events:
With that being said, I'll merge the changes and push a new PyPI version. Anything else you'd like to add?
Change Description
With this change, we track additional properties for `fit` and `predict` events:

- `torch` version
- `sklearn` version

On the GDPR side, we are safe tracking these, as they will not allow us to fingerprint users.
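The review mentions a `ModelCallEvent` base class shared by `FitEvent` and `PredictEvent`. A hedged sketch of how such a base class could carry the new version properties for both events; the class and field names mirror those mentioned in the review, but the real implementation in this PR may differ.

```python
from dataclasses import dataclass, field
from importlib.metadata import version, PackageNotFoundError


def _dist_version(name: str) -> str:
    """Installed distribution version, or a sentinel if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return "not_installed"


@dataclass
class ModelCallEvent:
    """Shared properties for fit/predict telemetry events."""
    torch_version: str = field(
        default_factory=lambda: _dist_version("torch"))
    sklearn_version: str = field(
        default_factory=lambda: _dist_version("scikit-learn"))


@dataclass
class FitEvent(ModelCallEvent):
    name: str = "fit"


@dataclass
class PredictEvent(ModelCallEvent):
    name: str = "predict"
```

Putting the version lookups in the base class removes the duplication between the two event types that the review calls out.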