Added support for random weighted sampling for unbalanced datasets #284
Draft
hummuscience wants to merge 1 commit into
Collaborator
Thanks for the PR @hummuscience! Happy to take a closer look soon. I'm a bit swamped until mid-May with end of semester/deadlines, but after that let's definitely plan to meet and discuss further (both this PR and the top_view_mouse model). This work will actually dovetail quite nicely with the COCO input for heterogeneous datasets issue.
This is a draft pull request for adding the weighted random sampling option as suggested here: #158 (comment)
I am not very experienced with coding and good practices (typical biologist background :D), so I put this together with the help of an LLM.
For some reason, I was unable to pass the config setting to the function. Am I missing something? That is why it is currently enabled by default. Technically, even when it's turned on, it should still work normally for data that is balanced, right?
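To make the balanced-data question concrete, here is a minimal sketch of inverse-frequency weighting. It is not the code in this PR: the function name `inverse_frequency_weights` and the group labels are hypothetical, and what counts as a "group" (video, session, dataset) is an assumption. It only shows that when every group has the same number of items, all weights come out equal, so the sampling is uniform.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one sampling weight per item: 1 / (number of items sharing its label).

    `labels` is a hypothetical list with one group label per training item.
    """
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


# Balanced case: every group has two items, so every weight is identical.
balanced = ["a", "b", "a", "b"]
balanced_weights = inverse_frequency_weights(balanced)

# Unbalanced case: the single "b" item gets three times the weight of each "a" item.
unbalanced = ["a", "a", "a", "b"]
unbalanced_weights = inverse_frequency_weights(unbalanced)

print(balanced_weights)
print(unbalanced_weights)
```

In a PyTorch pipeline, per-item weights like these are the kind of input that `torch.utils.data.WeightedRandomSampler` accepts. One difference remains even for balanced data: a weighted sampler that draws with replacement can repeat some items and skip others within an epoch, unlike a plain shuffle.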
I also added a test to check that the functionality works, though it would make more sense to add a test that takes an unbalanced dataset as input and checks that the outputs are correct, right?
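A test with unbalanced input could look roughly like the sketch below. It uses only the Python standard library as a stand-in for the real sampler, and the helper `sample_with_weights`, the group names, and the 90/10 split are all hypothetical. The idea is to draw many indices and check that the rare group shows up about as often as the common one.

```python
import random
from collections import Counter


def sample_with_weights(labels, num_samples, seed=0):
    """Draw item indices with replacement, weighting each item by 1 / group size.

    Hypothetical stand-in for the sampler under test.
    """
    counts = Counter(labels)
    weights = [1.0 / counts[label] for label in labels]
    rng = random.Random(seed)  # fixed seed keeps the test deterministic
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)


def test_unbalanced_groups_are_sampled_evenly():
    # 90 items from a common group and 10 from a rare group.
    labels = ["common"] * 90 + ["rare"] * 10
    indices = sample_with_weights(labels, num_samples=10_000, seed=0)
    drawn = Counter(labels[i] for i in indices)
    rare_fraction = drawn["rare"] / len(indices)
    # Without weighting the rare group would be ~10% of draws; with it, ~50%.
    assert 0.45 < rare_fraction < 0.55


test_unbalanced_groups_are_sampled_evenly()
```

The tolerance is deliberately loose so the check does not fail on ordinary sampling noise; the same structure should carry over to the real sampler by swapping out the helper.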
Also, I am not sure yet how well this works with the suggestion to use the COCO input for heterogeneous datasets: #263
As I said, I don't have much experience here and would love some input on how to do this right.
I am also open to a meeting (as @themattinthehatt suggested) to discuss this, as well as adding the top_view_mouse model to LP.