Map for adding cross validation training and evaluation #86

@jmamath

Hello and thank you for this amazing package.

Instead of using replicates, I would like to add a cross-validation training and evaluation scheme based on the domain metadata.

Say a dataset has domains A, B, and C. I would like to:

  • train on 70% of the data sampled from A and B, then evaluate in-distribution on the remaining 30% from A and B and out-of-distribution on C.
  • train on 70% of the data sampled from B and C, then evaluate in-distribution on the remaining 30% from B and C and out-of-distribution on A.
  • train on 70% of the data sampled from C and A, then evaluate in-distribution on the remaining 30% from C and A and out-of-distribution on B.

Finally, average the in-distribution metric and the out-of-distribution metric, each over the three folds, to get the final performance.

Here the 70-30 split is arbitrary and should be modifiable.
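
To make the request concrete, here is a minimal sketch of the splitting and averaging I have in mind. It uses only NumPy and does not touch the package's API; the function names `leave_one_domain_out_splits` and `average_metrics`, and the `train_frac` and `seed` parameters, are hypothetical names I made up for illustration. It assumes one domain label per example and samples the training fraction from the pooled in-distribution domains rather than stratifying per domain.

```python
import numpy as np


def leave_one_domain_out_splits(domain_ids, train_frac=0.7, seed=0):
    """Yield (held_out, train_idx, id_eval_idx, ood_eval_idx) per domain.

    Hypothetical helper, not part of any library. `domain_ids` holds one
    domain label per example; `train_frac` is the share of the pooled
    in-distribution examples used for training.
    """
    domain_ids = np.asarray(domain_ids)
    rng = np.random.default_rng(seed)
    for held_out in np.unique(domain_ids):
        # Every example of the held-out domain is used for OOD evaluation.
        ood_eval_idx = np.flatnonzero(domain_ids == held_out)
        # Shuffle the remaining examples, then cut them into train / ID eval.
        in_idx = rng.permutation(np.flatnonzero(domain_ids != held_out))
        n_train = int(round(train_frac * len(in_idx)))
        yield held_out, in_idx[:n_train], in_idx[n_train:], ood_eval_idx


def average_metrics(fold_results):
    """Average (id_metric, ood_metric) pairs over all folds."""
    id_scores, ood_scores = zip(*fold_results)
    return float(np.mean(id_scores)), float(np.mean(ood_scores))


if __name__ == "__main__":
    # Toy domain labels standing in for real domain metadata.
    domains = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
    for held_out, train_idx, id_idx, ood_idx in leave_one_domain_out_splits(domains):
        print(held_out, len(train_idx), len(id_idx), len(ood_idx))
```

The index arrays could then be used to build per-fold training and evaluation subsets (for example with `torch.utils.data.Subset`). How to obtain the per-example domain labels from the package's metadata, possibly through the grouper, is the part I am unsure about.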

I am just starting to explore the package and have so far only replicated the ERM result on the camelyon17 dataset.

It seems that the grouper object might be a good starting point for implementing the procedure above, but I still lack a high-level overview of the code. How would you do this?
