PiML scored test and embeddings - unique scenario #58

@sachinvs7

Hi, it was very clever to design PiML's scored testing so that it does not require the actual model object.

Context:
I have the input features, target response data, and the corresponding model predictions - all for a particular dataset and task.
The problem is that the features (one text, the rest non-text) are transformed into learned embeddings inside the original model. The text feature is embedded by a fine-tuned Hugging Face model, and the non-text features by an FT-Transformer. The original model then combines both sets of embeddings and makes predictions through a fusion MLP (3 components in total).

I am not concerned about the text feature. My plan is to pass in its fine-tuned n-dimensional representation as n columns (because the predicted probabilities depend on the full input X), and then apply scored testing to all the original non-text features as they are.
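To make the setup concrete, here is a minimal sketch of the table I have in mind. It uses only pandas and NumPy, with random placeholder data; the column names (`text_emb_0`, `age`, `income`, `target`, `prediction`) and the sizes are hypothetical, and no PiML call is shown because I am not sure which one fits this case.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_rows, n_dim = 6, 4  # hypothetical sizes, for illustration only

# Placeholder for the fine-tuned text embeddings: one n_dim vector per row.
text_emb = rng.normal(size=(n_rows, n_dim))

# Placeholder non-text features, kept in their original (pre-embedding) form.
non_text = pd.DataFrame({
    "age": rng.integers(18, 70, size=n_rows),
    "income": rng.normal(50_000, 10_000, size=n_rows),
})

# Placeholder target and the original model's predicted probabilities.
y_true = rng.integers(0, 2, size=n_rows)
y_prob = rng.uniform(size=n_rows)

# Expand the embedding into n_dim columns and place them next to the
# original non-text features, then attach target and prediction.
emb_cols = [f"text_emb_{i}" for i in range(n_dim)]
scored = pd.concat([pd.DataFrame(text_emb, columns=emb_cols), non_text], axis=1)
scored["target"] = y_true
scored["prediction"] = y_prob
```

Each row then carries the n embedding columns, the raw non-text features, the observed target, and the model's score, which is everything I have available without the model object.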

1. Not all features start off as embeddings, but the final predictions and probabilities are computed from embeddings. Am I at risk of misleading results, given that the original model also converts my non-text features into embeddings?
2. I would sincerely appreciate any thoughts on how to apply scored testing here, or PiML in general.

@ajzhanghk @ZebinYang @simoncos @CnBDM-Su
