PiML scored test and embeddings - unique scenario #58

Hi, very clever thinking on making PiML's scored testing work without the actual model object.

**Context:** 
I have the input features, the target response data, and the corresponding model predictions, all for one particular dataset and task.
The problem is that the features (one text, the rest non-text) are transformed into learned embeddings inside the original model. The text feature goes through a fine-tuned Hugging Face model, and the non-text features go through an FT-Transformer. The original model then brings both sets of embeddings together in a fusion MLP to make predictions (3 components in total).

I am not concerned about the text feature: I figured I can pass in its fine-tuned n-dimensional representation as n columns (because the predicted probabilities depend on the full input X), and then apply scored testing to all the original non-text features as they are.
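To make the layout concrete, here is a minimal sketch of the table I have in mind, using pandas and NumPy only. Everything in it is illustrative: the column names (`age`, `income`, `text_emb_*`, `target`, `prediction`), the helper `build_scored_frame`, and the random data are assumptions standing in for my real features, embeddings, and predictions. It deliberately stops before any PiML call.

```python
import numpy as np
import pandas as pd


def build_scored_frame(text_emb, non_text, y_true, y_prob):
    """Place the text-embedding columns next to the raw non-text features.

    text_emb : (n_rows, n_dim) array of fine-tuned text embeddings
    non_text : DataFrame of the original, untransformed non-text features
    y_true   : observed target values
    y_prob   : predictions produced by the original (fusion) model
    """
    # One column per embedding dimension (hypothetical naming scheme).
    emb_cols = [f"text_emb_{i}" for i in range(text_emb.shape[1])]
    emb_df = pd.DataFrame(text_emb, columns=emb_cols, index=non_text.index)

    out = pd.concat([non_text, emb_df], axis=1)
    out["target"] = np.asarray(y_true)
    out["prediction"] = np.asarray(y_prob)
    return out


# Random placeholder data, not real model output.
rng = np.random.default_rng(0)
n_rows, n_dim = 5, 4
non_text = pd.DataFrame(
    {
        "age": rng.integers(18, 80, n_rows),
        "income": rng.normal(50_000, 10_000, n_rows),
    }
)
text_emb = rng.normal(size=(n_rows, n_dim))
y_true = rng.integers(0, 2, n_rows)
y_prob = rng.uniform(size=n_rows)

scored = build_scored_frame(text_emb, non_text, y_true, y_prob)
print(scored.shape)  # (5, 8): 2 non-text + 4 embedding + target + prediction
```

The non-text columns stay in their original units here, while the prediction column comes from a model that only ever saw their FT-Transformer embeddings.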

**1. Not all features start off as embeddings, but the final predictions and probabilities are computed from embeddings. Am I at risk of misleading results, given that the original model also treats my non-text features as embeddings?
2. I would sincerely appreciate any thoughts on how I can apply scored testing here, or PiML in general.**

@ajzhanghk @ZebinYang @simoncos @CnBDM-Su 


