
Add generalized TextImputer implementation (Attention Mask & Seq2Seq not included) #554

Open
samoger wants to merge 1 commit into mmschlk:main from ddddxx1:text-imputer-draft-clean


samoger commented Jun 20, 2026


Summary

This draft PR introduces a generalized TextImputer implementation for
text-based Shapley explanations.

Included so far

  • Player strategies for:

    • subwords
    • words
    • named entities
    • syntactic chunks
    • sentences
  • Perturbation strategies for:

    • mask-token replacement
    • pad-token replacement
    • token removal
    • neutral replacement
    • WordNet-based neutral replacement
    • masked-language-model infilling
  • Target callables for:

    • encoder-only sequence classifiers
    • causal language models with multi-token target labels
  • Fast mock-based unit tests:

    • no model downloads
    • no real LLM inference
    • no NLTK downloads during import
    • 43 tests run in approximately 4 seconds
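To make the list above concrete, here is a minimal sketch of how a word-level player strategy, a mask-token or token-removal perturbation, and a mock target callable could fit together. It is not the code in this PR: `word_players`, `perturb`, `mock_target`, and `coalition_value` are hypothetical names chosen for illustration, the `[MASK]` string is an assumed mask token, and whitespace splitting stands in for a real tokenizer.

```python
from typing import Callable, List, Sequence


def word_players(text: str) -> List[str]:
    """Hypothetical player strategy: each whitespace-delimited word is one player."""
    return text.split()


def perturb(
    players: Sequence[str],
    coalition: Sequence[bool],
    strategy: str = "mask",
    mask_token: str = "[MASK]",  # assumed mask token, not taken from the PR
) -> str:
    """Rebuild the text for one coalition.

    Players in the coalition are kept. Absent players are either replaced
    by the mask token ("mask") or dropped entirely ("remove").
    """
    if strategy not in ("mask", "remove"):
        raise ValueError(f"unknown strategy: {strategy!r}")
    if len(players) != len(coalition):
        raise ValueError("coalition must have one entry per player")
    kept: List[str] = []
    for word, present in zip(players, coalition):
        if present:
            kept.append(word)
        elif strategy == "mask":
            kept.append(mask_token)
        # strategy == "remove": the absent word is simply skipped
    return " ".join(kept)


def mock_target(text: str) -> float:
    """Mock target callable: counts occurrences of the word "good".

    Stands in for a real classifier score so no model is downloaded or run.
    """
    return float(text.split().count("good"))


def coalition_value(
    text: str,
    coalition: Sequence[bool],
    target: Callable[[str], float],
    strategy: str = "mask",
) -> float:
    """Value of one coalition: the target evaluated on the perturbed text."""
    return target(perturb(word_players(text), coalition, strategy))
```

With the input `"the movie was good"`, the coalition `[True, False, True, True]` yields `"the [MASK] was good"` under mask-token replacement and `"the was good"` under token removal. Swapping `mock_target` for a real model wrapper would leave the rest of the sketch unchanged, which is the same idea that lets the unit tests avoid real inference.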

@samoger samoger marked this pull request as ready for review June 20, 2026 19:48
