Commit 35e33b3
feat: add PARAPHRASE_MULTILINGUAL_MINILM_L12_V2 text embeddings model (#1115)
## Description
Adds the `paraphrase-multilingual-MiniLM-L12-v2` sentence-transformer
model — the second multilingual embeddings model after distiluse,
completing #945. Ships **only the XNNPACK 8da4w variant** under
`MODEL_REGISTRY.ALL_MODELS` (see "Why a single variant" below).
384-d output, max 126 tokens, 50+ languages. The tokenizer is Unigram
with a Precompiled normalizer and a Metaspace decoder, which **requires
the bumped `pytorch/extension/llm/tokenizers` runtime from #1114**, so
this PR is blocked until #1114 lands and should be rebased onto main
once it merges.
HF repo:
[software-mansion/react-native-executorch-paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/software-mansion/react-native-executorch-paraphrase-multilingual-MiniLM-L12-v2)
(`v0.9.0` tag, layout mirrors distiluse).
**Why a single variant**
TL;DR: 8da4w is faster than every other variant and is also one of the
smallest, with no loss in precision.
Longer answer:
unlike distiluse, where Core ML fp32 won on iPhone thanks to ANE
acceleration, benchmarks on iPhone 17 Pro + OnePlus 12 (~80-token input,
50 measured forwards after 3 warmups) showed the XNNPACK 8da4w variant
Pareto-dominates the other three on both platforms: it is faster than
XNNPACK fp32, Core ML fp32 *and* Core ML fp16 on iPhone, and its
steady-state memory footprint is ~36% smaller than the next-best
variant's. Likely cause: paraphrase-multilingual-MiniLM-L12-v2 is a
smaller model (~118 M params, 12 layers) where Core ML's runtime doesn't
push enough work onto ANE for the precision-conversion overhead to pay
off. fp16 being slower than fp32 on Core ML for this model is a tell
that the runtime is falling back to slower compute units. Shipping only
`_8DA4W` keeps the public surface aligned with the data; if a future
Core ML or model update flips the verdict, the other variants are easy
to add back.
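
For readers who want to reproduce the comparison, here is a minimal
sketch of the two ideas the paragraph above relies on: a
warmup-then-measure timing loop and a Pareto-dominance check over
latency and memory. Everything in it is an assumption for illustration:
`benchmark`, `dominates`, `dominatesAll`, the `Forward` type and the
`VariantResult` shape are hypothetical names, not part of this PR or of
the library, and `forward` stands in for whatever runs one inference.

```typescript
// Hypothetical stand-in for "run one model forward pass on this input".
type Forward = (input: string) => Promise<unknown>;

// Runs `warmups` untimed passes, then times `measured` passes.
// Defaults mirror the 3 warmups / 50 measured forwards mentioned above.
async function benchmark(
  forward: Forward,
  input: string,
  warmups = 3,
  measured = 50,
): Promise<{ meanMs: number; samplesMs: number[] }> {
  // Warmup passes are executed but not recorded, so one-off costs
  // (lazy initialisation, cache fills) stay out of the samples.
  for (let i = 0; i < warmups; i++) {
    await forward(input);
  }
  const samplesMs: number[] = [];
  for (let i = 0; i < measured; i++) {
    // Date.now() has millisecond resolution; a finer clock may be preferable.
    const start = Date.now();
    await forward(input);
    samplesMs.push(Date.now() - start);
  }
  const meanMs = samplesMs.reduce((a, b) => a + b, 0) / samplesMs.length;
  return { meanMs, samplesMs };
}

// Hypothetical result record; both metrics are "lower is better".
interface VariantResult {
  name: string;
  latencyMs: number;
  memoryMb: number;
}

// `a` Pareto-dominates `b` when it is no worse on every metric
// and strictly better on at least one.
function dominates(a: VariantResult, b: VariantResult): boolean {
  const noWorse = a.latencyMs <= b.latencyMs && a.memoryMb <= b.memoryMb;
  const strictlyBetter = a.latencyMs < b.latencyMs || a.memoryMb < b.memoryMb;
  return noWorse && strictlyBetter;
}

// True when `candidate` dominates every variant in `others`.
function dominatesAll(candidate: VariantResult, others: VariantResult[]): boolean {
  return others.every((other) => dominates(candidate, other));
}
```

Treating both metrics as "lower is better" keeps the check symmetric; a
variant that is faster but larger than another neither dominates nor is
dominated, which is the case where shipping more than one variant would
be justified.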
**Memory methodology note** — the new paraphrase row in
`docs/docs/02-benchmarks/memory-usage.md` reports RSS / `phys_footprint`
deltas from a clean app baseline (loaded − idle), captured on-device at
the same conceptual point. The existing distiluse rows there (36 / 44
MB) come from an older measurement pass whose methodology differs and
cannot be reconstructed from the diff, so the two rows are not directly
comparable. Re-measuring distiluse and the other rows with the same
methodology would be a good follow-up.
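
The arithmetic behind a "loaded − idle" delta and a relative saving such
as the ~36% figure can be sketched as below. This is only an
illustration: `footprintDeltaMb` and `relativeSaving` are hypothetical
helper names, the 1 MB = 1024 × 1024 bytes convention is an assumption
(the description does not say which unit definition was used), and how
the raw byte counts are sampled on-device is out of scope here.

```typescript
// Assumption: MB means 1024 * 1024 bytes.
const BYTES_PER_MB = 1024 * 1024;

// Delta between a sample taken with the model loaded and a sample taken
// at the idle baseline. Both samples must come from the same metric
// (RSS or phys_footprint) for the difference to be meaningful.
function footprintDeltaMb(idleBytes: number, loadedBytes: number): number {
  if (idleBytes < 0 || loadedBytes < 0) {
    throw new RangeError("memory samples must be non-negative");
  }
  return (loadedBytes - idleBytes) / BYTES_PER_MB;
}

// Fraction by which `candidateMb` is smaller than `baselineMb`
// (0.36 means "36% smaller").
function relativeSaving(candidateMb: number, baselineMb: number): number {
  if (baselineMb <= 0) {
    throw new RangeError("baseline must be positive");
  }
  return (baselineMb - candidateMb) / baselineMb;
}
```

Because the result is a difference from the app's own baseline, rows
measured this way are only comparable with other rows measured against
the same kind of baseline.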
### Introduces a breaking change?
- [ ] Yes
- [x] No
### Type of change
- [ ] Bug fix (change which fixes an issue)
- [x] New feature (change which adds functionality)
- [ ] Documentation update (improves or adds clarity to existing
documentation)
- [ ] Other (chores, tests, code style improvements etc.)
### Tested on
- [x] iOS
- [x] Android
### Testing instructions
1. `cd apps/text-embeddings && npx expo run:ios` (or `run:android`).
2. Pick **"Multilingual Paraphrase (8da4w)"** in the model picker.
3. Add a sentence in one language, query with an aligned sentence in
another (e.g. Polish "Słoneczko" against "It's so sunny outside!"). The
cross-lingual pair should top the matches.
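
What "should top the matches" means in step 3 can be sketched as a
ranking of stored sentences by similarity to the query embedding. The
sketch assumes cosine similarity is the metric, which this description
does not confirm for the demo app; `cosineSimilarity`,
`rankBySimilarity` and the `StoredSentence` shape are hypothetical
names, and the embeddings would come from the model's forward pass.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: ArrayLike<number>, b: ArrayLike<number>): number {
  if (a.length !== b.length) {
    throw new Error("embeddings must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as unrelated to everything.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical record for a sentence the user has already added.
interface StoredSentence {
  text: string;
  embedding: ArrayLike<number>;
}

// Returns the stored sentences sorted from most to least similar to the query.
function rankBySimilarity(
  query: ArrayLike<number>,
  stored: StoredSentence[],
): Array<{ text: string; score: number }> {
  return stored
    .map((s) => ({ text: s.text, score: cosineSimilarity(query, s.embedding) }))
    .sort((x, y) => y.score - x.score);
}
```

With a multilingual model, the embedding of the Polish sentence and the
embedding of its English counterpart should point in similar
directions, so the aligned pair is expected to appear first in the
ranked list.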
### Related issues
Closes the paraphrase-multilingual half of #945 (the distiluse half
landed in #1098).
### Checklist
- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have updated the documentation accordingly
- [x] My changes generate no new warnings
### Additional notes
Blocked on #1114.

1 parent e7b7529 commit 35e33b3
7 files changed
Lines changed: 57 additions & 35 deletions
File tree
- apps/text-embeddings/app/text-embeddings
- docs/docs
- 02-benchmarks
- 03-hooks/01-natural-language-processing
- packages/react-native-executorch/src
- constants
- types
Per-file change counts, in diff order (file names and diff bodies were
not captured):

| File | Additions | Deletions |
|---|---|---|
| 1 | 5 | 0 |
| 2 | 10 | 9 |
| 3 | 10 | 9 |
| 4 | 10 | 9 |
| 5 | 9 | 8 |
| 6 | 12 | 0 |
| 7 | 1 | 0 |