Commit e36f0aa

feat(privacy-filter): add token-level PII detection model
Adds a new PrivacyFilter native model with TS hook, module, types and demo screen, plus the openai/privacy-filter and OpenMed/privacy-filter nemotron presets.

Uses constrained Viterbi decoding with tunable BIOES transition biases (matching openai's viterbi_calibration.json schema), sliding 256-token windows with 50% overlap for arbitrary-length input, and exposes results as a typed PiiEntity[] via a getJsiValue overload.

seq_len is read from the forward method's input shape so the same runner works with any single-method privacy-filter export. Viterbi decoder is split into its own module (Viterbi.{h,cpp}) following the peer Utils.{h,cpp} convention.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
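
The constrained decoding named in the commit message can be illustrated with a small TypeScript sketch. This is not the Viterbi.{h,cpp} code from this commit: the tag strings, the `Biases` table keyed by prefix pairs, and every function name below are assumptions made for illustration, and the bias keys are not taken from the viterbi_calibration.json schema. The sketch forbids transitions that break BIOES structure (for example `B-person` followed by `O`) and adds an optional tunable bias to each allowed transition.

```typescript
// Illustrative only: tags are "O" or "<prefix>-<type>" with prefix B, I, E or S.
type Tag = string;
// Hypothetical bias table keyed by "<prevPrefix>-><nextPrefix>", e.g. "B->I".
type Biases = Record<string, number>;

const prefixOf = (tag: Tag): string => (tag === 'O' ? 'O' : tag.slice(0, 1));
const typeOf = (tag: Tag): string => (tag === 'O' ? '' : tag.slice(2));

// BIOES rule: an open span (B or I) must continue with I or E of the same
// type; a closed position (O, E or S) may be followed by O, B or S.
function isAllowed(prev: Tag, next: Tag): boolean {
  const p = prefixOf(prev);
  const n = prefixOf(next);
  if (p === 'B' || p === 'I') {
    return (n === 'I' || n === 'E') && typeOf(prev) === typeOf(next);
  }
  return n === 'O' || n === 'B' || n === 'S';
}

const canStart = (tag: Tag): boolean => ['O', 'B', 'S'].includes(prefixOf(tag));
const canEnd = (tag: Tag): boolean => ['O', 'E', 'S'].includes(prefixOf(tag));

// scores[t][k] is the model score of tags[k] at token t (higher is better).
function viterbiDecode(scores: number[][], tags: Tag[], biases: Biases = {}): Tag[] {
  if (scores.length === 0) return [];
  const K = tags.length;
  // Forbidden transitions get -Infinity so they can never be on the best path.
  const trans = tags.map((a) =>
    tags.map((b) =>
      isAllowed(a, b) ? (biases[`${prefixOf(a)}->${prefixOf(b)}`] ?? 0) : -Infinity
    )
  );
  let best = tags.map((tag, k) => (canStart(tag) ? scores[0][k] : -Infinity));
  const back: number[][] = [];
  for (let t = 1; t < scores.length; t++) {
    const cur = new Array<number>(K).fill(-Infinity);
    const ptr = new Array<number>(K).fill(0);
    for (let j = 0; j < K; j++) {
      for (let i = 0; i < K; i++) {
        const cand = best[i] + trans[i][j] + scores[t][j];
        if (cand > cur[j]) {
          cur[j] = cand;
          ptr[j] = i;
        }
      }
    }
    back.push(ptr);
    best = cur;
  }
  // Pick the best final tag that may legally end a sequence, then backtrack.
  let last = -1;
  for (let k = 0; k < K; k++) {
    const reachable = canEnd(tags[k]) && best[k] > -Infinity;
    if (reachable && (last < 0 || best[k] > best[last])) last = k;
  }
  if (last < 0) throw new Error('no valid BIOES path');
  const path = [last];
  for (let t = back.length - 1; t >= 0; t--) {
    path.unshift(back[t][path[0]]);
  }
  return path.map((k) => tags[k]);
}

// Per-token argmax here would be B, O, E, which is not a valid BIOES sequence.
const demoTags = ['O', 'B-person', 'I-person', 'E-person', 'S-person'];
const demoScores = [
  [0, 5, 0, 0, 1],
  [2, 0, 1.5, 0, 0],
  [0, 0, 0, 5, 0],
];
const decoded = viterbiDecode(demoScores, demoTags);
// decoded is ['B-person', 'I-person', 'E-person']
```

Passing a strongly negative bias for one transition class, such as `{ 'B->I': -20 }`, steers the decoder away from multi-token spans, which is the kind of precision/recall tuning a bias table makes possible.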
1 parent c11dbad commit e36f0aa
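
The sliding-window scheme from the commit message (256-token windows with 50% overlap) can be sketched as follows. This is a minimal TypeScript illustration, not the native runner's code: `makeWindows`, `ownerWindow` and the `TokenWindow` type are hypothetical names, and the rule for merging overlapping predictions is an assumption, since the commit message does not say how overlaps are resolved.

```typescript
interface TokenWindow {
  start: number; // inclusive token index
  end: number; // exclusive token index
}

// Split numTokens tokens into windows of at most windowSize tokens, each
// starting one stride after the previous one (stride = 128 for the defaults).
function makeWindows(numTokens: number, windowSize = 256, overlap = 0.5): TokenWindow[] {
  if (numTokens <= 0) return [];
  const stride = Math.max(1, Math.floor(windowSize * (1 - overlap)));
  const windows: TokenWindow[] = [];
  for (let start = 0; ; start += stride) {
    const end = Math.min(start + windowSize, numTokens);
    windows.push({ start, end });
    if (end === numTokens) break; // the last window reaches the end of input
  }
  return windows;
}

// Assumed merge rule: a token takes its prediction from the window in which
// it sits farthest from either edge, where it has the most context.
function ownerWindow(token: number, windows: TokenWindow[]): number {
  let bestIdx = -1;
  let bestMargin = -1;
  windows.forEach((w, idx) => {
    if (token < w.start || token >= w.end) return;
    const margin = Math.min(token - w.start, w.end - 1 - token);
    if (margin > bestMargin) {
      bestMargin = margin;
      bestIdx = idx;
    }
  });
  return bestIdx;
}

// 600 tokens give four windows: [0,256), [128,384), [256,512), [384,600).
const demoWindows = makeWindows(600);
```

The final window is usually shorter than 256 tokens; a fixed-shape model export would presumably need it padded up to seq_len before inference, which this sketch leaves out.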

22 files changed

Lines changed: 1696 additions & 1 deletion

.cspell-wordlist.txt

Lines changed: 5 additions & 0 deletions
@@ -188,3 +188,8 @@ stringifying
 hɛloʊ
 wɜːld
 bielik
+nemotron
+BIOES
+viterbi
+argmaxes
+unpadded

.eslintrc.js

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@ const VALID_CATEGORIES = [
   'Models - Semantic Segmentation',
   'Models - Speech To Text',
   'Models - Style Transfer',
+  'Models - Privacy Filter',
   'Models - Text Embeddings',
   'Models - Text to Speech',
   'Models - VLM',

apps/llm/app.json

Lines changed: 2 additions & 1 deletion
@@ -63,7 +63,8 @@
       },
       "entitlements": {
         "com.apple.developer.kernel.increased-memory-limit": true
-      }
+      },
+      "appleTeamId": "J5FM626PE2"
     },
     "android": {
       "adaptiveIcon": {

apps/llm/app/_layout.tsx

Lines changed: 8 additions & 0 deletions
@@ -146,6 +146,14 @@ export default function _layout() {
             headerTitleStyle: { color: ColorPalette.primary },
           }}
         />
+        <Drawer.Screen
+          name="privacy_filter/index"
+          options={{
+            drawerLabel: 'Privacy Filter (PII)',
+            title: 'Privacy Filter',
+            headerTitleStyle: { color: ColorPalette.primary },
+          }}
+        />
       </Drawer>
     </GeneratingContext>
   );

apps/llm/app/index.tsx

Lines changed: 6 additions & 0 deletions
@@ -41,6 +41,12 @@ export default function Home() {
         >
           <Text style={styles.buttonText}>Multimodal LLM (VLM)</Text>
         </TouchableOpacity>
+        <TouchableOpacity
+          style={styles.button}
+          onPress={() => router.navigate('privacy_filter/')}
+        >
+          <Text style={styles.buttonText}>Privacy Filter (PII)</Text>
+        </TouchableOpacity>
       </View>
     </View>
   );
Lines changed: 263 additions & 0 deletions
@@ -0,0 +1,263 @@
+import { useMemo, useState } from 'react';
+import {
+  ActivityIndicator,
+  ScrollView,
+  StyleSheet,
+  Text,
+  TouchableOpacity,
+  View,
+} from 'react-native';
+import { useIsFocused } from '@react-navigation/native';
+import { useSafeAreaInsets } from 'react-native-safe-area-context';
+import {
+  PiiEntity,
+  PRIVACY_FILTER_NEMOTRON,
+  PRIVACY_FILTER_OPENAI,
+  PrivacyFilterModelSources,
+  usePrivacyFilter,
+} from 'react-native-executorch';
+import ColorPalette from '../../colors';
+import { ModelOption, ModelPicker } from '../../components/ModelPicker';
+import {
+  buildSegments,
+  colorForLabel,
+  matchEntities,
+} from '../../utils/piiMatching';
+
+// Sample tuned for the OpenAI base model — exercises the 8 entity types it
+// recognizes (person, email, phone, account_number, address, date, url,
+// secret).
+const OPENAI_SAMPLE = `My name is Sarah Chen and I work as a senior engineer at Acme Corp. You can reach me at sarah.chen@acmecorp.io or call my direct line at (415) 923-0847. For billing inquiries, my account number is ACC-8821-4490-3371.
+
+I've been living at 17 Birchwood Lane, Portland, OR 97201 since October 3rd, 2019. Before that I was at 8 Rue de Rivoli, Paris, 75001, France. My personal website is https://sarahchen.dev and my GitHub is https://github.com/schen-eng. Feel free to connect — I usually respond within a business day.
+
+My date of birth is June 12, 1991, and my backup email is s.chen.personal@gmail.com in case the primary address is unreachable. This message also contains a confidential API key: sk-T93kXpLm2NvBqR7dYwZ4. Please do not share it outside the team. You can also reach my colleague James Okonkwo at j.okonkwo@acmecorp.io or at his mobile +44 7911 123456.`;
+// Sample tuned for the OpenMed Nemotron model — covers categories the base
+// OpenAI model doesn't have (medical, financial, technical, demographic).
+
+const NEMOTRON_SAMPLE = `Patient intake for Maria Lopez, female, age 47, blood type O+, born 1978-05-12. MRN 994-2210-AB; health plan beneficiary number HPBN-552-9931 with Aetna. SSN 412-55-7821, national ID DNI 88-7762-X. Primary occupation: registered nurse, currently employed full-time at Mercy General. Religion: Catholic; political view: independent.
+
+Reach her at maria.lopez@example.com or +1 (415) 555-0142. Mailing address: 84 Cedar Hill Road, Apt 3B, Berkeley, CA 94703, United States. Vehicle plate 7XKL922; driver license CA-D1294883.
+
+Payment for last visit: Visa ending 4992-1133-7820-4419, expires 11/28, CVV 884. Bank routing 021000089, SWIFT BIC CHASUS33. Employer EIN tax ID 47-3320118. Customer ID CUST-553201, employee ID EMP-A0093.
+
+Workstation MAC 3C:22:FB:8E:01:9A, IPv4 10.0.42.118, device IMEI 359888061234560. Service account API key sk-live-Tn8x3pLm2NvBqR7dYwZ4QF, password Hunter2!Spring. Session cookie sid=eyJ1c2VyIjoiOTk0MjIxMCJ9.`;
+
+const MODEL_OPTIONS: ModelOption<PrivacyFilterModelSources>[] = [
+  { label: 'OpenAI Privacy Filter (8 entities)', value: PRIVACY_FILTER_OPENAI },
+  {
+    label: 'OpenMed Nemotron (55 entities)',
+    value: PRIVACY_FILTER_NEMOTRON,
+  },
+];
+
+// Pick the right sample to display/run based on the active model.
+function sampleFor(model: PrivacyFilterModelSources): string {
+  return model.modelName === PRIVACY_FILTER_NEMOTRON.modelName
+    ? NEMOTRON_SAMPLE
+    : OPENAI_SAMPLE;
+}
+
+function HighlightedText({
+  source,
+  entities,
+}: {
+  source: string;
+  entities: PiiEntity[];
+}) {
+  const segments = useMemo(
+    () => buildSegments(source, matchEntities(source, entities)),
+    [source, entities]
+  );
+  return (
+    <Text style={styles.sampleText}>
+      {segments.map((seg, i) =>
+        seg.label ? (
+          <Text
+            key={i}
+            style={[
+              styles.highlight,
+              { backgroundColor: colorForLabel(seg.label) },
+            ]}
+          >
+            {seg.text}
+          </Text>
+        ) : (
+          <Text key={i}>{seg.text}</Text>
+        )
+      )}
+    </Text>
+  );
+}
+
+function PrivacyFilterScreen() {
+  const { bottom } = useSafeAreaInsets();
+  const [entities, setEntities] = useState<PiiEntity[] | null>(null);
+  const [runError, setRunError] = useState<string | null>(null);
+  const [inferenceMs, setInferenceMs] = useState<number | null>(null);
+  const [selectedModel, setSelectedModel] = useState<PrivacyFilterModelSources>(
+    PRIVACY_FILTER_OPENAI
+  );
+
+  const filter = usePrivacyFilter({ model: selectedModel });
+  const sampleText = sampleFor(selectedModel);
+
+  const onRun = async () => {
+    setRunError(null);
+    setEntities(null);
+    setInferenceMs(null);
+    const startedAt = Date.now();
+    try {
+      const result = await filter.generate(sampleText);
+      const elapsed = Date.now() - startedAt;
+      setInferenceMs(elapsed);
+      setEntities(result);
+    } catch (e) {
+      let msg: string;
+      if (e instanceof Error) {
+        msg = e.message;
+      } else if (e && typeof e === 'object' && 'message' in e) {
+        const code =
+          'code' in e ? ` (code ${(e as { code: unknown }).code})` : '';
+        msg = `${(e as { message: string }).message}${code}`;
+      } else {
+        try {
+          msg = JSON.stringify(e);
+        } catch {
+          msg = String(e);
+        }
+      }
+      setRunError(msg);
+    }
+  };
+
+  const disabled = !filter.isReady || filter.isGenerating;
+
+  return (
+    <View style={[styles.container, { paddingBottom: bottom + 8 }]}>
+      <ModelPicker
+        models={MODEL_OPTIONS}
+        selectedModel={selectedModel}
+        onSelect={(m) => {
+          setEntities(null);
+          setRunError(null);
+          setInferenceMs(null);
+          setSelectedModel(m);
+        }}
+        label="Model"
+        disabled={filter.isGenerating}
+      />
+
+      {filter.error && (
+        <View style={styles.errorBanner}>
+          <Text style={styles.errorText}>
+            Load error: {filter.error.message}
+          </Text>
+        </View>
+      )}
+
+      {!filter.isReady && !filter.error && (
+        <View style={styles.centerBlock}>
+          <ActivityIndicator color={ColorPalette.primary} />
+          <Text style={styles.muted}>
+            Downloading model…{' '}
+            {Math.round((filter.downloadProgress ?? 0) * 100)}%
+          </Text>
+        </View>
+      )}
+
+      <ScrollView style={styles.textBox}>
+        {entities ? (
+          <HighlightedText source={sampleText} entities={entities} />
+        ) : (
+          <Text style={styles.sampleText}>{sampleText}</Text>
+        )}
+      </ScrollView>
+
+      <TouchableOpacity
+        style={[styles.runButton, disabled && styles.buttonDisabled]}
+        onPress={onRun}
+        disabled={disabled}
+      >
+        {filter.isGenerating ? (
+          <ActivityIndicator color="#fff" />
+        ) : (
+          <Text style={styles.runButtonText}>
+            Detect PII
+            {inferenceMs !== null && ` · ${inferenceMs} ms`}
+          </Text>
+        )}
+      </TouchableOpacity>
+
+      {runError && (
+        <View style={styles.errorBanner}>
+          <Text style={styles.errorText}>Run error: {runError}</Text>
+        </View>
+      )}
+    </View>
+  );
+}
+
+export default function PrivacyFilterScreenWrapper() {
+  const isFocused = useIsFocused();
+  return isFocused ? <PrivacyFilterScreen /> : null;
+}
+
+const styles = StyleSheet.create({
+  container: {
+    flex: 1,
+    padding: 16,
+    backgroundColor: '#fff',
+    gap: 10,
+  },
+  textBox: {
+    flex: 1,
+    borderWidth: 1,
+    borderColor: '#e0e0e0',
+    borderRadius: 8,
+    padding: 10,
+  },
+  sampleText: {
+    fontSize: 13,
+    color: '#222',
+    lineHeight: 19,
+  },
+  highlight: {
+    fontWeight: '600',
+    borderRadius: 3,
+  },
+  runButton: {
+    backgroundColor: ColorPalette.primary,
+    borderRadius: 8,
+    paddingVertical: 12,
+    alignItems: 'center',
+  },
+  runButtonText: {
+    color: '#fff',
+    fontSize: 15,
+    fontWeight: '600',
+  },
+  buttonDisabled: {
+    opacity: 0.5,
+  },
+  centerBlock: {
+    alignItems: 'center',
+    gap: 6,
+    paddingVertical: 8,
+  },
+  muted: {
+    color: '#666',
+    fontSize: 12,
+  },
+  errorBanner: {
+    backgroundColor: '#fdecea',
+    borderColor: '#f5c6cb',
+    borderWidth: 1,
+    borderRadius: 6,
+    padding: 8,
+  },
+  errorText: {
+    color: '#a94442',
+    fontSize: 12,
+  },
+});
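
The demo screen above highlights entities by turning the source text into labeled and unlabeled segments. The `piiMatching` helpers it imports are not shown in this part of the diff, so the following is only a guess at how such a step could work, written as a self-contained TypeScript sketch. `SketchEntity` (with assumed `label` and `text` fields), `locateEntities` and `toSegments` are hypothetical names and are not the library's `PiiEntity`, `matchEntities` or `buildSegments`; only the `text` and optional `label` fields of a segment mirror what the screen reads.

```typescript
// Hypothetical entity shape: a label plus the exact matched text.
interface SketchEntity {
  label: string;
  text: string;
}
interface Span {
  start: number; // inclusive character offset
  end: number; // exclusive character offset
  label: string;
}
interface Segment {
  text: string;
  label?: string; // present only for highlighted segments
}

// Find each entity's text in the source, scanning left to right so that
// repeated strings map to successive occurrences. Assumes document order.
function locateEntities(source: string, entities: SketchEntity[]): Span[] {
  const spans: Span[] = [];
  let cursor = 0;
  for (const e of entities) {
    if (e.text.length === 0) continue;
    const at = source.indexOf(e.text, cursor);
    if (at < 0) continue; // skip entities that do not appear verbatim
    spans.push({ start: at, end: at + e.text.length, label: e.label });
    cursor = at + e.text.length;
  }
  return spans;
}

// Cut the source into alternating plain and labeled segments.
function toSegments(source: string, spans: Span[]): Segment[] {
  const segments: Segment[] = [];
  let pos = 0;
  for (const s of spans) {
    if (s.start < pos) continue; // ignore spans overlapping an earlier one
    if (s.start > pos) segments.push({ text: source.slice(pos, s.start) });
    segments.push({ text: source.slice(s.start, s.end), label: s.label });
    pos = s.end;
  }
  if (pos < source.length) segments.push({ text: source.slice(pos) });
  return segments;
}

const demoSource = 'Email sarah@x.io or call Sarah.';
const demoSegments = toSegments(
  demoSource,
  locateEntities(demoSource, [
    { label: 'email', text: 'sarah@x.io' },
    { label: 'person', text: 'Sarah' },
  ])
);
```

Concatenating the `text` of every segment reproduces the source string, so rendering the segments in order, as the screen's `HighlightedText` does with its own segments, shows the full text with only the labeled parts styled.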
