22 changes: 22 additions & 0 deletions docs/concepts/rewrite.md
@@ -15,6 +15,28 @@ The text is then rewritten to reduce identifiability, applying targeted transfor

---

## Key concepts

**Sensitivity** measures the intrinsic re-identification damage an entity causes if it appears in the output, independent of what else is retained. It is not the protection decision itself; it is an input to the downstream leakage scoring system.

| Level | Meaning | Examples | Leakage weight |
|-------|---------|---------|----------------|
| `high` | Exposure alone can identify a person | Names, ID numbers, contact details | 1.0 |
| `medium` | Meaningfully narrows the identity space | Location, occupation, age | 0.6 |
| `low` | Minimal standalone identifying power | Generic attributes, widely shared traits | 0.3 |

Review comment (Contributor): You could add gender here as an example as well!
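
The sensitivity table above can be read as a simple lookup from level to leakage weight. The sketch below is illustrative only: the `Entity` record and the `retained_leakage_weight` helper are hypothetical names, and summing the weights of retained entities is an assumed stand-in for the real leakage score, whose formula is not described in this section.

```python
from dataclasses import dataclass
from enum import Enum


class Sensitivity(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


# Leakage weights copied from the sensitivity table.
LEAKAGE_WEIGHT = {
    Sensitivity.HIGH: 1.0,
    Sensitivity.MEDIUM: 0.6,
    Sensitivity.LOW: 0.3,
}


@dataclass(frozen=True)
class Entity:
    # Hypothetical record type for illustration, not the library's own class.
    text: str
    sensitivity: Sensitivity
    retained: bool  # True if the entity still appears in the rewritten output


def retained_leakage_weight(entities):
    """Sum the weights of entities that survive into the output.

    Assumption: a plain sum stands in for the actual leakage score.
    """
    return sum(LEAKAGE_WEIGHT[e.sensitivity] for e in entities if e.retained)


entities = [
    Entity("Jane Doe", Sensitivity.HIGH, retained=False),     # replaced, so not counted
    Entity("nurse", Sensitivity.MEDIUM, retained=True),       # still in the output
    Entity("likes hiking", Sensitivity.LOW, retained=True),   # still in the output
]
print(retained_leakage_weight(entities))
```

Keeping sensitivity separate from the retention flag mirrors the point made above: sensitivity describes the entity on its own, while whether it is protected is a separate decision.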


**Protection method** describes how a sensitive entity is transformed. The choice reflects a holistic view of the document: which other entities are being protected, and how, shapes what each individual entity needs.

| Method | What it does | Typical use |
|--------|-------------|-------------|
| `replace` | Substitutes the entity with a plausible synthetic alternative | Direct identifiers (names, IDs, contact details) |
| `generalize` | Replaces the entity with a broader form | Quasi-identifiers (exact date → quarter, city → region) |
| `suppress_inference` | Rewrites the surrounding text to remove cues that enable the inference | Latent entities that are implied rather than stated |
| `remove` | Deletes the entity entirely | Cases where neither replacement nor generalization can preserve meaning without retaining the identifying detail |
| `leave_as_is` | Leaves the entity unchanged | Entities judged not to require protection in context |

Review comment (Contributor): Could also use gender as an example here
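
To make the five protection methods concrete, here is a toy string-level sketch. It is not the library's implementation: `ProtectionMethod` and `apply_method` are hypothetical names, the substitute text is supplied by the caller rather than generated, and `suppress_inference` is left unimplemented because it requires rewriting the surrounding text rather than swapping a span.

```python
from enum import Enum


class ProtectionMethod(Enum):
    REPLACE = "replace"
    GENERALIZE = "generalize"
    SUPPRESS_INFERENCE = "suppress_inference"
    REMOVE = "remove"
    LEAVE_AS_IS = "leave_as_is"


def apply_method(text, entity, method, substitute=None):
    """Apply one protection method to every occurrence of `entity` in `text`.

    Hypothetical helper for illustration; a real rewriter works on the
    whole document, not on one span at a time.
    """
    method = ProtectionMethod(method)
    if method is ProtectionMethod.LEAVE_AS_IS:
        return text
    if method is ProtectionMethod.REMOVE:
        # Delete the entity, then collapse the whitespace left behind.
        return " ".join(text.replace(entity, "").split())
    if method in (ProtectionMethod.REPLACE, ProtectionMethod.GENERALIZE):
        # Both are substitutions here; they differ in what the substitute is:
        # a synthetic alternative (replace) or a broader form (generalize).
        if substitute is None:
            raise ValueError(f"{method.value} needs a substitute string")
        return text.replace(entity, substitute)
    # suppress_inference needs a context-aware rewrite, which is out of scope.
    raise NotImplementedError("suppress_inference cannot be done by span substitution")


sentence = "Maria Lopez moved to Lyon."
print(apply_method(sentence, "Maria Lopez", "replace", "Anna Keller"))
print(apply_method(sentence, "Lyon", "generalize", "southeastern France"))
```

The sketch applies each method in isolation; in the described system the method for one entity depends on how the other entities in the document are handled.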


---

## Basic usage

```python