| 1 | +# Memory Models & Migrations |
| 2 | + |
| 3 | +This guide covers how to work with PyRIT's memory models — where they live, how to add or update them, and how the migration system works. |
| 4 | + |
| 5 | +## Where Things Live |
| 6 | + |
| 7 | +| What | Path | |
| 8 | +|---|---| |
| 9 | +| ORM models (SQLAlchemy) | `pyrit/memory/memory_models.py` | |
| 10 | +| Domain objects they map to | `pyrit/models/` (e.g. `MessagePiece`, `Score`, `Seed`, `AttackResult`, `ScenarioResult`) | |
| 11 | +| Alembic migration environment | `pyrit/memory/alembic/env.py` | |
| 12 | +| Migration revisions | `pyrit/memory/alembic/versions/` | |
| 13 | +| Migration helpers | `pyrit/memory/migration.py` | |
| 14 | +| CLI migration tool | `build_scripts/memory_migrations.py` | |
| 15 | +| Schema diagram | `doc/code/memory/10_schema_diagram.md` | |
| 16 | + |
| 17 | +## Current Models |
| 18 | + |
| 19 | +All models inherit from the SQLAlchemy `Base` declarative class and live in `memory_models.py`: |
| 20 | + |
| 21 | +- **`PromptMemoryEntry`** — prompt/response data (`PromptMemoryEntries` table) |
| 22 | +- **`ScoreEntry`** — evaluation results (`ScoreEntries` table) |
| 23 | +- **`EmbeddingDataEntry`** — embeddings for semantic search (`EmbeddingData` table) |
| 24 | +- **`SeedEntry`** — dataset prompts/templates (`SeedPromptEntries` table) |
| 25 | +- **`AttackResultEntry`** — attack execution results (`AttackResultEntries` table) |
| 26 | +- **`ScenarioResultEntry`** — scenario execution metadata (`ScenarioResultEntries` table) |
| 27 | + |
| 28 | +Each entry model has a corresponding domain object and conversion methods (e.g. `PromptMemoryEntry.__init__(entry: MessagePiece)` and `get_message_piece()`). |
| 29 | + |
| 30 | +## Adding or Updating a Model |
| 31 | + |
| 32 | +### 1. Edit the model |
| 33 | + |
| 34 | +Make your changes in `pyrit/memory/memory_models.py`. Follow these conventions: |
| 35 | + |
| 36 | +- Use `mapped_column()` with explicit types. |
| 37 | +- Use `CustomUUID` for all UUID columns (handles cross-database compatibility). |
| 38 | +- Add foreign keys where relationships exist. |
| 39 | +- Include `pyrit_version` on new entry models. |
| 40 | + |
| 41 | +### 2. Generate a migration |
| 42 | + |
| 43 | +```bash |
| 44 | +python build_scripts/memory_migrations.py generate -m "short description of change" |
| 45 | +``` |
| 46 | + |
| 47 | +This creates a new revision file under `pyrit/memory/alembic/versions/`. **Review the generated file carefully** — auto-generated migrations may need manual adjustments (e.g. for data migrations or default values). |
| 48 | + |
| 49 | +### 3. Validate the migration |
| 50 | + |
| 51 | +```bash |
| 52 | +python build_scripts/memory_migrations.py check |
| 53 | +``` |
| 54 | + |
| 55 | +This verifies that the schema produced by running all migrations matches the current models. The `memory-migrations-check` pre-commit hook (see below) and CI both run this check. |
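To make the idea behind the check concrete, here is a minimal conceptual sketch using only the standard library. It is not how `memory_migrations.py check` or Alembic is implemented. The migration statements, the `ExampleEntries` table, and the helper names `migrated_schema` and `schema_drift` are all hypothetical stand-ins. The sketch applies every migration to a scratch database and then compares the resulting columns against what the models expect.

```python
import sqlite3

# Hypothetical migration steps applied in order (stand-ins for Alembic revisions).
MIGRATIONS = [
    "CREATE TABLE ExampleEntries (id TEXT PRIMARY KEY, value TEXT)",
    "ALTER TABLE ExampleEntries ADD COLUMN pyrit_version TEXT",
]

# Hypothetical model definition: table name -> expected column names.
EXPECTED_SCHEMA = {"ExampleEntries": {"id", "value", "pyrit_version"}}


def migrated_schema(migrations: list[str]) -> dict[str, set[str]]:
    """Apply every migration to a scratch in-memory database and return its tables and columns."""
    conn = sqlite3.connect(":memory:")
    try:
        for statement in migrations:
            conn.execute(statement)
        tables = [
            row[0]
            for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
        ]
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        return {
            table: {col[1] for col in conn.execute(f'PRAGMA table_info("{table}")')}
            for table in tables
        }
    finally:
        conn.close()


def schema_drift(expected: dict[str, set[str]], actual: dict[str, set[str]]) -> list[str]:
    """Describe every difference between the expected and the migrated schema."""
    problems = []
    for table in sorted(expected.keys() | actual.keys()):
        missing = expected.get(table, set()) - actual.get(table, set())
        extra = actual.get(table, set()) - expected.get(table, set())
        if missing:
            problems.append(f"{table}: missing columns {sorted(missing)}")
        if extra:
            problems.append(f"{table}: unexpected columns {sorted(extra)}")
    return problems
```

An empty list from `schema_drift(EXPECTED_SCHEMA, migrated_schema(MIGRATIONS))` means the migrations and the models agree. A non-empty list corresponds to the situation in which the real check fails, because a model change has no matching revision.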
| 56 | + |
| 57 | +### 4. Update the schema diagram |
| 58 | + |
| 59 | +If you changed the schema in a meaningful way (added a table, added a foreign key, etc.), update the Mermaid diagram in `doc/code/memory/10_schema_diagram.md`. |
| 60 | + |
| 61 | +## How Migrations Run at Startup |
| 62 | + |
| 63 | +Schema migrations are triggered inside each memory class constructor (`SQLiteMemory.__init__` and `AzureSQLMemory.__init__`). When `skip_schema_migration=False` (the default), the inherited `_run_schema_migration()` method on `MemoryInterface` runs: |
| 64 | + |
| 65 | +``` |
| 66 | +SQLiteMemory.__init__() / AzureSQLMemory.__init__() |
| 67 | + → _run_schema_migration() # pyrit/memory/memory_interface.py |
| 68 | + → run_schema_migrations(engine=...) # pyrit/memory/migration.py |
| 69 | + → alembic upgrade head |
| 70 | + → check_schema_migrations(engine=...) # pyrit/memory/migration.py |
| 71 | + → alembic check |
| 72 | +``` |
| 73 | + |
| 74 | +Both SQLite and AzureSQL follow the same migration path: first `run_schema_migrations` applies any pending Alembic revisions (`alembic upgrade head`), then `check_schema_migrations` verifies the resulting schema matches the current models (`alembic check`). The behavior depends on database state: |
| 75 | + |
| 76 | +| Database state | What happens | |
| 77 | +|---|---| |
| 78 | +| **Fresh (no tables)** | All migrations apply from scratch | |
| 79 | +| **Already versioned** | Only unapplied migrations run (idempotent) | |
| 80 | +| **Legacy (tables exist, no version tracking)** | Validates schema matches models, stamps current version, then upgrades. Raises `RuntimeError` on mismatch to prevent data corruption | |
| 81 | + |
| 82 | +Migrations run inside a transaction (`engine.begin()`), so a failed migration rolls back cleanly. The version tracking table is `pyrit_memory_alembic_version`. |
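The three database states in the table can be told apart by inspecting which tables exist. The sketch below illustrates that distinction for SQLite using only the standard library. It is not the logic in `pyrit/memory/migration.py`, and the function name `classify_database_state` is hypothetical. Only the version table name comes from the text above.

```python
import sqlite3

# Version tracking table name, as stated in the text above.
VERSION_TABLE = "pyrit_memory_alembic_version"


def classify_database_state(conn: sqlite3.Connection) -> str:
    """Return 'fresh', 'versioned', or 'legacy' based on the tables present."""
    tables = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    }
    if not tables:
        return "fresh"      # no tables: all migrations apply from scratch
    if VERSION_TABLE in tables:
        return "versioned"  # version table present: only unapplied migrations run
    return "legacy"         # tables exist but no version tracking


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(classify_database_state(conn))  # fresh
    conn.execute("CREATE TABLE PromptMemoryEntries (id TEXT)")
    print(classify_database_state(conn))  # legacy
    conn.execute(f"CREATE TABLE {VERSION_TABLE} (version_num TEXT)")
    print(classify_database_state(conn))  # versioned
    conn.close()
```

Of the three states, only the legacy one needs the extra schema validation before it is stamped, which is why it is the only row that can raise `RuntimeError`.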
| 83 | + |
| 84 | +Users can skip migrations by passing `skip_schema_migration=True` to the memory class constructor. When using `initialize_pyrit_async()`, this can be forwarded via `**memory_instance_kwargs`: |
| 85 | + |
| 86 | +```python |
| 87 | +await initialize_pyrit_async("SQLite", skip_schema_migration=True) |
| 88 | +``` |
| 89 | + |
| 90 | +## Important Rules |
| 91 | + |
| 92 | +### Migration revisions are immutable |
| 93 | + |
| 94 | +Once a migration revision is committed, it **must not be modified or deleted**. This is enforced by a pre-commit hook (`enforce_alembic_revision_immutability`). If you need to fix a migration, create a new revision instead. |
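As an illustration of what an immutability rule amounts to, the sketch below flags staged changes that modify, delete, or rename an existing revision file while allowing new ones. It is an assumption-laden example, not the actual `enforce_alembic_revision_immutability` hook. The function name `find_immutability_violations` and the file names in the usage note are hypothetical, and the input is assumed to be lines in the tab-separated format printed by `git diff --cached --name-status`.

```python
REVISIONS_DIR = "pyrit/memory/alembic/versions/"

# Git status letters that alter an existing file:
# M = modified, D = deleted, R = renamed, T = type changed.
BLOCKED_STATUSES = {"M", "D", "R", "T"}


def find_immutability_violations(name_status_lines: list[str]) -> list[str]:
    """Return existing revision files that a staged change would alter.

    Each input line is assumed to look like "M\\tpath" or "R100\\told_path\\tnew_path".
    """
    violations = []
    for line in name_status_lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        status, path = parts[0], parts[1]  # for renames, parts[1] is the old path
        if path.startswith(REVISIONS_DIR) and status[:1] in BLOCKED_STATUSES:
            violations.append(path)
    return violations
```

For example, passing `["A\tpyrit/memory/alembic/versions/0002_new.py", "M\tpyrit/memory/alembic/versions/0001_initial.py"]` returns only the second path: adding a revision is allowed, while editing an existing one is reported.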
| 95 | + |
| 96 | +### Pre-commit hooks |
| 97 | + |
| 98 | +Two hooks run automatically when you touch memory-related files: |
| 99 | + |
| 100 | +1. **`enforce_alembic_revision_immutability`** — blocks modifications/deletions to existing revision files. |
| 101 | +2. **`memory-migrations-check`** — runs `memory_migrations.py check` to verify the schema is in sync. |
| 102 | + |
| 103 | +These hooks trigger on changes to `pyrit/memory/memory_models.py`, `pyrit/memory/migration.py`, and files under `pyrit/memory/alembic/`. |