MAINT: Production schema migration guard + add deliberate migration script for release#2028

Open
jsong468 wants to merge 5 commits into microsoft:main from jsong468:azure_sql_migration_guard
jsong468 commented Jun 17, 2026

Contributor

PR: Production Schema Migration Guard

Description

PyRIT's AzureSQLMemory runs alembic upgrade head automatically on construction, meaning any user connecting to a shared Azure SQL database triggers a schema upgrade using whatever migration files are on their machine. This PR prevents accidental schema modifications to production and adds a deliberate release-time migration path.

Runtime Guard (AzureSQLMemory)

  • When the connection string matches AZURE_SQL_DB_CONNECTION_STRING_PROD, the constructor now runs check-only mode instead of full migration
  • Check-only verifies the database schema matches the Python models without modifying it
  • If schemas match → normal startup, no error
  • If schemas don't match → logs a warning but does not block startup, so developers on newer code can still query prod data
  • If the prod env var is not set → existing behavior (full migration), no breaking change
  • The check always runs when migration won't (prod connections OR skip_schema_migration=True), providing visibility without modification
  • AzureSQLMemory catches both AutogenerateDiffsDetected and CommandError (the latter fires when the DB hasn't been upgraded at all)
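
The decision described in the bullets above can be sketched as a small pure function. This is only an illustration, not the code in this PR: `StartupAction` and `choose_startup_action` are hypothetical names, and "matches" is assumed to mean exact string equality with the value of AZURE_SQL_DB_CONNECTION_STRING_PROD.

```python
from enum import Enum
from typing import Optional


class StartupAction(Enum):
    MIGRATE = "migrate"        # upgrade to head, then verify
    CHECK_ONLY = "check_only"  # compare schema to models, never modify


def choose_startup_action(
    connection_string: str,
    prod_connection_string: Optional[str],
    skip_schema_migration: bool = False,
) -> StartupAction:
    """Hypothetical helper mirroring the bullets above; not PyRIT's actual code."""
    # Assumption: a connection counts as prod only on an exact string match.
    is_prod = bool(prod_connection_string) and connection_string == prod_connection_string
    # The read-only check runs whenever migration won't.
    if is_prod or skip_schema_migration:
        return StartupAction.CHECK_ONLY
    # Prod env var unset or different: existing behavior (full migration).
    return StartupAction.MIGRATE
```

Under these assumptions, a prod connection with skip_schema_migration=True still resolves to the check-only path, which is the case the last prod-guard unit test covers.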

_check_schema_migration on MemoryInterface

  • Added as a sibling to _run_schema_migration — a clean primitive that calls check_schema_migrations and lets exceptions propagate
  • The policy of what to do with errors (warn vs. raise) lives in AzureSQLMemory, not in the base class
  • _run_schema_migration: upgrade + verify
  • _check_schema_migration: verify only, raises on mismatch
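
A minimal sketch of the primitive/policy split, using stand-in classes rather than the real MemoryInterface and AzureSQLMemory. `SchemaMismatchError`, `BaseMemory`, `ProdMemory`, and the `schema_matches` flag are all hypothetical; the real code catches Alembic's AutogenerateDiffsDetected and CommandError after an actual database comparison.

```python
import logging

logger = logging.getLogger(__name__)


class SchemaMismatchError(Exception):
    """Stand-in for the Alembic exceptions named in the description."""


class BaseMemory:
    def __init__(self, schema_matches: bool) -> None:
        # Hypothetical flag that replaces a real database comparison.
        self._schema_matches = schema_matches

    def _check_schema_migration(self) -> None:
        # Primitive: verify only, and let any mismatch propagate to the caller.
        if not self._schema_matches:
            raise SchemaMismatchError("database schema does not match the models")


class ProdMemory(BaseMemory):
    def __init__(self, schema_matches: bool) -> None:
        super().__init__(schema_matches)
        self.started = False
        try:
            self._check_schema_migration()
        except SchemaMismatchError as exc:
            # Policy: on prod, warn and continue so data can still be queried.
            logger.warning("Schema mismatch, continuing without migrating: %s", exc)
        self.started = True
```

Keeping the `try`/`except` in the subclass means another backend could reuse the same primitive and choose to fail hard instead.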

Release Migration Script (build_scripts/migrate_prod_memory_schema.py)

  • Thin wrapper (~120 lines) that constructs AzureSQLMemory(skip_schema_migration=True) and calls _run_schema_migration() — no duplicated migration logic
  • Validates release environment: must be on a releases/ branch, clean working tree in pyrit/memory/, no .dev version suffix
  • Reads AZURE_SQL_DB_CONNECTION_STRING_PROD from ~/.pyrit/.env automatically (loaded with override=False so explicit env vars take precedence)
  • Since it runs from the release branch, upgrade head is the release revision
  • Interactive "type yes" confirmation when running in a terminal; skipped in CI
  • --skip-environment-check flag for CI pipelines where Git state may differ
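
The environment validation and confirmation rules listed above could look roughly like the following. This is a sketch with hypothetical function names; it takes the branch, version, and dirty-file list as plain arguments instead of querying Git, and it assumes the confirmation accepts only the literal word "yes" and that "running in a terminal" is detected by whether stdin is a TTY.

```python
from typing import List


def release_environment_problems(
    branch: str, version: str, dirty_memory_files: List[str]
) -> List[str]:
    """Return human-readable problems; an empty list means the checks pass."""
    problems = []
    if not branch.startswith("releases/"):
        problems.append(f"not on a releases/ branch (current: {branch})")
    if ".dev" in version:
        problems.append(f"version {version} has a .dev suffix")
    if dirty_memory_files:
        problems.append(
            "uncommitted changes in pyrit/memory/: " + ", ".join(dirty_memory_files)
        )
    return problems


def needs_confirmation(stdin_is_tty: bool) -> bool:
    # Interactive terminal: ask the user. Non-interactive (e.g. CI): skip the prompt.
    return stdin_is_tty


def is_confirmed(answer: str) -> bool:
    # Assumption: only the exact word "yes" (ignoring surrounding whitespace) confirms.
    return answer.strip() == "yes"
```

A caller would typically pass `sys.stdin.isatty()` to `needs_confirmation` and abort with a non-zero exit code if `release_environment_problems` returns anything, unless --skip-environment-check was given.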

memory_migrations.py head Subcommand

  • Prints the current Alembic head revision ID so releasers can identify what will be applied
  • Sits alongside existing generate and check subcommands

Design Decisions

  • Warn instead of raise on schema mismatch: When a developer on main (with unreleased schema changes) connects to prod, the guard detects the mismatch but only logs a warning — it does not block startup. This preserves the primary goal (no accidental schema modification) while keeping prod usable for querying data. The previous iteration raised AutogenerateDiffsDetected, which blocked developers from using prod at all after any schema PR merged to main.
  • Check runs when migration doesn't: The read-only check runs whenever migration won't — both for prod connections and when skip_schema_migration=True. This provides visibility without modification. When migration runs, it fixes any mismatch anyway, so checking first would be redundant noise.
  • _check_schema_migration is a clean primitive: The base class method just calls check_schema_migrations and lets exceptions propagate. The prod-specific "catch and warn" policy lives in AzureSQLMemory.__init__ where the context (prod vs. non-prod) is known. This follows the principle that base classes provide primitives, subclasses implement policy.
  • Thin wrapper over AzureSQLMemory: The migration script reuses the existing constructor and _run_schema_migration() rather than duplicating migration logic. This avoids code divergence and ensures the release path uses the exact same code path as normal startup.
  • upgrade head (not explicit revision): Since the script requires running from a release branch (enforced by environment checks), head on that branch IS the release revision. An explicit --target-revision argument was redundant with the branch check.

Tests and Documentation

Unit Tests Added (9 tests)

  • 5 tests in test_azure_sql_memory.py (prod guard):

    • Prod connection runs check-only, not migration
    • Prod connection with schema mismatch warns but does not block startup
    • Non-prod connection runs full migration normally
    • No prod env var set runs full migration normally
    • Prod with skip_schema_migration=True still runs check, never migrates
  • 4 tests in test_migration.py:

    • _check_schema_migration delegates to check_schema_migrations
    • _check_schema_migration raises AutogenerateDiffsDetected on mismatch (primitive behavior)
    • _check_schema_migration raises RuntimeError when engine is None
    • memory_migrations.py head outputs a valid hex revision ID
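
For the last test, "valid hex revision ID" can be checked with a simple pattern match on the command's output. The helper below is a hypothetical sketch, not the committed test; it assumes the revision ID is lowercase hexadecimal of any length, since the exact length depends on how the revisions were generated.

```python
import re

# Assumption: revision IDs are lowercase hex strings of unspecified length.
_HEX_REVISION = re.compile(r"[0-9a-f]+")


def looks_like_hex_revision(output: str) -> bool:
    """Return True if the stripped output is a non-empty lowercase hex string."""
    return _HEX_REVISION.fullmatch(output.strip()) is not None
```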

E2E Verification (test_migration_release.py — not committed, local only)

  • Test 1: Full migration flow. A fresh SQLite DB is migrated to head via AzureSQLMemory(skip_schema_migration=True) + _run_schema_migration()

  • Test 2: Idempotency — running migration twice succeeds (second is no-op)

  • Test 3: Prod guard after migration — connecting to a migrated prod DB passes schema check

  • Test 4: Prod guard on old schema — connecting to an outdated DB logs a mismatch warning but doesn't block

  • Test 5: Environment checks — script fails when not on a release branch

  • Test 6: Unit tests — all 72 migration/guard unit tests pass

  • Also tested against the dev DB, which was already up to date; the expected messages were printed.

Documentation

  • Added Step 9 ("Migrate Production Database Schema") to doc/contributing/10_release_process.md
  • Includes prerequisites (release branch, frozen deps, version verification), migration command, post-migration verification, and rollback policy
  • Renumbered subsequent steps (10–12)

@jsong468 jsong468 marked this pull request as draft June 17, 2026 02:38
Comment thread doc/contributing/10_release_process.md Outdated
Comment thread pyrit/memory/memory_interface.py Outdated
Comment thread pyrit/memory/azure_sql_memory.py
@@ -0,0 +1,317 @@
# Copyright (c) Microsoft Corporation.

Contributor

I'm wondering if adding a 300-line script is a bit of overhead, given how much it overlaps with other migration code (i.e. the run_migrations functions there, which upgrade a DB given its engine).

I think this should be a thin wrapper that constructs an AzureSQLMemory with skip=true and the prod connection string, then calls its _run_schema_migration ... no?

Contributor Author

agreed, slimmed it down!

@jsong468 jsong468 marked this pull request as ready for review June 30, 2026 22:15
@jsong468 jsong468 changed the title from "[DRAFT] MAINT: Production schema migration guard + add deliberate migration script for release" to "MAINT: Production schema migration guard + add deliberate migration script for release" Jul 1, 2026
2 participants