Skip to content

Feature: Migration flow for importing non-PostKit projects into PostKit #21

@nipunappri

Description

@nipunappri

Summary

Introduce a first-class workflow for importing existing PostgreSQL projects into PostKit. Currently, postkit init assumes a greenfield project — there is no way to onboard a database that already has tables, indexes, constraints, and data. Teams evaluating PostKit must be able to adopt it incrementally without rebuilding their schema from scratch.


Motivation

User Story

As a developer with an existing PostgreSQL project, I want to import my database into PostKit so that I can start using PostKit's migration workflow for future changes, without losing or recreating my current schema.

Current Pain Points

  1. No import pathpostkit init creates empty state files (committed.json: {}, empty schema.sql). Pointing it at an existing database does nothing.
  2. Manual baseline is error-prone — Users must manually pg_dump their schema, place it in the right directory, and hand-craft committed.json to match. Any mismatch causes db plan to generate a full-drop-and-recreate plan.
  3. No documentation — There are zero docs describing how to bring an existing project under PostKit management.
  4. Adoption blocker — This is the Introduce command to populate a part of the stack #1 question new users ask: "I already have a database, can I use PostKit?"

Proposed Solution

New Command: postkit db import

postkit db import [options]

Import an existing PostgreSQL database into PostKit by creating a baseline snapshot.

Options

Flag Description Default
--url <string> Database URL to import from Uses localDbUrl from config
--schema <string> PostgreSQL schema to import public
--name <string> Label for the baseline migration "imported_baseline"
--skip-data Only import schema, ignore seed data false
--dry-run Preview what would be imported without writing false

Behavior

# Typical usage — import an existing database
postkit db import --url "postgres://user:pass@host:5432/existing_db"

# Preview first
postkit db import --url "postgres://user:pass@host:5432/existing_db" --dry-run

# Import a specific schema
postkit db import --url "postgres://..." --schema app_schema --name "v2_baseline"

Step-by-Step Flow

┌─────────────────────────────────────────────┐
│  1. Validate & Connect                       │
│     • Verify target database is reachable    │
│     • Check no active PostKit session exists  │
│     • Warn if .postkit/ state already exists  │
├─────────────────────────────────────────────┤
│  2. Dump Schema                              │
│     • pg_dump --schema-only (filtered)       │
│     • Separate infra statements (roles,      │
│       extensions, schemas) into schema/infra/ │
│     • Write table DDL to schema/schema.sql   │
│     • Extract GRANT statements → schema/grants/│
├─────────────────────────────────────────────┤
│  3. Create Baseline Migration                │
│     • Generate timestamped migration file    │
│       (e.g. 20260408120000_imported_baseline) │
│     • Migration contains full CREATE TABLE   │
│       statements with -- migrate:up / down   │
│     • Place in .postkit/db/migrations/       │
├─────────────────────────────────────────────┤
│  4. Initialize State                         │
│     • Set committed.json with baseline       │
│       migration recorded as applied          │
│     • Set .postkit/db/schema.sql snapshot    │
│     • Mark session as idle (no active session)│
├─────────────────────────────────────────────┤
│  5. Optional: Capture Seed Data              │
│     • If --skip-data is false, dump table    │
│       row counts and sample data to          │
│       schema/seeds/                          │
├─────────────────────────────────────────────┤
│  6. Report                                   │
│     • Tables imported: N                     │
│     • Indexes: N                             │
│     • Foreign keys: N                        │
│     • Migration file: .postkit/db/migrations/│
│       20260408120000_imported_baseline.sql   │
│     • Next step: postkit db start            │
└─────────────────────────────────────────────┘

Technical Design

1. Schema Introspection

Use pg_dump --schema-only --no-owner --no-privileges to extract DDL, then parse the output to categorize statements:

Category Destination
CREATE TABLE, indexes, constraints, triggers schema/schema.sql
CREATE EXTENSION, CREATE SCHEMA, role DDL schema/infra/
GRANT / REVOKE schema/grants/

2. Baseline Migration File

-- Migration: imported_baseline
-- Created: 2026-04-08T12:00:00Z
-- Source: postgres://user:***@host:5432/existing_db
-- This is an auto-generated baseline from an existing database.

-- migrate:up

CREATE TABLE IF NOT EXISTS users (
    id SERIAL PRIMARY KEY,
    email varchar(255) NOT NULL,
    ...
);

-- ... all other tables ...

-- migrate:down
-- (intentionally empty — baseline cannot be reversed)

3. State Initialization

{
  "migrations": [
    {
      "id": "20260408120000",
      "name": "imported_baseline",
      "appliedAt": "2026-04-08T12:00:00.000Z",
      "checksum": "sha256:abc123...",
      "source": "import",
      "tables": ["users", "categories", "products", "..."]
    }
  ],
  "importedFrom": {
    "url": "postgres://user:***@host:5432/existing_db",
    "importedAt": "2026-04-08T12:00:00.000Z",
    "schemaCount": 14,
    "version": "1.0.5"
  }
}

4. Edge Cases

Scenario Expected Behavior
.postkit/ already exists with state Abort: "PostKit is already initialized. Use postkit db import --force to overwrite."
.postkit/ exists but empty/corrupt Proceed with warning and repair
Active migration session Abort: "Cannot import while a migration session is active. Run postkit db commit or postkit db abort first."
Database is empty (no tables) Abort: "No tables found in the target database. Use postkit init for a new project."
Circular foreign key references Handle gracefully — defer constraint creation
Schema with views/functions Include in migration, categorize appropriately
--dry-run flag Output the full plan without writing any files

5. Config Generation

If postkit.config.json does not exist, auto-generate it:

{
  "db": {
    "localDbUrl": "<imported-url>",
    "schemaPath": "schema",
    "schema": "public",
    "remotes": {}
  }
}

Acceptance Criteria

  • postkit db import --url <url> successfully imports an existing database
  • Generated schema/schema.sql matches the source database DDL
  • Baseline migration file is created in .postkit/db/migrations/
  • committed.json correctly records the baseline as applied
  • After import, postkit db plan produces an empty diff (no changes)
  • postkit db startdb plandb applydb commit workflow works after import
  • --dry-run previews import without side effects
  • Existing .postkit/ state is protected (abort or --force)
  • Documentation is added for the import flow
  • Error messages are clear and actionable

Documentation Updates Required

1. New Section: "Migrating an Existing Project"

Location: docs/getting-started/migrating-to-postkit.md

Content should cover:

  • When to use postkit db import vs postkit init
  • Step-by-step import guide
  • What to expect after import
  • How to configure remotes for deployment after import
  • Troubleshooting common import issues

2. Update: README.md

Add import command to the usage section:

# Import an existing database into PostKit
postkit db import --url "postgres://user:pass@host:5432/myapp"

Add to the DB Module command table:

Command Description
postkit db import Import existing database into PostKit

3. Update: Quick Start Guide

Add an alternative path for existing projects:

# For existing projects with a database:
postkit db import --url "postgres://user:pass@host:5432/existing_db"
postkit db remote add staging "postgres://user:pass@staging-host:5432/myapp"
postkit db start    # Ready to make changes!

4. Update: Configuration Docs

Document the importedFrom field in committed.json and how it affects state tracking.

5. Update: DB Module Overview

Add db import to the command reference with full option documentation.


Out of Scope (Future Work)

  • Incremental import — Importing only specific tables/schemas (use --schema for now)
  • Migration from other tools — Converting Flyway/Liquibase/Prisma migration history
  • Data migration — Moving actual row data between databases (use db seed separately)
  • Reverse migration — Generating migrate:down SQL for the baseline
  • Multi-schema import — Importing multiple PostgreSQL schemas in one command

Priority Justification

High priority because:

  1. This is the most common onboarding question for new users
  2. Without it, PostKit is only usable for greenfield projects — a very small market
  3. The feature is well-scoped and does not require architectural changes
  4. It directly impacts adoption and evaluation of PostKit

Estimated Effort

Task Estimate
Schema introspection & dump parsing 2-3 days
Baseline migration generation 1 day
State initialization & edge cases 1-2 days
--dry-run mode 0.5 days
Config generation 0.5 days
Documentation 1 day
Testing 1-2 days
Total ~7-10 days

Metadata

Metadata

Assignees

Labels

enhancementNew feature or request

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions