Skip to content

SQLite converter loads all rows into memory, causing OOM on large tables #77

@taariq

Description

@taariq

Problem

The SQLite to PostgreSQL converter (src/sqlite/converter.rs) loads all rows into memory before inserting to PostgreSQL. This causes out-of-memory errors on large SQLite databases.

Example

A 2.3GB SQLite file with 7.2 million rows in the prices table would require 8-16GB of RAM to process, making the tool unusable for moderately-sized SQLite databases.

Root Cause

convert_table_to_jsonb() calls read_table_data() which returns Vec<HashMap<String, Value>> containing ALL rows, then converts them all to another Vec<(String, JsonValue)> before batch inserting.

// Current implementation - loads everything into memory
let rows = crate::sqlite::reader::read_table_data(conn, table)?;
let mut result = Vec::with_capacity(rows.len());
for row in rows.into_iter() {
    // ... convert each row
}

Proposed Solution

Implement streaming/batched processing similar to the xmin-based sync:

  1. Read rows in batches (e.g., 10,000 rows at a time)
  2. Convert and insert each batch before reading the next
  3. Track progress for resumability

This pattern already exists in src/xmin/reader.rs with keyset pagination.

Acceptance Criteria

  • SQLite converter processes rows in configurable batches (default 10,000)
  • Memory usage stays constant regardless of table size
  • Large tables (7M+ rows) can be migrated successfully
  • Existing tests continue to pass

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions