Running PostgreSQL with Docker

Now we want to do real data engineering. Let's use a Postgres database for that.

You can run a containerized version of Postgres that requires no installation beyond Docker itself. You only need to provide a few environment variables and a volume for storing data.

Running PostgreSQL in a Container

Postgres needs somewhere to store its data. The command below uses a Docker named volume called ny_taxi_postgres_data, so no folder has to be created beforehand. Here's how to run the container:

docker run -it --rm \
  -e POSTGRES_USER="root" \
  -e POSTGRES_PASSWORD="root" \
  -e POSTGRES_DB="ny_taxi" \
  -v ny_taxi_postgres_data:/var/lib/postgresql \
  -p 5432:5432 \
  postgres:18

Explanation of Parameters

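  • -it runs the container interactively with a terminal attached, so the Postgres logs print to your shell and Ctrl+C stops it
  • --rm removes the container when it stops
  • -e sets environment variables (user, password, database name)
  • -v ny_taxi_postgres_data:/var/lib/postgresql mounts a named volume, creating it if it does not exist yet
    • Docker manages this volume automatically
    • Data persists even after the container is removed
    • The volume lives in Docker's own storage area (on Linux, typically under /var/lib/docker/volumes)
  • -p 5432:5432 publishes container port 5432 on host port 5432 (the format is host:container)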
  • postgres:18 uses PostgreSQL version 18 (latest as of Dec 2025)
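The two numbers in the -p value do not have to match, which matters when host port 5432 is already taken by another Postgres. The sketch below only splits the value to show which half is which; describe_port_mapping is a hypothetical helper and 5433 is an illustrative port, neither comes from the original setup.

```shell
# Split a simple "-p HOST:CONTAINER" value into its two halves.
# Simplified: ignores the optional IP prefix and /protocol suffix
# that Docker also accepts in this argument.
describe_port_mapping() {
  host_port="${1%%:*}"       # text before the first colon
  container_port="${1##*:}"  # text after the last colon
  printf 'host %s -> container %s\n' "$host_port" "$container_port"
}

describe_port_mapping "5432:5432"   # prints: host 5432 -> container 5432
describe_port_mapping "5433:5432"   # prints: host 5433 -> container 5432
```

With a mapping like 5433:5432, Postgres still listens on 5432 inside the container, and clients on the host would connect to port 5433 instead.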

Alternative Approach - Bind Mount

First create the directory, then map it:

mkdir ny_taxi_postgres_data

docker run -it \
  -e POSTGRES_USER="root" \
  -e POSTGRES_PASSWORD="root" \
  -e POSTGRES_DB="ny_taxi" \
  -v "$(pwd)/ny_taxi_postgres_data":/var/lib/postgresql \
  -p 5432:5432 \
  postgres:18

Named Volume vs Bind Mount

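  • Named volume (name:/path): managed by Docker, easier to set up because there is no host directory to create
  • Bind mount (/host/path:/container/path): maps a directory on the host filesystem directly, giving more control over where the data lives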
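The difference shows up in the part of the -v value before the colon. As a rough sketch, assuming the simplified rule that a source starting with "/" is a host path and anything else is a volume name, the two commands above can be told apart like this. mount_kind is a hypothetical helper, not a Docker command.

```shell
# Rough classification of the source half of a "-v SOURCE:TARGET" value.
# Assumption: a source that starts with "/" is a host path (bind mount),
# anything else is treated as a volume name.
mount_kind() {
  source_part="${1%%:*}"   # text before the first colon
  case "$source_part" in
    /*) echo "bind mount" ;;
    *)  echo "named volume" ;;
  esac
}

mount_kind "ny_taxi_postgres_data:/var/lib/postgresql"            # prints: named volume
mount_kind "$(pwd)/ny_taxi_postgres_data:/var/lib/postgresql"     # prints: bind mount
```

This is why the bind mount example uses $(pwd): it expands to an absolute path, so Docker maps that host directory rather than looking up a volume by name.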

Connecting to PostgreSQL

Once the container is running, we can log into our database with pgcli.

Install pgcli:

uv add --dev pgcli

The --dev flag marks this as a development dependency (not needed in production). It will be added to the [dependency-groups] section of pyproject.toml instead of the main dependencies section.

Now use it to connect to Postgres:

uv run pgcli -h localhost -p 5432 -u root -d ny_taxi

  • uv run executes a command in the context of the virtual environment
  • -h is the host. Since we're running locally we can use localhost.
  • -p is the port.
  • -u is the username.
  • -d is the database name.
  • The password is not provided; it will be requested after running the command.

When prompted, enter the password: root
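pgcli also accepts the same connection details as a single URI, which can be handy in scripts. A minimal sketch, assuming the values used above; pg_uri is a hypothetical helper that only formats a string and does not contact the database.

```shell
# Build a connection URI from the same values passed to pgcli above.
# Arguments: $1=user  $2=host  $3=port  $4=database
pg_uri() {
  printf 'postgresql://%s@%s:%s/%s\n' "$1" "$2" "$3" "$4"
}

uri="$(pg_uri root localhost 5432 ny_taxi)"
echo "$uri"   # prints: postgresql://root@localhost:5432/ny_taxi

# With the container running, either form should open the same session:
#   uv run pgcli "$uri"
#   PGPASSWORD=root uv run pgcli "$uri"   # skips the password prompt
```

Setting PGPASSWORD is convenient for a throwaway local database like this one, but it exposes the password in the shell environment, so it is best avoided for real credentials.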

Basic SQL Commands

Try some SQL statements, along with the backslash commands \dt and \q that pgcli provides:

-- List tables
\dt

-- Create a test table
CREATE TABLE test (id INTEGER, name VARCHAR(50));

-- Insert data
INSERT INTO test VALUES (1, 'Hello Docker');

-- Query data
SELECT * FROM test;

-- Exit
\q
