PostgreSQL Wire Protocol Guide

A comprehensive guide to the PostgreSQL wire protocol as implemented in pg-wire-mock.

Quick Introduction to the PostgreSQL Wire Protocol

PostgreSQL uses a custom message-based protocol for communication between clients and the server. This protocol is supported over TCP/IP and also over Unix-domain sockets. The current version of the protocol is 3.0.

This project implements a minimal subset of the protocol, allowing you to experiment with it and learn how it works by building a mock server that can communicate with standard PostgreSQL clients like psql.

Introduction

PostgreSQL is a powerful, open-source relational database management system (RDBMS) that is widely used for handling large amounts of data efficiently. The PostgreSQL wire protocol is the communication language between PostgreSQL clients and servers.

In this repository, we'll explore the wire protocol in detail by building a mock implementation. We'll be using the psql command-line utility to interact with our mock server, observing how queries are transmitted over the network and how results are returned.

Frontend/Backend Protocol

The protocol that clients and servers use to communicate is called the "Frontend/Backend Protocol."

A typical message in the PostgreSQL message-based protocol follows this structure:

char tag | int32 len | payload

The first byte is the message type (a single character)
Followed by a 4-byte message length
Followed by the payload data

Common Message Types

Here are some of the most common message types:

'R': Authentication
'Q': Query
'X': Terminate
'Z': Ready for Query
'P': Parse
'B': Bind
'E': Error
'C': Command Complete
'D': Data Row
'T': Row Description
'N': Notice Response

For a detailed description of all message types, see the PostgreSQL wire protocol documentation.

Authentication Flow

The authentication flow between the client and server is as follows:

If authentication is required, the server sends an AuthenticationRequest. There are several authentication types that can be requested, including plain-text passwords and MD5-encrypted passwords.

Once authentication is complete (or if no auth is necessary), the server sends an AuthenticationOK message.

Authentication Request Packet

The authentication request packet consists of the following fields:


'R'	int32 len	int32 method	optional other

Using Wireshark to inspect an AuthenticationRequest packet:

Simple Query Flow

A standard query cycle starts with the client sending a Query message to the server. The query flow is as follows:

The client sends an SQL command (starting with 'Q')
The server replies with RowDescription ('T') detailing the result structure
The server sends DataRow ('D') messages for each row in the result
Finally, the server sends CommandComplete ('C') and ReadyForQuery ('Z')

Here's a visual representation of the query flow:

Let's examine each message type in detail:

Query Message ('Q')

The Query message structure:

'Q' | int32 len | char[len] query

In Wireshark, a query packet looks like:

RowDescription Message ('T')

The RowDescription message describes the structure of the result set:

'T' | int32 len | int16 numfields | str col | int32 tableoid | int16 colno | int32 typeoid | int16 typelen | int32 typmod | int16 format

For each field, it includes:

Column name
Table object ID
Column number
Data type object ID
Data type size
Type modifier
Format code (text/binary)

In Wireshark:

DataRow Message ('D')

The DataRow message contains the actual data for a single row:

'D' | int32 len | int16 numfields | int32 fieldlen | char[fieldlen] data ...

In Wireshark:

CommandComplete Message ('C')

The CommandComplete message indicates that the query is complete:

'C' | int32 len | str tag

In Wireshark:

ReadyForQuery Message ('Z')

The ReadyForQuery message indicates that the server is ready for another query:

'Z' | int32 len | 'I' or 'T' or 'E'

The status byte indicates:

'I': Idle (not in a transaction)
'T': In a transaction
'E': In a failed transaction

In Wireshark:

Extended Query Protocol

PostgreSQL also supports an extended query protocol which separates the process into multiple steps:

Parse: Client sends a Parse message with an SQL query
Bind: Client binds values to the parameters in the parsed query
Execute: Client requests execution of the bound query
Sync: Client requests synchronization

This extended protocol allows for better performance in applications that reuse similar queries with different parameters.

Error and Notice Messages

The server can send Error ('E') and Notice ('N') messages at any time. These messages have a similar format:

'E'/'N' | int32 len | byte field-type | str field-value | ... | \0

Common field types include:

'S': Severity
'C': Code
'M': Message
'D': Detail
'H': Hint

Transaction Management

PostgreSQL uses the following transaction states:

'I': Idle (not in a transaction)
'T': In a transaction block
'E': In a failed transaction block

The client can control transactions with SQL commands:

BEGIN: Start a transaction block
COMMIT: Commit the current transaction
ROLLBACK: Roll back the current transaction

Learning the Protocol

A great way to get a better understanding of the protocol is to:

Install Wireshark and inspect the packets sent between client and server
Review the PostgreSQL protocol documentation
Examine the implemented messages in this repository

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PostgreSQL Wire Protocol Guide

Quick Introduction to the PostgreSQL Wire Protocol

Introduction

Frontend/Backend Protocol

Common Message Types

Authentication Flow

Authentication Request Packet

Simple Query Flow

Query Message ('Q')

RowDescription Message ('T')

DataRow Message ('D')

CommandComplete Message ('C')

ReadyForQuery Message ('Z')

Extended Query Protocol

Error and Notice Messages

Transaction Management

Learning the Protocol

References

FilesExpand file tree

PROTOCOL.md

Latest commit

History

PROTOCOL.md

File metadata and controls

PostgreSQL Wire Protocol Guide

Quick Introduction to the PostgreSQL Wire Protocol

Introduction

Frontend/Backend Protocol

Common Message Types

Authentication Flow

Authentication Request Packet

Simple Query Flow

Query Message ('Q')

RowDescription Message ('T')

DataRow Message ('D')

CommandComplete Message ('C')

ReadyForQuery Message ('Z')

Extended Query Protocol

Error and Notice Messages

Transaction Management

Learning the Protocol

References