Draft
4 changes: 2 additions & 2 deletions doc/user/content/ingest-data/mysql/_index.md
@@ -11,8 +11,8 @@ menu:

## Change Data Capture (CDC)

-Materialize supports MySQL as a real-time data source. The [MySQL source](/sql/create-source/mysql/)
-uses MySQL's [binlog replication protocol](/sql/create-source/mysql/#change-data-capture)
+Materialize supports MySQL as a real-time data source. The [MySQL source](/sql/create-source/mysql-v2/)
+uses MySQL's [binlog replication protocol](/sql/create-source/mysql-v2/#change-data-capture)
to **continually ingest changes** resulting from CRUD operations in the upstream
database. The native support for MySQL Change Data Capture (CDC) in Materialize
gives you the following benefits:
6 changes: 1 addition & 5 deletions doc/user/content/ingest-data/mysql/amazon-aurora.md
@@ -385,11 +385,7 @@ your networking configuration.

### 3. Start ingesting data

-{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="create-source" %}}
-
-{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="create-source-options" %}}
-
-{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="schema-changes" %}}
+{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="ingest-data-step" %}}

[//]: # "TODO(morsapaes) Replace these Step 6. and 7. with guidance using the
new progress metrics in mz_source_statistics + console monitoring, when
6 changes: 1 addition & 5 deletions doc/user/content/ingest-data/mysql/amazon-rds.md
@@ -399,11 +399,7 @@ your networking configuration.

### 3. Start ingesting data

-{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="create-source" %}}
-
-{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="create-source-options" %}}
-
-{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="schema-changes" %}}
+{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="ingest-data-step" %}}

[//]: # "TODO(morsapaes) Replace these Step 6. and 7. with guidance using the
new progress metrics in mz_source_statistics + console monitoring, when
6 changes: 1 addition & 5 deletions doc/user/content/ingest-data/mysql/azure-db.md
@@ -198,11 +198,7 @@ your networking configuration.

### 3. Start ingesting data

-{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="create-source" %}}
-
-{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="create-source-options" %}}
-
-{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="schema-changes" %}}
+{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="ingest-data-step" %}}

### 4. Monitor the ingestion status

6 changes: 1 addition & 5 deletions doc/user/content/ingest-data/mysql/google-cloud-sql.md
@@ -192,11 +192,7 @@ your networking configuration.

### 3. Start ingesting data

-{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="create-source" %}}
-
-{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="create-source-options" %}}
-
-{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="schema-changes" %}}
+{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="ingest-data-step" %}}

### 4. Monitor the ingestion status

6 changes: 1 addition & 5 deletions doc/user/content/ingest-data/mysql/self-hosted.md
@@ -190,11 +190,7 @@ your networking configuration.

### 3. Start ingesting data

-{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="create-source" %}}
-
-{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="create-source-options" %}}
-
-{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="schema-changes" %}}
+{{% include-example file="examples/ingest_data/mysql/create_source_cloud" example="ingest-data-step" %}}

### 4. Monitor the ingestion status

2 changes: 1 addition & 1 deletion doc/user/content/ingest-data/mysql/source-versioning.md
@@ -53,7 +53,7 @@ Create a connection to your MySQL database using the [`CREATE CONNECTION` syntax
## Create a source

In Materialize, create a source using the [`CREATE SOURCE`
-syntax](/sql/create-source/mysql/).
+syntax](/sql/create-source/mysql-v2/).

```mzsql
CREATE SOURCE my_source
303 changes: 303 additions & 0 deletions doc/user/content/sql/create-source/mysql-v2.md
@@ -0,0 +1,303 @@
---
title: "CREATE SOURCE: MySQL (New Syntax)"
description: "Connecting Materialize to a MySQL database for Change Data Capture (CDC)."
pagerank: 40
menu:
main:
parent: 'create-source'
identifier: cs_mysql-v2
name: MySQL (New Syntax)
weight: 21
---

{{< private-preview />}}

{{< source-versioning-disambiguation is_new=true
other_ref="[old reference page](/sql/create-source/mysql/)" include_blurb=true >}}

## Prerequisites

{{% create-source/intro %}}
Materialize supports MySQL (5.7+) as a real-time data source. To connect to a
MySQL database, you first need to tweak its configuration to enable
[GTID-based binary log (binlog) replication](#change-data-capture) and set
[`binlog_row_metadata=FULL`](#change-data-capture), and then
[create a connection](#prerequisite-creating-a-connection-to-mysql) in
Materialize that specifies access and authentication parameters.
{{% /create-source/intro %}}

## Syntax

{{% include-syntax file="examples/create_source_mysql" example="syntax" %}}

## Ingesting data

After a source is created, you can create tables from the source referencing
upstream MySQL tables that have GTID-based binlog replication enabled. You can
create multiple tables that reference the same upstream table.

See [`CREATE TABLE FROM SOURCE`](/sql/create-table/) for details.
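
As a minimal sketch of the point above, the following creates two tables that reference the same upstream table. It assumes a source named `mz_source` and an upstream table `mydb.orders`, as in the example on this page; the table name `orders_audit` is hypothetical.

```mzsql
-- Assumes `mz_source` already exists (see the example on this page).
-- Both tables ingest from the same upstream MySQL table.
CREATE TABLE orders FROM SOURCE mz_source (REFERENCE mydb.orders);
CREATE TABLE orders_audit FROM SOURCE mz_source (REFERENCE mydb.orders);
```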

#### Handling table schema changes

@bosconi bosconi Apr 21, 2026

rule_020 (section-split, conf 2): heading level skip — ## Ingesting data jumps to #### with no intervening ###. Promoting to ### makes this a peer of ### Change data capture and ### Monitoring source progress.

Suggested change
-#### Handling table schema changes
+### Handling table schema changes


Using `CREATE SOURCE` with the new [`CREATE TABLE FROM
SOURCE`](/sql/create-table/) syntax lets you handle certain upstream DDL
changes without downtime.

See [`CREATE TABLE FROM SOURCE`](/sql/create-table/#handling-table-schema-changes) for details.

#### Supported types

rule_020 (section-split, conf 2): heading level skip — promote to ### for consistent hierarchy.

Suggested change
-#### Supported types
+### Supported types


With the new syntax, after a MySQL source is created, you use [`CREATE TABLE FROM
SOURCE`](/sql/create-table/) to create a corresponding table in Materialize and
start ingesting data.

{{% include-from-yaml data="mysql_source_details"
name="mysql-supported-types" %}}

{{% include-from-yaml data="mysql_source_details"
name="mysql-unsupported-types" %}}

For more information, including strategies for handling unsupported types,
see [`CREATE TABLE FROM SOURCE`](/sql/create-table/).

#### Upstream table truncation restrictions

rule_020 (section-split, conf 2): heading level skip — promote to ### for consistent hierarchy.

Suggested change
-#### Upstream table truncation restrictions
+### Upstream table truncation restrictions


{{% include-from-yaml data="mysql_source_details"
name="mysql-truncation-restriction" %}}

For additional considerations, see also [`CREATE TABLE`](/sql/create-table/).

### Change data capture

{{< note >}}

For step-by-step instructions on enabling GTID-based binlog replication for your
MySQL service, see the integration guides:
[Amazon RDS](/ingest-data/mysql/amazon-rds/),
[Amazon Aurora](/ingest-data/mysql/amazon-aurora/),
[Azure DB](/ingest-data/mysql/azure-db/),
[Google Cloud SQL](/ingest-data/mysql/google-cloud-sql/),
[Self-hosted](/ingest-data/mysql/self-hosted/).

{{< /note >}}

The source uses MySQL's binlog replication protocol to **continually ingest
changes** resulting from `INSERT`, `UPDATE` and `DELETE` operations in the
upstream database. This process is known as _change data capture_.

The replication method used is based on [global transaction identifiers
(GTIDs)](https://dev.mysql.com/doc/refman/8.0/en/replication-gtids.html), and
guarantees **transactional consistency** — any operation inside a MySQL
transaction is assigned the same timestamp in Materialize, which means that the
source will never show partial results based on partially replicated
transactions.
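
To illustrate, consider a transaction like the following in the upstream database. This is a sketch only: the `mydb.inventory` table and all column names are hypothetical. Because both statements commit under a single GTID, Materialize would be expected to surface both changes at the same timestamp, never one without the other.

```sql
-- Run against the upstream MySQL database (hypothetical schema).
START TRANSACTION;
INSERT INTO mydb.orders (id, item_id, quantity) VALUES (1001, 42, 2);
UPDATE mydb.inventory SET stock = stock - 2 WHERE item_id = 42;
COMMIT;
```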

Before creating a source in Materialize, you **must** configure the upstream
MySQL database for GTID-based binlog replication:

{{% mysql-direct/ingesting-data/mysql-configs %}}

In addition, the new `CREATE TABLE FROM SOURCE` syntax requires full row
metadata in the binlog. You must set the following system variable on the
upstream MySQL server:

```sql
SET GLOBAL binlog_row_metadata = FULL;
```
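
To check the configuration, something like the following can be run on the upstream server. This is a sketch: the variables other than `binlog_row_metadata` are assumptions about what the GTID-based replication settings above cover, so treat the configuration list as authoritative. Note that `SET GLOBAL` does not survive a server restart; on MySQL 8.0, `SET PERSIST` is one way to keep the value.

```sql
-- Inspect the replication-related settings on the upstream MySQL server.
SELECT @@gtid_mode,
       @@enforce_gtid_consistency,
       @@binlog_format,
       @@binlog_row_metadata;

-- Optionally persist the setting across restarts (MySQL 8.0+).
SET PERSIST binlog_row_metadata = FULL;
```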

If you're running MySQL using a managed service, additional configuration
changes might be required. For step-by-step instructions on enabling GTID-based
binlog replication for your MySQL service, see the integration guides.
Comment on lines +109 to +111

rule_013 (tone-tightening, conf 2): shorten verbose cross-reference.

Suggested change
-If you're running MySQL using a managed service, additional configuration
-changes might be required. For step-by-step instructions on enabling GTID-based
-binlog replication for your MySQL service, see the integration guides.
+If you're running MySQL using a managed service, additional configuration
+changes might be required. To enable GTID-based binlog replication for your
+MySQL service, see the integration guides.


#### Binlog retention

{{< warning >}}

If Materialize tries to resume replication and finds GTID gaps due to missing
binlog files, the source enters an errored state and you have to drop and
recreate it.

{{< /warning >}}
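
If a source does end up in this state, recovery might look like the sketch below. It assumes the source, connection, and table names used in the example on this page (`mz_source`, `mysql_connection`, `orders`). `CASCADE` also drops objects that depend on the source, so check the dependencies first.

```mzsql
-- Drop the errored source together with its dependent objects.
DROP SOURCE mz_source CASCADE;

-- Recreate the source and its tables; ingestion starts over from a new snapshot.
CREATE SOURCE mz_source
  FROM MYSQL CONNECTION mysql_connection;
CREATE TABLE orders FROM SOURCE mz_source (REFERENCE mydb.orders);
```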

By default, MySQL retains binlog files for **30 days** (i.e., 2592000 seconds)
before automatically removing them. This is configurable via the
[`binlog_expire_logs_seconds`](https://dev.mysql.com/doc/mysql-replication-excerpt/8.0/en/replication-options-binary-log.html#sysvar_binlog_expire_logs_seconds)
system variable. We recommend using the default value for this configuration in
order to not compromise Materialize's ability to resume replication in case of
failures or restarts.

In some MySQL managed services, binlog expiration can be overridden by a
service-specific configuration parameter. Check whether such a parameter
exists and, if it does, set it to the maximum interval available.

As an example, [Amazon RDS for MySQL](/ingest-data/mysql/amazon-rds/) has its
own configuration parameter for binlog retention ([`binlog retention hours`](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/mysql-stored-proc-configuring.html#mysql_rds_set_configuration-usage-notes.binlog-retention-hours))
that overrides `binlog_expire_logs_seconds` and is set to `NULL` by default.
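
A sketch of how you might inspect and adjust retention. The `mysql.rds_show_configuration` and `mysql.rds_set_configuration` procedures are specific to Amazon RDS, and the value of 168 hours is an assumption about the maximum RDS allows, so confirm it against the AWS documentation for your engine.

```sql
-- Current retention on a MySQL 8.0 server, in seconds.
SELECT @@binlog_expire_logs_seconds;

-- Amazon RDS only: show, then raise, the service-level retention.
CALL mysql.rds_show_configuration;
CALL mysql.rds_set_configuration('binlog retention hours', 168);
```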

### Monitoring source progress

[//]: # "TODO(morsapaes) Replace this section with guidance using the new
progress metrics in mz_source_statistics + console monitoring, when available
(also for PostgreSQL)."
Comment on lines +141 to +143

rule_032 (comment-removal, conf 5): TODO(morsapaes) is a contributor name, not a tracked ticket — the carve-out for tracked issues does not apply.

Suggested change
-[//]: # "TODO(morsapaes) Replace this section with guidance using the new
-progress metrics in mz_source_statistics + console monitoring, when available
-(also for PostgreSQL)."


By default, MySQL sources expose progress metadata as a subsource that you can
use to monitor source **ingestion progress**. The name of the progress subsource
can be specified when creating a source using the `EXPOSE PROGRESS AS` clause;
otherwise, it will be named `<src_name>_progress`.
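
For instance, a source with a custom progress subsource name might be created as sketched below. The names `mz_source`, `mysql_connection`, and `mz_source_progress_custom` are illustrative, and the clause placement is an assumption; the Syntax section is authoritative.

```mzsql
-- Name the progress subsource explicitly instead of using the default.
CREATE SOURCE mz_source
  FROM MYSQL CONNECTION mysql_connection
  EXPOSE PROGRESS AS mz_source_progress_custom;
```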

The following metadata is available for each source as a progress subsource:

Field | Type | Details
-------------------|---------------------------------------------------------|--------------
`source_id_lower` | [`uuid`](/sql/types/uuid/) | The lower-bound GTID `source_id` of the GTIDs covered by this range.
`source_id_upper` | [`uuid`](/sql/types/uuid/) | The upper-bound GTID `source_id` of the GTIDs covered by this range.
`transaction_id` | [`uint8`](/sql/types/uint/#uint8-info) | The `transaction_id` of the next GTID possible from the GTID `source_id`s covered by this range.

The progress subsource can be queried using:

```mzsql
SELECT transaction_id
FROM <src_name>_progress;
```
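
To see the whole range rather than a single column, a query along these lines may be useful. It assumes a source named `mz_source` with the default progress subsource name.

```mzsql
-- Each row describes a range of GTID source IDs and the next expected transaction.
SELECT source_id_lower, source_id_upper, transaction_id
FROM mz_source_progress;
```

On the MySQL side, `SELECT @@GLOBAL.gtid_executed;` returns the executed GTID set for comparison.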

Progress metadata is represented as a [GTID set](https://dev.mysql.com/doc/refman/8.0/en/replication-gtids-concepts.html)
of future possible GTIDs, which is similar to the
[`gtid_executed`](https://dev.mysql.com/doc/refman/8.0/en/replication-options-gtids.html#sysvar_gtid_executed)
system variable on a MySQL replica. The reported `transaction_id` should
increase as Materialize consumes **new** binlog records from the upstream MySQL
database. For more details on monitoring source ingestion progress and debugging
related issues, see [Troubleshooting](/ops/troubleshooting/).
Comment on lines +170 to +171

rule_013 (tone-tightening, conf 2): shorten verbose cross-reference.

Suggested change
-database. For more details on monitoring source ingestion progress and debugging
-related issues, see [Troubleshooting](/ops/troubleshooting/).
+database. For more information, see [Troubleshooting](/ops/troubleshooting/).


## Known limitations

{{% include-headless "/headless/mysql-considerations" %}}

## Example

{{< important >}}

Before creating a MySQL source, you must enable GTID-based binlog replication in
the upstream database. For step-by-step instructions, see the integration guide
for your MySQL service: [Amazon RDS](/ingest-data/mysql/amazon-rds/),
[Amazon Aurora](/ingest-data/mysql/amazon-aurora/),
[Azure DB](/ingest-data/mysql/azure-db/),
[Google Cloud SQL](/ingest-data/mysql/google-cloud-sql/),
[Self-hosted](/ingest-data/mysql/self-hosted/).

{{< /important >}}

### Creating a source {#create-source-example}

#### Prerequisite: Creating a connection to MySQL

First, you must create a connection to your MySQL database. A connection
describes how to connect and authenticate to an external system you want
Materialize to read data from.

Once created, a connection is **reusable** across multiple `CREATE SOURCE`
statements. For more details on creating connections, check the
[`CREATE CONNECTION`](/sql/create-connection/#mysql) documentation page.
Comment on lines +199 to +201

rule_013 (tone-tightening, conf 2): shorten verbose cross-reference.

Suggested change
-Once created, a connection is **reusable** across multiple `CREATE SOURCE`
-statements. For more details on creating connections, check the
-[`CREATE CONNECTION`](/sql/create-connection/#mysql) documentation page.
+Once created, a connection is **reusable** across multiple `CREATE SOURCE`
+statements. For more information, see [`CREATE CONNECTION`](/sql/create-connection/#mysql).


```mzsql
CREATE SECRET mysqlpass AS '<MYSQL_PASSWORD>';

CREATE CONNECTION mysql_connection TO MYSQL (
HOST 'instance.foo000.us-west-1.rds.amazonaws.com',
PORT 3306,
USER 'materialize',
PASSWORD SECRET mysqlpass
);
```
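
After creating the connection, you may want to confirm that Materialize can reach the database. A minimal sketch using `VALIDATE CONNECTION`, assuming the connection name from the block above:

```mzsql
-- Raises an error if Materialize cannot connect or authenticate.
VALIDATE CONNECTION mysql_connection;
```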

If your MySQL server is not exposed to the public internet, you can [tunnel the
connection](/sql/create-connection/#network-security-connections) through an
AWS PrivateLink service (Materialize Cloud) or an SSH bastion host.

{{< tabs tabID="1" >}}
{{< tab "AWS PrivateLink (Materialize Cloud)">}}

{{< include-md file="shared-content/aws-privatelink-cloud-only-note.md" >}}

```mzsql
CREATE CONNECTION privatelink_svc TO AWS PRIVATELINK (
SERVICE NAME 'com.amazonaws.vpce.us-east-1.vpce-svc-0e123abc123198abc',
AVAILABILITY ZONES ('use1-az1', 'use1-az4')
);

CREATE CONNECTION mysql_connection TO MYSQL (
HOST 'instance.foo000.us-west-1.rds.amazonaws.com',
PORT 3306,
USER 'root',
PASSWORD SECRET mysqlpass,
AWS PRIVATELINK privatelink_svc
);
```

For step-by-step instructions on creating AWS PrivateLink connections and
configuring an AWS PrivateLink service to accept connections from Materialize,
check [this guide](/ops/network-security/privatelink/).
Comment on lines +238 to +240

rule_013 (tone-tightening, conf 2): shorten verbose cross-reference.

Suggested change
-For step-by-step instructions on creating AWS PrivateLink connections and
-configuring an AWS PrivateLink service to accept connections from Materialize,
-check [this guide](/ops/network-security/privatelink/).
+For more information, see [this guide](/ops/network-security/privatelink/).


{{< /tab >}}
{{< tab "SSH tunnel">}}

```mzsql
CREATE CONNECTION ssh_connection TO SSH TUNNEL (
HOST 'bastion-host',
PORT 22,
USER 'materialize'
);
```

```mzsql
CREATE CONNECTION mysql_connection TO MYSQL (
HOST 'instance.foo000.us-west-1.rds.amazonaws.com',
SSH TUNNEL ssh_connection
);
```
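
The bastion host has to authorize the key pair that Materialize generates for the tunnel. As a sketch, the public keys can be looked up in the system catalog; this assumes the `mz_ssh_tunnel_connections` and `mz_connections` catalog relations, which are not described on this page.

```mzsql
-- List the public keys for the SSH tunnel connection created above.
SELECT c.name, t.public_key_1, t.public_key_2
FROM mz_ssh_tunnel_connections AS t
JOIN mz_connections AS c ON c.id = t.id
WHERE c.name = 'ssh_connection';
```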

For step-by-step instructions on creating SSH tunnel connections and configuring
an SSH bastion server to accept connections from Materialize, check
[this guide](/ops/network-security/ssh-tunnel/).
Comment on lines +260 to +262

rule_013 (tone-tightening, conf 2): shorten verbose cross-reference.

Suggested change
-For step-by-step instructions on creating SSH tunnel connections and configuring
-an SSH bastion server to accept connections from Materialize, check
-[this guide](/ops/network-security/ssh-tunnel/).
+For more information, see [this guide](/ops/network-security/ssh-tunnel/).


{{< /tab >}}
{{< /tabs >}}

#### Creating the source in Materialize

You **must** enable GTID-based binlog replication before creating the source.
See [Change data capture](#change-data-capture) for configuration details.

Once replication is configured, create a `SOURCE` in Materialize:

```mzsql
CREATE SOURCE mz_source
FROM MYSQL CONNECTION mysql_connection;
```

After a source is created, you can create a table from the source, referencing
specific upstream table(s).

_Creates a table in Materialize from the upstream table `mydb.orders`_

```mzsql
CREATE TABLE orders FROM SOURCE mz_source (REFERENCE mydb.orders);
```
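
Once the table exists, a quick check like the following can confirm that data is flowing. It is a sketch that assumes the `mz_internal.mz_source_statuses` relation and the names used above.

```mzsql
-- Check the source's status and any reported error.
SELECT name, status, error
FROM mz_internal.mz_source_statuses
WHERE name = 'mz_source';

-- Peek at a few ingested rows.
SELECT * FROM orders LIMIT 5;
```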

### Modifying an existing source

{{< include-md file="headless/alter-source-snapshot-blocking-behavior.md" >}}

## Related pages

- [`CREATE TABLE`](/sql/create-table/)
- [`CREATE SECRET`](/sql/create-secret/)
- [`CREATE CONNECTION`](/sql/create-connection/)
- [`CREATE SOURCE`](../)
- MySQL integration guides:
- [Amazon RDS](/ingest-data/mysql/amazon-rds/)
- [Amazon Aurora](/ingest-data/mysql/amazon-aurora/)
- [Azure DB](/ingest-data/mysql/azure-db/)
- [Google Cloud SQL](/ingest-data/mysql/google-cloud-sql/)
- [Self-hosted](/ingest-data/mysql/self-hosted/)