Commit fe8f06d

PS-10995 [DOCS] -[feedback] PS 8.4 Innodb index creation
modified: docs/innodb-expanded-fast-index-creation.md
1 parent 8b2072e commit fe8f06d

2 files changed

Lines changed: 200 additions & 27 deletions

docs/extended-mysqldump.md

Lines changed: 10 additions & 0 deletions
@@ -17,6 +17,16 @@ More information can be found in [Backup Locks](backup-locks.md).
1717
More information can be found in
1818
[Compressed columns with dictionaries](compressed-columns.md).
1919

20+
## `InnoDB` secondary keys and `--innodb-optimize-keys`
21+
22+
For *InnoDB* tables, `--innodb-optimize-keys` omits secondary keys (and related
23+
constraints) from the initial `CREATE TABLE` in the dump and adds them in a
24+
follow-up `ALTER TABLE` after the data is loaded. That pattern works well when
25+
the target server can build those indexes using [expanded fast index
26+
creation](innodb-expanded-fast-index-creation.md). See that page for
27+
limitations (foreign keys, partitioned tables, `AUTO_INCREMENT`, implicit
28+
primary keys, and others) and for the `expand_fast_index_creation` variable.
29+
2030
## Taking backup by descending primary key order
2131

2232
`--order-by-primary-desc` tells `mysqldump` to take the backup by
docs/innodb-expanded-fast-index-creation.md

Lines changed: 190 additions & 27 deletions
@@ -1,19 +1,144 @@
11
# Expanded fast index creation
22

3-
Percona has implemented several changes related to *MySQL*’s fast index creation
4-
feature. Fast index creation was implemented in *MySQL* as a way to speed up the
5-
process of adding or dropping indexes on tables with many rows.
3+
## What fast index creation is
64

7-
This feature implements a session variable that enables extended fast index
8-
creation. Besides optimizing DDL directly,
9-
[expand_fast_index_creation](#expanded-fast-index-creation) may also optimize index access for
10-
subsequent DML statements because using it results in much less fragmented
11-
indexes.
5+
In *InnoDB*, secondary indexes are separate B-tree structures from the clustered
6+
index (the primary key). When the server creates a new secondary index on
7+
an existing table, it can do so in two conceptually different ways:
128

13-
## The **mysqldump** command
9+
1. Row-by-row maintenance — For each row, insert that row’s entries into the
10+
new secondary index as you go. Those inserts arrive in primary-key order, not
11+
in secondary-key order, so the growing B-tree suffers many random-looking
12+
page splits and a large amount of write amplification. If the server is
13+
also copying the table (rebuild `ALTER TABLE`), the same pattern applies:
14+
every row copied into the new table must update every secondary index
15+
immediately.
1416

15-
A new option, `--innodb-optimize-keys`, was implemented in **mysqldump**. It
16-
changes the way *InnoDB* tables are dumped, so that secondary and foreign keys
17+
2. Fast index creation (sorted / bulk build) — The server scans the table’s
18+
clustered index in primary-key order, generates secondary-key tuples, sorts
19+
them (often using external merge sort), and builds the secondary index from
20+
that ordered stream. Work is staged in temporary files under the configured
21+
`tmpdir`, then merged into a compact B-tree. That path avoids the worst
22+
random-insert behavior of the row-by-row approach and usually completes with
23+
less I/O and a less fragmented index.
24+
25+
So fast index creation means: build the secondary index from a sorted
26+
stream after reading the table (or a copy) in clustered index order, instead of
27+
growing the index by arbitrary-order inserts during the same phase as the data
28+
copy.
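
The sorted build described above can be sketched in a few lines of Python. This is a conceptual illustration only, not *InnoDB*'s implementation: the function name `build_sorted_entries`, the `run_size` buffer, the integer keys, and the tab-separated run files are all assumptions made for the example. It shows the general technique of sorting bounded runs, spilling them to temporary files, and merging the runs into one ordered stream.

```python
import heapq
import os
import tempfile


def build_sorted_entries(rows, run_size=3, tmpdir=None):
    """Return (secondary_key, primary_key) pairs in secondary-key order.

    rows: iterable of (primary_key, secondary_key) integer pairs, as read
    from the clustered index in primary-key order.
    run_size: hypothetical in-memory sort buffer, counted in entries.
    tmpdir: directory for the sorted run files (None = system default).
    """
    run_paths = []

    def spill(entries):
        # Sort one buffer-sized run and write it to its own temporary file.
        entries.sort()
        fd, path = tempfile.mkstemp(dir=tmpdir, suffix=".run")
        with os.fdopen(fd, "w") as f:
            for sec, pk in entries:
                f.write(f"{sec}\t{pk}\n")
        run_paths.append(path)

    def read_run(path):
        # Stream one sorted run back from disk.
        with open(path) as f:
            for line in f:
                sec, pk = line.split("\t")
                yield int(sec), int(pk)

    try:
        run = []
        for pk, sec in rows:
            run.append((sec, pk))
            if len(run) >= run_size:
                spill(run)
                run = []
        if run:
            spill(run)
        # Merge the sorted runs into a single ordered stream.
        return list(heapq.merge(*(read_run(p) for p in run_paths)))
    finally:
        # Temporary run files are transient, like the files under tmpdir.
        for p in run_paths:
            os.remove(p)


# Rows arrive in primary-key order; secondary keys are unordered.
rows = [(1, 50), (2, 10), (3, 40), (4, 10), (5, 30), (6, 20), (7, 60)]
entries = build_sorted_entries(rows)
```

The result is ordered by secondary key, so an index built from it can be filled page by page instead of absorbing inserts at arbitrary positions. The run files are also why the sorted path needs transient disk space in addition to the final index.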
29+
30+
Dropping an index is already a cheap metadata change in many cases; the
31+
performance win is dominated by creating indexes on large tables.
32+
33+
## How this differs from Oracle MySQL {{vers}}
34+
35+
The figure below contrasts a typical Oracle MySQL workflow with Percona Server
36+
when expanded fast index creation is in use (including
37+
`mysqldump --innodb-optimize-keys` during restore).
38+
39+
![Comparison of standard MySQL and Percona Server with expanded fast index creation for backup restore and copy-style ALTER TABLE or OPTIMIZE TABLE](_static/expand-fast-creation.png)
40+
41+
The following compares Percona Server for MySQL with Oracle MySQL {{vers}} on
42+
code paths that still rebuild the table. InnoDB classifies each `ALTER TABLE`
43+
operation by algorithm (`INSTANT`, `INPLACE`, `COPY`, and so on); those
44+
classifications can change between releases, so treat the [InnoDB online DDL
45+
documentation :octicons-link-external-16:](https://dev.mysql.com/doc/refman/{{vers}}/en/innodb-online-ddl.html)
46+
as authoritative for whether a specific statement performs a copy in your
47+
version.
48+
49+
Upstream *MySQL* (*InnoDB*) already uses a sorted, bulk-style path when it
50+
adds a secondary index in operations that are implemented as “add index
51+
only” (for example, some `CREATE INDEX` / `ALTER TABLE ... ADD INDEX` flows
52+
that do not rebuild the whole table).
53+
54+
Where Oracle MySQL still does a full table rebuild (copy algorithm —
55+
for example many `ALTER TABLE` changes that force a new table), rows are
56+
inserted into the new copy while all secondary indexes are live. Each
57+
insert must update every non-primary index at once. Even if the server later
58+
uses efficient mechanics for individual index builds, interleaving those
59+
updates with the copy keeps more indexes “hot” for the whole copy and tends to
60+
produce heavier random I/O and more fragmented trees than deferring secondary
61+
index creation until the clustered data is complete.
62+
63+
Percona Server for MySQL extends that behavior with expanded fast index
64+
creation (controlled by
65+
[`expand_fast_index_creation`](#expanded-fast-index-creation)): on rebuild-style
66+
`ALTER TABLE` / `OPTIMIZE TABLE`, eligible non-unique secondary indexes are
67+
dropped for the copy phase and recreated afterward using the fast sorted-build
68+
path on the finished table. The copy phase then maintains only what *InnoDB*
69+
requires for the clustered index (and any indexes that cannot be deferred),
70+
which is the main difference from Oracle MySQL on the same code paths.
71+
72+
Oracle MySQL {{vers}} can apply `INSTANT` or in-place (`INPLACE`)
73+
DDL to many `ALTER TABLE` operations so the server avoids a full table copy or
74+
keeps work inside the existing *InnoDB* file. That path is separate from the
75+
rebuild logic `expand_fast_index_creation` augments; there is no interaction to
76+
“tune” for those statements.
77+
78+
## When this optimization applies
79+
80+
### `INSTANT`, `INPLACE`, and why this variable usually does not matter for them
81+
82+
If an `ALTER TABLE` runs as `INSTANT` (for example, adding a nullable column at
83+
the end of the table when supported) or as an online in-place operation that
84+
does not rebuild the whole table, the server is not performing a full table
85+
copy, which is the path this feature optimizes. In those cases
86+
`expand_fast_index_creation` is generally unnecessary: the expensive secondary
87+
index pattern this feature improves simply is not used in the same way.
88+
89+
### When `expand_fast_index_creation` helps
90+
91+
`expand_fast_index_creation` is most beneficial when the operation requires a
92+
table copy—for example changing a column’s data type in a way that forces a
93+
rebuild, or other alters classified with the copy algorithm. On that path,
94+
Percona Server intercepts the copy so eligible non-unique secondary indexes are
95+
rebuilt with the sorted temporary-file workflow instead of being maintained on
96+
every inserted row during the copy.
97+
98+
Expanded fast index creation only affects statements that rebuild the table and
99+
copy rows into a new *InnoDB* table. Typical cases include:
100+
101+
* `OPTIMIZE TABLE` on an *InnoDB* table (internally `ALTER TABLE ... ENGINE=InnoDB`)
102+
* `ALTER TABLE` operations that the server implements with a table rebuild and
103+
the copy algorithm, as listed in the [InnoDB online DDL operations
104+
table :octicons-link-external-16:](https://dev.mysql.com/doc/refman/{{vers}}/en/innodb-online-ddl-operations.html)
105+
* An `ALTER TABLE` where you explicitly request `ALGORITHM=COPY` (when that
106+
algorithm is permitted for the operation)
107+
108+
Routine schema changes that stay on `INSTANT` or `INPLACE` never enter this path
109+
and are unaffected by `expand_fast_index_creation`.
110+
111+
## Verify and monitor
112+
113+
* Check whether the feature is enabled:
114+
115+
```sql
116+
SHOW VARIABLES LIKE 'expand_fast_index_creation';
117+
```
118+
119+
In Percona Server for MySQL {{vers}} the default is `OFF`. Enable it for a
120+
session or globally before running DDL, for example
121+
`SET SESSION expand_fast_index_creation = ON;`.
122+
123+
* To see how *MySQL* classifies a specific `ALTER TABLE`, use the online DDL
124+
documentation for your version (linked [above](#when-this-optimization-applies)).
125+
There is no single `EXPLAIN` for DDL; classification is per operation and
126+
version.
127+
128+
* [`tmpdir`](https://dev.mysql.com/doc/refman/{{vers}}/en/server-system-variables.html#sysvar_tmpdir)
129+
free space is the usual operational bottleneck; see
130+
[Limitations](#limitations) for how large it must be and what happens when it
131+
is exhausted.
132+
133+
Besides shortening DDL directly,
134+
[`expand_fast_index_creation`](#expanded-fast-index-creation) may also help
135+
subsequent DML because indexes built in one sorted pass are often less
136+
fragmented than those maintained incrementally through a long copy.
137+
138+
## The mysqldump command
139+
140+
The `--innodb-optimize-keys` option changes the way *InnoDB* tables are dumped,
141+
so that secondary and foreign keys
17142
are created after loading the data, thus taking advantage of fast index
18143
creation. More specifically:
19144

@@ -25,7 +150,7 @@ create the previously omitted keys.
25150

26151
## `ALTER TABLE`
27152

28-
When `ALTER TABLE` requires a table copy, secondary keys are now dropped and
153+
When `ALTER TABLE` requires a table copy, secondary keys are dropped and
29154
recreated later, after copying the data. The following restrictions apply:
30155

31156
* Only non-unique keys can be involved in this optimization.
@@ -39,16 +164,42 @@ keys.
39164
## `OPTIMIZE TABLE`
40165

41166
Internally, `OPTIMIZE TABLE` is mapped to `ALTER TABLE ... ENGINE=innodb`
42-
for *InnoDB* tables. As a consequence, it now also benefits from fast index
43-
creation, with the same restrictions as for `ALTER TABLE`.
167+
for *InnoDB* tables. As a consequence, it also benefits from fast index
168+
creation when `expand_fast_index_creation` is enabled and the optimization
169+
applies, with the same restrictions as for `ALTER TABLE`.
170+
171+
## Limitations
172+
173+
!!! warning "`tmpdir` free space — the most common failure"
174+
175+
In practice, the usual reason expanded fast index creation fails is
176+
running out of disk space on the filesystem used for
177+
[`tmpdir`](https://dev.mysql.com/doc/refman/{{vers}}/en/server-system-variables.html#sysvar_tmpdir)
178+
(often the same mount as `/tmp`).
44179

45-
## Caveats
180+
With this optimization enabled, the server does more than make index
181+
maintenance cheaper in memory: it materializes each secondary index in
182+
temporary files (sorted runs and merge passes) and only then merges the
183+
result into the final *InnoDB* index. That can consume far more transient
184+
space than a rough “indexes fit in the tablespace” estimate suggests.
46185

47-
*InnoDB* fast index creation uses temporary files in tmpdir for all indexes
48-
being created. So make sure you have enough tmpdir space when using
49-
[expand_fast_index_creation](#expanded-fast-index-creation). It is a session variable, so you can
50-
temporarily switch it off if you are short on tmpdir space and/or don’t want
51-
this optimization to be used for a specific table.
186+
Size the filesystem using the secondary index footprint you are
187+
rebuilding, not the primary table size alone. You typically need
188+
well above the on-disk size of those secondary indexes as free
189+
space under `tmpdir`, on top of anything else the same `ALTER TABLE` or
190+
`OPTIMIZE TABLE` already needs. For example, a table with about 500 GB
191+
of data and about 200 GB of secondary indexes may still require
192+
significantly more than 200 GB of free `tmpdir` space while those
193+
indexes are being built.
194+
195+
If `tmpdir` fills during the operation, the statement fails and rolls
196+
back. You lose the work done up to that point and must free or enlarge
197+
storage (or point `tmpdir` at a larger volume), or run with
198+
`expand_fast_index_creation` disabled for that job, before retrying.
199+
200+
[`expand_fast_index_creation`](#expanded-fast-index-creation) is a session or
201+
global variable: you can set it to `OFF` for a single session if `tmpdir` is
202+
too small for a specific table or maintenance window.
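
A pre-flight check along the following lines may help avoid the `tmpdir` failure described above. It is a minimal sketch under stated assumptions: the helper name `tmpdir_headroom` is hypothetical, the `safety_factor` of 2.0 is an arbitrary placeholder rather than a Percona recommendation, and the path passed in must be the server's actual `tmpdir` (the example uses Python's own temporary directory only so that it runs anywhere).

```python
import shutil
import tempfile


def tmpdir_headroom(tmpdir, secondary_index_bytes, safety_factor=2.0):
    """Return (enough, free_bytes, required_bytes) for a planned rebuild.

    secondary_index_bytes: on-disk size of the secondary indexes to rebuild.
    safety_factor: assumed multiplier; tune it from your own measurements.
    """
    required = int(secondary_index_bytes * safety_factor)
    free = shutil.disk_usage(tmpdir).free
    return free >= required, free, required


# Example: about 200 GB of secondary indexes, as in the scenario above.
# tempfile.gettempdir() stands in for the MySQL server's tmpdir here.
ok, free, required = tmpdir_headroom(tempfile.gettempdir(), 200 * 1024**3)
print(f"free={free} required={required} enough={ok}")
```

If the check reports too little space, the options are the ones already listed: enlarge or relocate `tmpdir`, or set `expand_fast_index_creation` to `OFF` for that session.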
52203

53204
There are also a number of cases in which this optimization is not applicable:
54205

@@ -62,16 +213,16 @@ dropping keys that are part of a FOREIGN KEY constraint;
62213
* `ALTER TABLE` and `OPTIMIZE TABLE` always process partitioned tables as if
63214
[expand_fast_index_creation](#expanded-fast-index-creation) is OFF;
64215

65-
* **mysqldump --innodb-optimize-keys** ignores foreign keys because
216+
* `mysqldump --innodb-optimize-keys` ignores foreign keys because
66217
*InnoDB* requires a full table rebuild on foreign key changes. So adding them
67218
back with a separate `ALTER TABLE` after restoring the data from a dump
68219
would actually make the restore slower;
69220

70-
* **mysqldump --innodb-optimize-keys** ignores indexes on
221+
* `mysqldump --innodb-optimize-keys` ignores indexes on
71222
`AUTO_INCREMENT` columns, because they must be indexed, so it is impossible
72223
to temporarily drop the corresponding index;
73224

74-
* **mysqldump --innodb-optimize-keys** ignores the first UNIQUE index on
225+
* `mysqldump --innodb-optimize-keys` ignores the first UNIQUE index on
75226
non-nullable columns when the table has no `PRIMARY KEY` defined, because in
76227
this case *InnoDB* picks such an index as the clustered one.
77228

@@ -86,12 +237,24 @@ this case *InnoDB* picks such an index as the clustered one.
86237
| Scope: | Local/Global |
87238
| Dynamic: | Yes |
88239
| Data type | Boolean |
89-
| Default value | ON/OFF |
240+
| Default value | OFF |
241+
242+
When set to `ON`, *InnoDB* may drop eligible non-unique secondary indexes for the
243+
data-copy phase of rebuild-style `ALTER TABLE` and `OPTIMIZE TABLE`, then
244+
recreate them with the sorted bulk build described [above](#what-fast-index-creation-is).
245+
246+
## Related documentation
90247

91-
!!! admonition "See also"
248+
### In this manual
92249

93-
[Improved InnoDB fast index creation :octicons-link-external-16:](https://www.mysqlperformanceblog.com/2011/11/06/improved-innodb-fast-index-creation/)
250+
* [Percona Server for MySQL feature comparison](feature-comparison.md) — how this capability compares to MySQL {{vers}}
251+
* [Percona Server for MySQL variables](percona-server-system-variables.md) — full list of Percona-specific system variables, including `expand_fast_index_creation`
252+
* [Extended mysqldump](extended-mysqldump.md) — Percona `mysqldump` enhancements, including `--innodb-optimize-keys`
253+
* [InnoDB page fragmentation counters](innodb-fragmentation-count.md) — monitoring index fragmentation
94254

95-
[Thinking about running OPTIMIZE on your InnoDB Table? Stop! :octicons-link-external-16:](https://www.mysqlperformanceblog.com/2010/12/09/thinking-about-running-optimize-on-your-innodb-table-stop/)
255+
### MySQL Reference Manual
96256

257+
* [InnoDB and online DDL :octicons-link-external-16:](https://dev.mysql.com/doc/refman/{{vers}}/en/innodb-online-ddl.html)
258+
* [InnoDB online DDL operations :octicons-link-external-16:](https://dev.mysql.com/doc/refman/{{vers}}/en/innodb-online-ddl-operations.html)
259+
* [`tmpdir` system variable :octicons-link-external-16:](https://dev.mysql.com/doc/refman/{{vers}}/en/server-system-variables.html#sysvar_tmpdir)
97260
