Commit ff92c71

PS-10995 [DOCS] -[feedback] PS 8.4 Innodb index creation
modified: docs/innodb-expanded-fast-index-creation.md
1 parent 8b2072e commit ff92c71

1 file changed

Lines changed: 78 additions & 20 deletions

docs/innodb-expanded-fast-index-creation.md

@@ -1,18 +1,71 @@
 # Expanded fast index creation

-Percona has implemented several changes related to *MySQL*’s fast index creation
-feature. Fast index creation was implemented in *MySQL* as a way to speed up the
-process of adding or dropping indexes on tables with many rows.
-
-This feature implements a session variable that enables extended fast index
-creation. Besides optimizing DDL directly,
-[expand_fast_index_creation](#expanded-fast-index-creation) may also optimize index access for
-subsequent DML statements because using it results in much less fragmented
-indexes.
-
-## The **mysqldump** command
-
-A new option, `--innodb-optimize-keys`, was implemented in **mysqldump**. It
+## What fast index creation is
+
+In *InnoDB*, secondary indexes are separate B-tree structures from the clustered
+index (the primary key). When the server creates a new secondary index on
+an existing table, it can do so in two conceptually different ways:
+
+1. Row-by-row maintenance — For each row, insert that row’s entries into the
+new secondary index as you go. Those inserts arrive in primary-key order, not
+in secondary-key order, so the growing B-tree suffers many random-looking
+page splits and a large amount of write amplification. If the server is
+also copying the table (rebuild `ALTER TABLE`), the same pattern applies:
+every row copied into the new table must update every secondary index
+immediately.
+
+2. Fast index creation (sorted / bulk build) — The server scans the table’s
+clustered index in primary-key order, generates secondary-key tuples, sorts
+them (often using external merge sort), and builds the secondary index from
+that ordered stream. Work is staged in temporary files under the configured
+`tmpdir`, then merged into a compact B-tree. That path avoids the worst
+random-insert behavior of the row-by-row approach and usually completes with
+less I/O and a less fragmented index.
+
+So fast index creation means: build the secondary index from a sorted
+stream after reading the table (or a copy) in clustered index order, instead of
+growing the index by arbitrary-order inserts during the same phase as the data
+copy.
+
+Dropping an index is already a cheap metadata change in many cases; the
+performance win is dominated by creating indexes on large tables.
+
+## How this differs from community MySQL
+
+Upstream *MySQL* (*InnoDB*) already uses a sorted, bulk-style path when it
+adds a secondary index in operations that are implemented as “add index
+only” (for example, some `CREATE INDEX` / `ALTER TABLE ... ADD INDEX` flows
+that do not rebuild the whole table).
+
+Where community *MySQL* still does a full table rebuild (copy algorithm —
+for example many `ALTER TABLE` changes that force a new table), rows are
+inserted into the new copy while all secondary indexes are live. Each
+insert must update every non-primary index at once. Even if the server later
+uses efficient mechanics for individual index builds, interleaving those
+updates with the copy keeps more indexes “hot” for the whole copy and tends to
+produce heavier random I/O and more fragmented trees than deferring secondary
+index creation until the clustered data is complete.
+
+Percona Server for MySQL extends that behavior with expanded fast index
+creation (controlled by
+[`expand_fast_index_creation`](#expanded-fast-index-creation)): on rebuild-style
+`ALTER TABLE` / `OPTIMIZE TABLE`, eligible non-unique secondary indexes are
+dropped for the copy phase and recreated afterward using the fast sorted-build
+path on the finished table. The copy phase then maintains only what *InnoDB*
+requires for the clustered index (and any indexes that cannot be deferred),
+which is the main difference from stock *MySQL* on the same code paths.
+
+Other *MySQL* 8.x features (online DDL, `INSTANT` where supported) are
+unchanged; this optimization targets cases that still perform a table copy.
+
+Besides shortening DDL directly,
+[`expand_fast_index_creation`](#expanded-fast-index-creation) may also help
+subsequent DML because indexes built in one sorted pass are often less
+fragmented than those maintained incrementally through a long copy.
+
+## The mysqldump command
+
+A new option, `--innodb-optimize-keys`, was implemented in mysqldump. It
 changes the way *InnoDB* tables are dumped, so that secondary and foreign keys
 are created after loading the data, thus taking advantage of fast index
 creation. More specifically:
@@ -42,7 +95,7 @@ Internally, `OPTIMIZE TABLE` is mapped to `ALTER TABLE ... ENGINE=innodb`
 for *InnoDB* tables. As a consequence, it now also benefits from fast index
 creation, with the same restrictions as for `ALTER TABLE`.

-## Caveats
+## Limitations

 *InnoDB* fast index creation uses temporary files in tmpdir for all indexes
 being created. So make sure you have enough tmpdir space when using
@@ -62,16 +115,16 @@ dropping keys that are part of a FOREIGN KEY constraint;
 * `ALTER TABLE` and `OPTIMIZE TABLE` always process partitioned tables as if
 [expand_fast_index_creation](#expanded-fast-index-creation) is OFF;

-* **mysqldump --innodb-optimize-keys** ignores foreign keys because
+* mysqldump --innodb-optimize-keys ignores foreign keys because
 *InnoDB* requires a full table rebuild on foreign key changes. So adding them
 back with a separate `ALTER TABLE` after restoring the data from a dump
 would actually make the restore slower;

-* **mysqldump --innodb-optimize-keys** ignores indexes on
+* mysqldump --innodb-optimize-keys ignores indexes on
 `AUTO_INCREMENT` columns, because they must be indexed, so it is impossible
 to temporarily drop the corresponding index;

-* **mysqldump --innodb-optimize-keys** ignores the first UNIQUE index on
+* mysqldump --innodb-optimize-keys ignores the first UNIQUE index on
 non-nullable columns when the table has no `PRIMARY KEY` defined, because in
 this case *InnoDB* picks such an index as the clustered one.

@@ -88,10 +141,15 @@ this case *InnoDB* picks such an index as the clustered one.
 | Data type | Boolean |
 | Default value | ON/OFF |

-!!! admonition "See also"
+When enabled, *InnoDB* may drop eligible non-unique secondary indexes for the
+data-copy phase of rebuild-style `ALTER TABLE` and `OPTIMIZE TABLE`, then
+recreate them with the sorted bulk build described [above](#what-fast-index-creation-is).

-[Improved InnoDB fast index creation :octicons-link-external-16:](https://www.mysqlperformanceblog.com/2011/11/06/improved-innodb-fast-index-creation/)
+## Related documentation

-[Thinking about running OPTIMIZE on your InnoDB Table? Stop! :octicons-link-external-16:](https://www.mysqlperformanceblog.com/2010/12/09/thinking-about-running-optimize-on-your-innodb-table-stop/)
+* [Percona Server for MySQL feature comparison](feature-comparison.md) — how this capability compares to MySQL {{vers}}
+* [Percona Server for MySQL variables](percona-server-system-variables.md) — full list of Percona-specific system variables, including `expand_fast_index_creation`
+* [Extended mysqldump](extended-mysqldump.md) — other Percona enhancements to `mysqldump`
+* [InnoDB page fragmentation counters](innodb-fragmentation-count.md) — monitoring index fragmentation

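A note on the "What fast index creation is" text added in this commit: the contrast it draws between row-by-row maintenance and a sorted bulk build can be sketched with a toy model. This is a hypothetical Python illustration, not InnoDB's implementation. A sorted Python list stands in for the secondary-index B-tree, the function names are invented here, and counting inserts that do not land at the right-hand end is only a rough proxy for the page splits the text describes.

```python
import bisect
import random


def row_by_row_build(rows):
    """Insert secondary-key entries one at a time, in primary-key order.

    Returns the finished index and the number of inserts that landed
    anywhere other than the right-hand end of the index.
    """
    index = []
    mid_inserts = 0
    for pk, sec_key in rows:  # rows arrive in primary-key order
        entry = (sec_key, pk)
        pos = bisect.bisect_left(index, entry)
        if pos != len(index):
            mid_inserts += 1
        index.insert(pos, entry)
    return index, mid_inserts


def sorted_bulk_build(rows):
    """Collect every secondary-key entry, sort once, then append in order."""
    entries = sorted((sec_key, pk) for pk, sec_key in rows)
    index = []
    mid_inserts = 0
    for entry in entries:
        # After sorting, each entry belongs at the right-hand end.
        if index and entry < index[-1]:
            mid_inserts += 1
        index.append(entry)
    return index, mid_inserts


def demo(n=1000, seed=0):
    """Build the same index both ways and report the insert patterns."""
    rng = random.Random(seed)
    # Primary keys are sequential; secondary keys are unrelated to them.
    rows = [(pk, rng.randrange(10 * n)) for pk in range(n)]
    incremental, incremental_mid = row_by_row_build(rows)
    bulk, bulk_mid = sorted_bulk_build(rows)
    return incremental == bulk, incremental_mid, bulk_mid
```

Calling `demo()` returns whether both strategies produced the same index, followed by the two mid-index insert counts. Both strategies end with identical contents; only the sorted build appends every entry at the end, which is the property the documentation credits for less random I/O and less fragmentation.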
