PS-10995 [DOCS] -[feedback] PS 8.4 Innodb index creation

patrickbirch · patrickbirch · commit ff92c7120562 · 2026-04-01T05:46:28.000-05:00
modified:   docs/innodb-expanded-fast-index-creation.md
diff --git a/docs/innodb-expanded-fast-index-creation.md b/docs/innodb-expanded-fast-index-creation.md
@@ -1,18 +1,71 @@
 # Expanded fast index creation
 
-Percona has implemented several changes related to *MySQL*’s fast index creation
-feature. Fast index creation was implemented in *MySQL* as a way to speed up the
-process of adding or dropping indexes on tables with many rows.
-
-This feature implements a session variable that enables extended fast index
-creation. Besides optimizing DDL directly,
-[expand_fast_index_creation](#expanded-fast-index-creation) may also optimize index access for
-subsequent DML statements because using it results in much less fragmented
-indexes.
-
-## The **mysqldump** command
-
-A new option, `--innodb-optimize-keys`, was implemented in **mysqldump**. It
+## What fast index creation is
+
+In *InnoDB*, secondary indexes are separate B-tree structures from the clustered
+index (the primary key). When the server creates a new secondary index on
+an existing table, it can do so in two conceptually different ways:
+
+1. Row-by-row maintenance: for each row, the server inserts that row’s entries
+   into the new secondary index immediately. Those inserts arrive in
+   primary-key order, not secondary-key order, so the growing B-tree suffers
+   many effectively random page splits and heavy write amplification. If the
+   server is also copying the table (a rebuild `ALTER TABLE`), the same
+   pattern applies: every row copied into the new table must update every
+   secondary index immediately.
+
+2. Fast index creation (sorted / bulk build) — The server scans the table’s
+   clustered index in primary-key order, generates secondary-key tuples, sorts
+   them (often using external merge sort), and builds the secondary index from
+   that ordered stream. Work is staged in temporary files under the configured
+   `tmpdir`, then merged into a compact B-tree. That path avoids the worst
+   random-insert behavior of the row-by-row approach and usually completes with
+   less I/O and a less fragmented index.
+
+In short, fast index creation builds the secondary index from a sorted stream
+after reading the table (or a copy of it) in clustered-index order, instead of
+growing the index through arbitrary-order inserts during the same phase as the
+data copy.
+
+Dropping an index is already a cheap metadata change in many cases, so the
+performance gain comes mainly from creating indexes on large tables.
+
+## How this differs from community MySQL
+
+Upstream *MySQL* (*InnoDB*) already uses a sorted, bulk-style path when it
+adds a secondary index in operations that are implemented as “add index
+only” (for example, some `CREATE INDEX` / `ALTER TABLE ... ADD INDEX` flows
+that do not rebuild the whole table).
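+
+A minimal sketch of such an "add index only" statement follows. The table
+`orders`, the column `customer_id`, and the index name are hypothetical, and
+whether a given statement avoids a rebuild depends on the server version and
+on the change requested:
+
+```sql
+-- Hypothetical table, column, and index names, for illustration only.
+-- Requesting ALGORITHM=INPLACE makes the statement fail with an error,
+-- rather than fall back to a table copy, if an in-place index build
+-- is not possible.
+ALTER TABLE orders
+  ADD INDEX idx_customer (customer_id),
+  ALGORITHM=INPLACE, LOCK=NONE;
+```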
+
+Where community *MySQL* still does a full table rebuild (the copy algorithm,
+used for example by many `ALTER TABLE` changes that force a new table), rows
+are inserted into the new copy while all secondary indexes are live, so each
+insert must update every secondary index immediately. Even if the server
+builds an individual index efficiently elsewhere, interleaving index updates
+with the copy keeps every index under write load for the whole copy. That
+tends to produce heavier random I/O and more fragmented trees than deferring
+secondary index creation until the clustered data is complete.
+
+Percona Server for MySQL extends that behavior with expanded fast index
+creation (controlled by
+[`expand_fast_index_creation`](#expanded-fast-index-creation)): on rebuild-style
+`ALTER TABLE` / `OPTIMIZE TABLE`, eligible non-unique secondary indexes are
+dropped for the copy phase and recreated afterward using the fast sorted-build
+path on the finished table. The copy phase then maintains only what *InnoDB*
+requires for the clustered index (and any indexes that cannot be deferred),
+which is the main difference from stock *MySQL* on the same code paths.
+
+Other *MySQL* 8.x features (online DDL, `INSTANT` where supported) are
+unchanged; this optimization targets cases that still perform a table copy.
+
+Besides shortening DDL directly,
+[`expand_fast_index_creation`](#expanded-fast-index-creation) may also help
+subsequent DML because indexes built in one sorted pass are often less
+fragmented than those maintained incrementally through a long copy.
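+
+As a hedged illustration, a session might enable the variable before a
+rebuild-style statement. The table name `orders` is hypothetical, and
+`ALGORITHM=COPY` is requested explicitly here only so that the statement
+takes the table-copy path that this optimization targets:
+
+```sql
+-- Enable expanded fast index creation for the current session only.
+SET SESSION expand_fast_index_creation = ON;
+
+-- Hypothetical table. Rebuild it with an explicit table copy, so that
+-- eligible secondary indexes can be deferred until the copy completes.
+ALTER TABLE orders ENGINE=InnoDB, ALGORITHM=COPY;
+
+-- Turn the variable off again if later statements should not use it.
+SET SESSION expand_fast_index_creation = OFF;
+```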
+
+## The mysqldump command
+
+A new option, `--innodb-optimize-keys`, was implemented in `mysqldump`. It
 changes the way *InnoDB* tables are dumped, so that secondary and foreign keys
 are created after loading the data, thus taking advantage of fast index
 creation. More specifically:
@@ -42,7 +95,7 @@ Internally, `OPTIMIZE TABLE` is mapped to `ALTER TABLE ... ENGINE=innodb`
 for *InnoDB* tables. As a consequence, it now also benefits from fast index
 creation, with the same restrictions as for `ALTER TABLE`.
 
-## Caveats
+## Limitations
 
 *InnoDB* fast index creation uses temporary files in tmpdir for all indexes
 being created. So make sure you have enough tmpdir space when using
@@ -62,16 +115,16 @@ dropping keys that are part of a FOREIGN KEY constraint;
 * `ALTER TABLE` and `OPTIMIZE TABLE` always process partitioned tables as if
 [expand_fast_index_creation](#expanded-fast-index-creation) is OFF;
 
-* **mysqldump --innodb-optimize-keys** ignores foreign keys because
+* `mysqldump --innodb-optimize-keys` ignores foreign keys because
 *InnoDB* requires a full table rebuild on foreign key changes. So adding them
 back with a separate `ALTER TABLE` after restoring the data from a dump
 would actually make the restore slower;
 
-* **mysqldump --innodb-optimize-keys** ignores indexes on
+* `mysqldump --innodb-optimize-keys` ignores indexes on
 `AUTO_INCREMENT` columns, because they must be indexed, so it is impossible
 to temporarily drop the corresponding index;
 
-* **mysqldump --innodb-optimize-keys** ignores the first UNIQUE index on
+* `mysqldump --innodb-optimize-keys` ignores the first UNIQUE index on
 non-nullable columns when the table has no `PRIMARY KEY` defined, because in
 this case *InnoDB* picks such an index as the clustered one.
 
@@ -88,10 +141,15 @@ this case *InnoDB* picks such an index as the clustered one.
 | Data type      | Boolean            |
 | Default value  | ON/OFF             |
 
-!!! admonition "See also"
+When enabled, *InnoDB* may drop eligible non-unique secondary indexes for the
+data-copy phase of rebuild-style `ALTER TABLE` and `OPTIMIZE TABLE`, then
+recreate them with the sorted bulk build described [above](#what-fast-index-creation-is).
 
-    [Improved InnoDB fast index creation :octicons-link-external-16:](https://www.mysqlperformanceblog.com/2011/11/06/improved-innodb-fast-index-creation/)
+## Related documentation
 
-    [Thinking about running OPTIMIZE on your InnoDB Table? Stop! :octicons-link-external-16:](https://www.mysqlperformanceblog.com/2010/12/09/thinking-about-running-optimize-on-your-innodb-table-stop/) 
+* [Percona Server for MySQL feature comparison](feature-comparison.md) — how this capability compares to MySQL {{vers}}
+* [Percona Server for MySQL variables](percona-server-system-variables.md) — full list of Percona-specific system variables, including `expand_fast_index_creation`
+* [Extended mysqldump](extended-mysqldump.md) — other Percona enhancements to `mysqldump`
+* [InnoDB page fragmentation counters](innodb-fragmentation-count.md) — monitoring index fragmentation