information-schema/information-schema-analyze-status.md
Lines changed: 2 additions & 2 deletions
@@ -64,7 +64,7 @@ Fields in the `ANALYZE_STATUS` table are described as follows:
 * `TABLE_SCHEMA`: The name of the database to which the table belongs.
 * `TABLE_NAME`: The name of the table.
 * `PARTITION_NAME`: The name of the partitioned table.
-* `JOB_INFO`: The information of the `ANALYZE` task. If an index is analyzed, this information will include the index name. When `tidb_analyze_version = 2`, this information will include configuration items such as sample rate.
+* `JOB_INFO`: A brief description of the `ANALYZE` subtask. It shows the `ANALYZE` scope, such as columns, indexes, or global statistics merge, and might include the effective options used, such as `buckets`, `topn`, `samplerate`, or `samples`.
 * `PROCESSED_ROWS`: The number of rows that have been processed.
 * `START_TIME`: The start time of the `ANALYZE` task.
 * `END_TIME`: The end time of the `ANALYZE` task.
@@ -79,4 +79,4 @@ Fields in the `ANALYZE_STATUS` table are described as follows:
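To see what the reworded `JOB_INFO` description looks like in practice, a query along the following lines could list the subtasks recorded for one table. This is a minimal sketch: the schema name `test` and the table name `t` are placeholders, and the exact `JOB_INFO` text depends on the TiDB version and on the options in effect.

```sql
-- Placeholder names: replace 'test' and 't' with your own schema and table.
SELECT table_schema, table_name, partition_name, job_info,
       processed_rows, start_time, end_time
FROM information_schema.analyze_status
WHERE table_schema = 'test' AND table_name = 't'
ORDER BY start_time DESC;
```

Each returned row corresponds to one `ANALYZE` subtask, so a partitioned table can produce several rows per run.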
sql-statements/sql-statement-show-analyze-status.md
Lines changed: 6 additions & 27 deletions
@@ -21,7 +21,7 @@ Currently, the `SHOW ANALYZE STATUS` statement returns the following columns:
 | `Table_schema` | The database name |
 | `Table_name` | The table name |
 | `Partition_name` | The partition name |
-| `Job_info` | The task information. If an index is analyzed, this information will include the index name. When `tidb_analyze_version = 2`, this information will include configuration items such as sample rate. |
+| `Job_info` | A brief description of the `ANALYZE` subtask. It shows the `ANALYZE` scope, such as columns, indexes, or global statistics merge, and might include the effective options used, such as `buckets`, `topn`, `samplerate`, or `samples`. |
 | `Processed_rows` | The number of rows that have been analyzed |
 | `Start_time` | The time at which the task starts |
 | `State` | The state of a task, including `pending`, `running`, `finished`, and `failed` |
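The output of `SHOW ANALYZE STATUS` can be narrowed with a `WHERE` clause over the columns listed above. The following is a small sketch, assuming a placeholder table named `t` and assuming that the statement accepts `WHERE` filtering in the same way as other `SHOW` statements:

```sql
-- List only pending, running, or failed ANALYZE subtasks for the placeholder table `t`.
SHOW ANALYZE STATUS
WHERE table_name = 't' AND state IN ('pending', 'running', 'failed');
```

Filtering on `State` is mainly useful for spotting subtasks that did not finish.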
+> Statistics Version 1 (`tidb_analyze_version = 1`) is no longer supported for new statistics collection. The following example shows the current `ANALYZE` behavior with Statistics Version 2.
+
 ```sql
 mysql> create table t(x int, index idx(x)) partition by hash(x) partitions 2;
statistics.md
Lines changed: 13 additions & 16 deletions
@@ -300,7 +300,7 @@ TiDB will overwrite the previously recorded persistent configuration using the n
 ### Disable ANALYZE configuration persistence

-To disable the `ANALYZE` configuration persistence feature, set the `tidb_persist_analyze_options` system variable to `OFF`. Because the `ANALYZE` configuration persistence feature is not applicable to `tidb_analyze_version = 1`, setting `tidb_analyze_version = 1` can also disable the feature.
+To disable the `ANALYZE` configuration persistence feature, set the `tidb_persist_analyze_options` system variable to `OFF`.

 After disabling the `ANALYZE` configuration persistence feature, TiDB does not clear the persisted configuration records. Therefore, if you enable this feature again, TiDB continues to collect statistics using the previously recorded persistent configurations.
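A minimal sketch of the setting described in this hunk, assuming the variable is changed at global scope (the scope is not stated in the lines shown here):

```sql
-- Turn off ANALYZE configuration persistence.
-- Previously persisted configuration records are kept, as the text above notes.
SET GLOBAL tidb_persist_analyze_options = OFF;

-- Confirm the current value.
SHOW GLOBAL VARIABLES LIKE 'tidb_persist_analyze_options';
```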
@@ -356,13 +356,17 @@ WHERE db_name = 'test' AND table_name = 't' AND last_analyzed_at IS NOT NULL;
 ## Versions of statistics

-The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v510) variable controls the statistics collected by TiDB. Currently, two versions of statistics are supported: `tidb_analyze_version = 1` and `tidb_analyze_version = 2`.
+> **Warning:**
+>
+> Statistics Version 1 (`tidb_analyze_version = 1`) is no longer supported for new statistics collection. TiDB keeps reading existing Version 1 statistics for upgrade compatibility, but all new `ANALYZE` operations use Statistics Version 2 (`tidb_analyze_version = 2`). It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`) and [migrate existing objects that use Statistics Version 1 to Version 2](#switch-between-statistics-versions).
+
+The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v510) variable controls the statistics collected by TiDB. Currently, TiDB supports two statistics versions: `tidb_analyze_version = 1` and `tidb_analyze_version = 2`.

 - For TiDB Self-Managed, the default value of this variable changes from `1` to `2` starting from v5.3.0.
 - For TiDB Cloud, the default value of this variable changes from `1` to `2` starting from v6.5.0.
-- If your cluster is upgraded from an earlier version, the default value of `tidb_analyze_version` does not change after the upgrade.
+- When you upgrade a cluster that still persists `tidb_analyze_version = 1`, TiDB rewrites the persisted global value to `2` during upgrade.

-Version 2 is preferred, and will continue to be enhanced to ultimately replace Version 1 completely. Compared to Version 1, Version 2 improves the accuracy of many of the statistics collected for larger data volumes. Version 2 also improves collection performance by removing the need to collect Count-Min sketch statistics for predicate selectivity estimation, and also supporting automated collection only on selected columns (see [Collecting statistics on some columns](#collect-statistics-on-some-columns)).
+Version 2 is the recommended statistics version. Compared to Version 1, Version 2 improves the accuracy of many statistics for larger data volumes. Version 2 also improves collection performance by removing the need to collect Count-Min sketch statistics for predicate selectivity estimation, and it supports automated collection only on selected columns (see [Collecting statistics on some columns](#collect-statistics-on-some-columns)). For new statistics collection, Version 2 is the only supported statistics version.

 The following table lists the information collected by each version for usage in the optimizer estimates:
@@ -377,29 +381,22 @@ The following table lists the information collected by each version for usage in
 ### Switch between statistics versions

-It is recommended to ensure that all tables/indexes (and partitions) utilize statistics collection from the same version. Version 2 is recommended, however, it is not recommended to switch from one version to another without a justifiable reason such as an issue experienced with the version in use. A switch between versions might take a period of time when no statistics are available until all tables have been analyzed with the new version, which might negatively affect the optimizer plan choices if statistics are not available.
+It is recommended that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible. Until Version 2 statistics are collected for an object (such as a table, an index, or a partition), TiDB continues to use the existing Version 1 statistics for that object.

-Examples of justifications to switch might include - with Version 1, there could be inaccuracies in equal/IN predicate estimation due to hash collisions when collecting Count-Min sketch statistics. Solutions are listed in the [Count-Min Sketch](#count-min-sketch) section. Alternatively, setting `tidb_analyze_version = 2` and rerunning `ANALYZE` on all objects is also a solution. In the early release of Version 2, there was a risk of memory overflow after `ANALYZE`. This issue is resolved, but initially, one solution was to set `tidb_analyze_version = 1` and rerun `ANALYZE` on all objects.
+One major reason to migrate is that Version 1 might produce inaccurate estimates for equal/IN predicates because the Count-Min sketch can have hash collisions. For more information, see [Count-Min Sketch](#count-min-sketch). To avoid this issue, set `tidb_analyze_version = 2` and rerun `ANALYZE` on all objects.
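The migration described in this paragraph could be sketched as follows. This is an illustration only: `test.t` is a placeholder table name, and the final check reuses the `information_schema.tables` / `mysql.stats_histograms` join that appears elsewhere in this diff.

```sql
-- Make Statistics Version 2 the value for new sessions and for the current one.
SET GLOBAL tidb_analyze_version = 2;
SET SESSION tidb_analyze_version = 2;

-- Re-collect statistics for one table; `test.t` is a placeholder.
ANALYZE TABLE test.t;

-- List tables that still have Version 1 statistics recorded.
SELECT DISTINCT table_schema, table_name
FROM information_schema.tables JOIN mysql.stats_histograms
  ON table_id = tidb_table_id
WHERE stats_ver = 1;
```

An empty result from the last query would suggest that no object is still relying on Version 1 statistics.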

-To prepare `ANALYZE` for switching between versions:
+To prepare `ANALYZE` for migrating from Statistics Version 1 to Statistics Version 2:

 - If the `ANALYZE` statement is executed manually, manually analyze every table to be analyzed.
 FROM information_schema.tables JOIN mysql.stats_histograms
 ON table_id = tidb_table_id
-WHERE stats_ver = 2;
+WHERE stats_ver = 1;
 ```

-- If TiDB automatically executes the `ANALYZE` statement because the auto-analysis has been enabled, execute the following statement that generates the [`DROP STATS`](/sql-statements/sql-statement-drop-stats.md) statement:
-FROM information_schema.tables JOIN mysql.stats_histograms
-ON table_id = tidb_table_id
-WHERE stats_ver = 2;
-```
+- If TiDB automatically executes the `ANALYZE` statement because auto-analysis is enabled, after you set `tidb_analyze_version = 2`, TiDB gradually refreshes statistics to Version 2 through subsequent auto-analysis. Before Version 2 statistics are collected for an object, TiDB can continue to use its existing Version 1 statistics. To speed up the migration for important objects, run `ANALYZE` on them manually.

 - If the result of the preceding statement is too long to copy and paste, you can export the result to a temporary text file and then perform execution from the file like this:
system-variables.md
Lines changed: 9 additions & 4 deletions
@@ -1181,16 +1181,21 @@ MPP is a distributed computing framework provided by the TiFlash engine, which a
 ### tidb_analyze_version <span class="version-mark">New in v5.1.0</span>

+> **Warning:**
+>
+> Statistics Version 1 (`tidb_analyze_version = 1`) is no longer supported for new statistics collection. TiDB keeps reading existing Version 1 statistics for upgrade compatibility, but all new `ANALYZE` operations use Statistics Version 2 (`tidb_analyze_version = 2`). It is recommended that you use `tidb_analyze_version = 2`.
+
 - Scope: SESSION | GLOBAL
 - Persists to cluster: Yes
 - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No
 - Type: Integer
 - Default value: `2`
-- Range: `[1, 2]`
+- Range: `[1, 2]`. Only `2` is supported for new statistics collection.
 - Controls how TiDB collects statistics.
-- For TiDB Self-Managed, the default value of this variable changes from `1` to `2` starting from v5.3.0.
-- For TiDB Cloud, the default value of this variable changes from `1` to `2` starting from v6.5.0.
-- If your cluster is upgraded from an earlier version, the default value of `tidb_analyze_version` does not change after the upgrade.
+- If you try to set this variable to `1`, TiDB returns an error.
+- For TiDB Self-Managed, the default value of this variable changed from `1` to `2` starting from v5.3.0.
+- For TiDB Cloud, the default value of this variable changed from `1` to `2` starting from v6.5.0.
+- When you upgrade a cluster that still persists `tidb_analyze_version = 1`, TiDB rewrites the persisted global value to `2` during upgrade.
 - For detailed introduction about this variable, see [Introduction to Statistics](/statistics.md).
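The behavior listed for this variable can be checked with standard system-variable syntax. The following is a sketch only; the exact error text for the value `1` is not given in this change, so none is shown.

```sql
-- Inspect the global and session values, for example before and after an upgrade.
SELECT @@global.tidb_analyze_version, @@session.tidb_analyze_version;

-- According to the description above, this statement is expected to be rejected with an error.
SET GLOBAL tidb_analyze_version = 1;
```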

 ### tidb_analyze_skip_column_types <span class="version-mark">New in v7.2.0</span>