| 1 | +# EXPLAIN Test Analysis |
| 2 | + |
| 3 | +## Summary |
| 4 | + |
| 5 | +After analyzing the 2446 explain_todo tests, I found that the expected outputs contradict each other for identical query patterns, which rules out a single systematic fix. |
| 6 | + |
| 7 | +## Key Findings |
| 8 | + |
| 9 | +### 1. LIMIT 0 and FORMAT Stripping |
| 10 | + |
| 11 | +The expected output varies between tests for identical patterns: |
| 12 | +- **02998_system_dns_cache_table**: Expects LIMIT 0 and FORMAT to be STRIPPED |
| 13 | +- **03031_table_function_fuzzquery**: Expects LIMIT 0 and FORMAT to be KEPT |
| 14 | + |
| 15 | +Both tests have `LIMIT 0 FORMAT TSVWithNamesAndTypes;` but expect different outputs. |
| 16 | + |
| 17 | +### 2. SETTINGS Position |
| 18 | + |
| 19 | +The position of `Set` in the output varies: |
| 20 | +- **01293_external_sorting_limit_bug** (explain_todo): Expects Set at SelectQuery level |
| 21 | +- **01104_distributed_numbers_test** (passing): Expects Set at SelectWithUnionQuery level |
| 22 | + |
| 23 | +Changing the logic to fix one breaks the other. |
| 24 | + |
| 25 | +### 3. AND/OR Flattening |
| 26 | + |
| 27 | +Some tests expect flattened boolean operations; others expect nested ones: |
| 28 | +- **00824_filesystem** (explain_todo): Expects `Function and` with 3 children (flattened) |
| 29 | +- **03653_keeper_histogram_metrics** (passing): Expects nested `Function and` (2 children each) |
| 30 | + |
| 31 | +Implementing flattening broke 173 passing tests. |
| 32 | + |
| 33 | +## Root Cause |
| 34 | + |
| 35 | +The `explain.txt` files were generated from different ClickHouse versions or configurations, leading to inconsistent expected outputs. Without regenerating test data with a consistent ClickHouse version, these inconsistencies cannot be resolved. |
| 36 | + |
| 37 | +## Statistics |
| 38 | + |
| 39 | +- Total tests with explain_todo: 2446 |
| 40 | +- Tests with stmt1 in explain_todo: 142 |
| 41 | +- explain_todo tests currently passing: 0 |
| 42 | + |
| 43 | +## Recommendations |
| 44 | + |
| 45 | +1. **Regenerate test data**: Run all tests against a single ClickHouse version to get consistent expected output |
| 46 | +2. **Version-specific logic**: If multiple ClickHouse versions must be supported, implement version detection |
| 47 | +3. **Focus on specific patterns**: Where a fix is required, fix individual tests rather than making broad changes |
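
To make the mismatch in Key Finding 3 concrete, here is a minimal Python sketch of what flattening does to a left-nested `and` chain. The `Node` class and `flatten` function are hypothetical stand-ins invented for illustration. They are not the parser's real types, and the real EXPLAIN output may carry more structure than this simplified model shows.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical AST node: a label plus child nodes (illustration only)."""
    label: str
    children: list[Node] = field(default_factory=list)


def flatten(node: Node, ops: tuple[str, ...] = ("and", "or")) -> Node:
    """Return a copy in which nested calls to the same operator are merged."""
    kids = [flatten(child, ops) for child in node.children]
    if node.label in ops:
        merged = []
        for kid in kids:
            if kid.label == node.label:
                # Same operator: lift its operands into the parent.
                merged.extend(kid.children)
            else:
                merged.append(kid)
        kids = merged
    return Node(node.label, kids)


# a AND b AND c, modeled as the left-nested form (a AND b) AND c
nested = Node("and", [Node("and", [Node("a"), Node("b")]), Node("c")])
flat = flatten(nested)

# An OR over an AND is left alone: only identical operators are merged.
mixed = Node("or", [Node("and", [Node("a"), Node("b")]), Node("c")])
flat_mixed = flatten(mixed)

print(len(nested.children), len(flat.children))  # 2 3
```

A transform like this, applied to every query, converts all nested chains to the flattened form. That is consistent with the observation above that it satisfies tests such as 00824_filesystem while breaking tests that expect the nested form.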