Parquet-back raw-PUT test indices on the analytics-engine route#5529
Conversation
PR Reviewer Guide 🔍(Review updated until commit 47e7c4e)Here are some key observations to aid the review process:
|
03663d0 to
f153b99
Compare
PR Code Suggestions ✨Latest suggestions up to 769dd82 Explore these optional code suggestions:
Previous suggestionsSuggestions up to commit f153b99
Suggestions up to commit 03663d0
|
|
Persistent review updated to latest commit f153b99 |
f153b99 to
769dd82
Compare
|
Persistent review updated to latest commit 769dd82 |
With -Dtests.analytics.parquet_indices=true, indices created by a raw document PUT (e.g. `PUT /test/_doc/1` in a test's init()) bypass AnalyticsIndexConfig.applyIndexCreationSettings, so they inherit the composite *value* — and are therefore routed to the analytics engine by RestUnifiedQueryAction.isAnalyticsIndex — but not the `pluggable.dataformat.enabled` flag. They are then stored as a plain-Lucene EngineBackedIndexer whose acquireReader() is unimplemented, and the query fails with `StreamException[INTERNAL] Failed to start streaming fragment`. Apply the cluster-level composite defaults in setUpIndices() so every index — including raw-PUT ones — is stored as a parquet-backed DataFormatAwareEngine that is actually scannable by the analytics engine it routes to. No-op unless tests.analytics.parquet_indices=true, so normal CI is unchanged. Signed-off-by: Kai Huang <ahkcs@amazon.com>
769dd82 to
47e7c4e
Compare
|
Persistent review updated to latest commit 47e7c4e |
| // auto-creates via a raw document PUT, which bypasses createIndexByRestClient) parquet-backed | ||
| // composite, so it is stored as a DataFormatAwareEngine and is actually scannable by the | ||
| // analytics engine it routes to. Must run before init() creates any index. | ||
| TestUtils.AnalyticsIndexConfig.applyClusterSettings(client()); |
There was a problem hiding this comment.
- Any cleanup required in tear down logic?
- I assume this applied to all SQL tests too if any raw doc put?
There was a problem hiding this comment.
Looked into the issues:
- No extra cleanup needed — same lifecycle as enableCalcite(). applyClusterSettings() is called from setUpIndices() (@before) so it's re-applied before every test method, and all of these settings are wiped by the existing @afterclass cleanUpIndices() → wipeAllClusterSettings() (nulls persistent./transient. after each class).
- Correct — but only when -Dtests.analytics.parquet_indices=true, which is set only for the analytics-route runs
Problem
Running the Calcite IT suite through the analytics-engine route (
-Dtests.analytics.parquet_indices=true) produces a large bucket ofStreamException[errorCode=INTERNAL] "Failed to start streaming fragment on [<index>][<shard>]"failures, whose cluster-side cause isUnsupportedOperationException: acquireReader is not supported in EngineBackedIndexer.The root cause is a routing-vs-storage inconsistency:
RestUnifiedQueryAction.isAnalyticsIndex)pluggable.dataformatvalue ==compositeDataFormatAwareEngine(IndicesService.getIndexerFactory→IndexSettings.isPluggableDataFormatEnabled)pluggable.dataformat.enabled=true(+ feature flag)Test indices created by a raw document
PUT(e.g.PUT /test/_doc/1in a test'sinit()) bypassAnalyticsIndexConfig.applyIndexCreationSettings(which only stamps indices created viacreateIndexByRestClient/loadIndex). They inherit the composite value from the cluster opt-in — so they route to the analytics engine — but not theenabledflag, so they are stored as a plain-LuceneEngineBackedIndexerwhoseacquireReader()is an unimplemented stub. The query then fails with the StreamException above.Fix
Add
AnalyticsIndexConfig.applyClusterSettings(), which sets the cluster-level composite defaults (cluster.pluggable.dataformat.enabled=true,cluster.pluggable.dataformat=composite, parquet primary, lucene secondary) whentests.analytics.parquet_indices=true, and call it fromSQLIntegTestCase.setUpIndices()beforeinit(). Every index — including raw-PUTones — is then stored as a parquet-backedDataFormatAwareEnginethat the analytics engine can actually scan. It is a no-op unlesstests.analytics.parquet_indices=true, so normal CI is unchanged.Notes:
MapperParsingException: Field [_field_names] requires [FULL_TEXT_SEARCH]).cluster.pluggable.dataformat.restrict.allowlistis a static (non-dynamic) setting and is intentionally not set here; it belongs in the cluster node config if a run needs to exempt extra index prefixes.Verification (analytics route,
-Dtests.analytics.parquet_indices=true)acquireReaderStreamExceptions across the affected IT classes: 59 → 0.CalciteNotLikeNullITCalcitePPLLookupITCalcitePPLParseITCalcitePPLPluginITIndices now execute on the real analytics + DataFusion path and return correct rows. Failures that remain on the heavier classes (aggregation, datetime, mixed-field) are pre-existing, separately-tracked analytics-engine gaps that the
acquireReaderstub previously masked — percentile value scaling, unsupported datetime functions (TO_DAYS,ADDTIME, …),distinct_count_approx,SPAN/timestamp type divergence, etc. They are not regressions and are out of scope for this test-infra change; once the indices are scannable they simply surface their true cause instead of the generic streaming-fragment wrapper.