Commit 8a26532
committed
[SPARK-57758][SQL] Restore O(1) built-in function resolution in the analyzer
### What changes were proposed in this pull request?
SPARK-54807 added qualified function names and a configurable resolution search path (`spark.sql.functionResolution.sessionOrder`). As a side effect, every `UnresolvedFunction` now makes `FunctionResolution.resolveFunction` / `resolveTableFunction` build an ordered candidate search path, allocate `Seq`s, and iterate candidates (each doing a name-kind parse plus a registry lookup). None of this was memoized, so it was recomputed for every function node, on every analysis pass.
This PR restores the previous fast path for the dominant built-in case, without changing resolution precedence:
1. **Per-analysis-pass memoization.** A lazily-filled memo on `AnalysisContext` stores the computed resolution search-path entries. The path is stable within a single analysis pass (`SET PATH` / `USE` / conf changes happen between passes, and each pass / view / SQL-function body runs under a fresh `AnalysisContext` object), and the memo shares the context's per-pass lifetime: a fresh context (`reset` / `withNewAnalysisContext` / the `copy` or construction for a view or SQL-function body) automatically starts with an empty memo, and the memo is collected with the context. It therefore needs no thread-local, identity key, or weak reference, and is stale-free because the memoized value derives only from the context's immutable fields (`resolutionPathEntries`, `catalogAndNamespace`). `sqlResolutionPathEntriesForAnalysis` (and hence `resolutionCandidates`, the `builtinFastPathSafe` gate, and the `UNRESOLVED_ROUTINE` error path) now read from this memo.
2. **Built-in-only fast-path.** In `resolveFunction` / `resolveTableFunction`, for a single-part, non-internal name, when `system.builtin` is the **first** entry of the effective path, resolve directly against the in-memory built-in registry (`resolveScalarFunctionByIdentifier` / `resolveTableFunctionByIdentifier` with `FunctionRegistry.builtinFunctionIdentifier`) and return on hit. A miss falls through to the unchanged candidate loop.
Correctness: the fast-path fires only when `system.builtin` is the **first** path entry, so no earlier entry can shadow a built-in hit -- neither a `system.session` entry (a temporary/session function, as under mode `first`) nor a catalog/schema placed before `system.builtin` by a custom `SET PATH`. The default `sessionOrder` modes `second` / `last` keep `system.builtin` first (fast-path on); mode `first` puts `system.session` first (fast-path off); only a custom `SET PATH` can place another entry before `system.builtin` (fast-path off). In every case the fast-path matches the slow candidate loop, so resolution precedence is unchanged. The gate predicate is `CatalogManager.isBuiltinFirstOnPath`.
The optional `FORBIDDEN_OPERATION` masking noted in the JIRA is tracked separately in [SPARK-57759](https://issues.apache.org/jira/browse/SPARK-57759) and is intentionally left unchanged here.
### Why are the changes needed?
For built-in-heavy plans the per-function overhead is paid for every function node. Under Spark Connect, which re-analyzes the entire (growing) plan on every `AnalyzePlan` call, the cost scales roughly with plan size x number of analyze calls, producing a multi-fold regression in analysis time versus a pre-SPARK-54807 build. Execution time is unaffected; the regression is isolated to the analysis phase. This restores O(1) built-in resolution for the common case while preserving the qualified-name and configurable-order semantics SPARK-54807 introduced.
### Does this PR introduce _any_ user-facing change?
No. This is a performance fix; resolution results and error behavior are unchanged.
### How was this patch tested?
- New cases in `FunctionQualificationSuite`:
- `SECTION 17a`: the built-in fast-path returns the built-in under the default order while the temp is reachable via `session.`, and switching `sessionOrder` to `first` in the same session correctly bypasses the fast-path so the temp shadows the built-in (exercises the per-pass recompute and the gate).
- `SECTION 17b`: built-in and extension table-function fast-path.
- `SECTION 17c`: a persistent scalar function placed before `system.builtin` via custom `SET PATH` correctly wins over the built-in (fast-path bypassed); built-in wins when `system.builtin` leads.
- `SECTION 17d`: the same regression guard for the table-function fast-path.
- `SECTION 17e`: the fast-path raises the built-in's argument error rather than falling through to a same-named session function (the fast-path and the slow candidate loop fail on the same candidate).
- `SECTION 17f`: the per-pass memo recomputes for a SQL-function body whose pinned path differs from the caller's, so a single statement yields both resolutions (neither context reuses the other's memo).
- New unit test in `CatalogManagerSuite`: `isBuiltinFirstOnPath` over representative path shapes, guarding the gate predicate directly (the fast-path has no behavioral signature).
- Existing suites pass: `FunctionQualificationSuite` + `SetPathSuite` (137) and `LookupFunctionsSuite` (3), which cover the single-pass resolver, dynamic `SET PATH` ordering, and the `COUNT(*)` rewrite gate that depend on resolution order.
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Cursor (Claude Opus 4.8)
Closes #56869 from MaxGekk/fix-fun-resolution.
Authored-by: Maxim Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>1 parent f520a2f commit 8a26532
5 files changed
Lines changed: 275 additions & 4 deletions
File tree
- sql
- catalyst/src
- main/scala/org/apache/spark/sql
- catalyst/analysis
- connector/catalog
- test/scala/org/apache/spark/sql/connector/catalog
- core/src/test/scala/org/apache/spark/sql
Lines changed: 30 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
182 | 182 | | |
183 | 183 | | |
184 | 184 | | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
| 199 | + | |
| 200 | + | |
| 201 | + | |
| 202 | + | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
| 209 | + | |
| 210 | + | |
| 211 | + | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
185 | 215 | | |
186 | 216 | | |
187 | 217 | | |
| |||
Lines changed: 64 additions & 4 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
110 | 110 | | |
111 | 111 | | |
112 | 112 | | |
113 | | - | |
114 | | - | |
115 | | - | |
116 | | - | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
117 | 144 | | |
118 | 145 | | |
119 | 146 | | |
| |||
134 | 161 | | |
135 | 162 | | |
136 | 163 | | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
137 | 170 | | |
138 | 171 | | |
139 | 172 | | |
| |||
189 | 222 | | |
190 | 223 | | |
191 | 224 | | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
| 234 | + | |
| 235 | + | |
| 236 | + | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
| 241 | + | |
192 | 242 | | |
193 | 243 | | |
194 | 244 | | |
| |||
263 | 313 | | |
264 | 314 | | |
265 | 315 | | |
| 316 | + | |
| 317 | + | |
| 318 | + | |
| 319 | + | |
| 320 | + | |
| 321 | + | |
| 322 | + | |
| 323 | + | |
| 324 | + | |
| 325 | + | |
266 | 326 | | |
267 | 327 | | |
268 | 328 | | |
| |||
Lines changed: 9 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
588 | 588 | | |
589 | 589 | | |
590 | 590 | | |
| 591 | + | |
| 592 | + | |
| 593 | + | |
| 594 | + | |
| 595 | + | |
| 596 | + | |
| 597 | + | |
| 598 | + | |
| 599 | + | |
591 | 600 | | |
592 | 601 | | |
593 | 602 | | |
| |||
Lines changed: 24 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
188 | 188 | | |
189 | 189 | | |
190 | 190 | | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
| 199 | + | |
| 200 | + | |
| 201 | + | |
| 202 | + | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
| 209 | + | |
| 210 | + | |
| 211 | + | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
191 | 215 | | |
192 | 216 | | |
193 | 217 | | |
| |||
Lines changed: 148 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1316 | 1316 | | |
1317 | 1317 | | |
1318 | 1318 | | |
| 1319 | + | |
| 1320 | + | |
| 1321 | + | |
| 1322 | + | |
| 1323 | + | |
| 1324 | + | |
| 1325 | + | |
| 1326 | + | |
| 1327 | + | |
| 1328 | + | |
| 1329 | + | |
| 1330 | + | |
| 1331 | + | |
| 1332 | + | |
| 1333 | + | |
| 1334 | + | |
| 1335 | + | |
| 1336 | + | |
| 1337 | + | |
| 1338 | + | |
| 1339 | + | |
| 1340 | + | |
| 1341 | + | |
| 1342 | + | |
| 1343 | + | |
| 1344 | + | |
| 1345 | + | |
| 1346 | + | |
| 1347 | + | |
| 1348 | + | |
| 1349 | + | |
| 1350 | + | |
| 1351 | + | |
| 1352 | + | |
| 1353 | + | |
| 1354 | + | |
| 1355 | + | |
| 1356 | + | |
| 1357 | + | |
| 1358 | + | |
| 1359 | + | |
| 1360 | + | |
| 1361 | + | |
| 1362 | + | |
| 1363 | + | |
| 1364 | + | |
| 1365 | + | |
| 1366 | + | |
| 1367 | + | |
| 1368 | + | |
| 1369 | + | |
| 1370 | + | |
| 1371 | + | |
| 1372 | + | |
| 1373 | + | |
| 1374 | + | |
| 1375 | + | |
| 1376 | + | |
| 1377 | + | |
| 1378 | + | |
| 1379 | + | |
| 1380 | + | |
| 1381 | + | |
| 1382 | + | |
| 1383 | + | |
| 1384 | + | |
| 1385 | + | |
| 1386 | + | |
| 1387 | + | |
| 1388 | + | |
| 1389 | + | |
| 1390 | + | |
| 1391 | + | |
| 1392 | + | |
| 1393 | + | |
| 1394 | + | |
| 1395 | + | |
| 1396 | + | |
| 1397 | + | |
| 1398 | + | |
| 1399 | + | |
| 1400 | + | |
| 1401 | + | |
| 1402 | + | |
| 1403 | + | |
| 1404 | + | |
| 1405 | + | |
| 1406 | + | |
| 1407 | + | |
| 1408 | + | |
| 1409 | + | |
| 1410 | + | |
| 1411 | + | |
| 1412 | + | |
| 1413 | + | |
| 1414 | + | |
| 1415 | + | |
| 1416 | + | |
| 1417 | + | |
| 1418 | + | |
| 1419 | + | |
| 1420 | + | |
| 1421 | + | |
| 1422 | + | |
| 1423 | + | |
| 1424 | + | |
| 1425 | + | |
| 1426 | + | |
| 1427 | + | |
| 1428 | + | |
| 1429 | + | |
| 1430 | + | |
| 1431 | + | |
| 1432 | + | |
| 1433 | + | |
| 1434 | + | |
| 1435 | + | |
| 1436 | + | |
| 1437 | + | |
| 1438 | + | |
| 1439 | + | |
| 1440 | + | |
| 1441 | + | |
| 1442 | + | |
| 1443 | + | |
| 1444 | + | |
| 1445 | + | |
| 1446 | + | |
| 1447 | + | |
| 1448 | + | |
| 1449 | + | |
| 1450 | + | |
| 1451 | + | |
| 1452 | + | |
| 1453 | + | |
| 1454 | + | |
| 1455 | + | |
| 1456 | + | |
| 1457 | + | |
| 1458 | + | |
| 1459 | + | |
| 1460 | + | |
| 1461 | + | |
| 1462 | + | |
| 1463 | + | |
| 1464 | + | |
| 1465 | + | |
| 1466 | + | |
1319 | 1467 | | |
1320 | 1468 | | |
1321 | 1469 | | |
| |||
0 commit comments