perf: optimize `map_extract` function lookup for common key types by lyne7-sc · Pull Request #21237 · apache/datafusion

lyne7-sc · 2026-03-29T14:37:45Z

Which issue does this PR close?

Closes #.

Rationale for this change

This PR optimizes map_extract for common key types. Previously, map_extract compared keys by building single-element array slices for each comparison. That is more expensive than necessary for common key types such as primitive integers, strings, and binary values.

What changes are included in this PR?

specialize map_extract key lookup for primitive, string, and binary key types using direct value comparison instead of array slicing
keep the generic fallback for unsupported key types
add a map_extract benchmark with representative cases

Benchmark

group                                                main                                    optimized
-----                                                ----                                    ---------
map_extract_1000_binary_found_last                   97.64   174.3±6.10ms        ? ?/sec     1.00  1784.9±63.47µs        ? ?/sec
map_extract_1000_binary_view_found_last              72.04   188.1±3.65ms        ? ?/sec     1.00      2.6±0.18ms        ? ?/sec
map_extract_1000_int32_found_last                    388.31   154.4±2.12ms        ? ?/sec    1.00   397.5±23.08µs        ? ?/sec
map_extract_1000_utf8_found_last                     80.58   173.5±6.90ms        ? ?/sec     1.00      2.2±0.06ms        ? ?/sec
map_extract_1000_utf8_found_middle                   79.85    84.7±2.36ms        ? ?/sec     1.00  1061.0±53.43µs        ? ?/sec
map_extract_1000_utf8_view_found_last                64.81   175.7±5.23ms        ? ?/sec     1.00      2.7±0.12ms        ? ?/sec

Are these changes tested?

Yes. slt passed, and benchmark coverage was added.

Are there any user-facing changes?

No.

rluvaton · 2026-03-29T21:20:01Z

Damn

map_extract_1000_int32_found_last                    388.31   154.4±2.12ms        ? ?/sec    1.00   397.5±23.08µs        ? ?/sec

rluvaton · 2026-03-29T21:20:52Z

Please create a seperate PR for the benchmarks so we can run then with comparison to this optimization

lyne7-sc · 2026-03-30T14:09:54Z

Please create a seperate PR for the benchmarks so we can run then with comparison to this optimization

@rluvaton thanks, opened a separate PR for the benchmarks here: #21251

optimize map_extract

773f88d

github-actions bot added sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation labels Mar 29, 2026

lyne7-sc mentioned this pull request Mar 30, 2026

bench: add benchmark for map_extract function #21251

Open

lyne7-sc and others added 3 commits April 7, 2026 22:49

add bench case

7e3177d

remove benchmark (moved to separate PR)

7e6eb12

Merge branch 'main' into perf/map_extract

e84e86c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: optimize `map_extract` function lookup for common key types#21237

perf: optimize `map_extract` function lookup for common key types#21237
lyne7-sc wants to merge 4 commits intoapache:mainfrom
lyne7-sc:perf/map_extract

lyne7-sc commented Mar 29, 2026

Uh oh!

rluvaton commented Mar 29, 2026 •

edited

Loading

Uh oh!

rluvaton commented Mar 29, 2026

Uh oh!

lyne7-sc commented Mar 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

lyne7-sc commented Mar 29, 2026

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Benchmark

Are these changes tested?

Are there any user-facing changes?

Uh oh!

rluvaton commented Mar 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rluvaton commented Mar 29, 2026

Uh oh!

lyne7-sc commented Mar 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

rluvaton commented Mar 29, 2026 •

edited

Loading