refactor(query): support partitioned hash join #19553

Open
zhang2014 wants to merge 39 commits into databendlabs:main from zhang2014:refactor/partitioned-hash-join

zhang2014 (Member) commented Mar 16, 2026

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

refactor(query): support partitioned hash join

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

zhang2014 and others added 3 commits March 16, 2026 11:04
Under hash shuffle, build and probe data are already partitioned by thread.
This replaces the shared hash table (atomic CAS, Mutex, Barrier) with a
Doris-style compact hash table (4 bytes/row index-based chain) that each
thread builds and probes independently, eliminating all synchronization
overhead.
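
As an illustration of the index-based chaining described above, here is a minimal Rust sketch. It is not the code from this PR: the name `CompactTable`, the `u32` row index, the 1-based row numbering with 0 as the end-of-chain sentinel, and the bucket sizing are all assumptions made for the example. Each build row costs one `u32` link in `next`, which is one way to arrive at a 4 bytes/row chain.

```rust
/// Hypothetical sketch of an index-chained hash table (not the PR's code).
/// Rows are numbered from 1 so that 0 can mean "empty" / "end of chain".
pub struct CompactTable {
    /// first[bucket] = most recently inserted row in that bucket, or 0.
    first: Vec<u32>,
    /// next[row] = next row in the same bucket, or 0 at the end of the chain.
    next: Vec<u32>,
    /// Bucket count minus one; the bucket count is a power of two.
    mask: usize,
}

impl CompactTable {
    /// Build from precomputed key hashes, one per build row.
    /// Single-threaded: no atomics or locks are needed.
    pub fn build(hashes: &[u64]) -> Self {
        let buckets = (hashes.len().max(1) * 2).next_power_of_two();
        let mask = buckets - 1;
        let mut first = vec![0u32; buckets];
        let mut next = vec![0u32; hashes.len() + 1];
        for (i, h) in hashes.iter().enumerate() {
            let row = (i + 1) as u32;
            let bucket = (*h as usize) & mask;
            // Push the row onto the front of its bucket's chain.
            next[row as usize] = first[bucket];
            first[bucket] = row;
        }
        Self { first, next, mask }
    }

    /// Yield 0-based build rows that share a bucket with `hash`.
    /// These are only candidates: the caller still has to compare keys.
    pub fn candidates(&self, hash: u64) -> impl Iterator<Item = usize> + '_ {
        let mut cur = self.first[(hash as usize) & self.mask];
        std::iter::from_fn(move || {
            if cur == 0 {
                return None;
            }
            let row = cur as usize;
            cur = self.next[row];
            Some(row - 1)
        })
    }
}
```

Because the table is read-only after `build`, probing only needs `&self`, so a thread that owns its partition can probe it without any synchronization.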

- Reorganize memory/ into unpartitioned/ (broadcast) and partitioned/ (shuffle)
- Add CompactJoinHashTable<I: RowIndex> with index-based chaining
- Add PartitionedBuild with fixed 65536-row chunks and bit-shift addressing
- Implement all 7 join types for the partitioned path
- Route hash shuffle joins through partitioned pipeline in physical_hash_join
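
The fixed 65536-row chunks and bit-shift addressing mentioned in the list above can be sketched as follows. This is a hedged illustration only: `ChunkedRows`, `locate`, and the constants are hypothetical names, and the real `PartitionedBuild` presumably stores column data rather than a generic `Vec<T>`. Since 65536 is 2^16, a flat row index splits into a chunk number and an in-chunk offset with one shift and one mask.

```rust
/// 65536 = 1 << 16 rows per chunk (constant names are assumptions).
const CHUNK_BITS: u32 = 16;
const CHUNK_ROWS: usize = 1 << CHUNK_BITS;
const CHUNK_MASK: usize = CHUNK_ROWS - 1;

/// Split a flat row index into (chunk index, offset inside the chunk).
pub fn locate(row: usize) -> (usize, usize) {
    (row >> CHUNK_BITS, row & CHUNK_MASK)
}

/// Hypothetical chunked row store; every chunk except the last is full.
pub struct ChunkedRows<T> {
    chunks: Vec<Vec<T>>,
    len: usize,
}

impl<T> ChunkedRows<T> {
    pub fn new() -> Self {
        Self { chunks: Vec::new(), len: 0 }
    }

    /// Append a value and return its flat row index.
    pub fn push(&mut self, value: T) -> usize {
        let (chunk, _) = locate(self.len);
        if chunk == self.chunks.len() {
            // Start a new fixed-size chunk once the previous one is full.
            self.chunks.push(Vec::with_capacity(CHUNK_ROWS));
        }
        self.chunks[chunk].push(value);
        self.len += 1;
        self.len - 1
    }

    /// Look a row up by its flat index.
    pub fn get(&self, row: usize) -> Option<&T> {
        let (chunk, offset) = locate(row);
        self.chunks.get(chunk)?.get(offset)
    }
}
```

With a power-of-two chunk size, `locate` avoids integer division and modulo, and growing the store never moves rows that were already written.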

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…eam for partitioned hash join

Move visited bitmap from CompactJoinHashTable to PartitionedBuild so the
hash table is fully immutable after build. Introduce CompactProbeStream
implementing the ProbeStream trait for streaming probe with index-based
chaining. Replace eager probe()/probe_and_mark_visited() with streaming
create_probe_matched/create_probe factory methods. Rewrite all 7 join
types (inner, left, left semi, left anti, right, right semi, right anti)
with dedicated streaming JoinStream implementations. Right-side joins use
field-level split borrowing to avoid borrow conflicts between immutable
hash table access and mutable visited marking.
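
The field-level split borrowing mentioned above can be sketched in Rust as below. This is an assumption-laden illustration, not the PR's code: `BuildSide`, `probe_and_mark`, and `unvisited_rows` are hypothetical names, a `std::collections::HashMap` stands in for the compact hash table, and a `Vec<bool>` stands in for the visited bitmap. The point is only that destructuring `self` lets the table be borrowed immutably while the visited flags are borrowed mutably.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a build side that owns both the
/// (immutable after build) table and the visited flags.
pub struct BuildSide {
    table: HashMap<u64, Vec<usize>>,
    visited: Vec<bool>,
}

impl BuildSide {
    pub fn from_keys(keys: &[u64]) -> Self {
        let mut table: HashMap<u64, Vec<usize>> = HashMap::new();
        for (row, key) in keys.iter().enumerate() {
            table.entry(*key).or_default().push(row);
        }
        Self { table, visited: vec![false; keys.len()] }
    }

    /// Probe one key, mark every matching build row as visited,
    /// and return the number of matches.
    pub fn probe_and_mark(&mut self, key: u64) -> usize {
        // Split the borrow per field: `table` is only read,
        // `visited` is only written, so the two do not conflict.
        let Self { table, visited } = self;
        let mut matched = 0;
        if let Some(rows) = table.get(&key) {
            for &row in rows {
                visited[row] = true;
                matched += 1;
            }
        }
        matched
    }

    /// Build rows that no probe row matched, as a right or right anti
    /// join would need to emit after probing finishes.
    pub fn unvisited_rows(&self) -> Vec<usize> {
        self.visited
            .iter()
            .enumerate()
            .filter(|(_, v)| !**v)
            .map(|(i, _)| i)
            .collect()
    }
}
```

If the table lookup were hidden behind a `&self` method that returned a borrowing iterator, writing to `self.visited` while that iterator is alive would be rejected by the borrow checker, which is the kind of conflict the split avoids.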

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
github-actions bot added the pr-refactor (this PR changes the code base without new features or bugfix) label Mar 16, 2026
zhang2014 added the ci-cloud (Build docker image for cloud test) label Mar 27, 2026

github-actions bot commented

Docker Image for PR

  • tag: pr-19553-58597bf-1774625355

note: this image tag is only available for internal use.

@zhang2014 zhang2014 marked this pull request as ready for review March 31, 2026 06:56
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2b9fbcb46a

zhang2014 added and removed the ci-cloud (Build docker image for cloud test) label Apr 1, 2026

github-actions bot commented Apr 1, 2026

Docker Image for PR

  • tag: pr-19553-29f773e-1775055104

note: this image tag is only available for internal use.

zhang2014 added and removed the ci-cloud (Build docker image for cloud test) label Apr 1, 2026

github-actions bot commented Apr 1, 2026

Docker Image for PR

  • tag: pr-19553-80015d7-1775066889

note: this image tag is only available for internal use.

zhang2014 added and removed the ci-cloud (Build docker image for cloud test) label Apr 2, 2026

github-actions bot commented Apr 2, 2026

Docker Image for PR

  • tag: pr-19553-2b485da-1775102640

note: this image tag is only available for internal use.

zhang2014 added and removed the ci-cloud (Build docker image for cloud test) label Apr 2, 2026

github-actions bot commented Apr 2, 2026

Docker Image for PR

  • tag: pr-19553-b0d68ac-1775115117

note: this image tag is only available for internal use.

zhang2014 added and removed the ci-cloud (Build docker image for cloud test) label Apr 2, 2026

github-actions bot commented Apr 2, 2026

Docker Image for PR

  • tag: pr-19553-05c9fac-1775122949

note: this image tag is only available for internal use.

zhang2014 added and removed the ci-cloud (Build docker image for cloud test) label Apr 2, 2026

github-actions bot commented Apr 2, 2026

Docker Image for PR

  • tag: pr-19553-1ae441a-1775137830

note: this image tag is only available for internal use.

Labels

  • ci-cloud: Build docker image for cloud test
  • pr-refactor: this PR changes the code base without new features or bugfix

2 participants