Skip to content

SortingKernels: use int64_t type for num_tile#3555

Open
BartoszKokoszko wants to merge 30 commits into
intel:mainfrom
BartoszKokoszko:dev/bkokoszx/fix-outofmemory-error
Open

SortingKernels: use int64_t type for num_tile#3555
BartoszKokoszko wants to merge 30 commits into
intel:mainfrom
BartoszKokoszko:dev/bkokoszx/fix-outofmemory-error

Conversation

@BartoszKokoszko
Copy link
Copy Markdown
Contributor

In order to avoid num_tile overflow it should be
declared as int64_t type.

In order to avoid num_tile overflow it should be
declared as int64_t type.
Copilot AI review requested due to automatic review settings May 5, 2026 09:35
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the XPU SYCL sorting kernel to compute num_tiles in 64-bit to avoid overflow during the ceil-division calculation for large num_elements.

Changes:

  • Switches num_tiles in segmented_radix_sort_pairs_kernel from int to int64_t.
  • Performs the num_elements + TILE_PROCESSING_LENGTH - 1 arithmetic in 64-bit via static_cast<int64_t>(num_elements).

Skill files read:

  • .github/skills/xpu-ops-pr-review/SKILL.md
  • .github/skills/xpu-ops-pr-review/references/torch-xpu-ops-review-notes.md
  • .github/skills/xpu-ops-pr-review/references/review-checklist.md
  • .github/skills/xpu-ops-pr-review/references/bc-guidelines.md

Comment thread src/ATen/native/xpu/sycl/SortingKernels.h Outdated
Copilot AI review requested due to automatic review settings May 5, 2026 10:56
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Comment thread src/ATen/native/xpu/sycl/SortingKernels.h Outdated
Copilot AI review requested due to automatic review settings May 6, 2026 08:19
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Comment thread src/ATen/native/xpu/sycl/SortingKernels.h Outdated
Copilot AI review requested due to automatic review settings May 6, 2026 09:39
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

Comment thread src/ATen/native/xpu/sycl/SortingKernels.h
Comment thread src/ATen/native/xpu/sycl/SortingKernels.h Outdated
Copilot AI review requested due to automatic review settings May 11, 2026 07:51
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

Comment thread src/ATen/native/xpu/sycl/SortingKernels.h Outdated
Comment thread src/ATen/native/xpu/sycl/SortingKernels.h
Copilot AI review requested due to automatic review settings May 13, 2026 07:38
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

Comment thread src/ATen/native/xpu/sycl/SortingKernels.h Outdated
Comment thread src/ATen/native/xpu/sycl/SortingKernels.h
Copilot AI review requested due to automatic review settings May 13, 2026 13:02
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (2)

src/ATen/native/xpu/sycl/SortingKernels.h:194

  • sycl_kernel_submit(num_segments * num_tiles * GROUP_SIZE, ...) multiplies int values first, so the product can overflow in 32-bit signed arithmetic before being passed to the int64_t overload. Please promote to int64_t (and ideally check for overflow) when computing the global range.

This issue also appears on line 328 of the same file.

  auto caller = SegmentedRadixSortPairsUpsweepFunctor<method_t, key_t, value_t>(
      keys_in, counts, num_elements, begin_bit, end_bit);
  sycl_kernel_submit(
      num_segments * num_tiles * GROUP_SIZE,
      GROUP_SIZE,
      at::xpu::getCurrentSYCLQueue(),
      caller);

src/ATen/native/xpu/sycl/SortingKernels.h:342

  • sycl_kernel_submit(num_segments * num_tiles * GROUP_SIZE, ...) is still computed in int arithmetic here, which can overflow before conversion to the int64_t parameter type. Compute the global range using int64_t (and consider an overflow check) to avoid launching an incorrect ND-range.
  auto caller =
      SegmentedRadixSortPairsDownsweepFunctor<method_t, key_t, value_t>(
          keys_in,
          keys_out,
          values_in,
          values_out,
          num_elements,
          begin_bit,
          end_bit,
          count);
  sycl_kernel_submit(
      num_segments * num_tiles * GROUP_SIZE,
      GROUP_SIZE,
      at::xpu::getCurrentSYCLQueue(),
      caller);

Copilot AI review requested due to automatic review settings May 19, 2026 12:31
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

Comment thread src/ATen/native/xpu/sycl/SortingKernels.h Outdated
Comment thread src/ATen/native/xpu/sycl/SortingKernels.h Outdated
Comment thread src/ATen/native/xpu/sycl/SortingKernels.h Outdated
Comment thread src/ATen/native/xpu/sycl/SortingKernels.h Outdated
Comment thread src/ATen/native/xpu/sycl/SortingKernels.h Outdated
Copy link
Copy Markdown
Contributor

@BBBela BBBela left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
Thank you! 😉

Copy link
Copy Markdown

@pawel-olejniczak pawel-olejniczak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@BartoszKokoszko BartoszKokoszko requested a review from CuiYifeng May 26, 2026 15:14
Copy link
Copy Markdown
Contributor

@CuiYifeng CuiYifeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any test cases for such overflow? The remaining part of this PR LGTM.

Comment thread src/ATen/native/xpu/sycl/SortingKernels.h Outdated
@BartoszKokoszko
Copy link
Copy Markdown
Contributor Author

Do we have any test cases for such overflow? The remaining part of this PR LGTM.

@CuiYifeng

Directly no. It was taken from #3426

@CuiYifeng
Copy link
Copy Markdown
Contributor

Do we have any test cases for such overflow? The remaining part of this PR LGTM.

@CuiYifeng

Directly no. It was taken from #3426

Please add a test case in torch-xpu-ops referring to test_topk_large_k in test/test_sort_and_select.py.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants