LoRA: Implementing kernels using CUBE computation unit #432

Merged
ping1jing2 merged 8 commits into sgl-project:main from vlserov:vlserov/lora_kernels_cube on May 6, 2026

@vlserov vlserov commented Apr 8, 2026

Implements the LoRA kernels on the CUBE computation unit instead of the VECTOR computation unit.
This is an update to PR #384.
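
For context, the LoRA shrink and expand steps that the `sgemmc_shrink` and `sgemmc_expand` ops in this PR are named after can be sketched as two small matrix products. The block below is a plain CPU reference, not the NPU kernel code from this PR. The function names, the row-major layouts, and the `scale` parameter are assumptions made for illustration, and per-token adapter selection is omitted.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for the shrink step (not the PR's kernel).
// Assumed layouts: x is [tokens, hidden], a is [rank, hidden], both row-major.
// Returns v = x * a^T with shape [tokens, rank].
std::vector<float> lora_shrink_ref(const std::vector<float>& x,
                                   const std::vector<float>& a,
                                   std::size_t tokens, std::size_t hidden,
                                   std::size_t rank) {
    std::vector<float> v(tokens * rank, 0.0f);
    for (std::size_t t = 0; t < tokens; ++t) {
        for (std::size_t r = 0; r < rank; ++r) {
            float acc = 0.0f;
            for (std::size_t h = 0; h < hidden; ++h) {
                acc += x[t * hidden + h] * a[r * hidden + h];
            }
            v[t * rank + r] = acc;
        }
    }
    return v;
}

// Hypothetical CPU reference for the expand step (not the PR's kernel).
// Assumed layouts: v is [tokens, rank], b is [out_dim, rank], y is [tokens, out_dim].
// Accumulates y += scale * (v * b^T) in place.
void lora_expand_ref(const std::vector<float>& v,
                     const std::vector<float>& b,
                     std::vector<float>& y,
                     std::size_t tokens, std::size_t rank,
                     std::size_t out_dim, float scale) {
    for (std::size_t t = 0; t < tokens; ++t) {
        for (std::size_t o = 0; o < out_dim; ++o) {
            float acc = 0.0f;
            for (std::size_t r = 0; r < rank; ++r) {
                acc += v[t * rank + r] * b[o * rank + r];
            }
            y[t * out_dim + o] += scale * acc;
        }
    }
}
```

A reference of this kind is mainly useful for checking a device kernel's output on small inputs.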

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces new LoRA operations, sgemmc_expand and sgemmc_shrink, to the NPU kernel library, including the necessary host-side tiling logic and PyTorch extension bindings. The review identified several critical issues in the kernel implementations, specifically regarding incorrect core-to-token mapping, missing tensor offsets for input and weight data, and improper indexing during data copy operations. Additionally, a potential division-by-zero vulnerability was highlighted in the host-side tiling calculation.
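
The classes of issue named in this review (core-to-token mapping, per-token tensor offsets, and a zero divisor in tiling) can be illustrated with a small host-side sketch. This is not the code from the PR or the fix that was applied. The names `TokenRange`, `ceil_div`, `tokens_for_core`, and `row_offset` are hypothetical, and the contiguous token split is an assumption.

```cpp
#include <algorithm>
#include <cstdint>
#include <stdexcept>

// Hypothetical half-open range of tokens [begin, end) handled by one core.
struct TokenRange {
    uint32_t begin;
    uint32_t end;
};

// Ceiling division that rejects a zero divisor instead of dividing by it.
inline uint32_t ceil_div(uint32_t n, uint32_t d) {
    if (d == 0) {
        throw std::invalid_argument("ceil_div: divisor is zero");
    }
    return n / d + (n % d != 0 ? 1u : 0u);
}

// Assumed mapping: each core gets one contiguous slice of tokens.
// Cores past the end of the token list get an empty range.
inline TokenRange tokens_for_core(uint32_t num_tokens, uint32_t num_cores,
                                  uint32_t core_idx) {
    if (num_cores == 0) {
        throw std::invalid_argument("tokens_for_core: num_cores is zero");
    }
    if (core_idx >= num_cores) {
        throw std::out_of_range("tokens_for_core: core_idx out of range");
    }
    const uint64_t per_core = ceil_div(num_tokens, num_cores);
    const uint64_t begin = std::min<uint64_t>(core_idx * per_core, num_tokens);
    const uint64_t end = std::min<uint64_t>(begin + per_core, num_tokens);
    return TokenRange{static_cast<uint32_t>(begin), static_cast<uint32_t>(end)};
}

// Element offset of a token's row in a row-major [tokens, row_len] tensor.
// Each core must add this offset before reading or writing its rows.
inline uint64_t row_offset(uint32_t token, uint32_t row_len) {
    return static_cast<uint64_t>(token) * row_len;
}
```

With 10 tokens on 4 cores, this split gives core 0 the range [0, 3) and core 3 the range [9, 10). The offsets are computed in 64 bits so that large tensors do not overflow a 32-bit index.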

Comment thread csrc/lora/op_kernel/sgemmc_expand_kernel.cpp Outdated
Comment thread csrc/lora/op_kernel/sgemmc_expand_kernel.cpp Outdated
Comment thread csrc/lora/op_kernel/sgemmc_expand_kernel.cpp Outdated
Comment thread csrc/lora/op_kernel/sgemmc_expand_kernel.cpp Outdated
Comment thread csrc/lora/op_kernel/sgemmc_shrink_kernel.cpp Outdated
Comment thread csrc/lora/op_kernel/sgemmc_shrink_kernel.cpp Outdated
Comment thread csrc/lora/op_kernel/sgemmc_shrink_kernel.cpp Outdated
Comment thread csrc/lora/op_host/sgemmc_expand.cpp
@vlserov vlserov marked this pull request as draft April 8, 2026 16:30
@vlserov vlserov marked this pull request as ready for review April 24, 2026 06:25

@ssshinigami ssshinigami left a comment

LGTM

@ping1jing2 ping1jing2 merged commit d6733df into sgl-project:main May 6, 2026
7 checks passed