
perf: Qwen image optimize. #1230

Open
shan-chen-feng wants to merge 2 commits into jd-opensource:main from shan-chen-feng:qwen_image_optimize

Conversation

@shan-chen-feng
Collaborator

No description provided.

@shan-chen-feng shan-chen-feng changed the title Qwen image optimize feat: Qwen image optimize. Apr 8, 2026
Contributor

@gemini-code-assist (Bot) left a comment


Code Review

This pull request introduces a communication and computation overlap mechanism for sequence parallelism in the DiT model. It adds a new QwenDoubleStreamAttnProcessorCMO2_0 implementation, updates the transformer block to support this mode via a new global flag, and refactors positional embedding handling. The review feedback highlights several style improvements, including replacing auto with explicit types, using the torch:: namespace, marking implementation classes as final, and fixing typos in parameter annotations.

Comment thread xllm/models/dit/npu/qwen_image_edit/pipeline_qwenimage_edit_plus.h Outdated
const torch::Tensor& encoder_hidden_states_mask = torch::Tensor(),
const torch::Tensor& attention_mask = torch::Tensor(),
const std::tuple<at::Tensor, at::Tensor>& image_rotary_emb = {}) {
const std::tuple<at::Tensor, at::Tensor>& image_rotary_emb = {}) = 0;
Contributor


Severity: high

Use the torch:: namespace instead of at:: for tensor declarations.

Suggested change
const std::tuple<at::Tensor, at::Tensor>& image_rotary_emb = {}) = 0;
const std::tuple<torch::Tensor, torch::Tensor>& image_rotary_emb = {}) = 0;
References
  1. Use torch:: namespace instead of at:: or c10:: wherever possible. Prefer the highest-level PyTorch C++ API. (link)

Comment thread xllm/models/dit/npu/qwen_image_edit/transformer_qwen_image.h Outdated
Comment thread xllm/models/dit/npu/qwen_image_edit/transformer_qwen_image.h Outdated
Comment on lines +1565 to +1566
/*pre_tockens=*/65535,
/*next_tockens=*/65535);
Contributor


Severity: high

Correct the typos in the parameter annotations: tockens should be tokens.

Suggested change
/*pre_tockens=*/65535,
/*next_tockens=*/65535);
/*pre_tokens=*/65535,
/*next_tokens=*/65535);
References
  1. Annotate constant arguments with a comment indicating the parameter name when calling functions or constructors. (link)

Comment thread xllm/models/dit/npu/qwen_image_edit/transformer_qwen_image.h Outdated
Comment thread xllm/models/dit/npu/qwen_image_edit/transformer_qwen_image.h Outdated
@XuZhang99 XuZhang99 changed the title feat: Qwen image optimize. perf: Qwen image optimize. Apr 8, 2026

1 participant