
Add opt-in Exclusive Self Attention support #13710

Open

taivu1998 wants to merge 1 commit into huggingface:main from taivu1998:tdv/issue-13447-xsa

Conversation

@taivu1998

Summary

Adds opt-in support for Exclusive Self Attention (XSA) in the shared Diffusers attention stack.

  • Adds exclusive_self_attention=False to Attention.
  • Applies the XSA projection after the attention output and before the output projection, only on true self-attention calls.
  • Wires the option through BasicTransformerBlock (a constructor sketch follows below the summary).
  • Exposes the option through Transformer2DModel, DiTTransformer2DModel, and PixArtTransformer2DModel configs.
  • Adds focused tests for the XSA math, self-vs-cross attention gating, processor variants, and model config/forward propagation.

Fixes #13447.
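
As a quick illustration of the Attention and BasicTransformerBlock wiring above, a minimal constructor sketch; the dimensions are arbitrary, and only the exclusive_self_attention argument comes from this PR:

from diffusers.models.attention import BasicTransformerBlock

# Hedged sketch: dim/head sizes are arbitrary; exclusive_self_attention is the
# opt-in flag added by this PR and defaults to False.
block = BasicTransformerBlock(
    dim=64,
    num_attention_heads=4,
    attention_head_dim=16,
    exclusive_self_attention=True,
)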

Motivation

Issue #13447 requests an optional Exclusive Self Attention mode based on:

z_i = y_i - (y_i @ v_i) / (v_i @ v_i) * v_i

This PR implements the equivalent normalized projection form:

value_normalized = F.normalize(value, p=2, dim=-1)
hidden_states = hidden_states - (hidden_states * value_normalized).sum(dim=-1, keepdim=True) * value_normalized
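
The two forms are algebraically identical; a small sanity check on random tensors (not part of the PR's test suite):

import torch
import torch.nn.functional as F

y = torch.randn(2, 8, 16)  # attention output per token
v = torch.randn(2, 8, 16)  # corresponding value vectors

# Issue formulation: subtract the projection of y_i onto v_i.
z_issue = y - (y * v).sum(dim=-1, keepdim=True) / (v * v).sum(dim=-1, keepdim=True) * v

# PR formulation: normalize v first, then subtract the projection.
v_hat = F.normalize(v, p=2, dim=-1)
z_pr = y - (y * v_hat).sum(dim=-1, keepdim=True) * v_hat

assert torch.allclose(z_issue, z_pr, atol=1e-5)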

The flag is stored on the Attention module rather than on processor instances, so it remains stable when attention processors are swapped.
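
Because the flag lives on the module, a sketch like the following (assuming this PR's constructor argument) keeps working across processor swaps:

from diffusers.models.attention_processor import Attention, AttnProcessor, AttnProcessor2_0

# Hedged sketch: sizes are arbitrary; exclusive_self_attention is the new module-level flag.
attn = Attention(query_dim=64, heads=4, dim_head=16, exclusive_self_attention=True)

attn.set_processor(AttnProcessor())     # swap to the eager processor
assert attn.exclusive_self_attention    # flag is untouched by the swap

attn.set_processor(AttnProcessor2_0())  # swap to the SDPA processor
assert attn.exclusive_self_attention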

Scope

Supported in this PR:

  • AttnProcessor
  • AttnProcessor2_0
  • FusedAttnProcessor2_0
  • SlicedAttnProcessor
  • XFormersAttnProcessor
  • AttnProcessorNPU
  • XLAFlashAttnProcessor2_0

The projection is applied only when the current call is true self-attention (encoder_hidden_states is None). Cross-attention remains unchanged even if sequence lengths happen to match.
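
A hedged sketch of that gating; the helper name _apply_xsa is illustrative, not the PR's actual code:

import torch
import torch.nn.functional as F

def _apply_xsa(hidden_states: torch.Tensor, value: torch.Tensor) -> torch.Tensor:
    # Remove, per token, the component of the attention output that lies along
    # the normalized value vector.
    value_normalized = F.normalize(value, p=2, dim=-1)
    projection = (hidden_states * value_normalized).sum(dim=-1, keepdim=True) * value_normalized
    return hidden_states - projection

# Inside a processor's __call__, roughly:
#     if attn.exclusive_self_attention and encoder_hidden_states is None:
#         hidden_states = _apply_xsa(hidden_states, value)
#     hidden_states = attn.to_out[0](hidden_states)  # output projection follows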

Model-level exposure is included for the following models (usage sketch after the list):

  • Transformer2DModel
  • DiTTransformer2DModel
  • PixArtTransformer2DModel
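
For example, a sketch with deliberately tiny dimensions; only the exclusive_self_attention argument is new here:

from diffusers import DiTTransformer2DModel

# Hedged sketch: a tiny config just to show the opt-in flag, which is stored in
# the model config and propagated to the self-attention layers.
model = DiTTransformer2DModel(
    sample_size=8,
    patch_size=2,
    in_channels=4,
    num_layers=1,
    num_attention_heads=2,
    attention_head_dim=8,
    exclusive_self_attention=True,
)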

Deferred

This intentionally does not change added-KV processors, joint attention processors, direct-dispatch model-specific processors, or full U-Net constructor propagation. Those paths have more ambiguous token/value pairing semantics and are better handled in follow-up PRs if maintainers want broader coverage.

Validation

Ran:

PYTHONPATH=src python -m pytest tests/models/test_exclusive_self_attention.py
PYTHONPATH=src python -m pytest tests/models/test_layers_utils.py -k "test_spatial_transformer_default or exclusive_self_attention"
PYTHONPATH=src python -m pytest tests/models/transformers/test_models_dit_transformer2d.py -k "test_output or exclusive_self_attention"
PYTHONPATH=src python -m pytest tests/models/transformers/test_models_pixart_transformer2d.py -k "test_output or exclusive_self_attention"
python -m py_compile src/diffusers/models/attention_processor.py src/diffusers/models/attention.py src/diffusers/models/transformers/transformer_2d.py src/diffusers/models/transformers/dit_transformer_2d.py src/diffusers/models/transformers/pixart_transformer_2d.py tests/models/test_exclusive_self_attention.py tests/models/test_layers_utils.py tests/models/transformers/test_models_dit_transformer2d.py tests/models/transformers/test_models_pixart_transformer2d.py
uvx ruff check src/diffusers/models/attention_processor.py src/diffusers/models/attention.py src/diffusers/models/transformers/transformer_2d.py src/diffusers/models/transformers/dit_transformer_2d.py src/diffusers/models/transformers/pixart_transformer_2d.py tests/models/test_exclusive_self_attention.py tests/models/test_layers_utils.py tests/models/transformers/test_models_dit_transformer2d.py tests/models/transformers/test_models_pixart_transformer2d.py
uvx ruff format --check src/diffusers/models/attention_processor.py src/diffusers/models/attention.py src/diffusers/models/transformers/transformer_2d.py src/diffusers/models/transformers/dit_transformer_2d.py src/diffusers/models/transformers/pixart_transformer_2d.py tests/models/test_exclusive_self_attention.py tests/models/test_layers_utils.py tests/models/transformers/test_models_dit_transformer2d.py tests/models/transformers/test_models_pixart_transformer2d.py
git diff --check

taivu1998 marked this pull request as ready for review May 11, 2026 03:13

Successfully merging this pull request may close these issues.

Support for Exclusive Self Attention (XSA) in Diffusion Models (DiT, U-Net, etc.)
