Commit 1293f2a

Merge branch 'main' into ltx2pipelinespeedup
2 parents: bbd6444 + 3577280

2 files changed: 5 additions & 4 deletions

.github/workflows/pr_style_bot.yml

Lines changed: 4 additions & 3 deletions
```diff
@@ -5,13 +5,14 @@ on:
     types: [created]
 
 permissions:
-  contents: write
   pull-requests: write
+  contents: read
 
 jobs:
   style:
-    uses: huggingface/huggingface_hub/.github/workflows/style-bot-action.yml@e000c1c89c65aee188041723456ac3a479416d4c # main
+    uses: huggingface/huggingface_hub/.github/workflows/style-bot-action.yml@e2867e92c07d15e1bf18994d0a945ef5ad6b8d65
     with:
       python_quality_dependencies: "[quality]"
     secrets:
-      bot_token: ${{ secrets.HF_STYLE_BOT_ACTION }}
+      app_id: ${{ secrets.HF_BOT_STYLE_APP_ID }}
+      app_private_key: ${{ secrets.HF_BOT_STYLE_SECRET_PEM }}
```

docs/source/en/optimization/attention_backends.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -35,7 +35,7 @@ The [`~ModelMixin.set_attention_backend`] method iterates through all the module
 The example below demonstrates how to enable the `_flash_3_hub` implementation for FlashAttention-3 from the [`kernels`](https://github.com/huggingface/kernels) library, which allows you to instantly use optimized compute kernels from the Hub without requiring any setup.
 
 > [!NOTE]
-> FlashAttention-3 is not supported for non-Hopper architectures, in which case, use FlashAttention with `set_attention_backend("flash")`.
+> FlashAttention-3 requires Ampere GPUs at a minimum.
 
 ```py
 import torch
````
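
For context on the documentation change above, here is a minimal sketch of how the documented API is used, assuming a recent diffusers build where `ModelMixin.set_attention_backend` is available and the `kernels` package is installed; the `FluxPipeline` checkpoint and prompt are illustrative only, not part of this diff.

```py
import torch
from diffusers import FluxPipeline

# Illustrative pipeline/checkpoint; any diffusers model whose transformer
# exposes set_attention_backend should work the same way.
pipeline = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# Pull FlashAttention-3 kernels from the Hub via the `kernels` library.
pipeline.transformer.set_attention_backend("_flash_3_hub")

# On GPUs without FlashAttention-3 support, fall back to regular FlashAttention:
# pipeline.transformer.set_attention_backend("flash")

prompt = "A photo of a cat holding a sign that says hello"
image = pipeline(prompt).images[0]
```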
