🔥[FSDP 1/2] PyTorch FSDP: Getting Started with Fully Sharded Data Parallel(FSDP) (#139)

DefTruth · web-flow · commit 1694c32ae84b · 2025-04-25T16:03:10.000+08:00
diff --git a/README.md b/README.md
@@ -121,6 +121,8 @@ python3 download_pdfs.py # The code is generated by Doubao AI
 |2024.11|🔥🔥[**TP: Comm Compression**] Communication Compression for Tensor Parallel LLM Inference(@recogni.com)|[[pdf]](https://arxiv.org/pdf/2411.09510)| ⚠️|⭐️⭐️ |
 |2024.11|🔥🔥🔥[**SP: Star-Attention, 11x~ speedup**] Star Attention: Efficient LLM Inference over Long Sequences(@NVIDIA)|[[pdf]](https://arxiv.org/pdf/2411.17116)|[[Star-Attention]](https://github.com/NVIDIA/Star-Attention) ![](https://img.shields.io/github/stars/NVIDIA/Star-Attention.svg?style=social)|⭐️⭐️ |
 |2024.12|🔥🔥[**SP: TokenRing**] TokenRing: An Efficient Parallelism Framework for Infinite-Context LLMs via Bidirectional Communication(@SJTU) |[[pdf]](https://arxiv.org/pdf/2412.20501)|[[token-ring]](https://github.com/ACA-Lab-SJTU/token-ring) ![](https://img.shields.io/github/stars/ACA-Lab-SJTU/token-ring.svg?style=social)|⭐️⭐️ |
+|2025.05|🔥🔥[**FSDP 1/2**] PyTorch FSDP: Getting Started with Fully Sharded Data Parallel(FSDP) (@pytorch) | [[docs]](https://pytorch.org/tutorials/intermediate/FSDP_tutorial.html#getting-started-with-fully-sharded-data-parallel-fsdp) | ⚠️ |⭐️⭐️ |
+
 
 ### 📖Disaggregating Prefill and Decoding ([©️back👆🏻](#paperlist))
 <div id="P-D-Disaggregating"></div>