Decentralized Fully Stochastic Primal-dual Gradient Algorithm (FSPDA)
Updated May 29, 2025 - Python
Implementation and evaluation of memory-efficient adapter-based fine-tuning for LLMs. This project compares a standard GPU-based adapter approach with a sparse CPU-based adapter (MEFT), analyzing training speed and GPU memory usage across different adapter ranks. It includes custom adapter injection into GPT-2, Top-K sparse activation, and results.
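The Top-K sparse activation mentioned above can be illustrated with a minimal pure-Python sketch. It is not taken from the repository: the function names `top_k_indices` and `sparse_adapter_forward`, the ReLU nonlinearity, the residual connection, and the row-per-unit layout of `w_down` and `w_up` are all assumptions made for illustration. The idea shown is that only the K most strongly activated adapter units contribute to the output, so only K rows of the up-projection need to be read.

```python
import heapq


def top_k_indices(activations, k):
    """Return the indices of the k largest activations, in ascending index order.

    Ties are broken in favor of the lower index.
    """
    if k <= 0:
        return []
    k = min(k, len(activations))
    top = heapq.nlargest(
        k, range(len(activations)), key=lambda i: (activations[i], -i)
    )
    return sorted(top)


def sparse_adapter_forward(x, w_down, w_up, k):
    """Hypothetical sparse adapter layer (illustrative only).

    x      : list of d floats (the hidden state)
    w_down : r rows of d floats (down-projection, one row per adapter unit)
    w_up   : r rows of d floats (up-projection, one row per adapter unit)
    k      : number of adapter units kept active
    """
    # Down-projection followed by ReLU (the nonlinearity is an assumption).
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_down]

    # Keep only the k most strongly activated units.
    active = top_k_indices(h, k)

    # Up-projection over the active rows only, added to the input (residual).
    out = list(x)
    for i in active:
        for j, w in enumerate(w_up[i]):
            out[j] += h[i] * w
    return out, active


if __name__ == "__main__":
    x = [1.0, 2.0]
    w_down = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]
    w_up = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, 5.0]]
    out, active = sparse_adapter_forward(x, w_down, w_up, k=2)
    print(active, out)
```

In this sketch the inactive rows of `w_up` are never touched, which is the property a CPU-resident adapter would rely on to limit the data moved to the GPU; how the actual project implements that transfer is not described here.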