|
1 | | - |
2 | | - |
3 | | -<p align="center"> |
4 | | -<img src="img/README.assets/logo.png" width="20%"> <br> |
5 | | -</p> |
6 | | -<div align="center"> |
7 | | -<h1>SparseSSM</h1> |
8 | | - <div align="center"> |
9 | | - <a href="https://opensource.org/licenses/Apache-2.0"> |
10 | | - <img alt="License: Apache 2.0" src="https://img.shields.io/badge/License-Apache%202.0-4E94CE.svg"> |
11 | | - </a> |
12 | | - <a href="https://github.com/state-spaces/mamba"> |
13 | | - <img alt="Mamba" src="https://img.shields.io/badge/LLMs-Mamba-FAB093?style=flat-square"> |
14 | | - </a> |
15 | | - </div> |
16 | | - <p align="center"> |
17 | | - <h3>Efficient Selective Structured State Space Models Can Be Pruned in One-Shot</h3> |
18 | | -</p> |
19 | | -</div> |
20 | | - |
21 | | - |
22 | | - |
23 | | -State-space language models such as Mamba match Transformer quality while permitting linear complexity inference, yet still comprise billions of parameters that hinder deployment. Existing one-shot pruning methods are tailored to attention blocks and fail to account for the time-shared and discretized state-transition matrix at the heart of the selective state-space module (SSM). In this paper, we introduce ***SparseSSM***, the first training-free pruning framework that extends the classic optimal brain surgeon (OBS) framework to state space architectures. Our layer-wise algorithm **(i)** derives an approximate second-order saliency score that aggregates Hessian-trace information across time steps, **(ii)** incorporates a component sensitivity analysis to guide feed-forward network (FFN) pruning, which also sheds light on where redundancy resides in mamba architecture, **(iii)** can be easily extended to semi-structured and structured sparsity. Empirically, we prune 50% of SSM weights without fine-tuning and observe no zero-shot accuracy loss, achieving the current state-of-the-art pruning algorithm for Mamba-based LLMs. |
24 | | - |
25 | | - |
26 | | - |
27 | | -## 🚀Quick Start |
28 | | - |
29 | | -### 1. Install environment |
30 | | - |
31 | | -```bash |
32 | | -git clone https://github.com/CFinTech/SparseSSM |
33 | | -cd SparseSSM |
34 | | -pip install -r requirements.txt |
35 | | -``` |
36 | | - |
37 | | -### 2. Download dataset |
38 | | - |
39 | | -The data for calibrations can be downloaded [here](https://huggingface.co/datasets/mindchain/wikitext2). |
40 | | - |
41 | | -### 3. Execute |
42 | | - |
43 | | -To prune the SSM module, you can run the following command: |
44 | | - |
45 | | -```bash |
46 | | -CUDA_VISIBLE_DEVICES=${your_gpu_id} python main.py \ |
47 | | - path/to/your/model wikitext2 \ |
48 | | - --experiment_name your_experiment_name\ |
49 | | - --method "sparsessm_dev" \ |
50 | | - --save path/to/pruned_model \ |
51 | | - --sparsity 0.5 \ |
52 | | - --nsamples 64 \ |
53 | | - --minlayer 0 \ |
54 | | - --maxlayer 100 \ |
55 | | - --prune_A True \ |
56 | | - --do_prune \ |
57 | | - --eval_zero_shot \ |
58 | | - --log_wandb \ |
59 | | -``` |
60 | | - |
61 | | - |
62 | | - |
63 | | -## 🖼️ Method Overview |
64 | | - |
65 | | - |
66 | | - |
67 | | -Illustration of SparseSSM. The **first row** depicts the evolution of the diagonal parameter matrix $A_{log}$ within the SSM module in Mamba, together with a schematic of the forward-propagation process. In the **second row**, the **left panel** shows the procedure for obtaining a mask from the Hessian estimate at a single time step, while the **right panel** presents our weighted strategy for merging the masks across all time steps, darker background indicates larger weights. |
68 | | - |
69 | | -## 📊 Comparison of Experimental Results |
70 | | - |
71 | | -Performance analysis for one-shot unstructured pruning of SSM modules in Mamba models at $50\%$ sparsity. |
72 | | - |
73 | | - |
74 | | - |
75 | | -## 🙏 Acknowledgements |
76 | | - |
77 | | -- This source code is derived from the famous PyTorch reimplementation of [SparseGPT](https://github.com/IST-DASLab/sparsegpt) and [mamba-minimal](https://github.com/johnma2006/mamba-minimal). |
78 | | -- We use [Mamba checkpoints](https://huggingface.co/state-spaces) to test our method. |
79 | | -- The README file is inspired by [LLM-pruner](https://github.com/horseee/LLM-Pruner). |
80 | | - |
81 | | -## Citation |
82 | | - |
83 | | -If you find this work useful for your research, please consider citing our paper: |
84 | | - |
85 | | -``` |
86 | | -@article{tuo2025sparsessm, |
87 | | - title={SparseSSM: Efficient Selective Structured State Space Models Can Be Pruned in One-Shot}, |
88 | | - author={Kaiwen Tuo and Huan Wang}, |
89 | | - journal={arXiv preprint arXiv:2506.09613}, |
90 | | - year={2025}, |
91 | | -} |
92 | | -``` |
93 | | - |
| 1 | +# Academic Project Page Template |
| 2 | +This is an academic paper project page template. |
| 3 | + |
| 4 | + |
| 5 | +Example project pages built using this template are: |
| 6 | +- https://horwitz.ai/probex |
| 7 | +- https://vision.huji.ac.il/probegen |
| 8 | +- https://horwitz.ai/mother |
| 9 | +- https://horwitz.ai/spectral_detuning |
| 10 | +- https://vision.huji.ac.il/ladeda |
| 11 | +- https://vision.huji.ac.il/dsire |
| 12 | +- https://horwitz.ai/podd |
| 13 | +- https://dreamix-video-editing.github.io |
| 14 | +- https://horwitz.ai/conffusion |
| 15 | +- https://horwitz.ai/3d_ads/ |
| 16 | +- https://vision.huji.ac.il/ssrl_ad |
| 17 | +- https://vision.huji.ac.il/deepsim |
| 18 | + |
| 19 | + |
| 20 | + |
| 21 | +## Start using the template |
| 22 | +To start using the template, click on `Use this Template`.
| 23 | + |
| 24 | +The template uses HTML for controlling the content and CSS for controlling the style.
| 25 | +To edit the website's contents, edit the `index.html` file. It contains different HTML "building blocks"; use whichever ones you need and comment out the rest.
| 26 | + |
| 27 | +**IMPORTANT!** Make sure to replace the `favicon.ico` under `static/images/` with one of your own; otherwise, your favicon is going to be a DreamBooth image of me.
| 28 | + |
| 29 | +## Components |
| 30 | +- Teaser video |
| 31 | +- Images Carousel |
| 32 | +- YouTube embedding
| 33 | +- Video Carousel |
| 34 | +- PDF Poster |
| 35 | +- BibTeX citation
| 36 | + |
| 37 | +## Tips
| 38 | +- The `index.html` file contains comments instructing you what to replace, you should follow these comments. |
| 39 | +- The `meta` tags in the `index.html` file are used to provide metadata about your paper |
| 40 | +(e.g. helping search engines index the website, showing a preview image when sharing the website, etc.).
| 41 | +- The resolution of images and videos can usually be around 1920-2048 pixels; there is rarely a need for a higher resolution, which takes longer to load.
| 42 | +- All the images and videos you use should be compressed to allow for fast loading of the website (and thus better indexing by search engines). For images, you can use [TinyPNG](https://tinypng.com); for videos, you need to find the tradeoff between size and quality.
| 43 | +- When using large video files (larger than 10MB), it's better to use YouTube for hosting the video, as serving the video from the website can take time.
| 44 | +- Using a tracker can help you analyze the traffic and see where users came from. [statcounter](https://statcounter.com) is a free, easy-to-use tracker that takes under 5 minutes to set up.
| 45 | +- This project page can also be made into a GitHub Pages website.
| 46 | +- Replace the favicon with one of your choosing (the default one is of the Hebrew University).
| 47 | +- Suggestions, improvements, and comments are welcome; simply open an issue or contact me. You can find my contact information at [https://horwitz.ai](https://horwitz.ai).
| 48 | + |
| 49 | +## Acknowledgments |
| 50 | +Parts of this project page were adopted from the [Nerfies](https://nerfies.github.io/) page. |
| 51 | + |
| 52 | +## Website License |
| 53 | +<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>. |