Commit 17db718

add LexCHA Performance Benchmarks
1 parent 3abc50d commit 17db718

1 file changed

Lines changed: 27 additions & 0 deletions

README.md

@@ -199,6 +199,33 @@ The following table shows the execution time (in seconds) for generating all per

*Note: Benchmarks conducted on standard GitHub-hosted runners with `-O3` optimization.*

---

## LexCHA Performance Benchmarks

The following tables compare the C++ standard library (`std::next_permutation`, "Std") with the LexCHA SIMD-accelerated engine ("Acc") on two CPU architectures. The `(s)` columns give the total time in seconds to generate all N! permutations, the `(ns/perm)` columns divide that time by N!, and Speedup is Std divided by Acc.

### Environment 1: Cloud VM (GitHub Actions / AMD EPYC)
* **Compiler:** `g++ -O3 -march=native -std=c++17`
* **Target:** Strict byte-level vectorization (`_mm_shuffle_epi8`)

| N | Std (s) | Acc (s) | Std (ns/perm) | Acc (ns/perm) | Speedup |
| :---: | :---: | :---: | :---: | :---: | :---: |
| **10** | 0.011107 | 0.002524 | 3.060918 | 0.695410 | **4.40x** |
| **11** | 0.118945 | 0.027397 | 2.979828 | 0.686353 | **4.34x** |
| **12** | 1.422392 | 0.329176 | 2.969494 | 0.687213 | **4.32x** |
| **13** | 18.484087 | 4.282182 | 2.968368 | 0.687677 | **4.32x** |

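The byte-level shuffle named under **Target** can be illustrated with a minimal sketch. This is not LexCHA's actual kernel, which the commit does not show; `Bytes16` and `shuffle_bytes` are hypothetical names introduced here. The sketch uses `_mm_shuffle_epi8` when the compiler enables SSSE3 (as `-march=native` does on the CPUs above) and falls back to an equivalent scalar loop otherwise.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__SSSE3__)
#include <tmmintrin.h>  // _mm_shuffle_epi8 (SSSE3)
#endif

// Hypothetical 16-lane byte vector used only for this illustration.
using Bytes16 = std::array<std::uint8_t, 16>;

// Hypothetical helper: out[i] = src[idx[i]] for each of the 16 byte lanes.
// Every index is assumed to be in 0..15, so the SIMD and scalar paths agree
// (with the high bit set, _mm_shuffle_epi8 would write zero instead).
Bytes16 shuffle_bytes(const Bytes16& src, const Bytes16& idx) {
    for (std::uint8_t i : idx) assert(i < 16);

    Bytes16 out{};
#if defined(__SSSE3__)
    const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src.data()));
    const __m128i m = _mm_loadu_si128(reinterpret_cast<const __m128i*>(idx.data()));
    const __m128i r = _mm_shuffle_epi8(v, m);  // one instruction moves all 16 bytes
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out.data()), r);
#else
    // Scalar reference with the same result for in-range indices.
    for (std::size_t i = 0; i < 16; ++i) out[i] = src[idx[i] & 0x0F];
#endif
    return out;
}
```

With an index vector that is the identity except for lanes 14 and 15 swapped, `shuffle_bytes` exchanges the last two elements of a 16-byte permutation in a single shuffle, which is the kind of rearrangement a permutation step needs.
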
### Environment 2: Local Host (Intel Core Architecture)
* **Compiler:** `g++ -O3 -march=native -std=c++17`
* **Advantage:** Dedicated Intel shuffle ports and optimal Store-to-Load Forwarding (STLF).

| N | Std (s) | Acc (s) | Std (ns/perm) | Acc (ns/perm) | Speedup |
| :---: | :---: | :---: | :---: | :---: | :---: |
| **10** | 0.013980 | 0.001881 | 3.852431 | 0.518463 | **7.43x** |
| **11** | 0.150596 | 0.021415 | 3.772737 | 0.536486 | **7.03x** |
| **12** | 1.783026 | 0.255463 | 3.722380 | 0.533324 | **6.98x** |
| **13** | 22.455285 | 3.312170 | 3.606104 | 0.531903 | **6.78x** |

---

## 💻 Source Code
