README.md: 23 additions & 22 deletions
@@ -8,15 +8,14 @@
 
 English | [简体中文](README.zh-CN.md)
 
-A CUDA SGEMM engineering notebook designed for both deep learning and interview presentation: from readable FP32 baselines to guarded Tensor Core WMMA, with cuBLAS-backed verification and explicit benchmark boundaries.
+This repository is a CUDA SGEMM case study presented as a technical whitepaper and kernel academy. It starts from readable FP32 baselines, climbs through tiled, bank-conflict-aware, double-buffer, and guarded Tensor Core WMMA paths, then frames every performance claim with explicit validation boundaries.
-Runtime tests and benchmarks require a CUDA-capable local machine. Hosted CI is limited to formatting, repository-structure, OpenSpec/governance, and Pages checks.
+Runtime tests and benchmarks require a local CUDA-capable machine. Hosted CI covers repository integrity, documentation, OpenSpec validation, and Pages buildability.
 
-## Start here (GitHub Pages)
+## GitHub Pages entry points
+
+The README is the executive summary. The long-form technical narrative lives on Pages.
 
 | Goal | Entry point |
 |------|-------------|
-| Open English home |[Docs Home](https://lessup.github.io/sgemm-optimization/en/)|
+| Open English home |[English Home](https://lessup.github.io/sgemm-optimization/en/)|
 | Open Chinese home |[中文首页](https://lessup.github.io/sgemm-optimization/zh/)|
-|Build and run once |[Getting Started](https://lessup.github.io/sgemm-optimization/en/getting-started)|