Commit c5634e9

Author: shijiashuai
docs: v0.3.0 documentation internationalization and professional refactor

- Restructure docs/ directory with bilingual support (en/ and zh-CN/)
- Add complete English translations of all technical guides:
  - GEMM Optimization (7-step journey)
  - Memory Optimization (coalesced access, vectorization)
  - Reduction Optimization (warp shuffle, online softmax)
  - FlashAttention (IO-aware attention)
  - CUDA 13 Features (Hopper architecture)
- Add Chinese translations of API and Architecture docs
- Create professional documentation navigation with language switching
- Reorganize changelog/ directory with archive/
- Add v0.3.0 release notes with bilingual content
- Update CHANGELOG.md with [0.3.0] entry
- Update README.md and README.zh-CN.md with new doc links

This release makes documentation accessible to both English and Chinese readers.
1 parent 20b7411 commit c5634e9

62 files changed

Lines changed: 12574 additions & 3050 deletions

.github/ISSUE_TEMPLATE/bug_report.md

Lines changed: 55 additions & 24 deletions
@@ -6,47 +6,78 @@ labels: bug
 assignees: ''
 ---
 
-## Bug Description
+## 🐛 Bug Description
+
 A clear and concise description of what the bug is.
 
-## Environment
+## 🔧 Environment
+
+Please fill in the following information:
 
 | Item | Value |
 |------|-------|
-| OS | (e.g., Ubuntu 22.04) |
-| CUDA Version | (e.g., 13.1) |
-| GPU | (e.g., RTX 4090) |
-| Driver Version | (e.g., 545.23.08) |
-| CMake Version | (e.g., 3.28) |
-| Compiler | (e.g., GCC 11.4) |
+| **OS** | e.g., Ubuntu 22.04 |
+| **CUDA Version** | e.g., 12.4 |
+| **GPU** | e.g., RTX 4090 (SM 89) |
+| **Driver Version** | e.g., 545.23.08 |
+| **CMake Version** | e.g., 3.28 |
+| **Compiler** | e.g., GCC 11.4 |
+| **Python Version** | e.g., 3.11 (if using bindings) |
+
+## 📋 Steps to Reproduce
+
+1. Clone repository: `git clone https://github.com/LessUp/hpc-ai-optimization-lab.git`
+2. Build command: `cmake -S . -B build && cmake --build build`
+3. Run command: `...`
+4. See error
 
-## Steps to Reproduce
+### Minimal Reproducible Example
+
+```cpp
+// If applicable, provide minimal code to reproduce the issue
+#include "module/kernel.cuh"
+
+int main() {
+    // ...
+    return 0;
+}
+```
 
-1. Go to '...'
-2. Run '...'
-3. See error
+## ✅ Expected Behavior
 
-## Expected Behavior
 A clear and concise description of what you expected to happen.
 
-## Actual Behavior
+## ❌ Actual Behavior
+
 A clear and concise description of what actually happened.
 
-## Error Output
+## 📜 Error Output
+
 ```
-Paste any error messages or logs here
+Paste any error messages, warnings, or logs here
 ```
 
-## Minimal Reproducible Example
-```cpp
-// Paste minimal code to reproduce the issue
-```
+## 📊 Performance Issue? (if applicable)
+
+If this is a performance-related bug, please include:
+
+| Metric | Expected | Actual |
+|--------|----------|--------|
+| Execution Time | | |
+| TFLOPS | | |
+| Memory Bandwidth | | |
+
+## 🔍 Additional Context
+
+Add any other context about the problem here:
+- Did this work in a previous version?
+- Are you using any special build flags?
+- Any workarounds you've tried?
 
-## Additional Context
-Add any other context about the problem here.
+## ✅ Checklist
 
-## Checklist
-- [ ] I have searched existing issues to ensure this bug hasn't been reported
+- [ ] I have searched [existing issues](https://github.com/LessUp/hpc-ai-optimization-lab/issues) to ensure this bug hasn't been reported
 - [ ] I have provided all required environment information
 - [ ] I have included steps to reproduce the bug
 - [ ] I have included relevant error messages/logs
+- [ ] I have tested with the latest version on the main branch
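
The performance table in this template asks for TFLOPS and memory bandwidth, which reporters usually derive from a measured kernel time. The following is a minimal sketch of that arithmetic, assuming a GEMM-style kernel whose operation count is 2·M·N·K. The helper names `gemm_tflops` and `bandwidth_gbps` are hypothetical and are not part of the repository.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helpers for filling in the performance table; not repository code.

// Achieved TFLOPS for an M x N x K GEMM, counting one multiply and one add
// per inner-product term (2 * M * N * K floating-point operations in total).
inline double gemm_tflops(std::size_t m, std::size_t n, std::size_t k,
                          double elapsed_ms) {
    const double flops = 2.0 * static_cast<double>(m) *
                         static_cast<double>(n) * static_cast<double>(k);
    const double seconds = elapsed_ms * 1e-3;
    return flops / seconds / 1e12;
}

// Effective memory bandwidth in GB/s, where bytes_moved is the total number
// of bytes read plus bytes written by the kernel.
inline double bandwidth_gbps(std::size_t bytes_moved, double elapsed_ms) {
    const double seconds = elapsed_ms * 1e-3;
    return static_cast<double>(bytes_moved) / seconds / 1e9;
}
```

For example, a 4096 x 4096 x 4096 GEMM that takes 10 ms works out to roughly 13.7 TFLOPS under this operation count.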

.github/ISSUE_TEMPLATE/config.yml

Lines changed: 16 additions & 3 deletions
@@ -1,8 +1,21 @@
+# GitHub Issue Template Configuration
+# See: https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/configuring-issue-templates-for-your-repository
+
 blank_issues_enabled: false
+
 contact_links:
-  - name: Documentation
+  - name: 📚 Documentation
     url: https://github.com/LessUp/hpc-ai-optimization-lab/tree/main/docs
     about: Check the documentation before opening an issue
-  - name: Discussions
+
+  - name: 💬 Discussions
     url: https://github.com/LessUp/hpc-ai-optimization-lab/discussions
-    about: Ask questions and discuss ideas
+    about: Ask questions and discuss ideas with the community
+
+  - name: 🚀 Quick Start Guide
+    url: https://github.com/LessUp/hpc-ai-optimization-lab#-quick-start
+    about: Get started quickly with setup and build instructions
+
+  - name: 📖 API Reference
+    url: https://github.com/LessUp/hpc-ai-optimization-lab/blob/main/docs/API_REFERENCE.md
+    about: Browse the complete API documentation

.github/ISSUE_TEMPLATE/feature_request.md

Lines changed: 64 additions & 19 deletions
@@ -6,36 +6,81 @@ labels: enhancement
 assignees: ''
 ---
 
-## Feature Description
-A clear and concise description of the feature you'd like.
+## 💡 Feature Description
 
-## Problem Statement
-Is your feature request related to a problem? Please describe.
-A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
+A clear and concise description of the feature you'd like to see implemented.
 
-## Proposed Solution
-A clear and concise description of what you want to happen.
+## 🎯 Problem Statement
 
-## Alternatives Considered
-A clear and concise description of any alternative solutions or features you've considered.
+Is your feature request related to a problem? Please describe:
+
+**As a** [type of user],
+**I want** [goal/desire],
+**So that** [benefit/reason].
+
+Example: As a CUDA developer learning optimization, I want a tutorial on [topic], so that I can understand [concept].
+
+## 🛠️ Proposed Solution
 
-## Use Case
-Describe the use case for this feature. Who would benefit from it?
+A clear and concise description of what you want to happen.
+
+### Implementation Ideas (Optional)
 
-## Implementation Ideas (Optional)
-If you have ideas about how this could be implemented, share them here.
+If you have ideas about how this could be implemented:
 
 ```cpp
 // Optional: pseudo-code or API design
+namespace hpc::module {
+template<typename T>
+void new_kernel(const T* input, T* output, size_t n);
+}
 ```
 
-## Additional Context
-Add any other context, screenshots, or references about the feature request here.
+## 🔄 Alternatives Considered
+
+A clear and concise description of any alternative solutions or features you've considered.
+
+## 📈 Use Cases
+
+Describe specific use cases for this feature:
+
+1. **Use Case 1**: Description of how this feature would be used
+2. **Use Case 2**: Another scenario where this is helpful
+
+## 📋 Affected Modules
+
+Which modules would this feature affect?
+
+- [ ] `src/common/` - Common utilities
+- [ ] `src/01_elementwise/` - Elementwise operations
+- [ ] `src/02_reduction/` - Reduction operations
+- [ ] `src/03_gemm/` - GEMM operations
+- [ ] `src/04_convolution/` - Convolution operations
+- [ ] `src/05_attention/` - Attention mechanisms
+- [ ] `src/06_quantization/` - Quantization utilities
+- [ ] `src/07_cuda13_features/` - CUDA 13 features
+- [ ] `python/` - Python bindings
+- [ ] `docs/` - Documentation
+- [ ] `tests/` - Testing infrastructure
+
+## 🔗 Related Work
+
+Are there any related:
+- Research papers?
+- Open source projects (CUTLASS, FlashAttention, etc.)?
+- GitHub issues or PRs?
+
+## 📊 Performance Expectations (if applicable)
+
+| Metric | Target |
+|--------|--------|
+| Throughput | e.g., X TFLOPS |
+| Latency | e.g., X ms |
+| Memory | e.g., X MB |
 
-## Related Issues/PRs
-- #issue_number (if applicable)
+## ✅ Checklist
 
-## Checklist
-- [ ] I have searched existing issues to ensure this feature hasn't been requested
+- [ ] I have searched [existing issues](https://github.com/LessUp/hpc-ai-optimization-lab/issues) to ensure this feature hasn't been requested
 - [ ] I have clearly described the problem this feature would solve
 - [ ] I have provided a clear description of the proposed solution
+- [ ] I have identified which modules would be affected
