Commit e991ee7

docs(build): add unified caching strategy documentation
Document the unified caching strategy implemented across all build workflows.

**Contents:**
- Decision tree for choosing caching approach
- Three caching patterns (Native C++, WASM C++, Non-C++)
- Cache key strategy and naming conventions
- Five-layer cache hierarchy
- Build time comparisons
- Implementation checklist for new workflows
- Troubleshooting guide

**Patterns:**
1. Native C++ (smol): ccache + build directory + output
2. WASM C++ (Yoga, ONNX): build directory + output
3. Non-C++ (AI, SEA): output only

**Cache Layers:**
- Layer 1: Build dependencies (Python, Ninja, Emscripten)
- Layer 2: Source code (.node-source/, .yoga-source/, .onnx-source/)
- Layer 3: Intermediate build (CMake cache, compiled objects)
- Layer 4: Compilation cache (ccache for native only)
- Layer 5: Final output (dist/ artifacts)

This provides a reference for maintaining consistency as new build workflows are added.
1 parent 4df2b7b commit e991ee7

1 file changed

+213
-0
lines changed

docs/build/caching-strategy.md

Lines changed: 213 additions & 0 deletions
@@ -0,0 +1,213 @@
# Build Caching Strategy

## Overview

Socket CLI uses a **unified caching strategy** across all build workflows to minimize build times and preserve compilation progress between CI runs.

## Strategy Decision Tree

```
Does the build compile C/C++?
├─ YES → Is it native or WASM?
│  ├─ Native Build (smol)
│  │  ├─ Use ccache for per-object-file caching
│  │  └─ Use build directory cache for CMake state
│  └─ WASM Build (Yoga, ONNX)
│     └─ Use build directory cache only
│        (Emscripten doesn't integrate well with ccache)
└─ NO (AI Models, SEA)
   └─ Cache final output only
```
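The decision tree can be read as a small lookup from build kind to cache set. The sketch below is illustrative only: `choose_caches` and the kind names `native`, `wasm`, and `other` are hypothetical and do not appear in the workflows. The output cache is listed for every kind because the commit message lists it for all three patterns.

```bash
#!/usr/bin/env bash
# Hypothetical helper: maps a build kind to the caches the decision tree selects.
# The kind names (native, wasm, other) are illustrative, not taken from the workflows.
choose_caches() {
  case "$1" in
    native) echo "ccache build-dir output" ;;  # smol
    wasm)   echo "build-dir output" ;;         # Yoga, ONNX
    other)  echo "output" ;;                   # AI models, SEA
    *)      echo "unknown build kind: $1" >&2; return 1 ;;
  esac
}

choose_caches native
choose_caches wasm
choose_caches other
```

An unknown kind returns a non-zero status, so a calling script can fail early instead of silently building without any cache.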

## Caching Patterns

### Pattern 1: Native C++ Builds (smol)

**Use case**: Compiling Node.js from source to native binaries

**Strategy**: ccache + build directory cache

```yaml
# platform, arch, and hash are shorthand placeholders, not literal expression names.
- name: Setup ccache (Linux/macOS)
  uses: hendrikmuhs/ccache-action@...
  with:
    key: build-${{ platform }}-${{ arch }}-${{ hash }}
    max-size: 2G

# Layers 2 and 3: cloned source plus CMake state and compiled objects.
- name: Restore build cache
  uses: actions/cache@...
  with:
    path: |
      packages/node-smol-builder/build
      packages/node-smol-builder/.node-source
    key: node-smol-build-${{ platform }}-${{ arch }}-${{ hash }}
    restore-keys: |
      node-smol-build-${{ platform }}-${{ arch }}-

# Layer 5: final binary; no restore-keys, so only an exact hash match is used.
- name: Restore binary cache
  uses: actions/cache@...
  with:
    path: packages/node-smol-builder/dist/socket-smol-*
    key: node-smol-${{ platform }}-${{ arch }}-${{ hash }}
```

**Why both ccache and build directory?**
- **ccache**: Caches individual compiled object files (very granular)
- **build directory**: Caches CMake configuration, dependency tracking, build state
- **Together**: Fast rebuilds when sources change, plus recovery after a failed run

**Benefits:**
- First build (cold cache): ~60-90 minutes
- Cached build: ~5-10 minutes (ccache hits on all objects)
- Partial failure: The next run can resume from the cached state

### Pattern 2: WASM C++ Builds (Yoga, ONNX)

**Use case**: Compiling C++ to WebAssembly with Emscripten

**Strategy**: Build directory cache only (no ccache)

```yaml
- name: Restore output cache
  uses: actions/cache@...
  with:
    path: packages/yoga-layout/build/wasm
    key: yoga-wasm-${{ hash }}

- name: Restore build cache
  uses: actions/cache@...
  with:
    path: |
      packages/yoga-layout/build
      packages/yoga-layout/.yoga-source
    key: yoga-build-${{ hash }}
    restore-keys: |
      yoga-build-
```

**Why no ccache?**
- Emscripten uses its own LLVM-based toolchain
- ccache integration is unreliable with Emscripten
- Build directory caching achieves the same goal more simply

**Benefits:**
- Yoga: ~2-3 minutes → ~1 minute (already fast)
- ONNX: ~30-40 minutes → ~2-3 minutes (on failure recovery)
- Simpler and more reliable than ccache integration

### Pattern 3: Non-C++ Builds (AI Models, SEA)

**Use case**: Python model conversion, JavaScript bundling

**Strategy**: Output cache only

```yaml
- name: Restore output cache
  uses: actions/cache@...
  with:
    path: packages/socketbin-cli-ai/dist
    key: ai-models-${{ hash }}
```

**Why output only?**
- No compilation involved (Python scripts, JS bundling)
- Intermediate state doesn't speed up rebuilds
- Simple caching is sufficient
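
## Cache Key Strategy

All caches use **content-based hashing** for invalidation:

```bash
# <paths> and "pattern" are placeholders for the inputs that affect the build.
# -print0 / sort -z / xargs -0 keep file names containing spaces intact.
HASH=$(find <paths> -type f \( -name "pattern" \) -print0 | sort -z | xargs -0 sha256sum | sha256sum | cut -d' ' -f1)
```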
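As a concrete instance of the template above, the following sketch hashes a throwaway directory and shows the digest changing when a matching file is added. The function name `content_hash`, the temporary directory, and the `*.patch` pattern are assumptions for illustration; the real workflows may hash different inputs. It relies on `sha256sum`, as the template does.

```bash
#!/usr/bin/env bash
# Sketch: fold every file matching a pattern under a directory into one digest.
# content_hash and the "*.patch" pattern are illustrative assumptions.
content_hash() {
  local dir="$1" pattern="$2"
  # cd first so the digest depends on relative paths, not on the checkout location.
  (cd "$dir" && find . -type f -name "$pattern" -print0 \
    | LC_ALL=C sort -z \
    | xargs -0 sha256sum \
    | sha256sum \
    | cut -d' ' -f1)
}

demo=$(mktemp -d)
echo "fix A" > "$demo/a.patch"
before=$(content_hash "$demo" '*.patch')
echo "fix B" > "$demo/b.patch"   # adding a matching file changes the digest
after=$(content_hash "$demo" '*.patch')
echo "before=$before"
echo "after=$after"
rm -rf "$demo"
```

Sorting with a fixed locale keeps the file order, and therefore the digest, stable across runners.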

**Key format:**
```
<workflow>-<type>-<platform>-<arch>-<content-hash>
```

**Examples:**
- `node-smol-build-linux-x64-abc123def456` (smol build cache)
- `yoga-build-abc123def456` (Yoga build cache; the WASM keys omit the platform and arch segments)
- `onnx-runtime-build-abc123def456` (ONNX build cache)
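One way to assemble keys in this format is to join the non-empty segments with `-`, which yields both the full native form and the shorter WASM form. `make_cache_key` is a hypothetical helper written for this illustration, not a function from the repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper: joins non-empty segments with "-" to form a cache key.
# Pass empty strings for platform and arch to get the shorter WASM-style key.
make_cache_key() {
  local key="" part
  for part in "$@"; do
    if [ -n "$part" ]; then
      key="${key:+$key-}$part"
    fi
  done
  echo "$key"
}

make_cache_key node-smol build linux x64 abc123def456   # node-smol-build-linux-x64-abc123def456
make_cache_key yoga build "" "" abc123def456            # yoga-build-abc123def456
```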

**Restore keys** provide prefix matching for partial cache hits. They are tried in order, most specific first, when no cache matches the exact key:
```yaml
restore-keys: |
  node-smol-build-linux-x64-
  node-smol-build-linux-
```
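The fallback order can be modeled locally. This is a simplified sketch, not how the cache service is implemented: `resolve_cache` and the `AVAILABLE` array are hypothetical, and the array is assumed to list stored keys newest first.

```bash
#!/usr/bin/env bash
# Simplified model of cache lookup: exact key first, then each restore prefix in order.
# AVAILABLE is a hypothetical list of stored keys, newest first; real lookups happen
# on the GitHub cache service, not in the shell.
resolve_cache() {
  local key="$1"; shift
  local stored prefix
  for stored in "${AVAILABLE[@]}"; do
    if [ "$stored" = "$key" ]; then
      echo "$stored"; return 0
    fi
  done
  for prefix in "$@"; do
    for stored in "${AVAILABLE[@]}"; do
      case "$stored" in
        "$prefix"*) echo "$stored"; return 0 ;;
      esac
    done
  done
  return 1   # cache miss: full rebuild
}

AVAILABLE=(
  "node-smol-build-linux-arm64-999999"
  "node-smol-build-linux-x64-111111"
)
# No exact match, so the x64 prefix selects node-smol-build-linux-x64-111111.
resolve_cache "node-smol-build-linux-x64-abc123" \
  "node-smol-build-linux-x64-" "node-smol-build-linux-"
```

Note that the broader `node-smol-build-linux-` prefix can match a cache written for a different architecture when no x64 cache exists, which may be worth checking before relying on that fallback for build directories.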

## Cache Layers

### Layer 1: Build Dependencies
- **Cached**: Python, Ninja, Emscripten SDK
- **Purpose**: Avoid re-downloading build tools
- **Duration**: Stable across builds

### Layer 2: Source Code
- **Cached**: Cloned repositories (`.node-source/`, `.yoga-source/`, `.onnx-source/`)
- **Purpose**: Skip git clone operations
- **Duration**: Stable unless version changes

### Layer 3: Intermediate Build
- **Cached**: CMake cache, compiled objects (`build/`)
- **Purpose**: Resume compilation from previous state
- **Duration**: Invalidated on source/patch changes

### Layer 4: Compilation Cache (Native only)
- **Cached**: Per-object-file compilation results (ccache)
- **Purpose**: Instant reuse of unchanged compiled objects
- **Duration**: Survives source changes (object-level granularity)

### Layer 5: Final Output
- **Cached**: Blessed artifacts (`dist/`)
- **Purpose**: Skip entire build if nothing changed
- **Duration**: Exact hash match required

## Build Time Comparison

| Build | First Run | Cached | With Intermediate Cache |
|-------|-----------|--------|-------------------------|
| Smol (native) | 60-90 min | 5-10 min | 10-15 min (partial) |
| ONNX (WASM) | 30-40 min | instant | 2-3 min (on failure) |
| Yoga (WASM) | 2-3 min | instant | 1-2 min (partial) |
| AI Models | 10-15 min | instant | N/A (no compilation) |
| SEA | 5-10 min | instant | N/A (just bundling) |

## Implementation Checklist

When adding a new build workflow:

- [ ] Determine if it compiles C/C++ (Pattern 1 or 2)
- [ ] If native C++: Add ccache setup
- [ ] Add build directory cache for all C++ builds
- [ ] Add output cache for final artifacts
- [ ] Use content-based hash for cache keys
- [ ] Add restore-keys for prefix matching
- [ ] Test cache hit/miss scenarios
- [ ] Document expected build times

## Troubleshooting

### Cache not restoring
- Check that the cache key hash covers all relevant files
- Verify restore-keys provide fallback options
- Check GitHub Actions cache size limits (10 GB per repo)

### Build slower with cache
- Check ccache statistics (`ccache -s`)
- Verify build directory cache includes CMake cache
- Check for cache corruption (force rebuild with `--force`)

### Cache too large
- Adjust ccache max-size (set to 2G in the Pattern 1 workflow)
- Clean build directories of unnecessary artifacts
- Consider excluding large intermediate files

## Related Documentation

- [Build/Dist Structure](build-dist-structure.md) - Archive and promotion workflow
- [Node.js Patches](../../build/patches/socket/README.md) - Patch management
- [GitHub Actions Caching](https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows)
