Commit bb327a4

Initial commit: Cache patterns benchmark example
Add particle simulation demonstrating Array of Structures (AoS) vs Structure of Arrays (SoA) for cache-friendly data layouts.

- Implement AoS and SoA particle systems
- Add benchmarks for position updates, kinetic energy, and gravity
- Include documentation on cache-friendly patterns
- Pin Rust toolchain to 1.83.0
0 parents  commit bb327a4

8 files changed

Lines changed: 455 additions & 0 deletions

.gitignore

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
# Rust build artifacts
/target/
Cargo.lock

# IDE files
.vscode/
.idea/
*.swp
*.swo
*~

# OS files
.DS_Store
Thumbs.db

Cargo.toml

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
[package]
name = "cache-patterns"
version = "0.1.0"
edition = "2021"

[dependencies]

[dev-dependencies]
divan = "0.1"

[[bench]]
name = "particle_simulation"
harness = false

README.md

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
# Cache Patterns Benchmark

This crate demonstrates the performance impact of different data layouts on CPU cache utilization through a particle physics simulation.

## Initial Assumption

**Hypothesis**: Data layout significantly impacts CPU cache behavior. Specifically, organizing data as a Structure of Arrays (SoA) should show measurably better cache performance than Array of Structures (AoS) when operations only access a subset of fields.

This benchmark is designed to validate this hypothesis using CodSpeed's walltime instrument, which provides hardware performance counters including cache hit/miss rates, memory bandwidth, and IPC (instructions per cycle).

## The Problem: Array of Structures (AoS) vs Structure of Arrays (SoA)

### Array of Structures (AoS) - Cache Unfriendly
```rust
struct Particle {
    position: Vec3, // 12 bytes
    velocity: Vec3, // 12 bytes
    mass: f32,      // 4 bytes
} // = 28 bytes per particle (no padding: every field is 4-byte aligned)

struct ParticleSystem {
    particles: Vec<Particle>,
}
```

**Memory layout**: `[pos0, vel0, mass0, pos1, vel1, mass1, pos2, vel2, mass2, ...]`

When an operation needs only some of the fields, every cache line it loads also carries the fields it skips. `apply_gravity`, for example, touches only velocity (12 of every 28 bytes), so more than half of the loaded data is unused, wasting bandwidth and cache space.
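
The per-particle size can be checked with `std::mem::size_of`. The sketch below assumes `Vec3` is three `f32` fields (the 12 bytes noted in the struct listing); the crate's real `Vec3` is defined elsewhere and may differ.

```rust
use std::mem::{align_of, size_of};

// Assumed stand-in for the crate's Vec3: three f32 components (12 bytes).
#[allow(dead_code)]
struct Vec3 {
    x: f32,
    y: f32,
    z: f32,
}

// Same field list as the AoS Particle shown in the README.
#[allow(dead_code)]
struct Particle {
    position: Vec3,
    velocity: Vec3,
    mass: f32,
}

/// Returns (size, alignment) of one AoS particle, in bytes.
fn particle_layout() -> (usize, usize) {
    (size_of::<Particle>(), align_of::<Particle>())
}
```

Under these assumptions `particle_layout()` reports 28 bytes with 4-byte alignment, so consecutive particles sit back to back and a 64-byte cache line spans a little more than two of them.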

### Structure of Arrays (SoA) - Cache Friendly
```rust
struct ParticleSystem {
    positions: Vec<Vec3>,
    velocities: Vec<Vec3>,
    masses: Vec<f32>,
}
```

**Memory layout**:
- `positions: [pos0, pos1, pos2, ...]`
- `velocities: [vel0, vel1, vel2, ...]`
- `masses: [mass0, mass1, mass2, ...]`

When we update positions, every byte in the cache line is useful data, maximizing cache efficiency.

## Expected Performance Characteristics

### AoS (Cache Unfriendly)
- Higher L1/L2/L3 cache miss rates
- Lower effective memory bandwidth (loaded cache lines carry unused fields)
- More stalls waiting for memory

### SoA (Cache Friendly)
- Lower cache miss rates (better spatial locality)
- Higher effective memory bandwidth
- Better prefetcher efficiency

## Running the Benchmarks

```bash
# Run with standard benchmarking
cargo bench

# Run with CodSpeed profiling to see cache counters
# (requires CodSpeed setup with walltime instrument)
codspeed run cargo bench
```

## What to Look For in CodSpeed Profiling

When comparing AoS vs SoA versions with CodSpeed's walltime instrument, you should see:

1. **Cache Misses**: SoA should show significantly fewer L1/L2/L3 cache misses
2. **Memory Operations**: Better cache line utilization in SoA version
3. **Instructions Per Cycle (IPC)**: Higher IPC in SoA due to fewer memory stalls
4. **Wall Time**: SoA should be faster, especially with larger datasets

## Benchmark Operations

Each version implements three operations:

1. **update_positions**: `position = position + velocity * dt`
   - Tests spatial locality when accessing two arrays

2. **compute_kinetic_energy**: `sum(0.5 * mass * velocity²)`
   - Tests cache behavior when skipping position data

3. **apply_gravity**: `velocity = velocity + gravity * dt`
   - Tests cache behavior when accessing only one field
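
A rough way to predict which operation should gain most from SoA is to ask what fraction of each AoS particle it actually uses. The sketch below is a back-of-envelope estimate, not a measurement; the byte sizes come from the struct listing (12 + 12 + 4), and the function names are illustrative only.

```rust
const POSITION_BYTES: usize = 12;
const VELOCITY_BYTES: usize = 12;
const MASS_BYTES: usize = 4;
const PARTICLE_BYTES: usize = POSITION_BYTES + VELOCITY_BYTES + MASS_BYTES; // 28

/// Fraction of each AoS particle that an operation uses.
fn aos_useful_fraction(bytes_used: usize) -> f64 {
    bytes_used as f64 / PARTICLE_BYTES as f64
}

/// Useful fraction for the three benchmark operations, in README order.
fn operation_fractions() -> [(&'static str, f64); 3] {
    [
        // Reads position and velocity, writes position.
        ("update_positions", aos_useful_fraction(POSITION_BYTES + VELOCITY_BYTES)),
        // Reads velocity and mass.
        ("compute_kinetic_energy", aos_useful_fraction(VELOCITY_BYTES + MASS_BYTES)),
        // Reads and writes velocity only.
        ("apply_gravity", aos_useful_fraction(VELOCITY_BYTES)),
    ]
}
```

By this estimate `update_positions` already uses 24 of 28 bytes in AoS, so it should show the smallest gap between layouts, while `apply_gravity` uses only 12 of 28 and should show the largest.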

## Dataset Sizes

- **Small**: 1,000 particles (~28 KB in either layout)
- **Medium**: 10,000 particles (~280 KB in either layout)
- **Large**: 100,000 particles (~2.8 MB in either layout)

Both layouts store the same 28 bytes per particle (12 + 12 + 4); they differ in how those bytes are arranged, not in total footprint.

Different sizes stress different cache levels (L1/L2/L3).
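
The footprint arithmetic can be sketched as follows, assuming the field sizes from the struct listing (12-byte `Vec3`, 4-byte `f32`) and ignoring `Vec` headers and spare capacity. The helper names are hypothetical.

```rust
const VEC3_BYTES: usize = 12;
const MASS_BYTES: usize = 4;
const PARTICLE_BYTES: usize = 2 * VEC3_BYTES + MASS_BYTES; // 28

/// Total payload of `n` particles; identical for AoS and SoA.
fn total_bytes(n: usize) -> usize {
    n * PARTICLE_BYTES
}

/// Bytes an SoA `apply_gravity` pass streams: the velocity array only.
fn soa_gravity_bytes(n: usize) -> usize {
    n * VEC3_BYTES
}
```

For the large dataset this gives 2,800,000 bytes in total, of which an SoA gravity pass touches 1,200,000, whereas an AoS pass has to pull all 2,800,000 through the cache.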

benches/particle_simulation.rs

Lines changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
use cache_patterns::{aos, soa, Vec3};

fn main() {
    divan::main();
}

const SMALL: usize = 1_000;
const MEDIUM: usize = 10_000;
const LARGE: usize = 100_000;

// ============================================================================
// Array of Structures (AoS) - Cache Unfriendly
// ============================================================================

#[divan::bench(name = "aos_update_positions_small")]
fn aos_update_positions_small(bencher: divan::Bencher) {
    let mut system = aos::ParticleSystem::new(SMALL);
    bencher.bench_local(|| {
        system.update_positions(0.016);
    });
}

#[divan::bench(name = "aos_update_positions_medium")]
fn aos_update_positions_medium(bencher: divan::Bencher) {
    let mut system = aos::ParticleSystem::new(MEDIUM);
    bencher.bench_local(|| {
        system.update_positions(0.016);
    });
}

#[divan::bench(name = "aos_update_positions_large")]
fn aos_update_positions_large(bencher: divan::Bencher) {
    let mut system = aos::ParticleSystem::new(LARGE);
    bencher.bench_local(|| {
        system.update_positions(0.016);
    });
}

#[divan::bench(name = "aos_kinetic_energy_small")]
fn aos_kinetic_energy_small(bencher: divan::Bencher) {
    let system = aos::ParticleSystem::new(SMALL);
    bencher.bench_local(|| {
        divan::black_box(system.compute_kinetic_energy());
    });
}

#[divan::bench(name = "aos_kinetic_energy_medium")]
fn aos_kinetic_energy_medium(bencher: divan::Bencher) {
    let system = aos::ParticleSystem::new(MEDIUM);
    bencher.bench_local(|| {
        divan::black_box(system.compute_kinetic_energy());
    });
}

#[divan::bench(name = "aos_kinetic_energy_large")]
fn aos_kinetic_energy_large(bencher: divan::Bencher) {
    let system = aos::ParticleSystem::new(LARGE);
    bencher.bench_local(|| {
        divan::black_box(system.compute_kinetic_energy());
    });
}

#[divan::bench(name = "aos_apply_gravity_small")]
fn aos_apply_gravity_small(bencher: divan::Bencher) {
    let mut system = aos::ParticleSystem::new(SMALL);
    let gravity = Vec3::new(0.0, -9.81, 0.0);
    bencher.bench_local(|| {
        system.apply_gravity(gravity, 0.016);
    });
}

#[divan::bench(name = "aos_apply_gravity_medium")]
fn aos_apply_gravity_medium(bencher: divan::Bencher) {
    let mut system = aos::ParticleSystem::new(MEDIUM);
    let gravity = Vec3::new(0.0, -9.81, 0.0);
    bencher.bench_local(|| {
        system.apply_gravity(gravity, 0.016);
    });
}

#[divan::bench(name = "aos_apply_gravity_large")]
fn aos_apply_gravity_large(bencher: divan::Bencher) {
    let mut system = aos::ParticleSystem::new(LARGE);
    let gravity = Vec3::new(0.0, -9.81, 0.0);
    bencher.bench_local(|| {
        system.apply_gravity(gravity, 0.016);
    });
}

// ============================================================================
// Structure of Arrays (SoA) - Cache Friendly
// ============================================================================

#[divan::bench(name = "soa_update_positions_small")]
fn soa_update_positions_small(bencher: divan::Bencher) {
    let mut system = soa::ParticleSystem::new(SMALL);
    bencher.bench_local(|| {
        system.update_positions(0.016);
    });
}

#[divan::bench(name = "soa_update_positions_medium")]
fn soa_update_positions_medium(bencher: divan::Bencher) {
    let mut system = soa::ParticleSystem::new(MEDIUM);
    bencher.bench_local(|| {
        system.update_positions(0.016);
    });
}

#[divan::bench(name = "soa_update_positions_large")]
fn soa_update_positions_large(bencher: divan::Bencher) {
    let mut system = soa::ParticleSystem::new(LARGE);
    bencher.bench_local(|| {
        system.update_positions(0.016);
    });
}

#[divan::bench(name = "soa_kinetic_energy_small")]
fn soa_kinetic_energy_small(bencher: divan::Bencher) {
    let system = soa::ParticleSystem::new(SMALL);
    bencher.bench_local(|| {
        divan::black_box(system.compute_kinetic_energy());
    });
}

#[divan::bench(name = "soa_kinetic_energy_medium")]
fn soa_kinetic_energy_medium(bencher: divan::Bencher) {
    let system = soa::ParticleSystem::new(MEDIUM);
    bencher.bench_local(|| {
        divan::black_box(system.compute_kinetic_energy());
    });
}

#[divan::bench(name = "soa_kinetic_energy_large")]
fn soa_kinetic_energy_large(bencher: divan::Bencher) {
    let system = soa::ParticleSystem::new(LARGE);
    bencher.bench_local(|| {
        divan::black_box(system.compute_kinetic_energy());
    });
}

#[divan::bench(name = "soa_apply_gravity_small")]
fn soa_apply_gravity_small(bencher: divan::Bencher) {
    let mut system = soa::ParticleSystem::new(SMALL);
    let gravity = Vec3::new(0.0, -9.81, 0.0);
    bencher.bench_local(|| {
        system.apply_gravity(gravity, 0.016);
    });
}

#[divan::bench(name = "soa_apply_gravity_medium")]
fn soa_apply_gravity_medium(bencher: divan::Bencher) {
    let mut system = soa::ParticleSystem::new(MEDIUM);
    let gravity = Vec3::new(0.0, -9.81, 0.0);
    bencher.bench_local(|| {
        system.apply_gravity(gravity, 0.016);
    });
}

#[divan::bench(name = "soa_apply_gravity_large")]
fn soa_apply_gravity_large(bencher: divan::Bencher) {
    let mut system = soa::ParticleSystem::new(LARGE);
    let gravity = Vec3::new(0.0, -9.81, 0.0);
    bencher.bench_local(|| {
        system.apply_gravity(gravity, 0.016);
    });
}

rust-toolchain.toml

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
[toolchain]
channel = "1.92.0"

src/aos.rs

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
//! Array of Structures (AoS) - Cache Unfriendly
//! Each particle stores all of its fields together, so an operation that needs
//! only some fields still pulls the rest through the cache

use crate::Vec3;

#[derive(Clone, Debug)]
pub struct Particle {
    pub position: Vec3,
    pub velocity: Vec3,
    pub mass: f32,
}

impl Particle {
    pub fn new(position: Vec3, velocity: Vec3, mass: f32) -> Self {
        Self {
            position,
            velocity,
            mass,
        }
    }
}

pub struct ParticleSystem {
    pub particles: Vec<Particle>,
}

impl ParticleSystem {
    pub fn new(count: usize) -> Self {
        let mut particles = Vec::with_capacity(count);
        for i in 0..count {
            let fi = i as f32;
            particles.push(Particle::new(
                Vec3::new(fi, fi * 2.0, fi * 3.0),
                Vec3::new(fi * 0.1, fi * 0.2, fi * 0.3),
                1.0 + fi * 0.01,
            ));
        }
        Self { particles }
    }

    /// Update particle positions based on velocity
    /// Cache behavior: we load the entire Particle struct (28 bytes) and use
    /// position (12 bytes) and velocity (12 bytes); only mass (4 bytes) is wasted
    pub fn update_positions(&mut self, dt: f32) {
        for particle in &mut self.particles {
            particle.position = particle.position.add(&particle.velocity.scale(dt));
        }
    }

    /// Compute total kinetic energy
    /// Poor cache behavior: we access velocity and mass, skipping position data
    pub fn compute_kinetic_energy(&self) -> f32 {
        let mut total = 0.0;
        for particle in &self.particles {
            let v2 = particle.velocity.x * particle.velocity.x
                + particle.velocity.y * particle.velocity.y
                + particle.velocity.z * particle.velocity.z;
            total += 0.5 * particle.mass * v2;
        }
        total
    }

    /// Apply gravity to all particles
    /// Poor cache behavior: we only need to modify velocity, but load entire struct
    pub fn apply_gravity(&mut self, gravity: Vec3, dt: f32) {
        for particle in &mut self.particles {
            particle.velocity = particle.velocity.add(&gravity.scale(dt));
        }
    }
}
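
For comparison, here is a minimal sketch of what an SoA counterpart to the three operations above could look like. The crate's own `soa::ParticleSystem` and `Vec3` are not shown in this part of the diff, so the `Vec3` stand-in and the `SoaParticleSystem` name below are assumptions made for illustration; only the method names and the initial values mirror `aos.rs`.

```rust
// Assumed stand-in for the crate's Vec3 (its real definition is not shown here).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vec3 {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

impl Vec3 {
    pub fn new(x: f32, y: f32, z: f32) -> Self {
        Self { x, y, z }
    }
    pub fn add(&self, other: &Vec3) -> Vec3 {
        Vec3::new(self.x + other.x, self.y + other.y, self.z + other.z)
    }
    pub fn scale(&self, s: f32) -> Vec3 {
        Vec3::new(self.x * s, self.y * s, self.z * s)
    }
}

/// Hypothetical SoA layout: one contiguous Vec per field.
pub struct SoaParticleSystem {
    pub positions: Vec<Vec3>,
    pub velocities: Vec<Vec3>,
    pub masses: Vec<f32>,
}

impl SoaParticleSystem {
    /// Same initial values as the AoS constructor, split across three arrays.
    pub fn new(count: usize) -> Self {
        let mut positions = Vec::with_capacity(count);
        let mut velocities = Vec::with_capacity(count);
        let mut masses = Vec::with_capacity(count);
        for i in 0..count {
            let fi = i as f32;
            positions.push(Vec3::new(fi, fi * 2.0, fi * 3.0));
            velocities.push(Vec3::new(fi * 0.1, fi * 0.2, fi * 0.3));
            masses.push(1.0 + fi * 0.01);
        }
        Self { positions, velocities, masses }
    }

    /// Streams the position and velocity arrays; masses are never loaded.
    pub fn update_positions(&mut self, dt: f32) {
        for (p, v) in self.positions.iter_mut().zip(&self.velocities) {
            *p = p.add(&v.scale(dt));
        }
    }

    /// Streams the velocity and mass arrays; positions are never loaded.
    pub fn compute_kinetic_energy(&self) -> f32 {
        self.velocities
            .iter()
            .zip(&self.masses)
            .map(|(v, &m)| 0.5 * m * (v.x * v.x + v.y * v.y + v.z * v.z))
            .sum()
    }

    /// Streams only the velocity array.
    pub fn apply_gravity(&mut self, gravity: Vec3, dt: f32) {
        let dv = gravity.scale(dt);
        for v in &mut self.velocities {
            *v = v.add(&dv);
        }
    }
}
```

Each method walks only the arrays it needs, which is the property the benchmarks are meant to expose. Keeping the method names identical to the AoS version lets the same benchmark body run against either layout.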
