Commit 0b1e4a5: git sync merging mem with search
1 parent 8fbe582

370 files changed: 43034 additions & 328 deletions


.gitignore

Lines changed: 3 additions & 1 deletion

@@ -1,7 +1,9 @@
 results/
 examples/lm_eval/prompts/system_message.txt
 examples/lm_eval/prompts/evaluator_system_message.txt
-
+examples/math_mas/USAGE.txt
+examples/math_mas/CONFIGURATION_GUIDE.md
+examples/circle_packing/EXPERIMENTS.md
 # Python
 __pycache__/
 *.py[cod]
examples/circle_packing/EXPERIMENTS.md

Lines changed: 335 additions & 0 deletions
# Circle Packing Experiments

This directory contains experiment scripts for comparing different search strategies on the circle packing problem (n=26 circles in a unit square).
## Problem Overview

**Objective**: Pack 26 non-overlapping circles into a unit square to maximize the sum of their radii.

**Target**: 2.635 (from the AlphaEvolve paper)

**Constraints**:

- All circles must fit entirely within the unit square (0 ≤ x, y ≤ 1)
- No circles may overlap
- Exactly 26 circles must be placed
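The three constraints above can be checked mechanically. Below is a minimal validity-check sketch; the function name `is_valid_packing` and the tolerance are illustrative, not taken from the repository's evaluator:

```python
import math

def is_valid_packing(centers, radii, n=26, tol=1e-9):
    """Check the packing constraints: circle count, containment, no overlap."""
    if len(centers) != n or len(radii) != n:
        return False  # exactly n circles must be placed
    for (x, y), r in zip(centers, radii):
        # circle must lie entirely within the unit square
        if x - r < -tol or y - r < -tol or x + r > 1 + tol or y + r > 1 + tol:
            return False
    for i in range(n):
        for j in range(i + 1, n):
            (xi, yi), (xj, yj) = centers[i], centers[j]
            if math.hypot(xi - xj, yi - yj) < radii[i] + radii[j] - tol:
                return False  # circles i and j overlap
    return True
```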
## Search Strategies

We compare four search strategies:

1. **MAP-Elites**: Quality-diversity algorithm with island-based evolution
2. **Best-of-N**: Maintains N independent lineages, keeps the best
3. **Beam Search**: Keeps top M programs, generates N candidates per iteration
4. **MCTS**: Monte Carlo Tree Search with UCT exploration
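The UCT rule used by the MCTS strategy scores each child by its mean value plus an exploration bonus. A minimal sketch of the selection rule, with c = √2 as in `config_mcts.yaml` (the bookkeeping is illustrative, not OpenEvolve's actual node classes):

```python
import math

def uct_score(value_sum, visits, parent_visits, c=math.sqrt(2)):
    """UCT: exploitation term plus exploration bonus."""
    if visits == 0:
        return float("inf")  # always try unvisited children first
    exploitation = value_sum / visits
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return exploitation + exploration

def select_child(children_stats, parent_visits):
    """Pick the index of the child with the highest UCT score.

    children_stats is a list of (value_sum, visits) pairs.
    """
    scores = [uct_score(v, n, parent_visits) for v, n in children_stats]
    return max(range(len(scores)), key=scores.__getitem__)
```

Note how a barely visited child can outrank a well-explored, higher-value one: the exploration bonus shrinks as a child accumulates visits.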
## Quick Start

### Run Individual Strategies

```bash
# MAP-Elites (100 iterations)
./run_map_elites.sh 100

# Best-of-N (4 lineages)
./run_best_of_n.sh 100

# Beam Search (beam_width=4, branch_factor=8)
./run_beam_search.sh 100

# MCTS (expansion_width=3)
./run_mcts.sh 100
```
### Test a Program

```bash
# Test the initial program
python test_program.py initial_program.py

# Test an evolved program
python test_program.py openevolve_output/best/best_program.py

# Test without visualization
python test_program.py initial_program.py --no-visualize

# Compare multiple programs
python test_program.py initial_program.py best_program.py
```
## Configuration Files

Each search strategy has a dedicated config file:

- `config_map_elites.yaml` - MAP-Elites with 2D feature grid (sum_radii × eval_time)
- `config_best_of_n.yaml` - Best-of-N with 4 lineages
- `config_beam_search.yaml` - Beam Search with beam_width=4, branch_factor=8
- `config_mcts.yaml` - MCTS with expansion_width=3, exploration_constant=√2
### Key Configuration Parameters

All configs share:

```yaml
max_iterations: 100
checkpoint_interval: 10
random_seed: 42

llm:
  primary_model: "gpt-5"
  temperature: 0.7
  max_tokens: 8192

evaluator:
  timeout: 600  # 10 minutes per evaluation
  cascade_evaluation: true
  cascade_thresholds: [0.5, 0.75]
```
### MAP-Elites Specific

```yaml
database:
  population_size: 60
  num_islands: 4
  feature_dimensions:
    - "sum_radii"  # Maximize this (0.0 to 2.7)
    - "eval_time"  # Minimize this (faster is better)
  feature_bins:
    sum_radii: 20
    eval_time: 10
```
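With these settings, each evaluated program lands in a cell of a 20×10 grid by binning its two feature values. A sketch of the binning logic; the value ranges (especially for `eval_time`) are assumptions for illustration, and OpenEvolve's actual bin boundaries may differ:

```python
def feature_cell(sum_radii, eval_time, radii_bins=20, time_bins=10,
                 radii_range=(0.0, 2.7), time_range=(0.0, 60.0)):
    """Map a program's (sum_radii, eval_time) onto a MAP-Elites grid cell."""
    def bin_index(value, lo, hi, n_bins):
        # clamp into [lo, hi], then scale to an integer bin in [0, n_bins - 1]
        frac = min(max((value - lo) / (hi - lo), 0.0), 1.0)
        return min(int(frac * n_bins), n_bins - 1)
    return (bin_index(sum_radii, *radii_range, radii_bins),
            bin_index(eval_time, *time_range, time_bins))
```

Only the best program in each cell is kept, so the archive preserves fast-but-weak and slow-but-strong solutions simultaneously.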
## Expected Evolution Patterns

### Early Iterations (0-20)

- Simple geometric patterns (concentric rings, grids)
- Sum of radii: 0.5 - 1.5
- Quick evaluations (<1s)

### Mid Iterations (20-60)

- Hexagonal arrangements emerge
- Variable-sized circles
- Sum of radii: 1.5 - 2.2
- Evaluation times increase (1-5s)

### Late Iterations (60-100)

- Mathematical optimization (scipy.optimize)
- Hybrid approaches
- Sum of radii: 2.2 - 2.6+
- Longer evaluations (5-60s)
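The late-stage `scipy.optimize` pattern often reduces to: fix the circle centers, then solve for the largest radii that still satisfy the constraints. Both the pairwise and border constraints are linear in the radii, so this subproblem is a linear program. A hedged sketch (the helper name is illustrative, not a function from this repository):

```python
import numpy as np
from scipy.optimize import linprog

def max_radii_for_centers(centers):
    """Given fixed centers, find radii maximizing the sum, via an LP.

    Constraints are linear in r: r_i + r_j <= dist(i, j) for every pair,
    and each r_i is bounded by its center's distance to the square's border.
    """
    centers = np.asarray(centers, dtype=float)
    n = len(centers)
    c = -np.ones(n)  # linprog minimizes, so negate to maximize sum of radii
    A, b = [], []
    for i in range(n):
        for j in range(i + 1, n):
            row = np.zeros(n)
            row[i] = row[j] = 1.0
            A.append(row)
            b.append(np.linalg.norm(centers[i] - centers[j]))
    bounds = [(0.0, min(x, y, 1 - x, 1 - y)) for x, y in centers]
    res = linprog(c, A_ub=np.array(A), b_ub=np.array(b), bounds=bounds)
    return res.x
```

Alternating this LP step with a nonlinear adjustment of the centers is one common hybrid geometric + numerical approach.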
## Metrics Tracked

All evaluations return:

```python
{
    "sum_radii": 2.634,       # Primary objective
    "target_ratio": 0.9996,   # sum_radii / 2.635
    "validity": 1.0,          # 1.0 if valid, 0.0 if invalid
    "eval_time": 12.5,        # Seconds to evaluate
    "combined_score": 0.9996  # target_ratio * validity
}
```
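A sketch of how these fields relate; the helper name is illustrative, and only the formulas `target_ratio = sum_radii / 2.635` and `combined_score = target_ratio * validity` come from the dict above:

```python
TARGET = 2.635  # AlphaEvolve reference value

def compute_metrics(sum_radii, valid, eval_time):
    """Build the metrics dict returned by an evaluation."""
    target_ratio = sum_radii / TARGET
    validity = 1.0 if valid else 0.0
    return {
        "sum_radii": sum_radii,
        "target_ratio": target_ratio,
        "validity": validity,
        "eval_time": eval_time,
        "combined_score": target_ratio * validity,  # invalid packings score 0
    }
```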
## Output Structure

Each strategy creates its own output directory:

```
openevolve_output/
├── best/               # MAP-Elites results
│   ├── best_program.py
│   └── best_program_info.json
├── checkpoints/
│   ├── checkpoint_10/
│   ├── checkpoint_20/
│   └── ...
├── best_of_n/          # Best-of-N results
│   ├── best/
│   └── checkpoints/
├── beam_search/        # Beam Search results
│   ├── best/
│   └── checkpoints/
└── mcts/               # MCTS results
    ├── best/
    └── checkpoints/
```
## Analyzing Results

### View Best Program

```bash
# Display best program code
cat openevolve_output/best/best_program.py

# View metrics
cat openevolve_output/best/best_program_info.json
```
### Visualize Best Solution

```python
from openevolve_output.best.best_program import run_packing, visualize

centers, radii, sum_radii = run_packing()
print(f"Sum of radii: {sum_radii}")
print(f"Target ratio: {sum_radii / 2.635:.2%}")
visualize(centers, radii)
```
### Compare Strategies

```bash
# Test all best programs
python test_program.py \
  openevolve_output/best/best_program.py \
  openevolve_output/best_of_n/best/best_program.py \
  openevolve_output/beam_search/best/best_program.py \
  openevolve_output/mcts/best/best_program.py
```
## Resuming from Checkpoint

You can resume evolution from any checkpoint:

```bash
# Resume MAP-Elites from iteration 50
python ../../openevolve-run.py \
  initial_program.py \
  evaluator.py \
  --config config_map_elites.yaml \
  --checkpoint openevolve_output/checkpoints/checkpoint_50 \
  --iterations 100  # Run 100 MORE iterations
```
## Two-Phase Evolution (Advanced)

To break through plateaus, run a two-phase evolution:

### Phase 1: Exploration (100 iterations)

```bash
./run_map_elites.sh 100
```

### Phase 2: Exploitation (100 more iterations)

Modify the config to encourage more aggressive optimization:

```yaml
# Phase 2 adjustments
database:
  population_size: 70      # More diversity
  num_islands: 5           # More parallel exploration
  exploitation_ratio: 0.6  # Lower exploitation, more exploration

prompt:
  system_message: |
    Focus on breaking through the plateau by trying fundamentally
    different approaches. Consider:
    - scipy.optimize for mathematical optimization
    - Hybrid geometric + numerical approaches
    - Variable circle sizes with strategic placement
```

Then resume:

```bash
python ../../openevolve-run.py \
  openevolve_output/checkpoints/checkpoint_100/best_program.py \
  evaluator.py \
  --config config_phase_2.yaml \
  --iterations 100
```
## Troubleshooting

### Programs time out during evaluation

Increase the timeout:

```yaml
evaluator:
  timeout: 1200  # 20 minutes instead of 10
```

### Evolution converges too quickly

Increase diversity:

```yaml
database:
  population_size: 80
  num_islands: 6
  exploration_ratio: 0.4  # More exploration
```

### Low-quality solutions

The LLM might need better guidance:

```yaml
prompt:
  system_message: |
    # Add specific strategies and examples
    # Emphasize use of scipy.optimize
    # Provide geometric insights
```
## Expected Runtime

**Per Strategy** (100 iterations):

- Total time: ~12-24 hours
- Per iteration: ~5-15 minutes
- Parallel evaluations: 4 concurrent
- Checkpoint every 10 iterations

**All 4 Strategies** (run in parallel):

- Total time: ~12-24 hours
- Resource usage: 16 concurrent evaluations
## Success Criteria

A successful evolution should achieve:

- ✅ **Valid packing**: No overlaps, all circles inside the square
- ✅ **Sum of radii ≥ 2.5**: Getting close to the target
- ✅ **Target ratio ≥ 95%**: Within 5% of the AlphaEvolve result
- 🎯 **Sum of radii ≥ 2.63**: Matching or beating AlphaEvolve
## Research Questions

These experiments are designed to answer:

1. **Which search strategy performs best?**
   - Compare final sum_radii across strategies
   - Analyze convergence speed

2. **What algorithmic discoveries emerge?**
   - Geometric constructions vs numerical optimization
   - Hybrid approaches

3. **How does diversity help?**
   - MAP-Elites with a feature grid vs single-objective search
   - Island-based evolution benefits

4. **What is the role of LLM temperature?**
   - Higher temperature = more exploration
   - Lower temperature = more exploitation
## Next Steps

After running experiments:

1. **Analyze logs**: Check `openevolve_output/*/logs/`
2. **Compare strategies**: Use `test_program.py` for side-by-side comparison
3. **Visualize evolution**: Plot sum_radii over iterations
4. **Extract insights**: What patterns led to breakthroughs?
5. **Iterate**: Run phase 2 with the best strategy
## Citation

If you use these experiments, please cite the OpenEvolve paper and the original AlphaEvolve work:

```
@article{alphaevolve2024,
  title={AlphaEvolve: Evolutionary Code Generation},
  author={DeepMind Team},
  journal={Nature},
  year={2024}
}
```
