This guide helps you understand GigaEvo's architecture from a bird's-eye view before diving into implementation details.
┌─────────────────────────────────────────────────────────────────┐
│ Your Problem │
│ (validate.py + metrics.yaml + initial_programs/) │
└────────────────────────┬────────────────────────────────────────┘
│
↓
┌─────────────────────────────────────────────────────────────────┐
│ Main Evolution Loop │
│ (run.py) │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Evolution │───→│ Strategy │───→│ Redis │ │
│ │ Engine │ │ (Islands) │ │ Storage │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ ↓ ↓ ↓ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ DAG Runner │ │ LLM Mutation │ │ Stages │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Understanding this is essential. Every program goes through these states:
┌─────────────────────────────────────────────────────────────────┐
│ PROGRAM STATE MACHINE │
└─────────────────────────────────────────────────────────────────┘
┌──────────┐
│ FRESH │ ← Program created or refreshed
└────┬─────┘
│ DagRunner picks it up
↓
┌──────────────────────┐
│ DAG_PROCESSING_ │ ← DAG executing stages
│ STARTED │
└────┬─────────────────┘
│ DAG completes successfully
↓
┌──────────────────────┐
│ DAG_PROCESSING_ │ ← Ready for ingestion
│ COMPLETED │
└────┬─────────────────┘
│ EvolutionEngine processes it
├─→ (New program + accepted) ──→ EVOLVING
├─→ (Existing program) ────────→ EVOLVING (restored)
└─→ (Rejected) ────────────────→ DISCARDED
┌──────────┐
│ EVOLVING │ ← Active in island archive
└────┬─────┘
│ Refresh phase
└─→ Back to FRESH (to update lineage-aware stages)
- FRESH programs are picked up by
DagRunner - DAG_PROCESSING_STARTED programs are being evaluated
- DAG_PROCESSING_COMPLETED programs await ingestion by
EvolutionEngine - EVOLVING programs are in the archive and can be selected as parents
- Programs cycle:
EVOLVING → FRESH → ... → EVOLVINGto update lineage info
EvolutionEngine waits for "idle" (no FRESH or DAG_PROCESSING_STARTED programs) before:
- Selecting elites and creating mutants
- Ingesting completed programs
- Refreshing evolving programs
Debugging tip: If evolution is stuck, check Redis for programs in FRESH/DAG_PROCESSING_STARTED:
redis-cli KEYS "state:FRESH:*"
redis-cli KEYS "state:DAG_PROCESSING_STARTED:*"One complete generation consists of 6 phases:
┌─────────────────────────────────────────────────────────────────┐
│ GENERATION PHASES │
└─────────────────────────────────────────────────────────────────┘
Phase 1: Wait for Idle
└─→ Block until no FRESH or DAG_PROCESSING_STARTED programs
Phase 2: Select & Mutate
├─→ Strategy.select_elites(N)
├─→ MutationOperator.mutate() → Create N mutants
└─→ Store mutants in Redis (state: FRESH)
Phase 3: Wait for Mutant DAGs
└─→ Block until mutants complete evaluation
Phase 4: Ingest Completed Programs
├─→ For each DAG_PROCESSING_COMPLETED program:
│ ├─→ Already in archive? → Restore to EVOLVING
│ ├─→ New + accepted by strategy? → Add, set EVOLVING
│ └─→ Rejected? → Set DISCARDED
Phase 5: Refresh Evolving Programs
├─→ Get all programs in EVOLVING state
├─→ Flip them to FRESH
└─→ Lineage-aware stages will re-run with updated family tree
Phase 6: Wait for Refresh DAGs
└─→ Block until refresh completes
[Increment generation counter and repeat]
Programs need to be refreshed because:
- New children are added to their lineage
- Descendant statistics change
- LLM-based stages (insights, lineage analysis) should be recomputed with updated context
Performance note: This means stages run multiple times per program. Use cacheable=True for deterministic stages to avoid redundant computation.
┌─────────────────────────────────────────────────────────────────┐
│ DAG EXECUTION │
└─────────────────────────────────────────────────────────────────┘
Program (state: FRESH)
↓
DagRunner picks it up
↓
DAG built from blueprint
↓
Stages execute in parallel (respecting dependencies)
│
├─→ ValidateCode (cacheable)
│ ├─→ SUCCESS → Continue
│ └─→ FAILED → Skip dependent stages
│
├─→ ExecuteProgram (depends on ValidateCode)
│ ├─→ Runs user's entrypoint() function
│ └─→ Captures output
│
├─→ ValidateOutput (depends on ExecuteProgram)
│ ├─→ Runs validate() from problem
│ ├─→ Returns metrics (fitness, etc.)
│ └─→ May return an optional artifact (e.g. bottleneck data, arrays) for mutation context
│
├─→ ComputeComplexity (independent, cacheable)
│ └─→ Analyzes code structure
│
├─→ MergeMetrics
│ └─→ Combines all metrics
│
├─→ InsightsStage (depends on metrics, non-cacheable)
│ └─→ LLM generates insights
│
└─→ MutationContextStage (non-cacheable)
└─→ Formats context for future mutation
↓
All stages complete
↓
Program state: DAG_PROCESSING_COMPLETED
ExecuteProgram.OutputModel = Box[np.ndarray]
↓ DataFlowEdge(source="ExecuteProgram", dest="ValidateOutput", input_name="payload")
ValidateOutput.InputsModel.payload: Box[np.ndarray]
How to find input_name: Look at the destination stage's InputsModel class.
| Stage Type | Cacheable? | Why |
|---|---|---|
| ValidateCode | ✅ Yes | Code syntax doesn't change |
| ExecuteProgram | ✅ Yes | Deterministic execution |
| ComputeComplexity | ✅ Yes | Static code analysis |
| InsightsStage | ✅ Yes | Fixed LLM based analysis |
| LineageStage | ❌ No | Depends on evolving family tree |
| MutationContextStage | ❌ No | Aggregates non-cacheable data |
┌─────────────────────────────────────────────────────────────────┐
│ MULTI-ISLAND SYSTEM │
└─────────────────────────────────────────────────────────────────┘
Island 1: "fitness_island" Island 2: "simplicity_island"
┌──────────────────────────┐ ┌──────────────────────────┐
│ Behavior Space: │ │ Behavior Space: │
│ - fitness (0-100) │ │ - fitness (0-100) │
│ - validity (0-1) │ │ - complexity (0-1000) │
│ │ │ │
│ Archive: 20×5 = 100 cells│ │ Archive: 20×10 = 200 cells│
│ │ │ │
│ Selector: Maximize │ │ Selector: Maximize │
│ fitness │ │ fitness / complexity │
└────────────┬─────────────┘ └─────────────┬────────────┘
│ │
└────────────→ Migration ←───────────────┘
(every 50 gens)
Programs track their island membership:
program.metadata = {
"home_island": "fitness_island", # Where created
"current_island": "simplicity_island", # Where currently lives
"iteration": 42,
"mutation_context": "...",
}Generation 50, 100, 150, ... (every migration_interval):
1. Select Migrants
├─→ Island 1: Select top 5 by fitness
└─→ Island 2: Select top 5 by fitness
2. Route Migrants
├─→ Island 1 migrants → Route to Island 2
└─→ Island 2 migrants → Route to Island 1
3. Add to Destination
├─→ Try to add to destination archive
└─→ Must improve a cell to be accepted
4. Remove from Source
├─→ If successfully added, remove from source
└─→ If removal fails, rollback (remove from destination)
Why rollback? To maintain invariant: "No program exists in multiple islands simultaneously."
Redis is the single source of truth. Understanding the key schema is essential for debugging.
┌─────────────────────────────────────────────────────────────────┐
│ REDIS KEY SCHEMA │
└─────────────────────────────────────────────────────────────────┘
program:{program_id} → Program object (JSON)
state:{state_value}:{program_id} → State index
counter → Atomic counter for updates
# Island-specific
island_{island_id}:archive → Hash: cell → program_id
# Example keys:
program:a1b2c3d4-... → Program data
state:FRESH:a1b2c3d4-... → Index entry
state:EVOLVING:a1b2c3d4-... → Index entry
island_fitness_island:archive → Archive hash
# Show all states
redis-cli --scan --pattern "state:*" | cut -d: -f2 | sort | uniq -c
# Show programs in FRESH state
redis-cli KEYS "state:FRESH:*"
# Show island archive size
redis-cli HLEN "island_fitness_island:archive"
# Get program details
redis-cli GET "program:a1b2c3d4-..." | jq .
# Show all island archives
redis-cli KEYS "island_*:archive"The mutation process involves multiple stages:
┌─────────────────────────────────────────────────────────────────┐
│ MUTATION PIPELINE │
└─────────────────────────────────────────────────────────────────┘
1. MutationContextStage (runs on parents)
├─→ Formats metrics for LLM
├─→ Adds insights
├─→ Adds lineage info
└─→ Stores in program.metadata[MUTATION_CONTEXT_METADATA_KEY]
2. Parent Selection
├─→ EvolutionEngine: Strategy.select_elites(N)
├─→ ParentSelector: Group elites into parent tuples
└─→ Usually 1-2 parents per mutation
3. MutationAgent
├─→ Reads pre-formatted mutation context from parent metadata
├─→ Builds prompt:
│ ├─→ System: task_description + metrics_description
│ └─→ User: parent code + context
├─→ Calls LLM
├─→ Parses response (extracts code block or applies diff)
└─→ Returns MutationSpec
4. Create Child Program
├─→ Program.from_mutation_spec()
├─→ Set lineage (parents, generation, mutation name)
├─→ Store in Redis (state: FRESH)
└─→ Update parent.lineage.children
5. Child Evaluation
└─→ DAG pipeline runs (same as any program)
System Prompt (from prompts/mutation/system.txt):
Task: {task_description}
Metrics: {metrics_description}
Instructions: ...
User Prompt (from prompts/mutation/user.txt):
Mutate {count} parent programs:
=== Parent 1 ===
```python
{parent.code}
```
{parent.metadata[MUTATION_CONTEXT_METADATA_KEY]}
← This contains formatted metrics, insights, lineage
Critical dependency: If MutationContextStage is missing from your pipeline, mutation prompts will lack context and produce poor results.
The config system uses Hydra with custom resolvers:
# config/experiment/base.yaml
defaults:
- /constants: base # Load constants/base.yaml
- /redis: default # Load redis/default.yaml
- /llm: single # Load llm/single.yaml
- /algorithm: single_island
- /pipeline: auto
# Hydra instantiation
dag_blueprint:
_target_: gigaevo.runner.dag_blueprint.DAGBlueprint
nodes:
ValidateCode:
_target_: gigaevo.programs.stages.validation.ValidateCodeStage
_partial_: true # Create factory, not instance
timeout: 30.0
# Custom resolvers
${problem.dir} # Resolves to problem directory path
${ref:redis_storage} # References another instantiated object
${metrics_context} # Resolves to metrics context# _partial_: true
# Creates: lambda: ValidateCodeStage(timeout=30.0)
# Used when DAGBlueprint needs to create multiple instances
# _partial_: false (or omitted)
# Creates: ValidateCodeStage(timeout=30.0)
# Used for singletonsCheck:
-
Are there programs in FRESH state waiting for DAG?
redis-cli KEYS "state:FRESH:*" | wc -l
-
Are there programs in DAG_PROCESSING_STARTED?
redis-cli KEYS "state:DAG_PROCESSING_STARTED:*" | wc -l
-
Check DagRunner metrics in logs:
[DagRunner] active_count: 8, completed: 142 -
Look for stage timeouts or failures in logs
Check:
-
Do programs have required behavior metrics?
missing = set(island.behavior_space.behavior_keys) - program.metrics.keys()
-
Are bounds reasonable for metric values?
# All programs mapping to same cell? island.behavior_space.feature_bounds
-
Is archive selector too strict?
Check:
-
Is
ValidateCodestage in your pipeline? -
Are error messages being passed to LLM in subsequent mutations?
-
Check prompt construction in logs:
[MutationAgent] Built prompt with 2 parents (system: 1200 chars, user: 3400 chars) -
Is
MutationContextStagepresent and running?
Check:
- Archive size:
await strategy.get_metrics() - Elite selection: Are any elites being selected?
- Parent selector: Is it producing valid parent tuples?
- Generation limit: Has
max_generationsbeen reached?
| File | Purpose |
|---|---|
run.py |
Main entry point |
gigaevo/evolution/engine/core.py |
Evolution generation loop |
gigaevo/runner/dag_runner.py |
Picks up FRESH programs, runs DAGs |
gigaevo/evolution/strategies/multi_island.py |
Multi-island strategy |
gigaevo/evolution/strategies/island.py |
Single island (archive) |
gigaevo/programs/dag/dag.py |
DAG execution engine |
gigaevo/programs/dag/automata.py |
Stage scheduling logic |
gigaevo/database/redis_program_storage.py |
Redis interface |
gigaevo/database/state_manager.py |
Program state transitions |
gigaevo/llm/agents/mutation.py |
LLM mutation agent |
- Quick Start: Follow README.md to run your first evolution
- Create a Problem: See
problems/heilbron/as template - Customize Evolution: Modify
config/experiment/base.yaml - Add Custom Stages: Read
DAG_SYSTEM.md - Debug Issues: Use Redis commands and logs
- DAG System: See
DAG_SYSTEM.md - Evolution Strategies: See
EVOLUTION_STRATEGIES.md - Configuration: See
config/directory structure - Tools: See
../tools/README.mdfor analysis utilities