Skip to content

Latest commit

 

History

History
124 lines (96 loc) · 3.4 KB

File metadata and controls

124 lines (96 loc) · 3.4 KB

SOFAI (Slow and Fast AI) Examples

This directory contains examples of the SOFAI sampling strategy - a two-tier approach using fast and slow models.

Files

sofai_graph_coloring.py

Complete example using SOFAI for constraint satisfaction problems (graph coloring).

Key Features:

  • Two-tier model architecture (fast S1, slow S2)
  • Iterative feedback loop with fast model
  • Escalation to slow model when needed
  • Custom validation with detailed feedback
  • Consumer-grade hardware friendly

Concepts Demonstrated

  • Two-Tier Architecture: Fast model for iteration, slow model for hard cases
  • Escalation Strategy: When to switch from fast to slow model
  • Feedback Loops: Providing validation feedback to guide generation
  • Constraint Satisfaction: Solving CSP problems with LLMs
  • Cost Optimization: Using expensive models only when necessary

SOFAI Architecture

User Request
    ↓
S1 (Fast Model) ←─────┐
    ↓                  │
Validation             │
    ↓                  │
Pass? ──No→ Feedback ──┘
    │
   Yes
    ↓
Success? ──No→ Escalate to S2 (Slow Model)
    │              ↓
   Yes         Validation
    ↓              ↓
  Result        Result

Basic Usage

from mellea import start_session
from mellea.backends.ollama import OllamaModelBackend
from mellea.stdlib.sampling import SOFAISamplingStrategy
from mellea.stdlib.requirements import req

# Create fast and slow backends
s1_backend = OllamaModelBackend(model_id="granite4.1:3b")
s2_backend = OllamaModelBackend(model_id="granite4:latest")

# Create SOFAI strategy
strategy = SOFAISamplingStrategy(
    s1_backend=s1_backend,
    s2_backend=s2_backend,
    s1_loop_budget=3,  # Try fast model 3 times
    escalate_on_failure=True
)

# Use with requirements
m = start_session()
result = m.instruct(
    "Solve the problem...",
    requirements=[req("Must satisfy constraint X")],
    strategy=strategy
)

Configuration Options

SOFAISamplingStrategy(
    s1_backend=fast_backend,      # Fast model for iteration
    s2_backend=slow_backend,       # Slow model for escalation
    s1_loop_budget=3,              # Max attempts with fast model
    escalate_on_failure=True,      # Use slow model if fast fails
    feedback_strategy="all_errors" # How to provide feedback
)

Feedback Strategies

  • "simple": Basic pass/fail feedback
  • "first_error": Feedback on first failing requirement
  • "all_errors": Detailed feedback on all failures

Use Cases

  • Constraint Satisfaction: Graph coloring, scheduling, planning
  • Complex Reasoning: Multi-step problems requiring iteration
  • Cost Optimization: Use expensive models only when needed
  • Quality Assurance: Fast iteration with slow model fallback

Model Selection

Fast Models (S1)

  • granite4.1:3b
  • llama3.2:3b
  • mistral:7b

Slow Models (S2)

  • granite4:latest
  • llama3:70b
  • mixtral:8x7b

Performance Tips

  1. Choose appropriate S1 budget: Balance speed vs. success rate
  2. Use specific requirements: Clear constraints help both models
  3. Provide good feedback: Detailed validation helps iteration
  4. Profile your use case: Measure when escalation is needed

Related Documentation

  • See mellea/stdlib/sampling/sofai.py for implementation
  • See test/stdlib/sampling/test_sofai_*.py for more examples
  • See mellea/stdlib/sampling/ for other sampling strategies