| 1 | +--- |
| 2 | +name: aletheia-math-research |
| 3 | +description: > |
| 4 | + Autonomous mathematical research agent based on Aletheia (Feng et al., 2026). |
| 5 | + Implements a Generator-Verifier-Reviser loop with Cora-Debate V1-V7 verification |
| 6 | + and thinking/output decoupling. Reaches level L2 (Publishable Research) in |
| 7 | + number theory, combinatorics, algebra, and geometry. |
| 8 | + Use it when you need to solve research-level mathematical problems, generate |
| 9 | + autonomous proofs, or conduct scientific investigation with iterative verification. |
| 10 | +spec: "SPEC-012" |
| 11 | +version: "1.0" |
| 12 | +category: research |
| 13 | +tags: [aletheia, mathematics, research, generator-verifier, cora-debate, autonomous-science] |
| 14 | +dependencies: [SPEC-001, CORA-Eval, cora-debate, reasoning-orchestrator-v11] |
| 15 | +tdd_suite: "scripts/aletheia_engine.py" |
| 16 | +ct_count: 10 |
| 17 | +reference: "Feng, T. et al. (2026). Towards Autonomous Mathematics Research. arXiv:2602.10177v3." |
| 18 | +status: active |
| 19 | +--- |
| 20 | + |
| 21 | +# Aletheia Math Research Engine |
| 22 | + |
| 23 | +## Architecture |
| 24 | + |
| 25 | +``` |
| 26 | +PROBLEM ──▶ [GENERATOR] ──▶ [VERIFIER (Cora V1-V7)] ──▶ pass? ──▶ SOLUTION |
| 27 | + ▲ │ │ |
| 28 | + │ ▼ fail │ |
| 29 | + └────── [REVISER] ◀────┘ │ |
| 30 | + (feedback loop) │ |
| 31 | + max_attempts = 10 (hyperparameter) │ |
| 32 | +``` |
| 33 | + |
| 34 | +## 3 Subagents |
| 35 | + |
| 36 | +| Subagent | Role | Inspiration | |
| 37 | +|-----------|--------|------------| |
| 38 | +| **Generator** | Produces a natural-language solution using 16 reasoning types | Feng et al. §2.2 | |
| 39 | +| **Verifier** | Verifies via Cora-Debate V1-V7 plus hallucination detection | Feng et al. §2.2 | |
| 40 | +| **Reviser** | Fixes flaws identified by the Verifier | Feng et al. §2.1 | |
| 41 | + |
| 42 | +## Benchmark (5 problems) |
| 43 | + |
| 44 | +| ID | Domain | Difficulty | |
| 45 | +|:---|:---|:---| |
| 46 | +| IMO-2024-P1 | Number Theory | Olympiad | |
| 47 | +| Erdos-1051 | Combinatorics | Research Open | |
| 48 | +| FutureMath-Basic-1 | Algebra | PhD Exercise | |
| 49 | +| Thue-Morse | Combinatorics | Olympiad | |
| 50 | +| Goldbach-Variant | Number Theory | Research Open | |
| 51 | + |
| 52 | +## Usage |
| 53 | + |
| 54 | +```python |
| 55 | +from aletheia_engine import AletheiaEngine, MathProblem |
| 56 | + |
| 57 | +engine = AletheiaEngine(max_attempts=10, strictness=0.7) |
| 58 | +problem = MathProblem( |
| 59 | + id="My-Problem", |
| 60 | + statement="Prove that...", |
| 61 | + domain="number_theory", |
| 62 | + difficulty="research_open" |
| 63 | +) |
| 64 | +session = engine.solve(problem) |
| 65 | +print(session.status, session.final_solution) |
| 66 | +``` |
| 67 | + |
| 68 | +## Main Reference |
| 69 | + |
| 70 | +Feng, T., Trinh, T.H., Bingham, G. et al. (2026). |
| 71 | +**Towards Autonomous Mathematics Research.** |
| 72 | +arXiv:2602.10177v3 [cs.LG]. |
| 73 | +Google DeepMind Superhuman Reasoning Team. |
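
The Generator-Verifier-Reviser loop drawn in the architecture diagram of this diff can be sketched in a few lines. This is a minimal illustration, not the code in `scripts/aletheia_engine.py`: the `Verdict` type, the `solve_loop` function, and the three callables are hypothetical names, and the toy stand-ins replace the real subagents. It also assumes that `max_attempts` bounds the number of verification passes, which the diagram does not state explicitly.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Verdict:
    """Hypothetical verifier result: pass/fail plus the flaws found."""
    passed: bool
    flaws: List[str]


def solve_loop(
    problem: str,
    generate: Callable[[str], str],
    verify: Callable[[str, str], Verdict],
    revise: Callable[[str, str, List[str]], str],
    max_attempts: int = 10,
) -> Optional[str]:
    """Return the first candidate that passes verification, or None."""
    candidate = generate(problem)
    for _ in range(max_attempts):
        verdict = verify(problem, candidate)
        if verdict.passed:
            return candidate
        # Feed the verifier's flaws back to the reviser (the feedback loop).
        candidate = revise(problem, candidate, verdict.flaws)
    return None


# Toy stand-ins for the three subagents (assumptions for illustration only).
def toy_generate(problem: str) -> str:
    return "draft"


def toy_verify(problem: str, candidate: str) -> Verdict:
    if candidate.endswith("!!"):
        return Verdict(True, [])
    return Verdict(False, ["needs more rigor"])


def toy_revise(problem: str, candidate: str, flaws: List[str]) -> str:
    return candidate + "!"


result = solve_loop("Prove that...", toy_generate, toy_verify, toy_revise)
print(result)
```

Passing the subagents as plain callables keeps the loop independent of any model backend, so each stage can be swapped or mocked in tests.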