Commit 2a3165d

Author: MarceloClaro
round 14: Aletheia Math Research Engine — Feng et al. (2026) — Generator-Verifier-Reviser loop
1 parent dcafd3e commit 2a3165d

2 files changed

Lines changed: 898 additions & 0 deletions

Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
---
name: aletheia-math-research
description: >
  Autonomous mathematical research agent based on Aletheia (Feng et al., 2026).
  Implements a Generator-Verifier-Reviser loop with Cora-Debate V1-V7 verification
  and thinking/output decoupling. Reaches level L2 (Publishable Research) in
  number theory, combinatorics, algebra, and geometry.
  Use it when you need to solve research-level mathematical problems, generate
  autonomous proofs, or conduct scientific investigation with iterative verification.
spec: "SPEC-012"
version: "1.0"
category: research
tags: [aletheia, mathematics, research, generator-verifier, cora-debate, autonomous-science]
dependencies: [SPEC-001, CORA-Eval, cora-debate, reasoning-orchestrator-v11]
tdd_suite: "scripts/aletheia_engine.py"
ct_count: 10
reference: "Feng, T. et al. (2026). Towards Autonomous Mathematics Research. arXiv:2602.10177v3."
status: active
---

# Aletheia Math Research Engine

## Architecture

```
PROBLEM ──▶ [GENERATOR] ──▶ [VERIFIER (Cora V1-V7)] ──▶ pass? ──▶ SOLUTION
                 ▲                      │                 │
                 │                      ▼ fail            │
                 └────── [REVISER] ◀────┘                 │
                      (feedback loop)                     │
                 max_attempts = 10 (hyperparameter)       │
```
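
The control flow in the diagram can be sketched in a few lines. This is a minimal illustration only, not the code in `scripts/aletheia_engine.py`: `run_gvr_loop`, `Verdict`, and the three callables are hypothetical names, and the toy generator, verifier, and reviser exist solely to make the sketch runnable.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Verdict:
    """Hypothetical verifier result: pass/fail plus the flaws found."""
    passed: bool
    flaws: List[str] = field(default_factory=list)


def run_gvr_loop(
    problem: str,
    generate: Callable[[str], str],
    verify: Callable[[str, str], Verdict],
    revise: Callable[[str, str, List[str]], str],
    max_attempts: int = 10,
) -> Tuple[Optional[str], int]:
    """Return (solution, attempts_used); solution is None if every attempt fails."""
    candidate = generate(problem)
    for attempt in range(1, max_attempts + 1):
        verdict = verify(problem, candidate)
        if verdict.passed:
            return candidate, attempt
        # Feed the verifier's flaws back to the reviser, unless the budget is spent.
        if attempt < max_attempts:
            candidate = revise(problem, candidate, verdict.flaws)
    return None, max_attempts


# Toy stand-ins so the sketch runs end to end (not real mathematics).
def toy_generate(problem: str) -> str:
    return "claim"


def toy_verify(problem: str, candidate: str) -> Verdict:
    if candidate.endswith("+ justification"):
        return Verdict(passed=True)
    return Verdict(passed=False, flaws=["missing justification"])


def toy_revise(problem: str, candidate: str, flaws: List[str]) -> str:
    return candidate + " + justification"


solution, attempts = run_gvr_loop("toy problem", toy_generate, toy_verify, toy_revise)
print(solution, attempts)
```

Passing the three roles in as plain callables keeps the loop itself independent of how each subagent is implemented; whether the actual engine is structured this way is an assumption.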

## 3 Subagents

| Subagent | Role | Inspiration |
|----------|------|-------------|
| **Generator** | Produces a natural-language solution using 16 reasoning types | Feng et al. §2.2 |
| **Verifier** | Checks the solution via Cora-Debate V1-V7 plus hallucination detection | Feng et al. §2.2 |
| **Reviser** | Fixes the flaws identified by the Verifier | Feng et al. §2.1 |

## Benchmark (5 problems)

| ID | Domain | Difficulty |
|:---|:---|:---|
| IMO-2024-P1 | Number Theory | Olympiad |
| Erdos-1051 | Combinatorics | Research Open |
| FutureMath-Basic-1 | Algebra | PhD Exercise |
| Thue-Morse | Combinatorics | Olympiad |
| Goldbach-Variant | Number Theory | Research Open |
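
For scripting against the benchmark, the table above can be transcribed as plain records. The sketch below assumes nothing about the engine's own data model: `BenchmarkItem`, `BENCHMARK`, and `by_domain` are hypothetical names, and problem statements are left out because this document does not list them.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class BenchmarkItem:
    """One row of the benchmark table (hypothetical record type)."""
    id: str
    domain: str
    difficulty: str


# Rows copied from the benchmark table, in the same order.
BENCHMARK = [
    BenchmarkItem("IMO-2024-P1", "Number Theory", "Olympiad"),
    BenchmarkItem("Erdos-1051", "Combinatorics", "Research Open"),
    BenchmarkItem("FutureMath-Basic-1", "Algebra", "PhD Exercise"),
    BenchmarkItem("Thue-Morse", "Combinatorics", "Olympiad"),
    BenchmarkItem("Goldbach-Variant", "Number Theory", "Research Open"),
]


def by_domain(items: List[BenchmarkItem]) -> Dict[str, List[str]]:
    """Group problem IDs by domain, preserving table order within each group."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for item in items:
        groups[item.domain].append(item.id)
    return dict(groups)


for domain, ids in by_domain(BENCHMARK).items():
    print(f"{domain}: {', '.join(ids)}")
```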

## Usage

```python
from aletheia_engine import AletheiaEngine, MathProblem

# max_attempts caps the Generator-Verifier-Reviser iterations (10 in the architecture diagram).
engine = AletheiaEngine(max_attempts=10, strictness=0.7)
problem = MathProblem(
    id="My-Problem",
    statement="Prove that...",
    domain="number_theory",
    difficulty="research_open"
)
# solve() runs the loop and returns a session object describing the outcome.
session = engine.solve(problem)
print(session.status, session.final_solution)
```

## Main Reference

Feng, T., Trinh, T.H., Bingham, G. et al. (2026).
**Towards Autonomous Mathematics Research.**
arXiv:2602.10177v3 [cs.LG].
Google DeepMind Superhuman Reasoning Team.
