Commit a73c92a

Archive support with optional history
1 parent 3c88176 commit a73c92a

8 files changed

Lines changed: 572 additions & 1 deletion

CLAUDE.md

Lines changed: 2 additions & 0 deletions
@@ -178,6 +178,8 @@ Two-stage templating: `values.yaml.gotmpl` with `@enum/@default/@description` an
 
 **Ethereum `--mode full|archive`** (default `full`): controls whether reth runs as a pruned full node (~500 GB mainnet / ~100 GB testnet) or an archive node retaining all historical state (~4 TB+ mainnet / ~300 GB testnet). Archive mode is for state replay (block explorers, historical `eth_call`, indexers); full mode is the right default for everything else. The mode flows through to (a) the reth `--full` arg in `internal/embed/networks/ethereum/helmfile.yaml.gotmpl`, (b) PVC sizing in `templates/pvc.yaml`, and (c) the `helmfile` `persistence.size` request. `obol network install ethereum` runs a disk-space preflight via `internal/network/preflight.go` — it warns when `cfg.DataDir` has less free disk than `(network, mode)` is expected to need, prompts the user, and auto-continues in non-interactive mode (no TTY / JSON output) so scripted installs don't deadlock. Other execution clients (geth, nethermind, besu, erigon) ignore the mode flag for now.
 
+**Ethereum `--since` (partial archive)**: when `--mode=archive` would otherwise mean genesis-to-tip, `--since` bounds the archive at a known historical point and translates to reth's `--prune.account-history.{before,distance}` + `--prune.storage-history.{before,distance}` flags (plus `--prune.receipts.pre-merge` / `--prune.bodies.pre-merge` when the cutoff is at or before the merge). Accepted forms: EL hardfork names (`merge`, `shanghai`, `cancun`, `prague`, `osaka` — mainnet only, verified block numbers in `internal/network/hardforks.go`); durations (`365d`, `1y`, `6mo` — resolved against the post-merge 12s slot rate as a `--prune.*.distance` value); raw block numbers (`22500000`); or `genesis`/`all` (no extra args). Resolution happens in `resolveEthereumArchiveScope` in `internal/network/picker.go`: `--since` wins outright; if `--mode` is unset and a TTY is attached, a `full vs archive` picker runs; if `--mode=archive` is set without `--since` on a TTY, an `Archive scope` picker offers the hardfork presets + custom block + 365 days + genesis. Non-TTY defaults to `mode=full` (mode unset) or `since=genesis` (mode=archive). Resolved scope is appended to `values.yaml` as `pruneKind` / `pruneBlock` / `pruneDistance` and consumed by the helmfile. Partial archive is wired only for reth; geth/besu/erigon/nethermind emit a warning and run with chart defaults. Hardfork-name presets are rejected on testnets (mainnet block numbers don't apply).
+
 ## Stack Lifecycle
 
 | Command | Action |
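
Review note: the `--mode` / `--since` precedence described in the CLAUDE.md paragraph can be sketched as a small decision function. This is only an illustration: `picker.go` is not shown in this diff, so the `resolution` type, the `resolve` function, and the choice to treat `--since` as implying archive mode are assumptions, not the commit's actual code.

```go
package main

import "fmt"

// resolution is a hypothetical result type; the real
// resolveEthereumArchiveScope signature in picker.go may differ.
type resolution struct {
	Mode   string // "full" or "archive"; empty while a picker is pending
	Since  string // raw --since value, "" when not applicable
	Prompt string // "", "mode", or "scope": which picker would run
}

// resolve mirrors the precedence described in the CLAUDE.md text.
func resolve(mode, since string, tty bool) resolution {
	switch {
	case since != "":
		// --since wins outright (assumed here to imply archive mode).
		return resolution{Mode: "archive", Since: since}
	case mode == "" && tty:
		// Mode unset on a TTY: the full-vs-archive picker runs.
		return resolution{Prompt: "mode"}
	case mode == "":
		// Mode unset without a TTY: default to full.
		return resolution{Mode: "full"}
	case mode == "archive" && tty:
		// Archive without --since on a TTY: the scope picker runs.
		return resolution{Mode: "archive", Prompt: "scope"}
	case mode == "archive":
		// Archive without --since and without a TTY: all history.
		return resolution{Mode: "archive", Since: "genesis"}
	default:
		return resolution{Mode: mode}
	}
}

func main() {
	fmt.Printf("%+v\n", resolve("", "cancun", false))
	fmt.Printf("%+v\n", resolve("", "", false))
	fmt.Printf("%+v\n", resolve("archive", "", false))
	fmt.Printf("%+v\n", resolve("archive", "", true))
}
```

Keeping the decision free of I/O like this would make the non-TTY defaults easy to unit-test without faking a terminal.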

docs/getting-started.md

Lines changed: 40 additions & 1 deletion
@@ -232,9 +232,48 @@ obol network install ethereum --network=mainnet
 obol network install ethereum --network=mainnet --mode=archive
 ```
 
-The installer warns when the data directory has less free disk than the
+When `--mode` is omitted on a TTY, the installer prompts. The disk-space
+preflight warns when the data directory has less free disk than the
 chosen mode is likely to need.
 
+### Partial archive (`--since`)
+
+A full mainnet archive from genesis is ~4 TB+, but most archive use cases
+(indexers, recent-state replay) only need history back to a known point.
+`--since` keeps an archive bounded — reth gets the right `--prune.*`
+flags wired through the chart:
+
+```bash
+# Archive back to the merge (Sep 2022, ~1.5 TB)
+obol network install ethereum --network=mainnet --mode=archive --since=merge
+
+# Archive back to Cancun (Mar 2024, ~800 GB)
+obol network install ethereum --network=mainnet --mode=archive --since=cancun
+
+# Archive of the last 365 days (~600 GB)
+obol network install ethereum --network=mainnet --mode=archive --since=365d
+
+# Archive from a specific block forward
+obol network install ethereum --network=mainnet --mode=archive --since=22500000
+```
+
+Accepted `--since` values:
+
+| Form | Example | Meaning |
+|---|---|---|
+| EL fork name | `merge`, `shanghai`, `cancun`, `prague`, `osaka` | Prune state before that mainnet hardfork |
+| Duration | `365d`, `1y`, `6mo` | Keep last N blocks (~12s slot rate) |
+| Block number | `22500000` | Prune state before that block |
+| `genesis` / `all` | `genesis` | Full archive from genesis |
+
+When `--mode=archive` is set without `--since` on a TTY, the installer
+shows an interactive picker. On non-TTY (scripts, CI), the default is
+`all history`. `--since` is currently fine-tuned for **reth**; other
+execution clients fall back to their chart-default behavior with a warning.
+
+Fork-name presets reference mainnet block numbers — use a raw block
+number or a duration on testnets.
+
 Verify:
 
 ```bash

internal/embed/networks/ethereum/helmfile.yaml.gotmpl

Lines changed: 18 additions & 0 deletions
@@ -58,6 +58,24 @@ releases:
 {{- end }}
 {{- if ne .Values.mode "archive" }}
 - --full
+{{- else if eq (.Values.pruneKind | default "") "before" }}
+# Partial archive: prune state history before block N.
+# Receipts/bodies are pruned to the same cutoff via the
+# pre-merge presets where applicable; otherwise reth
+# uses the same `before` value.
+- --prune.account-history.before={{ .Values.pruneBlock }}
+- --prune.storage-history.before={{ .Values.pruneBlock }}
+{{- if le (int .Values.pruneBlock) 15537394 }}
+- --prune.receipts.pre-merge
+- --prune.bodies.pre-merge
+{{- else }}
+- --prune.receipts.before={{ .Values.pruneBlock }}
+- --prune.bodies.before={{ .Values.pruneBlock }}
+{{- end }}
+{{- else if eq (.Values.pruneKind | default "") "distance" }}
+# Partial archive: keep last N blocks of history.
+- --prune.account-history.distance={{ .Values.pruneDistance }}
+- --prune.storage-history.distance={{ .Values.pruneDistance }}
 {{- end }}
 {{- end }}
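
Review note: the template branches in this hunk can be mirrored in plain Go to see which arguments each resolved scope produces. The flag strings are copied from the hunk; `rethPruneArgs` and `mergeBlock` are hypothetical names used only for this sketch, and nothing here confirms which flags a given reth release accepts.

```go
package main

import (
	"fmt"
	"strings"
)

// mergeBlock is the cutoff the template compares pruneBlock against.
const mergeBlock = 15537394

// rethPruneArgs is a hypothetical helper that follows the same branches
// as the helmfile template: non-archive, "before", "distance", or none.
func rethPruneArgs(mode, kind string, block, distance uint64) []string {
	if mode != "archive" {
		return []string{"--full"}
	}
	switch kind {
	case "before":
		args := []string{
			fmt.Sprintf("--prune.account-history.before=%d", block),
			fmt.Sprintf("--prune.storage-history.before=%d", block),
		}
		if block <= mergeBlock {
			// Cutoff at or before the merge: use the pre-merge presets.
			return append(args, "--prune.receipts.pre-merge", "--prune.bodies.pre-merge")
		}
		return append(args,
			fmt.Sprintf("--prune.receipts.before=%d", block),
			fmt.Sprintf("--prune.bodies.before=%d", block))
	case "distance":
		return []string{
			fmt.Sprintf("--prune.account-history.distance=%d", distance),
			fmt.Sprintf("--prune.storage-history.distance=%d", distance),
		}
	}
	// Archive with no pruneKind: full history, no extra args.
	return nil
}

func main() {
	fmt.Println(strings.Join(rethPruneArgs("archive", "before", 19426587, 0), " "))
	fmt.Println(strings.Join(rethPruneArgs("archive", "distance", 0, 2628000), " "))
}
```

Note that the distance branch emits only the two history flags, matching the template, which adds no receipts or bodies arguments in that case.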

internal/embed/networks/ethereum/values.yaml.gotmpl

Lines changed: 4 additions & 0 deletions
@@ -20,3 +20,7 @@ consensusClient: {{.ConsensusClient}}
 # @default full
 # @description Node mode. 'full' prunes historical state (~500GB mainnet); 'archive' keeps all state for history replay (~4TB+ mainnet)
 mode: {{.Mode}}
+
+# @default
+# @description Archive scope (only used with --mode=archive). Accepts: 'merge', 'shanghai', 'cancun', 'prague', 'osaka' (EL fork name); duration like '365d', '1y', '6mo'; raw block number; or 'genesis' for all history. Prompts interactively on TTY if omitted.
+since: {{.Since}}

internal/network/hardforks.go

Lines changed: 184 additions & 0 deletions
@@ -0,0 +1,184 @@
+package network
+
+import (
+	"fmt"
+	"strconv"
+	"strings"
+	"time"
+)
+
+// Hardfork describes a mainnet hardfork activation point used by the
+// partial-archive picker. All block numbers are verified for Ethereum
+// mainnet — paris from the reth chainspec source, post-merge forks from
+// the beacon API by computing slot = (fork_ts - beacon_genesis) / 12 and
+// reading execution_payload.block_number from the canonical block at (or
+// shortly after) that slot.
+//
+// Verified 2026-05-27 against:
+//   - reth v2.2.0 crates/chainspec/src/spec.rs (paris)
+//   - go-ethereum params/config.go (post-merge fork timestamps)
+//   - ethereum-beacon-api.publicnode.com (slot → execution block lookups)
+type Hardfork struct {
+	// Name is the canonical EL fork name used by --since=<name>.
+	Name string
+	// DisplayName is shown in the interactive picker.
+	DisplayName string
+	// Block is the first execution block at or after the fork activation.
+	Block uint64
+	// Date is the activation date, used only for picker labels.
+	Date string
+	// ApproxArchiveSizeTB is a rough estimate of mainnet archive disk usage
+	// when pruning state history before this block (reth). Used in picker
+	// labels so users have a realistic expectation. Values rounded.
+	ApproxArchiveSizeTB float64
+}
+
+// MainnetHardforks is the ordered list of mainnet hardforks supported by
+// --since presets, oldest first.
+var MainnetHardforks = []Hardfork{
+	{
+		Name:                "merge",
+		DisplayName:         "the merge",
+		Block:               15537394,
+		Date:                "2022-09-15",
+		ApproxArchiveSizeTB: 1.5,
+	},
+	{
+		Name:                "shanghai",
+		DisplayName:         "shanghai",
+		Block:               17034870,
+		Date:                "2023-04-12",
+		ApproxArchiveSizeTB: 1.2,
+	},
+	{
+		Name:                "cancun",
+		DisplayName:         "cancun",
+		Block:               19426587,
+		Date:                "2024-03-13",
+		ApproxArchiveSizeTB: 0.8,
+	},
+	{
+		Name:                "prague",
+		DisplayName:         "prague",
+		Block:               22431084,
+		Date:                "2025-05-07",
+		ApproxArchiveSizeTB: 0.4,
+	},
+	{
+		Name:                "osaka",
+		DisplayName:         "osaka",
+		Block:               23935694,
+		Date:                "2025-12-03",
+		ApproxArchiveSizeTB: 0.2,
+	},
+}
+
+// HardforkByName returns the hardfork with the given name, or nil.
+func HardforkByName(name string) *Hardfork {
+	for i := range MainnetHardforks {
+		if MainnetHardforks[i].Name == strings.ToLower(name) {
+			return &MainnetHardforks[i]
+		}
+	}
+	return nil
+}
+
+// ArchiveScope is the resolved meaning of --since after parsing.
+type ArchiveScope struct {
+	// Kind is one of: "all" (full archive from genesis), "before" (prune
+	// before a specific block), "distance" (keep last N blocks of history).
+	Kind string
+	// Block is set when Kind == "before".
+	Block uint64
+	// Distance is set when Kind == "distance".
+	Distance uint64
+	// Label is a human-readable description for logs/picker confirmation.
+	Label string
+}
+
+// ParseSince resolves a --since value into an ArchiveScope. Accepted
+// forms:
+//   - "genesis", "all": full archive from genesis
+//   - "merge", "shanghai", "cancun", "prague", "osaka": EL hardfork name
+//   - "<N>d", "<N>mo", "<N>y": duration back from chain head (12s slots)
+//   - "<block>": raw block number (any unsigned integer)
+//
+// Mode is reth-anchored: "before" presets get translated to reth's
+// --prune.<segment>.before flags; "distance" presets to --prune.<segment>.distance.
+func ParseSince(raw string) (ArchiveScope, error) {
+	v := strings.TrimSpace(strings.ToLower(raw))
+	if v == "" {
+		return ArchiveScope{}, fmt.Errorf("--since cannot be empty")
+	}
+
+	if v == "genesis" || v == "all" {
+		return ArchiveScope{Kind: "all", Label: "all history (from genesis)"}, nil
+	}
+
+	if hf := HardforkByName(v); hf != nil {
+		return ArchiveScope{
+			Kind:  "before",
+			Block: hf.Block,
+			Label: fmt.Sprintf("since %s (block %d, %s)", hf.DisplayName, hf.Block, hf.Date),
+		}, nil
+	}
+
+	// Duration: <N>{d,mo,y}. Translates to block distance using the
+	// post-merge slot time of 12s. "mo" is 30 days; "y" is 365 days.
+	if blocks, ok := parseDurationBlocks(v); ok {
+		return ArchiveScope{
+			Kind:     "distance",
+			Distance: blocks,
+			Label:    fmt.Sprintf("last %s (~%d blocks from tip)", v, blocks),
+		}, nil
+	}
+
+	// Raw block number.
+	if n, err := strconv.ParseUint(v, 10, 64); err == nil {
+		return ArchiveScope{
+			Kind:  "before",
+			Block: n,
+			Label: fmt.Sprintf("since block %d", n),
+		}, nil
+	}
+
+	return ArchiveScope{}, fmt.Errorf("unrecognized --since value %q (try: genesis, merge, shanghai, cancun, prague, osaka, 365d, 1y, or a block number)", raw)
+}
+
+// parseDurationBlocks accepts "<N>d|mo|y" and returns the equivalent
+// post-merge block distance (12s slots, ignoring missed slots — close
+// enough for pruning purposes since reth uses --prune.*.distance which
+// is anchored to chain tip at run time).
+func parseDurationBlocks(v string) (uint64, bool) {
+	var n uint64
+	var unit string
+	for i := 0; i < len(v); i++ {
+		if v[i] < '0' || v[i] > '9' {
+			parsed, err := strconv.ParseUint(v[:i], 10, 64)
+			if err != nil || i == 0 {
+				return 0, false
+			}
+			n = parsed
+			unit = v[i:]
+			break
+		}
+	}
+	if unit == "" {
+		return 0, false
+	}
+
+	var seconds uint64
+	switch unit {
+	case "d", "day", "days":
+		seconds = n * uint64(24*time.Hour/time.Second)
+	case "mo", "month", "months":
+		seconds = n * 30 * uint64(24*time.Hour/time.Second)
+	case "y", "yr", "year", "years":
+		seconds = n * 365 * uint64(24*time.Hour/time.Second)
+	default:
+		return 0, false
+	}
+
+	// 12s per slot post-merge.
+	return seconds / 12, true
+}
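
Review note: the `Hardfork` doc comment derives post-merge block numbers from `slot = (fork_ts - beacon_genesis) / 12`. The first half of that procedure is plain arithmetic and can be checked in isolation. In the sketch below, the beacon genesis time and the fork timestamps are quoted from memory of the public mainnet configuration (go-ethereum `params/config.go`), not taken from this diff, so treat them as assumptions to re-verify; `slotAt` is a hypothetical helper. The second half (reading `execution_payload.block_number` at that slot) needs a beacon API call and is not shown.

```go
package main

import "fmt"

const (
	// beaconGenesis is the mainnet beacon chain genesis time in unix
	// seconds (assumed value; verify against the consensus specs).
	beaconGenesis = 1606824023
	// slotSeconds is the post-merge slot duration.
	slotSeconds = 12
)

// slotAt converts a fork activation timestamp into a beacon slot number.
func slotAt(forkTS uint64) uint64 {
	return (forkTS - beaconGenesis) / slotSeconds
}

func main() {
	// Fork timestamps are assumptions recalled from go-ethereum's
	// mainnet chain config; they do not appear in this commit.
	forks := []struct {
		name string
		ts   uint64
	}{
		{"shanghai", 1681338455},
		{"cancun", 1710338135},
		{"prague", 1746612311},
	}
	for _, f := range forks {
		fmt.Printf("%-8s ts=%d slot=%d\n", f.name, f.ts, slotAt(f.ts))
	}
}
```

If the canonical block at the computed slot is missing, the comment's "at (or shortly after)" wording suggests stepping forward to the next slot that has a block.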

internal/network/hardforks_test.go

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
+package network
+
+import "testing"
+
+func TestParseSince(t *testing.T) {
+	cases := []struct {
+		in       string
+		wantKind string
+		wantNum  uint64 // Block for "before", Distance for "distance"
+		wantErr  bool
+	}{
+		{"genesis", "all", 0, false},
+		{"all", "all", 0, false},
+		{"merge", "before", 15537394, false},
+		{"MERGE", "before", 15537394, false},
+		{"shanghai", "before", 17034870, false},
+		{"cancun", "before", 19426587, false},
+		{"prague", "before", 22431084, false},
+		{"osaka", "before", 23935694, false},
+		{"365d", "distance", 365 * 24 * 60 * 60 / 12, false},
+		{"1y", "distance", 365 * 24 * 60 * 60 / 12, false},
+		{"6mo", "distance", 6 * 30 * 24 * 60 * 60 / 12, false},
+		{"15537394", "before", 15537394, false},
+		{"", "", 0, true},
+		{"yesterday", "", 0, true},
+		{"-1y", "", 0, true},
+	}
+
+	for _, c := range cases {
+		t.Run(c.in, func(t *testing.T) {
+			got, err := ParseSince(c.in)
+			if (err != nil) != c.wantErr {
+				t.Fatalf("err = %v, wantErr = %v", err, c.wantErr)
+			}
+			if c.wantErr {
+				return
+			}
+			if got.Kind != c.wantKind {
+				t.Fatalf("kind = %q, want %q", got.Kind, c.wantKind)
+			}
+			switch got.Kind {
+			case "before":
+				if got.Block != c.wantNum {
+					t.Fatalf("block = %d, want %d", got.Block, c.wantNum)
+				}
+			case "distance":
+				if got.Distance != c.wantNum {
+					t.Fatalf("distance = %d, want %d", got.Distance, c.wantNum)
+				}
+			}
+		})
+	}
+}
+
+func TestMainnetHardforkOrder(t *testing.T) {
+	for i := 1; i < len(MainnetHardforks); i++ {
+		if MainnetHardforks[i-1].Block >= MainnetHardforks[i].Block {
+			t.Fatalf("hardforks must be ordered oldest first: %s (block %d) >= %s (block %d)",
+				MainnetHardforks[i-1].Name, MainnetHardforks[i-1].Block,
+				MainnetHardforks[i].Name, MainnetHardforks[i].Block)
+		}
+	}
+}

internal/network/network.go

Lines changed: 25 additions & 0 deletions
@@ -125,6 +125,20 @@ func Install(cfg *config.Config, u *ui.UI, network string, overrides map[string]
 		templateData[field.Name] = value
 	}
 
+	// Ethereum: resolve archive scope from --mode/--since, prompting on
+	// TTY when the user under-specifies. Updates templateData["Mode"] to
+	// reflect any picker choice so downstream code (preflight, YAML
+	// render) sees the final value.
+	var archiveScope ArchiveScope
+	if network == "ethereum" {
+		scope, resolvedMode, err := resolveEthereumArchiveScope(u, templateData, overrides)
+		if err != nil {
+			return err
+		}
+		archiveScope = scope
+		templateData["Mode"] = resolvedMode
+	}
+
 	// Disk-space preflight (currently only meaningful for ethereum). The
 	// check warns and prompts; in non-interactive mode (no TTY / JSON) it
 	// auto-continues so scripted installs don't deadlock.
@@ -156,6 +170,17 @@ func Install(cfg *config.Config, u *ui.UI, network string, overrides map[string]
 		return fmt.Errorf("failed to execute values template: %w", err)
 	}
 
+	// Append the resolved archive scope as additional YAML so helmfile can
+	// translate it into per-client prune args. Kept separate from the
+	// template because it's computed by the CLI, not passed by the user.
+	if network == "ethereum" {
+		var sb strings.Builder
+		sb.Write(buf.Bytes())
+		appendArchiveScopeYAML(&sb, archiveScope)
+		buf.Reset()
+		buf.WriteString(sb.String())
+	}
+
 	// Validate that the generated content is valid YAML
 	var yamlCheck any
 	if err := yaml.Unmarshal(buf.Bytes(), &yamlCheck); err != nil {
