Skip to content

Refactor tipset_by_height and load_child_tipset to distinguish error from miss#7005

Open
sudo-shashank wants to merge 3 commits intomainfrom
shashank/refactor-tipset-lookup
Open

Refactor tipset_by_height and load_child_tipset to distinguish error from miss#7005
sudo-shashank wants to merge 3 commits intomainfrom
shashank/refactor-tipset-lookup

Conversation

@sudo-shashank
Copy link
Copy Markdown
Contributor

@sudo-shashank sudo-shashank commented May 6, 2026

Summary of changes

Changes introduced in this pull request:

  • tipset_by_height and load_child_tipset now return Result<Option<Tipset>, Error> (Ok(None) = clean cache miss, Err = actual storage/IO failure)
  • Propagate error as a genuine storage/IO failure, enabling fast-fail behavior

Reference issue to close (if applicable)

Closes #6755

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Outside contributions

  • I have read and agree to the CONTRIBUTING document.
  • I have read and agree to the AI Policy document. I understand that failure to comply with the guidelines will lead to rejection of the pull request.

Summary by CodeRabbit

  • Bug Fixes

    • Clearer, epoch-specific error messages for missing tipsets and improved diagnostics across commands, RPCs, and tooling.
  • Improvements

    • More robust tipset resolution, stricter error propagation, and caching refinements for finality and state workflows.
    • Hardening of export/diff and snapshot flows for more reliable chain queries.
  • API Changes

    • Tipset lookups now return an explicit optional "not found" result instead of an error.

@sudo-shashank sudo-shashank added the RPC requires calibnet RPC checks to run on CI label May 6, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 6, 2026

Walkthrough

Refactors tipset lookup APIs so tipset_by_height and load_child_tipset return Result<Option<Tipset>, Error> (Ok(None) for a known miss, Err for storage errors) and updates many call sites to propagate errors and add contextual with_context messages.

Changes

Tipset lookup & call-site propagation

Layer / File(s) Summary
Core API Signatures / Behavior
src/chain/store/index.rs, src/chain/store/chain_store.rs
tipset_by_height and load_child_tipset now return Result<Option<Tipset>, Error>; successes are Ok(Some(...)), missing tipsets return Ok(None). Docs updated.
Core ChainStore logic
src/chain/store/chain_store.rs
load_child_tipset reimplemented to return Ok(Some(child)) when a valid child is found, Ok(None) when absent, and Err for blockstore/index failures.
Index resolution logic & tests
src/chain/store/index.rs
Null-tipset resolution branches updated to return Ok(Some(...))/Ok(None); genesis and branch paths updated; unit tests adjusted and a new test ensures broken ancestor chains still error.
Error-context enrichment at call sites
src/dev/subcommands/*, src/tool/subcommands/*, src/interpreter/*, src/state_manager/*, src/rpc/methods/*, src/message_pool/msgpool/provider.rs
Numerous tipset lookups now chain .with_context(...) to add epoch/height-specific messages; call sites propagate Err instead of silently swallowing storage errors.
State manager & execution wiring
src/state_manager/mod.rs, src/state_manager/chain_rand.rs
State-loading, rewind, and child-tipset loading paths now propagate errors from load_child_tipset and attach richer diagnostic context.
Finality & RPC adjustments
src/rpc/methods/chain.rs, src/rpc/methods/eth/tipset_resolver.rs, src/rpc/methods/eth.rs, src/rpc/methods/f3.rs
Finality helper signature updated and a cached EC finality helper added; finality and tipset resolution logic adapted to Option semantics and improved error contexts.
Tools: export/diff/snapshot/index subcommands
src/tool/subcommands/archive_cmd.rs, src/tool/subcommands/{benchmark_cmd,index_cmd,snapshot_cmd}.rs, src/dev/subcommands/{export_state_tree_cmd,state_cmd}.rs
Export/diff/snapshot/index commands updated to use the optional API, validate diff epochs on-chain, stream diffs when provided, and add contextual errors.

Sequence Diagram(s)

(Skipped — changes are API return-shape refactors and enriched error contexts; no new multi-component sequential control flow requiring visualization.)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • ChainSafe/forest#6880: Modifies tipset_by_height in src/chain/store/index.rs — overlaps on signature/semantics changes.
  • ChainSafe/forest#6274: Touches chain store/index APIs and tipset-related behaviors similar to this PR.
  • ChainSafe/forest#6797: Adjusts finalized-tipset resolution logic in src/rpc/methods/chain.rs, related to this PR's finality helper changes.

Suggested reviewers

  • LesnyRumcajs
  • hanabi1224
  • akaladarshi
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and concisely summarizes the main objective: refactoring tipset_by_height and load_child_tipset to return Option types, enabling distinction between errors and cache misses.
Linked Issues check ✅ Passed The PR fully addresses all coding objectives in issue #6755: refactored tipset_by_height and load_child_tipset to return Result<Option>, updated all call sites across the codebase to handle the new signatures, and propagated errors appropriately for fast-fail semantics.
Out of Scope Changes check ✅ Passed All changes in the PR are directly related to refactoring tipset lookup APIs and their call sites. Error context improvements are part of the scope, making errors more informative when genuine failures occur.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch shashank/refactor-tipset-lookup
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch shashank/refactor-tipset-lookup

Comment @coderabbitai help to get the list of available commands and usage tips.

@sudo-shashank sudo-shashank changed the title refactor tipset_by_height and load_child_tipset to distinguish error from miss Refactor tipset_by_height and load_child_tipset to distinguish error from miss May 6, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
src/state_manager/mod.rs (1)

472-478: ⚡ Quick win

Avoid the second load_child_tipset on the None path.

When load_child_tipset(ts) returns None, load_tipset_state falls through to load_executed_tipset(ts), which immediately calls load_child_tipset(ts) again inside the cache fill. That adds an extra chain-store lookup on head-tipset cache misses in a hot path. Thread the already-resolved child tipset into the executed-tipset load instead of re-querying it.

Also applies to: 497-503

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/state_manager/mod.rs` around lines 472 - 478, The match in
load_tipset_state currently calls self.chain_store().load_child_tipset(ts) and,
on None, falls through to load_executed_tipset(ts) which re-queries
load_child_tipset inside its cache fill; to avoid the duplicate lookup, change
load_executed_tipset to accept an Option<ChildTipset> (or add a helper like
load_executed_tipset_with_child) and pass the already-resolved child tipset when
available so the cache-fill path uses the provided child instead of calling
load_child_tipset again; apply the same pattern to the other occurrence
referenced (around lines 497-503) so both sites thread the resolved child tipset
into the executed-tipset loader.
src/tool/subcommands/archive_cmd.rs (1)

844-854: ⚡ Quick win

Mirror the two-stage error context used in do_export.

Here the new lookup only labels the miss case. If tipset_by_height returns a real storage/index error, the CLI loses the requested epoch from the error path.

Proposed pattern
     let tipset = chain_index
-        .tipset_by_height(epoch, heaviest_tipset.clone(), ResolveNullTipset::TakeOlder)?
+        .tipset_by_height(epoch, heaviest_tipset.clone(), ResolveNullTipset::TakeOlder)
+        .with_context(|| format!("failed to resolve tipset at epoch {epoch}"))?
         .context("tipset not found at requested epoch")?;

     let child_tipset = chain_index
         .tipset_by_height(
             epoch + 1,
             heaviest_tipset.clone(),
             ResolveNullTipset::TakeNewer,
-        )?
+        )
+        .with_context(|| format!("failed to resolve child tipset after epoch {epoch}"))?
         .context("child tipset not found at epoch+1")?;

As per coding guidelines "Use anyhow::Result<T> for most operations and add context with .context() when errors occur".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/tool/subcommands/archive_cmd.rs` around lines 844 - 854, The two tipset
lookups using chain_index.tipset_by_height (for variables tipset and
child_tipset with ResolveNullTipset::TakeOlder/TakeNewer and heaviest_tipset)
only add a final `.context("... not found")` which hides the requested epoch on
storage/index errors; change both calls to use the two-stage pattern used in
do_export: first attempt the call, map or propagate the underlying error, and
then apply `.context(format!("tipset not found at requested epoch {}", epoch))`
(and similarly for epoch+1) so that real storage/index errors retain their
original messages while the missing-tipset case still includes the epoch
context.
src/rpc/methods/chain.rs (1)

397-400: ⚡ Quick win

Preserve call-site context for the Err(_) path too.

These updated lookups mostly convert None into a better error, but they still let storage/index failures escape without height-specific context. The finality helper fallback lookups also propagate raw errors today.

Proposed pattern
         let tss = ctx
             .chain_index()
-            .tipset_by_height(height, ts, ResolveNullTipset::TakeOlder)?
+            .tipset_by_height(height, ts, ResolveNullTipset::TakeOlder)
+            .with_context(|| format!("failed to resolve tipset at height {height}"))?
             .with_context(|| format!("tipset not found at height {height}"))?;

Use the same two-stage pattern in the export call sites and in the finality fallback lookups so both Err(_) and Ok(None) are diagnosable.

As per coding guidelines "Use anyhow::Result<T> for most operations and add context with .context() when errors occur".

Also applies to: 596-599, 898-900, 929-930, 1071-1074, 1099-1102, 1237-1254

src/rpc/methods/eth.rs (1)

957-960: ⚡ Quick win

Add context before unwrapping the lookup Result.

These call sites currently only annotate the Ok(None) case. If tipset_by_height fails due to blockstore/index IO, the ? returns early and the queried height never appears in the error.

Proposed pattern
     chain
         .chain_index()
-        .tipset_by_height(height, head, resolve)?
+        .tipset_by_height(height, head, resolve)
+        .with_context(|| format!("failed to resolve tipset at height {height}"))?
         .with_context(|| format!("tipset not found at height {height}"))

Apply the same two-stage pattern to the canonical-height lookup as well.

As per coding guidelines "Use anyhow::Result<T> for most operations and add context with .context() when errors occur".

Also applies to: 973-976

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/eth.rs` around lines 957 - 960, The call to
chain.chain_index().tipset_by_height(...) currently applies ? before adding
context, so IO/index errors lose the height context; change to the two-stage
pattern: first call chain.chain_index().tipset_by_height(height, head,
resolve).with_context(|| format!("failed to lookup tipset at height {height}"))?
to attach context to any Err, then separately handle the Ok(None) case (e.g.,
.ok_or_else(|| anyhow::anyhow!("tipset not found at height {height}"))?) so that
both IO errors and "not found" return messages include the height; apply the
same change for the other occurrence around tipset_by_height at the later call
site.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/chain/store/index.rs`:
- Around line 121-126: The chain walk currently uses
from.chain(&self.db).tuple_windows() which stops silently on missing ancestor
tipsets and causes missing-parent errors to be treated as Ok(None); replace that
iterator usage with an explicit ancestor-load loop that calls the tipset
parent-loading API (e.g. invoke the Tipset/Store method that loads parents from
self.db for each step) and propagate any storage/load errors as Err(...)
immediately, only returning Ok(None) when you have explicitly verified the chain
contains no tipset at the requested epoch; update the same logic used in the
comparable block further down (the code region replacing tuple_windows) so both
places explicitly load parents and fail on missing data instead of downgrading
to Ok(None).

---

Nitpick comments:
In `@src/rpc/methods/eth.rs`:
- Around line 957-960: The call to chain.chain_index().tipset_by_height(...)
currently applies ? before adding context, so IO/index errors lose the height
context; change to the two-stage pattern: first call
chain.chain_index().tipset_by_height(height, head, resolve).with_context(||
format!("failed to lookup tipset at height {height}"))? to attach context to any
Err, then separately handle the Ok(None) case (e.g., .ok_or_else(||
anyhow::anyhow!("tipset not found at height {height}"))?) so that both IO errors
and "not found" return messages include the height; apply the same change for
the other occurrence around tipset_by_height at the later call site.

In `@src/state_manager/mod.rs`:
- Around line 472-478: The match in load_tipset_state currently calls
self.chain_store().load_child_tipset(ts) and, on None, falls through to
load_executed_tipset(ts) which re-queries load_child_tipset inside its cache
fill; to avoid the duplicate lookup, change load_executed_tipset to accept an
Option<ChildTipset> (or add a helper like load_executed_tipset_with_child) and
pass the already-resolved child tipset when available so the cache-fill path
uses the provided child instead of calling load_child_tipset again; apply the
same pattern to the other occurrence referenced (around lines 497-503) so both
sites thread the resolved child tipset into the executed-tipset loader.

In `@src/tool/subcommands/archive_cmd.rs`:
- Around line 844-854: The two tipset lookups using chain_index.tipset_by_height
(for variables tipset and child_tipset with
ResolveNullTipset::TakeOlder/TakeNewer and heaviest_tipset) only add a final
`.context("... not found")` which hides the requested epoch on storage/index
errors; change both calls to use the two-stage pattern used in do_export: first
attempt the call, map or propagate the underlying error, and then apply
`.context(format!("tipset not found at requested epoch {}", epoch))` (and
similarly for epoch+1) so that real storage/index errors retain their original
messages while the missing-tipset case still includes the epoch context.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 1d34d777-c49e-410e-86e7-998b738ebfad

📥 Commits

Reviewing files that changed from the base of the PR and between 0042f5b and 487c1a1.

📒 Files selected for processing (19)
  • src/chain/store/chain_store.rs
  • src/chain/store/index.rs
  • src/dev/subcommands/export_state_tree_cmd.rs
  • src/dev/subcommands/state_cmd.rs
  • src/interpreter/fvm3.rs
  • src/interpreter/fvm4.rs
  • src/message_pool/msgpool/provider.rs
  • src/rpc/methods/chain.rs
  • src/rpc/methods/eth.rs
  • src/rpc/methods/eth/filter/mod.rs
  • src/rpc/methods/eth/tipset_resolver.rs
  • src/rpc/methods/f3.rs
  • src/rpc/methods/state.rs
  • src/state_manager/chain_rand.rs
  • src/state_manager/mod.rs
  • src/tool/subcommands/archive_cmd.rs
  • src/tool/subcommands/benchmark_cmd.rs
  • src/tool/subcommands/index_cmd.rs
  • src/tool/subcommands/snapshot_cmd.rs

Comment thread src/chain/store/index.rs Outdated
@sudo-shashank sudo-shashank marked this pull request as ready for review May 6, 2026 05:30
@sudo-shashank sudo-shashank requested a review from a team as a code owner May 6, 2026 05:30
@sudo-shashank sudo-shashank requested review from LesnyRumcajs and hanabi1224 and removed request for a team May 6, 2026 05:30
@codecov
Copy link
Copy Markdown

codecov Bot commented May 6, 2026

Codecov Report

❌ Patch coverage is 27.69953% with 154 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.04%. Comparing base (0042f5b) to head (878bb44).
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/state_manager/mod.rs 15.38% 19 Missing and 3 partials ⚠️
src/rpc/methods/chain.rs 44.73% 9 Missing and 12 partials ⚠️
src/dev/subcommands/state_cmd.rs 0.00% 18 Missing ⚠️
src/rpc/methods/state.rs 0.00% 15 Missing ⚠️
src/tool/subcommands/archive_cmd.rs 0.00% 12 Missing and 2 partials ⚠️
src/chain/store/chain_store.rs 35.71% 8 Missing and 1 partial ⚠️
src/dev/subcommands/export_state_tree_cmd.rs 0.00% 8 Missing ⚠️
src/rpc/methods/f3.rs 0.00% 8 Missing ⚠️
src/tool/subcommands/benchmark_cmd.rs 0.00% 7 Missing ⚠️
src/message_pool/msgpool/provider.rs 0.00% 5 Missing and 1 partial ⚠️
... and 9 more
Additional details and impacted files
Files with missing lines Coverage Δ
src/chain/store/index.rs 90.00% <96.00%> (+2.50%) ⬆️
src/interpreter/fvm3.rs 42.13% <0.00%> (-0.22%) ⬇️
src/interpreter/fvm4.rs 42.18% <0.00%> (-0.21%) ⬇️
src/rpc/methods/eth/filter/mod.rs 88.89% <75.00%> (-0.07%) ⬇️
src/tool/subcommands/snapshot_cmd.rs 0.00% <0.00%> (ø)
src/rpc/methods/eth.rs 65.18% <42.85%> (-0.07%) ⬇️
src/state_manager/chain_rand.rs 70.16% <0.00%> (-1.39%) ⬇️
src/tool/subcommands/index_cmd.rs 0.00% <0.00%> (ø)
src/rpc/methods/eth/tipset_resolver.rs 61.64% <0.00%> (+0.83%) ⬆️
src/message_pool/msgpool/provider.rs 54.79% <0.00%> (-6.40%) ⬇️
... and 9 more

... and 6 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0042f5b...878bb44. Read the comment docs.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@sudo-shashank sudo-shashank marked this pull request as draft May 6, 2026 05:50
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/chain/store/index.rs (1)

183-187: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Propagate checkpoint load errors instead of treating them like cache misses.

This fast path currently collapses Err(_) and Ok(None) into the same branch. If a cached checkpoint key hits a real blockstore/IO failure, tipset_by_height silently retries from the original from tipset instead of fast-failing.

Suggested fix
         let mut checkpoint_from_epoch = to;
         while checkpoint_from_epoch < from_epoch {
-            if let Some(checkpoint_from_key) =
-                self.ts_height_cache.get_cloned(&checkpoint_from_epoch)
-                && let Ok(Some(checkpoint_from)) = self.load_tipset(&checkpoint_from_key)
-            {
-                from = checkpoint_from;
-                break;
+            if let Some(checkpoint_from_key) =
+                self.ts_height_cache.get_cloned(&checkpoint_from_epoch)
+            {
+                match self.load_tipset(&checkpoint_from_key) {
+                    Ok(Some(checkpoint_from)) => {
+                        from = checkpoint_from;
+                        break;
+                    }
+                    Ok(None) => {}
+                    Err(err) => return Err(err),
+                }
             }
             checkpoint_from_epoch = next_checkpoint(checkpoint_from_epoch);
         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/chain/store/index.rs` around lines 183 - 187, The fast-path inside
tipset_by_height currently treats load_tipset errors the same as cache misses;
change the logic around checkpoint_from_epoch so that after retrieving
checkpoint_from_key via self.ts_height_cache.get_cloned(&checkpoint_from_epoch),
you explicitly match the result of self.load_tipset(&checkpoint_from_key): if
load_tipset returns Err(e) propagate/return that Err immediately (fast-fail), if
it returns Ok(Some(checkpoint_from)) continue the fast-path as intended, and if
it returns Ok(None) treat it as a cache miss and fall back to the existing slow
path. Update the block containing checkpoint_from_epoch,
ts_height_cache.get_cloned, and load_tipset to implement this explicit
matching/propagation.
src/rpc/methods/chain.rs (1)

1235-1249: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't collapse EC finality lookup errors into None.

Both branches here still swallow tipset_by_height failures (let Ok(Some(...)) / .ok().flatten()). That makes blockstore corruption look like “no finalized tipset” and defeats the new miss-vs-error split for ChainGetTipSetFinalityStatus.

Suggested fix
-        let finalized = if depth >= 0
-            && let Ok(Some(ts)) = ctx.chain_index().tipset_by_height(
-                (head.epoch() - depth).max(0),
-                head.shallow_clone(),
-                ResolveNullTipset::TakeOlder,
-            )
-        {
-            Some(ts)
-        } else {
-            let ec_finality_epoch =
-                (head.epoch() - ctx.chain_config().policy.chain_finality).max(0);
-            ctx.chain_index()
-                .tipset_by_height(ec_finality_epoch, head, ResolveNullTipset::TakeOlder)
-                .ok()
-                .flatten()
-        };
+        let finalized = if depth >= 0 {
+            let threshold_epoch = (head.epoch() - depth).max(0);
+            match ctx
+                .chain_index()
+                .tipset_by_height(
+                    threshold_epoch,
+                    head.shallow_clone(),
+                    ResolveNullTipset::TakeOlder,
+                )
+                .with_context(|| {
+                    format!("failed to resolve EC finality tipset at height {threshold_epoch}")
+                })?
+            {
+                Some(ts) => Some(ts),
+                None => {
+                    let ec_finality_epoch =
+                        (head.epoch() - ctx.chain_config().policy.chain_finality).max(0);
+                    ctx.chain_index()
+                        .tipset_by_height(ec_finality_epoch, head, ResolveNullTipset::TakeOlder)
+                        .with_context(|| {
+                            format!(
+                                "failed to resolve fallback EC finality tipset at height {ec_finality_epoch}"
+                            )
+                        })?
+                }
+            }
+        } else {
+            let ec_finality_epoch =
+                (head.epoch() - ctx.chain_config().policy.chain_finality).max(0);
+            ctx.chain_index()
+                .tipset_by_height(ec_finality_epoch, head, ResolveNullTipset::TakeOlder)
+                .with_context(|| {
+                    format!(
+                        "failed to resolve fallback EC finality tipset at height {ec_finality_epoch}"
+                    )
+                })?
+        };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/chain.rs` around lines 1235 - 1249, The code currently
silences errors from ctx.chain_index().tipset_by_height in both branches (the
pattern using let Ok(Some(...)) and .ok().flatten()), turning real errors into a
None "no finalized tipset" result; update the logic in the block that computes
finalized so that you explicitly handle Result from
ctx.chain_index().tipset_by_height: match the Result to propagate Err (return or
map the error into the RPC error used by ChainGetTipSetFinalityStatus), treat
Ok(Some(ts)) as Some(ts), and treat Ok(None) as the intended None/miss case;
locate the computation assigned to finalized and replace the uses of let
Ok(Some(...)) and .ok().flatten() with explicit match/if let handling that
distinguishes Err vs Ok(None) so blockstore/lookup errors are not collapsed into
a missing-finalized-tipset outcome.
🧹 Nitpick comments (1)
src/rpc/methods/chain.rs (1)

397-400: ⚡ Quick win

Add context on the Result layer before unwrapping the Option.

These call sites only annotate the None case right now. If tipset_by_height returns a real storage error, the RPC loses the height-specific context that this PR is trying to preserve.

Example pattern
 let start_ts = ctx
     .chain_index()
-    .tipset_by_height(epoch, head, ResolveNullTipset::TakeOlder)?
+    .tipset_by_height(epoch, head, ResolveNullTipset::TakeOlder)
+    .with_context(|| format!("couldn't get a tipset at height {epoch}"))?
     .with_context(|| format!("tipset not found at height {epoch}"))?;

As per coding guidelines, "**/*.rs: Use anyhow::Result<T> for most operations and add context with .context() when errors occur`."

Also applies to: 596-600, 896-899, 927-930, 1071-1074, 1099-1102

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/chain.rs` around lines 397 - 400, The call to
chain_index().tipset_by_height(..., ResolveNullTipset::TakeOlder) currently
attaches the height context only when the Option is None; instead add context on
the Result layer before unwrapping so storage errors also get the height info:
call .context(|| format!("tipset not found at height {epoch}")) (or
.with_context on the Result) immediately after tipset_by_height(...) and before
any ? that converts the Result/Option, e.g. adjust the
chain_index().tipset_by_height(...) usage in the current snippet and the other
similar tipset_by_height call sites in this file so the error from storage is
annotated with the epoch message.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/chain/store/index.rs`:
- Around line 183-187: The fast-path inside tipset_by_height currently treats
load_tipset errors the same as cache misses; change the logic around
checkpoint_from_epoch so that after retrieving checkpoint_from_key via
self.ts_height_cache.get_cloned(&checkpoint_from_epoch), you explicitly match
the result of self.load_tipset(&checkpoint_from_key): if load_tipset returns
Err(e) propagate/return that Err immediately (fast-fail), if it returns
Ok(Some(checkpoint_from)) continue the fast-path as intended, and if it returns
Ok(None) treat it as a cache miss and fall back to the existing slow path.
Update the block containing checkpoint_from_epoch, ts_height_cache.get_cloned,
and load_tipset to implement this explicit matching/propagation.

In `@src/rpc/methods/chain.rs`:
- Around line 1235-1249: The code currently silences errors from
ctx.chain_index().tipset_by_height in both branches (the pattern using let
Ok(Some(...)) and .ok().flatten()), turning real errors into a None "no
finalized tipset" result; update the logic in the block that computes finalized
so that you explicitly handle Result from ctx.chain_index().tipset_by_height:
match the Result to propagate Err (return or map the error into the RPC error
used by ChainGetTipSetFinalityStatus), treat Ok(Some(ts)) as Some(ts), and treat
Ok(None) as the intended None/miss case; locate the computation assigned to
finalized and replace the uses of let Ok(Some(...)) and .ok().flatten() with
explicit match/if let handling that distinguishes Err vs Ok(None) so
blockstore/lookup errors are not collapsed into a missing-finalized-tipset
outcome.

---

Nitpick comments:
In `@src/rpc/methods/chain.rs`:
- Around line 397-400: The call to chain_index().tipset_by_height(...,
ResolveNullTipset::TakeOlder) currently attaches the height context only when
the Option is None; instead add context on the Result layer before unwrapping so
storage errors also get the height info: call .context(|| format!("tipset not
found at height {epoch}")) (or .with_context on the Result) immediately after
tipset_by_height(...) and before any ? that converts the Result/Option, e.g.
adjust the chain_index().tipset_by_height(...) usage in the current snippet and
the other similar tipset_by_height call sites in this file so the error from
storage is annotated with the epoch message.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 683bb560-d9e6-4ae9-8e20-70282996de34

📥 Commits

Reviewing files that changed from the base of the PR and between 487c1a1 and f111bb7.

📒 Files selected for processing (2)
  • src/chain/store/index.rs
  • src/rpc/methods/chain.rs

@sudo-shashank sudo-shashank marked this pull request as ready for review May 6, 2026 08:43
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/rpc/methods/chain.rs (1)

1202-1216: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Finality status still suppresses lookup failures.

These branches discard tipset load errors with if let Ok(parent) and .ok().flatten(). On a corrupted or partially-pruned chain, ChainGetTipSetFinalityStatus, safe, and finalized can now return a fallback tipset instead of surfacing the storage failure, which is the same ambiguity this refactor is trying to remove elsewhere.

🛠️ Suggested direction
         while chain.len() < chain_len {
             chain.push(ts.len() as i64);
-            if let Ok(parent) = ctx.chain_index().load_required_tipset(ts.parents()) {
-                // insert 0 for null rounds
-                if let Ok(n_null_tipsets_to_pad) = usize::try_from(ts.epoch() - parent.epoch() - 1)
-                    && n_null_tipsets_to_pad > 0
-                {
-                    let target_len =
-                        (chain.len().saturating_add(n_null_tipsets_to_pad)).min(chain_len);
-                    chain.resize(target_len, 0);
-                }
-                ts = parent;
-            } else {
+            if ts.epoch() == 0 {
                 break;
             }
+            let parent = ctx
+                .chain_index()
+                .load_required_tipset(ts.parents())
+                .context("failed to load ancestor while computing EC finality")?;
+            if let Ok(n_null_tipsets_to_pad) = usize::try_from(ts.epoch() - parent.epoch() - 1)
+                && n_null_tipsets_to_pad > 0
+            {
+                let target_len =
+                    (chain.len().saturating_add(n_null_tipsets_to_pad)).min(chain_len);
+                chain.resize(target_len, 0);
+            }
+            ts = parent;
         }
…
-        let finalized = if depth >= 0
-            && let Ok(Some(ts)) = ctx.chain_index().tipset_by_height(
-                (head.epoch() - depth).max(0),
-                head.shallow_clone(),
-                ResolveNullTipset::TakeOlder,
-            ) {
-            Some(ts)
+        let finalized = if depth >= 0 {
+            ctx.chain_index()
+                .tipset_by_height(
+                    (head.epoch() - depth).max(0),
+                    head.shallow_clone(),
+                    ResolveNullTipset::TakeOlder,
+                )
+                .with_context(|| format!("failed to resolve EC finalized tipset at depth {depth}"))?
         } else {
             let ec_finality_epoch =
                 (head.epoch() - ctx.chain_config().policy.chain_finality).max(0);
             ctx.chain_index()
                 .tipset_by_height(ec_finality_epoch, head, ResolveNullTipset::TakeOlder)
-                .ok()
-                .flatten()
+                .with_context(|| format!("failed to resolve EC finalized tipset at epoch {ec_finality_epoch}"))?
         };

As per coding guidelines: **/*.rs: Use anyhow::Result<T> for most operations and add context with .context() when errors occur.

Also applies to: 1235-1248

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/chain.rs` around lines 1202 - 1216, The loop currently
swallows storage errors by using "if let Ok(parent)" and .ok().flatten(),
causing ChainGetTipSetFinalityStatus (and helper code paths like safe and
finalized) to return fallback tipsets instead of propagating load failures;
change these to propagate errors with anyhow::Result and add context: replace
the "if let Ok(parent) = ctx.chain_index().load_required_tipset(ts.parents()) {
... } else { break; }" pattern with a fallible load that uses ? (or
map_err(...).context(...)?), e.g. call
ctx.chain_index().load_required_tipset(ts.parents()).context("loading parent
tipset for ChainGetTipSetFinalityStatus")? and handle the result accordingly so
storage failures bubble up; apply the same treatment to the similar block around
lines 1235-1248 (and any .ok().flatten() uses in this function) so errors are
surfaced instead of suppressed.
♻️ Duplicate comments (1)
src/chain/store/index.rs (1)

121-125: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

tipset_by_height still has no clean-miss path.

Line 212 still walks with from.chain(&self.db).tuple_windows(), and the only unresolved exit at Lines 230-232 is Err(...), so Result<Option<Tipset>, _> never actually produces Ok(None). That means callers in this PR still can't observe the new miss-vs-error split from #6755. Either keep the old non-optional signature or switch this to an explicit parent-load loop so missing ancestors stay Err and a proven miss can return Ok(None).

Also applies to: 162-233

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/rpc/methods/chain.rs`:
- Around line 1202-1216: The loop currently swallows storage errors by using "if
let Ok(parent)" and .ok().flatten(), causing ChainGetTipSetFinalityStatus (and
helper code paths like safe and finalized) to return fallback tipsets instead of
propagating load failures; change these to propagate errors with anyhow::Result
and add context: replace the "if let Ok(parent) =
ctx.chain_index().load_required_tipset(ts.parents()) { ... } else { break; }"
pattern with a fallible load that uses ? (or map_err(...).context(...)?), e.g.
call ctx.chain_index().load_required_tipset(ts.parents()).context("loading
parent tipset for ChainGetTipSetFinalityStatus")? and handle the result
accordingly so storage failures bubble up; apply the same treatment to the
similar block around lines 1235-1248 (and any .ok().flatten() uses in this
function) so errors are surfaced instead of suppressed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 14a65015-54f6-456a-a482-9641c7fe7033

📥 Commits

Reviewing files that changed from the base of the PR and between 487c1a1 and f111bb7.

📒 Files selected for processing (2)
  • src/chain/store/index.rs
  • src/rpc/methods/chain.rs

@sudo-shashank sudo-shashank force-pushed the shashank/refactor-tipset-lookup branch from f111bb7 to 39c0377 Compare May 6, 2026 09:11
@sudo-shashank sudo-shashank force-pushed the shashank/refactor-tipset-lookup branch from 39c0377 to 878bb44 Compare May 6, 2026 09:12
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/rpc/methods/chain.rs (1)

1235-1250: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Storage/IO errors silently swallowed in first tipset_by_height call.

The let Ok(Some(ts)) = ... pattern treats Err(...) the same as Ok(None), causing genuine storage/IO errors to fall through to the else branch instead of propagating. This contradicts the PR objective of ensuring fast-fail behavior for real errors.

Consider separating the error case from the cache-miss case:

Proposed fix to propagate errors while preserving fallback for cache miss
-        let finalized = if depth >= 0
-            && let Ok(Some(ts)) = ctx.chain_index().tipset_by_height(
+        let finalized = if depth >= 0 {
+            match ctx.chain_index().tipset_by_height(
                 (head.epoch() - depth).max(0),
                 head.shallow_clone(),
                 ResolveNullTipset::TakeOlder,
-            ) {
-            Some(ts)
+            )? {
+                Some(ts) => Some(ts),
+                None => {
+                    let ec_finality_epoch =
+                        (head.epoch() - ctx.chain_config().policy.chain_finality).max(0);
+                    ctx.chain_index().tipset_by_height(
+                        ec_finality_epoch,
+                        head,
+                        ResolveNullTipset::TakeOlder,
+                    )?
+                }
+            }
         } else {
             let ec_finality_epoch =
                 (head.epoch() - ctx.chain_config().policy.chain_finality).max(0);
             ctx.chain_index().tipset_by_height(
                 ec_finality_epoch,
                 head,
                 ResolveNullTipset::TakeOlder,
             )?
         };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/chain.rs` around lines 1235 - 1250, The current let
Ok(Some(ts)) = ctx.chain_index().tipset_by_height(...) pattern silently treats
Err as Ok(None) and lets storage/IO errors fall through; change this to
explicitly match the Result from ctx.chain_index().tipset_by_height (or use let
tmp = ...? if you want to propagate immediately) so that Err is propagated
(using ? or returning the error) while still distinguishing Ok(Some(ts)) (use
that for finalized) and Ok(None) (perform the fallback that computes
ec_finality_epoch and calls ctx.chain_index().tipset_by_height(...) with
ResolveNullTipset::TakeOlder); specifically update the logic around finalized,
head, depth, ctx.chain_index().tipset_by_height and ResolveNullTipset::TakeOlder
to handle Err vs Ok(None) separately.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/rpc/methods/chain.rs`:
- Around line 1235-1250: The current let Ok(Some(ts)) =
ctx.chain_index().tipset_by_height(...) pattern silently treats Err as Ok(None)
and lets storage/IO errors fall through; change this to explicitly match the
Result from ctx.chain_index().tipset_by_height (or use let tmp = ...? if you
want to propagate immediately) so that Err is propagated (using ? or returning
the error) while still distinguishing Ok(Some(ts)) (use that for finalized) and
Ok(None) (perform the fallback that computes ec_finality_epoch and calls
ctx.chain_index().tipset_by_height(...) with ResolveNullTipset::TakeOlder);
specifically update the logic around finalized, head, depth,
ctx.chain_index().tipset_by_height and ResolveNullTipset::TakeOlder to handle
Err vs Ok(None) separately.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 40b2a2e1-f45f-4091-9918-b92aeb1a0ca5

📥 Commits

Reviewing files that changed from the base of the PR and between f111bb7 and 878bb44.

📒 Files selected for processing (2)
  • src/chain/store/index.rs
  • src/rpc/methods/chain.rs

Comment thread src/chain/store/index.rs
ResolveNullTipset::TakeNewer => return Ok(Some(child)),
}
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should return None at line 330

Comment thread src/interpreter/fvm3.rs
)
.context("Failed to get tipset cid")?;
)?
.with_context(|| format!("tipset not found at epoch {epoch}"))?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use load_required_xx to avoid repetitive with_context(|| format!("tipset not found at epoch {epoch}"))?;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

RPC requires calibnet RPC checks to run on CI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

refactor: make tipset_by_height and load_child_tipset return anyhow::Result<Option<Tipset>> to distinguish error from miss

2 participants