L-02 Fix false clearing of finality violation error in case of temporary … #429

Merged
dhaidashenko merged 16 commits into develop from
fix/PLEX-2780-L-02-clearing-finality-violation-on-temp-errs
May 6, 2026

@dhaidashenko
Contributor

Problem

Possible Clearing of finalityViolated Masks Unresolved Finality Violations

The PollAndSaveLogs function clears the finalityViolated latch whenever pollAndSaveLogs returns nil, but that helper returns nil on several non-success paths: failed latestBlocks or latestSafeBlock reads, failed InsertLogsWithBlocks writes, and even when there are no new blocks to process. As a result, after a genuine finality violation has been latched, a later soft failure can incorrectly clear that latch, emit a misleading "completed successfully" log, and cause Healthy to report no error even though no successful recovery poll has actually occurred.

The interaction between the backup and primary pollers can amplify this behaviour. Both run in the same select loop, and when a finality violation is detected by BackupPollAndSaveLogs and latched via finalityViolated, the primary poller (which runs far more frequently) may clear it on its very next tick. Thus, there is a risk of persistent oscillation where the true error state is visible only in brief windows and Healthy reports healthy for the majority of the time, even though the underlying violation has not been resolved.

Fix

Clear the finality violation error only after we have successfully confirmed that the DB state matches the RPC.

@dhaidashenko dhaidashenko requested a review from a team as a code owner April 17, 2026 11:02
@github-actions
Contributor

✅ API Diff Results - No breaking changes


@mchain0 mchain0 requested a review from pavel-raykov April 17, 2026 11:27
@mchain0

mchain0 commented Apr 17, 2026

It's a PR between two feature branches, so I'm not entirely sure you need my (or anyone's) review. Added @pavel-raykov since he probably has more context on this codebase.

@dhaidashenko dhaidashenko changed the title Fix false clearing of finality violation error in case of temporary … L-02 Fix false clearing of finality violation error in case of temporary … Apr 17, 2026
@pavel-raykov pavel-raykov requested a review from ilija42 April 17, 2026 13:39
ilija42 previously approved these changes Apr 17, 2026
Base automatically changed from fix/PLEX-2778-M-02-Replay-Reorg-Detection-Bypassed-When-Parent-Block-Is-Missing to develop April 30, 2026 12:54
@dhaidashenko dhaidashenko dismissed ilija42’s stale review April 30, 2026 12:54

The base branch was changed.

@ilija42 ilija42 requested review from a team as code owners April 30, 2026 12:54
…ation-on-temp-errs

# Conflicts:
#	pkg/logpoller/log_poller.go
#	pkg/logpoller/log_poller_internal_test.go
#	pkg/logpoller/log_poller_test.go
#	pkg/logpoller/orm_test.go
Copilot AI review requested due to automatic review settings May 5, 2026 17:41
@github-actions
Contributor

github-actions Bot commented May 5, 2026

📊 API Diff Results

No changes detected for module github.com/smartcontractkit/chainlink-evm

@dhaidashenko dhaidashenko requested a review from ilija42 May 5, 2026 17:47
Contributor

Copilot AI left a comment

Pull request overview

This PR narrows when the log poller clears a latched finality-violation health error, moving the reset from the outer PollAndSaveLogs success path into the deeper reorg/canonical-chain verification path. It also adds regression tests around several transient-failure scenarios to confirm the latch is preserved until recovery is verified.

Changes:

  • Remove unconditional clearing of finalityViolated after any nil return from pollAndSaveLogs.
  • Clear the finality latch only after getCurrentBlockMaybeHandleReorg succeeds during polling.
  • Add internal tests for transient RPC/DB-like failures and one happy-path recovery case.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| pkg/logpoller/log_poller.go | Moves finality-latch clearing into the canonical-chain verification portion of polling. |
| pkg/logpoller/log_poller_internal_test.go | Adds targeted regression tests for latch persistence and successful clearing behavior. |

Comment thread pkg/logpoller/log_poller.go
Comment thread pkg/logpoller/log_poller_internal_test.go
@dhaidashenko dhaidashenko merged commit 5fe5dcb into develop May 6, 2026
38 checks passed
@dhaidashenko dhaidashenko deleted the fix/PLEX-2780-L-02-clearing-finality-violation-on-temp-errs branch May 6, 2026 12:50