fix(harness): make pfn_chain_stress robust on cross-FS / flaky-net hosts by keanji-x · Pull Request #720 · Galxe/gravity-sdk

keanji-x · 2026-05-18T02:14:20Z

Three small defensive fixes for failures hit while reproducing a mempool issue with the pfn_chain_stress harness on a host where /tmp is tmpfs and the repo lives on a separate device, and where GitHub fetches occasionally hiccup. None of these change the runtime behavior of the binary under test; they only affect the harness's resilience to developer-machine variations.

1. `cluster/deploy.sh` — cross-device hardlink

gravity_node is hardlinked into each node's /tmp/gravity-cluster-pfn-*/<node>/bin/. When /tmp and target/ are on different filesystems the ln -f fails with "Invalid cross-device link" and tears down the whole setup.

Fall back to cp -f when the hardlink fails. This matches what the gravity_cli path a few lines up already does (and what the inline comment claims).
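For readers without the diff at hand, the fallback can be sketched roughly as below. This is a minimal illustration, not the actual change in `cluster/deploy.sh`: the helper name `link_or_copy` and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not taken from deploy.sh): try a hardlink first and
# fall back to a plain copy if the link fails, for example when source and
# destination live on different filesystems.
link_or_copy() {
    local src="$1" dst="$2"
    # A command tested by `if` does not trigger `set -e`, so a failed
    # `ln` falls through to `cp` instead of aborting the script.
    if ! ln -f "$src" "$dst" 2>/dev/null; then
        cp -f "$src" "$dst"
    fi
}
```

The trade-off is that a copy uses extra space per node, while a hardlink shares one inode. On a same-filesystem host the `ln` branch still succeeds, so behavior there is unchanged.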

2. `cluster/genesis.sh` — flaky-network git fetch

Every run does git fetch origin + git pull origin <ref> on external/gravity_chain_core_contracts. A transient network error there kills the whole stress run.

Demote both to warnings — the local working copy is usually already at the right ref, and a stale local copy is strictly better than a hard failure for this workflow.
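One way to demote a failing command to a warning under `set -e` is sketched below. The helper name `warn_on_failure` and the warning text are assumptions for illustration. The real change in `cluster/genesis.sh` may be written differently.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: run the given command and, if it fails, print a
# warning to stderr instead of letting `set -e` abort the script.
warn_on_failure() {
    if ! "$@"; then
        echo "WARN: '$*' failed; continuing with the local working copy" >&2
    fi
}

# Assumed usage (the ref variable is a placeholder, not from the source):
#   warn_on_failure git -C external/gravity_chain_core_contracts fetch origin
#   warn_on_failure git -C external/gravity_chain_core_contracts pull origin "$ref"
```

The helper always returns success, so a stale checkout goes unnoticed unless someone reads the warning. That matches the stated trade-off for this workflow.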

3. `regression/pfn_chain_stress/run.sh` — wait-for-chain set -e brittleness

The wait-for-chain loop does bn=$(curl ... | sed ...) under set -euo pipefail. The first probe typically lands while node1's RPC is still starting; curl exits 7, pipefail propagates it through the command substitution, and set -e kills the script before the chain is even up.

Wrap curl in { ... || true; } so the wait loop actually waits.
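The failure mode and the fix can be illustrated with the sketch below. The function names, the `eth_blockNumber` JSON-RPC call, and the `sed` expression are assumptions for illustration. The source only shows `bn=$(curl ... | sed ...)`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical probe: print the block number reported by the RPC endpoint,
# or nothing at all if the endpoint is not reachable yet.
probe_block_number() {
    local url="$1"
    # `|| true` inside the group hides curl's non-zero exit (7 when the
    # connection is refused) from pipefail, so the pipeline exits 0.
    { curl -s --max-time 2 -X POST -H 'Content-Type: application/json' \
        --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
        "$url" || true; } \
        | sed -n 's/.*"result":"\(0x[0-9a-fA-F]*\)".*/\1/p'
}

# Hypothetical wait loop: poll once per second, up to `attempts` times.
wait_for_chain() {
    local url="$1" attempts="$2" bn
    for _ in $(seq "$attempts"); do
        bn=$(probe_block_number "$url")
        if [ -n "$bn" ]; then
            echo "$bn"
            return 0
        fi
        sleep 1
    done
    return 1
}
```

Without the `{ ... || true; }` group, the first refused connection would make the command substitution fail and `set -e` would end the script on the `bn=` assignment. With it, an empty `bn` simply means "not up yet".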

Test plan

- Ran ./run.sh pfn3 --clean end-to-end on a host where all three failure modes were hit; the harness now completes cleanly through bench + TPS analysis.
- Verify no regression on a 'pristine' host where /tmp and target/ are on the same FS (no functional change there).

🤖 Generated with Claude Code

Three small defensive fixes hit while reproducing a mempool issue with the pfn_chain_stress harness on a host where /tmp is tmpfs and the repo lives on a separate device, and where GitHub fetches occasionally hiccup.

1. cluster/deploy.sh: gravity_node binary is hardlinked into each node's /tmp/gravity-cluster-pfn-*/<node>/bin/. When /tmp and target/ are on different filesystems the `ln -f` fails with "Invalid cross-device link" and tears down the whole setup. Fall back to `cp -f`, which matches what the gravity_cli path already does a few lines up.

2. cluster/genesis.sh: every run does `git fetch origin` + `git pull origin <ref>` on the external contracts repo. A transient network error there kills the whole stress run. Demote both to warnings: the local working copy is usually already at the right ref and a stale local copy is strictly better than a hard failure for this workflow.

3. regression/pfn_chain_stress/run.sh: the wait-for-chain loop does `bn=$(curl ... | sed ...)` under `set -euo pipefail`. The first probe typically lands while node1's RPC is still starting; curl exits 7, pipefail propagates it through the command substitution, and set -e kills the script before the chain is even up. Wrap curl in `{ ... || true; }` so the wait loop actually waits.

None of these change runtime behavior of the binary under test; they only affect the harness's resilience on developer-machine variations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-07-03T02:17:00Z

This PR is stale because it has been open 45 days with no activity. Remove the stale label, comment or push a commit - otherwise this will be closed in 15 days.

github-actions Bot added the Stale label Jul 3, 2026

keanji-x closed this Jul 10, 2026

keanji-x wants to merge 1 commit into Galxe:main from keanji-x:kj/fix-pfn-stress-harness
