
feat(infra): add HPMN backup workflow #740

Merged
vivekgsharma merged 7 commits into v1.0-dev from vivek/testnet-backup on Apr 15, 2026

Conversation

vivekgsharma (Collaborator) commented Apr 14, 2026

Issue being fixed or feature implemented

Implements the first HP masternode backup workflow for testnet as follow-up work for infra issue #25.

This change is scoped to hp_masternodes only and is intended to provide a repeatable backup path for the runtime state that matters most for HPMN recovery, without relying on full-instance snapshots or
modifying the existing deploy.yml flow.

Related context:

  • HP masternode backup strategy for testnet #25
  • EBS snapshot audit / cleanup #24

What was done?

Added a dedicated HPMN backup implementation under Ansible, plus documentation and a GitHub Actions workflow scaffold.

Changes include:

  • added ansible/roles/hpmn_backup
  • added ansible/hpmn_backup_install.yml for one-time backup tooling installation on HPMNs
  • added ansible/hpmn_backup_run.yml for on-demand backup execution
  • added .github/workflows/hpmn-backup.yml for workflow_dispatch-based triggering
  • added docs/hpmn-backup.md

The backup flow is isolated from normal provisioning and does not require using ./bin/deploy -p.

The backup currently collects the following runtime state on each HP masternode:

  • /home/dashmate/.dashmate/config.json
  • /home/dashmate/.dashmate/<network>/platform/drive/tenderdash
  • /home/dashmate/.dashmate/<network>/platform/gateway/ssl
  • /var/lib/docker/volumes/dashmate_<network>_drive_tenderdash/_data
  • /var/lib/docker/volumes/dashmate_<network>_core_data/_data/.dashcore/testnet3/llmq

The backup archive is created locally on the HPMN and uploaded directly to S3 with stamped object paths using:

  • network
  • hostname
  • UTC timestamp
  • trigger label
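As a sketch, a stamped object key could be assembled from those four fields like this; the variable names here are illustrative, not the exact ones used by backup-hpmn.sh.j2:

```shell
# Illustrative assembly of a stamped S3 object key from network, hostname,
# UTC timestamp, and trigger label. Variable names are ours, not the script's.
network="testnet"
host="${HOSTNAME:-$(hostname)}"
stamp="$(date -u +%Y%m%dT%H%M%SZ)"   # UTC timestamp, e.g. 20260414T121720Z
label="manual"                       # trigger label supplied by the workflow
s3_key="hpmn-backups/${network}/${host}/${stamp}_${label}.tar.gz"
echo "$s3_key"

# Upload with server-side encryption, matching the AES256 objects observed
# during testing (path below is a placeholder, not the real archive):
# aws s3 cp "/var/tmp/hpmn-backup.tar.gz" \
#   "s3://dash-testnet-hpmns-backups/${s3_key}" --sse AES256
```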

This implementation is designed to be non-disruptive:

  • no reboot
  • no service stop
  • no dashmate reset
  • no container restart

How Has This Been Tested?

Functional testing was performed against live testnet infrastructure on hp-masternode-1.

Read-only verification on hp-masternode-1 confirmed the presence of:

  • Dashmate config
  • Tenderdash host config
  • gateway SSL material
  • Core llmq
  • Tenderdash named Docker volume

It also confirmed that priv_validator_key.json / priv_validator_state.json were not present on the host or visible in the running container during inspection.

Live validation performed:

  1. Ran ansible/hpmn_backup_install.yml against hp-masternode-1
  2. Triggered one backup run and verified upload to S3
  3. Inspected the uploaded archive manifest
  4. Expanded backup scope to include the Tenderdash named volume
  5. Reinstalled updated backup script on hp-masternode-1
  6. Deleted the first test object from S3
  7. Reran the backup and revalidated the resulting archive
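The revalidation in step 7 can be sketched locally: every path recorded as "included" in manifest.txt must actually appear in the tarball. The staged layout below mirrors the described archive format, but the paths are stand-ins:

```shell
# Build a throwaway archive in the manifest.txt + payload/ layout, then
# check that each "included" manifest entry is present in the tar listing.
work="$(mktemp -d)"
mkdir -p "$work/stage/payload/home/dashmate/.dashmate"
echo '{}' > "$work/stage/payload/home/dashmate/.dashmate/config.json"
echo 'included static /home/dashmate/.dashmate/config.json' > "$work/stage/manifest.txt"
tar -czf "$work/backup.tar.gz" -C "$work/stage" manifest.txt payload

listing="$(tar -tzf "$work/backup.tar.gz")"
status=OK
while read -r state _category path; do
  [ "$state" = "included" ] || continue
  echo "$listing" | grep -q "payload${path}" || status="MISSING:${path}"
done < "$work/stage/manifest.txt"
echo "$status"
```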

Validated S3 bucket:

  • s3://dash-testnet-hpmns-backups

Validated uploaded test object:

  • s3://dash-testnet-hpmns-backups/hpmn-backups/testnet/hp-masternode-1/20260414T121720Z_second-test.tar.gz

Observed result:

  • upload completed successfully
  • resulting archive size was ~1.62 GB
  • S3 object stored with AES256 server-side encryption
  • archive manifest confirmed inclusion of:
    • config.json
    • Tenderdash host config dir
    • gateway SSL dir
    • Tenderdash Docker volume _data
    • Core llmq

Breaking Changes

None.

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation

Summary by CodeRabbit

  • New Features

    • Manual backup and restore workflows for HP masternodes with network/host targeting, batching, optional tooling install, S3 upload (AES256 or KMS), and single-host finalize step
    • On-host install/run actions and scripts to package/upload backups and perform restores, emitting backup metadata (host, id, S3 URI)
  • Documentation

    • Detailed runbooks for backup, restore, and single-node recovery with examples and remediation steps
  • Bug Fixes

    • Fixed user/group handling and tightened guard conditions
  • Chores

    • Added ignore pattern for local tooling and a small package to setup tasks


coderabbitai Bot commented Apr 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fb78d389-bab2-453a-9876-c7ac7d6b3ab6

📥 Commits

Reviewing files that changed from the base of the PR and between 0ef9f1d and cb50fd0.

📒 Files selected for processing (2)
  • ansible/roles/hpmn_backup/tasks/run.yml
  • ansible/roles/hpmn_restore/tasks/run.yml
🚧 Files skipped from review as they are similar to previous changes (2)
  • ansible/roles/hpmn_restore/tasks/run.yml
  • ansible/roles/hpmn_backup/tasks/run.yml

📝 Walkthrough

Walkthrough

Adds manual HP masternode backup and restore: GitHub Actions workflows, Ansible playbooks/roles (install/run/restore/finalize), backup & restore shell templates, role defaults/tasks, docs, a small deploy apt tweak, and a dashmate tasks conditional fix.

Changes

  • GitHub Actions workflows (.github/workflows/hpmn-backup.yml, .github/workflows/hpmn-restore.yml):
    New manual workflow_dispatch workflows for backup and restore. They accept inputs, set env/secrets (AWS/S3), install pinned controller deps, manage SSH keys, validate network/inventory and S3 params (including SSE/KMS logic), clone network configs, and run Ansible playbooks with optional install steps.
  • Top-level Ansible playbooks (ansible/hpmn_backup_install.yml, ansible/hpmn_backup_run.yml, ansible/hpmn_restore_install.yml, ansible/hpmn_restore_run.yml, ansible/hpmn_restore_finalize.yml):
    New playbooks to install tooling and run backup/restore flows. Backup plays target hp_masternodes with configurable serial; restore run/finalize assert a single host and invoke role tasks_from: run.
  • hpmn_backup role, defaults & tasks (ansible/roles/hpmn_backup/defaults/main.yml, ansible/roles/hpmn_backup/tasks/main.yml, ansible/roles/hpmn_backup/tasks/run.yml):
    Defaults for script/staging paths, network name, S3 prefix/SSE/KMS, and candidate/search paths. Tasks install awscli/rsync, create the staging dir, deploy the backup script, resolve and validate S3 params, then execute the backup script with env vars and register/output results.
  • hpmn_backup role, template (ansible/roles/hpmn_backup/templates/backup-hpmn.sh.j2):
    Templated backup script that discovers static/volume/runtime artifacts, writes manifest.txt, stages the payload, creates a tarball, uploads it to S3 with optional AES256 or aws:kms SSE (with KMS key validation), and prints BACKUP_HOST, BACKUP_ID, and BACKUP_S3_URI.
  • hpmn_restore role, defaults & tasks (ansible/roles/hpmn_restore/defaults/main.yml, ansible/roles/hpmn_restore/tasks/main.yml, ansible/roles/hpmn_restore/tasks/run.yml):
    Defaults for restore script/staging/network/home/start-services and AWS region. Tasks install awscli/rsync, create the staging dir, deploy the restore script, resolve and validate the S3 URI/region, run the restore script, and output logs.
  • hpmn_restore role, template (ansible/roles/hpmn_restore/templates/restore-hpmn.sh.j2):
    Restore script template that downloads and extracts the S3 archive, validates manifest.txt, rsyncs included payload entries into their destinations, optionally applies ownership mappings and restarts services, and prints restore metadata.
  • Docs (docs/hpmn-backup.md, docs/hpmn-recovery.md):
    New documentation describing the backup workflow, archive contents, S3 naming and SSE/KMS guidance, restore constraints, and a detailed single-node recovery runbook with operator steps and failure modes.
  • Misc Ansible tweak (ansible/deploy.yml):
    Added acl to the apt package list in pre_tasks.
  • Other Ansible fixes (ansible/roles/dashmate/tasks/main.yml):
    Adjusted the condition for adding the dashmate user to the docker group and tightened guards to require dashmate_config_result.rc is defined before checking the return code.
  • Repo housekeeping (.gitignore):
    Added .codex to .gitignore.
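For orientation, the manual trigger described for the backup workflow might be shaped roughly like this; the input names below are illustrative assumptions, not copied from the actual hpmn-backup.yml:

```yaml
# Hypothetical workflow_dispatch entry point (input names are assumptions).
on:
  workflow_dispatch:
    inputs:
      network:
        description: Target network (e.g. testnet)
        required: true
      trigger_label:
        description: Label stamped into the S3 object name
        required: false
        default: manual
      install_backup_tooling:
        description: Run the one-time install playbook before the backup
        type: boolean
        default: false
```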

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant GHA as "GitHub Actions\n(hpmn-backup / hpmn-restore)"
    participant Ctrl as "Ansible\nControl Machine"
    participant Host as "HP Masternode\nHost"
    participant S3 as "AWS S3\nBucket"

    User->>GHA: Trigger workflow_dispatch (network, label/uri, options)
    GHA->>Ctrl: Prepare controller (checkout, install deps, clone configs, validate inputs)
    Ctrl->>Host: (optional) Run install playbook -> deploy tooling & scripts
    Ctrl->>Host: Run run playbook -> include role run task
    Host->>Host: Execute script (discover, stage, manifest, tar)
    Host->>S3: Upload tarball (AES256 or aws:kms)
    S3-->>Host: Upload complete
    Host-->>Ctrl: Return script stdout (BACKUP/RESTORE metadata)
    Ctrl-->>GHA: Playbook complete
    GHA-->>User: Workflow finished

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰
I hop through logs and stitch a tar,
I herd the keys and pack each jar,
Manifests neat, the archive flies,
Encrypted home beneath blue skies,
A carrot-cheer for safe goodbyes! 🥕

🚥 Pre-merge checks: ✅ 3 passed

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'feat(infra): add HPMN backup workflow' directly and clearly summarizes the main change: introduction of a new HPMN backup workflow infrastructure component. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |




coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/hpmn-backup.yml:
- Line 42: The workflow currently disables SSH host verification via
ANSIBLE_HOST_KEY_CHECKING and uses StrictHostKeyChecking no which accepts any
host key; instead seed the runner's SSH known_hosts with pinned host key
fingerprints for the targets and github.com before any SSH/Ansible steps.
Replace the ANSIBLE_HOST_KEY_CHECKING/StrictHostKeyChecking no usage by adding a
step that adds verified host keys (via ssh-keyscan or a fixed fingerprint entry)
into ~/.ssh/known_hosts and ensure the Ansible/ssh steps reference that
known_hosts (and remove ANSIBLE_HOST_KEY_CHECKING/StrictHostKeyChecking
overrides). Target symbols to change: ANSIBLE_HOST_KEY_CHECKING,
StrictHostKeyChecking (ssh options), and the workflow steps that call
ssh-keyscan/known_hosts population around where github.com and remote hosts are
used.

In `@ansible/roles/hpmn_backup/tasks/run.yml`:
- Line 7: The resolved SSE mode assignment hpmn_backup_sse_mode_resolved
currently applies default(hpmn_backup_s3_sse_mode,
lookup('env','HPMN_BACKUP_S3_SSE_MODE')) in the wrong order so the role default
always wins; change it to use lookup('env','HPMN_BACKUP_S3_SSE_MODE') first and
then default to the role var hpmn_backup_s3_sse_mode (i.e.,
hpmn_backup_sse_mode_resolved: "{{ lookup('env', 'HPMN_BACKUP_S3_SSE_MODE') |
default(hpmn_backup_s3_sse_mode, true) }}") so the environment variable
overrides the role default and preserves consistent SSE/KMS behavior.

In `@ansible/roles/hpmn_backup/templates/backup-hpmn.sh.j2`:
- Around line 70-72: The backup script currently uses cp -a on live paths (see
use of cp -a "$path" "$payload_dir" and record_path), which can produce
inconsistent, non-restorable backups for active Tendermint/Tenderdash and Dash
Core volumes; update the template to quiesce or snapshot those services before
copying: add logic that either (a) stops or pauses the relevant
services/containers (referencing their service names used in your
playbooks/defaults) and waits for clean shutdown, or (b) creates a
filesystem-level consistent snapshot (LVM, ZFS, or cloud disk snapshot) of the
underlying volume and copy from the snapshot path, then resume/start services;
ensure the script records which method was used and only runs cp/rsync from the
quiesced or snapshot path before calling record_path so uploaded archives are
recovery-grade.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ccd9a15d-27c3-4ee2-b195-a288c106c87b

📥 Commits

Reviewing files that changed from the base of the PR and between 555e290 and 727872f.

📒 Files selected for processing (8)
  • .github/workflows/hpmn-backup.yml
  • ansible/hpmn_backup_install.yml
  • ansible/hpmn_backup_run.yml
  • ansible/roles/hpmn_backup/defaults/main.yml
  • ansible/roles/hpmn_backup/tasks/main.yml
  • ansible/roles/hpmn_backup/tasks/run.yml
  • ansible/roles/hpmn_backup/templates/backup-hpmn.sh.j2
  • docs/hpmn-backup.md

AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_REGION: ${{ secrets.AWS_REGION }}
ANSIBLE_HOST_KEY_CHECKING: "false"

⚠️ Potential issue | 🟠 Major

Don't disable SSH host verification for this workflow.

Line 42 and Lines 77-87 make the runner trust any host key for both Ansible and github.com. A spoofed repo endpoint or target host would then receive the backup commands and the AWS credentials that ansible/roles/hpmn_backup/tasks/run.yml forwards into the remote environment. Please seed known_hosts with pinned fingerprints instead of using StrictHostKeyChecking no / ANSIBLE_HOST_KEY_CHECKING=false.

Also applies to: 77-87
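One way to implement the suggested fix is to seed known_hosts with pinned entries instead of disabling verification. A minimal sketch, with placeholder key material; in CI you would capture real entries once with `ssh-keyscan -t ed25519 <host>` and pin them rather than trusting whatever key each run happens to see:

```shell
# Pin host keys into known_hosts so ssh/Ansible verify against them by
# default. The key strings below are placeholders, not real keys.
KNOWN_HOSTS="${KNOWN_HOSTS:-$HOME/.ssh/known_hosts}"
mkdir -p "$(dirname "$KNOWN_HOSTS")"

pin_host() {  # pin_host <host> <key-type> <base64-key>: append entry once
  entry="$1 $2 $3"
  grep -qxF "$entry" "$KNOWN_HOSTS" 2>/dev/null || echo "$entry" >> "$KNOWN_HOSTS"
}

pin_host github.com ssh-ed25519 AAAAexamplekeyone
pin_host hp-masternode-1.example.net ssh-ed25519 AAAAexamplekeytwo
# With known_hosts seeded, drop ANSIBLE_HOST_KEY_CHECKING=false and the
# StrictHostKeyChecking no option entirely.
```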


Comment thread ansible/roles/hpmn_backup/templates/backup-hpmn.sh.j2
Comment on lines +70 to +72
if [[ -e "$path" ]]; then
  cp -a --parents "$path" "$payload_dir"
  record_path "included" "$category" "$path"

⚠️ Potential issue | 🟠 Major

cp -a on live node state won't give you a recovery-grade backup.

This helper is later fed active Tenderdash and Dash Core volume paths from ansible/roles/hpmn_backup/defaults/main.yml, so it can copy files from different points in time while the containers are still writing. The upload can succeed and still produce an archive that is not restorable. Please quiesce those services or snapshot/export the underlying volume before copying database/state directories.
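A minimal sketch of the quiesce-then-copy pattern this comment asks for; here `pause_cmd`/`resume_cmd` stand in for `docker pause`/`docker unpause` (or a snapshot create/release step), and none of the names or paths come from the PR:

```shell
# Copy a path only while its writers are frozen, then record how the path
# was frozen so restores can judge consistency. pause_cmd/resume_cmd are
# placeholders for real quiesce/snapshot commands.
copy_quiesced() {  # copy_quiesced <pause_cmd> <resume_cmd> <src> <payload_dir> <manifest>
  pause_cmd="$1"; resume_cmd="$2"; src="$3"; payload_dir="$4"; manifest="$5"
  $pause_cmd || return 1            # freeze writers before reading state
  cp -a --parents "$src" "$payload_dir"
  rc=$?
  $resume_cmd                       # always release, even if the copy failed
  [ "$rc" -eq 0 ] || return "$rc"
  echo "included runtime $src consistency=paused" >> "$manifest"
}

# Demo against throwaway paths (stand-ins for a live Docker volume):
workdir="$(mktemp -d)"
mkdir -p "$workdir/data" "$workdir/payload"
echo "state" > "$workdir/data/app.db"
copy_quiesced true true "$workdir/data/app.db" "$workdir/payload" "$workdir/manifest.txt"
cat "$workdir/manifest.txt"
```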


coderabbitai Bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ansible/roles/dashmate/tasks/main.yml (1)

129-136: ⚠️ Potential issue | 🟡 Minor

Use {{ dashmate_user }} in the docker-group task.

Line 131 still targets the literal dashmate account, so any override of dashmate_user creates one user but adds a different user to the docker group. That leaves the actual service account unable to manage Docker.

♻️ Proposed fix
 - name: Add dashmate user to docker group
   ansible.builtin.user:
-    name: 'dashmate'
+    name: '{{ dashmate_user }}'
     groups: docker
     append: true
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/roles/dashmate/tasks/main.yml` around lines 129 - 136, The task "Add
dashmate user to docker group" currently hardcodes name: 'dashmate' — change it
to use the dashmate_user variable ({{ dashmate_user }}) so the user added to the
docker group matches any overridden account; update the ansible.builtin.user
entry for name accordingly while keeping groups: docker, append: true and the
existing when conditions (not (skip_dashmate_image_update | default(false)) and
dashmate_user_exists).
♻️ Duplicate comments (1)
ansible/roles/hpmn_backup/templates/backup-hpmn.sh.j2 (1)

72-74: ⚠️ Potential issue | 🟠 Major

Quiesce or snapshot stateful paths before copying.

At Line 73, cp -a reads live Tenderdash/Core volume data. That can produce a successfully uploaded but non-restorable archive due to mixed-time state. Please copy from a quiesced or snapshot source and record the consistency method in manifest.txt.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/roles/hpmn_backup/templates/backup-hpmn.sh.j2` around lines 72 - 74,
The backup currently runs cp -a on live stateful paths (the if [[ -e "$path" ]];
then cp -a --parents "$path" "$payload_dir" record_path "included" "$category"
"$path") which can produce inconsistent snapshots; before copying each stateful
path referenced by $path, ensure the service/volume is quiesced or a
filesystem/volume snapshot is created and copy from that snapshot location
instead of the live path, and update the manifest consistency entry by
passing/recording the consistency method (e.g., "quiesced" or
"snapshot:<snapshot-id>") via the record_path invocation or by appending to
manifest.txt so manifest.txt records how each $path was frozen for
restoreability.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/hpmn-restore.yml:
- Around line 103-140: Add a preflight step before any ansible-playbook runs
that verifies the workflow input TARGET_HOST resolves to exactly one host from
the hp_masternodes group in the inventory networks/${NETWORK_NAME}.inventory;
implement this by using Ansible’s inventory listing to filter hp_masternodes by
the LIMIT (TARGET_HOST), count the resulting hosts, and fail the job unless the
count is exactly 1 (so the subsequent steps with --limit "${TARGET_HOST}" cannot
no-op on zero hosts). Ensure the step references TARGET_HOST, hp_masternodes,
and networks/${NETWORK_NAME}.inventory so reviewers can find it easily and make
the workflow exit non-zero with a clear error if the count is not 1.

In `@ansible/roles/hpmn_restore/tasks/run.yml`:
- Around line 8-30: Add a preflight check that fails fast when the restore
script is not present or not executable: use ansible.builtin.stat on the
hpmn_restore_script_path and assert that stat_result.exists and
(stat_result.mode is executable) before the "Run HP masternode restore script"
task (or replace the current "Validate restore inputs" with this extra
assertion). Reference the variable hpmn_restore_script_path and register the
stat (e.g., stat_result) so you can use an ansible.builtin.assert to abort with
a clear message when the script file is missing or not executable instead of
letting the command task hit ENOENT.

In `@ansible/roles/hpmn_restore/templates/restore-hpmn.sh.j2`:
- Around line 47-58: The manifest loop currently trusts each path and can be
tricked to escape payload_dir or write arbitrary destinations; update the loop
that reads status, category, path (and uses payload_dir and source_path) to
canonicalize and validate both source and destination before any mkdir/rsync:
compute a normalized absolute source (e.g., realpath -m
"${payload_dir}/${path}") and normalized destination (e.g., realpath -m
"${restore_root}/${path}" or ensure path is relative and join with restore
root), then ensure the source path starts with payload_dir canonical prefix and
the destination starts with the intended restore root prefix; if an included
entry's source does not exist or validation fails, log an error and exit
non-zero instead of silently skipping, otherwise proceed with mkdir and rsync
only for validated paths.

In `@docs/hpmn-backup.md`:
- Around line 145-149: The markdown link currently points to an absolute local
path (`/home/.../docs/hpmn-recovery.md`) which will break on GitHub; update the
link in the Restore section (the
`[`docs/hpmn-recovery.md`](/home/.../docs/hpmn-recovery.md)` instance) to a
repository-relative path such as `docs/hpmn-recovery.md` or `./hpmn-recovery.md`
so it resolves correctly in the repo and on GitHub.
- Around line 123-125: The docs contain inconsistent capitalization of "GitHub"
(e.g., the header "GitHub Actions Workflow" and other occurrences of "Github");
update all instances of "Github" to the correct "GitHub" spelling throughout the
document (including the header and the occurrence noted around the same
section), ensuring consistent casing everywhere.
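The canonicalization guard described for the restore template can be sketched as follows; `payload_dir`, `restore_root`, and the helper name are our placeholders, not identifiers from restore-hpmn.sh.j2:

```shell
# Canonicalize source and destination for each manifest entry and refuse
# anything that escapes its root, failing loudly instead of skipping.
validate_entry() {  # validate_entry <payload_dir> <restore_root> <relative-path>
  payload_dir="$1"; restore_root="$2"; rel="$3"
  src="$(realpath -m "${payload_dir}/${rel}")"
  dst="$(realpath -m "${restore_root}/${rel}")"
  case "$src" in "${payload_dir}"/*) ;; *) echo "bad source: $rel" >&2; return 1 ;; esac
  case "$dst" in "${restore_root}"/*) ;; *) echo "bad destination: $rel" >&2; return 1 ;; esac
  [ -e "$src" ] || { echo "missing source: $rel" >&2; return 1; }
  # Only now is it safe to mkdir/rsync this entry.
  printf '%s -> %s\n' "$src" "$dst"
}

# Demo: a valid entry passes, a traversal attempt is rejected.
payload="$(realpath "$(mktemp -d)")"
root="$(realpath "$(mktemp -d)")"
mkdir -p "$payload/etc" && echo conf > "$payload/etc/app.conf"
validate_entry "$payload" "$root" "etc/app.conf"
validate_entry "$payload" "$root" "../escape" || echo "rejected traversal"
```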


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ca38acc5-e386-4c1a-8437-200686666e6f

📥 Commits

Reviewing files that changed from the base of the PR and between 727872f and 0aad60b.

📒 Files selected for processing (18)
  • .github/workflows/hpmn-backup.yml
  • .github/workflows/hpmn-restore.yml
  • ansible/deploy.yml
  • ansible/hpmn_backup_install.yml
  • ansible/hpmn_backup_run.yml
  • ansible/hpmn_restore_finalize.yml
  • ansible/hpmn_restore_install.yml
  • ansible/hpmn_restore_run.yml
  • ansible/roles/dashmate/tasks/main.yml
  • ansible/roles/hpmn_backup/defaults/main.yml
  • ansible/roles/hpmn_backup/tasks/run.yml
  • ansible/roles/hpmn_backup/templates/backup-hpmn.sh.j2
  • ansible/roles/hpmn_restore/defaults/main.yml
  • ansible/roles/hpmn_restore/tasks/main.yml
  • ansible/roles/hpmn_restore/tasks/run.yml
  • ansible/roles/hpmn_restore/templates/restore-hpmn.sh.j2
  • docs/hpmn-backup.md
  • docs/hpmn-recovery.md
✅ Files skipped from review due to trivial changes (8)
  • ansible/deploy.yml
  • ansible/hpmn_restore_install.yml
  • ansible/hpmn_backup_install.yml
  • docs/hpmn-recovery.md
  • ansible/roles/hpmn_restore/defaults/main.yml
  • .github/workflows/hpmn-backup.yml
  • ansible/roles/hpmn_backup/defaults/main.yml
  • ansible/hpmn_backup_run.yml
🚧 Files skipped from review as they are similar to previous changes (1)
  • ansible/roles/hpmn_backup/tasks/run.yml

Comment on lines +103 to +140
      - name: Validate restore config
        run: |
          test -f "networks/${NETWORK_NAME}.inventory"
          test -f "networks/${NETWORK_NAME}.yml"
          test -n "$HPMN_RESTORE_S3_URI"
          test -n "$AWS_REGION"

      - name: Install restore tooling on target host
        if: ${{ inputs.install_restore_tooling }}
        run: |
          ansible-playbook \
            -i "networks/${NETWORK_NAME}.inventory" \
            ansible/hpmn_restore_install.yml \
            -e "@networks/${NETWORK_NAME}.yml" \
            -e "dash_network_name=${NETWORK_NAME}" \
            --limit "${TARGET_HOST}"

      - name: Restore target host from S3 backup
        run: |
          ansible-playbook \
            -i "networks/${NETWORK_NAME}.inventory" \
            ansible/hpmn_restore_run.yml \
            -e "@networks/${NETWORK_NAME}.yml" \
            -e "dash_network_name=${NETWORK_NAME}" \
            -e "hpmn_restore_s3_uri=${HPMN_RESTORE_S3_URI}" \
            -e "hpmn_restore_start_services=${{ inputs.start_services }}" \
            --limit "${TARGET_HOST}"

      - name: Finalize restored target host
        if: ${{ inputs.finalize_restore }}
        run: |
          ansible-playbook \
            -i "networks/${NETWORK_NAME}.inventory" \
            ansible/hpmn_restore_finalize.yml \
            -e "@networks/${NETWORK_NAME}.yml" \
            -e "dash_network_name=${NETWORK_NAME}" \
            -e "dash_network=${NETWORK_NAME}" \
            --limit "${TARGET_HOST}"

⚠️ Potential issue | 🟠 Major

Validate that target_host resolves to exactly one HP masternode before running Ansible.

The playbook-level assert only protects the “more than one host” case. If --limit "${TARGET_HOST}" matches zero hosts, Ansible can no-op and this workflow still looks successful. Add a preflight inventory check and fail unless the limit resolves to exactly one host in hp_masternodes.

🛠️ Proposed fix
       - name: Validate restore config
         run: |
           test -f "networks/${NETWORK_NAME}.inventory"
           test -f "networks/${NETWORK_NAME}.yml"
           test -n "$HPMN_RESTORE_S3_URI"
           test -n "$AWS_REGION"
+          matched_hosts="$(
+            ansible hp_masternodes \
+              -i "networks/${NETWORK_NAME}.inventory" \
+              --limit "${TARGET_HOST}" \
+              --list-hosts | tail -n +2 | sed '/^[[:space:]]*$/d' | wc -l
+          )"
+          test "$matched_hosts" -eq 1
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/hpmn-restore.yml around lines 103 - 140, Add a preflight
step before any ansible-playbook runs that verifies the workflow input
TARGET_HOST resolves to exactly one host from the hp_masternodes group in the
inventory networks/${NETWORK_NAME}.inventory; implement this by using Ansible’s
inventory listing to filter hp_masternodes by the LIMIT (TARGET_HOST), count the
resulting hosts, and fail the job unless the count is exactly 1 (so the
subsequent steps with --limit "${TARGET_HOST}" cannot no-op on zero hosts).
Ensure the step references TARGET_HOST, hp_masternodes, and
networks/${NETWORK_NAME}.inventory so reviewers can find it easily and make the
workflow exit non-zero with a clear error if the count is not 1.

Comment on lines +8 to +30
- name: Validate restore inputs
ansible.builtin.assert:
that:
- hpmn_restore_s3_uri_resolved | length > 0
- hpmn_restore_aws_region_resolved | length > 0
fail_msg: >-
HPMN restore requires HPMN_RESTORE_S3_URI or hpmn_restore_s3_uri, plus AWS_REGION.

- name: Run HP masternode restore script
ansible.builtin.command:
argv:
- "{{ hpmn_restore_script_path }}"
environment:
AWS_ACCESS_KEY_ID: "{{ lookup('env', 'AWS_ACCESS_KEY_ID') }}"
AWS_SECRET_ACCESS_KEY: "{{ lookup('env', 'AWS_SECRET_ACCESS_KEY') }}"
AWS_SESSION_TOKEN: "{{ lookup('env', 'AWS_SESSION_TOKEN') }}"
AWS_REGION: "{{ hpmn_restore_aws_region_resolved }}"
AWS_DEFAULT_REGION: "{{ hpmn_restore_aws_region_resolved }}"
HPMN_RESTORE_S3_URI: "{{ hpmn_restore_s3_uri_resolved }}"
HPMN_RESTORE_NETWORK: "{{ hpmn_restore_network_name }}"
HPMN_RESTORE_DASHMATE_HOME: "{{ hpmn_restore_dashmate_home }}"
HPMN_RESTORE_START_SERVICES: "{{ 'true' if hpmn_restore_start_services_resolved else 'false' }}"
register: hpmn_restore_run_result

⚠️ Potential issue | 🟠 Major

Fail fast if the restore script was never installed.

.github/workflows/hpmn-restore.yml defaults install_restore_tooling to false, but this role invokes {{ hpmn_restore_script_path }} unconditionally. On a fresh replacement node that becomes a late ENOENT instead of a clear preflight error.

🛠️ Proposed fix
 - name: Validate restore inputs
   ansible.builtin.assert:
     that:
       - hpmn_restore_s3_uri_resolved | length > 0
       - hpmn_restore_aws_region_resolved | length > 0
     fail_msg: >-
       HPMN restore requires HPMN_RESTORE_S3_URI or hpmn_restore_s3_uri, plus AWS_REGION.
 
+- name: Check restore script is installed
+  ansible.builtin.stat:
+    path: "{{ hpmn_restore_script_path }}"
+  register: hpmn_restore_script_stat
+
+- name: Validate restore script presence
+  ansible.builtin.assert:
+    that:
+      - hpmn_restore_script_stat.stat.exists
+      - hpmn_restore_script_stat.stat.isreg
+    fail_msg: >-
+      Restore script {{ hpmn_restore_script_path }} is missing. Run hpmn_restore_install first
+      or set install_restore_tooling=true.
+
 - name: Run HP masternode restore script
   ansible.builtin.command:
     argv:
       - "{{ hpmn_restore_script_path }}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/roles/hpmn_restore/tasks/run.yml` around lines 8 - 30, Add a
preflight check that fails fast when the restore script is not present or not
executable: use ansible.builtin.stat on the hpmn_restore_script_path and assert
that stat_result.exists and (stat_result.mode is executable) before the "Run HP
masternode restore script" task (or replace the current "Validate restore
inputs" with this extra assertion). Reference the variable
hpmn_restore_script_path and register the stat (e.g., stat_result) so you can
use an ansible.builtin.assert to abort with a clear message when the script file
is missing or not executable instead of letting the command task hit ENOENT.
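The preflight the prompt describes reduces to an exists-and-executable check on the installed script. A minimal shell sketch, assuming nothing beyond POSIX file tests — the helper name, messages, and default path are illustrative, not taken from the role:

```shell
#!/usr/bin/env bash
# Sketch of the requested preflight: refuse to run the restore unless the
# installed script exists and is executable. check_restore_script and its
# messages are illustrative, not from hpmn_restore.
check_restore_script() {
  local script_path="$1"
  if [[ ! -f "$script_path" ]]; then
    echo "restore script missing: $script_path (run hpmn_restore_install first)" >&2
    return 1
  fi
  if [[ ! -x "$script_path" ]]; then
    echo "restore script not executable: $script_path" >&2
    return 1
  fi
}
```

In the Ansible role the same logic maps onto `ansible.builtin.stat` (registering the result) followed by `ansible.builtin.assert` on `stat.exists` and `stat.executable`.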

Comment on lines +47 to +58
while IFS='|' read -r status category path; do
[[ "$status" == "included" ]] || continue
source_path="${payload_dir}${path}"

if [[ -d "$source_path" ]]; then
mkdir -p "$path"
rsync -a --delete "${source_path}/" "${path}/"
elif [[ -f "$source_path" ]]; then
mkdir -p "$(dirname "$path")"
rsync -a "$source_path" "$path"
fi
done < "$manifest_path"

⚠️ Potential issue | 🔴 Critical

Reject unexpected manifest paths before restoring as root.

Lines 47-58 trust every path in manifest.txt as both the source and destination path. Because the archive URI is operator-supplied, a crafted archive can use .. segments or out-of-scope absolute paths to escape payload_dir and overwrite arbitrary files as root. The loop also silently skips included entries that are missing from the payload, which can turn archive corruption into a partial “successful” restore.

🔒 Proposed fix
 while IFS='|' read -r status category path; do
   [[ "$status" == "included" ]] || continue
+  [[ "$path" == /* ]] || {
+    echo "Manifest path must be absolute: $path" >&2
+    exit 1
+  }
+  [[ "$path" != *"/../"* && "$path" != "../"* && "$path" != *"/.." ]] || {
+    echo "Manifest path contains parent traversal: $path" >&2
+    exit 1
+  }
+  case "$path" in
+    /home/dashmate/.dashmate/*|\
+    /var/lib/docker/volumes/dashmate_"${network_name}"_drive_tenderdash/_data*|\
+    /var/lib/docker/volumes/dashmate_"${network_name}"_core_data/_data/.dashcore/testnet3/llmq*)
+      ;;
+    *)
+      echo "Unexpected restore path in manifest: $path" >&2
+      exit 1
+      ;;
+  esac
   source_path="${payload_dir}${path}"
 
   if [[ -d "$source_path" ]]; then
     mkdir -p "$path"
     rsync -a --delete "${source_path}/" "${path}/"
   elif [[ -f "$source_path" ]]; then
     mkdir -p "$(dirname "$path")"
     rsync -a "$source_path" "$path"
+  else
+    echo "Manifest entry marked included is missing from payload: $path" >&2
+    exit 1
   fi
 done < "$manifest_path"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/roles/hpmn_restore/templates/restore-hpmn.sh.j2` around lines 47 -
58, The manifest loop currently trusts each path and can be tricked to escape
payload_dir or write arbitrary destinations; update the loop that reads status,
category, path (and uses payload_dir and source_path) to canonicalize and
validate both source and destination before any mkdir/rsync: compute a
normalized absolute source (e.g., realpath -m "${payload_dir}/${path}") and
normalized destination (e.g., realpath -m "${restore_root}/${path}" or ensure
path is relative and join with restore root), then ensure the source path starts
with payload_dir canonical prefix and the destination starts with the intended
restore root prefix; if an included entry's source does not exist or validation
fails, log an error and exit non-zero instead of silently skipping, otherwise
proceed with mkdir and rsync only for validated paths.
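The canonicalize-and-prefix-check the prompt describes can be sketched in plain bash. `realpath -m` resolves `..` segments without requiring the path to exist, which fits manifest entries for not-yet-restored files; the helper name and the example roots below are illustrative, not taken from restore-hpmn.sh.j2:

```shell
#!/usr/bin/env bash
# Sketch of the canonicalization check: resolve the candidate with
# realpath -m (handles .. segments even for nonexistent paths) and require
# the result to stay under an allowed root. is_under_root and the roots
# used in any caller are illustrative assumptions.
is_under_root() {
  local candidate="$1" root="$2" resolved
  resolved="$(realpath -m -- "$candidate")" || return 1
  [[ "$resolved" == "$root" || "$resolved" == "$root"/* ]]
}
```

Applied per manifest entry, both `${payload_dir}${path}` (against the payload root) and the destination `$path` (against the allowed restore prefixes) would be checked before any `mkdir`/`rsync`, failing the restore on the first violation.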

Comment thread docs/hpmn-backup.md
Comment on lines +123 to +125
## GitHub Actions Workflow

Use `.github/workflows/hpmn-backup.yml` for a manual `workflow_dispatch` run.

⚠️ Potential issue | 🟡 Minor

Capitalize GitHub consistently.

These references should use GitHub, not Github.

Also applies to: 192-192

🧰 Tools
🪛 LanguageTool

[uncategorized] ~125-~125: The official name of this software platform is spelled with a capital “H”.
Context: ...es ``` ## GitHub Actions Workflow Use .github/workflows/hpmn-backup.yml for a manual...

(GITHUB)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/hpmn-backup.md` around lines 123 - 125, The docs contain inconsistent
capitalization of "GitHub" (e.g., the header "GitHub Actions Workflow" and other
occurrences of "Github"); update all instances of "Github" to the correct
"GitHub" spelling throughout the document (including the header and the
occurrence noted around the same section), ensuring consistent casing
everywhere.

Comment thread docs/hpmn-backup.md
Comment on lines +145 to +149
## Restore

Restore is intentionally single-host only. Use a replacement or disposable HP masternode, not the whole group.

For the full tested step-by-step recovery procedure, including public IP cutover, replacement host provisioning, restore, finalize, verification, and troubleshooting, use [`docs/hpmn-recovery.md`](/home/vivek/code/dash-network-deploy/docs/hpmn-recovery.md).

⚠️ Potential issue | 🟡 Minor

Use a repository-relative link here.

Line 149 points at an absolute path from a local workstation, so the link will be broken in the repo and on GitHub. Link to docs/hpmn-recovery.md or ./hpmn-recovery.md instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/hpmn-backup.md` around lines 145 - 149, The markdown link currently
points to an absolute local path (`/home/.../docs/hpmn-recovery.md`) which will
break on GitHub; update the link in the Restore section (the
`[`docs/hpmn-recovery.md`](/home/.../docs/hpmn-recovery.md)` instance) to a
repository-relative path such as `docs/hpmn-recovery.md` or `./hpmn-recovery.md`
so it resolves correctly in the repo and on GitHub.

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (1)
docs/hpmn-backup.md (1)

147-151: ⚠️ Potential issue | 🟡 Minor

Fix broken link to recovery documentation.

Line 151 contains an absolute local filesystem path (/home/vivek/code/dash-network-deploy/docs/hpmn-recovery.md) that won't work in the repository or on GitHub. Use a relative path instead.

-For the full tested step-by-step recovery procedure, including public IP cutover, replacement host provisioning, restore, finalize, verification, and troubleshooting, use [`docs/hpmn-recovery.md`](/home/vivek/code/dash-network-deploy/docs/hpmn-recovery.md).
+For the full tested step-by-step recovery procedure, including public IP cutover, replacement host provisioning, restore, finalize, verification, and troubleshooting, see [`docs/hpmn-recovery.md`](./hpmn-recovery.md).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/hpmn-backup.md` around lines 147 - 151, The link in the Restore section
uses an absolute local path; update the Markdown link target to the
repository-relative path for the recovery doc (replace
`/home/vivek/code/dash-network-deploy/docs/hpmn-recovery.md` with
`docs/hpmn-recovery.md` or `./docs/hpmn-recovery.md`) so the
[`docs/hpmn-recovery.md`] link resolves correctly on GitHub; locate the link
text "docs/hpmn-recovery.md" in the Restore paragraph and change only the URL
portion to the relative path.
🧹 Nitpick comments (1)
ansible/roles/hpmn_backup/templates/backup-hpmn.sh.j2 (1)

137-159: Empty SSE mode case appears unreachable.

The empty string case (lines 138-139) won't be hit in practice since ansible/roles/hpmn_backup/tasks/run.yml validates that hpmn_backup_sse_mode_resolved in ['AES256', 'aws:kms']. The default from defaults/main.yml is also AES256. This defense-in-depth is fine, but you could simplify by removing the empty case or explicitly failing on it.

♻️ Optional: Fail explicitly on empty SSE mode
 case "$sse_mode" in
-  "")
-    ;;
   AES256)
     aws_args+=(--sse AES256)
     if [[ -n "$kms_key_id" ]]; then
       echo "HPMN_BACKUP_S3_KMS_KEY_ID must be empty when SSE mode is AES256" >&2
       exit 1
     fi
     ;;
   aws:kms)
     aws_args+=(--sse aws:kms)
     if [[ -z "$kms_key_id" ]]; then
       echo "HPMN_BACKUP_S3_KMS_KEY_ID is required when SSE mode is aws:kms" >&2
       exit 1
     fi
     aws_args+=(--sse-kms-key-id "$kms_key_id")
     ;;
   *)
     echo "Unsupported HPMN_BACKUP_S3_SSE_MODE: $sse_mode" >&2
     exit 1
     ;;
 esac
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/roles/hpmn_backup/templates/backup-hpmn.sh.j2` around lines 137 -
159, The empty string case in the case "$sse_mode" block is unreachable and
should be removed or made explicit; update the case handling around the sse_mode
switch (the "" branch that currently does nothing) to either delete that branch
entirely or replace it with an explicit failure that echoes a clear error (e.g.,
"HPMN_BACKUP_S3_SSE_MODE must be set") and exits non‑zero, keeping the AES256
and aws:kms branches and the default fallback intact; adjust any references to
aws_args if needed so behavior remains consistent.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 48cc7e0a-9f1f-41ac-a1ce-78ceef58d6a6

📥 Commits

Reviewing files that changed from the base of the PR and between 0aad60b and 4d93599.

📒 Files selected for processing (5)
  • ansible/roles/hpmn_backup/defaults/main.yml
  • ansible/roles/hpmn_backup/tasks/main.yml
  • ansible/roles/hpmn_backup/templates/backup-hpmn.sh.j2
  • docs/hpmn-backup.md
  • docs/hpmn-recovery.md
✅ Files skipped from review due to trivial changes (1)
  • docs/hpmn-recovery.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • ansible/roles/hpmn_backup/defaults/main.yml

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (1)
docs/hpmn-backup.md (1)

154-154: ⚠️ Potential issue | 🟡 Minor

Use a repository-relative link instead of a local absolute path.

The current link points to a workstation-specific path, so it will be broken in the repo and on GitHub. Please switch it to a repo-relative target (for example ./hpmn-recovery.md).

Suggested fix
-For the full tested step-by-step recovery procedure, including public IP cutover, replacement host provisioning, restore, finalize, verification, and troubleshooting, use [`docs/hpmn-recovery.md`](/home/vivek/code/dash-network-deploy/docs/hpmn-recovery.md).
+For the full tested step-by-step recovery procedure, including public IP cutover, replacement host provisioning, restore, finalize, verification, and troubleshooting, use [`docs/hpmn-recovery.md`](./hpmn-recovery.md).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/hpmn-backup.md` at line 154, The markdown link in docs/hpmn-backup.md
uses a workstation absolute path (/home/vivek/.../hpmn-recovery.md); change that
target to a repository-relative link (for example ./hpmn-recovery.md) so the
anchor
`[docs/hpmn-recovery.md](/home/vivek/code/dash-network-deploy/docs/hpmn-recovery.md)`
becomes a repo-relative reference and will resolve correctly in the repository
and on GitHub.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 425867fa-ac67-464c-a679-dd0b9526486e

📥 Commits

Reviewing files that changed from the base of the PR and between 4d93599 and 0ef9f1d.

📒 Files selected for processing (5)
  • .gitignore
  • ansible/roles/hpmn_restore/defaults/main.yml
  • ansible/roles/hpmn_restore/templates/restore-hpmn.sh.j2
  • docs/hpmn-backup.md
  • docs/hpmn-recovery.md
✅ Files skipped from review due to trivial changes (2)
  • .gitignore
  • docs/hpmn-recovery.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • ansible/roles/hpmn_restore/templates/restore-hpmn.sh.j2
  • ansible/roles/hpmn_restore/defaults/main.yml

@vivekgsharma vivekgsharma requested a review from ktechmidas April 15, 2026 07:57
@vivekgsharma vivekgsharma merged commit a3e7d8e into v1.0-dev Apr 15, 2026
2 checks passed