145 changes: 145 additions & 0 deletions .github/workflows/hpmn-backup.yml
@@ -0,0 +1,145 @@
name: Manual HP Masternode Backup

on:
  workflow_dispatch:
    inputs:
      network:
        description: "Network to back up"
        required: true
        type: string
        default: testnet
      backup_label:
        description: "Short label stamped into the S3 object path"
        required: true
        type: string
      host_limit:
        description: "Ansible limit expression (start with a single hp-masternode-N for validation)"
        required: true
        type: string
        default: hp-masternode-1
      batch_size:
        description: "How many HPMNs may run backup concurrently"
        required: true
        type: string
        default: "5"
      install_backup_tooling:
        description: "Install/update backup script prerequisites before running the backup"
        required: true
        type: boolean
        default: false

jobs:
  backup:
    name: Run HP masternode backup
    runs-on: ubuntu-22.04
    timeout-minutes: 90
    env:
      NETWORK_NAME: ${{ inputs.network }}
      HOST_LIMIT: ${{ inputs.host_limit }}
      HPMN_BACKUP_TRIGGER_LABEL: ${{ inputs.backup_label }}
      HPMN_BACKUP_S3_BUCKET: ${{ vars.HPMN_BACKUP_S3_BUCKET }}
      HPMN_BACKUP_S3_PREFIX: ${{ vars.HPMN_BACKUP_S3_PREFIX }}
      HPMN_BACKUP_S3_SSE_MODE: ${{ vars.HPMN_BACKUP_S3_SSE_MODE }}
      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      AWS_SESSION_TOKEN: ${{ secrets.AWS_SESSION_TOKEN }}
      AWS_REGION: ${{ secrets.AWS_REGION }}
      ANSIBLE_HOST_KEY_CHECKING: "false"
⚠️ Potential issue | 🟠 Major

Don't disable SSH host verification for this workflow.

Line 42 and Lines 77-87 make the runner trust any host key for both Ansible and github.com. A spoofed repo endpoint or target host would then receive the backup commands and the AWS credentials that ansible/roles/hpmn_backup/tasks/run.yml forwards into the remote environment. Please seed known_hosts with pinned fingerprints instead of using StrictHostKeyChecking no / ANSIBLE_HOST_KEY_CHECKING=false.

Also applies to: 77-87

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/hpmn-backup.yml at line 42, the workflow currently
disables SSH host verification via ANSIBLE_HOST_KEY_CHECKING and uses
StrictHostKeyChecking no, which accepts any host key. Instead, seed the runner's
SSH known_hosts with pinned host key fingerprints for the targets and github.com
before any SSH/Ansible steps. Replace the ANSIBLE_HOST_KEY_CHECKING /
StrictHostKeyChecking no usage by adding a step that writes verified host keys
(via ssh-keyscan or a fixed fingerprint entry) into ~/.ssh/known_hosts, ensure
the Ansible/ssh steps reference that known_hosts, and remove the
ANSIBLE_HOST_KEY_CHECKING / StrictHostKeyChecking overrides. Target symbols to
change: ANSIBLE_HOST_KEY_CHECKING, StrictHostKeyChecking (ssh options), and the
workflow steps that populate known_hosts around where github.com and the remote
hosts are used.

      ANSIBLE_CONFIG: ansible/ansible.cfg

    steps:
      - name: Checkout dash-network-deploy
        uses: actions/checkout@v4

      - name: Install controller dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y python3-pip python3-netaddr
          python3 -m pip install --upgrade pip
          python3 -m pip install ansible-core==2.16.3 jmespath

      - name: Install Ansible roles and collections
        run: |
          ansible-galaxy install -r ansible/requirements.yml
          mkdir -p ~/.ansible/roles
          cp -r ansible/roles/* ~/.ansible/roles/

      - name: Set up SSH keys
        env:
          DEPLOY_SERVER_KEY: ${{ secrets.DEPLOY_SERVER_KEY }}
          EVO_APP_DEPLOY_KEY: ${{ secrets.EVO_APP_DEPLOY_KEY }}
        run: |
          mkdir -p ~/.ssh

          printf '%s\n' "$DEPLOY_SERVER_KEY" > ~/.ssh/id_rsa
          chmod 600 ~/.ssh/id_rsa
          ssh-keygen -y -f ~/.ssh/id_rsa > ~/.ssh/id_rsa.pub
          chmod 644 ~/.ssh/id_rsa.pub

          printf '%s\n' "$EVO_APP_DEPLOY_KEY" > ~/.ssh/id_ed25519
          chmod 600 ~/.ssh/id_ed25519

          cat > ~/.ssh/config << 'EOL'
          Host github.com
            IdentityFile ~/.ssh/id_ed25519
            StrictHostKeyChecking no

          Host *
            IdentityFile ~/.ssh/id_rsa
            User ubuntu
            StrictHostKeyChecking no
            UserKnownHostsFile=/dev/null
          EOL

          chmod 600 ~/.ssh/config

      - name: Clone network configs
        run: |
          rm -rf networks
          git clone git@github.com:dashpay/dash-network-configs.git networks

      - name: Validate network files and backup config
        run: |
          test -f "networks/${NETWORK_NAME}.inventory"
          test -f "networks/${NETWORK_NAME}.yml"
          test -n "$HPMN_BACKUP_S3_BUCKET"
          test -n "$AWS_ACCESS_KEY_ID"
          test -n "$AWS_SECRET_ACCESS_KEY"
          test -n "$AWS_REGION"
          case "${HPMN_BACKUP_S3_SSE_MODE:-AES256}" in
            AES256)
              ;;
            aws:kms)
              test -n "${{ secrets.HPMN_BACKUP_S3_KMS_KEY_ID }}"
              ;;
            *)
              echo "Unsupported HPMN_BACKUP_S3_SSE_MODE: ${HPMN_BACKUP_S3_SSE_MODE}" >&2
              exit 1
              ;;
          esac

      - name: Install backup tooling on target HP masternodes
        if: ${{ inputs.install_backup_tooling }}
        run: |
          ansible-playbook \
            -i "networks/${NETWORK_NAME}.inventory" \
            ansible/hpmn_backup_install.yml \
            -e "@networks/${NETWORK_NAME}.yml" \
            -e "dash_network_name=${NETWORK_NAME}" \
            -e "hpmn_backup_install_serial=${{ inputs.batch_size }}" \
            --limit "${HOST_LIMIT}"

      - name: Trigger HP masternode backup
        run: |
          if [[ "${HPMN_BACKUP_S3_SSE_MODE:-AES256}" == "aws:kms" ]]; then
            export HPMN_BACKUP_S3_KMS_KEY_ID="${{ secrets.HPMN_BACKUP_S3_KMS_KEY_ID }}"
          fi

          ansible-playbook \
            -i "networks/${NETWORK_NAME}.inventory" \
            ansible/hpmn_backup_run.yml \
            -e "@networks/${NETWORK_NAME}.yml" \
            -e "dash_network_name=${NETWORK_NAME}" \
            -e "hpmn_backup_serial=${{ inputs.batch_size }}" \
            -e "hpmn_backup_trigger_label=${HPMN_BACKUP_TRIGGER_LABEL}" \
            --limit "${HOST_LIMIT}"
140 changes: 140 additions & 0 deletions .github/workflows/hpmn-restore.yml
@@ -0,0 +1,140 @@
name: HP Masternode Restore

on:
  workflow_dispatch:
    inputs:
      network:
        description: "Network to restore"
        required: true
        type: string
        default: testnet
      target_host:
        description: "Single HP masternode host to restore"
        required: true
        type: string
        default: hp-masternode-1
      restore_s3_uri:
        description: "Full S3 URI of the backup archive to restore"
        required: true
        type: string
      install_restore_tooling:
        description: "Install/update restore prerequisites before running restore"
        required: true
        type: boolean
        default: false
      start_services:
        description: "Start dashmate services directly from the restore script"
        required: true
        type: boolean
        default: false
      finalize_restore:
        description: "Run the finalize playbook after restore to regenerate host-specific config and start services"
        required: true
        type: boolean
        default: true

jobs:
  restore:
    name: Restore HP masternode
    runs-on: ubuntu-22.04
    timeout-minutes: 120
    env:
      NETWORK_NAME: ${{ inputs.network }}
      TARGET_HOST: ${{ inputs.target_host }}
      HPMN_RESTORE_S3_URI: ${{ inputs.restore_s3_uri }}
      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      AWS_SESSION_TOKEN: ${{ secrets.AWS_SESSION_TOKEN }}
      AWS_REGION: ${{ secrets.AWS_REGION }}
      ANSIBLE_HOST_KEY_CHECKING: "false"
      ANSIBLE_CONFIG: ansible/ansible.cfg

    steps:
      - name: Checkout dash-network-deploy
        uses: actions/checkout@v4

      - name: Install controller dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y python3-pip python3-netaddr
          python3 -m pip install --upgrade pip
          python3 -m pip install ansible-core==2.16.3 jmespath

      - name: Install Ansible roles and collections
        run: |
          ansible-galaxy install -r ansible/requirements.yml
          mkdir -p ~/.ansible/roles
          cp -r ansible/roles/* ~/.ansible/roles/

      - name: Set up SSH keys
        env:
          DEPLOY_SERVER_KEY: ${{ secrets.DEPLOY_SERVER_KEY }}
          EVO_APP_DEPLOY_KEY: ${{ secrets.EVO_APP_DEPLOY_KEY }}
        run: |
          mkdir -p ~/.ssh

          printf '%s\n' "$DEPLOY_SERVER_KEY" > ~/.ssh/id_rsa
          chmod 600 ~/.ssh/id_rsa
          ssh-keygen -y -f ~/.ssh/id_rsa > ~/.ssh/id_rsa.pub
          chmod 644 ~/.ssh/id_rsa.pub

          printf '%s\n' "$EVO_APP_DEPLOY_KEY" > ~/.ssh/id_ed25519
          chmod 600 ~/.ssh/id_ed25519

          cat > ~/.ssh/config << 'EOL'
          Host github.com
            IdentityFile ~/.ssh/id_ed25519
            StrictHostKeyChecking no

          Host *
            IdentityFile ~/.ssh/id_rsa
            User ubuntu
            StrictHostKeyChecking no
            UserKnownHostsFile=/dev/null
          EOL

          chmod 600 ~/.ssh/config

      - name: Clone network configs
        run: |
          rm -rf networks
          git clone git@github.com:dashpay/dash-network-configs.git networks

      - name: Validate restore config
        run: |
          test -f "networks/${NETWORK_NAME}.inventory"
          test -f "networks/${NETWORK_NAME}.yml"
          test -n "$HPMN_RESTORE_S3_URI"
          test -n "$AWS_REGION"

      - name: Install restore tooling on target host
        if: ${{ inputs.install_restore_tooling }}
        run: |
          ansible-playbook \
            -i "networks/${NETWORK_NAME}.inventory" \
            ansible/hpmn_restore_install.yml \
            -e "@networks/${NETWORK_NAME}.yml" \
            -e "dash_network_name=${NETWORK_NAME}" \
            --limit "${TARGET_HOST}"

      - name: Restore target host from S3 backup
        run: |
          ansible-playbook \
            -i "networks/${NETWORK_NAME}.inventory" \
            ansible/hpmn_restore_run.yml \
            -e "@networks/${NETWORK_NAME}.yml" \
            -e "dash_network_name=${NETWORK_NAME}" \
            -e "hpmn_restore_s3_uri=${HPMN_RESTORE_S3_URI}" \
            -e "hpmn_restore_start_services=${{ inputs.start_services }}" \
            --limit "${TARGET_HOST}"

      - name: Finalize restored target host
        if: ${{ inputs.finalize_restore }}
        run: |
          ansible-playbook \
            -i "networks/${NETWORK_NAME}.inventory" \
            ansible/hpmn_restore_finalize.yml \
            -e "@networks/${NETWORK_NAME}.yml" \
            -e "dash_network_name=${NETWORK_NAME}" \
            -e "dash_network=${NETWORK_NAME}" \
            --limit "${TARGET_HOST}"
Comment on lines +103 to +140
⚠️ Potential issue | 🟠 Major

Validate that target_host resolves to exactly one HP masternode before running Ansible.

The playbook-level assert only protects the “more than one host” case. If --limit "${TARGET_HOST}" matches zero hosts, Ansible can no-op and this workflow still looks successful. Add a preflight inventory check and fail unless the limit resolves to exactly one host in hp_masternodes.

🛠️ Proposed fix
       - name: Validate restore config
         run: |
           test -f "networks/${NETWORK_NAME}.inventory"
           test -f "networks/${NETWORK_NAME}.yml"
           test -n "$HPMN_RESTORE_S3_URI"
           test -n "$AWS_REGION"
+          matched_hosts="$(
+            ansible hp_masternodes \
+              -i "networks/${NETWORK_NAME}.inventory" \
+              --limit "${TARGET_HOST}" \
+              --list-hosts | tail -n +2 | sed '/^[[:space:]]*$/d' | wc -l
+          )"
+          test "$matched_hosts" -eq 1
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/hpmn-restore.yml around lines 103-140, add a preflight
step before any ansible-playbook runs that verifies the workflow input
TARGET_HOST resolves to exactly one host from the hp_masternodes group in the
inventory networks/${NETWORK_NAME}.inventory; implement this by using Ansible's
inventory listing to filter hp_masternodes by the limit (TARGET_HOST), count the
resulting hosts, and fail the job unless the count is exactly 1 (so the
subsequent steps with --limit "${TARGET_HOST}" cannot no-op on zero hosts).
Ensure the step references TARGET_HOST, hp_masternodes, and
networks/${NETWORK_NAME}.inventory so reviewers can find it easily, and make the
workflow exit non-zero with a clear error if the count is not 1.

2 changes: 2 additions & 0 deletions .gitignore
@@ -19,3 +19,5 @@ networks

# Local docs
INFRA.MD

.codex
1 change: 1 addition & 0 deletions ansible/deploy.yml
@@ -51,6 +51,7 @@
    - name: Update apt cache and install jq
      ansible.builtin.apt:
        pkg:
          - acl
          - jq
          - unzip
        update_cache: true
10 changes: 10 additions & 0 deletions ansible/hpmn_backup_install.yml
@@ -0,0 +1,10 @@
---
- name: Install HP masternode backup tooling
  hosts: hp_masternodes
  serial: "{{ hpmn_backup_install_serial | default(5) }}"
  become: true
  gather_facts: true
  roles:
    - role: hpmn_backup
  tags:
    - hpmn_backup_install
13 changes: 13 additions & 0 deletions ansible/hpmn_backup_run.yml
@@ -0,0 +1,13 @@
---
- name: Run HP masternode backup on demand
  hosts: hp_masternodes
  serial: "{{ hpmn_backup_serial | default(5) }}"
  become: true
  gather_facts: false
  tasks:
    - name: Execute HP masternode backup role
      ansible.builtin.include_role:
        name: hpmn_backup
        tasks_from: run
      tags:
        - hpmn_backup_run
25 changes: 25 additions & 0 deletions ansible/hpmn_restore_finalize.yml
@@ -0,0 +1,25 @@
---
- name: Finalize restored HP masternode
  hosts: hp_masternodes
  serial: 1
  become: true
  gather_facts: false
  pre_tasks:
    - name: Enforce single-host finalize
      ansible.builtin.assert:
        that:
          - ansible_play_hosts_all | length == 1
        fail_msg: Finalize must target exactly one HP masternode host.
      run_once: true

    - name: Load HP masternode config for finalize
      ansible.builtin.set_fact:
        node: "{{ hp_masternodes[inventory_hostname] }}"
      when: inventory_hostname in hp_masternodes

  roles:
    - role: dash_cli
    - role: dashmate

  tags:
    - hpmn_restore_finalize
10 changes: 10 additions & 0 deletions ansible/hpmn_restore_install.yml
@@ -0,0 +1,10 @@
---
- name: Install HP masternode restore tooling
  hosts: hp_masternodes
  serial: 1
  become: true
  gather_facts: true
  roles:
    - role: hpmn_restore
  tags:
    - hpmn_restore_install
20 changes: 20 additions & 0 deletions ansible/hpmn_restore_run.yml
@@ -0,0 +1,20 @@
---
- name: Restore HP masternode from backup
  hosts: hp_masternodes
  serial: 1
  become: true
  gather_facts: false
  pre_tasks:
    - name: Enforce single-host restore
      ansible.builtin.assert:
        that:
          - ansible_play_hosts_all | length == 1
        fail_msg: Restore must target exactly one HP masternode host.
      run_once: true
  tasks:
    - name: Execute HP masternode restore role
      ansible.builtin.include_role:
        name: hpmn_restore
        tasks_from: run
      tags:
        - hpmn_restore_run