Skip to content

feat: add CI automation for devnet create and destroy workflows#738

Merged
vivekgsharma merged 37 commits intov1.0-devfrom
devnet
Mar 31, 2026
Merged

feat: add CI automation for devnet create and destroy workflows#738
vivekgsharma merged 37 commits intov1.0-devfrom
devnet

Conversation

@vivekgsharma
Copy link
Copy Markdown
Collaborator

@vivekgsharma vivekgsharma commented Mar 30, 2026

Create Devnet using GitHub Actions. The main problems were CI environment drift and a few runtime issues that only showed up in clean CI runs, especially during masternode funding and HP node Dashmate setup.

What was done?

  • Pinned CI to the working Ansible environment used by manual deploys:
    • ubuntu-22.04
    • ansible-core==2.16.3
    • jmespath
  • Fixed config generation issues used by CI-created networks:
    • set Terraform main_domain
    • ensured dashmate_core_rpc_quorum_list_password is generated and rendered correctly
  • Hardened spork activation:
    • safe handling when EHF/spork keys are absent
    • avoided failures on unsupported sporks
  • Updated Dashmate Ansible role behavior for HP nodes:
    • run commands with correct HOME using become_flags: -H
    • pass explicit Dashmate config selection during render/start/update/status/SSL operations
    • retry transient Let’s Encrypt failures
  • Fixed masternode funding flow in Ansible:
    • kept 1-Dash fee funding batched
    • changed 4000-Dash HPMN collateral funding to one-by-one with confirmation between sends
    • this avoids Transaction too large failures from oversized wallet transactions in CI

How Has This Been Tested?

  • Repeatedly tested with GitHub Actions Create Devnet / Destroy Devnet
  • Verified resumed debugging runs against failing Ansible sections during iteration
  • Completed a successful CI run through:
    • masternode registration
    • HPMN collateral funding
    • ProTx creation
    • HP node Dashmate/platform setup
    • post-deploy service verification

Breaking Changes

None.

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation

Summary by CodeRabbit

  • Bug Fixes

    • Improved error handling in deployment workflows with safer configuration validation and fallback mechanisms.
    • Enhanced retry logic for SSL certificate operations with configurable timeouts.
  • Chores

    • Updated deployment infrastructure with strengthened devnet name validation.
    • Optimized deployment process to support resuming interrupted operations.
    • Adjusted configuration defaults for improved system initialization.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 30, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

GitHub Actions workflows enhanced with devnet resume mode, SSH write key support, and stricter validation. Ansible tasks updated with privilege escalation flags and spork/collateral funding logic improvements. Configuration generators set quorum list password and domain defaults.

Changes

Cohort / File(s) Summary
GitHub Actions Workflows
.github/workflows/create-devnet.yml, .github/workflows/destroy-devnet.yml
Added workflow dispatch input deploy_tags for Ansible tag control. Changed runner to ubuntu-22.04, upgraded to ansible-core==2.16.3, and added DEVNET_ONLY_GUARD. Enhanced devnet name validation (reject reserved names, validate patterns). Added optional SSH write key (EVO_APP_DEPLOY_WRITE_KEY) with fallback to deploy key. Introduced resume mode detection for existing config sets with conditional Terraform skip. Enhanced tfvars preprocessing with default main_domain. Made config push more robust with conditional copying and if: always() push step. Updated Terraform destroy with auto-approval flags.
Ansible Privilege Escalation
ansible/roles/dashmate/tasks/build.yml, destroy_platform.yml, main.yml, quick_update.yml, reindex.yml, rolling_restart.yml, ssl/letsencrypt.yml, ssl/zerossl.yml
Added become_flags: '-H' to privilege escalation context across multiple dashmate command tasks. Added conditional set_fact in main.yml to set user/group existence after creation. Converted fast-mode restart/force-start commands to folded multiline form in main.yml. Added retry logic with retries: 3, delay: 20, and until condition to letsencrypt SSL task.
Ansible Spork Activation
ansible/roles/activate_dashd_sporks/tasks/main.yml, ansible/roles/generate_blocks/tasks/main.yml
Changed spork key lookup from direct indexing to safe dictionary .get() method to handle missing keys gracefully. Removed dedicated "Activate EHF" task.
Ansible Collateral Funding Refactor
ansible/roles/mn_fund_collateral/tasks/fund_collateral.yml, fund_one_collateral.yml
Refactored from single bulk sendmany operation to per-target funding loop. Extracted individual funding and block generation logic into new fund_one_collateral.yml task file. Updated fund_collateral.yml to iterate with with_items and delegate to included task.
Ansible Masternode Initialization
ansible/roles/mn_init/tasks/main.yml
Derived mn_names from mnlist and switched loops to iterate over derived list. Replaced collateral funding include with direct sendmany operation and inlined generate_blocks role invocation. Updated fee funding to use inline transaction with block generation.
Configuration Generators
lib/configGenerator/generateAnsibleConfig.js, generateTerraformConfig.js
Set dashmate_core_rpc_quorum_list_password to same value as dashmate_core_rpc_dashmate_password in Ansible config. Changed Terraform main_domain default from empty string to 'networks.dash.org'.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🐰 With spork keys now safe and SSH write keys in hand,
Devnets resume their work 'cross the ansible land,
Privilege escalation hops with flags ever bright,
Collateral funding flows smooth—one target per bite,
The networks are ready, the domains are set right! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately and concisely summarizes the main change: adding GitHub Actions CI automation for devnet creation and destruction workflows, which is the core focus of the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch devnet

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
ansible/roles/dashmate/tasks/set_vars.yml (1)

8-12: ⚠️ Potential issue | 🟠 Major

dash_network_name may be undefined, causing --config= to be passed.

Per ansible/group_vars/all lines 73-75, dash_network_name is commented out and not defined by default. Using it directly without an is defined check will result in dashmate_config_arg expanding to --config= (empty value), which will break dashmate commands.

Suggested fix with conditional handling
 - name: Set dashmate vars
   ansible.builtin.set_fact:
     dashmate_cmd: "{{ 'yarn dashmate' if dashmate_branch is defined else 'dashmate' }}"
     dashmate_cwd: "{{ dashmate_source_dir if dashmate_branch is defined else dashmate_home }}"
-    dashmate_config_arg: "--config={{ dash_network_name }}"
+    dashmate_config_arg: "{{ '--config=' + dash_network_name if dash_network_name is defined and dash_network_name else '' }}"

As per coding guidelines: "Use is defined checks... for variables such as... in Ansible tasks to prevent undefined variable errors".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/roles/dashmate/tasks/set_vars.yml` around lines 8 - 12, The
dashmate_config_arg is being set unconditionally using dash_network_name which
may be undefined; update the ansible.builtin.set_fact so dashmate_config_arg
only includes the --config= flag when dash_network_name is defined (e.g., build
the value conditionally using an "if dash_network_name is defined" check) and
otherwise set dashmate_config_arg to an empty string or omit the flag; locate
the set_fact task that defines dashmate_cmd, dashmate_cwd and
dashmate_config_arg and replace the current dashmate_config_arg assignment with
a Jinja conditional that checks dash_network_name before concatenating the
--config= prefix.
ansible/roles/dashmate/tasks/main.yml (2)

685-698: ⚠️ Potential issue | 🟡 Minor

Fix line length to pass ansible-lint.

Line 687 exceeds the 160-character limit (163 characters), causing the pipeline to fail. Split the command for better readability.

Proposed fix
 - name: Start not started services and replace updated services
-  ansible.builtin.command: "{{ dashmate_cmd }} start --force --verbose{{ '' if needs_core_restart | default(false) else ' --platform' }} {{ dashmate_config_arg }}"
+  ansible.builtin.command: >-
+    {{ dashmate_cmd }} start --force --verbose
+    {{ '' if needs_core_restart | default(false) else '--platform' }}
+    {{ dashmate_config_arg }}
   become: true
   become_user: dashmate
   become_flags: '-H'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/roles/dashmate/tasks/main.yml` around lines 685 - 698, The ansible
task "Start not started services and replace updated services" has a command
line that exceeds the 160-char lint limit; change the ansible.builtin.command
invocation so the long command is split across multiple lines (for example
replace the single long string with a multi-line YAML scalar or a cmd list)
using the existing variables (dashmate_cmd, needs_core_restart,
dashmate_config_arg) and keep the same args (chdir: '{{ dashmate_cwd }}'),
register (dashmate_force_start) and when/changed_when logic; ensure the
conditional inline expression ({{ '' if needs_core_restart | default(false) else
' --platform' }}) is preserved but moved onto its own line or concatenated in
the multi-line value to bring each line under 160 chars.

621-634: ⚠️ Potential issue | 🟡 Minor

Fix line length to pass ansible-lint.

Line 623 exceeds the 160-character limit (164 characters), causing the pipeline to fail. Split the command for better readability.

Proposed fix
 - name: Restart dashmate services (fast mode - individual restart)
-  ansible.builtin.command: "{{ dashmate_cmd }} restart --safe --verbose{{ '' if needs_core_restart | default(false) else ' --platform' }} {{ dashmate_config_arg }}"
+  ansible.builtin.command: >-
+    {{ dashmate_cmd }} restart --safe --verbose
+    {{ '' if needs_core_restart | default(false) else '--platform' }}
+    {{ dashmate_config_arg }}
   become: true
   become_user: dashmate
   become_flags: '-H'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/roles/dashmate/tasks/main.yml` around lines 621 - 634, The command
string in the "Restart dashmate services (fast mode - individual restart)" task
is over 160 characters; change the ansible.builtin.command invocation to use the
argv (list) form so the command elements are split across multiple lines (e.g.
replace the single long command value with args: argv: - "{{ dashmate_cmd }}" -
"restart" - "--safe" - "--verbose{{ '' if needs_core_restart | default(false)
else ' --platform' }}" - "{{ dashmate_config_arg }}" ), keeping the same chdir,
become settings and register dashmate_restart_all, so each argument is on its
own line and the lint line-length error is avoided.
ansible/roles/dashmate/tasks/ssl/zerossl.yml (1)

28-39: ⚠️ Potential issue | 🟡 Minor

Fix line length to pass ansible-lint.

Line 29 exceeds the 160-character limit (164 characters), causing the pipeline to fail. Split the command across multiple lines using YAML folded scalar or break up the arguments.

Proposed fix
 - name: Set ZeroSSL certificate ID to dashmate config from SSM if not set
-  ansible.builtin.command: "{{ dashmate_cmd }} config set {{ dashmate_zerossl_config_path }}.id {{ dashmate_zerossl_ssm_certificate_id }} {{ dashmate_config_arg }}"
+  ansible.builtin.command: >-
+    {{ dashmate_cmd }} config set {{ dashmate_zerossl_config_path }}.id
+    {{ dashmate_zerossl_ssm_certificate_id }} {{ dashmate_config_arg }}
   become: true
   become_user: dashmate
   become_flags: '-H'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/roles/dashmate/tasks/ssl/zerossl.yml` around lines 28 - 39, The task
"Set ZeroSSL certificate ID to dashmate config from SSM if not set" has a single
command line that exceeds the 160-char limit; split the long command so
ansible-lint passes. Replace the inline long string built from dashmate_cmd,
dashmate_zerossl_config_path, dashmate_zerossl_ssm_certificate_id and
dashmate_config_arg with either a YAML folded scalar (>- ) for the command value
or use the command module's argv form (list of arguments) so the long command is
broken into multiple YAML lines while preserving chdir, become, register
(dashmate_zerossl_id) and when conditions.
🧹 Nitpick comments (6)
ansible/roles/mn_fund_collateral/tasks/fund_collateral.yml (1)

5-17: Consider removing unused payments dict construction.

The payments dict is now only used for the debug output (line 15-17). Since funding is done one-by-one via fund_one_collateral.yml, this dict is no longer needed for the actual funding operation. You could simplify by debugging payment_targets directly.

♻️ Optional simplification
 ---
 
-# Requires "payment_targets" and "amount" variables
-
-- name: Initialize array for payments
-  ansible.builtin.set_fact:
-    payments: []
-
-- name: Populate payment addresses and values
-  ansible.builtin.set_fact:
-    payments: "{{ payments | default({}) | combine({item: amount}) }}"
-  with_items:
-    - '{{ payment_targets }}'
-
-- name: Show list of payments to be made
+- name: Show list of addresses to be funded with {{ amount ~ ' Dash' }}
   ansible.builtin.debug:
-    var: payments
+    var: payment_targets
 
 - name: Fund listed masternodes with {{ amount ~ ' Dash'}}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/roles/mn_fund_collateral/tasks/fund_collateral.yml` around lines 5 -
17, Remove the unused intermediate payments dict by eliminating the set_fact
that initializes and populates payments and instead reference payment_targets
directly in the debug task; update or remove any tasks that set or combine the
payments fact (search for payments and the set_fact building it) and change the
debug task to show payment_targets, confirming that fund_one_collateral.yml is
used for actual funding so no other code expects the payments variable.
ansible/roles/dashmate/tasks/reindex.yml (1)

5-13: Missing {{ dashmate_config_arg }} in reindex command.

This task imports set_vars.yml which defines dashmate_config_arg, but the command doesn't use it. For consistency with other dashmate commands in this PR, append the config argument.

Suggested fix
 - name: Reindex dashmate node
-  ansible.builtin.command: "{{ dashmate_cmd }} reindex -d --force"
+  ansible.builtin.command: "{{ dashmate_cmd }} reindex -d --force {{ dashmate_config_arg }}"
   become: true
   become_user: dashmate
   become_flags: '-H'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/roles/dashmate/tasks/reindex.yml` around lines 5 - 13, The reindex
task is missing the dashmate_config_arg so the command should include the config
flag like other dashmate tasks; update the ansible task that runs "{{
dashmate_cmd }} reindex -d --force" (task name "Reindex dashmate node") to
append "{{ dashmate_config_arg }}" to the command string, leaving chdir, become,
become_user, register (dashmate_reindex) and changed_when logic unchanged so it
uses the same config argument as the rest of the role.
ansible/roles/dashmate/tasks/rolling_restart.yml (1)

3-11: Missing set_vars.yml import and {{ dashmate_config_arg }} in restart command.

This task doesn't import ./set_vars.yml (unlike other dashmate tasks) and doesn't append {{ dashmate_config_arg }} to the command. The context snippet from main.yml line 623 shows the expected pattern includes the config argument.

Suggested fix
 ---
 
+- name: Set vars
+  ansible.builtin.import_tasks: ./set_vars.yml
+
 - name: Restart dashmate services for current chunk
-  ansible.builtin.command: "{{ dashmate_cmd }} restart --safe --verbose{{ '' if needs_core_restart | default(false) else ' --platform' }}"
+  ansible.builtin.command: "{{ dashmate_cmd }} restart --safe --verbose{{ '' if needs_core_restart | default(false) else ' --platform' }} {{ dashmate_config_arg }}"
   become: true
   become_user: dashmate
   become_flags: '-H'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/roles/dashmate/tasks/rolling_restart.yml` around lines 3 - 11, Add
the missing import of ./set_vars.yml at the top of this task file and append the
dashmate_config_arg variable to the dashmate restart command; specifically,
ensure the task that registers dashmate_restart_all (task name "Restart dashmate
services for current chunk") sources ./set_vars.yml like other dashmate tasks
and modify the ansible.builtin.command invocation (dashmate_cmd restart ...) to
include {{ dashmate_config_arg }} before other flags so the restart uses the
configured dashmate_config_arg and matches the main.yml pattern.
ansible/roles/dashmate/tasks/destroy_platform.yml (1)

6-13: Missing {{ dashmate_config_arg }} in reset command.

Other dashmate commands in this PR (e.g., SSL obtain, main.yml restart/status) append {{ dashmate_config_arg }} for explicit config selection. This task imports set_vars.yml which defines dashmate_config_arg, but doesn't use it in the command.

For consistency and to avoid ambiguity when multiple configs exist, consider appending the config argument:

Suggested fix
- ansible.builtin.command: "{{ dashmate_cmd }} reset --platform --force --hard"
+ ansible.builtin.command: "{{ dashmate_cmd }} reset --platform --force --hard {{ dashmate_config_arg }}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/roles/dashmate/tasks/destroy_platform.yml` around lines 6 - 13, The
"Reset platform" Ansible task (registered as dashmate_reset) is missing the
dashmate_config_arg in the command invocation; update the
ansible.builtin.command for the task that builds "{{ dashmate_cmd }} reset
--platform --force --hard" to append the template variable "{{
dashmate_config_arg }}" so the command becomes "{{ dashmate_cmd }} reset
--platform --force --hard {{ dashmate_config_arg }}" (preserving existing
quoting, become/become_user/become_flags, chdir and register settings) to ensure
the explicit config is used.
ansible/roles/dashmate/tasks/quick_update.yml (1)

44-53: Missing {{ dashmate_config_arg }} in restart command.

The task imports set_vars.yml which defines dashmate_config_arg, but the restart command doesn't use it. This is inconsistent with other dashmate commands updated in this PR that append the config argument for explicit config selection.

Suggested fix
 - name: Restart dashmate services
-  ansible.builtin.command: "{{ dashmate_cmd }} restart --verbose"
+  ansible.builtin.command: "{{ dashmate_cmd }} restart --verbose {{ dashmate_config_arg }}"
   become: true
   become_user: dashmate
   become_flags: '-H'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/roles/dashmate/tasks/quick_update.yml` around lines 44 - 53, The
restart task "Restart dashmate services" currently invokes the command via the
variable dashmate_cmd but omits the dashmate_config_arg; update the command
invocation in that task to append the dashmate_config_arg so the restart uses
the same explicit config selection as other dashmate commands (keep the task's
become/become_user/become_flags, chdir, register: dashmate_restart, when:
installed_dashmate_version != dashmate_version, and changed_when logic
unchanged).
ansible/deploy.yml (1)

3-11: Clarify or remove "TEMPORARY DEBUG MODE" comment.

The comment on line 3 states "TEMPORARY DEBUG MODE: resume from HP Dashmate setup". If this is intentional for the current PR scope, consider adding context about when this will be changed. If it's leftover debug code, it should be removed or the full deployment flow restored before merging.

The commented-out governance proposals code (lines 5-11) should either be removed or tracked in an issue.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/deploy.yml` around lines 3 - 11, Update the temporary debug comment
and the commented-out task: either delete the "TEMPORARY DEBUG MODE: resume from
HP Dashmate setup" line or replace it with a short note stating the reason and a
target date/issue (e.g., "left for PR X / issue `#Y`") so intent is clear; then
either remove the commented-out governance task block or add a TODO referencing
an issue number that tracks restoring the Create governance proposals role
(generate_proposals) so the deploy.yml no longer contains ambiguous debug
leftovers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ansible/roles/mn_fund_collateral/tasks/fund_one_collateral.yml`:
- Around line 9-16: The task name uses a Jinja template in the middle which
triggers ansible-lint name[template]; change the task name so the template is at
the end (e.g., move "{{ payment_target }}" to the end of the string) for the
task that includes the role generate_blocks with vars txid_list and tx_source
and the when condition referencing fund_result.stdout; keep the rest of the task
(ansible.builtin.include_role, vars generate/txid_list/tx_source and the when)
unchanged.
- Around line 3-7: The task name includes Jinja templates in the middle which
triggers ansible-lint name[template]; update the task name so all templates
appear only at the end (e.g., change the name to something like "Fund collateral
address: {{ payment_target }} {{ amount }}" or any wording where the templates
payment_target and amount are adjacent at the end), leaving the dash-cli
command, register (fund_result), and changed_when logic unchanged; edit the task
name string in the task that currently reads "Fund collateral address {{
payment_target }} with {{ amount }} Dash".

---

Outside diff comments:
In `@ansible/roles/dashmate/tasks/main.yml`:
- Around line 685-698: The ansible task "Start not started services and replace
updated services" has a command line that exceeds the 160-char lint limit;
change the ansible.builtin.command invocation so the long command is split
across multiple lines (for example replace the single long string with a
multi-line YAML scalar or a cmd list) using the existing variables
(dashmate_cmd, needs_core_restart, dashmate_config_arg) and keep the same args
(chdir: '{{ dashmate_cwd }}'), register (dashmate_force_start) and
when/changed_when logic; ensure the conditional inline expression ({{ '' if
needs_core_restart | default(false) else ' --platform' }}) is preserved but
moved onto its own line or concatenated in the multi-line value to bring each
line under 160 chars.
- Around line 621-634: The command string in the "Restart dashmate services
(fast mode - individual restart)" task is over 160 characters; change the
ansible.builtin.command invocation to use the argv (list) form so the command
elements are split across multiple lines (e.g. replace the single long command
value with args: argv: - "{{ dashmate_cmd }}" - "restart" - "--safe" -
"--verbose{{ '' if needs_core_restart | default(false) else ' --platform' }}" -
"{{ dashmate_config_arg }}" ), keeping the same chdir, become settings and
register dashmate_restart_all, so each argument is on its own line and the lint
line-length error is avoided.

In `@ansible/roles/dashmate/tasks/set_vars.yml`:
- Around line 8-12: The dashmate_config_arg is being set unconditionally using
dash_network_name which may be undefined; update the ansible.builtin.set_fact so
dashmate_config_arg only includes the --config= flag when dash_network_name is
defined (e.g., build the value conditionally using an "if dash_network_name is
defined" check) and otherwise set dashmate_config_arg to an empty string or omit
the flag; locate the set_fact task that defines dashmate_cmd, dashmate_cwd and
dashmate_config_arg and replace the current dashmate_config_arg assignment with
a Jinja conditional that checks dash_network_name before concatenating the
--config= prefix.

In `@ansible/roles/dashmate/tasks/ssl/zerossl.yml`:
- Around line 28-39: The task "Set ZeroSSL certificate ID to dashmate config
from SSM if not set" has a single command line that exceeds the 160-char limit;
split the long command so ansible-lint passes. Replace the inline long string
built from dashmate_cmd, dashmate_zerossl_config_path,
dashmate_zerossl_ssm_certificate_id and dashmate_config_arg with either a YAML
folded scalar (>- ) for the command value or use the command module's argv form
(list of arguments) so the long command is broken into multiple YAML lines while
preserving chdir, become, register (dashmate_zerossl_id) and when conditions.

---

Nitpick comments:
In `@ansible/deploy.yml`:
- Around line 3-11: Update the temporary debug comment and the commented-out
task: either delete the "TEMPORARY DEBUG MODE: resume from HP Dashmate setup"
line or replace it with a short note stating the reason and a target date/issue
(e.g., "left for PR X / issue `#Y`") so intent is clear; then either remove the
commented-out governance task block or add a TODO referencing an issue number
that tracks restoring the Create governance proposals role (generate_proposals)
so the deploy.yml no longer contains ambiguous debug leftovers.

In `@ansible/roles/dashmate/tasks/destroy_platform.yml`:
- Around line 6-13: The "Reset platform" Ansible task (registered as
dashmate_reset) is missing the dashmate_config_arg in the command invocation;
update the ansible.builtin.command for the task that builds "{{ dashmate_cmd }}
reset --platform --force --hard" to append the template variable "{{
dashmate_config_arg }}" so the command becomes "{{ dashmate_cmd }} reset
--platform --force --hard {{ dashmate_config_arg }}" (preserving existing
quoting, become/become_user/become_flags, chdir and register settings) to ensure
the explicit config is used.

In `@ansible/roles/dashmate/tasks/quick_update.yml`:
- Around line 44-53: The restart task "Restart dashmate services" currently
invokes the command via the variable dashmate_cmd but omits the
dashmate_config_arg; update the command invocation in that task to append the
dashmate_config_arg so the restart uses the same explicit config selection as
other dashmate commands (keep the task's become/become_user/become_flags, chdir,
register: dashmate_restart, when: installed_dashmate_version !=
dashmate_version, and changed_when logic unchanged).

In `@ansible/roles/dashmate/tasks/reindex.yml`:
- Around line 5-13: The reindex task is missing the dashmate_config_arg so the
command should include the config flag like other dashmate tasks; update the
ansible task that runs "{{ dashmate_cmd }} reindex -d --force" (task name
"Reindex dashmate node") to append "{{ dashmate_config_arg }}" to the command
string, leaving chdir, become, become_user, register (dashmate_reindex) and
changed_when logic unchanged so it uses the same config argument as the rest of
the role.

In `@ansible/roles/dashmate/tasks/rolling_restart.yml`:
- Around line 3-11: Add the missing import of ./set_vars.yml at the top of this
task file and append the dashmate_config_arg variable to the dashmate restart
command; specifically, ensure the task that registers dashmate_restart_all (task
name "Restart dashmate services for current chunk") sources ./set_vars.yml like
other dashmate tasks and modify the ansible.builtin.command invocation
(dashmate_cmd restart ...) to include {{ dashmate_config_arg }} before other
flags so the restart uses the configured dashmate_config_arg and matches the
main.yml pattern.

In `@ansible/roles/mn_fund_collateral/tasks/fund_collateral.yml`:
- Around line 5-17: Remove the unused intermediate payments dict by eliminating
the set_fact that initializes and populates payments and instead reference
payment_targets directly in the debug task; update or remove any tasks that set
or combine the payments fact (search for payments and the set_fact building it)
and change the debug task to show payment_targets, confirming that
fund_one_collateral.yml is used for actual funding so no other code expects the
payments variable.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 254b47a8-b3f3-46c1-8178-9ef385ddb745

📥 Commits

Reviewing files that changed from the base of the PR and between 85b6eaf and 4afc26a.

📒 Files selected for processing (20)
  • .github/workflows/create-devnet.yml
  • .github/workflows/destroy-devnet.yml
  • ansible/deploy.yml
  • ansible/roles/activate_dashd_sporks/tasks/main.yml
  • ansible/roles/dashmate/tasks/build.yml
  • ansible/roles/dashmate/tasks/destroy_platform.yml
  • ansible/roles/dashmate/tasks/main.yml
  • ansible/roles/dashmate/tasks/quick_update.yml
  • ansible/roles/dashmate/tasks/reindex.yml
  • ansible/roles/dashmate/tasks/rolling_restart.yml
  • ansible/roles/dashmate/tasks/set_vars.yml
  • ansible/roles/dashmate/tasks/ssl/letsencrypt.yml
  • ansible/roles/dashmate/tasks/ssl/zerossl.yml
  • ansible/roles/dashmate/templates/dashmate.json.j2
  • ansible/roles/generate_blocks/tasks/main.yml
  • ansible/roles/mn_fund_collateral/tasks/fund_collateral.yml
  • ansible/roles/mn_fund_collateral/tasks/fund_one_collateral.yml
  • ansible/roles/mn_init/tasks/main.yml
  • lib/configGenerator/generateAnsibleConfig.js
  • lib/configGenerator/generateTerraformConfig.js

Comment on lines +3 to +7
- name: Fund collateral address {{ payment_target }} with {{ amount }} Dash
ansible.builtin.command: >
dash-cli -rpcwallet={{ wallet_rpc_wallet_faucet }} sendtoaddress {{ payment_target }} {{ amount }}
register: fund_result
changed_when: fund_result.stdout | length == 64
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix ansible-lint name[template] violations.

The pipeline reports name[template] warnings because Jinja templates should only appear at the end of task names. Consider restructuring the task names.

🔧 Proposed fix
-- name: Fund collateral address {{ payment_target }} with {{ amount }} Dash
+- name: "Fund collateral address with Dash: {{ payment_target }} ({{ amount }})"
   ansible.builtin.command: >
     dash-cli -rpcwallet={{ wallet_rpc_wallet_faucet }} sendtoaddress {{ payment_target }} {{ amount }}
   register: fund_result
   changed_when: fund_result.stdout | length == 64
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- name: Fund collateral address {{ payment_target }} with {{ amount }} Dash
ansible.builtin.command: >
dash-cli -rpcwallet={{ wallet_rpc_wallet_faucet }} sendtoaddress {{ payment_target }} {{ amount }}
register: fund_result
changed_when: fund_result.stdout | length == 64
- name: "Fund collateral address with Dash: {{ payment_target }} ({{ amount }})"
ansible.builtin.command: >
dash-cli -rpcwallet={{ wallet_rpc_wallet_faucet }} sendtoaddress {{ payment_target }} {{ amount }}
register: fund_result
changed_when: fund_result.stdout | length == 64
🧰 Tools
🪛 GitHub Actions: Package Tests

[warning] 3-3: ansible-lint (name[template]): Jinja templates should only be at the end of 'name' (Task/Handler: Fund collateral address {{ payment_target }} with {{ amount }} Dash)

🪛 GitHub Check: Test package

[failure] 3-3: name[template]
Jinja templates should only be at the end of 'name'

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/roles/mn_fund_collateral/tasks/fund_one_collateral.yml` around lines
3 - 7, The task name includes Jinja templates in the middle which triggers
ansible-lint name[template]; update the task name so all templates appear only
at the end (e.g., change the name to something like "Fund collateral address: {{
payment_target }} {{ amount }}" or any wording where the templates
payment_target and amount are adjacent at the end), leaving the dash-cli
command, register (fund_result), and changed_when logic unchanged; edit the task
name string in the task that currently reads "Fund collateral address {{
payment_target }} with {{ amount }} Dash".

Comment thread ansible/roles/mn_fund_collateral/tasks/fund_one_collateral.yml Outdated
@vivekgsharma vivekgsharma changed the title Devnet feat: add CI automation for devnet create and destroy workflows Mar 30, 2026
Copy link
Copy Markdown
Contributor

@ktechmidas ktechmidas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass at a review

create:
name: Create Devnet
runs-on: ubuntu-latest
runs-on: ubuntu-22.04
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it have an error on latest?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, on ubuntu-latest I hit CI-only environment drift issues. Pinning to ubuntu-22.04 was to stabilize the workflow against the known-good deploy environment.

run: |
python3 -m pip install --upgrade pip
python3 -m pip install ansible
python3 -m pip install ansible-core==2.16.3 jmespath
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we'll deal this in a separate PR, I know 2.19 broke a lot of stuff and our entire codebase probably needs an update...

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The workflow was failing in CI with the newer Ansible from ubuntu-latest/pip install ansible, while the same deploy flow was working manually with Ansible 2.16.3. So this was necessary to remove CI environment drift and get Create Devnet run successfully

env:
DEPLOY_SERVER_KEY: ${{ secrets.DEPLOY_SERVER_KEY }}
EVO_APP_DEPLOY_KEY: ${{ secrets.EVO_APP_DEPLOY_KEY }}
EVO_APP_DEPLOY_WRITE_KEY: ${{ secrets.EVO_APP_DEPLOY_WRITE_KEY }}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if it would generate a new key for each devnet, maybe we can use GPG to encrypt it and show it? Then you & i can decrypt on our sides and get a new key

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. I am iterating with the testing and will get this added once you do the final review and PR is about to merge , this is will me in fast testing.

Comment thread ansible/deploy.yml
Comment thread ansible/roles/activate_dashd_sporks/tasks/main.yml Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
ansible/deploy.yml (1)

3-3: ⚠️ Potential issue | 🟠 Major

Don't leave the resume/debug path under full_deploy.

.github/workflows/create-devnet.yml:15-19 still defaults deploy_tags to full_deploy, and .github/workflows/create-devnet.yml:26-33 forwards that tag directly to ./bin/deploy. Since this file is explicitly in temporary debug mode and already has a dedicated dashmate_deploy tag, merging it like this changes the default CI/manual contract from full deployment to a partial resume path. Please keep this subset behind dashmate_deploy or another debug-only tag and restore the end-to-end flow under full_deploy.

Also applies to: 27-29

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/deploy.yml` at line 3, The ansible/deploy.yml change put a
resume/debug path under the shared full_deploy tag, which breaks the default
CI/manual contract; revert the debug-specific resume logic into the
dashmate_deploy (or another debug-only) tag and restore full_deploy to run the
full end-to-end flow. Specifically, move or guard any resume-only tasks/paths
currently under the full_deploy tag so they only run under dashmate_deploy,
ensure deploy_tags still defaults to full_deploy and that ./bin/deploy receives
that full_deploy behavior, and apply the same fix to the other occurrences of
the resume/debug block referenced elsewhere in the file.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ansible/roles/dashmate/tasks/main.yml`:
- Around line 121-127: The task "Update dashmate existence facts after user
creation" currently sets dashmate_user_exists=true too early and causes the
subsequent "Add dashmate user to docker group" task (which checks when: not
dashmate_user_exists) to be skipped; move the entire ansible.builtin.set_fact
block named "Update dashmate existence facts after user creation" so it runs
after the "Add dashmate user to docker group" task (i.e., place the set_fact
task below the Docker-group task) so the docker-group task sees
dashmate_user_exists as false and can add the new user to the docker group
before the existence facts are updated.

---

Duplicate comments:
In `@ansible/deploy.yml`:
- Line 3: The ansible/deploy.yml change put a resume/debug path under the shared
full_deploy tag, which breaks the default CI/manual contract; revert the
debug-specific resume logic into the dashmate_deploy (or another debug-only) tag
and restore full_deploy to run the full end-to-end flow. Specifically, move or
guard any resume-only tasks/paths currently under the full_deploy tag so they
only run under dashmate_deploy, ensure deploy_tags still defaults to full_deploy
and that ./bin/deploy receives that full_deploy behavior, and apply the same fix
to the other occurrences of the resume/debug block referenced elsewhere in the
file.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 68922182-13f5-411c-b8a4-1f8f4735624f

📥 Commits

Reviewing files that changed from the base of the PR and between a9f8fa2 and 02791c3.

📒 Files selected for processing (4)
  • ansible/deploy.yml
  • ansible/roles/dashmate/tasks/main.yml
  • ansible/roles/dashmate/tasks/ssl/letsencrypt.yml
  • ansible/roles/dashmate/tasks/ssl/zerossl.yml
🚧 Files skipped from review as they are similar to previous changes (1)
  • ansible/roles/dashmate/tasks/ssl/letsencrypt.yml

Comment on lines +121 to +127
- name: Update dashmate existence facts after user creation
ansible.builtin.set_fact:
dashmate_group_exists: true
dashmate_user_exists: true
when:
- not (skip_dashmate_image_update | default(false))
- not dashmate_user_exists
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, locate the file and check its size
find . -name "main.yml" -path "*/ansible/roles/dashmate/tasks/*" | head -5

Repository: dashpay/dash-network-deploy

Length of output: 110


🏁 Script executed:

# Read the relevant section with context
cat -n ansible/roles/dashmate/tasks/main.yml | sed -n '110,145p'

Repository: dashpay/dash-network-deploy

Length of output: 1462


Move the existence-fact update below the Docker-group task.

On a fresh node, the "Update dashmate existence facts after user creation" task (lines 121-127) sets dashmate_user_exists to true before the next task evaluates its condition. This causes Add dashmate user to docker group (lines 129-136) to be skipped because its condition when: not dashmate_user_exists becomes false. The new user account is created without Docker group membership, which breaks later dashmate update/start/restart operations.

Suggested fix
-- name: Update dashmate existence facts after user creation
-  ansible.builtin.set_fact:
-    dashmate_group_exists: true
-    dashmate_user_exists: true
-  when:
-    - not (skip_dashmate_image_update | default(false))
-    - not dashmate_user_exists
-
 - name: Add dashmate user to docker group
   ansible.builtin.user:
     name: 'dashmate'
     groups: docker
     append: true
   when:
     - not (skip_dashmate_image_update | default(false))
     - not dashmate_user_exists
+
+- name: Update dashmate existence facts after user creation
+  ansible.builtin.set_fact:
+    dashmate_group_exists: true
+    dashmate_user_exists: true
+  when:
+    - not (skip_dashmate_image_update | default(false))
+    - not dashmate_user_exists
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ansible/roles/dashmate/tasks/main.yml` around lines 121 - 127, The task
"Update dashmate existence facts after user creation" currently sets
dashmate_user_exists=true too early and causes the subsequent "Add dashmate user
to docker group" task (which checks when: not dashmate_user_exists) to be
skipped; move the entire ansible.builtin.set_fact block named "Update dashmate
existence facts after user creation" so it runs after the "Add dashmate user to
docker group" task (i.e., place the set_fact task below the Docker-group task)
so the docker-group task sees dashmate_user_exists as false and can add the new
user to the docker group before the existence facts are updated.

@vivekgsharma vivekgsharma requested a review from ktechmidas March 31, 2026 09:42
Copy link
Copy Markdown
Contributor

@dashinfraclaw dashinfraclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — @dashinfraclaw

Went through the full diff. This is solid CI hardening work. A few observations:

✅ Good stuff

  • Pinning ubuntu-22.04 + ansible-core==2.16.3 — essential for reproducibility. ubuntu-latest drifting is a classic CI footgun.
  • become_flags: -H on all dashmate tasks — correct fix. Without it, HOME stays as root's when using become_user: dashmate, which breaks npm/yarn/cargo that write to $HOME. This was probably causing intermittent CI failures.
  • Resume mode for existing configs — smart. Lets you re-run Ansible without re-terraforming when debugging mid-flow failures. The partial config detection (fail if 1 or 2 of 3 files exist) is a nice safety check.
  • Jinja .get(item, false) for spork lookups — fixes KeyError when a spork doesn't exist in the response. The old [item] would blow up on missing keys.
  • default(..., true) for quorum_list_password — the true param makes Jinja treat empty strings as undefined too, not just None. Good catch.
  • One-by-one HPMN collateral funding — correct fix for "Transaction too large" from batching 4000 DASH × N sends. Regular MN fee funding (1 DASH each) stays batched via sendmany, which makes sense since those are small.
  • Let's Encrypt retry (3x, 20s delay) — rate limits and transient DNS failures are real. Good addition.
  • Write key separation for configs repo push — clean approach with graceful fallback.

⚠️ Minor concerns

  1. EHF spork activation removed entirely (activate_dashd_sporks/main.yml lines deleted). Was SPORK_24_TEST_EHF intentionally dropped from all devnet flows, or just from the initial spork activation? If EHF is still needed later in the pipeline, this could silently break it.

  2. fund_one_collateral.yml uses sendtoaddress — this sends one tx per node and waits for confirmation each time. For 20+ HPMNs that's 20+ block-generation cycles. The PR description acknowledges this is intentional to avoid "Transaction too large", but worth noting it will significantly slow down HPMN-heavy devnet creation. Could be optimized later with small batches (e.g., 5 at a time) instead of strictly one-by-one.

  3. mn_names fact construction — the new dict2items/map pattern is safer than iterating a dict directly, but the else [] branch means a misconfigured mnlist would silently produce zero masternodes instead of failing. Might want a fail task if mn_names is empty and mnlist was expected.

  4. Config push step has if: always() — this means it runs even if the deploy failed. That's probably intentional (save partial progress), but could push incomplete/broken configs to the repo on a mid-deploy failure. The resume mode would then pick up those broken configs on retry. Consider gating on at least Terraform success.

  5. TF_CLI_ARGS_destroy: "-auto-approve" — fine for CI, just noting this means the destroy workflow has no confirmation step at all. The concurrency lock helps, but a manual approval gate on the destroy workflow might be worth considering.

Nit

  • The commit to generateAnsibleConfig.js sets quorum_list_password = dashmate_password — this means they're always identical. If that's the intent, the Jinja template default is redundant (but harmless as belt-and-suspenders).

Overall: good to go with the minor considerations above. The become_flags: -H fix alone is worth merging — that's been a known pain point.

@dashinfraclaw
Copy link
Copy Markdown
Contributor

🚨 Destroy Workflow — Testnet/Mainnet Protection Audit

Traced the full destroy path from the GitHub Actions workflow through → → . Here's the safety analysis:

Current protections

  1. Naming prefix: The workflow hardcodes , so any input gets the prefix. Entering testnet becomes devnet-testnet, not testnet. ✅
  2. Input validation: Rejects empty names and names starting with devnet- (to prevent devnet-devnet-...). ✅
  3. Config file gating: The workflow copies configs from dash-network-configs and fails if the .yml/.tfvars/.inventory triple doesn't exist. There's no devnet-testnet.* config set, so it would fail. ✅
  4. Terraform workspace isolation: init.sh parses the network name and selects a Terraform workspace matching it. devnet-* networks get their own workspace, separate from testnet/mainnet. ✅
  5. bin/destroy has a testnet safeguard: If NETWORK_NAME == "testnet", it requires typing a solemn oath. But this can't trigger from the workflow since the name always has devnet- prefix. ✅

🟡 Gaps that should be addressed

1. No input sanitization on special characters

The devnet_name input has zero character validation. Someone could enter:

  • Spaces, quotes, backticks → potential shell injection in the ./bin/destroy "$NETWORK_NAME" call
  • Very long strings → potential issues with Terraform workspace names or S3 keys

Suggested fix — add to the Validate devnet name step:

if [[ ! "$NAME" =~ ^[a-z0-9]([a-z0-9-]*[a-z0-9])?$ ]]; then
  echo "Error: devnet_name must be lowercase alphanumeric with hyphens only"
  exit 1
fi
if [[ ${#NAME} -gt 32 ]]; then
  echo "Error: devnet_name must be 32 characters or less"
  exit 1
fi

2. No explicit blocklist for protected network names

While the devnet- prefix prevents direct testnet/mainnet targeting, a belt-and-suspenders blocklist would make the intent explicit and survive refactors:

PROTECTED_NAMES=("testnet" "mainnet" "mainnet-support" "regtest")
for protected in "${PROTECTED_NAMES[@]}"; do
  if [[ "$NAME" == "$protected" || "devnet-$NAME" == "$protected" ]]; then
    echo "Error:  is a protected network name and cannot be used as a devnet name"
    exit 1
  fi
done

3. TF_CLI_ARGS_destroy: "-auto-approve" with no manual approval gate

The destroy workflow runs immediately on dispatch with no GitHub environment protection or approval step. Anyone with workflow_dispatch permission can destroy any devnet instantly. Consider:

  • Adding a GitHub Environment with required reviewers for the destroy workflow
  • Or at minimum, add a summary step with a pause/confirmation before actual destruction

4. The AWS credentials in secrets have no scope restriction

The same AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY are used for both create and destroy. If these credentials have access to testnet/mainnet Terraform state or EC2 instances, a modified workflow file in a PR branch could target them (workflow_dispatch runs on the selected branch's version of the workflow). Consider:

  • Using an IAM role that can only manage resources tagged Environment=devnet*
  • Or using OIDC-based auth with conditions on the workflow

Verdict

Testnet is safe from the current workflow as-is — the devnet- prefix, config file gating, and Terraform workspace isolation provide layered protection. But the gaps above mean the protection relies on convention rather than enforcement. The input sanitization fix is low-effort and should go in with this PR. The IAM scoping is a longer-term hardening item.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/destroy-devnet.yml:
- Around line 236-247: The current post-destroy git push suppresses failures by
echoing a warning and exiting 0, which can leave stale dash-network-configs
state; change the push error handling in the block that checks
EVO_APP_DEPLOY_WRITE_KEY (the GIT_SSH_COMMAND='ssh -i ~/.ssh/id_ed25519_write
...' git push and the fallback git push) to fail the job on push errors (return
non-zero) and surface the git error output instead of silencing it — i.e.,
remove the "exit 0" behavior and pipe or log the git stderr (or let git's exit
code bubble up) so the workflow fails on push failure and operators can see the
failure reason.
- Around line 54-65: The script validates character set for NAME but lacks a
length cap; add a check on the length of NAME so the resulting "devnet-$NAME"
does not exceed platform limits (e.g., 63 chars). Specifically, after the
character-regex check for NAME and before the reserved-name check, compute the
length of NAME (using ${`#NAME`}) and fail if it is greater than the allowed
maximum (e.g., 56 so "devnet-$NAME" ≤ 63); include a clear error message
referencing devnet-$NAME and NAME so long names are rejected early.
- Around line 213-216: The workflow enables non-interactive terraform destroy
via TF_IN_AUTOMATION and TF_CLI_ARGS_destroy which needs a manual approval gate;
update the destroy job in .github/workflows/destroy-devnet.yml to target a
protected GitHub Environment (e.g., set the job's environment to
"devnet-destroy") and configure that Environment to require reviewers or
approvals before the job runs, ensuring the destructive
TF_CLI_ARGS_destroy="-auto-approve" step cannot execute without explicit human
approval.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6b1e57fe-ea3e-48fa-a636-3f55706309b7

📥 Commits

Reviewing files that changed from the base of the PR and between 02791c3 and d5eb962.

📒 Files selected for processing (2)
  • .github/workflows/create-devnet.yml
  • .github/workflows/destroy-devnet.yml
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/create-devnet.yml

Comment on lines +54 to +65
if [[ ! "$NAME" =~ ^[a-z0-9][a-z0-9-]*$ ]]; then
echo "Error: devnet_name must be lowercase alphanumeric with optional hyphens"
exit 1
fi
if [[ "$NAME" == "testnet" || "$NAME" == "mainnet" || "$NAME" == mainnet-* ]]; then
echo "Error: reserved network names are not allowed in this workflow"
exit 1
fi
if [[ ! "devnet-$NAME" =~ ^devnet-[a-z0-9][a-z0-9-]*$ ]]; then
echo "Error: resulting network name is not a valid devnet name"
exit 1
fi
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add an explicit max-length check for devnet_name.

Line 54 validates characters, but there’s still no length bound. Very long names can pass here and fail later in Terraform/cloud naming constraints, causing partial/fragile destroy runs.

🔧 Suggested patch
           if [[ ! "$NAME" =~ ^[a-z0-9][a-z0-9-]*$ ]]; then
             echo "Error: devnet_name must be lowercase alphanumeric with optional hyphens"
             exit 1
           fi
+          if [[ ${`#NAME`} -gt 32 ]]; then
+            echo "Error: devnet_name must be 32 characters or fewer"
+            exit 1
+          fi
           if [[ "$NAME" == "testnet" || "$NAME" == "mainnet" || "$NAME" == mainnet-* ]]; then
             echo "Error: reserved network names are not allowed in this workflow"
             exit 1
           fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if [[ ! "$NAME" =~ ^[a-z0-9][a-z0-9-]*$ ]]; then
echo "Error: devnet_name must be lowercase alphanumeric with optional hyphens"
exit 1
fi
if [[ "$NAME" == "testnet" || "$NAME" == "mainnet" || "$NAME" == mainnet-* ]]; then
echo "Error: reserved network names are not allowed in this workflow"
exit 1
fi
if [[ ! "devnet-$NAME" =~ ^devnet-[a-z0-9][a-z0-9-]*$ ]]; then
echo "Error: resulting network name is not a valid devnet name"
exit 1
fi
if [[ ! "$NAME" =~ ^[a-z0-9][a-z0-9-]*$ ]]; then
echo "Error: devnet_name must be lowercase alphanumeric with optional hyphens"
exit 1
fi
if [[ ${`#NAME`} -gt 32 ]]; then
echo "Error: devnet_name must be 32 characters or fewer"
exit 1
fi
if [[ "$NAME" == "testnet" || "$NAME" == "mainnet" || "$NAME" == mainnet-* ]]; then
echo "Error: reserved network names are not allowed in this workflow"
exit 1
fi
if [[ ! "devnet-$NAME" =~ ^devnet-[a-z0-9][a-z0-9-]*$ ]]; then
echo "Error: resulting network name is not a valid devnet name"
exit 1
fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/destroy-devnet.yml around lines 54 - 65, The script
validates character set for NAME but lacks a length cap; add a check on the
length of NAME so the resulting "devnet-$NAME" does not exceed platform limits
(e.g., 63 chars). Specifically, after the character-regex check for NAME and
before the reserved-name check, compute the length of NAME (using ${`#NAME`}) and
fail if it is greater than the allowed maximum (e.g., 56 so "devnet-$NAME" ≤
63); include a clear error message referencing devnet-$NAME and NAME so long
names are rejected early.

Comment on lines +213 to 216
env:
TF_IN_AUTOMATION: "true"
TF_CLI_ARGS_destroy: "-auto-approve"
run: |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

destroy auto-approve needs a manual approval gate.

Line 215 enables non-interactive destroy approval. For a destructive workflow, this should be paired with an Environment requiring reviewers (or equivalent pause/approval) to prevent accidental teardown.

🔧 Suggested workflow guard
 jobs:
   destroy:
     name: Destroy Devnet
     runs-on: ubuntu-22.04
+    environment: devnet-destroy
     timeout-minutes: 60

Then configure required reviewers on the devnet-destroy GitHub Environment.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/destroy-devnet.yml around lines 213 - 216, The workflow
enables non-interactive terraform destroy via TF_IN_AUTOMATION and
TF_CLI_ARGS_destroy which needs a manual approval gate; update the destroy job
in .github/workflows/destroy-devnet.yml to target a protected GitHub Environment
(e.g., set the job's environment to "devnet-destroy") and configure that
Environment to require reviewers or approvals before the job runs, ensuring the
destructive TF_CLI_ARGS_destroy="-auto-approve" step cannot execute without
explicit human approval.

Comment on lines +236 to +247
# Use optional write key if configured; otherwise try default key.
if [[ -n "$EVO_APP_DEPLOY_WRITE_KEY" && -f "$HOME/.ssh/id_ed25519_write" ]]; then
GIT_SSH_COMMAND='ssh -i ~/.ssh/id_ed25519_write -o StrictHostKeyChecking=no' git push || {
echo "::warning::Failed to push config removal with EVO_APP_DEPLOY_WRITE_KEY"
exit 0
}
else
git push || {
echo "::warning::Failed to push config removal (likely read-only EVO_APP_DEPLOY_KEY). Configure secret EVO_APP_DEPLOY_WRITE_KEY with write access."
exit 0
}
fi
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Do not silently ignore config-repo push failures after destroy.

Lines 239-246 convert push failures to warnings + exit 0. That can leave stale dash-network-configs entries after successful infra destroy, creating state drift and follow-up operational risk.

🔧 Suggested behavior change
           if [[ -n "$EVO_APP_DEPLOY_WRITE_KEY" && -f "$HOME/.ssh/id_ed25519_write" ]]; then
             GIT_SSH_COMMAND='ssh -i ~/.ssh/id_ed25519_write -o StrictHostKeyChecking=no' git push || {
               echo "::warning::Failed to push config removal with EVO_APP_DEPLOY_WRITE_KEY"
-              exit 0
+              exit 1
             }
           else
             git push || {
               echo "::warning::Failed to push config removal (likely read-only EVO_APP_DEPLOY_KEY). Configure secret EVO_APP_DEPLOY_WRITE_KEY with write access."
-              exit 0
+              exit 1
             }
           fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/destroy-devnet.yml around lines 236 - 247, The current
post-destroy git push suppresses failures by echoing a warning and exiting 0,
which can leave stale dash-network-configs state; change the push error handling
in the block that checks EVO_APP_DEPLOY_WRITE_KEY (the GIT_SSH_COMMAND='ssh -i
~/.ssh/id_ed25519_write ...' git push and the fallback git push) to fail the job
on push errors (return non-zero) and surface the git error output instead of
silencing it — i.e., remove the "exit 0" behavior and pipe or log the git stderr
(or let git's exit code bubble up) so the workflow fails on push failure and
operators can see the failure reason.

@dashinfraclaw
Copy link
Copy Markdown
Contributor

Follow-up review — latest commit (13:12 UTC)

Nice, the safety checks from my earlier audit are now in. Checking the new additions:

✅ Input sanitization added

  • Lowercase alphanumeric + hyphens regex — good
  • Reserved name blocklist (testnet, mainnet, mainnet-*) — good
  • Double-validation (raw input + resulting devnet-NAME) — belt-and-suspenders, good
  • Applied to both create and destroy workflows — good

✅ DEVNET_ONLY_GUARD env var

Set on both workflows. Not currently consumed by any downstream script, but serves as a signal/assertion that these workflows are devnet-only. Could be checked in bin/deploy or bin/destroy as an additional guard in the future.

One small gap remaining

The blocklist catches testnet, mainnet, and mainnet-* but not regtest or mainnet-support (which exists in dash-network-configs). Probably fine since those would become devnet-regtest / devnet-mainnet-support which are harmless, but for completeness:

if [[ "$NAME" == "testnet" || "$NAME" == "mainnet" || "$NAME" == "regtest" || "$NAME" == mainnet-* ]]; then

Otherwise LGTM. The safety story is solid now.

Copy link
Copy Markdown
Contributor

@ktechmidas ktechmidas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vivekgsharma vivekgsharma merged commit 555e290 into v1.0-dev Mar 31, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants