
chore(helm/charts): import erpc from morpho-infra-helm#90

Open
rguichard wants to merge 5 commits into morpho-main from feature/pla-1455-move-app-related-deployment-config-into-application-repos

Conversation

rguichard (Collaborator) commented Jun 24, 2026

Summary

This PR brings the eRPC Helm chart and its production environment configuration into this repository, consolidating application-level deployment config alongside the application code rather than keeping it in a separate infrastructure repository. The migration covers the full production topology: multiple eRPC instances, HAProxy smart load balancers with Prometheus-driven weight and fallback controllers, Redis HA, and a CNPG-managed PostgreSQL cluster.

Two companion fixes are included: security hardening of the validation scripts based on an Aikido review, and bumping the eRPC and erpc-validator image tags to 0.1.3 to match the latest production values.

Changes

New Helm chart — helm/charts/erpc/

  • Base application chart with Deployment, HPA (autoscaling/v2), PDB, ClusterIP services (HTTP + metrics + optional headless), and ServiceAccounts.
  • Vault-based secret pipeline: a pre-install Job renders a config template from Vault using a built-in AWK renderer that resolves __SECRET_<KEY>__ placeholders and generates per-key auth strategies from API_KEY_* entries; an optional dedicated validation init-container (vault.validationImage) runs erpc validate before the secret is stored in Kubernetes.
  • CNPG PostgreSQL cluster (pg-erpc.yaml) with volume-snapshot or Barman S3 backup, custom autovacuum/checkpoint tuning, and a recovery health monitoring ConfigMap.
  • Security hardening gate (securityHardening.enabled) implementing the Pod Security Standards restricted profile (non-root, read-only rootfs, dropped capabilities, seccomp RuntimeDefault).
  • Local development support: docker-compose.yml, .env.example, and test-local.sh.
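The placeholder step of the secret pipeline can be illustrated with a minimal sketch. It assumes secrets arrive as `KEY=VALUE` lines in a file, which is a hypothetical format chosen for the example; the function name `render_template` and the file names are also illustrative. It covers only `__SECRET_<KEY>__` substitution, not the generation of auth strategies from `API_KEY_*` entries, and it is not the chart's actual renderer.

```shell
# Hypothetical helper: replace __SECRET_<KEY>__ placeholders in a template.
# $1 = secrets file with KEY=VALUE lines (assumed format), $2 = template file.
render_template() (
  awk '
    FILENAME == ARGV[1] {                 # first file: load the secrets
      eq = index($0, "=")
      if (eq > 1) secrets[substr($0, 1, eq - 1)] = substr($0, eq + 1)
      next
    }
    {                                     # second file: substitute placeholders
      line = $0
      out = ""
      while (match(line, /__SECRET_[A-Z0-9_]+__/)) {
        # strip the 9-char "__SECRET_" prefix and the 2-char "__" suffix
        key = substr(line, RSTART + 9, RLENGTH - 11)
        if (key in secrets) {
          repl = secrets[key]
        } else {
          repl = substr(line, RSTART, RLENGTH)   # leave unknown placeholders as-is
          missing = 1
        }
        out = out substr(line, 1, RSTART - 1) repl
        line = substr(line, RSTART + RLENGTH)
      }
      print out line
    }
    END { exit (missing ? 1 : 0) }        # non-zero if any placeholder was unresolved
  ' "$1" "$2"
)

# Demo with throwaway files; the key name and URL are made up.
demo_dir=$(mktemp -d)
printf 'RPC_KEY=abc123\n' > "$demo_dir/secrets.env"
printf 'url: https://example.invalid/__SECRET_RPC_KEY__/v2\n' > "$demo_dir/config.tpl"
render_template "$demo_dir/secrets.env" "$demo_dir/config.tpl"
rm -rf "$demo_dir"
```

Returning a non-zero status for unresolved placeholders is a design choice of this sketch: it lets a pre-install Job fail before a half-rendered config is stored.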

Production environment wrapper — helm/environments/prd/erpc/

  • Five eRPC sub-chart instances (erpc, erpc-dev, erpc-fallback, erpc-processing, erpc-router), each with independent Vault config paths, resource profiles, and per-instance runtime ServiceAccounts.
  • Three HAProxy deployments (erpc-haproxy, erpc-processing-haproxy, erpc-router-haproxy), each carrying two sidecar controllers: a weight-controller that adjusts per-pod backend weights using goroutine count, P99 latency, request rate, and CPU metrics; and a fallback-controller that gradually activates the hot-standby fallback pool on sustained degradation and immediately on total outage.
  • Redis HA (redis-ha + HAProxy + prometheus-redis-exporter) with Vault-managed password.
  • PrometheusRule with recording rules across histogram families, counters, and gauges; latency percentile rules; fallback traffic share rules; and alerting rules for high fallback traffic and actionable failure rates.
  • Vault secret materialization Jobs for Redis and DB secrets, with matching RBAC.

CI/CD and repo governance

  • .github/CODEOWNERS: platform-engineers and @0x666c6f as mandatory reviewers on all patterns including /helm/.
  • .github/workflows/helm-charts-validation.yaml: three-job pipeline — chart discovery, parallel helm lint + kubeconform validation (with database chart skip logic), summary gate.
  • .github/workflows/slack-pr-notification.yaml: Slack notification on merge to morpho-main, skipping bot authors.
  • .github/workflows/wiz-iac-scan.yml: Wiz CLI IaC scan triggered on PRs touching helm/.
  • validate-helm-charts.sh: local script mirroring CI validation, with GNU parallel support and sequential fallback.

Aikido security fixes

  • validate-helm-charts.sh: process_chart uses return instead of exit so the sequential loop always reaches the summary; single mktemp -d with unified cleanup via rm -rf; chart path passed directly to helm dependency update / helm template instead of cd-ing; helm template | kubeconform pipeline split for correct exit-code propagation.
  • test-local.sh: LABELS_SERVICE_API_KEY redacted when echoing the auth response; eth_getLogs block numbers corrected to a valid 100 000-block range on Arbitrum.
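The `return` versus `exit` fix is easiest to see in a reduced form. The sketch below is not the real script: `validate_one` is a hypothetical stand-in for the lint, template, and kubeconform steps, `run_sequential` is an illustrative name, and cleaning up through an `EXIT` trap is an assumption about how the single `rm -rf` might be wired.

```shell
work_dir=$(mktemp -d)              # one scratch directory for all intermediate files
trap 'rm -rf "$work_dir"' EXIT     # one cleanup path (assumed to be a trap)

# Hypothetical stand-in for helm lint / helm template / kubeconform.
validate_one() {
  echo "checked $1" > "$work_dir/$(basename "$1").log"
  case $1 in
    *broken*) return 1 ;;
    *) return 0 ;;
  esac
}

process_chart() {
  if ! validate_one "$1"; then
    echo "FAIL $1"
    return 1    # an `exit 1` here would end the whole script in sequential mode
  fi
  echo "PASS $1"
}

run_sequential() {
  failed=0
  for chart in "$@"; do
    process_chart "$chart" || failed=$((failed + 1))
  done
  echo "Summary: $failed of $# chart(s) failed"
  [ "$failed" -eq 0 ]
}

run_sequential charts/ok charts/broken charts/also-ok || echo "validation failed"
```

Because `process_chart` only returns, the loop still validates `charts/also-ok` after the failure and the summary line is always printed.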

Image version bump

  • values.yaml: erpc and erpc-validator image tags updated from 0.0.77 to 0.1.3.

Testing

Local validation script

# Install prerequisites: helm >= 3.16.4, kubeconform v0.6.1
./validate-helm-charts.sh          # parallel (requires GNU parallel)
./validate-helm-charts.sh -s       # sequential fallback

Both the base chart and the prd environment wrapper should report lint and kubeconform passes; erpc-db and any *postgres* charts are intentionally skipped by kubeconform.
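The skip rule described above could look roughly like the following. The function name `should_skip_validation` appears in the commit message, but its body here is a guess: matching on the chart directory's base name is an assumption, and the example paths are illustrative.

```shell
# Sketch of the kubeconform skip rule: database charts are linted but not
# schema-validated. Matching on the directory base name is an assumption.
should_skip_validation() {
  case $(basename "$1") in
    erpc-db|*postgres*) return 0 ;;   # skip kubeconform
    *) return 1 ;;                    # validate normally
  esac
}

for chart in helm/charts/erpc helm/charts/erpc-db helm/charts/my-postgres-cluster; do
  if should_skip_validation "$chart"; then
    echo "$chart: lint only (kubeconform skipped)"
  else
    echo "$chart: lint + kubeconform"
  fi
done
```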

CI

Push or open a PR touching helm/**. The Helm Chart Validation workflow runs chart discovery, parallel validation, and a summary job that fails the check if any chart fails.

Local Docker Compose (eRPC chart)

cd helm/charts/erpc
cp .env.example .env  # fill in API keys
docker-compose up -d
./test-local.sh

All seven test cases (health, Ethereum/Arbitrum/Base RPC, auth enforcement, eth_getLogs, DynamoDB note) should pass.

Edge cases to verify

  • A chart with dependencies: in Chart.yaml correctly triggers helm dependency update before template rendering.
  • process_chart failures in sequential mode do not abort validation of subsequent charts (the return fix).
  • Requests to eRPC without a secret are rejected when LABELS_SERVICE_API_KEY is configured; the rejection response does not echo the key back in test output.
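For the last edge case, one way to keep a secret out of test output is to mask it before echoing the response. This is a sketch under assumptions: `redact` is a hypothetical helper, the `[REDACTED]` marker and the sample response are made up, and the actual test-local.sh may redact differently.

```shell
# Hypothetical helper: print $1 with every literal occurrence of $2 masked.
# Uses only POSIX parameter expansion, so the secret is never treated as a regex.
redact() (
  text=$1
  secret=$2
  out=
  if [ -n "$secret" ]; then
    while :; do
      case $text in
        *"$secret"*)
          out="$out${text%%"$secret"*}[REDACTED]"   # keep text before the first match
          text=${text#*"$secret"}                   # continue after that match
          ;;
        *) break ;;
      esac
    done
  fi
  printf '%s\n' "$out$text"
)

response='{"error":"invalid secret: demo-key-123"}'
redact "$response" "demo-key-123"
# prints: {"error":"invalid secret: [REDACTED]"}
```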

Codex follow-up

  • Fixed agent-harness repo-map drift by making scripts/review-repo-map.sh use deterministic case-insensitive sorting: LC_ALL=C sort -f -u.
  • Fixed Aikido checkout-token findings by setting persist-credentials: false in Helm validation and Wiz IaC workflows.
  • Verified locally with ./scripts/agent-harness/update-repo-map.sh and make agent-check.

wiz-c998a0ef2b (Bot) commented Jun 24, 2026

Wiz Scan Summary

| Scanner | Findings |
| --- | --- |
| Vulnerabilities | - |
| Sensitive Data | - |
| Secrets | - |
| IaC Misconfigurations | - |
| SAST Findings | - |
| Software Management Findings | - |
| Total | - |

View scan details in Wiz

To detect these findings earlier in the dev lifecycle, try using Wiz Code VS Code Extension.

Comment thread validate-helm-charts.sh Outdated
Comment thread helm/charts/erpc/test-local.sh Outdated
Comment thread validate-helm-charts.sh Outdated
Comment thread validate-helm-charts.sh Outdated
@rguichard rguichard force-pushed the feature/pla-1455-move-app-related-deployment-config-into-application-repos branch from 3db6a38 to 631f9ad Compare June 24, 2026 08:59
Comment thread validate-helm-charts.sh
Comment thread helm/charts/erpc/test-local.sh Outdated
@rguichard rguichard force-pushed the feature/pla-1455-move-app-related-deployment-config-into-application-repos branch from 631f9ad to 7cad0d6 Compare June 24, 2026 13:13
Comment thread validate-helm-charts.sh Outdated
Comment thread .github/workflows/helm-charts-validation.yaml
Comment thread .github/workflows/pr-commands.yml Outdated
Comment thread .github/workflows/update-image-tag.yaml Outdated
Comment thread .github/CODEOWNERS
Comment thread helm/charts/erpc/values.yaml Outdated
@rguichard rguichard force-pushed the feature/pla-1455-move-app-related-deployment-config-into-application-repos branch from 7cad0d6 to 7cadc0e Compare June 24, 2026 13:33
Comment thread validate-helm-charts.sh Outdated
Comment thread validate-helm-charts.sh Outdated
Comment thread validate-helm-charts.sh
rguichard and others added 3 commits June 25, 2026 12:21
Apply the fixes from the per-finding review branches directly:

validate-helm-charts.sh:
- process_chart now returns instead of exiting, so the sequential loop
  validates every chart and reaches the summary (the parallel path is
  unaffected).
- Use a single `mktemp -d` dir (lint/template files) instead of an
  unused base temp file; clean up with one `rm -rf`.
- Pass the chart path to `helm dependency update` / `helm template`
  instead of cd-ing into the dir; drop the now-unused INITIAL_DIR.
- Split the `helm template | kubeconform` pipeline for readability
  (keeping the `if !` guard, which is pipefail-safe).

test-local.sh:
- Redact LABELS_SERVICE_API_KEY when echoing the auth response.
- Correct the eth_getLogs decimal block numbers (0x1254048f =
  307,496,079, 0x12558b2f = 307,596,079, range = 100,000).
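
The hex-to-decimal figures above can be checked with shell arithmetic. The variable names are illustrative only.

```shell
# Convert the two hex block numbers and compute the distance between them.
from_block=$((0x1254048f))
to_block=$((0x12558b2f))
echo "fromBlock=$from_block toBlock=$to_block span=$((to_block - from_block))"
# prints: fromBlock=307496079 toBlock=307596079 span=100000
```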

The Aikido suggestion to move should_skip_validation before `helm lint`
was intentionally not applied: lint is meant to run for all charts;
only kubeconform is skipped for database charts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- CODEOWNERS: add @0x666c6f as a mandatory codeowner alongside the
  teams, on every pattern (incl. /helm/) so he is a required reviewer
  repo-wide, not only for files matching the default `*` rule.
- values.yaml: bump erpc and erpc-validator image tags 0.0.77 -> 0.1.3
  to match the latest prd values in morpho-org/morpho-infra-helm.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@rguichard rguichard force-pushed the feature/pla-1455-move-app-related-deployment-config-into-application-repos branch from 6553bf2 to a8d49f2 Compare June 25, 2026 10:21
@rguichard rguichard requested a review from 0x666c6f June 25, 2026 10:21
rguichard (Collaborator, Author) commented

@0x666c6f do you know why agent-harness is failing? I ran ./scripts/agent-harness/update-repo-map.sh but the file was already up to date.

Comment thread .github/workflows/helm-charts-validation.yaml
Comment thread .github/workflows/wiz-iac-scan.yml
0x666c6f (Collaborator) commented Jun 29, 2026

Fixed in 7dd92d7.

Root cause: scripts/review-repo-map.sh used locale-sensitive sort -u. Linux CI sorted uppercase top-level files differently than the local macOS run, so make agent-check saw a generated review/repo-map.md diff even though it looked up to date locally.

Fix: force deterministic case-insensitive ordering with LC_ALL=C sort -f -u. I verified with:

./scripts/agent-harness/update-repo-map.sh
make agent-check
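
The ordering difference can be reproduced in isolation. The file names below are made up for the demo, and `sort_repo_map` is a hypothetical wrapper around the `LC_ALL=C sort -f -u` invocation named in the fix.

```shell
# Byte-wise C locale plus case folding gives the same order on Linux and macOS.
sort_repo_map() {
  LC_ALL=C sort -f -u
}

printf '%s\n' docs README.md apps Makefile docs | sort_repo_map
# prints:
# apps
# docs
# Makefile
# README.md

# For comparison, the C locale without -f puts uppercase names first:
printf '%s\n' docs README.md apps Makefile docs | LC_ALL=C sort -u
# prints:
# Makefile
# README.md
# apps
# docs
```

With a plain `sort -u`, the result depends on the caller's locale, which is how the Linux CI run and a local macOS run can disagree on the same input.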

