Fix production promotion environment secret access#754
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
WalkthroughThis PR inlines the promote-to-production job gated by ChangesProduction Promotion Environment and Workflow
Sequence Diagram(s)sequenceDiagram
participant CallerJob
participant SetupProductionEnvironment
participant ContainerRegistry
participant ProductionDeployment
participant HealthCheckComposite
participant CreateGitHubRelease
CallerJob->>SetupProductionEnvironment: validate CPLN_TOKEN_PRODUCTION & required secrets
SetupProductionEnvironment->>ContainerRegistry: capture production/staging images metadata
CallerJob->>ContainerRegistry: copy staging image -> production
CallerJob->>ProductionDeployment: deploy copied image (optional release-phase)
ProductionDeployment->>HealthCheckComposite: request health status polls
HealthCheckComposite-->>CallerJob: report success/failure
CallerJob->>CreateGitHubRelease: create release when promotion succeeds
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.controlplane/readme.md:
- Around line 96-107: The troubleshooting docs mention ruling out a same-named
repository or organization secret for CPLN_TOKEN_PRODUCTION but the example
commands only check the environment and repository scopes; update the
.controlplane/readme.md section referencing CPLN_TOKEN_PRODUCTION to add an
organization-level check (use the GitHub CLI org secret listing for the org)
alongside the existing environment and repository gh secret commands so
operators can detect an org secret that would mask a missing environment secret.
In @.github/workflows/cpflow-promote-staging-to-production.yml:
- Around line 14-35: The four hard-coded env vars (HEALTH_CHECK_RETRIES,
HEALTH_CHECK_INTERVAL, ROLLBACK_READINESS_RETRIES, ROLLBACK_READINESS_INTERVAL)
need to be wired to repository variables so overrides work; update each to read
from vars with a fallback default (e.g., change HEALTH_CHECK_RETRIES to use ${{
vars.HEALTH_CHECK_RETRIES || '24' }}, HEALTH_CHECK_INTERVAL to ${{
vars.HEALTH_CHECK_INTERVAL || '15' }}, and similarly for
ROLLBACK_READINESS_RETRIES and ROLLBACK_READINESS_INTERVAL) while keeping the
same numeric defaults and names so the workflow honors repo-level overrides.
In `@bin/test-cpflow-github-flow`:
- Around line 93-100: The current check only ensures a non-empty checkout ref
but doesn't verify the step actually uses actions/checkout or that it checks out
shakacode/control-plane-flow; update the validation around
checkout_step/checkout_ref to also read checkout_step["uses"] and
checkout_step.fetch("with", {})["repo"] and abort unless checkout_step["uses"]
includes "actions/checkout" and the repo equals "shakacode/control-plane-flow"
(or explicitly fail if repo is missing/incorrect), leaving refs[checkout_ref]
update only after those checks pass.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: cefcc2db-590b-4374-a39c-fd81c48c2a12
📒 Files selected for processing (6)
.controlplane/readme.md.controlplane/shakacode-team.md.github/cpflow-help.md.github/workflows/cpflow-promote-staging-to-production.ymlbin/pin-cpflow-github-refbin/test-cpflow-github-flow
Greptile SummaryThis PR solves a GitHub Actions limitation — environment secrets are not forwarded to called cross-repo reusable workflows — by replacing the single
Confidence Score: 4/5The core architectural change is sound — environment secrets are now properly gated by the The promotion workflow itself is well-structured: the environment gate is in the right place, rollback captures state before deployment, and the
Important Files Changed
Reviews (1): Last reviewed commit: "Fix production promotion environment sec..." | Re-trigger Greptile |
| text = File.read(path) | ||
| updated = text | ||
| .gsub(%r{(uses:\s+shakacode/control-plane-flow/\.github/workflows/[^@\s]+@)[^\s]+}, "\\1#{ref}") | ||
| .gsub(%r{(repository:\s+shakacode/control-plane-flow\n\s+ref:\s+)[^\s]+}, "\\1#{ref}") |
There was a problem hiding this comment.
Fragile multi-line regex — silent miss if lines are not adjacent
The pattern (repository:\s+shakacode/control-plane-flow\n\s+ref:\s+)[^\s]+ requires ref: to appear on the very next line after repository: shakacode/control-plane-flow with no intervening comment or blank line. If anyone inserts a comment (e.g. # pinned) between those two YAML keys, the regex will not match, the file will be silently skipped, and pin-cpflow-github-ref will report success without updating the production checkout ref. The only detection is running test-cpflow-github-flow afterward and noticing the refs-diverged abort.
|
|
||
| promote_path = ".github/workflows/cpflow-promote-staging-to-production.yml" | ||
| promote_doc = YAML.load_file(promote_path, aliases: true) | ||
| promote_job = promote_doc.fetch("jobs", {}).fetch("promote-to-production") |
There was a problem hiding this comment.
Bare
Hash#fetch raises an opaque Ruby exception if the job is renamed
promote_doc.fetch("jobs", {}).fetch("promote-to-production") (without a default or block) raises KeyError: key not found: promote-to-production instead of a clean abort message. If the job is ever renamed the test will still fail, but the error gives no hint that the script needs to be updated to match the new job name. Using fetch("promote-to-production") { abort "..." } or || {} with a follow-up check would produce a developer-friendly message consistent with the rest of the script.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| - name: Roll back on failure | ||
| if: failure() && steps.capture-current.outputs.rollback_state != '' && steps.capture-current.outputs.rollback_state != '{}' |
There was a problem hiding this comment.
"Wait for rollback readiness" runs even when rollback itself failed
Both if: conditions are identical — failure() && rollback_state != '' && rollback_state != '{}' — so "Wait for rollback readiness" runs whenever any earlier step failed, including when "Roll back on failure" itself exits 1. Because the rollback step has no id, there is no way to gate the readiness poll on rollback success. The step will poll all workloads regardless of whether their images were actually reverted, and may emit misleading "ready" status for workloads that still carry the broken image. Adding an id: rollback to the step and including steps.rollback.outcome == 'success' in the condition would make the intent explicit.
🚀 Quick Review App CommandsWelcome! Here are the commands you can use in this PR:
|
Code Review: Fix production promotion environment secret accessOverviewThis PR correctly addresses a fundamental GitHub Actions limitation: environment secrets from a caller's protected environment are not passed through to cross-repo reusable workflows. The solution — inlining the production promotion job with Security ✅
Issues FoundBug — null image captured into rollback state (line 248 of workflow) workload_image="$(echo "${workload_json}" | jq -r '.spec.containers[0].image')"No null guard here. If a workload has no containers (e.g., a job-only workload), Suspicious Misleading summary when deploy fails before health-check (workflow line 513) Redundant env var in Fragile regex in Minor Nits
SummaryThe core approach is architecturally sound and the secret-scoping fix is correct. Two items worth addressing before merge: the missing |
| [[ -n "${workload_name}" ]] || continue | ||
|
|
||
| workload_json="$(cpln workload get "${workload_name}" --gvc "${PRODUCTION_APP_NAME}" --org "${CPLN_ORG_PRODUCTION}" -o json)" | ||
| workload_image="$(echo "${workload_json}" | jq -r '.spec.containers[0].image')" |
There was a problem hiding this comment.
Missing null guard — unlike the staging capture step (line 302) which uses // empty, this will output the literal string "null" if a workload has no containers. That string then propagates into the rollback state and the subsequent [[ -z "${selected_image}" ]] check won't catch it, leaving a stale "null" image as the rollback target.
| workload_image="$(echo "${workload_json}" | jq -r '.spec.containers[0].image')" | |
| workload_image="$(echo "${workload_json}" | jq -r '.spec.containers[0].image // empty')" |
| working_directory: .cpflow | ||
| cpln_cli_version: ${{ vars.CPLN_CLI_VERSION }} | ||
| cpflow_version: ${{ vars.CPFLOW_VERSION }} | ||
| control_plane_flow_ref: shakacode/control-plane-flow/.github/workflows/cpflow-promote-staging-to-production.yml@v5.0.4 |
There was a problem hiding this comment.
This looks like a carry-over from the deleted uses: ref. The value is a full workflow path (…/cpflow-promote-staging-to-production.yml@v5.0.4) rather than a simple tag or owner/repo@ref. If cpflow-setup-environment expects either format for version validation/logging, this should be verified against that action's action.yml — passing an unexpected format could be silently ignored or cause a confusing error if the action tries to parse it.
| { | ||
| echo "## Promotion Summary" | ||
| echo | ||
| if [[ "${HEALTHY}" == "true" ]]; then |
There was a problem hiding this comment.
When deploy-image fails before the health-check step runs, steps.health-check.outputs.healthy is empty (the step was skipped). The else branch fires and prints ❌ Status: deployment failed — accidentally correct, but a skipped step and an explicit false result are indistinguishable here. Consider making the three states explicit:
| if [[ "${HEALTHY}" == "true" ]]; then | |
| if [[ "${HEALTHY}" == "true" ]]; then | |
| echo "✅ Status: deployment successful" | |
| elif [[ "${HEALTHY}" == "false" ]]; then | |
| echo "❌ Status: health check failed" | |
| else | |
| echo "❌ Status: deployment failed before health check" | |
| fi |
| env: | ||
| GH_REPO: ${{ github.repository }} | ||
| GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
| GITHUB_RUN_ID: ${{ github.run_id }} |
There was a problem hiding this comment.
GITHUB_RUN_ID is already a default environment variable injected by GitHub Actions into every step — no need to re-declare it.
| GITHUB_RUN_ID: ${{ github.run_id }} | |
| GH_REPO: ${{ github.repository }} | |
| GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} | |
| STAGING_APP_NAME: ${{ needs.promote-to-production.outputs.staging_app_name }} | |
| PRODUCTION_APP_NAME: ${{ needs.promote-to-production.outputs.production_app_name }} |
| text = File.read(path) | ||
| updated = text | ||
| .gsub(%r{(uses:\s+shakacode/control-plane-flow/\.github/workflows/[^@\s]+@)[^\s]+}, "\\1#{ref}") | ||
| .gsub(%r{(repository:\s+shakacode/control-plane-flow\n\s+ref:\s+)[^\s]+}, "\\1#{ref}") |
There was a problem hiding this comment.
This regex assumes ref: is the line immediately after repository: in the with: block. If the fields are ever reordered (e.g., path: moved above ref: in the checkout step), the pattern silently produces no match and the production checkout ref is left at the old value with no warning. Consider adding a post-update verification — e.g., grep the file for the old ref after the substitution loop and abort if any remain.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a7a1005c20
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| abort "#{promote_path}:promote-to-production must pin the Checkout control-plane-flow actions step" | ||
| end | ||
|
|
||
| refs[checkout_ref] << "#{promote_path}:promote-to-production" |
There was a problem hiding this comment.
Check the setup action ref too
When the production workflow is repinned, this consistency check only adds the checkout step's ref to refs, but the workflow also has a second pin in cpflow-setup-environment's control_plane_flow_ref. If that setup ref is left on an older tag while the checkout/ref wrappers are updated, bin/test-cpflow-github-flow still reports a single ref even though production promotion can later fail in repos that set CPFLOW_VERSION, because the setup action validates that input against the checked-out control-plane-flow tag. Please include the setup action's control_plane_flow_ref in the same ref set.
Useful? React with 👍 / 👎.
Code Review: Fix production promotion environment secret accessOverviewWell-reasoned architectural change. Because GitHub does not expose caller-repo environment secrets to cross-repo reusable workflows, production promotion has to run as a direct job in this repo. The PR correctly implements this with appropriate security defaults, parallel rollback polling, and variable-scoped permissions. Strengths
Issues1. Ruby
|
| warn "Error: app '#{app}' is not defined under `apps:` in `.controlplane/controlplane.yml`." | ||
| warn " Fix the PRODUCTION_APP_NAME repository variable or add the app to controlplane.yml." |
There was a problem hiding this comment.
GitHub Actions only parses ::error:: and ::warning:: annotations from stdout. Ruby's warn writes to stderr, so this message won't appear as a job annotation in the Actions UI when PRODUCTION_APP_NAME refers to an undefined app.
| warn "Error: app '#{app}' is not defined under `apps:` in `.controlplane/controlplane.yml`." | |
| warn " Fix the PRODUCTION_APP_NAME repository variable or add the app to controlplane.yml." | |
| $stdout.puts "::error::app '#{app}' is not defined under `apps:` in `.controlplane/controlplane.yml`. Fix the PRODUCTION_APP_NAME repository variable or add the app to controlplane.yml." |
One $stdout.puts with an ::error:: prefix covers both the annotation and the explanation, and keeps the pattern consistent with the puts "::error::..." calls below.
| - name: Wait for rollback readiness | ||
| if: failure() && steps.capture-current.outputs.rollback_state != '' && steps.capture-current.outputs.rollback_state != '{}' |
There was a problem hiding this comment.
This step shares the same if: condition as the rollback step above. When the rollback step exits 1 early — for example because the JSON state couldn't be parsed (line 389) — this step still runs and polls workloads that were never actually rolled back, producing misleading output during a production incident.
Consider adding an id: rollback to the "Roll back on failure" step and extending this condition:
- name: Wait for rollback readiness
if: failure() && steps.rollback.outcome != 'failure' && steps.capture-current.outputs.rollback_state != '' && steps.capture-current.outputs.rollback_state != '{}'That way readiness polling only runs when the rollback step actually attempted to update workloads.
|
|
||
| CONTROL_PLANE_FLOW_WORKFLOW = %r{\Ashakacode/control-plane-flow/\.github/workflows/[^@\s]+@([^\s]+)\z} | ||
| PROMOTE_WORKFLOW = %r{\Ashakacode/control-plane-flow/\.github/workflows/cpflow-promote-staging-to-production\.yml@[^\s]+\z} | ||
| EXPECTED_CPFLOW_CHECKOUT_ACTION = "actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd" |
There was a problem hiding this comment.
This SHA is hardcoded and won't be updated by bin/pin-cpflow-github-ref, which only targets shakacode/control-plane-flow refs. When the workflow's actions/checkout SHA is bumped (e.g. for a security patch), this constant will also need a manual update — and there's nothing in the tooling to prompt that.
Consider either:
- Adding a comment here noting it must be kept in sync with the
uses:line in the workflow manually, or - Extending
bin/pin-cpflow-github-refto accept an--actions-checkout-shaoption that updates both the workflow YAML and this constant.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is ON. A cloud agent has been kicked off to fix the reported issue.
Reviewed by Cursor Bugbot for commit a7a1005. Configure here.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@bin/test-cpflow-github-flow`:
- Around line 129-134: The abort message when setup_ref fails PROMOTE_WORKFLOW
should be more actionable: update the abort call in the block that reads
setup_ref = setup_step.fetch("with", {})["control_plane_flow_ref"] (and uses
PROMOTE_WORKFLOW and promote_path) to include the expected fully-qualified
workflow ref format (for example:
owner/repo/.github/workflows/<workflow>.yml@<version>) so the error clearly
shows a concrete example of the required .../.github/workflows/...yml@<version>
pinned ref.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6df16c81-86ed-4297-bf04-1d91e13e5a79
📒 Files selected for processing (2)
.github/workflows/cpflow-promote-staging-to-production.ymlbin/test-cpflow-github-flow
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/cpflow-promote-staging-to-production.yml
Code ReviewOverviewThis PR correctly addresses a real GitHub Actions constraint: environment secrets are not exposed to cross-repo reusable workflows, so Strengths
Issues1.
2. Hardcoded
checkout_action = Array(promote_job["steps"]).find { |s| s["name"] == "Checkout control-plane-flow actions" }&.fetch("uses", nil)
# then compare checkout_step["uses"] == checkout_action rather than a constantOr at minimum add a comment directing maintainers to update this constant when 3. Mixed stdout/stderr in Ruby error messages In the 4. Promotion summary is misleading on early failures The if [[ "${HEALTHY}" == "true" ]]; then
echo "✅ Status: deployment successful"
elif [[ -z "${HEALTHY}" ]]; then
echo "❌ Status: deployment failed before health check"
else
echo "❌ Status: deployment failed health check"
fiMinor Notes
VerdictThe architectural change is sound and the implementation handles the failure path more carefully than the reusable workflow it replaces. Address the |
| CONTROL_PLANE_FLOW_WORKFLOW = %r{\Ashakacode/control-plane-flow/\.github/workflows/[^@\s]+@([^\s]+)\z} | ||
| PROMOTE_WORKFLOW = %r{\Ashakacode/control-plane-flow/\.github/workflows/cpflow-promote-staging-to-production\.yml@[^\s]+\z} | ||
| PROMOTE_WORKFLOW = %r{\Ashakacode/control-plane-flow/\.github/workflows/cpflow-promote-staging-to-production\.yml@([^\s]+)\z} | ||
| EXPECTED_CPFLOW_CHECKOUT_ACTION = "actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd" |
There was a problem hiding this comment.
This SHA must stay manually in sync with the actions/checkout ref in the workflow file — if pin-cpflow-github-ref (or a manual edit) updates the checkout SHA in the workflow, this constant needs a matching update or the test will fail with a confusing mismatch message.
Consider deriving the expected value from the workflow file itself so there's a single source of truth:
| EXPECTED_CPFLOW_CHECKOUT_ACTION = "actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd" | |
| # Read from the workflow file so this stays in sync automatically. | |
| EXPECTED_CPFLOW_CHECKOUT_ACTION = begin | |
| require "yaml" | |
| promote = YAML.load_file(".github/workflows/cpflow-promote-staging-to-production.yml", aliases: true) | |
| step = Array(promote.dig("jobs", "promote-to-production", "steps")).find { |s| s["name"] == "Checkout control-plane-flow actions" } | |
| step&.fetch("uses") or abort "Could not find Checkout control-plane-flow actions step in promotion workflow" | |
| end |
Or at minimum add an inline comment: # Keep in sync with the checkout SHA in cpflow-promote-staging-to-production.yml.
| staging_image="${staging_image_ref}" | ||
| fi | ||
|
|
||
| echo "image=${staging_image}" >> "$GITHUB_OUTPUT" |
There was a problem hiding this comment.
staging_image is derived from a CPLN API response via jq -r, which strips JSON quoting but preserves any embedded newlines in the value. Writing it with a bare echo means a newline in the image ref could inject additional key=value pairs into GITHUB_OUTPUT or silently truncate the value.
The rollback_state capture a few steps earlier already handles this correctly with a randomized heredoc delimiter. Apply the same pattern here for consistency and safety:
| echo "image=${staging_image}" >> "$GITHUB_OUTPUT" | |
| delim="EOF_$(openssl rand -hex 8)" | |
| { | |
| echo "image<<${delim}" | |
| echo "${staging_image}" | |
| echo "${delim}" | |
| } >> "$GITHUB_OUTPUT" |
Same applies to current_image and current_version in the Capture current production image step (lines 279–280).
| puts "::error::PRIMARY_WORKLOAD is not configured and app '#{app}' has multiple workloads: #{workloads.join(', ')}." | ||
| warn " Set the PRIMARY_WORKLOAD repository variable to one of these workloads." |
There was a problem hiding this comment.
GitHub Actions workflow commands (::error::) are processed from stdout, but warn writes to stderr. The hint on line 173 will appear in the raw runner log but won't be attached to the annotation — operators checking the Actions UI will only see the first line.
Consolidate into a single puts call so the full message is captured as one annotation:
| puts "::error::PRIMARY_WORKLOAD is not configured and app '#{app}' has multiple workloads: #{workloads.join(', ')}." | |
| warn " Set the PRIMARY_WORKLOAD repository variable to one of these workloads." | |
| puts "::error::PRIMARY_WORKLOAD is not configured and app '#{app}' has multiple workloads: #{workloads.join(', ')}. Set the PRIMARY_WORKLOAD repository variable to one of these workloads." |
| # via the HEALTH_CHECK_ACCEPTED_STATUSES repo variable to tighten this for apps that | ||
| # expose a dedicated health endpoint (e.g. "200" for a plain /health, or "200 401 403" | ||
| # for apps that auth-gate / without redirecting). | ||
| HEALTH_CHECK_ACCEPTED_STATUSES: ${{ vars.HEALTH_CHECK_ACCEPTED_STATUSES || '200 301 302' }} |
There was a problem hiding this comment.
The 200 301 302 default is well-explained in the comment, but a redirect from a maintenance page or an auth-gate passes this check even when the app is genuinely broken. The recommendation to expose /health is buried in a comment — consider making this more visible by emitting a ::warning:: at runtime when the default value is in use:
if [[ "${HEALTH_CHECK_ACCEPTED_STATUSES}" == "200 301 302" ]]; then
echo "::warning::Using default HEALTH_CHECK_ACCEPTED_STATUSES (200 301 302). Consider exposing a /health endpoint and setting HEALTH_CHECK_ACCEPTED_STATUSES=200 in repository variables."
fiThis surfaces the guidance precisely when it matters rather than requiring operators to read the workflow source.
| run: | | ||
| set -euo pipefail | ||
|
|
||
| staging_vars="$(CPLN_TOKEN="${CPLN_TOKEN_STAGING}" cpln gvc get "${STAGING_APP_NAME}" --org "${CPLN_ORG_STAGING}" -o json | jq -r '.spec.env // [] | .[].name' | sort)" |
There was a problem hiding this comment.
This parity check only compares GVC-level environment variables (.spec.env on the GVC object). Workload-level env vars configured directly on individual workloads — rather than the GVC — are not captured by cpln gvc get. Teams that configure env vars at the workload level would get a false "all clear" here. A comment noting this scope limitation would prevent future confusion:
# Note: this check covers GVC-level env vars only. Workload-level env vars
# (set on individual workloads rather than the GVC) are not compared here.| if [[ "${staging_image_ref}" == /org/*/image/* ]]; then | ||
| staging_image="${staging_image_ref##*/image/}" | ||
| elif [[ "${staging_image_ref}" == *.registry.cpln.io/* ]]; then | ||
| staging_image="${staging_image_ref#*.registry.cpln.io/}" | ||
| else | ||
| staging_image="${staging_image_ref}" | ||
| fi |
There was a problem hiding this comment.
The else fallback silently passes the raw reference through as-is. If the image format is neither /org/*/image/* nor *.registry.cpln.io/*, cpflow copy-image-from-upstream --image will receive the full reference string, which may fail with an opaque error rather than a clear diagnostic. Consider adding an explicit warning:
if [[ "${staging_image_ref}" == /org/*/image/* ]]; then
staging_image="${staging_image_ref##*/image/}"
elif [[ "${staging_image_ref}" == *.registry.cpln.io/* ]]; then
staging_image="${staging_image_ref#*.registry.cpln.io/}"
else
echo "::warning::Unrecognized image reference format '${staging_image_ref}'; passing through as-is."
staging_image="${staging_image_ref}"
fi| create-github-release: | ||
| needs: promote-to-production | ||
| if: needs.promote-to-production.result == 'success' | ||
| runs-on: ubuntu-latest |
There was a problem hiding this comment.
The create-github-release job has no timeout-minutes. The default GitHub Actions timeout is 6 hours, so a hung gh release create (e.g. transient API outage) would block the workflow slot for hours after the production deploy has already succeeded. A short timeout is appropriate here:
create-github-release:
needs: promote-to-production
if: needs.promote-to-production.result == 'success'
runs-on: ubuntu-latest
timeout-minutes: 5
permissions:| status_name="${workload_name//\//_}" | ||
| status_file="${status_dir}/${status_name}" | ||
| if [[ ! -f "${status_file}" ]] || [[ "$(<"${status_file}")" != "ready" ]]; then | ||
| echo "::warning::Workload '${workload_name}' did not report ready after rollback." |
There was a problem hiding this comment.
The "wait for rollback readiness" step emits only ::warning:: for workloads that never become ready — it does not exit non-zero. Since this step already executes in the failure() path, the overall job outcome is already a failure, so the warning won't silently hide a broken rollback. However, an operator scanning the step list might see a green checkmark on this step and believe the rollback completed cleanly. Consider adding a comment explaining the intent, or tracking a failure count and exiting non-zero so the step icon itself reflects the unready state.
Code Review — PR #754: Fix production promotion environment secret accessOverviewThis PR addresses a real and important GitHub Actions limitation: environment secrets set on a protected The 550-line net addition is large but warranted: the inline job replicates and extends the upstream workflow's full promotion pipeline with local control. Security ✅
Code Quality ✅
Issues (see inline comments)
None of these are blockers, but the What's Well Done
Overall this is a high-quality, well-reasoned change. The inline comments above are suggestions for hardening, not blocking issues. |
✅ Review App DeletedReview app for PR #754 is deleted |

Summary
environment: productionshakacode/control-plane-flow@v5.0.4actions into.cpflowso the job can still reuse shared promotion logicCPLN_TOKEN_PRODUCTIONstays environment-scoped and production ref pinning stays consistentRelated upstream generator PR: shakacode/control-plane-flow#353
Verification
git diff --check -- .github/workflows/cpflow-promote-staging-to-production.yml .github/cpflow-help.md .controlplane/shakacode-team.md .controlplane/readme.md bin/test-cpflow-github-flow bin/pin-cpflow-github-refbin/conductor-exec bin/test-cpflow-github-flow.cpflowcheckout/setup refsNote
High Risk
Changes the live production deploy path, token scoping, and rollback behavior; misconfiguration could block promotion or leave production in a bad state during failed deploys.
Overview
Production promotion no longer delegates to the cross-repo control-plane-flow reusable workflow. The cpflow-promote-staging-to-production workflow now runs a caller-owned job with
environment: production, checks out pinned v5.0.4 actions into.cpflow, and inlines the full promote pipeline (validation, staging→production image copy, deploy, health checks, multi-workload rollback, and a follow-up GitHub release job).Docs (.controlplane/readme.md, shakacode-team.md, .github/cpflow-help.md) explain why
CPLN_TOKEN_PRODUCTIONmust stay on the protected environment only (not repo/org secrets), warn against moving promotion back behind a reusable workflow, and addgh secret listtroubleshooting.bin/pin-cpflow-github-refandbin/test-cpflow-github-flownow pin and assert the production checkout/setup refs and forbid the old reusable-workflow promotion pattern.Reviewed by Cursor Bugbot for commit 393b9c3. Bugbot is set up for automated code reviews on this repo. Configure here.
Summary by CodeRabbit
Documentation
Refactor
Tests / Chores