[WIP] Add daily egress connectivity allow list generation#289
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds OpenSnitch-based egress capture: new workflows and CI targets to setup OpenSnitch on landing VMs, collect journald logs, analyze connections into an allowlist, and automate publishing/updating Changes
Sequence DiagramsequenceDiagram
participant Trigger as User/Scheduler
participant GH as GitHub Actions
participant E2E as e2e-deployment
participant VM as Landing Zone VM
participant OS as OpenSnitch
participant Collector as collect_egress_logs.sh
participant Analyzer as analyze_egress.py
participant Repo as GitHub Repo/API
Trigger->>GH: dispatch / schedule egress-capture
GH->>E2E: call workflow (connected / disconnected) with inputs
E2E->>VM: run setup_opensnitch.sh (if capture enabled)
VM->>OS: install & start opensnitch
VM->>VM: run workload/tests
GH->>Collector: run collect_egress_logs.sh via job
Collector->>VM: ssh journalctl -> gzip -> save artifact
GH->>Analyzer: download artifacts and run analyze_egress.py
Analyzer-->>GH: generated connected/disconnected YAML
GH->>Repo: run scripts/egress/update_allowlist.sh (diff & PR flow)
alt diffs found
Repo->>Repo: create/update branch, commit, open PR
else no changes
GH-->>Trigger: exit clean
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Tarball created: |
|
/test capture-infra |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/e2e-deployment.yml (1)
469-470:⚠️ Potential issue | 🟡 Minor
skip-cleanupis declared as aworkflow_callinput but only honored forworkflow_dispatch.The cleanup guard (here and on Line 916 for the disconnected job) only checks
github.event_name != 'workflow_dispatch', so when this workflow is invoked viaworkflow_call(e.g., from.github/workflows/egress-capture.yml), theskip-cleanupinput value is effectively ignored — cleanup always runs. It's harmless today because the egress-capture workflow passesskip-cleanup: false, but any future caller that sets it totruewill be surprised.Apply the same condition to both call-style events:
- if: always() && (github.event_name != 'workflow_dispatch' || inputs.skip-cleanup != true) + if: always() && (!contains(fromJSON('["workflow_dispatch", "workflow_call"]'), github.event_name) || inputs.skip-cleanup != true)(And correspondingly for the "Cleanup skipped notice" step on Line 497 and the matching pair on lines 916/943.)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/e2e-deployment.yml around lines 469 - 470, The cleanup guard currently only checks for workflow_dispatch so inputs.skip-cleanup from workflow_call is ignored; update the if condition on the "Cleanup infrastructure" step and the paired "Cleanup skipped notice" step (and the matching disconnected-job steps) to also respect workflow_call by changing the condition to: always() && ( (github.event_name != 'workflow_dispatch' && github.event_name != 'workflow_call') || inputs.skip-cleanup != true ); apply the same replacement for both occurrences (the main job and the disconnected job) so inputs.skip-cleanup is honored for workflow_call invocations.
🧹 Nitpick comments (6)
scripts/egress/update_allowlist.sh (2)
19-22: Log paths are CWD-dependent; the ENCLAVE_DIR anchor isn't used.
CONNECTED_LOGS/DISCONNECTED_LOGSare relative (egress-logs/connected/...) and rely on the workflow's CWD being the repo root.$ALLOWLISTuses$ENCLAVE_DIRalready — consider anchoring log paths similarly (or honor anEGRESS_LOGS_DIRenv var) so a manual invocation from a different CWD doesn't silently fall into the "no logs" branch and emit an empty allow list on top of the real one.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/egress/update_allowlist.sh` around lines 19 - 22, CONNECTED_LOGS and DISCONNECTED_LOGS are relative and should be anchored like ALLOWLIST to avoid CWD dependency; change their assignments in the script to build absolute paths (e.g., use an EGRESS_LOGS_DIR env var defaulting to "${ENCLAVE_DIR}/egress-logs" and then set CONNECTED_LOGS="${EGRESS_LOGS_DIR}/connected/opensnitch-connections.json.gz" and DISCONNECTED_LOGS="${EGRESS_LOGS_DIR}/disconnected/opensnitch-connections.json.gz"), and ensure subsequent logic uses those variables and checks file existence before falling back to emitting an empty allowlist.
62-71:printf '%b'will corrupt diffs that contain backslashes.
SUMMARYembeds${DIFF}which can legitimately contain\(filenames, escaped chars in added strings).printf '%b'then interprets those as escape sequences — e.g.\t,\n,\0xx— garbling the rendered diff in both the step summary and the PR comment.Prefer building the summary with plain newlines and emitting with
%s:♻️ Proposed change
-SUMMARY="### Egress allow list changes\n\`\`\`diff\n${DIFF}\n\`\`\`" - -# Always print to step summary -printf '%b\n' "$SUMMARY" >> "${GITHUB_STEP_SUMMARY:-/dev/null}" +SUMMARY=$(cat <<EOF +### Egress allow list changes +\`\`\`diff +${DIFF} +\`\`\` +EOF +) + +# Always print to step summary +printf '%s\n' "$SUMMARY" >> "${GITHUB_STEP_SUMMARY:-/dev/null}" @@ - gh pr comment "$PR_NUMBER" --body "$(printf '%b' "$SUMMARY")" + gh pr comment "$PR_NUMBER" --body "$SUMMARY"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/egress/update_allowlist.sh` around lines 62 - 71, The SUMMARY is printed with printf '%b' which interprets backslashes in ${DIFF} and corrupts diffs; change all uses of printf '%b' "$SUMMARY" to printf '%s\n' "$SUMMARY" (for the redirect to GITHUB_STEP_SUMMARY) and change the PR comment call to either pass the raw variable (--body "$SUMMARY") or use printf '%s' "$SUMMARY" when building the body so backslashes are treated literally; update the two occurrences around SUMMARY/DIFF (the printf redirect and the gh pr comment invocation that currently uses printf '%b') to use %s instead..github/workflows/egress-capture.yml (2)
86-91: Invoke the script viabashto avoid relying on the executable bit.If the file's executable bit is ever lost in a repo edit (e.g., contributors on Windows or applying a patch that drops mode), this step will fail with "permission denied". Explicitly using
bashis more robust:- run: scripts/egress/update_allowlist.sh + run: bash scripts/egress/update_allowlist.sh🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/egress-capture.yml around lines 86 - 91, The workflow step named "Analyze and update allow list" currently runs the script directly, which can fail if the executable bit is lost; update the step to invoke the script with bash (call scripts/egress/update_allowlist.sh via bash) so it doesn’t rely on file mode. Keep the same environment variables (DRY_RUN, PR_NUMBER, GH_TOKEN) and simply change the run invocation to use bash to execute scripts/egress/update_allowlist.sh.
30-32:checks: writeat workflow level appears unused.The
update-allowlistjob overrides permissions locally, and the reusablee2e-deployment.ymlruns under its own permission scope. Consider droppingchecks: writehere (or tightening tocontents: readonly) to follow least-privilege.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/egress-capture.yml around lines 30 - 32, Remove the redundant workflow-level "checks: write" permission in the permissions block and leave only the minimum required (e.g., "contents: read"); the "update-allowlist" job already sets its own permissions and the reusable "e2e-deployment.yml" runs under its own scope, so rely on those job-level scopes instead of granting checks: write globally. Update the permissions block by deleting the "checks: write" entry (or change it to read-only if you prefer) so the workflow follows least-privilege.scripts/egress/analyze_egress.py (1)
93-94: Defaultingprototo"tcp"silently mislabels UDP when the field is absent.If an opensnitchd MESSAGE ever omits
proto=, this records the entry astcp— which would be wrong for DNS/NTP-style UDP traffic and could silently produce an incorrect firewall allow list. The header comment at Line 39 also claimsproto=is always present; if that's the case, prefer to skip (returnNone) the entry rather than guess, so you can catch parser drift early.♻️ Proposed change
- proto = fields.get("proto", "tcp").lower() - host = fields.get("host", "") + if "proto" not in fields: + return None + proto = fields["proto"].lower() + host = fields.get("host", "")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/egress/analyze_egress.py` around lines 93 - 94, The parser currently defaults proto to "tcp" with proto = fields.get("proto", "tcp").lower(), which mislabels entries when proto is omitted; change this so the code first checks for the presence of the "proto" key on the fields dict and, if it's missing or empty, return None/skip the entry (instead of guessing "tcp") to surface parser drift; update the parsing routine that builds host/proto (the block referencing fields.get("proto", ...) and host = fields.get("host", "")) to remove the default and explicitly handle the missing proto case by returning None, and ensure any callers of that parsing function handle a None result.scripts/egress/collect_egress_logs.sh (1)
40-57: Silent remote errors produce misleading "empty log" outcomes.Two minor points on robustness:
- Line 43:
journalctl -u opensnitchd -o json --no-pager 2>/dev/null | gziphides real errors (unit not installed, permission denied). Combined with pipefail semantics,gzipwill still succeed on empty input and you get a valid empty.gz+ a benign "No OpenSnitch log entries found" warning — making it hard to distinguish "daemon captured nothing" from "daemon wasn't running". Consider keeping stderr or at leastsudo journalctland checking the remote exit code separately (e.g., run journalctl to a remote tmp file, check rc, then gzip).- Line 55–57: if
/var/log/opensnitchd.logdoesn't exist, the redirect still creates a 0-byteopensnitchd.log.gz(not a valid gzip). Only write the file when the remote actually has content.♻️ Illustrative fix for the optional log
-# Also grab the opensnitchd log file if it exists, compressed -# shellcheck disable=SC2086 -ssh $SSH_OPTS "$LZ_SSH" \ - "gzip -c /var/log/opensnitchd.log 2>/dev/null || true" \ - > "${OUTPUT_DIR}/opensnitchd.log.gz" +# Also grab the opensnitchd log file if present (best-effort; skip if missing). +# shellcheck disable=SC2086 +if ssh $SSH_OPTS "$LZ_SSH" "test -s /var/log/opensnitchd.log"; then + # shellcheck disable=SC2086 + ssh $SSH_OPTS "$LZ_SSH" "gzip -c /var/log/opensnitchd.log" \ + > "${OUTPUT_DIR}/opensnitchd.log.gz" +fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/egress/collect_egress_logs.sh` around lines 40 - 57, Remote commands currently hide errors and create empty/invalid gzip outputs; modify the journalctl and opensnitchd logfile fetch so the remote exit status and file size are checked before gzipping/transferring. For the journalctl step (uses SSH_OPTS, LZ_SSH, LOCAL_LOG and the journalctl -u opensnitchd ... | gzip pipeline) run journalctl on the remote into a temporary file, capture its exit code, and only gzip that temp file and stream it back if journalctl succeeded and the file is non-empty; log or surface the journalctl exit code when it fails. For the /var/log/opensnitchd.log fetch (produces opensnitchd.log.gz under OUTPUT_DIR) test the remote file existence/size (e.g., test -s /var/log/opensnitchd.log) and only gzip and stream it when non-empty, otherwise avoid creating a 0-byte/invalid .gz locally; ensure stderr is not silently discarded so permission/unit errors are visible.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/slash-command.yml:
- Around line 231-243: workflow_for_name() is missing mappings for the new
dispatch targets "capture-egress" and "capture-infra", so the /cancel command
cannot route cancellations to those long-running workflows; update the
workflow_for_name() function to include entries for "capture-egress" and
"capture-infra" that return the correct workflow file names (eg.
egress-capture.yml and infra-verify.yml or the canonical workflow identifiers
used elsewhere) so that /cancel capture-egress and /cancel capture-infra resolve
to the dispatched workflows.
In `@scripts/egress/setup_opensnitch.sh`:
- Around line 36-62: The remote setup runs as the unprivileged user (LZ_SSH) but
uses privileged operations (dnf, writing /etc/opensnitchd/default-config.json,
systemctl enable --now opensnitchd), so the script will fail; update the SSH
invocation or commands so they run with elevated privileges: either change ssh
$SSH_OPTS "$LZ_SSH" "bash -s" <<'EOF' to ssh $SSH_OPTS "$LZ_SSH" "sudo bash -s"
<<'EOF' (so the whole heredoc runs as root) or keep the current ssh call and run
privileged steps with sudo and safe file writes (e.g., use sudo dnf install -y
opensnitch, write the JSON via sudo tee /etc/opensnitchd/default-config.json >
/dev/null <<'CONF', and use sudo systemctl enable --now opensnitchd); ensure the
JSON block still uses the here-doc marker default-config.json and that sudo is
applied to all commands that modify system state.
In `@scripts/egress/testdata/sample.json`:
- Around line 1-11: The fixture scripts/egress/testdata/sample.json is
newline-delimited JSON (JSONL) so CI JSON validators flag it as malformed;
rename it to sample.jsonl and update the test harness to reference the new name,
or alternatively add an explicit exclusion for this fixture in the validation
step. Specifically, rename sample.json → sample.jsonl (or add a skip rule), and
modify test_analyze_egress.sh to load sample.jsonl (or add logic to skip/ignore
sample.json when running JSON validation), ensuring the test runner and any
validation config are updated to the new filename or exclusion.
In `@scripts/egress/update_allowlist.sh`:
- Around line 75-112: The branch/PR creation can collide on reruns and existing
open PRs; update the BRANCH naming and PR creation logic so branch names are
unique per run and existing PRs are re-used: change
BRANCH="egress-allowlist/$(date +%Y-%m-%d)" to include the run id (e.g., use
GITHUB_RUN_ID or another unique per-run token) so git checkout -b and git push
won't fail on reruns, and before running gh pr create query GitHub for an
existing open PR matching the egress-allowlist/* pattern (using gh pr list or gh
pr view with the branch pattern) and if found update that PR by pushing to its
branch instead of creating a new one; keep the existing git add/commit/push and
only call gh pr create when no matching open PR exists.
---
Outside diff comments:
In @.github/workflows/e2e-deployment.yml:
- Around line 469-470: The cleanup guard currently only checks for
workflow_dispatch so inputs.skip-cleanup from workflow_call is ignored; update
the if condition on the "Cleanup infrastructure" step and the paired "Cleanup
skipped notice" step (and the matching disconnected-job steps) to also respect
workflow_call by changing the condition to: always() && ( (github.event_name !=
'workflow_dispatch' && github.event_name != 'workflow_call') ||
inputs.skip-cleanup != true ); apply the same replacement for both occurrences
(the main job and the disconnected job) so inputs.skip-cleanup is honored for
workflow_call invocations.
---
Nitpick comments:
In @.github/workflows/egress-capture.yml:
- Around line 86-91: The workflow step named "Analyze and update allow list"
currently runs the script directly, which can fail if the executable bit is
lost; update the step to invoke the script with bash (call
scripts/egress/update_allowlist.sh via bash) so it doesn’t rely on file mode.
Keep the same environment variables (DRY_RUN, PR_NUMBER, GH_TOKEN) and simply
change the run invocation to use bash to execute
scripts/egress/update_allowlist.sh.
- Around line 30-32: Remove the redundant workflow-level "checks: write"
permission in the permissions block and leave only the minimum required (e.g.,
"contents: read"); the "update-allowlist" job already sets its own permissions
and the reusable "e2e-deployment.yml" runs under its own scope, so rely on those
job-level scopes instead of granting checks: write globally. Update the
permissions block by deleting the "checks: write" entry (or change it to
read-only if you prefer) so the workflow follows least-privilege.
In `@scripts/egress/analyze_egress.py`:
- Around line 93-94: The parser currently defaults proto to "tcp" with proto =
fields.get("proto", "tcp").lower(), which mislabels entries when proto is
omitted; change this so the code first checks for the presence of the "proto"
key on the fields dict and, if it's missing or empty, return None/skip the entry
(instead of guessing "tcp") to surface parser drift; update the parsing routine
that builds host/proto (the block referencing fields.get("proto", ...) and host
= fields.get("host", "")) to remove the default and explicitly handle the
missing proto case by returning None, and ensure any callers of that parsing
function handle a None result.
In `@scripts/egress/collect_egress_logs.sh`:
- Around line 40-57: Remote commands currently hide errors and create
empty/invalid gzip outputs; modify the journalctl and opensnitchd logfile fetch
so the remote exit status and file size are checked before
gzipping/transferring. For the journalctl step (uses SSH_OPTS, LZ_SSH, LOCAL_LOG
and the journalctl -u opensnitchd ... | gzip pipeline) run journalctl on the
remote into a temporary file, capture its exit code, and only gzip that temp
file and stream it back if journalctl succeeded and the file is non-empty; log
or surface the journalctl exit code when it fails. For the
/var/log/opensnitchd.log fetch (produces opensnitchd.log.gz under OUTPUT_DIR)
test the remote file existence/size (e.g., test -s /var/log/opensnitchd.log) and
only gzip and stream it when non-empty, otherwise avoid creating a
0-byte/invalid .gz locally; ensure stderr is not silently discarded so
permission/unit errors are visible.
In `@scripts/egress/update_allowlist.sh`:
- Around line 19-22: CONNECTED_LOGS and DISCONNECTED_LOGS are relative and
should be anchored like ALLOWLIST to avoid CWD dependency; change their
assignments in the script to build absolute paths (e.g., use an EGRESS_LOGS_DIR
env var defaulting to "${ENCLAVE_DIR}/egress-logs" and then set
CONNECTED_LOGS="${EGRESS_LOGS_DIR}/connected/opensnitch-connections.json.gz" and
DISCONNECTED_LOGS="${EGRESS_LOGS_DIR}/disconnected/opensnitch-connections.json.gz"),
and ensure subsequent logic uses those variables and checks file existence
before falling back to emitting an empty allowlist.
- Around line 62-71: The SUMMARY is printed with printf '%b' which interprets
backslashes in ${DIFF} and corrupts diffs; change all uses of printf '%b'
"$SUMMARY" to printf '%s\n' "$SUMMARY" (for the redirect to GITHUB_STEP_SUMMARY)
and change the PR comment call to either pass the raw variable (--body
"$SUMMARY") or use printf '%s' "$SUMMARY" when building the body so backslashes
are treated literally; update the two occurrences around SUMMARY/DIFF (the
printf redirect and the gh pr comment invocation that currently uses printf
'%b') to use %s instead.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 09a29c23-70ed-4efc-bf72-22c87a356c04
📒 Files selected for processing (13)
.github/workflows/e2e-deployment.yml.github/workflows/egress-capture.yml.github/workflows/infra-verify.yml.github/workflows/slash-command.ymlMakefile.cidocs/EGRESS_ALLOWLIST.yamlscripts/egress/analyze_egress.pyscripts/egress/collect_egress_logs.shscripts/egress/setup_opensnitch.shscripts/egress/test_analyze_egress.shscripts/egress/testdata/expected.yamlscripts/egress/testdata/sample.jsonscripts/egress/update_allowlist.sh
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
scripts/egress/analyze_egress.py (1)
143-152: Optional: consider quotingdnsvalues or usingyaml.safe_dumpfor defensive rendering.The current hand-rolled rendering works for well-formed FQDNs (alnum +
-+.), which is what OpenSnitch will realistically emit. However, a malformed/unexpectedhost=token (e.g. stray whitespace truncated by\S+, or a value that happens to match a YAML reserved token likeyes/no/nullin YAML 1.1 consumers) would round-trip as the wrong type. Since downstream consumers read this via YAML, emittingdns: "<value>"(quoted) or switching toyaml.safe_dumpwould remove the sharp edge at near-zero cost.Not blocking — current format matches the existing test fixture and the expected input domain.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/egress/analyze_egress.py` around lines 143 - 152, The render_yaml function currently emits dns values unquoted which can cause YAML parsing/type issues for unexpected values; update render_yaml to either wrap the dns value in quotes when appending (e.g., emit dns: "<value>") or replace the hand-rolled formatter with yaml.safe_dump for the entries so all strings are safely quoted/escaped; change the implementation of render_yaml (the function name render_yaml) to apply one of these fixes and ensure protocol and port remain rendered as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/egress/analyze_egress.py`:
- Around line 49-54: The is_excluded(ip_str: str) function currently returns
False on ipaddress.ip_address ValueError which allows malformed IPs to bypass
the RFC1918/loopback filter; change the exception handling in is_excluded so
that on ValueError it treats the input as excluded (return True) to fail-safe on
parse errors (reference is_excluded and EXCLUDED_NETWORKS), and optionally add a
brief debug/log message inside the except block to record the parse failure for
troubleshooting.
---
Nitpick comments:
In `@scripts/egress/analyze_egress.py`:
- Around line 143-152: The render_yaml function currently emits dns values
unquoted which can cause YAML parsing/type issues for unexpected values; update
render_yaml to either wrap the dns value in quotes when appending (e.g., emit
dns: "<value>") or replace the hand-rolled formatter with yaml.safe_dump for the
entries so all strings are safely quoted/escaped; change the implementation of
render_yaml (the function name render_yaml) to apply one of these fixes and
ensure protocol and port remain rendered as before.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c7044cdc-34bc-4260-b704-abe70207a452
📒 Files selected for processing (5)
.github/workflows/slash-command.ymlscripts/egress/analyze_egress.pyscripts/egress/setup_opensnitch.shscripts/egress/test_analyze_egress.shscripts/egress/testdata/sample.jsonl
✅ Files skipped from review due to trivial changes (2)
- scripts/egress/testdata/sample.jsonl
- scripts/egress/setup_opensnitch.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- scripts/egress/test_analyze_egress.sh
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/e2e-deployment.yml:
- Around line 24-26: The reusable-workflow input skip-cleanup is never honored
because the cleanup condition only checks for workflow_dispatch; update the
cleanup conditionals in both the connected and disconnected job steps (and the
other two occurrences) to also allow workflow_call and respect
inputs.skip-cleanup by changing checks like github.event_name ==
'workflow_dispatch' to (github.event_name == 'workflow_dispatch' ||
github.event_name == 'workflow_call') and ensure the step additionally verifies
inputs.skip-cleanup != 'true' (or the equivalent expression used elsewhere) so
that when skip-cleanup: true is passed to the workflow_call the cleanup steps
are skipped and the skipped-cleanup notice is shown.
- Around line 145-150: The workflow_call branch currently sets PLUGIN and writes
both storage_plugins and storage_plugins_connected to that value; modify this so
that when PLUGIN == "odf" you still set storage_plugins to "odf" but override
storage_plugins_connected to "lvms" (i.e., detect PLUGIN in the workflow_call
branch and write storage_plugins_connected=["lvms"] for odf) so the
connected-mode invariant remains pinned to lvms; update the echo lines that emit
storage_plugins and storage_plugins_connected accordingly in the workflow_call
branch where PLUGIN is defined.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0155869c-5e29-4377-a744-73bd4d8573fe
📒 Files selected for processing (1)
.github/workflows/e2e-deployment.yml
0f54efb to
4cfc786
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/e2e-deployment.yml:
- Around line 190-191: The job-level output is building the artifact name from
env.ENCLAVE_CLUSTER_NAME which is set via setup-cluster-name into $GITHUB_ENV
and therefore not available to job outputs; instead, have the upload step
(egress_upload) emit the computed artifact name using $GITHUB_OUTPUT (e.g., echo
"artifact-name=egress-logs-connected-${ENCLAVE_CLUSTER_NAME}-${GITHUB_RUN_ID}"
>> $GITHUB_OUTPUT) and then change the job output to reference
steps.egress_upload.outputs.artifact-name (and do the same for the disconnected
artifact/upload step), ensuring you keep the step ID egress_upload and the
exported key artifact-name consistent.
In @.github/workflows/egress-capture.yml:
- Around line 60-94: The workflow currently always runs the update-allowlist job
(if: always() && !cancelled()) which can produce empty allowlist PRs when e2e
jobs fail early and artifacts are missing; modify the workflow or the script to
prevent that: either tighten the job condition (update the if on
update-allowlist to ensure at least one of run-e2e-connected or
run-e2e-disconnected succeeded or that artifacts exist) or change
scripts/egress/update_allowlist.sh to detect missing/empty
egress-logs/connected/* and egress-logs/disconnected/* and abort (or preserve
existing sections) when expected logs are absent and DRY_RUN is false; reference
the update-allowlist job, the actions/download-artifact steps, the DRY_RUN env
and scripts/egress/update_allowlist.sh to locate the code to change.
- Around line 11-13: Remove the temporary trigger files and their workflow
reference: delete the two temp files named .github/TRIGGER_CAPTURE_E2E and
.github/TRIGGER_CAPTURE_INFRA (they are marked "# TEMPORARY — delete before
merging") and update the workflow egress-capture.yml by removing the
pull_request.paths entry that references '.github/TRIGGER_CAPTURE_E2E' (or
revert the commit that added these files); ensure no remaining workflow
references point to those filenames before merging.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 252f3582-1d0f-4d7f-8ff1-cf698c43e649
📒 Files selected for processing (6)
.github/TRIGGER_CAPTURE_E2E.github/TRIGGER_CAPTURE_INFRA.github/workflows/e2e-deployment.yml.github/workflows/egress-capture.yml.github/workflows/infra-verify.ymlscripts/egress/analyze_egress.py
✅ Files skipped from review due to trivial changes (2)
- .github/TRIGGER_CAPTURE_E2E
- .github/TRIGGER_CAPTURE_INFRA
🚧 Files skipped from review as they are similar to previous changes (2)
- scripts/egress/analyze_egress.py
- .github/workflows/infra-verify.yml
| pull_request: | ||
| paths: | ||
| - '.github/TRIGGER_CAPTURE_E2E' |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check whether the trigger file is referenced anywhere else and whether docs explain it.
rg -nP 'TRIGGER_CAPTURE_E2E' -C2
fd -a TRIGGER_CAPTURE_E2ERepository: rh-ecosystem-edge/enclave
Length of output: 51
🏁 Script executed:
# View the egress-capture.yml file around lines 7-31 to see the complete trigger configuration
cat -n .github/workflows/egress-capture.yml | head -40Repository: rh-ecosystem-edge/enclave
Length of output: 1528
🏁 Script executed:
# Check if slash-command.yml exists and examine how it dispatches workflows
fd -type f slash-command.ymlRepository: rh-ecosystem-edge/enclave
Length of output: 240
🏁 Script executed:
# Search for any documentation or comments about the TRIGGER_CAPTURE_E2E mechanism
rg -i 'trigger.*capture' -A2 -B2Repository: rh-ecosystem-edge/enclave
Length of output: 51
🏁 Script executed:
# Find slash-command.yml
fd slash-command.ymlRepository: rh-ecosystem-edge/enclave
Length of output: 51
🏁 Script executed:
# Check if the TRIGGER_CAPTURE_E2E file exists anywhere
find .github -type f -name "*TRIGGER*" 2>/dev/nullRepository: rh-ecosystem-edge/enclave
Length of output: 127
🏁 Script executed:
# Look for any documentation or README related to this workflow
find .github -name "README*" -o -name "*.md" | xargs grep -l "TRIGGER_CAPTURE" 2>/dev/null || echo "No matches found"Repository: rh-ecosystem-edge/enclave
Length of output: 86
🏁 Script executed:
# Search for any references to this workflow in documentation or comments
rg "egress.capture\|TRIGGER_CAPTURE" --type md --type yaml -A3 -B3Repository: rh-ecosystem-edge/enclave
Length of output: 51
🏁 Script executed:
# Check if these trigger files are tracked in git and their contents
git ls-files .github/TRIGGER_CAPTURE_*Repository: rh-ecosystem-edge/enclave
Length of output: 127
🏁 Script executed:
# View the contents and properties of these trigger files
ls -la .github/TRIGGER_CAPTURE_E2E .github/TRIGGER_CAPTURE_INFRA
cat .github/TRIGGER_CAPTURE_E2E | head -20
cat .github/TRIGGER_CAPTURE_INFRA | head -20Repository: rh-ecosystem-edge/enclave
Length of output: 293
🏁 Script executed:
# Check git history to understand when these files were added
git log --oneline --all -- .github/TRIGGER_CAPTURE_E2E | head -10
git log --oneline --all -- .github/TRIGGER_CAPTURE_INFRA | head -10Repository: rh-ecosystem-edge/enclave
Length of output: 201
🏁 Script executed:
# Look for references to these trigger files in workflows or scripts
rg "TRIGGER_CAPTURE" --type yaml --type md --type sh -B3 -A3Repository: rh-ecosystem-edge/enclave
Length of output: 51
🏁 Script executed:
# Check the git diff or status to see if these temporary files are staged for commit
git status --short .github/TRIGGER_CAPTURE_*
git diff --cached .github/TRIGGER_CAPTURE_*
git diff .github/TRIGGER_CAPTURE_*Repository: rh-ecosystem-edge/enclave
Length of output: 51
🏁 Script executed:
# Look at the specific commit to understand the intent
git show 4cfc786 --statRepository: rh-ecosystem-edge/enclave
Length of output: 17915
🏁 Script executed:
# Check if there are any other workflows using file-based triggers
fd workflow.*.yml workflow.yml
rg "paths:" .github/workflows/ -A2 -B1 | grep -A2 "TRIGGER"Repository: rh-ecosystem-edge/enclave
Length of output: 323
These temporary test files must be deleted before merging.
The trigger files .github/TRIGGER_CAPTURE_E2E and .github/TRIGGER_CAPTURE_INFRA are explicitly marked as # TEMPORARY — delete before merging in their contents, and the commit message (4cfc786) clearly states "DELETE THIS COMMIT BEFORE MERGING (revert or interactive rebase)."
While the pull_request trigger mechanism on .github/TRIGGER_CAPTURE_E2E is intentional for testing purposes, these files should not be committed to the main branch. Either delete the files and their corresponding trigger configuration, or revert the commit before merging.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/egress-capture.yml around lines 11 - 13, Remove the
temporary trigger files and their workflow reference: delete the two temp files
named .github/TRIGGER_CAPTURE_E2E and .github/TRIGGER_CAPTURE_INFRA (they are
marked "# TEMPORARY — delete before merging") and update the workflow
egress-capture.yml by removing the pull_request.paths entry that references
'.github/TRIGGER_CAPTURE_E2E' (or revert the commit that added these files);
ensure no remaining workflow references point to those filenames before merging.
There was a problem hiding this comment.
I will delete this files/workflow before merging. They're there since slash commands added in this PR are not yet available.
|
Tarball created: |
6da28e7 to
7edefa9
Compare
4372410 to
1e132da
Compare
eb4d35f to
363d567
Compare
363d567 to
6027724
Compare
|
Tarball created: |
Egress allow list changes--- /home/github-runner/action-runners/runner-3/_work/enclave/enclave/docs/EGRESS_ALLOWLIST.yaml 2026-05-05 11:56:03.408888261 -0500
+++ /tmp/tmp.WmkUtyfeBI 2026-05-05 11:56:09.309915470 -0500
@@ -0,0 +1,187 @@
+# Egress Connectivity Allow List
+#
+# This list documents the external network destinations contacted during a full
+# Enclave Lab deployment, based on OpenSnitch traffic captures from CI runs.
+# Auto-generated by scripts/egress/analyze_egress.py — do not edit manually.
+#
+# Each entry shows:
+# dns: Destination hostname
+# port: Destination port
+# protocol: tcp or udp
+# source: Which tools initiated connections (ansible, pip, podman, etc.)
+#
+# IMPORTANT: This list serves as a reference baseline, not a fixed specification.
+# Actual destinations may vary due to:
+# - CDN/mirror selection based on geography and load balancing
+# - Package repository infrastructure changes
+# - Upstream dependency updates
+#
+# Use this as a starting point for firewall rules, but monitor and adjust based
+# on your specific deployment environment and requirements.
+egress:
+ - dns: access.cdn.redhat.com
+ port: 443
+ protocol: tcp
+ source:
+ - ansible
+ - dns: ansible-galaxy-ng.s3.dualstack.us-east-1.amazonaws.com
+ port: 443
+ protocol: tcp
+ source:
+ - ansible
+ - dns: api.launchpad.net
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: api.openshift.com
+ port: 443
+ protocol: tcp
+ source:
+ - oc-mirror
+ - dns: cdn.amazonlinux.com
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: cdn.redhat.com
+ port: 443
+ protocol: tcp
+ source:
+ - dnf
+ - pip
+ - dns: cdn01.quay.io
+ port: 443
+ protocol: tcp
+ source:
+ - oc
+ - oc-mirror
+ - podman
+ - dns: deb.debian.org
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: developers.redhat.com
+ port: 443
+ protocol: tcp
+ source:
+ - ansible
+ - dns: files.pythonhosted.org
+ port: 443
+ protocol: tcp
+ source:
+ - pip
+ - dns: galaxy.ansible.com
+ port: 443
+ protocol: tcp
+ source:
+ - ansible
+ - dns: github.com
+ port: 443
+ protocol: tcp
+ source:
+ - ansible
+ - dns: linux.oracle.com
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: mirror.openshift.com
+ port: 443
+ protocol: tcp
+ source:
+ - ansible
+ - oc-mirror
+ - dns: nvd.nist.gov
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: osv-vulnerabilities.storage.googleapis.com
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: packages.broadcom.com
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: packages.us-west-2.amazonaws.com
+ port: 80
+ protocol: tcp
+ source:
+ - clair
+ - dns: packages.vmware.com
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: pypi.org
+ port: 443
+ protocol: tcp
+ source:
+ - pip
+ - dns: quay.io
+ port: 443
+ protocol: tcp
+ source:
+ - oc
+ - oc-mirror
+ - podman
+ - dns: registry.access.redhat.com
+ port: 443
+ protocol: tcp
+ source:
+ - oc-mirror
+ - dns: registry.connect.redhat.com
+ port: 443
+ protocol: tcp
+ source:
+ - oc-mirror
+ - dns: registry.redhat.io
+ port: 443
+ protocol: tcp
+ source:
+ - oc-mirror
+ - dns: release-assets.githubusercontent.com
+ port: 443
+ protocol: tcp
+ source:
+ - ansible
+ - dns: repo.us-west-2.amazonaws.com
+ port: 80
+ protocol: tcp
+ source:
+ - clair
+ - dns: secdb.alpinelinux.org
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: security-metadata.canonical.com
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: security-tracker.debian.org
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: security.access.redhat.com
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: subscription.rhsm.redhat.com
+ port: 443
+ protocol: tcp
+ source:
+ - other
+ - dns: support.novell.com
+ port: 443
+ protocol: tcp
+ source:
+ - clair |
6027724 to
132a79e
Compare
|
Tarball created: |
|
Tarball created: |
98ed6df to
b7ac440
Compare
|
Tarball created: |
|
Tarball created: |
ddeb701 to
eefc1fa
Compare
|
Tarball created: |
eefc1fa to
a85094c
Compare
|
Tarball created: |
Introduces an optional egress capture capability that builds and maintains docs/EGRESS_ALLOWLIST.yaml — a list of outbound connections (DNS name, protocol, port) made by the Landing Zone VM during a full deployment. Normal CI runs are completely unaffected. How it works: OpenSnitch runs on the Landing Zone in accept-all + log mode. After deployment, logs are collected, RFC1918 and loopback destinations are filtered out, and the remainder is rendered as a sorted YAML allow list. The daily egress-capture workflow calls the existing e2e-deployment workflow for both connected and disconnected modes with capture enabled, then diffs the result against the current allow list. If anything changed, it opens a PR automatically. For iterating on the capture tooling itself without waiting for a full e2e run, /test capture-infra triggers the same pipeline through infra-verify. To preview allow list changes on a PR branch, /test capture-egress runs a dry-run and posts the diff as a comment. https://redhat.atlassian.net/browse/MGMT-23743 Signed-off-by: Rafa Porres Molina <rporresm@redhat.com> Assisted-by: Claude Sonnet 4.6
Signed-off-by: Rafa Porres Molina <rporresm@redhat.com>
Signed-off-by: Rafa Porres Molina <rporresm@redhat.com>
a85094c to
454a24b
Compare
|
Tarball created: |
Egress allow list changes--- /home/github-runner/action-runners/runner-4/_work/enclave/enclave/docs/EGRESS_ALLOWLIST.yaml 2026-05-07 06:55:08.536841892 -0500
+++ /tmp/tmp.BNFNz6KG9H 2026-05-07 06:55:14.634869095 -0500
@@ -0,0 +1,187 @@
+# Egress Connectivity Allow List
+#
+# This list documents the external network destinations contacted during a full
+# Enclave Lab deployment, based on OpenSnitch traffic captures from CI runs.
+# Auto-generated by scripts/egress/analyze_egress.py — do not edit manually.
+#
+# Each entry shows:
+# dns: Destination hostname
+# port: Destination port
+# protocol: tcp or udp
+# source: Which tools initiated connections (ansible, pip, podman, etc.)
+#
+# IMPORTANT: This list serves as a reference baseline, not a fixed specification.
+# Actual destinations may vary due to:
+# - CDN/mirror selection based on geography and load balancing
+# - Package repository infrastructure changes
+# - Upstream dependency updates
+#
+# Use this as a starting point for firewall rules, but monitor and adjust based
+# on your specific deployment environment and requirements.
+egress:
+ - dns: access.cdn.redhat.com
+ port: 443
+ protocol: tcp
+ source:
+ - ansible
+ - dns: ansible-galaxy-ng.s3.dualstack.us-east-1.amazonaws.com
+ port: 443
+ protocol: tcp
+ source:
+ - ansible
+ - dns: api.launchpad.net
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: api.openshift.com
+ port: 443
+ protocol: tcp
+ source:
+ - oc-mirror
+ - dns: cdn.amazonlinux.com
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: cdn.redhat.com
+ port: 443
+ protocol: tcp
+ source:
+ - dnf
+ - pip
+ - dns: cdn01.quay.io
+ port: 443
+ protocol: tcp
+ source:
+ - oc
+ - oc-mirror
+ - podman
+ - dns: deb.debian.org
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: developers.redhat.com
+ port: 443
+ protocol: tcp
+ source:
+ - ansible
+ - dns: files.pythonhosted.org
+ port: 443
+ protocol: tcp
+ source:
+ - pip
+ - dns: galaxy.ansible.com
+ port: 443
+ protocol: tcp
+ source:
+ - ansible
+ - dns: github.com
+ port: 443
+ protocol: tcp
+ source:
+ - ansible
+ - dns: linux.oracle.com
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: mirror.openshift.com
+ port: 443
+ protocol: tcp
+ source:
+ - ansible
+ - oc-mirror
+ - dns: nvd.nist.gov
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: osv-vulnerabilities.storage.googleapis.com
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: packages.broadcom.com
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: packages.us-west-2.amazonaws.com
+ port: 80
+ protocol: tcp
+ source:
+ - clair
+ - dns: packages.vmware.com
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: pypi.org
+ port: 443
+ protocol: tcp
+ source:
+ - pip
+ - dns: quay.io
+ port: 443
+ protocol: tcp
+ source:
+ - oc
+ - oc-mirror
+ - podman
+ - dns: registry.access.redhat.com
+ port: 443
+ protocol: tcp
+ source:
+ - oc-mirror
+ - dns: registry.connect.redhat.com
+ port: 443
+ protocol: tcp
+ source:
+ - oc-mirror
+ - dns: registry.redhat.io
+ port: 443
+ protocol: tcp
+ source:
+ - oc-mirror
+ - dns: release-assets.githubusercontent.com
+ port: 443
+ protocol: tcp
+ source:
+ - ansible
+ - dns: repo.us-west-2.amazonaws.com
+ port: 80
+ protocol: tcp
+ source:
+ - clair
+ - dns: secdb.alpinelinux.org
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: security-metadata.canonical.com
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: security-tracker.debian.org
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: security.access.redhat.com
+ port: 443
+ protocol: tcp
+ source:
+ - clair
+ - dns: subscription.rhsm.redhat.com
+ port: 443
+ protocol: tcp
+ source:
+ - other
+ - dns: support.novell.com
+ port: 443
+ protocol: tcp
+ source:
+ - clair |
Introduces an optional egress capture capability that builds and maintains docs/EGRESS_ALLOWLIST.yaml — a list of outbound connections (DNS name, protocol, port) made by the Landing Zone VM during a full deployment. Normal CI runs are completely unaffected.
How it works: OpenSnitch runs on the Landing Zone in accept-all + log mode. After deployment, logs are collected, RFC1918 and loopback destinations are filtered out, and the remainder is rendered as a sorted YAML allow list. The daily egress-capture workflow calls the existing e2e-deployment workflow for both connected and disconnected modes with capture enabled, then diffs the result against the current allow list. If anything changed, it opens a PR automatically.
For iterating on the capture tooling itself without waiting for a full e2e run, /test capture-infra triggers the same pipeline through infra-verify. To preview allow list changes on a PR branch, /test capture-egress runs a dry-run and posts the diff as a comment.
https://redhat.atlassian.net/browse/MGMT-23743
Summary by CodeRabbit
New Features
Chores
Documentation
Tests