Commit ed6f26e

StackRox Automation and claude committed
Combine test and production workflows
Removed test-oncall-analysis.yml and made analyze-and-notify.yml dual-purpose:
1. Called by integration-tests.yml when tests actually fail
2. Triggered by PR label 'test-oncall-workflow' for testing

Changes:
- Added pull_request trigger with label filter
- Set workflow parameters based on trigger type
  - workflow_call: uses inputs from caller
  - pull_request: uses test values (no artifacts expected)
- Outputs parameters from analyze-failures for notify job
- Removed separate test workflow
- Updated TESTING_ONCALL.md documentation

Testing with label:
- No fake artifacts created
- Claude analyzes empty test-artifacts/ directory
- Generates appropriate report (likely "no failures found")
- Posts to Slack with [TEST] prefix

This simplifies the workflow structure while still allowing easy testing via PR labels.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
1 parent f052b83 commit ed6f26e
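
The dual-purpose gate described in the commit message (the job-level `if:` in analyze-and-notify.yml) can be sketched as a small shell predicate. This is an illustration only: `should_run` and its comma-separated label argument are hypothetical, and `always()` is not modeled. One point worth verifying before relying on the first branch: GitHub's reusable-workflow documentation states that the `github` context in a called workflow is associated with the caller, so `github.event_name` may report the caller's event (for example `push`) rather than `workflow_call`.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the job-level `if:` condition.
# Usage: should_run EVENT_NAME LABELS_CSV
# Returns 0 (run the job) or 1 (skip it).
should_run() {
  local event="$1" labels="$2"
  case "$event" in
    workflow_call)
      # Branch taken when the event name is literally "workflow_call".
      return 0
      ;;
    pull_request)
      # Wrap in commas so only whole label names match.
      case ",$labels," in
        *",test-oncall-workflow,"*) return 0 ;;
      esac
      ;;
  esac
  return 1
}
```

For example, `should_run pull_request "bug,test-oncall-workflow"` succeeds, while `should_run pull_request "bug"` does not.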

3 files changed

Lines changed: 92 additions & 135 deletions

.github/workflows/TESTING_ONCALL.md

Lines changed: 58 additions & 35 deletions
@@ -4,55 +4,78 @@ This document explains how to test the Claude AI test failure analysis workflow.
 
 ## How to Test
 
-1. Go to Actions → "Test On-Call Analysis Workflow"
-2. Click "Run workflow"
-3. Select branch: `add-test-analysis-job` (or whatever your PR branch is)
-4. Optionally add a comment (e.g., "Testing PR #3381")
-5. Click "Run workflow"
-6. Check the Slack channel for the [TEST] notification
+Add the label `test-oncall-workflow` to any PR.
 
-## What the test does
+The `analyze-and-notify` workflow will:
+1. Run automatically when the label is added
+2. Look for test artifacts (there won't be any on a PR without test failures)
+3. Run Claude analysis with empty/no artifacts
+4. Generate a report based on what it finds (or an empty report)
+5. Post to Slack with [TEST] prefix
 
-1. **Creates fake test failures**: Generates synthetic JUnit XML files with test failures
-   - The failures reference fake file paths and error messages
-   - This simulates the artifact structure from real integration test runs
+## What This Tests
 
-2. **Runs Claude analysis**:
-   - Parses the XML test reports from artifacts
-   - Attempts to examine source code and git history
-   - Generates analysis report (may note that referenced files don't exist)
+**Workflow execution**: All jobs run in correct sequence
+**Claude integration**: claude-code-base-action executes successfully
+**Skill loading**: `/analyze-test-failures` command is available
+**Slack notification**: Webhook delivers message to #team-acs-collector-oncall
+**Report generation**: Claude creates analysis-report.md (even if empty)
 
-3. **Posts to Slack**:
-   - Sends notification to #team-acs-collector-oncall
-   - Prefixed with [TEST] to indicate test mode
-   - Includes the AI-generated analysis
+⚠️ **What it doesn't test**: Quality of analysis on real collector test failures (no real artifacts)
 
-## What this test validates
+## Expected Behavior
 
-**Workflow execution**: All jobs run in correct sequence
-**Artifact handling**: Test reports are created, uploaded, and downloaded correctly
-**GCP authentication**: Vertex AI credentials work
-**Claude integration**: claude-code-base-action runs successfully
-**Slack notification**: Webhook delivers message to correct channel
-**Report generation**: analysis-report.md is created and included in Slack
+**If you test on a PR without test failures:**
+
+Claude will analyze the empty `test-artifacts/` directory and should generate a report saying:
+```markdown
+**🤖 AI Analysis [TEST MODE]**
+
+**Root Cause**: No test failures found in artifacts directory.
+
+**Evidence**:
+• test-artifacts/ directory is empty or contains no JUnit XML files
+• No failed tests to analyze
 
-⚠️ **What it doesn't validate**: The quality of Claude's analysis on real collector test failures (since it uses synthetic data)
+**Recommendations**:
+• This is a test run with no actual test failures
+• The workflow is functioning correctly
+```
 
-## Expected Slack Message
+**Slack message:**
+```
+[TEST] Integration Tests failed
 
-You should receive a Slack message in #team-acs-collector-oncall with [TEST] prefix containing Claude's analysis of the synthetic test failures. The content will vary based on what Claude finds when analyzing the fake error messages.
+**This is a test of the oncall analysis workflow - please ignore**
+
+[Claude's report about no failures found]
+```
+
+## To Test with Real Failures
+
+Trigger the workflow on an actual test failure:
+1. Wait for integration tests to fail naturally, OR
+2. Intentionally break a test and push to a branch
+3. The workflow will run automatically with real artifacts
+4. Check Slack for analysis with actual root cause
 
 ## Cleanup
 
-After testing, you can:
+After testing:
 - Remove the `test-oncall-workflow` label from the PR
-- Delete the test workflow run from Actions
-- The Slack message will remain for reference
+- The Slack [TEST] message will remain for reference
 
 ## Troubleshooting
 
-If the test fails:
+**No workflow run triggered:**
+- Check that `.github/workflows/analyze-and-notify.yml` exists in the PR branch
+- New workflows require merge to main before PR triggers work
+
+**No Slack notification:**
+- Check `SLACK_COLLECTOR_ONCALL_WEBHOOK` secret is set
+- Verify webhook URL is valid
 
-1. **No Slack message**: Check that `SLACK_COLLECTOR_ONCALL_WEBHOOK` secret is set
-2. **No analysis report**: Check the "Analyze test failures with Claude" step logs
-3. **Action not found**: Make sure this PR includes `.github/workflows/test-oncall-analysis.yml`
+**Claude fails:**
+- Check analyze-failures job logs for errors
+- Verify `GCP_CLAUDE_SERVICE_ACCOUNT_KEY` and `GCP_CLAUDE_PROJECT_ID` secrets are set
+- See "Troubleshooting" section in `.github/scripts/README.md`
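
The label-based test and cleanup steps documented in TESTING_ONCALL.md could also be driven from the GitHub CLI. The following is a minimal sketch, assuming `gh` is installed and authenticated and that the `test-oncall-workflow` label already exists in the repository; the function names are hypothetical and not part of this commit.

```shell
#!/usr/bin/env bash
# Hypothetical helpers around the GitHub CLI (`gh`); the function names are
# illustrative and do not exist in the repository.
LABEL="test-oncall-workflow"

# Add the label to a PR, which fires the pull_request `labeled` event.
# Usage: start_oncall_test <pr-number>
start_oncall_test() {
  gh pr edit "$1" --add-label "$LABEL"
}

# Remove the label again once the [TEST] Slack message has arrived.
# Usage: stop_oncall_test <pr-number>
stop_oncall_test() {
  gh pr edit "$1" --remove-label "$LABEL"
}

# List recent runs of the workflow to confirm that one was triggered.
list_oncall_runs() {
  gh run list --workflow analyze-and-notify.yml --limit 5
}
```

A typical session would call `start_oncall_test` with the PR number, check `list_oncall_runs` and the Slack channel, then call `stop_oncall_test`.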

.github/workflows/analyze-and-notify.yml

Lines changed: 34 additions & 9 deletions
@@ -21,11 +21,37 @@ on:
         required: false
         type: string
         default: ''
+  pull_request:
+    types: [labeled]
 
 jobs:
   analyze-failures:
     runs-on: ubuntu-24.04
+    if: |
+      always() && (
+        github.event_name == 'workflow_call' ||
+        (github.event_name == 'pull_request' && contains(github.event.pull_request.labels.*.name, 'test-oncall-workflow'))
+      )
+    outputs:
+      workflow_name: ${{ steps.params.outputs.workflow_name }}
+      failed_jobs: ${{ steps.params.outputs.failed_jobs }}
+      is_test: ${{ steps.params.outputs.is_test }}
     steps:
+      - name: Set workflow parameters
+        id: params
+        run: |
+          if [ "${{ github.event_name }}" = "pull_request" ]; then
+            echo "failed_jobs=rhcos-arm64,cos-logs" >> $GITHUB_OUTPUT
+            echo "workflow_name=Integration Tests" >> $GITHUB_OUTPUT
+            echo "is_test=true" >> $GITHUB_OUTPUT
+            echo "artifact_name=test-failure-artifacts" >> $GITHUB_OUTPUT
+          else
+            echo "failed_jobs=${{ inputs.failed-jobs }}" >> $GITHUB_OUTPUT
+            echo "workflow_name=${{ inputs.workflow-name }}" >> $GITHUB_OUTPUT
+            echo "is_test=${{ inputs.is-test }}" >> $GITHUB_OUTPUT
+            echo "artifact_name=" >> $GITHUB_OUTPUT
+          fi
+
       - name: Checkout repository
         uses: actions/checkout@v4
 
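
The "Set workflow parameters" step added in this hunk can be exercised outside of Actions with a standalone sketch like the one below. It is not the step as committed: instead of interpolating `${{ ... }}` expressions into the script, it reads the values from environment variables (`EVENT_NAME`, `INPUT_FAILED_JOBS`, `INPUT_WORKFLOW_NAME`, `INPUT_IS_TEST`, all hypothetical names), which a step could supply through `env:`. It also quotes the output file path.

```shell
#!/usr/bin/env bash
# Sketch of the "Set workflow parameters" logic as a function.
# EVENT_NAME and INPUT_* are hypothetical stand-ins for
# ${{ github.event_name }} and ${{ inputs.* }}.
set_workflow_params() {
  local out="${GITHUB_OUTPUT:?GITHUB_OUTPUT must be set}"
  if [ "${EVENT_NAME:-}" = "pull_request" ]; then
    # Label-triggered test run: fixed test values, as in the diff.
    {
      echo "failed_jobs=rhcos-arm64,cos-logs"
      echo "workflow_name=Integration Tests"
      echo "is_test=true"
      echo "artifact_name=test-failure-artifacts"
    } >> "$out"
  else
    # Called from another workflow: pass the caller's inputs through.
    {
      echo "failed_jobs=${INPUT_FAILED_JOBS:-}"
      echo "workflow_name=${INPUT_WORKFLOW_NAME:-}"
      echo "is_test=${INPUT_IS_TEST:-}"
      echo "artifact_name="
    } >> "$out"
  fi
}
```

Locally, pointing `GITHUB_OUTPUT` at a temporary file and calling `set_workflow_params` shows the `key=value` lines that later steps and the notify job would read.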

@@ -52,9 +78,9 @@ jobs:
           use_vertex: true
           allowed_tools: "Skill,Read,Grep,Glob,Bash"
           prompt: |
-            /analyze-test-failures test-artifacts/ "${{ inputs.workflow-name }}" "${{ inputs.failed-jobs }}"
+            /analyze-test-failures test-artifacts/ "${{ steps.params.outputs.workflow_name }}" "${{ steps.params.outputs.failed_jobs }}"
 
-            ${{ inputs.is-test && 'Add [TEST MODE] prefix to the report title.' || '' }}
+            ${{ steps.params.outputs.is_test == 'true' && 'Add [TEST MODE] prefix to the report title.' || '' }}
 
       - name: Check if analysis report was created
         id: check-report
@@ -117,13 +143,12 @@ jobs:
         env:
           SLACK_WEBHOOK: ${{ secrets.SLACK_COLLECTOR_ONCALL_WEBHOOK }}
           SLACK_CHANNEL: team-acs-collector-oncall
-          SLACK_COLOR: ${{ inputs.is-test && 'warning' || 'failure' }}
+          SLACK_COLOR: ${{ needs.analyze-failures.outputs.is_test == 'true' && 'warning' || 'failure' }}
           SLACK_LINK_NAMES: true
-          SLACK_TITLE: "${{ inputs.is-test && '[TEST] ' || '' }}${{ inputs.workflow-name }} failed"
+          SLACK_TITLE: "${{ needs.analyze-failures.outputs.is_test == 'true' && '[TEST] ' || '' }}${{ needs.analyze-failures.outputs.workflow_name }} failed"
           MSG_MINIMAL: actions url,commit
           SLACK_MESSAGE: |
-            ${{ inputs.is-test && '**This is a test of the oncall analysis workflow - please ignore**' || '@acs-collector-oncall' }}
-            ${{ inputs.test-comment && format('Comment: {0}', inputs.test-comment) || '' }}
+            ${{ needs.analyze-failures.outputs.is_test == 'true' && '**This is a test of the oncall analysis workflow - please ignore**' || '@acs-collector-oncall' }}
 
             ${{ steps.read-analysis.outputs.analysis }}
 
@@ -133,11 +158,11 @@ jobs:
         env:
           SLACK_WEBHOOK: ${{ secrets.SLACK_COLLECTOR_ONCALL_WEBHOOK }}
           SLACK_CHANNEL: team-acs-collector-oncall
-          SLACK_COLOR: ${{ inputs.is-test && 'warning' || 'failure' }}
+          SLACK_COLOR: ${{ needs.analyze-failures.outputs.is_test == 'true' && 'warning' || 'failure' }}
           SLACK_LINK_NAMES: true
-          SLACK_TITLE: "${{ inputs.is-test && '[TEST] ' || '' }}${{ inputs.workflow-name }} failed"
+          SLACK_TITLE: "${{ needs.analyze-failures.outputs.is_test == 'true' && '[TEST] ' || '' }}${{ needs.analyze-failures.outputs.workflow_name }} failed"
           MSG_MINIMAL: actions url,commit
           SLACK_MESSAGE: |
-            ${{ inputs.is-test && '**This is a test - AI analysis unavailable**' || '@acs-collector-oncall' }}
+            ${{ needs.analyze-failures.outputs.is_test == 'true' && '**This is a test - AI analysis unavailable**' || '@acs-collector-oncall' }}
 
             AI analysis unavailable. Check workflow logs.
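
The notify-job changes replace `inputs.is-test` with a comparison against the string `'true'`. Job outputs are always strings, and a non-empty string such as `'false'` is truthy in an Actions expression, so the explicit `== 'true'` is what keeps non-test runs from being tagged as tests. The `cond && a || b` selections for the Slack title, color, and mention can be sketched in shell as follows; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers mirroring the Slack field expressions in the diff.
# The first argument is the is_test job output, compared as a string.

# Usage: slack_title IS_TEST WORKFLOW_NAME
slack_title() {
  local is_test="$1" workflow_name="$2" prefix=""
  if [ "$is_test" = "true" ]; then
    prefix="[TEST] "
  fi
  printf '%s%s failed\n' "$prefix" "$workflow_name"
}

# Usage: slack_color IS_TEST
slack_color() {
  if [ "$1" = "true" ]; then
    echo "warning"
  else
    echo "failure"
  fi
}

# Usage: slack_mention IS_TEST
slack_mention() {
  if [ "$1" = "true" ]; then
    echo "**This is a test of the oncall analysis workflow - please ignore**"
  else
    echo "@acs-collector-oncall"
  fi
}
```

Any value other than the exact string `true`, including an empty output, falls through to the production branch, which matches the behavior of the expressions above.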

.github/workflows/test-oncall-analysis.yml

Lines changed: 0 additions & 91 deletions
This file was deleted.
