@@ -45,59 +45,30 @@ Uses `claude-code-base-action` to execute the `/analyze-test-failures` skill:
4545- Creates ` analysis-report.md ` with actionable insights
4646
4747**Claude has access to:**
48+ - ` Skill ` - Load and execute the analysis skill
4849- ` Read ` - View source files
4950- ` Grep ` - Search codebase
5051- ` Glob ` - Find files
5152- ` Bash ` - Execute git commands, create reports
52- - ` Skill ` - Load and execute the analysis skill
5353
5454### 4. Notify
55- Posts to Slack with:
55+ Posts to Slack (#team-acs-collector-oncall) with:
5656- AI-generated root cause analysis
5757- Evidence from code and logs
5858- Platform-specific patterns detected
5959- Actionable recommendations with file:line references
6060
6161Falls back to a simple notification if analysis fails.
6262
63- ## Required Secrets
64-
65- ### Already Configured ✅
66- - ` SLACK_COLLECTOR_ONCALL_WEBHOOK `
67- - ` GCP_CLAUDE_SERVICE_ACCOUNT_KEY `
68- - ` GCP_CLAUDE_PROJECT_ID `
69-
7063## Files
7164
7265### Workflows
7366- ` .github/workflows/integration-tests.yml ` - Main integration test workflow
7467- ` .github/workflows/analyze-and-notify.yml ` - Reusable analysis workflow
75- - ` .github/workflows/test-oncall-analysis.yml ` - Test workflow with synthetic failures
7668
7769### Skill
7870- ` .claude/commands/analyze-test-failures.md ` - Claude skill defining analysis logic
7971
80- ### Documentation
81- - ` .github/workflows/TESTING_ONCALL.md ` - How to test the workflow
82-
83- ## Testing
84-
85- ### Manual Test Run
86-
87- 1. Go to Actions → "Test On-Call Analysis Workflow"
88- 2. Click "Run workflow"
89- 3. Select branch: `add-test-analysis-job`
90- 4. Check #team-acs-collector-oncall for [TEST] Slack message
91-
92- See ` .github/workflows/TESTING_ONCALL.md ` for details.
93-
94- ### Local Skill Development
95-
96- ``` bash
97- # Test the skill locally (requires Claude CLI)
98- claude /analyze-test-failures test-artifacts/ "Integration Tests" "rhcos-arm64,cos"
99- ```
100-
10172## Example Output
10273
10374**Slack message with AI analysis:**
@@ -143,35 +114,67 @@ Integration tests failed.
143114- Links recent git changes to failures
144115- Provides concrete next steps
145116
117+ ## Testing
118+
119+ ### Test on a PR
120+
121+ Add the label ` test-oncall-workflow ` to any PR to trigger the workflow.
122+
123+ **What happens:**
124+ - Workflow runs with empty test artifacts
125+ - Claude analyzes and generates a report
126+ - Report is uploaded as an artifact
127+ - **Slack notification is skipped** (only runs on actual test failures)
128+
129+ **Use case:** Verify that the Claude analysis executes without spamming Slack.
130+
131+ **To verify it worked:**
132+ 1. Check the workflow run in the Actions tab
133+ 2. Download the `failure-analysis` artifact to see the generated report
134+
135+ ### Test with Real Failures
136+
137+ The best test is to observe the workflow on actual test failures:
138+ 1. Wait for integration tests to fail naturally
139+ 2. Check #team-acs-collector-oncall for the AI analysis
140+ 3. Verify the analysis is helpful and actionable
141+
146142## Configuration
147143
148144### Vertex AI Region
149145Set in `.github/workflows/analyze-and-notify.yml`:
150146``` yaml
151147env:
152-   CLOUD_ML_REGION: us-east5 # Or your preferred region
148+   CLOUD_ML_REGION: us-east5
153149```
154150
151+ ### Required Secrets
152+
153+ Already configured:
154+ - `GCP_CLAUDE_SERVICE_ACCOUNT_KEY` - Service account JSON for Vertex AI
155+ - `GCP_CLAUDE_PROJECT_ID` - GCP project ID
156+ - `SLACK_COLLECTOR_ONCALL_WEBHOOK` - Slack webhook URL
157+
155158### Allowed Tools
159+
156160Claude has access to these tools for investigation:
157161```yaml
158162allowed_tools: "Skill,Read,Grep,Glob,Bash"
159163```
160164
161165### Reusable Workflow Inputs
166+
162167The `analyze-and-notify.yml` workflow accepts:
163168- `failed-jobs` - Comma-separated list of failed job names
164169- `workflow-name` - Name of the workflow that failed
165- - `is-test` - Whether this is a test run (adds [TEST MODE] prefix)
166- - `test-comment` - Optional comment for test runs
167170
168171## Troubleshooting
169172
170173### No Analysis Report Generated
171174
172175**Check:**
1731761. Claude action step logs - did it execute successfully?
174- 2. "Check if analysis report was created" step - file exists?
177+ 2. "Check if analysis report was created" step - does the file exist?
1751783. Skill file exists at `.claude/commands/analyze-test-failures.md`
1761794. `Skill` tool is in `allowed_tools`
177180
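Checks 3 and 4 above can be run from a local checkout. The following is a minimal sketch, not part of the workflow: the function name `check_skill_setup` and its messages are hypothetical, and it assumes `allowed_tools` appears on a single line in `.github/workflows/analyze-and-notify.yml`, as in the Configuration section.

```bash
#!/usr/bin/env bash
# Hypothetical helper: verifies that the skill file exists and that the
# Skill tool is listed in allowed_tools. Paths are taken from this README.
check_skill_setup() {
  local repo_root="${1:-.}"
  local skill_file="$repo_root/.claude/commands/analyze-test-failures.md"
  local workflow_file="$repo_root/.github/workflows/analyze-and-notify.yml"
  local status=0

  # Check 3: the skill file must exist.
  if [ ! -f "$skill_file" ]; then
    echo "missing skill file: $skill_file"
    status=1
  fi

  # Check 4: look for Skill as a whole word on the allowed_tools line.
  if ! grep -E '^[[:space:]]*allowed_tools:' "$workflow_file" 2>/dev/null | grep -qw 'Skill'; then
    echo "Skill not found in allowed_tools: $workflow_file"
    status=1
  fi

  if [ "$status" -eq 0 ]; then
    echo "skill setup looks OK"
  fi
  return "$status"
}
```

Running `check_skill_setup .` from the repository root prints one line per failed check and returns a non-zero status if either check fails.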
@@ -189,13 +192,41 @@ Check Claude action logs for specific error details.
189192
190193**Check:**
1911941. `SLACK_COLLECTOR_ONCALL_WEBHOOK` secret is set
192- 2. Notify job logs show the download step succeeded
195+ 2. Notify job logs show download step succeeded
1931963. Webhook URL is valid
194197
198+ ### Analysis Quality Issues
199+
200+ **If Claude's analysis is not helpful:**
201+ 1. Check that test artifacts are being uploaded correctly
202+ 2. Verify JUnit XML format is valid
203+ 3. Update skill instructions in `.claude/commands/analyze-test-failures.md`
204+ 4. The skill can be iterated on independently of the workflow
205+
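Steps 1 and 2 above can be approximated locally before touching the skill. This is a rough sketch under stated assumptions: the helper names are hypothetical, the default `test-artifacts` directory is borrowed from the local skill command in this README, and the check is only a heuristic (non-empty file containing a `<testsuite>` or `<testsuites>` element), not schema validation.

```bash
#!/usr/bin/env bash
# Hypothetical helper: heuristic JUnit XML sanity check, not a real validator.
looks_like_junit_xml() {
  local file="$1"
  # The file must exist and be non-empty.
  [ -s "$file" ] || return 1
  # It must contain a <testsuite> or <testsuites> opening tag.
  grep -Eq '<testsuites?([[:space:]>]|$)' "$file"
}

# Hypothetical helper: lists every suspicious .xml file under a directory.
# Returns 0 if all files pass, 1 if any file fails, 2 if the directory is missing.
report_bad_junit_files() {
  local dir="${1:-test-artifacts}"
  local bad=0
  local file

  if [ ! -d "$dir" ]; then
    echo "no such directory: $dir"
    return 2
  fi

  while IFS= read -r -d '' file; do
    if ! looks_like_junit_xml "$file"; then
      echo "suspicious JUnit file: $file"
      bad=1
    fi
  done < <(find "$dir" -type f -name '*.xml' -print0)

  return "$bad"
}
```

For example, `report_bad_junit_files test-artifacts/` prints the path of each empty or non-JUnit XML file it finds. A file that passes this check can still be malformed, so treat a clean result as a first filter only.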
206+ ## Local Development
207+
208+ ### Test the Skill Locally
209+
210+ ```bash
211+ # Requires Claude CLI installed
212+ claude /analyze-test-failures test-artifacts/ "Integration Tests" "rhcos-arm64,cos"
213+ ```
214+
215+ ### Update the Skill
216+
217+ Edit `.claude/commands/analyze-test-failures.md` to:
218+ - Change analysis instructions
219+ - Update report format
220+ - Add new investigation steps
221+ - Modify recommendations structure
222+
223+ Changes take effect on the next workflow run - no workflow YAML changes needed.
224+
195225## Future Enhancements
196226
197227- [ ] Correlate failures with specific PR/commit
198- - [ ] Track failure patterns over time
228+ - [ ] Track failure patterns over time
199229- [ ] Link to similar historical failures
200230- [ ] Auto-create issues for recurring failures
201231- [ ] Support for other test frameworks beyond JUnit XML
232+ - [ ] Integration with test retries/flakiness detection