Skip to content

Commit 1aef3b3

Browse files
StackRox Automationclaude
andcommitted
Create test failure analysis skill and simplify workflow
Following the pattern from PR #3170, moved all analysis logic into a proper Claude Code skill. Added: - .claude/commands/analyze-test-failures.md - Skill definition with usage, implementation details - Follows repo's slash command pattern - Contains all analysis instructions and report format Changed: - .github/workflows/analyze-and-notify.yml - Simplified prompt to just invoke the skill - Removed ~60 lines of inline instructions - Cleaner, more maintainable workflow Benefits: - Single source of truth for analysis logic - Can be tested locally: `claude /analyze-test-failures` - Can be improved independently of workflow - Follows established repo conventions Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
1 parent eec5f28 commit 1aef3b3

2 files changed

Lines changed: 100 additions & 49 deletions

File tree

Lines changed: 94 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,94 @@
1+
---
2+
name: analyze-test-failures
3+
description: Analyze test failure artifacts and generate root cause analysis report
4+
---
5+
6+
# Test Failure Analysis
7+
8+
Analyze test failures from CI artifacts and generate a concise root cause analysis for the oncall team.
9+
10+
## Usage
11+
12+
```
13+
/analyze-test-failures <artifacts-dir> <workflow-name> <failed-jobs>
14+
```
15+
16+
**Arguments:**
17+
- `artifacts-dir`: Directory containing test artifacts (default: test-artifacts/)
18+
- `workflow-name`: Name of the workflow that failed (e.g., "Integration Tests")
19+
- `failed-jobs`: Comma-separated list of failed job names
20+
21+
**Example:**
22+
```
23+
/analyze-test-failures test-artifacts/ "Integration Tests" "amd64-integration-tests,arm64-integration-tests"
24+
```
25+
26+
## What This Does
27+
28+
1. **Find test reports**: Searches for JUnit XML files (integration-test-report-*.xml, junit.xml)
29+
2. **Parse failures**: Extracts test names, error messages, stack traces
30+
3. **Investigate code**: Reads failing test source and implementation code
31+
4. **Check git history**: Looks for recent changes that may have caused failures
32+
5. **Identify patterns**: Detects platform-specific issues (arch/OS)
33+
6. **Generate report**: Creates analysis-report.md with findings
34+
35+
## Report Format
36+
37+
The generated `analysis-report.md` contains:
38+
39+
```markdown
40+
**🤖 AI Analysis**
41+
42+
**Root Cause**: [1-2 sentence summary with file:line references]
43+
44+
**Evidence**:
45+
[Specific code observations]
46+
[Patterns across failures]
47+
[Recent changes correlation]
48+
49+
**Affected Platforms**: [Architectures/OS if pattern found]
50+
51+
**Recommendations**:
52+
[Specific file:line to fix with suggested change]
53+
[Additional investigation needed]
54+
[Prevention strategy]
55+
56+
---
57+
**Statistics**
58+
• Total Failures: [count]
59+
• Total Errors: [count]
60+
• Failed Jobs: [list]
61+
```
62+
63+
## Implementation
64+
65+
Start by finding and parsing test reports:
66+
67+
```bash
68+
# Find all XML test reports
69+
find <artifacts-dir> -name "*.xml" -type f
70+
```
71+
72+
For each failure:
73+
- Read the test source code to understand intent
74+
- Examine the implementation being tested
75+
- Check `git log --oneline -20` for recent changes
76+
- Look for patterns across different platforms
77+
78+
Generate the report focusing on **actionable insights** for the oncall engineer:
79+
- File paths and line numbers for fixes
80+
- Platform-specific patterns (endianness, timing, etc.)
81+
- Links to similar past failures if found
82+
83+
Keep the analysis **under 500 words** and emphasize:
84+
- What broke
85+
- Why it broke
86+
- How to fix it
87+
88+
**IMPORTANT**: After analysis, create the report file:
89+
90+
```bash
91+
cat > analysis-report.md <<'EOF'
92+
[your markdown report]
93+
EOF
94+
```

.github/workflows/analyze-and-notify.yml

Lines changed: 6 additions & 49 deletions
Original file line numberDiff line numberDiff line change
@@ -48,58 +48,15 @@ jobs:
4848
env:
4949
ANTHROPIC_VERTEX_PROJECT_ID: ${{ secrets.GCP_CLAUDE_PROJECT_ID }}
5050
CLOUD_ML_REGION: us-east5
51+
TEST_MODE: ${{ inputs.is-test && 'true' || 'false' }}
5152
with:
5253
use_vertex: true
53-
allowed_tools: "Read,Grep,Glob,Bash"
5454
prompt: |
55-
${{ inputs.is-test && '[TEST MODE] ' || '' }}Analyze test failures for the ${{ inputs.workflow-name }} workflow.
56-
57-
Failed jobs: ${{ inputs.failed-jobs }}
58-
Artifacts directory: test-artifacts/
59-
60-
Instructions:
61-
1. Find all XML test reports in test-artifacts/ (integration-test-report-*.xml, junit.xml)
62-
2. Parse the XML to extract failed tests, error messages, and stack traces
63-
3. Read the failing test source code to understand what's being tested
64-
4. Examine the implementation code that's failing
65-
5. Check git log for recent changes that might have caused failures
66-
6. Look for platform-specific patterns (amd64, arm64, s390x, ppc64le)
67-
7. Look for OS-specific patterns (RHEL, Ubuntu, COS, Flatcar, etc.)
68-
69-
Generate a markdown report at analysis-report.md with:
70-
71-
**🤖 AI Analysis${{ inputs.is-test && ' [TEST MODE]' || '' }}**
72-
73-
**Root Cause**: [1-2 sentence summary with file:line references]
74-
75-
**Evidence**:
76-
• [Specific code observations]
77-
• [Patterns across failures]
78-
• [Recent changes correlation]
79-
80-
**Affected Platforms**: [Architectures/OS if pattern found]
81-
82-
**Recommendations**:
83-
• [Specific file:line to fix with suggested change]
84-
• [Additional investigation needed]
85-
• [Prevention strategy]
86-
87-
---
88-
**Statistics**
89-
• Total Failures: [count]
90-
• Total Errors: [count]
91-
• Failed Jobs: ${{ inputs.failed-jobs }}
92-
93-
Keep it under 500 words and actionable.
94-
95-
IMPORTANT: After your analysis, you MUST create the file analysis-report.md with your markdown report using:
96-
cat > analysis-report.md <<'EOF'
97-
[your markdown report here]
98-
EOF
99-
claude_env: |
100-
WORKFLOW_NAME: ${{ inputs.workflow-name }}
101-
FAILED_JOBS: ${{ inputs.failed-jobs }}
102-
ARTIFACTS_DIR: test-artifacts
55+
Run the test failure analysis skill:
56+
57+
/analyze-test-failures test-artifacts/ "${{ inputs.workflow-name }}" "${{ inputs.failed-jobs }}"
58+
59+
${{ inputs.is-test && 'Note: This is a TEST run. Add [TEST MODE] prefix to the report title.' || '' }}
10360
10461
- name: Check Claude action result
10562
if: always()

0 commit comments

Comments
 (0)