Developers waste hours manually correlating Salesforce deployment logs, coverage reports, and PMD violations across separate tools to find a root cause that should take minutes to identify.
8
-
This tool feeds those signals into Claude simultaneously and returns a ranked diagnosis — what broke, which component caused it, and what to fix first — in seconds.
7
+
When a Salesforce deployment fails, this tool identifies the root cause in seconds — not hours.
8
+
It feeds deployment errors, coverage data, and PMD violations into Claude and returns a ranked diagnosis: exactly what broke, which component caused it, and what to fix first.
9
9
10
-
> **Why this matters:** There are over 150,000 active Salesforce customer organisations worldwide. Every org doing custom Apex development faces this exact debugging pattern when a deployment fails. Salesforce's own CLI tools — `sf project deploy`, `sf scanner` — produce structured JSON output that no open source tool currently cross-correlates into a ranked diagnosis. This project fills that gap using Claude's reasoning to turn three disconnected outputs into a single, actionable result. There is no equivalent open source tool in the Salesforce ecosystem today.
10
+
> Salesforce's CLI tools produce structured JSON output that no open-source tool currently cross-correlates into a ranked diagnosis. This project fills that gap: there is no equivalent in the Salesforce ecosystem today.
11
11
12
12
---
13
13
14
14
## Problem
15
15
16
-
When a Salesforce deployment fails, engineers read raw logs, pull coverage reports, and run PMD scans — then manually piece together why it broke across three disconnected tools.
16
+
When a deployment fails, engineers read raw logs, pull coverage reports, and run PMD scans — then manually piece together why it broke across three disconnected tools.
17
17
18
18
- Deployment logs show *what* failed, not *why*
19
19
- Coverage gaps and code violations have no visible connection to the error
@@ -23,38 +23,57 @@ When a Salesforce deployment fails, engineers read raw logs, pull coverage repor
23
23
24
24
## Why Claude
25
25
26
-
PMD flags a violation. The coverage tool reports 62%. The deployment log shows a `NullPointerException` in `OpportunityService`. Three tools, three outputs — none of them tell you these signals share a single root cause.
26
+
Salesforce CLI, PMD, and coverage tools each report one signal in isolation — they show *what* happened, not *why*, and have no awareness of each other.
27
+
Claude reads all three signals together: when a `NullPointerException` in `OpportunityService` coincides with low coverage and critical PMD violations, it identifies a single `@TestSetup` gap as the common cause — not three separate problems requiring three separate fixes.
28
+
The result is a specific root cause per component and a P0-ranked fix list — without opening a single log file.
27
29
28
-
Claude reasons across all three inputs simultaneously and identifies the causal structure: the `@TestSetup` gap caused the exception, the exception suppressed test execution, and suppressed tests pulled coverage below the deployment threshold. That is not summarization — it is cross-signal reasoning that a rule-based tool cannot perform.
30
+
---
29
31
30
-
Two properties make Claude specifically well-suited for this:
32
+
## Quick Start
31
33
32
-
- **Structured output reliability.** The tool depends on Claude returning valid JSON that conforms to a strict schema on every call. Claude follows schema and formatting instructions precisely enough to be used in a pipeline, where a malformed response is a hard failure, not a warning.
33
-
- **Apex domain knowledge.** Claude accurately identifies Salesforce-specific patterns such as safe navigation (`?.`), `@TestSetup` data gaps, and governor limit causes, without domain-specific fine-tuning. This means the tool works on real org failures out of the box.
34
+
**No API key needed — run a preset scenario instantly:**
34
35
35
-
The result is a risk score (0–10) and a P0-ranked fix list. Engineers know within seconds whether a deployment is blocked, which component caused it, and what to fix first — without reading a single log line.
36
+
```bash
37
+
python main.py 1 # Failure — risk score 7 🔴
38
+
python main.py 2 # Medium — risk score 3 🟡
39
+
python main.py 3 # Healthy — risk score 0 🟢
40
+
```
36
41
37
-
---
42
+
**Live mode — send your own data to Claude:**
38
43
39
-
## How It Reduces Debugging Time
44
+
```bash
45
+
pip install -r requirements-live.txt
40
46
41
-
Instead of opening three tools and manually correlating their outputs, engineers submit one JSON payload and receive a ranked diagnosis — specific component, technical cause, and exact fix sequence.
42
-
Claude surfaces the causal chain behind the failure, not a list of symptoms, so the P0 fix is clear before any code is opened.
43
-
A 0–10 risk score tells the team immediately whether the deployment can proceed or is actively blocked, eliminating the manual triage step entirely.
47
+
# Windows (Command Prompt)
48
+
set ANTHROPIC_API_KEY=your_key_here
49
+
# Mac / Linux
50
+
export ANTHROPIC_API_KEY=your_key_here
51
+
52
+
python main.py --input mydata.json --live
53
+
```
54
+
55
+
**End-to-end with a real Salesforce org:**
56
+
57
+
```bash
58
+
sf project deploy start --json > deploy_result.json
|`failed_deployments`|`list`| Each item: `component`, `error`, `failed_tests` count |
55
74
|`code_quality_issues`|`dict`|`pmd_violations` (total) and `critical` (severity 1–2) |
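
As a rough illustration of the two input fields in the table above, the following sketch checks a payload before it is sent. The helper name `check_input_fields` is hypothetical and not part of the tool, and the PMD counts in the sample are made-up values. Only the two documented fields are checked; any other fields in the payload are left alone.

```python
from typing import Any


def check_input_fields(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems found; an empty list means both fields look fine.

    Hypothetical helper: it only covers the two fields documented in the table.
    """
    problems: list[str] = []

    failed = payload.get("failed_deployments")
    if not isinstance(failed, list):
        problems.append("failed_deployments must be a list")
    else:
        for i, item in enumerate(failed):
            if not isinstance(item, dict):
                problems.append(f"failed_deployments[{i}] must be an object")
                continue
            for key in ("component", "error", "failed_tests"):
                if key not in item:
                    problems.append(f"failed_deployments[{i}] is missing '{key}'")

    quality = payload.get("code_quality_issues")
    if not isinstance(quality, dict):
        problems.append("code_quality_issues must be a dict")
    else:
        for key in ("pmd_violations", "critical"):
            if key not in quality:
                problems.append(f"code_quality_issues is missing '{key}'")

    return problems


# Sample payload; the violation counts are illustrative, not real scan output.
sample = {
    "failed_deployments": [
        {"component": "OpportunityService", "error": "NullPointerException", "failed_tests": 3}
    ],
    "code_quality_issues": {"pmd_violations": 12, "critical": 2},
}
print(check_input_fields(sample))
```

Returning a list of problems rather than raising on the first one lets a caller report every missing field in a single pass.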
56
75
57
-
Output returned by Claude:
76
+
**Output** — returned by Claude:
58
77
59
78
| Field | Description |
60
79
|---|---|
@@ -68,68 +87,25 @@ Output returned by Claude:
68
87
69
88
## Risk Scoring
70
89
71
-
Weights reflect Salesforce deployment reality. Coverage below 75% is a hard platform blocker with no override — so it carries the highest weight. Active runtime failures block the pipeline immediately. PMD critical violations are serious but do not prevent deployment on their own.
90
+
Coverage below 75% is a hard Salesforce platform blocker with no override — so it carries the highest weight. Active runtime failures block the pipeline immediately. Critical PMD violations are serious but do not prevent deployment on their own.
72
91
73
92
| Dimension | Max Points | Trigger |
74
93
|---|---|---|
75
-
| Code coverage | +4 | < 75% — hard Salesforce deployment blocker, no override possible |
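
A weighted score of this kind could be computed roughly as follows. This is a minimal sketch, not the tool's actual implementation: only the coverage weight (+4 below 75%) is taken from the table, while the other two weights, the function name `risk_score`, and its parameters are assumptions made for illustration.

```python
COVERAGE_THRESHOLD = 75.0   # Salesforce deployment threshold, from the table
COVERAGE_POINTS = 4         # from the table
FAILURE_POINTS = 3          # ASSUMED weight for active runtime failures
CRITICAL_PMD_POINTS = 3     # ASSUMED weight for critical PMD violations


def risk_score(coverage_pct: float, failed_components: int, critical_violations: int) -> int:
    """Sum the points for each triggered dimension, capped at 10."""
    score = 0
    if coverage_pct < COVERAGE_THRESHOLD:
        score += COVERAGE_POINTS
    if failed_components > 0:
        score += FAILURE_POINTS
    if critical_violations > 0:
        score += CRITICAL_PMD_POINTS
    return min(score, 10)


print(risk_score(coverage_pct=62.0, failed_components=1, critical_violations=0))
```

Each dimension contributes a fixed number of points when its trigger fires, so the score stays easy to audit: a reader can see which of the three conditions produced it.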
A deployment fails during a sprint release. The engineer has three data points: coverage is below threshold, `OpportunityService` threw an exception, and the static analyser flagged violations.
"cause": "OpportunityService accesses a relationship field (e.g. Opportunity.Account.Name) without a null-guard — the field is null because @TestSetup does not insert a parent Account before creating the Opportunity",
154
+
"cause": "OpportunityService calls opp.Account.Name without a null-guard — Account is null because @TestSetup creates Opportunity records without inserting a parent Account first",
184
155
"component": "OpportunityService"
185
156
},
186
157
{
187
-
"cause": "@TestSetup inserts Opportunity records without a related Account — all methods that traverse the Account relationship encounter null at runtime, causing both the exception and the test failures",
188
-
"component": "OpportunityService"
189
-
},
190
-
{
191
-
"cause": "Coverage at 62% is a direct consequence of the test failures — the same @TestSetup gap suppresses coverage for every OpportunityService method that depends on related data",
158
+
"cause": "The same @TestSetup gap suppresses execution of every method that traverses the Account relationship — this is why coverage dropped to 62%, not a separate coverage problem",
192
159
"component": "Overall Org"
193
160
}
194
161
],
195
162
"recommendations": [
196
163
{
197
-
"action": "Add null-guard in OpportunityService before traversing Account relationship — use safe navigation operator (?.) on all lookup fields: opp.Account?.Name instead of opp.Account.Name",
164
+
"action": "In OpportunityService, replace opp.Account.Name with opp.Account?.Name — the safe navigation operator prevents the NullPointerException when Account is not loaded",
198
165
"priority": "P0 - Immediate"
199
166
},
200
167
{
201
-
"action": "Fix @TestSetup in OpportunityServiceTest — insert Account record first, then create Opportunity with AccountId populated before any test method runs",
168
+
"action": "In OpportunityServiceTest @TestSetup, insert an Account record and set AccountId on each Opportunity before any test method runs — this unblocks all 3 failing tests and recovers coverage above 75% automatically",
202
169
"priority": "P0 - Immediate"
203
170
},
204
171
{
205
-
"action": "After null-guard and @TestSetup are fixed, re-run deployment — coverage should recover above 75% automatically as the suppressed test paths now execute",
206
-
"priority": "P1 - High"
207
-
},
208
-
{
209
-
"action": "Run 'sf scanner run --category Design,Security' on OpportunityService — resolve SOQL-in-loop violations before next production release to avoid governor limit failures under load",
172
+
"action": "Run 'sf scanner run --category Performance --target force-app/main/default/classes/OpportunityService.cls' and move all SOQL calls outside loop bodies before the next production release",
210
173
"priority": "P1 - High"
211
174
}
212
175
]
213
176
}
214
177
```
215
178
216
-
**What went wrong:** Coverage at 62% and a `NullPointerException` in `OpportunityService` appear to be two independent blockers. The analysis identifies they share one root cause: a `@TestSetup` gap (missing parent `Account`) that causes the exception, fails the tests, and suppresses coverage for every related method.
217
-
218
-
**What the analysis reveals beyond the raw input:** The input states `"error": "NullPointerException"` and `"failed_tests": 3`. Nothing in the input mentions `@TestSetup`, parent records, or the Account relationship. Claude identified the likely test data gap, the specific Apex anti-pattern (unsafe relationship traversal), and that the coverage failure is a *symptom* of the exception — not a separate problem. This is a sample output generated by Claude using the same structured prompt the tool sends in `--live` mode.
219
-
220
-
**Fix first:** Add the null-guard and fix `@TestSetup` in one commit. Fixing them together recovers coverage automatically — no additional test writing required.
179
+
- **One root cause, three symptoms.** A single `@TestSetup` gap caused the exception, the 3 test failures, and the coverage drop, so one fix resolves all of them.
180
+
- **Exact fix, not general advice.** P0 recommendations name the specific call to change (`opp.Account.Name` → `opp.Account?.Name`) and exactly what to insert in the test setup.
181
+
- **Prioritised.** Two P0 actions unblock the deployment. The P1 SOQL fix is scoped to a specific file and command.
This example shows the tool handling a different failure class: a trigger hitting SOQL limits under load, with two components affected simultaneously.
187
+
This example covers a trigger hitting SOQL limits under load, with two components affected simultaneously.
227
188
228
189
### Input
229
190
@@ -249,7 +210,7 @@ This example shows the tool handling a different failure class: a trigger hittin
249
210
}
250
211
```
251
212
252
-
### Output (sample generated by Claude using the tool's prompt)
213
+
### Output
253
214
254
215
```json
255
216
{
@@ -312,17 +273,7 @@ This example shows the tool handling a different failure class: a trigger hittin
312
273
}
313
274
```
314
275
315
-
**What the analysis reveals:** The input only lists two error strings and violation counts. Claude connected the `LimitException` to the SOQL-in-loop PMD violations — identifying the critical violations as the likely *cause* of the runtime error, not a separate issue. It also identified that the two failures independently suppress coverage, meaning both must be fixed before coverage recovers.
Claude connected the `LimitException` to the SOQL-in-loop PMD violations — identifying the critical violations as the *cause* of the runtime error, not a separate issue. The two failures independently suppress coverage, so both must be fixed before coverage recovers.
In `--live` mode, `main.py` sends a structured prompt to the Anthropic API (`claude-sonnet-4-6`) containing the DevOps metrics and a strict JSON schema contract. Claude returns a ranked diagnosis — risks, root causes, and prioritised recommendations — which is validated against the expected schema before display. If the response violates the schema, the tool exits with a clear error rather than silently surfacing bad output.
298
+
In `--live` mode, `main.py` sends a structured prompt to `claude-sonnet-4-6` with the DevOps metrics and a strict JSON schema. The response is validated against the schema before display — if Claude returns a malformed response, the tool exits with a clear error rather than silently surfacing bad output.
348
299
349
-
The three preset scenarios ship with pre-generated outputs to support reproducible demos and offline testing — the same prompt and schema were used to generate them. Mocked and live modes share an identical input/output contract; switching between them requires only the `--live` flag.
300
+
The preset scenarios ship with pre-generated outputs for reproducible demos and offline testing. Mocked and live modes share an identical input/output contract; switching requires only the `--live` flag.
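
The validate-then-exit behaviour described above might look roughly like this. It is a sketch under stated assumptions, not the code in `main.py`: the function names are hypothetical, and only the `recommendations` list (with its `action` and `priority` keys, as seen in the sample outputs) is checked, because the full schema is not reproduced here.

```python
import json
import sys


def parse_diagnosis(raw: str) -> dict:
    """Parse the model's reply and check the 'recommendations' list.

    Hypothetical helper: raises ValueError with a readable message on any mismatch.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")

    recs = data.get("recommendations")
    if not isinstance(recs, list):
        raise ValueError("'recommendations' must be a list")
    for i, rec in enumerate(recs):
        if not isinstance(rec, dict) or not {"action", "priority"} <= set(rec):
            raise ValueError(f"recommendations[{i}] needs 'action' and 'priority'")
    return data


def show_or_exit(raw: str) -> None:
    """Display a summary, or exit with a clear error if the reply is malformed."""
    try:
        diagnosis = parse_diagnosis(raw)
    except ValueError as exc:
        sys.exit(f"Schema validation failed: {exc}")
    print(f"{len(diagnosis['recommendations'])} recommendation(s) received")


show_or_exit('{"recommendations": [{"action": "Fix @TestSetup", "priority": "P0 - Immediate"}]}')
```

Keeping the parsing in a function that raises, and the exit in a thin wrapper, means the same check can be reused in tests without terminating the process.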
350
301
351
302
---
352
303
353
304
## Contributing
354
305
355
-
Contributions are welcome. See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines on how to add new scenarios, extend the input schema, or improve the prompt.
356
-
357
-
To run the tests locally:
306
+
Contributions are welcome. See [CONTRIBUTING.md](CONTRIBUTING.md) for how to add scenarios, extend the schema, or improve the prompt.
358
307
359
308
```bash
360
309
pip install pytest
@@ -365,16 +314,16 @@ python -m pytest tests/ -v
365
314
366
315
## Roadmap
367
316
368
-
**Near-term (small, shippable)**
369
-
- `--output` flag to write the JSON result to a file for CI pipeline integration
370
-
- `--quiet` flag for machine-readable output (JSON only, no formatted display)
371
-
- GitHub Actions example workflow showing end-to-end org analysis on deployment failure
317
+
**Near-term**
318
+
- `--output` flag to write JSON results to a file for CI pipeline integration
319
+
- `--quiet` flag for machine-readable output (JSON only, no display formatting)
320
+
- GitHub Actions example workflow for end-to-end org analysis on deployment failure
372
321
373
322
**Longer-term**
374
-
- **Apex stack trace parsing**: Accept raw `sf project deploy --json` exception stacks directly; extract line numbers and call chains for line-level diagnosis
375
-
- **Historical diffing**: Compare risk scores across consecutive deployments to surface regressions before they become blockers
376
-
- **Multi-component correlation**: Identify when a failure in one class cascades coverage loss across dependent classes
377
-
- **Slack / Teams alerts**: Push P0 recommendations to engineering channels immediately on detection
323
+
- **Apex stack trace parsing**: accept raw `sf project deploy start --json` exception stacks for line-level diagnosis
324
+
- **Historical diffing**: compare risk scores across deployments to surface regressions early