Audit and tighten /diagnose-issue slash command (#895)
* Audit and tighten /diagnose-issue skill
Based on recent issue runs (#877, #879, #884, #885, #886, #889, #890):
- Add Step 0 pre-triage with three short-circuit checks:
(a) bug claim vs informational question,
(b) compare reporter's PE version to local + CHANGELOG grep,
(c) gh search for existing PE-US tracking issue/PR.
- Reframe Critical Rule #1 from "PE / TAXSIM / TaxAct" to
"current PE / reporter's PE / TaxAct" — matches what the bundle
actually contains.
- Add Critical Rule #4: verify against primary sources, not
search summaries (search results on state tax law are frequently
stale or wrong).
- Expand Step 7: when PE and TaxAct disagree, fetch all three —
statute text, current-year form PDF, instructions booklet — and
cross-reference before concluding.
- Document mstat=1 + depx>=1 -> HoH inference in Step 2.
- Step 5 PDF snippet iterates all bundle PDFs instead of hardcoded
form.pdf.
- Step 9 drops stale issue_analysis tracker; adds cross-link step.
- Remove the always-404 <issue>.yaml from the batch download.
- Fix v37/v38/v39 variable-table placeholders that lost their
state-substitution braces.
- Fix self-referencing example link in Common Root Causes.
Tested on issue #886 (WV CDCC): new Step 0c surfaced PR #3019
quickly; new Step 7 forced fetching the 2025 IT-140 booklet
(page 7 recap + page 17 description), confirming PE correctly
implements W.Va. Code §11-21-26.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Skill audit: query PE directly; drop reporter's PE column
Two adjustments based on testing on issue #883:
- Critical Rule #1: comparison is now "current PE vs TaxAct only".
Reporter's PE version is still useful for Step 0b triage (version
comparison + changelog grep) but never for the actual diagnosis
comparison. Avoids confusion about which PE values to use.
- Step 4: comparison table now has just two PE columns and an
explicit rule that every PE value MUST come from a direct query
(Step 3 CSV or `Simulation.calculate(...)`). Never infer a PE
value from a gap between other variables.
- Debugging checklist updated to match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Add changelog fragment for diagnose-issue skill audit
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -4,9 +4,10 @@ You are helping diagnose discrepancies between PolicyEngine and TAXSIM tax calcu
4
4
5
5
## Critical Rules
6
6
7
-
1. **Always compare ALL THREE systems**: PolicyEngine, TAXSIM, and TaxAct. Never conclude based on just two.
7
+
1. **Compare current PE vs TaxAct.** Only the current PE-US install matters for diagnosis; the reporter's PE version in `output.txt` is useful for triage (Step 0b) but never for the actual comparison. True TAXSIM-proper output is rarely in the bundle; don't pretend to compare against it if it isn't there.
8
8
2. **NEVER post GitHub issues, comments, or PRs without explicit user confirmation.** Always show draft content first and wait for approval.
9
9
3. **Phrase TAXSIM issues as questions** (e.g., "Does TAXSIM-35 incorrectly apply...?" not "TAXSIM-35 incorrectly applies...").
10
+
4. **Verify against primary sources, not search summaries.** When PE and TaxAct disagree on a specific credit or deduction, fetch the actual statute text + current-year form PDF + instructions booklet (see Step 7). Web-search summaries about state tax law are routinely wrong or stale.
10
11
11
12
## Repositories
12
13
@@ -23,9 +24,8 @@ GitHub:
23
24
Each issue at https://github.com/PolicyEngine/policyengine-taxsim/issues contains:
2. **Description/Comments** - Often contains diagnostic hints, suspected root cause, or specific observations from the filer (READ THIS CAREFULLY - it often points to the problem)
26
-
3. **YAML test file** - PolicyEngine situation and expected outputs (attached)
27
-
4. **TaxAct PDF** - The ground truth tax forms (attached)
28
-
5. **TAXSIM reference files** - Available at `taxsim.nber.org/out2psl/{issue_number}/`
27
+
3. **TaxAct PDFs** - Federal 1040, the state form, and any relevant schedules
28
+
4. **TAXSIM reference files** - Available at `taxsim.nber.org/out2psl/{issue_number}/`. Typical contents: `txpydata.csv` (input), `output.txt` (PE emulator output with version banner), `<issue>.txt` (full run log), and one or more PDFs. There is **no YAML file**; don't expect one.
29
29
30
30
**IMPORTANT**: The issue description often contains the filer's analysis of what's wrong. Start by reading the description carefully before diving into code.
31
31
@@ -39,7 +39,6 @@ mkdir -p /tmp/taxsim_$ISSUE && cd /tmp/taxsim_$ISSUE
@@ -56,6 +55,45 @@ Then work ENTIRELY from local files - no more network calls to TAXSIM.
56
55
57
56
## Diagnostic Steps
58
57
58
+
### Step 0: Pre-diagnosis triage (DO THIS FIRST — it short-circuits a lot of work)
59
+
60
+
Three quick checks before any file downloads or PE runs. If any of them fires, you may be done in 5 minutes instead of an hour.
61
+
62
+
**0a. Is this a bug report or an informational question?**
63
+
64
+
Read the issue body. If the reporter is asking *what PE does* or *how PE handles X* (e.g., "What does PE use for fuel cost?") without claiming a specific wrong value, this is **informational**. Skip the full diagnosis — answer their question directly with code references, draft a comment, and confirm before posting. Don't file a PE-US issue.
65
+
66
+
If the body claims a specific discrepancy ("PE returns $X, TaxAct returns $Y, should be $Z"), it's a **bug claim** — continue.
67
+
68
+
**0b. Is this already fixed in current PE?**
69
+
70
+
```bash
71
+
# Reporter's PE-US version (printed in <issue>.txt near the top)
72
+
grep -A1 "policyengine-us" /tmp/taxsim_$ISSUE/$ISSUE.txt | head -3
73
+
74
+
# Local PE-US version
75
+
pip show policyengine-us | grep -i version
76
+
```
77
+
78
+
If the local version is much newer than the reporter's, grep the changelog for relevant state/feature work in the gap:
79
+
80
+
```bash
81
+
cd /Users/pavelmakarchuk/policyengine-us && grep -in "<state-name>\|<state-abbrev>" CHANGELOG.md | head -20
82
+
```
83
+
84
+
If a relevant fix landed between the reporter's version and yours, run a quick PE-direct test with the current version (Step 3) to confirm the issue no longer reproduces. If it doesn't reproduce, close with a brief "after model adjustments, the values now align" comment and stop.
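The version comparison in 0b can be sketched in Python. This is a minimal illustration, not part of the skill: `parse_version` and `reporter_is_behind` are hypothetical helper names, and the version strings in the usage line are made up. Versions are compared as integer tuples because a plain string comparison would rank `1.9.0` above `1.10.2`.

```python
import re
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Pull the first dotted version number out of a line of text."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def reporter_is_behind(reporter_line: str, local_version: str) -> bool:
    """True when the reporter's PE-US version is older than the local one."""
    return parse_version(reporter_line) < parse_version(local_version)


# Hypothetical inputs: a banner line from <issue>.txt and the local pip version.
print(reporter_is_behind("policyengine-us 1.9.0", "1.10.2"))
```

A `True` result is only a prompt to grep the changelog for the gap; it does not show that the issue is fixed.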
85
+
86
+
**0c. Is there already an open PE-US issue or PR tracking this?**
Try a few queries (state name, specific PE variable, taxsim issue number). If there's an existing tracking issue or open PR that addresses this, just cross-link with a short comment ("Will be addressed here: <PRlink>") and stop.
94
+
95
+
Only if all three checks say "still relevant, fresh problem" do you go to Step 1.
96
+
59
97
### Step 1: Read the Issue
60
98
- Fetch the issue from GitHub (`gh issue view {number} --repo PolicyEngine/policyengine-taxsim`)
61
99
- Read the description carefully for diagnostic hints
@@ -66,7 +104,7 @@ Then work ENTIRELY from local files - no more network calls to TAXSIM.
66
104
67
105
Common data entry errors to check:
68
106
- **State code**: TAXSIM uses alphabetical numbering (1-51), NOT FIPS codes!
69
-
- **Filing status (mstat)**: 1=single, 2=joint, 6=dependent
107
+
- **Filing status (mstat)**: `1=single`, `2=joint`, `6=dependent`. Note: TAXSIM has no separate HoH code; **PE infers HoH from `mstat=1` with `depx≥1`**. So `mstat=1, depx=0` is true single; `mstat=1, depx≥1` is HoH. Most recent issues are HoH despite `mstat=1`.
70
108
- **Ages (page/sage)**: Required for age-based provisions
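The filing-status inference described in the list above can be written out as a small function. This is a sketch of the rule as stated here, not the actual policyengine-taxsim mapper: the function name and the returned labels are hypothetical, and only the three `mstat` codes listed above are handled.

```python
def infer_filing_status(mstat: int, depx: int) -> str:
    """Map TAXSIM mstat/depx to a filing status label (illustrative labels)."""
    if mstat == 1:
        # TAXSIM has no HoH code: unmarried with dependents is treated as HoH.
        return "head_of_household" if depx >= 1 else "single"
    if mstat == 2:
        return "joint"
    if mstat == 6:
        return "dependent"
    raise ValueError(f"unhandled mstat value: {mstat}")


print(infer_filing_status(mstat=1, depx=0))
print(infer_filing_status(mstat=1, depx=2))
```

When reading `txpydata.csv`, checking `depx` alongside `mstat` avoids diagnosing a HoH filer as single.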
Compare current PE values against the TaxAct PDF. **Every PE value in the table must come from a direct PE query (Step 3 CSV output or a `Simulation.calculate(...)` call)** — never infer a PE value from gaps between other variables. If you want the pension deduction, query `me_pension_income_deduction` directly; don't subtract AGI − federal AGI.
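One way to enforce the "direct query only" rule is to build the comparison table from two dictionaries and refuse to emit a row when the PE value is missing. The sketch below is an assumption about how that could look: `comparison_rows`, the `tolerance` default, and the variable names in the usage lines are all hypothetical, and the PE dictionary is expected to be filled from the Step 3 CSV or `Simulation.calculate(...)` results.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, float, float, float, bool]


def comparison_rows(
    pe_values: Dict[str, float],
    taxact_values: Dict[str, float],
    tolerance: float = 1.0,
) -> List[Row]:
    """Return (name, pe, taxact, diff, matches) for every TaxAct line item."""
    rows: List[Row] = []
    for name, taxact in taxact_values.items():
        if name not in pe_values:
            # Never back-fill from other variables; go query PE for this one.
            raise KeyError(f"{name}: no direct PE query result available")
        pe = pe_values[name]
        diff = pe - taxact
        rows.append((name, pe, taxact, diff, abs(diff) <= tolerance))
    return rows


# Hypothetical values for illustration only.
pe = {"adjusted_gross_income": 50000.0, "income_tax": 4200.0}
taxact = {"adjusted_gross_income": 50000.0, "income_tax": 4150.0}
for row in comparison_rows(pe, taxact):
    print(row)
```

Raising on a missing key, rather than defaulting to zero or a derived value, keeps inferred numbers out of the table.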
**If v32=0 for a state tax issue**: The state isn't being set correctly. Check the state code!
97
136
98
137
### Step 5: Extract and Analyze TaxAct PDF Forms (MANDATORY)
99
138
100
139
**THIS STEP IS CRITICAL AND MUST NOT BE SKIPPED.**
101
140
102
-
The TaxAct PDF contains the actual filled-out tax forms - this is the ground truth. The YAML expected values may be incorrect, so always verify against the actual PDF.
141
+
The TaxAct PDFs contain the actual filled-out tax forms — this is the ground truth. Issues usually bundle multiple PDFs (federal 1040, the state form, and any relevant schedules). Iterate them:
103
142
104
143
```bash
105
-
# Extract text using PyMuPDF (from local downloaded file)
106
144
python3 -c "
107
-
import fitz
108
-
doc = fitz.open('/tmp/taxsim_$ISSUE/form.pdf')
109
-
for page_num in range(len(doc)):
110
-
    page = doc[page_num]
111
-
    print(f'=== Page {page_num + 1} ===')
112
-
    print(page.get_text())
145
+
import fitz, glob
146
+
for path in sorted(glob.glob('/tmp/taxsim_$ISSUE/*.pdf')):
147
+
    print(f'=== FILE: {path} ===')
148
+
    doc = fitz.open(path)
149
+
    for page_num in range(len(doc)):
150
+
        page = doc[page_num]
151
+
        print(f'--- Page {page_num + 1} ---')
152
+
        print(page.get_text())
113
153
"
114
154
```
115
155
@@ -123,7 +163,7 @@ for page_num in range(len(doc)):
123
163
6. **Tax Due** - The actual tax calculated on the form
124
164
7. **Credits** - Which credits were claimed and amounts
125
165
126
-
**If YAML expected and PDF differ, the PDF is correct.** The YAML may have been generated incorrectly.
166
+
**If the reporter's claim and the PDF differ, the PDF is correct.** Numeric claims in issue bodies are sometimes paraphrased or based on a stale PE run.
If PE logic appears wrong, verify against official sources:
152
-
- State tax form instructions (primary source)
153
-
- State tax code/statutes
154
-
- Tax Foundation / Tax Policy Center summaries
191
+
192
+
When PE and TaxAct disagree on a specific credit, deduction, or line item, **fetch the primary sources** — don't rely on web-search summaries. Search summaries are routinely wrong or stale (e.g., a search may claim "State X does not offer credit Y" when the statute clearly establishes it). Use search only to *find* the right primary-source URL, then fetch the document.
193
+
194
+
For state credits, verify all three of:
195
+
196
+
1. **Statute text** - `WebFetch` the actual statute (e.g., `https://code.<state>legislature.gov/<section>/`). Quote the relevant language verbatim. This confirms whether the credit exists in law.
197
+
2. **Current-year form PDF** - fetch the current-year state return PDF and check whether the credit actually appears as a line. A statutory credit that hasn't been operationalized on the form may not be claimable in practice for that year.
198
+
3. **Current-year instructions booklet** - fetch the instructions PDF and look for eligibility criteria, income caps, age requirements, or filing prerequisites that PE may not model.
199
+
200
+
Cross-reference: statute → form line → instructions eligibility. If any of the three says something different, **hedge in your write-up** — don't claim PE is correct just because the statute is on the books, and don't claim PE is wrong just because TaxAct's PDF didn't apply a credit (TaxAct can miss things, too; the bundled PDFs sometimes omit schedules).
201
+
202
+
For federal items, use IRS publications and Form 1040 instructions directly.
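The three-source cross-reference can be recorded as a small structure so the write-up stance follows from what was actually fetched. This is an illustrative sketch only: `PrimarySourceCheck`, its field names, and the stance labels are hypothetical and not part of the skill.

```python
from dataclasses import dataclass


@dataclass
class PrimarySourceCheck:
    """What each fetched primary source says about one credit or deduction."""
    statute_establishes_it: bool      # statute text, quoted verbatim
    form_has_line: bool               # current-year form PDF
    instructions_allow_filer: bool    # current-year instructions booklet


def writeup_stance(check: PrimarySourceCheck) -> str:
    """Return a stance label; any disagreement among sources means hedging."""
    answers = (
        check.statute_establishes_it,
        check.form_has_line,
        check.instructions_allow_filer,
    )
    if all(answers):
        return "applies"
    if not any(answers):
        return "does_not_apply"
    return "hedge"


# Statute on the books, but no line on the current-year form: hedge.
print(writeup_stance(PrimarySourceCheck(True, False, True)))
```

The stance covers only whether the provision applies to the filer; whether PE or TaxAct has the right amount still needs the Step 4 comparison.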
155
203
156
204
### Step 8: Check policyengine-us Implementation
157
205
```bash
@@ -162,16 +210,16 @@ ls /Users/pavelmakarchuk/policyengine-us/policyengine_us/variables/gov/states/{s
If PE needs a fix, **draft** an issue for policyengine-us with:
169
216
1. Summary of the problem
170
-
2. Root cause analysis with code references
171
-
3. Suggested fix
172
-
4. Integration test with correct expected values (from TaxAct PDF, not PE's buggy output)
217
+
2. Link back to the originating taxsim issue
218
+
3. Root cause analysis with code references
219
+
4. Suggested fix
220
+
5. Integration test with correct expected values (from TaxAct PDF, not PE's buggy output)
173
221
174
-
**Show the draft to the user and wait for approval before posting.**
222
+
**Show the draft to the user and wait for approval before posting.** After posting, cross-link the PE-US issue back from the taxsim issue with a short comment.
175
223
176
224
---
177
225
@@ -214,57 +262,65 @@ print(get_state_code(34)) # Should print "NC"
214
262
| Var | Description | PE Variable |
215
263
|-----|-------------|-------------|
216
264
| fiitax | Federal income tax | income_tax |
217
-
| siitax | State income tax | state_income_tax |
265
+
| siitax | State income tax | `<state>_income_tax` |
218
266
| v10 | Federal AGI | adjusted_gross_income |
219
267
| v13 | Standard Deduction | standard_deduction |
220
268
| v18 | Taxable Income | taxable_income |
221
269
| v22 | Child Tax Credit | ctc_value |
222
270
| v25 | EITC | eitc |
223
-
| v32 | State AGI | {state}_agi |
224
-
| v34 | State Std Deduction | {state}_standard_deduction |
225
-
| v36 | State Taxable Income | {state}_taxable_income |
271
+
| v32 | State AGI | `<state>_agi` |
272
+
| v34 | State Std Deduction | `<state>_standard_deduction` |
273
+
| v36 | State Taxable Income | `<state>_taxable_income` |
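The `<state>` placeholders in the table are meant to be substituted before querying PE. A minimal sketch of that substitution follows, assuming the prefix is the lowercase two-letter state abbreviation (as in `me_pension_income_deduction`); the `pe_variable` helper and the dictionary name are hypothetical.

```python
# Templates copied from the state rows of the table above.
STATE_VARIABLE_TEMPLATES = {
    "siitax": "<state>_income_tax",
    "v32": "<state>_agi",
    "v34": "<state>_standard_deduction",
    "v36": "<state>_taxable_income",
}


def pe_variable(taxsim_var: str, state_abbrev: str) -> str:
    """Fill the <state> placeholder for one TAXSIM output variable."""
    template = STATE_VARIABLE_TEMPLATES[taxsim_var]
    return template.replace("<state>", state_abbrev.lower())


print(pe_variable("v32", "ME"))
```

Not every state necessarily defines all four variables under these exact names, so a resolved name should be checked against the policyengine-us variables directory before it is queried.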
226
274
227
275
---
228
276
229
277
## Common Root Causes
230
278
231
-
### 1. YAML Expected Value Incorrect
232
-
- YAML expected value doesn't match actual TaxAct PDF form
233
-
- Always verify by extracting and reading the PDF
234
-
- Example: Issue #657 YAML said $147.97 but PDF showed $0
279
+
### 1. Reporter's claim based on stale PE version
280
+
- Reporter ran an older PE-US; the issue is already fixed.
281
+
- Caught by Step 0b (version comparison + changelog grep).
282
+
283
+
### 2. Already-tracked in policyengine-us
284
+
- An open PE-US issue or PR is in flight covering the same root cause.
285
+
- Caught by Step 0c (`gh search issues`).
286
+
287
+
### 3. YAML / Reporter's Expected Value Incorrect
288
+
- Numbers cited in the issue body don't match the actual TaxAct PDF form
289
+
- Always verify by extracting and reading the PDFs
290
+
- Example: reporter said "PE allows $502" but PDF showed PTC = $0 and PE returned $0 (the $502 was a max-table value, not what PE returned)
235
291
236
-
### 2. Data Entry Errors in Issue
292
+
### 4. Data Entry Errors in Issue
237
293
- Wrong state code (FIPS vs TAXSIM alphabetical)
238
294
- Missing or incorrect age (breaks age-based provisions)
239
-
- Wrong filing status
295
+
- Wrong filing status (most commonly: assuming `mstat=1` is single when it's HoH with dependents)
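The state-code confusion in the list above is easy to check mechanically. The sketch below assumes TAXSIM numbers the 50 states plus DC from 1 to 51 in alphabetical order of state name; the list and the `taxsim_state` helper are written for illustration and are not the repository's `get_state_code` implementation.

```python
# Assumed TAXSIM ordering: alphabetical by state name, DC after Delaware.
TAXSIM_STATES = [
    "AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "DC", "FL",
    "GA", "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME",
    "MD", "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH",
    "NJ", "NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI",
    "SC", "SD", "TN", "TX", "UT", "VT", "VA", "WA", "WV", "WI",
    "WY",
]


def taxsim_state(code: int) -> str:
    """Translate a 1-based TAXSIM state code to a postal abbreviation."""
    if not 1 <= code <= len(TAXSIM_STATES):
        raise ValueError(f"TAXSIM state code out of range: {code}")
    return TAXSIM_STATES[code - 1]


print(taxsim_state(34))  # NC under TAXSIM numbering; FIPS 34 is New Jersey
```

If an issue's state code only makes sense when read as FIPS, the input was probably entered with the wrong numbering.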
240
296
241
-
### 3. TAXSIM Bug
297
+
### 5. TAXSIM Bug
242
298
- TAXSIM source code has known bugs (SALT add-back, DTC phaseout, EITC age limits)
243
-
- Compare all three systems to identify
299
+
- Compare current PE vs reporter's PE vs TaxAct to triangulate
244
300
245
-
### 4. Missing State Provisions in PE
301
+
### 6. Missing State Provisions in PE
246
302
- State credit/deduction not implemented
247
303
- Year-specific parameter not updated
248
304
249
-
### 5. Missing Optimization in PE
305
+
### 7. Missing Optimization in PE
250
306
- PE uses fixed 50/50 splits for exemptions/deductions
251
307
- Some states (e.g., MS) allow optimal allocation between spouses
252
308
- TaxAct optimizes these allocations automatically
253
309
254
-
### 6. Input Mapping Issues (policyengine-taxsim)
310
+
### 8. Input Mapping Issues (policyengine-taxsim)
255
311
- Income not being split correctly between spouses
256
312
- Variable not mapped from TAXSIM input to PE situation
- Credits/taxes not handling negative AGI or capital losses correctly
269
325
- Phantom credits when income is negative
270
326
@@ -274,15 +330,17 @@ print(get_state_code(34)) # Should print "NC"
274
330
275
331
When an issue doesn't reproduce as expected:
276
332
333
+
- [ ] **Did Step 0 (triage) short-circuit?** Already fixed in newer PE, already tracked, or informational only?
277
334
- [ ] **State code correct?** (TAXSIM alphabetical, not FIPS)
278
335
- [ ] **v32 (State AGI) non-zero?** (If 0, state setup is wrong)
336
+
- [ ] **Filing status inference?** (`mstat=1 + depx≥1` → HoH, not single)
279
337
- [ ] **Ages set correctly?** (Many provisions are age-gated)
280
338
- [ ] **Income assigned to right person?** (Joint filers: check both)
281
339
- [ ] **Test with Simulation directly?** (Bypasses taxsim mapping)
282
340
- [ ] **Check existing tests in policyengine-us?** (May show expected behavior)
283
-
- [ ] **PDF form extracted and analyzed?** (YAML expected values may be wrong!)
284
-
- [ ] **PDF tax matches YAML expected?** (If not, use PDF as ground truth)
285
-
- [ ] **All three systems compared?** (PE, TAXSIM, and TaxAct)
341
+
- [ ] **PDFs extracted and analyzed?** (Reporter's expected values may be wrong!)
342
+
- [ ] **Compared current PE vs TaxAct?** Every PE value queried directly (no inference from gaps between variables).
343
+
- [ ] **For credit/deduction disagreements: did you fetch the statute + current-year form + instructions booklet?** (Search summaries are not authoritative.)
286
344
287
345
---
288
346
@@ -293,12 +351,12 @@ When an issue doesn't reproduce as expected:
293
351
- `policyengine_taxsim/core/input_mapper.py` - Converts TAXSIM input to PE situations
294
352
- `policyengine_taxsim/core/output_mapper.py` - Extracts PE results to TAXSIM format
295
353
- `policyengine_taxsim/core/utils.py` - State code mappings (SOI_TO_FIPS_MAP)
296
-
- `issue_analysis/` - Diagnosis tracking and findings
297
354
298
355
### policyengine-us
299
356
- `policyengine_us/variables/gov/states/{state}/tax/income/` - State tax variables
300
357
- `policyengine_us/parameters/gov/states/{state}/tax/income/` - State tax parameters
Audit and tighten the /diagnose-issue slash command: add Step 0 pre-triage (Q-vs-bug / version compare / existing PE-US tracking), require primary-source fetches for credit/deduction disagreements, mandate direct PE queries (no inference), and drop stale references.