|
| 1 | +**STATUS:** Phase 1 Improvements Complete ✅ - Phase 2 Ready |
| 2 | +**Author:** GitHub Copilot (Chat GPT 5.2) |
| 3 | +**PRIORITY**: High |
| 4 | +**Started:** 2026-01-12 |
| 5 | +**Phase 1 Completed:** 2026-01-12 |
| 6 | +**Phase 1 Improvements Completed:** 2026-01-12 |
| 7 | + |
| 8 | +## Context |
| 9 | + |
| 10 | +This plan is based on a real-world calibration exercise where the **deterministic GREP/pattern scanner output (raw JSON findings)** was compared against a **manual review of the actual WP Health Check & Troubleshooting plugin code in `/temp`**. The goal is to convert the observed false positives and “needs review” hot spots into concrete scanner improvements that reduce noise without hiding genuine issues. |
| 11 | + |
| 12 | +## Table of Contents |
| 13 | + |
| 14 | +- [Phased Progress Checklist (High Level)](#phased-progress-checklist-high-level) |
| 15 | +- [Phase 1 — Reduce Obvious False Positives (Low Risk, High Impact)](#phase-1--reduce-obvious-false-positives-low-risk-high-impact) |
| 16 | +- [Phase 2 — Add Context Signals (Guards + Sanitization) to Improve Triage](#phase-2--add-context-signals-guards--sanitization-to-improve-triage) |
| 17 | +- [Phase 3 — Reclassify Findings (Categories + Severity Defaults)](#phase-3--reclassify-findings-categories--severity-defaults) |
| 18 | +- [Acceptance Criteria](#acceptance-criteria) |
| 19 | + |
| 20 | +> **Note for the LLM/agent:** As each task is completed, continuously update this document by ticking the relevant checklist items (`[x]`). |
| 21 | +Also, update changelog to reflect changes. |
| 22 | + |
| 23 | +## Phased Progress Checklist (High Level) |
| 24 | + |
| 25 | +- [x] **Phase 1 complete:** Scanner no longer flags PHPDoc/comment-only matches; avoids POST-method false positives in HTML/REST config. ✅ **COMPLETED 2026-01-12** |
| 26 | +- [ ] **Phase 2 complete:** Findings include context signals (nonce/cap checks; sanitizer detection) and are downgraded appropriately. |
| 27 | +- [ ] **Phase 3 complete:** Findings are categorized (security vs best-practice vs performance) with clearer default severities. |
| 28 | + |
| 29 | +### Phase 1 Results (2026-01-12) |
| 30 | + |
| 31 | +**Initial Implementation (v1.2.3):** |
| 32 | +- ✅ Created `is_line_in_comment()` helper function to detect PHPDoc/comment blocks |
| 33 | +- ✅ Created `is_html_or_rest_config()` helper function to detect HTML forms and REST route configs |
| 34 | +- ✅ Integrated filters into HTTP timeout check, superglobal manipulation check, and unsanitized superglobal read check |
| 35 | +- ✅ Created test fixtures: `phase1-comment-filtering.php` and `phase1-html-rest-filtering.php` |
| 36 | + |
| 37 | +**Phase 1 Improvements (v1.2.4):** |
| 38 | +- ✅ Improved `is_line_in_comment()` with string literal detection, 100-line backscan, inline comment detection |
| 39 | +- ✅ Improved `is_html_or_rest_config()` with anchored patterns, case-insensitive matching |
| 40 | +- ✅ Moved helpers to shared library: `dist/bin/lib/false-positive-filters.sh` |
| 41 | +- ✅ Created verification script: `dist/tests/verify-phase1-improvements.sh` |
| 42 | +- ✅ Enhanced test fixtures with 12+ edge cases |
| 43 | + |
| 44 | +**Results on Health Check Plugin:** |
| 45 | +- **Baseline (before Phase 1)**: 75 total findings |
| 46 | +- **After Phase 1 (v1.2.3)**: 74 total findings (3 PHPDoc false positives eliminated) |
| 47 | +- **After Phase 1 Improvements (v1.2.4)**: **67 total findings** |
| 48 | +- **Total Improvement**: **10.6% reduction** (8 false positives eliminated) |
| 49 | +- **HTTP Timeout Findings**: Consistently 3 (all actual code, no false positives) |
| 50 | + |
| 51 | +**Files Modified:** |
| 52 | +- `dist/bin/check-performance.sh` - Integrated shared library, removed duplicate code |
| 53 | +- `dist/bin/lib/false-positive-filters.sh` - New shared library with improved helpers |
| 54 | +- `dist/tests/fixtures/phase1-comment-filtering.php` - Enhanced with edge cases |
| 55 | +- `dist/tests/fixtures/phase1-html-rest-filtering.php` - Enhanced with edge cases |
| 56 | +- `dist/tests/verify-phase1-improvements.sh` - New verification script |
| 57 | + |
| 58 | +## Phase 1 — Reduce Obvious False Positives (Low Risk, High Impact) |
| 59 | + |
| 60 | +### Goal |
| 61 | +Eliminate the most common “clearly wrong” matches that do not represent executable code paths. |
| 62 | + |
| 63 | +### Checklist |
| 64 | +- [ ] **Comment/docblock aware matching** |
| 65 | + - [ ] Ignore matches inside PHPDoc blocks (`/** ... */`). |
| 66 | + - [ ] Ignore matches inside block comments (`/* ... */`). |
| 67 | + - [ ] Ignore matches inside single-line comments (`// ...`). |
| 68 | + - [ ] Regression check: a docblock like `@uses wp_remote_get()` no longer triggers `http-no-timeout`. |
| 69 | + |
| 70 | +- [ ] **Stop treating HTML/REST config as superglobal access** |
| 71 | + - [ ] Ensure rules like `spo-002-superglobals` only match real superglobal tokens (e.g., `$_GET[` / `$_POST[` / `$_REQUEST[` / etc.). |
| 72 | + - [ ] Explicitly exclude/avoid matching: |
| 73 | + - [ ] `<form ... method="POST">` (HTML attribute) |
| 74 | + - [ ] `array( 'methods' => 'POST', ... )` (REST route config) |
| 75 | + |
| 76 | +### Deliverables |
| 77 | +- [ ] Updated scanner logic/patterns to ignore comment/docblock contexts. |
| 78 | +- [ ] Updated superglobal rules to match only executable access patterns. |
| 79 | +- [ ] A small regression fixture set covering: |
| 80 | + - [ ] docblock `@uses` vs real function call |
| 81 | + - [ ] HTML `<form method="POST">` |
| 82 | + - [ ] REST route `'methods' => 'POST'` |
| 83 | + |
| 84 | +## Phase 2 — Add Context Signals (Guards + Sanitization) to Improve Triage |
| 85 | + |
| 86 | +### Goal |
| 87 | +Keep reporting potentially risky patterns, but attach “context” so reviewers can triage faster and reduce high-severity noise. |
| 88 | + |
| 89 | +### Checklist |
| 90 | +- [ ] **Guard heuristics (nearby checks)** |
| 91 | + - [ ] If a superglobal read is preceded within ~N lines by `check_ajax_referer(`, downgrade severity (e.g., `error -> review`). |
| 92 | + - [ ] If preceded within ~N lines by `wp_verify_nonce(` (or equivalent nonce checks), downgrade severity. |
| 93 | + - [ ] If preceded within ~N lines by `current_user_can(` (or wrapper), downgrade severity. |
| 94 | + - [ ] Output should record which guard(s) were detected (e.g., `guards: ['check_ajax_referer','current_user_can']`). |
| 95 | + |
| 96 | +- [ ] **Sanitizer/caster detection on superglobal reads** |
| 97 | + - [ ] Detect common WP sanitizers/casters wrapping input (examples): |
| 98 | + - [ ] `sanitize_text_field( $_GET[...] )` |
| 99 | + - [ ] `sanitize_email( $_POST[...] )` |
| 100 | + - [ ] `absint( $_GET[...] )` |
| 101 | + - [ ] `esc_url_raw( $_REQUEST[...] )` |
| 102 | + - [ ] Output should record which sanitizer was detected (e.g., `sanitizers: ['sanitize_email']`). |
| 103 | + |
| 104 | +- [ ] **Refine `$wpdb->prepare()` finding severity when no user input exists** |
| 105 | + - [ ] If SQL is a literal and only includes safe identifiers (e.g. `{$wpdb->options}`), classify as best-practice / lower severity. |
| 106 | + - [ ] Keep higher severity for concatenated SQL that includes superglobals or other tainted variables. |
| 107 | + |
| 108 | +### Deliverables |
| 109 | +- [ ] JSON output augmented with guard/sanitizer hints. |
| 110 | +- [ ] Severity downgrade rules for “guarded” findings. |
| 111 | +- [ ] Regression fixtures for guarded vs unguarded superglobal reads. |
| 112 | + |
| 113 | +## Phase 3 — Reclassify Findings (Categories + Severity Defaults) |
| 114 | + |
| 115 | +### Goal |
| 116 | +Separate “likely vulnerability” from “context-dependent security hygiene” and “best practice” so output is easier to consume. |
| 117 | + |
| 118 | +### Checklist |
| 119 | +- [ ] **Add/standardize rule categories** |
| 120 | + - [ ] `security-vuln-likely` |
| 121 | + - [ ] `security-context-dependent` |
| 122 | + - [ ] `best-practice` |
| 123 | + - [ ] `performance` |
| 124 | + |
| 125 | +- [ ] **Define default severity per category** |
| 126 | + - [ ] Ensure best-practice rules (e.g., missing explicit timeout) do not default to the same “HIGH” urgency as exploitable patterns. |
| 127 | + |
| 128 | +- [ ] **Update reporting summary** |
| 129 | + - [ ] Summaries should group by category and severity. |
| 130 | + - [ ] Ensure the report can clearly answer: |
| 131 | + - [ ] “How many confirmed?” |
| 132 | + - [ ] “How many false positives?” |
| 133 | + - [ ] “How many need review?” |
| 134 | + |
| 135 | +### Deliverables |
| 136 | +- [ ] Updated pattern metadata schema (if needed) to include `category` and default severity. |
| 137 | +- [ ] Updated report generator to group by category. |
| 138 | + |
| 139 | +## Acceptance Criteria |
| 140 | + |
| 141 | +- [ ] Running the scanner on `/temp` (WP Health Check plugin) shows: |
| 142 | + - [ ] No findings triggered purely by docblocks/comments. |
| 143 | + - [ ] No findings triggered by HTML `<form method="POST">`. |
| 144 | + - [ ] No findings triggered by REST route `'methods' => 'POST'`. |
| 145 | + - [ ] Superglobal findings include guard/sanitizer context when present. |
| 146 | + - [ ] `$wpdb->query()` “no prepare” static queries are reduced in severity / categorized as best-practice. |
| 147 | + - [ ] Reports clearly separate security-vuln-likely vs best-practice vs performance. |
0 commit comments