
Commit 83c5b9a

Refresh planning docs and finalize recent scanner fixes
1 parent e739ebf commit 83c5b9a

10 files changed

Lines changed: 186 additions & 413 deletions


4X4.md

Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
1+
---
2+
client: Hypercart / Neochrome
3+
repo: https://github.com/Hypercart-Dev-Tools/WP-Code-Check.git
4+
last_edit: 2026-03-23
5+
week_of: 2026-03-23
6+
source_pr_number: n/a
7+
sprint: planning-refresh
8+
9+
---
10+
11+
Pro-tip: Ask your VS Code AI to self-populate the metadata above.
12+
13+
# 4x4 Dashboard - Strategic Goals & Weekly Checklist
14+
A simple, actionable framework to prioritize and track engineering tasks. Focus on alignment, transparency, continuous improvement, and increasing clarity for everyone. This document does not replace your project management tools. It is meant to be a simple, actionable checklist to help you focus on the most important tasks for the week.
15+
16+
# About the Current Project
17+
WP Code Check is a zero-dependency static analysis toolkit for WordPress performance, security, and reliability issues. Current engineering focus is stabilizing the search backend, reducing false positives in noisy direct-pattern rules, restoring fixture parity, and creating a measured path for an optional Semgrep backend without destabilizing the Bash scanner.
18+
19+
---
20+
21+
## 1. Strategic Backlog
22+
**Maximum of 4 items. Focus on long-term goals and impactful improvements.**
23+
**Update/Reminder:** If these are new projects without a clear scope, consider using the [Project Scope Outline](#bonus-project-scope-outline) below.
24+
25+
1. - [ ] Unify search backends under one wrapper layer - Replace timeout-wrapped raw file-discovery calls and helper fallbacks with a single API for file discovery, line matches, and context extraction.
26+
2. - [ ] Restore fixture and scorecard trust - Audit failing fixtures, align expected counts with intended semantics, and make regression output credible before larger rule migrations.
27+
3. - [ ] Pilot Semgrep on the highest-noise direct rules - Start with `unsanitized-superglobal-read`, `unsanitized-superglobal-isset-bypass`, `wpdb-query-no-prepare`, and `file-get-contents-url` using side-by-side scorecards.
28+
4. - [ ] Continue rule quality cleanup in Bash - Keep reducing false positives where the current engine is already close, especially heuristics and context-sensitive detectors that do not map cleanly to Semgrep.
29+
30+
---
31+
32+
## 2. Current Week
33+
**Active tasks for the week. Maximum of 4 items.**
34+
35+
> **Tip:** If your team frequently handles urgent issues, consider reserving 1-2 slots for hotfixes. Otherwise, use all 4 slots for planned work.
36+
37+
- [x] Refresh planning source of truth - Updated the Semgrep migration plan to match the current codebase, deprecated `BACKLOG.md`, and moved active planning into this 4X4.
38+
- [ ] Add file-discovery wrapper spike - Draft `cached_file_search()` or equivalent and replace one or two high-value call sites such as `AJAX_FILES` and `TERMS_FILES` to prove the interface.
39+
- [ ] Add observability for slow checks - Implement per-check timeout warnings and a small top-N slow-check summary so long scans stop looking stuck.
40+
- [ ] Audit fixture mismatches and shortlist Semgrep pilots - Re-run failing fixtures, classify false positives vs desired behavior, and finalize the first four Semgrep scorecard candidates.
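The file-discovery wrapper spike above could start from something like the following. This is a minimal sketch, not the project's implementation: only the name `cached_file_search()` comes from the checklist, while the arguments, the `FILE_SEARCH_CACHE_DIR` variable, and the cache-key scheme are assumptions made for illustration. The real wrapper would presumably also go through the existing `run_with_timeout` guard, which is omitted here.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a file-discovery wrapper. Everything except the
# function name is an assumption, not the scanner's actual interface.

# Per-run cache directory (assumed); set at top level so that calls made
# inside $(...) subshells all share the same directory.
FILE_SEARCH_CACHE_DIR="${FILE_SEARCH_CACHE_DIR:-$(mktemp -d)}"

# cached_file_search PATTERN DIR
# Prints the files under DIR that contain PATTERN (extended regex), one per
# line, sorted. A repeated call with the same arguments reads the cached list
# instead of walking the tree again.
cached_file_search() {
  local pattern="$1" dir="$2" key cache_file

  # Derive a cache key from both arguments (checksum plus length).
  key="$(printf '%s\n%s' "$pattern" "$dir" | cksum | tr ' ' '-')"
  cache_file="$FILE_SEARCH_CACHE_DIR/$key"

  if [ ! -f "$cache_file" ]; then
    # grep exits non-zero when nothing matches; an empty list is still cached.
    grep -rlE -- "$pattern" "$dir" 2>/dev/null | sort > "$cache_file"
  fi

  cat "$cache_file"
}
```

A call site such as `AJAX_FILES` could then read `AJAX_FILES="$(cached_file_search 'wp_ajax_' "$SCAN_DIR")"`, where both the pattern and `SCAN_DIR` are placeholders rather than the scanner's real values.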
41+
42+
---
43+
44+
## 3. Previous Week
45+
**Review completed, deferred, or blocked tasks from the prior week.**
46+
47+
- [x] Added Path B observability for aggregated magic-string patterns - phase timing and quality counters are now visible in text and JSON output.
48+
- [x] Fixed stale-registry fallback behavior - eliminated one apparent hang path in the pattern loader and guarded empty search patterns.
49+
- [x] Fixed high-noise direct-pattern false positives - reduced `php-shell-exec-functions`, `spo-002-superglobals`, and `php-dynamic-include` noise with targeted scanner and pattern fixes.
50+
- [ ] Phase 0b observability remains incomplete - heartbeat output and slow-check rollups are still deferred and need a focused pass.
51+
52+
---
53+
54+
## 4. Recent Lessons Learned
55+
**Capture insights to improve processes and avoid repeating mistakes.**
56+
57+
1. Small search-path inconsistencies create outsized noise - most recent false positives came from runner behavior and shell quoting, not from the pattern ideas themselves.
58+
2. Timeout protection is necessary but not sufficient - a timed-out check that silently passes prevents hangs, but it also hides diagnostic value unless we surface warnings and timing data.
59+
3. Planning drift happens fast in a monolithic script - line-number-based roadmap docs go stale quickly, so strategic docs need periodic refreshes tied to current code references.
60+
4. Not every noisy rule is a Semgrep problem first - if a Bash rule is already close after a targeted fix, it should drop in Semgrep priority behind noisier or structurally simpler candidates.
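Lesson 2 can be illustrated with a small sketch of a check runner that records a timeout instead of letting it pass silently. It assumes the GNU coreutils `timeout` command, which exits with status 124 when the limit is hit. The function name `run_check_with_warning` and the `TIMED_OUT_CHECKS` array are hypothetical and are not the scanner's `run_with_timeout` helper.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: surface a warning when a check times out.
# Assumes GNU coreutils `timeout` is available on PATH.

# Names of checks that hit their time limit during this run (assumed variable).
TIMED_OUT_CHECKS=()

# run_check_with_warning NAME SECONDS COMMAND [ARGS...]
# Runs COMMAND under a time limit. Output goes straight to stdout. On timeout
# a warning is written to stderr and the check name is recorded, so the final
# report can flag the result as incomplete rather than "passed".
run_check_with_warning() {
  local name="$1" limit="$2" status=0
  shift 2

  timeout "$limit" "$@" || status=$?

  if [ "$status" -eq 124 ]; then
    echo "WARN: check '$name' timed out after ${limit}s; results are incomplete" >&2
    TIMED_OUT_CHECKS+=("$name")
  fi
  return "$status"
}
```

Because the array is updated in the calling shell, the function should be invoked directly (for example with output redirected to a file) rather than inside `$(...)`, where the recorded name would be lost with the subshell.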
61+
62+
---
63+
64+
## Bonus: Project Scope Outline
65+
66+
> **Note:** This section is a work-in-progress and is being developed as a natural extension of the 4x4 methodology. It is not yet a core part of the framework but is included here as a supplementary tool for teams who want to add more context to their planning.
67+
68+
### 1. Goals
69+
Keep WP Code Check fast, explainable, and operationally reliable while improving detection quality on high-value WordPress performance and security checks.
70+
71+
### 2. Assumptions
72+
- The current Bash scanner remains the default engine through any Semgrep pilot.
73+
- Search backend stabilization should land incrementally rather than via a large rewrite.
74+
- Fixture parity work is required before scorecards will be trusted for migration decisions.
75+
- The highest-value weekly work is small and verifiable: one wrapper spike, one observability pass, one fixture audit pass.
76+
77+
### 3. Potential Risks
78+
- Semgrep rules may look cleaner on paper but still miss WordPress-specific context that the Bash pipeline currently encodes.
79+
- Wrapper changes can improve maintainability while accidentally changing output parity if they are not backed by fixture comparisons.
80+
- Monolithic-script edits can create regression risk in unrelated checks unless changes stay narrow and verified.
81+
- Planning can split across too many docs unless this file stays the active weekly view.
82+
83+
### 4. Long-Term Maintainability
84+
Long-term sustainability depends on shrinking the amount of bespoke search behavior in `check-performance.sh`, keeping rule logic externalized where possible, maintaining trustworthy fixtures, and promoting only the rules to Semgrep that clearly beat the Bash implementation on quality and runtime.
85+
86+
---
87+
88+
## How to Use This Template
89+
90+
For detailed guidance on the 4x4 framework, see the [README](README.md).
91+
92+
**Quick tips:**
93+
- Update this document weekly (move "Current Week" to "Previous Week" and plan your new week)
94+
- Limit each section to 4 items maximum to maintain focus
95+
- Capture lessons learned immediately while they're fresh
96+
- Link to detailed tasks in your project management tools (GitHub, Jira, etc.)
97+
98+
---
99+
100+
## License
101+
This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) License. To view a copy of this license, visit [https://creativecommons.org/licenses/by/4.0/](https://creativecommons.org/licenses/by/4.0/).
102+
103+
Copyright © 2026 Hypercart DBA Neochrome, Inc. | 4x4Clarity.com

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -25,6 +25,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
2525

2626
### Documentation
2727

28+
- Refreshed planning docs to match the current codebase and consolidated active planning into `4X4.md`
29+
- Updated `PROJECT/1-INBOX/FEATURE-SEMGREP-MIGRATION-PLAN.md` with current search-backend hotspots, refreshed Phase 0 status notes, and reprioritized the Semgrep pilot list based on current false-positive pressure
30+
- Deprecated `PROJECT/2-WORKING/BACKLOG.md` as an active planning source and converted it into a pointer to `4X4.md` plus the Semgrep roadmap doc
31+
- Populated `4X4.md` with current strategic goals, weekly work, prior-week accomplishments, and planning assumptions for WP Code Check
32+
2833
- Added `PROJECT/1-INBOX/PATTERN-PROPOSAL-LAUNCHPAD-CRASH.md`
2934
- Captures the plan for converting the Launchpad crash lessons into generalized WPCC anti-pattern proposals
3035
- Recommends against a single environment-specific "crash detector"

FEEDBACK-CR-SELF-SERVICE.md

Lines changed: 7 additions & 9 deletions
@@ -15,12 +15,12 @@
1515
**File:** `dist/patterns/php-shell-exec-functions.json`
1616
**FPs eliminated:** 8 (all CRITICAL — all were `curl_exec($curl)` calls)
1717

18-
- [ ] **FOLLOW-UP `php-dynamic-include.json` — WP-CLI bootstrap scripts still flagged as LFI** ⚠️ *Initial tweak in 740ba08 was insufficient*
18+
- [x] **`php-dynamic-include.json` — WP-CLI bootstrap scripts no longer flagged as LFI** *Resolved in follow-up commit*
1919
**Finding:** `check-user-meta.php:13` and `test-alternate-registry-id.php:24`: `$path` is iterated from a hardcoded static array, never user-controlled.
20-
**Attempted fix:** Added `wp-load` to `exclude_patterns`, but verification showed it does not suppress the actual matched line (`require_once $path;`).
21-
**Next fix:** Use `exclude_files` or a context-aware suppression that recognizes the surrounding `foreach ($wp_load_paths as $path) { if (file_exists($path)) { require_once $path; } }` bootstrap finder pattern.
22-
**File:** `dist/patterns/php-dynamic-include.json`
23-
**FPs remaining:** 2 (both CRITICAL)
20+
**Attempted fix (740ba08 — insufficient):** Added `wp-load` to `exclude_patterns`, but the actual matched line is `require_once $path;` — it does not contain `wp-load`.
21+
**Proper fix:** Added new `exclude_if_file_contains` capability to the simple pattern runner and `dist/bin/check-performance.sh`. When a matched file's content contains any string listed in the new `exclude_if_file_contains` JSON array, all matches in that file are suppressed. Added `"wp eval-file"` to `php-dynamic-include.json` under this key — both WP-CLI scripts have this string in their docblock comment.
22+
**Files changed:** `dist/bin/check-performance.sh` (runner feature), `dist/patterns/php-dynamic-include.json` (new exclusion key)
23+
**FPs eliminated:** 2 (both CRITICAL)
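The file-level suppression described in the proper fix can be sketched roughly as follows. This is an illustration of the idea only: the helper names `file_has_exclusion_marker` and `filter_matches` are hypothetical, the markers are passed as plain arguments instead of being read from the pattern JSON, and the input is assumed to be `grep -Hn` style `file:line:text` rows.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of exclude_if_file_contains semantics. The helper names
# and the way markers are passed in are assumptions for illustration.

# file_has_exclusion_marker FILE MARKER...
# Succeeds when FILE contains any MARKER as a fixed (non-regex) string.
file_has_exclusion_marker() {
  local file="$1" marker
  shift
  for marker in "$@"; do
    if grep -qF -- "$marker" "$file"; then
      return 0
    fi
  done
  return 1
}

# filter_matches MARKER...
# Reads "file:line:text" rows on stdin and drops every row whose file
# contains one of the markers, so all matches in that file are suppressed.
filter_matches() {
  local row file
  while IFS= read -r row; do
    file="${row%%:*}"   # path up to the first colon (assumes no colon in paths)
    if ! file_has_exclusion_marker "$file" "$@"; then
      printf '%s\n' "$row"
    fi
  done
}
```

For example, `grep -Hn 'require_once \$path' *.php | filter_matches "wp eval-file"` would keep matches only from files that never mention `wp eval-file`.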
2424

2525
---
2626

@@ -90,12 +90,10 @@
9090
| Fix | File to Edit | Effort | FPs Eliminated | Status |
9191
|-----|-------------|--------|---------------|--------|
9292
| `\b` word boundary on `exec-call` | `php-shell-exec-functions.json` | 1 line | 8 | ✅ Done (740ba08) |
93-
| Add `wp-load` to `exclude_patterns` | `php-dynamic-include.json` | 1 line | 0 verified | ⚠️ Partial only; follow-up still needed |
93+
| `exclude_if_file_contains` + `wp eval-file` | `check-performance.sh` + `php-dynamic-include.json` | Medium | 2 verified | ✅ Done |
9494
| Single-quote inline spo-002 grep | `check-performance.sh` ~L3723 | 1 line | 28 verified | ✅ Done |
9595
| Apply `exclude_patterns` in simple runner | `check-performance.sh` ~L5970 | Medium | 11 verified | ✅ Done |
9696
| Admin-only hook whitelist | `check-performance.sh` | Medium | 1+ per scan | 📋 Deferred |
9797
| N+1 loop containment tightening | `check-performance.sh` | Medium | 2+ per scan | 📋 Deferred |
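
The first row of the table (the `\b` word boundary on `exec-call`) is easy to demonstrate in isolation. The regex below is an assumed stand-in, not the pattern stored in `php-shell-exec-functions.json`, and it relies on `\b` support in `grep -E`, which GNU grep provides.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: why a word boundary stops curl_exec() from matching.
# The regex is an illustrative assumption, not the project's actual pattern.

# matches_exec_call LINE
# Succeeds when LINE contains a standalone exec( call. The underscore in
# curl_exec is a word character, so there is no boundary before "exec" there.
# Other functions such as shell_exec would need their own pattern.
matches_exec_call() {
  printf '%s\n' "$1" | grep -qE '\bexec[[:space:]]*\('
}
```

With this regex, `matches_exec_call 'exec($cmd);'` succeeds, while `matches_exec_call '$r = curl_exec($curl);'` fails.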
9898

99-
**Latest measured totals:** 99 findings before scanner fixes → **88 findings after scanner fixes**.
100-
101-
**Important follow-up:** `php-dynamic-include` is still reporting 2 findings after the latest verification. The previous `wp-load` exclusion was too weak because the actual matched line is `require_once $path;`, not the nearby `file_exists($path)` / `wp-load.php` discovery line. That rule needs a better context-aware suppression or file exclusion strategy.
99+
**Latest measured totals:** 99 findings before scanner fixes → **88 findings after first round** → **86 findings after dynamic-include fix**.

PROJECT/1-INBOX/FEATURE-SEMGREP-MIGRATION-PLAN.md

Lines changed: 37 additions & 32 deletions
@@ -3,10 +3,10 @@
33
**Created:** 2026-02-10
44
**Status:** Phase 0a Complete
55
**Priority:** High
6-
**Last Updated:** 2026-02-09
6+
**Last Updated:** 2026-03-23
77

88
## Problem/Request
9-
Intermittent scan stalls occur on very large repositories (expected risk) and sometimes on smaller projects (unexpected). The current scanner mixes cached and uncached recursive search paths, with several raw `grep -r*` and `xargs grep` call sites that can still cause unstable runtime behavior.
9+
Intermittent scan stalls still matter on very large repositories and the scanner still mixes multiple search paths. The highest-risk problems are no longer unguarded hangs in the active path; they are now a maintainability split between `cached_grep()`-based scans, timeout-wrapped raw recursive file discovery, and helper-level `xargs` or raw `grep -r` fallback behavior.
1010

1111
## Context
1212
- Scanner entrypoint: `dist/bin/check-performance.sh`
@@ -16,16 +16,21 @@ Intermittent scan stalls occur on very large repositories (expected risk) and so
1616
- `cached_grep()`
1717
- `.wpcignore` support
1818
- `--skip-magic-strings`
19-
- Remaining high-risk hotspots still use raw recursive grep or raw xargs patterns:
20-
- `dist/bin/check-performance.sh:2617`
21-
- `dist/bin/check-performance.sh:3222`
22-
- `dist/bin/check-performance.sh:4216`
23-
- `dist/bin/check-performance.sh:4617`
24-
- `dist/bin/check-performance.sh:5024`
25-
- `dist/bin/check-performance.sh:5271`
26-
- `dist/bin/check-performance.sh:5463`
27-
- `dist/bin/check-performance.sh:5554`
28-
- `dist/bin/check-performance.sh:5633`
19+
- Active scan path still contains timeout-wrapped raw recursive file-discovery calls:
20+
- `dist/bin/check-performance.sh:4372` - `AJAX_FILES`
21+
- `dist/bin/check-performance.sh:4773` - `TERMS_FILES`
22+
- `dist/bin/check-performance.sh:5180` - `CRON_FILES`
23+
- `dist/bin/check-performance.sh:5427` - `N1_FILES`
24+
- `dist/bin/check-performance.sh:5619` - `THANKYOU_CONTEXT_FILES`
25+
- `dist/bin/check-performance.sh:5710` - `SMART_COUPONS_FILES`
26+
- `dist/bin/check-performance.sh:5722` - `PERF_RISK_FILES`
27+
- `dist/bin/check-performance.sh:5789` - `JSON_RESPONSE_FILES`
28+
- Helper-level raw search behavior still exists and is part of the maintenance burden:
29+
- `dist/bin/check-performance.sh:3378` - aggregated pattern `xargs grep`
30+
- `dist/bin/check-performance.sh:3381` - aggregated pattern raw `grep -rHn` fallback
31+
- `dist/bin/check-performance.sh:3576` - `fast_grep()` `xargs grep`
32+
- `dist/bin/check-performance.sh:3579` - `fast_grep()` raw `grep -rHn` fallback
33+
- `dist/bin/check-performance.sh:3634` - `cached_grep()` raw `grep -rHn` fallback
2934

3035
## Direct Answer: Improvements Possible on Raw Recursive/Xargs Paths
3136
Yes. The most impactful improvements are:
@@ -48,15 +53,15 @@ Yes. The most impactful improvements are:
4853
**Tasks**
4954
1. ~~Replace known raw recursive/xargs hotspots with safer cached/wrapped calls.~~ → Refined: xargs calls at lines 2617/3222 are intentionally using pre-cached file lists inside already-protected paths — not actual hotspots. The real issue was 8 unprotected `grep -rl` file-discovery calls.
5055
2. Standardize null-delimited file handling for all multi-file grep execution. → Deferred (existing `cached_grep` already uses `tr '\n' '\0' | xargs -0`; the 8 patched calls are file-discovery, not line-matching)
51-
3. **Ensure every expensive check uses timeout guards.** — Complete (2026-02-09). Wrapped 8 raw `grep -r` calls with `run_with_timeout "$MAX_SCAN_TIME"`:
52-
- `AJAX_FILES` (line ~4216)
53-
- `TERMS_FILES` (line ~4617)
54-
- `CRON_FILES` (line ~5024)
55-
- `N1_FILES` (line ~5271, pipeline — timeout wraps recursive grep stage)
56-
- `THANKYOU_CONTEXT_FILES` (line ~5463)
57-
- `SMART_COUPONS_FILES` (line ~5554)
58-
- `PERF_RISK_FILES` (line ~5566)
59-
- `JSON_RESPONSE_FILES` (line ~5633)
56+
3. **Ensure every expensive check uses timeout guards.** — Complete (2026-02-09). The active file-discovery calls remain raw `grep -r*`, but they are wrapped with `run_with_timeout "$MAX_SCAN_TIME"`:
57+
- `AJAX_FILES` (line ~4372)
58+
- `TERMS_FILES` (line ~4773)
59+
- `CRON_FILES` (line ~5180)
60+
- `N1_FILES` (line ~5427, pipeline — timeout wraps the recursive grep stage)
61+
- `THANKYOU_CONTEXT_FILES` (line ~5619)
62+
- `SMART_COUPONS_FILES` (line ~5710)
63+
- `PERF_RISK_FILES` (line ~5722)
64+
- `JSON_RESPONSE_FILES` (line ~5789)
6065
4. Add heartbeat logs every 10 seconds for long loops. → Deferred to Phase 0b
6166
5. Add top-N slow checks summary at end of scan. → Deferred to Phase 0b
6267
6. Improve docs for `.wpcignore`, `--skip-magic-strings`, and `MAX_SCAN_TIME`. → Deferred to Phase 0b
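The deferred top-N slow-check summary (task 5) could be as small as the sketch below. The names `CHECK_TIMINGS`, `record_check_time`, and `print_slowest_checks` are hypothetical. In the scanner the durations would come from real measurements, for example by reading Bash's `SECONDS` variable before and after each check.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a top-N slow-check summary. All names are assumptions.

# One entry per finished check, stored as "<seconds> <check-name>".
CHECK_TIMINGS=()

# record_check_time NAME SECONDS
record_check_time() {
  CHECK_TIMINGS+=("$2 $1")
}

# print_slowest_checks [N]
# Prints the N slowest recorded checks (default 3), slowest first.
print_slowest_checks() {
  local n="${1:-3}"
  [ "${#CHECK_TIMINGS[@]}" -eq 0 ] && return 0
  printf '%s\n' "${CHECK_TIMINGS[@]}" | sort -k1,1nr | head -n "$n"
}
```

Calling `print_slowest_checks 5` at the end of a scan would give users a quick view of where the time went, which is the gap the third exit criterion describes.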
@@ -67,13 +72,13 @@ Yes. The most impactful improvements are:
6772
- Baseline performance snapshot (before/after) → Deferred to Phase 0b
6873

6974
**Exit Criteria**
70-
- [x] No unguarded raw `grep -r*` in active scan path (remaining raw grep -r only inside `fast_grep()`/`cached_grep()` fallback paths — by design)
75+
- [x] No unguarded raw `grep -r*` in active scan path; however, active scan still contains timeout-wrapped raw file-discovery calls and helper internals still retain raw `grep -r` / `xargs grep` fallback behavior
7176
- [ ] Small-project scans complete reliably in repeated runs → Needs verification testing
7277
- [ ] Users can identify long-running checks from logs → Deferred to Phase 0b
7378

7479
**Implementation Notes (2026-02-09):**
7580
- The xargs calls at lines 2617 and 3222 were originally listed as hotspots but are inside already-protected paths (pre-cached file list + `run_with_timeout`). Removed from scope.
76-
- Timeout behavior: on timeout, check returns empty result, reports "passed," scan continues. Silent degradation chosen over hang. Per-check timeout warnings deferred (see BACKLOG.md).
81+
- Timeout behavior: on timeout, check returns empty result, reports "passed," scan continues. Silent degradation chosen over hang. Per-check timeout warnings remain deferred and are now tracked in `4X4.md` instead of `BACKLOG.md`.
7782
- No new functions or abstractions introduced — reuses existing `run_with_timeout` infrastructure.
7883

7984
### Phase 1: Unified Search Backend Wrapper
@@ -113,17 +118,17 @@ Yes. The most impactful improvements are:
113118
- Semgrep is optional via feature flag.
114119
- Pilot only direct/noisy rule subset.
115120

116-
**Candidate Rules (initial)**
121+
**Candidate Rules (reprioritized for current false-positive pressure)**
117122
1. `unsanitized-superglobal-read`
118-
2. `spo-002-superglobal-manipulation`
123+
2. `unsanitized-superglobal-isset-bypass`
119124
3. `wpdb-query-no-prepare`
120-
4. `php-eval-injection`
121-
5. `php-dynamic-include`
122-
6. `php-shell-exec-functions`
123-
7. `php-hardcoded-credentials`
124-
8. `unsanitized-superglobal-isset-bypass`
125-
9. `file-get-contents-url`
126-
10. `wp-json-html-escape` (evaluate feasibility)
125+
4. `file-get-contents-url`
126+
5. `wp-json-html-escape`
127+
6. `php-hardcoded-credentials`
128+
7. `php-eval-injection`
129+
8. `spo-002-superglobal-manipulation` - lower urgency after the inline grep quoting fix
130+
9. `php-dynamic-include` - lower urgency after the WP-CLI bootstrap false-positive fix
131+
10. `php-shell-exec-functions` - lower urgency after the `curl_exec()` word-boundary fix
127132

128133
**Tasks**
129134
1. Implement `--search-backend semgrep` toggle (default remains current backend).
