A simple, actionable framework to prioritize and track engineering tasks. Focus on alignment, transparency, continuous improvement, and increasing clarity for everyone. This document does not replace your project management tools. It is meant to be a simple, actionable checklist to help you focus on the most important tasks for the week.
+
+ # About the Current Project
+ WP Code Check is a zero-dependency static analysis toolkit for WordPress performance, security, and reliability issues. Current engineering focus is stabilizing the search backend, reducing false positives in noisy direct-pattern rules, restoring fixture parity, and creating a measured path for an optional Semgrep backend without destabilizing the Bash scanner.
+
+ ---
+
+ ## 1. Strategic Backlog
+ **Maximum of 4 items. Focus on long-term goals and impactful improvements.**
+ **Update/Reminder:** If these are new projects without a clear scope, consider using the [Project Scope Outline](#bonus-project-scope-outline) below.
+
+ 1. - [ ] Unify search backends under one wrapper layer - Replace timeout-wrapped raw file-discovery calls and helper fallbacks with a single API for file discovery, line matches, and context extraction.
+ 2. - [ ] Restore fixture and scorecard trust - Audit failing fixtures, align expected counts with intended semantics, and make regression output credible before larger rule migrations.
+ 3. - [ ] Pilot Semgrep on the highest-noise direct rules - Start with `unsanitized-superglobal-read`, `unsanitized-superglobal-isset-bypass`, `wpdb-query-no-prepare`, and `file-get-contents-url` using side-by-side scorecards.
+ 4. - [ ] Continue rule quality cleanup in Bash - Keep reducing false positives where the current engine is already close, especially heuristics and context-sensitive detectors that do not map cleanly to Semgrep.
+
+ ---
+
+ ## 2. Current Week
+ **Active tasks for the week. Maximum of 4 items.**
+
+ > **Tip:** If your team frequently handles urgent issues, consider reserving 1-2 slots for hotfixes. Otherwise, use all 4 slots for planned work.
+
+ - [x] Refresh planning source of truth - Updated the Semgrep migration plan to match the current codebase, deprecated `BACKLOG.md`, and moved active planning into this 4X4.
+ - [ ] Add file-discovery wrapper spike - Draft `cached_file_search()` or equivalent and replace one or two high-value call sites such as `AJAX_FILES` and `TERMS_FILES` to prove the interface.
+ - [ ] Add observability for slow checks - Implement per-check timeout warnings and a small top-N slow-check summary so long scans stop looking stuck.
+ - [ ] Audit fixture mismatches and shortlist Semgrep pilots - Re-run failing fixtures, classify false positives vs desired behavior, and finalize the first four Semgrep scorecard candidates.
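
For orientation, the file-discovery wrapper spike could start from something like the sketch below. Only the name `cached_file_search()` comes from the task list; the signature, the `WPCC_CACHE_DIR` variable, the `*.php` filter, and the `cksum`-based cache key are assumptions made for illustration and are not taken from `check-performance.sh`.

```shell
# Hypothetical sketch: cache `grep -rl` file-discovery results per (pattern, root).
# WPCC_CACHE_DIR and this signature are illustrative assumptions.
WPCC_CACHE_DIR="${WPCC_CACHE_DIR:-$(mktemp -d)}"

cached_file_search() {
  local pattern="$1" root="$2" key cache_file
  # Derive a cache key from the pattern and the search root.
  key=$(printf '%s\n%s\n' "$pattern" "$root" | cksum | cut -d' ' -f1)
  cache_file="$WPCC_CACHE_DIR/files-$key.list"
  if [ ! -f "$cache_file" ]; then
    # grep exits 1 when nothing matches; keep that as an empty cached result.
    grep -rlE --include='*.php' -- "$pattern" "$root" > "$cache_file" || true
  fi
  cat "$cache_file"
}
```

A call site such as `AJAX_FILES` might then become `AJAX_FILES=$(cached_file_search 'wp_ajax_' "$scan_root")`, where both the pattern and `scan_root` are placeholders. A real version would likely also wrap the `grep` call in the existing `run_with_timeout` helper and use a collision-resistant cache key.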
+
+ ---
+
+ ## 3. Previous Week
+ **Review completed, deferred, or blocked tasks from the prior week.**
+
+ - [x] Added Path B observability for aggregated magic-string patterns - phase timing and quality counters are now visible in text and JSON output.
+ - [x] Fixed stale-registry fallback behavior - eliminated one apparent hang path in the pattern loader and guarded empty search patterns.
+ - [x] Fixed high-noise direct-pattern false positives - reduced `php-shell-exec-functions`, `spo-002-superglobals`, and `php-dynamic-include` noise with targeted scanner and pattern fixes.
+ - [ ] Phase 0b observability remains incomplete - heartbeat output and slow-check rollups are still deferred and need a focused pass.
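
The deferred slow-check rollup could be as small as the sketch below: time each check, append the result to a log, and print the N slowest entries at the end of the scan. The names `TIMING_LOG`, `timed_check`, and `print_slow_checks` are hypothetical, and `date +%s` gives only one-second resolution.

```shell
# Hypothetical sketch: record per-check wall time, then print the N slowest checks.
# TIMING_LOG, timed_check, and print_slow_checks are illustrative names.
TIMING_LOG="${TIMING_LOG:-$(mktemp)}"

timed_check() {
  local name="$1" start end status=0
  shift
  start=$(date +%s)
  "$@" || status=$?
  end=$(date +%s)
  # One line per check: elapsed seconds, a tab, then the check name.
  printf '%s\t%s\n' "$((end - start))" "$name" >> "$TIMING_LOG"
  return "$status"
}

print_slow_checks() {
  local top_n="${1:-5}"
  echo "Slowest checks (seconds):"
  # Numeric reverse sort on the leading elapsed-seconds column.
  sort -rn "$TIMING_LOG" | head -n "$top_n"
}
```

Usage would look like `timed_check "ajax-polling" run_ajax_check` for each check (both names are placeholders), followed by `print_slow_checks 5` once the scan finishes.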
+
+ ---
+
+ ## 4. Recent Lessons Learned
+ **Capture insights to improve processes and avoid repeating mistakes.**
+
+ 1. Small search-path inconsistencies create outsized noise - most recent false positives came from runner behavior and shell quoting, not from the pattern ideas themselves.
+ 2. Timeout protection is necessary but not sufficient - a timed-out check that silently passes prevents hangs, but it also hides diagnostic value unless we surface warnings and timing data.
+ 3. Planning drift happens fast in a monolithic script - line-number-based roadmap docs go stale quickly, so strategic docs need periodic refreshes tied to current code references.
+ 4. Not every noisy rule is a Semgrep problem first - if a Bash rule is already close after a targeted fix, it should drop in Semgrep priority behind noisier or structurally simpler candidates.
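
Lesson 2 can be made concrete with a small sketch that separates "timed out" from "found nothing". It assumes GNU coreutils `timeout`, which exits with status 124 when the time limit expires. `run_check_with_warning` is a hypothetical name and is not the scanner's existing `run_with_timeout` helper, whose internals are not shown here.

```shell
# Hypothetical sketch: distinguish a timed-out check from a check with no findings.
# Assumes GNU coreutils `timeout` (exit status 124 on expiry).
run_check_with_warning() {
  local name="$1" limit="$2" status=0
  shift 2
  timeout "$limit" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    # Surface the timeout on stderr instead of letting the check look like a pass.
    echo "WARNING: check '$name' timed out after ${limit}s; results may be incomplete" >&2
  fi
  return "$status"
}
```

With this shape, a `grep` that simply finds nothing still returns 1 without a warning, while an expired limit returns 124 and leaves a visible trace in the log.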
+
+ ---
+
+ ## Bonus: Project Scope Outline
+
+ > **Note:** This section is a work-in-progress and is being developed as a natural extension of the 4x4 methodology. It is not yet a core part of the framework but is included here as a supplementary tool for teams who want to add more context to their planning.
+
+ ### 1. Goals
+ Keep WP Code Check fast, explainable, and operationally reliable while improving detection quality on high-value WordPress performance and security checks.
+
+ ### 2. Assumptions
+ - The current Bash scanner remains the default engine through any Semgrep pilot.
+ - Search backend stabilization should land incrementally rather than via a large rewrite.
+ - Fixture parity work is required before scorecards will be trusted for migration decisions.
+ - The highest-value weekly work is small and verifiable: one wrapper spike, one observability pass, one fixture audit pass.
+
+ ### 3. Potential Risks
+ - Semgrep rules may look cleaner on paper but still miss WordPress-specific context that the Bash pipeline currently encodes.
+ - Wrapper changes can improve maintainability while accidentally changing output parity if they are not backed by fixture comparisons.
+ - Monolithic-script edits can create regression risk in unrelated checks unless changes stay narrow and verified.
+ - Planning can split across too many docs unless this file stays the active weekly view.
+
+ ### 4. Long-Term Maintainability
+ Long-term sustainability depends on shrinking the amount of bespoke search behavior in `check-performance.sh`, keeping rule logic externalized where possible, maintaining trustworthy fixtures, and promoting only the rules to Semgrep that clearly beat the Bash implementation on quality and runtime.
+
+ ---
+
+ ## How to Use This Template
+
+ For detailed guidance on the 4x4 framework, see the [README](README.md).
+
+ **Quick tips:**
+ - Update this document weekly (move "Current Week" to "Previous Week" and plan your new week)
+ - Limit each section to 4 items maximum to maintain focus
+ - Capture lessons learned immediately while they're fresh
+ - Link to detailed tasks in your project management tools (GitHub, Jira, etc.)
+
+ ---
+
+ ## License
+ This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) License. To view a copy of this license, visit [https://creativecommons.org/licenses/by/4.0/](https://creativecommons.org/licenses/by/4.0/).
File: `CHANGELOG.md` (5 additions & 0 deletions)
@@ -25,6 +25,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

  ### Documentation

+ - Refreshed planning docs to match the current codebase and consolidated active planning into `4X4.md`
+ - Updated `PROJECT/1-INBOX/FEATURE-SEMGREP-MIGRATION-PLAN.md` with current search-backend hotspots, refreshed Phase 0 status notes, and reprioritized the Semgrep pilot list based on current false-positive pressure
+ - Deprecated `PROJECT/2-WORKING/BACKLOG.md` as an active planning source and converted it into a pointer to `4X4.md` plus the Semgrep roadmap doc
+ - Populated `4X4.md` with current strategic goals, weekly work, prior-week accomplishments, and planning assumptions for WP Code Check
**FPs eliminated:** 8 (all CRITICAL — all were `curl_exec($curl)` calls)

- - [ ] **FOLLOW-UP `php-dynamic-include.json` - WP-CLI bootstrap scripts still flagged as LFI** ⚠️ *Initial tweak in 740ba08 was insufficient*
+ - [x] **`php-dynamic-include.json` - WP-CLI bootstrap scripts no longer flagged as LFI** ✅ *Resolved in follow-up commit*
  **Finding:** `check-user-meta.php:13` and `test-alternate-registry-id.php:24` - `$path` is iterated from a hardcoded static array, never user-controlled.
- **Attempted fix:** Added `wp-load` to `exclude_patterns`, but verification showed it does not suppress the actual matched line (`require_once $path;`).
- **Next fix:** Use `exclude_files` or a context-aware suppression that recognizes the surrounding `foreach ($wp_load_paths as $path) { if (file_exists($path)) { require_once $path; } }` bootstrap finder pattern.
- **File:** `dist/patterns/php-dynamic-include.json`
- **FPs remaining:** 2 (both CRITICAL)
+ **Attempted fix (740ba08, insufficient):** Added `wp-load` to `exclude_patterns`, but the actual matched line is `require_once $path;`, which does not contain `wp-load`.
+ **Proper fix:** Added new `exclude_if_file_contains` capability to the simple pattern runner and `dist/bin/check-performance.sh`. When a matched file's content contains any string listed in the new `exclude_if_file_contains` JSON array, all matches in that file are suppressed. Added `"wp eval-file"` to `php-dynamic-include.json` under this key; both WP-CLI scripts have this string in their docblock comment.
+ **Files changed:** `dist/bin/check-performance.sh` (runner feature), `dist/patterns/php-dynamic-include.json` (new exclusion key)
+ **FPs eliminated:** 2 (both CRITICAL)
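
To illustrate the suppression behavior described under "Proper fix", here is a minimal sketch of a file-level filter. It is not the runner's actual code: `filter_excluded_files` is a hypothetical name, and it assumes matches arrive on stdin as `file:line:text` with no colon in the file path.

```shell
# Hypothetical sketch of file-level suppression; not the runner's actual implementation.
# Reads "file:line:text" matches on stdin and drops every match whose file
# contains any of the marker strings passed as arguments.
filter_excluded_files() {
  local match file marker skip
  while IFS= read -r match; do
    file="${match%%:*}"
    skip=0
    for marker in "$@"; do
      # -F treats the marker as a fixed string, not a regular expression.
      if grep -qF -- "$marker" "$file" 2>/dev/null; then
        skip=1
        break
      fi
    done
    if [ "$skip" -eq 0 ]; then
      printf '%s\n' "$match"
    fi
  done
  return 0
}
```

Piping matches through `filter_excluded_files "wp eval-file"` would drop findings from files that mention `wp eval-file` anywhere in their content, including in a docblock, which mirrors the behavior the changelog entry describes.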

  ---

@@ -90,12 +90,10 @@
  | Fix | File to Edit | Effort | FPs Eliminated | Status |
  | Apply `exclude_patterns` in simple runner | `check-performance.sh` ~L5970 | Medium | 11 verified | ✅ Done |
  | Admin-only hook whitelist | `check-performance.sh` | Medium | 1+ per scan | 📋 Deferred |
  | N+1 loop containment tightening | `check-performance.sh` | Medium | 2+ per scan | 📋 Deferred |

- **Latest measured totals:** 99 findings before scanner fixes → **88 findings after scanner fixes**.
-
- **Important follow-up:** `php-dynamic-include` is still reporting 2 findings after the latest verification. The previous `wp-load` exclusion was too weak because the actual matched line is `require_once $path;`, not the nearby `file_exists($path)` / `wp-load.php` discovery line. That rule needs a better context-aware suppression or file exclusion strategy.
+ **Latest measured totals:** 99 findings before scanner fixes → **88 findings after first round** → **86 findings after dynamic-include fix**.
File: `PROJECT/1-INBOX/FEATURE-SEMGREP-MIGRATION-PLAN.md` (37 additions & 32 deletions)
@@ -3,10 +3,10 @@
  **Created:** 2026-02-10
  **Status:** Phase 0a Complete
  **Priority:** High
- **Last Updated:** 2026-02-09
+ **Last Updated:** 2026-03-23

  ## Problem/Request
- Intermittent scan stalls occur on very large repositories (expected risk) and sometimes on smaller projects (unexpected). The current scanner mixes cached and uncached recursive search paths, with several raw `grep -r*` and `xargs grep` call sites that can still cause unstable runtime behavior.
+ Intermittent scan stalls still matter on very large repositories and the scanner still mixes multiple search paths. The highest-risk problems are no longer unguarded hangs in the active path; they are now a maintainability split between `cached_grep()`-based scans, timeout-wrapped raw recursive file discovery, and helper-level `xargs` or raw `grep -r` fallback behavior.
  - `dist/bin/check-performance.sh:3579` - `fast_grep()` raw `grep -rHn` fallback
+ - `dist/bin/check-performance.sh:3634` - `cached_grep()` raw `grep -rHn` fallback

  ## Direct Answer: Improvements Possible on Raw Recursive/Xargs Paths
  Yes. The most impactful improvements are:
@@ -48,15 +53,15 @@ Yes. The most impactful improvements are:
  **Tasks**
  1. ~~Replace known raw recursive/xargs hotspots with safer cached/wrapped calls.~~ → Refined: xargs calls at lines 2617/3222 are intentionally using pre-cached file lists inside already-protected paths, so they are not actual hotspots. The real issue was 8 unprotected `grep -rl` file-discovery calls.
  2. Standardize null-delimited file handling for all multi-file grep execution. → Deferred (existing `cached_grep` already uses `tr '\n' '\0' | xargs -0`; the 8 patched calls are file-discovery, not line-matching)
- 3. ✅ **Ensure every expensive check uses timeout guards.** Complete (2026-02-09). Wrapped 8 raw `grep -r` calls with `run_with_timeout "$MAX_SCAN_TIME"`:
+ 3. ✅ **Ensure every expensive check uses timeout guards.** Complete (2026-02-09). The active file-discovery calls remain raw `grep -r*`, but they are wrapped with `run_with_timeout "$MAX_SCAN_TIME"`:
  4. Add heartbeat logs every 10 seconds for long loops. → Deferred to Phase 0b
  5. Add top-N slow checks summary at end of scan. → Deferred to Phase 0b
  6. Improve docs for `.wpcignore`, `--skip-magic-strings`, and `MAX_SCAN_TIME`. → Deferred to Phase 0b
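
Task 2 refers to the `tr '\n' '\0' | xargs -0` idiom. The sketch below shows why it matters: a newline-separated file list is converted to NUL-delimited input so that paths containing spaces reach `grep` as single arguments. `grep_file_list` is a hypothetical name, and the sketch assumes no path contains a newline.

```shell
# Hypothetical sketch: run grep over a newline-separated file list without
# breaking on paths that contain spaces. grep_file_list is an illustrative name.
grep_file_list() {
  local pattern="$1" list_file="$2"
  # Nothing to search when the list is missing or empty.
  [ -s "$list_file" ] || return 0
  # Convert newlines to NULs so xargs -0 passes each path as one argument.
  tr '\n' '\0' < "$list_file" | xargs -0 grep -Hn -- "$pattern" || true
}
```

The trailing `|| true` treats "no matches" as an empty result rather than an error, which is a simplification; a production wrapper would probably want to tell that apart from a genuine `grep` failure.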
@@ -67,13 +72,13 @@ Yes. The most impactful improvements are:
  - Baseline performance snapshot (before/after) → Deferred to Phase 0b

  **Exit Criteria**
- - [x] No unguarded raw `grep -r*` in active scan path (remaining raw grep -r only inside `fast_grep()`/`cached_grep()` fallback paths, by design)
+ - [x] No unguarded raw `grep -r*` in active scan path; however, active scan still contains timeout-wrapped raw file-discovery calls and helper internals still retain raw `grep -r` / `xargs grep` fallback behavior
  - [ ] Users can identify long-running checks from logs → Deferred to Phase 0b

  **Implementation Notes (2026-02-09):**
  - The xargs calls at lines 2617 and 3222 were originally listed as hotspots but are inside already-protected paths (pre-cached file list + `run_with_timeout`). Removed from scope.
- - Timeout behavior: on timeout, check returns empty result, reports "passed," scan continues. Silent degradation chosen over hang. Per-check timeout warnings deferred (see BACKLOG.md).
+ - Timeout behavior: on timeout, check returns empty result, reports "passed," scan continues. Silent degradation chosen over hang. Per-check timeout warnings remain deferred and are now tracked in `4X4.md` instead of `BACKLOG.md`.
  - No new functions or abstractions introduced; reuses existing `run_with_timeout` infrastructure.

  ### Phase 1: Unified Search Backend Wrapper
@@ -113,17 +118,17 @@ Yes. The most impactful improvements are:
  - Semgrep is optional via feature flag.
  - Pilot only direct/noisy rule subset.
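
One way the optional feature flag might be parsed is sketched below. The flag name `--search-backend` appears in the plan's task list. The backend name `bash`, the function name `parse_search_backend`, and the fallback when the `semgrep` binary is missing are assumptions made for illustration.

```shell
# Hypothetical sketch of the feature flag; "bash" as the default backend name is an assumption.
SEARCH_BACKEND="bash"

parse_search_backend() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --search-backend)
        [ "$#" -ge 2 ] || { echo "--search-backend needs a value" >&2; return 2; }
        SEARCH_BACKEND="$2"
        shift 2 ;;
      --search-backend=*)
        SEARCH_BACKEND="${1#*=}"
        shift ;;
      *)
        shift ;;
    esac
  done
  case "$SEARCH_BACKEND" in
    bash) ;;
    semgrep)
      # Keep the pilot optional: fall back instead of failing when the tool is absent.
      if ! command -v semgrep >/dev/null 2>&1; then
        echo "semgrep not found; falling back to the bash backend" >&2
        SEARCH_BACKEND="bash"
      fi ;;
    *)
      echo "unknown search backend: $SEARCH_BACKEND" >&2
      return 2 ;;
  esac
}
```

Falling back silently is only one possible policy. Failing fast when a user explicitly requests Semgrep may be preferable, because side-by-side scorecards are only meaningful when the requested backend actually ran.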

- **Candidate Rules (initial)**
+ **Candidate Rules (reprioritized for current false-positive pressure)**
  1. `unsanitized-superglobal-read`
- 2. `spo-002-superglobal-manipulation`
+ 2. `unsanitized-superglobal-isset-bypass`
  3. `wpdb-query-no-prepare`
- 4. `php-eval-injection`
- 5. `php-dynamic-include`
- 6. `php-shell-exec-functions`
- 7. `php-hardcoded-credentials`
- 8. `unsanitized-superglobal-isset-bypass`
- 9. `file-get-contents-url`
- 10. `wp-json-html-escape` (evaluate feasibility)
+ 4. `file-get-contents-url`
+ 5. `wp-json-html-escape`
+ 6. `php-hardcoded-credentials`
+ 7. `php-eval-injection`
+ 8. `spo-002-superglobal-manipulation` - lower urgency after the inline grep quoting fix
+ 9. `php-dynamic-include` - lower urgency after the WP-CLI bootstrap false-positive fix
+ 10. `php-shell-exec-functions` - lower urgency after the `curl_exec()` word-boundary fix

  **Tasks**
  1. Implement `--search-backend semgrep` toggle (default remains current backend).