Skip to content

Commit adbd54f

Browse files
authored
docs(solutions): capture ReDoS learning from plugin trust-boundary inversion (#353)
Documents the CodeQL ReDoS finding that surfaced during PR #352's multi-load registration cycle. The marker regex itself never changed — the trust boundary around it did. Architectural changes that expand the set of input authors can reclassify a previously safe regex as unsafe. Documents the rejected alternatives (bounded-length cap, atomic groups, inline CodeQL suppression), the indexOf/slice solution, the regression test shape, and four prevention rules. Cross-references three adjacent learnings: the PR #335 origin doc that PR #352 reverted, the config overlay trust-boundary doc, and the agent-color schema regression.
1 parent b27a9bc commit adbd54f

1 file changed

Lines changed: 148 additions & 0 deletions

File tree

Lines changed: 148 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,148 @@
1+
---
2+
title: ReDoS in bootstrap marker regex exposed by plugin trust-boundary inversion
3+
date: 2026-05-11
4+
category: security-issues
5+
module: src/index.ts
6+
problem_type: security_issue
7+
component: assistant
8+
symptoms:
9+
- CodeQL alerts #42 and #43 flagged the bootstrap-marker regex as polynomial-time
10+
- 'Pattern flagged: /<SYSTEMATIC_WORKFLOWS>[\s\S]*?<\/SYSTEMATIC_WORKFLOWS>/'
11+
- Input shape that triggers backtracking — repeated opening markers with no closing tag
12+
- Regex was clean under per-process singleton; flagged immediately after switching to per-load registration
13+
root_cause: scope_issue
14+
resolution_type: code_fix
15+
severity: high
16+
related_components:
17+
- assistant
18+
- tooling
19+
tags: [redos, codeql, regex, trust-boundary, plugin-registration, bootstrap-injection, security, architecture-inversion]
20+
---
21+
22+
# ReDoS in bootstrap marker regex exposed by plugin trust-boundary inversion
23+
24+
## Problem
25+
26+
PR #352 inverted Systematic's plugin registration model from a per-process
27+
singleton (`plugInOnce`) to independent per-load registration. The bootstrap
28+
marker regex `/<SYSTEMATIC_WORKFLOWS>[\s\S]*?<\/SYSTEMATIC_WORKFLOWS>/` was
29+
unchanged from prior versions, but CodeQL flagged it as polynomial-time
30+
ReDoS-vulnerable on the new branch. The regex hadn't moved; the *trust
31+
boundary around it* had. Under the singleton, the regex only saw content
32+
the plugin itself authored. Under per-load registration, every plugin
33+
source can contribute fragments to `output.system` before the marker
34+
matcher runs — making the matched input attacker-influenceable.
35+
36+
## Symptoms
37+
38+
- CodeQL security scanning posted two inline comments (alerts #42 and #43) on `src/index.ts` lines 30–31 with rule `js/polynomial-redos`.
39+
- Description: "may run slow on strings starting with `<SYSTEMATIC_WORKFLOWS>` and with many repetitions of `<SYSTEMATIC_WORKFLOWS>`".
40+
- The same regex had passed CodeQL on every prior commit; the alert surfaced only after the per-load registration commit (`29fa5e5`).
41+
- PR-level CI: `CodeQL` job FAILURE; all other 7 main-workflow jobs SUCCESS. Branch otherwise APPROVED + MERGEABLE.
42+
43+
## What Didn't Work
44+
45+
- **Bounded-length regex** (`/<SYSTEMATIC_WORKFLOWS>[\s\S]{0,100000}?<\/SYSTEMATIC_WORKFLOWS>/`). Caps backtracking depth with an arbitrary ceiling but leaves a polynomial regex in the hot path. The cap also feels like a magic number without operational justification.
46+
- **Atomic groups or possessive quantifiers** (`(?>...)`, `*+`). Not supported by JavaScript regex. Would require a polyfill or workaround that adds more surface area than it removes.
47+
- **Inline CodeQL suppression comment**. Papers over a real trust-boundary change without addressing the underlying class of risk. Rejected on principle.
48+
49+
## Solution
50+
51+
Replace the regex with a linear-time `indexOf`/`slice` helper. The marker has fixed literal delimiters — there was never a regex requirement, only a regex convenience.
52+
53+
**Before** (`src/index.ts`):
54+
```ts
55+
const BOOTSTRAP_MARKER_PATTERN =
56+
/<SYSTEMATIC_WORKFLOWS>[\s\S]*?<\/SYSTEMATIC_WORKFLOWS>/
57+
58+
export const applyBootstrapContent = (
59+
output: { system: string[] },
60+
content: string,
61+
): void => {
62+
for (let i = 0; i < output.system.length; i++) {
63+
const entry = output.system[i]
64+
if (BOOTSTRAP_MARKER_PATTERN.test(entry)) {
65+
output.system[i] = entry.replace(BOOTSTRAP_MARKER_PATTERN, content)
66+
return
67+
}
68+
}
69+
// append path...
70+
}
71+
```
72+
73+
**After**:
74+
```ts
75+
const BOOTSTRAP_MARKER_OPEN = '<SYSTEMATIC_WORKFLOWS>'
76+
const BOOTSTRAP_MARKER_CLOSE = '</SYSTEMATIC_WORKFLOWS>'
77+
78+
const findBootstrapMarkerBlock = (
79+
entry: string,
80+
): { start: number; end: number } | null => {
81+
const start = entry.indexOf(BOOTSTRAP_MARKER_OPEN)
82+
if (start === -1) return null
83+
const closeStart = entry.indexOf(
84+
BOOTSTRAP_MARKER_CLOSE,
85+
start + BOOTSTRAP_MARKER_OPEN.length,
86+
)
87+
if (closeStart === -1) return null
88+
return { start, end: closeStart + BOOTSTRAP_MARKER_CLOSE.length }
89+
}
90+
91+
export const applyBootstrapContent = (
92+
output: { system: string[] },
93+
content: string,
94+
): void => {
95+
for (let i = 0; i < output.system.length; i++) {
96+
const entry = output.system[i]
97+
const block = findBootstrapMarkerBlock(entry)
98+
if (block !== null) {
99+
output.system[i] =
100+
entry.slice(0, block.start) + content + entry.slice(block.end)
101+
return
102+
}
103+
}
104+
// append path unchanged...
105+
}
106+
```
107+
108+
Regression test (`tests/unit/plugin.test.ts`):
109+
```ts
110+
test('completes in linear time on malicious input with many opening markers and no closing tag', () => {
111+
const malicious = `${MARKER_OPEN.repeat(10000)} trailing content with no close`
112+
const output = { system: [malicious] }
113+
const startedAt = Date.now()
114+
applyBootstrapContent(output, wrap('NEW CONTENT'))
115+
const elapsedMs = Date.now() - startedAt
116+
expect(output.system).toHaveLength(1)
117+
expect(output.system[0]).toContain('NEW CONTENT')
118+
expect(output.system[0]).toContain(MARKER_OPEN.repeat(10000))
119+
// Loose upper bound — linear scan finishes in well under 100ms on slow
120+
// CI runners. The prior regex would take seconds-to-minutes on this input.
121+
expect(elapsedMs).toBeLessThan(1000)
122+
})
123+
```
124+
125+
## Why This Works
126+
127+
`String.prototype.indexOf` is a linear-time substring search (`O(n + m)` in modern V8/JavaScriptCore via Boyer-Moore-Horspool or two-way). It cannot backtrack — it either finds the substring at some position or fails after scanning each character once. `slice` is `O(k)` where `k` is the slice length. The total cost of `findBootstrapMarkerBlock` is linear in the entry length.
128+
129+
The non-greedy `[\s\S]*?` regex had a different worst-case shape. When the closing tag is absent, the engine matches the opening tag at position 0, scans forward looking for the closing tag, fails, then tries the opening tag at position 21 (its own length), scans forward again, fails again — `O(n²)` on inputs starting with the opening tag.
130+
131+
Two additional secondary benefits from dropping the regex:
132+
- `String.prototype.replace(regex, content)` treats `$&`, `$1`, `$$` as replacement directives. `slice + content + slice` inserts `content` literally. Safer for arbitrary bootstrap payloads.
133+
- The semantics ("first opening tag, then first closing tag after that position") are more legible as `indexOf` calls than as regex. Future readers don't have to recall non-greedy quantifier behavior.
134+
135+
## Prevention
136+
137+
- **When you invert or expand a plugin's trust boundary, re-audit every regex pattern even if it was CodeQL-clean before.** A pattern's safety depends on what input reaches it, not just on its shape. Architectural changes that broaden the set of authors can reclassify a safe regex as unsafe overnight. The pattern itself never changed in this case — only the data flow around it.
138+
- **Prefer `indexOf`/`slice` over regex for fixed literal delimiters.** If neither side has a regex feature (alternation, character classes, anchors), regex is convenience, not requirement. The simpler tool is provably immune to a class of bug the regex isn't.
139+
- **Add regression tests for malformed-delimiter storms, not just happy-path replacement.** The new test in `plugin.test.ts` runs 10,000 repeated opening markers with no closing tag. A loose `< 1000ms` upper bound is enough to catch any polynomial regression without depending on benchmarking infrastructure.
140+
- **Re-run CodeQL locally on architectural changes before opening a PR.** The CI catch happened post-push; the pattern was visible to local CodeQL on the branch before push if anyone had thought to run it. `gh workflow run codeql.yaml` or `codeql database analyze` against a local DB would surface it earlier.
141+
142+
## Related Issues
143+
144+
- [`docs/solutions/integration-issues/opencode-plugin-factory-duplicate-registration-2026-05-04.md`](../integration-issues/opencode-plugin-factory-duplicate-registration-2026-05-04.md) — the singleton (PR #335) that PR #352 reverted. Parallel-concept; PR #352 added a 2026-05-10 follow-up section to that doc when the singleton was removed.
145+
- [`docs/solutions/best-practices/layered-trust-boundaries-overlay-config-2026-05-09.md`](../best-practices/layered-trust-boundaries-overlay-config-2026-05-09.md) — different trust-boundary concept (config overlays vs runtime regex), same principle: trust assumptions need to be made explicit and re-audited when the boundary moves.
146+
- [`docs/solutions/integration-issues/opencode-config-schema-rejects-ad-hoc-agent-colors-2026-05-09.md`](../integration-issues/opencode-config-schema-rejects-ad-hoc-agent-colors-2026-05-09.md) — another post-shift regression where a previously-tolerated input became invalid under a stricter downstream contract. Parallel-concept.
147+
- PR #352: https://github.com/marcusrbrown/systematic/pull/352
148+
- CodeQL rule: [`js/polynomial-redos`](https://codeql.github.com/codeql-query-help/javascript/js-polynomial-redos/)

0 commit comments

Comments
 (0)