
Commit 5355476

IMPROVEMENT: allow_regex and deny_regex hardened to allow flags, and handle malformed regexes better
1 parent 1471d59 commit 5355476

21 files changed

Lines changed: 771 additions & 70 deletions

README.md

Lines changed: 11 additions & 5 deletions
@@ -63,9 +63,14 @@ sampling:
   temperature: 0.7
 context:
   inputs:
-    - user_message
+    - name: user_message
+      non_empty: true
+      reject_secrets: true
     - name: app_context
       max_size: 2000
+      allow_regex:
+        pattern: "^[A-Za-z0-9 _-]+$"
+        flags: "i"
 includes:
   - ./shared/tone.md
 ---
@@ -127,11 +132,12 @@ Supported values for `warnings.contextSize` are `auto`, `off`, `result-only`, `c
 - **Overrides** — Environment and tier-based overrides (base → env → tier → runtime)
 - **4 provider adapters** — OpenAI, Anthropic, Gemini, OpenRouter — body-only output
 - **Validation** — Zod schema validation, Levenshtein-based "did you mean?" for typos, variable usage checks
+- **Context hardening** — structured regexes with flags, `/pattern/i` convenience syntax, and built-in `non_empty` / `reject_secrets` validators
 - **Context size guardrails** — optional per-input `max_size` metadata with non-blocking render-time warnings
 - **Warning controls** — top-level config can suppress or emit context size warnings differently in dev and prod
 - **Caching** — LRU cache with mtime-based invalidation
 - **CLI** — init, validate, compile, render, inspect, skill
-- **Compiled artifacts** — Pre-compile `.md` → JSON or ESM for production
+- **Compiled artifacts** — Pre-compile `.md` → JSON or ESM for production, with validation before artifacts are written

 ## Provider Adapters

@@ -381,10 +387,10 @@ promptopskit init [dir]
 promptopskit skill

 # Validate all .md files in a directory
-promptopskit validate <dir> [--strict]
+promptopskit validate [sourceDir] [--source <dir>] [--strict]

 # Compile .md → JSON/ESM artifacts
-promptopskit compile [src] [out] [--dry-run] [--format json|esm] [--no-clean]
+promptopskit compile [sourceDir] [outputDir] [--source <dir>] [--output <dir>] [--dry-run] [--format json|esm] [--no-clean]

 # Render a prompt preview (auto-loads .test.yaml sidecar)
 promptopskit render <file> [--env <name>] [--tier <name>] [--vars <file>] [--json]
@@ -507,7 +513,7 @@ Prompt files use YAML front matter with these fields:
 | `response` | `object` | `{ format, stream }` |
 | `tools` | `array` | Tool references (string names or inline definitions) |
 | `mcp` | `object` | MCP server references |
-| `context` | `object` | `{ inputs, history }` — declare expected variables, with optional per-input `max_size`, `trim`, and `allow_regex`/`deny_regex` controls |
+| `context` | `object` | `{ inputs, history }` — declare expected variables, with optional per-input `max_size`, `trim`, structured or literal `allow_regex`/`deny_regex`, and built-in `non_empty` / `reject_secrets` validators |
 | `includes` | `string[]` | Paths to included prompt files |
 | `environments` | `object` | Named environment overrides |
 | `tiers` | `object` | Named tier overrides |
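
The README changes describe `allow_regex` as an allowlist and `deny_regex` as a blocklist applied to context inputs. A minimal sketch of that runtime check might look like the following. The `ContextInputError` class, the `RegexRules` shape, and `checkInput` are hypothetical names for illustration; only the `POK031` and `POK032` codes come from the docs changed in this commit.

```typescript
// Hypothetical error type; PromptOpsKit's real error classes are not shown in this diff.
class ContextInputError extends Error {
  readonly code: string;
  constructor(code: string, message: string) {
    super(`${code}: ${message}`);
    this.code = code;
    this.name = 'ContextInputError';
  }
}

// Hypothetical rule container holding already-compiled patterns.
// Avoid the `g` and `y` flags here: they make RegExp.test() stateful.
interface RegexRules {
  allow?: RegExp;
  deny?: RegExp;
}

function checkInput(name: string, value: string, rules: RegexRules): void {
  // Allowlist: the value must match, otherwise POK031.
  if (rules.allow && !rules.allow.test(value)) {
    throw new ContextInputError('POK031', `"${name}" does not match allow_regex`);
  }
  // Blocklist: the value must not match, otherwise POK032.
  if (rules.deny && rules.deny.test(value)) {
    throw new ContextInputError('POK032', `"${name}" matches deny_regex`);
  }
}
```

With the README's `app_context` pattern compiled as `new RegExp("^[A-Za-z0-9 _-]+$", "i")`, a value such as `billing page` passes, while a value containing `/` or `;` is rejected.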

SKILL.md

Lines changed: 5 additions & 1 deletion
@@ -66,7 +66,7 @@ the fields required by that specific file:
 | `response` | object | no | `{ format: text|json|markdown, stream: boolean }` |
 | `tools` | array | no | Tool names (strings) or inline definitions with `{ name, description, input_schema }` |
 | `mcp` | object | no | `{ servers: [string | { name, config }] }` |
-| `context.inputs` | `Array<string | { name, max_size?, trim?, allow_regex?, deny_regex? }>` | no | Declared variable names used in templates, with optional size budgets and runtime hardening controls |
+| `context.inputs` | `Array<string | { name, max_size?, trim?, allow_regex?, deny_regex?, non_empty?, reject_secrets? }>` | no | Declared variable names used in templates, with optional size budgets and runtime hardening controls |
 | `context.history` | object | no | `{ max_items: number }` |
 | `includes` | string[] | no | Relative paths to other prompt files to include |
 | `environments` | object | no | Per-environment overrides (see Overrides) |
@@ -102,6 +102,8 @@ Rules:
 - Use object-form inputs with `max_size` when a variable is likely to grow large and should trigger early warnings
 - Use `trim` to enforce byte budgets before interpolation when `max_size` is set
 - Use `allow_regex` for allowlist checks and `deny_regex` for blocklist checks on risky inputs
+- Prefer structured regexes like `{ pattern, flags }`; `/pattern/i` strings are also accepted and normalized internally
+- Use `non_empty: true` for required user text and `reject_secrets: true` for common secret redaction checks
 - Escape literal braces with `\{{` and `\}}`
 - In strict mode, missing variables throw an error
 - In permissive mode, unresolved placeholders are left intact
@@ -119,6 +121,8 @@ context:
 If a rendered value exceeds `max_size`, `renderPrompt()` emits a non-blocking `POK030` warning.
 At render time, callers can also pass `onContextOverflow` to transform oversized values before warnings/rendering.

+Malformed `allow_regex` and `deny_regex` values fail during `validate` and `compile`, not just at render time. When regex compilation fails, the error includes the prompt id, variable name, field name, and raw configured value.
+
 Example: this is the minimal valid shape for a prompt that references
 `{{ pull_request }}` even when provider/model are inherited from defaults:

docs/api-reference.md

Lines changed: 3 additions & 1 deletion
@@ -114,7 +114,7 @@ const result = await kit.validatePrompt('support/reply');
 // { valid: boolean, errors: ValidationError[], warnings: ValidationError[] }
 ```

-`validatePrompt()` covers schema, include-graph, and variable declaration issues. Render-time context size warnings are produced by `renderPrompt()`, not validation.
+`validatePrompt()` covers schema, include-graph, variable declaration issues, and context regex compilation. Render-time context size warnings are produced by `renderPrompt()`, not validation.

 ## `kit.clearCache()`

@@ -219,6 +219,8 @@ const result = validateAsset(asset, ['id', 'schema_version', 'model'], 'hello.md
 // { valid: boolean, errors: ValidationError[], warnings: ValidationError[] }
 ```

+`validateAsset()` reports malformed `allow_regex` and `deny_regex` values before runtime, including the prompt id, variable name, field name, and raw configured value in the error message.
+
 ### `validateAssetWithIncludes(asset, filePath, frontMatterKeys?)`

 Validate a prompt asset including its include graph (checks for missing files and circular includes).
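
The API reference change says regex compilation errors carry the prompt id, variable name, field name, and raw configured value. One way such an early check could be written is sketched below. `RegexLocation` and `compileContextRegex` are hypothetical names and the message wording is invented for illustration; only the `POK013` code and the four pieces of context come from the docs.

```typescript
// Hypothetical description of where a regex was declared.
interface RegexLocation {
  promptId: string;
  variable: string;
  field: 'allow_regex' | 'deny_regex';
}

// Compile a pattern eagerly so that a bad definition fails at
// validate/compile time instead of on the first render.
function compileContextRegex(
  pattern: string,
  flags: string,
  raw: unknown,
  where: RegexLocation,
): RegExp {
  try {
    // The RegExp constructor throws a SyntaxError for a bad pattern or bad flags.
    return new RegExp(pattern, flags);
  } catch (cause) {
    const reason = cause instanceof Error ? cause.message : String(cause);
    throw new Error(
      `POK013: invalid ${where.field} for "${where.variable}" in prompt "${where.promptId}" ` +
        `(raw value: ${JSON.stringify(raw)}): ${reason}`,
    );
  }
}
```

Keeping the raw configured value in the message matters because a `/pattern/i` string and a `{ pattern, flags }` object can normalize to the same thing, and the author needs to see what they actually typed.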

docs/cli.md

Lines changed: 9 additions & 5 deletions
@@ -36,19 +36,21 @@ Existing files are skipped. If a `package.json` exists, suggests adding a `build
 Validate all prompt `.md` files in a directory.

 ```bash
-promptopskit validate <dir> [--strict]
+promptopskit validate [sourceDir] [--source <dir>] [--strict]
 ```

 | Option | Description |
 |--------|-------------|
-| `<dir>` | Directory to validate (required) |
+| `<sourceDir>` | Source directory to validate (defaults to `./prompts`) |
+| `--source`, `-s` | Explicit source directory override |
 | `--strict` | Treat warnings as errors (exit code 1) |

 Checks:
 - Zod schema validation against the prompt asset schema
 - Missing required fields (`id`, body sections)
 - Unknown front matter keys with Levenshtein-based "did you mean?" suggestions
 - Variable usage — used but undeclared, declared but unused
+- Context regex compilation for `allow_regex` and `deny_regex`
 - Include resolution — missing files, circular includes
 - Folder defaults inheritance from `defaults.md` (provider, model, metadata, system instructions)

@@ -65,13 +67,13 @@ Output per file:
 Compile `.md` prompt files to JSON or ESM artifacts.

 ```bash
-promptopskit compile [src] [out] [--dry-run] [--format json|esm] [--no-clean]
+promptopskit compile [sourceDir] [outputDir] [--source <dir>] [--output <dir>] [--dry-run] [--format json|esm] [--no-clean]
 ```

 | Option | Default | Description |
 |--------|---------|-------------|
-| `<src>` | `./prompts` | Source directory |
-| `<out>` | `./.generated-prompts/json` | Output directory |
+| `<sourceDir>` | `./prompts` | Source directory |
+| `<outputDir>` | `./.generated-prompts/json` | Output directory |
 | `--source`, `-s` || Explicit source directory override |
 | `--output`, `-o` || Explicit output directory override |
 | `--format` | `json` | Output format: `json` or `esm` |
@@ -80,6 +82,8 @@ promptopskit compile [src] [out] [--dry-run] [--format json|esm] [--no-clean]

 Includes are resolved during compilation so compiled artifacts are self-sufficient. The output directory is cleared by default before compiling (unless `--no-clean` is set).

+Compilation runs validation before writing artifacts. Invalid `allow_regex` or `deny_regex` definitions fail the compile step early with `POK013` instead of surfacing later during `renderPrompt()`.
+
 If you omit `<out>`, the CLI chooses `./.generated-prompts/json` for `json` and `./.generated-prompts/esm` for `esm`.

 `defaults.md` files are treated as configuration inputs and are not compiled as standalone prompts.
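
The `validate` usage line now takes an optional positional `[sourceDir]`, an explicit `--source`/`-s` override, and a documented default of `./prompts`. The sketch below shows one plausible way to resolve those three sources. The actual CLI parser is not part of this diff, so `resolveSourceDir` and the precedence order (flag, then positional, then default) are assumptions based on the word "override" in the option table.

```typescript
// Default taken from the option table above; the constant name is hypothetical.
const DEFAULT_SOURCE_DIR = './prompts';

// Hypothetical helper: pick the source directory from raw CLI arguments.
function resolveSourceDir(args: string[]): string {
  let positional: string | undefined;
  let explicit: string | undefined;
  for (let i = 0; i < args.length; i++) {
    const arg = args[i];
    if (arg === '--source' || arg === '-s') {
      explicit = args[i + 1];
      i++; // skip the flag's value so it is not read as a positional
    } else if (arg !== undefined && !arg.startsWith('-') && positional === undefined) {
      positional = arg; // first bare argument is treated as [sourceDir]
    }
  }
  // Assumed precedence: explicit flag, then positional, then the default.
  return explicit ?? positional ?? DEFAULT_SOURCE_DIR;
}
```

Under these assumptions, `resolveSourceDir(['./a', '--source', './b', '--strict'])` returns `./b`, and calling it with no arguments returns `./prompts`.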

docs/getting-started.md

Lines changed: 7 additions & 4 deletions
@@ -43,8 +43,11 @@ sampling:
   temperature: 0.7
 context:
   inputs:
-    - user_message
-    - app_context
+    - name: user_message
+      non_empty: true
+      reject_secrets: true
+    - name: app_context
+      allow_regex: "/^[A-Za-z0-9 _-]+$/i"
 includes:
   - ./shared/tone.md
 ---
@@ -114,15 +117,15 @@ Your application owns the HTTP call — PromptOpsKit produces the request body o
 npx promptopskit validate ./prompts
 ```

-This checks all `.md` files for schema errors, unknown front matter keys (with "did you mean?" suggestions), and variable usage mismatches.
+This checks all `.md` files for schema errors, unknown front matter keys (with "did you mean?" suggestions), variable usage mismatches, and malformed context regex definitions.

 ## Compile for production

 ```bash
 npx promptopskit compile
 ```

-Pre-compiles `.md` files to JSON (or ESM) artifacts so deployments skip parsing entirely. Add to your build scripts:
+Pre-compiles `.md` files to JSON (or ESM) artifacts so deployments skip parsing entirely. Compilation validates prompt files first, so malformed regex definitions fail before artifacts are written. Add to your build scripts:

 ```json
 {

docs/index.md

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@ Open-source developer toolkit for managing prompts, system instructions, tools,
 ## Guides

 - [Getting Started](./getting-started.md) — Install, scaffold, and render your first prompt
-- [Prompt Format](./prompt-format.md) — Markdown structure, YAML front matter, H1 sections, variables, and `defaults.md` inheritance
+- [Prompt Format](./prompt-format.md) — Markdown structure, YAML front matter, H1 sections, variables, context hardening, and `defaults.md` inheritance
 - [Composition](./composition.md) — Share system instructions across prompts with `includes`
 - [Overrides](./overrides.md) — Environment and tier-based overrides for dev/prod/free/pro
 - [Providers](./providers.md) — Provider adapters for OpenAI, Anthropic, Gemini, and OpenRouter
@@ -18,7 +18,7 @@ Open-source developer toolkit for managing prompts, system instructions, tools,
 - [API Reference](./api-reference.md) — TypeScript API: `createPromptOpsKit`, `renderPrompt`, standalone functions
 - [Schema](./schema.md) — Full YAML front matter schema reference
 - [Testing](./testing.md) — Test helpers, mock assets, and sidecar test files
-- [Validation](./validation.md) — Schema validation, "did you mean?" suggestions, variable checks
+- [Validation](./validation.md) — Schema validation, "did you mean?" suggestions, variable checks, and early regex validation

 ## Also see

docs/prompt-format.md

Lines changed: 13 additions & 3 deletions
@@ -169,15 +169,19 @@ Each entry can be either a string variable name or an object with:
 - `name` — the template variable name
 - `max_size` — optional UTF-8 byte limit for the injected value
 - `trim` — optional trim-to-budget (`true`/`end` keeps first bytes, `start` keeps trailing bytes) applied when `max_size` is set
-- `allow_regex` — optional allowlist regex; input must match (throws `POK031` on mismatch)
-- `deny_regex` — optional blocklist regex; input must not match (throws `POK032` on match)
+- `allow_regex` — optional allowlist regex; accepts `"pattern"`, `/pattern/i`, or `{ pattern, flags }` and throws `POK031` on mismatch
+- `deny_regex` — optional blocklist regex; accepts `"pattern"`, `/pattern/i`, or `{ pattern, flags }` and throws `POK032` on match
+- `non_empty` — optional boolean validator; throws `POK033` when the final value is blank or whitespace-only
+- `reject_secrets` — optional boolean validator; throws `POK034` when the value matches the built-in secret detector

 The validator warns about:
 - Variables used in templates but not declared in `context.inputs`
 - Variables declared in `context.inputs` but never used

 At render time, PromptOpsKit also emits a non-blocking `POK030` warning when a provided variable exceeds its declared `max_size`. In source and auto modes, the warning is also written to `console.warn` to make local development issues visible early.

+Malformed `allow_regex` and `deny_regex` values fail during `validate` and `compile` with `POK013`, so bad patterns are caught before runtime.
+
 Example hardened input definition:

 ```yaml
@@ -186,7 +190,13 @@ context:
     - name: user_id
       trim: true
       max_size: 24
-      allow_regex: "^user_[a-z0-9]+$"
+      allow_regex:
+        pattern: "^user_[a-z0-9]+$"
+        flags: "i"
+    - name: pull_request_body
+      non_empty: true
+      reject_secrets: true
+      deny_regex: "/(ignore previous instructions|system:)/i"
 ```

 ## Minimal example
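
The prompt-format changes list three accepted spellings for a context regex: a bare `"pattern"` string, a `/pattern/i` literal-style string, and a structured `{ pattern, flags }` object. A rough sketch of how these could be normalized into one shape follows. `RegexSpec`, `NormalizedRegex`, and `normalizeRegexSpec` are hypothetical names, and the library's real normalization rules may differ, for example in how it treats a bare pattern that itself starts and ends with `/`.

```typescript
// Hypothetical input and output shapes for illustration only.
type RegexSpec = string | { pattern: string; flags?: string };

interface NormalizedRegex {
  pattern: string;
  flags: string;
}

// Matches the "/pattern/flags" convenience syntax, e.g. "/^[a-z]+$/i".
// The greedy first group lets the pattern itself contain slashes.
const LITERAL_SYNTAX = /^\/([\s\S]+)\/([a-z]*)$/;

function normalizeRegexSpec(spec: RegexSpec): NormalizedRegex {
  if (typeof spec !== 'string') {
    // Structured form: flags are optional and default to none.
    return { pattern: spec.pattern, flags: spec.flags ?? '' };
  }
  const literal = LITERAL_SYNTAX.exec(spec);
  if (literal) {
    return { pattern: literal[1] ?? '', flags: literal[2] ?? '' };
  }
  // Anything else is treated as a bare pattern with no flags.
  return { pattern: spec, flags: '' };
}
```

Under this sketch, `"/^user_[a-z0-9]+$/i"` and `{ pattern: "^user_[a-z0-9]+$", flags: "i" }` normalize to the same pattern and flags, which is the equivalence the YAML example in this hunk relies on.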

docs/schema.md

Lines changed: 7 additions & 3 deletions
@@ -139,7 +139,7 @@ context:

 | Field | Type | Description |
 |-------|------|-------------|
-| `inputs` | `Array<string | { name, max_size?, trim?, allow_regex?, deny_regex? }>` | Expected variable names, optionally with size and runtime sanitization constraints |
+| `inputs` | `Array<string | { name, max_size?, trim?, allow_regex?, deny_regex?, non_empty?, reject_secrets? }>` | Expected variable names, optionally with size and runtime sanitization constraints |
 | `history` | `object` | History settings |
 | `history.max_items` | `number` | Maximum history items |

@@ -156,8 +156,12 @@ Object-form inputs add optional controls:

 - `max_size`: checked during `renderPrompt()` and can produce `POK030` warnings.
 - `trim`: trims incoming values to the `max_size` budget before interpolation (`true`/`end` keeps leading bytes, `start` keeps trailing bytes).
-- `allow_regex`: allowlist validation before interpolation; non-matches throw `POK031`.
-- `deny_regex`: blocklist validation before interpolation; matches throw `POK032`.
+- `allow_regex`: allowlist validation before interpolation; accepts `"pattern"`, `/pattern/i`, or `{ pattern, flags }`, and non-matches throw `POK031`.
+- `deny_regex`: blocklist validation before interpolation; accepts `"pattern"`, `/pattern/i`, or `{ pattern, flags }`, and matches throw `POK032`.
+- `non_empty`: rejects blank or whitespace-only values with `POK033`.
+- `reject_secrets`: rejects common secret-like strings with `POK034`.
+
+Malformed `allow_regex` and `deny_regex` values are reported during `validate` and `compile` with `POK013`.

 ## `includes`

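
The schema changes add two boolean validators, `non_empty` and `reject_secrets`. The sketch below illustrates what such checks could look like. The built-in secret detector is not shown anywhere in this diff, so the two patterns in `SECRET_PATTERNS` (the shape of an AWS access key ID and a PEM private-key header) are stand-in assumptions, and the function names are hypothetical. Only the `POK033` and `POK034` codes come from the docs.

```typescript
// Illustrative detector only: these patterns are assumptions, not the
// library's built-in list.
const SECRET_PATTERNS: RegExp[] = [
  /AKIA[0-9A-Z]{16}/, // shape of an AWS access key ID
  /-----BEGIN [A-Z ]*PRIVATE KEY-----/, // PEM private key header
];

// non_empty: reject blank or whitespace-only values.
function assertNonEmpty(name: string, value: string): void {
  if (value.trim().length === 0) {
    throw new Error(`POK033: "${name}" must not be blank`);
  }
}

// reject_secrets: reject values that match any secret-like pattern.
function assertNoSecrets(name: string, value: string): void {
  if (SECRET_PATTERNS.some((re) => re.test(value))) {
    throw new Error(`POK034: "${name}" looks like it contains a secret`);
  }
}
```

A pattern-based detector like this is a heuristic: it can miss secrets in unfamiliar formats and can flag harmless strings, so it complements rather than replaces keeping credentials out of prompt inputs.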

docs/validation.md

Lines changed: 15 additions & 5 deletions
@@ -1,6 +1,6 @@
 # Validation

-PromptOpsKit validates prompts at multiple levels — schema structure, front matter keys, variable usage, and include graphs. Render-time context size limits are checked separately during prompt rendering.
+PromptOpsKit validates prompts at multiple levels — schema structure, front matter keys, variable usage, context regex compilation, and include graphs. Render-time context size limits are checked separately during prompt rendering.

 ## Quick start

@@ -34,8 +34,10 @@ const result = await kit.validatePrompt('support/reply');
 | `POK010` | Warning | Unknown front matter key (with "did you mean?" suggestion) |
 | `POK011` | Warning | Variable used in template but not declared in `context.inputs` |
 | `POK012` | Warning | Variable declared in `context.inputs` but never used |
-| `POK013` | Error | Invalid context regex pattern (`allow_regex` or `deny_regex`) |
+| `POK013` | Error | Invalid context regex pattern (`allow_regex` or `deny_regex`), including prompt id, variable name, field name, and raw configured value |
 | `POK014` | Warning | `trim` configured without `max_size` (trim-to-budget skipped) |
+| `POK033` | Runtime error | `non_empty` validation failed |
+| `POK034` | Runtime error | `reject_secrets` validation matched |
 | `POK020` | Error | Include resolution failed (missing file) |
 | `POK021` | Error | Circular include detected |

@@ -96,17 +98,25 @@ context:
   inputs:
     - name: user_id
       trim: true
-      allow_regex: "^user_[a-z0-9]+$"
+      allow_regex:
+        pattern: "^user_[a-z0-9]+$"
+        flags: "i"
     - name: user_message
-      deny_regex: "([Ii]gnore previous instructions|[Ss]ystem:)"
+      deny_regex: "/(ignore previous instructions|system:)/i"
+      non_empty: true
+      reject_secrets: true
 ```

 - `trim` trims values to the `max_size` byte budget before interpolation.
 - `allow_regex` enforces an allowlist pattern before interpolation and throws `POK031` when a value fails validation.
 - `deny_regex` enforces a blocklist pattern before interpolation and throws `POK032` when a value matches.
-- During static validation, malformed `allow_regex` or `deny_regex` patterns are reported as `POK013`.
+- `non_empty` rejects blank or whitespace-only values with `POK033`.
+- `reject_secrets` rejects common secret-like strings with `POK034`.
+- During static validation and compilation, malformed `allow_regex` or `deny_regex` patterns are reported as `POK013`.
 - During static validation, `trim` without `max_size` returns a `POK014` warning.

+Regex compilation errors include the prompt id, variable name, field name, and raw configured value to make bad prompt definitions easy to locate.
+
 You can override that behavior at the kit level:

 ```typescript

src/cli/commands/compile.ts

Lines changed: 7 additions & 1 deletion
@@ -2,6 +2,7 @@ import { readdir, writeFile, mkdir, rm } from 'node:fs/promises';
 import { join, extname, relative, dirname } from 'node:path';
 import { loadPromptFile } from '../../parser/index.js';
 import { resolveIncludes } from '../../composition/index.js';
+import { validateAssetWithIncludes } from '../../validation/index.js';
 import { DEFAULT_PROMPTS_DIR, defaultCompiledDirForFormat } from '../../prompt-resolution.js';

 const HELP = `
@@ -59,7 +60,12 @@ export async function compile(args: string[]): Promise<void> {
     const outPath = join(outputDir, rel + outExt);

     try {
-      const { asset: parsed } = await loadPromptFile(file, { defaultsRoot: sourceDir });
+      const { asset: parsed, raw } = await loadPromptFile(file, { defaultsRoot: sourceDir });
+
+      const validation = await validateAssetWithIncludes(parsed, file, Object.keys(raw.frontMatter));
+      if (validation.errors.length > 0) {
+        throw new Error(validation.errors.map((error) => `${error.code}: ${error.message}`).join('\n '));
+      }

       // Resolve includes so compiled artifacts are self-sufficient
       const asset = (parsed.includes && parsed.includes.length > 0)
