Commit 55c3c5d
fix: unbounded regex quantifiers prevent Outlines DFA state explosion (#203)
* fix: use outlines v1.2 get_regex_logits_processor API
The outlines v1.2 API requires:
1. Wrapping the HF model+tokenizer in outlines.Transformers
2. Calling get_regex_logits_processor(None, wrapped, regex)
Prior code tried to construct OutlinesLogitsProcessor directly with
a tokenizer= kwarg that doesn't exist in v1.2. The resulting error was
caught, and generation silently fell back to unconstrained decoding.
Tests now verify the ACTUAL API surface (import paths + factory
function signature) instead of just checking class names exist.
Checks like these would have caught all three prior Outlines bugs:
- PR #197: wrong class name (RegexLogitsProcessor)
- PR #201: wrong constructor (tokenizer= kwarg)
- This PR: wrong API pattern (direct constructor vs factory)
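A minimal sketch of what a signature-level check could look like, as opposed to only asserting that a name exists. The helper `check_factory_signature` and the stand-in functions are hypothetical, and the parameter names (`backend_name`, `model`, `regex`) are assumptions inferred from the call shape `get_regex_logits_processor(None, wrapped, regex)` described above; a real test would import the factory from outlines instead of using a stand-in.

```python
import inspect


def check_factory_signature(factory, expected_params):
    """Return True if the callable's parameter names match expected_params in order."""
    actual = list(inspect.signature(factory).parameters)
    return actual == list(expected_params)


# Hypothetical stand-in with the three-argument factory shape (assumed names).
def fake_get_regex_logits_processor(backend_name, model, regex):
    return ("processor", backend_name, regex)


# Hypothetical stand-in for the older, wrong pattern (tokenizer= kwarg).
def fake_old_constructor(tokenizer, regex):
    return ("processor", regex)


EXPECTED = ["backend_name", "model", "regex"]
factory_ok = check_factory_signature(fake_get_regex_logits_processor, EXPECTED)
old_ok = check_factory_signature(fake_old_constructor, EXPECTED)
```

Checking parameter names rather than class existence is what separates the three failure modes listed above: a wrong class name fails at import, while a wrong constructor or call pattern only shows up when the signature itself is inspected.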
33/33 tests pass with outlines 1.2.12 installed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use unbounded regex quantifiers to prevent DFA state explosion
Bounded quantifiers like {1,500} create counting DFA states that
cross-product with every alternative in the regex. The Thought
prefix alone added roughly 500 counting states (one per allowed
repetition), and the combined automaton exceeded Outlines' 2^31 limit.
Changes:
- [^\n]{1,500} → [^\n]+ (Thought prefix: 1 state vs ~500)
- [^"]{0,200} → [^"]* (TYPE text: 1 state vs 200)
- \d{1,3} → \d+ (coordinates: 1 state vs 3)
max_new_tokens=512 provides the actual length limit. The DFA
doesn't need to count characters.
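The trade-off can be illustrated with Python's standard `re` module (not the Outlines DFA itself, so this says nothing about state counts). The sketch below only shows that the unbounded pattern accepts everything the bounded one does, plus longer strings, which is why the length cap has to come from somewhere else such as max_new_tokens. The helper `accepts` and the sample strings are hypothetical.

```python
import re

BOUNDED = r"[^\n]{1,500}"   # pattern before this change
UNBOUNDED = r"[^\n]+"       # pattern after this change


def accepts(pattern, text):
    """Return True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


short_text = "click the submit button"
long_text = "x" * 501  # one character past the old bound

# Both patterns accept ordinary short input.
short_bounded = accepts(BOUNDED, short_text)
short_unbounded = accepts(UNBOUNDED, short_text)

# Only the unbounded pattern accepts input past 500 characters,
# so the length limit must be enforced outside the regex.
long_bounded = accepts(BOUNDED, long_text)
long_unbounded = accepts(UNBOUNDED, long_text)
```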
New test: test_no_bounded_quantifiers_in_regex asserts no quantifier
in the regex exceeds {N,10}, preventing future regressions.
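A regression check along these lines might look like the sketch below. It is not the actual test_no_bounded_quantifiers_in_regex; the helper name `find_large_bounded_quantifiers` and the sample patterns are hypothetical, and the scan is purely textual, so a literal `{n,m}` inside a character class or after a backslash would be reported as a false positive.

```python
import re

# Matches {n}, {n,} and {n,m} quantifier syntax.
_QUANTIFIER = re.compile(r"\{(\d+)(?:,(\d*))?\}")


def find_large_bounded_quantifiers(pattern, max_bound=10):
    """Return the quantifiers in pattern whose largest bound exceeds max_bound."""
    offenders = []
    for match in _QUANTIFIER.finditer(pattern):
        # Group 2 is None for {n} and empty for {n,}; keep only real numbers.
        bounds = [int(g) for g in match.groups() if g]
        if max(bounds) > max_bound:
            offenders.append(match.group(0))
    return offenders


old_pattern = r"Thought: [^\n]{1,500}\n"
new_pattern = r"Thought: [^\n]+\n"

old_offenders = find_large_bounded_quantifiers(old_pattern)
new_offenders = find_large_bounded_quantifiers(new_pattern)
```

A test can then assert that the returned list is empty for the production regex, which fails as soon as someone reintroduces a large counted repetition.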
34/34 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 583ed56 commit 55c3c5d
2 files changed
Lines changed: 28 additions & 6 deletions
File 1 (file name and diff content not preserved): original lines 91-109 → new lines 91-114, 11 additions, 6 deletions
File 2 (file name and diff content not preserved): original lines 80-85 → new lines 80-102, 17 additions, 0 deletions