Commit 0249004

fix: replace lazy quantifier with greedy to prevent ReDoS backtracking
Replace lazy quantifier +? with greedy quantifier + in regex patterns to prevent catastrophic backtracking and ReDoS attacks.

The pattern ([^+\n]+?)(?=\n|$) uses a lazy quantifier that can cause super-linear runtime when the lookahead fails, leading to backtracking. Using a greedy quantifier ([^+\n]+)(?=\n|$) matches as much as possible first, then verifies with the lookahead, preventing excessive backtracking.

Fixed patterns:
- Invoice Date patterns (2 regexes): Changed +? to +
- Due Date patterns (2 regexes): Changed +? to +
- Service patterns (2 regexes): Changed +? to +

The greedy quantifier is safer because it doesn't backtrack through multiple positions when the lookahead fails, making the regex linear in complexity rather than quadratic or exponential.
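
A minimal sketch of the behavioral side of this change, under stated assumptions: the sample text and the `capture` helper are hypothetical and not part of DocumentExtraction.tsx. Because the character class excludes `\n` and the lookahead requires `\n` or end of input, both the lazy and the greedy form have to consume the rest of the line, so they should capture the same value. The sketch checks only that the captured values agree; it does not measure runtime.

```typescript
// Hypothetical sample text; not taken from the repository.
const sample = "Invoice Date: 2024-03-15\nDue Date: 2024-04-14\n";
// Hypothetical line containing '+', which the character class [^+\n] rejects.
const withPlus = "Invoice Date: 2024+03\n";

// Pattern before the commit: lazy quantifier followed by the lookahead.
const lazyPattern = /Invoice Date:\s*([^+\n]+?)(?=\n|$)/i;
// Pattern after the commit: greedy quantifier followed by the same lookahead.
const greedyPattern = /Invoice Date:\s*([^+\n]+)(?=\n|$)/i;

// Hypothetical helper: return the trimmed first capture group, or null if no match.
function capture(pattern: RegExp, text: string): string | null {
  const match = text.match(pattern);
  return match ? match[1].trim() : null;
}

const lazyValue = capture(lazyPattern, sample);
const greedyValue = capture(greedyPattern, sample);
const lazyPlus = capture(lazyPattern, withPlus);
const greedyPlus = capture(greedyPattern, withPlus);

console.log(lazyValue, greedyValue, lazyPlus, greedyPlus);
```

If the two forms ever disagreed on a real document, that would point to a behavior change rather than a pure performance fix, so a comparison like this may be worth keeping as a regression check.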
1 parent b0f351a commit 0249004

1 file changed

Lines changed: 10 additions & 9 deletions

src/ui/web/src/pages/DocumentExtraction.tsx

@@ -1056,21 +1056,22 @@ const DocumentExtraction: React.FC = () => {
 if (orderNumMatch) parsedFields.order_number = { value: orderNumMatch[1] };
 
 // Invoice Date patterns
-// Use positive lookahead (?=\n|$) instead of non-capturing group to prevent ReDoS
-const invoiceDateMatch = extractedText.match(/Invoice Date:\s*([^+\n]+?)(?=\n|$)/i) ||
-extractedText.match(/Date:\s*([^+\n]+?)(?=\n|$)/i);
+// Use greedy quantifier + instead of lazy +? to prevent ReDoS backtracking
+// The greedy quantifier matches as much as possible, then lookahead verifies newline/end
+const invoiceDateMatch = extractedText.match(/Invoice Date:\s*([^+\n]+)(?=\n|$)/i) ||
+extractedText.match(/Date:\s*([^+\n]+)(?=\n|$)/i);
 if (invoiceDateMatch) parsedFields.invoice_date = { value: invoiceDateMatch[1].trim() };
 
 // Due Date patterns
-// Use positive lookahead (?=\n|$) instead of non-capturing group to prevent ReDoS
-const dueDateMatch = extractedText.match(/Due Date:\s*([^+\n]+?)(?=\n|$)/i) ||
-extractedText.match(/Payment Due:\s*([^+\n]+?)(?=\n|$)/i);
+// Use greedy quantifier + instead of lazy +? to prevent ReDoS backtracking
+const dueDateMatch = extractedText.match(/Due Date:\s*([^+\n]+)(?=\n|$)/i) ||
+extractedText.match(/Payment Due:\s*([^+\n]+)(?=\n|$)/i);
 if (dueDateMatch) parsedFields.due_date = { value: dueDateMatch[1].trim() };
 
 // Service patterns
-// Use positive lookahead (?=\n|$) instead of non-capturing group to prevent ReDoS
-const serviceMatch = extractedText.match(/Service:\s*([^+\n]+?)(?=\n|$)/i) ||
-extractedText.match(/Description:\s*([^+\n]+?)(?=\n|$)/i);
+// Use greedy quantifier + instead of lazy +? to prevent ReDoS backtracking
+const serviceMatch = extractedText.match(/Service:\s*([^+\n]+)(?=\n|$)/i) ||
+extractedText.match(/Description:\s*([^+\n]+)(?=\n|$)/i);
 if (serviceMatch) parsedFields.service = { value: serviceMatch[1].trim() };
 
 // Rate/Price patterns
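
The three field blocks in this hunk share one shape: try a primary label, fall back to an alternate label, then trim the capture. The following is a minimal sketch of a helper that builds the same greedy pattern from a list of labels. The name `extractField`, its signature, and the sample text are assumptions for illustration and do not appear in DocumentExtraction.tsx.

```typescript
// Hypothetical helper; the name and signature are not from DocumentExtraction.tsx.
function extractField(text: string, labels: string[]): string | undefined {
  for (const label of labels) {
    // Escape regex metacharacters so the label is matched literally.
    const escaped = label.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    // Same shape as the patterns in the diff: greedy capture up to the end of the line.
    const pattern = new RegExp(`${escaped}:\\s*([^+\\n]+)(?=\\n|$)`, "i");
    const match = text.match(pattern);
    if (match) return match[1].trim();
  }
  return undefined;
}

// Hypothetical sample text; not taken from the repository.
const text = "Payment Due: 2024-04-14\nDescription: Freight handling\n";

const dueDate = extractField(text, ["Due Date", "Payment Due"]);
const service = extractField(text, ["Service", "Description"]);
const missing = extractField(text, ["Invoice Date"]);

console.log(dueDate, service, missing);
```

Keeping the pattern in one place would mean a future quantifier change touches a single line instead of six, at the cost of building each `RegExp` at call time.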
