Commit b0f351a

fix: prevent ReDoS vulnerability in regex patterns
Replace non-capturing groups (?:\n|$) with positive lookaheads (?=\n|$) in regex patterns to prevent catastrophic backtracking and ReDoS attacks.

The pattern ([^+\n]+?)(?:\n|$) uses a lazy quantifier followed by a non-capturing group, which can cause super-linear runtime when the input doesn't match. Using a positive lookahead (?=\n|$) prevents backtracking by not consuming characters, making the regex safe from ReDoS.

Fixed patterns in:
- Invoice Date patterns (2 regexes)
- Due Date patterns (2 regexes)
- Service patterns (2 regexes)

All patterns now use (?=\n|$) instead of (?:\n|$) to prevent ReDoS.
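
As a small illustration of what the swap changes for callers, the sketch below runs the old and new Invoice Date patterns on a made-up sample string (the sample text and variable names are assumptions, not taken from the commit). Capture group 1 is identical in both cases; only the full match differs, because the lookahead does not consume the trailing newline.

```typescript
// Sample text is invented for this illustration.
const sampleText = "Invoice Date: 2024-01-15\nDue Date: 2024-02-15";

// Before: the non-capturing group consumes the trailing newline.
const consumingPattern = /Invoice Date:\s*([^+\n]+?)(?:\n|$)/i;
// After: the lookahead asserts newline-or-end-of-input without consuming it.
const lookaheadPattern = /Invoice Date:\s*([^+\n]+?)(?=\n|$)/i;

const beforeMatch = sampleText.match(consumingPattern);
const afterMatch = sampleText.match(lookaheadPattern);

// Full match: the old pattern includes "\n", the new one stops before it.
console.log(JSON.stringify(beforeMatch?.[0])); // "Invoice Date: 2024-01-15\n"
console.log(JSON.stringify(afterMatch?.[0]));  // "Invoice Date: 2024-01-15"

// Captured value: the same in both, so code reading match[1] is unaffected.
console.log(beforeMatch?.[1] === afterMatch?.[1]); // true
```

Since the changed code only reads index 1 of each match, the extracted field values should stay the same after this commit.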
1 parent b54cbb2 commit b0f351a

1 file changed

Lines changed: 9 additions & 6 deletions

src/ui/web/src/pages/DocumentExtraction.tsx

@@ -1056,18 +1056,21 @@ const DocumentExtraction: React.FC = () => {
 if (orderNumMatch) parsedFields.order_number = { value: orderNumMatch[1] };

 // Invoice Date patterns
-const invoiceDateMatch = extractedText.match(/Invoice Date:\s*([^+\n]+?)(?:\n|$)/i) ||
-  extractedText.match(/Date:\s*([^+\n]+?)(?:\n|$)/i);
+// Use positive lookahead (?=\n|$) instead of non-capturing group to prevent ReDoS
+const invoiceDateMatch = extractedText.match(/Invoice Date:\s*([^+\n]+?)(?=\n|$)/i) ||
+  extractedText.match(/Date:\s*([^+\n]+?)(?=\n|$)/i);
 if (invoiceDateMatch) parsedFields.invoice_date = { value: invoiceDateMatch[1].trim() };

 // Due Date patterns
-const dueDateMatch = extractedText.match(/Due Date:\s*([^+\n]+?)(?:\n|$)/i) ||
-  extractedText.match(/Payment Due:\s*([^+\n]+?)(?:\n|$)/i);
+// Use positive lookahead (?=\n|$) instead of non-capturing group to prevent ReDoS
+const dueDateMatch = extractedText.match(/Due Date:\s*([^+\n]+?)(?=\n|$)/i) ||
+  extractedText.match(/Payment Due:\s*([^+\n]+?)(?=\n|$)/i);
 if (dueDateMatch) parsedFields.due_date = { value: dueDateMatch[1].trim() };

 // Service patterns
-const serviceMatch = extractedText.match(/Service:\s*([^+\n]+?)(?:\n|$)/i) ||
-  extractedText.match(/Description:\s*([^+\n]+?)(?:\n|$)/i);
+// Use positive lookahead (?=\n|$) instead of non-capturing group to prevent ReDoS
+const serviceMatch = extractedText.match(/Service:\s*([^+\n]+?)(?=\n|$)/i) ||
+  extractedText.match(/Description:\s*([^+\n]+?)(?=\n|$)/i);
 if (serviceMatch) parsedFields.service = { value: serviceMatch[1].trim() };

 // Rate/Price patterns
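
All six changed regexes share one shape: a label, a colon, optional whitespace, a lazy capture, and the newline-or-end lookahead. A minimal sketch of how that shape could be kept in one place follows. `extractLabeledValue` is a hypothetical helper and the sample text is invented; neither is part of this commit or, as far as the diff shows, of `DocumentExtraction.tsx`.

```typescript
// Hypothetical helper (not in the commit): tries each label in order and
// returns the first trimmed capture, or undefined if no label matches.
function extractLabeledValue(text: string, labels: string[]): string | undefined {
  for (const label of labels) {
    // Escape regex metacharacters so the label is matched literally.
    const escaped = label.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    // Same pattern shape as the commit: lazy capture, then lookahead (?=\n|$).
    const pattern = new RegExp(`${escaped}:\\s*([^+\\n]+?)(?=\\n|$)`, "i");
    const value = text.match(pattern)?.[1];
    if (value !== undefined) return value.trim();
  }
  return undefined;
}

// Sample input invented for illustration.
const invoiceText =
  "Invoice Date: 2024-01-15\nPayment Due: 2024-02-15\nDescription: Consulting";

const invoiceDate = extractLabeledValue(invoiceText, ["Invoice Date", "Date"]);
const dueDate = extractLabeledValue(invoiceText, ["Due Date", "Payment Due"]);
const service = extractLabeledValue(invoiceText, ["Service", "Description"]);

console.log(invoiceDate, dueDate, service);
```

The label order mirrors the `||` fallbacks in the diff, so the more specific label is tried first. With a helper like this, a future change to the terminator would touch a single pattern rather than six.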
