Fix Inefficient regular expression#1718
Conversation
HannesWell
left a comment
There was a problem hiding this comment.
This change seems reasonable to me. Especially as the Pattern is currently very difficult to understand, it's probably easier to read the sequential code (therefore I haven't compared the logi ...).
I just have two nitpicks below.
| // Determine whether the given attribute list matches the pattern | ||
| return matcher.find(); | ||
| private static boolean isValidAttributeKeyChar(char c) { | ||
| return (c >= 'A' && c <= 'Z') || // |
There was a problem hiding this comment.
Wouldn't be something like Character.isUpperCase/isLowerCase/isDigit() etc. be more readable?
| return (c >= 'A' && c <= 'Z') || // | |
| return Character.isUpperCase(c) || // |
There was a problem hiding this comment.
I tried to replicate what the RegEx is doing here, Character.isUpperCase/isLowerCase/ and alike might include more characters as before and I haven really checked why we test this particular things.
| int length = key.length(); | ||
| for (int i = 0; i < length; i++) { | ||
| char c = key.charAt(i); | ||
| if (!isValidAttributeKeyChar(c)) { | ||
| return false; | ||
| } | ||
| } | ||
| return true; |
There was a problem hiding this comment.
Would a stream-pipeline fit here?
| int length = key.length(); | |
| for (int i = 0; i < length; i++) { | |
| char c = key.charAt(i); | |
| if (!isValidAttributeKeyChar(c)) { | |
| return false; | |
| } | |
| } | |
| return true; | |
| return key.chars().allMatch(PDETextHelper::isValidAttributeKeyChar); |
With the suggestions below one could maybe even inline the isValidAttributeKeyChar method?
9b56aff to
fb6e5c9
Compare
Some regular expressions take a long time to match certain input strings
to the point where the time it takes to match a string of length n is
proportional to nk or even 2n. Such regular expressions can negatively
affect performance, or even allow a malicious user to perform a Denial
of Service ("DoS") attack by crafting an expensive input string for the
regular expression to match.
This replaces the problematic regular expression by a little parser that
checks for the same results as the regular expression.
fb6e5c9 to
ed9afa1
Compare
Some regular expressions take a long time to match certain input strings to the point where the time it takes to match a string of length n is proportional to nk or even 2n. Such regular expressions can negatively affect performance, or even allow a malicious user to perform a Denial of Service ("DoS") attack by crafting an expensive input string for the regular expression to match.
This replaces the problematic regular expression by a little parser that checks for the same results as the regular expression.