Reuse the precompiled regex for collection variable selectors#3573
Open
chweidling wants to merge 2 commits into
resolveRegularExpression() in the collection backends compiled (and JIT-compiled) a fresh Utils::Regex from the pattern string on every call. For a regex variable selector such as TX:/regex/ that is evaluated per transaction, this recompiled the same pattern on every request, even though the calling VariableRegex already holds it compiled once at configuration time in its m_r member.

Add a Collection::resolveRegularExpression(Utils::Regex *) overload that accepts the pre-compiled regex. The base class keeps the previous behaviour by default (it compiles from r->pattern and delegates), so backends that do not override it are unaffected; InMemoryPerProcess overrides it to scan the collection with the supplied regex directly. Tx_DictElementRegexp now passes its already-compiled &m_r instead of the pattern string.

Behaviour is unchanged: m_r is constructed with the same arguments (Utils::Regex(pattern, /*ignoreCase=*/true)) the backend used, so the identical regex is applied. It is just compiled once instead of per transaction. A regression test covering TX:/regex/ selection is added.
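The overload pattern described above can be sketched in isolation. This is a minimal illustration, not the actual ModSecurity code: std::regex stands in for Utils::Regex (which wraps PCRE2), and the names Regex, InMemory, compileCount and compilationsFor are hypothetical helpers introduced only to make the number of compilations observable.

```cpp
#include <cassert>
#include <map>
#include <regex>
#include <string>
#include <vector>

// Hypothetical counter so the number of regex compilations can be observed.
inline int &compileCount() { static int n = 0; return n; }

// Hypothetical stand-in for Utils::Regex: keeps the pattern string and the
// compiled form (case-insensitive, mirroring ignoreCase=true).
struct Regex {
    std::string pattern;
    std::regex compiled;
    explicit Regex(const std::string &p)
        : pattern(p),
          compiled(p, std::regex::ECMAScript | std::regex::icase) {
        ++compileCount();
    }
};

class Collection {
 public:
    virtual ~Collection() = default;

    // String overload: backends compile the pattern on each call.
    virtual std::vector<std::string> resolveRegularExpression(
        const std::string &pattern) const = 0;

    // Pre-compiled overload: by default it falls back to the string overload,
    // so backends that do not override it behave as before.
    virtual std::vector<std::string> resolveRegularExpression(
        const Regex *r) const {
        assert(r != nullptr);
        return resolveRegularExpression(r->pattern);
    }
};

// Assumed in-memory backend that overrides the pre-compiled overload.
class InMemory : public Collection {
 public:
    std::map<std::string, std::string> data;

    std::vector<std::string> resolveRegularExpression(
        const std::string &pattern) const override {
        Regex fresh(pattern);  // compiled again on every call
        return scan(fresh);
    }

    std::vector<std::string> resolveRegularExpression(
        const Regex *r) const override {
        assert(r != nullptr);
        return scan(*r);  // reuses the caller's compiled regex
    }

 private:
    // Returns the values whose keys match the regex.
    std::vector<std::string> scan(const Regex &r) const {
        std::vector<std::string> out;
        for (const auto &kv : data) {
            if (std::regex_search(kv.first, r.compiled)) out.push_back(kv.second);
        }
        return out;
    }
};

// Simulates `transactions` selector evaluations and returns how many regex
// compilations happened, including the single configuration-time compile.
int compilationsFor(bool reuseCompiled, int transactions) {
    InMemory tx;
    tx.data = {{"score_sql", "5"}, {"score_xss", "3"}, {"other", "1"}};
    const std::string pattern = "^score_";
    const int before = compileCount();
    Regex configured(pattern);  // compiled once, like m_r at configuration time
    for (int i = 0; i < transactions; ++i) {
        std::vector<std::string> hits =
            reuseCompiled ? tx.resolveRegularExpression(&configured)
                          : tx.resolveRegularExpression(pattern);
        (void)hits;
    }
    return compileCount() - before;
}
```

In this sketch, calling compilationsFor(false, n) compiles the pattern once per simulated transaction on top of the configuration-time compile, while compilationsFor(true, n) compiles it only once, which is the saving the PR targets.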
The failing Filed as #3574. All other CI jobs (Linux x32/x64 gcc+clang, all configs, macOS) pass.

what
- Add a Collection::resolveRegularExpression(Utils::Regex *) overload that takes an already-compiled regex, and override it in InMemoryPerProcess.
- Tx_DictElementRegexp (the TX:/regex/ variable selector) now passes its pre-compiled m_r instead of the pattern string.
- The base class keeps the previous behaviour by default (it compiles from r->pattern and delegates), so backends that don't override it, and the existing compartment-prefixed string overloads, are unchanged.
- Add a regression test covering TX:/regex/ selection.

why
- resolveRegularExpression(const std::string&, ...) built a fresh Utils::Regex, i.e. a pcre2_compile() and pcre2_jit_compile(), on every call.
- For a regex selector such as TX:/regex/ that is evaluated per transaction (e.g. CRS 921180), this recompiled the same pattern on every request, even though the calling VariableRegex already holds it compiled once at configuration time in m_r. That compiled regex was simply being ignored.
- Behaviour is unchanged: m_r is constructed with the same arguments, Utils::Regex(pattern, /*ignoreCase=*/true), that the backend used, so the identical regex is applied; it is just compiled once instead of per transaction.
- Measured (v3/master, gcc -O2, system PCRE2 with JIT): a TX:/…/ rule over 100k transactions: ~64.9 → 62.2 µs/tx (+4 % throughput); the gain scales with the number of regex selectors evaluated per request.

references