fix: enforce expression size limit before source construction#1302
fix: enforce expression size limit before source construction#1302thesmartshadow wants to merge 4 commits intogoogle:masterfrom
Conversation
Enforce the configured ParserExpressionSizeLimit in Env.Compile() and Env.Parse() before calling common.NewTextSource(), preventing memory allocation proportional to the full input size for oversized expressions. Previously, oversized expressions were correctly rejected but only after the internal rune buffer had already been allocated. This allowed substantial memory allocation even when a strict size limit was configured. The fix adds an early utf8.RuneCountInString() check which is O(n) in CPU but avoids the additional memory allocation and GC pressure caused by eager source/rune-buffer construction. CWE-400: Uncontrolled Resource Consumption
985387a to
9b5a353
Compare
|
/gcbrun |
|
Decoding the entire string twice, seems not ideal. I'd rather we just add additional funcs near common.NewTextSource and friends to take the limit and return an additional error. First we can just check the string byte length (constant time) to see if its over, otherwise we can enforce it as we go when building the buffer. |
|
hi @jcking that makes sense. I reworked the patch along those lines. Instead of doing a separate full The updated version adds limit-aware helpers near I reran the full test suite and the direct PoC as well. All tests are passing, and the compile-time memory spike is still gone with the updated approach. |
|
I dropped the capHint logic and just used len(data) directly for the buffer capacities. I also added a comment there to make it a bit more obvious why that is safe. I reran the full test suite and the direct PoC after the change. Everything is still passing on my side and the memory behavior still looks good. |
| } | ||
|
|
||
| func newBuffer(data string, lines bool) (Buffer, []int32) { | ||
| func newBufferWithLimit(data string, lines bool, limit int) (Buffer, []int32, error) { |
There was a problem hiding this comment.
I think we still get here if len(data) > limit, which means we will allocate the buffer even though we know we are going to go past it. I would still add the following after this if statement.
if limit >= 0 && len(data) > limit {
return nil, nil, &SizeLimitError{
Size: countRemainingCodePoints(data, 0, 0),
Limit: limit,
}
}
There was a problem hiding this comment.
Yeah I updated it so we return before any buffer allocation on the oversized path. I kept the code point count there so multibyte input still follows the same limit semantics.
I reran the tests after the change.
This PR enforces the configured
ParserExpressionSizeLimitinEnv.Compile()and
Env.Parse()before callingcommon.NewTextSource(), preventing memoryallocation proportional to the full input size for oversized expressions.
Problem
The current flow is:
Env.Compile(txt)→common.NewTextSource(txt)common.NewTextSource()→runes.NewBufferAndLineOffsets(contents)runes.newBuffer()eagerly allocates capacity proportional to the full inputbuf.Len() > p.expressionSizeCodePointLimitApplications configuring
ParserExpressionSizeLimit(1000)expect oversizedexpressions to be rejected cheaply, but the internal source/rune-buffer
materialization still occurs before enforcement.
Fix
Add an early
utf8.RuneCountInString(txt)check in bothCompile()andParse()before callingcommon.NewTextSource().Before / After
The test
TestExpressionSizeLimitEarlyEnforcementconfirms 0MiB allocationdelta for a 10MB oversized expression with a 1000 code point limit.