reduce memory#2521
Conversation
WalkthroughThe PR lowers the documented and computed default response body size to 50mb, changes ChangesHTTPX response body handling
Sequence Diagram(s)sequenceDiagram
participant HTTPXDo as HTTPX.Do
participant DecodeData
participant Response
HTTPXDo->>DecodeData: decode respbody into Data
DecodeData-->>HTTPXDo: decoded Data
HTTPXDo->>Response: assign RawData from rawbody
alt VHostStripHTML enabled
HTTPXDo->>Response: count Words and Lines from sanitized respbody
else VHostStripHTML disabled
HTTPXDo->>Response: count Words and Lines from respbody bytes
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
common/httpx/httpx.go (1)
334-338: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick winUse
strings.Countin theVHostStripHTMLbranch too.Lines 334-338 still allocate separator slices via
strings.Split. Sincerespbodystris already materialized there,strings.Count(respbodystr, " ")+1andstrings.Count(strings.TrimSpace(respbodystr), "\n")+1preserve the current semantics without the extra allocation.♻️ Proposed change
if h.Options.VHostStripHTML { respbodystr := h.htmlPolicy.Sanitize(string(respbody)) if respbodystr != "" { - resp.Words = len(strings.Split(respbodystr, " ")) - resp.Lines = len(strings.Split(strings.TrimSpace(respbodystr), "\n")) + resp.Words = strings.Count(respbodystr, " ") + 1 + resp.Lines = strings.Count(strings.TrimSpace(respbodystr), "\n") + 1 } } else if len(respbody) > 0 {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@common/httpx/httpx.go` around lines 334 - 338, The VHostStripHTML branch in httpx’s response counting still uses strings.Split and allocates unnecessary slices; update the resp.Words and resp.Lines calculations in the h.Options.VHostStripHTML block to use strings.Count on respbodystr (with the same TrimSpace handling for lines) so it preserves the existing semantics without extra allocations.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@common/httpx/httpx.go`:
- Around line 334-338: The VHostStripHTML branch in httpx’s response counting
still uses strings.Split and allocates unnecessary slices; update the resp.Words
and resp.Lines calculations in the h.Options.VHostStripHTML block to use
strings.Count on respbodystr (with the same TrimSpace handling for lines) so it
preserves the existing semantics without extra allocations.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9f5b139f-9ed8-4a0e-a6c8-0ff7111ef1b7
📒 Files selected for processing (4)
README.mdcommon/httpx/httpx.gocommon/httpx/option.gocommon/httpx/response_memory_test.go
|
@coderabbitai create an issue to track this PR |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: #!/bin/bash
set -euo pipefail
repo="projectdiscovery/httpx"
pr_number="2521"
requester="Mzack9999"
pr_url="https://github.com/projectdiscovery/httpx/pull/${pr_number}"
comment_url="$(gh api "repos/${repo}/issues/${pr_number}/comments" --paginate \
--jq '[.[] | select(.user.login == "'"${requester}"'" and (.body | contains("create an issue to track this PR")))][-1].html_url // empty')"
tmp_body="$(mktemp)"
cat > "$tmp_body" <<EOF
## Summary
Track the work proposed in PR #${pr_number} to reduce \`httpx\` memory usage by:
- removing redundant response body copies
- lowering the default response read size to **50 MB**
- preserving raw and decoded response-body behavior
- adding regression coverage for response-memory behavior and body metrics
## Rationale
PR #${pr_number} aims to reduce peak memory consumption in response handling without regressing decoding behavior, raw body access, or body-derived metrics.
## Affected areas
- \`common/httpx/httpx.go\`
- \`common/httpx/option.go\`
- \`common/httpx/response_memory_test.go\`
- \`README.md\`
## Acceptance criteria
- The default response read size is updated to **50 MB** and documentation reflects the new default.
- Redundant response body copies are removed from the response handling path.
- Raw response bytes remain available where required, including when decoded data differs from raw bytes.
- Existing or added tests cover raw body handling, decoded body handling, and word/line metric behavior.
- The change is tracked against PR #${pr_number} for review and follow-up.
## Backlinks
- Pull request: ${pr_url}
EOF
if [[ -n "${comment_url}" ]]; then
printf -- "- Requesting comment: %s\n" "${comment_url}" >> "$tmp_body"
fi
issue_url="$(gh issue create \
--repo "$repo" \
--title "Track PR #${pr_number}: reduce httpx response memory footprint" \
--body-file "$tmp_body" \
--assignee "$requester")"
echo "$issue_url"Length of output: 2100
It links back to PR |
Reduces httpx memory usage by removing redundant response body copies and lowering the default response read size to 50mb
Summary by CodeRabbit
Bug Fixes
Documentation