reduce memory by Mzack9999 · Pull Request #2521 · projectdiscovery/httpx

Mzack9999 · 2026-06-24T21:57:25Z

Reduces httpx memory usage by removing redundant response body copies and lowering the default response read size to 50mb

Summary by CodeRabbit

Bug Fixes
- Updated response body handling to keep raw and decoded content available more reliably.
- Improved word and line counting so results stay consistent across different body formats and whitespace.
Documentation
- Updated CLI help text to show the new default response body size limits.

coderabbitai · 2026-06-24T21:57:39Z

Walkthrough

The PR lowers the documented and computed default response body size to 50mb, changes HTTPX.Do to preserve undecoded raw bytes while decoding response data, and switches word and line metrics to byte-based counting. New tests cover body handling, metric equivalence, and allocation behavior.

Changes

HTTPX response body handling

Layer / File(s)	Summary
Default response size `common/httpx/option.go`, `README.md`	`DefaultMaxResponseBodySize` now initializes to `50mb`, and the CLI docs show the matching response save/read defaults.
Raw body retention and byte metrics `common/httpx/httpx.go`	`HTTPX.Do` keeps the undecoded body for `RawData` and computes `Words` and `Lines` from byte counts or sanitized text when HTML stripping is enabled.
Local body behavior tests `common/httpx/response_memory_test.go`	Local `httptest` cases cover raw versus decoded body data, empty bodies, GBK decoding, backing-array sharing, and the non-zero read-size default.
Metric equivalence and allocations `common/httpx/response_memory_test.go`	The remaining tests compare byte-based word and line counts to legacy splitting across whitespace edge cases and assert the counting path allocates zero memory.

Sequence Diagram(s)

sequenceDiagram
  participant HTTPXDo as HTTPX.Do
  participant DecodeData
  participant Response
  HTTPXDo->>DecodeData: decode respbody into Data
  DecodeData-->>HTTPXDo: decoded Data
  HTTPXDo->>Response: assign RawData from rawbody
  alt VHostStripHTML enabled
    HTTPXDo->>Response: count Words and Lines from sanitized respbody
  else VHostStripHTML disabled
    HTTPXDo->>Response: count Words and Lines from respbody bytes
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐇 I hopped through bytes by lantern glow,
kept raw carrot-crunch for the show.
Words and lines go boing in tune,
under a quiet README moon,
and tests thump softly: hop, hop, ho!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name	Status	Explanation	Resolution
Title check	❓ Inconclusive	The title is related to the change, but it is too vague to explain that response body copies and default read size were reduced.	Use a more specific title like "Reduce httpx response memory usage" or mention the lowered default response read size.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage	✅ Passed	Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch reduce-response-memory-footprint

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

🧹 Nitpick comments (1)

common/httpx/httpx.go (1)
334-338: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick win

Use strings.Count in the VHostStripHTML branch too.

Lines 334-338 still allocate separator slices via strings.Split. Since respbodystr is already materialized there, strings.Count(respbodystr, " ")+1 and strings.Count(strings.TrimSpace(respbodystr), "\n")+1 preserve the current semantics without the extra allocation.
♻️ Proposed change
 	if h.Options.VHostStripHTML {
 		respbodystr := h.htmlPolicy.Sanitize(string(respbody))
 		if respbodystr != "" {
-			resp.Words = len(strings.Split(respbodystr, " "))
-			resp.Lines = len(strings.Split(strings.TrimSpace(respbodystr), "\n"))
+			resp.Words = strings.Count(respbodystr, " ") + 1
+			resp.Lines = strings.Count(strings.TrimSpace(respbodystr), "\n") + 1
 		}
 	} else if len(respbody) > 0 {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@common/httpx/httpx.go` around lines 334 - 338, The VHostStripHTML branch in
httpx’s response counting still uses strings.Split and allocates unnecessary
slices; update the resp.Words and resp.Lines calculations in the
h.Options.VHostStripHTML block to use strings.Count on respbodystr (with the
same TrimSpace handling for lines) so it preserves the existing semantics
without extra allocations.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@common/httpx/httpx.go`:
- Around line 334-338: The VHostStripHTML branch in httpx’s response counting
still uses strings.Split and allocates unnecessary slices; update the resp.Words
and resp.Lines calculations in the h.Options.VHostStripHTML block to use
strings.Count on respbodystr (with the same TrimSpace handling for lines) so it
preserves the existing semantics without extra allocations.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9f5b139f-9ed8-4a0e-a6c8-0ff7111ef1b7

📥 Commits

Reviewing files that changed from the base of the PR and between 64e6442 and d2cb859.

📒 Files selected for processing (4)

README.md
common/httpx/httpx.go
common/httpx/option.go
common/httpx/response_memory_test.go

Mzack9999 · 2026-06-24T22:04:09Z

@coderabbitai create an issue to track this PR

coderabbitai · 2026-06-24T22:04:47Z

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

repo="projectdiscovery/httpx"
pr_number="2521"
requester="Mzack9999"
pr_url="https://github.com/projectdiscovery/httpx/pull/${pr_number}"

comment_url="$(gh api "repos/${repo}/issues/${pr_number}/comments" --paginate \
  --jq '[.[] | select(.user.login == "'"${requester}"'" and (.body | contains("create an issue to track this PR")))][-1].html_url // empty')"

tmp_body="$(mktemp)"
cat > "$tmp_body" <<EOF
## Summary
Track the work proposed in PR #${pr_number} to reduce \`httpx\` memory usage by:
- removing redundant response body copies
- lowering the default response read size to **50 MB**
- preserving raw and decoded response-body behavior
- adding regression coverage for response-memory behavior and body metrics

## Rationale
PR #${pr_number} aims to reduce peak memory consumption in response handling without regressing decoding behavior, raw body access, or body-derived metrics.

## Affected areas
- \`common/httpx/httpx.go\`
- \`common/httpx/option.go\`
- \`common/httpx/response_memory_test.go\`
- \`README.md\`

## Acceptance criteria
- The default response read size is updated to **50 MB** and documentation reflects the new default.
- Redundant response body copies are removed from the response handling path.
- Raw response bytes remain available where required, including when decoded data differs from raw bytes.
- Existing or added tests cover raw body handling, decoded body handling, and word/line metric behavior.
- The change is tracked against PR #${pr_number} for review and follow-up.

## Backlinks
- Pull request: ${pr_url}
EOF

if [[ -n "${comment_url}" ]]; then
  printf -- "- Requesting comment: %s\n" "${comment_url}" >> "$tmp_body"
fi

issue_url="$(gh issue create \
  --repo "$repo" \
  --title "Track PR #${pr_number}: reduce httpx response memory footprint" \
  --body-file "$tmp_body" \
  --assignee "$requester")"

echo "$issue_url"

Length of output: 2100

@Mzack9999 Created the tracking issue: #2522 — #2522

It links back to PR #2521 and includes the follow-up summary, affected areas, and acceptance criteria.

reduce memory

d2cb859

coderabbitai Bot reviewed Jun 24, 2026

View reviewed changes

Mzack9999 self-assigned this Jun 24, 2026

coderabbitai Bot mentioned this pull request Jun 24, 2026

Track PR #2521: reduce httpx response memory footprint #2522

Closed

Mzack9999 linked an issue Jun 24, 2026 that may be closed by this pull request

Track PR #2521: reduce httpx response memory footprint #2522

Closed

Mzack9999 requested a review from dogancanbakir June 24, 2026 22:20

dogancanbakir approved these changes Jun 25, 2026

View reviewed changes

dogancanbakir merged commit 63fb40a into dev Jun 25, 2026
15 checks passed

dogancanbakir deleted the reduce-response-memory-footprint branch June 25, 2026 13:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

reduce memory#2521

reduce memory#2521
dogancanbakir merged 1 commit into
devfrom
reduce-response-memory-footprint

Mzack9999 commented Jun 24, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 24, 2026 •

edited

Loading

❌ Failed checks (1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Mzack9999 commented Jun 24, 2026

Uh oh!

coderabbitai Bot commented Jun 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

Mzack9999 commented Jun 24, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

❌ Failed checks (1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Mzack9999 commented Jun 24, 2026

Uh oh!

coderabbitai Bot commented Jun 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Mzack9999 commented Jun 24, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 24, 2026 •

edited

Loading