Skip to content

refactor: replace Split in loops with more efficient SplitSeq#7278

Open
stringsbuilder wants to merge 1 commit into
projectdiscovery:devfrom
stringsbuilder:dev
Open

refactor: replace Split in loops with more efficient SplitSeq#7278
stringsbuilder wants to merge 1 commit into
projectdiscovery:devfrom
stringsbuilder:dev

Conversation

@stringsbuilder
Copy link
Copy Markdown

@stringsbuilder stringsbuilder commented Mar 20, 2026

Proposed changes

strings.SplitSeq (introduced in Go 1.23) returns a lazy sequence (strings.Seq), allowing gopher to iterate over tokens one by one without creating an intermediate slice.

It significantly reduces memory allocations and can improve performance for long strings.

More info: golang/go#61901

Proof

Checklist

  • Pull request is created against the dev branch
  • All checks passed (lint, unit/integration/regression tests etc.) with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Summary by CodeRabbit

  • Refactor

    • Optimized internal string-splitting operations across multiple modules for improved performance efficiency.
  • Tests

    • Updated integration and unit tests with optimized string iteration patterns to align with refactored implementations.

Signed-off-by: stringsbuilder <stringsbuilder@outlook.com>
@auto-assign auto-assign Bot requested a review from dogancanbakir March 20, 2026 11:59
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 20, 2026

Walkthrough

This pull request systematically refactors string iteration patterns across 10 files, replacing strings.Split() with strings.SplitSeq() to use sequence-based iteration instead of materializing full slices. Loop constructs have been updated correspondingly to iterate over the sequence producer directly.

Changes

Cohort / File(s) Summary
HTTP Integration Tests
cmd/integration-test/http.go
Replaced strings.Split(results, "\n") with strings.SplitSeq() in five HTTP test methods (httpRawUnsafePath, httpPaths, httpVariableDSLFunction, httpRawPathSingleSlash, httpRawUnsafePathSingleSlash), preserving per-line filtering and field parsing logic.
Configuration Parsing
pkg/catalog/config/nucleiconfig.go, pkg/catalog/config/template.go
Updated parseDebugArgs to use strings.FieldsSeq() with index-based iteration; modified IsTemplateWithRoot to replace strings.Split() with strings.SplitSeq() and adjust loop from for _, p := range to for p := range.
Package Installation & Setup
pkg/installer/template.go, pkg/tmplexec/flow/flow_executor.go
Replaced strings.Split() with strings.SplitSeq() in checksum parsing and file reading; preserved trimming and conditional append logic.
Protocol Handlers & Parsers
pkg/js/devtools/tsgen/parser.go, pkg/protocols/javascript/js.go, pkg/protocols/network/network.go
Updated port and signature parsing to use strings.SplitSeq() instead of strings.Split(); adjusted loops for sequence iteration while maintaining range validation, filtering, and deduplication logic.
Test Files
internal/runner/options_test.go, pkg/testutils/integration.go
Changed string splitting in test parsing from slice iteration to sequence iteration; updated loops in comma-separated and newline-delimited parsing with consistent strings.SplitSeq() adoption.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 The split strings now flow like streams instead of heaps,
SplitSeq makes memory use lighter and more sweet,
No cloning slices whole—just iterate on the fly,
Across each config, test, and protocol file we spy,
Efficiency blooms in patterns repeated and neat!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 71.43% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and accurately describes the main refactoring: replacing strings.Split with strings.SplitSeq throughout the codebase for better performance.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

CodeRabbit can use OpenGrep to find security vulnerabilities and bugs across 17+ programming languages.

OpenGrep is compatible with Semgrep configurations. Add an opengrep.yml or semgrep.yml configuration file to your project to enable OpenGrep analysis.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/protocols/network/network.go (1)

202-213: ⚠️ Potential issue | 🟡 Minor

Trim whitespace before parsing each port token.

Line 206 will reject a common input like port: "80, 443" because the second token still includes leading whitespace.

✂️ Suggested fix
-		for port := range strings.SplitSeq(request.Port, ",") {
+		for rawPort := range strings.SplitSeq(request.Port, ",") {
+			port := strings.TrimSpace(rawPort)
 			if port == "" {
 				continue
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/protocols/network/network.go` around lines 202 - 213, When iterating
tokens from strings.SplitSeq(request.Port, ",") trim whitespace from each token
before using it: use a local variable (e.g., tok := strings.TrimSpace(port)),
skip if tok == "", parse tok with strconv.Atoi, validate its range, and append
the trimmed token (tok) to request.ports (not the original port with
whitespace). Update the error messages to include tok where appropriate so
parsing failures reflect the trimmed value; keep references to request.Port,
strings.SplitSeq, strconv.Atoi, and request.ports.
🧹 Nitpick comments (2)
pkg/tmplexec/flow/flow_executor.go (1)

302-311: This still materializes the entire helper file before tokenization.

Line 302's io.ReadAll(reader) dominates the allocation profile here, and the substrings appended in Lines 306-310 keep that full backing string alive. If this path is part of the perf goal, scan the reader directly instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/tmplexec/flow/flow_executor.go` around lines 302 - 311, The code
currently calls io.ReadAll(reader) and then iterates over
strings.SplitSeq(string(bin), "\n"), which materializes the entire reader into
memory and keeps the large backing string alive; replace that pattern by
scanning the reader directly with a bufio.Scanner (or
bufio.Reader.ReadString/ReadBytes loop) to read lines one-by-one from reader,
trim each scanned line (strings.TrimSpace), and append non-empty lines to
values, and if necessary configure Scanner.Buffer to handle long lines; update
the logic that references reader, values, and strings.SplitSeq to use the
scanner instead.
pkg/installer/template.go (1)

568-575: There's still a per-record split allocation in this loop.

Line 570 only needs two fields, but the full split still creates a slice for every checksum entry. A two-part split would align better with the allocation-reduction goal of this PR.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/installer/template.go` around lines 568 - 575, The per-record allocation
comes from using strings.Split on each checksum line (variable v) and assigning
to tmparr; replace that with a two-field split (e.g., strings.SplitN(v, ",", 2)
or better strings.Cut(v, ",")) to avoid allocating a full slice for every entry,
then validate the second field exists before assigning into
allChecksums[tmparr[0]] = tmparr[1] (or using the two return values from
strings.Cut) while keeping the surrounding loop and trimming logic the same.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@pkg/protocols/network/network.go`:
- Around line 202-213: When iterating tokens from strings.SplitSeq(request.Port,
",") trim whitespace from each token before using it: use a local variable
(e.g., tok := strings.TrimSpace(port)), skip if tok == "", parse tok with
strconv.Atoi, validate its range, and append the trimmed token (tok) to
request.ports (not the original port with whitespace). Update the error messages
to include tok where appropriate so parsing failures reflect the trimmed value;
keep references to request.Port, strings.SplitSeq, strconv.Atoi, and
request.ports.

---

Nitpick comments:
In `@pkg/installer/template.go`:
- Around line 568-575: The per-record allocation comes from using strings.Split
on each checksum line (variable v) and assigning to tmparr; replace that with a
two-field split (e.g., strings.SplitN(v, ",", 2) or better strings.Cut(v, ","))
to avoid allocating a full slice for every entry, then validate the second field
exists before assigning into allChecksums[tmparr[0]] = tmparr[1] (or using the
two return values from strings.Cut) while keeping the surrounding loop and
trimming logic the same.

In `@pkg/tmplexec/flow/flow_executor.go`:
- Around line 302-311: The code currently calls io.ReadAll(reader) and then
iterates over strings.SplitSeq(string(bin), "\n"), which materializes the entire
reader into memory and keeps the large backing string alive; replace that
pattern by scanning the reader directly with a bufio.Scanner (or
bufio.Reader.ReadString/ReadBytes loop) to read lines one-by-one from reader,
trim each scanned line (strings.TrimSpace), and append non-empty lines to
values, and if necessary configure Scanner.Buffer to handle long lines; update
the logic that references reader, values, and strings.SplitSeq to use the
scanner instead.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bb4462ab-2d3f-4a47-917f-e710b053336f

📥 Commits

Reviewing files that changed from the base of the PR and between 0ae6d90 and 90cd656.

📒 Files selected for processing (10)
  • cmd/integration-test/http.go
  • internal/runner/options_test.go
  • pkg/catalog/config/nucleiconfig.go
  • pkg/catalog/config/template.go
  • pkg/installer/template.go
  • pkg/js/devtools/tsgen/parser.go
  • pkg/protocols/javascript/js.go
  • pkg/protocols/network/network.go
  • pkg/testutils/integration.go
  • pkg/tmplexec/flow/flow_executor.go

@neo-by-projectdiscovery-dev
Copy link
Copy Markdown

neo-by-projectdiscovery-dev Bot commented Mar 20, 2026

Neo - PR Security Review

No security issues found

Highlights

  • Refactors strings.Split/Fields to lazy iterator-based strings.SplitSeq/FieldsSeq (Go 1.23)
  • Changes span 10 files: port parsing, path validation, checksum parsing, test utilities, and config parsing
  • All security validations (port validation, path traversal checks) remain functionally identical
Hardening Notes
  • The lazy evaluation in SplitSeq defers memory allocation but does not change iteration logic
  • Port validation in network.go and js.go correctly validates each port with the same checks as before
  • Path traversal prevention in template.go still checks each path component against excluded directories
  • Checksum parsing in installer.go processes trusted local files with the same validation
  • The refactoring reduces memory allocations, which is beneficial for DoS prevention

Comment @pdneo help for available commands. · Open in Neo

@stringsbuilder
Copy link
Copy Markdown
Author

@Mzack9999 @dogancanbakir Hi, Could you please review this PR at your convenience? Thank you very much.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant