Conversation
- Dispatch shell link rewriting between streaming and buffered paths based on response size - Reuse buffers and reduce URL construction allocations in proxy handlers - Add nest benchmarks and align extractParts compatibility expectations with the current contract
- Cache route handlers, simplify NoRoute path normalization, and reduce matcher/header allocations - Honor configured transport pool limits in auto mode and add hotpath regression benchmarks/tests
Walkthrough
This change refactors and centralizes construction of proxy target URLs, adds a link-processing pipeline that chooses streaming or buffered handling based on response body size, introduces byte-level URL matching and rewriting, and adjusts or extends several matchers, the HTTP client, and tests/benchmarks (no exported API changes).
Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Celebration poem for the changes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches
📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Code Review
This pull request optimizes link processing by introducing a dual-path strategy: streaming for large response bodies and buffered processing using a sync.Pool for smaller ones. It also centralizes URL construction logic into a new buildProxyPath function and implements byte-level matching to minimize allocations. Review feedback identifies a bug where the written return parameter is updated asynchronously after the function returns, rendering it useless to the caller. Additionally, suggestions were made to further reduce allocations in the streaming path by adopting byte-based processing and avoiding redundant byte slice conversions during copy operations.
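The buffered path's sync.Pool usage mentioned above follows a standard Go pattern. A minimal, self-contained sketch of that pattern (the name processBuffered is illustrative, not from the PR):

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufferPool hands out reusable *bytes.Buffer values, trading a Reset per
// checkout for avoiding a fresh allocation on every small response body.
var bufferPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// processBuffered copies input through a pooled buffer and returns the result.
func processBuffered(input []byte) string {
	buf := bufferPool.Get().(*bytes.Buffer)
	buf.Reset()               // pooled buffers keep old contents; always reset
	defer bufferPool.Put(buf) // return the buffer for reuse

	buf.Write(input)
	return buf.String()
}

func main() {
	fmt.Println(processBuffered([]byte("hello")))
}
```

The Reset-on-Get ordering matters: a buffer returned to the pool retains its contents, so forgetting the reset leaks data between requests.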
written := 0
written += copy(newURL[written:], prefixHTTPS)
written += copy(newURL[written:], host)
written += copy(newURL[written:], []byte("/"))
The expression []byte("/") creates a new byte slice allocation every time this function is called. Since the built-in copy function supports copying from a string to a byte slice as a special case, you can use a string literal directly to avoid this unnecessary allocation.
- written += copy(newURL[written:], []byte("/"))
+ written += copy(newURL[written:], "/")
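A runnable illustration of the point above, assuming nothing beyond the standard library: the built-in copy accepts a string source, so a literal separator needs no []byte conversion. The helper name joinHostPath is hypothetical.

```go
package main

import "fmt"

// joinHostPath builds "https://<host>/<path>" with a single allocation,
// copying string sources directly into the destination byte slice.
func joinHostPath(host, path string) string {
	out := make([]byte, len("https://")+len(host)+1+len(path))
	n := 0
	n += copy(out[n:], "https://") // copy accepts a string source
	n += copy(out[n:], host)
	n += copy(out[n:], "/") // no []byte("/") allocation needed
	n += copy(out[n:], path)
	return string(out[:n])
}

func main() {
	fmt.Println(joinHostPath("example.com", "owner/repo"))
}
```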
for {
	line, readErr := bufReader.ReadString('\n')
	if readErr != nil && readErr != io.EOF {
		err = fmt.Errorf("read error: %w", readErr)
		return
	}
	if len(line) > 0 {
		modifiedLine := urlPattern.ReplaceAllStringFunc(line, func(originalURL string) string {
			return modifyURL(originalURL, host, cfg)
		})
		n, writeErr := bufWriter.WriteString(modifiedLine)
		written += int64(n)
		if writeErr != nil {
			err = fmt.Errorf("write error: %w", writeErr)
			return
		}
	}
	if readErr == io.EOF {
		break
	}
}
To further reduce allocations in the streaming path, consider using byte-based processing similar to the buffered path. By using bufReader.ReadBytes('\n') and urlPattern.ReplaceAllFunc with modifyURLBytes, you can avoid multiple string conversions for each line and every URL match found.
hostBytes := []byte(host)
for {
line, readErr := bufReader.ReadBytes('\n')
if readErr != nil && readErr != io.EOF {
err = fmt.Errorf("read error: %w", readErr)
return
}
if len(line) > 0 {
modifiedLine := urlPattern.ReplaceAllFunc(line, func(originalURL []byte) []byte {
return modifyURLBytes(originalURL, hostBytes, cfg)
})
n, writeErr := bufWriter.Write(modifiedLine)
written += int64(n)
if writeErr != nil {
err = fmt.Errorf("write error: %w", writeErr)
return
}
}
if readErr == io.EOF {
break
}
}

return readerOut, written, nil
}

func processLinks(input io.ReadCloser, host string, cfg *config.Config, c *touka.Context, bodySize int) (readerOut io.Reader, written int64, err error) {
The written named return parameter is modified inside asynchronous goroutines (in both processLinksStreamingInternal and processLinksBufferedInternal) after the processLinks function has already returned. Consequently, the value received by the caller will always be 0, making this return value misleading. If tracking the total bytes written is required, it should be implemented by wrapping the returned io.Reader or using a callback mechanism.
Actionable comments posted: 2
🧹 Nitpick comments (3)
proxy/routing.go (1)
53-55: The "blob" branch in the switch is unreachable.
Lines 49-51 already convert matcher == "blob" to "raw", so the "blob" case here can never match and is dead code. ♻️ Suggested: remove the unreachable case

switch matcher {
- case "releases", "blob", "raw", "gist", "api":
+ case "releases", "raw", "gist", "api":
	ChunkedProxyRequest(ctx, c, rawPath, cfg, matcher)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@proxy/routing.go` around lines 53 - 55, The "blob" case in the switch on matcher is dead code (matcher is already normalized to "raw" earlier); remove the redundant "blob" entry from the switch that calls ChunkedProxyRequest(ctx, c, rawPath, cfg, matcher) so the switch only contains the reachable cases ("releases", "raw", "gist", "api"), and verify no other logic depends on matcher being "blob" before or after the switch (adjust any upstream normalization if necessary).
proxy/nest.go (1)
116-122: The single-byte write can be optimized.
[]byte("/") allocates a slice for a single byte; a direct byte assignment avoids the allocation. ♻️ Suggested optimization

newURL := make([]byte, len(prefixHTTPS)+len(host)+1+len(trimmed))
written := 0
written += copy(newURL[written:], prefixHTTPS)
written += copy(newURL[written:], host)
- written += copy(newURL[written:], []byte("/"))
+ newURL[written] = '/'
+ written++
copy(newURL[written:], trimmed)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@proxy/nest.go` around lines 116 - 122, The code builds newURL with multiple copy calls and allocates a single-byte slice via []byte("/") which is wasteful; in the block that constructs newURL (variables newURL, prefixHTTPS, host, trimmed) replace the []byte("/") copy with a direct byte assignment to newURL[written] = '/' and increment written by 1, then proceed to copy trimmed into newURL[written:], ensuring written is updated correctly to avoid the extra one-byte allocation.
proxy/handler.go (1)
81-83: Same as routing.go: the "blob" branch is unreachable. For consistency with proxy/routing.go, remove the unreachable "blob" case. 🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@proxy/handler.go` around lines 81 - 83, remove the unreachable "blob" branch from the switch matcher cases ("releases", "blob", "raw", "gist", "api") to stay consistent with routing.go; also make sure the matcher passed to ChunkedProxyRequest(ctx, c, rawPath, cfg, matcher) never contains "blob" (i.e. change the case to "releases", "raw", "gist", "api"), and run the relevant tests and call sites to verify nothing depends on the removed branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@proxy/nest.go`:
- Around line 197-238: processLinksBufferedInternal has a data race because the
goroutine assigns to the outer variable written and the function returns
immediately; fix by synchronizing the goroutine's result before returning (or by
changing the return design to not expose written). Concretely, in
processLinksBufferedInternal create a result channel (e.g. struct{written int64;
err error}) or use a sync.WaitGroup + local result vars, send the final written
and err from inside the goroutine to that channel (or set locals while the main
goroutine waits), then receive/wait before returning so that written and err are
set deterministically; reference symbols: processLinksBufferedInternal, written,
err, pipeWriter, goroutine where n, err are computed. Ensure the same
return-value approach is used as in processLinksStreamingInternal for
consistency.
- Around line 132-185: The returned variable "written" in
processLinksStreamingInternal is being mutated inside the spawned goroutine
(written += int64(n)) while the function returns immediately, causing a data
race and always returning 0; to fix either remove "written" from the function
signature and all call sites (simplest) or make its value safe to observe by
synchronizing with the goroutine (e.g., use a channel or a sync.WaitGroup and
only return the accumulated value after the goroutine finishes), ensuring you
update the code around processLinksStreamingInternal and any callers
accordingly; reference symbols: processLinksStreamingInternal, written, the
anonymous goroutine, and bufWriter.WriteString where written is updated.
---
Nitpick comments:
In `@proxy/handler.go`:
- Around line 81-83: In proxy/handler.go's switch matcher branch (case "releases",
"blob", "raw", "gist", "api"), remove the unreachable "blob" case for consistency
with routing.go; also ensure the matcher passed to
ChunkedProxyRequest(ctx, c, rawPath, cfg, matcher) never contains "blob" (i.e. change the
case to "releases", "raw", "gist", "api"), and run the relevant tests and call sites to verify nothing depends on the removed branch.
In `@proxy/nest.go`:
- Around line 116-122: The code builds newURL with multiple copy calls and
allocates a single-byte slice via []byte("/") which is wasteful; in the block
that constructs newURL (variables newURL, prefixHTTPS, host, trimmed) replace
the []byte("/") copy with a direct byte assignment to newURL[written] = '/' and
increment written by 1, then proceed to copy trimmed into newURL[written:],
ensuring written is updated correctly to avoid the extra one-byte allocation.
In `@proxy/routing.go`:
- Around line 53-55: The "blob" case in the switch on matcher is dead code
(matcher is already normalized to "raw" earlier); remove the redundant "blob"
entry from the switch that calls ChunkedProxyRequest(ctx, c, rawPath, cfg,
matcher) so the switch only contains the reachable cases ("releases", "raw",
"gist", "api"), and verify no other logic depends on matcher being "blob" before
or after the switch (adjust any upstream normalization if necessary).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8e24e39a-f80e-47ea-9526-7000c35fdc77
📒 Files selected for processing (6)
proxy/chunkreq.go, proxy/handler.go, proxy/matcher_test.go, proxy/nest.go, proxy/nest_bench_test.go, proxy/routing.go
func processLinksStreamingInternal(input io.ReadCloser, host string, cfg *config.Config, c *touka.Context) (readerOut io.Reader, written int64, err error) {
	pipeReader, pipeWriter := io.Pipe()
	readerOut = pipeReader

	go func() {
		defer func() {
			if err != nil {
				_ = pipeWriter.CloseWithError(err)
				return
			}
			_ = pipeWriter.Close()
		}()

		defer func() {
			if closeErr := input.Close(); closeErr != nil && c != nil {
				c.Errorf("input close failed: %v", closeErr)
			}
		}()

		bufReader := bufio.NewReader(input)
		bufWriter := bufio.NewWriterSize(pipeWriter, 4096)
		defer func() {
			if flushErr := bufWriter.Flush(); flushErr != nil && err == nil {
				err = fmt.Errorf("flush writer failed: %w", flushErr)
			}
		}()

		for {
			line, readErr := bufReader.ReadString('\n')
			if readErr != nil && readErr != io.EOF {
				err = fmt.Errorf("read error: %w", readErr)
				return
			}
			if len(line) > 0 {
				modifiedLine := urlPattern.ReplaceAllStringFunc(line, func(originalURL string) string {
					return modifyURL(originalURL, host, cfg)
				})
				n, writeErr := bufWriter.WriteString(modifiedLine)
				written += int64(n)
				if writeErr != nil {
					err = fmt.Errorf("write error: %w", writeErr)
					return
				}
			}
			if readErr == io.EOF {
				break
			}
		}
	}()

	return readerOut, written, nil
}
The written return value has a data race.
written is accumulated inside the goroutine (line 171), but the function returns before the goroutine runs (line 184), which means:
- the return value is always 0
- there is a data race (although the current caller discards the value)
Suggestions:
- if written is not needed, remove the return value
- if it is needed, deliver it via a channel or read it after the reader has been fully consumed
♻️ If written is not needed, simplify the signature

-func processLinksStreamingInternal(input io.ReadCloser, host string, cfg *config.Config, c *touka.Context) (readerOut io.Reader, written int64, err error) {
+func processLinksStreamingInternal(input io.ReadCloser, host string, cfg *config.Config, c *touka.Context) (readerOut io.Reader, err error) {

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@proxy/nest.go` around lines 132 - 185, The returned variable "written" in
processLinksStreamingInternal is being mutated inside the spawned goroutine
(written += int64(n)) while the function returns immediately, causing a data
race and always returning 0; to fix either remove "written" from the function
signature and all call sites (simplest) or make its value safe to observe by
synchronizing with the goroutine (e.g., use a channel or a sync.WaitGroup and
only return the accumulated value after the goroutine finishes), ensuring you
update the code around processLinksStreamingInternal and any callers
accordingly; reference symbols: processLinksStreamingInternal, written, the
anonymous goroutine, and bufWriter.WriteString where written is updated.
func processLinksBufferedInternal(input io.ReadCloser, host string, cfg *config.Config, c *touka.Context) (readerOut io.Reader, written int64, err error) {
	pipeReader, pipeWriter := io.Pipe()
	readerOut = pipeReader
	hostBytes := []byte(host)

	go func() {
		defer func() {
			if closeErr := input.Close(); closeErr != nil && c != nil {
				c.Errorf("input close failed: %v", closeErr)
			}
		}()
		defer func() {
			if err != nil {
				_ = pipeWriter.CloseWithError(err)
				return
			}
			_ = pipeWriter.Close()
		}()

		buf := bufferPool.Get().(*bytes.Buffer)
		buf.Reset()
		defer bufferPool.Put(buf)

		if _, err = buf.ReadFrom(input); err != nil {
			err = fmt.Errorf("reading input failed: %w", err)
			return
		}

		modifiedBytes := urlPattern.ReplaceAllFunc(buf.Bytes(), func(originalURL []byte) []byte {
			return modifyURLBytes(originalURL, hostBytes, cfg)
		})

		var n int
		n, err = pipeWriter.Write(modifiedBytes)
		written = int64(n)
		if err != nil {
			err = fmt.Errorf("writing to pipe failed: %w", err)
		}
	}()

	return readerOut, written, nil
}
The same written data race exists here.
As in processLinksStreamingInternal, line 231 assigns written inside the goroutine while line 237 returns immediately. The return-value design of the two functions should be unified.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@proxy/nest.go` around lines 197 - 238, processLinksBufferedInternal has a
data race because the goroutine assigns to the outer variable written and the
function returns immediately; fix by synchronizing the goroutine's result before
returning (or by changing the return design to not expose written). Concretely,
in processLinksBufferedInternal create a result channel (e.g. struct{written
int64; err error}) or use a sync.WaitGroup + local result vars, send the final
written and err from inside the goroutine to that channel (or set locals while
the main goroutine waits), then receive/wait before returning so that written
and err are set deterministically; reference symbols:
processLinksBufferedInternal, written, err, pipeWriter, goroutine where n, err
are computed. Ensure the same return-value approach is used as in
processLinksStreamingInternal for consistency.
- Canonicalize filtered header deny-lists so Cloudflare and CDN headers are still removed - Normalize incomplete API repo paths to stable owner-level matcher output regardless of trailing slash or query - Add regression tests covering header canonicalization and incomplete API repo path parsing
perf(proxy): optimize hot request paths
/gemini review
Code Review
This pull request implements several performance optimizations across the proxy logic, including refactoring route handlers to reduce redundant initializations, replacing regex with string manipulation for path parsing, and optimizing header filtering. Significant improvements were made to link processing in proxy/nest.go by introducing a buffered processing strategy for small payloads and utilizing sync.Pool for buffer reuse. Feedback suggests further optimizing the link modification logic by pre-calculating URL prefixes and switching to byte-based reading to minimize allocations in the hot path.
func modifyURLBytes(url []byte, host []byte, cfg *config.Config) []byte {
	if !EditorMatcherBytes(url, cfg) {
		return url
	}

	var trimmed []byte
	if bytes.HasPrefix(url, prefixHTTPS) {
		trimmed = url[len(prefixHTTPS):]
	} else if bytes.HasPrefix(url, prefixHTTP) {
		trimmed = url[len(prefixHTTP):]
	} else {
		trimmed = url
	}

	newURL := make([]byte, len(prefixHTTPS)+len(host)+1+len(trimmed))
	written := 0
	written += copy(newURL[written:], prefixHTTPS)
	written += copy(newURL[written:], host)
	written += copy(newURL[written:], []byte("/"))
	copy(newURL[written:], trimmed)

	return newURL
}
The modifyURLBytes function can be optimized by accepting a pre-calculated prefix (e.g., https://<host>/) instead of just the host string. This avoids repeated allocations and byte slice constructions for the same prefix on every match within a request. Additionally, []byte("/") is an unnecessary allocation; using a string literal in copy or direct byte assignment is more efficient.
func modifyURLBytes(url []byte, prefix []byte, cfg *config.Config) []byte {
if !EditorMatcherBytes(url, cfg) {
return url
}
var trimmed []byte
if bytes.HasPrefix(url, prefixHTTPS) {
trimmed = url[len(prefixHTTPS):]
} else if bytes.HasPrefix(url, prefixHTTP) {
trimmed = url[len(prefixHTTP):]
} else {
trimmed = url
}
newURL := make([]byte, len(prefix)+len(trimmed))
copy(newURL, prefix)
copy(newURL[len(prefix):], trimmed)
return newURL
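To make the byte-based rewriting concrete, here is a self-contained sketch; the regex and prefix are simplified stand-ins for the project's urlPattern and prefixHTTPS, and rewrite is a hypothetical helper:

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
)

// urlRe is a simplified URL matcher standing in for the PR's urlPattern.
var urlRe = regexp.MustCompile(`https?://[^\s"']+`)

// rewrite replaces every matched URL with prefix + scheme-stripped URL,
// working entirely on byte slices so no per-match string conversions occur.
func rewrite(line, prefix []byte) []byte {
	return urlRe.ReplaceAllFunc(line, func(u []byte) []byte {
		trimmed := u
		if i := bytes.Index(u, []byte("://")); i >= 0 {
			trimmed = u[i+3:] // drop "http://" or "https://"
		}
		out := make([]byte, 0, len(prefix)+len(trimmed))
		out = append(out, prefix...)
		return append(out, trimmed...)
	})
}

func main() {
	prefix := []byte("https://proxy.example/")
	fmt.Printf("%s\n", rewrite([]byte("see https://github.com/x/y"), prefix))
}
```

Building the prefix once per request, as the comment suggests, keeps the per-match closure allocation-free apart from the output slice itself.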
}

for {
	line, readErr := bufReader.ReadString('\n')
	if readErr != nil && readErr != io.EOF {
		err = fmt.Errorf("read error: %w", readErr)
		return
	}
	if len(line) > 0 {
		modifiedLine := urlPattern.ReplaceAllStringFunc(line, func(originalURL string) string {
			return modifyURL(originalURL, host, cfg)
		})
		n, writeErr := bufWriter.WriteString(modifiedLine)
		written += int64(n)
		if writeErr != nil {
			err = fmt.Errorf("write error: %w", writeErr)
			return
		}
	}
	if readErr == io.EOF {
		break
	}
}
In processLinksStreamingInternal, switching from ReadString to ReadBytes and using ReplaceAllFunc with the byte-based modifyURLBytes would significantly reduce allocations by avoiding string conversions for every line and match. Pre-calculating the proxy prefix outside the loop further optimizes the hot path and aligns with the performance goals of this PR.
hostBytes := []byte(host)
prefix := make([]byte, len(prefixHTTPS)+len(hostBytes)+1)
nPrefix := copy(prefix, prefixHTTPS)
nPrefix += copy(prefix[nPrefix:], hostBytes)
prefix[nPrefix] = '/'
for {
line, readErr := bufReader.ReadBytes('\n')
if readErr != nil && readErr != io.EOF {
err = fmt.Errorf("read error: %w", readErr)
return
}
if len(line) > 0 {
modifiedLine := urlPattern.ReplaceAllFunc(line, func(originalURL []byte) []byte {
return modifyURLBytes(originalURL, prefix, cfg)
})
n, writeErr := bufWriter.Write(modifiedLine)
written += int64(n)
if writeErr != nil {
err = fmt.Errorf("write error: %w", writeErr)
return
}
}
if readErr == io.EOF {
break
}
}

hostBytes := []byte(host)

go func() {
	defer func() {
		if closeErr := input.Close(); closeErr != nil && c != nil {
			c.Errorf("input close failed: %v", closeErr)
		}
	}()
	defer func() {
		if err != nil {
			_ = pipeWriter.CloseWithError(err)
			return
		}
		_ = pipeWriter.Close()
	}()

	buf := bufferPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufferPool.Put(buf)

	if _, err = buf.ReadFrom(input); err != nil {
		err = fmt.Errorf("reading input failed: %w", err)
		return
	}

	modifiedBytes := urlPattern.ReplaceAllFunc(buf.Bytes(), func(originalURL []byte) []byte {
		return modifyURLBytes(originalURL, hostBytes, cfg)
	})
Update processLinksBufferedInternal to pre-calculate the proxy prefix and pass it to the optimized modifyURLBytes function to avoid redundant work in the replacement callback.
hostBytes := []byte(host)
prefix := make([]byte, len(prefixHTTPS)+len(hostBytes)+1)
nPrefix := copy(prefix, prefixHTTPS)
nPrefix += copy(prefix[nPrefix:], hostBytes)
prefix[nPrefix] = '/'
go func() {
defer func() {
if closeErr := input.Close(); closeErr != nil && c != nil {
c.Errorf("input close failed: %v", closeErr)
}
}()
defer func() {
if err != nil {
_ = pipeWriter.CloseWithError(err)
return
}
_ = pipeWriter.Close()
}()
buf := bufferPool.Get().(*bytes.Buffer)
buf.Reset()
defer bufferPool.Put(buf)
if _, err = buf.ReadFrom(input); err != nil {
err = fmt.Errorf("reading input failed: %w", err)
return
}
modifiedBytes := urlPattern.ReplaceAllFunc(buf.Bytes(), func(originalURL []byte) []byte {
return modifyURLBytes(originalURL, prefix, cfg)
})
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
proxy/httpc.go (2)
90-97: ⚠️ Potential issue | 🟡 Minor
initGitHTTPClient's advanced mode is also missing IdleConnTimeout. To stay consistent with initHTTPClient, the advanced mode for gittr should set IdleConnTimeout as well. 🔧 Suggested: add IdleConnTimeout

case "advanced":
	gittr = &http.Transport{
		MaxIdleConns:        cfg.Httpc.MaxIdleConns,
		MaxConnsPerHost:     cfg.Httpc.MaxConnsPerHost,
		MaxIdleConnsPerHost: cfg.Httpc.MaxIdleConnsPerHost,
+		IdleConnTimeout:     30 * time.Second,
		WriteBufferSize:     32 * 1024, // 32KB
		ReadBufferSize:      32 * 1024, // 32KB
	}

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@proxy/httpc.go` around lines 90 - 97, The advanced branch in initGitHTTPClient is missing IdleConnTimeout on the gittr transport; update the "advanced" case where gittr is created to set IdleConnTimeout using the same cfg.Httpc.IdleConnTimeout value used in initHTTPClient so the transport has a proper idle timeout configured.
50-58: ⚠️ Potential issue | 🟡 Minor
advanced mode is missing the IdleConnTimeout setting.
auto mode (line 45) sets IdleConnTimeout: 30 * time.Second, but advanced mode leaves this field unset, so idle connections in advanced mode never time out and can leak resources. 🔧 Suggested: set IdleConnTimeout in advanced mode as well

case "advanced":
	tr = &http.Transport{
		MaxIdleConns:        cfg.Httpc.MaxIdleConns,
		MaxConnsPerHost:     cfg.Httpc.MaxConnsPerHost,
		MaxIdleConnsPerHost: cfg.Httpc.MaxIdleConnsPerHost,
+		IdleConnTimeout:     30 * time.Second,
		WriteBufferSize:     32 * 1024, // 32KB
		ReadBufferSize:      32 * 1024, // 32KB
		Protocols:           proTolcols,
	}

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@proxy/httpc.go` around lines 50 - 58, In the "advanced" case where the http.Transport is constructed, add the IdleConnTimeout field (e.g., IdleConnTimeout: cfg.Httpc.IdleConnTimeout or the same 30 * time.Second used in the "auto" branch) to avoid never-expiring idle connections; update the transport creation in the "advanced" case (the http.Transport literal) to include IdleConnTimeout so it matches the idle timeout behavior from the "auto" branch.
🧹 Nitpick comments (2)
proxy/reqheader.go (1)
75-80: defaultHeaders is assigned twice; the duplicate can be removed.
The assignment to defaultHeaders inside init() is identical to its initial value at lines 45-50 and is redundant. ♻️ Suggested: delete the redundant assignment

func init() {
	reqHeadersToRemove = canonicalizeHeaderSet(reqHeadersToRemove)
	cloneHeadersToRemove = canonicalizeHeaderSet(cloneHeadersToRemove)
	respHeadersToRemove = canonicalizeHeaderSet(respHeadersToRemove)
-	defaultHeaders = map[string]string{
-		"Accept":            "*/*",
-		"Accept-Encoding":   "",
-		"Transfer-Encoding": "chunked",
-		"User-Agent":        "GHProxy/1.0",
-	}
}

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@proxy/reqheader.go` around lines 75 - 80, The init() function redundantly reassigns the defaultHeaders map which is already initialized at declaration; remove the duplicate assignment inside init() so only the original defaultHeaders declaration (keys "Accept","Accept-Encoding","Transfer-Encoding","User-Agent") remains, keeping any other logic in init() intact and ensuring references to defaultHeaders elsewhere still use the declared variable.
proxy/handler.go (1)
85-94: The "blob" case in the switch is unreachable.
Lines 81-83 convert matcher == "blob" to "raw" before the switch is entered, so the "blob" branch at line 86 can never execute. ♻️ Suggested: remove the unreachable case

switch matcher {
- case "releases", "blob", "raw", "gist", "api":
+ case "releases", "raw", "gist", "api":
	ChunkedProxyRequest(ctx, c, rawPath, cfg, matcher)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@proxy/handler.go` around lines 85 - 94, The "blob" case in the switch is unreachable because matcher is normalized to "raw" earlier; remove "blob" from the case list so ChunkedProxyRequest is only invoked for the real reachable values (e.g., "releases", "raw", "gist", "api") and keep the existing calls to ChunkedProxyRequest(ctx, c, rawPath, cfg, matcher), GitReq(ctx, c, rawPath, cfg, "git"), and the default ErrorPage/Git log untouched; update the case line in the switch (the one that currently reads case "releases", "blob", "raw", "gist", "api":) to omit "blob" to eliminate the dead branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@proxy/reqheader.go`:
- Around line 63-74: The local headersToRemove map in proxy/gitreq.go is not
canonicalized while reqheader.go's init uses canonicalizeHeaderSet on global
maps (respHeadersToRemove), causing inconsistent header matching; fix by
applying the same canonicalization to the local headersToRemove (call
canonicalizeHeaderSet(headersToRemove) when the map is created or during package
init) so comparisons use http.CanonicalHeaderKey consistently, and do not
replace or merge with respHeadersToRemove—only transform the local
headersToRemove using the canonicalizeHeaderSet helper.
---
Outside diff comments:
In `@proxy/httpc.go`:
- Around line 90-97: The advanced branch in initGitHTTPClient is missing
IdleConnTimeout on the gittr transport; update the "advanced" case where gittr
is created to set IdleConnTimeout using the same cfg.Httpc.IdleConnTimeout value
used in initHTTPClient so the transport has a proper idle timeout configured.
- Around line 50-58: In the "advanced" case where the http.Transport is
constructed, add the IdleConnTimeout field (e.g., IdleConnTimeout:
cfg.Httpc.IdleConnTimeout or the same 30 * time.Second used in the "auto"
branch) to avoid never-expiring idle connections; update the transport creation
in the "advanced" case (the http.Transport literal) to include IdleConnTimeout
so it matches the idle timeout behavior from the "auto" branch.
---
Nitpick comments:
In `@proxy/handler.go`:
- Around line 85-94: The "blob" case in the switch is unreachable because
matcher is normalized to "raw" earlier; remove "blob" from the case list so
ChunkedProxyRequest is only invoked for the real reachable values (e.g.,
"releases", "raw", "gist", "api") and keep the existing calls to
ChunkedProxyRequest(ctx, c, rawPath, cfg, matcher), GitReq(ctx, c, rawPath, cfg,
"git"), and the default ErrorPage/Git log untouched; update the case line in the
switch (the one that currently reads case "releases", "blob", "raw", "gist",
"api":) to omit "blob" to eliminate the dead branch.
In `@proxy/reqheader.go`:
- Around line 75-80: The init() function redundantly reassigns the
defaultHeaders map which is already initialized at declaration; remove the
duplicate assignment inside init() so only the original defaultHeaders
declaration (keys "Accept","Accept-Encoding","Transfer-Encoding","User-Agent")
remains, keeping any other logic in init() intact and ensuring references to
defaultHeaders elsewhere still use the declared variable.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 22bf6113-9a2d-48d7-a99f-5b4e8f435545
📒 Files selected for processing (9)
- main.go
- proxy/handler.go
- proxy/hotpath_test.go
- proxy/httpc.go
- proxy/httpc_test.go
- proxy/match.go
- proxy/matcher_test.go
- proxy/perf_compare_test.go
- proxy/reqheader.go
✅ Files skipped from review due to trivial changes (1)
- proxy/matcher_test.go
func canonicalizeHeaderSet(headers map[string]struct{}) map[string]struct{} {
	canonicalized := make(map[string]struct{}, len(headers))
	for key := range headers {
		canonicalized[http.CanonicalHeaderKey(key)] = struct{}{}
	}
	return canonicalized
}

func init() {
	reqHeadersToRemove = canonicalizeHeaderSet(reqHeadersToRemove)
	cloneHeadersToRemove = canonicalizeHeaderSet(cloneHeadersToRemove)
	respHeadersToRemove = canonicalizeHeaderSet(respHeadersToRemove)
}
🧩 Analysis chain

🏁 Script executed:
#!/bin/bash
# Check how headersToRemove is used in gitreq.go
rg -n -A 10 'headersToRemove' --type=go
Repository: WJQSERVER-STUDIO/ghproxy
Length of output: 826
🏁 Script executed:
# Find the definition of respHeadersToRemove in reqheader.go
rg -n -B 2 -A 8 'respHeadersToRemove\s*=' proxy/reqheader.go | head -30
Length of output: 804
🏁 Script executed:
# Also check what canonicalizeHeaderSet returns and how respHeadersToRemove is used
rg -n 'respHeadersToRemove' proxy/reqheader.go
Length of output: 183
🏁 Script executed:
# Verify the behavior of http.Header.Del() with Go's documentation or usage
rg -n 'resp\.Header\.Del' proxy/gitreq.go
Length of output: 97
🏁 Script executed:
# Check if there are comments explaining why gitreq.go uses only those specific headers
rg -n -B 5 'headersToRemove := map' proxy/gitreq.go
# Also check the full context of the gitreq.go function to understand intent
rg -n -B 15 'headersToRemove :=' proxy/gitreq.go | head -40
Length of output: 882
🏁 Script executed:
# Verify if X-Github-Request-Id, X-Timer, X-Served-By, X-Fastly-Request-Id should also be removed
rg -n 'X-Github-Request-Id|X-Timer|X-Served-By|X-Fastly-Request-Id' proxy/
Length of output: 295
Header canonicalization is inconsistent, but the suggested fix needs careful evaluation
canonicalizeHeaderSet converts the denylist keys to http.CanonicalHeaderKey form at init time, which is the right approach. However, the local headersToRemove map in proxy/gitreq.go:101-109 contains non-canonicalized keys and removes only 3 HTTP security-related headers (Content-Security-Policy, Referrer-Policy, Strict-Transport-Security).
Key finding: the global respHeadersToRemove in reqheader.go contains 7 headers; besides the 3 above, it also includes 4 CDN/GitHub tracking headers (X-Github-Request-Id, X-Timer, X-Served-By, X-Fastly-Request-Id). The two maps serve different purposes, so replacing the local map with respHeadersToRemove outright would change behavior by stripping the extra headers.
To unify the canonicalization strategy, gitreq.go should also canonicalize its headersToRemove, either where the map is created or separately in an init function. Simply merging the two maps is not recommended.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@proxy/reqheader.go` around lines 63 - 74, The local headersToRemove map in
proxy/gitreq.go is not canonicalized while reqheader.go's init uses
canonicalizeHeaderSet on global maps (respHeadersToRemove), causing inconsistent
header matching; fix by applying the same canonicalization to the local
headersToRemove (call canonicalizeHeaderSet(headersToRemove) when the map is
created or during package init) so comparisons use http.CanonicalHeaderKey
consistently, and do not replace or merge with respHeadersToRemove—only
transform the local headersToRemove using the canonicalizeHeaderSet helper.
- Dispatch shell link rewriting between streaming and buffered paths based on response size
- Reuse buffers and reduce URL construction allocations in proxy handlers
- Add nest benchmarks and align extractParts compatibility expectations with the current contract
Summary by CodeRabbit
Release Notes
Bug Fixes
Performance
Tests