Skip to content

[refactor] Semantic Function Clustering: Outliers and Organizational Improvements #6149

@github-actions

Description

@github-actions

Overview

Automated semantic function clustering analysis of 124 non-test Go files in internal/ cataloging 811 functions across 14 packages. The codebase is generally well-organized with clear separation by feature, but several outlier functions and one catch-all file were identified as candidates for improvement.

Summary: 124 files analyzed · 811 functions cataloged · 14 function clusters identified · 3 outlier functions · 1 over-broad helper file

Full Report

Function Inventory

By Package

Package Files Primary Purpose
internal/difc 9 Information Flow Control labels, evaluation, agent registry
internal/server 16 HTTP server, routing, sessions, middleware
internal/logger 13 Structured/file/JSONL/markdown logging
internal/config 11 Configuration loading, validation, guard policies
internal/mcp 9 MCP protocol types, connections, HTTP transport
internal/guard 9 Security guards (noop, wasm, write-sink)
internal/cmd 10 CLI commands and flag registration
internal/launcher 4 Backend process management
internal/proxy 7 GitHub API proxy with DIFC enforcement
internal/envutil 4 Environment variable utilities
internal/strutil 4 String formatting utilities
internal/httputil 3 Shared HTTP helpers
internal/tracing 3 OpenTelemetry setup
internal/sys 2 Container/Docker detection

Key Clusters Identified

  • Validation cluster (~30 functions): config/validation*.go, config/guard_policy_validation.go, auth/header.go:ValidateAPIKey, sys/docker.go:ValidateContainerID, server/session.go:extractAndValidateSession
  • Parse cluster (~20 functions): spread across mcp/, config/, tracing/, guard/, cmd/
  • Logging cluster (~35 functions): well-centralized in logger/, but mcp/errors.go:LogConnectionError diverges
  • Format/Truncate cluster (~10 functions): strutil/truncate.go, logger/rpc_formatter.go, with outliers in server/http_helpers.go:truncateCacheKeyForLog

Identified Issues

1. LogConnectionError is in the Wrong File

Severity: Medium

  • File: internal/mcp/errors.go
  • Function: func LogConnectionError(errCtx ConnectionErrorContext, err error)
  • Issue: errors.go conventionally holds error type definitions. LogConnectionError is a 100-line logging/diagnostic function that outputs to both the file logger and standard log. The mcp package already has a dedicated connection_logging.go file that houses all other connection logging helpers (logReconnectStart, logReconnectResult, logOutboundRPCRequest, logInboundRPCResponse).
  • Recommendation: Move LogConnectionError and its ConnectionErrorContext type from mcp/errors.go into mcp/connection_logging.go. If errors.go becomes empty after the move, delete it.
  • Impact: Better discoverability for developers looking for connection diagnostic logging; errors.go will correctly contain only error types.
Before:
  mcp/errors.go              → ConnectionErrorContext struct, LogConnectionError (100 lines)
  mcp/connection_logging.go  → logReconnectStart, logReconnectResult, logOutbound*, logInbound*

After:
  mcp/connection_logging.go  → all connection logging incl. LogConnectionError + ConnectionErrorContext
  mcp/errors.go              → (remove or keep for future error types)

2. server/http_helpers.go is a Catch-All File

Severity: Low–Medium

  • File: internal/server/http_helpers.go (351 lines, 11 functions)
  • Issue: The file mixes four distinct concerns:
    1. Error/runtime logging: logRuntimeError, rejectRequest, logHTTPRequestBody
    2. Middleware composition: wrapWithMiddleware, applyIfConfigured, withResponseLogging
    3. OTEL tracing: WithOTELTracing
    4. SDK adapter: WithSDKLogging, buildMCPHandler
    5. Utilities: truncateCacheKeyForLog, peekRequestBody
  • Recommendation: Consider splitting into two files:
    • server/middleware.gowrapWithMiddleware, applyIfConfigured, WithOTELTracing, WithSDKLogging, buildMCPHandler
    • server/http_helpers.go — keep request/response utilities only
  • Note: This is a low-priority cosmetic improvement; all current functions are logically related to HTTP request handling.

3. validateOpenTelemetryConfig Embedded in config_tracing.go

Severity: Low

  • File: internal/config/config_tracing.go
  • Function: func validateOpenTelemetryConfig(cfg *TracingConfig, enforceHTTPS bool) error
  • Issue: The config/ package maintains clear validation separation (validation.go, validation_env.go, validation_schema.go, guard_policy_validation.go). A tracing-specific validator is embedded in the config struct file instead.
  • Recommendation: Move validateOpenTelemetryConfig to a new config/validation_tracing.go file, or include it in the existing validation.go, to make all validators discoverable in one place.

4. LogAndWrapCollaboratorPermission Mixes Logging and Data Construction

Severity: Low

  • File: internal/mcp/tool_result.go
  • Function: func LogAndWrapCollaboratorPermission(...)
  • Issue: tool_result.go is focused on result conversion and construction (ConvertToCallToolResult, BuildMCPTextResponse, ParseToolArguments). LogAndWrapCollaboratorPermission both emits debug log lines and constructs a result, which gives it a dual responsibility. The collaborator permission parsing is already in mcp/collaborator_permission.go.
  • Recommendation: Move the function to mcp/collaborator_permission.go alongside ParseCollaboratorPermissionArgs, where all collaborator-permission logic is co-located.

No Duplicates Found

The following patterns were investigated and found to be acceptably distinct:

  • Rate limit parsing: httputil.ParseRateLimitResetHeader (parses Unix timestamp from HTTP header string) vs server/circuit_breaker.go:parseRateLimitResetFromText (parses seconds offset from natural-language error text). Different inputs, different algorithms — not duplicates.
  • getOrCreate pattern: syncutil.GetOrCreate[K,V] (generic, used in 4 places), server/routed.go:filteredServerCache.getOrCreate (LRU+TTL eviction — too complex for the generic). Correctly diverged.
  • Truncation helpers: strutil.Truncate* (generic), auth.TruncateSessionID (auth-domain), logger/sanitize.TruncateSecret (security-domain) — purpose-specific, not duplicates.

Refactoring Recommendations (Prioritized)

Priority 1 — High Impact

# Action Effort Benefit
1 Move LogConnectionError + ConnectionErrorContext from mcp/errors.gomcp/connection_logging.go 30 min Correct semantic placement; errors.go purpose clarified

Priority 2 — Medium Impact

# Action Effort Benefit
2 Move validateOpenTelemetryConfig to config/validation_tracing.go 30 min All validators in dedicated validation files
3 Move LogAndWrapCollaboratorPermission to mcp/collaborator_permission.go 30 min All collaborator logic co-located

Priority 3 — Low Impact

# Action Effort Benefit
4 Split server/http_helpers.go into middleware.go + http_helpers.go 2 hr Clearer file purpose

Implementation Checklist

  • Move ConnectionErrorContext + LogConnectionErrormcp/connection_logging.go
  • Delete or leave empty mcp/errors.go (check for other contents first)
  • Move validateOpenTelemetryConfigconfig/validation_tracing.go
  • Move LogAndWrapCollaboratorPermissionmcp/collaborator_permission.go
  • Optional: split server/http_helpers.go into middleware and helper files
  • Run make agent-finished to verify all tests pass after each move

Analysis Metadata

Metric Value
Total Go files analyzed 124
Total functions cataloged 811
Function clusters identified 14
Outlier functions found 3
Duplicate functions detected 0
Detection method Naming pattern analysis + semantic clustering
Analysis date 2026-05-20

References:

Generated by Semantic Function Refactoring · ● 1.7M ·

Metadata

Metadata

Assignees

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions