Commit d2d3153

Implement response truncation and caching in Smart MCP Proxy. Add intelligent response handling to prevent LLM context bloat, with support for pagination and JSON structure analysis. Update README with new features and configuration options.
1 parent 8290cd9 commit d2d3153

13 files changed

Lines changed: 2270 additions & 28 deletions

README.md

Lines changed: 85 additions & 0 deletions
@@ -105,6 +105,9 @@ Smart MCP Proxy
 - **Intelligent Tool Discovery**: Automatically discover and index tools from multiple MCP servers
 - **Semantic Search**: Find relevant tools using natural language queries
 - **Tool Aggregation**: Combine tools from multiple upstream servers into a single interface
+- **Response Truncation & Caching**: Automatically truncate large tool responses to prevent LLM context bloat
+- **Smart Pagination**: Access cached response data through pagination with the `read_cache` tool
+- **JSON Structure Analysis**: Intelligent splitting of JSON responses by record arrays
 - **HTTP & Stdio Support**: Connect to MCP servers via HTTP or stdio protocols
 - **Persistent Storage**: Cache tool metadata and connection information
 - **Configuration Management**: Flexible JSON-based configuration with environment variable support
@@ -146,6 +149,7 @@ Create a `config.json` file:
   "enable_tray": true,
   "top_k": 5,
   "tools_limit": 15,
+  "tool_response_limit": 20000,
   "mcpServers": [
     {
       "name": "Local Python Server",
@@ -218,6 +222,87 @@ curl -X POST http://localhost:8080/mcp/ \

 The proxy automatically discovers and indexes tools from configured upstream servers. Tools are available through the unified interface with semantic search capabilities.

+## Response Truncation & Caching
+
+The Smart MCP Proxy includes intelligent response truncation to prevent LLM context bloat while maintaining access to complete data through caching and pagination.
+
+### How It Works
+
+1. **Automatic Truncation**: Tool responses exceeding the configured limit (default: 20,000 characters) are automatically truncated
+2. **JSON Analysis**: The proxy analyzes JSON responses to identify record arrays for intelligent splitting
+3. **Smart Caching**: Complete responses are cached with 2-hour TTL for pagination access
+4. **Fallback Handling**: Non-JSON or unstructured responses get simple truncation
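The four steps listed above can be sketched roughly as follows. This is a minimal illustration and not the code from this commit: the names `truncated`, `findRecordArray`, and `truncateResponse`, the choice of a SHA-256 cache key, and the "largest top-level array" heuristic are all assumptions. For simplicity the sketch measures the limit in bytes, whereas the README speaks of characters.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// truncated describes the outcome of limiting one tool response.
// Every name in this sketch is hypothetical.
type truncated struct {
	Text       string // what would be returned to the LLM
	CacheKey   string // empty when nothing was truncated
	RecordPath string // top-level key holding the record array, if any
	Records    int    // number of records found under RecordPath
}

// findRecordArray looks for the largest non-empty array among the top-level
// keys of a JSON object. Ties are broken by key name so the result does not
// depend on Go's randomized map iteration order.
func findRecordArray(body string) (path string, n int) {
	var obj map[string]json.RawMessage
	if err := json.Unmarshal([]byte(body), &obj); err != nil {
		return "", 0 // not a JSON object: caller falls back to plain truncation
	}
	for key, raw := range obj {
		var items []json.RawMessage
		if err := json.Unmarshal(raw, &items); err != nil {
			continue // value is not an array
		}
		if len(items) > n || (len(items) == n && n > 0 && key < path) {
			path, n = key, len(items)
		}
	}
	return path, n
}

// truncateResponse cuts body at limit bytes and appends a pagination hint.
// A limit of 0 (or less) disables truncation.
func truncateResponse(body string, limit int) truncated {
	if limit <= 0 || len(body) <= limit {
		return truncated{Text: body}
	}
	sum := sha256.Sum256([]byte(body))
	out := truncated{CacheKey: hex.EncodeToString(sum[:])}
	out.RecordPath, out.Records = findRecordArray(body)
	// Note: slicing by bytes can split a multi-byte UTF-8 character;
	// a rune-aware version would convert to []rune first.
	out.Text = body[:limit] + "\n\n... [truncated by mcpproxy]\n\n" +
		fmt.Sprintf("Response truncated (limit: %d chars, actual: %d chars, records: %d)\n",
			limit, len(body), out.Records) +
		fmt.Sprintf("Use read_cache tool: key=%q, offset=0, limit=50", out.CacheKey)
	return out
}

func main() {
	body := `{"data":[{"id":1},{"id":2},{"id":3}],"next":null}`
	res := truncateResponse(body, 20)
	fmt.Println(res.Text)
	fmt.Println(res.RecordPath, res.Records)
}
```

A real implementation would also store the full body under `CacheKey` before returning, so that a later `read_cache` call can serve it.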
+
+### Configuration
+
+```json
+{
+  "tool_response_limit": 20000  // Default: 20000 chars, 0 = disabled
+}
+```
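One thing to watch in the README snippet above: standard JSON has no comment syntax, so the `//` annotation would be rejected by Go's `encoding/json` if copied verbatim into `config.json`. The sketch below shows one way the documented semantics (default 20000, explicit 0 disables) could be loaded. The `Config` struct, `defaultToolResponseLimit`, and `parseConfig` are hypothetical names, not taken from the commit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config mirrors only the field discussed here. The struct and its
// field name are assumptions made for this sketch.
type Config struct {
	ToolResponseLimit int `json:"tool_response_limit"`
}

const defaultToolResponseLimit = 20000

// parseConfig decodes raw on top of a Config pre-filled with defaults.
// A missing key therefore keeps the default, while an explicit 0 is
// preserved and means "truncation disabled".
func parseConfig(raw []byte) (Config, error) {
	cfg := Config{ToolResponseLimit: defaultToolResponseLimit}
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return Config{}, fmt.Errorf("parse config: %w", err)
	}
	if cfg.ToolResponseLimit < 0 {
		return Config{}, fmt.Errorf("tool_response_limit must be >= 0, got %d", cfg.ToolResponseLimit)
	}
	return cfg, nil
}

func main() {
	inputs := []string{
		`{}`,
		`{"tool_response_limit": 0}`,
		"{\"tool_response_limit\": 20000 // comment\n}", // rejected: not valid JSON
	}
	for _, raw := range inputs {
		cfg, err := parseConfig([]byte(raw))
		fmt.Println(cfg.ToolResponseLimit, err)
	}
}
```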
+
+### Truncated Response Format
+
+When a response is truncated, you'll see:
+
+```json
+{
+  "data": [{"id": 1}, {"id": 2}] // Partial data...
+}
+
+... [truncated by mcpproxy]
+
+Response truncated (limit: 20000 chars, actual: 45000 chars, records: 150)
+Use read_cache tool: key="abc123def...", offset=0, limit=50
+Returns: {"records": [...], "meta": {"total_records": 150, "total_size": 45000}}
+```
+
+### Accessing Complete Data
+
+Use the `read_cache` tool to access paginated data:
+
+```json
+{
+  "jsonrpc": "2.0",
+  "id": 4,
+  "method": "tools/call",
+  "params": {
+    "name": "read_cache",
+    "arguments": {
+      "key": "abc123def456...",
+      "offset": 0,
+      "limit": 50
+    }
+  }
+}
+```
+
+Response:
+```json
+{
+  "records": [
+    {"id": 1, "name": "item1"},
+    {"id": 2, "name": "item2"}
+    // ... up to 50 records
+  ],
+  "meta": {
+    "key": "abc123def456...",
+    "total_records": 150,
+    "limit": 50,
+    "offset": 0,
+    "total_size": 45000,
+    "record_path": "data"
+  }
+}
+```
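The `read_cache` response shape documented above suggests an offset/limit window over the cached record array. A minimal sketch of that windowing follows. The `page`, `pageMeta`, and `readPage` names are hypothetical, the JSON field names are copied from the README example, and the sketch assumes `record_path` is a single top-level key.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pageMeta and page follow the field names in the README example.
// The Go type names themselves are assumptions for this sketch.
type pageMeta struct {
	Key          string `json:"key"`
	TotalRecords int    `json:"total_records"`
	Limit        int    `json:"limit"`
	Offset       int    `json:"offset"`
	TotalSize    int    `json:"total_size"`
	RecordPath   string `json:"record_path"`
}

type page struct {
	Records []json.RawMessage `json:"records"`
	Meta    pageMeta          `json:"meta"`
}

// readPage returns the records in [offset, offset+limit) from the array
// stored under recordPath in a cached JSON object. An offset past the end
// yields an empty page rather than an error.
func readPage(key, cached, recordPath string, offset, limit int) (page, error) {
	if offset < 0 || limit <= 0 {
		return page{}, fmt.Errorf("invalid window: offset=%d limit=%d", offset, limit)
	}
	var obj map[string]json.RawMessage
	if err := json.Unmarshal([]byte(cached), &obj); err != nil {
		return page{}, fmt.Errorf("cached response is not a JSON object: %w", err)
	}
	var records []json.RawMessage
	if err := json.Unmarshal(obj[recordPath], &records); err != nil {
		return page{}, fmt.Errorf("no record array at %q: %w", recordPath, err)
	}
	start := offset
	if start > len(records) {
		start = len(records)
	}
	end := len(records)
	if limit < end-start {
		end = start + limit
	}
	return page{
		Records: records[start:end],
		Meta: pageMeta{
			Key:          key,
			TotalRecords: len(records),
			Limit:        limit,
			Offset:       offset,
			TotalSize:    len(cached),
			RecordPath:   recordPath,
		},
	}, nil
}

func main() {
	cached := `{"data":[{"id":1},{"id":2},{"id":3},{"id":4},{"id":5}]}`
	pg, err := readPage("abc123", cached, "data", 2, 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	out, _ := json.MarshalIndent(pg, "", "  ")
	fmt.Println(string(out))
}
```

A caller can keep requesting pages, advancing `offset` by `limit`, until `offset` reaches `meta.total_records`.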
+
+### Cache Management
+
+- **TTL**: Cached responses expire after 2 hours
+- **Cleanup**: Automatic cleanup runs every 10 minutes
+- **Storage**: Uses the same BBolt database as other proxy data
+- **Statistics**: Cache hit/miss rates available through tool stats
+
 ## System Tray Usage

 ### Status Information

cmd/mcpproxy/main.go

Lines changed: 17 additions & 7 deletions
@@ -17,13 +17,14 @@ import (
 )

 var (
-	configFile  string
-	dataDir     string
-	listen      string
-	logLevel    string
-	enableTray  bool
-	debugSearch bool
-	version     = "v0.1.0" // This will be injected by -ldflags during build
+	configFile        string
+	dataDir           string
+	listen            string
+	logLevel          string
+	enableTray        bool
+	debugSearch       bool
+	toolResponseLimit int
+	version           = "v0.1.0" // This will be injected by -ldflags during build
 )

 func main() {
@@ -41,6 +42,7 @@ func main() {
 	rootCmd.PersistentFlags().StringVar(&logLevel, "log-level", "info", "Log level (debug, info, warn, error)")
 	rootCmd.PersistentFlags().BoolVar(&enableTray, "tray", true, "Enable system tray")
 	rootCmd.PersistentFlags().BoolVar(&debugSearch, "debug-search", false, "Enable debug search tool for search relevancy debugging")
+	rootCmd.PersistentFlags().IntVar(&toolResponseLimit, "tool-response-limit", 0, "Tool response limit in characters (0 = disabled, default: 20000 from config)")

 	if err := rootCmd.Execute(); err != nil {
 		fmt.Fprintf(os.Stderr, "Error: %v\n", err)
@@ -73,6 +75,11 @@ func runServer(cmd *cobra.Command, args []string) error {
 	// Override debug search setting from command line
 	cfg.DebugSearch = debugSearch

+	// Override tool response limit from command line if provided
+	if toolResponseLimit != 0 {
+		cfg.ToolResponseLimit = toolResponseLimit
+	}
+
 	logger.Info("Configuration loaded",
 		zap.String("data_dir", cfg.DataDir),
 		zap.Int("servers_count", len(cfg.Servers)),
@@ -215,6 +222,9 @@ func loadConfig() (*config.Config, error) {
 	if listen != "" {
 		cfg.Listen = listen
 	}
+	if toolResponseLimit != 0 {
+		cfg.ToolResponseLimit = toolResponseLimit
+	}

 	// Validate the configuration
 	if err := cfg.Validate(); err != nil {
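
A note on the override logic in the `main.go` hunks: the flag's help text says `0 = disabled`, but both `runServer` and `loadConfig` apply the CLI value only when `toolResponseLimit != 0`, so passing `--tool-response-limit 0` appears indistinguishable from omitting the flag and would leave the configured limit in place. One way to separate the two cases is to ask the flag set whether the flag was actually passed. The sketch below uses the standard library `flag` package and its `FlagSet.Visit` method so it stays self-contained. `resolveLimit` is a hypothetical helper, and with cobra the analogous check would presumably go through pflag's `Changed` method.

```go
package main

import (
	"flag"
	"fmt"
)

// resolveLimit returns the effective response limit: the CLI value when
// --tool-response-limit was passed explicitly (including 0), otherwise
// the value loaded from the config file.
func resolveLimit(args []string, configLimit int) (int, error) {
	fs := flag.NewFlagSet("mcpproxy", flag.ContinueOnError)
	cliLimit := fs.Int("tool-response-limit", 0, "tool response limit in characters (0 = disabled)")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	// Visit only calls the function for flags that were set on the
	// command line, which lets us tell "explicit 0" from "not given".
	explicit := false
	fs.Visit(func(f *flag.Flag) {
		if f.Name == "tool-response-limit" {
			explicit = true
		}
	})
	if explicit {
		return *cliLimit, nil
	}
	return configLimit, nil
}

func main() {
	cases := [][]string{
		nil,                               // flag omitted: config value wins
		{"--tool-response-limit=0"},       // explicit 0: truncation disabled
		{"--tool-response-limit", "5000"}, // explicit override
	}
	for _, args := range cases {
		limit, err := resolveLimit(args, 20000)
		fmt.Println(args, limit, err)
	}
}
```

Resolving the value in a single place would also remove the need for the duplicate override in `runServer` and `loadConfig`.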
