Labels: bug, high priority, cli, memory
Summary
Running toolbench analyze on large input files (≥ ~200MB) causes the process to crash with either an out-of-memory error or an unhandled exception. The crash appears to occur during parsing/aggregation in lib/analyze.js (or toolbench/analyze.py, depending on implementation). Small files work fine.
Expected
toolbench analyze <file> should stream or chunk the input and complete successfully (or fail gracefully with a helpful error) for large files.
Actual
Process crashes with either:
- Node:
FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory, or
- Python:
MemoryError or process killed by OS (OOM killer), or
- Unhandled exception and non-zero exit code with no helpful message.
Environment
- ToolBench commit / tag:
HEAD (please replace with exact commit hash)
- OS: Ubuntu 22.04 LTS (also reproduced on macOS 12)
- Node.js: v18.16.0 (if applicable)
- Python: 3.11.4 (if applicable)
- RAM: 8GB
- Reproduction on both machine-local and CI (GitHub Actions) observed
Reproduction steps (minimal)
- Create a large test file (200MB+). Note: piping base64 /dev/urandom produces random text, not parseable JSON lines, so generate valid NDJSON instead:
# Linux / macOS: create a ~250MB NDJSON test file (one JSON object per line)
awk 'BEGIN { for (i = 0; i < 8000000; i++) printf("{\"id\":%d,\"value\":%d}\n", i, i) }' > ./test-large.ndjson
- Run the analysis:
# CLI invocation
toolbench analyze ./test-large.ndjson --mode summary
- Observe crash:
# Node.js example crash
$ toolbench analyze ./test-large.ndjson --mode summary
FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
Aborted (core dumped)
Or for Python:
$ toolbench analyze ./test-large.ndjson --mode summary
Traceback (most recent call last):
File "toolbench/cli.py", line 42, in <module>
main()
File "toolbench/analyze.py", line 210, in analyze
results = aggregator.aggregate(all_items)
MemoryError
Quick root-cause hypothesis
The current implementation accumulates the entire parsed input into memory (e.g. all_items = [], or a full json.load()/file.read()), then runs in-memory aggregation. For very large files this triggers huge memory usage and crashes. The CLI should either:
- Stream-process input (line-by-line or chunked), keeping only aggregated state, or
- Use a bounded buffer / external temporary storage (SQLite or temporary file) for intermediate results, or
- Provide an option to use memory-limited mode.
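As a rough sketch of the second option (bounded external storage), intermediate results could be spilled to SQLite instead of held in Python memory. This is purely illustrative; the table shape, key choice, and function name are assumptions, not ToolBench's actual internals:

```python
import json
import sqlite3

def aggregate_via_sqlite(path, db_path=":memory:"):
    """Stream NDJSON lines into SQLite, keeping per-key running counts/sums
    in the database instead of the full parsed input in Python memory.
    In practice db_path should be a temporary file so state lives on disk."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS agg (key TEXT PRIMARY KEY, n INTEGER, total REAL)"
    )
    with open(path, "r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            item = json.loads(line)
            # UPSERT keeps one row per key; SQLite pages state to disk as needed
            con.execute(
                "INSERT INTO agg (key, n, total) VALUES (?, 1, ?) "
                "ON CONFLICT(key) DO UPDATE SET n = n + 1, total = total + excluded.total",
                (str(item.get("id", "unknown")), float(item.get("value", 0))),
            )
    con.commit()
    return dict(con.execute("SELECT key, total FROM agg"))
```

The UPSERT syntax requires SQLite 3.24+, which ships with all currently supported CPython versions.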
Suggested fix (Node.js example)
Change the implementation to stream the file instead of reading the whole file into memory.
Before (problematic pattern):
// lib/analyze.js (hypothetical)
const fs = require('fs');

function analyze(path) {
  const raw = fs.readFileSync(path, 'utf8'); // reads entire file
  const items = raw.split('\n').map(JSON.parse);
  // heavy in-memory aggregation
  return aggregate(items);
}
After (streaming approach using readline):
// lib/analyze.js
const fs = require('fs');
const readline = require('readline');

async function analyze(path) {
  const stats = createAggregator(); // small stateful object
  const rl = readline.createInterface({
    input: fs.createReadStream(path, { encoding: 'utf8' }),
    crlfDelay: Infinity
  });
  for await (const line of rl) {
    if (!line.trim()) continue;
    let item;
    try {
      item = JSON.parse(line);
    } catch (err) {
      // handle / log parse error per-line
      continue;
    }
    stats.add(item); // aggregator keeps only necessary summary/state
  }
  return stats.finalize();
}

module.exports = { analyze };
This avoids loading the entire file into memory.
Suggested fix (Python example)
Use an iterator and avoid reading the entire file with json.load().
Before (problematic):
# toolbench/analyze.py
with open(path, 'r', encoding='utf-8') as fh:
    data = json.load(fh)  # loads whole file -> OOM
aggregator = Aggregator()
aggregator.aggregate(data)
After (streaming, NDJSON example):
# toolbench/analyze.py
import json

def analyze(path):
    aggregator = Aggregator()
    with open(path, 'r', encoding='utf-8') as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                item = json.loads(line)
            except json.JSONDecodeError:
                # optionally log and continue
                continue
            aggregator.add(item)
    return aggregator.result()
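The Aggregator used above is assumed, not defined in the codebase excerpt. A minimal shape that keeps only O(1) running state (count/sum/min/max) rather than the parsed items might look like the following sketch; adapt field names and statistics to ToolBench's real aggregation:

```python
class Aggregator:
    """Keeps constant-size running state instead of storing every parsed item."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.min_v = None
        self.max_v = None

    def add(self, item):
        # 'value' is a placeholder field name for whatever metric is aggregated
        v = float(item.get("value", 0))
        self.count += 1
        self.total += v
        self.min_v = v if self.min_v is None else min(self.min_v, v)
        self.max_v = v if self.max_v is None else max(self.max_v, v)

    def result(self):
        mean = self.total / self.count if self.count else None
        return {"count": self.count, "sum": self.total,
                "min": self.min_v, "max": self.max_v, "mean": mean}
```

Because memory use is independent of input size, this pairs safely with the streaming loop above.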
If input is not NDJSON, consider ijson for streaming JSON arrays:
# example with ijson for JSON arrays
import ijson

with open(path, 'rb') as fh:
    parser = ijson.items(fh, 'item')
    for item in parser:
        aggregator.add(item)
Add ijson to optional dependencies if needed.
Tests to add (unit / integration)
Node: jest integration test
Create tests/large-file.integration.test.js:
const fs = require('fs');
const { spawnSync } = require('child_process');
const tmp = require('tmp');

// generate small-ish file but conceptually large for CI
test('analyze handles streaming input without OOM', () => {
  const tmpFile = tmp.fileSync({ postfix: '.ndjson' });
  const lines = [];
  for (let i = 0; i < 10000; i++) {
    lines.push(JSON.stringify({ id: i, value: Math.random() }));
  }
  fs.writeFileSync(tmpFile.name, lines.join('\n'), 'utf8');

  const result = spawnSync('node', ['bin/toolbench', 'analyze', tmpFile.name], {
    encoding: 'utf8',
    maxBuffer: 1024 * 1024 * 10
  });
  expect(result.status).toBe(0);
  expect(result.stdout).toMatch(/summary/); // adapt to actual CLI output
});
Python pytest integration
tests/test_analyze_large.py:
import json
import subprocess
import sys

def test_analyze_handles_large_file(tmp_path):
    p = tmp_path / "test.ndjson"
    with p.open("w", encoding="utf-8") as fh:
        for i in range(20000):
            fh.write(json.dumps({"id": i, "v": i}) + "\n")
    proc = subprocess.run([sys.executable, "-m", "toolbench", "analyze", str(p)],
                          capture_output=True, text=True)
    assert proc.returncode == 0
    assert "summary" in proc.stdout.lower()  # adapt to actual output
Suggested PR checklist / reviewer notes
- Replace any readFileSync/read() + JSON.parse() of the entire file with the streaming approach.
- Add the unit/integration tests above to the CI matrix.
- Add a CLI flag --stream, or automatically detect stdin and stream.
- Update README to document memory-safe mode and supported input formats (NDJSON / JSON array).
- If using ijson or another third-party streaming parser, add it to dependencies and provide a fallback.
Logs / attachments
(Attach any verbose logs or profiler output, e.g., node --trace_gc or python -X tracemalloc, to help root-cause.)
Example: node --max-old-space-size=4096 bin/toolbench analyze test-large.ndjson
- If increasing the Node heap avoids the crash, that is further evidence of an unbounded in-memory accumulation pattern.
Temporary workarounds
- Split the input file into smaller chunks, run toolbench analyze on each, then merge results externally.
- Run on a larger-memory machine, or increase the Node heap with --max-old-space-size=8192 (a mitigation, not a fix).
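The chunk-and-merge workaround could be scripted roughly as below. This is a sketch under assumptions: chunk boundaries fall on NDJSON line breaks, and per-chunk summaries are mergeable dicts with count/sum fields; function names are illustrative:

```python
import itertools

def split_ndjson(path, lines_per_chunk=500_000):
    """Yield chunk file paths, each holding at most lines_per_chunk NDJSON lines.
    Splitting on line boundaries keeps every chunk independently parseable."""
    with open(path, "r", encoding="utf-8") as fh:
        for idx in itertools.count():
            chunk = list(itertools.islice(fh, lines_per_chunk))
            if not chunk:
                return
            chunk_path = f"{path}.chunk{idx}"
            with open(chunk_path, "w", encoding="utf-8") as out:
                out.writelines(chunk)
            yield chunk_path

def merge_summaries(summaries):
    """Merge per-chunk {'count': int, 'sum': float} summaries into one.
    Works because count and sum are associative across chunks."""
    merged = {"count": 0, "sum": 0.0}
    for s in summaries:
        merged["count"] += s["count"]
        merged["sum"] += s["sum"]
    return merged
```

Each chunk would be fed to toolbench analyze separately, with the resulting summaries merged at the end. Note that non-associative statistics (e.g. exact medians) cannot be merged this way.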
Example minimal patch idea (pseudo)
- Create lib/streaming-aggregator.js with a small stateful aggregator API (add(item), finalize()).
- Modify the CLI entrypoint to choose the streaming path for files > 10MB or when --stream is passed.
- Add tests and docs.
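The size-based dispatch in the patch idea could be as simple as the following sketch (shown in Python for brevity; the threshold constant and function names are placeholders, not existing ToolBench APIs):

```python
import os

STREAM_THRESHOLD_BYTES = 10 * 1024 * 1024  # 10MB, per the patch idea above

def choose_analyze_path(path, force_stream=False):
    """Return 'streaming' for large files or when --stream was passed,
    'in-memory' otherwise. Names here are illustrative placeholders."""
    if force_stream or os.path.getsize(path) > STREAM_THRESHOLD_BYTES:
        return "streaming"
    return "in-memory"
```

Input read from stdin has no knowable size up front, so stdin should always take the streaming path.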