TL;DR — Memtrace runs entirely on your machine. Your source code never leaves it.
Memtrace builds a structural knowledge graph from your codebase's AST. Every step happens on your machine:
| Step | Where it runs | What it processes |
|---|---|---|
| AST parsing | Local (Tree-sitter, compiled into the binary) | Source files → symbol nodes |
| Graph construction | Local (MemDB, embedded or self-hosted) | Nodes + edges (CALLS, IMPLEMENTS, IMPORTS) |
| Vector embeddings | Local (ONNX Runtime via fastembed — CoreML on Apple Silicon, CPU elsewhere) | Symbol signatures → vectors stored in local MemDB |
| Full-text search | Local (Tantivy BM25 index on disk) | Symbol names + signatures |
| Git history analysis | Local (libgit2, vendored) | Commit history → bi-temporal graph |
| MCP tool queries | Local (graph traversal + search) | Results returned to your local MCP client |
No source code, file contents, embeddings, file paths, or AST data are ever transmitted to any external server. Symbol names leave your machine only if you opt in to the Weekly Memtrace Receipt feature.
Memtrace makes exactly four types of network calls:
| Field | Details |
|---|---|
| Endpoint | POST https://www.memtrace.io/api/device/auth |
| Data sent | License key (MTC-COM-...) + machine hostname |
| Purpose | Validate your license and obtain a session token |
| Frequency | On startup; refresh when session nears expiry |

| Field | Details |
|---|---|
| Endpoint | POST https://www.memtrace.io/api/device/heartbeat |
| Data sent | Aggregate integer counts only: total nodes, edges, episodes, repositories |
| Purpose | Usage metering and entitlement checks |
| Frequency | Every 15 minutes while running |

By default the heartbeat payload contains no symbol names, no file paths, no code, and no embeddings — only integer totals like { "totalNodes": 4022, "totalEdges": 18441 }.
The one exception is the Weekly Memtrace Receipt feature (off by default, opt-in via the memtrace.io account dashboard). When that toggle is on, the heartbeat additionally carries a small symbol-name surface that powers the weekly summary email. Set MEMTRACE_NO_REMOTE_RECEIPT=1 on a specific machine to keep the receipt feature off regardless of the account-level toggle. Full breakdown: docs/telemetry-compliance-datasheet.md §6.4.
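The default and opt-in heartbeat shapes described above can be sketched as follows. This is only an illustration, not Memtrace's actual implementation: `totalNodes` and `totalEdges` come from the example above, while `totalEpisodes`, `totalRepositories`, `receiptSymbols`, and the function name `build_heartbeat` are hypothetical names chosen for this sketch.

```python
import os


def build_heartbeat(counts, account_receipt_enabled, symbol_names, env=None):
    """Assemble a heartbeat payload (hypothetical shape, for illustration only)."""
    env = os.environ if env is None else env

    # Default payload: aggregate integer totals and nothing else.
    # Key names other than totalNodes/totalEdges are assumptions.
    payload = {
        "totalNodes": int(counts["nodes"]),
        "totalEdges": int(counts["edges"]),
        "totalEpisodes": int(counts["episodes"]),
        "totalRepositories": int(counts["repositories"]),
    }

    # The per-machine override wins over the account-level toggle.
    machine_opt_out = env.get("MEMTRACE_NO_REMOTE_RECEIPT") == "1"
    if account_receipt_enabled and not machine_opt_out:
        # Hypothetical field carrying the symbol-name surface for the receipt.
        payload["receiptSymbols"] = sorted(symbol_names)
    return payload
```

With the account toggle off, or with `MEMTRACE_NO_REMOTE_RECEIPT=1` set on the machine, the sketch returns only the four integer totals.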
| Field | Details |
|---|---|
| Source | HuggingFace Hub (via the fastembed library) |
| Data sent | Nothing; this is an inbound download only |
| What's downloaded | ONNX model weights (e.g., BGE-small-en-v1.5) |
| Frequency | Once on first run; cached at ~/.cache/fastembed/ |

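To confirm that the model weights were downloaded once and are now served from disk, you can inspect the cache directory named above. The helper below is a minimal sketch; `cached_model_files` is a hypothetical name, and the layout of files inside the cache is not assumed beyond "some files under `~/.cache/fastembed/`".

```python
from pathlib import Path


def cached_model_files(cache_dir=None):
    """Return sorted (relative path, size in bytes) pairs for files in the cache."""
    root = Path(cache_dir) if cache_dir else Path.home() / ".cache" / "fastembed"
    if not root.is_dir():
        return []  # nothing has been downloaded yet
    return sorted(
        (str(p.relative_to(root)), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file()
    )
```

Calling `cached_model_files()` with no argument lists the default cache. An empty list before the first run and a stable list afterwards is what the "once on first run" behaviour would look like.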
| Field | Details |
|---|---|
| Endpoint | POST https://memtrace.io/api/telemetry/ingest |
| Data sent | App-start events, indexing/embedding durations, aggregate PR review/watch counters, panic reports, and WARN/ERROR log lines from Memtrace's own crates, all sanitised to strip home-directory paths, token-shaped strings, and email addresses. Also content-free Rail routing-quality buckets (mode, pattern shape, hit/miss, a bucketed score, and a local relevance yes/no), never the search text or which files matched. The background daemon measures the Rail buckets asynchronously, so they never add latency to a search |
| Purpose | Catch crashes and regressions across the user base (for example, the M3-Air "stuck on Loading embedding model" hang or Windows MSVC build failures). For Rail, measure whether graph-backed search results are relevant, so that the decision to make Rail active by default rests on real evidence |
| Frequency | Batched flush every 60 seconds while running |
| Opt-out | MEMTRACE_TELEMETRY=off disables all of it (0, false, disabled, and no also work); MEMTRACE_RAIL_SHADOW=off disables only the Rail buckets; MEMTRACE_RAIL_SHADOW_SAMPLE takes a value from 0 to 1 and bounds the background measurement rate |
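The opt-out variables in the table can be read as a small precedence rule: the global switch overrides the Rail switch, and the sample rate is clamped to the 0 to 1 range. The sketch below illustrates that reading. Case-insensitive matching, the `1.0` fallback sample rate, and the name `telemetry_settings` are assumptions of this example, not documented behaviour.

```python
import os

# Values documented as disabling telemetry.
_OFF_VALUES = {"off", "0", "false", "disabled", "no"}


def telemetry_settings(env=None):
    """Resolve telemetry switches from environment variables (illustrative)."""
    env = os.environ if env is None else env

    # Assumption: values are compared case-insensitively after trimming.
    telemetry_value = env.get("MEMTRACE_TELEMETRY", "").strip().lower()
    telemetry_on = telemetry_value not in _OFF_VALUES

    # Only "off" is documented for the Rail switch; the global switch wins.
    rail_value = env.get("MEMTRACE_RAIL_SHADOW", "").strip().lower()
    rail_on = telemetry_on and rail_value != "off"

    # Assumption: an unset or unparsable sample rate falls back to 1.0.
    raw = env.get("MEMTRACE_RAIL_SHADOW_SAMPLE")
    try:
        sample = float(raw) if raw is not None else 1.0
    except ValueError:
        sample = 1.0
    sample = min(1.0, max(0.0, sample))  # clamp into [0, 1]

    return {"telemetry": telemetry_on, "rail_shadow": rail_on, "rail_sample": sample}
```

For example, `telemetry_settings({"MEMTRACE_TELEMETRY": "no"})` reports both telemetry and Rail measurement as disabled, while setting only `MEMTRACE_RAIL_SHADOW=off` leaves crash and error telemetry on.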
The telemetry payload never contains source code, file contents, symbol names, embeddings, repository paths, the text of your search commands, which files or symbols a search matched, GitHub PR URLs, PR discussion text, reviewer identities, branch names, or commit data. The schema on the receiving end has no column to hold any of those — we'd have to ship a new release to even start collecting them, and we'd announce it here first. Full breakdown: TELEMETRY.md.
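The sanitisation step mentioned in the telemetry table (stripping home-directory paths, token-shaped strings, and email addresses) might look roughly like the sketch below. The regular expressions, the placeholder strings, and the 32-character token threshold are all assumptions made for illustration; Memtrace's real rules may differ.

```python
import re

# Home-directory prefixes on Linux, macOS, and Windows (assumed patterns).
_HOME = re.compile(r"(/home/|/Users/|[A-Za-z]:\\Users\\)[^/\\\s]+")
# A loose email matcher, good enough for redaction purposes.
_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# "Token-shaped": a long unbroken run of URL-safe characters (assumed threshold).
_TOKEN = re.compile(r"\b[A-Za-z0-9_-]{32,}\b")


def sanitise(line):
    """Redact home directories, emails, and token-like strings from a log line."""
    line = _HOME.sub("<home>", line)
    line = _EMAIL.sub("<email>", line)
    line = _TOKEN.sub("<token>", line)
    return line
```

Redaction of this kind is deliberately over-eager: a long identifier can be replaced by `<token>` even when it is not a secret, which trades some debugging detail for a lower chance of leaking credentials.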
- ❌ We do not send source code to any server
- ❌ We do not use cloud-based embedding APIs (OpenAI, Cohere, etc.)
- ❌ We do not transmit symbol names, file paths, or any structural data outside the sanitised crash/error/event payloads documented above
- ❌ We do not store or share IP addresses beyond standard request logs, which are kept for 7 days for abuse mitigation only
- ❌ We do not sell, share, or publish anonymised aggregates of telemetry data without notice
If you have questions about data handling or need a security review for your organization, please open an issue or contact us at support@syncable.dev.