Skip to content

fix: cap embed payload at 30k chars to prevent Ollama context overflow (#53)#54

Closed
doITmagic wants to merge 13 commits intomainfrom
fix/issue-53-embed-truncation
Closed

fix: cap embed payload at 30k chars to prevent Ollama context overflow (#53)#54
doITmagic wants to merge 13 commits intomainfrom
fix/issue-53-embed-truncation

Conversation

@doITmagic
Copy link
Copy Markdown
Owner

Description

Fixes the "llm embedding error: the input length exceeds the context length" error reported by users indexing large codebases with Ollama.

Root cause: IndexPaths() was concatenating docstring + signature + code into a single embed payload with no size limit. For large symbols (big classes, generated files, long functions), this easily exceeds the context window of embedding models like nomic-embed-text (8 192 tokens).

Fix: Added a buildEmbedText() helper that caps the embed payload at 30 000 characters (~7 700 tokens, giving ~6% safety headroom). Metadata (docstring + signature) is always preserved in full — only the raw code body is truncated when the total exceeds the limit. A [WARN] log entry is emitted whenever truncation occurs so users can identify oversized symbols without an indexing failure.

Fixes #53

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

Checklist:

  • I have performed a self-review of my own code
  • I have formatted my code with go fmt ./...
  • I have run tests go test ./... and they pass
  • I have verified integration with Ollama/Qdrant (if applicable)
  • I have updated the documentation accordingly

razvan and others added 13 commits April 16, 2026 20:05
- Adăugat sistem de 'embedded skills' folosind go:embed pentru fișiere Markdown (fără hardcoding).
- Implementat management de skills (list_skills, install_skill) cu detecție dinamică a metadatelor.
- Actualizat sistemul de update cu caching de 24h și verificări non-blocking la tool calls.
- Adăugat bibliotecă de 6 skill-uri de bază (Go, Python, Laravel, Debugging, etc.).
- Adăugat teste unitare pentru pachetul internal/skills.
- Actualizat .gitignore și regulile IDE generate automat pentru a suporta noul sistem.
- Fix skills path traversal validation

- Fix race condition in update check

- Add Windows zip support for updates

- Improve test cleanup

- Add missing tool schemas
Add buildEmbedText() helper that caps embed payload at 30 000 chars
(~7 700 tokens, safe for 8 192-token models like nomic-embed-text).

Metadata (docstring + signature) is always preserved in full; only the
Code body is truncated when the total exceeds the limit. A WARN is
logged whenever truncation occurs so users can identify large symbols.

Fixes #53
Copilot AI review requested due to automatic review settings April 16, 2026 17:16
@doITmagic doITmagic self-assigned this Apr 16, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses embedding failures caused by oversized embed payloads during indexing (Ollama context overflow), and also introduces a bundled “skills” system plus update-check/apply tooling with caching.

Changes:

  • Cap embedding input via buildEmbedText() (30k Unicode chars) and emit warnings when truncation occurs.
  • Add an updater cache (24h TTL) and expose update/skill management as MCP tools (check_update, apply_update, list_skills, install_skill).
  • Improve file-write correctness in a few places by explicitly handling Close() errors and adjust .gitignore for new skill install output.

Reviewed changes

Copilot reviewed 19 out of 20 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
internal/workspace/state.go Switch Save() to a write lock and improve close/error handling on state persistence.
internal/updater/updater.go Add update caching, context-aware HTTP, zip extraction, safer atomic cache writes, and Windows fallback behavior.
internal/updater/updater_test.go Tests for updater cache path creation and save/load behavior.
internal/tools/updates.go New MCP tools to check for updates and apply updates.
internal/tools/list_skills.go New MCP tool to list bundled skills.
internal/tools/install_skill.go New MCP tool to install/uninstall a skill into a workspace.
internal/skills/skills.go Embed and manage bundled skills; install/uninstall logic and metadata parsing.
internal/skills/skills_test.go Tests for skill listing, install/uninstall, and path traversal protection.
internal/skills/embedded/ragcode-update/SKILL.md New embedded skill documentation for updates/maintenance workflow.
internal/skills/embedded/ragcode-priority/examples/search_patterns.md New examples for recommended search workflows.
internal/skills/embedded/ragcode-priority/SKILL.md New embedded skill that enforces “ragcode tools first” workflow.
internal/skills/embedded/python-best-practices/SKILL.md New embedded skill for Python guidelines.
internal/skills/embedded/php-laravel/SKILL.md New embedded skill for PHP/Laravel guidelines.
internal/skills/embedded/go-best-practices/SKILL.md New embedded skill for Go best practices.
internal/skills/embedded/debugging-guide/SKILL.md New embedded skill for debugging workflow.
internal/ragcode/indexer.go Add capped embed text builder and warning logging on truncation.
internal/ragcode/indexer_embed_test.go Tests for embed text truncation behavior and boundaries.
cmd/rag-code-mcp/main.go Wire new tools, update CLI update path, and add background update checks + schemas.
cmd/install/main.go Adjust install path construction and improve close/error handling during file operations.
.gitignore Stop ignoring all of .agent/; ignore only .agent/skills/ and add debug log ignores.

Comment on lines +34 to +65
var full string
if ch.Code != "" {
if meta != "" {
full = meta + "\n\n" + ch.Code
} else {
full = ch.Code
}
} else {
full = meta
}

runes := []rune(full)
if len(runes) <= maxChars {
return full, false
}

// Truncate only the Code portion — keep metadata intact.
metaWithSep := meta
if meta != "" && ch.Code != "" {
metaWithSep = meta + "\n\n"
}
metaRunes := []rune(metaWithSep)
remaining := maxChars - len(metaRunes)
if remaining < 0 {
remaining = 0
}
codeRunes := []rune(ch.Code)
if remaining > len(codeRunes) {
remaining = len(codeRunes)
}
truncated := metaWithSep + string(codeRunes[:remaining])
return truncated, true
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buildEmbedText() eagerly constructs full = meta + "\n\n" + ch.Code and then later converts the whole string to []rune. For very large ch.Code values this forces a full copy/allocation (and a full rune slice) even though the result will be truncated, which can significantly increase memory/CPU during indexing. Prefer computing rune lengths and truncating the code portion without concatenating/allocating the full payload up front (e.g., build only the needed prefix up to maxChars).

Suggested change
var full string
if ch.Code != "" {
if meta != "" {
full = meta + "\n\n" + ch.Code
} else {
full = ch.Code
}
} else {
full = meta
}
runes := []rune(full)
if len(runes) <= maxChars {
return full, false
}
// Truncate only the Code portion — keep metadata intact.
metaWithSep := meta
if meta != "" && ch.Code != "" {
metaWithSep = meta + "\n\n"
}
metaRunes := []rune(metaWithSep)
remaining := maxChars - len(metaRunes)
if remaining < 0 {
remaining = 0
}
codeRunes := []rune(ch.Code)
if remaining > len(codeRunes) {
remaining = len(codeRunes)
}
truncated := metaWithSep + string(codeRunes[:remaining])
return truncated, true
metaWithSep := meta
if meta != "" && ch.Code != "" {
metaWithSep = meta + "\n\n"
}
metaRuneCount := len([]rune(metaWithSep))
if ch.Code == "" {
if metaRuneCount <= maxChars {
return meta, false
}
return meta, true
}
remaining := maxChars - metaRuneCount
if remaining < 0 {
return metaWithSep, true
}
codeRuneCount := 0
for idx := range ch.Code {
if codeRuneCount == remaining {
return metaWithSep + ch.Code[:idx], true
}
codeRuneCount++
}
return metaWithSep + ch.Code, false

Copilot uses AI. Check for mistakes.
}
if wasTruncated {
// Log a warning so users know which symbols were partially indexed.
// The full code is still accessible via rag_read_file_context.
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment references rag_read_file_context, but that tool name doesn't appear to exist in this codebase (the MCP tool is get_code_context). This can confuse users reading logs; consider updating the comment to the correct tool name or removing it.

Suggested change
// The full code is still accessible via rag_read_file_context.
// The full code is still accessible via get_code_context.

Copilot uses AI. Check for mistakes.
Comment thread cmd/rag-code-mcp/main.go
"properties": map[string]interface{}{
"force": map[string]interface{}{
"type": "boolean",
"description": "Force update even if version matches (default: false)",
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tool schema for apply_update documents force as defaulting to false, but ApplyUpdateTool.Execute defaults force to true when the argument is omitted. This mismatch can lead to surprising behavior for MCP clients relying on the schema. Align the schema description with the implementation, or change the tool default to match the documented default.

Suggested change
"description": "Force update even if version matches (default: false)",
"description": "Force update even if version matches (default: true)",

Copilot uses AI. Check for mistakes.
Comment thread cmd/rag-code-mcp/main.go
Comment on lines 650 to +670
@@ -655,6 +664,10 @@ func main() {
registerAgentTool(server, searchDocsTool, cfg)
registerAgentTool(server, hybridTool, cfg)
registerAgentTool(server, indexWorkspaceTool, cfg)
registerAgentTool(server, listSkillsTool, cfg)
registerAgentTool(server, installSkillTool, cfg)
registerAgentTool(server, checkUpdateTool, cfg)
registerAgentTool(server, applyUpdateTool, cfg)
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description/title focuses on embedding payload truncation (#53), but this change set also adds a skills system and updater tools (list_skills/install_skill/check_update/apply_update) plus background update checks. This looks like a significant scope increase that isn't reflected in the PR metadata; please update the PR description (and possibly the title) to cover these additions, or split the updater/skills changes into a separate PR to keep review and rollback risk manageable.

Copilot uses AI. Check for mistakes.
Comment on lines +55 to +59
metaRunes := []rune(metaWithSep)
remaining := maxChars - len(metaRunes)
if remaining < 0 {
remaining = 0
}
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buildEmbedText() does not guarantee the returned text stays within maxChars when the metadata (docstring/signature) alone exceeds the limit. In that case metaWithSep is returned in full and can still overflow the embedding model context window, reintroducing the original failure mode. Consider enforcing the cap unconditionally (e.g., always keep signature, truncate docstring next, then code) and add a test where Docstring length > maxChars.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements a new Skills System to provide contextual AI behaviors and enhances the update mechanism with local caching and Windows ZIP support. It also introduces a truncation strategy for code indexing to ensure chunks fit within model context windows. Review feedback highlights the need to ensure temporary update files are cleaned up before os.Exit, suggests memory optimizations for rune counting and string slicing in the indexer, and notes a potential overflow issue if metadata alone exceeds the character limit.

Comment thread cmd/rag-code-mcp/main.go
log.Fatalf("Failed to close temporary file for update: %v", err)
}
// Ensure the temporary file is removed after applying the update
defer os.Remove(tempFile)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The defer os.Remove(tempFile) will not execute because os.Exit(0) is called at the end of this block (line 470). In Go, defer statements are not run when os.Exit() is called. Since this is a short-lived CLI path, you should manually call os.Remove(tempFile) before os.Exit(0) or wrap this logic in a function to ensure the temporary archive is cleaned up.

"encoding/json"
"fmt"
"hash/fnv"
"log"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Add unicode/utf8 to the imports to support memory-efficient rune counting and prefixing in the buildEmbedText function.

Suggested change
"log"
"log"
"unicode/utf8"

Comment on lines +45 to +48
runes := []rune(full)
if len(runes) <= maxChars {
return full, false
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Converting the entire full string to a []rune slice just to check its length is memory-intensive for large code chunks. Consider using utf8.RuneCountInString(full) which calculates the rune count without allocating a new slice.

	if utf8.RuneCountInString(full) <= maxChars {
		return full, false
	}

Comment on lines +60 to +63
codeRunes := []rune(ch.Code)
if remaining > len(codeRunes) {
remaining = len(codeRunes)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Converting the potentially massive ch.Code string to []rune can lead to high memory usage or OOM for very large files. Since you only need a prefix of at most remaining runes, you can optimize this by converting only a safe byte-prefix of the string.

	codePrefix := ch.Code
	if len(codePrefix) > remaining*4 {
		codePrefix = codePrefix[:remaining*4]
	}
	codeRunes := []rune(codePrefix)
	if remaining > len(codeRunes) {
		remaining = len(codeRunes)
	}

if remaining > len(codeRunes) {
remaining = len(codeRunes)
}
truncated := metaWithSep + string(codeRunes[:remaining])
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If the metadata (metaWithSep) alone exceeds maxChars, the remaining count becomes 0, but the full metadata is still returned. This means the payload could still exceed the model's context window. While metadata is typically small, a strict cap would be safer to prevent the context overflow error mentioned in the PR description.

@doITmagic doITmagic closed this Apr 16, 2026
@doITmagic doITmagic deleted the fix/issue-53-embed-truncation branch April 16, 2026 17:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

"llm embedding error: the input length exceeds the context length"

2 participants