alexylon
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 29 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 23 additions & 18 deletions
@@ -4,6 +4,35 @@ All notable changes to Sofos are documented in this file.
 
 ## [Unreleased]
 
+### Security
+
+- **Windows absolute paths bypassed the external-path detection** on every filesystem-touching dispatcher (`read_file`, `write_file`, `list_directory`, `glob_files`, `create_directory`, `edit_file`, `morph_edit_file` via the shared resolver), on `execute_bash`'s path scanner, on the image loader, and on the config parser that classifies `Bash(path)` entries. All of these call sites used `path.starts_with('/') || path.starts_with('~')`, which only catches the Unix forms: on Windows, `C:\Users\...` or `\\server\share\...` was classified as relative and joined to the workspace, where `Path::join`'s replace-on-absolute rule silently let the path escape. The check is now centralised in two composable helpers in `tools::utils`: `is_absolute_path` (Unix `/foo` plus Windows drive and UNC paths) and `is_absolute_or_tilde` (which also accepts `~` and `~/foo`). Both combine `starts_with('/')` with `Path::is_absolute` rather than relying on either alone, because `Path::is_absolute` returns `false` on Windows for a Unix-shaped `/etc/passwd`, which would have reintroduced the bug in reverse.
+- **Tilde expansion (`~` / `~/foo`) now works cross-platform** and respects bash-style remainder semantics. Reads `HOME` on Unix and `USERPROFILE` on Windows (previously `HOME`-only, which left a Windows user typing `~/docs` with no expansion and a confusing "file not found" downstream). Composes via `PathBuf::push` so the separator between home and the remainder is platform-native. Leading separators in the remainder are trimmed before composition, so `~//foo` resolves to `~/foo` as bash would — rather than to the raw `/foo` fragment that `PathBuf::push`'s replace-on-absolute behaviour would otherwise produce.
+- **`glob_files` could enumerate paths outside the workspace without a permission check.** `path=".."` resolved to `workspace.join("..")`, which `read_dir` walked as the workspace parent; `path="/etc"` was worse, because `Path::join` discards the base when its argument is absolute, so the walk started at `/etc` directly. Neither case went through any permission check. The glob path is now canonicalized and routed through the same `check_read_access` gate used by `list_directory` and `read_file`: relative escapes and unauthorised absolute paths are blocked, while explicitly allowed external directories (matching a `Read(...)` rule or approved via the interactive prompt) still work for legitimate "review `/some/other/repo`" requests.
+- **`glob_files` no longer follows symlinks by default**, matching ripgrep's `rg` behaviour (needs `-L` to follow). Prevents a workspace-internal symlink pointing outside the workspace from leaking filenames under the target directory via the glob walk. Set `follow_symlinks: true` to opt in to the prior behaviour.
+- **MCP tool responses are now bounded.** An MCP server is a separate process that sofos can't fully sandbox, but sofos can cap the response text before handing it to the model. Previously an oversized MCP reply could reproduce the same "string too long" HTTP 400 that internal tools used to trigger; now the `text` field is truncated at ~1 MB with a hint that the cap came from sofos, not the server.
+- **MCP image attachments are now capped** at 10 images or ~20 MB base64 bytes per response, whichever hits first. Multimodal providers count images against a separate budget from text, so a chatty MCP server returning dozens of screenshots could blow past provider limits even when the text was short. The cap is greedy: images are walked in order and kept whenever they still fit under both caps, so a single oversized image in the middle of the response is skipped without blocking smaller images that come after it. A note is appended to the response text after text truncation (so it always survives) telling the model how many attachments were dropped.
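A minimal sketch of the greedy attachment cap described in the MCP image entry above, assuming each image is represented only by its base64 length in bytes. The name `select_images` and its parameters are hypothetical and are not taken from the sofos source; the limits are passed in rather than hard-coded.

```rust
/// Walks the images in order and keeps each one that still fits under both
/// the count cap and the byte cap. Returns the kept indices and the number
/// of dropped attachments (hypothetical helper, for illustration only).
pub fn select_images(
    sizes: &[usize],
    max_images: usize,
    max_total_bytes: usize,
) -> (Vec<usize>, usize) {
    let mut kept = Vec::new();
    let mut total = 0usize;
    for (index, &size) in sizes.iter().enumerate() {
        // An oversized image is skipped, but the walk continues so that
        // smaller images after it can still be kept.
        if kept.len() < max_images && total.saturating_add(size) <= max_total_bytes {
            total += size;
            kept.push(index);
        }
    }
    let dropped = sizes.len() - kept.len();
    (kept, dropped)
}
```

With a byte cap of 10, sizes `[5, 100, 3]` keep the first and third image and drop the second, which is the "skipped without blocking" behaviour the entry describes.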
+
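The path helpers and tilde expansion from the Security entries above could look roughly like the sketch below. Only the names `is_absolute_path` and `is_absolute_or_tilde` come from the changelog; the bodies, `tilde_remainder`, `home_from_env`, and `expand_tilde` are assumptions for illustration. Note that `Path::is_absolute` is platform-dependent, so on a Unix host this sketch would not flag `C:\...` as absolute.

```rust
use std::path::{Path, PathBuf};

/// True for Unix-shaped `/foo` on every platform, plus whatever the host
/// platform's `Path::is_absolute` accepts (drive and UNC paths on Windows).
pub fn is_absolute_path(path: &str) -> bool {
    path.starts_with('/') || Path::new(path).is_absolute()
}

/// Returns the text after a leading `~` when the path is `~` or `~<sep>...`.
/// `~user` forms are left alone (an assumption of this sketch).
fn tilde_remainder(path: &str) -> Option<&str> {
    let rest = path.strip_prefix('~')?;
    if rest.is_empty() || rest.starts_with(std::path::is_separator) {
        Some(rest)
    } else {
        None
    }
}

pub fn is_absolute_or_tilde(path: &str) -> bool {
    is_absolute_path(path) || tilde_remainder(path).is_some()
}

/// Home directory lookup: `HOME` on Unix, `USERPROFILE` on Windows.
pub fn home_from_env() -> Option<String> {
    let var = if cfg!(windows) { "USERPROFILE" } else { "HOME" };
    std::env::var(var).ok()
}

/// Expands `~` against an explicit `home` so the sketch is testable without
/// touching the environment.
pub fn expand_tilde(path: &str, home: &str) -> PathBuf {
    let rest = match tilde_remainder(path) {
        Some(rest) => rest,
        None => return PathBuf::from(path),
    };
    // Trim leading separators first: pushing an absolute remainder would
    // replace the home directory instead of extending it.
    let rest = rest.trim_start_matches(std::path::is_separator);
    let mut expanded = PathBuf::from(home);
    if !rest.is_empty() {
        expanded.push(rest);
    }
    expanded
}
```

The replace-on-absolute rule that made the original bug exploitable is easy to reproduce on Unix: `Path::new("/workspace").join("/etc")` yields `/etc`, not `/workspace/etc`.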
+### Fixed
+
+- **Write-side path resolution now canonicalises through any number of missing intermediate directories.** When creating a new file or directory, `resolve_for_write` used to canonicalise only as far as the *immediate* parent — if the grandparent (or any further ancestor) was also missing, the resolved path stayed un-canonicalised. Whenever the canonical form of an ancestor differs from its literal form, permission rules written against the canonical prefix silently missed the write, and the operation was denied for paths that should have been allowed. Common places this happens: an intermediate symlink at any depth (platform-independent), macOS's built-in `/tmp` → `/private/tmp` redirection, Windows UNC-prefix normalisation (`C:\foo` → `\\?\C:\foo`), and case folding on case-insensitive filesystems. The resolver now walks up to the nearest existing ancestor, canonicalises it, and re-appends the missing tail components so the returned path always reflects every layer of filesystem indirection on the way down.
+- **`edit_file` and `morph_edit_file` no longer corrupt files larger than ~64 KB.** Both tools read the original through the same code path as the `read_file` tool output, which was truncating to the model-facing output cap before the edit was applied. Any file past the cap was silently losing its tail — and gaining a literal `[TRUNCATED: ...]` footer — on every edit. The fix moves the output-cap truncation out of the filesystem layer and into the `read_file` dispatcher, so the edit tools now see the full file regardless of size. Added a regression test (`test_edit_file_preserves_content_past_truncation_cap`) that edits a ~200 KB file and asserts the tail sentinel survives.
+- **Tool outputs can no longer crash the request with "string too long" (HTTP 400).** Every tool result that returns variable-size content is now bounded below OpenAI's 10 MB per-output ceiling:
+  - `search_code` caps matched lines at 300 columns (with `--max-columns-preview`), skips files over 1 MB, excludes `target/`, `node_modules/`, `.git/`, `dist/`, and `build/` by default on top of `.gitignore`, and truncates total output to ~64 KB. Also adds `--` before the pattern so `pattern="-v"` or `pattern="--files"` is treated literally instead of flipping ripgrep's behaviour.
+  - `glob_files` skips the same default excludes and truncates to ~1 MB — a broad pattern like `**/*` over a populated `target/` no longer returns tens of thousands of paths.
+  - `list_directory` truncates to ~1 MB for pathological directories.
+  - `write_file`, `edit_file`, and `morph_edit_file` diff reports truncate to ~1 MB (ANSI-highlighted diffs of large overwrites previously had no ceiling).
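The output caps listed above all reduce to truncating a string at a byte budget without splitting a UTF-8 character. A minimal sketch, assuming a hypothetical helper `truncate_with_note` (the real sofos function, its name, and its footer text are not shown in this diff):

```rust
/// Returns `text` unchanged when it fits in `max_bytes`; otherwise cuts it at
/// the nearest char boundary at or below the cap and appends `note`.
pub fn truncate_with_note(text: &str, max_bytes: usize, note: &str) -> String {
    if text.len() <= max_bytes {
        return text.to_string();
    }
    let mut end = max_bytes;
    // Index 0 is always a char boundary, so this loop terminates.
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    format!("{}{}", &text[..end], note)
}
```

Per the `edit_file` fix above, a cap like this belongs in the dispatcher that formats model-facing output, not in the filesystem layer that the edit tools read through.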
+
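The "walk up to the nearest existing ancestor" approach from the write-side resolution entry above can be sketched as follows. The function name `canonicalize_for_write` is hypothetical (the changelog only names `resolve_for_write`), and this version gives up on paths whose missing tail ends in `..`, which the real resolver may handle differently.

```rust
use std::ffi::OsString;
use std::io;
use std::path::{Path, PathBuf};

/// Canonicalises the deepest existing ancestor of `path`, then re-appends the
/// components that do not exist yet.
pub fn canonicalize_for_write(path: &Path) -> io::Result<PathBuf> {
    let mut missing: Vec<OsString> = Vec::new();
    let mut current = path;
    loop {
        match current.canonicalize() {
            Ok(mut resolved) => {
                // Components were collected leaf-first, so replay in reverse.
                for part in missing.iter().rev() {
                    resolved.push(part);
                }
                return Ok(resolved);
            }
            Err(err) if err.kind() == io::ErrorKind::NotFound => {
                match (current.file_name(), current.parent()) {
                    (Some(name), Some(parent)) => {
                        missing.push(name.to_os_string());
                        // A bare relative name has an empty parent; treat it
                        // as the current directory.
                        current = if parent.as_os_str().is_empty() {
                            Path::new(".")
                        } else {
                            parent
                        };
                    }
                    _ => return Err(err),
                }
            }
            Err(err) => return Err(err),
        }
    }
}
```

On macOS, for example, a target under `/tmp` with two missing directories would come back prefixed with `/private/tmp`, so a permission rule written against the canonical prefix matches.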
+### Changed
+
+- **`edit_file` / `morph_edit_file` now check Read AND Write** for external paths (previously only Write). A user who explicitly granted Write but denied or did not grant Read no longer has their file silently read to compute the diff — the scopes now hold independently. A Write-only grant that used to be sufficient for `edit_file` on an external file will now need a Read grant too.
+- **`create_directory`, `move_file`, `copy_file` accept external paths** (absolute and `~/`) with the appropriate permission grants, matching what `write_file` / `edit_file` already supported. `create_directory` and the destination of `move_file` / `copy_file` require Write; the source of `copy_file` requires Read; the source of `move_file` requires Write (the move removes the source). Previously these tools hard-rejected any path outside the workspace.
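The scope requirements in the two entries above can be summarised as a lookup table. The enum and function below are hypothetical names used only to restate the rules for external paths; they are not the sofos permission API.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Scope {
    Read,
    Write,
}

/// The external-path operations named in the changelog entries above.
#[derive(Debug, Clone, Copy)]
pub enum ExternalOp {
    EditFile, // also covers morph_edit_file
    CreateDirectory,
    MoveSource,
    MoveDestination,
    CopySource,
    CopyDestination,
}

/// Scopes that must all be granted before the operation may touch the path.
pub fn required_scopes(op: ExternalOp) -> &'static [Scope] {
    match op {
        // Editing reads the file to compute the diff, then writes it back.
        ExternalOp::EditFile => &[Scope::Read, Scope::Write],
        ExternalOp::CreateDirectory => &[Scope::Write],
        // Moving removes the source, so the source needs Write, not Read.
        ExternalOp::MoveSource => &[Scope::Write],
        ExternalOp::MoveDestination => &[Scope::Write],
        ExternalOp::CopySource => &[Scope::Read],
        ExternalOp::CopyDestination => &[Scope::Write],
    }
}
```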
+
+### Added
+
+- **`search_code` and `glob_files` `include_ignored` parameter** (default `false`). Set to `true` to bypass the built-in excludes (`target/`, `node_modules/`, `.git/`, `dist/`, `build/`) and, for `search_code`, `.gitignore` / `.ignore` filtering. Only set it when you specifically need to look inside build artefacts or vendored code.
+- **`glob_files` `follow_symlinks` parameter** (default `false`). Set to `true` to walk through symlinks the way `rg -L` does.
+
 ## [0.2.1] - 2026-04-20
 
 ### Fixed
 
@@ -1,10 +1,10 @@
 # Sofos Code
 
-![](https://github.com/alexylon/sofos-code/actions/workflows/rust.yml/badge.svg) &nbsp; [![Crates.io](https://img.shields.io/crates/v/sofos.svg?color=blue)](https://crates.io/crates/sofos)
+![CI](https://github.com/alexylon/sofos-code/actions/workflows/rust.yml/badge.svg) &nbsp; [![Crates.io](https://img.shields.io/crates/v/sofos.svg?color=blue)](https://crates.io/crates/sofos)
 
 A blazingly fast, interactive AI coding assistant powered by Claude or GPT, implemented in pure Rust, that can generate code, edit files, and search the web - all from your terminal.
 
-Tested on macOS but should work on Linux and Windows as well.
+Tested on macOS; supported on Linux and Windows.
 
 <div align="center"><img src="/assets/screenshot.png" style="width: 800px;" alt="Sofos Code"></div>
 
@@ -41,7 +41,7 @@ Tested on macOS but should work on Linux and Windows as well.
 - **Image Vision** - Analyze local or web images, paste from clipboard with Ctrl+V
 - **Session History** - Auto-save with an in-TUI resume picker (`/resume` or `sofos -r`)
 - **Custom Instructions** - Project and personal context files
-- **File Operations** - Read, write, edit, list, create (sandboxed)
+- **File Operations** - Read, write, edit, list, glob, create, move, copy, delete (sandboxed; external paths via permission grants)
 - **Targeted Edits** - Diff-based `edit_file` for precise string replacements
 - **Ultra-Fast Editing** - Optional Morph Apply integration (10,500+ tokens/sec)
 - **File Search** - Find files by glob pattern (`**/*.rs`)
@@ -157,7 +157,7 @@ Exit summary shows token usage and estimated cost (based on official API pricing
 
 ```
 -p, --prompt <TEXT>          One-shot mode
--s, --safe-mode              Start in read-only mode (native writes and bash disabled; see Safe Mode note under Available Tools)
+-s, --safe-mode              Start in read-only mode (native writes and bash disabled)
 -r, --resume                 Resume a previous session
     --check-connection       Check API connectivity and exit
     --api-key <KEY>          Anthropic API key (overrides env var)
@@ -185,27 +185,31 @@ For Claude, it enables the thinking protocol and `--thinking-budget` controls to
 For OpenAI (gpt-5 models), `/think on` sets high reasoning effort and `/think off` sets low reasoning effort. 
 The `--thinking-budget` parameter only applies to Claude models.
 
-## Custom Instructions/Context
+## Custom Instructions
 
-**[`AGENTS.md`](https://agents.md)** (project root, version controlled) - Project context for AI agents, team-wide conventions, architecture
-**`.sofos/instructions.md`** (gitignored) - Personal preferences
+Two files are loaded at startup and appended to the system prompt:
 
-Both loaded at startup and appended to system prompt.
+- **[`AGENTS.md`](https://agents.md)** (project root, version controlled) — project context for AI agents: team-wide conventions, architecture, domain vocabulary.
+- **`.sofos/instructions.md`** (gitignored) — personal preferences that shouldn't be shared with the team.
 
 ## Session History
 
 Conversations auto-saved to `.sofos/sessions/`. Resume with `sofos -r` or `/resume`.
 
 ## Available Tools
 
-**File Operations:**
+**File Operations** (accept absolute and `~/` paths with a `Read` or `Write` grant as appropriate — see Security and Configuration):
 - `read_file` - Read file contents
-- `write_file` - Create or overwrite files
-- `edit_file` - Targeted string replacement edits (no API key needed)
-- `morph_edit_file` - Ultra-fast code editing (requires MORPH_API_KEY)
 - `list_directory` - List a single directory's contents
 - `glob_files` - Find files recursively by glob pattern (`**/*.rs`, `src/**/test_*.py`)
-- `create_directory`, `delete_file`, `delete_directory`, `move_file`, `copy_file` - Standard file ops
+- `write_file` - Create or overwrite files (append mode for chunked writes)
+- `edit_file` - Targeted string replacement edits (no API key needed)
+- `morph_edit_file` - Ultra-fast code editing (requires MORPH_API_KEY)
+- `create_directory` - Create a directory (and missing parents)
+- `move_file`, `copy_file` - Move or copy files
+
+**Workspace-only file ops** (absolute / `~/` paths are rejected, even with grants — destructive ops are deliberately scoped to the workspace):
+- `delete_file`, `delete_directory` - Delete files or directories (prompt for confirmation)
 
 **Code & Search:**
 - `search_code` - Fast regex-based code search (requires `ripgrep`)
@@ -218,9 +222,9 @@ Conversations auto-saved to `.sofos/sessions/`. Resume with `sofos -r` or `/resu
 
 **Image Vision:** not a tool — sofos detects image paths (JPEG, PNG, GIF, WebP, up to 20 MB local) in your user messages and loads them automatically as image content blocks. Clipboard paste (Ctrl+V) works the same way. See [Image Vision](#image-vision) under Usage.
 
-**Note:** Tools can access paths outside workspace when allowed via interactive prompt or config. Three separate scopes control access: `Read` (read/list), `Write` (write/edit), and `Bash` (command execution). Each scope is granted independently.
+**Note:** Tools can access paths outside the workspace when allowed via interactive prompt or config. Three independent scopes (`Read` / `Write` / `Bash`) gate this access — see [Security](#security) for the full model.
 
-Safe mode (`--safe-mode` or `/s`) restricts the native tool set to: `list_directory`, `read_file`, `glob_files`, `web_fetch`, and `web_search` (Anthropic + OpenAI provider-native variants). MCP tools are **not** filtered by safe mode — if you've configured MCP servers with mutating tools, those remain available.
+Safe mode (`--safe-mode` or `/s`) restricts the native tool set to read-only operations: `list_directory`, `read_file`, `glob_files`, `web_fetch`, `web_search` (Anthropic + OpenAI provider-native variants), and `search_code` when `ripgrep` is available. MCP tools are **not** filtered by safe mode — if you've configured MCP servers with mutating tools, those remain available.
 
 ## MCP Servers
 
@@ -235,8 +239,9 @@ Tools auto-discovered, prefixed with server name (e.g., `filesystem_read_file`).
 **Sandboxing (by default):**
 - ✅ Full access to workspace files/directories
 - ✅ External access via interactive prompts — user is asked to allow/deny, with option to remember in config
-- Three separate scopes: `Read` (read/list), `Write` (write/edit), `Bash` (commands with external paths)
-- Each scope is independently granted — Read access does not imply Write or Bash access
+- Three separate scopes: `Read` (read/list), `Write` (write/edit/create/move/copy), `Bash` (commands with external paths)
+- Each scope is independently granted — Read access does not imply Write or Bash access, and vice versa
+- Tools that both read and write a file on external paths (`edit_file`, `morph_edit_file`) require **both** `Read` and `Write` grants on the path
 
 **Bash Permissions (3-Tier System):**
 
@@ -305,7 +310,7 @@ headers = { "Authorization" = "Bearer token123" }
 - Three scopes: `Read(path)` for reading, `Write(path)` for writing, `Bash(path)` for bash access — each independent
 - `Bash(path)` entries with globs (e.g. `Bash(/tmp/**)`) grant path access; plain entries (e.g. `Bash(npm test)`) grant command access
 - Glob patterns supported: `*` (single level), `**` (recursive)
-- Tilde expansion: `~` → home directory
+- Tilde expansion: `~` → `$HOME` on Unix, `%USERPROFILE%` on Windows
 - `ask` only works for Bash commands
 
 \* These rules do not restrict MCP server command paths