
Commit 8aeab40

Canonicalize write paths through missing ancestors; update readme and changelog
1 parent fd2cb0b commit 8aeab40

4 files changed

Lines changed: 160 additions & 59 deletions

File tree

CHANGELOG.md

Lines changed: 29 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -4,6 +4,35 @@ All notable changes to Sofos are documented in this file.
44

55
## [Unreleased]
66

7+
### Security
8+
9+
- **Windows absolute paths bypassed the external-path detection** on every filesystem-touching dispatcher (`read_file`, `write_file`, `list_directory`, `glob_files`, `create_directory`, `edit_file`, `morph_edit_file` via the shared resolver), on `execute_bash`'s path scanner, on the image loader, and on the config parser that classifies `Bash(path)` entries. All these call sites used `path.starts_with('/') || path.starts_with('~')`, which only catches the Unix variant — `C:\Users\...` or `\\server\share\...` on Windows slipped through as "relative", got joined to the workspace, and then `Path::join`'s "replace on absolute" rule silently let the path escape. Centralised into two composable helpers in `tools::utils`: `is_absolute_path` (Unix `/foo` + Windows drive / UNC) and `is_absolute_or_tilde` (adds `~` / `~/foo`). Both combine `starts_with('/')` with `Path::is_absolute` rather than relying on either alone — `Path::is_absolute` returns `false` on Windows for a Unix-shaped `/etc/passwd`, which would have re-introduced the bug in reverse.
10+
- **Tilde expansion (`~` / `~/foo`) now works cross-platform** and respects bash-style remainder semantics. Reads `HOME` on Unix and `USERPROFILE` on Windows (previously `HOME`-only, which left a Windows user typing `~/docs` with no expansion and a confusing "file not found" downstream). Composes via `PathBuf::push` so the separator between home and the remainder is platform-native. Leading separators in the remainder are trimmed before composition, so `~//foo` resolves to `~/foo` as bash would — rather than to the raw `/foo` fragment that `PathBuf::push`'s replace-on-absolute behaviour would otherwise produce.
11+
- **`glob_files` could enumerate paths outside the workspace without a permission check.** `path=".."` landed on `workspace.join("..")`, which `read_dir` happily walked as the workspace parent; `path="/etc"` was worse — Rust's `Path::join` replaces with absolute paths, so the walk started at `/etc` directly. Neither went through any permission check. The glob path is now canonicalized and routed through the same `check_read_access` gate used by `list_directory` and `read_file`: relative escapes and unauthorised absolute paths are blocked, while explicitly-allowed external directories (matching a `Read(...)` rule or approved via the interactive prompt) still work for legitimate "review `/some/other/repo`" requests.
12+
- **`glob_files` no longer follows symlinks by default**, matching ripgrep's `rg` behaviour (needs `-L` to follow). Prevents a workspace-internal symlink pointing outside the workspace from leaking filenames under the target directory via the glob walk. Set `follow_symlinks: true` to opt in to the prior behaviour.
13+
- **MCP tool responses are now bounded.** The MCP server is a separate process sofos can't fully sandbox, but it CAN cap the response text before handing it to the model. Previously an oversized MCP reply could reproduce the same "string too long" HTTP 400 that internal tools used to trigger; now the `text` field is truncated at ~1 MB with a hint that the cap came from sofos, not the server.
14+
- **MCP image attachments are now capped** at 10 images or ~20 MB base64 bytes per response, whichever hits first. Multimodal providers count images against a separate budget from text, so a chatty MCP server returning dozens of screenshots could blow past provider limits even when the text was short. The cap is greedy: images are walked in order and kept whenever they still fit under both caps, so a single oversized image in the middle of the response is skipped without blocking smaller images that come after it. A note is appended to the response text after text truncation (so it always survives) telling the model how many attachments were dropped.
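The greedy image cap described in the last Security entry can be sketched as follows. This is only an illustration, not the project's code: `select_images`, the constant names, and the exact byte value chosen for "~20 MB" are assumptions, and images are represented by their base64 sizes alone.

```rust
/// Hypothetical caps mirroring the changelog's "10 images or ~20 MB".
pub const MAX_IMAGES: usize = 10;
pub const MAX_TOTAL_BYTES: usize = 20 * 1024 * 1024;

/// Walks the images in order and keeps each one that still fits under both
/// caps. Returns the indices of the kept images and the number dropped.
/// An oversized image is skipped without blocking smaller ones after it.
pub fn select_images(
    sizes: &[usize],
    max_images: usize,
    max_total_bytes: usize,
) -> (Vec<usize>, usize) {
    let mut kept = Vec::new();
    let mut total = 0usize;
    for (index, &size) in sizes.iter().enumerate() {
        let fits_count = kept.len() < max_images;
        // checked_add guards against overflow on absurd sizes.
        let fits_bytes = total
            .checked_add(size)
            .map_or(false, |sum| sum <= max_total_bytes);
        if fits_count && fits_bytes {
            total += size;
            kept.push(index);
        }
    }
    let dropped = sizes.len() - kept.len();
    (kept, dropped)
}
```

The dropped count is what a caller would report in the note appended to the response text.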
15+
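The two path helpers and the tilde expansion described in the Security entries might look roughly like this. The helper names `is_absolute_path` and `is_absolute_or_tilde` come from the changelog, but the bodies are a guess at the described behaviour. `expand_tilde` and `home_dir_from_env` are hypothetical names, and `~user` forms are assumed to be out of scope.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// True for Unix-shaped absolute paths (`/foo`) and for anything the host
/// platform itself treats as absolute (drive or UNC paths on Windows).
pub fn is_absolute_path(path: &str) -> bool {
    path.starts_with('/') || Path::new(path).is_absolute()
}

/// Like `is_absolute_path`, but also accepts `~` and `~/foo`.
pub fn is_absolute_or_tilde(path: &str) -> bool {
    path == "~" || path.starts_with("~/") || is_absolute_path(path)
}

/// Hypothetical helper: reads the home directory from the platform's variable.
pub fn home_dir_from_env() -> Option<PathBuf> {
    let var = if cfg!(windows) { "USERPROFILE" } else { "HOME" };
    env::var_os(var).map(PathBuf::from)
}

/// Hypothetical helper: expands `~` or `~/rest` against `home`.
/// Returns `None` for inputs that are not tilde paths (including `~user`).
pub fn expand_tilde(path: &str, home: &Path) -> Option<PathBuf> {
    let rest = path.strip_prefix('~')?;
    if !(rest.is_empty() || rest.starts_with('/')) {
        return None;
    }
    // Trim leading separators so `PathBuf::push` never sees an absolute
    // remainder, which would replace `home` instead of extending it.
    let rest = rest.trim_start_matches('/');
    let mut expanded = home.to_path_buf();
    if !rest.is_empty() {
        expanded.push(rest);
    }
    Some(expanded)
}
```

Passing `home` as a parameter keeps the expansion testable without touching the process environment.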
16+
### Fixed
17+
18+
- **Write-side path resolution now canonicalises through any number of missing intermediate directories.** When creating a new file or directory, `resolve_for_write` used to canonicalise only as far as the *immediate* parent — if the grandparent (or any further ancestor) was also missing, the resolved path stayed un-canonicalised. Whenever the canonical form of an ancestor differs from its literal form, permission rules written against the canonical prefix silently missed the write, and the operation was denied for paths that should have been allowed. Common places this happens: an intermediate symlink at any depth (platform-independent), macOS's built-in `/tmp` → `/private/tmp` redirection, Windows UNC-prefix normalisation (`C:\foo` → `\\?\C:\foo`), and case folding on case-insensitive filesystems. The resolver now walks up to the nearest existing ancestor, canonicalises it, and re-appends the missing tail components so the returned path always reflects every layer of filesystem indirection on the way down.
19+
- **`edit_file` and `morph_edit_file` no longer corrupt files larger than ~64 KB.** Both tools read the original through the same code path as the `read_file` tool output, which was truncating to the model-facing output cap before the edit was applied. Any file past the cap was silently losing its tail — and gaining a literal `[TRUNCATED: ...]` footer — on every edit. The fix moves the output-cap truncation out of the filesystem layer and into the `read_file` dispatcher, so the edit tools now see the full file regardless of size. Added a regression test (`test_edit_file_preserves_content_past_truncation_cap`) that edits a ~200 KB file and asserts the tail sentinel survives.
20+
- **Tool outputs can no longer crash the request with "string too long" (HTTP 400).** Every tool result that returns variable-size content is now bounded below OpenAI's 10 MB per-output ceiling:
21+
- `search_code` caps matched lines at 300 columns (with `--max-columns-preview`), skips files over 1 MB, excludes `target/`, `node_modules/`, `.git/`, `dist/`, and `build/` by default on top of `.gitignore`, and truncates total output to ~64 KB. Also adds `--` before the pattern so `pattern="-v"` or `pattern="--files"` is treated literally instead of flipping ripgrep's behaviour.
22+
- `glob_files` skips the same default excludes and truncates to ~1 MB — a broad pattern like `**/*` over a populated `target/` no longer returns tens of thousands of paths.
23+
- `list_directory` truncates to ~1 MB for pathological directories.
24+
- `write_file`, `edit_file`, and `morph_edit_file` diff reports truncate to ~1 MB (ANSI-highlighted diffs of large overwrites previously had no ceiling).
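The output bounds listed above all come down to cutting a string at a byte cap without splitting a UTF-8 character. A minimal sketch, assuming a hypothetical `truncate_output` helper and a made-up footer wording (the real footer text is not shown in this diff):

```rust
/// Truncates `text` to at most `max_bytes` bytes of content, backing up to
/// the nearest UTF-8 character boundary, and appends a footer that says the
/// cap was applied. The footer wording here is hypothetical.
pub fn truncate_output(text: &str, max_bytes: usize) -> String {
    if text.len() <= max_bytes {
        return text.to_string();
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    format!(
        "{}\n[TRUNCATED: showing first {} of {} bytes]",
        &text[..end],
        end,
        text.len()
    )
}
```

Per the `edit_file` entry above, a cap like this belongs in the dispatcher that formats model-facing output, not in the filesystem layer that edit tools read through.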
25+
26+
### Changed
27+
28+
- **`edit_file` / `morph_edit_file` now check Read AND Write** for external paths (previously only Write). A user who explicitly granted Write but denied or did not grant Read no longer has their file silently read to compute the diff — the scopes now hold independently. A Write-only grant that used to be sufficient for `edit_file` on an external file will now need a Read grant too.
29+
- **`create_directory`, `move_file`, `copy_file` accept external paths** (absolute and `~/`) with the appropriate permission grants, matching what `write_file` / `edit_file` already supported. `create_directory` and the destination of `move_file` / `copy_file` require Write; the source of `copy_file` requires Read; the source of `move_file` requires Write (the move removes the source). Previously these tools hard-rejected any path outside the workspace.
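The grant requirements in the two Changed entries can be summarised as a lookup table. This is a reading of the changelog text only: the `Scope`, `PathRole`, and `required_grants` names are hypothetical, and the `write_file` row is inferred from the `Write` scope description in the README.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Scope {
    Read,
    Write,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PathRole {
    /// The single path argument of a one-path tool.
    Target,
    Source,
    Destination,
}

/// Grants needed on each external path argument, per tool.
/// Tools not listed here return an empty slice.
pub fn required_grants(tool: &str) -> &'static [(PathRole, Scope)] {
    use PathRole::*;
    use Scope::*;
    match tool {
        // Edits read the file to compute the diff, then write it back.
        "edit_file" | "morph_edit_file" => &[(Target, Read), (Target, Write)],
        "write_file" | "create_directory" => &[(Target, Write)],
        "copy_file" => &[(Source, Read), (Destination, Write)],
        // A move removes the source, so the source needs Write too.
        "move_file" => &[(Source, Write), (Destination, Write)],
        _ => &[],
    }
}
```

A caller would check every returned pair against the configured rules and deny the operation if any one is missing.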
30+
31+
### Added
32+
33+
- **`search_code` and `glob_files` `include_ignored` parameter** (default `false`). Set to `true` to bypass the built-in excludes (`target/`, `node_modules/`, `.git/`, `dist/`, `build/`) and, for `search_code`, `.gitignore` / `.ignore` filtering. Only set it when you specifically need to look inside build artefacts or vendored code.
34+
- **`glob_files` `follow_symlinks` parameter** (default `false`). Set to `true` to walk through symlinks the way `rg -L` does.
35+
736
## [0.2.1] - 2026-04-20
837

938
### Fixed

README.md

Lines changed: 23 additions & 18 deletions
Original file line number | Diff line number | Diff line change
@@ -1,10 +1,10 @@
11
# Sofos Code
22

3-
![](https://github.com/alexylon/sofos-code/actions/workflows/rust.yml/badge.svg)   [![Crates.io](https://img.shields.io/crates/v/sofos.svg?color=blue)](https://crates.io/crates/sofos)
3+
![CI](https://github.com/alexylon/sofos-code/actions/workflows/rust.yml/badge.svg)   [![Crates.io](https://img.shields.io/crates/v/sofos.svg?color=blue)](https://crates.io/crates/sofos)
44

55
A blazingly fast, interactive AI coding assistant powered by Claude or GPT, implemented in pure Rust, that can generate code, edit files, and search the web - all from your terminal.
66

7-
Tested on macOS but should work on Linux and Windows as well.
7+
Tested on macOS; supported on Linux and Windows.
88

99
<div align="center"><img src="/assets/screenshot.png" style="width: 800px;" alt="Sofos Code"></div>
1010

@@ -41,7 +41,7 @@ Tested on macOS but should work on Linux and Windows as well.
4141
- **Image Vision** - Analyze local or web images, paste from clipboard with Ctrl+V
4242
- **Session History** - Auto-save with an in-TUI resume picker (`/resume` or `sofos -r`)
4343
- **Custom Instructions** - Project and personal context files
44-
- **File Operations** - Read, write, edit, list, create (sandboxed)
44+
- **File Operations** - Read, write, edit, list, glob, create, move, copy, delete (sandboxed; external paths via permission grants)
4545
- **Targeted Edits** - Diff-based `edit_file` for precise string replacements
4646
- **Ultra-Fast Editing** - Optional Morph Apply integration (10,500+ tokens/sec)
4747
- **File Search** - Find files by glob pattern (`**/*.rs`)
@@ -157,7 +157,7 @@ Exit summary shows token usage and estimated cost (based on official API pricing
157157
158158
```
159159
-p, --prompt <TEXT> One-shot mode
160-
-s, --safe-mode Start in read-only mode (native writes and bash disabled; see Safe Mode note under Available Tools)
160+
-s, --safe-mode Start in read-only mode (native writes and bash disabled)
161161
-r, --resume Resume a previous session
162162
--check-connection Check API connectivity and exit
163163
--api-key <KEY> Anthropic API key (overrides env var)
@@ -185,27 +185,31 @@ For Claude, it enables the thinking protocol and `--thinking-budget` controls to
185185
For OpenAI (gpt-5 models), `/think on` sets high reasoning effort and `/think off` sets low reasoning effort.
186186
The `--thinking-budget` parameter only applies to Claude models.
187187
188-
## Custom Instructions/Context
188+
## Custom Instructions
189189
190-
**[`AGENTS.md`](https://agents.md)** (project root, version controlled) - Project context for AI agents, team-wide conventions, architecture
191-
**`.sofos/instructions.md`** (gitignored) - Personal preferences
190+
Two files are loaded at startup and appended to the system prompt:
192191
193-
Both loaded at startup and appended to system prompt.
192+
- **[`AGENTS.md`](https://agents.md)** (project root, version controlled) — project context for AI agents: team-wide conventions, architecture, domain vocabulary.
193+
- **`.sofos/instructions.md`** (gitignored) — personal preferences that shouldn't be shared with the team.
194194

195195
## Session History
196196

197197
Conversations auto-saved to `.sofos/sessions/`. Resume with `sofos -r` or `/resume`.
198198

199199
## Available Tools
200200

201-
**File Operations:**
201+
**File Operations** (accept absolute and `~/` paths with a `Read` or `Write` grant as appropriate — see Security and Configuration):
202202
- `read_file` - Read file contents
203-
- `write_file` - Create or overwrite files
204-
- `edit_file` - Targeted string replacement edits (no API key needed)
205-
- `morph_edit_file` - Ultra-fast code editing (requires MORPH_API_KEY)
206203
- `list_directory` - List a single directory's contents
207204
- `glob_files` - Find files recursively by glob pattern (`**/*.rs`, `src/**/test_*.py`)
208-
- `create_directory`, `delete_file`, `delete_directory`, `move_file`, `copy_file` - Standard file ops
205+
- `write_file` - Create or overwrite files (append mode for chunked writes)
206+
- `edit_file` - Targeted string replacement edits (no API key needed)
207+
- `morph_edit_file` - Ultra-fast code editing (requires MORPH_API_KEY)
208+
- `create_directory` - Create a directory (and missing parents)
209+
- `move_file`, `copy_file` - Move or copy files
210+
211+
**Workspace-only file ops** (absolute / `~/` paths are rejected, even with grants — destructive ops are deliberately scoped to the workspace):
212+
- `delete_file`, `delete_directory` - Delete files or directories (prompt for confirmation)
209213
210214
**Code & Search:**
211215
- `search_code` - Fast regex-based code search (requires `ripgrep`)
@@ -218,9 +222,9 @@ Conversations auto-saved to `.sofos/sessions/`. Resume with `sofos -r` or `/resu
218222
219223
**Image Vision:** not a tool — sofos detects image paths (JPEG, PNG, GIF, WebP, up to 20 MB local) in your user messages and loads them automatically as image content blocks. Clipboard paste (Ctrl+V) works the same way. See [Image Vision](#image-vision) under Usage.
220224
221-
**Note:** Tools can access paths outside workspace when allowed via interactive prompt or config. Three separate scopes control access: `Read` (read/list), `Write` (write/edit), and `Bash` (command execution). Each scope is granted independently.
225+
**Note:** Tools can access paths outside the workspace when allowed via interactive prompt or config. Three independent scopes (`Read` / `Write` / `Bash`) gate this access — see [Security](#security) for the full model.
222226
223-
Safe mode (`--safe-mode` or `/s`) restricts the native tool set to: `list_directory`, `read_file`, `glob_files`, `web_fetch`, and `web_search` (Anthropic + OpenAI provider-native variants). MCP tools are **not** filtered by safe mode — if you've configured MCP servers with mutating tools, those remain available.
227+
Safe mode (`--safe-mode` or `/s`) restricts the native tool set to read-only operations: `list_directory`, `read_file`, `glob_files`, `web_fetch`, `web_search` (Anthropic + OpenAI provider-native variants), and `search_code` when `ripgrep` is available. MCP tools are **not** filtered by safe mode — if you've configured MCP servers with mutating tools, those remain available.
224228

225229
## MCP Servers
226230

@@ -235,8 +239,9 @@ Tools auto-discovered, prefixed with server name (e.g., `filesystem_read_file`).
235239
**Sandboxing (by default):**
236240
- ✅ Full access to workspace files/directories
237241
- ✅ External access via interactive prompts — user is asked to allow/deny, with option to remember in config
238-
- Three separate scopes: `Read` (read/list), `Write` (write/edit), `Bash` (commands with external paths)
239-
- Each scope is independently granted — Read access does not imply Write or Bash access
242+
- Three separate scopes: `Read` (read/list), `Write` (write/create/move/delete), `Bash` (commands with external paths)
243+
- Each scope is independently granted — Read access does not imply Write or Bash access, and vice versa
244+
- Tools that both read and write a file on external paths (`edit_file`, `morph_edit_file`) require **both** `Read` and `Write` grants on the path
240245

241246
**Bash Permissions (3-Tier System):**
242247

@@ -305,7 +310,7 @@ headers = { "Authorization" = "Bearer token123" }
305310
- Three scopes: `Read(path)` for reading, `Write(path)` for writing, `Bash(path)` for bash access — each independent
306311
- `Bash(path)` entries with globs (e.g. `Bash(/tmp/**)`) grant path access; plain entries (e.g. `Bash(npm test)`) grant command access
307312
- Glob patterns supported: `*` (single level), `**` (recursive)
308-
- Tilde expansion: `~` → home directory
313+
- Tilde expansion: `~` → `$HOME` on Unix, `%USERPROFILE%` on Windows
309314
- `ask` only works for Bash commands
310315
311316
\* These rules do not restrict MCP server command paths
