CHANGELOG.md
Lines changed: 8 additions & 0 deletions
@@ -4,6 +4,14 @@ All notable changes to Sofos are documented in this file.
 
 ## [Unreleased]
 
+### Added
+
+- **New `view_image` tool lets the model open an image on demand.** Given a local image file path or an `http(s)://` URL, the tool attaches the image to the conversation so the model can describe it. For a folder of images, the model is told to call `list_directory` first and then `view_image` once per file. Supports JPEG, PNG, GIF, and WebP up to 20 MB per local file; URLs are passed through to the model provider, which fetches them on its side. External paths reuse the same Read-permission prompt as `read_file`, so granting access to a directory once covers both tools. Local images larger than 2048 pixels on the long side are downscaled proportionally before they reach the model so a 4K screenshot does not burn through the per-image token budget; smaller images are sent unchanged.
+
+### Changed
+
+- **Image references typed inline in a prompt are no longer auto-attached.** Previously, a path or URL with an image extension typed in a message would be detected, stripped from the text, and attached as an image content block. Two ergonomic problems followed: vague asks such as "look at the image in `assets/`" without a filename did nothing, and unrelated text that happened to end in `.png` could be misread as a path. Attaching an image now goes through the new `view_image` tool, which the model invokes after reading the prompt; clipboard paste (Ctrl-V) continues to attach images directly.
+
 ### Security
 
 - **Shell command and process substitution are now blocked in bash commands.** Constructs such as `echo $(rm bad)`, backtick substitution, and process substitution `<(cmd)` / `>(cmd)` previously slipped past the permission system because only the outer command name was checked. They are now refused before the command runs, with a clear message that names the marker. Single-quoted literals and arithmetic expansion `$((expr))` continue to work.
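Note on the Security entry above: the diff does not include the checking code, so the following is only a minimal sketch of how such a scan could work, not the Sofos implementation. The function name `find_substitution` is hypothetical. It assumes a single left-to-right pass that treats single-quoted text as literal, skips backslash-escaped bytes, lets `$((` through as arithmetic expansion, and reports the first substitution marker it sees.

```rust
/// Hypothetical helper (not from the Sofos source): returns the first
/// substitution marker found in `cmd`, or `None` when there is none.
pub fn find_substitution(cmd: &str) -> Option<&'static str> {
    let bytes = cmd.as_bytes();
    let mut i = 0;
    let mut in_single = false;
    let mut in_double = false;

    while i < bytes.len() {
        let c = bytes[i];
        let next = bytes.get(i + 1).copied();

        if in_single {
            // Everything up to the closing single quote is literal.
            if c == b'\'' {
                in_single = false;
            }
            i += 1;
            continue;
        }

        match c {
            b'\\' => {
                // Skip the backslash and the byte it escapes.
                i += 2;
                continue;
            }
            b'\'' if !in_double => in_single = true,
            b'"' => in_double = !in_double,
            // Backticks substitute both bare and inside double quotes.
            b'`' => return Some("`"),
            b'$' if next == Some(b'(') => {
                if bytes.get(i + 2) == Some(&b'(') {
                    // `$((` opens arithmetic expansion, which stays allowed.
                    i += 3;
                    continue;
                }
                return Some("$(");
            }
            // Process substitution is only active outside double quotes.
            b'<' if !in_double && next == Some(b'(') => return Some("<("),
            b'>' if !in_double && next == Some(b'(') => return Some(">("),
            _ => {}
        }
        i += 1;
    }
    None
}
```

Under these assumptions, `find_substitution("echo $(rm bad)")` yields `Some("$(")`, while `find_substitution("echo '$(rm bad)'")` and `find_substitution("echo $((1 + 2))")` both yield `None`. Treating every `$((` as arithmetic is a simplification; the scan keeps going after it, so a substitution nested inside the arithmetic expression is still reported.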
README.md
Lines changed: 12 additions & 7 deletions
@@ -62,7 +62,7 @@ Sofos provides an AI assistant inside your terminal with controlled access to yo
 - create, move, copy, and delete files with permission checks;
 - run safe build and test commands;
 - fetch documentation and use provider-native web search;
-- review local, clipboard, or web images;
+- open local image files or remote image URLs through the `view_image` tool, and accept clipboard pastes directly;
 - update a visible task plan during multi-step work;
 - save and resume conversations;
 - connect to external tools through Model Context Protocol servers.
@@ -81,7 +81,7 @@ The assistant can act through tools, but it does not do so silently: tool calls
 - **Strong permission model** — independent Read, Write, and Bash grants for paths outside the workspace.
 - **Bash safety** — allowed, denied, and ask tiers, plus structural checks for parent traversal, redirection, and dangerous git operations.
 - **Safe mode** — read-only native tools for review-only sessions.
-- **Image vision** — local images, web images, and clipboard paste.
+- **Image vision** — `view_image` tool for local files and remote URLs, plus clipboard paste.
 - **MCP integration** — connect additional tool servers through stdio or streamable HTTP.
 - **Session persistence** — saved conversations, resume picker, restored safe mode, restored model where compatible, and persisted cost counters.
 - **Cost visibility** — token totals, cache hit reporting, and provider-specific price estimates.
@@ -214,21 +214,24 @@ sofos -p "Create a high-level summary of this crate" --safe-mode
 
 ### Image vision
 
-Include image paths or URLs directly in your message, or paste images from the clipboard.
+Ask about an image by referring to it in your message. The model calls the `view_image` tool to open the file or URL you mention.
 
 ```text
 What is wrong in ./screenshots/error.png?
-Describe "./docs/architecturediagram.webp".
+Describe ./docs/architecture-diagram.webp.
 Review https://example.com/chart.png
+What do you see in the images in ./assets/?
 ```
 
+For a folder, the model lists the directory first and then opens each image one by one.
+
 Clipboard paste:
 
 ```text
 Ctrl+V # Inserts a numbered marker such as ①.
 ```
 
-Supported formats: JPEG, PNG, GIF, and WebP. Local images are capped at 20 MB. Paths with spaces should be quoted. Images outside the workspace require Read permission.
+Supported formats: JPEG, PNG, GIF, and WebP. Local images are capped at 20 MB. Images larger than 2048 pixels on the long side are scaled down proportionally before being sent to the model, so a 4K screenshot does not balloon your token budget. Images outside the workspace require Read permission the first time, just like reading a file.
 
 ---
 
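Note on the resize rule in the README hunk above: the arithmetic behind "scaled down proportionally" to 2048 pixels on the long side can be sketched as below. This is an illustration only. The helper name `fit_within`, the constant name, and the round-to-nearest behaviour are assumptions; the diff does not show how Sofos rounds or which resampling filter it uses.

```rust
/// Longest allowed side in pixels, taken from the README text.
pub const MAX_LONG_SIDE: u32 = 2048;

/// Hypothetical helper (not from the Sofos source): target dimensions for an
/// image of `width` x `height`. Images whose long side already fits are
/// returned unchanged.
pub fn fit_within(width: u32, height: u32) -> (u32, u32) {
    let long = width.max(height);
    if long <= MAX_LONG_SIDE {
        return (width, height);
    }
    // Scale each side by MAX_LONG_SIDE / long in integer arithmetic,
    // rounding to nearest and never letting a side collapse to zero.
    let scale = |side: u32| -> u32 {
        let scaled = (side as u64 * MAX_LONG_SIDE as u64 + long as u64 / 2) / long as u64;
        scaled.max(1) as u32
    };
    (scale(width), scale(height))
}
```

With these assumptions a 3840 x 2160 screenshot maps to 2048 x 1152, and a 1920 x 1080 image is left at its original size.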
@@ -311,10 +314,11 @@ Provider mapping:
 | `delete_directory` | Delete a directory after confirmation. External paths require Write permission. |
 | `execute_bash` | Run approved shell commands through the bash permission system. |
 | `update_plan` | Show the current multi-step task plan with `pending`, `in_progress`, and `completed` statuses. |
+| `view_image` | Attach a local image file or an `http(s)://` URL to the conversation so the model can see it. |
 | `web_fetch` | Fetch a URL and return readable text. |
 | `web_search` | Provider-native web search. |
 
-Image vision is not a tool. Sofos detects supported image paths and URLs in user messages and converts them into image content blocks before sending the request.
+Clipboard pastes are not routed through a tool: pressing Ctrl-V in the prompt attaches the image directly to the message.
 
 ### Safe mode tools
 
@@ -325,6 +329,7 @@ Safe mode is enabled with `--safe-mode` or `/s`. It restricts the native tool se
 - `glob_files`;
 - `search_code` when ripgrep is installed;
 - `update_plan`;
+- `view_image`;
 - `web_fetch`;
 - `web_search`.
 
@@ -574,7 +579,7 @@ See [`RELEASE.md`](RELEASE.md) for the full process.
 | Path denied | Add a `Read`, `Write`, or `Bash` rule, or approve the interactive prompt. |
 | External edit denied | `edit_file` and `morph_edit_file` need both Read and Write for external files. |
 | Code search unavailable | Install `ripgrep` and ensure `rg` is on `PATH`. |
-| Image path with spaces fails | Quote the path: `"path/with spaces/image.png"`. |
+| Image not opening | Mention the image by path or URL in your message; the model will call `view_image`. For a folder, ask it to look in the folder and it will list and open each image. |
 | Terminal does not insert newline with Shift+Enter | Use Alt+Enter or Ctrl+Enter. |
 | Build problems | Run `rustup update`, then `cargo clean` and `cargo build`. |
STRUCTURE.md
Lines changed: 14 additions & 14 deletions
@@ -275,7 +275,7 @@ src/
 │ ├── codesearch.rs
 │ │ # Ripgrep-backed code search with ignore policy, file-type filters, and output limits.
 │ ├── image.rs
-│ │ # Local and web image detection, validation, encoding, and message-content conversion.
+│ │ # Image loader used by the `view_image` tool: format detection, 20 MB size cap, automatic resize to 2048 pixels on the long side, base64 encoding, and Read-permission integration.
 │ ├── morph_validate.rs
 │ │ # Safety checks that reject suspicious or truncated Morph Apply output before writing files.
 │ ├── plan.rs
@@ -449,7 +449,7 @@ It contains:
 - image size enforcement;
 - numbered marker handling used by the TUI input flow.
 
-It does not own general image path loading. Local and web image detection for user messages lives in `tools/image.rs`.
+It does not own image loading from disk. Local and remote image loading for the `view_image` tool lives in `tools/image.rs`.
 
 ---
 
@@ -995,23 +995,23 @@ Rules:
 
 ### 7.8 `tools/image.rs`
 
-`tools/image.rs` owns image detection and loading for user messages.
+`tools/image.rs` owns the image loader behind the `view_image` tool.
 
 It contains:
 
-- local image path parsing;
-- web image URL detection;
-- supported-format checks;
-- base64 encoding;
-- media-type assignment;
-- size enforcement;
-- integration with workspace and external Read permissions.
+- decode plus optional resize (long side fits within 2048 pixels) before the bytes reach the model;
+- byte-level format detection: PNG, JPEG, GIF, and WebP pass through unchanged when small enough; other decodable formats (e.g. BMP) are re-encoded as PNG;
+- base64 encoding and media-type assignment;
+- the 20 MB per-file size cap on the raw bytes;
+- canonical workspace resolution so inside/outside classification compares the same path shape on both sides;
+- integration with the shared Read-permission grant set, so a single "Allow Read access to /foo?" decision answered for `read_file` also covers `view_image`;
+- a URL passthrough that hands `http(s)://` inputs to the model provider unchanged.
 
 Rules:
 
-- Image paths in user text become image content blocks before provider requests.
-- Unsupported or oversized images should produce clear errors.
-- Failed web-image loading can be retried without discarding the user's text.
+- Local files outside the workspace go through the same interactive Read prompt as `read_file`.
+- Files that fail to decode or exceed the size cap produce errors that name the cause.
+- The loader never fetches remote URLs itself; the model provider does that on its side.
 
 ### 7.9 `tools/morph_validate.rs`
 
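Note on the `tools/image.rs` hunk above: "byte-level format detection" and the size cap on raw bytes could look roughly like the sketch below. It is not the file's actual contents. The names `ImageFormat`, `sniff_format`, `check_image`, and `MAX_IMAGE_BYTES` are hypothetical, and reading "20 MB" as 20 * 1024 * 1024 bytes is an assumption. Only the four pass-through formats are recognised; the re-encode path for other decodable formats such as BMP is left out.

```rust
/// Assumed cap: the diff says "20 MB" without defining the unit; 2^20-byte
/// megabytes are used here for illustration.
pub const MAX_IMAGE_BYTES: usize = 20 * 1024 * 1024;

/// Hypothetical enum covering the four pass-through formats.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ImageFormat {
    Png,
    Jpeg,
    Gif,
    WebP,
}

impl ImageFormat {
    /// Media type that would accompany the base64 payload.
    pub fn media_type(self) -> &'static str {
        match self {
            ImageFormat::Png => "image/png",
            ImageFormat::Jpeg => "image/jpeg",
            ImageFormat::Gif => "image/gif",
            ImageFormat::WebP => "image/webp",
        }
    }
}

/// Identify the format from leading signature bytes, ignoring the file name.
pub fn sniff_format(bytes: &[u8]) -> Option<ImageFormat> {
    if bytes.starts_with(b"\x89PNG\r\n\x1a\n") {
        Some(ImageFormat::Png)
    } else if bytes.starts_with(&[0xFF, 0xD8, 0xFF]) {
        Some(ImageFormat::Jpeg)
    } else if bytes.starts_with(b"GIF87a") || bytes.starts_with(b"GIF89a") {
        Some(ImageFormat::Gif)
    } else if bytes.len() >= 12 && &bytes[0..4] == b"RIFF" && &bytes[8..12] == b"WEBP" {
        // WebP is a RIFF container whose form type at offset 8 is "WEBP".
        Some(ImageFormat::WebP)
    } else {
        None
    }
}

/// Apply the size cap first, then the signature check, so each error
/// names its cause.
pub fn check_image(bytes: &[u8]) -> Result<ImageFormat, String> {
    if bytes.len() > MAX_IMAGE_BYTES {
        return Err(format!(
            "image is {} bytes, over the {} byte cap",
            bytes.len(),
            MAX_IMAGE_BYTES
        ));
    }
    sniff_format(bytes).ok_or_else(|| "unrecognized image format".to_string())
}
```

Checking signatures rather than extensions means a PNG saved as `photo.jpg` is still labelled `image/png`.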
@@ -1502,7 +1502,7 @@ Rules:
 | Permission settings and prompts | `tools/permissions/manager.rs` |