README.md (10 additions & 10 deletions)
@@ -23,7 +23,7 @@ limitations under the License.
 * 🖥️ **GUI interaction:** Launch, screenshot, click, type, and send keys to native GUI applications on X11 via xdotool/ImageMagick (`--gui-x11` flag).
 * 👁️ **Image loading:** Agents can load and visually inspect image files (plots, screenshots, diagrams) via the built-in `load_image` tool — always available, no flags needed.
 * 🎨 **Image tools:** Visual image diffing (`diff_images`), OCR text extraction from images (`screen_ocr`), and a canvas for drawing shapes, text, and annotations (`canvas_create`, `canvas_draw`) — always available.
-* 🎤 **Voice input:** Dictate prompts via microphone using Whisper or ElevenLabs transcription (`/voice` command, requires `BPSA_VOICE_TRANSCRIBER` env var).
+* 🎤 **Dictation input:** Dictate prompts via microphone using Whisper or ElevenLabs transcription (`/dictation` command, requires `BPSA_DICTATION_TRANSCRIBER` env var).
 * ⚡ **Native Python execution:** Execute Python code natively via `exec` for unrestricted processing.
 * 🌍 **Multi-language support:** Code in multiple languages beyond Python (Pascal, PHP, C++, Java and more).
 * 🛠️ **Developer tools:** Lots of new tools that help agents to compile, test, and debug source code in various computing languages.
@@ -33,10 +33,10 @@ limitations under the License.
 ## Installation
-Install the project, including the voice support, CLIs, OpenAI protocol and LiteLLM dependencies.
+Install the project, including the dictation support, CLIs, OpenAI protocol and LiteLLM dependencies.
 This will set up the necessary libraries and the Beyond Python Smolagents framework in your environment.
@@ -62,23 +62,23 @@ BPSA_MAX_TOKENS=64000
 Context compression parameters can also be configured via env vars (e.g., `BPSA_COMPRESSION_ENABLED`, `BPSA_COMPRESSION_KEEP_RECENT_STEPS`). See [CLI.md](CLI.md) for the full list.

-#### Voice Input
+#### Dictation Input

-Dictate prompts via microphone instead of typing. Requires the voice extra and a transcriber environment variable:
+Dictate prompts via microphone instead of typing. Requires the dictation extra and a transcriber environment variable:
-Then use `/voice on` in the REPL to start listening and `/voice off` to stop. While active, the prompt shows `[mic] >` and transcribed speech is inserted at the cursor.
+Then use `/dictation on` in the REPL to start listening and `/dictation off` to stop. While active, the prompt shows `[mic] >` and transcribed speech is inserted at the cursor.
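
Review note: the hunks above rename the documented environment variable from `BPSA_VOICE_TRANSCRIBER` to `BPSA_DICTATION_TRANSCRIBER`. If the code follows the README, setups that still export the old name may stop being picked up. The sketch below shows one way a lookup could prefer the new name and fall back to the old one with a deprecation warning. It is an illustration only: `resolve_transcriber` is a hypothetical helper, not a function from this project, and the diff does not show how the project actually reads these variables.

```python
import os
import warnings

NEW_VAR = "BPSA_DICTATION_TRANSCRIBER"  # name introduced in this diff
OLD_VAR = "BPSA_VOICE_TRANSCRIBER"      # name removed in this diff


def resolve_transcriber(env=None):
    """Return the configured transcriber value, or None if neither variable is set.

    Hypothetical helper (an assumption, not project code): prefers NEW_VAR and
    falls back to OLD_VAR, emitting a DeprecationWarning in that case.
    """
    env = os.environ if env is None else env

    value = env.get(NEW_VAR)
    if value:
        return value

    legacy = env.get(OLD_VAR)
    if legacy:
        warnings.warn(
            f"{OLD_VAR} is deprecated; set {NEW_VAR} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return legacy

    return None
```

Passing a plain dict as `env` keeps the helper easy to test without touching the real process environment; called with no argument, it reads `os.environ`.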