
Commit 67e7d79

Rename voice feature to dictation

- Rename /voice command to /dictation
- Rename BPSA_VOICE_TRANSCRIBER to BPSA_DICTATION_TRANSCRIBER
- Rename BPSA_VOICE_MODEL to BPSA_DICTATION_MODEL
- Rename BPSA_DEFAULT_VOICE_MODEL to BPSA_DEFAULT_DICTATION_MODEL
- Rename pip extra bpsa[voice] to bpsa[dictation]
- Update all user-facing messages and banners
- Update README.md and CLI.md documentation
- Internal _voice_* variable names kept unchanged

Model: claude-sonnet-4.6
Co-Authored-By: bpsa2 <241537330+bpsa2@users.noreply.github.com>
1 parent dbe5c2e commit 67e7d79
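
The diff below replaces the old environment variable names outright and shows no fallback to them, so a shell profile that still exports `BPSA_VOICE_TRANSCRIBER` would presumably no longer enable dictation. A minimal sketch of a transitional shim follows, assuming one wanted old configurations to keep working. It is not part of this commit, and `RENAMED_ENV_VARS` and `migrate_env` are hypothetical names introduced only for illustration.

```python
# Old -> new names, taken from the rename list in the commit message.
RENAMED_ENV_VARS = {
    "BPSA_VOICE_TRANSCRIBER": "BPSA_DICTATION_TRANSCRIBER",
    "BPSA_VOICE_MODEL": "BPSA_DICTATION_MODEL",
}


def migrate_env(environ):
    """Copy values from the old names to the new ones.

    Hypothetical helper, not part of the commit. `environ` is any mutable
    mapping (for example os.environ). A value already set under the new
    name always wins. Returns the list of old names that were migrated.
    """
    migrated = []
    for old, new in RENAMED_ENV_VARS.items():
        if old in environ and new not in environ:
            environ[new] = environ[old]
            migrated.append(old)
    return migrated
```

Calling `migrate_env(os.environ)` once at startup, before any configuration is read, would be one way to use it; the returned list can drive a deprecation warning.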

4 files changed

Lines changed: 46 additions & 46 deletions


CLI.md

Lines changed: 5 additions & 5 deletions
@@ -79,14 +79,14 @@ All optional. Configure `CompressionConfig` without touching code:
 | `BPSA_COMPRESSION_PRESERVE_FINAL_ANSWER_STEPS` | `1` | Keep final_answer steps uncompressed (`0` or `1`) |
 | `BPSA_COMPRESSION_MIN_CHARS` | `4096` | Min characters of content before an LLM compression call is made |

-### Voice Input Variables
+### Dictation Input Variables

-Requires `pip install bpsa[voice]`.
+Requires `pip install bpsa[dictation]`.

 | Variable | Required | Default | Description |
 |----------|----------|---------|-------------|
-| `BPSA_VOICE_TRANSCRIBER` | Yes (for `/voice`) | - | Transcriber name: `whisper` or `elevenlabs` |
-| `BPSA_VOICE_MODEL` | No | `base.en` (`whisper`) or `scribe_v2` (`elevenlabs`) | Model name passed to the transcriber (whisper only) |
+| `BPSA_DICTATION_TRANSCRIBER` | Yes (for `/dictation`) | - | Transcriber name: `whisper` or `elevenlabs` |
+| `BPSA_DICTATION_MODEL` | No | `base.en` (`whisper`) or `scribe_v2` (`elevenlabs`) | Model name passed to the transcriber (whisper only) |
 | `ELEVENLABS_API_KEY` | Yes (for `elevenlabs`) | - | API key for ElevenLabs Scribe API |

 ### Supported Model Classes (`BPSA_SERVER_MODEL`)
@@ -184,7 +184,7 @@ Use `prompt_toolkit` for:
 | `/show-tools` | List all loaded tools |
 | `/undo-steps [N]` | Remove last N steps from memory (default: 1) |
 | `/verbose` | Toggle verbose output |
-| `/voice [on\|off]` | Toggle voice dictation (requires `BPSA_VOICE_TRANSCRIBER`) |
+| `/dictation [on\|off]` | Toggle dictation (requires `BPSA_DICTATION_TRANSCRIBER`) |

 ## Configuration Layering

README.md

Lines changed: 10 additions & 10 deletions
@@ -23,7 +23,7 @@ limitations under the License.
 * 🖥️ **GUI interaction:** Launch, screenshot, click, type, and send keys to native GUI applications on X11 via xdotool/ImageMagick (`--gui-x11` flag).
 * 👁️ **Image loading:** Agents can load and visually inspect image files (plots, screenshots, diagrams) via the built-in `load_image` tool — always available, no flags needed.
 * 🎨 **Image tools:** Visual image diffing (`diff_images`), OCR text extraction from images (`screen_ocr`), and a canvas for drawing shapes, text, and annotations (`canvas_create`, `canvas_draw`) — always available.
-* 🎤 **Voice input:** Dictate prompts via microphone using Whisper or ElevenLabs transcription (`/voice` command, requires `BPSA_VOICE_TRANSCRIBER` env var).
+* 🎤 **Dictation input:** Dictate prompts via microphone using Whisper or ElevenLabs transcription (`/dictation` command, requires `BPSA_DICTATION_TRANSCRIBER` env var).
 ***Native Python execution:** Execute Python code natively via `exec` for unrestricted processing.
 * 🌍 **Multi-language support:** Code in multiple languages beyond Python (Pascal, PHP, C++, Java and more).
 * 🛠️ **Developer tools:** Lots of new tools that help agents to compile, test, and debug source code in various computing languages.
@@ -33,10 +33,10 @@ limitations under the License.


 ## Installation
-Install the project, including the voice support, CLIs, OpenAI protocol and LiteLLM dependencies.
+Install the project, including the dictation support, CLIs, OpenAI protocol and LiteLLM dependencies.

 ```bash
-$ pip install bpsa[voice,browser,openai,litellm]
+$ pip install bpsa[dictation,browser,openai,litellm]
 ```

 This will set up the necessary libraries and the Beyond Python Smolagents framework in your environment.
@@ -62,23 +62,23 @@ BPSA_MAX_TOKENS=64000

 Context compression parameters can also be configured via env vars (e.g., `BPSA_COMPRESSION_ENABLED`, `BPSA_COMPRESSION_KEEP_RECENT_STEPS`). See [CLI.md](CLI.md) for the full list.

-#### Voice Input
+#### Dictation Input

-Dictate prompts via microphone instead of typing. Requires the voice extra and a transcriber environment variable:
+Dictate prompts via microphone instead of typing. Requires the dictation extra and a transcriber environment variable:

 ```bash
-pip install bpsa[voice]
+pip install bpsa[dictation]

 # Option 1: Whisper (local, offline)
-export BPSA_VOICE_TRANSCRIBER=whisper
-export BPSA_VOICE_MODEL=base.en # optional (default: base.en)
+export BPSA_DICTATION_TRANSCRIBER=whisper
+export BPSA_DICTATION_MODEL=base.en # optional (default: base.en)

 # Option 2: ElevenLabs (cloud API)
-export BPSA_VOICE_TRANSCRIBER=elevenlabs
+export BPSA_DICTATION_TRANSCRIBER=elevenlabs
 export ELEVENLABS_API_KEY=your_api_key
 ```

-Then use `/voice on` in the REPL to start listening and `/voice off` to stop. While active, the prompt shows `[mic] >` and transcribed speech is inserted at the cursor.
+Then use `/dictation on` in the REPL to start listening and `/dictation off` to stop. While active, the prompt shows `[mic] >` and transcribed speech is inserted at the cursor.

 ### BPSA CLI Usage

pyproject.toml

Lines changed: 2 additions & 2 deletions
@@ -101,15 +101,15 @@ vision = [
   "helium",
   "selenium",
 ]
-voice = [
+dictation = [
   "voicelistener>=1.0.3",
 ]
 vllm = [
   "vllm>=0.10.2",
   "torch"
 ]
 all = [
-  "bpsa[audio,blaxel,docker,e2b,gradio,litellm,mcp,mlx-lm,modal,openai,telemetry,toolkit,transformers,vision,voice,bedrock]",
+  "bpsa[audio,blaxel,docker,e2b,gradio,litellm,mcp,mlx-lm,modal,openai,telemetry,toolkit,transformers,vision,dictation,bedrock]",
 ]
 quality = [
   "ruff>=0.9.0",

src/smolagents/bp_cli.py

Lines changed: 29 additions & 29 deletions
@@ -29,9 +29,9 @@
 BPSA_COMPRESSION_PRESERVE_FINAL_ANSWER_STEPS - Keep final_answer steps (default: 1)
 BPSA_COMPRESSION_MIN_CHARS - Min chars before compressing (default: 4096)

-Voice input (requires `pip install bpsa[voice]`):
-BPSA_VOICE_TRANSCRIBER - Transcriber name: 'whisper' or 'elevenlabs' (required for /voice)
-BPSA_VOICE_MODEL - Model name passed to transcriber (optional, whisper only)
+Dictation input (requires `pip install bpsa[dictation]`):
+BPSA_DICTATION_TRANSCRIBER - Transcriber name: 'whisper' or 'elevenlabs' (required for /dictation)
+BPSA_DICTATION_MODEL - Model name passed to transcriber (optional, whisper only)
 ELEVENLABS_API_KEY - API key for ElevenLabs transcriber (required when using elevenlabs)
 """

@@ -99,7 +99,7 @@
     "GoogleColabModel": [],
 }

-BPSA_DEFAULT_VOICE_MODEL = None
+BPSA_DEFAULT_DICTATION_MODEL = None

 class Spinner:
     """Improved spinner using Rich library for better UX and reliability."""
@@ -535,13 +535,13 @@ def print_turn_summary(turn_num: int, elapsed: float, input_tokens: int, output_
     console.print(line)


-def print_banner(model_id: str, server_model: str, tool_count: int, voice_transcriber: str = None):
-    voice_line = f"\nVoice: [magenta]{voice_transcriber}[/]" if voice_transcriber else ""
+def print_banner(model_id: str, server_model: str, tool_count: int, dictation_transcriber: str = None):
+    dictation_line = f"\nDictation: [magenta]{dictation_transcriber}[/]" if dictation_transcriber else ""
     console.print(
         Panel.fit(
             f"[bold]BPSA - Beyond Python SmolAgents[/] v{VERSION}\n"
             f"Model: [cyan]{model_id}[/] ({server_model})\n"
-            f"Tools: [green]{tool_count}[/] loaded{voice_line}",
+            f"Tools: [green]{tool_count}[/] loaded{dictation_line}",
             border_style="blue",
         )
     )
@@ -606,7 +606,7 @@ def _save_aliases(aliases: dict):
     "/session-load", "/session-save",
     "/show-compression-stats", "/show-memory-stats", "/show-stats",
     "/save-step", "/set-max-steps", "/show-step", "/show-steps", "/show-tools", "/undo-steps", "/verbose",
-    "/voice",
+    "/dictation",
 ]


@@ -649,7 +649,7 @@ def print_help():
     table.add_row("/show-tools", "List all loaded tools")
     table.add_row("/undo-steps \[N]", "Remove last N steps from memory (default: 1)")
     table.add_row("/verbose", "Toggle verbose output")
-    table.add_row(r"/voice \[on|off]", "Toggle voice dictation (requires BPSA_VOICE_TRANSCRIBER)")
+    table.add_row(r"/dictation \[on|off]", "Toggle dictation (requires BPSA_DICTATION_TRANSCRIBER)")
     console.print(table)
     console.print()

@@ -706,16 +706,16 @@ def _voice_start():
     """Start the voice listener. Returns an error message string on failure, or None on success."""
     global _voice_listener
     if _voice_listener is not None:
-        return "Voice input is already active."
+        return "Dictation is already active."
     try:
         from voicelistener import VoiceListener
     except ImportError:
-        return "Voice input requires the voicelistener package. Install with: pip install bpsa[voice]"
+        return "Dictation requires the voicelistener package. Install with: pip install bpsa[dictation]"

-    transcriber_name = get_env("BPSA_VOICE_TRANSCRIBER", default="")
+    transcriber_name = get_env("BPSA_DICTATION_TRANSCRIBER", default="")
     if not transcriber_name:
         return (
-            "Set BPSA_VOICE_TRANSCRIBER environment variable to enable voice input"
+            "Set BPSA_DICTATION_TRANSCRIBER environment variable to enable dictation"
             f" (available transcribers: {', '.join(sorted(_VOICE_TRANSCRIBERS))})"
         )
     transcriber_name = transcriber_name.lower().strip()
@@ -725,7 +725,7 @@ def _voice_start():
             f" Available transcribers: {', '.join(sorted(_VOICE_TRANSCRIBERS))}"
         )

-    model = get_env("BPSA_VOICE_MODEL", default=BPSA_DEFAULT_VOICE_MODEL)
+    model = get_env("BPSA_DICTATION_MODEL", default=BPSA_DEFAULT_DICTATION_MODEL)
     kwargs = {}
     if model is not None:
         kwargs["model_id"] = model
@@ -752,7 +752,7 @@ def _voice_stop():
     """Stop the voice listener."""
     global _voice_listener
     if _voice_listener is None:
-        return "Voice input is not active."
+        return "Dictation is not active."
     _voice_listener.stop()
     _voice_listener = None
     # Drain any remaining items
@@ -1670,8 +1670,8 @@ def run_repl(skip_instructions: bool = False, auto_approve: bool = True, browser
     _verbose = verbose

     console.clear()
-    voice_transcriber = get_env("BPSA_VOICE_TRANSCRIBER", default=None)
-    print_banner(model_id, server_model, tool_count, voice_transcriber=voice_transcriber)
+    dictation_transcriber = get_env("BPSA_DICTATION_TRANSCRIBER", default=None)
+    print_banner(model_id, server_model, tool_count, dictation_transcriber=dictation_transcriber)

     instructions = None
     if not skip_instructions:
@@ -1889,8 +1889,8 @@ def get_input():
             last_answer = None
             first_turn = True
             console.clear()
-            _vt = get_env("BPSA_VOICE_TRANSCRIBER", default="").strip() if _voice_listener is not None else None
-            print_banner(model_id, server_model, count_tools(agent), voice_transcriber=_vt or None)
+            _dt = get_env("BPSA_DICTATION_TRANSCRIBER", default="").strip() if _voice_listener is not None else None
+            print_banner(model_id, server_model, count_tools(agent), dictation_transcriber=_dt or None)
             continue
         elif cmd == "/show-tools":
             print_tools(agent)
@@ -2042,33 +2042,33 @@ def get_input():
                 continue
             console.print(f"[cyan]Auto-approve: {'on' if _auto_approve else 'off'}[/]")
             continue
-        elif cmd == "/voice":
+        elif cmd == "/dictation":
             arg = cmd_args.strip().lower()
             if arg == "on":
-                console.print("[cyan]Loading voice support.[/]")
+                console.print("[cyan]Loading dictation support.[/]")
                 if not _has_prompt_toolkit:
-                    console.print("[red]Voice input requires voicelistener. Install with: pip install voicelistener[/]")
+                    console.print("[red]Dictation requires voicelistener. Install with: pip install voicelistener[/]")
                 else:
                     err = _voice_start()
                     if err:
                         console.print(f"[red]{err}[/]")
                     else:
-                        console.print("[cyan][mic] Voice input active[/]")
+                        console.print("[cyan][mic] Dictation active[/]")
             elif arg == "off":
                 err = _voice_stop()
                 if err:
                     console.print(f"[yellow]{err}[/]")
                 else:
-                    console.print("[cyan]Voice input deactivated[/]")
+                    console.print("[cyan]Dictation deactivated[/]")
             elif arg == "":
                 if _voice_listener is not None:
-                    transcriber = get_env("BPSA_VOICE_TRANSCRIBER", default="(unknown)")
-                    model = get_env("BPSA_VOICE_MODEL", default=BPSA_DEFAULT_VOICE_MODEL)
-                    console.print(f"[cyan]Voice: on | transcriber: {transcriber} | model: {model}[/]")
+                    transcriber = get_env("BPSA_DICTATION_TRANSCRIBER", default="(unknown)")
+                    model = get_env("BPSA_DICTATION_MODEL", default=BPSA_DEFAULT_DICTATION_MODEL)
+                    console.print(f"[cyan]Dictation: on | transcriber: {transcriber} | model: {model}[/]")
                 else:
-                    console.print("[dim]Voice: off[/]")
+                    console.print("[dim]Dictation: off[/]")
             else:
-                console.print("[yellow]Usage: /voice [on|off][/]")
+                console.print("[yellow]Usage: /dictation [on|off][/]")
             continue
         else:
             console.print(f"[yellow]Unknown command: {cmd}. Type /help for available commands.[/]")
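
The configuration lookup that `_voice_start()` performs after this commit can be sketched in isolation. This is a simplified illustration under stated assumptions, not the code in `bp_cli.py`. `resolve_dictation_config` and `KNOWN_TRANSCRIBERS` are hypothetical names, and the function takes a plain mapping instead of calling the project's `get_env`. The wording of the unknown-transcriber error is invented, because the diff shows only its tail. The sketch also normalizes the name before the emptiness check, whereas the diff checks first.

```python
# Transcriber names as documented in CLI.md; the real code consults
# _VOICE_TRANSCRIBERS, whose contents are not shown in the diff.
KNOWN_TRANSCRIBERS = {"whisper", "elevenlabs"}


def resolve_dictation_config(env):
    """Return (transcriber, kwargs, error); error is None on success.

    Hypothetical helper for illustration only.
    """
    name = env.get("BPSA_DICTATION_TRANSCRIBER", "").lower().strip()
    available = ", ".join(sorted(KNOWN_TRANSCRIBERS))
    if not name:
        return None, {}, (
            "Set BPSA_DICTATION_TRANSCRIBER environment variable to enable dictation"
            f" (available transcribers: {available})"
        )
    if name not in KNOWN_TRANSCRIBERS:
        # Assumed wording; only "Available transcribers: ..." is visible in the diff.
        return None, {}, f"Unknown transcriber '{name}'. Available transcribers: {available}"
    kwargs = {}
    # BPSA_DEFAULT_DICTATION_MODEL is None in the diff, so an unset variable
    # means no model_id is passed and the transcriber picks its own default.
    model = env.get("BPSA_DICTATION_MODEL")
    if model is not None:
        kwargs["model_id"] = model
    return name, kwargs, None
```

Passing `os.environ` as `env` would mirror the lookup done for `/dictation on`; passing a plain dict makes the behavior easy to exercise in isolation.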
