BPSA (Beyond Python SmolAgents) is a fork of smolagents that extends the original with the following capabilities:
- Interactive CLI (`bpsa`): Multi-turn REPL with slash commands, command history, tab completion, session stats, and auto-approve mode.
- Infinite runtime CLI (`ad-infinitum`): Allows agents to run ad infinitum via autonomous looping.
- Context compression: Biologically inspired automatic LLM-based summarization of older memory steps to manage context window size during long-running tasks.
- Browser integration: Control a headed Chromium browser from agent code blocks via Playwright (`--browser` flag).
- GUI interaction: Launch, screenshot, click, type, and send keys to native GUI applications on X11 via xdotool/ImageMagick (`--gui-x11` flag).
- MCP server integration: Connect any Model Context Protocol server as a tool source via the `--mcp` CLI flag. Supports both HTTP (Streamable HTTP) and stdio-based servers.
- Image loading: Agents can load and visually inspect image files (plots, screenshots, diagrams) via the built-in `load_image` tool (always available, no flags needed).
- Image tools: Visual image diffing (`diff_images`), OCR text extraction from images (`screen_ocr`), and a canvas for drawing shapes, text, and annotations (`canvas_create`, `canvas_draw`). All are always available.
- Dictation input: Dictate prompts via microphone using Whisper or ElevenLabs transcription (`/dictation` command, requires the `BPSA_DICTATION_TRANSCRIBER` env var).
- Native Python execution: Execute Python code natively via `exec` for unrestricted processing.
- Multi-language support: Code in multiple languages beyond Python (Pascal, PHP, C++, Java and more).
- Developer tools: Many new tools that help agents compile, test, and debug source code in various programming languages.
- Multi-agent collaboration: Multiple agents can collaborate to solve complex problems.
- Research tools: Tools that help agents research and write technical documentation.
- Documentation generation: Generate and update documentation, including READMEs, for existing codebases.
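
The context compression item above describes summarizing older memory steps so the context window stays manageable. The following is a minimal sketch of that general idea, not BPSA's actual implementation. The names `compress_memory`, `truncating_summarizer`, `keep_recent`, and `max_chars` are hypothetical, and the summarizer is a plain truncation standing in for what would be an LLM call.

```python
from typing import Callable, List


def compress_memory(
    steps: List[str],
    summarize: Callable[[str], str],
    keep_recent: int = 3,
    max_chars: int = 2000,
) -> List[str]:
    """Fold older steps into one summary once memory exceeds a size budget.

    Hypothetical helper for illustration only; BPSA's real logic may differ.
    """
    total = sum(len(step) for step in steps)
    if total <= max_chars or len(steps) <= keep_recent:
        # Under budget, or no steps old enough to fold: return an unchanged copy.
        return list(steps)
    split = len(steps) - keep_recent
    older, recent = steps[:split], steps[split:]
    # In a real agent this call would go to an LLM (assumption).
    summary = summarize("\n".join(older))
    return ["[summary of earlier steps] " + summary] + recent


def truncating_summarizer(text: str) -> str:
    # Placeholder summarizer: keeps only the first 80 characters.
    return text[:80]


if __name__ == "__main__":
    steps = [f"step {i}: " + "x" * 500 for i in range(6)]
    compressed = compress_memory(steps, truncating_summarizer)
    # The three oldest steps are replaced by one summary entry.
    print(len(steps), "->", len(compressed))
```

Keeping the most recent steps verbatim while folding only the older ones is one common design choice: the agent retains exact detail for what it is currently working on and a lossy digest of everything earlier.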
Install the project, including dictation support, the CLIs, and the OpenAI protocol and LiteLLM dependencies:

$ pip install bpsa[dictation,browser,openai,litellm]

Find out more at the BPSA GitHub repository.


