Browser Agent

Sitaraman Subramanian edited this page Mar 18, 2026 · 3 revisions

Browser Agent (Magnitude)

The browser agent lets Pentest Copilot control a real browser on the exploit box to interact with web applications. It's powered by Magnitude, a browser automation framework that uses an LLM to navigate pages, fill forms, click buttons, and extract data.

Why

Many web application tests require actual browser interaction: logging in, navigating multi-step workflows, testing client-side functionality, triggering JavaScript-heavy features. Command-line tools like curl do not execute JavaScript or render pages, so they cannot drive these flows. The browser agent fills that gap.

Browser Agent VNC vs GUI/VNC (Exploit Box)

These are separate systems:

| Feature | GUI/VNC (Settings > GUI) | Browser Agent VNC |
|---|---|---|
| Purpose | Remote desktop to the exploit box (Kali) | Live view of the browser agent's Chromium window |
| Where it runs | On the exploit box | On the backend (Docker container or local host) |
| When visible | When you open the GUI tab in a session | When you open the Browser Agent tab and run in Docker mode |

The Browser Agent tab shows the live VNC stream only when running via Docker. In developer mode, the browser opens on your local desktop instead.

Display Flow: Docker vs Developer Mode

Docker Mode

When the backend runs in Docker (./run.sh start), the Browser Agent uses the backend's exposed VNC stack:

  1. The backend container starts Xvfb (virtual framebuffer), x11vnc, and websockify at startup.
  2. Chromium runs on display :99 inside the backend container.
  3. The noVNC web client is served on port 6080 (configurable via BROWSER_AGENT_NOVNC_PORT).
  4. The Browser Agent tab loads the VNC stream in an iframe, using the same hostname as the backend (e.g. http://localhost:6080/vnc.html when the frontend talks to http://localhost:8080).

You can watch the browser agent work in real time from the Browser Agent tab.
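
As a rough illustration of step 4, the noVNC address can be derived from the backend address by keeping the hostname and swapping in the noVNC port. The helper below is a minimal sketch: `novnc_url` is a hypothetical name, not part of Pentest Copilot, and it assumes the stream is always served over plain `http` at `/vnc.html`, as in the example above. It does not handle IPv6 literals.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pentest Copilot): build the noVNC URL
# from the backend URL by reusing its hostname.
novnc_url() {
    backend_url=$1
    # Port: second argument, else BROWSER_AGENT_NOVNC_PORT, else 6080.
    port=${2:-${BROWSER_AGENT_NOVNC_PORT:-6080}}
    host=${backend_url#*://}   # drop the scheme
    host=${host%%/*}           # drop any path
    host=${host%%:*}           # drop the backend port
    # Assumption: noVNC is served over plain http at /vnc.html.
    printf 'http://%s:%s/vnc.html\n' "$host" "$port"
}

novnc_url "http://localhost:8080" 6080
```

If the iframe stays blank, opening the printed URL directly in a browser is a quick way to tell a networking problem from a UI problem.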

Developer Mode

When running in developer mode (./run.sh dev), the backend runs on your local machine. The Browser Agent uses your local PC's display:

  1. Chromium opens directly on your desktop (the display given by DISPLAY or MAGNITUDE_DISPLAY, default :99).
  2. The live VNC stream is not available in the UI. The Browser Agent tab shows a message: "Developer Mode. Browser Opens on Desktop."
  3. You see the browser window on your physical screen.

Developer Mode on a Headless Machine (No Physical Display)

If your local PC has no screen (e.g. headless server, SSH session), you need Xvfb and noVNC to view the browser:

  1. Install Xvfb, x11vnc, and noVNC:

     # Debian/Ubuntu
     sudo apt-get install xvfb x11vnc novnc websockify

     # Fedora/RHEL
     sudo dnf install xorg-x11-server-Xvfb x11vnc novnc websockify

  2. Start the virtual display and VNC stack before running the backend:

     export DISPLAY=:99
     # Virtual X server on display :99
     Xvfb :99 -screen 0 1280x800x24 -ac +extension GLX +render -noreset &
     sleep 1
     # Expose display :99 over VNC on port 5999
     x11vnc -display :99 -rfbport 5999 -nopw -forever -shared -noxdamage &
     sleep 0.5
     # Bridge VNC to WebSockets and serve the noVNC client on port 6080
     websockify --web /usr/share/novnc/ 6080 localhost:5999 &

  3. Set MAGNITUDE_DISPLAY=:99 in your backend environment (or use the default).

  4. Run the backend as usual. Chromium will use the virtual display.

  5. To view the browser, open http://localhost:6080/vnc.html in your browser (or use the hostname/IP of the machine if accessing remotely). Port 6080 must be reachable from your client.
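
Before starting the stack, it can help to confirm that the required binaries are on the PATH. The following is a minimal sketch; `missing_tools` is a hypothetical helper name, and the tool list simply mirrors the commands used in the steps above.

```shell
#!/bin/sh
# Hypothetical helper: print each named tool that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the three programs the headless setup launches.
missing=$(missing_tools Xvfb x11vnc websockify)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All tools found"
fi
```

This only checks that the programs exist. It does not verify that the noVNC web files are present under /usr/share/novnc/, which may differ between distributions.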

Setup

1. Enable in Settings

Go to Settings > Browser Agent and toggle Enable Magnitude on.

2. Configure the Model

The browser agent uses its own LLM (separate from the main agent) to decide what to click, type, and navigate. Configure:

| Field | Description |
|---|---|
| Model Provider | LLM provider for browser decisions |
| Model | Model identifier |
| API Key | API key |
| Base URL | Custom base URL (for OpenAI-compatible providers) |

3. Set the Proxy (Optional)

If you want browser traffic to flow through Burp Suite, set the Proxy URL to Burp's proxy listener:

http://127.0.0.1:8080

This captures all browser requests in Burp's proxy history, so you can inspect and replay them.
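
One way to confirm that the listener is reachable before pointing the browser agent at it is to send a single request through the proxy with curl. This is a sketch under assumptions: `proxy_reachable` is a hypothetical helper, and the test URL is whatever target you pass in. It only shows that the proxy accepts connections, not that the browser agent is using it.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a request to target_url completes
# through the given proxy within 5 seconds.
proxy_reachable() {
    proxy_url=$1
    target_url=$2
    curl --silent --output /dev/null --max-time 5 \
         --proxy "$proxy_url" "$target_url"
}

# Usage (not run here because it needs a live Burp listener):
#   proxy_reachable http://127.0.0.1:8080 http://example.com/ && echo "proxy OK"
```

If the check passes, the same request should also appear in Burp's proxy history.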

4. Display Mode

| Setting | Behavior |
|---|---|
| Headless: on (default) | Browser runs without a visible window. Faster, no display needed. |
| Headless: off | Browser window appears on the X display. You can watch it work through VNC. Set the X Display to match your VNC display (e.g. :1 for exploit box, :99 for backend). |
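
When headless mode is off, a quick sanity check is to confirm that the chosen X display actually exists. The sketch below relies on the common Linux convention that a local X server for display `:N` creates a Unix socket at `/tmp/.X11-unix/XN`. Both function names are hypothetical, and the check does not apply to remote (TCP) displays.

```shell
#!/bin/sh
# Hypothetical helper: map a display string such as ":99" or ":1.0"
# to the Unix socket path a local X server conventionally uses.
display_socket() {
    d=${1#*:}     # drop an optional host part and the colon
    d=${d%%.*}    # drop an optional screen suffix such as ".0"
    printf '/tmp/.X11-unix/X%s\n' "$d"
}

# Hypothetical helper: succeed if that socket exists.
display_ready() {
    [ -S "$(display_socket "$1")" ]
}

if display_ready "${MAGNITUDE_DISPLAY:-:99}"; then
    echo "display is up"
else
    echo "display not found"
fi
```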

[Screenshot: Browser Agent settings panel]

How It Works

The agent calls the browser_action tool with:

| Parameter | Description |
|---|---|
| url | The URL to navigate to |
| goal | A natural language description of what to do (e.g. "Log in with username admin and password admin123, then navigate to the settings page") |
| extract | (Optional) What information to extract from the page after completing the goal |

The browser agent then:

  1. Opens the URL in a Chromium browser
  2. Uses the LLM to interpret the page and decide on actions
  3. Clicks, types, scrolls, and navigates to accomplish the goal
  4. Optionally extracts the requested information
  5. Returns a summary of what happened and any extracted data

The browser_action tool always requires user consent before executing.
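
To make the three parameters concrete, the sketch below prints what a set of browser_action arguments might look like as JSON. Only the parameter names `url`, `goal`, and `extract` come from this page. The JSON layout, the helper name `browser_action_args`, and the target URL are assumptions for illustration, and the helper does no JSON escaping, so it is only safe for values without quotes or backslashes.

```shell
#!/bin/sh
# Hypothetical helper: print browser_action arguments as JSON.
# Assumption: the real wire format may differ. No escaping is performed.
browser_action_args() {
    cat <<EOF
{
  "url": "$1",
  "goal": "$2",
  "extract": "$3"
}
EOF
}

browser_action_args \
    "http://target.local/login" \
    "Log in with username admin and password admin123, then navigate to the settings page" \
    "The page title of the settings page"
```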

Use Cases

  • Test authentication flows - "Log in with admin/admin and check if access is granted to /admin"
  • Walk through multi-step forms - "Fill out the registration form with test data and submit it"
  • Explore web applications - "Navigate the site and list all links in the main navigation"
  • Test CSRF protections - "Submit the password change form and capture the request"
  • Verify XSS - "Enter <script>alert(1)</script> in the search box and check if it executes"

Combining with Burp

When the proxy URL points at Burp, every request the browser makes shows up in Burp's proxy history. This lets you:

  • See exactly what API calls the application makes during user workflows
  • Capture authenticated session cookies and tokens
  • Replay and modify requests through Repeater
  • Use the captured traffic as a starting point for deeper testing

The workflow typically looks like:

  1. Agent uses browser_action to browse the target (traffic proxied to Burp)
  2. Agent uses search_burp_proxy_history to find interesting requests
  3. Agent uses send_to_burp_repeater to test specific endpoints

Quick Test

The Settings > Browser Agent page includes a quick test feature. Enter a target URL and a goal, then click Test to run a one-off browser automation without going through the main agent loop. This is useful for verifying that Magnitude is configured correctly.

Requirements

  • The exploit box needs a browser installed (Chromium or Firefox). The built-in Kali container comes with Firefox ESR.
  • For visible (non-headless) mode, a running X server/VNC display is required on the exploit box.
  • The Magnitude npm package (magnitude-core) is included in the backend dependencies.
