Python ‐ 1. Protocol Specifications
Without information about what a web page is for, how it is structured, what features it provides, and how to interact with it, an AI agent has to figure out everything on its own.
This is commonly done through scrapers and/or vision models that try to guess what the agent sees.
Because websites are diverse, complex, dynamic, JavaScript-heavy, and often mostly made of generic `<div>`s, this exercise commonly leads to unreliable parsing and broken or unintended interactions. Intelligent agents need richer semantic hints to parse and interact with these pages reliably.
The premise of AWP is simple: include standard information in the HTML page itself, for any agent to be able to reliably understand and interact with it.
For an agent to do so, the following information needs to be attached to meaningful and/or interactive HTML tags:
- A `description`, for it to know what it is.
- A list of possible `interactions`, for it to know what to do.
- A list of `prerequisites`, for it to know what to do prior to interacting.
- A list of subsequent `features`, for it to know what those interactions lead to.
Additional optional information, such as states or established accessibility parameters (e.g. role, aria-*), may also be used to complement the agent's understanding of the page.
Let's start with a simple example. Your agent just found this website by crawling the web:
<html>
<body>
<form>
This site uses cookies
<button>Configure</button>
</form>
<form>
<h1> Website name </h1>
<label> What's next? </label>
<input
type="text"
name="destination"
required
minlength="3"
maxlength="30"/>
<div>
<button disabled> -> </button>
<button> Back </button>
</div>
</form>
</body>
</html>

It now needs to understand what the page is for, whether it can be used to answer your query, and if so, how to interact with it.
For all the reasons described above, this often becomes a difficult and error-prone task, leading to unintended behaviors and impairing the agent's ability to act reliably on our behalf.
With AWP, this information is now declared in the HTML itself, through standard optional ai-* attributes.
<html ai-description="Travel site to book flights and trains">
<body>
<form>
This site uses cookies
<button>Configure</button>
</form>
<form ai-description="Form to book a flight">
<h1>
Website name
</h1>
<label>
What's next?
</label>
<input
ai-ref="<input-ai-ref>"
ai-description="Form input where to enter the destination"
ai-interactions="input: enables the form confirmation button, given certain constraints;"
type="text"
name="destination"
required
minlength="3"
maxlength="30"/>
<div>
<button
ai-description="Confirmation button to proceed with booking a flight"
ai-interactions="click: proceed; hover: display additional information about possible flights;"
ai-prerequisite-click="<input-ai-ref>: input the destination;"
ai-next-click="list of available flights; book a flight; login;"
disabled>
->
</button>
<button
ai-description="Cancel button to get back to the home page"
ai-interactions="click: dismiss form and return to home page;"
ai-next-click="access forms to book trains; access forms to book flights;">
Back
</button>
</div>
</form>
</body>
</html>

The web app can now be reliably understood and used by any AI agent 🙌
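To illustrate how an agent might read these attributes, here is a minimal sketch that collects every element carrying at least one `ai-*` attribute, using only Python's standard `html.parser` module. It is not the AWP Tool shipped with this library; the names `AiAttributeCollector` and `collect_ai_elements`, and the `destination-input` reference value, are hypothetical and chosen for illustration only.

```python
from html.parser import HTMLParser


class AiAttributeCollector(HTMLParser):
    """Collect every element that carries at least one ai-* attribute."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names are lowercased by the parser.
        ai_attrs = {name: value for name, value in attrs if name.startswith("ai-")}
        if ai_attrs:
            self.elements.append({"tag": tag, **ai_attrs})


def collect_ai_elements(html):
    """Return one dict per annotated element, in document order."""
    collector = AiAttributeCollector()
    collector.feed(html)
    collector.close()
    return collector.elements


# Shortened version of the annotated page; "destination-input" is a made-up ai-ref.
PAGE = """
<form ai-description="Form to book a flight">
  <input ai-ref="destination-input"
         ai-description="Form input where to enter the destination"
         ai-interactions="input: enables the form confirmation button;"
         type="text" name="destination"/>
  <button>Configure</button>
</form>
"""

for element in collect_ai_elements(PAGE):
    print(element["tag"], "->", element.get("ai-description"))
```

Elements without any `ai-*` attribute, such as the plain `Configure` button, are skipped, which matches the idea that only meaningful or interactive elements are annotated.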
| Parameter | Description | Requirement |
|---|---|---|
| `ai-description` | A natural language description for agents to know what the element is | • Meaningful Element: required<br>• Interactive Element: required<br>• Other Element: absent |
| `ai-interactions` | A list of possible interactions, for agents to know what to do with the element<br>Format: `<interaction>: <behavior>; <interaction>: <behavior>;..` | • Meaningful Element: absent<br>• Interactive Element: required<br>• Other Element: absent |
| `ai-prerequisite-<interaction>` | A list of prerequisite interactions, for agents to know what to do prior to interacting with the element<br>Format: `<ai-ref>: <interaction>;..` | • Meaningful Element: absent<br>• Interactive Element: optional<br>• Other Element: absent |
| `ai-ref` | A unique identifier for agents to know where those prerequisite interactions should be made | • Meaningful Element: absent<br>• Interactive Element: optional<br>• Other Element: absent |
| `ai-next-<interaction>` | A list of subsequent features, for agents to know what those interactions lead to<br>Format: `<next feature>; <next feature>;..` | • Meaningful Element: absent<br>• Interactive Element: optional<br>• Other Element: absent |
| `ai-state` | A natural language description of the state the component is in | • Meaningful Element: optional<br>• Interactive Element: optional<br>• Other Element: optional |
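The value formats listed in the table are simple enough to split by hand. The following sketch, with hypothetical helper names `parse_pairs` and `parse_list`, shows one way to turn them into Python structures. It assumes that behaviors and features never contain a semicolon, which the specification above does not state explicitly, and it uses a made-up `destination-input` value as the `ai-ref`.

```python
def parse_pairs(value):
    """Parse '<key>: <text>; <key>: <text>;' into a dict (ai-interactions, ai-prerequisite-*)."""
    pairs = {}
    for chunk in value.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip the empty piece after the trailing ';'
        # Split on the first colon only, so the text itself may contain colons.
        key, _, text = chunk.partition(":")
        pairs[key.strip()] = text.strip()
    return pairs


def parse_list(value):
    """Parse '<item>; <item>;' into a list (ai-next-*)."""
    return [item.strip() for item in value.split(";") if item.strip()]


# Attribute values modeled on the confirmation button of the example page.
button = {
    "ai-interactions": "click: proceed; hover: display additional information about possible flights;",
    "ai-prerequisite-click": "destination-input: input the destination;",
    "ai-next-click": "list of available flights; book a flight; login;",
}

interactions = parse_pairs(button["ai-interactions"])
prerequisites = parse_pairs(button["ai-prerequisite-click"])
next_features = parse_list(button["ai-next-click"])
```

With these structures, an agent could check `prerequisites` before attempting a `click`, and use `next_features` to decide whether the click brings it closer to its goal.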
An AWP Tool is also distributed by this library to allow any AI agent to reliably use AWP compliant websites.
Without information about what an API is for, how it is structured, what features it provides, and how to interact with it, an AI agent has to figure out everything on its own.
This information is commonly passed manually as context, fetched by web crawlers attempting to find documentation online, or exposed by spinning up additional middleware servers (e.g. MCP) to make the API discoverable.
The premise of AWP is simple: include standard information in the API itself, for any agent to be able to reliably understand and interact with it, without requiring additional middleware servers to do so.
For an agent to know how to use any API, the following information needs to be discoverable:
- A list of all available `endpoints` on that API, to know what they are
- A `description` for each endpoint, to know what they are for
- `meta` information for each endpoint, to know how to access them
- An `input` documentation for each endpoint, to know what to provide
- An `output` documentation for each endpoint, to know what to expect
With AWP, the API documentation is made accessible on the API itself, with a standard /ai-handshake endpoint.
This allows AI agents to query /ai-handshake, get a complete description of the API, and know how to further interact with it.
For simplicity, and since it is a well-established standard on the web, AWP expects an OpenAPI compliant documentation to be returned by that endpoint.
Here is a simple example: https://editor.swagger.io
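On the server side, exposing the endpoint can be as small as returning a static OpenAPI document. The sketch below is one possible way to do it with a plain WSGI callable from the standard library ecosystem. It is an assumption-laden illustration, not how this library implements it: the `OPENAPI_DOCUMENT` content and the `/flights` path are invented, and only the JSON representation is served because the Python standard library has no YAML serializer.

```python
import json

# Hypothetical OpenAPI document describing the hosting API.
# As required by the specification, it does not list /ai-handshake itself.
OPENAPI_DOCUMENT = {
    "openapi": "3.0.3",
    "info": {"title": "Hypothetical travel API", "version": "1.0.0"},
    "paths": {
        "/flights": {
            "get": {
                "summary": "List available flights",
                "responses": {"200": {"description": "A list of flights"}},
            }
        }
    },
}


def app(environ, start_response):
    """Minimal WSGI application serving GET /ai-handshake as JSON."""
    is_handshake = (
        environ.get("PATH_INFO") == "/ai-handshake"
        and environ.get("REQUEST_METHOD") == "GET"
    )
    if is_handshake:
        body = json.dumps(OPENAPI_DOCUMENT).encode("utf-8")
        status, content_type = "200 OK", "application/json"
    else:
        body = b"Not found"
        status, content_type = "404 Not Found", "text/plain"
    start_response(status, [
        ("Content-Type", content_type),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

To try it locally, the callable can be passed to `wsgiref.simple_server.make_server("127.0.0.1", 8000, app)` and served with `serve_forever()`. A real API would more likely generate the document from its framework rather than hard-code it.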
| Path | Description | Type | Method | Input | Output | Requirement |
|---|---|---|---|---|---|---|
| `/ai-handshake` | Standard endpoint returning an OpenAPI compliant documentation of the API which hosts the endpoint, excluding `/ai-handshake`, as JSON or YAML based on headers | REST | GET | Headers:<br>`"Content-Type": "application/yaml"` (recommended)<br>or `"Content-Type": "application/json"` | OpenAPI compliant documentation, of the requested Content-Type (e.g. YAML, JSON, text) | required |
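From the agent's side, the handshake amounts to one GET request followed by a walk through the returned OpenAPI `paths` object. The sketch below uses only the standard library; the function names and the `https://api.example.com` base URL are hypothetical. It requests JSON rather than the recommended YAML, purely because the standard library can parse JSON without extra dependencies.

```python
import json
import urllib.request

# HTTP methods that may appear as operation keys in an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def build_handshake_request(base_url, content_type="application/json"):
    """Build the GET request for /ai-handshake with the header from the table above."""
    url = base_url.rstrip("/") + "/ai-handshake"
    return urllib.request.Request(url, headers={"Content-Type": content_type}, method="GET")


def fetch_handshake(base_url):
    """Perform the handshake and return the OpenAPI document as a dict (JSON only)."""
    request = build_handshake_request(base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def list_operations(spec):
    """Return (METHOD, path, summary) for every operation in an OpenAPI document."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            # Path items may also hold non-operation keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                operations.append((method.upper(), path, operation.get("summary", "")))
    return operations


# Invented document standing in for a real /ai-handshake response.
SAMPLE_SPEC = {
    "openapi": "3.0.3",
    "info": {"title": "Hypothetical travel API", "version": "1.0.0"},
    "paths": {
        "/flights": {
            "parameters": [],
            "get": {"summary": "List available flights"},
        }
    },
}

operations = list_operations(SAMPLE_SPEC)
```

In practice, `fetch_handshake("https://api.example.com")` would replace `SAMPLE_SPEC`, and the resulting operations could be handed to the agent as its list of callable endpoints.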
An AWP Tool is also distributed by this library to allow any AI agent to reliably use AWP compliant APIs.