A breadth-first crawler for the AEO Protocol v0.1.
Give it one seed origin. It fetches that origin's /.well-known/aeo.json, then follows every authority.primary_sources URI as a candidate origin to fetch next — up to a configurable depth and total fetch budget. Output is one JSON Lines record per origin attempted, suitable for piping into jq, a graph database, or any analytics pipeline.
Built on top of aeo-sdk-go.
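The traversal described above can be sketched as a plain breadth-first search with a depth limit and a fetch budget. This is a minimal illustration, not the crawler's actual code: `fetchFunc`, `visit`, and `crawl` are hypothetical names, and an in-memory map stands in for real HTTP fetches.

```go
package main

import "fmt"

// fetchFunc is a hypothetical stand-in for fetching one origin's declaration
// and returning the origins named in its authority.primary_sources.
type fetchFunc func(origin string) ([]string, error)

// visit records one attempted origin and the depth at which it was reached.
type visit struct {
	Origin string
	Depth  int
	Err    error
}

// crawl walks outward from seed breadth-first, stopping at maxDepth hops
// or after maxFetches attempts, whichever comes first.
func crawl(seed string, maxDepth, maxFetches int, fetch fetchFunc) []visit {
	type item struct {
		origin string
		depth  int
	}
	seen := map[string]bool{seed: true}
	queue := []item{{seed, 0}}
	var out []visit

	for len(queue) > 0 && len(out) < maxFetches {
		cur := queue[0]
		queue = queue[1:]

		next, err := fetch(cur.origin)
		out = append(out, visit{cur.origin, cur.depth, err})
		if err != nil || cur.depth >= maxDepth {
			continue // failed origins and origins at the depth limit are not expanded
		}
		for _, o := range next {
			if !seen[o] {
				seen[o] = true
				queue = append(queue, item{o, cur.depth + 1})
			}
		}
	}
	return out
}

func main() {
	// In-memory link graph used instead of real HTTP fetches.
	graph := map[string][]string{
		"https://a.example": {"https://b.example", "https://c.example"},
		"https://b.example": {"https://d.example"},
	}
	fetch := func(origin string) ([]string, error) {
		next, ok := graph[origin]
		if !ok {
			return nil, fmt.Errorf("HTTP 404")
		}
		return next, nil
	}
	for _, v := range crawl("https://a.example", 1, 100, fetch) {
		fmt.Println(v.Depth, v.Origin, v.Err == nil)
	}
}
```

With a depth of 1, `https://d.example` is never attempted, because origins reached at the depth limit are fetched but not expanded.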

    go install github.com/mizcausevic-dev/aeo-crawler/cmd/aeo-crawler@latest
    aeo-crawler --seed https://mizcausevic-dev.github.io

Output (one JSON object per line):

    {"origin":"https://mizcausevic-dev.github.io","depth":0,"success":true,"entity_name":"Miz Causevic","entity_type":"Person","claims_count":6,"audit_mode":"none","fetched_at":"2026-05-12T04:00:00Z"}
    {"origin":"https://github.com","depth":1,"success":false,"error":"HTTP 404","fetched_at":"2026-05-12T04:00:01Z"}
    {"origin":"https://www.linkedin.com","depth":1,"success":false,"error":"HTTP 404","fetched_at":"2026-05-12T04:00:01Z"}
    {"origin":"https://mizcausevic.com","depth":1,"success":false,"error":"HTTP 404","fetched_at":"2026-05-12T04:00:01Z"}

| Flag | Default | Description |
|---|---|---|
| --seed | required | Seed origin URL. |
| --depth | 2 | Maximum graph distance from the seed. 0 = only fetch the seed. |
| --max-fetches | 100 | Global cap on total fetches. |
| --concurrency | 4 | Maximum in-flight HTTP requests. |
| --timeout | 10 | Per-request timeout in seconds. |

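
To illustrate what the --concurrency and --timeout flags control, here is one common way to cap in-flight work and bound each request in Go: a buffered channel used as a semaphore plus a per-call `context.WithTimeout`. It is a sketch under assumptions, not the crawler's implementation; `fetcher`, `fetchAll`, and `fakeFetch` are hypothetical names, and the fake fetch sleeps instead of doing HTTP.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// fetcher is a hypothetical stand-in for one HTTP fetch of an origin.
type fetcher func(ctx context.Context, origin string) error

// fetchAll calls fetch once per origin with at most concurrency calls in
// flight (concurrency must be at least 1), giving each call its own deadline.
// Errors are returned by index.
func fetchAll(origins []string, concurrency int, timeout time.Duration, fetch fetcher) []error {
	errs := make([]error, len(origins))
	sem := make(chan struct{}, concurrency) // counting semaphore
	var wg sync.WaitGroup

	for i, origin := range origins {
		sem <- struct{}{} // blocks while concurrency fetches are already running
		wg.Add(1)
		go func(i int, origin string) {
			defer wg.Done()
			defer func() { <-sem }()

			ctx, cancel := context.WithTimeout(context.Background(), timeout)
			defer cancel()
			errs[i] = fetch(ctx, origin)
		}(i, origin)
	}
	wg.Wait()
	return errs
}

// fakeFetch returns a fetcher that answers quickly except for one slow origin.
func fakeFetch(slow string) fetcher {
	return func(ctx context.Context, origin string) error {
		delay := 10 * time.Millisecond
		if origin == slow {
			delay = time.Second
		}
		select {
		case <-time.After(delay):
			return nil
		case <-ctx.Done():
			return ctx.Err() // the per-request deadline expired first
		}
	}
}

func main() {
	origins := []string{"https://a.example", "https://slow.example", "https://b.example"}
	errs := fetchAll(origins, 2, 100*time.Millisecond, fakeFetch("https://slow.example"))
	for i, err := range errs {
		fmt.Println(origins[i], err)
	}
}
```

The slow origin fails with a deadline error while the others succeed, which mirrors how one unresponsive host should not stall the rest of a crawl.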
Count successful AEO declarations:

    aeo-crawler --seed https://mizcausevic-dev.github.io | jq -c 'select(.success==true)' | wc -l

List unique entity names:

    aeo-crawler --seed https://mizcausevic-dev.github.io | jq -r 'select(.success==true) | .entity_name' | sort -u

Find origins that declare an audit_mode of signature:

    aeo-crawler --seed https://example.com --depth 3 | jq -c 'select(.audit_mode=="signature")'

For each fetched declaration, authority.primary_sources is treated as the source of next-hop candidate origins. Each URI is normalized to its scheme + host (path stripped). Already-visited origins are not re-fetched. The crawler does not currently chase citation_preferences.canonical_links or claims[].evidence; both are on the roadmap for v0.2.
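
The normalization and de-duplication steps described above could look roughly like the following. This is a sketch using only `net/url` from the standard library; `normalizeOrigin` and `nextHops` are hypothetical helper names, and lowercasing the host is an assumption of this example rather than documented crawler behavior.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// normalizeOrigin reduces a URI to scheme + host, dropping path, query,
// and fragment. Lowercasing the result is an assumption of this sketch.
func normalizeOrigin(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	if u.Scheme == "" || u.Host == "" {
		return "", fmt.Errorf("not an absolute URI: %q", raw)
	}
	return strings.ToLower(u.Scheme) + "://" + strings.ToLower(u.Host), nil
}

// nextHops returns the not-yet-visited origins among sources and marks
// them visited, so each origin is queued at most once.
func nextHops(sources []string, visited map[string]bool) []string {
	var out []string
	for _, s := range sources {
		origin, err := normalizeOrigin(s)
		if err != nil || visited[origin] {
			continue // skip unparseable URIs and origins already seen
		}
		visited[origin] = true
		out = append(out, origin)
	}
	return out
}

func main() {
	visited := map[string]bool{"https://mizcausevic-dev.github.io": true}
	sources := []string{
		"https://github.com/mizcausevic-dev",
		"https://github.com/mizcausevic-dev/aeo-crawler",
		"https://mizcausevic-dev.github.io/",
	}
	fmt.Println(nextHops(sources, visited)) // prints [https://github.com]
}
```

Because paths are stripped, two different GitHub URLs collapse to the single origin `https://github.com`, which is why the sample output earlier shows one fetch attempt per host.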
Operates against AEO Protocol v0.1 declarations at conformance Level 1 (Declare). Signature verification (L2) and audit-report submission (L3) are not invoked; signed documents are recorded as audit_mode: "signature" but not verified.
- github.com/mizcausevic-dev/aeo-sdk-go — Go SDK for parsing and fetching AEO declarations
- Go standard library (net/http, encoding/json, context, sync)

    go vet ./...
    go test -v ./...
    go build ./cmd/aeo-crawler

Tests use httptest to serve fixture AEO documents; no network is required.
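
As an illustration of the httptest approach, the sketch below serves a fixture at /.well-known/aeo.json from a local test server and decodes it. It is not taken from the repository's tests: the `declaration` struct models only the authority.primary_sources field named in this README, it assumes that field is an array of URI strings, and the fixture content and `fetchFixture` helper are hypothetical. It is written as a `main` program so it runs on its own; in a real test the same body would sit in a `_test.go` file.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// declaration models only the field this README names; the real schema is
// larger, and the string-array shape here is an assumption.
type declaration struct {
	Authority struct {
		PrimarySources []string `json:"primary_sources"`
	} `json:"authority"`
}

// Hypothetical fixture: just enough JSON to exercise next-hop extraction.
const fixture = `{"authority":{"primary_sources":["https://example.org/profile"]}}`

// fetchFixture starts a local test server, requests the well-known path,
// and returns the decoded primary sources.
func fetchFixture() ([]string, error) {
	mux := http.NewServeMux()
	mux.HandleFunc("/.well-known/aeo.json", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, fixture)
	})
	srv := httptest.NewServer(mux) // listens on a loopback address only
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/.well-known/aeo.json")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("HTTP %d", resp.StatusCode)
	}

	var d declaration
	if err := json.NewDecoder(resp.Body).Decode(&d); err != nil {
		return nil, err
	}
	return d.Authority.PrimarySources, nil
}

func main() {
	sources, err := fetchFixture()
	fmt.Println(sources, err)
}
```

Any path other than the well-known one returns 404 from the mux, which is a convenient way to reproduce the "HTTP 404" failures seen in the sample output.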
Full spec at github.com/mizcausevic-dev/aeo-protocol-spec.
AGPL-3.0.
| Spec | Implementation |
|---|---|
| AEO Protocol | aeo-sdk-python · aeo-sdk-typescript · aeo-sdk-rust · aeo-sdk-go · aeo-cli · aeo-crawler (this) |
| Prompt Provenance | — |
| Agent Cards | — |
| AI Evidence Format | — |
| MCP Tool Cards | — |