Commit 5a211cd

Add web scraping permissions.
1 parent e9c5c4f commit 5a211cd

1 file changed

Lines changed: 48 additions & 0 deletions

site/public/robots.txt

@@ -0,0 +1,48 @@
+# As a condition of accessing this website, you agree to abide by the following
+# content signals:
+#
+# (a) If a Content-Signal = yes, you may collect content for the corresponding use.
+# (b) If a Content-Signal = no, you may not collect content for the corresponding use.
+# (c) If no Content-Signal is present for a use, the operator neither grants nor
+# restricts permission via Content-Signal for that use.
+#
+# Signals: search = building a search index / returning hyperlinks and short excerpts
+# (not AI-generated summaries)
+# ai-input = feeding content into AI models (RAG, grounding, generative answers)
+# ai-train = training or fine-tuning AI models
+#
+# ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF RIGHTS
+# UNDER ARTICLE 4 OF EU DIRECTIVE 2019/790 ON COPYRIGHT IN THE DIGITAL SINGLE MARKET.
+
+# --- Default policy: index for search, no AI training -------------------------
+User-agent: *
+Content-Signal: search=yes,ai-train=no
+Allow: /
+# Build artifacts and map data — no value to any crawler, and the PMTiles layers
+# are large. Keep crawlers out of them.
+Disallow: /assets/
+
+# --- AI training / scraper bots: full opt-out ---------------------------------
+User-agent: Amazonbot
+Disallow: /
+
+User-agent: Applebot-Extended
+Disallow: /
+
+User-agent: Bytespider
+Disallow: /
+
+User-agent: CCBot
+Disallow: /
+
+User-agent: ClaudeBot
+Disallow: /
+
+User-agent: Google-Extended
+Disallow: /
+
+User-agent: GPTBot
+Disallow: /
+
+User-agent: meta-externalagent
+Disallow: /
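
The three-way reading of content signals described in the file's header comments (yes, no, or absent) can be sketched in Python. This is only an illustration of those comments, not a reference parser: the function names `parse_content_signal` and `permission` and the labels "granted", "restricted" and "unspecified" are made up for this example. `Content-Signal` is not one of the fields defined by the Robots Exclusion Protocol, so a parser that does not recognise it would typically skip the line.

```python
def parse_content_signal(value: str) -> dict[str, bool]:
    """Parse a Content-Signal value such as 'search=yes,ai-train=no'.

    Hypothetical helper: entries that are not 'name=yes' or 'name=no'
    are skipped rather than guessed at.
    """
    signals: dict[str, bool] = {}
    for item in value.split(","):
        name, sep, setting = item.strip().partition("=")
        setting = setting.strip().lower()
        if sep and setting in ("yes", "no"):
            signals[name.strip().lower()] = setting == "yes"
    return signals


def permission(signals: dict[str, bool], use: str) -> str:
    """Map a use to the three cases (a), (b), (c) from the header comments."""
    if use not in signals:
        return "unspecified"  # case (c): neither granted nor restricted
    return "granted" if signals[use] else "restricted"  # cases (a) and (b)


# The value committed for the default `User-agent: *` group.
DEFAULT_SIGNALS = parse_content_signal("search=yes,ai-train=no")

for use in ("search", "ai-input", "ai-train"):
    print(f"{use}: {permission(DEFAULT_SIGNALS, use)}")
```

Under this reading, the default group grants `search`, restricts `ai-train`, and says nothing about `ai-input`, since that signal is not set in the committed line.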

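
How the `Allow` and `Disallow` lines above combine depends on the matcher. RFC 9309 says the most specific (longest) matching rule wins, so `Disallow: /assets/` overrides `Allow: /` for paths under `/assets/`. Parsers that apply rules in file order and stop at the first match may read the default group differently, because `Allow: /` comes first. The sketch below is a minimal longest-match evaluator written for this file only. It assumes exact, case-insensitive user-agent names and plain path prefixes (no `*` or `$` wildcards). `ExampleBot` and the sample paths are hypothetical, and the embedded file is an abbreviated copy of the committed one.

```python
ROBOTS_TXT = """\
User-agent: *
Content-Signal: search=yes,ai-train=no
Allow: /
Disallow: /assets/

User-agent: GPTBot
Disallow: /
"""


def parse_groups(text: str) -> dict[str, list[tuple[bool, str]]]:
    """Return {lowercased user-agent: [(is_allow, path_prefix), ...]}."""
    groups: dict[str, list[tuple[bool, str]]] = {}
    agents: list[str] = []
    seen_rule = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:  # a user-agent line after rules starts a new group
                agents, seen_rule = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            seen_rule = True
            if value:  # an empty path matches nothing
                for agent in agents:
                    groups[agent].append((field == "allow", value))
        # Any other field (e.g. Content-Signal) is ignored by this matcher.
    return groups


def is_allowed(groups: dict[str, list[tuple[bool, str]]],
               user_agent: str, path: str) -> bool:
    """Longest-prefix match; on a tie, allow wins; no match means allowed."""
    # A named group replaces the `*` group entirely; rules are not merged.
    rules = groups.get(user_agent.lower(), groups.get("*", []))
    best_allow, best_len = True, 0
    for allow, prefix in rules:
        if path.startswith(prefix):
            if len(prefix) > best_len or (len(prefix) == best_len and allow):
                best_allow, best_len = allow, len(prefix)
    return best_allow


GROUPS = parse_groups(ROBOTS_TXT)
print(is_allowed(GROUPS, "ExampleBot", "/index.html"))         # default group
print(is_allowed(GROUPS, "ExampleBot", "/assets/map.pmtiles"))  # default group
print(is_allowed(GROUPS, "GPTBot", "/index.html"))              # named group
```

The listed AI bots match their own groups, so for them only `Disallow: /` applies and the default group's `Allow: /` is never consulted.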