
Commit f47e067

feat(public): .md mirror for every route + /llms-full.txt aggregate (#17)
LLMs and crawlers no longer have to parse HTML to consume the marketing
site. Every public HTML route now has a parallel .md mirror at the same
path with a .md suffix:

  /                  → /index.md
  /pricing           → /pricing.md
  /for-agents        → /for-agents.md
  /status            → /status.md
  /docs              → /docs.md (all 8 sections concatenated)
  /blog              → /blog.md (post index)
  /blog/<slug>       → /blog/<slug>.md
  /use-cases         → /use-cases.md (catalogue grouped by category)
  /use-cases/<slug>  → /use-cases/<slug>.md

A single aggregate /llms-full.txt (~361 KB) concatenates every markdown
page with URL-keyed section separators, so an LLM that wants the entire
site's text can fetch one file instead of 115.

Sources:
- Blog posts: copy .content/blog/<slug>.md verbatim
- Use cases: copy .content/use-cases/<slug>.md verbatim
- Docs: concatenate .content/docs/*.md (ordered by frontmatter 'order')
  into a single /docs.md
- React-only marketing pages (home, pricing, for-agents, status): copy
  authored .content/pages/<name>.md
- Index pages (/blog, /use-cases): generated at build time from the
  corresponding directory listings + frontmatter

Markdown link convention follows GFM: any URL in the rendered text is in
[label](url) form so an LLM can follow them. Internal links use /path.md
(the mirror), not /path (the HTML route), so an LLM following links stays
in markdown.

Total prerender output is now:
- 115 HTML files (per-route SPA-pre-render)
- 115 .md files (mirror routes)
- 1 llms.txt (manifest pointing at the .md routes)
- 1 llms-full.txt (361 KB, 115 sections, full text dump)

GitHub repo descriptions for InstaNode-dev/content and
InstaNode-dev/instanode-web were updated in the same change to advertise
the LLM-friendly URLs.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
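A consumer of the aggregate might want a single page back out of /llms-full.txt. The sketch below is one way to do that, assuming the separator written by writeAggregate in this commit (a blank line, `---`, `URL: <path>`, `---`, a blank line). The helper names `toMirrorRoute` and `splitAggregate` and the sample text are hypothetical and are not part of scripts/prerender.mjs.

```javascript
// Hypothetical consumer-side helpers; none of this is in scripts/prerender.mjs.

// HTML route -> mirror route, following the convention in the commit message.
function toMirrorRoute(route) {
  return route === '/' ? '/index.md' : route + '.md'
}

// Split an llms-full.txt body into { route, content } sections.
// Only the full three-line separator is matched, so a bare '---' inside a
// page (for example YAML frontmatter in a blog post) is not a section break.
function splitAggregate(text) {
  const separator = /\n\n---\nURL: (\S+)\n---\n\n/g
  const marks = [...text.matchAll(separator)]
  return marks.map((m, i) => {
    const start = m.index + m[0].length
    const end = i + 1 < marks.length ? marks[i + 1].index : text.length
    return { route: m[1], content: text.slice(start, end).trim() }
  })
}

// Made-up sample in the same shape as the aggregate: header, then sections.
const sample =
  '# header\n\n' +
  '\n\n---\nURL: /index.md\n---\n\nHome page\n' +
  '\n\n---\nURL: /blog/hello.md\n---\n\n---\ntitle: Hello\n---\n\nPost body\n'

const sections = splitAggregate(sample)
const post = sections.find((s) => s.route === toMirrorRoute('/blog/hello'))

console.log(sections.map((s) => s.route))
console.log(post.content)
```

Blog posts and use cases are copied verbatim, frontmatter included, so splitting on `---` alone would cut pages apart; matching the `URL:` line as well avoids that. A page whose body happens to contain the exact separator sequence would still be split incorrectly.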
1 parent 28c6770 commit f47e067

1 file changed

Lines changed: 237 additions & 6 deletions


scripts/prerender.mjs

@@ -135,25 +135,256 @@ async function main() {
 
   // Step 5: copy /llms.txt from the content repo to dist root. The
   // llms.txt convention (https://llmstxt.org) expects the file at the
-  // domain root. Source of truth is .content/llms.txt — that lets the
-  // content repo author it like any other content file, and the
-  // dashboard build inlines it into dist/ for static hosting to serve.
+  // domain root.
   const llmsSource = resolve(ROOT, '.content/llms.txt')
+  let llmsBaseContent = ''
   if (existsSync(llmsSource)) {
-    const llmsContent = await readFile(llmsSource, 'utf-8')
-    await writeFile(resolve(DIST, 'llms.txt'), llmsContent, 'utf-8')
+    llmsBaseContent = await readFile(llmsSource, 'utf-8')
+    await writeFile(resolve(DIST, 'llms.txt'), llmsBaseContent, 'utf-8')
     console.log('prerender: copied llms.txt to dist root')
   } else {
     console.warn('prerender: no .content/llms.txt found, skipping')
   }
 
-  // Step 6: clean up the SSR bundle — it's only needed during this script.
+  // Step 6: emit .md mirror routes for every HTML page so LLMs and
+  // crawlers can consume plain text without parsing HTML. URL convention:
+  // /foo → /foo.md, /blog/foo → /blog/foo.md, / → /index.md.
+  //
+  // Sources:
+  // - Blog posts: copy .content/blog/<slug>.md verbatim
+  // - Use cases: copy .content/use-cases/<slug>.md verbatim
+  // - Docs page: concatenate all .content/docs/*.md (one page in HTML,
+  // so one combined markdown file at /docs.md)
+  // - Index pages (/blog.md, /use-cases.md): generated from filenames
+  // - React-only pages (/, /pricing, /for-agents, /status): copy
+  // authored .content/pages/<name>.md
+  //
+  // All emitted .md files are also concatenated into /llms-full.txt for
+  // one-shot LLM consumption. Section separators use "---" + the path.
+  console.log('prerender: emitting .md mirror routes…')
+  const mdRoutes = await emitMarkdownRoutes()
+  console.log(`prerender: wrote ${mdRoutes.length} .md files`)
+
+  // Step 7: aggregate every .md into /llms-full.txt — a single file an
+  // LLM can fetch once and have the entire site's content.
+  await writeAggregate(mdRoutes)
+  console.log('prerender: wrote llms-full.txt aggregate')
+
+  // Step 8: clean up the SSR bundle — it's only needed during this script.
   // Leaving it in dist-ssr would inflate the GH Pages upload by ~400 KB.
   await rm(SSR_DIST, { recursive: true, force: true })
 
   console.log(`prerender: ${written} files written. SEO-ready.`)
 }
 
+/* emitMarkdownRoutes — writes the .md mirror for every HTML route.
+ *
+ * Returns an array of {route, path, content} for the aggregate step. */
+async function emitMarkdownRoutes() {
+  const out = []
+
+  // Helper: write a .md file at a given route path.
+  // route '/foo' → dist/foo.md
+  // route '/foo/bar' → dist/foo/bar.md
+  // route '/' → dist/index.md
+  async function writeRouteMd(route, content) {
+    const fileSubpath = route === '/' ? 'index.md' : route.replace(/^\//, '') + '.md'
+    const outPath = resolve(DIST, fileSubpath)
+    await mkdir(dirname(outPath), { recursive: true })
+    await writeFile(outPath, content, 'utf-8')
+    out.push({ route: route === '/' ? '/index.md' : route + '.md', content })
+  }
+
+  // 1. React-only pages — read from .content/pages/<name>.md
+  const reactPageMap = {
+    '/': 'home.md',
+    '/pricing': 'pricing.md',
+    '/for-agents': 'for-agents.md',
+    '/status': 'status.md',
+  }
+  for (const [route, filename] of Object.entries(reactPageMap)) {
+    const src = resolve(ROOT, `.content/pages/${filename}`)
+    if (!existsSync(src)) {
+      console.warn(` skip ${route}: no ${filename}`)
+      continue
+    }
+    const text = await readFile(src, 'utf-8')
+    await writeRouteMd(route, text)
+  }
+
+  // 2. Blog posts — copy verbatim
+  const blogDir = resolve(ROOT, '.content/blog')
+  const blogFiles = existsSync(blogDir)
+    ? readdirSync(blogDir).filter((f) => f.endsWith('.md'))
+    : []
+  for (const f of blogFiles) {
+    const slug = f.replace(/\.md$/, '')
+    const text = await readFile(resolve(blogDir, f), 'utf-8')
+    await writeRouteMd(`/blog/${slug}`, text)
+  }
+
+  // 3. /blog index — generated from blog post filenames + frontmatter
+  if (blogFiles.length > 0) {
+    const blogIndex = await buildBlogIndex(blogDir, blogFiles)
+    await writeRouteMd('/blog', blogIndex)
+  }
+
+  // 4. Use cases — copy verbatim per file
+  const useCaseDir = resolve(ROOT, '.content/use-cases')
+  const useCaseFiles = existsSync(useCaseDir)
+    ? readdirSync(useCaseDir).filter((f) => f.endsWith('.md'))
+    : []
+  for (const f of useCaseFiles) {
+    const slug = f.replace(/\.md$/, '')
+    const text = await readFile(resolve(useCaseDir, f), 'utf-8')
+    await writeRouteMd(`/use-cases/${slug}`, text)
+  }
+
+  // 5. /use-cases index — generated, grouped by category
+  if (useCaseFiles.length > 0) {
+    const useCaseIndex = await buildUseCasesIndex(useCaseDir, useCaseFiles)
+    await writeRouteMd('/use-cases', useCaseIndex)
+  }
+
+  // 6. /docs — concatenate all docs sections into one markdown page
+  const docsDir = resolve(ROOT, '.content/docs')
+  const docsFiles = existsSync(docsDir)
+    ? readdirSync(docsDir).filter((f) => f.endsWith('.md'))
+    : []
+  if (docsFiles.length > 0) {
+    const docsPage = await buildDocsPage(docsDir, docsFiles)
+    await writeRouteMd('/docs', docsPage)
+  }
+
+  return out
+}
+
+/* writeAggregate — bundle every .md mirror into one llms-full.txt at
+ * dist root. Each section is prefixed with a separator that includes
+ * the URL path the section came from. */
+async function writeAggregate(mdRoutes) {
+  const header = `# instanode.dev — full text dump\n\n` +
+    `This file is the concatenation of every .md route on instanode.dev.\n` +
+    `For the per-route URLs and an LLM-oriented index, see\n` +
+    `https://instanode.dev/llms.txt — that's the manifest pointing here.\n\n` +
+    `Each section below is delimited by an HTTP-style header line\n` +
+    `(\`URL: <path>\`) and a horizontal rule. There are ${mdRoutes.length} sections\n` +
+    `in this file.\n\n`
+
+  const sections = mdRoutes.map(({ route, content }) =>
+    `\n\n---\nURL: ${route}\n---\n\n${content.trim()}\n`,
+  )
+
+  await writeFile(resolve(DIST, 'llms-full.txt'), header + sections.join(''), 'utf-8')
+}
+
+/* buildBlogIndex — emit a markdown index of every blog post: title,
+ * date, excerpt, link to the .md detail. */
+async function buildBlogIndex(dir, files) {
+  const posts = []
+  for (const f of files) {
+    const src = await readFile(resolve(dir, f), 'utf-8')
+    const meta = parseFrontmatter(src)
+    if (!meta.title || !meta.date) continue
+    posts.push({
+      slug: f.replace(/\.md$/, ''),
+      title: meta.title,
+      date: meta.date,
+      excerpt: meta.excerpt || '',
+    })
+  }
+  posts.sort((a, b) => b.date.localeCompare(a.date))
+
+  let out = `# Blog — instanode.dev\n\n`
+  out += `> Build notes, retrospectives, and the occasional rant on what "frictionless for AI agents" actually means.\n\n`
+  out += `## Posts\n\n`
+  for (const p of posts) {
+    out += `### [${p.title}](/blog/${p.slug}.md)\n\n`
+    out += `*${p.date}*\n\n`
+    if (p.excerpt) out += `${p.excerpt}\n\n`
+  }
+  return out
+}
+
+/* buildUseCasesIndex — emit a markdown catalogue of every use case
+ * grouped by category, each linking to its .md detail page. */
+async function buildUseCasesIndex(dir, files) {
+  const cases = []
+  for (const f of files) {
+    const src = await readFile(resolve(dir, f), 'utf-8')
+    const meta = parseFrontmatter(src)
+    if (!meta.title || !meta.category) continue
+    cases.push({
+      slug: f.replace(/\.md$/, ''),
+      title: meta.title,
+      category: meta.category,
+      scenario: meta.scenario || '',
+    })
+  }
+
+  const grouped = new Map()
+  for (const c of cases) {
+    if (!grouped.has(c.category)) grouped.set(c.category, [])
+    grouped.get(c.category).push(c)
+  }
+  const cats = Array.from(grouped.keys()).sort()
+
+  let out = `# Use cases — instanode.dev\n\n`
+  out += `> ${cases.length} unique scenarios across ${cats.length} archetypes. Each detail page includes a paste-ready prompt that any vanilla LLM (ChatGPT, Claude, Gemini) can act on with no MCP and no installation — point the LLM at https://instanode.dev/llms.txt for the API contract and it generates a runnable script.\n\n`
+  for (const cat of cats) {
+    out += `## ${cat}\n\n`
+    const list = grouped.get(cat).sort((a, b) => a.title.localeCompare(b.title))
+    for (const c of list) {
+      out += `- [${c.title}](/use-cases/${c.slug}.md)`
+      if (c.scenario) out += ` — ${c.scenario}`
+      out += `\n`
+    }
+    out += `\n`
+  }
+  return out
+}
+
+/* buildDocsPage — concatenate all docs sections (ordered by frontmatter
+ * 'order') into one markdown page mirroring the HTML /docs page. */
+async function buildDocsPage(dir, files) {
+  const sections = []
+  for (const f of files) {
+    const src = await readFile(resolve(dir, f), 'utf-8')
+    const meta = parseFrontmatter(src)
+    const body = src.replace(/^---\r?\n[\s\S]*?\r?\n---\r?\n?/, '')
+    sections.push({
+      id: f.replace(/\.md$/, ''),
+      title: meta.title || f,
+      order: Number(meta.order) || 0,
+      body: body.trim(),
+    })
+  }
+  sections.sort((a, b) => a.order - b.order)
+
+  let out = `# Documentation — instanode.dev\n\n`
+  out += `> Everything you need to provision, deploy, and claim. Every curl below works against \`https://api.instanode.dev\` as-is.\n\n`
+  for (const s of sections) {
+    out += `## ${s.title}\n\n${s.body}\n\n`
+  }
+  return out
+}
+
+/* parseFrontmatter — tiny YAML subset for blog/use-case/docs headers.
+ * Mirrors the runtime parsers in src/content/*.ts. */
+function parseFrontmatter(src) {
+  const m = src.match(/^---\r?\n([\s\S]*?)\r?\n---\r?\n?/)
+  if (!m) return {}
+  const meta = {}
+  for (const line of m[1].split(/\r?\n/)) {
+    const sep = line.indexOf(':')
+    if (sep < 0) continue
+    const key = line.slice(0, sep).trim()
+    const value = line.slice(sep + 1).trim()
+    if (key) meta[key] = value
+  }
+  return meta
+}
+
 main().catch((err) => {
   console.error('prerender failed:', err)
   process.exit(1)
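
The frontmatter parser in this diff splits each line at its first colon and keeps the rest as a raw string. The sketch below copies that function so it runs on its own and shows what that means for quoted titles and for the `Number(meta.order) || 0` fallback in buildDocsPage. The `unquote` helper and the sample document are hypothetical and do not appear in the commit.

```javascript
// Copied from the diff so this sketch is self-contained.
function parseFrontmatter(src) {
  const m = src.match(/^---\r?\n([\s\S]*?)\r?\n---\r?\n?/)
  if (!m) return {}
  const meta = {}
  for (const line of m[1].split(/\r?\n/)) {
    const sep = line.indexOf(':')
    if (sep < 0) continue
    const key = line.slice(0, sep).trim()
    const value = line.slice(sep + 1).trim()
    if (key) meta[key] = value
  }
  return meta
}

// Hypothetical helper, not in the commit: strip one pair of matching quotes.
function unquote(value) {
  const m = value.match(/^(["'])(.*)\1$/)
  return m ? m[2] : value
}

// Made-up sample document.
const doc = [
  '---',
  'title: "Deploy: the short version"',
  'date: 2025-01-15',
  'order: two',
  '---',
  'Body text.',
].join('\n')

const meta = parseFrontmatter(doc)

console.log(meta.title)               // the surrounding quotes are kept
console.log(unquote(meta.title))      // quotes removed by the hypothetical helper
console.log(Number(meta.order) || 0)  // a non-numeric order falls back to 0
```

Because only the first colon splits the line, a colon inside a value survives. Quotes are not stripped, so a quoted title would carry its quotes into the generated /blog.md and /use-cases.md headings unless the content files avoid quoting. Docs sections with a missing or non-numeric `order` all get 0 and therefore sort ahead of any section with a positive order.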
