Commit 7380adb

Link checker
1 parent 362d4f4 commit 7380adb

5 files changed

Lines changed: 126 additions & 1 deletion

.github/workflows/link-check.yml

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
+name: Link Check
+
+on:
+  # Run monthly to catch link rot
+  schedule:
+    - cron: '0 0 1 * *' # First day of each month at midnight
+  # Also run on workflow_dispatch so you can trigger manually
+  workflow_dispatch:
+  # Run on PRs to catch broken links before merging
+  pull_request:
+    paths:
+      - 'src/**/*.md'
+      - 'src/**/*.njk'
+
+jobs:
+  link-check:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20.x'
+
+      - name: Install dependencies
+        run: npm ci
+
+      - name: Build site
+        run: npm run build
+
+      - name: Check links
+        uses: lycheeverse/lychee-action@v1
+        with:
+          # Check all HTML files in the built site
+          args: --verbose --no-progress './_site/**/*.html'
+          # Fail the workflow if broken links are found
+          fail: true
+        env:
+          GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
+
+      - name: Create Issue on failure
+        if: failure() && github.event_name != 'pull_request'
+        uses: actions/github-script@v7
+        with:
+          script: |
+            const title = '🔗 Broken links detected';
+            const body = `The link check found broken links.\n\nCheck the [workflow run](${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId}) for details.`;
+
+            // Check if an issue already exists
+            const issues = await github.rest.issues.listForRepo({
+              owner: context.repo.owner,
+              repo: context.repo.repo,
+              state: 'open',
+              labels: ['broken-links']
+            });
+
+            if (issues.data.length === 0) {
+              await github.rest.issues.create({
+                owner: context.repo.owner,
+                repo: context.repo.repo,
+                title: title,
+                body: body,
+                labels: ['broken-links']
+              });
+            }
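
The issue step in this workflow follows a "create only if no open issue carries the label" pattern. A minimal sketch of that pattern in isolation, assuming a hypothetical `ensureIssue` helper and an in-memory `makeFakeIssuesApi` stand-in for `github.rest.issues` (neither is part of this commit), might look like:

```javascript
// Hypothetical helper (not part of the commit): create a labeled issue only
// when no open issue with that label already exists.
async function ensureIssue(issuesApi, { owner, repo, title, body, label }) {
  const existing = await issuesApi.listForRepo({
    owner,
    repo,
    state: 'open',
    labels: label,
  });
  if (existing.data.length > 0) {
    return { created: false };
  }
  await issuesApi.create({ owner, repo, title, body, labels: [label] });
  return { created: true };
}

// In-memory stand-in for github.rest.issues, for illustration only.
// It ignores owner/repo/state and filters on the label alone.
function makeFakeIssuesApi() {
  const issues = [];
  return {
    issues,
    async listForRepo({ labels }) {
      return { data: issues.filter((issue) => issue.labels.includes(labels)) };
    },
    async create({ title, body, labels }) {
      issues.push({ title, body, labels });
      return { data: { number: issues.length } };
    },
  };
}
```

Calling `ensureIssue` twice against the same fake client should create an issue on the first call and skip it on the second, which is why repeated failing runs do not pile up duplicate issues.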

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -4,3 +4,6 @@ _site
 node_modules
 
 .DS_Store
+
+# Link checker cache
+.lycheecache

.markdownlint.json

Lines changed: 3 additions & 1 deletion
@@ -1,4 +1,6 @@
 {
   "MD013": false,
-  "MD033": false
+  "MD033": false,
+  "default": true,
+  "ignores": ["**/*.toml", "LICENSE"]
 }

README.md

Lines changed: 12 additions & 0 deletions
@@ -29,6 +29,18 @@ The development server includes:
 
 Having produced a build in `_site`, the entire directory is pushed up on a `gh-pages` branch to GitHub, and hosted exactly as-is.
 
+## Quality Assurance
+
+### Link Checking
+
+The repository includes automated link checking via GitHub Actions:
+
+- **Scheduled checks**: Runs monthly to catch link rot
+- **PR checks**: Validates links in pull requests that change `src/**/*.md` or `src/**/*.njk`
+- **Manual trigger**: Can be run on demand from the Actions tab
+
+Configuration is in `lychee.toml`. If broken links are found, an issue is created automatically with the `broken-links` label.
+
 ## Publishing a new blog post
 
 A new blog post should be authored within the `/src/blog` directory. It should have its own subdirectory. The name of the subdirectory will become the URL route, e.g. `/src/blog/42-foo/index.md` will become `rupertmckay.com/blog/42-foo`. I maintain a convention of prefixing the directories with a number in order to keep the directories chronologically ordered in the source code. The markdown file must be named `index.md` or else its name will be appended to the route URL, e.g. `/src/blog/42-foo/foo.md` will become `rupertmckay.com/blog/42-foo/foo`, which we don't want.

lychee.toml

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+# Lychee link checker configuration
+# https://github.com/lycheeverse/lychee
+
+# Maximum number of concurrent network requests
+max_concurrency = 8
+
+# Input files are passed as CLI args in the workflow; `include` takes URL regexes, not file globs
+# include = ["./**/*.html"]
+
+# Exclude patterns (regex)
+exclude = [
+  # Exclude localhost and local IPs
+  "http://localhost",
+  "http://127.0.0.1",
+  "http://0.0.0.0",
+
+  # Exclude common patterns that might be examples
+  "http://example.com",
+  "https://example.com",
+
+  # Exclude mailto links (they often fail checks)
+  "mailto:",
+]
+
+# Accepted status codes for valid links
+accept = [200, 201, 202, 203, 204, 206, 300, 301, 302, 303, 304, 307, 308, "999"]
+
+# Maximum number of redirects to follow
+max_redirects = 10
+
+# Timeout in seconds for network requests
+timeout = 20
+
+# User agent
+user_agent = "Mozilla/5.0 (compatible; LinkChecker/1.0)"
+
+# Exclude private network addresses
+exclude_private = true
+
+# Don't check links that are already cached (speeds up repeat checks)
+cache = true
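
The `exclude` entries in this config are regular expressions rather than literal prefixes. The sketch below approximates how such a list filters URLs, using JavaScript's `RegExp` as a stand-in for the Rust regex engine lychee uses. The `isExcluded` helper is hypothetical, and unanchored matching (a pattern may match anywhere in the URL) is an assumption here.

```javascript
// Patterns copied from the `exclude` list in lychee.toml.
const excludePatterns = [
  'http://localhost',
  'http://127.0.0.1',
  'http://0.0.0.0',
  'http://example.com',
  'https://example.com',
  'mailto:',
];

// Hypothetical helper: true when any pattern matches somewhere in the URL.
// Assumes unanchored regex matching; JavaScript's RegExp stands in for
// the Rust regex crate.
function isExcluded(url, patterns = excludePatterns) {
  return patterns.some((pattern) => new RegExp(pattern).test(url));
}

console.log(isExcluded('http://localhost:8080/blog/'));   // true
console.log(isExcluded('mailto:someone@example.org'));    // true
console.log(isExcluded('https://rupertmckay.com/blog/')); // false
```

Because an unescaped `.` matches any character, a pattern such as `http://example.com` would also match `http://exampleXcom`. Writing the dots as `\\.` inside the TOML strings would make the match strict, if that ever matters.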