Commit e07f396

Merge pull request #142 from KubaO/staging
Fix redirect links in the offline site so they remain offline.
2 parents 1c9660c + 1e360a8 commit e07f396

8 files changed

Lines changed: 676 additions & 261 deletions

docs/Miscellaneous/Documentation Development.md

Lines changed: 1 addition & 1 deletion
@@ -201,7 +201,7 @@ To check that none of the internal links in the most recent documentation build
 
 check.bat
 
-This runs [Lychee](https://github.com/lycheeverse/lychee) in offline mode against the built `_site/`.
+This runs three checks: [Lychee](https://github.com/lycheeverse/lychee) in offline mode against `_site/` (the live tree), the same against `_site-offline/` (the file://-browsable mirror), and a small Python pass over `_site-offline/` that flags any surviving `https://docs.twinbasic.com/<path>` link --- the offline mirror should not navigate back to the live docs site.
 
 ### Building and Local Serving

docs/Miscellaneous/FAQs.md

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ For a full list of all the new features available right now, see the Wiki articl
 {: #to-learn-more }
 [twinBASIC Home Page](https://twinbasic.com)
 
-twinBASIC GitHub: [Main section](https://github.com/twinbasic/twinbasic) \| [Issues](https://github.com/twinbasic/twinbasic/issues) \| [Discussions](https://github.com/twinbasic/twinbasic/discussions) \| [Language Design](https://github.com/twinbasic/lang-design) \| [ Language Specification](https://github.com/twinbasic/lang-spec) \| [Documentation](https://docs.twinbasic.comi)
+twinBASIC GitHub: [Main section](https://github.com/twinbasic/twinbasic) \| [Issues](https://github.com/twinbasic/twinbasic/issues) \| [Discussions](https://github.com/twinbasic/twinbasic/discussions) \| [Language Design](https://github.com/twinbasic/lang-design) \| [ Language Specification](https://github.com/twinbasic/lang-spec) \| [Documentation](https://docs.twinbasic.com)
 
 [twinBASIC Discord](https://discord.gg/UaW9GgKKuE)

docs/_plugins/book-href-rewrite.rb

Lines changed: 5 additions & 0 deletions
@@ -228,6 +228,8 @@ def self.process(page)
     return if parent_map.empty?
     landing_anchors = build_landing_anchors(site)
 
+    start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
+
     rewritten = 0
     landings_stripped = 0
     page.output = page.output.gsub(/(<article[^>]*id="(ch-[^"]+)"[^>]*>)(.*?)(<\/article>)/m) do
@@ -254,6 +256,9 @@ def self.process(page)
       "#{article_open}#{body}#{article_end}"
     end
     Jekyll.logger.info "BookHrefRewrite:", "rewrote #{rewritten} chapter bodies, stripped #{landings_stripped} landing H3s"
+
+    elapsed_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time) * 1000).round(0)
+    Jekyll.logger.info "BookHrefRewrite:", "BookHrefRewriter ran in #{elapsed_ms}ms."
   end
 end


docs/_plugins/offlinify.md

Lines changed: 129 additions & 62 deletions
Large diffs are not rendered by default.

docs/_plugins/offlinify.rb

Lines changed: 420 additions & 192 deletions
Large diffs are not rendered by default.

docs/_plugins/pdfify.rb

Lines changed: 5 additions & 0 deletions
@@ -88,6 +88,8 @@ def self.run(site, source_root, dest_root)
       return
     end
 
+    start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
+
     # Wipe the destination tree so previous runs do not leave stale
     # images behind when source pages are deleted or renamed.
     FileUtils.rm_rf(dest)
@@ -132,6 +134,9 @@ def self.run(site, source_root, dest_root)
     book_src.delete
 
     Jekyll.logger.info "Pdfify:", "wrote #{dest_root} -- copied #{copied} file(s) (#{image_paths.size} image(s)#{skipped.zero? ? "" : ", #{skipped} missing"})"
+
+    elapsed_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time) * 1000).round(0)
+    Jekyll.logger.info "Pdfify:", "Pdfifier ran in #{elapsed_ms}ms."
   end
 
   # Walks book.html for relative `<img src=>` URLs and returns the
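
The two plugin hunks above (book-href-rewrite.rb and pdfify.rb) add the same timing pattern: read a monotonic clock before the work, read it again afterwards, and log the difference rounded to whole milliseconds. A monotonic clock is used for intervals because it is not affected by system clock adjustments. A rough Python analogue, offered only as a sketch (the plugins themselves are Ruby, and `timed`, `label` and `work` are hypothetical names that do not appear in the commit):

```python
import time


def timed(label, work):
    """Run work() and return (result, elapsed whole milliseconds)."""
    # time.monotonic() plays the role of Ruby's
    # Process.clock_gettime(Process::CLOCK_MONOTONIC) in the hunks above.
    start = time.monotonic()
    result = work()
    elapsed_ms = round((time.monotonic() - start) * 1000)
    print(f"{label} ran in {elapsed_ms}ms.")
    return result, elapsed_ms


# Hypothetical workload, used only to have something to time.
result, elapsed_ms = timed("Example", lambda: sum(range(100_000)))
```

The elapsed value depends on the machine, so no expected output is shown.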

docs/check.bat

Lines changed: 18 additions & 5 deletions
@@ -1,4 +1,5 @@
-@rem Use lychee to check the links in both build outputs.
+@rem Use lychee to check the links in both build outputs, then scan
+@rem _site-offline/ for live-site links that survived offlinify.
 @rem
 @rem _site/ Online tree. `--fallback-extensions html` mirrors what
 @rem GitHub Pages does at request time: an extensionless
@@ -10,10 +11,17 @@
 @rem markdown sources whose permalink shape doesn't match
 @rem the rendered filename (e.g. `[Foo](Foo/)` when Jekyll
 @rem wrote `Foo.html`, not `Foo/index.html`).
+@rem live-links Greps _site-offline/ HTML for any surviving
+@rem https://docs.twinbasic.com reference outside <code> /
+@rem <pre> blocks. After _plugins/offlinify.rb strips the
+@rem jekyll-seo-tag block from each page, none should
+@rem remain -- a hit means a source link goes to the live
+@rem site instead of the canonical /tB/... permalink.
+@rem See ../scripts/check_offline_live_links.py.
 @rem
-@rem Both checks always run so you see all errors in one pass; the script
-@rem exits non-zero if either fails (online failure takes precedence in
-@rem the reported code).
+@rem All three checks always run so you see all errors in one pass; the
+@rem script exits non-zero if any fails (earlier failures take precedence
+@rem in the reported code).
 @setlocal
 @set LYCHEE="%~dp0..\.claude\lychee.exe"
 @echo Checking _site/ (online) ...
@@ -28,5 +36,10 @@
 @rem such fallback, and the link is just broken.
 @%LYCHEE% --offline --include-fragments --index-files "index.html" --root-dir ".\_site-offline" ".\_site-offline" %*
 @set EXIT2=%ERRORLEVEL%
+@echo.
+@echo Checking _site-offline/ for live-site links ...
+@python "%~dp0..\scripts\check_offline_live_links.py"
+@set EXIT3=%ERRORLEVEL%
 @if %EXIT1% NEQ 0 exit /b %EXIT1%
-@exit /b %EXIT2%
+@if %EXIT2% NEQ 0 exit /b %EXIT2%
+@exit /b %EXIT3%
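
The exit logic at the end of check.bat runs every check, then reports the first non-zero code in run order. A minimal Python sketch of that precedence rule, assuming only what the batch lines above show (`combined_exit_code` is a hypothetical helper, and the sample codes are made up):

```python
def combined_exit_code(*codes: int) -> int:
    """Return the first non-zero exit code in run order, or 0 if all passed."""
    for code in codes:
        if code != 0:
            return code
    return 0


# Made-up results in the order EXIT1, EXIT2, EXIT3: the online pass
# succeeded, the offline pass and the live-link scan both failed.
# The earlier failure wins, so the offline pass's code is reported.
print(combined_exit_code(0, 2, 1))  # prints 2
```

Because all three checks have already run by the time the code is chosen, the precedence only affects which number is reported, not which errors are printed.
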

scripts/check_offline_live_links.py

Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
+"""
+Scan docs/_site-offline/ for any https://docs.twinbasic.com/<path>
+reference outside of <code> / <pre> blocks. Exit 1 if any found,
+0 otherwise.
+
+Run by docs/check.bat after the offline lychee pass. After
+_plugins/offlinify.rb's SEO-block strip, no live-site references
+should remain except:
+
+  * Sample URLs inside <code> / <pre> blocks (tutorial code that
+    legitimately shows live URLs as data, e.g. the VBRUN.Hyperlink
+    `NavigateTo "https://docs.twinbasic.com/"` example). Skipped
+    via the same code-block shape offlinify uses for its URL
+    rewrite.
+  * The bare root URL `https://docs.twinbasic.com` or
+    `https://docs.twinbasic.com/` -- intentional "go to the live
+    docs site" links (e.g. the Documentation entry in the FAQ
+    resource list). Skipped via the tail check below.
+
+Anything deeper (`https://docs.twinbasic.com/tB/Core/Const`,
+`https://docs.twinbasic.comi`, ...) is flagged: in the offline
+copy those navigate back to the live site, undermining the local
+read; in source they should be a relative link or a /tB/...
+permalink that resolves locally.
+
+Run from anywhere:
+    python scripts/check_offline_live_links.py
+"""
+
+import re
+import sys
+from pathlib import Path
+
+SCRIPT_DIR = Path(__file__).resolve().parent
+REPO_ROOT = SCRIPT_DIR.parent
+OFFLINE_TREE = REPO_ROOT / "docs" / "_site-offline"
+
+# Matches a <code>...</code> or <pre>...</pre> block. Same shape as
+# _plugins/offlinify.rb CODE_BLOCK_RE so sample URLs in tutorial
+# code are skipped here too.
+CODE_BLOCK_RE = re.compile(r"<(code|pre)\b[^>]*>.*?</\1>", re.DOTALL)
+
+# Captures the trailing path/typo characters after the domain. An
+# empty tail or `/` means the bare root URL (intentional). Anything
+# else is a deep link or a typo (`.comi`, `.com/tB/...`).
+LIVE_LINK_RE = re.compile(r"https://docs\.twinbasic\.com(?P<tail>[^\s\"'<>]*)")
+
+
+def main() -> int:
+    if not OFFLINE_TREE.is_dir():
+        print(
+            f"_site-offline/ not found at {OFFLINE_TREE} -- run docs/build.bat first."
+        )
+        return 2
+
+    hits = []
+    for html in sorted(OFFLINE_TREE.rglob("*.html")):
+        content = html.read_text(encoding="utf-8")
+        link_matches = list(LIVE_LINK_RE.finditer(content))
+        if not link_matches:
+            continue
+        code_ranges = [(m.start(), m.end()) for m in CODE_BLOCK_RE.finditer(content)]
+        for m in link_matches:
+            tail = m.group("tail")
+            if tail == "" or tail == "/":
+                continue
+            if any(s <= m.start() < e for s, e in code_ranges):
+                continue
+            line_num = content.count("\n", 0, m.start()) + 1
+            start = max(0, m.start() - 60)
+            end = min(len(content), m.start() + 80)
+            snippet = re.sub(r"[\r\n]+", " ", content[start:end])
+            hits.append((html, line_num, snippet))
+
+    if hits:
+        print(
+            f"FAIL: {len(hits)} reference(s) to docs.twinbasic.com in "
+            f"_site-offline/ outside code blocks:"
+        )
+        for path, line_num, snippet in hits:
+            try:
+                rel = path.relative_to(REPO_ROOT)
+            except ValueError:
+                rel = path
+            print(f"  {rel}:{line_num}: ...{snippet}...")
+        print()
+        print(
+            "Update the source markdown to use a relative link or /tB/... "
+            "permalink instead."
+        )
+        return 1
+
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
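
To see how the scanner's two skip rules interact, the matching logic can be exercised on an in-memory string instead of the built tree. The sketch below copies the two regular expressions from the script above; `flagged_tails` and `SAMPLE` are hypothetical names introduced here, and the HTML fragment is made up for illustration.

```python
import re

# Same two patterns as in scripts/check_offline_live_links.py.
CODE_BLOCK_RE = re.compile(r"<(code|pre)\b[^>]*>.*?</\1>", re.DOTALL)
LIVE_LINK_RE = re.compile(r"https://docs\.twinbasic\.com(?P<tail>[^\s\"'<>]*)")


def flagged_tails(content: str) -> list:
    """Return the tail of every live-site link the scanner would report."""
    code_ranges = [(m.start(), m.end()) for m in CODE_BLOCK_RE.finditer(content)]
    tails = []
    for m in LIVE_LINK_RE.finditer(content):
        tail = m.group("tail")
        if tail in ("", "/"):
            continue  # bare root URL: intentional link to the live site
        if any(s <= m.start() < e for s, e in code_ranges):
            continue  # sample URL inside a <code> / <pre> block
        tails.append(tail)
    return tails


# Made-up page fragment: a root link, a deep link inside <pre>,
# a deep link in ordinary markup, and the `.comi` typo.
SAMPLE = (
    '<a href="https://docs.twinbasic.com/">Documentation</a>\n'
    '<pre>NavigateTo "https://docs.twinbasic.com/tB/Core/Const"</pre>\n'
    '<a href="https://docs.twinbasic.com/tB/Core/Const">Const</a>\n'
    '<a href="https://docs.twinbasic.comi">typo</a>\n'
)

print(flagged_tails(SAMPLE))  # prints ['/tB/Core/Const', 'i']
```

Only the last two links are reported: the root link is skipped by the tail check and the `<pre>` sample by the code-block range check.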
