Add checks for broken docs urls by carlosabadia · Pull Request #6448 · reflex-dev/reflex

carlosabadia · 2026-05-04T11:36:35Z

No description provided.

codspeed-hq · 2026-05-04T11:40:08Z

Merging this PR will not alter performance

✅ 24 untouched benchmarks
⏩ 2 skipped benchmarks¹

Comparing carlos/docs-links-ci (c7ed339) with main (3702d23)

¹ 2 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

greptile-apps · 2026-05-04T11:46:01Z

Greptile Summary

This PR adds a new GitHub Actions workflow and Python script that validate /docs/* Markdown links against the Reflex app's generated sitemap.xml, catching broken URLs and underscore-in-path violations before they reach production. The implementation is well-structured, correctly strips fragments/query strings before the underscore check, and ships good test coverage — including the fragment-underscore false-positive regression case from prior review.
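
The fragment/query handling described above can be illustrated with a minimal sketch. This is not the code from the PR; the helper names `path_only` and `has_underscore` are hypothetical, and the sketch only assumes that the underscore rule applies to the URL path and not to anchors or query parameters.

```python
from urllib.parse import urlsplit


def path_only(raw: str) -> str:
    """Return just the path of a link, dropping any ?query and #fragment."""
    return urlsplit(raw).path


def has_underscore(raw: str) -> bool:
    """Flag underscores in the path only (hypothetical helper, not the PR's code)."""
    return "_" in path_only(raw)


# An underscore inside the anchor is ignored; one inside the path is flagged.
print(has_underscore("/docs/getting-started#my_anchor"))
print(has_underscore("/docs/getting_started/intro"))
```

Checking the path after stripping is what avoids the fragment-underscore false positive mentioned in the summary.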

The LINK_RE regex only handles double-quoted Markdown link titles ("..."), not the single-quoted ('...') or parenthesised ((...)) forms. Links like [text](/docs/foo 'My Title') would have the title text absorbed into raw, causing every such link to report a spurious "not found in sitemap" error.
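
For illustration, one way a pattern could accept all three title forms is sketched below. The pattern and the `extract_doc_links` helper are assumptions made for this example, not the `LINK_RE` from the PR, and the sketch does not handle destinations that themselves contain parentheses.

```python
import re

# Hypothetical pattern (not the PR's LINK_RE): captures the /docs destination
# and tolerates an optional "title", 'title', or (title) after it.
LINK_RE = re.compile(
    r"\[[^\]]*\]"                                  # [link text]
    r"\(\s*"                                       # opening parenthesis
    r"(?P<url>/docs[^\s)]*)"                       # destination starting with /docs
    r"(?:\s+(?:\"[^\"]*\"|'[^']*'|\([^)]*\)))?"    # optional title in any of 3 forms
    r"\s*\)"                                       # closing parenthesis
)


def extract_doc_links(line: str) -> list[str]:
    """Return every /docs destination found in one line of Markdown."""
    return [m.group("url") for m in LINK_RE.finditer(line)]


print(extract_doc_links("[text](/docs/foo 'My Title')"))
```

Because the title is matched outside the `url` group, it never ends up in the captured destination regardless of which quoting style is used.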

Confidence Score: 4/5

Safe to merge after addressing the single-quoted title regex gap; otherwise the tool works correctly.

One P1 logic issue: single-quoted Markdown link titles are not stripped from the captured URL, causing false-positive "not found in sitemap" errors. All other logic (fragment/query stripping for the underscore check, sitemap prefix normalization, skip-dirs) is correct and well-tested.

docs/app/scripts/check_doc_links.py — specifically the LINK_RE constant on line 25.

Important Files Changed

| Filename | Overview |
| --- | --- |
| .github/workflows/check_doc_links.yml | New CI workflow that builds the Reflex frontend to generate sitemap.xml, then runs the link-checker script; triggers on docs/*/.md, the script, and this file itself. |
| docs/app/scripts/check_doc_links.py | New script scanning .md files for /docs/* links and validating them against sitemap.xml; correctly strips fragment/query before underscore check, handles both /docs-prefixed and non-prefixed sitemaps. |
| docs/app/tests/test_doc_links.py | Comprehensive unit tests covering valid links, missing links, underscore detection, fragment handling, skip-dirs, and both sitemap prefix styles; includes the fragment-underscore false-positive regression test. |
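
The "both sitemap prefix styles" behaviour can be sketched roughly as follows. This is an assumption about how such normalization might work, not the script's actual code: `parse_sitemap_paths` and `normalize_path` are hypothetical names, and the choice to strip a leading `/docs` (rather than add one) is made only for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Standard sitemap protocol namespace.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def normalize_path(path: str) -> str:
    """Drop a trailing slash and an optional leading /docs so both styles compare equal."""
    path = path.rstrip("/")
    if path == "/docs" or path.startswith("/docs/"):
        path = path[len("/docs"):]
    return path or "/"


def parse_sitemap_paths(xml_text: str) -> set[str]:
    """Collect the normalized path of every <loc> entry in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        normalize_path(urlsplit(loc.text.strip()).path)
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }
```

With this approach a link would be normalized the same way before lookup, so `/docs/library/forms` matches whether the sitemap lists it with or without the prefix.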

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[GitHub Actions Trigger\npull_request / push to main\nwith docs path filter] --> B[Checkout & Setup Build Env\npython 3.14 + uv sync]
    B --> C[uv run reflex export\n--frontend-only --no-zip\nGenerates .web/public/sitemap.xml]
    C --> D[uv run python\nscripts/check_doc_links.py]
    D --> E[load_sitemap_paths\nParse sitemap.xml → set of normalized paths]
    D --> F[iter_md_files\nrglob *.md, skip SKIP_DIRS]
    F --> G[iter_md_links\nMatch LINK_RE on each line]
    G --> H{For each raw URL}
    H --> I{Underscore in path_only?}
    I -- Yes --> J[Append underscore error]
    I -- No --> K{sitemap_key in valid_paths?}
    J --> K
    K -- No --> L[Append not-found error]
    K -- Yes --> M[OK]
    L --> N{Any errors?}
    J --> N
    M --> N
    N -- Yes --> O[Print errors to stderr\nExit 1 → CI fails]
    N -- No --> P[Print success\nExit 0]
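
The per-link decision in the flowchart can be sketched as a small function. This is a simplified illustration under stated assumptions, not the script itself: `check_links` is a hypothetical name, links arrive as `(source_file, raw_url)` pairs, and the sitemap key is taken to be the path with any trailing slash removed.

```python
from urllib.parse import urlsplit


def check_links(links, valid_paths):
    """Return error strings for links that break the underscore or sitemap rule."""
    errors = []
    for source, raw in links:
        # Strip ?query and #fragment before applying either rule.
        path_only = urlsplit(raw).path.rstrip("/") or "/"
        if "_" in path_only:
            errors.append(f"{source}: {raw}: underscore in path")
        # As in the flowchart, the sitemap lookup runs even after an underscore error.
        if path_only not in valid_paths:
            errors.append(f"{source}: {raw}: not found in sitemap")
    return errors


valid = {"/docs/a", "/docs/b-c"}
found = [("x.md", "/docs/a#frag_x"), ("x.md", "/docs/b_c"), ("y.md", "/docs/missing")]
for err in check_links(found, valid):
    print(err)
```

A non-empty result would correspond to the "Exit 1" branch, an empty one to "Exit 0".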

Reviews (2): Last reviewed commit: "updates"


masenf · 2026-05-04T17:45:33Z

@greptile-apps re-review

masenf

can the github actions workflow be an add-on step for the existing reflex-docs regression? most of the time taken in this workflow is actually building the app, but we already do that in the other workflow, so we basically get the link checking for free.

i also think the script output could be a little more verbose so you can see all of the links that got checked in the CI instead of "All /docs links resolve against sitemap.xml." and having to trust that it actually checked and didn't just scan the wrong dir and find no markdown files.
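
A rough sketch of what more verbose output could look like is given below. It is not what the PR ended up shipping; `format_report` is a hypothetical helper, and treating "zero Markdown files scanned" as a failure is an assumption added here to address the wrong-directory concern.

```python
def format_report(files_scanned, checked, errors):
    """Build CI log text and an exit code from the link-check results."""
    lines = [
        f"Scanned {files_scanned} Markdown file(s); checked {len(checked)} /docs link(s)."
    ]
    # List every link so the CI log shows what was actually verified.
    for source, link in checked:
        lines.append(f"  checked {source}: {link}")
    if files_scanned == 0:
        # Assumption: an empty scan most likely means the wrong directory.
        lines.append("ERROR: no Markdown files found; is the docs directory correct?")
        return "\n".join(lines), 1
    if errors:
        lines.extend(f"ERROR: {e}" for e in errors)
        return "\n".join(lines), 1
    lines.append("All /docs links resolve against sitemap.xml.")
    return "\n".join(lines), 0


text, code = format_report(2, [("a.md", "/docs/a")], [])
print(text)
```

Returning the text and exit code instead of printing directly keeps the reporting logic easy to unit test.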

finally, love the tests for the test script, very nice.

adhami3310 · 2026-05-05T17:07:23Z

you can parse the markdown files instead of doing regexes btw, reflex-docgen has a transformer that can be used for this

Add checks for broken docs urls

2187840

carlosabadia requested review from a team and Alek99 as code owners May 4, 2026 11:36

carlosabadia mentioned this pull request May 4, 2026

ENG-9414: Remove hardcoded docs urls #6395

Closed

carlosabadia added the documentation Improvements or additions to documentation label May 4, 2026

greptile-apps Bot reviewed May 4, 2026

Comment thread docs/app/scripts/check_doc_links.py Outdated

Comment thread docs/app/tests/test_doc_links.py

updates

873b592

greptile-apps Bot reviewed May 4, 2026

Comment thread docs/app/scripts/check_doc_links.py Outdated

masenf reviewed May 5, 2026

combine ci and be more verbose

5b405dd

move from regex

c7ed339

adhami3310 approved these changes May 5, 2026

Alek99 merged commit 0487d9b into main May 5, 2026
69 checks passed

Alek99 deleted the carlos/docs-links-ci branch May 5, 2026 18:27

Alek99 merged 4 commits into main from carlos/docs-links-ci

4 participants
