
Enforce single top-level markdown heading per docs notebook #866

Open
drbenvincent wants to merge 3 commits into main from issue-863-single-h1-check


@drbenvincent
Collaborator

Enforce single top-level markdown heading per docs notebook

Fixes #863

Summary

Sphinx promotes every top-level (#) markdown heading inside a notebook
into its own entry under the toctree in
docs/source/notebooks/index.md. A notebook with N top-level headings
therefore appears N times on the rendered notebooks index page,
polluting the navigation. The convention has historically been "exactly
one # heading per docs notebook", but nothing enforced it.

This PR extends the existing validate-notebooks pre-commit hook
(scripts/validate_notebooks.py) with a single-H1 check scoped to
notebooks under docs/source/notebooks/.
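
To illustrate the scoping, here is a minimal sketch of how a path check like this could look. It is not the code from this PR: the function name, the constant, and the choice to match the directory anywhere in the path (so a temporary checkout also counts) are assumptions made for the example.

```python
from pathlib import PurePosixPath

# Directory named in the PR description; the constant itself is hypothetical.
DOCS_NOTEBOOK_DIR = ("docs", "source", "notebooks")


def is_docs_notebook(path: str) -> bool:
    """Return True if `path` is an .ipynb file somewhere under docs/source/notebooks/."""
    # Normalise Windows separators so the check behaves the same on every platform.
    parts = PurePosixPath(path.replace("\\", "/")).parts
    if not parts or not parts[-1].endswith(".ipynb"):
        return False
    dirs = parts[:-1]
    width = len(DOCS_NOTEBOOK_DIR)
    # Look for the three directory components as a contiguous run in the path.
    return any(
        dirs[start : start + width] == DOCS_NOTEBOOK_DIR
        for start in range(len(dirs) - width + 1)
    )
```

Comparing whole path components rather than substrings keeps a directory such as my_docs/source/notebooks_old/ from matching by accident.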

Changes

  • scripts/validate_notebooks.py

    • Add _count_h1_headings(notebook) helper that walks markdown
      cells only, tracks fenced code blocks (``` and ~~~), and
      collects top-level # headings. Code cells and #-prefixed
      Python comments inside fenced blocks are skipped so they cannot
      spuriously trigger the rule (this matters for e.g.
      its_lift_test.ipynb, which today has a python fenced block
      inside a markdown cell containing # Calculate ... comments).
    • Add _is_docs_notebook(path) so the rule only applies to notebooks
      under docs/source/notebooks/. Scratch / dev notebooks elsewhere
      in the repo are left alone, and the hook entry stays generic.
    • Add _format_h1_violation(...) to emit a clear, actionable error
      that names each offending cell index and heading text.
    • Wire the new check into validate_notebook(...) after the existing
      nbformat schema validation.
  • causalpy/tests/test_notebook_validation.py

    • Test exactly-one-H1 passes.
    • Parametrized failure tests for zero, two-in-one-cell, and
      three-across-cells H1 cases — each asserts the failure message and
      H1 count.
    • Test that #-prefixed comments inside code cells do not count as
      H1s.
    • Test that lines inside ``` fenced blocks within markdown cells
      do not count as H1s (replicates the its_lift_test.ipynb pattern).
    • Test that ~~~ fenced blocks behave the same.
    • Test that the H1 rule is not applied to notebooks outside
      docs/source/notebooks/.
    • Regression test: run the validator over every checked-in notebook
      in docs/source/notebooks/ and assert it passes.
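
For readers who want to see the fence-tracking idea concretely, the sketch below shows one way to collect top-level headings from a notebook dict. It is an illustration under stated assumptions, not the helper added in this PR: the function name, the regular expressions, and the simplified fence handling (a fence closes on the next fence line that uses the same character, ignoring fence length and info strings) are all hypothetical.

```python
import re

# Hypothetical patterns: a fence is three or more backticks or tildes,
# an H1 is a single '#' followed by whitespace and some text.
_FENCE_RE = re.compile(r"^\s{0,3}(`{3,}|~{3,})")
_H1_RE = re.compile(r"^#\s+(\S.*)$")


def find_h1_headings(notebook: dict) -> list[tuple[int, str]]:
    """Return (cell_index, heading_text) for each top-level markdown heading."""
    found = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "markdown":
            continue  # code and raw cells never contribute headings
        source = cell.get("source", "")
        if isinstance(source, list):  # notebook JSON may store source as a list of lines
            source = "".join(source)
        open_fence = None  # fence character of the block we are inside, if any
        for line in source.splitlines():
            fence = _FENCE_RE.match(line)
            if fence:
                marker = fence.group(1)[0]
                if open_fence is None:
                    open_fence = marker  # entering a fenced block
                elif marker == open_fence:
                    open_fence = None  # leaving the fenced block
                continue
            if open_fence is None:
                heading = _H1_RE.match(line)
                if heading:
                    found.append((index, heading.group(1).strip()))
    return found
```

A notebook passes the rule when the returned list has exactly one entry. Returning the cell index alongside the text is what lets an error message point at each offending cell.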

Audit

Scanned the 31 .ipynb files currently under docs/source/notebooks/
on main: 0 violations. The known offender
(its_place_in_time_analysis.ipynb, 5 H1s) lives on PR #826 and is not
yet in main; that PR is responsible for fixing its own notebook.
Enabling the check now is therefore safe and prevents future
regressions.

Testing

  • prek run --all-files — all hooks pass, including the now-stricter
    Validate notebook schema hook running over every notebook in the
    repo.

  • python -m pytest causalpy/tests/test_notebook_validation.py -v
    runs 11 tests, all of which pass.

  • Manual smoke test: ran the validator against a synthetic
    docs/source/notebooks/bad.ipynb with two # headings; the script
    exits 1 and prints:

    /tmp/.../bad.ipynb: expected exactly one top-level (#) markdown heading, found 2.
      cell 0: First
      cell 1: Second
      Each docs notebook must have exactly one top-level (#) heading. Demote the others to ## (or below) so the docs sidebar stays clean.
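
A message in this shape could be assembled roughly as follows. This is a sketch only: the function name and signature are hypothetical, and it reproduces the wording shown above without special handling for the zero-heading case, where the real validator may word the advice differently.

```python
def format_h1_violation(path: str, headings: list[tuple[int, str]]) -> str:
    """Build a multi-line error from (cell_index, heading_text) pairs."""
    lines = [
        f"{path}: expected exactly one top-level (#) markdown heading, "
        f"found {len(headings)}."
    ]
    # One indented line per offending heading, naming the cell it lives in.
    lines += [f"  cell {index}: {text}" for index, text in headings]
    lines.append(
        "  Each docs notebook must have exactly one top-level (#) heading. "
        "Demote the others to ## (or below) so the docs sidebar stays clean."
    )
    return "\n".join(lines)
```

Calling it with the path bad.ipynb and the pairs (0, "First") and (1, "Second") yields the same four lines as the smoke-test output, apart from the path prefix.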
    

Checklist

  • scripts/validate_notebooks.py enforces exactly one top-level
    markdown heading per docs notebook.
  • Code cells are skipped entirely.
  • Lines inside ``` and ~~~ fenced blocks within markdown
    cells are skipped.
  • Failure message names each offending cell index and heading text.
  • Tests cover positive and negative cases (including the fenced
    block edge case).
  • prek run --all-files passes.
  • Out of scope: other heading-level rules (skipped levels, max
    depth) and broader markdown linting — left for a follow-up.

Closes #863.

Sphinx promotes every top-level (#) markdown heading inside a notebook
into its own toctree entry, so a notebook with N H1s appears N times on
the rendered notebooks index page. The convention has always been
exactly one H1 per docs notebook, but nothing enforced it.

Extends the existing validate-notebooks pre-commit hook
(scripts/validate_notebooks.py) with a single-H1 check scoped to
notebooks under docs/source/notebooks/:

- _count_h1_headings walks markdown cells only and tracks fenced code
  blocks (``` and ~~~), so Python comments inside fenced python blocks
  and inside code cells cannot spuriously trigger the rule (covers the
  existing its_lift_test.ipynb pattern).
- _is_docs_notebook scopes the check to docs/source/notebooks/, leaving
  scratch/dev notebooks elsewhere alone.
- _format_h1_violation produces an actionable error naming each
  offending cell index and heading text.

Tests cover the valid case, zero/two/three H1 violations, code-cell
comments, fenced ``` and ~~~ blocks in markdown cells, the
out-of-scope exemption, and a regression test that runs the validator
across every shipped docs notebook.

Made-with: Cursor
@drbenvincent drbenvincent added the documentation and devops labels Apr 24, 2026
@codecov

codecov Bot commented Apr 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.61%. Comparing base (deb8774) to head (5271c0b).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #866   +/-   ##
=======================================
  Coverage   94.60%   94.61%           
=======================================
  Files          80       80           
  Lines       12764    12821   +57     
  Branches      770      770           
=======================================
+ Hits        12076    12131   +55     
- Misses        485      486    +1     
- Partials      203      204    +1     


@read-the-docs-community

read-the-docs-community Bot commented Apr 24, 2026

Documentation build overview

📚 causalpy | 🛠️ Build #32412261 | 📁 Comparing 5271c0b against latest (deb8774)

  🔍 Preview build  

1 file changed
± 404.html

The two ``pytest.skip(...)`` guards in ``test_all_docs_notebooks_pass_h1_check``
only fire when ``docs/source/notebooks`` is missing or empty. Under CI both
conditions are always false, so codecov flagged them as uncovered lines /
partial branches and patch coverage dropped below the 94.6% target.

These are intentional defensive guards for partial checkouts, not behaviour
worth testing — exclude them via ``# pragma: no cover`` so patch coverage
reflects real testable code.

Made-with: Cursor
drbenvincent added a commit that referenced this pull request Apr 24, 2026
The validate-notebooks pre-commit hook (#866) deterministically
enforces exactly one top-level H1 per docs notebook, so the pr-review
skill no longer needs to flag it. Remove the dedicated section from
docs-patterns.md and trim the now-stale examples from SKILL.md and
the docs-patterns intro. Keep the how-to-extend.md bullet as the
worked example of the "formalise into a hook → prune from the skill"
maintenance rule.

Made-with: Cursor
