
Incremental rebuild cache does not re-process changed files on mkdocs serve #215

Description

@cschanhniem

Problem

When mkdocs serve is running and a file is modified, the plugin performs an incremental (dirty) build. The current behavior in on_files() is:

if self.is_serve_dirty_build:
    logging.debug("[git-revision-date-localized] Skipping parallel processing on incremental rebuild, using cache")
    return

This means that on a dirty rebuild the plugin skips re-computing commit timestamps entirely and relies on the cache populated during the initial build. If a file changes after the initial build, its cached last-revision timestamp may be stale: it reflects the commit data from the initial build, not any newer commit.

Actual behavior observed

  1. mkdocs serve starts, initial build populates self.last_revision_commits
  2. User edits docs/index.md and saves
  3. Dirty rebuild runs, on_files() returns early due to is_serve_dirty_build
  4. on_page_markdown() reads stale cached timestamp for the changed file
  5. The page shows the old revision date even though the user made changes
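The sequence above can be reproduced with a minimal stand-in. This is only a sketch: `FakePlugin` and the `git_state` dict are hypothetical and merely mimic the names mentioned in this issue (`on_files`, `on_page_markdown`, `last_revision_commits`, `is_serve_dirty_build`); the real plugin methods take different arguments.

```python
class FakePlugin:
    """Hypothetical stand-in for the plugin, not the real class."""

    def __init__(self, git_timestamps):
        self.git = git_timestamps          # simulated git state: path -> timestamp
        self.last_revision_commits = {}    # cache named in this issue
        self.is_serve_dirty_build = False

    def on_files(self, paths):
        if self.is_serve_dirty_build:
            return  # early return: the cache is left untouched
        for path in paths:
            self.last_revision_commits[path] = self.git[path]

    def on_page_markdown(self, path):
        # Always reads from the cache, even on a dirty rebuild.
        return self.last_revision_commits[path]


git_state = {"docs/index.md": 1000}
plugin = FakePlugin(git_state)
plugin.on_files(["docs/index.md"])                 # step 1: initial build fills the cache
git_state["docs/index.md"] = 2000                  # step 2: the file gets a newer revision
plugin.is_serve_dirty_build = True
plugin.on_files(["docs/index.md"])                 # step 3: dirty rebuild returns early
stale = plugin.on_page_markdown("docs/index.md")   # step 4: cached value is served
```

After this runs, `stale` still holds the timestamp from the initial build while `git_state` holds the newer one, which is the mismatch described in step 5.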

Suggested fix

Option A: Invalidate the cache entry for the specific file(s) that changed on dirty rebuilds. The on_startup() callback receives dirty=True but not which files changed. However, during a dirty rebuild only modified pages are re-processed, so on_page_markdown() could detect the mismatch and fall through to a direct git log call.
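One way Option A's mismatch detection might look is to compare the file's modification time against the time recorded when the entry was cached. This is a rough sketch under stated assumptions: the `(commit_timestamp, mtime_when_cached)` cache layout and the helper `get_cached_or_invalidate` are hypothetical and not part of the plugin.

```python
import os
import tempfile


def get_cached_or_invalidate(path, cache):
    """Return the cached timestamp, or None if the file changed since caching.

    `cache` maps path -> (commit_timestamp, mtime_when_cached). This layout is
    an assumption for illustration, not the plugin's real cache structure.
    """
    entry = cache.get(path)
    if entry is None:
        return None
    commit_timestamp, cached_mtime = entry
    if os.path.getmtime(path) > cached_mtime:
        del cache[path]  # stale: drop the entry so the caller falls through
        return None
    return commit_timestamp


# Usage: a temporary file stands in for docs/index.md.
with tempfile.NamedTemporaryFile(suffix=".md", delete=False) as handle:
    demo_path = handle.name

os.utime(demo_path, (100, 100))              # pretend the file was last saved at t=100
demo_cache = {demo_path: (1234, 100.0)}      # entry cached at the same moment

before_edit = get_cached_or_invalidate(demo_path, demo_cache)  # cache still valid
os.utime(demo_path, (200, 200))              # simulate the user saving the file
after_edit = get_cached_or_invalidate(demo_path, demo_cache)   # entry invalidated
entry_dropped = demo_path not in demo_cache
os.remove(demo_path)
```

A `None` result would be the signal for the caller to fall through to the direct git lookup.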

Option B: Add a config option (e.g. revalidate_on_dirty: true) that, when enabled, still runs the parallel processing on dirty builds (perhaps with a smaller worker count or debounce) rather than short-circuiting entirely.

Option C: On dirty rebuilds, do not use the cached result in on_page_markdown(); instead, fall through to self.util.get_git_commit_timestamp() directly. Since only a small number of pages are rebuilt on a dirty build, the performance hit is negligible.
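A minimal sketch of the Option C logic, written as a standalone function so it runs without the plugin. `resolve_timestamp` and `fake_lookup` are hypothetical names; `lookup` stands in for self.util.get_git_commit_timestamp(), whose real signature is not assumed here.

```python
def resolve_timestamp(path, cache, is_dirty_build, lookup):
    """Return the revision timestamp for `path`.

    On a dirty rebuild the cache is bypassed and `lookup` is called directly;
    otherwise a cached value is used when one exists.
    """
    if not is_dirty_build and path in cache:
        return cache[path]
    timestamp = lookup(path)
    cache[path] = timestamp  # refresh the entry so later reads stay consistent
    return timestamp


# Usage with a fake lookup that records how often it is called.
calls = []


def fake_lookup(path):
    calls.append(path)
    return 2000


cache = {"docs/index.md": 1000}
clean = resolve_timestamp("docs/index.md", cache, False, fake_lookup)  # served from cache
dirty = resolve_timestamp("docs/index.md", cache, True, fake_lookup)   # cache bypassed
```

Refreshing the cache entry after the direct lookup is a design choice of this sketch, not something the issue requires.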

I think Option C is the simplest and most correct. The cache skip in on_files() is a performance optimization, but it trades correctness for speed. On a dirty build only a handful of pages are rebuilt, so the direct call is cheap.

Happy to submit a PR if this approach sounds right.
