Merged
45 changes: 38 additions & 7 deletions docs/source/components/configuration.rst
Original file line number Diff line number Diff line change
@@ -178,15 +178,17 @@ Configures how **Sphinx-CodeLinks** discovers and processes source files within
 exclude = []
 include = []
 gitignore = true
+follow_links = false
 comment_type = "cpp"

 **Configuration fields:**

 - ``src_dir`` - Root directory for source file discovery (relative to Sphinx project root or the directory where the TOML config file is located if given)
 - ``exclude`` - List of glob patterns to exclude from processing
 - ``include`` - List of glob patterns to include (if empty, includes all files)
-- ``gitignore`` - Whether to respect ``.gitignore`` rules when discovering files (Nested .gitignore is NOT supported yet)
-- ``comment_type`` - Comment style for the programming language ("cpp" and "python" are currently supported)
+- ``gitignore`` - Whether to respect ``.gitignore``, ``.ignore``, and related ignore files when discovering files
+- ``follow_links`` - Whether to follow symbolic links during file discovery
+- ``comment_type`` - Comment style for the programming language

 .. _`source_dir`:

@@ -251,7 +253,9 @@ Defines a list of glob patterns for files to explicitly include in discovery. Wh
 "include/**/*.hpp"
 ]

-**Priority:** The ``include`` option has the highest priority and overrides both ``exclude`` and ``gitignore`` settings.
+**Priority:** When ``include`` patterns are specified, only files matching those patterns
+are considered (this overrides ``gitignore`` exclusions for matched files).
+``exclude`` patterns are then applied to remove files from that set.
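
The precedence described in the new **Priority** paragraph can be modelled with the standard library alone. This is a simplified sketch, not the project's implementation: ``select_files`` and its ``ignored`` argument are hypothetical names, and ``fnmatch`` only approximates gitignore-style globs (its ``*`` also matches ``/``).

```python
from fnmatch import fnmatch


def select_files(paths, include, exclude, ignored):
    """Model the documented precedence for a list of relative paths.

    `ignored` stands in for the set of paths that ignore files would drop.
    """
    selected = []
    for path in paths:
        if include:
            # With include patterns, only matching files are considered,
            # even if an ignore file would otherwise drop them.
            if not any(fnmatch(path, pat) for pat in include):
                continue
        elif path in ignored:
            # Without include patterns, ignore rules apply as usual.
            continue
        # Exclude patterns are applied last and always remove a file.
        if any(fnmatch(path, pat) for pat in exclude):
            continue
        selected.append(path)
    return selected


paths = ["src/a.cpp", "src/gen/b.cpp", "src/legacy/c.cpp", "docs/d.cpp"]
result = select_files(
    paths,
    include=["src/*"],
    exclude=["src/legacy/*"],
    ignored={"src/gen/b.cpp"},
)
print(result)
```

Under these assumptions the ignored file ``src/gen/b.cpp`` is kept because it matches an include pattern, while ``src/legacy/c.cpp`` is removed by the exclude pattern.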

**Common inclusion patterns:**

@@ -317,7 +321,9 @@ Specifies the comment syntax style used in the source code files. This determine
 gitignore
 ^^^^^^^^^

-Controls whether to respect ``.gitignore`` files when discovering source files. When enabled, files and directories listed in ``.gitignore`` will be automatically excluded from processing.
+Controls whether to respect ignore files when discovering source files.
+When enabled, files and directories matched by ignore rules will be automatically
+excluded from processing.

**Type:** ``bool``
**Default:** ``true``
@@ -329,10 +335,34 @@ Controls whether to respect ``.gitignore`` files when discovering source files.

 **Behavior:**

-- ``true`` - Respect ``.gitignore`` rules (recommended)
-- ``false`` - Ignore ``.gitignore`` files and process all matching files
+When set to ``true`` (recommended), the following ignore sources are respected:

-.. important:: **Current Limitation:** This option only supports the root-level ``.gitignore`` file. Nested ``.gitignore`` files in subdirectories or parent directories are not currently processed.
+- ``.gitignore`` files (including nested ``.gitignore`` files in subdirectories)
+- ``.ignore`` files (same syntax as ``.gitignore``, useful for non-git projects)
+- ``.git/info/exclude``
+- Global gitignore (e.g. ``~/.config/git/ignore``)
+- Parent directory ignore files
+
+When set to ``false``, all ignore files are disregarded and every matching file is processed.
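
To make the "parent directory" item in the list above more concrete, here is a minimal standard-library sketch that only locates ``.gitignore`` and ``.ignore`` files in a source directory and its parents. It does not parse or apply any rules, and it is not how the walker in this PR works internally; ``candidate_ignore_files`` is a hypothetical helper name.

```python
from pathlib import Path
import tempfile


def candidate_ignore_files(src_dir: Path) -> list[Path]:
    """List existing .gitignore/.ignore files in src_dir and its parents.

    Nested ignore files below src_dir, .git/info/exclude and the global
    gitignore are left out to keep the sketch short.
    """
    found = []
    for directory in [src_dir, *src_dir.parents]:
        for name in (".gitignore", ".ignore"):
            candidate = directory / name
            if candidate.is_file():
                found.append(candidate)
    return found


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    src = root / "project" / "src"
    src.mkdir(parents=True)
    (root / "project" / ".gitignore").write_text("build/\n")
    (src / ".ignore").write_text("*.tmp\n")
    # Only report files inside the temporary tree, so that ignore files
    # that happen to exist higher up on this machine do not show up.
    names = [
        p.relative_to(root).as_posix()
        for p in candidate_ignore_files(src)
        if p.is_relative_to(root)
    ]
    print(names)
```

The closest directory is listed first, followed by each parent in turn.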

+follow_links
+^^^^^^^^^^^^
+
+Controls whether symbolic links are followed during file discovery.
+When disabled, symbolic links to directories are not traversed.
+
+**Type:** ``bool``
+**Default:** ``false``
+
+.. code-block:: toml
+
+[codelinks.projects.my_project.source_discover]
+follow_links = true
+
+**Behavior:**
+
+- ``false`` - Symbolic links to directories are skipped (default, safer)
+- ``true`` - Symbolic links are followed, discovering files inside linked directories
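
The difference between the two settings can be illustrated with ``os.walk``, whose ``followlinks`` flag behaves analogously. This is only a stand-in for the walker used in this PR, and ``list_files`` is a hypothetical helper. Creating symbolic links may require extra privileges on Windows.

```python
import os
import tempfile
from pathlib import Path


def list_files(root: Path, follow_links: bool) -> list[str]:
    """Return files under root, optionally descending into symlinked dirs."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=follow_links):
        for name in filenames:
            found.append(Path(dirpath, name).relative_to(root).as_posix())
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    root = base / "src"
    external = base / "external"
    root.mkdir()
    external.mkdir()
    (root / "main.cpp").write_text("// main\n")
    (external / "lib.cpp").write_text("// lib\n")
    # A directory symlink inside the source tree pointing outside of it.
    (root / "vendor").symlink_to(external, target_is_directory=True)

    skipped = list_files(root, follow_links=False)
    followed = list_files(root, follow_links=True)
    print(skipped)
    print(followed)
```

With ``follow_links=False`` only ``main.cpp`` is found. With ``follow_links=True`` the file behind the ``vendor`` link is discovered as well.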

For more information about the usage examples, see :ref:`source discover <discover>`.

@@ -355,6 +385,7 @@ Configures how **Sphinx-CodeLinks** analyse source files to extract markers from
 exclude = []
 include = []
 gitignore = true
+follow_links = false
 comment_type = "cpp"

 [codelinks.projects.my_project.analyse]
1 change: 1 addition & 0 deletions docs/source/components/discover.rst
@@ -26,6 +26,7 @@ Usage Examples
 include = []
 exclude = ["src/legacy/**", "**/*_test.cpp"]
 gitignore = true
+follow_links = false
 comment_type = "cpp"

 **Python Project:**
2 changes: 1 addition & 1 deletion docs/source/development/roadmap.rst
@@ -19,7 +19,7 @@ Source Code Parsing

 - Introduce a configurable option to strip leading characters (e.g., ``*``) from commented RST blocks.
 - Enrich tagged scopes with additional metadata.
-- Enhance ``.gitignore`` handling to support nested ``.gitignore`` files.
+- ✅ Nested ``.gitignore`` files are now supported (implemented via ``ignore-python``).

 Defining Needs in Source Code
 -----------------------------
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -12,7 +12,7 @@ readme = "README.md"
 requires-python = ">= 3.12"
 dependencies = [
 "comment-parser>=1.2.4",
-"gitignore-parser>=0.1.11",
+"ignore-python>=0.3.3",
 "typer>=0.16.0",
 "click < 8.2", # click 8.2.* produces empty errors if no args are given
 "jsonschema",
9 changes: 7 additions & 2 deletions src/sphinx_codelinks/cmd.py
@@ -170,7 +170,7 @@ def analyse( # noqa: PLR0912 # for CLI, so it needs the branches


 @app.command(no_args_is_help=True)
-def discover(
+def discover( # noqa: PLR0913 # CLI command requires multiple parameters
     src_dir: Annotated[
         Path,
         typer.Argument(
@@ -203,9 +203,13 @@ def discover(
     gitignore: Annotated[
         bool,
         typer.Option(
-            help="Respect .gitignore in the given directory. Nested .gitignore Not supported"
+            help="Respect .gitignore files in the given directory and its parents"
         ),
     ] = True,
+    follow_links: Annotated[
+        bool,
+        typer.Option(help="Follow symbolic links during file discovery"),
+    ] = False,
     comment_type: Annotated[
         CommentType,
         typer.Option(
@@ -222,6 +226,7 @@ def discover(
         "exclude": exclude,
         "include": include,
         "gitignore": gitignore,
+        "follow_links": follow_links,
         "comment_type": comment_type,
     }

5 changes: 5 additions & 0 deletions src/sphinx_codelinks/source_discover/config.py
@@ -30,6 +30,7 @@ class SourceDiscoverSectionConfigType(TypedDict, total=False):
     exclude: list[str]
     include: list[str]
     gitignore: bool
+    follow_links: bool
     comment_type: CommentType


@@ -40,6 +41,7 @@ class SourceDiscoverConfigType(TypedDict, total=False):
     exclude: list[str]
     include: list[str]
     gitignore: bool
+    follow_links: bool
     comment_type: CommentType


@@ -69,6 +71,9 @@ def field_names(cls) -> set[str]:
     gitignore: bool = field(default=True, metadata={"schema": {"type": "boolean"}})
     """Whether to respect .gitignore to exclude files."""

+    follow_links: bool = field(default=False, metadata={"schema": {"type": "boolean"}})
+    """Whether to follow symbolic links during file discovery."""
+
     comment_type: str = field(
         default="cpp",
         metadata={
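
The new ``follow_links`` field follows the same pattern as ``gitignore``: a dataclass field whose ``metadata`` carries a JSON-schema fragment. How the project consumes that metadata is not shown in this diff, so the following is only a sketch under that assumption. ``DiscoverOptions`` and ``build_schema`` are hypothetical names.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DiscoverOptions:
    """Hypothetical, reduced stand-in for the real config class."""

    gitignore: bool = field(default=True, metadata={"schema": {"type": "boolean"}})
    follow_links: bool = field(default=False, metadata={"schema": {"type": "boolean"}})


def build_schema(cls) -> dict:
    """Collect the per-field "schema" metadata into one JSON-schema-like dict."""
    return {
        "type": "object",
        "properties": {f.name: dict(f.metadata["schema"]) for f in fields(cls)},
        "additionalProperties": False,
    }


schema = build_schema(DiscoverOptions)
defaults = DiscoverOptions()
print(schema["properties"]["follow_links"])
print(defaults.follow_links)
```

Keeping the schema fragment next to the field means a new option such as ``follow_links`` gets its default and its validation rule in one place.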
96 changes: 57 additions & 39 deletions src/sphinx_codelinks/source_discover/source_discover.py
@@ -1,11 +1,8 @@
-from collections.abc import Callable
-import fnmatch
 import os
 from pathlib import Path

-from gitignore_parser import ( # type: ignore[import-untyped] # library has no stub
-    parse_gitignore,
-)
+from ignore import WalkBuilder
+from ignore.overrides import OverrideBuilder

 from sphinx_codelinks.source_discover.config import (
     COMMENT_FILETYPE,
@@ -17,51 +14,72 @@
 class SourceDiscover:
     def __init__(self, src_discover_config: SourceDiscoverConfig):
         self.src_discover_config = src_discover_config
-        # Only gitignore at source root is considered.
-        # TODO: Support nested gitignore files
-        gitignore_path = self.src_discover_config.src_dir / ".gitignore"
-        self.gitignore_matcher: Callable[[str], bool] | None = (
-            parse_gitignore(gitignore_path)
-            if self.src_discover_config.gitignore and gitignore_path.exists()
-            else None
-        )
         # normalize the file types to lower case with leading dot
         self.file_types = {
             f".{ext}" for ext in COMMENT_FILETYPE[src_discover_config.comment_type]
         }

         self.source_paths = self._discover()

+    def _build_overrides(self) -> OverrideBuilder | None:
+        """Build an OverrideBuilder for include/exclude patterns.
+
+        Include patterns are added as whitelist globs.
+        Exclude patterns are added as negated globs (prefixed with ``!``).
+        """
+        has_include = bool(self.src_discover_config.include)
+        has_exclude = bool(self.src_discover_config.exclude)
+
+        if not has_include and not has_exclude:
+            return None
+
+        ob = OverrideBuilder(self.src_discover_config.src_dir)
+
+        if has_include:
+            for pattern in self.src_discover_config.include:
+                ob.add(pattern)
+
+        if has_exclude:
+            for pattern in self.src_discover_config.exclude:
+                ob.add(f"!{pattern}")
+
+        return ob
+
     def _discover(self) -> list[Path]:
         """Discover source files recursively in the given directory."""
+        src_dir = self.src_discover_config.src_dir
+        if not src_dir.is_dir():
+            return []
+
+        gitignore = self.src_discover_config.gitignore
+
+        builder = WalkBuilder(src_dir)
+        # Replicate the Rust ignore crate's standard_filters(gitignore)
+        # followed by hidden(false), matching ubc_codelinks behaviour.
+        builder.ignore(gitignore)
+        builder.parents(gitignore)
+        builder.git_ignore(gitignore)
+        builder.git_global(gitignore)
+        builder.git_exclude(gitignore)
+        builder.hidden(False)
+        builder.follow_links(self.src_discover_config.follow_links)
+
+        override_builder = self._build_overrides()
+        if override_builder is not None:
+            builder.overrides(override_builder.build())
+
         discovered_files = []
-        for filepath in self.src_discover_config.src_dir.rglob("*"):
-            if filepath.is_file():
-                if self.file_types and filepath.suffix.lower() not in self.file_types:
-                    continue
-                rel_filepath = str(
-                    filepath.relative_to(self.src_discover_config.src_dir)
-                )
-                if self.src_discover_config.include and self._matches_any(
-                    rel_filepath, self.src_discover_config.include
-                ):
-                    # "includes" has the highest priority over "gitignore" and "excludes"
-                    discovered_files.append(filepath)
-                    continue
-                if self.gitignore_matcher and self.gitignore_matcher(
-                    str(filepath.absolute())
-                ):
-                    continue
-                if self.src_discover_config.exclude and self._matches_any(
-                    rel_filepath, self.src_discover_config.exclude
-                ):
-                    continue
-                discovered_files.append(filepath)
+        for entry in builder.build():
+            filepath = entry.path()
+            if not filepath.is_file():
+                continue
+            if self.file_types and filepath.suffix.lower() not in self.file_types:
+                continue
+            # resolve() produces canonical absolute paths; follow_links only
+            # controls whether the walker descends into symlinked directories
+            discovered_files.append(filepath.resolve())

         sorted_filepaths = sorted(
             discovered_files, key=lambda x: os.path.normcase(os.path.normpath(x))
         )
         return sorted_filepaths
-
-    def _matches_any(self, rel_filepath: str, patterns: list[str]) -> bool:
-        """Check if the given file path matches any of the given patterns."""
-        return any(fnmatch.fnmatch(rel_filepath, pattern) for pattern in patterns)
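
The translation done by ``_build_overrides`` can be shown without the ``ignore-python`` dependency. The sketch below only builds the list of glob strings that would be passed to the override builder. ``to_override_globs`` is a hypothetical helper, and the matching itself is left to the library.

```python
def to_override_globs(include: list[str], exclude: list[str]) -> list[str]:
    """Translate include/exclude lists into override-style globs."""
    # Include patterns are kept as-is (whitelist); exclude patterns get a
    # leading "!" so that the override matcher treats them as ignore rules.
    return [*include, *(f"!{pattern}" for pattern in exclude)]


globs = to_override_globs(
    include=["src/**/*.cpp"],
    exclude=["src/legacy/**", "**/*_test.cpp"],
)
print(globs)
```

In the Rust ``ignore`` crate's override semantics, a plain glob whitelists matching paths and a ``!`` glob ignores them, which is the inverse of ``.gitignore`` syntax. That is why the exclude patterns, not the include patterns, receive the ``!`` prefix here.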
Original file line number Diff line number Diff line change
@@ -213,6 +213,7 @@ def get_src_files(
         gitignore=src_discover_config.gitignore,
         include=src_discover_config.include,
         exclude=src_discover_config.exclude,
+        follow_links=src_discover_config.follow_links,
         comment_type=src_discover_config.comment_type,
     )
     source_discover = SourceDiscover(src_discover)