Commit 4e4266f

✨ Render full RST need directives from @rst blocks
## Summary

Enable users to embed complete RST need directives inside source code comments delimited by `@rst ... @endrst` markers. These blocks are parsed and rendered as real Sphinx-Needs nodes during the Sphinx build, complementing the existing one-line marker support.

## Motivation

One-line markers (`@req{ID}`) are convenient for simple needs but cannot express options, content bodies, or arbitrary directive fields. By allowing full directive syntax inside `@rst` blocks, users can write rich need items directly in source comments (including `:links:`, `:status:`, content text, and any other field that `NeedDirective` accepts) while still getting automatic source-tracing URLs.

## Changes

### `analyse/utils.py`

- **`ParsedDirective` TypedDict**: structured return type capturing a directive's name, argument, options, content, line offsets, and whether extra content exists outside the directive body.
- **`parse_single_directive()`**: regex-based parser that extracts the first directive from an RST text block. Returns `None` when the first non-blank line is not a directive.

### `sphinx_extension/directives/src_trace.py`

- **`generate_str_link_name()`** widened to accept `Metadata` (the base class) instead of `OneLineNeed`, so it works for both one-line needs and marked RST blocks.
- **`render_marked_rst_needs()`**: new method on `SourceTracingDirective` that iterates `src_analyse.marked_rst`, parses each block, injects local/remote URL options, constructs a `NeedDirective` instance, and calls `.run()` to produce docutils nodes.
  - Called from `run()` after `render_needs()`.

### `tests/test_analyse_utils.py`

- Parametrised tests for `parse_single_directive` covering: minimal directives, options, content bodies, multi-line content, leading/trailing blanks, extra content detection, no-argument directives, namespaced directive names, and `None`-return cases.
### `tests/test_src_trace.py` + `tests/doc_test/`

- **`rst_basic`**: integration test with a Sphinx project that uses only `get_rst = true`. Source file contains a single `/* @rst … @endrst */` block with an `.. impl::` directive (`RST_IMPL_1`). Verifies the need node appears in the doctree snapshot.
- **`rst_mixed`**: integration test combining `get_oneline_needs` and `get_rst` in the same build. Uses `[[ ]]` one-line markers to avoid clashing with the `@rst` prefix. Source file contains both a one-line need (`OL_IMPL_1`) and an RST block need (`RST_IMPL_2`). Verifies both needs appear in the doctree snapshot.

## Design decisions

### Why instantiate `NeedDirective` directly?

`NeedDirective` uses `DummyOptionSpec`, a dummy spec that accepts all options and keeps them as strings ([sphinx-needs source](https://github.com/useblocks/sphinx-needs/blob/df81a5c/sphinx_needs/directives/need.py#L49)). `DummyOptionSpec` was introduced in **sphinx-needs v6** ([commit d09332d](useblocks/sphinx-needs@d09332d)); earlier versions use an explicit `option_spec` dict, so passing arbitrary raw-string options would fail there. **We should consider raising the minimum dependency to `sphinx-needs>=6`** (currently `>=5,<9` in `pyproject.toml`).

`NeedDirective.run()` itself does its own key-by-key validation (via a `match key:` block). This means:

- No option validation/conversion is needed before instantiation.
- Passing raw `dict[str, str | None]` is exactly what the directive expects.

Using `NeedDirective` directly (rather than `add_need()`) gives full directive feature support: content body, arbitrary options, and internal NeedDirective logic for `title_from_content`, `delete`, etc.

### Why a custom regex parser instead of docutils parsing?

The `parse_single_directive` function is a purposefully simple regex parser scoped to the single-directive-per-block use case. Full docutils RST parsing would be heavier and harder to control for this constrained input.
## Comparison with MyST-Parser's directive handling

MyST-Parser's [`run_directive`](https://github.com/executablebooks/MyST-Parser/blob/9364edb/myst_parser/mdit_to_docutils/base.py#L1684) follows a more general pipeline:

1. **Directive class lookup** via `docutils.parsers.rst.directives.directive()`: resolves any registered directive and warns on unknowns.
2. **`parse_directive_text()`**: a dedicated parser that validates options against the directive's `option_spec` (type converters, unknown-key detection), validates arguments against `required_arguments` / `optional_arguments` / `final_argument_whitespace`, and supports both YAML-delimited and RST-style (`:key: value`) option blocks.
3. **Mocked `state` / `state_machine`** (`MockState`, `MockStateMachine`): because MyST renders from markdown-it tokens, not from within a real docutils state machine, it must mock these so directives can call `nested_parse()`.
4. **Error wrapping**: `DirectiveError` and `MockingError` are caught and converted to clean error nodes.

**What we don't need from this approach:**

- **Option validation**: `NeedDirective` uses `DummyOptionSpec` and validates internally, so pre-validation would be redundant.
- **Directive class lookup**: we always target `NeedDirective`.
- **Mocked state**: we run inside a real `SphinxDirective.run()`, so full docutils `state` / `state_machine` are already available.

**What we could adopt (potential TODOs):**

- [ ] **`content_offset` fallback**: when `content_line_offset` is `None` the code falls back to `self.content_offset` (the enclosing `.. src-trace::` directive's offset). This is semantically incorrect (though harmless when there's no content). Consider using `0` or adding a clarifying comment.
- [ ] **Bump minimum sphinx-needs to v6**: `DummyOptionSpec` (which lets us pass arbitrary string options) was added in v6 ([d09332d](useblocks/sphinx-needs@d09332d)). The current constraint is `sphinx-needs>=5,<9`; without bumping it, `render_marked_rst_needs` will break on sphinx-needs <6 where the directive uses a fixed `option_spec`.
- [ ] **Per-line source tracking on `StringList`**: the current code creates `StringList(content_lines, source=src_file)` which sets one source for all lines. For richer error messages pointing to exact lines within the RST block, per-line offset info could be added (docutils `StringList` supports this via the `items` parameter).
- [ ] **Marker clashes between one-line parser and `@rst` blocks**: when both `get_oneline_needs` and `get_rst` are enabled, comments containing `@rst ... @endrst` may also be matched by the one-line parser if its start sequence overlaps with `@rst` (e.g. the default `@ ` prefix). Currently the analysis pipeline processes every comment node through *both* extractors independently; there is no mutual exclusion. This can produce spurious one-line needs or validation errors from the `@rst` block's content being misinterpreted as one-line fields. Consider adding a skip/guard so that comments already claimed by `extract_marked_rst` are not also fed to `extract_oneline_needs`, or document that users must choose non-overlapping marker sequences.

## Key finding: `@rst` blocks require block comments

RST blocks must use C-style block comments (`/* @rst ... @endrst */`) rather than `//` line comments in C/C++. Tree-sitter parses each `//` line as a separate comment node, and `extract_rst()` needs both `@rst` and `@endrst` within a single comment node's text. This is an inherent constraint of the current tree-sitter based extraction and should be documented for users.
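The block-comment constraint can be illustrated without tree-sitter. In the sketch below, `extract_between` is a hypothetical stand-in for the real extraction helper (it is not the project's `extract_rst()`), and the strings only mimic how a block comment arrives as one node while `//` comments arrive as one node per line.

```python
from __future__ import annotations


def extract_between(text: str, start: str = "@rst", end: str = "@endrst") -> str | None:
    """Return the text between the markers, or None if either is missing.

    Hypothetical helper for illustration only.
    """
    start_idx = text.find(start)
    if start_idx == -1:
        return None
    body_start = start_idx + len(start)
    end_idx = text.find(end, body_start)
    if end_idx == -1:
        return None
    return text[body_start:end_idx]


# One block comment: a single comment node that contains both markers.
block_comment_nodes = ["/* @rst\n.. impl:: Title\n   :id: X_1\n@endrst */"]

# Three line comments: three separate nodes, none of which holds both markers.
line_comment_nodes = ["// @rst", "// .. impl:: Title", "// @endrst"]

block_hits = [extract_between(node) for node in block_comment_nodes]
line_hits = [extract_between(node) for node in line_comment_nodes]

print(block_hits)
print(line_hits)
```

Because each extractor call sees only one node, the line-comment variant never yields a block unless adjacent comment nodes are merged before extraction.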
1 parent cdf9e58 commit 4e4266f

15 files changed

Lines changed: 606 additions & 5 deletions

src/sphinx_codelinks/analyse/analyse.py

Lines changed: 1 addition & 3 deletions
@@ -76,9 +76,7 @@ def __init__(
         self.git_commit_rev: str | None = (
             utils.get_current_rev(self.git_root) if self.git_root else None
         )
-        self.project_path: Path = (
-            self.git_root if self.git_root else self.analyse_config.src_dir
-        )
+        self.project_path: Path = self.git_root or self.analyse_config.src_dir
         self.oneline_warnings: list[AnalyseWarning] = []
 
     def get_src_strings(self) -> Generator[tuple[Path, bytes], Any, None]:  # type: ignore[explicit-any]

src/sphinx_codelinks/analyse/utils.py

Lines changed: 165 additions & 0 deletions
@@ -2,6 +2,7 @@
 import configparser
 import logging
 from pathlib import Path
+import re
 from typing import TypedDict
 from urllib.request import pathname2url
 
@@ -355,6 +356,170 @@ class ExtractedRstType(TypedDict):
     end_idx: int
 
 
+class ParsedDirective(TypedDict):
+    """A single parsed RST directive."""
+
+    name: str
+    argument: str
+    options: dict[str, str]
+    content: str
+    has_extra_content: bool
+    directive_line_offset: int
+    """0-based line index of the ``.. name::`` line within the input text."""
+    content_line_offset: int | None
+    """0-based line index where the directive content starts within the input text.
+
+    ``None`` if the directive has no content body.
+    """
+
+
+_RE_DIRECTIVE = re.compile(r"^(\s*)\.\.\s+([\w:.+-]+)\s*::\s*(.*)")
+_RE_OPTION = re.compile(r"^\s+:([^:]+):\s*(.*)")
+
+
+def _parse_options(body_lines: list[str]) -> tuple[dict[str, str], int]:
+    """Parse field-list options from the start of directive body lines.
+
+    Supports multi-line option values: continuation lines must be indented
+    and are joined with a single space.
+
+    :return: Tuple of (options dict, content_start index into body_lines).
+    """
+    options: dict[str, str] = {}
+    content_start = 0
+    current_key: str | None = None
+    for j, line in enumerate(body_lines):
+        if not line.strip():
+            # Blank line ends the option block.
+            content_start = j + 1
+            current_key = None
+            break
+        opt_match = _RE_OPTION.match(line)
+        if opt_match:
+            current_key = opt_match.group(1).strip()
+            options[current_key] = opt_match.group(2).strip()
+            content_start = j + 1
+        elif current_key is not None and line[:1] == " ":
+            # Continuation line for the previous option value.
+            # NOTE: In standard RST (docutils),
+            # continuation indent is measured relative to the field body
+            # start. Here any leading space is accepted, which is looser
+            # but correct within a directive body where all lines are
+            # already indented past the directive marker.
+            prev = options[current_key]
+            continuation = line.strip()
+            options[current_key] = f"{prev} {continuation}" if prev else continuation
+            content_start = j + 1
+        else:
+            content_start = j
+            break
+    else:
+        content_start = len(body_lines)
+    return options, content_start
+
+
+def _extract_content(
+    body_lines: list[str], content_start: int
+) -> tuple[list[str], int]:
+    """Extract and dedent the content portion of a directive body.
+
+    :return: Tuple of (dedented content lines, number of leading blank lines removed).
+    """
+    content_lines = body_lines[content_start:]
+    content_blanks_removed = 0
+    while content_lines and not content_lines[0].strip():
+        content_lines.pop(0)
+        content_blanks_removed += 1
+    while content_lines and not content_lines[-1].strip():
+        content_lines.pop()
+    if content_lines:
+        min_indent = min(
+            len(cl) - len(cl.lstrip()) for cl in content_lines if cl.strip()
+        )
+        content_lines = [cl[min_indent:] if cl.strip() else "" for cl in content_lines]
+    return content_lines, content_blanks_removed
+
+
+def parse_single_directive(rst_text: str) -> ParsedDirective | None:
+    """Parse a single RST directive from text.
+
+    Expects text whose first non-blank line is a directive, e.g.::
+
+        .. need-type:: argument
+           :option: value
+
+           Content body here.
+
+    :param rst_text: The RST text to parse.
+    :return: Parsed directive, or ``None`` if the first non-blank line
+        is not a directive.
+    """
+    lines = rst_text.splitlines()
+
+    # Find directive on the first non-blank line
+    dir_idx: int | None = None
+    dir_match: re.Match[str] | None = None
+    for i, line in enumerate(lines):
+        if line.strip():
+            dir_match = _RE_DIRECTIVE.match(line)
+            if dir_match:
+                dir_idx = i
+            break
+
+    if dir_idx is None or dir_match is None:
+        return None
+
+    dir_indent = len(dir_match.group(1))
+    name = dir_match.group(2)
+    # NOTE: In standard RST (docutils), directive
+    # arguments may span multiple lines before the first field-list
+    # marker. Here only the ``.. name::`` line is captured; this is
+    # sufficient for NeedDirective where the argument is a single-line
+    # title.
+    argument = dir_match.group(3).strip()
+
+    # Collect body: indented (or blank) lines after the directive.
+    # body_end tracks the last non-blank indented line so trailing
+    # blank lines between the directive and outside content are excluded.
+    body_end = dir_idx
+    for i in range(dir_idx + 1, len(lines)):
+        line = lines[i]
+        if not line.strip():
+            continue
+        if len(line) - len(line.lstrip()) > dir_indent:
+            body_end = i
+        else:
+            break
+
+    body_lines = lines[dir_idx + 1 : body_end + 1]
+
+    options, content_start = _parse_options(body_lines)
+    content_lines, content_blanks_removed = _extract_content(body_lines, content_start)
+    content = "\n".join(content_lines)
+
+    # Extra content = any non-blank line outside the directive body.
+    has_extra = any(lines[i].strip() for i in range(body_end + 1, len(lines)))
+
+    # Line offsets relative to the start of rst_text (0-based).
+    directive_line_offset = dir_idx
+    if content_lines:
+        content_line_offset: int | None = (
+            dir_idx + 1 + content_start + content_blanks_removed
+        )
+    else:
+        content_line_offset = None
+
+    return ParsedDirective(
+        name=name,
+        argument=argument,
+        options=options,
+        content=content,
+        has_extra_content=has_extra,
+        directive_line_offset=directive_line_offset,
+        content_line_offset=content_line_offset,
+    )
+
+
 # @Extract reStructuredText blocks embedded in comments, IMPL_RST_1, impl, [FE_RST_EXTRACTION]
 def extract_rst(
     text: str, start_marker: str, end_marker: str
src/sphinx_codelinks/sphinx_extension/directives/src_trace.py

Lines changed: 113 additions & 2 deletions
@@ -5,14 +5,17 @@
 
 from docutils import nodes
 from docutils.parsers.rst import directives
+from docutils.statemachine import StringList
 from packaging.version import Version
 import sphinx
 from sphinx.util.docutils import SphinxDirective
 from sphinx_needs.api import add_need  # type: ignore[import-untyped]
+from sphinx_needs.directives.need import NeedDirective  # type: ignore[import-untyped]
 from sphinx_needs.utils import add_doc  # type: ignore[import-untyped]
 
 from sphinx_codelinks.analyse.analyse import SourceAnalyse
-from sphinx_codelinks.analyse.models import OneLineNeed
+from sphinx_codelinks.analyse.models import Metadata
+from sphinx_codelinks.analyse.utils import parse_single_directive
 from sphinx_codelinks.config import (
     CodeLinksConfig,
     CodeLinksProjectConfigType,
@@ -43,7 +46,7 @@ def get_rel_path(doc_path: Path, code_path: Path, base_dir: Path) -> tuple[Path,
 
 
 def generate_str_link_name(
-    oneline_need: OneLineNeed,
+    oneline_need: Metadata,
     target_filepath: Path,
     dirs: dict[str, Path],
     local: bool = False,
@@ -180,6 +183,16 @@ def run(self) -> list[nodes.Node]:
             dirs,
         )
 
+        # render needs from marked RST blocks
+        rendered_needs.extend(
+            self.render_marked_rst_needs(
+                src_analyse,
+                local_url_field,
+                remote_url_field,
+                dirs,
+            )
+        )
+
         # for post-processing of need links
         # https://github.com/useblocks/sphinx-needs/issues/1210
         add_doc(self.env, self.env.docname)
@@ -322,3 +335,101 @@ def render_needs(
                 ] = f"{docs_href}#{oneline_need.need['id']}"
 
         return rendered_needs
+
+    def render_marked_rst_needs(
+        self,
+        src_analyse: SourceAnalyse,
+        local_url_field: str | None,
+        remote_url_field: str | None,
+        dirs: dict[str, Path],
+    ) -> list[nodes.Node]:
+        """Render needs from marked RST blocks (``@rst ... @endrst``).
+
+        Each block is expected to contain a single need directive.
+        Warnings are emitted when the block does not contain a directive
+        or contains content outside the directive.
+        """
+        rendered_nodes: list[nodes.Node] = []
+        for marked_rst in src_analyse.marked_rst:
+            parsed = parse_single_directive(marked_rst.rst)
+            src_file = str(marked_rst.filepath)
+            src_line = marked_rst.source_map["start"]["row"] + 1
+
+            if parsed is None:
+                logger.warning(
+                    f"No directive found in marked RST block [{src_file}:{src_line}]",
+                    location=(self.env.docname, self.lineno),
+                )
+                continue
+
+            if parsed["has_extra_content"]:
+                logger.warning(
+                    "Content found outside directive in marked RST block "
+                    f"[{src_file}:{src_line}]; "
+                    "only a single directive is supported",
+                    location=(self.env.docname, self.lineno),
+                )
+
+            # Build content StringList with source mapping
+            content_lines = parsed["content"].splitlines() if parsed["content"] else []
+            content_offset = self.content_offset
+            if parsed["content_line_offset"] is not None:
+                content_offset = src_line - 1 + parsed["content_line_offset"]
+            content = StringList(content_lines, source=src_file)
+
+            # Build arguments list (title)
+            arguments = [parsed["argument"]] if parsed["argument"] else []
+
+            # Options are passed as raw strings without conversion or
+            # validation here; NeedDirective uses a DummyOptionSpec that
+            # accepts all keys as strings, and performs its own key-by-key
+            # validation inside run().
+            # NOTE: DummyOptionSpec was added in sphinx-needs v6
+            # (d09332d); earlier versions use a fixed option_spec.
+            options: dict[str, str | None] = dict(parsed["options"])
+
+            # Inject URL fields
+            filepath = src_analyse.analyse_config.src_dir / marked_rst.filepath
+            target_filepath = dirs["target_dir"] / filepath.relative_to(dirs["src_dir"])
+
+            if local_url_field:
+                target_filepath.parent.mkdir(parents=True, exist_ok=True)
+                target_filepath.write_text(filepath.read_text())
+                local_rel_path, _ = get_rel_path(
+                    Path(self.env.docname), target_filepath, dirs["out_dir"]
+                )
+                options[local_url_field] = generate_str_link_name(
+                    marked_rst, local_rel_path, dirs, local=True
+                )
+            if remote_url_field:
+                options[remote_url_field] = generate_str_link_name(
+                    marked_rst, target_filepath, dirs, local=False
+                )
+
+            directive_lineno = src_line + parsed["directive_line_offset"]
+
+            # Instantiate NeedDirective directly rather than using add_need(),
+            # so that it can process the full directive body (content, options)
+            # through its own run() logic. We pass the real state/state_machine
+            # from the enclosing SphinxDirective — no mocking needed.
+            need_directive = NeedDirective(
+                name=parsed["name"],
+                arguments=arguments,
+                options=options,
+                content=content,
+                lineno=directive_lineno,
+                content_offset=content_offset,
+                block_text="",
+                state=self.state,
+                state_machine=self.state_machine,
+            )
+            try:
+                rendered_nodes.extend(need_directive.run())
+            except Exception as exc:
+                logger.warning(
+                    "Failed to render directive in marked RST block "
+                    f"[{src_file}:{src_line}]: {exc}",
+                    location=(self.env.docname, self.lineno),
+                )
+
+        return rendered_nodes
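The line arithmetic in `render_marked_rst_needs()` can be checked in isolation. The sketch below restates the two formulas from the method in a hypothetical helper, `map_block_lines`, and feeds it numbers that would fit the `rst_basic` fixture under the assumption that the extracted RST text starts directly after the `@rst` marker (so its first line is empty). That assumption about the extractor's output is not confirmed by this diff.

```python
from __future__ import annotations


def map_block_lines(
    start_row: int,
    directive_line_offset: int,
    content_line_offset: int | None,
    fallback_content_offset: int,
) -> tuple[int, int]:
    """Map block-relative offsets to positions in the source file.

    Hypothetical helper mirroring the arithmetic in render_marked_rst_needs().
    """
    src_line = start_row + 1  # tree-sitter rows are 0-based
    directive_lineno = src_line + directive_line_offset
    if content_line_offset is not None:
        content_offset = src_line - 1 + content_line_offset
    else:
        # Mirrors the fallback to the enclosing directive's content_offset.
        content_offset = fallback_content_offset
    return directive_lineno, content_offset


# Comment starts on the third line of the file (row 2); the directive is the
# second line of the extracted text and its content the sixth (both 0-based).
with_content = map_block_lines(2, 1, 5, fallback_content_offset=0)

# Without a content body the fallback value is passed through unchanged.
without_content = map_block_lines(2, 1, None, fallback_content_offset=42)

print(with_content, without_content)
```

Under these assumptions the directive maps to line 4 of the source file and the content to 0-based line 7, which matches where `.. impl::` and the paragraph sit in the fixture.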
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+<document source="<source>">
+    <target anonymous="" ids="RST_IMPL_1" refid="RST_IMPL_1">
+    <Need classes="need need-impl" ids="RST_IMPL_1" refid="RST_IMPL_1">
+        <paragraph>
+            This need was defined inside an @rst block.
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+<document source="<source>">
+    <target anonymous="" ids="OL_IMPL_1" refid="OL_IMPL_1">
+    <Need classes="need need-impl" ids="OL_IMPL_1" refid="OL_IMPL_1">
+    <target anonymous="" ids="RST_IMPL_2" refid="RST_IMPL_2">
+    <Need classes="need need-impl" ids="RST_IMPL_2" refid="RST_IMPL_2">
+        <paragraph>
+            This is a detailed need from an @rst block,
+            coexisting with a one-line need.

tests/doc_test/rst_basic/conf.py

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+# Configuration file for the Sphinx documentation builder.
+
+project = "rst-block-test"
+copyright = "2025, useblocks"
+author = "useblocks"
+
+extensions = ["sphinx_needs", "sphinx_codelinks"]
+
+templates_path = ["_templates"]
+exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]
+
+src_trace_config_from_toml = "src_trace.toml"
+
+html_theme = "alabaster"
+html_static_path = ["_static"]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+#include <iostream>
+
+/* @rst
+.. impl:: RST Block Implementation
+   :id: RST_IMPL_1
+   :status: open
+
+   This need was defined inside an @rst block.
+@endrst */
+void rst_block_function()
+{
+    std::cout << "RST block example" << std::endl;
+}

tests/doc_test/rst_basic/index.rst

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+.. src-trace::
+   :project: src
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+[codelinks.projects.src]
+remote_url_pattern = "https://github.com/useblocks/sphinx-codelinks/blob/{commit}/{path}#L{line}"
+
+[codelinks.projects.src.analyse]
+get_rst = true
