14 changes: 14 additions & 0 deletions CMakeLists.txt
@@ -483,6 +483,7 @@ if (MRDOCS_BUILD_TESTS)
"${PROJECT_BINARY_DIR}/src"
)
target_link_libraries(mrdocs-test PUBLIC mrdocs-core)
target_link_libraries(mrdocs-test PRIVATE Lua::lua)
if (MRDOCS_CLANG)
target_compile_options(mrdocs-test PRIVATE -Wno-covered-switch-default)
endif ()
@@ -563,6 +564,19 @@ if (MRDOCS_BUILD_TESTS)
endforeach ()
endforeach ()

#-------------------------------------------------
# Script-driven generator example
#-------------------------------------------------
add_test(
NAME mrdocs-generator-script-driven-search-index
COMMAND bash run.sh
--addons=${CMAKE_CURRENT_SOURCE_DIR}/share/mrdocs/addons
--output=${CMAKE_CURRENT_BINARY_DIR}/generator-examples/search-index/reference-output
WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/examples/generators/script-driven/search-index
)
set_property(TEST mrdocs-generator-script-driven-search-index PROPERTY
ENVIRONMENT "MRDOCS=$<TARGET_FILE:mrdocs>")

#-------------------------------------------------
# Library-usage examples
#
2 changes: 2 additions & 0 deletions docs/antora-playbook.yml
@@ -71,6 +71,8 @@ antora:
strip_page_wrapper: true
- source: examples/generators/data-driven
target: data-driven-generators
- source: examples/generators/script-driven
target: script-driven-generators
- source: examples
target: examples
- require: ./extensions/commands-registry-extension.js
2 changes: 1 addition & 1 deletion docs/modules/ROOT/nav.adoc
@@ -23,7 +23,7 @@
** xref:generators/adoc.adoc[AsciiDoc]
** xref:generators/xml.adoc[XML]
* Extensions
** xref:extensions/corpus-extensions.adoc[Corpus Extensions]
** xref:extensions/corpus-extensions.adoc[Extensions]
** xref:extensions/handlebars-extensions.adoc[Handlebars Extensions]
** xref:extensions/data-driven-generators.adoc[Data-Driven Generators]
** xref:extensions/antora.adoc[Antora Extensions]
77 changes: 60 additions & 17 deletions docs/modules/ROOT/pages/extensions/corpus-extensions.adoc
@@ -1,6 +1,11 @@
= Corpus Extensions
= Extensions

Use an extension to rewrite metadata across many symbols at once: backfill briefs from a naming convention, tag symbols by group, mark generated code as "see below" in the output. Extensions run between extraction and rendering, so every generator sees the change.
An extension is a Lua or JavaScript script that runs as part of a Mr.Docs build and shapes the documentation in a way templates alone cannot. There are two kinds, and a single script may declare either or both:

* A *corpus transform* rewrites metadata across many symbols at once: backfill briefs from a naming convention, tag symbols by group, mark generated code as "see below" in the output. Transforms run between extraction and rendering, so every generator sees the change.
* A *generator* owns the whole emit: instead of rendering one page per symbol, it traverses the corpus and writes whatever files it wants. That lets it produce shapes the per-page generators cannot, such as a single artifact aggregated across every symbol.

A script declares a transform with `mrdocs.register_transform(fn)` and a generator with `mrdocs.register_generator(id, fn)`. Either way the registered function receives a single context object, `ctx`.

== Languages and addon locations

@@ -14,9 +19,21 @@ Both scripting languages reach the same `mrdocs` API. The choice is a trade-off,
* *Lua* is the language designed to be embedded. Mr.Docs links it whole, so scripts have access to the entire Lua standard library (`string`, `table`, `math`, `io`, `os`) and can do filesystem work or text munging without leaving the script. The cost is that fewer people read Lua at a glance than read JavaScript. If you're already familiar with Lua, it is the more powerful choice.
====

== Accessing the corpus
== The context object

Every registered function receives one argument, `ctx`:

* `ctx.corpus` is the corpus. Its `symbols` field is a flat array of every extracted symbol; what a script can do with those symbols depends on the kind of extension, described in the sections below.
* `ctx.config` is the resolved configuration: the same object the templates receive, holding every value from the config file and the command line. See xref:configuration/reference.adoc[the configuration reference] for the available keys.

A generator's `ctx` carries one more field, `ctx.output`, covered under <<generators>>.

A script extends Mr.Docs by defining a function named `transform_corpus(corpus)`. Mr.Docs calls it once per loaded script with a flat read-only view of the corpus. A script that doesn't define `transform_corpus` is silently ignored at this step.
A script can register any number of transforms and generators, or none at all. If it registers nothing, Mr.Docs warns that the script had no effect and moves on.
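
The registration model described here can be sketched in JavaScript. The real `mrdocs` global exists only inside a Mr.Docs run, so the snippet supplies a hypothetical stand-in for it; the stand-in, the `symbol-count` generator id, the `log` and `written` collectors, and the sample `ctx` objects are all assumptions made for illustration. Only the call shapes `register_transform(fn)` and `register_generator(id, fn)` and the `ctx` fields come from the text.

```javascript
// Hypothetical stand-in for the host-provided `mrdocs` global, so the
// sketch runs outside Mr.Docs. The real object is supplied by the tool.
var transforms = [];
var generators = {};
var mrdocs = {
  register_transform: function (fn) { transforms.push(fn); },
  register_generator: function (id, fn) { generators[id] = fn; }
};
var log = [];

// --- what an extension script itself would contain ---
mrdocs.register_transform(function (ctx) {
  // A transform sees the corpus and the resolved configuration.
  log.push("transform saw " + ctx.corpus.symbols.length + " symbols");
});

mrdocs.register_generator("symbol-count", function (ctx) {
  // A generator receives the same fields plus ctx.output.
  ctx.output.write("count.txt", String(ctx.corpus.symbols.length));
});

// --- illustrative driver: each registered function gets one ctx ---
var corpus = { symbols: [{ name: "Vector" }, { name: "length" }] };
var written = {};
var transformCtx = { corpus: corpus, config: {} };
var generatorCtx = {
  corpus: corpus,
  config: {},
  output: { write: function (path, contents) { written[path] = contents; } }
};
for (var i = 0; i < transforms.length; ++i) {
  transforms[i](transformCtx);            // registration order
}
generators["symbol-count"](generatorCtx); // selected by id
```

A script written this way keeps both kinds of extension in one file, which is convenient when a generator depends on metadata that a transform in the same script prepares.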

[#corpus-transforms]
== Corpus transforms

A transform is a function passed to `mrdocs.register_transform`. Mr.Docs invokes each registered function once, in registration order, with the `ctx` object. A flat view of the corpus reaches the script through `ctx.corpus`.

[tabs]
======
@@ -37,7 +54,7 @@ include::example$snippets/extensions/entry-point/addons/extensions/noop.lua[]
----
======

The `corpus` object provides functions that expose the symbol graph. The `corpus.symbols` field is a flat array containing every extracted symbol. Scripts that need queries like "all members of `X`" simply walk the array and filter.
The `ctx.corpus` object provides functions that expose the symbol graph. The `ctx.corpus.symbols` field is a flat array containing every extracted symbol. Scripts that need queries like "all members of `X`" simply walk the array and filter.

For instance, the following scripts count the symbols of each kind and report the totals at the end of the run:

@@ -48,41 +65,41 @@ JavaScript::
.`addons/extensions/count_by_kind.js`
[source,javascript]
----
function transform_corpus(corpus) {
mrdocs.register_transform(function(ctx) {
var counts = {};
for (var i = 0; i < corpus.symbols.length; ++i) {
var k = corpus.symbols[i].kind;
for (var i = 0; i < ctx.corpus.symbols.length; ++i) {
var k = ctx.corpus.symbols[i].kind;
counts[k] = (counts[k] || 0) + 1;
}
for (var k in counts) {
console.log(k + ": " + counts[k]);
}
}
});
----

Lua::
+
.`addons/extensions/count_by_kind.lua`
[source,lua]
----
function transform_corpus(corpus)
mrdocs.register_transform(function(ctx)
local counts = {}
for _, sym in ipairs(corpus.symbols) do
for _, sym in ipairs(ctx.corpus.symbols) do
counts[sym.kind] = (counts[sym.kind] or 0) + 1
end
for k, v in pairs(counts) do
print(k .. ": " .. v)
end
end
end)
----
======

Each entry in `corpus.symbols` is a proxy for a live Mr.Docs symbol. The fields of each object are at xref:extensions/dom-reference.adoc[the DOM reference].
Each entry in `ctx.corpus.symbols` is a proxy for a live Mr.Docs symbol. The fields of each object are listed in xref:extensions/dom-reference.adoc[the DOM reference].

When a script knows a symbol's id and needs to act on that one symbol:

* `corpus.get(id)` returns the proxy for it or `null` if the id is unknown
* `corpus.lookup(name)` does a global-namespace name lookup and returns the proxy (or `null`)
* `ctx.corpus.get(id)` returns the proxy for it or `null` if the id is unknown
* `ctx.corpus.lookup(name)` does a global-namespace name lookup and returns the proxy (or `null`)
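
The two access styles (walk-and-filter versus a targeted `get` or `lookup`) might look like the following. This is a minimal sketch against a hypothetical in-memory corpus: `makeMockCorpus`, `symbolsOfKind`, `describe`, the sample ids, and the kind strings are illustrative assumptions, not Mr.Docs code. Only the `symbols`, `get(id)`, and `lookup(name)` surface and the `null` result for an unknown symbol are taken from the text.

```javascript
// Hypothetical mock that mimics the documented surface of ctx.corpus.
// Inside Mr.Docs the proxies come from the tool itself.
function makeMockCorpus(symbols) {
  function find(key, value) {
    for (var i = 0; i < symbols.length; ++i) {
      if (symbols[i][key] === value) return symbols[i];
    }
    return null;
  }
  return {
    symbols: symbols,
    get: function (id) { return find("id", id); },
    lookup: function (name) { return find("name", name); }
  };
}

// Walk-and-filter query: every symbol of a given kind.
function symbolsOfKind(corpus, kind) {
  var out = [];
  for (var i = 0; i < corpus.symbols.length; ++i) {
    if (corpus.symbols[i].kind === kind) out.push(corpus.symbols[i]);
  }
  return out;
}

// Targeted query: resolve a name and tolerate an unknown one.
function describe(corpus, name) {
  var sym = corpus.lookup(name);
  return sym === null ? name + ": not found" : name + ": " + sym.kind;
}

var corpus = makeMockCorpus([
  { id: "id-1", name: "Shape", kind: "record" },
  { id: "id-2", name: "area", kind: "function" }
]);
console.log(describe(corpus, "Shape"));
console.log(describe(corpus, "Circle"));
```

Checking for `null` before touching the result keeps a misspelled name from turning into an uncaught error, which would abort the build.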

.`subclass-tree.cpp`
[source,cpp]
@@ -122,7 +139,7 @@ Shape
----

[[modifying-the-corpus]]
== Modifying the corpus
=== Modifying the corpus

Scripts modify the corpus by assigning to fields on a symbol proxy. Each assignment lands directly in the underlying Mr.Docs symbol. The runtime validates each assignment and raises an exception on an invalid value. An uncaught error in an extension aborts the build and includes the script's path and the error message.

@@ -160,7 +177,7 @@ Every `is_foo_bar` function then ships with "Returns true if foo bar." Authors o
include::example$snippets/extensions/brief-from-name/brief-from-name.adoc[tags=!footer]
========
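
The naming convention itself is plain string work, and one way to derive the brief text is sketched here. `briefFromName` is a hypothetical helper, and the exact pattern it accepts (lowercase words separated by underscores after `is_`) is an assumption. How the result is assigned to a symbol is deliberately left out, since the field shape is defined by the included snippet and the DOM reference rather than by this sketch.

```javascript
// Derive a brief from an `is_*` predicate name, or return null when the
// convention does not apply. Pure string work; no Mr.Docs API involved.
function briefFromName(name) {
  var m = /^is_([a-z0-9]+(?:_[a-z0-9]+)*)$/.exec(name);
  if (m === null) return null;
  return "Returns true if " + m[1].split("_").join(" ") + ".";
}

console.log(briefFromName("is_foo_bar"));
console.log(briefFromName("length"));
```

Returning `null` for names outside the convention lets the caller skip those symbols, so hand-written briefs and unrelated functions are left alone.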

== Cross-linking Symbols
=== Cross-linking Symbols

When the value being written needs to reference another symbol, the second symbol's `id` is what makes the link clickable in the rendered output rather than a plain string.

@@ -203,3 +220,29 @@ The two-pass shape (index, then look up) is the idiom whenever a write needs to
Notice in this example that `s.doc.sees` receives a list of polymorphic types that represent a paragraph in `s.doc.sees.children`. These polymorphic objects accept an object with a `kind:` selector that names the concrete derived class to construct.
====
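
The two-pass shape can be illustrated without any Mr.Docs API. In this sketch, `indexByName`, `resolveLinks`, and the `wanted` table (which symbol names should link to which) are hypothetical; the symbols are plain objects carrying only `id` and `name`. The sketch stops at computing id pairs and does not show the write into the symbol's documentation, whose value shape is described in the note above.

```javascript
// Pass 1: index every symbol id by name (first occurrence wins).
function indexByName(symbols) {
  var index = Object.create(null);
  for (var i = 0; i < symbols.length; ++i) {
    var s = symbols[i];
    if (s.name && !(s.name in index)) index[s.name] = s.id;
  }
  return index;
}

// Pass 2: resolve the names each symbol should reference into ids.
// `wanted` maps a symbol name to a list of target names (hypothetical input).
function resolveLinks(symbols, wanted) {
  var index = indexByName(symbols);
  var has = Object.prototype.hasOwnProperty;
  var links = [];
  for (var i = 0; i < symbols.length; ++i) {
    var s = symbols[i];
    var targets = has.call(wanted, s.name) ? wanted[s.name] : [];
    for (var j = 0; j < targets.length; ++j) {
      if (targets[j] in index) {
        links.push({ from: s.id, to: index[targets[j]] });
      }
    }
  }
  return links;
}

var links = resolveLinks(
  [{ id: "a", name: "encode" }, { id: "b", name: "decode" }],
  { encode: ["decode"], decode: ["encode", "missing"] }
);
console.log(JSON.stringify(links));
```

Building the index first means the second pass never depends on the order in which symbols appear in the array, and unknown targets are dropped instead of producing dangling links.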

[[generators]]
== Generators

The built-in generators render one page per symbol. When you need a different output structure, e.g. one file per namespace, or a single artifact aggregated across every symbol such as a search index, that page-per-symbol shape cannot express it. A generator hands the whole emit to the script instead: it traverses the corpus and writes whatever files it wants. No C++ and no templates are involved.

A script declares a generator with `mrdocs.register_generator(id, fn)`. The `id` is the name you select on the command line with `--generator=<id>`; a registered generator takes precedence over a built-in of the same name. Selecting a generator is a request for output, so its function does the work directly; the page-per-symbol fallback the built-ins provide does not apply.

The function receives the same `ctx` a transform does, plus `ctx.output`:

* `ctx.corpus.symbols` is the array of every symbol. Each carries the same fields the template and helper layers see, plus a flat `_id` string suitable as a stable per-symbol URL fragment. A generator reads the corpus rather than mutating it.
* `ctx.config` is the resolved configuration, as above.
* `ctx.output.write(relativePath, contents)` writes one file under the configured output directory: the path given with `--output` on the command line or with the `output` key in the config file, the same location the built-in generators write to. The path is resolved relative to that directory and may not escape it: an absolute path, or one that climbs above the output directory, is rejected. Parent directories are created as needed.
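
To make the path rule concrete, a script could pre-check a path before handing it to `ctx.output.write`. The function below is a purely lexical sketch and an assumption about what "escaping" means; it is not the check Mr.Docs performs, and `isSafeRelativePath` is a hypothetical name. Rejecting an empty path or one that names the directory itself is also a choice made for this sketch.

```javascript
// Lexical check: true when `p` is relative and never climbs above its
// starting directory. Illustrative only; Mr.Docs applies its own rule.
function isSafeRelativePath(p) {
  // Reject empty paths, rooted paths, and Windows drive-letter paths.
  if (p === "" || /^[\\\/]/.test(p) || /^[A-Za-z]:/.test(p)) return false;
  var depth = 0;
  var segments = p.split(/[\\\/]+/);
  for (var i = 0; i < segments.length; ++i) {
    var seg = segments[i];
    if (seg === "" || seg === ".") continue;
    if (seg === "..") {
      depth -= 1;
      if (depth < 0) return false; // climbed above the output directory
    } else {
      depth += 1;
    }
  }
  return depth > 0; // must name something below the directory
}

console.log(isSafeRelativePath("search-index.json"));
console.log(isSafeRelativePath("../outside.json"));
```

A check like this only gives earlier, friendlier diagnostics inside the script; the rejection described in the list above still applies regardless.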

Because the script owns the output, it also owns what a per-page generator would otherwise do for it: the URLs it emits, and any escaping of the content it writes. Mr.Docs does not apply an escape map to a generator's output.
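
As a rough JavaScript illustration of a script owning both the URLs and the escaping, the sketch below builds a JSON index and lets `JSON.stringify` do the quoting. The `buildSearchIndex` helper, the `search-index-js` id, and the `<_id>.html` URL scheme are assumptions for this example; the `typeof` guard exists only so the snippet can also run where no `mrdocs` global is defined.

```javascript
// Build the index text from a symbols array. JSON.stringify performs the
// escaping, which the script must own since Mr.Docs applies no escape map.
function buildSearchIndex(symbols) {
  var entries = [];
  for (var i = 0; i < symbols.length; ++i) {
    var sym = symbols[i];
    if (sym.name) {
      // URL scheme chosen by the script: one fragment per flat `_id`.
      entries.push({ name: sym.name, url: sym._id + ".html" });
    }
  }
  return JSON.stringify(entries);
}

// Inside Mr.Docs the script would register the generator like this.
// Outside Mr.Docs there is no `mrdocs` global and the block is skipped.
if (typeof mrdocs !== "undefined") {
  mrdocs.register_generator("search-index-js", function (ctx) {
    ctx.output.write("search-index.json",
                     buildSearchIndex(ctx.corpus.symbols));
  });
}
```

Keeping the string building in a separate function makes the escaping easy to test on its own, independent of the corpus and of the output directory.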

=== Example: a search index

A complete, runnable example lives at `examples/generators/script-driven/search-index/`. The extension declares a `search-index` generator that emits a single `search-index.json` aggregating every symbol, an artifact no per-page generator can produce:

.`addons/extensions/search_index.lua`
[source,lua]
----
include::example$script-driven-generators/search-index/addons/extensions/search_index.lua[]
----

Select it with `--generator=search-index`; it writes `search-index.json` into the output directory.
Original file line number Diff line number Diff line change
@@ -246,4 +246,6 @@ include::example$data-driven-generators/tex/simple.tex[]
----
======

To build the output structure yourself, e.g. one file per namespace or a single aggregated artifact like a search index, hand the whole emit to a script instead of rendering one page per symbol. See the xref:extensions/corpus-extensions.adoc#generators[generators] section of the extensions page.


2 changes: 1 addition & 1 deletion docs/mrdocs.schema.json
@@ -282,7 +282,7 @@
"default": [
"html"
],
"description": "The generator is responsible for creating the documentation from the extracted symbols. The generator uses the extracted symbols and the templates to create the documentation. The built-in generators include `adoc`, `html`, `xml`, and `noop` (which extracts but writes no output, useful for checking extraction warnings); data-driven generators can be added by dropping a template folder under <addon>/generator/<name>/. This option accepts a single generator (`generator: xml`), a list (`generator: [xml, adoc]`), or a comma-separated string (`generator: \"xml,adoc\"`); when more than one is given the documentation is produced once per generator.",
"description": "The generator is responsible for creating the documentation from the extracted symbols. The generator uses the extracted symbols and the templates to create the documentation. The built-in generators include `adoc`, `html`, `xml`, and `noop` (which extracts but writes no output, useful for checking extraction warnings); data-driven generators can be added by dropping a template folder under <addon>/generator/<name>/; a script-driven generator is declared by an extension with `mrdocs.register_generator` and produces the output itself. This option accepts a single generator (`generator: xml`), a list (`generator: [xml, adoc]`), or a comma-separated string (`generator: \"xml,adoc\"`); when more than one is given the documentation is produced once per generator.",
"title": "Generator(s) used to create the documentation"
},
"global-namespace-index": {
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
-- Declare a `search-index` generator: a script-defined generator that
-- aggregates every symbol into a single search-index.json, the kind of
-- artifact the per-page generators cannot produce.
--
-- `mrdocs.register_generator(id, fn)` declares it next to any
-- `mrdocs.register_transform` a script might also declare; selecting
-- `generator: <id>` runs `fn` with one `ctx`. `ctx.corpus.symbols` is
-- every symbol (each tagged with a flat `_id` so the generator can form
-- stable per-symbol URLs) and `ctx.output.write` emits files under the
-- output directory.

-- Quote a string as a JSON value.
local function json_string(s)
s = s:gsub('\\', '\\\\'):gsub('"', '\\"')
return '"' .. s .. '"'
end

mrdocs.register_generator("search-index", function(ctx)
local entries = {}
for _, sym in ipairs(ctx.corpus.symbols) do
local name = sym.name or ""
if name ~= "" then
entries[#entries + 1] =
'{"name":' .. json_string(name) ..
',"url":' .. json_string(sym._id .. ".html") .. "}"
end
end
ctx.output.write(
"search-index.json",
"[" .. table.concat(entries, ",") .. "]")
end)
9 changes: 9 additions & 0 deletions examples/generators/script-driven/search-index/mrdocs.yml
@@ -0,0 +1,9 @@
addons-supplemental:
- addons
generator: search-index
multipage: false
show-namespaces: false
warn-if-undocumented: false
source-root: .
input:
- .
2 changes: 2 additions & 0 deletions examples/generators/script-driven/search-index/run.sh
@@ -0,0 +1,2 @@
#!/usr/bin/env bash
exec "${MRDOCS:-mrdocs}" --config=mrdocs.yml "$@"
16 changes: 16 additions & 0 deletions examples/generators/script-driven/search-index/simple.cpp
@@ -0,0 +1,16 @@
/// A vector in the Euclidean plane.
struct Vector
{
/** The length (magnitude) of the vector.

@return The Euclidean length.
*/
double length() const;

/** Scale the vector componentwise.

@param sx Factor applied to the x component.
@param sy Factor applied to the y component.
*/
void scale(double sx, double sy);
};
21 changes: 21 additions & 0 deletions include/mrdocs/Support/Lua.hpp
@@ -674,6 +674,27 @@ registerHelper(
Context& ctx,
std::string_view script);

/** Expose a registry-anchored Lua function as a dom::Function.

`ref` must be a reference obtained from
`luaL_ref(L, LUA_REGISTRYINDEX)` for a function value, where `L` is
`ctx`'s native state. The returned function invokes that Lua function,
marshalling its arguments and result as DOM values, so a Lua callable
can be held and called like any other @ref dom::Function.

The returned function takes ownership of the reference and keeps the
context alive: the Lua function is released (`luaL_unref`) when the
last copy of the returned function is destroyed. The anchor lives in
the registry, so no Lua global is introduced.

@param ctx The context whose registry holds the function.
@param ref The `luaL_ref` reference to the function.
@return A dom::Function that calls the anchored Lua function.
*/
[[nodiscard]] MRDOCS_DECL
dom::Function
makeCallable(Context ctx, int ref);

} // lua
} // mrdocs

2 changes: 1 addition & 1 deletion src/lib/ConfigOptions.json
@@ -443,7 +443,7 @@
{
"name": "generator",
"brief": "Generator(s) used to create the documentation",
"details": "The generator is responsible for creating the documentation from the extracted symbols. The generator uses the extracted symbols and the templates to create the documentation. The built-in generators include `adoc`, `html`, `xml`, and `noop` (which extracts but writes no output, useful for checking extraction warnings); data-driven generators can be added by dropping a template folder under <addon>/generator/<name>/. This option accepts a single generator (`generator: xml`), a list (`generator: [xml, adoc]`), or a comma-separated string (`generator: \"xml,adoc\"`); when more than one is given the documentation is produced once per generator.",
"details": "The generator is responsible for creating the documentation from the extracted symbols. The generator uses the extracted symbols and the templates to create the documentation. The built-in generators include `adoc`, `html`, `xml`, and `noop` (which extracts but writes no output, useful for checking extraction warnings); data-driven generators can be added by dropping a template folder under <addon>/generator/<name>/; a script-driven generator is declared by an extension with `mrdocs.register_generator` and produces the output itself. This option accepts a single generator (`generator: xml`), a list (`generator: [xml, adoc]`), or a comma-separated string (`generator: \"xml,adoc\"`); when more than one is given the documentation is produced once per generator.",
"type": "string-list",
"default": ["html"]
},
32 changes: 32 additions & 0 deletions src/lib/CorpusImpl.cpp
@@ -1102,4 +1102,36 @@ CorpusImpl::finalize()
}
}

void
CorpusImpl::
registerScriptGenerator(std::string id, dom::Function fn)
{
if (findScriptGenerator(id) == nullptr)
{
scriptGenerators_.emplace_back(std::move(id), std::move(fn));
}
}

dom::Function const*
CorpusImpl::
findScriptGenerator(std::string_view id) const noexcept
{
dom::Function const* result = nullptr;
for (auto const& entry : scriptGenerators_)
{
if (result == nullptr && std::string_view(entry.first) == id)
{
result = &entry.second;
}
}
return result;
}

void
CorpusImpl::
keepScriptVmAlive(std::shared_ptr<void> keepAlive)
{
scriptVmKeepAlives_.push_back(std::move(keepAlive));
}

} // mrdocs