Skip to content

Commit f40b8cf

Browse files
wip: read docstrings from doc-gen
This PR implements an alternative path to docstring rendering that reads the necessary data from a doc-gen4 SQLite database, to work around the fact that docstring info is no longer available in the environment with the module system and to allow a global view of the project for things like instance lists.
1 parent ee1b0d5 commit f40b8cf

9 files changed

Lines changed: 673 additions & 45 deletions

File tree

doc/UsersGuide/Manuals.lean

Lines changed: 46 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -6,8 +6,9 @@ Author: David Thrane Christiansen
66
import Lean.DocString.Syntax
77
import VersoManual
88
import VersoBlog
9+
import VersoManual.DB
910

10-
open Verso Genre Manual
11+
open Verso Genre Manual DB
1112

1213
open InlineLean
1314
open Verso.Doc
@@ -21,38 +22,66 @@ htmlSplit := .never
2122
Verso's {name}`Manual` genre can be used to write reference manuals, textbooks, or other book-like documents.
2223
It supports generating both HTML and PDFs via LaTeX, but the PDF support is relatively immature and untested compared to the HTML support.
2324

25+
{dbDocstring Manual}
2426
{docstring Manual}
2527

28+
29+
30+
31+
{dbDocstring Manual.PartMetadata}
2632
{docstring Manual.PartMetadata}
2733

34+
35+
36+
{dbDocstring Manual.HtmlSplitMode}
2837
{docstring Manual.HtmlSplitMode}
2938

39+
40+
3041
The {name}`Manual` genre's block and inline element types are extensible.
3142
In the document, they consist of instances of {name}`Manual.Block` and {name}`Manual.Inline`, respectively:
3243

44+
{dbDocstring Manual.Block}
3345
{docstring Manual.Block}
3446

47+
48+
49+
{dbDocstring Manual.Inline}
3550
{docstring Manual.Inline}
3651

52+
53+
3754
The fields {name}`Block.name` and {name Manual.Inline.name}`Inline.name` are used to look up concrete implementations of traversal and output generation in run-time tables that contain descriptions:
3855

56+
{dbDocstring Manual.BlockDescr}
3957
{docstring Manual.BlockDescr}
4058

59+
60+
61+
{dbDocstring Manual.InlineDescr}
4162
{docstring Manual.InlineDescr}
4263

64+
65+
4366
Typically, the `inline_extension` and `block_extension` commands are used to simultaneously define an element and its descriptor, registering them for use by {name}`manualMain`.
4467

4568
:::paragraph
4669
The type {name}`HtmlAssets` contains CSS and JavaScript code.
4770
{name}`Manual.TraverseState`, {name}`Manual.BlockDescr`, and {name}`Manual.InlineDescr` all inherit from this structure.
4871
During traversal, HTML assets are collected; they are all included in the final rendered document.
4972

73+
{dbDocstring Manual.HtmlAssets}
5074
{docstring Manual.HtmlAssets}
5175

76+
77+
5278
Use {name}`HtmlAssets.combine` to combine multiple assets.
5379

80+
{dbDocstring Manual.HtmlAssets.combine}
5481
{docstring Manual.HtmlAssets.combine}
5582

83+
84+
5685
:::
5786

5887
# Tags and References
@@ -92,13 +121,19 @@ tag := "docstrings"
92121
Docstrings can be included using the `docstring` directive. For instance,
93122

94123
```
124+
{dbDocstring List.forM}
95125
{docstring List.forM}
126+
127+
96128
```
97129

98130
results in
99131

132+
{dbDocstring List.forM}
100133
{docstring List.forM}
101134

135+
136+
102137
The {name}`docstring` command takes a positional parameter which is the documented name.
103138
It also accepts the following optional named parameters:
104139

@@ -154,11 +189,17 @@ Elsewhere in the document, `tech` can be used to annotate a use site of a techni
154189
A {deftech}_technical term_ is a term with a specific meaning that's used precisely, like this one.
155190
References to technical terms are valid both before and after their definition sites.
156191

192+
{dbDocstring deftech}
157193
{docstring deftech}
158194

195+
196+
197+
{dbDocstring tech}
159198
{docstring tech}
160199

161200

201+
202+
162203
# Open-Source Licenses
163204
%%%
164205
tag := "oss-licenses"
@@ -168,8 +209,12 @@ To facilitate providing appropriate credit to the authors of open-source JavaScr
168209
This is done using the {name HtmlAssets.licenseInfo}`licenseInfo` field that {name}`BlockDescr` and {name}`InlineDescr` inherit from {name}`HtmlAssets`.
169210
These contain a {name}`LicenseInfo`:
170211

212+
{dbDocstring LicenseInfo}
171213
{docstring LicenseInfo}
172214

215+
216+
173217
The {name}`licenseInfo` command displays the licenses for all components that were included in the generated document:
174218

219+
{dbDocstring licenseInfo}
175220
{docstring licenseInfo}

lakefile.lean

Lines changed: 13 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -106,6 +106,7 @@ lean_exe «verso-demo» where
106106
lean_lib UsersGuide where
107107
srcDir := "doc"
108108
leanOptions := #[⟨`weak.linter.verso.manual.headerTags, true⟩]
109+
needs := #[`@:docSource]
109110

110111
@[default_target]
111112
lean_exe usersguide where
@@ -219,19 +220,17 @@ package_facet docSource pkg : System.FilePath := do
219220
| none => buildDir / ".." / "packages" / "doc-gen4"
220221

221222
exeJob.mapM fun exeFile => do
222-
-- Add trace for the TOML config file so changes trigger rebuild
223-
if ← tomlPath.pathExists then
224-
addTrace (← fetchFileTrace tomlPath (text := true))
225-
226-
buildFileUnlessUpToDate' dbPath do
227-
let args :=
228-
if ← tomlPath.pathExists then
229-
#[wsDir.toString, docgen4Dir.toString, pkgDir.toString, tomlPath.toString]
230-
else
231-
#[wsDir.toString, docgen4Dir.toString, pkgDir.toString]
232-
proc {
233-
cmd := exeFile.toString
234-
args
235-
}
223+
-- Always run the setup exe and let the inner `lake build` handle incrementality.
224+
-- This avoids stale DB issues from incomplete traces — the inner workspace's own
225+
-- build system correctly tracks all dependencies (doc-gen4, documented libraries, etc.).
226+
let args :=
227+
if ← tomlPath.pathExists then
228+
#[wsDir.toString, docgen4Dir.toString, pkgDir.toString, tomlPath.toString]
229+
else
230+
#[wsDir.toString, docgen4Dir.toString, pkgDir.toString]
231+
proc {
232+
cmd := exeFile.toString
233+
args
234+
}
236235

237236
pure dbPath

src/tests/Tests/DocSourceConfig.lean

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,7 @@ Tests for `Verso.Genre.Manual.DocSource.Config` — TOML parsing and lakefile ge
1414
open Verso.Genre.Manual.DocSource
1515
open Lake.Toml
1616

17-
/-- Parse a TOML string into a `Table`. Throws on parse error. -/
17+
/-- Parses a TOML string into a `Table`. Throws on parse error. -/
1818
private def parseToml (input : String) : IO Table := do
1919
let ictx := Lean.Parser.mkInputContext input "<test>"
2020
match (← Lake.Toml.loadToml ictx |>.toBaseIO) with
@@ -23,18 +23,18 @@ private def parseToml (input : String) : IO Table := do
2323
let msgStrs ← msgs.toList.mapM fun msg => msg.data.toString
2424
throw <| .userError s!"TOML parse error:\n{"\n".intercalate msgStrs}"
2525

26-
/-- Assert that two values are equal, throwing a descriptive error if not. -/
26+
/-- Asserts that two values are equal, throwing a descriptive error if not. -/
2727
private def assertEqual [BEq α] [Repr α] (label : String) (expected actual : α) : IO Unit := do
2828
unless expected == actual do
2929
throw <| IO.userError s!"{label}: expected\n {repr expected}\nbut got\n {repr actual}"
3030

31-
/-- Assert that a result is an error. -/
31+
/-- Asserts that a result is an error. -/
3232
private def assertError [Repr α] (label : String) (result : Except String α) : IO Unit := do
3333
match result with
3434
| .error _ => pure ()
3535
| .ok v => throw <| IO.userError s!"{label}: expected an error but got\n {repr v}"
3636

37-
/-- Assert that a string contains a substring. -/
37+
/-- Asserts that a string contains a substring. -/
3838
private def assertContains (label : String) (haystack needle : String) : IO Unit := do
3939
unless (haystack.splitOn needle).length > 1 do
4040
throw <| IO.userError s!"{label}: expected string to contain '{needle}' but got:\n {haystack}"

src/verso-manual/VersoManual/DB.lean

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,3 +4,6 @@ Released under Apache 2.0 license as described in the file LICENSE.
44
Author: David Thrane Christiansen
55
-/
66
import VersoManual.DB.Config
7+
import VersoManual.DB.Convert
8+
import VersoManual.DB.Query
9+
import VersoManual.DB.Docstring

src/verso-manual/VersoManual/DB/Config.lean

Lines changed: 15 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@ namespace Verso.Genre.Manual.DocSource
1717

1818
open Lake.Toml
1919

20-
/-- A dependency entry from `doc-sources.toml`, mirroring `[[require]]` in `lakefile.toml`. -/
20+
/-- Dependency entry from `doc-sources.toml`, mirroring `[[require]]` in `lakefile.toml`. -/
2121
structure Require where
2222
/-- The package name (must match the name declared in its lakefile). -/
2323
name : String
@@ -44,20 +44,20 @@ structure Config where
4444
setup : Array String := #[]
4545
deriving Repr, BEq, Inhabited
4646

47-
/-- Extract a `String` from a TOML `Value`, or `none` if it's not a string. -/
47+
/-- Extracts a `String` from a TOML `Value`, or `none` if it's not a string. -/
4848
private def tomlString? : Value → Option String
4949
| .string _ s => some s
5050
| _ => none
5151

5252
/--
53-
Extract an `Array String` from a TOML array of strings. Non-string elements are silently
53+
Extracts an `Array String` from a TOML array of strings. Non-string elements are silently
5454
skipped.
5555
-/
5656
private def tomlStringArray? : Value → Option (Array String)
5757
| .array _ vs => some <| vs.filterMap tomlString?
5858
| _ => none
5959

60-
/-- Parse a single `[[require]]` entry from a TOML table value. -/
60+
/-- Parses a single `[[require]]` entry from a TOML table value. -/
6161
def Require.ofToml (v : Value) : Except String Require := do
6262
match v with
6363
| .table' _ t =>
@@ -73,7 +73,7 @@ def Require.ofToml (v : Value) : Except String Require := do
7373
pure { name, git, rev, path, subDir }
7474
| _ => throw "[[require]] entry must be a table"
7575

76-
/-- Parse a `Config` from a TOML `Table`. -/
76+
/-- Parses a `Config` from a TOML `Table`. -/
7777
def Config.ofToml (table : Table) : Except String Config := do
7878
let require ← match table.find? `require with
7979
| some (.array _ vs) => vs.mapM Require.ofToml
@@ -87,7 +87,7 @@ def Config.ofToml (table : Table) : Except String Config := do
8787
| none => #[]
8888
pure { require, libraries, setup }
8989

90-
/-- Load and parse a `doc-sources.toml` file. -/
90+
/-- Loads and parses a `doc-sources.toml` file. -/
9191
def Config.load (filePath : System.FilePath) : IO Config := do
9292
let input ← IO.FS.readFile filePath
9393
let ictx := Lean.Parser.mkInputContext input filePath.toString
@@ -101,7 +101,7 @@ def Config.load (filePath : System.FilePath) : IO Config := do
101101
throw <| .userError s!"Error parsing {filePath}:\n{"\n".intercalate msgStrs}"
102102

103103
/--
104-
Split a command string into an executable name and arguments, respecting single and double
104+
Splits a command string into an executable name and arguments, respecting single and double
105105
quotes. Backslash escapes the next character inside double quotes. Unmatched quotes are
106106
treated as if closed at the end of the string.
107107
-/
@@ -152,14 +152,18 @@ def splitCommand (cmd : String) : Option (String × Array String) := do
152152
| [] => none
153153
| exe :: rest => some (exe, rest.toArray)
154154

155-
/-- Generate a `require` declaration in lakefile.lean syntax for a single dependency. -/
155+
/-- Generates a `require` declaration in lakefile.lean syntax for a single dependency. -/
156156
def Require.toLakefileEntry (r : Require) (projectDir : System.FilePath) : String :=
157157
let name := s!"require «{r.name}»"
158158
match r.git, r.path with
159159
| some url, _ =>
160+
-- Resolve relative git URLs against the project root, since the generated
161+
-- lakefile lives in the managed workspace, not the project root.
162+
let absUrl := if System.FilePath.isAbsolute ⟨url⟩ || (url.splitOn "://").length > 1 then url
163+
else (projectDir / url).toString
160164
let revPart := r.rev.map (s!" @ \"{·}\"") |>.getD ""
161165
let subDirPart := r.subDir.map (s!" / \"{·}\"") |>.getD ""
162-
s!"{name} from git\n \"{url}\"{revPart}{subDirPart}\n"
166+
s!"{name} from git\n \"{absUrl}\"{revPart}{subDirPart}\n"
163167
| _, some path =>
164168
-- Resolve relative paths against the project root to produce absolute paths,
165169
-- since the generated lakefile lives in the managed workspace, not the project root.
@@ -170,7 +174,7 @@ def Require.toLakefileEntry (r : Require) (projectDir : System.FilePath) : Strin
170174
s!"{name}\n"
171175

172176
/--
173-
Generate a complete `lakefile.lean` for the managed doc-gen workspace.
177+
Generates a complete `lakefile.lean` for the managed doc-gen workspace.
174178
175179
`config` is the parsed `doc-sources.toml` (or `none` for a core-only build).
176180
`docgen4Dir` is the absolute path to the doc-gen4 package checkout.
@@ -179,7 +183,7 @@ Generate a complete `lakefile.lean` for the managed doc-gen workspace.
179183
def generateLakefile (config : Option Config)
180184
(docgen4Dir : System.FilePath) (projectDir : System.FilePath) : String :=
181185
let header := "import Lake\nopen Lake DSL\n\npackage «docgen-workspace»\n\n"
182-
let docgenReq := s!"require «doc-gen4» from \"{docgen4Dir}\"\n\n"
186+
let docgenReq := s!"require «doc-gen4» from git\n \"{docgen4Dir}\"\n\n"
183187
let userReqs := match config with
184188
| some cfg => cfg.require.map (·.toLakefileEntry projectDir) |>.toList |> String.join
185189
| none => ""
Lines changed: 65 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,65 @@
1+
/-
2+
Copyright (c) 2025-2026 Lean FRO LLC. All rights reserved.
3+
Released under Apache 2.0 license as described in the file LICENSE.
4+
Author: David Thrane Christiansen
5+
-/
6+
import DocGen4.RenderedCode
7+
import SubVerso.Highlighting.Highlighted
8+
9+
/-! # RenderedCode → Highlighted Conversion
10+
11+
Doc-gen4 stores types as `RenderedCode` (`TaggedText RenderedCode.Tag`) binary blobs. Verso renders
12+
all code using SubVerso's `Highlighted` type. This module converts between them.
13+
14+
The conversion is lossy: `RenderedCode` does not carry hover info, variable types, or go-to-definition
15+
targets. The visual rendering is the same — tokens that were keywords, string literals, or constant
16+
references are tagged appropriately for syntax highlighting and linking.
17+
-/
18+
19+
namespace Verso.Genre.Manual.DB
20+
21+
open DocGen4 (RenderedCode SortFormer)
22+
open SubVerso.Highlighting (Highlighted Token)
23+
24+
/-- Extract plain text content from a `RenderedCode` tree, discarding all tags. -/
25+
partial def renderedCodeText : RenderedCode → String
26+
| .text s => s
27+
| .tag _ inner => renderedCodeText inner
28+
| .append xs => String.join (xs.toList.map renderedCodeText)
29+
30+
/--
31+
Convert a `RenderedCode` value (from doc-gen4's database) to a `Highlighted` value (for Verso's
32+
rendering pipeline). Tags are mapped as follows:
33+
34+
- `.keyword` → `Token.Kind.keyword` (no name or docs)
35+
- `.string` → `Token.Kind.str`
36+
- `.const name` → `Token.Kind.const` (with signature and docstring from `constInfo` if available)
37+
- `.sort` → `Token.Kind.sort` (no docs)
38+
- `.otherExpr` → plain `Highlighted.text` (no semantic info)
39+
40+
The `constInfo` parameter provides hover data for known constants: a map from `Name` to
41+
`(signature, docstring?)`.
42+
-/
43+
partial def renderedCodeToHighlighted (constInfo : Lean.NameMap (String × Option String) := {})
44+
: RenderedCode → Highlighted
45+
| .text s => .text s
46+
| .tag t inner =>
47+
let content := renderedCodeText inner
48+
match t with
49+
| .keyword => .token ⟨.keyword none none none, content⟩
50+
| .string => .token ⟨.str content, content⟩
51+
| .const name =>
52+
let (sig, doc?) := constInfo.find? name |>.getD ("", none)
53+
.token ⟨.const name sig doc? false none, content⟩
54+
| .sort _former => .token ⟨.sort none, content⟩
55+
| .otherExpr => renderedCodeToHighlighted constInfo inner
56+
| .append xs => .seq (xs.map (renderedCodeToHighlighted constInfo))
57+
58+
/-- Collect all constant names referenced in a `RenderedCode` tree. -/
59+
partial def renderedCodeConstNames (acc : Lean.NameSet := {}) : RenderedCode → Lean.NameSet
60+
| .text _ => acc
61+
| .tag (.const name) inner => renderedCodeConstNames (acc.insert name) inner
62+
| .tag _ inner => renderedCodeConstNames acc inner
63+
| .append xs => xs.foldl (init := acc) fun a x => renderedCodeConstNames a x
64+
65+
end Verso.Genre.Manual.DB

0 commit comments

Comments
 (0)