From 9de3d54b3e08f672215e9c811112f45897df0fd8 Mon Sep 17 00:00:00 2001 From: Sebastian Ullrich Date: Fri, 12 Jun 2026 14:37:33 +0000 Subject: [PATCH 1/5] feat: tighten inductive FFI description --- Manual/Language/InductiveTypes.lean | 61 ++--------------------------- 1 file changed, 3 insertions(+), 58 deletions(-) diff --git a/Manual/Language/InductiveTypes.lean b/Manual/Language/InductiveTypes.lean index 995f02b6e..67dc1c714 100644 --- a/Manual/Language/InductiveTypes.lean +++ b/Manual/Language/InductiveTypes.lean @@ -423,6 +423,8 @@ tag := "inductive-types-trivial-wrappers" If an inductive type has exactly one constructor, and that constructor has exactly one run-time relevant parameter, then the inductive type is represented identically to its parameter. +For a public inductive type under the module system, any private information that prevents these conditions from being tested in other modules will disable this optimization even in the current module. + :::example "Zero-Overhead Subtypes" The structure {name}`Subtype` bundles an element of some type with a proof that it satisfies a predicate. Its constructor takes four arguments, but three of them are irrelevant: @@ -459,64 +461,7 @@ tag := "inductive-types-ffi" From the perspective of C, these other inductive types are represented by {C}`lean_object *`. Each constructor is stored as a {C}`lean_ctor_object`, and {C}`lean_is_ctor` will return true. -A {C}`lean_ctor_object` stores the constructor index in its header, and the fields are stored in the {C}`m_objs` portion of the object. -Lean assumes that {C}`sizeof(size_t) == sizeof(void*)`—while this is not guaranteed by C, the Lean run-time system contains an assertion that fails if this is not the case. - - -The memory order of the fields is derived from the types and order of the fields in the declaration. They are ordered as follows: - -* Non-scalar fields stored as {C}`lean_object *` -* Fields of type {lean}`USize` -* Other scalar fields, in decreasing order by size - -Within each group the fields are ordered in declaration order. *Warning*: Trivial wrapper types count as their underlying wrapped type for this purpose. - -* To access fields of the first kind, use {C}`lean_ctor_get(val, i)` to get the `i`th non-scalar field. -* To access {lean}`USize` fields, use {C}`lean_ctor_get_usize(val, n+i)` to get the {C}`i`th `USize` field and {C}`n` is the total number of fields of the first kind. -* To access other scalar fields, use {C}`lean_ctor_get_uintN(val, off)` or {C}`lean_ctor_get_usize(val, off)` as appropriate. Here `off` is the byte offset of the field in the structure, starting at {C}`n*sizeof(void*)` where `n` is the number of fields of the first two kinds. - -::::keepEnv - -For example, a structure such as -```lean -structure S where - ptr_1 : Array Nat - usize_1 : USize - sc64_1 : UInt64 - -- Wrappers of scalars count as scalars: - sc64_2 : { x : UInt64 // x > 0 } - sc64_3 : Float -- `Float` is 64 bit - sc8_1 : Bool - sc16_1 : UInt16 - sc8_2 : UInt8 - sc64_4 : UInt64 - usize_2 : USize - -- Trivial wrapper around `UInt32` - sc32_1 : Char - sc32_2 : UInt32 - sc16_2 : UInt16 -``` -would get re-sorted into the following memory order: - -* {name}`S.ptr_1`: {C}`lean_ctor_get(val, 0)` -* {name}`S.usize_1`: {C}`lean_ctor_get_usize(val, 1)` -* {name}`S.usize_2`: {C}`lean_ctor_get_usize(val, 2)` -* {name}`S.sc64_1`: {C}`lean_ctor_get_uint64(val, sizeof(void*)*3)` -* {name}`S.sc64_2`: {C}`lean_ctor_get_uint64(val, sizeof(void*)*3 + 8)` -* {name}`S.sc64_3`: {C}`lean_ctor_get_float(val, sizeof(void*)*3 + 16)` -* {name}`S.sc64_4`: {C}`lean_ctor_get_uint64(val, sizeof(void*)*3 + 24)` -* {name}`S.sc32_1`: {C}`lean_ctor_get_uint32(val, sizeof(void*)*3 + 32)` -* {name}`S.sc32_2`: {C}`lean_ctor_get_uint32(val, sizeof(void*)*3 + 36)` -* {name}`S.sc16_1`: {C}`lean_ctor_get_uint16(val, sizeof(void*)*3 + 40)` -* {name}`S.sc16_2`: {C}`lean_ctor_get_uint16(val, sizeof(void*)*3 + 42)` -* {name}`S.sc8_1`: {C}`lean_ctor_get_uint8(val, sizeof(void*)*3 + 44)` -* {name}`S.sc8_2`: {C}`lean_ctor_get_uint8(val, sizeof(void*)*3 + 45)` - -:::: - -::: TODO -Figure out how to test/validate/CI these statements -::: +The exact field layout is implementation-defined; constructor objects should only ever be created, and their fields accessed, by `@[export]`ing a corresponding function from Lean. # Mutual Inductive Types From 6961dadca44b5b2421e6ce8b6c888b44fe7511ed Mon Sep 17 00:00:00 2001 From: Sebastian Ullrich Date: Mon, 15 Jun 2026 10:12:38 +0200 Subject: [PATCH 2/5] Update Manual/Language/InductiveTypes.lean Co-authored-by: David Thrane Christiansen --- Manual/Language/InductiveTypes.lean | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/Manual/Language/InductiveTypes.lean b/Manual/Language/InductiveTypes.lean index 67dc1c714..8ba5d9eae 100644 --- a/Manual/Language/InductiveTypes.lean +++ b/Manual/Language/InductiveTypes.lean @@ -461,7 +461,9 @@ tag := "inductive-types-ffi" From the perspective of C, these other inductive types are represented by {C}`lean_object *`. Each constructor is stored as a {C}`lean_ctor_object`, and {C}`lean_is_ctor` will return true. -The exact field layout is implementation-defined; constructor objects should only ever be created, and their fields accessed, by `@[export]`ing a corresponding function from Lean. +The exact field layout is implementation-defined. +Constructor objects should only be created or unpacked by functions defined in Lean code. +These functions can be made available to C via the {attr}`export` attribute. # Mutual Inductive Types From 67759456825e25b8c305f830295c7dfbad222443 Mon Sep 17 00:00:00 2001 From: David Thrane Christiansen Date: Mon, 22 Jun 2026 08:21:52 +0200 Subject: [PATCH 3/5] Update text based on experience, experiment, and discussion --- Manual/Language/InductiveTypes.lean | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/Manual/Language/InductiveTypes.lean b/Manual/Language/InductiveTypes.lean index 8ba5d9eae..7ab71fe38 100644 --- a/Manual/Language/InductiveTypes.lean +++ b/Manual/Language/InductiveTypes.lean @@ -421,9 +421,17 @@ However, in cases where some representation is required (such as when they are a tag := "inductive-types-trivial-wrappers" %%% -If an inductive type has exactly one constructor, and that constructor has exactly one run-time relevant parameter, then the inductive type is represented identically to its parameter. +:::paragraph +An inductive type is a {deftech}[trivial wrapper] if it has has exactly one constructor and that constructor has exactly one run-time relevant parameter. +Trivial wrappers are represented identically to their constructor's parameter in the following circumstances: -For a public inductive type under the module system, any private information that prevents these conditions from being tested in other modules will disable this optimization even in the current module. + * The inductive type is private. + + * The type is public, and the {tech}[public scope] of the module in which it is defined contains enough information to determine that it is a trivial wrapper. + + * The type is defined in a source file that is not a {tech}[module]. + +::: :::example "Zero-Overhead Subtypes" The structure {name}`Subtype` bundles an element of some type with a proof that it satisfies a predicate. @@ -461,10 +469,12 @@ tag := "inductive-types-ffi" From the perspective of C, these other inductive types are represented by {C}`lean_object *`. Each constructor is stored as a {C}`lean_ctor_object`, and {C}`lean_is_ctor` will return true. -The exact field layout is implementation-defined. + +The exact layout of fields in a constructor object is implementation defined. Constructor objects should only be created or unpacked by functions defined in Lean code. These functions can be made available to C via the {attr}`export` attribute. - +Because the resulting C and Lean code call symbols defined in each other, they should be linked together. +Each C should be compiled to an object file using a custom target in Lake and added to the Lean library configuration's {tomlField}`moreLinkObjs` field. # Mutual Inductive Types %%% From 4244e49dc6f551c3bdae77e6a61a0f07bc27cbbc Mon Sep 17 00:00:00 2001 From: David Thrane Christiansen Date: Mon, 22 Jun 2026 09:28:01 +0200 Subject: [PATCH 4/5] fix: attribute parsing for incomplete attributes --- Manual/Language/InductiveTypes.lean | 2 +- Manual/Meta/Attribute.lean | 41 ++++++++++++++++++++++++++++- 2 files changed, 41 insertions(+), 2 deletions(-) diff --git a/Manual/Language/InductiveTypes.lean b/Manual/Language/InductiveTypes.lean index 7ab71fe38..eefb21c01 100644 --- a/Manual/Language/InductiveTypes.lean +++ b/Manual/Language/InductiveTypes.lean @@ -474,7 +474,7 @@ The exact layout of fields in a constructor object is implementation defined. Constructor objects should only be created or unpacked by functions defined in Lean code. These functions can be made available to C via the {attr}`export` attribute. Because the resulting C and Lean code call symbols defined in each other, they should be linked together. -Each C should be compiled to an object file using a custom target in Lake and added to the Lean library configuration's {tomlField}`moreLinkObjs` field. +Each C should be compiled to an object file using a custom target in Lake and added to the Lean library configuration's {tomlField Lake.LeanLibConfig}`moreLinkObjs` field. # Mutual Inductive Types %%% diff --git a/Manual/Meta/Attribute.lean b/Manual/Meta/Attribute.lean index 875205113..f93a65ee8 100644 --- a/Manual/Meta/Attribute.lean +++ b/Manual/Meta/Attribute.lean @@ -10,6 +10,7 @@ import Verso.Code.Highlighted import Manual.Meta.Basic import Manual.Meta.PPrint +import Manual.Meta.Syntax open Verso Doc Elab open Verso.Genre Manual @@ -30,6 +31,33 @@ namespace Manual def Inline.attr : Inline where name := `Manual.attr +open Lean Parser in +/-- +The syntax kind of the unique production in the `attr` category whose leading token is `tok`. + +Attributes such as `export` require arguments, so their bare names don't parse as complete attribute +syntax. When the leading token identifies a single production, this returns that production's kind, +which is enough to refer to the attribute and link to its documented syntax. +-/ +def attrSyntaxKind? (env : Environment) (tok : String) : Option Name := Id.run do + let some cat := getCategory (parserExtension.getState env).categories `attr + | return none + let tok := tok.trimAscii + let mut kinds : Array Name := #[] + for (tk, prods) in cat.tables.leadingTable do + let tkStr := + match tk with + | .str .anonymous s => s.trimAscii + | _ => tk.toString + if tkStr == tok then + for (p, _) in prods do + for (k, _) in (p.info.collectKinds {}).toList do + if cat.kinds.contains k && !kinds.contains k then + kinds := kinds.push k + match kinds with + | #[k] => some k + | _ => none + @[role_expander attr] def attr : RoleExpander | args, inlines => do @@ -41,7 +69,18 @@ def attr : RoleExpander let altStr ← parserInputString a match Parser.runParserCategory (← getEnv) `attr altStr (← getFileName) with - | .error e => throwErrorAt a e + | .error e => + -- Attributes whose syntax requires arguments (e.g. `export`) don't parse from their bare name. + -- When the name is a leading token, refer to that syntax and link to its docs. + match attrSyntaxKind? (← getEnv) a.getString with + | some kind => + let kindDoc ← findDocString? (← getEnv) kind + pure #[← `(Verso.Doc.Inline.other { Inline.keywordOf with + data := + ToJson.toJson (α := String × Option Name × Name × Option String) + ($(quote a.getString), $(quote (some `attr)), $(quote kind), $(quote kindDoc)) + } #[Verso.Doc.Inline.code $(quote a.getString)])] + | none => throwErrorAt a e | .ok stx => let attrName ← match stx.getKind with From aee4cb2e345e85a5c2822a355b4b0923b05f4daa Mon Sep 17 00:00:00 2001 From: David Thrane Christiansen Date: Mon, 22 Jun 2026 09:47:37 +0200 Subject: [PATCH 5/5] Clarification --- Manual/Language/InductiveTypes.lean | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Manual/Language/InductiveTypes.lean b/Manual/Language/InductiveTypes.lean index eefb21c01..4a58e4e9a 100644 --- a/Manual/Language/InductiveTypes.lean +++ b/Manual/Language/InductiveTypes.lean @@ -470,8 +470,8 @@ tag := "inductive-types-ffi" From the perspective of C, these other inductive types are represented by {C}`lean_object *`. Each constructor is stored as a {C}`lean_ctor_object`, and {C}`lean_is_ctor` will return true. -The exact layout of fields in a constructor object is implementation defined. -Constructor objects should only be created or unpacked by functions defined in Lean code. +There are no guarantees about the exact layout of fields in a constructor object; the compiler is free to select any layout. +Thus, constructor objects should only be created or unpacked by functions defined in Lean code. These functions can be made available to C via the {attr}`export` attribute. Because the resulting C and Lean code call symbols defined in each other, they should be linked together. Each C should be compiled to an object file using a custom target in Lake and added to the Lean library configuration's {tomlField Lake.LeanLibConfig}`moreLinkObjs` field.