Commit f44f712

test(schema): nine-domain round-trip + authorship provenance check
Extend the lift-surface verification from MARS-only to 10 domains: MARS + Transport + Accounting + SalesDistribution + Credit + Cost + ServiceManagement + WorkOrder + Compliance + Audit.

Coverage:

* 210 TTL files across 9 new domains (in addition to MARS's 29): all parse, all round-trip via the generic `assert_domain_roundtrip` helper. Zero unparseable shapes; zero round-trip failures.
* `ttl_emit::tests::nine_domains_lift_surface_round_trip` is the permanent regression gate. It pins each domain's expected TTL count, so an upstream re-vendor that changes the inventory fails the test.

Authorship verification (per the operator's suggestion to check `dcterms:creator`):

* WorkOrder is FULLY OURS: the sole authors are internal agent names (`bus-compiler`, `family-codec-smith`). The unusual `rdfs:Class`-as-verb convention is ours to revise toward `owl:ObjectProperty` without external coordination.
* Accounting is MIXED-AUTHORSHIP: Viktor Voss (23 files, original arago) + a prior session's Claude extension (11 files, our local additions).
* All other 7 domains are pure-upstream (single or few external human authors: chris.boos@almato.com, Marek Meyer, Peter Larem, Ola Irgens Kylling, Aymen Ayoub, …).

This makes WorkOrder the natural prototyping ground for new TTL predicates OGAR wants to ship before pitching them to OGIT upstream.

Doc updates:

* `docs/OGIT-DOMAIN-LIFT-CATALOGUE.md`: 10 rows promoted to Lift-tested with per-row authorship provenance. Adds a verification recipe (`§ Verifying domain authorship`) so future sessions can re-run the `dcterms:creator` scan in one Python heredoc.
* `.claude/board/EPIPHANIES.md`: new FINDING on author-provenance as a who-can-change-what discriminator.

Test footprint: 16/16 in ogar-from-schema (was 15/15; the new test adds one). Workspace-wide: nothing else touched.
1 parent cce8420 commit f44f712

3 files changed

Lines changed: 175 additions & 19 deletions

.claude/board/EPIPHANIES.md

Lines changed: 40 additions & 0 deletions
@@ -15,6 +15,46 @@
 
 ## Entries (newest first)
 
+## 2026-06-22 — Author provenance via `dcterms:creator` discriminates "ours to revise" from "upstream-coordinated"
+**Status:** FINDING
+**Scope:** OGIT NTO governance × multi-domain lift × who-can-change-what
+
+OGIT TTL files carry `dcterms:creator` on every subject. The field is
+free-form text but carries one of two semantic shapes in practice:
+
+- **Human author + email** (`chris.boos@almato.com`, `Viktor Voss`,
+  `fotto@arago.de`, `Marek Meyer`, `Peter Larem`, `Ola Irgens Kylling`,
+  …) — original arago/almato authors. Structural changes need upstream
+  coordination.
+- **Internal agent name** (`bus-compiler`, `family-codec-smith`,
+  `Claude (AdaWorldAPI/lance-graph 3-hop optim)`, …) — files authored
+  by our agent fleet against this org's forks. We are upstream for
+  these; structural changes need no external coordination.
+
+The 9-domain spot check (Transport, Accounting, SalesDistribution,
+Credit, Cost, ServiceManagement, WorkOrder, Compliance, Audit) revealed:
+
+- **WorkOrder is fully ours** — 100% internal-agent authorship (`bus-compiler`,
+  `family-codec-smith`). The unusual `rdfs:Class`-as-verb convention is
+  ours to revise toward standard `owl:ObjectProperty`-as-verb whenever
+  the AST predicate registry needs the WorkOrder verbs.
+- **Accounting is mixed-authorship** — Viktor Voss (23 files, original)
+  + a prior session's `Claude` extension (11 files). Structural changes
+  to the original 23 require upstream coordination; the 11 are ours.
+- **All other 7 domains are pure-upstream** — single-or-few external
+  human authors.
+
+This makes WorkOrder the **natural prototyping ground** for new TTL
+predicates OGAR wants to add: ship in WorkOrder first (no external
+coordination cost), validate the bijection, then pitch the pattern
+to OGIT upstream once it's proven.
+
+Evidence: the `dcterms:creator` provenance scan recipe lives in
+`docs/OGIT-DOMAIN-LIFT-CATALOGUE.md § Verifying domain authorship`;
+the round-trip stress test for the 9 domains is
+`ttl_emit::tests::nine_domains_lift_surface_round_trip` (zero failures
+on 210 TTLs across the nine).
+
 ## 2026-06-22 — Schema-vs-source duality: schemas lift structure bijectively; source ASTs lift behaviour best-effort; they cross-validate at the structural boundary
 **Status:** FINDING
 **Scope:** producer architecture × MARS calibration × Foundry-Odoo lens × the bardioc migration
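
The ownership rule in the new FINDING (all-internal creators means "ours", none means "upstream", anything in between means "mixed") can be sketched as a small classifier over a per-domain creator histogram. This is a minimal illustration, not code from the repository: `INTERNAL_PREFIXES`, `is_internal`, and `classify_domain` are hypothetical names, the prefix list contains only the agent names quoted in the finding, and the WorkOrder per-author counts are invented for the example (only Accounting's 23/11 split comes from the commit).

```python
from collections import Counter

# Assumption: an allow-list built from the agent names quoted in the finding.
# The real fleet may use more names; this list is illustrative only.
INTERNAL_PREFIXES = ("bus-compiler", "family-codec-smith", "Claude")


def is_internal(creator: str) -> bool:
    """True when a dcterms:creator value starts with a known agent name."""
    return creator.strip().startswith(INTERNAL_PREFIXES)


def classify_domain(creators: Counter) -> str:
    """Map a domain's creator histogram to an ownership verdict."""
    total = sum(creators.values())
    if total == 0:
        return "unknown"
    internal = sum(n for name, n in creators.items() if is_internal(name))
    if internal == total:
        return "ours"
    if internal == 0:
        return "upstream"
    return "mixed"


# Accounting's 23/11 split is taken from the finding.
accounting = Counter({
    "Viktor Voss": 23,
    "Claude (AdaWorldAPI/lance-graph 3-hop optim)": 11,
})
# Hypothetical per-author counts; the finding only states both are internal.
work_order = Counter({"bus-compiler": 20, "family-codec-smith": 7})
audit = Counter({"Marek Meyer": 3})

for name, hist in [("Accounting", accounting), ("WorkOrder", work_order), ("Audit", audit)]:
    print(f"{name:<12} {classify_domain(hist)}")
```

A prefix match is used because the `Claude (...)` creator values carry a free-form suffix; an exact-match set would miss them.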

crates/ogar-from-schema/src/ttl_emit.rs

Lines changed: 90 additions & 8 deletions
@@ -276,20 +276,102 @@ mod tests {
     /// predicate.
     #[test]
     fn all_mars_ttl_files_roundtrip() {
+        let stats = assert_domain_roundtrip("MARS");
+        // 29 .ttl files in NTO/MARS at the SHA pinned by PROVENANCE.md.
+        assert!(
+            stats.total >= 29,
+            "expected ≥ 29 TTL files in MARS, got {}",
+            stats.total
+        );
+    }
+
+    /// Generic helper that walks `vocab/imports/ogit/NTO/<domain>/`,
+    /// dispatches each TTL to the right parser (`parse_file` for entities
+    /// and datatype attributes, `crate::sgo::parse_verb` for in-domain
+    /// `owl:ObjectProperty` verbs), and asserts semantic round-trip.
+    /// Returns per-shape counts so callers can sanity-check the lift
+    /// surface they're claiming.
+    fn assert_domain_roundtrip(domain: &str) -> DomainStats {
         let dir = std::path::Path::new(env!("CARGO_MANIFEST_DIR"))
-            .join("../../vocab/imports/ogit/NTO/MARS");
-        let mut checked = 0usize;
+            .join("../../vocab/imports/ogit/NTO")
+            .join(domain);
+        let mut stats = DomainStats::default();
         for entry in walk_ttl(&dir) {
             let src = std::fs::read_to_string(&entry).expect("read");
+            stats.total += 1;
             match parse_file(&src) {
-                Some(TtlDeclaration::Entity(_)) => assert_entity_roundtrip(&src),
-                Some(TtlDeclaration::DatatypeAttribute(_)) => assert_attribute_roundtrip(&src),
-                None => {}
+                Some(TtlDeclaration::Entity(_)) => {
+                    assert_entity_roundtrip(&src);
+                    stats.entities += 1;
+                }
+                Some(TtlDeclaration::DatatypeAttribute(_)) => {
+                    assert_attribute_roundtrip(&src);
+                    stats.attributes += 1;
+                }
+                None => {
+                    // Try the verb path — some NTO domains carry their own
+                    // in-domain `owl:ObjectProperty` verbs (Transport,
+                    // Accounting, Credit, Compliance) alongside SGO's
+                    // upstream-shared vocabulary.
+                    let Some(once) = crate::sgo::parse_verb(&src) else {
+                        panic!(
+                            "TTL has no recognised subject type in {domain}: {}",
+                            entry.display()
+                        );
+                    };
+                    let emitted = crate::sgo::emit_verb(&once);
+                    let twice = crate::sgo::parse_verb(&emitted).expect("re-parse verb");
+                    assert_eq!(
+                        once,
+                        twice,
+                        "verb round-trip lost a predicate in {domain}: {}",
+                        entry.display()
+                    );
+                    stats.verbs += 1;
+                }
             }
-            checked += 1;
         }
-        // 29 .ttl files in NTO/MARS at the SHA pinned by PROVENANCE.md.
-        assert!(checked >= 29, "expected ≥ 29 TTL files, got {checked}");
+        stats
+    }
+
+    #[derive(Debug, Default)]
+    struct DomainStats {
+        total: usize,
+        entities: usize,
+        attributes: usize,
+        verbs: usize,
+    }
+
+    /// Cross-domain bijection coverage. Each row is one of the nine
+    /// domains the operator asked OGAR to verify before promoting
+    /// the lift surface from MARS-only to multi-domain. If any of
+    /// these fails, the producer can't land on that domain without
+    /// extending `EntityDecl` / `AttributeDecl` / `VerbDecl` first.
+    ///
+    /// Counts are also a sanity check on the inventory — they prove
+    /// the catalogue's per-domain numbers match what's actually in
+    /// `vocab/imports/`.
+    #[test]
+    fn nine_domains_lift_surface_round_trip() {
+        for (domain, expected_total) in [
+            ("Transport", 27),
+            ("Accounting", 36),
+            ("SalesDistribution", 23),
+            ("Credit", 21),
+            ("Cost", 5),
+            ("ServiceManagement", 59),
+            ("WorkOrder", 27),
+            ("Compliance", 9),
+            ("Audit", 3),
+        ] {
+            let stats = assert_domain_roundtrip(domain);
+            assert_eq!(
+                stats.total, expected_total,
+                "{domain}: TTL count drifted from inventory \
+                 (expected {expected_total}, got {})",
+                stats.total
+            );
+        }
     }
 
     fn walk_ttl(root: &std::path::Path) -> Vec<std::path::PathBuf> {
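
The shape of the gate added in this hunk (parse, emit, re-parse, compare, then pin the per-domain file count) can be sketched outside Rust. The following is a toy model under stated assumptions, not the crate's implementation: a one-line `kind:name` string stands in for a TTL file, `parse`, `emit`, and `check_inventory` are hypothetical stand-ins for `parse_file` / `crate::sgo::parse_verb` / `emit_verb` and the test loop, and the domain names and counts are invented.

```python
from dataclasses import dataclass


@dataclass
class DomainStats:
    total: int = 0
    entities: int = 0
    attributes: int = 0
    verbs: int = 0


# Hypothetical toy codec: one "kind:name" line stands in for one TTL file.
_COUNTER_FOR_KIND = {"entity": "entities", "attribute": "attributes", "verb": "verbs"}


def parse(src):
    """Return (kind, name), or None when the subject type is not recognised."""
    kind, _, name = src.strip().partition(":")
    if kind not in _COUNTER_FOR_KIND or not name:
        return None
    return (kind, name)


def emit(decl):
    kind, name = decl
    return f"{kind}:{name}\n"


def assert_domain_roundtrip(files):
    """Round-trip every file once and return per-shape counts."""
    stats = DomainStats()
    for path, src in sorted(files.items()):
        stats.total += 1
        once = parse(src)
        if once is None:
            raise AssertionError(f"no recognised subject type: {path}")
        twice = parse(emit(once))
        assert once == twice, f"round-trip lost information: {path}"
        counter = _COUNTER_FOR_KIND[once[0]]
        setattr(stats, counter, getattr(stats, counter) + 1)
    return stats


def check_inventory(domains, expected_totals):
    """Fail when a domain's file count differs from its pinned value."""
    for name, expected in expected_totals.items():
        stats = assert_domain_roundtrip(domains[name])
        assert stats.total == expected, (
            f"{name}: TTL count drifted (expected {expected}, got {stats.total})"
        )


# Invented demo inventory; not the real NTO contents.
domains = {
    "DemoSmall": {"entities/a.ttl": "entity:Report"},
    "DemoMixed": {"entities/x.ttl": "entity:Rule", "verbs/y.ttl": "verb:violates"},
}
check_inventory(domains, {"DemoSmall": 1, "DemoMixed": 2})
print(assert_domain_roundtrip(domains["DemoMixed"]))
```

Pinning with equality rather than `>=` is what makes the count a drift detector: an added or removed file fails the check in either direction, whereas the MARS test's lower bound only catches removals.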

docs/OGIT-DOMAIN-LIFT-CATALOGUE.md

Lines changed: 45 additions & 11 deletions
@@ -17,12 +17,46 @@
 | **Production** | A consumer deployment exercises the lifted form (see `DOMAIN-INSTANCES.md`) |
 
 A domain advances Imported → Lift-tested → Cross-walked → Production
-left-to-right. All 72 are Imported today (just landed). MARS is the
-first Lift-tested. OpenProject/Odoo/Healthcare are Production via the
+left-to-right. All 72 are Imported today (just landed). The following
+**10 domains are Lift-tested** (round-trip mechanically enforced by
+`ttl_emit::tests::nine_domains_lift_surface_round_trip` +
+`all_mars_ttl_files_roundtrip`): MARS, Transport, Accounting,
+SalesDistribution, Credit, Cost, ServiceManagement, WorkOrder,
+Compliance, Audit. OpenProject/Odoo/Healthcare are Production via the
 existing canonical concept work but use the source-AST lift, not the
 schema lift — they enter Lift-tested when their TTLs are added to the
 round-trip stress test.
 
+## Verifying domain authorship (who can change what)
+
+Provenance is `dcterms:creator` on each TTL. Run:
+
+```bash
+python3 - <<'PY'
+import os, re
+from collections import Counter
+creator_re = re.compile(r'dcterms:creator\s+"([^"]+)"')
+for d in sorted(os.listdir('vocab/imports/ogit/NTO')):
+    root = f'vocab/imports/ogit/NTO/{d}'
+    authors = Counter()
+    for r,_,fs in os.walk(root):
+        for f in fs:
+            if not f.endswith('.ttl'): continue
+            with open(os.path.join(r,f)) as fh:
+                for m in creator_re.finditer(fh.read()):
+                    authors[m.group(1)] += 1
+    if not authors: continue
+    top = ', '.join(f'{a} ({c})' for a,c in authors.most_common(5))
+    print(f'{d:<28} {top}')
+PY
+```
+
+Internal-agent authors (`bus-compiler`, `family-codec-smith`, `Claude
+(...)`, etc.) signal "our extension — we can revise without external
+coordination." External authors (`chris.boos@almato.com`, `Viktor Voss`,
+`fotto@arago.de`, …) signal "upstream-owned — structural changes need
+arago/almato coordination."
+
 ## How to add a new domain to the lift
 
 1. **Verify import**`ls vocab/imports/ogit/NTO/<Domain>/`. If
@@ -44,16 +78,16 @@ round-trip stress test.
 
 | Domain | Entities | Attributes | Verbs | Status | Notes |
 |---|--:|--:|--:|---|---|
-| `Accounting` | 9 | 20 | 7 | Imported | Covered conceptually via `0x02XX` commerce/ERP via Odoo lift |
+| `Accounting` | 9 | 20 | 7 | Lift-tested | Mixed-authorship: `Viktor Voss` (23 files, original arago/almato) + a prior session's extension (`Claude (AdaWorldAPI/lance-graph 3-hop optim)`, 11 files). Covered conceptually via `0x02XX` commerce/ERP via Odoo lift. Structural changes to the original 23 need upstream coordination; the 11 extensions are ours. |
 | `Advertising` | 16 | 0 | 0 | Imported | |
-| `Audit` | 3 | 0 | 0 | Imported | Audit-as-Lance-version (ADR-013) covers the semantics |
+| `Audit` | 3 | 0 | 0 | Lift-tested | `Marek Meyer` (sole author) — pure upstream. Audit-as-Lance-version (ADR-013) covers the semantics. |
 | `Auth` | 13 | 24 | 6 | Imported | Cross-walk to `0x0BXX` auth domain (Zitadel/Zanzibar) queued |
 | `Automation` | 22 | 105 | 0 | Imported | OLD `marsNodeType` superseded by `NTO/MARS/` |
 | `Botany` | 2 | 0 | 0 | Imported | |
 | `ClassificationStandard` | 2 | 5 | 2 | Imported | |
-| `Compliance` | 1 | 4 | 4 | Imported | |
-| `Cost` | 5 | 0 | 0 | Imported | |
-| `Credit` | 12 | 0 | 9 | Imported | |
+| `Compliance` | 1 | 4 | 4 | Lift-tested | `chris.boos@almato.com` (sole author) — pure upstream |
+| `Cost` | 5 | 0 | 0 | Lift-tested | `Peter Larem` (sole author) — pure upstream |
+| `Credit` | 12 | 0 | 9 | Lift-tested | `Ola Irgens Kylling` (sole author, 21 files) — pure upstream; capitalised `Entities/` + `Verbs/` dirs (content-driven parser is dir-case-agnostic) |
 | `CustomerSupport` | 7 | 31 | 2 | Imported | |
 | `Data` | 1 | 1 | 0 | Imported | |
 | `DataProcessing` | 2 | 6 | 0 | Imported | |
@@ -104,18 +138,18 @@ round-trip stress test.
 | `RPA` | 6 | 1 | 1 | Imported | |
 | `Religion` | 1 | 0 | 0 | Imported | |
 | `SaaS` | 10 | 12 | 0 | Imported | |
-| `SalesDistribution` | 12 | 11 | 0 | Imported | |
+| `SalesDistribution` | 12 | 11 | 0 | Lift-tested | `Marek Meyer` (sole author, 23 files) — pure upstream |
 | `Schedule` | 5 | 7 | 0 | Imported | |
 | `Security` | 2 | 0 | 0 | Imported | |
-| `ServiceManagement` | 17 | 42 | 0 | Imported | MARS Machine `generates` Log/Timeseries lands here |
+| `ServiceManagement` | 17 | 42 | 0 | Lift-tested | 8 distinct authors led by `Peter Larem` (42 files); pure upstream. MARS Machine `generates` Log/Timeseries lands here. |
 | `SharePoint` | 0 | 2 | 0 | Imported | Attributes-only |
 | `Software` | 5 | 0 | 0 | Imported | Distinct from `NTO/MARS/Software/` — this is a software-engineering vocabulary |
 | `Statistics` | 1 | 0 | 0 | Imported | |
 | `Survey` | 3 | 0 | 0 | Imported | |
-| `Transport` | 5 | 14 | 8 | Imported | |
+| `Transport` | 5 | 14 | 8 | Lift-tested | `chris.boos@almato.com` (sole author, 27 files) — pure upstream-arago |
 | `UserMeta` | 4 | 0 | 4 | Imported | |
 | `Version` | 0 | 3 | 0 | Imported | Used by MARS Machine for OS version |
-| `WorkOrder` | 15 | 0 | 12 | Imported | Covered conceptually by WoA (`0x0003`) |
+| **`WorkOrder`** | 27 | 0 | 0 | **Lift-tested** | **Our extension** (`dcterms:creator` = `bus-compiler` + `family-codec-smith` — internal agent authors, zero external). Authored for `woa-rs`. All 27 TTLs declared as `rdfs:Class`, including the 12 in `verbs/` (unusual `rdfs:Class`-as-verb convention). Round-trips cleanly. **Since we're upstream**, the verb files can be re-authored as `owl:ObjectProperty` for the AST predicate registry without external coordination. Previous catalogue row split 15 entities + 12 verbs by directory; content-driven count is 27 entities (what `ogar-from-schema` actually sees). |
 | **TOTALS** | **549** | **599** | **241** || + 42 other (Medical sql_mirror, etc.) |
 
 ## Adjacent imports (not NTO)
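
The WorkOrder row changes from 15 entities + 12 verbs to 27 entities because the count is now driven by each file's declared type rather than by its directory. A minimal sketch of that difference follows, assuming each TTL declares its subject with `a rdfs:Class`, `a owl:DatatypeProperty`, or `a owl:ObjectProperty`. The regex, the helper names, and the two sample files (including their subject names) are hypothetical; the real classification is done by the Rust `parse_file` and `parse_verb` functions.

```python
import re
from collections import Counter

# Assumption: these three declared types map onto the catalogue's columns.
SHAPE_BY_TYPE = {
    "rdfs:Class": "entity",
    "owl:DatatypeProperty": "attribute",
    "owl:ObjectProperty": "verb",
}
TYPE_RE = re.compile(r"\ba\s+(rdfs:Class|owl:DatatypeProperty|owl:ObjectProperty)\b")

# Lower-cased directory names, so `Entities/` and `entities/` count alike.
SHAPE_BY_DIR = {"entities": "entity", "attributes": "attribute", "verbs": "verb"}


def shape_of(ttl_text):
    """Classify one TTL by its first declared type, or None if unrecognised."""
    match = TYPE_RE.search(ttl_text)
    return SHAPE_BY_TYPE[match.group(1)] if match else None


def count_by_content(files):
    return Counter(shape_of(text) for text in files.values())


def count_by_directory(files):
    return Counter(SHAPE_BY_DIR.get(path.split("/")[0].lower()) for path in files)


# Hypothetical WorkOrder-style files: both declare rdfs:Class, one lives in verbs/.
sample = {
    "entities/Task.ttl": "ogit.WorkOrder:Task\n    a rdfs:Class;\n",
    "verbs/assigns.ttl": "ogit.WorkOrder:assigns\n    a rdfs:Class;\n",
}

print("by directory:", dict(count_by_directory(sample)))
print("by content:  ", dict(count_by_content(sample)))
```

Under these assumptions the directory view reports one entity and one verb, while the content view reports two entities, which mirrors why the row's verb column drops to 0.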
