Commit f3511e6

elasticdotventures and b authored
docs: tighten README structure and sample references (#18)
* docs: clean up README and add sample fixtures
* docs: expand README diagrams and sample indexes
* docs: refine README audiences and CLI diagrams
* fix: link compliance gate artifacts in PR comments

---------

Co-authored-by: b <b@elastic.ventures>
1 parent 5a58b85 commit f3511e6

16 files changed

Lines changed: 758 additions & 467 deletions

.github/workflows/compliance-gate.yml

Lines changed: 17 additions & 1 deletion
@@ -86,6 +86,7 @@ jobs:
           category: compliance-gate

       - name: Upload evidence store artifacts
+        id: upload_evidence
         if: always()
         uses: actions/upload-artifact@v4
         with:
@@ -99,6 +100,9 @@ jobs:
       - name: Comment on PR with compliance results
         if: github.event_name == 'pull_request' && always()
         uses: actions/github-script@v7
+        env:
+          WORKFLOW_RUN_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
+          EVIDENCE_ARTIFACT_URL: ${{ steps.upload_evidence.outputs.artifact-url }}
         with:
           script: |
             const fs = require('fs');
@@ -133,6 +137,8 @@ jobs:
             const conditional = results.filter(r => r.status === 'conditional_pass').length;
             const total = results.length;
             const statusIcon = summary.gate_status === 'passed' ? '✅' : '❌';
+            const workflowRunUrl = process.env.WORKFLOW_RUN_URL;
+            const evidenceArtifactUrl = process.env.EVIDENCE_ARTIFACT_URL;

             const formatIssues = (title, issues) => {
               if (!issues || issues.length === 0) {
@@ -151,6 +157,13 @@ jobs:
               ).join('\n')
               : '| none | n/a | n/a |';

+            const links = [
+              `- [Workflow run](${workflowRunUrl})`,
+              evidenceArtifactUrl
+                ? `- [Compliance evidence artifact](${evidenceArtifactUrl})`
+                : '- Compliance evidence artifact: not available for this run',
+            ].join('\n');
+
             const comment = `## ${statusIcon} Compliance Gate Results

             **Summary:**
@@ -179,7 +192,10 @@ jobs:
             |-------------|--------|-------|
             ${detailRows}

-            See artifacts for complete SARIF reports, summary JSON, and evidence.
+            **Artifacts and run links:**
+            ${links}
+
+            The compliance evidence artifact contains the complete SARIF reports, summary JSON, and evidence bundle for this run.
             `;

             await github.rest.issues.createComment({

.gitignore

Lines changed: 38 additions & 0 deletions
@@ -211,3 +211,41 @@ __marimo__/

 # Evidence store - runtime artifacts from compliance gate evaluations
 evidence_store/
+
+# Logs
+logs
+npm-debug.log*
+yarn-debug.log*
+yarn-error.log*
+dev-debug.log
+# Dependency directories
+node_modules/
+# Environment variables
+# Editor directories and files
+.idea
+.vscode
+*.suo
+*.ntvs*
+*.njsproj
+*.sln
+*.sw?
+# OS specific
+.DS_Store
+
+# Task files
+# tasks.json
+# tasks/
+
+# Local agent/task scratch state
+.env.example
+.ralph/
+.ralphrc
+.taskmaster/
+tmp/
+Downloads/
+README.md.orig
+_b00t_/
+prd-reqif-ingest/
+ralph/.archive/
+tasks/aemo-stack.prompt.md
+*:Zone.Identifier

README-reqif-ingest-cli.md

Lines changed: 108 additions & 8 deletions
@@ -3,6 +3,24 @@
 This repo now includes a standalone ingestion surface at `reqif_ingest_cli/`.
 It keeps source-document intake separate from the existing `reqif_mcp` ReqIF parser and policy server.

+Executive view:
+
+- this is the deterministic artifact-to-ReqIF derivation pipeline
+- its purpose is traceable extraction, not policy judgement
+
+Engineer view:
+
+- use this surface when the starting point is an artifact such as XLSX, PDF, DOCX, or Markdown
+- use `reqif_mcp` when the starting point is already ReqIF
+
+```mermaid
+flowchart LR
+    ART[Source artifact] --> EXT[Deterministic extraction]
+    EXT --> DG[document_graph]
+    DG --> CAND[requirement_candidate]
+    CAND --> REQIF[Derived ReqIF]
+```
+
 ## Scope

 - Deterministic first pass only.
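
Reviewer sketch for the Engineer view added in this hunk: the choice between the two surfaces could be expressed as a small routing function. This is only an illustration. `choose_surface` and the suffix sets are hypothetical (the README names the formats, not the file extensions), and nothing here is taken from the actual `reqif_ingest_cli` code.

```python
from pathlib import Path

# Hypothetical suffix sets: the README lists XLSX, PDF, DOCX, and Markdown
# as artifact formats but does not say which file extensions are accepted.
ARTIFACT_SUFFIXES = {".xlsx", ".pdf", ".docx", ".md"}
REQIF_SUFFIXES = {".reqif"}


def choose_surface(path: str) -> str:
    """Return the surface the Engineer view suggests for a starting file."""
    suffix = Path(path).suffix.lower()
    if suffix in REQIF_SUFFIXES:
        # Already ReqIF: hand it to the existing parser and policy server.
        return "reqif_mcp"
    if suffix in ARTIFACT_SUFFIXES:
        # Source artifact: derive ReqIF through the ingestion pipeline.
        return "reqif_ingest_cli"
    raise ValueError(f"unsupported starting point: {path}")


if __name__ == "__main__":
    print(choose_surface("samples/aemo/The AESCSF v2 Core.xlsx"))
    print(choose_surface("evidence_store/toolkits/aemo/aescsf-core.reqif"))
```

Raising on unknown suffixes, rather than defaulting to one surface, keeps the routing explicit, which seems in line with the "deterministic first pass" scope.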
@@ -12,6 +30,15 @@ It keeps source-document intake separate from the existing `reqif_mcp` ReqIF par
 - ReqIF stays derived: `artifact -> document_graph -> requirement_candidate -> reqif`.
 - Optional LLM quality-eval hooks use an Azure Foundry/OpenAI-compatible adapter and are disabled by default.

+```mermaid
+flowchart TD
+    INPUT[Artifact input] --> HASH[Register hash and metadata]
+    HASH --> STRUCTURE[Extract structure]
+    STRUCTURE --> DISTILL[Deterministic distillation]
+    DISTILL --> OUTPUT[Emit derived ReqIF]
+    OUTPUT -. optional review .-> LLM[Foundry quality hook]
+```
+
 ## Commands

 Use the local justfile:
@@ -20,18 +47,28 @@ Use the local justfile:
 just -f reqif_ingest_cli/justfile --list
 ```

+```mermaid
+flowchart LR
+    TEST[test] --> LINT[lint]
+    LINT --> TYPE[typecheck]
+    TYPE --> ARTIFACT[artifact]
+    ARTIFACT --> EXTRACT[extract]
+    EXTRACT --> DISTILL[distill]
+    DISTILL --> EMIT[emit]
+```
+
 Common commands:

 ```bash
 just -f reqif_ingest_cli/justfile test
 just -f reqif_ingest_cli/justfile lint
 just -f reqif_ingest_cli/justfile typecheck

-just -f reqif_ingest_cli/justfile artifact "The AESCSF v2 Core.xlsx"
-just -f reqif_ingest_cli/justfile extract "The AESCSF v2 Core.xlsx"
-just -f reqif_ingest_cli/justfile distill "The AESCSF v2 Core.xlsx"
+just -f reqif_ingest_cli/justfile artifact "samples/aemo/The AESCSF v2 Core.xlsx"
+just -f reqif_ingest_cli/justfile extract "samples/aemo/The AESCSF v2 Core.xlsx"
+just -f reqif_ingest_cli/justfile distill "samples/aemo/The AESCSF v2 Core.xlsx"
 just -f reqif_ingest_cli/justfile emit \
-  "The AESCSF v2 Core.xlsx" \
+  "samples/aemo/The AESCSF v2 Core.xlsx" \
   "evidence_store/toolkits/aemo/aescsf-core.reqif" \
   auto \
   "AESCSF Core Derived Baseline"
@@ -49,26 +86,66 @@ just -f reqif_ingest_cli/justfile smoke-aemo-toolkit
 The standalone module runs directly with `uv`:

 ```bash
-uv run python -m reqif_ingest_cli register-artifact "The AESCSF v2 Core.xlsx" --pretty
-uv run python -m reqif_ingest_cli extract "The AESCSF v2 Core.xlsx" --pretty
-uv run python -m reqif_ingest_cli distill "The AESCSF v2 Core.xlsx" --pretty
+uv run python -m reqif_ingest_cli register-artifact "samples/aemo/The AESCSF v2 Core.xlsx" --pretty
+uv run python -m reqif_ingest_cli extract "samples/aemo/The AESCSF v2 Core.xlsx" --pretty
+uv run python -m reqif_ingest_cli distill "samples/aemo/The AESCSF v2 Core.xlsx" --pretty
 uv run python -m reqif_ingest_cli emit-reqif \
-  "The AESCSF v2 Core.xlsx" \
+  "samples/aemo/The AESCSF v2 Core.xlsx" \
   --title "AESCSF Core Derived Baseline" \
   --output "evidence_store/toolkits/aemo/aescsf-core.reqif" \
   --pretty
 uv run python -m reqif_ingest_cli foundry-config --pretty
 ```

+```mermaid
+sequenceDiagram
+    participant User
+    participant CLI as reqif_ingest_cli
+    participant FS as source artifact
+    participant Out as JSON / ReqIF output
+
+    User->>CLI: register-artifact
+    CLI->>FS: hash + inspect metadata
+    User->>CLI: extract
+    CLI->>Out: document_graph
+    User->>CLI: distill
+    CLI->>Out: requirement_candidate
+    User->>CLI: emit-reqif
+    CLI->>Out: derived ReqIF
+```
+
 ## What It Emits

 - `artifact/1`: immutable source hash, media type, file format, source path, profile
 - `document_graph/1`: sections, rows, paragraphs, anchors, semantic IDs
 - `requirement_candidate/1`: deterministic candidate text, rationale, rule ID, provenance
 - ReqIF XML: minimal derived baseline that round-trips through the current parser

+```mermaid
+flowchart LR
+    A[artifact/1] --> G[document_graph/1]
+    G --> C[requirement_candidate/1]
+    C --> R[ReqIF XML]
+```
+
+See also:
+
+- `samples/README.md`
+- `samples/aemo/README.md`
+- `samples/contracts/README.md`
+
 ## Current Profiles

+```mermaid
+flowchart LR
+    XLSX[XLSX] --> CORE[aescsf_core_v2]
+    XLSX --> TOOLKIT[aescsf_toolkit_v1_1]
+    XLSX --> GENERIC[generic_xlsx_table]
+    PDF[PDF] --> PDFP[pdf_docling_v1]
+    DOCX[DOCX] --> DOCXP[docx_docling_v1]
+    MD[Markdown] --> MDP[markdown_docling_v1]
+```
+
 - `aescsf_core_v2`
   - Detects the flat AESCSF core workbook layout.
   - Preserves paragraph chunks inside `Context and Guidance`.
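
Reviewer sketch for the `artifact/1` record and the format-to-profile diagram in this hunk: one way the "hash + inspect metadata" step could look in plain Python. This is a guess at the shape, not the CLI's real output. The function name, the dictionary keys, and the choice of SHA-256 are assumptions; only the profile names and the listed fields (hash, media type, file format, source path, profile) come from the README.

```python
import hashlib
import mimetypes
import tempfile
from pathlib import Path

# Candidate profiles per format, copied from the Current Profiles diagram.
PROFILES_BY_FORMAT = {
    "xlsx": ["aescsf_core_v2", "aescsf_toolkit_v1_1", "generic_xlsx_table"],
    "pdf": ["pdf_docling_v1"],
    "docx": ["docx_docling_v1"],
    "md": ["markdown_docling_v1"],
}


def sketch_artifact_record(path: str) -> dict:
    """Build a hypothetical artifact/1-style record for a source file."""
    source = Path(path)
    file_format = source.suffix.lower().lstrip(".")
    media_type, _ = mimetypes.guess_type(source.name)
    # SHA-256 is an assumption; the README only says "immutable source hash".
    digest = hashlib.sha256(source.read_bytes()).hexdigest()
    return {
        "schema": "artifact/1",
        "sha256": digest,
        "media_type": media_type or "application/octet-stream",
        "file_format": file_format,
        "source_path": str(source),
        # Real profile detection inspects content; this only lists candidates.
        "candidate_profiles": PROFILES_BY_FORMAT.get(file_format, []),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "example.pdf"
        sample.write_bytes(b"placeholder bytes, not a real PDF")
        print(sketch_artifact_record(str(sample)))
```

Hashing the raw bytes before any parsing means the record identifies the exact input even if later extraction stages change, which is presumably what "immutable" is meant to guarantee.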
@@ -102,3 +179,26 @@ uv run python -m reqif_ingest_cli foundry-config --pretty
 ```

 The adapter is for review and remapping only. It is not part of the deterministic first pass.
+
+```mermaid
+flowchart LR
+    DISTILL[Deterministic distillation] --> REVIEW{Foundry configured?}
+    REVIEW -- no --> DONE[Use deterministic output]
+    REVIEW -- yes --> QA[Quality review / remap hints]
+    QA --> DONE
+```
+
+## Current Gaps
+
+- no baseline diff command yet
+- no ingest MCP tool surface yet
+- AESCSF mappings are still code-first rather than externalized config
+- rich PDF structure extraction still depends on offline Docling model availability
+
+```mermaid
+flowchart LR
+    NOW[Current CLI] --> NEXT1[MCP tool surface]
+    NOW --> NEXT2[Baseline diffing]
+    NOW --> NEXT3[Externalized profile config]
+    NOW --> NEXT4[Richer offline PDF structure]
+```
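
Reviewer sketch for the "Foundry configured?" decision added in this hunk: the optional review step can be modelled as a hook that only annotates the deterministic candidates and never replaces them. Everything here is hypothetical (`finalize_candidates`, the `review_hints` key, the reviewer callable); the actual adapter interface is not shown in this commit.

```python
from typing import Callable, Optional

# A reviewer takes the deterministic candidates and returns one hint per
# candidate. Both the type and the hint format are illustrative assumptions.
Reviewer = Callable[[list[dict]], list[str]]


def finalize_candidates(
    candidates: list[dict], reviewer: Optional[Reviewer] = None
) -> list[dict]:
    """Return deterministic candidates, optionally annotated with hints."""
    if reviewer is None:
        # No hook configured: use the deterministic output unchanged.
        return candidates
    hints = reviewer(candidates)
    # Attach hints alongside the original fields; the candidate text itself
    # is never rewritten by the review step.
    return [{**c, "review_hints": h} for c, h in zip(candidates, hints)]


if __name__ == "__main__":
    deterministic = [{"text": "Example requirement text", "rule_id": "R1"}]
    print(finalize_candidates(deterministic))
    print(finalize_candidates(deterministic, lambda cs: ["looks fine"] * len(cs)))
```

Keeping the hook additive mirrors the README's statement that the adapter is for review and remapping only and is not part of the deterministic first pass.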
