Add governance artifact schemas, validator, samples, tests, and CI workflow #101
Reviewer's Guide
Introduces JSON Schemas and sample artifacts for BBOM and ARRE governance records, a Python-based validator CLI with schema and semantic checks plus JSON reporting, unit tests around validator behavior, CI workflow wiring validation into PRs, and documentation for running the validator locally and in CI.
Flow diagram for governance artifact validator CLI
flowchart LR
A[Start main] --> B[parse_args]
B --> C[run_validation]
C --> D[load_json bbom.schema.json]
C --> E[load_json arre_record.schema.json]
D --> F{schema load ok?}
E --> F
F -- no --> G[Set fatal_error schema_load_failure in summary]
G --> H[Return errors and summary]
F -- yes --> I[get_artifact_sets]
I --> J{Any BBOM/ARRE files?}
J -- no --> K[Populate errors in summary and set exit_code 2]
K --> H
J -- yes --> L[validate_file for each BBOM]
L --> M[validate_with_schema + validate_bbom_semantics]
M --> N[Update bbom_files_checked / bbom_failed]
N --> O[validate_file for each ARRE]
O --> P[validate_with_schema + validate_arre_semantics]
P --> Q[Update arre_files_checked / arre_failed]
Q --> R[Finalize summary status and exit_code]
R --> H
H --> S{--report-file provided?}
S -- yes --> T[write_report]
S -- no --> U[Skip report]
T --> V[Print result & return exit code]
U --> V
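The flow in the diagram above can be sketched in plain Python. This is a minimal illustration, not the PR's implementation: `load_json`, `run_validation`, and `write_report` take their names from the flowchart, while the injected `validate` callable, the summary dictionary layout, and the exit code used for a schema load failure are assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def load_json(path: Path) -> dict:
    return json.loads(path.read_text(encoding="utf-8"))


def run_validation(
    schema_dir: Path,
    bbom_files: Iterable[Path],
    arre_files: Iterable[Path],
    validate: Callable[[dict, dict], None],  # hypothetical: raises ValueError on failure
) -> tuple[list[str], dict]:
    errors: list[str] = []
    summary = {"status": "fail", "exit_code": 2,  # assumed default until proven clean
               "bbom_files_checked": 0, "bbom_failed": 0,
               "arre_files_checked": 0, "arre_failed": 0}
    try:
        schemas = {
            "bbom": load_json(schema_dir / "bbom.schema.json"),
            "arre": load_json(schema_dir / "arre_record.schema.json"),
        }
    except (OSError, json.JSONDecodeError) as exc:
        summary["fatal_error"] = "schema_load_failure"
        errors.append(f"schema load failed: {exc}")
        return errors, summary

    artifact_sets = {"bbom": list(bbom_files), "arre": list(arre_files)}
    if not any(artifact_sets.values()):
        errors.append("no BBOM/ARRE files found")
        return errors, summary

    for kind, files in artifact_sets.items():
        for file in files:
            summary[f"{kind}_files_checked"] += 1  # counts every attempted file
            try:
                validate(load_json(file), schemas[kind])
            except (OSError, ValueError) as exc:  # JSONDecodeError is a ValueError
                summary[f"{kind}_failed"] += 1
                errors.append(f"{file.name}: {exc}")

    if not errors:
        summary.update(status="pass", exit_code=0)
    return errors, summary


def write_report(report_file: Path, errors: list[str], summary: dict) -> None:
    # Creating the parent directory here avoids depending on a prior mkdir step.
    report_file.parent.mkdir(parents=True, exist_ok=True)
    payload = {"summary": summary, "errors": errors}
    report_file.write_text(json.dumps(payload, indent=2), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # No schema files exist yet, so the first run takes the fatal-error branch.
    _, missing = run_validation(root, [], [], validate=lambda data, schema: None)

    for name in ("bbom.schema.json", "arre_record.schema.json"):
        (root / name).write_text("{}", encoding="utf-8")
    bbom = root / "bbom.json"
    bbom.write_text('{"id": "demo"}', encoding="utf-8")
    errors, summary = run_validation(root, [bbom], [], validate=lambda data, schema: None)
    report_path = root / ".reports" / "governance-validation.json"
    write_report(report_path, errors, summary)
    report = load_json(report_path)
```

Passing the validation step in as a callable keeps the sketch free of third-party dependencies; the real tool presumably calls its schema and semantic validators directly.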
📝 Walkthrough
This PR introduces a complete governance artifact validation system. It adds JSON Schema definitions for BBOM (Behavioral Bill of Materials) and ARRE (regulator-facing control records), implements a Python CLI validator tool that performs schema and semantic validation, provides comprehensive test coverage, and integrates validation into CI. Supporting documentation includes a governance blueprint and operational quickstart.
Changes
Governance Artifact Validation System
🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Hey - I've found 3 issues, and left some high level feedback:
- In `validate_file` you only increment the `*_files_checked` counters on successful validation, which means the summary under-reports the number of files that were actually processed; consider incrementing the checked counters for both pass and fail paths so the report reflects total attempted validations.
- The validator currently prints `OK BBOM/ARRE: ...` inside `validate_file`, which is called from `run_validation`; extracting this user-facing logging to `main` (or making it optional via a verbosity flag) would keep the core validation API quieter and easier to reuse programmatically.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `validate_file` you only increment the `*_files_checked` counters on successful validation, which means the summary under-reports the number of files that were actually processed; consider incrementing the checked counters for both pass and fail paths so the report reflects total attempted validations.
- The validator currently prints `OK BBOM/ARRE: ...` inside `validate_file`, which is called from `run_validation`; extracting this user-facing logging to `main` (or making it optional via a verbosity flag) would keep the core validation API quieter and easier to reuse programmatically.
## Individual Comments
### Comment 1
<location path="tools/validate_ai_governance_artifacts.py" line_range="179-182" />
<code_context>
+ label: str,
+) -> None:
+ try:
+ data = load_json(file)
+ validate_with_schema(data, file.name, schema)
+ semantic_validator(data, file.name)
+ summary[counter_key] += 1
+ summary["passed_files"].append(display_path(file))
+ print(f"OK {label}: {display_path(file)}")
</code_context>
<issue_to_address>
**issue (bug_risk):** Failed files are not included in the `*_files_checked` counters, which can make the summary misleading.
Since `summary[counter_key] += 1` is inside the `try`, any file that fails schema or semantic validation is excluded from the `*_files_checked` count and only appears in the `*_failed` metrics. This makes these counters represent “files passed” instead of “files processed.” Increment this counter outside the `try` (e.g., in a `finally` block or before validation) so it tracks all processed files, and keep using the existing failure metrics to represent outcomes.
</issue_to_address>
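One way to apply the suggestion in Comment 1 is to increment the checked counter before the `try`, so it runs on both the pass and fail paths. The sketch below is a reduced stand-in, not the PR's function: the single `check` callable, the local `ValidationError` class, and the plain-dict summary are assumptions introduced to keep the example self-contained.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


class ValidationError(Exception):
    """Stand-in for the validator's own error type."""


def validate_file(
    file: Path,
    check: Callable[[dict], None],  # hypothetical: schema + semantic checks combined
    summary: dict,
    errors: list,
    counter_key: str,
    failed_counter_key: str,
) -> None:
    # Count the attempt up front so the total covers passes and failures alike.
    summary[counter_key] += 1
    try:
        data = json.loads(file.read_text(encoding="utf-8"))
        check(data)
    except (ValidationError, json.JSONDecodeError) as exc:
        summary[failed_counter_key] += 1
        errors.append(f"{file.name}: {exc}")
        return
    summary["passed_files"].append(file.name)


summary = {"bbom_files_checked": 0, "bbom_failed": 0, "passed_files": []}
errors: list = []
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.json"
    good.write_text('{"ok": true}', encoding="utf-8")
    bad = Path(tmp) / "bad.json"
    bad.write_text("{not json", encoding="utf-8")
    for path in (good, bad):
        validate_file(path, lambda data: None, summary, errors,
                      "bbom_files_checked", "bbom_failed")
```

With one valid and one malformed file, the checked counter reflects both attempts while the failed counter and `passed_files` record the outcomes separately.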
### Comment 2
<location path="tools/validate_ai_governance_artifacts.py" line_range="85-89" />
<code_context>
+ hazard = data.get("hazard_scores", {})
+ thresholds = data.get("acceptance_thresholds", {})
+
+ if "max_deception" in thresholds and hazard.get("deception", 0) > thresholds["max_deception"]:
+ raise ValidationError(f"{data_name} violates max_deception threshold")
+ if "max_exfiltration" in thresholds and hazard.get("exfiltration", 0) > thresholds["max_exfiltration"]:
+ raise ValidationError(f"{data_name} violates max_exfiltration threshold")
+ if "max_jailbreak" in thresholds and hazard.get("jailbreak", 0) > thresholds["max_jailbreak"]:
+ raise ValidationError(f"{data_name} violates max_jailbreak threshold")
+
</code_context>
<issue_to_address>
**issue (bug_risk):** Type errors in BBOM hazard/threshold values will raise unhandled exceptions instead of a `ValidationError`.
If `hazard_scores` or `acceptance_thresholds` contain non‑numeric values, these `>` comparisons will raise a `TypeError` instead of a `ValidationError`, bypassing the intended error reporting and summary tracking. Unless the schema strictly enforces numeric types, these checks should be wrapped so type issues are caught and re-raised as `ValidationError`s (similar to the handling of invalid dates in `validate_arre_semantics`).
</issue_to_address>
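A possible shape for the fix in Comment 2 is shown below. It uses an explicit numeric check rather than catching `TypeError`, because two string values compare with `>` without raising and would slip past a `try`/`except`. The helper names `_as_number`, `check_hazard_thresholds`, and `fails`, and the local `ValidationError` class, are hypothetical; only the `hazard_scores` and `acceptance_thresholds` keys come from the quoted code.

```python
class ValidationError(Exception):
    """Stand-in for the validator's own error type."""


# Threshold key -> hazard score key, mirroring the three checks in the quoted code.
THRESHOLD_TO_SCORE = {
    "max_deception": "deception",
    "max_exfiltration": "exfiltration",
    "max_jailbreak": "jailbreak",
}


def _as_number(value: object, field: str, data_name: str) -> float:
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValidationError(f"{data_name} has non-numeric {field}: {value!r}")
    return value


def check_hazard_thresholds(data: dict, data_name: str) -> None:
    hazard = data.get("hazard_scores", {})
    thresholds = data.get("acceptance_thresholds", {})
    for threshold_key, score_key in THRESHOLD_TO_SCORE.items():
        if threshold_key not in thresholds:
            continue
        limit = _as_number(thresholds[threshold_key], threshold_key, data_name)
        score = _as_number(hazard.get(score_key, 0), score_key, data_name)
        if score > limit:
            raise ValidationError(f"{data_name} violates {threshold_key} threshold")


def fails(data: dict) -> bool:
    """Return True when the record is rejected with a ValidationError."""
    try:
        check_hazard_thresholds(data, "sample.json")
    except ValidationError:
        return True
    return False
```

If the schema already enforces numeric types for these fields, the extra check is redundant but harmless; it mainly protects callers that invoke the semantic validator without schema validation first.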
### Comment 3
<location path="AGI_ASI_GSIFI_Blueprint_2026_2030.md" line_range="30" />
<code_context>
+## 1) Design principles for 2026–2030
+
+1. **Safety-critical, not feature-critical.** Frontier AI touching critical banking functions is a safety-critical system.
+2. **Systemic externality mindset.** G‑SIFIs must evaluate institution risk *and* network contagion risk.
+3. **Containment-first scaling.** Capability growth is gated by containment maturity.
+4. **Evidence-by-construction.** Controls must emit machine-readable supervisory evidence continuously.
</code_context>
<issue_to_address>
**issue (typo):** Consider changing "institution risk" to "institutional risk" for grammatical correctness.
"Institution risk" sounds awkward here; "institutional risk" is more natural English and better aligned with standard risk-management terminology.
```suggestion
2. **Systemic externality mindset.** G‑SIFIs must evaluate institutional risk *and* network contagion risk.
```
</issue_to_address>
Not up to standards ⛔
🔴 Issues
| Category | Results |
|---|---|
| Compatibility | 3 medium |
| UnusedCode | 12 medium |
| BestPractice | 2 minor |
| Documentation | 6 minor |
| ErrorProne | 4 high |
| Security | 64 high |
| CodeStyle | 6 minor |
| Complexity | 1 medium 2 minor |
🟢 Metrics
| Metric | Results |
|---|---|
| Complexity | 63 |
| Duplication | 10 |
Actionable comments posted: 7
🧹 Nitpick comments (3)
tools/validate_ai_governance_artifacts.py (1)
168-177: ⚡ Quick win
Reduce `validate_file` argument count to satisfy Pylint `too-many-arguments` / `too-many-positional-arguments`.
The 8-parameter signature fails CI (R0913/R0917). The `schema`, `semantic_validator`, `counter_key`, `failed_counter_key`, and `label` always move together per artifact type; bundle them into a small config object (e.g. a `NamedTuple`/dataclass) so the function takes the file, the config, the summary, and the errors list.
♻️ Sketch

    class ArtifactKind(NamedTuple):
        schema: dict
        semantic_validator: Callable[[dict, str], None]
        counter_key: str
        failed_counter_key: str
        label: str

    def validate_file(file: Path, kind: ArtifactKind, summary: ValidationSummary, errors: list[str]) -> None: ...

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/validate_ai_governance_artifacts.py` around lines 168 - 177, The validate_file function signature has too many parameters; create a small config type (e.g., NamedTuple or dataclass named ArtifactKind) that bundles schema, semantic_validator, counter_key, failed_counter_key, and label, then change validate_file(file: Path, kind: ArtifactKind, summary: ValidationSummary, errors: list[str]) -> None to consume that single kind object; update all call sites that currently pass schema, semantic_validator, counter_key, failed_counter_key, and label to construct and pass an ArtifactKind instance instead, and adjust any type hints/imports accordingly (keep the function body using kind.schema, kind.semantic_validator, etc.).
.github/workflows/governance-artifacts.yml (1)
22-23: ⚡ Quick win
Consider caching Python dependencies for faster CI runs.
Adding dependency caching can significantly reduce CI execution time, especially for PRs with multiple commits.
⚡ Proposed enhancement to add dependency caching

           - name: Setup Python
             uses: actions/setup-python@v5
             with:
               python-version: '3.12'
    +          cache: 'pip'
    +          cache-dependency-path: requirements-governance.txt
           - name: Install dependencies
             run: python -m pip install -r requirements-governance.txt pytest

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/governance-artifacts.yml around lines 22 - 23, The "Install dependencies" step currently runs "python -m pip install -r requirements-governance.txt pytest" without caching; add an actions/cache step before this to cache pip packages (cache key using python-version and a hash of requirements-governance.txt, restore-keys fallback) and set the cache path to pip's cache directory (e.g., ~/.cache/pip) so subsequent runs reuse installed wheels; update the workflow to restore the cache, run the same install command, and save the cache when there are changes to the requirements file.
GOVERNANCE_ARTIFACTS_README.md (1)
29-29: 💤 Low value
Correct capitalization of GitHub.
The official brand name is "GitHub" with a capital "H".
✍️ Proposed fix

     ## CI
    -Validation is enforced in `.github/workflows/governance-artifacts.yml`.
    +Validation is enforced in `.github/workflows/governance-artifacts.yml` (GitHub Actions).

Or simply ensure "GitHub Actions" is mentioned with correct capitalization elsewhere in the document.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@GOVERNANCE_ARTIFACTS_README.md` at line 29, Update the capitalization of the brand name in the sentence referencing the workflow file: change "GitHub" to the correct "GitHub" (capital G and H) where the README currently mentions `.github/workflows/governance-artifacts.yml`, and optionally ensure any mention of the platform reads "GitHub Actions" with correct capitalization; edit the line containing that sentence to apply the fix.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/governance-artifacts.yml:
- Line 15: Replace mutable action tags with pinned commit SHAs for each "uses:"
entry (e.g., change actions/checkout@v4 to actions/checkout@<COMMIT_SHA>) to
harden the workflow against supply-chain tampering; locate all "uses:" lines
flagged (the example actions/checkout@v4 and the other uses entries at the same
section) and replace the tag with the corresponding repository commit SHA
(verify and paste the latest stable commit hash for each action), keeping the
rest of the step unchanged.
- Around line 14-15: Update the GitHub Actions checkout step (uses:
actions/checkout@v4) to explicitly disable credential persistence by adding
persist-credentials: false to that step; locate the Checkout step in the
workflow and add the persist-credentials: false key under it so credentials are
not written to the workspace or persisted to subsequent steps.
- Around line 25-26: The workflow step "Validate BBOM/ARRE artifacts" runs
tools/validate_ai_governance_artifacts.py writing to
.reports/governance-validation.json but doesn't ensure the .reports directory
exists; add a prior step (or modify this step) to create the directory (e.g.,
run mkdir -p .reports) before calling tools/validate_ai_governance_artifacts.py
so the validator can write the report reliably.
In `@AGI_ASI_GSIFI_Blueprint_2026_2030.md`:
- Around line 496-501: The CI docs recommend using the ajv CLI but the repo
provides a Python validator (tools/validate_ai_governance_artifacts.py); update
the instructions to invoke the repository tool (install
requirements-governance.txt then run tools/validate_ai_governance_artifacts.py
with --report-file .reports/validation.json or without flags) and/or explicitly
state ajv-cli is an alternative for Node.js users (showing npm install -g
ajv-cli and the ajv validate commands) so readers know both supported options.
In `@tools/validate_ai_governance_artifacts.py`:
- Around line 85-90: Several f-strings in the validation checks are longer than
100 characters; split long lines that raise ValidationError to satisfy Pylint's
line-too-long rule by breaking the f-string or using a temporary message
variable. In the block that checks thresholds (the lines referencing thresholds,
hazard.get(...), and raising ValidationError with f"{data_name} violates ...
threshold"), refactor each raise to build the message on a separate wrapped line
(e.g., msg = f"{data_name} violates max_deception threshold" then raise
ValidationError(msg)) or use parentheses to wrap the f-string across lines;
apply the same wrapping approach to the other long f-strings elsewhere in this
module (the other ValidationError messages and any long log/message strings) so
every line is ≤100 chars while preserving the same text and variables
(data_name, thresholds, hazard, ValidationError).
- Around line 187-189: The validator is passing FormatChecker() into
Draft202012Validator but the project dependencies in requirements-governance.txt
omit the jsonschema[format] extra (or rfc3339-validator), so "format":
"date-time" checks may be skipped; update requirements-governance.txt to include
either the jsonschema format extra (e.g., jsonschema[format]>=4.22,<5 to match
the existing constraint) or add rfc3339-validator so FormatChecker enforces
RFC3339 date-time validation used by validate_ai_governance_artifacts.py (where
Draft202012Validator and FormatChecker are used).
- Line 16: The import of NotRequired from typing in
tools/validate_ai_governance_artifacts.py fails on Python <3.11; change the
top-level import to a guarded fallback that tries "from typing import
NotRequired" and on ImportError imports NotRequired from typing_extensions
instead (refer to the NotRequired symbol and the module-level import block), and
also add typing_extensions to the governance requirements used by CI
(requirements-governance.txt and docs/schemas/requirements-governance.txt) and
ensure the lint workflow installs that requirement; additionally, add
jsonschema[format] (or rfc3339-validator) to those governance requirements so
FormatChecker() runtime validation works in CI.
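The guarded import described in the last inline comment could look like the following. This is a sketch under assumptions: the `ValidationSummary` fields shown are illustrative rather than the PR's actual definition, and the fallback only works when `typing_extensions` is installed on interpreters older than Python 3.11.

```python
try:  # Python 3.11+ ships NotRequired in the standard library
    from typing import NotRequired, TypedDict
except ImportError:  # older interpreters need the typing_extensions backport
    from typing_extensions import NotRequired, TypedDict


class ValidationSummary(TypedDict):
    # Illustrative fields only; the real TypedDict lives in the validator module.
    status: str
    exit_code: int
    fatal_error: NotRequired[str]


# A summary without the optional key still satisfies the type.
summary: ValidationSummary = {"status": "pass", "exit_code": 0}
```

Importing `TypedDict` from the same module as `NotRequired` keeps the required/optional key bookkeeping consistent on older interpreters.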
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1c09e6f1-24f1-4817-a268-8fe9ed0898b2
📒 Files selected for processing (11)
- .github/workflows/governance-artifacts.yml
- AGI_ASI_GSIFI_Blueprint_2026_2030.md
- GOVERNANCE_ARTIFACTS_README.md
- artifacts/bbom/sample_tier0_fraud.json
- examples/arre/sample_t0_sanctions_002.json
- requirements-governance.txt
- schemas/arre_record.schema.json
- schemas/bbom.schema.json
- tests/test_governance_validator.py
- tools/__init__.py
- tools/validate_ai_governance_artifacts.py
❌ Deploy Preview for onefinestarstuff failed.
Motivation
Description
- Adds JSON Schemas for BBOM (`schemas/bbom.schema.json`) and ARRE records (`schemas/arre_record.schema.json`) and example artifacts (`artifacts/bbom/sample_tier0_fraud.json`, `examples/arre/sample_t0_sanctions_002.json`).
- Adds a validator CLI, `tools/validate_ai_governance_artifacts.py`, that performs schema validation, semantic checks (e.g., threshold violations, period inversion, duplicate evidence hashes), produces a JSON report, and returns meaningful exit codes.
- Adds unit tests in `tests/test_governance_validator.py`.
- Adds a CI workflow, `.github/workflows/governance-artifacts.yml`, that installs dependencies, runs the validator and `pytest`, and uploads the generated `.reports/governance-validation.json` artifact, plus a README `GOVERNANCE_ARTIFACTS_README.md` with quickstart instructions.

Testing
- Unit tests were run with `pytest -q tests/test_governance_validator.py`, covering schema errors, semantic rule violations, malformed JSON resilience, duplicate evidence detection, and schema-load failure handling, and they passed in CI.
- The CI workflow runs `python -m pip install -r requirements-governance.txt pytest` and `python tools/validate_ai_governance_artifacts.py --report-file .reports/governance-validation.json` as part of validation and uploads the report as the `governance-validation-report` artifact.
- The tests call `main()` and `run_validation()` to assert correct exit codes (`0` for pass, `2` for validation failures) and report contents.
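
The ARRE semantic checks named in the Description (period inversion and duplicate evidence hashes) might look roughly like the sketch below. The field names `period_start`, `period_end`, `evidence`, and `sha256` are hypothetical, as are the helpers `parse_timestamp` and `fails`; the PR's actual record layout is not shown on this page.

```python
from datetime import datetime


class ValidationError(Exception):
    """Stand-in for the validator's own error type."""


def parse_timestamp(value: object, field: str, data_name: str) -> datetime:
    try:
        # Normalise a trailing "Z" so older interpreters can parse it.
        return datetime.fromisoformat(str(value).replace("Z", "+00:00"))
    except ValueError as exc:
        raise ValidationError(f"{data_name} has invalid {field}: {value!r}") from exc


def validate_arre_semantics(data: dict, data_name: str) -> None:
    # Field names below are assumptions made for illustration.
    start = parse_timestamp(data.get("period_start"), "period_start", data_name)
    end = parse_timestamp(data.get("period_end"), "period_end", data_name)
    try:
        inverted = end < start
    except TypeError as exc:  # mixing offset-aware and naive timestamps
        raise ValidationError(f"{data_name} mixes timestamp styles") from exc
    if inverted:
        raise ValidationError(f"{data_name} period_end precedes period_start")

    seen: set = set()
    for item in data.get("evidence", []):
        digest = item.get("sha256")
        if digest in seen:
            raise ValidationError(f"{data_name} has duplicate evidence hash {digest}")
        seen.add(digest)


def fails(record: dict) -> bool:
    """Return True when the record is rejected with a ValidationError."""
    try:
        validate_arre_semantics(record, "sample.json")
    except ValidationError:
        return True
    return False


valid_record = {
    "period_start": "2026-01-01T00:00:00Z",
    "period_end": "2026-03-31T23:59:59Z",
    "evidence": [{"sha256": "aa"}, {"sha256": "bb"}],
}
```

Raising a single error type from every branch lets the caller record failures in the summary without special-casing date or type problems.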
Summary by Sourcery
Introduce machine-readable AI governance artifacts with a validator CLI, tests, documentation, and CI integration for BBOM and ARRE evidence records.
New Features:
Enhancements:
CI:
Documentation:
Tests:
Chores:
Summary by CodeRabbit
New Features
Documentation
Tests
Chores
`jsonschema` dependency