
Commit c83b1e2

docs: comprehensive documentation audit — 12 issues resolved
Technical correctness:
- Fix awk column indices: ai_interaction_score is $10, ai_confidence is $11 (display_name column at position 5 shifted all AI columns by one)
- Add display_name column to all CSV header examples across docs
- Replace java -jar methodatlas.jar with ./methodatlas throughout troubleshooting.md and deployment/index.md (java -jar omits classpath)
- Add -github-annotations section to output-formats.md (four modes, not three)
- Fix data-governance.md: document all three items submitted to AI (taxonomy, method name list, class source — was incorrectly two)

README completeness:
- Add override-file, diff/delta, security-only, emit-metadata, mismatch-limit, ai-cache, apply-tags-from-csv, include-non-security to key capabilities list
- Expand documentation table with overrides, interaction-score, compliance, onboarding runbook entries

Regulated-environment documentation:
- Add exit codes table to cli-reference.md
- Add reproducibility and AI non-determinism section to compliance.md
- Add governance and review cadence section to ai/overrides.md
- Add SBOM/checksum verification section to installation.md
- Add enterprise secret management section to data-governance.md
- Add onboarding runbook docs/deployment/onboarding.md (six-phase brownfield progression: static → AI → overrides → drift → annotations → CI)
1 parent 8a17048 commit c83b1e2

13 files changed

Lines changed: 368 additions & 67 deletions

README.md

Lines changed: 19 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -25,18 +25,25 @@ MethodAtlas addresses this by turning an existing Java test suite (JUnit 5, JUni
2525

2626
## Key capabilities
2727

28-
- **Deterministic test discovery** — JavaParser AST analysis; no inference, no false positives on method existence
28+
- **Deterministic test discovery** — JavaParser AST analysis; no inference, no false positives on method existence; JUnit 5, JUnit 4, and TestNG detected automatically from import declarations
2929
- **SARIF 2.1.0 output** — first-class integration with static analysis platforms and IDE tooling
3030
- **AI security classification** — classifies each test method against a closed security taxonomy; supports Ollama, OpenAI, Anthropic, Azure OpenAI, Groq, xAI, GitHub Models, Mistral, and OpenRouter
3131
- **Confidence scoring** — per-method decimal score (`-ai-confidence`); filter by threshold for audit packages
3232
- **Content hash fingerprints** — SHA-256 of the class AST text (`-content-hash`); all methods in the same class share the same hash; enables incremental scanning and change detection
33+
- **AI result cache** — reuse previous AI classifications by hash (`-ai-cache`); unchanged classes cost zero API calls
3334
- **Tag vs AI drift detection** — `-drift-detect` flags methods where `@Tag("security")` in source disagrees with the AI classification
35+
- **Classification overrides** — `-override-file` records human-reviewed corrections; overrides persist across re-runs and set confidence to `1.0` or `0.0`
36+
- **Delta report** — `-diff` compares two CSV scans and emits a change report: methods added, removed, or modified between runs; useful for CI regression gates
37+
- **Security-only filter** — `-security-only` suppresses non-security methods from CSV/plain output; applied automatically in SARIF mode
38+
- **Mismatch limit** — `-mismatch-limit` safety gate for `-apply-tags-from-csv`; aborts without touching source files when the CSV diverges from the current codebase
3439
- **GitHub Actions annotations** — `-github-annotations` emits inline PR annotations for security-relevant methods without requiring a GitHub Advanced Security licence
3540
- **Apply-tags** — writes AI-suggested `@DisplayName` and `@Tag` annotations back into source files; idempotent
41+
- **Apply-tags-from-csv** — applies human-reviewed annotation decisions from a CSV back to source; separates the review step from the write-back
3642
- **Manual AI workflow** — two-phase prepare/consume workflow for environments where API access is blocked
3743
- **Local inference** — Ollama support keeps source code entirely within your network
3844
- **YAML configuration** — share scan settings across a team or CI pipeline without repeating CLI flags
3945
- **Custom taxonomy** — supply an external taxonomy file aligned to ISO 27001, NIST SP 800-53, PCI DSS, or your own controls framework
46+
- **Scan provenance** — `-emit-metadata` prepends tool version and timestamp to CSV; embed in evidence packages
4047
- **Multiple output modes** — CSV (default), plain text, SARIF, and GitHub Actions annotations
4148

4249
## Quick start
@@ -101,9 +108,9 @@ For each discovered JUnit test method, MethodAtlas emits one record.
101108
### CSV (default)
102109

103110
```csv
104-
fqcn,method,loc,tags,ai_security_relevant,ai_display_name,ai_tags,ai_reason,ai_interaction_score
105-
com.acme.auth.LoginTest,testLoginWithValidCredentials,12,,true,SECURITY: auth - validates session token,security;auth,Verifies session token is issued on successful login.,0.0
106-
com.acme.util.DateTest,format_returnsIso8601,5,,false,,,,0.1
111+
fqcn,method,loc,tags,display_name,ai_security_relevant,ai_display_name,ai_tags,ai_reason,ai_interaction_score
112+
com.acme.auth.LoginTest,testLoginWithValidCredentials,12,,,true,SECURITY: auth - validates session token,security;auth,Verifies session token is issued on successful login.,0.0
113+
com.acme.util.DateTest,format_returnsIso8601,5,,,false,,,,0.1
107114
```
108115

109116
### SARIF 2.1.0
@@ -182,10 +189,10 @@ Pass `-ai-confidence` to add a `0.0–1.0` confidence score per method:
182189

183190
```bash
184191
./methodatlas -ai -ai-confidence /path/to/tests | \
185-
awk -F',' 'NR==1 || ($10+0) >= 0.7' # keep only high-confidence findings
192+
awk -F',' 'NR==1 || ($11+0) >= 0.7' # keep only high-confidence findings
186193
```
187194

188-
`ai_confidence` is column 10 in standard output (column 11 when `-content-hash` is also passed).
195+
`ai_confidence` is column 11 in standard output (column 12 when `-content-hash` is also passed).
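
Because the column position shifts when options such as `-content-hash` are added, a filter that looks the column up by header name is less brittle than a fixed awk index. The following is a minimal sketch in Python, not part of MethodAtlas; the function name `filter_by_confidence` and the sample rows are illustrative assumptions, and it assumes that any field containing a comma is quoted in the usual CSV way.

```python
import csv
import io


def filter_by_confidence(csv_text, threshold=0.7, column="ai_confidence"):
    """Return CSV text with the header plus rows whose score is >= threshold.

    The column is located by header name, so the filter keeps working when
    extra columns (for example content_hash) change the position.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    idx = header.index(column)  # raises ValueError if the column is absent

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for row in reader:
        try:
            score = float(row[idx])
        except (IndexError, ValueError):
            continue  # short row or blank score: treat as not passing
        if score >= threshold:
            writer.writerow(row)
    return out.getvalue()


# Hypothetical sample data; the second field-with-comma is quoted on purpose.
SAMPLE = (
    "fqcn,method,loc,tags,display_name,ai_security_relevant,ai_display_name,"
    "ai_tags,ai_reason,ai_interaction_score,ai_confidence\n"
    "com.acme.auth.SessionTest,rotates,12,,,true,SECURITY: auth,security;auth,"
    '"Rotates token, prevents fixation.",0.0,0.7\n'
    "com.acme.util.DateTest,format,5,,,false,,,Formatting only.,0.0,0.0\n"
)

if __name__ == "__main__":
    print(filter_by_confidence(SAMPLE), end="")
```

A CSV parser also handles a quoted `ai_reason` that contains a comma, which `awk -F','` would split into two fields.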
189196

190197
| Score | Meaning |
191198
| --- | --- |
@@ -302,12 +309,16 @@ Full documentation is available at [accenture.github.io/MethodAtlas](https://acc
302309

303310
| Document | Contents |
304311
| --- | --- |
305-
| [docs/cli-reference.md](docs/cli-reference.md) | Complete option reference, YAML schema, and example commands |
312+
| [docs/cli-reference.md](docs/cli-reference.md) | Complete option reference, YAML schema, exit codes, and example commands |
306313
| [docs/output-formats.md](docs/output-formats.md) | CSV, plain text, SARIF, and GitHub Annotations format descriptions |
307314
| [docs/usage-modes/](docs/usage-modes/index.md) | All operating modes: static inventory, API AI, manual workflow, apply-tags, apply-tags-from-csv, delta, security-only |
308315
| [docs/ai/providers.md](docs/ai/providers.md) | Per-provider setup: Ollama, OpenAI, Anthropic, Azure OpenAI, Groq, xAI, GitHub Models, Mistral, OpenRouter |
316+
| [docs/ai/overrides.md](docs/ai/overrides.md) | Classification override file: format, governance, and CI integration |
309317
| [docs/ai/confidence.md](docs/ai/confidence.md) | Confidence scoring: interpretation and threshold guidance |
310318
| [docs/ai/caching.md](docs/ai/caching.md) | AI result caching: skip unchanged classes, two-pass SARIF pattern, CI cache key strategy |
311319
| [docs/ai/drift-detection.md](docs/ai/drift-detection.md) | Tag vs AI drift detection: detecting stale `@Tag("security")` annotations |
320+
| [docs/ai/interaction-score.md](docs/ai/interaction-score.md) | Placebo-test detection: interaction-score semantics and CI thresholds |
321+
| [docs/compliance.md](docs/compliance.md) | Compliance framework mapping: OWASP SAMM, NIST SSDF, ISO 27001, DORA; reproducibility statement |
312322
| [docs/deployment/](docs/deployment/index.md) | Regulated environment guidance: PCI-DSS, ISO 27001, NIST SSDF, DORA, SOC 2, air-gapped |
313-
| [docs/concepts/data-governance.md](docs/concepts/data-governance.md) | What data is submitted to AI providers and data residency options |
323+
| [docs/deployment/onboarding.md](docs/deployment/onboarding.md) | Onboarding a brownfield codebase: six-phase progression from static scan to CI gate |
324+
| [docs/concepts/data-governance.md](docs/concepts/data-governance.md) | What data is submitted to AI providers, data residency options, enterprise secret management |

docs/ai/confidence.md

Lines changed: 5 additions & 5 deletions
@@ -34,10 +34,10 @@ In regulated environments, test coverage evidence submitted to auditors needs to
3434
Output (CSV excerpt):
3535

3636
```csv
37-
fqcn,method,loc,tags,ai_security_relevant,ai_display_name,ai_tags,ai_reason,ai_confidence
38-
com.acme.crypto.AesGcmTest,roundTrip_encryptDecrypt,18,,true,SECURITY: crypto - AES-GCM round-trip,security;crypto,Verifies ciphertext and plaintext integrity under AES-GCM.,1.0
39-
com.acme.auth.SessionTest,sessionToken_isRotatedAfterLogin,12,,true,SECURITY: auth - session token rotation after login,security;auth,Session token is replaced on successful login to prevent fixation.,0.7
40-
com.acme.util.DateFormatterTest,format_returnsIso8601,5,,false,,,Test verifies date formatting output only.,0.0
37+
fqcn,method,loc,tags,display_name,ai_security_relevant,ai_display_name,ai_tags,ai_reason,ai_interaction_score,ai_confidence
38+
com.acme.crypto.AesGcmTest,roundTrip_encryptDecrypt,18,,,true,SECURITY: crypto - AES-GCM round-trip,security;crypto,Verifies ciphertext and plaintext integrity under AES-GCM.,0.0,1.0
39+
com.acme.auth.SessionTest,sessionToken_isRotatedAfterLogin,12,,,true,SECURITY: auth - session token rotation after login,security;auth,Session token is replaced on successful login to prevent fixation.,0.0,0.7
40+
com.acme.util.DateFormatterTest,format_returnsIso8601,5,,,false,,,Test verifies date formatting output only.,0.0,0.0
4141
```
4242

4343
## Filtering high-confidence findings
@@ -47,5 +47,5 @@ Because the output is plain CSV, standard shell tools work:
4747
```bash
4848
# Keep only rows where ai_confidence >= 0.7
4949
./methodatlas -ai -ai-confidence /src | \
50-
awk -F',' 'NR==1 || ($9+0) >= 0.7'
50+
awk -F',' 'NR==1 || ($11+0) >= 0.7'
5151
```

docs/ai/interaction-score.md

Lines changed: 4 additions & 4 deletions
@@ -57,9 +57,9 @@ When AI enrichment is enabled, every row in CSV and plain-text output carries th
5757

5858
**CSV:**
5959
```
60-
fqcn,method,loc,tags,ai_security_relevant,ai_display_name,ai_tags,ai_reason,ai_interaction_score
61-
com.acme.AuthTest,shouldValidatePassword,8,security,true,SECURITY: ...,security;auth,Validates...,0.0
62-
com.acme.AuthTest,shouldInvokeEncoder,5,security,true,SECURITY: ...,security;auth,Calls encoder.,1.0
60+
fqcn,method,loc,tags,display_name,ai_security_relevant,ai_display_name,ai_tags,ai_reason,ai_interaction_score
61+
com.acme.AuthTest,shouldValidatePassword,8,security,,true,SECURITY: ...,security;auth,Validates...,0.0
62+
com.acme.AuthTest,shouldInvokeEncoder,5,security,,true,SECURITY: ...,security;auth,Calls encoder.,1.0
6363
```
6464

6565
**Plain text:**
@@ -81,7 +81,7 @@ gate without tuning.
8181
```bash
8282
# Print security-relevant tests with interaction score ≥ 0.8
8383
./methodatlas -ai -security-only src/test/java \
84-
| awk -F',' 'NR==1 || $9+0 >= 0.8' \
84+
| awk -F',' 'NR==1 || $10+0 >= 0.8' \
8585
> weak-security-tests.csv
8686
```
8787

docs/ai/overrides.md

Lines changed: 17 additions & 0 deletions
@@ -218,3 +218,20 @@ applied on top.
218218

219219
See [CLI reference — `-override-file`](../cli-reference.md#-override-file) for
220220
the full flag description.
221+
222+
## Governance and review cadence
223+
224+
The override file is a living document that records human classification decisions. Without a defined review cadence, entries can become stale — referencing methods that were renamed, removed, or whose security relevance changed as the codebase evolved.
225+
226+
Recommended practices:
227+
228+
**Trigger-based review (minimum):** review the override file whenever:
229+
- A class named in an override entry is renamed or moved
230+
- A sprint introduces new test methods in a class that has class-level overrides
231+
- A security review flags a method that is currently overridden to `securityRelevant: false`
232+
233+
**Time-based review (regulated environments):** review the entire file at each release candidate or at a fixed calendar interval (e.g. quarterly). The review should confirm that each entry's `note` field describes a rationale that still applies.
234+
235+
**Process:** store the override file in version control alongside the source. Route every change to the file through a PR; the PR description and approval record serve as the audit trail. In regulated environments, require at least one security-team reviewer on override-file PRs who is not the developer who made the change.
236+
237+
For organisations where the override file is owned by a dedicated security team and delivered from a separate repository, see [Remote Override Sources](remote-overrides.md).
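
The trigger-based review can be partly automated by checking override entries against a current scan. The sketch below is an illustration only: how entries are loaded from the override file is not shown, and representing each entry as a `(fqcn, method)` pair, with `method` set to `None` for a class-level override, is a hypothetical shape chosen for this example. The function `stale_overrides` is likewise not a MethodAtlas feature.

```python
import csv
import io


def stale_overrides(override_keys, scan_csv_text):
    """Return override entries that no longer match any row of the scan CSV.

    override_keys: iterable of (fqcn, method) pairs; method is None for a
    class-level entry (hypothetical representation).
    scan_csv_text: MethodAtlas CSV output with fqcn and method columns.
    """
    classes = set()
    methods = set()
    for row in csv.DictReader(io.StringIO(scan_csv_text)):
        classes.add(row["fqcn"])
        methods.add((row["fqcn"], row["method"]))

    stale = []
    for fqcn, method in override_keys:
        if method is None:
            if fqcn not in classes:  # class renamed, moved, or removed
                stale.append((fqcn, None))
        elif (fqcn, method) not in methods:  # method renamed or removed
            stale.append((fqcn, method))
    return stale


# Hypothetical scan excerpt used for illustration.
SCAN = (
    "fqcn,method,loc,tags,display_name\n"
    "com.acme.AuthTest,shouldValidatePassword,8,security,\n"
    "com.acme.AuthTest,shouldInvokeEncoder,5,security,\n"
)
```

An empty result does not replace the human review; it only confirms that every entry still points at an existing class or method.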

docs/cli-examples.md

Lines changed: 6 additions & 6 deletions
@@ -86,7 +86,7 @@ export ANTHROPIC_API_KEY=sk-ant-...
8686

8787
# Filter high-confidence findings (requires -ai-confidence)
8888
./methodatlas -ai -ai-confidence /path/to/tests | \
89-
awk -F',' 'NR==1 || ($9+0) >= 0.7'
89+
awk -F',' 'NR==1 || ($11+0) >= 0.7'
9090
```
9191

9292
## Source write-back
@@ -127,11 +127,11 @@ Running against a mix of functional and cryptographic test classes:
127127
Produces output such as:
128128

129129
```csv
130-
fqcn,method,loc,tags,ai_security_relevant,ai_display_name,ai_tags,ai_reason
131-
org.egothor.methodatlas.MethodAtlasAppTest,csvMode_detectsMethodsLocAndTags,22,,false,,,Test verifies functional output format only.
132-
zeroecho.core.alg.aes.AesGcmCrossCheckTest,aesGcm_stream_vs_jca_ctxOnly_crosscheck,52,,true,SECURITY: crypto - cross-check AES-GCM stream encryption with JCA reference,security;crypto,Verifies custom AES-GCM matches JCA output — ensures cryptographic correctness.
133-
zeroecho.core.alg.aes.AesLargeDataTest,aesGcmLargeData_ctxOnly,27,,true,SECURITY: crypto - AES-GCM round-trip with context-only parameters,security;crypto,Tests encryption and decryption correctness for large data using AES-GCM.
134-
zeroecho.core.alg.mldsa.MldsaLargeDataTest,mldsa_complete_suite_streaming_sign_verify_large_data,24,,true,SECURITY: crypto - ML-DSA streaming signature and verification for large data,security;crypto;owasp,Validates ML-DSA signature creation and verification including tamper detection.
130+
fqcn,method,loc,tags,display_name,ai_security_relevant,ai_display_name,ai_tags,ai_reason,ai_interaction_score
131+
org.egothor.methodatlas.MethodAtlasAppTest,csvMode_detectsMethodsLocAndTags,22,,,false,,,Test verifies functional output format only.,0.0
132+
zeroecho.core.alg.aes.AesGcmCrossCheckTest,aesGcm_stream_vs_jca_ctxOnly_crosscheck,52,,,true,SECURITY: crypto - cross-check AES-GCM stream encryption with JCA reference,security;crypto,Verifies custom AES-GCM matches JCA output — ensures cryptographic correctness.,0.0
133+
zeroecho.core.alg.aes.AesLargeDataTest,aesGcmLargeData_ctxOnly,27,,,true,SECURITY: crypto - AES-GCM round-trip with context-only parameters,security;crypto,Tests encryption and decryption correctness for large data using AES-GCM.,0.0
134+
zeroecho.core.alg.mldsa.MldsaLargeDataTest,mldsa_complete_suite_streaming_sign_verify_large_data,24,,,true,SECURITY: crypto - ML-DSA streaming signature and verification for large data,security;crypto;owasp,Validates ML-DSA signature creation and verification including tamper detection.,0.0
135135
```
136136

137137
Observations:

docs/cli-reference.md

Lines changed: 14 additions & 3 deletions
@@ -187,9 +187,9 @@ Appends a SHA-256 content fingerprint to every emitted record. The hash is compu
187187
In CSV output, a `content_hash` column is appended immediately after `display_name`:
188188

189189
```text
190-
fqcn,method,loc,tags,content_hash
191-
com.acme.tests.SampleOneTest,alpha,8,fast;crypto,3a7f9b...
192-
com.acme.tests.SampleOneTest,beta,6,param,3a7f9b...
190+
fqcn,method,loc,tags,display_name,content_hash
191+
com.acme.tests.SampleOneTest,alpha,8,fast;crypto,,3a7f9b...
192+
com.acme.tests.SampleOneTest,beta,6,param,,3a7f9b...
193193
```
194194

195195
In plain-text output, a `HASH=<value>` token is appended to each line. In SARIF output, the hash is stored as `properties.contentHash`.
@@ -499,3 +499,14 @@ Runs the prepare phase of the manual AI workflow. For each test class MethodAtla
499499
Runs the consume phase. MethodAtlas reads operator-filled response files and merges the AI JSON into the output CSV. Missing or empty response files are treated as absent AI data; the scan continues.
500500

501501
For practical examples grouped by use case, see [CLI Examples](cli-examples.md).
502+
503+
## Exit codes
504+
505+
| Code | Condition |
506+
|---|---|
507+
| `0` | Scan completed successfully; all source files processed |
508+
| `1` | `-apply-tags-from-csv` aborted because the mismatch count reached or exceeded `-mismatch-limit` |
509+
| `1` | A source file could not be read or written during `-apply-tags-from-csv` |
510+
| `1` | A required argument value is missing or malformed (printed to stderr before exit) |
511+
512+
Note: AI classification failures for individual classes (provider timeout, parse error in the AI response) do not cause a non-zero exit. The affected rows are emitted with blank AI columns and the scan continues. Only structural errors — bad arguments, mismatch-limit violations, and I/O failures during source write-back — produce exit code `1`.
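
Because a failed AI classification still yields exit code `0`, a pipeline that needs complete AI coverage has to inspect the output as well as the exit code. The following is a minimal sketch of such a check, assuming that "blank AI columns" includes a blank `ai_security_relevant` value; the function names and the `max_unclassified` parameter are hypothetical and not part of MethodAtlas.

```python
import csv
import io


def count_unclassified(csv_text):
    """Count rows whose ai_security_relevant value is blank or missing."""
    count = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        if (row.get("ai_security_relevant") or "").strip() == "":
            count += 1
    return count


def scan_ok(returncode, csv_text, max_unclassified=0):
    """Pass only if the tool exited with 0 and few enough rows lack AI data."""
    return returncode == 0 and count_unclassified(csv_text) <= max_unclassified


# Hypothetical scan output: the second row simulates a provider timeout.
HEADER = (
    "fqcn,method,loc,tags,display_name,ai_security_relevant,ai_display_name,"
    "ai_tags,ai_reason,ai_interaction_score"
)
SCAN = "\n".join([
    HEADER,
    "com.acme.AuthTest,login,12,,,true,SECURITY: auth,security;auth,Checks login.,0.0",
    "com.acme.PayTest,charge,9" + "," * 7,
]) + "\n"
```

In a CI step, `returncode` would come from the MethodAtlas process and `csv_text` from its captured standard output.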

docs/compliance.md

Lines changed: 16 additions & 0 deletions
@@ -94,6 +94,22 @@ security test coverage is maintained and repeated across development cycles.
9494
The SARIF output integrates with code scanning dashboards that provide the
9595
timestamped, per-commit audit trail supervisors may request.
9696

97+
## Reproducibility and AI non-determinism
98+
99+
MethodAtlas separates two distinct layers with different reproducibility properties.
100+
101+
**The structural layer is fully deterministic.** Method discovery (FQCN, method name, LOC, source-level `@Tag` values, content hash) is driven entirely by JavaParser AST analysis of the source files. Given the same source revision, this layer always produces identical output, regardless of provider, model, or time.
102+
103+
**The AI layer is non-deterministic by nature.** Language models use probabilistic sampling. Even with the same model, same source, and same prompt, a different run may produce a slightly different `ai_reason`, a different `ai_confidence` value, or — rarely — a different `securityRelevant` verdict. This is a fundamental property of all language model inference, not a defect in MethodAtlas.
104+
105+
Two mechanisms mitigate AI non-determinism for compliance purposes:
106+
107+
1. **`-ai-cache`** — once a class has been classified, its result is stored in a CSV indexed by SHA-256 content hash. Subsequent runs reuse the stored result without calling the provider. The scan output is therefore reproducible for all unchanged classes.
108+
109+
2. **`-override-file`** — human-reviewed corrections are applied deterministically on every run and take precedence over AI output. An override entry sets confidence to `1.0` or `0.0`, reflecting the higher certainty of a human decision.
110+
111+
For evidence packages submitted to assessors, the recommended practice is to treat the classified CSV (produced with `-ai -content-hash`) as the authoritative record after human review. Re-running the scan on the same commit using the same cache produces output identical to the reviewed artefact for all unchanged classes; any new or changed classes are the only source of variance.
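
The way a content-hash cache turns a non-deterministic classifier into reproducible output can be sketched in a few lines. This is a simplified illustration, not the MethodAtlas implementation: it hashes raw source text rather than class AST text, keeps the cache in a dictionary rather than a CSV file, and `classify_with_cache` and its `classify` callback are hypothetical names.

```python
import hashlib


def classify_with_cache(classes, cache, classify):
    """Classify each class, calling the provider only on a cache miss.

    classes:  mapping of class name -> source text
    cache:    dict of SHA-256 hex digest -> stored result (mutated in place)
    classify: callable taking source text and returning a result; stands in
              for a non-deterministic AI provider call
    Returns (results by class name, number of provider calls made).
    """
    results = {}
    calls = 0
    for name, source in classes.items():
        key = hashlib.sha256(source.encode("utf-8")).hexdigest()
        if key not in cache:
            cache[key] = classify(source)  # only changed or new content
            calls += 1
        results[name] = cache[key]
    return results, calls
```

Running the function twice over unchanged sources with the same cache returns identical results and makes zero provider calls on the second run; editing one class causes exactly one new call, which matches the statement that new or changed classes are the only source of variance.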
112+
97113
## Further reading
98114

99115
- [OWASP SAMM v2 — Security Testing practice](https://owaspsamm.org/model/verification/security-testing/)
