Skip to content

Latest commit

 

History

History
605 lines (444 loc) · 36.6 KB

File metadata and controls

605 lines (444 loc) · 36.6 KB

Feature: Traceability & Validation

  • p1 - ID: cpt-cypilot-featstatus-traceability-validation

1. Feature Context

  • p1 - cpt-cypilot-feature-traceability-validation

1. Overview

Deterministic quality gate that scans artifacts for ID definitions and references, scans code for @cpt-* traceability markers, validates structural contracts and cross-references, and provides query commands for navigating the ID graph. All checks are single-pass, stdlib-only, and produce machine-readable JSON reports with file paths, line numbers, and actionable fixing prompts.

2. Purpose

Catches structural and traceability issues that AI agents miss or hallucinate — without relying on an LLM. Ensures that every design element has a unique ID, every reference resolves to a definition, every checked reference implies a checked definition, and every to_code ID has a matching code marker. Addresses PRD requirements for ID and traceability (cpt-cypilot-fr-core-traceability) and CDSL instruction tracking (cpt-cypilot-fr-core-cdsl). Artifact validation and cross-artifact consistency capabilities are provided generically by the core Validator for any installed kit.

3. Actors

Actor Role in Feature
cpt-cypilot-actor-user Invokes validation and traceability query commands from CLI
cpt-cypilot-actor-ai-agent Invokes validation after artifact/code generation; uses query commands for navigation
cpt-cypilot-actor-ci-pipeline Runs validation as a CI gate to enforce quality floor

4. References

  • PRD: PRD.mdcpt-cypilot-fr-core-traceability, cpt-cypilot-fr-core-cdsl
  • Design: DESIGN.mdcpt-cypilot-component-validator, cpt-cypilot-component-traceability-engine
  • Specs: traceability.md, CDSL.md, constraints.md
  • Dependencies: cpt-cypilot-feature-core-infra

2. Actor Flows (CDSL)

Validate Artifacts

  • p1 - ID: cpt-cypilot-flow-traceability-validation-validate

Actor: cpt-cypilot-actor-user

Success Scenarios:

  • User runs cpt validate → all registered artifacts validated, cross-references checked, code traceability verified, PASS with coverage report
  • User runs cpt validate --artifact <path> → single artifact validated against its constraints, cross-references checked against all artifacts

Error Scenarios:

  • Artifact not found in registry → ERROR with message
  • Template structure mismatch → FAIL with heading contract details
  • Cross-reference to undefined ID → FAIL with definition hint
  • Code marker references non-existent artifact ID → FAIL with orphan details

Steps:

  1. - p1 - User invokes cpt validate [--artifact <path>] [--skip-code] [--verbose] - inst-user-validate
  2. - p1 - Load project context: cypilot config, registry, systems, kits, constraints - inst-load-context
  3. - p1 - Resolve artifacts to validate: if --artifact specified resolve single artifact from registry, otherwise collect all registered Cypilot-format artifacts - inst-resolve-artifacts
  4. - p1 - IF registry-level errors detected RETURN FAIL report immediately - inst-if-registry-fail
  5. - p1 - Run self-check: validate kit examples against templates to ensure kit integrity - inst-self-check
  6. - p1 - FOR EACH kit: resolve resource paths — for manifest-driven kits, resolve constraints, templates, and examples from resource bindings in core.toml; for legacy kits, use default directory structure - inst-resolve-kit-resources
  7. - p1 - FOR EACH artifact to validate - inst-foreach-artifact
    1. - p1 - Load kind-specific constraints from kit (using resolved resource paths) - inst-load-constraints
    2. - p1 - Validate artifact structure using cpt-cypilot-algo-traceability-validation-validate-structure - inst-validate-structure
  8. - p1 - IF per-artifact errors exist RETURN FAIL report (stop before cross-validation) - inst-if-structure-fail
  9. - p1 - Cross-validate references across all artifacts using cpt-cypilot-algo-traceability-validation-cross-validate - inst-cross-validate
  10. - p1 - IF --skip-code is not set, validate code traceability using cpt-cypilot-algo-traceability-validation-cross-validate-code - inst-if-code
  11. - p1 - Enrich errors with fixing prompts for LLM agents - inst-enrich-errors
  12. - p1 - RETURN JSON report (status, artifact count, error/warning counts, coverage stats, next step hint) - inst-return-report

Supporting:

  • - p1 - Imports and module setup for validate command - inst-validate-imports
  • - p1 - Internal helpers: attach issue to artifact report, enrich target artifact paths, find artifact in system, suggest path from autodetect - inst-validate-helpers
  • - p1 - Human-friendly formatter: issue location, issue formatting, validate report display - inst-validate-format

Check Language

  • p1 - ID: cpt-cypilot-flow-traceability-validation-check-language

Actor: cpt-cypilot-actor-user

Success Scenarios:

  • User runs cpt check-language → all .md artifacts scanned for disallowed Unicode characters, PASS if none found
  • User runs cpt check-language --languages en,ru <path> → specified path scanned with given language policy
  • User runs cpt check-language --exclude "translations/**" → matching paths skipped, skipped count reported
  • File contains <!-- cpt-lang: ignore --> anywhere → file is skipped entirely by the scanner

Error Scenarios:

  • Unknown language code passed via --languages → ERROR exit code 1
  • Specified path does not exist → ERROR exit code 1
  • Violations found → FAIL exit code 2

Steps:

  1. - p1 - Parse arguments, resolve allowed languages, resolve scan roots, resolve ignore globs (CLI --exclude merged with check_language_ignore_paths from workspace config), invoke scanner, display result - inst-cmd-check-language

Supporting:

  • - p1 - Imports, argument parsing setup, and module-level constants - inst-check-lang-imports
  • - p1 - _read_config_languages, _read_config_ignore_paths, _default_roots, _count_md_files helper functions - inst-helpers
  • - p1 - _human_result: format violation report for human output - inst-human-result

Query Traceability

  • p1 - ID: cpt-cypilot-flow-traceability-validation-query

Actor: cpt-cypilot-actor-user

Success Scenarios:

  • User runs cpt list-ids → all ID definitions listed with kind, file, line, checked status
  • User runs cpt where-defined --id <id> → definition location returned with file path and line
  • User runs cpt where-used --id <id> → all reference locations returned across artifacts and code
  • User runs cpt get-content --id <id> → content block under the ID heading returned

Error Scenarios:

  • ID not found in any artifact → empty result with exit code 2

Steps:

  1. - p1 - User invokes one of: list-ids [--kind K] [--pattern P], where-defined --id <id>, where-used --id <id>, get-content --id <id> - inst-user-query
  2. - p1 - Load project context and resolve all registered artifacts - inst-query-load-context
  3. - p1 - Scan all artifacts using cpt-cypilot-algo-traceability-validation-scan-ids to build ID index - inst-scan-all
  4. - p1 - IF list-ids --include-code: scan codebase files for marker references - inst-if-list-code
  5. - p1 - IF list-ids: filter index by --kind and --pattern, return definitions - inst-if-list
  6. - p1 - IF where-defined: find definition entries for the given ID - inst-if-where-def
  7. - p1 - IF where-used: find reference entries for the given ID across artifacts and code - inst-if-where-used
  8. - p1 - IF get-content: locate ID definition, extract content block from heading scope - inst-if-get-content
  9. - p1 - RETURN JSON result - inst-return-query

Supporting:

  • - p1 - Imports and module setup for query commands (list-ids, where-defined, where-used) - inst-query-imports
  • - p1 - Argument parsing, context resolution, and artifact collection for query commands - inst-query-resolve
  • - p1 - Human-friendly formatters for list-ids, where-defined, and where-used output - inst-query-format

3. Processes / Business Logic (CDSL)

Scan Artifact IDs

  • p1 - ID: cpt-cypilot-algo-traceability-validation-scan-ids

Input: Path to a Markdown artifact file

Output: List of ID hits: {id, line, type (definition|reference), checked, has_task, has_priority, priority}

Steps:

  1. - p1 - Read file as UTF-8 lines - inst-read-file
  2. - p1 - FOR EACH line (skipping fenced code blocks) - inst-foreach-line
    1. - p1 - Match ID definition pattern: **ID**: \cpt-...`with optional checkbox and priority -inst-match-def`
    2. - p1 - IF definition matched, extract id, checked, has_task, priority and append as definition hit - inst-if-def
    3. - p1 - ELSE match standalone reference pattern: \cpt-...`with optional checkbox -inst-match-ref`
    4. - p1 - ELSE scan for inline backticked cpt-* references - inst-match-inline
  3. - p1 - RETURN ordered list of hits - inst-return-hits
  4. - p1 - Parse a cpt-{system}-{kind}-{slug} identifier: extract system, kind, slug with composite ID support - inst-parse-cpt

Supporting:

  • - p1 - Imports, regex constants (ID definition, reference, backtick, heading, fence patterns), and ID normalization helper - inst-scan-ids-datamodel
  • - p1 - Heading-by-line index builder for document scope resolution - inst-scan-ids-headings
  • - p1 - Content scoped extraction: hash-fence blocks, heading scopes, ID-definition scopes - inst-scan-ids-get-content
  • - p1 - File I/O utilities: safe text reader, text file iterator, relative path converter - inst-scan-ids-file-utils
  • - p1 - Wrapper function for parse_cpt identifier parser - inst-parse-cpt-fn

Scan CDSL Instructions

  • p1 - ID: cpt-cypilot-algo-traceability-validation-scan-cdsl

Input: Path to a Markdown artifact file

Output: List of CDSL instruction records: {parent_id, inst, checked, line, priority}

Steps:

  1. - p1 - Read file as UTF-8 lines - inst-read-file
  2. - p1 - Track current parent ID by scanning ID definitions at heading level - inst-track-parent
  3. - p1 - FOR EACH line matching CDSL instruction pattern (numbered list item with inst-{slug} suffix) - inst-foreach-cdsl
    1. - p1 - Extract checked status, priority, instruction slug - inst-extract-inst
    2. - p1 - Associate with current parent ID - inst-associate-parent
  4. - p1 - RETURN list of instruction records - inst-return-cdsl

Supporting:

  • - p1 - CDSL line regex and phase number parsing constants - inst-scan-cdsl-datamodel

Validate Artifact Structure

  • p1 - ID: cpt-cypilot-algo-traceability-validation-validate-structure

Input: Artifact path, artifact kind, kind-specific constraints, registered systems

Output: {errors, warnings} lists

Steps:

  1. - p1 - IF constraints have headings contract, validate heading patterns (required sections, levels, ordering) - inst-check-headings
  2. - p1 - IF headings errors exist RETURN early (IDs depend on correct structure) - inst-if-headings-fail
  3. - p1 - Scan IDs using cpt-cypilot-algo-traceability-validation-scan-ids - inst-scan-ids
  4. - p1 - Scan CDSL instructions using cpt-cypilot-algo-traceability-validation-scan-cdsl - inst-scan-cdsl
  5. - p1 - FOR EACH CDSL step where parent ID is checked but step is unchecked - inst-foreach-cdsl-mismatch
    1. - p1 - Emit error: CDSL step unchecked but parent already checked - inst-emit-cdsl-error
  6. - p1 - FOR EACH parent-child ID pair (heading scope) - inst-foreach-parent-child
    1. - p1 - IF all children checked AND parent unchecked, emit error - inst-if-all-done-parent-not
    2. - p1 - IF parent checked AND any child unchecked, emit error - inst-if-parent-done-child-not
  7. - p1 - Validate ID format and heading scoping per constraints - inst-validate-id-format
  8. - p1 - RETURN accumulated errors and warnings - inst-return-structure

Supporting:

  • - p1 - Imports, dataclasses (ReferenceRule, HeadingConstraint, IdConstraint, ArtifactKindConstraints, KitConstraints, ArtifactRecord, ParsedCypilotId), error factory, and optional-bool parser - inst-structure-datamodel
  • - p1 - Entry point for validate_artifact_file: load constraints, dispatch validation phases - inst-check-ids-entry
  • - p1 - Helper functions for task/priority and ID/heading constraint validation - inst-check-ids-helpers
  • - p1 - TOC validation phase within artifact validation - inst-check-toc
  • - p1 - Build definitions-by-ID index from scanned artifact IDs - inst-build-defs-index
  • - p1 - Heading context resolution for CDSL instruction line matching - inst-check-cdsl-heading-ctx

Cross-Validate Artifacts

  • p1 - ID: cpt-cypilot-algo-traceability-validation-cross-validate

Input: List of all artifact records (path, kind, constraints)

Output: {errors, warnings} lists

Steps:

  1. - p1 - Scan all artifacts to build definition index (defs_by_id) and reference index (refs_by_id) - inst-build-index
  2. - p1 - FOR EACH ID with definitions in multiple different artifact files, emit error: duplicate definition listing conflicting files - inst-duplicate-defs
  3. - p1 - FOR EACH reference to an internal-system ID - inst-foreach-ref
    1. - p1 - IF no matching definition exists, emit error: reference to undefined ID - inst-if-no-def
  4. - p1 - FOR EACH reference with checked task marker - inst-foreach-checked-ref
    1. - p1 - IF corresponding definition has task marker AND is unchecked, emit error: ref done but def not done - inst-if-ref-done-def-not
  5. - p1 - FOR EACH definition with checked task marker - inst-foreach-checked-def
    1. - p1 - IF any task-tracked reference is unchecked, emit error: def done but ref not done - inst-if-def-done-ref-not
  6. - p1 - Enforce coverage rules from constraints (required cross-references between artifact kinds) - inst-enforce-coverage
  7. - p1 - RETURN accumulated errors and warnings - inst-return-cross

Supporting:

  • - p1 - Setup helpers: system matcher, kind extractor, external-system detector, heading-info builder, constraint indexing - inst-cross-datamodel

Scan Code Markers

  • p1 - ID: cpt-cypilot-algo-traceability-validation-scan-code

Input: Path to a code file

Output: Parsed code file: scope markers, block markers, references, structural errors

Steps:

  1. - p1 - Read file lines - inst-read-code
  2. - p1 - FOR EACH line matching @cpt-{kind}:{id}:p{N} - inst-match-scope
    1. - p1 - Extract kind, id, phase; add to scope markers list - inst-extract-scope
  3. - p1 - FOR EACH line matching @cpt-begin:{id}:p{N}:inst-{local} - inst-match-begin
    1. - p1 - Push onto open block stack - inst-push-block
  4. - p1 - FOR EACH line matching @cpt-end:{id}:p{N}:inst-{local} - inst-match-end
    1. - p1 - Pop from stack, validate matching begin marker - inst-pop-block
    2. - p1 - IF no matching begin or id/inst mismatch, emit structural error - inst-if-mismatch
  5. - p1 - IF unclosed blocks remain on stack, emit errors - inst-if-unclosed
  6. - p1 - RETURN parsed code file with markers and structural errors - inst-return-code
  7. - p1 - Define code data model: regex patterns, ScopeMarker, BlockMarker, CodeReference, CodeFile dataclasses, error factory - inst-code-datamodel
  8. - p1 - Query and validation methods: list_ids, get by ID/inst, validate duplicate scopes - inst-code-query-validate
  9. - p1 - Convenience wrappers: load_code_file, validate_code_file entry points - inst-code-wrappers

Cross-Validate Code

  • p1 - ID: cpt-cypilot-algo-traceability-validation-cross-validate-code

Input: Parsed code files, artifact ID set, to_code ID set, forbidden IDs (unchecked task), CDSL instruction map

Output: {errors, warnings} lists

Steps:

  1. - p1 - IF traceability mode is DOCS-ONLY and markers found, emit error: markers prohibited - inst-if-docs-only
  2. - p1 - Collect all IDs referenced in code markers - inst-collect-code-ids
  3. - p1 - FOR EACH code marker referencing an ID not in artifact definitions - inst-foreach-orphan
    1. - p1 - Emit error: orphaned code marker (ID not defined in any artifact) - inst-emit-orphan
  4. - p1 - FOR EACH code marker referencing a to_code ID whose task checkbox is unchecked - inst-foreach-forbidden
    1. - p1 - Emit error: code marker exists but artifact task not checked - inst-emit-forbidden
  5. - p1 - FOR EACH to_code ID without any code marker - inst-foreach-missing
    1. - p1 - Emit error: missing code marker for to_code ID - inst-emit-missing
  6. - p1 - FOR EACH CDSL instruction in artifacts with code block markers - inst-foreach-inst
    1. - p1 - IF artifact instruction has no matching @cpt-begin/@cpt-end block, emit error - inst-if-inst-missing
    2. - p1 - IF code block has no matching CDSL step in artifact, emit error - inst-if-inst-orphan
  7. - p1 - RETURN accumulated errors and warnings - inst-return-code-cross

List ID Kinds

  • p1 - ID: cpt-cypilot-algo-traceability-validation-list-id-kinds

Input: Optional artifact path, project context

Output: JSON with kind list, counts, kind↔template mappings

Steps:

  1. - p1 - Parse arguments: --artifact - inst-kinds-parse-args
  2. - p1 - Resolve artifacts to scan (single or all registered) - inst-kinds-resolve-artifacts
  3. - p1 - IF no artifacts found RETURN empty result or error - inst-kinds-if-no-artifacts
  4. - p1 - Build known kinds set from kit constraints - inst-kinds-build-known
  5. - p1 - FOR EACH artifact, scan ID definitions and infer kind tokens from ID slugs - inst-kinds-scan-ids
  6. - p1 - Aggregate kind counts and kind↔template mappings - inst-kinds-aggregate
  7. - p1 - RETURN JSON: {kinds, kind_counts, kind_to_templates, template_to_kinds} - inst-kinds-return

Supporting:

  • - p1 - Imports and module setup for list-id-kinds command - inst-kinds-imports
  • - p1 - Human-friendly formatter for list-id-kinds output - inst-kinds-format

Validate TOC

  • p1 - ID: cpt-cypilot-algo-traceability-validation-validate-toc

Input: List of file paths (or all registered artifacts)

Output: JSON with per-file TOC validation results

Steps:

  1. - p1 - Parse arguments: positional files or --all - inst-toc-parse-args
  2. - p1 - Resolve file list (explicit paths or all registered artifacts) - inst-toc-resolve-files
  3. - p1 - FOR EACH file - inst-toc-foreach-file
    1. - p1 - Parse existing TOC block between <!-- toc --> markers - inst-toc-parse-existing
    2. - p1 - Generate expected TOC from headings - inst-toc-generate-expected
    3. - p1 - Compare existing vs expected: check anchor validity, heading coverage, staleness - inst-toc-compare
    4. - p1 - IF mismatch, record error with diff details - inst-toc-if-mismatch
  4. - p1 - RETURN JSON: {status, files_checked, errors} - inst-toc-return

Supporting:

  • - p1 - Imports and module setup for validate-toc command - inst-toc-imports
  • - p1 - Human-friendly formatter for validate-toc output - inst-toc-format

TOC Utilities

  • p1 - ID: cpt-cypilot-algo-traceability-validation-toc-utils

Input: Markdown content string, optional file path

  1. - p1 - Parse headings from markdown lines (respecting fenced code blocks, min/max level, skip options) - inst-toc-util-parse-headings
  2. - p1 - Build TOC string from heading tuples (numbered or bulleted, GitHub-compatible anchors) - inst-toc-util-build-toc
  3. - p1 - Insert/update TOC using HTML markers (<!-- toc -->) for CLI command - inst-toc-util-insert-markers
  4. - p1 - Insert/update TOC using heading-based insertion (## Table of Contents) for kit file generator - inst-toc-util-insert-heading
  5. - p1 - Process file: strip manual TOC, insert marker-based TOC, write if changed - inst-toc-util-process-file
  6. - p1 - Validate TOC: check existence, anchor validity, completeness, staleness - inst-toc-util-validate

Supporting:

  • - p1 - Imports, constants, fence tracking, GitHub anchor slug generation - inst-toc-util-datamodel
  • - p1 - Internal helpers: unique slug, next heading finder, manual TOC stripping, TOC section finder, entry extraction, anchor building, heading line finder - inst-toc-util-helpers

Markdown Parsing Utilities

  • p1 - ID: cpt-cypilot-algo-traceability-validation-parsing-utils

Input: Markdown text, section regex patterns

  1. - p1 - Parse required sections from requirements file (section ID → title mapping) - inst-parse-required-sections
  2. - p1 - Find present section IDs in artifact text (e.g., A, B, C letter headings) - inst-parse-find-sections
  3. - p1 - Split text by lettered sections with optional line offsets - inst-parse-split-sections
  4. - p1 - Extract field blocks from markdown (**Field Name**: value patterns) - inst-parse-field-block
  5. - p1 - Extract backticked IDs matching a pattern from text - inst-parse-extract-ids

Supporting:

  • - p1 - Imports, constants, field header termination heuristic, and list item detection helper - inst-parse-datamodel

Fixing Prompt Enrichment

  • p1 - ID: cpt-cypilot-algo-traceability-validation-fixing-prompts

Input: List of validation issues (errors/warnings), optional project root

  1. - p1 - Define probable reasons registry mapping error codes to human-readable templates - inst-fix-define-reasons
  2. - p1 - Build actionable fixing prompt per error code with location, ID, and constraint context - inst-fix-build-prompt
  3. - p1 - Enrich each issue in-place: resolve reasons, attach fixing prompt, normalize location - inst-fix-enrich

Supporting:

  • - p1 - Imports, SafeDict formatter, reason resolver, and prompt helper functions (location, kind context, headings hint, relative path) - inst-fix-datamodel

Headings Contract Validation

  • p1 - ID: cpt-cypilot-algo-traceability-validation-headings-contract

Input: Artifact path, artifact kind constraints (headings list), registered systems

Output: {errors, warnings} lists

Steps:

  1. - p1 - Resolve heading constraint IDs by line: match each document heading to a constraint pattern, build per-line active scope stack - inst-resolve-scope
  2. - p1 - Scan headings from markdown lines: parse level, title, numbering prefix (respecting fenced code blocks) - inst-scan-headings
  3. - p1 - Initialize validation context: load heading constraints, build helper lookups, scan document headings - inst-validate-init
  4. - p1 - Check numbering sequence: enforce that sibling sections under the same numeric parent progress consecutively - inst-check-numbering
  5. - p1 - Match headings against constraints: hierarchical scope matching, required/multiple/numbered enforcement, emit errors for missing/duplicate/misnumbered headings - inst-match-headings

Supporting:

  • - p1 - Heading line regex, number prefix regex, and module exports - inst-headings-datamodel
  • - p1 - Helper functions for heading pattern compilation, wildcard mapping, and best-match selection - inst-match-headings-helpers
  • - p1 - Entry-point function signature for validate_headings_contract - inst-validate-headings-entry

Load Constraints

  • p1 - ID: cpt-cypilot-algo-traceability-validation-load-constraints

Input: Kit root path or raw TOML data

Output: KitConstraints object or list of parse errors

Steps:

  1. - p1 - Load constraints.toml from kit root (or from resolved resource binding path for manifest-driven kits), parse TOML, delegate to parse_kit_constraints - inst-load-toml
  2. - p1 - Parse kit constraints: iterate artifact kinds, parse headings, identifiers, TOC flag, normalize heading IDs and prev/next references - inst-parse-kit
  3. - p1 - Parse individual ID constraint: validate kind, required, name, template, examples, task, priority, to_code, headings, references - inst-parse-id-constraint
  4. - p1 - Parse heading constraint: validate level, pattern, description, required, multiple, numbered, id, prev/next, pointer - inst-parse-heading
  5. - p1 - Parse reference rule: validate coverage, task, priority, headings fields - inst-parse-ref-rule

Supporting:

  • - p1 - Examples parser, heading-constraint ID slugifier, and references map parser - inst-constraints-helpers
  • - p1 - Normalize heading IDs and validate prev/next references in parsed constraints - inst-constraints-normalize

Content Language Scan

  • p1 - ID: cpt-cypilot-algo-traceability-validation-lang-scan

Input: List of file/directory paths, list of allowed language codes

  1. - p1 - Define Unicode script ranges for each supported language code (Latin, Cyrillic, Arabic, CJK, etc.) - inst-script-ranges
  2. - p1 - Expose SUPPORTED_LANGUAGES constant — sorted list of all recognized language codes - inst-supported-langs
  3. - p1 - Define always-allowed common ranges (emoji, zero-width markers, BOM) - inst-common-ranges
  4. - p1 - Define skip patterns for fenced code blocks, HTML comments, and @cpt markers - inst-skip-patterns
  5. - p1 - Build merged sorted list of allowed Unicode ranges for the given language codes (build_allowed_ranges, is_allowed) - inst-range-helpers
  6. - p1 - Scan single file: skip fences and structural lines; IF file contains <!-- cpt-lang: ignore --> skip file entirely; collect lines with characters outside allowed ranges - inst-scan-file
  7. - p1 - Scan paths recursively: filter by extension (default .md), match each candidate path against optional ignore_globs (fnmatch), aggregate violations from non-ignored files - inst-scan-paths

Supporting:

  • - p1 - Imports and module-level type aliases - inst-lang-scan-imports
  • - p1 - LangScanError and LangViolation dataclass with bad_chars_preview and line_preview helpers - inst-violation-datamodel

Language Configuration

  • p1 - ID: cpt-cypilot-algo-traceability-validation-language-config

Input: Optional start path for project config lookup

  1. - p1 - Define extension-based comment format defaults for all supported languages - inst-lang-define-defaults
  2. - p1 - Load language config from project core.toml codeScanning section (with fallback to defaults) - inst-lang-load-config
  3. - p1 - Build regex patterns for @cpt-begin/@cpt-end markers using language-specific comment syntax - inst-lang-build-regex

Supporting:

  • - p1 - Imports, default constants, LanguageConfig class, default config factory, and extension-based comment merging helper - inst-lang-datamodel

4. States (CDSL)

Validation Report Lifecycle

  • p1 - ID: cpt-cypilot-state-traceability-validation-report

States: NOT_RUN, PASS, FAIL, ERROR

Initial State: NOT_RUN

Transitions:

  1. - p1 - FROM NOT_RUN TO PASS WHEN validation completes with zero errors (exit code 0) - inst-pass
  2. - p1 - FROM NOT_RUN TO FAIL WHEN validation completes with structural or traceability errors (exit code 2) - inst-fail
  3. - p1 - FROM NOT_RUN TO ERROR WHEN validation cannot run (no cypilot, missing config, exit code 1) - inst-error

5. Definitions of Done

Artifact Structural Validation

  • p1 - ID: cpt-cypilot-dod-traceability-validation-structure

The system MUST validate each artifact against its kit-defined constraints: heading contract (required sections, levels, patterns), ID format (cpt-{system}-{kind}-{slug}), priority marker presence, CDSL step consistency (checked parent implies checked steps), and parent-child checkbox consistency. Validation MUST produce errors with file path, line number, and actionable fixing prompts. Self-check MUST verify kit examples pass template validation before proceeding.

Implements:

  • cpt-cypilot-flow-traceability-validation-validate
  • cpt-cypilot-algo-traceability-validation-validate-structure

Covers (PRD):

  • cpt-cypilot-fr-core-traceability

Covers (DESIGN):

  • cpt-cypilot-component-validator
  • cpt-cypilot-principle-determinism-first

Cross-Artifact Reference Validation

  • p1 - ID: cpt-cypilot-dod-traceability-validation-cross-refs

The system MUST validate cross-artifact relationships: every ID reference resolves to a definition, checked references imply checked definitions, checked definitions imply checked references, and coverage rules from constraints.toml are enforced (required cross-references between artifact kinds). All consistency violations MUST include line numbers and artifact paths.

Implements:

  • cpt-cypilot-algo-traceability-validation-cross-validate

Covers (PRD):

  • cpt-cypilot-fr-core-traceability

Covers (DESIGN):

  • cpt-cypilot-component-validator
  • cpt-cypilot-component-traceability-engine
  • cpt-cypilot-principle-traceability-by-design

Code Traceability Validation

  • p1 - ID: cpt-cypilot-dod-traceability-validation-code

The system MUST scan code files for @cpt-* markers (scope markers and block markers), validate marker structure (pairing, no empty blocks, proper nesting), and cross-validate against artifact IDs: orphaned markers (code references non-existent ID), missing markers (to_code IDs without code markers), forbidden markers (to_code ID with unchecked task checkbox), and CDSL instruction-level cross-validation. DOCS-ONLY traceability mode MUST prohibit all code markers. Single-pass scanning MUST complete in ≤ 3 seconds per artifact.

Implements:

  • cpt-cypilot-algo-traceability-validation-scan-code
  • cpt-cypilot-algo-traceability-validation-cross-validate-code

Covers (PRD):

  • cpt-cypilot-fr-core-traceability
  • cpt-cypilot-fr-core-cdsl

Covers (DESIGN):

  • cpt-cypilot-component-traceability-engine
  • cpt-cypilot-component-validator
  • cpt-cypilot-principle-ci-automation-first
  • cpt-cypilot-constraint-no-weakening

Traceability Query Commands

  • p1 - ID: cpt-cypilot-dod-traceability-validation-queries

The system MUST provide CLI commands for navigating the ID graph: list-ids [--kind K] [--pattern P] (list definitions matching criteria), where-defined --id <id> (find definition location), where-used --id <id> (find all references), get-content --id <id> (extract content block). All commands MUST output JSON, scan all registered artifacts, and use exit codes 0 (found) / 2 (not found).

Implements:

  • cpt-cypilot-flow-traceability-validation-query
  • cpt-cypilot-algo-traceability-validation-scan-ids

Covers (PRD):

  • cpt-cypilot-fr-core-traceability

Covers (DESIGN):

  • cpt-cypilot-component-traceability-engine
  • cpt-cypilot-seq-traceability-query

CDSL Instruction Tracking

  • p1 - ID: cpt-cypilot-dod-traceability-validation-cdsl

The system MUST scan CDSL instruction markers (inst-{slug} suffixes in numbered list items) from FEATURE artifacts, associate each instruction with its parent ID, track checked/unchecked status, and cross-validate against @cpt-begin/@cpt-end block markers in code. Missing implementations and orphaned code blocks MUST both produce errors.

Implements:

  • cpt-cypilot-algo-traceability-validation-scan-cdsl

Covers (PRD):

  • cpt-cypilot-fr-core-cdsl

Covers (DESIGN):

  • cpt-cypilot-component-validator
  • cpt-cypilot-component-traceability-engine

6. Implementation Modules

Module Path Responsibility
Validate Command skills/.../commands/validate.py Main validation orchestration, context loading, report generation
Validate TOC skills/.../commands/validate_toc.py TOC consistency validation
List IDs skills/.../commands/list_ids.py List ID definitions matching criteria
List ID Kinds skills/.../commands/list_id_kinds.py List all ID kind tokens found in artifacts
Get Content skills/.../commands/get_content.py Extract content block for a specific ID
Where Defined skills/.../commands/where_defined.py Find where an ID is defined
Where Used skills/.../commands/where_used.py Find all references to an ID
Document Utils skills/.../utils/document.py ID scanning, CDSL instruction scanning
Constraints Utils skills/.../utils/constraints.py Constraint loading, heading validation, cross-validation
Codebase Utils skills/.../utils/codebase.py Code file scanning, @cpt-* marker validation
Error Codes skills/.../utils/error_codes.py Stable error codes for validation issues
Fixing Utils skills/.../utils/fixing.py Fixing prompt generation for LLM agents
Language Config skills/.../utils/language_config.py Language-specific file extensions and comment patterns
Parsing Utils skills/.../utils/parsing.py Markdown structure parsing, section extraction

7. Acceptance Criteria

  • cpt validate validates all registered artifacts and produces JSON report with PASS/FAIL status
  • cpt validate --artifact <path> validates a single artifact against its constraints
  • Heading contract validation catches missing required sections and wrong heading levels
  • ID format validation catches malformed cpt-* identifiers with line numbers
  • Cross-artifact validation catches undefined references, checked/unchecked mismatches, and coverage gaps
  • Code traceability validation catches orphaned markers, missing to_code markers, and unchecked-task markers
  • CDSL instruction tracking catches missing @cpt-begin/@cpt-end blocks and orphaned code blocks
  • DOCS-ONLY mode prohibits all @cpt-* code markers
  • cpt list-ids, where-defined, where-used, get-content return correct JSON results
  • Validation of a single artifact completes in ≤ 3 seconds
  • Full project validation (all artifacts + code) completes in ≤ 10 seconds for typical repositories
  • All validation errors include file path, line number, and actionable fixing prompt
  • All commands output JSON to stdout and use exit codes 0/1/2