Skip to content

refactor: consolidate input validation into src/validate.rs#305

Merged
jamesadevine merged 2 commits into
mainfrom
refactor/consolidate-input-validation
Apr 22, 2026
Merged

refactor: consolidate input validation into src/validate.rs#305
jamesadevine merged 2 commits into
mainfrom
refactor/consolidate-input-validation

Conversation

@jamesadevine

Copy link
Copy Markdown
Collaborator

Summary

Consolidates scattered input validation primitives from engine.rs and compile/common.rs into a single src/validate.rs module.

Problem

Input validation was duplicated across 3+ files:

  • engine.rs had 5 private is_valid_* allowlist functions + inline ADO/pipeline injection checks
  • compile/common.rs had ~15 validation functions including duplicates of env-var-name and ADO expression checks
  • sanitize.rs had a dead sanitize_light() function

Changes

Created src/validate.rs — single source of truth for ~20 structural validators:

  • 5 character allowlist validators moved from engine.rs
  • 3 format validators moved from compile/common.rs
  • 6 container/Docker/MCP validators moved from compile/common.rs
  • New shared predicates: contains_ado_expression, contains_pipeline_command, contains_newline, reject_pipeline_injection, validate_dns_domain

Simplified engine.rs — imports from validate.rs, uses shared predicates instead of inline checks.

Simplified compile/common.rsvalidate_front_matter_identity() reduced from ~70 lines to ~12 lines.

Removed dead codesanitize_light() and its tests from sanitize.rs.

Design decisions

  • Engine-specific policy (BLOCKED_ARG_PREFIXES, BLOCKED_ENV_KEYS) stays in engine.rs
  • Orchestration functions stay in compile/common.rs
  • SanitizeConfig/SanitizeContent traits stay in sanitize.rs (content transformation ≠ structural validation)

Verification

  • All 976 tests pass
  • Zero new clippy warnings

Move scattered validation primitives (character allowlists, format
validators, injection detectors, container/Docker/DNS validators)
from engine.rs and compile/common.rs into a single validate.rs module.

- Move 5 is_valid_* allowlist functions from engine.rs
- Move is_valid_parameter_name, is_valid_env_var_name, is_safe_tool_name
  from compile/common.rs
- Move container/Docker/MCP validators from compile/common.rs
- Add shared predicates: contains_ado_expression, contains_pipeline_command,
  contains_newline, reject_pipeline_injection, validate_dns_domain
- Simplify validate_front_matter_identity (~70 → ~12 lines)
- Remove dead sanitize_light() from sanitize.rs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Looks good — clean consolidation with faithful logic preservation. Two minor issues worth addressing.

Findings

⚠️ Suggestions

  • src/compile/common.rs (lines ~134, ~1140) — Two comment blocks were left behind that are now misleading:

    // Character allowlist validators, format validators, and injection
    // detectors are centralised in crate::validate.  The functions below
    // are thin wrappers or direct call-through to that module.

    and

    // Container, Docker, MCP, and env-var-name validators are centralised
    // in `crate::validate`.  The following re-exports preserve the `pub`
    // surface for callers within the compile module.

    Neither block is followed by actual wrappers or re-exports — the next thing after each is a real function (generate_parameters / generate_mcpg_config). Future readers looking for the "re-exports" will be confused. These should be removed or rewritten as simple // Validation calls delegate to crate::validate. markers at the call sites.

  • src/engine.rs (test line ~931) — The empty-key error message changed from "empty key" to "not a valid environment variable name". The test was updated to match, so there's no test regression, but the original message was more actionable for users who accidentally leave an empty key in their engine.env: block. Worth re-adding a special-case in is_valid_env_var_name's caller, or amending copilot_env to explicitly bail with the old message when key.is_empty().

✅ What Looks Good

  • De Morgan rewrite in validate_dns_domain is logically identical to the original: !(A && B)!A || !B. No behavioural change.
  • splitn(2,'=').nth(1)split_once('=').map_or is a correct and more idiomatic replacement.
  • map_or(false, ...)is_some_and(...) is valid for Rust 2024 edition.
  • sanitize_light removal is justified — it had no callers outside its own tests.
  • New validate.rs is well-structured with clear sections and good test coverage. The colocation of unit tests inside the module matches project conventions.
  • All 976 tests passing per the PR description validates the refactor didn't break any logic paths.

Generated by Rust PR Reviewer for issue #305 · ● 621.9K ·

- Remove misleading 're-export' and 'thin wrapper' comments from
  common.rs that no longer had corresponding code beneath them
- Restore explicit empty-key check in copilot_env() for a clearer
  error message ('empty key' vs generic 'not a valid env var name')

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jamesadevine jamesadevine merged commit 0910e1e into main Apr 22, 2026
4 checks passed
@github-actions

Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Clean consolidation — one clear documentation/implementation mismatch to fix, otherwise well-structured.


Findings

🐛 Bugs / Logic Issues

  • src/validate.rs:137-156reject_pipeline_injection's docstring says it rejects "pipeline commands (##vso[, ##[)" but the implementation never calls contains_pipeline_command. Only ADO expressions and newlines are checked. The test at line 493 also doesn't cover the ##vso[ case, confirming the omission.

    This is a doc/code contract mismatch. Future callers relying on the docstring will be surprised. Either add the missing check:

    if contains_pipeline_command(value) {
        anyhow::bail!(
            "Front matter '{}' contains an ADO pipeline command ('##vso[' or '##[') ...",
            field_name
        );
    }

    ...or rename the function to reject_ado_expressions_and_newlines to match what it actually does. (Note: the original validate_front_matter_identity code also didn't check ##vso[ for these fields, so this is not a regression — just a misleading name/docstring introduced in this PR.)

✅ What Looks Good

  • Behavior parity confirmed: the refactored validate_front_matter_identity is semantically equivalent to the removed ~70-line inline version — checked both the removed and added lines carefully.
  • $[ now covered: contains_ado_expression newly catches $[ (runtime expressions), closing a small gap vs the original inline ${{ + $( checks.
  • Case-insensitive mount path detection (to_lowercase()) in validate_mount_source — catches /ETC/shadow-style path evasion correctly.
  • is_some_and over map_or(false, ...) — idiomatic for Rust ≥ 1.70, which this 2024-edition crate satisfies.
  • Solid test coverage across all 20+ new validators; edge cases like *evil*, trailing-* wildcards, and smuggled -v volume flags are all exercised.
  • DANGEROUS_DOCKER_FLAGS constant is a clean way to keep the flag blocklist auditable in one place.

Generated by Rust PR Reviewer for issue #305 · ● 1.1M ·

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant