# SentinelScan

SentinelScan is a Python CLI static-analysis tool that scans Python codebases for hardcoded secrets.

It uses Python AST parsing to inspect real assignment statements, extracts candidate values, applies a modular rule engine, and reports findings with structured metadata such as rule ID, severity, reason, variable name, file path, and line number.
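The candidate-extraction step described above can be sketched roughly as follows. This is a minimal illustration built on the standard `ast` module, not the project's actual code: the `Candidate` fields and the helper names `target_name` and `extract_candidates` are assumptions made for the example.

```python
from __future__ import annotations

import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate record; the real field names may differ."""

    var_name: str
    value: str
    line: int


def target_name(node: ast.expr) -> str | None:
    """Return a dotted name for Name/Attribute targets, or None otherwise."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = target_name(node.value)
        return f"{base}.{node.attr}" if base else None
    return None  # e.g. subscripts or tuple targets are skipped


def extract_candidates(source: str) -> list[Candidate]:
    """Collect assignments whose right-hand side is a string literal."""
    candidates = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        value = node.value
        if not (isinstance(value, ast.Constant) and isinstance(value.value, str)):
            continue
        # `a = b = "x"` produces one Assign node with two targets.
        for target in node.targets:
            name = target_name(target)
            if name is not None:
                candidates.append(Candidate(name, value.value, node.lineno))
    return candidates
```

Because only real `ast.Assign` nodes are inspected, text that merely looks like an assignment inside a comment or string literal is never treated as a candidate.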

SentinelScan is a lightweight, learning-focused project that demonstrates AST parsing, modular rule evaluation, structured findings, CLI and JSON output, redaction, testing, linting, and CI.

> SentinelScan is not a replacement for mature secret-scanning tools. It is an educational static-analysis project intended to build and demonstrate security engineering concepts.

---

## Features

- Recursively scans directories for Python files (`.py`)
- Uses Python AST parsing to analyze real assignment statements
- Extracts candidates from source code before rule evaluation
- Uses structured dataclass models:
  - `Rule`
  - `Candidate`
  - `Finding`
- Uses a modular rule engine for scalable detection logic
- Detects hardcoded secrets assigned to:
  - Simple variables
  - Object attributes
  - Nested attributes
  - Multiple assignment targets
- Detects common secret types:
  - Passwords
  - API keys
  - Tokens
  - AWS access keys
  - Generic secrets
- Supports value-pattern-based detection for structured secrets such as AWS access keys
- Supports variable-name-based detection for names such as `password`, `pwd`, `passwd`, `api_key`, `apikey`, `token`, and `secret`
- Applies minimum-length filtering to reduce obvious false positives
- Assigns severity levels such as `HIGH` and `MEDIUM`
- Provides detection reasons explaining why each finding was flagged
- Supports human-readable CLI output
- Supports machine-readable JSON output with `--json`
- Supports severity filtering with `--severity`
- Supports secret redaction with `--redact`
- Supports combined flags such as `--json --severity HIGH --redact`
- Handles syntax errors gracefully
- Reads files as UTF-8 and ignores undecodable bytes instead of failing on encoding errors
- Includes pytest-based unit and CLI tests
- Uses Ruff for linting/code quality
- Uses GitHub Actions CI to run automated checks
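As a rough illustration of how name-based rules, value-pattern rules, and minimum-length filtering from the list above might fit together, here is a minimal sketch. It is not the project's rule engine: the `Rule` and `Finding` fields, every rule ID other than `PASSWORD`, the regular expressions, the minimum length of 4, and the first-match behavior are all assumptions made for the example.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule shape; the project's real fields may differ."""

    rule_id: str
    name: str
    severity: str
    reason: str
    name_pattern: re.Pattern[str] | None = None
    value_pattern: re.Pattern[str] | None = None
    min_length: int = 0


@dataclass(frozen=True)
class Finding:
    rule_id: str
    rule: str
    severity: str
    var_name: str
    value: str
    reason: str


# Illustrative rules only; patterns and thresholds are assumed values.
RULES = [
    Rule(
        rule_id="AWS_ACCESS_KEY",
        name="AWS Access Key",
        severity="HIGH",
        reason="value matched AKIA-prefixed AWS access key pattern",
        value_pattern=re.compile(r"AKIA[0-9A-Z]{16}"),
    ),
    Rule(
        rule_id="PASSWORD",
        name="Password",
        severity="HIGH",
        reason="variable name matched password/pwd/passwd pattern "
        "and value met minimum length",
        name_pattern=re.compile(r"password|passwd|pwd", re.IGNORECASE),
        min_length=4,
    ),
    Rule(
        rule_id="TOKEN",
        name="Token",
        severity="MEDIUM",
        reason="variable name matched token pattern and value met minimum length",
        name_pattern=re.compile(r"token", re.IGNORECASE),
        min_length=4,
    ),
]


def apply_rules(var_name: str, value: str) -> Finding | None:
    """Return a finding for the first matching rule, or None."""
    for rule in RULES:
        if rule.name_pattern is not None and not rule.name_pattern.search(var_name):
            continue
        if rule.value_pattern is not None and not rule.value_pattern.fullmatch(value):
            continue
        if len(value) < rule.min_length:
            continue
        return Finding(
            rule.rule_id, rule.name, rule.severity, var_name, value, rule.reason
        )
    return None
```

Keeping rules as plain data means a new secret type can be added by appending a `Rule` rather than by changing the matching loop. Ordering the list from most to least specific is one simple way to decide which rule wins when several could match.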

---

## Requirements

- Python 3.11+
- `pytest` for testing (development only)
- `ruff` for linting and formatting (development only)
- No third-party runtime dependencies

---

## Installation

Clone the repository:

```bash
git clone https://github.com/Saharsh1123/SentinelScan.git
cd SentinelScan
```

Optional: create and activate a virtual environment.

```bash
python3 -m venv venv
source venv/bin/activate
```

Install test/development dependencies:

```bash
python3 -m pip install pytest ruff
```

---

## Quick Start

Scan a directory:

```bash
python3 main.py ./your_directory
```

Scan the included test fixture directory:

```bash
python3 main.py test_dirs
```

Output JSON:

```bash
python3 main.py test_dirs --json
```

Filter by severity:

```bash
python3 main.py test_dirs --severity HIGH
```

Redact detected values:

```bash
python3 main.py test_dirs --redact
```

Use JSON, severity filtering, and redaction together:

```bash
python3 main.py test_dirs --json --severity HIGH --redact
```

For full usage documentation, see [`docs/USAGE.md`](docs/USAGE.md).
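For readers curious how the flags shown in this section could be declared, the following is a minimal `argparse` sketch. It is an assumption about structure rather than the contents of `cli.py`: the function name `build_parser`, the help strings, and the upper-casing of the severity value are all hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags shown in Quick Start (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Scan Python files for hardcoded secrets.",
    )
    parser.add_argument("path", help="directory to scan")
    parser.add_argument(
        "--json", action="store_true", help="print findings as JSON only"
    )
    # Assumption: severity is normalized to upper case; validation of the
    # allowed levels is left out of this sketch.
    parser.add_argument(
        "--severity", type=str.upper, default=None, help="only report this severity"
    )
    parser.add_argument(
        "--redact", action="store_true", help="hide detected secret values"
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Because each option is independent, combinations such as `--json --severity HIGH --redact` need no special handling in the parser itself.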

---

## Example CLI Output

```bash
python3 main.py test_dirs
```

```text
Scanning 4 Python files...

--- Findings ---

[HIGH] test_dirs/edge_repo/edge_test.py:1 Password → hjkl
       Reason: variable name matched password/pwd/passwd pattern and value met minimum length

[HIGH] test_dirs/test_repo/open_vulns.py:4 AWS Access Key → AKIAEXAMPLE123456789
       Reason: value matched AKIA-prefixed AWS access key pattern

[MEDIUM] test_dirs/test_repo/open_vulns.py:5 Token → xyzttttggfdddf
       Reason: variable name matched token pattern and value met minimum length

Total findings: 6
```

The listing above shows only three of the six findings counted in the final line.

---

## Example JSON Output

```bash
python3 main.py test_dirs --json
```

```json
[
  {
    "line": 1,
    "file": "test_dirs/edge_repo/edge_test.py",
    "var_name": "password",
    "rule_id": "PASSWORD",
    "rule": "Password",
    "severity": "HIGH",
    "value": "hjkl",
    "reason": "variable name matched password/pwd/passwd pattern and value met minimum length"
  }
]
```

JSON mode prints only valid JSON, without CLI headers, summaries, or human-readable formatting.
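A minimal sketch of how JSON rendering, severity filtering, and redaction could be combined is shown below. The `Finding` fields mirror the keys in the example output above, but the function name `render_json`, the exact-match severity filter, and the `[REDACTED]` placeholder are assumptions; the project's actual behavior may differ.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Finding:
    """Fields follow the keys shown in the example JSON output."""

    line: int
    file: str
    var_name: str
    rule_id: str
    rule: str
    severity: str
    value: str
    reason: str


def render_json(
    findings: list[Finding],
    severity: str | None = None,
    redact: bool = False,
) -> str:
    """Serialize findings to a JSON array, optionally filtered and redacted."""
    records = []
    for finding in findings:
        # Assumption: the filter keeps only findings at exactly this level.
        if severity is not None and finding.severity != severity:
            continue
        record = asdict(finding)
        if redact:
            # Assumption: the whole value is replaced by a fixed placeholder.
            record["value"] = "[REDACTED]"
        records.append(record)
    return json.dumps(records, indent=2)
```

Returning a string and letting the caller print it keeps headers and summaries out of the JSON path, so the output can be piped straight into tools such as `json.loads` or `jq`.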

---

## Documentation

Detailed documentation is split into separate files:

| Document | Purpose |
|---|---|
| [`docs/USAGE.md`](docs/USAGE.md) | CLI flags, examples, JSON output, redaction, and severity filtering |
| [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md) | Project pipeline, module responsibilities, and dataclass model flow |
| [`docs/DETECTION_RULES.md`](docs/DETECTION_RULES.md) | Built-in rules, assignment support, detection reasons, and limitations |
| [`docs/TESTING.md`](docs/TESTING.md) | Test coverage, pytest usage, CLI tests, and CI behavior |
| [`docs/DEVELOPMENT.md`](docs/DEVELOPMENT.md) | Setup, Ruff, local checks, project structure, and contribution workflow |
| [`docs/ROADMAP.md`](docs/ROADMAP.md) | Current limitations and future improvements |

---

## Project Structure

```text
SentinelScan/
├── .github/
│   └── workflows/
│       └── tests.yaml              # GitHub Actions workflow for Ruff and pytest
├── detectors/
│   ├── __init__.py                 # Makes detectors an importable package
│   ├── ast_analyzer.py             # AST parsing and candidate extraction
│   ├── find_secrets.py             # High-level detection orchestration
│   ├── models.py                   # Rule, Candidate, and Finding dataclasses
│   ├── rule_engine.py              # Applies rules to candidates
│   └── rules.py                    # Built-in rule definitions
├── test_dirs/
│   ├── edge_repo/
│   │   └── edge_test.py            # Edge-case fixture file
│   └── test_repo/
│       ├── embedded_test/
│       │   └── embedded_hello.py   # Nested fixture file
│       ├── hello.py                # Benign fixture file
│       └── open_vulns.py           # Fixture file containing sample vulnerabilities
├── tests/
│   ├── test_apply_rules.py         # Unit tests for rule engine behavior
│   ├── test_ast.py                 # Tests for AST-based detection
│   └── test_cli.py                 # CLI integration tests
├── .gitignore
├── cli.py                          # CLI argument parsing
├── LICENSE
├── main.py                         # Application entry point
├── output.py                       # JSON/text output formatting, filtering, and redaction
├── pytest.ini                      # Pytest import path configuration
├── README.md                       # Project overview
└── scanner.py                      # Path validation, file discovery, file reading, and scan orchestration
```

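The tree above omits some files present in the repository, including `docs/`, `ignore.py`, `inline_ignore.py`, `.sentinelscanignore`, and `sentinelscan.json`.

Generated local files such as `__pycache__/`, `.pytest_cache/`, `.ruff_cache/`, and `venv/` may appear during development but should not be committed.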
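The file discovery and file reading responsibilities listed for `scanner.py` can be approximated with the standard library alone. The sketch below is an illustration under assumptions, not the module's real code: the function names `discover_python_files` and `parse_file` are hypothetical, and returning `None` for unparsable files is just one way to handle syntax errors without aborting a scan.

```python
from __future__ import annotations

import ast
from pathlib import Path


def discover_python_files(root: Path) -> list[Path]:
    """Recursively collect .py files under root in a stable order."""
    return sorted(path for path in root.rglob("*.py") if path.is_file())


def parse_file(path: Path) -> ast.Module | None:
    """Parse one file, tolerating bad bytes and invalid syntax."""
    # errors="ignore" drops bytes that are not valid UTF-8 instead of raising.
    source = path.read_text(encoding="utf-8", errors="ignore")
    try:
        return ast.parse(source, filename=str(path))
    except (SyntaxError, ValueError):
        # Assumption: unparsable files are skipped rather than reported.
        return None


if __name__ == "__main__":
    for file_path in discover_python_files(Path("test_dirs")):
        status = "ok" if parse_file(file_path) is not None else "skipped"
        print(f"{file_path}: {status}")
```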

---

## Testing and Linting

Run tests:

```bash
pytest
```

Run Ruff:

```bash
ruff check .
```

Format code:

```bash
ruff format .
```

Recommended local checks before committing:

```bash
ruff check .
pytest
python3 main.py test_dirs
python3 main.py test_dirs --json
python3 main.py test_dirs --json --severity HIGH
python3 main.py test_dirs --json --severity HIGH --redact
```

---

## Current Limitations

- Only scans Python files (`.py`)
- Only analyzes assignment statements supported by the current AST logic
- Does not currently support dictionary/subscript assignments such as `config["password"] = "value"`
- Does not currently detect secrets passed directly into function calls
- Does not currently perform entropy scoring, confidence scoring, data-flow analysis, or taint analysis
- Detection rules may still produce false positives or false negatives
- Redaction is optional, so use `--redact` when sharing scan output

For more detail, see [`docs/ROADMAP.md`](docs/ROADMAP.md).
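To see why subscript assignments and function-call arguments fall outside assignment-focused analysis, it helps to look at the AST shapes Python produces for each form. The helper below is a small illustration using only the standard `ast` module; the name `describe_statement` is hypothetical and not part of SentinelScan.

```python
import ast


def describe_statement(source: str) -> str:
    """Summarize the AST shape of the first statement in `source`."""
    node = ast.parse(source).body[0]
    if isinstance(node, ast.Assign):
        kinds = ", ".join(type(target).__name__ for target in node.targets)
        return f"Assign -> {kinds}"
    if isinstance(node, ast.Expr) and isinstance(node.value, ast.Call):
        return "Expr -> Call"
    return type(node).__name__


if __name__ == "__main__":
    for snippet in (
        'password = "hjkl"',
        'self.db.password = "hjkl"',
        'config["password"] = "value"',
        'connect(password="hjkl")',
    ):
        print(f"{snippet:32} {describe_statement(snippet)}")
```

A dictionary assignment is still an `Assign` node, but its target is an `ast.Subscript` rather than an `ast.Name` or `ast.Attribute`, so the key has to be read from the subscript slice. A secret passed as a keyword argument is not an assignment at all: it lives inside an `ast.Call` node.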

---

## License

This project is released under the MIT license (see `LICENSE`). It is intended for educational and demonstration purposes.