Commit 2611d95

2 parents 43377a8 + f8eca3f commit 2611d95

3 files changed

Lines changed: 158 additions & 6 deletions

File tree

NOTICE.md

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
## Repository Repurposed

This repository has been **repurposed**.
Originally, it contained a small experimental script with no real usage or community activity.

As of 13/09/2025 (DD/MM/YYYY), the repository has been **reset and transformed** into a **new, professional project**: PySpector, which is **completely different** from the original content.

The star count and forks have been preserved for continuity, but please note that they refer to the old repository state.

If you are here for **PySpector**, you are in the right place :)

The code, documentation, and roadmap you see now are **the new software**, actively maintained.

Final note: some forks of this repository may still contain the old code, but they are unrelated to the current project.

README.md

Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
# PySpector: A High-Performance Python and Rust SAST Framework

![PySpector's version](https://img.shields.io/badge/PySpector%20version-0.1.0--beta-blue)

PySpector is a static application security testing (SAST) framework engineered for modern Python development workflows. It pairs a Rust core, which performs fast vulnerability scanning, with a developer-friendly Python CLI. Because the analysis engine is compiled to a native binary, PySpector avoids much of the interpreter overhead of pure-Python tools, which makes it well suited to CI/CD pipelines and local development environments where speed is critical.

The tool is designed to be both comprehensive and intuitive, offering a multi-layered analysis approach that goes beyond simple pattern matching to understand the structure and data flow of your application.


## Getting Started

### Prerequisites

- **Python**: Version 3.8 through 3.12.
- **Rust**: The Rust compiler (`rustc`) and the Cargo package manager are required. You can verify your installation by running `cargo --version`.

### Installation

1. **Create a Virtual Environment**: It is highly recommended to install PySpector in a dedicated virtual environment.
   ```bash
   python3.12 -m venv venv
   source venv/bin/activate
   ```
2. **Install Build Dependencies**: PySpector uses `maturin` to build its Rust core.
   ```bash
   pip install maturin setuptools-rust
   ```
3. **Install PySpector**: From the root of the project repository, install the package. This compiles the Rust core and installs the Python wrapper.
   ```bash
   pip install .
   ```


## Key Features

* **Multi-Layered Analysis Engine:** PySpector employs a sophisticated, multi-layered approach to detect a broad spectrum of vulnerabilities:

  * **Regex-Based Pattern Matching:** Scans all files for specific patterns, ideal for identifying hardcoded secrets, insecure configurations in Dockerfiles, and weak settings in framework files.

  * **Abstract Syntax Tree (AST) Analysis:** For Python files, the tool parses the code into an AST to analyze its structure. This enables precise detection of vulnerabilities tied to code constructs, such as the use of `eval()`, insecure deserialization with `pickle`, or weak hashing algorithms.

  * **Inter-procedural Taint Analysis:** The engine builds a comprehensive call graph of the entire application to perform taint analysis. It tracks the flow of data from input sources (like web requests) to dangerous sinks (like command execution functions), allowing it to identify complex injection vulnerabilities with high accuracy.

* **Comprehensive and Customizable Ruleset:** PySpector comes with 238 built-in rules that cover common vulnerabilities, including those from the OWASP Top 10. The rules are defined in a simple TOML format, making them easy to understand and extend.

* **Versatile Reporting:** Generates clear and actionable reports in multiple formats, including a developer-friendly console output, JSON, HTML, and SARIF for seamless integration with other security tools and platforms.

* **Efficient Baselining:** The interactive triage mode simplifies the process of establishing a security baseline, allowing teams to focus on new and relevant findings in each scan.
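
The AST layer described in the feature list can be illustrated with Python's standard `ast` module. This is only a minimal sketch of the idea, not PySpector's engine, which runs in Rust and is driven by TOML rules. The `DANGEROUS_CALLS` set and the names `dotted_name` and `find_dangerous_calls` are hypothetical.

```python
import ast

# Hypothetical deny-list for illustration only; the real rules live in TOML.
DANGEROUS_CALLS = {"eval", "exec", "pickle.loads"}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_dangerous_calls(source):
    """Return sorted (line, call_name) pairs for calls on the deny-list."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in DANGEROUS_CALLS:
                findings.append((node.lineno, name))
    return sorted(findings)


if __name__ == "__main__":
    sample = "import pickle\nx = eval(data)\ny = pickle.loads(blob)\n"
    for line, name in find_dangerous_calls(sample):
        print(f"line {line}: call to {name}")
```

Matching on the syntax tree rather than on raw text means a string literal or comment that merely mentions `eval` is not reported, which is the precision gain the AST layer is meant to provide over the regex layer.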

## How It Works

PySpector's hybrid architecture is key to its performance and effectiveness.

* **Python CLI Orchestration:** The process begins with the Python-based CLI. It handles command-line arguments, loads the configuration and rules, and prepares the target files for analysis. For each Python file, it uses the native `ast` module to generate an Abstract Syntax Tree, which is then serialized to JSON.

* **Invocation of the Rust Core:** The serialized ASTs, along with the ruleset and configuration, are passed to the compiled Rust core. The handoff from Python to Rust is managed by the `pyo3` library.

* **Parallel Analysis in Rust:** The Rust engine takes over and performs the heavy lifting. It leverages the `rayon` crate to execute file scans and analysis in parallel, maximizing the use of available CPU cores. It builds a complete call graph of the application to understand inter-file function calls, which is essential for the taint analysis module.

* **Results and Reporting:** Once the analysis is complete, the Rust core returns a structured list of findings to the Python CLI. The Python wrapper then handles the final steps of filtering the results based on the severity threshold and the baseline file, and generating the report in the user-specified format.

This architecture combines the best of both worlds: a flexible, user-friendly interface in Python and a high-performance, memory-safe analysis engine in Rust :)
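
The first step above, turning a Python AST into JSON before handing it to the native core, might look roughly like the following. The dictionary layout (`node_type`, `lineno`) and the function names are assumptions made for illustration; the schema PySpector actually passes to Rust is not described in this README.

```python
import ast
import json


def ast_to_dict(node):
    """Recursively convert an ast node into JSON-serializable data."""
    if isinstance(node, ast.AST):
        result = {"node_type": type(node).__name__}
        for field, value in ast.iter_fields(node):
            result[field] = ast_to_dict(value)
        # Line numbers are attributes, not fields, so copy them explicitly.
        if hasattr(node, "lineno"):
            result["lineno"] = node.lineno
        return result
    if isinstance(node, list):
        return [ast_to_dict(item) for item in node]
    # Constant values JSON cannot represent are stored as their repr().
    if isinstance(node, (bytes, complex)) or node is Ellipsis:
        return repr(node)
    return node  # str, int, float, bool, or None


def serialize_source(source):
    """Parse Python source and return its AST as a JSON string."""
    return json.dumps(ast_to_dict(ast.parse(source)))


if __name__ == "__main__":
    print(serialize_source("eval(user_input)"))
```

Serializing to plain JSON keeps the Python side free of any Rust-specific types: the receiving side only needs a JSON parser and a matching set of node definitions.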

## Usage

PySpector is operated through a straightforward command-line interface.

### Running a Scan

The primary command is `scan`, which can target a local file, a directory, or even a remote Git repository.

```bash
pyspector scan [PATH or --url REPO_URL] [OPTIONS]
```

### Examples

* **Scan a single file:**
  ```bash
  pyspector scan project/main.py
  ```

* **Scan a local directory and save the report as HTML:**
  ```bash
  pyspector scan /path/to/your/project -o report.html -f html
  ```

* **Scan a public GitHub repository:**
  ```bash
  pyspector scan --url https://github.com/username/repo.git
  ```


## Triaging and Baselining Findings
<img width="871" height="950" alt="image" src="https://github.com/user-attachments/assets/5f31c2fc-9216-408e-975f-a1652c6bbdc7" />

PySpector includes an interactive triage mode to help manage and baseline findings. This allows you to review issues and mark them as "ignored" so they don't appear in future scans.

* **Generate a JSON report:**
  ```bash
  pyspector scan /path/to/your/project -o report.json -f json
  ```

* **Start the triage TUI:**
  ```bash
  pyspector triage report.json
  ```

Inside the TUI, you can navigate with the arrow keys, press `i` to toggle the "ignored" status of an issue, and press `s` to save your changes to a `.pyspector_baseline.json` file. This baseline file will be automatically loaded on subsequent scans.
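
To make the baselining idea concrete, here is a small sketch of how ignored findings could be filtered out on a later scan. The layout of `.pyspector_baseline.json` is not documented in this README, so everything here is an assumption: the `ignored` key, the finding fields (`rule_id`, `file`, `code`), and the fingerprint scheme are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def fingerprint(finding):
    """Hash the parts of a finding that should stay stable between scans."""
    # Hypothetical fields; line numbers are left out so that unrelated
    # edits above a finding do not invalidate its baseline entry.
    key = f"{finding['rule_id']}|{finding['file']}|{finding['code']}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def load_baseline(path):
    """Return the set of ignored fingerprints, or an empty set if no file exists."""
    path = Path(path)
    if not path.exists():
        return set()
    data = json.loads(path.read_text(encoding="utf-8"))
    return set(data.get("ignored", []))


def filter_findings(findings, ignored):
    """Drop every finding whose fingerprint is present in the baseline."""
    return [f for f in findings if fingerprint(f) not in ignored]
```

Keying the baseline on a content hash rather than on a line number is one common design choice: a finding stays suppressed when the file is reformatted, but reappears if the flagged code itself changes.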

## Automation and Integration

PySpector includes shell helper scripts to integrate security scanning directly into your development and operational workflows.

### Git Pre-Commit Hook

To ensure that no new high-severity issues are introduced into the codebase, you can set up a Git pre-commit hook. This hook automatically scans staged Python files before each commit and blocks the commit if any HIGH or CRITICAL issues are found.

**To set up the hook, run the following script from the root of your Git repository:**
```bash
./scripts/setup_hooks.sh
```
This script creates an executable `.git/hooks/pre-commit` file that performs the check. You can bypass the hook for a specific commit by passing the `--no-verify` flag to `git commit`.

### Scheduled Scans with Cron

For continuous monitoring, you can schedule regular scans of your projects using a cron job. PySpector provides an interactive script to help you generate the correct crontab entry.

**To generate your cron job command, run:**
```bash
./scripts/setup_cron.sh
```
The script prompts you for the project path, the desired scan frequency (daily, weekly, or monthly), and a location to store the JSON reports. It then outputs the command to add to your crontab, automating your security scanning and reporting process.


src/pyspector/config.py

Lines changed: 11 additions & 6 deletions
@@ -1,12 +1,12 @@
 from pathlib import Path
-import toml
-import click
+import toml # type: ignore
+import click # type: ignore
 try:
     # Python 3.9+
     import importlib.resources as pkg_resources
 except ImportError:
     # Fallback for older Python versions
-    import importlib_resources as pkg_resources
+    import importlib_resources as pkg_resources # type: ignore
 
 DEFAULT_CONFIG = {
     "exclude": [
@@ -28,10 +28,15 @@ def load_config(config_path: Path) -> dict:
         click.echo(click.style(f"Warning: Could not parse config file '{config_path}'. Using defaults. Error: {e}", fg="yellow"))
         return DEFAULT_CONFIG
 
-def get_default_rules() -> str:
+def get_default_rules(ai_scan: bool = False) -> str:
     """Loads the built-in TOML rules file from package resources."""
     try:
-        # CORRECTED PATH: Look for the 'rules' sub-package within the main 'pyspector' package.
-        return pkg_resources.files('pyspector.rules').joinpath('built-in-rules.toml').read_text(encoding='utf-8')
+        base_rules = pkg_resources.files('pyspector.rules').joinpath('built-in-rules.toml').read_text(encoding='utf-8')
+        if ai_scan:
+            click.echo("[*] AI scanning enabled. Loading additional AI/LLM rules.")
+            ai_rules = pkg_resources.files('pyspector.rules').joinpath('built-in-rules-ai.toml').read_text(encoding='utf-8')
+            # Combine the two rulesets
+            return base_rules + "\n" + ai_rules
+        return base_rules
     except Exception as e:
         raise FileNotFoundError(f"Could not load built-in-rules.toml from package data! Error: {e}")
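
A note on the string concatenation in `get_default_rules`: joining two TOML documents as text stays valid only while the second file never redefines a plain key or `[table]` from the first, since TOML rejects duplicate definitions. Repeated `[[array-of-tables]]` entries are fine. If the two rule files ever diverge from that shape, an alternative is to parse each file separately and merge the resulting dictionaries. The sketch below shows that merge on already-parsed data; the function name `merge_rulesets` and the `rule` key in the usage example are hypothetical, and the real structure of the rule files is not visible in this diff.

```python
def merge_rulesets(base, extra):
    """Merge two parsed rule documents without mutating either input.

    Lists present in both documents are concatenated (base entries first);
    for any other key, the value from `extra` wins.
    """
    merged = {
        key: list(value) if isinstance(value, list) else value
        for key, value in base.items()
    }
    for key, value in extra.items():
        if isinstance(value, list) and isinstance(merged.get(key), list):
            merged[key] = merged[key] + value
        else:
            merged[key] = list(value) if isinstance(value, list) else value
    return merged


if __name__ == "__main__":
    # Hypothetical parsed contents of the two rule files.
    base = {"rule": [{"id": "PY001"}, {"id": "PY002"}]}
    ai = {"rule": [{"id": "AI001"}]}
    print([r["id"] for r in merge_rulesets(base, ai)["rule"]])
```

Merging after parsing also surfaces a syntax error in either file on its own, instead of reporting it against a combined string whose line numbers match neither source.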
