Commit 01c6631

feat: Initialize project structure and implement an OpenAPI-based API prober.
Signed-off-by: knowlet <knowlet@pm.me>
0 parents  commit 01c6631

11 files changed

Lines changed: 3005 additions & 0 deletions

.github/workflows/ruff.yml

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
name: ruff
on: [push, pull_request]
jobs:
  ruff:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/ruff-action@v1

.gitignore

Lines changed: 181 additions & 0 deletions
@@ -0,0 +1,181 @@
# Created by https://www.toptal.com/developers/gitignore/api/python
# Edit at https://www.toptal.com/developers/gitignore?templates=python

### Python ###
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

### Python Patch ###
# Poetry local configuration file - https://python-poetry.org/docs/configuration/#local-configuration
poetry.toml

# ruff
.ruff_cache/

# LSP config files
pyrightconfig.json

# End of https://www.toptal.com/developers/gitignore/api/python

# Scanner output
*_spec.yaml
*.har
*.mitm

.pre-commit-config.yaml

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files

  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.9.0
    hooks:
      - id: ruff
        args: [--fix]
      - id: ruff-format

.python-version

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
3.10

README.md

Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
# Active Scanner for Mitmproxy2Swagger

This module provides an automated way to crawl, inspect, and fuzz API endpoints to generate a high-quality OpenAPI specification with latency metrics.

## Components

1. **Crawler (`src/scanner/crawler.py`)**: Explores a website, renders JavaScript (Playwright), and captures all network traffic.
2. **Prober (`src/scanner/prober.py`)**: Actively probes discovered endpoints with multiple requests to gather statistical performance data.
3. **Core (`mitmproxy2swagger`)**: Converts the captured traffic (HAR/Flow) into an OpenAPI Spec.
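
As an illustration of the "statistical performance data" the Prober gathers from repeated requests, the sketch below summarizes a list of per-request latencies. It is not taken from `src/scanner/prober.py`: the function name `summarize_latencies` and the choice of metrics (min, mean, median, nearest-rank p95, max) are assumptions made for this example.

```python
import math
import statistics


def summarize_latencies(samples_ms: list[float]) -> dict[str, float]:
    """Summarize latency samples (in milliseconds) for one endpoint.

    Hypothetical helper; the real prober may record different metrics.
    """
    if not samples_ms:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(samples_ms)
    # Nearest-rank 95th percentile: the smallest sample such that at least
    # 95% of all samples are less than or equal to it.
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "min": ordered[0],
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }


if __name__ == "__main__":
    # Five made-up timings for a single endpoint.
    print(summarize_latencies([120.0, 80.0, 100.0, 95.0, 300.0]))
```

With only a handful of probes per endpoint, the p95 value collapses to the slowest sample, so the median is usually the more stable number to report.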

## Usage Guide



### Usage Guide

Run the full active scan in one click:

```bash
uv run scanner https://example.com
```

This will automatically:
1. **Crawl** the website to discover endpoints.
2. **Start a proxy** (mitmdump) in the background.
3. **Probe/Fuzz** the endpoints through the proxy.
4. **Generate** the final OpenAPI spec (`final_spec.yaml`).

#### With Authentication

**Using Headers (e.g., Bearer Token):**
```bash
uv run scanner https://api.example.com \
  --header "Authorization: Bearer YOUR_TOKEN"
```

**Using Cookies (e.g., Session ID):**
```bash
uv run scanner https://dashboard.example.com \
  --cookie "session_id=xyz123"
```
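
The `--header` and `--cookie` values above are plain `Name: value` and `name=value` strings. A minimal sketch of how such strings could be turned into request headers follows; `parse_header`, `parse_cookie`, and `build_headers` are hypothetical names, and the scanner's own parsing may differ.

```python
def parse_header(raw: str) -> tuple[str, str]:
    """Split 'Name: value' on the first colon (values may contain colons)."""
    name, sep, value = raw.partition(":")
    if not sep or not name.strip():
        raise ValueError(f"expected 'Name: value', got {raw!r}")
    return name.strip(), value.strip()


def parse_cookie(raw: str) -> tuple[str, str]:
    """Split 'name=value' on the first equals sign."""
    name, sep, value = raw.partition("=")
    if not sep or not name.strip():
        raise ValueError(f"expected 'name=value', got {raw!r}")
    return name.strip(), value.strip()


def build_headers(headers: list[str], cookies: list[str]) -> dict[str, str]:
    """Merge CLI-style header and cookie strings into one header mapping."""
    result = dict(parse_header(h) for h in headers)
    if cookies:
        pairs = [parse_cookie(c) for c in cookies]
        # All cookies travel in a single Cookie header, separated by "; ".
        result["Cookie"] = "; ".join(f"{name}={value}" for name, value in pairs)
    return result


if __name__ == "__main__":
    print(build_headers(["Authorization: Bearer YOUR_TOKEN"], ["session_id=xyz123"]))
```

Splitting on the first separator only keeps values such as `Bearer a:b` or `token=abc==` intact.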

#### Advanced Usage

You can still customize the run:

```bash
uv run scanner https://example.com \
  --depth 3 \
  --proxy-port 8081 \
  --final-spec my_api.yaml
```
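
For orientation, the options shown in this guide could be declared with `argparse` roughly as follows. This is a sketch, not the contents of `scanner.main`: the flag names come from the examples above and `final_spec.yaml` is the output file named earlier, but the default values for `--depth` and `--proxy-port`, and the assumption that `--header` and `--cookie` may be repeated, are illustrative guesses.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags used in the usage examples."""
    parser = argparse.ArgumentParser(prog="scanner", description="Active API scanner (sketch)")
    parser.add_argument("url", help="start URL to crawl")
    # Assumption: both options may be given more than once.
    parser.add_argument("--header", action="append", default=[], help="extra header, 'Name: value'")
    parser.add_argument("--cookie", action="append", default=[], help="cookie, 'name=value'")
    # Assumption: the real defaults for depth and proxy port are not documented here.
    parser.add_argument("--depth", type=int, default=2, help="maximum crawl depth")
    parser.add_argument("--proxy-port", type=int, default=8080, help="local proxy port")
    parser.add_argument("--final-spec", default="final_spec.yaml", help="output OpenAPI file")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

`argparse` converts dashes to underscores, so the values are read as `args.proxy_port` and `args.final_spec`.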



## Requirements
* Python 3.10+
* [uv](https://github.com/astral-sh/uv) package manager
* Playwright (`uv run playwright install`)

## Setup

### 1. Installation

1. **Install `uv`** (if not already installed):
    ```bash
    curl -LsSf https://astral.sh/uv/install.sh | sh
    ```

2. **Sync dependencies**:
    ```bash
    uv sync
    ```

3. **Install Playwright browsers**:
    ```bash
    uv run playwright install
    ```

### 2. Development Setup

To ensure code quality, we use `ruff` and `pre-commit`.

1. **Install pre-commit hooks**:
    ```bash
    uv run pre-commit install
    ```

2. **Run linting manually** (optional):
    ```bash
    uv run ruff check .
    uv run ruff format .
    ```

pyproject.toml

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
[project]
name = "scanner"
version = "0.1.0"
description = "Active Scanner for Mitmproxy2Swagger"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
    "playwright>=1.49.1",
    "beautifulsoup4>=4.12.3",
    "mitmproxy>=10.4.2",
    "httpx>=0.28.1",
    "pyyaml>=6.0.2",
    "ruamel.yaml>=0.18.6",
    "mitmproxy2swagger @ git+https://github.com/knowlet/mitmproxy2swagger",
]

[dependency-groups]
dev = [
    "ruff>=0.9.0",
    "pre-commit>=4.0.1",
]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.metadata]
allow-direct-references = true

[project.scripts]
scanner = "scanner.main:main"

[tool.hatch.build.targets.wheel]
packages = ["src/scanner"]


[tool.ruff]
line-length = 120
target-version = "py310"

[tool.ruff.lint]
select = ["E", "F", "I", "B", "C4", "UP"]
ignore = []

[tool.ruff.format]
quote-style = "double"
indent-style = "space"

src/scanner/__init__.py

Whitespace-only changes.
