Commit 0e8c577

First version

1,709 files changed: 229,970 additions, 0 deletions
.gitignore

Lines changed: 1 addition & 0 deletions

```
.idea/*
```

LICENSE

Lines changed: 21 additions & 0 deletions

MIT License

Copyright (c) 2023 devploit

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md

Lines changed: 146 additions & 0 deletions

# SubProbe
> JS-powered crawler for hidden endpoints & internal subdomains

<p align="center">
  <img src="https://i.imgur.com/aJPgEZ9.png" width="250" alt="SubProbe logo"/>
</p>

<p align="center">
  <img src="https://img.shields.io/github/license/devploit/SubProbe?style=flat-square" alt="License">
  <img src="https://img.shields.io/github/stars/devploit/SubProbe?style=flat-square" alt="Stars">
</p>

<p align="center">
  <b>Extract hidden endpoints and internal subdomains from JavaScript files through semantic analysis</b>
</p>

SubProbe is a JavaScript-aware web crawler for security researchers and penetration testers. It discovers hidden endpoints, APIs, and subdomains by analyzing the JavaScript files of a web application — revealing attack surface that traditional crawlers and subdomain enumeration tools miss.

## 🚀 Features

- **Deep JavaScript Analysis**: Parses JavaScript files and extracts endpoints via semantic analysis
- **Recursive Crawling**: Multi-level crawling to discover deeper JS resources
- **External Sources**: Collects additional endpoints from:
  - robots.txt
  - sitemap.xml
  - Wayback Machine
- **Endpoint Verification**: Probes endpoints to verify they are accessible
- **Status Filtering**: Filters results by HTTP status code
- **Export Options**: Saves results as JSON, CSV, or plain text files
## 📋 Installation

```bash
# Clone the repository
git clone https://github.com/devploit/SubProbe.git
cd SubProbe
npm install

# Link the CLI onto your PATH
npm link
```

After running the above commands, you can use `subprobe` directly from your terminal.

## 📊 Command Options

| Option | Description |
|--------|-------------|
| `--depth <number>` | Recursive scan depth for internal links (default: 0) |
| `--filter-status <codes>` | Filter by status codes. Supports exact (`200`), ranges (`400-410`), and groups (`4xx`) |
| `-o, --out <file>` | Export results to JSON, CSV, or plain text (determined by file extension) |
| `--probe` | Check whether endpoints respond (via HTTP status codes) |
| `--wayback` | Include Wayback Machine results |
| `--silent` | Only show discovered endpoints, without progress information |
| `--no-color` | Disable colored output |
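The `--filter-status` grammar described above (exact codes, ranges, and `Nxx` groups) can be sketched with a small matcher. This is an illustrative sketch only, not SubProbe's actual parser; the function name `matchesStatusFilter` is invented for this example:

```javascript
// Illustrative matcher for a --filter-status expression.
// Supports comma-separated parts: exact ("200"), range ("400-410"), group ("4xx").
function matchesStatusFilter(status, filter) {
  return filter.split(',').some((part) => {
    part = part.trim().toLowerCase();
    if (/^\d{3}$/.test(part)) {
      // exact code, e.g. "200"
      return status === Number(part);
    }
    const range = part.match(/^(\d{3})-(\d{3})$/); // range, e.g. "400-410"
    if (range) {
      return status >= Number(range[1]) && status <= Number(range[2]);
    }
    const group = part.match(/^(\d)xx$/); // group, e.g. "4xx"
    if (group) {
      return Math.floor(status / 100) === Number(group[1]);
    }
    return false;
  });
}

console.log(matchesStatusFilter(404, '4xx'));     // true
console.log(matchesStatusFilter(405, '400-410')); // true
console.log(matchesStatusFilter(200, '301,302')); // false
```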
## 📝 Example Output

Running `subprobe https://example.com --probe --wayback` might produce output like this:

```
🚀 Starting SubProbe on https://example.com

[12:34:56] 🕷️ Starting crawl (depth: 0)
[12:34:57] 🎯 Crawling depth 0 (1 URLs)
[12:35:01] 📂 Collecting from robots.txt & sitemap.xml
[12:35:05] 🕚 Collecting from Wayback...
[12:35:12] 🔌 Probing 42 endpoints...

✅ Analysis complete - Summary:
   - URLs analyzed: 1
   - JS files analyzed: 3/3
   - Endpoints found: 42

[12:35:30] 🔍 Found 42 endpoints:

🟩 https://example.com/api/v1/users ✅ [200]
🟩 https://example.com/api/v1/products ✅ [200]
🟩 https://example.com/api/v1/cart ✅ [200]
🟩 https://example.com/api/v1/checkout 🔒 [401]
🟦 https://api.example.com/v2/products ✅ [200]
🟥 https://cdn.example.net/assets/main.js ✅ [200]
🟥 https://analytics.example-tracker.com/collect ❌ [404]
🕓 https://example.com/legacy/api/users ❌ [404]
🕓 https://example.com/beta/graphql ✅ [200]
🗺️ https://example.com/sitemap/products.xml ✅ [200]
🤖 https://example.com/admin/login.php ❌ [404]
```

The output marks each endpoint with its type:
- 🟩 Relative paths on the same domain
- 🟦 Internal subdomains
- 🟥 External domains referenced in code
- 🕓 Historical endpoints from the Wayback Machine
- 🗺️ Endpoints found in sitemap.xml
- 🤖 Endpoints found in robots.txt
Status codes are shown when using `--probe`:
- ✅ 2xx: Success
- 🔁 3xx: Redirection
- 🔒 401/403: Authentication required
- ❌ 4xx: Client error
- 💥 5xx: Server error
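The legend above can be expressed as a simple mapping. This helper is illustrative only; the name `statusSymbol` is not part of SubProbe's code:

```javascript
// Illustrative mapping from an HTTP status code to the legend symbol above.
// 401/403 are checked before the generic 4xx bucket so they get the lock icon.
function statusSymbol(status) {
  if (status === 401 || status === 403) return '🔒';
  if (status >= 200 && status < 300) return '✅';
  if (status >= 300 && status < 400) return '🔁';
  if (status >= 400 && status < 500) return '❌';
  if (status >= 500 && status < 600) return '💥';
  return '❓';
}

console.log(statusSymbol(200)); // ✅
console.log(statusSymbol(403)); // 🔒
console.log(statusSymbol(502)); // 💥
```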
## 🔍 How It Works

SubProbe uses a multi-stage approach to discover hidden endpoints:

1. **Crawling**: Starting from the target URL, SubProbe acts as a lightweight crawler, recursively following internal links up to the specified depth to discover more pages and JavaScript files.
2. **JS Collection**: Extracts and downloads the JavaScript files referenced in the HTML source.
3. **Semantic Analysis**: Parses each JS file using AST (Abstract Syntax Tree) analysis to find:
   - Fetch API calls
   - Axios requests
   - XMLHttpRequest URLs
   - Hardcoded API endpoints
4. **External Data**: Gathers additional endpoints from robots.txt, sitemap.xml, and optionally the Wayback Machine.
5. **Endpoint Verification**: If enabled, probes discovered endpoints to check their HTTP status.
6. **Results Display**: Presents organized results with color-coded endpoint types and status codes.
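To illustrate the extraction idea in step 3: SubProbe itself performs AST-based semantic analysis, but a greatly simplified regex-based sketch conveys the gist. The function name `extractEndpoints` is invented here, and a regex approach will miss dynamically constructed URLs that AST analysis can catch:

```javascript
// Greatly simplified sketch of endpoint extraction from JS source.
// SubProbe uses AST-based semantic analysis; this regex version only
// finds string literals that look like URLs or absolute paths.
function extractEndpoints(jsSource) {
  const found = new Set();
  const literal = /["'`](https?:\/\/[^"'`\s]+|\/[a-zA-Z0-9_\-./]+)["'`]/g;
  let m;
  while ((m = literal.exec(jsSource)) !== null) {
    found.add(m[1]); // deduplicate via the Set
  }
  return [...found];
}

const sample = `
  fetch('/api/v1/users');
  axios.get("https://api.example.com/v2/products");
`;
console.log(extractEndpoints(sample));
// [ '/api/v1/users', 'https://api.example.com/v2/products' ]
```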
## 🌐 Use Cases

- Finding hidden API endpoints during penetration tests
- Discovering forgotten or legacy endpoints that might be vulnerable
- Identifying internal subdomains referenced in JavaScript
- Mapping the full attack surface of a web application
- Reconnaissance phase of bug bounty hunting
## 👨‍💻 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

<p align="center">
  Made with ❤️ by <a href="https://github.com/devploit">devploit</a>
</p>

bin/subprobe.js

Lines changed: 34 additions & 0 deletions

```javascript
#!/usr/bin/env node

import { Command } from 'commander';
import chalk from 'chalk';
import { runSubprobe } from '../lib/extractor.js';

const program = new Command();

program
  .name('subprobe')
  .description('Extract hidden endpoints and internal subdomains from JavaScript files and external sources')
  .version('0.2.1')
  .argument('<url>', 'Target URL to analyze')
  .option('--depth <number>', 'Recursive scan depth for internal links (default 0)', parseInt, 0)
  .option('--filter-status <codes>', 'Filter by status codes. Supports exact (200), ranges (400-410), and groups (4xx)')
  .option('-o, --out <file>', 'Export results to JSON, CSV, or plain text (determined by file extension)')
  .option('--probe', 'Check if endpoints respond (via HTTP status codes)')
  .option('--wayback', 'Include Wayback Machine results')
  .option('--silent', 'Only show discovered endpoints without progress information')
  .option('--no-color', 'Disable colored output')
  .action(async (url, options) => {
    // Commander stores --no-color as options.color === false
    if (options.color === false) {
      chalk.level = 0;
    }

    if (!options.silent) {
      console.log(chalk.greenBright(`🚀 Starting SubProbe on ${url}`));
    }

    await runSubprobe(url, options);
  });

program.parse(process.argv);
```

lib/downloader.js

Lines changed: 80 additions & 0 deletions

```javascript
import axios from 'axios';
import * as cheerio from 'cheerio';

// Fetch the HTML of a page, returning '' on any error.
export async function fetchHTML(url) {
  try {
    const res = await axios.get(url, {
      timeout: 10000,
      maxRedirects: 5,
      headers: {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
      }
    });
    return res.data;
  } catch (error) {
    return '';
  }
}

// Collect the absolute URLs of all <script src> tags in an HTML document.
export async function extractJSLinks(html, baseUrl) {
  try {
    const $ = cheerio.load(html);
    const scripts = [];

    $('script[src]').each((_, el) => {
      const src = $(el).attr('src');
      if (!src) return;

      if (src.startsWith('http')) {
        scripts.push(src);
      } else {
        try {
          // Resolve relative/protocol-relative src against the page URL
          const full = new URL(src, baseUrl).href;
          scripts.push(full);
        } catch (e) {
          // Ignore unresolvable src values
        }
      }
    });

    return scripts;
  } catch (error) {
    return [];
  }
}

// Download a JavaScript file, returning its source as a string ('' on error
// or when the response doesn't look like JavaScript).
export async function fetchJS(url) {
  try {
    const res = await axios.get(url, {
      timeout: 8000,
      maxRedirects: 3,
      headers: {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
        'Accept': '*/*'
      },
      validateStatus: (status) => status >= 200 && status < 400
    });

    // Skip responses that neither claim a JS/text content type nor
    // contain anything resembling JavaScript source.
    const contentType = res.headers['content-type'] || '';
    const body = res.data.toString();
    if (!contentType.includes('javascript') &&
        !contentType.includes('text/') &&
        body.indexOf('function') === -1 &&
        body.indexOf('var ') === -1) {
      return '';
    }

    let jsContent = res.data;
    if (typeof jsContent !== 'string') {
      jsContent = String(jsContent);
    }

    return jsContent;
  } catch (error) {
    return '';
  }
}
```
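The resolution of relative `src` values in `extractJSLinks` relies on the standard WHATWG `URL` constructor, whose second argument resolves relative, root-relative, and protocol-relative references against the page that contained them:

```javascript
// new URL(src, base) resolves a script src the same way a browser would.
const base = 'https://example.com/app/index.html';

console.log(new URL('main.js', base).href);
// https://example.com/app/main.js
console.log(new URL('/static/app.js', base).href);
// https://example.com/static/app.js
console.log(new URL('//cdn.example.net/lib.js', base).href);
// https://cdn.example.net/lib.js
```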

lib/exporter.js

Lines changed: 93 additions & 0 deletions

```javascript
import fs from 'fs/promises';
import path from 'path';

// Build the absolute URL for a result: relative paths are resolved
// against the origin they were found on; everything else is kept as-is.
function resolveUrl(r) {
  try {
    if (r.type === 'relative' && r.origin) {
      const base = new URL(r.origin);
      return `${base.protocol}//${base.host}${r.value}`;
    }
  } catch {
    // Malformed origin: fall through to the raw value
  }
  return r.value;
}

// Quote a CSV field, doubling any embedded double quotes.
function csvQuote(value) {
  return `"${String(value).replace(/"/g, '""')}"`;
}

export async function exportToJSON(filepath, results) {
  const dir = path.dirname(filepath);
  await fs.mkdir(dir, { recursive: true });
  await fs.writeFile(filepath, JSON.stringify(results, null, 2), 'utf-8');
  console.log(`\n💾 JSON saved to ${filepath}`);
}

export async function exportToCSV(filepath, results) {
  const dir = path.dirname(filepath);
  await fs.mkdir(dir, { recursive: true });

  const headers = ['value', 'resolved_url', 'type', 'source', 'reachable', 'status'];
  const lines = [headers.join(',')];

  for (const r of results) {
    const row = [
      csvQuote(r.value),
      csvQuote(resolveUrl(r)),
      r.type,
      r.source,
      r.reachable ?? '',
      r.status ?? ''
    ];
    lines.push(row.join(','));
  }

  await fs.writeFile(filepath, lines.join('\n'), 'utf-8');
  console.log(`\n💾 CSV saved to ${filepath}`);
}

export async function exportToTXT(filepath, results) {
  const dir = path.dirname(filepath);
  await fs.mkdir(dir, { recursive: true });

  const lines = results.map((r) => {
    const statusInfo = r.status ? ` [${r.status}]` : '';
    return `${resolveUrl(r)}${statusInfo}`;
  });

  await fs.writeFile(filepath, lines.join('\n'), 'utf-8');
  console.log(`\n💾 TXT saved to ${filepath}`);
}

export async function exportResults(filepath, results) {
  if (!filepath) return;

  try {
    if (filepath.endsWith('.json')) {
      await exportToJSON(filepath, results);
    } else if (filepath.endsWith('.csv')) {
      await exportToCSV(filepath, results);
    } else if (filepath.endsWith('.txt')) {
      await exportToTXT(filepath, results);
    } else {
      // Default to JSON if the extension isn't recognized
      await exportToJSON(filepath, results);
    }
  } catch (err) {
    console.error(`Error exporting results: ${err.message}`);
  }
}
```
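The relative-path resolution the exporters perform for `relative`-type results can be sketched standalone. The helper name `resolveResultUrl` is illustrative, not part of SubProbe's API:

```javascript
// Standalone sketch of the exporters' URL resolution: a result of
// type 'relative' is joined to the origin it was discovered on;
// any other result keeps its raw value.
function resolveResultUrl(result) {
  try {
    if (result.type === 'relative' && result.origin) {
      const base = new URL(result.origin);
      return `${base.protocol}//${base.host}${result.value}`;
    }
  } catch {
    // Malformed origin: fall back to the raw value
  }
  return result.value;
}

console.log(resolveResultUrl({ type: 'relative', origin: 'https://example.com/shop', value: '/api/v1/cart' }));
// https://example.com/api/v1/cart
console.log(resolveResultUrl({ type: 'external', value: 'https://cdn.example.net/main.js' }));
// https://cdn.example.net/main.js
```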
