diff --git a/README.md b/README.md index 9954716..278199c 100644 --- a/README.md +++ b/README.md @@ -1,11 +1,11 @@ -# 🚨 Bucket Stream is no longer maintained. If you need support or consultation for your red teaming endeavours, drop me an e-mail paul@darkport.co.uk 🚨 - # Bucket Stream **Find interesting Amazon S3 Buckets by watching certificate transparency logs.** This tool simply listens to various certificate transparency logs (via certstream) and attempts to find public S3 buckets from permutations of the certificates domain name. +> **Note:** The original project is no longer maintained by its author. This version has been updated to work with Python 3.7+ and current dependencies. + ![Demo](https://i.imgur.com/ZFkIYhD.jpg) **Be responsible**. I mainly created this tool to highlight the risks associated with public S3 buckets and to put a different spin on the usual dictionary based attacks. Some quick tips if you use S3 buckets: @@ -19,60 +19,197 @@ Thanks to my good friend David (@riskobscurity) for the idea. ## Installation -Python 3.4+ and pip3 are required. Then just: +**Requirements:** Python 3.7+ (Python 3.8+ recommended) + +1. Clone the repository: + ```bash + git clone https://github.com/eth0izzle/bucket-stream.git + cd bucket-stream + ``` + +2. Create and activate a virtual environment (recommended): + ```bash + python3 -m venv venv + source venv/bin/activate # On Windows: venv\Scripts\activate + ``` -1. `git clone https://github.com/eth0izzle/bucket-stream.git` -2. *(optional)* Create a virtualenv with `pip3 install virtualenv && virtualenv .virtualenv && source .virtualenv/bin/activate` -2. `pip3 install -r requirements.txt` -3. `python3 bucket-stream.py` +3. Install dependencies: + ```bash + pip install -r requirements.txt + ``` + +4. 
Configure (optional but recommended): + Edit `config.yaml` and add your AWS credentials to avoid rate limiting: + ```yaml + aws_access_key: 'your-access-key' + aws_secret: 'your-secret-key' + ``` ## Usage -Simply run `python3 bucket-stream.py`. - -If you provide AWS access and secret keys in `config.yaml` Bucket Stream will attempt to access authenticated buckets and identity the buckets owner. **Unauthenticated users are severely rate limited.** - - usage: python bucket-stream.py - - Find interesting Amazon S3 Buckets by watching certificate transparency logs. - - optional arguments: - -h, --help Show this help message and exit - --only-interesting Only log 'interesting' buckets whose contents match - anything within keywords.txt (default: False) - --skip-lets-encrypt Skip certs (and thus listed domains) issued by Let's - Encrypt CA (default: False) - -t , --threads Number of threads to spawn. More threads = more power. - Limited to 5 threads if unauthenticated. - (default: 20) - --ignore-rate-limiting - If you ignore rate limits not all buckets will be - checked (default: False) - -l, --log Log found buckets to a file buckets.log (default: - False) - -s, --source Data source to check for bucket permutations. Uses - certificate transparency logs if not specified. - (default: None) - -p, --permutations Path of file containing a list of permutations to try - (see permutations/ dir). (default: permutations\default.txt) +### Basic Usage + +Simply run: +```bash +python bucket-stream.py +``` + +If you provide AWS access and secret keys in `config.yaml`, Bucket Stream will attempt to access authenticated buckets and identify the bucket owner. **Unauthenticated users are severely rate limited (max 5 threads).** + +### Command Line Options + +``` +usage: python bucket-stream.py + +Find interesting Amazon S3 Buckets by watching certificate transparency logs. 
+ +options: + -h, --help Show this help message and exit + --only-interesting Only log 'interesting' buckets whose contents match + anything within keywords.txt (default: False) + --skip-lets-encrypt Skip certs (and thus listed domains) issued by Let's + Encrypt CA (default: False) + -t, --threads Number of threads to spawn. More threads = more power. + Limited to 5 threads if unauthenticated. (default: 20) + --ignore-rate-limiting + If you ignore rate limits not all buckets will be + checked (default: False) + -l, --log Log found buckets to a file buckets.log (default: False) + -s, --source SOURCE Data source to check for bucket permutations. Uses + certificate transparency logs if not specified. + (default: None) + -p, --permutations PERMUTATIONS + Path of file containing a list of permutations to try + (see permutations/ dir). (default: permutations/default.txt) +``` + +### Usage Examples + +**Basic scan with CertStream:** +```bash +python bucket-stream.py +``` + +**Use extended permutations list (more comprehensive but slower):** +```bash +python bucket-stream.py -p permutations/extended.txt +``` + +**Scan specific domains from a file:** +```bash +python bucket-stream.py --source domains.txt --threads 10 +``` + +**Only log interesting buckets (matching keywords.txt):** +```bash +python bucket-stream.py --only-interesting --log +``` + +This reports only buckets that contain files matching keywords in `keywords.txt` (e.g., password files, database dumps, configuration files). + +**Skip Let's Encrypt certificates:** +```bash +python bucket-stream.py --skip-lets-encrypt +``` + +### Permutations + +The tool uses permutation files to generate potential bucket names. Two files are provided: + +- **`permutations/default.txt`** - ~45 common permutations (fast, recommended for most use cases) +- **`permutations/extended.txt`** - 1000+ permutations (comprehensive but slower) + +You can create custom permutation files. 
Each line should contain `%s` where the domain name will be inserted, for example: +``` +%s-backup +backup-%s +%s-data +data-%s +``` + +### Keywords Filtering + +The `keywords.txt` file contains a list of sensitive keywords and file extensions used to identify "interesting" buckets when using the `--only-interesting` flag. The file includes: + +- **Sensitive keywords**: password, secret, token, api-key, credentials, etc. +- **Database files**: .sql, .db, .dump, .backup, etc. +- **Configuration files**: .env, .pem, .key, config files, etc. +- **Source code**: .git, .svn, source code files, etc. +- **Archives**: .zip, .tar, .rar, compressed files, etc. +- **Documents**: .xls, .csv, .pdf, spreadsheets, etc. +- **Log files**: .log, access logs, error logs, etc. +- **Virtual machines**: .ova, .vmdk, disk images, etc. +- **And many more...** + +The file contains **200+ keywords** organized by category. You can customize it by adding or removing keywords. Lines starting with `#` are treated as comments and ignored. + +**Example keywords.txt:** +``` +password +secret +.sql +.env +backup +``` + +## Updates & Improvements + +This version includes the following updates: +- ✅ Updated to Python 3.7+ (removed Python 2 compatibility) +- ✅ Updated all dependencies to latest compatible versions +- ✅ Fixed CertStream connection issues +- ✅ Improved error handling and reconnection logic +- ✅ Enhanced default permutations list (~30 common patterns) +- ✅ Expanded keywords.txt file (200+ keywords across 15+ categories) +- ✅ Added comment support in keywords.txt (lines starting with # are ignored) +- ✅ Code modernization and cleanup ## F.A.Qs - **Nothing appears to be happening** - Patience! Sometimes certificate transparency logs can be quiet for a few minutes. Ideally provide AWS secrets in `config.yaml` as this greatly speeds up the checking rate. + Patience! Sometimes certificate transparency logs can be quiet for a few minutes. The tool will show "Waiting for Certstream events..." 
and then "Connected to CertStream!" when connected. Ideally provide AWS secrets in `config.yaml` as this greatly speeds up the checking rate. + +- **I'm getting rate limited** + + If you don't have AWS credentials, you're limited to 5 threads. Either: + - Add AWS credentials to `config.yaml` (recommended) + - Use `--ignore-rate-limiting` flag (may miss some buckets) + - Reduce threads with `-t 3` + +- **CertStream connection errors** + + The tool automatically retries on connection errors. If you see repeated errors, check your internet connection or try again later. - **I found something highly confidential** **Report it** - please! You can usually figure out the owner from the bucket name or by doing some quick reconnaissance. Failing that contact Amazon's support teams. +## Troubleshooting + +**Import errors:** +- Make sure you're using Python 3.7+ +- Ensure all dependencies are installed: `pip install -r requirements.txt` +- Use a virtual environment to avoid conflicts + +**Connection issues:** +- CertStream may be temporarily unavailable +- Check your firewall/proxy settings +- The tool will automatically retry + +**Rate limiting:** +- Add AWS credentials to `config.yaml` for better performance +- Without credentials, you're limited to 5 threads + ## Contributing -1. Fork it, baby! +Contributions are welcome! Please: + +1. Fork the repository 2. Create your feature branch: `git checkout -b my-new-feature` 3. Commit your changes: `git commit -am 'Add some feature'` 4. Push to the branch: `git push origin my-new-feature` -5. Submit a pull request. +5. 
Submit a pull request ## License diff --git a/bucket-stream.py b/bucket-stream.py index a4ff78f..70baba7 100644 --- a/bucket-stream.py +++ b/bucket-stream.py @@ -2,36 +2,29 @@ # -*- coding: utf-8 -*- import sys -PY2 = sys.version_info[0] == 2 -PY3 = (sys.version_info[0] >= 3) - -#import queue -if PY2: - import Queue as queue -else: # PY3 - import queue - +import queue import argparse import logging import os import signal import time import json -from threading import Lock -from threading import Event -from threading import Thread +from threading import Lock, Event, Thread import requests import tldextract import yaml from boto3.session import Session from certstream.core import CertStreamClient +import certstream from requests.adapters import HTTPAdapter from termcolor import cprint ARGS = argparse.Namespace() -CONFIG = yaml.safe_load(open("config.yaml")) -KEYWORDS = [line.strip() for line in open("keywords.txt")] +with open("config.yaml", "r") as f: + CONFIG = yaml.safe_load(f) +with open("keywords.txt", "r") as f: + KEYWORDS = [line.strip() for line in f if line.strip() and not line.strip().startswith('#')] S3_URL = "http://s3-1-w.amazonaws.com" BUCKET_HOST = "%s.s3.amazonaws.com" QUEUE_SIZE = CONFIG['queue_size'] @@ -68,18 +61,29 @@ def run(self): class CertStreamThread(Thread): def __init__(self, q, *args, **kwargs): self.q = q - self.c = CertStreamClient( - self.process, skip_heartbeats=True, on_open=None, on_error=None) - super().__init__(*args, **kwargs) def run(self): global THREAD_EVENT - while not THREAD_EVENT.is_set(): - cprint("Waiting for Certstream events - this could take a few minutes to queue up...", + cprint("Waiting for Certstream events - this could take a few minutes to queue up...", "yellow", attrs=["bold"]) - self.c.run_forever() - THREAD_EVENT.wait(10) + try: + certstream.listen_for_events( + self.process, + "wss://certstream.calidog.io/", + skip_heartbeats=True, + on_open=self._on_open, + on_error=self._on_error + ) + except 
KeyboardInterrupt: + pass + + def _on_open(self): + cprint("Connected to CertStream! Listening for certificate updates...", "green", attrs=["bold"]) + + def _on_error(self, ex): + if not isinstance(ex, KeyboardInterrupt): + cprint("CertStream connection error: {} - Will retry...".format(ex), "yellow") def process(self, message, context): if message["message_type"] == "heartbeat": @@ -246,7 +250,8 @@ def get_permutations(domain, subdomain=None): "%s-www" % domain, ] - perms.extend([line.strip() % domain for line in open(ARGS.permutations)]) + with open(ARGS.permutations, "r") as f: + perms.extend([line.strip() % domain for line in f]) if subdomain is not None: perms.extend([ @@ -314,9 +319,10 @@ def main(): if ARGS.source is None: THREADS.extend([CertStreamThread(q)]) else: - for line in open(ARGS.source): - for permutation in get_permutations(line.strip()): - q.put(BUCKET_HOST % permutation) + with open(ARGS.source, "r") as f: + for line in f: + for permutation in get_permutations(line.strip()): + q.put(BUCKET_HOST % permutation) for t in THREADS: t.daemon = True diff --git a/keywords.txt b/keywords.txt index a30c175..5d36932 100644 --- a/keywords.txt +++ b/keywords.txt @@ -1,16 +1,235 @@ +# Sensitive keywords password +passwd +secret +private +confidential +api-key +api_key +apikey +token +access-key +access_key +credential +credentials +auth +authentication wp-config +config +.env +.env.local +.env.production backup -confidential -.apk +dump +export +archive + +# Database files .sql .psql +.db +.database +.sqlite +.sqlite3 +.mdb +.accdb +.dump +.backup + +# Configuration files +.ini +.cfg +.conf +.config +.properties +.yml +.yaml +.json +.xml +.pem +.key +.crt +.cert +.p12 +.pfx +.id_rsa +.id_dsa +.ssh +.htaccess +.htpasswd +web.config +wp-config.php +config.php +settings.php +database.php + +# Source code & scripts +.inc.php +.php.bak +.git +.gitignore +.gitconfig +.svn +.hg +.DS_Store +.idea +.vscode +composer.json +package.json +requirements.txt +Pipfile 
+Gemfile + +# Archives & compressed files .zip .tar +.tar.gz +.tgz +.rar +.7z +.bz2 +.gz .bak +.swp +.tmp +.temp + +# Documents & spreadsheets .xls +.xlsx .csv +.doc +.docx +.pdf +.rtf +.odt +.ods + +# Log files .log -.inc.php +.logs +access.log +error.log +debug.log +audit.log +application.log + +# Virtual machine & disk images .ova -.vmdk \ No newline at end of file +.ovf +.vmdk +.vdi +.vhd +.qcow2 +.iso +.img + +# Mobile & executable files +.apk +.ipa +.exe +.msi +.dmg +.deb +.rpm + +# Media files (potentially sensitive) +.jpg +.jpeg +.png +.gif +.mp4 +.avi +.mov + +# Environment & secrets +.env +.env.local +.env.development +.env.production +.env.staging +secrets +.secrets +.secret +.secretkey +.secret_key +private_key +privatekey +aws_access_key +aws_secret +aws_secret_key +aws_access_key_id + +# Backup indicators +backup +backups +bak +old +archive +archives +temp +tmp +temporary +scratch +test +testing +dev +development +staging +prod +production + +# Security related +firewall +security +vulnerability +exploit +payload +shell +webshell +malware +virus +trojan + +# Cloud & infrastructure +terraform.tfstate +.terraform +docker-compose.yml +docker-compose.yaml +dockerfile +Dockerfile +kubernetes +k8s +kubeconfig +.aws +.azure +.gcp + +# API & webhooks +webhook +endpoint +api +rest +graphql +swagger +openapi +postman + +# Personal information +personal +pii +gdpr +customer +client +user +users +employee +employees +staff +payroll +salary +billing +invoice +invoices +contract +contracts \ No newline at end of file diff --git a/permutations/default.txt b/permutations/default.txt index fed7347..00bbebf 100644 --- a/permutations/default.txt +++ b/permutations/default.txt @@ -9,3 +9,36 @@ test-%s %s-prod prod-%s %s-uat +uat-%s +%s-data +data-%s +%s-files +files-%s +%s-assets +assets-%s +%s-uploads +uploads-%s +%s-logs +logs-%s +%s-media +media-%s +%s-public +public-%s +%s-private +private-%s +%s-static +static-%s +%s-temp +temp-%s +%s-archive +archive-%s +%s-cdn 
+cdn-%s +%s-config +config-%s +%s-db +db-%s +%s-s3 +s3-%s +%s-aws +aws-%s diff --git a/requirements.txt b/requirements.txt index 2ef7f0c..d3dcb81 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,21 +1,7 @@ -boto3==1.4.8 -botocore==1.8.2 -certifi==2017.11.5 -certstream==1.8 -chardet==3.0.4 -docutils==0.14 -idna==2.6 -jmespath==0.9.3 -multidict==3.3.2 -pipdeptree==0.10.1 -pyaml==17.10.0 -python-dateutil==2.6.1 -PyYAML==3.12 -requests==2.18.4 -requests-file==1.4.2 -s3transfer==0.1.11 -six==1.11.0 -termcolor==1.1.0 -tldextract==2.2.0 -urllib3==1.22 -websocket-client==0.44.0 \ No newline at end of file +boto3 +botocore +certstream>=1.11 +requests +termcolor +tldextract +PyYAML diff --git a/test_ws.py b/test_ws.py new file mode 100644 index 0000000..43415af --- /dev/null +++ b/test_ws.py @@ -0,0 +1,28 @@ +import logging +from certstream.core import CertStreamClient +import time +import threading + +logging.basicConfig(level=logging.INFO) + +def process(message, context): + print("SUCCESS: Message received!") + +print("Test starting (wss://certstream.calidog.io)...") +try: + client = CertStreamClient(process, "wss://certstream.calidog.io", skip_heartbeats=True) + # run_forever is blocking, so run it in a thread + t = threading.Thread(target=client.run_forever) + t.daemon = True + t.start() + + # Wait 10 seconds + start_time = time.time() + while time.time() - start_time < 10: + time.sleep(1) + +except Exception as e: + print(f"Error: {e}") + +print("Test finished.") +