Search Engine Seed URLs

Seed Sites for Search Engine Web Crawler

How to Use

Unzip the downloaded archives, e.g. with https://extract.me/, and have fun!
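If you would rather unzip and read a list locally, the following Python sketch may help. It assumes the archive contains a single CSV file whose rows look like `rank,domain` (for example `1,google.com`). Other lists may use a different layout. The function name `read_seed_urls` and its parameters are hypothetical and only serve as an illustration.

```python
import csv
import io
import zipfile


def read_seed_urls(zip_path, limit=10, scheme="https"):
    """Return up to `limit` seed URLs from the first CSV inside a ZIP archive.

    Assumption: each data row is `rank,domain[,...]`. Rows whose first
    column is not a number (such as a header line) are skipped.
    """
    urls = []
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the archive holds one CSV file, e.g. top-1m.csv.
        csv_name = next(n for n in archive.namelist() if n.endswith(".csv"))
        with archive.open(csv_name) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            for row in csv.reader(text):
                if len(row) < 2 or not row[0].strip().isdigit():
                    continue  # header, blank line, or unexpected layout
                urls.append(f"{scheme}://{row[1].strip()}/")
                if len(urls) >= limit:
                    break
    return urls
```

Calling `read_seed_urls("top-1m.csv.zip", limit=100)` would return the first 100 domains as URLs, ready to be handed to a crawler as its start queue.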

Sources

Top 1 Million Sites

http://s3.amazonaws.com/alexa-static/top-1m.csv.zip

http://s3-us-west-1.amazonaws.com/umbrella-static/top-1m.csv.zip

https://ak.quantcast.com/quantcast-top-sites.zip

Top 10 Million Sites

https://www.domcop.com/files/top/top10milliondomains.csv.zip

Hint

Use https://pinetools.com/split-files to split a ZIP file into parts smaller than 25 MB so that they can be uploaded to GitHub 😄

Later, use https://pinetools.com/join-files to join the parts again!
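The same split and join steps can also be done locally. The sketch below is one possible way, not the method used by the PineTools site. The part naming scheme (`.001`, `.002`, ...), the 24 MB default, and the function names `split_file` and `join_files` are assumptions made for this example.

```python
from pathlib import Path

# Assumption: 24 MB per part keeps every part below the 25 MB limit.
PART_SIZE = 24 * 1024 * 1024


def split_file(path, part_size=PART_SIZE):
    """Split `path` into numbered parts of at most `part_size` bytes."""
    path = Path(path)
    parts = []
    with path.open("rb") as src:
        index = 1
        while True:
            chunk = src.read(part_size)
            if not chunk:
                break
            # Hypothetical naming: top-1m.csv.zip.001, top-1m.csv.zip.002, ...
            part = path.with_name(f"{path.name}.{index:03d}")
            part.write_bytes(chunk)
            parts.append(part)
            index += 1
    return parts


def join_files(parts, target):
    """Concatenate the parts, in name order, into `target`."""
    target = Path(target)
    with target.open("wb") as dst:
        # Zero-padded suffixes sort correctly for up to 999 parts.
        for part in sorted(Path(p) for p in parts):
            dst.write(part.read_bytes())
    return target
```

For example, `split_file("top10milliondomains.csv.zip")` writes the parts next to the original file, and `join_files(parts, "top10milliondomains.csv.zip")` restores it byte for byte.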

powered since 2019 by phpSoftware