Web Scraping 101

Background and Motivation

With the vast amount of data on the internet, an important question is how to access it. Normally, we use a web browser to visit a website and view it manually. Other times, a site may have a data portal we can use to download curated data, or it may offer an API. When useful data is stored in the web pages themselves, we can write programs that gather and parse that data automatically and systematically.
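As a rough illustration of the parsing half of that idea, here is a minimal sketch that pulls link targets out of an HTML string using only the standard library. The names `LinkCollector`, `extract_links`, and `SAMPLE_HTML` are hypothetical and are not taken from this tutorial. The packages in requirements.txt are not used here, and a real scraper would first download the page rather than use an inline string.

```python
# The parser module was renamed between Python 2.7 and Python 3,
# so try the Python 3 name first and fall back to the 2.7 name.
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7


class LinkCollector(HTMLParser):
    """Hypothetical helper: records the href of every <a> tag it sees."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the href values of all links in an HTML string, in order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up page content standing in for a downloaded web page.
SAMPLE_HTML = """
<html><body>
  <h1>Datasets</h1>
  <a href="/data/2015.csv">2015</a>
  <a href="/data/2016.csv">2016</a>
  <a>no link here</a>
</body></html>
"""

print(extract_links(SAMPLE_HTML))
```

The same pattern extends to other tags or attributes by changing the check in `handle_starttag`. Dedicated parsing libraries are usually more convenient for messy real-world pages.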

Installation

This tutorial is written in Python 2.7. Its dependencies are listed in the requirements.txt file and can be installed with pip.

Install Dependencies

pip install -r requirements.txt 

Further Resources

  1. Stanford Tutorial
  2. Web Scraping with Python