A minimal project that shows how to fetch HTML pages and extract information using requests and BeautifulSoup.
- Use requests to download page HTML.
- Parse the HTML with BeautifulSoup to find tags and extract text or attributes.
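
As an illustration of the parsing step, here is a minimal sketch that extracts both text and attributes. It runs offline: SAMPLE_HTML is a made-up snippet standing in for a downloaded page, and extract_headings_and_links is a hypothetical helper name, not part of this project.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page (r.text in the main script).
SAMPLE_HTML = """
<html><body>
  <h2>First heading</h2>
  <h2>Second heading</h2>
  <a href="/docs">Docs</a>
  <a href="https://example.com/about">About</a>
</body></html>
"""


def extract_headings_and_links(html):
    """Return (h2 texts, href values) found in an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    # Tag text: get_text(strip=True) trims surrounding whitespace.
    headings = [h.get_text(strip=True) for h in soup.find_all("h2")]
    # Tag attributes: href=True skips <a> tags that have no href.
    links = [a["href"] for a in soup.find_all("a", href=True)]
    return headings, links


headings, links = extract_headings_and_links(SAMPLE_HTML)
print(headings)
print(links)
```

The same function should work on real pages by passing r.text from a requests response instead of SAMPLE_HTML.
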
Main script: basic web scraping.py
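
    import requests
    from bs4 import BeautifulSoup

    # Download the page; the timeout keeps a stalled server from hanging the script
    r = requests.get('https://example.com', timeout=10)
    r.raise_for_status()  # stop on HTTP error responses (4xx/5xx)

    # Parse the HTML and print the text of every <h2> heading
    soup = BeautifulSoup(r.text, 'html.parser')
    for h in soup.select('h2'):
        print(h.get_text())

- Install deps: pip install requests beautifulsoup4
- Run: python "basic web scraping.py"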
Good starter project for learning HTTP requests, HTML parsing, and handling network-related concerns (timeouts, request headers, robots.txt). Always respect a site’s robots.txt and usage policies.
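
One possible way to combine the concerns listed above (a robots.txt check, a custom User-Agent header, and a timeout) is sketched below. The names USER_AGENT, is_allowed, and polite_get are hypothetical, and SAMPLE_ROBOTS is invented so the check can run without network access; a real script would first download the site's own robots.txt.

```python
from urllib.robotparser import RobotFileParser

import requests

# Hypothetical identifier; real scrapers should use a descriptive value.
USER_AGENT = "basic-web-scraping-demo"

# Made-up robots.txt content, used here so the check runs offline.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, url, user_agent=USER_AGENT):
    """Return True if the given robots.txt text permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def polite_get(url, robots_txt, timeout=10):
    """Fetch url only if robots.txt allows it; return the HTML text or None."""
    if not is_allowed(robots_txt, url):
        return None
    try:
        response = requests.get(
            url, headers={"User-Agent": USER_AGENT}, timeout=timeout
        )
        response.raise_for_status()
    except requests.RequestException:
        # Covers timeouts, connection errors, and HTTP error statuses.
        return None
    return response.text


print(is_allowed(SAMPLE_ROBOTS, "https://example.com/index.html"))         # True
print(is_allowed(SAMPLE_ROBOTS, "https://example.com/private/data.html"))  # False
```

Returning None on failure keeps the sketch short; a larger script might prefer to log the exception or retry instead.
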