Web Scraper and Data Visualization Project

This project involves creating a web scraper using Selenium and visualizing the scraped data on a Flask-based web page. The scraper fetches images and their corresponding titles from a specified website, saves them locally, and displays them on a web page.

Features

Web Scraper: Uses Selenium to scrape images and titles from a specified URL.
Data Visualization: Displays the scraped data on a Flask web page.
Directory Structure: Organizes downloaded images into directories based on the current date.
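The date-based layout mentioned above could be built roughly as follows. This is a minimal sketch, not the project's actual code: the function name `dated_image_dir` and the exact folder naming (lowercase month abbreviation, then day of month) are assumptions for illustration.

```python
from datetime import date
from pathlib import Path

# Fixed abbreviations so the folder names do not depend on the system locale.
MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")


def dated_image_dir(base_path, day=None):
    """Return base_path/<month>/<day of month>, creating it if needed."""
    day = day or date.today()
    target = Path(base_path) / MONTHS[day.month - 1] / str(day.day)
    target.mkdir(parents=True, exist_ok=True)
    return target
```

Calling `dated_image_dir("static/imgs")` on 5 August would create and return `static/imgs/aug/5` under this assumed naming.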

Requirements

Python 3.6+
Selenium
Flask
Requests
python-dotenv
Instabot (optional, currently commented out)

Installation

Clone the repository:

git clone https://github.com/your-username/web-scraper-visualization.git
cd web-scraper-visualization

Set up a virtual environment:

python -m venv .venv
source .venv/bin/activate  # On Windows use `.venv\Scripts\activate`

Install the required packages:
```
pip install -r requirements.txt
```
Set up environment variables: Create a .env file in the root directory and add your environment variables (e.g., for Instagram login, if used in the future):
```
username=your_instagram_username
password=your_instagram_password
```
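One way to read these values in Python is sketched below. It assumes the python-dotenv package and uses its `dotenv_values` function, which parses the file into a dictionary without modifying the process environment; that sidesteps a possible clash with an existing `USERNAME` variable on Windows. The helper names `pick_credentials` and `read_credentials` are hypothetical and not taken from the project.

```python
def pick_credentials(values):
    """Extract the Instagram login from a mapping of .env entries."""
    username = values.get("username")
    password = values.get("password")
    if not username or not password:
        raise ValueError("username and password must both be set in .env")
    return username, password


def read_credentials(env_path=".env"):
    """Parse the .env file and return (username, password)."""
    # Imported lazily so pick_credentials works without python-dotenv installed.
    from dotenv import dotenv_values

    return pick_credentials(dotenv_values(env_path))
```

Keep the .env file out of version control (for example by listing it in .gitignore), since it holds a password.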

Usage

Note:

The instabotme.py script starts the Flask server and also contains the command-line instructions for posting scraped data to Instagram.
If the scraped data is not shown, refresh the page in your browser.

Run the Flask web server:
```
python instabotme.py
```
Access the web page: Open your web browser and navigate to http://localhost:3000 to see the visualized scraped data.
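Change the base path: replace base_path = 'C:\Users\Asus\Documents\Vishva_Projects\AI\Web_Scraper_bot\imgs' with the path to your own image directory. For a Windows path, use a raw string (r'...') or forward slashes so the backslashes are not read as escape sequences.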
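The serving step described above might look like the sketch below. It is an illustration under assumptions, not the contents of instabotme.py: the names `collect_images` and `create_app`, the default `static/imgs` folder, and the template variable `images` are all hypothetical.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def collect_images(base_path):
    """Return a list of {'title', 'path'} dicts for every image under base_path."""
    base = Path(base_path)
    items = []
    for file in sorted(base.rglob("*")):
        if file.is_file() and file.suffix.lower() in IMAGE_SUFFIXES:
            items.append({
                "title": file.stem,  # file name without its extension
                "path": file.relative_to(base).as_posix(),
            })
    return items


def create_app(base_path="static/imgs"):
    """Build a Flask app whose index page lists the scraped images."""
    # Imported lazily so collect_images can be used without Flask installed.
    from flask import Flask, render_template

    app = Flask(__name__)

    @app.route("/")
    def index():
        # 'images' is an assumed variable name for templates/index.html.
        return render_template("index.html", images=collect_images(base_path))

    return app

# To serve on the port used in this README: create_app().run(port=3000)
```

Because the image list is rebuilt on every request, reloading the page picks up files that were scraped after the server started.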

Directory Structure

your_project/
├── templates/
│   └── index.html
├── static/
│   └── imgs/
├── instabotme.py
├── keep_alive.py
└── web_scrap_instabot.py

instabotme.py: Main script to run the Flask application and handle Instagram posting.
web_scrap_instabot.py: Contains the Webscrap class for scraping the web.
keep_alive.py: Keeps the Flask web server running.
templates/index.html: HTML template for displaying the scraped data.
static/imgs/: Directory where the downloaded images are saved, organized by month and day.

`web_scrap_instabot.py`

This file contains the Webscrap class that handles web scraping using Selenium.
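For orientation, a class of this kind could be outlined as below. Only the class name `Webscrap` comes from this README; the methods `fetch` and `save`, the helper `safe_filename`, the use of the `alt` attribute as the title, headless Chrome, and the `.jpg` extension are all assumptions, and the real file may be organised differently.

```python
import re
from pathlib import Path


def safe_filename(title, default="image"):
    """Turn a scraped title into a filesystem-friendly file stem."""
    stem = re.sub(r"[^A-Za-z0-9_-]+", "_", title or "").strip("_")
    return stem or default


class Webscrap:
    """Hypothetical outline of a Selenium-based image scraper."""

    def __init__(self, url):
        self.url = url

    def fetch(self):
        """Return (title, image_url) pairs for the <img> elements on the page."""
        # Imported lazily so the rest of the module loads without Selenium.
        from selenium import webdriver
        from selenium.webdriver.common.by import By

        options = webdriver.ChromeOptions()
        options.add_argument("--headless")
        driver = webdriver.Chrome(options=options)
        try:
            driver.get(self.url)
            pairs = []
            for img in driver.find_elements(By.TAG_NAME, "img"):
                src = img.get_attribute("src")
                # Skip inline data: URIs and images without a source.
                if src and src.startswith("http"):
                    pairs.append((img.get_attribute("alt") or "", src))
            return pairs
        finally:
            driver.quit()  # always release the browser

    def save(self, pairs, target_dir):
        """Download each image into target_dir, named after its title."""
        import requests

        target = Path(target_dir)
        target.mkdir(parents=True, exist_ok=True)
        for index, (title, src) in enumerate(pairs):
            response = requests.get(src, timeout=10)
            response.raise_for_status()
            # The .jpg extension is assumed; real images may use other formats.
            name = f"{index}_{safe_filename(title)}.jpg"
            (target / name).write_bytes(response.content)
```

A typical call sequence under these assumptions would be `pairs = Webscrap(url).fetch()` followed by `save(pairs, target_dir)` with the date-based directory as the target.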

Demo Video

Web.Scraper.Demo.1.mp4
