Data-Science-Project-with-Streamlit

This Streamlit app provides a toolkit for data exploration, visualization, cleaning, and basic machine learning. It supports the main stages of a data science project, from a first look at an uploaded dataset through to downloading the processed result, and is aimed at data scientists and analysts.

# Features

  1. Exploratory Data Analysis (EDA)
     - Upload Datasets: Supports CSV, TXT, and XLSX files.
     - View Data: Display the first few rows of the dataset.
     - Data Summary: Show the shape and descriptive statistics of the dataset.
     - Column Information: List all columns and allow the selection of specific columns for detailed analysis.
     - Correlation Matrix: Visualize correlations between features using heatmaps.
     - Scatter Plot: Create scatter plots for any two selected features.

  2. Data Visualization
     - Plot Types: Generate area, bar, line, histogram, box, KDE, pair, and scatter matrix plots.
     - Interactive Plots: Use Plotly to create interactive scatter matrix plots.

  3. Data Cleaning
     - Handle Missing Values: Fill missing values with column means.
     - Drop Columns: Remove unwanted columns from the dataset.

  4. Machine Learning
     - Model Training: Train a Random Forest classifier on selected features.
     - Model Evaluation: Display classification reports and confusion matrices.

  5. Download Processed Data
     - Download CSV: Allow users to download the processed DataFrame as a CSV file.
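
The Data Cleaning and Download items above can be sketched with pandas alone. This is a minimal illustration, not the app's actual code: the function names `fill_missing_with_means`, `drop_columns`, and `to_csv_bytes` and the sample data are hypothetical, and the sketch assumes that only numeric columns are mean-filled (a mean is undefined for text columns).

```python
import pandas as pd


def fill_missing_with_means(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with NaNs in numeric columns replaced by that column's mean."""
    cleaned = df.copy()
    numeric_cols = cleaned.select_dtypes(include="number").columns
    # fillna with a Series fills each column using the value under its own label.
    cleaned[numeric_cols] = cleaned[numeric_cols].fillna(cleaned[numeric_cols].mean())
    return cleaned


def drop_columns(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Return a copy without the named columns; names not present are ignored."""
    return df.drop(columns=columns, errors="ignore")


def to_csv_bytes(df: pd.DataFrame) -> bytes:
    """Serialize the frame as UTF-8 CSV bytes, without the index column."""
    return df.to_csv(index=False).encode("utf-8")


# Hypothetical sample data, for illustration only.
sample = pd.DataFrame(
    {"age": [20.0, None, 40.0], "city": ["Oslo", None, "Rome"], "id": [1, 2, 3]}
)
processed = drop_columns(fill_missing_with_means(sample), ["id"])
csv_payload = to_csv_bytes(processed)
```

In a Streamlit page, bytes like `csv_payload` could be handed to `st.download_button` through its `data` argument, together with a `file_name` and a `mime` type such as `"text/csv"`.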

**How to Use**

  1. Upload your dataset: Choose from CSV, TXT, or XLSX file formats.
  2. Select an activity: Choose from EDA, Plots, Data Cleaning, Machine Learning, or Download.
  3. Perform analysis and visualization: Use the tools and options provided to explore and analyze your data.
  4. Download the processed data: Save your cleaned and augmented data for further use.
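
The upload step above has to turn a file of one of three types into a DataFrame. One way this could be done is to pick the pandas reader from the file extension. The helper name `load_dataset` is hypothetical, and the sketch assumes TXT uploads are comma-separated text; a different delimiter would need a `sep` argument.

```python
import io
from pathlib import Path

import pandas as pd


def load_dataset(file_name: str, buffer) -> pd.DataFrame:
    """Read an uploaded file into a DataFrame, choosing the parser by extension."""
    suffix = Path(file_name).suffix.lower()
    if suffix in {".csv", ".txt"}:
        # Assumption: TXT files are comma-separated, like CSV.
        return pd.read_csv(buffer)
    if suffix == ".xlsx":
        # Reading XLSX needs an Excel engine such as openpyxl to be installed.
        return pd.read_excel(buffer)
    raise ValueError(f"Unsupported file type: {file_name!r}")


# Hypothetical in-memory upload, for illustration only.
demo = load_dataset("demo.csv", io.BytesIO(b"a,b\n1,2\n3,4\n"))
```

The object returned by Streamlit's `st.file_uploader` is file-like and carries a `name` attribute, so a call such as `load_dataset(uploaded.name, uploaded)` should work once a file has been chosen.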