Data Exploration: Guide

This folder is for any Python scripts or notebooks you use to explore and understand your datasets. These files should:

  1. Read in prepared datasets from 0_datasets
  2. Explore and understand the dataset without running a deep analysis:
    • Generate some visualizations (in a notebook, or in a separate image file saved to this folder)
    • Run some descriptive statistics (beware the Datasaurus Dozen!)
    • ... let your curiosity guide you, but avoid running any inferential statistics or using any machine learning at this stage.
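The steps above can be sketched with the standard library alone. This is a minimal illustration, not a prescribed workflow: the helper names `load_rows` and `describe_column` are hypothetical, and the file name in the usage note is a placeholder for whatever prepared dataset you have in 0_datasets. The dataset is opened for reading only.

```python
import csv
import statistics


def load_rows(csv_path):
    """Read a prepared CSV into a list of dicts. The file is only read, never written."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def describe_column(rows, column):
    """Return basic descriptive statistics for one numeric column.

    Empty or missing cells are skipped; every remaining value must parse as a float.
    """
    values = [
        float(row[column])
        for row in rows
        if row.get(column) not in (None, "")
    ]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Sample standard deviation needs at least two values.
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }
```

A script in this folder might call `rows = load_rows("../0_datasets/example.csv")` and then `describe_column(rows, "x")`, where both the file name and the column name are placeholders. Summary numbers like these can hide very different shapes, which is the point of the Datasaurus Dozen warning, so pair them with a plot.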

DO NOT modify an existing dataset in 0_datasets! This is critical to open research: someone should be able to clone this repository and run your scripts to replicate your research. If you modify an original dataset, others can no longer reproduce your results from the same starting data.
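One way to guard this rule in a script is sketched below, under stated assumptions: `file_sha256` and `save_derived` are hypothetical helpers, not part of this repository. The first computes a checksum you can compare before and after a run to confirm a dataset file is byte-for-byte unchanged; the second writes exploration output to a folder of your choice and refuses any path inside a directory named 0_datasets.

```python
import hashlib
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def save_derived(text, out_dir, name):
    """Write exploration output to out_dir, never into 0_datasets."""
    out_dir = Path(out_dir)
    # Assumption: the protected folder is literally named "0_datasets".
    if "0_datasets" in out_dir.resolve().parts:
        raise ValueError("refusing to write into 0_datasets")
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / name
    out_path.write_text(text, encoding="utf-8")
    return out_path
```

Recording `file_sha256(...)` at the start of a script and asserting that it is unchanged at the end is a cheap check that the exploration only read the data.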

Chapter 4, "Exploratory Data Analysis", of The Art of Data Science is a good starting reference.

README.md

Use the README in this folder to give a quick summary of each script or notebook: which dataset(s) it explores, and how.