Data Science

Project data sets

Data profiling
  • Dimensionality
  • Distribution
  • Granularity
  • Sparsity
  • Correlation
Data preparation
  • Missing-values imputation
  • Anomaly/outlier detection and removal
  • Discretization and dummification
  • Normalization
  • Balancing
Feature engineering
  • Feature selection
  • Feature extraction
  • Feature generation
Unsupervised learning
  • Pattern mining
  • Clustering
Supervised learning
  • Naïve Bayes
  • kNN
  • Decision tree
  • Random forest
  • Gradient boosting
  • Overfitting

Extra lab: Social network analysis (SNA)

Extra lab: Time-series analysis and forecasting

Data sets
  • Profiling
  • Transformation
  • Forecasting
  • Motif discovery