- Deep learning and neural networks
- Torch - A scientific computing framework with wide support for machine learning algorithms that puts GPUs first
- Caffe - A deep learning framework made with expression, speed, and modularity in mind
- DL4J - Open-Source, Distributed, Deep Learning Library for the JVM
- Theano - Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently
- TensorFlow - Open source software library for numerical computation using data flow graphs
- Amazon Deep Scalable Sparse Tensor Network Engine (DSSTNE) - An Amazon-developed library for building deep learning (DL) machine learning (ML) models
- Keras: Deep Learning library for Theano and TensorFlow - A high-level neural networks library, written in Python and capable of running on top of either TensorFlow or Theano
- Weka - A collection of machine learning algorithms for data mining tasks
- Anaconda - Open data science platform powered by Python
- Python(x,y) - Free scientific and engineering development software for numerical computations, data analysis, and data visualization, based on the Python programming language, Qt graphical user interfaces, and the Spyder interactive scientific development environment
- Python
- IPython Documentation - Comprehensive environment for interactive and exploratory computing
- Jupyter notebook - A web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text
- Matplotlib - A Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms
- Natural Language Toolkit - A leading platform for building Python programs to work with human language data
- NumPy - The fundamental package for scientific computing with Python
- SciPy - A Python-based ecosystem of open-source software for mathematics, science, and engineering
- Pandas - An open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language
- PyBrain - Python-Based Reinforcement Learning, Artificial Intelligence and Neural Network Library
- Scikit-image - A collection of algorithms for image processing
- Scikit-learn - A Python module for machine learning
- Seaborn - A Python visualization library based on matplotlib
- StatsModels - A Python module that allows users to explore data, estimate statistical models, and perform statistical tests
- Pattern - Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization
- Scrapy - An open source and collaborative framework for extracting the data you need from websites
- ggplot - A package for plotting in Python
- Altair - Declarative statistical visualization library for Python
- Blaze - Provides Python users high-level access to efficient computation on inconveniently large data
- Dask - A flexible parallel computing library for analytic computing
- Bokeh - A Python interactive visualization library that targets modern web browsers for presentation
- Basemap - A library for plotting 2D data on maps in Python
- NetworkX - A Python language software package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks
- Beautiful Soup - A Python library for pulling data out of HTML and XML files
- Gensim - Python framework for fast Vector Space Modelling
- Shogun - Machine learning toolbox that provides a wide range of unified and efficient Machine Learning (ML) methods
- Chainer - A Powerful, Flexible, and Intuitive Framework for Neural Networks
- NuPIC - An open source project based on a theory of neocortex called Hierarchical Temporal Memory (HTM)
- Neon - Python-based deep learning library
- PyMC - A Python module that implements Bayesian statistical models and fitting algorithms, including Markov chain Monte Carlo
- Fuel - A data pipeline framework which provides your machine learning models with the data they need
- PyMVPA - MultiVariate Pattern Analysis (MVPA) in Python
- DEAP - A novel evolutionary computation framework for rapid prototyping and testing of ideas
- Annoy - Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
- R
- General CRAN List - By task
- General CRAN List - NLP/Text analytics
- General CRAN List
- ggplot2 - A plotting system for R
- ISLR - The collection of datasets used in the book "An Introduction to Statistical Learning with Applications in R"
- Rcpp - Provides R functions as well as C++ classes which offer a seamless integration of R and C++
- dplyr - A fast, consistent tool for working with data frame like objects, both in memory and out of memory
- plyr - A set of tools for breaking a big problem down into manageable pieces, operating on each piece, and putting the pieces back together
- stringr - A consistent, simple and easy to use set of wrappers around the fantastic 'stringi' package
- shiny - Makes it easy to build interactive web applications with R
- knitr - A general-purpose tool for dynamic report generation in R using Literate Programming techniques
- readr - Read flat/tabular text files from disk (or a connection)
- R Markdown - Convert R Markdown documents into a variety of formats
- tidyr - Tools for data tidying (not general reshaping or aggregating) that work well with 'dplyr' data pipelines
- lubridate - Functions to work with date-times and time-spans
- lme4 - Fit linear and generalized linear mixed-effects models
- nlme - Fit and compare Gaussian linear and nonlinear mixed-effects models
- mime - Guesses the MIME type from a filename extension using the data derived from /etc/mime.types in UNIX-type systems
- mda - Mixture and flexible discriminant analysis, multivariate adaptive regression splines (MARS), BRUTO, ...
- lasso2 - Routines and documentation for solving regression problems while imposing an L1 constraint on the estimates
- lars - Efficient procedures for fitting an entire lasso sequence with the cost of a single least squares fit
- digest - Implementation of a function 'digest()' for the creation of hash digests of arbitrary R objects (using the 'md5', 'sha-1', 'sha-256', 'crc32', 'xxhash' and 'murmurhash' algorithms) permitting easy comparison of R language objects, as well as a function 'hmac()' to create hash-based message authentication code
- reshape2 - Flexibly restructure and aggregate data using just two functions: 'melt' and 'dcast' (or 'acast')
- colorspace - Carries out mapping between assorted color spaces including RGB, HSV, HLS, CIEXYZ, CIELUV, HCL (polar CIELUV), CIELAB and polar CIELAB
- RColorBrewer - Provides color schemes for maps (and other graphics)
- manipulate - Interactive plotting functions for use within RStudio
- scales - Graphical scales map data to aesthetics, and provide methods for automatically determining breaks and labels for axes and legends
- labeling - Provides a range of axis labeling algorithms
- proto - An object oriented system using object-based, also called prototype-based, rather than class-based object oriented ideas
- randomForest - Classification and regression based on a forest of trees using random inputs
- glmnet - Extremely efficient procedures for fitting the entire lasso or elastic-net regularization path for linear regression, logistic and multinomial regression models, Poisson regression and the Cox model
- caret - Miscellaneous functions for training and plotting classification and regression models
- ggvis - An implementation of an interactive grammar of graphics, taking the best parts of 'ggplot2', combining them with the reactive framework of 'shiny' and drawing web graphics using 'vega'
- rgl - Provides medium to high level functions for 3D interactive graphics, including functions modelled on base graphics (plot3d(), etc.) as well as functions for constructing representations of geometric objects (cube3d(), etc.)
- htmlwidgets - A framework for creating HTML widgets that render in various contexts including the R console, 'R Markdown' documents, and 'Shiny' web applications
- leaflet - Create and customize interactive maps using the 'Leaflet' JavaScript library and the 'htmlwidgets' package
- dygraphs - An R interface to the 'dygraphs' JavaScript charting library
- googleVis - R interface to Google Charts API, allowing users to create interactive charts based on data frames
- zoo - An S3 class with methods for totally ordered indexed observations. It is particularly aimed at irregular time series of numeric vectors/matrices and factors
- RCurl - A wrapper for 'libcurl' (http://curl.haxx.se/libcurl/). Provides functions to compose general HTTP requests, along with convenient functions to fetch URIs, get and post forms, etc., and to process the results returned by the web server
- jsonlite - A fast JSON parser and generator optimized for statistical data and the web
- bitops - Functions for bitwise operations on integer vectors
- devtools - Collection of package development tools
- magrittr - Provides a mechanism for chaining commands with a new forward-pipe operator, %>%
- packrat - Manage the R packages your project depends on in an isolated, portable, and reproducible way
- haven - Import foreign statistical formats into R via the embedded 'ReadStat' C library
- DT - Data objects in R can be rendered as HTML tables using the JavaScript library 'DataTables' (typically via R Markdown or Shiny)
- mice - Multiple imputation using Fully Conditional Specification (FCS) implemented by the MICE algorithm
- rpart - Recursive partitioning for classification, regression and survival trees
- party - A computational toolbox for recursive partitioning
- nnet - Software for feed-forward neural networks with a single hidden layer, and for multinomial log-linear models
- e1071 - Functions for latent class analysis, short time Fourier transform, fuzzy clustering, support vector machines, shortest path computation, bagged clustering, naive Bayes classifier, ...
- kernlab - Kernel-based machine learning methods for classification, regression, clustering, novelty detection, quantile regression and dimensionality reduction
- gbm - Includes regression methods for least squares, absolute loss, t-distribution loss, quantile regression, logistic, multinomial logistic, Poisson, Cox proportional hazards partial likelihood, AdaBoost exponential loss, Huberized hinge loss, and Learning to Rank measures (LambdaMart)
- wordcloud - Pretty word clouds
- C50 - C5.0 decision trees and rule-based models for pattern recognition
- class - Various functions for classification, including k-nearest neighbour, Learning Vector Quantization and Self-Organizing Maps
- neuralnet - Training of neural networks using backpropagation, resilient backpropagation with (Riedmiller, 1994) or without weight backtracking (Riedmiller and Braun, 1993) or the modified globally convergent version by Anastasiadis et al. (2005)
- tm - A framework for text mining applications within R
- gmodels - Various R programming tools for model fitting
- RODBC - An ODBC database interface
- princurve - Fits a principal curve to a data matrix in arbitrary dimensions
- Analytics