- Sublime Text Package Control
- Bower
- NPM
- Crantastic
- CRAN Packages by Name
- RSeek
- PyPI
- Gulp
- Grunt
- Cookiecutter - Project templates
- Data Version Control - Make your data science projects reproducible and shareable
- Package and Environment Management
- Anaconda - Open data science platform powered by Python
- ActiveState - 300+ Packages Including Data Science and Machine Learning
- Platform
- Python(x,y) - Free scientific and engineering development software for numerical computations, data analysis, and data visualization, based on the Python programming language, Qt graphical user interfaces, and the Spyder interactive scientific development environment
- Probabilistic Graphical Modelling
- Visualization
- Matplotlib - A Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms
- Seaborn - A Python visualization library based on matplotlib
- Altair - Declarative statistical visualization library for Python
- Bokeh - A Python interactive visualization library that targets modern web browsers for presentation
- ggplot - A package for plotting in Python
- Basemap - A library for plotting 2D data on maps in Python
- Facebook's Visdom - A flexible tool for creating, organizing, and sharing visualizations of live, rich data
- Scikit-plot - An intuitive library to add plotting functionality to scikit-learn objects
- Munging and Wrangling
- Pandas - An open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language
- Scientific and Numerical
- Statistics and Mathematics
- Statsmodels - A Python module that allows users to explore data, estimate statistical models, and perform statistical tests
- Statsmodels Stats
- Notebooks and Reporting
- IPython Documentation - Comprehensive environment for interactive and exploratory computing
- Web Mining and Scraping
- Big Data and Performance
- Network and Graph Analytics
- NetworkX - A Python language software package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks
- Parsing and Data Extraction
- Beautiful Soup - A Python library for pulling data out of HTML and XML files
- Data Pipeline
- Fuel - A data pipeline framework which provides your machine learning models with the data they need
- Web and API
- Requests: HTTP for Humans - Requests is the only Non-GMO HTTP library for Python, safe for human consumption
- Package and Environment Management
- devtools - Collection of package development tools
- packrat - Manage the R packages your project depends on in an isolated, portable, and reproducible way
- R Documentation
- Platform
- proto - An object-oriented system based on object-based (also called prototype-based) rather than class-based object-oriented ideas
- magrittr - Provides a mechanism for chaining commands with a new forward-pipe operator, %>%
- DT - Data objects in R can be rendered as HTML tables using the JavaScript library 'DataTables' (typically via R Markdown or Shiny)
- Visualization
- ggplot2 - A plotting system for R
- ggvis - An implementation of an interactive grammar of graphics, taking the best parts of 'ggplot2', combining them with the reactive framework of 'shiny' and drawing web graphics using 'vega'
- htmlwidgets - A framework for creating HTML widgets that render in various contexts including the R console, 'R Markdown' documents, and 'Shiny' web applications
- leaflet - Create and customize interactive maps using the 'Leaflet' JavaScript library and the 'htmlwidgets' package
- googleVis - R interface to Google Charts API, allowing users to create interactive charts based on data frames
- dygraphs - An R interface to the 'dygraphs' JavaScript charting library
- rgl - Provides medium to high level functions for 3D interactive graphics, including functions modelled on base graphics (plot3d(), etc.) as well as functions for constructing representations of geometric objects (cube3d(), etc.)
- shiny - Makes it easy to build interactive web applications with R
- manipulate - Interactive plotting functions for use within RStudio
- RColorBrewer - Provides color schemes for maps (and other graphics)
- scales - Graphical scales map data to aesthetics, and provide methods for automatically determining breaks and labels for axes and legends
- labeling - Provides a range of axis labeling algorithms
- colorspace - Carries out mapping between assorted color spaces including RGB, HSV, HLS, CIEXYZ, CIELUV, HCL (polar CIELUV), CIELAB and polar CIELAB
- Munging and Wrangling
- dplyr - A fast, consistent tool for working with data frame like objects, both in memory and out of memory
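- plyr - A set of tools for a common set of problems: breaking a big problem down into manageable pieces, operating on each piece, and putting the pieces back together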
- stringr - A consistent, simple and easy to use set of wrappers around the fantastic 'stringi' package
- tidyr - Designed specifically for data tidying (not general reshaping or aggregating); works well with 'dplyr' data pipelines
- lubridate - Functions to work with date-times and time-spans
- digest - Implementation of a function 'digest()' for the creation of hash digests of arbitrary R objects (using the 'md5', 'sha-1', 'sha-256', 'crc32', 'xxhash' and 'murmurhash' algorithms) permitting easy comparison of R language objects, as well as a function 'hmac()' to create hash-based message authentication code
- reshape2 - Flexibly restructure and aggregate data using just two functions: 'melt' and 'dcast' (or 'acast')
- MICE - Multiple imputation using Fully Conditional Specification (FCS) implemented by the MICE algorithm
- party - A computational toolbox for recursive partitioning
- Scientific and Numerical
- Statistics and Mathematics
- Notebooks and Reporting
- R Markdown - Convert R Markdown documents into a variety of formats
- knitr - A general-purpose tool for dynamic report generation in R using Literate Programming techniques
- Web Mining and Scraping
- Big Data and Performance
- Rcpp - Provides R functions as well as C++ classes which offer a seamless integration of R and C++
- Network and Graph Analytics
- Parsing and Data Extraction
- readr - Read flat/tabular text files from disk (or a connection)
- mime - Guesses the MIME type from a filename extension using the data derived from /etc/mime.types in UNIX-type systems
- jsonlite - A fast JSON parser and generator optimized for statistical data and the web
- haven - Import foreign statistical formats into R via the embedded 'ReadStat' C library
- RODBC - An ODBC database interface
- Data Pipeline
- Web and API
- RCurl - A wrapper for 'libcurl' (http://curl.haxx.se/libcurl/) that provides functions to compose general HTTP requests, along with convenient functions to fetch URIs, get and post forms, etc., and process the results returned by the web server
- Visualization
- Plot.ly - The modern platform for agile business intelligence and data science
- D3
- Britecharts
- Mathbox - Presentation-quality WebGL math graphing
- Reinforcement learning
- Ray - A flexible, high-performance distributed execution framework
- Notebooks, collaboration, and platforms
- Jupyter notebook - A web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text
- Jupyter notebook extensions - A collection of various notebook extensions for Jupyter
- Mode Analytics - A SQL editor, Python notebook, and visualization builder, all rolled into one
- Apache Zeppelin - A web-based notebook that enables interactive data analytics
- Beaker - A notebook-style development environment for working interactively with large and complex datasets
- Dataiku - Dataiku DSS is the collaborative data science software platform for teams of data scientists, data analysts, and engineers to explore, prototype, build, and deliver their own data products more efficiently
- Domino Data Lab
- Cloudera Data Science Workbench
- ScienceOps Data Science Operations System
- GUI-driven advanced analytics and data mining
- RapidMiner - Data science platform
- KNIME - Fast, easy and intuitive access to advanced data science
- Weka - A collection of machine learning algorithms for data mining tasks
- Orange - Open source machine learning and data visualization for novice and expert
- Symbolic Math
- Terminal and CLI
- Data modeling
- dbt - Data Modeling for Teams
- Instrumentation, data collection, and analytics
- A/B Testing
- Business metrics and analytics
- Feedback, Surveys, Questionnaires, and Net promoter score (NPS)
- Logging, monitoring, and application performance management (APM)
- Event data capture
- Google Cloud Datalab - An easy to use interactive tool for large-scale data exploration, analysis, and visualization
- Orange - Open source machine learning and data visualization for novice and expert
- RapidMiner - Data science platform
- Statwing
- Mode Analytics - A SQL editor, Python notebook, and visualization builder, all rolled into one.
- Looker - Business intelligence
- AWS QuickSight - Fast, easy to use business analytics
- Google Data Studio
- Graphviz - Open source graph visualization software
- Tableau - Business intelligence software
- Qlik - Business intelligence software
- Microsoft Power BI
- SlamData
- Silk
- Chartio
- Plotly
- Datawrapper
- TIBCO Jaspersoft
- IBM Cognos
- Kibana - Kibana lets you visualize your Elasticsearch data and navigate the Elastic Stack
- Databazel - The analytical and reporting solution for MongoDB