Data Science: Getting Started#
This is a high-level map of the common Python libraries used when getting started with data science.
Core Libraries#
Numeric#
- NumPy - numerical Python: numerical arrays and mathematical operations on arrays. Arrays are efficient data containers that lower-level languages can operate on without copying the data
- SciPy - high-level numerical routines, optimisation, regression and interpolation
- Matplotlib - 2D visualisations and interactive plots
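As a minimal sketch of what "mathematical operations on arrays" looks like in practice, the snippet below builds a small NumPy array, applies element-wise arithmetic and takes a reduction. The temperature values and variable names are made up for illustration. The last step shows that a slice is a view onto the same memory rather than a copy, which is one aspect of why arrays are cheap to pass around.

```python
import numpy as np

# Hypothetical sample data: five temperature readings in Celsius.
celsius = np.array([12.5, 15.0, 9.8, 21.3, 18.4])

# Arithmetic applies element-wise to the whole array, with no explicit loop.
fahrenheit = celsius * 9 / 5 + 32

# Reductions summarise the array as a single value.
mean_c = celsius.mean()

# Basic slicing returns a view onto the same buffer rather than a copy.
first_three = celsius[:3]
shares_buffer = np.shares_memory(celsius, first_three)

print(fahrenheit)
print(mean_c)
print(shares_buffer)
```

Because `first_three` is a view, writing to it would also change `celsius`; call `.copy()` on the slice when an independent array is needed.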
Interactive environments#
- IPython - robust interactive environment. Useful when exploring data and working with matplotlib
- Jupyter Notebooks - documents that combine code, output, notes and visualisations
Domain-specific packages#
- Mayavi - 3D visualisations
- pandas - rich data structures and functions for working with structured data. The primary object is a DataFrame
- SymPy - symbolic computing
- scikit-image - image processing
- scikit-learn - machine learning
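To give a feel for the pandas `DataFrame` mentioned above, here is a small sketch that builds one from a dictionary, filters rows and aggregates by group. The city names, column names and values are hypothetical and only serve the illustration.

```python
import pandas as pd

# Hypothetical measurements; the column names are illustrative only.
df = pd.DataFrame(
    {
        "city": ["Leeds", "York", "Leeds", "York"],
        "temp_c": [12.5, 15.0, 9.8, 21.3],
    }
)

# Select rows with a boolean mask built from a column comparison.
warm = df[df["temp_c"] > 12.0]

# Group rows by one column and aggregate another.
mean_by_city = df.groupby("city")["temp_c"].mean()

print(warm)
print(mean_by_city)
```

Each column of a `DataFrame` is a pandas `Series`, so column-wise operations such as the comparison and the mean here work on whole columns at once.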