Data Science: Getting Started#

This is a high-level map of the common Python libraries used when getting started with data science.

Core Libraries#

Numeric#

  • NumPy - numerical Python: numerical arrays and mathematical operations on arrays. Arrays are efficient data containers that lower-level languages can operate on without copying the data
  • SciPy - high-level numerical routines, optimisation, regression and interpolation
  • Matplotlib - 2D visualisations and interactive plots
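
As a rough illustration of the NumPy entry above, the sketch below builds a small array, applies arithmetic to every element at once, and checks that a slice shares memory with the original array rather than copying it. The temperature readings and variable names are made up for the example.

```python
import numpy as np

# Hypothetical sample data: five temperature readings in Celsius.
celsius = np.array([12.5, 15.0, 9.5, 20.0, 18.0])

# Arithmetic applies element-wise, with no explicit Python loop.
fahrenheit = celsius * 9 / 5 + 32

# A basic slice is a view onto the same memory buffer, not a copy.
first_three = celsius[:3]
shares_buffer = np.shares_memory(celsius, first_three)

# Reductions such as mean() run over the whole array.
mean_celsius = celsius.mean()

print(fahrenheit)
print(shares_buffer)
print(mean_celsius)
```

Because a slice is a view, writing to `first_three` would also change `celsius`; call `.copy()` when an independent array is needed.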

Interactive environments#

  • IPython - robust interactive environment. Useful when exploring data and working with Matplotlib
  • Jupyter Notebooks - documents that combine code, output, notes and visualisations

Domain-specific packages#

  • Mayavi - 3D visualisations
  • pandas - rich data structures and functions for working with structured data. The primary object is a DataFrame
  • SymPy - symbolic computing
  • scikit-image - image processing
  • scikit-learn - machine learning
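
To make the pandas entry above a little more concrete, here is a minimal sketch of a DataFrame holding labelled columns, with a filter and a grouped aggregation. The city names, values and column names are invented for illustration only.

```python
import pandas as pd

# Hypothetical structured data: one row per observation.
df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Oslo", "Lima"],
    "temp_c": [4.0, 19.0, 6.0, 21.0],
})

# Boolean indexing keeps only the rows that satisfy a condition.
warm = df[df["temp_c"] > 10]

# groupby splits rows by a key column, then aggregates each group.
mean_by_city = df.groupby("city")["temp_c"].mean()

print(warm)
print(mean_by_city)
```

Each column of the DataFrame is a pandas Series, so column-wise operations behave much like the NumPy array operations they are built on.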

Sources#