What is NumPy?

NumPy is an open-source library for scientific computing in Python. The name stands for Numerical Python. It provides a high-performance multidimensional array object and a collection of tools for working with these arrays. These tools make complex data manipulation easy. Because pure Python loops are comparatively slow, the core of NumPy is implemented in C, a low-level language that provides the necessary performance.

Numeric, the ancestor of NumPy, was developed by Jim Hugunin. Another package, Numarray, was developed later and offered some additional functionality. In 2005, Travis Oliphant created the NumPy package by incorporating the features of Numarray into Numeric. Today, many contributors work on this open-source project.

N Dimensional Arrays

A NumPy array can have n dimensions. All of its elements share the same type, and it is indexed by a tuple of non-negative integers. The number of dimensions is the rank of the array. The shape of an array is a tuple of integers giving the size of the array along each dimension.

Such arrays can be used, for example, for:
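As a small illustration of rank, shape and tuple indexing, the following sketch builds a 2x3 array. The values and variable names are chosen only for this example.

```python
import numpy as np

# A 2x3 array: 2 rows, 3 columns, all elements share one type.
a = np.array([[1, 2, 3], [4, 5, 6]])

print(a.ndim)   # number of dimensions (the rank): 2
print(a.shape)  # size along each dimension: (2, 3)
print(a[1, 2])  # element at row 1, column 2: 6
```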

Mathematical and logical operations.
Fourier transforms and routines for shape manipulation.
Operations related to linear algebra.
Video and image processing.
Machine-learning algorithms.

The key difference between a NumPy array and a Python list is that arrays are designed for vectorized operations, while Python lists are not. That means that a function applied to an array is performed on every item in the array, rather than on the array object as a whole.
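The difference can be seen by applying the same operator to a list and to an array. This is a minimal sketch with illustrative values:

```python
import numpy as np

values = [1, 2, 3]
arr = np.array(values)

doubled_list = values * 2  # list repetition: [1, 2, 3, 1, 2, 3]
doubled_arr = arr * 2      # element-wise multiplication: [2 4 6]
roots = np.sqrt(arr)       # the function is applied to every element

print(doubled_list)
print(doubled_arr)
print(roots)
```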

Using the Library

NumPy methods and objects can be used by importing the library:

import numpy

Creating the alias np for numpy makes development more convenient:

import numpy as np

Creating an Ndarray Object

An instance of the ndarray class can be constructed with different array creation routines.

NumPy Array

A basic ndarray can be created using numpy.array().

For example, a one-dimensional array:

a = np.array([1, 2, 3])
print(a)

Or a two-dimensional array:

a = np.array([[1, 2], [3, 4]])
print(a)

It is possible to force the type of the ndarray by using dtype:

a = np.array([[1, 2], [3, 4]], dtype=complex)
print(a)
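The element type of an existing array can be inspected through its dtype attribute and converted with astype(). The following sketch uses illustrative values. Note that converting floats to integers truncates toward zero rather than rounding.

```python
import numpy as np

a = np.array([1.7, 2.2, 3.9])
print(a.dtype)  # float64

b = a.astype(int)  # conversion truncates toward zero
print(b)           # [1 2 3]

c = np.array([1, 2, 3], dtype=np.float32)
print(c.dtype)  # float32
```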

Empty

The function numpy.empty() creates an uninitialized array of the specified shape.

The following creates an uninitialized 3x2 array:

x = np.empty([3, 2])
print(x)

!> The elements in the array show arbitrary values because they are not initialized.

Zeros

numpy.zeros() returns a new ndarray of specified size, filled with zeros.

The following creates an array of five zeros:

x = np.zeros(5)
print(x)

Ones

The function numpy.ones() is used to create a new ndarray of specified size, filled with ones.

The following creates a 3x2 array of six ones:

x = np.ones((3, 2))
print(x)

Full

numpy.full() is used to create an ndarray filled with a particular number.

The following creates a 2x2 array filled with 7:

x = np.full((2, 2), 7)
print(x)

Eye

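Eye matrices refer to identity matrices, which have ones on the diagonal and zeros elsewhere. They are created by using numpy.eye(). With a single argument N, the result is a square N x N matrix. An optional second argument sets the number of columns if a non-square result is needed.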

The following creates a 4x4 eye matrix:

x = np.eye(4)
print(x)

Random

By using numpy.random.random() it is possible to create an ndarray filled with random values from 0 (inclusive) to 1 (exclusive).

The following creates a 2x2 array filled with random values between 0 and 1:

x = np.random.random((2, 2))
print(x)

Random Int

The same can be done for random integer values by using numpy.random.randint(). The lower bound is inclusive and the upper bound is exclusive.

The following creates a 5x5 array filled with random integers from 0 to 9:

x = np.random.randint(0, 10, (5, 5))
print(x)
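The random arrays above change on every run. When reproducible values are needed, one option is a seeded generator. The sketch below assumes NumPy 1.17 or newer, where numpy.random.default_rng() is available. The seed 42 and the variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)        # seeded generator
x = rng.integers(0, 10, size=(5, 5))   # integers from 0 to 9
y = rng.random((2, 2))                 # floats in [0, 1)

# A second generator with the same seed reproduces the same integers.
rng_again = np.random.default_rng(42)
x_again = rng_again.integers(0, 10, size=(5, 5))
print(np.array_equal(x, x_again))      # True
```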

Linspace

numpy.linspace() returns a new one-dimensional array with a specified number of evenly spaced points. Its three main arguments are the starting value of the sequence, the end value of the sequence (included by default) and the number of points to generate.

The following creates an array from 10 to 20 with 5 evenly spaced points:

x = np.linspace(10, 20, 5)
print(x)

Arange

numpy.arange() is similar to Python's built-in range() function. As with range(), the stop value is not included.

The following creates an array from 10 up to, but not including, 20 with step 2:

x = np.arange(10, 20, 2)
print(x)
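The difference between the two routines is easiest to see side by side: arange() takes a step and excludes the stop value, while linspace() takes a number of points and includes the end value unless endpoint=False is passed. A short sketch:

```python
import numpy as np

a = np.arange(10, 20, 2)                    # [10 12 14 16 18]
b = np.linspace(10, 20, 5)                  # [10.  12.5 15.  17.5 20. ]
c = np.linspace(10, 20, 5, endpoint=False)  # [10. 12. 14. 16. 18.]

print(a)
print(b)
print(c)
```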

Logspace

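numpy.logspace() is similar to numpy.linspace(). The difference is that the points are spaced evenly on a log scale. The start and stop arguments are exponents of the base, which is 10 by default.

The following creates an array from 1 to 100 in multiples of 10, using the exponents 0 and 2:

x = np.logspace(0, 2, 3)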
print(x)
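Because the first two arguments of numpy.logspace() are exponents rather than endpoints, the result matches raising the base to the output of numpy.linspace(). The sketch below checks this and also shows the base parameter with an illustrative value of 2.

```python
import numpy as np

x = np.logspace(0, 2, 3)          # 10**0, 10**1, 10**2
y = 10 ** np.linspace(0, 2, 3)    # same values, computed by hand
z = np.logspace(0, 3, 4, base=2)  # 2**0, 2**1, 2**2, 2**3

print(x)                  # [  1.  10. 100.]
print(np.allclose(x, y))  # True
print(z)                  # [1. 2. 4. 8.]
```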

Reshape

numpy.reshape() is used to change the shape of an array without changing its data.

The following changes a one-dimensional array of 6 elements into a 3x2 array:

y = np.arange(6)
x = np.reshape(y, (3, 2))
print(x)
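Reshaping is also available as a method on the array itself, and one dimension may be given as -1 so that NumPy infers it from the total number of elements. For a contiguous array such as the one below, the result is typically a view on the same data rather than a copy. A minimal sketch:

```python
import numpy as np

y = np.arange(6)

a = y.reshape(3, 2)   # method form, same result as np.reshape(y, (3, 2))
b = y.reshape(2, -1)  # -1 is inferred as 3, giving shape (2, 3)

print(a.shape, b.shape)
print(np.shares_memory(a, y))  # True: a is a view on y's data
```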

Slicing, Indexing and Conditions

The contents of an ndarray object can be accessed and modified by slicing, indexing or conditions.

Slicing

As in Python's collections, the colon notation start:stop:step is used to retrieve a part of the ndarray. The stop index is not included, and step defaults to 1 if it is not specified.

The following creates an array of the integers 0 to 9 and retrieves the elements from index 2 up to, but not including, index 7 with step 2:

a = np.arange(10)
b = a[2:7:2]
print(b)

This example shows how to retrieve every element from index 2 onward:

a = np.arange(10)
b = a[2:]
print(b)

The following does the opposite and takes all elements before index 2:

a = np.arange(10)
b = a[:2]
print(b)
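Two points are worth adding to the slicing examples. A basic slice is a view, so writing to it also changes the original array, and the same colon notation can be used per dimension on multidimensional arrays. The following sketch uses illustrative values:

```python
import numpy as np

a = np.arange(10)
b = a[2:7:2]         # view on elements at index 2, 4, 6
b[0] = 99            # writes through to a
print(a[2])          # 99

c = a[2:5].copy()    # an explicit copy is independent of a
c[0] = -1
print(a[2])          # still 99

m = np.arange(12).reshape(3, 4)
sub = m[:2, 1:3]     # rows 0-1, columns 1-2
print(sub)           # [[1 2]
                     #  [5 6]]
```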

Indexing

Elements can be accessed by their row and column position.

The following creates a 3x2 array and takes the elements at positions (0, 0), (1, 1) and (2, 0):

x = np.array([[1, 2], [3, 4], [5, 6]])
y = x[[0, 1, 2], [0, 1, 0]]
print(y)

This example takes the corner elements of a 4x3 array:

x = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]])
y = x[[[0, 0], [3, 3]], [[0, 2], [0, 2]]]
print(y)
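When two index lists are passed, NumPy pairs them up element by element: the first list supplies row indices and the second supplies column indices. The sketch below makes that pairing explicit by comparing it with a plain Python loop. The names rows, cols and manual are used only for illustration.

```python
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6]])
rows = [0, 1, 2]
cols = [0, 1, 0]

y = x[rows, cols]  # picks x[0, 0], x[1, 1], x[2, 0]
manual = np.array([x[r, c] for r, c in zip(rows, cols)])

print(y)                         # [1 4 5]
print(np.array_equal(y, manual)) # True
```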

Conditions

Conditions can be used in indexing. Depending on how a condition is used, it can filter or modify an ndarray.

The following filters the array, returning all elements that are greater than 5:

x = np.random.randint(0, 10, (5, 5))
y = x[x > 5]
print(y)

This modifies the ndarray as it assigns 0 to all values smaller than 5:

x = np.random.randint(0, 10, (5, 5))
x[x < 5] = 0
print(x)
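A condition produces a boolean array of the same shape, which is then used as a mask. Several conditions can be combined with & and |, each wrapped in parentheses, and numpy.where() returns a modified copy instead of changing the array in place. A sketch with fixed values so the results are predictable:

```python
import numpy as np

x = np.array([1, 6, 3, 8, 5, 9])

mask = x > 5                      # boolean array, one entry per element
between = x[(x > 2) & (x < 9)]    # both conditions must hold
replaced = np.where(x < 5, 0, x)  # 0 where x < 5, original value otherwise

print(mask)      # [False  True False  True False  True]
print(between)   # [6 3 8 5]
print(replaced)  # [0 6 0 8 5 9]
```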

Manipulating Ndarrays

NumPy contains a collection of tools to manipulate ndarrays, such as addition, division and multiplication.

Normally, element-wise operations such as addition or subtraction require both arrays to have the same shape. Thanks to so-called "broadcasting", this is not always necessary. NumPy compares the two shapes dimension by dimension, starting from the last one, and can broadcast when each pair of dimensions is either equal or one of them is 1.
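The following sketch shows two shape combinations that broadcast and one that does not. The arrays are illustrative. The failing case is wrapped in try/except so the example runs to completion.

```python
import numpy as np

a = np.arange(12).reshape(4, 3)       # shape (4, 3)
row = np.array([10, 20, 30])          # shape (3,)
col = np.array([[1], [2], [3], [4]])  # shape (4, 1)

by_row = a + row  # row is added to each of the 4 rows
by_col = a + col  # col is stretched across the 3 columns

print(by_row.shape, by_col.shape)  # (4, 3) (4, 3)

# Shapes (4, 3) and (4,) do not broadcast: the last dimensions are 3 and 4.
try:
    a + np.array([1, 2, 3, 4])
    compatible = True
except ValueError:
    compatible = False
print(compatible)  # False
```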

Addition

The following example adds a one-dimensional array to a 4x3 array using broadcasting. b is added to each of the four rows of a:

a = np.random.randint(0, 10, (4, 3))
b = np.array([10, 10, 10])
c = np.add(a, b)
c = a + b # identical
print(c)

Subtract

a = np.random.randint(0, 10, (3, 3))
b = np.array([10, 10, 10])
c = np.subtract(a, b)
c = a - b # identical
print(c)

Multiply

numpy.multiply() performs an element-wise multiplication.

a = np.random.randint(0, 10, (3, 3))
b = np.array([10, 10, 10])
c = np.multiply(a, b)
c = a * b # identical
print(c)

Divide

As with the / operator in Python 3, numpy.divide() performs true division. True division returns a floating-point result even when both inputs are integers.

a = np.random.randint(0, 10, (3, 3))
b = np.array([10, 10, 10])
c = np.divide(a, b)
c = a / b # identical
print(c)
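To make the effect on the output type visible, the sketch below compares true division with floor division on integer arrays. The values are chosen only for illustration.

```python
import numpy as np

a = np.array([7, 8, 9])
b = np.array([2, 2, 2])

true_div = a / b    # true division: float result
floor_div = a // b  # floor division: stays an integer type

print(true_div)     # [3.5 4.  4.5]
print(floor_div)    # [3 4 4]
```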

Remainder

a = np.array([10, 20, 30])
b = np.array([3, 5, 7])
c = np.mod(a, b)
c = a % b # identical
print(c)

Power

The elements of the first array are raised, element-wise, to the powers given in the second array. Integer arrays overflow silently, so this example keeps the results small:

a = np.array([1, 2, 3])
b = np.array([2, 3, 4])
c = np.power(a, b)
c = a ** b # identical
print(c)

Dot

For two-dimensional arrays, numpy.dot() performs a matrix multiplication. For one-dimensional arrays, it computes the inner product without complex conjugation. For N dimensions, it is a sum product over the last axis of the first array and the second-to-last axis of the second array.

The following does a matrix multiplication:

a = np.array([[1, 2], [3, 4]])
b = np.array([[11, 12], [13, 14]])
c = np.dot(a, b)
print(c)
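For two-dimensional arrays the @ operator gives the same result as numpy.dot(), and for one-dimensional arrays numpy.dot() reduces to a single number. A short sketch with illustrative vectors v and w:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[11, 12], [13, 14]])

c = a @ b  # same as np.dot(a, b) for 2-D arrays
print(c)   # [[37 40]
           #  [85 92]]

v = np.array([1, 2, 3])
w = np.array([4, 5, 6])
inner = np.dot(v, w)  # 1*4 + 2*5 + 3*6
print(inner)          # 32
```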

Cross

# Cross product of two 3-element vectors.
# (2-element vectors are deprecated for np.cross as of NumPy 2.0.)
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = np.cross(a, b)
print(c)

Transpose

a = np.array([[1, 2], [3, 4]])
b = a.T
print(b)

Functions

PI

print(np.pi)

Sine

a = np.linspace(0, 2 * np.pi, 20)
b = np.sin(a)
print(b)

Cosine

a = np.linspace(0, 2 * np.pi, 20)
b = np.cos(a)
print(b)

Tangent

a = np.linspace(0, 2 * np.pi, 20)
b = np.tan(a)
print(b)

Round

a = np.linspace(0, 2 * np.pi, 20)
b = np.around(a, 1) # round to 1 decimal place
print(b)

Floor

a = np.linspace(0, 2 * np.pi, 20)
b = np.floor(a)
print(b)

Ceil

a = np.linspace(0, 2 * np.pi, 20)
b = np.ceil(a)
print(b)

Max

a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])
b = np.amax(a)
print(b)

a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])
b = np.amax(a, axis=1) # maximum of each row
print(b)
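The second argument in the example above is the axis along which the reduction runs. The sketch below spells this out for the same array: axis=0 reduces down the columns and axis=1 reduces across each row. numpy.argmax() is included as a related function that returns positions instead of values.

```python
import numpy as np

a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])

overall = np.amax(a)             # largest element of the whole array
per_column = np.amax(a, axis=0)  # one maximum per column
per_row = np.amax(a, axis=1)     # one maximum per row
positions = np.argmax(a, axis=1) # column index of each row's maximum

print(overall)     # 9
print(per_column)  # [8 7 9]
print(per_row)     # [7 8 9]
print(positions)   # [1 0 2]
```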

Min

a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])
b = np.amin(a)
print(b)

a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])
b = np.amin(a, axis=1) # minimum of each row
print(b)

Mean

a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])
b = np.mean(a)
print(b)

a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])
b = np.mean(a, axis=1) # mean of each row
print(b)

Median

a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])
b = np.median(a)
print(b)

a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])
b = np.median(a, axis=1) # median of each row
print(b)