NumPy, short for Numerical Python, is an open-source library for scientific computing in Python. It provides a high-performance multidimensional array object and a collection of tools for working with these arrays, which make complex data manipulation easy. Because pure Python code executes comparatively slowly, the core of NumPy is implemented in a low-level language (mostly C) that provides the necessary performance.
Numeric, the ancestor of NumPy, was developed by Jim Hugunin. Another package, Numarray, was developed later and offered some additional functionality. In 2005, Travis Oliphant created the NumPy package by incorporating the features of Numarray into Numeric. Today, many contributors work on this open-source project.
A NumPy array can have n dimensions. All of its elements have the same type, and each element is indexed by a tuple of non-negative integers. The number of dimensions is the rank of the array. The shape of an array is a tuple of integers giving the size of the array along each dimension.
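The terms above map onto attributes of the array object. The following is a minimal sketch; the variable name m is only illustrative:

```python
import numpy as np

# A 2x3 array: two rows, three columns, all elements share one type.
m = np.array([[1, 2, 3], [4, 5, 6]])

print(m.ndim)    # number of dimensions (the rank): 2
print(m.shape)   # size along each dimension: (2, 3)
print(m.dtype)   # common element type (integer width depends on the platform)
print(m[1, 2])   # element at row 1, column 2: 6
```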
Such an array can, for example, be used for:
The key difference between a NumPy array and a Python list is that arrays are designed to handle vectorized operations, while a Python list is not. That means that a function applied to an array is performed on every item in the array, rather than on the array object as a whole.
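A small comparison may make the difference concrete. This is only a sketch with made-up values; it doubles every element once with a plain list and once with an array:

```python
import numpy as np

values = [1, 2, 3, 4]

# Plain Python list: an explicit loop (here a comprehension) is needed.
doubled_list = [v * 2 for v in values]

# NumPy array: the multiplication is applied to every element at once.
arr = np.array(values)
doubled_arr = arr * 2

print(doubled_list)  # [2, 4, 6, 8]
print(doubled_arr)   # [2 4 6 8]

# Multiplying the list itself repeats it instead of doubling its items.
repeated = values * 2
print(repeated)      # [1, 2, 3, 4, 1, 2, 3, 4]
```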
NumPy methods and objects can be used by importing the library:
import numpy
Creating an alias np for numpy makes development more convenient:
import numpy as np
An instance of the ndarray class can be constructed with various array creation routines.
A basic ndarray can be created using numpy.array().
For example, a one-dimensional array:
a = np.array([1, 2, 3])
print(a)
Or a two-dimensional array:
a = np.array([[1, 2], [3, 4]])
print(a)
It is possible to force the type of the ndarray by using the dtype argument:
a = np.array([[1, 2], [3, 4]], dtype=complex)
print(a)
The method numpy.empty() creates an uninitialized array of the specified shape.
The following creates an empty 3x2 array:
x = np.empty([3, 2])
print(x)
!> The elements of the array show arbitrary values, as they are not initialized.
numpy.zeros() returns a new ndarray of the specified shape, filled with zeros.
The following creates an array of five zeros:
x = np.zeros(5)
print(x)
The method numpy.ones() creates a new ndarray of the specified shape, filled with ones.
The following creates a 3x2 array of six ones:
x = np.ones((3, 2))
print(x)
numpy.full() creates an ndarray of the specified shape, filled with a given value.
The following creates a 2x2 array filled with 7:
x = np.full((2, 2), 7)
print(x)
Eye matrices refer to identity matrices. They are created by using numpy.eye(). Since identity matrices are always square, only one argument is required for the shape.
The following creates a 4x4 eye matrix:
x = np.eye(4)
print(x)
By using numpy.random.random() it is possible to create an ndarray filled with random values in the half-open interval [0, 1).
The following creates a 2x2 array filled with random values between 0 and 1:
x = np.random.random((2, 2))
print(x)
The same can be done for random integer values by using numpy.random.randint().
The following creates a 5x5 array filled with random integers from 0 to 9 (the upper bound 10 is excluded):
x = np.random.randint(0, 10, (5, 5))
print(x)
numpy.linspace() returns a new one-dimensional array with a specified number of evenly spaced points. Its three main arguments are the starting value of the sequence, the end value of the sequence (included by default) and the number of points to generate.
The following creates an array from 10 to 20 with 5 evenly spaced points:
x = np.linspace(10, 20, 5)
print(x)
numpy.arange() is similar to Python's built-in range() function.
The following creates an array from 10 up to, but not including, 20 with step 2:
x = np.arange(10, 20, 2)
print(x)
numpy.logspace() is similar to numpy.linspace(); the difference is that the points are spaced evenly on a log scale. The start and stop arguments are exponents of the base, which is 10 by default.
The following creates an array from 1 to 100 in powers of 10:
x = np.logspace(0, 2, 3)
print(x)
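Because numpy.logspace() takes exponents rather than values as its start and stop arguments, it can be read as raising the base to the points of numpy.linspace(). The following sketch checks that relationship for base 10:

```python
import numpy as np

# start=0 and stop=2 are exponents: 10**0, 10**1, 10**2
x = np.logspace(0, 2, 3)

# The same points, built explicitly from evenly spaced exponents
y = 10 ** np.linspace(0, 2, 3)

print(x)
print(np.allclose(x, y))  # True
```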
numpy.reshape() is used to change the shape of an array.
The following changes a one-dimensional array of 6 elements into a 3x2 array:
y = np.arange(6)
x = np.reshape(y, (3, 2))
print(x)
The contents of an ndarray object can be accessed and modified by slicing, indexing or conditions.
As in Python's sequences, the colon notation start:stop:step is used to retrieve a part of the ndarray, where step defaults to 1 if it is not specified.
The following creates an array from 0 to 9 and retrieves its elements from index 2 up to, but not including, index 7 with step 2:
a = np.arange(10)
b = a[2:7:2]
print(b)
This example shows how to retrieve every element from index 2 onwards:
a = np.arange(10)
b = a[2:]
print(b)
The following does the opposite and takes all elements before index 2:
a = np.arange(10)
b = a[:2]
print(b)
Elements can be accessed by their row and column position.
The following creates a 3x2 array and takes the elements at positions (0, 0), (1, 1) and (2, 0):
x = np.array([[1, 2], [3, 4], [5, 6]])
y = x[[0, 1, 2], [0, 1, 0]]
print(y)
This example takes the corner elements of a 4x3 array:
x = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]])
y = x[[[0, 0], [3, 3]], [[0, 2], [0, 2]]]
print(y)
Conditions can be used in indexing. Depending on how a condition is used, it can filter or modify an ndarray.
The following filters the array, as it returns all elements that are greater than 5:
x = np.random.randint(0, 10, (5, 5))
y = x[x > 5]
print(y)
This modifies the ndarray, as it assigns 0 to all values smaller than 5:
x = np.random.randint(0, 10, (5, 5))
x[x < 5] = 0
print(x)
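Both examples rely on the same mechanism: the condition itself evaluates to a boolean array, which is then used as an index. The sketch below uses fixed values instead of random ones so the result is reproducible; the name mask is only illustrative:

```python
import numpy as np

x = np.array([[1, 8, 3], [9, 2, 7]])

mask = x > 5     # boolean array with the same shape as x
print(mask)
print(x[mask])   # selected elements as a 1-D array: [8 9 7]

# Conditions can be combined element-wise with & (and) and | (or);
# the parentheses are required because of operator precedence.
print(x[(x > 2) & (x < 9)])  # [8 3 7]
```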
NumPy contains a collection of tools to manipulate ndarrays, such as addition, subtraction, multiplication and division.
Normally, two arrays must have the same shape to be added or subtracted, for example. Thanks to so-called "broadcasting", this is not always required. NumPy compares the two shapes dimension by dimension, starting with the last one; broadcasting works when each pair of sizes is either equal or contains a 1, and missing leading dimensions are treated as 1.
The following example adds a one-dimensional array to a 4x3 array using broadcasting. b is added to each of the four rows of a:
a = np.random.randint(0, 10, (4, 3))
b = np.array([10, 10, 10])
c = np.add(a, b)
c = a + b # identical
print(c)
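Broadcasting compares shapes starting from the last dimension. The following sketch, with illustrative names row and col, shows two compatible cases and one shape combination that NumPy rejects:

```python
import numpy as np

a = np.zeros((4, 3))
row = np.array([10, 20, 30])          # shape (3,), treated as (1, 3)
col = np.array([[1], [2], [3], [4]])  # shape (4, 1)

print((a + row).shape)  # (4, 3): row is added to every row of a
print((a + col).shape)  # (4, 3): col is added to every column of a

# Shapes (4, 3) and (4,) are incompatible: the trailing sizes 3 and 4 differ.
try:
    a + np.array([1, 2, 3, 4])
except ValueError as err:
    print("broadcast failed:", err)
```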
# Element-wise subtraction; b is broadcast across the rows of a
a = np.random.randint(0, 10, (3, 3))
b = np.array([10, 10, 10])
c = np.subtract(a, b)
c = a - b # identical
print(c)
numpy.multiply() performs an element-wise multiplication.
a = np.random.randint(0, 10, (3, 3))
b = np.array([10, 10, 10])
c = np.multiply(a, b)
c = a * b # identical
print(c)
As in Python 3, numpy.divide() performs true division. True division adjusts the output type to present the best answer, regardless of the input types.
a = np.random.randint(0, 10, (3, 3))
b = np.array([10, 10, 10])
c = np.divide(a, b)
c = a / b # identical
print(c)
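The effect on the output type can be seen with integer inputs. This sketch contrasts true division with floor division (the // operator, shown only for comparison):

```python
import numpy as np

a = np.array([7, 8, 9])
b = np.array([2, 2, 2])

true_div = a / b     # same as np.divide(a, b)
floor_div = a // b   # same as np.floor_divide(a, b)

print(true_div, true_div.dtype)  # [3.5 4.  4.5] float64
print(floor_div)                 # [3 4 4]
```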
# Element-wise remainder of the division of a by b
a = np.array([10, 20, 30])
b = np.array([3, 5, 7])
c = np.mod(a, b)
c = a % b # identical
print(c)
The elements of the first array are raised element-wise to the powers given in the second array.
a = np.array([10, 100, 1000])
# small exponents keep the results within the integer range (no silent overflow)
b = np.array([2, 2, 2])
c = np.power(a, b)
c = a ** b # identical
print(c)
For two-dimensional arrays, numpy.dot() is a matrix multiplication; for one-dimensional arrays, it is an inner product without complex conjugation. For N dimensions, it is a sum product over the last axis of the first array and the second-to-last axis of the second array.
The following does a matrix multiplication:
a = np.array([[1, 2], [3, 4]])
b = np.array([[11, 12], [13, 14]])
c = np.dot(a, b)
print(c)
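For comparison, the sketch below shows the one-dimensional case (an inner product) and confirms that, for two-dimensional arrays, numpy.dot() agrees with the @ operator, while * remains element-wise:

```python
import numpy as np

# 1-D arrays: inner product
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
print(np.dot(u, v))  # 1*4 + 2*5 + 3*6 = 32

# 2-D arrays: matrix multiplication
a = np.array([[1, 2], [3, 4]])
b = np.array([[11, 12], [13, 14]])
print(np.array_equal(np.dot(a, b), a @ b))  # True

# The * operator multiplies element by element instead
print(a * b)
```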
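# Cross product of each row of a with the corresponding row of b;
# for 2-element vectors only the scalar z-component is returned
a = np.array([[1, 2], [3, 4]])
b = np.array([[11, 12], [13, 14]])
c = np.cross(a, b)
print(c)
# Transpose: rows become columns
a = np.array([[1, 2], [3, 4]])
b = a.T
print(b)
# The constant pi
print(np.pi)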
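# Sine of 20 evenly spaced angles between 0 and 2*pi (in radians)
a = np.linspace(0, 2 * np.pi, 20)
b = np.sin(a)
print(b)
# Cosine of the same angles
a = np.linspace(0, 2 * np.pi, 20)
b = np.cos(a)
print(b)
# Tangent of the same angles
a = np.linspace(0, 2 * np.pi, 20)
b = np.tan(a)
print(b)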
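# Round to the given number of decimals
a = np.linspace(0, 2 * np.pi, 20)
b = np.around(a, 1) # precision
print(b)
# Round down to the nearest integer
a = np.linspace(0, 2 * np.pi, 20)
b = np.floor(a)
print(b)
# Round up to the nearest integer
a = np.linspace(0, 2 * np.pi, 20)
b = np.ceil(a)
print(b)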
# Maximum of the whole array
a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])
b = np.amax(a)
print(b)
# Maximum along axis 1, i.e. one value per row
a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])
b = np.amax(a, 1)
print(b)
# Minimum of the whole array
a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])
b = np.amin(a)
print(b)
# Minimum along axis 1, i.e. one value per row
a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])
b = np.amin(a, 1)
print(b)
# Mean of all elements
a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])
b = np.mean(a)
print(b)
# Mean along axis 1, i.e. one value per row
a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])
b = np.mean(a, 1)
print(b)
# Median of all elements
a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])
b = np.median(a)
print(b)
# Median along axis 1, i.e. one value per row
a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])
b = np.median(a, 1)
print(b)