Features
NumPy targets the
CPython reference implementation of Python, which is a non-optimizing
bytecode interpreter.
Mathematical algorithms written for this version of Python often run much slower than
compiled equivalents due to the absence of compiler optimization. NumPy partly addresses this slowness by providing multidimensional arrays, along with functions and operators that operate efficiently on them; taking advantage of these requires rewriting some code, mostly
inner loops, in terms of NumPy operations. Using NumPy in Python gives functionality comparable to
MATLAB since they are both interpreted, and they both allow the user to write fast programs as long as most operations work on
arrays or matrices instead of
scalars. In comparison, MATLAB boasts a large number of additional toolboxes, notably
Simulink, whereas NumPy is intrinsically integrated with Python, a more modern and complete
programming language. Moreover, complementary Python packages are available; SciPy is a library that adds more MATLAB-like functionality and
Matplotlib is a
plotting package that provides MATLAB-like plotting functionality. Although MATLAB can perform sparse matrix operations, NumPy alone cannot perform such operations and requires the use of the scipy.sparse library. Internally, both MATLAB and NumPy rely on
BLAS and
LAPACK for efficient
linear algebra computations. Python
bindings of the widely used
computer vision library
OpenCV utilize NumPy arrays to store and operate on data. Since images with multiple channels are simply represented as three-dimensional arrays, indexing,
slicing or
masking with other arrays are very efficient ways to access specific pixels of an image. Using the NumPy array as the universal data structure in OpenCV for images, extracted
feature points,
filter kernels and more vastly simplifies the programming workflow and
debugging. Importantly, many NumPy operations release the
global interpreter lock, which allows for multithreaded processing. NumPy also provides a C API, which allows Python code to interoperate with external libraries written in low-level languages.
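As a minimal sketch of the indexing, slicing and masking mentioned above, the following treats a small three-dimensional array as an image. The 4x6 image size, the (height, width, channel) layout in RGB order, and all variable names are assumptions made for illustration; OpenCV itself is not used here.

```python
import numpy as np

# Hypothetical 4x6 image with three color channels, all pixels black.
image = np.zeros((4, 6, 3), dtype=np.uint8)

# Indexing: set the single pixel at row 1, column 2 to pure red.
image[1, 2] = (255, 0, 0)

# Slicing: set the green channel (index 1) of the bottom two rows to 128.
image[2:, :, 1] = 128

# Masking: a boolean array marks every pixel whose green value exceeds 100;
# assigning through the mask turns all of those pixels white at once.
bright_green = image[:, :, 1] > 100
image[bright_green] = 255

print(bright_green.sum())  # prints 12 (two rows of six pixels)
print(image[1, 2])         # the red pixel is not selected by the mask
```

The mask has the shape of the first two axes only, so each selected entry is a whole pixel and the scalar 255 is broadcast across its three channels.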
The ndarray data structure
The core functionality of NumPy is its "ndarray", for
n-dimensional array,
data structure. These arrays are
strided views on memory. In contrast to Python's built-in list data structure, these arrays are homogeneously typed: all elements of a single array must be of the same type. Such arrays can also be views into memory buffers allocated by
C/
C++,
Python, and
Fortran extensions to the CPython interpreter without the need to copy data around, giving a degree of compatibility with existing numerical libraries. This functionality is exploited by the SciPy package, which wraps a number of such libraries (notably BLAS and LAPACK). NumPy has built-in support for
memory-mapped ndarrays. Just-in-time compilers that interoperate with NumPy include
Numba; Cython and
Pythran are static-compiling alternatives to these. Many modern
large-scale scientific computing applications have requirements that exceed the capabilities of NumPy arrays. For example, NumPy arrays are usually loaded into a computer's
memory, which might have insufficient capacity for the analysis of large
datasets. Further, NumPy operations are executed on a single
CPU. However, many linear algebra operations can be accelerated by executing them on
clusters of CPUs or of specialized hardware, such as
GPUs and
TPUs, which many
deep learning applications rely on. As a result, several alternative array implementations have arisen in the scientific Python ecosystem in recent years, such as
Dask for distributed arrays and
TensorFlow or
JAX for computations on GPUs. Because of NumPy's popularity, these often implement a
subset of its
API or mimic it, so that users can change their array implementation with minimal changes to their code. CuPy, a library accelerated by
Nvidia's
CUDA framework, has also shown potential for faster computing, being a 'drop-in replacement' for NumPy.
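The description of ndarrays as homogeneously typed, strided views on memory can be illustrated with a short sketch. The array shape and the names `a`, `t`, `every_other`, `buf` and `wrapped` are illustrative choices; the `bytearray` stands in for a buffer allocated by some other component.

```python
import numpy as np

# A 3x4 array of 64-bit integers stored contiguously in row-major order.
a = np.arange(12, dtype=np.int64).reshape(3, 4)
print(a.dtype)    # every element has the same type: int64
print(a.strides)  # prints (32, 8): bytes to step to the next row / column

# Transposing and slicing produce views with different strides over the
# same memory; no element is copied.
t = a.T
every_other = a[:, ::2]
print(t.strides)            # prints (8, 32)
print(every_other.strides)  # prints (32, 16)
print(np.shares_memory(a, t), np.shares_memory(a, every_other))  # prints True True

# Writing through a view changes the original array.
every_other[0, 1] = -1
print(a[0, 2])  # prints -1

# An array can also wrap an existing buffer without copying it.
buf = bytearray(8)
wrapped = np.frombuffer(buf, dtype=np.uint8)
wrapped[0] = 42
print(buf[0])  # prints 42
```

Because views share memory with the array they were derived from, code that needs an independent array has to request one explicitly, for example with `a.copy()`.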
Examples
NumPy is conventionally imported as np.

    import numpy as np
    from numpy.typing import NDArray

    a: NDArray[np.int_] = np.array([[1, 2, 3, 4], [3, 4, 6, 7], [5, 9, 0, 5]])
    a.transpose()

Basic operations

    import numpy as np
    from numpy.typing import NDArray

    a: NDArray[np.int_] = np.array([1, 2, 3, 6])
    # Create an array with four equally spaced points starting with 0 and ending with 2.
    b: NDArray[np.float64] = np.linspace(0, 2, 4)
    c: NDArray[np.float64] = a - b
    print(c)       # prints [1.         1.33333333 1.66666667 4.        ]
    print(a ** 2)  # prints [ 1  4  9 36]

Universal functions

    import numpy as np
    from numpy.typing import NDArray

    a: NDArray[np.float64] = np.linspace(-np.pi, np.pi, 100)
    b: NDArray[np.float64] = np.sin(a)
    c: NDArray[np.float64] = np.cos(a)
    # Functions can take both numbers and arrays as parameters.
    print(np.sin(1))                    # prints 0.8414709848078965
    print(np.sin(np.array([1, 2, 3])))  # prints [0.84147098 0.90929743 0.14112001]

Linear algebra

    import numpy as np
    from numpy.linalg import solve, inv
    from numpy.random import rand
    from numpy.typing import NDArray

    a: NDArray[np.float64] = np.array([[1, 2, 3], [3, 4, 6.7], [5, 9.0, 5]])
    print(a.transpose())
    # prints:
    # [[1.  3.  5. ]
    #  [2.  4.  9. ]
    #  [3.  6.7 5. ]]
    print(inv(a))
    # prints:
    # [[-2.27683616  0.96045198  0.07909605]
    #  [ 1.04519774 -0.56497175  0.1299435 ]
    #  [ 0.39548023  0.05649718 -0.11299435]]
    b: NDArray[np.int_] = np.array([3, 2, 1])
    print(solve(a, b))  # solve the equation ax = b
    # prints [-4.83050847  2.13559322  1.18644068]
    c: NDArray[np.float64] = rand(3, 3) * 20  # 3x3 random matrix of values within [0, 1), scaled by 20
    print(c)
    # example output (the values differ on every run):
    # [[ 3.98732789  2.47702609  4.71167924]
    #  [ 9.24410671  5.5240412  10.6468792 ]
    #  [10.38136661  8.44968437 15.17639591]]
    print(np.dot(a, c))  # matrix multiplication
    # output for the matrix c shown above:
    # [[ 53.61964114  38.8741616   71.53462537]
    #  [118.4935668   86.14012835 158.40440712]
    #  [155.04043289 104.3499231  195.26228855]]
    print(a @ c)  # same product with the @ operator (Python 3.5+ and NumPy 1.10+)
    # output for the matrix c shown above:
    # [[ 53.61964114  38.8741616   71.53462537]
    #  [118.4935668   86.14012835 158.40440712]
    #  [155.04043289 104.3499231  195.26228855]]

Multidimensional arrays

    import numpy as np
    from numpy.typing import NDArray

    M: NDArray[np.float64] = np.zeros(shape=(2, 3, 5, 7, 11))
    T: NDArray[np.float64] = np.transpose(M, (4, 2, 1, 3, 0))
    print(T.shape)  # prints (11, 5, 3, 7, 2)

Incorporation with OpenCV

    import cv2
    import numpy as np
    from numpy.typing import NDArray

    # 256x256 pixel array with a horizontal gradient from 0 to 255 for the red color channel
    r: NDArray[np.uint8] = np.reshape(np.arange(256 * 256) % 256, (256, 256)).astype(np.uint8)
    # array of the same size and type as r, filled with 0s, for the green color channel
    g: NDArray[np.uint8] = np.zeros_like(r)
    # transposing r gives a vertical gradient for the blue color channel
    b: NDArray[np.uint8] = r.T
    # OpenCV interprets images as BGR; the depth-stacked array is written to an
    # 8-bit RGB PNG file called "gradients.png"
    print(cv2.imwrite("gradients.png", np.dstack([b, g, r])))  # prints True

Nearest-neighbor search
Functional Python and vectorized NumPy version.
Functional Python:

    from typing import Callable

    points: list[list[int]] = [[9, 2, 8], [4, 7, 2], [3, 4, 4], [5, 6, 9], [5, 0, 7],
                               [8, 2, 7], [0, 3, 2], [7, 3, 0], [6, 1, 1], [2, 9, 6]]
    qPoint: list[int] = [4, 5, 3]
    # Lambda function for calculating the Euclidean distance of two vectors
    edistance: Callable[[list[float], list[float]], float] = (
        lambda a, b: sum((a1 - b1) ** 2 for a1, b1 in zip(a, b)) ** 0.5
    )
    # Compute all Euclidean distances at once and return the nearest point
    nearest: list[int] = min((edistance(i, qPoint), i) for i in points)[1]
    print(f"Nearest point to q: {nearest}")  # prints Nearest point to q: [3, 4, 4]

Equivalent NumPy vectorization:

    import numpy as np
    from numpy.typing import NDArray

    points: NDArray[np.int_] = np.array([[9, 2, 8], [4, 7, 2], [3, 4, 4], [5, 6, 9], [5, 0, 7],
                                         [8, 2, 7], [0, 3, 2], [7, 3, 0], [6, 1, 1], [2, 9, 6]])
    qPoint: NDArray[np.int_] = np.array([4, 5, 3])
    # Compute all Euclidean distances at once and return the index of the smallest one
    minIdx: int = int(np.argmin(np.linalg.norm(points - qPoint, axis=1)))
    print(f"Nearest point to q: {points[minIdx]}")  # prints Nearest point to q: [3 4 4]

F2PY
Quickly wrap native code for faster scripts.

    subroutine ftest(a, b, n, c, d)
      implicit none
      integer, intent(in)  :: a, b, n
      integer, intent(out) :: c, d
      integer :: i
      c = 0
      do i = 1, n
        c = a + b + c
      end do
      d = (c * n) * (-1)
    end subroutine ftest

    import foo  # extension module assumed to have been built from the subroutine above with F2PY

    a: tuple[int, int] = foo.ftest(1, 2, 3)  # or unpack directly: c, d = foo.ftest(1, 2, 3)
    print(a)  # prints (9, -27)
    help("foo.ftest")  # prints foo.ftest.__doc__