NumPy vectorization involves performing mathematical operations on entire arrays, eliminating the need to loop through individual elements.
We will see an overview of NumPy vectorization and demonstrate its advantages through examples.
We've used the concept of vectorization many times in NumPy. It refers to performing element-wise operations on arrays.
Let's take a simple example. When we add a number to a NumPy array, the number is added to each element of the array.
import numpy as np

array1 = np.array([1, 2, 3, 4, 5])
number = 10

# number sums up with each array element
result = array1 + number

print(result)
[11 12 13 14 15]
Here, the number 10 is added to each array element. This is possible because of vectorization.
Without vectorization, performing the operation would require the use of loops.
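For comparison, a loop-based version of the same addition might look like the sketch below. The names looped and vectorized are illustrative and do not appear in the original example.

```python
import numpy as np

array1 = np.array([1, 2, 3, 4, 5])
number = 10

# loop-based version: visit each element and add the number to it
looped = []
for value in array1:
    looped.append(value + number)
looped = np.array(looped)

# vectorized version: a single expression covers every element
vectorized = array1 + number

print(looped)
print(vectorized)
```

Both print statements show [11 12 13 14 15]; the vectorized form simply needs no explicit loop.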
Example: NumPy Vectorization to Add Two Arrays Together
import numpy as np

# define two 2D arrays
array1 = np.array([[1, 2, 3], [4, 5, 6]])
array2 = np.array([[0, 1, 2], [0, 1, 2]])

# add two arrays (vectorization)
array_sum = array1 + array2

print("Sum between two arrays:\n", array_sum)
Sum between two arrays:
 [[1 3 5]
 [4 6 8]]
In this example, we have created two 2D arrays array1 and array2, and added them together.
This is a vectorized operation, where corresponding elements of two arrays are added together element-wise.
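To make "element-wise" concrete, the sketch below rebuilds the same sum with explicit nested loops and checks it against the vectorized result. The variable loop_sum is a hypothetical name introduced only for this illustration.

```python
import numpy as np

array1 = np.array([[1, 2, 3], [4, 5, 6]])
array2 = np.array([[0, 1, 2], [0, 1, 2]])

# build the sum one element at a time with nested loops
rows, cols = array1.shape
loop_sum = np.zeros_like(array1)
for i in range(rows):
    for j in range(cols):
        loop_sum[i, j] = array1[i, j] + array2[i, j]

# the vectorized expression produces the same result
print(np.array_equal(loop_sum, array1 + array2))
```

This prints True: each position in the result is the sum of the elements at the same position in the two input arrays.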
NumPy Vectorization Vs Python for Loop
Even though NumPy is a Python library, its vectorized operations are implemented in C. The loop over the array elements runs in compiled code rather than in the Python interpreter, so for large arrays NumPy vectorization is usually much faster than an equivalent Python loop.
Let's compare the time it takes to perform a vectorized operation with that of an equivalent loop-based operation.
Python for loop
import time

start = time.time()

array1 = [1, 2, 3, 4, 5]

for i in range(len(array1)):
    array1[i] += 10

end = time.time()

print("For loop time:", end - start)
For loop time: 4.76837158203125e-06
NumPy vectorization
import numpy as np
import time

start = time.time()

array1 = np.array([1, 2, 3, 4, 5])

result = array1 + 10

end = time.time()

print("Vectorization time:", end - start)
Vectorization time: 1.5020370483398438e-05
Here, the for loop is actually faster. With only five elements, the fixed cost of creating the NumPy array and dispatching the operation outweighs any per-element savings.
The performance benefits of vectorization show up when working with large datasets, where the per-element work done in compiled code dominates the total time.
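A fairer comparison uses a larger input and times only the addition itself. The sketch below is one way to do that with the standard timeit module. The array size, the repeat count, and the helper names add_with_loop and add_vectorized are assumptions chosen for illustration, and the loop is written as a list comprehension rather than the in-place loop used above.

```python
import timeit
import numpy as np

size = 100_000  # assumed size, chosen only for illustration
values_list = list(range(size))
values_array = np.arange(size)

def add_with_loop():
    # element-by-element addition in pure Python
    return [value + 10 for value in values_list]

def add_vectorized():
    # one vectorized expression over the whole array
    return values_array + 10

# time only the addition; the data is created beforehand
loop_time = timeit.timeit(add_with_loop, number=10)
vector_time = timeit.timeit(add_vectorized, number=10)

print("For loop time:", loop_time)
print("Vectorization time:", vector_time)
```

The exact timings depend on the machine and the NumPy version, so no output is shown here. On typical hardware the vectorized version is expected to be considerably faster at this size.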
NumPy Vectorize() Function
In NumPy, every mathematical operation with arrays is automatically vectorized. So we don't always need to use the vectorize() function.
Let's take a scenario. You have an array and a function that returns the square of a positive number.
import numpy as np

# array
array1 = np.array([-1, 0, 2, 3, 4])

# function that returns the square of a positive number
def find_square(x):
    if x < 0:
        return 0
    else:
        return x ** 2
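Now, to apply the function find_square() to array1, we have two options: use a loop or vectorize the operation.
Since explicit loops are verbose, it's convenient to use the vectorize() function. Note that vectorize() is provided mainly for convenience rather than performance; internally it still calls the function once per element.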
Let's see an example.
import numpy as np

# array whose square we need to find
array1 = np.array([-1, 0, 2, 3, 4])

# function to find the square
def find_square(x):
    if x < 0:
        return 0
    else:
        return x ** 2

# vectorize() to vectorize the function find_square()
vectorized_function = np.vectorize(find_square)

# passing an array to a vectorized function
result = vectorized_function(array1)

print(result)
[ 0  0  4  9 16]
In this example, we used the vectorize() function to vectorize the find_square() function. We then passed array1 as a parameter to the vectorized function.
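Because the condition here is simple, the same result can also be computed with NumPy's built-in operations instead of vectorize(). The sketch below uses np.where(), which is not part of the original example, as one possible alternative.

```python
import numpy as np

array1 = np.array([-1, 0, 2, 3, 4])

# np.where(condition, a, b) takes values from a where the condition is True,
# and from b everywhere else
result = np.where(array1 < 0, 0, array1 ** 2)

print(result)
```

This prints [ 0  0  4  9 16], matching the output of the vectorized find_square(). Note that array1 ** 2 is evaluated for every element before np.where() selects between the two branches, which is harmless here but matters when one branch is expensive or invalid for some inputs.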