NumPy var()

The numpy.var() method computes the variance along the specified axis.

Example

import numpy as np

# create an array
array1 = np.array([0, 1, 2, 3, 4, 5, 6, 7])

# calculate the variance of the array
variance = np.var(array1)

print(variance)  # Output: 5.25

var() Syntax

The syntax of the numpy.var() method is:

numpy.var(array, axis = None, dtype = None, out = None, ddof = 0, keepdims = <no value>, where = <no value>)

var() Arguments

The numpy.var() method takes the following arguments:

  • array - array containing numbers whose variance is desired (can be array_like)
  • axis (optional) - axis or axes along which the variances are computed (int or tuple of int)
  • dtype (optional) - the data type to use in the calculation of variance (datatype)
  • out (optional) - output array in which to place the result (ndarray)
  • ddof (optional) - delta degrees of freedom (int)
  • keepdims (optional) - specifies whether to keep the reduced axes in the result as dimensions of size one (bool)
  • where (optional) - elements to include in the variance (array of bool)

Notes: The default values of numpy.var() imply the following:

  • axis = None - the variance of the entire array is taken.
  • dtype = None - for integer input, float64 is used; otherwise the variance has the same data type as the elements.
  • keepdims and where are not passed by default, so the reduced axes are dropped and every element is included.
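
The dtype default can be checked directly. The following is a minimal sketch; the names int_array and float_array are hypothetical and exist only for this illustration.

```python
import numpy as np

# hypothetical sample arrays, used only for illustration
int_array = np.array([1, 2, 3, 4], dtype=np.int64)
float_array = np.array([1, 2, 3, 4], dtype=np.float32)

# integer input: the variance is computed and returned as float64
int_result = np.var(int_array)

# float32 input: the variance keeps the data type of the input
float_result = np.var(float_array)

print(int_result.dtype)
print(float_result.dtype)
```

Both calls produce the same numeric value here; only the data type of the result differs.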

var() Return Value

The numpy.var() method returns the variance of the array.
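
The form of the return value depends on the axis argument. As a small sketch (the array name data is an assumption made for this illustration), reducing the whole array gives a single NumPy scalar, while reducing along one axis gives an array:

```python
import numpy as np

# hypothetical 2-D sample array
data = np.array([[1, 2, 3],
                 [4, 5, 6]])

# no axis: the whole array is reduced to a single NumPy scalar
total = np.var(data)

# axis = 1: one variance per row, returned as a 1-D array
per_row = np.var(data, axis=1)

print(type(total), total)
print(type(per_row), per_row)
```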


Example 1: Find the variance of an ndarray

import numpy as np

# create an array
array1 = np.array([[[0, 1], 
                    [2, 3]],                     
                   [[4, 5], 
                    [6, 7]]])

# find the variance of the entire array
variance1 = np.var(array1)

# find the variance across axis 0
variance2 = np.var(array1, 0)

# find the variance across axis 0 and 1
variance3 = np.var(array1, (0, 1))

print('\nvariance of the entire array:', variance1)
print('\nvariance across axis 0:\n', variance2)
print('\nvariance across axis 0 and 1:', variance3)

Output

variance of the entire array: 5.25

variance across axis 0:
 [[4. 4.]
 [4. 4.]]

variance across axis 0 and 1: [5. 5.]
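
One way to see which axes are reduced is to look at the shape of the result. The sketch below uses a hypothetical array named cube whose three axes have different lengths, so each removed axis is easy to spot.

```python
import numpy as np

# hypothetical array with three axes of different lengths
cube = np.arange(24).reshape(2, 3, 4)

# each call removes the axes it reduces over
shape_axis0 = np.var(cube, axis=0).shape        # axis 0 is removed
shape_axis01 = np.var(cube, axis=(0, 1)).shape  # axes 0 and 1 are removed
shape_last = np.var(cube, axis=-1).shape        # the last axis is removed

print(shape_axis0)
print(shape_axis01)
print(shape_last)
```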

Example 2: Specifying the Data Type of Variance of an ndarray

The dtype parameter controls the data type used in the calculation, and therefore the data type of the result.

import numpy as np

array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

# by default int is converted to float
result1 = np.var(array1)

# get integer variance
result2 = np.var(array1, dtype = int)

print('Float variance:', result1)
print('Integer variance:', result2)

Output

Float variance: 2.9166666666666665
Integer variance: 3

Note: Using a lower precision dtype, such as int, can lead to a loss of accuracy.
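
A related case is float32 input, where the variance is accumulated in float32 unless a wider dtype is requested. The sketch below uses made-up data (half the values are 1.0, half are 0.1), for which the exact variance is 0.45 ** 2 = 0.2025. The float32 result may drift slightly from that value, so no exact output is shown.

```python
import numpy as np

# hypothetical data: half the values are 1.0, half are 0.1, stored as float32
data = np.zeros((2, 512 * 512), dtype=np.float32)
data[0, :] = 1.0
data[1, :] = 0.1

# variance accumulated in float32 (the default for float32 input)
var32 = np.var(data)

# variance accumulated in float64 for a more accurate result
var64 = np.var(data, dtype=np.float64)

print('float32 accumulator:', var32)
print('float64 accumulator:', var64)
```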


Example 3: Using Optional keepdims Argument

If keepdims is set to True, the resultant variance array has the same number of dimensions as the original array, with each reduced axis kept at size one.

import numpy as np

array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

# keepdims defaults to False
result1 = np.var(array1, axis = 0)

# pass keepdims as True
result2 = np.var(array1, axis = 0, keepdims = True)

print('Dimensions in original array:', array1.ndim)
print('Without keepdims:', result1, 'with dimensions', result1.ndim)
print('With keepdims:', result2, 'with dimensions', result2.ndim)

Output

Dimensions in original array: 2
Without keepdims: [2.25 2.25 2.25] with dimensions 1
With keepdims: [[2.25 2.25 2.25]] with dimensions 2
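
A common reason to keep the reduced axis is broadcasting: a result of shape (2, 1) can be combined with the original (2, 3) array without reshaping. The following sketch scales each row to unit variance; the names row_mean, row_var and standardized are assumptions made for this illustration.

```python
import numpy as np

array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

# per-row mean and variance, kept as arrays of shape (2, 1)
row_mean = np.mean(array1, axis=1, keepdims=True)
row_var = np.var(array1, axis=1, keepdims=True)

# shapes (2, 3) and (2, 1) broadcast without any manual reshaping
standardized = (array1 - row_mean) / np.sqrt(row_var)

print(row_var.shape)
print(standardized)
```

Without keepdims = True, the per-row results would have shape (2,), which does not broadcast against a (2, 3) array.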

Example 4: Using Optional where Argument

The optional argument where specifies which elements to include in the variance.

import numpy as np

array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

# take variance of the entire array
result1 = np.var(array1)

# variance of only even elements
result2 = np.var(array1, where = (array1 % 2 == 0))

# variance of numbers greater than 3
result3 = np.var(array1, where = (array1 > 3))

print('variance of entire array:', result1)
print('variance of only even elements:', result2)
print('variance of numbers greater than 3:', result3)

Output

variance of entire array: 2.9166666666666665
variance of only even elements: 2.6666666666666665
variance of numbers greater than 3: 0.6666666666666666
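
When no axis is given, passing a mask through where should give the same result as selecting the elements with boolean indexing first. The sketch below compares the two approaches; the names mask, with_where and with_indexing are made up for this illustration, and it assumes a NumPy version that supports the where argument (1.20 or later).

```python
import numpy as np

array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

# boolean mask selecting the even elements
mask = array1 % 2 == 0

# pass the mask through the where argument
with_where = np.var(array1, where=mask)

# select the same elements with boolean indexing, then take the variance
with_indexing = np.var(array1[mask])

print(with_where, with_indexing)
```

The where form is mainly useful together with axis, since boolean indexing flattens the selection and loses the array's shape.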

Example 5: Using Optional out Argument

The out parameter allows us to specify an output array where the result will be stored.

import numpy as np

array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

# create an output array
output = np.zeros(3)

# compute variance and store the result in the output array
np.var(array1, out = output, axis = 0)

print('variance:', output)

Output

variance: [2.25 2.25 2.25]
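
The output array needs to have the shape of the expected result, which is (3,) for axis = 0 on a (2, 3) array. According to the NumPy documentation, a reference to the output array is returned when out is given; the sketch below prints that check rather than assuming it.

```python
import numpy as np

array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

# the output array must match the shape of the result: (3,) for axis = 0
output = np.zeros(3)

# the return value is captured so it can be compared with the output array
result = np.var(array1, axis=0, out=output)

print(output)
print(result is output)
```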

Frequently Asked Questions

What is variance?

Variance is the average of the squared deviations from the mean. It measures the spread of values around the mean in the given array.

Mathematically,

var = sum((array1 - array1.mean()) ** 2) / N

In NumPy,

import numpy as np

array1 = np.array([2, 4, 6, 8, 10])

# calculate variance using np.var()
variance1 = np.var(array1)

# calculate variance without using np.var()
mean = np.mean(array1)
diff_squared = (array1 - mean) ** 2
variance2 = np.mean(diff_squared)

print('variance with np.var():', variance1)
print('variance without np.var():', variance2)

Output

variance with np.var(): 8.0
variance without np.var(): 8.0

What is the ddof parameter in numpy.var() used for?

The ddof (Delta Degrees of Freedom) parameter in numpy.var() allows adjusting the divisor used in the calculation of variance. The default value is 0, which corresponds to dividing by N, the number of elements.

With ddof taken into account, the formula for var is:

var = sum((array1 - array1.mean()) ** 2) / (N - ddof)

Let's look at an example.

import numpy as np

array1 = np.array([1, 2, 3, 4, 5])

# calculate variance with the default ddof = 0
variance0 = np.var(array1)

# calculate variance with ddof = 1
variance1 = np.var(array1, ddof = 1)

print('variance (default ddof = 0):', variance0)
print('variance (ddof = 1):', variance1)

Output

variance (default ddof = 0): 2.0
variance (ddof = 1): 2.5
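
The two ddof settings correspond to the population variance (divide by N) and the sample variance (divide by N - 1). As a cross-check, the sketch below compares np.var() with pvariance() and variance() from Python's standard statistics module; the names values, population and sample are made up for this illustration.

```python
import statistics

import numpy as np

values = [1, 2, 3, 4, 5]

# ddof = 0 divides by N, matching the population variance
population = np.var(values)

# ddof = 1 divides by N - 1, matching the sample variance
sample = np.var(values, ddof=1)

print(population, statistics.pvariance(values))
print(sample, statistics.variance(values))
```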