NumPy cov()

The numpy.cov() method estimates the covariance matrix, given data and weights.

Example

import numpy as np

# create an array
array1 = np.array([[0, 3, 7], 
                   [1, 4, 6], 
                   [2, 5, 8]])

# calculate the covariance of the array
covariance = np.cov(array1)

print(covariance)

'''
Output:
[[12.33333333  8.66666667 10.5       ]
 [ 8.66666667  6.33333333  7.5       ]
 [10.5         7.5         9.        ]]
'''

cov() Syntax

The syntax of the numpy.cov() method is:

numpy.cov(m, y = None, rowvar = True, bias = False, ddof = None, fweights = None, aweights = None, *, dtype = None)

cov() Arguments

The numpy.cov() method takes the following arguments:

  • m - a 1-D or 2-D array of variables and observations whose covariance is desired (array_like)
  • y (optional) - an additional set of variables and observations, in the same form as m (array_like)
  • rowvar (optional) - if True (default), each row represents a variable and each column an observation; otherwise, each column represents a variable (bool)
  • bias (optional) - if False (default), the covariance is normalized by N - 1, where N is the number of observations; if True, it is normalized by N (bool)
  • ddof (optional) - delta degrees of freedom; if not None, it overrides bias and the divisor becomes N - ddof (int)
  • fweights (optional) - integer frequency weights; the number of times each observation vector is repeated (1-D array of int)
  • aweights (optional) - observation vector weights; larger for observations considered more important (1-D array of non-negative numbers)
  • dtype (optional) - data type of the result; must be passed as a keyword argument
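The y argument is not used in the examples that follow, so here is a minimal sketch of how it might be used. The data in x and y is made up for illustration; the sketch assumes that passing y behaves like stacking it under the first argument as an extra variable.

```python
import numpy as np

# two variables, each with four observations (illustrative data)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 9.0])

# pass the second variable through the y argument
cov_xy = np.cov(x, y)

# stacking both variables as rows of one array should give the same matrix
cov_stacked = np.cov(np.vstack((x, y)))

print(cov_xy)
print(np.allclose(cov_xy, cov_stacked))
```

The diagonal of the result holds the variance of each variable, and the off-diagonal entries hold the covariance between x and y.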

cov() Return Value

The numpy.cov() method returns the covariance matrix of the variables as an ndarray. For k > 1 variables, the result has shape (k, k).


Covariance

Covariance is a statistical measure that describes the relationship between two random variables. It measures how changes in one variable are associated with changes in another variable.

Positive covariance means the variables tend to increase or decrease together, while negative covariance means they move in opposite directions.

A covariance of zero implies no linear relationship.
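As a rough sketch of this definition, the sample covariance of two variables can be computed by hand as the sum of the products of their deviations from the mean, divided by N - 1. The arrays x and y below are illustrative, and the N - 1 divisor is chosen to match the default behavior of numpy.cov().

```python
import numpy as np

# two variables with three observations each (illustrative data)
x = np.array([0, 1, 2])
y = np.array([2, 1, 0])

# sum of products of deviations from the mean, divided by N - 1
n = x.size
manual_cov = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)

# the off-diagonal entry of the covariance matrix is cov(x, y)
library_cov = np.cov(x, y)[0, 1]

print(manual_cov, library_cov)
```

Both values are negative here because y decreases as x increases.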


Example 1: Find the Covariance of an ndarray

import numpy as np

# create arrays
array1 = np.array([[0, 1, 2], 
                    [0, 1, 2]])
                    
array2 = np.array([[0, 1, 2], 
                    [2, 1, 0]])
                    
# calculate the covariance of the arrays
covariance1 = np.cov(array1)
covariance2 = np.cov(array2)

print(covariance1, '\n')
print(covariance2)

Output

[[1. 1.]
 [1. 1.]]

[[ 1. -1.]
 [-1.  1.]]

Here, the two rows of array1 move together perfectly (positive covariance), while the two rows of array2 move in exactly opposite directions (negative covariance).
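The off-diagonal values of 1 and -1 above happen to equal correlation coefficients only because every row has a variance of 1. In general, covariance depends on the scale of the data. The sketch below uses an illustrative array, data, and rescales its covariance matrix by the standard deviations to recover the correlation, which is then compared against np.corrcoef().

```python
import numpy as np

# the second row has a larger spread than the first (illustrative data)
data = np.array([[0, 1, 2],
                 [4, 2, 0]])

covariance = np.cov(data)

# divide each entry by the product of the two standard deviations
std = np.sqrt(np.diag(covariance))
correlation = covariance / np.outer(std, std)

print(covariance)
print(correlation)
print(np.allclose(correlation, np.corrcoef(data)))
```

The covariance between the rows is no longer -1, but the correlation still is, since the rows remain perfectly opposed.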


Example 2: Specifying the Data Type of the Covariance Matrix

The dtype parameter can be used to control the data type of the covariance matrix.

import numpy as np

# create an array
array1 = np.array([[0, 3, 7], 
                   [1, 4, 6], 
                   [2, 5, 8]])

# calculate the covariance of the array
covariance1 = np.cov(array1)

# calculate the covariance of the array as float16
covariance2 = np.cov(array1, dtype = np.float16)

print(covariance1, '\n')
print(covariance2)

Output

[[12.33333333  8.66666667 10.5       ]
 [ 8.66666667  6.33333333  7.5       ]
 [10.5         7.5         9.        ]] 

[[12.336  8.664 10.5  ]
 [ 8.664  6.332  7.5  ]
 [10.5    7.5    9.   ]]

Note: Using a lower precision dtype, such as float16, can lead to a loss of accuracy.
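To get a feel for the size of that loss, the following sketch measures the largest difference between the default result and the float16 result for the same array. It assumes a NumPy version that supports the dtype argument of cov(); the name max_error is introduced here only for illustration.

```python
import numpy as np

array1 = np.array([[0, 3, 7],
                   [1, 4, 6],
                   [2, 5, 8]])

# default precision versus half precision
full = np.cov(array1)
half = np.cov(array1, dtype=np.float16)

# largest absolute difference between the two results
max_error = np.max(np.abs(full - half.astype(np.float64)))

print(half.dtype)
print(max_error)
```

For small, well-scaled data like this the error is minor, but it can grow with larger values or more observations.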


Example 3: Using Optional rowvar Argument

If rowvar is set to True (default), each row represents a variable, with observations in the columns.

If rowvar is set to False, the relationship is transposed: each column represents a variable, while the rows contain observations.

import numpy as np

# create an array
array1 = np.array([[0, 3, 7], 
                   [1, 4, 6], 
                   [2, 5, 8]])

# calculate the covariance of the array
covariance1 = np.cov(array1)

# calculate the covariance with columns as variables
covariance2 = np.cov(array1, rowvar = False)

print('With rows as variables\n', covariance1, '\n')
print('With columns as variables\n', covariance2)

Output

With rows as variables
 [[12.33333333  8.66666667 10.5       ]
 [ 8.66666667  6.33333333  7.5       ]
 [10.5         7.5         9.        ]] 

With columns as variables
 [[1.  1.  0.5]
 [1.  1.  0.5]
 [0.5 0.5 1. ]]
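Since rowvar = False treats columns as variables, it should give the same result as transposing the array and using the default. A minimal sketch to check this on the same array:

```python
import numpy as np

array1 = np.array([[0, 3, 7],
                   [1, 4, 6],
                   [2, 5, 8]])

# columns as variables
by_columns = np.cov(array1, rowvar=False)

# transpose first, then use the default (rows as variables)
by_transpose = np.cov(array1.T)

print(np.allclose(by_columns, by_transpose))
```

This can be handy when the data is stored one observation per row, as is common in tabular datasets.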

Example 4: Create a Normalized Covariance Matrix

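The optional argument bias controls the divisor used to normalize the covariance matrix: N - 1 by default, or N (the number of observations) when bias is True. The argument ddof sets the delta degrees of freedom, making the divisor N - ddof; when ddof is given, it overrides bias.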

import numpy as np

# create an array
array1 = np.array([[0, 3, 7], 
                   [1, 4, 6], 
                   [2, 5, 8]])

# calculate the covariance of the array (divides by N - 1)
covariance1 = np.cov(array1)

# normalize the covariance matrix (divides by N)
covariance2 = np.cov(array1, bias = True)

# use ddof = 2, which overrides bias (divides by N - 2)
covariance3 = np.cov(array1, bias = True, ddof = 2)

print('Unnormalized Covariance Matrix\n', covariance1, '\n')
print('Normalized Covariance Matrix\n', covariance2, '\n')
print('Normalized Covariance Matrix With ddof = 2\n', covariance3, '\n')

Output

Unnormalized Covariance Matrix
 [[12.33333333  8.66666667 10.5       ]
 [ 8.66666667  6.33333333  7.5       ]
 [10.5         7.5         9.        ]] 

Normalized Covariance Matrix
 [[8.22222222 5.77777778 7.        ]
 [5.77777778 4.22222222 5.        ]
 [7.         5.         6.        ]] 

Normalized Covariance Matrix With ddof = 2
 [[24.66666667 17.33333333 21.        ]
 [17.33333333 12.66666667 15.        ]
 [21.         15.         18.        ]] 

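Note: The default value is ddof = None, in which case the divisor is determined by bias. ddof = 1 gives the same result as the default (bias = False), and ddof = 0 gives the same result as bias = True. Because ddof overrides bias, the last matrix above is divided by N - ddof = 3 - 2 = 1.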
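The effect of ddof on the divisor can be checked by hand. The sketch below centers each row, forms the sums of products of deviations, and divides by N - ddof; the names centered, scatter, and matches are introduced here only for illustration.

```python
import numpy as np

array1 = np.array([[0, 3, 7],
                   [1, 4, 6],
                   [2, 5, 8]])

# number of observations per variable
n = array1.shape[1]

# subtract each row's mean, then sum the products of deviations
centered = array1 - array1.mean(axis=1, keepdims=True)
scatter = centered @ centered.T

# compare the manual result with np.cov() for several ddof values
matches = {}
for ddof in (0, 1, 2):
    manual = scatter / (n - ddof)
    matches[ddof] = np.allclose(manual, np.cov(array1, ddof=ddof))

print(matches)

# bias = True should match ddof = 0
print(np.allclose(np.cov(array1, bias=True), np.cov(array1, ddof=0)))
```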


Example 5: Using Weights

The aweights and fweights parameters allow us to specify weights for the covariance estimate.

import numpy as np

# create an array
array1 = np.array([[0, 3, 7], 
                   [1, 4, 6], 
                   [2, 5, 8]])

# specify weights
a = np.array([3, 1, 2])
f = np.array([2, 1, 3])

# calculate the covariance of the array
covariance1 = np.cov(array1)

# calculate the covariance of the array with aweights provided
covariance2 = np.cov(array1, aweights = a)

# calculate the covariance of the array with fweights provided
covariance3 = np.cov(array1, fweights = f)

# calculate the covariance of the array with both aweights and fweights
covariance4 = np.cov(array1, aweights = a, fweights = f)

print('Unweighted Covariance Matrix\n', covariance1, '\n')
print('Covariance Matrix with Observation Vector weight\n', covariance2, '\n')
print('Covariance Matrix with frequency weight\n', covariance3, '\n')
print('Covariance Matrix with both weights\n', covariance4, '\n')

Output

Unweighted Covariance Matrix
 [[12.33333333  8.66666667 10.5       ]
 [ 8.66666667  6.33333333  7.5       ]
 [10.5         7.5         9.        ]] 

Covariance Matrix with Observation Vector weight
 [[16.04545455 11.5        13.77272727]
 [11.5         8.40909091  9.95454545]
 [13.77272727  9.95454545 11.86363636]] 

Covariance Matrix with frequency weight
 [[12.   8.4 10.2]
 [ 8.4  6.   7.2]
 [10.2  7.2  8.7]] 

Covariance Matrix with both weights
 [[13.86956522  9.86956522 11.86956522]
 [ 9.86956522  7.08695652  8.47826087]
 [11.86956522  8.47826087 10.17391304]] 

Here,

  • aweights represent the observation vector weights, i.e. they quantify the relative importance of each observation in the covariance estimate.
  • fweights represent the frequency weight i.e. it represents the number of times the observation was repeated.
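One way to see what frequency weights mean is to repeat each observation by hand and compute an unweighted covariance. The sketch below does this with np.repeat() on the same array and weights; the result should match the fweights version.

```python
import numpy as np

array1 = np.array([[0, 3, 7],
                   [1, 4, 6],
                   [2, 5, 8]])

f = np.array([2, 1, 3])

# covariance with frequency weights
weighted = np.cov(array1, fweights=f)

# repeat each observation (column) as many times as its weight
repeated = np.repeat(array1, f, axis=1)
unweighted = np.cov(repeated)

print(repeated)
print(np.allclose(weighted, unweighted))
```

Using fweights avoids building the larger repeated array, which matters when the counts are large.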