Calculating Covariance with Python and Numpy

When a and b are 1-dimensional sequences, numpy.cov(a,b)[0][1] is equivalent to your cov(a,b).

The 2×2 array returned by np.cov(a,b) has elements equal to

cov(a,a)  cov(a,b)
cov(a,b)  cov(b,b)

(where, again, cov is the function you defined above.)
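As a quick check of that layout, here is a minimal sketch that compares np.cov against a hand-written covariance. The cov helper below is a hypothetical stand-in, since the function from the question is not reproduced here; it assumes the usual sample normalisation by N - 1. The input lists are made up for illustration.

```python
import numpy as np

def cov(a, b):
    # Hypothetical stand-in for the question's function:
    # sample covariance, assuming division by N - 1.
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return ((a - a.mean()) * (b - b.mean())).sum() / (len(a) - 1)

# Illustrative data only
a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [1, 2, 3, 1, 2, 3, 1, 2, 3]

m = np.cov(a, b)  # 2x2 covariance matrix
expected = np.array([[cov(a, a), cov(a, b)],
                     [cov(a, b), cov(b, b)]])

print(np.allclose(m, expected))  # True
print(m[0][1])                   # 0.75
```

The diagonal entries are the variances of a and b, and the two off-diagonal entries are equal because covariance is symmetric.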

Thanks to unutbu for the explanation. By default, numpy.cov calculates the sample covariance, normalising by N - 1. To obtain the population covariance, normalise by the total number of samples N instead, like this:

numpy.cov(a, b, bias=True)[0][1]

or like this:

numpy.cov(a, b, ddof=0)[0][1]
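To see how the two normalisations relate, the sketch below (with made-up data) checks that the population value equals the sample value scaled by (N - 1) / N, and that bias=True and ddof=0 agree:

```python
import numpy as np

# Illustrative data only
a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [1, 2, 3, 1, 2, 3, 1, 2, 3]
n = len(a)

sample = np.cov(a, b)[0][1]                # default: divides by N - 1
pop_bias = np.cov(a, b, bias=True)[0][1]   # divides by N
pop_ddof = np.cov(a, b, ddof=0)[0][1]      # divides by N

print(np.isclose(pop_bias, pop_ddof))                # True
print(np.isclose(pop_ddof, sample * (n - 1) / n))    # True
```

When both arguments are given, ddof takes precedence over bias, so ddof is arguably the more explicit choice.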

Note that starting in Python 3.10, the covariance is available directly from the standard library.

statistics.covariance returns the sample covariance (the number you're looking for), which is a measure of the joint variability of two inputs:

from statistics import covariance

x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [1, 2, 3, 1, 2, 3, 1, 2, 3]
covariance(x, y)
# 0.75
