NumPy Histogram

A histogram is a graphical representation of the distribution of numerical data. Using NumPy's histogram() function and Matplotlib's plt.hist() function, we can compute and plot histograms.

We'll take a closer look at histograms and how they can be computed with NumPy and plotted with Matplotlib.


NumPy Histogram

NumPy has a built-in function histogram() that takes an array of data as a parameter.

In a histogram, a bin is a range of values that groups data points together. The bins parameter is optional; if it is omitted, NumPy uses 10 equal-width bins spanning the range of the data.

Let's see an example.

import numpy as np

# create an array of data
data = np.array([5, 10, 15, 18, 20])

# define the bin edges
bins = [0, 10, 20, 30]

# create histogram
graph = np.histogram(data, bins)

print(graph)

Output

(array([1, 3, 1]), array([ 0, 10, 20, 30]))

In this example, we have used the histogram() function to calculate the frequency distribution of data. We have passed two arguments: data and bins.

The histogram() function returns a tuple containing two arrays:

  • the first array contains the frequency counts of the data within each bin, and
  • the second array contains the bin edges.
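
Because the return value is an ordinary tuple, the two arrays can be unpacked into separate names. The following is a minimal sketch using the same data as above; the names counts and edges are illustrative choices, not something NumPy requires.

```python
import numpy as np

data = np.array([5, 10, 15, 18, 20])

# unpack the returned tuple into its two arrays
counts, edges = np.histogram(data, [0, 10, 20, 30])

print(counts)  # [1 3 1]
print(edges)   # [ 0 10 20 30]

# there is always one more edge than there are bins
print(len(edges) == len(counts) + 1)  # True

# pair each count with the interval it belongs to
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left} to {right}: {count}")
```

Zipping edges[:-1] with edges[1:] is one simple way to line up each bin's left and right edge with its count.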

From the output of the example above, we can see that:

  • Only 1 data point (i.e., 5) from the array data lies in the first bin, from 0 up to (but not including) 10,
  • 3 data points (i.e., 10, 15, 18) lie in the second bin, from 10 up to (but not including) 20, and
  • 1 data point (i.e., 20) lies in the last bin, from 20 to 30, which includes both of its edges.
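
Values that fall exactly on a bin edge are worth checking by hand. The sketch below uses made-up values (on_edges and outside are hypothetical arrays, not part of the example above) to show where edge values are counted and what happens to values outside all bins.

```python
import numpy as np

bins = [0, 10, 20, 30]

# hypothetical values that sit exactly on the bin edges
on_edges = np.array([0, 10, 20, 30])
counts, _ = np.histogram(on_edges, bins)

# 0 -> first bin, 10 -> second bin, 20 and 30 -> last bin
print(counts)  # [1 1 2]

# hypothetical values below the first edge and above the last edge
outside = np.array([-5, 35])
counts_outside, _ = np.histogram(outside, bins)

# values outside the outermost edges are not counted at all
print(counts_outside)  # [0 0 0]
```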

Example: NumPy Histogram

import numpy as np

# create an array of data
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3])

# define the bin edges
bins = [0, 5, 10]

# create histogram
graph = np.histogram(data, bins)

print(graph)

Output

(array([7, 4]), array([ 0,  5, 10]))

Here, the histogram() function returns a tuple of two arrays. Analyzing the output, 7 data points (1, 2, 3, 4, 1, 2, 3) from the array data lie in the bin from 0 up to (but not including) 5, and 4 data points (5, 6, 7, 8) lie in the bin from 5 to 10.
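
The bin edges do not have to be listed explicitly. As a hedged sketch of the alternatives, histogram() also accepts an integer number of equal-width bins together with an optional range, and it falls back to 10 bins when no bins argument is given. The variable names below are illustrative.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3])

# no bins argument: NumPy uses 10 equal-width bins
# between the minimum and maximum of the data
counts_default, edges_default = np.histogram(data)
print(len(counts_default))  # 10

# an integer asks for that many equal-width bins;
# range sets the lower and upper limit of the bins
counts, edges = np.histogram(data, bins=2, range=(0, 10))
print(counts)  # [7 4]
```

With bins=2 and range=(0, 10), the edges fall at 0, 5, and 10, so the counts match the example above.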


Plot the Histogram

We can use the plt.hist() function to plot a histogram of the data.

hist() is a function provided by Matplotlib's pyplot module, which is conventionally imported under the name plt. To use plt.hist(), we need to import Matplotlib.

Let's see an example.

import numpy as np
from matplotlib import pyplot as plt

# create an array of data
data = np.array([5, 10, 15, 18, 20])

# define the bin edges
bins = [0, 10, 20, 30]

# create histogram
graph = np.histogram(data, bins)
print(graph)

# plot histogram 
plt.hist(data, bins)

plt.show()

Output

(array([1, 3, 1]), array([ 0, 10, 20, 30]))

[Figure: Plotting a Histogram]

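In the above example, we used the histogram() function to calculate the frequency distribution of data and then plotted a histogram of the same data and bins using the plt.hist() function from the Matplotlib library. Note that plt.hist() computes the bin counts itself from data and bins; it does not use the tuple returned by np.histogram().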
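
If the counts have already been computed with np.histogram(), they can be drawn without recomputing them. The following is one possible sketch, assuming a bar chart is an acceptable stand-in for plt.hist(); it uses Matplotlib's bar() method, and the names counts, edges, and bars are illustrative.

```python
import numpy as np
from matplotlib import pyplot as plt

data = np.array([5, 10, 15, 18, 20])
bins = [0, 10, 20, 30]

# compute the histogram once with NumPy
counts, edges = np.histogram(data, bins)

# draw one bar per bin: each bar starts at the bin's left edge
# and is as wide as the distance to the next edge
fig, ax = plt.subplots()
bars = ax.bar(edges[:-1], counts, width=np.diff(edges),
              align="edge", edgecolor="black")

ax.set_xlabel("value")
ax.set_ylabel("count")
```

Calling plt.show() afterwards displays the figure, which should look the same as the one drawn by plt.hist(data, bins).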