The value_counts() method in Pandas counts the number of occurrences of each unique value in a Series.
Example
import pandas as pd
# create a Series
data = pd.Series(['apple', 'banana', 'apple', 'orange', 'banana', 'apple'])
# use value_counts() to count the occurrences of each unique value
counts = data.value_counts()
print(counts)
'''
Output
apple 3
banana 2
orange 1
Name: count, dtype: int64
'''
value_counts() Syntax
The syntax of the value_counts() method in Pandas is:
Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)
value_counts() Arguments
The value_counts() method takes the following arguments:
normalize (optional) - if set to True, returns the relative frequencies (proportions) of unique values instead of their counts
sort (optional) - determines whether to sort the unique values by their counted frequencies
ascending (optional) - determines whether to sort the counts in ascending or descending order
bins (optional) - groups numeric data into equal-width bins if specified
dropna (optional) - excludes null values if True.
value_counts() Return Value
The value_counts() method in Pandas returns a Series containing counts of unique values.
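Because the result is itself a Series, the unique values become its index and the counts become its values. The sketch below uses a made-up fruits Series (an assumption for illustration only) to show a few ways one might read from that result with standard Series operations.

```python
import pandas as pd

# hypothetical sample data, used only for illustration
fruits = pd.Series(['apple', 'banana', 'apple', 'orange', 'banana', 'apple'])

counts = fruits.value_counts()

# the unique values are stored in the index of the result
unique_values = counts.index.tolist()

# look up the count of a single value by its label
apple_count = counts['apple']

# label of the entry with the highest count
most_common = counts.idxmax()

print(unique_values)
print(apple_count)
print(most_common)
```

Label-based lookup like this is often more convenient than scanning the printed output when only one or two counts are needed.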
Example 1: Count Occurrences of Each Unique Value
import pandas as pd
# create a Series
favorite_colors = pd.Series(['blue', 'red', 'green', 'blue', 'red', 'blue', 'yellow', 'green', 'green', 'red', 'purple'])
# use value_counts() to count the occurrences of each unique value
color_counts = favorite_colors.value_counts()
print(color_counts)
Output
blue 3
red 3
green 3
yellow 1
purple 1
Name: count, dtype: int64
In the above example, the favorite_colors Series contains strings representing different colors.
We used the value_counts() method to count the number of times each color appears in the favorite_colors Series.
Example 2: Count Values With Normalization
import pandas as pd
# create a Series
data = pd.Series(['apple', 'banana', 'apple', 'orange', 'banana', 'apple'])
# set normalize to True to count values with normalization
normalized_counts = data.value_counts(normalize=True)
print(normalized_counts)
Output
apple 0.500000
banana 0.333333
orange 0.166667
Name: proportion, dtype: float64
In this example, apple appears 3 times, banana appears 2 times, and orange appears 1 time.
Setting normalize=True shows the proportion of each fruit in the Series. For instance, apple makes up 50% of the entries, banana 33.33%, and orange 16.67%.
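The proportions returned by normalize=True are fractions between 0 and 1. If percentages like the ones quoted above are wanted, one possible approach is to scale and round the result with ordinary Series arithmetic. This is a small sketch, not the only way to do it.

```python
import pandas as pd

data = pd.Series(['apple', 'banana', 'apple', 'orange', 'banana', 'apple'])

proportions = data.value_counts(normalize=True)

# the proportions add up to 1 (allowing for floating-point rounding)
total = proportions.sum()

# scale to percentages and round to two decimal places
percentages = (proportions * 100).round(2)

print(total)
print(percentages)
```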
Example 3: Sort Unique Value Counts in Pandas
import pandas as pd
# create a Series
data = pd.Series(['apple', 'banana', 'apple', 'orange', 'banana', 'apple', 'banana', 'kiwi', 'kiwi', 'kiwi'])
# use value_counts() without sorting
counts_no_sort = data.value_counts(sort=False)
print("Counts without sorting:")
print(counts_no_sort)
print()
# use value_counts() with sorting
counts_sort = data.value_counts(sort=True)
print("Counts with sorting:")
print(counts_sort)
Output
Counts without sorting:
apple 3
banana 3
orange 1
kiwi 3
Name: count, dtype: int64

Counts with sorting:
apple 3
banana 3
kiwi 3
orange 1
Name: count, dtype: int64
Here,
sort=False - shows the counts of each unique value in the order they appear in the Series (without sorting).
sort=True - shows the counts sorted in descending order, which is the default behavior of value_counts().
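The sort argument only controls ordering by count. If the result should instead be ordered by the values themselves (alphabetically, for strings), one option is to chain the standard Series method sort_index() onto the result. The sketch below reuses the data from this example.

```python
import pandas as pd

data = pd.Series(['apple', 'banana', 'apple', 'orange', 'banana',
                  'apple', 'banana', 'kiwi', 'kiwi', 'kiwi'])

# count first, then order the result by its labels instead of by count
counts_by_label = data.value_counts().sort_index()

print(counts_by_label)
```

Here the labels come out as apple, banana, kiwi, orange regardless of how often each one occurs.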
Example 4: Specify the Order of Sorting
import pandas as pd
# create a Series
data = pd.Series(['apple', 'banana', 'apple', 'orange', 'banana', 'orange', 'orange'])
# count values with sorting in descending order (default behavior)
counts_descending = data.value_counts(ascending=False)
print("Counts (Descending Order):")
print(counts_descending)
print()
# count values with sorting in ascending order
counts_ascending = data.value_counts(ascending=True)
print("Counts (Ascending Order):")
print(counts_ascending)
Output
Counts (Descending Order):
orange 3
apple 2
banana 2
Name: count, dtype: int64

Counts (Ascending Order):
apple 2
banana 2
orange 3
Name: count, dtype: int64
In this example,
ascending=False - sorts the counts in descending order, listing the most frequent item first.
ascending=True - sorts the counts in ascending order, listing the least frequent item first.
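A common use of ascending=True is to pick out the rarest values. The sketch below combines it with the standard Series method head() to keep only the two least frequent items. The sample data is made up for illustration and chosen so that no two counts are tied.

```python
import pandas as pd

# hypothetical sample data with distinct counts (3, 2 and 1)
data = pd.Series(['apple', 'banana', 'apple', 'orange', 'banana', 'apple'])

# sort from least to most frequent, then keep the first two rows
least_frequent = data.value_counts(ascending=True).head(2)

print(least_frequent)
```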
Example 5: Use of bins Argument in value_counts()
The bins argument in the value_counts() method is used to bin continuous data into discrete intervals.
Let's look at an example.
import pandas as pd
# create a Series with continuous data
data = pd.Series([0.1, 0.3, 0.5, 1.2, 1.5, 2.3, 2.8, 3.0, 3.5, 4.1])
# use value_counts() with the bins argument
bin_counts = data.value_counts(bins=4)
print(bin_counts)
Output
(0.095, 1.1] 3
(2.1, 3.1] 3
(1.1, 2.1] 2
(3.1, 4.1] 2
Name: count, dtype: int64
In this example, the value_counts() method divides the range of values in data into 4 equal-width bins and counts the number of values that fall into each bin.
The output represents these counts, with each bin described by its range. For example, (0.095, 1.1] indicates a bin that includes values greater than 0.095 and up to and including 1.1. The lower edge of the first bin sits slightly below the minimum value (0.1) so that the smallest value is included.
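Since the result is sorted by count by default, the bins can appear out of numeric order, as in the output above. If the bins should read from lowest to highest, one possible approach is to sort the result by its index afterwards. This is a sketch using the same data.

```python
import pandas as pd

data = pd.Series([0.1, 0.3, 0.5, 1.2, 1.5, 2.3, 2.8, 3.0, 3.5, 4.1])

# count per bin, then order by the bin intervals rather than by count
bin_counts_ordered = data.value_counts(bins=4).sort_index()

print(bin_counts_ordered)
```

Ordering by interval makes the result easier to read as a rough histogram of the data.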
Example 6: Use of dropna Argument in value_counts()
import pandas as pd
# create a Series with some missing values
data = pd.Series(['apple', 'banana', 'apple', 'orange', None, 'banana', None])
# use value_counts() without specifying dropna (default is True)
counts_default = data.value_counts()
print("Counts excluding None:")
print(counts_default)
print()
# use value_counts() with dropna=False
counts_including_na = data.value_counts(dropna=False)
print("Counts including None:")
print(counts_including_na)
Output
Counts excluding None:
apple 2
banana 2
orange 1
Name: count, dtype: int64

Counts including None:
apple 2
banana 2
None 2
orange 1
Name: count, dtype: int64
Here,
dropna=True - counts the occurrences of each fruit and excludes the None values.
dropna=False - includes the None values in the count, showing how many missing values are present in the data.
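As a quick cross-check of the two modes, the totals can be compared against a direct count of missing entries. The sketch below uses the standard Series methods isna() and sum() for that comparison, with the same data as the example.

```python
import pandas as pd

data = pd.Series(['apple', 'banana', 'apple', 'orange', None, 'banana', None])

# number of missing entries, counted directly
missing = data.isna().sum()

# totals with and without the missing entries
total_without_na = data.value_counts().sum()
total_with_na = data.value_counts(dropna=False).sum()

print(missing)
print(total_without_na)
print(total_with_na)
```

The difference between the two totals equals the number of missing entries, and the total with dropna=False matches the length of the Series.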