Pandas cut()

The cut() function in Pandas segments and sorts data values into bins, converting a continuous variable into a categorical one.

Example

import pandas as pd

# create a list of ages
ages = [20, 22, 25, 27, 21, 23, 37, 31, 61, 45, 41, 32]

# define the bins - age ranges
bins = [18, 25, 35, 60, 100]

# use cut() to categorize each age into the defined bins
categories = pd.cut(ages, bins)

print(categories)

'''
Output

[(18, 25], (18, 25], (18, 25], (25, 35], (18, 25], ..., (25, 35], (60, 100], (35, 60], (35, 60], (25, 35]]
Length: 12
Categories (4, interval[int64, right]): [(18, 25] < (25, 35] < (35, 60] < (60, 100]]
'''

cut() Syntax

The syntax of the cut() method in Pandas is:

pd.cut(x, bins, right=True, labels=None, retbins=False, precision=3, include_lowest=False)

cut() Arguments

The cut() method takes the following arguments:

x - the one-dimensional input array to be binned
bins - the criteria to bin by: an integer (the number of equal-width bins), a sequence of bin edges, or an IntervalIndex
right (optional) - indicates whether bins include the rightmost edge
labels (optional) - specifies the labels for the returned bins; if False, integer bin positions are returned instead
retbins (optional) - specifies whether to return the bin edges along with the result
precision (optional) - precision at which to store and display the bin labels
include_lowest (optional) - whether the first interval should be left-inclusive or not
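
The bins argument can take more than one form. The following is a minimal sketch, reusing the ages list from the first example, that contrasts an integer bins value with an explicit list of edges. The variable names equal_width and explicit are only illustrative.

```python
import pandas as pd

ages = [20, 22, 25, 27, 21, 23, 37, 31, 61, 45, 41, 32]

# bins as an integer: pandas computes 4 equal-width bins
# that span the range of the data
equal_width = pd.cut(ages, bins=4)

# bins as a sequence: the edges are used exactly as written
explicit = pd.cut(ages, bins=[18, 25, 35, 60, 100])

print(equal_width.categories)
print(explicit.categories)
```

With an integer, the edges depend on the minimum and maximum of the data, so they change when the data changes. With explicit edges, the bins stay fixed, and values outside them become NaN.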

cut() Return Value

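The cut() method in Pandas returns the bin each original value in the input data belongs to: a Series if x is a Series, otherwise a Categorical (or an array of integer bin positions when labels=False). If retbins=True, the bin edges are returned as well.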
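
The return type depends on the input and on the labels argument. Here is a small sketch that checks the three common cases. The sample values and variable names are assumptions made for illustration.

```python
import pandas as pd

bins = [0, 50, 100]

# list input -> Categorical
from_list = pd.cut([10, 60, 90], bins)

# Series input -> Series with a categorical dtype
from_series = pd.cut(pd.Series([10, 60, 90]), bins)

# labels=False -> integer bin positions instead of intervals
as_codes = pd.cut([10, 60, 90], bins, labels=False)

print(type(from_list).__name__)
print(type(from_series).__name__)
print(as_codes)
```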

Example 1: Categorizing Data Using cut()

import pandas as pd

# create a list of exam scores
scores = [88, 92, 75, 85, 78, 95, 64, 82, 90, 73, 67, 99]

# define the bins - grading ranges
bins = [0, 60, 70, 80, 90, 100]

# use cut() to categorize each score into the defined grading bins
grade_categories = pd.cut(scores, bins)

print(grade_categories)

Output

[(80, 90], (90, 100], (70, 80], (80, 90], (70, 80], ..., (80, 90], (80, 90], (70, 80], (60, 70], (90, 100]]
Length: 12
Categories (5, interval[int64, right]): [(0, 60] < (60, 70] < (70, 80] < (80, 90] < (90, 100]]

In the above example, we have created the list named scores containing exam scores.

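The bins are defined to represent different grading ranges: (0, 60], (60, 70], (70, 80], (80, 90], and (90, 100]. Each bin excludes its left edge and includes its right edge, so a score of 90 falls in (80, 90].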

Then we used pd.cut() to categorize each score into the corresponding grading bin.
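
A common follow-up is counting how many values land in each bin. The sketch below assumes the same scores and bins as above. Wrapping the result in a Series and calling value_counts(sort=False) is one way to do this, not the only one.

```python
import pandas as pd

scores = [88, 92, 75, 85, 78, 95, 64, 82, 90, 73, 67, 99]
bins = [0, 60, 70, 80, 90, 100]

grade_categories = pd.cut(scores, bins)

# count the scores in each bin; sort=False keeps the bins in their
# natural order instead of ordering them by frequency
counts = pd.Series(grade_categories).value_counts(sort=False)

print(counts)
```

Bins that contain no values, such as (0, 60] here, are still listed with a count of 0.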

Example 2: Control Bin Boundaries Using right Argument in cut()

import pandas as pd

# create a list of data
data = [2, 4, 6, 8, 10]

# define the bins
bins = [0, 5, 10]

# use cut() with right=True (default)
categories_right_true = pd.cut(data, bins, right=True)

print("Bins closed on the right:")
print(categories_right_true)
print()

# use cut with right=False
categories_right_false = pd.cut(data, bins, right=False)

print("Bins closed on the left:")
print(categories_right_false)

Output

Bins closed on the right:
[(0, 5], (0, 5], (5, 10], (5, 10], (5, 10]]
Categories (2, interval[int64, right]): [(0, 5] < (5, 10]]

Bins closed on the left:
[[0.0, 5.0), [0.0, 5.0), [5.0, 10.0), [5.0, 10.0), NaN]
Categories (2, interval[int64, left]): [[0, 5) < [5, 10)]

Here, with

right=True, the bins are (0, 5] and (5, 10], indicating that the right edge (5 and 10) is included in the bin.
right=False, the bins are [0, 5) and [5, 10), meaning the left edge (0 and 5) is included in the bin and the right edge is not. The value 10 therefore falls outside every bin and is returned as NaN.
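
If the NaN for the maximum value is not wanted, one possible approach is to extend the last edge past the maximum. This is a sketch under that assumption. The edge value 11 is arbitrary, and any number greater than 10 would work.

```python
import pandas as pd

data = [2, 4, 6, 8, 10]

# with right=False the last bin [5, 10) excludes 10, so 10 becomes NaN
left_closed = pd.cut(data, [0, 5, 10], right=False)

# extending the last edge past the maximum keeps 10 inside a bin
left_closed_extended = pd.cut(data, [0, 5, 11], right=False)

print(left_closed.isna())
print(left_closed_extended)
```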

Example 3: Naming Bins in Pandas cut()

import pandas as pd

# create a list of data
data = [20, 35, 45, 60, 75, 90]

# define the bins
bins = [0, 25, 50, 75, 100]

# define custom labels for the bins
labels = ['Low', 'Medium', 'High', 'Very High']

# use cut() with custom labels
categories_with_labels = pd.cut(data, bins, labels=labels)

print(categories_with_labels)

Output

['Low', 'Medium', 'Medium', 'High', 'High', 'Very High']
Categories (4, object): ['Low' < 'Medium' < 'High' < 'Very High']

In this example, we have defined the list of custom labels: Low, Medium, High, and Very High, corresponding to each bin.

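Then we used pd.cut() to categorize the data into bins and assign the custom labels to these bins. The number of labels must equal the number of bins, which is one fewer than the number of bin edges.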
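
Labeled bins are often stored as a DataFrame column. The sketch below assumes a hypothetical DataFrame with a score column. Because cut() returns ordered categories by default, the labels can be compared with each other.

```python
import pandas as pd

# hypothetical DataFrame used only for illustration
df = pd.DataFrame({'score': [20, 35, 45, 60, 75, 90]})

bins = [0, 25, 50, 75, 100]
labels = ['Low', 'Medium', 'High', 'Very High']

# store the label of each score in a new column
df['level'] = pd.cut(df['score'], bins, labels=labels)

# the categories are ordered, so comparisons follow the label order
high_or_above = df[df['level'] >= 'High']

print(high_or_above)
```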

Example 4: Extract Bin Information Using retbins Argument in cut()

import pandas as pd

# create a list of data
data = [10, 15, 20, 25, 30, 35, 40]

# define the bins
bins = [0, 20, 40]

# use cut() with retbins=True
categories, bin_edges = pd.cut(data, bins, retbins=True)

print("Binned Categories:")
print(categories)
print("\nBin Edges:")
print(bin_edges)

Output

Binned Categories:
[(0, 20], (0, 20], (0, 20], (20, 40], (20, 40], (20, 40], (20, 40]]
Categories (2, interval[int64, right]): [(0, 20] < (20, 40]]

Bin Edges:
[ 0 20 40]

In the above example, we used pd.cut() with retbins=True, so it returns two things: the binned categories and the array of bin edges.

The categories variable contains the binned data (each element of data categorized into the bins).

And the bin_edges variable contains the actual edges of the bins used in the process.
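
retbins is most useful when pandas computes the edges itself, for example when bins is an integer. The sketch below is one possible use: it captures the computed edges and applies them to a second list. The names train and new_data are assumptions made for illustration.

```python
import pandas as pd

train = [10, 15, 20, 25, 30, 35, 40]

# let pandas compute 2 equal-width bins and return the edges it used
train_binned, bin_edges = pd.cut(train, bins=2, retbins=True)

# apply exactly the same edges to a second data set
new_data = [12, 38]
new_binned = pd.cut(new_data, bins=bin_edges)

print(bin_edges)
print(new_binned)
```

Reusing the returned edges keeps both data sets on the same bins, which would not be guaranteed if bins=2 were passed to each call separately.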

Example 5: Specify the Precision of the Bin Labels

import pandas as pd

# create a list of floating-point data
data = [10.123, 15.456, 20.789, 25.012, 30.345, 35.678, 40.901]

# define the bins
bins = [0, 20, 40, 60]

# use cut() with precision=2
categories = pd.cut(data, bins, precision=2)

print("Binned Categories with Two Decimal Precision:")
print(categories)

Output

Binned Categories with Two Decimal Precision:
[(0, 20], (0, 20], (20, 40], (20, 40], (20, 40], (20, 40], (40, 60]]
Categories (3, interval[int64, right]): [(0, 20] < (20, 40] < (40, 60]]

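Here, we used pd.cut() with precision=2. The precision argument sets the number of decimal places used to store and display the bin labels. Because the bin edges in this example are explicit integers, the labels are unchanged. The argument has a visible effect mainly when pandas computes fractional bin edges, for example when bins is an integer.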
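
To make the effect of precision visible, the bin edges need to be fractional. The following sketch reuses the same data but passes bins=3 so that pandas computes the edges. The names coarse and fine are only illustrative.

```python
import pandas as pd

data = [10.123, 15.456, 20.789, 25.012, 30.345, 35.678, 40.901]

# bins=3 makes pandas compute fractional edges,
# so the precision setting shows up in the labels
coarse = pd.cut(data, bins=3, precision=1)
fine = pd.cut(data, bins=3, precision=3)

print(coarse.categories)
print(fine.categories)
```

The two results assign every value to the same bin position. Only the rounding of the interval labels differs.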

Example 6: Use of include_lowest Argument in cut()

import pandas as pd

# create a list of data
data = [20, 22, 24, 26, 28, 30]

# define the bins
bins = [20, 25, 30]

# use cut() with include_lowest=False (default)
categories_default = pd.cut(data, bins)

print("First bin exclusive of the lower edge:")
print(categories_default)
print()

# use cut() with include_lowest=True
categories_include_lowest = pd.cut(data, bins, include_lowest=True)

print("First bin inclusive of the lower edge:")
print(categories_include_lowest)

Output

First bin exclusive of the lower edge:
[NaN, (20.0, 25.0], (20.0, 25.0], (25.0, 30.0], (25.0, 30.0], (25.0, 30.0]]
Categories (2, interval[int64, right]): [(20, 25] < (25, 30]]

First bin inclusive of the lower edge:
[(19.999, 25.0], (19.999, 25.0], (19.999, 25.0], (25.0, 30.0], (25.0, 30.0], (25.0, 30.0]]
Categories (2, interval[float64, right]): [(19.999, 25.0] < (25.0, 30.0]]

In this example, with

include_lowest=False - the first bin (20, 25] does not include the lower edge 20. Thus, the value 20 in the data is not included in any bin, resulting in NaN.
include_lowest=True - the first bin includes the lower edge 20. Pandas represents this by shifting the left edge slightly below 20, so the bin is displayed as (19.999, 25.0]. Therefore, the value 20 is included in the first bin, and there are no NaN values.
