Stop copy pasting code you don't actually understand

Build the coding confidence you need to become a developer companies will fight for

Stop copy pasting code you don't actually understand

Start FREE Trial

Stop copy pasting code you don't actually understand

Build the coding confidence you need to become a developer companies will fight for

Stop copy pasting code you don't actually understand

Start FREE Trial

Learn

Practice

Compete

Certification Courses

Created with over a decade of experience and thousands of feedback.

Learn Python

Learn HTML

Learn JavaScript

Learn SQL

Learn DSA

View all Courses on

Learn C

Learn C++

Learn Java

Pandas cut()

The cut() method in Pandas is used for segmenting and sorting data values into bins.

Example

import pandas as pd

# create a list of ages
ages = [20, 22, 25, 27, 21, 23, 37, 31, 61, 45, 41, 32]

# define the bins - age ranges
bins = [18, 25, 35, 60, 100]

# use cut() to categorize each age into the defined bins
categories = pd.cut(ages, bins)

print(categories)

'''
Output

[(18, 25], (18, 25], (18, 25], (25, 35], (18, 25], ..., (25, 35], (60, 100], (35, 60], (35, 60], (25, 35]]
Length: 12
Categories (4, interval[int64, right]): [(18, 25] < (25, 35] < (35, 60] < (60, 100]]
'''

cut() Syntax

The syntax of the cut() method in Pandas is:

Pandas.cut(x, bins, right=True, labels=None, retbins=False, precision=3, include_lowest=False)

cut() Arguments

The cut() method takes following arguments:

x - the input array to be binned
bins - the criteria to bin by
right (optional) - indicates whether bins include the rightmost edge
labels (optional) - specifies the labels for the returned bins
retbins (optional) - specifies whether to return the bins or not
precision (optional) - precision at which to store and display the bins labels
include_lowest (optional) - whether the first interval should be left-inclusive or not.

cut() Return Value

The cut() method in Pandas returns a Series or an array that represents the specific bin or category each original value in the input data belongs to.

Example 1: Categorizing Data Using cut()

import pandas as pd

# create a list of exam scores
scores = [88, 92, 75, 85, 78, 95, 64, 82, 90, 73, 67, 99]

# define the bins - grading ranges
bins = [0, 60, 70, 80, 90, 100]

# use cut() to categorize each score into the defined grading bins
grade_categories = pd.cut(scores, bins)

print(grade_categories)

Output

[(80, 90], (90, 100], (70, 80], (80, 90], (70, 80], ..., (80, 90], (80, 90], (70, 80], (60, 70], (90, 100]]
Length: 12
Categories (5, interval[int64, right]): [(0, 60] < (60, 70] < (70, 80] < (80, 90] < (90, 100]]

In the above example, we have created the list named scores containing exam scores.

The bins are defined to represent different grading ranges: 0-60, 61-70, 71-80, 81-90, 91-100.

Then we used pd.cut() to categorize each score into the corresponding grading bin.

Example 2: Control Bin Boundaries Using right Argument in cut()

import pandas as pd

# create a list of data
data = [2, 4, 6, 8, 10]

# define the bins
bins = [0, 5, 10]

# use cut() with right=True (default)
categories_right_true = pd.cut(data, bins, right=True)

print("Bins closed on the right:")
print(categories_right_true)
print()

# use cut with right=False
categories_right_false = pd.cut(data, bins, right=False)

print("\nBins closed on the left:")
print(categories_right_false)

Output

Bins closed on the right:
[(0, 5], (0, 5], (5, 10], (5, 10], (5, 10]]
Categories (2, interval[int64, right]): [(0, 5] < (5, 10]]

Bins closed on the left:
[[0.0, 5.0), [0.0, 5.0), [5.0, 10.0), [5.0, 10.0), NaN]
Categories (2, interval[int64, left]): [[0, 5) < [5, 10)]

Here, with

right=True, the bins are (0, 5] and (5, 10], indicating that the right edge (5 and 10) is included in the bin.
right=False, the bins are [0, 5) and [5, 10), meaning the left edge (0 and 5) is included in the bin.

Example 3: Naming Bins in Pandas cut()

import pandas as pd

# create a list of data
data = [20, 35, 45, 60, 75, 90]

# define the bins
bins = [0, 25, 50, 75, 100]

# define custom labels for the bins
labels = ['Low', 'Medium', 'High', 'Very High']

# use cut() with custom labels
categories_with_labels = pd.cut(data, bins, labels=labels)

print(categories_with_labels)

Output

['Low', 'Medium', 'Medium', 'High', 'High', 'Very High']
Categories (4, object): ['Low' < 'Medium' < 'High' < 'Very High']

In this example, we have defined the list of custom labels: Low, Medium, High, and Very High, corresponding to each bin.

Then used pd.cut() to categorize the data into bins and assign the custom labels to these bins.

Example 4: Extract Bin Information Using retbins Argument in cut()

import pandas as pd

# create a list of data
data = [10, 15, 20, 25, 30, 35, 40]

# define the bins
bins = [0, 20, 40]

# use cut() with retbins=True
categories, bin_edges = pd.cut(data, bins, retbins=True)

print("Binned Categories:")
print(categories)
print("\nBin Edges:")
print(bin_edges)

Output

Binned Categories:
[(0, 20], (0, 20], (0, 20], (20, 40], (20, 40], (20, 40], (20, 40]]
Categories (2, interval[int64, right]): [(0, 20] < (20, 40]]

Bin Edges:
[ 0 20 40]

In the above example, we used pd.cut() with retbins=True, so it returns two things: the binned categories and the array of bin edges.

The categories variable contains the binned data (each element of data categorized into the bins).

And the bin_edges variable contains the actual edges of the bins used in the process.

Example 5: Specify the precision of the Labels of the Bins

import pandas as pd

# create a list of floating-point data
data = [10.123, 15.456, 20.789, 25.012, 30.345, 35.678, 40.901]

# define the bins
bins = [0, 20, 40, 60]

# use cut() with precision=2
categories = pd.cut(data, bins, precision=2)

print("Binned Categories with Two Decimal Precision:")
print(categories)

Output

Binned Categories with Two Decimal Precision:
[(0, 20], (0, 20], (20, 40], (20, 40], (20, 40], (20, 40], (40, 60]]
Categories (3, interval[int64, right]): [(0, 20] < (20, 40] < (40, 60]]

Here, we used pd.cut() with precision=2. This means that the labels of the bins will be formatted to have two decimal places.

Example 6: Use of include_lowest Argument in cut()

import pandas as pd

# create a list of data
data = [20, 22, 24, 26, 28, 30]

# define the bins
bins = [20, 25, 30]

# use cut() with include_lowest=False (default)
categories_default = pd.cut(data, bins)

print("First bin exclusive of the lower edge:")
print(categories_default)
print()

# use cut() with include_lowest=True
categories_include_lowest = pd.cut(data, bins, include_lowest=True)

print("\nFirst bin inclusive of the lower edge:")
print(categories_include_lowest)

Output

First bin exclusive of the lower edge:
[NaN, (20.0, 25.0], (20.0, 25.0], (25.0, 30.0], (25.0, 30.0], (25.0, 30.0]]
Categories (2, interval[int64, right]): [(20, 25] < (25, 30]]

First bin inclusive of the lower edge:
[(19.999, 25.0], (19.999, 25.0], (19.999, 25.0], (25.0, 30.0], (25.0, 30.0], (25.0, 30.0]]
Categories (2, interval[float64, right]): [(19.999, 25.0] < (25.0, 30.0]]

In this example, with

include_lowest=False - the first bin (20, 25] does not include the lower edge 20. Thus, the value 20 in the data is not included in any bin, resulting in NaN.
include_lowest=True - the first bin [20, 25] is inclusive of the lower edge 20. Therefore, the value 20 is included in the first bin, and there are no NaN values.

Your builder path starts here. Builders don't just know how to code, they create solutions that matter.

Escape tutorial hell and ship real projects.

Try Programiz PRO

Real-World Projects
On-Demand Learning
AI Mentor
Builder Community

Popular Tutorials

Popular Examples

Reference Materials

Certification Courses

Become a certified Python
programmer.

Popular Tutorials

Reference Materials

Popular Examples

Pandas cut()

Example

cut() Syntax

cut() Arguments

cut() Return Value

Example 1: Categorizing Data Using cut()

Example 2: Control Bin Boundaries Using right Argument in cut()

Example 3: Naming Bins in Pandas cut()

Example 4: Extract Bin Information Using retbins Argument in cut()

Example 5: Specify the precision of the Labels of the Bins

Example 6: Use of include_lowest Argument in cut()

Popular Tutorials

Popular Examples

Reference Materials

Certification Courses

Become a certified Python programmer.

Popular Tutorials

Reference Materials

Popular Examples

Pandas cut()

Example

cut() Syntax

cut() Arguments

cut() Return Value

Example 1: Categorizing Data Using cut()

Example 2: Control Bin Boundaries Using right Argument in cut()

Example 3: Naming Bins in Pandas cut()

Example 4: Extract Bin Information Using retbins Argument in cut()

Example 5: Specify the precision of the Labels of the Bins

Example 6: Use of include_lowest Argument in cut()

Become a certified Python
programmer.