The melt() method in Pandas reshapes a DataFrame from wide format to long format: the column labels are gathered into one column and their values into another.
Example
import pandas as pd
data = {
    'A': [1, 2],
    'B': [4, 5]
}
df = pd.DataFrame(data)
# melt the entire DataFrame
melted_df = pd.melt(df)
print(melted_df)
'''
Output
  variable  value
0        A      1
1        A      2
2        B      4
3        B      5
'''
melt() Syntax
The syntax of the melt() method in Pandas is:
pd.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None)
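The same operation is also available as a DataFrame method, which can be convenient in method chains. The sketch below uses a small made-up DataFrame and illustrative column names ('column' and 'number' are chosen for this example only) to suggest that both spellings give the same result:

```python
import pandas as pd

# small illustrative DataFrame (made up for this sketch)
df = pd.DataFrame({'A': [1, 2], 'B': [4, 5]})

# function form: the DataFrame is passed as the first argument
melted_func = pd.melt(df, var_name='column', value_name='number')

# method form: the DataFrame is the caller, so no frame argument is needed
melted_method = df.melt(var_name='column', value_name='number')

# compare the two results
print(melted_func.equals(melted_method))
```

If the two calls behave identically, as expected, the comparison prints True.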
melt() Arguments
The melt() method takes the following arguments:
frame - the DataFrame we want to melt
id_vars (optional) - a column name or a list of column names to keep as identifier variables
value_vars (optional) - a column name or a list of column names to melt. If omitted, every column not listed in id_vars is melted
var_name (optional) - the name to use for the variable column. If omitted, the name of the column index is used, or 'variable' if it has none
value_name (optional) - the name to use for the value column. The default is 'value'
col_level (optional) - if the input DataFrame has multi-level columns, the level to melt
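The list above follows the signature shown in this article. Pandas 1.1 and later also accept an ignore_index argument, which is not part of that signature. The following is a minimal sketch, assuming such a version is installed, of how it affects the row labels of the result (the 'x' and 'y' labels are made up for the example):

```python
import pandas as pd

# illustrative DataFrame with custom row labels
df = pd.DataFrame({'A': [1, 2], 'B': [4, 5]}, index=['x', 'y'])

# default behaviour: the result gets a fresh 0..n-1 index
default_index = pd.melt(df).index.tolist()
print(default_index)

# ignore_index=False repeats the original row labels instead
kept = pd.melt(df, ignore_index=False)
print(kept.index.tolist())
```

Keeping the original labels can help when the melted rows need to be traced back to the rows they came from.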
melt() Return Value
The melt() method returns a new DataFrame containing the reshaped, long-format data.
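Because the result is a new object, the original DataFrame should be left untouched. A quick sketch with a small made-up DataFrame shows how the two shapes relate:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [4, 5]})
melted_df = pd.melt(df)

# the original DataFrame keeps its wide shape: 2 rows, 2 columns
print(df.shape)

# the result has one row per (original row, melted column) pair: 2 * 2 = 4 rows
print(melted_df.shape)
```

In general, melting k columns of an n-row DataFrame gives n * k rows.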
Example 1: Reshape DataFrame Using melt()
import pandas as pd
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Math': [90, 88, 76],
    'Science': [88, 92, 80],
    'History': [78, 85, 90]
}
df = pd.DataFrame(data)
# melt the entire DataFrame
melted_df = pd.melt(df)
print(melted_df)
Output
    variable    value
0       Name    Alice
1       Name      Bob
2       Name  Charlie
3       Math       90
4       Math       88
5       Math       76
6    Science       88
7    Science       92
8    Science       80
9    History       78
10   History       85
11   History       90
In the above example, we used the melt() method to melt the entire df DataFrame.
Since we didn't specify any additional arguments, every column, including Name, is treated as a column to melt, and the DataFrame is transformed from its original wide format to a long format.
melt() also created two columns: variable for the original column names and value for the corresponding values.
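One side effect of melting everything is that names and scores end up in the same value column. The sketch below, which uses a shortened version of the example data, is one way to inspect the resulting data types:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Math': [90, 88, 76],
    'Science': [88, 92, 80],
})

# melting everything stacks names and scores into one column
mixed = pd.melt(df)
print(mixed['value'].dtype)

# keeping Name as an identifier leaves only numbers in the value column
scores = pd.melt(df, id_vars='Name')
print(scores['value'].dtype)
```

The first value column should come out with the generic object dtype because it mixes text and numbers, while the second should stay an integer column that supports arithmetic directly.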
Example 2: Provide Custom Name to Variable and Value
import pandas as pd
data = {
    'Math': [90, 88, 76],
    'Science': [88, 92, 80],
    'History': [78, 85, 90]
}
df = pd.DataFrame(data)
# melt the entire DataFrame and provide variable and value name
melted_df = pd.melt(df, var_name='Subject', value_name='Score')
print(melted_df)
Output
   Subject  Score
0     Math     90
1     Math     88
2     Math     76
3  Science     88
4  Science     92
5  Science     80
6  History     78
7  History     85
8  History     90
Here,
var_name='Subject' names the variable column Subject.
value_name='Score' names the value column Score.
Example 3: Preserve Key Information With id_vars in DataFrame Reshaping
import pandas as pd
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Math': [90, 88, 76],
    'Science': [88, 92, 80],
    'History': [78, 85, 90]
}
df = pd.DataFrame(data)
# melt the DataFrame, keeping 'Name' as an identifier variable
melted_df = pd.melt(df, id_vars=['Name'], var_name='Subject', value_name='Score')
print(melted_df)
Output
      Name  Subject  Score
0    Alice     Math     90
1      Bob     Math     88
2  Charlie     Math     76
3    Alice  Science     88
4      Bob  Science     92
5  Charlie  Science     80
6    Alice  History     78
7      Bob  History     85
8  Charlie  History     90
In this example, we specified id_vars=['Name'], which means we want to keep the Name column as an identifier variable.
As a result, Name is not melted. It appears as a separate column in the melted DataFrame, with each name repeated once for every melted column.
Hence, id_vars allows us to specify which columns to preserve in their original form while melting the others.
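Identifier variables are also what make the reshaping reversible. As a rough sketch, and not something the example above relies on, the long data can be spread back into wide form with pivot(), using Name to decide which row each score belongs to:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Math': [90, 88, 76],
    'Science': [88, 92, 80],
    'History': [78, 85, 90],
})

melted_df = pd.melt(df, id_vars=['Name'], var_name='Subject', value_name='Score')

# pivot() spreads the Subject labels back out into columns,
# using Name to decide which row each Score belongs to
wide_df = melted_df.pivot(index='Name', columns='Subject', values='Score')

# turn Name back into a regular column and drop the 'Subject' axis label
restored_df = wide_df.reset_index().rename_axis(columns=None)

print(restored_df)
```

The restored columns may come back in a different order than in the original DataFrame, so compare them by name rather than by position.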
Example 4: Melt Only Specific Columns Using value_vars
import pandas as pd
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Math': [90, 88, 76],
    'Science': [88, 92, 80],
    'History': [78, 85, 90]
}
df = pd.DataFrame(data)
# melt only specific columns using value_vars
melted_df = pd.melt(
    df,
    id_vars=['Name'],
    value_vars=['Math', 'Science'],
    var_name='Subject',
    value_name='Score'
)
print(melted_df)
Output
      Name  Subject  Score
0    Alice     Math     90
1      Bob     Math     88
2  Charlie     Math     76
3    Alice  Science     88
4      Bob  Science     92
5  Charlie  Science     80
In the above example, we specified value_vars=['Math', 'Science'], so only the Math and Science columns are melted.
The History column is listed in neither id_vars nor value_vars, so it is left out of the result entirely.
This allows us to control which columns are transformed into the long format, which are kept as identifier variables, and which are dropped.
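To check which columns survive when value_vars is given, the result can be inspected directly. The sketch below repeats the data from this example:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Math': [90, 88, 76],
    'Science': [88, 92, 80],
    'History': [78, 85, 90],
})

melted_df = pd.melt(
    df,
    id_vars=['Name'],
    value_vars=['Math', 'Science'],
    var_name='Subject',
    value_name='Score',
)

# only the identifier, variable, and value columns remain
print(melted_df.columns.tolist())

# History is neither an identifier nor a melted column, so it is absent
print(melted_df['Subject'].unique().tolist())
```

A column that should stay in the result without being melted needs to be added to id_vars.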
Example 5: Melt a DataFrame With Multi-Level Columns
import pandas as pd
# sample DataFrame with multi-level columns
data = {
    ('Jan', 'Sales'): [100, 150, 200],
    ('Feb', 'Sales'): [120, 160, 210],
    ('Jan', 'Profit'): [20, 25, 30],
    ('Feb', 'Profit'): [22, 27, 31],
}
df = pd.DataFrame(data)
# melt the DataFrame, specifying the col_level parameter
melted_df = pd.melt(df, col_level=0, var_name='NewColumn', value_name='Value')
print(melted_df)
Output
    NewColumn  Value
0         Jan    100
1         Jan    150
2         Jan    200
3         Feb    120
4         Feb    160
5         Feb    210
6         Jan     20
7         Jan     25
8         Jan     30
9         Feb     22
10        Feb     27
11        Feb     31
Here, col_level=0 specifies that we want to melt using the first level of the column index (i.e., the Jan and Feb labels). The labels from the other level, Sales and Profit, do not appear in the result.
If we set col_level=1, the second level of the column index would be used instead, and the output would be:
    NewColumn  Value
0       Sales    100
1       Sales    150
2       Sales    200
3       Sales    120
4       Sales    160
5       Sales    210
6      Profit     20
7      Profit     25
8      Profit     30
9      Profit     22
10     Profit     27
11     Profit     31
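With col_level set, only one level of labels survives, so rows for Sales and Profit (or for Jan and Feb) can no longer be told apart. One possible alternative, sketched here with the same data, is to omit col_level so that each level gets its own variable column. The names Month and Metric are chosen for this sketch and are not part of the original example.

```python
import pandas as pd

data = {
    ('Jan', 'Sales'): [100, 150, 200],
    ('Feb', 'Sales'): [120, 160, 210],
    ('Jan', 'Profit'): [20, 25, 30],
    ('Feb', 'Profit'): [22, 27, 31],
}
df = pd.DataFrame(data)

# no col_level: each level of the column index gets its own variable column
melted_df = pd.melt(df)

# replace the automatically generated column names with descriptive ones
# ('Month' and 'Metric' are names chosen for this sketch)
melted_df.columns = ['Month', 'Metric', 'Value']

print(melted_df.head())
```

Each row then carries both the month and the metric, so no information from the column index is lost.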