The loc[] property in Pandas is used to select data from a DataFrame based on labels or conditions.
Example
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35],
'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)
# select the row with label 1 (second row)
selected_row = df.loc[1]
print(selected_row)
'''
Output
Name Bob
Age 30
City Los Angeles
Name: 1, dtype: object
'''
loc[] Syntax
The syntax of the loc[] property in Pandas is:
loc[rows, columns]
loc[] Arguments
The loc[] property takes the following arguments:
- rows - specifies the selection criteria for rows, which can be labels, boolean conditions, or slices
- columns (optional) - specifies the selection criteria for columns, which can be labels, boolean conditions, or slices
loc[] Return Value
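The loc[] property in Pandas returns a Series, a DataFrame, or a single value, depending on what we select. For example, a single row label returns a Series, a list or slice of row labels returns a DataFrame, and a single row label paired with a single column label returns a single value.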
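As a quick check of how the return type changes with the selection, here is a minimal sketch. The small DataFrame and the variable names row, rows, and value are made up for illustration.

```python
import pandas as pd

# illustrative data, not taken from the examples below
df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]},
    index=['A', 'B', 'C'],
)

row = df.loc['A']            # single row label -> Series
rows = df.loc[['A', 'B']]    # list of row labels -> DataFrame
value = df.loc['A', 'Age']   # row label and column label -> single value

print(type(row).__name__)    # Series
print(type(rows).__name__)   # DataFrame
print(value)                 # 25
```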
Example 1: Select Single Row by Label
import pandas as pd
# create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
'Age': [25, 30, 35, 28],
'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data, index=['A', 'B', 'C', 'D'])
# select a single row by label
selected_row = df.loc['A']
print(selected_row)
Output
Name Alice
Age 25
City New York
Name: A, dtype: object
In the above example, we have used the loc[] property to select a single row from the DataFrame.
The index parameter specifies custom row labels A, B, C, and D for the DataFrame.
And the A label is used as the argument to loc[], which specifies that we want to select the row with the label A.
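Since loc[] looks rows up by label, asking for a label that is not in the index raises a KeyError. The sketch below assumes a small two-row DataFrame and a hypothetical missing label Z.

```python
import pandas as pd

# illustrative data with custom row labels
df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]},
                  index=['A', 'B'])

try:
    df.loc['Z']          # 'Z' is not a row label in df
    found = True
except KeyError:
    found = False        # pandas raises KeyError for a missing label

print(found)  # False
```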
Example 2: Select Multiple Rows by Labels
import pandas as pd
# create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
'Age': [25, 30, 35, 28],
'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data, index=['A', 'B', 'C', 'D'])
# select multiple rows by labels
selected_rows = df.loc[['A', 'C']]
print(selected_rows)
Output
Name Age City
A Alice 25 New York
C Charlie 35 Chicago
Here, first we created the df DataFrame with row labels A, B, C, and D. Then, we used loc[['A', 'C']] to select multiple rows by the labels A and C.
Hence, the selected_rows DataFrame contains the rows with labels A and C.
Example 3: Select Specific Rows and Columns
import pandas as pd
# create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
'Age': [25, 30, 35, 28],
'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data, index=['A', 'B', 'C', 'D'])
# select specific rows and columns
selected_data = df.loc[['A', 'C'], ['Name', 'Age']]
print(selected_data)
Output
Name Age
A Alice 25
C Charlie 35
In the above example, we first created the df DataFrame with row labels A, B, C, and D and columns Name, Age, and City.
- To select specific rows, we pass a list of row labels A and C as the first argument to the loc[] property.
- To select specific columns, we pass a list of column names Name and Age as the second argument to the loc[] property.
Hence, the output shows the selected rows A and C with the columns Name and Age.
Example 4: Slice Rows and Select Specific Columns
import pandas as pd
# create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
'Age': [25, 30, 35, 28],
'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data, index=['A', 'B', 'C', 'D'])
# slice rows and select specific columns
selected_data = df.loc['B':'C', ['Name', 'Age']]
print(selected_data)
Output
Name Age
B Bob 30
C Charlie 35
In this example, the loc['B':'C', ['Name', 'Age']] property slices rows from B to C, inclusive, and selects specific columns Name and Age.
So, selected_data includes rows B and C and only the columns Name and Age.
Note: To learn more about how slicing works, please visit Pandas Indexing and Slicing.
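The inclusive end point is what sets label slicing with loc[] apart from position-based slicing. The following sketch compares it with iloc[], which excludes the stop position. The variable names by_label and by_position are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie', 'David'], 'Age': [25, 30, 35, 28]},
    index=['A', 'B', 'C', 'D'],
)

by_label = df.loc['B':'C']     # label slice: both B and C are included
by_position = df.iloc[1:2]     # position slice: stop position 2 is excluded

print(list(by_label.index))      # ['B', 'C']
print(list(by_position.index))   # ['B']
```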
Example 5: Select all Rows for Specific Columns
import pandas as pd
# create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
'Age': [25, 30, 35, 28],
'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data, index=['A', 'B', 'C', 'D'])
# select all rows for specific columns
selected_columns = df.loc[:, ['Name', 'Age']]
print(selected_columns)
Output
Name Age
A Alice 25
B Bob 30
C Charlie 35
D David 28
Here, loc[:, ['Name', 'Age']] selects all rows from the DataFrame with only the specified columns Name and Age. The : in the row position stands for every row.
The result is a DataFrame containing all four rows and the two columns Name and Age.
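Columns can also be selected with a label slice instead of a list. As a small sketch, assuming the same column order as above (Name, Age, City), the slice 'Name':'Age' picks every column from Name through Age, end point included.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob'], 'Age': [25, 30], 'City': ['New York', 'Chicago']},
    index=['A', 'B'],
)

# all rows, columns from Name through Age (inclusive)
selected_columns = df.loc[:, 'Name':'Age']

print(list(selected_columns.columns))  # ['Name', 'Age']
```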
Example 6: Select Specific Rows for all Columns
import pandas as pd
data = {'Student_ID': [101, 102, 103, 104, 105],
'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
'Score': [85, 92, 78, 88, 95]}
df = pd.DataFrame(data)
# select the rows with labels 1 and 3 for all columns
selected_rows = df.loc[[1, 3], :]
print(selected_rows)
Output
Student_ID Name Score
1 102 Bob 92
3 104 David 88
Here, loc[[1, 3], :] selects the rows with labels 1 and 3 while including all columns. Since the default index starts at 0, these are the second and fourth rows of the DataFrame.
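Because loc[] matches index labels rather than positions, the difference becomes visible once the rows are reordered. The sketch below sorts the same data by Score and compares loc[] with the position-based iloc[]. The names sorted_df, by_label, and by_position are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Student_ID': [101, 102, 103, 104, 105],
                   'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
                   'Score': [85, 92, 78, 88, 95]})

# sorting reorders the rows but each row keeps its original label
sorted_df = df.sort_values('Score')

by_label = sorted_df.loc[[1, 3]]       # rows whose labels are 1 and 3
by_position = sorted_df.iloc[[1, 3]]   # rows at positions 1 and 3

print(list(by_label['Name']))      # ['Bob', 'David']
print(list(by_position['Name']))   # ['Alice', 'Bob']
```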
Example 7: Select Rows by Boolean Condition
import pandas as pd
# create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
'Age': [25, 30, 35, 28],
'City': ['New York', 'Denver', 'Chicago', 'Houston']}
df = pd.DataFrame(data)
# select rows where Age is greater than or equal to 30
selected_rows = df.loc[df['Age'] >= 30]
print(selected_rows)
Output
Name Age City
1 Bob 30 Denver
2 Charlie 35 Chicago
Here, we have used loc[df['Age'] >= 30] to select rows from the df DataFrame where the Age column has a value greater than or equal to 30.
The expression df['Age'] >= 30 produces a boolean Series, and loc[] keeps only the rows where that Series is True, so selected_rows contains the rows for Bob and Charlie.
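A boolean condition can also be combined with a column selection, and several conditions can be joined with & (and) or | (or), each wrapped in parentheses. The following is a minimal sketch using the same data. The second condition and the names mask and result are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Age': [25, 30, 35, 28],
                   'City': ['New York', 'Denver', 'Chicago', 'Houston']})

# rows where Age is at least 28 and City is not Chicago
mask = (df['Age'] >= 28) & (df['City'] != 'Chicago')

# apply the mask to rows and keep only the Name and City columns
result = df.loc[mask, ['Name', 'City']]

print(list(result['Name']))  # ['Bob', 'David']
```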