The to_sql() method in Pandas is used to write records stored in a DataFrame to a SQL database. It requires the SQLAlchemy engine to make a connection to the database.
Example
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('sqlite:///mydatabase.db')
# create a DataFrame
data = {'Name': ['Tom', 'Nick', 'John'], 'Age': [20, 21, 19]}
df = pd.DataFrame(data)
# write DataFrame to database
df.to_sql(name='people', con=engine)
to_sql() Syntax
The syntax of the to_sql() method in Pandas is:
df.to_sql(name, con, schema=None, if_exists='fail', index=True, index_label=None, chunksize=None, dtype=None, method=None)
to_sql() Arguments
The to_sql() method takes the following common arguments:
name: the name of the target tablecon: engine or database connection objectschema(optional): specifies the schemaif_exists: how to behave if the table already existsindex: write index as a columnindex_label: column label for index column(s)chunksize: specifies the number of rows in each batch to be written at a timedtype: specifies the datatype for columnsmethod: controls the SQL insertion clause used.
to_sql() Return Value
The return value of to_sql() is None. It does not return anything since its purpose is to write the DataFrame to a database.
Example 1: Write to SQL with Default Settings
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('sqlite:///mydatabase.db')
# create a DataFrame
data = {'Name': ['Tom', 'Nick', 'John'], 'Age': [20, 21, 19]}
df = pd.DataFrame(data)
# write DataFrame to database
df.to_sql(name='people', con=engine)
In this example, we wrote the DataFrame df to the SQL table people using the default settings.
Example 2: Writing to SQL with Replace Option
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('sqlite:///mydatabase.db')
# create a DataFrame
data = {'Name': ['Anna', 'Alex', 'Alice'], 'Age': [28, 22, 23]}
df = pd.DataFrame(data)
# write DataFrame to database
df.to_sql(name='people', con=engine, if_exists='replace')
Here, if the table people already exists, it will be replaced with the new DataFrame df.
Example 3: Specifying Data Types
import pandas as pd
from sqlalchemy import create_engine, Integer, Text
engine = create_engine('sqlite:///mydatabase.db')
# create a DataFrame
data = {'Name': ['Sam', 'Lily', 'Max'], 'Age': [32, 24, 27]}
df = pd.DataFrame(data)
# write DataFrame to database
df.to_sql(name='employees', con=engine, dtype={'Name': Text, 'Age': Integer})
In this example, we specified that the Name column should be stored as Text and the Age as Integer in the SQL table employees.
Example 4: Append to Existing Table
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('sqlite:///mydatabase.db')
# create a DataFrame
new_data = {'Name': ['Brian', 'Smith'], 'Age': [34, 29]}
new_df = pd.DataFrame(new_data)
# write to database
new_df.to_sql(name='people', con=engine, if_exists='append')
In this example, we appended the records in new_df to the people table by using if_exists='append'.
Example 5: Using method Parameter
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('sqlite:///mydatabase.db')
# create a DataFrame
data = {'Name': ['Sara', 'Alex'], 'Age': [26, 22]}
df = pd.DataFrame(data)
# use the 'multi' method to pass multiple insert values
df.to_sql(name='new_people', con=engine, method='multi')
In this example, we used the 'multi' method to pass multiple insert values in a single INSERT clause.
Using the method='multi' argument can lead to performance benefits when inserting multiple records at once.