6 Lines of Python to Plot SQLite Data

There are some great tutorials on SQL, SQLite and Matplotlib, however I wanted something super simple that I could use to teach my daughters with.

This blog documents my notes on what I used to teach them SQL, SQLite3 and then how to plot the results in Python. Yes it can only takes 6 lines of Python code to read an SQLite3 file and plot it, if you are doing 2 line charts with a title and legends its 9 lines.

Getting Started with SQL

There are some great SQL databases and SQL tools that can are quite user friendly.

For us I started with SQLite3 which is file based, no server component is required and it runs on all major OS’s and Raspberry Pi’s. There are a number of SQL Query builder tools but we found the DB Browser for SQLite gave us all the functionality we needed.

Check the install documentation for your hardware/OS, to install SQLite3 and DB Browser on Debian/Ubuntu/Raspberry Pi’s :

sudo apt install sqlite3
sudo apt-get install sqlitebrowse

SQLite Databases

For testing there are a lot of excellent databases that can downloaded from Kaggle. These data files are in CSV format and they can be imported into a SQLite database file using the DB Browser.

For our first test we used the Kaggle SARS 2003 data set.

A new database file (sars.db) was created and then the CSV was imported.

Create VIEWS to Simplify

SQL Views can be created to simply the field names, reduce the number of fields or add custom fields.

Use the “Execute SQL” tab to create a view. An example to create a view with simplified field names would be:

Views and SELECT queries can be generated to add custom fields. An example to extract the month from the date field and add 2 other fields (month number and month name):

Test SQL Queries

The DB Browser tool is good for testing out SQL queries and plot the data before moving to Python.

A query to find the worst 5 countries affected by SARS would:

select sum(deaths) sum_deaths, country from v_sars group
  by Country having sum_deaths > 0 order by sum_deaths desc limit 5

Plotting in Python

There are number of ways to plot SQL data in Python. I found that the easiest way was to use Pandas dataframes.

To load the necessary Python libraries:

pip install sqlite3
pip install pandas
pip install matplotlib

The Python code to connect to SQLite3, run the SQL query (of sum of deaths vs. country) and plot the data is:

#
# sql_bar0.py - Sqlite to Bar Charts
#
import sqlite3, pandas , matplotlib.pyplot as plt

conn = sqlite3.connect("/media/pete/RED_4GB/sql/sars.db")

sql = """select sum(deaths) sum_deaths, country from v_sars group
  by Country having sum_deaths > 0 order by sum_deaths desc limit 5"""

data = pandas.read_sql(sql, conn)
#x values: data.Country,  y values: data.sum_deaths
plt.bar(data.Country, data.sum_deaths)
plt.title("SARS Death in 2003")
plt.show()

An example with 2 lines to shows the monthly deaths and cases would be:

#
# sql_line2.py - Sqlite to 2 Line Charts
#
import sqlite3, pandas , matplotlib.pyplot as plt

conn = sqlite3.connect("/media/pete/RED_4GB/sql/sars.db")

sql = """select s_month, sum(deaths) as sum_deaths, sum(cases) as sum_cases from v_month group by n_month"""

data = pandas.read_sql(sql, conn)

plt.plot(data.s_month,data.sum_deaths, label = "Deaths")
plt.plot(data.s_month,data.sum_cases, label = "Cases")
plt.legend()
plt.title("SARS Death in 2003")
plt.show()

Summary

By keeping the Python code simple we were able to focus on SQL queries.

Using the basic Python code the SQL connection we later changed from SQLite3 to MySQL or Progresql.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s