




















































In this article by Matthew Copperwaite, author of the book Learning Flask Framework, he talks about how relational databases are the bedrock upon which almost every modern web applications are built. Learning to think about your application in terms of tables and relationships is one of the keys to a clean, well-designed project. We will be using SQLAlchemy, a powerful object relational mapper that allows us to abstract away the complexities of multiple database engines, to work with the database directly from within Python.
In this article, we shall:
(For more resources related to this topic, see here.)
Our application's database is much more than a simple record of things that we need to save for future retrieval. If all we needed to do was save and retrieve data, we could easily use flat text files. The fact is, though, that we want to be able to perform interesting queries on our data. What's more, we want to do this efficiently and without reinventing the wheel. While non-relational databases (sometimes known as NoSQL databases) are very popular and have their place in the world of web, relational databases long ago solved the common problems of filtering, sorting, aggregating, and joining tabular data. Relational databases allow us to define sets of data in a structured way that maintains the consistency of our data. Using relational databases also gives us, the developers, the freedom to focus on the parts of our app that matter.
In addition to efficiently performing ad hoc queries, a relational database server will also do the following:
Relational databases and SQL, the programming language used with relational databases, are topics worthy of an entire book. Because this book is devoted to teaching you how to build apps with Flask, I will show you how to use a tool that has been widely adopted by the Python community for working with databases, namely, SQLAlchemy.
SQLAlchemy abstracts away many of the complications of writing SQL queries, but there is no substitute for a deep understanding of SQL and the relational model. For that reason, if you are new to SQL, I would recommend that you check out the colorful book Learn SQL the Hard Way, Zed Shaw available online for free at http://sql.learncodethehardway.org/.
SQLAlchemy is an extremely powerful library for working with relational databases in Python. Instead of writing SQL queries by hand, we can use normal Python objects to represent database tables and execute queries. There are a number of benefits to this approach which are listed as follows:
I hope you're excited after reading this list. If all the items in this list don't make sense to you right now, don't worry.
Now that we have discussed some of the benefits of using SQLAlchemy, let's install it and start coding.
If you'd like to learn more about SQLAlchemy, there is an article devoted entirely to its design in The Architecture of Open-Source Applications, available online for free at http://aosabook.org/en/sqlalchemy.html.
We will use pip to install SQLAlchemy into the blog app's virtualenv. To activate your virtualenv, change directories to source the activate script as follows:
$ cd ~/projects/blog
$ source bin/activate
(blog) $ pip install sqlalchemy
Downloading/unpacking sqlalchemy
…
Successfully installed sqlalchemy
Cleaning up...
You can check if your installation succeeded by opening a Python interpreter and checking the SQLAlchemy version; note that your exact version number is likely to differ.
$ python
>>> import sqlalchemy
>>> sqlalchemy.__version__
'0.9.0b2'
SQLAlchemy works very well with Flask on its own, but the author of Flask has released a special Flask extension named Flask-SQLAlchemy that provides helpers with many common tasks, and can save us from having to re-invent the wheel later on. Let's use pip to install this extension:
(blog) $ pip install flask-sqlalchemy
…
Successfully installed flask-sqlalchemy
Flask provides a standard interface for the developers who are interested in building extensions. As the framework has grown in popularity, the number of high quality extensions has increased. If you'd like to take a look at some of the more popular extensions, there is a curated list available on the Flask project website at http://flask.pocoo.org/extensions/.
SQLAlchemy supports a multitude of popular database dialects, including SQLite, MySQL, and PostgreSQL. Depending on the database you would like to use, you may need to install an additional Python package containing a database driver. Listed next are several popular databases supported by SQLAlchemy and the corresponding pip-installable driver. Some databases have multiple driver options, so I have listed the most popular one first.
Database |
Driver Package(s) |
SQLite |
Not needed, part of the Python standard library since version 2.5 |
MySQL |
MySQL-python, PyMySQL (pure Python), OurSQL |
PostgreSQL |
psycopg2 |
Firebird |
fdb |
Microsoft SQL Server |
pymssql, PyODBC |
Oracle |
cx-Oracle |
SQLite comes as standard with Python and does not require a separate server process, so it is perfect for getting up and running quickly. For simplicity in the examples that follow, I will demonstrate how to configure the blog app for use with SQLite. If you have a different database in mind that you would like to use for the blog project, feel free to use pip to install the necessary driver package at this time.
Using your favorite text editor, open the config.py module for our blog project (~/projects/blog/app/config.py). We are going to add an SQLAlchemy specific setting to instruct Flask-SQLAlchemy how to connect to our database. The new lines are highlighted in the following:
class Configuration(object):
APPLICATION_DIR = current_directory
DEBUG = True
SQLALCHEMY_DATABASE_URI = 'sqlite:///%s/blog.db' % APPLICATION_DIR
The SQLALCHEMY_DATABASE_URIis comprised of the following parts:
dialect+driver://username:password@host:port/database
Because SQLite databases are stored in local files, the only information we need to provide is the path to the database file. On the other hand, if you wanted to connect to PostgreSQL running locally, your URI might look something like this:
postgresql://postgres:secretpassword@localhost:5432/blog_db
If you're having trouble connecting to your database, try consulting the SQLAlchemy documentation on the database URIs:
http://docs.sqlalchemy.org/en/rel_0_9/core/engines.html
Now that we've specified how to connect to the database, let's create the object responsible for actually managing our database connections. This object is provided by the Flask-SQLAlchemy extension and is conveniently named SQLAlchemy. Open app.py and make the following additions:
from flask import Flask
from flask.ext.sqlalchemy import SQLAlchemy
from config import Configuration
app = Flask(__name__)
app.config.from_object(Configuration)
db = SQLAlchemy(app)
These changes instruct our Flask app, and in turn SQLAlchemy, how to communicate with our application's database. The next step will be to create a table for storing blog entries and to do so, we will create our first model.
A model is the data representation of a table of data that we want to store in the database. These models have attributes called columns that represent the data items in the data. So, if we were creating a Person model, we might have columns for storing the first and last name, date of birth, home address, hair color, and so on. Since we are interested in creating a model to represent blog entries, we will have columns for things like the title and body content.
Note that we don't say a People model or Entries model – models are singular even though they commonly represent many different objects.
With SQLAlchemy, creating a model is as easy as defining a class and specifying a number of attributes assigned to that class. Let's start with a very basic model for our blog entries. Create a new file named models.py in the blog project's app/ directory and enter the following code:
import datetime, re
from app import db
def slugify(s):
return re.sub('[^w]+', '-', s).lower()
class Entry(db.Model):
id = db.Column(db.Integer, primary_key=True)
title = db.Column(db.String(100))
slug = db.Column(db.String(100), unique=True)
body = db.Column(db.Text)
created_timestamp = db.Column(db.DateTime, default=datetime.datetime.now)
modified_timestamp = db.Column(
db.DateTime,
default=datetime.datetime.now,
onupdate=datetime.datetime.now)
def __init__(self, *args, **kwargs):
super(Entry, self).__init__(*args, **kwargs) # Call parent constructor.
self.generate_slug()
def generate_slug(self):
self.slug = ''
if self.title:
self.slug = slugify(self.title)
def __repr__(self):
return '<Entry: %s>' % self.title
There is a lot going on, so let's start with the imports and work our way down. We begin by importing the standard library datetime and re modules. We will be using datetime to get the current date and time, and re to do some string manipulation. The next import statement brings in the db object that we created in app.py. As you recall, the db object is an instance of the SQLAlchemy class, which is a part of the Flask-SQLAlchemy extension. The db object provides access to the classes that we need to construct our Entry model, which is just a few lines ahead.
Before the Entry model, we define a helper function slugify, which we will use to give our blog entries some nice URLs. The slugify function takes a string like A post about Flask and uses a regular expression to turn a string that is human-readable in a URL, and so returns a-post-about-flask.
Next is the Entry model. Our Entry model is a normal class that extends db.Model. By extending the db.Model our Entry class will inherit a variety of helpers which we'll use to query the database.
The attributes of the Entry model, are a simple mapping of the names and data that we wish to store in the database and are listed as follows:
For short strings such as titles or names of things, the String column is appropriate, but when the text may be especially long it is better to use a Text column, as we did for the entry body.
We've overridden the constructor for the class (__init__) so that when a new model is created, it automatically sets the slug for us based on the title.
The last piece is the __repr__ method which is used to generate a helpful representation of instances of our Entry class. The specific meaning of __repr__ is not important but allows you to reference the object that the program is working with, when debugging.
A final bit of code needs to be added to main.py, the entry-point to our application, to ensure that the models are imported. Add the highlighted changes to main.py as follows:
from app import app, db
import models
import views
if __name__ == '__main__':
app.run()
In order to start working with the Entry model, we first need to create a table for it in our database. Luckily, Flask-SQLAlchemy comes with a nice helper for doing just this. Create a new sub-folder named scripts in the blog project's app directory. Then create a file named create_db.py:
(blog) $ cd app/
(blog) $ mkdir scripts
(blog) $ touch scripts/create_db.py
Add the following code to the create_db.py module. This function will automatically look at all the code that we have written and create a new table in our database for the Entry model based on our models:
from main import db
if __name__ == '__main__':
db.create_all()
Execute the script from inside the app/ directory. Make sure the virtualenv is active. If everything goes successfully, you should see no output.
(blog) $ python create_db.py
(blog) $
If you encounter errors while creating the database tables, make sure you are in the app directory, with the virtualenv activated, when you run the script. Next, ensure that there are no typos in your SQLALCHEMY_DATABASE_URI setting.
Let's experiment with our new Entry model by saving a few blog entries. We will be doing this from the Python interactive shell. At this stage let's install IPython, a sophisticated shell with features like tab-completion (that the default Python shell lacks):
(blog) $ pip install ipython
Now check if we are in the app directory and let's start the shell and create a couple of entries as follows:
(blog) $ ipython
In []: from models import * # First things first, import our Entry model and db object.
In []: db # What is db?
Out[]: <SQLAlchemy engine='sqlite:////home/charles/projects/blog/app/blog.db'>
If you are familiar with the normal Python shell but not IPython, things may look a little different at first. The main thing to be aware of is that In[] refers to the code you type in, and Out[] is the output of the commands you put in to the shell.
IPython has a neat feature that allows you to print detailed information about an object. This is done by typing in the object's name followed by a question-mark (?). Introspecting the Entry model provides a bit of information, including the argument signature and the string representing that object (known as the docstring) of the constructor:
In []: Entry? # What is Entry and how do we create it?
Type: _BoundDeclarativeMeta
String Form:<class 'models.Entry'>
File: /home/charles/projects/blog/app/models.py
Docstring: <no docstring>
Constructor information:
Definition:Entry(self, *args, **kwargs)
We can create Entry objects by passing column values in as the keyword-arguments. In the preceding example, it uses **kwargs; this is a shortcut for taking a dict object and using it as the values for defining the object, as shown next:
In []: first_entry = Entry(title='First entry', body='This is the body of my first entry.')
In order to save our first entry, we will to add it to the database session. The session is simply an object that represents our actions on the database. Even after adding it to the session, it will not be saved to the database yet. In order to save the entry to the database, we need to commit our session:
In []: db.session.add(first_entry)
In []: first_entry.id is None # No primary key, the entry has not been saved.
Out[]: True
In []: db.session.commit()
In []: first_entry.id
Out[]: 1
In []: first_entry.created_timestamp
Out[]: datetime.datetime(2014, 1, 25, 9, 49, 53, 1337)
As you can see from the preceding code examples, once we commit the session, a unique id will be assigned to our first entry and the created_timestamp will be set to the current time. Congratulations, you've created your first blog entry!
Try adding a few more on your own. You can add multiple entry objects to the same session before committing, so give that a try as well.
At any point while you are experimenting, feel free to delete the blog.db file and re-run the create_db.py script to start over with a fresh database.
In order to make changes to an existing Entry, simply make your edits and then commit. Let's retrieve our Entry using the id that was returned to use earlier, make some changes and commit it. SQLAlchemy will know that it needs to be updated. Here is how you might make edits to the first entry:
In []: first_entry = Entry.query.get(1)
In []: first_entry.body = 'This is the first entry, and I have made some edits.'
In []: db.session.commit()
And just like that your changes are saved.
Deleting an entry is just as easy as creating one. Instead of calling db.session.add, we will call db.session.delete and pass in the Entry instance that we wish to remove:
In []: bad_entry = Entry(title='bad entry', body='This is a lousy
entry.')
In []: db.session.add(bad_entry)
In []: db.session.commit() # Save the bad entry to the database.
In []: db.session.delete(bad_entry)
In []: db.session.commit() # The bad entry is now deleted from the
database.
While creating, updating, and deleting are fairly straightforward operations, the real fun starts when we look at ways to retrieve our entries. We'll start with the basics, and then work our way up to more interesting queries.
We will use a special attribute on our model class to make queries: Entry.query. This attribute exposes a variety of APIs for working with the collection of entries in the database.
Let's simply retrieve a list of all the entries in the Entry table:
In []: entries = Entry.query.all()
In []: entries # What are our entries?
Out[]: [<Entry u'First entry'>, <Entry u'Second entry'>, <Entry
u'Third entry'>, <Entry u'Fourth entry'>]
As you can see, in this example, the query returns a list of Entry instances that we created. When no explicit ordering is specified, the entries are returned to us in an arbitrary order chosen by the database. Let's specify that we want the entries returned to us in an alphabetical order by title:
In []: Entry.query.order_by(Entry.title.asc()).all()
Out []:
[<Entry u'First entry'>,
<Entry u'Fourth entry'>,
<Entry u'Second entry'>,
<Entry u'Third entry'>]
Shown next is how you would list your entries in reverse-chronological order, based on when they were last updated:
In []: oldest_to_newest = Entry.query.order_by(Entry.modified_timestamp.desc()).all()
Out []:
[<Entry: Fourth entry>,
<Entry: Third entry>,
<Entry: Second entry>,
<Entry: First entry>]
It is very useful to be able to retrieve the entire collection of blog entries, but what if we want to filter the list? We could always retrieve the entire collection and then filter it in Python using a loop, but that would be very inefficient. Instead we will rely on the database to do the filtering for us, and simply specify the conditions for which entries should be returned. In the following example, we will specify that we want to filter by entries where the title equals 'First entry'.
In []: Entry.query.filter(Entry.title == 'First entry').all()
Out[]: [<Entry u'First entry'>]
If this seems somewhat magical to you, it's because it really is! SQLAlchemy uses operator overloading to convert expressions like <Model>.<column> == <some value> into an abstracted object called BinaryExpression. When you are ready to execute your query, these data-structures are then translated into SQL.
A BinaryExpression is simply an object that represents the logical comparison and is produced by over-riding the standards methods that are typically called on an object when comparing values in Python.
In order to retrieve a single entry, you have two options, .first() and .one(). Their differences and similarities are summarized in the following table:
Number of matching rows |
first() behavior |
one() behavior |
1 |
Return the object. |
Return the object. |
0 |
Return None. |
Raise sqlalchemy.orm.exc.NoResultFound |
2+ |
Return the first object (based on either explicit ordering or the ordering chosen by the database). |
Raise sqlalchemy.orm.exc.MultipleResultsFound |
Let's try the same query as before, but instead of calling .all(), we will call .first() to retrieve a single Entry instance:
In []: Entry.query.filter(Entry.title == 'First entry').first()
Out[]: <Entry u'First entry'>
Notice how previously .all() returned a list containing the object, whereas .first() returned just the object itself.
In the previous example we tested for equality, but there are many other types of lookups possible. In the following table, have listed some that you may find useful. A complete list can be found in the SQLAlchemy documentation.
Example |
Meaning |
Entry.title == 'The title' |
Entries where the title is "The title", case-sensitive. |
Entry.title != 'The title' |
Entries where the title is not "The title". |
Entry.created_timestamp < datetime.date(2014, 1, 25) |
Entries created before January 25, 2014. For less than or equal, use <=. |
Entry.created_timestamp > datetime.date(2014, 1, 25) |
Entries created after January 25, 2014. For greater than or equal, use >=. |
Entry.body.contains('Python') |
Entries where the body contains the word "Python", case-sensitive. |
Entry.title.endswith('Python') |
Entries where the title ends with the string "Python", case-sensitive. Note that this will also match titles that end with the word "CPython", for example. |
Entry.title.startswith('Python') |
Entries where the title starts with the string "Python", case-sensitive. Note that this will also match titles like "Pythonistas". |
Entry.body.ilike('%python%') |
Entries where the body contains the word "python" anywhere in the text, case-insensitive. The "%" character is a wild-card. |
Entry.title.in_(['Title one', 'Title two']) |
Entries where the title is in the given list, either 'Title one' or 'Title two'. |
The expressions listed in the preceding table can be combined using bitwise operators to produce arbitrarily complex expressions. Let's say we want to retrieve all blog entries that have the word Python or Flask in the title. To accomplish this, we will create two contains expressions, then combine them using Python's bitwise OR operator which is a pipe| character unlike a lot of other languages that use a double pipe || character:
Entry.query.filter(Entry.title.contains('Python') |
Entry.title.contains('Flask'))
Using bitwise operators, we can come up with some pretty complex expressions. Try to figure out what the following example is asking for:
Entry.query.filter(
(Entry.title.contains('Python') |
Entry.title.contains('Flask')) &
(Entry.created_timestamp > (datetime.date.today() -
datetime.timedelta(days=30)))
)
As you probably guessed, this query returns all entries where the title contains either Python or Flask, and which were created within the last 30 days. We are using Python's bitwise OR and AND operators to combine the sub-expressions. For any query you produce, you can view the generated SQL by printing the query as follows:
In []: query = Entry.query.filter(
(Entry.title.contains('Python') | Entry.title.contains('Flask'))
&
(Entry.created_timestamp > (datetime.date.today() -
datetime.timedelta(days=30)))
)
In []: print str(query)
SELECT entry.id AS entry_id, ...
FROM entry
WHERE (
(entry.title LIKE '%%' || :title_1 || '%%') OR (entry.title LIKE
'%%' || :title_2 || '%%')
) AND entry.created_timestamp > :created_timestamp_1
There is one more piece to discuss, which is negation. If we wanted to get a list of all blog entries which did not contain Python or Flask in the title, how would we do that? SQLAlchemy provides two ways to create these types of expressions, using either Python's unary negation operator (~) or by calling db.not_(). This is how you would construct this query with SQLAlchemy:
Using unary negation:
In []: Entry.query.filter(~(Entry.title.contains('Python') |
Entry.title.contains('Flask')))
Using db.not_():
In []: Entry.query.filter(db.not_(Entry.title.contains('Python') |
Entry.title.contains('Flask')))
Not all operations are considered equal to the Python interpreter. This is like in math class, where we learned that expressions like 2 + 3 * 4 are equal to 14 and not 20, because the multiplication operation occurs first. In Python, bitwise operators all have a higher precedence than things like equality tests, so this means that when you are building your query expression, you have to pay attention to the parentheses. Let's look at some example Python expressions and see the corresponding query:
Expression |
Result |
(Entry.title == 'Python' | Entry.title == 'Flask') |
Wrong! SQLAlchemy throws an error because the first thing to be evaluated is actually the 'Python' | Entry.title! |
(Entry.title == 'Python') | (Entry.title == 'Flask') |
Right. Returns entries where the title is either "Python" or "Flask". |
~Entry.title == 'Python' |
Wrong! SQLAlchemy will turn this into a valid SQL query, but the results will not be meaningful. |
~(Entry.title == 'Python') |
Right. Returns entries where the title is not equal to "Python". |
If you find yourself struggling with the operator precedence, it's a safe bet to put parentheses around any comparison that uses ==, !=, <, <=, >, and >=.
The final topic we will discuss in this article is how to make modifications to an existing Model definition. From the project specification, we know we would like to be able to save drafts of our blog entries. Right now we don't have any way to tell whether an entry is a draft or not, so we will need to add a column that let's us store the status of our entry. Unfortunately, while db.create_all() works perfectly for creating tables, it will not automatically modify an existing table; to do this we need to use migrations.
We will use Flask-Migrate to help us automatically update our database whenever we change the schema. In the blog virtualenv, install Flask-migrate using pip:
(blog) $ pip install flask-migrate
The author of SQLAlchemy has a project called alembic; Flask-Migrate makes use of this and integrates it with Flask directly, making things easier.
Next we will add a Migrate helper to our app. We will also create a script manager for our app. The script manager allows us to execute special commands within the context of our app, directly from the command-line. We will be using the script manager to execute the migrate command. Open app.py and make the following additions:
from flask import Flask
from flask.ext.migrate import Migrate, MigrateCommand
from flask.ext.script import Manager
from flask.ext.sqlalchemy import SQLAlchemy
from config import Configuration
app = Flask(__name__)
app.config.from_object(Configuration)
db = SQLAlchemy(app)
migrate = Migrate(app, db)
manager = Manager(app)
manager.add_command('db', MigrateCommand)
In order to use the manager, we will add a new file named manage.py along with app.py. Add the following code to manage.py:
from app import manager
from main import *
if __name__ == '__main__':
manager.run()
This looks very similar to main.py, the key difference being that instead of calling app.run(), we are calling manager.run().
Django has a similar, although auto-generated, manage.py file that serves a similar function.
Before we can start changing our schema, we need to create a record of its current state. To do this, run the following commands from inside your blog's app directory. The first command will create a migrations directory inside the app folder which will track the changes we make to our schema. The second command db migrate will create a snapshot of our current schema so that future changes can be compared to it.
(blog) $ python manage.py db init
Creating directory /home/charles/projects/blog/app/migrations ... done
...
(blog) $ python manage.py db migrate
INFO [alembic.migration] Context impl SQLiteImpl.
INFO [alembic.migration] Will assume non-transactional DDL.
Generating /home/charles/projects/blog/app/migrations/versions/535133f91f00_.py ... done
Finally, we will run db upgrade to run the migration which will indicate to the migration system that everything is up-to-date:
(blog) $ python manage.py db upgrade
INFO [alembic.migration] Context impl SQLiteImpl.
INFO [alembic.migration] Will assume non-transactional DDL.
INFO [alembic.migration] Running upgrade None -> 535133f91f00, empty message
Now that we have a snapshot of our current schema, we can start making changes. We will be adding a new column named status, which will store an integer value corresponding to a particular status. Although there are only two statuses at the moment (PUBLIC and DRAFT), using an integer instead of a Boolean gives us the option to easily add more statuses in the future. Open models.py and make the following additions to the Entry model:
class Entry(db.Model):
STATUS_PUBLIC = 0
STATUS_DRAFT = 1
id = db.Column(db.Integer, primary_key=True)
title = db.Column(db.String(100))
slug = db.Column(db.String(100), unique=True)
body = db.Column(db.Text)
status = db.Column(db.SmallInteger, default=STATUS_PUBLIC)
created_timestamp = db.Column(db.DateTime,
default=datetime.datetime.now)
...
From the command-line, we will once again be running db migrate to generate the migration script. You can see from the command's output that it found our new column:
(blog) $ python manage.py db migrate
INFO [alembic.migration] Context impl SQLiteImpl.
INFO [alembic.migration] Will assume non-transactional DDL.
INFO [alembic.autogenerate.compare] Detected added column 'entry.status'
Generating /home/charles/projects/blog/app/migrations/versions/2c8e81936cad_.py ... done
Because we have blog entries in the database, we need to make a small modification to the auto-generated migration to ensure the statuses for the existing entries are initialized to the proper value. To do this, open up the migration file (mine is migrations/versions/2c8e81936cad_.py) and change the following line:
op.add_column('entry', sa.Column('status', sa.SmallInteger(),
nullable=True))
The replacement of nullable=True with server_default='0' tells the migration script to not set the column to null by default, but instead to use 0:
op.add_column('entry', sa.Column('status', sa.SmallInteger(), server_default='0'))
Finally, run db upgrade to run the migration and create the status column:
(blog) $ python manage.py db upgrade
INFO [alembic.migration] Context impl SQLiteImpl.
INFO [alembic.migration] Will assume non-transactional DDL.
INFO [alembic.migration] Running upgrade 535133f91f00 -> 2c8e81936cad, empty message
Congratulations, your Entry model now has a status field!
By now you should be familiar with using SQLAlchemy to work with a relational database. We covered the benefits of using a relational database and an ORM, configured a Flask application to connect to a relational database, and created SQLAlchemy models. All this allowed us to create relationships between our data and perform queries. To top it off, we also used a migration tool to handle future database schema changes.
We will set aside the interactive interpreter and start creating views to display blog entries in the web browser. We will put all our SQLAlchemy knowledge to work by creating interesting lists of blog entries, as well as a simple search feature. We will build a set of templates to make the blogging site visually appealing, and learn how to use the Jinja2 templating language to eliminate repetitive HTML coding.
Further resources on this subject: