Transactional Testing with Pytest and Flask-SQLAlchemy

Testing 1...2...3...

Since 2015, DataMade has been developing Dedupe.io, a web app that builds on our open source dedupe library to help non-technical users deduplicate and link records in messy datasets using supervised machine learning techniques.

Testing Dedupe.io has proven to be a consistent challenge for the engineering team for a few reasons. First, Dedupe.io is highly stateful – that is, most of its business logic involves updating state in a database server. This often precludes writing isolated tests, a core tenet of some testing paradigms. Second, Dedupe.io runs on a Flask and SQLAlchemy stack using the Flask-SQLAlchemy plugin. Compared to other Python frameworks, like Django, there are fewer mature tools for testing real database interactions in Flask-SQLAlchemy using our test runner of choice, pytest.

In this blog post we’ll cover DataMade’s approach to testing highly-stateful web apps like Dedupe.io, and introduce a new open-source plugin that we’re releasing to make it easier to test stateful Flask-SQLAlchemy apps using pytest: pytest-flask-sqlalchemy-transactions.

The problem: too much database state in tests

At DataMade, we consider Dedupe.io to be a highly-stateful web app: the app logic is written in Python and Flask, but many methods are simply thin wrappers around SQL updates intended to run against a PostgreSQL database. Dedupe.io uses database tables to queue work for machine learning models and record outcomes as that work is completed, so testing the interplay of these workflows is essential to testing the app.

Some developers discourage touching the database in unit tests and advocate mocking database interactions instead, but when testing a highly-stateful web app like Dedupe.io, tests need to interact with a real database or else they won’t do a reasonable job of approximating the app’s core logic.

When tests run updates against a real database, however, they risk violating test isolation, the principle that tests should be fully independent from one another and should not share any state. Since it would be impractical to create a fresh database for every test, the tests share a connection to the database, meaning that the updates that a test introduces can leak from test to test if the developer isn’t careful. To write a unit test for Dedupe.io, the engineering team needed to make sure that they reversed not only the direct updates that the test made to the database, but that they reversed any side effects as well.

As a quick example to illustrate the point, imagine we want to test a simple method add_to_queue that takes a function func and arguments args and registers them to be run at the end of a work queue:

def add_to_queue(func, args):
    # The SQL statement that will add the work to the queue.
    update = '''
        INSERT INTO work_queue (function, arguments)
        VALUES (func, args)
        RETURNING id
    '''

    # Use SQLAlchemy to run the SQL statement.
    with engine.begin() as conn:
        queue_id = conn.execute(update)

    # Return the ID of the new work added to the queue.
    return queue_id

Under a traditional paradigm of database-free unit testing, the developer would fake the database update in order to test the method:

# Python's unittest.mock module can substitute objects in the code with
# other objects whose output we define.
from unittest import mock

def test_add_to_queue():
    # Mock the `conn.execute` method so that it always returns a reliable ID.
    work_id = 1
    patched_conn_execute = mock.MagicMock(spec=sqlalchemy.engine.Connection.execute)
    patched_conn_execute.return_value = work_id

    # Patch the call to `conn.execute` in the `add_to_queue` method so that it
    # uses our mock instead.
    With mock.patch('module.conn.execute', new=patched_conn_execute):
        # Test that the method returns the expected ID.
        assert add_to_queue('sum', (1, 2)) == work_id

But in this context faking the update doesn’t make much sense, since interacting with the database through an SQL update is the most important part of the method. Replacing that interaction results in a test that doesn’t test anything.

With a real test database, on the other hand, the developer can run queries to confirm that the method has actually performed the relevant work:

from sqlalchemy.sql.expression import text

def test_add_to_queue():
    # Call `add_to_queue` to add a function to the work queue.
    work_id = add_to_queue('sum', (1, 2))

    # Using the returned ID, check that the work queue contains a proper entry.
    new_work = engine.execute(text('''
        SELECT *
        FROM work_queue
        WHERE id = :work_id
    '''), work_id=work_id)

    assert new_work.func == 'sum'
    assert new_work.args = (1, 2)

    # Make sure to clean up the work added in this test.
    engine.execute(text('''
        DELETE FROM work_queue
        WHERE id = :work_id
    '''))

Notice how the test requires a cleanup step at the end, or else the state created by add_to_queue will leak into the next test.

From the developer’s perspective, writing cleanup for tests is tedious. From a formal correctness perspective, however, it’s disastrous. It simply isn’t practical to expect the engineering team to reliably anticipate every side effect of a test, particularly as a codebase grows more complex and accumulates technical debt.

The solution: run all tests inside transactions

When we first started developing Dedupe.io, there were no tools to deal with test isolation in our Flask-SQLAlchemy stack. Today, we’re excited to release a new open-source plugin we developed to make it work for us: pytest-flask-sqlalchemy-transactions.

Taking inspiration from Django’s built-in support for transactional tests, the pytest-flask-sqlalchemy-transactions plugin provides comprehensive, easy-to-use pytest fixtures for ensuring test isolation in database transactions for Flask-SQLAlchemy apps. The plugin allows a developer to wrap any test in a database transaction, an isolated unit of database work that can be “rolled back” to reverse new changes. With pytest-flask-sqlalchemy-transactions, a developer can make arbitrary database updates with the confidence that any changes made during a test will roll back once the test exits.

Quick examples

The pytest-flask-sqlalchemy plugin creates a set of fixtures that can be used to run updates that will roll back when the test exits: db_session and db_engine. A developer can use these fixtures like they might use standard SQLAlchemy Session and Engine objects.

In this example, we introduce a state change to a test using db_session and confirm that it doesn’t persist beyond the body of the test:

def test_a_transaction(db_session):
   row = db_session.query(Table).get(1) 
   row.name = 'testing'

   db_session.add(row)
   db_session.commit()

def test_transaction_doesnt_persist(db_session):
   row = db_session.query(Table).get(1) 
   assert row.name != 'testing'

There’s no need to name or instantiate Engine and Session objects in a specific way for pytest-flask-sqlalchemy-transactions to work. Simply set the mocked-engines and mocked-sessions configuration variables to the Engine and Session objects where your codebase instantiates them, and the plugin will replace your connectables with transactional equivalents.

This example replaces two database connection objects that are used in the codebase to create side effects, database.db.session and database.engine:

# In setup.cfg

[pytest]
mocked-sessions=database.db.session
mocked-engines=database.engine

# In database.py

db = flask_sqlalchemy.SQLAlchemy()
engine = sqlalchemy.create_engine('DATABASE_URI')

With the connectables properly replaced, the add_to_queue method from above can safely run in tests without requiring any cleanup:

from sqlalchemy.sql.expression import text

def test_add_to_queue(db_engine):
    # Call `add_to_queue` to add a function to the work queue.
    work_id = add_to_queue('sum', (1, 2))

    # Using the returned ID, check that the work queue contains a proper entry.
    new_work = db_engine.execute(text('''
        SELECT *
        FROM work_queue
        WHERE id = :work_id
    '''), work_id=work_id)

    assert new_work.func == 'sum'
    assert new_work.args = (1, 2)

def test_transaction_doesnt_persist(db_session):
    # Make sure that the changes made in `test_add_to_queue` don’t persist across tests.
   row = db_session.query(work_queue).all()
   assert not row

Before pytest-flask-sqlalchemy-transactions, the developer had to clean up after test_add_to_queue or else state from the test would have leaked into test_transaction_doesnt_persist. Now, with mocked database connections and enforced rollbacks, pytest takes care of the cleanup, and test isolation in Flask-SQLAlchemy is a breeze.

Open source, always

The pytest-flask-sqlalchemy-transactions plugin is one among many in the growing universe of open-source libraries produced for Dedupe.io, all of which are available on the Dedupe.io organization’s GitHub account. As always, the code is open source and freely useable under the MIT license. Like what you see? Want something better? Open up an issue on GitHub on or reach out to us directly with your thoughts!

pytest-flask-sqlalchemy-transactions is greatly indebted to Alex Michael, whose blog post “Delightful testing with pytest and Flask-SQLAlchemy” helped establish the basic approach on which this plugin builds.