Join my Laravel for REST API's course on Udemy 👀

Replication using a Django database router

October 23, 2020  ‐ 3 min read

In Django it is perfectly fine to interact with multiple databases in your project. Django uses the concepts database routers to determine which database should be used for which model.

These database routers can be leveraged to in a master-replica database setup.

Be aware that this setup doesn't take replication lag into account.

To get starter lets first add multiple databases to the settings. A master database set to default and the replica using the replica key.

DATABASES = {
    'default': {
        'NAME': 'django',
        'ENGINE': 'django.db.backends.mysql',
    },
    'replica': {
        'NAME': 'django',
        'ENGINE': 'django.db.backends.mysql',
    },
}

You could use master instead of default as well. But Django requires that a database entry be defined with the default key is defined. If you prefer to do so you can set default to an emply dict.

Now we can move on to the database router itself. The database router should be a class that implements the methods: db_for_read, db_for_write, allow_relation, allow_migrate.

A database router that supports replication might look as follows.

import random
from django.conf import settings

class ReplicationRouter:
    def db_for_read(self, model, **hints):
        """
        Randomly pick a database to read from
        """
        return random.choice([key for key in settings.DATABASES])

    def db_for_write(self, model, **hints):
        """
        Always send write queries to the master database.
        """
        return 'default';

    def allow_relation(self, obj1, obj2, **hints):
        """
        This isn't really applicable for this use-case.
        """
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        """
        Only allow migration operations on the master database, just in case.
        """
        if db == 'default':
            return True
        return None

The allow_relation and allow_migrate aren't really applicable for replication. Those have a use-case for when you use different databases for different Django apps.

Important for our use-case of replication are db_for_read and db_for_write. The db_for_read method should return the database that should be used for a read operation. While db_for_write specifies the database that should be used for a write operation.

First off the database for write operations. In case Django needs to execute a write operation we want this query to end up at the master database. Because we should never write to the replica!

We simply specify the key in the DATABASES dict of the master database. This way only the master database will receive write queries.

def db_for_write(self, model, **hints):
    return 'default';

Second is specifying the database read operations can be propagated to. For this we randomly select a database that is set in DATABASES. This way read queries should be evenly distributed between the master and the replica.

def db_for_read(self, model, **hints):
    return random.choice([key for key in settings.DATABASES])

This does assume that all databases are used for replication though.

The last thing we need to do is specifying the database router in the settings. You can add a database router by setting DATABASE_ROUTERS.

DATABASE_ROUTERS = ['path.to.ReplicationRouter']