Replication using a Django database router
October 23, 2020 ‐ 3 min read
In Django it is perfectly fine to interact with multiple databases in your project. Django uses the concepts database routers to determine which database should be used for which model.
These database routers can be leveraged to in a master-replica database setup.
Be aware that this setup doesn't take replication lag into account.
To get starter lets first add multiple databases to the settings. A master database set to default
and the replica using the replica
key.
DATABASES = {
'default': {
'NAME': 'django',
'ENGINE': 'django.db.backends.mysql',
},
'replica': {
'NAME': 'django',
'ENGINE': 'django.db.backends.mysql',
},
}
You could use master
instead of default
as well. But Django requires that a database entry be defined with the default
key is defined. If you prefer to do so you can set default
to an emply dict.
Now we can move on to the database router itself. The database router should be a class that implements the methods: db_for_read
, db_for_write
, allow_relation
, allow_migrate
.
A database router that supports replication might look as follows.
import random
from django.conf import settings
class ReplicationRouter:
def db_for_read(self, model, **hints):
"""
Randomly pick a database to read from
"""
return random.choice([key for key in settings.DATABASES])
def db_for_write(self, model, **hints):
"""
Always send write queries to the master database.
"""
return 'default';
def allow_relation(self, obj1, obj2, **hints):
"""
This isn't really applicable for this use-case.
"""
return True
def allow_migrate(self, db, app_label, model_name=None, **hints):
"""
Only allow migration operations on the master database, just in case.
"""
if db == 'default':
return True
return None
The allow_relation
and allow_migrate
aren't really applicable for replication. Those have a use-case for when you use different databases for different Django apps.
Important for our use-case of replication are db_for_read
and db_for_write
. The db_for_read
method should return the database that should be used for a read operation. While db_for_write
specifies the database that should be used for a write operation.
First off the database for write operations. In case Django needs to execute a write operation we want this query to end up at the master database. Because we should never write to the replica!
We simply specify the key in the DATABASES
dict of the master database. This way only the master database will receive write queries.
def db_for_write(self, model, **hints):
return 'default';
Second is specifying the database read operations can be propagated to. For this we randomly select a database that is set in DATABASES
. This way read queries should be evenly distributed between the master and the replica.
def db_for_read(self, model, **hints):
return random.choice([key for key in settings.DATABASES])
This does assume that all databases are used for replication though.
The last thing we need to do is specifying the database router in the settings. You can add a database router by setting DATABASE_ROUTERS
.
DATABASE_ROUTERS = ['path.to.ReplicationRouter']