DDIA | Replication | Multi Leader Replication

In previous posts, we looked at single-leader replication, its variants, and the problems associated with it. In single-leader replication, only one node (the leader) accepts writes. Some systems instead allow several nodes to accept writes; this is called multi-leader replication. Let’s look at some use cases for it.

Multi-datacenter operation

Suppose a system runs in multiple data centers. With a single leader, every write from the other data centers has to travel to the data center that hosts the leader, which adds latency because the data centers may be far apart.

To solve this, we can have one leader per data center. Within each datacenter, regular leader–follower replication is used; between data centers, each datacenter’s leader replicates its changes to the leaders in other data centers.
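The setup above can be sketched as a toy model. This is only an illustration under simplifying assumptions: the `Leader` class, its `outbox` queue, and the data center names are hypothetical, followers inside each data center are left out, and conflicts are ignored.

```python
from collections import deque


class Leader:
    """One leader per data center (hypothetical model, not a real database API)."""

    def __init__(self, name):
        self.name = name
        self.data = {}          # local key-value state
        self.peers = []         # leaders in the other data centers
        self.outbox = deque()   # local changes not yet shipped to peers

    def write(self, key, value):
        # A local write is applied right away; the client does not wait
        # for the change to reach the other data centers.
        self.data[key] = value
        self.outbox.append((key, value))

    def replicate(self):
        # Ship queued changes to every other leader (the asynchronous step).
        while self.outbox:
            key, value = self.outbox.popleft()
            for peer in self.peers:
                peer.apply_remote(key, value)

    def apply_remote(self, key, value):
        # Remote changes are applied but not forwarded again, to avoid loops.
        self.data[key] = value


us = Leader("us-east")
eu = Leader("eu-west")
us.peers, eu.peers = [eu], [us]

us.write("user:1", "Alice")
before = eu.data.get("user:1")   # not replicated yet
us.replicate()
after = eu.data.get("user:1")    # now visible in the other data center
print(before, after)
```

Between `write` and `replicate`, the two leaders briefly disagree; that gap is the replication lag between data centers.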

Advantages –

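1. Better performance: every write is processed in the local data center and replicated asynchronously to the others, so users see lower write latency.
2. Tolerance of data center outages: if one data center fails, the others keep operating independently, and replication catches up once the failed data center is back online.
3. Tolerance of network problems: because replication between data centers is asynchronous, a temporary network interruption between them does not stop writes from being processed.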

Disadvantages –

The same data may be concurrently modified in two different data centers, and those write conflicts must be resolved.
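To see how such a conflict arises, here is a minimal sketch in which two leaders are modeled as plain dictionaries. The key name and values are made up for illustration, and "naive replication" here simply means applying the other side's change on arrival.

```python
# Two leaders start from the same state.
dc1 = {"title": "A"}
dc2 = {"title": "A"}

# Each leader accepts a local write to the same key before hearing from the other.
change_from_dc1 = ("title", "B")
change_from_dc2 = ("title", "C")
dc1[change_from_dc1[0]] = change_from_dc1[1]
dc2[change_from_dc2[0]] = change_from_dc2[1]

# Naive replication: each side just applies the other's change on arrival.
dc1[change_from_dc2[0]] = change_from_dc2[1]
dc2[change_from_dc1[0]] = change_from_dc1[1]

print(dc1["title"], dc2["title"])  # C B
```

Both writes succeeded locally, yet the two data centers end up with different values for the same key. In a single-leader setup the second write would have been ordered after the first, so this divergence could not happen.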

Some databases support multi-leader configurations by default, but it is also often implemented with external tools, such as Tungsten Replicator for MySQL, BDR for PostgreSQL, and GoldenGate for Oracle.

Clients with offline operation

Another situation in which multi-leader replication is appropriate is if you have an application that needs to continue to work while it is disconnected from the internet.

For example, consider the calendar apps on your mobile phone, your laptop, and other devices. You need to be able to see your meetings (make read requests) and enter new meetings (make write requests) at any time, regardless of whether your device currently has an internet connection. If you make any changes while you are offline, they need to be synced with a server and your other devices when the device is next online.

In this case, every device has a local database that acts as a leader (it accepts write requests), and there is an asynchronous multi-leader replication process (sync) between the replicas of your calendar on all of your devices. The replication lag may be hours or even days, depending on when you have internet access available.
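A rough sketch of this idea follows. The `Device` and `Server` classes and their methods are hypothetical names chosen for illustration, and the sync step assumes no two devices edit the same meeting, so conflict handling is left out.

```python
class Server:
    """Central copy of the calendar (hypothetical)."""

    def __init__(self):
        self.events = {}

    def receive(self, changes):
        self.events.update(changes)


class Device:
    """Each device keeps a local database that accepts reads and writes offline."""

    def __init__(self, server):
        self.server = server
        self.local_db = {}   # local replica; acts as a leader
        self.pending = {}    # writes made since the last sync

    def add_meeting(self, meeting_id, title):
        # Writes always succeed locally, with or without a connection.
        self.local_db[meeting_id] = title
        self.pending[meeting_id] = title

    def list_meetings(self):
        # Reads are served from the local replica.
        return sorted(self.local_db.values())

    def sync(self):
        # Called when the device is next online: push local changes,
        # then pull whatever the other devices have pushed.
        self.server.receive(self.pending)
        self.pending = {}
        self.local_db.update(self.server.events)


server = Server()
phone = Device(server)
laptop = Device(server)

phone.add_meeting("m1", "Design review")    # phone is offline
laptop_before = laptop.list_meetings()      # laptop has not seen it yet

phone.sync()                                # phone comes online
laptop.sync()                               # laptop syncs later
laptop_after = laptop.list_meetings()
print(laptop_before, laptop_after)
```

The time between `add_meeting` on the phone and `sync` on the laptop is the replication lag, which in practice may be hours or days.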

Collaborative editing

Real-time collaborative editing applications allow several people to edit a document simultaneously. When one user edits a document, the changes are instantly applied to their local replica (the state of the document in their web browser or client application) and asynchronously replicated to the server and any other users who are editing the same document.

If you want to guarantee that there will be no editing conflicts, the application must obtain a lock on the document before a user can edit it. If another user wants to edit the same document, they first have to wait until the first user has committed their changes and released the lock. This collaboration model is equivalent to single-leader replication with transactions on the leader.
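The locking model can be sketched as below. `LockedDocument` and its methods are hypothetical, and the in-process `threading.Lock` only stands in for whatever locking a real collaborative editor would coordinate through its server.

```python
import threading


class LockedDocument:
    """A document that only one user may edit at a time (hypothetical sketch)."""

    def __init__(self, text=""):
        self.text = text
        self.editor = None
        self._lock = threading.Lock()

    def begin_edit(self, user):
        # Try to take the lock without blocking; fail if someone else holds it.
        if not self._lock.acquire(blocking=False):
            return False
        self.editor = user
        return True

    def commit(self, user, new_text):
        # Only the current lock holder may commit; committing releases the lock.
        if self.editor != user:
            raise PermissionError(f"{user} does not hold the lock")
        self.text = new_text
        self.editor = None
        self._lock.release()


doc = LockedDocument("draft")
alice_got_lock = doc.begin_edit("alice")   # True
bob_first_try = doc.begin_edit("bob")      # False: alice is still editing
doc.commit("alice", "draft v2")            # alice commits and releases the lock
bob_second_try = doc.begin_edit("bob")     # True
print(alice_got_lock, bob_first_try, bob_second_try, doc.text)
```

Because every edit goes through one lock, edits are fully ordered and no conflicts can occur, at the cost of making users wait for each other.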

Thanks for stopping by! I hope this gives you a brief overview of the different use cases that call for multi-leader replication. I’m eager to hear your thoughts, so please leave a comment below and we can discuss.

