Master-slave replication database concept for beginners

Denny Lesmana
4 min read · Apr 4, 2022

--

What Is a Master-Slave Database?

Master-slave is a database architecture made up of one master database and one or more slave databases. Each slave keeps a replicated copy of the master's data, so it can also serve as a backup for the master. Write operations go to the master, while read operations can be spread across multiple slaves.

Imagine your site is hit by a surge in web traffic and a flood of data requests. If a single database has to serve all of them, it becomes overloaded and the site slows down for every user.

To scale out your application, you need at least two data sources: one to handle write operations and another to handle read operations. That is where the master-slave architecture comes in handy.
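
To make the read/write split concrete, here is a minimal Python sketch of a router that sends write statements to the master and rotates read statements across the slaves. The class name `ReplicatedRouter`, the server names, and the list of write verbs are assumptions made up for this illustration; a real application would usually rely on its database driver or a proxy for this.

```python
import itertools


class ReplicatedRouter:
    """Hypothetical helper: writes go to the master, reads rotate over slaves."""

    # Statements that change data and therefore must run on the master.
    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP")

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over the slaves; fall back to the master if there are none.
        self._readers = itertools.cycle(slaves or [master])

    def pick(self, sql):
        """Return the name of the server that should run this statement."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb in self.WRITE_VERBS:
            return self.master
        return next(self._readers)


router = ReplicatedRouter("master", ["slave-1", "slave-2"])
print(router.pick("INSERT INTO orders VALUES (1)"))  # master
print(router.pick("SELECT * FROM orders"))           # slave-1
print(router.pick("SELECT * FROM orders"))           # slave-2
print(router.pick("SELECT * FROM orders"))           # slave-1
```

Looking only at the first word of the statement keeps the sketch short. It ignores transactions and reads that must see the latest write, which in practice are often sent to the master as well.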

Master-slave replication can be either synchronous or asynchronous. The difference is when changes are propagated. With synchronous replication, a change is applied to the master and the slave at the same time, so a write is not complete until the slave has it too. With asynchronous replication, changes are queued on the master and written to the slaves later.
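
The timing difference can be sketched with a toy in-memory model. This is only an illustration under simplifying assumptions: the `Node` and `Master` classes, the `pending` queue, and the `flush` method are hypothetical names, and a real database ships changes over the network rather than writing into another object's dictionary.

```python
from collections import deque


class Node:
    """A toy database server that stores key/value pairs in memory."""

    def __init__(self):
        self.data = {}


class Master(Node):
    def __init__(self, slaves, synchronous):
        super().__init__()
        self.slaves = slaves
        self.synchronous = synchronous
        self.pending = deque()  # changes not yet shipped (async mode only)

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Synchronous: every slave gets the change before write() returns.
            for slave in self.slaves:
                slave.data[key] = value
        else:
            # Asynchronous: queue the change and let the slaves catch up later.
            self.pending.append((key, value))

    def flush(self):
        """Ship queued changes to the slaves (the 'later' step of async mode)."""
        while self.pending:
            key, value = self.pending.popleft()
            for slave in self.slaves:
                slave.data[key] = value


slave = Node()
master = Master([slave], synchronous=False)
master.write("balance", 100)
print(slave.data)  # {} because the change is still queued on the master
master.flush()
print(slave.data)  # {'balance': 100}
```

With `synchronous=True` the slave would already hold the value right after `write` returns, at the cost of every write waiting for the slaves.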

https://uploads.toptal.io/blog/image/127627/toptal-blog-image-1543512385306-1f627e3afafe9665f763469ba0a283d3.png

Pros of Master-Slave Database Architecture

  • Produce backups: the chain of slave databases gives you a reliable place to take backups from. A slave can be shut down without affecting the operations of the master, and because the live data is replicated onto the slaves, a copy of it still exists even after a failure in the master.
  • Scale out the application: as the number of users grows, it's important to keep the experience seamless for them. Master-slave architecture can be used to scale out your application by distributing the data load across multiple databases.
  • Spreading the load: There may be situations when we have a single master and want to replicate different databases to different slaves. For example, we may want to distribute different sales data to different departments to help spread the load during data analysis.
  • Improving replication performance: as the number of slaves connected to a master increases, the load on the master also increases, although only slightly per slave, because each slave uses a client connection to the master. Each slave must also receive a full copy of the master's binary log, so the network load on the master may grow and create a bottleneck. If a large number of slaves are connected to one master that is also busy processing requests (for example, as part of a scale-out solution), you may want to improve the performance of the replication process. One way to do that is a deeper replication structure: the master replicates to only one slave, and the remaining slaves connect to that first slave for their own replication.
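
The effect of a deeper replication structure on the master can be checked with a little arithmetic. The sketch below describes a topology as a plain dictionary and counts how many slaves replicate directly from each server. The server names (`relay`, `s1` and so on) and both helper functions are hypothetical and exist only for this illustration.

```python
def direct_replicas(topology):
    """Map each server to the number of slaves replicating directly from it."""
    return {server: len(children) for server, children in topology.items()}


def hops_from_master(topology, root, target):
    """Count replication hops from root to target, or None if unreachable."""
    frontier = [(root, 0)]
    while frontier:
        server, depth = frontier.pop()
        if server == target:
            return depth
        for child in topology.get(server, []):
            frontier.append((child, depth + 1))
    return None


# Flat: all six slaves connect to the master.
flat = {"master": ["s1", "s2", "s3", "s4", "s5", "s6"]}
# Deeper: the master feeds one slave ("relay"), which feeds the other five.
tiered = {"master": ["relay"], "relay": ["s1", "s2", "s3", "s4", "s5"]}

print(direct_replicas(flat)["master"])         # 6
print(direct_replicas(tiered)["master"])       # 1
print(hops_from_master(flat, "master", "s1"))    # 1
print(hops_from_master(tiered, "master", "s1"))  # 2
```

The master drops from six replication connections to one. The trade-off is that most slaves are now two hops away, so a change generally takes longer to reach them.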

Cons of Master-Slave Database Architecture

  • Write operations to the master are hard to scale: all write requests go to the single master, so one of the few ways to scale them is to increase the compute capacity (CPU and RAM) of the master database.
  • Asynchronous replication can lose changes: with asynchronous replication, the master commits a change without waiting for the slaves. If the master fails before the change has been sent, the slaves never receive it, so a change committed on the master may be missing from the slave nodes.
  • No automatic failover: if the master fails, a slave has to be promoted to take its place. The architecture does not do this on its own, so promotion is a manual step or needs extra tooling.
  • The binary log has to be read each time data is copied: each slave adds load to the master, because the master's binary log has to be read before changes are copied to that slave.
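
The failover and asynchronous-replication points can be illustrated together. A common approach when promoting by hand is to pick the slave that has applied the most changes, and then accept that anything the old master committed beyond that point is gone. The sketch below assumes each server reports its progress as a simple integer counter of applied changes. That counter and both function names are hypothetical simplifications, not real binary log coordinates.

```python
def promote_most_current(slave_positions):
    """Pick the slave that has applied the most of the master's changes."""
    if not slave_positions:
        raise ValueError("no slaves available to promote")
    return max(slave_positions, key=slave_positions.get)


def lost_changes(master_position, promoted_position):
    """Changes committed on the failed master that the new master never saw."""
    return max(0, master_position - promoted_position)


# Hypothetical state at the moment the master fails.
master_position = 1060
slave_positions = {"slave-1": 1040, "slave-2": 1052}

new_master = promote_most_current(slave_positions)
print(new_master)                                                  # slave-2
print(lost_changes(master_position, slave_positions[new_master]))  # 8
```

In this made-up scenario eight committed changes are lost even though the most up-to-date slave was chosen.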

--

Denny Lesmana

Senior Full Stack Engineer @ Ajaib | Tech & Investment Enthusiast | twitter: https://twitter.com/Denny_lesmanaa