DRBD and MySQL: Just Say No
I’ve successfully used MySQL statement-based replication for several years across data centers and understand its quirks.
While at the MySQL Conference, I tried to see how DRBD could help the installations I manage, but I just can’t drink the DRBD Kool-Aid.
MySQL Replication Pluses
- Free
- Easy to set up if you already have a backup and master position
- No shared storage to manage or corrupt
- Light network load
- Can use the master for reads and writes and the slaves for reads
- Can do maintenance on a slave (ALTER TABLE, etc.) and fail over to it afterwards
- Works well across the Internet, even with high latency
- Many replication problems are simple and can be fixed by hand
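To illustrate the "backup and master position" point, here is a minimal sketch of the statement a freshly restored slave runs to attach to its master. The helper name `change_master_sql`, the host name, the account and the binlog coordinates are all made up for illustration; the coordinates would normally come from whatever was recorded when the backup was taken.

```python
def change_master_sql(host, user, password, log_file, log_pos):
    """Build a CHANGE MASTER TO statement for a slave restored from a backup.

    Hypothetical helper: it only formats the SQL text and does not connect
    to any server.
    """
    def quote(value):
        # Minimal quoting for illustration only; a real script should let
        # the database driver handle escaping.
        return "'" + str(value).replace("\\", "\\\\").replace("'", "\\'") + "'"

    return (
        "CHANGE MASTER TO "
        f"MASTER_HOST={quote(host)}, "
        f"MASTER_USER={quote(user)}, "
        f"MASTER_PASSWORD={quote(password)}, "
        f"MASTER_LOG_FILE={quote(log_file)}, "
        f"MASTER_LOG_POS={int(log_pos)};"
    )


if __name__ == "__main__":
    # All values below are placeholders, not taken from a real installation.
    print(change_master_sql("db1.example.com", "repl", "secret",
                            "mysql-bin.000042", 107))
```

After running the generated statement on the slave, `START SLAVE;` begins replication from that position.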
MySQL Replication Minuses
- Slaves can and will drift out of sync with the master; this is typically noticed only after a few weeks, or sooner with a checksum tool such as Maatkit
- Changing masters requires rebuilding slaves
- A busy master always causes some replication lag
- No checksums or two-phase commit
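Because replication itself carries no checksums, drift has to be detected out of band. The sketch below shows the basic idea under simple assumptions: you have already collected per-table checksums (for example from `CHECKSUM TABLE`) on the master and on a slave, at the same replication position, and put them into two dictionaries. The function name `find_drift` and the sample table names and values are hypothetical. Maatkit's checksum tool does this far more carefully; this only illustrates the comparison step.

```python
_MISSING = object()  # marks a table that is absent on one side


def find_drift(master_checksums, slave_checksums):
    """Return the sorted names of tables whose checksums differ.

    Both arguments map a table name to its checksum. A table present on
    only one side is also reported as drifted.
    """
    tables = set(master_checksums) | set(slave_checksums)
    return sorted(
        name for name in tables
        if master_checksums.get(name, _MISSING) != slave_checksums.get(name, _MISSING)
    )


if __name__ == "__main__":
    # Made-up sample data for illustration.
    master = {"shop.orders": 3735928559, "shop.users": 1234}
    slave = {"shop.orders": 3735928559, "shop.users": 9999}
    print(find_drift(master, slave))
```

The comparison is only meaningful if the slave has caught up to the point at which the master's checksums were taken; otherwise ordinary replication lag shows up as false drift.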
DRBD is a low-level driver that copies a disk partition in near real time from a master to a failover node (a cold standby).
MySQL with DRBD Pluses
- Free
- No fsck or transaction log replay needed after a manual failover
- Slaves don’t need to be repointed with CHANGE MASTER TO unless DRBD fails
MySQL with DRBD Minuses
- Corruption of the DRBD partition leaves the failover node unusable (the usual disadvantage of shared storage), and failback could destroy the original master too
- If the master panics, both fsck and transaction log replay must be performed after failover
- More work to set up initially than statement-based replication
- NIC and network corruption is propagated to the standby as well
- The failover node is a cold standby and cannot accept any database traffic that would change the DRBD partition
- Can generate a lot of network traffic
- Cannot do maintenance on the cold standby database
- Needs two heartbeats on a reliable local network
I can see how MySQL/DRBD would be appealing for those who operate on a reliable network and don’t need Master-Master for load or maintenance, or who have many slaves that cannot easily be rebuilt.