As MySQL multi-master products (Galera, Percona Xtradb Cluster/Galera, Tungsten Replication, etc.) proliferate, a question arises, “How do they support the notion of “The system of record”?
There’s 3 aspects of implementing the system of record (non-trivial in of multi-master systems):
- policy – the DBA must specify which database is the master for each dataset, what the failover steps are, and what to do in cases of data drift
- technology – the distributed system must enforce that the master receives all updates (aside from bugs, corruption and cosmic rays)
- operations – monitoring the distributed system and ensuring the policy is enforced. Early detection of network partitions and data drift is critical for data integrity in multi-master systems.
One of the biggest limitations of vanilla MySQL is that it is inherently single-master, as there is no in-flight conflict resolution logic to support two or more masters doing UPDATE/INSERT on the same rows.
But that is also ironically one of its strengths – single-master enforces one aspect of data integrity, the system of record, at the architectural level. Moving to a master-master product requires both conflict resolution and additional operational processes to handle data drift in both masters and slaves.
Product | Replication Technology | In-flight Conflict Resolution | Consistency Checking | Supported Storage Engines | CAP | One System of Record |
---|---|---|---|---|---|---|
MySQL 5.5 | Async & SemiSync, Statement, Row, Hybrid | No | pt-table-checksum | MyISAM, InnoDB | AP | Yes, Active Master |
MySQL 5.6 with GTID | Async & SemiSync, Statement, Row, Hybrid | No | pt-table-checksum | MyISAM, InnoDB | AP | Yes, Active Master |
Galera | Sync, Certification based Row with Statement DDL | Yes, Deadlock | pt-table-checksum | InnoDB | CA | No, but all nodes have same data? |
Percona Galera | Sync, Certification based Row with Statement DDL | Yes, Deadlock | pt-table-checksum | InnoDB | CA | No, but all nodes have same data? |
Tungsten Replication | Async, Custom | Yes, filters and tags | trepctl check | MyISAM*, InnoDB | AP | Yes, one master is defined by DBA in filters and tags |
* Tungsten supports the MyISAM storage engine, but does not recommend it because it is not crash-safe in several respects.
* Row-based replication is expected to inherently drift less than statement-based replication by most people.
* CAP is not a checkbox item. Please read “Problems with CAP, and Yahoo’s little known NoSQL system” for a deeper understanding of applying the CAP Theorem.
The System of Record in the Global Data Warehouse
wikipedia: System of Record
The Scale-Out Blog: The System of Record Approach to Multi-Master Database Applications
The Scale-Out Blog: If You *Must* Deploy Multi-Master Replication, Read This First
The Scale-Out Blog: Slouching towards Multi-Master Conflict Resolution
Multi-Master Replication Manager for MySQL
What’s wrong with MMM?
pt-table-checksum
More Galera lessons: parallel slave, out of order commits and deadlocks