There’s 3 aspects of implementing the system of record (non-trivial in of multi-master systems):
- policy – the DBA must specify which database is the master for each dataset, what the failover steps are, and what to do in cases of data drift
- technology – the distributed system must enforce that the master receives all updates (aside from bugs, corruption and cosmic rays)
- operations – monitoring the distributed system and ensuring the policy is enforced. Early detection of network partitions and data drift is critical for data integrity in multi-master systems.
One of the biggest limitations of vanilla MySQL is that it is inherently single-master, as there is no in-flight conflict resolution logic to support two or more masters doing UPDATE/INSERT on the same rows.
But that is also ironically one of its strengths – single-master enforces one aspect of data integrity, the system of record, at the architectural level. Moving to a master-master product requires both conflict resolution and additional operational processes to handle data drift in both masters and slaves.
|Product||Replication Technology||In-flight Conflict Resolution||Consistency Checking||Supported Storage Engines||CAP||One System of Record|
|MySQL 5.5||Async & SemiSync, Statement, Row, Hybrid||No||pt-table-checksum||MyISAM, InnoDB||AP||Yes, Active Master|
|MySQL 5.6 with GTID||Async & SemiSync, Statement, Row, Hybrid||No||pt-table-checksum||MyISAM, InnoDB||AP||Yes, Active Master|
|Galera||Sync, Certification based Row with Statement DDL||Yes, Deadlock||pt-table-checksum||InnoDB||CA||No, but all nodes have same data?|
|Percona Galera||Sync, Certification based Row with Statement DDL||Yes, Deadlock||pt-table-checksum||InnoDB||CA||No, but all nodes have same data?|
|Tungsten Replication||Async, Custom||Yes, filters and tags||trepctl check||MyISAM*, InnoDB||AP||Yes, one master is defined by DBA in filters and tags|
* Tungsten supports the MyISAM storage engine, but does not recommend it because it is not crash-safe in several respects.
* Row-based replication is expected to inherently drift less than statement-based replication by most people.
* CAP is not a checkbox item. Please read “Problems with CAP, and Yahoo’s little known NoSQL system” for a deeper understanding of applying the CAP Theorem.
The System of Record in the Global Data Warehouse
wikipedia: System of Record
The Scale-Out Blog: The System of Record Approach to Multi-Master Database Applications
The Scale-Out Blog: If You *Must* Deploy Multi-Master Replication, Read This First
The Scale-Out Blog: Slouching towards Multi-Master Conflict Resolution
Multi-Master Replication Manager for MySQL
What’s wrong with MMM?
More Galera lessons: parallel slave, out of order commits and deadlocks