High Availability Database Tools
by Pim van der Wal
Percona Live was held at the beginning of April this year and the Acquia team was there. This was a really great conference and I want to start by encouraging any DBA/database developer who uses MySQL in a production environment to go to the Percona Live conference. It's a great place to hear about new functionality, new products and how other companies are using MySQL at scale. Large companies like Google, Facebook and Twitter as well as hot startups like Box all send representatives to share their knowledge.
At Acquia we use the Percona version of MySQL as our standard data store for all hosted Drupal sites as well as for lots of internal projects. Since all our production servers use MySQL replication to provide high availability and our databases are continuously growing we're always looking to get better performance out of replication. The new replication enhancements in MySQL 5.6 and 5.7 are promising better performance but need some work in the application and tools to deliver on that performance. At this year's conference I decided to take a new look at the current state of external high availability and failover solutions to see what enhancements they could deliver and how much effort would be involved in implementing them.
I want to summarize and compare four of the more interesting products that are currently available. I will also point out if there are Drupal specific things to keep in mind when choosing between them. All these tools are open source and free.
MHA (MySQL High Availability) is an open source tool that provides high availability for MySQL using standard MySQL replication by automating failover and slave promotion. The Manager servers monitors the master and when it detects a failure situation it powers down that server, promotes the slave that is furthest along in the replication stream to be the new master and configures the other slaves to replicate from the new master. As part of this promotion incremental relay logs are applied to the slaves to bring them in sync with the new master.
Directing the clients to the new master is accomplished using a virtual ip but this can also be done by a third party tool like PaceMaker. Failover typically happens in 30 seconds or less but there is the chance of data loss because not all transactions have been received by the slaves. This can be mitigated by using MySQL semi synchronous replication which has a slight performance overhead.
The Percona Replication Manager is very similar to MHA in what the product accomplishes but it is designed to work with Pacemaker and Corosync. PaceMaker provides the clustering manager technology for failing between the nodes and Corosync provides the heartbeat. PRM also provides automatic slave promotion and a way to bring the other slaves back in sync.
Where MHA was originally built around MySQL slave promotion PRM was built around reliable failover of virtual IP's using PaceMaker. Both MHA and PRM were developed before Global Transaction identifiers were available in MySQL (starting in 5.6) but both products support GTID's in the current versions.
MySQL Fabric is a newcomer to the market and is being developed by Oracle as part of the MySQL Utilities. The approach is different from MHA and PRM in the sense that it does not rely on virtual IP's to switch clients to a new master. Instead MySQL Fabric comes with database drivers which contain monitoring and failover logic. To avoid the possibility of clients switching over at different times and causing data inconsistencies MySQL Fabric uses an administration node to tell the drivers when to fail over. This node is a single point of failure but if it becomes unavailable the application won't stop working if the primary database is still available. Only the failover functionality will be disabled at that point. Additionally this setup allows for distributed failure detection since all clients can reports failures.
The failover functionality consists of determining the most up-to-date slave, promoting that slave to be the new master and configuring all other slaves to replicate from the new master in a similar fashion to MHA and PRM.
In future releases MySQL Fabric will allow for integration with different replication technologies like Galera but at the time of the presentation only standard MySQL replication is supported. MySQL 5.6 is a requirement because failover relies on Global Transaction Identifiers.
Galera is very different from the previous 3 products in that it does not rely on standard MySQL asynchronous replication but instead implements (virtually) synchronous replication. This means that all updates are written to all nodes as part of the original transaction so there is no replication lag. This also means that inconsistent data is not possible because conflicts get blocked as part of the transaction. The downside is that there is extra latency for each transaction and potential conflicts are handled differently. The latency is caused by nodes having to communicate every transaction to all other nodes. This has been minimized to be very efficient but can still be noticeable on slow networks or WANs. Since all nodes are contacted in parallel the slowest connection is the determining factor for the latency. Potential conflicts between transactions are handled by sending a deadlock error to the client. This allows existing applications to work with minimal modifications since deadlocks should be retried by the application anyway.
A Galera cluster needs to have an uneven number of nodes to determine what happens if one or more nodes become unavailable. The portion of the the cluster that has quorum will continue without a problem but the other nodes will stop accepting queries. With only 2 nodes a network issue will stop the entire cluster and hence cause downtime. An extra arbitration node can be added to avoid problems with even numbers of nodes.
Galera works with MySQL but requires a patched executable which is provided in the download. Percona provides integration with Galera for Percona Server with Percona XtraDB Cluster which is basically a patched Percona Server with Galera.
So which one is best?
So which of these products is best and are there any issues with Drupal?
MySQL Fabric is the newest product and the current release is a release candidate rather than GA. It also relies on specific drivers (connectorPHP) that I have not seen used with Drupal. At this point I would not recommend using MySQL Fabric in production at this point in time simply because it's not GA yet and it has not proven itself in production.
MHA and PRM are very similar and have almost the same advantages and disadvantages. Both of them use standard MySQL replication so your application does not need to change but both of them can cause downtime and data loss can occur if you're not using semi-synchronous replication. PRM comes with the proven technology of PaceMaker and Corosync but that also adds a layer of complexity. PaceMaker can be hard to configure and manage.
Galera / Percona XtraDB Cluster introduces its own replication which comes with its own risks given how mature standard MySQL replication is but synchronous replication means that you don't lose data on failover and you simply point the clients at a different server without any downtime. You can even write to both servers at the same time without fear of data consistency issues. The downsides are the latency caused by network overhead and the fact that your application needs to handle deadlock errors correctly. Drupal does not retry deadlocks by default so this will require extra changes to avoid unnecessary errors.
All of this means that a lot depends on how sensitive your application is to data loss and errors and what you already have in your infrastructure. If you already run PaceMaker PRM is a good choice and if no data loss is a must than Galera / Percona XtraDB Cluster is a good choice but if you're replicating over a WAN and are not currently running PaceMaker I would suggest taking a look at MHA. At Acquia we are currently evaluating Percona XtraDB Cluster for a specific offering but for our regular hosting we have built our own tools to provide HA which have much in common with MHA. Of course we're always on the lookout for improvements in reliability and speed so we may end up switching ourselves.