‘Synchronous’, multi-master, auto-everything – that’s the new MySQL Group Replication (IPC14 talk/slides) in simple words. After torturing PHP developers for decades with MySQL Replication there is now a new replication option which does not require read-write splitting. A system that does not know about slave lags and reading stale data. In theory, MySQL Group Replication is just about the perfect approach to run a standard PHP application (WordPress, Drupal, …) on a small cluster (3-7 nodes) in LAN settings. In theory, MySQL Group Replication improves both availability and performance.
Distribution Transparency
When designing replication systems there are some desireable goals which contradict each other. In a perfect world, from a developers perspective, a database cluster would behave exactly the same way as a single database. The user should never have to worry where and how data is stored in the cluster. Transactions executed on the cluster would provide the same properties like transactions run on a standalone database. The cluster would never return stale data (synchronous).
Synchronous replication is desired but it requires coordination among cluster nodes. In LAN settings coordination can be reasonably fast. MySQL Group Replication is ‘synchronous’ replication (see slides for details). Deploy it on LAN only. In the internet, in WAN settings, when trying to replicate from Europe to Asia things will be slow. If WAN, then either forget about distribution transparency or performance. If WAN, go for asychronous MySQL Replication.
The extra work different clusters cause for the developer
Synchronous and asynchronous clusters always cause some extra work for the developer. Either approach requires load balancing and failover logic. An asynchronous approach adds: dealing with delays and stale reads. MySQL Replication is not only asynchronous but has only one master (primary). This adds: read-write splitting.
PECL/mysqlnd_ms tries to help with all these tasks and take them over in a semi-transparent way.
PECL/mysqlnd_ms support for synchronous clusters
PECL/mysqlnd_ms is a plugin for mysqlnd. PDO_MySQL and mysqli use mysqlnd as their default library to talk to MySQL. Any of the two APIs works with PECL/mysqlnd_ms, our load balancing and replication plugin.
The plugin monitors many API calls and aims to make using any kind of MySQL clusters easier. No matter what cluster: MySQL Replication, MySQL Cluster, MySQL Group Replication, 3rd party solutions. Example configurations are given in the PHP manual.
MySQL Group Replication usage task 1: load balancing
When moving an application from a single database server to a synchronous cluster there are two additional tasks: load balancing and failover. With PECL/mysqlnd_ms load balancing does no require any code changes. The plugin intercepts your connect calls and tests whether the host you connect to matches the name of a config entry. If so, the plugin loads the config, learns from the config which nodes there are and starts load balancing connection. Should you be too lazy to change the host name in your connects to match a PECL/mysqlnd_ms config entry, then just name the config entry after you current host names, have a config entry for ‘127.0.0.1’ etc.
$link = new mysqli("myapp", ...);
MySQL Group Replication usage task 2: failover
The second task is to handle the failure of a cluster node and connect to the next available one. PECL/mysqlnd_ms does that for you if you want. It picks an alternative from the config and connect you to it.
There’s a small feature gap here. MySQL Group Replication tries to be an auto-everything solution. It automatically detects failed nodes. It also fully automates adding new nodes to the cluster. That’s cool but it means that over time the set of nodes can change and your config needs to be updated.
The PECL/mysqlnd_ms feature gap
There are two options. First, you could deploy the config. Second, after a failover or periodically, we could make PECL/mysqlnd_ms fetch the list of nodes from the cluster and make it reconfigure itself (see also here). That’s finally possible because MySQL Group Replication shows the list of nodes in a performance schema table.
Once we did that, and MySQL Group Replication has reached GA, the auto-everything cluster for MySQL becomes real. All the stuff on the server side is already automatic. PECL/mysqlnd_ms is already GA and already handles all additional tasks – without code changes. A tiny addition is missing and you could even get an auto-deployed PECL/mysqlnd_ms…
Happy hacking!