Some throttling for PECL/mysqlnd_ms 1.4

2012/05/15 by admin

Users of MySQL Replication sometimes throttle client requests to give slaves time to catch up to the master. PECL/mysqlnd_ms 1.4, the current development version, features some throttling through the quality-of-service filter and global transaction identifier (GTID). Both the plugins client-side GTID emulation and the MySQL 5.6 built-in GTID feature can be used to slow down PHP MySQL requests, if wanted.

How its done

The replication plugin has a neat feature called quality-of-service filter. If, for example, the quality of service you need from a MySQL Replication cluster is "read your writes", you call mysqlnd_ms_set_qos($connection, MYSQLND_MS_QOS_CONSISTENCY_SESSION). This instructs the plugin to either use a master or a slave, that has replicated your writes already, for all further reads. The plugin takes care of picking an appropriate cluster node. Once you are done with "read your writes" you can relax the service quality to make node selection faster.

By default, MYSQLND_MS_QOS_CONSISTENCY_SESSION will enforce reading from the master. This is undesired as it increases the load on the master. However, before the introduction of global transaction identifiers, there was no safe way of knowing whether a slave had replicated a certain update already or not.

Using either the plugins GTID emulation or the MySQL 5.6 build-in GTID feature, one can reliably check the up-to-date status of a slave using a SQL SELECT statement. GTIDs are some kind of unique transaction sequence numbers. If you know the transaction sequence number of a write operation, you can check whether it has been replicated using a statement like, for example, SELECT GTID_SUBSET('gtid_of_write', @@GLOBAL.GTID_DONE) AS trx_id FROM DUAL. Please, check my previous posts for a more precise description of the GTID feature. This statement will check the replication status and return immediately.

SQL_THREAD_WAIT_AFTER_GTIDS(string gtids [, timeout])

Alternatively, a MySQL 5.6 user can issue SELECT SQL_THREAD_WAIT_AFTER_GTIDS('gtid_of_write') which will block until either the slave has replicated the write in question or the statement times out. This is great to throttle clients and prevent them to send new updates before the slaves have caught up. This is what some throttling is about. You can control which logic PECL/mysqlnd_ms shall use when searching for an up-to-date slave.

Strictly speaking, you could do it in 1.3 already, if using MySQL 5.6, which is not GA yet. GTIDs are opaque to the plugin. The configuration contains a SQL statement to fetch gtid_of_write and one to check if gtid_of_write has been replicated already. You have been free to either use SQL_THREAD_WAIT_AFTER_GTIDS or not for those config settings.

The new bit

New is a wait_for_gtid_timeout setting that can be used with the GTID emulation. If wait_for_gtid_timeout is set, the plugin will poll a slaves state for wait_for_gtid_timeout seconds regardless of the SQL statement configured. The plugin first runs the SQL statement to check if gtid_of_write has been replicated already. If not, it checks if there is time left for another poll attempt, sleeps for second and polls the status again.

All this is done transparently in the background. All the application does is formulate its quality of service needs.

Throttling makes synchronization costs visible

Throttling client requests should not be understood as a hack. MySQL Replication happens to be a lazy primary copy system. All updates must be performed on the primary (master). Synchronization of secondaries (slaves) is lazy. Update transactions are finished once the primary has finished them. An update transaction never waits for secondaries to catch up. This is an easy to implement, often fast and simple approach.

The drawbacks are temporarily stale data on the secondaries and limited gains in availability over a single server.

Clients get confirmation for update transactions as soon as they are finished on the primary. As soon as they are saved on just one server. There is no guarantee that the transaction ever makes it to a secondary. In the unlikely worst case, the primary crashes in an unrecoverable manner and transactions are lost before being replicated. Thus, little gain over a single server. If you don’t want that, you need eager synchronization. This is what MySQL Cluster offers, if you want it. For eager synchronization one needs to slow down updates and wait for one or all replicas to confirm the update. If updating one secondary is all you need, go for MySQL Semisynchronous Replication.

It may not be technically valid comparison, however, the slow down and wait reminds me of throttling. MySQL lets you choose whether you want to do the wait on demand and on the client side (MySQL Replication: lazy synchronization) or built-in to the distributed system (MySQL Cluster: eager synchronization). Putting all this in a matrix shows the wide range of database replication options that MySQL has to offer.

		Update: where
		Primary copy	Update anywhere
Synch.: when	Eager	(MySQL Semisynchronous Replication – one secondary eager, rest lazy)	MySQL Cluster
Synch.: when	Lazy	MySQL Replication	(MySQL Cluster WAN mirror using MySQL Replication – automatic conflict detection)

Happy hacking!

@Ulf_Wendel