Ulf Wendel

PECL/mysqlnd_ms: quality of service filter

What if your PHP application could tell the mysqlnd library what service quality you need when using a MySQL replication cluster? If you wanted read-your-writes, the driver would select replication nodes for you which can offer it. If you can allow replication lag but no more than three seconds, the driver would select… One function call and you get the service you need. That’s what version 1.2 of PECL/mysqlnd_ms is about.

The quality of service filter

In the world of PECL/mysqlnd_ms, the free and Open Source replication and load balancing plugin for mysqlnd, a so-called filter is responsible for choosing nodes for statement execution. A filter looks at the SQL statement to be executed and picks a capable server. The current production ready 1.1 release has three filters. Two load balancing filter (random, round-robin) and a user hook filter (user).

For example, if you run a SELECT statement, the filters of PECL/mysqlnd_ms ensure it ends up on a slave.

query(SELECT id FROM test)
|
mysqli PDO_MySQL mysql
mysqlnd library
PECL/mysqlnd_ms magic
|
  Master  
Slave 1 Slave 2 Slave 3

PECL/mysqlnd_ms 1.2 brings a new quality-of-service filter. The quality of service filter can be configured in the plugins configuration file but also at runtime. The latter is new and unique to this filter.

Reading from slaves no more than n seconds behind

Certain elements on a web site qualify for simple TTL based caching. Others don’t. One could say, that the quality of service demands for cacheable contents are lower than for non-cacheable. The same goes for SQL statements. Stale data may be OK, but it may not be older than n seconds, you may want to set a time-to-live, so to say.

MySQL Replication is asynchronous. It takes some time until a write operation on the master has been replicated to all slaves. Slaves may not always serve current data. The MySQL administrative statement SHOW SLAVE STATUS give a hint how many seconds a slave is behind the master (Seconds_Behind_Master). It is a rather rough estimation that MySQL makes but its the best built-in logic I am aware of. Please, check the MySQL reference manual for details.

  Master  
Slave 1 Slave 2 Slave 3
Seconds_Behind_Master = 0 Seconds_Behind_Master = 3 Seconds_Behind_Master = 5

Say your application is fine with reading from slaves that lag no more than three seconds behind the master. With PECL/mysqlnd_ms 1.2 it will be one line to set the required service quality.


mysqlnd_ms_set_qos($link, 
  MYSQLND_MS_QOS_CONSISTENCY_EVENTUAL,
  MYSQLND_MS_QOS_OPTION_AGE,
  3);       

After the function call, PECL/mysqlnd_ms checks for every statement nodes quality for executing the statement. If the statement is a write, the quality-of-service (qos) filter, returns a list of all masters. If the statement is a read, the qos filter returns all masters and all slaves for which Seconds_Behind_Master <= 3 is true. In the example, a SELECT statement would be run on the master, slave 1 or slave 2, depending on the choice of the load balancing filter. Slave 3 is not used because its replication lag is too high.

To be very clear here: the qos filter may execute SHOW SLAVE STATUS on all slaves. This is an expensive and slow operation. But it is the only option availablewith todays MySQL replication server features. One may try to cache the SHOW SLAVE STATUS results but there’s no fundamental change until the MySQL replication cluster can tell clients which nodes to use.

Don’t drop and forget the idea after this warning. The cool thing is, that if you tell PECL/mysqlnd_ms it may replace the slave access with a local TTL cache access… allow disabling the slow status query, focus on the caching idea, and … But that’s for sure not for 1.2.

Reading from slaves which have replicated a global transaction ID

Setting the maximum age using the qos filter gives you eventual consistency. Using global transaction ids, the qos filter can also offer session consistency or read-your-wrtites. PECL/mysqlnd_ms 1.2 can do global transaction id injection and the MySQL replication team has published a global transaction id feature preview release in October. Because the server preview is not fully functional, 1.2 focusses on its client-side emulation and injection.

$link->query("INSERT INTO test(id) VALUES (123)");
$gtid = mysqlnd_ms_get_last_gtid($link);

|
  Master  
  GTID = 27263  
Slave 1 Slave 2 Slave 3
GTID = 27263 GTID = 27251 GTID = 27263

Whoever does maintain the global transaction ids, some SQL exists to check if a node has replicated a certain id or not. The qos filter knows the SQL and checks if a node has replicated the transaction in question. If so, the node can be used to achieve read-your-writes.

mysqlnd_ms_set_qos(
  $link,
  MYSQLND_MS_QOS_CONSISTENCY_SESSION,
  MYSQLND_MS_QOS_OPTION_GTID,
  27263);
$link->query("SELECT * FROM test ");

|
  Master  
  GTID = 27263  
Slave 1 Slave 2 Slave 3
GTID = 27263 GTID = 27251 GTID = 27263

Is it worth it?

The price of the qos filter is high. Checking nodes for every statement is slow. However, if you hit scalability limits of your master because of too many read-your-write requests, there is no choice . PECL/mysqlnd_ms, a driver-based solution, does not do worse than an application-based solution. But, it takes work from you, the application developer.

If that’s not convincing, OK, but what about the caching idea…

Happy hacking!

@Ulf_Wendel

PS: Do you speak portuguese? If so, you may want to check out this PHP Conference Brasil presentation from Airton Lastori.

Comments are closed.