Ulf Wendel

The performance penalty of the early MySQL Fabric support for PHP

| 1 Comment

PECL/mysqlnd_ms 1.6 is currently being modified to support sharding and fully automatic server and client failover when using MySQL Fabric (slides) to manage a farm of MySQL servers. PECL/mysqlnd_ms is a mostly transparent load balancer that works with all PHP MySQL APIs (PDO_MySQL, mysqli, …). The idea is, that if, for example, a MySQL server fails, the plugin talks to MySQL Fabric to learn about alternative servers that Fabric has provisioned automatically. This “talks to” gives implies a performance penalty for applications. One worth looking at, to understand it. One worth looking at, to have a success story once the early implementation is gone and replaced with a proper one ;-).

Behind the scenes…

How exactly a “Fabric aware” driver or application talks to Fabric is implementation dependent. Figures given for PHP must not be used to extrapolate behaviour of Connector/J or Connector/Python. Only remarks about Fabric itself apply to all.

Let’s assume you want to use MySQL Fabric and PHP for sharding. Fabric takes care of all the server side stuff: splitting, merging, monitoring shards and so forth. The PHP application ensures that queries end up on the appropriate shards by hinting the driver which shard to use. In the PHP case, the “driver” is the PECL/mysqlnd_ms plugin for mysqlnd. The plugin exports a function mysqlnd_ms_select_shard() for hinting it.

$mysqli = new mysqli("myapp", "user", "password", "database");
mysqlnd_ms_select_shard($mysqli, "mydb.mytable", "key");
$mysql->query("INSERT INTO mytable(col1) VALUES ('abc')");

This tiny snippet triggers a huge machinerie: from you application it goes to the plugin. Then, the plugin calls Fabric via XML RPC over HTTP and waits for a reply. Once Fabric has replied, the plugin makes the connection handle point to the shard.

Client   Fabric
mysqlnd_ms_select_shard(link, table, key)  
PECL/mysqlnd_ms *.c XML RPC over HTTP: sharding.lookup_server(table, key, LOCAL) ->  
  HTTP worker thread
  Executor
  <- XML reply HTTP worker thread
PECL/mysqlnd_ms: make $link use shard
mysqli_query($link, …)

The hotspots

Switching a connection from one server to another takes some effort. The current implementation will simply replace an plugin internal list of servers. This is a very fast operation. No new MySQL connection is opened yet. By default, lazy connections are used and the connect to the shard is delayed until the application issues a query. Let’s consider this a cheap step, that can be marked green below.

Client   Fabric
mysqlnd_ms_select_shard(link, table, key)  
PECL/mysqlnd_ms *.c XML RPC over HTTP: sharding.lookup_server(table, key, LOCAL) ->  
  HTTP worker thread

Asking Fabric for a set of one master any number of additional slaves that make a shard is expensive. Upon every call to mysqlnd_ms_select_shard(), PECL/mysqlnd_ms opens a HTTP connection to Fabric to make an XML remote procedure call. Future version of the mysql plugin will do less calls, but that’s not the point. The expensive operation is the HTTP connection established using PHP streams. Say, you do one INSERT on a shard, then the INSERT carries significant overhead. Recall, I am discussing implementation details that must and will be tuned…

  • mysqlnd_ms_select_shard()
    • HTTP connect and round-trip to Fabric for XML RPC
    • switch user connection handle to shard(cheap)
  • mysqli_query()
    • connect MySQL shard
    • if needed and possible align connection state

The overhead can be measured using the plugins performance statistics. Set the PHP configuration directive mysqlnd_ms.collect_statistics=1 to enable the collection of statistics. Here’s the outline of a script that helps you detect what the plugin does and where it spents time in a MySQL Fabric sharding scenario. Please, see the manual for to setup the plugin to use Fabric.

$stats = mysqlnd_ms_get_stats();
$start = microtime(true);

$link = mst_mysqli_connect("myapp", $user, $passwd, $db, $port, $socket);
mysqlnd_ms_fabric_select_shard($link, "fabric_sharding.test", 1));
$now = mysqlnd_ms_get_stats();
foreach ($stats as $k => $v) {
 if ($now[$k] != $v) {
   printf("%s: %s -> %s\n", $k, $v, $now[$k]);
  }
}
$stats = $now;

var_dump($link->query("UPDATE test SET id = 1 WHERE id = 1"));
$now = mysqlnd_ms_get_stats();
foreach ($stats as $k => $v) {
  if ($now[$k] != $v) {
    printf("%s: %s -> %s\n", $k, $v, $now[$k]);
  }
}

printf("Runtime %.fs\n", microtime(true) - $start);

When run in a VM on my notebook it will print something like this. The UPDATE on a shard took in total some 0.038 seconds. Asking Fabric which shard to use took 31535 / 1000000 = 0.0315 seconds. 450 bytes have been transferred from Fabric to the plugin to learn about the shard.

bool(true)
fabric_sharding_lookup_servers_success: 0 -> 1
fabric_sharding_lookup_servers_time_total: 0 -> 31535
fabric_sharding_lookup_servers_bytes_total: 0 -> 450
bool(false)
use_master: 0 -> 1
use_master_guess: 0 -> 1
lazy_connections_master_success: 0 -> 1
Runtime 0.038586s
done!

Any slowdown you like can be provoked: 83x slower!

On my VM, a loop that executes mysqlnd_ms_select_shard() followed by an UPDATE 2000x times runs 83x slower than UPDATE taken when connecting to the shard without Fabric (but with the plugin loaded). The Fabric code takes 23.4 seconds in total and wastes 22.9 seconds on XML RPC. A plain UPDATE on non Fabric connection to the shard takes 0.28 seconds only! The difference between 23.4 – 22.9 = 0.5 and 0.28 seconds is down to the 2000x connects done by the plugin as part of the server/shard switch.

The Fabric hotspot

It certainly does not take 22.9 seconds to send 2000 HTTP replies of 450 bytes. Let’s ask Fabric to show what it does by setting the debug logging level to DEBUG in the Fabric configuration file.

[logging]
level = DEBUG

This is the the result. For every call to mysqlnd_ms_select_shard(), PHP performs one XML RPC and each XML RPC triggers more than 10 SQL queries within Fabric!

[DEBUG] 1394661746.143667 - XML-RPC-Server - Enqueuing request (<socket._socketobject object at 0x1e718a0>) from (('127.0.0.1', 53951)) through thread (<SessionThread(XML-RPC-Session-4, started daemon 140595192383232)>).
[DEBUG] 1394661746.143837 - XML-RPC-Session-4 - Processing request (<socket._socketobject object at 0x1e718a0>) from (('127.0.0.1', 53951)) through thread (<SessionThread(XML-RPC-Session-4, started daemon 140595192383232)>).
[DEBUG] 1394661746.144319 - XML-RPC-Session-4 - Started command (LookupShardServers).
[DEBUG] 1394661746.144816 - XML-RPC-Session-4 - Statement (SELECT sm.shard_mapping_id, table_name, column_name, type_name, global_group FROM shard_tables as sm, shard_maps as smd WHERE sm.shard_mapping_id = smd.shard_mapping_id AND table_name = %s), Params(('fabric_sharding.test',)).
[DEBUG] 1394661746.146071 - XML-RPC-Session-4 - Statement (SELECT sr.shard_mapping_id, sr.lower_bound, s.shard_id FROM shard_ranges AS sr, shards AS s WHERE %s >= CAST(lower_bound AS SIGNED) AND sr.shard_mapping_id = %s AND s.shard_id = sr.shard_id ORDER BY CAST(lower_bound AS SIGNED) DESC LIMIT 1), Params(('1', 1)).
[DEBUG] 1394661746.147233 - XML-RPC-Session-4 - Statement (SELECT shard_id, group_id, state FROM shards WHERE shard_id = %s), Params(('1',)).
[DEBUG] 1394661746.148331 - XML-RPC-Session-4 - Statement (SELECT group_id, description, master_uuid, status FROM groups WHERE group_id = %s), Params(('sharding1_shard1',)).
[DEBUG] 1394661746.149338 - XML-RPC-Session-4 - Statement (SELECT server_uuid, server_address, user, passwd, mode, status, weight FROM servers WHERE group_id = %s), Params(('sharding1_shard1',)).
[DEBUG] 1394661746.150462 - XML-RPC-Session-4 - Statement (SELECT @@GLOBAL.SERVER_UUID as SERVER_UUID), Params(()).
[DEBUG] 1394661746.151100 - XML-RPC-Session-4 - Statement (SELECT @@GLOBAL.SERVER_ID as SERVER_ID), Params(()).
[DEBUG] 1394661746.151648 - XML-RPC-Session-4 - Statement (SELECT @@GLOBAL.VERSION as VERSION), Params(()).
[DEBUG] 1394661746.152203 - XML-RPC-Session-4 - Statement (SELECT @@GLOBAL.GTID_MODE as GTID_MODE), Params(()).
[DEBUG] 1394661746.152757 - XML-RPC-Session-4 - Statement (SELECT @@GLOBAL.LOG_BIN as LOG_BIN), Params(()).
[DEBUG] 1394661746.153374 - XML-RPC-Session-4 - Statement (SELECT @@GLOBAL.READ_ONLY as READ_ONLY), Params(()).
[DEBUG] 1394661746.153820 - XML-RPC-Session-4 - Connected to server with uuid (80716d72-9302-11e3-817c-000c299b2a06), server_id (3307), version (5.6.16-log), gtid (True), binlog (True), read_only (False).
[DEBUG] 1394661746.153932 - XML-RPC-Session-4 - Disconnecting from server with uuid (80716d72-9302-11e3-817c-000c299b2a06), server_id (3307), version (5.6.16-log), gtid (True), binlog (True), read_only (False).
[DEBUG] 1394661746.154039 - XML-RPC-Session-4 - Finished command (LookupShardServers).
[DEBUG] 1394661746.154313 - XML-RPC-Session-4 - Finishing request (<socket._socketobject object at 0x1e718a0>) from (('127.0.0.1', 53951)) through thread (<SessionThread(XML-RPC-Session-4, started daemon 140595192383232)>).

Better do not compare that with the tiny PHP snippet used to select a shard…

Client   Fabric
mysqlnd_ms_select_shard(link, table, key)  
PECL/mysqlnd_ms *.c XML RPC over HTTP: 22.9 seconds  
  > 10 SQL queries against Fabric backing store
Connect to shard: 0.2s
UPDATE: 0.3s

Ghosts from the past

Of course, we will fix that for PHP even before releasing an alpha! Of course, that will make a nice story with a catchy “80x faster” title!

It is not necessary for a client to contact every time a shard server needs to be identified. There are other XML remote procedure calls that can be used. The reason why PHP has gone for this way initially is simple: the other XML RPCs have not been ready when the code was first written. Future versions (and the other drivers) do ask Fabric only once for a complete listing of all shards. The RPC overhead will then vanish in the overall runtime.

Client   Fabric
mysqli_connect(“fabric”, …)  
PECL/mysqlnd_ms *.c XML RPC over HTTP: get all shards ->  
  HTTP worker thread
  Executor
  <- XML reply HTTP worker thread
mysqlnd_ms_select_shard(link, table, key)  
PECL/mysqlnd_ms: make $link use shard
mysqli_query($link, …)
mysqlnd_ms_select_shard(link, table, key)  
PECL/mysqlnd_ms: make $link use shard
mysqli_query($link, …)

How big the performance impact of such a driver is, cannot be answered in general. It depends mainly on the question how often the once fetched shard information can be reused and how expensive the one RPC is in comparison to the normal queries issues. Tipps for benchmarking PHP have been given – please, run your own tests.

Happy hacking!

@Ulf_Wendel Follow me on Twitter

PS: We recently moved some Connector manuals out of the main MySQL reference manual to their own places. For PHP we mirror the MySQL bits from the PHP, as ever. The mirrored PHP MySQL manual can be downloaded as a PDF from dev.mysql.com. It already shows some PECL/mysqlnd_ms Fabric notes whereas the PHP mirrors have not yet caught up to the PHP documentation staging server – http://docs.php.net/manual/en/book.mysqlnd-ms.php.

One Comment

  1. Pingback: Ulf Wendel: The performance penalty of the early MySQL Fabric support for PHP | facebooklikes

Leave a Reply

Required fields are marked *.