Ulf Wendel

A missed load balancing opportunity?

Write-read sequences can be tricky for users of MySQL replication, if what has been written needs to be available for reading immediately. Slaves could be behind. You have to run the sequence on the master. If MySQL semisynchronous replication would tell the client which slave is in sync, a client library could select that slave for load balancing.

The more we look into load balancing for PECL/mysqlnd_ms, the PHP mysqlnd master-slave replication load balancing proof-of-concept plugin, the more restrictions and issues come up. An obvious restriction is that connections have a state.

No solution: binlog monitoring

But not only state is a "trouble maker". Critical write-read sequences can be another one. If the client must be able to "see" the result of a write send to the master immediately, the safest choice is to run the next read on the master. The downside is what the master gets some read load. Alternatively, the client could try to monitor replication binlog or replication lag to choose a slave who has already replicated the write. If it hasn’t been replicated yet, the load balancer could block the client for a wait period until a slave catches up before the load balancer sends the read to the master. Uuups, conceptually that is what semisynchronous replication does. Of course, the server does it in a much more elegant way. And, if the server told PECL/mysqlnd_ms which slave has catched up…

No solution: global transaction ID

Another solution to the critical write-read sequence is using a global transaction ID as proposed by Jan in "MySQL Proxy and a Global Transaction ID". The idea is to maintain some sort of transaction ID or record version which is replicated by the server. The load balancer will pick a slave that, for example, has already replicated the critical write by checking the global transaction ID.

Shall PECL/mysqlnd_ms use a global transaction ID?

PECL/mysqlnd_ms could do the magic of maintaining a global transaction ID in C. Let us know if you want this. To me it does not really sound as it is a bit of a hack, but who cares about hacks, if it solves problems…


  1. Hi Ulf

    The master-slave client you’re working on is really interesting.

    A global transaction is useful when replicating between many nodes. For instance if you need to promote one slave to become a new master, it should help a lot. II haven’t dived into the details, but I think MySQLatFacebook has a patch that does global transaction id? (On my todo list…)

    However, on this topic I’m wondering if you’re not over-optimizing things. Remember that the whole point of using read-only slaves is that you have a read-mostly workload. So if you have 99% reads and 1% writes, it means that *at most* you will have 1% such read-after-write that must be done to the master. If you have 5% writes you *at most* have another 5% such reads. At least in most applications this will be the case, there could of course be patological use cases that read a lot more.

    Of course, if you can scale-out another 5%, 10% or perhaps 20% of reads from the master to a slave, that could mean a 5-20% performance gain, and that is relevant. But if the solution introduces a lot of complexity into an otherwise elegant system, is it worth it?

    Consider that other alternatives like using HandlerSocket can give you 700% performance increase. Just to put things in perspective. Here’s a php client I’d like to have: In some efficient manner, try to catch SELECT statements that are simple key-value operations: “SELECT … FROM t WHERE id=?;” Then let the client intercept that SELECT and retrieve the data from HandlerSocket instead, and back on the client side return the data transparently so the application doesn’t know if HandlerSocket was used or not.

  2. Hi Henrik!

    I was hoping semisync could help identifying the new master in case of problems with the current master or, for example, with load balancing. But unfortunately semisync replication only tells me that one of the slaves is in sync. It does not tell me which although replication should have that information and it should be very easy to expose to the client. Though, no big deal, tiny, tiny optimization.

    Using a driver plugin to maintain a global transaction id is as easy as using MySQL Proxy for the task. If need be, it is simple to implement. I do not think the mysqlnd plugin should do it in C. One can hack it with few lines of PHP and, as you point out: over-optimization. The “critical write-read” sequence is an edge-case. But it is an edge-case I was asked about by users after presenting the idea of a load balancing replication plugin for PHP mysqlnd.

    There are many limits and edge-cases immanent to a load balancing replication driver plugin. My thinking is that PECL/mysqlnd_ms shall not try to handle them automatically – pretty much the opposite of your perception of my blog posting. I do not even want to do automatic and implicit failover. I want a 1.0 release ASAP.

    The HandlerSocket idea you are presenting is brilliant. I love it, I will steal it! Sharding, HandlerSocket and the like are next on the TODO list. First we had to play with "transparent" connection pools, learn about load balancing and failover. Now that we have the building blocks we can move forward to the real thing.

  3. Actually, I now realize what you propose is not correct. At least if I’ve understood it correctly.

    Semi-sync replication means:
    – data is copied to a slave, so it is redundant.
    – data is not yet applied on the slave, will be applied asynchronously.
    => You cannot safely select from the slave, even if you know which one it is. There can still be seconds behind master > 0.

  4. Yes, I’m afraid you are correct that I misunderstood the MySQL 5.5 semi-sync featrure. Semi-sync means synched to slave but not necessarily bubbled up to SQL layer.

    With the other point of people wanting some handling of “pick only a slave in sync”, I continue to believe that it is not an over-optimization. Lukas Smith, for example, commented that he is sticking to the master for the rest of the request after the first write has happened. Thus, any solution a driver could come up with should be welcomed by many. BTW, Lukas’ idea does quite sound to me: why not have a way to disable R/W split (and the load balancing associated with it) to mark both a connection or even a request as readonly…