PHP: 59 tuning screws for mysqlnd

2007/09/11 by admin

59 – that is the number of statistics collected by mysqlnd. When I started writing this blog post, I hoped it would be easy going describing them. But it wasn’t. Some ten days ago, we’ve had 51 statistics and near zero tests for them. Now we have 59 statistics. And, in a certain way, every figure is a tuning screw in the hand of one who knows what they do.

I must confess, I grossly underestimated the time it takes to write only a few words about some of them. Here is a first batch of comments on some 20 figures. Already those some 20 figures should give you an idea how the mysqlnd statistics might help you with bottleneck analysis, monitoring. Or you use the statistics to identify scripts that select more rows than they consume, open more connections than needed, …

How to access client statistics with mysqlnd

Statistics are only available with mysqlnd. Statistics can be accessed using:

Per process: phpinfo()
Per process: mysqli_get_client_stats()
Per connection: mysqli_get_connection_stats(link mysqli)

Note that if you compile PHP with ext/mysql and ext/mysqli and both of them use mysqlnd, then statistics hold aggregated values for ext/mysql calls and ext/mysqli calls. There will be one mysqlnd library in use and this one library will have one set of figures.

If you search the mysqlnd source, you will find another promising, new API call: mysqli_get_cache_stats(). This is not what you might guess, this is about internal caching of zvals. Forget about that for the moment. It is another five figures based on "expericadabra" for which I have no documentation yet… it’s internals you don’t want to care about, really.

mysqlnd statistics: (Network) Traffic

bytes_sent, bytes_received: Number of bytes sent to and received from the MySQL Server.
packets_sent, packets_received: Number of packets sent to and received from the MySQL Server. The MySQL Client/Server Protocol sends all messages in packets.
protocol_overhead_in, protocol_overhead_out: Communication protocol overhead in bytes for incoming (received) and outgoing (sent) messages. Based on the figures you can estimate the payload using protocol_overhead_[in|out] = payload + packets_[received|sent] * 4. However, this is a very rough estimation.

In most cases network traffic figures do not need to be monitored. Most of the time, you will be using fast network connections. Most of the time, applications will not read more data than needed. There might be still some applications doing SELECT * FROM mytable instead of SELECT column_you_need FROM mytable or trying to implement a join operation in PHP, but there should be few such applications – hopefully.

Reducing network traffic has two major advantages. If your network is very busy, you may be able to reduce latency. With less traffic, the data of each client can take a larger portion of the network capacity and be transferred faster. Faster fetching has another advantages: MySQL Server resources can be freed earlier. This might lower the load on the server.

mysqlnd statistics: Types of queries

result_set_queries

Number of queries which generated a result set. Examples:

SELECT
EXPLAIN

non_result_set_queries

Number of queries which did not generate a result set. Mostly DDL, examples:

ALTER, CREATE, DROP, RENAME TABLE
CREATE, DROP INDEX
ALTER, CREATE, DROP, RENAME, DATABASE
ALTER, CREATE, DROP SERVER
INSERT
UPDATE
DELETE
TRUNCATE TABLE
LOAD DATA INFILE

Statistics on the type of the queries you run, tell you if you have a read or write dominated work load. Also, the figures may help you identify scripts that run many queries and are potential bottle necks. For example, you can use the auto_prepend_file and auto_append_file php.ini directives to include a script which records the mysqlnd statistics for every PHP script . Then you can check the statistics in you auto_append_file and create a bottle neck report. You may even consider to calculate the average traffic for each type of message using the network traffic figures collected by mysqlnd.

Buffered vs. unbuffered result sets

buffered_sets, unbuffered_sets: Number of buffered resp. unbuffered result sets for "normal" queries. That is all queries generating a result set but not run as a Prepared Statement.
ps_buffered_sets, ps_unbuffered_sets: Number of buffered resp. unbuffered result sets for Prepared Statements. Cursors are considered unbuffered.

Result sets can be buffered or unbuffered. Using default settings, ext/mysql and ext/mysqli work with buffered result sets for "normal" queries. Buffered result sets are cached on the client. After the query execution all results are fetched from the MySQL Server and stored in a cache on the client. The cache is implemented as a buffer in libmysql resp. mysqlnd. Read mysqlnd saves 40% memory, finally (new tuning options) to find out more about the buffering. The big advantage of buffered result sets is that they allow the server to free all resources associated to a result set, once the results have been fetched by the client.

buffered result sets

Unbuffered result sets on the other hand are kept much longer on the server. If you want to reduce memory consumption on the client – but increase load on the server -, you go for unbuffered results. Once again, if you experience a high server load and the figures for unbuffered result sets are high, you should consider to move the load to the clients. Clients typically scale much better than server do. “Load” does not only refer to memory buffers. The server also needs to keep other things open, for example file handles and threads, before a result set can be freed. And fetching results with PHP is pretty slow, if you think in server dimensions. Thus, resources can be blocked for a long time on the server.

Prepared Statements use unbuffered result sets by default. However, you can use mysqli_stmt_store_result() to turn them into buffered result sets.

unbuffered result sets

mysqlnd statistics: Fetch statistics

Bad news for some of us, including me: there are separate figures for buffered and unbuffered result sets. You really need to turn your brain on. Good news for those who never cared about buffered vs. unbuffered: it is not difficult to learn. Look at the illustrations to recap what happens in case of buffered and unbuffered fetching.

Statistics for buffered result sets

buffered_sets, ps_buffered_sets

Number of buffered result sets generated by "normal" queries and generated by Prepared Statements.

rows_fetched_from_server_normal, rows_fetched_from_server_ps

Number of rows that have been fetched from the Server. This figure is relevant both with buffered and unbuffered result sets.

Buffering implies fetching all rows from the Server immediately after the query execution. If you have 10 rows in a result set, rows_fetched_from_server_[normal|ps], will grow by 10. These figures do not tell you if your script has
consumed all of the buffered rows.

rows_buffered_from_client_normal, rows_buffered_from_client_ps

Number of rows fetched from the Server and buffered on the client-side.

The figures will always grow by the size of your buffered result set. Similar to rows_fetched_from_server_[normal|ps] fetching all happens implicitly after the query execution.

Note that this figure, opposed to rows_fetched_from_server_[normal|ps], will only change if you deal with buffered result sets and not change when using unbuffered result sets. To calculate the number of rows generated by and fetched from the Server for unbuffered results you can do: rows_fetched_from_server_[normal |ps] – rows_buffered_from_client_[normal|ps].

rows_fetched_from_client_normal_buffered, rows_fetched_from_client_ps_buffered

Number of rows fetched by your script from the client-side buffer.

This is a figure of interest because it warns you if you generate rows which you do not need. Say you create a buffered result sets of 10 rows but you fetch only one row out of the 10 using mysqli_fetch_assoc() or any other mysqli_fetch_* function. All rows will be fetched from the Server and buffered on the Client implicitly. Statistics for fetched from Server and buffered from client will be increased by 10. But rows_fetched_from_client_[normal|ps]_buffered will grow only by one, because you have fetched only one row from the buffer (and transfered 9 rows from the Server which you never used).


$res = mysqli_query($link, "SELECT id FROM test LIMIT 10", MYSQLI_STORE_RESULT);
  // -> 10 = all rows get fetched from the Server
  // -> 10 = all rows get buffered from the Client
  // PS: MYSLQLI_STORE_RESULT is a default
$row = mysqli_fetch_assoc($res)
  // -> 1 row fetched from Client
mysqli_free_result($res);
  // -> 9 rows = rows_buffered_from_client_normal - rows_fetched_from_client_normal_buffered not used
  // -> 9 rows generated although not needed
  // -> 9 rows transferred although not needed

Rows skipped, flushed sets

Those two terms are not applicable for buffered result sets. That may seem confusing at first. But read below about unbuffered result sets to learn why.


$res = mysqli_query($link, "SELECT id FROM test LIMIT 10", MYSQLI_STORE_RESULT);
  // -> 10 = all rows get fetched from the Server
  // -> 10 = all rows get buffered from the Client
  // PS: MYSLQLI_STORE_RESULT is a default
$row = mysqli_fetch_assoc($res)
  // -> 1 row fetched from Client
mysqli_free_result($res);
  // -> nothing to fetch from the Server any more - no rows to skip
  // -> nothing to fetch from the Server any more - no lines to flush

Statistics for unbuffered result sets

unbuffered_sets, ps_unbuffered_sets

Number of unbuffered result sets generated by "normal" queries respectively Prepared Statements.

rows_fetched_from_server_normal, rows_fetched_from_server_ps

Number of rows that have been fetched from the Server. This figure is relevant both with buffered and unbuffered result sets.

For unbuffered result sets: this includes rows fetched by your script, and, if needed, rows fetched implicitly to clean the line. A client must fetch all results from a line before it can run another query on the line. If your script does not fetch all rows, PHP will take care of fetching what has not been fetched so far to clean the line.

rows_fetched_from_client_normal_unbuffered, rows_fetched_from_client_ps_unbuffered

Number of rows fetched from the Client with unbuffered result sets.

This includes all rows which your scripts have been fetched as well as those fetched implicitly when cleaning the wire for you (see above).

rows_skipped_normal, rows_skipped_ps

Number of rows generated by the Server but not read from the Client.

Note: when using buffered result sets, you always fetch all rows. The figure cannot change with buffered result sets. Therefore this figure is only of interest for unbuffered result sets.

Again, this is a number which you want to monitor to detect those cases when you let the Server compute more rows than you consume.


$res = mysqli_query($link, "SELECT id FROM test LIMIT 10", MYSQLI_USE_RESULT);
  // -> 10 rows generated by the Server
  // -> 0 = no rows fetched and buffered
$row = mysqli_fetch_assoc($res)
  // -> 1 row fetched from Server
  // -> 1 row fetched from Client
mysqli_free_result($res);
  // -> 10 - 1 = 9 rows skipped, rows_skipped_normal
  // -> 1 result set to be flushed
  // -> forcing 9 rows to be read silently to clean the line

flushed_normal_sets, flushed_ps_sets

Number of result sets with unread data which have been flushed silently for you.

Flushing means that PHP has read silently rows from the Server which you have generated on the Server but not fetched. The rows must be fetched before a new query can be run.

Flushing can only happen with unbuffered result sets. With buffered result sets you always fetch all rows from the Server.

PS: We are open for hear your suggestions on renaming figures. Maybe you have an idea to improve the naming: make it more clear, make it more consistent, … Personally, I do not like all of the current names. But that might be because I’m not a native english speaker and get some things from time to time.