Ulf Wendel

PHP: mysqlnd saves 40% memory, finally (new tuning options)!

mysqlnd saves memory. It consumes half as much memory as libmysql. This is what we have been convinced of. This is what we taught you. Then I tried to test it and made Andrey get nervous for a few hours… Meanwhile he is fine again and we can announce: mysqlnd saves memory, not only in theory, we tested it – we can proof it, can we?

The read-only variable trick

In theory it is so simple. libmysql is not part of PHP. If libmysql fetches any data from the MySQL Server, it puts the data into its own buffers. Then the data gets copied from the libmysql buffers into the ext/mysql resp. ext/mysqli data structures. Those data structures are “zvals“. “zval” is the name of the data structure that PHP uses internally – on the C level – to implement user variables. So, you have the data twice in memory: 1x inside the libmysql buffers and 1x inside mysqlnd. With mysqlnd you might have the data only once in memory. mysqlnd also uses buffers but links the user variables directly to the read buffers, whenever possible. The zvals that represent the user variables do not contain copies of the buffer, they try to work with pointers. Therefore mysqlnd sometimes consumes only half as much memory as libmysql. In the worst case it consumes up to the same amount, but often it is less. How much memory you save depends on the size of your (buffered) result set. If it is only 1kB, you will hardly be able to measure any difference. If you are fetching BLOBs or large result sets it can become more obvious.

Memory savings of mysqlnd explained

Using pointers works great as long as you do read-only operations on the results and don’t keep references to the results after calling mysqli_free_result(). Here is an example of code that will fully profit from the new read-only variable trick of mysqlnd.

<?php
$res = mysqli_query($link, 'SELECT id, label, data FROM mytable');
while ($row = mysqli_fetch_assoc($res)) {
   printf("%04d - %20s\n%s\n\n", $row['id'], $row['label'], $row['data']);
}
mysqli_free_result($res);
?>

By default mysqli_query() returns a buffered result set. On query execution all results get fetched from the MySQL Server into PHP buffers. The buffering is done to allow you to scroll in a result set using mysqli_data_seek(). If you look at the illustration, you will see a buffered result set with three rows. What mysqlnd does when you iterate over the rows of your buffered result set with mysqli_fetch_assoc() is return you pointers to the buffered data. Of course, you are using PHP and there aren’t any pointers in the user land, but that is what happens behind the scenes on the C-level inside PHP. mysqlnd links your user variable directly to the wire.

The linking trick does not work any more if you ever modify any of the $row hashes returned by mysqli_fetch_assoc($res). If you modify the $row hash that belongs to the first row, mysqlnd must not modify the buffered result set. It must not modify the buffered result set (the Row1 box in the picture), because the user might want to re-read it later again using mysqli_data_seek(). Instead of modifying the buffered result set, mysqlnd is forced to separate the zval of $row. During separation, a new zval will be created and gets a new value, for example the value you have assigned to it. Something similar happens if you continue using $row from the example after calling mysqli_free_result($res). Again, mysqlnd is forced to create a new zval. However, those new zvals are what the read-only trick tries to avoid. And the memory required for those new zvals makes the potential memory saving of mysqlnd over libmysql.

Memory limit works much better

Now you understand why mysqlnd can save up to 50% of memory. It is approximately “up to” and not exactly 50%. How much it is depends – among others – on your script. However, there is another benefit, the drawing explains it. Libmysql operates outside the Zend Engine and therefore it did bypass the PHP memory limit setting. With mysqlnd this problem is gone and you can fully control the memory usage.

Memory limit works

So much about the theory…

Measuring the anticipated advantages of mysqlnd be tricky. First problem is that you need large result sets, preferably you operate on BLOBs for demonstration purpose. Second problem is that you need to find the correct tool for the monitoring job. I tried to start simple using memory_get_usage() which “returns the amount of memory, in bytes, that’s currently being allocated to your PHP script.” .

Whenever PHP allocates memory it can use two functions for that: emalloc() and malloc(). malloc() is an operating system level call and emalloc() is a wrapper provided by the Zend Engine. The wrapper function is used to cache allocations, to provide garbage collection functionality and much more. Most of the time extensions and mysqlnd call emalloc() instead of malloc(). Therefore my first idea was to check how much memory ext/mysqli allocates with mysqlnd and with libmysql.

You guess it, … with the code checked in to the PHP 6 CVS, I saw mysqlnd eating much more memory than libmysql. It seemed that with mysqlnd, the script did never free memory. If I fetched a 1MB BLOB, freed the result with mysqli_free_result() and fetched another 2MB BLOB, freed it again with mysqli_free_result(), I noticed the usage of 1 + 2 = 3MB of memory! 2MB was to be expected. 2MB was the largest variable I created. The Zend Engine does not free the 2MB chunk of memory immediately (even on unset()) but cache the chunk for reusage by the script. But how comes I saw 3MB? To be honest it was even worse, because I did my testtest with BLOBs of 2^16, 2^17, … 2^24 bytes and saw about 64MB being used.

Andrey did hide for a while, updated his debug functionality (which we have not blogged about yet), and came back a day later with the solution: it’s been PHP streams increasing their network buffers… and we had allowed them to do so.

New settings to control network buffers

To solve the problem two new ini-settings have been created. One controls the size of the command network buffer and one the is about the size of the network read buffer. The command network buffer comes into action when you send a question to the server, for example a SELECT query. Both buffers are per-connection buffers. They get allocated when you create a new connection.

In the MySQL Client-Server protocol every command is send as a packet. A packet consists of a 4 bytes header, a 1 byte command and the payload, for example the SELECT statement. If the command buffer cannot hold the entire packet, mysqlnd allocates a new buffer temporarily for the one command, sends the command and frees the buffer. By default the command buffer has a size of 2048 bytes. That means, by default all queries with a maximum length of 2048 – 1 – 4 = 2043 bytes will fit into the pre-allocated command buffer and so not need to spend time for allocation and deallocation.

If a command does not fit into the buffer and slow per-command allocation and de-allocation happens, mysqlnd increases the counter command_buffer_too_small. The counter value is returned by the new, mysqlnd only functions mysqli_get_client_stats() and mysqli_get_connection_stats(mysqli $link). By help of the two functions you can monitor mysqlnd and fine-tune your application.

mysqlnd.net_cmd_buffer_size = 2048
mysqlnd.net_read_buffer_size = 32768

The other new buffer is mysqlnd.net_read_buffer_size which has a default value of 32.768 bytes. This buffer controls how many bytes mysqlnd fetches from the PHP streams with one call. If a result set has less than 32kB in size, mysqlnd will call the PHP streams network functions only once, if it is larger more calls are needed. There is no extra counter which tells you how many calls to the PHP streams functions have been required at the average. However, you can make some educated guesses based on the counter result_set_queries, bytes_received and buffered_sets. I have not done any performance testing with different read buffer sizes, but would expect that results would be very dependent on the underlying operating system and for example TCP/IP buffers used in the network layer of an operating system. Therefore: please run your own tests.

The new settings can set in the php.ini configuration file. Once PHP has started you can use only mysqli_options() (not ini_set()!) to change them during run time. Note that a change will affect only new connections but not change the settings of any already opened connections. The new constants to be used with mysqli_options() are
MYSQLI_OPT_NET_CMD_BUFFER_SIZE and MYSQLI_OPT_NET_READ_BUFFER_SIZE. Settings made with mysqli_options() overrule the ini-settings.

Monitoring the memory, next try

Now you are ready for testing mysqlnd. But don’t use the PHP6 CVS version of mysqlnd. We need to merge the latest stuff from the internal SVN to get rid of the stream buffer bug first. However, SVN is somewhat alpha compared to what we checked into PHP6 CVS and asked you to use.

I used two scripts to measure the memory consumption of mysqlnd and libmysql when fetching BLOB columns. One of the scripts creates some table data and the other script fetches the data. The use of two distinct scripts (and script runs) ensures that nothing but the fetch operation gets monitored, because the Zend Engine and its memory manager get restarted.

<?php
if (!$link = mysqli_connect('localhost', 'root', 'root', 'phptest'))
	die(sprintf("[%d] %s\n",
		mysqli_connect_errno(), mysqli_connect_error()));

if (!$res = mysqli_query($link, 'DROP TABLE IF EXISTS test'))
	die(sprintf("[%d] %s\n",
		mysqli_errno($link), mysqli_error($link)));

if (!$res = mysqli_query($link, 'DROP TABLE IF EXISTS test'))
	die(sprintf("[%d] %s\n",
		mysqli_errno($link), mysqli_error($link)));

if (!mysqli_query($link,
	'CREATE TABLE test(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, data MEDIUMBLOB)'))
	die(sprintf("[%d] %s\n",
		mysqli_errno($link), mysqli_error($link)));

if (!($stmt = mysqli_stmt_init($link)) ||
		!mysqli_stmt_prepare($stmt, 'INSERT INTO test(data) VALUES (?)'))
	die(sprintf("[%d] %s\n",
	mysqli_errno($link), mysqli_error($link)));


$data = str_repeat('a', 32768);
if (!mysqli_stmt_bind_param($stmt, 'b', $data))
	die(sprintf("[%d] %s\n",
		mysqli_stmt_errno($stmt), mysqli_stmt_error($stmt)));

for ($len = 32768; $len <= pow(2, 24); $len *= 2) {
	for ($i = 0; $i < ($len / 32768); $i++) {
		if (!mysqli_stmt_send_long_data($stmt, 0, $data))
			die(sprintf("[%d] %s\n",
				mysqli_stmt_errno($stmt), mysqli_stmt_error($stmt)));
	}
	if (!mysqli_stmt_execute($stmt))
		die(sprintf("[%d] %s\n",
			mysqli_stmt_errno($stmt), mysqli_stmt_error($stmt)));
}
mysqli_stmt_close($stmt);
mysqli_close($link);
?>

BLOB columns act as a synonym for “large result sets”. Please note that I have not run any tests with large result sets that consisted of many rows. The memory savings of mysqlnd (if any) might not be as large as with BLOB columns.

<?php
if (!$link = mysqli_connect('localhost', 'root', 'root', 'phptest'))
	die(sprintf("[%d] %s\n",
		mysqli_connect_errno(), mysqli_connect_error()));

$rusage = getrusage();
$before = array(
	'emalloc' => memory_get_usage(false),
);

for ($i = 1; $i <= 10; $i++) {
	$rusage = getrusage();
$before = array(
	'emalloc' => memory_get_usage(false),
);

	$sql = sprintf('SELECT id, data FROM test WHERE id = %d', $i);
	if (!$res = mysqli_query($link, $sql)) {
		die(sprintf("[%d] %s\n",
			mysqli_errno($link), mysqli_error($link)));
	}
	$row = mysqli_fetch_assoc($res);
	$tmp = getrusage();
	printf("%12s Bytes - emalloc %12s  - ru_nvcsw %12s\n",
		number_format(strlen($row['data'])),
		number_format(memory_get_usage(false) - $before['emalloc']),
		number_format($tmp['ru_nvcsw'] - $rusage['ru_nvcsw'])
	);
	mysqli_free_result($res);
}
mysqli_close($link);
?>

When running the scripts with mysqlnd and libmysql, you will get very disappointing results. How comes? Haven’t I been telling you that mysqlnd uses less memory.

libmysql
nixnutz@linux-eu6p:~/php5_mysqli> sapi/cli/php ../php5_mysqlnd/memory_fetch.php
      32,768 Bytes - emalloc       33,876  - ru_nvcsw            2
      65,536 Bytes - emalloc       32,892  - ru_nvcsw            2
     131,072 Bytes - emalloc       65,560  - ru_nvcsw            2
     262,144 Bytes - emalloc      131,072  - ru_nvcsw            3
     524,288 Bytes - emalloc      262,144  - ru_nvcsw            6
   1,048,576 Bytes - emalloc      524,288  - ru_nvcsw           14
   2,097,152 Bytes - emalloc    1,048,576  - ru_nvcsw          119
   4,194,304 Bytes - emalloc    2,097,152  - ru_nvcsw          121
   8,388,608 Bytes - emalloc    4,194,304  - ru_nvcsw          198
  16,777,215 Bytes - emalloc    8,388,604  - ru_nvcsw          425

mysqlnd
nixnutz@linux-eu6p:~/php5_mysqlnd> sapi/cli/php memory_fetch.php
      32,768 Bytes - emalloc       67,672  - ru_nvcsw            2
      65,536 Bytes - emalloc       66,668  - ru_nvcsw            7
     131,072 Bytes - emalloc      132,008  - ru_nvcsw           11
     262,144 Bytes - emalloc      263,092  - ru_nvcsw           19
     524,288 Bytes - emalloc      525,176  - ru_nvcsw            2
   1,048,576 Bytes - emalloc    1,049,512  - ru_nvcsw            2
   2,097,152 Bytes - emalloc    2,098,040  - ru_nvcsw          123
   4,194,304 Bytes - emalloc    4,195,192  - ru_nvcsw            3
   8,388,608 Bytes - emalloc    8,389,496  - ru_nvcsw            4
  16,777,215 Bytes - emalloc   16,778,104  - ru_nvcsw           10

valgrind: 120MB (libmysql) vs. 71MB (mysqlnd)

Don’t put too much hope and trust into memory_get_usage(). Put PHP in a sandbox and let Valgrind do the monitoring. Valgrind is a suite of tools for debugging and profiling Linux programs. Among others it can record memory allocation statistics. And here you go: 121,160,787 bytes allocated with libmysql (100%) – 72,467,084 bytes allocated with mysqlnd (59.8%)!

libmysql
nixnutz@linux-eu6p:~/php5_mysqli> valgrind sapi/cli/php ../php5_mysqlnd/memory_fetch.php
[...]
  16,777,215 Bytes - emalloc    8,388,604  - ru_nvcsw            1
==5979==
==5979== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 3 from 1)
==5979== malloc/free: in use at exit: 0 bytes in 0 blocks.
==5979== malloc/free: 11,131 allocs, 11,131 frees, 121,160,787 bytes allocated.
==5979== For counts of detected errors, rerun with: -v
==5979== All heap blocks were freed -- no leaks are possible.
mysqlnd
nixnutz@linux-eu6p:~/php5_mysqlnd> valgrind sapi/cli/php memory_fetch.php
[...]
  16,777,215 Bytes - emalloc   16,778,104  - ru_nvcsw            1
==5983==
==5983== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 3 from 1)
==5983== malloc/free: in use at exit: 0 bytes in 0 blocks.
==5983== malloc/free: 11,339 allocs, 11,339 frees, 72,467,084 bytes allocated.
==5983== For counts of detected errors, rerun with: -v
==5983== All heap blocks were freed -- no leaks are possible.

As a final test you can disable the Zend Memory Manager: USE_ZEND_ALLOC=0 valgrind sapi/cli/php memory_fetch.php and execute the script again. For me things remained roughly as before with the one exception that the PHP 5 binary using mysqlnd got up to around 85MB. However, that’s still less than 120MB of libmysql and usually PHP does use the Zend Engine and it’s Memory Manager…

As usual: run your own tests! In another test I tried to monitor the size of Apache worker processes fetching rather small result sets. Guess what: I’ve been not able to see any difference. These new tricks of mysqlnd might or might not be perfect for you. Try it out. The new MySQL native driver for PHP might make a significant difference for you.

8 Comments