Ulf Wendel

PHP: New network traffic, CPU and memory savings with mysqlnd

BIT and CPU squeezing with Facebook, Tuenti and Andrey – for the die hard experts… The MySQL Client Server Protocol supports two flavours of sending database results to the client send as text and as binary. Unlike the text protocol, the binary protocol avoids converting data into strings whenever possible. For example, the binary protocol send a TINYINT column value as one byte and not as a length coded string of a length of one to four bytes. This saves CPU cycles on the server by avoiding a cast operation and it saves at least one byte of network traffic – sometimes. Furthermore, ext/mysqli will use an (PHP) integer variable instead of a string to hold the column value. This saves memory when caching database results in APC (see below for memcached). A MySQL server patch and an experimental mysqlnd branch allows you to use the binary protocol for all statements – not only prepared statements. Furthermore, you can get native types support even without touching the server.

Here is an example of the difference it makes to store a hash in APC which uses strings to hold integer values versus a hash that uses integers to hold the same integer values. If the hash would have been created by fetching results from MySQL using mysqli_query(), it would store all the integers using strings and it would require 102 kB. The same database result would require only 100 kB, if ext/mysqli would return integers as PHP integers and not as strings. Not a big deal, but this is just one example and even small differences can sum up. We have new fine tuning screws to play with…

$integers = array();
$strings = array();
for ($i = 0; $i < 169; $i++) {
  $integers[$i] = array($i, $i, $i);
  $strings[$i] = array((string)$i, (string)$i, (string)$i);
}

apc_clear_cache("user");

apc_store("integers", $integers, 10);
apc_store("strings", $strings, 10);

$info = apc_cache_info("user");
foreach ($info['cache_list'] as $details) {
	printf("Cache entry %-10s requires %d bytes\n", $details['info'], $details['mem_size']);
}

Cache entry integers   requires 100318 bytes
Cache entry strings    requires 102014 bytes

Contents


The basic idea

The basic idea behind this latest experimental branch of the MySQL native driver for PHP (mysqlnd) is as simple as: work less. To work less do the following:

  1. Avoid data type conversions
    • on the MySQL Server
    • within PHP on the C level
    • in the PHP userland, in a PHP application
  2. Use the most compact storage format
    • to save network bandwidth when sending data from the MySQL Server to PHP
    • to save memory in a PHP application
    • to save memory in a clever PHP main memory cache such as APC
    • to do as few memory allocation operations as possible

And it is simple to do less work: use Prepared Statements! All of the above advantages are a side-effect of the "new" communication protocol introduced in MySQL 4.1. The "new" MySQL Client Server protocol will only be used with Prepared Statements. The "new" protocol is also called binary protocol. Binary refers to how the payload, the result set column values, are encoded for being send over the network. The value encoding is the only major difference between the "new" binary and the "old" text protocol.

Before you start using only Prepared Statements in your PHP application read PDO_MYSQLND: Prepared Statements, again. Prepared Statements have their drawbacks either. The same is true for the binary protocol. More later.

In the "old" text protocol all column values are converted to strings before sending them. The conversion from the MySQL Server internal data type representation of a column value to string takes CPU cycles on the server. And it may take CPU cycles again on the client, if a PHP application converts the value back from string to its native value. When using the binary protocol string conversions are avoided whenever possible. For example, an integer will remain an integer all the time. That’s 1. Avoid data type conversion.

The text protocol sends all data as strings. The strings are "length encoded". A length encoded string consists of two parts: the length and the string itself. The length requires at least one byte, followed by n-bytes for the string. For example, the integer value 123456 will result in a 7 bytes long length encoded string. 1 byte is required for encoding the length (6 characters) and another 6 bytes are required for the 6 characters "1", "2", "3", "4", "5", "6". In binary protocol the integer gets send as an integer requiring always 4 bytes. In the example you will save 3 bytes (~ 40% !) of network traffic. That’s one half of 2. Use the most compact storage format.

The second half of 2. Use the most compact storage format is what Facebook and Tuenti have been after. The PHP variable $col1 = '123456' of the type string requires more memory than the PHP variable $col1 = 123456. The difference is not huge. But if you are a popular web site and you cache query results in APC, things sum up easily. Facebook is on Alexa rank 5, Tuenti is within the top 500 and MySQL is within the top 5000.

The second half of 2. Use the most compact storage format is not tied to using the binary protocol. The original Facebook patch against ext/mysqli simply did convert all string values to their "native" types and did not bother how the data was send over the wire.

To sum up and repeat, the primary goal is to save some, few memory in the PHP land, in particular when caching data in APC, by fetching query results not as strings but using “native” data types whenever possible. This can be accomplished in two ways:

  • Andrey’s approach: never convert data to string, always send and use data “as is”
  • Original Facebook patch: don’t mind the transport format, but convert from string to native types in PHP when fetching

You get both, see below. But first, give kudos to Facebook and Luke Weber from Tuenti who pointed us to a bunch of interesting detail optimizations.

As a side effect, measure PHP’s performance when comparing variables. Write a micro benchmark that compares how fast is if ( $int_var == $same_int_value_as_string) ... vs. if ($int_var == $same_int_value_as_integer) ... vs. if ($int_var === $same_int_value_as_integer) .... There is a performance saving of up to about 30%.


The code…

The code is available from bzr: bzr branch lp:~andrey-mysql/php-mysqlnd/binary_protocol and from the PHP CVS. Why do we have a bzr branch then? We are proposing two solutions for the sketched problem. One solution requires patching the MySQL Server and accessing the feature of the patched server through a new function in ext/mysqli that has no other purpose but enabling you to use the patched server. You will not profit from it when an unpatched server. Of course the patch is not official and not approved by the MySQL QA and stuff. That why this one function is not in the PHP CVS. Everything else is in the PHP CVS. I hope you agree with this decision.


Target audience or Try-and-error

Whereas the basic idea is simple and the advantages seem obvious, things are much more complicated in real life. This lengthy blog posting is not long enough to give all details required to give a reliable recommendation whether or not you should use the new technique. The only honest recommendation one can give is that you should try it out, if you cache database results in APC.

Its is unlikely, that you will find any major performance differences between todays ext/mysqli and the proposed changes. Looking at the database server, the web server, PHP, APC, rendering time of your web site in the browser and whatever else is involved in the user experience of the performance of your web application it does not really matter whether you save a few data type conversions or not. The entire topic is detail tuning at its extreme. Not everybody will profit from this kind of tuning. However, some might because you can save a few type casts and thus a few CPU cycles.


Solution one: Patching the MySQL Server

Andrey has taken a very broad approach to solving the problem of minimizing memory consumption in the PHP land when caching database results in APC (see below for memcached). Andrey has patched the MySQL Server to ensure that data never gets converted into strings unless it is really needed. No enforced string conversions is a property of the binary protocol.

The MySQL Server either sends the results of a query using the binary or the text protocol. All Prepared Statements are replied using the binary protocol whereas all other queries that generate a result set are replied using the text protocol. A tiny patch to the server adds more flexibility to this pattern. The patch does not change the current and default behavior of the server. Therefore the patch is backward compatible and does not break any existing implementations of the MySQL Client Server protocol. It does not break any driver or any existing application.

The patch allows the client to advice the server to send result sets using the binary protocol regardless of the type of the query. This is done by introducing two new server options MYSQL_OPTION_USE_TEXT_PROTOCOL and MYSQL_OPTION_USE_BINARY_PROTOCOL. You can think of a server option as a modifier that changes the behavior of the C API function mysql_real_query() (PHP level counterparts: mysqli_query(), mysqli_real_query()). An example of such a modifier, which is not exported to the PHP level, is MYSQL_OPTION_MULTI_STATEMENTS_ON. MYSQL_OPTION_MULTI_STATEMENTS_ON tells mysql_real_query() to accept multiple statement queries. MYSQL_OPTION_MULTI_STATEMENTS_OFF respectively is there for disabling the functionality again. The two new server options introduced by the patch work in the same way. After setting the option MYSQL_OPTION_USE_BINARY_PROTOCOL the default behavior of mysql_real_query() changes in such a way that the server will use the binary protocol to send results back to the client.

Again, that is 100% backward compatible. No default is changed. The new functionality is only supported by drivers, like mysqlnd, that know how to use the new functionality. Its a tiny patch against MySQL 5.0.67 as you can see below.

=== modified file 'include/mysql_com.h'
--- include/mysql_com.h 2007-12-13 10:53:24 +0000
+++ include/mysql_com.h 2008-09-10 15:25:23 +0000
@@ -323,7 +323,9 @@ enum enum_cursor_type
 enum enum_mysql_set_option
 {
   MYSQL_OPTION_MULTI_STATEMENTS_ON,
-  MYSQL_OPTION_MULTI_STATEMENTS_OFF
+  MYSQL_OPTION_MULTI_STATEMENTS_OFF,
+  MYSQL_OPTION_USE_TEXT_PROTOCOL,
+  MYSQL_OPTION_USE_BINARY_PROTOCOL
 };

 #define net_new_transaction(net) ((net)-gt;pkt_nr=0)

=== modified file 'sql/sql_parse.cc'
--- sql/sql_parse.cc    2008-08-26 08:32:43 +0000
+++ sql/sql_parse.cc    2008-09-10 15:58:35 +0000
@@ -2178,6 +2178,14 @@ bool dispatch_command(enum enum_server_c
       thd-gt;client_capabilities&= ~CLIENT_MULTI_STATEMENTS;
       send_eof(thd);
       break;
+    case (int) MYSQL_OPTION_USE_TEXT_PROTOCOL:
+      thd-gt;protocol= (Protocol *) &thd-gt;protocol_simple;
+      send_eof(thd);
+      break;
+    case (int) MYSQL_OPTION_USE_BINARY_PROTOCOL:
+      thd-gt;protocol= (Protocol *) &thd-gt;protocol_prep;
+      send_eof(thd);
+      break;
     default:
       my_message(ER_UNKNOWN_COM_ERROR, ER(ER_UNKNOWN_COM_ERROR), MYF(0));
       break;


Accessing the new functionality from PHP

To give PHP users access to the new feature of using the binary protocol with mysqli_query() we have exported the C function mysql_set_server_option() to PHP. ext/mysqli has got a new function mysqli_set_server_option(mysqli $link, int $server_option) and two new constants MYSQLI_OPTION_USE_BINARY_PROTOCOL, MYSQLI_OPTION_USE_TEXT_PROTOCOL . You can use the function to tell mysqli_query() whether it shall use the binary or the text protocol to send back query results. As usual, the function is available in a procedural and an object oriented fashion.

Look at the example to see how this little chance gives you “native” types with PHP – SELECT 1 returns an integer when using the binary protocol.

nixnutz@ulflinux:~/php5_binprotgt; sapi/cli/php -r '
  $link = mysqli_connect("localhost", "root", "root"); 
  $link->set_server_option(MYSQLI_OPTION_USE_BINARY_PROTOCOL);
  $res = $link->query("SELECT 1"); 
  var_dump($res->fetch_assoc());
  $res>close; 
  $link->set_server_option(MYSQLI_OPTION_USE_TEXT_PROTOCOL);
  $res = $link->query("SELECT 1"); 
  var_dump($res->fetch_assoc());
  $res>close; 
  $link>close();'


array(1) {
  [1]=>
  int(1)
}
array(1) {
  [1]=>
  string(1) "1"
}

This change to ext/mysqli is not in the PHP CVS, its only in a public bazaar branch on Launchpad. Its PHP license and all, relax, but the function is useless to every normal PHP user that does not want to patch his MySQL server (see above). In fact the function is not only useless but in a limited way a security risk and a break in the logic of the ext/mysqli . Theoretically you can use mysqli_set_server_option() to enable multi statements. That breaks with the principle that multi statements are only available with mysqli_multi_query(). Also, multi statements are more prone to SQL injections and without a need one should not introduce a function that enables you to theoretically use multi statements with mysqli_query(). I say “theoretically” because we try to keep that back door closed and do not allow the usage of MYSQL_OPTION_MULTI_STATEMENTS_ON with mysqli_set_server_option(). Back to the original topic.


Estimating the network traffic

The binary and the text protocol cause a different amount due to the different ways of encoding data before sending it over the network. Unfortunately it is tricky to calculate the difference in advance. To make a good estimation you need to investigate the schema, the actual data and the queries. Beside all the theory and background information that follows, the best estimation you can get is to try out the binary protocol and consult the 80+ mysqlnd statistics.

$link = mysqli_connect("localhost", "root", "root");
mysqli_query($link, "USE test");
mysqli_query($link, "DROP TABLE IF EXISTS test");
mysqli_query($link, "CREATE TABLE test(col1 TINYINT, col2 CHAR(15), col3 TINYINT DEFAULT NULL)");
mysqli_query($link, "INSERT INTO test(col1, col2) VALUES (100, 'char(10) column')");

/* only available with mysqlnd */
$stats = mysqli_get_client_stats();
$bytes = $stats['bytes_received'];

$res = mysqli_query($link, "SELECT * FROM test");
mysqli_fetch_assoc($res);
mysqli_free_result($res);

$stats = mysqli_get_client_stats();
printf("Text protocol  : %d bytes\n", $stats['bytes_received'] - $bytes);

mysqli_set_server_option($link, MYSQLI_OPTION_USE_BINARY_PROTOCOL);

$stats = mysqli_get_client_stats();
$bytes = $stats['bytes_received'];

$res = mysqli_query($link, "SELECT * FROM test");
mysqli_fetch_assoc($res);
mysqli_free_result($res);

$stats = mysqli_get_client_stats();
printf("Binary protocol: %d bytes\n", $stats['bytes_received'] - $bytes);

mysqli_close($link);


Text protocol  : 186 bytes
Binary protocol: 184 bytes

The second best solution is to run a script that calculates potential network traffic differences by doing a static analysis of the schema and the first n rows of every table in the schema. The script is available from the bzr branch. Its name is calc_binprot_savings.phpt. When run against the test table created by the above script it will predict an almost negliable advantage for the binary protocol. The prediction is based on a merely theoretical analysis of the schema and the data. The statistics provided by mysqlnd are not considered. Also, the calculation script does estimate only the size of the actual data/payload. While doing so, it ignores some overhead of the client server protocol like data package chaining (about 4 bytes every 16M).

nixnutz@ulflinux:~/php5_binprot> sapi/cli/php ext/mysqli/tests/bench/calc_binprot_savings.phpt


--------------------------------------------------------------------------------
 Binary transport protocol evaluation details
--------------------------------------------------------------------------------

Analyzing 1 table[s]...

Checking table 'test'...

        Number of columns   : 3
        Column 1 (col1)     : TINYINT              (possible saving 1 ... 4 byte[s] for values not NULL)
        Column 2 (col2)     : STRING(15) latin1    (possible saving 0 ... 0 byte[s] for values not NULL)
        Column 3 (col3)     : TINYINT              (possible saving 1 ... 4 byte[s] for values not NULL)
        Theoretical saving  : 0 ... 6 byte[s] = 0% ... 23% (NULL values not considered!)

        Analysis based on 1 rows...
        Estimated text          : 21 byte[s]
        Estimated binary        : 17 byte[s]
        Binary overhead         : 2 byte[s]
        Total binary            : 19 byte[s]

        Actual saving           : 2 byte[s] (= 9 %)


--------------------------------------------------------------------------------
 Summary
--------------------------------------------------------------------------------

        Tables checked   : 1
        Rows estimated   : 1

        Estimated saving : 9 % (MySQL payload - not overall network bandwith)

        You might save some transfer bandwith using binary transport protocol.


        Check the following table[s]:

        Table test                 (possible saving 2 byte[s])



        Please note: this script does not cover every detail, do a real-life test.

Both ways to calculate the savings of the binary protocol show a difference of 2 bytes. But the calculation script speaks of a 9% saving although 2 out of 186 are not equal to 9% of 186. That’s because the estimation script only calculates the size of the payload whereas the try-and-error statistics approach takes everything into account: protocol "overhead" plus payload.

The text protocol sends converts all result set values to length encoded strings before sending them to the client. A length encoded string consists of at least one leading byte to indicate the length of the string followed by 0 to n bytes for the actual data value. Strings up to a length of 250 bytes require one leading byte for the length encoding. Strings of 251 bytes length or more, need up to four leading bytes. NULL values are encoded using one length byte of the value 0.

Examples of length encoded strings used by the text protocol
SQL value Bytes on the wire Encoded version
INT, NULL 1 Byte 1: Length(NULL) = 0
TINYINT, -1 3 Byte 1: Length(“-1″) = 1, Bytes 2 – 3: “-1″
TINYINT, 255 4 Byte 1: Length(“255″) = 3, Bytes 2 – 4: “255″
CHAR(10), “abcd” 5 Byte 1: Length(“abcd”) = 4, Bytes 2 – 5: “abcd”

In contrast to the text protocol, the binary protocol avoids string conversions whenever possible. A TINYINT, for example, is encoded “as-is” using exactly one byte, regardless of the actual value (-1, 255, …). This is the reason why one cannot give a simple answer to what protocol causes less traffic.

NULL values are encoded by setting one bit in a NULL values bit map. For every string based SQL data type, you are back to length encoded strings. A CHAR(10) column with the value “abcd” requires 5 bytes in the text protocol and 5 bytes in the binary protocol.

Text protocol and binary protocol space requirements compared (NULL ignored)
Type Bytes Text protocol Bytes Binary protocol Difference
DECIMAL length encoded string length encoded string 0
TINYINT 2…5 1 1…5
SMALLINT 2…7 2 0…5
MEDIUMINT 2…9 3 -1…6
INTEGER 2…12 4 -2…8
BIGINT 2…21 8 -6…13
FLOAT 2…11 (system dependent?) 4 -2…8(*)
DOUBLE 2…20 (system dependent?) 8 -2…12(*)
DATE 11 4 7
DATETIME 20 4 (time 00:00:00) … 7 13…16
TIME 2…10 8 -6…2
YEAR 3 or 5 2 1…3
*CHAR, *BLOB length encoded string length encoded string 0
SET, ENUM length encoded string length encoded string 0
GEOMETRY length encoded string length encoded string 0

Complicated you say? Let’s add variable number three – your queries. Connect to mysql using the mysql client and the option -T to get data type information. Run SELECT NOW(), NOW() + 1. Check the data type. NOW() returns a DATETIME but NOW() + 1 returns a DECIMAL. A network traffic estimation based on the schema is unable to catch that – your query has converted a DATETIME to a DECIMAL.

nixnutz@ulflinux:~/php5> /usr/local/mysql/bin/mysql -uroot -proot -T test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 545
Server version: 5.0.67 Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> select now(), now() + 1;
Field   1:  `now()`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       DATETIME
Collation:  binary (63)
Length:     19
Max_length: 19
Decimals:   6
Flags:      NOT_NULL BINARY

Field   2:  `now() + 1`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       DOUBLE
Collation:  binary (63)
Length:     23
Max_length: 21
Decimals:   6
Flags:      NOT_NULL BINARY NUM


+---------------------+-----------------------+
| now()               | now() + 1             |
+---------------------+-----------------------+
| 2008-09-16 13:09:51 | 20080916130952.000000 |
+---------------------+-----------------------+
1 row in set (0.00 sec)

Conclusion: run a test, forget about the theory, the binary protocol might or might not make a small difference in network traffic. Two final examples. This is the conclusing of the schema analysis script for my WordPress blog database. I’m not surprised: lots of BLOB columns and BIGINT columns to hold the IDs of 198 articles…

--------------------------------------------------------------------------------
 Summary
--------------------------------------------------------------------------------

        Tables checked   : 9
        Rows estimated   : 1613

        Estimated saving : 0 % (MySQL payload - not overall network bandwith)

        You might save some transfer bandwith using binary transport protocol.


        Check the following table[s]:

        Table wp_comments          (possible saving 13385 byte[s])
        Table wp_links             (possible saving 1 byte[s])
        Table wp_posts             (possible saving 6528 byte[s])

And this is the result of analyzing the Data Charmer’s Employee database. The result is the exact opposite of the WordPress result!

--------------------------------------------------------------------------------
 Summary
--------------------------------------------------------------------------------

        Tables checked   : 6
        Rows estimated   : 4033

        Estimated saving : 38 % (MySQL payload - not overall network bandwith)

        You might save some transfer bandwith using binary transport protocol.


        Check the following table[s]:

        Table dept_emp             (possible saving 14000 byte[s])
        Table dept_manager         (possible saving 360 byte[s])
        Table employees            (possible saving 14000 byte[s])
        Table salaries             (possible saving 16013 byte[s])
        Table titles               (possible saving 14000 byte[s])



        Please note: this script does not cover every detail, do a real-life test.

Recall that the goal was to save a little bit of memory in PHP and when caching results in APC. Its not a big deal if the server patch does not (always) save network traffic. It might or might not do so depending on your application. You get an additional tuning screw to play with. What the screw does has been explained in depth above.


Using PHP to emulate native types with the text protocol

It is more promising to look at the memory used by PHP and in particular APC when storing values from INTEGER columns as PHP integers and not as PHP strings (see above). There are three ways to make ext/mysqli return native types:

  1. use Prepared Statements (this implies using the binary protocol)
  2. use the MySQL patch from the experimental bzr branch (see above)
  3. let ext/mysqli and mysqlnd do the necessary type conversions for you – that’s in the PHP CVS and in the bzr branch

By default mysqli_query() will use the text protocol and all results will be returned as PHP strings (except NULL). By help of meta data and type information associated with every result set, one can cast the strings back to their native types. This can either be done within a PHP script or inside ext/mysqli. The casting will burn a few CPU cycles and its weird to convert from int to string on the server before sending data and doing the reverse operation of converting from string to int on the client, but that’s the only option remaining if 1) and 2) are no options.

A new mysqlnd only option gives you 3). Try out mysqli_options($link, MYSQLI_OPT_INT_AND_FLOAT_NATIVE, true).

$link = mysqli_connect("localhost", "root", "root");
mysqli_query($link, "USE test");

mysqli_query($link, "DROP TABLE IF EXISTS test");
mysqli_query($link, "CREATE TABLE test(col1 INT, col2 FLOAT)");
mysqli_query($link, "INSERT INTO test(col1, col2) VALUES(1, 12345.67)");

$res = mysqli_query($link, "SELECT col1, col2, col2 * 2 AS _col3 FROM test");
var_dump(mysqli_fetch_assoc($res));
mysqli_free_result($res);

/* mysqlnd only */
mysqli_options($link, MYSQLI_OPT_INT_AND_FLOAT_NATIVE, true);
$res = mysqli_query($link, "SELECT col1, col2, col2 * 2 AS _col3 FROM test");
var_dump(mysqli_fetch_assoc($res));
mysqli_free_result($res);

mysqli_close($link);


array(3) {
  ["col1"]=>
  string(1) "1"
  ["col2"]=>
  string(7) "12345.7"
  ["_col3"]=>
  string(14) "24691.33984375"
}
array(3) {
  ["col1"]=>
  int(1)
  ["col2"]=>
  float(12345.7)
  ["_col3"]=>
  float(24691.33984375)
}

However, please make sure that your application is able to handle a float instead of a string. For example, don’t be surprised if comparisons made with === against strings fails. Be also aware of the fact that computers don’t do precise float calculations: col2 * 2 is a good example from the above script.

As a closing example, lets run a really, really ugly query against the employee example database and cache its results in APC. The query is ugly because a) its slow and b) it gives not the full answer to the question which employee has had the highest annual salary ever. However, using the new option MYSQLI_OPT_INT_AND_FLOAT_NATIVE makes the APC cache entry a little smaller. The differences is around 2%.

apc_clear_cache("user");

$link = mysqli_connect("localhost", "root", "root");
mysqli_query($link, "USE employees");

$res = mysqli_query($link, "SELECT
	e.first_name, e.last_name, s.salary, YEAR(s.from_date) AS _year
FROM
	employees AS e
JOIN
	salaries AS s ON
	e.emp_no = s.emp_no AND
	PERIOD_DIFF(s.from_date, s.to_date) = 12
ORDER BY
	salary DESC");
$employees_string = mysqli_fetch_all($res);
mysqli_free_result($res);

/* mysqlnd only, alternatively switch to binary protocol */
mysqli_options($link, MYSQLI_OPT_INT_AND_FLOAT_NATIVE, 1);
$res = mysqli_query($link, "SELECT
	e.first_name, e.last_name, s.salary, YEAR(s.from_date) AS _year
FROM
	employees AS e
JOIN
	salaries AS s ON
	e.emp_no = s.emp_no AND
	PERIOD_DIFF(s.from_date, s.to_date) = 12
ORDER BY
	salary DESC");
$employees_native = mysqli_fetch_all($res);
mysqli_free_result($res);

mysqli_close($link);


apc_store("strings", $employees_string, 10);
apc_store("native", $employees_native, 10);

$info = apc_cache_info("user");
foreach ($info['cache_list'] as $details) {
	printf("Cache entry %-10s requires %d bytes\n", $details['info'], $details['mem_size']);
}


Cache entry strings    requires 21313 bytes
Cache entry native     requires 20991 bytes

And what about using the binary protocol instead of MYSQLI_OPT_INT_AND_FLOAT_NATIVE to get native types? The impact on the APC cache entry is the same, you gain some 2% memory savings. The network traffic, measured using mysqlnd’s byte_received statistic, goes down by about 3%. No wonder, but hey, you get it for free…


Future works – memcached – Feedback appreciated!

I have not mentioned memcached in this post. The reason is that the memcached PECL extension uses a simple way of serializing variables for storing them into memcached. For example, PHP’s text based serializer is used for serializing arrays and objects. The is the same serializer that is used for PHPs session functionality. There is some room for improvements here. However, it will take some time to evaluate the options and learn more about memached. In constrast to the PECL memcached extension, APC never serializes variables regardless if they are scalars or not. Therefore it was a bit easier to demonstrate the difference of using PHP strings and, for example, PHP integers to hold the same PHP integer value.

We will more or less stop at this point for now and wait for your feedback. One interesting question is if the binary protocol really saves CPU time on the server and on the client. If you got a free minute to test it, let us know about the results.

2 Comments

  1. Am I right in assuming there will be issues with UNSIGNED INT columns?

    With values larger than 0x7fffffff becoming negative in 32 bit PHP? and similarly for unsigned bigints?

  2. Hi Jared,

    I haven’t checked it myself because it worked for Prepared Statements and I trust Andrey’s words that it works. For every value that goes beyond PHP_INT_MAX we should be using strings, just like we do with Prepared Statements.

    Ulf