PHP MySQL developers are confronted with three PHP MySQL APIs. That is confusing. Since mid 2012 the PHP manual suggests not to use the oldest and original PHP MySQL extension any more to cure the situation. Please, swap out ext/mysql in favour of either ext/mysqli or PDO_MySQL. Why to upgrade is below. Read on, if you are a PHP beginner, your PHP application is still using mysql_* functions of ext/mysql or you are a PHP trainer.
As a PHP consumer, tease your software vendors by asking whether they use the state of the art APIs! Here is a cut&paste test question to ask your vendor: do you support secure SSL connections to MySQL?
History and features
PHP and MySQL go hand in hand. The original MySQL API (ext/mysql) is around since PHP 2.0. Many web sites still show examples of the original MySQL API (ext/mysql). This is amazing considering the feature comparison of the three PHP MySQL APIs. By the way, do you know this Quickstart guide on ext/mysqli that is in the PHP manual?
API feature comparison | |||
---|---|---|---|
ext/mysqli | PDO_MySQL | ext/mysql | |
PHP version introduced | 5.0 | 5.1 | 2.0 |
Year introduced | 2003 | 2005 | Last century |
Included with PHP 5.x | Yes | Yes | Yes |
Development status | Active | Active | Maintenance only |
Feature additions | All new, flagship |
All new, given they fit PDO |
None since 2006 |
Lifecycle | Active | Active | Long term deprecation announced |
Recommended for new projects | Yes | Yes | No |
OOP Interface | Yes | Yes | No |
Procedural Interface | Yes | No | Yes |
API supports non-blocking, asynchronous queries with mysqlnd | Yes | No | No |
Persistent Connections | Yes | Yes | Yes |
"No pitfall" Persistent Connections | Default, can be disabled | No | No |
API supports Charsets | Yes | Yes | Yes |
API supports server-side Prepared Statements | Yes | Yes | No |
API supports client-side Prepared Statements | No | Yes | No |
API supports Stored Procedures | Yes | Yes | No |
API supports Multiple Statements | Yes | Most | No |
API supports Transactions | Yes | Yes | No |
Transactions can be controlled with SQL | Yes | Yes | Yes |
API supports SSL | Yes | Yes | No |
API supports Compression | Yes | Yes | Yes |
Supports all MySQL 5.1+ functionality | Yes | Most | No |
The above is almost identical to what you find in the PHP manual at http://www.php.net/manual/en/mysqlinfo.api.choosing.php with minor additions to elaborte and make the points. If you are using the original ext/mysql API, you find yourself in the red zone! Get rid of the weak points in your application.
Soft features
Humans have hard skills and soft skills. Let’s see if that works for APIs too – a personal list of soft features.
ext/mysqli | PDO_MySQL | ext/mysql | |
---|---|---|---|
Performance factor | tendency above average | average | below average |
Security factor | good | good | below average |
Pitfall factor | low | average | above average |
Cosiness factory | average | above average | average |
Performance considerations
All three PHP MySQL APIs are tiny C wrappers around the underlying C client-server library. It is the underlying C client-server library that does most of the work.
Programming language | |||
---|---|---|---|
PHP | WordPress, Drupal, Oxid, phpMyAdmin, YourPHPWonder | ||
C | ext/mysqli | PDO_MySQL | ext/mysql |
C | Client-Server library |
For the same command executed using the same feature you can expect all APIs to show the same performance. How you wrap the C library in a C PHP extension does not matter. It is the features exposed to the users that matter. Different features have different performance characteristics.
Before doing endless profiling sessions ask yourself how much time your web applications spends in the API itself. I am not talking about the accumulated wall clock time of query or connect functions. I ask you to measure pure API times without query and network times!
Performance relevant features | |||
---|---|---|---|
ext/mysqli | PDO_MySQL | ext/mysql | |
API supports non-blocking, asynchronous queries with mysqlnd | Yes | No | No |
Persistent Connections | Yes | Yes | Yes |
"No pitfall" Persistent Connections | Default, can be disabled | No | No |
API supports server-side Prepared Statements | Yes | Yes | No |
API supports Multiple Statements | Yes | Most | No |
API supports Compression | Yes | Yes | Yes |
Features is what counts for performance. The matrix of performance relevant features makes ext/mysql look poor.
Performance: asynchronous, non-blocking
There is no way to run a non-blocking query using ext/mysql. You cannot fire off a slow running query at the beginning of your script, do some work and fetch results once the server has computed them. Instead you are forced to wait for the results. For example, you can query an ad server without having to wait for it. In 2008 a conference talk was given on: How mysqlnd async queries help you with sharding!. Here is the corresponding success story from 2012: Asyncronous Shard Queries in PHP using mysqlnd enabling the feed load 10 times faster.
Performance: servers-side prepared statements
A non-prepared statement is parsed and interpreted by the server every time you execute it. SQL and data is sent to the server in one chunk. A prepared statement is first parsed and prepared on the server before it can be executed multiple times with different parameters without being reparsed again. SQL and data sent to the server seperatedly. Upon repeated execution only the parameter value is transferred, thus bandwith is saved.
Whether using a prepared statement gives a performance penalty or performance win depends on the case. Below is an example that showed a 25% higher query rate using prepared statements on my aged test box. If you don’t know about this feature, which is unavailable with ext/mysql, read this quickstart. Performance is about features being available in the API and smart, trained use of the features.
$link = new mysqli("localhost", "root", "", "test", NULL, "/tmp/mysql57.sock");
$link->query("DROP TABLE IF EXISTS test");
$link->query("CREATE TABLE test(id INT AUTO_INCREMENT PRIMARY KEY, note VARCHAR(255))");
$time = 100;
$note = 'How fast are prepared statements?';
$num_queries = 0;
$start = microtime(true);
do {
$num_queries++;
$my_note = $link->real_escape_string(sprintf("%s %d", $note, $num_queries));
$res = $link->query("INSERT INTO test(note) VALUES ('" . $my_note . "')");
$runtime = microtime(true) - $start;
} while ($runtime < $time);
printf("Non-prepared: %d qps\n", $num_queries / $time);
$num_queries = 0;
$start = microtime(true);
$stmt = $link->prepare("INSERT INTO test(note) VALUES (?)");
$my_note = NULL;
$stmt->bind_param("s", $my_note);
do {
$num_queries++;
$my_note = sprintf("%s %d", $note, $num_queries);
$stmt->execute();
$runtime = microtime(true) - $start;
} while ($runtime < $time);
printf("Prepared: %d qps\n", $num_queries / $time);
Performance: multiple statements
What if you could fire multiple statements with just one question to reduce client-server round-trips? Using Multiple Statements it is possible, unless you are using ext/mysql, which cuts you off from this possible optimization.
$link = new mysqli("localhost", "root", "", "test", NULL, "/tmp/mysql57.sock"); $link->multi_query(" DROP TABLE IF EXISTS test ; CREATE TABLE test(id INT AUTO_INCREMENT PRIMARY KEY, note VARCHAR(255)) ; INSERT INTO test(id) VALUES ('multi statement'); SELECT * FROM test; SELECT id FROM test; "); do { if ($res = $link->store_result()) { var_dump($res->fetch_all(MYSQLI_ASSOC)); } } while ($link->more_results() && $link->next_result());
Performance: summary
API internals don’t matter much, they hardly contribute to overall run time. Reducing SQL runtimes is most important followed by querying MySQL in the most efficient manner. The different options for querying MySQL with different performance characteristics that can be used for optimizations on a case-by-case basis. A certain old API cuts you off from some potential performance gains…
Security related features
Security has a lot to do with how you use a given tool and which tools you are given.
Security related features | |||
---|---|---|---|
ext/mysqli | PDO_MySQL | ext/mysql | |
"No pitfall" Persistent Connections | Default, can be disabled | No | No |
API supports server-side Prepared Statements | Yes | Yes | No |
API supports SSL | Yes | Yes | No |
Security: Encryption/SSL
MySQL supports secure connections between clients and MySQL clients and the server using the Secure Sockets Layer (SSL) protocol. Both ext/mysqli and PDO_MySQL feature SSL connections. Thus, PHP MySQL users can use SSL for their business.
But, ext/mysql will not let you set the configuration settings required for an SSL encrypted connection. Feel free to use "the world’s foremost network protocol analyzer" (Wireshark) to sniff on any ext/mysql connection. The graphical user interface of Wireshark is simple to use. Good luck stealing information.
Security: SQL injection
There are many ways to fight SQL injection. Please, consider watching the web presentation Top 10 MySQL Tips and Mistakes for PHP Developers if you are new to the topic.
It is certainly possible to develop a bullet-proof web application on top of ext/mysql but it requires skill. The below example is taken from the article Not only SQL injection: I don’t trust you!. Do you see the denial-of-service attack? The answer is in the link. It is not only about charsets. Do you notice the charset pitfall in the API?
$query = "SELECT name, rating FROM movies WHERE rating > " . mysql_escape_string($_GET['rating']); mysql_query($query, $link);
For an untrained person, a beginner, the simplest solution to the problem would be to use (server-side) prepared statements. A feature available with all PHP MySQL APIs but ext/mysql.
Security: Persistent connections
Persistent connections have been reason for vivid discussions ever since. A persistent connection is not closed at the end of a script but kept open for reuse in a future script run. A persistent connection keeps its state between use, as the name says. This very property has caused lots of confusion ever since. People have been surprised to be confronted with a connection that has open transactions or SQL user variables set. Many considered this a pitfall.
Imagine the DBA notices a PHP MySQL account has been granted to many rights, or, in the worst case has been compromised. The DBA changes the MySQL account. Your web application has many persistent connections open. The open connections are unaffected from the account changes. PHP continues to issue queries that you tried to prevent until all persistent connections have been closed.
The ext/mysqli API is the only one that is not flawed by default. The mysqli extension does flush the state of a persistent connections and reauthenticate the user (COM_CHANGE_USER) before reusing the persistent connection. Thus, no pitfalls around state, and no delayed recognition of account changes. Strictly speaking ext/mysqli gives you no persistent connection but a pooled one. However, those who know what they do can go back to the old behaviour. The price paid for "no pitfall" persistent connections is performance (german speaking readers can find details here and here).
Cosiness and pitfall factor
Upon closer look it turned out that performance and security are more hard facts than soft ones. Even cosiness and pitfall factor have a touch of hard facts that cannot be ignored.
Undoubtfully the most powerful PHP MySQL API is ext/mysqli. But, it is also the most complex one, which gives a minus on the cosiness factor. When it was introduced it served as a pioneer for dual procedural and object-oriented interfaces. The procedural interface was preserved to make migrating from ext/mysql easy. PDO came a tad later and got a more comprehensive, though less powerful, pure OOP interface that many would rate higher for cosiness.
The original PHP MySQL API (ext/mysql) lacks on features and API calls – cosy! Seriously, it has some neat looking stuff in it but watch out for traps…
$res = mysql_query("SELECT 1"); var_dump(mysql_fetch_assoc($res));
This is a complete script to query MySQL. It is no mistake that you see no call to any kind of connect function. Here is the proof using PHP 5.6:
nixnutz@linux-0v4u:~/git/php-src> sapi/cli/php -d mysql.default_host="localhost" -d mysql.default_user="root" -d mysql.default_socket="/tmp/mysql57.sock" -r ' $res = mysql_query("SELECT 1"); var_dump(mysql_fetch_assoc($res));' array(1) { [1]=> string(1) "1" }
Great stuff, thumbs up? Next example. Let’s assume the implicit connect happens somewhere deep, deep in your startup routine. Then, in the main part of your script you explicitly open a connection with mysql_connect()
. You get a resource identifier for your connection and you use it with mysql_query()
. Guess what, you are using the connection that was opened somewhere deep, deep in your startup routine.
nixnutz@linux-0v4u:~/git/php-src> sapi/cli/php -d mysql.default_host="localhost" -d mysql.default_user="root" -d mysql.default_socket="/tmp/mysql57.sock" -r ' function startup() { mysql_query("SET @a = 1"); } startup(); $link = mysql_connect(); $res = mysql_query("SELECT @a", $link); var_dump(mysql_fetch_assoc($res));' array(1) { ["@a"]=> string(1) "1" }
It is not mandatory to pass a connection identifier to mysql_query()
. The function will use the last opened connection – cosy! Particularily older code makes use of this feature. Assume you modern()
stuff to your old application. You believe to do everything right. You tell mysql_connect()
to open a new connection as you do not want to be fooled. For this you pass true
as the fourth argument to the function. Then, you ensure the connection identifier is used with mysql_query()
. Everything is beautifully encapsulated in a function – what should go wrong?
nixnutz@linux-0v4u:~/git/php-src> sapi/cli/php -d mysql.default_host="localhost" -d mysql.default_user="root" -d mysql.default_socket="/tmp/mysql57.sock" -r ' function startup() { mysql_query("SET @a = 1"); } function modern() { $link = mysql_connect(NULL, NULL, NULL, true); mysql_query("SET @a=2", $link); } startup(); $link = mysql_connect(); modern(); $res = mysql_query("SELECT @a"); var_dump(mysql_fetch_assoc($res));' array(1) { ["@a"]=> string(1) "2" }
Well, your function has side effects. It has accidently made the main routine use the connection established within modern()
. No, closing the connection in modern()
is no solution. Maybe some cosy looking features contribute to the pitfall factor…
You deserve the best
Do yourself a favour and choose the best possible API for your PHP MySQL applications. Either use ext/mysqli or PDO_MySQL. The original MySQL API will not allow you to go beyond the MySQL 4.1 feature set – quite limiting.
Happy hacking!
Pingback: Supercharging PHP MySQL applications using the best API | codegooroo
Pingback: Log Buffer #296, A Carnival of the Vanities for DBAs | The Pythian Blog
2012/11/24 at 08:07
Thanks a lot for the roundup. The Security: Persistent connections section gave me a lot to think about. Happy Thanksgiving!
2012/11/25 at 12:04
It is possible to use iterators with mysqli result sets:
foreach ($res as $row) {
}
Andrey
Pingback: Linkpool Nummer 33 | PHP Gangsta - Der PHP Blog mit Praxisbezug
Pingback: How (and when) to move users to mysqli and PDO_MYSQL? | codegooroo
2012/11/28 at 01:34
The major reason for slow adoption of ext/mysqli and PDO is the lack of killer features.
SSL: First of all, ext/mysql supports it: http://www.madirish.net/?article=244. Second, most connections from PHP to MySQL are on internal network. If someone is on this internal network then you already have bigger troubles and using SSL just consumes resources without a good reason.
Prepared Statements: It lacks support for arrays so it’s very hard to construct trivial queries like `id IN (…)` or `VALUES (…), (…), …`.
Performance of Prepared Statements: It’s almost always slower because it requires two round-trips to execute a single query. Tailored examples like “use the same query with more data sets” could be faster but making a single query (e.g. by using `VALUES (…), (…), …`) is even faster.
Stored Procedures: They are supported in ext/mysql. What’s not supported are multiple result sets. There’s not much use for procedures with multiple result sets.
Multiple Queries: Processing the results and checking for errors is quite involved with multi-queries. Also, there’s not much use for them. One of the exceptions is BEGIN+DELETE+INSERT+COMMIT, most of other queries can be written in more efficient way. Also, good applications are broken to simple methods like: execute query, process result, return it or throw. So even if it would be possible to join more queries to a single multi-query, the application would need to be much more complicated.
Transactions: There’s no advantage in using API instead of executing respective SQL queries. On the contrary, using API can have some crazy side effects based on the server config: http://dev.mysql.com/doc/mysql/en/mysql-commit.html.
Async Queries: This feature is great but requiring one connection per async query is too much in most cases.
“No Pitfall” Persistent Connections: ext/mysqli implementation is flawed – it cleans on connecting whereas it should clean at the script end. Imagine that you lock some table and then your script dies. All future connections will still have the table locked until you get lucky and obtain the connection which locked the table from the pool. If you run out of web server connections sooner then you just DoSed yourself. The point is that you need to clean after yourself inside register_shutdown_function() in all extensions including ext/mysqli.
There’s also some craziness in PDO, particularly about error handling. Try to write a function which inserts data in a table or updates them if the row already exists and throws if that fails. If you want to obey the error mode set on the connection then the error handling of this function is just crazy.
To not get me wrong: I personally use PDO, sometimes ext/mysqli but I completely understand why people don’t want to switch. Also the lack of support for trivial queries (described in Prepared Statements) makes all the extensions unusable by themselves and you need to write something on top of them anyway. If you use your own API in your application it doesn’t matter much what’s inside it and there’s no big reason to change it.
2012/11/28 at 11:26
Jakub,
You made a very valuable comment. I am in a rush today, I’ll not be able to give the reply you deserve today.
1) SSL
I tried SSL using PHP 5.6 and the latest MySQL development version (libmysql) before blogging. I did not see PHP making an attempt to read the option file. Thus, client part is not read and SSL is not used.
The way ext/mysql is implemented violates rules for using the C-API, see http://dev.mysql.com/doc/refman/4.1/en/mysql-real-connect.html notes on CLIENT_SSL. CLIENT_SSL is considered an internal flag. Any hint of internal is a big, big warning about the future. If it ever worked for ext/mysql it has still not be done the way it should have been implemented. It cuts you off from any other way of setting SSL details but using a MySQL option file.
2) Prepared Statements
The point being is that ext/mysql cuts you off from the feature. No matter what skills level you have reached, no matter how trained you are, you are limited by the API. This is a hard limit that you cannot break out.
API design is a soft limit. Certainly cosiness factor is most imporant but it does not change anything about the fact that ext/mysql prevents you from getting most out of MySQL
3) Transactions
You want to use API calls for controlling them. This is the only way a load balancer such as PECL/mysqlnd_ms can detect transaction boundaries.
2012/11/28 at 11:52
Jakub,
good points, they are quite correct but there’s a bit mre to it. If you break it down most applications make use of a very small part of the API. For most applications in fact connect+query+fetch_row are enough and they only use these functions. But what happens if this isn’t enough anymore for some reason? If they later need an advanced feature they are stuck.
But let’s look into your points in more detail:
SSL: Yes, one connect using SSL with ext/mysql but not provide custom options like a custom key. Indeed SSL often isn’t the best choice. I often suggest using a stunnel if you are bound by a policy and need encryption. Nonetheless there are cases where some policy requires users identified by SSL keys. If you distribute your application with such limitiations some users won’t be able to use it. (same might happen with in-house applications when being acquired by someone with stricter policies etc.). So yes, often no issue but making lives simpler for just a hand of people in an easy way is nice.
Prepared Statements: This one is kind of religious. Quick comment on the performance thing, though: If database and web server are close the overhead often can’t be measured and things change if you happen to run the same kind of query multiple times.
Stored Procedures: Going to mysqli early when using stored procedures, again, saves from later surprises due to “out of sync” errors.
Multiple Queries: I agree should not be a “requirement” there are other usecases though, like initilizing multiple things during i.e. an application installation where lots of things have to be done. Certainly no reason to migrate but nice to have every now and then.
Transaction Function: Question of coding style. Using those functions gives something simple to grep for while readng the code. Of course one might write custom wrappers, just like the item above: No reason to migrate but a small present.
Async Queries: Yes, the connection is a limitation due to the protocol (and to some degree consistency promises) Nonetheless there are quite a few usecases. While executing the async query you can do other things, like requesting data from other services (i.e. talk to Facebook’s API at the same time) or other databases, which might be needed in i.e. sharded environments.
Persistent Connections: Yes the “change user” call in between to cleanup can be annoying from time to time but in our experience the traditional way, leaking state, causes a lot of trouble for many users as unexpectd things happen as essentially the connection is in an undefined state all the time. Yes, if you only do simple queries and nothing affecting session state you are fine, unfortunately that’s often not the case.
Overall: Sure you don’t *need* all the things, but using mysqli is not more complicated than using ext/mysql. So why should one limit oneself? – Of course there is an issue with legacy code, but it’s years since there will be a PHP without ext/mysql and even after that there will be ways (i.e. pecl) to access it. Still: Taking the time to migrate opens new opportunities and provides long-term stability.
And as last thing, yet another thought: Even if you don’t need these features there is ongoing development with MySQL and mysqli, maybe there will be interesting things in the feature … and often migrating earlier is simpler than later as applications tend to grow, not shrink.
Pingback: PHP Digest: Simplified Password Hashing, Creation of PHP Framework Using Symfony2 Components, Xdebug Tutorial and Much More | Zfort Group Blog
Pingback: A Smattering of Selenium #132 « Official Selenium Blog
Pingback: Frage zur Umstellung mysql -> mysqli - php.de
2013/06/22 at 21:21
People who know how to write good code already had safe code with the old mysql functions. Maybe the real problem is that people who don’t know how to code kept copy and pasting other people’s shoddy code.
2015/12/21 at 02:24
I put the slides on the site. Sorry for the ovgisreht.For the future: I’d like to get this project done.My usage is that the BDB XA API is a particular C binding for an implementation of the XA specification. So the API was developed by sleepycat, and XA itself is specified by The Open Group. I could be convinced that some other usage is better.