Uh, uh… extending mysqlnd: monitoring and statement redirection

2011/10/11 by admin

Uh, uh… about a year ago Mayflower OpenSource Labs released the mysqlnd user handler plugin (PECL/mysqlnd_uh). The extension lets you extend and replace mysqlnd internal function calls with PHP. Uh, uh… mysqlnd internals exported to user space? Who cares as long as it does the trick?! Let me show you seven lines of PHP code to monitor all queries issued by any PHP MySQL application using any PHP MySQL extension (mysql, mysqli, PDO_MySQL) compiled to use the mysqlnd library.

query_monitor.php


class conn_proxy extends MysqlndUhConnection {
 public function query($res, $query) {
  debug_print_backtrace();
  return parent::query($res, $query);
 }
}
mysqlnd_uh_set_connection_proxy(new conn_proxy());

That’s it. Install PECL/mysqlnd_uh and load the above query_monitor.php before your application starts using the PHP configuration directive auto_prepend_file to see all queries issued by your application, including a backtrace. The proof:


> cat foo.php
<?php
$pdo = new PDO("mysql:host=localhost;dbname=test", "root", "");
var_dump($pdo->query("SELECT 1 AS _one FROM DUAL")->fetchAll(PDO::FETCH_ASSOC));
$mysqli = new mysqli("localhost", "root", "", "test");
$mysqli->query("SELECT 1 AS _two FROM DUAL");
> sapi/cli/php -d auto_prepend_file=query_monitor.php foo.php
#0  conn_proxy->query(Resource id #5, SELECT 1 AS _one FROM DUAL)
#1  PDO->query(SELECT 1 AS _one FROM DUAL) called at [foo.php:3]
array(1) {
  [0]=>
  array(1) {
    ["_one"]=>
    string(1) "1"
  }
}
#0  conn_proxy->query(Resource id #8, SELECT 1 AS _two FROM DUAL)
#1  mysqli->query(SELECT 1 AS _two FROM DUAL) called at [foo.php:5]

In preparation for the third and last PHP MySQL webinar, I wrote documentation for PECL/mysqlnd_uh and made the extension compile with PHP 5.4. Get PHP 5.4, get a development version of mysqlnd_uh and give it a try. Both 5.4 and mysqlnd_uh deserve user feedback.

If you happen to use Symfony2, check out https://github.com/gimler/MySQLndUhTool by Gordon Franke. Gordon has integrated the monitoring into Symfony2 EventDispatcher.

Any PHP MySQL application
\|
mysql	mysqli	PDO_MySQL
MySQL native driver for PHP (mysqlnd) library
Mysqlnd User Handler plugin (PECL/mysqlnd_uh)
\|
MySQL Server

How it works

The mysqlnd user handler plugin provides two PHP classes. The built-in PHP class MysqlndUhConnection is mapped to the mysqlnd C library internal connection class. The built-in PHP class MysqlndUhPreparedStatement gives access to some of mysqlnd’s prepared statement class methods.

*.c	MySQL native driver for PHP (mysqlnd) library
*.c	mysqlnd_conn	mysqlnd_stmt	…
*.c	Mysqlnd User Handler plugin (PECL/mysqlnd_uh)
	provides built-in PHP classes
*.php	MysqlndUhConnection	MysqlndUhPreparedStatement	n/a

On the C level, the methods of the built-in classes are installed to be used instead of the original mysqlnd library functions. The classes do nothing but call the original library methods. The built-in classes behave like a transparent proxy. To change the default behaviour of mysqlnd, you have to subclass the two built-in classes and install proxy objects of you derived classes.

Basics: query logging and rewriting

If that sounds confusing, forget about it for now. Get yourself used to the pattern shown in the monitoring example. Its sufficient for basic stuff like monitoring and rewriting. Rewriting? Sure… let’s assume the database schema has been re-factored over the years. You are understaffed and there is a legacy application accessing a renamed table. The application is building SQL dynamically and its hard to locate the source. Ulf helps: rewriting and logging the backtrace.


class conn_proxy extends MysqlndUhConnection {
 public function query($res, $query) {
  if (FALSE !== stristr($query, "FROM oldtable")) {
    $query = str_replace("FROM oldtable", "FROM newtable", $query);
    error_log(var_export(debug_backtrace(), true));
  }
  return parent::query($res, $query);
 }
}
mysqlnd_uh_set_connection_proxy(new conn_proxy());

There are many use cases for this little snippet: monitoring, rewriting but also auditing and even SQL injection protection using black- and whitelists is possible. If you are interested in the latter and you have some C skills, check out PECL/mysqlnd_sip. The name mysqlnd_sip stands for mysqlnd SQL injection protection. PECL/mysqlnd_sip is a proof-of-concept. Otherwise, try out mysqlnd_uh. It has been written for no other purpose but making it possible to play with mysqlnd, to prototype plugins with PHP.

Note

The mysqlnd library and its functions have not been designed to exposed to the user space. PECL/mysqlnd_uh lets you manipulate the inner workings of mysqlnd. It tries to detect abuse. Nonetheless, it is possible to make mysqlnd leak memory or even crash PHP, if you hook mysqlnd calls inappropriately. This is not considered a mysqlnd bug. Please, report issues which you
want PECL/mysqlnd_uh try to detect and try to prevent from happening.

Standard PHP code encryption tools cannot stop users from monitoring mysqlnd activities with PECL/mysqlnd_uh. PECL/mysqlnd_uh lets users hook mysqlnd C calls. That’s on a level beneath the application. Whatever technology is used to hide the source code of the PHP application, the PHP application still has to call the C calls internally. By hooking the mysqlnd C calls with PECL/mysqlnd_uh users get access to database users, database passwords (MysqlndUhConnection::connect()) and SQL commands executed (MysqlndUhConnection::query(), MysqlndUhPreparedStatement::prepare()). Therefore, administrators should restrict access to PECL/mysqlnd_uh to trusted persons only. Just like access to a database server log should be restricted…

Advanced: statement redirection

After the note only experts should be left. Let’s do something expert-like. Let’s do something in user space which PECL/mysqlnd_ms, the replication and load balancing plugin, does in C. Let’s redirect statements to different servers without letting the application know. The replication plugin does read/write splitting to send read requests to slave nodes and write requests to master nodes in a MySQL replication cluster. We want to pick a server based on a SQL hint. A SQL hint, which shall represent a sharding key.


class mysqlnd_lb_conn_proxy extends MysqlndUhConnection {
 public function query($conn, $query) {
   return parent::query($conn, $query);
 }
}

Again, we start deriving a proxy from MysqlndUhConnection, overwriting the query method. The query method has two arguments. The first argument is a resource of type mysqlnd connection, the second argument is the query statement string. Whenever the statement begins with a certain SQL hint, we shall redirect it to a certain server. To do so, we’ll open a new connection to that server and use that new connection to execute the statement on.


class mysqlnd_lb_conn_proxy extends MysqlndUhConnection {
 public function query($conn, $query) {
  if (FALSE !== stristr($query, "shard1")) {
    $mysqli = new mysqli("localhost", "root", "", "test_1");
    return parent::query($mysqli, $query);
  }
  return parent::query($conn, $query);
 }
}
mysqlnd_uh_set_connection_proxy(new mysqlnd_lb_conn_proxy());


Warning: MysqlndUhConnection::query() expects parameter 1 to be resource, object given in /home/nixnutz/php-src/branches/PHP_5_4/foo.php on line 7

Warning: mysqli::query(): (Mysqlnd User Handler) The method MysqlndUhConnection::query() did not return a boolean value as it should in /home/nixnutz/php-src/branches/PHP_5_4/foo.php on line 17

Good try, but obviously wrong. The parent implementation of the query must not be given a mysqli object as its first parameter but a mysqlnd connection resource. The conversion between the two connection handles is to be done using mysqlnd_uh_convert_to_mysqlnd(). Next try.


class mysqlnd_lb_conn_proxy2 extends MysqlndUhConnection {
 public function query($conn, $query) {
  if (FALSE !== stristr($query, "shard1")) {
    $mysqli = new mysqli("localhost", "root", "", "test_1");
    $mysqlnd_conn = mysqlnd_uh_convert_to_mysqlnd($mysqli);
    return parent::query($mysqlnd_conn, $query);
  }
  return parent::query($conn, $query);
 }
}
mysqlnd_uh_set_connection_proxy(new mysqlnd_lb_conn_proxy2());

$mysqli = new mysqli("localhost", "root", "", "test");
$res = $mysqli->query("/*shard1*/SELECT DATABASE() AS _db FROM DUAL");
var_dump($res);
var_dump($res->num_rows);


bool(true)
NULL

The errors are gone but we are not quite there yet: no proper result set. By scanning the example you may be tempted to forget that a user space call, such as mysqli_query(), may invoke many mysqlnd library calls, no just query. The mysqli_query() function sends a query, checks if there is a result set to fetch and if so, does fetch and store the result set. The one line $res = $mysqli->query("/*shard1*/SELECT DATABASE() AS _shard1 FROM DUAL"); expands to three mysqlnd library calls. All three functions must be modified to use the proxy connection which has been used for redirection. You have to implement MysqlndUhConnection::query(), MysqlndUhConnection::getFieldCount() and MysqlndUhConnection::storeResult().

Here comes the "working" example. "Working" means it is capable of redirecting exactly the user calls shown in the example. Other user API calls may require additional mysqlnd function calls to be overwritten.


class mysqlnd_lb_conn_proxy2 extends MysqlndUhConnection {

 private $last_conn = NULL;

 public function query($conn, $query) {
  if (FALSE !== stristr($query, "shard1")) {
    $mysqli = new mysqli("localhost", "root", "", "test_1");
    $this->last_conn  = mysqlnd_uh_convert_to_mysqlnd($mysqli);
    if (!$this->last_conn) {
      $this->last_conn = NULL;
      return parent::query($conn, $query);
    }
    return parent::query($this->last_conn, $query);
  }
  $this->last_conn = $conn;
  return parent::query($conn, $query);
 }

 public function getFieldCount($conn) {
   return ($this->last_conn) ?
     parent::getFieldCount($this->last_conn) :
     parent::getFieldCount($conn);
 }

 public function storeResult($conn) {
   return ($this->last_conn) ?
    parent::storeResult($this->last_conn) :
    parent::storeResult($conn);
 }

}
mysqlnd_uh_set_connection_proxy(new mysqlnd_lb_conn_proxy2());

$mysqli = new mysqli("localhost", "root", "", "test");
$res = $mysqli->query("/*shard1*/SELECT DATABASE() AS _db FROM DUAL");
var_dump($res->fetch_assoc());
$res = $mysqli->query("SELECT DATABASE() AS _db FROM DUAL");
var_dump($res->fetch_assoc());


array(1) {
  ["_db"]=>
  string(6) "test_1"
}
array(1) {
  ["_db"]=>
  string(4) "test"
}

Finding which functions to overwrite is a bit of try-and-error process. Basically you have three options to go though the iterative process.

Reverse engineer mysqlnd and extension source
Check mysqlnd debug and trace log for pointers
Implement all MysqlndUhConnection methods to see what gets called

As a mysqlnd developer, I usually take the reverse engineering route. It does not require extensive C knowledge to see which function calls which. However, if you fear C, try using the debug and trace log. Below is my standard way of working with mysqlnd_uh, if trying to achieve anything fancy, such as statement redirection.


valgrind sapi/cli/php -d mysqlnd.debug="d:t:O,/tmp/mysqlnd.trace" foo.php


>mysqlnd_init
[... mysqli_connect() called]
<mysqlnd_init
>mysqlnd_connect
| info : host=localhost user=root db=test port=3306 flags=131072
| >mysqlnd_uh_conn.connect
[ ... from mysqli_connect() -> connection_id will become 2125]
| | >mysqlnd_conn::connect
[...]
| | | info : connection_id=2125
| | | info : PACKET_FREE(0x7084bd0)
| | <mysqlnd_conn::connect
| <mysqlnd_uh_conn.connect
<mysqlnd_connect
[...]
>mysqlnd_uh_conn.query
| info : connection 0x7082598
[... from mysqli_query() ]
| >mysqlnd_init
[... proxy openes new connection to shard server]
| <mysqlnd_init
| >mysqlnd_connect
| | info : host=localhost user=root db=test_1 port=3306 flags=131072
[... shard server connection opened by proxy]
| <mysqlnd_connect
[...]
| >mysqlnd_conn::query
| | info : conn=2126 query=/*shard1*/SELECT DATABASE() AS _db FROM DUAL
[...proxy send connection to shard server has connection_id = 2126]
| <mysqlnd_conn::query
<mysqlnd_uh_conn.query
>mysqlnd_uh_conn.get_field_count
| info : connection 0x7082598
[... mysqli_query() checks if there is a result set]
<mysqlnd_uh_conn.get_field_count
>mysqlnd_uh_conn.store_result
| info : connection 0x7082598
[... mysqli_query() fetches and stores the result set]
| >mysqlnd_conn::store_result
| | info : conn=2126
[...]
| <mysqlnd_conn::store_result
<mysqlnd_uh_conn.store_result
[... script continues after first mysqli_query() ]
>RSHUTDOWN

Too complicated? Go back to the basics from the beginning. Monitoring, rewriting and auditing is super easy. Other tasks can easily become more complex. That’s one reason why we have written the PECL/mysqlnd_ms in C. However, the power and flexibility of hooking mysqlnd library calls, the possibility of installing a client-side proxy is quite unique.

Happy hacking!