MySQL and NoSQL go together. In the data center, that’s a known. Inside MySQL, that’s lesser known. MySQL 5.6 is both a puristic key value store and a full-fledged relational database. MySQL could also be developed into a schema free document store supporting MapReduce as a search method. With regards to searching those are the extremes. What search methods should a NotOnlyMySQL offer? The extremes first.
Thanks, MySQL. Losing a key is no nightmare!
Memcached is a high performance key value store for volatile data. Memcached is part of a classic scaling story: first, we buy a bigger machine for MySQL, then we try replication, then we cache database results in Memcached.
|MySQL Protocol||Memcache Protocol|
InnoDB and NDB Memcache API plugins turn MySQL 5.6 into a key value store. Adding Memcache protocol support to MySQL was a logical step not only because of the popularity of MySQL being used together with Memcache but also because the Memcache protocol is perfect match for exposing MySQL’s internal high performance storage APIs to the user.
|MySQL Protocol||Memcache Protocol|
Data can be inserted into MySQL using both MySQL Protocol or Memcache protocol. Data can be searched using either SQL or doing a key lookup. There are two ways to write and read the same data.
Extreme one: darling, I lost the keys for the data treasure!
SQL is a very powerful query language. Key lookup is the very opposite. Key lookup is not powerful enough for the majority of applications. Assume you configured MySQL 5.6 to make a table with employees and their hobbies accessible through the new Memcache interface. How do you get a list of all employees?
|Ulf_Wendel||keep beloved three-wheeler running, photography, sewing, cycling|
|Johannes_Schlueter||cycling, hacking, family|
|Andrey_Hristov||renovate appartements, economics, winter sports|
Using SQL the answer is straight forward: SELECT name_key FROM employees_kv_list.
The query cannot be answered using Memcache protocol. This is quite weird: you have created a list of employees, it exists but you cannot access your data. No key for you to your data treasure! A
get [key] command is not enough to search a key value store. Lucky you, MySQL allows you to query the data using either SQL or the less powerful Memcache protocol commands.
An advanced key value store supporting data types
Redis is another key value NoSQL database. Search or query capabilities of Redis are not significantly better than the ones of Memcache. At least, you can do a "ls" on your key value file system. The
KEY [pattern] command can get you the list of employees – yippie!
Next task: find the cyclists among the employees. Nothing as easy as that using SQL:
SELECT name_key FROM employees_kv_list WHERE hobbies_values LIKE '%cycling%'.
Armed with nothing but
KEY [pattern] a user of a key value store would be forced to read all values and filter them on the client-side. But then, what about consistent read and traffic?
KEYS * GET Ulf_Wendel GET Johannes_Schlueter GET Andrey_Hristov
Redis is more advanced. Redis supports data types. Values can be strings, hashes, lists, sets, sorted sets and more. Sets could be used to reduce the traffic when trying to find the cyclists among the employees. SISMEMBER can test a set for a member.
SADD Ulf_Wendel "keep beloved three-wheeler running" "photography" "sewing" "cycling" [...] KEYS * SISMEMBER Ulf_Wendel "cycling" [...]
I am not after blaming key value stores. Key value work fantastic for a certain, narrow set of applications once you mastered the data model. Their simplicity allows them to offer O(1) time complexity – for key lookups. For nothing else. Ad-hoc queries are no strengh of basic key value stores.
Extreme two: MapReduce – you do the search!
Document oriented key value stores often feature MapReduce. MapReduce offers as powerful search possibilities as SQL.
For MapReduce you develop filter functions that can be run by the data store. Unlike with the certainly extreme examples from above, values are not transferred from the server to the client for client-side filtering. Traffic is lower and thus execution speed is potentially better.
It is powerful. It could be done with MySQL, that’s for sure. But, do you really want to write a script for every query?
|poor||… -> …||powerful|
What is in the middle?
MySQL speaks key value. MySQL speaks SQL. MySQL could support MapReduce. We have the extremes covered. What is in the middle that may cover the magic 80% of the use cases? To be continued the other day.