Big Data Architectures – NoSQL Use Cases for Key Value Databases
The most important question asked about NoSQL databases is: when would I use this? A year ago I promised to start to provide answers and so, finally, let’s start. Relational databases are used ubiquitously in organizations to solve all structured data storage problems but, in fact, may not be the optimal solution for some problems. Don’t panic. I’m absolutely not suggesting that we would want to replace all our relational database solutions with NoSQL solutions. Most of the NoSQL database use cases are in new areas that probably don’t have relational database back ends now anyway, if they even exist yet at your organization.
In general, NoSQL data stores have architectures that lend themselves to Cloud implementations and are structured consistently with Cloud architectures. There is an assumption that the solutions exist in environments where nodes may go down but the applications need to remain running, like the commodity hardware environment of a Cloud provider. Therefore, redundancy is built into the solutions and the environments are built to scale without single points of failure – losing any node will not result in the applications going down because the applications will just continue with one of the copies of the node that was lost, even if the node lost was the controller. There is also an assumption of a need to scale quickly by orders of magnitude. NoSQL data stores usually have a concept of “eventual consistency,” which means that all the redundancy and high availability is there but the copies may not be synchronized immediately, only eventually. A little data might be lost on rare occasions. Deal with it. If you have an application where a little data can never be lost, then don’t use a NoSQL database and be prepared to invest in appropriate recovery and high availability solutions.
Key value databases are EXTREMELY simple databases. There is a key and there is the rest of the data (the value) and that’s it. You can find the value only by looking it up by its key. There are no alternate keys and no foreign keys and no broad text searching capabilities against the values. If you want those things then you have to create them yourself, and that involves storing redundant data. But key value databases are FAST. They are a lot faster than relational databases. And they can scale: they can grow in size by hundreds or thousands of times without significant redesign. The growth in price is linear with the growth in size. It is very hard to scale relational database solutions quickly and the price curve is usually not linear but geometric.
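To make the "create them yourself" point concrete, here is a minimal sketch in Python. It is not the API of any real product: the KeyValueStore class, the "user:" and "email:" key prefixes, and the find_user_by_email helper are all made up for illustration. The store only knows how to put, get, and delete by key, so a lookup by email requires a second, redundant entry that maps the email back to the primary key.

```python
class KeyValueStore:
    """Toy in-memory key value store: a key, an opaque value, nothing else.

    Hypothetical class for illustration; not a real product's API.
    """

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)


store = KeyValueStore()

# Primary record, found only by its key.
store.put("user:1001", {"name": "Ada", "email": "ada@example.com"})

# There are no alternate keys, so an "index" is just another entry we
# maintain ourselves: redundant data mapping email -> primary key.
store.put("email:ada@example.com", "user:1001")


def find_user_by_email(kv, email):
    """Two key lookups: email -> primary key, then primary key -> record."""
    primary_key = kv.get("email:" + email)
    if primary_key is None:
        return None
    return kv.get(primary_key)


user = find_user_by_email(store, "ada@example.com")
```

The catch is that the application now owns the job of keeping both entries in step: if the email changes, the old "email:" entry has to be deleted and a new one written.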
And now the money shot: what are key value stores good for? They are most frequently used for managing the session information in web applications. Key value databases are great for managing the session information for all these new user apps on our phones and devices that are becoming de rigueur for organizations that deal directly with individuals and consumers. Key value stores are used in massively multi-player on-line gaming to manage each player session. The slightly more mundane version of this is managing the shopping cart for an on-line buyer. I want to be clear on this: I would use a relational database for the payment transaction and any revenue posting. For all the shopping cart transactions up to the point of payment, however, a key value store is probably a better solution.
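Here is a rough sketch of what the shopping cart case looks like, again in Python. A plain dictionary stands in for the key value database client, and the "cart:" key prefix and the function names are my own assumptions rather than anything a particular product prescribes. The whole cart is one value, serialized as JSON and stored under a key built from the session id.

```python
import json

# Stand-in for a key value database client (assumption for illustration).
cart_store = {}


def cart_key(session_id):
    """Build the single key under which a session's cart is stored."""
    return "cart:" + session_id


def load_cart(session_id):
    """Fetch the cart by key; an unknown session simply has an empty cart."""
    raw = cart_store.get(cart_key(session_id))
    return json.loads(raw) if raw is not None else {}


def add_item(session_id, sku, quantity=1):
    """Read the cart, change it, and write the whole value back."""
    cart = load_cart(session_id)
    cart[sku] = cart.get(sku, 0) + quantity
    cart_store[cart_key(session_id)] = json.dumps(cart)
    return cart


def clear_cart(session_id):
    """Drop the cart, for example once payment has been handed off."""
    cart_store.pop(cart_key(session_id), None)


add_item("session-42", "sku-book")
add_item("session-42", "sku-book", 2)
add_item("session-42", "sku-pen")
```

Every operation touches exactly one key, which is why this workload suits a key value store so well. The payment itself would still go to the relational database.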
Organizations that sell over the internet struggle with the difference in volumes most of the year versus the pre-Christmas buying season. Should they have an infrastructure that is scaled for the highest possible buying peak and pay for that for the other nine tenths of the year, or risk their environment being unable to handle a Christmas buying frenzy and going down for some period of hours during their biggest revenue opportunity? Relational databases struggle with handling order-of-magnitude changes in the number of records stored quickly, and with handling high volumes of simultaneous transactions (thousands or millions of state changes per second). Key value NoSQL databases can handle order-of-magnitude scaling of the number of records and extremely high volumes of state changes per second with millions of simultaneous users through distributed processing and distributed storage. Key value NoSQL databases also have built-in redundancy which can handle the loss of storage nodes without losing the whole application. Sometimes, but not frequently, an item is lost out of a shopping cart or even all the items in a shopping cart. It’s happened to me. Has it happened to you?
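The distributed storage and redundancy idea can be sketched in a few lines of Python. This is a deliberately simplified toy, not how any particular product works: the Cluster class and its method names are hypothetical, and it places keys with a simple hash modulo the node count, whereas production systems typically use schemes such as consistent hashing so that adding a node does not reshuffle most keys. Each key is written to two nodes, so a read still succeeds after one of them is lost.

```python
import hashlib


class Cluster:
    """Toy cluster: each key is stored on `replicas` consecutive nodes.

    Hypothetical class for illustration; placement is hash modulo node
    count, which is simpler than what real systems do.
    """

    def __init__(self, node_names, replicas=2):
        self.order = sorted(node_names)
        self.nodes = {name: {} for name in self.order}
        self.replicas = replicas
        self.down = set()

    def owners(self, key):
        """Hash the key to pick a starting node, then take its neighbours."""
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        start = int(digest, 16) % len(self.order)
        return [self.order[(start + i) % len(self.order)]
                for i in range(self.replicas)]

    def put(self, key, value):
        """Write a copy to every owner that is currently up."""
        for name in self.owners(key):
            if name not in self.down:
                self.nodes[name][key] = value

    def get(self, key):
        """Read from the first owner that is up and holds the key."""
        for name in self.owners(key):
            if name not in self.down and key in self.nodes[name]:
                return self.nodes[name][key]
        return None

    def fail(self, name):
        """Simulate losing a node."""
        self.down.add(name)


cluster = Cluster(["node-a", "node-b", "node-c"], replicas=2)
cluster.put("cart:session-42", ["sku-book", "sku-pen"])

first_owner = cluster.owners("cart:session-42")[0]
cluster.fail(first_owner)
cart_after_failure = cluster.get("cart:session-42")  # served by the replica
```

Notice what the sketch leaves out: a write made while a node is down never reaches that node, so when it comes back its copy is stale until something repairs it. That gap is where eventual consistency, and the occasional lost shopping cart item, comes from.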
Example key value databases include Riak, Redis, Memcached, Berkeley DB, HamsterDB, and Amazon DynamoDB.
These new kinds of databases were created to deal with the new kinds of applications that the internet has given rise to, and that’s the purpose they serve best. Organizations should not be considering replacing all their relational databases with NoSQL, especially for their financial applications.