database – Cassandra or MySQL/PostgreSQL?

database – Cassandra or MySQL/PostgreSQL?

There is nothing like a Silver bullet solution, everything is built to solve specific problem and has its own pros and cons. It is up to you to decide – what problem statement you have and what is best solution that fits your problem. Whether you use Cassandra (NoSQL) or MySQL(RDBMS), it is all driven from your systems requirements. Below are the inputs that will help you in taking better decision while deciding on database.

Why to Use NoSQL

In the case of RDBMS database, making choice is quite easy because almost all the databases like MySQL, Oracle, MS SQL, PostgreSQL in this category offer almost same kind of solutions oriented to the ACID property. When it comes to NoSQL, decision becomes difficult because every NoSQL database offers different solution and you have to understand which one is best suited for your app/system requirement. For example, MongoDB fits for use cases where your system demands schema-less document store. HBase might fit for Search engines, analysing log data, any place where scanning huge, two-dimensional join-less tables is a requirement. Redis is built to provide In-Memory search for varieties of data structures like tree, queue, link list etc and can be good fit for making real time leader board, pub-sub kind of system. Similarly there are other database in this category (including Cassandra) which fits for different problems. Now lets move to original question, and answer them one by one.

When to use Cassandra

Being a part of NoSQL family, Cassandra offers solution for problem where your requirement is to have very heavy write system and you want to have quite responsive reporting system on top of that stored data. Consider use case of Web analytics where log data is stored for each request and you want to built analytical platform around it to count hits by hour, by browser, by IP, etc in real time manner. You can refer to blog post (http://blogs.shephertz.com/2015/04/22/why-cassandra-excellent-choice-for-realtime-analytics-workload/) to understand more about the use cases where Cassandra fits in.

When to Use a RDMS instead of Cassandra/NoSQL

Cassandra is based on NoSQL database and does not provide ACID and relational data property. If you have strong requirement of ACID property (for example Financial data), Cassandra would not be a fit in that case. Obviously, you can make work out of it, however you will end up writing lots of application code to handle ACID property and will loose on time to market badly. Also managing that kind of system with Cassandra would be complex and tedious for you.

There are many different flavours of NoSQL databases. If your application is really like Wordnet perhaps you should look at a graph database such as Neo4j.

database – Cassandra or MySQL/PostgreSQL?

I would suggest to analyse your request.

  1. If you are going with more clusters, machines take NoSQL
  2. If your data model is complicated – require efficient structures take NoSQL (no limits with type of columns)
  3. If you fit in a few machines without scales, and you dont need super performance for multi request (as for example in social network – where lot of users send http request), and you dont think you involve saleability take RDBMS (Postgres have some good functions and structures which you can use, like array column type).

Cassandra should work better with large scales of data, multi purpose.
neo4j – would be better for special structures, graphs.

Leave a Reply

Your email address will not be published.