12 December 2017

Aerospike vs Redis vs Tarantool

I have been working on a project where we have to store application data (clicks, views, mentions, etc.) so that we can run analytics on it later. There are several ways to architect a solution for this kind of problem. In the past I built a similar solution using Apache Kafka and the Citus extension for PostgreSQL, but this time we decided to use an in-memory storage engine and implement a batch job that regularly aggregates the data and stores it in PostgreSQL.
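To make the idea concrete, here is a minimal sketch of that pattern: raw events are buffered in memory, and a batch job periodically aggregates them into a SQL table. Everything in it is an assumption for illustration. A plain Python list stands in for the in-memory engine, the standard library's sqlite3 stands in for PostgreSQL, and the names record_event, run_batch_job and event_counts are hypothetical, not taken from the actual project.

```python
import sqlite3
from collections import Counter

# Hypothetical stand-in for the in-memory store. In a real setup this
# would be Redis, Aerospike or Tarantool rather than a Python list.
raw_events = []


def record_event(event_type, item_id):
    """Buffer one raw application event (click, view, mention, ...)."""
    raw_events.append((event_type, item_id))


def run_batch_job(conn):
    """Aggregate the buffered events and add the counts to the SQL table.

    sqlite3 is used here only as a stand-in for PostgreSQL.
    Returns the number of distinct (event_type, item_id) pairs processed.
    """
    counts = Counter(raw_events)
    raw_events.clear()  # the buffer is drained once it has been counted

    conn.execute(
        "CREATE TABLE IF NOT EXISTS event_counts ("
        "event_type TEXT, item_id TEXT, total INTEGER, "
        "PRIMARY KEY (event_type, item_id))"
    )
    for (event_type, item_id), n in counts.items():
        # Try to bump an existing row first, insert if there was none.
        cur = conn.execute(
            "UPDATE event_counts SET total = total + ? "
            "WHERE event_type = ? AND item_id = ?",
            (n, event_type, item_id),
        )
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO event_counts (event_type, item_id, total) "
                "VALUES (?, ?, ?)",
                (event_type, item_id, n),
            )
    conn.commit()
    return len(counts)
```

In practice the job would be triggered by a scheduler at a fixed interval, and the drain step would need to be atomic on the store side so that events arriving during aggregation are not lost.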

The question was: which in-memory storage engine should we go for? In this post, I compare three promising tools and run some performance tests on them to better evaluate which one to go with. Before I start, I have to mention that I am slightly biased towards Redis because I have used it regularly in the past and know my way around it. That said, after having now worked with both Aerospike and Tarantool, I have to say that all of them are worthy candidates.

Test Setup

The tests are run on AWS. I use Terraform to provision the infrastructure. The picture below shows an overview of the infrastructure. The tools were provisioned as single instances on their own nodes:
YCSB-tester on a c4.2xlarge (this instance was used to run the YCSB tool that performs the tests)
Redis-tester (v. 4.0.6) on an m4.xlarge
Tarantool-tester (v. 1.8) on an m4.xlarge
Aerospike-tester (v. 3.15.0.1) on an m4.xlarge
ElastiCache-tester (v. 3.2.4) on a cache.m4.xlarge

 

06 May 2017

Redis, Apache Kafka & RabbitMQ - When to use what

I recently had to present a design that consisted of Redis, Apache Kafka and RabbitMQ, among other things of course. At the presentation the obvious question came up: can't we just use one of these?

I understand that for novice users visiting the respective websites it must be difficult to understand the differences. On its front page, the Redis website states:

Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker.

On the Apache Kafka webpage you'll see

Publish & Subscribe to streams of data like a messaging system

and finally on RabbitMQ

RabbitMQ is the most widely deployed open source message broker 

 

"Message" seems to be the keyword for all of them, but that doesn't tell the full story. Let's have a look at the details and at example scenarios where one would choose one over the other.