Deep dive into Redis Clustering

March 16, 2019

In this article, I will talk about how Redis distributes its storage across a cluster, how it handles failover, and how this helps it scale.

What is Redis?

Redis (Remote Dictionary Server) is:

- An open source (BSD licensed) project

- A NoSQL database server

- An in-memory data structure store (keeps data in memory)

- An advanced key-value store (Redis keeps data as key-value pairs)

- A store that supports data structures such as strings, hashes, lists, sets and sorted sets

Redis Clustering

Redis Clustering provides a consistent and resilient data service where data is automatically sharded (partitioned) across multiple Redis nodes. It also provides a master/slave setup to enhance availability in case of a failure.

How Redis manages its storage in a distributed storage concept

1. Redis Cluster Topology -

A minimal cluster that works as expected requires at least 3 master nodes, and the Redis recommendation is to have at least one slave for each master.

Minimum 3 Redis master nodes, each on a separate machine
Minimum 3 Redis slaves, 1 slave (replica) per master, to allow a minimal fail-over mechanism

2. Redis Cluster TCP ports -

Every Redis Cluster node requires two open TCP ports: the normal Redis TCP port used to serve clients, for instance 7000, plus the port obtained by adding 10000 to the data port, so 17000.

This second, high port (here, 17000) is used for the Cluster bus, a node-to-node communication channel that uses a binary protocol. Nodes use the Cluster bus for failure detection, configuration updates, failover authorization and so forth. If you don't open both TCP ports, your cluster will not work as expected, so make sure that both ports are open in your firewall.

Note that for a Redis Cluster to work properly you need, for each node:

The normal client communication port (here, 7000) must be open to all the clients that need to reach the cluster, plus all the other cluster nodes (which use the client port for key migrations).
The cluster bus port (the client port + 10000) must be reachable from all the other cluster nodes.
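The port rule above is simple arithmetic, and it can be sketched in a few lines. This is only an illustration: the function names `cluster_ports` and `firewall_rules` are made up for this example and are not part of Redis or any client library.

```python
# Offset between the client port and the cluster bus port, as described above.
CLUSTER_BUS_OFFSET = 10000


def cluster_ports(client_port):
    """Return the two TCP ports that one cluster node needs open."""
    bus_port = client_port + CLUSTER_BUS_OFFSET
    if client_port <= 0 or bus_port > 65535:
        raise ValueError("client port must leave room for client port + 10000")
    return {"client": client_port, "cluster_bus": bus_port}


def firewall_rules(client_port):
    """List (port, who must be able to reach it) pairs for one node."""
    ports = cluster_ports(client_port)
    return [
        (ports["client"], "clients and all other cluster nodes"),
        (ports["cluster_bus"], "all other cluster nodes"),
    ]


for port, audience in firewall_rules(7000):
    print(f"open TCP {port} to {audience}")
```

For the example node on port 7000, this lists port 7000 for clients and other nodes, and port 17000 for other nodes only.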

3. Redis Cluster data sharding -

Redis Cluster does not use consistent hashing, but a different form of sharding where every key is conceptually part of what we call a hash slot.

There are 16384 hash slots in Redis Cluster, and a slot map distributes these 16384 slots among the cluster nodes. To compute the hash slot of a given key, we simply take the CRC16 of the key modulo 16384.
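The slot calculation can be tried out in Python. This is a minimal sketch, assuming the CRC16 variant in question is the XMODEM one (polynomial 0x1021, initial value 0), which is what `binascii.crc_hqx` computes when started from 0. The helper name `key_slot` is made up for this example, and the sketch hashes the whole key, ignoring the hash tag feature that Redis Cluster also supports.

```python
import binascii

HASH_SLOTS = 16384  # number of hash slots in Redis Cluster


def key_slot(key):
    """Return the hash slot for a key: CRC16(key) modulo 16384."""
    data = key.encode("utf-8") if isinstance(key, str) else key
    # crc_hqx with an initial value of 0 is CRC16/XMODEM.
    return binascii.crc_hqx(data, 0) % HASH_SLOTS


for key in ("foo", "hello", "user:1000"):
    print(key, "->", key_slot(key))
```

Every key maps to exactly one slot in the range 0 to 16383, no matter how many nodes the cluster has.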

Every node in a Redis Cluster is responsible for a subset of the hash slots, so for example you may have a cluster with 3 nodes, where:

Node A contains hash slots from 0 to 5500.

Node B contains hash slots from 5501 to 11000.

Node C contains hash slots from 11001 to 16383.
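The slot ranges above act as a lookup table from slot number to node. A minimal sketch of that lookup, using the three example ranges (the names `SLOT_UPPER_BOUNDS`, `NODES` and `node_for_slot` are made up for this illustration):

```python
import bisect

# Highest slot (inclusive) owned by each node, taken from the example above.
SLOT_UPPER_BOUNDS = [5500, 11000, 16383]
NODES = ["Node A", "Node B", "Node C"]


def node_for_slot(slot):
    """Return the node that owns the given hash slot."""
    if not 0 <= slot <= 16383:
        raise ValueError("slot must be between 0 and 16383")
    # bisect_left finds the first range whose upper bound is >= slot.
    return NODES[bisect.bisect_left(SLOT_UPPER_BOUNDS, slot)]


for slot in (0, 5500, 5501, 11000, 11001, 16383):
    print(slot, "->", node_for_slot(slot))
```

Combined with the CRC16 calculation, this is enough to decide which node should receive a command for a given key.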

This makes it easy to add and remove nodes (scale) in the cluster, and it does not require any downtime.
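To see why scaling only means moving slots around, here is a toy model of adding a fourth node to the example cluster. It only does the bookkeeping of which node owns which slot; a real cluster also has to migrate the keys stored in each moved slot, which this sketch leaves out. The function `add_node` and the name "Node D" are assumptions made for the illustration.

```python
HASH_SLOTS = 16384


def add_node(slot_map, new_node):
    """Move surplus slots from the existing nodes to new_node.

    slot_map maps node name -> list of owned slots and is modified in place.
    Returns the number of slots that changed owner.
    """
    target = HASH_SLOTS // (len(slot_map) + 1)  # fair share after the node joins
    moved = []
    for slots in slot_map.values():
        surplus = max(0, len(slots) - target)
        for _ in range(surplus):
            moved.append(slots.pop())
    slot_map[new_node] = sorted(moved)
    return len(moved)


slot_map = {
    "Node A": list(range(0, 5501)),       # slots 0 to 5500
    "Node B": list(range(5501, 11001)),   # slots 5501 to 11000
    "Node C": list(range(11001, 16384)),  # slots 11001 to 16383
}
moved = add_node(slot_map, "Node D")
print("slots moved:", moved)
print({node: len(slots) for node, slots in slot_map.items()})
```

In this model only 4096 of the 16384 slots change owner, and every node ends up with 4096 slots. Keys in the slots that did not move stay exactly where they were.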

Redis Fail-over Procedure (Redis Cluster master-slave model)

The master-slave concept in Redis increases data availability by removing the single point of failure. Every master node in a Redis cluster has at least one slave node (a replica of the master node). When a master node fails or becomes unreachable, the cluster automatically chooses one of its slave nodes and promotes it to be the new master. Therefore, a failure in one node will not stop the entire system from working.

For instance, let's say we have a cluster with the nodes Master A, Master B and Master C. If Master B fails, the cluster is not able to continue, since we no longer have a way to serve hash slots in the range 5501-11000.

However, if we add a slave node to every master when the cluster is created (or at a later time), the final cluster is composed of the master nodes Master A, Master B and Master C and the slave nodes Slave A, Slave B and Slave C, and the system is able to continue if Master B fails.

Slave B replicates Master B. When Master B goes down, Redis Cluster performs an automatic failover that promotes Slave B to be the new master, and the cluster continues to operate correctly.

Case 1: Master-slave connections when every node is healthy
Case 2: Server 2 unavailable
Case 3: Fail over (Slave B becomes Master B)
Case 4: Check offsets and update the role (Master B -> Slave B)

NOTE: Case 4 can be a dangerous situation, since two master nodes are now on the same server (Server 3). The database administrator should therefore rearrange the cluster so that each server hosts a single master and a slave of some other master.
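The layout recommended in the note can be written down as a simple rule: put one master on each server, and put each master's slave on the next server along. The sketch below is only one possible way to do this; the function `plan_placement` and the server names are assumptions made for the illustration.

```python
def plan_placement(servers):
    """Give each server one master and a slave of a different master.

    Returns {server: {"master": name, "slave_of": name}}.
    """
    n = len(servers)
    if n < 3:
        raise ValueError("a minimal cluster needs at least 3 servers")
    plan = {server: {"master": None, "slave_of": None} for server in servers}
    for i, server in enumerate(servers):
        name = chr(ord("A") + i)  # Master A, Master B, ... (hypothetical names)
        plan[server]["master"] = name
        # The slave of this master lives on the next server, wrapping around.
        plan[servers[(i + 1) % n]]["slave_of"] = name
    return plan


plan = plan_placement(["Server 1", "Server 2", "Server 3"])
for server, roles in plan.items():
    print(f"{server}: Master {roles['master']}, Slave {roles['slave_of']}")
```

With three servers, no server holds both a master and that master's own slave, so losing one server never takes out both copies of the same hash slots.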

However, note that if Master B and Slave B fail at the same time, Redis Cluster is not able to continue to operate.
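The failover behaviour described in this section can be played through with a toy model. It is a deliberately simplified sketch: it skips failure detection and failover authorization over the Cluster bus and simply promotes the first available slave. The `Shard` class and the functions `fail_node` and `cluster_ok` are made up for this illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Shard:
    slots: range                       # hash slots served by this shard
    master: Optional[str]              # None means nobody serves these slots
    slaves: List[str] = field(default_factory=list)


def fail_node(shards, node):
    """Mark a node as failed and promote a slave if a master was lost."""
    for shard in shards:
        if node in shard.slaves:
            shard.slaves.remove(node)
        elif shard.master == node:
            # Promote the first slave, or leave the slots unserved.
            shard.master = shard.slaves.pop(0) if shard.slaves else None


def cluster_ok(shards):
    """The cluster keeps working only while every slot range has a master."""
    return all(shard.master is not None for shard in shards)


shards = [
    Shard(range(0, 5501), "Master A", ["Slave A"]),
    Shard(range(5501, 11001), "Master B", ["Slave B"]),
    Shard(range(11001, 16384), "Master C", ["Slave C"]),
]

fail_node(shards, "Master B")
print(shards[1].master, cluster_ok(shards))   # Slave B has taken over

fail_node(shards, "Slave B")
print(shards[1].master, cluster_ok(shards))   # slots 5501-11000 are unserved
```

The first failure is absorbed because Slave B takes over the slots 5501-11000. Once the promoted node fails as well, nothing is left to serve that range, which is the situation where the cluster cannot continue.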
