Hi all,
Our company runs two data centers in two different cities in Turkey.
We want to set up a Redis cluster that runs in both data centers at the same time, in an active-active manner.
We want to use Redis as a distributed cache, not as a datastore. It will hold user sessions and mostly read-only data.
In normal day-to-day operation, both data centers will be active for both the applications and Redis. The applications will perform read and write operations on Redis.
In some cases we will completely shut down one data center and continue operating on the other, without breaking the application logic and without any additional manual work.
Our company is legally obliged to carry out such a disaster scenario periodically.
Similarly, while the applications keep running in both data centers, we may at some point shut down the Redis nodes in one data center and continue on the other.
So the Sentinel solution does not seem appropriate for our case, because Sentinel can only perform a failover when a majority of the Sentinel processes can reach each other.
From the Sentinel documentation: "Sentinel never starts a failover if the MAJORITY of Sentinel processes are unable to talk"
Assume that we place 5 Sentinel nodes in dc-1 and 5 Sentinel nodes in dc-2 (with master-slave groups spread across both data centers) and then completely shut down dc-1. Only 5 of the 10 Sentinels remain, which is fewer than the majority of 6, so a failover cannot be performed properly.
Placing the majority of the Sentinels in one data center is not a solution either, because that data center may be the one that gets shut down completely.
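To make the arithmetic behind this concern concrete, here is a minimal sketch in Python. It models only the majority rule quoted above and nothing else about Sentinel; the function names and the 6 + 5 layout are hypothetical and used purely for illustration.

```python
from typing import Mapping


def majority(total: int) -> int:
    """Smallest number of Sentinels that is a strict majority of `total`."""
    return total // 2 + 1


def can_authorize_failover(sentinels_per_dc: Mapping[str, int], down_dc: str) -> bool:
    """True if the Sentinels outside `down_dc` still form a majority of all Sentinels."""
    total = sum(sentinels_per_dc.values())
    reachable = total - sentinels_per_dc[down_dc]
    return reachable >= majority(total)


# The even split described above: 5 Sentinels per data center.
even_layout = {"dc-1": 5, "dc-2": 5}
# Hypothetical uneven split: the majority lives in dc-1.
uneven_layout = {"dc-1": 6, "dc-2": 5}

for name, layout in (("even", even_layout), ("uneven", uneven_layout)):
    for dc in layout:
        ok = can_authorize_failover(layout, dc)
        print(f"{name} layout, {dc} down: failover possible = {ok}")
```

With the even split, losing either data center leaves 5 of 10 Sentinels, which is below the majority of 6. With the uneven split, the setup survives the loss of dc-2 but not the loss of dc-1.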
Alternatively, running Redis in Cluster mode across the data centers, without Sentinel, seems to work for us.
We can have 3 masters in one data center, each with 5 slaves: 2 of each master's slaves are located in one data center and the other 3 in the other data center.
The reason for placing all masters in one data center is that we don't want to do additional work (especially manual data resharding) in that data center's switch/failover scenarios.
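The layout described above can be written down as a small Python sketch that just counts nodes. It assumes the masters live in dc-1, with 2 slaves per master in dc-1 and 3 in dc-2; which data center gets the 2 and which gets the 3 is my assumption here, and all names are hypothetical. How Redis Cluster itself reacts to an outage (slave promotion and so on) is not modelled.

```python
from collections import Counter
from typing import List, NamedTuple


class Node(NamedTuple):
    shard: int  # index of the master this node belongs to
    role: str   # "master" or "slave"
    dc: str     # data center the node runs in


def build_layout(shards: int = 3,
                 slaves_in_master_dc: int = 2,
                 slaves_in_other_dc: int = 3,
                 master_dc: str = "dc-1",
                 other_dc: str = "dc-2") -> List[Node]:
    """Build the planned topology: one master plus 5 slaves per shard."""
    nodes: List[Node] = []
    for shard in range(shards):
        nodes.append(Node(shard, "master", master_dc))
        nodes += [Node(shard, "slave", master_dc)] * slaves_in_master_dc
        nodes += [Node(shard, "slave", other_dc)] * slaves_in_other_dc
    return nodes


def surviving_copies(nodes: List[Node], down_dc: str) -> Counter:
    """Per shard, how many nodes (any role) remain outside `down_dc`."""
    return Counter(n.shard for n in nodes if n.dc != down_dc)


def surviving_masters(nodes: List[Node], down_dc: str) -> int:
    """How many of the original masters remain outside `down_dc`."""
    return sum(1 for n in nodes if n.role == "master" and n.dc != down_dc)


layout = build_layout()
print("nodes per data center:", dict(Counter(n.dc for n in layout)))
for dc in ("dc-1", "dc-2"):
    print(f"{dc} down: copies per shard = {dict(surviving_copies(layout, dc))}, "
          f"original masters left = {surviving_masters(layout, dc)}")
```

Under these assumptions each data center holds 9 of the 18 nodes and every shard keeps 3 copies when either data center is shut down, but shutting down dc-1 leaves none of the original masters running.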
I have attached a diagram of our intended setup.
Is our solution sound? Are there any best practices for our case? We would also like to hear about alternative solutions.
Thanks in advance.