Working with bpm’online and Redis Sentinel

The Redis Sentinel mechanism is used to provide fault tolerance for the Redis repositories used by bpm’online. It provides the following features:

  • Monitoring. Sentinel continuously checks that the Redis master and slave instances are working as expected.

  • Notifications. Sentinel alerts system administrators if any instance-related errors occur in Redis.

  • Automatic failover. If the Redis master instance is not working correctly, Sentinel promotes one of the slave instances to master and reconfigures the rest to work with the new master instance. Bpm’online is also notified about the new Redis connection address.

Attention

Bpm’online does not support Redis clusters.

Redis Sentinel is a distributed system designed to run as multiple cooperating instances. This approach has the following advantages:

  • The fault is registered only if multiple Sentinel instances (which form a quorum) agree that the master instance is unavailable in Redis. This is done to reduce the number of false alerts.

  • The Sentinel mechanism remains available even if some Sentinel instances are not responding or have stopped working altogether. This increases fault tolerance.

Contents

Notable Sentinel specifics

Minimal fault tolerance requirements for Redis Sentinel

Network separation issues

System requirements

Installing and configuring Sentinel

Configuring bpm’online to work with Redis Sentinel

Notable Sentinel specifics

  • At least three Sentinel instances are required for a robust deployment. Place these instances on computers or virtual machines that are expected to fail independently of each other, for example, on computers located in different network zones.

  • Due to asynchronous replication, the distributed system (Sentinel + Redis) does not guarantee that all data will be saved if a failure does occur.

  • The fault tolerance of the configuration should be regularly monitored and further confirmed through tests that simulate failures.

  • Docker port remapping creates certain issues with Sentinel processes (see the "Sentinel, Docker, NAT, and possible issues" block of the Sentinel documentation).

Minimal fault tolerance requirements for Redis Sentinel

Legend:

  • M1, M2 – Redis master instances.

  • R1, R2, R3 – Redis slave instances.

  • S1, S2, S3 – Sentinel instances.

  • C1 – bpm’online application.

  • [M2] – promoted instance (e.g., from slave to master).

We recommend using a configuration with at least three Redis and Sentinel instances (see the "Example 2: basic setup with three boxes" block of the Sentinel documentation). This configuration is based on three nodes (computers or virtual machines), each running an instance of both Redis and Sentinel (Fig. 1). The quorum is set to 2: two Sentinel instances (for example, S2 and S3) must agree that the current master instance is unavailable before a failover can start.
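The quorum rule can be illustrated with a short Python sketch. This is a simplified model rather than Sentinel's actual implementation, and both function names are hypothetical. It reflects two rules from the Sentinel documentation: the master is treated as failed once at least `quorum` Sentinel instances report it as unreachable, and the failover itself must be authorized by a majority of all Sentinel instances.

```python
def master_is_objectively_down(down_votes: int, quorum: int) -> bool:
    """True when enough Sentinel instances report the master as unreachable."""
    return down_votes >= quorum


def failover_can_be_authorized(reachable_sentinels: int, total_sentinels: int) -> bool:
    """True when a majority of all known Sentinel instances can vote."""
    return reachable_sentinels > total_sentinels // 2


# Three-node setup with quorum = 2: the node hosting M1 and S1 goes down,
# so only S2 and S3 remain to detect the failure and vote.
print(master_is_objectively_down(down_votes=2, quorum=2))
print(failover_can_be_authorized(reachable_sentinels=2, total_sentinels=3))
```

With only one reachable Sentinel instance out of three, both checks would fail, which is why a single surviving node cannot start a failover on its own.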

Fig. 1 The three nodes configuration: quorum = 2


During regular operation, the bpm’online client application writes its data to the master instance (M1). This data is then replicated asynchronously to the slave instances (R2 and R3).

If the Redis master instance (M1) becomes unavailable, the Sentinel instances that form a quorum (for example, S2 and S3) agree that a failure has occurred and start the failover process. One of the Redis slave instances (R2 or R3) is promoted to master, and the application uses it instead of the previous master instance.

Attention

Any Sentinel configuration that uses asynchronous data replication carries a risk of losing records. Records are lost if they had not yet been replicated to the slave instance that is promoted to master.
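The following Python sketch illustrates how such a loss can occur. It is a deliberately simplified model: it assumes that the slave instance holds an exact prefix of the writes accepted by the master, and the function and variable names are hypothetical.

```python
def lost_writes(master_log, replica_log):
    """Writes accepted by the old master that the promoted slave never received.

    Assumes replica_log is a prefix of master_log (simplified model).
    """
    return master_log[len(replica_log):]


master_log = ["SET a 1", "SET b 2", "SET c 3"]
replica_log = master_log[:2]  # the last write was still being replicated

print(lost_writes(master_log, replica_log))  # prints ['SET c 3']
```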

Note

Other possible fault tolerant configurations are described in the Sentinel documentation.

Network separation issues

If the network connection is lost, there is a risk that bpm’online will continue to work with the old Redis master instance (M1), while the newly promoted master instance ([M2]) has already been assigned (Fig. 2).

Fig. 2 Network separation


To reduce this risk, configure the master instance to stop accepting writes when it can no longer replicate data to a minimum number of slave instances. To do this, set the following values in the redis.conf configuration file of the Redis master instance:

min-slaves-to-write 1
min-slaves-max-lag 10

As a result, if the Redis master instance (M1) cannot transfer data to at least one slave instance for more than 10 seconds, it stops accepting writes. Once the Sentinel instances that form a quorum (S2 and S3) complete the failover, the bpm’online application (C1) is reconfigured to work with the new master instance ([M2]).
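The effect of these two settings can be modelled with the Python sketch below. It is an approximation of the rule described in the Redis documentation, not Redis code: a slave instance counts as healthy if its last acknowledgement is no older than min-slaves-max-lag seconds, and setting either value to 0 disables the check. The function name and its arguments are hypothetical.

```python
def master_accepts_writes(replica_lags_sec, min_slaves_to_write=1, min_slaves_max_lag=10):
    """Return True if the master would still accept writes.

    replica_lags_sec: seconds since each connected slave last acknowledged.
    """
    if min_slaves_to_write == 0 or min_slaves_max_lag == 0:
        return True  # the check is disabled
    healthy = [lag for lag in replica_lags_sec if lag <= min_slaves_max_lag]
    return len(healthy) >= min_slaves_to_write


print(master_accepts_writes([2, 3]))    # both slaves are up to date
print(master_accepts_writes([11, 30]))  # every slave lags by more than 10 seconds
print(master_accepts_writes([]))        # the master is cut off from all slaves
```

In the network separation scenario, the isolated master instance corresponds to the last two calls: it stops accepting writes, so the application cannot keep writing data that would later be discarded.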

Attention

Once the master instance has stopped accepting writes, it will not resume operation automatically, even after the network is restored. If the remaining Redis slave instance (R3) also becomes unavailable, the system will stop working altogether.

System requirements

Redis is an in-memory database, so RAM capacity and speed are the main requirements for its correct operation. Since Redis is a single-threaded application that uses a single processor core, one node (computer or virtual machine) with a dual-core processor is required to run a single Redis instance. Sentinel instances require relatively few resources and can run on the same node as Redis.

We recommend deploying Redis and Sentinel on Linux.

The table below shows the recommended system requirements for a single node (computer or virtual machine), depending on the number of bpm'online users.

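Number of users | CPU                  | RAM   | HDD   | Network connection
----------------|----------------------|-------|-------|-------------------
up to 100       | Intel Xeon E3-1225v5 | 2 GB  | 10 GB | 1 Gbit/s
100 – 200       | Intel Xeon E3-1225v5 | 4 GB  | 10 GB | 1 Gbit/s
200 – 500       | Intel Xeon E3-1225v5 | 6 GB  | 10 GB | 1 Gbit/s
500 – 750       | Intel Xeon E3-1225v5 | 8 GB  | 10 GB | 1 Gbit/s
750 – 1000      | Intel Xeon E3-1225v5 | 12 GB | 10 GB | 10 Gbit/s
1000 – 2000     | Intel Xeon E3-1225v5 | 16 GB | 10 GB | 10 Gbit/s
2000 – 3000     | Intel Xeon E3-1225v5 | 20 GB | 10 GB | 10 Gbit/s
3000 – 5000     | Intel Xeon E3-1225v5 | 24 GB | 10 GB | 10 Gbit/s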

Installing and configuring Sentinel

Redis Sentinel comes bundled with the Redis distribution. The installation process is described in the Redis documentation. We recommend using the latest version of Redis.

Please refer to the "Quick tutorial" section of the Sentinel documentation to learn more about Sentinel configuration.
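For orientation, the Python sketch below generates a minimal sentinel.conf for one Sentinel instance. The directives (`port`, `sentinel monitor`, `sentinel down-after-milliseconds`, `sentinel failover-timeout`, `sentinel parallel-syncs`) come from the Sentinel documentation. The helper name, the address 192.0.2.10, the ports, and all timeout values are illustrative assumptions, so adjust them to your environment.

```python
def render_sentinel_conf(master_name, master_host, master_port, quorum,
                         sentinel_port=26379, down_after_ms=5000,
                         failover_timeout_ms=60000, parallel_syncs=1):
    """Build the text of a minimal sentinel.conf (illustrative values only)."""
    lines = [
        f"port {sentinel_port}",
        # The master to monitor and the quorum needed to consider it failed.
        f"sentinel monitor {master_name} {master_host} {master_port} {quorum}",
        # How long the master must be unreachable before it is considered down.
        f"sentinel down-after-milliseconds {master_name} {down_after_ms}",
        f"sentinel failover-timeout {master_name} {failover_timeout_ms}",
        # How many slaves are reconfigured at the same time after a failover.
        f"sentinel parallel-syncs {master_name} {parallel_syncs}",
    ]
    return "\n".join(lines) + "\n"


# Placeholder address from the documentation range; replace with the real master.
print(render_sentinel_conf("mymaster", "192.0.2.10", 6379, quorum=2,
                           sentinel_port=26380))
```

The master name passed here ("mymaster") is the value that the bpm’online connection string later refers to as masterName.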

Configuring bpm’online to work with Redis Sentinel

Customized libraries

Contact bpm’online support to obtain customized libraries.

Configuration files

ConnectionStrings.config

Specify the following parameters in the "redisSentinel" connection string:

  • sentinelHosts – a comma-separated list of any number of Sentinel instance addresses and ports in the <address>:<port> format.

  • masterName – the name of the monitored Redis master, as specified in the Sentinel configuration.

Connection string example:

<add name="redisSentinel" connectionString="sentinelHosts=localhost:26380,localhost:26381,localhost:26382;masterName=mymaster;scanForOtherSentinels=false;db=1;maxReadPoolSize=250;maxWritePoolSize=250" />
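Before deploying, it can be useful to check that the connection string is well formed. The Python sketch below is a hypothetical validation helper, not the parser that bpm’online uses. It assumes the format shown above: `key=value` pairs separated by semicolons, with sentinelHosts holding comma-separated `<address>:<port>` entries.

```python
def parse_sentinel_connection_string(connection_string):
    """Split the connection string into settings and validate sentinelHosts."""
    settings = {}
    for part in connection_string.split(";"):
        if not part.strip():
            continue
        key, _, value = part.partition("=")
        settings[key.strip()] = value.strip()

    hosts = []
    for entry in settings.get("sentinelHosts", "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        address, _, port = entry.rpartition(":")
        if not address or not port.isdigit():
            raise ValueError(f"expected <address>:<port>, got {entry!r}")
        hosts.append((address, int(port)))
    if not hosts:
        raise ValueError("sentinelHosts must list at least one Sentinel instance")
    settings["sentinelHosts"] = hosts
    return settings


example = ("sentinelHosts=localhost:26380,localhost:26381,localhost:26382;"
           "masterName=mymaster;scanForOtherSentinels=false;db=1;"
           "maxReadPoolSize=250;maxWritePoolSize=250")
parsed = parse_sentinel_connection_string(example)
print(parsed["sentinelHosts"])
print(parsed["masterName"])
```

Listing all Sentinel instances in sentinelHosts lets the application still locate the current master when one of the Sentinel instances is unavailable.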

Web.config

Please make sure that the appSettings section includes the following parameters:

  • Feature-UseRetryRedisOperation – enables the internal bpm’online mechanism that retries any Redis operations that ended with errors.

  • SessionLockTtlSec – the lifespan of the session lock key.

  • SessionLockTtlProlongationIntervalSec – the period for which the lifespan of the session lock key is prolonged.

These settings must have the following values:

<add key="Feature-UseRetryRedisOperation" value="true" />
<add key="SessionLockTtlSec" value="60" />
<add key="SessionLockTtlProlongationIntervalSec" value="20" />

Please make sure that the redis section includes the following parameters:

  • enablePerformanceMonitor – enables the mechanism for monitoring the execution time of Redis operations. We recommend enabling this mechanism for debugging and troubleshooting. Default value: “Off” (enabling this mechanism might affect application performance).

  • executionTimeLoggingThresholdSec – if the execution of a Redis operation exceeds this threshold, it will be recorded in the log. By default, “5 seconds”.

  • featureUseCustomRedisTimeouts – enables the use of timeouts specified in the configuration file. Default value: “Off”.

  • clientRetryTimeoutMs – all Redis operations that result in errors will be retried using the same client until they reach the timeout specified in this parameter. This eliminates errors caused by short network interruptions without having to get a new client from the pool. By default, “4000 milliseconds”.

  • clientSendTimeoutMs – the time allocated for sending requests to the Redis server. By default, “3000 milliseconds”.

  • clientReceiveTimeoutMs – the time allocated for receiving responses from the Redis server. By default, “3000 milliseconds”.

  • clientConnectTimeoutMs – the time allocated for establishing a network connection to the Redis server. By default, “100 milliseconds”.

  • deactivatedClientsExpirySec – delay in deleting failed Redis clients after they are removed from the pool. The “0” value represents immediate removal. By default, “0”.

  • operationRetryIntervalMs – if the process for retrying a failed operation does not result in a successful execution, it will be postponed for the time period specified here. The operation will be performed with a new client, which may have already established a connection to the new master instance. By default, “1000 milliseconds”.

  • operationRetryCount – the number of repeated attempts to perform an operation with a new Redis client. By default, “10”.

These settings must have the following values:

<redis connectionStringName="redisSentinel" enablePerformanceMonitor="false" executionTimeLoggingThresholdSec="5" featureUseCustomRedisTimeouts="true" clientRetryTimeoutMs="4000" clientReceiveTimeoutMs="3000" clientSendTimeoutMs="3000" clientConnectTimeoutMs="100" deactivatedClientsExpirySec="0" operationRetryIntervalMs="1000" operationRetryCount="10" />
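To show how the retry parameters interact, the Python sketch below models the behavior described above. It is a hypothetical illustration based only on the parameter descriptions, not bpm’online's actual implementation. The function name, the `new_client` factory, and the use of `ConnectionError` to signal a failed Redis operation are all assumptions.

```python
import time


def run_with_retries(operation, new_client, *,
                     client_retry_timeout_ms=4000,
                     operation_retry_interval_ms=1000,
                     operation_retry_count=10,
                     clock=time.monotonic, sleep=time.sleep):
    """Run operation(client), retrying on ConnectionError (simplified model)."""
    last_error = None
    # One initial attempt plus operation_retry_count retries with new clients.
    for attempt in range(operation_retry_count + 1):
        if attempt > 0:
            # Postpone the retry so that a failover has time to complete.
            sleep(operation_retry_interval_ms / 1000)
        client = new_client()
        deadline = clock() + client_retry_timeout_ms / 1000
        while True:
            try:
                return operation(client)
            except ConnectionError as error:
                last_error = error
                # A real client would pause briefly between these attempts.
                if clock() >= deadline:
                    break  # give up on this client and take a new one
    raise last_error


# Example: the first two calls fail, the third succeeds on the same client.
failures = [ConnectionError("reset"), ConnectionError("reset")]


def flaky_get(client):
    if failures:
        raise failures.pop()
    return "value"


print(run_with_retries(flaky_get, new_client=object))  # prints "value"
```

With the default values, a short network interruption is absorbed by the first client within 4 seconds, while a longer outage is handled by up to 10 further attempts spaced 1 second apart, each of which may reach the newly promoted master instance.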

See also

Installing bpm’online

bpm’online installation FAQ

 
