Cache Handling in the Device Event Service
Overview
The device event service (DES) needs to make sure that the following types of entities are up to date:
- Active In-app campaigns
  - The list of active campaigns is cached and needs to be updated if a campaign changes status.
- Non-terminal campaign ids
  - The set of non-terminal campaign ids is cached and needs to be updated if a campaign changes status.
- Message-content
  - Static content for a specific campaign is cached and needs to be updated if the content of a campaign changes.
- Campaign details
  - Campaign details without the content, which push2inapp needs to determine whether the content is personalized.
- Applications
  - The applications used are also cached. Even though they rarely change, the cache still needs to be updated whenever an app is changed in any way.
Problem With Current Behaviour
Right now each instance of the device event service listens to entity events and invalidates the cache when an app or campaign is modified. When a large number of campaigns are modified at once, this leads to significant load on the DB and sometimes to the service being unavailable for a short time.
New Concept Approach 1
We would move the responsibility for managing the cache from every process to a single worker. One worker would manage the Redis cache, and the in-memory cache would use such a low expiry that it would not need any invalidation beyond that.
This would remove the need for the processes themselves to subscribe to the entity events and we could have a single worker consuming the events and making sure that the cache is up to date.
The worker could also do more than just invalidate the cache: it could pre-fill it with the updated value so that the web workers won’t have any cache misses.
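A minimal sketch of this split, assuming a plain dict stands in for Redis. The names `TtlCache`, `CacheWorker`, `WebProcess`, the key format, and the loader callback are all hypothetical and only illustrate the idea: the worker is the only writer to the shared cache, and web processes read through an in-memory cache with a short time-to-live.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TtlCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller reloads it.
            del self._entries[key]
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


class CacheWorker:
    """Single consumer of entity events and only writer to the shared cache."""

    def __init__(self, shared_cache: Dict[str, Any],
                 load_active_campaigns: Callable[[str], list]) -> None:
        self._shared = shared_cache  # assumption: stands in for Redis
        self._load = load_active_campaigns

    def on_campaign_event(self, application_id: str) -> None:
        # Pre-fill with the fresh value instead of only deleting the key.
        key = f"active-campaigns:{application_id}"
        self._shared[key] = self._load(application_id)


class WebProcess:
    """Reads through a short-lived local cache; never listens to events."""

    def __init__(self, shared_cache: Dict[str, Any], local_cache: TtlCache) -> None:
        self._shared = shared_cache
        self._local = local_cache

    def active_campaigns(self, application_id: str) -> list:
        key = f"active-campaigns:{application_id}"
        value = self._local.get(key)
        if value is None:
            value = self._shared.get(key, [])
            self._local.set(key, value)
        return value
```

In this sketch a web process can serve a stale value for at most one time-to-live period after the worker has refreshed the shared cache.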
Modified Entity Properties
In order to decide whether specific caches need to be updated, we could include the list of an entity's modified properties when sending the entity event.
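As a rough illustration, the decision could be a lookup from modified properties to dependent caches. The property names and the dependency table below are assumptions made up for this sketch, not the actual campaign schema.

```python
from typing import Dict, FrozenSet, Iterable, Set

# Hypothetical mapping: which campaign properties each cache depends on.
CACHE_DEPENDENCIES: Dict[str, FrozenSet[str]] = {
    "active-campaigns": frozenset({"status", "priority"}),
    "non-terminal-campaign-ids": frozenset({"status"}),
    "message-content": frozenset({"content"}),
    "campaign-details": frozenset({"content", "name"}),
}


def caches_to_update(modified_properties: Iterable[str]) -> Set[str]:
    """Return the caches that depend on at least one modified property."""
    modified = set(modified_properties)
    return {
        cache
        for cache, dependencies in CACHE_DEPENDENCIES.items()
        if dependencies & modified
    }
```

An event whose modified properties match no entry in the table would then leave all caches untouched.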
New Concept Approach 2
Approach 1 relies on the time-to-live of the in-memory cache to keep it fresh and does not actively invalidate the in-memory caches, which has the following constraints:
- If the time-to-live is too low, the load on Redis increases.
- If the time-to-live is too high, consumers might see campaign content which is already outdated.
Therefore approach 2 actively invalidates the in-memory caches after the Redis cache has been refreshed.
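The ordering described here can be sketched as follows. `InvalidationBus` is a hypothetical in-process stand-in for the fanout topic, a dict stands in for Redis, and `WebPod` is an illustrative name; none of these are taken from the existing code base.

```python
from typing import Any, Callable, Dict, List


class InvalidationBus:
    """In-process stand-in for a fanout topic: every subscriber gets every message."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, key: str) -> None:
        for callback in list(self._subscribers):
            callback(key)


class WebPod:
    """Keeps a local in-memory cache and drops entries when told to."""

    def __init__(self, shared_cache: Dict[str, Any], bus: InvalidationBus) -> None:
        self._shared = shared_cache  # assumption: stands in for Redis
        self._local: Dict[str, Any] = {}
        bus.subscribe(self._invalidate)

    def _invalidate(self, key: str) -> None:
        self._local.pop(key, None)

    def get(self, key: str) -> Any:
        if key not in self._local:
            self._local[key] = self._shared.get(key)
        return self._local[key]


def refresh_and_invalidate(shared_cache: Dict[str, Any], bus: InvalidationBus,
                           key: str, fresh_value: Any) -> None:
    # Order matters: refresh the shared cache first, so that a pod which
    # reloads right after the invalidation already reads the new value.
    shared_cache[key] = fresh_value
    bus.publish(key)
```

With active invalidation, the in-memory time-to-live could be chosen longer, since it would only act as a fallback for lost invalidation messages.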
Pub/Sub Subscriptions
Every web pod needs its own subscription to get the same effect as the fanout exchange in RabbitMQ. Each pod has to create the subscription on startup and delete it on shutdown. The subscription name can be created in a similar way as for the RabbitMQ queues, e.g.
${prefix}-des-web-${uuid}
The subscription expiration period can be used to ensure that the subscription is cleaned up even if its deletion fails during shutdown of the pod.
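The lifecycle could look roughly like the sketch below. `InMemorySubscriptionAdmin` is a hypothetical stand-in for whatever admin client the Pub/Sub system provides, and its `create` and `delete` methods are assumptions, not a real library API.

```python
import uuid
from contextlib import contextmanager
from typing import Dict, Iterator


def subscription_name(prefix: str) -> str:
    """Build a unique per-pod name following ${prefix}-des-web-${uuid}."""
    return f"{prefix}-des-web-{uuid.uuid4()}"


class InMemorySubscriptionAdmin:
    """Hypothetical stand-in for a Pub/Sub admin client."""

    def __init__(self) -> None:
        self.subscriptions: Dict[str, int] = {}

    def create(self, name: str, expiration_seconds: int) -> None:
        self.subscriptions[name] = expiration_seconds

    def delete(self, name: str) -> None:
        self.subscriptions.pop(name, None)


@contextmanager
def pod_subscription(admin: InMemorySubscriptionAdmin, prefix: str,
                     expiration_seconds: int) -> Iterator[str]:
    """Create the subscription on startup and delete it on shutdown."""
    name = subscription_name(prefix)
    admin.create(name, expiration_seconds)
    try:
        yield name
    finally:
        # Best effort only: if the pod is killed before this runs, the
        # expiration period is what eventually removes the subscription.
        admin.delete(name)
```

The expiration period in this sketch is just a value stored by the fake client; in a real setup it would have to be longer than any expected pause in consumption so that a live pod does not lose its subscription.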
Dedicated Events
In In-app there is a global priority across all campaigns. What actually matters is the priority scoped per application, since every request comes from a certain application. There are two cases:
- A new campaign is created and gets priority 1, and the priority of all other campaigns is increased, i.e. the relative priority changes just for the application of the new campaign. But since a new campaign is in design, the priority of active campaigns is not affected.
- When changing priorities in the UI, multiple campaigns with different applications can be affected, i.e. the cache has to be invalidated only for the applications whose priorities change, since the relative priority for the unaffected applications stays the same.
With dedicated events for priority updates that include the affected applications, DES can clear just the affected caches of active campaigns.
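A minimal sketch of such an event and its handler, assuming the active-campaigns cache is keyed by application id. The event name `PrioritiesUpdated` and its field are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class PrioritiesUpdated:
    """Hypothetical dedicated event listing the applications whose priorities changed."""
    affected_application_ids: FrozenSet[str]


def handle_priorities_updated(event: PrioritiesUpdated,
                              active_campaigns_cache: Dict[str, List[str]]) -> None:
    """Drop cached active campaigns only for the affected applications."""
    for application_id in event.affected_application_ids:
        active_campaigns_cache.pop(application_id, None)
```

Applications that are not listed in the event keep their cached entries, which is the point of scoping the event per application.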
Enrich Events With Campaign Properties
Currently the CampaignEventHandler does a PostgreSQL query when the entity event is handled. The properties needed are:
- status, to ignore campaigns which are in design
- applicationId, to invalidate the caches just for that application id
By providing these fields in the event these PostgreSQL queries can be avoided. Additional fields could be provided during the implementation to skip invalidation for caches which are not affected by the event.
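One possible shape for an enriched event and its handler is sketched below. The `CampaignEvent` class, the `"design"` status value, and the `lookup_campaign` fallback are assumptions for illustration and do not describe the existing CampaignEventHandler.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class CampaignEvent:
    """Hypothetical entity event, optionally enriched with campaign properties."""
    campaign_id: str
    status: Optional[str] = None
    application_id: Optional[str] = None


def handle_campaign_event(
    event: CampaignEvent,
    caches_by_application: Dict[str, List[str]],
    lookup_campaign: Callable[[str], Tuple[str, str]],
) -> bool:
    """Invalidate the cache of the event's application; return True if invalidated."""
    status, application_id = event.status, event.application_id
    if status is None or application_id is None:
        # Fallback for events that are not enriched: one database lookup
        # returning (status, application_id).
        status, application_id = lookup_campaign(event.campaign_id)
    if status == "design":  # assumed status value for campaigns in design
        return False
    caches_by_application.pop(application_id, None)
    return True
```

Keeping the fallback lookup would allow producers and consumers of the event to be deployed independently while the new fields are rolled out.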
Second Level For All Caches
Currently active campaigns and non-terminal campaign ids use a second-level cache. When web pods need to serve the message-content for the same campaign, this results in two queries to PostgreSQL, since message-content and campaign details are stored only in the first-level in-memory cache. To further reduce the number of queries to PostgreSQL, the message-content and campaign details could also be stored in Redis.
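The effect of a second level can be illustrated with a small read-through sketch. A dict shared between pods stands in for Redis, and `TwoLevelCache` and the loader callback are hypothetical names.

```python
from typing import Callable, Dict


class TwoLevelCache:
    """Read-through lookup: in-memory first, shared cache second, database last."""

    def __init__(self, second_level: Dict[str, str],
                 load_from_db: Callable[[str], str]) -> None:
        self._first: Dict[str, str] = {}
        self._second = second_level  # assumption: stands in for Redis
        self._load = load_from_db

    def get(self, campaign_id: str) -> str:
        if campaign_id in self._first:
            return self._first[campaign_id]
        if campaign_id in self._second:
            value = self._second[campaign_id]
        else:
            # Only the first pod to miss both levels queries the database.
            value = self._load(campaign_id)
            self._second[campaign_id] = value
        self._first[campaign_id] = value
        return value
```

When two pods share the same second level, only the first request for a campaign reaches the database in this sketch; the second pod is served from the shared cache.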