Caching

HF · Jan 9, 2021

A well-designed caching layer can significantly improve the customer experience. A cache is like short-term memory: it has limited storage space, its entries usually carry a TTL (time to live), and it typically sits as close to the user-facing frontend as possible.

Challenge: cache inconsistency

Each request to a server node is served either from the cache or from the DB server. The cache can live on the node's disk or in its memory. As we scale out to more nodes, each node ends up caching results for different queries, which increases the cache miss rate across nodes. There are two common ways to deal with this: 1) a CDN; 2) cache invalidation.

Content Distribution Network (CDN)

A CDN is used for sites that serve large volumes of static media.

If the system is not yet large enough to justify its own CDN, we can serve the static data from a separate subdomain using a lightweight HTTP server such as Nginx, and later cut over that subdomain's DNS from our servers to a CDN.

Cache Invalidation

It is important to keep the cache consistent with the source of truth, so we usually set a TTL on cached data or provide a mechanism to invalidate cache entries. Whenever the underlying data is written, the corresponding cache entry must be updated or invalidated; otherwise the cache will serve stale, inconsistent data.
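
As a rough illustration, here is a minimal sketch of this idea in Python, combining a TTL with invalidate-on-write. The in-memory dict cache and the `db` object with `get`/`put` methods are illustrative assumptions, not a real library API.

```python
import time

class TTLCache:
    def __init__(self, db, ttl_seconds=60):
        self.db = db                      # hypothetical backing store with get/put
        self.ttl = ttl_seconds
        self.cache = {}                   # key -> (value, expires_at)

    def read(self, key):
        entry = self.cache.get(key)
        if entry and entry[1] > time.time():
            return entry[0]               # fresh cache hit
        value = self.db.get(key)          # expired or missing -> go to the DB
        self.cache[key] = (value, time.time() + self.ttl)
        return value

    def write(self, key, value):
        self.db.put(key, value)
        # Invalidate so the next read re-fetches the newly written value.
        self.cache.pop(key, None)
```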

Some common cache eviction policies, which decide what to drop when the cache is full (a minimal LRU sketch follows the list):

  1. First in, first out (FIFO)
  2. First in, last out (FILO)
  3. Least recently used (LRU)
  4. Least frequently used (LFU)

etc…
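
As an example of the most widely used of these, here is a minimal LRU sketch in Python; the `capacity` parameter is an illustrative assumption.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity=128):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)         # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry
```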

Different Ways of Caching

Here are three common ways to implement the caching (write) logic.

Write-through cache

Each write goes to the cache and the DB server at the same time. This is good for data consistency, but it is slow if the system is write-heavy.
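
A minimal write-through sketch, again assuming an in-memory dict cache and a hypothetical `db` object with `get`/`put` methods:

```python
class WriteThroughCache:
    def __init__(self, db):
        self.db = db
        self.cache = {}

    def write(self, key, value):
        # Write to the cache and the backing DB in the same operation,
        # so subsequent reads are consistent.
        self.cache[key] = value
        self.db.put(key, value)

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.db.get(key)
        self.cache[key] = value
        return value
```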

Write-around cache

Each write goes directly to the DB server, bypassing the cache. This keeps the cache from being flooded with data that is rarely re-read. The downside is that reading recently written data results in a cache miss and must go to the DB server.
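
A minimal write-around sketch under the same assumptions:

```python
class WriteAroundCache:
    def __init__(self, db):
        self.db = db
        self.cache = {}

    def write(self, key, value):
        self.db.put(key, value)       # bypass the cache on writes
        self.cache.pop(key, None)     # drop any stale cached copy

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.db.get(key)      # recent writes miss and hit the DB
        self.cache[key] = value
        return value
```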

Write-back cache

Each write goes only to the cache and then returns; the cache later persists the data to the DB server asynchronously. The advantages are that both reads and writes are very fast and all data can still be queried through the cache. The risk is that if the cache crashes before the data is persisted to the DB server, that data is lost.
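
A minimal write-back sketch under the same assumptions; the explicit `flush()` stands in for whatever asynchronous persistence a real cache would use:

```python
class WriteBackCache:
    def __init__(self, db):
        self.db = db
        self.cache = {}
        self.dirty = set()            # keys written but not yet persisted

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)           # persisted later by flush(), not now

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.db.get(key)
        self.cache[key] = value
        return value

    def flush(self):
        for key in list(self.dirty):
            self.db.put(key, self.cache[key])
            self.dirty.discard(key)
```

In practice the flush would run in the background or on eviction; the dirty set is exactly the data at risk if the cache node crashes before flushing.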
