How to keep a cache consistent with the database in distributed systems?

Suppose our Users service cluster has 4 nodes, each responding to client requests behind a load balancer. Each node also keeps a cache that should stay consistent with the Users database. So how do we update each node’s cache when the Users database changes? This story introduces a resource-version-keeping approach that may be more economical.

Possible Solutions

There are two obvious solutions, and both have pros and cons.

The first is periodic refreshing: we use a ticker to fetch data from the database and reset the cache at a fixed interval. Easy as it is, it is expensive because it queries the database on every tick, which is unnecessary in most cases since the data may not have changed at all. If the queried table is large, this can cause serious performance issues.
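The periodic refresh described above could be sketched in Go roughly as follows. This is only an illustration: `User`, `UserCache`, `refreshEvery`, and the `load` callback are hypothetical names, and `load` stands in for whatever query the real service runs against the Users database.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// User is a hypothetical record type used only for illustration.
type User struct {
	ID   int
	Name string
}

// UserCache is an in-memory cache that is rebuilt wholesale on every refresh.
type UserCache struct {
	mu    sync.RWMutex
	users map[int]User
}

// Reset replaces the whole cache with freshly loaded data.
func (c *UserCache) Reset(users []User) {
	m := make(map[int]User, len(users))
	for _, u := range users {
		m[u.ID] = u
	}
	c.mu.Lock()
	c.users = m
	c.mu.Unlock()
}

// Get returns a cached user by ID.
func (c *UserCache) Get(id int) (User, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	u, ok := c.users[id]
	return u, ok
}

// refreshEvery reloads the cache on every tick until stop is closed.
// load is an assumed callback standing in for the real database query.
func refreshEvery(c *UserCache, interval time.Duration, load func() []User, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			c.Reset(load()) // queries the database even if nothing changed
		case <-stop:
			return
		}
	}
}

func main() {
	cache := &UserCache{}
	load := func() []User { return []User{{ID: 1, Name: "alice"}} }
	cache.Reset(load()) // initial fill
	stop := make(chan struct{})
	go refreshEvery(cache, 10*time.Second, load, stop)
	u, ok := cache.Get(1)
	fmt.Println(u.Name, ok)
	close(stop)
}
```

Note that `load` runs on every tick regardless of whether anything changed, which is exactly the cost this solution suffers from.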

So can we use a central cache system like Redis instead? Yes. With Redis we only query the database when the cache misses. But a central system also brings maintenance and network costs. Sometimes all we need is a lightweight in-memory cache. Can we achieve that?
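For comparison, the "query the database only on a cache miss" pattern might look like the sketch below. To keep it runnable without a server, the central cache is modeled by a hypothetical `KV` interface backed by a plain map; this is not the API of any real Redis client, and `getUser` and `queryDB` are likewise assumed names.

```go
package main

import "fmt"

// KV is a hypothetical minimal interface for a central cache such as Redis.
// It is NOT the API of any real Redis client library.
type KV interface {
	Get(key string) (string, bool)
	Set(key, value string)
}

// mapKV is an in-process stand-in so the sketch runs without a server.
type mapKV map[string]string

func (m mapKV) Get(key string) (string, bool) { v, ok := m[key]; return v, ok }
func (m mapKV) Set(key, value string)         { m[key] = value }

// getUser reads through the cache and queries the database only on a miss.
// queryDB is an assumed callback standing in for the real Users lookup.
func getUser(cache KV, id string, queryDB func(string) string) string {
	key := "user:" + id
	if v, ok := cache.Get(key); ok {
		return v // cache hit: no database call
	}
	v := queryDB(id)
	cache.Set(key, v) // fill the cache for later readers
	return v
}

func main() {
	dbCalls := 0
	queryDB := func(id string) string { dbCalls++; return "user-" + id }
	cache := mapKV{}
	fmt.Println(getUser(cache, "1", queryDB))
	fmt.Println(getUser(cache, "1", queryDB))
	fmt.Println("database calls:", dbCalls)
}
```

With a real central cache, every `Get` and `Set` here would be a network round trip, which is the cost mentioned above.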

Then comes our Resource Version Keeper approach. Let’s continue.

Resource Version Keeper

Let’s first explain some concepts in this approach:

  • Resource

Resource refers to any database content, such as all users’ information in the Users database. All of the users data as a whole is the users resource.

  • Version

Version refers to the state of a resource. Whenever any data in the database changes, such as when a user is inserted, updated, or deleted, the version changes.

  • VersionDB

VersionDB stores all resources’ version info.

  • UsersDB

UsersDB stores the actual users data.
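To make these concepts concrete, here is one possible way to model them in Go. It is a sketch under assumptions, not the actual resverKeeper API: `Resource`, `VersionDB`, `NewVersionDB`, `Get`, and `Bump` are illustrative names, the version is assumed to be an integer counter, and the store is an in-memory map where a real deployment would presumably use a small database table.

```go
package main

import (
	"fmt"
	"sync"
)

// Resource names a data set that is cached as a whole, e.g. "users".
type Resource string

// VersionDB is an in-memory stand-in for the store that holds one
// version number per resource (an assumption made for this sketch).
type VersionDB struct {
	mu       sync.Mutex
	versions map[Resource]int64
}

// NewVersionDB returns an empty version store.
func NewVersionDB() *VersionDB {
	return &VersionDB{versions: make(map[Resource]int64)}
}

// Get returns the current version of a resource (0 if it was never set).
func (v *VersionDB) Get(r Resource) int64 {
	v.mu.Lock()
	defer v.mu.Unlock()
	return v.versions[r]
}

// Bump increments the version of a resource and returns the new value.
// It is meant to be called after any insert, update, or delete.
func (v *VersionDB) Bump(r Resource) int64 {
	v.mu.Lock()
	defer v.mu.Unlock()
	v.versions[r]++
	return v.versions[r]
}

func main() {
	vdb := NewVersionDB()
	vdb.Bump("users") // initial data loaded
	fmt.Println(vdb.Get("users"))
	vdb.Bump("users") // a user was created
	fmt.Println(vdb.Get("users"))
}
```

Because VersionDB holds only one small value per resource, reading it is far cheaper than reading the resource itself.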

The lifecycle of a version change is described in the following example:

At first, all nodes have the same version as the database, as shown in Fig 1.

Fig 1

Both the nodes and the database are at version 1, so none of the nodes fetch users data from UsersDB.

A client wants to create a user and the request goes to node-1. So node-1 writes the new user to UsersDB and updates the version to 2 in VersionDB, see Fig 2.

Fig 2
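The write path on node-1 can be sketched as below. The two stores are modeled as in-memory maps, and `createUser` is a hypothetical helper. Writing the user before bumping the version is an assumption of this sketch; in a real database the two steps would likely need to be made atomic, for example inside a transaction.

```go
package main

import "fmt"

// UsersDB and VersionDB are in-memory stand-ins for the two stores.
type UsersDB map[int]string   // user ID -> user name
type VersionDB map[string]int // resource name -> version

// createUser writes the user first and bumps the "users" version afterwards,
// so a node that observes the new version reloads data that already
// contains the new user. It returns the new version.
func createUser(users UsersDB, versions VersionDB, id int, name string) int {
	users[id] = name
	versions["users"]++
	return versions["users"]
}

func main() {
	users := UsersDB{1: "alice"}
	versions := VersionDB{"users": 1}
	v := createUser(users, versions, 2, "bob")
	fmt.Println("users version is now", v)
}
```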

In their next checking loop, all nodes find that the version they hold in memory is behind the one in the database, as illustrated in Fig 3.

Fig 3

So all nodes begin to fetch data from the database and rebuild their caches, see Fig 4.

Fig 4

After successfully rebuilding its cache, each node updates its in-memory version to 2, see Fig 5. Then this cycle ends.

Fig 5
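One iteration of the checking loop walked through in Figs 3 to 5 might look like the following sketch. `Node`, `checkOnce`, and the two callbacks are hypothetical names; `currentVersion` stands in for a read from VersionDB and `fetchUsers` for a full read from UsersDB.

```go
package main

import "fmt"

// Node holds an in-memory cache of the users resource plus the version
// that the cache was built from.
type Node struct {
	Name    string
	Version int
	Cache   map[int]string
}

// checkOnce runs one iteration of the checking loop. It reads the current
// version (cheap) and, only when the node is behind, fetches all users
// (expensive) and rebuilds the cache. It reports whether a rebuild happened.
func (n *Node) checkOnce(currentVersion func() int, fetchUsers func() map[int]string) bool {
	latest := currentVersion()
	if latest <= n.Version {
		return false // cache is up to date, UsersDB is not touched
	}
	n.Cache = fetchUsers()
	// The version was read before the fetch, so if another write slips in
	// between, the next iteration simply triggers one more rebuild.
	n.Version = latest
	return true
}

func main() {
	dbVersion := 1
	dbUsers := map[int]string{1: "alice"}
	currentVersion := func() int { return dbVersion }
	fetchUsers := func() map[int]string {
		snapshot := make(map[int]string, len(dbUsers))
		for id, name := range dbUsers {
			snapshot[id] = name
		}
		return snapshot
	}

	node := &Node{Name: "node-2", Version: 1, Cache: fetchUsers()}
	fmt.Println(node.checkOnce(currentVersion, fetchUsers)) // versions match

	dbUsers[2] = "bob" // node-1 creates a user...
	dbVersion = 2      // ...and bumps the version
	fmt.Println(node.checkOnce(currentVersion, fetchUsers)) // node is behind
	fmt.Println(node.Version, len(node.Cache))
}
```

In a running node, `checkOnce` would be called repeatedly at the checking interval, for instance from a `time.Ticker` loop.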

In this approach we still need to check periodically whether the version has changed, but since VersionDB is relatively small, this cost is acceptable. Suppose the checking interval is 10 seconds; then within at most 10 seconds all nodes will detect the change and catch up with the database.

Conclusion

Resource Version Keeper is a lightweight caching approach, and I implemented it in Golang for your reference. You are welcome to give this utility a try :)

https://github.com/sceneryback/resverKeeper
