Various caching strategies for a REST API in Node.js

Introduction

One of the things that makes people love a website or application is its speed. Even users on a poor connection expect some minimum level of responsiveness. To provide that seamless experience, we must optimize the performance of our REST API calls, and this is where the concept of caching proves very useful.

Believe it or not, whether or not your application solves their use case, users will definitely remember its performance and speed.

Why do we need caching

The primary goal of caching is to save your server from doing the same work over and over again. Say, for example, you have a page that displays stock prices or a news feed. That page will be loaded and refreshed many times, and if the same data is fetched from the database on every single request, it puts a huge load on both the database and the server.

We need a better way, and that's where caching techniques come into the picture. So let's quickly start with a basic example.

Building a basic REST API

Let's create a new project and install all the necessary packages:

npm init
npm install --save express rethinkdb memory-cache flat-cache memcached redis

Since I have a working instance of RethinkDB, I will build this REST API around RethinkDB.

Let's start with a basic server.

const express = require('express');
const r = require('rethinkdb');
const PORT = process.env.PORT || 3000;
const app = express();

app.get('/users', (req, res) => {
  setTimeout(() => {
    let connection = null;
    r.connect({ host: '127.0.0.1', port: 28015 }, (err, conn) => {
      if (err) {
        throw err;
      }
      connection = conn;
      r.db('test').table('Users').run(connection, (err, cursor) => {
        if (err) {
          throw err;
        }
        cursor.toArray((err, result) => {
          if (err) {
            throw err;
          }
          connection.close();
          res.send( result );
        });
      });
    }); // this is intentionally wrapped in a setTimeout function to simulate a slow request
  }, 3000);
});

app.listen(PORT, () => {
  console.log(`App running on port ${PORT}`);
});

You can run this by calling:

node index.js

You can then call the REST API at: http://localhost:3000/users
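
To see how long each request takes (handy for the comparison at the end of this article), you can, for example, ask curl to print the total time:

curl -w "\nTotal time: %{time_total}s\n" http://localhost:3000/users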

So far our single-route app works fine. But notice that every time you call this API, it takes almost the same amount of time to load the same data. Now let's take a look at how we can optimize the performance using some basic in-memory caching.

Using memory-cache for in-memory caching

We have already installed the memory-cache package. Now let's see how we can use it.

const cache = require('memory-cache');
let memCache = new cache.Cache();
let cacheMiddleware = (ttl) => {
  return (req, res, next) => {
    let key = '__express__' + (req.originalUrl || req.url);
    let cacheContent = memCache.get(key);
    if (cacheContent) {
      // cache hit: return the stored response without touching the database
      res.send(cacheContent);
      return;
    } else {
      // cache miss: wrap res.send so the response body is cached before it is sent
      res.sendResponse = res.send;
      res.send = (body) => {
        memCache.put(key, body, ttl * 1000); // ttl is given in seconds, memory-cache expects ms
        res.sendResponse(body);
      };
      next();
      return;
    }
  };
};

In any caching technique, there are two things that need to be defined:

  • A unique key (here derived from the request URL) under which the content is stored as the value.
  • A duration that the stored content is bound to, also called the Time-To-Live (TTL).

If content exists for a given key, the data is sent back as the response without having to make the extra query to our database. If there is no content in the cache for the particular key, the request is processed as usual and the result of the request is stored in our cache before the response is sent to the user.
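
As a minimal standalone sketch of the put/get/TTL mechanics (outside Express, using the same memory-cache package):

const cache = require('memory-cache');
const demo = new cache.Cache();

demo.put('greeting', 'hello', 5000); // keep the entry for 5 seconds
console.log(demo.get('greeting'));   // 'hello' while the entry is alive

setTimeout(() => {
  console.log(demo.get('greeting')); // null once the TTL has expired
}, 6000);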

Let's use this middleware in our route like this:

app.get('/users', cacheMiddleware(30), (req, res) => {
  setTimeout(() => {
    let connection = null;
    r.connect({ host: '127.0.0.1', port: 28015 }, (err, conn) => {
      if (err) {
        throw err;
      }
      connection = conn;
      r.db('test').table('Users').run(connection, (err, cursor) => {
        if (err) {
          throw err;
        }
        cursor.toArray((err, result) => {
          if (err) {
            throw err;
          }
          connection.close();
          res.send( result );
        });
      });
    });
  }, 3000);
});

Now, one disadvantage of this method is that once the server goes down, the cached content is lost. But for many use cases we can live with that.

Using a flat file for caching

Now let's see how we can use files to persist our cached data on the server. For this we will use flat-cache.

const flatCache = require('flat-cache');
const path = require('path');
let fileCache = flatCache.load('usersCache');
// To load the cache from a specific folder:
// let fileCache = flatCache.load('usersCache', path.resolve('./path/to/folder'));
let cacheMiddleware = (req, res, next) => {
  let key = '__express__' + (req.originalUrl || req.url);
  let cacheContent = fileCache.getKey(key);
  if (cacheContent) {
    res.send(cacheContent);
    return;
  } else {
    res.sendResponse = res.send;
    res.send = (body) => {
      fileCache.setKey(key, body);
      fileCache.save(); // persist the cache to disk
      res.sendResponse(body);
    };
    next();
    return;
  }
};
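
Since the cache now lives in a file, it survives server restarts. The flip side is that stale entries have to be removed explicitly. A sketch of manual invalidation using flat-cache's removeKey (the key shown assumes the /users route from above):

// e.g. run this after the Users table changes
fileCache.removeKey('__express__/users');
fileCache.save();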

Using Memcached as a service

A different option to consider for caching is Memcached. The memcached package is a Node.js client for it, built with scaling in mind.

To use the client, you need a Memcached server running in your setup.
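
If you don't have one yet, a quick way to spin up a local server (assuming Docker is available; native packages via apt or brew work just as well) is the official Docker image:

docker run -d --name memcached -p 11211:11211 memcached

Once the server is running, we can configure the client and define the middleware: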

const Memcached = require('memcached');
const memcached = new Memcached('127.0.0.1:11211');
let cachedMiddleware = (ttl) => {
  return (req, res, next) => {
    let key = '__express__' + (req.originalUrl || req.url);
    memcached.get(key, function (err, data) {
      if (data) {
        res.send(data);
        return;
      } else {
        res.sendResponse = res.send;
        res.send = (body) => {
          // memcached expects the lifetime in seconds
          memcached.set(key, body, ttl, function (err) {
            // error handling
          });
          res.sendResponse(body);
        };
        next();
        return;
      }
    });
  };
};
So, as usual, a key-value pair is stored for the TTL duration, and during that window the data is served straight from the cache.
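
Using it on the route follows the same pattern as with memory-cache (the handler body stays unchanged):

app.get('/users', cachedMiddleware(30), (req, res) => {
  // ... same RethinkDB handler as before ...
});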

Using Redis for caching

It would be a crime to talk about caching and not consider Redis as one of the strategies. So far we have used an in-memory cache and seen how to persist the cache using files. Now let's take a look at how Redis performs and solves our problem.

Redis stands for:

Remote Dictionary Server

It has the ability to store and manipulate high-level data types. You can easily get a Redis setup up and running by following the steps on the Redis Docker Hub page.
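
For example, a local instance on the default port (assuming Docker is installed):

docker run -d --name redis -p 6379:6379 redis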

Next, we need to define our middleware:

const redis = require('redis');
// this uses the callback-style API of node_redis v3; v4+ is promise-based
const client = redis.createClient();

let redisMiddleware = (req, res, next) => {
  let key = '__express__' + (req.originalUrl || req.url);
  client.get(key, function (err, reply) {
    if (reply) {
      res.send(reply);
      return;
    } else {
      res.sendResponse = res.send;
      res.send = (body) => {
        // no expiry is set here; use client.setex(key, ttlSeconds, value) to add a TTL
        client.set(key, JSON.stringify(body));
        res.sendResponse(body);
      };
      next();
      return;
    }
  });
};
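
Hooking it into the route is, once again, the same pattern:

app.get('/users', redisMiddleware, (req, res) => {
  // ... same RethinkDB handler as before ...
});

Note that reply comes back from Redis as a string, which is why the body is stored with JSON.stringify before caching.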

Performance Comparison

Let's compare the performance of the different strategies we have considered:

Strategy                          1st run    2nd run    3rd run
No caching at all                 3012 ms    3007 ms    3010 ms
memory-cache (in-memory)          3032 ms      16 ms      15 ms
flat-cache (caching to file)      3043 ms      27 ms       8 ms
Memcached (service)               3049 ms      10 ms      16 ms
Redis (key-value dictionary)      3028 ms       6 ms      10 ms

If you work out the average time taken by each strategy, you will find that every caching strategy performs far better than no caching at all.

Caching comes in handy a lot of the time, but you need to be aware of when (and when not) to use it.