Caching Mechanism using Pre-aggregations in Cube.js

KNL JAGADEESWARI
Oct 23, 2020
2 min read

Hello everyone,

Let's learn how can we implement cache in cube.js.

Before going to that, let's discuss about why we need caching.

Caching is a buffering technique that stores the frequently queried data in a temporary memory. It makes data easier to be accessed and reduces workloads for databases. Also it boosts up the response time.

Cube.js provides Pre-aggregations to cache the data at database level. This pre-aggregation engine builds an aggregated layer in our database as shown below,

When the user requests for a data, the Cube.js will first look for the relevant pre-aggregation. If it cannot found any pre-aggregation, it will build a new one. Once the pre-aggregated layer is built, all the subsequent request will hit this layer instead of hitting the raw data. With this, it boosts up the response time by hundreds or even thousands of times.

Pre-aggregations looks same as raw data which is stored in the form of a table in our database. To use the data from pre-aggregations, cube.js should have the right access to the stb_pre_aggregations schema where pre-aggregation tables were stored.

Implementation of Pre-aggregations:

Pre-aggregations are defined in the data schema. Look into the below example for rollup pre-aggregation,

cube(`CovidUpdates`, {
 sql: `SELECT * FROM covid_statistics.covid_updates`,

 measures: {
 count: {
 type: `count`,
 drillMembers: []
    },
 confirmed_cases: {
 sql: `confirmed_cases`,
 type: `number`
    },
 active_cases: {
 sql: `active_cases`,
 type: `number`
    },
 tested_cases: {
 sql: `tested_cases`,
 type: `number`
    }
  },

 dimensions: {
 state: {
 sql: `state`,
 type: `string`
    },
 id: {
 sql: `id`,
 type: `number`,
 primaryKey: true 
    }
  },
 preAggregations: {    
 category: {
 type: `rollup`,
 measureReferences: [CovidUpdates.count],
 dimensionReferences: [state],
 useOriginalSqlPreAggregations: true,
 refreshKey: {
 every : '2 minutes' 
      }, 
    }
  }
});

rollup pre-aggregations are most effective way to boost performance of any analytical application. It defines a set of measures and dimensions used to construct query for pre-aggregation table.
useOriginalSqlPreAggregations is used when we want to re-use some heavy SQL query calculation in multiple rollups.
refreshKey is used to keep the pre-aggregations up to date.

Note: If no 'refreshKey' is used, Cube.js will use the default 'refreshKey' to keep the pre-aggregation up to date.

Once the pre-aggregation is defined, re-run the application so that we can see the pre-aggregation table in the database as below,

If we observe the above image, there are two columns in the pre-aggregated table which are mentioned in the example of rollup pre-aggregation as measures and dimensions.

Inspecting Queries:

We can also inspect queries in Cube.js to know whether we are getting the response from pre-aggregations or raw-data.

Cube Playground is used to inspect a single query. For this, click on ‘Cache’ button after executing the query. There we can see the information regarding ‘refreshKey’ of that query and if any query uses ‘pre-aggregations’.

Below image tells you how to inspect queries in cube playground,