ORDER BY keys and using materialized views.
We’ll see how these approaches can reduce query times from seconds to milliseconds.
Optimize ORDER BY keys
Before attempting other optimizations, you should optimize ordering keys to ensure ClickHouse produces the fastest possible results.
Choosing the right key largely depends on the queries you’re going to run. Suppose most of our queries filter by the project and subproject columns.
In this case, it’s a good idea to add them to the ordering key — as well as the time column since we query on time as well.
Let’s create another version of the table that has the same column types as wikistat, but is ordered by (project, subproject, time).
| Query | (time) | (project, subproject, time) |
|---|---|---|
| 2.381 sec | 1.660 sec | |
| 2.148 sec | 0.058 sec | |
| 2.192 sec | 0.012 sec | |
| 2.968 sec | 0.010 sec |
Materialized views
Another option is to use materialized views to aggregate and store the results of popular queries. These results can be queried instead of the original table. Suppose the following query is executed quite often in our case:Create materialized view
We can create the following materialized view:Backfilling destination table
This destination table will only be populated when new records are inserted into thewikistat table, so we need to do some backfilling.
The easiest way to do this is using an INSERT INTO SELECT statement to insert directly into the materialized view’s target table using the view’s SELECT query (transformation):
- Creating a temporary table with a Null table engine
- Connecting a copy of the normally used materialized view to that temporary table
- Using an
INSERT INTO SELECTquery, copying all data from the raw data set into that temporary table - Dropping the temporary table and the temporary materialized view.
wikistat_backfill and write into wikistat_top
wikistat_backfill from the initial wikistat table: