Introduction
ClickHouse has no pivot operator, but we can achieve similar behavior using aggregate function combinators and, in particular, the ones with the-Map suffix.
In this article, we’re going to learn how to do that.
There is also a video covering the same material, which you can see below:
Understanding aggregate function combinators
Let’s start with a simple example. We’re going to use clickhouse-local, which you can launch by running the following:sumMap function, which takes in a map and sums the values of each key:
sumMap over multiple rows of maps;
ClickHouse appeared on both rows and its values have been summed. The key ClickBench was only present on one line, so it sums a single value, which returns that value!
We can also use maxMap to find the maximum values per key:
avgMap to find the average value per key:
Real-World Application: UK housing prices dataset
Now we’re going to use them on a bigger dataset in the ClickHouse SQL playground. We can connect to the playground using clickhouse-client:uk_price_paid table, so let’s explore the data in that table:
Grouping and aggregating by decade
Let’s work out the median prices grouped by county for each decade in the dataset:Filtering results
We can filter the results to only include data from 2010 and on:Combining multiple aggregations
And if we want to find the maximum price per decade we can do that using themaxMap function that we saw earlier:
Applying functions to map values
Alternatively, we can compute the average price usingavgMap.
Those values have a lot of decimal points, which we can clean up by using the mapApply function to call the floor function on each value in the map:
Flexible grouping: counties, districts, and postcodes
Let’s try to group by some different fields. This time we’re going to compute the median price per decade grouped by county and district:postcode1 and postcode2 in the map: