Aggregating by time bucket
The most popular requirement is to aggregate data based on periods, e.g. get the total amount of hits for each day:toDate() function here, which converts the specified time to a date type. Alternatively, we can batch by an hour and filter on the specific date:
toStartOfHour() function used here converts the given time to the start of the hour.
You can also group by year, quarter, month, or day.
Custom grouping intervals
We can even group by arbitrary intervals, e.g., 5 minutes using thetoStartOfInterval() function.
Let’s say we want to group by 4-hour intervals.
We can specify the grouping interval using the INTERVAL clause:
toIntervalHour() function
Filling empty groups
In a lot of cases we deal with sparse data with some absent intervals. This results in empty buckets. Let’s take the following example where we group data by 1-hour intervals. This will output the following stats with some hours missing values:WITH FILL modifier to address this. This will fill out all the empty hours with zeros, so we can better understand the distribution over time:
Rolling time windows
Sometimes, we don’t want to deal with the start of intervals (like the start of a day or an hour) but window intervals. Let’s say we want to understand the total hits for a window, not based on days but on a 24-hour period offset from 6 pm. We can use thedate_diff() function to calculate the difference between a reference time and each record’s time.
In this case, the day column will represent the difference in days (e.g., 1 day ago, 2 days ago, etc.):