This tutorial shows you how to maintain pre-aggregated roll-ups from a high-volume events table using materialized views. You’ll create three objects: a raw table, a rollup table, and the materialized view that writes into the rollup automatically.
When to use this pattern
Use this pattern when:- You have an append-only events stream (clicks, pageviews, IoT, logs).
- Most queries are aggregations over time ranges (per minute/hour/day).
- You want consistent sub-second reads without re-scanning all raw rows.
Create the raw events table
PARTITION BY toYYYYMM(event_time)keeps partitions small and easy to drop.ORDER BY (event_time, user_id)supports time-bounded queries + secondary filter.LowCardinality(String)saves memory for categorical dimensions.TTLcleans up raw data after 90 days (tune to your retention requirements).
Design the rollup (aggregated) table
We’ll pre-aggregate to hourly granularity. Choose your grain to match the most common analysis window.AggregateFunction(sum, ...)) which compactly represent partial aggregates and can be merged or finalized later.Create a materialized view that populates the rollup
This materialized view fires automatically on inserts intoevents_raw and writes aggregate states into the rollup.Querying the rollup
You can either merge states at read time, or finalize them:- Merge at read time
- Finalize with -Final
Filter on fields in the primary key for best performance
You can use theEXPLAIN command to see how the index is used to prune data:Query
Response
(bucket_start, country, event_type).
For best filtering performance you will want to make sure that your queries are making use of primary key fields to prune data.Common variations
- Different grains: add a daily rollup:
- Compression: apply codecs to big columns (example:
Codec(ZSTD(3))) on the raw table. - Cost control: push heavy retention to the raw table and keep long-lived roll-ups.
- Backfilling: when loading historical data, insert into
events_rawand let the materialized view build roll-ups automatically. For existing rows, usePOPULATEon materialized view creation if suitable, orINSERT SELECT.
Clean-up and retention
- Increase raw TTL (e.g., 30/90 days) but keep roll-ups for longer (e.g., 1 year).
- You can also use TTL to move old parts to cheaper storage if tiering is enabled.
Troubleshooting
- Materialized view not updating? Check that inserts go to the events_raw (not the roll-up table), and that the materialized view target is correct (
TO events_rollup_1h). - Slow queries? Confirm they hit the rollup (query the rollup table directly) and that the time filters align to the rollup grain.
- Backfill mismatches? Use
SYSTEM FLUSH LOGSand checksystem.query_log/system.partsto confirm inserts and merges.