Server, local or chDB
The steps in this guide can be executed using an existing ClickHouse server installation. For ad hoc querying, you can instead use clickhouse-local and complete the same workflow without running a server. With minor adjustments, the process can also be performed using ClickHouse’s in-process distribution, chDB.
- Apache Iceberg
- Delta Lake
- Apache Hudi
- Apache Paimon
For supported features including partition pruning, schema evolution, time travel, caching, and more, see the support matrix.
The iceberg table function (alias for icebergS3) reads Iceberg tables directly from object storage. Variants exist for each storage backend: icebergS3, icebergAzure, icebergHDFS, and icebergLocal.
GCS support
The S3 variant of the functions can be used for Google Cloud Storage (GCS).
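A minimal sketch of the query shape for the table function, composed in Python as a plain SQL string so it can be handed to whichever client is in use (a server connection, clickhouse-local, or chDB). The bucket URL, the helper name `iceberg_select`, and the `LIMIT` value are hypothetical placeholders, and `NOSIGN` (unsigned requests to a publicly readable bucket) is assumed in place of credentials.

```python
def iceberg_select(url: str, limit: int = 10) -> str:
    """Compose a SELECT that reads an Iceberg table through icebergS3.

    The URL is assumed to point at the root directory of the Iceberg table.
    """
    # Escape backslashes and single quotes so the URL is a valid SQL string literal.
    escaped = url.replace("\\", "\\\\").replace("'", "\\'")
    # NOSIGN is an assumption: swap in an access key and secret for private buckets.
    return f"SELECT * FROM icebergS3('{escaped}', NOSIGN) LIMIT {int(limit)}"


# Hypothetical bucket path; replace with a real Iceberg table location.
query = iceberg_select("https://example-bucket.s3.amazonaws.com/warehouse/events/")
print(query)
```

Assuming chDB is installed, the resulting string could be passed to `chdb.query(query)` to run it in process; the same statement can also be pasted into a server or clickhouse-local session.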
Cluster variant
The icebergS3Cluster function distributes reads across multiple nodes in a ClickHouse cluster. The initiator node establishes connections to all nodes and dispatches data files dynamically. Each worker node requests and processes tasks until all files have been read. icebergCluster is an alias for icebergS3Cluster. Variants also exist for Azure (icebergAzureCluster) and HDFS (icebergHDFSCluster).
Table engine
As an alternative to using the table function in every query, you can create a persistent table using the Iceberg table engine. The data still resides in object storage and is read on demand; no data is copied into ClickHouse. The advantage is that the table definition is stored in ClickHouse and can be shared across users and sessions without each user needing to specify the storage path and credentials. Engine variants exist for each storage backend: IcebergS3 (or the Iceberg alias), IcebergAzure, IcebergHDFS, and IcebergLocal.
Both the table engine and the table function support data caching, using the same caching mechanism as the S3, AzureBlobStorage, and HDFS storage engines. Additionally, a metadata cache stores manifest file information in memory, reducing repeated reads of Iceberg metadata. This cache is enabled by default via the use_iceberg_metadata_files_cache setting.
The table engine Iceberg is an alias for IcebergS3.
GCS support
The S3 variant of the table engine can be used for Google Cloud Storage (GCS).
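As a rough sketch of the two statements described above, the snippet below composes a distributed read through icebergS3Cluster and a persistent table definition using the IcebergS3 engine. The cluster name `default`, the table name `iceberg_events`, the bucket URL, and the helper names are all hypothetical; `NOSIGN` is again assumed in place of credentials, and the column list is omitted on the assumption that the engine reads the schema from the Iceberg metadata.

```python
def sql_quote(value: str) -> str:
    """Render a Python string as a single-quoted SQL string literal."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def cluster_select(cluster: str, url: str) -> str:
    """Compose a count over an Iceberg table, read in parallel across a cluster."""
    # The cluster name comes first, followed by the same storage arguments
    # that the single-node icebergS3 function takes.
    return (
        f"SELECT count() FROM icebergS3Cluster("
        f"{sql_quote(cluster)}, {sql_quote(url)}, NOSIGN)"
    )


def create_iceberg_table(table: str, url: str) -> str:
    """Compose a CREATE TABLE that stores only the definition in ClickHouse."""
    # `table` is treated as a trusted identifier in this sketch and is not escaped.
    return f"CREATE TABLE {table} ENGINE = IcebergS3({sql_quote(url)}, NOSIGN)"


# Hypothetical values; replace with a real cluster name and table location.
url = "https://example-bucket.s3.amazonaws.com/warehouse/events/"
cluster_query = cluster_select("default", url)
create_statement = create_iceberg_table("iceberg_events", url)
print(cluster_query)
print(create_statement)
```

Once such a table exists, later queries would reference `iceberg_events` by name instead of repeating the storage path and credentials, which is the sharing benefit the section describes.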
For full reference, see the iceberg table function and Iceberg table engine documentation.