Importing and exporting in Avro format
ClickHouse supports reading and writing Apache Avro data files, which are widely used in Hadoop systems. To import from an Avro file, use the Avro format in the INSERT statement.
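As a rough illustration of the statement described above, the following Python sketch only assembles the query text for an Avro import and the matching export. The helper names, the table name sometable, and the file names data.avro and export.avro are hypothetical placeholders, not names from the original text. The sketch assumes the queries are run through clickhouse-client, where the FROM INFILE and INTO OUTFILE clauses are available.

```python
def avro_import_query(table: str, path: str) -> str:
    """Return an INSERT statement that loads an Avro file into an existing table."""
    # Values are interpolated without escaping, so pass trusted names only.
    return f"INSERT INTO {table} FROM INFILE '{path}' FORMAT Avro"


def avro_export_query(table: str, path: str) -> str:
    """Return a SELECT statement that writes a table out as an Avro file."""
    return f"SELECT * FROM {table} INTO OUTFILE '{path}' FORMAT Avro"


if __name__ == "__main__":
    # sometable, data.avro and export.avro are placeholder names.
    print(avro_import_query("sometable", "data.avro"))
    print(avro_export_query("sometable", "export.avro"))
```

Either printed statement can be pasted into a clickhouse-client session. The target table is expected to exist already, with columns compatible with the Avro schema.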
Avro and ClickHouse data types
Check how Avro and ClickHouse data types map to each other when importing or exporting Avro files. Use explicit type casting to convert values when loading data from Avro files.
Avro messages in Kafka
When Kafka messages are encoded in Avro, ClickHouse can read such streams using the AvroConfluent format and the Kafka table engine.
Working with Arrow format
Another columnar format is Apache Arrow, also supported by ClickHouse for import and export. To import data from an Arrow file, use the Arrow format.
Arrow data streaming
The ArrowStream format can be used to work with Arrow streaming (used for in-memory processing). ClickHouse can read and write Arrow streams. To demonstrate how ClickHouse can stream Arrow data, let's pipe it to a Python script that reads an input stream in Arrow streaming format and outputs the result as a pandas table.
In the other direction, any program that emits Arrow streaming data (arrow-stream, for example) can serve as a possible source when ClickHouse reads such a stream.
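The Python script mentioned above might look like the following minimal sketch. It assumes the pyarrow and pandas packages are installed. The function name read_arrow_stream and the file name arrow_to_pandas.py are hypothetical choices for illustration. Standard input is read in full before decoding, which keeps the example simple but is not suitable for unbounded streams.

```python
import sys


def read_arrow_stream(source):
    """Decode Arrow streaming (IPC) data into a pandas DataFrame.

    `source` may be a bytes object or a binary file-like object.
    """
    # Imported here so a missing pyarrow install surfaces only when decoding is attempted.
    import pyarrow as pa

    reader = pa.ipc.open_stream(source)
    # read_pandas() consumes every record batch and concatenates them.
    return reader.read_pandas()


if __name__ == "__main__":
    payload = sys.stdin.buffer.read()
    if payload:
        print(read_arrow_stream(payload))
    else:
        print("no Arrow stream data received on standard input", file=sys.stderr)
```

Assuming the script is saved as arrow_to_pandas.py and a table called sometable exists (both placeholder names), it could be fed from ClickHouse with `clickhouse-client -q "SELECT * FROM sometable FORMAT ArrowStream" | python3 arrow_to_pandas.py`.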