Function Engine Configuration
Override the execution engine for specific functions.
Setting Function Engines
When to Use
Force chdb for:
- Functions with better ClickHouse performance
- Functions that benefit from SQL optimization
- Large-scale string/datetime operations

Force pandas for:
- Functions with pandas-specific behavior
- When exact pandas compatibility is required
- Custom string operations
Example
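As a rough illustration of what a per-function override amounts to, the sketch below models the idea in plain Python so it runs without chdb installed. The `FunctionEngineOverrides` class and its method names are hypothetical stand-ins, not DataStore's actual API; consult the library reference for the real configuration calls.

```python
from typing import Dict, Optional


class FunctionEngineOverrides:
    """Hypothetical per-function engine table (not the real DataStore API)."""

    VALID_ENGINES = ("chdb", "pandas")

    def __init__(self) -> None:
        # Maps a function name to the engine it has been pinned to.
        self._overrides: Dict[str, str] = {}

    def force(self, engine: str, *function_names: str) -> None:
        """Pin each named function to the given engine."""
        if engine not in self.VALID_ENGINES:
            raise ValueError(f"unknown engine: {engine!r}")
        for name in function_names:
            self._overrides[name] = engine

    def get(self, function_name: str) -> Optional[str]:
        """Return the pinned engine, or None when no override is set."""
        return self._overrides.get(function_name)

    def reset(self) -> None:
        """Drop every override so all functions fall back to the defaults."""
        self._overrides.clear()


overrides = FunctionEngineOverrides()
# Large-scale string work: pin to chdb.
overrides.force("chdb", "length", "substring", "concat")
# Exact pandas behavior needed: pin to pandas.
overrides.force("pandas", "upper", "lower")

print(overrides.get("length"))  # chdb
print(overrides.get("upper"))   # pandas
print(overrides.get("trim"))    # None (no override set)
```

Functions without an override, such as `trim` here, are left to whatever default selection applies.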
Overlapping Functions
159+ functions are available in both chdb and pandas engines:

| Category | Functions |
|---|---|
| String | length, upper, lower, trim, ltrim, rtrim, concat, substring, replace, reverse, contains, startswith, endswith |
| Math | abs, round, floor, ceil, exp, log, log10, sqrt, pow, sin, cos, tan |
| DateTime | year, month, day, hour, minute, second, dayofweek, dayofyear, quarter |
| Aggregation | sum, avg, min, max, count, std, var, median |
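
For functions available in both engines, the engine is chosen in the following order of priority, highest first:
- Explicit function configuration (if set)
- Global execution_engine setting
- Auto-selection based on context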
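
Reading the three items above as a precedence order (an assumption, with the first item winning), the lookup can be sketched as a small function. `resolve_engine`, the `"auto"` sentinel for the global setting, and the `auto_choice` parameter are all hypothetical names used for illustration; they are not taken from DataStore.

```python
from typing import Mapping, Optional


def resolve_engine(
    function_name: str,
    overrides: Mapping[str, str],
    global_engine: str = "auto",
    auto_choice: str = "chdb",
) -> str:
    """Pick an engine for one function call (illustrative only)."""
    # 1. An explicit per-function setting wins when present.
    explicit: Optional[str] = overrides.get(function_name)
    if explicit is not None:
        return explicit
    # 2. Otherwise a concrete global setting applies.
    if global_engine != "auto":
        return global_engine
    # 3. Otherwise use whatever context-based auto-selection picked.
    #    Here that result is simply passed in as `auto_choice`.
    return auto_choice


pinned = {"upper": "pandas"}
print(resolve_engine("upper", pinned, global_engine="chdb"))  # pandas
print(resolve_engine("lower", pinned, global_engine="chdb"))  # chdb
print(resolve_engine("lower", {}, auto_choice="pandas"))      # pandas
```

The sketch does not model how auto-selection itself decides; that depends on context the library has at execution time.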
chdb-Only Functions
Some functions are only available through ClickHouse:

| Category | Functions |
|---|---|
| Array | arraySum, arrayAvg, arraySort, arrayDistinct, groupArray, arrayElement |
| JSON | JSONExtractString, JSONExtractInt, JSONExtractFloat, JSONHas |
| URL | domain, path, protocol, extractURLParameter |
| IP | IPv4StringToNum, IPv4NumToString, isIPv4String |
| Geo | greatCircleDistance, geoDistance, geoToH3 |
| Hash | cityHash64, xxHash64, sipHash64, MD5, SHA256 |
| Conditional | sumIf, countIf, avgIf, minIf, maxIf |
pandas-Only Functions
Some functions are only available through pandas:

| Category | Functions |
|---|---|
| Apply | Custom lambda functions, user-defined functions |
| Complex Pivot | Pivot tables with custom aggregations |
| Stack/Unstack | Complex reshaping operations |
| Interpolate | Time series interpolation methods |
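
Because some functions exist in only one engine, an override is meaningful only for overlapping functions. The sketch below shows one way a caller might guard against a mismatched override before applying it. The sets hold a small sample of names from the tables above; the pandas-only keys (`apply`, `interpolate`, `stack`, `unstack`) are illustrative labels for the listed categories, and both helper functions are hypothetical. How DataStore itself reacts to such an override is not covered here.

```python
from typing import FrozenSet, Set

# Small samples from the tables above; the real lists are longer.
OVERLAPPING: FrozenSet[str] = frozenset(
    {"length", "upper", "lower", "abs", "round", "year", "sum"}
)
CHDB_ONLY: FrozenSet[str] = frozenset(
    {"arraySum", "JSONExtractString", "domain", "cityHash64", "sumIf"}
)
PANDAS_ONLY: FrozenSet[str] = frozenset(
    {"apply", "interpolate", "stack", "unstack"}
)


def engines_for(function_name: str) -> Set[str]:
    """Return the set of engines that can run the named function."""
    if function_name in OVERLAPPING:
        return {"chdb", "pandas"}
    if function_name in CHDB_ONLY:
        return {"chdb"}
    if function_name in PANDAS_ONLY:
        return {"pandas"}
    raise KeyError(f"function not in the sample tables: {function_name!r}")


def check_override(function_name: str, engine: str) -> None:
    """Raise ValueError if the engine cannot run the function."""
    allowed = engines_for(function_name)
    if engine not in allowed:
        raise ValueError(
            f"{function_name!r} is only available on: {sorted(allowed)}"
        )


check_override("upper", "pandas")  # fine: available in both engines
try:
    check_override("cityHash64", "pandas")
except ValueError as exc:
    print(exc)  # 'cityHash64' is only available on: ['chdb']
```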
Dtype Correction
Configure how DataStore corrects data types between engines.
Correction Levels
Correction Level Details
| Level | Description | Types Corrected |
|---|---|---|
| NONE | No automatic correction | None |
| CRITICAL | Essential corrections | NULL handling, boolean conversion |
| HIGH (default) | Common corrections | Integer/float precision, datetime, string encoding |
| MEDIUM | More corrections | Decimal precision, timezone handling |
| ALL | Maximum correction | All type differences |
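
One way to read the table is that each level also includes the corrections of the levels listed before it. That cumulative reading is an assumption, as are the numeric values and the `Level` and `corrected_types` names in this sketch; it only models the table and is not DataStore's own enum or API.

```python
from enum import IntEnum
from typing import Dict, List


class Level(IntEnum):
    """Hypothetical model of the correction levels, in table order."""

    NONE = 0
    CRITICAL = 1
    HIGH = 2
    MEDIUM = 3
    ALL = 4


# Types each level adds on its own, copied from the table.
# ALL is handled separately because the table lists no explicit types for it.
TYPES_ADDED: Dict[Level, List[str]] = {
    Level.CRITICAL: ["NULL handling", "boolean conversion"],
    Level.HIGH: ["integer/float precision", "datetime", "string encoding"],
    Level.MEDIUM: ["decimal precision", "timezone handling"],
}

DEFAULT_LEVEL = Level.HIGH  # the table marks HIGH as the default


def corrected_types(level: Level = DEFAULT_LEVEL) -> List[str]:
    """List the type categories corrected at `level`, assuming levels are cumulative."""
    if level == Level.ALL:
        return ["all type differences"]
    result: List[str] = []
    for lvl, types in TYPES_ADDED.items():
        if lvl <= level:
            result.extend(types)
    return result


print(corrected_types(Level.NONE))      # []
print(corrected_types(Level.CRITICAL))  # ['NULL handling', 'boolean conversion']
print(len(corrected_types()))           # 5 (CRITICAL + HIGH categories)
```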
When Types Need Correction
Type differences can occur when:
- ClickHouse → pandas: Different integer sizes (Int64 vs int64)
- pandas → ClickHouse: Python objects to SQL types
- NULL handling: pandas NA vs ClickHouse NULL
- Boolean: Different boolean representations
- DateTime: Timezone differences