Dynamic column selection (also known as a COLUMNS expression) allows you to match some columns in a result with a re2 regular expression.

Matchers

Select all columns:

SELECT * FROM hits;

Select only specific subset of columns:

SELECT COLUMNS('URL.*ID') FROM hits;

┌─URLCategoryID─┬─URLRegionID─┐
│         15664 │         216 │
│             0 │           0 │
│             0 │           0 │
│             0 │           0 │
│             0 │           0 │
└───────────────┴─────────────┘

SELECT COLUMNS(WatchID, UserID) FROM hits;

┌─────────────WatchID─┬───────────────UserID─┐
│ 5365534025466744368 │ 2213399965098237040  │
│ 5287559196528052048 │ 2213399965098237040  │
│ 9057961221679572813 │ 2213399965098237040  │
│ 5520508008786474572 │ 9141107111321352513  │

Column transformers

Apply transformations for selected columns or remove some columns from selections:

APPLY: Allows you to invoke some function for each row returned by an outer table expression of a query.

SELECT * APPLY toString FROM hits;
SELECT COLUMNS('URL.*ID') APPLY toString FROM hits;
SELECT COLUMNS('URL.*ID') APPLY x -> toString(x) FROM hits;

┌─toString(URLCategoryID)─┬─toString(URLRegionID)─┐
│ 15664                   │ 216                   │
│ 0                       │ 0                     │
│ 0                       │ 0                     │
│ 0                       │ 0                     │
│ 0                       │ 0                     │
└─────────────────────────┴───────────────────────┘

EXCEPT: exclude one or more columns from the result.

SELECT * EXCEPT (UserID, URLRegionID) FROM hits;
SELECT COLUMNS('URL.*ID') EXCEPT URLCategoryID FROM hits;

┌─URLRegionID─┐
│         216 │
│           0 │
│           0 │
│           0 │
│           0 │
└─────────────┘

REPLACE: Specifies one or more expression aliases

SELECT COLUMNS('URL.*ID') REPLACE (URLCategoryID * 10 AS URLCategoryID)
FROM hits;

┌─URLCategoryID─┬─URLRegionID─┐
│        156640 │         216 │
│             0 │           0 │
│             0 │           0 │
│             0 │           0 │
│             0 │           0 │
└───────────────┴─────────────┘

SELECT COLUMNS('URL.*ID') REPLACE (leftPad(toString(URLRegionID), 10, '*') AS URLRegionID)
FROM hits;

┌─URLCategoryID─┬─URLRegionID─┐
│         15664 │ *******216  │
│             0 │ *********0  │
│             0 │ *********0  │
│             0 │ *********0  │
│             0 │ *********0  │
└───────────────┴─────────────┘

We can also combine them:

SELECT COLUMNS('URL.*ID') APPLY(toString) APPLY(length) APPLY(max) FROM hits;

┌─max(length(toString(URLCategoryID)))─┬─max(length(toString(URLRegionID)))─┐
│                                    5 │                                  5 │
└──────────────────────────────────────┴────────────────────────────────────┘

Select from multiple tables

merge() create a temporary Merge table with structure is taken from the first table encountered that matches the regular expression.

When upgrading ClickHouse usually rename system table if schema changed in a new release: system.query_log, system.query_log_0, system.query_log_1, ... The query below help querying from all them:

SELECT * FROM merge(system, '^query_log')

ClickHouse SELECT Advances

Matchers

Column transformers

Select from multiple tables

Series: ClickHouse on Kubernetes

Resources

me@duyet.net