Queries

This page covers the logic behind the various queries in QueryBench.

QueryBench

Single table

TODO: Add some single table queries on the ClickHouse table

h2o

h2o join

Note: This section assumes the 1e8 dataset size.

The first h2o join query joins a large table (x) to a small table (small) on id1:

SELECT x.id1, x.id2, x.id3, x.id4 AS xid4, small.id4 AS smallid4, x.id5, x.id6, x.v1, small.v2
FROM x
INNER JOIN small ON x.id1 = small.id1

This query returns XX rows.
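
To make the join semantics concrete, here is a minimal sketch that runs the same query on toy data. It uses Python's built-in sqlite3 module as a stand-in engine; the column types and the sample rows are assumptions made up for illustration and are not the h2o data.

```python
import sqlite3

# In-memory database with toy stand-ins for the h2o tables.
# Column types and row values are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (id1 INTEGER, id2 INTEGER, id3 INTEGER,
                    id4 TEXT, id5 TEXT, id6 TEXT, v1 REAL);
    CREATE TABLE small (id1 INTEGER, id4 TEXT, v2 REAL);
""")

x_rows = [
    (1, 10, 100, "a", "p", "u", 1.5),
    (2, 20, 200, "b", "q", "v", 2.5),
    (2, 21, 201, "c", "r", "w", 3.5),
    (3, 30, 300, "d", "s", "z", 4.5),  # id1 = 3 has no match in small
]
small_rows = [(1, "s1", 0.1), (2, "s2", 0.2)]
conn.executemany("INSERT INTO x VALUES (?, ?, ?, ?, ?, ?, ?)", x_rows)
conn.executemany("INSERT INTO small VALUES (?, ?, ?)", small_rows)

# Same query text as above.
join_sql = """
SELECT x.id1, x.id2, x.id3, x.id4 AS xid4, small.id4 AS smallid4,
       x.id5, x.id6, x.v1, small.v2
FROM x
INNER JOIN small ON x.id1 = small.id1
"""
joined = conn.execute(join_sql).fetchall()

for row in joined:
    print(row)
print("rows returned:", len(joined))
```

Because this is an inner join, the x row with id1 = 3 is dropped, and each remaining x row is paired with the single small row that shares its id1.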

h2o groupby

Note: This section assumes the 1e8 dataset size.

The first h2o groupby query runs a group by aggregation on a low-cardinality column:

SELECT id1, sum(v1) AS v1
FROM x
GROUP BY id1

The id1 column has 100 distinct values, so this query returns 100 rows, one per id1 value. Here's what the result looks like:

XXX
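
The link between distinct values and result size can be checked on a small scale. The sketch below is an illustration only: it uses Python's sqlite3 module as a stand-in engine and a made-up table with 5 distinct id1 values instead of 100.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (id1 TEXT, v1 INTEGER)")

# Toy data (an assumption for illustration): 50 rows spread over K = 5 id1 values.
K = 5
rows = [(f"id{i % K + 1:03d}", i) for i in range(50)]
conn.executemany("INSERT INTO x VALUES (?, ?)", rows)

# Same shape as the first groupby query.
groups = conn.execute(
    "SELECT id1, sum(v1) AS v1 FROM x GROUP BY id1"
).fetchall()
distinct_id1 = conn.execute("SELECT count(DISTINCT id1) FROM x").fetchone()[0]

# One output row per distinct id1 value.
print(len(groups), distinct_id1)
for id1, total in sorted(groups):
    print(id1, total)
```

The number of groups always equals the number of distinct id1 values, regardless of how many input rows the table holds.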

The second h2o groupby query runs a group by aggregation on two columns:

SELECT id1, id2, sum(v1) AS v1
FROM x
GROUP BY id1, id2

Each id1 value has 100 distinct id2 values, so this query returns 100 × 100 = 10,000 rows. Here's what the result looks like:

XXX
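
The row count for a two-column group by is the number of distinct (id1, id2) pairs. As a rough sketch, the example below reproduces that on toy data with 3 values per column instead of 100; the sqlite3 engine and the sample rows are assumptions for illustration, not the benchmark setup.

```python
import itertools
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (id1 TEXT, id2 TEXT, v1 INTEGER)")

# Toy data (an assumption for illustration): every id1 value is paired
# with every id2 value, and each pair appears twice with v1 = 1 and v1 = 2.
K = 3
keys = [f"id{n:03d}" for n in range(1, K + 1)]
rows = [(a, b, v) for a, b in itertools.product(keys, keys) for v in (1, 2)]
conn.executemany("INSERT INTO x VALUES (?, ?, ?)", rows)

# Same shape as the second groupby query.
groups_2col = conn.execute(
    "SELECT id1, id2, sum(v1) AS v1 FROM x GROUP BY id1, id2"
).fetchall()

# K * K groups come out of 2 * K * K input rows.
print(len(rows), len(groups_2col))
```

With K = 100 the same reasoning gives the 10,000 groups described above, provided every (id1, id2) combination actually occurs in the data.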

TODO

ClickBench

TODO