How we learned that schema purity matters less than matching the natural grain of your queries.

Setting the Scene: Digital Billboards and Audience Targeting
Before we talk about schema design, we need to establish the problem. Without the domain context, the technical decisions won’t make sense.
The Business: Out-of-Home Advertising
Think of the digital billboards you see on highways, in malls, airports, or gyms. Each physical screen is called a frame.
Advertisers don’t just ask — How many people see this screen?
They ask:
Which frames should I buy to best reach my target audience?
The Data We Work With
Frames: Physical advertising screens. In the US alone, there are millions — Times Square, shopping centers, transit hubs, everywhere.
Audiences: Behavioral audiences like luxury car buyers, fitness enthusiasts, or working parents. A typical advertising campaign targets 1–6 audiences simultaneously.
Impressions: The estimated number of people who see a frame during a specific hour. Example: A Times Square billboard at 5 pm has more impressions than a suburban bus stop at 3 am.
DMA (Designated Market Area): Geographic regions used in advertising. Think New York metro area or Los Angeles market.
The Key Metric: Audience Index
Raw impression counts are misleading.
A Times Square frame may reach more fitness enthusiasts than a suburban bus stop — but it also reaches far more everyone else. What advertisers really care about is concentration, not absolute volume.
We measure this using an audience index:
audience_index = (frame's audience ratio) / (market's natural delivery) × 100

Where:
Frame’s audience ratio = audience impressions / total impressions at this frame
Natural delivery = audience impressions / total impressions across ALL frames in the market
Interpretation:
100 → performs exactly at market average
150 → over-indexes by 50% (1.5× better than average)
80 → under-indexes by 20%
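For example, a frame where 15% of viewers belong to the target audience, in a market whose natural delivery is 10%, scores (0.15 / 0.10) × 100 = 150.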
Why Hourly Granularity Is Non-Negotiable
Audience composition changes dramatically over the day:
8 am at a subway station: Commuters, working professionals
2 pm at a mall: Stay-at-home parents, retirees
8 pm at a gym: Fitness enthusiasts
Advertisers buy specific dayparts, such as weekday mornings or weekend evenings. That leads to a crucial requirement:
Audience index must be computed at the hourly level, and only then averaged across the selected hours.
You cannot calculate a single daily index without losing correctness.
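To see why, consider a deliberately simplified, hypothetical case. A frame delivers a 15% audience share in each of two selected hours, while the market’s natural delivery is 10% in the first hour and 30% in the second. The hourly indices are 150 and 50, so the correct average is 100. Sum impressions across both hours first, however, and if the second hour carries far more market volume, the combined natural delivery is pulled toward 30% and the resulting “daily” index falls well below 100, even though the frame is exactly at market on an hourly basis.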
The Query in Plain English
Show me all frames in the New York market that over-index (index ≥ 100) for young professionals during weekday commute hours (8–10 am and 5–7 pm, Monday through Friday), subject to dynamic frame-level filters such as location type (e.g., roadside vs mall), media owner, pricing constraints, and availability.
This query is not a batch report or an offline analysis. It sits directly in the critical path of a chat-based user experience, where advertisers explore options conversationally and expect responses in near real time.
This requires:
For each frame, for each selected hour: calculate the audience index
Average those hourly indices to get a single score per frame
Filter to frames with average index ≥ 100
The query must return within 1–2 seconds to keep the conversational flow intact, even as users refine audiences and time windows interactively.
This query cannot be answered using precomputed DMA-level indices alone. While market-level natural delivery can be aggregated, the final index must be recomputed over the user-selected subset of frames, which varies per request based on interactive filters.
Now, let’s design a schema to make this query fast.
The Scale of the Problem
At full resolution, the data looks like this:
≈ 1.7 million Frames
We support around 300 Audiences on the platform
168 Time slots (24 hours × 7 days)
That’s potentially 85 billion data points (1.7M × 300 × 168).
We don’t materialize all of them at once, but the schema must support this dimensionality efficiently.
The Intuition: Columnar Should Win
We use ClickHouse, a columnar OLAP database. The standard advice is clear:
Put different attributes in different columns so queries read only what they need.
Applied naively, that suggested a very wide table: one column per audience per hour, which works out to 16,800 columns for 100 audiences across 168 hours.
To test this locally, we decided to ingest 80,000 frames with 100 audiences:
CREATE TABLE frames_hourly_insights (
id String,
dma LowCardinality(String),
-- ... other frame attributes ...
-- 100 audiences × 168 hours = 16,800 columns
`audience1_mon_08` Float32,
`audience1_mon_09` Float32,
-- ... continues for all 168 hours per audience ...
)

The appeal was obvious:
Perfect column pruning
Querying young professionals during weekday commute hours (8–10 am and 5–7 pm, Monday through Friday) reads just 20 columns
~99.9% of data skipped
On paper, this looked ideal. In practice, it was dead on arrival.
Why Extreme Width Breaks Down:
In ClickHouse, each column is a first-class entity:
It has its own data files and mark files per part
It participates independently in merges
It contributes metadata that must be tracked, planned, and opened at query time
At tens of thousands of columns, these costs compound quickly.
Even before considering query complexity, the system had to:
Manage and reason about thousands of column files per part
Open and coordinate large numbers of file handles
Perform background merges that touch every column, regardless of how many are queried
This overhead exists independent of the ingestion mechanism. While ClickPipe can mitigate write-time memory pressure by streaming data from object storage, it does not eliminate the operational costs of maintaining and merging extremely wide parts.
Even on a reduced dataset of 1000 frames, early queries exhibited disproportionate overhead:
Duration: 1167 ms
Memory: 52 MiB
Parts: 40

The parts selected metric revealed the problem: ClickHouse had to touch 40 separate parts, each carrying its own set of per-column files and metadata. Despite reading very little actual data, query execution was dominated by column metadata handling rather than I/O or computation.
At this scale, the benefits of column pruning were outweighed by the cost of managing the column set itself.
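You can quantify this kind of overhead directly from ClickHouse’s system tables. A minimal sketch, assuming the table name from the example above and the current database:

-- How many column entries must ClickHouse track for the wide table?
SELECT count() AS column_count
FROM system.columns
WHERE database = currentDatabase()
  AND table = 'frames_hourly_insights';

-- How many active parts (each with files and metadata for every column)?
SELECT count() AS active_parts, sum(rows) AS total_rows
FROM system.parts
WHERE database = currentDatabase()
  AND table = 'frames_hourly_insights'
  AND active;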
Lesson:
Extremely wide tables impose metadata, merge, and planning overheads that eventually dominate execution time.
Columnar databases are optimized for scanning fewer columns, not for managing tens of thousands of them.
In practice, supporting ~300 audiences would require ~50,000 columns, which is well beyond the operational comfort zone of a ClickHouse table — regardless of whether data arrives via direct INSERTs or through ClickPipe.
The Array Approach: A Promising Middle Ground
We pivoted to arrays. Instead of 168 columns per audience, one column containing an array of 168 values:
CREATE TABLE frames_hourly_insights_array (
id String,
dma LowCardinality(String),
-- ... other frame attributes ...
-- 168-element array per audience (7 days × 24 hours)
-- Index = day_index × 24 + hour + 1 (Mon=0, Tue=1, ..., Sun=6)
all_adults_impressions Array(Float32),
`audience1_impressions` Array(Float32),
`audience2_impressions` Array(Float32),
-- ... 100 audience columns total ...
)

Now we had:
~100 audience columns instead of 16,800
Clean writes
Much lower metadata overhead
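The indexing convention in the DDL comment is easy to get wrong, so a quick sanity check helps. For example, Monday 8 am maps to index 0 × 24 + 8 + 1 = 9 (ClickHouse arrays are 1-based); a small check query against the table above:

SELECT
    id,
    audience1_impressions[9]  AS audience1_mon_8am,
    all_adults_impressions[9] AS all_adults_mon_8am
FROM frames_hourly_insights_array
LIMIT 5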
Initial tests with 80,000 frames and 100 audiences, run locally, were promising:
Duration: 137 ms
Memory: 21.7 MiB
Parts: 2

For a moment, this felt like the solution.
The Query Pattern Trap:
Those early tests used a simplified calculation:
Sum impressions across the selected hours
Compute one index from those totals (roughly as sketched below)
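In SQL terms, against the array table above and the audience1 columns, that simplified version looks roughly like this (natural delivery taken over the whole DMA with window functions, not per hour):

WITH per_frame AS (
    SELECT
        id,
        dma,
        -- 20 commute-hour indices (same as in the full query below): Mon-Fri, 8-10 am and 5-7 pm
        [9, 10, 18, 19, 33, 34, 42, 43, 57, 58, 66, 67,
         81, 82, 90, 91, 105, 106, 114, 115] AS target_idx,
        arraySum(i -> audience1_impressions[i], target_idx) AS audience_imp,
        arraySum(i -> all_adults_impressions[i], target_idx) AS total_imp
    FROM frames_hourly_insights_array
)
SELECT
    id,
    dma,
    round((audience_imp / total_imp)
          / (sum(audience_imp) OVER (PARTITION BY dma)
             / sum(total_imp) OVER (PARTITION BY dma)) * 100, 2) AS audience_index
FROM per_frame
ORDER BY audience_index DESC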
But that wasn’t the real requirement.
The correct computation is, for each selected hour h:
frame_ratio[h] = audience[h] / total[h]
natural_delivery[h] = SUM(audience[h]) / SUM(total[h]) (per DMA, across all frames)
hourly_index[h] = frame_ratio[h] / natural_delivery[h] × 100
Final score = AVG(hourly_index[h])
This requires joining at an hourly granularity.
Query:
WITH target_hours AS (
-- Array index = day_index * 24 + hour + 1 (Mon=0, Tue=1, ..., Fri=4)
SELECT arrayJoin([
9, 10, 18, 19, -- Monday: 8am, 9am, 5pm, 6pm
33, 34, 42, 43, -- Tuesday
57, 58, 66, 67, -- Wednesday
81, 82, 90, 91, -- Thursday
105, 106, 114, 115 -- Friday
]) AS hour_idx
),
exploded AS (
SELECT
id,
dma,
hour_idx,
all_adults_impressions[hour_idx] AS all_adults_imp,
audience1_impressions[hour_idx] AS audience_imp
FROM frames_hourly_insights_array
CROSS JOIN target_hours
),
hourly_nat AS (
SELECT
dma,
hour_idx,
sum(audience_imp) / sum(all_adults_imp) AS natural_delivery
FROM exploded
GROUP BY dma, hour_idx
),
hourly_index AS (
SELECT
e.id,
e.dma,
(e.audience_imp / e.all_adults_imp) / n.natural_delivery * 100 AS hour_index
FROM exploded e
JOIN hourly_nat n
ON e.dma = n.dma
AND e.hour_idx = n.hour_idx
)
SELECT
id,
dma,
round(avg(hour_index), 2) AS audience_index
FROM hourly_index
GROUP BY id, dma
HAVING audience_index >= 100
ORDER BY audience_index DESC

With arrays, that means one thing: explosion.
To join per-hour natural delivery, arrays must be converted into rows, which turns:
80,000 frames × 1 row each into 80,000 frames × 20 selected hours = 1,600,000 intermediate rows
Every query. This intermediate explosion:
Consumes memory
Adds CPU overhead
Defeats the original elegance of the array schema
Lesson:
Arrays are cheap to store. Exploding arrays repeatedly is not.
The array-based model would have been a good fit if the requirement were to aggregate impressions across hours first, rather than compute hour-level indices before aggregation.
The “Wrong” Solution: Pre-Exploded Rows
Against our instincts, we tried a schema that already lived at the query’s natural grain:
CREATE TABLE frames_hourly_insights_row_based (
id String,
dma LowCardinality(String),
-- ... other frame attributes (repeated 168× per frame!) ...
day_of_week LowCardinality(String), -- 'mon', 'tue', etc.
hour_of_day UInt8, -- 0-23
all_adults_impressions Float32, -- Single value for this hour
`audience1_impressions` Float32,
`audience2_impressions` Float32,
-- ... 100 audience columns ...
)

The Pre-Exploded table is ordered by (dma, day_of_week, hour_of_day, id), ensuring data locality for both hourly natural delivery aggregation and frame-level joins.
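Spelled out as DDL, a trimmed-down sketch of that layout might look like this (only the core columns are shown, and a plain MergeTree engine is assumed; the real table carries more frame attributes and audience columns):

CREATE TABLE frames_hourly_insights_row_based_sketch (
    id String,
    dma LowCardinality(String),
    day_of_week LowCardinality(String),  -- 'mon', 'tue', ...
    hour_of_day UInt8,                   -- 0-23
    all_adults_impressions Float32,
    audience1_impressions Float32
)
ENGINE = MergeTree
ORDER BY (dma, day_of_week, hour_of_day, id)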
However, with this approach, each frame becomes 168 rows — one per hour.
Yes:
Frame attributes are repeated
Row count explodes
Storage grows dramatically
But look at the shape:
dma | day | hour | id | audience impressions

This is exactly what the query wants.
No array explosion.
No CROSS JOIN.
Just filters and joins on native columns.
Query: Finding over-indexing frames for weekdays between 8–10 am and 5–7 pm becomes natural:
WITH hourly_nat AS (
SELECT
dma,
day_of_week,
hour_of_day,
sum(audience1_impressions) / sum(all_adults_impressions) AS natural_delivery
FROM frames_hourly_insights_row_based
WHERE day_of_week IN ('mon', 'tue', 'wed', 'thu', 'fri')
AND hour_of_day IN (8, 9, 17, 18)
GROUP BY dma, day_of_week, hour_of_day
),
hourly_index AS (
SELECT
r.id,
r.dma,
(r.audience1_impressions / r.all_adults_impressions)
/ n.natural_delivery * 100 AS hour_index
FROM frames_hourly_insights_row_based r
JOIN hourly_nat n
ON r.dma = n.dma
AND r.day_of_week = n.day_of_week
AND r.hour_of_day = n.hour_of_day
WHERE r.day_of_week IN ('mon', 'tue', 'wed', 'thu', 'fri')
AND r.hour_of_day IN (8, 9, 17, 18)
)
SELECT
id,
dma,
round(avg(hour_index), 2) AS audience_index
FROM hourly_index
GROUP BY id, dma
HAVING audience_index >= 100
ORDER BY audience_index DESC

Let’s break down what this query does:
hourly_nat CTE: For each DMA + day + hour combination, calculate the market’s natural delivery (what percentage of all impressions are from our target audience?)
hourly_index CTE: For each frame at each hour, compare the frame’s audience concentration to the market average. A frame where 15% of viewers are young professionals in a market where only 10% are — that’s an index of 150.
Final aggregation: Average each frame’s hourly indices to get one score per frame.
No CROSS JOIN. No array explosion. Just a natural join on existing dimensions.
Results: Less Data Magic, More Performance
At 80,000 frames with hourly-level index calculation:
| Metric         | Array Table | Pre-Exploded Table | Winner                     |
|----------------|-------------|--------------------|----------------------------|
| Query Duration | 1,391 ms    | 718 ms             | Pre-Exploded (1.9× faster) |
| Query Memory   | 95.73 MiB   | 69.11 MiB          | Pre-Exploded (1.4× less)   |
| Bytes Read     | 211.54 MiB  | 614.32 MiB         | Array (2.9× less)          |
| Storage Rows   | 160,002     | 17,694,010         | Array (110× fewer)         |

The Pre-Exploded table reads more data but processes it faster because:
No intermediate explosion: Data is already at the right granularity
Efficient joins: Day and hour are native columns, not computed indices
Lower memory: No need to materialize exploded array data
What This Is Really About
This wasn’t a win for Pre-Exploded storage over columnar storage.
It was a win for matching schema grain to query grain.
All three designs were columnar in the physical sense. The difference was where data reshaping happened:
The wide-table design pushed complexity into the schema width
The array-based design pushed complexity into runtime reshaping
The pre-exploded design pushed complexity into storage
Only one of those aligned with how the query actually executed.
Arrays work beautifully when you aggregate within them.
Arrays become expensive when you must repeatedly explode and rejoin them.
Pre-exploded data trades storage efficiency for predictable execution, stable memory usage, and simpler queries. Beyond median latency, that stability in memory usage also makes it safer under concurrent query load.
The deeper lesson is this:
If your query needs to repeatedly reshape data at runtime, that reshaping likely belongs in the schema.
This principle isn’t ClickHouse-specific. It applies to any analytical system where correctness, latency, and predictability matter more than theoretical elegance.
Schema design is ultimately about choosing where complexity lives.
In this case, putting that complexity into storage turned out to be the least expensive option.
Key Takeaways
Schema design is query-driven
Don’t design for elegance — design for how the data is actually consumed.
Avoid runtime reshaping in hot paths
Array explosion, CROSS JOINs, and dynamic joins are often more expensive than extra rows.
Columnar systems still have practical limits
Thousands of columns introduce memory and metadata overhead.
Benchmark real queries, not toy ones
Our initial tests favored arrays — until we ran the real calculation.
Storage is cheaper than correctness bugs and slow queries
Especially when user-facing performance is involved. In practice, the storage overhead is mitigated by efficient compression: low-cardinality dimensions (day_of_week, dma) and compact numeric types compress extremely well in ClickHouse (a quick way to verify this is shown below).
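If you want to verify that compression claim on your own data, per-column sizes are exposed in system.columns. A minimal sketch, assuming the pre-exploded table name used above:

SELECT
    name,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed,
    formatReadableSize(data_compressed_bytes)   AS compressed,
    round(data_uncompressed_bytes / data_compressed_bytes, 1) AS compression_ratio
FROM system.columns
WHERE database = currentDatabase()
  AND table = 'frames_hourly_insights_row_based'
ORDER BY data_compressed_bytes DESC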
Closing Thought
Sometimes a theoretical “wrong” approach is right. The goal isn’t schema purity, it’s query performance. And performance comes from understanding your access patterns, not from following architectural dogma.
Build what works. Measure. Iterate.
Review Credits
Thanks to Amaan and Puneet for reviewing this post and providing valuable feedback.
Tested on ClickHouse 24.1 on a local Docker container with 8GB RAM. Results may vary based on hardware, data distribution, and ClickHouse version.
Tags: ClickHouse, Database Design, Schema Design, Columnar Databases, Performance Optimization, Data Engineering