Skip to content

Capture clickhouse profile events in query summaries.#10172

Merged
jmcarp merged 1 commit intomainfrom
jmcarp/clickhouse-profile
Mar 30, 2026
Merged

Capture clickhouse profile events in query summaries.#10172
jmcarp merged 1 commit intomainfrom
jmcarp/clickhouse-profile

Conversation

@jmcarp
Copy link
Copy Markdown
Contributor

@jmcarp jmcarp commented Mar 27, 2026

ClickHouse includes a collection of profile events by default when using the native tcp client. This patch captures those events, aggregating them by type and including aggregated profile events in the optional query profile section. We also make use of these profile summaries in the oxql benchmark, adding a new benchmark type that measures query cpu usage rather than latency.

Context: I wanted to evaluate #10110 more rigorously, and Claude noticed that we had access to clickhouse profiling events already. Looking at cpu profiles for that patch actually showed that latency improvements came at the cost of higher cpu use, which is annoying but useful to know.

@jmcarp jmcarp requested a review from bnaecker March 27, 2026 03:14
Copy link
Copy Markdown
Collaborator

@bnaecker bnaecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! I completely forgot that we already received and parsed these event messages.

I have a few suggestions, but they're fairly minor. Thanks!

fn bench_metric() -> BenchMetric {
match std::env::var("BENCH_METRIC").as_deref() {
Ok("cpu") => BenchMetric::CpuTime,
_ => BenchMetric::Latency,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably match explicitly here, rather than the _ wildcard, and fail if the benchmark isn't a supported one.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


fn bench_metric() -> BenchMetric {
match std::env::var("BENCH_METRIC").as_deref() {
Ok("cpu") => BenchMetric::CpuTime,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the string match the variant name? E.g. "cpu_time"?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is better. Updated.

s.profile_summary
.get("UserTimeMicroseconds")
.copied()
.unwrap_or(0)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little suspect of the unwrap_or(0). If the key isn't there, e.g., because of a change when we upgrade ClickHouse, I would want to know. Maybe at least an eprintln!() would help? Same note below too.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agreed with you and tried this out, but very occasionally we seem to get missing profile events for a query here and there, rarely enough that it's hard to reproduce. I'm leaving the unwrap_or with a comment for now.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, ok that's frustrating. Maybe add an eprintln! or something too, so we have a notification if / when it keeps happening. Up to you though.

/// Wall clock latency.
Latency,
/// Total cpu time.
CpuTime,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make a note that this is user + system time as reported by the DB.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@jmcarp jmcarp force-pushed the jmcarp/clickhouse-profile branch 2 times, most recently from 5e8371b to b77e857 Compare March 30, 2026 15:16
ClickHouse includes a collection of profile events by default when using
the native tcp client. This patch captures those events, aggregating
them by type and including aggregated profile events in the optional
query profile section. We also make use of these profile summaries in
the oxql benchmark, adding a new benchmark type that measures query cpu
usage rather than latency.
@jmcarp jmcarp force-pushed the jmcarp/clickhouse-profile branch from b77e857 to 20549dc Compare March 30, 2026 16:05
@jmcarp jmcarp enabled auto-merge (squash) March 30, 2026 16:06
@jmcarp jmcarp merged commit dfcd9ac into main Mar 30, 2026
16 checks passed
@jmcarp jmcarp deleted the jmcarp/clickhouse-profile branch March 30, 2026 16:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants