This content originally appeared on DEV Community and was authored by Cristian Sifuentes
Kusto Query Language (KQL) — A Practical, No‑BS Guide for Engineers
Wondering if KQL is just a misspelling of SQL? It isn’t. KQL is a read‑only, pipe‑first query language built by Microsoft for Azure Data Explorer (ADX), Azure Monitor / Log Analytics, and Microsoft Sentinel. It’s designed for fast, interactive analysis of huge volumes of structured & semi‑structured data (logs, telemetry, events).
This guide compresses the essentials: what KQL is, how it differs from SQL, when to use it, and a set of copy‑paste recipes you can run today in Azure Logs.
TL;DR for busy devs/SREs
- Model: Data‑flow pipeline; results stream through operators chained with `|`.
- Scope: Read‑only. You query, filter, aggregate, join, visualize. You don’t mutate.
- Superpowers: Time‑series analytics, log crunching, security hunting, near‑real‑time slices.
- Where it runs: ADX, Azure Monitor Logs (App Insights, VM/Container Insights), Sentinel.
- Why it’s nice: Concise, composable, insanely fast at scale; lower cognitive load than SQL for logs.
KQL vs SQL (the quick mental model)
| Topic | KQL | SQL |
|---|---|---|
| Purpose | Query huge log/telemetry/event streams | Manage relational data (OLTP/OLAP) |
| Mutations | None (read‑only) | `INSERT`/`UPDATE`/`DELETE`/DDL |
| Query style | Pipes: `source \| where … \| summarize …` | Clauses: `SELECT … FROM … WHERE … GROUP BY …` |
| Order matters | Yes (each pipe filters/reduces) | Less so (optimizer can reorder) |
| Time‑series | First‑class: `make-series`, `series_*` | Possible but verbose |
| Learning curve | Friendly for logs | Powerful but heavier syntax |
Think of KQL as bash + pandas for logs, with a query engine that flies.
Core syntax by example
```kql
// Basic pattern
TableName
| where <predicate>
| project <columnA>, <columnB>
| summarize <agg>=count() by <dimension>
| order by <agg> desc
| take 10
```
Multi‑statement with variables
```kql
let TopTags =
    Tweets
    | where Language == "English"
    | summarize Count = count() by Hashtags
    | order by Count desc
    | take 10;
TopTags | project Hashtags
```
Joins
```kql
Requests
| join kind=leftouter (
    Dependencies
    | project operation_Id, dep = target, dep_ms = duration
) on operation_Id
| summarize avg_dep_ms = avg(dep_ms) by name
```
Time‑series
```kql
// StormEvents (ADX sample table) uses StartTime as its time column
StormEvents
| where State == "FLORIDA" and StartTime > ago(7d)
| make-series count() default=0 on StartTime step 1h by EventType
```
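A series produced by `make-series` can be charted directly or fed to the `series_*` functions. A sketch of anomaly detection on an hourly event series, assuming the ADX `StormEvents` sample table (whose time column is `StartTime`):

```kql
// Decompose the hourly series and flag outliers; 1.5 is the anomaly
// score threshold (lower = more sensitive).
StormEvents
| where StartTime > ago(30d)
| make-series events = count() default=0 on StartTime step 1h
| extend (anomalies, score, baseline) = series_decompose_anomalies(events, 1.5)
| render anomalychart
```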
Where KQL shines
- Troubleshooting & diagnostics: slice errors, correlate dependencies, hunt anomalies.
- Near‑real‑time ops: dashboards and alerts on live telemetry (App Insights, Container/VM logs).
- Security hunting: device events, sign‑in anomalies (Sentinel tables like `DeviceImageLoadEvents`):

```kql
let DriverLoads = DeviceImageLoadEvents
| where InitiatingProcessFileName == "spoolsv.exe";
DriverLoads
```

- Exploratory analytics: IoT/usage signals in ADX with interactive speed.
Quickstart: where to write KQL
- Azure Portal → Application Insights / Log Analytics / Sentinel → Logs: full Kusto editor with schema browser, Query Hub, charting, and Save/Share.
- Azure Data Explorer (Web UI / VS Code extension) for heavy analytics and long‑running queries.
SQL → KQL cheat sheet (practical mapping)
| SQL | KQL |
|---|---|
| `SELECT a, b FROM T WHERE x = 1` | `T \| where x == 1 \| project a, b` |
| `COUNT(*) … GROUP BY k` | `summarize count() by k` |
| `AVG(d)` | `summarize avg(d)` |
| `ORDER BY c DESC LIMIT 10` | `order by c desc \| take 10` |
| `LEFT JOIN` | `join kind=leftouter` |
| `WITH x AS (...) SELECT ...` | `let x = (...); <use x>` |
| `IS NULL` / `IS NOT NULL` | `isnull(x)` / `isnotnull(x)` |
Pro tip: In ADX you can translate SQL to KQL: prefix the SQL query with the keyword `explain` (ADX feature).
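A sketch of the `explain` prefix, using the ADX `StormEvents` sample database:

```kql
// Returns the KQL translation of the T-SQL query instead of running it.
explain
select top(10) State, count(*) as Cnt
from StormEvents
group by State
```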
Essential operators you’ll use daily
- `where`, `project`, `extend`, `summarize … by`, `order by`, `take`
- `join` (inner, leftouter, rightouter, fullouter, anti, semi)
- `parse`, `parse_json()`, `mv-expand` for JSON/arrays
- `make-series` + `series_*` functions for time windows
- `bin()` to bucket timestamps (e.g., `bin(timestamp, 5m)`)
- `render` to chart (`timechart`, `barchart`, `piechart`, `anomalychart`)
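A sketch of the JSON/array operators working together; the `tags` field inside `customDimensions` is illustrative, not a standard column:

```kql
// Parse a JSON payload, expand an array into one row per element,
// then bucket the expanded rows into 5-minute windows.
traces
| extend props = parse_json(customDimensions)
| mv-expand tag = props.tags
| summarize count() by tostring(tag), bin(timestamp, 5m)
```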
Copy‑paste recipes (Azure Monitor / App Insights)
1) Top failing operations (24h)
```kql
requests
| where timestamp >= ago(24h)
| summarize failures = countif(success == false), total = count() by name
| extend failureRate = todouble(failures) / total
| top 20 by failures desc
```
2) Slowest dependencies
```kql
dependencies
| where timestamp >= ago(24h)
| summarize p95_ms = percentile(duration, 95), calls = count() by type, target
| top 20 by p95_ms desc
```
3) Exceptions feed
```kql
exceptions
| where timestamp >= ago(24h)
| project timestamp, type, outerMessage, operation_Name, cloud_RoleName
| order by timestamp desc
```
4) Trace severity mapping
```kql
traces
| where timestamp >= ago(72h)
| extend severity = case(severityLevel == 0, "Verbose",
                         severityLevel == 1, "Information",
                         severityLevel == 2, "Warning",
                         severityLevel == 3, "Error",
                         severityLevel == 4, "Critical",
                         "Unknown")
| summarize count() by severity
```
5) Geo & client split (web apps)
```kql
pageViews
| where timestamp > ago(7d)
| summarize views = count() by client_Browser, client_OS, client_CountryOrRegion
| top 50 by views desc
```
Performance & cost tips
- Order operators to reduce first: `where` early → `summarize` → `project` → `order by` → `take`.
- Time‑bound always: start queries with `| where timestamp >= ago(...)` to hit hot cache/partitions.
- Use `project` to trim columns before joins and summaries (less memory, faster).
- Sample for exploration: `sample 10000` on massive tables, then tighten filters.
- Beware cardinality: avoid exploding with `mv-expand` unless you need to; filter first.
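The ordering advice above, as a sketch (table and column names follow the App Insights schema used elsewhere in this guide):

```kql
// Each pipe reduces the data the next operator sees.
requests
| where timestamp >= ago(1h)        // time-bound first: prunes partitions
| project name, duration            // keep only the columns you need
| summarize p95_ms = percentile(duration, 95) by name
| order by p95_ms desc
| take 10
```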
Control commands (not KQL, but you’ll meet them)
Control commands start with a dot (`.`) and do change metadata/data in ADX (not in Azure Monitor workspaces): `.create table`, `.ingest`, `.alter-merge policy`, etc. They can’t be embedded in KQL queries; this separation reduces security risk.
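A sketch of what these look like, for an ADX database where you have ingest rights; the `AppLogs` table and its columns are illustrative, and each command runs on its own, not inside a query:

```kql
// Create a table with an explicit schema.
.create table AppLogs (Timestamp: datetime, Level: string, Message: string)
```

```kql
// Ingest a couple of inline CSV rows into it (quick tests only;
// production ingestion uses data connections or queued ingestion).
.ingest inline into table AppLogs <|
2024-01-01T00:00:00Z,Error,"disk full"
2024-01-01T00:01:00Z,Warning,"retrying"
```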
Common gotchas
- Case sensitivity: KQL identifiers and functions are case‑sensitive in many contexts.
- Cross‑resource queries: Allowed in workspace‑based setups, but add latency; keep them short‑lived.
- Joins on timestamps: pre‑bin to a window and join on `(key, bin(timestamp, 1m))` for performance.
- Visualizations: `render` works in the Logs/ADX UI only; programmatic clients must pull the data and chart themselves.
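A sketch of the pre‑binned join, using the App Insights `requests`/`dependencies` tables from the recipes above:

```kql
// Bin both sides to 1-minute buckets so the join key is
// (operation_Id, minute) rather than an exact timestamp match.
requests
| extend minute = bin(timestamp, 1m)
| join kind=inner (
    dependencies
    | extend minute = bin(timestamp, 1m)
    | project operation_Id, minute, dep_ms = duration
) on operation_Id, minute
| summarize avg_dep_ms = avg(dep_ms) by name
```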
Practice: 5 micro‑challenges
1) Write a query that shows P95 latency by operation in the last 6 hours.
2) Count unique users hitting `/api/pay` in the last 7 days.
3) Join `requests` to `dependencies` and find the heaviest backend per operation.
4) Detect spikes using `anomalychart` on request rate per minute.
5) Build a `make-series` for errors per 5 minutes, per `cloud_RoleName`.
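One possible starting point for challenge 1 (App Insights `requests` table; not the only valid answer):

```kql
requests
| where timestamp >= ago(6h)
| summarize p95_ms = percentile(duration, 95) by name
| order by p95_ms desc
```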
Final take
KQL is not “SQL with typos.” It’s a purpose‑built language that makes log analysis and time‑series feel natural and fast. Learn the pipe mindset, memorize a dozen operators, and you’ll go from “log soup” to operational insight in minutes.
Written by: Cristian Sifuentes — full‑stack & cloud engineer. If this helped, drop a comment or ping me on dev.to.