TIL: Clustered and Non-clustered indices in SQL

This content originally appeared on DEV Community and was authored by Vishnu KN

When a table is created and rows are inserted, behind the scenes, SQL stores data relating to that table on what is known as data pages. This behind the scenes stuff is called SQL internals. And the way it is stored may be “clustered” or “non-clustered” (based on what is specified) or “heap” (if nothing is specified)

Heap is like haphazardly dumping the data wherever space is available.

AI analogy:

Clustered Index → Phone Book

A phone book lists people’s names in alphabetical order (A to Z).
The data itself (the phone numbers + addresses) is stored in that sorted order.
You don’t need another reference—just open directly to “S” to find “Sharma”.
That’s what a clustered index is: the table’s rows themselves are stored in sorted order (e.g., by Name or ID).

Non-Clustered Index → Yellow Pages

A yellow pages book lists businesses by category (Plumber, Electrician, Restaurants…).
Within each category, you see a list of businesses with a reference (page number / address).
The actual details are elsewhere—you use the index to jump there.
That’s what a non-clustered index is: it’s a separate structure that keeps pointers to where the actual data lives.

In short:

Clustered Index (Phone Book): Data itself is arranged in order → fast direct lookup.
Non-Clustered Index (Yellow Pages): Separate lookup guide with pointers → flexible, can have many.

In SQL server, IF primary key is defined, it automatically becomes the clustered index key. If not defined, there will be no clustered index key. And any column can be defined as the clustered index key.

AI Visulaization:

1⃣ Heap (no clustered index)

Rows are just placed wherever there’s space. No particular order:

Data Pages
+-------------------+   +-------------------+
| StudentID=3,Ravi  |   | StudentID=2,Meera |
| StudentID=1,Anil  |   |                   |
+-------------------+   +-------------------+

To find Name='Meera', SQL has to scan everything.

2⃣ Clustered Index on `StudentID`

Rows are physically sorted by StudentID across data pages:

Data Pages (clustered on StudentID)
+-------------------+   +-------------------+
| StudentID=1,Anil  |   | StudentID=3,Ravi  |
| StudentID=2,Meera |   |                   |
+-------------------+   +-------------------+

To find StudentID=2, SQL can go straight to it (fast).

3⃣ Non-Clustered Index on `Name` (table has clustered index on StudentID)

Separate structure, sorted by Name, pointing back to rows:

Non-Clustered Index (sorted by Name)
+--------------------+
| Anil   → StudentID=1
| Meera  → StudentID=2
| Ravi   → StudentID=3
+--------------------+

Clustered Data Pages (sorted by StudentID)
+-------------------+   +-------------------+
| 1,Anil,22         |   | 3,Ravi,20         |
| 2,Meera,21        |   |                   |
+-------------------+   +-------------------+

Steps when you query WHERE Name='Meera':

SQL looks in the non-clustered index → finds Meera → StudentID=2.
Uses the clustered index on StudentID → fetches row 2,Meera,21.

4⃣ Non-Clustered Index on `Name` (table is a heap)

If there’s no clustered index, the non-clustered index points to the physical address (RID):

Non-Clustered Index (sorted by Name)
+--------------------+
| Anil   → Page1,Slot2
| Meera  → Page2,Slot1
| Ravi   → Page1,Slot1
+--------------------+

Heap Data Pages (unordered)
+-------------------+   +-------------------+
| Page1: Ravi(3)    |   | Page2: Meera(2)   |
| Page1: Anil(1)    |   |                   |
+-------------------+   +-------------------+

Here, SQL jumps directly to the physical location.

Summary:

Heap = unordered, non-clustered indexes point to physical row location.
Clustered = ordered by clustered key, non-clustered indexes point to clustered key.