This content originally appeared on DEV Community and was authored by Franck Pachot
We often explain $lookup as similar to a relational JOIN, but there’s a key difference in the returned output format. In SQL, a JOIN returns a flat, tabular result set, which means values from the “one” side of a one‑to‑many relationship are repeated for each matching row on the “many” side. In MongoDB, $lookup outputs related data as arrays embedded within each document, so the “one” side appears only once and related items are grouped together.
Here’s the funny part: relational databases were invented to avoid duplication through relational normalization — yet a JOIN immediately re‑introduces duplication in the query results, consumed as one stream of records. That’s why application code or ORMs often have to break those repeated values back into a nested structure. In MongoDB, when data is stored using references, $lookup can resolve those references and embed the related documents directly in the output, avoiding duplication and producing a structure that’s naturally more application‑friendly. MongoDB is a document database, either you embed on write to store related data in a single document, or you embed on read from multiple documents with references, with $lookup, but the result is the same: an object aggregate in JSON, or rather BSON, that is fetched as-is by the drivers.
Still, MongoDB can provide the equivalent of an INNER JOIN or a LEFT OUTER JOIN, when adding an $unwind stage after the $lookup.
I’ll demonstrate this with the traditional department and employee schema, using the names from the SCOTT example schema of Oracle database, but as MongoDB collections:
db.createCollection("dept");
db.dept.insertMany([
{ deptno: 10, dname: "ACCOUNTING", loc: "NEW YORK" },
{ deptno: 20, dname: "RESEARCH", loc: "DALLAS" },
{ deptno: 30, dname: "SALES", loc: "CHICAGO" },
{ deptno: 40, dname: "OPERATIONS", loc: "BOSTON" }
]);
db.createCollection("emp");
db.emp.insertMany([
{ empno: 7839, ename: "KING", deptno: 10 },
{ empno: 7698, ename: "BLAKE", deptno: 30 },
{ empno: 7782, ename: "CLARK", deptno: 10 },
{ empno: 7566, ename: "JONES", deptno: 20 },
{ empno: 7788, ename: "SCOTT", deptno: 20 },
{ empno: 7902, ename: "FORD", deptno: 20 },
{ empno: 7369, ename: "SMITH", deptno: 20 },
{ empno: 7499, ename: "ALLEN", deptno: 30 },
{ empno: 7521, ename: "WARD", deptno: 30 },
{ empno: 7654, ename: "MARTIN",deptno: 30 },
{ empno: 7844, ename: "TURNER",deptno: 30 },
{ empno: 7876, ename: "ADAMS", deptno: 20 },
{ empno: 7900, ename: "JAMES", deptno: 30 },
{ empno: 7934, ename: "MILLER",deptno: 10 }
]);
Note that the “OPERATIONS” department has no employees at this point in time.
$lookup: join with no data duplication
In a SQL database, joining tables duplicates data to ensure each record has all necessary information for independent processing. When loading this data as objects rather than records, the application must detect where objects begin and end and eliminate duplicates—or rely on an ORM to do this automatically. Here is an example of a join to show the departments with their employees:
SQL> select dept.*,empno,ename
from dept left outer join emp on dept.deptno = emp.deptno
;
DEPTNO DNAME LOC EMPNO ENAME
_________ _____________ ___________ ________ _________
10 ACCOUNTING NEW YORK 7,782 CLARK
10 ACCOUNTING NEW YORK 7,839 KING
10 ACCOUNTING NEW YORK 7,934 MILLER
20 RESEARCH DALLAS 7,369 SMITH
20 RESEARCH DALLAS 7,566 JONES
20 RESEARCH DALLAS 7,788 SCOTT
20 RESEARCH DALLAS 7,876 ADAMS
20 RESEARCH DALLAS 7,902 FORD
30 SALES CHICAGO 7,499 ALLEN
30 SALES CHICAGO 7,521 WARD
30 SALES CHICAGO 7,654 MARTIN
30 SALES CHICAGO 7,698 BLAKE
30 SALES CHICAGO 7,844 TURNER
30 SALES CHICAGO 7,900 JAMES
14 rows selected.
MongoDB provides a consistent document model across both application code and database storage, and returns the same join result without duplication:
db.dept.aggregate([
{ $lookup: {
from: "emp",
localField: "deptno",
foreignField: "deptno",
as: "employees"
}
}
])
[
{
_id: ObjectId('691ef750f4a22d6613d4b0da'), deptno: 10, dname: 'ACCOUNTING', loc: 'NEW YORK',
employees: [
{ _id: ObjectId('691f1623f4a22d6613d4b0f1'), empno: 7839, ename: 'KING', deptno: 10 },
{ _id: ObjectId('691f1623f4a22d6613d4b0f3'), empno: 7782, ename: 'CLARK', deptno: 10 },
{ _id: ObjectId('691f1623f4a22d6613d4b0fe'), empno: 7934, ename: 'MILLER', deptno: 10 }
]
},
{
_id: ObjectId('691ef750f4a22d6613d4b0db'),deptno: 20,dname: 'RESEARCH',loc: 'DALLAS',
employees: [
{ _id: ObjectId('691f1623f4a22d6613d4b0f4'), empno: 7566, ename: 'JONES', deptno: 20 },
{ _id: ObjectId('691f1623f4a22d6613d4b0f5'), empno: 7788, ename: 'SCOTT', deptno: 20 },
{ _id: ObjectId('691f1623f4a22d6613d4b0f6'), empno: 7902, ename: 'FORD', deptno: 20 },
{ _id: ObjectId('691f1623f4a22d6613d4b0f7'), empno: 7369, ename: 'SMITH', deptno: 20 },
{ _id: ObjectId('691f1623f4a22d6613d4b0fc'), empno: 7876, ename: 'ADAMS', deptno: 20 }
]
},
{
_id: ObjectId('691ef750f4a22d6613d4b0dc'),deptno: 30,dname: 'SALES',loc: 'CHICAGO',
employees: [
{ _id: ObjectId('691f1623f4a22d6613d4b0f2'), empno: 7698, ename: 'BLAKE', deptno: 30 },
{ _id: ObjectId('691f1623f4a22d6613d4b0f8'), empno: 7499, ename: 'ALLEN', deptno: 30 },
{ _id: ObjectId('691f1623f4a22d6613d4b0f9'), empno: 7521, ename: 'WARD', deptno: 30 },
{ _id: ObjectId('691f1623f4a22d6613d4b0fa'), empno: 7654, ename: 'MARTIN', deptno: 30 },
{ _id: ObjectId('691f1623f4a22d6613d4b0fb'), empno: 7844, ename: 'TURNER', deptno: 30 },
{ _id: ObjectId('691f1623f4a22d6613d4b0fd'), empno: 7900, ename: 'JAMES', deptno: 30 }
]
},
{
_id: ObjectId('691ef750f4a22d6613d4b0dd'), deptno: 40,dname: 'OPERATIONS', loc: 'BOSTON',
employees: []
}
]
The $lookup operation in MongoDB is similar to a LEFT OUTER JOIN except that it keeps the data in a structured schema, to be consumed by the application.
$lookup + $unwind : inner join (with duplication)
The result of $lookup is the easiest for the application, but if you need a record-like output to process further, $unwind can unnest the array into as many documents as the array items:
db.dept.aggregate([
{ $lookup: {
from: "emp",
localField: "deptno",
foreignField: "deptno",
as: "employees"
}
},
{ $unwind:
"$employees"
}
])
[
{
_id: ObjectId('691ef750f4a22d6613d4b0da'), deptno: 10, dname: 'ACCOUNTING', loc: 'NEW YORK',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f1'), empno: 7839, ename: 'KING', deptno: 10 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0da'), deptno: 10, dname: 'ACCOUNTING', loc: 'NEW YORK',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f3'), empno: 7782, ename: 'CLARK', deptno: 10 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0da'), deptno: 10, dname: 'ACCOUNTING', loc: 'NEW YORK',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0fe'), empno: 7934, ename: 'MILLER', deptno: 10 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f4'), empno: 7566, ename: 'JONES', deptno: 20 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f5'), empno: 7788, ename: 'SCOTT', deptno: 20 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f6'), empno: 7902, ename: 'FORD', deptno: 20 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f7'), empno: 7369, ename: 'SMITH', deptno: 20 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0fc'), empno: 7876, ename: 'ADAMS', deptno: 20 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f2'), empno: 7698, ename: 'BLAKE', deptno: 30 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f8'), empno: 7499, ename: 'ALLEN', deptno: 30 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f9'), empno: 7521, ename: 'WARD', deptno: 30 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0fa'), empno: 7654, ename: 'MARTIN', deptno: 30 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0fb'), empno: 7844, ename: 'TURNER', deptno: 30 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0fd'), empno: 7900, ename: 'JAMES', deptno: 30 }
}
]
Internally, the implementation avoids materializing the array and immediately returns one document per match, the two stages $lookup and $unwind being optimized into a single lookup-unwind stage that acts like a SQL join.
Note that because there are no items in the employees array, there are no documents in the output for department 40. This is exactly like an INNER JOIN in SQL.
$lookup + $unwind with preserveNullAndEmptyArrays: left outer join
It is possible to preserve the LEFT OUTER JOIN output of $lookup through $unwind with preserveNullAndEmptyArrays: true:
db.dept.aggregate([
{ $lookup: {
from: "emp",
localField: "deptno",
foreignField: "deptno",
as: "employees"
}
},
{ $unwind: {
path: "$employees" ,
preserveNullAndEmptyArrays: true
}
}
])
[
{
_id: ObjectId('691ef750f4a22d6613d4b0da'), deptno: 10, dname: 'ACCOUNTING', loc: 'NEW YORK',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f1'), empno: 7839, ename: 'KING', deptno: 10 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0da'), deptno: 10, dname: 'ACCOUNTING', loc: 'NEW YORK',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f3'), empno: 7782, ename: 'CLARK', deptno: 10 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0da'), deptno: 10, dname: 'ACCOUNTING', loc: 'NEW YORK',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0fe'), empno: 7934, ename: 'MILLER', deptno: 10 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f4'), empno: 7566, ename: 'JONES', deptno: 20 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f5'), empno: 7788, ename: 'SCOTT', deptno: 20 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f6'), empno: 7902, ename: 'FORD', deptno: 20 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f7'), empno: 7369, ename: 'SMITH', deptno: 20 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0db'), deptno: 20, dname: 'RESEARCH', loc: 'DALLAS',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0fc'), empno: 7876, ename: 'ADAMS', deptno: 20 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f2'), empno: 7698, ename: 'BLAKE', deptno: 30 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f8'), empno: 7499, ename: 'ALLEN', deptno: 30 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0f9'), empno: 7521, ename: 'WARD', deptno: 30 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0fa'), empno: 7654, ename: 'MARTIN', deptno: 30 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0fb'), empno: 7844, ename: 'TURNER', deptno: 30 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0dc'), deptno: 30, dname: 'SALES', loc: 'CHICAGO',
employees: { _id: ObjectId('691f1623f4a22d6613d4b0fd'), empno: 7900, ename: 'JAMES', deptno: 30 }
},
{
_id: ObjectId('691ef750f4a22d6613d4b0dd'), deptno: 40, dname: 'OPERATIONS', loc: 'BOSTON'
}
]
With preserveNullAndEmptyArrays: true the empty array for department 40 still generated a document, but with no “employees” fields from employees as there are no employees. This is an advantage of the flexible schema: no need to add NULL values into fields for an outer join result.
That’s another funny thing in SQL databases, a NULL is supposed to mark an unknown value, and a primary key is supposed to never be NULL, but the result of a left outer join returns nulls into all columns from the right side, including columns from the primary key. MongoDB flexible schema allows the absence of the fields to represent this more clearly.
$lookup from an array in localField
In SQL, a many-to-many relationship requires an association table and two joins. In MongoDB this associative structure can be embedded as an array, and be queried with a single lookup. When localField is an array, $lookup automatically matches any value in that array to the foreign field. As an example, I store the previous departments of an employee:
db.emp.updateOne(
{ _id: ObjectId("691f1623f4a22d6613d4b0f5") }, // SCOTT
{ $set: { previousDepts: [ 30, 40 ] } }
)
I use the same $lookup, but on this history, and it joins each value:
db.emp.aggregate([
{ $match: {
"ename": "SCOTT"
}
},
{ $lookup: {
from: "dept",
localField: "previousDepts",
foreignField: "deptno",
as: "previousDepartments"
}
}
])
[
{
_id: ObjectId('691f1623f4a22d6613d4b0f5'),
empno: 7788,
ename: 'SCOTT',
deptno: 20,
deptHistory: [ { dept: 30 }, { dept: 40 } ],
previousDepts: [ 30, 40 ],
previousDepartments: [
{
_id: ObjectId('691ef750f4a22d6613d4b0dc'),
deptno: 30,
dname: 'SALES',
loc: 'CHICAGO'
},
{
_id: ObjectId('691ef750f4a22d6613d4b0dd'),
deptno: 40,
dname: 'OPERATIONS',
loc: 'BOSTON'
}
]
}
]
Conclusion
While MongoDB’s $lookup is often compared to SQL JOINs, it works differently and produces results that are better suited for applications, embedding related data as arrays within documents instead of duplicating values in a flat table. This avoids the extra processing that SQL applications or ORMs often need to transform repeated rows back into nested structures. Rather than trying to replicate SQL patterns in MongoDB, it’s important to think from the application’s perspective—how the data will be consumed—and use MongoDB’s document model, flexible schema, and aggregation features like $lookup, $unwind, and array matching to shape the results in the most useful form for your code. This approach leverages MongoDB’s strengths to deliver structured, ready‑to‑use JSON objects directly from the database.
This content originally appeared on DEV Community and was authored by Franck Pachot