Understanding MongoDB $lookup performance



This content originally appeared on DEV Community and was authored by Lucas Aguiar

After running into some issues with $lookup in our banking application, I decided to dive deeper into how it actually works and why it sometimes ends up being so slow.

The problem

It was just a normal day. I was working on a new feature when my tech lead mentioned me on Slack with something like:

“Lucas, your query is too slow”
(and then pasted the Explain Plan)

Right after that, he casually added: “This query took 33 minutes to finish.

In my head, I was like: “Wow… how did this slip through? The code has been there for three months already!
That’s when I decided to investigate the problem more deeply.

Here’s the query he sent me:

[
  {
    "$match": "simple start filtering"
  },
  {
    "$lookup": {
      "from": "Company",
      "let": {
        "company": "$company"
      },
      "pipeline": [
        {
          "$match": {
            "removedAt": null,
            "$expr": {
              "$and": [
                {
                  "$eq": [
                    "$_id",
                    "$$company"
                  ]
                }
              ]
            }
          }
        }
      ],
      "as": "company"
    }
  },
  {
    "$lookup": {
      "from": "Customer",
      "let": {
        "customer": "$customer"
      },
      "pipeline": [
        {
          "$match": {
            "removedAt": null,
            "$expr": {
              "$and": [
                {
                  "$eq": [
                    "$_id",
                    "$$customer"
                  ]
                }
              ]
            }
          }
        }
      ],
      "as": "customer"
    }
  },
  {
    "$match": "more filtering"
  },
]

Right after reading the query, my first thought was: “Why are we using $expr and pipelines inside the $lookup?”

So the very first thing I did was paste this aggregation pipeline into MongoDB Compass to run an explain and try to visualize why the query was so slow.
And to my surprise… I just got an infinite loading screen 😅

I promise, I waited a long time hoping to see some output. But in the end, all I got was a timeout error.

At that point, I decided to break things down step by step, isolating each stage to figure out where the bottleneck was hiding.

Testing

First, I ran an explain on just the $match. Everything worked perfectly: no problems at all. The query took only 10ms and used a simple company index.

[
  {
    "$match": {
      "removedAt": null,
      "company": ObjectId("some_ID"),
      "type": "TYPE_FILTERING"
    }
  }
]

So far, so good.

Next, I added the first $lookup—and here’s where things started to smell funny. The execution time jumped from 10ms to 314ms.

Before moving on, I decided to test both $lookups together, just to confirm where the problem was. And… boom! 6 minutes to return the result. So clearly, the issue had to be in the $lookups.

Understanding the lookup

Everything here are based on the official documentation of lookup on mongodb.

So, to start, $lookup has this syntax and these fields

{
   $lookup:
     {
       from: <collection to join>,
       localField: <field from the input documents>,
       foreignField: <field from the documents of the "from" collection>,
       let: { <var_1>: <expression>, …, <var_n>: <expression> },
       pipeline: [ <pipeline to run> ],
       as: <output array field>
     }
}

At the basic, the easy way to use an lookup is with localField and foreignField, but on our query we are using let and pipeline to find the results.
But why? At the application we use soft delete to prevent deleting important data by an error, so everything keeps stored but with removedAt flag, but we really needs to filter with pipeline?

Understanding the problem of $expr

According to the official $expr documentation:

When $expr appears in a $match stage that is part of a $lookup subpipeline, $expr can refer to let variables defined by the $lookup stage

The problem here is that when MongoDB applies the filter, it has to evaluate the condition document by document inside the collection. As the collection grows, this operation gets slower and slower, since it can’t rely on indexes the same way a direct match does.

In other words: $expr inside a $lookup pipeline forces MongoDB to do more work, which quickly becomes painful at scale.

The better approach is to use localField and foreignField whenever possible. That way, MongoDB can take advantage of indexes efficiently. For example:

  {
    "$lookup": {
      "from": "Company",
      "localField": "company",
      "foreignField": "_id",
      "pipeline": [
        {
          "$match": {
            "removedAt": null,

          }
        }
      ],
      "as": "company"
    }
  },

And here we go! 🎉
Now the response time is totally acceptable:

Improving the pipeline even more

After some experiments, I discovered that whenever we use a pipeline inside a $lookup, the query slows down a lot—sometimes 5x to 10x slower.

So, if your application doesn’t really need extra validations, I strongly recommend avoiding the pipeline. Instead, stick to localField/foreignField. This is much simpler and far more efficient when you’re just joining on _ids.

In my case, by removing the unnecessary pipeline, I improved the query from 259ms down to 36ms. Here’s the simplified version:

[
  {
    "$match": "simple filtering"
  },
  {
    "$lookup": {
      "from": "Company",
      "localField": "company",
      "foreignField": "_id",
      "as": "company"
    }
  },
]

Conclusion

After all the tweaks, the final query ended up looking like this:

[
  {
    "$match": "simple filtering"
  },
  {
    "$lookup": {
      "from": "Company",
      "localField": "company",
      "foreignField": "_id",
      "as": "company",
    }
  },
  {
    "$lookup": {
      "from": "Customer",
      "localField": "customer",
      "foreignField": "_id",
      "as": "customer",
    }
  },
  {
    "$match": "more filtering"
  },

And that’s it a (not so short) story, but with some practical tips on how to improve $lookup performance:

  • ✅ Prefer localField / foreignField whenever possible
  • ⚠ $expr can slow down queries a lot
  • 🛠 For more complex lookups, combine pipeline with localField/foreignField


This content originally appeared on DEV Community and was authored by Lucas Aguiar