Smart Monorepos: Determining Impact from Deep Dependency Changes in Typescript



This content originally appeared on Level Up Coding – Medium and was authored by Nicolaos Moscholios

1. Intro

Note: this article will be most relevant if your project fits these criteria:

  • It’s a TypeScript project
  • You’re using a monorepo structure (e.g. with yarn workspaces)
  • Your monorepo contains multiple deployable applications (e.g. a web app, a design system app, an admin panel etc…)

At Sortlist, we have a large monorepo that hosts most of our client-facing frontend code. It contains four main apps, all built with Next.js:

  • apps/public hosts the code for our marketplace app, which you can visit at https://www.sortlist.com
  • apps/client and apps/agency are two authenticated applications for our users. We act as a middleman, so there is one application for each party: the service provider and the buyer.
  • apps/manager is our internal application used by the sales and customer success teams, essentially our admin panel

Then, within the same project we have the following packages:

.
├── apps/
│ ├── public
│ ├── agency
│ ├── client
│ └── manager
├── libs/
│ ├── config
│ ├── feature-flags
│ ├── media
│ ├── translation
│ ├── tracking
│ ├── utils
│ └── ...etc
└── packages/
  ├── data-fetching
  ├── e2e
  ├── design-system
  └── ...etc

That’s a lot of code that all of those apps depend on.

In this article I’ll show you how we built a simple code analysis script that lets us automatically deploy each app based on the code that was modified in the dependencies.

2. The problem

Take the following use case:

You create a pull request or merge a commit that contains changes within the design system package. This change may target a common component used within the apps, or even a piece of code which itself is a dependency of the design system (e.g. code inside utils).

Now because we have 4 applications, all of which take a while to compile, we wanted to avoid deploying everything for every change. This is because:

  1. CI time is expensive: even with your own runners, every minute spent costs money. In our case, each app is built into a Docker image and then deployed to a Kubernetes cluster
  2. With multiple people pushing code to the same monorepo, it’s not ideal having to create a new release for every app for every code push
  3. Sometimes the change does not actually affect the app, creating a “useless” release

So we decided to keep it simple and say:

If the change is outside of the apps, the developer should trigger a manual build for the apps they know their change might affect

This was clearly not the best approach, as the team often had trouble remembering to create a manual release. With every other part of the CI being automated, it’s easy to forget that your change needs a deployment of N apps. We often merged a fix thinking it would be released within the next 30 minutes, only to realise the next day that the bug was still live because no one had created a manual release (in one such case, the fix was within the tracking package).

3. Existing tools

There are tools that can help with this problem. For example, Paths Filter is a really nice GitHub Action that lets you define filters for changed paths. In fact, we initially used it to detect changes within the apps folder and create a release accordingly. However, this was not granular enough.

There are also CLI tools such as madge and dependency-cruiser. However, these mostly follow dependencies based on import statements rather than on how the imported code is actually used. For example:

// a.ts
export const thing = { deep: { value: 42 } }
// b.ts
import { thing } from './a'
const x = thing.deep.value

If you change thing.deep.value, madge will not register any new or broken dependency—because the import hasn't changed.

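// a and b are placeholder values so the snippet compiles
const a = 1;
const b = 2;
// foo is internal: it is not exported, only used by bar
function foo() {
  return a + b;
}
export function bar() {
  return foo();
}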

Similarly, if you modify the body of foo, the change will not be detected, even though foo is called by the exported function bar and may therefore change bar's behaviour.

The only real option here is to use the TypeScript compiler API, or better, a wrapper such as ts-morph, which is what we ended up using.

[Figure: Comparing TypeScript analysis tools]

4. Our approach

The requirements were simple. Given a pull request or commit push:

  • Detect if changes exist within the apps directories — then it’s clear — the affected app needs to be deployed
  • Detect changes within the libs or packages, and for every change, find the modified symbols
  • For each symbol, traverse the reference tree until you arrive at one of the apps — then, we know that app should be built and deployed
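
The first requirement can be sketched as a plain path classifier. This is a minimal illustration rather than the code from the gist: the app roots mirror the directory layout shown earlier, while `classifyChangedFiles`, `SHARED_ROOTS` and the returned shape are hypothetical names introduced here.

```typescript
// Hypothetical helper: split changed paths into "directly affected apps"
// and "shared files that need symbol-level analysis".
const APPS = ['apps/public', 'apps/agency', 'apps/client', 'apps/manager'];
const SHARED_ROOTS = ['libs/', 'packages/'];

interface ChangeClassification {
  directlyAffectedApps: string[]; // app roots touched by the diff
  filesToAnalyse: string[]; // libs/packages files to inspect further
}

function classifyChangedFiles(changedPaths: string[]): ChangeClassification {
  const directlyAffected = new Set<string>();
  const filesToAnalyse: string[] = [];

  for (const changedPath of changedPaths) {
    // A change inside apps/<name>/ marks that app immediately.
    const app = APPS.find((root) => changedPath.startsWith(`${root}/`));
    if (app !== undefined) {
      directlyAffected.add(app);
    } else if (SHARED_ROOTS.some((root) => changedPath.startsWith(root))) {
      filesToAnalyse.push(changedPath);
    }
    // Anything else (e.g. root config files) is ignored in this sketch.
  }

  return { directlyAffectedApps: Array.from(directlyAffected), filesToAnalyse };
}

const classification = classifyChangedFiles([
  'apps/public/src/core/repos/track.repo.ts',
  'libs/tracking/index.ts',
  'libs/tracking/src/analytics.ts',
]);
```

With the paths from the example diff shown later in this article, `apps/public` would be marked straight away and the two `libs/tracking` files would be handed to the symbol analysis.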

This is meant to run as a step within the CI, both for Pull Requests (to know which images have to be built for the previews), and on every push to dev and main, to know if we should create a pre-release or release.
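
As a rough sketch of how the result could drive the pipeline, the function below maps the CI trigger to an action. The `Trigger` and `Action` types and `planAction` are hypothetical, and pairing dev with a pre-release and main with a release is an assumption based on the order used in the sentence above.

```typescript
// Hypothetical mapping from the CI trigger to what happens with affected apps.
type Trigger =
  | { kind: 'pull_request' }
  | { kind: 'push'; branch: string };

type Action = 'build-preview' | 'pre-release' | 'release' | 'none';

function planAction(trigger: Trigger, affectedApps: string[]): Action {
  // Nothing affected means nothing to build or deploy.
  if (affectedApps.length === 0) return 'none';
  if (trigger.kind === 'pull_request') return 'build-preview';
  // Assumption: dev produces a pre-release, main produces a release.
  if (trigger.branch === 'dev') return 'pre-release';
  if (trigger.branch === 'main') return 'release';
  return 'none'; // pushes to other branches are not deployed in this sketch
}

const prAction = planAction({ kind: 'pull_request' }, ['public']);
const mainAction = planAction({ kind: 'push', branch: 'main' }, ['public', 'agency']);
```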

PS: you can find the whole code in this gist if you want to take it and modify it for your use case.

The script has 2 main parts:

Detection of modified symbols given a diff

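import { execSync } from 'node:child_process';

// Source of the file at the base revision; stays empty if the file is new
let oldCode = '';
try {
  oldCode = execSync(`git show ${baseArg}:${filePath}`, { encoding: 'utf-8' });
} catch {
  // git show fails when the file does not exist at the base revision
}
// Source of the file as it is now
const newSourceFile = project.getSourceFileOrThrow(filePath);
const newCode = newSourceFile.getFullText();
// Top-level identifiers keyed by name, before and after the change
const oldTop = getTopLevelIdentifiers(oldCode, filePath);
const newTop = getTopLevelIdentifiers(newCode, filePath);
// A name counts as changed if its entry differs, was added, or was removed
const changedSymbols = Object.keys({ ...oldTop, ...newTop }).filter((name) => oldTop[name] !== newTop[name]);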

We get the source code before and after the change, extract the top-level identifiers from each version, and compare them by name. Any identifier whose entry differs between the two versions, including ones that were added or removed, is considered changed.
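
To make the comparison concrete, here is a small worked example of the same filter on hand-written data. It assumes that `getTopLevelIdentifiers` returns a map from identifier name to source text; that shape, the `TopLevelMap` type and the `diffTopLevel` wrapper are assumptions for illustration.

```typescript
// Assumed shape: top-level identifier name -> source text of its declaration.
type TopLevelMap = Record<string, string>;

function diffTopLevel(oldTop: TopLevelMap, newTop: TopLevelMap): string[] {
  // Union of names from both versions; a name is "changed" when its text
  // differs, which also covers additions and removals (undefined !== text).
  return Object.keys({ ...oldTop, ...newTop }).filter(
    (name) => oldTop[name] !== newTop[name],
  );
}

const before: TopLevelMap = {
  foo: 'function foo() { return 1; }',
  bar: 'export function bar() { return foo(); }',
  legacy: 'const legacy = true;',
};
const after: TopLevelMap = {
  foo: 'function foo() { return 2; }', // body edited
  bar: 'export function bar() { return foo(); }', // untouched
  helper: 'const helper = () => 0;', // newly added
};

const changed = diffTopLevel(before, after);
// changed is ['foo', 'legacy', 'helper']
```

Note that `bar` is not reported here even though it calls the edited `foo`. Finding that kind of indirect impact is the job of the reference traversal described next.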

Recursive reference traversal from changed symbols

for (const name of changedSymbols) {
  const projectScope = getProjectScopeFromPath(filePath);
  const chainEntry = `${name} (${projectScope})`;
  // Stop early if all apps are affected anyway
  if (isAllAppsAffected()) break;
  getReferencesRecursive(name, newSourceFile, [chainEntry]);
}

Once we have our symbols, we perform a recursive traversal up the reference tree, which is roughly what VS Code does when you look up the references of a symbol (Shift+F12).

We repeat this for every ancestor of a given symbol until we reach one declared in a file inside one of our top-level apps (i.e. within the apps folder). Every time we reach an app, we store it in a Set, which lets us cut the search short in two cases:

  • All apps are affected: we already know we have to rebuild everything, so we stop searching
  • The symbol’s ancestors lead to an app that has already been marked as affected: we skip searching further along that branch
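
The traversal and its two shortcuts can be sketched without any compiler tooling. In the illustration below, a hand-written map stands in for the reference lookups that the real script performs through ts-morph. `SymbolNode`, `findAffectedApps`, the file paths and the hook names are all hypothetical, and symbols are keyed by name only to keep the example short.

```typescript
// Simplified stand-in for the real traversal: references come from a
// hand-written map instead of the TypeScript language service.
interface SymbolNode {
  file: string; // file where the symbol is declared
  referencedBy: string[]; // names of symbols that use this one
}

const APP_ROOTS = ['apps/public', 'apps/agency', 'apps/client', 'apps/manager'];

function findAffectedApps(
  graph: Map<string, SymbolNode>,
  changedSymbols: string[],
): { apps: Set<string>; chains: string[][] } {
  const apps = new Set<string>();
  const chains: string[][] = [];
  const visited = new Set<string>(); // guards against cycles and repeat work

  const visit = (symbolName: string, chain: string[]): void => {
    if (apps.size === APP_ROOTS.length) return; // everything is affected already
    if (visited.has(symbolName)) return;
    visited.add(symbolName);

    const node = graph.get(symbolName);
    if (node === undefined) return;

    const app = APP_ROOTS.find((root) => node.file.startsWith(`${root}/`));
    if (app !== undefined) {
      // Reached an app: record it once and stop climbing this branch.
      if (!apps.has(app)) {
        apps.add(app);
        chains.push([...chain, symbolName]);
      }
      return;
    }
    for (const referrer of node.referencedBy) visit(referrer, [...chain, symbolName]);
  };

  for (const symbolName of changedSymbols) visit(symbolName, []);
  return { apps, chains };
}

const graph = new Map<string, SymbolNode>([
  ['getAnalyticsInstance', { file: 'libs/tracking/src/analytics.ts', referencedBy: ['useTracker'] }],
  ['useTracker', { file: 'libs/tracking/src/hooks.ts', referencedBy: ['agencyHook', 'clientHook'] }],
  ['agencyHook', { file: 'apps/agency/src/hooks.ts', referencedBy: [] }],
  ['clientHook', { file: 'apps/client/src/hooks.ts', referencedBy: [] }],
]);

const traversal = findAffectedApps(graph, ['getAnalyticsInstance']);
```

Here a change to `getAnalyticsInstance` reaches `apps/agency` and `apps/client` through `useTracker`, and each recorded chain lists the symbols visited on the way.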

For informative purposes we also log the paths that triggered each app to be considered. This happens here:

if (isInAppFile(ancestor) && !isAlreadyAffected(ancestorFilePath)) {
  const affectedApp = apps.find((appPath) => ancestorFilePath.includes(appPath));
  if (affectedApp != null) {
    affectedApps.add(affectedApp);
    impactDetails.push({
      symbol: ancestorSymbolName,
      path: ancestorFilePath,
      chain: [...chain, chainEntry],
    });
  }
  break;
}

This is then logged to the terminal. Here is an example output of running the script with the given (simplified) PR diff:

Modified files
apps/public/src/core/repos/
└── track.repo.ts
libs/tracking/
├── index.ts
└── src/
├── analytics.ts
└── utils.ts+
🔍 Calculating diff between 1f6e17e6b2a67eeb40ea3b733cbbb711933a57cc and HEAD
🔍 Modified app path apps/public/src/core/repos/track.repo.ts: apps/public
🔍 Changed symbols in libs/tracking/index.ts: []
🔍 Changed symbols in libs/tracking/src/analytics.ts: [ 'getAnalyticsInstance', 'abTestCookieToObject' ]
Checking references for getAnalyticsInstance 2
Checking references for useTracker 17
Checking references for TrackingProvider 107
Checking references for TrackingContextProvider 0
Checking references for TrackingContextProvider 0
fatal: path 'libs/tracking/src/utils.ts' exists on disk, but not in 'origin/dev'
🔍 Changed symbols in libs/tracking/src/utils.ts: [ 'abTestCookieToObject', 'getAbTestData' ]
✅ Final Affected Apps: [ 'public', 'agency', 'client', 'manager' ]
🧭 Impact Details: [
{
symbol: 'useTackerBase',
path: '/home/runner/_work/frontend/frontend/apps/agency/src/core/hooks/useTracker.ts',
chain: [
'getAnalyticsInstance (libs/tracking)',
'useTracker (libs/tracking)',
'useTackerBase (apps/agency)'
]
},
{
symbol: 'useTackerBase',
path: '/home/runner/_work/frontend/frontend/apps/client/src/core/hooks/use-tracker/index.ts',
chain: [
'getAnalyticsInstance (libs/tracking)',
'useTracker (libs/tracking)',
'useTackerBase (apps/client)'
]
},
{
symbol: '__function',
path: '/home/runner/_work/frontend/frontend/apps/manager/src/core/context/tracking-context.tsx',
chain: [
'getAnalyticsInstance (libs/tracking)',
'useTracker (libs/tracking)',
'TrackingProvider (libs/tracking)',
'__function (apps/manager)'
]
}
]
Done in 16.34s.

Here we can see that this change requires all applications to be rebuilt, since the tracking package is used essentially everywhere. The public app was modified directly, so it was marked as affected without any reference traversal, which is why it has no entry in the impact details.

We can see the “chain” or tree traversal path of how the script detected each application.
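
If the raw object dump is too noisy for your CI logs, each chain can be collapsed into a single line. The `ImpactDetail` shape below is taken from the output above, while `formatChain` is a hypothetical helper that is not part of the original script.

```typescript
// Shape of one entry in the "Impact Details" output shown above.
interface ImpactDetail {
  symbol: string;
  path: string;
  chain: string[];
}

// Hypothetical helper: render the traversal path as a single line.
function formatChain(detail: ImpactDetail): string {
  return detail.chain.join(' -> ');
}

const agencyImpact: ImpactDetail = {
  symbol: 'useTackerBase',
  path: '/home/runner/_work/frontend/frontend/apps/agency/src/core/hooks/useTracker.ts',
  chain: [
    'getAnalyticsInstance (libs/tracking)',
    'useTracker (libs/tracking)',
    'useTackerBase (apps/agency)',
  ],
};

const chainLine = formatChain(agencyImpact);
// 'getAnalyticsInstance (libs/tracking) -> useTracker (libs/tracking) -> useTackerBase (apps/agency)'
```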

5. The results

Since using this script in our CI, we have avoided “forgotten” releases as the CI pipeline is able to better determine when apps need to be deployed.

This did of course cause more deployments than before, slightly increasing waiting times and resource usage, especially in cases like the one above where we modify a core dependency. But that’s the whole point after all.

Hope this can come in handy for anyone else, either to solve the same issue, or just to use it to get some data out of your code changes 😉

Additional reading:

11 Monorepo Build Tools You Should Know

