How a Bottleneck, a YouTube Video, and C# Channels Led to a 3x Faster .NET Backup Tool



This content originally appeared on DEV Community and was authored by Stewart Celani

When I first released my Autodesk Construction Cloud Backup tool back in 2022, I was pretty happy with it. It was reliably backing up about 150 GB of data in around 6 hours, which was a huge improvement over the commercial alternatives at the time.

Fast forward three years to 2025. The data had grown to about 225 GB, but the nightly backup was now taking a whopping 12 hours.

The data hadn’t even doubled, but the backup time had. Something felt really off with how it was scaling, and I wasn’t happy with it.

The Lightbulb Moment

I was already in the middle of updating the project to .NET 9 and taking the opportunity to swap out NLog for Serilog (just a personal preference; I love Serilog's structured logging). While taking a break, I stumbled upon a Nick Chapsas video about C# Channels.

YouTube thumbnail game on point

Not affiliated, but I do highly recommend Nick's Dometrain courses if you are serious about learning C#

I honestly didn’t know C# had powerful, built-in “queues” like this designed for exactly this kind of work. Suddenly, the solution to my scaling problem became crystal clear. The entire bottleneck was the tool’s dumb, sequential approach:

Project 1: Enumerate (30s) -> Download (5m) -> Done
Project 2: Enumerate (45s) -> Download (3m) -> Done  
Project 3: Enumerate (15s) -> Download (8m) -> Done

During each enumeration phase, the downloader sits completely idle. During each download phase, the enumerator sits completely idle. It’s like having a single-lane bridge where cars can only go one direction at a time!

I had my lightbulb moment. What if I could have workers that only enumerate projects, and while they’re doing that, another worker is already downloading a project the first one prepared? The downloader could be running the whole time, constantly being fed new work.
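
If you haven't used Channels before, here's a minimal, generic sketch of the pattern (illustrative only, not the backup tool's code): one task writes items into a Channel while a second task consumes them as they arrive.

using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Minimal producer-consumer sketch: just the pattern, not the tool's code
var channel = Channel.CreateUnbounded<string>();

var producer = Task.Run(async () =>
{
    foreach (var item in new[] { "Project A", "Project B", "Project C" })
    {
        await Task.Delay(100); // Simulate slow enumeration work
        await channel.Writer.WriteAsync(item);
    }
    channel.Writer.Complete(); // Tell the reader no more items are coming
});

var consumer = Task.Run(async () =>
{
    // ReadAllAsync yields each item as soon as it is written
    await foreach (var item in channel.Reader.ReadAllAsync())
        Console.WriteLine($"Downloading {item}...");
});

await Task.WhenAll(producer, consumer);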

How It Works

The core concept is a “conveyor belt” (the Channel) that holds projects ready for download. I chose an unbounded channel because project enumeration times can vary wildly—some projects take seconds while others can take 30+ minutes. A bounded channel could block fast enumerations waiting for slow downloads.

// Create an unbounded channel to queue enumerated projects
var enumerationChannel = Channel.CreateUnbounded<ProjectBackup>(new UnboundedChannelOptions
{
    SingleWriter = false, // Multiple enumeration tasks will write
    SingleReader = true   // Single download task will read
});
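
For contrast, here's a sketch of the bounded alternative I decided against: once the capacity is reached, WriteAsync suspends the writer until the reader frees up space, which is exactly the coupling I wanted to avoid.

// The bounded alternative (not used here): writers pause once the queue is full
var boundedChannel = Channel.CreateBounded<ProjectBackup>(new BoundedChannelOptions(capacity: 2)
{
    SingleWriter = false,
    SingleReader = true,
    FullMode = BoundedChannelFullMode.Wait // WriteAsync waits for free space
});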

The Producer task controls how many projects get enumerated simultaneously. I use a SemaphoreSlim rather than Parallel.ForEachAsync because I want fine-grained control over concurrency and better error handling per project.

// Producer task: Enumerate projects with controlled concurrency
var enumerationTask = Task.Run(async () =>
{
    var semaphore = new SemaphoreSlim(4, 4); // Allow up to 4 concurrent enumerations
    var enumerationTasks = new List<Task>();

    foreach (var project in _projects)
    {
        var projectTask = Task.Run(async () =>
        {
            await semaphore.WaitAsync();
            try
            {
                Logger.Info($"=> Enumerating project {project.Name} ({project.ProjectId})");
                Logger.Info("Querying for list of folders and files, this may take a while depending on project size.");
                project.BackupStartedAt = DateTime.Now;
                await project.GetContentsRecursively();
                Logger.Info($"=> Enumeration complete for {project.Name}, queuing for download");
                await enumerationChannel.Writer.WriteAsync(project);
            }
            catch (Exception ex)
            {
                Logger.Error(ex, $"Failed to enumerate project {project.Name} ({project.ProjectId})");
                // Still queue the project so it appears in the summary as failed
                if (project.BackupStartedAt == null)
                    project.BackupStartedAt = DateTime.Now;
                project.BackupFinishedAt = DateTime.Now;
                await enumerationChannel.Writer.WriteAsync(project);
            }
            finally
            {
                semaphore.Release();
            }
        });
        enumerationTasks.Add(projectTask);
    }

    await Task.WhenAll(enumerationTasks);
    enumerationChannel.Writer.Complete(); // Signal no more projects coming
});

The Consumer task runs continuously, processing projects as they become available. But here’s the cool part—it also tracks performance metrics to measure pipeline efficiency.

// Consumer task: Download projects sequentially with wait time tracking
var downloadTask = Task.Run(async () =>
{
    DateTime? lastDownloadEndTime = null;
    var isFirstProject = true;

    await foreach (var project in enumerationChannel.Reader.ReadAllAsync())
    {
        // Track wait time (time between downloads)
        var projectReceivedTime = DateTime.Now;
        if (lastDownloadEndTime.HasValue)
        {
            var waitTime = projectReceivedTime - lastDownloadEndTime.Value;
            _totalWaitTime = _totalWaitTime.Add(waitTime);
            Logger.Debug($"Waited {waitTime.TotalSeconds:F1}s for next project to be enumerated");
        }
        else if (isFirstProject)
        {
            // First project: measure wait from the start of the backup
            var waitTime = projectReceivedTime - _backupStartTime!.Value;
            _totalWaitTime = _totalWaitTime.Add(waitTime);
        }
        isFirstProject = false;

        // Download the project and track active download time
        var downloadStartTime = DateTime.Now;
        Logger.Info($"=> Downloading project {project.Name} ({project.ProjectId})");
        Logger.Info("Backup beginning.");

        // Use sanitized project name for directory creation
        var sanitizedProjectName = SanitizeProjectName(project.Name);
        await project.DownloadContentsRecursively(Path.Combine(Config.BackupDirectory, sanitizedProjectName));

        var downloadEndTime = DateTime.Now;
        var downloadTime = downloadEndTime - downloadStartTime;
        _totalDownloadTime = _totalDownloadTime.Add(downloadTime);
        lastDownloadEndTime = downloadEndTime;
        _projectsProcessed++;

        project.BackupFinishedAt = downloadEndTime;
        Logger.Info($"=> Finished downloading project {project.Name} ({project.ProjectId})");
        LogBackupSummaryLine(project);
    }
});

// Wait for both tasks to complete
await Task.WhenAll(enumerationTask, downloadTask);

What makes this implementation special is the performance tracking. I measure the wait time (when the downloader is idle waiting for projects) vs the download time (actively downloading). This tells me exactly how efficiently the pipeline is running. If wait time is high, I know enumeration is the bottleneck and I should increase concurrent enumerations. If it’s low, downloads are keeping up perfectly with enumerations.
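
The summary you'll see in the results section below is just these counters rolled up. Here's a minimal sketch of how that roll-up might look, using the _totalDownloadTime, _totalWaitTime, and _projectsProcessed fields from the consumer above (the helper itself is my illustration, not the tool's exact code):

// Sketch of the efficiency roll-up (illustrative; field names match the consumer above)
private void LogPipelineEfficiency()
{
    // In this pipeline the downloader is always either downloading or waiting,
    // so total pipeline time is the sum of the two counters
    var totalPipelineTime = _totalDownloadTime + _totalWaitTime;
    var efficiency = _totalDownloadTime.TotalSeconds / totalPipelineTime.TotalSeconds * 100;

    Logger.Info($"Total pipeline time: {totalPipelineTime:hh\\:mm\\:ss}");
    Logger.Info($"Active download time: {_totalDownloadTime:hh\\:mm\\:ss} ({efficiency:F1}%)");
    Logger.Info($"Wait time (idle): {_totalWaitTime:hh\\:mm\\:ss} ({100 - efficiency:F1}%)");
    Logger.Info($"Projects processed: {_projectsProcessed}");
}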

Notice that projects are processed sequentially by the consumer—this is intentional! The Autodesk API has rate limits, so downloading multiple projects simultaneously would just trigger throttling. But within each project, files download in parallel via Parallel.ForEachAsync.

Two-Level Parallelization Strategy

Here’s where it gets really interesting. The Channels handle project-level parallelization, but within each project, I also use Parallel.ForEachAsync for downloading individual files (which I previously blogged about here). This creates a two-tier performance boost:

public async Task<List<FileInfo>> DownloadFiles(
    IEnumerable<File> fileList, string rootDirectory, CancellationToken ct = default)
{
    // List<T>.Add is not thread-safe, so collect results in a ConcurrentBag
    // (requires System.Collections.Concurrent)
    var fileInfoBag = new ConcurrentBag<FileInfo>();

    var parallelOptions = new ParallelOptions
    {
        CancellationToken = ct,
        MaxDegreeOfParallelism = Config.MaxDegreeOfParallelism // Default: 8
    };

    await Parallel.ForEachAsync(fileList, parallelOptions, async (file, token) =>
    {
        var fileInfo = await DownloadFile(file, rootDirectory, token);
        fileInfoBag.Add(fileInfo);
    });

    return fileInfoBag.ToList();
}

So the complete picture is:

  • Level 1: Up to 4 projects being enumerated concurrently (gated by the SemaphoreSlim), feeding the Channel
  • Level 2: Within each downloading project, up to 8 files downloading in parallel via Parallel.ForEachAsync
  • Pipeline: While one project downloads its files, up to 4 others can be querying the API for their folder structures
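
Both knobs come from configuration. Here's a sketch of the relevant settings (BackupDirectory and MaxDegreeOfParallelism appear in the code above; MaxConcurrentEnumerations is my hypothetical name for the semaphore's limit of 4):

// Sketch of the relevant config; names not referenced in the code above
// are my assumptions, not the tool's actual config class
public class BackupConfig
{
    public string BackupDirectory { get; init; } = @"C:\Backups\ACC";
    public int MaxConcurrentEnumerations { get; init; } = 4; // Level 1: enumeration semaphore
    public int MaxDegreeOfParallelism { get; init; } = 8;    // Level 2: per-project file downloads
}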

The key insight is that enumeration (API calls to get folder/file lists) and downloading (HTTP file transfers) have completely different bottlenecks. Enumeration is limited by API rate limits, while downloading is limited by bandwidth. By separating them with Channels, each can run at its optimal speed.

The Results

The performance improvement was dramatic. That 12-hour backup of 225 GB across ~170 projects now finishes in just 4 hours—a 3x improvement! But the real magic is in the efficiency reporting the tool now generates.

Here’s what a typical backup now logs:

=================================================================================
=> Pipeline Efficiency Statistics:
    Total pipeline time: 03:45:23
    Active download time: 03:32:18 (94.2%)
    Wait time (idle): 00:13:05 (5.8%)
    Projects processed: 168
    Average wait between projects: 00:00:05
=================================================================================

This tells an amazing story. The pipeline is now 94.2% efficient—meaning the downloader is actively working 94% of the time instead of sitting idle waiting for the next project to be enumerated. The old sequential approach was probably around 50% efficient at best.

The tool also tracks incremental backup statistics (v1.1.0 added smart copying from previous backups):

=================================================================================
=> Incremental Backup Statistics:
    Files copied from previous backup: 38,542 (198.7 GB)
    Files downloaded from Autodesk: 1,688 (6.3 GB)
    Total files processed: 40,230
    Estimated time saved: 11h 2m
    Bandwidth saved: 198.7 GB
    Incremental efficiency: 96.93% data reused from previous backup
=================================================================================
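
The incremental logic itself isn't shown in this post, but conceptually the per-file decision might look something like this (the RelativePath/Size properties and the change check are my assumptions, not the tool's actual API):

// Conceptual sketch of the incremental per-file decision (hypothetical helpers/properties)
private async Task<FileInfo> GetFileIncrementally(
    File file, string currentBackupDir, string previousBackupDir, CancellationToken ct)
{
    var targetPath = Path.Combine(currentBackupDir, file.RelativePath);
    var previousPath = Path.Combine(previousBackupDir, file.RelativePath);
    Directory.CreateDirectory(Path.GetDirectoryName(targetPath)!);

    // If the file existed in the previous backup and appears unchanged
    // (e.g., same size/version/timestamp), copy it locally instead of downloading
    if (System.IO.File.Exists(previousPath) && new FileInfo(previousPath).Length == file.Size)
    {
        System.IO.File.Copy(previousPath, targetPath, overwrite: true);
        return new FileInfo(targetPath);
    }

    // Otherwise pull it fresh from Autodesk
    return await DownloadFile(file, currentBackupDir, ct);
}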

When you combine the producer-consumer pipeline with incremental backups, you get compounding performance benefits. The pipeline keeps the tool busy, while incremental sync avoids re-downloading unchanged files.

The Key Insight

The breakthrough wasn’t just using Channels—it was recognizing that enumeration and downloading are fundamentally different operations with different bottlenecks:

  • Enumeration: I/O bound, limited by API rate limits (~4 projects at once is optimal)
  • Downloading: Bandwidth bound, benefits from high parallelism (8 files at once)
  • Sequential coupling: The old approach made downloading wait for enumeration to finish

By decoupling them with Channels, each operation can run at its natural speed. The enumeration never blocks downloads, and downloads never wait for enumeration.

Overall, I’m incredibly happy with how this turned out. It was a great reminder that sometimes the most elegant solution to a nagging problem is just one YouTube video away. C# Channels turned what could have been a complex threading nightmare into clean, readable code that just works.

