This content originally appeared on Level Up Coding – Medium and was authored by Yevhen Nosolenko

Legacy Lobotomy — Creating Management Commands for Seeding Teams and Users

The photo was generated by the author using ChatGPT-4o.

This is the 13th tutorial in our series on refactoring a legacy Django project. In this part, we’ll begin discussing how to populate the test database with sample data. Automated tests help us verify that the system’s logic behaves as expected and that data can be inserted, modified, deleted, and retrieved correctly. However, to profile the performance of our endpoints, we need the database to be filled with a volume of data comparable to production. Moreover, if we need to test how the system handles a specific amount of expected data, we should be able to generate that data on demand.

To learn more about the project used in this tutorial, or to find the other tutorials in this series, see the introductory article published earlier.

Legacy Lobotomy — Confident Refactoring of a Django Project

Origin branch and destination branch

If you want to follow the steps described in this tutorial, you should start from the analysis-of-user-roles-and-flow branch.
If you only want to see the final state of the code after all changes from this tutorial are applied, you can check the seeding-teams-and-users branch.

Choosing the right method

In the previous tutorial, we discussed different methods to populate a database with data. Now we need to choose the most suitable approach for our needs. Our purpose is not a quick demonstration, so we are not satisfied with having just a few entries in the database. Our goal is to check how the system works with the following amount of data:

10 teams
200 users
600 assignments

These are not just random numbers. This is what the client had in the development database when they complained that the system’s performance was very low. Let’s figure out which method is best for populating this amount of data.

We have to create more than 800 entries (10 + 200 + 600 = 810), which is tedious and error-prone to do manually. Therefore, using the Django admin panel is not a practical option.

We also cannot ask the client to share a backup of their database, from which we could prepare fixtures or simply restore the data. Writing that many fixtures by hand isn’t reasonable either, so fixtures are not an option.

Data migrations are not suitable here, because:

We only want to generate the data in the local and development environments, and it should not be present in staging and production.
We want to be able to run the generation process multiple times, which is impossible with migrations since each migration runs only once.

With that said, we can rule out this option as well. That leaves two options: custom scripts and Django management commands. Let’s define the other requirements for this task:

We want to reuse the same factories we use in the tests because this will simplify test development and populating the database with data.
We want to be able to use native features provided by Django, such as models and ORM.
We don’t have any specific requirements for the programming language, so we are absolutely fine with Python.
We want to implement tests for our scripts and make them run with the rest of the tests to quickly detect any problems with them.

Taking all of this into account, the most suitable method for us is creating Django management commands.

Deciding where to place seed commands

Since we are clear on the approach, we need to consider the best place to put our commands. There are two common strategies for organizing database seeding logic:

Dedicated seeding app — all commands are centralized in one place.
Distributed across Django apps — each app owns its own seeding logic.

While a dedicated app can simplify environment control (e.g., excluding it from INSTALLED_APPS in production), it often becomes bloated and harder to navigate in larger projects.
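
For comparison, the environment control mentioned above could look roughly like the sketch below. It only illustrates the alternative we did not choose: the seeding app name, the DJANGO_ENV variable, and the helper function are hypothetical and are not part of the project.

```python
import os

# Hypothetical base list; a real project has more entries.
BASE_APPS = ['django.contrib.auth', 'django.contrib.contenttypes', 'users']


def build_installed_apps(environment):
    """Return the app list, adding the dedicated seeding app only for local/development."""
    apps = list(BASE_APPS)
    if environment in ('local', 'development'):
        apps.append('seeding')  # hypothetical dedicated seeding app
    return apps


# DJANGO_ENV is an assumed variable name, not a Django setting.
INSTALLED_APPS = build_installed_apps(os.environ.get('DJANGO_ENV', 'production'))
print(INSTALLED_APPS)
```

With this layout, the seeding commands would simply not be registered in staging and production, because Django only discovers commands in installed apps.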

For this project, we’ve opted to place seeding commands directly inside their respective Django apps. This keeps the logic:

close to the models and factories it interacts with,
modular and easier to scale as the project grows,
simpler to test and maintain.

Installing additional dependencies

We need the factory_boy package installed to be able to use factories created earlier and add new factories when needed. We want to generate data in local and development environments. Therefore, we need to add a new dependency to the requirements/local.txt and requirements/development.txt files.

factory-boy==3.3.0

Now install the new requirement by running the command below from the root of the project.

pip install -r requirements/local.txt

If you see that no new packages were installed, it’s because we have already installed requirements listed in the requirements/testing.txt file, and the factory_boy package was installed as a dependency of the pytest-factoryboy package.
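
If you want to confirm which version is actually installed, a small check like the one below can help. It is a minimal sketch that uses only the standard library; the helper name installed_version is our own and is not part of the project.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None when the package is not installed.
print(installed_version('factory-boy'))
```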

Preparing the users app

We will start creating commands in the users app. As the first step, we need to update its structure.

Step 1. Add a new management package to the users app.

Step 2. Add a new commands package to the users/management folder. Django will register a manage.py command for each Python module in that directory whose name doesn’t begin with an underscore. For more details check the official documentation.

Step 3. Add a new management package to the users/tests folder.

Step 4. Add a new commands package to the users/tests/management folder. This package will contain tests for our commands.

The final structure of the users app should be as shown in the image below.

The final structure of the users app
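
The four steps above can also be scripted. The sketch below creates the same packages with pathlib; the helper name create_command_packages is hypothetical, and the demonstration runs in a temporary directory so that it does not touch a real project.

```python
import tempfile
from pathlib import Path


def create_command_packages(app_dir):
    """Create the management/commands packages for the app and for its tests."""
    app_dir = Path(app_dir)
    packages = [
        app_dir / 'management',
        app_dir / 'management' / 'commands',
        app_dir / 'tests' / 'management',
        app_dir / 'tests' / 'management' / 'commands',
    ]
    for package in packages:
        package.mkdir(parents=True, exist_ok=True)
        (package / '__init__.py').touch()  # marks the directory as a Python package
    return packages


# Demonstrated in a temporary directory instead of the real users app.
with tempfile.TemporaryDirectory() as tmp:
    for package in create_command_packages(Path(tmp) / 'users'):
        print(package.relative_to(tmp))
```

In the real project you would pass the path to the existing users app instead of a temporary directory.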

Creating a command for seeding teams

For the purpose of this tutorial, we need to create 200 regular users. Each regular user should be added to a team. We already have a RegularUserFactory class that generates a regular user and a team if needed. If we use this factory class to generate 200 users, 200 new teams will also be created and assigned to the users. As a result, all users will be added to different teams, which is not what we want. Instead, we want to create multiple teams and assign multiple users to each team. To achieve this, we need to create the teams first and then create users for each team.

Command requirements

We want our command to meet the following conditions:

If the command is called without arguments, 10 teams should be created by default.
The command should accept a single optional argument, --count, which can be any positive integer. If the argument is provided, the command should generate the requested number of teams.
The command should write the IDs of newly created teams to the standard output, so that we can use them later.

Since we are already clear on the requirements, let’s write some code.

Step 1: Define the team seeding command

Add a new seed_teams.py module with the following content to the users/management/commands directory.

from django.core.management import BaseCommand, CommandError

from users.factories import TeamFactory


class Command(BaseCommand):
    help = 'Seed fake teams.'

    def add_arguments(self, parser):
        parser.add_argument(
            '--count',
            required=False,
            type=int,
            default=10,
            help='The number of teams which should be created.'
        )

    def handle(self, count, *args, **options):
        if not isinstance(count, int) or count < 1:
            raise CommandError('The --count argument must be an integer value greater than or equal to 1.')

        created_teams = TeamFactory.create_batch(count)
        created_teams_output = ','.join([str(team.pk) for team in created_teams])
        self.stdout.write(created_teams_output)

In this command we used the TeamFactory class added earlier.
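
The parser that Django passes to add_arguments is based on argparse, so the behavior of the --count option can be tried out with the standard library alone. The sketch below is a stand-in for the real command, not a copy of it; the build_parser and is_valid_count helpers are our own names. It also shows why handle still validates the value: judging by the tests in the next step, values passed to call_command as keyword arguments reach handle without the type=int conversion.

```python
import argparse


def build_parser():
    """Stand-in for the parser Django hands to add_arguments()."""
    parser = argparse.ArgumentParser(prog='seed_teams')
    parser.add_argument('--count', required=False, type=int, default=10)
    return parser


def is_valid_count(count):
    """Same check as in handle(): an integer that is at least 1."""
    return isinstance(count, int) and count >= 1


parser = build_parser()
print(parser.parse_args([]).count)                # 10 (the default)
print(parser.parse_args(['--count', '5']).count)  # 5 (converted by type=int)

# Unconverted values, such as those used in the tests, must be rejected explicitly.
print([is_valid_count(value) for value in (-10, 0, 0.5, 'invalid', 3)])
# [False, False, False, False, True]
```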

Step 2: Implement tests

Create a new module named test_seed_teams.py in the users/tests/management/commands directory with the following content.

import pytest
from django.core.management import CommandError, call_command

from users.models import Team


@pytest.mark.django_db
class TestSeedTeams:
    def test_when_count_argument_is_not_provided_then_10_teams_should_be_created(self):
        call_command('seed_teams')

        assert Team.objects.count() == 10

    @pytest.mark.parametrize('count', [-10, 0, 0.5, 'invalid'])
    def test_when_count_argument_is_invalid_then_error_should_be_thrown(self, count):
        with pytest.raises(CommandError) as exc_info:
            call_command('seed_teams', count=count)

        assert str(exc_info.value) == 'The --count argument must be an integer value greater than or equal to 1.'

    @pytest.mark.parametrize('count', [1, 5, 12])
    def test_when_count_argument_is_provided_then_requested_number_of_teams_should_be_created(self, count):
        call_command('seed_teams', count=count)

        assert Team.objects.count() == count

    @pytest.mark.parametrize('count', [1, 5, 12])
    def test_when_team_is_created_then_its_id_should_be_written_to_standard_output(self, count, capsys):
        call_command('seed_teams', count=count)

        captured_output = capsys.readouterr().out.strip()
        captured_team_ids = {int(team_id) for team_id in captured_output.split(',')}

        assert len(captured_team_ids) == count

Step 3: Run and verify

Run the tests to verify that they pass successfully.

Next, run the command in the terminal from the project’s root directory and verify that the teams have been created successfully.

python src/manage.py seed_teams

Now, run the command with the --count argument from the root of the project and verify that 5 additional teams are successfully created.

python src/manage.py seed_teams --count 5

The image below shows the output from running both commands.

Running the seed_teams command from the terminal

As we can see, the tests have passed and the command ran successfully, so we’re done with this command.
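
Because the command prints the IDs as a single comma-separated line, the output is easy to consume from another script. Below is a minimal sketch of such parsing; the parse_ids helper and the sample output string are made up for illustration.

```python
def parse_ids(output):
    """Turn the comma-separated output of a seed command into a list of integers."""
    return [int(value) for value in output.strip().split(',') if value.strip()]


sample_output = '11,12,13\n'  # made-up example of what seed_teams might print
print(parse_ids(sample_output))  # [11, 12, 13]
```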

Creating a command for seeding users

Now that we have a way to generate teams, the next step is to create users and assign them to those teams. For this, we will create a management command that generates regular users and assigns each of them to a team. Unlike our earlier approach where using the RegularUserFactory directly would create a new team for every user, we now want to reuse existing teams.

Command requirements

We want our command to meet the following conditions:

It should accept an optional argument --team-id. If it’s provided, all the users should be assigned to this team; otherwise, each user should be assigned to a random team.
It should accept an optional argument --count to specify how many users to create. By default, it should create 10 users.
The command should output the IDs of the newly created users to the standard output, so they can be saved for future use.

With the requirements in place, let’s dive into the implementation.

Step 1: Define the user seeding command

Add a new seed_users.py module with the following content to the users/management/commands directory.

from django.core.management import BaseCommand, CommandError

from users.factories import RegularUserFactory
from users.models import Team


class Command(BaseCommand):
    help = 'Seed fake users.'

    def add_arguments(self, parser):
        parser.add_argument(
            '--team-id',
            required=False,
            type=int,
            help='The team ID the users should be assigned to.'
        )

        parser.add_argument(
            '--count',
            required=False,
            type=int,
            default=10,
            help='The number of users which should be created.'
        )

    def handle(self, count, *args, **options):
        if not isinstance(count, int) or count < 1:
            raise CommandError('The --count argument must be an integer value greater than or equal to 1.')

        try:
            # Retrieve a team if team id is provided.
            team = self._parse_team(**options)

            # If a team wasn't provided, assign the user to a random team.
            created_users = [RegularUserFactory(team=team or self._fetch_random_team()) for _ in range(count)]
        except Team.DoesNotExist as exc:
            # Chain the original exception so the traceback shows the root cause.
            raise CommandError('The team does not exist.') from exc

        created_users_output = ','.join([str(user.pk) for user in created_users])
        self.stdout.write(created_users_output)

    def _parse_team(self, **options):
        team_id = options.get('team_id')

        # Since we don't have a default value for this argument, we want to validate it only when it's passed.
        if team_id is None:
            return None

        if not isinstance(team_id, int) or team_id < 1:
            raise CommandError('The --team-id argument must be an integer value greater than or equal to 1.')

        return Team.objects.get(pk=team_id)

    def _fetch_random_team(self):
        return Team.objects.order_by('?')[:1].get()

In the handle method, we retrieve the team if the --team-id argument is provided; otherwise, we assign each user to a randomly selected team. The random selection uses the order_by('?')[:1].get() construction, which fetches one random object or raises a DoesNotExist error if there are no rows. If the specified team doesn’t exist or there are no teams at all, we raise a CommandError.
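
Conceptually, the random lookup orders the rows by a random value and takes the first one. The sketch below reproduces that idea with the standard sqlite3 module; it is an analogy rather than the SQL Django actually generates (the exact random function depends on the database backend), and the team table and the fetch_random_team_id helper are hypothetical.

```python
import sqlite3


def fetch_random_team_id(connection):
    """Return one random team id, or None when the table is empty."""
    row = connection.execute('SELECT id FROM team ORDER BY RANDOM() LIMIT 1').fetchone()
    return row[0] if row else None


connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE team (id INTEGER PRIMARY KEY, name TEXT)')  # hypothetical table

# Empty table: no row is returned (the ORM raises Team.DoesNotExist in this case).
print(fetch_random_team_id(connection))  # None

connection.executemany('INSERT INTO team (name) VALUES (?)', [('A',), ('B',), ('C',)])
print(fetch_random_team_id(connection) in {1, 2, 3})  # True
```

The Django documentation warns that order_by('?') can be expensive and slow on some backends. The command runs it once per created user, which is acceptable for a few hundred users but worth keeping in mind for much larger batches.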

Step 2: Implement tests

Add a new module named test_seed_users.py in the users/tests/management/commands directory with the following content.

import pytest
from django.contrib.auth import get_user_model
from django.core.management import CommandError, call_command

from users.models import Team

User = get_user_model()


@pytest.mark.django_db
class TestSeedUsers:
    def test_when_there_are_no_teams_then_error_should_be_thrown(self):
        with pytest.raises(CommandError) as exc_info:
            call_command('seed_users')

        assert str(exc_info.value) == 'The team does not exist.'

    def test_when_team_with_provided_id_does_not_exist_then_error_should_be_thrown(self, team):
        with pytest.raises(CommandError) as exc_info:
            call_command('seed_users', team_id=team.pk + 1)

        assert str(exc_info.value) == 'The team does not exist.'

    def test_when_team_id_is_not_provided_then_random_team_should_be_selected(self, team_factory):
        teams_total_count = 5
        teams = team_factory.create_batch(teams_total_count)
        call_command('seed_users')

        existing_team_ids = {team.pk for team in teams}
        used_team_ids = {user.team_id for user in User.objects.all()}

        # Since the team selection is randomized, we can only verify that more than one team was used.
        # There's still a small chance that all 10 users receive the same team,
        # but the probability is low enough to be negligible.
        assert len(used_team_ids) > 1

        # Ensure that only existing teams are used and no new teams are created.
        assert used_team_ids - existing_team_ids == set()

    @pytest.mark.parametrize('team_id', [-10, 0, 0.5, 'invalid'])
    def test_when_team_id_argument_is_invalid_then_error_should_be_thrown(self, team_id):
        with pytest.raises(CommandError) as exc_info:
            call_command('seed_users', team_id=team_id)

        assert str(exc_info.value) == 'The --team-id argument must be an integer value greater than or equal to 1.'

    @pytest.mark.parametrize('count', [-10, 0, 0.5, 'invalid'])
    def test_when_count_argument_is_invalid_then_error_should_be_thrown(self, count, team):
        with pytest.raises(CommandError) as exc_info:
            call_command('seed_users', team_id=team.pk, count=count)

        assert str(exc_info.value) == 'The --count argument must be an integer value greater than or equal to 1.'

    def test_when_count_argument_is_not_provided_then_10_users_should_be_created(self, team):
        call_command('seed_users', team_id=team.pk)

        assert User.objects.count() == 10

    @pytest.mark.parametrize('count', [1, 5, 12])
    def test_when_count_argument_is_provided_then_requested_number_of_users_should_be_created(self, count, team):
        call_command('seed_users', team_id=team.pk, count=count)

        assert User.objects.count() == count

    def test_when_team_id_provided_then_all_users_should_be_assigned_to_the_same_team(self, team):
        call_command('seed_users', team_id=team.pk)

        user_team_ids = {user.team_id for user in User.objects.all()}

        assert user_team_ids == {team.pk}

        # Verify that new teams were not created.
        assert Team.objects.count() == 1

    @pytest.mark.parametrize('count', [1, 5, 12])
    def test_when_user_is_created_then_its_id_should_be_written_to_standard_output(self, count, team, capsys):
        call_command('seed_users', team_id=team.pk, count=count)

        captured_output = capsys.readouterr().out.strip()
        captured_user_ids = {int(user_id) for user_id in captured_output.split(',')}

        assert len(captured_user_ids) == count
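
The comment in the random team test calls the chance of a false failure negligible. A quick calculation supports this, assuming each of the 10 users picks one of the 5 teams independently and uniformly: the probability that all of them land in the same team is 5 × (1/5)^10. The helper name below is our own.

```python
from fractions import Fraction


def same_team_probability(teams, users):
    """Probability that `users` independent uniform picks all hit the same one of `teams` teams."""
    return teams * Fraction(1, teams) ** users


probability = same_team_probability(teams=5, users=10)
print(probability)         # 1/1953125
print(float(probability))  # about 5.12e-07
```

In other words, the test is expected to fail spuriously roughly once in two million runs.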

Step 3: Run and verify

Run all the tests once again and ensure they all pass.

Run the following command from the project’s root directory, and verify that 10 users have been created in the database and assigned to arbitrary teams.

python src/manage.py seed_users

Then, run the following command, and verify that 10 more users have been created and assigned to the team specified by the --team-id argument (which should match the ID of one of the previously created teams).

python src/manage.py seed_users --team-id <TEAM ID>

Now we can run the command with the --count argument to specify the number of users we want to create, and verify that another batch of 25 users has been successfully created.

python src/manage.py seed_users --count 25

The image below shows the output from running these commands.

Running the seed_users command from the terminal

After running the tests and experimenting with the seed_users command, we can conclude that it is complete.
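
If you prefer an even distribution of users across teams over random assignment, the team IDs printed by seed_teams can be turned into one seed_users invocation per team. The sketch below only builds the command lines and does not run them; the plan_seed_users helper, the even split, and the sample team IDs are assumptions made for illustration.

```python
def plan_seed_users(team_ids, total_users):
    """Split total_users across team_ids as evenly as possible and return argv lists."""
    if not team_ids:
        raise ValueError('At least one team ID is required.')
    base, extra = divmod(total_users, len(team_ids))
    commands = []
    for index, team_id in enumerate(team_ids):
        # The first `extra` teams receive one additional user.
        count = base + (1 if index < extra else 0)
        if count:
            commands.append([
                'python', 'src/manage.py', 'seed_users',
                '--team-id', str(team_id), '--count', str(count),
            ])
    return commands


# Made-up team IDs; in practice they come from the seed_teams output.
for command in plan_seed_users(team_ids=[1, 2, 3], total_users=8):
    print(' '.join(command))
```

Each resulting list can be passed to subprocess.run, or the same values can be passed to call_command from Python code.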

Conclusion

In this tutorial, we implemented custom Django management commands to seed teams and users with test data. We focused on reusing existing factories, ensuring the commands are testable, and structuring the code to scale. We can now efficiently populate the development database with teams and users. In the next tutorial, we will extend this approach by seeding assignment categories and targets.
