I’m Building an AI Agent to Write My Unit Tests



This content originally appeared on DEV Community and was authored by Hernán Chilabert

Hey DEV community! 👋

Like many of you, I’ve spent countless hours writing unit tests. It’s one of the most critical parts of building reliable software, but it can also be a real grind. As I’ve been diving deeper into the world of AI Agents, I thought: what if I could automate this?

So, I started a tiny project to build my own AI agent to handle it. This is my journey of learning in public, and I wanted to share the first version with you all.

What I’ve Built So Far: The “Dev Engineer” Agent

The first phase is a simple but functional “Dev Engineer” agent. The concept is straightforward:

  1. You give it a Python source file.
  2. It gives you back a test_<filename>.py file with unit tests ready to run with pytest.

Under the hood, it’s a Python script that uses LangChain to orchestrate the logic and an OpenAI LLM to generate the code. It’s a simple but powerful starting point.

The Big Picture: An Autonomous Testing Team

This is just the beginning. The ultimate goal isn’t just to generate tests, but to create a collaborative team of AI agents that can ensure code quality autonomously. The vision is to build a “QA Engineer” agent that will work alongside the “Dev Engineer” in a feedback loop:

  1. The Dev Agent writes the tests.

  2. The QA Agent runs them, checks for failures, and analyzes code coverage.

  3. If anything is wrong, the QA Agent sends feedback to the Dev Agent.

  4. The Dev Agent corrects the tests and sends them back.

  5. …and so on, until we have a robust and passing test suite.
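In pseudocode, that loop could look something like this. It’s a sketch of the idea, not the project’s implementation: the agents are passed in as objects, and the method names (`write_tests`, `run_tests`, `build_feedback`, `fix_tests`) are hypothetical:

```python
def feedback_loop(dev_agent, qa_agent, source_code,
                  target_coverage=0.9, max_rounds=5):
    """Iterate Dev -> QA until the suite passes with enough coverage."""
    tests = dev_agent.write_tests(source_code)
    for _ in range(max_rounds):
        report = qa_agent.run_tests(source_code, tests)
        if report["passed"] and report["coverage"] >= target_coverage:
            return tests  # robust, passing test suite
        feedback = qa_agent.build_feedback(report)
        tests = dev_agent.fix_tests(tests, feedback)
    raise RuntimeError("Could not reach target coverage within max_rounds")
```

The `max_rounds` cap matters: without it, two disagreeing agents can ping-pong forever and burn tokens.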

Let’s Build This Together! (Call for Collaboration)

This project is my personal learning playground, but I believe it has the potential to become a genuinely useful tool for the community. That’s where you come in.

I’m building this completely open source, and I would love for you to get involved. Whether you’re an AI expert or just a curious developer, there are plenty of ways to contribute:

  • Check out the code and give feedback.

  • Suggest new features or improvements.

  • Tackle an open issue or a task from the roadmap below.

I’ve laid out a clear plan for where the project is headed. Take a look and see if anything sparks your interest!

🚀 Project Roadmap

This project is under active development. Below is a summary of my progress and a look at what’s ahead. Contributions are highly encouraged!

✅ Phase 1: Core Test Generation Engine (MVP)
[x] Develop “Dev Engineer” Agent: A core agent capable of generating unit tests from a single Python source file.

[x] LLM Integration: Connect the agent to a foundational LLM (e.g., GPT-4o, Llama 3) to power code generation.

[x] Basic CLI: A simple command-line interface to input a file and receive the generated test file.

🎯 Phase 2: Multi-Agent Collaboration & Feedback Loop
[ ] Introduce “QA Engineer” Agent: Develop a second agent responsible for reviewing, validating, and executing the generated tests.

[ ] Implement Test Execution Tool: Create a secure tool for the QA Agent to programmatically run pytest, capture results, and parse code coverage reports.

[ ] Establish Collaborative Framework (CrewAI): Refactor the agent logic into a Crew to manage the feedback loop, allowing the Dev Agent to fix tests based on the QA Agent’s feedback until a target coverage is achieved.
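For the test execution tool, one possible shape is a thin subprocess wrapper around pytest. This is a sketch under my own assumptions, not the project’s code; the `--cov` and `--cov-report` flags come from the pytest-cov plugin, which would need to be installed:

```python
import json
import subprocess
import sys

def build_pytest_command(test_path: str, cov_target: str) -> list[str]:
    """Assemble a pytest invocation that writes a JSON coverage report."""
    return [
        sys.executable, "-m", "pytest", test_path,
        f"--cov={cov_target}",               # requires the pytest-cov plugin
        "--cov-report=json:coverage.json",
        "-q",
    ]

def run_tests(test_path: str, cov_target: str) -> dict:
    """Run pytest in a subprocess; return pass/fail plus coverage if available."""
    result = subprocess.run(
        build_pytest_command(test_path, cov_target),
        capture_output=True, text=True,
    )
    report = {"passed": result.returncode == 0, "output": result.stdout}
    try:
        with open("coverage.json") as f:
            report["coverage"] = json.load(f)["totals"]["percent_covered"]
    except FileNotFoundError:
        report["coverage"] = None
    return report
```

Running via `sys.executable -m pytest` in a subprocess (rather than importing pytest in-process) also gives a natural seam for sandboxing the LLM-generated tests later.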

🏗 Phase 3: API-First Architecture & State Management
[ ] Expose via API: Wrap the agent crew in a FastAPI application to make it accessible as a service.

[ ] Job State Management: Integrate Redis or a database to manage the state of long-running jobs, allowing for asynchronous operation.

[ ] Containerization: Create a Dockerfile and docker-compose.yml to ensure a consistent and reproducible environment for the entire application stack.
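The job-state idea could be prototyped with an in-memory store before bringing in Redis. A minimal sketch, with status names and class shape invented for illustration (in production a Redis hash keyed by job ID would replace the dict):

```python
import uuid
from enum import Enum

class JobStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"

class JobStore:
    """Minimal in-memory stand-in for Redis-backed job state."""

    def __init__(self):
        self._jobs: dict[str, dict] = {}

    def create(self, payload: dict) -> str:
        """Register a new job and return its ID."""
        job_id = str(uuid.uuid4())
        self._jobs[job_id] = {"status": JobStatus.PENDING,
                              "payload": payload, "result": None}
        return job_id

    def update(self, job_id: str, status: JobStatus, result=None):
        self._jobs[job_id]["status"] = status
        self._jobs[job_id]["result"] = result

    def get(self, job_id: str) -> dict:
        return self._jobs[job_id]
```

With this seam in place, a FastAPI endpoint can return the job ID immediately and let clients poll `get(job_id)` while the agent crew works in the background.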

✨ Future Vision
[ ] LLMOps & Observability: Integrate with tools like LangSmith to trace, debug, and evaluate the performance of the agent interactions.

[ ] IDE Integration: Develop a VSCode extension for a seamless developer experience right within the editor.

[ ] Multi-Language Support: Expand capabilities beyond Python to include other languages like JavaScript/TypeScript and Go.

[ ] Automated Code Refactoring: Empower the Dev Agent to suggest fixes in the source code itself, not just the tests.

You can find the repository with all the code for Phase 1 here:

👉 https://github.com/herchila/unittest-ai-agent

What do you think? What other developer chores do you wish you could automate with AI? Let me know in the comments below!

See you!
