I Built a Tool for AI Agent Testing



This content originally appeared on DEV Community and was authored by Luis Fernando Richter

Excited to share my latest open-source project: the AI Agent Tester!

As AI models become more integrated into our applications, how do we ensure their responses are consistent and reliable? Manually testing prompts is slow and doesn’t scale. That’s why I created this tool.

The AI Agent Tester automates the validation process. It reads prompts from a simple CSV file, sends them to an AI (like OpenAI’s GPT), and checks the responses for expected keywords.
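
To give you an idea of what that loop looks like, here is a simplified sketch in Python. The column names, file names, and model shown here are just illustrative placeholders; the real tool is more configurable:

```python
# Simplified illustration of the core loop: read prompts from a CSV,
# query the model, check for expected keywords, and write a JSON report.
import csv
import json

from openai import OpenAI  # pip install openai

client = OpenAI()  # uses the OPENAI_API_KEY environment variable

results = []
with open("prompts.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        completion = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": row["prompt"]}],
        )
        answer = completion.choices[0].message.content or ""

        # Plain substring check for brevity; the actual tool uses
        # stemmed matching (see the list below).
        keywords = [k.strip().lower() for k in row["expected_keywords"].split(";")]
        passed = all(k in answer.lower() for k in keywords)

        results.append({
            "prompt": row["prompt"],
            "status": "Success" if passed else "Fail",
            "response": answer,
        })

with open("report.json", "w", encoding="utf-8") as f:
    json.dump(results, f, indent=2, ensure_ascii=False)
```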

Here’s what makes it effective:

  • Intelligent Validation: It uses “stemming” with NLTK to recognize word variations (e.g., ‘fly’, ‘flying’, ‘flies’), making keyword matching more robust. There’s a rough sketch of this idea after the list.

  • Detailed Reports: It generates a JSON report with the status (Success/Fail) for each prompt, along with the AI’s full response.

  • Easy to Use: Built with Python and requires minimal setup. It even has automatic proxy support for corporate environments.
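
Here is a rough illustration of how the stem-based matching works with NLTK’s PorterStemmer. It’s a simplified version of the idea, not the exact code in the repository:

```python
# Stem-based keyword matching: 'fly', 'flying', and 'flies' all reduce to the
# same stem, so any of them satisfies the expected keyword 'fly'.
import re

from nltk.stem import PorterStemmer  # pip install nltk

stemmer = PorterStemmer()

def contains_keyword(response: str, keyword: str) -> bool:
    """Return True if any word in the response shares a stem with the keyword."""
    target = stemmer.stem(keyword.lower())
    words = re.findall(r"[a-zA-Z']+", response.lower())
    return any(stemmer.stem(word) == target for word in words)

print(contains_keyword("Bats are the only mammals capable of flying.", "fly"))  # True
print(contains_keyword("Penguins cannot get off the ground.", "fly"))           # False
```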

This project is for any developer or QA engineer working with Large Language Models who wants to add a layer of automated testing to their workflow.

It’s open-source, and I would love to get your feedback or contributions!

Check out the project on GitHub: https://github.com/lfrichter/ai-agent-test

