The SEO Alchemist: How I Built an AI Agent to Turn Raw Web Data into Content Gold



This content originally appeared on DEV Community and was authored by Inforeole Automatisations IA

This is a submission for the AI Agents Challenge powered by n8n and Bright Data

What I Built

I built an AI Content Strategist Agent that automates the entire process of SEO competitive analysis and content brief creation.

This agent solves a critical problem for content marketers and SEO specialists: the hours of manual research and guesswork involved in planning content that ranks. Instead of searching Google by hand, opening dozens of tabs, and trying to synthesize competitor strategies, this agent does it all in minutes. The user simply provides a target keyword, and the agent delivers a complete, data-driven strategic brief designed to outrank the competition.

Demo

Here is a video demonstrating the entire workflow in action: Watch the Demo on YouTube

n8n Workflow

GitHub Gist

Technical Implementation

The agent’s intelligence is orchestrated within n8n by combining specialized tools and carefully crafted instructions.

  • System Instructions: The agent uses a two-step prompting strategy.

    1. Analysis Prompt: The first core prompt instructs the LLM to act as a world-class SEO Content Strategist. It is fed the aggregated raw data from all 10 competitor pages and is tasked with performing a holistic analysis. Critically, its output is constrained to a structured JSON object containing the search intent, core topics, content gaps, and a suggested outline (a sketch of this schema follows the list below).
    2. Formatting Prompt: A second, simpler prompt takes the structured JSON from the first step and instructs the LLM to transform it into a clean, human-readable Markdown report. This separation of concerns ensures reliability and clean presentation.
  • Model Choice: The workflow uses openai/gpt-4o via OpenRouter for the main analysis task due to its strong reasoning capabilities and ability to follow complex JSON output instructions. A smaller, faster model is used for the intermediate step of summarizing each individual page within the loop.

  • Memory: A simple Window Buffer Memory node is used to retain the user’s initial keyword input throughout the execution of the workflow, making it available to all subsequent nodes.

  • Tools Used: The agent is built on a sequence of powerful n8n nodes:

    • Chat Trigger to initiate the conversation.
    • Bright Data node for all external data fetching.
    • Split in Batches node to loop through the 10 competitor URLs.
    • Code nodes for cleaning HTML and extracting specific data points such as word count (a sketch of one such node also follows this list).
    • Aggregate node to combine the analysis of all 10 pages into a single dataset.
    • LangChain LLM nodes to interface with the OpenRouter models.
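The post names the four fields the Analysis Prompt must return, but not their exact shapes. As a minimal sketch, the enforced JSON object could be described by a TypeScript interface like the following; the field names and intent categories are illustrative assumptions, not the author's actual schema.

```typescript
// Hypothetical shape of the JSON object the Analysis Prompt enforces.
// Only the four top-level categories come from the post; the rest is assumed.
interface ContentBrief {
  searchIntent: "informational" | "commercial" | "transactional" | "navigational";
  coreTopics: string[];        // themes every top-ranking page covers
  contentGaps: string[];       // angles the competitors miss
  suggestedOutline: {
    heading: string;           // proposed section heading (H2/H3)
    talkingPoints: string[];   // what the section should cover
  }[];
}
```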
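Likewise, the Code nodes themselves are not published in the post. Below is a minimal sketch of the per-page cleaning logic such a node might run, written as the body of an n8n Code node in "Run Once for All Items" mode (n8n Code nodes execute JavaScript; this TypeScript translates directly). The regex-based stripping and the `body`/`url` field names are assumptions.

```typescript
// Sketch of a Code node body: strip HTML down to plain text and count words.
// $input.all() is n8n's accessor for the items arriving at the node.
return $input.all().map((item) => {
  const html = (item.json.body as string) ?? ""; // assumed field holding raw HTML
  const text = html
    .replace(/<script[\s\S]*?<\/script>/gi, "") // drop inline scripts
    .replace(/<style[\s\S]*?<\/style>/gi, "")   // drop inline styles
    .replace(/<[^>]+>/g, " ")                   // strip remaining tags
    .replace(/\s+/g, " ")                       // collapse whitespace
    .trim();
  return {
    json: {
      url: item.json.url, // assumed field carrying the competitor URL
      text,
      wordCount: text.split(" ").filter(Boolean).length,
    },
  };
});
```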

Bright Data Verified Node

The Bright Data Verified Node is the cornerstone of the agent’s data-gathering capabilities and is used in two distinct modes:

  1. SERP API Mode: The first instance of the node is configured to use Bright Data’s SERP API. When the user provides a keyword, this node reliably fetches the top 10 organic Google search results, providing the clean URLs needed for the next step. This is crucial for getting the correct competitor list without being blocked by Google.

  2. Web Unblocker Mode: Inside the loop, a second Bright Data node is configured to use the Web Unblocker. For each of the 10 competitor URLs, this tool scrapes the entire page content. This is the most critical step, as the Web Unblocker handles IP rotation, CAPTCHAs, and other anti-bot measures automatically, ensuring the agent can retrieve the raw data it needs for its analysis with a near-perfect success rate. A rough sketch of both request modes follows below.
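The workflow configures these calls inside the Bright Data node, so the exact request details are not shown in the post. As a rough sketch of what the two modes do under the hood, here is a standalone TypeScript equivalent using axios. Everything Bright Data-specific is an assumption to verify against the current docs: the super-proxy endpoint and zone-credential format, the `brd_json=1` parsed-results flag, and the `organic`/`link` response shape; `CUSTOMER_ID`, the zone names, and the passwords are placeholders.

```typescript
import axios from "axios";

// Assumed Bright Data super-proxy endpoint and credential format.
const proxyFor = (zone: string, password: string) => ({
  protocol: "http",
  host: "brd.superproxy.io",
  port: 22225,
  auth: { username: `brd-customer-CUSTOMER_ID-zone-${zone}`, password },
});

// 1. SERP API mode: fetch the top 10 organic Google results for a keyword.
async function topResults(keyword: string): Promise<string[]> {
  const { data } = await axios.get(
    `https://www.google.com/search?q=${encodeURIComponent(keyword)}&brd_json=1`,
    { proxy: proxyFor("serp", "SERP_ZONE_PASSWORD") },
  );
  return data.organic.slice(0, 10).map((r: { link: string }) => r.link);
}

// 2. Web Unblocker mode: scrape one competitor page through the unblocker zone.
async function scrapePage(url: string): Promise<string> {
  const { data } = await axios.get(url, {
    proxy: proxyFor("unblocker", "UNBLOCKER_ZONE_PASSWORD"),
  });
  return data; // raw HTML for the downstream cleaning and analysis steps
}
```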

Journey

The idea for this agent came from my own professional needs. I was spending too much time on manual SEO research and wanted to build a “fire-and-forget” tool to handle it for me.

The biggest challenge was orchestrating the data flow. It was complex to manage a process that involved fetching a list of items, looping through each one to fetch more data, processing that data, and then aggregating it all for one final, holistic analysis. The combination of n8n’s Split in Batches and Aggregate nodes was the key to solving this structural problem.
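Stripped of the n8n specifics, the pattern that Split in Batches and Aggregate implement is simply: fetch a list, process each item, collect the results. Expressed as plain TypeScript, reusing the hypothetical `topResults` and `scrapePage` helpers sketched earlier, with `summarize` standing in for the smaller per-page model:

```typescript
// The fetch -> loop -> aggregate structure of the workflow.
declare function summarize(
  url: string,
  html: string,
): Promise<{ url: string; summary: string }>;

async function buildDataset(keyword: string) {
  const urls = await topResults(keyword);     // SERP API: the list of items
  const pages: Array<{ url: string; summary: string }> = []; // mirrors Aggregate
  for (const url of urls) {                   // Split in Batches: one URL at a time
    const html = await scrapePage(url);       // Web Unblocker: raw page content
    pages.push(await summarize(url, html));   // smaller model: per-page summary
  }
  return pages;                               // single dataset for the holistic analysis
}
```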

Another challenge was prompt engineering. Getting an LLM to consistently return a perfectly structured JSON object from a large, messy input of aggregated HTML content required a lot of trial and error. The breakthrough was enforcing a very strict output schema in the prompt instructions.
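The author's actual prompt is not included in the post, but the kind of strict-schema instruction described might look like the illustration below, paired with a fail-fast parse so the formatting step never receives malformed input (`ContentBrief` is the hypothetical interface sketched earlier).

```typescript
// Illustrative only; not the author's published prompt.
const schemaInstruction = `
Respond with ONLY a valid JSON object: no prose, no Markdown fences.
It must contain exactly these keys:
  "searchIntent":     string
  "coreTopics":       array of strings
  "contentGaps":      array of strings
  "suggestedOutline": array of { "heading": string, "talkingPoints": array of strings }
If a value is unknown, use an empty array; never omit a key.`;

// Reject schema drift immediately instead of passing it downstream.
function parseBrief(raw: string): ContentBrief {
  const brief = JSON.parse(raw) as ContentBrief;
  if (typeof brief.searchIntent !== "string" || !Array.isArray(brief.coreTopics)) {
    throw new Error("LLM output violated the brief schema");
  }
  return brief;
}
```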

Through this project, I learned how to build a truly useful, multi-step AI agent that goes beyond simple Q&A. It combines the power of reliable, structured data retrieval (thanks to Bright Data) with the analytical power of modern LLMs, all orchestrated seamlessly within the n8n environment.

