This is a submission for the AI Agents Challenge powered by n8n and Bright Data
What I Built
IdeaBot — a YouTube-driven Viral Topic Generator.
Give it any topic (e.g., “AI & Automation”) and it will:
1) find trending YouTube results;
2) pull transcripts and top comments;
3) analyze patterns;
4) generate suggested titles, short-form ideas, mini-scripts (3–5 lines), and social post drafts.
Problem it solves: creators waste time guessing what will resonate. IdeaBot grounds ideation in what audiences are already engaging with (comments) and what’s working now (recent videos), then turns that signal into publish-ready prompts and snippets.
Demo
Public Chat (n8n Chat Trigger): https://flow.wemakeflow.com/webhook/ac48b635-40d1-4c83-aa5a-fbf2cb5ba546/chat
n8n Workflow
Workflow JSON (Gist): https://gist.github.com/azadishahab/677424d5a84f570ebbf2fb83544119b6
Technical Implementation
Agent & Model Setup
Chat entrypoint: When chat message received.
Models: Google Gemini Chat Model nodes power the agents.
Google Gemini Chat Model1 → explicitly set to models/gemini-2.5-flash-lite (drives the URL-builder agent).
Google Gemini Chat Model → default Gemini chat model (drives parsing/summarization/repurposing agents).
Agents (system instructions):
AI Agent1 – SERP URL Builder. Generates a Google video search URL:
https://www.google.com/search?q=<query>&tbm=vid&gl=<country>
– <query> comes from the user prompt; <country> is a 2-letter country code (defaults to us if not specified); see the worked example after this list.
– Output contract: URL only (no extra text).
AI Agent2 – SERP Result Parser. Input: raw SERP payload. Task: extract YouTube video URLs and return them as an array.
AI Agent – Transcript Summarizer. Input: video metadata/transcripts. Task: summarize each transcript into key notes for downstream repurposing.
AI Agent3 – Content Repurposer. Input: transcript summaries + high-signal comments. Task: generate new, original ideas (publish-ready JSON in the final responders).
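For example, given the prompt "AI & Automation" with no country specified, AI Agent1 should return exactly (query encoding illustrative):
https://www.google.com/search?q=AI+%26+Automation&tbm=vid&gl=us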
Bright Data usage (nodes & flow)
Search (SERP):
Node: Access and extract data from a specific URL (Bright Data Verified).
serp_api1, country: us (default), url: {{$json.output}} (the URL built by AI Agent1).
Responds via Respond to Chat (“Done searching Google…”) to keep the chat user informed.
Video transcripts & metadata (YouTube – Video Posts dataset):
Node: Extract structured data from a single URL2 → dataset “Youtube – Videos posts” (dataset_id: e.g., gd_lk56epmy2i5g7lzu0k).
Input URLs: {{ $('Respond to Chat1').item.json.output.toJsonString() }} (the array of video URLs extracted earlier).
Sort by views (desc) then likes (desc) → Limit to 2 top videos.
Code node wraps those into { output: [{ url: … }] } for consistent downstream shape.
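A minimal sketch of that Code node ("Run Once for All Items" mode; the url field name is an assumption based on the dataset output):

```javascript
// n8n Code node: wrap the top videos into the { output: [{ url: … }] }
// envelope the downstream Bright Data node expects.
// Assumption: each incoming item exposes the video link at json.url.
const urls = $input.all().map((item) => ({ url: item.json.url }));
return [{ json: { output: urls } }];
```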
YouTube comments (Comment Collector dataset) with polling:
Node: Extract structured data from a single URL1 → dataset “Youtube – Comments” (dataset_id: e.g., gd_lk9q0ew71spt1mxywf).
Snapshot polling loop: Edit Fields1 (capture snapshot_id) → Download the snapshot content → If status == "running" → Wait 6s → loop back to Download until done.
Filter1: keep only comments with likes > 60 (noise reduction).
Aggregate: consolidate high-signal comment_text for analysis.
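Outside n8n, the same poll-then-filter logic looks roughly like this — a sketch assuming Bright Data's v3 snapshot endpoint, a BRIGHT_DATA_TOKEN environment variable, and the likes / comment_text field names from the comments dataset:

```javascript
const BASE = 'https://api.brightdata.com/datasets/v3';

// Mirrors the workflow's Download → If "running" → Wait 6s loop.
async function downloadSnapshot(snapshotId) {
  for (;;) {
    const res = await fetch(`${BASE}/snapshot/${snapshotId}?format=json`, {
      headers: { Authorization: `Bearer ${process.env.BRIGHT_DATA_TOKEN}` },
    });
    const body = await res.json();
    if (body && body.status === 'running') {
      await new Promise((r) => setTimeout(r, 6000)); // Wait 6s, then retry
      continue;
    }
    return body; // array of comment records once the snapshot is ready
  }
}

// Filter1 + Aggregate: keep likes > 60, collect comment text for analysis.
async function collectHighSignalComments(snapshotId) {
  const comments = await downloadSnapshot(snapshotId);
  return comments
    .filter((c) => (c.likes ?? 0) > 60)
    .map((c) => c.comment_text);
}
```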
Data shaping & analysis pipeline
SERP URL Builder (AI Agent1) → Bright Data SERP fetch → AI Agent2 extracts an array of YouTube URLs → Respond to Chat1 acknowledges URL collection.
Video Posts dataset (transcripts/metadata) → Sort → Limit (2) → Code packaging → Respond to Chat3 (status update) → Comments dataset (with polling) → Filter1 (likes>60) → Aggregate (comment texts).
Summarization branch: the Limit node also feeds AI Agent (Summarizer) to create concise transcript summaries.
Merge:
Aggregate1 collects summarizer outputs;
Aggregate (comments) joins them via Merge → Aggregate2 (aggregateAllItemData) into a single payload.
Content generation: AI Agent3 (Repurposer) transforms summaries + comments into the final JSON ideas package.
Final reply: Respond to Chat2 returns the JSON object to the user.
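By the time Respond to Chat2 fires, the Repurposer has consumed a single consolidated item shaped roughly like this (field names illustrative, derived from the aggregation steps above):

```json
{
  "summaries": ["key notes from video 1 …", "key notes from video 2 …"],
  "comments": ["high-signal comment text …", "…"]
}
```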
Prompting & contracts (highlights from node configs)
URL Builder (Agent1): strict instruction to output only the correctly-formed SERP URL with tbm=vid and default gl=us.
Parser (Agent2): extract YouTube URLs array from SERP results (no prose).
Summarizer: “Summarize the video transcription, keep all important notes … used for content repurpose.”
Repurposer (Agent3): “You are the Content Repurposer Agent… generate fresh, original content ideas based on video summaries + top comments.”
Final schema: returned by Respond to Chat as a JSON payload (titles, short-form ideas, mini-scripts, post drafts).
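An illustrative response payload (keys assumed from the contract above; the actual schema lives in the Repurposer prompt):

```json
{
  "titles": ["…"],
  "short_form_ideas": ["…"],
  "mini_scripts": [["hook line", "insight line", "call to action"]],
  "post_drafts": ["…"]
}
```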
Memory / conversation behavior
Workflow uses Chat Trigger (public) with responseNodes mode and several Respond to Chat status messages.
There is no dedicated memory node in this export; each run is effectively stateless (refinements re-enter the flow). You can add an n8n Chat Memory Manager / Window Buffer Memory later if you want multi-turn refinement without re-scraping.
Notable safeguards & heuristics
Comment quality gate: likes > 60 to boost signal.
Top-video cap: Limit 2 (fast, token-efficient).
Polling loop: waits for Bright Data comment snapshots to complete before analysis.
Code shaping: wraps arrays into { output: […] } so downstream Bright Data nodes accept uniform input.
Bright Data Verified Node
How it’s used end-to-end:
SERP (video) fetch
Node: Access and extract data from a specific URL
serp_api1; gl defaults to us; URL pattern https://www.google.com/search?q=<query>&tbm=vid&gl=<country> generated upstream.
Output is handed to AI Agent2 to extract YouTube links.
Video Post (transcripts & metadata)
Node: Extract structured data from a single URL
Dataset: e.g., gd_lk56epmy2i5g7lzu0k (“Youtube – Videos posts”)
Flow: Sort (views, likes) → Limit to the top 2–5 videos → Code to produce { output: [{ url: … }] } for downstream.
Comment Collector
Node: Extract structured data from a single URL
Dataset: e.g., gd_lk9q0ew71spt1mxywf (“Youtube – Comments”)
Snapshot poll: Edit Fields → Wait → Download snapshot content → If (status == "running") loop back → else continue.
Quality: Filter (e.g., likes > 60) → Aggregate to merge comment text for analysis.
This pairing (SERP → Video Posts → Comment Collector) delivers fresh, structured inputs while staying resilient to blocking, enabling reliable analysis and ideation.
Journey
Process:
Started from a clear target: ideas tied to real audience demand.
Built a prompt→URL Builder so users can stay free-form while the system enforces SERP correctness (tbm=vid, gl default).
Split data collection into videos (transcripts) and comments, then layered agents: summarize, pattern-find, repurpose, respond.
Challenges & Solutions:
SERP parsing reliability: Solved by chaining a Bright Data SERP fetch with an LLM Structured Output Parser to normalize video URLs.
Snapshot polling for comments: Implemented a Wait + If loop to poll until snapshot completion, then filtered by likes for signal.
Token/length limits: Summarizer truncates transcripts; comments are filtered before aggregation.
Keeping outputs actionable: A dedicated Repurposer prompt that forces new ideas to be inspired by (not copied from) summaries + comments, then formats to a strict JSON schema.
What I learned:
Enforcing tool contracts (I/O shapes per agent) makes multi-agent flows robust.
Bright Data’s datasets + polling patterns are a clean fit for n8n; pairing them with lightweight LLM parsing yields dependable, real-time pipelines.
A small amount of structure (sorting by views/likes, comment like-thresholds) dramatically improves idea quality and virality potential.