Viral-O-Meter for YouTube (Built with n8n + Bright Data)



This content originally appeared on DEV Community and was authored by Mohit Agnihotri

This is a submission for the AI Agents Challenge powered by n8n and Bright Data

What I Built

Viral-O-Meter for YouTube — an n8n + Bright Data agent that checks a keyword’s viral potential in real time.

It pulls top YouTube results for a keyword via Bright Data’s Verified Node, normalizes metrics (views/likes/length), computes robust stats (median & p75 views, like rate, Shorts share, median lengths), and asks an AI Agent to deliver a GO/NO_GO verdict, recommended format (Shorts vs Long-form), ideal length band, content angles, and hooks.

Who it helps: creators & marketers who want a quick, data-driven green‑light before spending time on a video.

Demo

n8n Workflow

  • Gist (workflow JSON): https://gist.github.com/mohitagnihotri/062a1cee13e30bcdeccba3aaa8895b61

  • Key nodes:

    • Bright Data (Verified Node): Data Collector (trigger collection & fetch snapshot), Web Unlocker when needed.
    • Code (Normalize): converts raw fields to numeric (views/likes), derives views_per_day, parses HH:MM:SS, classifies Shorts vs Long-form.
    • Code (Stats): computes medians/quantiles and seeds a suggested length band (±20% around top‑quartile median).
    • Agent (AI): consumes the enriched JSON and returns structured recommendations.
    • Report: emails the final analysis report via the Gmail node.

Note: I included two drop-in Code nodes: (1) the normalizer and (2) the stats pre-compute. Paste them into the workflow or pull them from the Gist; sketches of both appear under Technical Implementation below.

Technical Implementation

High-level flow

  1. Ingest with Bright Data
    • Use Data Collector to collect YouTube search/results pages by keyword.
    • Use Deliver Snapshot / Get Snapshot Content to fetch structured results.
    • If regional blocks occur, enable Web Unlocker in the Verified Node to route around them.
  2. Normalize (Code Node #1)
    • Parse numeric strings like “58,613”, “1.2M/1.2K” → integers.
    • Parse video_length (seconds or HH:MM:SS) → length_seconds & length_hms.
    • Derive views_per_day, compute like_rate_pct, detect post_type (a sketch of this node appears after this list).
  3. Stats (Code Node #2)
    • Compute median / p75 for views, median like rate, Shorts share %.
    • Compute median length (all) and median length among the top quartile by views.
    • Create a recommended_seed with an anchor length (top‑quartile median) and a ±20% band; include a format_hint based on top performers (sketched after this list).
  4. AI Agent
    • System Prompt: “You are a YouTube content viability analyst …” (uses benchmarks & recommended_seed if present; otherwise computes them from the dataset).
    • User Prompt: “Analyze the following dataset … { JSON.stringify($json, null, 2) }”
    • Output: deterministic JSON with verdict, confidence, recommended format/length, angles, hooks, and checklist (example shape after this list).
  5. Outputs
    • Send the report by email via the Gmail node.
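
For reference, a minimal sketch of the normalizer (Code Node #1). It assumes the snapshot items carry the raw fields named in the schema highlights below (views, likes, video_length, date_posted, post_type); the drop-in version in the Gist is the one to paste.

```javascript
// Code Node #1: Normalize (run once for all items)
// Assumes snapshot items carry raw fields: views, likes, video_length, date_posted, post_type.

function parseCount(raw) {
  // "58,613" -> 58613, "1.2M" -> 1200000, "1.2K" -> 1200
  if (raw == null) return null;
  const m = String(raw).trim().replace(/,/g, '').match(/^([\d.]+)\s*([KMB])?/i);
  if (!m) return null;
  const mult = { K: 1e3, M: 1e6, B: 1e9 }[(m[2] || '').toUpperCase()] || 1;
  return Math.round(parseFloat(m[1]) * mult);
}

function parseLength(raw) {
  // Accepts plain seconds ("95") or HH:MM:SS / MM:SS ("1:02:03", "10:45")
  if (raw == null) return null;
  const s = String(raw).trim();
  if (/^\d+$/.test(s)) return parseInt(s, 10);
  return s.split(':').map(Number).reduce((acc, n) => acc * 60 + n, 0);
}

const toHms = (sec) =>
  [Math.floor(sec / 3600), Math.floor((sec % 3600) / 60), sec % 60]
    .map((n) => String(n).padStart(2, '0'))
    .join(':');

return $input.all().map(({ json }) => {
  const views = parseCount(json.views);
  const likes = parseCount(json.likes);
  const length_seconds = parseLength(json.video_length);
  const ageDays = json.date_posted
    ? Math.max(1, (Date.now() - new Date(json.date_posted)) / 86400000)
    : null;
  return {
    json: {
      ...json,
      views,
      likes,
      length_seconds,
      length_hms: length_seconds != null ? toHms(length_seconds) : null,
      views_per_day: views != null && ageDays ? Math.round(views / ageDays) : null,
      like_rate_pct: views && likes != null ? +(((likes / views) * 100).toFixed(2)) : null,
      // Prefer the explicit post_type; otherwise fall back to the <60s heuristic
      post_type: json.post_type || (length_seconds != null && length_seconds < 60 ? 'short' : 'long_form'),
    },
  };
});
```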
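
The stats pre-compute (Code Node #2) in the same spirit; the linear-interpolation quantile is my implementation choice, not something the workflow prescribes:

```javascript
// Code Node #2: Stats pre-compute (run once for all items)
const rows = $input.all().map((i) => i.json).filter((r) => r.views != null);
if (rows.length === 0) return [{ json: { error: 'no usable rows' } }];

// Quantile with linear interpolation between the two nearest ranks
const quantile = (arr, q) => {
  const s = [...arr].sort((a, b) => a - b);
  if (s.length === 0) return null;
  const pos = (s.length - 1) * q;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return s[lo] + (s[hi] - s[lo]) * (pos - lo);
};
const median = (arr) => quantile(arr, 0.5);
const lengths = (rs) => rs.map((r) => r.length_seconds).filter((v) => v != null);

const views = rows.map((r) => r.views);
const p75Views = quantile(views, 0.75);
const topQuartile = rows.filter((r) => r.views >= p75Views);

const benchmarks = {
  sample_size: rows.length,
  median_views: median(views),
  p75_views: p75Views,
  median_like_rate_pct: median(rows.map((r) => r.like_rate_pct).filter((v) => v != null)),
  shorts_share_pct: +(((rows.filter((r) => r.post_type === 'short').length / rows.length) * 100).toFixed(1)),
  median_length_seconds_all: median(lengths(rows)),
};

// Seed the recommendation: anchor on the top quartile's median length, ±20%
const anchor = median(lengths(topQuartile));
const recommended_seed = anchor == null ? null : {
  anchor_length_seconds: Math.round(anchor),
  band_seconds: [Math.round(anchor * 0.8), Math.round(anchor * 1.2)],
  format_hint: topQuartile.filter((r) => r.post_type === 'short').length > topQuartile.length / 2
    ? 'shorts'
    : 'long_form',
};

return [{ json: { benchmarks, recommended_seed, videos: rows } }];
```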
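
And the shape the Agent is steered toward. The verdict, confidence, recommended format/length, angles, hooks, and checklist fields come from the workflow's contract; the exact field names and values here are illustrative:

```json
{
  "verdict": "GO",
  "confidence": 0.78,
  "recommended_format": "long_form",
  "recommended_length": { "seconds": [480, 720], "mm_ss": ["08:00", "12:00"] },
  "angles": ["Beginner walkthrough with a live demo", "Common mistakes and how to avoid them"],
  "hooks": ["I analyzed the top-ranking videos on this topic so you don't have to"],
  "checklist": ["Hook in the first 5 seconds", "Add chapter markers", "End-screen CTA"]
}
```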

Design choices

  • No external assumptions: the Agent reasons only over the provided dataset.
  • Robustness: use medians and quantiles to resist outliers.
  • Actionability: force the Agent to output clear ranges (seconds + mm:ss), angles, hooks, and checklist.

Bright Data Verified Node

  • Endpoints used: Data Collector (trigger & deliver snapshots), Web Unlocker when geo or bot blocks are detected.
  • Why Bright Data? Verified, production-ready node in n8n; handles real-time collection and anti-bot hurdles without brittle DIY scraping.
  • Schema highlights: title, url, views, likes, video_length, date_posted, channel, post_type, plus derived fields (views_per_day, like_rate_pct, length_seconds).
  • Cost control: run snapshots on-demand and cap concurrent collections; cache results daily to reduce duplicate fetches (one possible approach sketched below).
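
One way to implement that daily cache (my sketch, not necessarily how the Gist does it) is n8n workflow static data keyed by keyword and date:

```javascript
// Daily-cache sketch using n8n workflow static data (illustrative, not from the Gist).
// Note: static data only persists for active workflows executed in production.
const cache = $getWorkflowStaticData('global');
const keyword = $json.keyword; // assumes the trigger supplies the keyword
const cacheKey = `${keyword}:${new Date().toISOString().slice(0, 10)}`; // one entry per keyword per day

if (cache[cacheKey]) {
  // Cache hit: reuse today's snapshot and skip the Bright Data collection
  return [{ json: { cached: true, ...cache[cacheKey] } }];
}

// Cache miss: let downstream nodes trigger the collection; a later Code node
// should write the fetched snapshot back into cache[cacheKey].
return [{ json: { cached: false, cache_key: cacheKey } }];
```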

Journey

What worked

  • Verified Node made collection straightforward; Deliver Snapshot gave predictable JSON.
  • Separating normalize and stats steps simplified prompt design and let the Agent stay lightweight.

Challenges

  • Mixed units & locales: views arrive as “K/M” suffixes and comma-separated numbers; a single parser handles both.
  • Shorts vs Long-form detection: use the explicit post_type when present; fall back to the <60s heuristic.
  • Recency bias: compute views_per_day to compare older and newer videos fairly.

What I learned

  • Scraping without a tool like Bright Data is impractical: IPs get blocked quickly and requests start failing with 403 errors.
  • Framing the Agent’s output as strict JSON drives consistent downstream automation.
  • Anchoring recommended length to the top‑quartile median yields practical targets that mirror what performs best.

