Build a content research agent

The slowest part of writing a well-researched article is the research: hunting down recent news, finding authoritative sources, reading through each one, and pulling out the facts, stats, and angles worth covering. This guide builds a Python agent that does it for you. Give it a topic and it chains /news (recent coverage), /search (authoritative sources), /scrape (full article text), and /research (synthesized background), then hands everything to an LLM that writes a structured research brief — executive summary, key trends, statistics, perspectives, a suggested outline, and a citation list. It's a writing assistant that turns a one-line topic into a fact-grounded starting point.

1. What you'll build

A Python agent that takes a topic and produces a structured research brief containing:

Recent news and developments from the last 30 days (via /news)
Key authoritative sources and articles (via /search)
Full-text excerpts from the top sources (via /scrape)
Synthesized background from /research
An LLM-generated outline with key points, stats, angles, and citations

2. Setup

pip install openai requests python-dotenv

Create a .env file with your two keys:

SUPERHIGHWAY_API_KEY=your_key_here
OPENAI_API_KEY=your_key_here
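
A missing key usually shows up later as a confusing 401, so it can help to check for both up front. The sketch below is one way to do that with the standard library only; `REQUIRED_KEYS` and `missing_keys` are illustrative names, not part of the guide's code. It assumes the `.env` values have already been loaded into the environment, for example by calling `load_dotenv()` from python-dotenv.

```python
import os
from typing import Mapping, Optional

# Names of the two keys this guide stores in .env
REQUIRED_KEYS = ("SUPERHIGHWAY_API_KEY", "OPENAI_API_KEY")

def missing_keys(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required key names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not (env.get(name) or "").strip()]
```

Calling `missing_keys()` at startup and exiting with a clear message when the list is non-empty keeps the failure close to its cause.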

3. Gather recent news on the topic

Start with /news to surface what's happened recently — the developments that make an article timely. Each article comes back with a title, source, description, and publish date.

import os
from datetime import datetime

import requests
from dotenv import load_dotenv

load_dotenv()  # load both keys from the .env file created in step 2

SUPERHIGHWAY_KEY = os.getenv("SUPERHIGHWAY_API_KEY")
BASE = "https://superhighway.walls.sh"

def get_recent_news(topic: str, count: int = 8) -> list[dict]:
    """Get recent news articles on the topic."""
    r = requests.get(
        f"{BASE}/news",
        params={"q": topic, "count": count},
        headers={"Authorization": f"Bearer {SUPERHIGHWAY_KEY}"},
        timeout=30,
    )
    r.raise_for_status()  # surface auth and quota errors instead of parsing an error body
    return r.json().get("articles", [])
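
The brief later labels this material as news from the last 30 days, but the call above passes no date range. If the endpoint does not already restrict results by date, a client-side filter is one option. The sketch below assumes each article carries an ISO 8601 timestamp in a field named `published_at`; that field name is a guess, so check an actual response and adjust `date_field`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def filter_recent(
    articles: list[dict],
    days: int = 30,
    now: Optional[datetime] = None,
    date_field: str = "published_at",  # hypothetical field name
) -> list[dict]:
    """Keep articles published within the last `days` days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    recent = []
    for article in articles:
        raw = article.get(date_field)
        if not raw:
            continue  # no date: drop rather than guess
        try:
            # fromisoformat() on older Pythons rejects a trailing "Z"
            published = datetime.fromisoformat(str(raw).replace("Z", "+00:00"))
        except ValueError:
            continue  # unparseable date
        if published.tzinfo is None:
            published = published.replace(tzinfo=timezone.utc)  # treat naive as UTC
        if published >= cutoff:
            recent.append(article)
    return recent
```

Articles with a missing or unparseable date are dropped here, which favors a trustworthy "recent" label over coverage; keeping them is an equally defensible choice.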

4. Find authoritative sources with web search

News covers what's new; /search finds the deeper, evergreen sources — studies, guides, and explainers. We run a couple of angled queries and deduplicate by URL so the brief draws on a range of sources.

def search_sources(topic: str, limit: int = 8) -> list[dict]:
    """Find authoritative web sources on the topic."""
    queries = [
        f"{topic} research study analysis",
        f"{topic} guide tutorial explainer",
    ]
    seen = set()
    sources = []
    for q in queries:
        r = requests.get(
            f"{BASE}/search",
            params={"q": q, "limit": limit // 2},
            headers={"Authorization": f"Bearer {SUPERHIGHWAY_KEY}"},
            timeout=30,
        )
        r.raise_for_status()
        for result in r.json().get("results", []):
            url = result.get("url")  # skip results that come back without a URL
            if url and url not in seen:
                seen.add(url)
                sources.append(result)
    return sources
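
Exact-string deduplication treats the same page as two sources when the URLs differ only by a trailing slash, a fragment, or tracking parameters. A possible refinement is to compare a normalized form instead. The sketch below uses only `urllib.parse`; `normalize_url` and `dedupe_by_url` are illustrative helpers, and the list of tracking parameters is a small assumed sample rather than a complete one.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_PREFIXES = ("utm_",)          # assumed sample of tracking prefixes
TRACKING_PARAMS = {"fbclid", "gclid"}  # assumed sample of tracking parameters

def normalize_url(url: str) -> str:
    """Lowercase scheme and host, drop fragment, tracking params, trailing slash."""
    parts = urlsplit(url.strip())
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
        and key.lower() not in TRACKING_PARAMS
    ]
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )

def dedupe_by_url(results: list[dict]) -> list[dict]:
    """Keep the first result for each normalized URL, preserving order."""
    seen = set()
    unique = []
    for result in results:
        url = result.get("url")
        if not url:
            continue
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(result)
    return unique
```

The original URL is kept on each result, so the scrape step and the citation list still use the address the search returned.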

5. Scrape full content from the top sources

Search snippets are too thin to write from. /scrape returns each source as clean, LLM-ready Markdown — no nav, ads, or cookie banners. We pull the top few and truncate each so the prompt stays small.

def scrape_sources(sources: list[dict], max_sources: int = 4) -> list[dict]:
    """Scrape full content from top sources."""
    enriched = []
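    for source in sources[:max_sources]:
        try:
            r = requests.get(
                f"{BASE}/scrape",
                params={"url": source["url"]},
                headers={"Authorization": f"Bearer {SUPERHIGHWAY_KEY}"},
                timeout=60,
            )
            r.raise_for_status()
            data = r.json()
        except (requests.RequestException, ValueError):
            continue  # skip a source that fails to scrape rather than abort the run
        enriched.append({
            "title": data.get("title", source.get("title", "")),
            "url": source["url"],
            "content": (data.get("markdown") or "")[:2500],  # cap prompt size
        })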
    return enriched
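
A fixed character slice can cut a source off mid-sentence or mid-number, which is awkward when the LLM is asked to extract statistics. One alternative is to cut at the last paragraph or sentence break inside the limit. This is a minimal sketch; `truncate_at_boundary` is an illustrative helper, not something /scrape provides.

```python
def truncate_at_boundary(text: str, limit: int = 2500) -> str:
    """Cut text to at most `limit` characters, preferring a natural break."""
    if len(text) <= limit:
        return text
    window = text[:limit]
    # Try a paragraph break first, then a sentence end, then any space.
    for separator in ("\n\n", ". ", " "):
        cut = window.rfind(separator)
        # Ignore breaks in the first half so we do not discard most of the text.
        if cut >= limit // 2:
            keep = cut + 1 if separator == ". " else cut  # keep the full stop
            return window[:keep].rstrip()
    return window  # no usable break: fall back to the hard cut
```

In `scrape_sources`, this would replace the `[:2500]` slice on the scraped Markdown.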

6. Get synthesized background with /research

One /research call does a multi-source sweep and returns a synthesized summary with citations — perfect for grounding the brief's background section without scraping everything yourself.

def get_research_synthesis(topic: str) -> str:
    """Get a multi-source synthesis of the topic."""
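    r = requests.get(
        f"{BASE}/research",
        params={"q": topic, "pages": 6},
        headers={"Authorization": f"Bearer {SUPERHIGHWAY_KEY}"},
        timeout=120,  # a multi-source sweep can take a while
    )
    r.raise_for_status()
    data = r.json()
    # Prefer the synthesis; fall back to raw Markdown if it is missing or empty
    return (data.get("synthesis") or data.get("markdown") or "")[:4000]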
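
All four endpoint calls are network requests, and any of them can fail transiently. A small retry wrapper with exponential backoff is one way to absorb that. The sketch below is generic and standard-library only; `with_retries` is an illustrative name, and catching every `Exception` is a simplification (with requests you would likely narrow it to `requests.RequestException`).

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_retries(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on failure with delays of base_delay * 2**attempt."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: let the last error propagate
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Usage would look like `research = with_retries(lambda: get_research_synthesis(topic))`. The `sleep` parameter exists so the wrapper can be tested without waiting.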

7. Generate the structured brief with an LLM

Now hand the news, scraped sources, and synthesized background to the LLM with a prompt that asks for a fixed set of sections. The structure keeps the output consistent across topics and directly usable by a writer.

from openai import OpenAI

llm = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def generate_brief(
    topic: str,
    news: list[dict],
    sources: list[dict],
    research: str
) -> str:
    """Generate a structured research brief."""
    news_text = "\n".join(
        f"- {a.get('title', '')} ({a.get('source', '')}): {a.get('description', '')}"
        for a in news[:6]
    )
    sources_text = "\n\n".join(
        f"### {s['title']}\nURL: {s['url']}\n{s['content'][:800]}"
        for s in sources[:3]
    )

    response = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": "You are an expert research assistant. Create structured content research briefs that help writers produce well-researched articles. Be concise and factual."
            },
            {
                "role": "user",
                "content": f"""Create a research brief for an article about: "{topic}"

## Recent News (last 30 days)
{news_text}

## Key Sources
{sources_text}

## Background Research
{research}

Generate a research brief with these sections:
1. **Executive Summary** (2-3 sentences on the current state of this topic)
2. **Key Trends** (3-5 bullet points on what's new or changing)
3. **Important Statistics & Data** (any numbers, percentages, or facts from the sources)
4. **Key Perspectives** (different viewpoints or angles on the topic)
5. **Article Outline** (suggested structure for a 1500-word article)
6. **Key Sources to Cite** (list the most authoritative URLs with brief descriptions)
7. **Questions to Explore** (3-5 open questions the article could answer)
"""
            }
        ]
    )
    return response.choices[0].message.content or ""  # content can be None
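
The prompt asks for seven named sections, but nothing guarantees the model returns all of them. A cheap check on the returned text can flag an incomplete brief before it reaches a writer. The sketch below does a case-insensitive search for each section title from the prompt; `missing_sections` is an illustrative helper, and a substring match is a rough heuristic rather than a real parser.

```python
# Section titles copied from the prompt in generate_brief
EXPECTED_SECTIONS = (
    "Executive Summary",
    "Key Trends",
    "Important Statistics & Data",
    "Key Perspectives",
    "Article Outline",
    "Key Sources to Cite",
    "Questions to Explore",
)

def missing_sections(brief: str) -> list[str]:
    """Return the expected section titles that do not appear in the brief."""
    lowered = brief.lower()
    return [name for name in EXPECTED_SECTIONS if name.lower() not in lowered]
```

If the list comes back non-empty, the pipeline could log a warning or call `generate_brief` once more.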

8. Wire up the full pipeline

The orchestrator runs each step in order and optionally saves a Markdown file with the brief plus a raw source list for editorial transparency.

def research_topic(topic: str, output_file: str | None = None) -> str:
    """Run the full content research pipeline."""
    print(f"Researching: {topic}")

    # Step 1: Recent news
    print("  Fetching recent news...")
    news = get_recent_news(topic)
    print(f"  Found {len(news)} news articles")

    # Step 2: Web sources
    print("  Searching for authoritative sources...")
    sources = search_sources(topic)

    # Step 3: Scrape top sources
    print("  Scraping top sources...")
    enriched = scrape_sources(sources)

    # Step 4: Background synthesis
    print("  Synthesizing background research...")
    research = get_research_synthesis(topic)

    # Step 5: Generate brief
    print("  Generating research brief...")
    brief = generate_brief(topic, news, enriched, research)

    # Save to file if requested
    if output_file:
        with open(output_file, "w", encoding="utf-8") as f:
            f.write(f"# Research Brief: {topic}\n\n")
            f.write(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M')}\n\n")
            f.write(brief)
            f.write("\n\n---\n\n## Raw Sources\n\n")
            # Link only entries that carry a URL, so a missing field cannot crash the save
            for item in news[:5] + sources[:5]:
                if item.get("url"):
                    f.write(f"- [{item.get('title', item['url'])}]({item['url']})\n")
        print(f"  Saved to {output_file}")

    return brief

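if __name__ == "__main__":
    import sys
    topic = " ".join(sys.argv[1:]) if len(sys.argv) > 1 else "AI agent frameworks 2025"
    # Keep only letters and digits in the filename so topics containing "/" or ":" still save
    slug = "".join(c if c.isalnum() else "_" for c in topic[:30])
    brief = research_topic(topic, output_file=f"research_{slug}.md")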
    print("\n" + "="*60)
    print(brief)

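9. Run it

Save the code from steps 3 through 8 in one file named content_research.py, then pass a topic on the command line:

# Research a tech topic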
python content_research.py "LLM agent memory systems"

# Research a business topic
python content_research.py "B2B SaaS pricing models 2025"

# Research a news event
python content_research.py "open source AI models recent developments"

Each run drops a research_*.md file with the brief and a linked source list — ready to hand to a writer or paste into your CMS as a starting draft.

10. Extending the agent

Scheduled research digest — run it on a cron schedule to generate weekly briefs on ongoing beats; combine with the news briefing agent for daily updates.
Community angles — add a /search step with site:reddit.com {topic} or site:news.ycombinator.com {topic} to surface discussion angles and contrarian takes.
Fact-checking pass — after the brief is generated, use /search to verify specific claims against primary sources before publishing.
Multi-angle research — generate separate briefs for different audiences (beginner vs. expert, pro vs. con) and merge them.
Citation tracking — store source URLs and retrieval timestamps in a database for editorial transparency.
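
As one illustration of the citation-tracking idea, the sketch below stores each source URL with a retrieval timestamp in SQLite using only the standard library. The table layout and the `record_citations` name are assumptions made for this example; any database would do.

```python
import sqlite3
from datetime import datetime, timezone

def record_citations(db_path: str, topic: str, urls: list[str]) -> int:
    """Store (topic, url, retrieved_at) rows; return how many were new."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                """CREATE TABLE IF NOT EXISTS citations (
                       topic TEXT NOT NULL,
                       url TEXT NOT NULL,
                       retrieved_at TEXT NOT NULL,
                       UNIQUE (topic, url)
                   )"""
            )
            count_sql = "SELECT COUNT(*) FROM citations"
            before = conn.execute(count_sql).fetchone()[0]
            retrieved_at = datetime.now(timezone.utc).isoformat()
            # INSERT OR IGNORE skips URLs already recorded for this topic
            conn.executemany(
                "INSERT OR IGNORE INTO citations (topic, url, retrieved_at) "
                "VALUES (?, ?, ?)",
                [(topic, url, retrieved_at) for url in urls],
            )
            after = conn.execute(count_sql).fetchone()[0]
        return after - before
    finally:
        conn.close()
```

Calling it at the end of `research_topic` with the URLs from `news` and `sources` would give each brief an auditable record of what was retrieved and when.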

11. Getting your API key

Grab a free Superhighway key at /pricing (1,000 calls/month, no credit card). For an agent that provisions its own access, skip the key entirely with x402: it pays $0.002 per call in USDC on Base — no signup, no key management. See the x402 pay-per-call guide for the wallet setup.

From here, the search-and-read guide goes deeper on combining search with scraping, and the news briefing guide shows how to wire /news into a recurring digest.