Build a content research agent
The slowest part of writing a well-researched article is the research: hunting down recent news, finding authoritative sources, reading through each one, and pulling out the facts, stats, and angles worth covering. This guide builds a Python agent that does it for you. Give it a topic and it chains /news (recent coverage), /search (authoritative sources), /scrape (full article text), and /research (synthesized background), then hands everything to an LLM that writes a structured research brief — executive summary, key trends, statistics, perspectives, a suggested outline, and a citation list. It's a writing assistant that turns a one-line topic into a fact-grounded starting point.
1. What you'll build
A Python agent that takes a topic and produces a structured research brief containing:
- Recent news and developments from the last 30 days (via /news)
- Key authoritative sources and articles (via /search)
- Full-text excerpts from the top sources (via /scrape)
- Synthesized background from /research
- An LLM-generated outline with key points, stats, angles, and citations
2. Setup
pip install openai requests python-dotenv
Create a .env file with your two keys:
SUPERHIGHWAY_API_KEY=your_key_here
OPENAI_API_KEY=your_key_here
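A missing key is easier to diagnose up front than from a failed request halfway through a run. Here is a minimal sketch of a fail-fast check, meant to run once the environment is loaded (for example with python-dotenv's load_dotenv()). The helper names missing_keys and require_env are hypothetical and not part of either SDK; they only check that the two variables above are set and non-blank.

```python
import os

# The two variables this guide expects in .env
REQUIRED_KEYS = ("SUPERHIGHWAY_API_KEY", "OPENAI_API_KEY")

def missing_keys(names=REQUIRED_KEYS, env=None) -> list:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in names if not (env.get(name) or "").strip()]

def require_env(names=REQUIRED_KEYS, env=None) -> None:
    """Raise a readable error now instead of failing later inside an API call."""
    missing = missing_keys(names, env)
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
```

Calling require_env() at the top of the script stops the run before any API call is made.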
3. Gather recent news on the topic
Start with /news to surface what's happened recently — the developments that make an article timely. Each article comes back with a title, source, description, and publish date.
import os
from datetime import datetime

import requests
from dotenv import load_dotenv

load_dotenv()  # read SUPERHIGHWAY_API_KEY and OPENAI_API_KEY from .env

SUPERHIGHWAY_KEY = os.getenv("SUPERHIGHWAY_API_KEY")
BASE = "https://superhighway.walls.sh"

def get_recent_news(topic: str, count: int = 8) -> list[dict]:
    """Get recent news articles on the topic."""
    r = requests.get(
        f"{BASE}/news",
        params={"q": topic, "count": count},
        headers={"Authorization": f"Bearer {SUPERHIGHWAY_KEY}"},
        timeout=30,
    )
    r.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return r.json().get("articles", [])
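The brief labels this section "last 30 days", but the request above only sends q and count. If you want to enforce the window yourself, a client-side filter is one option. The sketch below assumes each article carries an ISO 8601 timestamp in a field called published_at; that field name is a guess, so check it against a real /news response. filter_recent and parse_timestamp are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def parse_timestamp(value) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp; return None if it cannot be parsed."""
    try:
        # Older Python versions reject a trailing "Z", so normalize it first
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except (AttributeError, ValueError):
        return None
    # Treat naive timestamps as UTC so comparisons never mix naive and aware
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)

def filter_recent(articles, days=30, date_field="published_at", now=None):
    """Keep articles published within the last `days` days.

    `date_field` is an assumed field name, not a documented one.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    recent = []
    for article in articles:
        published = parse_timestamp(article.get(date_field))
        if published is not None and published >= cutoff:
            recent.append(article)
    return recent
```

Articles with a missing or unparseable date are dropped here. If you would rather keep them, change the None check.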
4. Find authoritative sources with web search
News covers what's new; /search finds the deeper, evergreen sources — studies, guides, and explainers. We run a couple of angled queries and deduplicate by URL so the brief draws on a range of sources.
def search_sources(topic: str, limit: int = 8) -> list[dict]:
    """Find authoritative web sources on the topic."""
    queries = [
        f"{topic} research study analysis",
        f"{topic} guide tutorial explainer",
    ]
    seen = set()
    sources = []
    for q in queries:
        r = requests.get(
            f"{BASE}/search",
            params={"q": q, "limit": limit // 2},
            headers={"Authorization": f"Bearer {SUPERHIGHWAY_KEY}"},
            timeout=30,
        )
        r.raise_for_status()
        for result in r.json().get("results", []):
            url = result.get("url")
            # Skip results without a URL and ones already collected
            if url and url not in seen:
                seen.add(url)
                sources.append(result)
    return sources
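Exact-match deduplication misses the same page reached through slightly different URLs (a trailing slash, a tracking parameter, a fragment). A minimal sketch of a stricter pass follows, using only the standard library. The list of tracking parameters is an illustrative assumption, and normalize_url and dedupe_by_url are hypothetical helper names.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative, not exhaustive: parameters that rarely change page content
TRACKING_PREFIXES = ("utm_",)
TRACKING_PARAMS = {"fbclid", "gclid"}

def normalize_url(url: str) -> str:
    """Lowercase scheme and host, drop fragment, tracking params, trailing slash."""
    parts = urlsplit(url.strip())
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
        and key.lower() not in TRACKING_PARAMS
    ]
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )

def dedupe_by_url(results):
    """Keep the first result for each normalized URL; skip results with no URL."""
    seen = set()
    unique = []
    for result in results:
        url = result.get("url")
        if not url:
            continue
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(result)
    return unique
```

Each kept result still carries its original URL, so the value later passed to /scrape is unchanged. Only the comparison key is normalized.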
5. Scrape full content from the top sources
Search snippets are too thin to write from. /scrape returns each source as clean, LLM-ready Markdown — no nav, ads, or cookie banners. We pull the top few and truncate each so the prompt stays small.
def scrape_sources(sources: list[dict], max_sources: int = 4) -> list[dict]:
    """Scrape full content from top sources."""
    enriched = []
    for source in sources[:max_sources]:
        try:
            r = requests.get(
                f"{BASE}/scrape",
                params={"url": source["url"]},
                headers={"Authorization": f"Bearer {SUPERHIGHWAY_KEY}"},
                timeout=60,
            )
            r.raise_for_status()
            data = r.json()
        except (requests.RequestException, ValueError):
            continue  # one unreachable page should not sink the whole run
        enriched.append({
            "title": data.get("title") or source.get("title", ""),
            "url": source["url"],
            # Truncate so the prompt stays small
            "content": (data.get("markdown") or "")[:2500],
        })
    return enriched
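A hard slice at 2,500 characters can cut a sentence, or a statistic, in half. One alternative is to back up to the nearest paragraph or sentence boundary. The helper below is a sketch of that idea; truncate_markdown is a hypothetical name, and the "keep at least half the budget" rule is an arbitrary choice you may want to tune.

```python
def truncate_markdown(text: str, max_chars: int = 2500) -> str:
    """Trim text to max_chars, preferring to cut at a paragraph or sentence end."""
    if len(text) <= max_chars:
        return text
    window = text[:max_chars]
    # Try progressively weaker boundaries. Only accept a cut that keeps at
    # least half the budget, so most of the excerpt is never thrown away.
    for boundary in ("\n\n", ". ", " "):
        cut = window.rfind(boundary)
        if cut >= max_chars // 2:
            # For a sentence boundary, keep the period itself
            end = cut + 1 if boundary == ". " else cut
            return window[:end].rstrip()
    # No usable boundary found: fall back to the plain slice
    return window
```

To use it, swap the slice in scrape_sources for truncate_markdown(data.get("markdown", "")).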
6. Get synthesized background with /research
One /research call does a multi-source sweep and returns a synthesized summary with citations — perfect for grounding the brief's background section without scraping everything yourself.
def get_research_synthesis(topic: str) -> str:
    """Get a multi-source synthesis of the topic."""
    r = requests.get(
        f"{BASE}/research",
        params={"q": topic, "pages": 6},
        headers={"Authorization": f"Bearer {SUPERHIGHWAY_KEY}"},
        timeout=120,
    )
    r.raise_for_status()
    data = r.json()
    # Prefer the synthesis field; fall back to raw markdown if it is absent or empty
    text = data.get("synthesis") or data.get("markdown") or ""
    return text[:4000]
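Because a multi-source sweep does more work per call than a single search, it is plausibly the step most likely to hit a transient failure. A small retry wrapper with exponential backoff is one way to absorb that. This is a generic sketch, not something the API requires: with_retries is a hypothetical helper, and the attempt count and delays are placeholder values.

```python
import time

def with_retries(call, attempts=3, base_delay=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Run call(); on a listed exception wait base_delay * 2**n, then try again."""
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the last error
            sleep(base_delay * 2 ** attempt)
```

For example, with_retries(lambda: get_research_synthesis(topic), retry_on=(requests.RequestException,)) retries only on network-level errors and lets everything else propagate immediately.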
7. Generate the structured brief with an LLM
Now hand the news, scraped sources, and synthesized background to the LLM with a prompt that asks for a fixed set of sections. The structure keeps the output consistent across topics and directly usable by a writer.
from openai import OpenAI

llm = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def generate_brief(
    topic: str,
    news: list[dict],
    sources: list[dict],
    research: str,
) -> str:
    """Generate a structured research brief."""
    # One line per news item: title, outlet, and short description
    news_text = "\n".join(
        f"- {a.get('title', '')} ({a.get('source', '')}): {a.get('description', '')}"
        for a in news[:6]
    )
    # Top three scraped sources, each trimmed to an 800-character excerpt
    sources_text = "\n\n".join(
        f"### {s['title']}\nURL: {s['url']}\n{s['content'][:800]}"
        for s in sources[:3]
    )
    response = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": "You are an expert research assistant. Create structured content research briefs that help writers produce well-researched articles. Be concise and factual.",
            },
            {
                "role": "user",
                "content": f"""Create a research brief for an article about: "{topic}"

## Recent News (last 30 days)
{news_text}

## Key Sources
{sources_text}

## Background Research
{research}

Generate a research brief with these sections:
1. **Executive Summary** (2-3 sentences on the current state of this topic)
2. **Key Trends** (3-5 bullet points on what's new or changing)
3. **Important Statistics & Data** (any numbers, percentages, or facts from the sources)
4. **Key Perspectives** (different viewpoints or angles on the topic)
5. **Article Outline** (suggested structure for a 1500-word article)
6. **Key Sources to Cite** (list the most authoritative URLs with brief descriptions)
7. **Questions to Explore** (3-5 open questions the article could answer)
""",
            },
        ],
    )
    # content can be None on an empty completion; return a string either way
    return response.choices[0].message.content or ""
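Since the brief is meant to be fact-grounded, it can be worth checking that every URL the model cites was actually in the material you gave it. The sketch below extracts URLs from the generated text with a deliberately simple regular expression and reports any that are not in the known set. unsupported_citations and extract_urls are hypothetical helpers, and the regex is an approximation rather than a full URL parser.

```python
import re

# Approximate URL matcher: stops at whitespace, quotes, and closing brackets
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")

def extract_urls(text: str) -> set:
    """Return the set of URLs found in text, minus trailing punctuation."""
    return {match.rstrip(".,;:") for match in URL_PATTERN.findall(text)}

def unsupported_citations(brief: str, known_urls) -> list:
    """List URLs cited in the brief that are not among the supplied sources."""
    # Ignore a trailing slash so /guide and /guide/ count as the same page
    known = {url.rstrip("/") for url in known_urls}
    return sorted(
        url for url in extract_urls(brief) if url.rstrip("/") not in known
    )
```

A non-empty result is a signal to review those citations by hand; it does not by itself prove the model invented them, because the /research synthesis may contain URLs of its own.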
8. Wire up the full pipeline
The orchestrator runs each step in order and optionally saves a Markdown file with the brief plus a raw source list for editorial transparency.
def research_topic(topic: str, output_file: str | None = None) -> str:
    """Run the full content research pipeline."""
    print(f"Researching: {topic}")

    # Step 1: Recent news
    print("  Fetching recent news...")
    news = get_recent_news(topic)
    print(f"  Found {len(news)} news articles")

    # Step 2: Web sources
    print("  Searching for authoritative sources...")
    sources = search_sources(topic)

    # Step 3: Scrape top sources
    print("  Scraping top sources...")
    enriched = scrape_sources(sources)

    # Step 4: Background synthesis
    print("  Synthesizing background research...")
    research = get_research_synthesis(topic)

    # Step 5: Generate brief
    print("  Generating research brief...")
    brief = generate_brief(topic, news, enriched, research)

    # Save to file if requested
    if output_file:
        with open(output_file, "w", encoding="utf-8") as f:
            f.write(f"# Research Brief: {topic}\n\n")
            f.write(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M')}\n\n")
            f.write(brief)
            f.write("\n\n---\n\n## Raw Sources\n\n")
            # .get() guards against results that lack a title or URL
            for article in news[:5]:
                f.write(f"- [{article.get('title', 'Untitled')}]({article.get('url', '')})\n")
            for source in sources[:5]:
                f.write(f"- [{source.get('title', 'Untitled')}]({source.get('url', '')})\n")
        print(f"  Saved to {output_file}")
    return brief

if __name__ == "__main__":
    import sys

    topic = " ".join(sys.argv[1:]) if len(sys.argv) > 1 else "AI agent frameworks 2025"
    brief = research_topic(
        topic,
        output_file=f"research_{topic[:30].replace(' ', '_')}.md"
    )
    print("\n" + "=" * 60)
    print(brief)
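The output filename above is built by replacing spaces in the topic, which leaves characters such as "/" or ":" in place; a topic like "AI/ML agents" would then point at a directory that does not exist. A more defensive slug is sketched below. brief_filename is a hypothetical helper, and restricting the name to ASCII letters and digits is a simplifying assumption.

```python
import re

def brief_filename(topic: str, max_len: int = 30) -> str:
    """Build a filesystem-safe research_*.md name from a free-text topic."""
    # Collapse every run of non-alphanumeric characters into one underscore
    slug = re.sub(r"[^A-Za-z0-9]+", "_", topic).strip("_")
    # Trim to length, then drop any underscore left dangling by the cut
    slug = slug[:max_len].rstrip("_")
    return f"research_{slug or 'untitled'}.md"
```

Passing output_file=brief_filename(topic) to research_topic keeps the same research_*.md naming. Topics written in non-Latin scripts collapse to "untitled" under this rule, so widen the character class if you need them.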
9. Run it
# Research a tech topic
python content_research.py "LLM agent memory systems"
# Research a business topic
python content_research.py "B2B SaaS pricing models 2025"
# Research a news event
python content_research.py "open source AI models recent developments"
Each run drops a research_*.md file with the brief and a linked source list — ready to hand to a writer or paste into your CMS as a starting draft.
10. Extending the agent
- Scheduled research digest: run it on a cron schedule to generate weekly briefs on ongoing beats; combine with the news briefing agent for daily updates.
- Community angles: add a /search step with site:reddit.com {topic} or site:news.ycombinator.com {topic} to surface discussion angles and contrarian takes.
- Fact-checking pass: after the brief is generated, use /search to verify specific claims against primary sources before publishing.
- Multi-angle research: generate separate briefs for different audiences (beginner vs. expert, pro vs. con) and merge them.
- Citation tracking: store source URLs and retrieval timestamps in a database for editorial transparency.
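As one illustration, the citation-tracking idea can start as a single SQLite table. The sketch below uses Python's built-in sqlite3 module; the table layout, the citations.db filename, and the helper names open_citation_db and record_sources are all assumptions made for the example, not part of the agent above.

```python
import sqlite3
from datetime import datetime, timezone

def open_citation_db(path: str = "citations.db") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS citations (
               topic TEXT NOT NULL,
               url TEXT NOT NULL,
               title TEXT,
               retrieved_at TEXT NOT NULL,
               PRIMARY KEY (topic, url)
           )"""
    )
    return conn

def record_sources(conn, topic, sources, retrieved_at=None) -> int:
    """Store each source URL with a retrieval timestamp; return rows written."""
    stamp = retrieved_at or datetime.now(timezone.utc).isoformat(timespec="seconds")
    rows = [
        (topic, source["url"], source.get("title", ""), stamp)
        for source in sources
        if source.get("url")
    ]
    # "with conn" commits on success and rolls back if the insert fails
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO citations VALUES (?, ?, ?, ?)", rows
        )
    return len(rows)
```

Recording the same topic and URL again overwrites the earlier row, so retrieved_at always reflects the most recent fetch.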
11. Getting your API key
Grab a free Superhighway key at /pricing (1,000 calls/month, no credit card). For an agent that provisions its own access, skip the key entirely with x402: it pays $0.002 per call in USDC on Base — no signup, no key management. See the x402 pay-per-call guide for the wallet setup.
From here, the search-and-read guide goes deeper on combining search with scraping, and the news briefing guide shows how to wire /news into a recurring digest.