Media & Entertainment Research Agent

Media and entertainment is a fast-moving business where box office, streaming viewership, and content deals reset the competitive map every week — and where the difference between a green-light and a pass turns on audience data, distribution economics, and franchise IP value. This guide builds a Python agent for content strategists, streaming analysts, entertainment investors, talent agencies, production companies, and IP licensing teams. It chains all four Superhighway endpoints — /research for the market landscape, /search against entertainment trade and market-research sources, /scrape for a specific deal report or box-office page, and /news for recent deals and performance — then uses an LLM to emit a structured media brief as JSON.

Overview

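The agent takes a film, show, IP, company, or market segment (for example "superhero franchise streaming vs theatrical" or "podcast advertising market 2024") and produces a structured media and entertainment brief as JSON. The brief covers the subject and content type, a market overview, key players, the distribution landscape, the audience profile, content economics, recent deals and performance, the competitive landscape, IP and licensing, the near-term outlook, and a data-quality rating.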

Who it's for: content strategists, streaming analysts, entertainment investors, talent agencies, production companies, and IP licensing teams.
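
Because the brief is produced by an LLM, it can help to check its shape before anything downstream consumes it. The sketch below is a minimal validator, assuming the field names and allowed values listed in the prompt of the full example; `validate_brief` and its constants are hypothetical helpers, not part of Superhighway or the OpenAI SDK.

```python
# Hypothetical helper: field names and allowed values are copied from the
# prompt in the full example; nothing here is a Superhighway API.
REQUIRED_FIELDS = (
    "subject", "content_type", "market_overview", "key_players",
    "distribution_landscape", "audience_profile", "content_economics",
    "recent_deals_and_performance", "competitive_landscape",
    "ip_and_licensing", "outlook", "data_quality",
)
CONTENT_TYPES = (
    "film", "tv-series", "streaming", "theatrical", "music", "gaming",
    "live-events", "sports-rights", "podcast", "mixed",
)
DATA_QUALITY = ("high", "medium", "low")


def validate_brief(brief: dict) -> list[str]:
    """Return a list of problems; an empty list means the brief looks complete."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in brief]
    if "content_type" in brief and brief["content_type"] not in CONTENT_TYPES:
        problems.append(f"unexpected content_type: {brief['content_type']!r}")
    if "data_quality" in brief and brief["data_quality"] not in DATA_QUALITY:
        problems.append(f"unexpected data_quality: {brief['data_quality']!r}")
    return problems
```

A caller might log the returned problems and retry the synthesis step when the list is non-empty, rather than printing a partial brief.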

How it works

Five endpoint calls feed one LLM synthesis:

  1. /research — deep synthesis of the market: structure, key players, business models, content economics, industry trends.
  2. /search (trade publications) — box office, ratings, streaming viewership, and deals scoped to Variety, Deadline, and Hollywood Reporter.
  3. /search (market research, time=year) — market share, audience demographics, and revenue analysis from Nielsen, Parrot Analytics, and Bloomberg Intelligence.
  4. /scrape — one relevant URL, e.g. a Variety deal report, a Box Office Mojo page, or a Parrot Analytics summary.
  5. /news (time=month) — recent acquisitions, green-light decisions, subscriber data, theatrical performance, talent deals, and distribution rights.
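
The five calls above can be laid out as plain data before any request is sent, which makes the four-endpoints, five-calls structure easy to see. This is only an illustrative sketch: `build_call_plan` is a hypothetical helper, and the methods, query strings, and `time` values are copied from the full example rather than from an API reference.

```python
def build_call_plan(query: str) -> list[dict]:
    """List the five calls in order; arguments mirror the full example."""
    trade_sites = "site:variety.com OR site:deadline.com OR site:hollywoodreporter.com"
    return [
        {"method": "GET", "endpoint": "/research",
         "args": {"q": f"{query} media entertainment industry"}},
        {"method": "GET", "endpoint": "/search",
         "args": {"q": f"{query} box office streaming viewership ratings {trade_sites}"}},
        {"method": "GET", "endpoint": "/search",
         "args": {"q": f"{query} entertainment media market share revenue audience demographics",
                  "time": "year"}},
        # The URL is not known up front: it is picked from the search results at run time.
        {"method": "POST", "endpoint": "/scrape",
         "args": {"url": None, "mode": "markdown"}},
        {"method": "GET", "endpoint": "/news",
         "args": {"q": f"{query} entertainment media streaming", "time": "month"}},
    ]


for step, call in enumerate(build_call_plan("podcast advertising market 2024"), start=1):
    print(step, call["method"], call["endpoint"])
```

Note that /search appears twice with different arguments, and that /scrape is the only call whose input depends on an earlier call's output.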

Full example

pip install openai requests python-dotenv

Create a .env file with your two keys:

SUPERHIGHWAY_API_KEY=your_key_here
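OPENAI_API_KEY=your_key_here

import json
import os

import requests
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()  # read SUPERHIGHWAY_API_KEY and OPENAI_API_KEY from .env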

SUPERHIGHWAY_KEY = os.getenv("SUPERHIGHWAY_API_KEY")
BASE = "https://superhighway.walls.sh"
HEADERS = {"Authorization": f"Bearer {SUPERHIGHWAY_KEY}"}

llm = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

NOTE = (
    "Box office and streaming data can lag weeks behind actual performance. "
    "For the most current viewership figures, verify directly with Nielsen, "
    "Parrot Analytics, or the studio/streamer's investor relations filings."
)

# 1. Deep synthesis of the media & entertainment market
def research_market(query: str) -> str:
    """Market structure, key players, business models, content economics, trends."""
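    r = requests.get(
        f"{BASE}/research",
        params={"q": f"{query} media entertainment industry"},
        headers=HEADERS,
        timeout=120,  # assumed ceiling; without a timeout, requests can wait indefinitely
    )
    data = r.json()
    # "or" guards against a null summary, which would break the slice
    return (data.get("summary") or "")[:3000]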

# 2. Trade publications: Variety, Deadline, Hollywood Reporter
def search_trades(query: str) -> list[dict]:
    """Box office, ratings, streaming viewership, deals from the trades."""
    r = requests.get(
        f"{BASE}/search",
        params={
            "q": f"{query} box office streaming viewership ratings "
                 f"site:variety.com OR site:deadline.com OR site:hollywoodreporter.com",
        },
        headers=HEADERS,
        timeout=30,  # assumed ceiling; without a timeout, requests can wait indefinitely
    )
    return r.json().get("results", [])

# 3. Market research: demographics, revenue, market share (last year)
def search_market_data(query: str) -> list[dict]:
    """Audience demographics, revenue, and market share from the last year (not site-restricted)."""
    r = requests.get(
        f"{BASE}/search",
        params={
            "q": f"{query} entertainment media market share revenue audience demographics",
            "time": "year",
        },
        headers=HEADERS,
        timeout=30,  # assumed ceiling; without a timeout, requests can wait indefinitely
    )
    return r.json().get("results", [])

# 4. Scrape one relevant deal report / box-office page / analytics summary
def scrape_page(url: str) -> dict:
    """Pull a Variety deal report, a Box Office Mojo page, or a Parrot summary."""
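    r = requests.post(
        f"{BASE}/scrape",
        json={"url": url, "mode": "markdown"},
        headers=HEADERS,
        timeout=60,  # assumed ceiling; without a timeout, requests can wait indefinitely
    )
    data = r.json()
    # Fall back markdown -> text -> "" so a null field never breaks the slice.
    content = data.get("markdown") or data.get("text") or ""
    return {
        "url": url,
        "title": data.get("title") or "",
        "content": content[:2500],
    }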

# 5. Recent entertainment news: deals, greenlights, subscribers (last month)
def get_news(query: str) -> list[dict]:
    """Acquisitions, greenlights, subscriber data, theatrical performance, talent deals."""
    r = requests.get(
        f"{BASE}/news",
        params={
            "q": f"{query} entertainment media streaming",
            "time": "month",
        },
        headers=HEADERS,
        timeout=30,  # assumed ceiling; without a timeout, requests can wait indefinitely
    )
    return r.json().get("results", [])

def generate_brief(
    query: str,
    market: str,
    trades: list[dict],
    market_data: list[dict],
    scraped: dict | None,
    news: list[dict],
) -> dict | None:
    """Generate a structured media & entertainment brief as JSON."""

    trades_text = "\n".join(
        f"- {r.get('title', '')}: {r.get('snippet', '')} ({r.get('url', '')})"
        for r in trades[:6]
    )
    market_text = "\n".join(
        f"- {r.get('title', '')}: {r.get('snippet', '')}"
        for r in market_data[:6]
    )
    news_text = "\n".join(
        f"- {n.get('title', '')}: {n.get('snippet', '')}"
        for n in news[:6]
    )
    scraped_text = ""
    if scraped and scraped.get("content"):
        scraped_text = f"{scraped['title']}\n{scraped['content']}"

    response = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a media and entertainment industry analyst. Use ONLY the "
                    "provided sources. Do not invent box office figures, viewership "
                    "numbers, deal terms, or company names — if a detail is not in the "
                    "sources, say 'not found in sources.' Be precise about distribution "
                    "economics, audience data, and franchise IP value, and flag when "
                    "performance data may be preliminary or estimated."
                ),
            },
            {
                "role": "user",
                "content": f"""Write a media & entertainment brief for: {query}

Market Landscape (synthesis):
{market}

Trade Publications (Variety / Deadline / Hollywood Reporter):
{trades_text}

Market Research Data (demographics / revenue / market share):
{market_text}

Scraped Deal Report / Box-Office / Analytics Page:
{scraped_text}

Recent Entertainment News:
{news_text}

Return JSON with ALL of these fields:
- subject: film/show/IP/company/market segment being researched
- content_type: "film" | "tv-series" | "streaming" | "theatrical" | "music" | "gaming" | "live-events" | "sports-rights" | "podcast" | "mixed"
- market_overview: current state of this content category/market — size, growth, key dynamics
- key_players: list of studios, streamers, distributors, production companies in this space
- distribution_landscape: streaming vs. theatrical vs. linear TV vs. direct-to-consumer — current balance and trends
- audience_profile: who consumes this content — demographics, viewing habits, platform preferences, international vs. domestic split
- content_economics: typical production budgets, licensing fees, box office/viewership benchmarks, revenue windows, ancillary revenue (merchandise, theme parks, gaming tie-ins)
- recent_deals_and_performance: notable recent acquisitions, greenlight decisions, box office results, streaming viewership numbers
- competitive_landscape: how studios/streamers compete for this content type — exclusive deals, franchise IP, talent relationships
- ip_and_licensing: franchise value, sequel/spin-off potential, licensing opportunities, international rights considerations
- outlook: near-term trajectory — slate announcements, market trends, potential disruptions
- data_quality: "high" | "medium" | "low" — based on how well trade sources and market data covered the subject""",
            },
        ],
        response_format={"type": "json_object"},
    )

    try:
        # message.content can be None, in which case json.loads raises TypeError
        brief = json.loads(response.choices[0].message.content)
        brief["note"] = NOTE
        return brief
    except (json.JSONDecodeError, TypeError, KeyError, IndexError):
        return None

def research_media(query: str) -> dict | None:
    """Run the full media & entertainment research pipeline."""
    print(f"Researching media market: {query}")

    print("Synthesizing market landscape...")
    market = research_market(query)

    print("Searching trade publications...")
    trades = search_trades(query)

    print("Searching market research data...")
    market_data = search_market_data(query)

    print("Scraping a relevant deal/box-office/analytics page...")
    scraped = None
    scrape_domains = (
        "variety.com", "deadline.com", "hollywoodreporter.com",
        "boxofficemojo.com", "parrotanalytics.com",
    )
    # Try candidate URLs in order and stop at the first one that yields content.
    for result in trades + market_data:
        url = result.get("url")
        if url and any(domain in url for domain in scrape_domains):
            scraped = scrape_page(url)
            if scraped.get("content"):
                break

    print("Pulling recent entertainment news...")
    news = get_news(query)

    print("Generating media brief...")
    return generate_brief(query, market, trades, market_data, scraped, news)

def print_brief(brief: dict | None) -> None:
    if not brief:
        print("Could not generate brief.")
        return
    print(f"\n{'='*60}")
    print("Media & Entertainment Brief")
    print(f"{'='*60}")
    print(f"\nSubject: {brief.get('subject', '')}")
    print(f"Content Type: {brief.get('content_type', '')}")
    print(f"\nMarket Overview:\n{brief.get('market_overview', '')}")
    # The model may return key_players as a list of strings, a list of objects, or one string.
    players = brief.get("key_players") or []
    if isinstance(players, str):
        players = [players]
    print(f"\nKey Players: {', '.join(str(p) for p in players)}")
    print(f"\nDistribution Landscape:\n{brief.get('distribution_landscape', '')}")
    print(f"\nAudience Profile:\n{brief.get('audience_profile', '')}")
    print(f"\nContent Economics:\n{brief.get('content_economics', '')}")
    print(f"\nRecent Deals & Performance:\n{brief.get('recent_deals_and_performance', '')}")
    print(f"\nCompetitive Landscape:\n{brief.get('competitive_landscape', '')}")
    print(f"\nIP & Licensing:\n{brief.get('ip_and_licensing', '')}")
    print(f"\nOutlook:\n{brief.get('outlook', '')}")
    print(f"\nData Quality: {brief.get('data_quality', '?')}")
    print(f"\n{brief.get('note', '')}")

if __name__ == "__main__":
    import sys
    query = sys.argv[1] if len(sys.argv) > 1 else "superhero franchise streaming vs theatrical"
    brief = research_media(query)
    print_brief(brief)

Usage examples
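
The script reads the query from its first command-line argument and falls back to "superhero franchise streaming vs theatrical" when none is given. For batch runs it can be convenient to keep each brief on disk. The sketch below is one way to do that; `slugify`, `save_brief`, and the `briefs` directory name are hypothetical and not part of the full example.

```python
import json
import re
from pathlib import Path


def slugify(query: str) -> str:
    """Turn a free-text query into a filesystem-friendly name."""
    return re.sub(r"[^a-z0-9]+", "-", query.lower()).strip("-")


def save_brief(brief: dict, query: str, out_dir: str = "briefs") -> Path:
    """Write one brief to <out_dir>/<slug>.json and return the path."""
    path = Path(out_dir) / f"{slugify(query)}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(brief, indent=2, ensure_ascii=False), encoding="utf-8")
    return path


# With the full example's research_media in scope, a batch run might look like:
#   for q in ["superhero franchise streaming vs theatrical",
#             "podcast advertising market 2024"]:
#       brief = research_media(q)
#       if brief:
#           save_brief(brief, q)
```

Saving the raw JSON rather than the printed text keeps the `data_quality` and `note` fields available to whatever reads the briefs later.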

Getting your API key

Grab a free Superhighway key at /pricing (1,000 calls/month, no credit card). For an agent that provisions its own access, skip the key entirely with x402: it pays $0.002 per call in USDC on Base — no signup, no key management. See the x402 pay-per-call guide for the wallet setup.
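
For the key-based path, a missing key is easy to overlook: the full example formats whatever `os.getenv` returns, so an unset variable silently becomes the header value "Bearer None". A minimal sketch of a fail-fast alternative follows; `build_headers` is a hypothetical helper, and it does not apply to the keyless x402 path.

```python
import os


def build_headers(env=None) -> dict:
    """Build the Authorization header, failing early if the key is absent."""
    env = os.environ if env is None else env  # injectable mapping, handy for tests
    key = (env.get("SUPERHIGHWAY_API_KEY") or "").strip()
    if not key:
        raise RuntimeError(
            "SUPERHIGHWAY_API_KEY is not set; add it to .env or the environment."
        )
    return {"Authorization": f"Bearer {key}"}
```

Raising at startup surfaces the configuration problem before any of the five calls is attempted.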

See also

The brand monitoring agent applies the same multi-endpoint pattern to tracking mentions and sentiment, and the content research agent covers the deep-research synthesis pattern in more depth.