EdTech Research Agent

EdTech is a $300B+ global market where product claims outrun evidence, regulation (FERPA, COPPA, CIPA) is non-negotiable, and procurement runs through districts, ESSER funds, and Title I budgets. This guide builds a Python agent for EdTech investors, startup founders, district curriculum directors, learning scientists, and instructional designers. It chains all four Superhighway endpoints — /research for the market landscape, /search against EdTech industry and research sources, /scrape for a specific product profile or research brief, and /news for funding and procurement activity — then uses an LLM to emit a structured market brief as JSON.

EdTech product efficacy claims vary widely. This agent provides market and competitive research, not instructional design advice or legal compliance guidance. Verify student data privacy compliance (FERPA, COPPA) with your district's legal counsel before procurement.

Overview

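The agent takes a product or market segment, such as "adaptive math learning platform K-12" or "AI writing tutor higher education", and produces a structured EdTech market brief as JSON. The brief covers the market overview, key players, evidence base, claimed learning outcomes, regulatory considerations, procurement landscape, deployment contexts, funding activity, and competitive differentiators.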

Who it's for: EdTech investors and analysts, startup founders, district curriculum directors and procurement leads, learning scientists, and instructional designers.
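
The brief's field names and allowed values are spelled out in the prompt of the full example. As a rough illustration of how a caller might sanity-check a returned brief before using it, here is a minimal sketch. The `validate_brief` helper and the `REQUIRED_FIELDS` and `ENUM_FIELDS` tables are hypothetical names introduced for this sketch; they are not part of Superhighway or of the agent itself.

```python
# Hypothetical validator for the brief. Field names and allowed values are
# copied from the prompt in the full example; nothing here calls an API.
REQUIRED_FIELDS = [
    "product_or_market_segment", "market_segment", "market_overview",
    "key_players", "evidence_base", "learning_outcomes_claimed",
    "regulatory_considerations", "procurement_landscape",
    "deployment_contexts", "funding_activity",
    "competitive_differentiators", "data_quality", "note",
]

ENUM_FIELDS = {
    "market_segment": {
        "K-12", "higher-ed", "corporate-learning", "early-childhood",
        "professional-development", "mixed",
    },
    "data_quality": {"high", "medium", "low"},
}


def validate_brief(brief: dict) -> list[str]:
    """Return a list of problems; an empty list means the brief looks well formed."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in brief]
    for name, allowed in ENUM_FIELDS.items():
        if name not in brief:
            continue
        value = brief[name]
        # Guard with isinstance so an unexpected list or dict cannot raise here
        if not (isinstance(value, str) and value in allowed):
            problems.append(f"unexpected {name}: {value!r}")
    if "key_players" in brief and not isinstance(brief["key_players"], list):
        problems.append("key_players should be a list")
    return problems
```

A caller could log the returned problems and decide whether to retry the synthesis or accept a partial brief.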

How it works

Five endpoint calls feed one LLM synthesis:

  1. /research — deep synthesis of the market: size, growth, key players, funding trends, adoption patterns, learning efficacy evidence, regulatory landscape.
  2. /search (industry sources) — EdTech news, reviews, and funding scoped to EdSurge, Crunchbase, and ISTE.
  3. /search (research evidence, time=year) — peer-reviewed studies, efficacy data, and adoption reports from research bodies.
  4. /scrape — one relevant URL, e.g. an EdSurge product profile, a district RFP, or a research brief.
  5. /news (time=month) — recent funding rounds, district procurement decisions, policy changes, product launches, and acquisitions.
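
Of these five calls, only /scrape depends on another call's output, since it needs a URL taken from the search results; the other four are independent of each other. The full example runs them one after another for clarity. A sketch of running the independent calls concurrently might look like the following. The `gather_sources` helper is a hypothetical name, and the fetchers are passed in as plain callables so the sketch assumes nothing about the Superhighway API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def gather_sources(
    query: str,
    fetchers: dict[str, Callable[[str], object]],
) -> dict[str, object]:
    """Run every fetcher on the same query in parallel threads.

    Hypothetical helper: `fetchers` maps a label to a one-argument callable.
    A fetcher that raises is recorded as None instead of aborting the others.
    """
    results: dict[str, object] = {}
    # max(1, ...) because ThreadPoolExecutor rejects max_workers=0
    with ThreadPoolExecutor(max_workers=max(1, len(fetchers))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in fetchers.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception:
                results[name] = None
    return results
```

With the functions from the full example, this could be called as `gather_sources(query, {"market": research_market, "industry": search_industry, "evidence": search_evidence, "news": get_news})`, leaving /scrape to run afterwards on a URL picked from the search results.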

Full example

Install the dependencies:

pip install openai requests python-dotenv

Create a .env file with your two keys:

SUPERHIGHWAY_API_KEY=your_key_here
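OPENAI_API_KEY=your_key_here

The agent script:

import json
import os

import requests
from dotenv import load_dotenv
from openai import OpenAI

# Load SUPERHIGHWAY_API_KEY and OPENAI_API_KEY from the .env file;
# without this call os.getenv() only sees variables already exported in the shell
load_dotenv()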

SUPERHIGHWAY_KEY = os.getenv("SUPERHIGHWAY_API_KEY")
BASE = "https://superhighway.walls.sh"
HEADERS = {"Authorization": f"Bearer {SUPERHIGHWAY_KEY}"}

llm = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

NOTE = (
    "EdTech product claims vary widely in research support. Verify efficacy "
    "evidence independently through What Works Clearinghouse (ies.ed.gov/ncee/wwc) "
    "and RAND Education research before procurement decisions."
)

# 1. Deep synthesis of the EdTech market landscape
def research_market(query: str) -> str:
    """Market size, key players, funding trends, adoption, efficacy, regulation."""
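    r = requests.get(
        f"{BASE}/research",
        params={"q": f"{query} education technology market"},
        headers=HEADERS,
        timeout=120,  # assumed limit; deep research can be slow, tune as needed
    )
    r.raise_for_status()  # fail loudly on auth or quota errors
    data = r.json()
    # "summary" may be missing or null; fall back to an empty string
    return (data.get("summary") or "")[:3000]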

# 2. EdTech industry sources: EdSurge, Crunchbase, ISTE
def search_industry(query: str) -> list[dict]:
    """Product reviews (EdSurge), funding (Crunchbase), standards (ISTE)."""
    r = requests.get(
        f"{BASE}/search",
        params={
            "q": f"{query} edtech startup funding product review "
                 f"site:edsurge.com OR site:crunchbase.com OR site:iste.org",
        },
        headers=HEADERS,
        timeout=60,  # assumed limit; tune as needed
    )
    r.raise_for_status()  # fail loudly on auth or quota errors
    return r.json().get("results", [])

# 3. Research evidence: efficacy studies, adoption reports (last year)
def search_evidence(query: str) -> list[dict]:
    """Peer-reviewed studies, efficacy data from RAND, Gates, WWC, IES."""
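    r = requests.get(
        f"{BASE}/search",
        params={
            "q": f"{query} education technology research evidence classroom efficacy",
            "time": "year",
        },
        headers=HEADERS,
        timeout=60,  # assumed limit; tune as needed
    )
    r.raise_for_status()  # fail loudly on auth or quota errors
    return r.json().get("results", [])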

# 4. Scrape one relevant product profile / RFP / research brief
def scrape_page(url: str) -> dict:
    """Pull an EdSurge product profile, a district RFP, or a research brief."""
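    try:
        r = requests.post(
            f"{BASE}/scrape",
            json={"url": url, "mode": "markdown"},
            headers=HEADERS,
            timeout=60,  # assumed limit; tune as needed
        )
        r.raise_for_status()
        data = r.json()
    except (requests.RequestException, ValueError):
        # A failed scrape should not abort the run; the caller moves to the next URL
        data = {}
    return {
        "url": url,
        "title": data.get("title") or "",
        # Prefer markdown, fall back to plain text; either may be missing or null
        "content": (data.get("markdown") or data.get("text") or "")[:2500],
    }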

# 5. Recent edtech news: funding, procurement, policy (last month)
def get_news(query: str) -> list[dict]:
    """Funding rounds, district procurement, policy changes, launches, M&A."""
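    r = requests.get(
        f"{BASE}/news",
        params={
            "q": f"{query} edtech education technology school district",
            "time": "month",
        },
        headers=HEADERS,
        timeout=60,  # assumed limit; tune as needed
    )
    r.raise_for_status()  # fail loudly on auth or quota errors
    return r.json().get("results", [])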

def generate_brief(
    query: str,
    market: str,
    industry: list[dict],
    evidence: list[dict],
    scraped: dict | None,
    news: list[dict],
) -> dict | None:
    """Generate a structured EdTech market brief as JSON."""

    industry_text = "\n".join(
        f"- {r.get('title', '')}: {r.get('snippet', '')} ({r.get('url', '')})"
        for r in industry[:6]
    )
    evidence_text = "\n".join(
        f"- {r.get('title', '')}: {r.get('snippet', '')}"
        for r in evidence[:6]
    )
    news_text = "\n".join(
        f"- {n.get('title', '')}: {n.get('snippet', '')}"
        for n in news[:6]
    )
    scraped_text = ""
    if scraped and scraped.get("content"):
        scraped_text = f"{scraped['title']}\n{scraped['content']}"

    response = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are an EdTech market and competitive analyst. Use ONLY the "
                    "provided sources. Do not invent company names, funding figures, "
                    "study results, or efficacy claims — if a detail is not in the "
                    "sources, say 'not found in sources.' This is market research, NOT "
                    "instructional design advice or legal compliance guidance. Be "
                    "precise about the strength of the evidence base and always remind "
                    "the reader to verify efficacy and student-data-privacy compliance."
                ),
            },
            {
                "role": "user",
                "content": f"""Write an EdTech market brief for: {query}

Market Landscape (synthesis):
{market}

EdTech Industry Sources (EdSurge / Crunchbase / ISTE):
{industry_text}

Research Evidence (studies, efficacy, adoption reports):
{evidence_text}

Scraped Product / RFP / Research Page:
{scraped_text}

Recent EdTech News:
{news_text}

Return JSON with ALL of these fields:
- product_or_market_segment: what's being researched (product name or market category)
- market_segment: "K-12" | "higher-ed" | "corporate-learning" | "early-childhood" | "professional-development" | "mixed"
- market_overview: market size estimate, growth trajectory, key trends driving adoption
- key_players: list of leading companies/products in this space
- evidence_base: strength of research evidence — "strong" | "moderate" | "emerging" | "limited" | "not-established"; summary of key studies or lack thereof
- learning_outcomes_claimed: what the product/category claims to improve (reading scores, engagement, retention, etc.)
- regulatory_considerations: FERPA (student data privacy), COPPA (under-13 data), CIPA (school internet safety), state-specific laws, student data privacy pledges
- procurement_landscape: how schools/districts buy (RFP process, per-seat pricing, Title I eligibility, ESSER fund eligibility, pilot programs)
- deployment_contexts: classroom, blended learning, 1:1 device programs, LMS integration requirements (Canvas, Schoology, Google Classroom, Clever SSO)
- funding_activity: recent funding rounds, acquisitions, or notable investors in this space
- competitive_differentiators: what distinguishes leaders (AI features, content library, assessment integration, teacher PD)
- data_quality: "high" | "medium" | "low" — based on how well sources covered EdSurge/research/district procurement
- note: always exactly this string: {NOTE}""",
            },
        ],
        response_format={"type": "json_object"},
    )

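    try:
        brief = json.loads(response.choices[0].message.content)
        # Overwrite whatever the model wrote so the note is always exact
        brief["note"] = NOTE
        return brief
    except (json.JSONDecodeError, TypeError, IndexError):
        # Malformed JSON, empty message content (None), or no choices returned
        return None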

def research_edtech(query: str) -> dict | None:
    """Run the full EdTech market research pipeline."""
    print(f"Researching EdTech market: {query}")

    print("Synthesizing market landscape...")
    market = research_market(query)

    print("Searching EdTech industry sources...")
    industry = search_industry(query)

    print("Searching research evidence...")
    evidence = search_evidence(query)

    print("Scraping a relevant product/RFP/research page...")
    scraped = None
    for result in industry + evidence:
        url = result.get("url")
        if url and ("edsurge.com" in url or "iste.org" in url or "rand.org" in url
                    or "ed.gov" in url):
            scraped = scrape_page(url)
            if scraped.get("content"):
                break

    print("Pulling recent edtech news...")
    news = get_news(query)

    print("Generating market brief...")
    return generate_brief(query, market, industry, evidence, scraped, news)

def print_brief(brief: dict | None) -> None:
    if not brief:
        print("Could not generate brief.")
        return
    print(f"\n{'='*60}")
    print("EdTech Market Brief")
    print(f"{'='*60}")
    print(f"\nProduct / Segment: {brief.get('product_or_market_segment', '')}")
    print(f"Market Segment: {brief.get('market_segment', '')}")
    print(f"\nMarket Overview:\n{brief.get('market_overview', '')}")
    # The model may return names as strings or as objects; str() keeps join() safe
    print(f"\nKey Players: {', '.join(str(p) for p in (brief.get('key_players') or []))}")
    print(f"\nEvidence Base:\n{brief.get('evidence_base', '')}")
    print(f"\nLearning Outcomes Claimed:\n{brief.get('learning_outcomes_claimed', '')}")
    print(f"\nRegulatory Considerations:\n{brief.get('regulatory_considerations', '')}")
    print(f"\nProcurement Landscape:\n{brief.get('procurement_landscape', '')}")
    print(f"\nDeployment Contexts:\n{brief.get('deployment_contexts', '')}")
    print(f"\nFunding Activity:\n{brief.get('funding_activity', '')}")
    print(f"\nCompetitive Differentiators:\n{brief.get('competitive_differentiators', '')}")
    print(f"\nData Quality: {brief.get('data_quality', '?')}")
    print(f"\n{brief.get('note', '')}")

if __name__ == "__main__":
    import sys
    query = sys.argv[1] if len(sys.argv) > 1 else "adaptive math learning platform K-12"
    brief = research_edtech(query)
    print_brief(brief)
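
The scrape step in the example picks a URL with substring checks such as `"ed.gov" in url`, which also match a URL that merely mentions one of those domains in its path or query string. A stricter variant, sketched here with the standard library's `urllib.parse`, compares the parsed hostname instead. The `is_trusted_source` helper and the `TRUSTED_DOMAINS` tuple are hypothetical names; the four domains are the ones the example already checks.

```python
from urllib.parse import urlparse

# The same four domains the example's substring check looks for.
TRUSTED_DOMAINS = ("edsurge.com", "iste.org", "rand.org", "ed.gov")


def is_trusted_source(url: str) -> bool:
    """True if the URL's host is one of TRUSTED_DOMAINS or a subdomain of one."""
    # hostname is None for strings that do not parse as a URL with a host
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in TRUSTED_DOMAINS)
```

In the pipeline, the condition inside the loop could then read `if url and is_trusted_source(url):`.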

Usage examples

Validate before you deploy. Always validate efficacy claims through peer-reviewed research (What Works Clearinghouse, RAND Education) and consult your district's data privacy officer for FERPA/COPPA compliance before deploying student-facing tools.

Getting your API key

Grab a free Superhighway key at /pricing (1,000 calls/month, no credit card). For an agent that provisions its own access, skip the key entirely with x402: it pays $0.002 per call in USDC on Base — no signup, no key management. See the x402 pay-per-call guide for the wallet setup.
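
As a rough budgeting aid, the arithmetic below follows from the figures quoted above and from the five calls per brief described in How it works. The helper names are hypothetical, and five is a lower bound, since the example may try more than one URL in the /scrape step. LLM usage is billed separately and is not included.

```python
from decimal import Decimal

CALLS_PER_BRIEF = 5                      # one /research, two /search, one /scrape, one /news
FREE_CALLS_PER_MONTH = 1_000             # free tier quoted above
X402_PRICE_PER_CALL = Decimal("0.002")   # USDC per call, as quoted above


def briefs_per_month(calls_per_brief: int = CALLS_PER_BRIEF) -> int:
    """Whole briefs that fit in the free monthly quota."""
    return FREE_CALLS_PER_MONTH // calls_per_brief


def x402_cost_per_brief(calls_per_brief: int = CALLS_PER_BRIEF) -> Decimal:
    """Pay-per-call cost of one brief in USDC (Superhighway calls only)."""
    return X402_PRICE_PER_CALL * calls_per_brief


print(briefs_per_month())     # 200
print(x402_cost_per_brief())  # 0.010
```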

See also

The brand monitoring agent applies the same multi-endpoint pattern to tracking mentions and sentiment, and the content research agent covers the deep-research synthesis pattern in more depth.