EdTech Research Agent
EdTech is a $300B+ global market where product claims outrun evidence, regulation (FERPA, COPPA, CIPA) is non-negotiable, and procurement runs through districts, ESSER funds, and Title I budgets. This guide builds a Python agent for EdTech investors, startup founders, district curriculum directors, learning scientists, and instructional designers. It chains all four Superhighway endpoints — /research for the market landscape, /search against EdTech industry and research sources, /scrape for a specific product profile or research brief, and /news for funding and procurement activity — then uses an LLM to emit a structured market brief as JSON.
EdTech product efficacy claims vary widely. This agent provides market and competitive research, not instructional design advice or legal compliance guidance. Verify student data privacy compliance (FERPA, COPPA) with your district's legal counsel before procurement.
Overview
The agent takes a product or market segment — "adaptive math learning platform K-12", "AI writing tutor higher education" — and produces a structured EdTech market brief:
- Synthesizes the market: size estimates, growth trajectory, key players, funding trends, adoption patterns, learning efficacy evidence, and the regulatory landscape
- Searches EdTech industry sources — EdSurge (news and product reviews), Crunchbase (funding), ISTE (ed-tech standards)
- Searches research evidence — peer-reviewed studies, efficacy data, and adoption reports from RAND, the Gates Foundation, What Works Clearinghouse, and IES
- Scrapes one relevant page: an EdSurge product profile, a district RFP, or a research brief
- Pulls recent news: funding rounds, district procurement decisions, policy changes (ESSER, Title I/II), product launches, and acquisitions
- Uses an LLM to generate a structured brief — segment, evidence base, learning outcomes claimed, regulatory considerations, procurement landscape, deployment contexts, and funding activity as JSON
Who it's for: EdTech investors and analysts, startup founders, district curriculum directors and procurement leads, learning scientists, and instructional designers.
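Because the brief is produced by an LLM, it can be worth checking its shape before handing it to anything downstream. The following is a minimal sketch of such a check. The field names and allowed values are copied from the prompt in the full example; `REQUIRED_FIELDS`, `ALLOWED_VALUES`, and `check_brief` are hypothetical names introduced here and are not part of the agent.

```python
# Field names taken from the prompt in the full example.
REQUIRED_FIELDS = [
    "product_or_market_segment", "market_segment", "market_overview",
    "key_players", "evidence_base", "learning_outcomes_claimed",
    "regulatory_considerations", "procurement_landscape",
    "deployment_contexts", "funding_activity",
    "competitive_differentiators", "data_quality", "note",
]

# Tuples rather than sets, so unhashable values (lists, dicts) compare safely.
ALLOWED_VALUES = {
    "market_segment": (
        "K-12", "higher-ed", "corporate-learning",
        "early-childhood", "professional-development", "mixed",
    ),
    "data_quality": ("high", "medium", "low"),
}


def check_brief(brief: dict) -> list[str]:
    """Return a list of human-readable problems; empty means the shape looks fine."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in brief]
    for name, allowed in ALLOWED_VALUES.items():
        if name in brief and brief[name] not in allowed:
            problems.append(f"unexpected {name}: {brief[name]!r}")
    if "key_players" in brief and not isinstance(brief["key_players"], list):
        problems.append("key_players is not a list")
    return problems


# A complete brief passes; a misspelled enum value is reported.
complete = {name: "" for name in REQUIRED_FIELDS}
complete.update(market_segment="K-12", data_quality="high", key_players=[])
bad = dict(complete, market_segment="K12")
print(check_brief(complete))
print(check_brief(bad))
```

This only checks structure. It says nothing about whether the content of a field is accurate, which still needs a human reviewer.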
How it works
Five endpoint calls feed one LLM synthesis:
- /research: deep synthesis of the market: size, growth, key players, funding trends, adoption patterns, learning efficacy evidence, regulatory landscape.
- /search (industry sources): EdTech news, reviews, and funding scoped to EdSurge, Crunchbase, and ISTE.
- /search (research evidence, time=year): peer-reviewed studies, efficacy data, and adoption reports from research bodies.
- /scrape: one relevant URL, e.g. an EdSurge product profile, a district RFP, or a research brief.
- /news (time=month): recent funding rounds, district procurement decisions, policy changes, product launches, and acquisitions.
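In the full example the calls run one after another, and an exception in any of them stops the run. Since /research, both /search calls, and /news do not depend on each other (only /scrape needs a URL from the search results), one option is to issue them concurrently and fall back to an empty value when a call fails. The sketch below illustrates that idea with the standard library; `gather_sources` and the `fake_*` stand-ins are hypothetical names used only so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def gather_sources(query, calls):
    """Run independent source calls concurrently.

    `calls` maps a label to a (function, fallback) pair. A call that
    raises yields its fallback instead of aborting the whole run.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(calls))) as pool:
        futures = {label: pool.submit(fn, query) for label, (fn, _) in calls.items()}
        for label, future in futures.items():
            try:
                results[label] = future.result()
            except Exception as exc:  # broad on purpose: any failure degrades to the fallback
                print(f"{label} failed: {exc}")
                results[label] = calls[label][1]
    return results


# Stand-ins so the sketch is self-contained; swap in the real endpoint functions.
def fake_research(query):
    return f"summary for {query}"


def fake_search(query):
    return [{"title": "t", "url": "https://example.org"}]


def fake_news(query):
    raise TimeoutError("simulated outage")


sources = gather_sources("adaptive math", {
    "market": (fake_research, ""),
    "industry": (fake_search, []),
    "news": (fake_news, []),
})
print(sources["news"])
```

With the agent's own functions, the mapping would pair each of the four independent calls with `""` or `[]` as its fallback. The LLM step would then run on whatever sources were actually retrieved.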
Full example
pip install openai requests python-dotenv
Create a .env file with your two keys:
SUPERHIGHWAY_API_KEY=your_key_here
OPENAI_API_KEY=your_key_here
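`os.getenv` returns `None` for a variable that is not set, so a missing key would otherwise only surface later as an authentication error from one of the APIs. The sketch below shows one way to fail fast once the .env values are in the environment (for example via python-dotenv's `load_dotenv`). `REQUIRED_KEYS` and `missing_keys` are hypothetical helper names, not part of the agent.

```python
import os

REQUIRED_KEYS = ("SUPERHIGHWAY_API_KEY", "OPENAI_API_KEY")


def missing_keys(env=os.environ):
    """Return the names of required keys that are unset or blank in `env`."""
    return [name for name in REQUIRED_KEYS if not (env.get(name) or "").strip()]


# Demonstrated on a plain dict; in the agent you would call missing_keys()
# with no arguments and exit if the returned list is non-empty.
example_env = {"SUPERHIGHWAY_API_KEY": "your_key_here", "OPENAI_API_KEY": ""}
print(missing_keys(example_env))
```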
import json
import os

import requests
from dotenv import load_dotenv
from openai import OpenAI

# Load SUPERHIGHWAY_API_KEY and OPENAI_API_KEY from the .env file.
load_dotenv()

SUPERHIGHWAY_KEY = os.getenv("SUPERHIGHWAY_API_KEY")
BASE = "https://superhighway.walls.sh"
HEADERS = {"Authorization": f"Bearer {SUPERHIGHWAY_KEY}"}
llm = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
NOTE = (
    "EdTech product claims vary widely in research support. Verify efficacy "
    "evidence independently through What Works Clearinghouse (ies.ed.gov/ncee/wwc) "
    "and RAND Education research before procurement decisions."
)
# 1. Deep synthesis of the EdTech market landscape
def research_market(query: str) -> str:
    """Market size, key players, funding trends, adoption, efficacy, regulation."""
    r = requests.get(
        f"{BASE}/research",
        params={"q": f"{query} education technology market"},
        headers=HEADERS,
    )
    data = r.json()
    return data.get("summary", "")[:3000]
# 2. EdTech industry sources: EdSurge, Crunchbase, ISTE
def search_industry(query: str) -> list[dict]:
    """Product reviews (EdSurge), funding (Crunchbase), standards (ISTE)."""
    r = requests.get(
        f"{BASE}/search",
        params={
            "q": f"{query} edtech startup funding product review "
                 f"site:edsurge.com OR site:crunchbase.com OR site:iste.org",
        },
        headers=HEADERS,
    )
    return r.json().get("results", [])
# 3. Research evidence: efficacy studies, adoption reports (last year)
def search_evidence(query: str) -> list[dict]:
    """Peer-reviewed studies, efficacy data from RAND, Gates, WWC, IES."""
    r = requests.get(
        f"{BASE}/search",
        params={
            "q": f"{query} education technology research evidence classroom efficacy",
            "time": "year",
        },
        headers=HEADERS,
    )
    return r.json().get("results", [])
# 4. Scrape one relevant product profile / RFP / research brief
def scrape_page(url: str) -> dict:
    """Pull an EdSurge product profile, a district RFP, or a research brief."""
    r = requests.post(
        f"{BASE}/scrape",
        json={"url": url, "mode": "markdown"},
        headers=HEADERS,
    )
    data = r.json()
    return {
        "url": url,
        "title": data.get("title", ""),
        "content": data.get("markdown", data.get("text", ""))[:2500],
    }
# 5. Recent edtech news: funding, procurement, policy (last month)
def get_news(query: str) -> list[dict]:
    """Funding rounds, district procurement, policy changes, launches, M&A."""
    r = requests.get(
        f"{BASE}/news",
        params={
            "q": f"{query} edtech education technology school district",
            "time": "month",
        },
        headers=HEADERS,
    )
    return r.json().get("results", [])
def generate_brief(
    query: str,
    market: str,
    industry: list[dict],
    evidence: list[dict],
    scraped: dict | None,
    news: list[dict],
) -> dict | None:
    """Generate a structured EdTech market brief as JSON."""
    # Flatten the top six results of each source into bullet lines for the prompt.
    industry_text = "\n".join(
        f"- {r.get('title', '')}: {r.get('snippet', '')} ({r.get('url', '')})"
        for r in industry[:6]
    )
    evidence_text = "\n".join(
        f"- {r.get('title', '')}: {r.get('snippet', '')}"
        for r in evidence[:6]
    )
    news_text = "\n".join(
        f"- {n.get('title', '')}: {n.get('snippet', '')}"
        for n in news[:6]
    )
    scraped_text = ""
    if scraped and scraped.get("content"):
        scraped_text = f"{scraped['title']}\n{scraped['content']}"
    response = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are an EdTech market and competitive analyst. Use ONLY the "
                    "provided sources. Do not invent company names, funding figures, "
                    "study results, or efficacy claims. If a detail is not in the "
                    "sources, say 'not found in sources.' This is market research, NOT "
                    "instructional design advice or legal compliance guidance. Be "
                    "precise about the strength of the evidence base and always remind "
                    "the reader to verify efficacy and student-data-privacy compliance."
                ),
            },
            {
                "role": "user",
                "content": f"""Write an EdTech market brief for: {query}
Market Landscape (synthesis):
{market}
EdTech Industry Sources (EdSurge / Crunchbase / ISTE):
{industry_text}
Research Evidence (studies, efficacy, adoption reports):
{evidence_text}
Scraped Product / RFP / Research Page:
{scraped_text}
Recent EdTech News:
{news_text}
Return JSON with ALL of these fields:
- product_or_market_segment: what's being researched (product name or market category)
- market_segment: "K-12" | "higher-ed" | "corporate-learning" | "early-childhood" | "professional-development" | "mixed"
- market_overview: market size estimate, growth trajectory, key trends driving adoption
- key_players: list of leading companies/products in this space
- evidence_base: strength of research evidence — "strong" | "moderate" | "emerging" | "limited" | "not-established"; summary of key studies or lack thereof
- learning_outcomes_claimed: what the product/category claims to improve (reading scores, engagement, retention, etc.)
- regulatory_considerations: FERPA (student data privacy), COPPA (under-13 data), CIPA (school internet safety), state-specific laws, student data privacy pledges
- procurement_landscape: how schools/districts buy (RFP process, per-seat pricing, Title I eligibility, ESSER fund eligibility, pilot programs)
- deployment_contexts: classroom, blended learning, 1:1 device programs, LMS integration requirements (Canvas, Schoology, Google Classroom, Clever SSO)
- funding_activity: recent funding rounds, acquisitions, or notable investors in this space
- competitive_differentiators: what distinguishes leaders (AI features, content library, assessment integration, teacher PD)
- data_quality: "high" | "medium" | "low" — based on how well sources covered EdSurge/research/district procurement
- note: leave this as an empty string; the calling code overwrites it with a fixed disclaimer""",
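            },
        ],
        response_format={"type": "json_object"},
    )
    try:
        brief = json.loads(response.choices[0].message.content)
        # Overwrite whatever the model returned so the disclaimer is always exact.
        brief["note"] = NOTE
        return brief
    except (json.JSONDecodeError, KeyError, TypeError):
        # TypeError covers an empty (None) message content.
        return None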
def research_edtech(query: str) -> dict | None:
    """Run the full EdTech market research pipeline."""
    print(f"Researching EdTech market: {query}")
    print("Synthesizing market landscape...")
    market = research_market(query)
    print("Searching EdTech industry sources...")
    industry = search_industry(query)
    print("Searching research evidence...")
    evidence = search_evidence(query)
    print("Scraping a relevant product/RFP/research page...")
    # Scrape search results from preferred domains until one returns content.
    scraped = None
    for result in industry + evidence:
        url = result.get("url")
        if url and ("edsurge.com" in url or "iste.org" in url or "rand.org" in url
                    or "ed.gov" in url):
            scraped = scrape_page(url)
            if scraped.get("content"):
                break
    print("Pulling recent edtech news...")
    news = get_news(query)
    print("Generating market brief...")
    return generate_brief(query, market, industry, evidence, scraped, news)
def print_brief(brief: dict | None) -> None:
    if not brief:
        print("Could not generate brief.")
        return
    print(f"\n{'='*60}")
    print("EdTech Market Brief")
    print(f"{'='*60}")
    print(f"\nProduct / Segment: {brief.get('product_or_market_segment', '')}")
    print(f"Market Segment: {brief.get('market_segment', '')}")
    print(f"\nMarket Overview:\n{brief.get('market_overview', '')}")
    # key_players comes from the LLM, so coerce each entry to str before joining.
    print(f"\nKey Players: {', '.join(map(str, brief.get('key_players') or []))}")
    print(f"\nEvidence Base:\n{brief.get('evidence_base', '')}")
    print(f"\nLearning Outcomes Claimed:\n{brief.get('learning_outcomes_claimed', '')}")
    print(f"\nRegulatory Considerations:\n{brief.get('regulatory_considerations', '')}")
    print(f"\nProcurement Landscape:\n{brief.get('procurement_landscape', '')}")
    print(f"\nDeployment Contexts:\n{brief.get('deployment_contexts', '')}")
    print(f"\nFunding Activity:\n{brief.get('funding_activity', '')}")
    print(f"\nCompetitive Differentiators:\n{brief.get('competitive_differentiators', '')}")
    print(f"\nData Quality: {brief.get('data_quality', '?')}")
    print(f"\n{brief.get('note', '')}")
if __name__ == "__main__":
    import sys
    query = sys.argv[1] if len(sys.argv) > 1 else "adaptive math learning platform K-12"
    brief = research_edtech(query)
    print_brief(brief)
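The scrape step in the example picks its target with substring checks such as `"ed.gov" in url`. These also match when the domain appears in a path or query string, and the loop keeps scraping until some page returns content. The sketch below is one possible alternative: it matches on the parsed hostname and caps the number of candidates. The domain list comes from the example; `PREFERRED_DOMAINS`, `is_preferred`, `scrape_candidates`, and the sample URLs are hypothetical.

```python
from urllib.parse import urlparse

# Same domains the example checks for, matched against the hostname only.
PREFERRED_DOMAINS = ("edsurge.com", "iste.org", "rand.org", "ed.gov")


def is_preferred(url: str) -> bool:
    """True if the URL's host is a preferred domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in PREFERRED_DOMAINS)


def scrape_candidates(results, limit=3):
    """Return up to `limit` distinct preferred URLs, in result order."""
    picked = []
    for result in results:
        if len(picked) >= limit:
            break
        url = result.get("url")
        if url and is_preferred(url) and url not in picked:
            picked.append(url)
    return picked


# Made-up search results to show which URLs survive the filter.
results = [
    {"url": "https://www.edsurge.com/product-reviews/example"},
    {"url": "https://example.com/?ref=edsurge.com"},
    {"url": "https://ies.ed.gov/ncee/wwc"},
    {"url": "https://www.crunchbase.com/organization/example"},
]
print(scrape_candidates(results))
```

In the agent, the loop over `industry + evidence` could iterate over `scrape_candidates(industry + evidence)` instead, which bounds the number of /scrape calls per run.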
Usage examples
- "adaptive math learning platform K-12" — maps the adaptive-math landscape (DreamBox, Khan Academy, IXL, Zearn), summarizes the evidence base for adaptive learning, and surfaces ESSER-eligible pricing and Title I procurement paths.
- "AI writing tutor higher education" — AI writing-assistant tools for colleges, the (still emerging) evidence on AI tutoring efficacy, FERPA implications for tools that store student work, and LMS integration patterns (Canvas, Blackboard).
- "professional development platform corporate learning" — the L&D and skills-based learning market, evidence on microlearning efficacy, and LMS/LXP integration requirements for enterprise deployments.
Validate before you deploy. Always validate efficacy claims through peer-reviewed research (What Works Clearinghouse, RAND Education) and consult your district's data privacy officer for FERPA/COPPA compliance before deploying student-facing tools.
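The example queries above can also be run in one pass, with each brief saved to its own JSON file for later review. This is a minimal sketch under stated assumptions: `slugify` and `run_batch` are hypothetical helpers, and `research_fn` stands in for the agent's `research_edtech`, so the demonstration uses a fake function and a temporary directory instead of real API calls.

```python
import json
import re
import tempfile
from pathlib import Path


def slugify(query: str) -> str:
    """Turn a query into a lowercase, hyphen-separated file stem."""
    return re.sub(r"[^a-z0-9]+", "-", query.lower()).strip("-")


def run_batch(queries, research_fn, out_dir="briefs"):
    """Run research_fn per query and write each non-empty brief to <slug>.json."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for query in queries:
        brief = research_fn(query)
        if brief is None:  # the pipeline returns None when the brief could not be parsed
            continue
        path = out / f"{slugify(query)}.json"
        path.write_text(json.dumps(brief, indent=2), encoding="utf-8")
        written.append(path)
    return written


# Fake pipeline: pretends the third query failed to produce a brief.
def fake_research(query):
    if "corporate" in query:
        return None
    return {"product_or_market_segment": query}


with tempfile.TemporaryDirectory() as tmp:
    paths = run_batch(
        [
            "adaptive math learning platform K-12",
            "AI writing tutor higher education",
            "professional development platform corporate learning",
        ],
        fake_research,
        out_dir=tmp,
    )
    names = [p.name for p in paths]
print(names)
```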
Getting your API key
Grab a free Superhighway key at /pricing (1,000 calls/month, no credit card). For an agent that provisions its own access, skip the key entirely with x402: it pays $0.002 per call in USDC on Base — no signup, no key management. See the x402 pay-per-call guide for the wallet setup.
See also
The brand monitoring agent applies the same multi-endpoint pattern to tracking mentions and sentiment, and the content research agent covers the deep-research synthesis pattern in more depth.