Add live web search to a DSPy program
DSPy is Stanford's framework for programming — not prompting — foundation models. You declare typed Signatures (input/output fields), compose Modules, and let an optimizer compile the prompts. DSPy has no @tool decorator: you give a ReAct agent tools via the dspy.Tool wrapper, or you build a custom dspy.Module that calls an external API and wire it into your pipeline. Both paths below use Superhighway for live web results.
Install dependencies
pip install dspy-ai requests
Get a free API key
Create a free-tier key at superhighway.walls.sh — no credit card. Pass it as a bearer token on every request. (Autonomous agents can skip the key entirely and pay per call via the x402 protocol.)
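Hardcoding the key in source makes it easy to leak. Below is a minimal sketch of reading it from the environment instead. The variable name SUPERHIGHWAY_API_KEY and the helper auth_headers are assumptions made for illustration; they are not part of the Superhighway API.

```python
import os


def auth_headers(env_var: str = "SUPERHIGHWAY_API_KEY") -> dict[str, str]:
    """Build the bearer-token header from an environment variable.

    The variable name is a hypothetical convention, not something the
    service requires.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} to your Superhighway API key")
    return {"Authorization": f"Bearer {key}"}
```

The returned dict can be passed as the headers argument of requests.get in place of a literal key.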
Path 1 — dspy.Tool (recommended for ReAct agents)
dspy.ReAct builds a reason-act-observe loop: the model thinks, calls a tool, reads the result, and repeats until it can answer. Wrap any Python function in dspy.Tool — the function's docstring becomes the description the model reads, and its type annotations become the argument schema.
import dspy
import requests

lm = dspy.LM("openai/gpt-4o-mini")
dspy.configure(lm=lm)


def web_search(query: str) -> str:
    """Search the live web and return top results as text."""
    resp = requests.get(
        "https://superhighway.walls.sh/search",
        params={"q": query, "limit": 5},
        headers={"Authorization": "Bearer YOUR_FREE_API_KEY"},
        timeout=30,  # requests has no default timeout
    )
    resp.raise_for_status()
    results = resp.json().get("results", [])
    return "\n\n".join(
        f"{r['title']}\n{r['url']}\n{r.get('description', '')}" for r in results
    )


# dspy.Tool picks up the function's name, docstring, and type hints
search_tool = dspy.Tool(web_search)


class ResearchAgent(dspy.Module):
    def __init__(self):
        super().__init__()
        self.agent = dspy.ReAct("question -> answer", tools=[search_tool])

    def forward(self, question: str) -> dspy.Prediction:
        return self.agent(question=question)


agent = ResearchAgent()
result = agent("What are the latest developments in AI agent frameworks?")
print(result.answer)
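To make concrete which parts of web_search carry information for the model, here is a standard-library sketch that pulls the name, docstring, and argument types out of a plain function. It is not DSPy's implementation of dspy.Tool; the helper describe_tool is hypothetical and only illustrates the idea described above.

```python
import inspect
from typing import get_type_hints


def describe_tool(func) -> dict:
    """Collect the metadata a tool wrapper can infer from a plain function."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # keep argument types only
    return {
        "name": func.__name__,
        "desc": inspect.getdoc(func) or "",
        "args": {name: getattr(t, "__name__", str(t)) for name, t in hints.items()},
    }


def web_search(query: str) -> str:
    """Search the live web and return top results as text."""
    return ""  # body omitted; only the signature and docstring matter here


print(describe_tool(web_search))
```

A function with no docstring or no annotations yields an empty description or an empty argument map here, which is a good reason to write both before wrapping a function as a tool.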
Path 2 — Custom Module (structured output pipeline)
When you want structured, typed output instead of a free-form ReAct loop, build a dspy.Module for the search step and feed its results into a dspy.Predict over a typed Signature. The signature declares exactly what fields come out — here a detailed answer plus a list of sources.
import dspy
import requests


class WebSearch(dspy.Module):
    """A DSPy module that queries Superhighway for live web results."""

    def __init__(self, api_key: str, max_results: int = 5):
        super().__init__()
        self.api_key = api_key
        self.max_results = max_results

    def forward(self, query: str) -> list[dict]:
        resp = requests.get(
            "https://superhighway.walls.sh/search",
            params={"q": query, "limit": self.max_results},
            headers={"Authorization": f"Bearer {self.api_key}"},
            timeout=30,  # requests has no default timeout
        )
        resp.raise_for_status()
        return resp.json().get("results", [])


class AnswerWithSources(dspy.Signature):
    """Answer a question using web search results."""

    question: str = dspy.InputField()
    search_results: str = dspy.InputField(desc="web search results as text")
    answer: str = dspy.OutputField(desc="detailed answer with key facts")
    sources: list[str] = dspy.OutputField(desc="URLs of sources used")


class ResearchPipeline(dspy.Module):
    def __init__(self, api_key: str):
        super().__init__()
        self.search = WebSearch(api_key=api_key)
        self.answer = dspy.Predict(AnswerWithSources)

    def forward(self, question: str) -> dspy.Prediction:
        # Fetch live results first, then let the LM answer over them
        results = self.search(query=question)
        results_text = "\n\n".join(
            f"{r['title']}\n{r['url']}\n{r.get('description', '')}"
            for r in results
        )
        return self.answer(question=question, search_results=results_text)


lm = dspy.LM("openai/gpt-4o-mini")
dspy.configure(lm=lm)

pipeline = ResearchPipeline(api_key="YOUR_FREE_API_KEY")
result = pipeline("What are the latest AI safety research papers?")
print(result.answer)
print("\nSources:", result.sources)
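Both paths turn the results list into one text block before the model sees it. Here is a sketch of a shared helper that tolerates missing fields and caps description length so that one long snippet does not dominate the prompt. The helper name format_results, the 300-character default, and the sample records are assumptions; the field names title, url, and description follow the code above.

```python
def format_results(results: list[dict], max_desc_chars: int = 300) -> str:
    """Render search results as title / URL / description blocks."""
    blocks = []
    for r in results:
        url = r.get("url")
        if not url:
            continue  # a result without a URL cannot be cited as a source
        title = r.get("title") or "(untitled)"
        desc = (r.get("description") or "")[:max_desc_chars]
        # Drop empty parts so a missing description leaves no blank line
        blocks.append("\n".join(part for part in (title, url, desc) if part))
    return "\n\n".join(blocks)


# Made-up records for illustration only
sample = [
    {"title": "Example A", "url": "https://example.com/a", "description": "x" * 500},
    {"title": "No link"},
    {"url": "https://example.com/b"},
]
text = format_results(sample, max_desc_chars=10)
```

Either web_search or ResearchPipeline.forward could call such a helper instead of repeating the join expression inline.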
Single-call deep research with /research
When the model needs to both find and read sources, the /research endpoint searches the web, reads the top result pages as clean markdown, and returns them in one call — content, not just links. That replaces a multi-turn ReAct loop with a single request. The response is {query, count, results, pages}, where each page has title, url, and markdown.
import requests


def deep_research(query: str, api_key: str) -> str:
    """Search AND read top pages in one call using /research."""
    resp = requests.get(
        "https://superhighway.walls.sh/research",
        params={"q": query},
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=60,  # page reads can be slow; requests has no default timeout
    )
    resp.raise_for_status()
    pages = resp.json().get("pages", [])
    # One markdown section per page: heading, source URL, then the body
    return "\n\n".join(
        f"# {p['title']}\n{p['url']}\n\n{p['markdown']}" for p in pages
    )
Wrap deep_research in a dspy.Tool for a ReAct agent, or call it directly in a dspy.Module.forward() like WebSearch above — feed the returned text straight into a dspy.Predict over a typed Signature.
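Full pages can be far longer than search snippets, so it may help to cap how much of each page reaches the model. This sketch works on the response shape described above (a pages list whose items have title, url, and markdown). The helper name pages_to_context, the 4000-character default, the "[truncated]" marker, and the sample payload values are all assumptions for illustration.

```python
def pages_to_context(payload: dict, max_chars_per_page: int = 4000) -> str:
    """Join /research pages into one text block, truncating long bodies."""
    sections = []
    for page in payload.get("pages", []):
        body = page.get("markdown", "")
        if len(body) > max_chars_per_page:
            body = body[:max_chars_per_page].rstrip() + "\n[truncated]"
        sections.append(f"# {page.get('title', '')}\n{page.get('url', '')}\n\n{body}")
    return "\n\n".join(sections)


# Made-up payload that mirrors the documented top-level keys
sample_payload = {
    "query": "example",
    "count": 1,
    "results": [],
    "pages": [{"title": "Doc", "url": "https://example.com", "markdown": "a" * 50}],
}
context = pages_to_context(sample_payload, max_chars_per_page=20)
```

The character cap is a crude stand-in for a token budget; a tokenizer-based limit would be more precise if the model's context window is tight.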
Key DSPy concepts
| Concept | What it is |
|---|---|
| Signature | A typed declaration of a task's input and output fields (e.g. question -> answer). DSPy compiles it into the actual prompt for you. |
| Module | A composable unit with a forward() method, like a PyTorch nn.Module. Wrap any external call (such as a Superhighway request) in one. |
| dspy.ReAct | A built-in module that runs a reason-act-observe tool-use loop, calling the tools you pass until it can answer. |
| dspy.Predict | The simplest module: one-shot generation against a signature, no tool loop. Use it for structured output once you already have the context. |
Which tools are available
| Endpoint | What it returns | Price |
|---|---|---|
| /search | Ranked web results: title, URL, description | $0.001 |
| /news | Recent news with published dates | $0.001 |
| /images | Image URLs, thumbnails, source pages | $0.001 |
| /scrape | Any URL → clean markdown text | $0.002 |
| /research | Search + read top pages in one call | $0.005 |
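The prices above make it easy to compare a multi-step loop with a single /research call. The following is a small arithmetic sketch, assuming the listed prices are charged per call in US dollars; the call counts for the loop (two searches, three scrapes) are hypothetical.

```python
from decimal import Decimal

# Prices copied from the table above, assumed to be USD per call
PRICES = {
    "/search": Decimal("0.001"),
    "/news": Decimal("0.001"),
    "/images": Decimal("0.001"),
    "/scrape": Decimal("0.002"),
    "/research": Decimal("0.005"),
}


def estimate_cost(calls: dict[str, int]) -> Decimal:
    """Total cost for a mapping of endpoint -> number of calls."""
    return sum((PRICES[endpoint] * n for endpoint, n in calls.items()), Decimal("0"))


react_loop = estimate_cost({"/search": 2, "/scrape": 3})  # 0.002 + 0.006
single_call = estimate_cost({"/research": 1})

print(f"ReAct loop: ${react_loop}, single /research call: ${single_call}")
```

Decimal is used so that sub-cent prices add up exactly; with floats, sums such as 0.001 + 0.002 can pick up rounding noise.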
Full API reference: /openapi.json · superhighway.walls.sh · pricing & free key.