Research API Reference
Table of Contents
- Overview
- Prompting Best Practices
- Model Selection
- Key Parameters
- Basic Usage
- Streaming vs Polling
- Structured Output vs. Report
- Response Fields
- Summary
Overview
The Research API conducts comprehensive research on any topic with automatic source gathering, analysis, and response generation with citations. It's an end-to-end solution when you need AI-powered research without building your own pipeline.
Prompting Best Practices
State a clear goal, and include the details and direction the research should follow.
Guidelines:
- Be specific when you can. Include known details: target market, competitors, geography, constraints
- Stay open-ended only for discovery. Make it explicit: "tell me about the most impactful AI innovations in healthcare in 2025"
- Avoid contradictions. Don't include conflicting constraints or goals
- Share what's already known. Include prior assumptions so research doesn't repeat existing knowledge
- Keep prompts clean and directed. Clear task + essential context + desired output format
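The "clear task + essential context + desired output format" pattern can be sketched as a small prompt builder. This is only an illustration: `build_research_prompt` and its parameter names are hypothetical and not part of the Tavily SDK.

```python
def build_research_prompt(task, known_facts=(), output_format=None):
    """Assemble a prompt from a task, prior context, and a desired format.

    Hypothetical helper for illustration; not part of the Tavily SDK.
    """
    parts = [task.strip()]
    if known_facts:
        # Share what is already known so the research does not repeat it.
        facts = "; ".join(fact.strip() for fact in known_facts)
        parts.append(f"We already know: {facts}.")
    if output_format:
        parts.append(f"Desired output: {output_format.strip()}.")
    return " ".join(parts)


prompt = build_research_prompt(
    task="Research Notion's 2026 outlook as a potential partner.",
    known_facts=[
        "they primarily serve SMB and mid-market teams",
        "they most often compete with Confluence and ClickUp",
    ],
    output_format="a short report with citations",
)
print(prompt)
```

The resulting string can be passed as the `input` argument of `research()`.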
Example Queries
Company research:
Research the company ____ and its 2026 outlook. Provide a brief overview
of the company, its products, services, and market position.
Competitive analysis:
Conduct a competitive analysis of ____ in 2026. Identify their main
competitors, compare market positioning, and analyze key differentiators.
With prior context:
We're evaluating Notion as a potential partner. We already know they
primarily serve SMB and mid-market teams, expanded their AI features
significantly in 2025, and most often compete with Confluence and ClickUp.
Research Notion's 2026 outlook, including market position, growth risks,
and where a partnership could be most valuable. Include citations.
Model Selection
| Model | Best For |
|---|---|
| pro | Comprehensive, multi-agent research for complex, multi-domain topics |
| mini | Targeted, efficient research for narrow or well-scoped questions |
| auto | When unsure how complex research will be (default) |
Pro Model
Multi-agent research suited for complex topics spanning multiple subtopics or domains. Use for deeper analysis, thorough reports, or maximum accuracy.
result = client.research(
    input="Analyze the competitive landscape for ____ in the SMB market, "
          "including key competitors, positioning, pricing models, customer "
          "segments, recent product moves, and defensible advantages or risks "
          "over the next 2-3 years.",
    model="pro"
)
Mini Model
Optimized for targeted, efficient research. Best for narrow or well-scoped questions where you still benefit from agentic searching and synthesis.
result = client.research(
    input="What are the top 5 competitors to ____ in the SMB market, and how do they differentiate?",
    model="mini"
)
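The guidance in the model table can be expressed as a small selection heuristic. The sketch below is an assumption-laden illustration: `choose_model` and its subtopic threshold are hypothetical, and only the three model names come from this reference.

```python
def choose_model(num_subtopics=None):
    """Pick a model name from a rough estimate of a question's scope.

    Hypothetical heuristic; the threshold is an illustrative assumption.
    """
    if num_subtopics is None:
        return "auto"  # complexity unknown: let the API decide
    if num_subtopics <= 1:
        return "mini"  # narrow, well-scoped question
    return "pro"       # multiple subtopics or domains


print(choose_model())   # scope unknown
print(choose_model(1))  # single focused question
print(choose_model(4))  # several subtopics
```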
Key Parameters
research()
| Parameter | Type | Default | Description |
|---|---|---|---|
| input | string | Required | The research topic or question |
| model | enum | "auto" | "mini", "pro", or "auto" |
| stream | boolean | false | Enable streaming responses |
| output_schema | object | null | JSON Schema for structured output |
| citation_format | enum | "numbered" | "numbered", "mla", "apa", "chicago" |
get_research()
| Parameter | Type | Description |
|---|---|---|
| request_id | string | Task ID from research() response |
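It can help to check enum values locally before sending a request. The following is a minimal sketch built only from the parameter table above; `build_research_kwargs` is a hypothetical helper, and whether the API rejects invalid values with a particular error is not covered here.

```python
ALLOWED_MODELS = {"mini", "pro", "auto"}
ALLOWED_CITATION_FORMATS = {"numbered", "mla", "apa", "chicago"}


def build_research_kwargs(topic, model="auto", stream=False,
                          output_schema=None, citation_format="numbered"):
    """Validate arguments and return keyword arguments for research().

    Hypothetical helper; defaults mirror the parameter table.
    """
    if not isinstance(topic, str) or not topic.strip():
        raise ValueError("input must be a non-empty string")
    if model not in ALLOWED_MODELS:
        raise ValueError(f"model must be one of {sorted(ALLOWED_MODELS)}")
    if citation_format not in ALLOWED_CITATION_FORMATS:
        raise ValueError(
            f"citation_format must be one of {sorted(ALLOWED_CITATION_FORMATS)}"
        )
    kwargs = {
        "input": topic,
        "model": model,
        "stream": stream,
        "citation_format": citation_format,
    }
    if output_schema is not None:
        # Only send a schema when structured output is wanted.
        kwargs["output_schema"] = output_schema
    return kwargs


kwargs = build_research_kwargs("EV market outlook", model="mini")
print(kwargs)
```

The result can then be unpacked into the call, for example `client.research(**kwargs)`.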
Basic Usage
Research tasks are two-step: initiate with research(), retrieve with get_research().
import time
from tavily import TavilyClient

client = TavilyClient()

# Step 1: Start research task
result = client.research(
    input="Latest developments in quantum computing and their practical applications",
    model="pro"
)
request_id = result["request_id"]

# Step 2: Poll until completed
response = client.get_research(request_id)
while response["status"] not in ["completed", "failed"]:
    print(f"Status: {response['status']}... polling again in 10 seconds")
    time.sleep(10)
    response = client.get_research(request_id)

# Step 3: Handle result
if response["status"] == "failed":
    raise RuntimeError(f"Research failed: {response.get('error', 'Unknown error')}")

report = response["content"]
sources = response["sources"]
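The polling loop above runs until the task reaches a terminal status. For background jobs, an upper time bound is often useful. The sketch below wraps the same logic with a timeout; `wait_for_research`, its defaults, and the injectable `fetch` and `sleep` callables are hypothetical choices made so the helper can be exercised without network access.

```python
import time


def wait_for_research(fetch, request_id, interval=10.0, timeout=600.0,
                      sleep=time.sleep):
    """Poll fetch(request_id) until the task completes, fails, or times out.

    `fetch` is any callable returning a dict with a "status" key, such as
    client.get_research. Hypothetical helper; defaults are assumptions.
    """
    deadline = time.monotonic() + timeout
    while True:
        response = fetch(request_id)
        status = response["status"]
        if status == "completed":
            return response
        if status == "failed":
            raise RuntimeError(
                f"Research failed: {response.get('error', 'Unknown error')}"
            )
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Research {request_id} still {status} after {timeout} seconds"
            )
        sleep(interval)


# Offline demonstration with canned responses instead of a real client.
canned = iter([
    {"status": "pending"},
    {"status": "processing"},
    {"status": "completed", "content": "report text", "sources": []},
])
response = wait_for_research(lambda rid: next(canned), "demo-id",
                             interval=0, sleep=lambda seconds: None)
print(response["status"])
```

With a real client, the call would look like `wait_for_research(client.get_research, request_id)`.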
Streaming vs Polling
- Streaming: best for user interfaces where you want real-time updates.
- Polling: best for background processes where you check status periodically.
Streaming
Enable real-time progress monitoring with stream=True.
stream = client.research(
    input="Latest developments in quantum computing",
    model="pro",
    stream=True
)

for chunk in stream:
    print(chunk.decode('utf-8'))
Event Types
| Event Type | Description |
|---|---|
| Tool Call | Agent initiates action (Planning, WebSearch, etc.) |
| Tool Response | Results after tool execution with sources |
| Content | Research report streamed as markdown (or JSON with output_schema) |
| Sources | Complete list of sources, emitted after content |
| Done | Signals completion |
Tool Types
| Tool | Description | Models |
|---|---|---|
| Planning | Initializes research strategy | mini, pro |
| WebSearch | Executes web searches | mini, pro |
| Generating | Creates final report | mini, pro |
| ResearchSubtopic | Deep research on subtopics | pro only |
Typical Flow
1. Planning tool_call → tool_response
2. WebSearch tool_call → tool_response (with sources)
3. ResearchSubtopic cycles (Pro mode only)
4. Generating tool_call → tool_response
5. Content chunks (markdown or structured JSON)
6. Sources event
7. Done event
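The event order in the typical flow suggests a simple consumer: accumulate Content chunks, keep the Sources list, and stop at Done. The sketch below assumes each raw chunk has already been parsed into an `(event_type, payload)` pair. That pair shape, the `collect_report` helper, and the sample events are hypothetical, since the wire format of a chunk is not described in this reference.

```python
def collect_report(events):
    """Accumulate Content chunks and Sources from (event_type, payload) pairs.

    Hypothetical helper: assumes chunks were parsed upstream into pairs
    whose event_type matches the names in the Event Types table.
    """
    content_parts = []
    sources = []
    for event_type, payload in events:
        if event_type == "Content":
            content_parts.append(payload)
        elif event_type == "Sources":
            sources = list(payload)
        elif event_type == "Done":
            break
        # Tool Call and Tool Response events report progress; skipped here.
    return "".join(content_parts), sources


# Made-up sample events that follow the typical flow.
events = [
    ("Tool Call", "Planning"),
    ("Tool Response", "Planning"),
    ("Content", "# Report\n"),
    ("Content", "Findings..."),
    ("Sources", [{"url": "https://example.com", "title": "Example"}]),
    ("Done", None),
]
report, sources = collect_report(events)
print(report)
```

A user interface would typically also surface the skipped tool events as progress indicators.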
See streaming cookbook and polling cookbook for complete examples.
Structured Output vs. Report
| Format | Best For |
|---|---|
| Report (default) | Reading, sharing, or displaying verbatim (chat interfaces, briefs, newsletters) |
| Structured Output | Data enrichment, pipelines, or powering UIs with specific fields |
Structured Output
Use output_schema to receive research in a predefined JSON structure.
schema = {
    "properties": {
        "summary": {
            "type": "string",
            "description": "Executive summary of findings"
        },
        "key_points": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Main takeaways from the research"
        },
        "metrics": {
            "type": "object",
            "properties": {
                "market_size": {"type": "string", "description": "Total market size"},
                "growth_rate": {"type": "number", "description": "Annual growth percentage"}
            }
        }
    },
    "required": ["summary", "key_points"]
}

result = client.research(
    input="Electric vehicle market analysis 2024",
    output_schema=schema
)
Schema Best Practices
- Write clear field descriptions. 1-3 sentences explaining what the field should contain
- Match the structure you need. Use arrays, objects, enums appropriately (e.g., competitors: string[], not "A, B, C")
- Avoid duplicate fields. Keep each field unique and specific
- Use required arrays to enforce mandatory fields at any nesting level
Supported types: object, string, integer, number, array
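Some of these practices can be checked mechanically before a schema is sent. The sketch below flags unsupported types, missing descriptions, and `required` names that are not declared. `lint_schema` is a hypothetical helper, it does not inspect array `items`, and it is not a full JSON Schema validator.

```python
SUPPORTED_TYPES = {"object", "string", "integer", "number", "array"}


def lint_schema(schema, path="root"):
    """Return a list of warnings for a schema (hypothetical helper)."""
    problems = []
    properties = schema.get("properties", {})
    # Every name listed in "required" should be a declared property.
    for name in schema.get("required", []):
        if name not in properties:
            problems.append(f"{path}: required field '{name}' is not defined")
    for name, spec in properties.items():
        field_path = f"{path}.{name}"
        field_type = spec.get("type")
        if field_type not in SUPPORTED_TYPES:
            problems.append(f"{field_path}: unsupported type {field_type!r}")
        if not spec.get("description"):
            problems.append(f"{field_path}: missing description")
        if field_type == "object":
            # Nested objects can carry their own properties and required list.
            problems.extend(lint_schema(spec, field_path))
    return problems


example = {
    "properties": {
        "summary": {"type": "string", "description": "Executive summary"},
        "score": {"type": "boolean"},
    },
    "required": ["summary", "key_points"],
}
for problem in lint_schema(example):
    print(problem)
```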
Streaming with Structured Output
When output_schema is provided, content arrives as structured JSON:
stream = client.research(
    input="AI agent frameworks comparison",
    model="mini",
    stream=True,
    output_schema={
        "properties": {
            "summary": {"type": "string", "description": "Executive summary"},
            "key_points": {"type": "array", "items": {"type": "string"}}
        },
        "required": ["summary", "key_points"]
    }
)

for chunk in stream:
    data = chunk.decode('utf-8')
    print(data)  # Content chunks will be structured JSON
Response Fields
research() Response
| Field | Description |
|---|---|
| request_id | Unique identifier for tracking |
| created_at | Timestamp when task was created |
| status | Initial status |
| input | The research topic submitted |
| model | Model used by research agent |
get_research() Response
| Field | Description |
|---|---|
| status | "pending", "processing", "completed", "failed" |
| content | Generated research report (when completed) |
| sources | Array of source citations |
| response_time | Time in seconds |
Source Object
| Field | Description |
|---|---|
| url | Source URL |
| title | Source title |
| citation | Formatted citation string |
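Given the source fields above, a completed response's `sources` array can be turned into a plain reference list. This is a minimal sketch: `format_sources` is a hypothetical helper, the sample entries are made up, and the fallback for a missing `citation` is an assumption, since this reference does not say whether every source includes one.

```python
def format_sources(sources):
    """Render source objects as a numbered reference list.

    Hypothetical helper; uses only the url, title, and citation fields.
    """
    lines = []
    for index, source in enumerate(sources, start=1):
        # Prefer the pre-formatted citation; fall back to title and URL.
        text = source.get("citation") or (
            f"{source.get('title', 'Untitled')} ({source.get('url', 'no URL')})"
        )
        lines.append(f"[{index}] {text}")
    return "\n".join(lines)


# Made-up sample data for illustration.
sample_sources = [
    {"url": "https://example.com/a", "title": "A", "citation": "Doe, J. A. 2025."},
    {"url": "https://example.com/b", "title": "B"},
]
print(format_sources(sample_sources))
```

If the chosen `citation_format` already embeds numbering in the citation string, the `[n]` prefix may be redundant and can be dropped.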
Summary
- Be specific in prompts: include known details such as target market, competitors, geography, and constraints
- Share prior context: include what you already know to avoid repetition
- Choose the right model: mini for focused queries, pro for comprehensive multi-domain analysis
- Use streaming for UX: display real-time progress during long research tasks
- Use structured output for pipelines: define schemas for consistent, parseable responses
- Use reports for reading: the default format is best for chat interfaces and sharing
For more examples, see the Tavily Cookbook and live demo.