# Research API Reference

## Table of Contents

- [Overview](#overview)
- [Prompting Best Practices](#prompting-best-practices)
- [Model Selection](#model-selection)
- [Key Parameters](#key-parameters)
- [Basic Usage](#basic-usage)
- [Streaming vs Polling](#streaming-vs-polling)
- [Structured Output vs Report](#structured-output-vs-report)
- [Response Fields](#response-fields)
- [Summary](#summary)

---

## Overview

The Research API runs end-to-end research on any topic: it gathers sources, analyzes them, and generates a response with citations. Use it when you need AI-powered research without building your own pipeline.

---

## Prompting Best Practices

State a **clear goal**, and supply the **details** and **direction** the research needs.

**Guidelines:**

- **Be specific when you can.** Include known details: target market, competitors, geography, constraints.
- **Stay open-ended only for discovery.** Make it explicit, e.g. "tell me about the most impactful AI innovations in healthcare in 2025".
- **Avoid contradictions.** Don't include conflicting constraints or goals.
- **Share what's already known.** Include prior assumptions so the research doesn't repeat existing knowledge.
- **Keep prompts clean and directed.** Clear task + essential context + desired output format.

### Example Queries

**Company research:**
```
Research the company ____ and its 2026 outlook. Provide a brief overview
of the company, its products, services, and market position.
```

**Competitive analysis:**
```
Conduct a competitive analysis of ____ in 2026. Identify their main
competitors, compare market positioning, and analyze key differentiators.
```

**With prior context:**
```
We're evaluating Notion as a potential partner. We already know they
primarily serve SMB and mid-market teams, expanded their AI features
significantly in 2025, and most often compete with Confluence and ClickUp.
Research Notion's 2026 outlook, including market position, growth risks,
and where a partnership could be most valuable. Include citations.
```

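
The "clear task + essential context + desired output format" pattern from the guidelines can be assembled programmatically. The following is a minimal sketch; `build_research_prompt` and its parameters are hypothetical helpers for illustration and are not part of the Tavily SDK.

```python
def build_research_prompt(task, known_context=None, output_format=None):
    """Assemble a prompt as: prior context + clear task + desired output format."""
    task = task.strip()
    if not task:
        raise ValueError("task must be a non-empty string")

    parts = []
    if known_context:
        # Share what is already known so the research does not repeat it.
        facts = "; ".join(fact.strip() for fact in known_context)
        parts.append(f"We already know: {facts}.")
    parts.append(task)
    if output_format:
        parts.append(output_format.strip())
    return " ".join(parts)


# Example: rebuild a prompt similar to the "With prior context" query above.
prompt = build_research_prompt(
    task="Research Notion's 2026 outlook, including market position and growth risks.",
    known_context=[
        "they primarily serve SMB and mid-market teams",
        "they most often compete with Confluence and ClickUp",
    ],
    output_format="Include citations.",
)
```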

---

## Model Selection

| Model | Best For |
|-------|----------|
| `pro` | Comprehensive, multi-agent research for complex, multi-domain topics |
| `mini` | Targeted, efficient research for narrow or well-scoped questions |
| `auto` | When unsure how complex research will be (default) |

### Pro Model

Multi-agent research suited for complex topics spanning multiple subtopics or domains. Use it for deeper analysis, thorough reports, or maximum accuracy.

```python
result = client.research(
    input="Analyze the competitive landscape for ____ in the SMB market, "
          "including key competitors, positioning, pricing models, customer "
          "segments, recent product moves, and defensible advantages or risks "
          "over the next 2-3 years.",
    model="pro"
)
```

### Mini Model

Optimized for targeted, efficient research. Best for narrow or well-scoped questions where you still benefit from agentic searching and synthesis.

```python
result = client.research(
    input="What are the top 5 competitors to ____ in the SMB market, and how do they differentiate?",
    model="mini"
)
```

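
If your application already has a rough idea of a question's scope, the guidance above could be encoded as a small helper. This is only an illustrative sketch: `pick_model` and its one-subtopic threshold are assumptions made for the example, not an official selection rule.

```python
def pick_model(subtopic_count=None):
    """Map a rough scope estimate to a model name from the table above.

    The threshold used here is an illustrative assumption.
    """
    if subtopic_count is None:
        # Scope unknown: fall back to the default and let the API decide.
        return "auto"
    if subtopic_count < 1:
        raise ValueError("subtopic_count must be at least 1")
    # One well-scoped question -> mini; several subtopics or domains -> pro.
    return "mini" if subtopic_count == 1 else "pro"
```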

---

## Key Parameters

### research()

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `input` | string | Required | The research topic or question |
| `model` | enum | `"auto"` | `"mini"`, `"pro"`, or `"auto"` |
| `stream` | boolean | `false` | Enable streaming responses |
| `output_schema` | object | `null` | JSON Schema for structured output |
| `citation_format` | enum | `"numbered"` | `"numbered"`, `"mla"`, `"apa"`, `"chicago"` |

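
The allowed values in the table can be checked locally before a request is sent. The sketch below is a hypothetical helper (`build_research_kwargs` is not an SDK function); it only mirrors the parameter names, defaults, and enum values listed above.

```python
ALLOWED_MODELS = {"mini", "pro", "auto"}
ALLOWED_CITATION_FORMATS = {"numbered", "mla", "apa", "chicago"}


def build_research_kwargs(topic, model="auto", stream=False,
                          output_schema=None, citation_format="numbered"):
    """Validate arguments against the parameter table and return a kwargs dict."""
    if not isinstance(topic, str) or not topic.strip():
        raise ValueError("input is required and must be a non-empty string")
    if model not in ALLOWED_MODELS:
        raise ValueError(f"model must be one of {sorted(ALLOWED_MODELS)}")
    if citation_format not in ALLOWED_CITATION_FORMATS:
        raise ValueError(
            f"citation_format must be one of {sorted(ALLOWED_CITATION_FORMATS)}"
        )
    if output_schema is not None and not isinstance(output_schema, dict):
        raise TypeError("output_schema must be a dict holding a JSON Schema")

    kwargs = {
        "input": topic,
        "model": model,
        "stream": bool(stream),
        "citation_format": citation_format,
    }
    if output_schema is not None:
        # Only send a schema when structured output is actually wanted.
        kwargs["output_schema"] = output_schema
    return kwargs
```

The resulting dict could then be passed along as `client.research(**kwargs)`.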
### get_research()

| Parameter | Type | Description |
|-----------|------|-------------|
| `request_id` | string | Task ID from `research()` response |

---

## Basic Usage

Research tasks take two steps: start one with `research()`, then retrieve the result with `get_research()`.

```python
import time
from tavily import TavilyClient

client = TavilyClient()

# Step 1: Start research task
result = client.research(
    input="Latest developments in quantum computing and their practical applications",
    model="pro"
)
request_id = result["request_id"]

# Step 2: Poll until completed
response = client.get_research(request_id)
while response["status"] not in ["completed", "failed"]:
    print(f"Status: {response['status']}... polling again in 10 seconds")
    time.sleep(10)
    response = client.get_research(request_id)

# Step 3: Handle result
if response["status"] == "failed":
    raise RuntimeError(f"Research failed: {response.get('error', 'Unknown error')}")

report = response["content"]
sources = response["sources"]
```

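
The polling loop above runs until the task finishes, however long that takes. For background jobs it may be worth adding a deadline. The sketch below wraps the same loop in a hypothetical `wait_for_research` helper; the helper name and the 600-second default timeout are assumptions, and it only relies on the `get_research()` call and the `status` values shown in this document.

```python
import time


def wait_for_research(client, request_id, interval=10.0, timeout=600.0):
    """Poll get_research() until the task completes, fails, or times out."""
    deadline = time.monotonic() + timeout
    while True:
        response = client.get_research(request_id)
        status = response["status"]
        if status == "completed":
            return response
        if status == "failed":
            raise RuntimeError(
                f"Research failed: {response.get('error', 'Unknown error')}"
            )
        # Stop if the next poll would land after the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(
                f"Research {request_id} still {status!r} after {timeout} seconds"
            )
        time.sleep(interval)
```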

---

## Streaming vs Polling

- **Streaming:** Best for user interfaces where you want real-time updates.
- **Polling:** Best for background processes where you check status periodically.

### Streaming

Enable real-time progress monitoring with `stream=True`.

```python
stream = client.research(
    input="Latest developments in quantum computing",
    model="pro",
    stream=True
)

for chunk in stream:
    print(chunk.decode('utf-8'))
```

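
Decoding each chunk on its own can, in principle, fail when a chunk boundary falls in the middle of a multi-byte UTF-8 character. A minimal sketch of a more tolerant loop, assuming the stream yields `bytes` objects as in the example above (`iter_decoded` is a hypothetical helper, not part of the SDK):

```python
import codecs


def iter_decoded(stream):
    """Yield text from an iterable of byte chunks, tolerating split characters."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    for chunk in stream:
        # The decoder buffers an incomplete trailing byte sequence until
        # the next chunk completes it.
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush the decoder; this raises if the stream ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail
```

It can replace the loop above as `for text in iter_decoded(stream): print(text)`.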
### Event Types

| Event Type | Description |
|------------|-------------|
| **Tool Call** | Agent initiates action (Planning, WebSearch, etc.) |
| **Tool Response** | Results after tool execution with sources |
| **Content** | Research report streamed as markdown (or JSON with `output_schema`) |
| **Sources** | Complete list of sources, emitted after content |
| **Done** | Signals completion |

### Tool Types

| Tool | Description | Models |
|------|-------------|--------|
| `Planning` | Initializes research strategy | mini, pro |
| `WebSearch` | Executes web searches | mini, pro |
| `Generating` | Creates final report | mini, pro |
| `ResearchSubtopic` | Deep research on subtopics | pro only |

### Typical Flow

1. `Planning` tool_call → tool_response
2. `WebSearch` tool_call → tool_response (with sources)
3. `ResearchSubtopic` cycles (Pro mode only)
4. `Generating` tool_call → tool_response
5. `Content` chunks (markdown or structured JSON)
6. `Sources` event
7. `Done` event

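
For a progress indicator, the flow above can be written down as data. The sketch below only restates the Typical Flow list and the Tool Types table; `TYPICAL_FLOW` and `expected_stages` are hypothetical names, and nothing here assumes how events are encoded on the wire.

```python
# Stage name -> models that go through that stage, in the order listed above.
TYPICAL_FLOW = [
    ("Planning", {"mini", "pro"}),
    ("WebSearch", {"mini", "pro"}),
    ("ResearchSubtopic", {"pro"}),  # pro only
    ("Generating", {"mini", "pro"}),
    ("Content", {"mini", "pro"}),
    ("Sources", {"mini", "pro"}),
    ("Done", {"mini", "pro"}),
]


def expected_stages(model):
    """Return the ordered stage names a given model is expected to go through."""
    if model not in ("mini", "pro"):
        # "auto" is not covered: this sketch does not know which model it resolves to.
        raise ValueError("model must be 'mini' or 'pro'")
    return [name for name, models in TYPICAL_FLOW if model in models]
```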

See [streaming cookbook](https://github.com/tavily-ai/tavily-cookbook/blob/main/cookbooks/research/streaming.ipynb) and [polling cookbook](https://github.com/tavily-ai/tavily-cookbook/blob/main/cookbooks/research/polling.ipynb) for complete examples.

---

## Structured Output vs Report

| Format | Best For |
|--------|----------|
| **Report** (default) | Reading, sharing, or displaying verbatim (chat interfaces, briefs, newsletters) |
| **Structured Output** | Data enrichment, pipelines, or powering UIs with specific fields |

### Structured Output

Use `output_schema` to receive research in a predefined JSON structure.

```python
schema = {
    "properties": {
        "summary": {
            "type": "string",
            "description": "Executive summary of findings"
        },
        "key_points": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Main takeaways from the research"
        },
        "metrics": {
            "type": "object",
            "properties": {
                "market_size": {"type": "string", "description": "Total market size"},
                "growth_rate": {"type": "number", "description": "Annual growth percentage"}
            }
        }
    },
    "required": ["summary", "key_points"]
}

result = client.research(
    input="Electric vehicle market analysis 2024",
    output_schema=schema
)
```

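
Once the task completes, the structured result still has to be read back into Python. The sketch below assumes the `content` field holds the structured result either as a JSON string or as an already-parsed dict; the source does not confirm which, so the hypothetical `parse_structured_content` helper accepts both.

```python
import json


def parse_structured_content(content, required=()):
    """Return structured content as a dict and check that required keys exist."""
    # Assumption: content is either a JSON string or an already-parsed dict.
    data = json.loads(content) if isinstance(content, str) else content
    if not isinstance(data, dict):
        raise ValueError("structured content should be a JSON object")
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return data
```

With the schema above, a call might look like `parse_structured_content(response["content"], required=schema["required"])`.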
### Schema Best Practices

- **Write clear field descriptions.** Use 1-3 sentences to explain what the field should contain.
- **Match the structure you need.** Use arrays, objects, and enums appropriately (e.g., `competitors: string[]`, not `"A, B, C"`).
- **Avoid duplicate fields.** Keep each field unique and specific.
- **Use `required` arrays** to enforce mandatory fields at any nesting level.

**Supported types:** `object`, `string`, `integer`, `number`, `array`

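
Some of these practices can be checked mechanically before a schema is sent. The following is a minimal sketch of such a check; `lint_schema` is a hypothetical helper, and it only covers the supported types, missing descriptions, and `required` entries that name undefined fields.

```python
SUPPORTED_TYPES = {"object", "string", "integer", "number", "array"}


def lint_schema(schema, path="$"):
    """Return a list of warnings for a schema; an empty list means none found."""
    warnings = []
    properties = schema.get("properties", {})
    for name, spec in properties.items():
        field = f"{path}.{name}"
        field_type = spec.get("type")
        if field_type not in SUPPORTED_TYPES:
            warnings.append(f"{field}: unsupported or missing type {field_type!r}")
        if not spec.get("description"):
            warnings.append(f"{field}: missing description")
        if field_type == "object":
            # Recurse so nested objects get the same checks.
            warnings.extend(lint_schema(spec, field))
        elif field_type == "array" and isinstance(spec.get("items"), dict):
            item_type = spec["items"].get("type")
            if item_type not in SUPPORTED_TYPES:
                warnings.append(f"{field}[]: unsupported or missing type {item_type!r}")
            if item_type == "object":
                warnings.extend(lint_schema(spec["items"], f"{field}[]"))
    for name in schema.get("required", []):
        if name not in properties:
            warnings.append(f"{path}: required field {name!r} is not defined in properties")
    return warnings
```

An empty list from `lint_schema(schema)` does not prove the schema is valid; it only means none of these particular issues were found.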
### Streaming with Structured Output

When `output_schema` is provided, content arrives as structured JSON:

```python
stream = client.research(
    input="AI agent frameworks comparison",
    model="mini",
    stream=True,
    output_schema={
        "properties": {
            "summary": {"type": "string", "description": "Executive summary"},
            "key_points": {"type": "array", "items": {"type": "string"}}
        },
        "required": ["summary", "key_points"]
    }
)

for chunk in stream:
    data = chunk.decode('utf-8')
    print(data)  # Content chunks will be structured JSON
```

---

## Response Fields

### research() Response

| Field | Description |
|-------|-------------|
| `request_id` | Unique identifier for tracking |
| `created_at` | Timestamp when task was created |
| `status` | Initial status |
| `input` | The research topic submitted |
| `model` | Model used by research agent |

### get_research() Response

| Field | Description |
|-------|-------------|
| `status` | `"pending"`, `"processing"`, `"completed"`, `"failed"` |
| `content` | Generated research report (when completed) |
| `sources` | Array of source citations |
| `response_time` | Time in seconds |

### Source Object

| Field | Description |
|-------|-------------|
| `url` | Source URL |
| `title` | Source title |
| `citation` | Formatted citation string |

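
As an illustration of how the source fields might be used, the sketch below turns a `sources` array into a Markdown reference list. `format_sources` is a hypothetical helper; it uses only the three fields in the Source Object table, and because the source does not say which fields are always present, it reads them with `.get()`.

```python
def format_sources(sources):
    """Render a list of source objects as Markdown bullet lines."""
    lines = []
    for source in sources:
        url = source.get("url", "")
        # Prefer the formatted citation, then the title, then the bare URL.
        label = source.get("citation") or source.get("title") or url
        if url and url != label:
            lines.append(f"- {label} ({url})")
        else:
            lines.append(f"- {label}")
    return "\n".join(lines)
```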

---

## Summary

1. **Be specific in prompts:** Include known details such as target market, competitors, geography, and constraints.
2. **Share prior context:** Include what you already know to avoid repetition.
3. **Choose the right model:** `mini` for focused queries, `pro` for comprehensive multi-domain analysis, `auto` when unsure.
4. **Use streaming for UX:** Display real-time progress during long research tasks.
5. **Use structured output for pipelines:** Define schemas for consistent, parseable responses.
6. **Use reports for reading:** The default format is best for chat interfaces and sharing.

For more examples, see the [Tavily Cookbook](https://github.com/tavily-ai/tavily-cookbook/tree/main/research) and [live demo](https://chat-research.tavily.com/).