Research API Reference


Overview

The Research API runs end-to-end research on any topic: it gathers sources automatically, analyzes them, and generates a response with citations. Use it when you need AI-powered research without building your own pipeline.


Prompting Best Practices

State a clear goal and include the details and direction the research should follow.

Guidelines:

  • Be specific when you can. Include known details: target market, competitors, geography, constraints
  • Stay open-ended only for discovery. Make it explicit: "tell me about the most impactful AI innovations in healthcare in 2025"
  • Avoid contradictions. Don't include conflicting constraints or goals
  • Share what's already known. Include prior assumptions so research doesn't repeat existing knowledge
  • Keep prompts clean and directed. Clear task + essential context + desired output format
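
The last guideline (clear task + essential context + desired output format) can be sketched as a small string-building helper. `build_research_prompt` and its parameter names are hypothetical and not part of the Tavily SDK; the helper only assembles the text you would pass as `input`.

```python
def build_research_prompt(task, known_facts=None, output_format=None):
    """Assemble a prompt: task first, then prior context, then output format.

    Hypothetical helper for illustration; not part of the Tavily SDK.
    """
    parts = [task.strip()]
    if known_facts:
        # Listing prior knowledge helps the research avoid repeating it.
        facts = "\n".join(f"- {fact}" for fact in known_facts)
        parts.append("We already know:\n" + facts)
    if output_format:
        parts.append("Desired output: " + output_format.strip())
    return "\n\n".join(parts)


prompt = build_research_prompt(
    task="Research Notion's 2026 outlook, including market position and growth risks.",
    known_facts=[
        "Primarily serves SMB and mid-market teams",
        "Most often competes with Confluence and ClickUp",
    ],
    output_format="A short brief with citations.",
)
print(prompt)
```

Keeping the three parts separate makes it easier to spot a prompt that is missing context or an output format before it is submitted.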

Example Queries

Company research:

Research the company ____ and its 2026 outlook. Provide a brief overview
of the company, its products, services, and market position.

Competitive analysis:

Conduct a competitive analysis of ____ in 2026. Identify their main
competitors, compare market positioning, and analyze key differentiators.

With prior context:

We're evaluating Notion as a potential partner. We already know they
primarily serve SMB and mid-market teams, expanded their AI features
significantly in 2025, and most often compete with Confluence and ClickUp.
Research Notion's 2026 outlook, including market position, growth risks,
and where a partnership could be most valuable. Include citations.

Model Selection

| Model | Best For |
| --- | --- |
| pro | Comprehensive, multi-agent research for complex, multi-domain topics |
| mini | Targeted, efficient research for narrow or well-scoped questions |
| auto | When unsure how complex research will be (default) |

Pro Model

Multi-agent research suited for complex topics spanning multiple subtopics or domains. Use for deeper analysis, thorough reports, or maximum accuracy.

result = client.research(
    input="Analyze the competitive landscape for ____ in the SMB market, "
          "including key competitors, positioning, pricing models, customer "
          "segments, recent product moves, and defensible advantages or risks "
          "over the next 2-3 years.",
    model="pro"
)

Mini Model

Optimized for targeted, efficient research. Best for narrow or well-scoped questions where you still benefit from agentic searching and synthesis.

result = client.research(
    input="What are the top 5 competitors to ____ in the SMB market, and how do they differentiate?",
    model="mini"
)

Key Parameters

research()

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| input | string | Required | The research topic or question |
| model | enum | "auto" | "mini", "pro", or "auto" |
| stream | boolean | false | Enable streaming responses |
| output_schema | object | null | JSON Schema for structured output |
| citation_format | enum | "numbered" | "numbered", "mla", "apa", "chicago" |

get_research()

| Parameter | Type | Description |
| --- | --- | --- |
| request_id | string | Task ID from research() response |

Basic Usage

Research tasks are two-step: initiate with research(), retrieve with get_research().

import time
from tavily import TavilyClient

client = TavilyClient()

# Step 1: Start research task
result = client.research(
    input="Latest developments in quantum computing and their practical applications",
    model="pro"
)
request_id = result["request_id"]

# Step 2: Poll until completed
response = client.get_research(request_id)
while response["status"] not in ["completed", "failed"]:
    print(f"Status: {response['status']}... polling again in 10 seconds")
    time.sleep(10)
    response = client.get_research(request_id)

# Step 3: Handle result
if response["status"] == "failed":
    raise RuntimeError(f"Research failed: {response.get('error', 'Unknown error')}")

report = response["content"]
sources = response["sources"]
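
The polling loop above runs until the task reaches a terminal status. For background jobs it can help to bound the wait. The sketch below is a hypothetical helper, not part of the Tavily SDK: `wait_for_research`, its `fetch` argument, and the default interval and timeout are all assumptions chosen for illustration. It only relies on the response being a dict with a "status" key, as in the example above.

```python
import time


def wait_for_research(fetch, request_id, interval=10.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch(request_id) until the task completes, fails, or times out.

    `fetch` is any callable returning a dict with a "status" key, for example
    client.get_research. Hypothetical helper; the defaults are arbitrary.
    """
    deadline = time.monotonic() + timeout
    while True:
        response = fetch(request_id)
        status = response["status"]
        if status == "completed":
            return response
        if status == "failed":
            raise RuntimeError(f"Research failed: {response.get('error', 'Unknown error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Research {request_id} still {status!r} after {timeout}s")
        sleep(interval)
```

With a client in scope, usage would look like `response = wait_for_research(client.get_research, request_id)`. Passing the fetch function in, rather than the client, keeps the helper easy to exercise with a stub.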

Streaming vs Polling

  • Streaming: best for user interfaces where you want real-time updates.
  • Polling: best for background processes where you check status periodically.

Streaming

Enable real-time progress monitoring with stream=True.

stream = client.research(
    input="Latest developments in quantum computing",
    model="pro",
    stream=True
)

for chunk in stream:
    print(chunk.decode('utf-8'))
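
Byte chunks do not necessarily line up with event boundaries, so printing each chunk can split an event in two. The sketch below buffers chunks and yields one payload per event. It assumes the stream uses the Server-Sent Events framing (`data:` lines, events separated by a blank line) with JSON payloads; this reference does not state the wire format, so treat that as an assumption and check it against the actual stream. `iter_sse_payloads` is a hypothetical helper, not an SDK function.

```python
import codecs
import json


def iter_sse_payloads(chunks):
    """Yield one payload per event from an iterable of byte chunks.

    Assumes SSE framing: `data:` lines, events separated by a blank line.
    JSON payloads are parsed; anything else is yielded as a raw string.
    """
    # The incremental decoder handles multi-byte characters split across chunks.
    decoder = codecs.getincrementaldecoder("utf-8")()
    buffer = ""
    for chunk in chunks:
        buffer = (buffer + decoder.decode(chunk)).replace("\r\n", "\n")
        while "\n\n" in buffer:
            event, buffer = buffer.split("\n\n", 1)
            data_lines = [
                line[5:].lstrip()
                for line in event.split("\n")
                if line.startswith("data:")
            ]
            if not data_lines:
                continue  # comment or keep-alive event without data
            payload = "\n".join(data_lines)
            try:
                yield json.loads(payload)
            except json.JSONDecodeError:
                yield payload
```

If the assumption holds, `for payload in iter_sse_payloads(stream): print(payload)` would replace the decode-and-print loop above.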

Event Types

| Event Type | Description |
| --- | --- |
| Tool Call | Agent initiates action (Planning, WebSearch, etc.) |
| Tool Response | Results after tool execution with sources |
| Content | Research report streamed as markdown (or JSON with output_schema) |
| Sources | Complete list of sources, emitted after content |
| Done | Signals completion |

Tool Types

| Tool | Description | Models |
| --- | --- | --- |
| Planning | Initializes research strategy | mini, pro |
| WebSearch | Executes web searches | mini, pro |
| Generating | Creates final report | mini, pro |
| ResearchSubtopic | Deep research on subtopics | pro only |

Typical Flow

  1. Planning tool_call → tool_response
  2. WebSearch tool_call → tool_response (with sources)
  3. ResearchSubtopic cycles (Pro mode only)
  4. Generating tool_call → tool_response
  5. Content chunks (markdown or structured JSON)
  6. Sources event
  7. Done event

See streaming cookbook and polling cookbook for complete examples.


Structured Output vs. Report

| Format | Best For |
| --- | --- |
| Report (default) | Reading, sharing, or displaying verbatim (chat interfaces, briefs, newsletters) |
| Structured Output | Data enrichment, pipelines, or powering UIs with specific fields |

Structured Output

Use output_schema to receive research in a predefined JSON structure.

schema = {
    "properties": {
        "summary": {
            "type": "string",
            "description": "Executive summary of findings"
        },
        "key_points": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Main takeaways from the research"
        },
        "metrics": {
            "type": "object",
            "properties": {
                "market_size": {"type": "string", "description": "Total market size"},
                "growth_rate": {"type": "number", "description": "Annual growth percentage"}
            }
        }
    },
    "required": ["summary", "key_points"]
}

result = client.research(
    input="Electric vehicle market analysis 2024",
    output_schema=schema
)
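
Once the task completes, a pipeline usually wants the structured result as a dict with its mandatory fields present. This reference does not say whether structured content arrives as a parsed object or as a JSON string, so the sketch below accepts both. `parse_structured_content` is a hypothetical helper, and it checks only the top-level `required` list.

```python
import json


def parse_structured_content(content, schema):
    """Return structured content as a dict and verify top-level required fields.

    Hypothetical helper; accepts either a JSON string or an already parsed dict.
    """
    data = json.loads(content) if isinstance(content, str) else content
    if not isinstance(data, dict):
        raise TypeError(f"Expected a JSON object, got {type(data).__name__}")
    missing = [key for key in schema.get("required", []) if key not in data]
    if missing:
        raise ValueError(f"Missing required fields: {', '.join(missing)}")
    return data
```

Usage would look like `data = parse_structured_content(response["content"], schema)`, where `response` comes from get_research().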

Schema Best Practices

  • Write clear field descriptions. 1-3 sentences explaining what the field should contain
  • Match the structure you need. Use arrays, objects, enums appropriately (e.g., competitors: string[], not "A, B, C")
  • Avoid duplicate fields. Keep each field unique and specific
  • Use required arrays to enforce mandatory fields at any nesting level

Supported types: object, string, integer, number, array
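
The practices above can be checked mechanically before a schema is submitted. The sketch below walks a schema and reports fields whose type is outside the supported list or that lack a description. `lint_schema` and `SUPPORTED_TYPES` are hypothetical names, and the check is a local convenience, not something the API is known to run.

```python
SUPPORTED_TYPES = {"object", "string", "integer", "number", "array"}


def lint_schema(schema, path="$"):
    """Return warnings for unsupported types or missing field descriptions."""
    warnings = []
    for name, field in schema.get("properties", {}).items():
        field_path = f"{path}.{name}"
        field_type = field.get("type")
        if field_type not in SUPPORTED_TYPES:
            warnings.append(f"{field_path}: unsupported type {field_type!r}")
        if not field.get("description"):
            warnings.append(f"{field_path}: missing description")
        if field_type == "object":
            # Recurse into nested objects so every level is checked.
            warnings.extend(lint_schema(field, field_path))
        elif field_type == "array" and isinstance(field.get("items"), dict):
            items = field["items"]
            if items.get("type") not in SUPPORTED_TYPES:
                warnings.append(f"{field_path}[]: unsupported type {items.get('type')!r}")
            if items.get("type") == "object":
                warnings.extend(lint_schema(items, field_path + "[]"))
    return warnings
```

Run against the example schema in this section, it would flag `metrics` for having no description of its own.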

Streaming with Structured Output

When output_schema is provided, content arrives as structured JSON:

stream = client.research(
    input="AI agent frameworks comparison",
    model="mini",
    stream=True,
    output_schema={
        "properties": {
            "summary": {"type": "string", "description": "Executive summary"},
            "key_points": {"type": "array", "items": {"type": "string"}}
        },
        "required": ["summary", "key_points"]
    }
)

for chunk in stream:
    data = chunk.decode('utf-8')
    print(data)  # Content chunks will be structured JSON

Response Fields

research() Response

| Field | Description |
| --- | --- |
| request_id | Unique identifier for tracking |
| created_at | Timestamp when task was created |
| status | Initial status |
| input | The research topic submitted |
| model | Model used by research agent |

get_research() Response

| Field | Description |
| --- | --- |
| status | "pending", "processing", "completed", "failed" |
| content | Generated research report (when completed) |
| sources | Array of source citations |
| response_time | Time in seconds |

Source Object

| Field | Description |
| --- | --- |
| url | Source URL |
| title | Source title |
| citation | Formatted citation string |
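
As an illustration of consuming these fields, the sketch below renders a list of source objects as a numbered reference list. It assumes each source is a dict with the `url` and `title` keys listed above; `format_sources` is a hypothetical helper, not an SDK function.

```python
def format_sources(sources):
    """Render source objects as a numbered plain-text reference list."""
    lines = []
    for index, source in enumerate(sources, start=1):
        # Fall back to the URL when a source has no title.
        title = source.get("title") or source.get("url", "")
        lines.append(f"[{index}] {title} - {source.get('url', '')}")
    return "\n".join(lines)
```

Usage would look like `print(format_sources(response["sources"]))` after a completed get_research() call.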

Summary

  1. Be specific in prompts: include known details such as target market, competitors, geography, and constraints
  2. Share prior context: include what you already know to avoid repetition
  3. Choose the right model: mini for focused queries, pro for comprehensive multi-domain analysis
  4. Use streaming for UX: display real-time progress during long research tasks
  5. Use structured output for pipelines: define schemas for consistent, parseable responses
  6. Use reports for reading: the default format is best for chat interfaces and sharing

For more examples, see the Tavily Cookbook and live demo.