Jonathan Agmon c09d9151ca Add skills

2026-03-22 23:21:49 +02:00

Research API Reference

Overview
Prompting Best Practices
Model Selection
Key Parameters
Basic Usage
Streaming vs Polling
Structured Output vs. Report
Response Fields
Summary

Overview

The Research API conducts comprehensive research on any topic with automatic source gathering, analysis, and response generation with citations. It's an end-to-end solution when you need AI-powered research without building your own pipeline.

Prompting Best Practices

State a clear goal, and include the details and direction the research should follow.

Guidelines:

Be specific when you can. Include known details: target market, competitors, geography, constraints
Stay open-ended only for discovery. Make it explicit: "tell me about the most impactful AI innovations in healthcare in 2025"
Avoid contradictions. Don't include conflicting constraints or goals
Share what's already known. Include prior assumptions so research doesn't repeat existing knowledge
Keep prompts clean and directed. Clear task + essential context + desired output format
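The "clear task + essential context + desired output format" shape can be sketched as a small helper. This is only an illustration: build_research_prompt and its argument names are hypothetical and not part of the Tavily SDK.

```python
def build_research_prompt(task, known_facts=(), output_format=None):
    """Assemble a prompt as: clear task + essential context + desired output format.

    Hypothetical helper for illustration; not part of any SDK.
    """
    parts = [task.strip()]
    # Drop empty entries so the "already known" section never lists blank bullets.
    facts = [fact.strip() for fact in known_facts if fact.strip()]
    if facts:
        parts.append("We already know:\n" + "\n".join(f"- {fact}" for fact in facts))
    if output_format:
        parts.append(f"Output format: {output_format.strip()}")
    return "\n\n".join(parts)


prompt = build_research_prompt(
    task="Research Notion's 2026 outlook, including market position and growth risks.",
    known_facts=[
        "They primarily serve SMB and mid-market teams",
        "They most often compete with Confluence and ClickUp",
    ],
    output_format="A short report with citations",
)
print(prompt)
```

The resulting string can be passed as the input argument of research().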

Example Queries

Company research:

Research the company ____ and its 2026 outlook. Provide a brief overview
of the company, its products, services, and market position.

Competitive analysis:

Conduct a competitive analysis of ____ in 2026. Identify their main
competitors, compare market positioning, and analyze key differentiators.

With prior context:

We're evaluating Notion as a potential partner. We already know they
primarily serve SMB and mid-market teams, expanded their AI features
significantly in 2025, and most often compete with Confluence and ClickUp.
Research Notion's 2026 outlook, including market position, growth risks,
and where a partnership could be most valuable. Include citations.

Model Selection

Model	Best For
`pro`	Comprehensive, multi-agent research for complex, multi-domain topics
`mini`	Targeted, efficient research for narrow or well-scoped questions
`auto`	When unsure how complex research will be (default)

Pro Model

Multi-agent research suited for complex topics spanning multiple subtopics or domains. Use for deeper analysis, thorough reports, or maximum accuracy.

result = client.research(
    input="Analyze the competitive landscape for ____ in the SMB market, "
          "including key competitors, positioning, pricing models, customer "
          "segments, recent product moves, and defensible advantages or risks "
          "over the next 2-3 years.",
    model="pro"
)

Mini Model

Optimized for targeted, efficient research. Best for narrow or well-scoped questions where you still benefit from agentic searching and synthesis.

result = client.research(
    input="What are the top 5 competitors to ____ in the SMB market, and how do they differentiate?",
    model="mini"
)
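If the model is chosen in code rather than by hand, the guidance above could be reduced to a rough heuristic like the one below. The function choose_model, its inputs, and its thresholds are assumptions made for illustration; the API itself only accepts the three model names.

```python
def choose_model(subtopics, needs_depth=False):
    """Pick a model name from a rough estimate of query complexity.

    Hypothetical heuristic, not part of the API.
    subtopics: estimated number of distinct subtopics, or None if unknown.
    needs_depth: True when a thorough report matters more than speed.
    """
    if subtopics is None:
        # Complexity unknown: defer to the default.
        return "auto"
    if needs_depth or subtopics > 1:
        # Multi-domain or deep analysis: use multi-agent research.
        return "pro"
    # Narrow, well-scoped question.
    return "mini"


print(choose_model(None))  # unknown complexity
print(choose_model(1))     # single, well-scoped question
print(choose_model(4))     # several subtopics
```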

Key Parameters

research()

Parameter	Type	Default	Description
`input`	string	Required	The research topic or question
`model`	enum	`"auto"`	`"mini"`, `"pro"`, or `"auto"`
`stream`	boolean	false	Enable streaming responses
`output_schema`	object	null	JSON Schema for structured output
`citation_format`	enum	`"numbered"`	`"numbered"`, `"mla"`, `"apa"`, `"chicago"`
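A minimal sketch of checking these parameters locally before sending a request, using only the names, defaults, and allowed values listed in the table. The helper build_research_kwargs is hypothetical, and whether the service rejects invalid values on its own is not stated here.

```python
ALLOWED_MODELS = {"mini", "pro", "auto"}
ALLOWED_CITATION_FORMATS = {"numbered", "mla", "apa", "chicago"}


def build_research_kwargs(topic, model="auto", stream=False,
                          output_schema=None, citation_format="numbered"):
    """Validate research() arguments against the parameter table.

    Hypothetical helper; returns a dict suitable for client.research(**kwargs).
    """
    if not isinstance(topic, str) or not topic.strip():
        raise ValueError("input is required and must be a non-empty string")
    if model not in ALLOWED_MODELS:
        raise ValueError(f"model must be one of {sorted(ALLOWED_MODELS)}")
    if citation_format not in ALLOWED_CITATION_FORMATS:
        raise ValueError(
            f"citation_format must be one of {sorted(ALLOWED_CITATION_FORMATS)}"
        )
    kwargs = {
        "input": topic,
        "model": model,
        "stream": bool(stream),
        "citation_format": citation_format,
    }
    if output_schema is not None:
        if not isinstance(output_schema, dict):
            raise TypeError("output_schema must be a dict holding a JSON Schema")
        kwargs["output_schema"] = output_schema
    return kwargs


kwargs = build_research_kwargs("Electric vehicle market analysis", model="mini")
print(kwargs)
# With a configured client: result = client.research(**kwargs)
```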

get_research()

Parameter	Type	Description
`request_id`	string	Task ID from `research()` response

Basic Usage

Research tasks are two-step: initiate with research(), retrieve with get_research().

import time
from tavily import TavilyClient

client = TavilyClient()

# Step 1: Start research task
result = client.research(
    input="Latest developments in quantum computing and their practical applications",
    model="pro"
)
request_id = result["request_id"]

# Step 2: Poll until completed
response = client.get_research(request_id)
while response["status"] not in ["completed", "failed"]:
    print(f"Status: {response['status']}... polling again in 10 seconds")
    time.sleep(10)
    response = client.get_research(request_id)

# Step 3: Handle result
if response["status"] == "failed":
    raise RuntimeError(f"Research failed: {response.get('error', 'Unknown error')}")

report = response["content"]
sources = response["sources"]
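The loop above polls indefinitely. For background jobs it may be worth bounding the wait; the sketch below wraps the same get_research() calls in a helper with a deadline. The name wait_for_research and the timeout and interval defaults are assumptions, not SDK features.

```python
import time


def wait_for_research(client, request_id, timeout=600.0, interval=10.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll get_research() until the task completes, fails, or times out.

    Hypothetical helper. `client` is any object exposing get_research(request_id)
    that returns a dict with a "status" key, as in the example above.
    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        response = client.get_research(request_id)
        status = response["status"]
        if status == "completed":
            return response
        if status == "failed":
            raise RuntimeError(
                f"Research failed: {response.get('error', 'Unknown error')}"
            )
        if clock() >= deadline:
            raise TimeoutError(f"Research {request_id} still '{status}' after {timeout}s")
        sleep(interval)
```

With a real client this would be called as wait_for_research(client, request_id), and the returned dict read via response["content"] and response["sources"].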

Streaming vs Polling

Streaming: best for user interfaces where you want real-time updates.
Polling: best for background processes where you check status periodically.

Streaming

Enable real-time progress monitoring with stream=True.

stream = client.research(
    input="Latest developments in quantum computing",
    model="pro",
    stream=True
)

for chunk in stream:
    print(chunk.decode('utf-8'))
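Calling chunk.decode('utf-8') on each chunk can raise UnicodeDecodeError if a multi-byte character happens to be split across two chunks. Whether the stream ever splits characters this way is not stated here, so the following is a defensive sketch using the standard library's incremental decoder; iter_text is a hypothetical helper name.

```python
import codecs


def iter_text(stream):
    """Yield decoded text from an iterable of byte chunks.

    An incremental decoder buffers incomplete multi-byte sequences
    until the remaining bytes arrive in a later chunk.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    for chunk in stream:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush; raises UnicodeDecodeError if the stream ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# Simulated chunks: the two bytes of "é" are split across the boundary.
chunks = [b"caf\xc3", b"\xa9 ok"]
decoded = "".join(iter_text(chunks))
print(decoded)
```

With a live stream, the loop becomes: for text in iter_text(stream): print(text, end="").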

Event Types

Event Type	Description
Tool Call	Agent initiates action (Planning, WebSearch, etc.)
Tool Response	Results after tool execution with sources
Content	Research report streamed as markdown (or JSON with `output_schema`)
Sources	Complete list of sources, emitted after content
Done	Signals completion

Tool Types

Tool	Description	Models
`Planning`	Initializes research strategy	mini, pro
`WebSearch`	Executes web searches	mini, pro
`Generating`	Creates final report	mini, pro
`ResearchSubtopic`	Deep research on subtopics	pro only

Typical Flow

Planning tool_call → tool_response
WebSearch tool_call → tool_response (with sources)
ResearchSubtopic cycles (Pro mode only)
Generating tool_call → tool_response
Content chunks (markdown or structured JSON)
Sources event
Done event
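The flow above can be sketched as a small event consumer. The wire format of a chunk is not specified in this reference, so the code assumes each chunk has already been parsed into a dict; the keys "type", "tool", "text", and "sources" and the lowercase event names are hypothetical stand-ins for whatever the real payload uses.

```python
def consume_events(events):
    """Fold parsed stream events into (report, sources, progress log).

    Assumes hypothetical event dicts such as {"type": "content", "text": "..."};
    adapt the key names to the actual payload.
    """
    content_parts, sources, progress = [], [], []
    for event in events:
        kind = event["type"]
        if kind in ("tool_call", "tool_response"):
            # Planning, WebSearch, ResearchSubtopic, Generating steps.
            progress.append(f"{event.get('tool', '?')} {kind}")
        elif kind == "content":
            content_parts.append(event["text"])
        elif kind == "sources":
            # Emitted once, after all content chunks.
            sources = list(event["sources"])
        elif kind == "done":
            break
    return "".join(content_parts), sources, progress


events = [
    {"type": "tool_call", "tool": "Planning"},
    {"type": "tool_response", "tool": "Planning"},
    {"type": "content", "text": "# Report\n"},
    {"type": "content", "text": "Findings..."},
    {"type": "sources", "sources": [{"url": "https://example.com", "title": "Example"}]},
    {"type": "done"},
]
report, sources, progress = consume_events(events)
print(progress)
print(report)
```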

See streaming cookbook and polling cookbook for complete examples.

Structured Output vs. Report

Format	Best For
Report (default)	Reading, sharing, or displaying verbatim (chat interfaces, briefs, newsletters)
Structured Output	Data enrichment, pipelines, or powering UIs with specific fields

Structured Output

Use output_schema to receive research in a predefined JSON structure.

schema = {
    "properties": {
        "summary": {
            "type": "string",
            "description": "Executive summary of findings"
        },
        "key_points": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Main takeaways from the research"
        },
        "metrics": {
            "type": "object",
            "properties": {
                "market_size": {"type": "string", "description": "Total market size"},
                "growth_rate": {"type": "number", "description": "Annual growth percentage"}
            }
        }
    },
    "required": ["summary", "key_points"]
}

result = client.research(
    input="Electric vehicle market analysis 2024",
    output_schema=schema
)
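Once the task completes, the structured result is read from the content field of the get_research() response. This reference does not say whether that field arrives as a JSON string or as an already-parsed object, so the sketch below accepts either; parse_structured_content is a hypothetical helper.

```python
import json


def parse_structured_content(content, required=()):
    """Return structured research output as a dict and check required fields.

    Accepts either a JSON string or an already-parsed dict, since the
    exact type of `content` is an assumption here.
    """
    data = json.loads(content) if isinstance(content, str) else content
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = [name for name in required if name not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return data


raw = '{"summary": "EV sales grew.", "key_points": ["Point A", "Point B"]}'
parsed = parse_structured_content(raw, required=["summary", "key_points"])
print(parsed["summary"])
print(len(parsed["key_points"]))
```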

Schema Best Practices

Write clear field descriptions. Use 1-3 sentences to explain what the field should contain.
Match the structure you need. Use arrays, objects, and enums appropriately (e.g., competitors: string[], not "A, B, C").
Avoid duplicate fields. Keep each field unique and specific.
Enforce mandatory fields. Add a required array at every nesting level that needs one.

Supported types: object, string, integer, number, array
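These practices can be checked mechanically before a schema is sent. The sketch below is a hypothetical linter, not an SDK feature: it flags types outside the supported list, fields without a description, and required names that are not defined, recursing into nested objects.

```python
SUPPORTED_TYPES = {"object", "string", "integer", "number", "array"}


def lint_schema(schema, path="$"):
    """Return a list of warnings for an output schema (empty list means clean)."""
    warnings = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in properties:
            warnings.append(f"{path}: required field '{name}' is not defined")
    for name, spec in properties.items():
        here = f"{path}.{name}"
        field_type = spec.get("type")
        if field_type not in SUPPORTED_TYPES:
            warnings.append(f"{here}: unsupported or missing type {field_type!r}")
        if not spec.get("description"):
            warnings.append(f"{here}: missing description")
        if field_type == "object":
            # Nested objects carry their own properties and required array.
            warnings.extend(lint_schema(spec, here))
    return warnings


draft = {
    "properties": {
        "summary": {"type": "string", "description": "Executive summary"},
        "is_public": {"type": "boolean"},
    },
    "required": ["summary", "key_points"],
}
for warning in lint_schema(draft):
    print(warning)
```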

Streaming with Structured Output

When output_schema is provided, content arrives as structured JSON:

stream = client.research(
    input="AI agent frameworks comparison",
    model="mini",
    stream=True,
    output_schema={
        "properties": {
            "summary": {"type": "string", "description": "Executive summary"},
            "key_points": {"type": "array", "items": {"type": "string"}}
        },
        "required": ["summary", "key_points"]
    }
)

for chunk in stream:
    data = chunk.decode('utf-8')
    print(data)  # Content chunks will be structured JSON

Response Fields

research() Response

Field	Description
`request_id`	Unique identifier for tracking
`created_at`	Timestamp when task was created
`status`	Initial status
`input`	The research topic submitted
`model`	Model used by research agent

get_research() Response

Field	Description
`status`	`"pending"`, `"processing"`, `"completed"`, `"failed"`
`content`	Generated research report (when completed)
`sources`	Array of source citations
`response_time`	Time in seconds

Source Object

Field	Description
`url`	Source URL
`title`	Source title
`citation`	Formatted citation string
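As an illustration of working with source objects, the sketch below renders a numbered reference list from the three fields in the table. The helper format_sources, the duplicate-URL filtering, and the output layout are assumptions; the real citation string may already include the URL.

```python
def format_sources(sources):
    """Render source objects as a numbered list, skipping duplicate URLs.

    Prefers the formatted citation, then the title, then the bare URL.
    """
    seen, lines = set(), []
    for source in sources:
        url = source.get("url")
        if not url or url in seen:
            continue
        seen.add(url)
        label = source.get("citation") or source.get("title") or url
        lines.append(f"[{len(lines) + 1}] {label} <{url}>")
    return "\n".join(lines)


example_sources = [
    {"url": "https://a.example", "title": "A", "citation": "Cite A"},
    {"url": "https://a.example", "title": "A again"},
    {"url": "https://b.example", "title": "B"},
]
references = format_sources(example_sources)
print(references)
```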

Summary

Be specific in prompts — Include known details: target market, competitors, geography, constraints
Share prior context — Include what you already know to avoid repetition
Choose the right model — mini for focused queries, pro for comprehensive multi-domain analysis
Use streaming for UX — Display real-time progress during long research tasks
Use structured output for pipelines — Define schemas for consistent, parseable responses
Use reports for reading — Default format is best for chat interfaces and sharing

For more examples, see the Tavily Cookbook and live demo.
