14 KiB
Exa & Tavily Options Inference Guide
Date: 2025-01-29
Status: Current Implementation Review
Overview
When a user clicks "Intent & Options" button, the system uses AI to infer optimal Exa and Tavily API settings based on the user's research intent. This document explains how these options are generated.
Flow: Intent & Options Button Click
User clicks "Intent & Options"
↓
Frontend: intentResearchApi.analyzeIntent()
↓
Backend: /api/research/intent/analyze
↓
UnifiedResearchAnalyzer.analyze()
↓
Single LLM Call with unified_prompt_builder.py
↓
LLM Returns:
- ResearchIntent (with purpose, depth, focus_areas, also_answering, etc.)
- ResearchQueries (4-8 diverse queries)
- exa_config (optimized Exa settings with justifications)
- tavily_config (optimized Tavily settings with justifications)
- recommended_provider
↓
Backend maps to optimized_config
↓
Frontend receives AnalyzeIntentResponse with optimized_config
↓
Frontend applies optimized_config to ResearchConfig
↓
User sees optimized Exa/Tavily options in AdvancedProviderOptionsSection
How Options Are Inferred
1. Time Sensitivity Rules
Based on: intent.time_sensitivity field
| Time Sensitivity | Exa Settings | Tavily Settings |
|---|---|---|
| real_time | startPublishedDate = current year, type = "auto" or "fast" |
time_range = "day" or "week", topic = "news" |
| recent | startPublishedDate = current year or last 6 months |
time_range = "month" or "week" |
| historical | No date filters, type = "deep" or "neural" |
time_range = "year" or null, topic = "general" |
| evergreen | No date filters, type = "deep" |
time_range = null, topic = "general" |
Example:
- User input: "Latest AI trends in 2025"
- Time sensitivity inferred:
real_time - Exa:
startPublishedDate = "2025-01-01",type = "fast" - Tavily:
time_range = "week",topic = "news"
2. Content Type Based on Focus Areas
Based on: intent.focus_areas field
| Focus Area Keywords | Exa Category | Exa Type | Tavily Topic |
|---|---|---|---|
| "academic", "research", "studies" | "research paper" |
"deep" or "neural" |
"general" |
includeDomains = ["arxiv.org", "nature.com", "pubmed.ncbi.nlm.nih.gov"] |
|||
| "companies", "competitors", "business" | "company" |
"auto" or "deep" |
"general" |
| "news", "trends", "current events" | "news" (if using Exa) |
"auto" |
"news" |
search_depth = "advanced" |
|||
| "social", "twitter", "social media" | "tweet" |
"auto" |
"general" |
| "github", "code", "technical" | "github" |
"auto" or "deep" |
"general" |
Example:
- User input: "AI research papers on transformer architectures"
- Focus areas inferred:
["academic", "research"] - Exa:
category = "research paper",type = "deep",includeDomains = ["arxiv.org", "nature.com"] - Tavily:
topic = "general"
3. Depth-Based Settings
Based on: intent.depth field (overview, detailed, expert)
| Depth Level | Exa Settings | Tavily Settings |
|---|---|---|
| expert | type = "deep", context = true, contextMaxCharacters = 15000+, numResults = 20-50 |
search_depth = "advanced", chunks_per_source = 3, max_results = 15-20 |
| detailed | type = "auto" or "deep", context = true, contextMaxCharacters = 10000+, numResults = 10-20 |
search_depth = "advanced" or "basic", chunks_per_source = 3, max_results = 10-15 |
| overview | type = "auto" or "fast", numResults = 5-10 |
search_depth = "basic" or "fast", max_results = 5-10 |
Example:
- User input: "Comprehensive analysis of quantum computing"
- Depth inferred:
expert - Exa:
type = "deep",context = true,contextMaxCharacters = 15000,numResults = 30 - Tavily:
search_depth = "advanced",chunks_per_source = 3,max_results = 15
4. Query-Specific Settings
Based on: Primary query characteristics
| Query Type | Exa Settings | Tavily Settings |
|---|---|---|
| Comprehensive (addresses multiple secondary questions/focus areas) | type = "deep", context = true, contextMaxCharacters = 15000+ |
search_depth = "advanced", chunks_per_source = 3 |
| Simple factual | type = "fast", numResults = 5-10 |
search_depth = "ultra-fast", max_results = 5 |
| Time-sensitive | Apply time filters based on urgency | Apply time_range based on urgency |
| Content-specific | Match category to content type | Match topic to content type |
Example:
- Primary query: "What are the best practices for React performance optimization?"
- Query type: Comprehensive (needs detailed analysis)
- Exa:
type = "deep",context = true,contextMaxCharacters = 12000 - Tavily:
search_depth = "advanced",chunks_per_source = 3
5. Also Answering Topics Considerations
Based on: intent.also_answering field
Rules:
- If also_answering topics need different time ranges:
- Use broader
time_rangein Tavily (e.g., "year" instead of "month") - Don't apply strict date filters in Exa
- Use broader
- If also_answering topics need different sources:
- Consider including additional domains in
includeDomains - Use more comprehensive search (
type = "deep"in Exa)
- Consider including additional domains in
Example:
- Primary: "Latest AI trends"
- Also answering: ["Historical AI development", "Future predictions"]
- Exa: No strict date filters,
type = "deep"for comprehensive coverage - Tavily:
time_range = "year"to cover historical and recent
6. Provider Selection Logic
Based on: Combined analysis of all intent fields
Use EXA when:
- Primary query needs semantic understanding
- Focus areas include "academic", "research", "companies"
- Depth = "expert" or "detailed"
- Need comprehensive context (
context = true) - Query targets specific content types (research papers, companies, GitHub)
Use TAVILY when:
- Time sensitivity = "real_time" or "recent"
- Focus areas include "news", "trends", "current events"
- Need quick AI-generated answers
- Primary query is about recent developments
- Query needs real-time information
Example:
- User input: "Latest news about AI regulation"
- Provider selected: Tavily (real-time news focus)
- Tavily:
topic = "news",search_depth = "advanced",time_range = "week"
Exa Config Options Generated
The AI generates these Exa options with justifications:
Core Options
type:"auto" | "fast" | "deep" | "neural" | "keyword"- Justification references: query complexity, depth, time sensitivity
category:"company" | "research paper" | "news" | "linkedin profile" | "github" | "tweet" | "personal site" | "pdf" | "financial report"- Justification references: focus_areas, content type needed
numResults:1-100- Justification references: depth, query complexity, secondary questions count
includeDomains: Array of domain strings- Justification references: focus_areas, content type requirements
startPublishedDate: Date string (YYYY-MM-DD)- Justification references: time_sensitivity, query time requirements
Content Options
highlights:true | false- Justification: Whether snippets are needed for quick scanning
context:true | false(required fortype = "deep")- Justification: Whether full context needed for RAG/AI processing
contextMaxCharacters: Number (if context = true)- Justification: Depth requirements, query complexity
Advanced Options (if applicable)
additionalQueries: Array of query strings (only fortype = "deep")- Justification: Query variations needed for comprehensive coverage
livecrawl:"never" | "fallback" | "preferred" | "always"- Justification: Freshness requirements based on time_sensitivity
Tavily Config Options Generated
The AI generates these Tavily options with justifications:
Core Options
topic:"general" | "news" | "finance"- Justification references: focus_areas, content type
search_depth:"basic" | "advanced" | "fast" | "ultra-fast"- Justification references: depth, query complexity, speed requirements
include_answer:true | false | "basic" | "advanced"- Justification: Whether AI-generated answer is needed
time_range:"day" | "week" | "month" | "year" | null- Justification references: time_sensitivity, query time requirements
max_results:0-20- Justification references: depth, query complexity
Advanced Options
chunks_per_source:1-3(only forsearch_depth = "advanced")- Justification: Depth requirements, comprehensive coverage needs
include_raw_content:true | false | "markdown" | "text"- Justification: Whether full content needed for analysis
country: Country code (only fortopic = "general")- Justification: Geographic relevance based on target_audience
Example: Complete Inference Flow
User Input
Keywords: "AI marketing tools for small businesses"
Purpose: create_content (user-selected)
Content Output: blog_post (user-selected)
Depth: detailed (user-selected)
AI Inference
Intent:
- primary_question: "What are the best AI marketing tools for small businesses?"
- secondary_questions: ["What are the pricing models?", "What features do they offer?"]
- focus_areas: ["tools", "small business", "marketing automation"]
- also_answering: ["How to choose the right tool", "Implementation best practices"]
- time_sensitivity: "recent"
- depth: "detailed"
Recommended Provider: EXA (needs comprehensive analysis, not just news)
Exa Config:
- type: "auto"
justification: "Balanced speed and quality for comprehensive tool research"
- category: null (general search)
justification: "Tools can be found across multiple content types"
- numResults: 15
justification: "Detailed depth requires more sources to cover tools, pricing, and features"
- includeDomains: []
justification: "No specific domain restrictions needed"
- startPublishedDate: "2024-01-01"
justification: "Recent time sensitivity requires current year data"
- highlights: true
justification: "Snippets help quickly identify relevant tools"
- context: true
justification: "Detailed depth requires full context for comprehensive analysis"
- contextMaxCharacters: 10000
justification: "Detailed depth needs substantial context per source"
Tavily Config:
- topic: "general"
justification: "General topic covers tools and business content"
- search_depth: "advanced"
justification: "Detailed depth requires comprehensive search"
- include_answer: true
justification: "AI-generated answers provide quick insights"
- time_range: "year"
justification: "Recent time sensitivity with also_answering topics needing broader coverage"
- max_results: 12
justification: "Detailed depth requires multiple sources"
- chunks_per_source: 3
justification: "Detailed depth needs comprehensive content per source"
Key Files
Backend
-
backend/services/research/intent/unified_prompt_builder.py- Contains all optimization rules (lines 155-275)
- Defines how intent fields map to Exa/Tavily settings
-
backend/services/research/intent/unified_schema_builder.py- Defines JSON schema for exa_config and tavily_config (lines 67-124)
- Specifies all available options and their types
-
backend/services/research/intent/unified_result_parser.py- Extracts exa_config and tavily_config from LLM response (lines 205-206)
-
backend/api/research/handlers/intent.py- Maps exa_config/tavily_config to optimized_config (lines 124-155)
- Returns optimized_config in AnalyzeIntentResponse
Frontend
-
frontend/src/components/Research/types/intent.types.ts- Defines OptimizedConfig interface (lines 224-280)
- Includes all Exa/Tavily options with justifications
-
frontend/src/components/Research/steps/components/IntentConfirmationPanel/AdvancedProviderOptionsSection.tsx- Displays optimized Exa/Tavily options
- Shows AI justifications for each option
-
frontend/src/components/Research/steps/ResearchInput.tsx- Applies optimized_config to ResearchConfig (lines 464-512)
Current Implementation Status
✅ Fully Implemented
- Time sensitivity → Exa/Tavily date filters
- Focus areas → Exa category / Tavily topic
- Depth → Exa type / Tavily search_depth
- Query characteristics → Provider selection
- Also answering → Broader time ranges
⚠️ Partially Implemented
- Some Exa options are inferred but not all are exposed in UI
- Some Tavily options are inferred but not all are exposed in UI
- Advanced options (livecrawl, additionalQueries) are in schema but rarely used
📋 Options Available in Schema (May Not All Be Used)
Exa Options:
- ✅ type, category, numResults, includeDomains, startPublishedDate, highlights, context
- ⚠️ excludeDomains, contextMaxCharacters, additionalQueries, livecrawl
Tavily Options:
- ✅ topic, search_depth, include_answer, time_range, max_results, chunks_per_source
- ⚠️ start_date, end_date, include_raw_content, country, include_images, include_image_descriptions, include_favicon, auto_parameters
References
docs/ALwrity Researcher/EXA_INTEGRATION_ENHANCEMENTS.md- Exa search types and latencydocs/ALwrity Researcher/EXA_API_OPTIONS_AUDIT.md- Complete Exa API options comparisondocs/ALwrity Researcher/EXA_TAVILY_OPTIONS_DISPLAY_REVIEW.md- UI display reviewdocs/ALwrity Researcher/INTENT_DRIVEN_RESEARCH_IMPLEMENTATION_STATUS.md- Implementation status
Status: Current implementation infers Exa and Tavily options based on comprehensive intent analysis with detailed justifications.