Google Trends Integration Analysis
Date: 2025-01-29
Status: Analysis Complete - Ready for Implementation
📋 Executive Summary
After reviewing the legacy Google Trends implementation and the current Research Engine codebase:
- ❌ No Google Trends migration found in the new codebase
- ⚠️ Legacy implementation has significant issues (not production-ready)
- ✅ Pytrends offers comprehensive capabilities that align with user needs
- 🎯 Integration points identified in the current researcher flow
🔍 Legacy Implementation Review
Current Legacy Code Issues
File: ToBeMigrated/ai_web_researcher/google_trends_researcher.py
Problems Identified:
- Visualization Issues:
  - Uses matplotlib.pyplot.show(), which is not suitable for web/API
  - No way to return chart data for frontend rendering
  - Hardcoded visualization that blocks execution
- Error Handling:
  - Basic try/except blocks
  - Returns empty DataFrames on error (silent failures)
  - No retry logic for rate limiting
- Rate Limiting:
  - Random sleeps (time.sleep(random.uniform(0.1, 0.6)))
  - No proper rate limiting strategy
  - Risk of getting blocked by Google
- Code Quality:
  - Mixed concerns (keyword clustering + trends in same file)
  - Hardcoded timeframes ('today 1-y', 'today 12-m')
  - No configuration management
  - FIXME comments indicating incomplete features
- Data Structure:
  - Returns pandas DataFrames directly
  - Not serializable for API responses
  - No standardized response format
- Missing Features:
  - No caching strategy
  - No async support
  - No integration with subscription system
  - No user_id tracking
What Works (Can Reuse):
✅ Core pytrends usage patterns:
- TrendReq() initialization
- build_payload() method
- interest_over_time() method
- interest_by_region() method
- related_topics() method
- related_queries() method
- trending_searches() method
✅ Keyword expansion logic:
- Google auto-suggestions fetching
- Prefix/suffix expansion
- Relevance scoring
✅ Keyword clustering approach:
- TF-IDF vectorization
- K-means clustering
- Silhouette scoring
📚 Pytrends Capabilities Review
Available Methods (from pytrends library):
- interest_over_time() - Historical indexed data
  - Shows when keyword was most searched
  - Returns time series data
- multirange_interest_over_time() - Similar to interest_over_time
  - Allows analysis across multiple date ranges
  - Better for comparing different time periods
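- get_historical_interest() - Historical hourly data (listed in the pytrends docs as "Historical Hourly Interest"; confirm that the pinned pytrends release still supports it)
  - Sends multiple requests (one week at a time)
  - More granular than daily data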
- interest_by_region() - Geographic interest data
  - Shows where keyword is most searched
  - Returns data by country/region
- related_topics() - Related topics to keyword
  - Returns 'top' and 'rising' topics
  - Useful for content expansion
- related_queries() - Related search queries
  - Returns 'top' and 'rising' queries
  - Great for keyword research
- trending_searches() - Latest trending searches
  - Country-specific
  - Real-time trending topics
- top_charts() - Top charts for a given topic
  - Yearly charts
  - Category-specific
- suggestions() - Additional suggested keywords
  - Refines trend search
  - Auto-complete suggestions
Key Parameters:
- timeframe: 'today 1-y', 'today 12-m', 'all', custom dates
- geo: Country code (e.g., 'US', 'GB', 'IN')
- hl: Language (e.g., 'en-US')
- tz: Timezone offset (e.g., 360 for UTC-6)
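As a rough illustration of how these parameters reach pytrends, the sketch below validates the inputs and passes them to TrendReq and build_payload(). The helper names build_trends_request and fetch_interest_over_time are hypothetical, the five-keyword cap reflects the limit on terms Google Trends compares in one request, and the pytrends import is deferred so the validation helper can be used without the library installed.

```python
from typing import Any, Dict, List

MAX_KEYWORDS = 5  # Google Trends compares at most five terms per request


def build_trends_request(keywords: List[str], timeframe: str = "today 12-m",
                         geo: str = "US") -> Dict[str, Any]:
    """Validate inputs and return keyword arguments for build_payload()."""
    cleaned = [kw.strip() for kw in keywords if kw.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty keyword is required")
    if len(cleaned) > MAX_KEYWORDS:
        raise ValueError(f"at most {MAX_KEYWORDS} keywords per request")
    return {"kw_list": cleaned, "timeframe": timeframe, "geo": geo.upper()}


def fetch_interest_over_time(keywords: List[str], timeframe: str = "today 12-m",
                             geo: str = "US"):
    """Return the interest-over-time DataFrame (performs a network call)."""
    # Imported lazily: pytrends is a third-party dependency.
    from pytrends.request import TrendReq

    client = TrendReq(hl="en-US", tz=360)  # hl and tz as described above
    client.build_payload(**build_trends_request(keywords, timeframe, geo))
    return client.interest_over_time()
```

Keeping validation separate from the network call means bad input can be rejected before any request counts against the rate limit.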
🔍 Migration Status Check
Search Results:
✅ No Google Trends implementation found in:
- backend/services/research/ - No trends service
- backend/api/research/ - No trends endpoints
- Current codebase only mentions "trends" as a deliverable type, not actual Google Trends API
Current "Trends" References:
The codebase has:
- ExpectedDeliverable.TRENDS enum value
- TrendAnalysis model in research_intent_models.py
- Intent-aware analyzer that can extract trends from research results
- But NO actual Google Trends API integration
Conclusion: Google Trends has NOT been migrated to the new codebase. The current "trends" feature only extracts trend information from general research results, not from Google Trends API.
🎯 Where to Integrate Google Trends in User Flow
Current Researcher Flow:
Step 1: ResearchInput
├── User enters keywords/topic
├── Clicks "Intent & Options" button
└── Intent analysis performed
Step 2: IntentConfirmationPanel
├── Shows inferred intent (editable)
├── Shows suggested queries
├── Shows AI-optimized settings
└── User confirms and clicks "Research"
Step 3: Research Execution
└── Research runs via Exa/Tavily/Google
Step 4: StepResults (IntentResultsDisplay)
├── Summary tab
├── Statistics tab
├── Expert Quotes tab
├── Case Studies tab
├── Trends tab (currently shows AI-extracted trends)
└── Sources tab
Recommended Integration Points:
Option 1: Automatic Integration (Recommended) ⭐⭐⭐⭐⭐
When: During research execution, if intent includes trends
Flow:
- User enters keywords → Intent analysis
- If intent includes EXPLORE_TRENDS purpose OR TRENDS deliverable:
  - Automatically fetch Google Trends data in parallel
  - Merge with research results
- Display in "Trends" tab with Google Trends data
Pros:
- Seamless user experience
- No extra clicks
- Trends data always available when relevant
Cons:
- Additional API call (but can be cached)
- Slightly longer execution time
Implementation:
- Add to IntentAwareAnalyzer.analyze() method
- Call Google Trends service if trends in expected_deliverables
- Merge Google Trends data with AI-extracted trends
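The automatic path could look roughly like the sketch below: the trends fetch is started alongside the main search only when the intent asks for it, and a trends failure does not fail the research run. Both coroutines are hypothetical stand-ins for the real search and trends services, and the string values "explore_trends" and "trends" are assumed representations of the EXPLORE_TRENDS purpose and TRENDS deliverable.

```python
import asyncio
from typing import Any, Dict, List, Set


async def run_search(keywords: List[str]) -> Dict[str, Any]:
    """Hypothetical stand-in for the Exa/Tavily/Google research call."""
    await asyncio.sleep(0)
    return {"ai_trends": [f"AI-extracted trend for {kw}" for kw in keywords]}


async def fetch_google_trends(keywords: List[str]) -> Dict[str, Any]:
    """Hypothetical stand-in for the Google Trends service."""
    await asyncio.sleep(0)
    return {"interest_over_time": [], "keywords": keywords}


def wants_trends(purpose: str, deliverables: Set[str]) -> bool:
    """True when the intent calls for Google Trends data (assumed string values)."""
    return purpose == "explore_trends" or "trends" in deliverables


async def research_with_trends(keywords: List[str], purpose: str,
                               deliverables: Set[str]) -> Dict[str, Any]:
    tasks = [run_search(keywords)]
    if wants_trends(purpose, deliverables):
        tasks.append(fetch_google_trends(keywords))
    # Run both calls concurrently; exceptions come back as values.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    research = results[0]
    if isinstance(research, BaseException):
        raise research  # the main research call is mandatory
    google = results[1] if len(results) > 1 else None
    if isinstance(google, BaseException):
        google = None  # trends are optional; keep the research results
    return {**research, "google_trends": google}
```

With this shape, the Trends tab can fall back to the AI-extracted trends whenever google_trends is None.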
Option 2: On-Demand Button (Alternative) ⭐⭐⭐⭐
When: After intent analysis, show "Analyze Trends" button
Flow:
- User enters keywords → Intent analysis
- IntentConfirmationPanel shows "Analyze Trends" button
- User clicks → Fetches Google Trends data
- Shows trends preview in panel
- User proceeds with research
Pros:
- User control
- Faster initial intent analysis
- Can preview trends before research
Cons:
- Extra user action
- Trends not integrated with research results
Implementation:
- Add button to IntentConfirmationPanel
- Create endpoint: POST /api/research/trends/analyze
- Show trends preview in panel
Option 3: Separate Trends Tab (Alternative) ⭐⭐⭐
When: Always available as separate action
Flow:
- User enters keywords
- "Trends" button always visible
- Click → Opens trends analysis
- Separate from main research flow
Pros:
- Clear separation
- Can use independently
- Simple UX
Cons:
- Not integrated with research
- Extra navigation
- Less discoverable
✅ Recommended Approach: Hybrid (Option 1 + Option 2)
Primary: Automatic Integration
For intent-driven research:
- If purpose == EXPLORE_TRENDS OR TRENDS in expected_deliverables:
  - Automatically fetch Google Trends data
  - Include in research results
  - Display in "Trends" tab
Secondary: On-Demand Button
For all research:
- Show "Analyze Trends" button in IntentConfirmationPanel
- User can click to get trends even if not in intent
- Preview trends before research execution
User Experience:
┌─────────────────────────────────────────────────────────┐
│ ResearchInput │
│ ┌───────────────────────────────────────────────────┐ │
│ │ Keywords: "AI marketing tools" │ │
│ │ [Intent & Options] │ │
│ └───────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ IntentConfirmationPanel │
│ ┌───────────────────────────────────────────────────┐ │
│ │ Intent: make_decision │ │
│ │ Deliverables: [comparisons, trends, statistics] │ │
│ │ │ │
│ │ [Analyze Trends] ← Always available │ │
│ │ [Research] ← Will auto-include trends │ │
│ └───────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Research Execution │
│ ├── Exa/Tavily/Google search │
│ └── Google Trends (if trends in deliverables) ← AUTO │
└─────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ IntentResultsDisplay │
│ ┌───────────────────────────────────────────────────┐ │
│ │ [Summary] [Statistics] [Quotes] [Trends] [Sources]│ │
│ │ │ │
│ │ Trends Tab: │ │
│ │ ├── Interest Over Time (Chart) │ │
│ │ ├── Interest by Region (Map/Table) │ │
│ │ ├── Related Topics (Top & Rising) │ │
│ │ ├── Related Queries (Top & Rising) │ │
│ │ └── AI-Extracted Trends (from research) │ │
│ └───────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
🏗️ Implementation Plan
Phase 1: Core Service (Week 1)
Create: backend/services/research/trends/google_trends_service.py
Features:
- Interest over time
- Interest by region
- Related topics
- Related queries
- Proper error handling
- Rate limiting
- Caching (24-hour TTL)
- Async support
Phase 2: Integration (Week 1-2)
Enhance: IntentAwareAnalyzer
Changes:
- Check if trends in expected_deliverables
- Call Google Trends service
- Merge with AI-extracted trends
- Return enhanced trends data
Phase 3: API Endpoint (Week 2)
Create: POST /api/research/trends/analyze
Purpose: On-demand trends analysis
Request:
{
  "keywords": ["AI marketing tools"],
  "timeframe": "today 12-m",
  "geo": "US"
}
Response:
{
  "interest_over_time": [...],
  "interest_by_region": [...],
  "related_topics": {
    "top": [...],
    "rising": [...]
  },
  "related_queries": {
    "top": [...],
    "rising": [...]
  }
}
Phase 4: Frontend Integration (Week 2-3)
Enhance: IntentConfirmationPanel
- Add "Analyze Trends" button
- Show trends preview
Enhance: IntentResultsDisplay
- Enhance "Trends" tab with Google Trends data
- Add charts (interest over time)
- Add regional map/table
- Show related topics/queries
📊 Data Structure Design
Google Trends Response Model
from typing import Any, Dict, List, Optional

from pydantic import BaseModel  # assumes the existing models are Pydantic models


class GoogleTrendsData(BaseModel):
    """Structured Google Trends data."""
    interest_over_time: List[Dict[str, Any]]  # Time series data
    interest_by_region: List[Dict[str, Any]]  # Geographic data
    related_topics: Dict[str, List[Dict[str, Any]]]  # {top: [...], rising: [...]}
    related_queries: Dict[str, List[Dict[str, Any]]]  # {top: [...], rising: [...]}
    trending_searches: Optional[List[str]] = None
    timeframe: str
    geo: str
    keywords: List[str]
Enhanced TrendAnalysis Model
class TrendAnalysis(BaseModel):
    """Enhanced trend analysis with Google Trends data."""
    trend: str
    direction: str
    evidence: List[str]
    impact: Optional[str]
    timeline: Optional[str]
    sources: List[str]
    # Google Trends specific
    google_trends_data: Optional[GoogleTrendsData] = None
    interest_score: Optional[float] = None  # 0-100 from Google Trends
    regional_interest: Optional[Dict[str, float]] = None
    related_topics: Optional[List[str]] = None
    related_queries: Optional[List[str]] = None
🔧 Technical Considerations
Rate Limiting
Pytrends Limitations:
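- pytrends is an unofficial client for Google Trends, and Google rate-limits its requests
- Google does not publish the limit; about 1 request per second is a conservative starting point
- pytrends does not throttle requests itself; TrendReq accepts optional retries and backoff_factor arguments for retrying failed requests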
Our Strategy:
- Cache all trends data (24-hour TTL)
- Use async requests with delays
- Batch multiple keywords in single request when possible
- Implement retry logic with exponential backoff
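A minimal sketch of the retry-with-exponential-backoff item follows. RateLimitedError is a hypothetical exception that the service would raise when Google answers with HTTP 429, and the delay values are illustrative defaults rather than tuned numbers. The sleep function is injectable so the behaviour can be tested without waiting.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitedError(Exception):
    """Hypothetical error raised when Google responds with HTTP 429."""


def call_with_backoff(fn: Callable[[], T], max_attempts: int = 4,
                      base_delay: float = 1.0, max_delay: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on RateLimitedError with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitedError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; let the caller degrade gracefully
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter so concurrent workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
    raise ValueError("max_attempts must be at least 1")
```

Unlike the random sleeps in the legacy code, the wait here grows only when Google actually pushes back, so normal requests are not slowed down.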
Caching Strategy
# Cache key: f"google_trends:{keyword}:{timeframe}:{geo}"
# TTL: 24 hours (trends don't change frequently)
# Store: Interest over time, related topics/queries
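The cache key and TTL above can be sketched with an in-memory store, as below. TTLCache is a hypothetical helper (a production setup might use Redis or an existing cache layer instead), and normalising the keyword and geo before building the key is an added assumption so that "AI Tools" and "ai tools" share one entry.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

TTL_SECONDS = 24 * 60 * 60  # 24-hour TTL from the strategy above


def cache_key(keyword: str, timeframe: str, geo: str) -> str:
    """Build the cache key; keyword/geo normalisation is an assumption."""
    return f"google_trends:{keyword.strip().lower()}:{timeframe}:{geo.upper()}"


class TTLCache:
    """Tiny in-memory cache whose entries expire after a fixed time."""

    def __init__(self, ttl: float = TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        self._items: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # drop the stale entry
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._items[key] = (self._clock() + self._ttl, value)
```

A lookup would call cache.get(cache_key(...)) first and only hit Google Trends on a miss, which is what keeps repeated research runs under the rate limit.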
Error Handling
- Handle Google blocking (429 errors)
- Handle invalid keywords
- Handle missing data
- Graceful degradation (return partial data if available)
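Graceful degradation could be approached as in the sketch below: each section of the response is fetched independently, and a failure is recorded instead of discarding the sections that succeeded. The collect_sections helper and the shape of the error entries are assumptions for illustration, not an existing API.

```python
from typing import Any, Callable, Dict, List, Tuple


def collect_sections(
    fetchers: Dict[str, Callable[[], Any]],
) -> Tuple[Dict[str, Any], List[Dict[str, str]]]:
    """Run each fetcher; return the sections that worked plus an error list."""
    data: Dict[str, Any] = {}
    errors: List[Dict[str, str]] = []
    for name, fetch in fetchers.items():
        try:
            data[name] = fetch()
        except Exception as exc:  # one failed section must not sink the rest
            errors.append({"section": name,
                           "error": f"{type(exc).__name__}: {exc}"})
    return data, errors
```

Returning the error list alongside the data avoids the silent failures of the legacy code, where an empty DataFrame was indistinguishable from a keyword with no search interest.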
Async Support
- Use asyncio for non-blocking requests
- Parallel requests for multiple keywords
- Timeout handling (30 seconds max)
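Because pytrends makes blocking HTTP calls, one way to meet these points is to run each call in a worker thread and bound the wait. The sketch below assumes Python 3.9+ for asyncio.to_thread; fetch is any blocking callable that takes a keyword (a hypothetical wrapper around pytrends), and a timed-out keyword maps to None.

```python
import asyncio
from typing import Any, Callable, Dict, List, Optional

TIMEOUT_SECONDS = 30.0  # 30 seconds max, as listed above


async def fetch_one(fetch: Callable[[str], Any], keyword: str,
                    timeout: float = TIMEOUT_SECONDS) -> Optional[Any]:
    """Run a blocking fetch in a thread; return None if it exceeds the timeout."""
    try:
        return await asyncio.wait_for(asyncio.to_thread(fetch, keyword), timeout)
    except asyncio.TimeoutError:
        return None


async def fetch_many(fetch: Callable[[str], Any], keywords: List[str],
                     timeout: float = TIMEOUT_SECONDS) -> Dict[str, Optional[Any]]:
    """Fetch all keywords concurrently and map each keyword to its result."""
    results = await asyncio.gather(
        *(fetch_one(fetch, kw, timeout) for kw in keywords)
    )
    return dict(zip(keywords, results))
```

Note that the timeout only stops the caller from waiting; the worker thread keeps running until the underlying request returns, so a request-level timeout on the HTTP client is still worth setting.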
📈 User Value
For Content Creators:
-
Timing Optimization:
- See interest over time to time publication
- Identify peak interest periods
- Avoid publishing during low-interest periods
-
Regional Targeting:
- See which regions have highest interest
- Tailor content for specific markets
- Discover new audience opportunities
-
Content Expansion:
- Related topics → new article ideas
- Related queries → FAQ sections
- Rising topics → timely content opportunities
For Digital Marketers:
-
Campaign Planning:
- Trending searches → campaign topics
- Interest by region → geo-targeting
- Related queries → ad keywords
-
SEO Strategy:
- Related queries → long-tail keywords
- Rising topics → content opportunities
- Interest trends → content calendar
For Solopreneurs:
- Market Research:
- Interest trends → market validation
- Regional data → market expansion
- Related topics → competitive landscape
✅ Success Criteria
- Google Trends service created and tested
- Automatic integration working (when trends in intent)
- On-demand button working in IntentConfirmationPanel
- Trends tab enhanced with Google Trends data
- Charts displaying correctly (interest over time)
- Regional data displaying correctly
- Caching working (24-hour TTL)
- Rate limiting preventing blocks
- Error handling graceful
- User satisfaction with trends feature
🚀 Quick Start Implementation
Step 1: Create Service (2-3 days)
# backend/services/research/trends/google_trends_service.py
class GoogleTrendsService:
    """Interface sketch; method bodies are still to be implemented."""

    async def get_interest_over_time(self, keywords, timeframe, geo): ...
    async def get_interest_by_region(self, keywords, geo): ...
    async def get_related_topics(self, keywords, timeframe): ...
    async def get_related_queries(self, keywords, timeframe): ...
    async def get_trending_searches(self, country): ...
Step 2: Integrate with IntentAwareAnalyzer (1-2 days)
- Check for trends in deliverables
- Call Google Trends service
- Merge with AI-extracted trends
Step 3: Add API Endpoint (1 day)
- POST /api/research/trends/analyze
- Return structured trends data
Step 4: Frontend Integration (2-3 days)
- Add "Analyze Trends" button
- Enhance Trends tab
- Add charts/visualizations
Total Estimate: 6-9 days for full implementation
📝 Next Steps
- Approve Approach: Confirm hybrid approach (automatic + on-demand)
- Set Up Dependencies: Add pytrends>=4.9.2 to requirements.txt
- Create Service: Start with google_trends_service.py
- Test Integration: Test with sample keywords
- Frontend Integration: Add UI components
Status: Analysis Complete - Ready for Implementation
Recommended Action: Start with Phase 1 (Core Service) - create google_trends_service.py with proper error handling, caching, and async support.