# Google Trends Integration Analysis **Date**: 2025-01-29 **Status**: Analysis Complete - Ready for Implementation --- ## 📋 Executive Summary After reviewing the legacy Google Trends implementation and the current Research Engine codebase: - ❌ **No Google Trends migration found** in the new codebase - ⚠️ **Legacy implementation has significant issues** (not production-ready) - ✅ **Pytrends offers comprehensive capabilities** that align with user needs - 🎯 **Integration points identified** in the current researcher flow --- ## 🔍 Legacy Implementation Review ### Current Legacy Code Issues **File**: `ToBeMigrated/ai_web_researcher/google_trends_researcher.py` #### Problems Identified: 1. **Visualization Issues**: - Uses `matplotlib.pyplot.show()` - not suitable for web/API - No way to return chart data for frontend rendering - Hardcoded visualization that blocks execution 2. **Error Handling**: - Basic try/except blocks - Returns empty DataFrames on error (silent failures) - No retry logic for rate limiting 3. **Rate Limiting**: - Random sleeps (`time.sleep(random.uniform(0.1, 0.6))`) - No proper rate limiting strategy - Risk of getting blocked by Google 4. **Code Quality**: - Mixed concerns (keyword clustering + trends in same file) - Hardcoded timeframes (`'today 1-y'`, `'today 12-m'`) - No configuration management - FIXME comments indicating incomplete features 5. **Data Structure**: - Returns pandas DataFrames directly - Not serializable for API responses - No standardized response format 6. **Missing Features**: - No caching strategy - No async support - No integration with subscription system - No user_id tracking #### What Works (Can Reuse): ✅ **Core pytrends usage patterns**: - `TrendReq()` initialization - `build_payload()` method - `interest_over_time()` method - `interest_by_region()` method - `related_topics()` method - `related_queries()` method - `trending_searches()` method ✅ **Keyword expansion logic**: - Google auto-suggestions fetching - Prefix/suffix expansion - Relevance scoring ✅ **Keyword clustering approach**: - TF-IDF vectorization - K-means clustering - Silhouette scoring --- ## 📚 Pytrends Capabilities Review ### Available Methods (from pytrends library): 1. **`interest_over_time()`** - Historical indexed data - Shows when keyword was most searched - Returns time series data 2. **`multirange_interest_over_time()`** - Similar to interest_over_time - Allows analysis across multiple date ranges - Better for comparing different time periods 3. **`historical_hourly_interest()`** - Historical hourly data - Sends multiple requests (one week at a time) - More granular than daily data 4. **`interest_by_region()`** - Geographic interest data - Shows where keyword is most searched - Returns data by country/region 5. **`related_topics()`** - Related topics to keyword - Returns 'top' and 'rising' topics - Useful for content expansion 6. **`related_queries()`** - Related search queries - Returns 'top' and 'rising' queries - Great for keyword research 7. **`trending_searches()`** - Latest trending searches - Country-specific - Real-time trending topics 8. **`top_charts()`** - Top charts for a given topic - Yearly charts - Category-specific 9. **`suggestions()`** - Additional suggested keywords - Refines trend search - Auto-complete suggestions ### Key Parameters: - **`timeframe`**: `'today 1-y'`, `'today 12-m'`, `'all'`, custom dates - **`geo`**: Country code (e.g., 'US', 'GB', 'IN') - **`hl`**: Language (e.g., 'en-US') - **`tz`**: Timezone offset (e.g., 360 for UTC-6) --- ## 🔍 Migration Status Check ### Search Results: ✅ **No Google Trends implementation found** in: - `backend/services/research/` - No trends service - `backend/api/research/` - No trends endpoints - Current codebase only mentions "trends" as a deliverable type, not actual Google Trends API ### Current "Trends" References: The codebase has: - `ExpectedDeliverable.TRENDS` enum value - `TrendAnalysis` model in `research_intent_models.py` - Intent-aware analyzer that can extract trends from research results - But **NO actual Google Trends API integration** **Conclusion**: Google Trends has **NOT been migrated** to the new codebase. The current "trends" feature only extracts trend information from general research results, not from Google Trends API. --- ## 🎯 Where to Integrate Google Trends in User Flow ### Current Researcher Flow: ``` Step 1: ResearchInput ├── User enters keywords/topic ├── Clicks "Intent & Options" button └── Intent analysis performed Step 2: IntentConfirmationPanel ├── Shows inferred intent (editable) ├── Shows suggested queries ├── Shows AI-optimized settings └── User confirms and clicks "Research" Step 3: Research Execution └── Research runs via Exa/Tavily/Google Step 4: StepResults (IntentResultsDisplay) ├── Summary tab ├── Statistics tab ├── Expert Quotes tab ├── Case Studies tab ├── Trends tab (currently shows AI-extracted trends) └── Sources tab ``` ### Recommended Integration Points: #### Option 1: Automatic Integration (Recommended) ⭐⭐⭐⭐⭐ **When**: During research execution, if intent includes trends **Flow**: 1. User enters keywords → Intent analysis 2. If intent includes `EXPLORE_TRENDS` purpose OR `TRENDS` deliverable: - Automatically fetch Google Trends data in parallel - Merge with research results 3. Display in "Trends" tab with Google Trends data **Pros**: - Seamless user experience - No extra clicks - Trends data always available when relevant **Cons**: - Additional API call (but can be cached) - Slightly longer execution time **Implementation**: - Add to `IntentAwareAnalyzer.analyze()` method - Call Google Trends service if trends in expected_deliverables - Merge Google Trends data with AI-extracted trends #### Option 2: On-Demand Button (Alternative) ⭐⭐⭐⭐ **When**: After intent analysis, show "Analyze Trends" button **Flow**: 1. User enters keywords → Intent analysis 2. `IntentConfirmationPanel` shows "Analyze Trends" button 3. User clicks → Fetches Google Trends data 4. Shows trends preview in panel 5. User proceeds with research **Pros**: - User control - Faster initial intent analysis - Can preview trends before research **Cons**: - Extra user action - Trends not integrated with research results **Implementation**: - Add button to `IntentConfirmationPanel` - Create endpoint: `POST /api/research/trends/analyze` - Show trends preview in panel #### Option 3: Separate Trends Tab (Alternative) ⭐⭐⭐ **When**: Always available as separate action **Flow**: 1. User enters keywords 2. "Trends" button always visible 3. Click → Opens trends analysis 4. Separate from main research flow **Pros**: - Clear separation - Can use independently - Simple UX **Cons**: - Not integrated with research - Extra navigation - Less discoverable --- ## ✅ Recommended Approach: Hybrid (Option 1 + Option 2) ### Primary: Automatic Integration **For intent-driven research**: - If `purpose == EXPLORE_TRENDS` OR `TRENDS in expected_deliverables`: - Automatically fetch Google Trends data - Include in research results - Display in "Trends" tab ### Secondary: On-Demand Button **For all research**: - Show "Analyze Trends" button in `IntentConfirmationPanel` - User can click to get trends even if not in intent - Preview trends before research execution ### User Experience: ``` ┌─────────────────────────────────────────────────────────┐ │ ResearchInput │ │ ┌───────────────────────────────────────────────────┐ │ │ │ Keywords: "AI marketing tools" │ │ │ │ [Intent & Options] │ │ │ └───────────────────────────────────────────────────┘ │ └─────────────────────────────────────────────────────────┘ │ ▼ ┌─────────────────────────────────────────────────────────┐ │ IntentConfirmationPanel │ │ ┌───────────────────────────────────────────────────┐ │ │ │ Intent: make_decision │ │ │ │ Deliverables: [comparisons, trends, statistics] │ │ │ │ │ │ │ │ [Analyze Trends] ← Always available │ │ │ │ [Research] ← Will auto-include trends │ │ │ └───────────────────────────────────────────────────┘ │ └─────────────────────────────────────────────────────────┘ │ ▼ ┌─────────────────────────────────────────────────────────┐ │ Research Execution │ │ ├── Exa/Tavily/Google search │ │ └── Google Trends (if trends in deliverables) ← AUTO │ └─────────────────────────────────────────────────────────┘ │ ▼ ┌─────────────────────────────────────────────────────────┐ │ IntentResultsDisplay │ │ ┌───────────────────────────────────────────────────┐ │ │ │ [Summary] [Statistics] [Quotes] [Trends] [Sources]│ │ │ │ │ │ │ │ Trends Tab: │ │ │ │ ├── Interest Over Time (Chart) │ │ │ │ ├── Interest by Region (Map/Table) │ │ │ │ ├── Related Topics (Top & Rising) │ │ │ │ ├── Related Queries (Top & Rising) │ │ │ │ └── AI-Extracted Trends (from research) │ │ │ └───────────────────────────────────────────────────┘ │ └─────────────────────────────────────────────────────────┘ ``` --- ## 🏗️ Implementation Plan ### Phase 1: Core Service (Week 1) **Create**: `backend/services/research/trends/google_trends_service.py` **Features**: - Interest over time - Interest by region - Related topics - Related queries - Proper error handling - Rate limiting - Caching (24-hour TTL) - Async support ### Phase 2: Integration (Week 1-2) **Enhance**: `IntentAwareAnalyzer` **Changes**: - Check if trends in expected_deliverables - Call Google Trends service - Merge with AI-extracted trends - Return enhanced trends data ### Phase 3: API Endpoint (Week 2) **Create**: `POST /api/research/trends/analyze` **Purpose**: On-demand trends analysis **Request**: ```json { "keywords": ["AI marketing tools"], "timeframe": "today 12-m", "geo": "US" } ``` **Response**: ```json { "interest_over_time": [...], "interest_by_region": [...], "related_topics": { "top": [...], "rising": [...] }, "related_queries": { "top": [...], "rising": [...] } } ``` ### Phase 4: Frontend Integration (Week 2-3) **Enhance**: `IntentConfirmationPanel` - Add "Analyze Trends" button - Show trends preview **Enhance**: `IntentResultsDisplay` - Enhance "Trends" tab with Google Trends data - Add charts (interest over time) - Add regional map/table - Show related topics/queries --- ## 📊 Data Structure Design ### Google Trends Response Model ```python class GoogleTrendsData(BaseModel): """Structured Google Trends data.""" interest_over_time: List[Dict[str, Any]] # Time series data interest_by_region: List[Dict[str, Any]] # Geographic data related_topics: Dict[str, List[Dict[str, Any]]] # {top: [...], rising: [...]} related_queries: Dict[str, List[Dict[str, Any]]] # {top: [...], rising: [...]} trending_searches: Optional[List[str]] = None timeframe: str geo: str keywords: List[str] ``` ### Enhanced TrendAnalysis Model ```python class TrendAnalysis(BaseModel): """Enhanced trend analysis with Google Trends data.""" trend: str direction: str evidence: List[str] impact: Optional[str] timeline: Optional[str] sources: List[str] # Google Trends specific google_trends_data: Optional[GoogleTrendsData] = None interest_score: Optional[float] = None # 0-100 from Google Trends regional_interest: Optional[Dict[str, float]] = None related_topics: Optional[List[str]] = None related_queries: Optional[List[str]] = None ``` --- ## 🔧 Technical Considerations ### Rate Limiting **Pytrends Limitations**: - Google Trends API is rate-limited - Recommended: 1 request per second - Pytrends handles some rate limiting internally **Our Strategy**: - Cache all trends data (24-hour TTL) - Use async requests with delays - Batch multiple keywords in single request when possible - Implement retry logic with exponential backoff ### Caching Strategy ```python # Cache key: f"google_trends:{keyword}:{timeframe}:{geo}" # TTL: 24 hours (trends don't change frequently) # Store: Interest over time, related topics/queries ``` ### Error Handling - Handle Google blocking (429 errors) - Handle invalid keywords - Handle missing data - Graceful degradation (return partial data if available) ### Async Support - Use `asyncio` for non-blocking requests - Parallel requests for multiple keywords - Timeout handling (30 seconds max) --- ## 📈 User Value ### For Content Creators: 1. **Timing Optimization**: - See interest over time to time publication - Identify peak interest periods - Avoid publishing during low-interest periods 2. **Regional Targeting**: - See which regions have highest interest - Tailor content for specific markets - Discover new audience opportunities 3. **Content Expansion**: - Related topics → new article ideas - Related queries → FAQ sections - Rising topics → timely content opportunities ### For Digital Marketers: 1. **Campaign Planning**: - Trending searches → campaign topics - Interest by region → geo-targeting - Related queries → ad keywords 2. **SEO Strategy**: - Related queries → long-tail keywords - Rising topics → content opportunities - Interest trends → content calendar ### For Solopreneurs: 1. **Market Research**: - Interest trends → market validation - Regional data → market expansion - Related topics → competitive landscape --- ## ✅ Success Criteria - [ ] Google Trends service created and tested - [ ] Automatic integration working (when trends in intent) - [ ] On-demand button working in IntentConfirmationPanel - [ ] Trends tab enhanced with Google Trends data - [ ] Charts displaying correctly (interest over time) - [ ] Regional data displaying correctly - [ ] Caching working (24-hour TTL) - [ ] Rate limiting preventing blocks - [ ] Error handling graceful - [ ] User satisfaction with trends feature --- ## 🚀 Quick Start Implementation ### Step 1: Create Service (2-3 days) ```python # backend/services/research/trends/google_trends_service.py class GoogleTrendsService: async def get_interest_over_time(keywords, timeframe, geo) async def get_interest_by_region(keywords, geo) async def get_related_topics(keywords, timeframe) async def get_related_queries(keywords, timeframe) async def get_trending_searches(country) ``` ### Step 2: Integrate with IntentAwareAnalyzer (1-2 days) - Check for trends in deliverables - Call Google Trends service - Merge with AI-extracted trends ### Step 3: Add API Endpoint (1 day) - `POST /api/research/trends/analyze` - Return structured trends data ### Step 4: Frontend Integration (2-3 days) - Add "Analyze Trends" button - Enhance Trends tab - Add charts/visualizations **Total Estimate**: 6-9 days for full implementation --- ## 📝 Next Steps 1. **Approve Approach**: Confirm hybrid approach (automatic + on-demand) 2. **Set Up Dependencies**: Add `pytrends>=4.9.2` to requirements.txt 3. **Create Service**: Start with `google_trends_service.py` 4. **Test Integration**: Test with sample keywords 5. **Frontend Integration**: Add UI components --- **Status**: Analysis Complete - Ready for Implementation **Recommended Action**: Start with Phase 1 (Core Service) - create `google_trends_service.py` with proper error handling, caching, and async support.