AI Analysis and Content Strategy fixes. Enhanced Strategy Routes refactoring.
@@ -0,0 +1,565 @@
# Codebase Organization & Service Reusability Analysis

**Date**: 2025-01-29
**Status**: Comprehensive Codebase Structure Analysis

---

## 📋 Overview

This document provides a comprehensive analysis of:
1. **Codebase Organization**: How features are organized across folders
2. **Service Architecture**: How Exa, Tavily, and Google Search services are structured
3. **Reusability Analysis**: Whether these services are reusable or tightly integrated

---

## 🏗️ Codebase Organization

### High-Level Structure

```
AI-Writer/
├── backend/
│   ├── api/              # API endpoints (FastAPI routers)
│   ├── services/         # Business logic & service layer
│   ├── models/           # Database models & schemas
│   ├── middleware/       # Request/response middleware
│   ├── utils/            # Utility functions
│   └── database/         # Database migrations
│
├── frontend/
│   └── src/
│       ├── components/   # React components
│       ├── services/     # Frontend API clients
│       ├── hooks/        # React hooks
│       └── utils/        # Frontend utilities
│
└── docs/                 # Documentation
```

---

## 📁 Feature Organization by Folder

### Backend Services (`backend/services/`)

#### **Research Services** (`backend/services/research/`)
**Purpose**: Core research engine and provider services

```
research/
├── core/                              # Core research engine (standalone)
│   ├── research_engine.py             # Main orchestrator
│   ├── research_context.py            # Unified input schema
│   └── parameter_optimizer.py         # AI-driven parameter optimization
│
├── intent/                            # Intent-driven research
│   ├── unified_research_analyzer.py   # Single AI call for intent+queries+params
│   ├── intent_aware_analyzer.py       # Result analysis based on intent
│   └── ...
│
├── trends/                            # Google Trends integration
│   └── google_trends_service.py
│
├── exa_service.py                     # ⭐ Reusable Exa API service
├── tavily_service.py                  # ⭐ Reusable Tavily API service
├── google_search_service.py           # ⭐ Reusable Google Search service
│
├── research_persona_service.py        # Persona generation/retrieval
└── research_persona_prompt_builder.py
```

**Key Features**:
- Standalone research engine (`ResearchEngine`)
- Provider services (Exa, Tavily, Google)
- Intent-driven research system
- Research persona system

---

#### **Blog Writer Services** (`backend/services/blog_writer/`)
**Purpose**: Blog content generation

```
blog_writer/
├── core/
│   └── blog_writer_service.py     # Main blog generation service
│
├── research/                      # Blog-specific research providers
│   ├── research_service.py        # Blog research orchestrator
│   ├── exa_provider.py            # Blog-specific Exa wrapper
│   ├── tavily_provider.py         # Blog-specific Tavily wrapper
│   ├── google_provider.py         # Blog-specific Google wrapper
│   └── research_strategies.py     # Research strategies per mode
│
├── outline/                       # Outline generation
├── content/                       # Content generation
└── seo/                           # SEO optimization
```

**Key Features**:
- Uses `services.research` services (reusable)
- Has blog-specific wrappers for providers
- Research strategies for different blog modes

---

#### **Other Feature Services**

| Service Folder | Purpose | Research Integration |
|---------------|---------|---------------------|
| `podcast/` | Podcast generation | Can use Research Engine |
| `story_writer/` | Story generation | Can use Research Engine |
| `youtube/` | YouTube content | Can use Research Engine |
| `linkedin/` | LinkedIn content | Uses GoogleSearchService |
| `onboarding/` | User onboarding | Uses ExaService for competitor discovery |
| `content_planning/` | Content planning | Can use Research Engine |
| `scheduler/` | Task scheduling | Can use Research Engine |

---

### Backend API (`backend/api/`)

#### **Research API** (`backend/api/research/`)
**Purpose**: Research endpoints

```
api/research/
├── router.py              # Main router
└── handlers/
    ├── providers.py       # Provider status endpoints
    ├── research.py        # Traditional research endpoints
    ├── intent.py          # Intent-driven endpoints
    └── projects.py        # My Projects endpoints
```

**Endpoints**:
- `POST /api/research/intent/analyze` - Intent analysis
- `POST /api/research/intent/research` - Intent-driven research
- `POST /api/research/execute` - Traditional research
- `GET /api/research/config` - Configuration
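
As a rough illustration of how a client might address one of the endpoints listed above, the sketch below builds (but does not send) a JSON `POST` to `/api/research/intent/analyze` using only the standard library. The base URL, the `user_input` payload field, and the helper name `build_intent_analyze_request` are assumptions made for this example; the real request schema and any authentication headers are not described in this document.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local development address


def build_intent_analyze_request(user_input: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the intent-analysis endpoint."""
    payload = {"user_input": user_input}  # hypothetical field name
    return urllib.request.Request(
        url=f"{BASE_URL}/api/research/intent/analyze",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_intent_analyze_request("AI tools for content marketing")
# With a running backend, the request could be sent via urllib.request.urlopen(request).
```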

---

#### **Other API Modules**

| API Folder | Purpose | Research Integration |
|-----------|---------|---------------------|
| `blog_writer/` | Blog endpoints | Uses blog_writer services |
| `podcast/` | Podcast endpoints | Can use Research Engine |
| `story_writer/` | Story endpoints | Can use Research Engine |
| `onboarding_utils/` | Onboarding utilities | Uses ExaService for competitor discovery |

---

### Frontend Components (`frontend/src/components/`)

#### **Research Components** (`frontend/src/components/Research/`)
**Purpose**: Research UI components

```
Research/
├── ResearchWizard.tsx              # Main wizard orchestrator
├── steps/
│   ├── ResearchInput.tsx           # Step 1: Input + Intent & Options
│   ├── StepProgress.tsx            # Step 2: Progress/polling
│   ├── StepResults.tsx             # Step 3: Results display
│   └── components/                 # Sub-components
│       ├── IntentConfirmationPanel.tsx
│       ├── IntentResultsDisplay.tsx
│       └── ...
├── hooks/
│   ├── useResearchWizard.ts        # Wizard state management
│   ├── useResearchExecution.ts     # Research execution
│   └── useIntentResearch.ts        # Intent research flow
└── types/
    ├── research.types.ts           # Research types
    └── intent.types.ts             # Intent types
```

---

## 🔌 Service Architecture: Exa, Tavily, Google Search

### Service Design Pattern

All three services follow a **similar design pattern**:

1. **Standalone Service Classes**: Each service is a self-contained class
2. **Lazy Initialization**: Services check for API keys on initialization
3. **Error Handling**: Graceful degradation when API keys are missing
4. **Standardized Interface**: Similar method signatures across services
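
The pattern above can be sketched in a few lines. This is a minimal illustration rather than code from the repository: `ExampleSearchService`, the `EXAMPLE_SEARCH_API_KEY` variable, and the shape of the returned dictionary are all hypothetical, and the body of `_try_initialize()` is an assumption about what such a method might do.

```python
import asyncio
import os
from typing import Any, Dict, Optional


class ExampleSearchService:
    """Hypothetical provider service illustrating the shared pattern."""

    ENV_VAR = "EXAMPLE_SEARCH_API_KEY"  # assumed name, for illustration only

    def __init__(self) -> None:
        # Environment-based configuration: no key is hardcoded.
        self.api_key: Optional[str] = os.getenv(self.ENV_VAR)
        self.enabled = False
        self._try_initialize()

    def _try_initialize(self) -> None:
        """Enable the service only when a key is configured; never raise."""
        self.enabled = bool(self.api_key)

    async def search(self, query: str) -> Dict[str, Any]:
        if not self.enabled:
            # Graceful degradation: report the problem instead of raising.
            return {
                "success": False,
                "error": f"{self.ENV_VAR} not configured",
                "results": [],
            }
        # A real service would call the provider API here.
        return {"success": True, "query": query, "results": []}


# Works whether or not the key is set; only the "success" flag differs.
result = asyncio.run(ExampleSearchService().search("ai writing tools"))
```

Because a missing key produces a disabled service instead of an exception, callers can construct every provider unconditionally and check `enabled` before choosing one.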

---

### 1. ExaService (`backend/services/research/exa_service.py`)

**Design**: ✅ **Reusable Service**

```python
import os
from typing import Any, Dict


class ExaService:
    """
    Service for competitor discovery and analysis using the Exa API.
    Uses neural search to find semantically similar websites and content.
    """

    def __init__(self):
        """Initialize with API credentials from environment."""
        self.api_key = os.getenv("EXA_API_KEY")
        self.exa = None
        self.enabled = False
        self._try_initialize()

    async def discover_competitors(...) -> Dict[str, Any]:
        """Discover competitors for a given website."""

    async def discover_social_media_accounts(...) -> Dict[str, Any]:
        """Discover social media accounts."""

    async def analyze_competitor_content(...) -> Dict[str, Any]:
        """Analyze competitor content."""
```

**Key Features**:
- ✅ **Standalone**: No dependencies on Research Engine
- ✅ **Reusable**: Can be imported by any module
- ✅ **Focused**: Primarily for competitor discovery
- ✅ **Flexible**: Supports various search parameters

**Current Usage**:
1. **Research Engine**: Uses it for research queries
2. **Onboarding**: Uses it for competitor discovery (Step 3)
3. **Blog Writer**: Uses it via a blog-specific wrapper (`exa_provider.py`)

---

### 2. TavilyService (`backend/services/research/tavily_service.py`)

**Design**: ✅ **Reusable Service**

```python
import os
from typing import Any, Dict


class TavilyService:
    """
    Service for web search and research using the Tavily API.
    Provides AI-powered search with real-time information retrieval.
    """

    def __init__(self):
        """Initialize with API credentials from environment."""
        self.api_key = os.getenv("TAVILY_API_KEY")
        self.base_url = "https://api.tavily.com"
        self.enabled = False
        self._try_initialize()

    async def search(...) -> Dict[str, Any]:
        """Execute a search query using Tavily API."""

    async def search_industry_trends(...) -> Dict[str, Any]:
        """Search for current industry trends."""

    async def discover_competitors(...) -> Dict[str, Any]:
        """Discover competitors using Tavily search."""
```

**Key Features**:
- ✅ **Standalone**: No dependencies on Research Engine
- ✅ **Reusable**: Can be imported by any module
- ✅ **Flexible**: Supports various search parameters (topic, depth, time_range, etc.)
- ✅ **Real-time**: Optimized for current information

**Current Usage**:
1. **Research Engine**: Uses it for research queries
2. **Blog Writer**: Uses it via a blog-specific wrapper (`tavily_provider.py`)

---

### 3. GoogleSearchService (`backend/services/research/google_search_service.py`)

**Design**: ✅ **Reusable Service**

```python
import os
from typing import Any, Dict, List


class GoogleSearchService:
    """
    Service for conducting real industry research using Google Custom Search API.
    Provides current, relevant industry information for content grounding.
    """

    def __init__(self):
        """Initialize with API credentials from environment."""
        self.api_key = os.getenv("GOOGLE_SEARCH_API_KEY")
        self.search_engine_id = os.getenv("GOOGLE_SEARCH_ENGINE_ID")
        self.enabled = False

    async def search_industry_trends(...) -> List[Dict[str, Any]]:
        """Search for current industry trends and insights."""
```

**Key Features**:
- ✅ **Standalone**: No dependencies on Research Engine
- ✅ **Reusable**: Can be imported by any module
- ✅ **Focused**: Industry trend research
- ✅ **Credibility Scoring**: Built-in source credibility assessment

**Current Usage**:
1. **Research Engine**: Uses it as a fallback provider
2. **LinkedIn Service**: Uses it for industry research

---

## 🔄 Reusability Analysis

### ✅ **Services ARE Reusable**

All three services (Exa, Tavily, Google Search) are **designed to be reusable**:

#### **Evidence of Reusability**:

1. **Standalone Design**:
   - No dependencies on Research Engine
   - Self-contained initialization
   - Independent error handling

2. **Multiple Usage Points**:
   ```python
   # Used in Research Engine
   from services.research.exa_service import ExaService

   # Used in Onboarding
   from services.research.exa_service import ExaService

   # Used in Blog Writer (via wrapper)
   from services.research.tavily_service import TavilyService

   # Used in LinkedIn Service
   from services.research import GoogleSearchService
   ```

3. **Standardized Interface**:
   - Similar method signatures
   - Consistent return formats
   - Environment-based configuration

4. **Export Structure**:
   ```python
   # backend/services/research/__init__.py
   from .google_search_service import GoogleSearchService
   from .exa_service import ExaService
   from .tavily_service import TavilyService

   __all__ = [
       "GoogleSearchService",
       "ExaService",
       "TavilyService",
       # ... other exports
   ]
   ```

---

### ⚠️ **Integration Patterns**

While services are reusable, they are used in different ways:

#### **1. Direct Usage** (Most Reusable)
```python
# Direct import and use (the await must run inside an async function)
from services.research.exa_service import ExaService

exa = ExaService()
result = await exa.discover_competitors(user_url)
```

**Used By**:
- Onboarding (competitor discovery)
- Research Engine (research queries)

---

#### **2. Wrapper Pattern** (Blog Writer)
```python
# Blog Writer uses wrappers for blog-specific logic
from services.research.tavily_service import TavilyService

class TavilyResearchProvider:
    def __init__(self):
        self.tavily = TavilyService()  # Reuses service

    async def search(self, prompt, topic, ...):
        # Blog-specific logic + TavilyService
        return await self.tavily.search(...)
```

**Why Wrappers?**:
- Blog-specific research strategies
- Blog-specific result formatting
- Blog-specific error handling
- Maintains compatibility with existing blog writer code

**Location**: `backend/services/blog_writer/research/tavily_provider.py`

---

#### **3. Engine Orchestration** (Research Engine)
```python
# Research Engine orchestrates providers
from services.research.exa_service import ExaService
from services.research.tavily_service import TavilyService
from services.research.google_search_service import GoogleSearchService

class ResearchEngine:
    def __init__(self):
        self._exa_provider = ExaService()
        self._tavily_provider = TavilyService()
        self._google_provider = GoogleSearchService()

    async def research(self, context: ResearchContext):
        # Orchestrates providers based on priority
        if self.exa_available:
            return await self._exa_provider.search(...)
        elif self.tavily_available:
            return await self._tavily_provider.search(...)
        else:
            return await self._google_provider.search_industry_trends(...)
```

**Why Orchestration?**:
- Provider priority management
- Fallback logic
- Unified interface for all tools
- Research persona integration
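
Priority management and fallback can be made concrete with a small, self-contained sketch. It is not the `ResearchEngine` implementation: the function `search_with_fallback`, the tuple layout, and the stand-in coroutines are hypothetical, and falling through to the next provider when one raises is an assumption that goes beyond the availability checks shown above.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List, Tuple

SearchFn = Callable[[str], Awaitable[Dict[str, Any]]]


async def search_with_fallback(
    query: str, providers: List[Tuple[str, bool, SearchFn]]
) -> Dict[str, Any]:
    """Try providers in priority order; skip disabled ones, fall through on errors."""
    errors: Dict[str, str] = {}
    for name, enabled, search in providers:
        if not enabled:
            errors[name] = "not configured"
            continue
        try:
            result = await search(query)
            return {"provider": name, "result": result, "errors": errors}
        except Exception as exc:  # a failed provider should not abort the chain
            errors[name] = str(exc)
    return {"provider": None, "result": None, "errors": errors}


# Stand-in coroutines; real code would call the provider services instead.
async def failing_search(query: str) -> Dict[str, Any]:
    raise RuntimeError("rate limited")


async def working_search(query: str) -> Dict[str, Any]:
    return {"query": query, "sources": []}


outcome = asyncio.run(
    search_with_fallback(
        "ai content tools",
        [
            ("exa", False, working_search),     # disabled: skipped
            ("tavily", True, failing_search),   # enabled but fails
            ("google", True, working_search),   # last resort succeeds
        ],
    )
)
```

Recording why each earlier provider was skipped keeps the fallback observable, which helps when a missing API key silently shifts all traffic to the last provider.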

---

## 📊 Service Reusability Matrix

| Service | Standalone | Reusable | Current Usage | Integration Pattern |
|---------|-----------|----------|---------------|-------------------|
| **ExaService** | ✅ Yes | ✅ Yes | Research Engine, Onboarding, Blog Writer | Direct + Wrapper |
| **TavilyService** | ✅ Yes | ✅ Yes | Research Engine, Blog Writer | Direct + Wrapper |
| **GoogleSearchService** | ✅ Yes | ✅ Yes | Research Engine, LinkedIn Service | Direct |

---

## 🎯 Key Insights

### ✅ **Services Are Reusable**

1. **No Tight Coupling**: Services don't depend on Research Engine
2. **Standardized Interface**: Consistent method signatures
3. **Multiple Usage Points**: Used across different modules
4. **Environment-Based Config**: No hardcoded dependencies

### ⚠️ **Integration Patterns Vary**

1. **Direct Usage**: Simple import and use (most reusable)
2. **Wrapper Pattern**: Blog-specific wrappers (maintains compatibility)
3. **Engine Orchestration**: Research Engine coordinates providers (unified interface)

### 🔄 **Architecture Evolution**

**Current State**:
- Services are reusable ✅
- Research Engine provides unified interface ✅
- Blog Writer uses wrappers for compatibility ✅

**Future Recommendations**:
- Consider migrating Blog Writer to use Research Engine directly
- Standardize on Research Engine for all tools
- Keep services as low-level building blocks
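
One way to keep the services as low-level building blocks while standardizing callers is a shared structural interface. In the listings above, `GoogleSearchService` exposes `search_industry_trends()` returning a list, while `TavilyService.search()` returns a dictionary, so a thin adapter may be needed. The sketch below is an assumption-laden illustration: `SearchProvider`, `TrendsOnlyService`, `TrendsAdapter`, and `run_search` are hypothetical names that do not exist in the codebase described here.

```python
import asyncio
from typing import Any, Dict, List, Protocol


class SearchProvider(Protocol):
    """Hypothetical common interface; not defined in the codebase described here."""

    enabled: bool

    async def search(self, query: str) -> Dict[str, Any]: ...


class TrendsOnlyService:
    """Stand-in for a service that only exposes search_industry_trends()."""

    enabled = True

    async def search_industry_trends(self, topic: str) -> List[Dict[str, Any]]:
        return [{"title": f"Trends in {topic}", "url": "https://example.com"}]


class TrendsAdapter:
    """Adapts the list-returning trends method to the dict-returning search()."""

    def __init__(self, service: TrendsOnlyService) -> None:
        self._service = service
        self.enabled = service.enabled

    async def search(self, query: str) -> Dict[str, Any]:
        items = await self._service.search_industry_trends(query)
        return {"query": query, "results": items}


async def run_search(provider: SearchProvider, query: str) -> Dict[str, Any]:
    """Callers depend only on the protocol, not on a concrete service."""
    return await provider.search(query)


adapted = asyncio.run(run_search(TrendsAdapter(TrendsOnlyService()), "fintech"))
```

With such an interface, an orchestrator could treat every provider uniformly, and the underlying service classes would not need to change.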

---

## 📝 Usage Examples

### Example 1: Direct Usage (Onboarding)

```python
# backend/api/onboarding_utils/step3_research_service.py
from services.research.exa_service import ExaService

exa_service = ExaService()
result = await exa_service.discover_competitors(
    user_url=user_url,
    num_results=10,
    industry_context=industry
)
```

### Example 2: Wrapper Pattern (Blog Writer)

```python
# backend/services/blog_writer/research/tavily_provider.py
from services.research.tavily_service import TavilyService

class TavilyResearchProvider:
    def __init__(self):
        self.tavily = TavilyService()  # Reuses service

    async def search(self, research_prompt, topic, industry, ...):
        # Blog-specific query building
        query = self._build_blog_query(research_prompt, topic, industry)

        # Use TavilyService
        result = await self.tavily.search(
            query=query,
            topic="general",
            search_depth="advanced",
            max_results=config.max_sources
        )

        # Blog-specific result formatting
        return self._format_blog_results(result)
```

### Example 3: Engine Orchestration (Research Engine)

```python
# backend/services/research/core/research_engine.py
from services.research.exa_service import ExaService
from services.research.tavily_service import TavilyService

class ResearchEngine:
    def __init__(self):
        self._exa_provider = ExaService()
        self._tavily_provider = TavilyService()

    async def research(self, context: ResearchContext, user_id: str):
        # Get optimized config
        config = self.optimizer.optimize(context)

        # Execute based on provider priority
        if config.provider == ResearchProvider.EXA:
            return await self._execute_exa_research(context, config, user_id)
        elif config.provider == ResearchProvider.TAVILY:
            return await self._execute_tavily_research(context, config, user_id)
        else:
            return await self._execute_google_research(context, config, user_id)
```

---

## ✅ Conclusion

### **Services ARE Reusable** ✅

- **ExaService**: ✅ Reusable, used in Research Engine, Onboarding, Blog Writer
- **TavilyService**: ✅ Reusable, used in Research Engine, Blog Writer
- **GoogleSearchService**: ✅ Reusable, used in Research Engine, LinkedIn Service

### **Integration Patterns**:

1. **Direct Usage**: Simple import and use (most reusable)
2. **Wrapper Pattern**: Blog-specific wrappers (maintains compatibility)
3. **Engine Orchestration**: Research Engine coordinates providers (unified interface)

### **Architecture Benefits**:

- ✅ **Modularity**: Services are independent building blocks
- ✅ **Reusability**: Can be used by any module
- ✅ **Flexibility**: Different integration patterns for different needs
- ✅ **Maintainability**: Changes to services don't break consumers

---

**Status**: Services are well-designed for reusability with flexible integration patterns 🚀