Codebase Organization & Service Reusability Analysis
Date: 2025-01-29
Status: Comprehensive Codebase Structure Analysis
📋 Overview
This document provides a comprehensive analysis of:
- Codebase Organization: How features are organized across folders
- Service Architecture: How Exa, Tavily, and Google Search services are structured
- Reusability Analysis: Whether these services are reusable or tightly integrated
🏗️ Codebase Organization
High-Level Structure
AI-Writer/
├── backend/
│ ├── api/ # API endpoints (FastAPI routers)
│ ├── services/ # Business logic & service layer
│ ├── models/ # Database models & schemas
│ ├── middleware/ # Request/response middleware
│ ├── utils/ # Utility functions
│ └── database/ # Database migrations
│
├── frontend/
│ └── src/
│     ├── components/ # React components
│     ├── services/ # Frontend API clients
│     ├── hooks/ # React hooks
│     └── utils/ # Frontend utilities
│
└── docs/ # Documentation
📁 Feature Organization by Folder
Backend Services (backend/services/)
Research Services (backend/services/research/)
Purpose: Core research engine and provider services
research/
├── core/ # Core research engine (standalone)
│ ├── research_engine.py # Main orchestrator
│ ├── research_context.py # Unified input schema
│ └── parameter_optimizer.py # AI-driven parameter optimization
│
├── intent/ # Intent-driven research
│ ├── unified_research_analyzer.py # Single AI call for intent+queries+params
│ ├── intent_aware_analyzer.py # Result analysis based on intent
│ └── ...
│
├── trends/ # Google Trends integration
│ └── google_trends_service.py
│
├── exa_service.py # ⭐ Reusable Exa API service
├── tavily_service.py # ⭐ Reusable Tavily API service
├── google_search_service.py # ⭐ Reusable Google Search service
│
├── research_persona_service.py # Persona generation/retrieval
└── research_persona_prompt_builder.py
Key Features:
- Standalone research engine (ResearchEngine)
- Provider services (Exa, Tavily, Google)
- Intent-driven research system
- Research persona system
Blog Writer Services (backend/services/blog_writer/)
Purpose: Blog content generation
blog_writer/
├── core/
│ └── blog_writer_service.py # Main blog generation service
│
├── research/ # Blog-specific research providers
│ ├── research_service.py # Blog research orchestrator
│ ├── exa_provider.py # Blog-specific Exa wrapper
│ ├── tavily_provider.py # Blog-specific Tavily wrapper
│ ├── google_provider.py # Blog-specific Google wrapper
│ └── research_strategies.py # Research strategies per mode
│
├── outline/ # Outline generation
├── content/ # Content generation
└── seo/ # SEO optimization
Key Features:
- Uses services.research services (reusable)
- Has blog-specific wrappers for providers
- Research strategies for different blog modes
Other Feature Services
| Service Folder | Purpose | Research Integration |
|---|---|---|
| podcast/ | Podcast generation | Can use Research Engine |
| story_writer/ | Story generation | Can use Research Engine |
| youtube/ | YouTube content | Can use Research Engine |
| linkedin/ | LinkedIn content | Uses GoogleSearchService |
| onboarding/ | User onboarding | Uses ExaService for competitor discovery |
| content_planning/ | Content planning | Can use Research Engine |
| scheduler/ | Task scheduling | Can use Research Engine |
Backend API (backend/api/)
Research API (backend/api/research/)
Purpose: Research endpoints
api/research/
├── router.py # Main router
└── handlers/
    ├── providers.py # Provider status endpoints
    ├── research.py # Traditional research endpoints
    ├── intent.py # Intent-driven endpoints
    └── projects.py # My Projects endpoints
Endpoints:
- POST /api/research/intent/analyze - Intent analysis
- POST /api/research/intent/research - Intent-driven research
- POST /api/research/execute - Traditional research
- GET /api/research/config - Configuration
Other API Modules
| API Folder | Purpose | Research Integration |
|---|---|---|
| blog_writer/ | Blog endpoints | Uses blog_writer services |
| podcast/ | Podcast endpoints | Can use Research Engine |
| story_writer/ | Story endpoints | Can use Research Engine |
| onboarding_utils/ | Onboarding utilities | Uses ExaService for competitor discovery |
Frontend Components (frontend/src/components/)
Research Components (frontend/src/components/Research/)
Purpose: Research UI components
Research/
├── ResearchWizard.tsx # Main wizard orchestrator
├── steps/
│ ├── ResearchInput.tsx # Step 1: Input + Intent & Options
│ ├── StepProgress.tsx # Step 2: Progress/polling
│ ├── StepResults.tsx # Step 3: Results display
│ └── components/ # Sub-components
│     ├── IntentConfirmationPanel.tsx
│     ├── IntentResultsDisplay.tsx
│     └── ...
├── hooks/
│ ├── useResearchWizard.ts # Wizard state management
│ ├── useResearchExecution.ts # Research execution
│ └── useIntentResearch.ts # Intent research flow
└── types/
    ├── research.types.ts # Research types
    └── intent.types.ts # Intent types
🔌 Service Architecture: Exa, Tavily, Google Search
Service Design Pattern
All three services follow a similar design pattern:
- Standalone Service Classes: Each service is a self-contained class
- Lazy Initialization: Each service reads its API key from the environment in __init__ and starts with enabled = False, so a missing key leaves the service disabled
- Error Handling: Graceful degradation when API keys are missing
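- Standardized Interface: Async methods with similar signatures across services, most returning Dict[str, Any] (GoogleSearchService.search_industry_trends returns List[Dict[str, Any]])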
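The pattern above can be sketched as follows. This is a minimal illustration rather than the project's code: `DemoSearchService`, the `DEMO_API_KEY` variable, and the shape of the returned dictionaries are assumptions made for the example.

```python
import asyncio
import os
from typing import Any, Dict


class DemoSearchService:
    """Hypothetical provider service that mirrors the pattern described above."""

    def __init__(self) -> None:
        # Read credentials from the environment; never raise when they are missing.
        self.api_key = os.getenv("DEMO_API_KEY")  # hypothetical variable name
        self.enabled = False
        self._try_initialize()

    def _try_initialize(self) -> None:
        # Enable the service only when a non-empty key is present.
        self.enabled = bool(self.api_key)

    async def search(self, query: str) -> Dict[str, Any]:
        # Graceful degradation: report the problem instead of raising.
        if not self.enabled:
            return {"success": False, "error": "DEMO_API_KEY not configured", "results": []}
        # A real service would call the provider's API here.
        return {"success": True, "query": query, "results": []}
```

A caller can check `enabled` before choosing a provider, or simply inspect the `success` flag of the returned dictionary.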
1. ExaService (backend/services/research/exa_service.py)
Design: ✅ Reusable Service
class ExaService:
    """
    Service for competitor discovery and analysis using the Exa API.
    Uses neural search to find semantically similar websites and content.
    """

    def __init__(self):
        """Initialize with API credentials from environment."""
        self.api_key = os.getenv("EXA_API_KEY")
        self.exa = None
        self.enabled = False
        self._try_initialize()

    async def discover_competitors(...) -> Dict[str, Any]:
        """Discover competitors for a given website."""

    async def discover_social_media_accounts(...) -> Dict[str, Any]:
        """Discover social media accounts."""

    async def analyze_competitor_content(...) -> Dict[str, Any]:
        """Analyze competitor content."""
Key Features:
- ✅ Standalone: No dependencies on Research Engine
- ✅ Reusable: Can be imported by any module
- ✅ Focused: Primarily for competitor discovery
- ✅ Flexible: Supports various search parameters
Current Usage:
- Research Engine: Uses for research queries
- Onboarding: Uses for competitor discovery (Step 3)
- Blog Writer: Uses via blog-specific wrapper (
exa_provider.py)
2. TavilyService (backend/services/research/tavily_service.py)
Design: ✅ Reusable Service
class TavilyService:
    """
    Service for web search and research using the Tavily API.
    Provides AI-powered search with real-time information retrieval.
    """

    def __init__(self):
        """Initialize with API credentials from environment."""
        self.api_key = os.getenv("TAVILY_API_KEY")
        self.base_url = "https://api.tavily.com"
        self.enabled = False
        self._try_initialize()

    async def search(...) -> Dict[str, Any]:
        """Execute a search query using Tavily API."""

    async def search_industry_trends(...) -> Dict[str, Any]:
        """Search for current industry trends."""

    async def discover_competitors(...) -> Dict[str, Any]:
        """Discover competitors using Tavily search."""
Key Features:
- ✅ Standalone: No dependencies on Research Engine
- ✅ Reusable: Can be imported by any module
- ✅ Flexible: Supports various search parameters (topic, depth, time_range, etc.)
- ✅ Real-time: Optimized for current information
Current Usage:
- Research Engine: Uses for research queries
- Blog Writer: Uses via blog-specific wrapper (
tavily_provider.py)
3. GoogleSearchService (backend/services/research/google_search_service.py)
Design: ✅ Reusable Service
class GoogleSearchService:
    """
    Service for conducting real industry research using Google Custom Search API.
    Provides current, relevant industry information for content grounding.
    """

    def __init__(self):
        """Initialize with API credentials from environment."""
        self.api_key = os.getenv("GOOGLE_SEARCH_API_KEY")
        self.search_engine_id = os.getenv("GOOGLE_SEARCH_ENGINE_ID")
        self.enabled = False

    async def search_industry_trends(...) -> List[Dict[str, Any]]:
        """Search for current industry trends and insights."""
Key Features:
- ✅ Standalone: No dependencies on Research Engine
- ✅ Reusable: Can be imported by any module
- ✅ Focused: Industry trend research
- ✅ Credibility Scoring: Built-in source credibility assessment
Current Usage:
- Research Engine: Uses as fallback provider
- LinkedIn Service: Uses for industry research
🔄 Reusability Analysis
✅ Services ARE Reusable
All three services (Exa, Tavily, Google Search) are designed to be reusable:
Evidence of Reusability:
- Standalone Design:
  - No dependencies on Research Engine
  - Self-contained initialization
  - Independent error handling
- Multiple Usage Points:
    # Used in Research Engine
    from services.research.exa_service import ExaService

    # Used in Onboarding
    from services.research.exa_service import ExaService

    # Used in Blog Writer (via wrapper)
    from services.research.tavily_service import TavilyService

    # Used in LinkedIn Service
    from services.research import GoogleSearchService
- Standardized Interface:
  - Similar method signatures
  - Consistent return formats
  - Environment-based configuration
- Export Structure:
    # backend/services/research/__init__.py
    from .google_search_service import GoogleSearchService
    from .exa_service import ExaService
    from .tavily_service import TavilyService

    __all__ = [
        "GoogleSearchService",
        "ExaService",
        "TavilyService",
        # ... other exports
    ]
⚠️ Integration Patterns
While services are reusable, they are used in different ways:
1. Direct Usage (Most Reusable)
# Direct import and use
from services.research.exa_service import ExaService
exa = ExaService()
result = await exa.discover_competitors(user_url)
Used By:
- Onboarding (competitor discovery)
- Research Engine (research queries)
2. Wrapper Pattern (Blog Writer)
# Blog Writer uses wrappers for blog-specific logic
from services.research.tavily_service import TavilyService

class TavilyResearchProvider:
    def __init__(self):
        self.tavily = TavilyService()  # Reuses service

    async def search(self, prompt, topic, ...):
        # Blog-specific logic + TavilyService
        return await self.tavily.search(...)
Why Wrappers?:
- Blog-specific research strategies
- Blog-specific result formatting
- Blog-specific error handling
- Maintains compatibility with existing blog writer code
Location: backend/services/blog_writer/research/tavily_provider.py
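To make the wrapper idea concrete, here is a small runnable sketch. It does not reproduce tavily_provider.py: `StubSearchService`, `BlogSearchProvider`, the result fields, and the citation format are all hypothetical stand-ins chosen so the example runs without network access.

```python
import asyncio
from typing import Any, Dict, List


class StubSearchService:
    """Stand-in for a reusable provider service such as TavilyService."""

    async def search(self, query: str, max_results: int = 5) -> Dict[str, Any]:
        # Return canned results so the sketch runs without network access.
        hits = [
            {"title": f"Result {i}", "url": f"https://example.com/{i}"}
            for i in range(max_results)
        ]
        return {"success": True, "results": hits}


class BlogSearchProvider:
    """Hypothetical wrapper: adds blog-specific query building and formatting."""

    def __init__(self, service: StubSearchService) -> None:
        # The wrapper reuses the service; it does not replace it.
        self._service = service

    async def search(self, topic: str, industry: str, max_sources: int = 3) -> List[str]:
        query = f"{topic} in the {industry} industry"  # blog-specific query building
        raw = await self._service.search(query, max_results=max_sources)
        if not raw.get("success"):
            return []  # blog-specific error handling: empty source list
        # Blog-specific formatting: one citation string per source.
        return [f"{hit['title']} <{hit['url']}>" for hit in raw["results"]]
```

Passing the service into the constructor, rather than creating it inside the wrapper, is a design choice of this sketch; it lets tests substitute a stub for the real provider.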
3. Engine Orchestration (Research Engine)
# Research Engine orchestrates providers
from services.research.exa_service import ExaService
from services.research.tavily_service import TavilyService
from services.research.google_search_service import GoogleSearchService

class ResearchEngine:
    def __init__(self):
        self._exa_provider = ExaService()
        self._tavily_provider = TavilyService()
        self._google_provider = GoogleSearchService()

    async def research(self, context: ResearchContext):
        # Orchestrates providers based on priority
        if self.exa_available:
            return await self._exa_provider.search(...)
        elif self.tavily_available:
            return await self._tavily_provider.search(...)
        else:
            return await self._google_provider.search_industry_trends(...)
Why Orchestration?:
- Provider priority management
- Fallback logic
- Unified interface for all tools
- Research persona integration
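Provider priority and fallback can be sketched as a loop over an ordered provider list. This is an illustration under stated assumptions: `StubProvider` and `research_with_fallback` are hypothetical names, and falling through to the next provider when a request raises is an extension of this sketch, since the snippet above only checks availability.

```python
import asyncio
from typing import Any, Dict, List


class StubProvider:
    """Stand-in for ExaService, TavilyService, or GoogleSearchService."""

    def __init__(self, name: str, enabled: bool, fail: bool = False) -> None:
        self.name = name
        self.enabled = enabled
        self._fail = fail

    async def search(self, query: str) -> Dict[str, Any]:
        if self._fail:
            raise RuntimeError(f"{self.name} request failed")
        return {"provider": self.name, "query": query, "results": []}


async def research_with_fallback(query: str, providers: List[StubProvider]) -> Dict[str, Any]:
    """Try providers in priority order; skip disabled ones and fall through on errors."""
    errors: List[str] = []
    for provider in providers:
        if not provider.enabled:
            continue
        try:
            return await provider.search(query)
        except Exception as exc:  # broad on purpose: any provider failure triggers fallback
            errors.append(str(exc))
    # No provider produced a result; return an empty payload with the collected errors.
    return {"provider": None, "query": query, "results": [], "errors": errors}
```

The list order encodes priority, so changing the preferred provider means reordering the list rather than editing an if/elif chain.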
📊 Service Reusability Matrix
| Service | Standalone | Reusable | Current Usage | Integration Pattern |
|---|---|---|---|---|
| ExaService | ✅ Yes | ✅ Yes | Research Engine, Onboarding, Blog Writer | Direct + Wrapper |
| TavilyService | ✅ Yes | ✅ Yes | Research Engine, Blog Writer | Direct + Wrapper |
| GoogleSearchService | ✅ Yes | ✅ Yes | Research Engine, LinkedIn Service | Direct |
🎯 Key Insights
✅ Services Are Reusable
- No Tight Coupling: Services don't depend on Research Engine
- Standardized Interface: Consistent method signatures
- Multiple Usage Points: Used across different modules
- Environment-Based Config: No hardcoded dependencies
⚠️ Integration Patterns Vary
- Direct Usage: Simple import and use (most reusable)
- Wrapper Pattern: Blog-specific wrappers (maintains compatibility)
- Engine Orchestration: Research Engine coordinates providers (unified interface)
🔄 Architecture Evolution
Current State:
- Services are reusable ✅
- Research Engine provides unified interface ✅
- Blog Writer uses wrappers for compatibility ✅
Future Recommendations:
- Consider migrating Blog Writer to use Research Engine directly
- Standardize on Research Engine for all tools
- Keep services as low-level building blocks
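One way to keep services as low-level building blocks while standardizing on a single engine is a thin adapter layer. The sketch below is an assumption about how that could look, not an existing part of the codebase: `SearchProvider`, `StubTrendService`, `TrendServiceAdapter`, and `run_search` are hypothetical names.

```python
import asyncio
from typing import Any, Dict, List, Protocol


class SearchProvider(Protocol):
    """Hypothetical common interface an engine could depend on."""

    enabled: bool

    async def search(self, query: str) -> Dict[str, Any]: ...


class StubTrendService:
    """Stand-in for a service whose method returns a list of results."""

    enabled = True

    async def search_industry_trends(self, topic: str) -> List[Dict[str, Any]]:
        return [{"title": f"{topic} trend", "url": "https://example.com/trend"}]


class TrendServiceAdapter:
    """Adapts the list-returning service to the dict-returning SearchProvider shape."""

    def __init__(self, service: StubTrendService) -> None:
        self._service = service
        self.enabled = service.enabled

    async def search(self, query: str) -> Dict[str, Any]:
        items = await self._service.search_industry_trends(query)
        return {"success": True, "results": items}


async def run_search(provider: SearchProvider, query: str) -> Dict[str, Any]:
    # Callers depend only on the protocol, not on a concrete service class.
    return await provider.search(query)
```

With this shape the underlying service stays unchanged, and only the adapter needs to know that one provider returns a list while the others return dictionaries.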
📝 Usage Examples
Example 1: Direct Usage (Onboarding)
# backend/api/onboarding_utils/step3_research_service.py
from services.research.exa_service import ExaService

exa_service = ExaService()
result = await exa_service.discover_competitors(
    user_url=user_url,
    num_results=10,
    industry_context=industry
)
Example 2: Wrapper Pattern (Blog Writer)
# backend/services/blog_writer/research/tavily_provider.py
from services.research.tavily_service import TavilyService

class TavilyResearchProvider:
    def __init__(self):
        self.tavily = TavilyService()  # Reuses service

    async def search(self, research_prompt, topic, industry, ...):
        # Blog-specific query building
        query = self._build_blog_query(research_prompt, topic, industry)
        # Use TavilyService
        result = await self.tavily.search(
            query=query,
            topic="general",
            search_depth="advanced",
            max_results=config.max_sources
        )
        # Blog-specific result formatting
        return self._format_blog_results(result)
Example 3: Engine Orchestration (Research Engine)
# backend/services/research/core/research_engine.py
from services.research.exa_service import ExaService
from services.research.tavily_service import TavilyService

class ResearchEngine:
    def __init__(self):
        self._exa_provider = ExaService()
        self._tavily_provider = TavilyService()

    async def research(self, context: ResearchContext, user_id: str):
        # Get optimized config
        config = self.optimizer.optimize(context)
        # Execute based on provider priority
        if config.provider == ResearchProvider.EXA:
            return await self._execute_exa_research(context, config, user_id)
        elif config.provider == ResearchProvider.TAVILY:
            return await self._execute_tavily_research(context, config, user_id)
        else:
            return await self._execute_google_research(context, config, user_id)
✅ Conclusion
Services ARE Reusable ✅
- ExaService: ✅ Reusable, used in Research Engine, Onboarding, Blog Writer
- TavilyService: ✅ Reusable, used in Research Engine, Blog Writer
- GoogleSearchService: ✅ Reusable, used in Research Engine, LinkedIn Service
Integration Patterns:
- Direct Usage: Simple import and use (most reusable)
- Wrapper Pattern: Blog-specific wrappers (maintains compatibility)
- Engine Orchestration: Research Engine coordinates providers (unified interface)
Architecture Benefits:
- ✅ Modularity: Services are independent building blocks
- ✅ Reusability: Can be used by any module
- ✅ Flexibility: Different integration patterns for different needs
- ✅ Maintainability: Changes inside a service stay behind its public methods, so consumers are unaffected as long as those signatures hold
Status: Services are well-designed for reusability with flexible integration patterns 🚀