kunthawat/ALwrity

ajaysi 8193cdba67 AI Analysis and Content Strategy fixes. Enhanced Strategy Routes refactoring.

2026-01-10 19:32:50 +05:30

Codebase Organization & Service Reusability Analysis

Date: 2025-01-29
Status: Comprehensive Codebase Structure Analysis

📋 Overview

This document provides a comprehensive analysis of:

Codebase Organization: How features are organized across folders
Service Architecture: How Exa, Tavily, and Google Search services are structured
Reusability Analysis: Whether these services are reusable or tightly integrated

🏗️ Codebase Organization

High-Level Structure

AI-Writer/
├── backend/
│   ├── api/                    # API endpoints (FastAPI routers)
│   ├── services/               # Business logic & service layer
│   ├── models/                 # Database models & schemas
│   ├── middleware/             # Request/response middleware
│   ├── utils/                  # Utility functions
│   └── database/               # Database migrations
│
├── frontend/
│   └── src/
│       ├── components/         # React components
│       ├── services/           # Frontend API clients
│       ├── hooks/              # React hooks
│       └── utils/              # Frontend utilities
│
└── docs/                       # Documentation

📁 Feature Organization by Folder

Backend Services (`backend/services/`)

Research Services (`backend/services/research/`)

Purpose: Core research engine and provider services

research/
├── core/                        # Core research engine (standalone)
│   ├── research_engine.py       # Main orchestrator
│   ├── research_context.py      # Unified input schema
│   └── parameter_optimizer.py   # AI-driven parameter optimization
│
├── intent/                      # Intent-driven research
│   ├── unified_research_analyzer.py  # Single AI call for intent+queries+params
│   ├── intent_aware_analyzer.py      # Result analysis based on intent
│   └── ...
│
├── trends/                      # Google Trends integration
│   └── google_trends_service.py
│
├── exa_service.py               # ⭐ Reusable Exa API service
├── tavily_service.py            # ⭐ Reusable Tavily API service
├── google_search_service.py     # ⭐ Reusable Google Search service
│
├── research_persona_service.py  # Persona generation/retrieval
└── research_persona_prompt_builder.py

Key Features:

Standalone research engine (ResearchEngine)
Provider services (Exa, Tavily, Google)
Intent-driven research system
Research persona system

Blog Writer Services (`backend/services/blog_writer/`)

Purpose: Blog content generation

blog_writer/
├── core/
│   └── blog_writer_service.py   # Main blog generation service
│
├── research/                    # Blog-specific research providers
│   ├── research_service.py      # Blog research orchestrator
│   ├── exa_provider.py          # Blog-specific Exa wrapper
│   ├── tavily_provider.py       # Blog-specific Tavily wrapper
│   ├── google_provider.py       # Blog-specific Google wrapper
│   └── research_strategies.py   # Research strategies per mode
│
├── outline/                     # Outline generation
├── content/                     # Content generation
└── seo/                         # SEO optimization

Key Features:

Reuses the provider services from `services.research`
Has blog-specific wrappers for providers
Research strategies for different blog modes

Other Feature Services

| Service Folder | Purpose | Research Integration |
|---|---|---|
| `podcast/` | Podcast generation | Can use Research Engine |
| `story_writer/` | Story generation | Can use Research Engine |
| `youtube/` | YouTube content | Can use Research Engine |
| `linkedin/` | LinkedIn content | Uses GoogleSearchService |
| `onboarding/` | User onboarding | Uses ExaService for competitor discovery |
| `content_planning/` | Content planning | Can use Research Engine |
| `scheduler/` | Task scheduling | Can use Research Engine |

Backend API (`backend/api/`)

Research API (`backend/api/research/`)

Purpose: Research endpoints

api/research/
├── router.py                    # Main router
└── handlers/
    ├── providers.py             # Provider status endpoints
    ├── research.py              # Traditional research endpoints
    ├── intent.py                # Intent-driven endpoints
    └── projects.py              # My Projects endpoints

Endpoints:

POST /api/research/intent/analyze - Intent analysis
POST /api/research/intent/research - Intent-driven research
POST /api/research/execute - Traditional research
GET /api/research/config - Configuration

Other API Modules

| API Folder | Purpose | Research Integration |
|---|---|---|
| `blog_writer/` | Blog endpoints | Uses blog_writer services |
| `podcast/` | Podcast endpoints | Can use Research Engine |
| `story_writer/` | Story endpoints | Can use Research Engine |
| `onboarding_utils/` | Onboarding utilities | Uses ExaService for competitor discovery |

Frontend Components (`frontend/src/components/`)

Research Components (`frontend/src/components/Research/`)

Purpose: Research UI components

Research/
├── ResearchWizard.tsx           # Main wizard orchestrator
├── steps/
│   ├── ResearchInput.tsx        # Step 1: Input + Intent & Options
│   ├── StepProgress.tsx         # Step 2: Progress/polling
│   ├── StepResults.tsx          # Step 3: Results display
│   └── components/              # Sub-components
│       ├── IntentConfirmationPanel.tsx
│       ├── IntentResultsDisplay.tsx
│       └── ...
├── hooks/
│   ├── useResearchWizard.ts     # Wizard state management
│   ├── useResearchExecution.ts  # Research execution
│   └── useIntentResearch.ts     # Intent research flow
└── types/
    ├── research.types.ts        # Research types
    └── intent.types.ts          # Intent types

🔌 Service Architecture: Exa, Tavily, Google Search

Service Design Pattern

All three services follow a similar design pattern:

Standalone Service Classes: Each service is a self-contained class
Guarded Initialization: Each service reads its API key from the environment in `__init__` and sets an `enabled` flag accordingly
Error Handling: Graceful degradation when API keys are missing
Standardized Interface: Similar method signatures across services
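
The pattern above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the class name `ExampleSearchService`, the `EXAMPLE_API_KEY` variable, the `search` method, and the shape of the returned dictionary are all assumptions chosen for the example.

```python
import asyncio
import os
from typing import Any, Dict


class ExampleSearchService:
    """Hypothetical provider service following the pattern described above."""

    def __init__(self) -> None:
        # Read credentials from the environment (EXAMPLE_API_KEY is an assumed name).
        self.api_key = os.getenv("EXAMPLE_API_KEY")
        self.enabled = False
        self._try_initialize()

    def _try_initialize(self) -> None:
        # Enable the service only when a key is present; never raise here.
        self.enabled = bool(self.api_key)

    async def search(self, query: str) -> Dict[str, Any]:
        # Graceful degradation: report the problem instead of raising.
        if not self.enabled:
            return {"success": False, "error": "API key not configured", "results": []}
        # A real service would call the provider's API at this point.
        return {"success": True, "query": query, "results": []}


def run_without_key() -> Dict[str, Any]:
    """Exercise the disabled path by clearing the assumed variable first."""
    os.environ.pop("EXAMPLE_API_KEY", None)
    return asyncio.run(ExampleSearchService().search("ai writing tools"))
```

Because a missing key produces a disabled service rather than an exception, callers can construct the service unconditionally and check `enabled` (or the returned `success` field) before relying on results.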

1. ExaService (`backend/services/research/exa_service.py`)

Design: ✅ Reusable Service

class ExaService:
    """
    Service for competitor discovery and analysis using the Exa API.
    Uses neural search to find semantically similar websites and content.
    """
    
    def __init__(self):
        """Initialize with API credentials from environment."""
        self.api_key = os.getenv("EXA_API_KEY")
        self.exa = None
        self.enabled = False
        self._try_initialize()
    
    async def discover_competitors(...) -> Dict[str, Any]:
        """Discover competitors for a given website."""
    
    async def discover_social_media_accounts(...) -> Dict[str, Any]:
        """Discover social media accounts."""
    
    async def analyze_competitor_content(...) -> Dict[str, Any]:
        """Analyze competitor content."""

Key Features:

✅ Standalone: No dependencies on Research Engine
✅ Reusable: Can be imported by any module
✅ Focused: Primarily for competitor discovery
✅ Flexible: Supports various search parameters

Current Usage:

Research Engine: Uses for research queries
Onboarding: Uses for competitor discovery (Step 3)
Blog Writer: Uses via blog-specific wrapper (exa_provider.py)

2. TavilyService (`backend/services/research/tavily_service.py`)

Design: ✅ Reusable Service

class TavilyService:
    """
    Service for web search and research using the Tavily API.
    Provides AI-powered search with real-time information retrieval.
    """
    
    def __init__(self):
        """Initialize with API credentials from environment."""
        self.api_key = os.getenv("TAVILY_API_KEY")
        self.base_url = "https://api.tavily.com"
        self.enabled = False
        self._try_initialize()
    
    async def search(...) -> Dict[str, Any]:
        """Execute a search query using Tavily API."""
    
    async def search_industry_trends(...) -> Dict[str, Any]:
        """Search for current industry trends."""
    
    async def discover_competitors(...) -> Dict[str, Any]:
        """Discover competitors using Tavily search."""

Key Features:

✅ Standalone: No dependencies on Research Engine
✅ Reusable: Can be imported by any module
✅ Flexible: Supports various search parameters (topic, depth, time_range, etc.)
✅ Real-time: Optimized for current information

Current Usage:

Research Engine: Uses for research queries
Blog Writer: Uses via blog-specific wrapper (tavily_provider.py)

3. GoogleSearchService (`backend/services/research/google_search_service.py`)

Design: ✅ Reusable Service

class GoogleSearchService:
    """
    Service for conducting real industry research using Google Custom Search API.
    Provides current, relevant industry information for content grounding.
    """
    
    def __init__(self):
        """Initialize with API credentials from environment."""
        self.api_key = os.getenv("GOOGLE_SEARCH_API_KEY")
        self.search_engine_id = os.getenv("GOOGLE_SEARCH_ENGINE_ID")
        self.enabled = False
    
    async def search_industry_trends(...) -> List[Dict[str, Any]]:
        """Search for current industry trends and insights."""

Key Features:

✅ Standalone: No dependencies on Research Engine
✅ Reusable: Can be imported by any module
✅ Focused: Industry trend research
✅ Credibility Scoring: Built-in source credibility assessment

Current Usage:

Research Engine: Uses as fallback provider
LinkedIn Service: Uses for industry research

🔄 Reusability Analysis

✅ Services ARE Reusable

All three services (Exa, Tavily, Google Search) are designed to be reusable:

Evidence of Reusability:

Standalone Design:
- No dependencies on Research Engine
- Self-contained initialization
- Independent error handling

Multiple Usage Points:

# Used in Research Engine
from services.research.exa_service import ExaService

# Used in Onboarding
from services.research.exa_service import ExaService

# Used in Blog Writer (via wrapper)
from services.research.tavily_service import TavilyService

# Used in LinkedIn Service
from services.research import GoogleSearchService

Standardized Interface:
- Similar method signatures
- Consistent return formats
- Environment-based configuration

Export Structure:

# backend/services/research/__init__.py
from .google_search_service import GoogleSearchService
from .exa_service import ExaService
from .tavily_service import TavilyService

__all__ = [
    "GoogleSearchService",
    "ExaService",
    "TavilyService",
    # ... other exports
]
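
One way to make the standardized interface noted above explicit is a structural type. The sketch below is only an illustration: the `SearchProvider` protocol, the `search(query)` signature, and `InMemoryProvider` are hypothetical names, and the service listings in this document show only some methods, so the real signatures may differ.

```python
import asyncio
from typing import Any, Dict, Protocol


class SearchProvider(Protocol):
    """Hypothetical common shape for provider services (an assumption)."""

    enabled: bool

    async def search(self, query: str) -> Dict[str, Any]:
        ...


class InMemoryProvider:
    """Stand-in provider used only for this example; makes no network calls."""

    enabled = True

    async def search(self, query: str) -> Dict[str, Any]:
        return {"query": query, "results": []}


async def run_query(provider: SearchProvider, query: str) -> Dict[str, Any]:
    # Any object with an `enabled` flag and an async `search` method fits.
    if not provider.enabled:
        return {"query": query, "results": [], "error": "provider disabled"}
    return await provider.search(query)
```

With a protocol like this, a consumer depends on the shape of a provider rather than on a concrete class, which is what lets the same calling code work across services.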

⚠️ Integration Patterns

While services are reusable, they are used in different ways:

1. Direct Usage (Most Reusable)

# Direct import and use
from services.research.exa_service import ExaService

exa = ExaService()
result = await exa.discover_competitors(user_url)

Used By:

Onboarding (competitor discovery)
Research Engine (research queries)

2. Wrapper Pattern (Blog Writer)

# Blog Writer uses wrappers for blog-specific logic
from services.research.tavily_service import TavilyService

class TavilyResearchProvider:
    def __init__(self):
        self.tavily = TavilyService()  # Reuses service
    
    async def search(self, prompt, topic, ...):
        # Blog-specific logic + TavilyService
        return await self.tavily.search(...)

Why Wrappers?

Blog-specific research strategies
Blog-specific result formatting
Blog-specific error handling
Maintains compatibility with existing blog writer code

Location: backend/services/blog_writer/research/tavily_provider.py
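
The formatting and error-handling responsibilities listed above might look like the sketch below. It is not the contents of `tavily_provider.py`: `StubSearchService`, `BlogSearchProvider`, the query template, and the result fields are hypothetical, and the service is passed into the constructor here (an assumption made so the example runs without an API key).

```python
import asyncio
from typing import Any, Dict, List


class StubSearchService:
    """Hypothetical stand-in for a reusable search service."""

    async def search(self, query: str, max_results: int = 5) -> Dict[str, Any]:
        return {
            "results": [
                {"title": "Example", "url": "https://example.com", "content": "..."}
            ]
        }


class BlogSearchProvider:
    """Hypothetical wrapper adding blog-specific query building and formatting."""

    def __init__(self, service: StubSearchService) -> None:
        self._service = service

    async def search(self, topic: str, industry: str) -> List[Dict[str, str]]:
        # Blog-specific query building (the template is an assumption).
        query = f"{topic} in {industry}"
        try:
            raw = await self._service.search(query=query, max_results=3)
        except Exception:
            # Wrapper-level error handling: the blog flow continues with no sources.
            return []
        # Blog-specific formatting: keep only the fields the blog writer needs.
        return [{"title": r["title"], "source": r["url"]} for r in raw.get("results", [])]
```

The wrapper owns everything that is specific to blogs, so the underlying service stays generic and can still be used directly by other modules.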

3. Engine Orchestration (Research Engine)

# Research Engine orchestrates providers
from services.research.exa_service import ExaService
from services.research.tavily_service import TavilyService
from services.research.google_search_service import GoogleSearchService

class ResearchEngine:
    def __init__(self):
        self._exa_provider = ExaService()
        self._tavily_provider = TavilyService()
        self._google_provider = GoogleSearchService()
    
    async def research(self, context: ResearchContext):
        # Orchestrates providers based on priority
        if self.exa_available:
            return await self._exa_provider.search(...)
        elif self.tavily_available:
            return await self._tavily_provider.search(...)
        else:
            return await self._google_provider.search_industry_trends(...)

Why Orchestration?

Provider priority management
Fallback logic
Unified interface for all tools
Research persona integration
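
The priority and fallback logic can be illustrated with a small self-contained sketch. It assumes each provider exposes an `enabled` flag and an async `search` method; `StubProvider` and `FallbackOrchestrator` are hypothetical names, and the real `ResearchEngine` may select providers differently.

```python
import asyncio
from typing import Any, Dict, List


class StubProvider:
    """Hypothetical stand-in for a provider service with an `enabled` flag."""

    def __init__(self, name: str, enabled: bool) -> None:
        self.name = name
        self.enabled = enabled

    async def search(self, query: str) -> Dict[str, Any]:
        return {"provider": self.name, "query": query, "results": []}


class FallbackOrchestrator:
    """Tries providers in priority order and uses the first enabled one."""

    def __init__(self, providers: List[StubProvider]) -> None:
        # Providers are ordered from highest to lowest priority.
        self._providers = providers

    async def research(self, query: str) -> Dict[str, Any]:
        for provider in self._providers:
            if provider.enabled:
                return await provider.search(query)
        raise RuntimeError("No research provider is configured")


def demo() -> str:
    """Return the name of the provider chosen when the first one is disabled."""
    providers = [
        StubProvider("exa", enabled=False),
        StubProvider("tavily", enabled=True),
        StubProvider("google", enabled=True),
    ]
    result = asyncio.run(FallbackOrchestrator(providers).research("content trends"))
    return result["provider"]
```

Keeping the priority order in a single list means a new provider can be added, or the order changed, without touching the selection loop.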

📊 Service Reusability Matrix

| Service | Standalone | Reusable | Current Usage | Integration Pattern |
|---|---|---|---|---|
| ExaService | ✅ Yes | ✅ Yes | Research Engine, Onboarding, Blog Writer | Direct + Wrapper |
| TavilyService | ✅ Yes | ✅ Yes | Research Engine, Blog Writer | Direct + Wrapper |
| GoogleSearchService | ✅ Yes | ✅ Yes | Research Engine, LinkedIn Service | Direct |

🎯 Key Insights

✅ Services Are Reusable

No Tight Coupling: Services don't depend on Research Engine
Standardized Interface: Consistent method signatures
Multiple Usage Points: Used across different modules
Environment-Based Config: No hardcoded dependencies

⚠️ Integration Patterns Vary

Direct Usage: Simple import and use (most reusable)
Wrapper Pattern: Blog-specific wrappers (maintains compatibility)
Engine Orchestration: Research Engine coordinates providers (unified interface)

🔄 Architecture Evolution

Current State:

Services are reusable ✅
Research Engine provides unified interface ✅
Blog Writer uses wrappers for compatibility ✅

Future Recommendations:

Consider migrating Blog Writer to use Research Engine directly
Standardize on Research Engine for all tools
Keep services as low-level building blocks

📝 Usage Examples

Example 1: Direct Usage (Onboarding)

# backend/api/onboarding_utils/step3_research_service.py
from services.research.exa_service import ExaService

exa_service = ExaService()
result = await exa_service.discover_competitors(
    user_url=user_url,
    num_results=10,
    industry_context=industry
)

Example 2: Wrapper Pattern (Blog Writer)

# backend/services/blog_writer/research/tavily_provider.py
from services.research.tavily_service import TavilyService

class TavilyResearchProvider:
    def __init__(self):
        self.tavily = TavilyService()  # Reuses service
    
    async def search(self, research_prompt, topic, industry, ...):
        # Blog-specific query building
        query = self._build_blog_query(research_prompt, topic, industry)
        
        # Use TavilyService
        result = await self.tavily.search(
            query=query,
            topic="general",
            search_depth="advanced",
            max_results=config.max_sources
        )
        
        # Blog-specific result formatting
        return self._format_blog_results(result)

Example 3: Engine Orchestration (Research Engine)

# backend/services/research/core/research_engine.py
from services.research.exa_service import ExaService
from services.research.tavily_service import TavilyService

class ResearchEngine:
    def __init__(self):
        self._exa_provider = ExaService()
        self._tavily_provider = TavilyService()
    
    async def research(self, context: ResearchContext, user_id: str):
        # Get optimized config
        config = self.optimizer.optimize(context)
        
        # Execute based on provider priority
        if config.provider == ResearchProvider.EXA:
            return await self._execute_exa_research(context, config, user_id)
        elif config.provider == ResearchProvider.TAVILY:
            return await self._execute_tavily_research(context, config, user_id)
        else:
            return await self._execute_google_research(context, config, user_id)

✅ Conclusion

Services ARE Reusable ✅

ExaService: ✅ Reusable, used in Research Engine, Onboarding, Blog Writer
TavilyService: ✅ Reusable, used in Research Engine, Blog Writer
GoogleSearchService: ✅ Reusable, used in Research Engine, LinkedIn Service

Integration Patterns:

Direct Usage: Simple import and use (most reusable)
Wrapper Pattern: Blog-specific wrappers (maintains compatibility)
Engine Orchestration: Research Engine coordinates providers (unified interface)

Architecture Benefits:

✅ Modularity: Services are independent building blocks
✅ Reusability: Can be used by any module
✅ Flexibility: Different integration patterns for different needs
✅ Maintainability: Internal changes to a service don't break consumers as long as its public methods stay stable

Status: Services are well-designed for reusability with flexible integration patterns 🚀
