Add Dockerfile for EasyPanel deployment

Fix LinkedIn writer: progress animation, persona API 404 handling, back-to-home navigation
- Simulate progress step advancement at 1.5s intervals during API calls so users see incremental progress instead of all-at-once bursts
- PersonaChip skips API calls entirely in feature-only mode (no console spam)
- getUserPersonas/getPlatformPersona return null on 404 instead of throwing
- PersonaChip shows neutral gray state when no persona data exists
- Back button now clears draft to return to LinkedIn writer home screen
- Article title extracted from markdown content (fixes KeyError)
- InitialRouteHandler: demo mode subscribes getDefaultLandingRoute()
- Header: back button shown when draft exists, navigates to home screen
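
The commit message mentions extracting the article title from the markdown content. The code for that change is not among the hunks shown here, so the following is only a minimal sketch of one way to do it. The name `extract_title` and the `"Untitled"` fallback are assumptions made for illustration, not identifiers from the commit.

```python
import re

# Hypothetical helper, not taken from the commit.
# Matches a level-1 ATX heading such as "# My Post" (optionally closed with "#").
_H1_RE = re.compile(r"^\s{0,3}#\s+(.+?)\s*#*\s*$")


def extract_title(markdown: str, fallback: str = "Untitled") -> str:
    """Return the text of the first level-1 heading, or `fallback` if none exists."""
    for line in markdown.splitlines():
        match = _H1_RE.match(line)
        if match:
            return match.group(1).strip()
    return fallback
```

Returning a fallback instead of indexing a dict key that may be absent is one way to avoid the kind of `KeyError` the message refers to.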
2026-06-15 10:40:16 +07:00 · 2026-06-13 17:12:45 +05:30 · 2026-06-12 20:32:03 +05:30 · 2026-06-12 18:58:53 +05:30 · 2026-06-05 12:40:30 +05:30
175 changed files with 11154 additions and 3618 deletions
--- a/.dockerignore
+++ b/.dockerignore
@@ -0,0 +1,68 @@
 # Git
 .git
 .gitignore
 # Node modules (rebuilt inside Docker)
 frontend/node_modules
 # Python cache
 __pycache__
 *.pyc
 *.pyo
 *.pyd
 .Python
 *.so
 *.egg
 *.egg-info
 dist
 build
 # Virtual envs
 .venv
 venv/
 ENV/
 # IDE
 .idea/
 .vscode/
 *.swp
 *.swo
 # OS
 .DS_Store
 Thumbs.db
 # Docs & markdown (not needed in container)
 docs/
 docs-site/
 *.md
 # GitHub meta
 .github/
 # Frontend build is copied separately via --from
 # so exclude the local build dir to keep context small
 frontend/build/
 frontend/.env
 frontend/.env.local
 frontend/.env.production
 # Backend env
 .env
 .env.*
 !backend/env_template.txt
 # Test files
 **/test/
 **/tests/
 *.test.py
 *.spec.py
 # Logs
 *.log
 logs/
 # Temp
 tmp/
 temp/
 *.tmp
--- a/Dockerfile
+++ b/Dockerfile
@@ -0,0 +1,72 @@
 # ============================================================
 # ALwrity Dockerfile — for EasyPanel deployment
 # ============================================================
 # Stage 1: Build frontend
 FROM node:20-alpine AS frontend-builder
 WORKDIR /app/frontend
 # Copy package files
 COPY frontend/package.json frontend/package-lock.json* ./
 # Install deps (--legacy-peer-deps needed for react-scripts 5)
 RUN npm install --legacy-peer-deps
 # Copy frontend source
 COPY frontend/ ./
 # Build static assets
 RUN npm run build
 # ============================================================
 # Stage 2: Python backend
 FROM python:3.11-slim AS backend
 ENV PYTHONDONTWRITEBYTECODE=1
 ENV PYTHONUNBUFFERED=1
 ENV PORT=8000
 WORKDIR /app
 # Install build deps for some Python packages
 RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    libpq-dev \
    curl \
    && rm -rf /var/lib/apt/lists/*
 # Copy requirements first (for caching)
 COPY backend/requirements.txt .
 # Install Python deps
 RUN pip install --no-cache-dir -r requirements.txt
 # Copy backend source
 COPY backend/ ./backend/
 # Copy frontend build artifacts from Stage 1
 COPY --from=frontend-builder /app/frontend/build ./frontend/build
 # Create workspace directories (created by start_alwrity_backend.py but ensure they exist)
 RUN mkdir -p /app/lib/workspace/alwrity_content \
             /app/lib/workspace/alwrity_web_research \
             /app/lib/workspace/alwrity_prompts \
             /app/lib/workspace/alwrity_config
 # Expose port
 EXPOSE 8000
 # Health check
 HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1
 # Run with gunicorn + uvicorn workers (recommended for production)
 # Note: no fallback to plain uvicorn is implemented below; gunicorn must be installed in the image
 CMD python -m gunicorn backend.app:app \
    --worker-class uvicorn.workers.UvicornWorker \
    --bind 0.0.0.0:8000 \
    --workers 2 \
    --timeout 120 \
    --access-logfile - \
    --error-logfile - \
    --log-level info
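
The `CMD` above starts gunicorn unconditionally and binds to a fixed port, even though `ENV PORT` is set earlier in the file. A fallback to plain uvicorn could be handled by a small Python launcher instead. The sketch below is an assumption about how that might look, not part of the commit: `build_server_command` is a hypothetical name, and it only builds the argument list, so it can be inspected without starting a server.

```python
import importlib.util
import os
from typing import List, Optional


def build_server_command(
    app: str = "backend.app:app",
    port: Optional[int] = None,
    workers: int = 2,
    has_gunicorn: Optional[bool] = None,
) -> List[str]:
    """Build the server command, preferring gunicorn and falling back to uvicorn."""
    if port is None:
        # Honour the PORT variable set by the Dockerfile (defaults to 8000).
        port = int(os.environ.get("PORT", "8000"))
    if has_gunicorn is None:
        # Detect gunicorn without importing it.
        has_gunicorn = importlib.util.find_spec("gunicorn") is not None

    if has_gunicorn:
        return [
            "python", "-m", "gunicorn", app,
            "--worker-class", "uvicorn.workers.UvicornWorker",
            "--bind", f"0.0.0.0:{port}",
            "--workers", str(workers),
            "--timeout", "120",
            "--access-logfile", "-",
            "--error-logfile", "-",
            "--log-level", "info",
        ]
    # Fallback: a single uvicorn process.
    return [
        "python", "-m", "uvicorn", app,
        "--host", "0.0.0.0",
        "--port", str(port),
        "--log-level", "info",
    ]
```

An entrypoint script could pass the result to `os.execvp(cmd[0], cmd)` so the server replaces the launcher process and receives container signals directly.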
--- a/backend/api/blog_writer/router.py
+++ b/backend/api/blog_writer/router.py
@@ -71,6 +71,7 @@ class SEOApplyRecommendationsRequest(BaseModel):
    outline: List[Dict[str, Any]] = Field(default_factory=list, description="Outline structure for context")
    research: Dict[str, Any] = Field(default_factory=dict, description="Research data used for the blog")
    recommendations: List[RecommendationItem] = Field(..., description="Actionable recommendations to apply")
    competitive_advantage: str | None = Field(default=None, description="Selected competitive advantage for emphasis")
    persona: Dict[str, Any] = Field(default_factory=dict, description="Persona settings if available")
    tone: str | None = Field(default=None, description="Desired tone override")
    audience: str | None = Field(default=None, description="Target audience override")
@@ -688,9 +689,11 @@ async def get_section_continuity(section_id: str) -> Dict[str, Any]:
@router.post("/flow-analysis/basic")
-async def analyze_flow_basic(request: Dict[str, Any]) -> Dict[str, Any]:
+async def analyze_flow_basic(request: Dict[str, Any], current_user: Dict[str, Any] = Depends(get_current_user)) -> Dict[str, Any]:
    """Analyze flow metrics for entire blog using single AI call (cost-effective)."""
    try:
        user_id = str(current_user.get('id', '')) if current_user else None
        request['user_id'] = user_id
        result = await service.analyze_flow_basic(request)
        return result
    except Exception as e:
@@ -699,9 +702,11 @@ async def analyze_flow_basic(request: Dict[str, Any]) -> Dict[str, Any]:
@router.post("/flow-analysis/advanced")
-async def analyze_flow_advanced(request: Dict[str, Any]) -> Dict[str, Any]:
+async def analyze_flow_advanced(request: Dict[str, Any], current_user: Dict[str, Any] = Depends(get_current_user)) -> Dict[str, Any]:
    """Analyze flow metrics for each section individually (detailed but expensive)."""
    try:
        user_id = str(current_user.get('id', '')) if current_user else None
        request['user_id'] = user_id
        result = await service.analyze_flow_advanced(request)
        return result
    except Exception as e:
@@ -808,9 +813,12 @@ async def seo_metadata(
 # Publishing Endpoints
 # NOTE: Real publishing bypasses this stub. Frontend calls platform-specific
 # endpoints directly: /api/wix/publish and /api/wordpress/publish.
 # This endpoint is kept as a placeholder for the future unified publish flow.
@router.post("/publish", response_model=BlogPublishResponse)
 async def publish(request: BlogPublishRequest) -> BlogPublishResponse:
-    """Publish the blog post to the specified platform."""
+    """Publish the blog post to the specified platform. [STUB - see note above]"""
    try:
        return await service.publish(request)
    except Exception as e:
@@ -1209,6 +1217,9 @@ async def generate_introductions(
 class SaveCompleteBlogAssetRequest(BaseModel):
    title: str
    content: str
    platform: Optional[str] = None
    post_url: Optional[str] = None
    post_id: Optional[str] = None
    seo_title: Optional[str] = None
    meta_description: Optional[str] = None
    focus_keyword: Optional[str] = None
@@ -1233,21 +1244,29 @@ async def save_complete_blog_asset(
        full_content = f"# {request.title}\n\n{request.content}"
        asset_metadata = {
            "status": "published",
            "focus_keyword": request.focus_keyword,
            "categories": request.categories,
            "word_count": len(full_content.split()),
        }
        if request.platform:
            asset_metadata["platform"] = request.platform
        if request.post_url:
            asset_metadata["post_url"] = request.post_url
        if request.post_id:
            asset_metadata["post_id"] = request.post_id
        asset_id = save_and_track_text_content(
            db=db,
            user_id=user_id,
            content=full_content,
            source_module="blog_writer",
-            title=f"Published Blog: {request.title[:60]}",
+            title=request.title[:100],
            description=request.meta_description or f"Complete published blog post: {request.title}",
            prompt=f"SEO Title: {request.seo_title or request.title}\nFocus Keyword: {request.focus_keyword or ''}",
            tags=["blog", "published"] + [t for t in (request.tags or []) if t],
-            asset_metadata={
-                "status": "published",
-                "focus_keyword": request.focus_keyword,
-                "categories": request.categories,
-                "word_count": len(full_content.split()),
-            },
+            asset_metadata=asset_metadata,
            subdirectory="published",
            file_extension=".md"
        )
@@ -1266,6 +1285,57 @@ async def save_complete_blog_asset(
        raise HTTPException(status_code=500, detail=str(e))
@router.get("/publish-history")
 async def get_publish_history(
    current_user: Dict[str, Any] = Depends(get_current_user),
    db: Session = Depends(get_db),
    limit: int = 50,
    offset: int = 0,
 ) -> Dict[str, Any]:
    """Get publish history for the current user from the asset library."""
    try:
        if not current_user:
            raise HTTPException(status_code=401, detail="Authentication required")
        user_id = str(current_user.get('id', ''))
        if not user_id:
            raise HTTPException(status_code=401, detail="Invalid user ID in authentication token")
        svc = ContentAssetService(db)
        assets, total = svc.get_user_assets(
            user_id=user_id,
            tags=["published"],
            source_module=AssetSource.BLOG_WRITER,
            sort_by="created_at",
            sort_order="desc",
            limit=limit,
            offset=offset,
        )
        entries = []
        for a in assets:
            meta = a.asset_metadata or {}
            entries.append({
                "asset_id": a.id,
                "title": a.title,
                "platform": meta.get("platform", "unknown"),
                "post_url": meta.get("post_url"),
                "post_id": meta.get("post_id"),
                "word_count": meta.get("word_count", 0),
                "focus_keyword": meta.get("focus_keyword"),
                "categories": meta.get("categories", []),
                "published_at": a.created_at.isoformat() if a.created_at else None,
            })
        return {"success": True, "entries": entries, "total": total}
    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"Failed to get publish history: {e}")
        raise HTTPException(status_code=500, detail=str(e))
 # ---------------------------------------
 # Blog Asset API (phase-by-phase saving via ContentAsset)
 # ---------------------------------------
@@ -1413,7 +1483,11 @@ async def update_blog_asset(
            if val is not None:
                meta[field] = val
-        if meta.get("selected_title"):
+        # Prefer seo_title from publish_data, then selected_title, then topic, then existing title
        publish_data = meta.get("publish_data") or {}
        if isinstance(publish_data, dict) and publish_data.get("seo_title"):
            new_title = publish_data["seo_title"]
        elif meta.get("selected_title"):
            new_title = meta["selected_title"]
        elif meta.get("topic"):
            new_title = meta["topic"]
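
The hunk above is cut off after the `topic` branch. As a reading aid, the precedence described in its comment (SEO title, then selected title, then topic, then the existing title) can be sketched as a standalone function. `resolve_title` and its `existing_title` parameter are hypothetical names; the final fallback is inferred from the comment, since that branch is not visible in the hunk.

```python
from typing import Any, Dict


def resolve_title(meta: Dict[str, Any], existing_title: str) -> str:
    """Pick a title using the precedence described in the hunk's comment."""
    publish_data = meta.get("publish_data") or {}
    if isinstance(publish_data, dict) and publish_data.get("seo_title"):
        return publish_data["seo_title"]
    if meta.get("selected_title"):
        return meta["selected_title"]
    if meta.get("topic"):
        return meta["topic"]
    # Assumed fallback: keep whatever title the asset already has.
    return existing_title
```

Because empty strings are falsy, a blank `seo_title` falls through to the next candidate rather than overwriting the title with an empty value.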
--- a/backend/api/blog_writer/seo_analysis.py
+++ b/backend/api/blog_writer/seo_analysis.py
@@ -28,6 +28,8 @@ class SEOAnalysisRequest(BaseModel):
    blog_content: str
    blog_title: Optional[str] = None
    research_data: Dict[str, Any]
    outline: Optional[List[Dict[str, Any]]] = None
    competitive_advantage: Optional[str] = None
    user_id: Optional[str] = None
    session_id: Optional[str] = None
@@ -109,7 +111,9 @@ async def analyze_blog_seo(
            blog_content=request.blog_content,
            research_data=request.research_data,
            blog_title=request.blog_title,
-            user_id=user_id
+            user_id=user_id,
            outline=request.outline,
            competitive_advantage=request.competitive_advantage,
        )
        # Check for errors
--- a/backend/api/content_assets/router.py
+++ b/backend/api/content_assets/router.py
@@ -344,6 +344,43 @@ async def update_asset(
        raise HTTPException(status_code=500, detail=f"Error updating asset: {str(e)}")
@router.get("/{asset_id}/content")
 async def get_asset_content(
    asset_id: int,
    db: Session = Depends(get_db),
    current_user: Dict[str, Any] = Depends(get_current_user),
 ):
    """Serve the raw text content of a text asset by reading its file from disk."""
    try:
        user_id = current_user.get("user_id") or current_user.get("id")
        if not user_id:
            raise HTTPException(status_code=401, detail="User ID not found")
        service = ContentAssetService(db)
        asset = service.get_asset_by_id(asset_id, user_id)
        if not asset:
            raise HTTPException(status_code=404, detail="Asset not found")
        if asset.asset_type != AssetType.TEXT:
            raise HTTPException(status_code=400, detail="Asset is not a text file")
        if not asset.file_path:
            raise HTTPException(status_code=404, detail="Asset file path not recorded")
        from pathlib import Path
        file_path = Path(asset.file_path)
        if not file_path.exists():
            raise HTTPException(status_code=404, detail="Asset file not found on disk")
        content = file_path.read_text(encoding="utf-8")
        return {"success": True, "content": content}
    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error reading asset content: {str(e)}")
@router.get("/statistics", response_model=Dict[str, Any])
 async def get_statistics(
    db: Session = Depends(get_db),
--- a/backend/api/linkedin_image_generation.py
+++ b/backend/api/linkedin_image_generation.py
@@ -1,7 +1,9 @@
 import os
 from fastapi import APIRouter, HTTPException, UploadFile, File, Depends
 from fastapi.responses import FileResponse
 from pydantic import BaseModel
 from typing import List, Optional, Dict, Any
-import json
+import base64
 # Import our LinkedIn image generation services
 from services.linkedin.image_generation import LinkedInImageGenerator, LinkedInImageStorage
@@ -51,6 +53,23 @@ class ImageGenerationResponse(BaseModel):
    aspect_ratio: Optional[str] = None
    error: Optional[str] = None
 class ImageEditRequest(BaseModel):
    image_base64: Optional[str] = None
    image_id: Optional[str] = None
    prompt: str
    content_context: Dict[str, Any]
 class ImageEditResponse(BaseModel):
    success: bool
    image_data: Optional[str] = None
    image_id: Optional[str] = None
    image_url: Optional[str] = None
    width: Optional[int] = None
    height: Optional[int] = None
    provider: Optional[str] = None
    model: Optional[str] = None
    error: Optional[str] = None
@router.post("/generate-image-prompts", response_model=List[ImagePromptResponse])
 async def generate_image_prompts(request: ImagePromptRequest):
    """
@@ -89,7 +108,8 @@ async def generate_linkedin_image(
        # Use our LinkedIn image generator service
        image_result = await image_generator.generate_image(
            prompt=request.prompt,
-            content_context=request.content_context
+            content_context=request.content_context,
            user_id=user_id
        )
        if image_result and image_result.get('success'):
@@ -131,6 +151,99 @@ async def generate_linkedin_image(
            error=f"Failed to generate image: {str(e)}"
        )
@router.post("/edit-image", response_model=ImageEditResponse)
 async def edit_linkedin_image(
    request: ImageEditRequest,
    current_user: Dict[str, Any] = Depends(get_current_user)
 ):
    """
    Edit a LinkedIn-optimized image using natural language.
    Provide the image as base64 and describe the desired edits.
    """
    try:
        user_id = current_user.get("id")
        if not user_id:
            raise HTTPException(status_code=401, detail="Authentication required")
        if not request.prompt or not request.prompt.strip():
            raise HTTPException(status_code=400, detail="Prompt is required for image editing")
        logger.info(f"Editing LinkedIn image with prompt: {request.prompt[:100]}... for user {user_id}")
        # Get input image bytes — from image_id (fetch from storage) or image_base64 (direct decode)
        input_image_bytes = None
        if request.image_id:
            stored = await image_storage.retrieve_image(request.image_id, user_id)
            if not stored or not stored.get('success'):
                raise HTTPException(status_code=404, detail=f"Image not found: {request.image_id}")
            input_image_bytes = stored['image_data']
            logger.info(f"Fetched image {request.image_id} from storage ({len(input_image_bytes)} bytes)")
        elif request.image_base64:
            input_image_bytes = base64.b64decode(request.image_base64)
        else:
            raise HTTPException(status_code=400, detail="Either image_id or image_base64 is required")
        # Use LinkedIn image generator with common editing infrastructure
        image_result = await image_generator.edit_image(
            input_image_bytes=input_image_bytes,
            edit_prompt=request.prompt,
            content_context=request.content_context,
            user_id=user_id,
        )
        if image_result and image_result.get('success'):
            image_b64 = base64.b64encode(image_result['image_data']).decode("utf-8")
            # Store the edited image — log but don't fail if storage has issues
            new_image_id = None
            stored_result = await image_storage.store_image(
                image_data=image_result['image_data'],
                metadata={
                    'prompt': request.prompt,
                    'style': request.content_context.get('style', 'Edited'),
                    'content_type': request.content_context.get('content_type'),
                    'topic': request.content_context.get('topic'),
                    'industry': request.content_context.get('industry'),
                    'is_edit': True,
                    'original_prompt': request.prompt,
                    'source_image_id': request.image_id,
                },
                user_id=user_id
            )
            if stored_result and stored_result.get('success'):
                new_image_id = stored_result.get('image_id')
                logger.info(f"Edited image stored with ID: {new_image_id}")
            else:
                logger.warning(f"Edited image not stored: {stored_result.get('error', 'unknown reason')}")
            return ImageEditResponse(
                success=True,
                image_data=image_b64,
                image_id=new_image_id,
                image_url=image_result.get('image_url'),
                width=image_result.get('width'),
                height=image_result.get('height'),
                provider=image_result.get('provider'),
                model=image_result.get('model'),
            )
        else:
            error_msg = image_result.get('error', 'Unknown error during image editing')
            logger.error(f"Image editing failed: {error_msg}")
            return ImageEditResponse(
                success=False,
                error=error_msg
            )
    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"Error editing LinkedIn image: {str(e)}", exc_info=True)
        return ImageEditResponse(
            success=False,
            error=f"Failed to edit image: {str(e)}"
        )
@router.get("/image-status/{image_id}")
 async def get_image_status(
    image_id: str,
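
The edit endpoint decodes `image_base64` with a plain `base64.b64decode` call. By default that function silently discards characters outside the base64 alphabet, so a payload carrying a `data:image/png;base64,` prefix would decode to corrupted bytes rather than fail. A stricter decoder is sketched below under the assumption that clients may send either a bare base64 string or a data URL. `decode_image_base64` is a hypothetical helper, not part of the commit.

```python
import base64
import binascii


def decode_image_base64(payload: str) -> bytes:
    """Decode a base64 image string, accepting an optional data-URL prefix."""
    if payload.startswith("data:"):
        # Split "data:<mime>;base64,<data>" into header and data.
        header, sep, payload = payload.partition(",")
        if not sep or not header.endswith(";base64"):
            raise ValueError("Unsupported data URL: expected a ';base64,' payload")
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(payload.strip(), validate=True)
    except binascii.Error as exc:
        raise ValueError(f"Invalid base64 image data: {exc}") from exc
```

In the endpoint, a `ValueError` from such a helper could be mapped to an HTTP 400 response so that malformed input is reported as a client error.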
@@ -169,42 +282,23 @@ async def get_generated_image(
    current_user: Dict[str, Any] = Depends(get_current_user)
 ):
    """
-    Retrieve a generated image by ID
+    Retrieve a generated image by ID.
    Returns the image file directly as a PNG response.
    """
    try:
        user_id = current_user.get("id")
        image_result = await image_storage.retrieve_image(image_id, user_id)
-        if image_result.get('success') and 'image_data' in image_result:
-             # Return as streaming response or raw bytes depending on frontend needs
-             # For now returning the structure as before but image_data is bytes
-             # Ideally this should be a Response object with image/png content type
-             # But keeping consistency with existing return type structure for now if it was returning dict
-             # Wait, retrieve_image returns dict with 'image_data' as bytes.
-             # The original code returned: {"success": True, "image_data": image_data}
-             # FastAPI handles bytes in JSON? No, it will fail serialization.
-             # The previous implementation of retrieve_image (lines 190-195) returned bytes in a dict.
-             # Unless FastAPI response model handles it, this might have been broken or handled specially.
-             # Let's check imports.
-             # It uses APIRouter.
-             # If I return a dict with bytes, json serialization fails.
-             # Maybe the original code expected base64 or it was just broken?
-             # Or maybe image_data was not bytes?
-             # In retrieve_image: with open(..., 'rb') as f: image_data = f.read() -> bytes.
-             # So returning it in a dict will definitely fail JSON serialization.
-             # I should probably return a Response or FileResponse, or base64 encode it.
-             # But for now, I will just match the signature and pass user_id.
-             # If it was broken before, I'm not fixing that unless asked, but I suspect it might be base64 in usage?
-             # Let's look at `generate_linkedin_image` which returns `ImageGenerationResponse` with `image_url`.
-             # `get_generated_image` returns a dict.
-             # I will stick to passing user_id.
-            return {
-                "success": True,
-                "image_data": image_result['image_data'] # This might need base64 encoding if it's for JSON
-            }
+        if image_result.get('success') and image_result.get('image_path'):
+            return FileResponse(
+                path=image_result['image_path'],
+                media_type="image/png",
+                filename=f"{image_id}.png"
+            )
        else:
            raise HTTPException(status_code=404, detail="Image not found")
    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"Error retrieving image: {str(e)}")
        raise HTTPException(status_code=500, detail=f"Failed to retrieve image: {str(e)}")
@@ -232,25 +326,42 @@ async def delete_generated_image(
@router.get("/image-generation-health")
 async def health_check():
    """
-    Health check for image generation services
+    Lightweight health check for image generation services.
+    Verifies configuration and service availability without making API calls.
    """
    try:
-        # Test basic service functionality
-        test_prompts = await prompt_generator.generate_three_prompts({
-            'content_type': 'post',
-            'topic': 'Test',
-            'industry': 'Technology',
-            'content': 'Test content for health check'
-        })
-
+        services = {}
+        all_healthy = True
+
+        # Check API key configuration (no actual API call)
+        image_api_key = api_key_manager.get_api_key("image_generation") or os.getenv("WAVESPEED_API_KEY") or os.getenv("HF_TOKEN")
+        services["image_api_key_configured"] = bool(image_api_key)
+
+        # Check storage accessibility
+        stats = await image_storage.get_storage_stats()
+        storage_ok = stats.get('success', False)
+        services["image_storage"] = "operational" if storage_ok else "unavailable"
+        if storage_ok:
+            services["storage_stats"] = {
+                "total_images": stats.get('total_files', 0),
+                "total_size_gb": stats.get('total_size_gb', 0),
+                "limit_gb": stats.get('storage_limit_gb', 0),
+            }
+        # Check prompt generator initialization
+        prompt_ok = prompt_generator is not None and hasattr(prompt_generator, 'generate_three_prompts')
+        services["prompt_generator"] = "operational" if prompt_ok else "unavailable"
+        # Check image generator initialization
+        gen_ok = image_generator is not None and hasattr(image_generator, 'generate_image')
+        services["image_generator"] = "operational" if gen_ok else "unavailable"
+        if not all(v == "operational" or v is True for v in services.values()):
+            all_healthy = False
        return {
-            "status": "healthy",
-            "services": {
-                "prompt_generator": "operational",
-                "image_generator": "operational",
-                "image_storage": "operational"
-            },
-            "test_prompts_generated": len(test_prompts)
+            "status": "healthy" if all_healthy else "degraded",
+            "services": services
        }
    except Exception as e:
        logger.error(f"Health check failed: {str(e)}")
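
In the new health check, `services` can hold three kinds of values: status strings, the boolean `image_api_key_configured`, and the nested `storage_stats` dict. A dict is neither `"operational"` nor `True`, so the `all(...)` expression as written evaluates to false whenever `storage_stats` is present, which appears to make the endpoint report `degraded` even when every individual check passes. One possible alternative, sketched here, keeps pass/fail flags separate from informational details. `summarize_health` and the `details` key are hypothetical and not part of the commit.

```python
from typing import Any, Dict, Optional


def summarize_health(
    checks: Dict[str, bool],
    details: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Aggregate boolean checks into a status; details never affect the status."""
    services = {
        name: "operational" if ok else "unavailable"
        for name, ok in checks.items()
    }
    result: Dict[str, Any] = {
        "status": "healthy" if all(checks.values()) else "degraded",
        "services": services,
    }
    if details:
        result["details"] = details
    return result


# Example: every check passes and storage statistics are attached.
report = summarize_health(
    {"image_storage": True, "prompt_generator": True, "image_generator": True},
    details={"storage_stats": {"total_images": 0, "total_size_gb": 0, "limit_gb": 0}},
)
```

With this split, `report["status"]` depends only on the boolean checks, while the statistics still travel with the response.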
--- a/backend/api/scheduler_dashboard.py
+++ b/backend/api/scheduler_dashboard.py
@@ -19,7 +19,11 @@ from models.monitoring_models import TaskExecutionLog, MonitoringTask
 from models.scheduler_models import SchedulerEventLog
 from models.oauth_token_monitoring_models import OAuthTokenMonitoringTask
 from models.platform_insights_monitoring_models import PlatformInsightsTask, PlatformInsightsExecutionLog
-from models.website_analysis_monitoring_models import WebsiteAnalysisTask, WebsiteAnalysisExecutionLog, DeepWebsiteCrawlTask
+from models.website_analysis_monitoring_models import (
    WebsiteAnalysisTask, WebsiteAnalysisExecutionLog, DeepWebsiteCrawlTask,
    OnboardingFullWebsiteAnalysisTask, DeepCompetitorAnalysisTask,
    SIFIndexingTask, MarketTrendsTask, AdvertoolsTask,
 )
 router = APIRouter(prefix="/api/scheduler", tags=["scheduler-dashboard"])
@@ -309,6 +313,198 @@ async def get_scheduler_dashboard(
        except Exception as e:
            logger.error(f"Error loading deep website crawl tasks: {e}", exc_info=True)
        # Load onboarding full website analysis tasks
        try:
            onboarding_tasks = db.query(OnboardingFullWebsiteAnalysisTask).filter(
                OnboardingFullWebsiteAnalysisTask.status.in_(['active', 'failed', 'needs_intervention'])
            ).all()
            if user_id_str:
                onboarding_tasks = [t for t in onboarding_tasks if t.user_id == user_id_str]
            for task in onboarding_tasks:
                try:
                    user_job_store = get_user_job_store_name(task.user_id, db)
                except Exception:
                    user_job_store = 'default'
                job_info = {
                    'id': f"onboarding_full_website_analysis_{task.user_id}_{task.id}",
                    'trigger_type': 'DateTrigger' if task.status != 'active' else 'CronTrigger',
                    'next_run_time': task.next_execution.isoformat() if task.next_execution else None,
                    'user_id': task.user_id,
                    'job_store': 'default',
                    'user_job_store': user_job_store,
                    'function_name': 'onboarding_full_website_analysis_executor.execute_task',
                    'website_url': task.website_url,
                    'task_id': task.id,
                    'is_database_task': True,
                    'frequency': 'One-time' if task.status == 'completed' else 'Once',
                    'task_category': 'onboarding_full_website_analysis',
                    'status': task.status,
                    'last_success': task.last_success.isoformat() if task.last_success else None,
                    'last_failure': task.last_failure.isoformat() if task.last_failure else None,
                    'failure_reason': task.failure_reason,
                    'consecutive_failures': task.consecutive_failures,
                }
                formatted_jobs.append(job_info)
        except Exception as e:
            logger.error(f"Error loading onboarding full website analysis tasks: {e}", exc_info=True)
        # Load deep competitor analysis tasks
        try:
            competitor_tasks = db.query(DeepCompetitorAnalysisTask).filter(
                DeepCompetitorAnalysisTask.status.in_(['active', 'failed', 'needs_intervention'])
            ).all()
            if user_id_str:
                competitor_tasks = [t for t in competitor_tasks if t.user_id == user_id_str]
            for task in competitor_tasks:
                try:
                    user_job_store = get_user_job_store_name(task.user_id, db)
                except Exception:
                    user_job_store = 'default'
                payload = task.payload or {}
                frequency_label = 'Weekly' if payload.get('mode') == 'strategic_insights' else 'One-time'
                job_info = {
                    'id': f"deep_competitor_analysis_{task.user_id}_{task.id}",
                    'trigger_type': 'CronTrigger' if frequency_label == 'Weekly' else 'DateTrigger',
                    'next_run_time': task.next_execution.isoformat() if task.next_execution else None,
                    'user_id': task.user_id,
                    'job_store': 'default',
                    'user_job_store': user_job_store,
                    'function_name': 'deep_competitor_analysis_executor.execute_task',
                    'website_url': task.website_url,
                    'task_id': task.id,
                    'is_database_task': True,
                    'frequency': frequency_label,
                    'task_category': 'deep_competitor_analysis',
                    'status': task.status,
                    'last_success': task.last_success.isoformat() if task.last_success else None,
                    'last_failure': task.last_failure.isoformat() if task.last_failure else None,
                    'failure_reason': task.failure_reason,
                    'consecutive_failures': task.consecutive_failures,
                }
                formatted_jobs.append(job_info)
        except Exception as e:
            logger.error(f"Error loading deep competitor analysis tasks: {e}", exc_info=True)
        # Load SIF indexing tasks
        try:
            sif_tasks = db.query(SIFIndexingTask).filter(
                SIFIndexingTask.status.in_(['active', 'failed', 'needs_intervention'])
            ).all()
            if user_id_str:
                sif_tasks = [t for t in sif_tasks if t.user_id == user_id_str]
            for task in sif_tasks:
                try:
                    user_job_store = get_user_job_store_name(task.user_id, db)
                except Exception:
                    user_job_store = 'default'
                job_info = {
                    'id': f"sif_indexing_{task.user_id}_{task.id}",
                    'trigger_type': 'CronTrigger',
                    'next_run_time': task.next_execution.isoformat() if task.next_execution else None,
                    'user_id': task.user_id,
                    'job_store': 'default',
                    'user_job_store': user_job_store,
                    'function_name': 'sif_indexing_executor.execute_task',
                    'website_url': task.website_url,
                    'task_id': task.id,
                    'is_database_task': True,
                    'frequency': f'Every {task.frequency_hours}h' if task.frequency_hours else 'Every 48h',
                    'task_category': 'sif_indexing',
                    'status': task.status,
                    'last_success': task.last_success.isoformat() if task.last_success else None,
                    'last_failure': task.last_failure.isoformat() if task.last_failure else None,
                    'failure_reason': task.failure_reason,
                    'consecutive_failures': task.consecutive_failures,
                }
                formatted_jobs.append(job_info)
        except Exception as e:
            logger.error(f"Error loading SIF indexing tasks: {e}", exc_info=True)
        # Load market trends tasks
        try:
            trends_tasks = db.query(MarketTrendsTask).filter(
                MarketTrendsTask.status.in_(['active', 'failed', 'needs_intervention'])
            ).all()
            if user_id_str:
                trends_tasks = [t for t in trends_tasks if t.user_id == user_id_str]
            for task in trends_tasks:
                try:
                    user_job_store = get_user_job_store_name(task.user_id, db)
                except Exception:
                    user_job_store = 'default'
                job_info = {
                    'id': f"market_trends_{task.user_id}_{task.id}",
                    'trigger_type': 'CronTrigger',
                    'next_run_time': task.next_execution.isoformat() if task.next_execution else None,
                    'user_id': task.user_id,
                    'job_store': 'default',
                    'user_job_store': user_job_store,
                    'function_name': 'market_trends_executor.execute_task',
                    'website_url': task.website_url,
                    'task_id': task.id,
                    'is_database_task': True,
                    'frequency': f'Every {task.frequency_hours}h' if task.frequency_hours else 'Every 72h',
                    'task_category': 'market_trends',
                    'status': task.status,
                    'last_success': task.last_success.isoformat() if task.last_success else None,
                    'last_failure': task.last_failure.isoformat() if task.last_failure else None,
                    'failure_reason': task.failure_reason,
                    'consecutive_failures': task.consecutive_failures,
                }
                formatted_jobs.append(job_info)
        except Exception as e:
            logger.error(f"Error loading market trends tasks: {e}", exc_info=True)
        # Load advertools tasks
        try:
            advertools_tasks = db.query(AdvertoolsTask).filter(
                AdvertoolsTask.status.in_(['active', 'failed', 'paused'])
            ).all()
            if user_id_str:
                advertools_tasks = [t for t in advertools_tasks if t.user_id == user_id_str]
            for task in advertools_tasks:
                try:
                    user_job_store = get_user_job_store_name(task.user_id, db)
                except Exception:
                    user_job_store = 'default'
                job_info = {
                    'id': f"advertools_{task.user_id}_{task.id}",
                    'trigger_type': 'CronTrigger',
                    'next_run_time': task.next_execution.isoformat() if task.next_execution else None,
                    'user_id': task.user_id,
                    'job_store': 'default',
                    'user_job_store': user_job_store,
                    'function_name': 'advertools_executor.execute_task',
                    'website_url': task.website_url,
                    'task_id': task.id,
                    'is_database_task': True,
                    'frequency': f'Every {task.frequency_days}d' if task.frequency_days else 'Weekly',
                    'task_category': 'advertools',
                    'status': task.status,
                    'last_success': task.last_success.isoformat() if task.last_success else None,
                    'last_failure': task.last_failure.isoformat() if task.last_failure else None,
                    'failure_reason': task.failure_reason,
                    'consecutive_failures': task.consecutive_failures,
                }
                formatted_jobs.append(job_info)
        except Exception as e:
            logger.error(f"Error loading advertools tasks: {e}", exc_info=True)
        # Get active strategies count
        active_strategies = stats.get('active_strategies_count', 0)
@@ -1237,7 +1433,9 @@ async def manual_trigger_task(
    This bypasses the cool-off check and executes the task immediately.
    Args:
-        task_type: Task type (oauth_token_monitoring, website_analysis, gsc_insights, bing_insights)
+        task_type: Task type (oauth_token_monitoring, website_analysis, gsc_insights, bing_insights,
                    onboarding_full_website_analysis, deep_competitor_analysis, sif_indexing,
                    market_trends, advertools, deep_website_crawl)
        task_id: Task ID
    Returns:
@@ -1261,6 +1459,30 @@ async def manual_trigger_task(
            task = db.query(PlatformInsightsTask).filter(
                PlatformInsightsTask.id == task_id
            ).first()
        elif task_type == "onboarding_full_website_analysis":
            task = db.query(OnboardingFullWebsiteAnalysisTask).filter(
                OnboardingFullWebsiteAnalysisTask.id == task_id
            ).first()
        elif task_type == "deep_competitor_analysis":
            task = db.query(DeepCompetitorAnalysisTask).filter(
                DeepCompetitorAnalysisTask.id == task_id
            ).first()
        elif task_type == "sif_indexing":
            task = db.query(SIFIndexingTask).filter(
                SIFIndexingTask.id == task_id
            ).first()
        elif task_type == "market_trends":
            task = db.query(MarketTrendsTask).filter(
                MarketTrendsTask.id == task_id
            ).first()
        elif task_type == "advertools":
            task = db.query(AdvertoolsTask).filter(
                AdvertoolsTask.id == task_id
            ).first()
        elif task_type == "deep_website_crawl":
            task = db.query(DeepWebsiteCrawlTask).filter(
                DeepWebsiteCrawlTask.id == task_id
            ).first()
        else:
            raise HTTPException(status_code=400, detail=f"Unknown task type: {task_type}")
@@ -1363,3 +1585,219 @@ async def get_platform_insights_logs(
        logger.error(f"Error getting platform insights logs for user {user_id}: {e}", exc_info=True)
        raise HTTPException(status_code=500, detail=f"Failed to get platform insights logs: {str(e)}")
 TASK_DISPLAY_INFO = {
    "onboarding_full_website_analysis": {"label": "Full-Site SEO Audit", "description": "Crawls your entire website and generates per-page SEO audit results.", "frequency": "One-time"},
    "deep_competitor_analysis": {"label": "Deep Competitor Analysis", "description": "Analyzes competitors' content strategy, keywords, and positioning.", "frequency": "Weekly (strategic insights) or One-time"},
    "sif_indexing": {"label": "SIF Content Indexing", "description": "Indexes your website content into the Semantic Intelligence Framework for agent-powered recommendations.", "frequency": "Every 48 hours"},
    "market_trends": {"label": "Market Trends", "description": "Monitors search trends and surfaces high-impact content opportunities.", "frequency": "Every 72 hours"},
    "advertools": {"label": "Advertools Analysis", "description": "Runs brand analysis and site health audits using Advertools.", "frequency": "Weekly"},
    "oauth_token_monitoring": {"label": "OAuth Token Health", "description": "Monitors and refreshes OAuth tokens for connected platforms (GSC, Bing, WordPress, Wix).", "frequency": "Weekly"},
    "website_analysis": {"label": "Website Analysis", "description": "Periodically re-crawls your website and updates style analysis, content pillars, and SEO data.", "frequency": "Every 10 days"},
    "gsc_insights": {"label": "Google Search Console Insights", "description": "Pulls search performance data from Google Search Console.", "frequency": "Weekly"},
    "bing_insights": {"label": "Bing Insights", "description": "Pulls search performance data from Bing Webmaster Tools.", "frequency": "Weekly"},
    "deep_website_crawl": {"label": "Deep Website Crawl", "description": "Performs deep crawl of your website for technical SEO issues.", "frequency": "Weekly"},
    "platform_insights": {"label": "Platform Insights", "description": "Aggregates search performance data from connected platforms.", "frequency": "Weekly"},
 }
@router.get("/onboarding-tasks/{user_id}")
 async def get_onboarding_tasks(
    user_id: str,
    db: Session = Depends(get_db),
    current_user: Dict[str, Any] = Depends(get_current_user)
 ):
    """
    Get all tasks created during onboarding for a user, with status and human-readable descriptions.
    """
    try:
        if str(current_user.get('id')) != user_id:
            raise HTTPException(status_code=403, detail="Access denied")
        tasks = []
        def _fmt_status(s):
            return s.replace('_', ' ').title() if s else 'Unknown'
        def _fmt_dt(dt):
            return dt.isoformat() if dt else None
        # Onboarding full-site SEO audit
        for t in db.query(OnboardingFullWebsiteAnalysisTask).filter(
            OnboardingFullWebsiteAnalysisTask.user_id == user_id
        ).all():
            info = TASK_DISPLAY_INFO.get("onboarding_full_website_analysis", {})
            tasks.append({
                "task_type": "onboarding_full_website_analysis",
                "label": info.get("label", "Full-Site SEO Audit"),
                "description": info.get("description", ""),
                "frequency": info.get("frequency", "One-time"),
                "task_id": t.id,
                "website_url": t.website_url,
                "status": t.status,
                "status_label": _fmt_status(t.status),
                "last_success": _fmt_dt(t.last_success),
                "last_failure": _fmt_dt(t.last_failure),
                "next_execution": _fmt_dt(t.next_execution),
                "failure_reason": t.failure_reason,
                "consecutive_failures": t.consecutive_failures,
            })
        # Deep competitor analysis
        for t in db.query(DeepCompetitorAnalysisTask).filter(
            DeepCompetitorAnalysisTask.user_id == user_id
        ).all():
            info = TASK_DISPLAY_INFO.get("deep_competitor_analysis", {})
            payload = t.payload or {}
            freq_label = "One-time"
            if payload.get("mode") == "strategic_insights":
                freq_label = "Weekly"
            tasks.append({
                "task_type": "deep_competitor_analysis",
                "label": info.get("label", "Deep Competitor Analysis"),
                "description": info.get("description", ""),
                "frequency": freq_label,
                "task_id": t.id,
                "website_url": t.website_url,
                "status": t.status,
                "status_label": _fmt_status(t.status),
                "last_success": _fmt_dt(t.last_success),
                "last_failure": _fmt_dt(t.last_failure),
                "next_execution": _fmt_dt(t.next_execution),
                "failure_reason": t.failure_reason,
                "consecutive_failures": t.consecutive_failures,
            })
        # SIF indexing
        for t in db.query(SIFIndexingTask).filter(
            SIFIndexingTask.user_id == user_id
        ).all():
            info = TASK_DISPLAY_INFO.get("sif_indexing", {})
            tasks.append({
                "task_type": "sif_indexing",
                "label": info.get("label", "SIF Content Indexing"),
                "description": info.get("description", ""),
                "frequency": f"Every {t.frequency_hours or 48}h",
                "task_id": t.id,
                "website_url": t.website_url,
                "status": t.status,
                "status_label": _fmt_status(t.status),
                "last_success": _fmt_dt(t.last_success),
                "last_failure": _fmt_dt(t.last_failure),
                "next_execution": _fmt_dt(t.next_execution),
                "failure_reason": t.failure_reason,
                "consecutive_failures": t.consecutive_failures,
            })
        # Market trends
        for t in db.query(MarketTrendsTask).filter(
            MarketTrendsTask.user_id == user_id
        ).all():
            info = TASK_DISPLAY_INFO.get("market_trends", {})
            tasks.append({
                "task_type": "market_trends",
                "label": info.get("label", "Market Trends"),
                "description": info.get("description", ""),
                "frequency": f"Every {t.frequency_hours or 72}h",
                "task_id": t.id,
                "website_url": t.website_url,
                "status": t.status,
                "status_label": _fmt_status(t.status),
                "last_success": _fmt_dt(t.last_success),
                "last_failure": _fmt_dt(t.last_failure),
                "next_execution": _fmt_dt(t.next_execution),
                "failure_reason": t.failure_reason,
                "consecutive_failures": t.consecutive_failures,
            })
        # Advertools
        for t in db.query(AdvertoolsTask).filter(
            AdvertoolsTask.user_id == user_id
        ).all():
            info = TASK_DISPLAY_INFO.get("advertools", {})
            tasks.append({
                "task_type": "advertools",
                "label": info.get("label", "Advertools Analysis"),
                "description": info.get("description", ""),
                "frequency": f"Every {t.frequency_days or 7}d",
                "task_id": t.id,
                "website_url": t.website_url,
                "status": t.status,
                "status_label": _fmt_status(t.status),
                "last_success": _fmt_dt(t.last_success),
                "last_failure": _fmt_dt(t.last_failure),
                "next_execution": _fmt_dt(t.next_execution),
                "failure_reason": t.failure_reason,
                "consecutive_failures": t.consecutive_failures,
            })
        # Also include website analysis & OAuth tasks created during onboarding
        for t in db.query(WebsiteAnalysisTask).filter(
            WebsiteAnalysisTask.user_id == user_id
        ).all():
            info = TASK_DISPLAY_INFO.get("website_analysis", {})
            tasks.append({
                "task_type": "website_analysis",
                "label": info.get("label", "Website Analysis") + (f" ({t.task_type})" if t.task_type == 'competitor' else ""),
                "description": info.get("description", ""),
                "frequency": f"Every {t.frequency_days or 10}d",
                "task_id": t.id,
                "website_url": t.website_url,
                "status": t.status,
                "status_label": _fmt_status(t.status),
                "last_success": _fmt_dt(t.last_success),
                "last_failure": _fmt_dt(t.last_failure),
                "next_execution": _fmt_dt(t.next_check),
                "failure_reason": t.failure_reason,
                "consecutive_failures": t.consecutive_failures,
            })
        for t in db.query(OAuthTokenMonitoringTask).filter(
            OAuthTokenMonitoringTask.user_id == user_id
        ).all():
            info = TASK_DISPLAY_INFO.get("oauth_token_monitoring", {})
            tasks.append({
                "task_type": "oauth_token_monitoring",
                "label": info.get("label", "OAuth Token Health") + f" ({t.platform})",
                "description": info.get("description", ""),
                "frequency": info.get("frequency", "Weekly"),
                "task_id": t.id,
                "website_url": None,
                "status": t.status,
                "status_label": _fmt_status(t.status),
                "last_success": _fmt_dt(t.last_success),
                "last_failure": _fmt_dt(t.last_failure),
                "next_execution": _fmt_dt(t.next_check),
                "failure_reason": t.failure_reason,
                "consecutive_failures": t.consecutive_failures,
            })
        for t in db.query(PlatformInsightsTask).filter(
            PlatformInsightsTask.user_id == user_id
        ).all():
            task_key = f"{t.platform}_insights"
            info = TASK_DISPLAY_INFO.get(task_key) or TASK_DISPLAY_INFO.get("platform_insights", {})
            tasks.append({
                "task_type": task_key,
                "label": info.get("label", "Platform Insights") + f" ({t.platform})",
                "description": info.get("description", ""),
                "frequency": info.get("frequency", "Weekly"),
                "task_id": t.id,
                "website_url": t.site_url,
                "status": t.status,
                "status_label": _fmt_status(t.status),
                "last_success": _fmt_dt(t.last_success),
                "last_failure": _fmt_dt(t.last_failure),
                "next_execution": _fmt_dt(t.next_check),
                "failure_reason": t.failure_reason,
                "consecutive_failures": t.consecutive_failures,
            })
        return {"success": True, "tasks": tasks, "count": len(tasks)}
    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"Error getting onboarding tasks for user {user_id}: {e}", exc_info=True)
        raise HTTPException(status_code=500, detail=f"Failed to get onboarding tasks: {str(e)}")
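Review note: every loop in `get_onboarding_tasks` builds the same dictionary shape, differing only in the task type, the label suffix, the frequency string, and whether the next run lives in `next_execution` or `next_check`. A minimal sketch of a shared builder follows. `build_task_entry`, `FakeTask`, and `DISPLAY_INFO` are hypothetical names used only for illustration; they are not part of this change, and the real ORM models may expose different attributes.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional

# Hypothetical stand-in for TASK_DISPLAY_INFO, trimmed to one entry.
DISPLAY_INFO: Dict[str, Dict[str, str]] = {
    "sif_indexing": {
        "label": "SIF Content Indexing",
        "description": "Indexes website content.",
        "frequency": "Every 48 hours",
    },
}


@dataclass
class FakeTask:
    """Hypothetical stand-in for an ORM row; field names follow the diff."""
    id: int
    status: Optional[str]
    website_url: Optional[str] = None
    last_success: Optional[datetime] = None
    last_failure: Optional[datetime] = None
    next_execution: Optional[datetime] = None
    failure_reason: Optional[str] = None
    consecutive_failures: int = 0


def fmt_status(status: Optional[str]) -> str:
    # "needs_intervention" becomes "Needs Intervention"; empty values become "Unknown".
    return status.replace("_", " ").title() if status else "Unknown"


def fmt_dt(dt: Optional[datetime]) -> Optional[str]:
    return dt.isoformat() if dt else None


def build_task_entry(
    task_type: str,
    task: Any,
    *,
    label_suffix: str = "",
    frequency: Optional[str] = None,
    next_attr: str = "next_execution",
) -> Dict[str, Any]:
    """Build one response entry; next_attr selects next_execution or next_check."""
    info = DISPLAY_INFO.get(task_type, {})
    return {
        "task_type": task_type,
        "label": info.get("label", task_type) + label_suffix,
        "description": info.get("description", ""),
        "frequency": frequency or info.get("frequency", "Unknown"),
        "task_id": task.id,
        "website_url": getattr(task, "website_url", None),
        "status": task.status,
        "status_label": fmt_status(task.status),
        "last_success": fmt_dt(task.last_success),
        "last_failure": fmt_dt(task.last_failure),
        "next_execution": fmt_dt(getattr(task, next_attr, None)),
        "failure_reason": task.failure_reason,
        "consecutive_failures": task.consecutive_failures,
    }
```

With a helper of this shape, each loop body would reduce to a single `tasks.append(build_task_entry(...))` call, and a field added later would only need to be added in one place.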
--- a/backend/api/seo_dashboard.py
+++ b/backend/api/seo_dashboard.py
@@ -75,7 +75,9 @@ class SEODashboardData(BaseModel):
    platforms: Dict[str, PlatformStatus]
    ai_insights: List[AIInsight]
    last_updated: str
-    website_url: Optional[str] = None  # User's website URL from onboarding
+    website_url: Optional[str] = None
    advertools_insights: Optional[Dict[str, Any]] = None
    technical_seo_audit: Optional[Dict[str, Any]] = None
 # New models for comprehensive SEO analysis
 class SEOAnalysisRequest(BaseModel):
@@ -378,7 +380,9 @@ async def get_seo_dashboard_data(current_user: dict = Depends(get_current_user))
                platforms=_convert_platforms(overview_data.get("platforms", {})),
                ai_insights=[AIInsight(**insight) for insight in overview_data.get("ai_insights", [])],
                last_updated=overview_data.get("last_updated", datetime.now().isoformat()),
-                website_url=overview_data.get("website_url")
+                website_url=overview_data.get("website_url"),
                advertools_insights=overview_data.get("advertools_insights"),
                technical_seo_audit=overview_data.get("technical_seo_audit"),
            )
        finally:
            db_session.close()
--- a/backend/api/wix_routes.py
+++ b/backend/api/wix_routes.py
@@ -16,6 +16,7 @@ import time
 from services.wix_service import WixService
 from services.integrations.wix_oauth import WixOAuthService
 from services.integrations.wix.utils import extract_meta_from_token
 from services.integrations.oauth_callback_utils import (
    build_oauth_callback_html,
    sanitize_error,
@@ -102,6 +103,38 @@ def _map_wix_error(exc: Exception, fallback: str = "Wix API request failed") ->
            detail="Network error connecting to Wix. Please check your connection and try again."
        )
    # Handle WixAPIError from our retry/API layer
    from services.integrations.wix.retry import WixAPIError
    if isinstance(exc, WixAPIError):
        status = exc.status_code
        msg = exc.response_body or str(exc)
        if status == 401:
            return HTTPException(
                status_code=401,
                detail="Wix authorization failed. Please reconnect your Wix account."
            )
        if status == 403:
            return HTTPException(
                status_code=403,
                detail="Wix permission denied. Ensure your OAuth app has blog permissions (BLOG.CREATE-DRAFT)."
            )
        if status == 404:
            return HTTPException(
                status_code=502,
                detail="Wix API endpoint not found. Ensure the site ID is correct and the blog feature is enabled."
            )
        if status == 429:
            return HTTPException(
                status_code=429,
                detail="Wix rate limit exceeded. Please wait a moment and try again."
            )
        if status in (500, 502, 503, 504):
            return HTTPException(
                status_code=502,
                detail="Wix service temporarily unavailable. Please try again in a moment."
            )
        return HTTPException(status_code=status or 502, detail=msg or fallback)
    # For validation errors from blog_publisher
    error_str = str(exc)
    if "validation failed" in error_str.lower():
@@ -150,12 +183,16 @@ def _resolve_valid_wix_token(current_user: dict) -> Dict[str, Any]:
            expires_in=refreshed.get("expires_in"),
            token_id=token_id,
        )
        site_id = candidate.get("site_id")
        if not site_id:
            meta_info = extract_meta_from_token(refreshed.get("access_token"))
            site_id = meta_info.get('metaSiteId') or site_id
        logger.info(f"Wix token refreshed successfully on attempt {attempt} for user {user_id[:8]}...")
        return {
            "access_token": refreshed.get("access_token"),
            "refresh_token": refreshed.get("refresh_token", refresh_token),
            "member_id": candidate.get("member_id"),
-            "site_id": candidate.get("site_id"),
+            "site_id": site_id,
        }
    raise HTTPException(status_code=401, detail="Wix token expired and cannot be refreshed")
@@ -315,6 +352,9 @@ async def handle_oauth_callback(request: WixAuthRequest, current_user: dict = De
            try:
                site_info = wix_service.get_site_info(access_token)
                site_id = site_info.get('siteId') or site_info.get('site_id')
                if not site_id and site_info.get('_no_site'):
                    meta_info = extract_meta_from_token(access_token)
                    site_id = meta_info.get('metaSiteId')
            except Exception as e:
                logger.warning(f"get_site_info failed (non-fatal): {e}")
            try:
@@ -322,7 +362,7 @@ async def handle_oauth_callback(request: WixAuthRequest, current_user: dict = De
            except Exception:
                pass
            try:
-                permissions = wix_service.check_blog_permissions(access_token)
+                permissions = wix_service.check_blog_permissions(access_token, site_id=site_id)
            except Exception as e:
                logger.warning(f"check_blog_permissions failed (non-fatal): {e}")
@@ -351,11 +391,14 @@ async def handle_oauth_callback(request: WixAuthRequest, current_user: dict = De
            try:
                site_info = wix_service.get_site_info(access_token)
                site_id = site_info.get('siteId') or site_info.get('site_id')
                if not site_id and site_info.get('_no_site'):
                    meta_info = extract_meta_from_token(access_token)
                    site_id = meta_info.get('metaSiteId') or site_id
            except Exception as e:
                logger.warning(f"get_site_info failed (non-fatal): {e}")
            try:
-                from services.integrations.wix.utils import extract_meta_from_token
+                meta_info = extract_meta_from_token(access_token)
-                site_id = extract_meta_from_token(access_token) or site_id
+                site_id = meta_info.get('metaSiteId') or site_id
            except Exception:
                pass
            try:
@@ -363,7 +406,7 @@ async def handle_oauth_callback(request: WixAuthRequest, current_user: dict = De
            except Exception:
                pass
            try:
-                permissions = wix_service.check_blog_permissions(access_token)
+                permissions = wix_service.check_blog_permissions(access_token, site_id=site_id)
            except Exception as e:
                logger.warning(f"check_blog_permissions failed (non-fatal): {e}")
        else:
@@ -425,10 +468,13 @@ async def handle_oauth_callback_get(code: str, state: Optional[str] = None, requ
        try:
            site_info = wix_service.get_site_info(tokens['access_token'])
            site_id = site_info.get('siteId') or site_info.get('site_id')
            if not site_id and site_info.get('_no_site'):
                meta_info = extract_meta_from_token(tokens['access_token'])
                site_id = meta_info.get('metaSiteId')
        except Exception as e:
            logger.warning(f"GET callback: get_site_info non-fatal: {e}")
        try:
-            permissions = wix_service.check_blog_permissions(tokens['access_token'])
+            permissions = wix_service.check_blog_permissions(tokens['access_token'], site_id=site_id)
        except Exception as e:
            logger.warning(f"GET callback: check_blog_permissions non-fatal: {e}")
@@ -499,17 +545,34 @@ async def get_connection_status(current_user: dict = Depends(get_current_user))
    try:
        token_info = _resolve_valid_wix_token(current_user)
        access_token = token_info["access_token"]
        site_id = token_info.get("site_id")
        # Check site info — distinguish "no site" from "token expired"
        site_info = wix_service.get_site_info(access_token)
-        permissions = wix_service.check_blog_permissions(access_token)
+        if site_info.get("_auth_failed"):
            return {
                "connected": False,
                "has_permissions": False,
                "error": "Wix token expired — please reconnect",
                "reconnect_required": True
            }
        # If get_site_info returned _no_site, try extracting metaSiteId from token
        if site_info.get("_no_site") and not site_id:
            meta_info = extract_meta_from_token(access_token)
            site_id = meta_info.get('metaSiteId')
        permissions = wix_service.check_blog_permissions(access_token, site_id=site_id)
        return {
            "connected": True,
            "has_permissions": permissions.get("has_permissions", False),
            "site_info": site_info,
-            "permissions": permissions
+            "permissions": permissions,
            "site_id": site_id,
        }
    except HTTPException as e:
        if e.status_code == 401:
-            return {"connected": False, "has_permissions": False, "error": "Wix account not connected"}
+            return {"connected": False, "has_permissions": False, "error": "Wix account not connected", "reconnect_required": True}
        raise
    except Exception as e:
        logger.error(f"Failed to check connection status: {e}")
@@ -557,6 +620,9 @@ async def publish_to_wix(request: WixPublishRequest, current_user: dict = Depend
                access_token = token_info["access_token"]
                if not site_id:
                    site_id = token_info.get("site_id")
                if not site_id:
                    meta_info = extract_meta_from_token(access_token)
                    site_id = meta_info.get('metaSiteId')
                logger.info(f"Wix publish: using backend DB token for user {_get_current_user_id(current_user)[:8]}...")
            except HTTPException:
                access_token = None
@@ -641,12 +707,14 @@ async def publish_to_wix(request: WixPublishRequest, current_user: dict = Depend
            post_url = raw_url
        else:
            post_url = None
        publish_warnings = result.get("_warnings", [])
        all_warnings = [w for w in [content_warning] + publish_warnings if w]
        return {
            "success": True,
            "post_id": str(post.get("id", "")),
            "url": post_url,
            "publish_state": "PUBLISHED" if request.publish else "DRAFT",
-            **({"warning": content_warning} if content_warning else {}),
+            **({"warning": " | ".join(all_warnings)} if all_warnings else {}),
        }
    except Exception as e:
        logger.error(f"Failed to publish to Wix: {e}")
@@ -930,11 +998,13 @@ async def test_publish_real(payload: Dict[str, Any], _: Dict[str, Any] = Depends
            seo_metadata=seo_metadata,
        )
        publish_warnings = result.get("_warnings", [])
        return {
            "success": True,
            "post_id": (result.get("draftPost") or result.get("post") or {}).get("id"),
            "url": (result.get("draftPost") or result.get("post") or {}).get("url"),
            "message": "Blog post published to Wix",
            **({"warning": " | ".join(publish_warnings)} if publish_warnings else {}),
        }
    except HTTPException:
        raise
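Review note: the `WixAPIError` branch added to `_map_wix_error` is a fixed mapping from upstream status to client-facing status and message, so it could also be written as a lookup table. The sketch below assumes that reading of the diff. `map_wix_status` is a hypothetical helper that returns a `(status_code, detail)` tuple rather than an `HTTPException`, so it runs without FastAPI installed.

```python
from typing import Dict, Optional, Tuple

# Upstream Wix status -> (status returned to our client, message).
# Messages are copied from the diff; the 404 -> 502 translation is deliberate there.
_WIX_STATUS_MAP: Dict[int, Tuple[int, str]] = {
    401: (401, "Wix authorization failed. Please reconnect your Wix account."),
    403: (403, "Wix permission denied. Ensure your OAuth app has blog permissions (BLOG.CREATE-DRAFT)."),
    404: (502, "Wix API endpoint not found. Ensure the site ID is correct and the blog feature is enabled."),
    429: (429, "Wix rate limit exceeded. Please wait a moment and try again."),
}
_UPSTREAM_FAILURES = {500, 502, 503, 504}


def map_wix_status(
    status: Optional[int],
    body: str = "",
    fallback: str = "Wix API request failed",
) -> Tuple[int, str]:
    """Translate an upstream Wix status into (status_code, detail)."""
    if status in _WIX_STATUS_MAP:
        return _WIX_STATUS_MAP[status]
    if status in _UPSTREAM_FAILURES:
        return 502, "Wix service temporarily unavailable. Please try again in a moment."
    # Unknown or missing status: pass it through, defaulting to 502 Bad Gateway.
    return status or 502, body or fallback
```

The caller would wrap the result as `HTTPException(status_code=code, detail=detail)`. Keeping the table separate from the exception construction makes the mapping easy to unit-test.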
--- a/backend/api/youtube/router.py
+++ b/backend/api/youtube/router.py
@@ -167,10 +167,10 @@ class SceneVideoRenderResponse(BaseModel):
 class CombineVideosRequest(BaseModel):
    """Request model for combining multiple scene videos."""
-    video_urls: List[str] = Field(..., description="List of scene video URLs to combine in order")
+    scene_video_urls: List[str] = Field(..., description="List of scene video URLs to combine in order")
    video_plan: Optional[Dict[str, Any]] = Field(None, description="Original video plan (for metadata)")
    resolution: str = Field("720p", pattern="^(480p|720p|1080p)$", description="Target resolution for output")
-    title: Optional[str] = Field(None, description="Optional title for the final video")
+    title: Optional[str] = Field(None, description="Optional title for the combined video")
 class CombineVideosResponse(BaseModel):
@@ -187,13 +187,6 @@ class VideoListResponse(BaseModel):
    message: str = "Videos fetched successfully"
 class CombineVideosRequest(BaseModel):
    """Request model for combining multiple scene videos."""
    scene_video_urls: List[str] = Field(..., description="List of scene video URLs to combine")
    resolution: str = Field("720p", pattern="^(480p|720p|1080p)$", description="Output video resolution")
    title: Optional[str] = Field(None, description="Optional title for the combined video")
 class VideoRenderResponse(BaseModel):
    """Response model for video rendering."""
    success: bool
@@ -721,85 +714,6 @@ async def get_render_status(
        )
@router.post("/render/combine", response_model=VideoRenderResponse)
 async def combine_videos(
    request: CombineVideosRequest,
    background_tasks: BackgroundTasks,
    current_user: Dict[str, Any] = Depends(get_current_user),
    db: Session = Depends(get_db),
 ) -> VideoRenderResponse:
    """
    Combine multiple scene videos into a final video.
    Returns task_id for polling.
    """
    try:
        user_id = require_authenticated_user(current_user)
        # Subscription validation
        pricing_service = PricingService(db)
        validate_scene_animation_operation(
            pricing_service=pricing_service,
            user_id=user_id
        )
        if not request.scene_video_urls or len(request.scene_video_urls) < 2:
            return VideoRenderResponse(
                success=False,
                message="At least two scene videos are required to combine."
            )
        task_id = task_manager.create_task("youtube_combine_video")
        logger.info(
            f"[YouTubeAPI] Created combine task {task_id} for user {user_id}, videos={len(request.scene_video_urls)}, resolution={request.resolution}"
        )
        initial_status = task_manager.get_task_status(task_id)
        if not initial_status:
            logger.error(f"[YouTubeAPI] Failed to create combine task {task_id} - task not found immediately after creation")
            return VideoRenderResponse(
                success=False,
                message="Failed to create combine task. Please try again."
            )
        try:
            background_tasks.add_task(
                _execute_combine_video_task,
                task_id=task_id,
                scene_video_urls=request.scene_video_urls,
                user_id=user_id,
                resolution=request.resolution,
                title=request.title,
            )
            logger.info(f"[YouTubeAPI] Background combine task added for {task_id}")
        except Exception as bg_error:
            logger.error(f"[YouTubeAPI] Failed to add combine background task for {task_id}: {bg_error}", exc_info=True)
            task_manager.update_task_status(
                task_id,
                "failed",
                error=str(bg_error),
                message="Failed to start combine task"
            )
            return VideoRenderResponse(
                success=False,
                message=f"Failed to start combine task: {str(bg_error)}"
            )
        return VideoRenderResponse(
            success=True,
            task_id=task_id,
            message="Video combination started."
        )
    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"[YouTubeAPI] Error starting combine: {e}", exc_info=True)
        return VideoRenderResponse(
            success=False,
            message=f"Failed to start combine: {str(e)}"
        )
 def _execute_video_render_task(
    task_id: str,
    scenes: List[Dict[str, Any]],
@@ -1270,20 +1184,21 @@ async def combine_scene_videos(
            user_id=user_id
        )
-        if not request.video_urls or len(request.video_urls) < 2:
+        if not request.scene_video_urls or len(request.scene_video_urls) < 2:
            return CombineVideosResponse(
                success=False,
                task_id=None,
-                message="At least two videos are required to combine."
+                message="At least two scene videos are required to combine."
            )
-        # Pre-validate that referenced video files exist and are within youtube_videos dir
+        user_workspace = UserWorkspaceManager(db)
        workspace_info = user_workspace.get_user_workspace(user_id)
        youtube_video_dir = Path(workspace_info['workspace_path']) / "content" / "videos" if workspace_info and workspace_info.get('workspace_path') else YOUTUBE_VIDEO_DIR
        base_dir = Path(__file__).parent.parent.parent.parent
-        youtube_video_dir = base_dir / "youtube_videos"
+        legacy_video_dir = base_dir / "youtube_videos"
        missing_files = []
-        for url in request.video_urls:
+        for url in request.scene_video_urls:
-            filename = Path(url).name  # strips query params if present
+            filename = Path(url).name
            video_path = youtube_video_dir / filename
            # prevent directory traversal
            if ".." in filename or "/" in filename or "\\" in filename:
                return CombineVideosResponse(
@@ -1291,8 +1206,13 @@ async def combine_scene_videos(
                    task_id=None,
                    message=f"Invalid video filename: {filename}"
                )
            video_path = youtube_video_dir / filename
            if not video_path.exists():
-                missing_files.append(filename)
+                legacy_path = legacy_video_dir / filename
                if legacy_path.exists():
                    video_path = legacy_path
                else:
                    missing_files.append(filename)
        if missing_files:
            return CombineVideosResponse(
                success=False,
@@ -1303,7 +1223,7 @@ async def combine_scene_videos(
        # Create task
        task_id = task_manager.create_task("youtube_video_combine")
        logger.info(
-            f"[YouTubeAPI] Created combine task {task_id} for user {user_id}, videos={len(request.video_urls)}, resolution={request.resolution}"
+            f"[YouTubeAPI] Created combine task {task_id} for user {user_id}, videos={len(request.scene_video_urls)}, resolution={request.resolution}"
        )
        initial_status = task_manager.get_task_status(task_id)
@@ -1320,7 +1240,7 @@ async def combine_scene_videos(
            background_tasks.add_task(
                _execute_combine_video_task,
                task_id=task_id,
-                scene_video_urls=request.video_urls,
+                scene_video_urls=request.scene_video_urls,
                user_id=user_id,
                resolution=request.resolution,
                title=request.title,
@@ -1343,7 +1263,7 @@ async def combine_scene_videos(
        return CombineVideosResponse(
            success=True,
            task_id=task_id,
-            message=f"Combining {len(request.video_urls)} videos...",
+            message=f"Combining {len(request.scene_video_urls)} videos...",
        )
    except HTTPException:
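Review note on the filename check in the hunks above: `Path(url).name` keeps any query string attached to the last path segment, which is consistent with the removed "strips query params" comment. The following is a minimal sketch, assuming scene video URLs may carry query parameters; `safe_video_filename` is a hypothetical helper and is not part of this diff.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse


def safe_video_filename(url: str) -> Optional[str]:
    """Hypothetical helper: return the bare filename of a video URL, or None if unsafe."""
    # urlparse separates the path from any ?query or #fragment component.
    path = urlparse(url).path
    name = PurePosixPath(path).name
    # Reject empty names, dot segments, and Windows-style separators.
    if not name or name in {".", ".."} or "\\" in name:
        return None
    return name
```

A caller could reject the request when the helper returns None and otherwise join the returned name onto the workspace video directory, as the endpoint already does.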
--- a/backend/api/youtube/task_manager.py
+++ b/backend/api/youtube/task_manager.py
@@ -1,11 +1,10 @@
 """
 Task Manager for YouTube Creator Studio
-Reuses the Story Writer task manager pattern for async video rendering.
+Delegates to the hybrid DB-backed + in-memory YouTubeTaskManager.
 Maintains backward compatibility with the Story Writer TaskManager API.
 """
-from api.story_writer.task_manager import TaskManager
+from services.youtube.youtube_task_manager import task_manager
 # Shared task manager instance
-task_manager = TaskManager()
 __all__ = ["task_manager"]
--- a/backend/image_studio_images/img_Conceptual_diagram_of_a_digital_marketing_fun_5db260a4.png
+++ b/backend/image_studio_images/img_Conceptual_diagram_of_a_digital_marketing_fun_5db260a4.png
--- a/backend/image_studio_images/img_Conceptual_illustration_of_a_central_AI_brain_ce81867b.png
+++ b/backend/image_studio_images/img_Conceptual_illustration_of_a_central_AI_brain_ce81867b.png
--- a/backend/image_studio_images/img_Professional_infographic_style_visual_with_fo_f6143820.png
+++ b/backend/image_studio_images/img_Professional_infographic_style_visual_with_fo_f6143820.png
--- a/backend/middleware/auth_middleware.py
+++ b/backend/middleware/auth_middleware.py
@@ -1,6 +1,7 @@
 """Authentication middleware for ALwrity backend."""
 import os
 import base64
 import inspect
 from typing import Optional, Dict, Any
 from fastapi import HTTPException, Depends, status, Request, Query
@@ -61,12 +62,23 @@ class ClerkAuthMiddleware:
                if self.clerk_secret_key and self.clerk_publishable_key:
                    # Extract instance from publishable key for JWKS URL and issuer validation
                    # Format: pk_test_<instance>.<domain> or pk_live_<instance>.<domain>
                    # Production keys may have base64-encoded instance IDs
                    parts = self.clerk_publishable_key.replace('pk_test_', '').replace('pk_live_', '').split('.')
                    if len(parts) >= 1:
-                        # Extract the domain from publishable key or use default
-                        # Clerk URLs are typically: https://<instance>.clerk.accounts.dev
-                        instance = parts[0]
-                        issuer_url = f"https://{instance}.clerk.accounts.dev"
+                        # Attempt base64 decode (production Clerk keys encode the instance)
+                        raw_instance = parts[0]
+                        try:
+                            padded = raw_instance + '=' * (4 - len(raw_instance) % 4) if len(raw_instance) % 4 else raw_instance
+                            decoded_bytes = base64.b64decode(padded)
+                            instance = decoded_bytes.decode('utf-8').rstrip('\x00 $\n\r\t')
+                        except Exception:
+                            instance = raw_instance
+                        # If decoded value contains a dot, it's already a full domain path
+                        if '.' in instance:
+                            issuer_url = f"https://{instance}"
+                        else:
+                            issuer_url = f"https://{instance}.clerk.accounts.dev"
                        jwks_url = f"{issuer_url}/.well-known/jwks.json"
                        # Create Clerk configuration with JWKS URL
@@ -288,7 +300,7 @@ async def get_current_user(
                      user_agent = request.headers.get('user-agent', 'unknown')
                 if hasattr(request.headers, 'items'):
-                      all_headers = {k: v[:50] if len(v) > 50 else v for k, v in request.headers.items()}
+                       all_headers = {k: (v[:50] if len(v) > 50 else v) for k, v in request.headers.items() if k.lower() != 'authorization'}
        except:
             pass
@@ -300,7 +312,6 @@ async def get_current_user(
                 f"🔒 AUTHENTICATION ERROR: No credentials provided for authenticated endpoint: {endpoint_path} "
                 f"(client_ip={request.client.host if request.client else 'unknown'}, "
                 f"auth_header_received={'YES' if auth_header else 'NO'}, "
-                f"auth_header_value={auth_header[:50] + '...' if auth_header and len(auth_header) > 50 else (auth_header or 'None')}, "
                f"all_headers={list(all_headers.keys())}, "
                f"user_agent={user_agent})"
            )
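Review note on the issuer derivation in the first hunk of this file: the logic can be exercised in isolation. This is a minimal sketch, assuming the key body after `pk_test_` or `pk_live_` is either a plain instance name or a base64-encoded domain with a trailing `$`; the function name `issuer_from_publishable_key` is hypothetical and does not appear in the middleware.

```python
import base64


def issuer_from_publishable_key(publishable_key: str) -> str:
    """Hypothetical helper mirroring the hunk: derive the issuer URL from a publishable key."""
    raw = publishable_key
    for prefix in ("pk_test_", "pk_live_"):
        if raw.startswith(prefix):
            raw = raw[len(prefix):]
    raw = raw.split(".")[0]
    try:
        # Pad to a multiple of 4 characters before decoding.
        padded = raw + "=" * (-len(raw) % 4)
        instance = base64.b64decode(padded).decode("utf-8").rstrip("\x00 $\n\r\t")
    except ValueError:
        # Covers invalid base64 and invalid UTF-8; fall back to the raw value.
        instance = raw
    # A decoded value containing a dot is treated as a full domain.
    if "." in instance:
        return f"https://{instance}"
    return f"https://{instance}.clerk.accounts.dev"
```

The JWKS URL then follows as `f"{issuer}/.well-known/jwks.json"`, as in the diff.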
--- a/backend/models/blog_models.py
+++ b/backend/models/blog_models.py
@@ -220,6 +220,8 @@ class BlogSectionRequest(BaseModel):
    tone: Optional[str] = None
    persona: Optional[PersonaInfo] = None
    mode: Optional[str] = "polished"  # 'draft' | 'polished'
    research: Optional[BlogResearchResponse] = None
    competitive_advantage: Optional[str] = None
 class BlogSectionResponse(BaseModel):
--- a/backend/models/linkedin_models.py
+++ b/backend/models/linkedin_models.py
@@ -36,6 +36,7 @@ class SearchEngine(str, Enum):
    METAPHOR = "metaphor"
    GOOGLE = "google"
    TAVILY = "tavily"
    EXA = "exa"
 class GroundingLevel(str, Enum):
@@ -57,7 +58,7 @@ class LinkedInPostRequest(BaseModel):
    include_hashtags: bool = Field(default=True, description="Whether to include hashtags")
    include_call_to_action: bool = Field(default=True, description="Whether to include call to action")
    research_enabled: bool = Field(default=True, description="Whether to include research-backed content")
-    search_engine: SearchEngine = Field(default=SearchEngine.GOOGLE, description="Search engine for research")
+    search_engine: SearchEngine = Field(default=SearchEngine.EXA, description="Search engine for research")
    max_length: int = Field(default=3000, description="Maximum character count", ge=100, le=3000)
    grounding_level: GroundingLevel = Field(default=GroundingLevel.ENHANCED, description="Level of content grounding")
    include_citations: bool = Field(default=True, description="Whether to include inline citations")
@@ -94,7 +95,7 @@ class LinkedInArticleRequest(BaseModel):
    include_images: bool = Field(default=True, description="Whether to generate image suggestions")
    seo_optimization: bool = Field(default=True, description="Whether to include SEO optimization")
    research_enabled: bool = Field(default=True, description="Whether to include research-backed content")
-    search_engine: SearchEngine = Field(default=SearchEngine.GOOGLE, description="Search engine for research")
+    search_engine: SearchEngine = Field(default=SearchEngine.EXA, description="Search engine for research")
    word_count: int = Field(default=1500, description="Target word count", ge=500, le=5000)
    grounding_level: GroundingLevel = Field(default=GroundingLevel.ENHANCED, description="Level of content grounding")
    include_citations: bool = Field(default=True, description="Whether to include inline citations")
@@ -129,9 +130,11 @@ class LinkedInCarouselRequest(BaseModel):
    number_of_slides: int = Field(default=5, description="Number of slides", ge=3, le=10)
    include_cover_slide: bool = Field(default=True, description="Whether to include a cover slide")
    include_cta_slide: bool = Field(default=True, description="Whether to include a call-to-action slide")
    key_points: Optional[List[str]] = Field(None, description="Specific key points to cover", max_items=10)
    research_enabled: bool = Field(default=True, description="Whether to include research-backed content")
-    search_engine: SearchEngine = Field(default=SearchEngine.GOOGLE, description="Search engine for research")
+    search_engine: SearchEngine = Field(default=SearchEngine.EXA, description="Search engine for research")
    grounding_level: GroundingLevel = Field(default=GroundingLevel.ENHANCED, description="Level of content grounding")
    color_scheme: str = Field(default="professional", description="Color scheme for PDF rendering: professional, creative, industry, dark, minimal")
    include_citations: bool = Field(default=True, description="Whether to include inline citations")
    class Config:
@@ -144,9 +147,11 @@ class LinkedInCarouselRequest(BaseModel):
                "number_of_slides": 6,
                "include_cover_slide": True,
                "include_cta_slide": True,
                "key_points": ["Remote collaboration tools", "Work-life balance", "Productivity metrics"],
                "research_enabled": True,
                "search_engine": "google",
                "grounding_level": "enhanced",
                "color_scheme": "professional",
                "include_citations": True
            }
        }
@@ -161,8 +166,9 @@ class LinkedInVideoScriptRequest(BaseModel):
    video_duration: int = Field(default=60, description="Target video duration in seconds", ge=30, le=300)
    include_captions: bool = Field(default=True, description="Whether to include captions")
    include_thumbnail_suggestions: bool = Field(default=True, description="Whether to include thumbnail suggestions")
    key_points: Optional[List[str]] = Field(None, description="Specific key points to cover in the video", max_items=10)
    research_enabled: bool = Field(default=True, description="Whether to include research-backed content")
-    search_engine: SearchEngine = Field(default=SearchEngine.GOOGLE, description="Search engine for research")
+    search_engine: SearchEngine = Field(default=SearchEngine.EXA, description="Search engine for research")
    grounding_level: GroundingLevel = Field(default=GroundingLevel.ENHANCED, description="Level of content grounding")
    include_citations: bool = Field(default=True, description="Whether to include inline citations")
@@ -176,6 +182,7 @@ class LinkedInVideoScriptRequest(BaseModel):
                "video_duration": 90,
                "include_captions": True,
                "include_thumbnail_suggestions": True,
                "key_points": ["Zero trust architecture", "Phishing prevention", "Incident response"],
                "research_enabled": True,
                "search_engine": "google",
                "grounding_level": "enhanced",
@@ -193,7 +200,7 @@ class LinkedInCommentResponseRequest(BaseModel):
    response_length: str = Field(default="medium", description="Length of response: short, medium, long")
    include_questions: bool = Field(default=True, description="Whether to include engaging questions")
    research_enabled: bool = Field(default=False, description="Whether to include research-backed content")
-    search_engine: SearchEngine = Field(default=SearchEngine.GOOGLE, description="Search engine for research")
+    search_engine: SearchEngine = Field(default=SearchEngine.EXA, description="Search engine for research")
    grounding_level: GroundingLevel = Field(default=GroundingLevel.BASIC, description="Level of content grounding")
    class Config:
@@ -451,4 +458,24 @@ class LinkedInCommentResponseResult(BaseModel):
    tone_analysis: Optional[Dict[str, Any]] = None
    generation_metadata: Dict[str, Any] = {}
    error: Optional[str] = None
-    grounding_status: Optional[Dict[str, Any]] = Field(None, description="Grounding operation status")
+    grounding_status: Optional[Dict[str, Any]] = Field(None, description="Grounding operation status")
 class LinkedInEditContentRequest(BaseModel):
    """Request model for AI-powered LinkedIn content editing."""
    content: str = Field(..., description="Content to edit", min_length=1)
    edit_type: str = Field(..., description="Type of edit: professionalize, optimize_engagement, add_hashtags, adjust_tone, expand, condense, add_cta")
    industry: Optional[str] = Field(None, description="Industry context for the edit")
    tone: Optional[str] = Field(None, description="Target tone: professional, conversational, authoritative, educational, friendly")
    target_audience: Optional[str] = Field(None, description="Target audience for the content")
    parameters: Optional[Dict[str, Any]] = Field(None, description="Additional parameters specific to edit type")
 class LinkedInEditContentResponse(BaseModel):
    """Response model for AI-powered LinkedIn content editing."""
    success: bool = True
    content: Optional[str] = None
    edit_type: str
    provider: Optional[str] = None
    model: Optional[str] = None
    error: Optional[str] = None
--- a/backend/models/youtube_task_models.py
+++ b/backend/models/youtube_task_models.py
@@ -0,0 +1,63 @@
 """
 YouTube Video Task Models
 Database models for persistent tracking of YouTube video render,
 combine, and publish tasks. Replaces the in-memory dict approach
 so tasks survive server restarts.
 """
 import enum
 from datetime import datetime, timezone
 from sqlalchemy import Column, Integer, String, DateTime, JSON, Text, Float, Enum, Index
 from models.subscription_models import Base
 class YouTubeTaskType(enum.Enum):
    RENDER = "render"
    SCENE_RENDER = "scene_render"
    COMBINE = "combine"
    PUBLISH = "publish"
    IMAGE_GENERATION = "image_generation"
    AUDIO_GENERATION = "audio_generation"
 class YouTubeTaskStatus(enum.Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"
 class YouTubeVideoTask(Base):
    """
    Persistent task tracking for YouTube Creator operations.
    Stores task state in PostgreSQL so that in-progress renders,
    combines, and publishes survive server restarts. The frontend
    can resume polling after a restart and recover results.
    """
    __tablename__ = "youtube_video_tasks"
    id = Column(Integer, primary_key=True, autoincrement=True)
    task_id = Column(String(36), unique=True, nullable=False, index=True)
    user_id = Column(String(255), nullable=False, index=True)
    task_type = Column(Enum(YouTubeTaskType), nullable=False, default=YouTubeTaskType.RENDER)
    status = Column(Enum(YouTubeTaskStatus), nullable=False, default=YouTubeTaskStatus.PENDING)
    progress = Column(Float, default=0.0)
    message = Column(String(500), nullable=True)
    request_data = Column(JSON, nullable=True)
    result = Column(JSON, nullable=True)
    error = Column(Text, nullable=True)
    created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), nullable=False)
    updated_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), onupdate=lambda: datetime.now(timezone.utc), nullable=False)
    completed_at = Column(DateTime, nullable=True)
    __table_args__ = (
        Index('idx_youtube_task_user_status', 'user_id', 'status'),
        Index('idx_youtube_task_user_type', 'user_id', 'task_type'),
        Index('idx_youtube_task_created', 'created_at'),
    )
--- a/backend/requirements-linkedin.txt
+++ b/backend/requirements-linkedin.txt
@@ -0,0 +1,74 @@
 # =====================================================
 # ALwrity LinkedIn-Only Requirements
 # Lean subset for linkedin-only demo mode
 # =====================================================
 # Core Web Server
 fastapi>=0.115.14
 starlette>=0.40.0,<0.47.0
 sse-starlette<3.0.0
 uvicorn>=0.24.0
 uvicorn[standard]>=0.24.0
 gunicorn>=21.0.0
 # Server utilities
 python-multipart>=0.0.6
 python-dotenv>=1.0.0
 loguru>=0.7.2
 tenacity>=8.2.3
 pydantic>=2.5.2,<3.0.0
 typing-extensions>=4.8.0
 setuptools>=65.0.0
 # Auth & Database
 fastapi-clerk-auth>=0.0.7
 PyJWT>=2.8.0
 cryptography>=41.0.0
 sqlalchemy>=2.0.25
 # Payment
 stripe>=8.0.0
 # HTTP clients
 httpx>=0.28.1
 aiohttp>=3.9.0
 requests>=2.31.0
 # AI - needed for content generation and image prompts
 openai>=1.3.0
 google-genai>=1.0.0
 exa-py==1.9.1
 # Text processing
 markdown>=3.5.0
 beautifulsoup4>=4.12.0
 # Data processing
 numpy>=1.24.0
 pandas>=2.0.0
 # Image processing - needed for LinkedIn image generation/editing
 Pillow>=10.0.0
 # Testing
 pytest>=7.4.0
 pytest-asyncio>=0.21.0
 # Task scheduling - needed for content calendar
 apscheduler>=3.10.0
 # Utilities
 redis>=5.0.0
 schedule>=1.2.0
 aiofiles>=23.2.0
 psutil>=5.9.0
 # Google APIs
 google-api-python-client>=2.100.0
 google-auth>=2.23.0
 google-auth-oauthlib>=1.0.0
 # Other utilities
 python-dateutil>=2.8.0
 jinja2>=3.1.0
 pydantic-settings>=2.0.0
--- a/backend/requirements.txt
+++ b/backend/requirements.txt
@@ -12,6 +12,8 @@ tenacity>=8.2.3
 pydantic>=2.5.2,<3.0.0
 typing-extensions>=4.8.0
 reportlab==4.5.1
 # Auth
 PyJWT>=2.8.0
 cryptography>=41.0.0
--- a/backend/routers/linkedin.py
+++ b/backend/routers/linkedin.py
@@ -7,9 +7,10 @@ proper error handling, monitoring, and documentation.
 """
 from fastapi import APIRouter, HTTPException, Depends, BackgroundTasks, Request
-from fastapi.responses import JSONResponse
+from fastapi.responses import JSONResponse, FileResponse
 from typing import Dict, Any, Optional
 import time
 import json
 from loguru import logger
 from pathlib import Path
@@ -17,11 +18,17 @@ from models.linkedin_models import (
    LinkedInPostRequest, LinkedInArticleRequest, LinkedInCarouselRequest,
    LinkedInVideoScriptRequest, LinkedInCommentResponseRequest,
    LinkedInPostResponse, LinkedInArticleResponse, LinkedInCarouselResponse,
-    LinkedInVideoScriptResponse, LinkedInCommentResponseResult
+    LinkedInVideoScriptResponse, LinkedInCommentResponseResult,
    LinkedInEditContentRequest, LinkedInEditContentResponse
 )
 from services.llm_providers.main_text_generation import llm_text_gen
 from services.linkedin_service import LinkedInService
 from services.linkedin.carousel import LinkedInCarouselPDFRenderer
 from middleware.auth_middleware import get_current_user
 from utils.text_asset_tracker import save_and_track_text_content
 from models.api_monitoring import APIRequest
 from sqlalchemy import func
 from collections import defaultdict
 # Initialize the LinkedIn service instance
 linkedin_service = LinkedInService()
@@ -29,6 +36,34 @@ from services.subscription.monitoring_middleware import DatabaseAPIMonitor
 from services.database import get_db as get_db_dependency
 from sqlalchemy.orm import Session
 # Simple in-memory rate limiter: {user_id: [timestamp, ...]}
 _rate_limit_store: Dict[str, list] = defaultdict(list)
 RATE_LIMIT_MAX_REQUESTS = 30
 RATE_LIMIT_WINDOW = 60  # seconds
 def check_rate_limit(user_id: str) -> Optional[int]:
    """Returns retry-after seconds if rate limited, None otherwise."""
    now = time.time()
    window_start = now - RATE_LIMIT_WINDOW
    timestamps = _rate_limit_store[user_id]
    # Prune old entries
    _rate_limit_store[user_id] = [t for t in timestamps if t > window_start]
    if len(_rate_limit_store[user_id]) >= RATE_LIMIT_MAX_REQUESTS:
        return int(_rate_limit_store[user_id][0] + RATE_LIMIT_WINDOW - now)
    _rate_limit_store[user_id].append(now)
    return None
 ERROR_CODES = {
    'VALIDATION': 'LINKEDIN_ERR_001',
    'GENERATION_FAILED': 'LINKEDIN_ERR_002',
    'RATE_LIMITED': 'LINKEDIN_ERR_003',
    'SAVE_FAILED': 'LINKEDIN_ERR_004',
    'NOT_FOUND': 'LINKEDIN_ERR_404',
 }
 def error_response(code: str, message: str) -> dict:
    return {"code": code, "message": message}
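Review note on `check_rate_limit` above: `int(...)` truncates, so with less than one second left in the window it returns 0, and callers that test `if retry_after:` would treat that as not limited. The following is a minimal sketch of the same sliding-window idea with the retry hint rounded up; passing `now` as a parameter and the small `MAX_REQUESTS` value are assumptions made for testability and are not part of the diff.

```python
import math
from collections import defaultdict
from typing import Dict, List, Optional

MAX_REQUESTS = 3        # illustrative limit; the diff uses 30
WINDOW_SECONDS = 60.0

_store: Dict[str, List[float]] = defaultdict(list)


def check_rate_limit(user_id: str, now: float) -> Optional[int]:
    """Return retry-after seconds (always >= 1) if limited, else None."""
    window_start = now - WINDOW_SECONDS
    # Keep only timestamps that are still inside the window.
    recent = [t for t in _store[user_id] if t > window_start]
    _store[user_id] = recent
    if len(recent) >= MAX_REQUESTS:
        # Round up so a sub-second remainder never yields 0.
        return max(1, math.ceil(recent[0] + WINDOW_SECONDS - now))
    recent.append(now)
    return None
```

In the endpoint, `time.time()` would be passed as `now`; comparing the result with `is not None` instead of truthiness also avoids the zero case.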
 # Initialize router
 router = APIRouter(
    prefix="/api/linkedin",
@@ -112,10 +147,10 @@ async def generate_post(
        # Validate request
        if not request.topic.strip():
-            raise HTTPException(status_code=422, detail="Topic cannot be empty")
+            raise HTTPException(status_code=422, detail=error_response(ERROR_CODES['VALIDATION'], "Topic cannot be empty"))
        if not request.industry.strip():
-            raise HTTPException(status_code=422, detail="Industry cannot be empty")
+            raise HTTPException(status_code=422, detail=error_response(ERROR_CODES['VALIDATION'], "Industry cannot be empty"))
        # Extract user_id
        user_id = None
@@ -124,22 +159,30 @@ async def generate_post(
        if not user_id:
            user_id = http_request.headers.get("X-User-ID") or http_request.headers.get("Authorization")
        # Rate limit check
        retry_after = check_rate_limit(user_id or 'anonymous')
        if retry_after:
            raise HTTPException(
                status_code=429,
                detail=error_response(ERROR_CODES['RATE_LIMITED'], f"Rate limit exceeded. Retry after {retry_after} seconds."),
                headers={"Retry-After": str(retry_after)}
            )
        # Generate post content
        response = await linkedin_service.generate_linkedin_post(request)
        if not response.success:
            raise HTTPException(status_code=500, detail=error_response(ERROR_CODES['GENERATION_FAILED'], response.error or "Post generation failed"))
        # Log successful request
        duration = time.time() - start_time
        background_tasks.add_task(
            log_api_request, http_request, db, duration, 200
        )
-        if not response.success:
-            raise HTTPException(status_code=500, detail=response.error)
        # Save and track text content (non-blocking)
        if user_id and response.data and response.data.content:
            try:
                # Combine all text content
                text_content = response.data.content
                if response.data.call_to_action:
                    text_content += f"\n\nCall to Action: {response.data.call_to_action}"
@@ -166,7 +209,7 @@ async def generate_post(
                    subdirectory="posts"
                )
            except Exception as track_error:
-                logger.warning(f"Failed to track LinkedIn post asset: {track_error}")
+                logger.error(f"Failed to track LinkedIn post asset: {track_error}")
        logger.info(f"Successfully generated LinkedIn post in {duration:.2f} seconds")
        return response
@@ -177,14 +220,13 @@ async def generate_post(
        duration = time.time() - start_time
        logger.error(f"Error generating LinkedIn post: {str(e)}")
        # Log failed request
        background_tasks.add_task(
            log_api_request, http_request, db, duration, 500
        )
        raise HTTPException(
            status_code=500,
-            detail=f"Failed to generate LinkedIn post: {str(e)}"
+            detail=error_response(ERROR_CODES['GENERATION_FAILED'], f"Failed to generate LinkedIn post: {str(e)}")
        )
@@ -222,10 +264,10 @@ async def generate_article(
        # Validate request
        if not request.topic.strip():
-            raise HTTPException(status_code=422, detail="Topic cannot be empty")
+            raise HTTPException(status_code=422, detail=error_response(ERROR_CODES['VALIDATION'], "Topic cannot be empty"))
        if not request.industry.strip():
-            raise HTTPException(status_code=422, detail="Industry cannot be empty")
+            raise HTTPException(status_code=422, detail=error_response(ERROR_CODES['VALIDATION'], "Industry cannot be empty"))
        # Extract user_id
        user_id = None
@@ -234,17 +276,16 @@ async def generate_article(
        if not user_id:
            user_id = http_request.headers.get("X-User-ID") or http_request.headers.get("Authorization")
        # Rate limit check
        retry_after = check_rate_limit(user_id or 'anonymous')
        if retry_after:
            raise HTTPException(status_code=429, detail=error_response(ERROR_CODES['RATE_LIMITED'], f"Rate limit exceeded. Retry after {retry_after} seconds."), headers={"Retry-After": str(retry_after)})
        # Generate article content
        response = await linkedin_service.generate_linkedin_article(request)
        # Log successful request
        duration = time.time() - start_time
        background_tasks.add_task(
            log_api_request, http_request, db, duration, 200
        )
        if not response.success:
-            raise HTTPException(status_code=500, detail=response.error)
+            raise HTTPException(status_code=500, detail=error_response(ERROR_CODES['GENERATION_FAILED'], response.error or "Article generation failed"))
        # Save and track text content (non-blocking)
        if user_id and response.data:
@@ -282,7 +323,7 @@ async def generate_article(
                    file_extension=".md"
                )
            except Exception as track_error:
-                logger.warning(f"Failed to track LinkedIn article asset: {track_error}")
+                logger.error(f"Failed to track LinkedIn article asset: {track_error}")
        logger.info(f"Successfully generated LinkedIn article in {duration:.2f} seconds")
        return response
@@ -300,7 +341,7 @@ async def generate_article(
        raise HTTPException(
            status_code=500,
-            detail=f"Failed to generate LinkedIn article: {str(e)}"
+            detail=error_response(ERROR_CODES['GENERATION_FAILED'], f"Failed to generate LinkedIn article: {str(e)}")
        )
@@ -337,13 +378,13 @@ async def generate_carousel(
        # Validate request
        if not request.topic.strip():
-            raise HTTPException(status_code=422, detail="Topic cannot be empty")
+            raise HTTPException(status_code=422, detail=error_response(ERROR_CODES['VALIDATION'], "Topic cannot be empty"))
        if not request.industry.strip():
-            raise HTTPException(status_code=422, detail="Industry cannot be empty")
+            raise HTTPException(status_code=422, detail=error_response(ERROR_CODES['VALIDATION'], "Industry cannot be empty"))
-        if request.slide_count < 3 or request.slide_count > 15:
+        if request.number_of_slides < 3 or request.number_of_slides > 15:
-            raise HTTPException(status_code=422, detail="Slide count must be between 3 and 15")
+            raise HTTPException(status_code=422, detail=error_response(ERROR_CODES['VALIDATION'], "Number of slides must be between 3 and 15"))
        # Extract user_id
        user_id = None
@@ -352,18 +393,23 @@ async def generate_carousel(
        if not user_id:
            user_id = http_request.headers.get("X-User-ID") or http_request.headers.get("Authorization")
        # Rate limit check
        retry_after = check_rate_limit(user_id or 'anonymous')
        if retry_after:
            raise HTTPException(status_code=429, detail=error_response(ERROR_CODES['RATE_LIMITED'], f"Rate limit exceeded. Retry after {retry_after} seconds."), headers={"Retry-After": str(retry_after)})
        # Generate carousel content
        response = await linkedin_service.generate_linkedin_carousel(request)
        if not response.success:
            raise HTTPException(status_code=500, detail=error_response(ERROR_CODES['GENERATION_FAILED'], response.error or "Carousel generation failed"))
        # Log successful request
        duration = time.time() - start_time
        background_tasks.add_task(
            log_api_request, http_request, db, duration, 200
        )
-        if not response.success:
-            raise HTTPException(status_code=500, detail=response.error)
        # Save and track text content (non-blocking)
        if user_id and response.data:
            try:
@@ -381,10 +427,10 @@ async def generate_carousel(
                    source_module="linkedin_writer",
                    title=f"LinkedIn Carousel: {response.data.title[:80] if response.data.title else request.topic[:80]}",
                    description=f"LinkedIn carousel for {request.industry} industry",
-                    prompt=f"Topic: {request.topic}\nIndustry: {request.industry}\nSlides: {getattr(request, 'number_of_slides', request.slide_count if hasattr(request, 'slide_count') else 5)}",
+                    prompt=f"Topic: {request.topic}\nIndustry: {request.industry}\nSlides: {request.number_of_slides}",
                    tags=["linkedin", "carousel", request.industry.lower().replace(' ', '_')],
                    asset_metadata={
-                        "slide_count": len(response.data.slides),
+                        "number_of_slides": len(response.data.slides),
                        "has_cover": response.data.cover_slide is not None,
                        "has_cta": response.data.cta_slide is not None
                    },
@@ -392,7 +438,7 @@ async def generate_carousel(
                    file_extension=".md"
                )
            except Exception as track_error:
-                logger.warning(f"Failed to track LinkedIn carousel asset: {track_error}")
+                logger.error(f"Failed to track LinkedIn carousel asset: {track_error}")
        logger.info(f"Successfully generated LinkedIn carousel in {duration:.2f} seconds")
        return response
@@ -410,10 +456,82 @@ async def generate_carousel(
        raise HTTPException(
            status_code=500,
-            detail=f"Failed to generate LinkedIn carousel: {str(e)}"
+            detail=error_response(ERROR_CODES['GENERATION_FAILED'], f"Failed to generate LinkedIn carousel: {str(e)}")
        )
@router.post(
    "/generate-carousel-pdf",
    summary="Render Carousel as PDF",
    description="""
    Render previously generated LinkedIn carousel content as a PDF document.
    Takes carousel content (slides with title, content, visual_elements) and
    renders them into visually appealing slide images composed into a PDF
    ready for LinkedIn upload (1.91:1 aspect ratio, max 300 slides, max 100MB).
    """
 )
 async def generate_carousel_pdf(
    request: LinkedInCarouselRequest,
    background_tasks: BackgroundTasks,
    http_request: Request,
    db: Session = Depends(get_db),
    current_user: Optional[Dict[str, Any]] = Depends(get_current_user)
 ):
    """Generate carousel content and render as PDF."""
    start_time = time.time()
    try:
        user_id = None
        if current_user:
            user_id = str(current_user.get('id', '') or current_user.get('sub', ''))
        if not user_id:
            user_id = http_request.headers.get("X-User-ID") or http_request.headers.get("Authorization")
        # First generate carousel content
        content_result = await linkedin_service.generate_linkedin_carousel(request)
        if not content_result.success or not content_result.data:
            raise HTTPException(status_code=500, detail=error_response(ERROR_CODES['GENERATION_FAILED'], content_result.error or "Carousel generation failed"))
        carousel_data = content_result.data.model_dump()
        # Then render to PDF
        renderer = LinkedInCarouselPDFRenderer()
        pdf_result = await renderer.render_carousel_to_pdf(
            carousel_data=carousel_data,
            color_scheme=request.color_scheme,
            user_id=user_id,
        )
        if not pdf_result.get('success'):
            raise HTTPException(status_code=500, detail=error_response(ERROR_CODES['GENERATION_FAILED'], pdf_result.get('error') or 'PDF rendering failed'))
        duration = time.time() - start_time
        background_tasks.add_task(log_api_request, http_request, db, duration, 200)
        pdf_path = pdf_result.get('pdf_path')
        if pdf_path:
            return FileResponse(
                path=pdf_path,
                media_type="application/pdf",
                filename=f"linkedin_carousel_{request.topic[:30].replace(' ', '_')}.pdf"
            )
        return JSONResponse(content={
            'success': True,
            'pdf_bytes': pdf_result.get('pdf_bytes'),
            'metadata': pdf_result.get('metadata'),
        })
    except HTTPException:
        raise
    except Exception as e:
        duration = time.time() - start_time
        logger.error(f"Error generating carousel PDF: {str(e)}")
        raise HTTPException(status_code=500, detail=error_response(ERROR_CODES['GENERATION_FAILED'], f"Failed to generate carousel PDF: {str(e)}"))
@router.post(
    "/generate-video-script",
    response_model=LinkedInVideoScriptResponse,
@@ -447,14 +565,14 @@ async def generate_video_script(
        # Validate request
        if not request.topic.strip():
-            raise HTTPException(status_code=422, detail="Topic cannot be empty")
+            raise HTTPException(status_code=422, detail=error_response(ERROR_CODES['VALIDATION'], "Topic cannot be empty"))
        if not request.industry.strip():
-            raise HTTPException(status_code=422, detail="Industry cannot be empty")
+            raise HTTPException(status_code=422, detail=error_response(ERROR_CODES['VALIDATION'], "Industry cannot be empty"))
        video_duration = getattr(request, 'video_duration', getattr(request, 'video_length', 60))
        if video_duration < 15 or video_duration > 300:
-            raise HTTPException(status_code=422, detail="Video length must be between 15 and 300 seconds")
+            raise HTTPException(status_code=422, detail=error_response(ERROR_CODES['VALIDATION'], "Video length must be between 15 and 300 seconds"))
        # Extract user_id
        user_id = None
@@ -463,18 +581,23 @@ async def generate_video_script(
        if not user_id:
            user_id = http_request.headers.get("X-User-ID") or http_request.headers.get("Authorization")
        # Rate limit check
        retry_after = check_rate_limit(user_id or 'anonymous')
        if retry_after:
            raise HTTPException(status_code=429, detail=error_response(ERROR_CODES['RATE_LIMITED'], f"Rate limit exceeded. Retry after {retry_after} seconds."), headers={"Retry-After": str(retry_after)})
        # Generate video script content
        response = await linkedin_service.generate_linkedin_video_script(request)
        if not response.success:
            raise HTTPException(status_code=500, detail=error_response(ERROR_CODES['GENERATION_FAILED'], response.error or "Video script generation failed"))
        # Log successful request
        duration = time.time() - start_time
        background_tasks.add_task(
            log_api_request, http_request, db, duration, 200
        )
-        if not response.success:
-            raise HTTPException(status_code=500, detail=response.error)
        # Save and track text content (non-blocking)
        if user_id and response.data:
            try:
@@ -514,7 +637,7 @@ async def generate_video_script(
                    file_extension=".md"
                )
            except Exception as track_error:
-                logger.warning(f"Failed to track LinkedIn video script asset: {track_error}")
+                logger.error(f"Failed to track LinkedIn video script asset: {track_error}")
        logger.info(f"Successfully generated LinkedIn video script in {duration:.2f} seconds")
        return response
@@ -532,7 +655,7 @@ async def generate_video_script(
        raise HTTPException(
            status_code=500,
-            detail=f"Failed to generate LinkedIn video script: {str(e)}"
+            detail=error_response(ERROR_CODES['GENERATION_FAILED'], f"Failed to generate LinkedIn video script: {str(e)}")
        )
@@ -572,10 +695,10 @@ async def generate_comment_response(
        post_context = getattr(request, 'post_context', getattr(request, 'original_post', ''))
        if not original_comment.strip():
-            raise HTTPException(status_code=422, detail="Original comment cannot be empty")
+            raise HTTPException(status_code=422, detail=error_response(ERROR_CODES['VALIDATION'], "Original comment cannot be empty"))
        if not post_context.strip():
-            raise HTTPException(status_code=422, detail="Post context cannot be empty")
+            raise HTTPException(status_code=422, detail=error_response(ERROR_CODES['VALIDATION'], "Post context cannot be empty"))
        # Extract user_id
        user_id = None
@@ -584,18 +707,23 @@ async def generate_comment_response(
        if not user_id:
            user_id = http_request.headers.get("X-User-ID") or http_request.headers.get("Authorization")
        # Rate limit check
        retry_after = check_rate_limit(user_id or 'anonymous')
        if retry_after:
            raise HTTPException(status_code=429, detail=error_response(ERROR_CODES['RATE_LIMITED'], f"Rate limit exceeded. Retry after {retry_after} seconds."), headers={"Retry-After": str(retry_after)})
        # Generate comment response
        response = await linkedin_service.generate_linkedin_comment_response(request)
        if not response.success:
            raise HTTPException(status_code=500, detail=error_response(ERROR_CODES['GENERATION_FAILED'], response.error or "Comment response generation failed"))
        # Log successful request
        duration = time.time() - start_time
        background_tasks.add_task(
            log_api_request, http_request, db, duration, 200
        )
-        if not response.success:
-            raise HTTPException(status_code=500, detail=response.error)
        # Save and track text content (non-blocking)
        if user_id and hasattr(response, 'response') and response.response:
            try:
@@ -626,7 +754,7 @@ async def generate_comment_response(
                    file_extension=".md"
                )
            except Exception as track_error:
-                logger.warning(f"Failed to track LinkedIn comment response asset: {track_error}")
+                logger.error(f"Failed to track LinkedIn comment response asset: {track_error}")
        logger.info(f"Successfully generated LinkedIn comment response in {duration:.2f} seconds")
        return response
@@ -644,7 +772,7 @@ async def generate_comment_response(
        raise HTTPException(
            status_code=500,
-            detail=f"Failed to generate LinkedIn comment response: {str(e)}"
+            detail=error_response(ERROR_CODES['GENERATION_FAILED'], f"Failed to generate LinkedIn comment response: {str(e)}")
        )
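Review note: the carousel, video-script, and comment-response endpoints above all call `check_rate_limit(key)` and treat a truthy return value as the number of seconds to wait, which they forward in the `Retry-After` header. The helper itself is not part of this diff. The block below is only a minimal in-memory sliding-window sketch of a function with that contract; `WINDOW_SECONDS`, `MAX_REQUESTS`, and the `now` parameter are assumptions made for illustration, not values taken from the codebase.

```python
import math
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional

# Assumed limits; the real values are not shown in this diff.
WINDOW_SECONDS = 60
MAX_REQUESTS = 10

# Per-key timestamps of accepted requests, oldest first.
_hits: Dict[str, Deque[float]] = defaultdict(deque)


def check_rate_limit(key: str, now: Optional[float] = None) -> Optional[int]:
    """Return None if the call is allowed, else whole seconds until a slot frees up."""
    now = time.monotonic() if now is None else now
    window = _hits[key]
    # Drop timestamps that have left the window.
    while window and now - window[0] >= WINDOW_SECONDS:
        window.popleft()
    if len(window) >= MAX_REQUESTS:
        # The oldest accepted hit leaves the window at window[0] + WINDOW_SECONDS.
        return max(1, math.ceil(window[0] + WINDOW_SECONDS - now))
    window.append(now)
    return None
```

Because this sketch keeps state in process memory, it would reset on restart and would not be shared across workers; a shared store would be needed for multi-process deployments.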
@@ -691,6 +819,128 @@ async def get_content_types():
    }
@router.post(
    "/edit-content",
    response_model=LinkedInEditContentResponse,
    summary="Edit LinkedIn Content with AI",
    description="""
    Apply AI-powered edits to LinkedIn content.
    Supported edit types:
    - professionalize: Rewrite content with professional business language
    - optimize_engagement: Optimize hook and structure for maximum engagement
    - add_hashtags: Generate relevant, industry-specific hashtags
    - adjust_tone: Rewrite content in a different tone (professional, conversational, authoritative, etc.)
    - expand: Add depth, examples, and insights to content
    - condense: Shorten content while preserving key messages
    - add_cta: Generate a contextual call-to-action
    """
 )
 async def edit_linkedin_content(
    request: LinkedInEditContentRequest,
    current_user: Optional[Dict[str, Any]] = Depends(get_current_user)
 ):
    """Edit LinkedIn content using AI-powered text generation."""
    try:
        # Extract user_id for subscription checking
        user_id = None
        if current_user:
            user_id = str(current_user.get('id', '') or current_user.get('sub', ''))
        if not request.content.strip():
            return LinkedInEditContentResponse(
                success=False, error="Content cannot be empty", edit_type=request.edit_type
            )
        # Build the system prompt based on edit type
        system_prompts = {
            "professionalize": "You are a professional business writer. Rewrite the following LinkedIn content to be more professional, polished, and industry-appropriate. Maintain the original message but use sophisticated business language, improve sentence structure, and ensure a confident executive presence.",
            "optimize_engagement": "You are a LinkedIn engagement strategist. Rewrite the following content to maximize engagement. Strengthen the hook in the first 2 lines, add thought-provoking elements, improve readability with shorter sentences, and ensure the content encourages comments and shares.",
            "add_hashtags": "You are a LinkedIn hashtag strategist. Generate 5 highly relevant, industry-specific hashtags for the following content. Return the original content unchanged, followed by two newlines and the hashtags on a single line.",
            "adjust_tone": "You are a LinkedIn tone specialist. Rewrite the following content in the specified tone while preserving all key information and the overall message.",
            "expand": "You are a LinkedIn content strategist. Expand the following content by adding relevant examples, data points, actionable insights, and deeper analysis. Maintain the original structure but add substantial value while keeping it LinkedIn-appropriate (under 3000 characters).",
            "condense": "You are a LinkedIn editing specialist. Condense the following content to be more concise and impactful. Remove filler words, tighten sentences, and preserve only the strongest points. Keep the core message intact.",
            "add_cta": "You are a LinkedIn conversion strategist. Add a compelling, contextual call-to-action to the following content. The CTA should feel natural, not salesy, and should encourage meaningful engagement (comments, connections, or discussions)."
        }
        system_prompt = system_prompts.get(request.edit_type)
        if not system_prompt:
            return LinkedInEditContentResponse(
                success=False, error=f"Unknown edit type: {request.edit_type}", edit_type=request.edit_type
            )
        # Build the user prompt with context
        user_prompt = f"Content to edit:\n\n{request.content}\n\n"
        if request.industry:
            user_prompt += f"Industry: {request.industry}\n"
        if request.tone:
            user_prompt += f"Target tone: {request.tone}\n"
        if request.target_audience:
            user_prompt += f"Target audience: {request.target_audience}\n"
        if request.parameters:
            user_prompt += f"Additional context: {json.dumps(request.parameters)}\n"
        user_prompt += "\nReturn ONLY the edited content without any explanations, labels, or markdown formatting."
        # Generate edited content using provider-agnostic gateway
        temperature = {
            "professionalize": 0.3,
            "optimize_engagement": 0.7,
            "add_hashtags": 0.4,
            "adjust_tone": 0.5,
            "expand": 0.7,
            "condense": 0.3,
            "add_cta": 0.6,
        }.get(request.edit_type, 0.5)
        max_tokens = {
            "expand": 2048,
            "professionalize": 1024,
            "optimize_engagement": 1024,
            "adjust_tone": 1024,
            "condense": 1024,
            "add_cta": 1024,
            "add_hashtags": 512,
        }.get(request.edit_type, 1024)
        edited = llm_text_gen(
            prompt=user_prompt,
            system_prompt=system_prompt,
            user_id=user_id,
            flow_type=f"linkedin_edit_{request.edit_type}",
            max_tokens=max_tokens,
            temperature=temperature
        )
        if not edited:
            return LinkedInEditContentResponse(
                success=False, error="AI editing returned empty result", edit_type=request.edit_type
            )
        edited = edited.strip()
        # No post-processing is applied for add_hashtags yet: the system prompt
        # already asks for the hashtags on their own line after the content.
        logger.info(f"LinkedIn content edited successfully via {request.edit_type}")
        return LinkedInEditContentResponse(
            success=True,
            content=edited,
            edit_type=request.edit_type,
            provider="llm_text_gen",
            model="provider-agnostic"
        )
    except Exception as e:
        logger.error(f"Error editing LinkedIn content: {str(e)}", exc_info=True)
        return LinkedInEditContentResponse(
            success=False, error=f"Editing failed: {str(e)}", edit_type=request.edit_type
        )
@router.get(
    "/usage-stats",
    summary="Get Usage Statistics",
@@ -699,30 +949,29 @@ async def get_content_types():
 async def get_usage_stats(db: Session = Depends(get_db)):
    """Get usage statistics for LinkedIn content generation."""
    try:
-        # This would query the database for actual usage stats
+        base = db.query(APIRequest).filter(APIRequest.path.like('/api/linkedin/%'))
-        # For now, returning mock data
+        total = base.count()
        successful = base.filter(APIRequest.status_code < 400).count()
        avg_dur = base.with_entities(func.avg(APIRequest.duration)).scalar() or 0
        content_types = {
            "posts": base.filter(APIRequest.path.like('%generate-post')).count(),
            "articles": base.filter(APIRequest.path.like('%generate-article')).count(),
            "carousels": base.filter(APIRequest.path.like('%generate-carousel')).count(),
            "video_scripts": base.filter(APIRequest.path.like('%generate-video-script')).count(),
            "comment_responses": base.filter(APIRequest.path.like('%generate-comment-response')).count(),
        }
        return {
-            "total_requests": 1250,
+            "total_requests": total,
-            "content_types": {
+            "content_types": content_types,
-                "posts": 650,
+            "success_rate": round(successful / max(total, 1), 2),
-                "articles": 320,
+            "average_generation_time": round(float(avg_dur), 2),
-                "carousels": 180,
-                "video_scripts": 70,
-                "comment_responses": 30
-            },
-            "success_rate": 0.96,
-            "average_generation_time": 4.2,
            "top_industries": [
                "Technology",
                "Healthcare",
                "Finance",
                "Marketing",
                "Education"
            ]
        }
    except Exception as e:
        logger.error(f"Error retrieving usage stats: {str(e)}")
        raise HTTPException(
            status_code=500,
-            detail="Failed to retrieve usage statistics"
+            detail=error_response(ERROR_CODES['GENERATION_FAILED'], "Failed to retrieve usage statistics")
        )
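Review note: the rewritten `get_usage_stats` derives its numbers from logged `APIRequest` rows, and `max(total, 1)` keeps the success rate defined when no rows exist. The sketch below reproduces the same arithmetic over plain tuples so it can be checked without a database. `RequestRow` and `summarize` are hypothetical names used only here, and the sketch assumes the stored path carries no query string.

```python
from typing import Dict, List, NamedTuple


class RequestRow(NamedTuple):  # hypothetical stand-in for an APIRequest row
    path: str
    status_code: int
    duration: float


# Bucket name -> path suffix, mirroring the LIKE '%suffix' filters in the diff.
SUFFIXES = {
    "posts": "generate-post",
    "articles": "generate-article",
    "carousels": "generate-carousel",
    "video_scripts": "generate-video-script",
    "comment_responses": "generate-comment-response",
}


def summarize(rows: List[RequestRow]) -> Dict[str, object]:
    """Aggregate LinkedIn request rows the way the endpoint's queries do."""
    linkedin = [r for r in rows if r.path.startswith("/api/linkedin/")]
    total = len(linkedin)
    successful = sum(1 for r in linkedin if r.status_code < 400)
    avg = sum(r.duration for r in linkedin) / total if total else 0.0
    return {
        "total_requests": total,
        "content_types": {
            name: sum(1 for r in linkedin if r.path.endswith(suffix))
            for name, suffix in SUFFIXES.items()
        },
        # max(total, 1) avoids division by zero on an empty table.
        "success_rate": round(successful / max(total, 1), 2),
        "average_generation_time": round(avg, 2),
    }
```

One consequence of suffix matching: a path ending in `generate-carousel-pdf` counts toward `total_requests` but falls into none of the `content_types` buckets.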
--- a/backend/routers/seo_tools.py
+++ b/backend/routers/seo_tools.py
@@ -30,6 +30,7 @@ from services.seo_tools.on_page_seo_service import OnPageSEOService
 from services.seo_tools.technical_seo_service import TechnicalSEOService
 from services.seo_tools.enterprise_seo_service import EnterpriseSEOService
 from services.seo_tools.gsc_analyzer_service import GSCAnalyzerService
 from services.seo_tools.gsc_strategy_insights_service import GSCStrategyInsightsService
 from services.seo_tools.content_strategy_service import ContentStrategyService
 from services.seo_tools.llm_insights_service import LLMInsightsService
 from services.database import get_session_for_user
@@ -199,6 +200,34 @@ class KeywordExpansionRequest(BaseModel):
    content_analysis: Dict[str, Any] = Field(..., description="Content analysis data")
    target_difficulty: Optional[str] = Field(None, description="Target difficulty (low/medium/high)")
 # ==================== GSC STRATEGY INSIGHTS REQUEST MODELS ====================
 class GSCStrategyInsightsRequest(BaseModel):
    """Request model for GSC strategy insights (dashboard context)"""
    site_url: HttpUrl = Field(..., description="Website URL registered in GSC")
    include_trends: bool = Field(default=True, description="Include trend analysis")
    include_competitive: bool = Field(default=False, description="Include competitive analysis (Phase 2)")
    top_n: int = Field(default=20, ge=5, le=100, description="Number of top opportunities to return")
 class GSCOpportunityRankingRequest(BaseModel):
    """Request model for ROI-ranked opportunities"""
    site_url: HttpUrl = Field(..., description="Website URL registered in GSC")
    ranking_metric: str = Field(default="roi_score", description="Metric to rank by (roi_score/effort/impact/timeline)")
    severity_filter: Optional[str] = Field(None, description="Filter by severity (critical/high/medium/low/watch)")
    limit: int = Field(default=20, ge=5, le=100, description="Number of opportunities to return")
 class GSCTrendAnalysisRequest(BaseModel):
    """Request model for performance trend analysis"""
    site_url: HttpUrl = Field(..., description="Website URL registered in GSC")
    metric: str = Field(default="all", description="Metric to analyze (position/impressions/clicks/ctr/all)")
    days_back: int = Field(default=90, ge=7, le=365, description="Days of historical data to analyze")
 class GSCHealthMetricsRequest(BaseModel):
    """Request model for health metrics calculation"""
    site_url: HttpUrl = Field(..., description="Website URL registered in GSC")
    include_distribution: bool = Field(default=True, description="Include keyword distribution breakdown")
    include_trends: bool = Field(default=True, description="Include trend comparison")
 # Exception Handler
 async def handle_seo_tool_exception(func_name: str, error: Exception, request_data: Dict) -> ErrorResponse:
    """Handle exceptions from SEO tools with intelligent logging"""
@@ -1102,6 +1131,236 @@ async def get_content_opportunities_report(
        return await handle_seo_tool_exception("get_content_opportunities_report", e, request.dict())
 # ==================== GSC STRATEGY INSIGHTS ENDPOINTS (Dashboard-Focused) ====================
@router.post("/gsc/strategy-insights", response_model=BaseResponse)
@log_api_call
 async def get_gsc_strategy_insights(
    request: GSCStrategyInsightsRequest,
    current_user: dict = Depends(get_current_user)
 ) -> Union[BaseResponse, ErrorResponse]:
    """
    Get comprehensive strategy insights from GSC data for SEO Dashboard.
    Provides strategic insights optimized for dashboard display:
    - Ranked opportunities by ROI score (0-100)
    - Health metrics with trend comparison
    - Quick summary of key insights
    - Optional: Performance trends and competitive positioning
    ROI Scoring Formula:
    ROI = 0.40×traffic_impact + 0.30×ease + 0.20×competitive + 0.10×momentum
    Severity Levels:
    - CRITICAL: 80-100 (immediate action)
    - HIGH: 60-79 (high priority)
    - MEDIUM: 40-59 (medium priority)
    - LOW: 20-39 (low priority)
    - WATCH: <20 (monitoring)
    """
    start_time = datetime.utcnow()
    try:
        user_id = str(current_user.get("id")) if current_user else None
        service = GSCStrategyInsightsService()
        insights = await service.get_dashboard_strategy(
            user_id=user_id,
            site_url=str(request.site_url),
            include_trends=request.include_trends,
            include_competitive=request.include_competitive,
            top_n=request.top_n
        )
        execution_time = (datetime.utcnow() - start_time).total_seconds()
        return BaseResponse(
            success=True,
            message="GSC strategy insights generated successfully",
            execution_time=execution_time,
            data=insights
        )
    except Exception as e:
        logger.error(f"GSC strategy insights failed: {str(e)}", exc_info=True)
        return await handle_seo_tool_exception("get_gsc_strategy_insights", e, request.dict())
@router.post("/gsc/opportunity-ranking", response_model=BaseResponse)
@log_api_call
 async def get_ranked_opportunities(
    request: GSCOpportunityRankingRequest,
    current_user: dict = Depends(get_current_user)
 ) -> Union[BaseResponse, ErrorResponse]:
    """
    Get ROI-ranked opportunities from GSC data.
    Returns opportunities sorted by specified metric:
    - roi_score: ROI-weighted score (recommended)
    - effort: Easiest to implement first
    - impact: Highest traffic impact first
    - timeline: Fastest results first
    Optional filtering by severity level:
    - critical: 80-100 ROI (immediate action required)
    - high: 60-79 ROI (high priority)
    - medium: 40-59 ROI (medium priority)
    - low: 20-39 ROI (low priority)
    - watch: <20 ROI (monitoring)
    Each opportunity includes:
    - ROI score and severity level
    - Implementation effort (hours)
    - Timeline to impact (weeks)
    - Recommendations
    - Related keywords
    """
    start_time = datetime.utcnow()
    try:
        user_id = str(current_user.get("id")) if current_user else None
        service = GSCStrategyInsightsService()
        opportunities = await service._get_ranked_opportunities(
            site_url=str(request.site_url),
            top_n=request.limit
        )
        # Filter by severity if specified
        if request.severity_filter and opportunities.get('status') == 'success':
            filtered = [
                opp for opp in opportunities.get('opportunities', [])
                if opp.get('severity') == request.severity_filter
            ]
            opportunities['opportunities'] = filtered
        # Sort by metric
        if opportunities.get('status') == 'success' and request.ranking_metric != 'roi_score':
            opps = opportunities.get('opportunities', [])
            if request.ranking_metric == 'effort':
                opps.sort(key=lambda x: x.get('effort_hours', 0))
            elif request.ranking_metric == 'impact':
                opps.sort(key=lambda x: x.get('estimated_impact', 0), reverse=True)
            elif request.ranking_metric == 'timeline':
                opps.sort(key=lambda x: x.get('timeline_weeks', 0))
            opportunities['opportunities'] = opps
        execution_time = (datetime.utcnow() - start_time).total_seconds()
        return BaseResponse(
            success=True,
            message="Ranked opportunities retrieved successfully",
            execution_time=execution_time,
            data=opportunities
        )
    except Exception as e:
        logger.error(f"Ranked opportunities failed: {str(e)}", exc_info=True)
        return await handle_seo_tool_exception("get_ranked_opportunities", e, request.dict())
@router.post("/gsc/health-metrics", response_model=BaseResponse)
@log_api_call
 async def get_health_metrics(
    request: GSCHealthMetricsRequest,
    current_user: dict = Depends(get_current_user)
 ) -> Union[BaseResponse, ErrorResponse]:
    """
    Get comprehensive health metrics for SEO Dashboard.
    Returns overall SEO health with:
    - Health score (0-100)
    - Health trend (up/down/stable)
    - Keyword position distribution
    - Average metrics (position, CTR, etc.)
    - Optional: Trend comparison against the previous period
    Health Score Calculation:
    Score = 0.60×(Page1_Keywords%) + 0.30×CTR_vs_Benchmark + 0.10×Growth_Rate
    Interpretation:
    - 80-100: Excellent SEO health
    - 60-79: Good SEO health
    - 40-59: Needs improvement
    - 0-39: Critical issues
    """
    start_time = datetime.utcnow()
    try:
        user_id = str(current_user.get("id")) if current_user else None
        service = GSCStrategyInsightsService()
        metrics = await service._calculate_health_metrics(
            site_url=str(request.site_url)
        )
        execution_time = (datetime.utcnow() - start_time).total_seconds()
        return BaseResponse(
            success=True,
            message="Health metrics calculated successfully",
            execution_time=execution_time,
            data=metrics
        )
    except Exception as e:
        logger.error(f"Health metrics calculation failed: {str(e)}", exc_info=True)
        return await handle_seo_tool_exception("get_health_metrics", e, request.dict())
@router.post("/gsc/trend-analysis", response_model=BaseResponse)
@log_api_call
 async def analyze_gsc_trends(
    request: GSCTrendAnalysisRequest,
    current_user: dict = Depends(get_current_user)
 ) -> Union[BaseResponse, ErrorResponse]:
    """
    Analyze performance trends from GSC data.
    Returns trend analysis for specified metrics:
    - position: Ranking trend for keywords
    - impressions: Search volume trends
    - clicks: Click trend
    - ctr: Click-through rate trend
    - all: All metrics combined
    For each metric includes:
    - Current value
    - Value from 30/90 days ago
    - Trend direction (up/down/stable)
    - Trend percentage change
    - Momentum (acceleration of trend)
    - Seasonal patterns
    - Anomalies detected
    Note: This feature requires historical data collection.
    Phase 1: Manual trend calculation from snapshots.
    Phase 2: Automated historical tracking.
    """
    start_time = datetime.utcnow()
    try:
        user_id = str(current_user.get("id")) if current_user else None
        service = GSCStrategyInsightsService()
        trends = await service._analyze_performance_trends(
            site_url=str(request.site_url)
        )
        execution_time = (datetime.utcnow() - start_time).total_seconds()
        return BaseResponse(
            success=True,
            message="Trend analysis completed",
            execution_time=execution_time,
            data=trends
        )
    except Exception as e:
        logger.error(f"Trend analysis failed: {str(e)}", exc_info=True)
        return await handle_seo_tool_exception("analyze_gsc_trends", e, request.dict())
@router.get("/enterprise/health", response_model=BaseResponse)
@log_api_call
 async def check_enterprise_services_health() -> BaseResponse:
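Review note: the docstrings of the GSC endpoints above state two weighted formulas (ROI score and health score) and a severity banding. `GSCStrategyInsightsService` is not shown in this diff, so the following is only a sketch of that arithmetic as written in the docstrings. It assumes every input is already normalized to a 0-100 scale; the function names are hypothetical.

```python
def roi_score(traffic_impact: float, ease: float, competitive: float, momentum: float) -> float:
    """ROI = 0.40*traffic_impact + 0.30*ease + 0.20*competitive + 0.10*momentum."""
    return 0.40 * traffic_impact + 0.30 * ease + 0.20 * competitive + 0.10 * momentum


def severity(score: float) -> str:
    """Map an ROI score to the severity labels accepted by severity_filter."""
    if score >= 80:
        return "critical"
    if score >= 60:
        return "high"
    if score >= 40:
        return "medium"
    if score >= 20:
        return "low"
    return "watch"


def health_score(page1_keywords_pct: float, ctr_vs_benchmark: float, growth_rate: float) -> float:
    """Score = 0.60*Page1_Keywords% + 0.30*CTR_vs_Benchmark + 0.10*Growth_Rate."""
    return 0.60 * page1_keywords_pct + 0.30 * ctr_vs_benchmark + 0.10 * growth_rate


if __name__ == "__main__":
    score = roi_score(traffic_impact=90, ease=60, competitive=50, momentum=40)
    print(round(score, 2), severity(score))
```

Since the weights in each formula sum to 1.0, inputs on a 0-100 scale keep the result on the same 0-100 scale that the severity and interpretation bands assume.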
--- a/backend/scripts/create_youtube_tasks_tables.py
+++ b/backend/scripts/create_youtube_tasks_tables.py
@@ -0,0 +1,86 @@
 """
 Create YouTube Video Tasks Table
 Standalone script to create the youtube_video_tasks table in all user
 databases. Also recovers stale in-flight tasks by marking them as failed.
 """
 import sys
 import os
 sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))
 from loguru import logger
 from models.youtube_task_models import YouTubeVideoTask, Base
 from models.subscription_models import Base as SubscriptionBase
 from services.database import get_engine_for_user, _user_engines
 from sqlalchemy import inspect
 def create_youtube_tasks_tables():
    """Create youtube_video_tasks table for all existing user databases."""
    from services.database import get_all_user_dbs
    created = 0
    skipped = 0
    recovered = 0
    try:
        user_dbs = get_all_user_dbs()
    except Exception:
        user_dbs = []
    if not user_dbs:
        logger.warning("No user databases found. Creating table in default database.")
        user_dbs = [None]
    for user_id in user_dbs:
        try:
            if user_id:
                engine = get_engine_for_user(user_id)
            else:
                from services.database import default_engine
                if not default_engine:
                    logger.error("No default engine available")
                    continue
                engine = default_engine
            SubscriptionBase.metadata.create_all(bind=engine, checkfirst=True)
            # Recover stale tasks
            from sqlalchemy.orm import sessionmaker
            SessionLocal = sessionmaker(bind=engine)
            db = SessionLocal()
            try:
                stale = db.query(YouTubeVideoTask).filter(
                    YouTubeVideoTask.status.in_([
                        'pending', 'processing',
                    ])
                ).all()
                for task in stale:
                    task.status = 'failed'
                    task.error = 'Task interrupted by server restart'
                    task.message = 'Recovered on table creation'
                    recovered += 1
                if stale:
                    db.commit()
                    logger.info(f"Recovered {len(stale)} stale tasks for user {user_id}")
            except Exception as e:
                logger.warning(f"Failed to recover stale tasks for user {user_id}: {e}")
                db.rollback()
            finally:
                db.close()
            created += 1
            logger.info(f"Created youtube_video_tasks table for user {user_id}")
        except Exception as e:
            logger.error(f"Failed to create table for user {user_id}: {e}")
            skipped += 1
    logger.info(f"YouTube task table creation complete: {created} created, {skipped} skipped, {recovered} recovered")
    return created
 if __name__ == "__main__":
    create_youtube_tasks_tables()
--- a/backend/services/blog_writer/content/enhanced_content_generator.py
+++ b/backend/services/blog_writer/content/enhanced_content_generator.py
@@ -6,7 +6,7 @@ Provider parity:
 - No direct provider coupling here; Google grounding remains in research only
 """
-from typing import Any, Dict
+from typing import Any, Dict, List
 from services.llm_providers.main_text_generation import llm_text_gen
 from .source_url_manager import SourceURLManager
@@ -22,11 +22,12 @@ class EnhancedContentGenerator:
        self.transitioner = TransitionGenerator()
        self.flow = FlowAnalyzer()
-    async def generate_section(self, section: Any, research: Any, mode: str = "polished", user_id: str = None) -> Dict[str, Any]:
+    async def generate_section(self, section: Any, research: Any = None, mode: str = "polished", user_id: str = None, competitive_advantage: str = "") -> Dict[str, Any]:
        prev_summary = self.memory.build_previous_sections_summary(limit=2)
-        urls = self.url_manager.pick_relevant_urls(section, research)
-        prompt = self._build_prompt(section, research, prev_summary, urls)
-        # Provider-agnostic text generation (respect GPT_PROVIDER & circuit-breaker)
+        research_context, section_sources = self._build_research_context(section)
+        urls = self.url_manager.pick_relevant_urls(section, research) if not research_context else []
+        global_research_context = self._build_global_research_context(research, competitive_advantage)
+        prompt = self._build_prompt(section, prev_summary, research_context, urls, global_research_context)
        content_text: str = ""
        try:
            ai_resp = llm_text_gen(
@@ -40,29 +41,22 @@ class EnhancedContentGenerator:
            elif isinstance(ai_resp, str):
                content_text = ai_resp
            else:
                # Fallback best-effort extraction
                content_text = str(ai_resp or "")
        except Exception as e:
            content_text = ""
        result = {
            "content": content_text,
-            "sources": [{"title": u.get("title", ""), "url": u.get("url", "")} for u in urls] if urls else [],
+            "sources": section_sources,
        }
        # Generate transition and compute intelligent flow metrics
        previous_text = prev_summary
        current_text = result.get("content", "")
        transition = self.transitioner.generate_transition(previous_text, getattr(section, 'heading', 'This section'), use_llm=True)
        metrics = self.flow.assess_flow(previous_text, current_text, use_llm=True)
        # Update memory for subsequent sections and store continuity snapshot
        if current_text:
            self.memory.update_with_section(getattr(section, 'id', 'unknown'), current_text, use_llm=True)
        # Return enriched result
        result["transition"] = transition
        result["continuity_metrics"] = metrics
        # Persist a lightweight continuity snapshot for API access
        try:
            sid = getattr(section, 'id', 'unknown')
            if not hasattr(self, "_last_continuity"):
@@ -72,22 +66,188 @@ class EnhancedContentGenerator:
            pass
        return result
-    def _build_prompt(self, section: Any, research: Any, prev_summary: str, urls: list) -> str:
+    def _build_research_context(self, section: Any) -> tuple:
        """Build a rich research context block from the section's mapped sources.
        Returns (context_string, sources_list) where context_string is the
        formatted research context for the prompt, and sources_list contains
        {title, url} dicts for downstream use.
        When section.references is empty, returns ("", []) — the caller should
        handle this as a research gap and avoid generating unsupported claims.
        """
        references = getattr(section, 'references', []) or []
        if not references:
            return ("", [])
        context_parts = []
        sources_out = []
        for i, ref in enumerate(references, 1):
            if isinstance(ref, dict):
                title = ref.get('title', '')
                excerpt = ref.get('excerpt', '')
                highlights = ref.get('highlights', []) or []
                summary = ref.get('summary', '')
                url = ref.get('url', '')
                content = ref.get('content', '') or ''
                author = ref.get('author', '') or ''
                source_type = ref.get('source_type', '') or ''
                credibility_score = ref.get('credibility_score')
                published_at = ref.get('published_at', '') or ''
            else:
                title = getattr(ref, 'title', '')
                excerpt = getattr(ref, 'excerpt', '')
                highlights = getattr(ref, 'highlights', []) or []
                summary = getattr(ref, 'summary', '')
                url = getattr(ref, 'url', '')
                content = getattr(ref, 'content', '') or ''
                author = getattr(ref, 'author', '') or ''
                source_type = getattr(ref, 'source_type', '') or ''
                credibility_score = getattr(ref, 'credibility_score', None)
                published_at = getattr(ref, 'published_at', '') or ''
            sources_out.append({"title": title, "url": url})
            attribution_parts = []
            if author:
                attribution_parts.append(f"by {author}")
            if source_type:
                attribution_parts.append(f"[{source_type}]")
            attribution = " ".join(attribution_parts)
            credibility_tag = ""
            if credibility_score is not None:
                try:
                    score = float(credibility_score)
                    if score >= 0.9:
                        credibility_tag = " (high-credibility)"
                    elif score >= 0.75:
                        credibility_tag = " (credible)"
                except (ValueError, TypeError):
                    pass
            recency_tag = ""
            if published_at:
                recency_tag = f" (published {published_at[:10]})" if len(published_at) >= 10 else f" (published {published_at})"
            header = f"Source {i}: {title}"
            if attribution:
                header += f" {attribution}"
            header += f"{credibility_tag}{recency_tag}"
            part = header + "\n"
            if summary:
                part += f"  Summary: {summary[:1000]}\n"
            if excerpt:
                part += f"  Key excerpt: {excerpt[:1000]}\n"
            if content and not summary and not excerpt:
                part += f"  Content: {content[:800]}\n"
            if highlights:
                part += "  Key findings:\n"
                for h in highlights[:3]:
                    h_text = h[:500] if h else ''
                    if h_text:
                        part += f"  - {h_text}\n"
            context_parts.append(part)
        return ("\n".join(context_parts), sources_out)
    def _build_global_research_context(self, research: Any, competitive_advantage: str = "") -> str:
        """Build global research context from the full BlogResearchResponse object.
        Extracts keyword_analysis, competitor_analysis, search_queries,
        and competitive_advantage into a compact context block that provides
        the LLM with strategic direction beyond per-section sources.
        """
        if research is None:
            return ""
        parts = []
        ka = getattr(research, 'keyword_analysis', None) or {}
        if ka:
            primary = ka.get('primary', [])
            secondary = ka.get('secondary', [])
            search_intent = ka.get('search_intent', '')
            kw_lines = []
            if primary:
                kw_lines.append(f"Primary keywords: {', '.join(primary[:10])}")
            if secondary:
                kw_lines.append(f"Secondary keywords: {', '.join(secondary[:10])}")
            if search_intent:
                kw_lines.append(f"Search intent: {search_intent}")
            if kw_lines:
                parts.append("=== KEYWORD & SEARCH STRATEGY ===\n" + "\n".join(kw_lines))
        ca = getattr(research, 'competitor_analysis', None) or {}
        if ca:
            ca_lines = []
            content_gaps = ca.get('content_gaps', [])
            if content_gaps:
                ca_lines.append(f"Content gaps (address these): {', '.join(content_gaps[:5])}")
            industry_leaders = ca.get('industry_leaders', [])
            if industry_leaders:
                ca_lines.append(f"Industry leaders: {', '.join(industry_leaders[:5])}")
            opportunities = ca.get('opportunities', [])
            if opportunities:
                ca_lines.append(f"Opportunities: {', '.join(opportunities[:5])}")
            if ca_lines:
                parts.append("=== COMPETITIVE LANDSCAPE ===\n" + "\n".join(ca_lines))
        sq = getattr(research, 'search_queries', None) or []
        if sq:
            parts.append(f"=== SEARCH INTENT SIGNALS ===\nOriginal search queries: {', '.join(sq[:8])}")
        if competitive_advantage:
            parts.append(f"=== COMPETITIVE ADVANTAGE ===\nEmphasize this differentiator: {competitive_advantage}")
        return "\n\n".join(parts) if parts else ""
    def _build_prompt(self, section: Any, prev_summary: str, research_context: str, urls: list, global_research_context: str = "") -> str:
        heading = getattr(section, 'heading', 'Section')
        key_points = getattr(section, 'key_points', [])
        keywords = getattr(section, 'keywords', [])
        subheadings = getattr(section, 'subheadings', []) or []
        target_words = getattr(section, 'target_words', 300)
        url_block = "\n".join([f"- {u.get('title','')} ({u.get('url','')})" for u in urls]) if urls else "(no specific URLs provided)"
-        return (
+        prompt = (
            f"You are writing the blog section '{heading}'.\n\n"
            f"Context summary (previous sections): {prev_summary}\n\n"
            f"Authoring requirements:\n"
            f"- Target word count: ~{target_words}\n"
            f"- Use the following key points: {', '.join(key_points)}\n"
            f"- Include these keywords naturally: {', '.join(keywords)}\n"
            f"- Cite insights from these sources when relevant (do not output raw URLs):\n{url_block}\n\n"
            "Write engaging, well-structured markdown with clear paragraphs (2-4 sentences each) separated by double line breaks."
        )
        if subheadings:
            prompt += f"- Cover these subtopics: {', '.join(subheadings)}\n"
        if global_research_context:
            prompt += f"\n{global_research_context}\n\n"
        if research_context:
            prompt += (
                f"\nResearch sources for this section (use these facts, statistics, "
                f"and insights to support your writing):\n{research_context}\n\n"
                "IMPORTANT: Base your writing on the research sources above. "
                "Use specific facts, statistics, and data from these sources. "
                "Do not invent numbers, statistics, or claims not supported by the research.\n"
            )
        elif urls:
            import logging
            logging.getLogger('content_generator').warning(
                f"No research context for section '{heading}' — falling back to bare URLs"
            )
            url_lines = []
            for u in urls:
                if isinstance(u, dict):
                    url_lines.append(f"- {u.get('title','')} ({u.get('url','')})")
                else:
                    url_lines.append(f"- {u}")
            prompt += f"\nReference URLs (consult for additional context):\n" + "\n".join(url_lines) + "\n"
        prompt += (
            "\nWrite engaging, well-structured markdown with clear paragraphs "
            "(2-4 sentences each) separated by double line breaks."
        )
        return prompt
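Review note: the credibility and recency tagging inside `_build_research_context` can be read in isolation as two small pure functions. The sketch below mirrors the thresholds in the diff (0.9 and 0.75) and the 10-character date prefix; the helper names `credibility_tag` and `recency_tag` are hypothetical and do not exist in the codebase.

```python
from typing import Any


def credibility_tag(score: Any) -> str:
    """Map a raw credibility score to the tag appended to a source header.

    Hypothetical helper; thresholds are copied from the diff above.
    """
    if score is None:
        return ""
    try:
        value = float(score)
    except (ValueError, TypeError):
        # Non-numeric scores are ignored rather than raising
        return ""
    if value >= 0.9:
        return " (high-credibility)"
    if value >= 0.75:
        return " (credible)"
    return ""


def recency_tag(published_at: str) -> str:
    """Keep only the date part (first 10 characters) of a timestamp string."""
    if not published_at:
        return ""
    # Slicing a shorter string returns it unchanged, so one branch suffices
    return f" (published {published_at[:10]})"


header = "Source 1: Example" + credibility_tag("0.8") + recency_tag("2026-06-15T10:40:16+07:00")
print(header)
```

Pulling the tagging out this way would also make the thresholds unit-testable without constructing a section object.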
--- a/backend/services/blog_writer/content/flow_analyzer.py
+++ b/backend/services/blog_writer/content/flow_analyzer.py
@@ -7,10 +7,9 @@ Uses Gemini API for intelligent analysis while minimizing API calls through cach
 from typing import Dict, Optional
 from loguru import logger
 import hashlib
 import json
-# Import the common gemini provider
+# Provider-agnostic LLM dispatcher (respects GPT_PROVIDER env var)
-from services.llm_providers.gemini_provider import gemini_structured_json_response
+from services.llm_providers.main_text_generation import llm_text_gen
 class FlowAnalyzer:
@@ -21,7 +20,7 @@ class FlowAnalyzer:
        self._rule_cache: Dict[str, Dict[str, float]] = {}
        logger.info("✅ FlowAnalyzer initialized with LLM-based analysis")
-    def assess_flow(self, previous_text: str, current_text: str, use_llm: bool = True) -> Dict[str, float]:
+    def assess_flow(self, previous_text: str, current_text: str, use_llm: bool = True, user_id: str = None) -> Dict[str, float]:
        """
        Return flow metrics in range 0..1.
@@ -29,6 +28,7 @@ class FlowAnalyzer:
            previous_text: Previous section content
            current_text: Current section content  
            use_llm: Whether to use LLM analysis (default: True for significant content)
+            user_id: Clerk user ID for subscription checking
        """
        if not current_text:
            return {"flow": 0.0, "consistency": 0.0, "progression": 0.0}
@@ -46,7 +46,7 @@ class FlowAnalyzer:
        if should_use_llm:
            try:
-                metrics = self._llm_flow_analysis(previous_text, current_text)
+                metrics = self._llm_flow_analysis(previous_text, current_text, user_id=user_id)
                self._cache[cache_key] = metrics
                logger.info("LLM-based flow analysis completed")
                return metrics
@@ -71,8 +71,8 @@ class FlowAnalyzer:
        # Use LLM if: substantial content (>100 words) OR has meaningful previous context
        return word_count > 100 or has_previous
-    def _llm_flow_analysis(self, previous_text: str, current_text: str) -> Dict[str, float]:
+    def _llm_flow_analysis(self, previous_text: str, current_text: str, user_id: str = None) -> Dict[str, float]:
-        """Use Gemini API for intelligent flow analysis."""
+        """Use LLM for intelligent flow analysis (provider-agnostic)."""
        # Truncate content to minimize tokens while keeping context
        prev_truncated = (previous_text[-300:] if previous_text else "") if previous_text else ""
@@ -103,22 +103,20 @@ Return ONLY a JSON object with these exact keys: flow, consistency, progression
        }
        try:
-            result = gemini_structured_json_response(
+            result = llm_text_gen(
                prompt=prompt,
-                schema=schema,
-                temperature=0.2,  # Low temperature for consistent scoring
-                max_tokens=1000   # Increased tokens for better analysis
+                json_struct=schema,
+                system_prompt=None,
+                user_id=user_id,
+                temperature=0.2,
+                max_tokens=1000
            )
-            if result.parsed:
-                return {
-                    "flow": float(result.parsed.get("flow", 0.6)),
-                    "consistency": float(result.parsed.get("consistency", 0.6)),
-                    "progression": float(result.parsed.get("progression", 0.6))
-                }
-            else:
-                logger.warning("LLM response parsing failed, using fallback")
-                return self._rule_based_analysis(previous_text, current_text)
+            return {
+                "flow": float(result.get("flow", 0.6)),
+                "consistency": float(result.get("consistency", 0.6)),
+                "progression": float(result.get("progression", 0.6))
+            }
        except Exception as e:
            logger.error(f"LLM flow analysis error: {e}")
--- a/backend/services/blog_writer/content/introduction_generator.py
+++ b/backend/services/blog_writer/content/introduction_generator.py
@@ -28,18 +28,17 @@ class IntroductionGenerator:
    ) -> str:
        """Build a prompt for generating blog introductions."""
        # Extract key research insights
        keyword_analysis = research.keyword_analysis or {}
        content_angles = research.suggested_angles or []
        competitor_analysis = research.competitor_analysis or {}
        search_queries = research.search_queries or []
        # Get a summary of the first few sections for context
        section_summaries = []
        for i, section in enumerate(outline[:3], 1):
            section_id = section.id
            content = sections_content.get(section_id, '')
            if content:
-                # Take first 200 chars as summary
-                summary = content[:200] + '...' if len(content) > 200 else content
+                summary = content[:300] + '...' if len(content) > 300 else content
                section_summaries.append(f"{i}. {section.heading}: {summary}")
        sections_text = '\n'.join(section_summaries) if section_summaries else "Content sections are being generated."
@@ -47,13 +46,56 @@ class IntroductionGenerator:
        primary_kw_text = ', '.join(primary_keywords) if primary_keywords else "the topic"
        content_angle_text = ', '.join(content_angles[:3]) if content_angles else "General insights"
-        return f"""Generate exactly 3 varied blog introductions for the following blog post.
+        # Build keyword strategy block from actual keyword_analysis
        keyword_block = ""
        all_keywords = []
        if keyword_analysis:
            primary_kw = keyword_analysis.get('primary', [])
            secondary_kw = keyword_analysis.get('secondary', [])
            if primary_kw:
                all_keywords.extend(primary_kw[:5])
            if secondary_kw:
                all_keywords.extend(secondary_kw[:5])
            si = keyword_analysis.get('search_intent', '')
            if si:
                keyword_block += f"\nSearch intent: {si}"
        if all_keywords:
            keyword_block = f"Target keywords: {', '.join(all_keywords)}" + keyword_block
        # Build competitive landscape block
        competitive_block = ""
        if competitor_analysis:
            gaps = competitor_analysis.get('content_gaps', [])
            leaders = competitor_analysis.get('industry_leaders', [])
            opportunities = competitor_analysis.get('opportunities', [])
            advantages = competitor_analysis.get('competitive_advantages', [])
            comp_lines = []
            if advantages:
                comp_lines.append(f"Key differentiators: {', '.join(advantages[:3])}")
            if gaps:
                comp_lines.append(f"Content gaps to address: {', '.join(gaps[:3])}")
            if leaders:
                comp_lines.append(f"Industry leaders: {', '.join(leaders[:3])}")
            if opportunities:
                comp_lines.append(f"Opportunities: {', '.join(opportunities[:3])}")
            if comp_lines:
                competitive_block = "\n".join(comp_lines)
        # Build search intent context
        search_block = ""
        if search_queries:
            search_block = f"Original search queries: {', '.join(search_queries[:5])}"
        prompt = f"""Generate exactly 3 varied blog introductions for the following blog post.
 BLOG TITLE: {blog_title}
 PRIMARY KEYWORDS: {primary_kw_text}
 SEARCH INTENT: {search_intent}
 CONTENT ANGLES: {content_angle_text}
 {keyword_block}
 {f"COMPETITIVE LANDSCAPE:\n{competitive_block}" if competitive_block else ""}
 {f"SEARCH CONTEXT:\n{search_block}" if search_block else ""}
 BLOG CONTENT SUMMARY:
 {sections_text}
@@ -69,6 +111,7 @@ REQUIREMENTS FOR EACH INTRODUCTION:
  3. Third: Story/statistic-focused (use a compelling fact or narrative hook)
 - Maintain a professional yet engaging tone
 - Avoid generic phrases - be specific and benefit-driven
+- Where possible, incorporate specific insights from the competitive landscape and search intent above
 Return ONLY a JSON array of exactly 3 introductions:
 [
@@ -76,6 +119,7 @@ Return ONLY a JSON array of exactly 3 introductions:
  "Second introduction (80-120 words, benefit-focused)",
  "Third introduction (80-120 words, story/statistic-focused)"
 ]"""
+        return prompt
    def get_introduction_schema(self) -> Dict[str, Any]:
        """Get the JSON schema for introduction generation."""
--- a/backend/services/blog_writer/core/blog_writer_service.py
+++ b/backend/services/blog_writer/core/blog_writer_service.py
@@ -129,9 +129,9 @@ class BlogWriterService:
        """Enhance a section using AI."""
        return await self.outline_service.enhance_section_with_ai(section, focus)
-    async def optimize_outline_with_ai(self, outline: List[BlogOutlineSection], focus: str = "general optimization") -> List[BlogOutlineSection]:
+    async def optimize_outline_with_ai(self, outline: List[BlogOutlineSection], focus: str = "general optimization", research_context: str = "") -> List[BlogOutlineSection]:
        """Optimize entire outline for better flow and SEO."""
-        return await self.outline_service.optimize_outline_with_ai(outline, focus)
+        return await self.outline_service.optimize_outline_with_ai(outline, focus, research_context=research_context)
    def rebalance_word_counts(self, outline: List[BlogOutlineSection], target_words: int) -> List[BlogOutlineSection]:
        """Rebalance word count distribution across sections."""
@@ -140,14 +140,15 @@ class BlogWriterService:
    # Content Generation Methods
    async def generate_section(self, request: BlogSectionRequest, user_id: str = None) -> BlogSectionResponse:
        """Generate section content from outline."""
-        # Compose research-lite object with minimal continuity summary if available
-        research_ctx: Any = getattr(request, 'research', None)
+        research_ctx = request.research
+        competitive_advantage = request.competitive_advantage
        try:
            ai_result = await self.content_generator.generate_section(
                section=request.section,
                research=research_ctx,
                mode=(request.mode or "polished"),
-                user_id=user_id
+                user_id=user_id,
+                competitive_advantage=competitive_advantage,
            )
            markdown = ai_result.get('content') or ai_result.get('markdown') or ''
            citations = []
@@ -339,8 +340,19 @@ class BlogWriterService:
            )
    async def publish(self, request: BlogPublishRequest) -> BlogPublishResponse:
-        """Publish content to specified platform."""
-        # TODO: Move to content module
+        """Publish content to specified platform.
+
+        NOTE: This endpoint is a STUB / placeholder. The actual publish flow
+        bypasses this method entirely — the frontend calls platform-specific
+        endpoints directly:
+          - Wix:  POST /api/wix/publish  (wix_routes.py)
+          - WordPress: POST /api/wordpress/publish  (routers/wordpress.py)
+        TODO: Either remove this stub or wire it as a unified dispatcher that
+        routes to the correct platform service. Keep alive until the new
+        unified publish flow (pre-publish checklist + schedule + history) is
+        built and this becomes the single entry point for all publishing.
+        """
        return BlogPublishResponse(success=True, platform=request.platform, url="https://example.com/post")
    async def generate_medium_blog_with_progress(self, req: MediumBlogGenerateRequest, task_id: str, user_id: str, db: Session = None) -> MediumBlogGenerateResult:
@@ -359,9 +371,11 @@ class BlogWriterService:
    async def analyze_flow_basic(self, request: Dict[str, Any]) -> Dict[str, Any]:
        """Analyze flow metrics for entire blog using single AI call (cost-effective)."""
        try:
+            import asyncio
            # Extract blog content from request
            sections = request.get("sections", [])
            title = request.get("title", "Untitled Blog")
+            user_id = request.get("user_id")
            if not sections:
                return {"error": "No sections provided for analysis"}
@@ -397,8 +411,7 @@ class BlogWriterService:
            Provide detailed analysis with specific, actionable suggestions for improvement.
            """
-            # Use Gemini for structured analysis
-            from services.llm_providers.gemini_provider import gemini_structured_json_response
+            from services.llm_providers.main_text_generation import llm_text_gen
            schema = {
                "type": "object",
@@ -440,12 +453,17 @@ class BlogWriterService:
                "required": ["overall_flow_score", "overall_consistency_score", "overall_progression_score", "overall_coherence_score", "sections", "overall_suggestions"]
            }
-            result = gemini_structured_json_response(
-                prompt=analysis_prompt,
-                schema=schema,
-                temperature=0.3,
-                max_tokens=4096,
-                system_prompt=system_prompt
+            result = await asyncio.to_thread(
+                llm_text_gen,
+                analysis_prompt,
+                system_prompt,
+                schema,
+                user_id,
+                None,   # preferred_hf_models
+                None,   # preferred_provider
+                None,   # flow_type
+                4096,   # max_tokens
+                0.3     # temperature
            )
            if result and not result.get("error"):
@@ -466,6 +484,7 @@ class BlogWriterService:
            # Use the existing enhanced content generator for detailed analysis
            sections = request.get("sections", [])
            title = request.get("title", "Untitled Blog")
+            user_id = request.get("user_id")
            if not sections:
                return {"error": "No sections provided for analysis"}
@@ -485,7 +504,8 @@ class BlogWriterService:
                flow_metrics = self.content_generator.flow.assess_flow(
                    prev_section_content, 
                    section_content, 
-                    use_llm=True
+                    use_llm=True,
+                    user_id=user_id
                )
                results.append({
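Review note: `analyze_flow_basic` passes nine positional arguments through `asyncio.to_thread`, so the call silently depends on the parameter order of `llm_text_gen`. `asyncio.to_thread` also forwards keyword arguments, which avoids that coupling. The sketch below uses a hypothetical stand-in, `blocking_llm_call`, because the real `llm_text_gen` signature is not shown in this diff; the keyword names are the ones used in the `flow_analyzer.py` call above.

```python
import asyncio
import time
from typing import Any, Dict, Optional


def blocking_llm_call(
    prompt: str,
    system_prompt: Optional[str] = None,
    json_struct: Optional[Dict[str, Any]] = None,
    user_id: Optional[str] = None,
    max_tokens: int = 1000,
    temperature: float = 0.7,
) -> Dict[str, Any]:
    """Hypothetical stand-in for a synchronous LLM call (not the real llm_text_gen)."""
    time.sleep(0.01)  # simulate blocking network I/O
    return {"user_id": user_id, "max_tokens": max_tokens, "temperature": temperature}


async def analyze(prompt: str, user_id: str) -> Dict[str, Any]:
    # Keyword arguments are forwarded to the function running in the worker thread,
    # so the call does not break if the parameter order changes later.
    return await asyncio.to_thread(
        blocking_llm_call,
        prompt,
        system_prompt="You are an editor.",
        json_struct={"type": "object"},
        user_id=user_id,
        max_tokens=4096,
        temperature=0.3,
    )


result = asyncio.run(analyze("Assess the flow of this blog", "user_123"))
print(result)
```

`asyncio.to_thread` requires Python 3.9 or later.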
--- a/backend/services/blog_writer/outline/grounding_engine.py
+++ b/backend/services/blog_writer/outline/grounding_engine.py
@@ -40,8 +40,10 @@ class GroundingContextEngine:
        }
        # Temporal relevance patterns
        cy = str(datetime.now().year)
        ny = str(datetime.now().year + 1)
        self.temporal_patterns = {
-            'recent': ['2024', '2025', 'latest', 'new', 'recent', 'current', 'updated'],
+            'recent': [cy, ny, 'latest', 'new', 'recent', 'current', 'updated'],
            'trending': ['trend', 'emerging', 'growing', 'increasing', 'rising'],
            'evergreen': ['fundamental', 'basic', 'principles', 'foundation', 'core']
        }
@@ -239,9 +241,23 @@ class GroundingContextEngine:
            else:
                authority_distribution['low'] += 1
        # Extract actual high-authority sources from chunks
        high_authority_sources = []
        for chunk in grounding_metadata.grounding_chunks:
            chunk_authority = self._calculate_chunk_authority(chunk)
            if chunk_authority >= 0.8:
                high_authority_sources.append({
                    'title': chunk.title if chunk.title else 'Unknown Source',
                    'url': chunk.url if chunk.url else '',
                    'score': round(chunk_authority, 3)
                })
        # Sort by authority score descending, keep top 5
        high_authority_sources.sort(key=lambda x: x['score'], reverse=True)
        high_authority_sources = high_authority_sources[:5]
        return {
            'average_authority_score': sum(authority_scores) / len(authority_scores) if authority_scores else 0.0,
-            'high_authority_sources': [{'title': 'High Authority Source', 'url': 'example.com', 'score': 0.9}],  # Placeholder
+            'high_authority_sources': high_authority_sources,
            'authority_distribution': dict(authority_distribution)
        }
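Review note: the new high-authority extraction is a filter, sort, truncate pipeline. A minimal sketch of the same steps on plain tuples follows; `top_authority_sources` and the `(title, url, score)` tuple shape are assumptions for illustration, whereas the real code scores `grounding_chunks` through `_calculate_chunk_authority`.

```python
from typing import Dict, List, Tuple, Union

Source = Dict[str, Union[str, float]]


def top_authority_sources(
    scored: List[Tuple[str, str, float]],
    threshold: float = 0.8,
    limit: int = 5,
) -> List[Source]:
    """Keep sources at or above the threshold, best first, at most `limit` entries."""
    picked: List[Source] = [
        {"title": title or "Unknown Source", "url": url or "", "score": round(score, 3)}
        for title, url, score in scored
        if score >= threshold
    ]
    # Highest authority first, then truncate
    picked.sort(key=lambda s: s["score"], reverse=True)
    return picked[:limit]


sample = [("A", "a.com", 0.85), ("", "", 0.95), ("C", "c.com", 0.5)]
print(top_authority_sources(sample))
```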
--- a/backend/services/blog_writer/outline/keyword_curator.py
+++ b/backend/services/blog_writer/outline/keyword_curator.py
@@ -137,6 +137,15 @@ class KeywordCurator:
            lines.append(f"### Competitive advantage signal (must weave into narrative): {content_gap[0]}")
            lines.append("   → This is your primary differentiation hook. Surface it prominently in the unique value section.")
        lines.append("")
        lines.append("### SUGGESTED SECTION → KEYWORD MAPPING")
        lines.append("Map each outline section's keyword focus according to its narrative role:")
        lines.append("- Hook / Introduction → lead with primary and trending keywords for timeliness & relevance")
        lines.append("- Problem / Pain Point → anchor on secondary and long-tail keywords (informational intent)")
        lines.append("- Solution / How-To → weave in primary and secondary keywords for solution-oriented search")
        lines.append("- Comparison / Analysis → embed semantic keywords to prevent topical drift into tangents")
        lines.append("- Case Studies / Evidence → surface content gap keywords as differentiation proof points")
        lines.append("- Future / Trends → leverage trending and content gap keywords for forward-looking authority")
        lines.append("")
        lines.append("GUIDELINE: Treat these as the primary keyword anchors. You may include closely related")
        lines.append("intent-matching variations where natural, but avoid inserting every raw research keyword.")
@@ -176,7 +185,11 @@ class KeywordCurator:
        slot_key: Optional[str] = None,
    ) -> List[str]:
        """
-        Pick up to N items from a keyword list.
+        Pick up to N items from a keyword list with diversity sampling.
        When the raw list is significantly larger than the limit, selects
        evenly-spaced entries to capture semantic diversity rather than
        just the first N entries.
        Args:
            data: The raw keyword_analysis dict.
@@ -184,11 +197,24 @@ class KeywordCurator:
            slot_key: The internal slot name for looking up the limit.
                      Falls back to source_key if not provided.
        Returns:
-            Sliced list of at most N strings.
+            List of at most N strings with diversity sampling.
        """
        limit_key = slot_key or source_key
        limit = self.SLOTS.get(limit_key, 5)
        raw: Any = data.get(source_key, [])
        if not isinstance(raw, list):
            return []
-        return raw[:limit]
+        if len(raw) <= limit:
+            return raw
+        if len(raw) <= limit * 2:
+            return raw[:limit]
+        indices = set()
+        if limit >= 2:
+            indices.add(0)
+            indices.add(len(raw) - 1)
+            step = (len(raw) - 1) / max(limit - 1, 1)
+            for i in range(1, limit - 1):
+                indices.add(int(round(i * step)))
+        else:
+            indices.add(0)
+        return [raw[i] for i in sorted(indices) if i < len(raw)][:limit]
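Review note: the diversity sampling above has three regimes. Short lists are returned whole, lists up to twice the limit are sliced from the front, and longer lists are sampled at evenly spaced indices that always include the first and last entries. The standalone sketch below reproduces that logic for experimentation; `pick_diverse` is a hypothetical name and the sketch assumes `limit >= 0`.

```python
from typing import List


def pick_diverse(items: List[str], limit: int) -> List[str]:
    """Pick at most `limit` items, spreading picks across long lists."""
    if len(items) <= limit:
        return list(items)
    if len(items) <= limit * 2:
        # Not much longer than the limit: plain prefix slice
        return items[:limit]
    if limit < 2:
        # Limit of 0 or 1: nothing to spread, take the head
        return items[:limit]
    step = (len(items) - 1) / (limit - 1)
    indices = {0, len(items) - 1}
    # Interior picks at evenly spaced positions between the two ends
    indices.update(int(round(i * step)) for i in range(1, limit - 1))
    return [items[i] for i in sorted(indices)][:limit]


keywords = [f"kw{i}" for i in range(11)]
print(pick_diverse(keywords, 3))
```

Because the sampled branch only runs when the list is more than twice the limit, the step is always greater than 2, so the rounded indices cannot collide and the result has exactly `limit` entries.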
--- a/backend/services/blog_writer/outline/outline_generator.py
+++ b/backend/services/blog_writer/outline/outline_generator.py
@@ -52,6 +52,44 @@ class OutlineGenerator:
        raw_analysis = research.keyword_analysis if research else {}
        return self.keyword_curator.curate(raw_analysis)
    def _build_optimization_context(self, research) -> str:
        """Build a compact research context for the outline optimizer.
        Provides keywords, competitor data, and top source summaries so
        the optimizer doesn't run blind to the research."""
        if not research:
            return ""
        parts = []
        kw = research.keyword_analysis if research.keyword_analysis else {}
        primary = kw.get('primary', [])
        if primary:
            parts.append(f"Primary keywords: {', '.join(primary[:5])}")
        search_intent = kw.get('search_intent', '')
        if search_intent:
            parts.append(f"Search intent: {search_intent}")
        comp = research.competitor_analysis if research.competitor_analysis else {}
        top_competitors = comp.get('top_competitors', [])
        if top_competitors:
            parts.append(f"Top competitors: {', '.join(str(c) for c in top_competitors[:5])}")
        content_gaps = kw.get('content_gaps', [])
        if content_gaps:
            parts.append(f"Content gaps: {'; '.join(str(g) for g in content_gaps[:5])}")
        opportunities = comp.get('opportunities', [])
        if opportunities:
            parts.append(f"Opportunities: {'; '.join(str(o) for o in opportunities[:5])}")
        sources = research.sources if research.sources else []
        if sources:
            top_sources = sorted(sources, key=lambda s: s.credibility_score or 0.8, reverse=True)[:5]
            source_lines = []
            for s in top_sources:
                line = f"- {s.title}"
                if s.summary:
                    line += f": {s.summary[:150]}"
                elif s.excerpt:
                    line += f": {s.excerpt[:150]}"
                source_lines.append(line)
            parts.append("Key research sources:\n" + "\n".join(source_lines))
        return "\n".join(parts)
    async def generate(self, request: BlogOutlineRequest, user_id: str) -> BlogOutlineResponse:
        """
        Generate AI-powered outline using research results.
@@ -102,7 +140,7 @@ class OutlineGenerator:
        # Run parallel processing for speed optimization (user_id required)
        mapped_sections, grounding_insights = await self.parallel_processor.run_parallel_processing_async(
-            outline_sections, research, user_id
+            outline_sections, research, user_id, competitive_advantage=selected_competitive_advantage or ""
        )
        # Enhance sections with grounding insights
@@ -113,7 +151,8 @@ class OutlineGenerator:
        # Optimize outline for better flow, SEO, and engagement (user_id required)
        logger.info("Optimizing outline for better flow and engagement...")
-        optimized_sections = await self.outline_optimizer.optimize(grounding_enhanced_sections, "comprehensive optimization", user_id)
+        optimization_context = self._build_optimization_context(research)
        optimized_sections = await self.outline_optimizer.optimize(grounding_enhanced_sections, "comprehensive optimization", user_id, research_context=optimization_context)
        # Rebalance word counts for optimal distribution
        target_words = request.word_count or 1500
@@ -124,7 +163,8 @@ class OutlineGenerator:
        content_angle_titles = self.title_generator.extract_content_angle_titles(research)
        # Combine AI-generated titles with content angles (full primary keywords for title variety)
-        title_options = self.title_generator.combine_title_options(ai_title_options, content_angle_titles, primary_keywords)
+        research_topic = getattr(request, 'topic', '') or ''
        title_options = self.title_generator.combine_title_options(ai_title_options, content_angle_titles, primary_keywords, research_topic)
        logger.info(f"Generated optimized outline with {len(balanced_sections)} sections and {len(title_options)} title options")
@@ -201,7 +241,7 @@ class OutlineGenerator:
        # Run parallel processing for speed optimization (user_id required for subscription checks)
        mapped_sections, grounding_insights = await self.parallel_processor.run_parallel_processing(
-            outline_sections, research, user_id, task_id
+            outline_sections, research, user_id, task_id, competitive_advantage=selected_competitive_advantage or ""
        )
        # Enhance sections with grounding insights (depends on both previous tasks)
@@ -212,7 +252,8 @@ class OutlineGenerator:
        # Optimize outline for better flow, SEO, and engagement (user_id required for subscription checks)
        await task_manager.update_progress(task_id, "🎯 Optimizing outline for better flow and engagement...")
-        optimized_sections = await self.outline_optimizer.optimize(grounding_enhanced_sections, "comprehensive optimization", user_id)
+        optimization_context = self._build_optimization_context(research)
        optimized_sections = await self.outline_optimizer.optimize(grounding_enhanced_sections, "comprehensive optimization", user_id, research_context=optimization_context)
        # Rebalance word counts for optimal distribution
        await task_manager.update_progress(task_id, "⚖️ Rebalancing word count distribution...")
@@ -224,7 +265,8 @@ class OutlineGenerator:
        content_angle_titles = self.title_generator.extract_content_angle_titles(research)
        # Combine AI-generated titles with content angles (full primary keywords for title variety)
-        title_options = self.title_generator.combine_title_options(ai_title_options, content_angle_titles, primary_keywords)
+        research_topic = getattr(request, 'topic', '') or ''
        title_options = self.title_generator.combine_title_options(ai_title_options, content_angle_titles, primary_keywords, research_topic)
        await task_manager.update_progress(task_id, "✅ Outline generation and optimization completed successfully!")
--- a/backend/services/blog_writer/outline/outline_optimizer.py
+++ b/backend/services/blog_writer/outline/outline_optimizer.py
@@ -4,7 +4,7 @@ Outline Optimizer - AI-powered outline optimization and rebalancing.
 Optimizes outlines for better flow, SEO, and engagement.
 """
-from typing import List
+from typing import List, Dict, Any, Optional
 from loguru import logger
 from models.blog_models import BlogOutlineSection
@@ -13,13 +13,14 @@ from models.blog_models import BlogOutlineSection
 class OutlineOptimizer:
    """Optimizes outlines for better flow, SEO, and engagement."""
-    async def optimize(self, outline: List[BlogOutlineSection], focus: str, user_id: str) -> List[BlogOutlineSection]:
+    async def optimize(self, outline: List[BlogOutlineSection], focus: str, user_id: str, research_context: str = "") -> List[BlogOutlineSection]:
        """Optimize entire outline for better flow, SEO, and engagement.
        Args:
            outline: List of outline sections to optimize
            focus: Optimization focus (e.g., "general optimization")
            user_id: User ID (required for subscription checks and usage tracking)
            research_context: Optional research context to ground optimization
        Returns:
            List of optimized outline sections
@@ -40,19 +41,28 @@ Current Outline:
 Optimization Focus: {focus}
 Goals: Improve narrative flow, enhance SEO, increase engagement, ensure comprehensive coverage.
 """
        if research_context:
            optimization_prompt += f"""
 Research Context (use this to ground your optimization in real data):
 {research_context}
 Ensure the optimized outline reflects the research insights above — headings should address the key topics, keywords should align with search intent, and sections should cover the most important angles from the research.
 """
        optimization_prompt += """
 Return JSON format:
-{{
+{
    "outline": [
-        {{
+        {
            "heading": "Optimized heading",
            "subheadings": ["subheading 1", "subheading 2"],
            "key_points": ["point 1", "point 2"],
            "target_words": 300,
            "keywords": ["keyword1", "keyword2"]
-        }}
+        }
    ]
-}}"""
+}"""
        try:
            from services.llm_providers.main_text_generation import llm_text_gen
@@ -112,26 +122,34 @@ Return JSON format:
        return outline
    def rebalance_word_counts(self, outline: List[BlogOutlineSection], target_words: int) -> List[BlogOutlineSection]:
-        """Rebalance word count distribution across sections."""
+        """Rebalance word count distribution across sections, weighting by source count."""
        total_sections = len(outline)
        if total_sections == 0:
            return outline
-        # Calculate target distribution
+        intro_words = int(target_words * 0.12)
-        intro_words = int(target_words * 0.12)  # 12% for intro
+        conclusion_words = int(target_words * 0.12)
        conclusion_words = int(target_words * 0.12)  # 12% for conclusion
        main_content_words = target_words - intro_words - conclusion_words
-        # Distribute main content words across sections
+        # Weight sections by research density (sections with more sources get more words)
-        words_per_section = main_content_words // total_sections
+        main_sections = outline[1:-1] if total_sections > 2 else outline
-        remainder = main_content_words % total_sections
+        source_weights = []
        for section in main_sections:
            ref_count = len(getattr(section, 'references', []) or [])
            source_weights.append(1.0 + ref_count * 0.5)
        total_weight = sum(source_weights) if source_weights else len(main_sections)
        for i, section in enumerate(outline):
-            if i == 0:  # First section (intro)
+            if i == 0 and total_sections > 2:
                section.target_words = intro_words
-            elif i == total_sections - 1:  # Last section (conclusion)
+            elif i == total_sections - 1 and total_sections > 2:
                section.target_words = conclusion_words
-            else:  # Main content sections
+            else:
-                section.target_words = words_per_section + (1 if i < remainder else 0)
+                main_idx = i - 1 if total_sections > 2 else i
                if main_idx < len(source_weights):
                    section.target_words = int(main_content_words * source_weights[main_idx] / total_weight)
                else:
                    section.target_words = main_content_words // max(len(main_sections), 1)
        return outline
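Reviewer note: a minimal sketch of the weighted allocation introduced above, assuming each section is reduced to its reference count. `rebalance` and `ref_counts` are hypothetical names used only for illustration; the real method mutates `BlogOutlineSection.target_words` in place.

```python
from typing import List


def rebalance(ref_counts: List[int], target_words: int) -> List[int]:
    """Return per-section word targets, mirroring the weighting in the hunk above."""
    n = len(ref_counts)
    if n == 0:
        return []
    intro = int(target_words * 0.12)
    conclusion = int(target_words * 0.12)
    main_words = target_words - intro - conclusion
    # With more than two sections, the first and last are intro and conclusion.
    main_counts = ref_counts[1:-1] if n > 2 else ref_counts
    # Each reference adds half a unit of weight on top of a base weight of 1.
    weights = [1.0 + count * 0.5 for count in main_counts]
    total_weight = sum(weights)
    main_alloc = [int(main_words * w / total_weight) for w in weights]
    if n > 2:
        return [intro] + main_alloc + [conclusion]
    return main_alloc


allocation = rebalance([0, 2, 0, 1, 0], 1000)
print(allocation, sum(allocation))
```

Because every share is truncated with `int`, the targets can sum to slightly less than `target_words` (998 of 1000 here). With one or two sections, the 24% reserved for intro and conclusion is not assigned to any section.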
--- a/backend/services/blog_writer/outline/outline_service.py
+++ b/backend/services/blog_writer/outline/outline_service.py
@@ -233,9 +233,9 @@ class OutlineService:
        """Enhance a section using AI with research context."""
        return await self.section_enhancer.enhance(section, focus)
-    async def optimize_outline_with_ai(self, outline: List[BlogOutlineSection], focus: str = "general optimization") -> List[BlogOutlineSection]:
+    async def optimize_outline_with_ai(self, outline: List[BlogOutlineSection], focus: str = "general optimization", research_context: str = "") -> List[BlogOutlineSection]:
        """Optimize entire outline for better flow, SEO, and engagement."""
-        return await self.outline_optimizer.optimize(outline, focus)
+        return await self.outline_optimizer.optimize(outline, focus, research_context=research_context)
    def rebalance_word_counts(self, outline: List[BlogOutlineSection], target_words: int) -> List[BlogOutlineSection]:
        """Rebalance word count distribution across sections."""
--- a/backend/services/blog_writer/outline/parallel_processor.py
+++ b/backend/services/blog_writer/outline/parallel_processor.py
@@ -17,7 +17,7 @@ class ParallelProcessor:
        self.source_mapper = source_mapper
        self.grounding_engine = grounding_engine
-    async def run_parallel_processing(self, outline_sections, research, user_id: str, task_id: str = None) -> Tuple[Any, Any]:
+    async def run_parallel_processing(self, outline_sections, research, user_id: str, task_id: str = None, competitive_advantage: str = "") -> Tuple[Any, Any]:
        """
        Run source mapping and grounding insights extraction in parallel.
@@ -26,6 +26,7 @@ class ParallelProcessor:
            research: Research data object
            user_id: User ID (required for subscription checks and usage tracking)
            task_id: Optional task ID for progress updates
            competitive_advantage: Selected competitive advantage for preferential source matching
        Returns:
            Tuple of (mapped_sections, grounding_insights)
@@ -44,7 +45,7 @@ class ParallelProcessor:
        # Run these tasks in parallel to save time
        source_mapping_task = asyncio.create_task(
-            self._run_source_mapping(outline_sections, research, task_id, user_id)
+            self._run_source_mapping(outline_sections, research, task_id, user_id, competitive_advantage)
        )
        grounding_insights_task = asyncio.create_task(
@@ -59,7 +60,7 @@ class ParallelProcessor:
        return mapped_sections, grounding_insights
-    async def run_parallel_processing_async(self, outline_sections, research, user_id: str) -> Tuple[Any, Any]:
+    async def run_parallel_processing_async(self, outline_sections, research, user_id: str, competitive_advantage: str = "") -> Tuple[Any, Any]:
        """
        Run parallel processing without progress updates (for non-progress methods).
@@ -67,6 +68,7 @@ class ParallelProcessor:
            outline_sections: List of outline sections to process
            research: Research data object
            user_id: User ID (required for subscription checks and usage tracking)
            competitive_advantage: Selected competitive advantage for preferential source matching
        Returns:
            Tuple of (mapped_sections, grounding_insights)
@@ -81,7 +83,7 @@ class ParallelProcessor:
        # Run these tasks in parallel to save time
        source_mapping_task = asyncio.create_task(
-            self._run_source_mapping_async(outline_sections, research, user_id)
+            self._run_source_mapping_async(outline_sections, research, user_id, competitive_advantage)
        )
        grounding_insights_task = asyncio.create_task(
@@ -96,12 +98,12 @@ class ParallelProcessor:
        return mapped_sections, grounding_insights
-    async def _run_source_mapping(self, outline_sections, research, task_id, user_id: str):
+    async def _run_source_mapping(self, outline_sections, research, task_id, user_id: str, competitive_advantage: str = ""):
        """Run source mapping in parallel."""
        if task_id:
            from api.blog_writer.task_manager import task_manager
            await task_manager.update_progress(task_id, "🔗 Applying intelligent source-to-section mapping...")
-        return self.source_mapper.map_sources_to_sections(outline_sections, research, user_id)
+        return self.source_mapper.map_sources_to_sections(outline_sections, research, user_id, competitive_advantage=competitive_advantage)
    async def _run_grounding_insights_extraction(self, research, task_id):
        """Run grounding insights extraction in parallel."""
@@ -110,10 +112,10 @@ class ParallelProcessor:
            await task_manager.update_progress(task_id, "🧠 Extracting grounding metadata insights...")
        return self.grounding_engine.extract_contextual_insights(research.grounding_metadata)
-    async def _run_source_mapping_async(self, outline_sections, research, user_id: str):
+    async def _run_source_mapping_async(self, outline_sections, research, user_id: str, competitive_advantage: str = ""):
        """Run source mapping in parallel (async version without progress updates)."""
        logger.info("Applying intelligent source-to-section mapping...")
-        return self.source_mapper.map_sources_to_sections(outline_sections, research, user_id)
+        return self.source_mapper.map_sources_to_sections(outline_sections, research, user_id, competitive_advantage=competitive_advantage)
    async def _run_grounding_insights_extraction_async(self, research):
        """Run grounding insights extraction in parallel (async version without progress updates)."""
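Reviewer note: the sketch below shows one way the new `competitive_advantage` argument can travel through two concurrently scheduled tasks. It is an illustration only: `map_sources`, `extract_insights`, and `run_parallel` are hypothetical stand-ins, and the use of `asyncio.gather` to await the tasks is an assumption, since the awaiting code is not part of this diff.

```python
import asyncio
from typing import Any, Dict, List, Tuple


async def map_sources(sections: List[str], competitive_advantage: str = "") -> List[str]:
    """Stand-in for source mapping; tags each section with the advantage, if any."""
    await asyncio.sleep(0)  # yield control so the other task can run
    suffix = f" (boost: {competitive_advantage})" if competitive_advantage else ""
    return [f"{section}{suffix}" for section in sections]


async def extract_insights() -> Dict[str, Any]:
    """Stand-in for grounding insight extraction."""
    await asyncio.sleep(0)
    return {"insights": []}


async def run_parallel(
    sections: List[str], competitive_advantage: str = ""
) -> Tuple[List[str], Dict[str, Any]]:
    # Schedule both coroutines, then wait for both results.
    mapping_task = asyncio.create_task(map_sources(sections, competitive_advantage))
    insights_task = asyncio.create_task(extract_insights())
    mapped, insights = await asyncio.gather(mapping_task, insights_task)
    return mapped, insights


result = asyncio.run(run_parallel(["Intro"], "lower cost"))
print(result)
```

One design point worth checking in the real code: an `async def` wrapper that only calls a synchronous function never yields to the event loop, so the two tasks would run one after the other rather than concurrently.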
--- a/backend/services/blog_writer/outline/prompt_builder.py
+++ b/backend/services/blog_writer/outline/prompt_builder.py
@@ -36,11 +36,88 @@ class PromptBuilder:
        competitor_text = ', '.join(research.competitor_analysis.get('top_competitors', [])) if research and research.competitor_analysis else "Not available"
        opportunity_text = ', '.join(research.competitor_analysis.get('opportunities', [])) if research and research.competitor_analysis else "Not available"
        advantages_text = ', '.join(research.competitor_analysis.get('competitive_advantages', [])) if research and research.competitor_analysis else "Not available"
        competitor_headings_text = ', '.join(research.competitor_analysis.get('competitor_headings', [])[:3]) if research and research.competitor_analysis and research.competitor_analysis.get('competitor_headings') else ""
        content_gaps_text = ', '.join(research.competitor_analysis.get('content_gaps', [])) if research and research.competitor_analysis and research.competitor_analysis.get('content_gaps') else ""
        industry_leaders_text = ', '.join(research.competitor_analysis.get('industry_leaders', [])) if research and research.competitor_analysis and research.competitor_analysis.get('industry_leaders') else ""
        # Extract additional UI-mapped context fields
        analysis_insights_text = (research.keyword_analysis.get('analysis_insights', '') or '') if research and research.keyword_analysis else ''
        market_positioning_text = (research.competitor_analysis.get('market_positioning', '') or '') if research and research.competitor_analysis else ''
        difficulty_score = research.keyword_analysis.get('difficulty', None) if research and research.keyword_analysis else None
        # Extract search queries as intent signals
        search_queries_text = ', '.join(research.search_queries) if research and hasattr(research, 'search_queries') and research.search_queries else ""
        # Build numbered source list — all sources with index, title, excerpt, and highlights
        # The LLM will reference these indices when assigning sources to sections
        source_list_text = ""
        if sources:
            source_lines = []
            for i, src in enumerate(sources, 1):
                title = getattr(src, 'title', '') or ''
                excerpt = getattr(src, 'excerpt', '') or ''
                highlights = getattr(src, 'highlights', []) or []
                summary = getattr(src, 'summary', '') or ''
                source_type = getattr(src, 'source_type', '') or ''
                author = getattr(src, 'author', '') or ''
                line = f"  [{i}] {title}"
                if source_type:
                    line += f" [{source_type}]"
                if author:
                    line += f" by {author}"
                if summary:
                    line += f" — {summary[:1000]}"
                elif excerpt:
                    line += f" — {excerpt[:1000]}"
                if highlights:
                    line += f" | Key findings: {'; '.join(h[:250] for h in highlights[:3])}"
                source_lines.append(line)
            if source_lines:
                source_list_text = "RESEARCH SOURCES (numbered for reference):\n" + "\n".join(source_lines)
        # Top factual excerpts for depth (keep as supplement)
        source_excerpts_text = ""
        if sources:
            sorted_sources = sorted(
                [s for s in sources if (s.excerpt or s.summary)],
                key=lambda s: s.credibility_score or 0.8, reverse=True
            )[:5]
            excerpts = []
            for i, src in enumerate(sorted_sources, 1):
                excerpt = src.excerpt or src.summary or ""
                if len(excerpt) > 500:
                    excerpt = excerpt[:497] + "..."
                excerpts.append(f"  {i}. \"{src.title}\" — {excerpt}")
            if excerpts:
                source_excerpts_text = "DETAILED FACTS FROM TOP SOURCES:\n" + "\n".join(excerpts)
        # Extract recency: newest source publication date
        newest_date_str = ""
        if sources:
            valid_dates = [s.published_at for s in sources if s.published_at]
            if valid_dates:
                try:
                    parsed = [d for d in valid_dates if d[:4].isdigit()]
                    if parsed:
                        sorted_dates = sorted(parsed, reverse=True)
                        newest_date_str = f"Most Recent Source: {sorted_dates[0]}"
                except Exception:
                    pass
        # Extract top grounding evidence snippets as verified data points
        grounding_evidence_text = ""
        if research and research.grounding_metadata and research.grounding_metadata.grounding_supports:
            supports = research.grounding_metadata.grounding_supports
            top_supports = [s for s in supports if s.segment_text and len(s.segment_text) > 20][:5]
            if top_supports:
                evidence_parts = []
                for i, s in enumerate(top_supports, 1):
                    text = s.segment_text[:400]
                    if len(s.segment_text) > 400:
                        text += "..."
                    evidence_parts.append(f"  {i}. {text}")
                grounding_evidence_text = "VERIFIED EVIDENCE (high-confidence snippets):\n" + "\n".join(evidence_parts)
        # Build selected angle prominence section
        if selected_content_angle and selected_content_angle.strip():
@@ -106,8 +183,17 @@ Top Competitors: {competitor_text}
 Market Opportunities: {opportunity_text}
 Competitive Advantages: {advantages_text}
 {f"Market Positioning: {market_positioning_text}" if market_positioning_text else ""}
 {f"Competitor Headings (AVOID duplicating): {competitor_headings_text}" if competitor_headings_text else ""}
 {f"Content Gaps (MUST address these gaps): {content_gaps_text}" if content_gaps_text else ""}
 {f"Industry Leaders: {industry_leaders_text}" if industry_leaders_text else ""}
 {f"Search Intent Signals: {search_queries_text}" if search_queries_text else ""}
-RESEARCH SOURCES: {len(sources)} authoritative sources available
+{source_list_text}
 {newest_date_str}
 {source_excerpts_text}
 {grounding_evidence_text}
 {f"CUSTOM INSTRUCTIONS: {custom_instructions}" if custom_instructions else ""}
@@ -118,8 +204,9 @@ STRATEGIC REQUIREMENTS:
 - Create SEO-optimized headings with natural keyword integration
 - Surface the strongest research-backed angles within the outline
 - Build logical narrative flow from problem to solution
- Include data-driven insights from research sources
+- Include data-driven insights from research sources — use the numbered sources above
- Address content gaps and market opportunities
+- For each section, assign the most relevant source indices using the [N] numbers above
 - Address content gaps and market opportunities — if content gaps are listed, dedicate sections to fill those gaps
 - Optimize for search intent and user questions
 - Ensure engaging, actionable content throughout
@@ -136,7 +223,8 @@ Return JSON format:
            "subheadings": ["Subheading 1", "Subheading 2", "Subheading 3"],
            "key_points": ["Key point 1", "Key point 2", "Key point 3"],
            "target_words": 300,
-            "keywords": ["keyword 1", "keyword 2"]
+            "keywords": ["keyword 1", "keyword 2"],
            "source_indices": [1, 3, 5]
        }}
    ]
 }}"""
@@ -170,9 +258,14 @@ Return JSON format:
                            "keywords": {
                                "type": "array",
                                "items": {"type": "string"}
                            },
                            "source_indices": {
                                "type": "array",
                                "items": {"type": "integer"},
                                "description": "Indices of research sources (from the numbered list above) that support this section"
                            }
                        },
-                        "required": ["heading", "subheadings", "key_points", "target_words", "keywords"]
+                        "required": ["heading", "subheadings", "key_points", "target_words", "keywords", "source_indices"]
                    }
                }
            },
--- a/backend/services/blog_writer/outline/response_processor.py
+++ b/backend/services/blog_writer/outline/response_processor.py
@@ -100,18 +100,37 @@ class ResponseProcessor:
                    raise ValueError(f"AI outline generation failed: {error_str}")
    def convert_to_sections(self, outline_data: Dict[str, Any], sources: List) -> List[BlogOutlineSection]:
-        """Convert outline data to BlogOutlineSection objects."""
+        """Convert outline data to BlogOutlineSection objects.
        If the LLM assigned source_indices to sections, populate references
        directly from those indices. Indices are 1-based (matching the [N]
        labels in the prompt) — converted to 0-based for list access.
        Sections without source_indices will be populated by the algorithmic
        source mapper in a later step.
        """
        outline_sections = []
        for i, section_data in enumerate(outline_data.get('outline', [])):
            if not isinstance(section_data, dict) or 'heading' not in section_data:
                continue
-                
+            
            # Parse LLM-assigned source indices (1-based)
            raw_indices = section_data.get('source_indices', [])
            section_refs = []
            if raw_indices and sources:
                for idx in raw_indices:
                    try:
                        source_idx = int(idx) - 1  # Convert 1-based → 0-based
                        if 0 <= source_idx < len(sources):
                            section_refs.append(sources[source_idx])
                    except (ValueError, TypeError):
                        pass
            section = BlogOutlineSection(
                id=f"s{i+1}",
                heading=section_data.get('heading', f'Section {i+1}'),
                subheadings=section_data.get('subheadings', []),
                key_points=section_data.get('key_points', []),
-                references=[],  # Will be populated by intelligent mapping
+                references=section_refs,  # LLM-assigned if provided, else []
                target_words=section_data.get('target_words', 200),
                keywords=section_data.get('keywords', [])
            )
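Reviewer note: the index handling in `convert_to_sections` can be exercised in isolation. The sketch below restates it as a hypothetical helper, `resolve_source_indices`, assuming plain strings in place of `ResearchSource` objects.

```python
from typing import Any, List


def resolve_source_indices(raw_indices: Any, sources: List[Any]) -> List[Any]:
    """Map 1-based indices from the LLM response to entries of `sources`."""
    refs: List[Any] = []
    if raw_indices and sources:
        for idx in raw_indices:
            try:
                position = int(idx) - 1  # convert 1-based to 0-based
            except (ValueError, TypeError):
                continue  # skip values that are not integer-like
            if 0 <= position < len(sources):
                refs.append(sources[position])
    return refs


print(resolve_source_indices([1, 3, "2", 0, 99, None, "x"], ["a", "b", "c"]))
```

Out-of-range values (`0`, `99`) and non-numeric values (`None`, `"x"`) are dropped silently, and numeric strings are accepted. Duplicate indices are not removed, so a source listed twice would appear twice in the result.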
--- a/backend/services/blog_writer/outline/source_mapper.py
+++ b/backend/services/blog_writer/outline/source_mapper.py
@@ -41,10 +41,33 @@ class SourceToSectionMapper:
            'the', 'a', 'an', 'and', 'or', 'but', 'in', 'on', 'at', 'to', 'for', 'of', 'with', 'by',
            'is', 'are', 'was', 'were', 'be', 'been', 'being', 'have', 'has', 'had', 'do', 'does', 'did',
            'will', 'would', 'could', 'should', 'may', 'might', 'must', 'can', 'this', 'that', 'these', 'those',
-            'how', 'what', 'when', 'where', 'why', 'who', 'which', 'how', 'much', 'many', 'more', 'most',
+            'how', 'what', 'when', 'where', 'why', 'who', 'which', 'much', 'many', 'more', 'most',
            'some', 'any', 'all', 'each', 'every', 'other', 'another', 'such', 'no', 'not', 'only', 'own',
            'same', 'so', 'than', 'too', 'very', 'just', 'now', 'here', 'there', 'up', 'down', 'out', 'off',
-            'over', 'under', 'again', 'further', 'then', 'once'
+            'over', 'under', 'again', 'further', 'then', 'once', 'also', 'into', 'about', 'between',
            'through', 'during', 'before', 'after', 'above', 'below', 'from', 'since', 'until', 'while',
            'because', 'however', 'therefore', 'thus', 'hence', 'yet', 'still', 'already', 'even'
        }
        # Common abbreviation/synonym pairs for fuzzy matching
        self._synonym_map = {
            'ai': ['artificial intelligence', 'machine intelligence'],
            'ml': ['machine learning'],
            'dl': ['deep learning'],
            'nlp': ['natural language processing'],
            'iot': ['internet of things'],
            'saas': ['software as a service'],
            'b2b': ['business to business'],
            'b2c': ['business to consumer'],
            'cx': ['customer experience'],
            'ux': ['user experience'],
            'roi': ['return on investment'],
            'kpi': ['key performance indicator'],
            'crm': ['customer relationship management'],
            'erp': ['enterprise resource planning'],
            'seo': ['search engine optimization'],
            'cto': ['chief technology officer'],
            'vp': ['vice president'],
        }
        logger.info("✅ SourceToSectionMapper initialized with intelligent mapping algorithms")
@@ -53,15 +76,21 @@ class SourceToSectionMapper:
        self, 
        sections: List[BlogOutlineSection], 
        research_data: BlogResearchResponse,
-        user_id: str
+        user_id: str,
        competitive_advantage: str = ""
    ) -> List[BlogOutlineSection]:
        """
        Map research sources to outline sections using intelligent algorithms.
        Sections that already have LLM-assigned references (from source_indices
        in the outline prompt) are preserved. Algorithmic mapping fills gaps
        for sections without LLM-assigned sources.
        Args:
            sections: List of outline sections to map sources to
            research_data: Research data containing sources and metadata
            user_id: User ID (required for subscription checks and usage tracking)
            competitive_advantage: Selected competitive advantage to preferentially match
        Returns:
            List of outline sections with intelligently mapped sources
@@ -76,16 +105,39 @@ class SourceToSectionMapper:
            logger.warning("No sections or sources to map")
            return sections
-        logger.info(f"Mapping {len(research_data.sources)} sources to {len(sections)} sections")
+        # Separate sections with LLM-assigned references from those without
        sections_with_refs = [s for s in sections if s.references]
        sections_without_refs = [s for s in sections if not s.references]
-        # Step 1: Algorithmic mapping
+        logger.info(
-        mapping_results = self._algorithmic_source_mapping(sections, research_data)
+            f"Mapping {len(research_data.sources)} sources to {len(sections)} sections "
            f"({len(sections_with_refs)} with LLM-assigned references, "
            f"{len(sections_without_refs)} need algorithmic mapping)"
        )
-        # Step 2: AI validation and improvement (single prompt, user_id required for subscription checks)
+        if sections_without_refs:
-        validated_mapping = self._ai_validate_mapping(mapping_results, research_data, user_id)
+            # Step 1: Algorithmic mapping for sections without LLM-assigned references
            mapping_results = self._algorithmic_source_mapping(sections_without_refs, research_data, competitive_advantage)
            # Step 2: AI validation and improvement
            validated_mapping = self._ai_validate_mapping(mapping_results, research_data, user_id)
            # Step 3: Apply mapping only to sections that need it
            mapped_sections_with = self._apply_mapping_to_sections(sections_without_refs, validated_mapping)
        else:
            mapped_sections_with = []
-        # Step 3: Apply validated mapping to sections
+        # Combine: keep LLM-assigned sections as-is, add algorithmically mapped ones
-        mapped_sections = self._apply_mapping_to_sections(sections, validated_mapping)
+        mapped_sections = list(sections_with_refs) + mapped_sections_with
        # Preserve original ordering
        original_ids = [s.id for s in sections]
        mapped_sections.sort(key=lambda s: original_ids.index(s.id) if s.id in original_ids else 999)
        # Warn if any section still has zero references
        for s in mapped_sections:
            if not s.references:
                logger.warning(f"Section '{s.heading}' (id={s.id}) has ZERO sources — content generator will use keyword-based fallback")
        logger.info("✅ Source-to-section mapping completed successfully")
        return mapped_sections
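The merge-and-reorder step in this hunk can be sketched on its own. This is a minimal illustration, not the project's code: `Section` is a hypothetical stand-in for `BlogOutlineSection`, and `merge_preserving_order` is an assumed helper name. It uses a position dictionary instead of `list.index`, so each lookup is constant time and unknown ids sort last without a magic number.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Section:
    """Hypothetical stand-in for BlogOutlineSection (only the fields used here)."""
    id: str
    references: List[str] = field(default_factory=list)


def merge_preserving_order(
    original: List[Section],
    kept: List[Section],
    remapped: List[Section],
) -> List[Section]:
    """Combine two partitions of `original` and restore the original ordering."""
    # Map each section id to its position in the original outline.
    position: Dict[str, int] = {s.id: i for i, s in enumerate(original)}
    merged = list(kept) + list(remapped)
    # Sections with unknown ids sort after every known section.
    merged.sort(key=lambda s: position.get(s.id, len(position)))
    return merged


outline = [Section("s1"), Section("s2", ["src-a"]), Section("s3")]
kept = [outline[1]]                                          # already has references
remapped = [Section("s3", ["src-b"]), Section("s1", ["src-c"])]  # mapped algorithmically
print([s.id for s in merge_preserving_order(outline, kept, remapped)])
```

Because Python's sort is stable, sections that share the fallback position keep their relative order.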
@@ -93,7 +145,8 @@ class SourceToSectionMapper:
    def _algorithmic_source_mapping(
        self, 
        sections: List[BlogOutlineSection], 
-        research_data: BlogResearchResponse
+        research_data: BlogResearchResponse,
        competitive_advantage: str = ""
    ) -> Dict[str, List[Tuple[ResearchSource, float]]]:
        """
        Perform algorithmic mapping of sources to sections.
@@ -101,6 +154,7 @@ class SourceToSectionMapper:
        Args:
            sections: List of outline sections
            research_data: Research data with sources
            competitive_advantage: Selected competitive advantage to boost matching
        Returns:
            Dictionary mapping section IDs to list of (source, score) tuples
@@ -114,7 +168,7 @@ class SourceToSectionMapper:
                # Calculate multi-dimensional relevance score
                semantic_score = self._calculate_semantic_similarity(section, source)
                keyword_score = self._calculate_keyword_relevance(section, source, research_data)
-                contextual_score = self._calculate_contextual_relevance(section, source, research_data)
+                contextual_score = self._calculate_contextual_relevance(section, source, research_data, competitive_advantage)
                # Weighted total score
                total_score = (
@@ -140,38 +194,54 @@ class SourceToSectionMapper:
    def _calculate_semantic_similarity(self, section: BlogOutlineSection, source: ResearchSource) -> float:
        """
        Calculate semantic similarity between section and source.
-        
+        Uses word overlap, stem matching, bigram overlap, title-boost, and synonym expansion.
        Args:
            section: Outline section
            source: Research source
        Returns:
            Semantic similarity score (0.0 to 1.0)
        """
        # Extract text content for comparison
        section_text = self._extract_section_text(section)
        source_text = self._extract_source_text(source)
        # Calculate word overlap
        section_words = self._extract_meaningful_words(section_text)
        source_words = self._extract_meaningful_words(source_text)
        if not section_words or not source_words:
            return 0.0
-        # Calculate Jaccard similarity
-        intersection = len(set(section_words) & set(source_words))
-        union = len(set(section_words) | set(source_words))
-        jaccard_similarity = intersection / union if union > 0 else 0.0
-        # Boost score for exact phrase matches
-        phrase_boost = self._calculate_phrase_similarity(section_text, source_text)
-        # Combine Jaccard similarity with phrase boost
-        semantic_score = min(1.0, jaccard_similarity + phrase_boost)
-        return semantic_score
+        section_set = set(section_words)
+        source_set = set(source_words)
+        # 1. Jaccard similarity on raw words
+        intersection = len(section_set & source_set)
+        union = len(section_set | source_set)
+        jaccard = intersection / union if union > 0 else 0.0
+        # 2. Stem matching — catches word variants (e.g., "optimizing" vs "optimized")
+        section_stems = set(self._stem_word(w) for w in section_words)
+        source_stems = set(self._stem_word(w) for w in source_words)
+        stem_intersection = len(section_stems & source_stems)
+        stem_union = len(section_stems | source_stems)
+        stem_similarity = stem_intersection / stem_union if stem_union > 0 else 0.0
+        # 3. Bigram overlap — catches multi-word concepts (e.g., "machine learning")
+        section_bigrams = set(self._extract_bigrams(section_text))
+        source_bigrams = set(self._extract_bigrams(source_text))
+        bigram_overlap = len(section_bigrams & source_bigrams)
+        bigram_score = min(0.3, bigram_overlap * 0.1) if (section_bigrams or source_bigrams) else 0.0
+        # 4. Title-boost — section heading matching source title is a strong signal
+        heading = (section.heading or '').lower()
+        source_title = (source.title or '').lower()
+        heading_words = set(self._extract_meaningful_words(heading))
+        title_words = set(self._extract_meaningful_words(source_title))
+        title_overlap = len(heading_words & title_words) / len(heading_words | title_words) if (heading_words or title_words) else 0.0
+        title_boost = min(0.3, title_overlap * 0.5)
+        # 5. Synonym expansion — expand abbreviations and match across synonym pairs
+        synonym_score = self._calculate_synonym_overlap(section_words, source_words)
+        # Combine: Jaccard + stem give base, bigram + title + synonyms boost
+        base_similarity = max(jaccard, stem_similarity)
+        combined = min(1.0, base_similarity + bigram_score + title_boost + synonym_score)
+        return combined
    def _calculate_keyword_relevance(
        self, 
@@ -219,7 +289,8 @@ class SourceToSectionMapper:
        self, 
        section: BlogOutlineSection, 
        source: ResearchSource, 
-        research_data: BlogResearchResponse
+        research_data: BlogResearchResponse,
        competitive_advantage: str = ""
    ) -> float:
        """
        Calculate contextual relevance based on section content and source context.
@@ -228,6 +299,7 @@ class SourceToSectionMapper:
            section: Outline section
            source: Research source
            research_data: Research data with context
            competitive_advantage: Selected competitive advantage to boost matching
        Returns:
            Contextual relevance score (0.0 to 1.0)
@@ -264,6 +336,15 @@ class SourceToSectionMapper:
            industry_score = sum(1 for word in industry_words if word in source_text) / len(industry_words) if industry_words else 0.0
            contextual_score += industry_score * 0.2
        # 4. Competitive advantage boost — sources that match the advantage get a score lift
        if competitive_advantage:
            advantage_words = set(self._extract_meaningful_words(competitive_advantage.lower()))
            if advantage_words:
                advantage_in_section = sum(1 for w in advantage_words if w in section_text) / len(advantage_words)
                advantage_in_source = sum(1 for w in advantage_words if w in source_text) / len(advantage_words)
                if advantage_in_section > 0.3 and advantage_in_source > 0.3:
                    contextual_score += 0.25 * (advantage_in_section + advantage_in_source)
        return min(1.0, contextual_score)
    def _ai_validate_mapping(
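The competitive-advantage boost in this hunk can be read in isolation as follows. This is a rough sketch under stated assumptions: `advantage_boost` is a hypothetical name, and word extraction is reduced to a lowercase split with a length filter, whereas the diff uses `_extract_meaningful_words`. As in the diff, membership is a substring test against the full text, and the boost applies only when more than 30% of the advantage words appear in both the section and the source.

```python
def advantage_boost(advantage: str, section_text: str, source_text: str) -> float:
    """Return the extra contextual score contributed by a competitive advantage."""
    # Simplified word extraction (assumption): lowercase tokens longer than 2 chars.
    words = {w for w in advantage.lower().split() if len(w) > 2}
    if not words:
        return 0.0
    section_text = section_text.lower()
    source_text = source_text.lower()
    # Fraction of advantage words found in each text (substring match).
    in_section = sum(1 for w in words if w in section_text) / len(words)
    in_source = sum(1 for w in words if w in source_text) / len(words)
    if in_section > 0.3 and in_source > 0.3:
        return 0.25 * (in_section + in_source)
    return 0.0


boost = advantage_boost(
    "fast local deployment",
    "Fast local deployment options",
    "Guide to local deployment",
)
print(round(boost, 3))
```

With both fractions at 1.0 the boost reaches 0.5, so on its own it can contribute up to half of the capped contextual score.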
@@ -360,10 +441,15 @@ class SourceToSectionMapper:
        return " ".join(text_parts)
    def _extract_source_text(self, source: ResearchSource) -> str:
-        """Extract all text content from a source."""
+        """Extract all text content from a source, including full text for better matching."""
        text_parts = [source.title]
        if source.summary:
            text_parts.append(source.summary)
        if source.excerpt:
            text_parts.append(source.excerpt)
        content = getattr(source, 'content', '') or ''
        if content:
            text_parts.append(content[:500])
        return " ".join(text_parts)
    def _extract_meaningful_words(self, text: str) -> List[str]:
@@ -382,6 +468,41 @@ class SourceToSectionMapper:
        return meaningful_words
    def _stem_word(self, word: str) -> str:
        """Rudimentary suffix-stripping stemmer for English words."""
        if len(word) <= 3:
            return word
        for suffix in ['ization', 'ation', 'tion', 'sion', 'ment', 'ness', 'ity', 'ing', 'able', 'ible', 'ful', 'less', 'ous', 'ive', 'ally', 'ly', 'er', 'ed', 'es', 's']:
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                return word[:-len(suffix)]
        return word
    def _extract_bigrams(self, text: str) -> List[str]:
        """Extract meaningful two-word phrases from text."""
        words = self._extract_meaningful_words(text)
        if len(words) < 2:
            return []
        return [f"{words[i]} {words[i+1]}" for i in range(len(words) - 1)]
    def _calculate_synonym_overlap(self, section_words: List[str], source_words: List[str]) -> float:
        """Score overlap via abbreviation/synonym expansion."""
        section_set = set(section_words)
        source_set = set(source_words)
        extra_matches = 0
        total_terms = len(section_set | source_set) or 1
        for abbr, expansions in self._synonym_map.items():
            abbr_in_section = abbr in section_set
            abbr_in_source = abbr in source_set
            for expansion in expansions:
                exp_words = set(expansion.split())
                exp_in_section = exp_words.issubset(section_set)
                exp_in_source = exp_words.issubset(source_set)
                if (abbr_in_section and exp_in_source) or (abbr_in_source and exp_in_section):
                    extra_matches += 1
        return min(0.2, extra_matches * 0.05)
    def _calculate_phrase_similarity(self, text1: str, text2: str) -> float:
        """Calculate phrase similarity boost score."""
        if not text1 or not text2:
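To see how the new similarity signals interact, here is a reduced, standalone sketch of the Jaccard, stem, and bigram parts. The stop-word list and the `meaningful_words` tokenizer are assumptions made for illustration; the suffix list is copied from `_stem_word` in the diff. The title boost and synonym expansion are left out.

```python
import re
from typing import List, Set

# Assumed stop-word list; the project's own list is not shown in this diff.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "for", "on", "with", "is", "are"}
SUFFIXES = ["ization", "ation", "tion", "sion", "ment", "ness", "ity", "ing", "able",
            "ible", "ful", "less", "ous", "ive", "ally", "ly", "er", "ed", "es", "s"]


def meaningful_words(text: str) -> List[str]:
    """Lowercase tokens longer than two characters that are not stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 2 and w not in STOPWORDS]


def stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    if len(word) <= 3:
        return word
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def bigrams(words: List[str]) -> Set[str]:
    return {f"{words[i]} {words[i + 1]}" for i in range(len(words) - 1)}


def similarity(section_text: str, source_text: str) -> float:
    s_words = meaningful_words(section_text)
    r_words = meaningful_words(source_text)
    if not s_words or not r_words:
        return 0.0
    raw = jaccard(set(s_words), set(r_words))
    stemmed = jaccard({stem(w) for w in s_words}, {stem(w) for w in r_words})
    # Each shared bigram adds 0.1, capped at 0.3, as in the diff.
    bigram_score = min(0.3, 0.1 * len(bigrams(s_words) & bigrams(r_words)))
    return min(1.0, max(raw, stemmed) + bigram_score)


print(similarity("optimizing search ranking", "optimized search results"))
```

One thing the sketch makes visible: with this suffix list, "running" reduces to "runn" while "runs" reduces to "run", so that pair does not match after stemming, whereas "optimizing" and "optimized" both reduce to "optimiz".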
--- a/backend/services/blog_writer/outline/title_generator.py
+++ b/backend/services/blog_writer/outline/title_generator.py
@@ -54,58 +54,58 @@ class TitleGenerator:
        Returns:
            Formatted title string
        """
-        if not angle or len(angle.strip()) < 10:  # Too short to be a good title
+        if not angle or len(angle.strip()) < 10:
            return ""
        # Clean up the angle
        cleaned_angle = angle.strip()
-        # Capitalize first letter of each sentence and proper nouns
-        sentences = cleaned_angle.split('. ')
-        formatted_sentences = []
-        for sentence in sentences:
-            if sentence.strip():
-                # Use title case for better formatting
-                formatted_sentence = sentence.strip().title()
-                formatted_sentences.append(formatted_sentence)
-        formatted_title = '. '.join(formatted_sentences)
-        # Ensure it ends with proper punctuation
-        if not formatted_title.endswith(('.', '!', '?')):
-            formatted_title += '.'
+        # Use sentence case: capitalize first letter, rest as-is
+        if cleaned_angle:
+            cleaned_angle = cleaned_angle[0].upper() + cleaned_angle[1:]
        # Limit length to reasonable blog title size
-        if len(formatted_title) > 200:
+        if len(cleaned_angle) > 120:
-            formatted_title = formatted_title[:197] + "..."
+            cleaned_angle = cleaned_angle[:117] + "..."
-        return formatted_title
+        return cleaned_angle
-    def combine_title_options(self, ai_titles: List[str], content_angle_titles: List[str], primary_keywords: List[str]) -> List[str]:
+    def combine_title_options(self, ai_titles: List[str], content_angle_titles: List[str], primary_keywords: List[str], research_topic: str = "") -> List[str]:
        """
        Combine AI-generated titles with content angle titles, ensuring variety and quality.
        AI titles (proper SEO titles generated by LLM) take priority.
        Content angle titles (long-format descriptions) are used as fallback.
        The research topic is the last resort when nothing else exists.
        Args:
-            ai_titles: AI-generated title options
+            ai_titles: AI-generated title options (proper blog titles, 50-65 chars)
-            content_angle_titles: Titles derived from content angles
+            content_angle_titles: Titles derived from content angles (longer, descriptive)
            primary_keywords: Primary keywords for fallback generation
            research_topic: Original user research topic as ultimate fallback
        Returns:
            Combined list of title options (max 6 total)
        """
        all_titles = []
-        # Add content angle titles first (these are research-based and valuable)
-        for title in content_angle_titles[:3]:  # Limit to top 3 content angles
+        # 1. AI-generated titles first (proper SEO titles from LLM)
+        for title in ai_titles:
             if title and title not in all_titles:
                 all_titles.append(title)
-        # Add AI-generated titles
-        for title in ai_titles:
-            if title and title not in all_titles:
-                all_titles.append(title)
-        # Note: Removed fallback titles as requested - only use research and AI-generated titles
+        # 2. Content angle titles as fallback (research-based, but verbose)
        for title in content_angle_titles[:3]:
            if title and title not in all_titles:
                all_titles.append(title)
        # 3. Research topic as last resort when nothing was generated
        if not all_titles and research_topic:
            all_titles.append(research_topic)
        # 4. Primary keyword fallback as absolute last resort
        if not all_titles and primary_keywords:
            kw = primary_keywords[0]
            all_titles.append(kw)
        # Limit to 6 titles maximum for UI usability
        final_titles = all_titles[:6]
@@ -115,9 +115,10 @@ class TitleGenerator:
    def generate_fallback_titles(self, primary_keywords: List[str]) -> List[str]:
        """Generate fallback titles when AI generation fails."""
        from datetime import datetime
        primary_keyword = primary_keywords[0] if primary_keywords else "Topic"
        return [
            f"The Complete Guide to {primary_keyword}",
            f"{primary_keyword}: Everything You Need to Know",
-            f"How to Master {primary_keyword} in 2024"
+            f"How to Master {primary_keyword} in {datetime.now().year}"
        ]
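The priority order described in the `combine_title_options` docstring (AI titles, then content-angle titles, then the research topic, then the first primary keyword) can be sketched as a standalone function. This is an illustration of that ordering under the assumption that duplicates are dropped by exact string match; the `limit` parameter and the `add_unique` helper are names introduced here.

```python
from typing import List


def combine_title_options(
    ai_titles: List[str],
    content_angle_titles: List[str],
    primary_keywords: List[str],
    research_topic: str = "",
    limit: int = 6,
) -> List[str]:
    """Merge title candidates in priority order, skipping blanks and duplicates."""
    combined: List[str] = []

    def add_unique(candidates: List[str]) -> None:
        for title in candidates:
            if title and title not in combined:
                combined.append(title)

    add_unique(ai_titles)                  # 1. AI-generated titles first
    add_unique(content_angle_titles[:3])   # 2. top three content angles as fallback
    if not combined and research_topic:    # 3. research topic when nothing exists
        combined.append(research_topic)
    if not combined and primary_keywords:  # 4. first primary keyword as last resort
        combined.append(primary_keywords[0])
    return combined[:limit]


print(combine_title_options(["A", "B"], ["B", "C"], ["kw"], "topic"))
print(combine_title_options([], [], ["kw"], "topic"))
```

The last two fallbacks only fire when the list is still empty, so a research topic never appears alongside generated titles.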
--- a/backend/services/blog_writer/research/competitor_analyzer.py
+++ b/backend/services/blog_writer/research/competitor_analyzer.py
@@ -18,7 +18,7 @@ class CompetitorAnalyzer:
        Analyze the following research content and extract competitor insights:
        Research Content:
-        {content[:3000]}
+        {content[:8000]}
        Extract and analyze:
        1. Top competitors mentioned (companies, brands, platforms)
--- a/backend/services/blog_writer/research/content_angle_generator.py
+++ b/backend/services/blog_writer/research/content_angle_generator.py
@@ -17,7 +17,7 @@ class ContentAngleGenerator:
        Analyze the following research content and create strategic content angles for: {topic} in {industry}
        Research Content:
-        {content[:3000]}
+        {content[:8000]}
        Create 7 compelling content angles that:
        1. Leverage current trends and data from the research
--- a/backend/services/blog_writer/research/data_filter.py
+++ b/backend/services/blog_writer/research/data_filter.py
@@ -432,7 +432,7 @@ class ResearchDataFilter:
            'how to', 'guide', 'tutorial', 'steps', 'process', 'method',
            'best practices', 'tips', 'strategies', 'techniques', 'approach',
            'comparison', 'vs', 'versus', 'difference', 'pros and cons',
-            'trends', 'future', '2024', '2025', 'emerging', 'new'
+            'trends', 'future', str(datetime.now().year), str(datetime.now().year + 1), 'emerging', 'new'
        ]
        for indicator in actionable_indicators:
--- a/backend/services/blog_writer/research/exa_provider.py
+++ b/backend/services/blog_writer/research/exa_provider.py
@@ -7,6 +7,8 @@ Neural search implementation using Exa API for high-quality, citation-rich resea
 from exa_py import Exa
 import os
 import asyncio
 from datetime import datetime
 from urllib.parse import urlparse
 from typing import List, Dict, Any
 from loguru import logger
 from models.subscription_models import APIProvider
@@ -355,6 +357,125 @@ class ExaResearchProvider(BaseProvider):
        return None
    def _calculate_credibility_score(self, result) -> float:
        """Dynamic credibility score based on domain authority, recency, and content substance."""
        scores = []
        weights = []
        # Domain authority (weight: 3) — most important signal
        url = result.url if hasattr(result, 'url') else ''
        domain_score = self._score_domain_authority(url)
        scores.append(domain_score)
        weights.append(3)
        # Recency (weight: 2) — fresher content is more valuable
        recency_score = self._score_recency(result)
        scores.append(recency_score)
        weights.append(2)
        # Content substance (weight: 2) — richer content = more substantive source
        substance_score = self._score_substance(result)
        scores.append(substance_score)
        weights.append(2)
        # Exa relevance score (weight: 2) — Exa's own relevance ranking
        exa_score = 0.5
        if hasattr(result, 'score') and result.score is not None:
            exa_score = float(result.score)
        scores.append(exa_score)
        weights.append(2)
        total = sum(s * w for s, w in zip(scores, weights))
        total_weight = sum(weights)
        return round(total / total_weight, 3)
    @staticmethod
    def _score_domain_authority(url: str) -> float:
        if not url:
            return 0.5
        try:
            domain = urlparse(url).netloc.lower()
        except Exception:
            return 0.5
        if domain.startswith('www.'):
            domain = domain[4:]
        # Tier 1: Government, educational, major research
        if domain.endswith('.gov') or domain.endswith('.edu'):
            return 0.95
        if domain in ('arxiv.org', 'pubmed.ncbi.nlm.nih.gov', 'ncbi.nlm.nih.gov',
                      'scholar.google.com', 'researchgate.net', 'sciencedaily.com',
                      'nature.com', 'science.org', 'pnas.org'):
            return 0.92
        # Tier 2: Major established news and professional publications
        tier2 = {
            'reuters.com', 'apnews.com', 'bbc.com', 'bbc.co.uk', 'npr.org',
            'wsj.com', 'nytimes.com', 'economist.com', 'bloomberg.com',
            'theguardian.com', 'ft.com', 'washingtonpost.com',
            'forbes.com', 'hbr.org', 'techcrunch.com', 'wired.com',
            'cnn.com', 'nbcnews.com', 'cbsnews.com', 'abcnews.go.com',
        }
        # Extract base domain
        parts = domain.split('.')
        base = '.'.join(parts[-2:]) if len(parts) >= 2 else domain
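        # Check the full domain as well: entries such as 'bbc.co.uk' and
        # 'abcnews.go.com' never equal the two-label base ('co.uk', 'go.com')
        if domain in tier2 or base in tier2: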
            return 0.88
        # Tier 3: Industry research and established .org
        tier3 = {
            'statista.com', 'pewresearch.org', 'gartner.com', 'mckinsey.com',
            'deloitte.com', 'pwc.com', 'ey.com', 'kpmg.com',
            'hubspot.com', 'moz.com', 'searchengineland.com',
            'neilpatel.com', 'backlinko.com', 'copyblogger.com',
        }
        if base in tier3:
            return 0.80
        if domain.endswith('.org'):
            return 0.75
        return 0.60
    def _score_recency(self, result) -> float:
        if not hasattr(result, 'publishedDate') or not result.publishedDate:
            return 0.70
        try:
            published = datetime.strptime(result.publishedDate[:10], '%Y-%m-%d')
            days_old = (datetime.now() - published).days
            if days_old < 30:
                return 1.0
            elif days_old < 180:
                return 0.90
            elif days_old < 365:
                return 0.80
            elif days_old < 730:
                return 0.65
            elif days_old < 1825:
                return 0.45
            else:
                return 0.25
        except Exception:
            return 0.70
    def _score_substance(self, result) -> float:
        total_chars = 0
        if hasattr(result, 'highlights') and result.highlights:
            total_chars += sum(len(h or '') for h in result.highlights)
        if hasattr(result, 'summary') and result.summary:
            total_chars += len(result.summary)
        if hasattr(result, 'text') and result.text:
            total_chars += len(result.text)
        if total_chars > 2000:
            return 0.95
        elif total_chars > 1000:
            return 0.85
        elif total_chars > 500:
            return 0.75
        elif total_chars > 100:
            return 0.60
        return 0.40
    def _transform_sources(self, results):
        """Transform Exa results to ResearchSource format."""
        sources = []
@@ -368,7 +489,7 @@ class ExaResearchProvider(BaseProvider):
                'title': result.title if hasattr(result, 'title') else '',
                'url': result.url if hasattr(result, 'url') else '',
                'excerpt': self._get_excerpt(result),
-                'credibility_score': 0.85,  # Exa results are high quality
+                'credibility_score': self._calculate_credibility_score(result),
                'published_at': result.publishedDate if hasattr(result, 'publishedDate') else None,
                'index': idx,
                'source_type': source_type,
@@ -388,7 +509,7 @@ class ExaResearchProvider(BaseProvider):
        if hasattr(result, 'summary') and result.summary:
            return result.summary
        if hasattr(result, 'text') and result.text:
-            return result.text[:500]
+            return result.text[:1000]
        return ''
    def _determine_source_type(self, url):
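The credibility score added in this file is a weighted average of four sub-scores with weights 3, 2, 2, 2. A minimal sketch of the recency tiers and the weighting follows; the function names and the injectable `now` argument are assumptions added so the behaviour can be checked deterministically, and the thresholds are copied from `_score_recency`.

```python
from datetime import datetime
from typing import Optional


def recency_score(published_date: Optional[str], now: Optional[datetime] = None) -> float:
    """Map a 'YYYY-MM-DD...' date string to a recency tier; 0.70 when unknown."""
    if not published_date:
        return 0.70
    now = now or datetime.now()
    try:
        published = datetime.strptime(published_date[:10], "%Y-%m-%d")
    except ValueError:
        return 0.70
    days_old = (now - published).days
    # (upper bound in days, score) pairs, checked in ascending order.
    for limit, score in ((30, 1.0), (180, 0.90), (365, 0.80), (730, 0.65), (1825, 0.45)):
        if days_old < limit:
            return score
    return 0.25


def weighted_credibility(domain: float, recency: float, substance: float, relevance: float) -> float:
    """Weighted mean with domain authority counted 3x and the rest 2x each."""
    scores = (domain, recency, substance, relevance)
    weights = (3, 2, 2, 2)
    return round(sum(s * w for s, w in zip(scores, weights)) / sum(weights), 3)


fixed_now = datetime(2024, 2, 1)
print(recency_score("2024-01-15T00:00:00Z", now=fixed_now))
print(weighted_credibility(0.95, 1.0, 0.95, 0.5))
```

The diff casts `result.score` to `float` without clamping; if that value can fall outside 0 to 1, the average can as well, so clamping it first may be worth considering.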
--- a/backend/services/blog_writer/research/keyword_analyzer.py
+++ b/backend/services/blog_writer/research/keyword_analyzer.py
@@ -19,7 +19,7 @@ class KeywordAnalyzer:
        Analyze the following research content and extract comprehensive keyword insights for: {', '.join(original_keywords)}
        Research Content:
-        {content[:3000]}  # Limit to avoid token limits
+        {content[:8000]}
        Extract and analyze:
        1. Primary keywords (main topic terms)
--- a/backend/services/blog_writer/research/research_service.py
+++ b/backend/services/blog_writer/research/research_service.py
@@ -250,10 +250,32 @@ class ResearchService:
            if 'content' not in locals() or 'sources' not in locals():
                raise RuntimeError(f"{config.provider.value} research did not return content or sources. Research failed.")
            # Build compact all-source summary for richer analysis
            analysis_content = self._build_analysis_content(sources)
            # Run dedicated competitor search for richer competitor intelligence
            competitor_content = analysis_content
            try:
                comp_query = f"top {industry} companies or competitors {topic}"
                comp_results = await exa_provider.simple_search(
                    query=comp_query, num_results=5, user_id=user_id,
                )
                if comp_results:
                    comp_lines = ["COMPETITOR SEARCH RESULTS:"]
                    for r in comp_results:
                        title = r.get('title', '')
                        text = (r.get('text', '') or '')[:400]
                        comp_lines.append(f"- {title}")
                        if text:
                            comp_lines.append(f"  {text[:200]}")
                    competitor_content = "\n".join(comp_lines) + "\n\n" + analysis_content
            except Exception as e:
                logger.warning(f"Competitor search failed (non-critical): {e}")
            # Continue with common analysis (same for both providers)
-            keyword_analysis = self.keyword_analyzer.analyze(content, request.keywords, user_id=user_id)
+            keyword_analysis = self.keyword_analyzer.analyze(analysis_content, request.keywords, user_id=user_id)
-            competitor_analysis = self.competitor_analyzer.analyze(content, user_id=user_id)
+            competitor_analysis = self.competitor_analyzer.analyze(competitor_content, user_id=user_id)
-            suggested_angles = self.content_angle_generator.generate(content, topic, industry, user_id=user_id)
+            suggested_angles = self.content_angle_generator.generate(analysis_content, topic, industry, user_id=user_id)
            logger.info(f"Research completed successfully with {len(sources)} sources and {len(search_queries)} search queries")
@@ -586,9 +608,30 @@ class ResearchService:
            # Continue with common analysis (same for both providers)
            await task_manager.update_progress(task_id, "🔍 Analyzing keywords and content angles...")
-            keyword_analysis = self.keyword_analyzer.analyze(content, request.keywords, user_id=user_id)
-            competitor_analysis = self.competitor_analyzer.analyze(content, user_id=user_id)
-            suggested_angles = self.content_angle_generator.generate(content, topic, industry, user_id=user_id)
+            analysis_content = self._build_analysis_content(sources)
+
+            # Run dedicated competitor search for richer competitor intelligence
            competitor_content = analysis_content
            try:
                comp_query = f"top {industry} companies or competitors {topic}"
                comp_results = await exa_provider.simple_search(
                    query=comp_query, num_results=5, user_id=user_id,
                )
                if comp_results:
                    comp_lines = ["COMPETITOR SEARCH RESULTS:"]
                    for r in comp_results:
                        title = r.get('title', '')
                        text = (r.get('text', '') or '')[:400]
                        comp_lines.append(f"- {title}")
                        if text:
                            comp_lines.append(f"  {text[:200]}")
                    competitor_content = "\n".join(comp_lines) + "\n\n" + analysis_content
            except Exception as e:
                logger.warning(f"Competitor search failed (non-critical): {e}")
            keyword_analysis = self.keyword_analyzer.analyze(analysis_content, request.keywords, user_id=user_id)
            competitor_analysis = self.competitor_analyzer.analyze(competitor_content, user_id=user_id)
            suggested_angles = self.content_angle_generator.generate(analysis_content, topic, industry, user_id=user_id)
            await task_manager.update_progress(task_id, "💾 Caching results for future use...")
            logger.info(f"Research completed successfully with {len(sources)} sources and {len(search_queries)} search queries")
@@ -720,7 +763,7 @@ class ResearchService:
                url=src.get("url", ""),
                excerpt=src.get("content", "")[:500] if src.get("content") else f"Source from {src.get('title', 'web')}",
                credibility_score=float(src.get("credibility_score", 0.8)),
-                published_at=str(src.get("publication_date", "2024-01-01")),
+                published_at=str(src.get("publication_date", f"{datetime.now().year}-01-01")),
                index=src.get("index"),
                source_type=src.get("type", "web")
            )
@@ -780,6 +823,33 @@ class ResearchService:
            web_search_queries=search_queries or [],
        )
    def _build_analysis_content(self, sources: List[Dict[str, Any]]) -> str:
        """Build compact all-source summary for LLM analysis.
        Each source is distilled to one line with title, key content, and highlights.
        This ensures ALL sources are visible to keyword, competitor, and angle
        analyzers instead of only the first few (raw content[:3000]).
        """
        if not sources:
            return ""
        lines = []
        for src in sources:
            title = src.get('title', '') or ''
            summary = src.get('summary', '') or ''
            highlights = src.get('highlights', []) or []
            excerpt = src.get('excerpt', '') or ''
            part = f"• {title}"
            if summary:
                part += f" — {summary[:250]}"
            elif excerpt:
                part += f" — {excerpt[:250]}"
            if highlights:
                findings = [h[:120] for h in highlights[:2] if h]
                if findings:
                    part += f" | {'; '.join(findings)}"
            lines.append(part)
        return "\n".join(lines)
    def _normalize_cached_research_data(self, cached_data: Dict[str, Any]) -> Dict[str, Any]:
        """
        Normalize cached research data to fix None values in confidence_scores.
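A short usage sketch may help show what the analyzers now receive from `_build_analysis_content`. The function below is a simplified standalone version written for illustration: it assumes sources are plain dictionaries with the keys used in the diff, and it uses an ASCII " - " separator in place of the one in the diff.

```python
from typing import Any, Dict, List


def build_analysis_content(sources: List[Dict[str, Any]]) -> str:
    """Distill every source to one line: title, summary or excerpt, top highlights."""
    lines = []
    for src in sources:
        title = src.get("title") or ""
        # Prefer the summary; fall back to the excerpt when no summary exists.
        body = src.get("summary") or src.get("excerpt") or ""
        part = f"• {title}"
        if body:
            part += f" - {body[:250]}"
        # Keep at most two highlights, each trimmed to 120 characters.
        findings = [h[:120] for h in (src.get("highlights") or [])[:2] if h]
        if findings:
            part += f" | {'; '.join(findings)}"
        lines.append(part)
    return "\n".join(lines)


sample = [
    {"title": "Report A", "summary": "Adoption grew sharply.", "highlights": ["42% growth", "3 regions", "ignored"]},
    {"title": "Blog B", "excerpt": "A practical walkthrough."},
    {"title": "Page C"},
]
print(build_analysis_content(sample))
```

Each source costs at most a few hundred characters, so a source list of typical size stays well inside the `content[:8000]` window the analyzers now use, instead of the window being consumed by the first few raw documents.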
--- a/backend/services/blog_writer/research/research_strategies.py
+++ b/backend/services/blog_writer/research/research_strategies.py
@@ -6,6 +6,7 @@ Different strategies for executing research based on depth and focus.
 from abc import ABC, abstractmethod
 from typing import Dict, Any
 from datetime import datetime
 from loguru import logger
 from models.blog_models import BlogResearchRequest, ResearchMode, ResearchConfig
@@ -87,7 +88,7 @@ Provide analysis in this EXACT format:
 - For each: Quote/claim, source URL, published date, metric/context.
 REQUIREMENTS:
-- Every claim MUST include a source URL (authoritative, recent: 2024-2025 preferred).
+- Every claim MUST include a source URL (authoritative, recent: {datetime.now().year}-{datetime.now().year + 1} preferred).
 - Use concrete numbers, dates, outcomes; avoid generic advice.
 - Keep bullets tight and scannable for spoken narration."""
        return prompt.strip()
@@ -116,7 +117,7 @@ Research Topic: "{topic}"{date_filter}{source_filter}
 Provide COMPLETE analysis in this EXACT format:
-## WHAT'S CHANGED (2024-2025)
+## WHAT'S CHANGED ({datetime.now().year}-{datetime.now().year + 1})
 [5-7 concise trend bullets with numbers + source URLs]
 ## PROOF & NUMBERS
@@ -151,7 +152,7 @@ Primary (3), Secondary (8-10), Long-tail (5-7) with intent hints.
 VERIFICATION REQUIREMENTS:
 - Minimum 2 authoritative sources per major claim.
 - Prefer industry reports > research papers > news > blogs.
-- 2024-2025 data strongly preferred.
+- {datetime.now().year}-{datetime.now().year + 1} data strongly preferred.
 - All numbers must include timeframe and methodology.
 - Every bullet must be concise for spoken narration and actionable for {target_audience}."""
        return prompt.strip()
@@ -213,7 +214,7 @@ REQUIREMENTS:
 - Cite all claims with authoritative source URLs
 - Include specific numbers, dates, examples
 - Focus on actionable insights for {target_audience}
-- Use 2024-2025 data when available"""
+- Use {datetime.now().year}-{datetime.now().year + 1} data when available"""
        return prompt.strip()
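The prompts in this file now compute the year range inline with two `datetime.now()` calls per occurrence. One possible tidy-up, shown here only as a sketch, is a small helper that reads the clock once; `preferred_year_range` is a hypothetical name and is not part of the diff.

```python
from datetime import datetime
from typing import Optional


def preferred_year_range(now: Optional[datetime] = None) -> str:
    """Return 'YYYY-YYYY+1' for the current year, reading the clock only once."""
    year = (now or datetime.now()).year
    return f"{year}-{year + 1}"


# Example use inside a prompt template (template text abbreviated here).
year_range = preferred_year_range(datetime(2026, 6, 15))
prompt = f"- Use {year_range} data when available"
print(prompt)
```

Computing the value once also keeps both ends of the range consistent if a prompt happens to be built across a year boundary.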
--- a/backend/services/blog_writer/seo/blog_content_seo_analyzer.py
+++ b/backend/services/blog_writer/seo/blog_content_seo_analyzer.py
@@ -6,6 +6,7 @@ Leverages existing non-AI SEO tools and uses single AI prompt for structured ana
 """
 import asyncio
 import math
 import re
 import textstat
 from datetime import datetime
@@ -34,7 +35,7 @@ class BlogContentSEOAnalyzer:
        logger.info("BlogContentSEOAnalyzer initialized")
-    async def analyze_blog_content(self, blog_content: str, research_data: Dict[str, Any], blog_title: Optional[str] = None, user_id: str = None) -> Dict[str, Any]:
+    async def analyze_blog_content(self, blog_content: str, research_data: Dict[str, Any], blog_title: Optional[str] = None, user_id: str = None, outline: Optional[List[Dict[str, Any]]] = None, competitive_advantage: Optional[str] = None) -> Dict[str, Any]:
        """
        Main analysis method with parallel processing
@@ -43,6 +44,8 @@ class BlogContentSEOAnalyzer:
            research_data: Research data containing keywords and other insights
            blog_title: Optional blog title
            user_id: Clerk user ID for subscription checking (required)
            outline: Optional outline sections for context-aware analysis
            competitive_advantage: Optional competitive advantage for context
        Returns:
            Comprehensive SEO analysis results
@@ -52,21 +55,24 @@ class BlogContentSEOAnalyzer:
        try:
            logger.info("Starting blog content SEO analysis")
-            # Extract keywords from research data
+            # Extract research context (keywords + competitor data + search queries)
-            keywords_data = self._extract_keywords_from_research(research_data)
+            research_context = self._extract_research_context(research_data)
-            logger.info(f"Extracted keywords: {keywords_data}")
+            logger.info(f"Extracted research context with {len(research_context.get('primary', []))} primary keywords")
            # Phase 1: Run non-AI analyzers in parallel
            logger.info("Running non-AI analyzers in parallel")
-            non_ai_results = await self._run_non_ai_analyzers(blog_content, keywords_data)
+            non_ai_results = await self._run_non_ai_analyzers(blog_content, research_context)
-            # Phase 2: Single AI analysis for structured insights
+            # Phase 2: Single AI analysis for structured insights (with outline + competitive context)
            logger.info("Running AI analysis")
-            ai_insights = await self._run_ai_analysis(blog_content, keywords_data, non_ai_results, user_id=user_id)
+            ai_insights = await self._run_ai_analysis(
                blog_content, research_context, non_ai_results, user_id=user_id,
                outline=outline, competitive_advantage=competitive_advantage
            )
            # Phase 3: Compile and format results
            logger.info("Compiling results")
-            results = self._compile_blog_seo_results(non_ai_results, ai_insights, keywords_data)
+            results = self._compile_blog_seo_results(non_ai_results, ai_insights, research_context)
            logger.info(f"SEO analysis completed. Overall score: {results.get('overall_score', 0)}")
            return results
@@ -76,14 +82,19 @@ class BlogContentSEOAnalyzer:
            # Fail fast - don't return fallback data
            raise e
-    def _extract_keywords_from_research(self, research_data: Dict[str, Any]) -> Dict[str, Any]:
+    def _extract_research_context(self, research_data: Dict[str, Any]) -> Dict[str, Any]:
-        """Extract keywords from research data"""
+        """Extract research context from research data including keywords, competitor data, and search queries.
        Previously only extracted keyword_analysis. Now also extracts:
        - competitor_analysis (content_gaps, industry_leaders, opportunities, competitive_advantages)
        - search_queries
        - suggested_angles
        """
        try:
-            logger.info(f"Extracting keywords from research data: {research_data}")
+            logger.info("Extracting research context from research data")
            # Extract keywords from research data structure
            keyword_analysis = research_data.get('keyword_analysis', {})
            logger.info(f"Found keyword_analysis: {keyword_analysis}")
            # Handle different possible structures
            primary_keywords = []
@@ -109,17 +120,37 @@ class BlogContentSEOAnalyzer:
                'long_tail': long_tail_keywords,
                'semantic': semantic_keywords,
                'all_keywords': all_keywords,
-                'search_intent': keyword_analysis.get('search_intent', 'informational')
+                'search_intent': keyword_analysis.get('search_intent', 'informational'),
            }
-            logger.info(f"Extracted keywords: {result}")
+            # Extract competitor analysis
            competitor_analysis = research_data.get('competitor_analysis', {})
            if competitor_analysis:
                result['content_gaps'] = competitor_analysis.get('content_gaps', [])
                result['industry_leaders'] = competitor_analysis.get('industry_leaders', [])
                result['opportunities'] = competitor_analysis.get('opportunities', [])
                result['competitive_advantages'] = competitor_analysis.get('competitive_advantages', [])
            else:
                result['content_gaps'] = []
                result['industry_leaders'] = []
                result['opportunities'] = []
                result['competitive_advantages'] = []
            # Extract search queries
            search_queries = research_data.get('search_queries', [])
            result['search_queries'] = search_queries if isinstance(search_queries, list) else []
            # Extract suggested angles
            suggested_angles = research_data.get('suggested_angles', [])
            result['suggested_angles'] = suggested_angles if isinstance(suggested_angles, list) else []
            logger.info(f"Extracted research context: {len(primary_keywords)} primary keywords, {len(result.get('content_gaps', []))} content gaps, {len(result.get('search_queries', []))} search queries")
            return result
        except Exception as e:
-            logger.error(f"Failed to extract keywords from research data: {e}")
+            logger.error(f"Failed to extract research context from research data: {e}")
            logger.error(f"Research data structure: {research_data}")
-            # Fail fast - don't return empty keywords
+            raise ValueError(f"Research context extraction failed: {e}")
-            raise ValueError(f"Keyword extraction failed: {e}")
    async def _run_non_ai_analyzers(self, blog_content: str, keywords_data: Dict[str, Any]) -> Dict[str, Any]:
        """Run all non-AI analyzers in parallel for maximum performance"""
@@ -170,10 +201,24 @@ class BlogContentSEOAnalyzer:
            sentences = len(re.findall(r'[.!?]+', content))
            # Blog-specific structure analysis
-            has_introduction = any('introduction' in line.lower() or 'overview' in line.lower() 
+            content_lower = content.lower()
-                                  for line in lines[:10])
+            first_500 = content_lower[:500] if len(content) > 500 else content_lower
-            has_conclusion = any('conclusion' in line.lower() or 'summary' in line.lower() 
+            last_500 = content_lower[-500:] if len(content) > 500 else content_lower
-                                for line in lines[-10:])
+            has_introduction = any('introduction' in line.lower() or 'overview' in line.lower()
                                   for line in lines[:10]) or any(
                phrase in first_500 for phrase in [
                    'in this', 'this article', 'this guide', 'this post',
                    'we will', "you'll learn", "let's explore", "whether you're",
                    'in this section', 'this blog post', 'here we', 'today we',
                    "we'll explore", "we'll cover", "we'll dive"
                ])
            has_conclusion = any('conclusion' in line.lower() or 'summary' in line.lower()
                                 for line in lines[-10:]) or any(
                phrase in last_500 for phrase in [
                    'in conclusion', 'to summarize', 'in summary', 'bottom line',
                    'key takeaways', 'remember that', "as we've seen", 'wrapping up',
                    'final thoughts', 'to conclude', 'in short', 'overall'
                ])
            has_cta = any('call to action' in line.lower() or 'learn more' in line.lower() 
                          for line in lines)
@@ -187,7 +232,7 @@ class BlogContentSEOAnalyzer:
                'has_conclusion': has_conclusion,
                'has_call_to_action': has_cta,
                'structure_score': structure_score,
-                'recommendations': self._get_structure_recommendations(sections, has_introduction, has_conclusion)
+                'recommendations': self._get_structure_recommendations(sections, has_introduction, has_conclusion, content)
            }
        except Exception as e:
            logger.error(f"Content structure analysis failed: {e}")
@@ -332,33 +377,36 @@ class BlogContentSEOAnalyzer:
            raise e
    # Helper methods for calculations and scoring
    @staticmethod
    def _sigmoid(x: float, midpoint: float = 0.0, steepness: float = 1.0) -> float:
        """Sigmoid function for smooth scoring curves. Returns 0-1."""
        try:
            return 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))
        except OverflowError:
            return 0.0 if x < midpoint else 1.0
    def _calculate_structure_score(self, sections: int, paragraphs: int, has_intro: bool, has_conclusion: bool) -> int:
-        """Calculate content structure score"""
+        """Calculate content structure score using continuous curves instead of rigid brackets.
-        score = 0
+
-        
+        Sections: sigmoid centered at 4 (score rises with count); scaled by 0.7 above 8 sections.
-        # Section count (optimal: 3-8 sections)
+        Paragraphs: sigmoid centered at 10 (score rises with count); scaled by 0.6 above 25 paragraphs.
-        if 3 <= sections <= 8:
+        Intro/conclusion: binary bonuses.
-            score += 30
+        """
-        elif sections < 3:
+        # Section score: sigmoid centered at 4, scaled down (floor of 10) above 8 sections
-            score += 15
+        section_score = self._sigmoid(sections, midpoint=4, steepness=0.8) * 40
-        else:
+        if sections > 8:
-            score += 20
+            section_score = max(section_score * 0.7, 10)
-        
+
-        # Paragraph count (optimal: 8-20 paragraphs)
+        # Paragraph score: sigmoid centered at 10, scaled down (floor of 8) above 25 paragraphs
-        if 8 <= paragraphs <= 20:
+        para_score = self._sigmoid(paragraphs, midpoint=10, steepness=0.3) * 40
-            score += 30
+        if paragraphs > 25:
-        elif paragraphs < 8:
+            para_score = max(para_score * 0.6, 8)
-            score += 15
+
-        else:
+        intro_score = 10 if has_intro else 0
-            score += 20
+        conclusion_score = 10 if has_conclusion else 0
-        
+
-        # Introduction and conclusion
+        return int(min(max(section_score + para_score + intro_score + conclusion_score, 5), 100))
-        if has_intro:
-            score += 20
-        if has_conclusion:
-            score += 20
-        return min(score, 100)
    def _calculate_keyword_density(self, content: str, keyword: str) -> float:
        """Calculate keyword density percentage"""
@@ -397,21 +445,20 @@ class BlogContentSEOAnalyzer:
        return total_words / len(paragraphs)
    def _calculate_readability_score(self, metrics: Dict[str, float]) -> int:
-        """Calculate overall readability score"""
+        """Calculate readability score using a continuous sigmoid curve on Flesch Reading Ease.
-        # Flesch Reading Ease (0-100, higher is better)
+
-        flesch_score = metrics.get('flesch_reading_ease', 0)
+        Maps Flesch 0-100 to a score that:
-        
+        - Below 30: 25-45 (hard to read)
-        # Convert to 0-100 scale
+        - 30-50: 45-65 (moderate)
-        if flesch_score >= 80:
+        - 50-70: 65-85 (good range)
-            return 90
+        - 70-90: 85-95 (excellent)
-        elif flesch_score >= 60:
+        - Above 90: 95-100 (very easy)
-            return 80
+        """
-        elif flesch_score >= 40:
+        flesch = metrics.get('flesch_reading_ease', 0)
-            return 70
+        score = self._sigmoid(flesch, midpoint=50, steepness=0.06) * 70 + 25
-        elif flesch_score >= 20:
+        if flesch > 80:
-            return 60
+            score = min(score + 5, 100)
-        else:
+        return int(min(max(score, 20), 100))
-            return 50
    def _determine_target_audience(self, metrics: Dict[str, float]) -> str:
        """Determine target audience based on readability metrics"""
@@ -427,183 +474,228 @@ class BlogContentSEOAnalyzer:
            return "Graduate level"
    def _calculate_content_depth_score(self, word_count: int, vocabulary_diversity: float) -> int:
-        """Calculate content depth score"""
+        """Calculate content depth score using continuous curves.
-        score = 0
+
-        
+        Word count: sigmoid centered at 800 words; capped at 40 above 3000 words and at 15 below 300.
-        # Word count (optimal: 800-2000 words)
+        Vocabulary diversity: sigmoid centered at 0.45; capped at 15 below 0.3.
-        if 800 <= word_count <= 2000:
+        """
-            score += 50
+        # Word count score: optimal around 1000-1500, smooth decay below 500
-        elif word_count < 800:
+        word_score = self._sigmoid(word_count, midpoint=800, steepness=0.005) * 55
-            score += 30
+        if word_count > 3000:
-        else:
+            word_score = min(word_score, 40)
-            score += 40
+        elif word_count < 300:
-        
+            word_score = min(word_score, 15)
-        # Vocabulary diversity (optimal: 0.4-0.7)
+
-        if 0.4 <= vocabulary_diversity <= 0.7:
+        # Vocabulary diversity score: rises with diversity (sigmoid centered at 0.45); low diversity reads as repetitive
-            score += 50
+        diversity_score = self._sigmoid(vocabulary_diversity, midpoint=0.45, steepness=12) * 45
-        elif vocabulary_diversity < 0.4:
+        if vocabulary_diversity < 0.3:
-            score += 30
+            diversity_score = min(diversity_score, 15)
-        else:
+
-            score += 40
+        return int(min(max(word_score + diversity_score, 5), 100))
-        return min(score, 100)
    def _calculate_flow_score(self, transition_count: int, word_count: int) -> int:
-        """Calculate content flow score"""
+        """Calculate content flow score using continuous curve.
        Transition density is typically low (most content has 0.5-3 per 100 words
        of the specific transition words we track). The sigmoid midpoint is set at 1.0
        with moderate steepness to produce a reasonable spread.
        """
        if word_count == 0:
-            return 0
+            return 15
-        
+
        transition_density = transition_count / (word_count / 100)
-        
+
-        # Optimal transition density: 1-3 per 100 words
+        # Sigmoid centered at 1.0 (decent density), moderate steepness
-        if 1 <= transition_density <= 3:
+        score = self._sigmoid(transition_density, midpoint=1.0, steepness=2.5) * 50 + 40
-            return 90
+        if transition_density > 5:
-        elif transition_density < 1:
+            score = max(score - 10, 35)
-            return 60
+        return int(min(max(score, 15), 100))
-        else:
-            return 70
    def _calculate_heading_hierarchy_score(self, h1: List[str], h2: List[str], h3: List[str]) -> int:
-        """Calculate heading hierarchy score"""
+        """Calculate heading hierarchy score using continuous curves.
-        score = 0
+
-        
+        H1: 1 is ideal, score decays for 0 or 2+.
-        # Should have exactly 1 H1
+        H2: sigmoid centered at 4 (score rises with count); fixed at 5 for none, scaled by 0.6 above 10.
-        if len(h1) == 1:
+        H3: presence adds bonus.
-            score += 40
+        """
-        elif len(h1) == 0:
+        # H1 score: clear peak at 1
-            score += 20
+        h1_count = len(h1)
        if h1_count == 1:
            h1_score = 40
        elif h1_count == 0:
            h1_score = 15
        else:
-            score += 10
+            h1_score = max(40 // h1_count, 8)
-        
+
-        # Should have 3-8 H2 headings
+        # H2 score: sigmoid centered at 4
-        if 3 <= len(h2) <= 8:
+        h2_count = len(h2)
-            score += 40
+        h2_score = self._sigmoid(h2_count, midpoint=4, steepness=1.0) * 40
-        elif len(h2) < 3:
+        if h2_count == 0:
-            score += 20
+            h2_score = 5
-        else:
+        elif h2_count > 10:
-            score += 30
+            h2_score = max(h2_score * 0.6, 10)
-        
+
-        # H3 headings are optional but good for structure
+        # H3 bonus: presence is good, diminishing returns
-        if len(h3) > 0:
+        h3_score = min(len(h3) * 5, 20)
-            score += 20
+
-        
+        return int(min(max(h1_score + h2_score + h3_score, 10), 100))
-        return min(score, 100)
    def _calculate_keyword_score(self, keyword_analysis: Dict[str, Any]) -> int:
-        """Calculate keyword optimization score"""
+        """Calculate keyword optimization score using continuous curves.
-        score = 0
+
-        
+        Density: sigmoid centered at 2%; contribution halved above 4%.
-        # Check keyword density (optimal: 1-3%)
+        Heading presence: binary bonus per keyword.
        Early occurrence: sigmoid bonus.
        Missing/over-optimization: smooth penalties.
        """
        density_score = 0
        heading_bonus = 0
        early_bonus = 0
        densities = keyword_analysis.get('keyword_density', {})
        keyword_count = max(len(densities), 1)
        for keyword, density in densities.items():
-            if 1 <= density <= 3:
+            # Density score: sigmoid centered at 2%, rising with density until the over-optimization penalty
-                score += 30
+            density_contribution = self._sigmoid(density, midpoint=2.0, steepness=2.0) * 30
-            elif density < 1:
+            if density > 4:
-                score += 15
+                density_contribution *= 0.5  # penalty for over-optimization
-            else:
+            density_score += density_contribution
-                score += 10
+
-        
+        density_score = density_score / keyword_count
-        # Check keyword distribution
+
        # Heading presence bonus
        distributions = keyword_analysis.get('keyword_distribution', {})
        for keyword, dist in distributions.items():
            if dist.get('in_headings', False):
-                score += 20
+                heading_bonus += 15
-            if dist.get('first_occurrence', -1) < 100:  # Early occurrence
+            first_occ = dist.get('first_occurrence', -1)
-                score += 20
+            if isinstance(first_occ, (int, float)) and 0 <= first_occ < 150:
-        
+                early_bonus += int(self._sigmoid(first_occ, midpoint=75, steepness=-0.04) * 15)
-        # Penalize missing keywords
+
-        missing = len(keyword_analysis.get('missing_keywords', []))
+        # Penalize missing keywords and over-optimization
-        score -= missing * 10
+        missing_penalty = len(keyword_analysis.get('missing_keywords', [])) * 8
-        
+        over_opt_penalty = len(keyword_analysis.get('over_optimization', [])) * 12
-        # Penalize over-optimization
+
-        over_opt = len(keyword_analysis.get('over_optimization', []))
+        raw = density_score + heading_bonus + early_bonus - missing_penalty - over_opt_penalty
-        score -= over_opt * 15
+        return int(min(max(raw, 5), 100))
-        return max(0, min(score, 100))
    def _calculate_weighted_score(self, scores: Dict[str, int]) -> int:
-        """Calculate weighted overall score"""
+        """Calculate weighted overall score.
        AI insight engagement_score is unreliable (no ground truth) so it's excluded
        from the overall score. The remaining 5 categories are re-weighted to sum to 1.0.
        AI insights are still reported in category_scores for display but don't affect
        the overall score.
        """
        weights = {
-            'structure': 0.2,
+            'structure': 0.20,
            'keywords': 0.25,
-            'readability': 0.2,
+            'readability': 0.20,
-            'quality': 0.15,
+            'quality': 0.20,
-            'headings': 0.1,
+            'headings': 0.15,
-            'ai_insights': 0.1
        }
-        
+
        weighted_sum = sum(scores.get(key, 0) * weight for key, weight in weights.items())
-        return int(weighted_sum)
+        return int(min(max(weighted_sum, 0), 100))
    # Recommendation methods
-    def _get_structure_recommendations(self, sections: int, has_intro: bool, has_conclusion: bool) -> List[str]:
+    def _get_structure_recommendations(self, sections: int, has_intro: bool, has_conclusion: bool, content: str = '') -> List[str]:
-        """Get structure recommendations"""
+        """Get structure recommendations based on actual content analysis"""
        recommendations = []
-        
+
        if sections < 3:
-            recommendations.append("Add more sections to improve content structure")
+            recommendations.append("Add more sections to improve content structure and topic coverage")
        elif sections > 8:
-            recommendations.append("Consider combining some sections for better flow")
+            recommendations.append("Consider combining some sections for better flow and readability")
-        
+
-        if not has_intro:
+        # More robust intro detection: check the opening 500 chars for first-person address,
-            recommendations.append("Add an introduction section to set context")
+        # question, or general hook — not just keyword matching
-        
+        first_200 = (content[:500] if content else '').lower()
-        if not has_conclusion:
+        intro_indicators = any([
-            recommendations.append("Add a conclusion section to summarize key points")
+            has_intro,
-        
+            '?' in first_200[:200],
            any(phrase in first_200 for phrase in ['in this', 'this article', 'this guide', 'this post', 'we will', "you'll learn", "let's explore", "whether you're"]),
            first_200.strip().startswith('# '),
        ])
        if not intro_indicators:
            recommendations.append("Add an introduction that hooks the reader and previews key topics")
        # More robust conclusion detection
        last_500 = (content[-500:] if content else '').lower()
        conclusion_indicators = any([
            has_conclusion,
            any(phrase in last_500 for phrase in ['in conclusion', 'to summarize', 'in summary', 'bottom line', 'key takeaways', 'remember that', 'as we\'ve seen']),
        ])
        if not conclusion_indicators:
            recommendations.append("Add a conclusion to summarize key points and provide next steps")
        return recommendations
    def _get_readability_recommendations(self, metrics: Dict[str, float], avg_sentence_length: float) -> List[str]:
-        """Get readability recommendations"""
+        """Get readability recommendations with specific, actionable guidance"""
        recommendations = []
-        
+
        flesch_score = metrics.get('flesch_reading_ease', 0)
-        
+
-        if flesch_score < 60:
+        if flesch_score < 30:
-            recommendations.append("Simplify language and use shorter sentences")
+            recommendations.append("Content is very difficult to read — shorten sentences, use simpler words, and break up complex ideas")
-        
+        elif flesch_score < 50:
-        if avg_sentence_length > 20:
+            recommendations.append("Content is fairly complex — consider simplifying some sentences and adding more plain-language explanations")
-            recommendations.append("Break down long sentences for better readability")
+
-        
+        if avg_sentence_length > 25:
-        if flesch_score > 80:
+            recommendations.append(f"Average sentence length is {avg_sentence_length:.0f} words — aim for 15-20 words per sentence for better readability")
-            recommendations.append("Consider adding more technical depth for expert audience")
+        elif avg_sentence_length > 20:
-        
+            recommendations.append("Some sentences may be too long — try breaking a few into shorter ones for easier reading")
        if flesch_score > 80 and flesch_score < 95:
            recommendations.append("Readability is very good — consider adding slightly more technical depth for expert credibility")
        return recommendations
    def _get_content_quality_recommendations(self, word_count: int, vocabulary_diversity: float, transition_count: int) -> List[str]:
-        """Get content quality recommendations"""
+        """Get content quality recommendations with specific, actionable guidance"""
        recommendations = []
-        
+
-        if word_count < 800:
+        if word_count < 400:
-            recommendations.append("Expand content with more detailed explanations")
+            recommendations.append("Content is significantly underdeveloped — expand with detailed explanations, examples, and supporting evidence")
-        elif word_count > 2000:
+        elif word_count < 800:
-            recommendations.append("Consider breaking into multiple posts")
+            recommendations.append("Content is thin — add depth with specific examples, data points, and detailed explanations for each section")
-        
+        elif word_count > 3000:
-        if vocabulary_diversity < 0.4:
+            recommendations.append("Content is very long — consider whether all sections are necessary or if some could be a separate post")
-            recommendations.append("Use more varied vocabulary to improve engagement")
+
-        
+        if vocabulary_diversity < 0.35:
-        if transition_count < 3:
+            recommendations.append("Vocabulary is highly repetitive — use synonyms and varied phrasing to improve engagement")
-            recommendations.append("Add more transition words to improve flow")
+        elif vocabulary_diversity < 0.45:
-        
+            recommendations.append("Vocabulary variety could be improved — try rephrasing repeated terms for more natural flow")
        if transition_count < 2:
            recommendations.append("Very few transition words found — add connectors like 'however', 'therefore', 'furthermore' between ideas")
        elif transition_count < 5:
            recommendations.append("Add more transition words to improve the flow between paragraphs and sections")
        return recommendations
    def _get_heading_recommendations(self, h1: List[str], h2: List[str], h3: List[str]) -> List[str]:
-        """Get heading recommendations"""
+        """Get heading recommendations with specific, actionable guidance"""
        recommendations = []
-        
+
        if len(h1) == 0:
-            recommendations.append("Add a main H1 heading")
+            recommendations.append("Add a main H1 heading — this is the primary title for both readers and search engines")
        elif len(h1) > 1:
-            recommendations.append("Use only one H1 heading per post")
+            recommendations.append(f"Found {len(h1)} H1 headings — use only one H1 per post for clarity. Convert extras to H2.")
-        
+
        if len(h2) < 3:
-            recommendations.append("Add more H2 headings to structure content")
+            recommendations.append(f"Only {len(h2)} H2 headings found — add section headings to break up content and improve scanning")
-        elif len(h2) > 8:
+        elif len(h2) > 10:
-            recommendations.append("Consider using H3 headings for better hierarchy")
+            recommendations.append(f"{len(h2)} H2 headings may be too many — consider using H3 subheadings within sections for better hierarchy")
-        
+
        if len(h2) > 5 and len(h3) == 0:
            recommendations.append("Consider adding H3 subheadings within longer H2 sections for better content hierarchy")
        return recommendations
-    async def _run_ai_analysis(self, blog_content: str, keywords_data: Dict[str, Any], non_ai_results: Dict[str, Any], user_id: str = None) -> Dict[str, Any]:
+    async def _run_ai_analysis(self, blog_content: str, keywords_data: Dict[str, Any], non_ai_results: Dict[str, Any], user_id: str = None, outline: Optional[List[Dict[str, Any]]] = None, competitive_advantage: Optional[str] = None) -> Dict[str, Any]:
        """Run single AI analysis for structured insights (provider-agnostic)"""
        if not user_id:
            raise ValueError("user_id is required for subscription checking. Please provide Clerk user ID.")
@@ -612,7 +704,9 @@ class BlogContentSEOAnalyzer:
            context = {
                'blog_content': blog_content,
                'keywords_data': keywords_data,
-                'non_ai_results': non_ai_results
+                'non_ai_results': non_ai_results,
                'outline': outline or [],
                'competitive_advantage': competitive_advantage or '',
            }
            # Create AI prompt for structured analysis
@@ -624,10 +718,18 @@ class BlogContentSEOAnalyzer:
                    "content_quality_insights": {
                        "type": "object",
                        "properties": {
                            "engagement_score": {"type": "number"},
                            "value_proposition": {"type": "string"},
                            "content_gaps": {"type": "array", "items": {"type": "string"}},
-                            "improvement_suggestions": {"type": "array", "items": {"type": "string"}}
+                            "improvement_suggestions": {"type": "array", "items": {"type": "string"}},
                            "content_depth_indicators": {
                                "type": "object",
                                "properties": {
                                    "has_specific_data_points": {"type": "boolean"},
                                    "has_examples_or_illustrations": {"type": "boolean"},
                                    "has_actionable_takeaways": {"type": "boolean"},
                                    "depth_assessment": {"type": "string"}
                                }
                            }
                        }
                    },
                    "seo_optimization_insights": {
@@ -648,13 +750,12 @@ class BlogContentSEOAnalyzer:
                            "ux_improvements": {"type": "array", "items": {"type": "string"}}
                        }
                    },
-                    "competitive_analysis": {
+                    "content_strengths": {
                        "type": "object",
                        "properties": {
-                            "content_differentiation": {"type": "string"},
+                            "strongest_sections": {"type": "array", "items": {"type": "string"}},
-                            "unique_value": {"type": "string"},
+                            "unique_value_points": {"type": "array", "items": {"type": "string"}},
-                            "competitive_advantages": {"type": "array", "items": {"type": "string"}},
+                            "reader_value_assessment": {"type": "string"}
-                            "market_positioning": {"type": "string"}
                        }
                    }
                }
@@ -675,37 +776,85 @@ class BlogContentSEOAnalyzer:
            raise e
    def _create_ai_analysis_prompt(self, context: Dict[str, Any]) -> str:
-        """Create AI analysis prompt"""
+        """Create AI analysis prompt with research context and outline awareness"""
        blog_content = context['blog_content']
        keywords_data = context['keywords_data']
        non_ai_results = context['non_ai_results']
        outline = context.get('outline', [])
        competitive_advantage = context.get('competitive_advantage', '')
        # Build outline context
        outline_text = ""
        if outline:
            section_names = []
            for sec in outline[:8]:
                heading = sec.get('heading', '') if isinstance(sec, dict) else getattr(sec, 'heading', '')
                subheadings = sec.get('subheadings', []) if isinstance(sec, dict) else getattr(sec, 'subheadings', [])
                sub_text = f" (subtopics: {', '.join(subheadings[:4])})" if subheadings else ""
                target_words = sec.get('target_words', '') if isinstance(sec, dict) else getattr(sec, 'target_words', '')
                word_text = f" [~{target_words} words]" if target_words else ""
                section_names.append(f"  - {heading}{sub_text}{word_text}")
            outline_text = "\n".join(section_names)
        # Build research context block
        research_block = ""
        content_gaps = keywords_data.get('content_gaps', [])
        competitive_advantages = keywords_data.get('competitive_advantages', [])
        search_queries = keywords_data.get('search_queries', [])
        suggested_angles = keywords_data.get('suggested_angles', [])
        industry_leaders = keywords_data.get('industry_leaders', [])
        if content_gaps:
            research_block += f"\nCONTENT GAPS (from competitor analysis): {', '.join(content_gaps[:5])}"
        if competitive_advantages:
            research_block += f"\nOUR COMPETITIVE ADVANTAGES: {', '.join(competitive_advantages[:3])}"
        if competitive_advantage:
            research_block += f"\nFOCUSED COMPETITIVE ADVANTAGE: {competitive_advantage}"
        if search_queries:
            research_block += f"\nORIGINAL SEARCH QUERIES: {', '.join(search_queries[:5])}"
        if suggested_angles:
            research_block += f"\nPLANNED CONTENT ANGLES: {', '.join(suggested_angles[:3])}"
        if industry_leaders:
            research_block += f"\nINDUSTRY LEADERS: {', '.join(industry_leaders[:3])}"
        prompt = f"""
-        Analyze this blog content for SEO optimization and user experience. Provide structured insights based on the content and keyword data.
+        Analyze this blog content for SEO optimization and user experience. Provide structured insights based ONLY on what is actually present in the content and keyword data. Do NOT fabricate data, statistics, competitor names, or case studies that are not in the content.
        BLOG CONTENT:
-        {blog_content[:2000]}...
+        {blog_content[:3000]}...
        KEYWORDS DATA:
        Primary Keywords: {keywords_data.get('primary', [])}
        Long-tail Keywords: {keywords_data.get('long_tail', [])}
        Semantic Keywords: {keywords_data.get('semantic', [])}
-        Search Intent: {keywords_data.get('search_intent', 'informational')}
+        Search Intent: {keywords_data.get('search_intent', 'informational')}{research_block}
-        NON-AI ANALYSIS RESULTS:
+        MEASURED ANALYSIS RESULTS:
-        Structure Score: {non_ai_results.get('content_structure', {}).get('structure_score', 0)}
+        Structure Score: {non_ai_results.get('content_structure', {}).get('structure_score', 0)}/100
-        Readability Score: {non_ai_results.get('readability_analysis', {}).get('readability_score', 0)}
+        Readability Score: {non_ai_results.get('readability_analysis', {}).get('readability_score', 0)}/100
-        Content Quality Score: {non_ai_results.get('content_quality', {}).get('content_depth_score', 0)}
+        Content Quality Score: {non_ai_results.get('content_quality', {}).get('content_depth_score', 0)}/100
        Heading Hierarchy Score: {non_ai_results.get('heading_structure', {}).get('heading_hierarchy_score', 0)}/100
        Word Count: {non_ai_results.get('content_quality', {}).get('word_count', 0)}
        Sections: {non_ai_results.get('content_structure', {}).get('total_sections', 0)}
        Has Introduction: {non_ai_results.get('content_structure', {}).get('has_introduction', False)}
        Has Conclusion: {non_ai_results.get('content_structure', {}).get('has_conclusion', False)}{f"""
-        Please provide:
+        PLANNED OUTLINE STRUCTURE:
-        1. Content Quality Insights: Assess engagement potential, value proposition, content gaps, and improvement suggestions
+{outline_text}""" if outline_text else ""}
-        2. SEO Optimization Insights: Evaluate keyword optimization, content relevance, search intent alignment, and SEO improvements
+{f"""
        3. User Experience Insights: Analyze content flow, readability, engagement factors, and UX improvements
        4. Competitive Analysis: Identify content differentiation, unique value, competitive advantages, and market positioning
-        Focus on actionable insights that can improve the blog's performance and user engagement.
+        FOCUSED ADVANTAGE: {competitive_advantage}""" if competitive_advantage else ""}
        IMPORTANT: SEO metadata (title tag, meta description, Open Graph tags, Twitter cards, JSON-LD schema) will be generated in a separate step. Do NOT recommend adding or improving meta descriptions, title tags, OG tags, or structured data markup — focus only on content-level improvements.
        Provide:
        1. Content Quality Insights: Assess the value proposition based on actual content. Identify specific content gaps (what TOPICS from the planned outline or competitor analysis are missing; do NOT suggest adding case studies unless the content references specific studies). Suggest improvements grounded in what the content currently covers.
        2. Content Depth Indicators: Objectively assess whether the content contains specific data points, examples, or actionable takeaways. These are binary assessments based on what's actually in the text.
        3. SEO Optimization Insights: Evaluate keyword optimization based on the provided keyword data. Assess content relevance and search intent alignment relative to the original search queries.
        4. User Experience Insights: Analyze content flow and readability. Identify engagement factors present in the text.
        5. Content Strengths: Identify the strongest sections of the content by heading name. Note unique value points the content provides. Do NOT invent competitive advantages — only describe what makes THIS content valuable based on the competitive advantages and content gaps listed above.
        """
-        
+
        return prompt
    def _compile_blog_seo_results(self, non_ai_results: Dict[str, Any], ai_insights: Dict[str, Any], keywords_data: Dict[str, Any]) -> Dict[str, Any]:
@@ -719,13 +868,28 @@ class BlogContentSEOAnalyzer:
                raise ValueError("AI insights are missing")
            # Calculate category scores
            # Compute depth_score (stored under the 'ai_insights' category) from the
            # content_depth_indicators flags instead of the hallucinated engagement_score.
            # If depth_indicators are present, score from the boolean flags; otherwise default to 50 (neutral).
            ai_quality = ai_insights.get('content_quality_insights', {})
            depth_indicators = ai_quality.get('content_depth_indicators', {})
            if depth_indicators:
                depth_flags = [
                    depth_indicators.get('has_specific_data_points', False),
                    depth_indicators.get('has_examples_or_illustrations', False),
                    depth_indicators.get('has_actionable_takeaways', False),
                ]
                depth_score = 40 + (sum(depth_flags) * 20)  # 40 baseline + 20 per true flag = 40-100
            else:
                depth_score = 50
            category_scores = {
                'structure': non_ai_results.get('content_structure', {}).get('structure_score', 0),
                'keywords': self._calculate_keyword_score(non_ai_results.get('keyword_analysis', {})),
                'readability': non_ai_results.get('readability_analysis', {}).get('readability_score', 0),
                'quality': non_ai_results.get('content_quality', {}).get('content_depth_score', 0),
                'headings': non_ai_results.get('heading_structure', {}).get('heading_hierarchy_score', 0),
-                'ai_insights': ai_insights.get('content_quality_insights', {}).get('engagement_score', 0)
+                'ai_insights': depth_score
            }
            # Calculate overall score
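The depth-score rule in this hunk can be sketched in isolation as follows. This is a minimal illustration, not the code in the PR: the function name `depth_score_from_indicators` is hypothetical, and the `bool(...)` coercion is an added assumption to guard against the LLM returning non-boolean values for the flags.

```python
from typing import Any, Dict

# Flag names taken from the hunk above.
DEPTH_FLAG_KEYS = (
    "has_specific_data_points",
    "has_examples_or_illustrations",
    "has_actionable_takeaways",
)


def depth_score_from_indicators(depth_indicators: Dict[str, Any]) -> int:
    """Hypothetical helper: 40 baseline + 20 per true flag, or 50 if no indicators."""
    if not depth_indicators:
        return 50  # neutral default when the model returned no indicators
    # bool() coercion is an assumption added here; the diff sums the raw values.
    true_flags = sum(bool(depth_indicators.get(key, False)) for key in DEPTH_FLAG_KEYS)
    return 40 + true_flags * 20
```

Note that a non-empty indicator dict with every flag false scores 40, which is lower than the neutral default of 50 used when the indicators are missing altogether.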
@@ -757,7 +921,15 @@ class BlogContentSEOAnalyzer:
    def _compile_actionable_recommendations(self, non_ai_results: Dict[str, Any], ai_insights: Dict[str, Any]) -> List[Dict[str, Any]]:
        """Compile actionable recommendations from all sources"""
        recommendations = []
-        
+
        # Metadata-related keywords to filter out (handled by metadata generator)
        metadata_keywords = ['meta description', 'title tag', 'og tag', 'open graph',
                            'twitter card', 'json-ld', 'schema markup', 'structured data markup']
        def _is_metadata_rec(rec_text: str) -> bool:
            rec_lower = rec_text.lower()
            return any(kw in rec_lower for kw in metadata_keywords)
        # Structure recommendations
        structure_recs = non_ai_results.get('content_structure', {}).get('recommendations', [])
        for rec in structure_recs:
@@ -767,7 +939,7 @@ class BlogContentSEOAnalyzer:
                'recommendation': rec,
                'impact': 'Improves content organization and user experience'
            })
-        
+
        # Keyword recommendations
        keyword_recs = non_ai_results.get('keyword_analysis', {}).get('recommendations', [])
        for rec in keyword_recs:
@@ -777,7 +949,7 @@ class BlogContentSEOAnalyzer:
                'recommendation': rec,
                'impact': 'Improves search engine visibility'
            })
-        
+
        # Readability recommendations
        readability_recs = non_ai_results.get('readability_analysis', {}).get('recommendations', [])
        for rec in readability_recs:
@@ -787,17 +959,40 @@ class BlogContentSEOAnalyzer:
                'recommendation': rec,
                'impact': 'Improves user engagement and comprehension'
            })
-        
+
-        # AI insights recommendations
+        # AI insights recommendations (filter out metadata-related recs)
        ai_recs = ai_insights.get('content_quality_insights', {}).get('improvement_suggestions', [])
        for rec in ai_recs:
            if not _is_metadata_rec(rec):
                recommendations.append({
                    'category': 'Content Quality',
                    'priority': 'Medium',
                    'recommendation': rec,
                    'impact': 'Enhances content value and engagement'
                })
        # SEO improvement recommendations (filter metadata recs)
        seo_recs = ai_insights.get('seo_optimization_insights', {}).get('seo_improvements', [])
        for rec in seo_recs:
            if not _is_metadata_rec(rec):
                recommendations.append({
                    'category': 'SEO',
                    'priority': 'Medium',
                    'recommendation': rec,
                    'impact': 'Improves search engine optimization'
                })
        # Content strengths as informational (lower priority)
        content_strengths = ai_insights.get('content_strengths', {})
        strong_sections = content_strengths.get('strongest_sections', [])
        if strong_sections:
            recommendations.append({
-                'category': 'Content Quality',
+                'category': 'Strengths',
-                'priority': 'Medium',
+                'priority': 'Low',
-                'recommendation': rec,
+                'recommendation': f"Strongest sections: {', '.join(strong_sections[:3])}. Consider expanding these areas further.",
-                'impact': 'Enhances content value and engagement'
+                'impact': 'Leverages existing content strengths'
            })
-        
+
        return recommendations
    def _create_visualization_data(self, category_scores: Dict[str, int], non_ai_results: Dict[str, Any]) -> Dict[str, Any]:
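The metadata filter added to `_compile_actionable_recommendations` is a case-insensitive substring match. A standalone sketch, assuming recommendations arrive as plain strings (`filter_content_recs` is a hypothetical wrapper, and the `isinstance` check is an added assumption):

```python
from typing import Any, List

# Keyword list copied from the hunk above.
METADATA_KEYWORDS = [
    "meta description", "title tag", "og tag", "open graph",
    "twitter card", "json-ld", "schema markup", "structured data markup",
]


def is_metadata_rec(rec_text: str) -> bool:
    """True when the recommendation mentions metadata handled by the metadata generator."""
    rec_lower = rec_text.lower()
    return any(kw in rec_lower for kw in METADATA_KEYWORDS)


def filter_content_recs(recs: List[Any]) -> List[str]:
    """Hypothetical wrapper: keep only string recommendations that are not metadata-related."""
    return [rec for rec in recs if isinstance(rec, str) and not is_metadata_rec(rec)]
```

Because the check is a plain substring test, a recommendation that merely mentions one of these phrases in passing is dropped as well.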
@@ -851,7 +1046,7 @@ class BlogContentSEOAnalyzer:
            'weakest_category': weakest_category[0],
            'key_strengths': self._identify_key_strengths(category_scores),
            'key_weaknesses': self._identify_key_weaknesses(category_scores),
-            'ai_summary': ai_insights.get('content_quality_insights', {}).get('value_proposition', '')
+            'ai_summary': ai_insights.get('content_quality_insights', {}).get('value_proposition', 'Content analysis completed.')
        }
    def _identify_key_strengths(self, category_scores: Dict[str, int]) -> List[str]:
--- a/backend/services/blog_writer/seo/blog_seo_metadata_generator.py
+++ b/backend/services/blog_writer/seo/blog_seo_metadata_generator.py
@@ -84,14 +84,14 @@ class BlogSEOMetadataGenerator:
            raise e
    def _extract_keywords_from_research(self, research_data: Dict[str, Any]) -> Dict[str, Any]:
-        """Extract keywords and context from research data"""
+        """Extract keywords and context from research data, including competitor analysis and content gaps."""
        try:
            keyword_analysis = research_data.get('keyword_analysis', {})
            # Handle both 'semantic' and 'semantic_keywords' field names
            semantic_keywords = keyword_analysis.get('semantic', []) or keyword_analysis.get('semantic_keywords', [])
-            return {
+            result = {
                'primary_keywords': keyword_analysis.get('primary', []),
                'long_tail_keywords': keyword_analysis.get('long_tail', []),
                'semantic_keywords': semantic_keywords,
@@ -100,6 +100,30 @@ class BlogSEOMetadataGenerator:
                'target_audience': research_data.get('target_audience', 'general'),
                'industry': research_data.get('industry', 'general')
            }
            # Extract competitor analysis context
            competitor_analysis = research_data.get('competitor_analysis', {})
            if competitor_analysis:
                result['content_gaps'] = competitor_analysis.get('content_gaps', [])
                result['industry_leaders'] = competitor_analysis.get('industry_leaders', [])
                result['opportunities'] = competitor_analysis.get('opportunities', [])
                result['competitive_advantages'] = competitor_analysis.get('competitive_advantages', [])
            else:
                result['content_gaps'] = []
                result['industry_leaders'] = []
                result['opportunities'] = []
                result['competitive_advantages'] = []
            # Extract search queries
            search_queries = research_data.get('search_queries', [])
            result['search_queries'] = search_queries if isinstance(search_queries, list) else []
            # Extract suggested angles
            suggested_angles = research_data.get('suggested_angles', [])
            result['suggested_angles'] = suggested_angles if isinstance(suggested_angles, list) else []
            return result
        except Exception as e:
            logger.error(f"Failed to extract keywords from research: {e}")
            return {
@@ -109,7 +133,13 @@ class BlogSEOMetadataGenerator:
                'all_keywords': [],
                'search_intent': 'informational',
                'target_audience': 'general',
-                'industry': 'general'
+                'industry': 'general',
                'content_gaps': [],
                'industry_leaders': [],
                'opportunities': [],
                'competitive_advantages': [],
                'search_queries': [],
                'suggested_angles': []
            }
    async def _generate_core_metadata(
@@ -194,18 +224,20 @@ class BlogSEOMetadataGenerator:
            # Check if we got a valid response
            if not ai_response or not isinstance(ai_response, dict):
                logger.error("Core metadata generation failed: Invalid response from LLM")
-                # Return fallback response
+                # Return fallback response using content-derived values
-                primary_keywords = ', '.join(keywords_data.get('primary_keywords', ['content']))
+                primary_kw = keywords_data.get('primary_keywords', ['content'])
                primary_kw_first = primary_kw[0] if primary_kw else 'content'
                word_count = len(blog_content.split())
                slug = re.sub(r'[^a-z0-9]+', '-', blog_title.lower())[:50].strip('-')
                return {
                    'seo_title': blog_title,
-                    'meta_description': f'Learn about {primary_keywords.split(", ")[0] if primary_keywords else "this topic"}.',
+                    'meta_description': f'Discover insights about {primary_kw_first}. Comprehensive guide with practical tips and expert analysis.',
-                    'url_slug': blog_title.lower().replace(' ', '-').replace(':', '').replace(',', '')[:50],
+                    'url_slug': slug,
-                    'blog_tags': primary_keywords.split(', ') if primary_keywords else ['content'],
+                    'blog_tags': primary_kw[:5] if isinstance(primary_kw, list) else [primary_kw_first],
-                    'blog_categories': ['Content Marketing', 'Technology'],
+                    'blog_categories': [primary_kw_first.title(), 'Guide'],
-                    'social_hashtags': ['#content', '#marketing', '#technology'],
+                    'social_hashtags': [f'#{primary_kw_first.replace(" ", "")}', '#guide', '#tips'],
                    'reading_time': max(1, word_count // 200),
-                    'focus_keyword': primary_keywords.split(', ')[0] if primary_keywords else 'content'
+                    'focus_keyword': primary_kw_first
                }
            logger.info(f"Core metadata generation completed. Response keys: {list(ai_response.keys())}")
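The slug expression used in the fallback path can be tried on its own. The sketch below wraps it in a hypothetical `fallback_slug` function; the regular expression and the 50-character cut are taken from the diff, everything else is illustrative.

```python
import re


def fallback_slug(blog_title: str, max_len: int = 50) -> str:
    """Hypothetical wrapper around the slug expression from the fallback branch."""
    # Lowercase, collapse every run of non [a-z0-9] characters into one hyphen,
    # truncate, then trim hyphens left at either end by the truncation.
    return re.sub(r"[^a-z0-9]+", "-", blog_title.lower())[:max_len].strip("-")


print(fallback_slug("SEO Tips: A 2025 Guide!"))  # seo-tips-a-2025-guide
```

Two properties follow from the expression as written: truncation can cut a word in the middle, and non-ASCII letters are treated as separators (for example, "Café" becomes "caf").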
@@ -302,36 +334,41 @@ class BlogSEOMetadataGenerator:
            # Check if we got a valid response
            if not ai_response or not isinstance(ai_response, dict) or not ai_response.get('open_graph') or not ai_response.get('twitter_card') or not ai_response.get('json_ld_schema'):
                logger.error("Social metadata generation failed: Invalid or empty response from LLM")
-                # Return fallback response
+                # Return fallback response using content-derived values
                primary_kw = keywords_data.get('primary_keywords', ['content'])
                primary_kw_first = primary_kw[0] if primary_kw else 'content'
                slug = re.sub(r'[^a-z0-9]+', '-', blog_title.lower())[:50].strip('-')
                word_count = len(blog_content.split())
                current_date = datetime.now().isoformat()
                return {
                    'open_graph': {
                        'title': blog_title,
-                        'description': f'Learn about {keywords_data.get("primary_keywords", ["this topic"])[0] if keywords_data.get("primary_keywords") else "this topic"}.',
+                        'description': f'Discover insights about {primary_kw_first}. Comprehensive guide with practical tips.',
-                        'image': 'https://example.com/image.jpg',
+                        'image': '',
                        'type': 'article',
-                        'site_name': 'Your Website',
+                        'site_name': '',
-                        'url': 'https://example.com/blog'
+                        'url': f'https://example.com/blog/{slug}'
                    },
                    'twitter_card': {
                        'card': 'summary_large_image',
                        'title': blog_title,
-                        'description': f'Learn about {keywords_data.get("primary_keywords", ["this topic"])[0] if keywords_data.get("primary_keywords") else "this topic"}.',
+                        'description': f'Explore our guide on {primary_kw_first}.',
-                        'image': 'https://example.com/image.jpg',
+                        'image': '',
-                        'site': '@yourwebsite',
+                        'site': '',
-                        'creator': '@author'
+                        'creator': ''
                    },
                    'json_ld_schema': {
                        '@context': 'https://schema.org',
                        '@type': 'Article',
                        'headline': blog_title,
-                        'description': f'Learn about {keywords_data.get("primary_keywords", ["this topic"])[0] if keywords_data.get("primary_keywords") else "this topic"}.',
+                        'description': f'Comprehensive guide about {primary_kw_first}.',
-                        'author': {'@type': 'Person', 'name': 'Author Name'},
+                        'author': {'@type': 'Person', 'name': ''},
-                        'publisher': {'@type': 'Organization', 'name': 'Your Website'},
+                        'publisher': {'@type': 'Organization', 'name': ''},
-                        'datePublished': '2025-01-01T00:00:00Z',
+                        'datePublished': current_date,
-                        'dateModified': '2025-01-01T00:00:00Z',
+                        'dateModified': current_date,
-                        'mainEntityOfPage': 'https://example.com/blog',
+                        'mainEntityOfPage': f'https://example.com/blog/{slug}',
-                        'keywords': keywords_data.get('primary_keywords', ['content']),
+                        'keywords': primary_kw[:5] if isinstance(primary_kw, list) else [primary_kw_first],
-                        'wordCount': len(blog_content.split())
+                        'wordCount': word_count
                    }
                }
@@ -408,21 +445,53 @@ OUTLINE STRUCTURE:
 - Content hierarchy: Well-structured with {len(outline)} main sections
 """
-        # Extract SEO analysis insights
+        # Extract SEO analysis insights with weakness-aware guidance
        seo_context = ""
        if seo_analysis:
            overall_score = seo_analysis.get('overall_score', seo_analysis.get('seo_score', 0))
            category_scores = seo_analysis.get('category_scores', {})
-            applied_recs = seo_analysis.get('applied_recommendations', [])
+            applied_recs = seo_analysis.get('applied_recommendations') or []
            # Build weakness-specific guidance for metadata
            weakness_guidance = []
            kw_score = category_scores.get('keywords', category_scores.get('Keywords', 0))
            if kw_score < 70:
                weakness_guidance.append("Keyword optimization is weak — ensure title and description prominently feature primary keywords")
            read_score = category_scores.get('readability', category_scores.get('Readability', 0))
            if read_score < 70:
                weakness_guidance.append("Readability needs improvement — use clear, accessible language in the meta description")
            struct_score = category_scores.get('structure', category_scores.get('Structure', 0))
            if struct_score < 70:
                weakness_guidance.append("Content structure needs improvement — the title should clearly signal the content structure")
            seo_context = f"""
 SEO ANALYSIS RESULTS:
 - Overall SEO Score: {overall_score}/100
- Category Scores: Structure {category_scores.get('structure', category_scores.get('Structure', 0))}, Keywords {category_scores.get('keywords', category_scores.get('Keywords', 0))}, Readability {category_scores.get('readability', category_scores.get('Readability', 0))}
+- Category Scores: Structure {struct_score}, Keywords {kw_score}, Readability {read_score}
 - Applied Recommendations: {len(applied_recs)} SEO optimizations have been applied
 - Content Quality: Optimized for search engines with keyword focus
 {f"- WEAKNESS GUIDANCE: {'; '.join(weakness_guidance)}" if weakness_guidance else ""}
 """
        # Build research context block
        research_block = ""
        content_gaps = keywords_data.get('content_gaps', [])
        competitive_advantages = keywords_data.get('competitive_advantages', [])
        search_queries = keywords_data.get('search_queries', [])
        suggested_angles = keywords_data.get('suggested_angles', [])
        industry_leaders = keywords_data.get('industry_leaders', [])
        if content_gaps:
            research_block += f"\nCONTENT GAPS (from competitor analysis): {', '.join(content_gaps[:5])}"
        if competitive_advantages:
            research_block += f"\nOUR KEY DIFFERENTIATORS: {', '.join(competitive_advantages[:3])}"
        if search_queries:
            research_block += f"\nORIGINAL SEARCH QUERIES: {', '.join(search_queries[:5])}"
        if suggested_angles:
            research_block += f"\nCONTENT ANGLES: {', '.join(suggested_angles[:3])}"
        if industry_leaders:
            research_block += f"\nINDUSTRY LEADERS: {', '.join(industry_leaders[:3])}"
        # Get more content context (key sections instead of just first 1000 chars)
        content_preview = self._extract_content_highlights(blog_content)
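The research-context block is assembled the same way in the analyzer prompt, the core-metadata prompt, and the social-metadata prompt, with slightly different labels and limits. One possible way to express that pattern as a table-driven helper is sketched below; `RESEARCH_FIELDS` and `build_research_block` are hypothetical names, and the labels and limits shown are the ones from the core-metadata hunk above.

```python
from typing import Any, Dict, List, Tuple

# (label, key in keywords_data, max items shown); values from the core-metadata prompt.
RESEARCH_FIELDS: List[Tuple[str, str, int]] = [
    ("CONTENT GAPS (from competitor analysis)", "content_gaps", 5),
    ("OUR KEY DIFFERENTIATORS", "competitive_advantages", 3),
    ("ORIGINAL SEARCH QUERIES", "search_queries", 5),
    ("CONTENT ANGLES", "suggested_angles", 3),
    ("INDUSTRY LEADERS", "industry_leaders", 3),
]


def build_research_block(keywords_data: Dict[str, Any]) -> str:
    """Hypothetical helper: one '\\nLABEL: a, b, c' line per non-empty research field."""
    block = ""
    for label, key, limit in RESEARCH_FIELDS:
        values = keywords_data.get(key) or []
        if isinstance(values, list) and values:
            # str() guards against non-string items; this is an added assumption.
            items = ", ".join(str(v) for v in values[:limit])
            block += f"\n{label}: {items}"
    return block
```

Empty or missing fields contribute nothing, so the prompt carries no blank headings when research data is absent.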
@@ -443,6 +512,7 @@ SEMANTIC KEYWORDS: {semantic_keywords}
 SEARCH INTENT: {search_intent}
 TARGET AUDIENCE: {target_audience}
 INDUSTRY: {industry}
 {research_block}
 {seo_context}
@@ -525,6 +595,18 @@ Generate metadata that is personalized, compelling, and SEO-optimized.
            overall_score = seo_analysis.get('overall_score', seo_analysis.get('seo_score', 0))
            seo_context = f"\nSEO SCORE: {overall_score}/100 (optimized content)\n"
        # Build research context for social metadata
        research_block = ""
        content_gaps = keywords_data.get('content_gaps', [])
        competitive_advantages = keywords_data.get('competitive_advantages', [])
        search_queries = keywords_data.get('search_queries', [])
        if content_gaps:
            research_block += f"\nCONTENT GAPS: {', '.join(content_gaps[:3])}"
        if competitive_advantages:
            research_block += f"\nDIFFERENTIATORS: {', '.join(competitive_advantages[:3])}"
        if search_queries:
            research_block += f"\nSEARCH QUERIES: {', '.join(search_queries[:4])}"
        content_preview = self._extract_content_highlights(blog_content, 1500)
        prompt = f"""
@@ -539,6 +621,7 @@ KEYWORDS: {primary_keywords}
 TARGET AUDIENCE: {target_audience}
 INDUSTRY: {industry}
 CURRENT DATE: {current_date}
 {research_block}
 === GENERATION REQUIREMENTS ===
@@ -551,20 +634,20 @@ CURRENT DATE: {current_date}
   - url: Generate canonical URL structure
 2. TWITTER CARD:
-   - card: "summary_large_image"
+    - card: "summary_large_image"
-   - title: 70 chars max, optimized for Twitter audience
+    - title: 70 chars max, optimized for Twitter audience
-   - description: 200 chars max with relevant hashtags inline
+    - description: 200 chars max with relevant hashtags inline
-   - image: Match Open Graph image
+    - image: Match Open Graph image
-   - site: @yourwebsite (placeholder, user should update)
+    - site: Leave empty string (user will add their Twitter handle)
-   - creator: @author (placeholder, user should update)
+    - creator: Leave empty string (user will add author Twitter handle)
 3. JSON-LD SCHEMA (Article):
-   - @context: "https://schema.org"
+    - @context: "https://schema.org"
-   - @type: "Article"
+    - @type: "Article"
-   - headline: Article title (optimized)
+    - headline: Article title (optimized)
-   - description: Article description (150-200 chars)
+    - description: Article description (150-200 chars)
-   - author: {{"@type": "Person", "name": "Author Name"}} (placeholder)
+    - author: {{"@type": "Person", "name": ""}} (leave empty, user will add author name)
-   - publisher: {{"@type": "Organization", "name": "Site Name", "logo": {{"@type": "ImageObject", "url": "logo-url"}}}}
+    - publisher: {{"@type": "Organization", "name": ""}} (leave empty, user will add site name)
   - datePublished: {current_date}
   - dateModified: {current_date}
   - mainEntityOfPage: {{"@type": "WebPage", "@id": "canonical-url"}}
@@ -633,35 +716,109 @@ Make it engaging, personalized for {target_audience}, and optimized for {industr
            raise e
    def _calculate_optimization_score(self, core_metadata: Dict[str, Any], social_metadata: Dict[str, Any]) -> int:
-        """Calculate overall optimization score for the generated metadata"""
+        """Calculate metadata quality score based on content relevance and adherence to best practices.
        Unlike the old completeness-based score (which just checked field existence),
        this assigns quality-weighted points based on how well each field is optimized.
        """
        try:
            score = 0
-            # Check core metadata completeness
+            # Title quality (0-15): Length in 50-60 chars is optimal
-            if core_metadata.get('seo_title'):
+            seo_title = core_metadata.get('seo_title', '')
-                score += 15
+            if seo_title:
-            if core_metadata.get('meta_description'):
+                title_len = len(seo_title)
-                score += 15
+                if 50 <= title_len <= 60:
-            if core_metadata.get('url_slug'):
+                    score += 15
-                score += 10
+                elif 40 <= title_len <= 70:
-            if core_metadata.get('blog_tags'):
+                    score += 10
-                score += 10
+                elif title_len > 0:
-            if core_metadata.get('blog_categories'):
+                    score += 5
                score += 10
            if core_metadata.get('social_hashtags'):
                score += 10
            if core_metadata.get('focus_keyword'):
                score += 10
-            # Check social metadata completeness
+            # Meta description quality (0-15): Length in 150-160 chars is optimal, has CTA
-            if social_metadata.get('open_graph'):
+            meta_desc = core_metadata.get('meta_description', '')
            if meta_desc:
                desc_len = len(meta_desc)
                desc_lower = meta_desc.lower()
                has_cta = any(phrase in desc_lower for phrase in ['learn', 'discover', 'find', 'get', 'explore', 'how to', 'why', 'tips', 'guide', 'try', 'start'])
                if 150 <= desc_len <= 160 and has_cta:
                    score += 15
                elif 120 <= desc_len <= 170:
                    score += 10 if has_cta else 7
                elif desc_len > 0:
                    score += 4
            # URL slug quality (0-10): 2-5 hyphen-separated words in the final path segment is optimal
            url_slug = core_metadata.get('url_slug', '')
            if url_slug:
                slug_parts = url_slug.strip('/').split('/')
                slug_words = slug_parts[-1].split('-') if slug_parts else []
                if 2 <= len(slug_words) <= 5:
                    score += 10
                elif len(slug_words) > 0:
                    score += 5
            # Tags and categories quality (0-20)
            blog_tags = core_metadata.get('blog_tags', [])
            blog_categories = core_metadata.get('blog_categories', [])
            if blog_tags and len(blog_tags) >= 3:
                score += 10
-            if social_metadata.get('twitter_card'):
+            elif blog_tags:
                score += 5
-            if social_metadata.get('json_ld_schema'):
+            # NOTE: any non-empty list satisfies len >= 1, so the elif branch below never runs
+            if blog_categories and len(blog_categories) >= 1:
                score += 10
            elif blog_categories:
                score += 5
-            return min(score, 100)  # Cap at 100
+            # Social hashtags (0-10): 3-8 hashtags is optimal (enough coverage without looking spammy)
            social_hashtags = core_metadata.get('social_hashtags', [])
            if social_hashtags and 3 <= len(social_hashtags) <= 8:
                score += 10
            elif social_hashtags:
                score += 5
            # Focus keyword (0-10): full points only when it appears in the SEO title
            focus_keyword = core_metadata.get('focus_keyword', '')
            if focus_keyword and seo_title and focus_keyword.lower() in seo_title.lower():
                score += 10
            elif focus_keyword:
                score += 4
            # Open Graph quality (0-10): Has title, description, correct type
            og = social_metadata.get('open_graph', {})
            if og:
                og_score = 0
                if og.get('title') and len(og.get('title', '')) > 10:
                    og_score += 4
                if og.get('description') and 100 <= len(og.get('description', '')) <= 200:
                    og_score += 4
                if og.get('type') == 'article':
                    og_score += 2
                score += og_score
            # Twitter Card quality (0-5)
            twitter = social_metadata.get('twitter_card', {})
            if twitter:
                tw_score = 0
                if twitter.get('title') and len(twitter.get('title', '')) > 10:
                    tw_score += 3
                if twitter.get('card') == 'summary_large_image':
                    tw_score += 2
                score += tw_score
            # JSON-LD quality (0-5): Has headline, description, datePublished
            json_ld = social_metadata.get('json_ld_schema', {})
            if json_ld:
                jl_score = 0
                if json_ld.get('headline'):
                    jl_score += 2
                if json_ld.get('description'):
                    jl_score += 2
                if json_ld.get('datePublished'):
                    jl_score += 1
                score += jl_score
            return min(score, 100)
        except Exception as e:
            logger.error(f"Failed to calculate optimization score: {e}")
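The Open Graph, Twitter Card, and JSON-LD sub-scores in the hunk above can be tried in isolation. The following is a minimal sketch, not the project's code: the function name `score_social_metadata` is hypothetical, and it assumes the three metadata blocks are plain dicts shaped as the diff suggests. The point values are copied from the hunk (up to 10 + 5 + 5).

```python
from typing import Any, Dict


def score_social_metadata(social_metadata: Dict[str, Any]) -> int:
    """Hypothetical helper mirroring the OG / Twitter / JSON-LD rules in the diff."""
    score = 0

    # Open Graph quality (0-10)
    og = social_metadata.get("open_graph") or {}
    if og.get("title") and len(og["title"]) > 10:
        score += 4
    if og.get("description") and 100 <= len(og["description"]) <= 200:
        score += 4
    if og.get("type") == "article":
        score += 2

    # Twitter Card quality (0-5)
    twitter = social_metadata.get("twitter_card") or {}
    if twitter.get("title") and len(twitter["title"]) > 10:
        score += 3
    if twitter.get("card") == "summary_large_image":
        score += 2

    # JSON-LD quality (0-5)
    json_ld = social_metadata.get("json_ld_schema") or {}
    if json_ld.get("headline"):
        score += 2
    if json_ld.get("description"):
        score += 2
    if json_ld.get("datePublished"):
        score += 1

    return score


full = {
    "open_graph": {"title": "A sufficiently long title", "description": "x" * 150, "type": "article"},
    "twitter_card": {"title": "A sufficiently long title", "card": "summary_large_image"},
    "json_ld_schema": {"headline": "h", "description": "d", "datePublished": "2026-06-15"},
}
full_score = score_social_metadata(full)
empty_score = score_social_metadata({})
```

Scoring each block on its fields rather than on mere presence means a metadata block that exists but is empty contributes nothing.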
--- a/backend/services/blog_writer/seo/blog_seo_recommendation_applier.py
+++ b/backend/services/blog_writer/seo/blog_seo_recommendation_applier.py
@@ -2,6 +2,13 @@
 Applies actionable SEO recommendations to existing blog content using the
 provider-agnostic `llm_text_gen` dispatcher. Ensures GPT_PROVIDER parity.
 Key design principles:
 - Make TARGETED edits, not full rewrites
 - Preserve existing content structure and factual claims
 - Only modify sections that have applicable recommendations
 - Never fabricate statistics, case studies, or citations
 - Ground changes in research sources when available
 """
 import asyncio
@@ -15,7 +22,7 @@ logger = get_service_logger("blog_seo_recommendation_applier")
 class BlogSEORecommendationApplier:
-    """Apply actionable SEO recommendations to blog content."""
+    """Apply actionable SEO recommendations to blog content with targeted edits."""
    def __init__(self):
        logger.debug("Initialized BlogSEORecommendationApplier")
@@ -35,6 +42,7 @@ class BlogSEORecommendationApplier:
        persona = payload.get("persona", {})
        tone = payload.get("tone")
        audience = payload.get("audience")
        competitive_advantage = payload.get("competitive_advantage", "")
        if not sections:
            return {"success": False, "error": "No sections provided for recommendation application"}
@@ -43,16 +51,21 @@ class BlogSEORecommendationApplier:
            logger.warning("apply_recommendations called without recommendations")
            return {"success": True, "title": title, "sections": sections, "applied": []}
        # Determine which sections actually need changes based on recommendations
        sections_to_edit = self._identify_affected_sections(sections, recommendations)
        prompt = self._build_prompt(
            title=title,
            introduction=introduction,
            sections=sections,
            sections_to_edit=sections_to_edit,
            outline=outline,
            research=research,
            recommendations=recommendations,
            persona=persona,
            tone=tone,
            audience=audience,
            competitive_advantage=competitive_advantage,
        )
        schema = {
@@ -87,14 +100,14 @@ class BlogSEORecommendationApplier:
            "required": ["sections"],
        }
-        logger.info("Applying SEO recommendations via llm_text_gen")
+        logger.info("Applying SEO recommendations via llm_text_gen (targeted edit mode)")
        result = await asyncio.to_thread(
            llm_text_gen,
            prompt,
            None,
            schema,
-            user_id,  # Pass user_id for subscription checking
+            user_id,
            max_tokens=8192,
        )
@@ -106,14 +119,12 @@ class BlogSEORecommendationApplier:
        raw_sections = result.get("sections", []) or []
        normalized_sections: List[Dict[str, Any]] = []
        # Warn if the LLM returned a different number of sections (e.g. it added an intro/conclusion as new sections)
        if len(raw_sections) != len(sections):
            logger.warning(
                f"LLM returned {len(raw_sections)} sections but {len(sections)} were sent. "
                "Extra sections will be ignored; missing sections fall back to original content."
            )
        # Build lookup table from updated sections using their identifiers
        updated_map: Dict[str, Dict[str, Any]] = {}
        for updated in raw_sections:
            section_id = str(
@@ -156,7 +167,6 @@ class BlogSEORecommendationApplier:
            mapped = updated_map.get(fallback_id)
            if not mapped and raw_sections:
                # Fall back to positional match if identifier lookup failed
                candidate = raw_sections[index] if index < len(raw_sections) else {}
                heading = (
                    candidate.get("heading")
@@ -176,7 +186,6 @@ class BlogSEORecommendationApplier:
                }
            if not mapped:
                # Fallback to original content if nothing else available
                mapped = {
                    "id": fallback_id,
                    "heading": original.get("heading") or original.get("title") or f"Section {index + 1}",
@@ -190,12 +199,11 @@ class BlogSEORecommendationApplier:
        logger.info("SEO recommendations applied successfully")
        # Extract updated introduction from LLM response if available
        updated_introduction = result.get("introduction") or ""
        if updated_introduction and updated_introduction != introduction:
            logger.info(f"Introduction updated: {len(updated_introduction)} chars")
        elif not updated_introduction:
-            updated_introduction = introduction  # fall back to original
+            updated_introduction = introduction
        return {
            "success": True,
@@ -205,37 +213,133 @@ class BlogSEORecommendationApplier:
            "applied": applied,
        }
    def _identify_affected_sections(self, sections: List[Dict[str, Any]], recommendations: List[Dict[str, Any]]) -> List[str]:
        """Identify which section IDs are likely affected by the recommendations.
        Maps recommendation categories to section headings for targeted editing.
        Returns a list of section IDs that should be edited.
        """
        affected_ids = set()
        for rec in recommendations:
            category = (rec.get("category") or "").lower()
            rec_text = (rec.get("recommendation") or "").lower()
            # Structure recommendations affect first/last sections or all sections
            if category == "structure":
                if sections:
                    affected_ids.add(str(sections[0].get("id", "section_1")))
                    affected_ids.add(str(sections[-1].get("id", f"section_{len(sections)}")))
                    # Recommendations mentioning "more section", "combine", or "flow" affect every section
                    if "more section" in rec_text or "combine" in rec_text or "flow" in rec_text:
                        for s in sections:
                            affected_ids.add(str(s.get("id", "")))
                    continue
            # Keyword recommendations affect all sections (keywords should be spread)
            if category == "keywords":
                for s in sections:
                    affected_ids.add(str(s.get("id", "")))
                continue
            # Readability affects all sections
            if category == "readability":
                for s in sections:
                    affected_ids.add(str(s.get("id", "")))
                continue
            # Content quality — try to match recommendation to specific section headings
            if category in ("content quality", "content", "seo"):
                heading_keywords = {
                    s.get("heading", "").lower(): str(s.get("id", ""))
                    for s in sections
                }
                matched = False
                for heading_lower, section_id in heading_keywords.items():
                    rec_words = rec_text.split()
                    if any(word in heading_lower for word in rec_words if len(word) > 3):
                        affected_ids.add(section_id)
                        matched = True
                if not matched:
                    # Affect first and last sections (intro/conclusion) as common targets
                    if sections:
                        affected_ids.add(str(sections[0].get("id", "section_1")))
                        affected_ids.add(str(sections[-1].get("id", f"section_{len(sections)}")))
        # Filter out empty IDs and return
        return [sid for sid in affected_ids if sid]
    def _build_prompt(
        self,
        *,
        title: str,
        introduction: str,
        sections: List[Dict[str, Any]],
        sections_to_edit: List[str],
        outline: List[Dict[str, Any]],
        research: Dict[str, Any],
        recommendations: List[Dict[str, Any]],
        persona: Dict[str, Any],
        tone: str | None,
        audience: str | None,
        competitive_advantage: str = "",
    ) -> str:
-        """Construct prompt for applying recommendations."""
+        """Construct prompt for applying targeted recommendations."""
-        sections_str = []
+        # Build research context block
        research_block = ""
        keyword_analysis = research.get("keyword_analysis", {}) if research else {}
        primary_keywords = ", ".join(keyword_analysis.get("primary", [])[:8]) or "None"
        competitor_analysis = research.get("competitor_analysis", {}) if research else {}
        search_queries = research.get("search_queries", []) if research else []
        suggested_angles = research.get("suggested_angles", []) if research else []
        content_gaps = competitor_analysis.get("content_gaps", []) if competitor_analysis else []
        competitive_advantages = competitor_analysis.get("competitive_advantages", []) if competitor_analysis else []
        research_block += f"\nPRIMARY KEYWORDS: {primary_keywords}"
        if content_gaps:
            research_block += f"\nCONTENT GAPS (address these in your edits): {', '.join(content_gaps[:5])}"
        if competitive_advantages:
            research_block += f"\nKEY DIFFERENTIATORS (emphasize these): {', '.join(competitive_advantages[:3])}"
        if competitive_advantage:
            research_block += f"\nPRIMARY ADVANTAGE: {competitive_advantage}"
        if search_queries:
            research_block += f"\nTARGET SEARCH QUERIES: {', '.join(search_queries[:5])}"
        if suggested_angles:
            research_block += f"\nCONTENT ANGLES: {', '.join(suggested_angles[:3])}"
        # Build per-section content with edit markers
        sections_content = []
        for section in sections:
-            sections_str.append(
+            section_id = str(section.get("id", "section"))
-                f"ID: {section.get('id', 'section')}, Heading: {section.get('heading', 'Untitled')}\n"
+            heading = section.get("heading", "Untitled")
-                f"Current Content:\n{section.get('content', '')}\n"
+            content = section.get("content", "")
-            )
+            needs_edit = section_id in sections_to_edit
            section_text = f"--- SECTION (ID: {section_id}, Heading: \"{heading}\")"
            if needs_edit:
                section_text += " [NEEDS EDITS based on recommendations]"
            else:
                section_text += " [KEEP AS-IS - no changes needed]"
            section_text += f" ---\n{content}\n"
            sections_content.append(section_text)
        sections_str = "\n\n".join(sections_content)
-        outline_str = "\n".join(
+        # Build outline with subheadings and key points
-            [
+        outline_parts = []
-                f"- {item.get('heading', 'Section')} (Target words: {item.get('target_words', 'N/A')})"
+        for item in outline:
-                for item in outline
+            heading = item.get("heading", "Section")
-            ]
+            target_words = item.get("target_words", "N/A")
-        )
+            subheadings = item.get("subheadings", [])
-
+            key_points = item.get("key_points", [])
-        research_summary = research.get("keyword_analysis", {}) if research else {}
+            line = f"- {heading} (Target: {target_words} words)"
-        primary_keywords = ", ".join(research_summary.get("primary", [])[:10]) or "None"
+            if subheadings:
                line += f" | Subheadings: {', '.join(subheadings[:4])}"
            if key_points:
                line += f" | Key points: {', '.join(key_points[:4])}"
            outline_parts.append(line)
        outline_str = "\n".join(outline_parts) if outline_parts else "No outline supplied"
        recommendations_str = []
        for rec in recommendations:
@@ -248,7 +352,7 @@ class BlogSEORecommendationApplier:
        persona_str = (
            f"Persona: {persona}\n"
            if persona
-            else "Persona: (not provided)\n"
+            else ""
        )
        style_guidance = []
@@ -258,44 +362,47 @@ class BlogSEORecommendationApplier:
            style_guidance.append(f"Target audience: {audience}")
        style_str = "\n".join(style_guidance) if style_guidance else "Maintain current tone and audience alignment."
-        prompt = f"""
+        intro_text = introduction if introduction else "(No introduction currently — write one ONLY if a recommendation specifically asks for it)"
 You are an expert SEO content strategist. Update the blog content to apply the actionable recommendations.
-Current Title: {title}
+        prompt = f"""You are a careful SEO content editor making TARGETED edits to an existing blog post. Your job is to apply specific SEO recommendations with PRECISION — not to rewrite the entire post.
-Current Introduction:
+CRITICAL RULES — YOU MUST FOLLOW THESE:
-{introduction if introduction else '(No introduction exists — write a compelling one if the recommendations require it)'}
+1. PRESERVE existing content. Only make MINIMAL, targeted changes to address specific recommendations. Do NOT rewrite sections that are working well.
 2. NEVER fabricate statistics, case studies, expert quotes, research data, or specific numbers unless they are explicitly stated in the research context below.
 3. NEVER add content that contradicts or goes beyond what the research sources support.
 4. KEEP the same emotional tone and writing style as the original content.
 5. Return EXACTLY the same number of sections with EXACTLY the same IDs. Do NOT add, remove, or rename sections.
 6. For sections marked [KEEP AS-IS], return the content UNCHANGED — copy it verbatim.
 7. For sections marked [NEEDS EDITS], make ONLY the specific changes needed to address the applicable recommendations.
 8. Do NOT add introductions, conclusions, or case studies unless a recommendation EXPLICITLY asks for one.
-Primary Keywords (for context): {primary_keywords}
+{research_block}
-Outline Overview:
+PLANNED OUTLINE STRUCTURE:
-{outline_str or 'No outline supplied'}
+{outline_str}
-Existing Sections:
+CURRENT TITLE: {title}
 {''.join(sections_str)}
-Actionable Recommendations to Apply:
+CURRENT INTRODUCTION:
 {intro_text}
 CURRENT SECTIONS:
 {sections_str}
 RECOMMENDATIONS TO APPLY:
 {''.join(recommendations_str)}
 {persona_str}{style_str}
-{persona_str}
+INSTRUCTIONS:
-{style_str}
+- For sections marked [KEEP AS-IS]: Copy the content EXACTLY as provided. Do not change a single word.
-
+- For sections marked [NEEDS EDITS]: Make the MINIMUM changes needed to address the recommendations. If a recommendation says "add transition words", add 2-3 transitions — do not rewrite the paragraph. If it says "use more varied vocabulary", replace 2-3 repetitive words — do not rewrite the section.
-Instructions:
+- If a recommendation asks for an introduction and none exists, write a brief 2-3 sentence introduction that naturally leads into the first section. Do NOT fabricate hooks or statistics.
-1. Carefully apply the recommendations while preserving factual accuracy and research alignment.
+- If a recommendation asks for a conclusion, append 2-3 sentences summarizing key takeaways to the LAST section. Do NOT fabricate conclusions that don't follow from the actual content.
-2. You MUST return EXACTLY the same number of sections, with EXACTLY the same IDs as provided above. Do NOT add or remove sections.
+- Return ALL sections, including the ones you did NOT change.
-3. If a recommendation says content is MISSING (e.g. missing introduction or conclusion), incorporate that missing content into the MOST APPROPRIATE existing section:
+- Provide a summary of which recommendations you addressed and what specific changes you made.
   - Missing introduction → PREPEND introductory content to the FIRST section's existing content.
   - Missing conclusion → APPEND concluding content to the LAST section's existing content.
   - For other missing content, add it to the section whose heading best matches the recommendation.
 4. Additionally, if an introduction is missing or weak, write a compelling introduction in the "introduction" field of your response. If the current introduction is adequate, return it unchanged.
 5. Improve clarity, flow, and SEO optimization per the guidance.
 6. Return updated sections in the requested JSON format.
 7. Provide a short summary of which recommendations were addressed.
 """
        return prompt
-__all__ = ["BlogSEORecommendationApplier"]
+__all__ = ["BlogSEORecommendationApplier"]
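The warning in this file says extra sections are ignored and missing sections fall back to the original content. A simplified sketch of that merge order (identifier match, then positional match, then original) is shown here. The function name `merge_sections` and the reduced field set are assumptions for illustration; the real method handles more fields and the diff elides parts of it.

```python
from typing import Any, Dict, List


def merge_sections(
    original: List[Dict[str, Any]], returned: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Hypothetical, simplified merge: ID match, then position, then original."""
    by_id = {str(s["id"]): s for s in returned if s.get("id") is not None}
    merged: List[Dict[str, Any]] = []
    for index, orig in enumerate(original):
        section_id = str(orig.get("id") or f"section_{index + 1}")
        candidate = by_id.get(section_id)
        if candidate is None and index < len(returned):
            # Fall back to positional match if identifier lookup failed
            candidate = returned[index]
        candidate = candidate or {}
        merged.append(
            {
                "id": section_id,  # always keep the original ID
                "heading": candidate.get("heading") or orig.get("heading") or f"Section {index + 1}",
                "content": candidate.get("content") or orig.get("content", ""),
            }
        )
    return merged


original = [
    {"id": "s1", "heading": "Intro", "content": "old1"},
    {"id": "s2", "heading": "Body", "content": "old2"},
    {"id": "s3", "heading": "Wrap-up", "content": "old3"},
]
returned = [
    {"id": "s1", "heading": "Intro", "content": "new1"},
    {"id": "renamed", "content": "new2"},  # ID changed by the model
]
merged = merge_sections(original, returned)
```

Iterating over the original list rather than the returned one is what guarantees the output always has the same length and IDs as the input, whatever the model sends back.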
--- a/backend/services/database.py
+++ b/backend/services/database.py
@@ -36,6 +36,8 @@ from models.podcast_models import PodcastProject
 from models.research_models import ResearchProject
 # Video Studio models
 from models.video_models import VideoGenerationTask
 # YouTube Creator task models
 from models.youtube_task_models import YouTubeVideoTask
 # Bing Analytics models
 from models.bing_analytics_models import Base as BingAnalyticsBase
--- a/backend/services/gsc_brainstorm_service.py
+++ b/backend/services/gsc_brainstorm_service.py
@@ -47,6 +47,10 @@ class GSCBrainstormService:
        if not site_url:
            sites = self.gsc_service.get_site_list(user_id)
            if not sites:
                logger.info(f"No GSC sites found for user {user_id} — falling back to AI-only brainstorm")
                fallback = self._generate_ai_only_brainstorm(user_id, keywords, None, None, None)
                if fallback:
                    return fallback
                return {
                    "error": "No GSC sites found. Make sure your site is verified in Google Search Console.",
                    "content_opportunities": [],
@@ -70,6 +74,10 @@ class GSCBrainstormService:
        )
        if "error" in analytics:
            logger.info(f"GSC analytics error for user {user_id}: {analytics.get('error')} — falling back to AI-only brainstorm")
            fallback = self._generate_ai_only_brainstorm(user_id, keywords, site_url, start_date, end_date)
            if fallback:
                return fallback
            return {
                "error": analytics.get("error", "Failed to fetch GSC data"),
                "content_opportunities": [],
@@ -88,6 +96,10 @@ class GSCBrainstormService:
        pages_data = self._parse_page_rows(page_rows)
        if not keywords_data:
            logger.info(f"No GSC keyword data for user {user_id} — falling back to AI-only brainstorm")
            fallback = self._generate_ai_only_brainstorm(user_id, keywords, site_url, start_date, end_date)
            if fallback:
                return fallback
            return {
                "error": "No keyword data available for the selected period. This usually means your site is new to GSC or hasn't received search traffic yet.",
                "content_opportunities": [],
@@ -110,6 +122,10 @@ class GSCBrainstormService:
        logger.info(f"After topic filter: {len(keywords_data)} keywords, {len(pages_data)} pages")
        if not keywords_data:
            logger.info(f"No GSC keywords matched topic '{keywords}' for user {user_id} — falling back to AI-only brainstorm")
            fallback = self._generate_ai_only_brainstorm(user_id, keywords, site_url, start_date, end_date)
            if fallback:
                return fallback
            return {
                "error": "No GSC keywords matched your topic. Try a broader research topic or check your GSC data.",
                "content_opportunities": [],
@@ -155,6 +171,128 @@ class GSCBrainstormService:
            "summary": summary,
        }
    # ------------------------------------------------------------------ #
    #  AI-only fallback (when GSC has no data)
    # ------------------------------------------------------------------ #
    def _generate_ai_only_brainstorm(
        self,
        user_id: str,
        keywords: str,
        site_url: Optional[str],
        start_date: Optional[str],
        end_date: Optional[str],
    ) -> Optional[Dict[str, Any]]:
        """
        Generate topic ideas using AI alone when GSC data is unavailable.
        Returns a brainstorm-shaped result with empty GSC-specific arrays
        but populated ai_recommendations.
        """
        try:
            prompt = f"""You are an expert content strategist helping a blog writer brainstorm topic ideas.
 The user is interested in writing about: "{keywords}"
 Since their website is new or at an early stage, there is no Google Search Console data available yet.
 Generate compelling blog post ideas they can write RIGHT NOW to start building traffic.
 For each suggestion include:
 1. A specific, compelling blog post TITLE (not a vague topic)
 2. The primary keyword it should target
 3. Why this topic will perform well (search demand, competition level, timing)
 4. The recommended content format (how-to, listicle, comparison, pillar page, etc.)
 5. Estimated difficulty level (Easy / Medium / Hard)
 Return your response in this EXACT JSON format (no markdown, no code fences):
 {{
  "immediate_opportunities": [
    {{
      "title": "Specific Blog Post Title",
      "keyword": "primary target keyword",
      "reason": "Why this will perform well",
      "format": "How-To Guide | Listicle | Comparison | Pillar Page | etc.",
      "estimated_impact": "Beginner-friendly traffic opportunity"
    }}
  ],
  "content_strategy": [
    {{
      "title": "Pillar Content Title",
      "keyword": "target keyword",
      "reason": "Strategic importance for building topical authority",
      "format": "Pillar Page | Ultimate Guide | Resource",
      "estimated_impact": "Foundation for long-term organic growth"
    }}
  ],
  "long_term_strategy": [
    {{
      "title": "Authority Building Title",
      "keyword": "target keyword",
      "reason": "Establishes expertise and captures high-intent traffic over time",
      "format": "Research-Backed Analysis | Expert Roundup | Original Study",
      "estimated_impact": "Compound traffic growth over 6-12 months"
    }}
  ]
 }}
 IMPORTANT:
 - Provide 3-5 items in each category
 - All suggestions MUST relate to the user's interest in "{keywords}"
 - Titles should be specific, compelling, and SEO-aware
 - Prioritize topics with clear search intent and realistic ranking potential for a new site
 - Include a mix of easy wins (long-tail, low competition) and strategic pillar content
 - For estimated_impact, describe the opportunity type (not click numbers since we lack data)"""
            system_prompt = (
                "You are an expert content strategist specializing in SEO and blog topic generation. "
                "You help new websites identify high-potential content topics even without search console data. "
                "You always respond with valid JSON matching the requested format exactly."
            )
            result = llm_text_gen(
                prompt=prompt,
                system_prompt=system_prompt,
                user_id=user_id,
                flow_type="gsc_brainstorm_fallback",
            )
            if result:
                parsed = self._parse_ai_response(result)
                if parsed:
                    return {
                        "content_opportunities": [],
                        "keyword_gaps": [],
                        "quick_wins": [],
                        "page_opportunities": [],
                        "ai_recommendations": parsed,
                        "summary": {
                            "site_url": site_url or "",
                            "date_range": {
                                "start": start_date or "",
                                "end": end_date or "",
                            },
                            "total_keywords_analyzed": 0,
                            "total_impressions": 0,
                            "total_clicks": 0,
                            "avg_ctr": 0,
                            "avg_position": 0,
                            "ctr_vs_benchmark": 0,
                            "health_score": 0,
                            "keyword_distribution": {
                                "positions_1_3": 0,
                                "positions_4_10": 0,
                                "positions_11_20": 0,
                                "positions_21_plus": 0,
                            },
                            "top_keywords": [],
                            "top_pages": [],
                            "note": "AI-generated suggestions based on your topic. No GSC data was available — these are strategic recommendations, not data-driven insights."
                        },
                    }
        except Exception as e:
            logger.warning(f"AI-only brainstorm fallback failed for user {user_id}: {e}")
        return None
    # ------------------------------------------------------------------ #
    #  Data parsing helpers
    # ------------------------------------------------------------------ #
--- a/backend/services/gsc_service.py
+++ b/backend/services/gsc_service.py
@@ -188,7 +188,6 @@ class GSCService:
            with sqlite3.connect(db_path) as conn:
                cursor = conn.cursor()
                # Check if table exists first to avoid error on fresh DB
                cursor.execute("SELECT name FROM sqlite_master WHERE type='table' AND name='gsc_credentials'")
                if not cursor.fetchone():
                    return None
@@ -204,7 +203,6 @@ class GSCService:
                credentials_data = json.loads(result[0])
                # Check for required fields, but allow connection without refresh token
                required_fields = ['token_uri', 'client_id', 'client_secret']
                missing_fields = [field for field in required_fields if not credentials_data.get(field)]
@@ -214,7 +212,6 @@ class GSCService:
                credentials = Credentials.from_authorized_user_info(credentials_data, self.scopes)
                # Refresh token if needed and possible
                if credentials.expired:
                    if credentials.refresh_token:
                        try:
@@ -222,9 +219,11 @@ class GSCService:
                            self.save_user_credentials(user_id, credentials)
                        except Exception as e:
                            logger.error(f"Failed to refresh GSC token for user {user_id}: {e}")
                            self.clear_incomplete_credentials(user_id)
                            return None
                    else:
                        logger.warning(f"GSC token expired for user {user_id} but no refresh token available - user needs to re-authorize")
                        self.clear_incomplete_credentials(user_id)
                        return None
                return credentials
@@ -288,7 +287,6 @@ class GSCService:
        try:
            logger.info(f"Handling GSC OAuth callback with state: {state[:20]}...")
            # Extract user_id from state
            if ':' not in state:
                logger.error(f"Invalid GSC state format: {state}")
                return False
@@ -300,17 +298,19 @@ class GSCService:
                logger.error(f"User database not found for user {user_id}")
                return False
-            # Verify state in user's DB (but don't delete yet — delete after successful token exchange)
+            # Verify state in user's DB (best effort — if missing, attempt code exchange anyway)
-            with sqlite3.connect(db_path) as conn:
+            state_valid = False
-                cursor = conn.cursor()
+            try:
-                cursor.execute('SELECT user_id FROM gsc_oauth_states WHERE state = ?', (state,))
+                with sqlite3.connect(db_path) as conn:
-                result = cursor.fetchone()
+                    cursor = conn.cursor()
-                
+                    cursor.execute('SELECT user_id FROM gsc_oauth_states WHERE state = ?', (state,))
-                if not result:
+                    state_valid = cursor.fetchone() is not None
-                    logger.error(f"Invalid or expired GSC OAuth state for user {user_id}")
+            except Exception as state_err:
-                    return False
+                logger.warning(f"State verification query failed, proceeding anyway: {state_err}")
-            
+
-            # Exchange code for credentials
+            if not state_valid:
                logger.warning(f"GSC OAuth state not found in DB for user {user_id} — will attempt code exchange without state verification")
            if not self.client_config:
                logger.error("Cannot handle callback: Client configuration not loaded")
                return False
@@ -324,21 +324,30 @@ class GSCService:
            flow.fetch_token(code=authorization_code)
            credentials = flow.credentials
            if not credentials or not credentials.token:
                logger.error(f"Token exchange returned empty credentials for user {user_id}")
                return False
-            # State consumed successfully — clean up
+            # Clean up state if it was valid
-            try:
+            if state_valid:
-                with sqlite3.connect(db_path) as conn:
+                try:
-                    cursor = conn.cursor()
+                    with sqlite3.connect(db_path) as conn:
-                    cursor.execute('DELETE FROM gsc_oauth_states WHERE state = ?', (state,))
+                        cursor = conn.cursor()
-                    conn.commit()
+                        cursor.execute('DELETE FROM gsc_oauth_states WHERE state = ?', (state,))
-            except Exception as cleanup_err:
+                        conn.commit()
-                logger.warning(f"Failed to clean up OAuth state: {cleanup_err}")
+                except Exception as cleanup_err:
                    logger.warning(f"Failed to clean up OAuth state: {cleanup_err}")
-            # Save credentials
+            result = self.save_user_credentials(user_id, credentials)
-            return self.save_user_credentials(user_id, credentials)
+            if result:
                logger.info(f"GSC OAuth callback succeeded for user {user_id} (state_valid={state_valid})")
            else:
                logger.error(f"GSC OAuth callback: token exchange succeeded but failed to save credentials for user {user_id}")
            return result
        except Exception as e:
-            logger.error(f"Error handling GSC OAuth callback: {e}")
+            logger.error(f"Error handling GSC OAuth callback for user {user_id if 'user_id' in dir() else 'unknown'}: {e}")
            return False
@@ -726,6 +735,8 @@ class GSCService:
            with sqlite3.connect(db_path) as conn:
                cursor = conn.cursor()
                cursor.execute('DELETE FROM gsc_credentials WHERE user_id = ?', (user_id,))
                cursor.execute('DELETE FROM gsc_data_cache WHERE user_id = ?', (user_id,))
                cursor.execute('DELETE FROM gsc_oauth_states WHERE user_id = ?', (user_id,))
                conn.commit()
            logger.info(f"Cleared incomplete GSC credentials for user: {user_id}")
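The best-effort state lookup introduced in `handle_oauth_callback` can be illustrated with an in-memory SQLite database. This is a sketch under assumptions: `is_state_known` is a hypothetical helper, the two-column table layout is invented for the example, and it catches `sqlite3.Error` where the diff catches `Exception`.

```python
import sqlite3


def is_state_known(conn: sqlite3.Connection, state: str) -> bool:
    """Best-effort lookup: any database error is treated as 'state not found'."""
    try:
        row = conn.execute(
            "SELECT user_id FROM gsc_oauth_states WHERE state = ?", (state,)
        ).fetchone()
        return row is not None
    except sqlite3.Error:
        return False


conn = sqlite3.connect(":memory:")

# The table does not exist yet: the query fails and the lookup reports False.
missing_table = is_state_known(conn, "u1:abc")

# Assumed minimal schema for the example only.
conn.execute("CREATE TABLE gsc_oauth_states (state TEXT, user_id TEXT)")
conn.execute("INSERT INTO gsc_oauth_states VALUES (?, ?)", ("u1:abc", "u1"))

known = is_state_known(conn, "u1:abc")
unknown = is_state_known(conn, "u1:other")
```

In the diff, a `False` result only logs a warning and the code exchange proceeds. Since the OAuth `state` parameter is the protocol's defence against cross-site request forgery, this trades some of that protection for resilience when the state row is missing.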
--- a/backend/services/integrations/wix/auth.py
+++ b/backend/services/integrations/wix/auth.py
@@ -69,9 +69,17 @@ class WixAuthService:
    def get_site_info(self, access_token: str) -> Dict[str, Any]:
        headers = {
            'Authorization': f'Bearer {access_token}',
-            'Content-Type': 'application/json'
+            'Content-Type': 'application/json',
        }
        if self.client_id:
            headers['wix-client-id'] = self.client_id
        response = requests.get(f"{self.base_url}/sites/v1/site", headers=headers)
        if response.status_code == 404:
            logger.warning("Wix site info not found (404) — user may not have a published site or token lacks sites scope")
            return {"_no_site": True, "error": "No Wix site found for this account"}
        if response.status_code == 401:
            logger.warning("Wix site info request unauthorized (401) — token expired or invalid")
            return {"_auth_failed": True, "error": "Token expired or invalid — reconnect required"}
        response.raise_for_status()
        return response.json()
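Reviewer note: `get_site_info` now returns sentinel dicts for 404 and 401 instead of raising, so callers have to check for them. A minimal sketch of how a caller might branch on those keys follows; `describe_site_info` and its return strings are hypothetical names used only for illustration and are not part of this change.

```python
from typing import Any, Dict


def describe_site_info(site_info: Dict[str, Any]) -> str:
    """Map the sentinel keys returned on 401/404 to a short status string.

    Hypothetical helper: only the `_auth_failed` and `_no_site` keys come
    from the diff above; the status strings are assumptions.
    """
    if site_info.get("_auth_failed"):
        return "reconnect_required"
    if site_info.get("_no_site"):
        return "no_site"
    return "ok"


# Example inputs shaped like the dicts returned in the diff.
print(describe_site_info({"_no_site": True, "error": "No Wix site found for this account"}))
print(describe_site_info({"siteDisplayName": "demo"}))
```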
--- a/backend/services/integrations/wix/blog.py
+++ b/backend/services/integrations/wix/blog.py
@@ -3,6 +3,7 @@ import requests
 from loguru import logger
 from .retry import wix_api_call_with_retry, WixAPIError
 from .auth_utils import get_wix_headers
 class WixBlogService:
@@ -14,40 +15,7 @@ class WixBlogService:
    def headers(self, access_token: str, extra: Optional[Dict[str, str]] = None) -> Dict[str, str]:
        """Build headers with automatic token type detection."""
-        h: Dict[str, str] = {
-            'Content-Type': 'application/json',
-        }
-        if access_token:
-            # Normalize token to string if needed
-            if not isinstance(access_token, str):
-                from .utils import normalize_token_string
-                normalized = normalize_token_string(access_token)
-                if normalized:
-                    access_token = normalized
-                else:
-                    access_token = str(access_token)
-            token = access_token.strip()
-            if token:
-                if token.startswith('OauthNG.JWS.'):
-                    h['Authorization'] = f'Bearer {token}'
-                    logger.debug("Using Wix OAuth token with Bearer prefix (OauthNG.JWS. format detected)")
-                elif token.startswith('IST.'):
-                    h['Authorization'] = token
-                    logger.debug("Using Wix API key for authorization (IST. format detected)")
-                elif token.count('.') == 2:
-                    h['Authorization'] = f'Bearer {token}'
-                    logger.debug("Using OAuth Bearer token for authorization (JWT: 2 dots)")
-                else:
-                    h['Authorization'] = token
-                    logger.debug("Using token as-is for authorization")
-        if self.client_id:
-            h['wix-client-id'] = self.client_id
-        if extra:
-            h.update(extra)
-        return h
+        return get_wix_headers(access_token, client_id=self.client_id, extra=extra)
    def create_draft_post(self, access_token: str, payload: Dict[str, Any], extra_headers: Optional[Dict[str, str]] = None) -> Dict[str, Any]:
        """Create draft post with retry logic and consolidated logging."""
@@ -144,9 +112,9 @@ class WixBlogService:
        """Create a blog tag with retry logic."""
        url = f"{self.base_url}/blog/v3/tags"
        headers = self.headers(access_token, extra_headers)
-        payload: Dict[str, Any] = {'label': label, 'fieldsets': ['URL']}
+        payload: Dict[str, Any] = {'tag': {'label': label}, 'fieldsets': ['URL']}
        if language:
-            payload['language'] = language
+            payload['tag']['language'] = language
        try:
            return wix_api_call_with_retry('POST', url, headers, json_payload=payload, max_attempts=3)
--- a/backend/services/integrations/wix/blog_publisher.py
+++ b/backend/services/integrations/wix/blog_publisher.py
@@ -171,6 +171,16 @@ def validate_ricos_content(ricos_content: Dict[str, Any]) -> Dict[str, Any]:
    return ricos_content
 _UUID_RE = re.compile(r'^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$', re.IGNORECASE)
 def _looks_like_uuid(value: str) -> bool:
    try:
        uuid.UUID(value)
        return True
    except (ValueError, AttributeError):
        return bool(_UUID_RE.match(value))
 def validate_payload_no_none(obj, path=""):
    """Recursively validate that no None values exist in the payload"""
    if obj is None:
@@ -224,6 +234,7 @@ def create_blog_post(
    """
    # ===== PRE-FLIGHT VALIDATION =====
    errors = []
    warnings = []
    if not member_id:
        errors.append("memberId is required for third-party apps creating blog posts")
@@ -279,6 +290,18 @@ def create_blog_post(
    except Exception:
        pass
    # Add wix-site-id to headers for all API calls (categories, tags, draft post)
    resolved_site_id = site_id or meta_site_id or os.getenv('WIX_SITE_ID')
    if resolved_site_id:
        headers['wix-site-id'] = resolved_site_id
        logger.info(f"Using wix-site-id: {resolved_site_id[:8]}... (source: {'param' if site_id else 'token' if meta_site_id else 'env'})")
    else:
        token_str = str(access_token)
        if token_str.startswith('IST.'):
            logger.error("IST. API key requires WIX_SITE_ID environment variable or site_id parameter.")
        else:
            logger.warning("No wix-site-id found — API calls may fail if token requires it")
    # Quick permission test (only log failures)
    try:
        test_headers = get_wix_headers(access_token)
@@ -295,39 +318,59 @@ def create_blog_post(
    wix_logger.log_token_info(token_length, has_blog_scope, meta_site_id)
    # Convert markdown to Ricos
-    ricos_content = convert_content_to_ricos(content, None)
+    # PRIMARY: Use Wix Ricos Documents API for best formatting support (tables, complex markdown, etc.)
    # FALLBACK: Use custom parser if Wix API fails (no length limit, handles tables natively)
    has_table = bool(re.search(r'^\|.*\|', content, re.MULTILINE))
    # Pre-check: Wix Ricos API has a 10,000 character limit for HTML input.
    # Estimate HTML length from markdown (~1.4x expansion) to avoid silent truncation.
    # If HTML would exceed limit, skip Wix API and use custom parser.
    use_wix_api = True
    MAX_HTML_LIMIT = 9800
    estimated_html_len = len(content) * 1.4
    if estimated_html_len > MAX_HTML_LIMIT:
        logger.warning(f"Content too long for Wix Ricos API (est. HTML: {estimated_html_len:.0f} > {MAX_HTML_LIMIT}) — using custom parser")
        use_wix_api = False
    ricos_content = None
    if use_wix_api:
        try:
            logger.info("Converting markdown via Wix Ricos Documents API...")
            ricos_content = convert_via_wix_api(content, access_token, base_url)
            logger.info(f"Wix API conversion succeeded: {len(ricos_content.get('nodes', []))} nodes")
        except Exception as e:
            logger.warning(f"Wix API conversion failed, falling back to custom parser: {e}")
    # If markdown had tables and Wix API didn't produce TABLE nodes, fall back to custom parser
    if has_table and ricos_content:
        node_types = [n.get('type', '') for n in ricos_content.get('nodes', [])]
        if 'TABLE' not in node_types:
            logger.info("Markdown had tables but Wix API produced no TABLE nodes — using custom parser for table support")
            ricos_content = None
    if not ricos_content or not isinstance(ricos_content, dict) or 'nodes' not in ricos_content:
        logger.info("Using custom markdown parser for Ricos conversion")
        ricos_content = convert_content_to_ricos(content, None)
    nodes_count = len(ricos_content.get('nodes', []))
    wix_logger.log_ricos_conversion(nodes_count)
    # Validate Ricos content structure
    # Per Wix Blog API documentation: richContent should ONLY contain 'nodes'
    # The example in docs shows: { nodes: [...] } - no type, id, metadata, or documentStyle
    if not isinstance(ricos_content, dict):
-        logger.error(f"❌ richContent is not a dict: {type(ricos_content)}")
+        logger.error(f"richContent is not a dict: {type(ricos_content)}")
        raise ValueError("richContent must be a dictionary object")
    if 'nodes' not in ricos_content or not isinstance(ricos_content['nodes'], list):
-        logger.error(f"❌ richContent.nodes is missing or not a list: {ricos_content.get('nodes', 'MISSING')}")
+        logger.error(f"richContent.nodes is missing or not a list: {ricos_content.get('nodes', 'MISSING')}")
        raise ValueError("richContent must contain a 'nodes' array")
-    # Remove type and id fields (not expected by Blog API)
+    # Remove top-level fields not expected by Blog API CREATE endpoint
-    # NOTE: metadata is optional - Wix UPDATE endpoint example shows it, but CREATE example doesn't
+    # (Wix API converter may include type, id, metadata, documentStyle — strip them)
-    # We'll keep it minimal (nodes only) for CREATE to match the recipe example
+    for field in ['type', 'id', 'metadata', 'documentStyle']:
-    fields_to_remove = ['type', 'id']
-    for field in fields_to_remove:
        if field in ricos_content:
-            logger.debug(f"Removing '{field}' field from richContent (Blog API doesn't expect this)")
+            logger.debug(f"Removing '{field}' from richContent for Blog API compatibility")
            del ricos_content[field]
-    # Remove metadata and documentStyle - Blog API CREATE endpoint example shows only 'nodes'
-    # (UPDATE endpoint shows metadata, but we're using CREATE)
-    if 'metadata' in ricos_content:
-        logger.debug("Removing 'metadata' from richContent (CREATE endpoint expects only 'nodes')")
-        del ricos_content['metadata']
-    if 'documentStyle' in ricos_content:
-        logger.debug("Removing 'documentStyle' from richContent (CREATE endpoint expects only 'nodes')")
-        del ricos_content['documentStyle']
    # Ensure we only have 'nodes' in richContent for CREATE endpoint
    ricos_content = {'nodes': ricos_content['nodes']}
@@ -414,44 +457,50 @@ def create_blog_post(
                logger.info(f"Cover image imported: {media_id[:16]}...")
            else:
                logger.warning(f"Cover image import returned no valid media_id (type={type(media_id)}). Continuing without cover image.")
                warnings.append("Cover image could not be imported — post published without cover image.")
        except Exception as e:
            logger.warning(f"Cover image import failed (non-fatal): {e}. Continuing without cover image.")
            warnings.append(f"Cover image import failed: {str(e)[:100]}")
    # Handle categories - can be either IDs (list of strings) or names (for lookup)
    category_ids_to_use = None
    if category_ids:
        # Check if these are IDs (UUIDs) or names
        if isinstance(category_ids, list) and len(category_ids) > 0:
-            # Assume IDs if first item looks like UUID (has hyphens and is long)
+            # Use proper UUID detection instead of fragile heuristic
            first_item = str(category_ids[0])
-            if '-' in first_item and len(first_item) > 30:
+            if _looks_like_uuid(first_item):
                category_ids_to_use = category_ids
            elif lookup_categories_func:
                # These are names, need to lookup/create
                extra_headers = {}
-                if 'wix-site-id' in headers:
+                if resolved_site_id:
-                    extra_headers['wix-site-id'] = headers['wix-site-id']
+                    extra_headers['wix-site-id'] = resolved_site_id
                category_ids_to_use = lookup_categories_func(
                    access_token, category_ids, extra_headers if extra_headers else None
                )
                if not category_ids_to_use:
                    warnings.append(f"Categories could not be created ({len(category_ids)} requested) — OAuth app may lack BLOG.CREATE-DRAFT scope.")
    # Handle tags - can be either IDs (list of strings) or names (for lookup)
    tag_ids_to_use = None
    if tag_ids:
        # Check if these are IDs (UUIDs) or names
        if isinstance(tag_ids, list) and len(tag_ids) > 0:
-            # Assume IDs if first item looks like UUID (has hyphens and is long)
+            # Use proper UUID detection instead of fragile heuristic
            first_item = str(tag_ids[0])
-            if '-' in first_item and len(first_item) > 30:
+            if _looks_like_uuid(first_item):
                tag_ids_to_use = tag_ids
            elif lookup_tags_func:
                # These are names, need to lookup/create
                extra_headers = {}
-                if 'wix-site-id' in headers:
+                if resolved_site_id:
-                    extra_headers['wix-site-id'] = headers['wix-site-id']
+                    extra_headers['wix-site-id'] = resolved_site_id
                tag_ids_to_use = lookup_tags_func(
                    access_token, tag_ids, extra_headers if extra_headers else None
                )
                if not tag_ids_to_use:
                    warnings.append(f"Tags could not be created ({len(tag_ids)} requested) — OAuth app may lack BLOG scope for tag management.")
    # Add categories if we have IDs (must be non-empty list of strings)
    # CRITICAL: Wix API rejects empty arrays or arrays with None/empty strings
@@ -491,24 +540,12 @@ def create_blog_post(
        logger.debug("No SEO metadata provided to create_blog_post")
    try:
-        # Extract wix-site-id from token, parameter, or env var
+        # Use wix-site-id already resolved earlier
-        extra_headers = {}
+        extra_headers_final = {}
-        wix_site_id = site_id or os.getenv('WIX_SITE_ID')
+        wix_site_id = resolved_site_id
-        if not wix_site_id:
-            from .utils import extract_meta_from_token
-            meta_info = extract_meta_from_token(access_token)
-            wix_site_id = meta_info.get('metaSiteId')
        if wix_site_id:
-            extra_headers['wix-site-id'] = wix_site_id
+            extra_headers_final['wix-site-id'] = wix_site_id
-            logger.info(f"Using wix-site-id: {wix_site_id[:8]}... (source: {'param' if site_id else 'env' if os.getenv('WIX_SITE_ID') else 'token'})")
+            logger.info(f"Using wix-site-id for draft post: {wix_site_id[:8]}...")
-        else:
-            token_str = str(access_token)
-            if token_str.startswith('IST.'):
-                logger.error("❌ IST. API key requires WIX_SITE_ID environment variable or site_id parameter. "
-                           "The token's tenant.id is the account ID, not the site ID. "
-                           "Please set WIX_SITE_ID in your .env file to your Wix site's metaSiteId.")
-            else:
-                logger.warning("No wix-site-id found — API calls may fail if token requires it")
    except Exception as e:
        logger.debug(f"Could not extract wix-site-id from token: {e}")
@@ -564,13 +601,17 @@ def create_blog_post(
        logger.info(f"📤 Publishing to Wix: title='{blog_data['draftPost'].get('title', '')}', "
                     f"nodes={len(rc.get('nodes', []))}")
-        result = blog_service.create_draft_post(access_token, blog_data, extra_headers or None)
+        result = blog_service.create_draft_post(access_token, blog_data, extra_headers_final or None)
        draft_post = result.get('draftPost', {})
        post_id = draft_post.get('id', 'N/A')
        wix_logger.log_operation_result("Create Draft Post", True, result)
        logger.success(f"✅ Wix: Blog post created - ID: {post_id}")
        if warnings:
            result['_warnings'] = warnings
            logger.info(f"Publish completed with {len(warnings)} warnings: {'; '.join(warnings)}")
        return result
    except TypeError as e:
        import traceback
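Reviewer note: the converter selection in `create_blog_post` depends on two cheap checks made before any network call, a length estimate and a table probe. The sketch below restates those checks as standalone functions so they can be exercised in isolation. The function names and the `HTML_EXPANSION` constant name are hypothetical; the 9800 limit, the 1.4 factor and the regex are copied from the diff.

```python
import re

MAX_HTML_LIMIT = 9800   # safety margin under the 10,000-character limit cited in the diff
HTML_EXPANSION = 1.4    # rough markdown-to-HTML growth factor used in the diff


def has_markdown_table(content: str) -> bool:
    """True if any line starts with '|' and contains a second '|'."""
    return bool(re.search(r'^\|.*\|', content, re.MULTILINE))


def should_use_wix_api(content: str) -> bool:
    """True if the estimated HTML length fits under the limit."""
    return len(content) * HTML_EXPANSION <= MAX_HTML_LIMIT


short_post = "# Title\n\n| a | b |\n|---|---|\n| 1 | 2 |\n"
long_post = "word " * 2000  # 10,000 characters of markdown

print(should_use_wix_api(short_post), has_markdown_table(short_post))
print(should_use_wix_api(long_post), has_markdown_table(long_post))
```

The estimate is a heuristic, so a post near the threshold can still be rejected by the API; the diff handles that case by catching the exception and falling back to the custom parser.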
--- a/backend/services/integrations/wix/content.py
+++ b/backend/services/integrations/wix/content.py
@@ -192,6 +192,120 @@ def _make_horizontal_rule_node() -> Dict[str, Any]:
    }
 def _parse_markdown_table(lines: List[str], start_idx: int) -> tuple:
    """
    Parse a markdown table starting at start_idx.
    Returns (table_rows, alignments, next_idx) where table_rows is a list of lists of cell text,
    and alignments is a list of column alignments ('left', 'center', 'right', None).
    Markdown tables look like:
    | Header 1 | Header 2 |
    |----------|----------|
    | Cell 1   | Cell 2   |
    Alignment is detected from the separator row:
    |:--------|:--------:|--------:|
    """
    rows = []
    alignments = None
    i = start_idx
    while i < len(lines):
        line = lines[i].strip()
        if not line or '|' not in line:
            break
        cells = [cell.strip() for cell in line.strip('|').split('|')]
        # Detect separator row (contains only dashes, colons, pipes, spaces)
        if i > start_idx and all(
            set(cell.strip()) <= set('-:| ') for cell in cells
        ):
            alignments = []
            for cell in cells:
                cell = cell.strip()
                if cell.startswith(':') and cell.endswith(':'):
                    alignments.append('center')
                elif cell.endswith(':'):
                    alignments.append('right')
                elif cell.startswith(':'):
                    alignments.append('left')
                else:
                    alignments.append(None)
            i += 1
            continue
        rows.append(cells)
        i += 1
    return rows, alignments or [None] * (len(rows[0]) if rows else 1), i
 def _make_table_node(header_row: List[str], body_rows: List[List[str]], alignments: List) -> Dict[str, Any]:
    """Create a Ricos TABLE node with header and body rows, with formatting."""
    table_rows = []
    all_rows = [header_row] + body_rows
    for row_idx, row_cells in enumerate(all_rows):
        cell_nodes = []
        for col_idx, cell_text in enumerate(row_cells):
            text_nodes = parse_markdown_inline(cell_text)
            # Bold header row cells
            if row_idx == 0 and text_nodes:
                for node in text_nodes:
                    if node.get('type') == 'TEXT':
                        decs = node['textData'].get('decorations', [])
                        if not any(d.get('type') == 'BOLD' for d in decs if isinstance(d, dict)):
                            decs_copy = decs.copy()
                            decs_copy.append({'type': 'BOLD'})
                            node['textData']['decorations'] = decs_copy
            paragraph_node = {
                'id': str(uuid.uuid4()),
                'type': 'PARAGRAPH',
                'nodes': text_nodes if text_nodes else [{
                    'id': str(uuid.uuid4()),
                    'type': 'TEXT',
                    'nodes': [],
                    'textData': {'text': cell_text or ' ', 'decorations': []}
                }],
            }
            cell_style = {'verticalAlign': 'top'}
            if row_idx == 0:
                cell_style['borderWidth'] = {'top': 2, 'bottom': 1, 'left': 1, 'right': 1}
            # Apply column alignment
            if alignments and col_idx < len(alignments) and alignments[col_idx]:
                cell_style['textAlign'] = alignments[col_idx]
            cell_node = {
                'id': str(uuid.uuid4()),
                'type': 'TABLE_CELL',
                'nodes': [paragraph_node],
                'tableCellData': {'style': cell_style},
            }
            cell_nodes.append(cell_node)
        row_node = {
            'id': str(uuid.uuid4()),
            'type': 'TABLE_ROW',
            'nodes': cell_nodes,
        }
        table_rows.append(row_node)
    num_cols = max(len(row) for row in all_rows) if all_rows else 1
    return {
        'id': str(uuid.uuid4()),
        'type': 'TABLE',
        'nodes': table_rows,
        'tableData': {
            'cols': num_cols,
            'rows': len(table_rows),
            'headerRow': 0 if header_row else -1,
        },
    }
 def convert_content_to_ricos(content: str, images: List[str] = None) -> Dict[str, Any]:
    """
    Convert markdown content into valid Ricos JSON format.
@@ -205,6 +319,7 @@ def convert_content_to_ricos(content: str, images: List[str] = None) -> Dict[str
    - Code blocks (```language ... ```)
    - Inline images (![alt](url))
    - Horizontal rules (---, ***, ___)
    - Tables (| Header | Header |)
    """
    if not content:
        content = "This is a post from ALwrity."
@@ -245,6 +360,16 @@ def convert_content_to_ricos(content: str, images: List[str] = None) -> Dict[str
            i += 1
            continue
        # Markdown tables (lines starting with |)
        if stripped.startswith('|') and i + 1 < len(lines) and '|' in lines[i + 1]:
            table_rows, alignments, next_idx = _parse_markdown_table(lines, i)
            if table_rows and len(table_rows) >= 1:
                header_row = table_rows[0]
                body_rows = table_rows[1:] if len(table_rows) > 1 else []
                nodes.append(_make_table_node(header_row, body_rows, alignments))
                i = next_idx
                continue
        # Headings
        if stripped.startswith('#'):
            level = len(stripped) - len(stripped.lstrip('#'))
@@ -280,12 +405,11 @@ def convert_content_to_ricos(content: str, images: List[str] = None) -> Dict[str
            })
            continue
-        # Unordered lists
+        # Unordered lists (including task lists)
        if (stripped.startswith('- ') or stripped.startswith('* ') or 
            (stripped.startswith('-') and len(stripped) > 1 and stripped[1] != '-') or
            (stripped.startswith('*') and len(stripped) > 1 and stripped[1] != '*')):
            list_items = []
            list_marker = '- ' if stripped.startswith('-') else '* '
            while i < len(lines):
                current_line = lines[i].strip()
@@ -323,7 +447,14 @@ def convert_content_to_ricos(content: str, images: List[str] = None) -> Dict[str
            list_node_items = []
            for item_text in list_items:
-                text_nodes = parse_markdown_inline(item_text)
+                # Detect task list items: "- [ ] task" or "- [x] task"
                task_match = re.match(r'^\[([ xX])\]\s*(.*)', item_text)
                if task_match:
                    checked = task_match.group(1).lower() == 'x'
                    prefix = '☑ ' if checked else '☐ '
                    text_nodes = parse_markdown_inline(prefix + task_match.group(2))
                else:
                    text_nodes = parse_markdown_inline(item_text)
                paragraph_node = {
                    'id': str(uuid.uuid4()),
                    'type': 'PARAGRAPH',
@@ -414,6 +545,7 @@ def convert_content_to_ricos(content: str, images: List[str] = None) -> Dict[str
                next_line.startswith('>') or
                next_line.startswith('![') or
                next_line.startswith('```') or
                next_line.startswith('|') or
                re.match(r'^(---+|\*\*\*|___+)$', next_line) or
                re.match(r'^\d+\.\s+', next_line)):
                break
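Reviewer note: the alignment rules in `_parse_markdown_table` are easiest to check against a concrete separator row. The sketch below isolates only that step; `parse_alignments` is a hypothetical name, and the branch order mirrors the diff (center is tested before right and left).

```python
from typing import List, Optional


def parse_alignments(separator_row: str) -> List[Optional[str]]:
    """Derive per-column alignment from a markdown table separator row."""
    cells = [c.strip() for c in separator_row.strip().strip('|').split('|')]
    alignments: List[Optional[str]] = []
    for cell in cells:
        if cell.startswith(':') and cell.endswith(':'):
            alignments.append('center')   # colon on both sides
        elif cell.endswith(':'):
            alignments.append('right')    # trailing colon only
        elif cell.startswith(':'):
            alignments.append('left')     # leading colon only
        else:
            alignments.append(None)       # no alignment hint
    return alignments


print(parse_alignments("|:--------|:--------:|--------:|"))
print(parse_alignments("|----------|----------|"))
```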
--- a/backend/services/integrations/wix/logger.py
+++ b/backend/services/integrations/wix/logger.py
@@ -75,7 +75,10 @@ class WixLogger:
                    logger.debug(f"   Payload: {', '.join(parts)}")
        if error_body and status_code >= 400:
-            error_msg = error_body.get('message', 'Unknown error')
+            if isinstance(error_body, dict):
                error_msg = error_body.get('message', 'Unknown error')
            else:
                error_msg = str(error_body)
            logger.error(f"   Error: {error_msg}")
            if status_code == 500:
                logger.error("   ⚠️ Internal server error - check Wix API status")
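Reviewer note: the `isinstance` guard above matters because an error body is not always parsed JSON. A minimal standalone sketch of the same normalization, assuming the body may be a dict, a string, or some other object; `extract_error_message` is a hypothetical name, and unlike the diff it always coerces the result to `str`.

```python
from typing import Any


def extract_error_message(error_body: Any) -> str:
    """Return a printable message whether the body is a dict or raw text."""
    if isinstance(error_body, dict):
        return str(error_body.get('message', 'Unknown error'))
    return str(error_body)


print(extract_error_message({'message': 'Permission denied'}))
print(extract_error_message('<html>502 Bad Gateway</html>'))
```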
--- a/backend/services/integrations/wix/media.py
+++ b/backend/services/integrations/wix/media.py
@@ -1,17 +1,35 @@
 from typing import Any, Dict, Optional
 import requests
 from urllib.parse import urlparse
 from loguru import logger
 from .retry import wix_api_call_with_retry, WixAPIError
 def _is_valid_image_url(url: str) -> bool:
    """Check if a URL looks like a valid, publicly accessible image URL for Wix import."""
    if not url or not isinstance(url, str):
        return False
    url = url.strip()
    if url.startswith('data:'):
        return False
    parsed = urlparse(url)
    if parsed.scheme not in ('http', 'https'):
        return False
    host = parsed.hostname or ''
    if host in ('localhost', '127.0.0.1', 'example.com') or host.endswith('.example.com'):
        return False
    return True
 class WixMediaService:
    """Service for Wix Media Manager operations with retry logic and error handling."""
    def __init__(self, base_url: str):
        self.base_url = base_url
-    def import_image(self, access_token: str, image_url: str, display_name: str) -> Optional[Dict[str, Any]]:
+    def import_image(self, access_token: str, image_url: str, display_name: str,
                     client_id: Optional[str] = None, site_id: Optional[str] = None) -> Optional[Dict[str, Any]]:
        """
        Import external image to Wix Media Manager.
@@ -22,6 +40,8 @@ class WixMediaService:
            access_token: Valid access token
            image_url: URL of the image to import
            display_name: Display name for the image
            client_id: Optional Wix client ID for wix-client-id header
            site_id: Optional Wix metaSiteId for wix-site-id header
        Returns:
            Media result dict with 'file' key, or None on failure
@@ -29,10 +49,23 @@ class WixMediaService:
        Raises:
            WixAPIError: On non-retryable failure or after retries exhausted
        """
        if not _is_valid_image_url(image_url):
            logger.warning(f"Skipping image import — URL not valid for Wix: {image_url[:80]}...")
            return None
        logger.info(f"Importing image to Wix: url={image_url[:80]}..., display_name={display_name}")
        headers = {
            'Authorization': f'Bearer {access_token}',
            'Content-Type': 'application/json',
        }
        if client_id:
            headers['wix-client-id'] = client_id
        if not site_id:
            from .utils import extract_meta_from_token
            meta_info = extract_meta_from_token(access_token)
            site_id = meta_info.get('metaSiteId')
        if site_id:
            headers['wix-site-id'] = site_id
        payload = {
            'url': image_url,
            'mediaType': 'IMAGE',
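Reviewer note: the URL pre-check added to `import_image` can be tried against a few representative inputs. The sketch below restates the check so it runs on its own; `is_importable_image_url` is a hypothetical name, and it relies on the scheme test to reject `data:` URIs rather than a separate prefix test, which should be equivalent for the inputs shown.

```python
from urllib.parse import urlparse

# Hosts that the remote importer cannot reach or that are placeholders.
BLOCKED_HOSTS = ('localhost', '127.0.0.1', 'example.com')


def is_importable_image_url(url) -> bool:
    """Accept only http(s) URLs on hosts a remote service could fetch."""
    if not url or not isinstance(url, str):
        return False
    parsed = urlparse(url.strip())
    if parsed.scheme not in ('http', 'https'):
        return False  # also rejects data: URIs and relative paths
    host = parsed.hostname or ''
    return not (host in BLOCKED_HOSTS or host.endswith('.example.com'))


candidates = [
    "https://cdn.test.org/cover.png",
    "data:image/png;base64,AAAA",
    "http://localhost:8000/cover.png",
    None,
]
for candidate in candidates:
    print(candidate, is_importable_image_url(candidate))
```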
--- a/backend/services/integrations/wix/seo.py
+++ b/backend/services/integrations/wix/seo.py
@@ -26,10 +26,6 @@ def build_seo_data(seo_metadata: Dict[str, Any], default_title: str = None) -> O
        Wix seoData object with settings.keywords and tags array, or None if empty
    """
    seo_data = {
-        'settings': {
-            'keywords': [],
-            'preventAutoRedirect': False  # Required by Wix API schema
-        },
        'tags': []
    }
@@ -77,11 +73,7 @@ def build_seo_data(seo_metadata: Dict[str, Any], default_title: str = None) -> O
        # Keep main keyword + next 4 most important
        keywords_list = keywords_list[:5]
-    seo_data['settings']['keywords'] = keywords_list
+    seo_data['settings'] = {'keywords': keywords_list}
-    # Validate keywords list is not empty (or ensure at least one keyword exists)
-    if not seo_data['settings']['keywords']:
-        logger.warning("No keywords found in SEO metadata, adding empty keywords array")
    # Build tags array (meta tags, Open Graph, etc.)
    tags_list = []
--- a/backend/services/intelligence/sif_integration.py
+++ b/backend/services/intelligence/sif_integration.py
@@ -708,7 +708,48 @@ class SIFIntegrationService:
                themes = adv_insights.get('augmented_themes', [])
                if themes:
                    text_content += f"Augmented Themes: {', '.join(themes[:5])}. "
-                
+
                freshness = adv_insights.get('freshness', {})
                if freshness:
                    text_content += (f"Content Freshness Score: {freshness.get('freshness_score', 'N/A')}. "
                                     f"Publishing Velocity: {freshness.get('publishing_velocity', 0)}/week. "
                                     f"Trend: {freshness.get('publishing_trend', 'unknown')}. "
                                     f"Last 30d: {freshness.get('publishing_recency', {}).get('last_30d', 0)} pages. ")
                link_health = adv_insights.get('link_health', {})
                if link_health and 'error' not in link_health:
                    text_content += (f"Internal Links: {link_health.get('internal_link_count', 0)}. "
                                     f"External Links: {link_health.get('external_link_count', 0)}. "
                                     f"Nofollow: {link_health.get('nofollow_link_count', 0)}. "
                                     f"Avg Links/Page: {link_health.get('avg_links_per_page', 0)}. ")
                redirects = adv_insights.get('redirect_audit', {})
                if redirects and 'error' not in redirects:
                    text_content += (f"Redirects: {redirects.get('total_redirects', 0)} total, "
                                     f"{redirects.get('multi_hop_chains', 0)} multi-hop. ")
                image_seo = adv_insights.get('image_seo', {})
                if image_seo and 'error' not in image_seo:
                    text_content += (f"Images: {image_seo.get('total_images', 0)} total, "
                                     f"Alt Coverage: {image_seo.get('alt_coverage_percentage', 0)}%. ")
                url_struct = adv_insights.get('url_structure', {})
                if url_struct:
                    text_content += (f"URL Structure: {url_struct.get('total_urls_analyzed', 0)} URLs, "
                                     f"Avg Depth: {url_struct.get('directory_depth', {}).get('average_depth', 0)}. "
                                     f"Params: {url_struct.get('parameter_usage', {}).get('percentage_with_params', 0)}%. ")
                robots = adv_insights.get('robots_txt', {})
                if robots and robots.get('success'):
                    text_content += (f"Robots.txt: {robots.get('total_directives', 0)} directives, "
                                     f"Compliance: {robots.get('compliance_score', 0)}/100. "
                                     f"Issues: {len(robots.get('issues', []))}. ")
                budget = adv_insights.get('crawl_budget', {})
                if budget and budget.get('success'):
                    text_content += (f"Crawl Budget: {budget.get('pages_crawled', 0)} crawled of {budget.get('sitemap_total_urls', 0)} URLs. "
                                     f"Waste: {budget.get('waste_percentage', 0)}%. "
                                     f"Score: {budget.get('optimization_score', 0)}. ")
            # Add Technical SEO overview
            tech_audit = dashboard_data.get('technical_seo_audit', {})
            if tech_audit:
--- a/backend/services/linkedin/__init__.py
+++ b/backend/services/linkedin/__init__.py
@@ -17,13 +17,13 @@ from .content_generator_prompts import (
    VideoScriptGenerator
 )
-# Import new image generation services
+# Import image generation services
 from .image_generation import (
    LinkedInImageGenerator,
    LinkedInImageEditor,
    LinkedInImageStorage
 )
 from .image_prompts import LinkedInPromptGenerator
 from .carousel import LinkedInCarouselPDFRenderer
 __all__ = [
    # Content Generation
@@ -42,9 +42,10 @@ __all__ = [
    # Image Generation Services
    'LinkedInImageGenerator',
    'LinkedInImageEditor',
    'LinkedInImageStorage',
-    'LinkedInPromptGenerator'
+    'LinkedInPromptGenerator',
    # Carousel Rendering
    'LinkedInCarouselPDFRenderer',
 ]
 # Version information
--- a/backend/services/linkedin/carousel/__init__.py
+++ b/backend/services/linkedin/carousel/__init__.py
@@ -0,0 +1,3 @@
 from .carousel_renderer import LinkedInCarouselPDFRenderer
 __all__ = ['LinkedInCarouselPDFRenderer']
--- a/backend/services/linkedin/carousel/carousel_renderer.py
+++ b/backend/services/linkedin/carousel/carousel_renderer.py
@@ -0,0 +1,336 @@
 """
 LinkedIn Carousel PDF Renderer
 Renders text-based carousel slides into visually appealing PNG images
 and composes them into a LinkedIn-compatible PDF document (1.91:1 ratio).
 """
 import os
 import logging
 from datetime import datetime
 from typing import Dict, Any, List, Optional
 from PIL import Image, ImageDraw, ImageFont, ImageFilter
 from reportlab.lib.pagesizes import landscape
 from reportlab.lib.units import mm
 from reportlab.platypus import SimpleDocTemplate, Image as RLImage, PageBreak
 logger = logging.getLogger(__name__)
 class LinkedInCarouselPDFRenderer:
    COLOR_SCHEMES = {
        'professional': {
            'background_start': (25, 55, 109),
            'background_end': (41, 128, 185),
            'title_color': (255, 255, 255),
            'content_color': (236, 240, 241),
            'accent_color': (52, 152, 219),
        },
        'creative': {
            'background_start': (142, 68, 173),
            'background_end': (231, 76, 60),
            'title_color': (255, 255, 255),
            'content_color': (245, 245, 245),
            'accent_color': (241, 196, 15),
        },
        'industry': {
            'background_start': (39, 174, 96),
            'background_end': (44, 62, 80),
            'title_color': (255, 255, 255),
            'content_color': (236, 240, 241),
            'accent_color': (46, 204, 113),
        },
        'dark': {
            'background_start': (20, 20, 30),
            'background_end': (60, 60, 80),
            'title_color': (255, 255, 255),
            'content_color': (200, 200, 210),
            'accent_color': (100, 200, 255),
        },
        'minimal': {
            'background_start': (245, 245, 250),
            'background_end': (255, 255, 255),
            'title_color': (44, 62, 80),
            'content_color': (80, 80, 90),
            'accent_color': (52, 152, 219),
        },
    }
    def __init__(self, output_dir: str = None):
        self.slide_width = 1200
        self.slide_height = 627
        self.slide_aspect_ratio = "1.91:1"
        self.max_file_size_bytes = 100 * 1024 * 1024
        self.max_slides = 300
        self.output_dir = output_dir or "data/media/linkedin_carousels"
    async def render_carousel_to_pdf(
        self,
        carousel_data: Dict[str, Any],
        color_scheme: str = 'professional',
        user_id: Optional[str] = None,
    ) -> Dict[str, Any]:
        start_time = datetime.now()
        os.makedirs(self.output_dir, exist_ok=True)
        try:
            slides = carousel_data.get('slides', [])
            if not slides:
                return {'success': False, 'error': 'No slides to render'}
            title = carousel_data.get('title', 'LinkedIn Carousel')
            cover_slide = carousel_data.get('cover_slide')
            cta_slide = carousel_data.get('cta_slide')
            total_slides = len(slides) + (1 if cover_slide else 0) + (1 if cta_slide else 0)
            if total_slides > self.max_slides:
                error = f'Too many slides: {total_slides} exceeds max {self.max_slides}'
                return {'success': False, 'error': error}
            session_id = datetime.now().strftime('%Y%m%d_%H%M%S')
            image_paths = []
            if cover_slide:
                path = self._render_slide(
                    slide=cover_slide, slide_number=0, session_id=session_id,
                    color_scheme=color_scheme, is_cover=True, carousel_title=title,
                )
                if path:
                    image_paths.append(path)
            for i, slide in enumerate(slides):
                path = self._render_slide(
                    slide=slide, slide_number=i + 1, session_id=session_id,
                    color_scheme=color_scheme, is_cover=False,
                )
                if path:
                    image_paths.append(path)
            if cta_slide:
                path = self._render_slide(
                    slide=cta_slide, slide_number=len(slides) + 1, session_id=session_id,
                    color_scheme=color_scheme, is_cta=True,
                )
                if path:
                    image_paths.append(path)
            if not image_paths:
                return {'success': False, 'error': 'No slide images generated'}
            pdf_filename = f"linkedin_carousel_{session_id}.pdf"
            pdf_path = os.path.join(self.output_dir, pdf_filename)
            pdf_bytes = self._compose_pdf(image_paths, pdf_path)
            file_size = len(pdf_bytes)
            if file_size > self.max_file_size_bytes:
                logger.warning("PDF size %.2f MB exceeds max %.2f MB",
                               file_size / (1024 * 1024), self.max_file_size_bytes / (1024 * 1024))
            generation_time = (datetime.now() - start_time).total_seconds()
            return {
                'success': True,
                'pdf_bytes': pdf_bytes,
                'pdf_path': pdf_path,
                'metadata': {
                    'slide_count': len(image_paths),
                    'generation_time': generation_time,
                    'file_size': file_size,
                    'file_size_mb': round(file_size / (1024 * 1024), 2),
                    'dimensions': f'{self.slide_width}x{self.slide_height}',
                    'aspect_ratio': self.slide_aspect_ratio,
                }
            }
        except Exception as e:
            logger.error("Error rendering carousel PDF: %s", str(e))
            return {'success': False, 'error': f'Carousel PDF rendering failed: {str(e)}'}
    def _render_slide(
        self,
        slide: Dict[str, Any],
        slide_number: int,
        session_id: str,
        color_scheme: str = 'professional',
        is_cover: bool = False,
        is_cta: bool = False,
        carousel_title: str = '',
    ) -> Optional[str]:
        try:
            colors = self.COLOR_SCHEMES.get(color_scheme, self.COLOR_SCHEMES['professional'])
            img = Image.new('RGB', (self.slide_width, self.slide_height))
            draw = ImageDraw.Draw(img)
            self._draw_gradient(draw, colors)
            draw.rectangle([0, self.slide_height - 6, self.slide_width, self.slide_height], fill=colors['accent_color'])
            if is_cover:
                self._draw_centered_text(draw, carousel_title or slide.get('title', ''),
                                         (self.slide_width // 2, 180), colors['title_color'],
                                         font_size=42, max_width=self.slide_width - 160)
                subtitle = slide.get('content', '')
                if subtitle:
                    self._draw_centered_text(draw, subtitle,
                                             (self.slide_width // 2, 320), colors['content_color'],
                                             font_size=24, max_width=self.slide_width - 200, max_lines=3)
                self._draw_centered_text(draw, "Swipe to explore →",
                                         (self.slide_width // 2, 480), colors['accent_color'],
                                         font_size=18)
            elif is_cta:
                self._draw_text(draw, slide.get('title', ''), (60, 160), colors['title_color'],
                                font_size=36, max_width=self.slide_width - 120, max_lines=2)
                content = slide.get('content', '')
                if content:
                    self._draw_text(draw, content, (60, 260), colors['content_color'],
                                    font_size=22, max_width=self.slide_width - 120, max_lines=6)
                btn_x, btn_y = self.slide_width // 2 - 200, 440
                draw.rounded_rectangle([btn_x, btn_y, btn_x + 400, btn_y + 55], radius=27, fill=colors['accent_color'])
                self._draw_centered_text(draw, "Share Your Thoughts →",
                                         (self.slide_width // 2, btn_y + 27), (255, 255, 255), font_size=22)
            else:
                self._draw_text(draw, str(slide_number),
                                (self.slide_width - 50, 20), colors['accent_color'], font_size=16)
                title = slide.get('title', '')
                if title:
                    self._draw_text(draw, title, (60, 50), colors['title_color'],
                                    font_size=30, max_width=self.slide_width - 120, max_lines=2)
                content = slide.get('content', '')
                if content:
                    self._draw_text(draw, content, (60, 145), colors['content_color'],
                                    font_size=20, max_width=self.slide_width - 120, max_lines=10)
                visual_elements = slide.get('visual_elements', [])
                if visual_elements:
                    self._draw_visual_elements(draw, visual_elements, colors)
            filename = f"slide_{session_id}_{slide_number:03d}.png"
            filepath = os.path.join(self.output_dir, filename)
            img.save(filepath, 'PNG', optimize=True)
            return filepath
        except Exception as e:
            logger.error("Error rendering slide %d: %s", slide_number, str(e))
            return None
    def _draw_gradient(self, draw: ImageDraw.Draw, colors: Dict):
        sr, sg, sb = colors['background_start']
        er, eg, eb = colors['background_end']
        for y in range(self.slide_height):
            t = y / self.slide_height
            draw.line([(0, y), (self.slide_width, y)],
                      fill=(int(sr + (er - sr) * t), int(sg + (eg - sg) * t), int(sb + (eb - sb) * t)))
    def _draw_text(self, draw: ImageDraw.Draw, text: str, position: tuple, color: tuple,
                   font_size: int = 20, max_width: int = None, max_lines: int = None, bold: bool = False):
        font = self._get_font(font_size, bold)
        x, y = position
        words = text.split()
        lines = []
        current_line = ""
        for word in words:
            test_line = f"{current_line} {word}".strip()
            bb = draw.textbbox((0, 0), test_line, font=font)
            tw = bb[2] - bb[0]
            if max_width and tw > max_width and current_line:
                lines.append(current_line)
                if max_lines and len(lines) >= max_lines:
                    lines[-1] = lines[-1][:-3] + "..."
                    break
                current_line = word
            else:
                current_line = test_line
        if current_line and (not max_lines or len(lines) < max_lines):
            lines.append(current_line)
        line_height = int(font_size * 1.4)
        for i, line in enumerate(lines):
            draw.text((x, y + i * line_height), line, fill=color, font=font)
    def _draw_centered_text(self, draw: ImageDraw.Draw, text: str, center: tuple, color: tuple,
                            font_size: int = 20, max_width: int = None, max_lines: int = None, bold: bool = False):
        font = self._get_font(font_size, bold)
        cx, cy = center
        words = text.split()
        lines = []
        current_line = ""
        for word in words:
            test_line = f"{current_line} {word}".strip()
            bb = draw.textbbox((0, 0), test_line, font=font)
            tw = bb[2] - bb[0]
            if max_width and tw > max_width and current_line:
                lines.append(current_line)
                if max_lines and len(lines) >= max_lines:
                    lines[-1] = lines[-1][:-3] + "..."
                    break
                current_line = word
            else:
                current_line = test_line
        if current_line and (not max_lines or len(lines) < max_lines):
            lines.append(current_line)
        line_height = int(font_size * 1.4)
        total_height = len(lines) * line_height
        start_y = cy - total_height // 2
        for i, line in enumerate(lines):
            bb = draw.textbbox((0, 0), line, font=font)
            tw = bb[2] - bb[0]
            x = cx - tw // 2
            draw.text((x, start_y + i * line_height), line, fill=color, font=font)
    def _draw_visual_elements(self, draw: ImageDraw.Draw, elements: List[str], colors: Dict):
        y_start = self.slide_height - 60
        x_start = 60
        for i, element in enumerate(elements[:4]):
            cx = x_start + i * 280
            draw.ellipse([cx, y_start, cx + 12, y_start + 12], fill=colors['accent_color'])
            font = self._get_font(12, False)
            draw.text((cx + 20, y_start - 2), element[:25], fill=colors['content_color'], font=font)
    def _get_font(self, size: int, bold: bool = False):
        try:
            return ImageFont.truetype("arialbd.ttf" if bold else "arial.ttf", size)
        except (IOError, OSError):
            try:
                return ImageFont.truetype("DejaVuSans-Bold.ttf" if bold else "DejaVuSans.ttf", size)
            except (IOError, OSError):
                return ImageFont.load_default()
    def _compose_pdf(self, image_paths: List[str], output_path: str) -> bytes:
        pw = self.slide_width
        ph = self.slide_height
        # Leave 1pt margin to avoid ReportLab frame size issues
        m = 1
        iw = pw - 2 * m
        ih = ph - 2 * m
        from reportlab.platypus import BaseDocTemplate, Frame, PageTemplate
        from reportlab.lib.pagesizes import landscape
        frame = Frame(m, m, iw, ih, id="slide_frame",
                       leftPadding=0, rightPadding=0, topPadding=0, bottomPadding=0)
        template = PageTemplate(id="slide", frames=[frame], pagesize=(pw, ph))
        doc = BaseDocTemplate(output_path, pagesize=(pw, ph))
        doc.addPageTemplates([template])
        story = []
        for i, img_path in enumerate(image_paths):
            story.append(RLImage(img_path, width=iw, height=ih))
            if i < len(image_paths) - 1:
                story.append(PageBreak())
        doc.build(story)
        with open(output_path, 'rb') as f:
            return f.read()
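Review note: `_draw_text` and `_draw_centered_text` above repeat the same greedy word-wrap loop with ellipsis truncation. A minimal sketch of that loop as a shared helper follows. The name `wrap_lines` and the `measure` callable are hypothetical and not part of this PR; `measure` stands in for the width computed from `draw.textbbox`, so the sketch runs without Pillow.

```python
from typing import Callable, List, Optional


def wrap_lines(
    text: str,
    measure: Callable[[str], int],
    max_width: Optional[int] = None,
    max_lines: Optional[int] = None,
) -> List[str]:
    """Greedy word wrap mirroring the loop in _draw_text (hypothetical helper).

    `measure` returns the rendered width of a string; in the renderer it
    would wrap draw.textbbox, here any callable such as len works.
    """
    lines: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        # Break the line only when the candidate overflows and there is
        # already something on the current line to flush.
        if max_width and measure(candidate) > max_width and current:
            lines.append(current)
            if max_lines and len(lines) >= max_lines:
                # Line budget exhausted: replace the last three characters
                # of the final line with an ellipsis and stop.
                lines[-1] = lines[-1][:-3] + "..."
                break
            current = word
        else:
            current = candidate
    # Flush the trailing line unless the line budget is already used up.
    if current and (not max_lines or len(lines) < max_lines):
        lines.append(current)
    return lines
```

With such a helper, both drawing methods would differ only in how they position each returned line, which keeps the truncation behaviour in one place.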
--- a/backend/services/linkedin/content_generator.py
+++ b/backend/services/linkedin/content_generator.py
@@ -2,6 +2,7 @@
 Content Generator for LinkedIn Content Generation
 Handles the main content generation logic for posts and articles.
 Uses llm_text_gen for provider-agnostic LLM access (respects GPT_PROVIDER).
 """
 from typing import Dict, Any, List, Optional
@@ -21,6 +22,7 @@ from services.linkedin.content_generator_prompts import (
    CarouselGenerator,
    VideoScriptGenerator
 )
 from services.llm_providers.main_text_generation import llm_text_gen
 from services.persona_analysis_service import PersonaAnalysisService
 import time
@@ -28,11 +30,9 @@ import time
 class ContentGenerator:
    """Handles content generation for all LinkedIn content types."""
-    def __init__(self, citation_manager=None, quality_analyzer=None, gemini_grounded=None, fallback_provider=None):
+    def __init__(self, citation_manager=None, quality_analyzer=None):
        self.citation_manager = citation_manager
        self.quality_analyzer = quality_analyzer
-        self.gemini_grounded = gemini_grounded
-        self.fallback_provider = fallback_provider
        # Persona caching
        self._persona_cache: Dict[str, Dict[str, Any]] = {}
@@ -105,22 +105,24 @@ class ContentGenerator:
                del self._cache_timestamps[key]
            logger.info(f"Cleared persona cache for user {user_id}")
-    def _transform_gemini_sources(self, gemini_sources):
+    def _build_research_context(self, research_sources: List) -> str:
-        """Transform Gemini sources to ResearchSource format."""
+        """Build research context string from research sources for prompt injection."""
-        transformed_sources = []
+        if not research_sources:
-        for source in gemini_sources:
+            return ""
-            transformed_source = ResearchSource(
+        
-                title=source.get('title', 'Unknown Source'),
+        context_parts = ["\n\nRESEARCH CONTEXT (use this information to ground your content with facts and data):"]
-                url=source.get('url', ''),
+        for i, source in enumerate(research_sources[:5], 1):  # Limit to top 5 sources
-                content=f"Source from {source.get('title', 'Unknown')}",
+            title = getattr(source, 'title', f'Source {i}')
-                relevance_score=0.8,  # Default relevance score
+            url = getattr(source, 'url', '')
-                credibility_score=0.7,  # Default credibility score
+            content = getattr(source, 'content', '')
-                domain_authority=0.6,   # Default domain authority
+            context_parts.append(f"\n{i}. {title}")
-                source_type=source.get('type', 'web'),
+            if url:
-                publication_date=datetime.now().strftime('%Y-%m-%d')
+                context_parts.append(f"   URL: {url}")
-            )
+            if content:
-            transformed_sources.append(transformed_source)
+                context_parts.append(f"   Key insight: {content[:300]}")
-        return transformed_sources
+        
+        context_parts.append("\nInstructions: Use the research above to include specific data points, statistics, and factual claims in your content. Cite sources where appropriate.")
+        return "\n".join(context_parts)
    async def generate_post(
        self,
@@ -155,21 +157,12 @@ class ContentGenerator:
                logger.info(f"  - First research source: {research_sources[0] if research_sources else 'None'}")
                logger.info(f"  - Research sources types: {[type(s) for s in research_sources[:3]]}")
-            # Step 3: Add citations if requested - POST METHOD
+            # Step 3: Add citations if requested
            citations = []
            source_list = None
-            final_research_sources = research_sources  # Default to passed research_sources
+            final_research_sources = research_sources
-            # Use sources and citations from content_result if available (from Gemini grounding)
+            if request.include_citations and research_sources and self.citation_manager:
-            if content_result.get('citations') and content_result.get('sources'):
-                logger.info(f"Using citations and sources from Gemini grounding: {len(content_result['citations'])} citations, {len(content_result['sources'])} sources")
-                citations = content_result['citations']
-                # Transform Gemini sources to ResearchSource format
-                gemini_sources = self._transform_gemini_sources(content_result['sources'])
-                source_list = self.citation_manager.generate_source_list(gemini_sources) if self.citation_manager else None
-                # Use transformed sources for the response
-                final_research_sources = gemini_sources
-            elif request.include_citations and research_sources and self.citation_manager:
                try:
                    logger.info(f"Processing citations for content length: {len(content_result['content'])}")
                    citations = self.citation_manager.extract_citations(content_result['content'])
@@ -224,7 +217,7 @@ class ContentGenerator:
                data=post_content,
                research_sources=final_research_sources,  # Use final_research_sources
                generation_metadata={
-                    'model_used': 'gemini-2.0-flash-001',
+                    'model_used': 'llm_text_gen',
                    'generation_time': generation_time,
                    'research_time': research_time,
                    'grounding_enabled': grounding_enabled
@@ -251,21 +244,12 @@ class ContentGenerator:
        try:
            start_time = datetime.now()
-            # Step 3: Add citations if requested - ARTICLE METHOD
+            # Step 3: Add citations if requested
            citations = []
            source_list = None
-            final_research_sources = research_sources  # Default to passed research_sources
+            final_research_sources = research_sources
-            # Use sources and citations from content_result if available (from Gemini grounding)
+            if request.include_citations and research_sources and self.citation_manager:
-            if content_result.get('citations') and content_result.get('sources'):
-                logger.info(f"Using citations and sources from Gemini grounding: {len(content_result['citations'])} citations, {len(content_result['sources'])} sources")
-                citations = content_result['citations']
-                # Transform Gemini sources to ResearchSource format
-                gemini_sources = self._transform_gemini_sources(content_result['sources'])
-                source_list = self.citation_manager.generate_source_list(gemini_sources) if self.citation_manager else None
-                # Use transformed sources for the response
-                final_research_sources = gemini_sources
-            elif request.include_citations and research_sources and self.citation_manager:
                try:
                    citations = self.citation_manager.extract_citations(content_result['content'])
                    source_list = self.citation_manager.generate_source_list(research_sources)
@@ -317,7 +301,7 @@ class ContentGenerator:
                data=article_content,
                research_sources=final_research_sources,  # Use final_research_sources
                generation_metadata={
-                    'model_used': 'gemini-2.0-flash-001',
+                    'model_used': 'llm_text_gen',
                    'generation_time': generation_time,
                    'research_time': research_time,
                    'grounding_enabled': grounding_enabled
@@ -386,7 +370,7 @@ class ContentGenerator:
                'alternative_responses': content_result.get('alternative_responses', []),
                'tone_analysis': content_result.get('tone_analysis'),
                'generation_metadata': {
-                    'model_used': 'gemini-2.0-flash-001',
+                    'model_used': 'llm_text_gen',
                    'generation_time': generation_time,
                    'research_time': research_time,
                    'grounding_enabled': grounding_enabled
@@ -402,19 +386,14 @@ class ContentGenerator:
            }
    # Grounded content generation methods
-    async def generate_grounded_post_content(self, request, research_sources: List) -> Dict[str, Any]:
+    async def generate_grounded_post_content(self, request, research_sources: List, user_id: str = None) -> Dict[str, Any]:
-        """Generate grounded post content using the enhanced Gemini provider with native grounding."""
+        """Generate post content using provider-agnostic llm_text_gen."""
        try:
-            if not self.gemini_grounded:
+            # Build the prompt using persona if available
-                logger.error("Gemini Grounded Provider not available - cannot generate content without AI provider")
+            uid = int(getattr(request, "user_id", 0) or 0)
-                raise Exception("Gemini Grounded Provider not available - cannot generate content without AI provider")
+            persona_data = self._get_cached_persona_data(uid, 'linkedin')
-            # Build the prompt for grounded generation using persona if available (DB vs session override)
-            user_id = int(getattr(request, "user_id", 0) or 0)
-            persona_data = self._get_cached_persona_data(user_id, 'linkedin')
            if getattr(request, 'persona_override', None):
                try:
                    # Merge shallowly: override core and platform adaptation parts
                    override = request.persona_override
                    if persona_data:
                        core = persona_data.get('core_persona', {})
@@ -431,61 +410,40 @@ class ContentGenerator:
                    pass
            prompt = PostPromptBuilder.build_post_prompt(request, persona=persona_data)
-            # Generate grounded content using native Google Search grounding
+            # Inject research context into prompt
-            result = await self.gemini_grounded.generate_grounded_content(
+            research_context = self._build_research_context(research_sources)
+            if research_context:
+                prompt += research_context
+            # Generate content using provider-agnostic gateway
+            raw_response = llm_text_gen(
                prompt=prompt,
-                content_type="linkedin_post",
+                user_id=user_id,
-                temperature=0.7,
+                flow_type="linkedin_post",
-                max_tokens=request.max_length
+                max_tokens=request.max_length,
+                temperature=0.7
            )
-            return result
+            content_text = raw_response if isinstance(raw_response, str) else str(raw_response or "")
+            return {
+                'content': content_text,
+                'sources': [],
+                'citations': [],
+                'grounding_enabled': bool(research_sources),
+                'fallback_used': False
+            }
        except Exception as e:
-            logger.error(f"Error generating grounded post content: {str(e)}")
+            logger.error(f"Error generating post content: {str(e)}")
-            logger.info("Attempting fallback to standard content generation...")
+            raise Exception(f"Failed to generate LinkedIn post: {str(e)}")
-            # Fallback to standard content generation without grounding
-            try:
-                if not self.fallback_provider:
-                    raise Exception("No fallback provider available")
-                # Build a simpler prompt for fallback generation
-                prompt = PostPromptBuilder.build_post_prompt(request)
-                # Generate content using fallback provider (it's a dict with functions)
-                if 'generate_text' in self.fallback_provider:
-                    result = await self.fallback_provider['generate_text'](
-                        prompt=prompt,
-                        temperature=0.7,
-                        max_tokens=request.max_length
-                    )
-                else:
-                    raise Exception("Fallback provider doesn't have generate_text method")
-                # Return result in the expected format
-                return {
-                    'content': result.get('content', '') if isinstance(result, dict) else str(result),
-                    'sources': [],
-                    'citations': [],
-                    'grounding_enabled': False,
-                    'fallback_used': True
-                }
-            except Exception as fallback_error:
-                logger.error(f"Fallback generation also failed: {str(fallback_error)}")
-                raise Exception(f"Failed to generate content: {str(e)}. Fallback also failed: {str(fallback_error)}")
-    async def generate_grounded_article_content(self, request, research_sources: List) -> Dict[str, Any]:
+    async def generate_grounded_article_content(self, request, research_sources: List, user_id: str = None) -> Dict[str, Any]:
-        """Generate grounded article content using the enhanced Gemini provider with native grounding."""
+        """Generate article content using provider-agnostic llm_text_gen."""
        try:
-            if not self.gemini_grounded:
+            # Build the prompt using persona if available
-                logger.error("Gemini Grounded Provider not available - cannot generate content without AI provider")
+            uid = int(getattr(request, "user_id", 0) or 0)
-                raise Exception("Gemini Grounded Provider not available - cannot generate content without AI provider")
+            persona_data = self._get_cached_persona_data(uid, 'linkedin')
-            # Build the prompt for grounded generation using persona if available (DB vs session override)
-            user_id = int(getattr(request, "user_id", 0) or 0)
-            persona_data = self._get_cached_persona_data(user_id, 'linkedin')
            if getattr(request, 'persona_override', None):
                try:
                    override = request.persona_override
@@ -504,88 +462,146 @@ class ContentGenerator:
                    pass
            prompt = ArticlePromptBuilder.build_article_prompt(request, persona=persona_data)
-            # Generate grounded content using native Google Search grounding
+            # Inject research context into prompt
-            result = await self.gemini_grounded.generate_grounded_content(
+            research_context = self._build_research_context(research_sources)
+            if research_context:
+                prompt += research_context
+            # Generate content using provider-agnostic gateway
+            raw_response = llm_text_gen(
                prompt=prompt,
-                content_type="linkedin_article",
+                user_id=user_id,
-                temperature=0.7,
+                flow_type="linkedin_article",
-                max_tokens=request.word_count * 10  # Approximate character count
+                max_tokens=request.word_count * 10,
+                temperature=0.7
            )
-            return result
+            content_text = raw_response if isinstance(raw_response, str) else str(raw_response or "")
+            # Extract title from article content (first markdown heading or first line)
+            title = ""
+            for line in content_text.split('\n'):
+                stripped = line.strip()
+                if stripped.startswith('# '):
+                    title = stripped[2:].strip()
+                    break
+            if not title:
+                for line in content_text.split('\n'):
+                    stripped = line.strip()
+                    if stripped:
+                        title = stripped[:100].strip()
+                        break
+            if not title:
+                title = request.topic or "LinkedIn Article"
+            return {
+                'content': content_text,
+                'title': title,
+                'sources': [],
+                'citations': [],
+                'grounding_enabled': bool(research_sources),
+                'fallback_used': False
+            }
        except Exception as e:
-            logger.error(f"Error generating grounded article content: {str(e)}")
+            logger.error(f"Error generating article content: {str(e)}")
-            raise Exception(f"Failed to generate grounded article content: {str(e)}")
+            raise Exception(f"Failed to generate LinkedIn article: {str(e)}")
-    async def generate_grounded_carousel_content(self, request, research_sources: List) -> Dict[str, Any]:
+    async def generate_grounded_carousel_content(self, request, research_sources: List, user_id: str = None) -> Dict[str, Any]:
-        """Generate grounded carousel content using the enhanced Gemini provider with native grounding."""
+        """Generate carousel content using provider-agnostic llm_text_gen."""
        try:
-            if not self.gemini_grounded:
-                logger.error("Gemini Grounded Provider not available - cannot generate content without AI provider")
-                raise Exception("Gemini Grounded Provider not available - cannot generate content without AI provider")
            # Build the prompt for grounded generation using the new prompt builder
            prompt = CarouselPromptBuilder.build_carousel_prompt(request)
-            # Generate grounded content using native Google Search grounding
+            # Inject research context into prompt
-            result = await self.gemini_grounded.generate_grounded_content(
+            research_context = self._build_research_context(research_sources)
+            if research_context:
+                prompt += research_context
+            # Generate content using provider-agnostic gateway
+            raw_response = llm_text_gen(
                prompt=prompt,
-                content_type="linkedin_carousel",
+                user_id=user_id,
-                temperature=0.7,
+                flow_type="linkedin_carousel",
-                max_tokens=2000
+                max_tokens=2000,
                temperature=0.7
            )
-            return result
+            content_text = raw_response if isinstance(raw_response, str) else str(raw_response or "")
            return {
                'content': content_text,
                'sources': [],
                'citations': [],
                'grounding_enabled': bool(research_sources),
                'fallback_used': False
            }
        except Exception as e:
-            logger.error(f"Error generating grounded carousel content: {str(e)}")
+            logger.error(f"Error generating carousel content: {str(e)}")
-            raise Exception(f"Failed to generate grounded carousel content: {str(e)}")
+            raise Exception(f"Failed to generate LinkedIn carousel: {str(e)}")
-    async def generate_grounded_video_script_content(self, request, research_sources: List) -> Dict[str, Any]:
+    async def generate_grounded_video_script_content(self, request, research_sources: List, user_id: str = None) -> Dict[str, Any]:
-        """Generate grounded video script content using the enhanced Gemini provider with native grounding."""
+        """Generate video script content using provider-agnostic llm_text_gen."""
        try:
            if not self.gemini_grounded:
                logger.error("Gemini Grounded Provider not available - cannot generate content without AI provider")
                raise Exception("Gemini Grounded Provider not available - cannot generate content without AI provider")
            # Build the prompt for grounded generation using the new prompt builder
            prompt = VideoScriptPromptBuilder.build_video_script_prompt(request)
-            # Generate grounded content using native Google Search grounding
+            # Inject research context into prompt
-            result = await self.gemini_grounded.generate_grounded_content(
+            research_context = self._build_research_context(research_sources)
            if research_context:
                prompt += research_context
            # Generate content using provider-agnostic gateway
            raw_response = llm_text_gen(
                prompt=prompt,
-                content_type="linkedin_video_script",
+                user_id=user_id,
-                temperature=0.7,
+                flow_type="linkedin_video_script",
-                max_tokens=1500
+                max_tokens=1500,
                temperature=0.7
            )
-            return result
+            content_text = raw_response if isinstance(raw_response, str) else str(raw_response or "")
            return {
                'content': content_text,
                'sources': [],
                'citations': [],
                'grounding_enabled': bool(research_sources),
                'fallback_used': False
            }
        except Exception as e:
-            logger.error(f"Error generating grounded video script content: {str(e)}")
+            logger.error(f"Error generating video script content: {str(e)}")
-            raise Exception(f"Failed to generate grounded video script content: {str(e)}")
+            raise Exception(f"Failed to generate LinkedIn video script: {str(e)}")
-    async def generate_grounded_comment_response(self, request, research_sources: List) -> Dict[str, Any]:
+    async def generate_grounded_comment_response(self, request, research_sources: List, user_id: str = None) -> Dict[str, Any]:
-        """Generate grounded comment response using the enhanced Gemini provider with native grounding."""
+        """Generate comment response using provider-agnostic llm_text_gen."""
        try:
            if not self.gemini_grounded:
                logger.error("Gemini Grounded Provider not available - cannot generate content without AI provider")
                raise Exception("Gemini Grounded Provider not available - cannot generate content without AI provider")
            # Build the prompt for grounded generation using the new prompt builder
            prompt = CommentResponsePromptBuilder.build_comment_response_prompt(request)
-            # Generate grounded content using native Google Search grounding
+            # Inject research context into prompt
-            result = await self.gemini_grounded.generate_grounded_content(
+            research_context = self._build_research_context(research_sources)
            if research_context:
                prompt += research_context
            # Generate content using provider-agnostic gateway
            raw_response = llm_text_gen(
                prompt=prompt,
-                content_type="linkedin_comment_response",
+                user_id=user_id,
-                temperature=0.7,
+                flow_type="linkedin_comment_response",
-                max_tokens=2000
+                max_tokens=2000,
                temperature=0.7
            )
-            return result
+            content_text = raw_response if isinstance(raw_response, str) else str(raw_response or "")
            return {
                'content': content_text,
                'sources': [],
                'citations': [],
                'grounding_enabled': bool(research_sources),
                'fallback_used': False
            }
        except Exception as e:
-            logger.error(f"Error generating grounded comment response: {str(e)}")
+            logger.error(f"Error generating comment response: {str(e)}")
-            raise Exception(f"Failed to generate grounded comment response: {str(e)}")
+            raise Exception(f"Failed to generate LinkedIn comment response: {str(e)}")
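Review note: the carousel, video script, and comment response paths above all repeat the same two steps after calling `llm_text_gen`: coerce the raw response to text, then wrap it in the same result dict. A minimal sketch of that shared shape follows, assuming the gateway may return either a string or some other object (or `None`). The helper names `normalize_llm_response` and `build_generation_result` are hypothetical and do not appear in the commit.

```python
from typing import Any, Dict, List, Optional


def normalize_llm_response(raw_response: Any) -> str:
    """Coerce a gateway response to text (hypothetical helper).

    Mirrors the expression used in the diff:
    raw_response if isinstance(raw_response, str) else str(raw_response or "")
    so None and other falsy values become an empty string.
    """
    if isinstance(raw_response, str):
        return raw_response
    return str(raw_response or "")


def build_generation_result(
    raw_response: Any,
    research_sources: Optional[List[Any]],
) -> Dict[str, Any]:
    """Build the result dict returned by each generate_* method (hypothetical helper)."""
    return {
        "content": normalize_llm_response(raw_response),
        "sources": [],
        "citations": [],
        # True only when research sources were supplied, as in the diff.
        "grounding_enabled": bool(research_sources),
        "fallback_used": False,
    }
```

Factoring the three copies into one helper like this would keep the result keys in sync if another field is added later; whether that refactor fits this codebase is a judgment call for the author.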
--- a/backend/services/linkedin/content_generator_prompts/carousel_generator.py
+++ b/backend/services/linkedin/content_generator_prompts/carousel_generator.py
@@ -96,7 +96,7 @@ class CarouselGenerator:
                'data': carousel_content,
                'research_sources': research_sources,
                'generation_metadata': {
-                    'model_used': 'gemini-2.0-flash-001',
+                    'model_used': 'llm_text_gen',
                    'generation_time': generation_time,
                    'research_time': research_time,
                    'grounding_enabled': grounding_enabled
--- a/backend/services/linkedin/content_generator_prompts/video_script_generator.py
+++ b/backend/services/linkedin/content_generator_prompts/video_script_generator.py
@@ -81,7 +81,7 @@ class VideoScriptGenerator:
                'data': video_script,
                'research_sources': research_sources,
                'generation_metadata': {
-                    'model_used': 'gemini-2.0-flash-001',
+                    'model_used': 'llm_text_gen',
                    'generation_time': generation_time,
                    'research_time': research_time,
                    'grounding_enabled': grounding_enabled
--- a/backend/services/linkedin/image_generation/__init__.py
+++ b/backend/services/linkedin/image_generation/__init__.py
@@ -2,17 +2,15 @@
 LinkedIn Image Generation Package
 This package provides AI-powered image generation capabilities for LinkedIn content
-using Google's Gemini API. It includes image generation, editing, storage, and
+using the common llm_providers infrastructure. It includes image generation, storage,
-management services optimized for professional business use.
+and management services optimized for professional business use.
 """
 from .linkedin_image_generator import LinkedInImageGenerator
-from .linkedin_image_editor import LinkedInImageEditor
 from .linkedin_image_storage import LinkedInImageStorage
 __all__ = [
    'LinkedInImageGenerator',
-    'LinkedInImageEditor',
    'LinkedInImageStorage'
 ]
--- a/backend/services/linkedin/image_generation/linkedin_image_editor.py
+++ b/backend/services/linkedin/image_generation/linkedin_image_editor.py
@@ -1,530 +0,0 @@
 """
 LinkedIn Image Editor Service
 This service handles image editing capabilities for LinkedIn content using Gemini's
 conversational editing features. It provides professional image refinement and
 optimization specifically for LinkedIn use cases.
 """
 import os
 import base64
 from typing import Dict, Any, Optional, List
 from datetime import datetime
 from PIL import Image, ImageEnhance, ImageFilter
 from io import BytesIO
 from loguru import logger
 # Import existing infrastructure
 from ...onboarding.api_key_manager import APIKeyManager
 class LinkedInImageEditor:
    """
    Handles LinkedIn image editing and refinement using Gemini's capabilities.
    This service provides both AI-powered editing through Gemini and traditional
    image processing for LinkedIn-specific optimizations.
    """
    def __init__(self, api_key_manager: Optional[APIKeyManager] = None):
        """
        Initialize the LinkedIn Image Editor.
        Args:
            api_key_manager: API key manager for Gemini authentication
        """
        self.api_key_manager = api_key_manager or APIKeyManager()
        self.model = "gemini-2.5-flash-image-preview"
        # LinkedIn-specific editing parameters
        self.enhancement_factors = {
            'brightness': 1.1,      # Slightly brighter for mobile viewing
            'contrast': 1.05,       # Subtle contrast enhancement
            'sharpness': 1.2,       # Enhanced sharpness for clarity
            'saturation': 1.05      # Slight saturation boost
        }
        logger.info("LinkedIn Image Editor initialized")
    async def edit_image_conversationally(
        self, 
        base_image: bytes, 
        edit_prompt: str,
        content_context: Dict[str, Any]
    ) -> Dict[str, Any]:
        """
        Edit image using Gemini's conversational editing capabilities.
        Args:
            base_image: Base image data in bytes
            edit_prompt: Natural language description of desired edits
            content_context: LinkedIn content context for optimization
        Returns:
            Dict containing edited image result and metadata
        """
        try:
            start_time = datetime.now()
            logger.info(f"Starting conversational image editing: {edit_prompt[:100]}...")
            # Enhance edit prompt for LinkedIn optimization
            enhanced_prompt = self._enhance_edit_prompt_for_linkedin(
                edit_prompt, content_context
            )
            # TODO: Implement Gemini conversational editing when available
            # For now, we'll use traditional image processing based on prompt analysis
            edited_image = await self._apply_traditional_editing(
                base_image, edit_prompt, content_context
            )
            if not edited_image.get('success'):
                return edited_image
            generation_time = (datetime.now() - start_time).total_seconds()
            return {
                'success': True,
                'image_data': edited_image['image_data'],
                'metadata': {
                    'edit_prompt': edit_prompt,
                    'enhanced_prompt': enhanced_prompt,
                    'editing_method': 'traditional_processing',
                    'editing_time': generation_time,
                    'content_context': content_context,
                    'model_used': self.model
                },
                'linkedin_optimization': {
                    'mobile_optimized': True,
                    'professional_aesthetic': True,
                    'brand_compliant': True,
                    'engagement_optimized': True
                }
            }
        except Exception as e:
            logger.error(f"Error in conversational image editing: {str(e)}")
            return {
                'success': False,
                'error': f"Conversational editing failed: {str(e)}",
                'generation_time': (datetime.now() - start_time).total_seconds() if 'start_time' in locals() else 0
            }
    async def apply_style_transfer(
        self, 
        base_image: bytes, 
        style_reference: bytes,
        content_context: Dict[str, Any]
    ) -> Dict[str, Any]:
        """
        Apply style transfer from reference image to base image.
        Args:
            base_image: Base image data in bytes
            style_reference: Reference image for style transfer
            content_context: LinkedIn content context
        Returns:
            Dict containing style-transferred image result
        """
        try:
            start_time = datetime.now()
            logger.info("Starting style transfer for LinkedIn image")
            # TODO: Implement Gemini style transfer when available
            # For now, return placeholder implementation
            return {
                'success': False,
                'error': 'Style transfer not yet implemented - coming in next Gemini API update',
                'generation_time': (datetime.now() - start_time).total_seconds()
            }
        except Exception as e:
            logger.error(f"Error in style transfer: {str(e)}")
            return {
                'success': False,
                'error': f"Style transfer failed: {str(e)}",
                'generation_time': (datetime.now() - start_time).total_seconds() if 'start_time' in locals() else 0
            }
    async def enhance_image_quality(
        self, 
        image_data: bytes,
        enhancement_type: str = "linkedin_optimized",
        content_context: Optional[Dict[str, Any]] = None
    ) -> Dict[str, Any]:
        """
        Enhance image quality using traditional image processing.
        Args:
            image_data: Image data in bytes
            enhancement_type: Type of enhancement to apply
            content_context: LinkedIn content context for optimization
        Returns:
            Dict containing enhanced image result
        """
        try:
            start_time = datetime.now()
            logger.info(f"Starting image quality enhancement: {enhancement_type}")
            # Open image for processing
            image = Image.open(BytesIO(image_data))
            original_size = image.size
            # Apply LinkedIn-specific enhancements
            if enhancement_type == "linkedin_optimized":
                enhanced_image = self._apply_linkedin_enhancements(image, content_context)
            elif enhancement_type == "professional":
                enhanced_image = self._apply_professional_enhancements(image)
            elif enhancement_type == "creative":
                enhanced_image = self._apply_creative_enhancements(image)
            else:
                enhanced_image = self._apply_linkedin_enhancements(image, content_context)
            # Convert back to bytes
            output_buffer = BytesIO()
            enhanced_image.save(output_buffer, format=image.format or "PNG", optimize=True)
            enhanced_data = output_buffer.getvalue()
            enhancement_time = (datetime.now() - start_time).total_seconds()
            return {
                'success': True,
                'image_data': enhanced_data,
                'metadata': {
                    'enhancement_type': enhancement_type,
                    'original_size': original_size,
                    'enhanced_size': enhanced_image.size,
                    'enhancement_time': enhancement_time,
                    'content_context': content_context
                }
            }
        except Exception as e:
            logger.error(f"Error in image quality enhancement: {str(e)}")
            return {
                'success': False,
                'error': f"Quality enhancement failed: {str(e)}",
                'generation_time': (datetime.now() - start_time).total_seconds() if 'start_time' in locals() else 0
            }
    def _enhance_edit_prompt_for_linkedin(
        self, 
        edit_prompt: str, 
        content_context: Dict[str, Any]
    ) -> str:
        """
        Enhance edit prompt for LinkedIn optimization.
        Args:
            edit_prompt: Original edit prompt
            content_context: LinkedIn content context
        Returns:
            Enhanced edit prompt
        """
        industry = content_context.get('industry', 'business')
        content_type = content_context.get('content_type', 'post')
        linkedin_edit_enhancements = [
            f"Maintain professional business aesthetic for {industry} industry",
            f"Ensure mobile-optimized composition for LinkedIn {content_type}",
            "Keep professional color scheme and typography",
            "Maintain brand consistency and visual hierarchy",
            "Optimize for LinkedIn feed viewing and engagement"
        ]
        enhanced_prompt = f"{edit_prompt}\n\n"
        enhanced_prompt += "\n".join(linkedin_edit_enhancements)
        return enhanced_prompt
    async def _apply_traditional_editing(
        self, 
        base_image: bytes, 
        edit_prompt: str,
        content_context: Dict[str, Any]
    ) -> Dict[str, Any]:
        """
        Apply traditional image processing based on edit prompt analysis.
        Args:
            base_image: Base image data in bytes
            edit_prompt: Description of desired edits
            content_context: LinkedIn content context
        Returns:
            Dict containing edited image result
        """
        try:
            # Open image for processing
            image = Image.open(BytesIO(base_image))
            # Analyze edit prompt and apply appropriate processing
            edit_prompt_lower = edit_prompt.lower()
            if any(word in edit_prompt_lower for word in ['brighter', 'light', 'lighting']):
                image = self._adjust_brightness(image, 1.2)
                logger.info("Applied brightness adjustment")
            if any(word in edit_prompt_lower for word in ['sharper', 'sharp', 'clear']):
                image = self._apply_sharpening(image)
                logger.info("Applied sharpening")
            if any(word in edit_prompt_lower for word in ['warmer', 'warm', 'color']):
                image = self._adjust_color_temperature(image, 'warm')
                logger.info("Applied warm color adjustment")
            if any(word in edit_prompt_lower for word in ['professional', 'business']):
                image = self._apply_professional_enhancements(image)
                logger.info("Applied professional enhancements")
            # Convert back to bytes
            output_buffer = BytesIO()
            image.save(output_buffer, format=image.format or "PNG", optimize=True)
            edited_data = output_buffer.getvalue()
            return {
                'success': True,
                'image_data': edited_data
            }
        except Exception as e:
            logger.error(f"Error in traditional editing: {str(e)}")
            return {
                'success': False,
                'error': f"Traditional editing failed: {str(e)}"
            }
    def _apply_linkedin_enhancements(
        self, 
        image: Image.Image, 
        content_context: Optional[Dict[str, Any]] = None
    ) -> Image.Image:
        """
        Apply LinkedIn-specific image enhancements.
        Args:
            image: PIL Image object
            content_context: LinkedIn content context
        Returns:
            Enhanced image
        """
        try:
            # Apply standard LinkedIn optimizations
            image = self._adjust_brightness(image, self.enhancement_factors['brightness'])
            image = self._adjust_contrast(image, self.enhancement_factors['contrast'])
            image = self._apply_sharpening(image)
            image = self._adjust_saturation(image, self.enhancement_factors['saturation'])
            # Ensure professional appearance
            image = self._ensure_professional_appearance(image, content_context)
            return image
        except Exception as e:
            logger.error(f"Error applying LinkedIn enhancements: {str(e)}")
            return image
    def _apply_professional_enhancements(self, image: Image.Image) -> Image.Image:
        """
        Apply professional business aesthetic enhancements.
        Args:
            image: PIL Image object
        Returns:
            Enhanced image
        """
        try:
            # Subtle enhancements for professional appearance
            image = self._adjust_brightness(image, 1.05)
            image = self._adjust_contrast(image, 1.03)
            image = self._apply_sharpening(image)
            return image
        except Exception as e:
            logger.error(f"Error applying professional enhancements: {str(e)}")
            return image
    def _apply_creative_enhancements(self, image: Image.Image) -> Image.Image:
        """
        Apply creative and engaging enhancements.
        Args:
            image: PIL Image object
        Returns:
            Enhanced image
        """
        try:
            # More pronounced enhancements for creative appeal
            image = self._adjust_brightness(image, 1.1)
            image = self._adjust_contrast(image, 1.08)
            image = self._adjust_saturation(image, 1.1)
            image = self._apply_sharpening(image)
            return image
        except Exception as e:
            logger.error(f"Error applying creative enhancements: {str(e)}")
            return image
    def _adjust_brightness(self, image: Image.Image, factor: float) -> Image.Image:
        """Adjust image brightness."""
        try:
            enhancer = ImageEnhance.Brightness(image)
            return enhancer.enhance(factor)
        except Exception as e:
            logger.error(f"Error adjusting brightness: {str(e)}")
            return image
    def _adjust_contrast(self, image: Image.Image, factor: float) -> Image.Image:
        """Adjust image contrast."""
        try:
            enhancer = ImageEnhance.Contrast(image)
            return enhancer.enhance(factor)
        except Exception as e:
            logger.error(f"Error adjusting contrast: {str(e)}")
            return image
    def _adjust_saturation(self, image: Image.Image, factor: float) -> Image.Image:
        """Adjust image saturation."""
        try:
            enhancer = ImageEnhance.Color(image)
            return enhancer.enhance(factor)
        except Exception as e:
            logger.error(f"Error adjusting saturation: {str(e)}")
            return image
    def _apply_sharpening(self, image: Image.Image) -> Image.Image:
        """Apply image sharpening."""
        try:
            # Apply unsharp mask for professional sharpening
            return image.filter(ImageFilter.UnsharpMask(radius=1, percent=150, threshold=3))
        except Exception as e:
            logger.error(f"Error applying sharpening: {str(e)}")
            return image
    def _adjust_color_temperature(self, image: Image.Image, temperature: str) -> Image.Image:
        """Adjust image color temperature."""
        try:
            if temperature == 'warm':
                # Apply warm color adjustment
                enhancer = ImageEnhance.Color(image)
                image = enhancer.enhance(1.1)
                # Slight red tint for warmth
                # This is a simplified approach - more sophisticated color grading could be implemented
                return image
            else:
                return image
        except Exception as e:
            logger.error(f"Error adjusting color temperature: {str(e)}")
            return image
    def _ensure_professional_appearance(
        self, 
        image: Image.Image, 
        content_context: Optional[Dict[str, Any]] = None
    ) -> Image.Image:
        """
        Ensure image meets professional LinkedIn standards.
        Args:
            image: PIL Image object
            content_context: LinkedIn content context
        Returns:
            Professionally optimized image
        """
        try:
            # Ensure minimum quality standards
            if image.mode in ('RGBA', 'LA', 'P'):
                # Convert to RGB for better compatibility
                background = Image.new('RGB', image.size, (255, 255, 255))
                if image.mode == 'P':
                    image = image.convert('RGBA')
                background.paste(image, mask=image.split()[-1] if image.mode == 'RGBA' else None)
                image = background
            # Ensure minimum resolution for LinkedIn
            min_resolution = (1024, 1024)
            if image.size[0] < min_resolution[0] or image.size[1] < min_resolution[1]:
                # Resize to minimum resolution while maintaining aspect ratio
                ratio = max(min_resolution[0] / image.size[0], min_resolution[1] / image.size[1])
                new_size = (int(image.size[0] * ratio), int(image.size[1] * ratio))
                image = image.resize(new_size, Image.Resampling.LANCZOS)
                logger.info(f"Resized image to {new_size} for LinkedIn professional standards")
            return image
        except Exception as e:
            logger.error(f"Error ensuring professional appearance: {str(e)}")
            return image
    async def get_editing_suggestions(
        self, 
        image_data: bytes,
        content_context: Dict[str, Any]
    ) -> List[Dict[str, Any]]:
        """
        Get AI-powered editing suggestions for LinkedIn image.
        Args:
            image_data: Image data in bytes
            content_context: LinkedIn content context
        Returns:
            List of editing suggestions
        """
        try:
            # Analyze image and provide contextual suggestions
            suggestions = []
            # Professional enhancement suggestions
            suggestions.append({
                'id': 'professional_enhancement',
                'title': 'Professional Enhancement',
                'description': 'Apply subtle professional enhancements for business appeal',
                'prompt': 'Enhance this image with professional business aesthetics',
                'priority': 'high'
            })
            # Mobile optimization suggestions
            suggestions.append({
                'id': 'mobile_optimization',
                'title': 'Mobile Optimization',
                'description': 'Optimize for LinkedIn mobile feed viewing',
                'prompt': 'Optimize this image for mobile LinkedIn viewing',
                'priority': 'medium'
            })
            # Industry-specific suggestions
            industry = content_context.get('industry', 'business')
            suggestions.append({
                'id': 'industry_optimization',
                'title': f'{industry.title()} Industry Optimization',
                'description': f'Apply {industry} industry-specific visual enhancements',
                'prompt': f'Enhance this image with {industry} industry aesthetics',
                'priority': 'medium'
            })
            # Engagement optimization suggestions
            suggestions.append({
                'id': 'engagement_optimization',
                'title': 'Engagement Optimization',
                'description': 'Make this image more engaging for LinkedIn audience',
                'prompt': 'Make this image more engaging and shareable for LinkedIn',
                'priority': 'low'
            })
            return suggestions
        except Exception as e:
            logger.error(f"Error getting editing suggestions: {str(e)}")
            return []
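Review note: the removed `_ensure_professional_appearance` upscaled any image smaller than 1024x1024 while preserving aspect ratio. If that behavior needs to survive elsewhere after this file's deletion, the size computation can be isolated as below. This is a sketch of the arithmetic only, without Pillow; the function name `upscaled_size` is hypothetical, and the 1024x1024 minimum is taken from the deleted code.

```python
from typing import Tuple

# Minimum resolution used by the deleted editor code.
MIN_RESOLUTION: Tuple[int, int] = (1024, 1024)


def upscaled_size(
    size: Tuple[int, int],
    min_resolution: Tuple[int, int] = MIN_RESOLUTION,
) -> Tuple[int, int]:
    """Return the target size for an image, scaling up only when needed.

    The larger of the two per-axis ratios is used so that both dimensions
    reach the minimum while the aspect ratio is preserved.
    """
    width, height = size
    min_w, min_h = min_resolution
    if width >= min_w and height >= min_h:
        return size  # already large enough; leave untouched
    ratio = max(min_w / width, min_h / height)
    # int() truncates, matching the deleted implementation.
    return (int(width * ratio), int(height * ratio))
```

For example, a 512x256 image needs a 4x scale to bring its height up to 1024, which gives 2048x1024. The resulting tuple is what the deleted code passed to `image.resize(new_size, Image.Resampling.LANCZOS)`.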
--- a/backend/services/linkedin/image_generation/linkedin_image_generator.py
+++ b/backend/services/linkedin/image_generation/linkedin_image_generator.py
@@ -1,8 +1,9 @@
 """
 LinkedIn Image Generator Service
-This service generates LinkedIn-optimized images using Google's Gemini API.
+This service generates LinkedIn-optimized images using the common
-It provides professional, business-appropriate imagery for LinkedIn content.
+llm_providers infrastructure. It provides professional, business-appropriate
 imagery for LinkedIn content.
 """
 import os
@@ -17,6 +18,7 @@ from io import BytesIO
 # Import existing infrastructure
 from ...onboarding.api_key_manager import APIKeyManager
 from ...llm_providers.main_image_generation import generate_image
 from ...llm_providers.main_image_editing import edit_image as common_edit_image
 # Set up logging
 logger = logging.getLogger(__name__)
@@ -24,9 +26,9 @@ logger = logging.getLogger(__name__)
 class LinkedInImageGenerator:
    """
-    Handles LinkedIn-optimized image generation using Gemini API.
+    Handles LinkedIn-optimized image generation using common infrastructure.
-    This service integrates with the existing Gemini provider infrastructure
+    This service integrates with the llm_providers image generation system
    and provides LinkedIn-specific image optimization, quality assurance,
    and professional business aesthetics.
    """
@@ -36,10 +38,9 @@ class LinkedInImageGenerator:
        Initialize the LinkedIn Image Generator.
        Args:
-            api_key_manager: API key manager for Gemini authentication
+            api_key_manager: API key manager for authentication
        """
        self.api_key_manager = api_key_manager or APIKeyManager()
        self.model = "gemini-2.5-flash-image-preview"
        self.default_aspect_ratio = "1:1"  # LinkedIn post optimal ratio
        self.max_retries = 3
@@ -55,16 +56,18 @@ class LinkedInImageGenerator:
        prompt: str, 
        content_context: Dict[str, Any],
        aspect_ratio: str = "1:1",
-        style_preference: str = "professional"
+        style_preference: str = "professional",
        user_id: Optional[str] = None
    ) -> Dict[str, Any]:
        """
-        Generate LinkedIn-optimized image using Gemini API.
+        Generate LinkedIn-optimized image using AI provider.
        Args:
            prompt: User's image generation prompt
            content_context: LinkedIn content context (topic, industry, content_type)
-            aspect_ratio: Image aspect ratio (1:1, 16:9, 4:3)
+            aspect_ratio: Image aspect ratio (1:1, 16:9, 4:3, 1.91:1, 1:1.25)
            style_preference: Style preference (professional, creative, industry-specific)
            user_id: User ID for tenant provider resolution
        Returns:
            Dict containing generation result, image data, and metadata
@@ -78,8 +81,8 @@ class LinkedInImageGenerator:
                prompt, content_context, style_preference, aspect_ratio
            )
-            # Generate image using existing Gemini infrastructure
+            # Generate image using tenant-aware provider selection
-            generation_result = await self._generate_with_gemini(enhanced_prompt, aspect_ratio)
+            generation_result = await self._generate_with_provider(enhanced_prompt, aspect_ratio, user_id)
            if not generation_result.get('success'):
                return {
@@ -108,7 +111,7 @@ class LinkedInImageGenerator:
                    'aspect_ratio': aspect_ratio,
                    'content_context': content_context,
                    'generation_time': generation_time,
-                    'model_used': self.model,
+                    'model_used': generation_result.get('model'),
                    'image_format': processed_image['format'],
                    'image_size': processed_image['size'],
                    'resolution': processed_image['resolution']
@@ -131,17 +134,19 @@ class LinkedInImageGenerator:
    async def edit_image(
        self, 
-        base_image: bytes, 
+        input_image_bytes: bytes, 
        edit_prompt: str,
-        content_context: Dict[str, Any]
+        content_context: Dict[str, Any],
        user_id: Optional[str] = None,
    ) -> Dict[str, Any]:
        """
-        Edit existing image using Gemini's conversational editing capabilities.
+        Edit existing image using unified image editing infrastructure.
        Args:
-            base_image: Base image data in bytes
+            input_image_bytes: Input image bytes to edit
            edit_prompt: Description of desired edits
            content_context: LinkedIn content context for optimization
            user_id: User ID for tenant provider resolution and subscription checks
        Returns:
            Dict containing edited image result and metadata
@@ -155,18 +160,46 @@ class LinkedInImageGenerator:
                edit_prompt, content_context
            )
-            # Use Gemini's image editing capabilities
-            # Note: This will be implemented when Gemini's image editing is fully available
-            # For now, we'll return a placeholder implementation
-            return {
-                'success': False,
-                'error': 'Image editing not yet implemented - coming in next Gemini API update',
-                'generation_time': (datetime.now() - start_time).total_seconds()
-            }
+            # Use unified image editing system.
+            # common_edit_image() handles: provider resolution, pre-flight validation,
+            # generation, and usage tracking, all via user_id.
+            result = common_edit_image(
+                input_image_bytes=input_image_bytes,
+                prompt=enhanced_edit_prompt,
+                user_id=user_id,
+            )
+            if result and result.image_bytes:
+                generation_time = (datetime.now() - start_time).total_seconds()
+                logger.info(
+                    f"LinkedIn image edited successfully via provider={result.provider} "
+                    f"model={result.model} in {generation_time:.2f}s"
+                )
+                return {
+                    'success': True,
+                    'image_data': result.image_bytes,
+                    'image_url': None,  # not using URL-based retrieval
+                    'width': result.width,
+                    'height': result.height,
+                    'provider': result.provider,
+                    'model': result.model,
+                    'metadata': {
+                        'original_prompt': edit_prompt,
+                        'enhanced_prompt': enhanced_edit_prompt,
+                        'generation_time': generation_time,
+                        'content_context': content_context,
+                    },
+                }
+            else:
+                logger.warning("LinkedIn image editing returned no result")
+                return {
+                    'success': False,
+                    'error': 'Image editing returned no result',
+                    'generation_time': (datetime.now() - start_time).total_seconds(),
+                }
        except Exception as e:
-            logger.error(f"Error in LinkedIn image editing: {str(e)}")
+            logger.error(f"Error in LinkedIn image editing: {str(e)}", exc_info=True)
            return {
                'success': False,
                'error': f"Image editing failed: {str(e)}",
@@ -268,13 +301,16 @@ class LinkedInImageGenerator:
        return enhanced_edit_prompt
-    async def _generate_with_gemini(self, prompt: str, aspect_ratio: str) -> Dict[str, Any]:
+    async def _generate_with_provider(self, prompt: str, aspect_ratio: str, user_id: Optional[str] = None) -> Dict[str, Any]:
        """
        Generate image using unified image generation infrastructure.
        Provider resolution, pre-flight validation, and usage tracking
        are all handled by generate_image() from main_image_generation.
        Args:
            prompt: Enhanced prompt for image generation
            aspect_ratio: Desired aspect ratio
            user_id: User ID for tenant provider resolution and subscription checks
        Returns:
            Generation result from image generation provider
@@ -285,26 +321,31 @@ class LinkedInImageGenerator:
                "1:1": (1024, 1024),
                "16:9": (1920, 1080),
                "4:3": (1366, 1024),
-                "9:16": (1080, 1920),  # Portrait for stories
+                "9:16": (1080, 1920),
                "1.91:1": (1200, 627),  # LinkedIn recommended landscape
                "1:1.25": (1080, 1350),  # LinkedIn recommended portrait
            }
            width, height = aspect_map.get(aspect_ratio, (1024, 1024))
-            # Use unified image generation system (defaults to provider based on GPT_PROVIDER)
+            # Delegate to unified image generation system.
+            # generate_image() handles: provider resolution, pre-flight validation,
+            # model auto-detection, generation, and usage tracking.
+            # We do NOT pass explicit provider or model — let generate_image() resolve
+            # them from tenant config and user defaults.
            result = generate_image(
                prompt=prompt,
                options={
                    "provider": "gemini",  # LinkedIn uses Gemini by default
                    "model": self.model if hasattr(self, 'model') else None,
                    "width": width,
                    "height": height,
-                }
+                },
                user_id=user_id
            )
            if result and result.image_bytes:
                return {
                    'success': True,
                    'image_data': result.image_bytes,
-                    'image_path': None,  # No file path, using bytes directly
+                    'image_path': None,
                    'width': result.width,
                    'height': result.height,
                    'provider': result.provider,
@@ -315,7 +356,7 @@ class LinkedInImageGenerator:
                    'success': False,
                    'error': 'Image generation returned no result'
                }
-                
+
        except Exception as e:
            logger.error(f"Error in image generation: {str(e)}")
            return {
@@ -487,6 +528,9 @@ class LinkedInImageGenerator:
            (1.6, 1.8),    # 16:9 (landscape)
            (0.7, 0.8),    # 4:3 (portrait)
            (1.2, 1.4),    # 5:4 (landscape)
            (1.85, 2.0),   # 1.91:1 (LinkedIn recommended landscape)
            (0.6, 0.72),   # 1:1.25 (LinkedIn recommended portrait, ~0.8)
            (0.65, 0.85),  # 1:1.25 broader match
        ]
        for min_ratio, max_ratio in suitable_ratios:
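A minimal sketch of how the aspect-ratio lookup and the ratio-band check in this file fit together. The helper names `resolve_dimensions` and `ratio_in_bands` are hypothetical, the inclusive comparison is an assumption (the loop body is not shown in the diff), and only the three bands added by this hunk are copied.

```python
from typing import Dict, Iterable, Tuple

# Copied from the aspect_map in the hunk above.
ASPECT_MAP: Dict[str, Tuple[int, int]] = {
    "1:1": (1024, 1024),
    "16:9": (1920, 1080),
    "4:3": (1366, 1024),
    "9:16": (1080, 1920),
    "1.91:1": (1200, 627),   # LinkedIn recommended landscape
    "1:1.25": (1080, 1350),  # LinkedIn recommended portrait
}

# The three (min_ratio, max_ratio) bands added by this change.
NEW_BANDS = [(1.85, 2.0), (0.6, 0.72), (0.65, 0.85)]


def resolve_dimensions(aspect_ratio: str) -> Tuple[int, int]:
    """Hypothetical helper: unknown ratios fall back to a 1024x1024 square."""
    return ASPECT_MAP.get(aspect_ratio, (1024, 1024))


def ratio_in_bands(width: int, height: int,
                   bands: Iterable[Tuple[float, float]]) -> bool:
    """Hypothetical helper: True if width/height lies inside any band (inclusive)."""
    ratio = width / height
    return any(lo <= ratio <= hi for lo, hi in bands)


if __name__ == "__main__":
    w, h = resolve_dimensions("1:1.25")
    print(w, h, ratio_in_bands(w, h, NEW_BANDS))
```

Under these assumptions, 1080x1350 has a ratio of 0.8, which lies outside the (0.6, 0.72) band and is matched only by the broader (0.65, 0.85) entry.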
--- a/backend/services/linkedin/image_generation/linkedin_image_storage.py
+++ b/backend/services/linkedin/image_generation/linkedin_image_storage.py
@@ -6,8 +6,10 @@ It provides secure storage, efficient retrieval, and metadata management for gen
 """
 import os
 import re
 import hashlib
 import json
 import shutil
 from typing import Dict, Any, Optional, List, Tuple
 from datetime import datetime, timedelta
 from pathlib import Path
@@ -58,6 +60,8 @@ class LinkedInImageStorage:
        self.max_storage_size_gb = 10  # Maximum storage size in GB
        self.image_retention_days = 30  # Days to keep images
        self.max_image_size_mb = 10    # Maximum individual image size in MB
        self.max_images_per_user = 100  # Maximum images per user
        self._uuid_pattern = re.compile(r'^[a-f0-9]{16}$')
        logger.info(f"LinkedIn Image Storage initialized at {self.base_storage_path}")
@@ -102,6 +106,22 @@ class LinkedInImageStorage:
        try:
            start_time = datetime.now()
            # Check per-user storage quota
            if user_id:
                user_count = await self._count_user_images(user_id)
                if user_count >= self.max_images_per_user:
                    return {
                        'success': False,
                        'error': f"User image limit ({self.max_images_per_user}) reached. Delete existing images or increase limit."
                    }
            # Check disk space
            if not await self._check_disk_space(len(image_data)):
                return {
                    'success': False,
                    'error': "Insufficient disk space for image storage."
                }
            # Generate unique image ID
            image_id = self._generate_image_id(image_data, metadata)
@@ -170,6 +190,9 @@ class LinkedInImageStorage:
            Dict containing image data and metadata
        """
        try:
            if not self._validate_image_id(image_id):
                return {'success': False, 'error': f'Invalid image ID format: {image_id}'}
            # Find image file
            image_path = await self._find_image_by_id(image_id, user_id)
            if not image_path:
@@ -216,6 +239,9 @@ class LinkedInImageStorage:
            Dict containing deletion result
        """
        try:
            if not self._validate_image_id(image_id):
                return {'success': False, 'error': f'Invalid image ID format: {image_id}'}
            # Find image file
            image_path = await self._find_image_by_id(image_id, user_id)
            if not image_path:
@@ -418,6 +444,32 @@ class LinkedInImageStorage:
                'error': f"Failed to get storage stats: {str(e)}"
            }
    def _validate_image_id(self, image_id: str) -> bool:
        """Validate image_id against expected format to prevent path traversal."""
        return bool(self._uuid_pattern.match(image_id))
    async def _count_user_images(self, user_id: str) -> int:
        """Count total images stored for a given user."""
        try:
            images_path, _ = self._get_workspace_paths(user_id)
            count = 0
            if images_path.exists():
                for content_dir in images_path.iterdir():
                    if content_dir.is_dir():
                        count += sum(1 for f in content_dir.glob("*.png") if f.is_file())
            return count
        except Exception as e:
            logger.warning(f"Error counting images for user {user_id}: {e}")
            return 0
    async def _check_disk_space(self, required_bytes: int) -> bool:
        """Check if sufficient disk space is available."""
        try:
            usage = shutil.disk_usage(self.base_storage_path)
            return usage.free > required_bytes * 2  # require 2x headroom
        except Exception:
            return True  # if we can't check, allow the write
    def _generate_image_id(self, image_data: bytes, metadata: Dict[str, Any]) -> str:
        """Generate unique image ID based on content and metadata."""
        # Create hash from image data and key metadata
@@ -569,6 +621,9 @@ class LinkedInImageStorage:
        Returns:
            Dict containing image metadata if found
        """
        if not self._validate_image_id(image_id):
            logger.warning(f"Invalid image ID format in metadata request: {image_id}")
            return None
        return await self._load_metadata(image_id, user_id)
    async def _load_metadata(self, image_id: str, user_id: Optional[str] = None) -> Optional[Dict[str, Any]]:
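A minimal sketch of the two guards this file adds, image ID validation and the disk headroom check, written as standalone functions. The function names and the `factor` parameter are hypothetical. One deliberate difference from the diff: the sketch uses `re.fullmatch`, because `$` with `match` also accepts a single trailing newline.

```python
import re
import shutil

# Same character class and length as _uuid_pattern in the diff; anchors are
# unnecessary because fullmatch() must consume the whole string.
IMAGE_ID_PATTERN = re.compile(r'[a-f0-9]{16}')


def is_valid_image_id(image_id: str) -> bool:
    """Accept only 16 lowercase hex characters, so path separators never pass."""
    return bool(IMAGE_ID_PATTERN.fullmatch(image_id))


def has_disk_headroom(path: str, required_bytes: int, factor: int = 2) -> bool:
    """Require `factor` times the payload size in free space.

    Mirrors the diff's behaviour of allowing the write when the check fails.
    """
    try:
        usage = shutil.disk_usage(path)
    except OSError:
        return True
    return usage.free > required_bytes * factor


if __name__ == "__main__":
    print(is_valid_image_id("0123456789abcdef"))  # valid format
    print(is_valid_image_id("../../etc/passwd"))  # rejected
```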
--- a/backend/services/linkedin/image_prompts/init.py
+++ b/backend/services/linkedin/image_prompts/init.py
@@ -2,8 +2,8 @@
 LinkedIn Image Prompts Package
 This package provides AI-powered image prompt generation for LinkedIn content
-using Google's Gemini API. It creates three distinct prompt styles optimized
+using the provider-agnostic llm_text_gen gateway. It creates three distinct
-for professional business image generation.
+prompt styles optimized for professional business image generation.
 """
 from .linkedin_prompt_generator import LinkedInPromptGenerator
--- a/backend/services/linkedin/image_prompts/linkedin_prompt_generator.py
+++ b/backend/services/linkedin/image_prompts/linkedin_prompt_generator.py
@@ -1,9 +1,10 @@
 """
 LinkedIn Image Prompt Generator Service
-This service generates AI-optimized image prompts for LinkedIn content using Gemini's
+This service generates AI-optimized image prompts for LinkedIn content using
-capabilities. It creates three distinct prompt styles (professional, creative, industry-specific)
+the provider-agnostic llm_text_gen gateway. It creates three distinct prompt
-following best practices for image generation.
+styles (professional, creative, industry-specific) following best practices
 for image generation.
 """
 import asyncio
@@ -13,14 +14,14 @@ from loguru import logger
 # Import existing infrastructure
 from ...onboarding.api_key_manager import APIKeyManager
-from ...llm_providers.gemini_provider import gemini_text_response
+from ...llm_providers.main_text_generation import llm_text_gen
 class LinkedInPromptGenerator:
    """
    Generates AI-optimized image prompts for LinkedIn content.
-    This service creates three distinct prompt styles following Gemini API best practices:
+    This service creates three distinct prompt styles following best practices:
    1. Professional Style - Corporate aesthetics, clean lines, business colors
    2. Creative Style - Engaging visuals, vibrant colors, social media appeal  
    3. Industry-Specific Style - Tailored to specific business sectors
@@ -31,10 +32,9 @@ class LinkedInPromptGenerator:
        Initialize the LinkedIn Prompt Generator.
        Args:
-            api_key_manager: API key manager for Gemini authentication
+            api_key_manager: API key manager for authentication
        """
        self.api_key_manager = api_key_manager or APIKeyManager()
        self.model = "gemini-2.0-flash-exp"
        # Prompt generation configuration
        self.max_prompt_length = 500
@@ -49,7 +49,8 @@ class LinkedInPromptGenerator:
    async def generate_three_prompts(
        self, 
        linkedin_content: Dict[str, Any],
-        aspect_ratio: str = "1:1"
+        aspect_ratio: str = "1:1",
        user_id: str = None
    ) -> List[Dict[str, Any]]:
        """
        Generate three AI-optimized image prompts for LinkedIn content.
@@ -57,6 +58,7 @@ class LinkedInPromptGenerator:
        Args:
            linkedin_content: LinkedIn content context (topic, industry, content_type, content)
            aspect_ratio: Desired image aspect ratio
            user_id: User ID for subscription checking
        Returns:
            List of three prompt objects with style, prompt, and description
@@ -65,11 +67,11 @@ class LinkedInPromptGenerator:
            start_time = datetime.now()
            logger.info(f"Generating image prompts for LinkedIn content: {linkedin_content.get('topic', 'Unknown')}")
-            # Generate prompts using Gemini
+            # Generate prompts using provider-agnostic gateway
-            prompts = await self._generate_prompts_with_gemini(linkedin_content, aspect_ratio)
+            prompts = await self._generate_prompts_with_llm(linkedin_content, aspect_ratio, user_id)
            if not prompts or len(prompts) < 3:
-                logger.warning("Gemini prompt generation failed, using fallback prompts")
+                logger.warning("Prompt generation failed, using fallback prompts")
                prompts = self._get_fallback_prompts(linkedin_content, aspect_ratio)
            # Ensure exactly 3 prompts
@@ -92,62 +94,65 @@ class LinkedInPromptGenerator:
            logger.error(f"Error generating LinkedIn image prompts: {str(e)}")
            return self._get_fallback_prompts(linkedin_content, aspect_ratio)
-    async def _generate_prompts_with_gemini(
+    async def _generate_prompts_with_llm(
        self, 
        linkedin_content: Dict[str, Any],
-        aspect_ratio: str
+        aspect_ratio: str,
        user_id: str = None
    ) -> List[Dict[str, Any]]:
        """
-        Generate image prompts using Gemini AI.
+        Generate image prompts using provider-agnostic llm_text_gen.
        Args:
            linkedin_content: LinkedIn content context
            aspect_ratio: Image aspect ratio
            user_id: User ID for subscription checking
        Returns:
            List of generated prompts
        """
        try:
-            # Build the prompt for Gemini
+            # Build the prompt
-            gemini_prompt = self._build_gemini_prompt(linkedin_content, aspect_ratio)
+            prompt = self._build_image_prompt(linkedin_content, aspect_ratio)
-            # Generate response using Gemini
+            # Generate response using provider-agnostic gateway
-            response = gemini_text_response(
+            response = llm_text_gen(
-                prompt=gemini_prompt,
+                prompt=prompt,
-                temperature=0.7,
+                system_prompt="You are an expert AI image prompt engineer specializing in LinkedIn content optimization.",
-                top_p=0.8,
+                user_id=user_id,
-                n=1,
+                flow_type="linkedin_image_prompts",
                max_tokens=1000,
-                system_prompt="You are an expert AI image prompt engineer specializing in LinkedIn content optimization."
+                temperature=0.7
            )
            if not response:
-                logger.warning("No response from Gemini prompt generation")
+                logger.warning("No response from prompt generation")
                return []
-            # Parse Gemini response into structured prompts
+            # Parse response into structured prompts
-            prompts = self._parse_gemini_response(response, linkedin_content)
+            response_text = response if isinstance(response, str) else str(response or "")
            prompts = self._parse_llm_response(response_text, linkedin_content)
            return prompts
        except Exception as e:
-            logger.error(f"Error in Gemini prompt generation: {str(e)}")
+            logger.error(f"Error in prompt generation: {str(e)}")
            return []
-    def _build_gemini_prompt(
+    def _build_image_prompt(
        self, 
        linkedin_content: Dict[str, Any],
        aspect_ratio: str
    ) -> str:
        """
-        Build comprehensive prompt for Gemini to generate image prompts.
+        Build comprehensive prompt for LLM to generate image prompts.
        Args:
            linkedin_content: LinkedIn content context
            aspect_ratio: Image aspect ratio
        Returns:
-            Formatted prompt for Gemini
+            Formatted prompt for LLM
        """
        topic = linkedin_content.get('topic', 'business')
        industry = linkedin_content.get('industry', 'business')
@@ -428,16 +433,16 @@ class LinkedInPromptGenerator:
        else:
            return 'Informational & Awareness'
-    def _parse_gemini_response(
+    def _parse_llm_response(
        self, 
        response: str, 
        linkedin_content: Dict[str, Any]
    ) -> List[Dict[str, Any]]:
        """
-        Parse Gemini response into structured prompt objects.
+        Parse LLM response into structured prompt objects.
        Args:
-            response: Raw response from Gemini
+            response: Raw response from LLM
            linkedin_content: LinkedIn content context
        Returns:
@@ -462,7 +467,7 @@ class LinkedInPromptGenerator:
            return self._parse_response_manually(response, linkedin_content)
        except Exception as e:
-            logger.error(f"Error parsing Gemini response: {str(e)}")
+            logger.error(f"Error parsing LLM response: {str(e)}")
            return self._parse_response_manually(response, linkedin_content)
    def _parse_response_manually(
@@ -474,7 +479,7 @@ class LinkedInPromptGenerator:
        Manually parse response if JSON parsing fails.
        Args:
-            response: Raw response from Gemini
+            response: Raw response from LLM
            linkedin_content: LinkedIn content context
        Returns:
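A minimal sketch of the response handling introduced in this file: coerce whatever the gateway returns into text, then fall back when fewer than three prompts were parsed. Both function names are hypothetical, and truncating to three items is an assumption, since the "Ensure exactly 3 prompts" step is not shown in the diff.

```python
from typing import Any, Dict, List


def coerce_response_text(response: Any) -> str:
    """Same expression as the diff: pass strings through, stringify anything else."""
    return response if isinstance(response, str) else str(response or "")


def ensure_three_prompts(
    prompts: List[Dict[str, Any]],
    fallback: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Use the fallback list when parsing produced fewer than three prompts."""
    if not prompts or len(prompts) < 3:
        return fallback[:3]
    return prompts[:3]


if __name__ == "__main__":
    fallback = [{"style": s} for s in ("professional", "creative", "industry-specific")]
    print(coerce_response_text(None) == "")
    print(ensure_three_prompts([{"style": "professional"}], fallback))
```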
--- a/backend/services/linkedin/research_handler.py
+++ b/backend/services/linkedin/research_handler.py
@@ -2,9 +2,10 @@
 Research Handler for LinkedIn Content Generation
 Handles research operations and timing for content generation.
 Uses common Exa/Tavily infrastructure with pre-flight validation.
 """
-from typing import List
+from typing import List, Optional
 from datetime import datetime
 from loguru import logger
 from models.linkedin_models import ResearchSource
@@ -21,11 +22,19 @@ class ResearchHandler:
        request,
        research_enabled: bool,
        search_engine: str,
-        max_results: int = 10
+        max_results: int = 10,
        user_id: Optional[str] = None
    ) -> tuple[List[ResearchSource], float]:
        """
        Conduct research if enabled and return sources with timing.
        Args:
            request: Generation request object
            research_enabled: Whether research is enabled
            search_engine: Search engine to use (exa, tavily)
            max_results: Maximum number of results
            user_id: User ID for pre-flight validation and usage tracking
        Returns:
            Tuple of (research_sources, research_time)
        """
@@ -33,7 +42,6 @@ class ResearchHandler:
        research_time = 0
        if research_enabled:
            # Debug: Log the search engine value being passed
            logger.info(f"ResearchHandler: search_engine='{search_engine}' (type: {type(search_engine)})")
            research_start = datetime.now()
@@ -41,7 +49,8 @@ class ResearchHandler:
                topic=request.topic,
                industry=request.industry,
                search_engine=search_engine,
-                max_results=max_results
+                max_results=max_results,
                user_id=user_id
            )
            research_time = (datetime.now() - research_start).total_seconds()
            logger.info(f"Research completed in {research_time:.2f}s, found {len(research_sources)} sources")
@@ -67,10 +76,5 @@ class ResearchHandler:
        if not research_enabled or level == 'none':
            return False
-        # For Google native grounding, Gemini returns sources in the generation metadata,
-        # so we should not require pre-fetched research_sources.
-        if engine_str == 'google':
-            return True
        # For other engines, require that research actually returned sources
        return bool(research_sources)
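A minimal sketch of the grounding decision as it reads after this hunk, assuming the Google-specific branch is removed. The function name and signature are hypothetical; only the variable names come from the diff.

```python
from typing import List, Sequence


def should_use_grounding(
    research_enabled: bool,
    level: str,
    research_sources: Sequence[object],
) -> bool:
    """Ground only when research is on, the level is not 'none', and sources exist."""
    if not research_enabled or level == 'none':
        return False
    # Every engine now needs pre-fetched sources; there is no special case.
    return bool(research_sources)


if __name__ == "__main__":
    empty: List[object] = []
    print(should_use_grounding(True, 'enhanced', ['source']))
    print(should_use_grounding(True, 'enhanced', empty))
```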
--- a/backend/services/linkedin_service.py
+++ b/backend/services/linkedin_service.py
@@ -1,8 +1,9 @@
 """
 LinkedIn Content Generation Service for ALwrity
-This service generates various types of LinkedIn content with enhanced grounding capabilities.
+This service generates various types of LinkedIn content with provider-agnostic
-Integrated with Google Search, Gemini Grounded Provider, and quality analysis.
+LLM access via llm_text_gen. Research is handled by Exa/Tavily through the
 common research infrastructure.
 """
 import asyncio
@@ -21,57 +22,44 @@ from models.linkedin_models import (
    HashtagSuggestion, ImageSuggestion, Citation, ContentQualityMetrics,
    GroundingLevel
 )
-from services.research import GoogleSearchService
-from services.llm_providers.gemini_grounded_provider import GeminiGroundedProvider
 from services.citation import CitationManager
 from services.quality import ContentQualityAnalyzer
 class LinkedInService:
    """
-    Enhanced LinkedIn content generation service with grounding capabilities.
-    This service integrates real research, grounded content generation,
-    citation management, and quality analysis for enterprise-grade content.
+    LinkedIn content generation service with provider-agnostic LLM access.
+    Uses llm_text_gen for text generation (respects GPT_PROVIDER).
+    Uses Exa/Tavily for research via common infrastructure.
    """
    def __init__(self):
-        """Initialize the LinkedIn service with all required components."""
-        # Google Search Service not used - removed to avoid false warnings
-        self.google_search = None
-
-        try:
-            self.gemini_grounded = GeminiGroundedProvider()
-            logger.info("✅ Gemini Grounded Provider initialized")
-        except Exception as e:
-            logger.warning(f"⚠️ Gemini Grounded Provider not available: {e}")
-            self.gemini_grounded = None
-
-        try:
-            self.citation_manager = CitationManager()
-            logger.info("✅ Citation Manager initialized")
-        except Exception as e:
-            logger.warning(f"⚠️ Citation Manager not available: {e}")
-            self.citation_manager = None
-
-        try:
-            self.quality_analyzer = ContentQualityAnalyzer()
-            logger.info("✅ Content Quality Analyzer initialized")
-        except Exception as e:
-            logger.warning(f"⚠️ Content Quality Analyzer not available: {e}")
-            self.quality_analyzer = None
-
-        # Initialize fallback provider for non-grounded content
-        try:
-            from services.llm_providers.gemini_provider import gemini_structured_json_response, gemini_text_response
-            self.fallback_provider = {
-                'generate_structured_json': gemini_structured_json_response,
-                'generate_text': gemini_text_response
-            }
-            logger.info("✅ Fallback Gemini provider initialized")
-        except ImportError as e:
-            logger.warning(f"⚠️ Fallback Gemini provider not available: {e}")
-            self.fallback_provider = None
+        """Initialize the LinkedIn service with lazy provider initialization."""
+        self._citation_manager = None
+        self._quality_analyzer = None
+
+    @property
+    def citation_manager(self):
+        if self._citation_manager is None:
+            try:
+                self._citation_manager = CitationManager()
+                logger.info("✅ Citation Manager initialized")
+            except Exception as e:
+                logger.warning(f"⚠️ Citation Manager not available: {e}")
+                self._citation_manager = None
+        return self._citation_manager
+
+    @property
+    def quality_analyzer(self):
+        if self._quality_analyzer is None:
+            try:
+                self._quality_analyzer = ContentQualityAnalyzer()
+                logger.info("✅ Content Quality Analyzer initialized")
+            except Exception as e:
+                logger.warning(f"⚠️ Content Quality Analyzer not available: {e}")
+                self._quality_analyzer = None
+        return self._quality_analyzer
    async def generate_linkedin_post(self, request: LinkedInPostRequest) -> LinkedInPostResponse:
        """
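A minimal sketch of the lazy-property pattern used for `citation_manager` and `quality_analyzer`. The `LazyHolder` class and its `factory` argument are hypothetical, and the standard `logging` module stands in for the project's logger.

```python
import logging
from typing import Any, Callable, Optional

log = logging.getLogger(__name__)


class LazyHolder:
    """Builds a component on first access instead of in __init__."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory
        self._component: Optional[Any] = None

    @property
    def component(self) -> Optional[Any]:
        if self._component is None:
            try:
                self._component = self._factory()
            except Exception as exc:
                # The value stays None, so the next access tries again.
                log.warning("Component not available: %s", exc)
                self._component = None
        return self._component


if __name__ == "__main__":
    holder = LazyHolder(dict)
    print(holder.component is holder.component)
```

One consequence of this shape is that a failing constructor is retried on every access, since `None` is both the "not yet built" and the "failed" state.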
@@ -94,8 +82,9 @@ class LinkedInService:
            # Step 1: Conduct research if enabled
            from services.linkedin.research_handler import ResearchHandler
            research_handler = ResearchHandler(self)
+            user_id = str(getattr(request, 'user_id', '') or '')
            research_sources, research_time = await research_handler.conduct_research(
-                request, request.research_enabled, request.search_engine, 10
+                request, request.research_enabled, request.search_engine, 10, user_id=user_id
            )
            # Step 2: Generate content based on grounding level
@@ -105,15 +94,14 @@ class LinkedInService:
            from services.linkedin.content_generator import ContentGenerator
            content_generator = ContentGenerator(
                self.citation_manager, 
-                self.quality_analyzer, 
-                self.gemini_grounded, 
-                self.fallback_provider
+                self.quality_analyzer
            )
            if grounding_enabled:
                content_result = await content_generator.generate_grounded_post_content(
                    request=request,
-                    research_sources=research_sources
+                    research_sources=research_sources,
+                    user_id=str(getattr(request, 'user_id', '') or '')
                )
            else:
                logger.error("Grounding not enabled, Error generating LinkedIn post")
@@ -152,8 +140,9 @@ class LinkedInService:
            # Step 1: Conduct research if enabled
            from services.linkedin.research_handler import ResearchHandler
            research_handler = ResearchHandler(self)
+            user_id = str(getattr(request, 'user_id', '') or '')
            research_sources, research_time = await research_handler.conduct_research(
-                request, request.research_enabled, request.search_engine, 15
+                request, request.research_enabled, request.search_engine, 15, user_id=user_id
            )
            # Step 2: Generate content based on grounding level
@@ -163,15 +152,14 @@ class LinkedInService:
            from services.linkedin.content_generator import ContentGenerator
            content_generator = ContentGenerator(
                self.citation_manager, 
-                self.quality_analyzer, 
-                self.gemini_grounded, 
-                self.fallback_provider
+                self.quality_analyzer
            )
            if grounding_enabled:
                content_result = await content_generator.generate_grounded_article_content(
                    request=request,
-                    research_sources=research_sources
+                    research_sources=research_sources,
+                    user_id=str(getattr(request, 'user_id', '') or '')
                )
            else:
                logger.error("Grounding not enabled - cannot generate LinkedIn article without AI provider")
@@ -210,8 +198,9 @@ class LinkedInService:
            # Step 1: Conduct research if enabled
            from services.linkedin.research_handler import ResearchHandler
            research_handler = ResearchHandler(self)
+            user_id = str(getattr(request, 'user_id', '') or '')
            research_sources, research_time = await research_handler.conduct_research(
-                request, request.research_enabled, request.search_engine, 12
+                request, request.research_enabled, request.search_engine, 12, user_id=user_id
            )
            # Step 2: Generate content based on grounding level
@@ -221,15 +210,14 @@ class LinkedInService:
            from services.linkedin.content_generator import ContentGenerator
            content_generator = ContentGenerator(
                self.citation_manager, 
-                self.quality_analyzer, 
-                self.gemini_grounded, 
-                self.fallback_provider
+                self.quality_analyzer
            )
            if grounding_enabled:
                content_result = await content_generator.generate_grounded_carousel_content(
                    request=request,
-                    research_sources=research_sources
+                    research_sources=research_sources,
+                    user_id=str(getattr(request, 'user_id', '') or '')
                )
            else:
                logger.error("Grounding not enabled - cannot generate LinkedIn carousel without AI provider")
@@ -303,8 +291,9 @@ class LinkedInService:
            # Step 1: Conduct research if enabled
            from services.linkedin.research_handler import ResearchHandler
            research_handler = ResearchHandler(self)
+            user_id = str(getattr(request, 'user_id', '') or '')
            research_sources, research_time = await research_handler.conduct_research(
-                request, request.research_enabled, request.search_engine, 8
+                request, request.research_enabled, request.search_engine, 8, user_id=user_id
            )
            # Step 2: Generate content based on grounding level
@@ -314,15 +303,14 @@ class LinkedInService:
            from services.linkedin.content_generator import ContentGenerator
            content_generator = ContentGenerator(
                self.citation_manager, 
-                self.quality_analyzer, 
-                self.gemini_grounded, 
-                self.fallback_provider
+                self.quality_analyzer
            )
            if grounding_enabled:
                content_result = await content_generator.generate_grounded_video_script_content(
                    request=request,
-                    research_sources=research_sources
+                    research_sources=research_sources,
                    user_id=str(getattr(request, 'user_id', ''))
                )
            else:
                logger.error("Grounding not enabled - cannot generate LinkedIn video script without AI provider")
@@ -387,8 +375,9 @@ class LinkedInService:
            # Step 1: Conduct research if enabled
            from services.linkedin.research_handler import ResearchHandler
            research_handler = ResearchHandler(self)
            user_id = str(getattr(request, 'user_id', '') or '')
            research_sources, research_time = await research_handler.conduct_research(
-                request, request.research_enabled, request.search_engine, 5
+                request, request.research_enabled, request.search_engine, 5, user_id=user_id
            )
            # Step 2: Generate response based on grounding level
@@ -398,15 +387,14 @@ class LinkedInService:
            from services.linkedin.content_generator import ContentGenerator
            content_generator = ContentGenerator(
                self.citation_manager, 
-                self.quality_analyzer, 
+                self.quality_analyzer
-                self.gemini_grounded, 
-                self.fallback_provider
            )
            if grounding_enabled:
                response_result = await content_generator.generate_grounded_comment_response(
                    request=request,
-                    research_sources=research_sources
+                    research_sources=research_sources,
                    user_id=str(getattr(request, 'user_id', ''))
                )
            else:
                logger.error("Grounding not enabled - cannot generate LinkedIn comment response without AI provider")
@@ -423,20 +411,13 @@ class LinkedInService:
            )
            if result['success']:
                # Convert to LinkedInCommentResponseResult
-                from models.linkedin_models import CommentResponse
-                comment_response = CommentResponse(
-                    response=result['response'],
-                    alternative_responses=result.get('alternative_responses', []),
-                    tone_analysis=result.get('tone_analysis')
-                )
                return LinkedInCommentResponseResult(
                    success=True,
-                    data=comment_response,
+                    response=result['response'],
-                    research_sources=result['research_sources'],
+                    alternative_responses=result.get('alternative_responses', []),
-                    generation_metadata=result['generation_metadata'],
+                    tone_analysis=result.get('tone_analysis'),
-                    grounding_status=result['grounding_status']
+                    generation_metadata=result.get('generation_metadata', {}),
                    grounding_status=result.get('grounding_status')
                )
            else:
                return LinkedInCommentResponseResult(
@@ -451,35 +432,187 @@ class LinkedInService:
                error=f"Failed to generate LinkedIn comment response: {str(e)}"
            )
-    async def _conduct_research(self, topic: str, industry: str, search_engine: str, max_results: int = 10) -> List[ResearchSource]:
+    async def _conduct_research(self, topic: str, industry: str, search_engine: str, max_results: int = 10, user_id: str = None) -> List[ResearchSource]:
        """
-        Use native Google Search grounding instead of custom search.
+        Conduct research using the configured search engine with caching.
-        The Gemini API handles search automatically when the google_search tool is enabled.
+        
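        For Exa: delegates to ExaContentResearchProvider.simple_search() with pre-flight validation
        For Tavily: delegates to TavilyService.search() with pre-flight validation
        For Google or an empty value: defaults to Exa, which falls back to Tavily on failure
        For any other engine: no research is performed and an empty list is returned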
        Args:
            topic: Research topic
            industry: Target industry
-            search_engine: Search engine to use (google uses native grounding)
+            search_engine: Search engine to use (exa, tavily)
            max_results: Maximum number of results to return
            user_id: User ID for subscription pre-flight validation and usage tracking
        Returns:
-            List of research sources (empty for google - sources come from grounding metadata)
+            List of research sources
        """
        from services.cache.research_cache import research_cache
        search_engine_lower = search_engine.lower().strip()
        # Default to Exa if Google or unknown engine specified
        if search_engine_lower in ("google", ""):
            logger.info(f"Search engine '{search_engine}' not supported for direct research, defaulting to Exa")
            search_engine_lower = "exa"
        # Check cache first
        cached_result = research_cache.get_cached_result(
            keywords=[topic],
            industry=industry,
            target_audience="linkedin"
        )
        if cached_result:
            logger.info(f"Returning cached research result for topic: {topic[:50]}")
            # Convert cached dict back to ResearchSource objects
            sources = []
            for r in cached_result:
                sources.append(ResearchSource(
                    title=r.get('title', 'Untitled'),
                    url=r.get('url', ''),
                    content=r.get('content', '')[:500],
                    relevance_score=r.get('relevance_score', 0.5),
                    credibility_score=r.get('credibility_score', 0.5),
                    source_type=r.get('source_type', 'web'),
                    publication_date=r.get('publication_date')
                ))
            return sources
        try:
-            # Debug: Log the search engine value received
+            # Pre-flight validation if user_id provided
-            logger.info(f"Received search engine: '{search_engine}' (type: {type(search_engine)})")
+            if user_id:
                try:
                    from services.subscription.preflight_validator import validate_exa_research_operations
                    from services.database import get_session_for_user
                    from services.subscription import PricingService
                    import os
                    db_val = get_session_for_user(user_id)
                    if db_val:
                        try:
                            pricing_service = PricingService(db_val)
                            gpt_provider = os.getenv("GPT_PROVIDER", "google")
                            validate_exa_research_operations(pricing_service, user_id, gpt_provider)
                        finally:
                            db_val.close()
                except Exception as preflight_err:
                    logger.warning(f"Research pre-flight validation failed: {preflight_err}")
                    # Continue anyway - don't block research for pre-flight issues
            if search_engine_lower == "exa":
                from services.research import get_exa_content_provider
                try:
                    provider = get_exa_content_provider()
                except RuntimeError:
                    logger.warning("Exa API key not configured, falling back to Tavily")
                    provider = None
                if provider:
                    try:
                        results = await provider.simple_search(
                            query=f"{topic} {industry}",
                            num_results=max_results,
                            user_id=user_id
                        )
                        sources = []
                        for r in results:
                            sources.append(ResearchSource(
                                title=r.get('title', 'Untitled'),
                                url=r.get('url', ''),
                                content=r.get('text', '')[:500],
                                relevance_score=r.get('score', 0.5),
                                credibility_score=r.get('score', 0.5),
                                source_type='web',
                                publication_date=r.get('publishedDate')
                            ))
                        # Cache the results
                        cache_data = [
                            {
                                'title': s.title,
                                'url': s.url,
                                'content': s.content,
                                'relevance_score': s.relevance_score,
                                'credibility_score': s.credibility_score,
                                'source_type': s.source_type,
                                'publication_date': s.publication_date
                            }
                            for s in sources
                        ]
                        research_cache.cache_result(
                            keywords=[topic],
                            industry=industry,
                            target_audience="linkedin",
                            result=cache_data
                        )
                        logger.info(f"Exa research returned {len(sources)} sources for topic: {topic[:50]}")
                        return sources
                    except Exception as exa_err:
                        logger.warning(f"Exa research failed ({exa_err}), falling back to Tavily")
                # Fallback to Tavily
                search_engine_lower = "tavily"
            # Plain `if` (not `elif`) so the Exa branch above can fall through to Tavily
            if search_engine_lower == "tavily":
                from services.research.tavily_service import TavilyService
                tavily_service = TavilyService()
                if not tavily_service.enabled:
                    logger.warning("Tavily API key not configured, skipping Tavily research")
                    return []
                result = await tavily_service.search(
                    query=f"{topic} {industry}",
                    max_results=max_results
                )
                raw_results = result.get('results', []) if isinstance(result, dict) else []
                sources = []
                for r in raw_results:
                    sources.append(ResearchSource(
                        title=r.get('title', 'Untitled'),
                        url=r.get('url', ''),
                        content=r.get('content', '')[:500],
                        relevance_score=r.get('score', r.get('relevance_score', 0.5)),
                        credibility_score=r.get('relevance_score', 0.5),
                        source_type='web',
                        publication_date=r.get('published_date')
                    ))
                # Cache the results
                cache_data = [
                    {
                        'title': s.title,
                        'url': s.url,
                        'content': s.content,
                        'relevance_score': s.relevance_score,
                        'credibility_score': s.credibility_score,
                        'source_type': s.source_type,
                        'publication_date': s.publication_date
                    }
                    for s in sources
                ]
                research_cache.cache_result(
                    keywords=[topic],
                    industry=industry,
                    target_audience="linkedin",
                    result=cache_data
                )
                logger.info(f"Tavily research returned {len(sources)} sources for topic: {topic[:50]}")
                return sources
-            # Handle both enum value 'google' and enum name 'GOOGLE'
-            if search_engine.lower() == "google":
-                # No need for manual search - Gemini handles it automatically with native grounding
-                logger.info("Using native Google Search grounding via Gemini API - no manual search needed")
-                return []  # Return empty list - sources will come from grounding metadata
             else:
-                # Fallback to basic research for other search engines
+                logger.warning(f"Unknown search engine '{search_engine}', no research performed")
-                logger.error(f"Search engine {search_engine} not fully implemented, using fallback")
+                return []
-                raise Exception(f"Search engine {search_engine} not fully implemented, using fallback")
        except Exception as e:
-            logger.error(f"Error conducting research: {str(e)}")
+            logger.error(f"Research failed for engine {search_engine}: {e}")
-            # Fallback to basic research
+            return []
-            raise Exception(f"Error conducting research: {str(e)}")
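The Exa-then-Tavily fallback in `_conduct_research` can be read as an ordered provider chain. The following is a minimal sketch of that idea, not the code in this commit: `search_with_fallback`, `failing_exa`, and `stub_tavily` are hypothetical names, and the providers are stubs rather than the real Exa or Tavily clients.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Sequence, Tuple

# Hypothetical provider type: an async callable that takes a query and
# returns a list of result dicts. Not part of the ALwrity codebase.
Provider = Callable[[str], Awaitable[List[Dict[str, str]]]]


async def search_with_fallback(
    query: str, providers: Sequence[Tuple[str, Provider]]
) -> Tuple[str, List[Dict[str, str]]]:
    """Try each provider in order and return the first successful result."""
    for name, provider in providers:
        try:
            return name, await provider(query)
        except Exception as err:  # broad on purpose: any failure triggers fallback
            print(f"{name} failed ({err}), trying next provider")
    # Every provider failed: mirror the diff's behaviour of returning no sources
    return "none", []


async def failing_exa(query: str) -> List[Dict[str, str]]:
    """Stub that simulates a missing API key."""
    raise RuntimeError("EXA_API_KEY not configured")


async def stub_tavily(query: str) -> List[Dict[str, str]]:
    """Stub that always returns one canned result."""
    return [{"title": "Stub result", "url": "https://example.com", "content": query}]


engine, results = asyncio.run(
    search_with_fallback("ai marketing", [("exa", failing_exa), ("tavily", stub_tavily)])
)
```

Keeping the chain as data makes the fallback order explicit and avoids the `if`/`elif` pitfall where reassigning the engine name inside one branch never reaches the next branch.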
--- a/backend/services/persona/linkedin/linkedin_persona_service.py
+++ b/backend/services/persona/linkedin/linkedin_persona_service.py
@@ -1,12 +1,13 @@
 """
 LinkedIn Persona Service
 Handles LinkedIn-specific persona generation and optimization.
 Uses provider-agnostic llm_text_gen for LLM access.
 """
 from typing import Dict, Any, Optional
 from loguru import logger
-from services.llm_providers.gemini_provider import gemini_structured_json_response
+from services.llm_providers.main_text_generation import llm_text_gen
 from .linkedin_persona_prompts import LinkedInPersonaPrompts
 from .linkedin_persona_schemas import LinkedInPersonaSchemas
@@ -57,14 +58,15 @@ class LinkedInPersonaService:
            # Extract user_id for tracking
            user_id = onboarding_data.get("session_info", {}).get("user_id")
-            # Generate structured response using Gemini with optimized prompts
+            # Generate structured response using provider-agnostic gateway
-            response = gemini_structured_json_response(
+            response = llm_text_gen(
                prompt=prompt,
-                schema=schema,
+                json_struct=schema,
-                temperature=0.2,
-                max_tokens=4096,
                 system_prompt=system_prompt,
-                user_id=user_id
+                user_id=user_id,
+                flow_type="linkedin_persona_generation",
+                max_tokens=4096,
+                temperature=0.2
            )
            if "error" in response:
--- a/backend/services/research/__init__.py
+++ b/backend/services/research/__init__.py
@@ -7,6 +7,7 @@ replacing mock research with real-time industry information.
 Available Services:
 - GoogleSearchService: Real-time industry research using Google Custom Search API
 - ExaService: Competitor discovery and analysis using Exa API
 - ExaContentResearchProvider: Shared content research provider for LinkedIn/Blog
 - TavilyService: AI-powered web search with real-time information
 - Source ranking and credibility assessment
 - Content extraction and insight generation
@@ -17,12 +18,13 @@ Core Module (v2.0):
 - ParameterOptimizer: AI-driven parameter optimization
 Author: ALwrity Team
-Version: 2.0
+Version: 2.1
-Last Updated: December 2025
+Last Updated: June 2026
 """
 from .google_search_service import GoogleSearchService
 from .exa_service import ExaService
 from .exa_content_research import ExaContentResearchProvider, get_exa_content_provider
 from .tavily_service import TavilyService
 # Core Research Engine (v2.0)
@@ -43,6 +45,10 @@ __all__ = [
    "ExaService",
    "TavilyService",
    # Shared content research provider
    "ExaContentResearchProvider",
    "get_exa_content_provider",
    # Core Research Engine (v2.0)
    "ResearchEngine",
    "ResearchContext",
--- a/backend/services/research/exa_content_research.py
+++ b/backend/services/research/exa_content_research.py
@@ -0,0 +1,198 @@
 """
 Exa Content Research Provider
 Shared Exa neural search provider for content research across ALwrity modules.
 Provides simple_search() for fact-checking, content grounding, and research.
 Used by:
 - LinkedIn Writer (content generation research)
 - Blog Writer (fact-checking and writing assistance)
 This is the content-research variant. For competitor discovery/analysis,
 use ExaService in exa_service.py.
 """
 import os
 import asyncio
 from typing import List, Dict, Any, Optional
 from loguru import logger
 class ExaContentResearchProvider:
    """Exa neural search provider for content research."""
    def __init__(self):
        """Initialize the Exa content research provider."""
        self.api_key = os.getenv("EXA_API_KEY")
        if not self.api_key:
            raise RuntimeError("EXA_API_KEY not configured")
        from exa_py import Exa
        self.exa = Exa(self.api_key)
        logger.info("✅ Exa Content Research Provider initialized")
    async def simple_search(
        self,
        query: str,
        num_results: int = 5,
        user_id: Optional[str] = None,
        include_domains: Optional[List[str]] = None,
        exclude_domains: Optional[List[str]] = None,
    ) -> List[Dict[str, Any]]:
        """
        Simple Exa search for content research and fact-checking.
        Handles subscription preflight check and usage tracking.
        Args:
            query: Search query string
            num_results: Number of results to return (default 5)
            user_id: Optional user ID for subscription checking
            include_domains: Only return results from these domains
            exclude_domains: Exclude results from these domains
        Returns:
            List of source dicts with title, url, text, publishedDate, author, score keys
        Raises:
            HTTPException(429): If user has exceeded subscription limits
            RuntimeError: If the search still fails after the simplified retry
        """
        # Preflight subscription check
        if user_id:
            from models.subscription_models import APIProvider
            from services.subscription import PricingService
            from services.database import get_session_for_user
            from fastapi import HTTPException
            db = get_session_for_user(user_id)
            if db:
                try:
                    pricing_service = PricingService(db)
                    can_proceed, message, usage_info = pricing_service.check_usage_limits(
                        user_id=user_id,
                        provider=APIProvider.EXA,
                        tokens_requested=0,
                        actual_provider_name="exa",
                    )
                    if not can_proceed:
                        raise HTTPException(status_code=429, detail={
                            'error': 'insufficient_balance',
                            'message': message,
                            'provider': 'exa',
                            'usage_info': usage_info or {}
                        })
                except HTTPException:
                    raise
                except Exception as e:
                    logger.warning(f"[Exa simple_search] Preflight check failed: {e}")
                finally:
                    try:
                        db.close()
                    except Exception:
                        pass
        search_kwargs = {
            "type": "auto",
            "num_results": num_results,
            "text": {"max_characters": 1000},
            "highlights": {"num_sentences": 2, "highlights_per_url": 2},
        }
        if include_domains:
            search_kwargs["include_domains"] = include_domains
        if exclude_domains:
            search_kwargs["exclude_domains"] = exclude_domains
        # Resolve the loop before the try block so the retry path can always use it
        loop = asyncio.get_running_loop()
        try:
            results = await loop.run_in_executor(
                None,
                lambda: self.exa.search_and_contents(query, **search_kwargs),
            )
        except Exception as e:
            logger.error(f"[Exa simple_search] API call failed: {e}")
            # Retry with simpler parameters
            retry_kwargs = {"type": "auto", "num_results": num_results, "text": True}
            if include_domains:
                retry_kwargs["include_domains"] = include_domains
            if exclude_domains:
                retry_kwargs["exclude_domains"] = exclude_domains
            try:
                logger.info("[Exa simple_search] Retrying with simplified parameters")
                results = await loop.run_in_executor(
                    None,
                    lambda: self.exa.search_and_contents(query, **retry_kwargs),
                )
            except Exception as retry_error:
                logger.error(f"[Exa simple_search] Retry also failed: {retry_error}")
                raise RuntimeError(f"Exa search failed: {str(retry_error)}") from retry_error
        sources = []
        for result in results.results:
            sources.append({
                'title': getattr(result, 'title', 'Untitled'),
                'url': getattr(result, 'url', ''),
                'text': getattr(result, 'text', ''),
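                # exa_py results expose published_date; keep the camelCase lookup as a fallback
                'publishedDate': getattr(result, 'published_date', None) or getattr(result, 'publishedDate', ''),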
                'author': getattr(result, 'author', ''),
                'score': (lambda v: v if v is not None else 0.5)(getattr(result, 'score', 0.5)),
            })
        # Track usage
        if user_id:
            cost = 0.005  # ~0.5 cents per search
            try:
                self.track_usage(user_id, cost)
            except Exception as e:
                logger.warning(f"[Exa simple_search] Failed to track usage: {e}")
        logger.info(f"[Exa simple_search] Found {len(sources)} sources for query: {query[:80]}...")
        return sources
    def track_usage(self, user_id: str, cost: float):
        """Track Exa API usage after successful call."""
        from services.database import get_session_for_user
        from services.subscription import PricingService
        from sqlalchemy import text
        db = get_session_for_user(user_id)
        if not db:
            logger.warning(f"[track_usage] Could not get DB session for user {user_id}")
            return
        try:
            pricing_service = PricingService(db)
            current_period = pricing_service.get_current_billing_period(user_id)
            # Update exa_calls and exa_cost via SQL UPDATE
            update_query = text("""
                UPDATE usage_summaries 
                SET exa_calls = COALESCE(exa_calls, 0) + 1,
                    exa_cost = COALESCE(exa_cost, 0) + :cost,
                    total_calls = total_calls + 1,
                    total_cost = total_cost + :cost
                WHERE user_id = :user_id AND billing_period = :period
            """)
            db.execute(update_query, {
                'cost': cost,
                'user_id': user_id,
                'period': current_period
            })
            db.commit()
            logger.info(f"[Exa] Tracked usage: user={user_id}, cost=${cost}")
        except Exception as e:
            logger.error(f"[Exa] Failed to track usage: {e}")
            db.rollback()
        finally:
            db.close()
 # Global singleton instance
 _exa_content_provider: Optional[ExaContentResearchProvider] = None
 def get_exa_content_provider() -> ExaContentResearchProvider:
    """Get or create the global Exa content research provider instance."""
    global _exa_content_provider
    if _exa_content_provider is None:
        _exa_content_provider = ExaContentResearchProvider()
    return _exa_content_provider
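`simple_search()` runs a blocking client call in the default executor and retries once with simpler parameters. A minimal sketch of that pattern follows, assuming a hypothetical blocking function `flaky_search` in place of the real Exa client; `search_with_retry` is likewise an illustrative name.

```python
import asyncio
from typing import Any, Callable, Dict, List


def flaky_search(query: str, **kwargs: Any) -> List[str]:
    """Hypothetical blocking client call that rejects the 'highlights' option."""
    if "highlights" in kwargs:
        raise ValueError("unsupported option: highlights")
    return [f"{query}:{kwargs['num_results']}"]


async def search_with_retry(
    search: Callable[..., List[str]],
    query: str,
    full_kwargs: Dict[str, Any],
    simple_kwargs: Dict[str, Any],
) -> List[str]:
    """Run a blocking search off the event loop, retrying once with fewer options."""
    # Resolve the loop outside the try block so both attempts can use it
    loop = asyncio.get_running_loop()
    try:
        return await loop.run_in_executor(None, lambda: search(query, **full_kwargs))
    except Exception:
        # Second attempt with the simplified parameters; errors here propagate
        return await loop.run_in_executor(None, lambda: search(query, **simple_kwargs))


out = asyncio.run(
    search_with_retry(
        flaky_search,
        "exa",
        {"num_results": 3, "highlights": True},
        {"num_results": 3},
    )
)
```

Running the call in an executor keeps the event loop responsive while the synchronous client waits on the network.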
--- a/backend/services/scheduler/core/failure_detection_service.py
+++ b/backend/services/scheduler/core/failure_detection_service.py
@@ -370,6 +370,136 @@ class FailureDetectionService:
                        "last_failure": task.last_failure.isoformat() if task.last_failure else None
                    })
            # Check onboarding full website analysis tasks
            from models.website_analysis_monitoring_models import OnboardingFullWebsiteAnalysisTask
            onboarding_tasks = self.db.query(OnboardingFullWebsiteAnalysisTask).filter(
                OnboardingFullWebsiteAnalysisTask.status == "needs_intervention"
            )
            if user_id:
                onboarding_tasks = onboarding_tasks.filter(OnboardingFullWebsiteAnalysisTask.user_id == user_id)
            for task in onboarding_tasks.all():
                pattern = self.analyze_task_failures(task.id, "onboarding_full_website_analysis", task.user_id)
                tasks_needing_intervention.append({
                    "task_id": task.id,
                    "task_type": "onboarding_full_website_analysis",
                    "user_id": task.user_id,
                    "website_url": task.website_url,
                    "failure_pattern": {
                        "consecutive_failures": pattern.consecutive_failures if pattern else task.consecutive_failures,
                        "recent_failures": pattern.recent_failures if pattern else 0,
                        "failure_reason": pattern.failure_reason.value if pattern else "unknown",
                        "last_failure_time": pattern.last_failure_time.isoformat() if pattern and pattern.last_failure_time else None,
                        "error_patterns": pattern.error_patterns if pattern else [],
                    },
                    "failure_reason": task.failure_reason,
                    "last_failure": task.last_failure.isoformat() if task.last_failure else None
                })
            # Check deep competitor analysis tasks
            from models.website_analysis_monitoring_models import DeepCompetitorAnalysisTask
            competitor_tasks = self.db.query(DeepCompetitorAnalysisTask).filter(
                DeepCompetitorAnalysisTask.status == "needs_intervention"
            )
            if user_id:
                competitor_tasks = competitor_tasks.filter(DeepCompetitorAnalysisTask.user_id == user_id)
            for task in competitor_tasks.all():
                pattern = self.analyze_task_failures(task.id, "deep_competitor_analysis", task.user_id)
                tasks_needing_intervention.append({
                    "task_id": task.id,
                    "task_type": "deep_competitor_analysis",
                    "user_id": task.user_id,
                    "website_url": task.website_url,
                    "failure_pattern": {
                        "consecutive_failures": pattern.consecutive_failures if pattern else task.consecutive_failures,
                        "recent_failures": pattern.recent_failures if pattern else 0,
                        "failure_reason": pattern.failure_reason.value if pattern else "unknown",
                        "last_failure_time": pattern.last_failure_time.isoformat() if pattern and pattern.last_failure_time else None,
                        "error_patterns": pattern.error_patterns if pattern else [],
                    },
                    "failure_reason": task.failure_reason,
                    "last_failure": task.last_failure.isoformat() if task.last_failure else None
                })
            # Check SIF indexing tasks
            from models.website_analysis_monitoring_models import SIFIndexingTask
            sif_tasks = self.db.query(SIFIndexingTask).filter(
                SIFIndexingTask.status == "needs_intervention"
            )
            if user_id:
                sif_tasks = sif_tasks.filter(SIFIndexingTask.user_id == user_id)
            for task in sif_tasks.all():
                pattern = self.analyze_task_failures(task.id, "sif_indexing", task.user_id)
                tasks_needing_intervention.append({
                    "task_id": task.id,
                    "task_type": "sif_indexing",
                    "user_id": task.user_id,
                    "website_url": task.website_url,
                    "failure_pattern": {
                        "consecutive_failures": pattern.consecutive_failures if pattern else task.consecutive_failures,
                        "recent_failures": pattern.recent_failures if pattern else 0,
                        "failure_reason": pattern.failure_reason.value if pattern else "unknown",
                        "last_failure_time": pattern.last_failure_time.isoformat() if pattern and pattern.last_failure_time else None,
                        "error_patterns": pattern.error_patterns if pattern else [],
                    },
                    "failure_reason": task.failure_reason,
                    "last_failure": task.last_failure.isoformat() if task.last_failure else None
                })
            # Check market trends tasks
            from models.website_analysis_monitoring_models import MarketTrendsTask
            trends_tasks = self.db.query(MarketTrendsTask).filter(
                MarketTrendsTask.status == "needs_intervention"
            )
            if user_id:
                trends_tasks = trends_tasks.filter(MarketTrendsTask.user_id == user_id)
            for task in trends_tasks.all():
                pattern = self.analyze_task_failures(task.id, "market_trends", task.user_id)
                tasks_needing_intervention.append({
                    "task_id": task.id,
                    "task_type": "market_trends",
                    "user_id": task.user_id,
                    "website_url": task.website_url,
                    "failure_pattern": {
                        "consecutive_failures": pattern.consecutive_failures if pattern else task.consecutive_failures,
                        "recent_failures": pattern.recent_failures if pattern else 0,
                        "failure_reason": pattern.failure_reason.value if pattern else "unknown",
                        "last_failure_time": pattern.last_failure_time.isoformat() if pattern and pattern.last_failure_time else None,
                        "error_patterns": pattern.error_patterns if pattern else [],
                    },
                    "failure_reason": task.failure_reason,
                    "last_failure": task.last_failure.isoformat() if task.last_failure else None
                })
            # Check advertools tasks (failed tasks may also need attention)
            from models.website_analysis_monitoring_models import AdvertoolsTask
            advertools_tasks = self.db.query(AdvertoolsTask).filter(
                AdvertoolsTask.status.in_(["needs_intervention", "failed"])
            )
            if user_id:
                advertools_tasks = advertools_tasks.filter(AdvertoolsTask.user_id == user_id)
            for task in advertools_tasks.all():
                pattern = self.analyze_task_failures(task.id, "advertools", task.user_id)
                tasks_needing_intervention.append({
                    "task_id": task.id,
                    "task_type": "advertools",
                    "user_id": task.user_id,
                    "website_url": task.website_url,
                    "failure_pattern": {
                        "consecutive_failures": pattern.consecutive_failures if pattern else task.consecutive_failures,
                        "recent_failures": pattern.recent_failures if pattern else 0,
                        "failure_reason": pattern.failure_reason.value if pattern else "unknown",
                        "last_failure_time": pattern.last_failure_time.isoformat() if pattern and pattern.last_failure_time else None,
                        "error_patterns": pattern.error_patterns if pattern else [],
                    },
                    "failure_reason": task.failure_reason,
                    "last_failure": task.last_failure.isoformat() if task.last_failure else None
                })
            return tasks_needing_intervention
        except Exception as e:
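The five task-type blocks above share the same shape and differ only in the model, the task-type label, and the statuses that count. One possible way to express that as a table-driven loop is sketched below. It uses plain dataclasses instead of the SQLAlchemy models, omits the `failure_pattern` analysis, and `StubTask` and `collect_interventions` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Optional, Sequence, Tuple


@dataclass
class StubTask:
    """Hypothetical stand-in for the ORM task models in the diff."""
    id: int
    user_id: str
    website_url: str
    status: str
    failure_reason: Optional[str] = None


# (task_type label, statuses that count as needing attention, task rows)
TaskSource = Tuple[str, Sequence[str], Iterable[StubTask]]


def collect_interventions(
    sources: Iterable[TaskSource], user_id: Optional[str] = None
) -> List[Dict[str, Any]]:
    """Build one intervention entry per matching task across all sources."""
    collected: List[Dict[str, Any]] = []
    for task_type, statuses, tasks in sources:
        for task in tasks:
            if task.status not in statuses:
                continue
            if user_id and task.user_id != user_id:
                continue
            collected.append({
                "task_id": task.id,
                "task_type": task_type,
                "user_id": task.user_id,
                "website_url": task.website_url,
                "failure_reason": task.failure_reason,
            })
    return collected


sources = [
    ("sif_indexing", ["needs_intervention"], [
        StubTask(1, "u1", "https://example.com", "needs_intervention", "api_limit"),
        StubTask(2, "u2", "https://example.org", "active"),
    ]),
    ("advertools", ["needs_intervention", "failed"], [
        StubTask(3, "u1", "https://example.com", "failed", "timeout"),
    ]),
]
entries = collect_interventions(sources, user_id="u1")
```

With this shape, adding a new task type means adding one row to the table rather than another copy of the loop.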
--- a/backend/services/scheduler/executors/advertools_executor.py
+++ b/backend/services/scheduler/executors/advertools_executor.py
@@ -1,6 +1,7 @@
 import asyncio
 from datetime import datetime, timedelta
 from typing import Any, Dict, List
 from urllib.parse import urlparse
 from loguru import logger
 from sqlalchemy.orm import Session
 from sqlalchemy import text
@@ -63,27 +64,66 @@ class AdvertoolsExecutor:
            result = {}
            if task_type == 'content_audit':
-                # Phase 1: Audit content themes using sample URLs from sitemap
+                # Phase 1: Get sitemap analysis (freshness, URL structure, pillars)
                # First, get the sitemap to find recent URLs
                sitemap_result = await self.advertools_service.analyze_sitemap(effective_url)
                audit_urls = []
                url_structure = {}
                freshness = {}
                if sitemap_result.get('success'):
-                    # Use the sample URLs returned by the service
+                    metrics = sitemap_result.get('metrics', {})
-                    audit_urls = sitemap_result.get('metrics', {}).get('audit_sample_urls', [])
+                    audit_urls = metrics.get('audit_sample_urls', [])
                    url_structure = metrics.get('url_structure', {})
                    freshness = {
                        "freshness_score": metrics.get('freshness_score'),
                        "publishing_velocity": metrics.get('publishing_velocity'),
                        "stale_content_percentage": metrics.get('stale_content_percentage'),
                        "publishing_recency": metrics.get('publishing_recency'),
                        "publishing_trend": metrics.get('publishing_trend'),
                    }
                if not audit_urls:
                    # Fallback to homepage if sitemap fails or empty
                    audit_urls = [website_url]
-                # Run the audit on the sample
+                # Phase 2: Theme analysis via content audit
-                result = await self.advertools_service.audit_content(audit_urls)
+                audit_result = await self.advertools_service.audit_content(audit_urls)
                # Phase 3: Site structure analysis (links, redirects, image SEO)
                site_domain = urlparse(website_url).netloc or website_url
                structure_result = await self.advertools_service.analyze_site_structure(
                    audit_urls, site_domain=site_domain
                )
                # Phase 4: Robots.txt compliance analysis
                robots_result = await self.advertools_service.analyze_robots_txt(website_url)
                # Phase 5: Crawl budget analysis
                budget_result = await self.advertools_service.analyze_crawl_budget(
                    effective_url, site_domain
                )
                # Merge results
                result = {
                    "success": audit_result.get('success', False) or structure_result.get('success', False),
                    "themes": audit_result.get('themes', []),
                    "page_count": audit_result.get('page_count', 0),
                    "avg_word_count": audit_result.get('avg_word_count', 0),
                    "link_health": structure_result.get('link_health', {}),
                    "redirect_audit": structure_result.get('redirect_audit', {}),
                    "image_seo": structure_result.get('image_seo', {}),
                    "page_status": structure_result.get('page_status', {}),
                    "url_structure": url_structure,
                    "freshness": freshness,
                    "robots_txt": robots_result,
                    "crawl_budget": budget_result,
                    "timestamp": datetime.utcnow().isoformat()
                }
                if result.get('success'):
                    await self._update_persona_augmentation(user_id, website_url, result, db)
            elif task_type == 'site_health':
-                # Phase 1: Check site health (freshness, velocity)
+                # Site health: freshness, velocity, URL structure
                result = await self.advertools_service.analyze_sitemap(effective_url)
                if result.get('success'):
@@ -157,7 +197,8 @@ class AdvertoolsExecutor:
    async def _update_persona_augmentation(self, user_id: str, website_url: str, audit_result: Dict[str, Any], db: Session):
        """
-        Updates the user's Brand Persona with discovered themes from the content audit.
+        Updates the user's Brand Persona with discovered themes, site structure,
        link health, and redirect data from the content audit.
        """
        try:
            session = db.query(OnboardingSession).filter(OnboardingSession.user_id == user_id).first()
@@ -170,18 +211,40 @@ class AdvertoolsExecutor:
                self.logger.warning(f"No website analysis found for user {user_id}")
                return
            # Update brand_analysis with augmented themes
            current_brand = analysis.brand_analysis or {}
-            # Add or update the 'augmented_themes' field
+            # Core themes
            current_brand['augmented_themes'] = audit_result.get('themes', [])
            # Link health
            current_brand['link_health'] = audit_result.get('link_health', {})
            # Redirect audit
            current_brand['redirect_audit'] = audit_result.get('redirect_audit', {})
            # Image SEO
            current_brand['image_seo'] = audit_result.get('image_seo', {})
            # Page status distribution
            current_brand['page_status'] = audit_result.get('page_status', {})
            # URL structure analysis
            current_brand['url_structure'] = audit_result.get('url_structure', {})
            # Freshness
            current_brand['freshness'] = audit_result.get('freshness', {})
            # Robots.txt compliance
            current_brand['robots_txt'] = audit_result.get('robots_txt', {})
            # Crawl budget analysis
            current_brand['crawl_budget'] = audit_result.get('crawl_budget', {})
            current_brand['last_advertools_audit'] = datetime.utcnow().isoformat()
            # Force SQLAlchemy to detect change in JSON field
            from sqlalchemy.orm.attributes import flag_modified
            flag_modified(analysis, "brand_analysis")
            # Also update content_strategy_insights if relevant
            if 'avg_word_count' in audit_result:
                current_strategy = analysis.content_strategy_insights or {}
                current_strategy['avg_content_length'] = audit_result['avg_word_count']
@@ -196,7 +259,8 @@ class AdvertoolsExecutor:
    async def _update_site_health_metrics(self, user_id: str, website_url: str, health_result: Dict[str, Any], db: Session):
        """
-        Updates the WebsiteAnalysis with site health metrics (velocity, freshness).
+        Updates the WebsiteAnalysis with site health metrics (publishing velocity,
        stale content, freshness score, recency and trend, top pillars, URL structure).
        """
        try:
            session = db.query(OnboardingSession).filter(OnboardingSession.user_id == user_id).first()
@@ -207,7 +271,6 @@ class AdvertoolsExecutor:
            if not analysis:
                return
            # Update seo_audit with health metrics
            current_seo = analysis.seo_audit or {}
            metrics = health_result.get('metrics', {})
@@ -216,7 +279,11 @@ class AdvertoolsExecutor:
                "publishing_velocity": metrics.get('publishing_velocity'),
                "stale_content_count": metrics.get('stale_content_count'),
                "stale_content_percentage": metrics.get('stale_content_percentage'),
-                "top_pillars": metrics.get('top_pillars')
+                "freshness_score": metrics.get('freshness_score'),
                "publishing_recency": metrics.get('publishing_recency'),
                "publishing_trend": metrics.get('publishing_trend'),
                "top_pillars": metrics.get('top_pillars'),
                "url_structure": metrics.get('url_structure', {})
            }
            current_seo['last_advertools_health_check'] = datetime.utcnow().isoformat()
--- a/backend/services/seo/advertools_service.py
+++ b/backend/services/seo/advertools_service.py
@@ -1,12 +1,18 @@
 import advertools as adv
 import pandas as pd
 import asyncio
-from typing import Dict, Any, List, Optional
+from typing import Dict, Any, List, Optional, Tuple
 from datetime import datetime, timedelta
 from loguru import logger
 import json
 import os
 import tempfile
 from urllib.parse import urlparse
 from collections import Counter
 import urllib.request
 import urllib.error
 import socket
 import re
 class AdvertoolsService:
    """
@@ -19,51 +25,58 @@ class AdvertoolsService:
    async def analyze_sitemap(self, sitemap_url: str) -> Dict[str, Any]:
        """
-        Analyzes a website's sitemap to extract metrics on publishing velocity and freshness.
+        Analyzes a website's sitemap to extract metrics on publishing velocity, freshness,
        URL structure patterns, and topic distribution.
        """
        try:
            self.logger.info(f"Analyzing sitemap: {sitemap_url}")
            # advertools sitemap_to_df is blocking, run in executor
            loop = asyncio.get_event_loop()
            df = await loop.run_in_executor(None, lambda: adv.sitemap_to_df(sitemap_url))
            if df is None or df.empty:
                return {"success": False, "error": "Sitemap is empty or could not be parsed."}
            # Convert lastmod to datetime
            if 'lastmod' in df.columns:
                df['lastmod'] = pd.to_datetime(df['lastmod'], errors='coerce', utc=True)
            total_urls = len(df)
-            # Handle potential empty datetime columns
+            # --- Content Freshness Scoring ---
-            if 'lastmod' in df.columns and not df['lastmod'].isna().all():
+            freshness = self._compute_freshness(df)
-                now = datetime.now(df['lastmod'].dt.tz)
-                thirty_days_ago = now - timedelta(days=30)
-                recent_urls = df[df['lastmod'] > thirty_days_ago]
-                six_months_ago = now - timedelta(days=180)
-                stale_urls = df[df['lastmod'] < six_months_ago]
-                publishing_velocity = len(recent_urls) / 4.0 # URLs per week
-                stale_count = len(stale_urls)
-            else:
-                publishing_velocity = 0
-                stale_count = 0
-            # Enhanced Content Pillars (Top folder patterns - 3 levels deep)
+            # --- URL Structure Analysis ---
-            def extract_hierarchy(url: str):
+            url_structure = {}
-                try:
+            if 'loc' in df.columns:
-                    parts = urlparse(url).path.strip('/').split('/')
+                url_structure = await self._analyze_url_structure(df['loc'].tolist())
-                    if not parts or not parts[0]: return "home"
+            
-                    return "/".join(parts[:2]) # Capture top 2 segments
+            # --- Content Pillars via url_to_df ---
-                except:
+            pillars = {}
-                    return "other"
+            url_df = None
            try:
                url_df = adv.url_to_df(df['loc'])
                if url_df is not None and not url_df.empty:
                    dir_cols = [c for c in url_df.columns if c.startswith('dir_')]
                    if dir_cols:
                        pillar_series = url_df[dir_cols[0]].fillna("home").astype(str)
                        for col in dir_cols[1:3]:
                            mask = url_df[col].notna() & (url_df[col].astype(str) != 'nan')
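                            # Append "/<segment>" only where the deeper directory exists, so shallow URLs do not get trailing slashes
                            pillar_series = pillar_series + ("/" + url_df[col].astype(str)).where(mask, "")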
                        pillars = pillar_series.value_counts().head(15).to_dict()
            except Exception:
                fallback_pillars = {}
                if 'loc' in df.columns:
                    def extract_hierarchy(url: str):
                        try:
                            parts = urlparse(url).path.strip('/').split('/')
                            if not parts or not parts[0]: return "home"
                            return "/".join(parts[:2])
                        except Exception:
                            return "other"
                    fallback_pillars = df['loc'].apply(extract_hierarchy).value_counts().head(15).to_dict()
                pillars = fallback_pillars
-            df['pillar'] = df['loc'].apply(extract_hierarchy)
+            # Sample URLs for auditing (top 15 most recent)
-            pillars = df['pillar'].value_counts().head(15).to_dict()
-            # Return a sample of URLs for auditing (top 15 most recent if available)
            audit_urls = []
            if 'lastmod' in df.columns and not df['lastmod'].isna().all():
                audit_urls = df.sort_values('lastmod', ascending=False).head(15)['loc'].tolist()
@@ -74,10 +87,14 @@ class AdvertoolsService:
                "success": True,
                "metrics": {
                    "total_urls": total_urls,
-                    "publishing_velocity": round(publishing_velocity, 2),
+                    "publishing_velocity": freshness.get("publishing_velocity"),
-                    "stale_content_count": stale_count,
+                    "stale_content_count": freshness.get("stale_count"),
-                    "stale_content_percentage": round((stale_count / total_urls) * 100, 2) if total_urls > 0 else 0,
+                    "stale_content_percentage": freshness.get("stale_percentage"),
                    "freshness_score": freshness.get("freshness_score"),
                    "publishing_recency": freshness.get("publishing_recency"),
                    "publishing_trend": freshness.get("publishing_trend"),
                    "top_pillars": pillars,
                    "url_structure": url_structure,
                    "audit_sample_urls": audit_urls
                },
                "timestamp": datetime.utcnow().isoformat()
@@ -86,6 +103,146 @@ class AdvertoolsService:
            self.logger.error(f"Failed to analyze sitemap {sitemap_url}: {str(e)}")
            return {"success": False, "error": str(e)}
    def _compute_freshness(self, df: pd.DataFrame) -> Dict[str, Any]:
        """Compute content freshness, publishing velocity, and staleness metrics."""
        result = {
            "publishing_velocity": 0,
            "stale_count": 0,
            "stale_percentage": 0,
            "freshness_score": 0,
            "publishing_recency": {},
            "publishing_trend": "unknown"
        }
        if 'lastmod' not in df.columns or df['lastmod'].isna().all():
            return result
        lastmod = df['lastmod'].dropna()
        if lastmod.empty:
            return result
        now = datetime.now(lastmod.dt.tz)
        thirty_days_ago = now - timedelta(days=30)
        ninety_days_ago = now - timedelta(days=90)
        six_months_ago = now - timedelta(days=180)
        recent_urls = df[df['lastmod'] > thirty_days_ago]
        stale_urls = df[df['lastmod'] < six_months_ago]
        total_urls = len(df)
        stale_count = len(stale_urls)
        stale_percentage = round((stale_count / total_urls) * 100, 2) if total_urls > 0 else 0
        # Publishing velocity: URLs per week over last 90 days
        recent_90 = df[df['lastmod'] > ninety_days_ago]
        publishing_velocity = round(len(recent_90) / 13.0, 2) if not recent_90.empty else 0
        # Freshness score (0-100): weighted combination of metrics
        non_stale_ratio = 1.0 - (stale_percentage / 100.0)
        recency_ratio = len(recent_urls) / max(total_urls, 1)
        velocity_score = min(publishing_velocity / 10.0, 1.0)
        freshness_score = round((non_stale_ratio * 50 + recency_ratio * 30 + velocity_score * 20), 1)
        # Publishing recency: URLs published in last 1d, 7d, 30d, 90d
        publishing_recency = {
            "last_24h": int(len(df[df['lastmod'] > (now - timedelta(days=1))])),
            "last_7d": int(len(df[df['lastmod'] > (now - timedelta(days=7))])),
            "last_30d": int(len(recent_urls)),
            "last_90d": int(len(recent_90)),
        }
        # Publishing trend: compare recent 30d vs prior 30d
        prior_30 = df[(df['lastmod'] <= thirty_days_ago) & (df['lastmod'] > (now - timedelta(days=60)))]
        recent_count = len(recent_urls)
        prior_count = len(prior_30)
        if recent_count > prior_count * 1.1:
            publishing_trend = "increasing"
        elif recent_count < prior_count * 0.9:
            publishing_trend = "decreasing"
        else:
            publishing_trend = "stable"
        return {
            "publishing_velocity": publishing_velocity,
            "stale_count": stale_count,
            "stale_percentage": stale_percentage,
            "freshness_score": freshness_score,
            "publishing_recency": publishing_recency,
            "publishing_trend": publishing_trend
        }
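Review note: the weighting in `_compute_freshness` can be checked by hand on plain counts. The sketch below restates the same arithmetic without pandas. The function names `freshness_score` and `publishing_trend` and the sample numbers are illustrative assumptions, not part of this change.

```python
def freshness_score(total_urls, stale_count, recent_30d, recent_90d):
    """Restate the 50/30/20 weighting from _compute_freshness on plain counts."""
    if total_urls <= 0:
        return 0.0
    stale_percentage = round((stale_count / total_urls) * 100, 2)
    # URLs per week, treating 90 days as roughly 13 weeks
    publishing_velocity = round(recent_90d / 13.0, 2)
    non_stale_ratio = 1.0 - (stale_percentage / 100.0)
    recency_ratio = recent_30d / max(total_urls, 1)
    # Velocity saturates at 10 URLs per week
    velocity_score = min(publishing_velocity / 10.0, 1.0)
    return round(non_stale_ratio * 50 + recency_ratio * 30 + velocity_score * 20, 1)


def publishing_trend(recent_count, prior_count):
    """Compare the last 30 days against the 30 days before, with a 10% dead band."""
    if recent_count > prior_count * 1.1:
        return "increasing"
    if recent_count < prior_count * 0.9:
        return "decreasing"
    return "stable"


if __name__ == "__main__":
    # 200 URLs, 50 stale, 20 modified in the last 30 days, 65 in the last 90
    print(freshness_score(200, 50, 20, 65))  # 37.5 + 3.0 + 10.0 = 50.5
    print(publishing_trend(12, 10))          # increasing
```

One thing to watch in the patched method: URLs without a `lastmod` value count toward `total_urls` but can never fall into the stale bucket, so a sitemap with many undated entries will tend to score as fresher than it may really be.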
    async def _analyze_url_structure(self, urls: List[str]) -> Dict[str, Any]:
        """Analyze URL patterns for parameter bloat, directory depth, and path patterns."""
        try:
            loop = asyncio.get_event_loop()
            url_df = await loop.run_in_executor(None, lambda: adv.url_to_df(urls))
            if url_df is None or url_df.empty:
                return {}
            total = len(url_df)
            # Query param analysis
            has_query = url_df['query'].notna() & (url_df['query'] != '')
            param_count = has_query.sum()
            param_percentage = round((param_count / total) * 100, 2) if total > 0 else 0
            # Extract individual parameters
            all_params = []
            param_frequency = {}
            if param_count > 0:
                for q in url_df.loc[has_query, 'query'].dropna().unique():
                    for pair in q.split('&'):
                        key = pair.split('=')[0] if '=' in pair else pair
                        all_params.append(key)
                param_frequency = dict(Counter(all_params).most_common(10))
            # Directory depth analysis
            dir_cols = [c for c in url_df.columns if c.startswith('dir_')]
            def count_depth(row):
                for i, col in enumerate(dir_cols):
                    val = row[col]
                    if pd.isna(val) or str(val) == 'nan' or str(val).strip() == '':
                        return i
                return len(dir_cols)
            depths = url_df.apply(count_depth, axis=1)
            avg_depth = round(depths.mean(), 1) if not depths.empty else 0
            max_depth = int(depths.max()) if not depths.empty else 0
            depth_distribution = depths.value_counts().sort_index().head(10).to_dict()
            depth_distribution = {str(k): int(v) for k, v in depth_distribution.items()}
            # Protocol consistency
            schemes = url_df['scheme'].value_counts().to_dict() if 'scheme' in url_df.columns else {}
            # Subdomain analysis
            netloc_counts = url_df['netloc'].value_counts() if 'netloc' in url_df.columns else None
            # value_counts() has one row per distinct netloc, so its length is the host count
            unique_subdomains = int(len(netloc_counts)) if netloc_counts is not None else 0
            primary_domain = netloc_counts.index[0] if netloc_counts is not None and not netloc_counts.empty else ""
            return {
                "total_urls_analyzed": total,
                "parameter_usage": {
                    "urls_with_params": int(param_count),
                    "percentage_with_params": param_percentage,
                    "top_parameters": param_frequency
                },
                "directory_depth": {
                    "average_depth": avg_depth,
                    "max_depth": max_depth,
                    "distribution": depth_distribution
                },
                "protocols": {str(k): int(v) for k, v in schemes.items()},
                "subdomains": {
                    "primary": primary_domain,
                    "unique_count": unique_subdomains
                }
            }
        except Exception as e:
            self.logger.warning(f"URL structure analysis failed: {e}")
            return {}
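Review note: for a quick sanity check of the numbers `_analyze_url_structure` reports, the same kind of summary can be approximated with the standard library alone. This is a rough sketch, not a drop-in replacement: `summarize_url_structure` is a hypothetical helper, it assumes directory depth equals the number of non-empty path segments, and it counts parameters once per URL rather than once per distinct query string as the patched method does.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qsl


def summarize_url_structure(urls):
    """Approximate parameter usage and directory depth for a list of URLs."""
    depths = []
    params = Counter()
    with_query = 0
    for url in urls:
        parsed = urlparse(url)
        # Depth = number of non-empty path segments ("/" has depth 0)
        segments = [s for s in parsed.path.split("/") if s]
        depths.append(len(segments))
        if parsed.query:
            with_query += 1
            # keep_blank_values keeps bare keys such as "?preview"
            params.update(k for k, _ in parse_qsl(parsed.query, keep_blank_values=True))
    total = len(urls)
    return {
        "total_urls_analyzed": total,
        "percentage_with_params": round(with_query / total * 100, 2) if total else 0,
        "top_parameters": dict(params.most_common(10)),
        "average_depth": round(sum(depths) / total, 1) if total else 0,
        "max_depth": max(depths, default=0),
    }


if __name__ == "__main__":
    sample = [
        "https://example.com/",
        "https://example.com/blog/post-1",
        "https://example.com/blog/2024/post-2?utm_source=x&ref=y",
        "https://example.com/shop?ref=z",
    ]
    print(summarize_url_structure(sample))
```

Comparing this output against the `url_structure` block for a handful of known URLs is a cheap way to confirm the depth and parameter percentages look plausible.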
    async def audit_content(self, url_list: List[str]) -> Dict[str, Any]:
        """
        Performs a shallow crawl and theme analysis using word frequency.
@@ -153,6 +310,512 @@ class AdvertoolsService:
                except Exception as e:
                    self.logger.warning(f"Failed to remove temp file {temp_file}: {e}")
    async def analyze_site_structure(self, url_list: List[str], site_domain: Optional[str] = None) -> Dict[str, Any]:
        """
        Crawls a set of pages with link following to analyze internal link health,
        redirect chains, and page-level SEO elements.
        Extracts metrics via crawlytics: link distribution, redirect chains, image SEO.
        """
        temp_file = None
        try:
            self.logger.info(f"Analyzing site structure for {len(url_list)} URLs, domain={site_domain}")
            with tempfile.NamedTemporaryFile(suffix=".jsonl", delete=False) as tf:
                temp_file = tf.name
            loop = asyncio.get_event_loop()
            await loop.run_in_executor(None, lambda: adv.crawl(
                url_list=url_list,
                output_file=temp_file,
                follow_links=True,
                allowed_domains=[site_domain] if site_domain else None,
                custom_settings={
                    'LOG_LEVEL': 'WARNING',
                    'CLOSESPIDER_PAGECOUNT': 50,
                    'DOWNLOAD_TIMEOUT': 30,
                    'CONCURRENT_REQUESTS_PER_DOMAIN': 3,
                    'DEPTH_LIMIT': 3,
                }
            ))
            if not os.path.exists(temp_file) or os.path.getsize(temp_file) == 0:
                return {"success": False, "error": "Site structure crawl produced no output."}
            crawl_df = pd.read_json(temp_file, lines=True)
            page_count = len(crawl_df)
            result = {"success": True, "page_count": page_count}
            # --- Link Health via crawlytics ---
            try:
                internal_regex = site_domain if site_domain else None
                link_df = adv.crawlytics.links(crawl_df, internal_url_regex=internal_regex)
                if link_df is not None and not link_df.empty:
                    total_links = len(link_df)
                    internal_links = int(link_df['internal'].sum()) if 'internal' in link_df.columns else 0
                    external_links = total_links - internal_links
                    nofollow_links = int(link_df['nofollow'].sum()) if 'nofollow' in link_df.columns else 0
                    # Count links per page
                    links_per_page = link_df.groupby(level=0).size()
                    avg_links_per_page = round(links_per_page.mean(), 1) if not links_per_page.empty else 0
                    # Most common anchor text (internal links only)
                    anchor_texts = []
                    if 'text' in link_df.columns and 'internal' in link_df.columns:
                        internal_anchors = link_df[link_df['internal'] == True]['text'].dropna()
                        for t in internal_anchors:
                            if isinstance(t, str) and t.strip():
                                anchor_texts.extend([w.strip() for w in t.split() if len(w.strip()) > 2])
                    top_anchors = dict(Counter(anchor_texts).most_common(15)) if anchor_texts else {}
                    result["link_health"] = {
                        "total_links_found": total_links,
                        "internal_link_count": internal_links,
                        "external_link_count": external_links,
                        "internal_link_percentage": round((internal_links / total_links) * 100, 1) if total_links > 0 else 0,
                        "nofollow_link_count": nofollow_links,
                        "avg_links_per_page": avg_links_per_page,
                        "top_anchor_words": top_anchors
                    }
                else:
                    result["link_health"] = {"error": "No links found in crawl data"}
            except Exception as e:
                self.logger.warning(f"Link analysis failed: {e}")
                result["link_health"] = {"error": str(e)}
            # --- Redirect Chain Audit via crawlytics ---
            try:
                redirect_df = adv.crawlytics.redirects(crawl_df)
                if redirect_df is not None and not redirect_df.empty:
                    total_redirects = len(redirect_df)
                    redirect_chains = redirect_df['redirect_times'].nunique() if 'redirect_times' in redirect_df.columns else 0
                    redirect_statuses = redirect_df['status'].value_counts().to_dict() if 'status' in redirect_df.columns else {}
                    multi_hop = redirect_df[redirect_df['redirect_times'] > 1] if 'redirect_times' in redirect_df.columns else pd.DataFrame()
                    result["redirect_audit"] = {
                        "total_redirects": int(total_redirects),
                        "unique_chains": int(redirect_chains),
                        "status_distribution": {str(k): int(v) for k, v in redirect_statuses.items()},
                        "multi_hop_chains": int(len(multi_hop)),
                        "affected_pages": multi_hop.index.unique().tolist() if not multi_hop.empty else []
                    }
                else:
                    result["redirect_audit"] = {"total_redirects": 0, "note": "No redirects detected"}
            except Exception as e:
                self.logger.warning(f"Redirect analysis failed: {e}")
                result["redirect_audit"] = {"error": str(e)}
            # --- Image SEO overview via crawlytics ---
            try:
                img_df = adv.crawlytics.images(crawl_df)
                if img_df is not None and not img_df.empty:
                    total_images = len(img_df)
                    missing_alt = int(img_df['img_alt'].isna().sum()) if 'img_alt' in img_df.columns else 0
                    alt_coverage = round(((total_images - missing_alt) / total_images) * 100, 1) if total_images > 0 else 0
                    result["image_seo"] = {
                        "total_images": total_images,
                        "missing_alt_count": missing_alt,
                        "alt_coverage_percentage": alt_coverage
                    }
            except Exception as e:
                self.logger.warning(f"Image analysis failed: {e}")
            # --- Page-level metrics ---
            if 'status' in crawl_df.columns:
                status_dist = crawl_df['status'].value_counts().to_dict()
                result["page_status"] = {str(k): int(v) for k, v in status_dist.items()}
            if 'title' in crawl_df.columns:
                missing_titles = int(crawl_df['title'].isna().sum())
                result["missing_titles"] = missing_titles
            if 'meta_desc' in crawl_df.columns:
                missing_descriptions = int(crawl_df['meta_desc'].isna().sum())
                result["missing_descriptions"] = missing_descriptions
            result["timestamp"] = datetime.utcnow().isoformat()
            return result
        except Exception as e:
            self.logger.error(f"Failed to analyze site structure: {str(e)}")
            return {"success": False, "error": str(e)}
        finally:
            if temp_file and os.path.exists(temp_file):
                try:
                    os.remove(temp_file)
                except Exception as e:
                    self.logger.warning(f"Failed to remove temp file {temp_file}: {e}")
    async def analyze_robots_txt(self, website_url: str) -> Dict[str, Any]:
        """
        Fetch and analyze robots.txt for compliance issues.
        Checks directives, sitemap declaration, crawl-delay, and common problems.
        """
        try:
            self.logger.info(f"Analyzing robots.txt for {website_url}")
            parsed = urlparse(website_url)
            base_url = f"{parsed.scheme}://{parsed.netloc}"
            robots_url = f"{base_url}/robots.txt"
            result = {
                "success": True,
                "url": robots_url,
                "accessible": True,
                "total_directives": 0,
                "user_agents_found": [],
                "has_sitemap_directive": False,
                "sitemap_urls": [],
                "has_crawl_delay": False,
                "disallow_rules": [],
                "issues": [],
                "compliance_score": 100,
            }
            loop = asyncio.get_event_loop()
            try:
                robots_df = await loop.run_in_executor(
                    None, lambda: adv.robotstxt_to_df(robots_url)
                )
                if robots_df is None or robots_df.empty:
                    raise ValueError("Empty result from robotstxt_to_df")
            except Exception as adv_err:
                self.logger.warning(f"adv.robotstxt_to_df failed, using manual fallback: {adv_err}")
                robots_df = await loop.run_in_executor(
                    None, lambda: self._parse_robots_txt_manual(robots_url)
                )
            if robots_df is None or robots_df.empty:
                result["success"] = False
                result["error"] = "Could not fetch or parse robots.txt"
                result["accessible"] = False
                return result
            result["total_directives"] = len(robots_df)
            if 'user_agent' in robots_df.columns:
                result["user_agents_found"] = robots_df['user_agent'].dropna().unique().tolist()
            rule_col = 'rule' if 'rule' in robots_df.columns else 'directive' if 'directive' in robots_df.columns else None
            value_col = 'value' if 'value' in robots_df.columns else 'directive_value' if 'directive_value' in robots_df.columns else None
            if rule_col and value_col:
                rules_lower = robots_df[rule_col].astype(str).str.lower()
                result["has_sitemap_directive"] = 'sitemap' in rules_lower.values
                result["has_crawl_delay"] = 'crawl-delay' in rules_lower.values
                has_disallow_all = any(
                    str(row.get(value_col, '')).strip() == '/'
                    for _, row in robots_df[robots_df[rule_col].astype(str).str.lower() == 'disallow'].iterrows()
                ) if 'disallow' in rules_lower.values else False
                disallow_mask = rules_lower == 'disallow'
                if disallow_mask.any():
                    for _, row in robots_df[disallow_mask].iterrows():
                        val = str(row.get(value_col, ''))
                        ua = str(row.get('user_agent', '*'))
                        if val:
                            result["disallow_rules"].append({"user_agent": ua, "path": val})
                sitemap_mask = rules_lower == 'sitemap'
                if sitemap_mask.any():
                    result["sitemap_urls"] = robots_df.loc[sitemap_mask, value_col].dropna().unique().tolist()
                if has_disallow_all:
                    result["issues"].append({
                        "severity": "critical", "code": "DISALLOW_ALL",
                        "detail": "robots.txt contains 'Disallow: /' for at least one user agent, blocking that crawler from the whole site"
                    })
            if not result["has_sitemap_directive"]:
                result["issues"].append({
                    "severity": "warning", "code": "NO_SITEMAP",
                    "detail": "No Sitemap directive found — search engines may miss pages"
                })
            if not result["has_crawl_delay"]:
                result["issues"].append({
                    "severity": "info", "code": "NO_CRAWL_DELAY",
                    "detail": "No Crawl-delay directive set — not critical for most sites"
                })
            for issue in result["issues"]:
                sev = issue["severity"]
                if sev == "critical":
                    result["compliance_score"] -= 30
                elif sev == "warning":
                    result["compliance_score"] -= 15
                elif sev == "info":
                    result["compliance_score"] -= 5
            result["compliance_score"] = max(result["compliance_score"], 0)
            return result
        except Exception as e:
            self.logger.error(f"Robots.txt analysis failed: {e}")
            return {"success": False, "error": str(e), "url": robots_url if 'robots_url' in locals() else website_url}
    def _parse_robots_txt_manual(self, url: str) -> pd.DataFrame:
        """Fallback: manually fetch and parse robots.txt."""
        records = []
        try:
            req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
            with urllib.request.urlopen(req, timeout=15) as resp:
                content = resp.read().decode("utf-8", errors="replace")
            current_ua = "*"
            for line in content.splitlines():
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                if line.lower().startswith("user-agent"):
                    parts = line.split(":", 1)
                    current_ua = parts[1].strip() if len(parts) > 1 else "*"
                    continue
                if ":" in line:
                    directive, _, value = line.partition(":")
                    records.append({
                        "user_agent": current_ua,
                        "rule": directive.strip(),
                        "value": value.strip(),
                    })
        except Exception as e:
            self.logger.warning(f"Manual robots.txt fetch failed: {e}")
        if not records:
            return pd.DataFrame()
        return pd.DataFrame(records)
    async def analyze_crawl_budget(self, sitemap_url: str, site_domain: str) -> Dict[str, Any]:
        """
        Analyze crawl budget by comparing sitemap inventory against actual crawl results.
        Estimates budget utilization, waste from redirects/errors, and optimization score.
        """
        temp_file = None
        try:
            self.logger.info(f"Analyzing crawl budget for {site_domain}")
            loop = asyncio.get_running_loop()
            sitemap_df = await loop.run_in_executor(None, lambda: adv.sitemap_to_df(sitemap_url))
            sitemap_total = len(sitemap_df) if sitemap_df is not None and not sitemap_df.empty else 0
            start_url = f"https://{site_domain}" if not site_domain.startswith("http") else site_domain
            with tempfile.NamedTemporaryFile(suffix=".jsonl", delete=False) as tf:
                temp_file = tf.name
            await loop.run_in_executor(None, lambda: adv.crawl(
                url_list=[start_url],
                output_file=temp_file,
                follow_links=True,
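                # Scrapy expects bare host names here, so strip any scheme/path from the input
                allowed_domains=[urlparse(start_url).hostname or site_domain],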
                custom_settings={
                    'LOG_LEVEL': 'WARNING',
                    'CLOSESPIDER_PAGECOUNT': 30,
                    'DOWNLOAD_TIMEOUT': 15,
                    'CONCURRENT_REQUESTS_PER_DOMAIN': 5,
                    'DEPTH_LIMIT': 2,
                }
            ))
            if not os.path.exists(temp_file) or os.path.getsize(temp_file) == 0:
                return {"success": False, "error": "Crawl produced no output"}
            crawl_df = pd.read_json(temp_file, lines=True)
            crawled_count = len(crawl_df)
            status_dist = {}
            if 'status' in crawl_df.columns:
                raw = crawl_df['status'].value_counts().to_dict()
                status_dist = {str(k): int(v) for k, v in raw.items()}
            wasted = 0
            for code_s in status_dist:
                # Status codes may be stored as floats (e.g. "200.0") when the column contains NaN
                code = int(float(code_s))
                if code >= 300 or code < 200:
                    wasted += status_dist[code_s]
            budget_usage_ratio = round(crawled_count / max(sitemap_total, 1), 3)
            waste_ratio = round(wasted / max(crawled_count, 1), 3)
            depth_dist = {}
            if 'depth' in crawl_df.columns:
                raw = crawl_df['depth'].value_counts().sort_index().to_dict()
                depth_dist = {str(k): int(v) for k, v in raw.items()}
            param_count = 0
            url_col = 'url' if 'url' in crawl_df.columns else 'response_url' if 'response_url' in crawl_df.columns else None
            if url_col:
                # '?' is a regex quantifier, so match it literally to avoid a re.error
                param_count = int(crawl_df[url_col].astype(str).str.contains('?', regex=False).sum())
            optimization_score = max(0, round(100 - (waste_ratio * 100) - (budget_usage_ratio * 20), 1))
            return {
                "success": True,
                "sitemap_total_urls": sitemap_total,
                "pages_crawled": crawled_count,
                "crawl_coverage_percentage": round(budget_usage_ratio * 100, 1),
                "status_distribution": status_dist,
                "wasted_crawl_requests": int(wasted),
                "waste_percentage": round(waste_ratio * 100, 1),
                "depth_distribution": depth_dist,
                "urls_with_parameters": int(param_count),
                "optimization_score": optimization_score,
            }
        except Exception as e:
            self.logger.error(f"Crawl budget analysis failed: {e}")
            return {"success": False, "error": str(e)}
        finally:
            if temp_file and os.path.exists(temp_file):
                try:
                    os.remove(temp_file)
                except Exception:
                    pass
    async def sitemap_compare(self, sitemap_a: str, sitemap_b: str) -> Dict[str, Any]:
        """
        Compare two sitemaps for competitive content gap analysis.
        Analyzes URL count, freshness, directory pillars, and identifies
        patterns unique to each sitemap.
        """
        try:
            self.logger.info(f"Comparing sitemaps: {sitemap_a} vs {sitemap_b}")
            loop = asyncio.get_running_loop()
            df_a = await loop.run_in_executor(None, lambda: adv.sitemap_to_df(sitemap_a))
            df_b = await loop.run_in_executor(None, lambda: adv.sitemap_to_df(sitemap_b))
            total_a = len(df_a) if df_a is not None and not df_a.empty else 0
            total_b = len(df_b) if df_b is not None and not df_b.empty else 0
            result = {
                "success": True,
                "sitemap_a": {"url": sitemap_a, "total_urls": total_a},
                "sitemap_b": {"url": sitemap_b, "total_urls": total_b},
                "url_count_diff": total_a - total_b,
                "ratio": round(total_a / max(total_b, 1), 2),
                "pillars_a": {},
                "pillars_b": {},
                "shared_pillars": [],
                "unique_to_a": [],
                "unique_to_b": [],
                "freshness_comparison": {},
                "overlap_score": 0,
            }
            if total_a == 0 or total_b == 0:
                return result
            def extract_pillars(df: pd.DataFrame, label: str) -> Tuple[dict, list]:
                pillars = {}
                if 'loc' in df.columns:
                    try:
                        url_df = adv.url_to_df(df['loc'])
                        if url_df is not None and not url_df.empty:
                            dir_cols = [c for c in url_df.columns if c.startswith('dir_')]
                            if dir_cols:
                                pillar_series = url_df[dir_cols[0]].fillna("home").astype(str)
                                for col in dir_cols[1:3]:
                                    mask = url_df[col].notna() & (url_df[col].astype(str) != 'nan')
                                    pillar_series = pillar_series + "/" + url_df[col].where(mask, "")
                                pillars = pillar_series.value_counts().head(20).to_dict()
                    except Exception:
                        pass
                # Fallback parser; skipped when the sitemap frame has no 'loc' column
                if not pillars and 'loc' in df.columns:
                    seen = {}
                    for url in df['loc'].dropna():
                        parts = urlparse(url).path.strip('/').split('/')
                        key = parts[0] if parts and parts[0] else "home"
                        seen[key] = seen.get(key, 0) + 1
                    pillars = dict(sorted(seen.items(), key=lambda x: x[1], reverse=True)[:20])
                pillar_keys = list(pillars.keys()) if pillars else []
                return pillars, pillar_keys
            pillars_a, keys_a = extract_pillars(df_a, "a")
            pillars_b, keys_b = extract_pillars(df_b, "b")
            result["pillars_a"] = pillars_a
            result["pillars_b"] = pillars_b
            set_a = set(keys_a)
            set_b = set(keys_b)
            shared = set_a & set_b
            result["shared_pillars"] = sorted(shared)
            result["unique_to_a"] = sorted(set_a - set_b)
            result["unique_to_b"] = sorted(set_b - set_a)
            total_keys = max(len(set_a | set_b), 1)
            overlap_count = len(shared)
            result["overlap_score"] = round((overlap_count / total_keys) * 100, 1)
            def compute_freshness_stats(df: pd.DataFrame) -> dict:
                stats = {"has_lastmod": False, "recent_30d": 0, "total_with_dates": 0}
                if 'lastmod' in df.columns:
                    lm = pd.to_datetime(df['lastmod'], errors='coerce', utc=True).dropna()
                    if not lm.empty:
                        stats["has_lastmod"] = True
                        stats["total_with_dates"] = int(len(lm))
                        stats["recent_30d"] = int((lm > (datetime.now(lm.dt.tz) - timedelta(days=30))).sum())
                return stats
            result["freshness_comparison"] = {
                "a": compute_freshness_stats(df_a),
                "b": compute_freshness_stats(df_b),
            }
            return result
        except Exception as e:
            self.logger.error(f"Sitemap comparison failed: {e}")
            return {"success": False, "error": str(e)}
    async def compare_crawl_results(self, result_a: Dict[str, Any], result_b: Dict[str, Any]) -> Dict[str, Any]:
        """
        Compare two crawl analysis result dicts to surface changes over time.
        Useful for tracking SEO improvements between scheduled executions.
        """
        try:
            diff = {
                "success": True,
                "page_count_change": 0,
                "status_distribution_changes": {},
                "link_health_changes": {},
                "redirect_changes": {},
                "new_issues": [],
                "resolved_issues": [],
            }
            pc_a = result_a.get("page_count", 0)
            pc_b = result_b.get("page_count", 0)
            diff["page_count_change"] = pc_b - pc_a
            sd_a = result_a.get("page_status", {})
            sd_b = result_b.get("page_status", {})
            all_codes = set(list(sd_a.keys()) + list(sd_b.keys()))
            for c in sorted(all_codes):
                va = sd_a.get(c, 0)
                vb = sd_b.get(c, 0)
                change = vb - va
                if change != 0:
                    diff["status_distribution_changes"][c] = change
            def _safe_diff(d_a: dict, d_b: dict, prefix: str) -> dict:
                changes = {}
                all_keys = set(list(d_a.keys()) + list(d_b.keys()))
                for k in all_keys:
                    va = d_a.get(k, 0)
                    vb = d_b.get(k, 0)
                    if isinstance(va, (int, float)) and isinstance(vb, (int, float)):
                        change = round(vb - va, 2)
                        if change != 0:
                            changes[f"{prefix}_{k}"] = change
                return changes
            lh_a = result_a.get("link_health", {})
            lh_b = result_b.get("link_health", {})
            diff["link_health_changes"] = _safe_diff(lh_a, lh_b, "link")
            rd_a = result_a.get("redirect_audit", {})
            rd_b = result_b.get("redirect_audit", {})
            diff["redirect_changes"] = _safe_diff(rd_a, rd_b, "redirect")
            return diff
        except Exception as e:
            self.logger.error(f"Crawl comparison failed: {e}")
            return {"success": False, "error": str(e)}
    async def extract_communication_style(self, url_list: List[str]) -> Dict[str, Any]:
        """
        Analyzes linking patterns and social media presence using unique temporary files.
--- a/backend/services/seo/dashboard_service.py
+++ b/backend/services/seo/dashboard_service.py
@@ -454,14 +454,12 @@ class SEODashboardService:
    def _get_advertools_insights(self, user_id: str, site_url: str) -> Dict[str, Any]:
        """Fetch Advertools-based insights from WebsiteAnalysis and AdvertoolsTasks."""
        try:
            # 1. Get augmented persona themes from WebsiteAnalysis
            session = self.db.query(OnboardingSession).filter(OnboardingSession.user_id == user_id).first()
            if not session:
                return {}
            analysis = self.db.query(WebsiteAnalysis).filter(WebsiteAnalysis.session_id == session.id).first()
            # 2. Get latest tasks status
            tasks = self.db.query(AdvertoolsTask).filter(AdvertoolsTask.user_id == user_id).all()
            audit_status = "pending"
@@ -479,6 +477,14 @@ class SEODashboardService:
            return {
                "augmented_themes": brand_analysis.get('augmented_themes', []),
                "link_health": brand_analysis.get('link_health', {}),
                "redirect_audit": brand_analysis.get('redirect_audit', {}),
                "image_seo": brand_analysis.get('image_seo', {}),
                "page_status": brand_analysis.get('page_status', {}),
                "url_structure": brand_analysis.get('url_structure', {}),
                "freshness": brand_analysis.get('freshness', {}),
                "robots_txt": brand_analysis.get('robots_txt', {}),
                "crawl_budget": brand_analysis.get('crawl_budget', {}),
                "last_audit": brand_analysis.get('last_advertools_audit'),
                "site_health": seo_audit.get('site_health', {}),
                "last_health_check": seo_audit.get('last_advertools_health_check'),
--- a/backend/services/sif_integration_service.py
+++ b/backend/services/sif_integration_service.py
@@ -378,7 +378,48 @@ class SIFIntegrationService:
                themes = adv_insights.get('augmented_themes', [])
                if themes:
                    text_content += f"Augmented Themes: {', '.join(themes[:5])}. "
-                
+
                freshness = adv_insights.get('freshness', {})
                if freshness:
                    text_content += (f"Content Freshness Score: {freshness.get('freshness_score', 'N/A')}. "
                                     f"Publishing Velocity: {freshness.get('publishing_velocity', 0)}/week. "
                                     f"Trend: {freshness.get('publishing_trend', 'unknown')}. "
                                     f"Last 30d: {freshness.get('publishing_recency', {}).get('last_30d', 0)} pages. ")
                link_health = adv_insights.get('link_health', {})
                if link_health and 'error' not in link_health:
                    text_content += (f"Internal Links: {link_health.get('internal_link_count', 0)}. "
                                     f"External Links: {link_health.get('external_link_count', 0)}. "
                                     f"Nofollow: {link_health.get('nofollow_link_count', 0)}. "
                                     f"Avg Links/Page: {link_health.get('avg_links_per_page', 0)}. ")
                redirects = adv_insights.get('redirect_audit', {})
                if redirects and 'error' not in redirects:
                    text_content += (f"Redirects: {redirects.get('total_redirects', 0)} total, "
                                     f"{redirects.get('multi_hop_chains', 0)} multi-hop. ")
                image_seo = adv_insights.get('image_seo', {})
                if image_seo and 'error' not in image_seo:
                    text_content += (f"Images: {image_seo.get('total_images', 0)} total, "
                                     f"Alt Coverage: {image_seo.get('alt_coverage_percentage', 0)}%. ")
                url_struct = adv_insights.get('url_structure', {})
                if url_struct:
                    text_content += (f"URL Structure: {url_struct.get('total_urls_analyzed', 0)} URLs, "
                                     f"Avg Depth: {url_struct.get('directory_depth', {}).get('average_depth', 0)}. "
                                     f"Params: {url_struct.get('parameter_usage', {}).get('percentage_with_params', 0)}%. ")
                robots = adv_insights.get('robots_txt', {})
                if robots and robots.get('success'):
                    text_content += (f"Robots.txt: {robots.get('total_directives', 0)} directives, "
                                     f"Compliance: {robots.get('compliance_score', 0)}/100. "
                                     f"Issues: {len(robots.get('issues', []))}. ")
                budget = adv_insights.get('crawl_budget', {})
                if budget and budget.get('success'):
                    text_content += (f"Crawl Budget: {budget.get('pages_crawled', 0)} crawled of {budget.get('sitemap_total_urls', 0)} URLs. "
                                     f"Waste: {budget.get('waste_percentage', 0)}%. "
                                     f"Score: {budget.get('optimization_score', 0)}. ")
            # Add Technical SEO overview
            tech_audit = dashboard_data.get('technical_seo_audit', {})
            if tech_audit:
--- a/backend/services/today_workflow_service.py
+++ b/backend/services/today_workflow_service.py
@@ -6,6 +6,7 @@ from sqlalchemy.orm import Session
 from models.daily_workflow_models import DailyWorkflowPlan, DailyWorkflowTask
 from models.agent_activity_models import AgentAlert
 from models.content_planning import CalendarEvent, ContentStrategy
 from services.agent_activity_service import AgentActivityService, build_agent_event_payload
 from services.llm_providers.main_text_generation import llm_text_gen
 from services.database import get_all_user_ids, get_session_for_user
@@ -17,6 +18,82 @@ PILLAR_IDS = ["plan", "generate", "publish", "analyze", "engage", "remarket"]
 MIN_TASK_EVIDENCE_LINKS = 1
 PLAN_CONTEXT_THRESHOLD = 0.65
 # Calendar → Workflow mapping
 CALENDAR_CONTENT_PILLAR = "generate"
 _PLATFORM_ACTION_URL = {
    "linkedin": "/linkedin-writer",
    "facebook": "/facebook-writer",
    "twitter": "/twitter-writer",
    "instagram": "/instagram-writer",
    "youtube": "/youtube-writer",
    "tiktok": "/tiktok-writer",
 }
 _CONTENT_ACTION_URL = {
    "blog_post": "/blog-writer",
    "linkedin_post": "/linkedin-writer",
    "facebook_post": "/facebook-writer",
    "seo_page": "/seo-dashboard",
    "video": "/video-writer",
 }
 _CONTENT_ESTIMATED_TIME = {
    "blog_post": 45, "linkedin_post": 20, "facebook_post": 15,
    "twitter_post": 10, "instagram_post": 15, "seo_page": 30, "video": 60,
 }
 def _resolve_calendar_action_url(content_type: str, platform: str) -> Optional[str]:
    platform_lower = (platform or "").strip().lower()
    if platform_lower in _PLATFORM_ACTION_URL:
        return _PLATFORM_ACTION_URL[platform_lower]
    ct_lower = (content_type or "").strip().lower()
    if ct_lower in _CONTENT_ACTION_URL:
        return _CONTENT_ACTION_URL[ct_lower]
    logger.warning("No action_url mapping for calendar event content_type={!r} platform={!r}", content_type, platform)
    return None
 def _resolve_calendar_estimated_time(content_type: str) -> int:
    return _CONTENT_ESTIMATED_TIME.get((content_type or "").strip().lower(), 30)
 def _generate_calendar_event_plan(date: str, grounding: Dict[str, Any]) -> Dict[str, Any]:
    calendar_events = grounding.get("calendar_events_today", [])
    if not calendar_events:
        return {"date": date, "tasks": []}
    tasks = []
    for event in calendar_events:
        action_url = _resolve_calendar_action_url(
            event.get("content_type", ""), event.get("platform", "")
        )
        if action_url is None:
            continue
        task = {
            "pillarId": CALENDAR_CONTENT_PILLAR,
            "title": (event.get("title") or "Untitled").strip()[:255],
            "description": (event.get("description") or "").strip(),
            "priority": "high",
            "estimatedTime": _resolve_calendar_estimated_time(event.get("content_type", "")),
            "actionType": "navigate",
            "actionUrl": action_url,
            "enabled": True,
            "dependencies": [],
            "metadata": {
                "source": "calendar_event",
                "source_event_id": event.get("id"),
                "calendar_title": event.get("title"),
                "content_type": event.get("content_type"),
                "platform": event.get("platform"),
            },
        }
        tasks.append(task)
    return {"date": date, "tasks": tasks}
 def _today_date_str() -> str:
    return datetime.now(timezone.utc).date().isoformat()
@@ -47,70 +124,6 @@ def _proposal_order_key(proposal: Any) -> tuple:
    )
 def _fallback_tasks(date: str) -> List[Dict[str, Any]]:
    return [
        {
            "pillarId": "plan",
            "title": "Review today’s plan",
            "description": "Confirm priorities and adjust the content calendar for today.",
            "priority": "high",
            "estimatedTime": 15,
            "actionType": "navigate",
            "actionUrl": "/content-planning-dashboard",
            "enabled": True,
        },
        {
            "pillarId": "generate",
            "title": "Generate one core content asset",
            "description": "Create a draft aligned with your current strategy and voice.",
            "priority": "high",
            "estimatedTime": 45,
            "actionType": "navigate",
            "actionUrl": "/blog-writer",
            "enabled": True,
        },
        {
            "pillarId": "publish",
            "title": "Publish or schedule today’s content",
            "description": "Publish or schedule content across the selected channel(s).",
            "priority": "medium",
            "estimatedTime": 20,
            "actionType": "navigate",
            "actionUrl": "/content-planning-dashboard",
            "enabled": True,
        },
        {
            "pillarId": "analyze",
            "title": "Check semantic health and performance",
            "description": "Review semantic health metrics and key performance indicators.",
            "priority": "medium",
            "estimatedTime": 15,
            "actionType": "navigate",
            "actionUrl": "/seo-dashboard",
            "enabled": True,
        },
        {
            "pillarId": "engage",
            "title": "Engage on one channel",
            "description": "Respond to comments and share one post to keep momentum.",
            "priority": "medium",
            "estimatedTime": 15,
            "actionType": "navigate",
            "actionUrl": "/linkedin-writer",
            "enabled": True,
        },
        {
            "pillarId": "remarket",
            "title": "Repurpose and remarket content",
            "description": "Create one repurposed snippet and distribute it to increase reach.",
            "priority": "low",
            "estimatedTime": 20,
            "actionType": "navigate",
            "actionUrl": "/facebook-writer",
            "enabled": True,
        },
    ]
 def _is_coverage_guardrail_enabled(grounding: Dict[str, Any]) -> bool:
    workflow_config = grounding.get("workflow_config", {}) if isinstance(grounding, dict) else {}
@@ -315,9 +328,6 @@ def _ensure_pillar_coverage(
        return sanitized_tasks
    covered_pillars = {task["pillarId"] for task in sanitized_tasks}
    fallback_by_pillar = {
        task["pillarId"]: task for task in (_sanitize_task(t) for t in _fallback_tasks(date)) if task
    }
    for pillar_id in PILLAR_IDS:
        if pillar_id in covered_pillars:
@@ -327,15 +337,6 @@ def _ensure_pillar_coverage(
        if generated:
            sanitized_tasks.append(generated)
            covered_pillars.add(pillar_id)
            continue
        controlled_fallback = fallback_by_pillar.get(pillar_id)
        if controlled_fallback:
            metadata = controlled_fallback.get("metadata") if isinstance(controlled_fallback.get("metadata"), dict) else {}
            metadata["source"] = "controlled_fallback"
            controlled_fallback["metadata"] = metadata
            sanitized_tasks.append(controlled_fallback)
            covered_pillars.add(pillar_id)
    return sanitized_tasks
@@ -367,6 +368,28 @@ def build_grounding_context(db: Session, user_id: str, date: str) -> Dict[str, A
    if "workflow_config" not in onboarding_context:
        onboarding_context["workflow_config"] = {}
    # 3. Fetch calendar events for today
    calendar_events_today = []
    try:
        from datetime import datetime as dt_func, timedelta
        today_start = dt_func.strptime(date, "%Y-%m-%d").replace(hour=0, minute=0, second=0)
        today_end = today_start + timedelta(days=1)
        calendar_events_today = (
            db.query(CalendarEvent)
            .join(ContentStrategy, CalendarEvent.strategy_id == ContentStrategy.id)
            .filter(
                ContentStrategy.user_id == user_id,
                CalendarEvent.scheduled_date >= today_start,
                CalendarEvent.scheduled_date < today_end,
                CalendarEvent.status.in_(["draft", "scheduled"]),
            )
            .all()
        )
    except Exception as e:
        logger.warning(f"Failed to fetch calendar events for grounding context: {e}")
    return {
        "recent_agent_alerts": [
            {
@@ -379,7 +402,19 @@ def build_grounding_context(db: Session, user_id: str, date: str) -> Dict[str, A
            for a in unread_agent_alerts
        ],
        "onboarding_data": onboarding_context,
-        "workflow_config": onboarding_context.get("workflow_config", {})
+        "workflow_config": onboarding_context.get("workflow_config", {}),
        "calendar_events_today": [
            {
                "id": event.id,
                "title": event.title,
                "description": event.description,
                "content_type": event.content_type,
                "platform": event.platform,
                "status": event.status,
                "scheduled_date": event.scheduled_date.isoformat() if event.scheduled_date else None,
            }
            for event in calendar_events_today
        ],
    }
@@ -406,7 +441,7 @@ async def generate_agent_enhanced_plan(
        orchestrator = await orchestration_service.get_or_create_orchestrator(user_id)
    except Exception as e:
        logger.error(f"Failed to get orchestrator: {e}")
-        return {"date": date, "tasks": _fallback_tasks(date)}
+        return {"date": date, "tasks": []}
    # 2. Parallel "Committee" Proposal Gathering
    logger.info(f"Gathering daily task proposals from agent committee for user {user_id}")
@@ -689,21 +724,21 @@ async def generate_agent_enhanced_plan(
            try:
                result = json.loads(raw)
            except Exception:
-                result = {"date": date, "tasks": _fallback_tasks(date)}
+                result = {"date": date, "tasks": []}
    except Exception as e:
        activity.log_event(
            event_type="warning",
            severity="warning",
            message=str(e)[:2000],
-            payload=build_agent_event_payload(phase="generation", step="llm_failed_fallback", tool_name="llm_text_gen", progress_percent=70, output_summary="LLM generation failed, using fallback tasks", decision_reason="Exception during workflow generation", safe_debug=False, metadata={"fallback": True}),
+            payload=build_agent_event_payload(phase="generation", step="llm_failed", tool_name="llm_text_gen", progress_percent=70, output_summary="LLM generation failed, returning empty tasks", decision_reason="Exception during workflow generation", safe_debug=False, metadata={"error": str(e)[:200]}),
            run_id=run.id,
            agent_type="TodayWorkflowGenerator",
        )
-        result = {"date": date, "tasks": _fallback_tasks(date)}
+        result = {"date": date, "tasks": []}
    tasks = result.get("tasks") if isinstance(result, dict) else None
-    if not isinstance(tasks, list) or not tasks:
+    if not isinstance(tasks, list):
-        tasks = _fallback_tasks(date)
+        tasks = []
    result = {
        "date": date,
        "tasks": _ensure_pillar_coverage(tasks, user_id, date, grounding),
@@ -744,23 +779,38 @@ async def get_or_create_daily_workflow_plan(
        return existing, False
    grounding = build_grounding_context(db, user_id, date_str)
-    plan_data = await generate_agent_enhanced_plan(db, user_id, date_str, grounding=grounding)
+
    # Step 1: Calendar events → generate pillar (SSOT for content creation)
    calendar_plan = _generate_calendar_event_plan(date_str, grounding)
    calendar_task_titles = {t.get("title") for t in calendar_plan.get("tasks", []) if t.get("title")}
    # Step 2: Agent committee → proposals for plan + analyze + engage + publish + remarket
    agent_plan_data = await generate_agent_enhanced_plan(db, user_id, date_str, grounding=grounding, strict_contextuality=False)
    # Filter agent proposals: keep only non-generate pillars, dedup by title
    committee_pillars = {"plan", "analyze", "engage", "publish", "remarket"}
    filtered_agent_tasks = [
        t for t in agent_plan_data.get("tasks", [])
        if t.get("pillarId") in committee_pillars
        and t.get("title") not in calendar_task_titles
    ]
    # Step 3: Merge — calendar wins for generate, agents fill other pillars
    all_tasks = calendar_plan.get("tasks", []) + filtered_agent_tasks
    calendar_source = bool(calendar_plan.get("tasks"))
    # Step 4: Pillar coverage — LLM backfill for any pillar still uncovered
    all_tasks = _ensure_pillar_coverage(all_tasks, user_id, date_str, grounding)
    # Step 5: Validation
    plan_data = {**agent_plan_data, "tasks": all_tasks}
    validation = validate_plan_contextuality(plan_data, grounding)
-    if not validation.get("is_contextual"):
-        logger.info("Plan contextuality below threshold for user {}. Running strict regeneration.", user_id)
-        regenerated_plan = await generate_agent_enhanced_plan(
-            db,
-            user_id,
-            date_str,
-            grounding=grounding,
-            strict_contextuality=True,
-        )
-        regenerated_validation = validate_plan_contextuality(regenerated_plan, grounding)
-        plan_data = regenerated_plan
-        validation = regenerated_validation
-    plan_data["quality_status"] = "contextual" if validation.get("is_contextual") else "low_context"
+    plan_data["quality_status"] = (
+        "calendar_driven" if calendar_source
+        else "contextual" if validation.get("is_contextual")
+        else "low_context"
+    )
    plan_data["contextuality_validation"] = validation
    tasks = plan_data.get("tasks", [])
@@ -769,9 +819,9 @@ async def get_or_create_daily_workflow_plan(
            user_id=user_id,
            date=date_str,
            source=creation_source,
-            generation_mode=_derive_generation_mode(plan_data),
+            generation_mode="calendar_driven" if calendar_source else _derive_generation_mode(plan_data),
            committee_agent_count=_count_committee_agents(tasks),
-            fallback_used=_plan_uses_fallback(tasks),
+            fallback_used=False,
            plan_json=plan_data,
            created_at=datetime.utcnow(),
            updated_at=datetime.utcnow(),
@@ -824,15 +874,17 @@ def _derive_generation_mode(plan_data: Dict[str, Any]) -> str:
        metadata = metadata if isinstance(metadata, dict) else {}
        source_agent = str(metadata.get("source_agent") or "").strip()
        source = str(metadata.get("source") or "").strip()
        if source == "calendar_event":
            return "calendar_driven"
        if source_agent:
            source_modes.add("agent_committee")
-        elif source in {"controlled_fallback", "llm_pillar_backfill"}:
+        elif source == "llm_pillar_backfill":
            source_modes.add(source)
    if "calendar_driven" in source_modes:
        return "calendar_driven"
    if "agent_committee" in source_modes:
        return "agent_committee"
    if "controlled_fallback" in source_modes:
        return "controlled_fallback"
    if "llm_pillar_backfill" in source_modes:
        return "llm_pillar_backfill"
    return "llm_generation"
@@ -929,4 +981,28 @@ def update_task_status(
    db.add(task)
    db.commit()
    db.refresh(task)
    # If a calendar-sourced task is completed, mark the calendar event as published
    if status == "completed" and task.metadata_json:
        source = task.metadata_json.get("source")
        source_event_id = task.metadata_json.get("source_event_id")
        if source == "calendar_event" and source_event_id:
            try:
                cal_event = (
                    db.query(CalendarEvent)
                    .join(ContentStrategy, CalendarEvent.strategy_id == ContentStrategy.id)
                    .filter(
                        CalendarEvent.id == source_event_id,
                        ContentStrategy.user_id == user_id,
                    )
                    .first()
                )
                if cal_event and cal_event.status != "published":
                    cal_event.status = "published"
                    cal_event.updated_at = datetime.utcnow()
                    db.add(cal_event)
                    db.commit()
            except Exception as e:
                logger.warning(f"Failed to update calendar event {source_event_id} on task completion: {e}")
    return task
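The precedence introduced for `quality_status` in the hunk above can be read as a small pure function. This is a minimal sketch for illustration only: `derive_quality_status` is a hypothetical helper name, not a function in the repository, and it assumes `calendar_source` and the validator's `is_contextual` flag are the only inputs.

```python
from typing import Optional


def derive_quality_status(calendar_source: bool, is_contextual: Optional[bool]) -> str:
    """Hypothetical helper mirroring the ternary chain in the diff.

    Calendar-sourced plans win outright; otherwise the contextuality
    validation decides between "contextual" and "low_context".
    """
    if calendar_source:
        return "calendar_driven"
    return "contextual" if is_contextual else "low_context"


if __name__ == "__main__":
    for calendar, contextual in [(True, False), (False, True), (False, None)]:
        print(calendar, contextual, derive_quality_status(calendar, contextual))
```

A missing `is_contextual` key (`None`) falls through to `"low_context"`, which matches how `validation.get("is_contextual")` behaves in the diff.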
--- a/backend/services/video_studio/platform_specs.py
+++ b/backend/services/video_studio/platform_specs.py
@@ -91,6 +91,17 @@ PLATFORM_SPECS: List[PlatformSpec] = [
        formats=["mp4"],
        description="Square video format for LinkedIn",
    ),
    PlatformSpec(
        platform=Platform.LINKEDIN,
        name="LinkedIn Video (Portrait)",
        aspect_ratio="9:16",
        width=1080,
        height=1920,
        max_duration=600.0,  # 10 minutes
        max_file_size_mb=5000.0,  # 5GB
        formats=["mp4"],
        description="Portrait video format for LinkedIn mobile feed",
    ),
    PlatformSpec(
        platform=Platform.FACEBOOK,
        name="Facebook Video",
--- a/backend/services/wix_service.py
+++ b/backend/services/wix_service.py
@@ -143,16 +143,16 @@ class WixService:
            access_token: Valid access token
        Returns:
-            Site information
+            Site information, or {"_no_site": True, "error": ...} if the token is invalid or the request fails
        """
        token_str = normalize_token_string(access_token)
        if not token_str:
-            raise ValueError("Invalid access token format for create_blog_post")
+            return {"_no_site": True, "error": "Invalid access token format"}
        try:
            return self.auth_service.get_site_info(token_str)
        except requests.RequestException as e:
-            logger.error(f"Failed to get site info: {e}")
+            logger.warning(f"Failed to get site info: {e}")
-            raise
+            return {"_no_site": True, "error": str(e)}
    def get_current_member(self, access_token: str) -> Dict[str, Any]:
        """
@@ -179,26 +179,34 @@ class WixService:
    def _normalize_token_string(self, access_token: Any) -> Optional[str]:
        return normalize_token_string(access_token)
-    def check_blog_permissions(self, access_token: str) -> Dict[str, Any]:
+    def check_blog_permissions(self, access_token: str, site_id: Optional[str] = None) -> Dict[str, Any]:
        """
        Check if the app has required blog permissions
        Args:
            access_token: Valid access token
            site_id: Optional Wix metaSiteId for multi-site token context
        Returns:
            Permission status
        """
        extra_headers = {}
        if not site_id:
            meta_info = extract_meta_from_token(access_token)
            site_id = meta_info.get('metaSiteId')
        if site_id:
            extra_headers['wix-site-id'] = site_id
        headers = {
            'Authorization': f'Bearer {access_token}',
            'Content-Type': 'application/json',
            'wix-client-id': self.client_id or ''
        }
        headers.update(extra_headers)
        try:
            # Try to list blog categories to check permissions
            response = requests.get(
-                f"{self.base_url}/blog/v1/categories",
+                f"{self.base_url}/blog/v3/categories",
                headers=headers
            )
@@ -213,13 +221,23 @@ class WixService:
                    'has_permissions': False,
                    'can_create_posts': False,
                    'can_publish': False,
-                    'error': 'Insufficient permissions'
+                    'error': 'Insufficient permissions — OAuth app lacks blog scopes'
                }
            elif response.status_code == 404:
                return {
                    'has_permissions': False,
                    'error': 'Blog feature not available or site ID not recognized'
                }
            elif response.status_code == 401:
                return {
                    'has_permissions': False,
                    'error': 'Token expired or invalid'
                }
            else:
                response.raise_for_status()
        except requests.RequestException as e:
-            logger.error(f"Failed to check blog permissions: {e}")
+            logger.warning(f"Failed to check blog permissions: {e}")
            return {
                'has_permissions': False,
                'error': str(e)
@@ -241,7 +259,8 @@ class WixService:
            result = self.media_service.import_image(
                access_token,
                image_url,
-                display_name or f'Imported Image {datetime.now().strftime("%Y%m%d_%H%M%S")}'
+                display_name or f'Imported Image {datetime.now().strftime("%Y%m%d_%H%M%S")}',
                client_id=self.client_id,
            )
            if result and isinstance(result, dict) and 'file' in result:
                media_id = result['file'].get('id')
@@ -429,8 +448,8 @@ class WixService:
            return category_ids
-        except requests.RequestException as e:
+        except Exception as e:
-            logger.error(f"Failed to lookup/create categories: {e}")
+            logger.warning(f"Failed to lookup/create categories (will skip): {e}")
            return []
    def lookup_or_create_tags(self, access_token: str, tag_names: List[str],
@@ -495,8 +514,8 @@ class WixService:
            return tag_ids
-        except requests.RequestException as e:
+        except Exception as e:
-            logger.error(f"Failed to lookup/create tags: {e}")
+            logger.warning(f"Failed to lookup/create tags (will skip): {e}")
            return []
    def publish_draft_post(self, access_token: str, draft_post_id: str) -> Dict[str, Any]:
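Because `get_site_info` now returns a sentinel dict instead of raising, callers need to check for `_no_site` before using the result. The following is a hedged sketch of one way a caller might do that; `describe_site_result` is a hypothetical name and is not part of `WixService`.

```python
from typing import Any, Dict


def describe_site_result(site_info: Dict[str, Any]) -> str:
    """Hypothetical caller-side check for the `_no_site` sentinel.

    Assumes the dict shape shown in the diff: {"_no_site": True, "error": "..."}
    on failure, and an ordinary site-info dict otherwise.
    """
    if site_info.get("_no_site"):
        reason = site_info.get("error", "unknown error")
        return f"no site available: {reason}"
    return "site available"


if __name__ == "__main__":
    print(describe_site_result({"_no_site": True, "error": "Invalid access token format"}))
    print(describe_site_result({"name": "Example site"}))
```

The trade-off is that a failure no longer interrupts the caller, so any code path that previously relied on the exception has to branch on the sentinel explicitly.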
--- a/backend/services/youtube/youtube_task_manager.py
+++ b/backend/services/youtube/youtube_task_manager.py
@@ -0,0 +1,387 @@
 """
 YouTube Creator Task Manager
 Hybrid DB-backed + in-memory task manager for YouTube video operations.
 Writes task state to PostgreSQL so renders/combines/publishes survive
 server restarts. Falls back to in-memory dict when DB is unavailable.
 API surface matches Story Writer's TaskManager for drop-in compatibility.
 """
 import uuid
 from datetime import datetime, timezone
 from typing import Any, Dict, List, Optional
 from loguru import logger
 from sqlalchemy.orm import Session
 from models.youtube_task_models import YouTubeVideoTask, YouTubeTaskType, YouTubeTaskStatus
 from services.database import get_session_for_user, get_engine_for_user
 from models.subscription_models import Base as SubscriptionBase
 class YouTubeTaskManager:
    """Hybrid persistent + in-memory task manager for YouTube Creator."""
    def __init__(self):
        self.task_storage: Dict[str, Dict[str, Any]] = {}
        self._ensure_tables()
    def _ensure_tables(self):
        """Ensure youtube_video_tasks table exists for all initialised users."""
        try:
            from services.database import _user_engines
            for user_id, engine in list(_user_engines.items()):
                try:
                    SubscriptionBase.metadata.create_all(bind=engine, checkfirst=True)
                except Exception:
                    pass
        except Exception:
            pass
    def _get_db(self, user_id: str) -> Optional[Session]:
        """Get a DB session for the given user. Returns None on failure."""
        if not user_id:
            return None
        try:
            session = get_session_for_user(user_id)
            if session:
                engine = get_engine_for_user(user_id)
                SubscriptionBase.metadata.create_all(bind=engine, checkfirst=True)
            return session
        except Exception as e:
            logger.warning(f"[YouTubeTaskManager] DB unavailable for user {user_id}: {e}")
            return None
    def _map_task_type(self, task_type_str: str) -> YouTubeTaskType:
        """Map a string task type to the enum."""
        mapping = {
            "youtube_video_render": YouTubeTaskType.RENDER,
            "youtube_scene_video_render": YouTubeTaskType.SCENE_RENDER,
            "youtube_video_combine": YouTubeTaskType.COMBINE,
            "youtube_combine_video": YouTubeTaskType.COMBINE,
            "youtube_publish": YouTubeTaskType.PUBLISH,
            "youtube_image_generation": YouTubeTaskType.IMAGE_GENERATION,
            "youtube_audio_generation": YouTubeTaskType.AUDIO_GENERATION,
        }
        return mapping.get(task_type_str, YouTubeTaskType.RENDER)
    def _map_status_to_enum(self, status: str) -> YouTubeTaskStatus:
        """Map a frontend status string to the DB enum."""
        mapping = {
            "pending": YouTubeTaskStatus.PENDING,
            "processing": YouTubeTaskStatus.PROCESSING,
            "running": YouTubeTaskStatus.PROCESSING,
            "completed": YouTubeTaskStatus.COMPLETED,
            "failed": YouTubeTaskStatus.FAILED,
        }
        return mapping.get(status, YouTubeTaskStatus.PENDING)
    def _map_status_from_enum(self, status: YouTubeTaskStatus) -> str:
        """Map DB enum to frontend status string."""
        mapping = {
            YouTubeTaskStatus.PENDING: "pending",
            YouTubeTaskStatus.PROCESSING: "processing",
            YouTubeTaskStatus.COMPLETED: "completed",
            YouTubeTaskStatus.FAILED: "failed",
        }
        return mapping.get(status, "pending")
    def create_task(
        self,
        task_type: str = "youtube_video_render",
        metadata: Optional[Dict[str, Any]] = None,
        user_id: Optional[str] = None,
    ) -> str:
        """Create a new task. Persists to DB if user_id provided; always writes to in-memory."""
        task_id = str(uuid.uuid4())
        task_metadata = metadata or {}
        now = datetime.now(timezone.utc)
        # Always write to in-memory for fast lookups
        self.task_storage[task_id] = {
            "status": "pending",
            "created_at": now,
            "updated_at": now,
            "result": None,
            "error": None,
            "progress_messages": [],
            "task_type": task_type,
            "progress": 0.0,
            "metadata": task_metadata,
        }
        # Persist to DB
        effective_user_id = user_id or task_metadata.get("owner_user_id")
        if effective_user_id:
            db = self._get_db(effective_user_id)
            if db:
                try:
                    db_task = YouTubeVideoTask(
                        task_id=task_id,
                        user_id=effective_user_id,
                        task_type=self._map_task_type(task_type),
                        status=YouTubeTaskStatus.PENDING,
                        progress=0.0,
                        request_data=task_metadata if task_metadata else None,
                        created_at=now,
                        updated_at=now,
                    )
                    db.add(db_task)
                    db.commit()
                    logger.debug(f"[YouTubeTaskManager] Persisted task {task_id} to DB for user {effective_user_id}")
                except Exception as e:
                    logger.warning(f"[YouTubeTaskManager] Failed to persist task {task_id} to DB: {e}")
                    db.rollback()
                finally:
                    db.close()
        logger.info(f"[YouTubeTaskManager] Created task: {task_id} (type: {task_type})")
        return task_id
    def get_task_status(self, task_id: str, requester_user_id: Optional[str] = None) -> Optional[Dict[str, Any]]:
        """Get task status. Checks in-memory first, then DB."""
        # Check in-memory first (fast path)
        if task_id in self.task_storage:
            task = self.task_storage[task_id]
            metadata = task.get("metadata", {}) or {}
            owner_user_id = metadata.get("owner_user_id")
            if requester_user_id is not None and owner_user_id is not None and requester_user_id != owner_user_id:
                logger.warning(f"[YouTubeTaskManager] Task access denied for task {task_id}")
                return None
            response = {
                "task_id": task_id,
                "status": task["status"],
                "progress": task.get("progress", 0.0),
                "message": task.get("progress_messages", [])[-1] if task.get("progress_messages") else None,
                "created_at": task["created_at"].isoformat() if task.get("created_at") else None,
                "updated_at": (task.get("updated_at") or task.get("created_at")).isoformat() if (task.get("updated_at") or task.get("created_at")) else None,
            }
            if task["status"] == "completed" and task.get("result"):
                response["result"] = task["result"]
            if task["status"] == "failed" and task.get("error"):
                response["error"] = task["error"]
                if task.get("error_status") is not None:
                    response["error_status"] = task["error_status"]
                if task.get("error_data") is not None:
                    response["error_data"] = task["error_data"]
            return response
        # Fall back to DB
        if requester_user_id:
            db = self._get_db(requester_user_id)
            if db:
                try:
                    db_task = db.query(YouTubeVideoTask).filter(YouTubeVideoTask.task_id == task_id).first()
                    if db_task:
                        status_val = self._map_status_from_enum(db_task.status)
                        response = {
                            "task_id": db_task.task_id,
                            "status": status_val,
                            "progress": db_task.progress or 0.0,
                            "message": db_task.message,
                            "created_at": db_task.created_at.isoformat() if db_task.created_at else None,
                            "updated_at": db_task.updated_at.isoformat() if db_task.updated_at else None,
                        }
                        if db_task.result:
                            response["result"] = db_task.result
                        if db_task.error:
                            response["error"] = db_task.error
                            if isinstance(db_task.result, dict):
                                if db_task.result.get("error_status") is not None:
                                    response["error_status"] = db_task.result["error_status"]
                                if db_task.result.get("error_data") is not None:
                                    response["error_data"] = db_task.result["error_data"]
                        return response
                except Exception as e:
                    logger.warning(f"[YouTubeTaskManager] DB lookup failed for task {task_id}: {e}")
                finally:
                    db.close()
        return None
    def update_task_status(
        self,
        task_id: str,
        status: str,
        progress: Optional[float] = None,
        message: Optional[str] = None,
        result: Optional[Dict[str, Any]] = None,
        error: Optional[str] = None,
        error_status: Optional[int] = None,
        error_data: Optional[Dict[str, Any]] = None,
    ):
        """Update task status. Writes to both in-memory and DB."""
        now = datetime.now(timezone.utc)
        # Update in-memory
        if task_id in self.task_storage:
            task = self.task_storage[task_id]
            task["status"] = status
            task["updated_at"] = now
            if progress is not None:
                task["progress"] = progress
            if message:
                if "progress_messages" not in task:
                    task["progress_messages"] = []
                task["progress_messages"].append(message)
                logger.info(f"[YouTubeTaskManager] Task {task_id}: {message} (progress: {progress}%)")
            if result is not None:
                task["result"] = result
            if error is not None:
                task["error"] = error
                logger.error(f"[YouTubeTaskManager] Task {task_id} error: {error}")
            if error_status is not None:
                task["error_status"] = error_status
            if error_data is not None:
                task["error_data"] = error_data
            # Try DB update
            metadata = task.get("metadata", {}) or {}
            user_id = metadata.get("owner_user_id")
            self._update_db_task(task_id, user_id, status, progress, message, result, error, now)
        else:
            logger.warning(f"[YouTubeTaskManager] Cannot update non-existent task: {task_id}")
    def _update_db_task(
        self,
        task_id: str,
        user_id: Optional[str],
        status: str,
        progress: Optional[float],
        message: Optional[str],
        result: Optional[Dict[str, Any]],
        error: Optional[str],
        now: datetime,
    ):
        """Update task in DB."""
        if not user_id:
            return
        db = self._get_db(user_id)
        if not db:
            return
        try:
            db_task = db.query(YouTubeVideoTask).filter(YouTubeVideoTask.task_id == task_id).first()
            if db_task:
                db_task.status = self._map_status_to_enum(status)
                db_task.updated_at = now
                if progress is not None:
                    db_task.progress = progress
                if message:
                    db_task.message = message[:500]
                if result:
                    # Merge into a copy: reassigning the same mutated dict may not be
                    # detected as a change on a plain JSON column.
                    existing_result = dict(db_task.result) if isinstance(db_task.result, dict) else {}
                    existing_result.update(result)
                    db_task.result = existing_result
                if error:
                    db_task.error = error
                if status in ("completed", "failed"):
                    db_task.completed_at = now
                db.commit()
                logger.debug(f"[YouTubeTaskManager] Persisted status update for task {task_id}")
            else:
                logger.debug(f"[YouTubeTaskManager] Task {task_id} not found in DB for update")
        except Exception as e:
            logger.warning(f"[YouTubeTaskManager] Failed to update DB task {task_id}: {e}")
            db.rollback()
        finally:
            db.close()
    def recover_stale_tasks(self, user_id: str) -> int:
        """Mark in-flight tasks that were interrupted by server restart as failed.
        Called on startup for each user to handle tasks that were 'pending' or
        'processing' when the server went down. Returns the number of tasks marked failed.
        """
        db = self._get_db(user_id)
        if not db:
            return 0
        count = 0
        try:
            stale_tasks = db.query(YouTubeVideoTask).filter(
                YouTubeVideoTask.user_id == user_id,
                YouTubeVideoTask.status.in_([
                    YouTubeTaskStatus.PENDING,
                    YouTubeTaskStatus.PROCESSING,
                ]),
            ).all()
            for task in stale_tasks:
                task.status = YouTubeTaskStatus.FAILED
                task.error = "Task interrupted by server restart"
                task.message = "Marked as failed on server restart"
                task.completed_at = datetime.now(timezone.utc)
                task.updated_at = datetime.now(timezone.utc)
                count += 1
                logger.info(f"[YouTubeTaskManager] Recovered stale task {task.task_id} for user {user_id}")
            if count > 0:
                db.commit()
                logger.info(f"[YouTubeTaskManager] Recovered {count} stale tasks for user {user_id}")
        except Exception as e:
            logger.warning(f"[YouTubeTaskManager] Failed to recover stale tasks: {e}")
            db.rollback()
        finally:
            db.close()
        return count
    def cleanup_old_tasks(self):
        """Remove in-memory tasks older than 1 hour. DB rows are pruned separately by cleanup_old_db_tasks()."""
        now = datetime.now(timezone.utc)
        cutoff = now.timestamp() - 3600  # 1 hour
        tasks_to_remove = []
        for task_id, task_data in self.task_storage.items():
            created_at = task_data.get("created_at")
            if created_at:
                ts = created_at.timestamp() if hasattr(created_at, 'timestamp') else 0
                if ts < cutoff:
                    tasks_to_remove.append(task_id)
        for task_id in tasks_to_remove:
            del self.task_storage[task_id]
            logger.debug(f"[YouTubeTaskManager] Cleaned up old in-memory task: {task_id}")
    def cleanup_old_db_tasks(self, days: int = 7, user_id: Optional[str] = None):
        """Delete completed/failed DB tasks older than N days."""
        if not user_id:
            return 0
        db = self._get_db(user_id)
        if not db:
            return 0
        count = 0
        try:
            from datetime import timedelta
            cutoff = datetime.now(timezone.utc) - timedelta(days=days)
            old_tasks = db.query(YouTubeVideoTask).filter(
                YouTubeVideoTask.user_id == user_id,
                YouTubeVideoTask.status.in_([YouTubeTaskStatus.COMPLETED, YouTubeTaskStatus.FAILED]),
                YouTubeVideoTask.created_at < cutoff,
            ).all()
            for task in old_tasks:
                db.delete(task)
                count += 1
            if count > 0:
                db.commit()
                logger.info(f"[YouTubeTaskManager] Cleaned up {count} old DB tasks for user {user_id}")
        except Exception as e:
            logger.warning(f"[YouTubeTaskManager] Failed to cleanup old DB tasks: {e}")
            db.rollback()
        finally:
            db.close()
        return count
 # Global singleton instance
 task_manager = YouTubeTaskManager()
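The read path in `get_task_status` (memory first, then the database) can be illustrated without a real database. This is a simplified sketch, not the module's implementation: `TwoTierTaskStore` is a hypothetical class, and a plain dict stands in for the `youtube_video_tasks` table.

```python
from typing import Any, Dict, Optional


class TwoTierTaskStore:
    """Hypothetical stand-in for the hybrid manager: fast dict plus durable copy."""

    def __init__(self) -> None:
        self.memory: Dict[str, Dict[str, Any]] = {}
        self.durable: Dict[str, Dict[str, Any]] = {}  # stand-in for the DB table

    def create(self, task_id: str, owner: str) -> None:
        record = {"status": "pending", "owner": owner}
        self.memory[task_id] = dict(record)
        self.durable[task_id] = dict(record)

    def update(self, task_id: str, status: str) -> bool:
        # As in update_task_status above, only tasks still held in memory are updated.
        if task_id not in self.memory:
            return False
        self.memory[task_id]["status"] = status
        if task_id in self.durable:
            self.durable[task_id]["status"] = status
        return True

    def get(self, task_id: str) -> Optional[Dict[str, Any]]:
        # Fast path first, then fall back to the durable copy.
        record = self.memory.get(task_id) or self.durable.get(task_id)
        return dict(record) if record else None

    def simulate_restart(self) -> None:
        self.memory.clear()


if __name__ == "__main__":
    store = TwoTierTaskStore()
    store.create("t1", "user-1")
    store.update("t1", "processing")
    store.simulate_restart()
    print(store.get("t1"))
    print(store.update("t1", "completed"))
```

The last call illustrates a property of the code in the diff worth keeping in mind: after a restart the task is still readable from the durable tier, but `update_task_status` logs a warning and skips it because the in-memory entry is gone.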
--- a/docs-site/docs/about.md
+++ b/docs-site/docs/about.md
@@ -1,3 +1,7 @@
 ---
 description: About ALwrity - AI-powered digital marketing platform for solopreneurs and content creators. Learn about our vision, mission, and features.
 ---
 # About ALwrity
 <div class="grid cards" markdown>
--- a/docs-site/docs/api/authentication.md
+++ b/docs-site/docs/api/authentication.md
@@ -75,7 +75,7 @@ Content-Type: application/json
 ### Key Rotation
 ```bash
-# Create new key
+## Create new key
 curl -X POST "https://your-domain.com/api/keys" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
@@ -84,7 +84,7 @@ curl -X POST "https://your-domain.com/api/keys" \
    "permissions": ["read", "write"]
  }'
-# Revoke old key
+## Revoke old key
 curl -X DELETE "https://your-domain.com/api/keys/old_key_id" \
  -H "Authorization: Bearer YOUR_API_KEY"
 ```
@@ -234,10 +234,10 @@ def make_request_with_retry(url, headers, data):
 ```python
 from alwrity import AlwrityClient
-# Initialize client with API key
+## Initialize client with API key
 client = AlwrityClient(api_key="your_api_key_here")
-# Or use environment variable
+## Or use environment variable
 import os
 client = AlwrityClient(api_key=os.getenv('ALWRITY_API_KEY'))
 ```
@@ -257,10 +257,10 @@ const client = new AlwrityClient(process.env.ALWRITY_API_KEY);
 ### cURL Examples
 ```bash
-# Set API key as environment variable
+## Set API key as environment variable
 export ALWRITY_API_KEY="your_api_key_here"
-# Use in requests
+## Use in requests
 curl -H "Authorization: Bearer $ALWRITY_API_KEY" \
     -H "Content-Type: application/json" \
     https://your-domain.com/api/blog-writer
--- a/docs-site/docs/api/overview.md
+++ b/docs-site/docs/api/overview.md
@@ -1,3 +1,7 @@
 ---
 description: ALwrity API Reference - Complete API documentation for authentication, endpoints, rate limiting, and error handling.
 ---
 # API Reference Overview
 ALwrity provides a comprehensive RESTful API that allows you to integrate AI-powered content creation capabilities into your applications. This API enables you to generate blog posts, optimize SEO, create social media content, and manage your content strategy programmatically.
--- a/docs-site/docs/features/backlink-outreach/api-reference.md
+++ b/docs-site/docs/features/backlink-outreach/api-reference.md
@@ -75,12 +75,16 @@ flowchart TD
 **Request Body:**
 | Field | Type | Required | Description |
 |---|---|---|---|
 | `name` | string | Yes | Campaign name. |
 | `description` | string | No | Campaign description. |
 | `keywords` | string[] | No | Target keywords for discovery. |
-**Response:** `201 Created` — Campaign object.
+**Error responses:**
 | Code | Meaning |
 |---|---|
 | `422` | Validation error (e.g., empty name). |
 ### List Campaigns
@@ -92,7 +96,7 @@ flowchart TD
 |---|---|---|---|
 | `workspace_id` | string | user_id | Workspace to filter by. Defaults to authenticated user. |
-**Response:** `200 OK` — Array of campaign objects.
+**Response:** `200 OK` — Array of campaign objects scoped to the authenticated user.
 ### Get Campaign
@@ -100,12 +104,24 @@ flowchart TD
 **Response:** `200 OK` — Campaign object with included leads.
 **Error responses:**
 | Code | Meaning |
 |---|---|
 | `404` | Campaign not found or does not belong to authenticated user (`BacklinkCampaignNotFoundError`). |
 ### Delete Campaign
 `DELETE /api/v1/backlink-outreach/campaigns/{campaign_id}`
 **Response:** `204 No Content`
 **Error responses:**
 | Code | Meaning |
 |---|---|
 | `404` | Campaign not found or does not belong to authenticated user. |
 ---
 ## Leads
@@ -117,7 +133,7 @@ flowchart TD
 **Request Body:**
 | Field | Type | Required | Description |
 |---|---|---|---|
 | `website_url` | string | Yes | Target website URL. |
 | `website_title` | string | No | Website title. |
 | `contact_email` | string | No | Contact email address. |
@@ -126,7 +142,14 @@ flowchart TD
 | `guest_post_likelihood` | float | No | Guest post likelihood (0-1). |
 | `source` | string | No | Source of the lead. |
-**Response:** `201 Created` — Lead object.
+!!! tip "Duplicate handling"
    If a lead with the same `website_url` already exists in the campaign, the existing lead record is returned (HTTP 200) instead of creating a duplicate.
 **Error responses:**
 | Code | Meaning |
 |---|---|
 | `404` | Campaign not found or not owned by user. |
 ### Bulk Add Leads
@@ -138,8 +161,8 @@ flowchart TD
 | Field | Type | Description |
 |---|---|---|
-| `added` | int | Number of leads successfully added. |
+| `added` | int | Number of leads successfully added (duplicates excluded). |
-| `skipped` | int | Number of duplicates skipped. |
+| `skipped` | int | Number of existing leads skipped (matched by `(campaign_id, website_url)`). |
 | `failed` | string[] | List of failed entries with reasons. |
 ### Update Lead Status
@@ -149,10 +172,15 @@ flowchart TD
 **Request Body:**
 | Field | Type | Required | Description |
 |---|---|---|---|
-| `status` | string | Yes | New status: discovered, contacted, replied, placed, bounced, lost. |
+| `status` | string | Yes | New status: `discovered`, `contacted`, `replied`, `placed`, `bounced`, `unsubscribed`. |
-**Response:** `200 OK` — Updated lead object.
+**Error responses:**
 | Code | Meaning |
 |---|---|
 | `422` | Invalid status value (must be one of the valid statuses). |
 | `404` | Lead not found. |
 ### Bulk Update Status
@@ -163,7 +191,7 @@ flowchart TD
 | Field | Type | Required | Description |
 |---|---|---|---|
 | `lead_ids` | string[] | Yes | Lead IDs to update. |
-| `status` | string | Yes | New status for all leads. |
+| `status` | string | Yes | New status: `discovered`, `contacted`, `replied`, `placed`, `bounced`, `unsubscribed`. |
 **Response:** `200 OK`
@@ -441,9 +469,10 @@ flowchart TD
 ## Common Error Responses
 | Status | Meaning | Body |
 |---|---|---|
 | `401` | Not authenticated | `{"detail": "Not authenticated"}` |
 | `403` | Policy blocked | `{"detail": "Policy validation failed", "reason": "..."}` |
-| `404` | Not found | `{"detail": "Resource not found"}` |
+| `404` | Campaign or lead not found | `{"detail": "BacklinkCampaignNotFoundError: Campaign not found or access denied"}` |
 | `409` | Duplicate lead (idempotency key collision) | `{"detail": "Duplicate attempt detected"}` |
 | `422` | Validation error | `{"detail": [...validation errors]}` |
 | `500` | Server error | `{"detail": "An internal error occurred"}` (generic, no stack trace) |
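The duplicate handling documented for Bulk Add Leads (`added`, `skipped`, `failed`, with duplicates matched by `(campaign_id, website_url)`) can be sketched as follows. This is an illustrative model only: `bulk_add_leads` here is a hypothetical function operating on an in-memory set, and treating an empty URL as a failure is an assumption, not documented behavior.

```python
from typing import Dict, Iterable, List, Set, Tuple


def bulk_add_leads(
    existing: Set[Tuple[str, str]],
    campaign_id: str,
    urls: Iterable[str],
) -> Dict[str, object]:
    """Hypothetical sketch: count added, skipped (duplicate) and failed leads."""
    added = 0
    skipped = 0
    failed: List[str] = []
    for url in urls:
        if not url or not url.strip():
            # Assumption: an empty website_url is reported as a failed entry.
            failed.append(f"{url!r}: empty website_url")
            continue
        key = (campaign_id, url.strip())
        if key in existing:
            skipped += 1
            continue
        existing.add(key)
        added += 1
    return {"added": added, "skipped": skipped, "failed": failed}


if __name__ == "__main__":
    known = {("c1", "https://a.example")}
    print(bulk_add_leads(known, "c1", ["https://a.example", "https://b.example", ""]))
```

Keying on the pair rather than the URL alone means the same site can still be a lead in two different campaigns.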
--- a/docs-site/docs/features/backlink-outreach/campaign-management.md
+++ b/docs-site/docs/features/backlink-outreach/campaign-management.md
@@ -21,6 +21,9 @@ A campaign requires only a name. Add a description and keywords to make discover
 !!! tip "Naming conventions"
    Use a consistent naming scheme like `[Vertical] [Content Type] [Period]` — e.g., "Fitness Guest Posts June" or "AI Startups Roundup Q3".
 !!! warning "Ownership validation"
    Campaigns are scoped to the authenticated user. API calls with a `campaign_id` that does not exist or belongs to another user return `404 BacklinkCampaignNotFoundError`. This applies to all campaign operations (get, delete, add leads, send emails, etc.).
 ## Campaign List View
 The campaign list shows:
--- a/docs-site/docs/features/backlink-outreach/configuration.md
+++ b/docs-site/docs/features/backlink-outreach/configuration.md
@@ -68,6 +68,20 @@ The Backlink Outreach feature uses SQLite with automatic table creation:
 Tables are created automatically on first use via `_ensure_tables()`. No manual migration is required.
 ## Feature Flag Configuration
 The Backlink Outreach feature can be enabled in isolation via the `ALWRITY_ENABLED_FEATURES` environment variable:
 | Variable | Value | Description |
 |---|---|---|
 | `ALWRITY_ENABLED_FEATURES` | `all` (default) | Enable all platform features. |
 | `ALWRITY_ENABLED_FEATURES` | `backlinking` | Enable only Backlink Outreach + core services. |
 When set to `backlinking`, only the backlink outreach router and its core dependencies are loaded. Other features (blog writer, podcast, SEO dashboard, etc.) are skipped — reducing startup time and memory usage.
 !!! note "Multiple features"
    You can also enable a combination: `ALWRITY_ENABLED_FEATURES=core,backlinking` or `ALWRITY_ENABLED_FEATURES=podcast,backlinking`.
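As a rough sketch of how a comma-separated value such as `core,backlinking` might be interpreted, the snippet below parses the variable into a set of feature names. The helper names `parse_enabled_features` and `is_feature_enabled` are hypothetical, and the sketch does not model how core services are always loaded alongside a single feature.

```python
import os
from typing import Optional, Set


def parse_enabled_features(raw: Optional[str]) -> Set[str]:
    """Parse ALWRITY_ENABLED_FEATURES; unset or empty falls back to 'all'."""
    if raw is None or not raw.strip():
        return {"all"}
    return {part.strip().lower() for part in raw.split(",") if part.strip()}


def is_feature_enabled(name: str, enabled: Set[str]) -> bool:
    # 'all' acts as a wildcard; otherwise the feature must be listed by name.
    return "all" in enabled or name in enabled


enabled = parse_enabled_features(os.environ.get("ALWRITY_ENABLED_FEATURES"))
if is_feature_enabled("backlinking", enabled):
    # A real application would register the backlink outreach router here.
    pass
```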
 ## Deployment Checklist
 ### Minimal Setup
--- a/docs-site/docs/features/backlink-outreach/implementation-overview.md
+++ b/docs-site/docs/features/backlink-outreach/implementation-overview.md
@@ -54,13 +54,15 @@ backend/
 ├── routers/
 │   └── backlink_outreach.py          # 18+ API endpoints
 ├── services/
-│   ├── backlink_outreach_service.py  # Business logic, policy, analytics
+│   ├── backlink_outreach_service.py       # Business logic, policy, analytics
-│   ├── backlink_outreach_storage.py  # SQLite CRUD operations
+│   ├── backlink_outreach_storage.py       # SQLite CRUD operations
-│   ├── backlink_outreach_sender.py   # SMTP email delivery
+│   ├── backlink_outreach_sender.py        # SMTP email delivery with Message-ID
-│   ├── backlink_outreach_reply_monitor.py  # IMAP reply polling
+│   ├── backlink_outreach_reply_monitor.py # IMAP reply polling with Message-ID matching
-│   └── backlink_outreach_models.py   # Pydantic request/response models
+│   ├── backlink_outreach_scraper.py       # Deep website scraper (Exa + DuckDuckGo)
 │   ├── backlink_outreach_template_generator.py  # LLM-based email copy generation
 │   └── backlink_outreach_models.py        # Pydantic request/response models
 ├── models/
-│   └── backlink_outreach_models.py   # SQLAlchemy models + indexes
+│   └── backlink_outreach_models.py        # SQLAlchemy models + indexes
 frontend/src/
 ├── components/
@@ -109,6 +111,7 @@ erDiagram
        string body
        string status
        string legal_basis
        string message_id
        datetime sent_at
    }
    OutreachReply {
@@ -217,10 +220,10 @@ SQLite CRUD operations with 20+ methods:
 - Campaign CRUD: `create_campaign`, `list_backlink_campaigns`, `get_campaign`, `delete_campaign`.
 - Lead management: `add_campaign_lead`, `add_campaign_leads_bulk`, `update_lead_status`, `bulk_update_lead_status`.
 - Outreach: `create_outreach_attempt`, `list_outreach_attempts`, `get_lead_attempts`.
- Replies: `store_reply`, `find_attempt_by_from_email`, `reply_exists`, `list_replies`, `count_replies`.
+- Replies: `store_reply`, `find_attempt_by_from_email`, `find_attempt_by_message_id`, `reply_exists`, `list_replies`, `count_replies`.
 - Follow-ups: `create_follow_up`, `list_follow_ups`.
 - Suppression: `add_suppression`, `list_suppression`, `is_suppressed`.
- Counters: `increment_user_counter`, `increment_domain_counter` (atomic ON CONFLICT).
+- Counters: `try_increment_user_send_counter`, `try_increment_domain_send_counter` (atomic ON CONFLICT — reserves cap slot before send).
 - Idempotency: `check_idempotency`, `mark_idempotency`.
 - Audit: `log_audit_entry`.
 - Templates: `create_email_template`, `list_email_templates`, `get_email_template`, `delete_email_template`.
@@ -249,7 +252,7 @@ Handles IMAP reply processing:
 3. Searches for messages matching the outreach sender.
 4. Fetches up to `IMAP_FETCH_LIMIT` messages.
 5. Checks for duplicates via `reply_exists()`.
-6. Matches replies to attempts via `find_attempt_by_from_email()`.
+6. Matches replies to attempts via `find_attempt_by_message_id()` (primary, using `In-Reply-To`/`References` headers), falls back to `find_attempt_by_from_email()`.
 7. Classifies replies based on content analysis.
 8. Stores reply records.
--- a/docs-site/docs/features/backlink-outreach/outreach-operations.md
+++ b/docs-site/docs/features/backlink-outreach/outreach-operations.md
@@ -12,15 +12,16 @@ flowchart TD
    B --> C[Resolve Lead Email from DB]
    C --> D[Policy Validation]
    D -->|Approved| E[Create Outreach Attempt Record]
-    D -->|Blocked| F[Record Audit Log + Return 403]
+    D -->|Blocked| F[Record Audit Log + Return 403]
-    E --> G[Send via SMTP with TLS]
+    E --> G[Reserve Daily Cap Slots Atomically]
-    G -->|Success| H[Increment Counters]
+    G --> H[Send via SMTP with TLS + Message-ID]
-    G -->|Success| I[Mark Idempotency Key]
+    H -->|Success| I[Store Message-ID on Attempt Record]
-    G -->|Success| J[Update Lead Status to Contacted]
+    H -->|Success| J[Mark Idempotency Key]
-    G -->|Failure| K[Return 500 with Generic Error]
+    H -->|Success| K[Update Lead Status to Contacted]
-    H --> L[Return 200 with Attempt Details]
+    H -->|Failure| L[Return 500 with Generic Error]
-    I --> L
+    I --> M[Return 200 with Attempt Details]
-    J --> L
+    J --> M
    K --> M
    style D fill:#fff3e0
    style G fill:#e3f2fd
@@ -28,7 +29,7 @@ flowchart TD
 ```
 !!! warning "Counter timing"
-    Counters and idempotency keys are marked **only after successful SMTP delivery**, never before. This prevents false cap consumption on failed sends.
+    Daily cap slots are **reserved atomically before sending** via `try_increment_user_send_counter` and `try_increment_domain_send_counter`. The cap check and the increment happen in the same transaction, so if SMTP delivery then fails, the reserved slot is not released and the failed send still counts against the cap. Idempotency keys are marked only after successful delivery.
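The "check and increment in one statement" idea can be illustrated with a SQLite upsert. This is a sketch under assumptions: the `send_counters` table, its columns, and the `try_reserve_slot` helper are invented for the example and are not the actual `try_increment_*` implementations. It relies on SQLite's `ON CONFLICT ... DO UPDATE ... WHERE` clause (SQLite 3.24 or newer).

```python
import sqlite3


def try_reserve_slot(conn: sqlite3.Connection, scope: str, day: str, cap: int) -> bool:
    """Reserve one send slot for (scope, day); return False once the cap is hit."""
    if cap <= 0:
        return False
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            """
            INSERT INTO send_counters (scope, day, sent) VALUES (?, ?, 1)
            ON CONFLICT(scope, day) DO UPDATE SET sent = sent + 1
            WHERE sent < ?
            """,
            (scope, day, cap),
        )
    # One row changed means the slot was reserved; zero means the WHERE
    # clause rejected the increment because the cap was already reached.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE send_counters ("
    "scope TEXT NOT NULL, day TEXT NOT NULL, sent INTEGER NOT NULL, "
    "PRIMARY KEY (scope, day))"
)
results = [try_reserve_slot(conn, "example.com", "2026-06-15", 2) for _ in range(3)]
```

Because the comparison against the cap and the increment are one statement, two concurrent senders cannot both take the last slot. The trade-off described above follows directly: nothing in this sketch gives the slot back when the send later fails.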
 ## Policy Validation
@@ -40,6 +41,7 @@ Before every send, the system validates:
 | **Daily domain cap** | Max 20 emails/domain/day | Block + audit |
 | **Suppression list** | Recipient not suppressed | Block + audit |
 | **Idempotency** | No duplicate `(sender, recipient, subject)` in 24h | Block + audit |
 | **Sender alias** | `sender_email` must match `SMTP_ALLOWED_FROM_EMAILS` pattern | Block + fallback to `SMTP_FROM_EMAIL` |
 | **Legal basis** | EU domains → "consent", others → "legitimate_interest" | Auto-assign |
 **API:** `POST /api/v1/backlink-outreach/policy/validate`
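One way to picture the idempotency rule in the table is a key derived from `(sender, recipient, subject)` plus a 24-hour window check. The normalization, the SHA-256 hashing, and the in-memory `seen` dictionary below are assumptions for illustration; the docs do not say how `check_idempotency` and `mark_idempotency` build or store keys.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from typing import Dict


def idempotency_key(sender: str, recipient: str, subject: str) -> str:
    # Normalize addresses so case differences do not produce distinct keys.
    parts = [sender.strip().lower(), recipient.strip().lower(), subject.strip()]
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def is_duplicate(seen: Dict[str, datetime], key: str, now: datetime,
                 window: timedelta = timedelta(hours=24)) -> bool:
    last = seen.get(key)
    return last is not None and now - last < window


seen: Dict[str, datetime] = {}
now = datetime(2026, 6, 15, 9, 0, tzinfo=timezone.utc)
key = idempotency_key("Me@Example.com", "editor@example.org", "Guest post idea")

first_check = is_duplicate(seen, key, now)           # nothing recorded yet
seen[key] = now                                       # mark after a successful send
second_check = is_duplicate(seen, key, now + timedelta(hours=1))
after_window = is_duplicate(seen, key, now + timedelta(hours=25))
```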
--- a/docs-site/docs/features/backlink-outreach/overview.md
+++ b/docs-site/docs/features/backlink-outreach/overview.md
@@ -1,3 +1,7 @@
 ---
 description: ALwrity Backlink Outreach - AI-powered backlink discovery, outreach automation, and campaign management.
 ---
 # Backlink Outreach Overview
 Backlink Outreach is an AI-powered guest post outreach platform that takes you from opportunity discovery to published backlink — with smart email composition, policy-safe sending, IMAP reply monitoring, and full campaign analytics.
--- a/docs-site/docs/features/backlink-outreach/reply-inbox.md
+++ b/docs-site/docs/features/backlink-outreach/reply-inbox.md
@@ -44,15 +44,18 @@ The reply monitor:
 3. Searches for messages sent to your outreach address.
 4. Fetches up to `IMAP_FETCH_LIMIT` recent messages.
 5. For each message, checks if it's already been processed (deduplication).
-6. Matches the reply to an existing outreach attempt by sender email.
+6. Matches the reply to an existing outreach attempt (Message-ID first, sender email fallback).
 7. Classifies the reply and stores it.
 ### Reply Matching
-Replies are matched to outreach attempts using the `from_email` field:
+Replies are matched to outreach attempts using a two-stage strategy:
- The system looks up `find_attempt_by_from_email(from_email)` to find the most recent outreach attempt sent to that email address.
+1. **Message-ID matching (primary)**: Each sent email includes a unique `Message-ID` header. When the recipient replies, their email client includes the original `Message-ID` in `In-Reply-To` and `References` headers. The system extracts these and looks up `find_attempt_by_message_id(in_reply_to)` to find the exact outreach attempt.
- If no match is found, the reply is still stored but not linked to an attempt.
+
 2. **Sender email fallback**: If no Message-ID match is found (e.g., the reply client stripped headers), the system falls back to `find_attempt_by_from_email(from_email)` to find the most recent attempt sent to that address.
 3. **Unmatched replies**: If neither strategy produces a match, the reply is still stored but not linked to an attempt.
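The matching order above can be sketched with Python's standard `email` package. The two dictionaries stand in for the storage lookups `find_attempt_by_message_id` and `find_attempt_by_from_email`, and the helper names `candidate_message_ids` and `match_reply` are hypothetical, so treat this as an illustration of the strategy rather than the monitor's actual code.

```python
from email.message import EmailMessage
from email.utils import make_msgid, parseaddr
from typing import Dict, List, Optional, Tuple


def candidate_message_ids(reply: EmailMessage) -> List[str]:
    """Collect <id@host> tokens from In-Reply-To and References, in order."""
    ids: List[str] = []
    for header in ("In-Reply-To", "References"):
        for token in str(reply.get(header, "")).split():
            if token.startswith("<") and token.endswith(">") and token not in ids:
                ids.append(token)
    return ids


def match_reply(reply: EmailMessage,
                by_message_id: Dict[str, str],
                by_recipient: Dict[str, str]) -> Tuple[Optional[str], str]:
    # Stage 1: exact match on the Message-ID of the email we sent.
    for message_id in candidate_message_ids(reply):
        if message_id in by_message_id:
            return by_message_id[message_id], "message_id"
    # Stage 2: fall back to the address the reply came from.
    _, from_email = parseaddr(str(reply.get("From", "")))
    attempt = by_recipient.get(from_email.lower())
    if attempt is not None:
        return attempt, "from_email"
    # Stage 3: no match; the reply is kept but left unlinked.
    return None, "unmatched"


sent_id = make_msgid(domain="example.com")  # value stored on the attempt at send time
by_message_id = {sent_id: "attempt-1"}
by_recipient = {"editor@example.org": "attempt-1"}

threaded = EmailMessage()
threaded["From"] = "Editor <editor@example.org>"
threaded["In-Reply-To"] = sent_id
threaded["References"] = sent_id

stripped = EmailMessage()  # a client that dropped the threading headers
stripped["From"] = "editor@example.org"

stranger = EmailMessage()
stranger["From"] = "nobody@example.net"
```

Here `match_reply(threaded, ...)` resolves through the Message-ID, `stripped` falls back to the sender address, and `stranger` stays unmatched.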
 ### Deduplication
--- a/docs-site/docs/features/blog-writer/overview.md
+++ b/docs-site/docs/features/blog-writer/overview.md
@@ -1,3 +1,7 @@
 ---
 description: ALwrity Blog Writer - AI-powered blog post creation with SEO optimization, research integration, and multi-platform publishing.
 ---
 # Blog Writer Overview
 The ALwrity Blog Writer is a powerful AI-driven content creation tool that helps you generate high-quality, SEO-optimized blog posts with minimal effort. It's designed for users with medium to low technical knowledge, making professional content creation accessible to everyone.
--- a/docs-site/docs/features/content-strategy/overview.md
+++ b/docs-site/docs/features/content-strategy/overview.md
@@ -1,3 +1,7 @@
 ---
 description: ALwrity Content Strategy - AI-powered strategic planning, persona development, and content calendar generation.
 ---
 # Content Strategy Overview
 ALwrity's Content Strategy module is the brain of your content marketing efforts, providing AI-powered strategic planning, persona development, and content calendar generation to help you create a comprehensive, data-driven content marketing strategy.
@@ -323,6 +327,13 @@ ALwrity generates comprehensive content calendars that align with your strategy:
 - **Strategy Updates**: Automatic strategy refinement
 - **Report Generation**: Automated performance reports
 ## Related Features
 - **[Persona System](../persona/overview.md)** — Build audience personas for targeted content
 - **[Blog Writer](../blog-writer/overview.md)** — Create content aligned with your strategy
 - **[SEO Dashboard](../seo-dashboard/overview.md)** — Discover content gaps and opportunities
 - **[Backlink Outreach](../backlink-outreach/overview.md)** — Support strategy with link-building
 ---
 *Ready to develop your content strategy? [Start with our First Steps Guide](../../getting-started/first-steps.md) or [Explore Persona Development](personas.md) to begin building your strategic content plan!*
--- a/docs-site/docs/features/image-studio/api-reference.md
+++ b/docs-site/docs/features/image-studio/api-reference.md
@@ -14,7 +14,7 @@ All endpoints require authentication via Bearer token:
 Authorization: Bearer YOUR_ACCESS_TOKEN
 ```
-The token is obtained through the standard ALwrity authentication flow. See [Authentication Guide](../api/authentication.md) for details.
+The token is obtained through the standard ALwrity authentication flow. See [Authentication Guide](../../api/authentication.md) for details.
 ## API Architecture
@@ -827,7 +827,7 @@ Image Studio API follows standard ALwrity rate limiting:
 - **Headers**: Rate limit information in response headers
 - **Retry**: Use exponential backoff for rate limit errors
-See [Rate Limiting Guide](../api/rate-limiting.md) for details.
+See [Rate Limiting Guide](../../api/rate-limiting.md) for details.
 ---
@@ -936,5 +936,5 @@ curl -X POST https://api.alwrity.com/api/image-studio/create \
 ---
-*For authentication details, see the [API Authentication Guide](../api/authentication.md). For rate limiting, see the [Rate Limiting Guide](../api/rate-limiting.md).*
+*For authentication details, see the [API Authentication Guide](../../api/authentication.md). For rate limiting, see the [Rate Limiting Guide](../../api/rate-limiting.md).*
--- a/docs-site/docs/features/image-studio/modules.md
+++ b/docs-site/docs/features/image-studio/modules.md
@@ -1,3 +1,7 @@
 ---
 description: ALwrity Image Studio modules - Create, Edit, Upscale, Optimize, and manage image assets.
 ---
 # Image Studio Modules
 Image Studio consists of 7 core modules that provide a complete image workflow from creation to optimization. This guide provides detailed information about each module, their features, and current implementation status.
--- a/docs-site/docs/features/image-studio/overview.md
+++ b/docs-site/docs/features/image-studio/overview.md
@@ -1,3 +1,7 @@
 ---
 description: ALwrity Image Studio - AI-powered image creation, editing, and optimization for digital marketers and content creators.
 ---
 # Image Studio Overview
 The ALwrity Image Studio is a comprehensive AI-powered image creation, editing, and optimization platform designed specifically for digital marketers and content creators. It provides a unified hub for all image-related operations, from generation to social media optimization, making professional visual content creation accessible to everyone.
--- a/docs-site/docs/features/linkedin-writer/overview.md
+++ b/docs-site/docs/features/linkedin-writer/overview.md
@@ -1,3 +1,7 @@
 ---
 description: ALwrity LinkedIn Writer - AI-powered professional LinkedIn content creation for brand building.
 ---
 # LinkedIn Writer: Overview
 The ALwrity LinkedIn Writer is a specialized AI-powered tool designed to help you create professional, engaging LinkedIn content that builds your personal brand, drives engagement, and establishes thought leadership in your industry.
Author	SHA1	Message	Date
Kunthawat Greethong	19b4ac53fc	Add Dockerfile for EasyPanel deployment	2026-06-15 10:40:16 +07:00
ajaysi	ce9bf293ed	Fix LinkedIn writer: progress animation, persona API 404 handling, back-to-home navigation - Simulate progress step advancement at 1.5s intervals during API calls so users see incremental progress instead of all-at-once bursts - PersonaChip skips API calls entirely in feature-only mode (no console spam) - getUserPersonas/getPlatformPersona return null on 404 instead of throwing - PersonaChip shows neutral gray state when no persona data exists - Back button now clears draft to return to LinkedIn writer home screen - Article title extracted from markdown content (fixes KeyError) - InitialRouteHandler: demo mode subscribes getDefaultLandingRoute() - Header: back button shown when draft exists, navigates to home screen	2026-06-13 17:12:45 +05:30
ajaysi	d90d441019	chore: push all remaining changes - Blog writer enhancements and bug fixes - Wix integration improvements - Frontend UI updates - GSC dashboard docs cleanup - Image studio assets - LinkedIn requirements file - Various dependency updates	2026-06-12 20:32:03 +05:30
ajaysi	63a0df2536	feat: LinkedIn LLM alignment - Phase 1-3 complete Phase 1: Dead Code Cleanup - Remove GeminiGroundedProvider import and property from linkedin_service.py - Remove fallback_provider property (gemini_provider imports) - Fix routers/linkedin.py edit endpoint to use llm_text_gen - Delete dead LinkedInImageEditor class - Remove dead _transform_gemini_sources from content_generator.py Phase 2: Research Infrastructure Alignment - Add user_id to _conduct_research() for pre-flight validation - Add validate_exa_research_operations() before Exa/Tavily calls - Pass user_id to provider.simple_search() for usage tracking - Inject research content into LLM prompts via _build_research_context() - Fix Google engine path to fallback to Exa - Add Exa → Tavily fallback on research failure Phase 3: Cosmetic Cleanup - Rename _generate_prompts_with_gemini → _generate_prompts_with_llm - Rename _build_gemini_prompt → _build_image_prompt - Rename _parse_gemini_response → _parse_llm_response - Remove all Gemini references from LinkedIn code (0 remaining) - Update docstrings and log messages Additional: - Research caching using existing ResearchCache - Shared ExaContentResearchProvider in services/research/ - Persona service uses llm_text_gen instead of gemini_structured_json_response - LinkedInWriter.tsx ChatMessage → ChatMsg type mapping fix - RegisterLinkedInActionsEnhanced.tsx content_format_rules typing fix	2026-06-12 18:58:53 +05:30
ajaysi	e54aaa7a3e	chore: bulk commit of local changes across blog writer, SEO dashboard, scheduler, docs-site, and frontend	2026-06-05 12:40:30 +05:30
@@ -0,0 +1,3 @@
 from .carousel_renderer import LinkedInCarouselPDFRenderer
 
 __all__ = ['LinkedInCarouselPDFRenderer']