AI Analysis and Content Strategy fixes. Enhanced Strategy Routes refactoring.

2026-01-10 19:32:50 +05:30
parent 0b63ae7fc1
commit 8193cdba67
298 changed files with 45678 additions and 10952 deletions
--- a/docs/image-generation-comparison.md
+++ b/docs/image-generation-comparison.md
@@ -0,0 +1,287 @@
+# Image Generation Implementation Comparison
+
+## Overview
+This document compares how **Podcast Maker**, **Story Writer**, and **Blog Writer** implement AI image generation, focusing on model selection, provider routing, and best practices.
+
+---
+
+## 1. **Podcast Maker** (`backend/api/podcast/handlers/images.py`)
+
+### Key Features:
+- **Dual Mode**: Character-consistent generation (Ideogram Character) vs. standard generation
+- **Auto Provider Selection**: Passes `"provider": None` so the backend auto-selects a provider based on the environment
+- **Specialized Prompt Building**: Podcast-optimized prompts with scene context
+- **Pre-flight Validation**: Subscription checks before API calls
+
+### Model Usage:
+```python
+# Character-consistent generation (when base_avatar_url provided)
+generate_character_image(
+    prompt=image_prompt,
+    reference_image_bytes=base_avatar_bytes,
+    user_id=user_id,
+    style=style,  # "Realistic", "Fiction", "Auto"
+    aspect_ratio=aspect_ratio,  # "1:1", "16:9", "9:16", "4:3", "3:4"
+    rendering_speed=rendering_speed,  # "Default", "Turbo", "Quality"
+)
+# Model: ideogram-ai/ideogram-character (WaveSpeed)
+# Cost: ~$0.10/image
+
+# Standard generation (no base avatar)
+generate_image(
+    prompt=image_prompt,
+    options={
+        "provider": None,  # Auto-select
+        "width": request.width,
+        "height": request.height,
+    },
+    user_id=user_id
+)
+# Provider: Auto-selected (WaveSpeed, HuggingFace, or Stability)
+# Cost: ~$0.04/image (varies by provider)
+```
+
+### Prompt Building Strategy:
+- **Scene Context**: Scene title, content preview, visual keywords
+- **Podcast Theme**: Idea/topic context
+- **Technical Requirements**: 16:9 aspect ratio, video-optimized composition
+- **Style Constraints**: Realistic photography, professional broadcast quality
+
+### Error Handling:
+- **Character Generation Failure**: Raises HTTPException (no fallback to standard generation)
+- **Timeout/Connection Issues**: Returns 504 with retry recommendation
+- **Other Errors**: Returns 502 with error details
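+
+The status-code mapping above can be sketched without any web framework. This is a minimal illustration, not the handler's actual code: `status_for_generation_error` is a hypothetical helper, and treating the built-in `TimeoutError` and `ConnectionError` as the "timeout/connection" class is an assumption.
+
+```python
+def status_for_generation_error(exc: Exception) -> tuple[int, str]:
+    """Map a provider failure to an (HTTP status, client message) pair."""
+    # Assumption: timeouts and dropped connections surface as these built-in types.
+    if isinstance(exc, (TimeoutError, ConnectionError)):
+        return 504, "Image provider timed out; please retry shortly."
+    # Everything else is reported as a bad-gateway error with the details attached.
+    return 502, f"Image generation failed: {exc}"
+```
+
+A real handler would presumably raise its framework's HTTP exception with the returned status and message instead of returning a tuple.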
+
+---
+
+## 2. **Story Writer** (`backend/services/story_writer/image_generation_service.py`)
+
+### Key Features:
+- **Simple Wrapper**: Thin service layer around `generate_image()`
+- **Batch Processing**: Generates images for multiple scenes sequentially
+- **Progress Callbacks**: Supports progress tracking for batch operations
+- **Error Resilience**: Continues with next scene if one fails
+
+### Model Usage:
+```python
+# Single scene generation
+generate_image(
+    prompt=image_prompt,  # From scene.image_prompt
+    options={
+        "provider": provider,  # Optional, can be None for auto-select
+        "width": width,  # Default: 1024
+        "height": height,  # Default: 1024
+        "model": model,  # Optional
+    },
+    user_id=user_id
+)
+
+# Batch generation
+generate_scene_images(
+    scenes=scenes_data,
+    user_id=user_id,
+    provider=request.provider,  # Optional
+    width=request.width or 1024,
+    height=request.height or 1024,
+    model=request.model,  # Optional
+    progress_callback=progress_callback  # Optional
+)
+```
+
+### Prompt Strategy:
+- **Direct Use**: Uses `scene.image_prompt` directly (no prompt building)
+- **Pre-generated**: Prompts are created during story outline phase
+- **No Modification**: Service doesn't modify prompts
+
+### Error Handling:
+- **HTTPException**: Re-raised (e.g., 429 subscription limits)
+- **Other Exceptions**: Wrapped in RuntimeError, continues with next scene
+- **Partial Success**: Returns results with error field for failed scenes
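+
+The partial-success behavior can be sketched as a loop that records failures per scene. The names below are illustrative assumptions rather than the service's API: `generate_batch`, the `generate` callable, the `title` scene field, and `QuotaExceeded` (a stand-in for an HTTPException such as a 429) are all hypothetical.
+
+```python
+from typing import Any, Callable, Optional
+
+
+class QuotaExceeded(Exception):
+    """Stand-in for an HTTPException such as a 429 subscription limit."""
+
+
+def generate_batch(
+    scenes: list[dict[str, Any]],
+    generate: Callable[[str], str],
+    progress_callback: Optional[Callable[[int, int], None]] = None,
+) -> list[dict[str, Any]]:
+    """Generate one image per scene, recording failures instead of aborting."""
+    results: list[dict[str, Any]] = []
+    for index, scene in enumerate(scenes, start=1):
+        try:
+            url = generate(scene["image_prompt"])
+            results.append({"scene": scene["title"], "image_url": url, "error": None})
+        except QuotaExceeded:
+            # Limit errors abort the whole batch, mirroring the re-raised HTTPException.
+            raise
+        except Exception as exc:
+            # Any other failure is recorded and the loop moves on to the next scene.
+            results.append({"scene": scene["title"], "image_url": None, "error": str(exc)})
+        if progress_callback is not None:
+            progress_callback(index, len(scenes))
+    return results
+```
+
+Callers can then filter on the `error` field to decide which scenes to retry.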
+
+---
+
+## 3. **Blog Writer** (`frontend/src/components/ImageGen/ImageGenerator.tsx`)
+
+### Key Features:
+- **Provider Selection**: User can choose WaveSpeed, HuggingFace, or Stability
+- **Model Selection**: Dropdown based on selected provider
+- **Dimension Validation**: Frontend validation with model-specific limits
+- **Prompt Optimization**: "Optimize Prompt" button for blog-optimized prompts
+- **Cost Display**: Shows cost information for WaveSpeed models
+
+### Model Usage:
+```typescript
+// Frontend component
+const req: ImageGenerationRequest = {
+  prompt,
+  negative_prompt: negative,
+  provider,  // 'wavespeed' | 'huggingface' | 'stability'
+  model,  // e.g., 'qwen-image', 'ideogram-v3-turbo'
+  width,
+  height
+};
+```
+
+Backend routing (`main_image_generation.py`) auto-detects WaveSpeed models and remaps the provider:
+
+```python
+wavespeed_models = ["qwen-image", "ideogram-v3-turbo"]
+if model_lower in wavespeed_models and provider_name != "wavespeed":
+    provider_name = "wavespeed"
+```
+
+### Available Models:
+- **WaveSpeed**: `qwen-image` ($0.05), `ideogram-v3-turbo` ($0.10)
+- **HuggingFace**: `black-forest-labs/FLUX.1-Krea-dev`, `black-forest-labs/FLUX.1-dev`, `runwayml/flux-dev`
+- **Stability AI**: `stable-diffusion-xl-1024-v1-0`, `stable-diffusion-xl-base-1.0`
+
+### Dimension Limits:
+- **WaveSpeed Models**: Max 1024x1024
+- **Other Models**: Max 2048x2048
+- **Frontend Validation**: Clamps dimensions and shows errors
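+
+A clamping rule matching the limits above might look like the following. It is written in Python to match the other examples, although the real validation lives in the TypeScript component. `clamp_dimensions` is a hypothetical helper, and keying the limit by provider (rather than by model) and using a lower bound of 1 pixel are simplifying assumptions.
+
+```python
+MAX_DIMENSION = {"wavespeed": 1024}  # per-provider cap taken from the limits above
+DEFAULT_MAX_DIMENSION = 2048         # cap for all other providers
+
+
+def clamp_dimensions(provider: str, width: int, height: int) -> tuple[int, int, bool]:
+    """Return (width, height, was_clamped) within the provider's limit."""
+    limit = MAX_DIMENSION.get(provider.lower(), DEFAULT_MAX_DIMENSION)
+    clamped_width = min(max(width, 1), limit)
+    clamped_height = min(max(height, 1), limit)
+    # The flag lets the UI show a message when the request was adjusted.
+    was_clamped = (clamped_width, clamped_height) != (width, height)
+    return clamped_width, clamped_height, was_clamped
+```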
+
+### Prompt Optimization:
+- **Backend Endpoint**: `/api/images/suggest-prompts`
+- **Blog-Optimized**: Focuses on data visualization, infographics, text overlay areas
+- **Context-Aware**: Uses title, section, research, persona for better prompts
+
+---
+
+## 4. **Common Patterns & Best Practices**
+
+### Provider Selection:
+```python
+# Pattern 1: Auto-select (Podcast Maker)
+options = {"provider": None}  # Let _select_provider() decide
+
+# Pattern 2: Explicit (Story Writer, Blog Writer)
+options = {"provider": "wavespeed"}  # User or service specifies
+
+# Pattern 3: Model-based remapping (Blog Writer backend)
+# Automatically remaps provider based on model name
+```
+
+### Model Routing:
+```python
+# Backend auto-detection (main_image_generation.py)
+# Detects WaveSpeed models and remaps the provider
+wavespeed_models = ["qwen-image", "ideogram-v3-turbo"]
+if model_lower in wavespeed_models and provider_name != "wavespeed":
+    provider_name = "wavespeed"
+```
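+
+The fragment above relies on `model_lower` and `provider_name` being defined elsewhere. A self-contained version of the same rule could look like this. `resolve_provider` is a hypothetical name and the lowercasing of inputs is an assumption; only the model list and the remapping condition come from the excerpt.
+
+```python
+from typing import Optional
+
+WAVESPEED_MODELS = {"qwen-image", "ideogram-v3-turbo"}
+
+
+def resolve_provider(provider: Optional[str], model: Optional[str]) -> Optional[str]:
+    """Return the provider to use, forcing WaveSpeed for WaveSpeed-only models."""
+    provider_name = provider.lower() if provider else None
+    model_lower = model.lower() if model else ""
+    if model_lower in WAVESPEED_MODELS and provider_name != "wavespeed":
+        return "wavespeed"
+    # None means "let the backend auto-select", as in Pattern 1.
+    return provider_name
+```
+
+With this rule, a request for `qwen-image` that names `huggingface` as the provider is routed to WaveSpeed rather than failing on an unknown model.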
+
+### Error Handling:
+```python
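+# Pattern 1: Re-raise HTTPExceptions unchanged (e.g., subscription limits)
+try:
+    result = generate_image(prompt=image_prompt, options=options, user_id=user_id)
+except HTTPException:
+    raise
+# Pattern 2: Wrap any other exception in RuntimeError (Story Writer, single image)
+except Exception as e:
+    raise RuntimeError(f"Failed to generate image: {e}") from e
+
+# Pattern 3: Return the error in the result instead of raising (Story Writer batch)
+# This runs inside the batch loop's `except Exception as e:` handler.
+image_results.append({
+    "error": str(e),
+    "image_url": None,
+})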
+```
+
+### Subscription Validation:
+```python
+# Pre-flight validation (Podcast Maker)
+validate_image_generation_operations(
+    pricing_service=pricing_service,
+    user_id=user_id,
+    num_images=1
+)
+
+# Built-in validation (main_image_generation.py)
+_validate_image_operation(
+    user_id=user_id,
+    operation_type="image-generation",
+    num_operations=1,
+)
+```
+
+---
+
+## 5. **Key Differences**
+
+| Feature | Podcast Maker | Story Writer | Blog Writer |
+|---------|---------------|--------------|-------------|
+| **Provider Selection** | Auto-select | Optional explicit | User selects |
+| **Model Selection** | Ideogram Character or auto-select | Optional explicit | User selects |
+| **Prompt Building** | Custom podcast prompts | Pre-generated | User + optimization |
+| **Dimension Limits** | No validation | No validation | Frontend validation |
+| **Error Handling** | Strict (no fallback) | Resilient (continues) | User-friendly alerts |
+| **Cost Display** | Estimated in response | Not shown | Shown in UI |
+| **Special Features** | Character consistency | Batch processing | Prompt optimization |
+
+---
+
+## 6. **Recommendations for Blog Writer**
+
+### ✅ Already Implemented:
+1. ✅ Provider/model selection UI
+2. ✅ Dimension validation
+3. ✅ Model-based provider remapping
+4. ✅ Cost information display
+5. ✅ Prompt optimization
+
+### 🔄 Could Improve:
+1. **Pre-flight Validation**: Add subscription checks before API calls (like Podcast Maker)
+2. **Error Messages**: More specific error messages based on error type
+3. **Batch Generation**: Support generating multiple images for blog sections
+4. **Progress Tracking**: Show progress for multiple image generations
+5. **Retry Logic**: Automatic retry for transient failures
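+
+For the retry recommendation, one possible shape is exponential backoff limited to transient errors. This is a sketch of a suggested improvement, not existing code: `retry_transient` is a hypothetical helper, and treating `TimeoutError` and `ConnectionError` as the transient class is an assumption.
+
+```python
+import time
+from typing import Callable, TypeVar
+
+T = TypeVar("T")
+
+
+def retry_transient(
+    call: Callable[[], T],
+    attempts: int = 3,
+    base_delay: float = 0.5,
+    sleep: Callable[[float], None] = time.sleep,
+) -> T:
+    """Run `call`, retrying transient failures with exponential backoff."""
+    for attempt in range(attempts):
+        try:
+            return call()
+        except (TimeoutError, ConnectionError):
+            if attempt == attempts - 1:
+                raise  # out of retries: surface the last error
+            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
+    raise ValueError("attempts must be at least 1")
+```
+
+Other exceptions, such as subscription-limit errors, propagate immediately, since retrying them would not help.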
+
+### 📝 Implementation Notes:
+- **Provider Routing**: Backend correctly auto-detects WaveSpeed models
+- **Dimension Limits**: Frontend validation prevents invalid dimensions
+- **Cost Tracking**: Handled by centralized `generate_image()` function
+- **Asset Library**: Images are saved to asset library automatically
+
+---
+
+## 7. **Model-Specific Details**
+
+### WaveSpeed Models:
+- **qwen-image**: $0.05/image, max 1024x1024, fast generation
+- **ideogram-v3-turbo**: $0.10/image, max 1024x1024, superior text rendering
+- **ideogram-character**: $0.10/image, character consistency (Podcast only)
+
+### HuggingFace Models:
+- **FLUX.1-Krea-dev**: Photorealistic, optimized for blog images
+- **FLUX.1-dev**: General purpose
+- **flux-dev**: RunwayML variant
+
+### Stability AI Models:
+- **SDXL 1024**: Professional quality, $0.04/image
+- **SDXL Base**: Standard quality
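+
+The listed prices are enough for a rough per-request estimate. In this sketch `estimate_cost` is a hypothetical helper, and mapping "SDXL 1024" to the `stable-diffusion-xl-1024-v1-0` model ID is an assumption; models without a listed price return `None` rather than a guess.
+
+```python
+from typing import Optional
+
+# Per-image prices in USD as listed above; unpriced models are omitted.
+PRICE_PER_IMAGE = {
+    "qwen-image": 0.05,
+    "ideogram-v3-turbo": 0.10,
+    "ideogram-character": 0.10,
+    "stable-diffusion-xl-1024-v1-0": 0.04,  # assumed ID for "SDXL 1024"
+}
+
+
+def estimate_cost(model: str, num_images: int) -> Optional[float]:
+    """Return the estimated USD cost, or None when no price is listed."""
+    price = PRICE_PER_IMAGE.get(model)
+    if price is None:
+        return None
+    # Round to cents to hide floating-point noise.
+    return round(price * num_images, 2)
+```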
+
+---
+
+## 8. **Code References**
+
+### Backend:
+- `backend/services/llm_providers/main_image_generation.py` - Core generation logic
+- `backend/services/llm_providers/image_generation/wavespeed_provider.py` - WaveSpeed implementation
+- `backend/api/podcast/handlers/images.py` - Podcast image generation
+- `backend/services/story_writer/image_generation_service.py` - Story Writer service
+- `backend/api/images.py` - Blog Writer image API
+
+### Frontend:
+- `frontend/src/components/ImageGen/ImageGenerator.tsx` - Blog Writer component
+- `frontend/src/components/shared/ImageGenerationModal.tsx` - Shared modal (Podcast/YouTube)
+- `frontend/src/components/StoryWriter/Phases/StoryOutlineParts/ImageEditModal.tsx` - Story Writer UI
+
+---
+
+## Summary
+
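+All three tools route standard generation through the centralized `generate_image()` function (Podcast Maker additionally calls `generate_character_image()` for character-consistent output), but each takes a different approach: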
+
+1. **Podcast Maker**: Specialized for character consistency, auto-selects providers
+2. **Story Writer**: Simple wrapper, batch processing, error resilient
+3. **Blog Writer**: User-controlled provider/model selection, frontend validation, prompt optimization
+
+The Blog Writer implementation is the most user-friendly with explicit controls, while Podcast Maker focuses on specialized use cases and Story Writer prioritizes simplicity and batch operations.