Base code

Kunthawat Greethong
2026-01-08 22:39:53 +07:00
parent 697115c61a
commit c35fa52117
2169 changed files with 626670 additions and 0 deletions


@@ -0,0 +1,499 @@
# Story Generation Code Adaptation Guide
This guide shows how to adapt the existing story generation code to the production-ready `main_text_generation` module and the subscription system.
## 1. Import Path Updates
### Before (Legacy)
```python
from ...gpt_providers.text_generation.main_text_generation import llm_text_gen
```
### After (Production)
```python
from services.llm_providers.main_text_generation import llm_text_gen
```
## 2. Adding User ID and Subscription Support
### Before
```python
def generate_with_retry(prompt, system_prompt=None):
    try:
        return llm_text_gen(prompt, system_prompt)
    except Exception as e:
        logger.error(f"Error generating content: {e}")
        return ""
```
### After
```python
from typing import Optional

from fastapi import HTTPException


def generate_with_retry(prompt, system_prompt=None, user_id: Optional[str] = None):
    """
    Generate content with error handling and subscription support.

    Args:
        prompt: The prompt to generate content from
        system_prompt: Custom system prompt (optional)
        user_id: Clerk user ID (required for subscription checking)

    Returns:
        Generated content string

    Raises:
        RuntimeError: If user_id is missing or generation fails
        HTTPException: If the subscription limit is exceeded (429 status)
    """
    if not user_id:
        raise RuntimeError("user_id is required for subscription checking")
    try:
        return llm_text_gen(
            prompt=prompt,
            system_prompt=system_prompt,
            user_id=user_id
        )
    except HTTPException:
        # Re-raise HTTPExceptions (e.g., 429 subscription limit)
        raise
    except Exception as e:
        logger.error(f"Error generating content: {e}")
        raise RuntimeError(f"Failed to generate content: {str(e)}") from e
```
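
The function above is named `generate_with_retry`, but as written it makes a single attempt. If actual retries are wanted, one possible shape is a small wrapper that retries transient failures with exponential backoff while letting designated exception types (for example the 429 `HTTPException`) propagate immediately. The sketch below is an assumption and not part of the existing code base: `call_with_retry`, `max_attempts`, `base_delay`, `no_retry`, and `sleep` are hypothetical names.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    no_retry: Tuple[Type[BaseException], ...] = (),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying failures with exponential backoff.

    Exceptions listed in no_retry are re-raised immediately (hypothetical
    parameter, intended for errors such as a 429 subscription limit).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except no_retry:
            raise
        except Exception as exc:
            if attempt == max_attempts:
                raise RuntimeError(
                    f"Failed after {max_attempts} attempts: {exc}"
                ) from exc
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A call site might then look like `call_with_retry(lambda: llm_text_gen(prompt=prompt, user_id=user_id), no_retry=(HTTPException,))`, so that subscription errors are never retried.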
## 3. Structured JSON Response for Outline
### Before
```python
outline = generate_with_retry(outline_prompt.format(premise=premise))
# Returns plain text, needs parsing
```
### After
```python
import json

# Define JSON schema for structured outline
outline_schema = {
    "type": "object",
    "properties": {
        "outline": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "scene_number": {"type": "integer"},
                    "title": {"type": "string"},
                    "description": {"type": "string"},
                    "key_events": {"type": "array", "items": {"type": "string"}}
                },
                "required": ["scene_number", "title", "description"]
            }
        }
    },
    "required": ["outline"]
}

# Generate structured outline
outline_response = llm_text_gen(
    prompt=outline_prompt.format(premise=premise),
    system_prompt=system_prompt,
    json_struct=outline_schema,
    user_id=user_id
)

# Parse JSON response
outline_data = json.loads(outline_response)
outline = outline_data.get("outline", [])
```
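
`json.loads` only confirms that the response is valid JSON; it does not check that the required fields from the schema are present. A minimal sketch of a hand-written check follows, assuming the response is a JSON string shaped like the schema above. `validate_outline` and `REQUIRED_SCENE_KEYS` are hypothetical names, not part of `main_text_generation`.

```python
import json
from typing import Any, Dict, List

# Mirrors the "required" list of the scene objects in the schema above.
REQUIRED_SCENE_KEYS = ("scene_number", "title", "description")


def validate_outline(raw_response: str) -> List[Dict[str, Any]]:
    """Parse an outline response and check it against the required fields."""
    data = json.loads(raw_response)
    if not isinstance(data, dict) or not isinstance(data.get("outline"), list):
        raise ValueError("Response must be an object with an 'outline' array")
    scenes = data["outline"]
    for index, scene in enumerate(scenes):
        if not isinstance(scene, dict):
            raise ValueError(f"Scene {index} is not an object")
        missing = [key for key in REQUIRED_SCENE_KEYS if key not in scene]
        if missing:
            raise ValueError(f"Scene {index} is missing {missing}")
        # key_events is optional in the schema; default it for callers.
        scene.setdefault("key_events", [])
    return scenes
```

Because `json.JSONDecodeError` is a subclass of `ValueError`, a caller can catch `ValueError` once to cover both malformed JSON and a structurally incomplete outline.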
## 4. Complete Service Example
### Story Service Structure
```python
# backend/services/story_writer/story_service.py
import json
from typing import Any, Dict, List, Union

from fastapi import HTTPException
from loguru import logger

from services.llm_providers.main_text_generation import llm_text_gen


class StoryWriterService:
    """Service for generating stories using prompt chaining."""

    def __init__(self):
        self.guidelines = """\
Writing Guidelines:
Delve deeper. Lose yourself in the world you're building. Unleash vivid
descriptions to paint the scenes in your reader's mind.
Develop your characters: let their motivations, fears, and complexities unfold naturally.
Weave in the threads of your outline, but don't feel constrained by it.
Allow your story to surprise you as you write. Use rich imagery, sensory details, and
evocative language to bring the setting, characters, and events to life.
Introduce elements subtly that can blossom into complex subplots, relationships,
or worldbuilding details later in the story.
Keep things intriguing but not fully resolved.
Avoid boxing the story into a corner too early.
Plant the seeds of subplots or potential character arc shifts that can be expanded later.
Remember, your main goal is to write as much as you can. If you get through
the story too fast, that is bad. Expand, never summarize.
"""

    def generate_premise(
        self,
        persona: str,
        story_setting: str,
        character_input: str,
        plot_elements: str,
        user_id: str
    ) -> str:
        """Generate story premise."""
        prompt = f"""\
{persona}
Write a single sentence premise for a {story_setting} story featuring {character_input}.
The plot will revolve around: {plot_elements}
"""
        try:
            premise = llm_text_gen(
                prompt=prompt,
                user_id=user_id
            )
            return premise.strip()
        except HTTPException:
            # Re-raise HTTPExceptions (e.g., 429 subscription limit)
            raise
        except Exception as e:
            logger.error(f"Error generating premise: {e}")
            raise RuntimeError(f"Failed to generate premise: {str(e)}") from e

    def generate_outline(
        self,
        premise: str,
        persona: str,
        story_setting: str,
        character_input: str,
        plot_elements: str,
        user_id: str
    ) -> List[Dict[str, Any]]:
        """Generate structured story outline."""
        prompt = f"""\
{persona}
You have a gripping premise in mind:
{premise}
Write an outline for the plot of your story set in {story_setting} featuring {character_input}.
The plot elements are: {plot_elements}
"""
        # Define JSON schema for structured response
        json_schema = {
            "type": "object",
            "properties": {
                "outline": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "scene_number": {"type": "integer"},
                            "title": {"type": "string"},
                            "description": {"type": "string"},
                            "key_events": {
                                "type": "array",
                                "items": {"type": "string"}
                            }
                        },
                        "required": ["scene_number", "title", "description"]
                    }
                }
            },
            "required": ["outline"]
        }
        try:
            response = llm_text_gen(
                prompt=prompt,
                json_struct=json_schema,
                user_id=user_id
            )
            # Parse JSON response
            outline_data = json.loads(response)
            return outline_data.get("outline", [])
        except json.JSONDecodeError as e:
            logger.error(f"Failed to parse outline JSON: {e}")
            # Fallback to text parsing if JSON fails
            return self._parse_text_outline(response)
        except HTTPException:
            # Re-raise HTTPExceptions (e.g., 429 subscription limit)
            raise
        except Exception as e:
            logger.error(f"Error generating outline: {e}")
            raise RuntimeError(f"Failed to generate outline: {str(e)}") from e

    def generate_story_start(
        self,
        premise: str,
        outline: Union[str, List[Dict[str, Any]]],
        persona: str,
        story_setting: str,
        character_input: str,
        plot_elements: str,
        writing_style: str,
        story_tone: str,
        narrative_pov: str,
        audience_age_group: str,
        content_rating: str,
        ending_preference: str,
        user_id: str
    ) -> str:
        """Generate the starting section of the story."""
        # Format outline as text if it's a list
        if isinstance(outline, list):
            outline_text = "\n".join([
                f"{item.get('scene_number', i+1)}. {item.get('title', '')}: {item.get('description', '')}"
                for i, item in enumerate(outline)
            ])
        else:
            outline_text = str(outline)
        prompt = f"""\
{persona}
Write a story with the following details:
**The Story Setting is:**
{story_setting}
**The Characters of the story are:**
{character_input}
**Plot Elements of the story:**
{plot_elements}
**Story Writing Style:**
{writing_style}
**The story Tone is:**
{story_tone}
**Write story from the Point of View of:**
{narrative_pov}
**Target Audience of the story:**
{audience_age_group}, **Content Rating:** {content_rating}
**Story Ending:**
{ending_preference}
You have a gripping premise in mind:
{premise}
Your imagination has crafted a rich narrative outline:
{outline_text}
First, silently review the outline and the premise. Consider how to start the story.
Start to write the very beginning of the story. You are not expected to finish
the whole story now. Your writing should be detailed enough that you are only
scratching the surface of the first bullet of your outline. Try to write AT
MINIMUM 4000 WORDS.
{self.guidelines}
"""
        try:
            starting_draft = llm_text_gen(
                prompt=prompt,
                user_id=user_id
            )
            return starting_draft.strip()
        except HTTPException:
            # Re-raise HTTPExceptions (e.g., 429 subscription limit)
            raise
        except Exception as e:
            logger.error(f"Error generating story start: {e}")
            raise RuntimeError(f"Failed to generate story start: {str(e)}") from e

    def continue_story(
        self,
        premise: str,
        outline: Union[str, List[Dict[str, Any]]],
        story_text: str,
        persona: str,
        story_setting: str,
        character_input: str,
        plot_elements: str,
        writing_style: str,
        story_tone: str,
        narrative_pov: str,
        audience_age_group: str,
        content_rating: str,
        ending_preference: str,
        user_id: str
    ) -> str:
        """Continue writing the story."""
        # Format outline as text if it's a list
        if isinstance(outline, list):
            outline_text = "\n".join([
                f"{item.get('scene_number', i+1)}. {item.get('title', '')}: {item.get('description', '')}"
                for i, item in enumerate(outline)
            ])
        else:
            outline_text = str(outline)
        prompt = f"""\
{persona}
Write a story with the following details:
**The Story Setting is:**
{story_setting}
**The Characters of the story are:**
{character_input}
**Plot Elements of the story:**
{plot_elements}
**Story Writing Style:**
{writing_style}
**The story Tone is:**
{story_tone}
**Write story from the Point of View of:**
{narrative_pov}
**Target Audience of the story:**
{audience_age_group}, **Content Rating:** {content_rating}
**Story Ending:**
{ending_preference}
You have a gripping premise in mind:
{premise}
Your imagination has crafted a rich narrative outline:
{outline_text}
You've begun to immerse yourself in this world, and the words are flowing.
Here's what you've written so far:
{story_text}
=====
First, silently review the outline and story so far. Identify what the single
next part of your outline you should write.
Your task is to continue where you left off and write the next part of the story.
You are not expected to finish the whole story now. Your writing should be
detailed enough that you are only scratching the surface of the next part of
your outline. Try to write AT MINIMUM 2000 WORDS. However, only once the story
is COMPLETELY finished, write IAMDONE. Remember, do NOT write a whole chapter
right now.
{self.guidelines}
"""
        try:
            continuation = llm_text_gen(
                prompt=prompt,
                user_id=user_id
            )
            return continuation.strip()
        except HTTPException:
            # Re-raise HTTPExceptions (e.g., 429 subscription limit)
            raise
        except Exception as e:
            logger.error(f"Error continuing story: {e}")
            raise RuntimeError(f"Failed to continue story: {str(e)}") from e

    def _parse_text_outline(self, text: str) -> List[Dict[str, Any]]:
        """Fallback method to parse text outline if JSON parsing fails."""
        # Simple text parsing logic: one scene per non-empty line
        outline = []
        for line in text.strip().split('\n'):
            if line.strip():
                # Number scenes consecutively, ignoring blank lines
                scene_number = len(outline) + 1
                outline.append({
                    "scene_number": scene_number,
                    "title": f"Scene {scene_number}",
                    "description": line.strip(),
                    "key_events": []
                })
        return outline
```
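
The `continue_story` prompt asks the model to write `IAMDONE` once the story is finished, so a caller needs a loop that chains continuations until that marker appears. A minimal sketch of such a loop follows; `write_until_done`, `DONE_MARKER`, and `max_iterations` are hypothetical names, and the two callables stand in for `generate_story_start` and `continue_story` with their other arguments already bound.

```python
from typing import Callable

DONE_MARKER = "IAMDONE"


def write_until_done(
    start: Callable[[], str],
    continue_from: Callable[[str], str],
    max_iterations: int = 10,
) -> str:
    """Chain a starting draft with continuations until the marker appears."""
    story = start()
    for _ in range(max_iterations):
        if DONE_MARKER in story:
            break
        continuation = continue_from(story)
        if not continuation.strip():
            break  # Stop on an empty continuation rather than loop forever.
        story = f"{story}\n\n{continuation}"
    # Strip the marker so it never reaches the reader.
    return story.replace(DONE_MARKER, "").strip()
```

The `max_iterations` cap is a design choice in this sketch: each continuation is a billable LLM call, so an upper bound protects against a model that never emits the marker.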
## 5. API Endpoint Example
```python
# backend/api/story_writer/router.py
from typing import Dict, Any

from fastapi import APIRouter, HTTPException, Depends
from loguru import logger

from middleware.auth_middleware import get_current_user
from models.story_models import StoryGenerationRequest
from services.story_writer.story_service import StoryWriterService

router = APIRouter(prefix="/api/story", tags=["Story Writer"])
service = StoryWriterService()


@router.post("/generate-premise")
async def generate_premise(
    request: StoryGenerationRequest,
    current_user: Dict[str, Any] = Depends(get_current_user)
) -> Dict[str, Any]:
    """Generate story premise."""
    try:
        if not current_user:
            raise HTTPException(status_code=401, detail="Authentication required")
        user_id = str(current_user.get('id', ''))
        if not user_id:
            raise HTTPException(status_code=401, detail="Invalid user ID")
        premise = service.generate_premise(
            persona=request.persona,
            story_setting=request.story_setting,
            character_input=request.character_input,
            plot_elements=request.plot_elements,
            user_id=user_id
        )
        return {"premise": premise, "success": True}
    except HTTPException:
        # Pass through auth errors and subscription limits (429) unchanged
        raise
    except Exception as e:
        logger.error(f"Failed to generate premise: {e}")
        raise HTTPException(status_code=500, detail=str(e))
```
## 6. Key Differences Summary
| Aspect | Legacy Code | Production Code |
|--------|------------|-----------------|
| Import Path | `...gpt_providers.text_generation.main_text_generation` | `services.llm_providers.main_text_generation` |
| User ID | Not required | Required parameter |
| Subscription | No checks | Automatic via `main_text_generation` |
| Error Handling | Basic try/except | HTTPException handling for 429 errors |
| Structured Responses | Text parsing | JSON schema support |
| Async Support | Synchronous | Can use async/await |
| Logging | Basic | Comprehensive with loguru |
## 7. Testing Checklist
When adapting code, verify:
- [ ] All imports updated to production paths
- [ ] `user_id` parameter added to all LLM calls
- [ ] Subscription errors (429) are handled properly
- [ ] Error messages are user-friendly
- [ ] Logging is comprehensive
- [ ] Structured JSON responses work correctly
- [ ] Fallback logic for text parsing exists
- [ ] Long-running operations use task management
## 8. Common Pitfalls
1. **Missing user_id**: Always pass `user_id` parameter
2. **Ignoring HTTPException**: Re-raise HTTPExceptions (especially 429)
3. **No fallback parsing**: If JSON parsing fails, have text parsing fallback
4. **Synchronous blocking**: Use async endpoints for long-running operations
5. **No error context**: Include original exception in error messages
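
On pitfall 4: declaring an endpoint with `async def` does not by itself prevent blocking. If the handler calls a synchronous function directly, the event loop still stalls for the duration of that call. One common remedy, assuming Python 3.9 or later, is `asyncio.to_thread`. The sketch below uses `slow_generate` as a hypothetical stand-in for a blocking LLM call; it is not the project's code.

```python
import asyncio
import time


def slow_generate(prompt: str) -> str:
    """Stand-in for a blocking call such as a synchronous LLM request."""
    time.sleep(0.1)
    return f"generated: {prompt}"


async def generate_without_blocking(prompt: str) -> str:
    # asyncio.to_thread (Python 3.9+) runs the blocking call in a worker
    # thread, so the event loop stays free to serve other requests.
    return await asyncio.to_thread(slow_generate, prompt)


async def main() -> list:
    # The two calls overlap instead of running back to back.
    return await asyncio.gather(
        generate_without_blocking("premise"),
        generate_without_blocking("outline"),
    )
```

Running `asyncio.run(main())` returns both results in the order the calls were passed to `asyncio.gather`.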


@@ -0,0 +1,537 @@
# Story Generation Feature - Implementation Plan
## Executive Summary
This document reviews the existing story generation backend modules and provides a comprehensive plan to complete the story generation feature with a modern UI using CopilotKit, similar to the AI Blog Writer implementation.
## 1. Current State Review
### 1.1 Existing Backend Modules
#### 1.1.1 Story Writer (`ToBeMigrated/ai_writers/ai_story_writer/`)
**Status**: ✅ Functional but needs migration
**Location**: `ToBeMigrated/ai_writers/ai_story_writer/ai_story_generator.py`
**Features**:
- Prompt chaining approach (premise → outline → starting draft → continuation)
- Supports multiple personas/genres (11 predefined)
- Configurable story parameters:
  - Story setting
  - Characters
  - Plot elements
  - Writing style (Formal, Casual, Poetic, Humorous)
  - Story tone (Dark, Uplifting, Suspenseful, Whimsical)
  - Narrative POV (First Person, Third Person Limited/Omniscient)
  - Audience age group
  - Content rating
  - Ending preference
**Current Implementation**:
- Uses legacy `lib/gpt_providers/text_generation/main_text_generation.py` (needs update)
- Streamlit-based UI (needs React migration)
- Iterative generation until "IAMDONE" marker
**Issues to Address**:
1. ❌ Uses old import path (`...gpt_providers.text_generation.main_text_generation`)
2. ❌ No subscription/user_id integration
3. ❌ No task management/polling support
4. ❌ Streamlit UI (needs React/CopilotKit migration)
#### 1.1.2 Story Illustrator (`ToBeMigrated/ai_writers/ai_story_illustrator/`)
**Status**: ✅ Functional but needs migration
**Location**: `ToBeMigrated/ai_writers/ai_story_illustrator/story_illustrator.py`
**Features**:
- Story segmentation for illustration
- Scene element extraction using LLM
- Multiple illustration styles (12+ options)
- PDF storybook generation
- ZIP export of illustrations
**Current Implementation**:
- Uses legacy import paths
- Streamlit UI
- Integrates with image generation (Gemini)
**Issues to Address**:
1. ❌ Uses old import paths
2. ❌ No subscription integration
3. ❌ Streamlit UI (needs React migration)
#### 1.1.3 Story Video Generator (`ToBeMigrated/ai_writers/ai_story_video_generator/`)
**Status**: ✅ Functional but needs migration
**Location**: `ToBeMigrated/ai_writers/ai_story_video_generator/story_video_generator.py`
**Features**:
- Story generation with scene breakdown
- Image generation per scene
- Text overlay on images
- Video compilation with audio
- Multiple story styles
**Current Implementation**:
- Uses legacy import paths
- Streamlit UI
- MoviePy for video generation
**Issues to Address**:
1. ❌ Uses old import paths
2. ❌ No subscription integration
3. ❌ Streamlit UI (needs React migration)
4. ❌ Heavy dependencies (MoviePy, imageio)
### 1.2 Core Infrastructure Available
#### 1.2.1 Main Text Generation (`backend/services/llm_providers/main_text_generation.py`)
**Status**: ✅ Production-ready
**Features**:
- ✅ Supports Gemini and HuggingFace
- ✅ Subscription/user_id integration
- ✅ Usage tracking
- ✅ Automatic fallback between providers
- ✅ Structured JSON response support
**Usage Pattern**:
```python
from services.llm_providers.main_text_generation import llm_text_gen
response = llm_text_gen(
    prompt="...",
    system_prompt="...",
    json_struct={...},  # Optional
    user_id="clerk_user_id"  # Required
)
```
#### 1.2.2 Subscription System (`backend/models/subscription_models.py`)
**Status**: ✅ Production-ready
**Features**:
- Usage tracking per provider
- Token limits
- Call limits
- Billing period management
- Already integrated with `main_text_generation`
#### 1.2.3 Blog Writer Architecture (Reference)
**Status**: ✅ Production-ready reference implementation
**Key Components**:
1. **Phase Navigation** (`frontend/src/hooks/usePhaseNavigation.ts`)
   - Multi-phase workflow (Research → Outline → Content → SEO → Publish)
   - Phase state management
   - Auto-progression logic
2. **CopilotKit Integration** (`frontend/src/components/BlogWriter/BlogWriterUtils/useBlogWriterCopilotActions.ts`)
   - Action handlers for AI interactions
   - Sidebar suggestions
   - Context-aware actions
3. **Backend Router** (`backend/api/blog_writer/router.py`)
   - RESTful endpoints
   - Task management with polling
   - Cache management
   - Error handling
4. **Task Management** (`backend/api/blog_writer/task_manager.py`)
   - Async task execution
   - Status tracking
   - Result caching
## 2. Implementation Plan
### 2.1 Phase 1: Backend Migration & Enhancement
#### 2.1.1 Create Story Writer Service
**File**: `backend/services/story_writer/story_service.py`
**Tasks**:
1. Migrate `ai_story_generator.py` logic to new service
2. Update imports to use `main_text_generation`
3. Add `user_id` parameter to all LLM calls
4. Implement prompt chaining with proper error handling
5. Add structured JSON response support for outline generation
6. Support both Gemini and HuggingFace through `main_text_generation`
**Key Functions**:
```python
# Signature stubs only; bodies are intentionally left as "..."
async def generate_story_premise(
    persona: str,
    story_setting: str,
    character_input: str,
    plot_elements: str,
    writing_style: str,
    story_tone: str,
    narrative_pov: str,
    audience_age_group: str,
    content_rating: str,
    ending_preference: str,
    user_id: str
) -> str: ...


async def generate_story_outline(
    premise: str,
    persona: str,
    story_setting: str,
    character_input: str,
    plot_elements: str,
    user_id: str
) -> Dict[str, Any]: ...  # Structured outline


async def generate_story_start(
    premise: str,
    outline: str,
    persona: str,
    guidelines: str,
    user_id: str
) -> str: ...


async def continue_story(
    premise: str,
    outline: str,
    story_text: str,
    persona: str,
    guidelines: str,
    user_id: str
) -> str: ...
```
#### 2.1.2 Create Story Writer Router
**File**: `backend/api/story_writer/router.py`
**Endpoints**:
```
POST /api/story/generate-premise
POST /api/story/generate-outline
POST /api/story/generate-start
POST /api/story/continue
POST /api/story/generate-full # Complete story generation with task management
GET /api/story/task/{task_id}/status
GET /api/story/task/{task_id}/result
```
**Request Models**:
```python
class StoryGenerationRequest(BaseModel):
    persona: str
    story_setting: str
    character_input: str
    plot_elements: str
    writing_style: str
    story_tone: str
    narrative_pov: str
    audience_age_group: str
    content_rating: str
    ending_preference: str
```
#### 2.1.3 Task Management Integration
**File**: `backend/api/story_writer/task_manager.py`
**Features**:
- Async story generation with polling
- Progress tracking (premise → outline → start → continuation → done)
- Result caching
- Error recovery
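
The polling pattern listed above can be sketched with an in-memory store: a submit call starts the work in the background and returns a task ID, and a status call returns a snapshot that the frontend polls. This is an illustrative assumption only; the Blog Writer's actual `task_manager.py` is not shown in this document, and `InMemoryTaskManager`, `submit`, `get`, and the status strings are hypothetical names.

```python
import threading
import uuid
from typing import Any, Callable, Dict, Optional

# A job receives a report_stage callback and returns its final result.
Job = Callable[[Callable[[str], None]], Any]


class InMemoryTaskManager:
    """Runs jobs in background threads and exposes their status for polling."""

    def __init__(self) -> None:
        self._tasks: Dict[str, Dict[str, Any]] = {}
        self._lock = threading.Lock()

    def submit(self, job: Job) -> str:
        """Start job(report_stage) in a thread and return its task ID."""
        task_id = uuid.uuid4().hex
        with self._lock:
            self._tasks[task_id] = {
                "status": "pending", "stage": None, "result": None, "error": None,
            }
        threading.Thread(target=self._run, args=(task_id, job), daemon=True).start()
        return task_id

    def _update(self, task_id: str, **fields: Any) -> None:
        with self._lock:
            self._tasks[task_id].update(fields)

    def _run(self, task_id: str, job: Job) -> None:
        self._update(task_id, status="running")
        try:
            result = job(lambda stage: self._update(task_id, stage=stage))
            self._update(task_id, status="completed", result=result)
        except Exception as exc:  # Record the failure so pollers can see it.
            self._update(task_id, status="failed", error=str(exc))

    def get(self, task_id: str) -> Optional[Dict[str, Any]]:
        """Return a snapshot of the task, or None if the ID is unknown."""
        with self._lock:
            task = self._tasks.get(task_id)
            return dict(task) if task else None
```

A story job would call `report_stage("premise")`, `report_stage("outline")`, and so on as it moves through the chain, which gives the status endpoint the progress information described above. An in-memory dictionary is lost on restart and is not shared between worker processes, so a production deployment would likely need a persistent store.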
### 2.2 Phase 2: Frontend Implementation
#### 2.2.1 Story Writer Component Structure
**File**: `frontend/src/components/StoryWriter/StoryWriter.tsx`
**Phases** (similar to Blog Writer):
1. **Setup** - Story parameters input
2. **Premise** - Review and refine premise
3. **Outline** - Review and refine outline
4. **Writing** - Generate and edit story content
5. **Illustration** (Optional) - Generate illustrations
6. **Export** - Download/export story
#### 2.2.2 Phase Navigation Hook
**File**: `frontend/src/hooks/useStoryWriterPhaseNavigation.ts`
**Based on**: `usePhaseNavigation.ts` from Blog Writer
**Phases**:
```typescript
interface StoryPhase {
  id: 'setup' | 'premise' | 'outline' | 'writing' | 'illustration' | 'export';
  name: string;
  icon: string;
  description: string;
  completed: boolean;
  current: boolean;
  disabled: boolean;
}
```
#### 2.2.3 CopilotKit Actions
**File**: `frontend/src/components/StoryWriter/StoryWriterUtils/useStoryWriterCopilotActions.ts`
**Actions**:
- `generateStoryPremise` - Generate story premise
- `generateStoryOutline` - Generate outline from premise
- `startStoryWriting` - Begin story generation
- `continueStoryWriting` - Continue story generation
- `refineStoryOutline` - Refine outline based on feedback
- `generateIllustrations` - Generate illustrations for story
- `exportStory` - Export story in various formats
#### 2.2.4 Story Writer UI Components
**Main Components**:
1. `StoryWriter.tsx` - Main container
2. `StorySetup.tsx` - Phase 1: Input story parameters
3. `StoryPremise.tsx` - Phase 2: Review premise
4. `StoryOutline.tsx` - Phase 3: Review/edit outline
5. `StoryContent.tsx` - Phase 4: Generated story content with editor
6. `StoryIllustration.tsx` - Phase 5: Illustration generation (optional)
7. `StoryExport.tsx` - Phase 6: Export options
**Utility Components**:
- `StoryWriterUtils/HeaderBar.tsx` - Phase navigation header
- `StoryWriterUtils/PhaseContent.tsx` - Phase-specific content wrapper
- `StoryWriterUtils/WriterCopilotSidebar.tsx` - CopilotKit sidebar
- `StoryWriterUtils/useStoryWriterState.ts` - State management hook
### 2.3 Phase 3: Integration with Gemini Examples
#### 2.3.1 Prompt Chaining Pattern
**Reference**: https://colab.research.google.com/github/google-gemini/cookbook/blob/main/examples/Story_Writing_with_Prompt_Chaining.ipynb
**Implementation**:
- Use the existing prompt chaining approach from `ai_story_generator.py`
- Enhance with structured JSON responses for outline
- Add better error handling and retry logic
- Support streaming responses (future enhancement)
#### 2.3.2 Illustration Integration
**Reference**: https://github.com/google-gemini/cookbook/blob/main/examples/Book_illustration.ipynb
**Implementation**:
- Migrate `story_illustrator.py` to backend service
- Create API endpoints for illustration generation
- Add illustration phase to frontend
- Support multiple illustration styles
#### 2.3.3 Video Generation (Optional/Future)
**Reference**: https://github.com/google-gemini/cookbook/blob/main/examples/Animated_Story_Video_Generation_gemini.ipynb
**Status**: Defer to Phase 4 (requires heavy dependencies)
### 2.4 Phase 4: Advanced Features (Future)
1. **Story Video Generation**
   - Migrate `story_video_generator.py`
   - Add video generation phase
   - Handle MoviePy dependencies
2. **Story Templates**
   - Pre-defined story templates
   - Genre-specific templates
   - Character templates
3. **Collaborative Editing**
   - Multi-user story editing
   - Version control
   - Comments and suggestions
4. **Story Analytics**
   - Readability metrics
   - Story structure analysis
   - Character development tracking
## 3. Technical Specifications
### 3.1 Backend API Models
```python
# backend/models/story_models.py
from typing import Any, Dict, List, Optional

from pydantic import BaseModel


class StoryGenerationRequest(BaseModel):
    persona: str
    story_setting: str
    character_input: str
    plot_elements: str
    writing_style: str
    story_tone: str
    narrative_pov: str
    audience_age_group: str
    content_rating: str
    ending_preference: str


class StoryPremiseResponse(BaseModel):
    premise: str
    task_id: Optional[str] = None


class StoryOutlineResponse(BaseModel):
    outline: List[Dict[str, Any]]
    task_id: Optional[str] = None


class StoryContentResponse(BaseModel):
    content: str
    is_complete: bool
    task_id: Optional[str] = None


class StoryIllustrationRequest(BaseModel):
    story_text: str
    style: str = "digital art"
    aspect_ratio: str = "16:9"
    num_segments: int = 5


class StoryIllustrationResponse(BaseModel):
    illustrations: List[str]  # URLs or base64
    segments: List[str]
```
### 3.2 Frontend API Service
```typescript
// frontend/src/services/storyWriterApi.ts
export interface StoryGenerationRequest {
  persona: string;
  story_setting: string;
  character_input: string;
  plot_elements: string;
  writing_style: string;
  story_tone: string;
  narrative_pov: string;
  audience_age_group: string;
  content_rating: string;
  ending_preference: string;
}

export interface StoryPremiseResponse {
  premise: string;
  task_id?: string;
}

export interface StoryOutlineResponse {
  outline: Array<{
    scene_number: number;
    description: string;
    narration?: string;
  }>;
  task_id?: string;
}

// Signature sketch only: implementations are omitted here.
export declare const storyWriterApi: {
  generatePremise: (request: StoryGenerationRequest) => Promise<StoryPremiseResponse>;
  generateOutline: (premise: string, request: StoryGenerationRequest) => Promise<StoryOutlineResponse>;
  generateFullStory: (request: StoryGenerationRequest) => Promise<{ task_id: string }>;
  getTaskStatus: (task_id: string) => Promise<TaskStatus>;
  getTaskResult: (task_id: string) => Promise<StoryContentResponse>;
  // ... more endpoints
};
```
### 3.3 State Management
```typescript
// frontend/src/hooks/useStoryWriterState.ts
interface StoryWriterState {
  // Setup phase
  persona: string;
  storySetting: string;
  characters: string;
  plotElements: string;
  writingStyle: string;
  storyTone: string;
  narrativePOV: string;
  audienceAgeGroup: string;
  contentRating: string;
  endingPreference: string;

  // Generation phases
  premise: string | null;
  outline: StoryOutlineSection[] | null;
  storyContent: string | null;
  isComplete: boolean;

  // Illustration (optional)
  illustrations: string[];

  // Task management
  currentTaskId: string | null;
  generationProgress: number;
}
```
## 4. Migration Checklist
### Backend
- [ ] Create `backend/services/story_writer/story_service.py`
- [ ] Migrate prompt chaining logic from `ai_story_generator.py`
- [ ] Update all imports to use `main_text_generation`
- [ ] Add `user_id` parameter to all LLM calls
- [ ] Create `backend/api/story_writer/router.py`
- [ ] Create `backend/models/story_models.py`
- [ ] Integrate task management (`backend/api/story_writer/task_manager.py`)
- [ ] Add caching support
- [ ] Create `backend/api/story_writer/illustration_service.py` (optional)
- [ ] Register router in `app.py`
### Frontend
- [ ] Create `frontend/src/components/StoryWriter/` directory structure
- [ ] Create `StoryWriter.tsx` main component
- [ ] Create `useStoryWriterPhaseNavigation.ts` hook
- [ ] Create `useStoryWriterState.ts` hook
- [ ] Create `useStoryWriterCopilotActions.ts` hook
- [ ] Create phase components (Setup, Premise, Outline, Writing, Illustration, Export)
- [ ] Create `frontend/src/services/storyWriterApi.ts`
- [ ] Add Story Writer route to App.tsx
- [ ] Style components to match Blog Writer design
- [ ] Add error handling and loading states
- [ ] Implement polling for async tasks
### Testing
- [ ] Unit tests for story service
- [ ] Integration tests for API endpoints
- [ ] E2E tests for complete story generation flow
- [ ] Test with both Gemini and HuggingFace providers
- [ ] Test subscription limits and error handling
## 5. Dependencies
### Backend
- ✅ `main_text_generation` (already available)
- ✅ `subscription_models` (already available)
- ✅ FastAPI (already available)
- ⚠️ Image generation (for illustrations - needs verification)
### Frontend
- ✅ CopilotKit (already available)
- ✅ React (already available)
- ✅ TypeScript (already available)
- ⚠️ Markdown editor (for story content editing - check if available)
## 6. Timeline Estimate
- **Phase 1 (Backend)**: 3-5 days
- **Phase 2 (Frontend Core)**: 5-7 days
- **Phase 3 (CopilotKit Integration)**: 2-3 days
- **Phase 4 (Illustration - Optional)**: 3-4 days
- **Testing & Polish**: 2-3 days
**Total**: ~15-22 days for core features + illustrations
## 7. Key Decisions
1. **Provider Support**: Use `main_text_generation` which supports both Gemini and HuggingFace automatically
2. **UI Pattern**: Follow Blog Writer pattern with phase navigation and CopilotKit integration
3. **Task Management**: Use async task pattern with polling (same as Blog Writer)
4. **Illustration**: Make optional/separate phase to keep core story generation focused
5. **Video Generation**: Defer to future phase due to heavy dependencies
## 8. Next Steps
1. Review and approve this plan
2. Set up backend service structure
3. Begin backend migration
4. Create frontend component structure
5. Implement phase navigation
6. Integrate CopilotKit actions
7. Test end-to-end flow
8. Add illustration support (optional)
9. Polish and documentation


@@ -0,0 +1,204 @@
# Story Writer Frontend Foundation - Phase 2 Complete
## Overview
Phase 2: Frontend Foundation has been completed. The frontend is now ready for end-to-end testing with the backend.
## What Was Created
### 1. API Service Layer (`frontend/src/services/storyWriterApi.ts`)
- Complete TypeScript API service for all story generation endpoints
- Methods for:
  - `generatePremise()` - Generate story premise
  - `generateOutline()` - Generate story outline from premise
  - `generateStoryStart()` - Generate starting section of story
  - `continueStory()` - Continue writing a story
  - `generateFullStory()` - Generate complete story asynchronously
  - `getTaskStatus()` - Get task status for async operations
  - `getTaskResult()` - Get result of completed task
  - `getCacheStats()` - Get cache statistics
  - `clearCache()` - Clear story generation cache
### 2. State Management Hook (`frontend/src/hooks/useStoryWriterState.ts`)
- Comprehensive state management for story writer
- Manages:
  - Story parameters (persona, setting, characters, plot, style, tone, POV, audience, rating, ending)
  - Generated content (premise, outline, story content)
  - Task management (task ID, progress, messages)
  - UI state (loading, errors)
- Persists state to localStorage
- Provides helper methods and setters
### 3. Phase Navigation Hook (`frontend/src/hooks/useStoryWriterPhaseNavigation.ts`)
- Manages phase navigation logic
- Five phases: Setup → Premise → Outline → Writing → Export
- Auto-progression based on completion status
- Manual phase selection support
- Phase state management (completed, current, disabled)
- Persists current phase to localStorage
### 4. Main Component (`frontend/src/components/StoryWriter/StoryWriter.tsx`)
- Main StoryWriter component
- Integrates state management and phase navigation
- Renders appropriate phase component based on current phase
- Clean, modern UI with Material-UI
### 5. Phase Navigation Component (`frontend/src/components/StoryWriter/PhaseNavigation.tsx`)
- Visual phase stepper using Material-UI Stepper
- Shows phase icons, names, and descriptions
- Clickable phases (when not disabled)
- Visual indicators for current, completed, and disabled phases
### 6. Phase Components
#### StorySetup (`frontend/src/components/StoryWriter/Phases/StorySetup.tsx`)
- Form for configuring story parameters
- Required fields: Persona, Setting, Characters, Plot Elements
- Optional fields: Writing Style, Tone, POV, Audience, Rating, Ending
- Validates required fields before generation
- Calls `generatePremise()` API
- Auto-navigates to Premise phase on success
#### StoryPremise (`frontend/src/components/StoryWriter/Phases/StoryPremise.tsx`)
- Displays and allows editing of generated premise
- Regenerate premise functionality
- Continue to Outline button
#### StoryOutline (`frontend/src/components/StoryWriter/Phases/StoryOutline.tsx`)
- Generates outline from premise
- Displays and allows editing of outline
- Regenerate outline functionality
- Continue to Writing button
#### StoryWriting (`frontend/src/components/StoryWriter/Phases/StoryWriting.tsx`)
- Generates starting section of story
- Continue writing functionality (iterative)
- Displays complete story content
- Shows completion status
- Continue to Export button
#### StoryExport (`frontend/src/components/StoryWriter/Phases/StoryExport.tsx`)
- Displays complete story with summary
- Shows premise and outline
- Copy to clipboard functionality
- Download as text file functionality
### 7. Route Integration
- Added route `/story-writer` to `App.tsx`
- Protected route (requires authentication)
- Imported StoryWriter component
## File Structure
```
frontend/src/
├── services/
│   └── storyWriterApi.ts                    # API service layer
├── hooks/
│   ├── useStoryWriterState.ts               # State management hook
│   └── useStoryWriterPhaseNavigation.ts     # Phase navigation hook
└── components/
    └── StoryWriter/
        ├── index.ts                         # Exports
        ├── StoryWriter.tsx                  # Main component
        ├── PhaseNavigation.tsx              # Phase stepper component
        └── Phases/
            ├── StorySetup.tsx               # Phase 1: Setup
            ├── StoryPremise.tsx             # Phase 2: Premise
            ├── StoryOutline.tsx             # Phase 3: Outline
            ├── StoryWriting.tsx             # Phase 4: Writing
            └── StoryExport.tsx              # Phase 5: Export
```
## API Integration
All API calls are properly integrated:
- Uses `aiApiClient` for AI operations (3-minute timeout)
- Uses `pollingApiClient` for status checks
- Proper error handling with user-friendly messages
- Query parameters correctly formatted for backend endpoints
## Testing Checklist
### End-to-End Testing Steps
1. **Setup Phase**
- [ ] Navigate to `/story-writer`
- [ ] Fill in required fields (Persona, Setting, Characters, Plot Elements)
- [ ] Select optional fields (Style, Tone, POV, Audience, Rating, Ending)
- [ ] Click "Generate Premise"
- [ ] Verify API call is made to `/api/story/generate-premise`
- [ ] Verify premise is generated and displayed
- [ ] Verify auto-navigation to Premise phase
2. **Premise Phase**
- [ ] Verify premise is displayed
- [ ] Edit premise (optional)
- [ ] Test "Regenerate Premise" button
- [ ] Click "Continue to Outline"
- [ ] Verify navigation to Outline phase
3. **Outline Phase**
- [ ] Click "Generate Outline"
- [ ] Verify API call is made to `/api/story/generate-outline?premise=...`
- [ ] Verify outline is generated and displayed
- [ ] Test "Regenerate Outline" button
- [ ] Click "Continue to Writing"
- [ ] Verify navigation to Writing phase
4. **Writing Phase**
- [ ] Click "Generate Story"
- [ ] Verify API call is made to `/api/story/generate-start?premise=...&outline=...`
- [ ] Verify story content is generated
- [ ] Test "Continue Writing" button (if story not complete)
- [ ] Verify API call is made to `/api/story/continue`
- [ ] Verify story continues and updates
- [ ] Verify completion status when story is complete
- [ ] Click "Continue to Export"
- [ ] Verify navigation to Export phase
5. **Export Phase**
- [ ] Verify complete story is displayed
- [ ] Verify premise and outline are shown
- [ ] Test "Copy to Clipboard" button
- [ ] Test "Download as Text File" button
6. **Error Handling**
- [ ] Test with missing required fields
- [ ] Test with invalid API responses
- [ ] Test network errors
- [ ] Verify error messages are displayed
7. **State Persistence**
- [ ] Refresh page and verify state is restored from localStorage
- [ ] Verify current phase is restored
- [ ] Verify all form data is restored
8. **Phase Navigation**
- [ ] Test clicking on different phases
- [ ] Verify disabled phases cannot be accessed
- [ ] Verify phase progression logic
## Next Steps
1. **End-to-End Testing**: Test all phases with the backend
2. **Error Handling**: Enhance error messages and recovery
3. **Loading States**: Add better loading indicators
4. **UX Improvements**: Add animations, transitions, and polish
5. **CopilotKit Integration**: Add CopilotKit actions and sidebar (Phase 4)
6. **Styling**: Enhance visual design and responsiveness
## Notes
- All components use Material-UI for consistent styling
- State is persisted to localStorage for recovery on page refresh
- Phase navigation supports both auto-progression and manual selection
- API calls use proper error handling and loading states
- All TypeScript types are properly defined
## Known Limitations
- No CopilotKit integration yet (Phase 4)
- No async task polling for full story generation (can be added)
- Basic error handling (can be enhanced)
- No undo/redo functionality
- No draft saving to backend

@@ -0,0 +1,405 @@
# Story Writer Implementation Review
## Overview
Comprehensive review of the Story Writer feature implementation, covering both backend and frontend components.
## ✅ Backend Implementation
### 1. Service Layer (`backend/services/story_writer/story_service.py`)
**Status**: ✅ Complete and Well-Structured
**Key Features**:
- ✅ Proper integration with `main_text_generation` module
- ✅ Subscription checking via `user_id` parameter
- ✅ Retry logic with error handling
- ✅ Prompt chaining: Premise → Outline → Story Start → Continuation
- ✅ Completion detection via `IAMDONE` marker
- ✅ Comprehensive prompt building with all story parameters
**Methods**:
- `generate_premise()` - Generates story premise
- `generate_outline()` - Generates outline from premise
- `generate_story_start()` - Generates starting section (min 4000 words)
- `continue_story()` - Continues story writing iteratively
- `generate_full_story()` - Full story generation with iteration control
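
The iteration behind `generate_full_story()` can be sketched roughly as below. This is a minimal illustration, not the service's actual code: the helper name `run_continuation_loop`, the `continue_fn` callback, and the `max_iterations` default are assumptions. Only the `IAMDONE` marker and the start-then-continue chaining come from this review.

```python
from typing import Callable, Tuple

DONE_MARKER = "IAMDONE"  # completion marker named in this review


def run_continuation_loop(
    story_start: str,
    continue_fn: Callable[[str], str],
    max_iterations: int = 10,
) -> Tuple[str, bool]:
    """Append continuations until the marker appears or the budget runs out.

    `continue_fn` is a hypothetical stand-in for a call such as
    `continue_story()`: it receives the story so far and returns new text.
    """
    story = story_start
    for _ in range(max_iterations):
        chunk = continue_fn(story)
        is_complete = DONE_MARKER in chunk
        # Strip the marker so it never leaks into the stored story text.
        chunk = chunk.replace(DONE_MARKER, "").strip()
        if chunk:
            story = f"{story}\n\n{chunk}"
        if is_complete:
            return story, True
    # Budget exhausted without seeing the marker.
    return story, False
```

Capping the loop with an iteration budget keeps a model that never emits the marker from generating (and billing) indefinitely.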
**Strengths**:
- Clean separation of concerns
- Proper error handling and logging
- Well-documented methods
- Follows existing codebase patterns
**Potential Improvements**:
- Consider adding token counting for better progress tracking
- Could add validation for story parameters
### 2. API Router (`backend/api/story_writer/router.py`)
**Status**: ✅ Complete and Well-Integrated
**Endpoints**:
- `POST /api/story/generate-premise` - Generate premise
- `POST /api/story/generate-outline?premise=...` - Generate outline
- `POST /api/story/generate-start?premise=...&outline=...` - Generate story start
- `POST /api/story/continue` - Continue story writing
- `POST /api/story/generate-full` - Full story generation (async)
- `GET /api/story/task/{task_id}/status` - Task status polling
- `GET /api/story/task/{task_id}/result` - Get task result
- `GET /api/story/cache/stats` - Cache statistics
- `POST /api/story/cache/clear` - Clear cache
- `GET /api/story/health` - Health check
**Strengths**:
- Proper authentication via `get_current_user` dependency
- Query parameters correctly used for premise/outline
- Error handling with appropriate HTTP status codes
- Task management for async operations
- Cache management endpoints
**Integration**:
- ✅ Registered in `router_manager.py` (lines 175-176)
- ✅ Properly namespaced with `/api/story` prefix
### 3. Models (`backend/models/story_models.py`)
**Status**: ✅ Complete
**Models**:
- `StoryGenerationRequest` - Request model with all parameters
- `StoryPremiseResponse` - Premise generation response
- `StoryOutlineResponse` - Outline generation response
- `StoryContentResponse` - Story content response
- `StoryFullGenerationResponse` - Full story response
- `StoryContinueRequest` - Continue story request
- `StoryContinueResponse` - Continue story response
- `TaskStatus` - Task status model
**Strengths**:
- Proper Pydantic models with Field descriptions
- Type safety and validation
- Clear model structure
### 4. Task Manager (`backend/api/story_writer/task_manager.py`)
**Status**: ✅ Complete
**Features**:
- ✅ Background task execution
- ✅ Task status tracking
- ✅ Progress updates
- ✅ Error handling
- ✅ Result storage
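
To make the listed responsibilities concrete, here is a small sketch of a task registry that tracks status, progress, errors, and results. It is an assumption-laden illustration rather than the contents of `task_manager.py`: the class name `InMemoryTaskStore`, its method names, and the status strings are hypothetical, and the work runs synchronously here, whereas the real manager executes it in the background.

```python
import uuid
from typing import Any, Callable, Dict, Optional


class InMemoryTaskStore:
    """Hypothetical registry: status, progress, result, and error per task."""

    def __init__(self) -> None:
        self._tasks: Dict[str, Dict[str, Any]] = {}

    def create(self) -> str:
        task_id = str(uuid.uuid4())
        self._tasks[task_id] = {
            "status": "pending", "progress": 0, "result": None, "error": None,
        }
        return task_id

    def run(self, task_id: str, work: Callable[[Callable[[int], None]], Any]) -> None:
        """Execute `work`, recording progress updates and the final outcome."""
        task = self._tasks[task_id]
        task["status"] = "running"

        def report(percent: int) -> None:
            # Clamp so a misbehaving worker cannot report >100% or <0%.
            task["progress"] = max(0, min(100, percent))

        try:
            task["result"] = work(report)
            task["progress"] = 100
            task["status"] = "completed"
        except Exception as exc:  # store the failure instead of losing it
            task["error"] = str(exc)
            task["status"] = "failed"

    def status(self, task_id: str) -> Optional[Dict[str, Any]]:
        return self._tasks.get(task_id)
```

A polling endpoint such as `GET /api/story/task/{task_id}/status` would then only need to return what `status()` holds for that ID.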
### 5. Cache Manager (`backend/api/story_writer/cache_manager.py`)
**Status**: ✅ Complete
**Features**:
- ✅ In-memory caching based on request parameters
- ✅ Cache statistics
- ✅ Cache clearing
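
Caching "based on request parameters" usually means deriving a deterministic key from the request. The sketch below shows one way to do that; `make_cache_key` and `InMemoryStoryCache` are hypothetical names, and the SHA-256-over-sorted-JSON scheme is an assumption, not necessarily what `cache_manager.py` does.

```python
import hashlib
import json
from typing import Any, Dict, Optional


def make_cache_key(params: Dict[str, Any]) -> str:
    """Build a deterministic key from the request parameters."""
    # sort_keys makes the key independent of dict insertion order.
    payload = json.dumps(params, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class InMemoryStoryCache:
    """Hypothetical in-memory cache with hit/miss statistics."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}
        self._hits = 0
        self._misses = 0

    def get(self, key: str) -> Optional[Any]:
        if key in self._store:
            self._hits += 1
            return self._store[key]
        self._misses += 1
        return None

    def set(self, key: str, value: Any) -> None:
        self._store[key] = value

    def stats(self) -> Dict[str, int]:
        return {"entries": len(self._store), "hits": self._hits, "misses": self._misses}

    def clear(self) -> None:
        self._store.clear()
```

Because the cache lives in process memory, entries disappear on restart and are not shared between worker processes.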
## ✅ Frontend Implementation
### 1. API Service (`frontend/src/services/storyWriterApi.ts`)
**Status**: ✅ Complete
**Methods**:
- `generatePremise()` - Matches backend endpoint
- `generateOutline()` - Correctly uses query parameters
- `generateStoryStart()` - Correctly uses query parameters
- `continueStory()` - Proper request structure
- `generateFullStory()` - Async task support
- `getTaskStatus()` - Task polling support
- `getTaskResult()` - Result retrieval
- `getCacheStats()` - Cache management
- `clearCache()` - Cache clearing
**Strengths**:
- TypeScript types match backend models
- Proper use of `aiApiClient` for AI operations (3-min timeout)
- Proper use of `pollingApiClient` for status checks
- Error handling structure in place
**Issues Found**:
- ⚠️ **Minor**: Query parameter encoding is correct but could use URLSearchParams for better handling
### 2. State Management (`frontend/src/hooks/useStoryWriterState.ts`)
**Status**: ✅ Complete
**Features**:
- ✅ Comprehensive state management for all story parameters
- ✅ Generated content state (premise, outline, story)
- ✅ Task management state
- ✅ UI state (loading, errors)
- ✅ localStorage persistence
- ✅ Helper methods (`getRequest()`, `resetState()`)
**Strengths**:
- Clean hook structure
- Proper TypeScript types
- State persistence for recovery
- All setters provided
**Potential Improvements**:
- Could add debouncing for localStorage writes
- Could add state validation helpers
### 3. Phase Navigation (`frontend/src/hooks/useStoryWriterPhaseNavigation.ts`)
**Status**: ✅ Complete
**Features**:
- ✅ Five-phase workflow: Setup → Premise → Outline → Writing → Export
- ✅ Auto-progression based on completion
- ✅ Manual phase selection
- ✅ Phase state management (completed, current, disabled)
- ✅ localStorage persistence
**Strengths**:
- Smart phase progression logic
- Prevents accessing phases without prerequisites
- User selection tracking
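
The prerequisite rule can be stated compactly. The sketch below is written in Python purely for illustration (the hook itself is TypeScript), and the exact prerequisites are an assumption inferred from the five-phase workflow; `accessible_phases` is a hypothetical name.

```python
from typing import List

PHASES: List[str] = ["setup", "premise", "outline", "writing", "export"]


def accessible_phases(has_premise: bool, has_outline: bool, has_story: bool) -> List[str]:
    """List the phases a user may open, given which content already exists."""
    allowed = ["setup"]  # Setup has no prerequisite
    if has_premise:
        allowed += ["premise", "outline"]  # both only need a premise
        if has_outline:
            allowed.append("writing")  # Writing needs premise and outline
            if has_story:
                allowed.append("export")  # Export needs story content
    return allowed
```

Any phase missing from the returned list would be rendered as disabled in the stepper.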
### 4. Main Component (`frontend/src/components/StoryWriter/StoryWriter.tsx`)
**Status**: ✅ Complete
**Features**:
- ✅ Integrates state and phase navigation
- ✅ Renders appropriate phase component
- ✅ Clean Material-UI layout
- ✅ Theme class management
**Strengths**:
- Simple, clean structure
- Proper component composition
### 5. Phase Components
#### StorySetup (`frontend/src/components/StoryWriter/Phases/StorySetup.tsx`)
**Status**: ✅ Complete
**Features**:
- ✅ Form for all story parameters
- ✅ Required field validation
- ✅ Dropdowns for style, tone, POV, audience, rating, ending
- ✅ API integration for premise generation
- ✅ Auto-navigation on success
- ✅ Error handling
**Strengths**:
- Comprehensive form with all options
- Good UX with validation
#### StoryPremise (`frontend/src/components/StoryWriter/Phases/StoryPremise.tsx`)
**Status**: ✅ Complete
**Features**:
- ✅ Display and edit premise
- ✅ Regenerate functionality
- ✅ Continue to Outline button
#### StoryOutline (`frontend/src/components/StoryWriter/Phases/StoryOutline.tsx`)
**Status**: ✅ Complete
**Features**:
- ✅ Generate outline from premise
- ✅ Display and edit outline
- ✅ Regenerate functionality
- ✅ Continue to Writing button
#### StoryWriting (`frontend/src/components/StoryWriter/Phases/StoryWriting.tsx`)
**Status**: ✅ Complete with Minor Issue
**Features**:
- ✅ Generate story start
- ✅ Continue writing functionality
- ✅ Completion detection
- ✅ Story content editing
**Issue Found**:
- ⚠️ **Minor**: Individual continuation responses still contain the `IAMDONE` marker, and the frontend appends them to the displayed story without stripping it. The backend removes the marker only in full story generation. Completion detection is unaffected because the backend checks for the marker, but the frontend should strip it before display.
**Recommendation**:
```typescript
// In StoryWriting.tsx, handleContinue function:
if (response.success && response.continuation) {
  // Strip IAMDONE marker if present
  const cleanContinuation = response.continuation.replace(/IAMDONE/gi, '').trim();
  state.setStoryContent((state.storyContent || '') + '\n\n' + cleanContinuation);
  state.setIsComplete(response.is_complete);
}
```
#### StoryExport (`frontend/src/components/StoryWriter/Phases/StoryExport.tsx`)
**Status**: ✅ Complete
**Features**:
- ✅ Display complete story with summary
- ✅ Show premise and outline
- ✅ Copy to clipboard
- ✅ Download as text file
**Strengths**:
- Clean export functionality
- Good summary display
### 6. Phase Navigation Component (`frontend/src/components/StoryWriter/PhaseNavigation.tsx`)
**Status**: ✅ Complete
**Features**:
- ✅ Material-UI Stepper
- ✅ Visual phase indicators
- ✅ Clickable phases (when enabled)
- ✅ Phase status display
**Strengths**:
- Clean, intuitive UI
- Good visual feedback
### 7. Route Integration (`frontend/src/App.tsx`)
**Status**: ✅ Complete
- ✅ Route added: `/story-writer`
- ✅ Protected route (requires authentication)
- ✅ Component imported correctly
## 🔍 Integration Verification
### API Endpoint Matching
✅ All frontend API calls match backend endpoints:
- `/api/story/generate-premise`
- `/api/story/generate-outline?premise=...`
- `/api/story/generate-start?premise=...&outline=...`
- `/api/story/continue`
- `/api/story/generate-full`
- `/api/story/task/{task_id}/status`
- `/api/story/task/{task_id}/result`
### Request/Response Models
✅ Frontend TypeScript interfaces match backend Pydantic models:
- `StoryGenerationRequest`
- `StoryPremiseResponse`
- `StoryOutlineResponse`
- `StoryContentResponse`
- `StoryContinueRequest`
- `StoryContinueResponse`
### Authentication
✅ Both frontend and backend handle authentication:
- Frontend: Uses `apiClient` with auth token interceptor
- Backend: Uses `get_current_user` dependency
- User ID properly passed to service layer
## 🐛 Issues Found
### Critical Issues
None found.
### Minor Issues
1. **IAMDONE Marker Display** (Low Priority)
- **Location**: `frontend/src/components/StoryWriter/Phases/StoryWriting.tsx`
- **Issue**: Continuation text may include `IAMDONE` marker in display
- **Impact**: Minor - marker might appear in story text
- **Fix**: Strip marker before displaying (see recommendation above)
2. **Query Parameter Encoding** (Very Low Priority)
- **Location**: `frontend/src/services/storyWriterApi.ts`
- **Issue**: Using template strings for query params works but could use URLSearchParams
- **Impact**: None - current implementation works correctly
- **Fix**: Optional improvement for better maintainability
## 📋 Testing Checklist
### Backend Testing
- [ ] Test premise generation endpoint
- [ ] Test outline generation endpoint
- [ ] Test story start generation endpoint
- [ ] Test story continuation endpoint
- [ ] Test full story generation (async)
- [ ] Test task status polling
- [ ] Test cache functionality
- [ ] Test error handling (invalid requests, auth failures)
- [ ] Test subscription limit handling
### Frontend Testing
- [ ] Test Setup phase form submission
- [ ] Test Premise generation and display
- [ ] Test Outline generation and display
- [ ] Test Story start generation
- [ ] Test Story continuation
- [ ] Test Phase navigation (forward and backward)
- [ ] Test State persistence (refresh page)
- [ ] Test Error handling and display
- [ ] Test Export functionality
- [ ] Test Responsive design
### Integration Testing
- [ ] End-to-end: Setup → Premise → Outline → Writing → Export
- [ ] Test with real backend API
- [ ] Test error scenarios (network errors, API errors)
- [ ] Test authentication flow
- [ ] Test subscription limit scenarios
## 🎯 Recommendations
### Immediate Actions
1. **Fix IAMDONE Marker Display** (if desired)
- Strip `IAMDONE` marker from continuation text before displaying
### Future Enhancements
1. **CopilotKit Integration** (Phase 4)
- Add CopilotKit actions for story generation
- Add CopilotKit sidebar for AI assistance
- Follow BlogWriter pattern
2. **Enhanced Error Handling**
- More specific error messages
- Retry logic for transient failures
- Better error recovery
3. **Progress Indicators**
- Show progress for long-running operations
- Token counting for better progress tracking
- Estimated time remaining
4. **Draft Saving**
- Save drafts to backend
- Load previous drafts
- Draft management UI
5. **Story Editing**
- Rich text editor for story content
- Markdown support
- Formatting options
6. **Export Enhancements**
- Multiple export formats (PDF, DOCX, EPUB)
- Export with formatting
- Share functionality
## ✅ Summary
### Overall Status: **READY FOR TESTING**
**Backend**: ✅ Complete and well-structured
- All endpoints implemented
- Proper authentication and subscription integration
- Error handling in place
- Task management and caching implemented
**Frontend**: ✅ Complete with minor improvements possible
- All components implemented
- State management working
- Phase navigation functional
- API integration correct
- Route configured
**Integration**: ✅ Verified
- API endpoints match
- Request/response models align
- Authentication flow correct
### Next Steps
1. **End-to-End Testing**: Test the complete flow with real backend
2. **Fix Minor Issues**: Address IAMDONE marker display if needed
3. **CopilotKit Integration**: Add AI assistance features (Phase 4)
4. **Polish & Enhance**: Improve UX, add features, enhance styling
The implementation is solid and ready for testing. The code follows best practices and integrates well with the existing codebase.

@@ -0,0 +1,830 @@
# Story Writer Video Generation Enhancement Plan
---
## Current State Analysis
### Current Video Generation
- **Provider**: HuggingFace (tencent/HunyuanVideo via fal-ai)
- **Issues**:
- Unreliable API responses
- Limited quality control
- No audio synchronization
- Single provider dependency
- Poor error handling
### Current Audio Generation
- **Provider**: gTTS (Google Text-to-Speech)
- **Limitations**:
- Robotic, non-natural voice
- No brand voice consistency
- Limited language options
- No emotion control
- Cannot clone user's voice
### Current Story Writer Workflow
1. User creates story outline with scenes
2. Each scene has `audio_narration` text
3. Audio generated via gTTS per scene
4. Video generated via HuggingFace per scene
5. Videos compiled into final story video
**Location**: `backend/api/story_writer/` and `frontend/src/components/StoryWriter/`
---
## Proposed Enhancements
### Core Principles
**Provider Abstraction**:
- Users should NOT see provider names (HuggingFace, WaveSpeed, etc.)
- All provider routing/switching happens automatically in the background
- Users only see user-friendly options like "Standard Quality" or "Premium Quality"
- System automatically selects best available provider based on user's subscription and credits
**Preserve Existing Options**:
- gTTS remains available as free fallback when credits run out
- HuggingFace remains available as fallback option
- All existing functionality preserved
- New features are additions, not replacements
**Cost Transparency**:
- All buttons show cost information in tooltips
- Users make informed decisions before generating
- No surprise costs
---
### 1. Provider-Agnostic Video Generation System
#### 1.1 Smart Provider Routing
**Backend Implementation** (`backend/services/llm_providers/main_video_generation.py`):
```python
from typing import Optional, Tuple


def ai_video_generate(
    prompt: str,
    quality: str = "standard",  # "standard" (480p), "high" (720p), "premium" (1080p)
    duration: int = 5,
    audio_file_path: Optional[str] = None,
    *,
    user_id: str,  # keyword-only: a required parameter cannot follow defaults positionally
    **kwargs,
) -> bytes:
    """
    Unified video generation entry point.

    Automatically routes to best available provider:
    - WaveSpeed WAN 2.5 (primary, if credits available)
    - HuggingFace (fallback, if WaveSpeed unavailable)

    Users never see provider names - only quality options.
    """
    # 1. Check user subscription and credits
    # 2. Select best available provider automatically
    # 3. Route to appropriate provider function
    # 4. Handle fallbacks transparently
    pass


def _select_video_provider(
    user_id: str,
    quality: str,
    pricing_service: "PricingService",  # quoted: the class is defined elsewhere
) -> Tuple[str, str]:
    """
    Automatically select best video provider.

    Returns: (provider_name, model_name)

    Selection logic:
    1. Check user credits/subscription
    2. Prefer WaveSpeed if available and credits sufficient
    3. Fallback to HuggingFace if WaveSpeed unavailable
    4. Return error if no providers available
    """
    # Implementation details...
```
**Key Features**:
- Automatic provider selection (users don't choose)
- Seamless fallback between providers
- Quality-based options (Standard/High/Premium) instead of provider names
- Cost-aware routing (uses cheapest available option)
- Transparent error handling
**Quality Mapping**:
- **Standard Quality** (480p): $0.05/second - Uses WaveSpeed 480p or HuggingFace
- **High Quality** (720p): $0.10/second - Uses WaveSpeed 720p
- **Premium Quality** (1080p): $0.15/second - Uses WaveSpeed 1080p
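
The per-second prices translate into per-scene and per-story costs by simple multiplication. The sketch below uses the three rates from the mapping; the function name `video_cost` and the use of `Decimal` (to avoid floating-point rounding in money arithmetic) are illustrative choices, not part of the plan.

```python
from decimal import Decimal

# Per-second prices in USD, copied from the quality mapping.
PRICE_PER_SECOND = {
    "standard": Decimal("0.05"),  # 480p
    "high": Decimal("0.10"),      # 720p
    "premium": Decimal("0.15"),   # 1080p
}


def video_cost(quality: str, duration_seconds: int, num_scenes: int = 1) -> Decimal:
    """Estimated video cost in USD for `num_scenes` clips of equal length."""
    if quality not in PRICE_PER_SECOND:
        raise ValueError(f"Unknown quality: {quality!r}")
    return PRICE_PER_SECOND[quality] * duration_seconds * num_scenes
```

For example, `video_cost("standard", 5)` gives the $0.25 per 5-second scene quoted elsewhere in this plan.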
**Cost Optimization**:
- Default to Standard Quality (480p) for cost-effectiveness
- Allow upgrade to High/Premium for final export
- Pre-flight validation prevents waste
- Automatic fallback to free options when credits exhausted
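
One possible reading of the routing rules is sketched below. It is a simplified stand-in for `_select_video_provider`: the flat parameters replace the `PricingService` lookup, the returned identifier strings are placeholders, and restricting the HuggingFace fallback to Standard Quality is an assumption drawn from the quality mapping above, which lists HuggingFace only for 480p.

```python
from typing import Tuple


def select_video_provider(
    quality: str,
    credits_usd: float,
    estimated_cost_usd: float,
    wavespeed_available: bool = True,
    huggingface_available: bool = True,
) -> Tuple[str, str]:
    """Return (provider_name, model_name) following the documented order."""
    # Prefer WaveSpeed when it is reachable and the user can afford the clip.
    if wavespeed_available and credits_usd >= estimated_cost_usd:
        return ("wavespeed", "wan-2.5")
    # Fall back to HuggingFace, assumed here to serve Standard Quality only.
    if huggingface_available and quality == "standard":
        return ("huggingface", "tencent/HunyuanVideo")
    # No provider can serve the request: surface an error to the caller.
    raise RuntimeError("No video provider available for this request")
```

The caller would map the `RuntimeError` to a user-facing message that mentions quality and credits, never the provider names.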
---
### 2. Enhanced Audio Generation with Voice Cloning
#### 2.1 User-Friendly Voice Selection
**Key Principle**: Users choose between "AI Clone Voice" and "Default Voice" (gTTS) - no provider names shown.
**Backend Implementation** (`backend/services/story_writer/audio_generation_service.py`):
```python
class StoryAudioGenerationService:
    def generate_scene_audio(
        self,
        scene: Dict[str, Any],
        user_id: str,
        use_ai_voice: bool = False,  # User's choice: AI Clone or Default
        **kwargs,
    ) -> Dict[str, Any]:
        """
        Generate audio with automatic provider selection.

        If use_ai_voice=True:
        - Try persona voice clone (if trained)
        - Try Minimax voice clone (if credits available)
        - Fallback to gTTS if no credits

        If use_ai_voice=False:
        - Use gTTS (always free, always available)
        """
        if use_ai_voice:
            # Try AI voice options
            if self._has_persona_voice(user_id):
                return self._generate_with_persona_voice(scene, user_id)
            elif self._has_credits_for_voice_clone(user_id):
                return self._generate_with_minimax_voice_clone(scene, user_id)
            else:
                # Fallback to gTTS with notification
                logger.info(f"Credits exhausted, falling back to gTTS for user {user_id}")
                return self._generate_with_gtts(scene, **kwargs)
        else:
            # User explicitly chose default voice
            return self._generate_with_gtts(scene, **kwargs)
```
**Voice Options in Story Setup**:
- **Default Voice (gTTS)**: Free, always available, robotic but functional
- **AI Clone Voice**: Natural, human-like, requires credits ($0.02/minute)
**Cost Considerations**:
- Voice training: One-time cost (~$0.75) - only if user wants to train custom voice
- Voice generation: ~$0.02 per minute (only when AI Clone Voice selected)
- gTTS: Always free, always available as fallback
- Automatic fallback to gTTS when credits exhausted (with user notification)
---
### 3. Enhanced Story Setup UI
#### 3.1 Video Generation Settings (Provider-Agnostic)
**Location**: `frontend/src/components/StoryWriter/Phases/StorySetup/GenerationSettingsSection.tsx`
**User-Friendly Settings** (No Provider Names):
```typescript
interface VideoGenerationSettings {
  // Quality selection (NOT provider selection)
  videoQuality: 'standard' | 'high' | 'premium'; // Maps to 480p/720p/1080p

  // Duration
  videoDuration: 5 | 10; // seconds

  // Cost estimation (shown in tooltip)
  estimatedCostPerScene: number;
  totalEstimatedCost: number;

  // Provider routing happens automatically in backend
  // Users never see "WaveSpeed" or "HuggingFace"
}
```
**UI Components**:
- Quality selector: "Standard" / "High" / "Premium" (with cost in tooltip)
- Duration selector: 5s (default) / 10s (premium)
- Cost tooltip: Shows estimated cost per scene and total
- Pre-flight validation warnings
- **No provider selector** - routing is automatic
**Tooltip Example**:
```
Standard Quality (480p)
├─ Cost: $0.25 per scene (5 seconds)
├─ Quality: Good for previews and testing
└─ Provider: Automatically selected based on credits
```
#### 3.2 Audio Generation Settings (Simple Choice)
**New Settings**:
```typescript
interface AudioGenerationSettings {
  // Simple user choice - no provider names
  voiceType: 'default' | 'ai_clone'; // "Default Voice" or "AI Clone Voice"

  // Only shown if ai_clone selected
  voiceTrainingStatus: 'not_trained' | 'training' | 'ready' | 'failed';

  // Existing gTTS settings (preserved)
  audioLang: string;
  audioSlow: boolean;
  audioRate: number;
}
```
**UI Components**:
- **Voice Type Selector**:
- "Default Voice (gTTS)" - Free, always available
- "AI Clone Voice" - Natural, $0.02/minute (with cost tooltip)
- Voice training section (only if AI Clone Voice selected)
- Existing gTTS settings (preserved for Default Voice)
- Cost per minute display in tooltip
**Tooltip for "AI Clone Voice"**:
```
AI Clone Voice
├─ Cost: $0.02 per minute
├─ Quality: Natural, human-like narration
├─ Fallback: Automatically uses Default Voice if credits exhausted
└─ Training: One-time $0.75 to train your custom voice (optional)
```
**Tooltip for "Default Voice"**:
```
Default Voice (gTTS)
├─ Cost: Free
├─ Quality: Standard text-to-speech
└─ Always Available: Works even when credits exhausted
```
---
### 4. New "Animate Scene" Feature in Outline Phase
#### 4.1 Per-Scene Animation Preview
**Location**: `frontend/src/components/StoryWriter/Phases/StoryOutline.tsx`
**Feature**: Add "Animate Scene" hover option alongside existing scene actions
**Implementation**:
- Add to `OutlineHoverActions` component
- Appears on hover over scene cards
- Only generates for single scene (never bulk)
- Uses cheapest option (480p/Standard Quality) to give users a feel
- Shows cost in tooltip before generation
**UI Component**:
```typescript
// In OutlineHoverActions.tsx
const sceneHoverActions = [
  // Existing actions...
  {
    icon: <PlayArrowIcon />,
    label: 'Animate Scene',
    action: 'animate-scene',
    tooltip: `Animate this scene with video\nCost: ~$0.25 (5 seconds, Standard Quality)\nPreview only - uses cheapest option`,
    onClick: handleAnimateScene,
  },
];
```
**Backend Endpoint**:
```python
@router.post("/animate-scene-preview")
async def animate_scene_preview(
    request: SceneAnimationRequest,
    current_user: Dict[str, Any] = Depends(get_current_user),
) -> SceneAnimationResponse:
    """
    Generate preview animation for a single scene.
    Always uses cheapest option (480p/Standard Quality).
    Per-scene only - never bulk generation.
    """
    # 1. Validate single scene only
    # 2. Use Standard Quality (480p) - cheapest option
    # 3. Generate video with automatic provider routing
    # 4. Return preview video URL
    pass
```
**Cost Management**:
- Always uses Standard Quality (480p) - $0.25 per scene
- Pre-flight validation before generation
- Clear cost display in tooltip
- Per-scene only prevents bulk waste
---
### 5. New "Animate Story with VoiceOver" Button in Writing Phase
#### 5.1 Complete Story Animation
**Location**: `frontend/src/components/StoryWriter/Phases/StoryWriting.tsx`
**Feature**: New button alongside existing HuggingFace video options
**Implementation**:
- Add button in Writing phase toolbar
- Generates complete animated story with synchronized voiceover
- Uses user's voice preference from Setup (AI Clone or Default)
- Shows comprehensive cost breakdown in tooltip
- Pre-flight validation before generation
**UI Component**:
```typescript
<Button
  variant="contained"
  startIcon={<SmartDisplayIcon />}
  onClick={handleAnimateStoryWithVoiceOver}
  disabled={!state.storyContent || isGenerating}
  title={`Animate Story with VoiceOver\n\nCost Breakdown:\n- Video: $${videoCost} (${scenes.length} scenes × $${costPerScene})\n- Audio: $${audioCost} (${totalAudioMinutes} minutes)\n- Total: $${totalCost}\n\nQuality: ${state.videoQuality}\nVoice: ${state.voiceType === 'ai_clone' ? 'AI Clone' : 'Default'}`}
>
  Animate Story with VoiceOver
</Button>
```
**Backend Endpoint**:
```python
@router.post("/animate-story-with-voiceover")
async def animate_story_with_voiceover(
    request: StoryAnimationRequest,
    current_user: Dict[str, Any] = Depends(get_current_user),
) -> StoryAnimationResponse:
    """
    Generate complete animated story with synchronized voiceover.
    Uses user's quality and voice preferences from Setup.
    """
    # 1. Pre-flight validation (cost, credits, limits)
    # 2. Generate audio for all scenes (using user's voice preference)
    # 3. Generate videos for all scenes (using user's quality preference)
    # 4. Synchronize audio with video
    # 5. Compile into final story video
    # 6. Return video URL and cost breakdown
    pass
```
**Cost Tooltip Example**:
```
Animate Story with VoiceOver
Cost Breakdown:
├─ Video (Standard Quality): $2.50
│ └─ 10 scenes × $0.25 per scene
├─ Audio (AI Clone Voice): $1.00
│ └─ 50 minutes total × $0.02/minute
└─ Total: $3.50
Settings:
├─ Quality: Standard (480p)
├─ Voice: AI Clone Voice
└─ Duration: 5 seconds per scene
⚠️ This will use $3.50 of your monthly credits
```
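
The numbers in this tooltip can be reproduced with a few lines of arithmetic. The sketch below is only an illustration of how the breakdown might be computed on the backend; `story_cost_breakdown` is a hypothetical name, and the inputs are the example's own figures (10 scenes at $0.25, 50 audio minutes at $0.02).

```python
from decimal import Decimal
from typing import Dict


def story_cost_breakdown(
    num_scenes: int,
    cost_per_scene: Decimal,
    audio_minutes: int,
    audio_rate_per_minute: Decimal,
) -> Dict[str, Decimal]:
    """Split the estimated story cost into video, audio, and total (USD)."""
    video = cost_per_scene * num_scenes
    audio = audio_rate_per_minute * audio_minutes
    return {"video": video, "audio": audio, "total": video + audio}


def format_usd(amount: Decimal) -> str:
    """Render an amount the way the tooltip shows it, e.g. $3.50."""
    return f"${amount:.2f}"
```

Passing `audio_rate_per_minute=Decimal("0")` models the Default Voice case, where only the video cost remains.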
---
## Implementation Phases
### Phase 1: Provider-Agnostic Video System (Week 1-2)
**Priority**: HIGH - Solves immediate HuggingFace issues with provider abstraction
**Tasks**:
1. ✅ Create WaveSpeed API client (`backend/services/wavespeed/client.py`)
2. ✅ Add WAN 2.5 text-to-video function
3. ✅ Implement smart provider routing in `main_video_generation.py`
4. ✅ Add quality-based selection (Standard/High/Premium)
5. ✅ Preserve HuggingFace as fallback option
6. ✅ Update `hd_video.py` with provider routing
7. ✅ Add pre-flight cost validation
8. ✅ Update frontend with quality selector (remove provider names)
9. ✅ Add cost tooltips to all buttons
10. ✅ Update subscription limits
11. ✅ Testing and error handling
**Files to Modify**:
- `backend/services/llm_providers/main_video_generation.py` (add routing logic)
- `backend/api/story_writer/utils/hd_video.py` (use quality-based API)
- `backend/api/story_writer/routes/video_generation.py`
- `frontend/src/components/StoryWriter/Phases/StorySetup/GenerationSettingsSection.tsx` (quality selector)
- `frontend/src/components/StoryWriter/components/HdVideoSection.tsx`
- `backend/services/subscription/pricing_service.py`
**Success Criteria**:
- Video generation works reliably with automatic provider routing
- Users see quality options, not provider names
- HuggingFace preserved as fallback
- Cost tracking accurate
- Pre-flight validation prevents waste
- Error messages clear and actionable
---
### Phase 2: Voice Cloning Integration (Week 3-4)
**Priority**: MEDIUM - Enhances audio quality with simple user choice
**Tasks**:
1. ✅ Create Minimax API client (`backend/services/minimax/voice_clone.py`)
2. ✅ Add voice training endpoint
3. ✅ Add voice generation endpoint
4. ✅ Update `audio_generation_service.py` with "AI Clone" vs "Default" logic
5. ✅ Preserve gTTS as always-available fallback
6. ✅ Add automatic fallback when credits exhausted
7. ✅ Update Story Setup with simple voice type selector
8. ✅ Add cost tooltips to voice options
9. ✅ Add voice preview and testing (if AI Clone selected)
10. ✅ Ensure gTTS always works even when credits exhausted
**Files to Create**:
- `backend/services/minimax/voice_clone.py`
- `backend/services/story_writer/voice_management_service.py`
**Files to Modify**:
- `backend/services/story_writer/audio_generation_service.py` (add voice type logic)
- `frontend/src/components/StoryWriter/Phases/StorySetup/GenerationSettingsSection.tsx` (voice type selector)
- `backend/models/story_models.py` (add voice type field)
**Success Criteria**:
- Users see simple choice: "Default Voice" or "AI Clone Voice"
- gTTS always available as fallback
- Automatic fallback when credits exhausted
- Cost tracking accurate
- Voice quality significantly better than gTTS when AI Clone used
---
### Phase 3: New Features - Animate Scene & Animate Story (Week 5-6)
**Priority**: MEDIUM - Add preview and complete animation features
**Tasks**:
1. ✅ Add "Animate Scene" hover option in Outline phase
2. ✅ Implement per-scene animation preview (cheapest option only)
3. ✅ Add "Animate Story with VoiceOver" button in Writing phase
4. ✅ Implement complete story animation with voiceover
5. ✅ Add comprehensive cost tooltips to all buttons
6. ✅ Add pre-flight validation for all animation features
7. ✅ Ensure per-scene only (no bulk generation in Outline)
8. ✅ Update documentation
9. ✅ User testing and feedback
**Files to Create**:
- `backend/api/story_writer/routes/scene_animation.py` (new endpoint)
- `frontend/src/components/StoryWriter/components/AnimateSceneButton.tsx`
**Files to Modify**:
- `frontend/src/components/StoryWriter/Phases/StoryOutlineParts/OutlineHoverActions.tsx` (add Animate Scene)
- `frontend/src/components/StoryWriter/Phases/StoryWriting.tsx` (add Animate Story button)
- `backend/api/story_writer/routes/video_generation.py` (add story animation endpoint)
**Success Criteria**:
- "Animate Scene" works in Outline (per-scene, cheapest option)
- "Animate Story with VoiceOver" works in Writing phase
- All buttons show cost in tooltips
- Pre-flight validation prevents waste
- Good user experience
---
### Phase 4: Integration & Optimization (Week 7-8)
**Priority**: MEDIUM - Polish and optimize
**Tasks**:
1. ✅ Integrate audio with video (synchronized videos)
2. ✅ Improve error handling and retry logic
3. ✅ Add progress indicators
4. ✅ Optimize cost calculations
5. ✅ Add usage analytics
6. ✅ Update documentation
7. ✅ User testing and feedback
**Success Criteria**:
- Smooth end-to-end workflow
- Cost-effective for users
- Reliable generation
- Excellent user experience
- All features work seamlessly together
---
## Cost Management & Prevention of Waste
### Pre-Flight Validation
**Implementation**: `backend/services/subscription/preflight_validator.py`
**Checks Before Generation**:
1. The user's subscription tier permits the requested features
2. The estimated cost is within the remaining monthly budget
3. The video generation limit is not exceeded
4. The audio generation limit is not exceeded
5. The total story cost is reasonable (<$5 for a typical story)
**Validation Flow**:
```python
from typing import Any, Dict, Tuple

# PricingService, get_wavespeed_cost and get_voice_clone_cost are project
# helpers and must be imported from their own modules.


def validate_story_generation(
    pricing_service: PricingService,
    user_id: str,
    num_scenes: int,
    video_resolution: str,
    video_duration: int,
    use_voice_clone: bool,
) -> Tuple[bool, str, Dict[str, Any]]:
    """
    Pre-flight validation before story generation.
    Returns: (allowed, message, cost_breakdown)
    """
    # Calculate estimated costs
    video_cost_per_scene = get_wavespeed_cost(video_resolution, video_duration)
    audio_cost_per_scene = get_voice_clone_cost() if use_voice_clone else 0.0
    total_estimated_cost = (video_cost_per_scene + audio_cost_per_scene) * num_scenes

    # Check limits
    limits = pricing_service.get_user_limits(user_id)
    current_usage = pricing_service.get_current_usage(user_id)

    # Validation logic... (must set allowed, message and cost_breakdown)
    return (allowed, message, cost_breakdown)
```
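
The elided validation step might compare the estimate against limits and current usage along these lines. This is a sketch under stated assumptions: the `Limits` and `Usage` shapes are hypothetical stand-ins for whatever `get_user_limits` and `get_current_usage` actually return, and `check_limits` is not part of the codebase.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass
class Limits:
    """Hypothetical shape of what get_user_limits() returns."""
    monthly_budget: float
    max_video_scenes: int
    max_audio_scenes: int


@dataclass
class Usage:
    """Hypothetical shape of what get_current_usage() returns."""
    spent: float
    video_scenes: int
    audio_scenes: int


def check_limits(
    limits: Limits,
    usage: Usage,
    num_scenes: int,
    video_cost_per_scene: float,
    audio_cost_per_scene: float,
) -> Tuple[bool, str, Dict[str, Any]]:
    """Return (allowed, message, cost_breakdown) without calling any paid API."""
    video_total = round(video_cost_per_scene * num_scenes, 2)
    audio_total = round(audio_cost_per_scene * num_scenes, 2)
    total = round(video_total + audio_total, 2)
    cost_breakdown = {"video": video_total, "audio": audio_total, "total": total}

    if usage.video_scenes + num_scenes > limits.max_video_scenes:
        return False, "Video generation limit would be exceeded.", cost_breakdown
    # Assumption: only paid (voice clone) audio counts against the audio limit.
    if audio_cost_per_scene > 0 and usage.audio_scenes + num_scenes > limits.max_audio_scenes:
        return False, "Audio generation limit would be exceeded.", cost_breakdown
    if usage.spent + total > limits.monthly_budget:
        return False, "Estimated cost exceeds the remaining monthly budget.", cost_breakdown
    return True, "OK", cost_breakdown
```

Returning the cost breakdown even on rejection lets the UI explain why a request was blocked and what a cheaper configuration would cost.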
### Cost Estimation Display
**Frontend Implementation**:
- Real-time cost calculator in Story Setup
- Per-scene cost breakdown
- Total story cost estimate
- Monthly budget remaining
- Warning if approaching limits
**UI Example**:
```
Video Generation Cost Estimate:
├─ Resolution: 720p ($0.10/second)
├─ Duration: 5 seconds per scene
├─ Scenes: 10
└─ Total: $5.00

Audio Generation Cost Estimate:
├─ Provider: Voice Clone ($0.02/minute)
├─ Average: 30 seconds per scene
├─ Scenes: 10
└─ Total: $0.10

Total Estimated Cost: $5.10
Monthly Budget Remaining: $44.00
```
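
The arithmetic behind such an estimate can be sketched as below, using the listed rates ($0.10/second for 720p video, $0.02/minute for cloned audio). The function names are illustrative only. Note that video is billed per second while audio is billed per minute, so the audio duration has to be converted from seconds.

```python
def video_cost(rate_per_second: float, seconds_per_scene: float, scenes: int) -> float:
    """Video is priced per second of generated footage."""
    return round(rate_per_second * seconds_per_scene * scenes, 2)


def audio_cost(rate_per_minute: float, seconds_per_scene: float, scenes: int) -> float:
    """Audio is priced per minute, so convert the total seconds to minutes first."""
    total_minutes = seconds_per_scene * scenes / 60
    return round(rate_per_minute * total_minutes, 2)


video = video_cost(0.10, 5, 10)   # 720p, 5 s per scene, 10 scenes
audio = audio_cost(0.02, 30, 10)  # voice clone, 30 s per scene, 10 scenes
print(f"Video: ${video:.2f}  Audio: ${audio:.2f}  Total: ${video + audio:.2f}")
```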
### Usage Tracking
**Enhanced Tracking**:
- Track video generation per scene
- Track audio generation per scene
- Track total story cost
- Alert users approaching limits
- Provide cost breakdown in analytics
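
Per-scene tracking with a limit alert could look roughly like this. It is a sketch only: `StoryUsageTracker` is a hypothetical in-memory class, and the 80% alert threshold is an assumption, not a value from this plan.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StoryUsageTracker:
    monthly_budget: float
    spent_before_story: float = 0.0
    alert_ratio: float = 0.8  # assumed threshold for "approaching limits"
    scene_costs: Dict[int, Dict[str, float]] = field(default_factory=dict)

    def record(self, scene: int, kind: str, cost: float) -> None:
        """Add a video or audio cost to the given scene."""
        if kind not in ("video", "audio"):
            raise ValueError(f"unknown cost kind: {kind}")
        costs = self.scene_costs.setdefault(scene, {"video": 0.0, "audio": 0.0})
        costs[kind] += cost

    def story_total(self) -> float:
        """Total cost of the story so far, across all scenes."""
        return round(
            sum(c["video"] + c["audio"] for c in self.scene_costs.values()), 2
        )

    def approaching_limit(self) -> bool:
        """True once month-to-date spend reaches the alert threshold."""
        spent = self.spent_before_story + self.story_total()
        return spent >= self.alert_ratio * self.monthly_budget
```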
---
## Pricing Integration
### WaveSpeed WAN 2.5 Pricing
**Add to `pricing_service.py`**:
```python
# WaveSpeed WAN 2.5 Text-to-Video
{
    "provider": APIProvider.VIDEO,  # Or new WAVESPEED provider
    "model_name": "wan-2.5-480p",
    "cost_per_second": 0.05,
    "description": "WaveSpeed WAN 2.5 Text-to-Video (480p)"
},
{
    "provider": APIProvider.VIDEO,
    "model_name": "wan-2.5-720p",
    "cost_per_second": 0.10,
    "description": "WaveSpeed WAN 2.5 Text-to-Video (720p)"
},
{
    "provider": APIProvider.VIDEO,
    "model_name": "wan-2.5-1080p",
    "cost_per_second": 0.15,
    "description": "WaveSpeed WAN 2.5 Text-to-Video (1080p)"
}
```
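
A per-scene cost lookup over entries like these might be written as follows. This is an assumption-laden sketch: the plan only names `get_wavespeed_cost` without defining it, the `WAN25_PRICING` list is a local copy for illustration, and the `provider` field is omitted so the example runs without `APIProvider`.

```python
WAN25_PRICING = [
    {"model_name": "wan-2.5-480p", "cost_per_second": 0.05},
    {"model_name": "wan-2.5-720p", "cost_per_second": 0.10},
    {"model_name": "wan-2.5-1080p", "cost_per_second": 0.15},
]


def get_wavespeed_cost(video_resolution: str, video_duration: int) -> float:
    """Return the cost of one scene: per-second rate times duration in seconds."""
    model_name = f"wan-2.5-{video_resolution}"
    for entry in WAN25_PRICING:
        if entry["model_name"] == model_name:
            return round(entry["cost_per_second"] * video_duration, 4)
    raise ValueError(f"No pricing entry for resolution {video_resolution!r}")
```

Raising on an unknown resolution, rather than returning zero, keeps a typo from silently producing a free estimate.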
### Minimax Voice Clone Pricing
**Add to `pricing_service.py`**:
```python
# Minimax Voice Clone
{
    "provider": APIProvider.AUDIO,  # New provider type
    "model_name": "minimax-voice-clone-train",
    "cost_per_request": 0.75,  # One-time training cost
    "description": "Minimax Voice Clone Training"
},
{
    "provider": APIProvider.AUDIO,
    "model_name": "minimax-voice-clone-generate",
    "cost_per_minute": 0.02,  # Per minute of generated audio
    "description": "Minimax Voice Clone Generation"
}
```
### Subscription Tier Limits
**Update subscription limits**:
- **Free**: 3 stories/month, 480p only, gTTS only
- **Basic**: 10 stories/month, up to 720p, voice clone available
- **Pro**: 50 stories/month, up to 1080p, voice clone included
- **Enterprise**: Unlimited stories, all features
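
These tiers could be encoded as data and checked before generation, roughly as sketched here. The names (`TierLimits`, `TIER_LIMITS`, `check_tier`) are hypothetical, Enterprise is assumed to cap at 1080p, and the sketch does not distinguish "available" from "included" voice cloning.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

RESOLUTIONS = ["480p", "720p", "1080p"]  # ordered lowest to highest


@dataclass(frozen=True)
class TierLimits:
    stories_per_month: Optional[int]  # None means unlimited
    max_resolution: str
    voice_clone: bool


TIER_LIMITS = {
    "free": TierLimits(3, "480p", False),
    "basic": TierLimits(10, "720p", True),
    "pro": TierLimits(50, "1080p", True),
    "enterprise": TierLimits(None, "1080p", True),  # assumed resolution cap
}


def check_tier(
    tier: str, stories_this_month: int, resolution: str, use_voice_clone: bool
) -> Tuple[bool, str]:
    """Return (allowed, message) for a story request under the given tier."""
    limits = TIER_LIMITS[tier]
    if (
        limits.stories_per_month is not None
        and stories_this_month >= limits.stories_per_month
    ):
        return False, "Monthly story limit reached."
    if RESOLUTIONS.index(resolution) > RESOLUTIONS.index(limits.max_resolution):
        return False, f"{resolution} requires a higher tier."
    if use_voice_clone and not limits.voice_clone:
        return False, "Voice clone requires a higher tier."
    return True, "OK"
```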
---
## Technical Architecture
### Backend Services
```
backend/services/
├── wavespeed/
│   ├── __init__.py
│   ├── client.py              # WaveSpeed API client
│   ├── wan25_video.py         # WAN 2.5 video generation
│   └── models.py              # Request/response models
├── minimax/
│   ├── __init__.py
│   ├── client.py              # Minimax API client
│   ├── voice_clone.py         # Voice cloning service
│   └── models.py
└── story_writer/
    ├── audio_generation_service.py  # Updated with voice clone
    └── video_generation_service.py  # Updated with WaveSpeed
```
### Frontend Components
```
frontend/src/components/StoryWriter/
├── Phases/StorySetup/
│   └── GenerationSettingsSection.tsx  # Enhanced with new settings
├── components/
│   ├── HdVideoSection.tsx             # Updated for WaveSpeed
│   ├── VoiceTrainingSection.tsx       # NEW: Voice training UI
│   └── CostEstimationDisplay.tsx      # NEW: Cost calculator
└── hooks/
    └── useStoryGenerationCost.ts      # NEW: Cost calculation hook
```
---
## Error Handling & User Experience
### Error Scenarios
1. **WaveSpeed API Failure**:
   - Retry with exponential backoff (3 attempts)
   - Fallback to HuggingFace if available
   - Clear error message with cost refund notice
2. **Voice Clone Training Failure**:
   - Provide specific error (audio quality, length, format)
   - Suggest improvements
   - Allow retry with different audio
3. **Cost Limit Exceeded**:
   - Pre-flight validation prevents this
   - Show upgrade prompt
   - Suggest reducing scenes/resolution
4. **Audio/Video Mismatch**:
   - Validate audio length matches video duration
   - Auto-trim or extend audio
   - Warn user before generation
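
The first scenario (three attempts with exponential backoff, then a fallback provider) can be sketched generically as below. `generate_with_fallback` and its parameters are hypothetical; the callables stand in for the real WaveSpeed and HuggingFace clients, and `sleep` is injectable so the function can be tested without waiting.

```python
import time
from typing import Any, Callable, Optional, Tuple


def generate_with_fallback(
    primary: Callable[[], Any],
    fallback: Optional[Callable[[], Any]] = None,
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Any, str]:
    """Try `primary` with exponential backoff, then `fallback`.

    Returns (result, source) where source is "primary" or "fallback".
    """
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return primary(), "primary"
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
            if attempt < attempts - 1:
                # Delays double each time: base, 2 * base, 4 * base, ...
                sleep(base_delay * 2 ** attempt)
    if fallback is not None:
        return fallback(), "fallback"
    raise RuntimeError("Generation failed after all retries") from last_error
```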
### User Feedback
- Progress indicators for all operations
- Clear cost breakdowns
- Quality previews before final generation
- Regeneration options with cost tracking
- Usage analytics dashboard
---
## Testing Plan
### Unit Tests
- WaveSpeed API client
- Voice clone service
- Cost calculation
- Pre-flight validation
### Integration Tests
- End-to-end story generation
- Audio + video synchronization
- Error handling and fallbacks
- Subscription limit enforcement
### User Acceptance Tests
- Story generation workflow
- Voice training process
- Cost estimation accuracy
- Error recovery
---
## Success Metrics
### Technical Metrics
- Video generation success rate >95%
- Audio generation success rate >98%
- Average generation time per scene <30s
- API error rate <2%
### Business Metrics
- User satisfaction with video quality
- Cost per story (target: <$5 for 10-scene story)
- Voice clone adoption rate
- Story completion rate
### User Experience Metrics
- Time to generate story
- Error recovery time
- User understanding of costs
- Feature discovery rate
---
## Provider Management Strategy
### Always-Available Options
- **gTTS**: Always available, always free, works even when credits exhausted
- **HuggingFace**: Preserved as fallback option, works when WaveSpeed unavailable
### Automatic Provider Routing
- **Primary**: WaveSpeed WAN 2.5 (when credits available)
- **Fallback**: HuggingFace (when WaveSpeed unavailable or credits exhausted)
- **Audio Fallback**: gTTS (always available, always free)
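
The video routing rule could be expressed as a small selection function. This is a sketch with hypothetical names: the returned provider key is for internal use only, and the notification text deliberately avoids provider names.

```python
from typing import Optional, Tuple


def select_video_provider(
    wavespeed_up: bool,
    huggingface_up: bool,
    credits_remaining: float,
    estimated_cost: float,
) -> Tuple[Optional[str], str]:
    """Return (internal provider key, user-facing notification).

    The notification is empty when the primary provider is used.
    """
    if wavespeed_up and credits_remaining >= estimated_cost:
        return "wavespeed", ""
    if huggingface_up:
        reason = "credits exhausted" if wavespeed_up else "primary service unavailable"
        return "huggingface", f"Using fallback video generation ({reason})."
    return None, "Video generation is temporarily unavailable."
```

Keeping the provider key separate from the message makes it straightforward to log which backend ran while showing users only a quality-level notice.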
### User Experience
- Users never see provider names
- System automatically selects best available option
- Seamless fallback when credits exhausted
- Clear notifications when fallback occurs
- No user intervention required
### No Deprecation
- **HuggingFace**: Kept as permanent fallback option
- **gTTS**: Kept as permanent free option
- All existing functionality preserved
- New features are additions, not replacements
---
## Next Steps
1. **Week 1**: Set up WaveSpeed API access and credentials
2. **Week 1**: Implement provider-agnostic routing system
3. **Week 2**: Integrate into Story Writer with quality-based UI
4. **Week 3**: Implement voice cloning with simple "AI Clone" vs "Default" choice
5. **Week 4**: Add voice training UI (only if AI Clone selected)
6. **Week 5**: Add "Animate Scene" hover option in Outline
7. **Week 6**: Add "Animate Story with VoiceOver" button in Writing
8. **Week 7-8**: Testing, optimization, and polish
## Key Design Principles
1. **Provider Abstraction**: Users never see provider names - only quality/voice options
2. **Preserve Existing**: gTTS and HuggingFace remain available as fallbacks
3. **Cost Transparency**: All buttons show costs in tooltips
4. **Automatic Fallback**: System automatically uses free options when credits exhausted
5. **Per-Scene Only**: Outline phase only allows per-scene generation (no bulk)
6. **User-Friendly**: Simple choices like "Standard Quality" not "WaveSpeed 480p"
---
## Risk Mitigation
| Risk | Mitigation |
|------|------------|
| WaveSpeed API changes | Version pinning, abstraction layer |
| Cost overruns | Strict pre-flight validation |
| Voice quality issues | Quality checks, fallback options |
| User confusion | Clear UI, tooltips, documentation |
| Integration complexity | Phased rollout, extensive testing |
---
*Document Version: 1.0*
*Last Updated: January 2025*
*Priority: HIGH - Immediate Implementation*