Story Generation Feature - Implementation Plan
Executive Summary
This document reviews the existing story generation backend modules and provides a comprehensive plan to complete the story generation feature with a modern UI using CopilotKit, similar to the AI Blog Writer implementation.
1. Current State Review
1.1 Existing Backend Modules
1.1.1 Story Writer (ToBeMigrated/ai_writers/ai_story_writer/)
Status: ✅ Functional but needs migration
Location: ToBeMigrated/ai_writers/ai_story_writer/ai_story_generator.py
Features:
- Prompt chaining approach (premise → outline → starting draft → continuation)
- Supports multiple personas/genres (11 predefined)
- Configurable story parameters:
  - Story setting
  - Characters
  - Plot elements
  - Writing style (Formal, Casual, Poetic, Humorous)
  - Story tone (Dark, Uplifting, Suspenseful, Whimsical)
  - Narrative POV (First Person, Third Person Limited/Omniscient)
  - Audience age group
  - Content rating
  - Ending preference
Current Implementation:
- Uses legacy lib/gpt_providers/text_generation/main_text_generation.py (needs update)
- Streamlit-based UI (needs React migration)
- Iterative generation until "IAMDONE" marker
Issues to Address:
- ❌ Uses old import path (...gpt_providers.text_generation.main_text_generation)
- ❌ No subscription/user_id integration
- ❌ No task management/polling support
- ❌ Streamlit UI (needs React/CopilotKit migration)
1.1.2 Story Illustrator (ToBeMigrated/ai_writers/ai_story_illustrator/)
Status: ✅ Functional but needs migration
Location: ToBeMigrated/ai_writers/ai_story_illustrator/story_illustrator.py
Features:
- Story segmentation for illustration
- Scene element extraction using LLM
- Multiple illustration styles (12+ options)
- PDF storybook generation
- ZIP export of illustrations
Current Implementation:
- Uses legacy import paths
- Streamlit UI
- Integrates with image generation (Gemini)
Issues to Address:
- ❌ Uses old import paths
- ❌ No subscription integration
- ❌ Streamlit UI (needs React migration)
1.1.3 Story Video Generator (ToBeMigrated/ai_writers/ai_story_video_generator/)
Status: ✅ Functional but needs migration
Location: ToBeMigrated/ai_writers/ai_story_video_generator/story_video_generator.py
Features:
- Story generation with scene breakdown
- Image generation per scene
- Text overlay on images
- Video compilation with audio
- Multiple story styles
Current Implementation:
- Uses legacy import paths
- Streamlit UI
- MoviePy for video generation
Issues to Address:
- ❌ Uses old import paths
- ❌ No subscription integration
- ❌ Streamlit UI (needs React migration)
- ❌ Heavy dependencies (MoviePy, imageio)
1.2 Core Infrastructure Available
1.2.1 Main Text Generation (backend/services/llm_providers/main_text_generation.py)
Status: ✅ Production-ready
Features:
- ✅ Supports Gemini and HuggingFace
- ✅ Subscription/user_id integration
- ✅ Usage tracking
- ✅ Automatic fallback between providers
- ✅ Structured JSON response support
Usage Pattern:
    from services.llm_providers.main_text_generation import llm_text_gen

    response = llm_text_gen(
        prompt="...",
        system_prompt="...",
        json_struct={...},  # Optional
        user_id="clerk_user_id",  # Required
    )
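The service functions planned in section 2.1.1 are declared `async`, while the usage pattern above is a plain function call. If `llm_text_gen` turns out to be blocking (an assumption; this plan does not say), one option is to run it in a worker thread. The sketch below uses a stand-in `llm_text_gen` so it runs on its own; the stand-in's return value and its `ValueError` on a missing `user_id` are illustrative, not the real function's behavior, and `generate_text` is a hypothetical helper name.

```python
import asyncio
from typing import Any, Dict, Optional


def llm_text_gen(
    prompt: str,
    system_prompt: str = "",
    json_struct: Optional[Dict[str, Any]] = None,
    user_id: Optional[str] = None,
) -> str:
    """Stand-in for the real function so this sketch is self-contained."""
    if not user_id:
        # Assumption: the real function rejects calls without a user_id.
        raise ValueError("user_id is required")
    return f"[generated for {user_id}] {prompt[:40]}"


async def generate_text(prompt: str, system_prompt: str, user_id: str) -> str:
    """Run the (assumed blocking) LLM call without stalling the event loop."""
    return await asyncio.to_thread(
        llm_text_gen, prompt, system_prompt=system_prompt, user_id=user_id
    )
```

From an async endpoint this would be awaited as `await generate_text(prompt, system_prompt, user_id)`. If the real function is already async, the wrapper is unnecessary.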
1.2.2 Subscription System (backend/models/subscription_models.py)
Status: ✅ Production-ready
Features:
- Usage tracking per provider
- Token limits
- Call limits
- Billing period management
- Already integrated with main_text_generation
1.2.3 Blog Writer Architecture (Reference)
Status: ✅ Production-ready reference implementation
Key Components:
- Phase Navigation (frontend/src/hooks/usePhaseNavigation.ts)
  - Multi-phase workflow (Research → Outline → Content → SEO → Publish)
  - Phase state management
  - Auto-progression logic
- CopilotKit Integration (frontend/src/components/BlogWriter/BlogWriterUtils/useBlogWriterCopilotActions.ts)
  - Action handlers for AI interactions
  - Sidebar suggestions
  - Context-aware actions
- Backend Router (backend/api/blog_writer/router.py)
  - RESTful endpoints
  - Task management with polling
  - Cache management
  - Error handling
- Task Management (backend/api/blog_writer/task_manager.py)
  - Async task execution
  - Status tracking
  - Result caching
2. Implementation Plan
2.1 Phase 1: Backend Migration & Enhancement
2.1.1 Create Story Writer Service
File: backend/services/story_writer/story_service.py
Tasks:
- Migrate ai_story_generator.py logic to new service
- Update imports to use main_text_generation
- Add user_id parameter to all LLM calls
- Implement prompt chaining with proper error handling
- Add structured JSON response support for outline generation
- Support both Gemini and HuggingFace through main_text_generation
Key Functions:
    from typing import Any, Dict

    # Signatures only; bodies are to be migrated from ai_story_generator.py.

    async def generate_story_premise(
        persona: str,
        story_setting: str,
        character_input: str,
        plot_elements: str,
        writing_style: str,
        story_tone: str,
        narrative_pov: str,
        audience_age_group: str,
        content_rating: str,
        ending_preference: str,
        user_id: str,
    ) -> str: ...

    async def generate_story_outline(
        premise: str,
        persona: str,
        story_setting: str,
        character_input: str,
        plot_elements: str,
        user_id: str,
    ) -> Dict[str, Any]: ...  # Structured outline

    async def generate_story_start(
        premise: str,
        outline: str,
        persona: str,
        guidelines: str,
        user_id: str,
    ) -> str: ...

    async def continue_story(
        premise: str,
        outline: str,
        story_text: str,
        persona: str,
        guidelines: str,
        user_id: str,
    ) -> str: ...
2.1.2 Create Story Writer Router
File: backend/api/story_writer/router.py
Endpoints:
POST /api/story/generate-premise
POST /api/story/generate-outline
POST /api/story/generate-start
POST /api/story/continue
POST /api/story/generate-full # Complete story generation with task management
GET /api/story/task/{task_id}/status
GET /api/story/task/{task_id}/result
Request Models:
    from pydantic import BaseModel

    class StoryGenerationRequest(BaseModel):
        persona: str
        story_setting: str
        character_input: str
        plot_elements: str
        writing_style: str
        story_tone: str
        narrative_pov: str
        audience_age_group: str
        content_rating: str
        ending_preference: str
2.1.3 Task Management Integration
File: backend/api/story_writer/task_manager.py
Features:
- Async story generation with polling
- Progress tracking (premise → outline → start → continuation → done)
- Result caching
- Error recovery
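The polling and progress-tracking features above could be backed by something like the following in-memory sketch. It is not the Blog Writer's task manager, whose internals this plan does not describe; the class name, record fields, and percentage values are all illustrative assumptions, and a production version would also need expiry and persistence.

```python
import asyncio
import uuid
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict, Optional

# Stages from the plan mapped to rough percentages (the numbers are made up).
STAGE_PROGRESS = {"premise": 20, "outline": 40, "start": 60, "continuation": 80, "done": 100}


@dataclass
class TaskRecord:
    status: str = "pending"  # pending | running | completed | failed
    stage: str = ""
    progress: int = 0
    result: Optional[Any] = None
    error: Optional[str] = None


class InMemoryTaskManager:
    def __init__(self) -> None:
        self._tasks: Dict[str, TaskRecord] = {}
        self._handles: Dict[str, asyncio.Task] = {}  # keep references to running tasks

    def start(self, job: Callable[[Callable[[str], None]], Awaitable[Any]]) -> str:
        """Schedule job(report) and return its id. Must be called inside a running event loop."""
        task_id = str(uuid.uuid4())
        record = self._tasks[task_id] = TaskRecord()

        def report(stage: str) -> None:
            record.stage = stage
            record.progress = STAGE_PROGRESS.get(stage, record.progress)

        async def runner() -> None:
            record.status = "running"
            try:
                record.result = await job(report)
                report("done")
                record.status = "completed"
            except Exception as exc:
                # Keep the error so the status endpoint can return it to the client.
                record.status = "failed"
                record.error = str(exc)

        self._handles[task_id] = asyncio.create_task(runner())
        return task_id

    def get(self, task_id: str) -> Optional[TaskRecord]:
        return self._tasks.get(task_id)
```

Under this design, POST /api/story/generate-full would call `start(...)` and return the id, and the two GET endpoints would read `get(task_id)`. The job receives a `report` callback so each generation step can publish its stage without knowing how progress is stored.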
2.2 Phase 2: Frontend Implementation
2.2.1 Story Writer Component Structure
File: frontend/src/components/StoryWriter/StoryWriter.tsx
Phases (similar to Blog Writer):
- Setup - Story parameters input
- Premise - Review and refine premise
- Outline - Review and refine outline
- Writing - Generate and edit story content
- Illustration (Optional) - Generate illustrations
- Export - Download/export story
2.2.2 Phase Navigation Hook
File: frontend/src/hooks/useStoryWriterPhaseNavigation.ts
Based on: usePhaseNavigation.ts from Blog Writer
Phases:
    interface StoryPhase {
        id: 'setup' | 'premise' | 'outline' | 'writing' | 'illustration' | 'export';
        name: string;
        icon: string;
        description: string;
        completed: boolean;
        current: boolean;
        disabled: boolean;
    }
2.2.3 CopilotKit Actions
File: frontend/src/components/StoryWriter/StoryWriterUtils/useStoryWriterCopilotActions.ts
Actions:
- generateStoryPremise - Generate story premise
- generateStoryOutline - Generate outline from premise
- startStoryWriting - Begin story generation
- continueStoryWriting - Continue story generation
- refineStoryOutline - Refine outline based on feedback
- generateIllustrations - Generate illustrations for story
- exportStory - Export story in various formats
2.2.4 Story Writer UI Components
Main Components:
- StoryWriter.tsx - Main container
- StorySetup.tsx - Phase 1: Input story parameters
- StoryPremise.tsx - Phase 2: Review premise
- StoryOutline.tsx - Phase 3: Review/edit outline
- StoryContent.tsx - Phase 4: Generated story content with editor
- StoryIllustration.tsx - Phase 5: Illustration generation (optional)
- StoryExport.tsx - Phase 6: Export options
Utility Components:
- StoryWriterUtils/HeaderBar.tsx - Phase navigation header
- StoryWriterUtils/PhaseContent.tsx - Phase-specific content wrapper
- StoryWriterUtils/WriterCopilotSidebar.tsx - CopilotKit sidebar
- StoryWriterUtils/useStoryWriterState.ts - State management hook
2.3 Phase 3: Integration with Gemini Examples
2.3.1 Prompt Chaining Pattern
Implementation:
- Use the existing prompt chaining approach from ai_story_generator.py
- Enhance with structured JSON responses for outline
- Add better error handling and retry logic
- Support streaming responses (future enhancement)
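The retry logic mentioned above is not specified further in this plan. One common shape is exponential backoff around each chain step, sketched below. The helper name `with_retries` and its defaults are assumptions.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_retries(
    step: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Await step(); on failure wait base_delay * 2**n seconds and try again."""
    for attempt in range(attempts):
        try:
            return await step()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            await asyncio.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

Catching every `Exception` is deliberately broad for the sketch. In practice, errors that will not resolve on retry, such as an exhausted subscription limit, should probably be excluded so they fail fast instead of spending more calls.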
2.3.2 Illustration Integration
Reference: https://github.com/google-gemini/cookbook/blob/main/examples/Book_illustration.ipynb
Implementation:
- Migrate story_illustrator.py to backend service
- Create API endpoints for illustration generation
- Add illustration phase to frontend
- Support multiple illustration styles
2.3.3 Video Generation (Optional/Future)
Status: Defer to Phase 4 (requires heavy dependencies)
2.4 Phase 4: Advanced Features (Future)
- Story Video Generation
  - Migrate story_video_generator.py
  - Add video generation phase
  - Handle MoviePy dependencies
- Story Templates
  - Pre-defined story templates
  - Genre-specific templates
  - Character templates
- Collaborative Editing
  - Multi-user story editing
  - Version control
  - Comments and suggestions
- Story Analytics
  - Readability metrics
  - Story structure analysis
  - Character development tracking
3. Technical Specifications
3.1 Backend API Models
    # backend/models/story_models.py
    from typing import Any, Dict, List, Optional

    from pydantic import BaseModel

    class StoryGenerationRequest(BaseModel):
        persona: str
        story_setting: str
        character_input: str
        plot_elements: str
        writing_style: str
        story_tone: str
        narrative_pov: str
        audience_age_group: str
        content_rating: str
        ending_preference: str

    class StoryPremiseResponse(BaseModel):
        premise: str
        task_id: Optional[str] = None

    class StoryOutlineResponse(BaseModel):
        outline: List[Dict[str, Any]]
        task_id: Optional[str] = None

    class StoryContentResponse(BaseModel):
        content: str
        is_complete: bool
        task_id: Optional[str] = None

    class StoryIllustrationRequest(BaseModel):
        story_text: str
        style: str = "digital art"
        aspect_ratio: str = "16:9"
        num_segments: int = 5

    class StoryIllustrationResponse(BaseModel):
        illustrations: List[str]  # URLs or base64
        segments: List[str]
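The `num_segments` field implies the story text is split into that many pieces before illustration. How story_illustrator.py actually segments text is not described in this plan, so the following is only one plausible approach: group paragraphs into contiguous, roughly equal chunks. The function name `split_into_segments` is hypothetical.

```python
from typing import List


def split_into_segments(story_text: str, num_segments: int = 5) -> List[str]:
    """Group paragraphs into at most num_segments contiguous chunks."""
    if num_segments < 1:
        raise ValueError("num_segments must be at least 1")
    paragraphs = [p.strip() for p in story_text.split("\n\n") if p.strip()]
    if not paragraphs:
        return []
    # Never produce more segments than there are paragraphs.
    count = min(num_segments, len(paragraphs))
    base, extra = divmod(len(paragraphs), count)
    segments: List[str] = []
    index = 0
    for i in range(count):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first chunks
        segments.append("\n\n".join(paragraphs[index:index + size]))
        index += size
    return segments
```

Each returned segment would then be passed to scene-element extraction and image generation, and the list itself maps onto the `segments` field of `StoryIllustrationResponse`.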
3.2 Frontend API Service
    // frontend/src/services/storyWriterApi.ts
    export interface StoryGenerationRequest {
        persona: string;
        story_setting: string;
        character_input: string;
        plot_elements: string;
        writing_style: string;
        story_tone: string;
        narrative_pov: string;
        audience_age_group: string;
        content_rating: string;
        ending_preference: string;
    }

    export interface StoryPremiseResponse {
        premise: string;
        task_id?: string;
    }

    export interface StoryOutlineResponse {
        outline: Array<{
            scene_number: number;
            description: string;
            narration?: string;
        }>;
        task_id?: string;
    }

    // Method signatures only; implementations are omitted.
    // TaskStatus and StoryContentResponse are not yet defined on the frontend in this plan.
    export declare const storyWriterApi: {
        generatePremise: (request: StoryGenerationRequest) => Promise<StoryPremiseResponse>;
        generateOutline: (premise: string, request: StoryGenerationRequest) => Promise<StoryOutlineResponse>;
        generateFullStory: (request: StoryGenerationRequest) => Promise<{ task_id: string }>;
        getTaskStatus: (task_id: string) => Promise<TaskStatus>;
        getTaskResult: (task_id: string) => Promise<StoryContentResponse>;
        // ... more endpoints
    };
3.3 State Management
    // frontend/src/hooks/useStoryWriterState.ts
    interface StoryWriterState {
        // Setup phase
        persona: string;
        storySetting: string;
        characters: string;
        plotElements: string;
        writingStyle: string;
        storyTone: string;
        narrativePOV: string;
        audienceAgeGroup: string;
        contentRating: string;
        endingPreference: string;

        // Generation phases
        premise: string | null;
        outline: StoryOutlineSection[] | null;
        storyContent: string | null;
        isComplete: boolean;

        // Illustration (optional)
        illustrations: string[];

        // Task management
        currentTaskId: string | null;
        generationProgress: number;
    }
4. Migration Checklist
Backend
- Create backend/services/story_writer/story_service.py
- Migrate prompt chaining logic from ai_story_generator.py
- Update all imports to use main_text_generation
- Add user_id parameter to all LLM calls
- Create backend/api/story_writer/router.py
- Create backend/models/story_models.py
- Integrate task management (backend/api/story_writer/task_manager.py)
- Add caching support
- Create backend/api/story_writer/illustration_service.py (optional)
- Register router in app.py
Frontend
- Create frontend/src/components/StoryWriter/ directory structure
- Create StoryWriter.tsx main component
- Create useStoryWriterPhaseNavigation.ts hook
- Create useStoryWriterState.ts hook
- Create useStoryWriterCopilotActions.ts hook
- Create phase components (Setup, Premise, Outline, Writing, Illustration, Export)
- Create frontend/src/services/storyWriterApi.ts
- Add Story Writer route to App.tsx
- Style components to match Blog Writer design
- Add error handling and loading states
- Implement polling for async tasks
Testing
- Unit tests for story service
- Integration tests for API endpoints
- E2E tests for complete story generation flow
- Test with both Gemini and HuggingFace providers
- Test subscription limits and error handling
5. Dependencies
Backend
- ✅ main_text_generation (already available)
- ✅ subscription_models (already available)
- ✅ FastAPI (already available)
- ⚠️ Image generation (for illustrations - needs verification)
Frontend
- ✅ CopilotKit (already available)
- ✅ React (already available)
- ✅ TypeScript (already available)
- ⚠️ Markdown editor (for story content editing - check if available)
6. Timeline Estimate
- Phase 1 (Backend): 3-5 days
- Phase 2 (Frontend Core): 5-7 days
- Phase 3 (CopilotKit Integration): 2-3 days
- Phase 4 (Illustration - Optional): 3-4 days
- Testing & Polish: 2-3 days
Total: ~15-22 days for core features + illustrations
7. Key Decisions
- Provider Support: Use main_text_generation, which supports both Gemini and HuggingFace with automatic fallback
- UI Pattern: Follow Blog Writer pattern with phase navigation and CopilotKit integration
- Task Management: Use async task pattern with polling (same as Blog Writer)
- Illustration: Make optional/separate phase to keep core story generation focused
- Video Generation: Defer to future phase due to heavy dependencies
8. Next Steps
- Review and approve this plan
- Set up backend service structure
- Begin backend migration
- Create frontend component structure
- Implement phase navigation
- Integrate CopilotKit actions
- Test end-to-end flow
- Add illustration support (optional)
- Polish and documentation