428 lines
19 KiB
Markdown
428 lines
19 KiB
Markdown
## AI Blog Writer — Implementation Specification (Copilot-first, Research-led)
|
|
|
|
### Overview
|
|
- **Goal**: Build a SOTA AI blog writer that guides non-technical users end-to-end: research → outline → section generation → quality/SEO → publishing.
|
|
- **Approach**: Copilot-first UX using CopilotKit. Reuse LinkedIn assistive writing patterns: Google Search grounding, Exa research, hallucination detector, quality analysis, citations.
|
|
- **User Interaction Model**: The user only talks to the Copilot; the editor reflects all state and changes via generative UI and HITL confirmations.
|
|
|
|
### 🚀 **Current Implementation Status** (Updated: December 2024)
|
|
|
|
**✅ COMPLETED PHASES:**
|
|
- **Stage 1: Research & Strategy** - ✅ FULLY IMPLEMENTED
|
|
- **Stage 2: Content Planning (Outline)** - ✅ FULLY IMPLEMENTED
|
|
- **Backend Architecture** - ✅ MODULAR & PRODUCTION-READY
|
|
- **Frontend UI Components** - ✅ COMPREHENSIVE EDITOR
|
|
- **CopilotKit Integration** - ✅ FULLY FUNCTIONAL
|
|
|
|
**🔄 IN PROGRESS:**
|
|
- **Stage 3: Content Generation** - 🔄 PARTIALLY IMPLEMENTED
|
|
- **Stage 4: SEO & Publishing** - 🔄 PARTIALLY IMPLEMENTED
|
|
|
|
**📋 TODO:**
|
|
- Section-by-section content generation
|
|
- Full SEO optimization pipeline
|
|
- Publishing integrations (Wix/WordPress)
|
|
- Advanced quality checks
|
|
|
|
### Key Principles
|
|
- **AI-first, HITL**: The assistant leads with intelligent suggestions; the user approves via render-and-wait HITL components where appropriate.
|
|
- **Research fidelity**: Google grounding + Exa researcher; hallucination detection with claim verification; pervasive citations.
|
|
- **Persona-aware**: Import blog writing persona from DB and apply it across planning/generation/optimizations.
|
|
- **SEO-excellent**: Real-time SEO analysis, metadata generation, schema, and image alt handling.
|
|
- **Publish-ready**: Smooth handoff to Wix/WordPress; preview and scheduling.
|
|
|
|
---
|
|
|
|
## 1) Workflow (4 Stages)
|
|
|
|
### Stage 1: Research & Strategy (AI Orchestration) ✅ **FULLY IMPLEMENTED**
|
|
|
|
**✅ IMPLEMENTED FEATURES:**
|
|
- **Google Search Grounding**: Single Gemini API call with native Google Search integration
|
|
- **Intelligent Caching**: Exact keyword match caching to reduce API costs
|
|
- **AI-Powered Analysis**: Keyword analysis, competitor analysis, content angle generation
|
|
- **Robust Error Handling**: No fallback data - only real AI-generated insights or graceful failures
|
|
- **Progress Tracking**: Real-time progress messages during research operations
|
|
|
|
**✅ IMPLEMENTED INPUTS:**
|
|
- `keywords: string[]`, `industry: string`, `targetAudience: string`, `wordCountTarget: number`
|
|
- Persona support (basic implementation)
|
|
|
|
**✅ IMPLEMENTED BACKEND/SERVICES:**
|
|
- **Modular Architecture**: `ResearchService`, `KeywordAnalyzer`, `CompetitorAnalyzer`, `ContentAngleGenerator`
|
|
- **Google Grounding**: Native Gemini Google Search integration (no Exa dependency)
|
|
- **Caching System**: Intelligent research result caching with TTL and LRU eviction
|
|
- **Error Handling**: Graceful failure with specific error messages
|
|
|
|
**✅ IMPLEMENTED COPILOTKIT ACTIONS:**
|
|
- `researchTopic(keywords, industry, target_audience, blogLength)` → comprehensive research with sources
|
|
- `chatWithResearchData(question)` → interactive research data exploration
|
|
- `getResearchKeywords()` → HITL keyword collection form
|
|
- `performResearch(formData)` → research execution with form data
|
|
|
|
**✅ IMPLEMENTED GENERATIVE UI:**
|
|
- **ResearchResults Component**: Sources, credibility scores, keyword analysis, content angles
|
|
- **KeywordInputForm**: HITL form for keyword collection with blog length selection
|
|
- **Progress Messages**: Real-time loading states with CopilotKit status system
|
|
|
|
**✅ IMPLEMENTED SUGGESTIONS:**
|
|
- "I want to research a topic for my blog" (initial)
|
|
- "Let's proceed to create an Outline" (post-research)
|
|
- "Chat with Research Data" (exploration)
|
|
- "Create outline with custom inputs" (advanced)
|
|
|
|
---
|
|
|
|
### Stage 2: Content Planning (AI + Human) ✅ **FULLY IMPLEMENTED**
|
|
|
|
**✅ IMPLEMENTED DELIVERABLES:**
|
|
- **Structured Outline**: H1/H2/H3 hierarchy with per-section key points and target word counts
|
|
- **AI-Generated Titles**: Multiple title options with SEO optimization
|
|
- **Research Integration**: Outline sections linked to research sources and keywords
|
|
- **Word Count Distribution**: Intelligent word allocation across sections
|
|
|
|
**✅ IMPLEMENTED COPILOTKIT ACTIONS:**
|
|
- `generateOutline()` → AI-powered outline generation from research data
|
|
- `createOutlineWithCustomInputs(customInstructions)` → custom outline with user instructions
|
|
- `refineOutline(operation, sectionId, payload)` → add/remove/move/merge/rename sections
|
|
- `enhanceSection(sectionId, focus)` → AI enhancement of individual sections
|
|
- `optimizeOutline(focus)` → AI optimization of entire outline
|
|
- `rebalanceOutline(targetWords)` → word count rebalancing across sections
|
|
|
|
**✅ IMPLEMENTED GENERATIVE UI:**
|
|
- **EnhancedOutlineEditor**: Interactive outline editor with expandable sections
|
|
- **TitleSelector**: AI-generated title options with custom title creation
|
|
- **CustomOutlineForm**: HITL form for custom outline instructions
|
|
- **Section Management**: Add, edit, reorder, merge sections with visual feedback
|
|
- **Research Integration**: Source references and keyword suggestions per section
|
|
|
|
**✅ IMPLEMENTED SUGGESTIONS:**
|
|
- "Generate outline" (standard)
|
|
- "Create outline with custom inputs" (advanced)
|
|
- "Enhance section [X]" (section-specific)
|
|
- "Optimize entire outline" (global)
|
|
- "Rebalance word counts" (distribution)
|
|
|
|
---
|
|
|
|
### Stage 3: Content Generation (CopilotKit-only, no multi-agent) 🔄 **PARTIALLY IMPLEMENTED**
|
|
|
|
**🔄 PARTIALLY IMPLEMENTED DELIVERABLES:**
|
|
- **Section Generation**: Basic section generation with markdown output
|
|
- **Content Structure**: Sectioned markdown with inline citations support
|
|
- **Quality Checks**: Hallucination detection integration
|
|
|
|
**✅ IMPLEMENTED COPILOTKIT ACTIONS:**
|
|
- `generateSection(sectionId)` → generates content for specific section
|
|
- `generateAllSections()` → placeholder for bulk generation
|
|
- `runHallucinationCheck()` → integrates with hallucination detector service
|
|
|
|
**🔄 PARTIALLY IMPLEMENTED UI:**
|
|
- **Section Editors**: Basic markdown editing per section
|
|
- **DiffPreview Component**: Exists but needs integration
|
|
- **Citation System**: Basic structure in place
|
|
|
|
**📋 TODO:**
|
|
- Full section-by-section content generation
|
|
- Advanced content optimization
|
|
- Inline citation management
|
|
- Content quality improvements
|
|
- Progress tracking for bulk generation
|
|
|
|
---
|
|
|
|
### Stage 4: Optimization & Publishing (AI + Human) 🔄 **PARTIALLY IMPLEMENTED**
|
|
|
|
**🔄 PARTIALLY IMPLEMENTED SEO OPTIMIZATION:**
|
|
- **SEO Analysis**: Basic SEO analysis with keyword density and structure
|
|
- **Metadata Generation**: Title options and meta description generation
|
|
- **SEO Integration**: Wraps existing SEO tools services
|
|
|
|
**✅ IMPLEMENTED COPILOTKIT ACTIONS:**
|
|
- `runSEOAnalyze(keywords)` → SEO analysis with scores and recommendations
|
|
- `generateSEOMetadata(title)` → metadata generation for titles and descriptions
|
|
- `publishToPlatform(platform, schedule)` → placeholder for publishing
|
|
|
|
**🔄 PARTIALLY IMPLEMENTED UI:**
|
|
- **SEOMiniPanel**: Basic SEO analysis display
|
|
- **Metadata Management**: Title and description editing
|
|
|
|
**📋 TODO:**
|
|
- Full SEO optimization pipeline
|
|
- Advanced SEO recommendations
|
|
- Publishing integrations (Wix/WordPress)
|
|
- Content optimization with diff preview
|
|
- Image alt text and media management
|
|
- Schema markup generation
|
|
|
|
---
|
|
|
|
## 2) SEO Tools Integration & Metadata
|
|
|
|
Existing Services to Wrap
|
|
- Meta Description, OpenGraph, Image Alt, On-Page SEO, Technical SEO, Content Strategy (see `backend/services/seo_tools/*` and docs).
|
|
|
|
Unified Endpoints
|
|
- `POST /api/blog/seo/analyze` → { seoScore, density, structure, readability, link suggestions, image alt status, recs }
|
|
- `POST /api/blog/seo/metadata` → { titleOptions, metaDescriptionOptions, openGraph, twitterCard, schema: { Article, FAQ?, Breadcrumb, Org/Person } }
|
|
|
|
Editor SEO Panel
|
|
- Live density and distribution, readability (Flesch-Kincaid), heading hierarchy, internal/external link suggestions.
|
|
- One-click “Apply Fix” with diff preview.
|
|
|
|
Schema
|
|
- Default Article schema; optional FAQ when Q&A snippets exist; Breadcrumb, Organization/Person as applicable.
|
|
|
|
---
|
|
|
|
## 3) Dedicated Blog Editor Design (Copilot-first)
|
|
|
|
Layout
|
|
- Left: Markdown Editor (per-section tabs), word count, persona cues, inline citation chips.
|
|
- Right: Live Preview (desktop/mobile), SEO SERP snippet preview, social preview (OG/Twitter).
|
|
- Sidebar Panels: Research (sources, claims), SEO (scores/fixes), Media (AI images + alt text), History (versions).
|
|
|
|
Core Components
|
|
- `BlogResearchCard` (render-only): sources, credibility scores, add-to-outline.
|
|
- `OutlineEditor` (HITL): drag-drop H2/H3, per-section refs and target words.
|
|
- `SectionEditor`: markdown area with persona/tone badges; per-section SEO mini-score.
|
|
- `DiffPreview` (HITL): apply/reject AI edits.
|
|
- `SEOPanel`: density/structure/readability + apply fix.
|
|
- `MediaPanel`: AI images, compression, automatic alt-text.
|
|
|
|
CopilotKit Integrations
|
|
- Suggestions: set programmatically (`useCopilotChatHeadless_c`) or via `CopilotSidebar` props.
|
|
- Generative UI: `useCopilotAction({ render })` for research cards, outline editor, diff preview, publish dialog.
|
|
- HITL: `renderAndWaitForResponse` for approvals at outline, diff apply, and publish steps.
|
|
- References: CopilotKit docs — Frontend Actions, Generative UI, Suggestions, HITL.
|
|
|
|
Persistence
|
|
- Persist outline, per-section content, references, persona snapshot, SEO state, metadata drafts.
|
|
- Auto-save every 30s; version history for undo.
|
|
|
|
---
|
|
|
|
## 4) Backend APIs ✅ **FULLY IMPLEMENTED**
|
|
|
|
**✅ IMPLEMENTED BLOG ENDPOINTS:**
|
|
- `POST /api/blog/research/start` → async research with progress tracking
|
|
- `GET /api/blog/research/status/{task_id}` → research progress status
|
|
- `POST /api/blog/outline/start` → async outline generation with progress
|
|
- `GET /api/blog/outline/status/{task_id}` → outline progress status
|
|
- `POST /api/blog/outline/refine` → outline refinement operations
|
|
- `POST /api/blog/outline/rebalance` → word count rebalancing
|
|
- `POST /api/blog/section/generate` → section content generation
|
|
- `POST /api/blog/section/optimize` → content optimization
|
|
- `POST /api/blog/quality/hallucination-check` → hallucination detection
|
|
- `POST /api/blog/seo/analyze` → SEO analysis and recommendations
|
|
- `POST /api/blog/seo/metadata` → metadata generation
|
|
- `POST /api/blog/publish` → publishing to platforms
|
|
- `GET /api/blog/health` → service health check
|
|
|
|
**✅ IMPLEMENTED MODULAR ARCHITECTURE:**
|
|
- **Core Service**: `BlogWriterService` - main orchestrator
|
|
- **Research Module**: `ResearchService`, `KeywordAnalyzer`, `CompetitorAnalyzer`, `ContentAngleGenerator`
|
|
- **Outline Module**: `OutlineService`, `OutlineGenerator`, `OutlineOptimizer`, `SectionEnhancer`
|
|
- **Caching System**: Intelligent research result caching with TTL and LRU eviction
|
|
- **Error Handling**: Graceful failure with specific error messages
|
|
|
|
**✅ IMPLEMENTED MODELS:**
|
|
- `BlogResearchRequest`, `BlogResearchResponse`
|
|
- `BlogOutlineRequest`, `BlogOutlineResponse`, `BlogOutlineRefineRequest`
|
|
- `BlogSectionRequest`, `BlogSectionResponse`
|
|
- `BlogOptimizeRequest`, `BlogOptimizeResponse`
|
|
- `BlogSEOAnalyzeRequest`, `BlogSEOAnalyzeResponse`
|
|
- `BlogSEOMetadataRequest`, `BlogSEOMetadataResponse`
|
|
- `BlogPublishRequest`, `BlogPublishResponse`
|
|
- `HallucinationCheckRequest`, `HallucinationCheckResponse`
|
|
|
|
**✅ REUSED SERVICES:**
|
|
- `/api/hallucination-detector/*` - hallucination detection integration
|
|
- SEO tools services - wrapped for blog-specific analysis
|
|
|
|
---
|
|
|
|
## 5) CopilotKit Action Inventory ✅ **COMPREHENSIVE IMPLEMENTATION**
|
|
|
|
**✅ RESEARCH ACTIONS (FULLY IMPLEMENTED):**
|
|
- `researchTopic(keywords, industry, target_audience, blogLength)` → comprehensive research
|
|
- `chatWithResearchData(question)` → interactive research exploration
|
|
- `getResearchKeywords()` → HITL keyword collection form
|
|
- `performResearch(formData)` → research execution with form data
|
|
|
|
**✅ PLANNING ACTIONS (FULLY IMPLEMENTED):**
|
|
- `generateOutline()` → AI-powered outline generation
|
|
- `createOutlineWithCustomInputs(customInstructions)` → custom outline creation
|
|
- `refineOutline(operation, sectionId, payload)` → outline refinement operations
|
|
- `enhanceSection(sectionId, focus)` → section enhancement
|
|
- `optimizeOutline(focus)` → outline optimization
|
|
- `rebalanceOutline(targetWords)` → word count rebalancing
|
|
|
|
**🔄 GENERATION ACTIONS (PARTIALLY IMPLEMENTED):**
|
|
- `generateSection(sectionId)` → section content generation ✅
|
|
- `generateAllSections()` → bulk generation (placeholder) 🔄
|
|
- `runHallucinationCheck()` → hallucination detection ✅
|
|
|
|
**🔄 SEO ACTIONS (PARTIALLY IMPLEMENTED):**
|
|
- `runSEOAnalyze(keywords)` → SEO analysis ✅
|
|
- `generateSEOMetadata(title)` → metadata generation ✅
|
|
|
|
**🔄 PUBLISHING ACTIONS (PARTIALLY IMPLEMENTED):**
|
|
- `publishToPlatform(platform, schedule)` → publishing (placeholder) 🔄
|
|
|
|
**✅ UX/RENDER-ONLY/HITL (FULLY IMPLEMENTED):**
|
|
- `ResearchResults` → research data visualization
|
|
- `EnhancedOutlineEditor` → interactive outline management
|
|
- `KeywordInputForm` → HITL keyword collection
|
|
- `CustomOutlineForm` → HITL custom outline creation
|
|
- `TitleSelector` → title selection and creation
|
|
- `DiffPreview` → content diff visualization
|
|
- `SEOMiniPanel` → SEO analysis display
|
|
|
|
---
|
|
|
|
## 6) Intelligent Suggestions (states)
|
|
|
|
Before research
|
|
- “Load persona”, “Analyze keywords”, “Research topic”
|
|
|
|
After research
|
|
- “Generate outline”, “Add competitor H2s”, “Attach sources”
|
|
|
|
Outline ready
|
|
- “Generate [Section 1]”, “…”, “Generate all sections”
|
|
|
|
Draft ready
|
|
- “Run fact-check”, “Run SEO analysis”, “Generate metadata”
|
|
|
|
Final
|
|
- “Publish to WordPress”, “Schedule on Wix”
|
|
|
|
---
|
|
|
|
## 7) Delivery Plan / Milestones ✅ **UPDATED STATUS**
|
|
|
|
**✅ MILESTONE 1: Research + Outline (COMPLETED)**
|
|
- ✅ Actions: research topic, generate outline, outline editor (HITL)
|
|
- ✅ Google Search grounding integration
|
|
- ✅ AI-powered keyword and competitor analysis
|
|
- ✅ Interactive outline editor with refinement capabilities
|
|
- ✅ Research data visualization and exploration
|
|
|
|
**🔄 MILESTONE 2: Section Generation + Quality (IN PROGRESS)**
|
|
- ✅ generateSection (basic implementation)
|
|
- 🔄 generateAllSections (needs full implementation)
|
|
- 🔄 optimizeSection with diff preview (needs integration)
|
|
- ✅ hallucination check integration
|
|
- 📋 Content quality improvements and optimization
|
|
|
|
**🔄 MILESTONE 3: SEO & Metadata (IN PROGRESS)**
|
|
- ✅ analyzeSEO panel (basic implementation)
|
|
- ✅ generateSEOMetadata (title/meta generation)
|
|
- 📋 Advanced SEO recommendations and fixes
|
|
- 📋 Schema markup and social media optimization
|
|
|
|
**📋 MILESTONE 4: Publishing (TODO)**
|
|
- 📋 prepareForPublish functionality
|
|
- 📋 publishToPlatform (Wix/WordPress integration)
|
|
- 📋 Scheduling and publishing workflow
|
|
- 📋 Success URL and status tracking
|
|
|
|
**📋 MILESTONE 5: Polish (TODO)**
|
|
- 📋 Advanced readability aids
|
|
- 📋 Version history and auto-save
|
|
- 📋 Performance optimization
|
|
- 📋 Accessibility improvements
|
|
|
|
---
|
|
|
|
## 8) Current Architecture & Implementation Details
|
|
|
|
### 🏗️ **Backend Architecture (Modular & Production-Ready)**
|
|
|
|
**Core Service Structure:**
|
|
```
|
|
backend/services/blog_writer/
|
|
├── core/
|
|
│ └── blog_writer_service.py # Main orchestrator
|
|
├── research/
|
|
│ ├── research_service.py # Research orchestration
|
|
│ ├── keyword_analyzer.py # AI keyword analysis
|
|
│ ├── competitor_analyzer.py # Competitor intelligence
|
|
│ └── content_angle_generator.py # Content angle discovery
|
|
├── outline/
|
|
│ ├── outline_service.py # Outline orchestration
|
|
│ ├── outline_generator.py # AI outline generation
|
|
│ ├── outline_optimizer.py # Outline optimization
|
|
│ └── section_enhancer.py # Section enhancement
|
|
└── blog_service.py # Entry point (thin wrapper)
|
|
```
|
|
|
|
**Key Features:**
|
|
- **No Fallback Data**: Only real AI-generated insights or graceful failures
|
|
- **Intelligent Caching**: Research result caching with TTL and LRU eviction
|
|
- **Error Handling**: Specific error messages and retry logic
|
|
- **Progress Tracking**: Real-time progress updates for long-running operations
|
|
|
|
### 🎨 **Frontend Architecture (CopilotKit-First)**
|
|
|
|
**Component Structure:**
|
|
```
|
|
frontend/src/components/BlogWriter/
|
|
├── BlogWriter.tsx # Main orchestrator component
|
|
├── ResearchAction.tsx # Research CopilotKit actions
|
|
├── ResearchResults.tsx # Research data visualization
|
|
├── KeywordInputForm.tsx # HITL keyword collection
|
|
├── EnhancedOutlineEditor.tsx # Interactive outline editor
|
|
├── TitleSelector.tsx # Title selection and creation
|
|
├── CustomOutlineForm.tsx # HITL custom outline creation
|
|
├── ResearchDataActions.tsx # Research data interaction
|
|
├── EnhancedOutlineActions.tsx # Outline management actions
|
|
├── DiffPreview.tsx # Content diff visualization
|
|
└── SEOMiniPanel.tsx # SEO analysis display
|
|
```
|
|
|
|
**Key Features:**
|
|
- **CopilotKit Integration**: Full action system with HITL components
|
|
- **Real-time Updates**: Progress messages and status tracking
|
|
- **Interactive UI**: Drag-and-drop, expandable sections, visual feedback
|
|
- **Error Handling**: User-friendly error messages and recovery
|
|
|
|
### 🔧 **Technical Implementation Highlights**
|
|
|
|
**Research Phase:**
|
|
- Single Gemini API call with Google Search grounding
|
|
- AI-powered analysis of keywords, competitors, and content angles
|
|
- Intelligent caching to reduce API costs
|
|
- No fallback data - only real AI insights
|
|
|
|
**Outline Phase:**
|
|
- Research-driven outline generation
|
|
- Interactive outline editor with full CRUD operations
|
|
- AI-powered section enhancement and optimization
|
|
- Word count rebalancing and distribution
|
|
|
|
**Quality Assurance:**
|
|
- Robust error handling with specific messages
|
|
- Progress tracking for long-running operations
|
|
- Graceful failure without misleading data
|
|
- Real-time user feedback and guidance
|
|
|
|
---
|
|
|
|
## 9) References
|
|
- CopilotKit Quickstart, Frontend Actions, Generative UI, HITL, Suggestions
|
|
- Quickstart: https://docs.copilotkit.ai/direct-to-llm/guides/quickstart
|
|
- Frontend Actions: https://docs.copilotkit.ai/frontend-actions
|
|
- Generative UI: https://docs.copilotkit.ai/direct-to-llm/guides/generative-ui
|
|
- Headless + Suggestions + HITL: https://docs.copilotkit.ai/premium/headless-ui
|
|
|
|
---
|
|
|
|
## 9) Notes on Reuse from LinkedIn Writer
|
|
- Research handler; Gemini grounded provider; citation manager; quality analyzer.
|
|
- Hallucination detector + Exa verification endpoints.
|
|
- CopilotKit integration patterns: actions, suggestions, render/HITL, state persistence.
|
|
|
|
|