## AI Blog Writer: Implementation Specification (Copilot-first, Research-led)

### Overview
- **Goal**: Build a SOTA AI blog writer that guides non-technical users end-to-end: research → outline → section generation → quality/SEO → publishing.
- **Approach**: Copilot-first UX using CopilotKit. Reuse LinkedIn assistive writing patterns: Google Search grounding, Exa research, hallucination detector, quality analysis, citations.
- **User Interaction Model**: The user only talks to the Copilot; the editor reflects all state and changes via generative UI and HITL confirmations.

### 🚀 **Current Implementation Status** (Updated: December 2024)

**✅ COMPLETED PHASES:**
- **Stage 1: Research & Strategy** - ✅ FULLY IMPLEMENTED
- **Stage 2: Content Planning (Outline)** - ✅ FULLY IMPLEMENTED
- **Backend Architecture** - ✅ MODULAR & PRODUCTION-READY
- **Frontend UI Components** - ✅ COMPREHENSIVE EDITOR
- **CopilotKit Integration** - ✅ FULLY FUNCTIONAL

**🔄 IN PROGRESS:**
- **Stage 3: Content Generation** - 🔄 PARTIALLY IMPLEMENTED
- **Stage 4: SEO & Publishing** - 🔄 PARTIALLY IMPLEMENTED

**📋 TODO:**
- Section-by-section content generation
- Full SEO optimization pipeline
- Publishing integrations (Wix/WordPress)
- Advanced quality checks

### Key Principles
- **AI-first, HITL**: The assistant leads with intelligent suggestions; the user approves via render-and-wait HITL components where appropriate.
- **Research fidelity**: Google grounding + Exa researcher; hallucination detection with claim verification; pervasive citations.
- **Persona-aware**: Import blog writing persona from DB and apply it across planning/generation/optimizations.
- **SEO-excellent**: Real-time SEO analysis, metadata generation, schema, and image alt handling.
- **Publish-ready**: Smooth handoff to Wix/WordPress; preview and scheduling.

---

## 1) Workflow (4 Stages)

### Stage 1: Research & Strategy (AI Orchestration) ✅ **FULLY IMPLEMENTED**

**✅ IMPLEMENTED FEATURES:**
- **Google Search Grounding**: Single Gemini API call with native Google Search integration
- **Intelligent Caching**: Exact keyword match caching to reduce API costs
- **AI-Powered Analysis**: Keyword analysis, competitor analysis, content angle generation
- **Robust Error Handling**: No fallback data - only real AI-generated insights or graceful failures
- **Progress Tracking**: Real-time progress messages during research operations

**✅ IMPLEMENTED INPUTS:**
- `keywords: string[]`, `industry: string`, `targetAudience: string`, `wordCountTarget: number`
- Persona support (basic implementation)

**✅ IMPLEMENTED BACKEND/SERVICES:**
- **Modular Architecture**: `ResearchService`, `KeywordAnalyzer`, `CompetitorAnalyzer`, `ContentAngleGenerator`
- **Google Grounding**: Native Gemini Google Search integration (no Exa dependency)
- **Caching System**: Intelligent research result caching with TTL and LRU eviction
- **Error Handling**: Graceful failure with specific error messages

**✅ IMPLEMENTED COPILOTKIT ACTIONS:**
- `researchTopic(keywords, industry, target_audience, blogLength)` → comprehensive research with sources
- `chatWithResearchData(question)` → interactive research data exploration
- `getResearchKeywords()` → HITL keyword collection form
- `performResearch(formData)` → research execution with form data

**✅ IMPLEMENTED GENERATIVE UI:**
- **ResearchResults Component**: Sources, credibility scores, keyword analysis, content angles
- **KeywordInputForm**: HITL form for keyword collection with blog length selection
- **Progress Messages**: Real-time loading states with CopilotKit status system

**✅ IMPLEMENTED SUGGESTIONS:**
- "I want to research a topic for my blog" (initial)
- "Let's proceed to create an Outline" (post-research)
- "Chat with Research Data" (exploration)
- "Create outline with custom inputs" (advanced)

---

### Stage 2: Content Planning (AI + Human) ✅ **FULLY IMPLEMENTED**

**✅ IMPLEMENTED DELIVERABLES:**
- **Structured Outline**: H1/H2/H3 hierarchy with per-section key points and target word counts
- **AI-Generated Titles**: Multiple title options with SEO optimization
- **Research Integration**: Outline sections linked to research sources and keywords
- **Word Count Distribution**: Intelligent word allocation across sections

**✅ IMPLEMENTED COPILOTKIT ACTIONS:**
- `generateOutline()` → AI-powered outline generation from research data
- `createOutlineWithCustomInputs(customInstructions)` → custom outline with user instructions
- `refineOutline(operation, sectionId, payload)` → add/remove/move/merge/rename sections
- `enhanceSection(sectionId, focus)` → AI enhancement of individual sections
- `optimizeOutline(focus)` → AI optimization of entire outline
- `rebalanceOutline(targetWords)` → word count rebalancing across sections

**✅ IMPLEMENTED GENERATIVE UI:**
- **EnhancedOutlineEditor**: Interactive outline editor with expandable sections
- **TitleSelector**: AI-generated title options with custom title creation
- **CustomOutlineForm**: HITL form for custom outline instructions
- **Section Management**: Add, edit, reorder, merge sections with visual feedback
- **Research Integration**: Source references and keyword suggestions per section

**✅ IMPLEMENTED SUGGESTIONS:**
- "Generate outline" (standard)
- "Create outline with custom inputs" (advanced)
- "Enhance section [X]" (section-specific)
- "Optimize entire outline" (global)
- "Rebalance word counts" (distribution)

---

### Stage 3: Content Generation (CopilotKit-only, no multi-agent) 🔄 **PARTIALLY IMPLEMENTED**

**🔄 PARTIALLY IMPLEMENTED DELIVERABLES:**
- **Section Generation**: Basic section generation with markdown output
- **Content Structure**: Sectioned markdown with inline citations support
- **Quality Checks**: Hallucination detection integration

**✅ IMPLEMENTED COPILOTKIT ACTIONS:**
- `generateSection(sectionId)` → generates content for specific section
- `generateAllSections()` → placeholder for bulk generation
- `runHallucinationCheck()` → integrates with hallucination detector service

**🔄 PARTIALLY IMPLEMENTED UI:**
- **Section Editors**: Basic markdown editing per section
- **DiffPreview Component**: Exists but needs integration
- **Citation System**: Basic structure in place

**📋 TODO:**
- Full section-by-section content generation
- Advanced content optimization
- Inline citation management
- Content quality improvements
- Progress tracking for bulk generation

---

### Stage 4: Optimization & Publishing (AI + Human) 🔄 **PARTIALLY IMPLEMENTED**

**🔄 PARTIALLY IMPLEMENTED SEO OPTIMIZATION:**
- **SEO Analysis**: Basic SEO analysis with keyword density and structure
- **Metadata Generation**: Title options and meta description generation
- **SEO Integration**: Wraps existing SEO tools services

**✅ IMPLEMENTED COPILOTKIT ACTIONS:**
- `runSEOAnalyze(keywords)` → SEO analysis with scores and recommendations
- `generateSEOMetadata(title)` → metadata generation for titles and descriptions
- `publishToPlatform(platform, schedule)` → placeholder for publishing

**🔄 PARTIALLY IMPLEMENTED UI:**
- **SEOMiniPanel**: Basic SEO analysis display
- **Metadata Management**: Title and description editing

**📋 TODO:**
- Full SEO optimization pipeline
- Advanced SEO recommendations
- Publishing integrations (Wix/WordPress)
- Content optimization with diff preview
- Image alt text and media management
- Schema markup generation

---

## 2) SEO Tools Integration & Metadata

Existing Services to Wrap
- Meta Description, OpenGraph, Image Alt, On-Page SEO, Technical SEO, Content Strategy (see `backend/services/seo_tools/*` and docs).
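
To make the "keyword density and structure" signals from Stage 4 concrete, here is a minimal Python sketch of how they could be computed. It is not the code in `backend/services/seo_tools/*`. The function names `keyword_density` and `heading_issues` are hypothetical, and the density definition used here (share of words covered by keyword occurrences, as a percentage) is an assumption, since the spec does not fix one.

```python
import re

# Hypothetical helpers for illustration; the real SEO services may differ.
_WORD = re.compile(r"[a-z0-9']+")


def keyword_density(text: str, keywords: list[str]) -> dict[str, float]:
    """Return, per keyword, the percentage of words covered by its occurrences."""
    words = _WORD.findall(text.lower())
    total = len(words)
    density: dict[str, float] = {}
    for keyword in keywords:
        kw_words = _WORD.findall(keyword.lower())
        n = len(kw_words)
        if total == 0 or n == 0:
            density[keyword] = 0.0
            continue
        # Count every position where the keyword phrase matches word-for-word.
        hits = sum(1 for i in range(total - n + 1) if words[i:i + n] == kw_words)
        density[keyword] = 100.0 * hits * n / total
    return density


def heading_issues(markdown: str) -> list[str]:
    """Flag headings that skip a level (for example an H3 directly under an H1).

    Limitation: lines starting with '#' inside fenced code are not excluded.
    """
    issues: list[str] = []
    previous_level = 0
    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if not match:
            continue
        level = len(match.group(1))
        if previous_level and level > previous_level + 1:
            title = match.group(2).strip()
            issues.append(f"H{level} '{title}' follows H{previous_level}")
        previous_level = level
    return issues
```

A caller could feed both results into the `density` and `structure` fields of the analysis response; how those are weighted into an overall score is left open here.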
Unified Endpoints
- `POST /api/blog/seo/analyze` → { seoScore, density, structure, readability, link suggestions, image alt status, recs }
- `POST /api/blog/seo/metadata` → { titleOptions, metaDescriptionOptions, openGraph, twitterCard, schema: { Article, FAQ?, Breadcrumb, Org/Person } }

Editor SEO Panel
- Live density and distribution, readability (Flesch-Kincaid), heading hierarchy, internal/external link suggestions.
- One-click “Apply Fix” with diff preview.

Schema
- Default Article schema; optional FAQ when Q&A snippets exist; Breadcrumb, Organization/Person as applicable.

---

## 3) Dedicated Blog Editor Design (Copilot-first)

Layout
- Left: Markdown Editor (per-section tabs), word count, persona cues, inline citation chips.
- Right: Live Preview (desktop/mobile), SEO SERP snippet preview, social preview (OG/Twitter).
- Sidebar Panels: Research (sources, claims), SEO (scores/fixes), Media (AI images + alt text), History (versions).

Core Components
- `BlogResearchCard` (render-only): sources, credibility scores, add-to-outline.
- `OutlineEditor` (HITL): drag-drop H2/H3, per-section refs and target words.
- `SectionEditor`: markdown area with persona/tone badges; per-section SEO mini-score.
- `DiffPreview` (HITL): apply/reject AI edits.
- `SEOPanel`: density/structure/readability + apply fix.
- `MediaPanel`: AI images, compression, automatic alt-text.

CopilotKit Integrations
- Suggestions: set programmatically (`useCopilotChatHeadless_c`) or via `CopilotSidebar` props.
- Generative UI: `useCopilotAction({ render })` for research cards, outline editor, diff preview, publish dialog.
- HITL: `renderAndWaitForResponse` for approvals at outline, diff apply, and publish steps.
- References: CopilotKit docs (Frontend Actions, Generative UI, Suggestions, HITL).

Persistence
- Persist outline, per-section content, references, persona snapshot, SEO state, metadata drafts.
- Auto-save every 30s; version history for undo.
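
The persistence rules above (a snapshot of editor state, a 30-second auto-save, and version history for undo) could be modelled roughly as follows. This is a sketch under assumptions, not the project's implementation. `DraftSnapshot`, `DraftHistory`, the field names, and the `max_versions` cap are all hypothetical, and it is written in Python for brevity even though the editor itself is a frontend component.

```python
from __future__ import annotations

import copy
import time
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DraftSnapshot:
    """Hypothetical shape of one persisted editor state."""
    outline: list[dict[str, Any]]
    sections: dict[str, str]  # section id -> markdown content
    references: list[str] = field(default_factory=list)
    persona: dict[str, Any] = field(default_factory=dict)
    seo_state: dict[str, Any] = field(default_factory=dict)
    metadata_drafts: dict[str, Any] = field(default_factory=dict)


class DraftHistory:
    """Bounded version list with interval-based auto-save and single-step undo."""

    def __init__(self, autosave_interval_s: float = 30.0, max_versions: int = 50):
        self.autosave_interval_s = autosave_interval_s
        self.max_versions = max_versions  # assumed cap; the spec names none
        self._versions: list[DraftSnapshot] = []
        self._last_saved_at: float | None = None

    def save(self, snapshot: DraftSnapshot, now: float | None = None) -> None:
        now = time.monotonic() if now is None else now
        # Deep copy so later edits in the editor do not mutate stored versions.
        self._versions.append(copy.deepcopy(snapshot))
        if len(self._versions) > self.max_versions:
            self._versions.pop(0)
        self._last_saved_at = now

    def maybe_autosave(self, snapshot: DraftSnapshot, now: float | None = None) -> bool:
        """Save only if the interval has elapsed and the content changed."""
        now = time.monotonic() if now is None else now
        due = (
            self._last_saved_at is None
            or now - self._last_saved_at >= self.autosave_interval_s
        )
        unchanged = bool(self._versions) and self._versions[-1] == snapshot
        if not due or unchanged:
            return False
        self.save(snapshot, now)
        return True

    def undo(self) -> DraftSnapshot | None:
        """Drop the latest version and return the one before it, if any."""
        if len(self._versions) < 2:
            return None
        self._versions.pop()
        return copy.deepcopy(self._versions[-1])
```

Skipping unchanged snapshots keeps the history from filling with identical versions while the user is idle; whether the real editor does this is not stated in the spec.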

---

## 4) Backend APIs ✅ **FULLY IMPLEMENTED**

**✅ IMPLEMENTED BLOG ENDPOINTS:**
- `POST /api/blog/research/start` → async research with progress tracking
- `GET /api/blog/research/status/{task_id}` → research progress status
- `POST /api/blog/outline/start` → async outline generation with progress
- `GET /api/blog/outline/status/{task_id}` → outline progress status
- `POST /api/blog/outline/refine` → outline refinement operations
- `POST /api/blog/outline/rebalance` → word count rebalancing
- `POST /api/blog/section/generate` → section content generation
- `POST /api/blog/section/optimize` → content optimization
- `POST /api/blog/quality/hallucination-check` → hallucination detection
- `POST /api/blog/seo/analyze` → SEO analysis and recommendations
- `POST /api/blog/seo/metadata` → metadata generation
- `POST /api/blog/publish` → publishing to platforms
- `GET /api/blog/health` → service health check

**✅ IMPLEMENTED MODULAR ARCHITECTURE:**
- **Core Service**: `BlogWriterService` - main orchestrator
- **Research Module**: `ResearchService`, `KeywordAnalyzer`, `CompetitorAnalyzer`, `ContentAngleGenerator`
- **Outline Module**: `OutlineService`, `OutlineGenerator`, `OutlineOptimizer`, `SectionEnhancer`
- **Caching System**: Intelligent research result caching with TTL and LRU eviction
- **Error Handling**: Graceful failure with specific error messages

**✅ IMPLEMENTED MODELS:**
- `BlogResearchRequest`, `BlogResearchResponse`
- `BlogOutlineRequest`, `BlogOutlineResponse`, `BlogOutlineRefineRequest`
- `BlogSectionRequest`, `BlogSectionResponse`
- `BlogOptimizeRequest`, `BlogOptimizeResponse`
- `BlogSEOAnalyzeRequest`, `BlogSEOAnalyzeResponse`
- `BlogSEOMetadataRequest`, `BlogSEOMetadataResponse`
- `BlogPublishRequest`, `BlogPublishResponse`
- `HallucinationCheckRequest`, `HallucinationCheckResponse`

**✅ REUSED SERVICES:**
- `/api/hallucination-detector/*` - hallucination detection integration
- SEO tools services - wrapped for blog-specific analysis

---

## 5) CopilotKit Action Inventory ✅ **COMPREHENSIVE IMPLEMENTATION**

**✅ RESEARCH ACTIONS (FULLY IMPLEMENTED):**
- `researchTopic(keywords, industry, target_audience, blogLength)` → comprehensive research
- `chatWithResearchData(question)` → interactive research exploration
- `getResearchKeywords()` → HITL keyword collection form
- `performResearch(formData)` → research execution with form data

**✅ PLANNING ACTIONS (FULLY IMPLEMENTED):**
- `generateOutline()` → AI-powered outline generation
- `createOutlineWithCustomInputs(customInstructions)` → custom outline creation
- `refineOutline(operation, sectionId, payload)` → outline refinement operations
- `enhanceSection(sectionId, focus)` → section enhancement
- `optimizeOutline(focus)` → outline optimization
- `rebalanceOutline(targetWords)` → word count rebalancing

**🔄 GENERATION ACTIONS (PARTIALLY IMPLEMENTED):**
- `generateSection(sectionId)` → section content generation ✅
- `generateAllSections()` → bulk generation (placeholder) 🔄
- `runHallucinationCheck()` → hallucination detection ✅

**🔄 SEO ACTIONS (PARTIALLY IMPLEMENTED):**
- `runSEOAnalyze(keywords)` → SEO analysis ✅
- `generateSEOMetadata(title)` → metadata generation ✅

**🔄 PUBLISHING ACTIONS (PARTIALLY IMPLEMENTED):**
- `publishToPlatform(platform, schedule)` → publishing (placeholder) 🔄

**✅ UX/RENDER-ONLY/HITL (FULLY IMPLEMENTED):**
- `ResearchResults` → research data visualization
- `EnhancedOutlineEditor` → interactive outline management
- `KeywordInputForm` → HITL keyword collection
- `CustomOutlineForm` → HITL custom outline creation
- `TitleSelector` → title selection and creation
- `DiffPreview` → content diff visualization
- `SEOMiniPanel` → SEO analysis display

---

## 6) Intelligent Suggestions (states)

Before research
- “Load persona”, “Analyze keywords”, “Research topic”

After research
- “Generate outline”, “Add competitor H2s”, “Attach sources”

Outline ready
- “Generate [Section 1]”, “…”, “Generate all sections”

Draft ready
- “Run fact-check”, “Run SEO analysis”, “Generate metadata”

Final
- “Publish to WordPress”, “Schedule on Wix”

---

## 7) Delivery Plan / Milestones ✅ **UPDATED STATUS**

**✅ MILESTONE 1: Research + Outline (COMPLETED)**
- ✅ Actions: research topic, generate outline, outline editor (HITL)
- ✅ Google Search grounding integration
- ✅ AI-powered keyword and competitor analysis
- ✅ Interactive outline editor with refinement capabilities
- ✅ Research data visualization and exploration

**🔄 MILESTONE 2: Section Generation + Quality (IN PROGRESS)**
- ✅ generateSection (basic implementation)
- 🔄 generateAllSections (needs full implementation)
- 🔄 optimizeSection with diff preview (needs integration)
- ✅ hallucination check integration
- 📋 Content quality improvements and optimization

**🔄 MILESTONE 3: SEO & Metadata (IN PROGRESS)**
- ✅ analyzeSEO panel (basic implementation)
- ✅ generateSEOMetadata (title/meta generation)
- 📋 Advanced SEO recommendations and fixes
- 📋 Schema markup and social media optimization

**📋 MILESTONE 4: Publishing (TODO)**
- 📋 prepareForPublish functionality
- 📋 publishToPlatform (Wix/WordPress integration)
- 📋 Scheduling and publishing workflow
- 📋 Success URL and status tracking

**📋 MILESTONE 5: Polish (TODO)**
- 📋 Advanced readability aids
- 📋 Version history and auto-save
- 📋 Performance optimization
- 📋 Accessibility improvements

---

## 8) Current Architecture & Implementation Details

### 🏗️ **Backend Architecture (Modular & Production-Ready)**

**Core Service Structure:**
```
backend/services/blog_writer/
├── core/
│   └── blog_writer_service.py          # Main orchestrator
├── research/
│   ├── research_service.py             # Research orchestration
│   ├── keyword_analyzer.py             # AI keyword analysis
│   ├── competitor_analyzer.py          # Competitor intelligence
│   └── content_angle_generator.py      # Content angle discovery
├── outline/
│   ├── outline_service.py              # Outline orchestration
│   ├── outline_generator.py            # AI outline generation
│   ├── outline_optimizer.py            # Outline optimization
│   └── section_enhancer.py             # Section enhancement
└── blog_service.py                     # Entry point (thin wrapper)
```

**Key Features:**
- **No Fallback Data**: Only real AI-generated insights or graceful failures
- **Intelligent Caching**: Research result caching with TTL and LRU eviction
- **Error Handling**: Specific error messages and retry logic
- **Progress Tracking**: Real-time progress updates for long-running operations

### 🎨 **Frontend Architecture (CopilotKit-First)**

**Component Structure:**
```
frontend/src/components/BlogWriter/
├── BlogWriter.tsx                  # Main orchestrator component
├── ResearchAction.tsx              # Research CopilotKit actions
├── ResearchResults.tsx             # Research data visualization
├── KeywordInputForm.tsx            # HITL keyword collection
├── EnhancedOutlineEditor.tsx       # Interactive outline editor
├── TitleSelector.tsx               # Title selection and creation
├── CustomOutlineForm.tsx           # HITL custom outline creation
├── ResearchDataActions.tsx         # Research data interaction
├── EnhancedOutlineActions.tsx      # Outline management actions
├── DiffPreview.tsx                 # Content diff visualization
└── SEOMiniPanel.tsx                # SEO analysis display
```

**Key Features:**
- **CopilotKit Integration**: Full action system with HITL components
- **Real-time Updates**: Progress messages and status tracking
- **Interactive UI**: Drag-and-drop, expandable sections, visual feedback
- **Error Handling**: User-friendly error messages and recovery

### 🔧 **Technical Implementation Highlights**

**Research Phase:**
- Single Gemini API call with Google Search grounding
- AI-powered analysis of keywords, competitors, and content angles
- Intelligent caching to reduce API costs
- No fallback data - only real AI insights

**Outline Phase:**
- Research-driven outline generation
- Interactive outline editor with full CRUD operations
- AI-powered section enhancement and optimization
- Word count rebalancing and distribution

**Quality Assurance:**
- Robust error handling with specific messages
- Progress tracking for long-running operations
- Graceful failure without misleading data
- Real-time user feedback and guidance

---

## 9) References
- CopilotKit Quickstart, Frontend Actions, Generative UI, HITL, Suggestions
  - Quickstart: https://docs.copilotkit.ai/direct-to-llm/guides/quickstart
  - Frontend Actions: https://docs.copilotkit.ai/frontend-actions
  - Generative UI: https://docs.copilotkit.ai/direct-to-llm/guides/generative-ui
  - Headless + Suggestions + HITL: https://docs.copilotkit.ai/premium/headless-ui

---

## 10) Notes on Reuse from LinkedIn Writer
- Research handler; Gemini grounded provider; citation manager; quality analyzer.
- Hallucination detector + Exa verification endpoints.
- CopilotKit integration patterns: actions, suggestions, render/HITL, state persistence.
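
As a closing illustration of the research cache that this specification mentions repeatedly (exact keyword match, TTL, LRU eviction), here is one possible Python sketch. The class name `ResearchCache`, the key normalization (lower-cased, trimmed, order-insensitive keywords), and the default sizes are assumptions made for this example and may not match the backend's caching system.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class ResearchCache:
    """Hypothetical in-memory cache combining a TTL with LRU eviction."""

    def __init__(
        self,
        max_entries: int = 128,
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._entries: "OrderedDict[tuple, tuple[float, Any]]" = OrderedDict()

    @staticmethod
    def make_key(keywords: list[str], industry: str, target_audience: str) -> tuple:
        # Assumption: keyword order and letter case do not affect a match.
        normalized = tuple(sorted(k.strip().lower() for k in keywords))
        return (normalized, industry.strip().lower(), target_audience.strip().lower())

    def get(self, key: tuple) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self._ttl:
            # Expired: drop the entry and report a miss.
            del self._entries[key]
            return None
        # Mark as most recently used.
        self._entries.move_to_end(key)
        return value

    def put(self, key: tuple, value: Any) -> None:
        self._entries[key] = (self._clock(), value)
        self._entries.move_to_end(key)
        # Evict least recently used entries beyond capacity.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)
```

A research service could call `get` before issuing the Gemini request and `put` after a successful one. Failed calls are not cached in this sketch, in keeping with the "no fallback data" rule.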