Files
ALwrity/docs/AI_BLOG_WRITER_IMPLEMENTATION_SPEC.md

19 KiB

AI Blog Writer — Implementation Specification (Copilot-first, Research-led)

Overview

  • Goal: Build a SOTA AI blog writer that guides non-technical users end-to-end: research → outline → section generation → quality/SEO → publishing.
  • Approach: Copilot-first UX using CopilotKit. Reuse LinkedIn assistive writing patterns: Google Search grounding, Exa research, hallucination detector, quality analysis, citations.
  • User Interaction Model: The user only talks to the Copilot; the editor reflects all state and changes via generative UI and HITL confirmations.

🚀 Current Implementation Status (Updated: December 2024)

COMPLETED PHASES:

  • Stage 1: Research & Strategy - FULLY IMPLEMENTED
  • Stage 2: Content Planning (Outline) - FULLY IMPLEMENTED
  • Backend Architecture - MODULAR & PRODUCTION-READY
  • Frontend UI Components - COMPREHENSIVE EDITOR
  • CopilotKit Integration - FULLY FUNCTIONAL

🔄 IN PROGRESS:

  • Stage 3: Content Generation - 🔄 PARTIALLY IMPLEMENTED
  • Stage 4: SEO & Publishing - 🔄 PARTIALLY IMPLEMENTED

📋 TODO:

  • Section-by-section content generation
  • Full SEO optimization pipeline
  • Publishing integrations (Wix/WordPress)
  • Advanced quality checks

Key Principles

  • AI-first, HITL: The assistant leads with intelligent suggestions; the user approves via render-and-wait HITL components where appropriate.
  • Research fidelity: Google grounding + Exa researcher; hallucination detection with claim verification; pervasive citations.
  • Persona-aware: Import blog writing persona from DB and apply it across planning/generation/optimizations.
  • SEO-excellent: Real-time SEO analysis, metadata generation, schema, and image alt handling.
  • Publish-ready: Smooth handoff to Wix/WordPress; preview and scheduling.

1) Workflow (4 Stages)

Stage 1: Research & Strategy (AI Orchestration) FULLY IMPLEMENTED

IMPLEMENTED FEATURES:

  • Google Search Grounding: Single Gemini API call with native Google Search integration
  • Intelligent Caching: Exact keyword match caching to reduce API costs
  • AI-Powered Analysis: Keyword analysis, competitor analysis, content angle generation
  • Robust Error Handling: No fallback data - only real AI-generated insights or graceful failures
  • Progress Tracking: Real-time progress messages during research operations

IMPLEMENTED INPUTS:

  • keywords: string[], industry: string, targetAudience: string, wordCountTarget: number
  • Persona support (basic implementation)

IMPLEMENTED BACKEND/SERVICES:

  • Modular Architecture: ResearchService, KeywordAnalyzer, CompetitorAnalyzer, ContentAngleGenerator
  • Google Grounding: Native Gemini Google Search integration (no Exa dependency)
  • Caching System: Intelligent research result caching with TTL and LRU eviction
  • Error Handling: Graceful failure with specific error messages

IMPLEMENTED COPILOTKIT ACTIONS:

  • researchTopic(keywords, industry, target_audience, blogLength) → comprehensive research with sources
  • chatWithResearchData(question) → interactive research data exploration
  • getResearchKeywords() → HITL keyword collection form
  • performResearch(formData) → research execution with form data

IMPLEMENTED GENERATIVE UI:

  • ResearchResults Component: Sources, credibility scores, keyword analysis, content angles
  • KeywordInputForm: HITL form for keyword collection with blog length selection
  • Progress Messages: Real-time loading states with CopilotKit status system

IMPLEMENTED SUGGESTIONS:

  • "I want to research a topic for my blog" (initial)
  • "Let's proceed to create an Outline" (post-research)
  • "Chat with Research Data" (exploration)
  • "Create outline with custom inputs" (advanced)

Stage 2: Content Planning (AI + Human) FULLY IMPLEMENTED

IMPLEMENTED DELIVERABLES:

  • Structured Outline: H1/H2/H3 hierarchy with per-section key points and target word counts
  • AI-Generated Titles: Multiple title options with SEO optimization
  • Research Integration: Outline sections linked to research sources and keywords
  • Word Count Distribution: Intelligent word allocation across sections

IMPLEMENTED COPILOTKIT ACTIONS:

  • generateOutline() → AI-powered outline generation from research data
  • createOutlineWithCustomInputs(customInstructions) → custom outline with user instructions
  • refineOutline(operation, sectionId, payload) → add/remove/move/merge/rename sections
  • enhanceSection(sectionId, focus) → AI enhancement of individual sections
  • optimizeOutline(focus) → AI optimization of entire outline
  • rebalanceOutline(targetWords) → word count rebalancing across sections

IMPLEMENTED GENERATIVE UI:

  • EnhancedOutlineEditor: Interactive outline editor with expandable sections
  • TitleSelector: AI-generated title options with custom title creation
  • CustomOutlineForm: HITL form for custom outline instructions
  • Section Management: Add, edit, reorder, merge sections with visual feedback
  • Research Integration: Source references and keyword suggestions per section

IMPLEMENTED SUGGESTIONS:

  • "Generate outline" (standard)
  • "Create outline with custom inputs" (advanced)
  • "Enhance section [X]" (section-specific)
  • "Optimize entire outline" (global)
  • "Rebalance word counts" (distribution)

Stage 3: Content Generation (CopilotKit-only, no multi-agent) 🔄 PARTIALLY IMPLEMENTED

🔄 PARTIALLY IMPLEMENTED DELIVERABLES:

  • Section Generation: Basic section generation with markdown output
  • Content Structure: Sectioned markdown with inline citations support
  • Quality Checks: Hallucination detection integration

IMPLEMENTED COPILOTKIT ACTIONS:

  • generateSection(sectionId) → generates content for specific section
  • generateAllSections() → placeholder for bulk generation
  • runHallucinationCheck() → integrates with hallucination detector service

🔄 PARTIALLY IMPLEMENTED UI:

  • Section Editors: Basic markdown editing per section
  • DiffPreview Component: Exists but needs integration
  • Citation System: Basic structure in place

📋 TODO:

  • Full section-by-section content generation
  • Advanced content optimization
  • Inline citation management
  • Content quality improvements
  • Progress tracking for bulk generation

Stage 4: Optimization & Publishing (AI + Human) 🔄 PARTIALLY IMPLEMENTED

🔄 PARTIALLY IMPLEMENTED SEO OPTIMIZATION:

  • SEO Analysis: Basic SEO analysis with keyword density and structure
  • Metadata Generation: Title options and meta description generation
  • SEO Integration: Wraps existing SEO tools services

IMPLEMENTED COPILOTKIT ACTIONS:

  • runSEOAnalyze(keywords) → SEO analysis with scores and recommendations
  • generateSEOMetadata(title) → metadata generation for titles and descriptions
  • publishToPlatform(platform, schedule) → placeholder for publishing

🔄 PARTIALLY IMPLEMENTED UI:

  • SEOMiniPanel: Basic SEO analysis display
  • Metadata Management: Title and description editing

📋 TODO:

  • Full SEO optimization pipeline
  • Advanced SEO recommendations
  • Publishing integrations (Wix/WordPress)
  • Content optimization with diff preview
  • Image alt text and media management
  • Schema markup generation

2) SEO Tools Integration & Metadata

Existing Services to Wrap

  • Meta Description, OpenGraph, Image Alt, On-Page SEO, Technical SEO, Content Strategy (see backend/services/seo_tools/* and docs).

Unified Endpoints

  • POST /api/blog/seo/analyze → { seoScore, density, structure, readability, link suggestions, image alt status, recs }
  • POST /api/blog/seo/metadata → { titleOptions, metaDescriptionOptions, openGraph, twitterCard, schema: { Article, FAQ?, Breadcrumb, Org/Person } }

Editor SEO Panel

  • Live density and distribution, readability (Flesch-Kincaid), heading hierarchy, internal/external link suggestions.
  • One-click “Apply Fix” with diff preview.

Schema

  • Default Article schema; optional FAQ when Q&A snippets exist; Breadcrumb, Organization/Person as applicable.

3) Dedicated Blog Editor Design (Copilot-first)

Layout

  • Left: Markdown Editor (per-section tabs), word count, persona cues, inline citation chips.
  • Right: Live Preview (desktop/mobile), SEO SERP snippet preview, social preview (OG/Twitter).
  • Sidebar Panels: Research (sources, claims), SEO (scores/fixes), Media (AI images + alt text), History (versions).

Core Components

  • BlogResearchCard (render-only): sources, credibility scores, add-to-outline.
  • OutlineEditor (HITL): drag-drop H2/H3, per-section refs and target words.
  • SectionEditor: markdown area with persona/tone badges; per-section SEO mini-score.
  • DiffPreview (HITL): apply/reject AI edits.
  • SEOPanel: density/structure/readability + apply fix.
  • MediaPanel: AI images, compression, automatic alt-text.

CopilotKit Integrations

  • Suggestions: set programmatically (useCopilotChatHeadless_c) or via CopilotSidebar props.
  • Generative UI: useCopilotAction({ render }) for research cards, outline editor, diff preview, publish dialog.
  • HITL: renderAndWaitForResponse for approvals at outline, diff apply, and publish steps.
  • References: CopilotKit docs — Frontend Actions, Generative UI, Suggestions, HITL.

Persistence

  • Persist outline, per-section content, references, persona snapshot, SEO state, metadata drafts.
  • Auto-save every 30s; version history for undo.

4) Backend APIs FULLY IMPLEMENTED

IMPLEMENTED BLOG ENDPOINTS:

  • POST /api/blog/research/start → async research with progress tracking
  • GET /api/blog/research/status/{task_id} → research progress status
  • POST /api/blog/outline/start → async outline generation with progress
  • GET /api/blog/outline/status/{task_id} → outline progress status
  • POST /api/blog/outline/refine → outline refinement operations
  • POST /api/blog/outline/rebalance → word count rebalancing
  • POST /api/blog/section/generate → section content generation
  • POST /api/blog/section/optimize → content optimization
  • POST /api/blog/quality/hallucination-check → hallucination detection
  • POST /api/blog/seo/analyze → SEO analysis and recommendations
  • POST /api/blog/seo/metadata → metadata generation
  • POST /api/blog/publish → publishing to platforms
  • GET /api/blog/health → service health check

IMPLEMENTED MODULAR ARCHITECTURE:

  • Core Service: BlogWriterService - main orchestrator
  • Research Module: ResearchService, KeywordAnalyzer, CompetitorAnalyzer, ContentAngleGenerator
  • Outline Module: OutlineService, OutlineGenerator, OutlineOptimizer, SectionEnhancer
  • Caching System: Intelligent research result caching with TTL and LRU eviction
  • Error Handling: Graceful failure with specific error messages

IMPLEMENTED MODELS:

  • BlogResearchRequest, BlogResearchResponse
  • BlogOutlineRequest, BlogOutlineResponse, BlogOutlineRefineRequest
  • BlogSectionRequest, BlogSectionResponse
  • BlogOptimizeRequest, BlogOptimizeResponse
  • BlogSEOAnalyzeRequest, BlogSEOAnalyzeResponse
  • BlogSEOMetadataRequest, BlogSEOMetadataResponse
  • BlogPublishRequest, BlogPublishResponse
  • HallucinationCheckRequest, HallucinationCheckResponse

REUSED SERVICES:

  • /api/hallucination-detector/* - hallucination detection integration
  • SEO tools services - wrapped for blog-specific analysis

5) CopilotKit Action Inventory COMPREHENSIVE IMPLEMENTATION

RESEARCH ACTIONS (FULLY IMPLEMENTED):

  • researchTopic(keywords, industry, target_audience, blogLength) → comprehensive research
  • chatWithResearchData(question) → interactive research exploration
  • getResearchKeywords() → HITL keyword collection form
  • performResearch(formData) → research execution with form data

PLANNING ACTIONS (FULLY IMPLEMENTED):

  • generateOutline() → AI-powered outline generation
  • createOutlineWithCustomInputs(customInstructions) → custom outline creation
  • refineOutline(operation, sectionId, payload) → outline refinement operations
  • enhanceSection(sectionId, focus) → section enhancement
  • optimizeOutline(focus) → outline optimization
  • rebalanceOutline(targetWords) → word count rebalancing

🔄 GENERATION ACTIONS (PARTIALLY IMPLEMENTED):

  • generateSection(sectionId) → section content generation
  • generateAllSections() → bulk generation (placeholder) 🔄
  • runHallucinationCheck() → hallucination detection

🔄 SEO ACTIONS (PARTIALLY IMPLEMENTED):

  • runSEOAnalyze(keywords) → SEO analysis
  • generateSEOMetadata(title) → metadata generation

🔄 PUBLISHING ACTIONS (PARTIALLY IMPLEMENTED):

  • publishToPlatform(platform, schedule) → publishing (placeholder) 🔄

UX/RENDER-ONLY/HITL (FULLY IMPLEMENTED):

  • ResearchResults → research data visualization
  • EnhancedOutlineEditor → interactive outline management
  • KeywordInputForm → HITL keyword collection
  • CustomOutlineForm → HITL custom outline creation
  • TitleSelector → title selection and creation
  • DiffPreview → content diff visualization
  • SEOMiniPanel → SEO analysis display

6) Intelligent Suggestions (states)

Before research

  • “Load persona”, “Analyze keywords”, “Research topic”

After research

  • “Generate outline”, “Add competitor H2s”, “Attach sources”

Outline ready

  • “Generate [Section 1]”, “…”, “Generate all sections”

Draft ready

  • “Run fact-check”, “Run SEO analysis”, “Generate metadata”

Final

  • “Publish to WordPress”, “Schedule on Wix”

7) Delivery Plan / Milestones UPDATED STATUS

MILESTONE 1: Research + Outline (COMPLETED)

  • Actions: research topic, generate outline, outline editor (HITL)
  • Google Search grounding integration
  • AI-powered keyword and competitor analysis
  • Interactive outline editor with refinement capabilities
  • Research data visualization and exploration

🔄 MILESTONE 2: Section Generation + Quality (IN PROGRESS)

  • generateSection (basic implementation)
  • 🔄 generateAllSections (needs full implementation)
  • 🔄 optimizeSection with diff preview (needs integration)
  • hallucination check integration
  • 📋 Content quality improvements and optimization

🔄 MILESTONE 3: SEO & Metadata (IN PROGRESS)

  • analyzeSEO panel (basic implementation)
  • generateSEOMetadata (title/meta generation)
  • 📋 Advanced SEO recommendations and fixes
  • 📋 Schema markup and social media optimization

📋 MILESTONE 4: Publishing (TODO)

  • 📋 prepareForPublish functionality
  • 📋 publishToPlatform (Wix/WordPress integration)
  • 📋 Scheduling and publishing workflow
  • 📋 Success URL and status tracking

📋 MILESTONE 5: Polish (TODO)

  • 📋 Advanced readability aids
  • 📋 Version history and auto-save
  • 📋 Performance optimization
  • 📋 Accessibility improvements

8) Current Architecture & Implementation Details

🏗️ Backend Architecture (Modular & Production-Ready)

Core Service Structure:

backend/services/blog_writer/
├── core/
│   └── blog_writer_service.py     # Main orchestrator
├── research/
│   ├── research_service.py        # Research orchestration
│   ├── keyword_analyzer.py        # AI keyword analysis
│   ├── competitor_analyzer.py     # Competitor intelligence
│   └── content_angle_generator.py # Content angle discovery
├── outline/
│   ├── outline_service.py         # Outline orchestration
│   ├── outline_generator.py       # AI outline generation
│   ├── outline_optimizer.py       # Outline optimization
│   └── section_enhancer.py        # Section enhancement
└── blog_service.py                # Entry point (thin wrapper)

Key Features:

  • No Fallback Data: Only real AI-generated insights or graceful failures
  • Intelligent Caching: Research result caching with TTL and LRU eviction
  • Error Handling: Specific error messages and retry logic
  • Progress Tracking: Real-time progress updates for long-running operations

🎨 Frontend Architecture (CopilotKit-First)

Component Structure:

frontend/src/components/BlogWriter/
├── BlogWriter.tsx                 # Main orchestrator component
├── ResearchAction.tsx             # Research CopilotKit actions
├── ResearchResults.tsx            # Research data visualization
├── KeywordInputForm.tsx           # HITL keyword collection
├── EnhancedOutlineEditor.tsx      # Interactive outline editor
├── TitleSelector.tsx              # Title selection and creation
├── CustomOutlineForm.tsx          # HITL custom outline creation
├── ResearchDataActions.tsx        # Research data interaction
├── EnhancedOutlineActions.tsx     # Outline management actions
├── DiffPreview.tsx                # Content diff visualization
└── SEOMiniPanel.tsx               # SEO analysis display

Key Features:

  • CopilotKit Integration: Full action system with HITL components
  • Real-time Updates: Progress messages and status tracking
  • Interactive UI: Drag-and-drop, expandable sections, visual feedback
  • Error Handling: User-friendly error messages and recovery

🔧 Technical Implementation Highlights

Research Phase:

  • Single Gemini API call with Google Search grounding
  • AI-powered analysis of keywords, competitors, and content angles
  • Intelligent caching to reduce API costs
  • No fallback data - only real AI insights

Outline Phase:

  • Research-driven outline generation
  • Interactive outline editor with full CRUD operations
  • AI-powered section enhancement and optimization
  • Word count rebalancing and distribution

Quality Assurance:

  • Robust error handling with specific messages
  • Progress tracking for long-running operations
  • Graceful failure without misleading data
  • Real-time user feedback and guidance

9) References


9) Notes on Reuse from LinkedIn Writer

  • Research handler; Gemini grounded provider; citation manager; quality analyzer.
  • Hallucination detector + Exa verification endpoints.
  • CopilotKit integration patterns: actions, suggestions, render/HITL, state persistence.