Auto-Population Code Walkthrough

Overview

This document walks through the code behind the auto-population feature, which fills 30 strategy input fields from onboarding data and, in the optional refresh flow, AI-generated values.

Table of Contents

  1. Flow Overview
  2. Frontend Flow
  3. Backend Flow
  4. Database Tables Used
  5. Field Mapping
  6. AI Integration
  7. API Calls and Subscription Checks

Flow Overview

High-Level Flow

User Clicks "Auto-Populate Fields" 
  ↓
Frontend: AutoPopulationConsentModal (User Consent)
  ↓
Frontend: strategyBuilderStore.autoPopulateFromOnboarding()
  ↓
Frontend: API Call to /api/content-planning/enhanced-strategies/onboarding-data
  ↓
Backend: utility_endpoints.py → get_onboarding_data()
  ↓
Backend: EnhancedStrategyService._get_onboarding_data()
  ↓
Backend: DataProcessorService.get_onboarding_data()
  ↓
Backend: AutoFillService.get_autofill()
  ↓
Backend: OnboardingDataIntegrationService.process_onboarding_data() (Database Queries)
  ↓
Backend: AutoFillService.get_autofill() → Normalizers + Transformers
  ↓
Backend: AIStructuredAutofillService.generate_autofill_fields() (AI Generation, refresh flow only)
  ↓
Backend: AIServiceManager.execute_structured_json_call() (AI API Call, refresh flow only)
  ↓
Backend: Response with 30 fields
  ↓
Frontend: Store fields in strategyBuilderStore
  ↓
Frontend: Display fields in ContentStrategyBuilder

Frontend Flow

1. AutoPopulationConsentModal Component

File: frontend/src/components/ContentPlanningDashboard/components/AutoPopulationConsentModal.tsx

  • Purpose: Explains auto-population to non-technical users (content creators, digital marketers, solopreneurs)
  • Features:
    • Clear explanation of what auto-population does
    • Benefits (Instant Setup, AI-Powered Insights, Your Data Your Control, Always Editable)
    • Data sources used (Website Analysis, Research Preferences, Business Details, AI Analysis)
    • Two buttons: "Skip Auto-Population" (Cancel) and "Auto-Populate Fields" (Confirm)

2. ContentStrategyBuilder Component

File: frontend/src/components/ContentPlanningDashboard/components/ContentStrategyBuilder.tsx

Key Changes:

  • Removed automatic useEffect that triggered auto-population on mount
  • Added consent modal state: showAutoPopulationConsentModal
  • Added consent tracking: autoPopulateConsentAsked (persisted in sessionStorage)
  • Modal shows on first mount (with 500ms delay for rendering)
  • Auto-population only triggers after user clicks "Auto-Populate Fields"

State Management:

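// Controls whether the consent modal is currently visible.
const [showAutoPopulationConsentModal, setShowAutoPopulationConsentModal] = useState(false);
// True once the user has answered the modal; the initial value is restored from sessionStorage.
const [autoPopulateConsentAsked, setAutoPopulateConsentAsked] = useState(() => {
  return sessionStorage.getItem('autoPopulateConsentAsked') === 'true';
});
// Tracks whether auto-population has been attempted.
const [autoPopulateAttempted, setAutoPopulateAttempted] = useState(false);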

Consent Handlers:

  • handleAutoPopulationConsent(): Triggers auto-population, saves consent to sessionStorage
  • handleAutoPopulationCancel(): Skips auto-population, saves consent to sessionStorage

3. Strategy Builder Store

File: frontend/src/stores/strategyBuilderStore.ts

Function: autoPopulateFromOnboarding(forceRefresh?: boolean)

Steps:

  1. Global Protection: Checks isAutoPopulating flag to prevent multiple simultaneous calls
  2. Validation: Checks if already populated (unless forceRefresh)
  3. API Call: Calls contentPlanningApi.getOnboardingData()
  4. Response Processing:
    • Extracts fields, sources, input_data_points from response
    • Validates AI generation success (meta.ai_used and meta.ai_overrides_count > 0)
    • Transforms field values and stores in:
      • fieldValues: Form data
      • autoPopulatedFields: Tracking which fields were auto-populated
      • personalizationData: User data used
      • confidenceScores: AI confidence scores
  5. State Update: Updates store with populated fields

API Endpoint: GET /api/content-planning/enhanced-strategies/onboarding-data

Backend Flow

1. API Endpoint

File: backend/api/content_planning/api/content_strategy/endpoints/utility_endpoints.py

Endpoint: GET /onboarding-data

Authentication: Required (get_current_user)

Flow:

  1. Extracts user_id from authenticated token
  2. Creates EnhancedStrategyDBService and EnhancedStrategyService
  3. Calls enhanced_service._get_onboarding_data(user_id)
  4. Returns response via ResponseBuilder.create_success_response()

2. Enhanced Strategy Service

File: backend/api/content_planning/services/enhanced_strategy_service.py

Method: _get_onboarding_data(user_id: int)

Flow:

  1. Calls core_service.data_processor_service.get_onboarding_data(user_id)
  2. Returns processed onboarding data

3. Data Processor Service

File: backend/api/content_planning/services/content_strategy/utils/data_processors.py

Class: DataProcessorService

Method: async def get_onboarding_data(user_id: int)

Flow:

  1. Creates AutoFillService(db) instance
  2. Calls service.get_autofill(user_id)
  3. Returns comprehensive onboarding data payload
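
The first three backend layers are thin wrappers that pass the call down to AutoFillService. A minimal sketch of that delegation chain follows. The class bodies, constructor arguments, and the get_onboarding_data_endpoint helper are illustrative stand-ins, not the project's actual code, and the returned payload is a placeholder.

```python
import asyncio
from typing import Any, Dict


class AutoFillService:
    """Stand-in for the real service; returns a placeholder payload."""

    def __init__(self, db: Any) -> None:
        self.db = db

    async def get_autofill(self, user_id: int) -> Dict[str, Any]:
        # The real method queries the database and maps 30 fields.
        # "user_id" is echoed here only so the chain can be observed.
        return {"fields": {}, "sources": {}, "user_id": user_id}


class DataProcessorService:
    def __init__(self, db: Any) -> None:
        self.db = db

    async def get_onboarding_data(self, user_id: int) -> Dict[str, Any]:
        # Creates an AutoFillService and delegates to it.
        service = AutoFillService(self.db)
        return await service.get_autofill(user_id)


class EnhancedStrategyService:
    def __init__(self, data_processor: DataProcessorService) -> None:
        self.data_processor = data_processor

    async def _get_onboarding_data(self, user_id: int) -> Dict[str, Any]:
        return await self.data_processor.get_onboarding_data(user_id)


async def get_onboarding_data_endpoint(user_id: int, db: Any = None) -> Dict[str, Any]:
    """Hypothetical endpoint body; the real one builds its response via ResponseBuilder."""
    service = EnhancedStrategyService(DataProcessorService(db))
    data = await service._get_onboarding_data(user_id)
    return {"success": True, "data": data}
```

Calling asyncio.run(get_onboarding_data_endpoint(42)) exercises the whole chain without a database.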

4. AutoFill Service

File: backend/api/content_planning/services/content_strategy/autofill/autofill_service.py

Class: AutoFillService

Method: async def get_autofill(user_id: int)

Steps:

  1. Integration: Calls integration.process_onboarding_data(user_id, db) to collect raw data
  2. Normalization:
    • normalize_website_analysis(website_raw)
    • normalize_research_preferences(research_raw)
    • normalize_api_keys(api_raw)
  3. Quality Assessment:
    • calculate_quality_scores_from_raw()
    • calculate_confidence_from_raw()
    • calculate_data_freshness()
  4. Transformation: Calls transform_to_fields() to map to 30 frontend fields
  5. Transparency:
    • build_data_sources_map() (field → data source mapping)
    • build_input_data_points() (detailed input data points)
  6. Validation: Validates output structure
  7. Return: Returns payload with fields, sources, quality scores, confidence levels, data freshness, input data points

Note: This service does NOT use AI. It only transforms existing onboarding data.
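
The normalize, transform, and transparency steps above can be sketched as a small pipeline. This is an illustration under assumptions: the function names mirror those listed above, but the signatures, the two-field subset, and the "website_analysis" source label are hypothetical.

```python
from typing import Any, Dict, Optional


def normalize_website_analysis(raw: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Coerce a missing or partial record into a dict with predictable keys."""
    raw = raw or {}
    return {
        "content_goals": list(raw.get("content_goals") or []),
        "performance_metrics": dict(raw.get("performance_metrics") or {}),
    }


def transform_to_fields(website: Dict[str, Any]) -> Dict[str, Any]:
    """Map normalized data to frontend fields (two of the 30, for brevity)."""
    fields: Dict[str, Any] = {}
    if website["content_goals"]:
        fields["business_objectives"] = website["content_goals"]
    if website["performance_metrics"]:
        fields["performance_metrics"] = website["performance_metrics"]
    return fields


def build_data_sources_map(fields: Dict[str, Any]) -> Dict[str, str]:
    """Record which source each populated field came from (label is hypothetical)."""
    return {name: "website_analysis" for name in fields}


def get_autofill(website_raw: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Run normalize -> transform -> sources; no AI call is made anywhere."""
    website = normalize_website_analysis(website_raw)
    fields = transform_to_fields(website)
    return {
        "fields": fields,
        "sources": build_data_sources_map(fields),
        "meta": {"ai_used": False},
    }
```

Normalizing first means the transformer never has to handle None or missing keys, which keeps the mapping code simple.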

5. Onboarding Data Integration Service

File: backend/api/content_planning/services/content_strategy/onboarding/data_integration.py

Class: OnboardingDataIntegrationService

Method: async def process_onboarding_data(user_id: int, db: Session)

Database Queries:

  1. Website Analysis:

    • Queries OnboardingSession for latest session
    • Queries WebsiteAnalysis for latest analysis
    • Returns: website_url, content_goals, target_metrics, performance_metrics, competitors, target_audience, writing_style, etc.
  2. Research Preferences:

    • Queries ResearchPreferences for session
    • Returns: research_depth, content_types, target_audience, audience_research, content_preferences, etc.
  3. API Keys:

    • Queries APIKey for user
    • Returns: providers, total_keys, available services
  4. Onboarding Session:

    • Queries OnboardingSession for user
    • Returns: business_size, budget, team_size, timeline, region, etc.

Returns: Integrated data dictionary with all sources

Database Tables Used

1. onboarding_sessions

Columns Used:

  • user_id (filter)
  • id (join key)
  • updated_at (ordering)
  • business_size, budget, team_size, timeline, region, progress

2. website_analyses

Columns Used:

  • session_id (join key)
  • updated_at (ordering)
  • website_url, status, content_goals, target_metrics, performance_metrics, competitors, target_audience, writing_style, content_type, content_characteristics, recommended_settings, style_guidelines

3. research_preferences

Columns Used:

  • session_id (join key)
  • research_depth, content_types, target_audience, audience_research, content_preferences, auto_research, factual_content

4. api_keys

Columns Used:

  • user_id (filter)
  • provider (aggregation)
  • is_active (filter)
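
To make the filter, join key, and ordering columns concrete, here is a sketch of the lookups using the standard-library sqlite3 module. The column types, the sample rows, and the helper names are assumptions for illustration; the real service uses its own models and session handling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE onboarding_sessions (id INTEGER PRIMARY KEY, user_id INTEGER, updated_at TEXT, budget INTEGER);
CREATE TABLE website_analyses (id INTEGER PRIMARY KEY, session_id INTEGER, updated_at TEXT, website_url TEXT);
CREATE TABLE api_keys (id INTEGER PRIMARY KEY, user_id INTEGER, provider TEXT, is_active INTEGER);
INSERT INTO onboarding_sessions VALUES (1, 7, '2024-01-01', 1000), (2, 7, '2024-03-01', 2500);
INSERT INTO website_analyses VALUES (1, 2, '2024-03-02', 'https://example.com');
INSERT INTO api_keys VALUES (1, 7, 'gemini', 1), (2, 7, 'openai', 0);
""")


def latest_session_id(user_id):
    """Filter by user_id, order by updated_at, keep the newest session."""
    row = conn.execute(
        "SELECT id FROM onboarding_sessions WHERE user_id = ? "
        "ORDER BY updated_at DESC LIMIT 1",
        (user_id,),
    ).fetchone()
    return row[0] if row else None


def latest_website_url(session_id):
    """Join on session_id and keep the newest analysis."""
    row = conn.execute(
        "SELECT website_url FROM website_analyses WHERE session_id = ? "
        "ORDER BY updated_at DESC LIMIT 1",
        (session_id,),
    ).fetchone()
    return row[0] if row else None


def active_providers(user_id):
    """Aggregate distinct providers across the user's active keys."""
    rows = conn.execute(
        "SELECT DISTINCT provider FROM api_keys "
        "WHERE user_id = ? AND is_active = 1 ORDER BY provider",
        (user_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

The session lookup runs first because website_analyses and research_preferences are keyed by session_id rather than user_id.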

Field Mapping

30 Fields Mapped to Onboarding Data

File: backend/api/content_planning/services/content_strategy/autofill/transformer.py

Function: transform_to_fields()

Business Context (8 fields)

  1. business_objectives → website.content_goals
  2. target_metrics → website.target_metrics or website.performance_metrics
  3. content_budget → website.content_budget or session.budget
  4. team_size → website.team_size or session.team_size
  5. implementation_timeline → website.implementation_timeline or session.timeline
  6. market_share → website.market_share or derived from performance_metrics
  7. competitive_position → website.competitors (derived)
  8. performance_metrics → website.performance_metrics

Audience Intelligence (6 fields)

  1. content_preferences → research.content_preferences
  2. consumption_patterns → research.audience_intelligence.consumption_patterns
  3. audience_pain_points → research.audience_intelligence.pain_points
  4. buying_journey → research.audience_intelligence.buying_journey
  5. seasonal_trends → Default: ['Q1: Planning', 'Q2: Execution', 'Q3: Optimization', 'Q4: Review']
  6. engagement_metrics → Derived from website.performance_metrics

Competitive Intelligence (5 fields)

  1. top_competitors → website.competitors
  2. competitor_content_strategies → Default: ['Educational content', 'Case studies', 'Thought leadership']
  3. market_gaps → website.content_gaps
  4. industry_trends → research.industry_focus
  5. emerging_trends → research.trend_analysis

Content Strategy (7 fields)

  1. preferred_formats → research.content_types
  2. content_mix → Derived from research.content_types and website.content_goals
  3. content_frequency → research.content_calendar.frequency
  4. optimal_timing → research.content_calendar.timing
  5. quality_metrics → Derived from website.performance_metrics
  6. editorial_guidelines → website.style_guidelines
  7. brand_voice → website.writing_style.tone or session.brand_voice

Performance & Analytics (4 fields)

  1. traffic_sources → Derived from website.performance_metrics
  2. conversion_rates → website.performance_metrics.conversion_rate
  3. content_roi_targets → Derived from session.budget and performance_metrics
  4. ab_testing_capabilities → Derived from session.team_size
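
Many of the mappings above follow an "A or B" fallback pattern, with a fixed default for a few fields. The sketch below shows one way to express that. The first_present helper, the map_fields signature, and the choice to drop unresolved fields are assumptions; the actual transform_to_fields() may behave differently.

```python
from typing import Any, Dict

DEFAULT_SEASONAL_TRENDS = ["Q1: Planning", "Q2: Execution", "Q3: Optimization", "Q4: Review"]


def first_present(*values: Any) -> Any:
    """Return the first value that is not None or empty, else None."""
    for value in values:
        if value not in (None, "", [], {}):
            return value
    return None


def map_fields(website: Dict[str, Any], research: Dict[str, Any],
               session: Dict[str, Any]) -> Dict[str, Any]:
    """Map a handful of the 30 fields; website data wins over session data."""
    fields = {
        "business_objectives": first_present(website.get("content_goals")),
        "content_budget": first_present(website.get("content_budget"), session.get("budget")),
        "team_size": first_present(website.get("team_size"), session.get("team_size")),
        "preferred_formats": first_present(research.get("content_types")),
        "seasonal_trends": DEFAULT_SEASONAL_TRENDS,
    }
    # Drop fields that no source could fill (an assumption of this sketch).
    return {name: value for name, value in fields.items() if value is not None}
```

Listing the website value before the session value encodes the source priority directly in the argument order.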

AI Integration

When AI is Used

File: backend/api/content_planning/services/content_strategy/autofill/ai_refresh.py

Class: AutoFillRefreshService

Critical Clarification: The standard AutoFillService.get_autofill() does NOT use AI. It only transforms existing onboarding data using database queries and simple mappings.

Standard Autofill (Default):

  • Uses AutoFillService.get_autofill() (NO AI)
  • Database queries only (0 tokens)
  • Direct mappings and simple derivations (~80%+ fields)
  • Fast (~100-200ms)
  • Used in standard "Auto-Populate Fields" flow

AI Autofill (Optional - Refresh Flow):

  • Uses AIStructuredAutofillService.generate_autofill_fields() (WITH AI)
  • AI generation (3500-5000 tokens per call, up to 15,000 with retries)
  • Personalized values for missing/incomplete fields
  • Slower (~2-5 seconds per call)
  • Used in "Refresh Data (AI)" flow only

AI is used in:

  • AutoFillRefreshService.build_fresh_payload() (for refresh flows)
  • AIStructuredAutofillService.generate_autofill_fields() (for AI-only generation)

AI Service

File: backend/api/content_planning/services/content_strategy/autofill/ai_structured_autofill.py

Class: AIStructuredAutofillService

Method: async def generate_autofill_fields(user_id: int, context: Dict[str, Any])

Flow:

  1. Context Summary: Builds personalized context from onboarding data
  2. Schema: Builds JSON schema for 30 fields
  3. Prompt: Builds personalized prompt with user's website URL, industry, business size, writing tone, target audience, etc.
  4. AI Call: Calls self.ai.execute_structured_json_call()
    • Service Type: AIServiceType.STRATEGIC_INTELLIGENCE
    • Prompt: Personalized prompt with user context
    • Schema: JSON schema with 30 field definitions
  5. Retry Logic: Up to 2 retries if success rate < 80% or missing fields > 6
  6. Normalization: Normalizes values (numbers, booleans, select options, arrays)
  7. Validation: Ensures all 30 fields are populated
  8. Return: Returns fields with metadata (ai_used, ai_overrides_count, success_rate, attempts)
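
The retry rule in step 5 can be sketched as a loop around the AI call. This is not the service's actual implementation: generate_with_retries and call_ai are hypothetical names, keeping the best attempt is an assumption, and setting ai_overrides_count to the number of populated fields is a guess at its meaning.

```python
from typing import Any, Callable, Dict, List

MAX_RETRIES = 2            # up to 2 retries after the initial call
MIN_SUCCESS_RATE = 0.8     # retry if fewer than 80% of fields are populated
MAX_MISSING_FIELDS = 6     # retry if more than 6 fields are missing


def generate_with_retries(call_ai: Callable[[], Dict[str, Any]],
                          expected_fields: List[str]) -> Dict[str, Any]:
    """Call the AI up to 3 times, keeping the most complete result.

    expected_fields must be non-empty.
    """
    best: Dict[str, Any] = {}
    attempts = 0
    success_rate = 0.0
    for attempts in range(1, MAX_RETRIES + 2):
        result = call_ai()
        populated = {
            name: value for name, value in result.items()
            if name in expected_fields and value not in (None, "", [])
        }
        if len(populated) > len(best):
            best = populated
        missing = len(expected_fields) - len(best)
        success_rate = len(best) / len(expected_fields)
        if success_rate >= MIN_SUCCESS_RATE and missing <= MAX_MISSING_FIELDS:
            break
    return {
        "fields": best,
        "meta": {
            "ai_used": True,
            "ai_overrides_count": len(best),
            "success_rate": success_rate,
            "attempts": attempts,
        },
    }
```

With 30 fields the two thresholds coincide: 80% of 30 is 24 populated fields, which leaves exactly 6 missing.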

AI Service Manager

File: backend/services/ai_service_manager.py (referenced but not in content_planning)

Method: execute_structured_json_call()

Flow:

  1. Gets AI service (via get_service_manager())
  2. Calls main_text_generation() with:
    • Prompt
    • Schema (JSON structure)
    • User ID (for subscription checks)
  3. Subscription Check: Uses user_id for pre-flight subscription validation
  4. Pre-flight Check: Validates subscription limits before API call
  5. API Call: Makes structured JSON call to AI provider (Gemini)
  6. Response: Returns structured JSON with 30 fields

AI Prompts

File: backend/api/content_planning/services/content_strategy/autofill/ai_structured_autofill.py

Method: _build_prompt(context_summary: Dict[str, Any])

Prompt Structure:

  1. Personalized Context:

    • User profile (website URL, business size, region)
    • Content analysis (writing tone, content type, target demographics)
    • Audience insights (pain points, preferences, industry focus)
    • AI recommendations (recommended tone, content type, style guidelines)
    • Research configuration (research depth, content types, auto research)
    • API capabilities (available services, providers)
  2. Instructions:

    • Generate 30 fields personalized for user's website
    • Avoid generic placeholder values
    • Use real insights from website analysis
    • Make each field specific to user's business
  3. Field Examples: Shows example format for all 30 fields

Prompt Length: ~3000-4000 characters (includes context + instructions + examples)

AI Schema

Method: _build_schema()

Schema Structure:

  • Type: OBJECT
  • Properties: 30 field definitions
    • Each field has: type (STRING/NUMBER/BOOLEAN), description
  • Required: All 30 fields
  • Property Ordering: CORE_FIELDS order (critical for consistent JSON output)
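
A schema with this shape could be assembled as shown below. The three entries in CORE_FIELDS, their types and descriptions, and the "propertyOrdering" key name are assumptions for illustration; the real _build_schema() covers all 30 fields and may use different keys.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical subset of CORE_FIELDS as (name, type, description) tuples.
CORE_FIELDS: List[Tuple[str, str, str]] = [
    ("business_objectives", "STRING", "Primary business goals for content"),
    ("content_budget", "NUMBER", "Monthly content budget"),
    ("ab_testing_capabilities", "BOOLEAN", "Whether the team can run A/B tests"),
]


def build_schema(core_fields: List[Tuple[str, str, str]]) -> Dict[str, Any]:
    """Build an OBJECT schema whose required list and ordering follow core_fields."""
    names = [name for name, _, _ in core_fields]
    return {
        "type": "OBJECT",
        "properties": {
            name: {"type": field_type, "description": description}
            for name, field_type, description in core_fields
        },
        "required": names,
        # Assumed key name; it pins the output order to the CORE_FIELDS order.
        "propertyOrdering": names,
    }
```

Deriving both the required list and the ordering from one source list keeps them from drifting apart as fields are added.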

API Calls and Subscription Checks

API Call Flow

  1. Frontend → Backend: GET /api/content-planning/enhanced-strategies/onboarding-data

    • Authentication: Required (Bearer token)
    • User ID: Extracted from token
  2. Backend → Database: Multiple queries (see Database Tables section)

    • No external API calls, only database queries
  3. Backend → AI Service (if using AI):

    • Service: AIServiceManager.execute_structured_json_call()
    • Provider: Gemini (via gemini_provider)
    • Method: main_text_generation()
    • Subscription Check: Pre-flight validation using user_id
    • Pre-flight Check: Validates subscription limits before API call

Subscription and Pre-flight Checks

File: backend/services/ai_service_manager.py (referenced)

Checks Performed:

  1. Subscription Validation:

    • Checks user's subscription tier
    • Validates API usage limits
    • Uses user_id for subscription lookup
  2. Pre-flight Check:

    • Validates request before making API call
    • Checks rate limits
    • Validates token usage estimate
  3. Post-call Tracking:

    • Tracks token usage
    • Updates subscription usage stats
    • Records API calls
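
The check-then-call-then-track sequence might look like the sketch below. UsageState, LimitExceeded, and guarded_call are hypothetical names, and a single token limit stands in for the real tier and rate-limit rules, which this document does not spell out.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class UsageState:
    """Hypothetical per-user usage record."""
    token_limit: int
    tokens_used: int = 0
    api_calls: int = 0


class LimitExceeded(Exception):
    """Raised when the estimated usage would exceed the user's limit."""


def preflight_check(usage: UsageState, estimated_tokens: int) -> None:
    # Reject the request before any AI call is made.
    if usage.tokens_used + estimated_tokens > usage.token_limit:
        raise LimitExceeded("estimated usage exceeds the subscription limit")


def track_usage(usage: UsageState, actual_tokens: int) -> None:
    # Post-call tracking: record the real token count and the call itself.
    usage.tokens_used += actual_tokens
    usage.api_calls += 1


def guarded_call(usage: UsageState, estimated_tokens: int,
                 call: Callable[[], Tuple[Any, int]]) -> Any:
    """Run call() only if the pre-flight check passes, then record usage."""
    preflight_check(usage, estimated_tokens)
    response, actual_tokens = call()
    track_usage(usage, actual_tokens)
    return response
```

Checking against an estimate before the call and recording the actual count afterwards means a rejected request costs nothing.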

Number of API Calls

Standard Flow (default - NO AI):

  • AI Calls: 0 (NO AI USED)
  • External API Calls: 0 (only database queries)
  • Database Queries: 4-5 (OnboardingSession, WebsiteAnalysis, ResearchPreferences, APIKey)
  • Token Usage: 0 tokens
  • Speed: ~100-200ms
  • Used in: Standard "Auto-Populate Fields" flow

AI-Enhanced Flow (optional - WITH AI - refresh flow only):

  • AI Calls: 1-3 (depending on retries)
    • Initial call: 1
    • Retries (if success rate < 80%): up to 2 more
  • Database Queries: 4-5 (same as standard flow)
  • AI Provider: Gemini (via gemini_provider)
  • Token Usage: 3500-5000 tokens per call (up to 15,000 with retries)
  • Speed: ~2-5 seconds per call
  • Used in: "Refresh Data (AI)" flow only (optional)

Token Usage

Estimated Tokens per Call:

  • Input: ~2000-3000 tokens (prompt + context)
  • Output: ~1500-2000 tokens (30 fields JSON)
  • Total: ~3500-5000 tokens per call

With Retries (max 2 retries):

  • Best Case: 3500-5000 tokens (1 call, 100% success)
  • Worst Case: 10500-15000 tokens (3 calls, <80% success each time)
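
The best-case and worst-case totals follow directly from the per-call estimates. A short worked calculation, using only the figures stated above (the token_range helper is an illustrative name):

```python
INPUT_TOKENS = (2000, 3000)    # low and high estimate per call (prompt + context)
OUTPUT_TOKENS = (1500, 2000)   # low and high estimate per call (30-field JSON)
MAX_CALLS = 3                  # 1 initial call + up to 2 retries


def token_range(calls: int) -> tuple:
    """Return the (low, high) total token estimate for a number of calls."""
    low = (INPUT_TOKENS[0] + OUTPUT_TOKENS[0]) * calls
    high = (INPUT_TOKENS[1] + OUTPUT_TOKENS[1]) * calls
    return low, high


best_case = token_range(1)           # (3500, 5000)
worst_case = token_range(MAX_CALLS)  # (10500, 15000)
```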

Summary

Key Points

  1. User Consent: Auto-population now requires explicit user consent via modal
  2. No Auto-Trigger: Removed automatic useEffect that triggered on mount
  3. Database First: Standard autofill uses only database queries (NO AI - 0 tokens)
  4. AI Optional: AI is only used in refresh flows (NOT standard auto-population)
  5. 30 Fields: All 30 strategic input fields are mapped from onboarding data
    • 80%+ are direct database mappings (no AI needed)
    • Standard autofill can fill most fields from database queries
    • AI autofill is optional (only for personalization in refresh flows)
  6. Subscription Checks: All AI calls use user_id for subscription and pre-flight checks
  7. Token Usage:
    • Standard autofill: 0 tokens (database queries only)
    • AI autofill (refresh): 3500-5000 tokens per call (up to 15,000 with retries)
  8. Architecture: Standard autofill is the default (fast, free). AI autofill is optional (personalized, costs tokens).

Data Sources Priority

  1. Website Analysis (highest priority)
  2. Research Preferences
  3. Onboarding Session
  4. API Keys (for capabilities only)
  5. AI Generation (only in refresh flows)

Performance Considerations

  • Standard Flow: Fast (database queries only, ~100-200ms)
  • AI-Enhanced Flow: Slower (AI API calls, ~2-5 seconds per call)
  • Retries: Each retry is another full AI call, so the maximum of 2 retries can raise total latency to about 3x a single call (roughly 6-15 seconds)
  • Caching: Onboarding data is cached (TTL: 30 minutes)
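
A 30-minute TTL cache can be sketched as follows. The document does not say how or where the cache is implemented, so the TTLCache class, its get_or_load method, and the injectable clock are all assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple

CACHE_TTL_SECONDS = 30 * 60  # 30 minutes


class TTLCache:
    """Minimal in-memory cache whose entries expire after a fixed TTL."""

    def __init__(self, ttl: float = CACHE_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        """Return the cached value for key, reloading it once the TTL has passed."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = loader()
        self._entries[key] = (now, value)
        return value
```

Keying the cache by user_id would let repeated auto-population requests within 30 minutes skip the database queries entirely.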