Files
ALwrity/docs/Content Plan/FIX_STEP_6_DATA_RETRIEVAL.md

4.6 KiB

Fix: Step 6 Data Retrieval Issue

Problem

Step 6 (FinalStep) was not retrieving data from previous steps (1-5) even though the data was saved in the database. The backend API endpoints were returning null for:

  • website_url
  • style_analysis
  • research_preferences
  • personalization_settings

Root Cause

Database Schema Mismatch: The onboarding_sessions table had user_id defined as INTEGER, but the application was using Clerk user IDs which are strings (e.g., user_33Gz1FPI86VDXhRY8QN4ragRFGN).

# OLD (INCORRECT)
class OnboardingSession(Base):
    user_id = Column(Integer, nullable=False)  # ❌ Can't store string IDs

# NEW (CORRECT)
class OnboardingSession(Base):
    user_id = Column(String(255), nullable=False, index=True)  # ✅ Supports Clerk IDs

This caused:

  1. Failed Queries: SQLAlchemy couldn't match string user_ids against integer column
  2. Null Results: Queries returned no results, causing Step 6 to show null for all data
  3. Orphaned Data: Previous steps' data was saved but couldn't be retrieved

Solution

1. Updated Database Model

File: backend/models/onboarding.py

class OnboardingSession(Base):
    __tablename__ = 'onboarding_sessions'
    id = Column(Integer, primary_key=True, autoincrement=True)
    user_id = Column(String(255), nullable=False, index=True)  # Changed from Integer to String
    current_step = Column(Integer, default=1)
    progress = Column(Float, default=0.0)
    # ... rest of fields

2. Updated Summary Service

File: backend/api/onboarding_utils/onboarding_summary_service.py

The service now properly queries the database using the Clerk user ID string:

def __init__(self, user_id: str):
    from services.onboarding_database_service import OnboardingDatabaseService
    
    self.user_id = user_id  # Store original Clerk ID
    
    # Get the session for this user to get the session_id
    try:
        db = next(get_db())
        db_service = OnboardingDatabaseService(db)
        session = db_service.get_session_by_user(user_id, db)
        self.session_id = session.id if session else None
    except Exception as e:
        logger.error(f"Error getting session for user {user_id}: {e}")
        self.session_id = None

3. Database Migration

File: backend/scripts/migrate_user_id_to_string.py

A migration script was created and executed to:

  1. Backup existing data
  2. Drop the old table
  3. Recreate with VARCHAR user_id
  4. Restore data (converting any integer IDs to strings)

Command:

python backend/scripts/migrate_user_id_to_string.py

Testing

After the fix, Step 6 should correctly retrieve:

  1. API Keys: From Step 1
  2. Website Analysis: From Step 2 (website_url, style_analysis)
  3. Research Preferences: From Step 3
  4. Persona Data: From Step 4
  5. Integration Settings: From Step 5

Verification

Check backend logs for:

OnboardingSummaryService initialized for user user_33Gz1FPI86VDXhRY8QN4ragRFGN, session_id: 1

Check frontend for:

FinalStep: Summary data: {
  api_keys: {...},  // ✅ Should have data
  website_url: "https://alwrity.com",  // ✅ Should NOT be null
  research_preferences: {...},  // ✅ Should have data
  // ...
}

Files Changed

  1. backend/models/onboarding.py - Updated user_id column type
  2. backend/api/onboarding_utils/onboarding_summary_service.py - Fixed initialization logic
  3. backend/scripts/migrate_user_id_to_string.py - Created migration script
  4. backend/database/migrations/update_onboarding_user_id_to_string.sql - SQL migration script

Migration Status

Migration Completed Successfully (2025-10-11)

  • Old table backed up
  • New schema created with VARCHAR(255) user_id
  • Data restored (0 records affected)
  • Index created for performance

Important Notes

  • User Isolation: All queries now use the Clerk user ID string for proper isolation
  • Backward Compatibility: Existing integer IDs are automatically converted to strings
  • Performance: Added index on user_id column for faster lookups
  • Production Deployment: This migration must be run before deploying to Vercel/Render

Next Steps

  1. Database schema updated
  2. Migration script executed
  3. 🔄 Test Step 6 data retrieval
  4. 🔄 Verify all previous steps still save correctly
  5. 🔄 Deploy to production with migration

Rollback Plan

If needed, the backup table can be restored:

-- Restore old table from backup (if backup exists)
DROP TABLE onboarding_sessions;
ALTER TABLE onboarding_sessions_backup RENAME TO onboarding_sessions;

However, this would revert to the broken state where Clerk IDs don't work.