Step 2 Website Analysis Data Transformation Fix
Problem
Step 6 (FinalStep) was not displaying website analysis data. The other onboarding data was saved and retrieved correctly, but website analysis never reached the database:
- API Keys were successfully saved and retrieved ✅
- Research Preferences were successfully saved and retrieved ✅
- Persona Data was successfully saved and retrieved ✅
- Website Analysis was NOT being saved to the database ❌
Root Cause
Data Structure Mismatch between frontend and backend:
Frontend Data Structure (WebsiteStep.tsx)
const stepData = {
  website: "https://example.com", // ← Note: "website", not "website_url"
  domainName: "example.com",
  analysis: { // ← Nested object
    writing_style: { ... },
    content_characteristics: { ... },
    target_audience: { ... },
    content_type: { ... },
    // etc.
  },
  useAnalysisForGenAI: true
};
Database Schema Expects (Flat Structure)
{
    'website_url': 'https://example.com', # ← "website_url" at root level
    'writing_style': { ... }, # ← All fields at root level
    'content_characteristics': { ... },
    'target_audience': { ... },
    'content_type': { ... },
    'recommended_settings': { ... },
    'crawl_result': { ... },
    'style_patterns': { ... },
    'style_guidelines': { ... },
    'status': 'completed'
}
The Issue
In backend/services/api_key_manager.py (lines 278-280), the code was passing step.data directly to save_website_analysis():
elif step.step_number == 2: # Website Analysis
    self.db_service.save_website_analysis(self.user_id, step.data, db)
But step.data had this structure:
{
    'website': 'https://example.com',
    'analysis': {
        'writing_style': { ... },
        # ...
    }
}
The database service expected website_url at the root level and all analysis fields flattened, so it couldn't find any of the data and saved an empty record (or didn't save at all).
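To make the failure concrete, the sketch below mimics a reader that looks for the flat-schema keys at the root of the payload. `read_flat_fields` and `EXPECTED_FIELDS` are hypothetical stand-ins (the real `save_website_analysis` implementation is not shown here); the field names are taken from the flat schema above.

```python
from typing import Any, Dict, List

# Root-level keys taken from the flat schema described above (subset).
EXPECTED_FIELDS: List[str] = [
    "website_url",
    "writing_style",
    "content_characteristics",
    "target_audience",
    "content_type",
]


def read_flat_fields(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in: look up each expected key at the root of the payload."""
    return {field: payload.get(field) for field in EXPECTED_FIELDS}


# Payload in the nested shape the frontend sends.
frontend_payload = {
    "website": "https://example.com",
    "analysis": {"writing_style": {"tone": "Professional"}},
}

found = read_flat_fields(frontend_payload)

# No expected key exists at the root, so every lookup comes back as None.
print(found)
```

Every lookup misses because the URL lives under `website` and the analysis fields live one level down under `analysis`.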
Solution
Transform the frontend data structure to match the database schema before saving:
File: backend/services/api_key_manager.py (lines 278-289)
elif step.step_number == 2: # Website Analysis
    # Transform frontend data structure to match database schema
    analysis_for_db = {
        'website_url': step.data.get('website', ''),
        'status': 'completed'
    }
    # Merge analysis fields if they exist
    if 'analysis' in step.data and step.data['analysis']:
        analysis_for_db.update(step.data['analysis'])
    self.db_service.save_website_analysis(self.user_id, analysis_for_db, db)
    logger.info(f"✅ DATABASE: Website analysis saved to database for user {self.user_id}")
What This Does:
- Creates base structure: {'website_url': '...', 'status': 'completed'}
- Flattens nested analysis object: Uses .update() to merge all analysis fields to root level
- Result: Data matches database schema exactly
Example Transformation:
Before (frontend format):
{
    'website': 'https://example.com',
    'analysis': {
        'writing_style': {'tone': 'Professional'},
        'target_audience': {'demographics': ['B2B']}
    }
}
After (database format):
{
    'website_url': 'https://example.com',
    'status': 'completed',
    'writing_style': {'tone': 'Professional'},
    'target_audience': {'demographics': ['B2B']}
}
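The transformation can be checked in isolation with a small standalone function. This is a minimal sketch: `flatten_website_step` is a hypothetical helper name, not a function in the codebase, and it mirrors the logic of the fix in api_key_manager.py.

```python
from typing import Any, Dict


def flatten_website_step(step_data: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: convert the nested frontend payload to the flat database shape."""
    analysis_for_db: Dict[str, Any] = {
        "website_url": step_data.get("website", ""),
        "status": "completed",
    }
    # Merge analysis fields to the root level only when analysis is present and non-empty.
    analysis = step_data.get("analysis")
    if analysis:
        analysis_for_db.update(analysis)
    return analysis_for_db


before = {
    "website": "https://example.com",
    "analysis": {
        "writing_style": {"tone": "Professional"},
        "target_audience": {"demographics": ["B2B"]},
    },
}

after = flatten_website_step(before)
print(after)
```

Because `.update()` runs after the base keys are set, a `status` or `website_url` key inside `analysis` would overwrite the base value. If that is not wanted, the base keys could be applied last instead.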
Testing
To verify the fix:
- Restart the backend server to load the updated code
- Complete Step 2 (Website Analysis) in the onboarding flow
- Check backend logs for: ✅ DATABASE: Website analysis saved to database for user {user_id}
- Navigate to Step 6 (FinalStep)
- Verify website URL and style analysis are displayed
Expected Backend Logs After Fix:
INFO|api_key_manager.py:289|✅ DATABASE: Website analysis saved to database for user {user_id}
INFO|onboarding_summary_service.py:85|Retrieved website analysis from database for user {user_id}
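The save path can also be exercised without restarting the server by substituting a fake database service. In the sketch below, `FakeDbService`, `save_step_2`, and the `user-123` id are hypothetical names introduced for illustration; only the `save_website_analysis(user_id, data, db)` call shape comes from the code in this document.

```python
from typing import Any, Dict, List, Optional, Tuple


class FakeDbService:
    """Hypothetical test double that records what would be written to the database."""

    def __init__(self) -> None:
        self.saved: List[Tuple[str, Dict[str, Any]]] = []

    def save_website_analysis(
        self, user_id: str, analysis: Dict[str, Any], db: Optional[object]
    ) -> None:
        self.saved.append((user_id, analysis))


def save_step_2(
    db_service: FakeDbService,
    user_id: str,
    step_data: Dict[str, Any],
    db: Optional[object] = None,
) -> None:
    """Hypothetical wrapper mirroring the Step 2 branch of the fix."""
    analysis_for_db: Dict[str, Any] = {
        "website_url": step_data.get("website", ""),
        "status": "completed",
    }
    if step_data.get("analysis"):
        analysis_for_db.update(step_data["analysis"])
    db_service.save_website_analysis(user_id, analysis_for_db, db)


def test_step_2_saves_flat_record() -> None:
    fake = FakeDbService()
    save_step_2(
        fake,
        "user-123",
        {
            "website": "https://example.com",
            "analysis": {"writing_style": {"tone": "Professional"}},
        },
    )
    user_id, record = fake.saved[0]
    assert user_id == "user-123"
    assert record["website_url"] == "https://example.com"
    assert record["writing_style"] == {"tone": "Professional"}
    assert "analysis" not in record  # nothing left nested


test_step_2_saves_flat_record()
```

A test like this only covers the transformation and the call shape; the manual steps above are still needed to confirm that Step 6 reads the saved record.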
Related Files
- frontend/src/components/OnboardingWizard/WebsiteStep.tsx - Frontend data structure
- backend/services/api_key_manager.py - Data transformation logic
- backend/services/onboarding_database_service.py - Database save/retrieve methods
- backend/models/onboarding.py - WebsiteAnalysis model schema
Why This Pattern?
This is a common issue in full-stack applications where:
- Frontend optimizes for UI structure (nested for component organization)
- Database optimizes for query performance (flat for indexing)
- Backend middleware transforms between the two
Alternative Solutions Considered
Option 1: Change Frontend Structure
❌ Rejected: Would break all existing Step 2 components and localStorage caching
Option 2: Change Database Schema
❌ Rejected: Would require complex JSON queries and lose type safety
Option 3: Transform in Middleware (Selected) ✅
✅ Best: Minimal code change, maintains backward compatibility, clear separation of concerns
Future Improvements
Consider adding a data transformation layer for all onboarding steps to handle similar mismatches proactively:
from typing import Any, Dict

class OnboardingDataTransformer:
    @staticmethod
    def transform_step_2(frontend_data: Dict[str, Any]) -> Dict[str, Any]:
        """Transform Step 2 data from frontend to database format."""
        # `or {}` guards against a missing or null 'analysis' value
        analysis = frontend_data.get('analysis') or {}
        return {
            'website_url': frontend_data.get('website', ''),
            'status': 'completed',
            **analysis
        }
This would centralize all data transformations and make the codebase more maintainable.
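One way such a layer might dispatch by step number is a lookup table of transformer functions. This is only a sketch under assumptions: `STEP_TRANSFORMERS` and `transform_for_db` are hypothetical names, and passing other steps through unchanged is an assumed default rather than confirmed behavior.

```python
from typing import Any, Callable, Dict

StepData = Dict[str, Any]


def transform_step_2(frontend_data: StepData) -> StepData:
    """Flatten the Step 2 payload into the database shape."""
    analysis = frontend_data.get("analysis") or {}
    return {
        "website_url": frontend_data.get("website", ""),
        "status": "completed",
        **analysis,
    }


# Hypothetical registry: step number -> transformer. Only Step 2 is known to need one.
STEP_TRANSFORMERS: Dict[int, Callable[[StepData], StepData]] = {
    2: transform_step_2,
}


def transform_for_db(step_number: int, frontend_data: StepData) -> StepData:
    """Apply the registered transformer, or return a shallow copy if none is registered."""
    transformer = STEP_TRANSFORMERS.get(step_number)
    if transformer is None:
        return dict(frontend_data)
    return transformer(frontend_data)


print(transform_for_db(2, {"website": "https://example.com", "analysis": {"content_type": {}}}))
```

With a table like this, handling a mismatch in another step means registering one more function rather than adding another branch to the save logic.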
Status
✅ Fixed: Website analysis data is now transformed to the database format before saving
⏳ Pending: Restart backend and test with actual user flow