Add brand analysis columns to onboarding database and migration scripts
188
docs/STEP_2_WEBSITE_ANALYSIS_DATA_TRANSFORMATION_FIX.md
Normal file
@@ -0,0 +1,188 @@
# Step 2 Website Analysis Data Transformation Fix

## Problem

Step 6 (FinalStep) was not displaying website analysis data. The data from the other onboarding steps was persisted correctly; only the website analysis was missing:
- API Keys were successfully saved and retrieved ✅
- Research Preferences were successfully saved and retrieved ✅
- Persona Data was successfully saved and retrieved ✅
- Website Analysis was **NOT being saved** to the database ❌

## Root Cause

**Data Structure Mismatch** between frontend and backend:

### Frontend Data Structure (WebsiteStep.tsx)

```typescript
const stepData = {
  website: "https://example.com",  // ← Note: "website", not "website_url"
  domainName: "example.com",
  analysis: {                      // ← Nested object
    writing_style: { ... },
    content_characteristics: { ... },
    target_audience: { ... },
    content_type: { ... },
    // etc.
  },
  useAnalysisForGenAI: true
};
```

### Database Schema Expects (Flat Structure)

```python
{
    'website_url': 'https://example.com',  # ← "website_url" at root level
    'writing_style': { ... },              # ← All fields at root level
    'content_characteristics': { ... },
    'target_audience': { ... },
    'content_type': { ... },
    'recommended_settings': { ... },
    'crawl_result': { ... },
    'style_patterns': { ... },
    'style_guidelines': { ... },
    'status': 'completed'
}
```

## The Issue

In `backend/services/api_key_manager.py` (lines 278-280), the code passed `step.data` directly to `save_website_analysis()`:

```python
elif step.step_number == 2:  # Website Analysis
    self.db_service.save_website_analysis(self.user_id, step.data, db)
```

But `step.data` had this structure:

```python
{
    'website': 'https://example.com',
    'analysis': {
        'writing_style': { ... },
        # ...
    }
}
```

The database service expected `website_url` at the root level with all analysis fields flattened alongside it. Because none of those keys existed at the root of `step.data`, it found no data and saved an empty record (or did not save at all).

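
To make the mismatch concrete, here is a minimal sketch of what happens when flat keys are looked up on the nested payload. `extract_flat_fields` and `FLAT_FIELDS` are hypothetical stand-ins, and the assumption that the real `save_website_analysis()` reads its input with `.get()` on root-level keys is not confirmed by this document.

```python
from typing import Any, Dict, Optional

# Hypothetical subset of the flat keys the database layer is assumed to read.
FLAT_FIELDS = ("website_url", "writing_style", "target_audience")


def extract_flat_fields(payload: Dict[str, Any]) -> Dict[str, Optional[Any]]:
    """Look up each flat field at the root of the payload (None when absent)."""
    return {field: payload.get(field) for field in FLAT_FIELDS}


# The shape the frontend sends: URL under "website", fields under "analysis".
nested_step_data = {
    "website": "https://example.com",
    "analysis": {"writing_style": {"tone": "Professional"}},
}

# Every lookup misses, because nothing lives at the root under these names.
found = extract_flat_fields(nested_step_data)
```

With the nested payload, every value in `found` is `None`, which is consistent with the empty (or missing) record described above.
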
## Solution

Transform the frontend data structure to match the database schema before saving:

**File**: `backend/services/api_key_manager.py` (lines 278-289)

```python
elif step.step_number == 2:  # Website Analysis
    # Transform frontend data structure to match database schema
    analysis_for_db = {
        'website_url': step.data.get('website', ''),
        'status': 'completed'
    }
    # Merge analysis fields if they exist
    if 'analysis' in step.data and step.data['analysis']:
        analysis_for_db.update(step.data['analysis'])

    self.db_service.save_website_analysis(self.user_id, analysis_for_db, db)
    logger.info(f"✅ DATABASE: Website analysis saved to database for user {self.user_id}")
```

### What This Does:

1. **Creates base structure**: `{'website_url': '...', 'status': 'completed'}`
2. **Flattens nested `analysis` object**: Uses `.update()` to merge all analysis fields to root level
3. **Result**: Data matches database schema exactly

### Example Transformation:

**Before** (frontend format):
```python
{
    'website': 'https://example.com',
    'analysis': {
        'writing_style': {'tone': 'Professional'},
        'target_audience': {'demographics': ['B2B']}
    }
}
```

**After** (database format):
```python
{
    'website_url': 'https://example.com',
    'status': 'completed',
    'writing_style': {'tone': 'Professional'},
    'target_audience': {'demographics': ['B2B']}
}
```

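
The Before/After pair can be reproduced in isolation. The sketch below pulls the inline logic from the Solution section into a standalone function; the name `flatten_website_analysis` is hypothetical and does not exist in the codebase.

```python
from typing import Any, Dict


def flatten_website_analysis(step_data: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper mirroring the inline Step 2 transformation."""
    analysis_for_db: Dict[str, Any] = {
        "website_url": step_data.get("website", ""),
        "status": "completed",
    }
    # Merge analysis fields only when the key is present and non-empty.
    analysis = step_data.get("analysis")
    if analysis:
        analysis_for_db.update(analysis)
    return analysis_for_db


before = {
    "website": "https://example.com",
    "analysis": {
        "writing_style": {"tone": "Professional"},
        "target_audience": {"demographics": ["B2B"]},
    },
}
after = flatten_website_analysis(before)
```

One property of `.update()` worth keeping in mind: if the `analysis` object ever contained its own `website_url` or `status` key, that value would overwrite the base value. Whether the frontend can send such keys is not stated here, so treat this as something to check, not a known bug.
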
## Testing

To verify the fix:

1. **Restart the backend server** to load the updated code
2. **Complete Step 2** (Website Analysis) in the onboarding flow
3. **Check backend logs** for:
   ```
   ✅ DATABASE: Website analysis saved to database for user {user_id}
   ```
4. **Navigate to Step 6** (FinalStep)
5. **Verify** website URL and style analysis are displayed

### Expected Backend Logs After Fix:

```
INFO|api_key_manager.py:289|✅ DATABASE: Website analysis saved to database for user {user_id}
INFO|onboarding_summary_service.py:85|Retrieved website analysis from database for user {user_id}
```

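
The manual steps could be complemented by an automated check that the database service receives the flat payload. This is only a sketch: `save_step_2` is a hypothetical stand-in for the Step 2 branch in `api_key_manager.py`, and the database service is replaced by a `MagicMock`, so the real persistence layer is not exercised.

```python
from typing import Any, Dict
from unittest.mock import MagicMock


def save_step_2(db_service: Any, user_id: str, step_data: Dict[str, Any], db: Any) -> None:
    """Hypothetical stand-in for the Step 2 branch shown in the Solution."""
    analysis_for_db = {"website_url": step_data.get("website", ""), "status": "completed"}
    if step_data.get("analysis"):
        analysis_for_db.update(step_data["analysis"])
    db_service.save_website_analysis(user_id, analysis_for_db, db)


# Mocks replace the real service and session; nothing is written anywhere.
db_service = MagicMock()
db = MagicMock()

save_step_2(
    db_service,
    "user-123",
    {"website": "https://example.com", "analysis": {"writing_style": {"tone": "Professional"}}},
    db,
)

# The second positional argument is the payload handed to the service.
saved_payload = db_service.save_website_analysis.call_args.args[1]
db_service.save_website_analysis.assert_called_once()
```
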
## Related Files

- `frontend/src/components/OnboardingWizard/WebsiteStep.tsx` - Frontend data structure
- `backend/services/api_key_manager.py` - Data transformation logic
- `backend/services/onboarding_database_service.py` - Database save/retrieve methods
- `backend/models/onboarding.py` - WebsiteAnalysis model schema

## Why This Pattern?

This is a common issue in full-stack applications where:
1. **Frontend** optimizes for UI structure (nested for component organization)
2. **Database** optimizes for query performance (flat for indexing)
3. **Backend middleware** transforms between the two

## Alternative Solutions Considered

### Option 1: Change Frontend Structure
❌ **Rejected**: Would break all existing Step 2 components and localStorage caching

### Option 2: Change Database Schema
❌ **Rejected**: Would require complex JSON queries and lose type safety

### Option 3: Transform in Middleware (Selected) ✅
✅ **Best**: Minimal code change, maintains backward compatibility, clear separation of concerns

## Future Improvements

Consider adding a **data transformation layer** for all onboarding steps to handle similar mismatches proactively:

```python
from typing import Any, Dict


class OnboardingDataTransformer:
    @staticmethod
    def transform_step_2(frontend_data: Dict[str, Any]) -> Dict[str, Any]:
        """Transform Step 2 data from frontend to database format."""
        return {
            'website_url': frontend_data.get('website', ''),
            'status': 'completed',
            # `or {}` guards against an explicit None, which `**` cannot unpack
            **(frontend_data.get('analysis') or {})
        }
```

This would centralize all data transformations and make the codebase more maintainable.

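
One possible way to extend this idea to every step is a lookup table keyed by step number. This sketch is illustrative only: `STEP_TRANSFORMERS` and `transform_for_db` are hypothetical names, and passing unregistered steps through unchanged is an assumption about how the other steps should behave.

```python
from typing import Any, Callable, Dict

Transformer = Callable[[Dict[str, Any]], Dict[str, Any]]


def transform_step_2(frontend_data: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten Step 2 data into the shape the database layer expects."""
    return {
        "website_url": frontend_data.get("website", ""),
        "status": "completed",
        **(frontend_data.get("analysis") or {}),
    }


# Hypothetical registry: only steps that need reshaping are listed.
STEP_TRANSFORMERS: Dict[int, Transformer] = {2: transform_step_2}


def transform_for_db(step_number: int, frontend_data: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the registered transformer, or return a shallow copy unchanged."""
    transformer = STEP_TRANSFORMERS.get(step_number)
    return transformer(frontend_data) if transformer else dict(frontend_data)
```

With a registry like this, the `elif step.step_number == 2` branch could shrink to a single `transform_for_db(step.step_number, step.data)` call, and supporting a new step would mean adding one entry to the table.
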
## Status

✅ **Fixed**: Website analysis data now saves correctly to database
⏳ **Pending**: Restart backend and test with actual user flow