Add brand analysis columns to onboarding database and migration scripts

ajaysi
2025-10-11 17:05:42 +05:30
parent b1ebe1034e
commit 1df12a64a2
25 changed files with 2415 additions and 90 deletions


@@ -0,0 +1,188 @@
# Step 2 Website Analysis Data Transformation Fix
## Problem
Step 6 (FinalStep) was not displaying website analysis data. Of the onboarding data it summarizes, only the website analysis was missing from the database:
- API Keys were successfully saved and retrieved ✅
- Research Preferences were successfully saved and retrieved ✅
- Persona Data was successfully saved and retrieved ✅
- Website Analysis was **NOT being saved** to the database ❌
## Root Cause
**Data Structure Mismatch** between frontend and backend:
### Frontend Data Structure (WebsiteStep.tsx)
```typescript
const stepData = {
  website: "https://example.com", // ← Note: "website", not "website_url"
  domainName: "example.com",
  analysis: { // ← Nested object
    writing_style: { ... },
    content_characteristics: { ... },
    target_audience: { ... },
    content_type: { ... },
    // etc.
  },
  useAnalysisForGenAI: true
};
```
### Database Schema Expects (Flat Structure)
```python
{
    'website_url': 'https://example.com',  # ← "website_url" at root level
    'writing_style': { ... },  # ← All fields at root level
    'content_characteristics': { ... },
    'target_audience': { ... },
    'content_type': { ... },
    'recommended_settings': { ... },
    'crawl_result': { ... },
    'style_patterns': { ... },
    'style_guidelines': { ... },
    'status': 'completed'
}
```
## The Issue
In `backend/services/api_key_manager.py` (lines 278-280), the code passed `step.data` directly to `save_website_analysis()`:
```python
elif step.step_number == 2:  # Website Analysis
    self.db_service.save_website_analysis(self.user_id, step.data, db)
```
But `step.data` had this structure:
```python
{
    'website': 'https://example.com',
    'analysis': {
        'writing_style': { ... },
        # ...
    }
}
```
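The database service expects `website_url` at the root level, with every analysis field flattened alongside it. Given the nested payload, it found none of the keys it looks for, so no usable website analysis was persisted: either an empty record was written or nothing was saved at all.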
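To make the failure mode concrete, here is a minimal sketch. It assumes the database service reads its fields with root-level lookups; the real `save_website_analysis()` is not shown in this document, so `read_flat_fields` and `FLAT_FIELDS` are hypothetical stand-ins.

```python
# Hypothetical stand-in for a flat-schema reader. The real
# save_website_analysis() is not shown here; this only illustrates why
# root-level lookups miss data nested under 'analysis'.
FLAT_FIELDS = ("website_url", "writing_style", "target_audience")


def read_flat_fields(payload: dict) -> dict:
    """Read each expected field from the root of the payload."""
    return {field: payload.get(field) for field in FLAT_FIELDS}


# Shape sent by the frontend: URL under 'website', analysis one level down.
frontend_payload = {
    "website": "https://example.com",
    "analysis": {"writing_style": {"tone": "Professional"}},
}

missing = read_flat_fields(frontend_payload)
print(missing)  # every value is None
```

None of the expected keys exist at the root of the frontend payload, so each lookup falls back to `None` even though the data is present one level down.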
## Solution
Transform the frontend data structure to match the database schema before saving:
**File**: `backend/services/api_key_manager.py` (lines 278-289)
```python
elif step.step_number == 2:  # Website Analysis
    # Transform frontend data structure to match database schema
    analysis_for_db = {
        'website_url': step.data.get('website', ''),
        'status': 'completed'
    }
    # Merge analysis fields if they exist
    if 'analysis' in step.data and step.data['analysis']:
        analysis_for_db.update(step.data['analysis'])
    self.db_service.save_website_analysis(self.user_id, analysis_for_db, db)
    logger.info(f"✅ DATABASE: Website analysis saved to database for user {self.user_id}")
```
### What This Does:
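1. **Creates base structure**: `{'website_url': '...', 'status': 'completed'}`
2. **Flattens nested `analysis` object**: Uses `.update()` to merge all analysis fields into the root level. Because the merge runs last, a `website_url` or `status` key inside `analysis` would overwrite the base values
3. **Result**: Every key sits at the root level, where the database schema expects it; fields absent from `analysis` are simply left unset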
### Example Transformation:
**Before** (frontend format):
```python
{
    'website': 'https://example.com',
    'analysis': {
        'writing_style': {'tone': 'Professional'},
        'target_audience': {'demographics': ['B2B']}
    }
}
```
**After** (database format):
```python
{
    'website_url': 'https://example.com',
    'status': 'completed',
    'writing_style': {'tone': 'Professional'},
    'target_audience': {'demographics': ['B2B']}
}
```
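The transformation can also be checked in isolation. The sketch below pulls the same logic into a standalone function; the name `flatten_website_analysis` is an assumption made for illustration and does not exist in the codebase.

```python
from typing import Any, Dict, Optional


def flatten_website_analysis(step_data: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a flat dict shaped like the database format shown above."""
    step_data = step_data or {}
    flat: Dict[str, Any] = {
        "website_url": step_data.get("website", ""),
        "status": "completed",
    }
    # `or {}` covers both a missing 'analysis' key and an explicit None value.
    flat.update(step_data.get("analysis") or {})
    return flat


before = {
    "website": "https://example.com",
    "analysis": {
        "writing_style": {"tone": "Professional"},
        "target_audience": {"demographics": ["B2B"]},
    },
}
after = flatten_website_analysis(before)
print(after)
```

Keeping the logic in a pure function like this makes it possible to exercise the mapping without a database session or a running backend.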
## Testing
To verify the fix:
1. **Restart the backend server** to load the updated code
2. **Complete Step 2** (Website Analysis) in the onboarding flow
3. **Check backend logs** for:
```
✅ DATABASE: Website analysis saved to database for user {user_id}
```
4. **Navigate to Step 6** (FinalStep)
5. **Verify** website URL and style analysis are displayed
### Expected Backend Logs After Fix:
```
INFO|api_key_manager.py:289|✅ DATABASE: Website analysis saved to database for user {user_id}
INFO|onboarding_summary_service.py:85|Retrieved website analysis from database for user {user_id}
```
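The manual check could be complemented by an automated one. The following is a sketch only: `FakeDbService` and `save_step_2` are hypothetical stand-ins, since the real class in `api_key_manager.py` and its dependencies are not shown here.

```python
from typing import Any, Dict, List, Tuple


class FakeDbService:
    """Hypothetical stub that records what would be written to the database."""

    def __init__(self) -> None:
        self.saved: List[Tuple[str, Dict[str, Any]]] = []

    def save_website_analysis(self, user_id: str, analysis: Dict[str, Any], db: Any) -> None:
        self.saved.append((user_id, analysis))


def save_step_2(db_service: FakeDbService, user_id: str,
                step_data: Dict[str, Any], db: Any = None) -> None:
    """Mirror of the Step 2 branch from the fix, extracted so it can be tested."""
    analysis_for_db = {
        "website_url": step_data.get("website", ""),
        "status": "completed",
    }
    if step_data.get("analysis"):
        analysis_for_db.update(step_data["analysis"])
    db_service.save_website_analysis(user_id, analysis_for_db, db)


def test_step_2_is_saved_flat() -> None:
    service = FakeDbService()
    save_step_2(service, "user-1", {
        "website": "https://example.com",
        "analysis": {"writing_style": {"tone": "Professional"}},
    })
    user_id, saved = service.saved[0]
    assert user_id == "user-1"
    assert saved["website_url"] == "https://example.com"
    assert saved["writing_style"] == {"tone": "Professional"}
    assert "analysis" not in saved  # nothing is left nested


test_step_2_is_saved_flat()
```

A test along these lines would catch a regression to the nested format without needing a full onboarding run.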
## Related Files
- `frontend/src/components/OnboardingWizard/WebsiteStep.tsx` - Frontend data structure
- `backend/services/api_key_manager.py` - Data transformation logic
- `backend/services/onboarding_database_service.py` - Database save/retrieve methods
- `backend/models/onboarding.py` - WebsiteAnalysis model schema
## Why This Pattern?
This is a common issue in full-stack applications where:
1. **Frontend** optimizes for UI structure (nested for component organization)
2. **Database** optimizes for query performance (flat for indexing)
3. **Backend middleware** transforms between the two
## Alternative Solutions Considered
### Option 1: Change Frontend Structure
❌ **Rejected**: Would break all existing Step 2 components and localStorage caching
### Option 2: Change Database Schema
❌ **Rejected**: Would require complex JSON queries and lose type safety
### Option 3: Transform in Middleware (Selected) ✅
✅ **Best**: Minimal code change, maintains backward compatibility, clear separation of concerns
## Future Improvements
Consider adding a **data transformation layer** for all onboarding steps to handle similar mismatches proactively:
```python
from typing import Any, Dict


class OnboardingDataTransformer:
    @staticmethod
    def transform_step_2(frontend_data: Dict[str, Any]) -> Dict[str, Any]:
        """Transform Step 2 data from frontend to database format."""
        return {
            'website_url': frontend_data.get('website', ''),
            'status': 'completed',
            # `or {}` also covers an explicit None, which ** cannot unpack
            **(frontend_data.get('analysis') or {})
        }
```
This would centralize all data transformations and make the codebase more maintainable.
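One possible way to centralize the transformations is a lookup table keyed by step number. This is a sketch under assumptions: `STEP_TRANSFORMERS` and `to_db_format` are hypothetical names, and only the Step 2 mapping is grounded in this document; steps without a registered transformer are passed through unchanged.

```python
from typing import Any, Callable, Dict

Transformer = Callable[[Dict[str, Any]], Dict[str, Any]]


def transform_step_2(frontend_data: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten Step 2 data from the frontend shape to the database shape."""
    return {
        "website_url": frontend_data.get("website", ""),
        "status": "completed",
        **(frontend_data.get("analysis") or {}),
    }


# Hypothetical registry: step number -> transformer.
# Only Step 2 is described in this document; other steps are not registered.
STEP_TRANSFORMERS: Dict[int, Transformer] = {2: transform_step_2}


def to_db_format(step_number: int, frontend_data: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the registered transformer, or return a shallow copy unchanged."""
    transformer = STEP_TRANSFORMERS.get(step_number)
    return transformer(frontend_data) if transformer else dict(frontend_data)


print(to_db_format(2, {"website": "https://example.com", "analysis": {}}))
```

With a table like this, the `elif` chain in the save path would reduce to a single `to_db_format(step.step_number, step.data)` call, and a new mismatch would be handled by registering one more function.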
## Status
**Fixed**: Website analysis data is now transformed to the flat schema before it is saved to the database
**Pending**: Restart the backend and verify with an actual user flow