Added onboarding progress tracking & landing page
This commit is contained in:
308
docs/SESSION_ID_CLEANUP_SUMMARY.md
Normal file
308
docs/SESSION_ID_CLEANUP_SUMMARY.md
Normal file
@@ -0,0 +1,308 @@
|
||||
# Session ID Cleanup Summary
|
||||
**Date:** October 1, 2025
|
||||
**Issue:** Frontend session ID confusion - unnecessary tracking when backend uses Clerk user ID
|
||||
|
||||
---
|
||||
|
||||
## Problem Statement
|
||||
|
||||
The frontend was maintaining a separate `sessionId` state and passing it to the backend, but:
|
||||
- Backend authenticates via Clerk JWT tokens
|
||||
- User identity comes from `current_user` (auth token)
|
||||
- Session ID was never actually used for session management
|
||||
- Created confusion and unnecessary complexity
|
||||
|
||||
## Solution Implemented
|
||||
|
||||
### ✅ Frontend Changes
|
||||
|
||||
#### **File: `frontend/src/components/OnboardingWizard/Wizard.tsx`**
|
||||
|
||||
**Removed:**
|
||||
```typescript
|
||||
const [sessionId, setSessionId] = useState<string>(''); // ❌ DELETED
|
||||
```
|
||||
|
||||
**Updated initialization:**
|
||||
```typescript
|
||||
// Before: setSessionId(session.session_id);
|
||||
// After: Just log for debugging
|
||||
console.log('Wizard: Initialized from cache:', {
|
||||
step: onboarding.current_step,
|
||||
progress: onboarding.completion_percentage,
|
||||
userId: session.session_id // Just for logging
|
||||
});
|
||||
```
|
||||
|
||||
**Updated component props:**
|
||||
```typescript
|
||||
// Before:
|
||||
<CompetitorAnalysisStep
|
||||
sessionId={sessionId} // ❌ REMOVED
|
||||
userUrl={stepData?.website || ''}
|
||||
industryContext={stepData?.industryContext}
|
||||
/>
|
||||
|
||||
// After:
|
||||
<CompetitorAnalysisStep
|
||||
userUrl={stepData?.website || ''}
|
||||
industryContext={stepData?.industryContext}
|
||||
/>
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
#### **File: `frontend/src/components/OnboardingWizard/CompetitorAnalysisStep.tsx`**
|
||||
|
||||
**Updated interface:**
|
||||
```typescript
|
||||
// Before:
|
||||
interface CompetitorAnalysisStepProps {
|
||||
onContinue: (researchData?: any) => void;
|
||||
onBack: () => void;
|
||||
sessionId: string; // ❌ REMOVED
|
||||
userUrl: string;
|
||||
industryContext?: string;
|
||||
}
|
||||
|
||||
// After:
|
||||
interface CompetitorAnalysisStepProps {
|
||||
onContinue: (researchData?: any) => void;
|
||||
onBack: () => void;
|
||||
// sessionId removed - backend uses authenticated user from Clerk token
|
||||
userUrl: string;
|
||||
industryContext?: string;
|
||||
}
|
||||
```
|
||||
|
||||
**Updated API call:**
|
||||
```typescript
|
||||
// Before:
|
||||
body: JSON.stringify({
|
||||
session_id: sessionId, // ❌ REMOVED
|
||||
user_url: userUrl,
|
||||
industry_context: industryContext,
|
||||
num_results: 25,
|
||||
website_analysis_data: websiteAnalysisData
|
||||
})
|
||||
|
||||
// After:
|
||||
body: JSON.stringify({
|
||||
// session_id removed - backend gets user from auth token
|
||||
user_url: userUrl,
|
||||
industry_context: industryContext,
|
||||
num_results: 25,
|
||||
website_analysis_data: websiteAnalysisData
|
||||
})
|
||||
```
|
||||
|
||||
**Updated dependencies:**
|
||||
```typescript
|
||||
// Before:
|
||||
}, [sessionId, userUrl, industryContext]);
|
||||
|
||||
// After:
|
||||
}, [userUrl, industryContext]); // sessionId removed
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### ✅ Backend Changes
|
||||
|
||||
#### **File: `backend/api/onboarding_utils/step3_routes.py`**
|
||||
|
||||
**Made session_id optional:**
|
||||
```python
|
||||
# Before:
|
||||
class CompetitorDiscoveryRequest(BaseModel):
|
||||
session_id: str = Field(..., description="Onboarding session ID")
|
||||
|
||||
# After:
|
||||
class CompetitorDiscoveryRequest(BaseModel):
|
||||
session_id: Optional[str] = Field(
|
||||
None,
|
||||
description="Deprecated - user identification comes from auth token"
|
||||
)
|
||||
```
|
||||
|
||||
**Updated endpoint logic:**
|
||||
```python
|
||||
# Before:
|
||||
logger.info(f"Starting competitor discovery for session {request.session_id}")
|
||||
session_id = request.session_id if request.session_id else "default_session"
|
||||
|
||||
# After:
|
||||
# Session ID is deprecated - we use authenticated user from token instead
|
||||
session_id = request.session_id if request.session_id else "user_authenticated"
|
||||
logger.info(f"Starting competitor discovery for URL: {request.user_url}")
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## How Authentication Actually Works
|
||||
|
||||
### **Request Flow:**
|
||||
|
||||
```
|
||||
1. Frontend makes API call with Clerk JWT token
|
||||
↓
|
||||
2. Backend middleware extracts token from Authorization header
|
||||
↓
|
||||
3. Token verified via JWKS (with 60s leeway for clock skew)
|
||||
↓
|
||||
4. User ID extracted from token claims (sub field)
|
||||
↓
|
||||
5. User object passed to endpoint via Depends(get_current_user)
|
||||
↓
|
||||
6. Backend uses Clerk user ID for all user-specific operations
|
||||
```
|
||||
|
||||
### **User Session Management:**
|
||||
|
||||
```python
|
||||
# backend/services/api_key_manager.py
|
||||
def get_onboarding_progress_for_user(user_id: str) -> OnboardingProgress:
|
||||
"""
|
||||
Uses Clerk user_id (from auth token) as the session identifier.
|
||||
No separate session ID needed!
|
||||
"""
|
||||
progress_file = f".onboarding_progress_{safe_user_id}.json"
|
||||
return OnboardingProgress(progress_file=progress_file)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## What Was Removed
|
||||
|
||||
### ❌ **Unnecessary Code:**
|
||||
|
||||
1. **Frontend session state:**
|
||||
- `const [sessionId, setSessionId] = useState<string>('')`
|
||||
- `setSessionId(...)` calls
|
||||
- `sessionId` prop passing
|
||||
|
||||
2. **localStorage session tracking:**
|
||||
- No more `localStorage.setItem('onboarding_session_id', ...)`
|
||||
- No more `localStorage.getItem('onboarding_session_id')`
|
||||
|
||||
3. **API request session_id:**
|
||||
- Removed from request body
|
||||
- Backend made it optional
|
||||
|
||||
---
|
||||
|
||||
## Benefits
|
||||
|
||||
### ✅ **Code Quality:**
|
||||
- **Simpler:** Less state to manage
|
||||
- **Clearer:** No confusion about what "session" means
|
||||
- **Aligned:** Matches actual backend architecture
|
||||
|
||||
### ✅ **Maintainability:**
|
||||
- Fewer moving parts
|
||||
- Less chance of session tracking bugs
|
||||
- Clear authentication flow
|
||||
|
||||
### ✅ **Security:**
|
||||
- Single source of truth (Clerk token)
|
||||
- No parallel session tracking
|
||||
- Reduced attack surface
|
||||
|
||||
---
|
||||
|
||||
## Testing Checklist
|
||||
|
||||
- [ ] Frontend compiles without errors
|
||||
- [ ] Onboarding wizard loads successfully
|
||||
- [ ] Step 3 (Competitor Analysis) works without sessionId
|
||||
- [ ] Backend accepts requests without session_id
|
||||
- [ ] Backend still accepts requests with session_id (backwards compat)
|
||||
- [ ] User progress persists correctly
|
||||
- [ ] No console errors about missing sessionId
|
||||
|
||||
---
|
||||
|
||||
## Migration Notes
|
||||
|
||||
### **For Other Developers:**
|
||||
|
||||
If you have code that uses `sessionId`:
|
||||
|
||||
**❌ DON'T:**
|
||||
```typescript
|
||||
// Don't pass sessionId anymore
|
||||
<CompetitorAnalysisStep sessionId={someId} ... />
|
||||
|
||||
// Don't send session_id in API calls
|
||||
fetch('/api/...', {
|
||||
body: JSON.stringify({ session_id: someId })
|
||||
})
|
||||
```
|
||||
|
||||
**✅ DO:**
|
||||
```typescript
|
||||
// Just pass the required props
|
||||
<CompetitorAnalysisStep userUrl={url} industryContext={context} />
|
||||
|
||||
// Let backend get user from auth token
|
||||
fetch('/api/...', {
|
||||
headers: { 'Authorization': `Bearer ${token}` },
|
||||
body: JSON.stringify({ /* no session_id */ })
|
||||
})
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Backwards Compatibility
|
||||
|
||||
### **Old Frontend Code:**
|
||||
If old frontend still sends `session_id`, it will:
|
||||
- ✅ Still work (backend accepts it as Optional)
|
||||
- ✅ Be ignored (backend uses auth token instead)
|
||||
- ✅ Log a warning (if needed, add deprecation warning)
|
||||
|
||||
### **API Contract:**
|
||||
- Request: `session_id` is now optional
|
||||
- Response: `session_id` still included for compatibility
|
||||
- No breaking changes
|
||||
|
||||
---
|
||||
|
||||
## Related Changes
|
||||
|
||||
This cleanup builds on:
|
||||
1. **Batch API Endpoint** - Reduced API calls (see: `BATCH_API_IMPLEMENTATION_SUMMARY.md`)
|
||||
2. **Auth Fix** - Clock skew resolution (see: `CLOCK_SKEW_FIX.md`)
|
||||
3. **Code Review** - Identified this issue (see: `END_USER_FLOW_CODE_REVIEW.md`)
|
||||
|
||||
---
|
||||
|
||||
## Files Modified
|
||||
|
||||
### **Frontend (2 files):**
|
||||
- `frontend/src/components/OnboardingWizard/Wizard.tsx`
|
||||
- `frontend/src/components/OnboardingWizard/CompetitorAnalysisStep.tsx`
|
||||
|
||||
### **Backend (1 file):**
|
||||
- `backend/api/onboarding_utils/step3_routes.py`
|
||||
|
||||
---
|
||||
|
||||
## Conclusion
|
||||
|
||||
✅ **Session ID successfully removed from frontend**
|
||||
✅ **Backend made backwards compatible**
|
||||
✅ **Code now aligns with actual architecture**
|
||||
✅ **User authentication via Clerk token only**
|
||||
|
||||
The codebase is now cleaner, simpler, and more maintainable. The "session" is actually the authenticated Clerk user - no separate tracking needed!
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. Test the changes end-to-end
|
||||
2. Monitor for any session-related errors
|
||||
3. Eventually remove session_id from backend responses (breaking change - schedule for v2.0)
|
||||
4. Update API documentation to reflect changes
|
||||
|
||||
Reference in New Issue
Block a user