309 lines
7.6 KiB
Markdown
309 lines
7.6 KiB
Markdown
# Session ID Cleanup Summary
|
|
**Date:** October 1, 2025
|
|
**Issue:** Frontend session ID confusion - unnecessary tracking when backend uses Clerk user ID
|
|
|
|
---
|
|
|
|
## Problem Statement
|
|
|
|
The frontend was maintaining a separate `sessionId` state and passing it to the backend, but:
|
|
- Backend authenticates via Clerk JWT tokens
|
|
- User identity comes from `current_user` (auth token)
|
|
- Session ID was never actually used for session management
|
|
- Created confusion and unnecessary complexity
|
|
|
|
## Solution Implemented
|
|
|
|
### ✅ Frontend Changes
|
|
|
|
#### **File: `frontend/src/components/OnboardingWizard/Wizard.tsx`**
|
|
|
|
**Removed:**
|
|
```typescript
|
|
const [sessionId, setSessionId] = useState<string>(''); // ❌ DELETED
|
|
```
|
|
|
|
**Updated initialization:**
|
|
```typescript
|
|
// Before: setSessionId(session.session_id);
|
|
// After: Just log for debugging
|
|
console.log('Wizard: Initialized from cache:', {
|
|
step: onboarding.current_step,
|
|
progress: onboarding.completion_percentage,
|
|
userId: session.session_id // Just for logging
|
|
});
|
|
```
|
|
|
|
**Updated component props:**
|
|
```typescript
|
|
// Before:
|
|
<CompetitorAnalysisStep
|
|
sessionId={sessionId} // ❌ REMOVED
|
|
userUrl={stepData?.website || ''}
|
|
industryContext={stepData?.industryContext}
|
|
/>
|
|
|
|
// After:
|
|
<CompetitorAnalysisStep
|
|
userUrl={stepData?.website || ''}
|
|
industryContext={stepData?.industryContext}
|
|
/>
|
|
```
|
|
|
|
---
|
|
|
|
#### **File: `frontend/src/components/OnboardingWizard/CompetitorAnalysisStep.tsx`**
|
|
|
|
**Updated interface:**
|
|
```typescript
|
|
// Before:
|
|
interface CompetitorAnalysisStepProps {
|
|
onContinue: (researchData?: any) => void;
|
|
onBack: () => void;
|
|
sessionId: string; // ❌ REMOVED
|
|
userUrl: string;
|
|
industryContext?: string;
|
|
}
|
|
|
|
// After:
|
|
interface CompetitorAnalysisStepProps {
|
|
onContinue: (researchData?: any) => void;
|
|
onBack: () => void;
|
|
// sessionId removed - backend uses authenticated user from Clerk token
|
|
userUrl: string;
|
|
industryContext?: string;
|
|
}
|
|
```
|
|
|
|
**Updated API call:**
|
|
```typescript
|
|
// Before:
|
|
body: JSON.stringify({
|
|
session_id: sessionId, // ❌ REMOVED
|
|
user_url: userUrl,
|
|
industry_context: industryContext,
|
|
num_results: 25,
|
|
website_analysis_data: websiteAnalysisData
|
|
})
|
|
|
|
// After:
|
|
body: JSON.stringify({
|
|
// session_id removed - backend gets user from auth token
|
|
user_url: userUrl,
|
|
industry_context: industryContext,
|
|
num_results: 25,
|
|
website_analysis_data: websiteAnalysisData
|
|
})
|
|
```
|
|
|
|
**Updated dependencies:**
|
|
```typescript
|
|
// Before:
|
|
}, [sessionId, userUrl, industryContext]);
|
|
|
|
// After:
|
|
}, [userUrl, industryContext]); // sessionId removed
|
|
```
|
|
|
|
---
|
|
|
|
### ✅ Backend Changes
|
|
|
|
#### **File: `backend/api/onboarding_utils/step3_routes.py`**
|
|
|
|
**Made session_id optional:**
|
|
```python
|
|
# Before:
|
|
class CompetitorDiscoveryRequest(BaseModel):
|
|
session_id: str = Field(..., description="Onboarding session ID")
|
|
|
|
# After:
|
|
class CompetitorDiscoveryRequest(BaseModel):
|
|
session_id: Optional[str] = Field(
|
|
None,
|
|
description="Deprecated - user identification comes from auth token"
|
|
)
|
|
```
|
|
|
|
**Updated endpoint logic:**
|
|
```python
|
|
# Before:
|
|
logger.info(f"Starting competitor discovery for session {request.session_id}")
|
|
session_id = request.session_id if request.session_id else "default_session"
|
|
|
|
# After:
|
|
# Session ID is deprecated - we use authenticated user from token instead
|
|
session_id = request.session_id if request.session_id else "user_authenticated"
|
|
logger.info(f"Starting competitor discovery for URL: {request.user_url}")
|
|
```
|
|
|
|
---
|
|
|
|
## How Authentication Actually Works
|
|
|
|
### **Request Flow:**
|
|
|
|
```
|
|
1. Frontend makes API call with Clerk JWT token
|
|
↓
|
|
2. Backend middleware extracts token from Authorization header
|
|
↓
|
|
3. Token verified via JWKS (with 60s leeway for clock skew)
|
|
↓
|
|
4. User ID extracted from token claims (sub field)
|
|
↓
|
|
5. User object passed to endpoint via Depends(get_current_user)
|
|
↓
|
|
6. Backend uses Clerk user ID for all user-specific operations
|
|
```
|
|
|
|
### **User Session Management:**
|
|
|
|
```python
|
|
# backend/services/api_key_manager.py
|
|
def get_onboarding_progress_for_user(user_id: str) -> OnboardingProgress:
|
|
"""
|
|
Uses Clerk user_id (from auth token) as the session identifier.
|
|
No separate session ID needed!
|
|
"""
|
|
progress_file = f".onboarding_progress_{safe_user_id}.json"
|
|
return OnboardingProgress(progress_file=progress_file)
|
|
```
|
|
|
|
---
|
|
|
|
## What Was Removed
|
|
|
|
### ❌ **Unnecessary Code:**
|
|
|
|
1. **Frontend session state:**
|
|
- `const [sessionId, setSessionId] = useState<string>('')`
|
|
- `setSessionId(...)` calls
|
|
- `sessionId` prop passing
|
|
|
|
2. **localStorage session tracking:**
|
|
- No more `localStorage.setItem('onboarding_session_id', ...)`
|
|
- No more `localStorage.getItem('onboarding_session_id')`
|
|
|
|
3. **API request session_id:**
|
|
- Removed from request body
|
|
- Backend made it optional
|
|
|
|
---
|
|
|
|
## Benefits
|
|
|
|
### ✅ **Code Quality:**
|
|
- **Simpler:** Less state to manage
|
|
- **Clearer:** No confusion about what "session" means
|
|
- **Aligned:** Matches actual backend architecture
|
|
|
|
### ✅ **Maintainability:**
|
|
- Fewer moving parts
|
|
- Less chance of session tracking bugs
|
|
- Clear authentication flow
|
|
|
|
### ✅ **Security:**
|
|
- Single source of truth (Clerk token)
|
|
- No parallel session tracking
|
|
- Reduced attack surface
|
|
|
|
---
|
|
|
|
## Testing Checklist
|
|
|
|
- [ ] Frontend compiles without errors
|
|
- [ ] Onboarding wizard loads successfully
|
|
- [ ] Step 3 (Competitor Analysis) works without sessionId
|
|
- [ ] Backend accepts requests without session_id
|
|
- [ ] Backend still accepts requests with session_id (backwards compat)
|
|
- [ ] User progress persists correctly
|
|
- [ ] No console errors about missing sessionId
|
|
|
|
---
|
|
|
|
## Migration Notes
|
|
|
|
### **For Other Developers:**
|
|
|
|
If you have code that uses `sessionId`:
|
|
|
|
**❌ DON'T:**
|
|
```typescript
|
|
// Don't pass sessionId anymore
|
|
<CompetitorAnalysisStep sessionId={someId} ... />
|
|
|
|
// Don't send session_id in API calls
|
|
fetch('/api/...', {
|
|
body: JSON.stringify({ session_id: someId })
|
|
})
|
|
```
|
|
|
|
**✅ DO:**
|
|
```typescript
|
|
// Just pass the required props
|
|
<CompetitorAnalysisStep userUrl={url} industryContext={context} />
|
|
|
|
// Let backend get user from auth token
|
|
fetch('/api/...', {
|
|
headers: { 'Authorization': `Bearer ${token}` },
|
|
body: JSON.stringify({ /* no session_id */ })
|
|
})
|
|
```
|
|
|
|
---
|
|
|
|
## Backwards Compatibility
|
|
|
|
### **Old Frontend Code:**
|
|
If old frontend still sends `session_id`, it will:
|
|
- ✅ Still work (backend accepts it as Optional)
|
|
- ✅ Be ignored (backend uses auth token instead)
|
|
- ✅ Log a warning (if needed, add deprecation warning)
|
|
|
|
### **API Contract:**
|
|
- Request: `session_id` is now optional
|
|
- Response: `session_id` still included for compatibility
|
|
- No breaking changes
|
|
|
|
---
|
|
|
|
## Related Changes
|
|
|
|
This cleanup builds on:
|
|
1. **Batch API Endpoint** - Reduced API calls (see: `BATCH_API_IMPLEMENTATION_SUMMARY.md`)
|
|
2. **Auth Fix** - Clock skew resolution (see: `CLOCK_SKEW_FIX.md`)
|
|
3. **Code Review** - Identified this issue (see: `END_USER_FLOW_CODE_REVIEW.md`)
|
|
|
|
---
|
|
|
|
## Files Modified
|
|
|
|
### **Frontend (2 files):**
|
|
- `frontend/src/components/OnboardingWizard/Wizard.tsx`
|
|
- `frontend/src/components/OnboardingWizard/CompetitorAnalysisStep.tsx`
|
|
|
|
### **Backend (1 file):**
|
|
- `backend/api/onboarding_utils/step3_routes.py`
|
|
|
|
---
|
|
|
|
## Conclusion
|
|
|
|
✅ **Session ID successfully removed from frontend**
|
|
✅ **Backend made backwards compatible**
|
|
✅ **Code now aligns with actual architecture**
|
|
✅ **User authentication via Clerk token only**
|
|
|
|
The codebase is now cleaner, simpler, and more maintainable. The "session" is actually the authenticated Clerk user - no separate tracking needed!
|
|
|
|
---
|
|
|
|
## Next Steps
|
|
|
|
1. Test the changes end-to-end
|
|
2. Monitor for any session-related errors
|
|
3. Eventually remove session_id from backend responses (breaking change - schedule for v2.0)
|
|
4. Update API documentation to reflect changes
|
|
|