Polling Timeout Issues - Fixed
🚨 Problem Identified
The research endpoint was timing out even with polling because:
- Frontend polling was using a 60-second timeout for status checks
- Research operations were taking longer than 60 seconds
- Polling continued indefinitely after timeout instead of stopping
- No backend timeout protection for long-running operations
✅ Solutions Implemented
1. Frontend Timeout Fixes
New Polling API Client:
- ✅ Created pollingApiClient with 10-second timeout for status checks
- ✅ Status checks should be quick, so 10 seconds is sufficient
- ✅ Updated pollResearchStatus and pollOutlineStatus to use polling client
Enhanced Error Handling:
- ✅ Improved timeout error messages in usePolling hook
- ✅ Better distinction between timeout and other errors
- ✅ Clear user messaging: "Request timeout - the research operation may still be running"
2. Backend Timeout Protection
Research Operation Timeout:
- ✅ Added 5-minute timeout to research operations using asyncio.wait_for
- ✅ Graceful timeout handling with clear error messages
- ✅ Task status properly set to "failed" on timeout
Outline Generation Timeout:
- ✅ Added 3-minute timeout to outline generation operations
- ✅ Consistent timeout handling across all async operations
3. Improved User Experience
Better Error Messages:
- ✅ Clear timeout messages: "Research operation timed out after 5 minutes"
- ✅ Helpful suggestions: "Please try again with a simpler query"
- ✅ Distinction between request timeout and operation timeout
Proper Polling Behavior:
- ✅ Polling stops immediately on timeout
- ✅ No more infinite polling loops
- ✅ Clean error state management
🔧 Technical Implementation
Frontend Changes:
New API Client:
// pollingApiClient with 10-second timeout
export const pollingApiClient = axios.create({
  baseURL: 'http://localhost:8000',
  timeout: 10000, // 10 seconds for status checks
  headers: { 'Content-Type': 'application/json' }
});
Updated Polling Methods:
async pollResearchStatus(taskId: string): Promise<TaskStatusResponse> {
  const { data } = await pollingApiClient.get(`/api/blog/research/status/${taskId}`);
  return data;
}
Enhanced Error Handling:
if (errorMessage.includes('timeout') || errorMessage.includes('TIMEOUT')) {
  const timeoutMessage = 'Request timeout - the research operation may still be running. Please try again later.';
  setError(timeoutMessage);
  onError?.(timeoutMessage);
}
Backend Changes:
Research Operation Timeout:
try:
    # Add a timeout to the research operation (5 minutes)
    result = await asyncio.wait_for(
        service.research_with_progress(request, task_id),
        timeout=300  # 5 minutes timeout
    )
except asyncio.TimeoutError:
    await _update_progress(task_id, "⏰ Research operation timed out after 5 minutes. Please try again with a simpler query.")
    task_storage[task_id]["status"] = "failed"
    task_storage[task_id]["error"] = "Research operation timed out after 5 minutes"
    return
Outline Generation Timeout:
try:
    # Add a timeout to the outline generation operation (3 minutes)
    result = await asyncio.wait_for(
        service.generate_outline_with_progress(request, task_id),
        timeout=180  # 3 minutes timeout
    )
except asyncio.TimeoutError:
    await _update_progress(task_id, "⏰ Outline generation timed out after 3 minutes. Please try again.")
    task_storage[task_id]["status"] = "failed"
    task_storage[task_id]["error"] = "Outline generation timed out after 3 minutes"
    return
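The two timeout blocks above share the same shape, so they could be factored into one helper. The following is a minimal sketch of that idea, not the code in router.py: the name run_with_timeout, its parameters, and the stand-in task_storage dict are assumptions made for illustration, and the error text reports seconds rather than minutes.

```python
import asyncio
from typing import Any, Awaitable, Dict

# Hypothetical in-memory store standing in for the router's task_storage.
task_storage: Dict[str, Dict[str, Any]] = {}


async def run_with_timeout(
    task_id: str,
    operation: Awaitable[Any],
    timeout_seconds: float,
    label: str,
) -> Any:
    """Await `operation`; on timeout, mark the task failed and return None."""
    task = task_storage.setdefault(task_id, {"status": "running"})
    try:
        return await asyncio.wait_for(operation, timeout=timeout_seconds)
    except asyncio.TimeoutError:
        # asyncio.wait_for cancels the wrapped operation before raising.
        task["status"] = "failed"
        task["error"] = f"{label} timed out after {timeout_seconds:g} seconds"
        return None
```

Under these assumptions, the research call would become something like run_with_timeout(task_id, service.research_with_progress(request, task_id), 300, "Research operation"), with the outline call passing 180 instead.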
📊 Timeout Configuration
Frontend Timeouts:
- Status Polling: 10 seconds (should be quick)
- Regular API: 60 seconds (for normal operations)
- AI Operations: 3 minutes (for AI processing)
- Long Operations: 5 minutes (for SEO analysis)
Backend Timeouts:
- Research Operations: 5 minutes (comprehensive research)
- Outline Generation: 3 minutes (outline creation)
- Task Cleanup: 1 hour (memory management)
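The backend limits above can be kept as named constants, and the one-hour task cleanup can be sketched as a small sweep over the task store. This is an illustrative sketch only: the constant names, the cleanup_old_tasks function, and the "created_at" timestamp field are assumptions, since the source does not show how cleanup is implemented.

```python
import time
from typing import Any, Dict, Optional

# Backend limits from the list above, expressed in seconds.
RESEARCH_TIMEOUT_S = 5 * 60
OUTLINE_TIMEOUT_S = 3 * 60
TASK_CLEANUP_AGE_S = 60 * 60


def cleanup_old_tasks(
    storage: Dict[str, Dict[str, Any]],
    now: Optional[float] = None,
    max_age_s: float = TASK_CLEANUP_AGE_S,
) -> int:
    """Delete tasks older than max_age_s and return how many were removed.

    Assumes each task dict carries a "created_at" Unix timestamp
    (a hypothetical field). Tasks without one are left in place.
    """
    now = time.time() if now is None else now
    expired = [
        task_id
        for task_id, task in storage.items()
        if now - task.get("created_at", now) > max_age_s
    ]
    for task_id in expired:
        del storage[task_id]
    return len(expired)
```

Collecting the expired IDs first avoids mutating the dict while iterating over it.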
🎯 Expected Behavior Now
Before (Broken):
- ❌ Polling timed out after 60 seconds
- ❌ Polling continued indefinitely
- ❌ No backend timeout protection
- ❌ Poor error messages
After (Fixed):
- ✅ Status checks time out after 10 seconds (quick response)
- ✅ Research operations time out after 5 minutes (reasonable limit)
- ✅ Polling stops immediately on timeout
- ✅ Clear error messages with helpful suggestions
- ✅ Backend prevents runaway operations
🚀 User Experience
Normal Flow:
- User starts research → Task ID returned
- Frontend polls every 2 seconds with 10-second timeout
- Backend completes research within 5 minutes
- User sees progress messages and final results
Timeout Flow:
- User starts research → Task ID returned
- Research takes longer than 5 minutes
- Backend times out and sets task to "failed"
- Frontend receives the "failed" status with its timeout error and stops polling
- User sees clear message: "Research operation timed out after 5 minutes. Please try again with a simpler query."
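Both flows above come down to one loop: poll at a fixed interval, bound each status request, and stop as soon as the task reaches a terminal state. The sketch below models that loop in Python for a script or test harness; it is not the frontend usePolling hook. The poll_until_done name, the injected fetch_status callable, the "completed" status value, and the max_polls cap are assumptions.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

# Hypothetical signature: takes a task ID, resolves to the task's status dict.
StatusFetcher = Callable[[str], Awaitable[Dict[str, Any]]]


async def poll_until_done(
    task_id: str,
    fetch_status: StatusFetcher,
    interval_s: float = 2.0,
    request_timeout_s: float = 10.0,
    max_polls: int = 200,
) -> Dict[str, Any]:
    """Poll until the task reaches a terminal status, then stop.

    Raises TimeoutError if one status request exceeds request_timeout_s
    or the poll budget runs out, so the loop can never run forever.
    """
    for _ in range(max_polls):
        try:
            status = await asyncio.wait_for(
                fetch_status(task_id), timeout=request_timeout_s
            )
        except asyncio.TimeoutError:
            # Request timeout: the backend operation may still be running.
            raise TimeoutError(
                "Request timeout - the research operation may still be running"
            ) from None
        # "failed" comes from the backend; "completed" is an assumed value.
        if status.get("status") in ("completed", "failed"):
            return status
        await asyncio.sleep(interval_s)
    raise TimeoutError("Polling stopped: task did not finish within the poll budget")
```

A request timeout surfaces as an exception, while an operation timeout arrives as a normal response with status "failed", which keeps the two cases distinguishable for the caller.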
📁 Files Modified
Frontend:
- frontend/src/api/client.ts - Added pollingApiClient
- frontend/src/services/blogWriterApi.ts - Updated to use polling client
- frontend/src/hooks/usePolling.ts - Enhanced error handling
Backend:
- backend/api/blog_writer/router.py - Added operation timeouts
🎉 Result
The polling system now works correctly with:
- ✅ Proper timeout handling at both frontend and backend levels
- ✅ No more infinite polling loops
- ✅ Clear error messages for users
- ✅ Reasonable timeout limits for different operations
- ✅ Graceful failure handling with helpful suggestions
Users will now have a much better experience with the research system! 🎉