- Fix text selection menu not showing: wire contentRef via inputRef on multiline TextField
- Fix blog title not truncating: add min-w-0 for flex item overflow
- Fix outline generation 500: escape curly braces in f-string prompt template
- Fix content generation 'NoneType not callable': replace SessionLocal() with get_session_for_user(), add db param to MediumBlogGenerator, fix signature mismatch in database_task_manager
- Fix writing assistant suggest 500: add auth + user_id to API endpoint and service, replace sync requests with httpx.AsyncClient
- Fix hallucination detector 404: explicitly include router in main.py and app.py
- Fix missing error_data in task failure responses
- Hide CopilotKit web inspector button
- Remove hardcoded fallback suggestions from SmartTypingAssist
- Fix stale closure refs in SmartTypingAssist handleTypingChange
- Add two-column editor layout, stats bar, section hover menu
- Various subscription, billing, and research module improvements
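A minimal sketch of the f-string escaping fix for the outline prompt. The template text and the function name build_outline_prompt are hypothetical; the actual prompt is not shown in this log. Single braces in an f-string are parsed as replacement fields, so literal JSON braces have to be doubled.

```python
# Hypothetical prompt builder; only the brace-escaping technique is the point.
def build_outline_prompt(topic: str) -> str:
    # {topic} is interpolated; {{ and }} render as literal { and }.
    return f"""Create a blog outline about {topic}.
Respond with JSON in this shape:
{{"title": "...", "sections": ["..."]}}"""


prompt = build_outline_prompt("vector databases")
print(prompt)
```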
Backend:
- product_image_service.py: Replaced direct wavespeed_client.generate_image()
with generate_image() from main_image_generation (unified entry point)
- This ensures subscription pre-flight validation (_validate_image_operation)
and usage tracking (_track_image_operation_usage) are enforced
- Removed _generate_image_with_retry method and WaveSpeedClient dependency
- Animation/video/avatar services already route through ImageStudioManager - no changes needed
Frontend:
- useProductMarketing.ts: Added formatError() helper for 402/429 detection
across all 8 API operations
- useCampaignCreator.ts: Added formatError() helper for 402/429 detection
across all 13 API operations
- All error messages now surface subscription limits with upgrade prompts
Changes:
1. helpers.py (_track_image_operation_usage): Map provider name to DB columns
dynamically (stability→stability_calls, wavespeed→wavespeed_calls, etc.)
instead of hardcoding stability_calls/stability_cost.
2. upscale_service.py: Added _track_image_operation_usage() call after
successful Stability upscale completion.
3. control_service.py: Added _track_image_operation_usage() call after
successful Stability control operation completion.
4. edit_service.py: Added _track_image_operation_usage() call after
successful Stability edit operation (remove_background, inpaint,
outpaint, search_replace, search_recolor, relight).
Previously, only Create Studio and Face Swap tracked usage. Now all five
studios count usage against subscription limits.
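The dynamic provider-to-column mapping from item 1 could look roughly like this. It is a sketch under assumptions: the helper names, the KNOWN_PROVIDERS set, and the wavespeed_cost column are illustrative, and a plain dict stands in for the usage row.

```python
# Assumed provider set; the real helper may support more providers.
KNOWN_PROVIDERS = {"stability", "wavespeed"}


def usage_columns(provider: str) -> tuple:
    """Return the (calls, cost) column names for a provider."""
    key = provider.strip().lower()
    if key not in KNOWN_PROVIDERS:
        raise ValueError(f"Unknown image provider: {provider!r}")
    return f"{key}_calls", f"{key}_cost"


def record_usage(summary: dict, provider: str, cost: float) -> dict:
    """Increment the provider-specific counters on a usage row (a dict here)."""
    calls_col, cost_col = usage_columns(provider)
    summary[calls_col] = summary.get(calls_col, 0) + 1
    summary[cost_col] = summary.get(cost_col, 0.0) + cost
    return summary
```

Rejecting unknown providers keeps a typo in the provider name from silently creating a counter that no limit check reads.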
Frontend Changes:
- Add scene numbering badge (1/N) next to scene titles
- Add inline status chips (Complete, Audio, Image, Voice, Why Script)
- Apply gradient styling with shadows to all chips
- Remove Script Editor header and 'Why This Script Format?' collapsible
- Move Voice and Why Script info to per-scene chips
- Make scene section mobile-responsive (responsive layout, button sizing)
- Rename 'B-Roll Charts' to 'Podcast Charts' with accordion (collapsed by default)
- Add sceneIndex prop to SceneEditor for scene numbering
- Improve accessibility with keyboard navigation and focus states
Backend Changes:
- Audio handler improvements
- B-roll handler enhancements
- Script handler updates
- B-roll composer and service improvements
- Removed temporary broll_temp files
Technical:
- Full mobile responsiveness for scene cards
- Gradient chip styling: vibrant colors with white text and shadows
- Non-breaking approval/generation flow preserved
- TypeScript compatibility maintained
- Fix voice clone preview saved as .wav regardless of actual format (MP3/WebM
content from WaveSpeed was saved with a .wav extension, causing NotSupportedError)
- Add detect_audio_format() and ensure_audio_extension() to media_utils
- Fix assets_serving.py: use storage_paths for root resolution, add proper
MIME types to FileResponse, add auth via query token for <audio> elements
- Fix assets_serving.py: add path traversal security check
- Fix step4_asset_routes.py: use get_user_workspace() instead of WORKSPACE_DIR,
detect actual audio format before saving preview
- Fix get_db() in database.py: raise HTTPException(401) instead of raw Exception,
catch engine creation failures with HTTPException(503)
- Fix avatar.py: add auth error handling, diagnostic logging for path resolution,
graceful DB save degradation
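A sketch of how detect_audio_format() and ensure_audio_extension() might work, using leading magic bytes. The signatures and the set of recognized formats are assumptions; only the function names come from this log.

```python
from pathlib import Path
from typing import Optional


def detect_audio_format(data: bytes) -> Optional[str]:
    """Guess the container format from leading magic bytes; None if unknown."""
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    if data[:3] == b"ID3":  # MP3 with an ID3v2 tag
        return "mp3"
    # Bare MPEG audio frame: the first 11 bits are all set (frame sync).
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "mp3"
    if data[:4] == b"\x1a\x45\xdf\xa3":  # EBML header used by WebM/Matroska
        return "webm"
    if data[:4] == b"OggS":
        return "ogg"
    return None


def ensure_audio_extension(filename: str, data: bytes) -> str:
    """Swap the file extension for the detected one; keep it if detection fails."""
    fmt = detect_audio_format(data)
    if fmt is None:
        return filename
    return str(Path(filename).with_suffix(f".{fmt}"))
```

Detecting from content rather than from a Content-Type header or URL means the saved extension matches the bytes even when the provider changes its output format.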
- Upgrade utils/storage_paths.py with robust find_repo_root() (env var override + validation + fallback)
- Remove broken _find_root() from podcast/constants.py, import from storage_paths instead
- Fix ROOT_DIR resolving to backend/ instead of project root (caused avatar upload 500s on Render.com)
- Fix video_combination_service.py default output dir (was writing to data/media instead of workspace)
- Add deprecation comments to global data/media constants in media_utils.py
- Pass user_id through resolve_media_path for tenant-scoped podcast resolution
- Add ALWRITY_ROOT_DIR env var support for explicit production overrides
- Log warning when get_podcast_media_dir called without user_id
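The env var override, validation, and fallback order for find_repo_root() can be sketched as follows. The marker used to recognize the project root (a backend/ subdirectory) and the start parameter are assumptions; the real implementation in utils/storage_paths.py may validate differently.

```python
import os
from pathlib import Path
from typing import Optional


def find_repo_root(start: Optional[Path] = None) -> Path:
    """Resolve the project root: env override first, then an upward search."""
    # 1. Explicit override, accepted only if it points at a real directory.
    override = os.environ.get("ALWRITY_ROOT_DIR")
    if override:
        candidate = Path(override).expanduser().resolve()
        if candidate.is_dir():
            return candidate

    # 2. Walk upward until a directory that CONTAINS backend/ is found.
    #    Returning the parent of backend/ avoids resolving to backend/ itself.
    here = (start or Path.cwd()).resolve()
    for candidate in (here, *here.parents):
        if (candidate / "backend").is_dir():
            return candidate

    # 3. Fallback: the starting directory.
    return here
```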
- Use OperationButton with cost display for scene action buttons
- Voice clone integration: When user selects voice clone in Write phase,
backend uses their uploaded voice sample + scene script text to generate
audio via qwen3/minimax/cosyvoice voice clone APIs
- Multi-tenant workspace storage: All podcast assets (audio, video, images,
charts) now use workspace-specific directories per user
- Chart preview improvements: Card-based B-Roll charts UI with thumbnails,
takeaway text, and action buttons; public endpoint for image serving
- Voice clone caching: In-memory LRU cache for voice samples (avoids
re-downloading per scene); frontend caches voice clone metadata
- Thread pool for voice clone: Audio generation uses ThreadPoolExecutor to
avoid blocking the FastAPI event loop
- Auto-detect voice clone IDs (vc_*, MY_VOICE_CLONE) to route correctly
- DB fallback for voice sample URL: Fetches from ContentAsset if not passed
- Fixed API URL resolution for chart previews
- Fixed GlassyCard DOM warnings for motion props
- Fixed ScriptGenerationProgressView syntax error
- Fixed usePodcastWorkflow scriptData reference
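A minimal sketch of the thread pool approach for voice clone audio generation. synthesize_blocking is a stand-in for the blocking voice clone API call, and the pool size is an arbitrary assumption.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Dedicated pool so slow audio jobs do not starve the default executor.
_voice_clone_pool = ThreadPoolExecutor(max_workers=2)


def synthesize_blocking(text: str) -> bytes:
    """Stand-in for a blocking voice clone request."""
    time.sleep(0.05)
    return text.encode("utf-8")


async def generate_scene_audio(text: str) -> bytes:
    # Run the blocking call in a worker thread; the event loop stays free.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_voice_clone_pool, synthesize_blocking, text)


async def main() -> list:
    # Both scenes are synthesized concurrently; gather preserves input order.
    return await asyncio.gather(
        generate_scene_audio("scene one"),
        generate_scene_audio("scene two"),
    )


results = asyncio.run(main())
```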
- Fix database session handling in main_image_editing.py so the session generator is consumed and closed correctly
- Add graceful handling of validation errors in podcast-only mode
- Add better error messages when WAVESPEED_API_KEY or HF_TOKEN is missing
- Add specific HTTP 503 error for configuration issues
- Add ALWRITY_SKIP_IMAGE_EDITING_VALIDATION env var to bypass validation in dev
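For the session handling fix, one way to consume a generator-style dependency outside of FastAPI's injection is sketched below. FakeSession, this get_db, and session_scope are stand-ins written for illustration, not the code in main_image_editing.py.

```python
from contextlib import contextmanager


class FakeSession:
    """Stand-in for a database session."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def get_db():
    """Stand-in for a FastAPI-style generator dependency."""
    db = FakeSession()
    try:
        yield db
    finally:
        db.close()


@contextmanager
def session_scope():
    gen = get_db()
    db = next(gen)  # advance the generator to its yield to obtain the session
    try:
        yield db
    finally:
        gen.close()  # raises GeneratorExit at the yield, running get_db's cleanup


with session_scope() as session:
    was_open_inside = not session.closed
```

Calling get_db() alone returns a generator object, not a session, which is a common source of attribute errors when the dependency is used directly.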
- Remove hardcoded preferred_provider=huggingface in podcast handlers
- Set preferred_provider=None to respect GPT_PROVIDER env var
- Change default model from Qwen to gpt-oss-120b:cerebras (the model the user had access to)
- WaveSpeed now uses the gpt-oss-120b model instead of Qwen
- Add HTTPException re-raise before generic Exception handler
- Use a static error message instead of str(e), which referenced e out of scope
- Fixes the 'e is not associated with a value' UnboundLocalError
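The handler ordering described above can be sketched like this. The HTTPException class is a local stand-in so the example runs without FastAPI installed, and run_handler is a hypothetical wrapper.

```python
class HTTPException(Exception):
    """Stand-in for fastapi.HTTPException."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def run_handler(operation):
    try:
        return operation()
    except HTTPException:
        # Re-raise unchanged so a deliberate 401/429 is not rewritten as a 500.
        raise
    except Exception:
        # Static message: nothing here depends on an exception variable that
        # may be unbound at this point.
        raise HTTPException(status_code=500, detail="Operation failed")
```

Python unbinds the name in `except Exception as e` when that block ends, so any later reference to e raises UnboundLocalError; a static message avoids the problem entirely.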
- Returns HTTP 429 (usage limit) instead of 503 for provider failures
- Includes usage_info with error_type, operation_type, and suggestion
- Frontend SubscriptionContext can now display the modal
- Log gpt_provider and model in preflight info
- Return structured HTTP 503 with actionable error details
- Include available_providers, requested_provider, and suggestion
- Help users understand what went wrong and how to fix it
- Return 503 with structured error details instead of generic RuntimeError
- Include available_providers and requested_provider in error
- Add actionable suggestions for users
- Check if no providers configured and return specific error
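A sketch of the structured 503 payload. The keys requested_provider, available_providers, and suggestion come from this log; the function name, the error key, and the suggestion wording are assumptions.

```python
from typing import Optional, Sequence


def provider_unavailable_error(requested: Optional[str], available: Sequence[str]) -> dict:
    """Build a 503 payload that tells the user what failed and how to fix it."""
    if not available:
        suggestion = "No providers are configured. Add at least one provider API key."
    else:
        suggestion = f"Set GPT_PROVIDER to one of: {', '.join(available)}."
    return {
        "status_code": 503,
        "detail": {
            "error": "provider_unavailable",  # assumed key, for illustration
            "requested_provider": requested,
            "available_providers": list(available),
            "suggestion": suggestion,
        },
    }
```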
- Add GPT_PROVIDER wavespeed/openai support in main_text_generation.py
- wavespeed_text_response now called when GPT_PROVIDER=wavespeed
- Fallback to tenant config when no GPT_PROVIDER set
- Add wavespeed provider mapping in provider_enum
- Fix generate_image() call to use options dict in podcast analysis
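The provider selection order (GPT_PROVIDER first, tenant config as fallback) might be expressed as below. resolve_provider and its parameters are hypothetical; the actual logic in main_text_generation.py is not shown in this log.

```python
from typing import Mapping, Optional


def resolve_provider(env: Mapping[str, str], tenant_provider: Optional[str]) -> Optional[str]:
    """Prefer the GPT_PROVIDER env var; fall back to the tenant's configured provider."""
    raw = (env.get("GPT_PROVIDER") or "").strip().lower()
    if raw:
        return raw
    return tenant_provider
```

Passing the environment in as a mapping, rather than reading os.environ inside the function, keeps the rule easy to test.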
- Add dedicated image_generation module with statistical extraction
- Support 16 industry domains with visual concept detection
- Add model-specific guidance for Ideogram, FLUX, GLM, Qwen, MAI
- Extract statistics, rankings, comparisons, and trends automatically
- Refactor backend/api/images.py to use new module
This commit adds the Auto-Dubbing feature for Podcast Maker. It translates
podcast audio into other languages, with optional voice cloning to preserve
the original speaker's voice.
New Features:
- Translation Service (common module): DeepL integration for low-cost
translation, WaveSpeed integration for high-quality translation
- Audio Dubbing Service: STT -> Translate -> TTS pipeline with
voice cloning support
- 9 new API endpoints for dubbing and voice cloning
- Support for 34+ languages
- Cost estimation utilities
- Comprehensive documentation
Files Added:
- services/translation/ (5 files): Translation service module
- services/dubbing/: Audio dubbing service
- api/podcast/handlers/dubbing.py: API endpoints
- docs/AUTO_DUBBING.md: Feature documentation
- CHANGELOG.md: Change log
Files Modified:
- api/podcast/models.py: Added dubbing request/response models
- api/podcast/router.py: Added dubbing routes
- services/__init__.py: Export translation and dubbing services
- scene_animation.py: Fixed missing Path import
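The STT -> Translate -> TTS pipeline can be sketched with pluggable stages. Everything named here (DubbingResult, dub_audio, the stage signatures) is hypothetical; the real services under services/dubbing/ and services/translation/ are not shown in this log.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DubbingResult:
    transcript: str
    translation: str
    audio: bytes


def dub_audio(
    audio: bytes,
    target_language: str,
    transcribe: Callable[[bytes], str],
    translate: Callable[[str, str], str],
    synthesize: Callable[[str, str], bytes],
) -> DubbingResult:
    """Run speech-to-text, translation, then text-to-speech, in that order."""
    transcript = transcribe(audio)
    translation = translate(transcript, target_language)
    dubbed = synthesize(translation, target_language)
    return DubbingResult(transcript, translation, dubbed)


# Fake stages, for demonstration only.
result = dub_audio(
    b"raw-audio",
    "es",
    transcribe=lambda _audio: "hello",
    translate=lambda text, lang: f"[{lang}] {text}",
    synthesize=lambda text, _lang: text.encode("utf-8"),
)
```

Injecting the stages as callables lets the translation backend or a voice-cloning synthesizer be swapped without touching the pipeline.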
- Add services/startup_health.py with health check functions:
  - get_startup_status(): Returns current startup status
  - readiness_under_auth_context(): Validates tenant DB under auth context
  - run_startup_health_routine(): Runs all startup health checks
- Add /health/readiness endpoint for tenant DB validation
- Update startup_event() to use run_startup_health_routine()
- Add raise to startup_event to fail fast on errors
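A sketch of the fail-fast behavior for startup checks. run_health_checks and its dict-of-callables signature are assumptions for illustration, not the actual run_startup_health_routine() API.

```python
from typing import Callable, Dict


def run_health_checks(checks: Dict[str, Callable[[], None]]) -> Dict[str, str]:
    """Run checks in order; raise on the first failure so startup aborts."""
    status: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            check()
        except Exception as exc:
            # Fail fast: a broken dependency should stop the app from serving.
            raise RuntimeError(f"Startup health check '{name}' failed") from exc
        status[name] = "ok"
    return status
```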
- Import APIKeyManager for provider key checking
- Use APIKeyManager.get_api_key() instead of the get_api_key() function
- Add wavespeed provider to available_providers check
- Add detailed provider preflight logging with flow_type tag
- Improve fallback logic when preferred provider is unavailable
These improvements come from PRs #423-#431 while maintaining the modular textgen_utils structure.
huggingface_provider.py:
- Add retry logic with _should_retry_hf_error and _is_non_retryable_hf_error
- Update default models from :groq to :cerebras (HF_FALLBACK_MODELS)
- Add fallback_models parameter to huggingface_text_response
- Add get_available_models with updated model list
main_text_generation.py:
- Add GPT_PROVIDER and TEXTGEN_AI_MODELS env var support
- Add preferred_provider and flow_type parameters to llm_text_gen
- Add HF_MODEL_MAPPING for short model name resolution
- Add flow_type logging tag for better observability
sif_agents.py:
- Add LOW_COST_SHARED_REMOTE_MODELS for SIF agents
- Update SharedLLMWrapper to use preferred_hf_models and flow_type
These changes preserve the modular textgen_utils structure while incorporating
the useful routing and retry logic improvements from the pending PRs.
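The retry and fallback-model routing could be sketched as follows. The function, the NonRetryableError class, and the placeholder model names are hypothetical; they stand in for _should_retry_hf_error, _is_non_retryable_hf_error, and the fallback_models parameter, whose real behavior is not shown in this log.

```python
from typing import Callable, Optional, Sequence


class NonRetryableError(Exception):
    """Stand-in for errors that should not be retried (for example, bad credentials)."""


def generate_with_fallback(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
    max_attempts: int = 2,
) -> str:
    """Try each model in order, retrying transient failures before moving on."""
    last_error: Optional[Exception] = None
    for model in models:
        for _attempt in range(max_attempts):
            try:
                return call_model(model, prompt)
            except NonRetryableError:
                raise  # retrying or switching models would not help
            except Exception as exc:
                last_error = exc  # treated as transient: retry, then fall back
    raise RuntimeError("All models failed") from last_error
```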