This commit adds the Auto-Dubbing feature for Podcast Maker, which translates podcast audio into other languages, with optional voice cloning to preserve the original speaker's voice.

New Features:
- Translation Service (common module): DeepL integration for low-cost translation, WaveSpeed integration for high-quality translation
- Audio Dubbing Service: STT -> Translate -> TTS pipeline with voice cloning support
- 9 new API endpoints for dubbing and voice cloning
- Support for 34+ languages
- Cost estimation utilities
- Comprehensive documentation

Files Added:
- services/translation/ (5 files): Translation service module
- services/dubbing/: Audio dubbing service
- api/podcast/handlers/dubbing.py: API endpoints
- docs/AUTO_DUBBING.md: Feature documentation
- CHANGELOG.md: Change log

Files Modified:
- api/podcast/models.py: Added dubbing request/response models
- api/podcast/router.py: Added dubbing routes
- services/__init__.py: Export translation and dubbing services
- scene_animation.py: Fixed missing Path import
Changelog
All notable changes to the ALwrity project will be documented in this file.
The format is based on Keep a Changelog.
[Unreleased]
Added
Auto-Dubbing Feature (Podcast Maker)
- Translation Service (backend/services/translation/)
  - Common translation module for use across the entire application
  - DeepL integration for low-cost, high-quality text translation (500k chars/month free)
  - WaveSpeed integration for high-quality video/audio translation
  - Support for 34+ languages
  - Batch translation support
  - Factory pattern for provider selection
  - Cost estimation utilities
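The factory-based provider selection mentioned above could look roughly like the following. This is a minimal sketch, not the module's actual code: `TranslationProvider`, `StubDeepLProvider`, `StubWaveSpeedProvider`, and `get_provider` are hypothetical names, and the providers return placeholder strings instead of calling DeepL or WaveSpeed.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class TranslationProvider(ABC):
    """Hypothetical common interface; the real module may differ."""

    name: str

    @abstractmethod
    def translate(self, text: str, target_lang: str) -> str:
        """Translate a single string into target_lang."""

    def translate_batch(self, texts: list[str], target_lang: str) -> list[str]:
        # Default batch behaviour: translate the items one by one.
        return [self.translate(t, target_lang) for t in texts]


class StubDeepLProvider(TranslationProvider):
    name = "deepl"

    def translate(self, text: str, target_lang: str) -> str:
        # Placeholder only: a real provider would call the DeepL API here.
        return f"[{target_lang}] {text}"


class StubWaveSpeedProvider(TranslationProvider):
    name = "wavespeed"

    def translate(self, text: str, target_lang: str) -> str:
        # Placeholder only: a real provider would call WaveSpeed here.
        return f"[{target_lang}] {text}"


# Registry mapping a provider key to its class (assumed keys).
_PROVIDERS: dict[str, type[TranslationProvider]] = {
    "deepl": StubDeepLProvider,
    "wavespeed": StubWaveSpeedProvider,
}


def get_provider(name: str) -> TranslationProvider:
    """Factory: return a provider instance for the given key."""
    try:
        return _PROVIDERS[name.lower()]()
    except KeyError:
        raise ValueError(f"Unknown translation provider: {name!r}") from None
```

Callers depend only on the shared interface, so switching between the low-cost and high-quality provider becomes a one-string change, for example `get_provider("deepl").translate_batch(["Hello"], "es")`.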
- Audio Dubbing Service (backend/services/dubbing/)
  - Audio dubbing with STT → Translate → TTS pipeline
  - Voice cloning support to preserve original speaker's voice
  - Low-quality (DeepL) and high-quality (WaveSpeed) modes
  - Batch dubbing support
  - Cost estimation
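The STT → Translate → TTS pipeline described above can be sketched as three injected stages run in sequence. The names below (`dub_audio`, `DubbingResult`, the stage callables, and `voice_id`) are assumptions for illustration, and the stages are fakes so the example runs without any speech or translation backend.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DubbingResult:
    transcript: str        # output of the STT stage
    translated_text: str   # output of the translation stage
    audio: bytes           # output of the TTS stage


def dub_audio(
    audio: bytes,
    target_lang: str,
    *,
    stt: Callable[[bytes], str],
    translate: Callable[[str, str], str],
    tts: Callable[[str, Optional[str]], bytes],
    voice_id: Optional[str] = None,
) -> DubbingResult:
    """Run the three stages in order; voice_id stands in for a cloned voice."""
    transcript = stt(audio)
    translated = translate(transcript, target_lang)
    dubbed = tts(translated, voice_id)
    return DubbingResult(transcript, translated, dubbed)


# Fake stages, for demonstration only.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_translate(text: str, target_lang: str) -> str:
    return f"[{target_lang}] {text}"


def fake_tts(text: str, voice_id: Optional[str]) -> bytes:
    return f"{voice_id or 'default'}:{text}".encode("utf-8")


result = dub_audio(
    b"hello listeners",
    "es",
    stt=fake_stt,
    translate=fake_translate,
    tts=fake_tts,
    voice_id="cloned-voice-1",
)
```

Passing the stages in as callables keeps the pipeline independent of any one provider, which is one way the same flow could serve both the DeepL and WaveSpeed modes.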
- Podcast API Endpoints (backend/api/podcast/)
  - POST /api/podcast/dub/audio - Create audio dubbing task
  - GET /api/podcast/dub/{task_id}/result - Get dubbing result
  - POST /api/podcast/dub/voices/clone - Clone voice from audio sample
  - GET /api/podcast/dub/voices/{task_id}/result - Get voice clone result
  - POST /api/podcast/dub/estimate - Estimate dubbing cost
  - GET /api/podcast/dub/languages - List supported languages
  - GET /api/podcast/dub/voices - List available TTS voices
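The endpoint pairs above suggest a create-then-poll flow: a POST returns a task, and the matching `{task_id}/result` route is queried until the task finishes. A minimal client-side sketch of the polling half follows. The response shape is not documented here, so the `status` field and its `completed`/`failed` values are assumptions, as are the helper names; `fetch` is injected so no HTTP library is assumed.

```python
import time
from typing import Callable
from urllib.parse import quote

DUB_PREFIX = "/api/podcast/dub"


def result_path(task_id: str) -> str:
    """Build the dubbing result path for a task ID (URL-escaped)."""
    return f"{DUB_PREFIX}/{quote(task_id, safe='')}/result"


def wait_for_result(
    task_id: str,
    fetch: Callable[[str], dict],
    *,
    interval: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll the result route until a terminal status appears.

    `fetch` takes a path and returns the decoded JSON body. The
    "status" key and its terminal values are assumed, not documented.
    """
    for _ in range(max_attempts):
        payload = fetch(result_path(task_id))
        if payload.get("status") in {"completed", "failed"}:
            return payload
        sleep(interval)
    raise TimeoutError(f"Task {task_id!r} did not finish in time")
```

In practice `fetch` would wrap an authenticated GET against the backend; bounding the attempts prevents a stuck task from blocking the caller indefinitely.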
Bug Fixes
- Fixed missing Path import in scene_animation.py
Changed
- Updated backend/services/__init__.py to export translation and dubbing services
- Updated .env with DeepL API key placeholder
Documentation
- Added backend/docs/AUTO_DUBBING.md with comprehensive feature documentation
[Previous Releases]
See git history for previous changelog entries.