feat: Add Auto-Dubbing feature for Podcast Maker
This commit adds the Auto-Dubbing feature for Podcast Maker. It translates podcast audio into other languages, with optional voice cloning to preserve the original speaker's voice.

New Features:
- Translation Service (common module): DeepL integration for low-cost translation, WaveSpeed integration for high-quality translation
- Audio Dubbing Service: STT -> Translate -> TTS pipeline with voice cloning support
- 9 new API endpoints for dubbing and voice cloning
- Support for 34+ languages
- Cost estimation utilities
- Comprehensive documentation

Files Added:
- services/translation/ (5 files): Translation service module
- services/dubbing/: Audio dubbing service
- api/podcast/handlers/dubbing.py: API endpoints
- docs/AUTO_DUBBING.md: Feature documentation
- CHANGELOG.md: Change log

Files Modified:
- api/podcast/models.py: Added dubbing request/response models
- api/podcast/router.py: Added dubbing routes
- services/__init__.py: Export translation and dubbing services
- scene_animation.py: Fixed missing Path import
51
backend/CHANGELOG.md
Normal file
@@ -0,0 +1,51 @@
# Changelog

All notable changes to the ALwrity project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## [Unreleased]

### Added

#### Auto-Dubbing Feature (Podcast Maker)
- **Translation Service** (`backend/services/translation/`)
  - Common translation module for use across the entire application
  - DeepL integration for low-cost, high-quality text translation (500k chars/month free)
  - WaveSpeed integration for high-quality video/audio translation
  - Support for 34+ languages
  - Batch translation support
  - Factory pattern for provider selection
  - Cost estimation utilities

- **Audio Dubbing Service** (`backend/services/dubbing/`)
  - Audio dubbing with STT → Translate → TTS pipeline
  - Voice cloning support to preserve original speaker's voice
  - Low-quality (DeepL) and high-quality (WaveSpeed) modes
  - Batch dubbing support
  - Cost estimation

- **Podcast API Endpoints** (`backend/api/podcast/`)
  - `POST /api/podcast/dub/audio` - Create audio dubbing task
  - `GET /api/podcast/dub/{task_id}/result` - Get dubbing result
  - `POST /api/podcast/dub/voices/clone` - Clone voice from audio sample
  - `GET /api/podcast/dub/voices/{task_id}/result` - Get voice clone result
  - `POST /api/podcast/dub/estimate` - Estimate dubbing cost
  - `GET /api/podcast/dub/languages` - List supported languages
  - `GET /api/podcast/dub/voices` - List available TTS voices

- **Bug Fixes**
  - Fixed missing `Path` import in `scene_animation.py`
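
The "factory pattern for provider selection" noted under the Translation Service could look roughly like the sketch below. It is only an illustration: `TranslationProvider`, `DeepLProvider`, `WaveSpeedProvider`, `get_translation_provider`, and `translate_batch` are hypothetical names, and the provider bodies are placeholders rather than real DeepL or WaveSpeed calls.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class TranslationProvider(ABC):
    """Hypothetical common interface; the actual module may differ."""

    @abstractmethod
    def translate(self, text: str, target_lang: str) -> str:
        ...


class DeepLProvider(TranslationProvider):
    def translate(self, text: str, target_lang: str) -> str:
        # Placeholder: a real provider would call the DeepL API here.
        return f"[deepl:{target_lang}] {text}"


class WaveSpeedProvider(TranslationProvider):
    def translate(self, text: str, target_lang: str) -> str:
        # Placeholder: a real provider would call WaveSpeed here.
        return f"[wavespeed:{target_lang}] {text}"


# Registry mapping a provider name to its class (assumed names).
_PROVIDERS: Dict[str, Type[TranslationProvider]] = {
    "deepl": DeepLProvider,
    "wavespeed": WaveSpeedProvider,
}


def get_translation_provider(name: str) -> TranslationProvider:
    """Return a provider instance for a case-insensitive provider name."""
    try:
        return _PROVIDERS[name.lower()]()
    except KeyError:
        raise ValueError(f"Unknown translation provider: {name!r}") from None


def translate_batch(
    provider: TranslationProvider, texts: List[str], target_lang: str
) -> List[str]:
    """Translate each text in order with the selected provider."""
    return [provider.translate(text, target_lang) for text in texts]
```

With a registry like this, callers pick a provider by name, for example `get_translation_provider("deepl")`, and adding another backend only requires a new class and one registry entry.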
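
The STT → Translate → TTS pipeline listed under the Audio Dubbing Service can be pictured as three stages chained in order. The sketch below assumes each stage is an injected callable; `dub_audio`, `DubbingResult`, and the stub stages are hypothetical and stand in for whatever the real service uses.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DubbingResult:
    transcript: str   # text recognized from the source audio
    translation: str  # transcript translated into the target language
    audio: bytes      # synthesized speech in the target language


def dub_audio(
    audio: bytes,
    target_lang: str,
    stt: Callable[[bytes], str],
    translate: Callable[[str, str], str],
    tts: Callable[[str, Optional[str]], bytes],
    voice_id: Optional[str] = None,
) -> DubbingResult:
    """Run the three stages in order; the stage callables are assumptions."""
    transcript = stt(audio)                           # 1. speech to text
    translation = translate(transcript, target_lang)  # 2. translate the text
    dubbed = tts(translation, voice_id)               # 3. text to speech
    return DubbingResult(transcript, translation, dubbed)


# Stub stages so the sketch runs without any external service.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_translate(text: str, target_lang: str) -> str:
    return f"{text} ({target_lang})"


def fake_tts(text: str, voice_id: Optional[str]) -> bytes:
    voice = voice_id or "default"
    return f"{voice}:{text}".encode("utf-8")


result = dub_audio(
    b"hello", "es", fake_stt, fake_translate, fake_tts, voice_id="cloned-1"
)
```

Passing a `voice_id` models the optional voice-cloning path; leaving it as `None` falls back to a stock voice in this sketch.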
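
As a rough illustration of the cost estimation utilities, the sketch below applies the 500k chars/month free allowance quoted in the DeepL entry. The function name, its parameters, and the idea of billing only the characters beyond the allowance are assumptions; the per-million-character rate is supplied by the caller because this changelog does not state one.

```python
DEEPL_FREE_CHARS_PER_MONTH = 500_000  # figure quoted in the changelog entry


def estimate_translation_cost(
    char_count: int,
    chars_used_this_month: int = 0,
    rate_per_million_chars: float = 0.0,  # hypothetical, caller-supplied rate
) -> dict:
    """Split a job into free and billable characters and price the latter."""
    if char_count < 0 or chars_used_this_month < 0:
        raise ValueError("character counts must be non-negative")
    free_remaining = max(0, DEEPL_FREE_CHARS_PER_MONTH - chars_used_this_month)
    billable = max(0, char_count - free_remaining)
    return {
        "free_chars": char_count - billable,
        "billable_chars": billable,
        "cost": billable * rate_per_million_chars / 1_000_000,
    }
```

For example, a 200,000-character job submitted after 400,000 characters have already been used this month would have 100,000 free and 100,000 billable characters under these assumptions.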
### Changed

- Updated `backend/services/__init__.py` to export translation and dubbing services
- Updated `.env` with DeepL API key placeholder

### Documentation

- Added `backend/docs/AUTO_DUBBING.md` with comprehensive feature documentation

## [Previous Releases]

See git history for previous changelog entries.