ajaysi f503a24b3b feat: Add Auto-Dubbing feature for Podcast Maker

This commit adds the Auto-Dubbing feature for Podcast Maker with support
for translating podcast audio to different languages with optional voice
cloning to preserve the original speaker's voice.

New Features:
- Translation Service (common module): DeepL integration for low-cost
  translation, WaveSpeed integration for high-quality translation
- Audio Dubbing Service: STT -> Translate -> TTS pipeline with
  voice cloning support
- 9 new API endpoints for dubbing and voice cloning
- Support for 34+ languages
- Cost estimation utilities
- Comprehensive documentation

Files Added:
- services/translation/ (5 files): Translation service module
- services/dubbing/: Audio dubbing service
- api/podcast/handlers/dubbing.py: API endpoints
- docs/AUTO_DUBBING.md: Feature documentation
- CHANGELOG.md: Change log

Files Modified:
- api/podcast/models.py: Added dubbing request/response models
- api/podcast/router.py: Added dubbing routes
- services/__init__.py: Export translation and dubbing services
- scene_animation.py: Fixed missing Path import

2026-03-24 15:45:51 +05:30

Changelog

All notable changes to the ALwrity project will be documented in this file.

The format is based on Keep a Changelog.

[Unreleased]

Added

Auto-Dubbing Feature (Podcast Maker)

Translation Service (backend/services/translation/)
- Common translation module for use across the entire application
- DeepL integration for low-cost, high-quality text translation (free tier: 500,000 characters/month)
- WaveSpeed integration for high-quality video/audio translation
- Support for 34+ languages
- Batch translation support
- Factory pattern for provider selection
- Cost estimation utilities
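
As a rough illustration of the factory pattern for provider selection mentioned above, here is a minimal sketch. The names `TranslationProvider`, `EchoProvider`, `get_provider`, and `translate_batch` are hypothetical and are not taken from `backend/services/translation/`; the stand-in provider only tags the text instead of calling DeepL or WaveSpeed.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List


class TranslationProvider(ABC):
    """Hypothetical common interface; the real module's interface may differ."""

    @abstractmethod
    def translate(self, text: str, target_lang: str) -> str:
        ...


class EchoProvider(TranslationProvider):
    """Stand-in provider for illustration; it performs no real translation."""

    def __init__(self, name: str) -> None:
        self.name = name

    def translate(self, text: str, target_lang: str) -> str:
        # A real provider would call the DeepL or WaveSpeed API here.
        return f"[{self.name}:{target_lang}] {text}"


# Registry mapping a provider key to a zero-argument constructor.
_PROVIDERS: Dict[str, Callable[[], TranslationProvider]] = {
    "deepl": lambda: EchoProvider("deepl"),
    "wavespeed": lambda: EchoProvider("wavespeed"),
}


def get_provider(name: str) -> TranslationProvider:
    """Return a provider instance for the given key (case-insensitive)."""
    try:
        return _PROVIDERS[name.lower()]()
    except KeyError:
        raise ValueError(f"unknown translation provider: {name!r}") from None


def translate_batch(
    provider: TranslationProvider, texts: List[str], target_lang: str
) -> List[str]:
    """Translate each text in order with the same provider."""
    return [provider.translate(text, target_lang) for text in texts]
```

With this layout, callers pick a provider by key, for example `get_provider("deepl")`, and never import a concrete provider class directly.
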
Audio Dubbing Service (backend/services/dubbing/)
- Audio dubbing with STT → Translate → TTS pipeline
- Voice cloning support to preserve original speaker's voice
- Low-cost (DeepL) and high-quality (WaveSpeed) modes
- Batch dubbing support
- Cost estimation
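
The STT → Translate → TTS pipeline listed above can be pictured as three composed stages. The sketch below assumes each stage is a plain callable; `dub_audio`, `DubbingResult`, and the stage signatures are hypothetical and do not reflect the actual code in `backend/services/dubbing/`.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DubbingResult:
    transcript: str   # text recognized from the source audio
    translation: str  # transcript translated into the target language
    audio: bytes      # synthesized speech in the target language


def dub_audio(
    audio: bytes,
    target_lang: str,
    stt: Callable[[bytes], str],
    translate: Callable[[str, str], str],
    tts: Callable[[str, str, Optional[str]], bytes],
    voice_id: Optional[str] = None,
) -> DubbingResult:
    """Run the three stages in order and keep the intermediate results.

    voice_id is an assumed handle for a cloned voice; None means the
    TTS stage falls back to a stock voice.
    """
    transcript = stt(audio)
    translation = translate(transcript, target_lang)
    dubbed = tts(translation, target_lang, voice_id)
    return DubbingResult(transcript, translation, dubbed)
```

Passing the stages in as callables keeps the pipeline independent of any one provider, so the same function could serve both the DeepL and the WaveSpeed mode.
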
Podcast API Endpoints (backend/api/podcast/)
- POST /api/podcast/dub/audio - Create audio dubbing task
- GET /api/podcast/dub/{task_id}/result - Get dubbing result
- POST /api/podcast/dub/voices/clone - Clone voice from audio sample
- GET /api/podcast/dub/voices/{task_id}/result - Get voice clone result
- POST /api/podcast/dub/estimate - Estimate dubbing cost
- GET /api/podcast/dub/languages - List supported languages
- GET /api/podcast/dub/voices - List available TTS voices
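
The endpoints above follow a create-task-then-fetch-result shape. A client could poll the result endpoint along the lines of the sketch below. The `status` field and its values (`completed`, `failed`) are assumptions, not the documented response schema, and `fetch` is a hypothetical callable that performs the GET and returns the decoded JSON body.

```python
import time
from typing import Any, Callable, Dict


def wait_for_result(
    task_id: str,
    fetch: Callable[[str], Dict[str, Any]],
    interval: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll the dubbing result endpoint until the task reaches a final state."""
    path = f"/api/podcast/dub/{task_id}/result"
    for attempt in range(max_attempts):
        payload = fetch(path)
        # Assumed schema: a "status" key that ends in a terminal value.
        if payload.get("status") in ("completed", "failed"):
            return payload
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"task {task_id} not finished after {max_attempts} attempts")
```

Injecting `fetch` and `sleep` keeps the helper free of any particular HTTP library and makes it easy to exercise without a running server.
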
Bug Fixes
- Fixed missing Path import in scene_animation.py

Changed

Updated backend/services/__init__.py to export translation and dubbing services
Updated .env with DeepL API key placeholder

Documentation

Added backend/docs/AUTO_DUBBING.md with comprehensive feature documentation

[Previous Releases]

See git history for previous changelog entries.
