Files
ALwrity/backend/CHANGELOG.md
ajaysi f503a24b3b feat: Add Auto-Dubbing feature for Podcast Maker
This commit adds the Auto-Dubbing feature for Podcast Maker with support
for translating podcast audio to different languages with optional voice
cloning to preserve the original speaker's voice.

New Features:
- Translation Service (common module): DeepL integration for low-cost
  translation, WaveSpeed integration for high-quality translation
- Audio Dubbing Service: STT -> Translate -> TTS pipeline with
  voice cloning support
- 9 new API endpoints for dubbing and voice cloning
- Support for 34+ languages
- Cost estimation utilities
- Comprehensive documentation

Files Added:
- services/translation/ (5 files): Translation service module
- services/dubbing/: Audio dubbing service
- api/podcast/handlers/dubbing.py: API endpoints
- docs/AUTO_DUBBING.md: Feature documentation
- CHANGELOG.md: Change log

Files Modified:
- api/podcast/models.py: Added dubbing request/response models
- api/podcast/router.py: Added dubbing routes
- services/__init__.py: Export translation and dubbing services
- scene_animation.py: Fixed missing Path import
2026-03-24 15:45:51 +05:30

1.8 KiB

Changelog

All notable changes to the ALwrity project will be documented in this file.

The format is based on Keep a Changelog.

[Unreleased]

Added

Auto-Dubbing Feature (Podcast Maker)

  • Translation Service (backend/services/translation/)

    • Common translation module for use across the entire application
    • DeepL integration for low-cost, high-quality text translation (500k chars/month free)
    • WaveSpeed integration for high-quality video/audio translation
    • Support for 34+ languages
    • Batch translation support
    • Factory pattern for provider selection
    • Cost estimation utilities
  • Audio Dubbing Service (backend/services/dubbing/)

    • Audio dubbing with STT → Translate → TTS pipeline
    • Voice cloning support to preserve original speaker's voice
    • Low-quality (DeepL) and high-quality (WaveSpeed) modes
    • Batch dubbing support
    • Cost estimation
  • Podcast API Endpoints (backend/api/podcast/)

    • POST /api/podcast/dub/audio - Create audio dubbing task
    • GET /api/podcast/dub/{task_id}/result - Get dubbing result
    • POST /api/podcast/dub/voices/clone - Clone voice from audio sample
    • GET /api/podcast/dub/voices/{task_id}/result - Get voice clone result
    • POST /api/podcast/dub/estimate - Estimate dubbing cost
    • GET /api/podcast/dub/languages - List supported languages
    • GET /api/podcast/dub/voices - List available TTS voices
  • Bug Fixes

    • Fixed missing Path import in scene_animation.py

Changed

  • Updated backend/services/__init__.py to export translation and dubbing services
  • Updated .env with DeepL API key placeholder

Documentation

  • Added backend/docs/AUTO_DUBBING.md with comprehensive feature documentation

[Previous Releases]

See git history for previous changelog entries.