feat: voice clone audio generation + podcast workspace architecture

- Voice clone integration: When user selects voice clone in Write phase,
  backend uses their uploaded voice sample + scene script text to generate
  audio via qwen3/minimax/cosyvoice voice clone APIs
- Multi-tenant workspace storage: All podcast assets (audio, video, images,
  charts) now use workspace-specific directories per user
- Chart preview improvements: Card-based B-Roll charts UI with thumbnails,
  takeaway text, and action buttons; public endpoint for image serving
- Voice clone caching: In-memory LRU cache for voice samples (avoids
  re-downloading per scene); frontend caches voice clone metadata
- Thread pool for voice clone: Audio generation uses ThreadPoolExecutor to
  avoid blocking the FastAPI event loop
- Auto-detect voice clone IDs (vc_*, MY_VOICE_CLONE) to route correctly
- DB fallback for voice sample URL: Fetches from ContentAsset if not passed
- Fixed API URL resolution for chart previews
- Fixed GlassyCard DOM warnings for motion props
- Fixed ScriptGenerationProgressView syntax error
- Fixed usePodcastWorkflow scriptData reference
This commit is contained in:
ajaysi
2026-04-21 19:38:50 +05:30
parent 7637babd7d
commit 91b2f996fd
33 changed files with 1642 additions and 457 deletions

View File

@@ -34,6 +34,10 @@ try {
export type AudioGenerationSettings = {
voiceId: string;
customVoiceId?: string;
useVoiceClone?: boolean;
voiceSampleUrl?: string;
voiceCloneEngine?: string;
speed: number;
volume: number;
pitch: number;

View File

@@ -136,7 +136,7 @@ const PREDEFINED_VOICES: VoiceOption[] = [
{ id: "Exuberant_Girl", name: "Exuberant Girl", personality: "Joyful, expressive female voice - ideal for celebrations", previewUrl: VOICE_PREVIEW_MAP.Exuberant_Girl, gender: "female", category: "creative" },
];
const VOICE_CLONE_ID = "MY_VOICE_CLONE";
export const VOICE_CLONE_ID = "MY_VOICE_CLONE";
export const VoiceSelector: React.FC<VoiceSelectorProps> = ({
value,