diff --git a/docs-site/docs/features/podcast-maker/api-reference.md b/docs-site/docs/features/podcast-maker/api-reference.md
new file mode 100644
index 00000000..e530de19
--- /dev/null
+++ b/docs-site/docs/features/podcast-maker/api-reference.md
@@ -0,0 +1,93 @@
+# Podcast Maker API Reference
+
+Base prefix: `/api/podcast`
+
+This page summarizes the Podcast Maker endpoints currently represented in frontend and backend code.
+
+## Endpoints by workflow stage
+
+### Analysis and idea shaping
+
+- `POST /idea/enhance`
+- `POST /analyze`
+- `POST /regenerate-queries`
+
+### Research
+
+- `POST /research/exa`
+
+### Scripting
+
+- `POST /script`
+- `POST /script/approve`
+
+### Audio
+
+- `POST /audio/upload`
+- `POST /audio`
+- `POST /combine-audio`
+- `GET /audio/{filename}`
+
+### Images
+
+- `POST /image`
+- `GET /images/{path}`
+
+### Video
+
+- `POST /render/video`
+- `POST /render/combine-videos`
+- `GET /videos`
+- `GET /videos/{filename}`
+- `GET /final-videos/{filename}`
+
+### Avatars
+
+- `POST /avatar/upload`
+- `POST /avatar/make-presentable`
+- `POST /avatar/generate`
+
+### Projects
+
+- `POST /projects`
+- `GET /projects`
+- `GET /projects/{project_id}`
+- `PUT /projects/{project_id}`
+- `DELETE /projects/{project_id}`
+- `POST /projects/{project_id}/favorite`
+
+### Dubbing (backend available)
+
+- `POST /dub/audio`
+- `GET /dub/{task_id}/result`
+- `GET /dub/audio/{filename}`
+- `POST /dub/estimate`
+- `GET /dub/languages`
+- `GET /dub/voices`
+- `POST /dub/voices/clone`
+- `GET /dub/voices/{task_id}/result`
+- `GET /dub/voices/audio/{filename}`
+
+## Implementation details
+
+### Endpoint usage in frontend service
+
+The current `podcastApi.ts` directly calls these podcast routes for analysis, research, script, audio, image, video, avatar, and project workflows.
+
+Known gap:
+
+- `cancelTask()` is a placeholder that posts to `/api/story/task/{taskId}/cancel` rather than a dedicated podcast route.
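The routes listed above are all relative to the `/api/podcast` base prefix. As a rough illustration of how a client might assemble full paths from them, here is a minimal sketch. `build_url` is a hypothetical helper, not something taken from `podcastApi.ts` or the backend, and percent-encoding every path parameter (including any slashes) is an assumption made for the example.

```python
from urllib.parse import quote

BASE_PREFIX = "/api/podcast"  # base prefix stated at the top of the API reference


def build_url(route: str, **params: str) -> str:
    """Hypothetical helper: fill `{name}` placeholders and prepend the base prefix."""
    filled = route
    for name, value in params.items():
        placeholder = "{" + name + "}"
        if placeholder not in filled:
            raise KeyError(f"route {route!r} has no placeholder {placeholder}")
        # Assumption: every parameter is percent-encoded as a single path segment.
        filled = filled.replace(placeholder, quote(value, safe=""))
    if "{" in filled:
        raise ValueError(f"unfilled placeholder in {filled!r}")
    return BASE_PREFIX + filled


print(build_url("/projects/{project_id}/favorite", project_id="abc123"))
print(build_url("/audio/{filename}", filename="scene 1.mp3"))
```

A helper of this shape also makes the known gap easy to spot: the cancel path starts with `/api/story`, so it cannot be produced from the podcast prefix.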
+
+### Request/response model notes
+
+At a high level:
+
+- Script endpoints exchange `idea`, `duration_minutes`, `speakers`, and optional `research`/`analysis`/`bible` context.
+- Audio endpoints exchange scene identifiers, text, and voice/rendering options.
+- Video endpoints exchange scene identifiers plus `audio_url` and optional image/prompt context.
+- Project endpoints exchange project-level state payloads suitable for restoring workflow progress.
+
+## Engineering references
+
+- `docs/Podcast_maker/AI_PODCAST_BACKEND_REFERENCE.md`
+- `docs/Podcast_maker/PODCAST_PERSISTENCE_IMPLEMENTATION.md`
diff --git a/docs-site/docs/features/podcast-maker/implementation-overview.md b/docs-site/docs/features/podcast-maker/implementation-overview.md
new file mode 100644
index 00000000..59fa4069
--- /dev/null
+++ b/docs-site/docs/features/podcast-maker/implementation-overview.md
@@ -0,0 +1,60 @@
+# Podcast Maker Implementation Overview
+
+This page keeps implementation details in one place for engineering and advanced troubleshooting.
+
+## Architecture
+
+Podcast Maker is split into:
+
+- **Frontend orchestration service**: `frontend/src/services/podcastApi.ts`
+  - Coordinates step flow (analysis → research → script → audio/video)
+  - Runs preflight checks before expensive calls
+  - Maps API payloads into UI-friendly objects
+- **Backend podcast handlers**: `backend/api/podcast/handlers/*.py`
+  - Route-level APIs for analysis, research, script, media, and projects
+  - Authenticated operations with user-scoped media/project data
+
+## Frontend orchestration responsibilities
+
+Primary responsibilities in `podcastApi.ts`:
+
+- Create project analysis payloads and map response into Podcast Analysis UI data.
+- Build/validate research query payloads for Exa research route.
+- Generate script scenes and normalize scene/line structure for editor state.
+- Render per-scene audio and combine scenes into final audio.
+- Trigger scene image and video generation workflows.
+- Persist project state via project CRUD endpoints.
+
+## Backend handler modules
+
+- `analysis.py`: idea enhancement, analysis, regenerate-queries.
+- `research.py`: Exa research endpoint.
+- `script.py`: script generation and scene approval.
+- `audio.py`: audio upload, generation, combine, serving audio files.
+- `images.py`: scene image generation and image serving.
+- `video.py`: scene video generation, video listing/serving, combine videos.
+- `avatar.py`: avatar upload, avatar generation, avatar cleanup/presentability.
+- `projects.py`: create, get, update, list, delete, favorite project records.
+- `dubbing.py`: dubbing/voice clone lifecycle endpoints (currently backend-available).
+
+## Data models (functional view)
+
+At feature level, the flow revolves around:
+
+- **Project metadata**: `project_id`, idea, duration, speakers, budget and status fields.
+- **Analysis output**: audience, content type, keywords, outlines, title suggestions.
+- **Research output**: source list, summarized insights, fact cards for script grounding.
+- **Script output**: scenes with IDs, durations, emotions, and speaker lines.
+- **Media output**: audio files, scene images, scene videos, combined episode artifacts.
+
+## Operational notes
+
+- Preflight checks are used to fail fast on plan/credit constraints.
+- Some operations are synchronous (analysis/script/audio/image), while video is async task-based.
+- Client-side task polling is used for long-running jobs.
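The polling behaviour described in the operational notes can be sketched roughly as follows. This is an illustration under stated assumptions, not the logic in `podcastApi.ts`: `fetch_status` is a hypothetical callable standing in for a request to the task status route (the overview page names `/api/podcast/task/{id}/status`), and the `status` key with its `"completed"` and `"failed"` values is an assumed response shape.

```python
import time
from typing import Any, Callable, Dict


def poll_task(
    task_id: str,
    fetch_status: Callable[[str], Dict[str, Any]],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the task reports a terminal status or the time budget runs out.

    `fetch_status` is a hypothetical wrapper around the status request; the
    "completed"/"failed" values and the `status` key are assumptions.
    """
    waited = 0.0  # sum of requested sleep intervals, not wall-clock time
    while True:
        result = fetch_status(task_id)
        status = result.get("status")
        if status == "completed":
            return result
        if status == "failed":
            raise RuntimeError(f"task {task_id} failed: {result.get('error')}")
        if waited >= timeout_s:
            raise TimeoutError(f"task {task_id} still {status!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Usage with a stubbed status source, so no network access is needed:
responses = iter([{"status": "processing"}, {"status": "completed", "video_url": "demo.mp4"}])
print(poll_task("task-1", lambda _id: next(responses), sleep=lambda _s: None))
```

Injecting `fetch_status` and `sleep` keeps the loop testable without a running backend. A real client would likely also want backoff and a way to stop polling when the user leaves the page.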
+
+## Engineering references
+
+- `docs/Podcast_maker/AI_PODCAST_BACKEND_REFERENCE.md`
+- `docs/Podcast_maker/PODCAST_API_CALL_ANALYSIS.md`
+- `docs/Podcast_maker/PODCAST_PLAN_COMPLETION_STATUS.md`
diff --git a/docs-site/docs/features/podcast-maker/overview.md b/docs-site/docs/features/podcast-maker/overview.md
new file mode 100644
index 00000000..90aec19a
--- /dev/null
+++ b/docs-site/docs/features/podcast-maker/overview.md
@@ -0,0 +1,57 @@
+# Podcast Maker Overview
+
+Podcast Maker helps you turn a topic idea into a polished episode draft with research, script generation, AI voice narration, and optional video scenes.
+
+## What you do in the product
+
+1. **Start with an idea** and episode settings (duration, speakers, style).
+2. **Review AI analysis** suggestions (audience fit, outline ideas, titles, takeaways).
+3. **Run research** from selected queries and use source-backed fact cards.
+4. **Generate and edit a script** scene-by-scene.
+5. **Generate voice audio** for each scene and combine clips into one episode file.
+6. **Optionally create scene images and talking-head videos**.
+7. **Save and revisit projects** from your episode/project list.
+
+## What you see in the UI
+
+- Suggested outlines, titles, and hooks after analysis.
+- A query approval step before research runs.
+- Fact cards and summarized research insights.
+- Scene-based script editor with approval actions.
+- Audio generation controls (voice, emotion, speed, format-related options).
+- Video task progress and completed video listing.
+- Project persistence (save/load/list/favorite/delete).
+
+## Feature status matrix (based on current code)
+
+| Capability | Status | Notes |
+|---|---|---|
+| Idea enhancement + analysis suggestions | **Implemented** | Frontend calls `/api/podcast/idea/enhance` and `/api/podcast/analyze`; backend handlers exist. |
+| Research with Exa flow | **Implemented** | Frontend uses `/api/podcast/research/exa`; backend Exa research route is present. |
+| Script generation + scene approval | **Implemented** | Frontend uses `/api/podcast/script` and `/api/podcast/script/approve`; backend handlers exist. |
+| Scene audio generation + combine audio | **Implemented** | Frontend uses `/api/podcast/audio` and `/api/podcast/combine-audio`; backend handlers exist. |
+| Scene image generation | **Implemented** | Frontend uses `/api/podcast/image`; backend image handler exists. |
+| Scene video generation + status polling + combine videos | **Implemented** | Frontend uses `/api/podcast/render/video`, `/api/podcast/task/{id}/status`, `/api/podcast/render/combine-videos`; backend video routes are present. |
+| Project CRUD + favorites | **Implemented** | Frontend calls `/api/podcast/projects*`; backend create/get/update/list/delete/favorite routes exist. |
+| Avatar upload/generate/make-presentable | **Implemented** | Frontend calls `/api/podcast/avatar/*`; backend routes exist. |
+| Audio dubbing + voice clone routes | **Partial** | Backend dubbing routes exist; not wired in `podcastApi.ts` yet. |
+| Task cancellation from Podcast Maker UI | **Partial** | Frontend has `cancelTask()` placeholder using `/api/story/task/.../cancel`, not a dedicated podcast cancel API path. |
+| Multi-provider research toggle in podcast service | **Planned/Not active in current frontend** | Podcast frontend currently targets Exa route directly instead of a user-facing provider switch in this API layer. |
+
+## Advanced / developer notes
+
+Most users can ignore this section.
+
+- Podcast Maker uses preflight checks before expensive operations (analysis/script/audio/research) to surface plan/credit issues early.
+- The frontend normalizes snake_case API responses into camelCase for UI components where needed.
+- Long-running video operations are task-based and polled from the client.
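The snake_case to camelCase normalization mentioned in the developer notes could look roughly like the sketch below. The real mapping lives in the TypeScript frontend and may be selective ("where needed"); this Python version is only an illustration of the idea. `to_camel` and `normalize` are hypothetical names, and the sample payload is invented for the example (`scene_id` in particular is an assumed field name).

```python
import re
from typing import Any


def to_camel(key: str) -> str:
    """Convert one snake_case key to camelCase (a leading underscore is left alone)."""
    return re.sub(r"(?<=[^_])_+([a-zA-Z0-9])", lambda m: m.group(1).upper(), key)


def normalize(value: Any) -> Any:
    """Recursively rename dict keys; lists are walked, other values pass through."""
    if isinstance(value, dict):
        return {to_camel(k): normalize(v) for k, v in value.items()}
    if isinstance(value, list):
        return [normalize(item) for item in value]
    return value


# Illustrative payload only; the field names are assumptions, not a documented schema.
payload = {
    "project_id": "p1",
    "duration_minutes": 10,
    "scenes": [{"scene_id": "s1", "audio_url": "/api/podcast/audio/s1.mp3"}],
}
print(normalize(payload))
```

Only keys are renamed. String values such as URLs pass through unchanged, which matters because the API paths themselves contain hyphens and underscores.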
+
+## Engineering references
+
+These are internal planning/reference docs retained as source material:
+
+- `docs/Podcast_maker/AI_PODCAST_BACKEND_REFERENCE.md`
+- `docs/Podcast_maker/AI_PODCAST_ENHANCEMENTS.md`
+- `docs/Podcast_maker/PODCAST_API_CALL_ANALYSIS.md`
+- `docs/Podcast_maker/PODCAST_PERSISTENCE_IMPLEMENTATION.md`
+- `docs/Podcast_maker/PODCAST_PLAN_COMPLETION_STATUS.md`