Kunthawat Greethong c35fa52117 Base code

2026-01-08 22:39:53 +07:00

Face Swap Studio - Implementation Complete ✅

Overview

Face Swap Studio integrates the MoCha model (wavespeed-ai/wan-2.1/mocha) through the WaveSpeed API for video character replacement. Users replace a face or character in a source video using a single reference image.

Official Documentation Reference

WaveSpeed API Documentation: https://wavespeed.ai/docs/docs-api/wavespeed-ai/wan-2.1-mocha

Model: wavespeed-ai/wan-2.1/mocha
Endpoint: https://api.wavespeed.ai/api/v3/wavespeed-ai/wan-2.1/mocha

Implementation Summary

✅ Backend Implementation

WaveSpeed Client Integration
- Added face_swap() method to VideoGenerator (backend/services/wavespeed/generators/video.py)
- Added wrapper method to WaveSpeedClient (backend/services/wavespeed/client.py)
- Handles MoCha API submission and polling
- Supports sync mode with progress callbacks

Face Swap Service (backend/services/video_studio/face_swap_service.py)
- FaceSwapService class for face swap operations
- Cost calculation with min/max billing rules
- Image and video base64 encoding
- File saving and asset library integration
- Progress tracking

API Endpoints (backend/routers/video_studio/endpoints/face_swap.py)
- POST /api/video-studio/face-swap - Main face swap endpoint
- POST /api/video-studio/face-swap/estimate-cost - Cost estimation endpoint
- File validation (image < 10 MB, video < 500 MB)
- Error handling and logging
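
The size limits listed for the API endpoints could be enforced with a check along these lines. This is a minimal sketch, not the contents of face_swap.py: `validate_upload`, `LIMITS`, and `MB` are hypothetical names, the MIME types mirror the JPG/PNG and MP4/WebM formats named in the frontend flow, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
MB = 1024 * 1024  # assumption: the documented limits are binary megabytes

# Size limits come from this document; the MIME types are an assumed
# mapping of the formats named in the frontend flow (JPG/PNG, MP4/WebM).
LIMITS = {
    "image": (10 * MB, {"image/jpeg", "image/png"}),
    "video": (500 * MB, {"video/mp4", "video/webm"}),
}


def validate_upload(kind: str, content_type: str, size_bytes: int) -> None:
    """Raise ValueError if an upload breaks the documented size or type rules."""
    if kind not in LIMITS:
        raise ValueError(f"unknown upload kind: {kind!r}")
    max_bytes, allowed_types = LIMITS[kind]
    if content_type not in allowed_types:
        raise ValueError(f"unsupported {kind} type: {content_type}")
    if size_bytes <= 0:
        raise ValueError(f"{kind} file is empty")
    # The document states strict limits (image < 10 MB, video < 500 MB).
    if size_bytes >= max_bytes:
        raise ValueError(f"{kind} must be smaller than {max_bytes // MB} MB")
```

Raising a single exception type keeps the endpoint's error handling simple: the router can catch ValueError and translate it into a client error response.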

✅ Frontend Implementation

Main Component (FaceSwap.tsx)
- Image and video upload with previews
- Settings panel (prompt, resolution, seed)
- Progress tracking
- Result display with download

Components
- ImageUpload - Reference image upload component
- VideoUpload - Source video upload component
- SettingsPanel - Configuration options

Hook (useFaceSwap.ts)
- State management for all face swap operations
- API integration
- Cost estimation
- Progress tracking

Integration
- Added to Video Studio dashboard modules
- Added to App.tsx routing (/video-studio/face-swap)
- Exported from Video Studio index

API Parameters (Per Official Documentation)

Request Parameters

| Parameter | Type | Required | Default | Accepted values | Description |
| --- | --- | --- | --- | --- | --- |
| image | string | Yes | - | Base64 data URI or URL | The image for generating the output (reference character) |
| video | string | Yes | - | Base64 data URI or URL | The video for generating the output (source video) |
| prompt | string | No | - | Any text | The positive prompt for the generation |
| resolution | string | No | 480p | 480p, 720p | The resolution of the output video |
| seed | integer | No | -1 | -1 to 2147483647 | The random seed to use for the generation. -1 means a random seed will be used. |
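
The parameter table above can be turned into a small payload builder that applies the documented defaults and ranges before anything is sent. This is a sketch under assumptions: `build_mocha_payload` is a hypothetical helper, not the project's face_swap() method, and omitting `prompt` from the body when it is empty is a design choice made here, not documented API behaviour.

```python
from typing import Any, Dict, Optional

# Endpoint, defaults, and ranges as listed in this document.
ENDPOINT = "https://api.wavespeed.ai/api/v3/wavespeed-ai/wan-2.1/mocha"
RESOLUTIONS = ("480p", "720p")
MAX_SEED = 2147483647


def build_mocha_payload(
    image: str,
    video: str,
    prompt: Optional[str] = None,
    resolution: str = "480p",
    seed: int = -1,
) -> Dict[str, Any]:
    """Return a JSON-serialisable request body, validated against the table."""
    if not image or not video:
        raise ValueError("image and video are required")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {RESOLUTIONS}")
    if not isinstance(seed, int) or not -1 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be an integer in [-1, {MAX_SEED}]")
    payload: Dict[str, Any] = {
        "image": image,
        "video": video,
        "resolution": resolution,
        "seed": seed,
    }
    if prompt:  # assumption: an empty prompt is simply left out
        payload["prompt"] = prompt
    return payload
```

Calling `build_mocha_payload(image_uri, video_uri, resolution="720p")` yields a dict that can be serialised with `json.dumps` and posted to `ENDPOINT`.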

Response Structure

{
  "code": 200,
  "message": "success",
  "data": {
    "id": "prediction_id",
    "model": "wavespeed-ai/wan-2.1/mocha",
    "outputs": ["video_url"],
    "status": "completed",
    "urls": {
      "get": "https://api.wavespeed.ai/api/v3/predictions/{id}/result"
    },
    "has_nsfw_contents": [false],
    "created_at": "2023-04-01T12:34:56.789Z",
    "error": "",
    "timings": {
      "inference": 12345
    }
  }
}
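
A polling loop over this response shape might look like the sketch below. It is not the project's client code: `wait_for_video` and its parameters are hypothetical, the HTTP call is injected as `fetch_result` so the loop stays transport-agnostic, and the `"failed"` status value is an assumption (only `"completed"` appears in this document). Any other status is treated as still in progress.

```python
import time
from typing import Any, Callable, Dict


def wait_for_video(
    fetch_result: Callable[[], Dict[str, Any]],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the prediction finishes and return the first output URL."""
    deadline = time.monotonic() + timeout_s
    while True:
        body = fetch_result()  # e.g. a GET on data.urls.get
        if body.get("code") != 200:
            raise RuntimeError(f"API error: {body.get('message')}")
        data = body.get("data") or {}
        status = data.get("status")
        if status == "completed":
            outputs = data.get("outputs") or []
            if not outputs:
                raise RuntimeError("prediction completed without outputs")
            return outputs[0]
        if status == "failed":  # assumed terminal failure status
            raise RuntimeError(data.get("error") or "prediction failed")
        if time.monotonic() >= deadline:
            raise TimeoutError("prediction did not finish in time")
        sleep(interval_s)
```

Injecting `sleep` as well as `fetch_result` lets the loop be exercised in tests without real delays or network access.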

Pricing (Per Official Documentation)

| Resolution | Price per 5 s | Price per second | Max length |
| --- | --- | --- | --- |
| 480p | $0.20 | $0.04 / s | 120 s |
| 720p | $0.40 | $0.08 / s | 120 s |

Billing Rules

- Minimum charge: 5 seconds. Any video shorter than 5 seconds is billed as 5 seconds.
- Maximum billed duration: 120 seconds (2 minutes)
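
The pricing table and billing rules reduce to clamping the duration and multiplying by the per-second rate. The following is a minimal sketch, not the FaceSwapService implementation: `estimate_cost` and the constant names are hypothetical, and billing fractional seconds proportionally (rather than rounding them up) is an assumption the document does not settle.

```python
# Per-second rates and billing bounds as listed in this document.
PRICE_PER_SECOND = {"480p": 0.04, "720p": 0.08}
MIN_BILLED_SECONDS = 5
MAX_BILLED_SECONDS = 120


def estimate_cost(duration_s: float, resolution: str = "480p") -> float:
    """Return the estimated charge in USD for a clip of the given length."""
    if resolution not in PRICE_PER_SECOND:
        raise ValueError(f"unsupported resolution: {resolution}")
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    # Clamp to the billable window: at least 5 s, at most 120 s.
    billed = min(max(duration_s, MIN_BILLED_SECONDS), MAX_BILLED_SECONDS)
    return round(billed * PRICE_PER_SECOND[resolution], 2)
```

Under these rules a 3-second 480p clip is billed as 5 seconds ($0.20), and a 5-second 720p clip costs $0.40.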

Key Features

🌟 MoCha Capabilities

🧠 Structure-Free Replacement: No need for pose or depth maps — MoCha automatically aligns motion, expression, and body posture
🎥 Motion Preservation: Accurately transfers the source actor's motion, emotion, and camera perspective to the target character
🎨 Identity Consistency: Maintains the new character's facial identity, lighting, and style across frames without flickering
⚙️ Easy Setup: Works with a single image and a source video — no need for complex preprocessing or rigging
💡 High Realism, Low Effort: Perfect for film, advertising, digital avatars, and creative character transformation

🧩 Best Practices (From Documentation)

- Match Pose & Composition: Keep the reference image's camera angle, body orientation, and framing close to those of the target video.
- Keep Aspect Ratios Consistent: Use the same aspect ratio for the input image and the video.
- Limit Video Length: For best stability, keep clips under 60 seconds. Longer clips may show slight quality degradation.
- Lighting Consistency: Match lighting direction and tone between the image and the video to minimize blending artifacts.

Implementation Details

Backend Flow

1. The user uploads the image and video files.
2. The files are validated (size, type).
3. The files are converted to base64 data URIs.
4. The request is submitted to the MoCha API via the WaveSpeed client.
5. The task is polled until completion.
6. The video is downloaded from the output URL.
7. The video is saved to the user's asset library.
8. The cost is calculated and tracked.
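
The step that converts files to base64 data URIs can be sketched with the standard library alone. `to_data_uri` is a hypothetical helper, not the service's actual method, and guessing the MIME type from the file extension is an assumption.

```python
import base64
import mimetypes
from pathlib import Path
from typing import Optional, Union


def to_data_uri(path: Union[str, Path], mime_type: Optional[str] = None) -> str:
    """Encode a file as a data URI of the form data:<mime>;base64,<payload>."""
    path = Path(path)
    # Assumption: fall back to the extension when no MIME type is supplied.
    mime = mime_type or mimetypes.guess_type(path.name)[0]
    if mime is None:
        raise ValueError(f"cannot determine MIME type for {path.name}")
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

This reads the whole file into memory, and base64 makes the payload about a third larger than the source file, which matters for uploads near the 500 MB video limit.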

Frontend Flow

1. The user uploads a reference image (JPG/PNG, avoid WEBP).
2. The user uploads a source video (MP4, WebM, max 500 MB, max 120 s).
3. The user configures settings (optional prompt, resolution, seed).
4. The user clicks "Swap Face".
5. Progress is tracked during processing.
6. The result video is displayed with a download option.

File Structure

backend/
├── services/
│   ├── wavespeed/
│   │   ├── generators/
│   │   │   └── video.py          # Added face_swap() method
│   │   └── client.py             # Added face_swap() wrapper
│   └── video_studio/
│       └── face_swap_service.py  # Face swap service
└── routers/
    └── video_studio/
        └── endpoints/
            └── face_swap.py      # API endpoints

frontend/src/components/VideoStudio/modules/FaceSwap/
├── FaceSwap.tsx                  # Main component
├── hooks/
│   └── useFaceSwap.ts           # State management hook
└── components/
    ├── ImageUpload.tsx          # Image upload component
    ├── VideoUpload.tsx          # Video upload component
    ├── SettingsPanel.tsx        # Settings panel
    └── index.ts                 # Component exports

API Endpoints

POST /api/video-studio/face-swap

Request:

- image_file: UploadFile (required) - Reference image
- video_file: UploadFile (required) - Source video
- prompt: string (optional) - Guide the swap
- resolution: string (optional, default "480p") - "480p" or "720p"
- seed: integer (optional) - Random seed (-1 for random)

Response:

{
  "success": true,
  "video_url": "/api/video-studio/videos/{user_id}/{filename}",
  "cost": 0.40,
  "resolution": "720p",
  "metadata": {
    "original_image_size": 123456,
    "original_video_size": 4567890,
    "swapped_video_size": 5678901,
    "resolution": "720p",
    "seed": -1
  }
}

POST /api/video-studio/face-swap/estimate-cost