Face Swap Studio - Implementation Complete ✅
Overview
Face Swap Studio integrates MoCha (wavespeed-ai/wan-2.1/mocha) for video character replacement. Users supply a reference image and a source video, and the face or character in the video is replaced with the one from the image.
Official Documentation Reference
WaveSpeed API Documentation: https://wavespeed.ai/docs/docs-api/wavespeed-ai/wan-2.1-mocha
Model: wavespeed-ai/wan-2.1/mocha
Endpoint: https://api.wavespeed.ai/api/v3/wavespeed-ai/wan-2.1/mocha
Implementation Summary
✅ Backend Implementation
- WaveSpeed Client Integration
  - Added face_swap() method to VideoGenerator (backend/services/wavespeed/generators/video.py)
  - Added wrapper method to WaveSpeedClient (backend/services/wavespeed/client.py)
  - Handles MoCha API submission and polling
  - Supports sync mode with progress callbacks
- Face Swap Service (backend/services/video_studio/face_swap_service.py)
  - FaceSwapService class for face swap operations
  - Cost calculation with min/max billing rules
  - Image and video base64 encoding
  - File saving and asset library integration
  - Progress tracking
- API Endpoints (backend/routers/video_studio/endpoints/face_swap.py)
  - POST /api/video-studio/face-swap - Main face swap endpoint
  - POST /api/video-studio/face-swap/estimate-cost - Cost estimation endpoint
  - File validation (image < 10MB, video < 500MB)
  - Error handling and logging
✅ Frontend Implementation
- Main Component (FaceSwap.tsx)
  - Image and video upload with previews
  - Settings panel (prompt, resolution, seed)
  - Progress tracking
  - Result display with download
- Components
  - ImageUpload - Reference image upload component
  - VideoUpload - Source video upload component
  - SettingsPanel - Configuration options
- Hook (useFaceSwap.ts)
  - State management for all face swap operations
  - API integration
  - Cost estimation
  - Progress tracking
- Integration
  - Added to Video Studio dashboard modules
  - Added to App.tsx routing (/video-studio/face-swap)
  - Exported from Video Studio index
API Parameters (Per Official Documentation)
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| image | string | Yes | - | Base64 data URI or URL | The image for generating the output (reference character) |
| video | string | Yes | - | Base64 data URI or URL | The video for generating the output (source video) |
| prompt | string | No | - | Any text | The positive prompt for the generation |
| resolution | string | No | 480p | 480p, 720p | The resolution of the output video |
| seed | integer | No | -1 | -1 ~ 2147483647 | The random seed to use for the generation. -1 means a random seed will be used. |
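As an illustration of the parameter table, the request body could be assembled and validated roughly as follows. This is a minimal sketch, not the project's actual code: the function name `build_mocha_payload` and the choice to omit an empty prompt are assumptions; only the field names, defaults, and ranges come from the table above.

```python
from typing import Any, Dict, Optional

ALLOWED_RESOLUTIONS = ("480p", "720p")
MAX_SEED = 2147483647


def build_mocha_payload(
    image: str,
    video: str,
    prompt: Optional[str] = None,
    resolution: str = "480p",
    seed: int = -1,
) -> Dict[str, Any]:
    """Hypothetical helper: validate inputs against the table and build the body."""
    if not image or not video:
        raise ValueError("image and video are required")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
    if not -1 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be between -1 and {MAX_SEED}")

    payload: Dict[str, Any] = {
        "image": image,
        "video": video,
        "resolution": resolution,
        "seed": seed,
    }
    # Assumption: an unset prompt is left out rather than sent as an empty string.
    if prompt:
        payload["prompt"] = prompt
    return payload
```

Rejecting an out-of-range seed or unsupported resolution locally avoids spending an API round trip on a request that the table says is invalid.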
Response Structure
{
  "code": 200,
  "message": "success",
  "data": {
    "id": "prediction_id",
    "model": "wavespeed-ai/wan-2.1/mocha",
    "outputs": ["video_url"],
    "status": "completed",
    "urls": {
      "get": "https://api.wavespeed.ai/api/v3/predictions/{id}/result"
    },
    "has_nsfw_contents": [false],
    "created_at": "2023-04-01T12:34:56.789Z",
    "error": "",
    "timings": {
      "inference": 12345
    }
  }
}
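One way to read this structure is sketched below. The helper name `extract_video_url` is hypothetical, and so is the policy it applies: a non-200 code or a non-empty error field is treated as a failure, and any status other than "completed" is treated as "not finished yet". The source shows only the "completed" case, so other status values are not assumed here.

```python
from typing import Any, Dict, Optional


def extract_video_url(response: Dict[str, Any]) -> Optional[str]:
    """Hypothetical helper: return the first output URL once the task is completed.

    Returns None while the task has not reached "completed"; raises on errors.
    """
    if response.get("code") != 200:
        raise RuntimeError(f"API error: {response.get('message', 'unknown')}")

    data = response.get("data") or {}
    if data.get("error"):
        raise RuntimeError(f"Task failed: {data['error']}")
    if data.get("status") != "completed":
        return None  # still running, caller should poll again

    outputs = data.get("outputs") or []
    if not outputs:
        raise RuntimeError("Task completed but returned no outputs")
    return outputs[0]
```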
Pricing (Per Official Documentation)
| Resolution | Price per 5s | Price per second | Max Length |
|---|---|---|---|
| 480p | $0.20 | $0.04 / s | 120 s |
| 720p | $0.40 | $0.08 / s | 120 s |
Billing Rules
- Minimum charge: 5 seconds - any video shorter than 5 seconds is billed as 5 seconds
- Maximum billed duration: 120 seconds (2 minutes)
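The pricing table and billing rules can be expressed as a small function. This is a sketch under stated assumptions: the name `calculate_cost` is illustrative, fractional seconds are assumed to be billed proportionally, and durations above 120 seconds are clamped for billing purposes rather than rejected.

```python
PRICE_PER_SECOND = {"480p": 0.04, "720p": 0.08}  # USD, from the pricing table
MIN_BILLED_SECONDS = 5.0
MAX_BILLED_SECONDS = 120.0


def calculate_cost(resolution: str, duration_seconds: float) -> float:
    """Hypothetical helper: apply the min/max billing rules and return USD."""
    if resolution not in PRICE_PER_SECOND:
        raise ValueError(f"Unsupported resolution: {resolution}")
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")

    # Clamp to the billable window: at least 5 s, at most 120 s.
    billed = min(max(duration_seconds, MIN_BILLED_SECONDS), MAX_BILLED_SECONDS)
    return round(billed * PRICE_PER_SECOND[resolution], 2)
```

For example, a 3-second clip at 480p is billed as 5 seconds (5 × $0.04 = $0.20), and a 10-second clip at 720p costs 10 × $0.08 = $0.80.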
Key Features
🌟 MoCha Capabilities
- 🧠 Structure-Free Replacement: No need for pose or depth maps — MoCha automatically aligns motion, expression, and body posture
- 🎥 Motion Preservation: Accurately transfers the source actor's motion, emotion, and camera perspective to the target character
- 🎨 Identity Consistency: Maintains the new character's facial identity, lighting, and style across frames without flickering
- ⚙️ Easy Setup: Works with a single image and a source video — no need for complex preprocessing or rigging
- 💡 High Realism, Low Effort: Perfect for film, advertising, digital avatars, and creative character transformation
🧩 Best Practices (From Documentation)
- Match Pose & Composition: Keep reference image's camera angle, body orientation, and framing close to target video
- Keep Aspect Ratios Consistent: Use the same aspect ratio between input image and video
- Limit Video Length: For best stability, keep clips under 60 seconds — longer clips may show slight quality degradation
- Lighting Consistency: Match lighting direction and tone between image and video to minimize blending artifacts
Implementation Details
Backend Flow
- User uploads image and video files
- Files are validated (size, type)
- Files are converted to base64 data URIs
- Request is submitted to MoCha API via WaveSpeed client
- Task is polled until completion
- Video is downloaded from output URL
- Video is saved to user's asset library
- Cost is calculated and tracked
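Two of the steps above, converting a file to a base64 data URI and polling until completion, can be sketched as follows. The names `to_data_uri` and `poll_until_complete`, the polling interval, and the timeout are assumptions for illustration; the actual service and WaveSpeed client may differ. The fetch function is injected so the sketch does not assume any particular HTTP library or authentication scheme.

```python
import base64
import time
from typing import Any, Callable, Dict

MAX_IMAGE_BYTES = 10 * 1024 * 1024   # image < 10MB
MAX_VIDEO_BYTES = 500 * 1024 * 1024  # video < 500MB


def to_data_uri(content: bytes, mime_type: str, max_bytes: int) -> str:
    """Hypothetical helper: size-check a file and encode it as a data URI."""
    if not content:
        raise ValueError("file is empty")
    if len(content) >= max_bytes:
        raise ValueError(f"file must be smaller than {max_bytes} bytes")
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def poll_until_complete(
    fetch_result: Callable[[], Dict[str, Any]],
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Hypothetical helper: call fetch_result until status is "completed".

    fetch_result should return the "data" object of the prediction response.
    """
    deadline = time.monotonic() + timeout
    while True:
        data = fetch_result()
        if data.get("error"):
            raise RuntimeError(f"Task failed: {data['error']}")
        if data.get("status") == "completed":
            return data
        if time.monotonic() >= deadline:
            raise TimeoutError("Task did not complete before the timeout")
        sleep(interval)
```

A caller would pass something like `to_data_uri(image_bytes, "image/png", MAX_IMAGE_BYTES)` as the image parameter, then hand `poll_until_complete` a function that fetches the prediction result.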
Frontend Flow
- User uploads reference image (JPG/PNG, avoid WEBP)
- User uploads source video (MP4, WebM, max 500MB, max 120s)
- User configures settings (optional prompt, resolution, seed)
- User clicks "Swap Face"
- Progress is tracked during processing
- Result video is displayed with download option
File Structure
backend/
├── services/
│   ├── wavespeed/
│   │   ├── generators/
│   │   │   └── video.py              # Added face_swap() method
│   │   └── client.py                 # Added face_swap() wrapper
│   └── video_studio/
│       └── face_swap_service.py      # Face swap service
└── routers/
    └── video_studio/
        └── endpoints/
            └── face_swap.py          # API endpoints

frontend/src/components/VideoStudio/modules/FaceSwap/
├── FaceSwap.tsx                      # Main component
├── hooks/
│   └── useFaceSwap.ts                # State management hook
└── components/
    ├── ImageUpload.tsx               # Image upload component
    ├── VideoUpload.tsx               # Video upload component
    ├── SettingsPanel.tsx             # Settings panel
    └── index.ts                      # Component exports
API Endpoints
POST /api/video-studio/face-swap
Request:
- image_file: UploadFile (required) - Reference image
- video_file: UploadFile (required) - Source video
- prompt: string (optional) - Guide the swap
- resolution: string (optional, default "480p") - "480p" or "720p"
- seed: integer (optional) - Random seed (-1 for random)
Response:
{
  "success": true,
  "video_url": "/api/video-studio/videos/{user_id}/{filename}",
  "cost": 0.40,
  "resolution": "720p",
  "metadata": {
    "original_image_size": 123456,
    "original_video_size": 4567890,
    "swapped_video_size": 5678901,
    "resolution": "720p",
    "seed": -1
  }
}
POST /api/video-studio/face-swap/estimate-cost
Request:
- resolution: string (required) - "480p" or "720p"
- estimated_duration: float (required) - Duration in seconds (5.0 - 120.0)
Response:
{
  "estimated_cost": 0.80,
  "resolution": "720p",
  "estimated_duration": 10.0,
  "cost_per_second": 0.08,
  "pricing_model": "per_second",
  "min_duration": 5.0,
  "max_duration": 120.0,
  "min_charge": 0.40
}
Status
✅ Complete: Face Swap Studio is fully implemented and ready for use.
- ✅ Backend: Complete and integrated with WaveSpeed client
- ✅ Frontend: Complete with full UI and state management
- ✅ Routing: Added to dashboard and App.tsx
- ✅ Documentation: Matches official MoCha API documentation
Next Steps
- Testing: Test face swap with various image/video combinations
- Duration Detection: Improve cost calculation by detecting actual video duration
- Error Handling: Add more specific error messages for common issues
- UI Improvements: Add tips and best practices directly in the UI