moreminimore-marketing/docs/FACE_SWAP_IMPLEMENTATION_COMPLETE.md
Kunthawat Greethong c35fa52117 Base code
2026-01-08 22:39:53 +07:00

Face Swap Studio - Implementation Complete

Overview

Face Swap Studio integrates MoCha (wavespeed-ai/wan-2.1/mocha), accessed through the WaveSpeed API, for video character replacement. Users supply a reference image and a source video, and the model replaces the face or character in the video with the one from the image.

Official Documentation Reference

WaveSpeed API Documentation: https://wavespeed.ai/docs/docs-api/wavespeed-ai/wan-2.1-mocha

Model: wavespeed-ai/wan-2.1/mocha
Endpoint: https://api.wavespeed.ai/api/v3/wavespeed-ai/wan-2.1/mocha

Implementation Summary

Backend Implementation

  1. WaveSpeed Client Integration

    • Added face_swap() method to VideoGenerator (backend/services/wavespeed/generators/video.py)
    • Added wrapper method to WaveSpeedClient (backend/services/wavespeed/client.py)
    • Handles MoCha API submission and polling
    • Supports sync mode with progress callbacks
  2. Face Swap Service (backend/services/video_studio/face_swap_service.py)

    • FaceSwapService class for face swap operations
    • Cost calculation with min/max billing rules
    • Image and video base64 encoding
    • File saving and asset library integration
    • Progress tracking
  3. API Endpoints (backend/routers/video_studio/endpoints/face_swap.py)

    • POST /api/video-studio/face-swap - Main face swap endpoint
    • POST /api/video-studio/face-swap/estimate-cost - Cost estimation endpoint
    • File validation (image < 10MB, video < 500MB)
    • Error handling and logging
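The file validation listed above could look roughly like the following. This is a minimal sketch, not the code in face_swap.py: the function name `validate_upload`, the MIME allow-lists, and the choice of 1 MB = 1024 × 1024 bytes are all assumptions made for illustration.

```python
MAX_IMAGE_BYTES = 10 * 1024 * 1024    # image must be smaller than 10 MB (binary MB assumed)
MAX_VIDEO_BYTES = 500 * 1024 * 1024   # video must be smaller than 500 MB

# Assumed allow-lists; the real endpoint may accept other types.
ALLOWED_IMAGE_TYPES = {"image/jpeg", "image/png"}
ALLOWED_VIDEO_TYPES = {"video/mp4", "video/webm"}


def validate_upload(kind: str, content_type: str, size_bytes: int) -> None:
    """Raise ValueError if an upload violates the documented size or type limits."""
    if kind == "image":
        allowed, limit = ALLOWED_IMAGE_TYPES, MAX_IMAGE_BYTES
    elif kind == "video":
        allowed, limit = ALLOWED_VIDEO_TYPES, MAX_VIDEO_BYTES
    else:
        raise ValueError(f"unknown upload kind: {kind!r}")

    if content_type not in allowed:
        raise ValueError(f"unsupported {kind} type: {content_type}")
    if size_bytes <= 0:
        raise ValueError(f"empty {kind} upload")
    if size_bytes >= limit:
        raise ValueError(f"{kind} too large: {size_bytes} bytes (limit {limit})")
```

Raising a single exception type keeps the endpoint code simple: the router can catch `ValueError` and map it to an HTTP 400 response.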

Frontend Implementation

  1. Main Component (FaceSwap.tsx)

    • Image and video upload with previews
    • Settings panel (prompt, resolution, seed)
    • Progress tracking
    • Result display with download
  2. Components

    • ImageUpload - Reference image upload component
    • VideoUpload - Source video upload component
    • SettingsPanel - Configuration options
  3. Hook (useFaceSwap.ts)

    • State management for all face swap operations
    • API integration
    • Cost estimation
    • Progress tracking
  4. Integration

    • Added to Video Studio dashboard modules
    • Added to App.tsx routing (/video-studio/face-swap)
    • Exported from Video Studio index

API Parameters (Per Official Documentation)

Request Parameters

| Parameter | Type | Required | Default | Range | Description |
| --- | --- | --- | --- | --- | --- |
| image | string | Yes | - | Base64 data URI or URL | The image for generating the output (reference character) |
| video | string | Yes | - | Base64 data URI or URL | The video for generating the output (source video) |
| prompt | string | No | - | Any text | The positive prompt for the generation |
| resolution | string | No | 480p | 480p, 720p | The resolution of the output video |
| seed | integer | No | -1 | -1 ~ 2147483647 | The random seed to use for the generation. -1 means a random seed will be used. |
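As an illustration of how these parameters fit together, the sketch below encodes a file as a base64 data URI and assembles a request body with the documented defaults and ranges. The helper names `to_data_uri` and `build_mocha_payload` are hypothetical and not taken from the project code.

```python
import base64
from typing import Optional

ALLOWED_RESOLUTIONS = ("480p", "720p")
MAX_SEED = 2147483647


def to_data_uri(raw: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URI (data:<mime>;base64,<payload>)."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_mocha_payload(
    image: str,
    video: str,
    prompt: Optional[str] = None,
    resolution: str = "480p",
    seed: int = -1,
) -> dict:
    """Build the JSON body; image and video are data URIs or URLs."""
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
    if not -1 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be between -1 and {MAX_SEED}")

    payload = {"image": image, "video": video, "resolution": resolution, "seed": seed}
    if prompt:  # prompt is optional, so it is omitted when empty
        payload["prompt"] = prompt
    return payload
```

For example, `build_mocha_payload(to_data_uri(image_bytes, "image/png"), "https://example.com/clip.mp4")` yields a body that uses the default 480p resolution and a random seed.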

Response Structure

{
  "code": 200,
  "message": "success",
  "data": {
    "id": "prediction_id",
    "model": "wavespeed-ai/wan-2.1/mocha",
    "outputs": ["video_url"],
    "status": "completed",
    "urls": {
      "get": "https://api.wavespeed.ai/api/v3/predictions/{id}/result"
    },
    "has_nsfw_contents": [false],
    "created_at": "2023-04-01T12:34:56.789Z",
    "error": "",
    "timings": {
      "inference": 12345
    }
  }
}
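
A response of this shape can be reduced to "the video URL, not ready yet, or an error". The sketch below is one way to do that. The names `MochaError` and `extract_video_url` are hypothetical, and the "failed" status value is an assumption, since only "completed" appears in the sample above.

```python
from typing import Optional


class MochaError(RuntimeError):
    """Raised when the API response reports an error."""


def extract_video_url(response: dict) -> Optional[str]:
    """Return the output video URL, or None while the task is still running."""
    if response.get("code") != 200:
        raise MochaError(f"API error {response.get('code')}: {response.get('message')}")

    data = response.get("data") or {}
    if data.get("error"):
        raise MochaError(data["error"])

    status = data.get("status")
    if status == "failed":  # assumed status name
        raise MochaError("prediction failed")
    if status != "completed":
        return None  # still queued or processing

    outputs = data.get("outputs") or []
    if not outputs:
        raise MochaError("completed prediction has no outputs")
    return outputs[0]
```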

Pricing (Per Official Documentation)

| Resolution | Price per 5s | Price per second | Max Length |
| --- | --- | --- | --- |
| 480p | $0.20 | $0.04 / s | 120 s |
| 720p | $0.40 | $0.08 / s | 120 s |

Billing Rules

  • Minimum charge: 5 seconds - any video shorter than 5 seconds is billed as 5 seconds
  • Maximum billed duration: 120 seconds (2 minutes)
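
The pricing table and billing rules above reduce to a clamp followed by a multiplication. The following is a minimal sketch of that arithmetic, not the code in face_swap_service.py; the function name `estimate_cost` is hypothetical. Two behaviours are assumptions: fractional seconds are billed proportionally, and clips longer than 120 s are capped rather than rejected.

```python
PRICE_PER_SECOND = {"480p": 0.04, "720p": 0.08}  # USD, from the pricing table
MIN_BILLED_SECONDS = 5.0
MAX_BILLED_SECONDS = 120.0


def estimate_cost(resolution: str, duration_seconds: float) -> float:
    """Return the estimated charge in USD for one face swap."""
    if resolution not in PRICE_PER_SECOND:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")

    # Clamp to the billed range: at least 5 s, at most 120 s.
    billed = min(max(duration_seconds, MIN_BILLED_SECONDS), MAX_BILLED_SECONDS)
    return round(PRICE_PER_SECOND[resolution] * billed, 2)
```

Under these rules a 3-second 480p clip is billed as 5 seconds ($0.20), a 10-second 720p clip costs $0.80, and any 720p clip of 120 seconds or more costs $9.60.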

Key Features

🌟 MoCha Capabilities

  • 🧠 Structure-Free Replacement: No need for pose or depth maps — MoCha automatically aligns motion, expression, and body posture
  • 🎥 Motion Preservation: Accurately transfers the source actor's motion, emotion, and camera perspective to the target character
  • 🎨 Identity Consistency: Maintains the new character's facial identity, lighting, and style across frames without flickering
  • ⚙️ Easy Setup: Works with a single image and a source video — no need for complex preprocessing or rigging
  • 💡 High Realism, Low Effort: Perfect for film, advertising, digital avatars, and creative character transformation

🧩 Best Practices (From Documentation)

  1. Match Pose & Composition: Keep the reference image's camera angle, body orientation, and framing close to those of the source video
  2. Keep Aspect Ratios Consistent: Use the same aspect ratio for the input image and the video
  3. Limit Video Length: For best stability, keep clips under 60 seconds; longer clips may show slight quality degradation
  4. Lighting Consistency: Match lighting direction and tone between the image and the video to minimize blending artifacts

Implementation Details

Backend Flow

  1. User uploads image and video files
  2. Files are validated (size, type)
  3. Files are converted to base64 data URIs
  4. Request is submitted to MoCha API via WaveSpeed client
  5. Task is polled until completion
  6. Video is downloaded from output URL
  7. Video is saved to user's asset library
  8. Cost is calculated and tracked
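
Step 5 of the flow above (polling until completion) can be sketched as a loop around a fetch function. This is an illustration under stated assumptions rather than the WaveSpeed client's actual code: `poll_until_done` and `fetch_result` are hypothetical names, `fetch_result` is assumed to return the `data` object of a prediction response, and the "failed" status value is assumed.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    fetch_result: Callable[[], dict],
    *,
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    on_progress: Optional[Callable[[str], None]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_result until the task completes; return the output video URL."""
    deadline = time.monotonic() + timeout_s
    while True:
        data = fetch_result()
        status = data.get("status", "unknown")
        if on_progress is not None:
            on_progress(status)  # lets the caller surface progress to the UI

        if status == "completed":
            return data["outputs"][0]
        if status == "failed":  # assumed status name
            raise RuntimeError(data.get("error") or "prediction failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {status!r} after {timeout_s} s")
        sleep(interval_s)
```

Passing `sleep` as a parameter keeps the loop testable without real delays; in production the default `time.sleep` is used.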

Frontend Flow

  1. User uploads reference image (JPG/PNG, avoid WEBP)
  2. User uploads source video (MP4, WebM, max 500MB, max 120s)
  3. User configures settings (optional prompt, resolution, seed)
  4. User clicks "Swap Face"
  5. Progress is tracked during processing
  6. Result video is displayed with download option

File Structure

backend/
├── services/
│   ├── wavespeed/
│   │   ├── generators/
│   │   │   └── video.py          # Added face_swap() method
│   │   └── client.py             # Added face_swap() wrapper
│   └── video_studio/
│       └── face_swap_service.py  # Face swap service
└── routers/
    └── video_studio/
        └── endpoints/
            └── face_swap.py      # API endpoints

frontend/src/components/VideoStudio/modules/FaceSwap/
├── FaceSwap.tsx                  # Main component
├── hooks/
│   └── useFaceSwap.ts           # State management hook
└── components/
    ├── ImageUpload.tsx          # Image upload component
    ├── VideoUpload.tsx          # Video upload component
    ├── SettingsPanel.tsx        # Settings panel
    └── index.ts                 # Component exports

API Endpoints

POST /api/video-studio/face-swap

Request:

  • image_file: UploadFile (required) - Reference image
  • video_file: UploadFile (required) - Source video
  • prompt: string (optional) - Guide the swap
  • resolution: string (optional, default "480p") - "480p" or "720p"
  • seed: integer (optional) - Random seed (-1 for random)

Response:

{
  "success": true,
  "video_url": "/api/video-studio/videos/{user_id}/{filename}",
  "cost": 0.40,
  "resolution": "720p",
  "metadata": {
    "original_image_size": 123456,
    "original_video_size": 4567890,
    "swapped_video_size": 5678901,
    "resolution": "720p",
    "seed": -1
  }
}
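
A caller might exercise this endpoint as in the sketch below, which posts the two files as multipart form data using the field names listed above. It assumes the third-party `requests` package and omits authentication, which the real API may require. The function name `swap_face` and the `post` parameter (which allows a stand-in to be injected for testing) are hypothetical.

```python
import mimetypes
from pathlib import Path
from typing import Callable, Optional

ENDPOINT_PATH = "/api/video-studio/face-swap"


def swap_face(
    base_url: str,
    image_path: str,
    video_path: str,
    *,
    prompt: Optional[str] = None,
    resolution: str = "480p",
    seed: int = -1,
    post: Optional[Callable] = None,
    timeout_s: float = 900.0,
) -> dict:
    """Upload an image and a video; return the parsed JSON response."""
    if post is None:
        import requests  # third-party dependency, imported lazily
        post = requests.post

    form = {"resolution": resolution, "seed": str(seed)}
    if prompt:
        form["prompt"] = prompt

    def part(path: str, handle):
        # (filename, file object, content type) tuple understood by requests
        mime = mimetypes.guess_type(path)[0] or "application/octet-stream"
        return (Path(path).name, handle, mime)

    with open(image_path, "rb") as image, open(video_path, "rb") as video:
        response = post(
            base_url.rstrip("/") + ENDPOINT_PATH,
            files={"image_file": part(image_path, image),
                   "video_file": part(video_path, video)},
            data=form,
            timeout=timeout_s,
        )
    response.raise_for_status()
    result = response.json()
    if not result.get("success"):
        raise RuntimeError(f"face swap failed: {result}")
    return result
```

On success, `result["video_url"]` holds the path of the saved video and `result["cost"]` the amount charged.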

POST /api/video-studio/face-swap/estimate-cost

Request:

  • resolution: string (required) - "480p" or "720p"
  • estimated_duration: float (required) - Duration in seconds (5.0 - 120.0)

Response:

{
  "estimated_cost": 0.80,
  "resolution": "720p",
  "estimated_duration": 10.0,
  "cost_per_second": 0.08,
  "pricing_model": "per_second",
  "min_duration": 5.0,
  "max_duration": 120.0,
  "min_charge": 0.40
}

Status

Complete: Face Swap Studio is fully implemented and ready for use.

  • Backend: Complete and integrated with WaveSpeed client
  • Frontend: Complete with full UI and state management
  • Routing: Added to dashboard and App.tsx
  • Documentation: Matches official MoCha API documentation

Next Steps

  1. Testing: Test face swap with various image/video combinations
  2. Duration Detection: Improve cost calculation by detecting actual video duration
  3. Error Handling: Add more specific error messages for common issues
  4. UI Improvements: Add tips and best practices directly in the UI

References