# Image Studio: WaveSpeed AI Models Reference

**Purpose**: Complete reference guide for all WaveSpeed AI models integrated into Image Studio
**Last Updated**: Current Session

---

## 📊 Model Overview

Image Studio integrates **40+ WaveSpeed AI models** across several categories, so users can choose among alternatives for each task based on cost, quality, and use case.

---

## 🎨 Image Editing Models (12 Models)

### **Budget Tier** ($0.02-$0.03)

#### 1. **Qwen Image Edit** - `wavespeed-ai/qwen-image/edit`
- **Cost**: $0.02
- **Features**: Bilingual (CN/EN), appearance + semantic editing, style preservation
- **Best For**: Budget-conscious editing, bilingual content, style transfers
- **Use Cases**: Quick edits, content localization, style experiments

#### 2. **Qwen Image Edit Plus** - `wavespeed-ai/qwen-image/edit-plus`
- **Cost**: $0.02
- **Features**: Multi-image editing, ControlNet support, character consistency
- **Best For**: Batch editing, consistent character work, multi-image workflows
- **Use Cases**: Character consistency across images, batch style application

#### 3. **Step1X Edit** - `wavespeed-ai/step1x-edit`
- **Cost**: $0.03
- **Features**: Simple prompt editing, precise modifications
- **Best For**: Quick edits, straightforward changes
- **Use Cases**: Hair color changes, accessory additions, simple modifications

#### 4. **HiDream E1 Full** - `wavespeed-ai/hidream-e1-full`
- **Cost**: $0.024
- **Features**: Identity-preserving edits, wardrobe/accessory changes
- **Best For**: Fashion edits, character consistency, portrait work
- **Use Cases**: Outfit changes, accessory modifications, portrait retouching

#### 5. **SeedEdit V3** - `bytedance/seededit-v3`
- **Cost**: $0.027
- **Features**: Prompt-guided editing, identity preservation
- **Best For**: Portrait edits, e-commerce variants, localized edits
- **Use Cases**: Hair/style changes, product color variants, marketing iterations

---

### **Mid Tier** ($0.035-$0.04)

#### 6. **Alibaba WAN 2.5 Image Edit** - `alibaba/wan-2.5/image-edit`
- **Cost**: $0.035
- **Features**: Structure-preserving edits, prompt expansion
- **Best For**: Quick adjustments, cost-effective editing
- **Use Cases**: Lighting changes, color adjustments, object modifications

#### 7. **FLUX Kontext Pro** - `wavespeed-ai/flux-kontext-pro`
- **Cost**: $0.04
- **Features**: Improved prompt adherence, typography generation, consistency
- **Best For**: Typography-heavy edits, consistent results, professional work
- **Use Cases**: Text in images, poster editing, marketing materials

#### 8. **FLUX Kontext Pro Multi** - `wavespeed-ai/flux-kontext-pro/multi`
- **Cost**: $0.04
- **Features**: Multi-image handling (up to 5 references), context combination
- **Best For**: Character consistency, style alignment, multi-image workflows
- **Use Cases**: Consistent character generation, product variations, style matching

---

### **Premium Tier** ($0.08-$0.20)

#### 9. **FLUX Kontext Max** - `wavespeed-ai/flux-kontext-max`
- **Cost**: $0.08
- **Features**: Premium quality, high-fidelity transformations
- **Best For**: Professional retouching, style transformations, high-end work
- **Use Cases**: Premium retouching, cinematic edits, artistic transformations

#### 10. **Ideogram Character** - `ideogram-ai/ideogram-character`
- **Cost**: $0.10-$0.20 (Turbo/Default/Quality)
- **Features**: Character-focused editing, outfit/appearance changes, style modes
- **Best For**: Fashion visualization, character design, portrait work
- **Use Cases**: Outfit changes, character variations, fashion campaigns

#### 11. **Google Nano Banana Pro Edit Ultra** - `google/nano-banana-pro/edit-ultra`
- **Cost**: $0.15 (4K) / $0.18 (8K)
- **Features**: Native 4K/8K editing, natural language, multilingual text
- **Best For**: Professional marketing, high-res edits, typography work
- **Use Cases**: Campaign visuals, print materials, high-resolution work

---

### **Quality Tiers** (Variable Pricing)

#### 12. **OpenAI GPT Image 1** - `openai/gpt-image-1`
- **Cost**: $0.011-$0.250 (varies by quality and size)
  - Low: $0.011 (square) / $0.016 (rectangular)
  - Medium: $0.042 (square) / $0.063 (rectangular)
  - High: $0.167 (square) / $0.250 (rectangular)
- **Features**: Quality tiers, mask support, style transformation
- **Best For**: Style transfers, creative transformations, quality control
- **Use Cases**: Artistic style changes, creative edits, quality-based workflows
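
The quality and size pricing listed above maps naturally onto a small lookup table. The following is a minimal sketch, not an official client: the names `GPT_IMAGE_1_PRICES` and `gpt_image_1_cost` are hypothetical, and treating any non-square size as "rectangular" is an assumption. The prices are copied from the list above.

```python
# Hypothetical helper: per-image prices (USD) for openai/gpt-image-1,
# copied from the cost breakdown in this reference.
GPT_IMAGE_1_PRICES = {
    ("low", "square"): 0.011,
    ("low", "rectangular"): 0.016,
    ("medium", "square"): 0.042,
    ("medium", "rectangular"): 0.063,
    ("high", "square"): 0.167,
    ("high", "rectangular"): 0.250,
}


def gpt_image_1_cost(quality: str, width: int, height: int, count: int = 1) -> float:
    """Estimate the cost in USD of `count` images at a given quality and size."""
    # Assumption: any size with unequal sides is billed at the rectangular rate.
    shape = "square" if width == height else "rectangular"
    key = (quality.lower(), shape)
    if key not in GPT_IMAGE_1_PRICES:
        raise ValueError(f"unknown quality tier: {quality!r}")
    # Round to a tenth of a cent to hide float noise.
    return round(GPT_IMAGE_1_PRICES[key] * count, 3)


if __name__ == "__main__":
    print(gpt_image_1_cost("medium", 1024, 1024))
    print(gpt_image_1_cost("high", 1536, 1024, count=4))
```

A table keyed by `(quality, shape)` keeps the pricing data in one place, so a price change only touches the dictionary.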

---

## ⬆️ Upscaling Models (3 Models)

### 1. **Image Upscaler** - `wavespeed-ai/image-upscaler`
- **Cost**: $0.01
- **Resolution**: 2K/4K/8K
- **Best For**: Fast, affordable upscaling
- **Speed**: Fast

### 2. **Bria Increase Resolution** - `bria/increase-resolution`
- **Cost**: $0.04
- **Resolution**: 2x/4x multiplier
- **Best For**: Detail-preserving upscale
- **Speed**: Medium

### 3. **Ultimate Image Upscaler** - `wavespeed-ai/ultimate-image-upscaler`
- **Cost**: $0.06
- **Resolution**: 2K/4K/8K
- **Best For**: Premium quality upscaling
- **Speed**: Medium

---

## 👤 Face Swap Models (5 Models)

### 1. **Image Face Swap** - `wavespeed-ai/image-face-swap`
- **Cost**: $0.01
- **Features**: Basic face replacement
- **Best For**: Quick swaps, cost-sensitive use cases

### 2. **Image Face Swap Pro** - `wavespeed-ai/image-face-swap-pro`
- **Cost**: $0.025
- **Features**: Enhanced blending, realistic results
- **Best For**: Professional quality swaps

### 3. **Image Head Swap** - `wavespeed-ai/image-head-swap`
- **Cost**: $0.025
- **Features**: Full head replacement (face + hair + outline)
- **Best For**: Complete head swaps, casting mockups

### 4. **InfiniteYou** - `wavespeed-ai/infinite-you`
- **Cost**: $0.05
- **Features**: High-quality identity preservation (ByteDance)
- **Best For**: High-quality swaps, identity preservation

### 5. **Akool Multi-Face Swap** - `akool/image-face-swap`
- **Cost**: $0.16
- **Features**: Multi-face swapping in group photos
- **Best For**: Group photos, multiple face replacements

---

## 🔧 Specialized Editing Models

### **Erasing**
- **Image Eraser** - `wavespeed-ai/image-eraser` ($0.025)
  - Remove objects, people, text with mask support
  - Multi-region removal, context-aware reconstruction

### **Expansion/Outpainting**
- **Bria Expand** - `bria/expand` ($0.04)
  - Aspect ratio expansion, intelligent outpainting
  - Context-aware, maintains lighting/perspective

### **Background**
- **Bria Background Generation** - `bria/generate-background` ($0.04)
  - Text or reference image-driven background replacement
  - Subject preservation, style options

### **Text Removal**
- **Image Text Remover** - `wavespeed-ai/image-text-remover` ($0.15)
  - Automatic text detection and removal
  - High-fidelity inpainting

---

## 🌐 Translation Models (2 Models)

### 1. **WaveSpeed Image Translator** - `wavespeed-ai/image-translator`
- **Cost**: $0.15
- **Features**: 30+ languages, font preservation, layout-aware
- **Best For**: High-quality translation with visual fidelity

### 2. **Alibaba Qwen Image Translate** - `alibaba/qwen-image/translate`
- **Cost**: $0.01
- **Features**: OCR + translation, terminology control, sensitive word filtering
- **Best For**: Cost-effective translation, document processing

---

## 🎮 3D Generation Models (10 Models)

### **Budget Tier** ($0.02)

#### 1. **SAM 3D Body** - `wavespeed-ai/sam-3d-body`
- **Cost**: $0.02
- **Input**: Single image + optional mask
- **Output**: 3D human body model
- **Best For**: Character modeling, avatar creation

#### 2. **SAM 3D Objects** - `wavespeed-ai/sam-3d-objects`
- **Cost**: $0.02
- **Input**: Single image + optional mask + prompt
- **Output**: 3D object model
- **Best For**: Product visualization, props

#### 3. **Hunyuan3D V2 Multi-View** - `wavespeed-ai/hunyuan3d/v2-multi-view`
- **Cost**: $0.02
- **Input**: Front + back + left images
- **Output**: High-fidelity 3D with 4K textures
- **Best For**: Accurate reconstruction, digital twins

### **Premium Tier** ($0.25-$0.30)

#### 4. **Tripo3D V2.5 Image-to-3D** - `tripo3d/v2.5/image-to-3d`
- **Cost**: $0.30
- **Input**: Single image
- **Output**: High-quality 3D asset
- **Best For**: Game assets, e-commerce, AR/VR

#### 5. **Hunyuan3D V2.1** - `wavespeed-ai/hunyuan3d/v2.1`
- **Cost**: $0.30
- **Input**: Single image
- **Output**: Scalable 3D with PBR textures
- **Best For**: Production workflows, game art

#### 6. **Hunyuan3D V3 Image-to-3D** - `wavespeed-ai/hunyuan3d-v3/image-to-3d`
- **Cost**: $0.25
- **Input**: Single image + optional multi-view
- **Output**: Ultra-high-resolution 3D
- **Best For**: Film-quality geometry

#### 7. **Hyper3D Rodin v2 Image-to-3D** - `hyper3d/rodin-v2/image-to-3d`
- **Cost**: $0.30
- **Input**: Single/multiple images + optional prompt
- **Output**: Production-ready 3D with UVs/textures
- **Best For**: Game art, film/TV, XR

#### 8. **Tripo3D V2.5 Multiview** - `tripo3d/v2.5/multiview-to-3d`
- **Cost**: $0.30
- **Input**: Multiple views
- **Output**: Higher-fidelity 3D
- **Best For**: Digital twins, 3D catalogs

### **Text-to-3D** ($0.30)

#### 9. **Hyper3D Rodin v2 Text-to-3D** - `hyper3d/rodin-v2/text-to-3d`
- **Cost**: $0.30
- **Input**: Text prompt
- **Output**: Production-ready 3D with UVs/textures
- **Best For**: Concept to 3D, rapid prototyping

### **Sketch-to-3D** ($0.375)

#### 10. **Hunyuan3D V3 Sketch-to-3D** - `wavespeed-ai/hunyuan3d-v3/sketch-to-3d`
- **Cost**: $0.375
- **Input**: Sketch image + optional prompt
- **Output**: 3D model with optional PBR
- **Best For**: Concept art to 3D, game development
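
Because the 3D models listed above differ mainly in the input they accept, one way to narrow the choice is to filter by input type and then sort by cost. The sketch below is illustrative only: the input tags (`"image"`, `"multi-view"`, `"text"`, `"sketch"`) are an assumed simplification of the **Input** lines above, not API fields, and `MODELS_3D` and `models_for_input` are hypothetical names.

```python
# (model id, cost in USD, accepted input kinds).
# The input tags are an assumed simplification of the "Input" lines in this
# reference; optional masks and prompts are left out.
MODELS_3D = [
    ("wavespeed-ai/sam-3d-body", 0.02, {"image"}),
    ("wavespeed-ai/sam-3d-objects", 0.02, {"image"}),
    ("wavespeed-ai/hunyuan3d/v2-multi-view", 0.02, {"multi-view"}),
    ("tripo3d/v2.5/image-to-3d", 0.30, {"image"}),
    ("wavespeed-ai/hunyuan3d/v2.1", 0.30, {"image"}),
    ("wavespeed-ai/hunyuan3d-v3/image-to-3d", 0.25, {"image", "multi-view"}),
    ("hyper3d/rodin-v2/image-to-3d", 0.30, {"image", "multi-view"}),
    ("tripo3d/v2.5/multiview-to-3d", 0.30, {"multi-view"}),
    ("hyper3d/rodin-v2/text-to-3d", 0.30, {"text"}),
    ("wavespeed-ai/hunyuan3d-v3/sketch-to-3d", 0.375, {"sketch"}),
]


def models_for_input(kind, max_cost=None):
    """Return (model id, cost) pairs that accept `kind`, cheapest first."""
    matches = [
        (model_id, cost)
        for model_id, cost, inputs in MODELS_3D
        if kind in inputs and (max_cost is None or cost <= max_cost)
    ]
    # Sort by cost, then by id so that ties come out in a stable order.
    return sorted(matches, key=lambda pair: (pair[1], pair[0]))


if __name__ == "__main__":
    for model_id, cost in models_for_input("multi-view"):
        print(f"{model_id}: ${cost}")
```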

---

## 📝 Utility Models

### **Image Captioning**
- **Image Captioner** - `wavespeed-ai/image-captioner` ($0.001)
  - Generate detailed image descriptions
  - SEO/accessibility, dataset labeling

### **Additional Inpainting**
- **Z-Image Turbo Inpaint** - `wavespeed-ai/z-image/turbo-inpaint` ($0.02)
  - Ultra-fast inpainting with natural language
  - Best for: Product photo cleanup, object removal

### **Additional Outpainting**
- **Image Zoom-Out** - `wavespeed-ai/image-zoom-out` ($0.02)
  - Professional outpainting/expansion
  - Best for: Expanding images, cinematic compositions

### **Enhanced Generation**
- **WAN 2.2 Text-to-Image Realism** - `wavespeed-ai/wan-2.2/text-to-image-realism` ($0.025)
  - Ultra-realistic image generation
  - Best for: Lifestyle photography, stock imagery

---

## 🎯 Model Selection Strategy

### **By Cost**
- **Budget** ($0.01-$0.03): Qwen Edit, Step1X, Face Swap, Image Upscaler
- **Mid-Range** ($0.04-$0.05): FLUX Kontext Pro, Bria models, InfiniteYou
- **Premium** ($0.08-$0.20): FLUX Kontext Max, Ideogram Character, Nano Banana Pro

### **By Quality**
- **Good**: Qwen, Step1X, HiDream, SeedEdit
- **Excellent**: FLUX Kontext Pro/Max, GPT Image 1, Ideogram Character
- **Premium**: Nano Banana Pro Edit Ultra (4K/8K)

### **By Use Case**
- **Quick Edits**: Qwen Edit ($0.02), Step1X ($0.03)
- **Professional Work**: Nano Banana Pro ($0.15), FLUX Kontext Max ($0.08)
- **Character Work**: Ideogram Character ($0.10-$0.20), HiDream ($0.024)
- **Typography**: FLUX Kontext Pro ($0.04), Ideogram V3 Turbo ($0.03)
- **Multi-Image**: FLUX Kontext Pro Multi ($0.04), Qwen Edit Plus ($0.02)

---

## 💡 Smart Model Selection

### **Auto-Select Based On**:
1. **Budget Mode**: Select cheapest model
2. **Quality Mode**: Select best quality model
3. **Balanced Mode**: Select best value model
4. **Use Case**: Select model optimized for specific task
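
The four auto-select rules above could be sketched as a single function. This is an illustration under stated assumptions, not the Image Studio implementation: the `Model` dataclass, the numeric `quality` scores (1 to 3, loosely following the Good/Excellent/Premium grouping under "By Quality"), the reading of "best value" as quality per dollar, and the `use_cases` tags are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    model_id: str
    cost: float                         # USD per image, taken from this reference
    quality: int                        # assumed score: 1 = Good, 2 = Excellent, 3 = Premium
    use_cases: frozenset = frozenset()  # assumed tags, not an API field


# A small illustrative subset of the editing models in this reference.
CATALOG = [
    Model("wavespeed-ai/qwen-image/edit", 0.02, 1, frozenset({"quick-edit"})),
    Model("wavespeed-ai/flux-kontext-pro", 0.04, 2, frozenset({"typography"})),
    Model("wavespeed-ai/flux-kontext-max", 0.08, 2, frozenset({"professional"})),
    Model("google/nano-banana-pro/edit-ultra", 0.15, 3, frozenset({"professional"})),
]


def auto_select(models, mode="balanced", use_case=None):
    """Pick one model for the given mode, optionally restricted to a use case."""
    candidates = list(models)
    if use_case is not None:
        # Prefer models tagged for the use case; fall back to all if none match.
        tagged = [m for m in candidates if use_case in m.use_cases]
        candidates = tagged or candidates
    if not candidates:
        raise ValueError("no models to choose from")
    if mode == "budget":
        # Cheapest first; among equal prices prefer the higher quality score.
        return min(candidates, key=lambda m: (m.cost, -m.quality))
    if mode == "quality":
        # Highest quality first; among equal scores prefer the cheaper model.
        return max(candidates, key=lambda m: (m.quality, -m.cost))
    if mode == "balanced":
        # Assumed definition of "best value": quality score per dollar.
        return max(candidates, key=lambda m: (m.quality / m.cost, m.quality))
    raise ValueError(f"unknown mode: {mode!r}")
```

With this subset, budget mode picks Qwen Image Edit, quality mode picks Nano Banana Pro Edit Ultra, and balanced mode picks FLUX Kontext Pro. A different value metric or different quality scores would change the balanced result.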

### **User Choice**:
- Show all available models with cost/quality comparison
- Allow manual selection
- Display recommendations based on edit type

---

## 📊 Cost Comparison Examples

### **Editing a Portrait**:
- **Budget**: Qwen Edit ($0.02) or Step1X ($0.03)
- **Balanced**: FLUX Kontext Pro ($0.04) or SeedEdit ($0.027)
- **Premium**: Nano Banana Pro ($0.15) or FLUX Kontext Max ($0.08)

### **Upscaling an Image**:
- **Budget**: Image Upscaler ($0.01)
- **Balanced**: Bria Increase Resolution ($0.04)
- **Premium**: Ultimate Upscaler ($0.06)

### **Face Swapping**:
- **Budget**: Face Swap ($0.01)
- **Balanced**: Face Swap Pro ($0.025) or InfiniteYou ($0.05)
- **Premium**: Multi-Face Swap ($0.16)
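
To see how the per-image prices above add up, the sketch below multiplies them out for a batch. The batch size of 500 images and the helper name `batch_cost` are illustrative assumptions; the prices are the face-swap figures quoted in this section. `Decimal` is used so the totals do not pick up float rounding noise.

```python
from decimal import Decimal

# Per-image prices (USD) quoted above for the face-swap options.
FACE_SWAP_PRICES = {
    "Face Swap": Decimal("0.01"),
    "Face Swap Pro": Decimal("0.025"),
    "InfiniteYou": Decimal("0.05"),
    "Multi-Face Swap": Decimal("0.16"),
}


def batch_cost(price_per_image: Decimal, images: int) -> Decimal:
    """Total cost in USD of processing `images` images at a flat per-image price."""
    if images < 0:
        raise ValueError("images must be non-negative")
    return price_per_image * images


if __name__ == "__main__":
    batch_size = 500  # hypothetical batch size, chosen only for illustration
    for name, price in FACE_SWAP_PRICES.items():
        print(f"{name}: ${batch_cost(price, batch_size):.2f} for {batch_size} images")
```

At this batch size the gap between the budget and premium options grows from cents to tens of dollars, which is the trade-off the tiers above are meant to expose.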

---

## 🔗 Integration Points

### **Edit Studio**
- Add model selector dropdown
- Show cost comparison
- Display quality recommendations
- Allow side-by-side comparison

### **Upscale Studio**
- Add WaveSpeed models as alternatives to Stability
- Cost comparison UI
- Quality preview

### **Face Swap Studio** (New)
- Model selection with use case recommendations
- Cost/quality comparison
- Batch processing support

### **Translation Studio** (New)
- Model selector (high-quality vs. budget)
- Language support comparison
- Batch translation

---

## 📚 Related Documentation

- [Image Studio Enhancement Proposal](docs/IMAGE_STUDIO_ENHANCEMENT_PROPOSAL.md)
- [Image Studio Implementation Review](docs/IMAGE_STUDIO_IMPLEMENTATION_REVIEW.md)
- [WaveSpeed Implementation Roadmap](docs/WAVESPEED_IMPLEMENTATION_ROADMAP.md)

---

*Document Version: 2.0*
*Last Updated: Current Session*
*Total Models: 40+ WaveSpeed AI models*

---

## 📊 Complete Model Count

- **Image Editing**: 12 models
- **Upscaling**: 3 models
- **Face Swapping**: 5 models
- **3D Generation**: 10 models
- **Translation**: 2 models
- **Specialized**: 8 models (erasing, expansion, background, text removal, captioning, inpainting, outpainting, generation)
- **Total**: 40+ WaveSpeed AI models