Image Studio: WaveSpeed AI Models Reference
Purpose: Complete reference guide for all WaveSpeed AI models integrated into Image Studio
Last Updated: Current Session
📊 Model Overview
Image Studio integrates 40+ WaveSpeed AI models across multiple categories, so each task can be matched to a model by cost, quality, and use case.
🎨 Image Editing Models (12 Models)
Budget Tier ($0.02-$0.03)
1. Qwen Image Edit - wavespeed-ai/qwen-image/edit
- Cost: $0.02
- Features: Bilingual (CN/EN), appearance + semantic editing, style preservation
- Best For: Budget-conscious editing, bilingual content, style transfers
- Use Cases: Quick edits, content localization, style experiments
2. Qwen Image Edit Plus - wavespeed-ai/qwen-image/edit-plus
- Cost: $0.02
- Features: Multi-image editing, ControlNet support, character consistency
- Best For: Batch editing, consistent character work, multi-image workflows
- Use Cases: Character consistency across images, batch style application
3. Step1X Edit - wavespeed-ai/step1x-edit
- Cost: $0.03
- Features: Simple prompt editing, precise modifications
- Best For: Quick edits, straightforward changes
- Use Cases: Hair color changes, accessory additions, simple modifications
4. HiDream E1 Full - wavespeed-ai/hidream-e1-full
- Cost: $0.024
- Features: Identity-preserving edits, wardrobe/accessory changes
- Best For: Fashion edits, character consistency, portrait work
- Use Cases: Outfit changes, accessory modifications, portrait retouching
5. SeedEdit V3 - bytedance/seededit-v3
- Cost: $0.027
- Features: Prompt-guided editing, identity preservation
- Best For: Portrait edits, e-commerce variants, localized edits
- Use Cases: Hair/style changes, product color variants, marketing iterations
Mid Tier ($0.035-$0.04)
6. Alibaba WAN 2.5 Image Edit - alibaba/wan-2.5/image-edit
- Cost: $0.035
- Features: Structure-preserving edits, prompt expansion
- Best For: Quick adjustments, cost-effective editing
- Use Cases: Lighting changes, color adjustments, object modifications
7. FLUX Kontext Pro - wavespeed-ai/flux-kontext-pro
- Cost: $0.04
- Features: Improved prompt adherence, typography generation, consistency
- Best For: Typography-heavy edits, consistent results, professional work
- Use Cases: Text in images, poster editing, marketing materials
8. FLUX Kontext Pro Multi - wavespeed-ai/flux-kontext-pro/multi
- Cost: $0.04
- Features: Multi-image handling (up to 5 references), context combination
- Best For: Character consistency, style alignment, multi-image workflows
- Use Cases: Consistent character generation, product variations, style matching
Premium Tier ($0.08-$0.20)
9. FLUX Kontext Max - wavespeed-ai/flux-kontext-max
- Cost: $0.08
- Features: Premium quality, high-fidelity transformations
- Best For: Professional retouching, style transformations, high-end work
- Use Cases: Premium retouching, cinematic edits, artistic transformations
10. Ideogram Character - ideogram-ai/ideogram-character
- Cost: $0.10-$0.20 (Turbo/Default/Quality)
- Features: Character-focused editing, outfit/appearance changes, style modes
- Best For: Fashion visualization, character design, portrait work
- Use Cases: Outfit changes, character variations, fashion campaigns
11. Google Nano Banana Pro Edit Ultra - google/nano-banana-pro/edit-ultra
- Cost: $0.15 (4K) / $0.18 (8K)
- Features: Native 4K/8K editing, natural language, multilingual text
- Best For: Professional marketing, high-res edits, typography work
- Use Cases: Campaign visuals, print materials, high-resolution work
Quality Tiers (Variable Pricing)
12. OpenAI GPT Image 1 - openai/gpt-image-1
- Cost: $0.011-$0.250 (varies by quality and size)
- Low: $0.011 (square) / $0.016 (rectangular)
- Medium: $0.042 (square) / $0.063 (rectangular)
- High: $0.167 (square) / $0.250 (rectangular)
- Features: Quality tiers, mask support, style transformation
- Best For: Style transfers, creative transformations, quality control
- Use Cases: Artistic style changes, creative edits, quality-based workflows
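The GPT Image 1 price grid above can be expressed as a small lookup. This is a minimal sketch, not part of Image Studio: the function name `gpt_image_1_cost` is hypothetical, and treating an image as "square" when its width equals its height is an assumption of this example.

```python
from decimal import Decimal

# Per-image prices in USD, copied from the quality/size list above.
GPT_IMAGE_1_PRICES = {
    ("low", "square"): Decimal("0.011"),
    ("low", "rectangular"): Decimal("0.016"),
    ("medium", "square"): Decimal("0.042"),
    ("medium", "rectangular"): Decimal("0.063"),
    ("high", "square"): Decimal("0.167"),
    ("high", "rectangular"): Decimal("0.250"),
}


def gpt_image_1_cost(quality: str, width: int, height: int, count: int = 1) -> Decimal:
    """Estimate the cost in USD of `count` edits at the given quality tier."""
    # Assumption: equal width and height means the "square" price applies.
    shape = "square" if width == height else "rectangular"
    try:
        unit_price = GPT_IMAGE_1_PRICES[(quality.lower(), shape)]
    except KeyError:
        raise ValueError(f"unknown quality tier: {quality!r}") from None
    return unit_price * count
```

Using `Decimal` rather than `float` keeps sub-cent prices exact when they are multiplied across a batch.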
⬆️ Upscaling Models (3 Models)
1. Image Upscaler - wavespeed-ai/image-upscaler
- Cost: $0.01
- Resolution: 2K/4K/8K
- Best For: Fast, affordable upscaling
- Speed: Fast
2. Bria Increase Resolution - bria/increase-resolution
- Cost: $0.04
- Resolution: 2x/4x multiplier
- Best For: Detail-preserving upscale
- Speed: Medium
3. Ultimate Image Upscaler - wavespeed-ai/ultimate-image-upscaler
- Cost: $0.06
- Resolution: 2K/4K/8K
- Best For: Premium quality upscaling
- Speed: Medium
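One way to choose among the three upscalers is to filter by budget and target resolution. The sketch below is illustrative only: `Upscaler` and `pick_upscaler` are hypothetical names, and using price as a stand-in for quality is an assumption, not something the list above states.

```python
from decimal import Decimal
from typing import NamedTuple, Optional


class Upscaler(NamedTuple):
    model_id: str
    cost: Decimal    # USD per image, from the list above
    targets: tuple   # supported resolutions or multipliers, from the list above


UPSCALERS = [
    Upscaler("wavespeed-ai/image-upscaler", Decimal("0.01"), ("2K", "4K", "8K")),
    Upscaler("bria/increase-resolution", Decimal("0.04"), ("2x", "4x")),
    Upscaler("wavespeed-ai/ultimate-image-upscaler", Decimal("0.06"), ("2K", "4K", "8K")),
]


def pick_upscaler(max_cost: Decimal, target: str) -> Optional[Upscaler]:
    """Return the priciest upscaler within budget that supports `target`.

    Assumption: a higher price is treated as a proxy for higher quality.
    Returns None when nothing fits.
    """
    candidates = [u for u in UPSCALERS if u.cost <= max_cost and target in u.targets]
    return max(candidates, key=lambda u: u.cost, default=None)
```

For example, a $0.05 budget with a 4K target falls back to the basic Image Upscaler, because Bria is specified as a multiplier rather than a fixed resolution.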
👤 Face Swap Models (5 Models)
1. Image Face Swap - wavespeed-ai/image-face-swap
- Cost: $0.01
- Features: Basic face replacement
- Best For: Quick swaps, cost-sensitive use cases
2. Image Face Swap Pro - wavespeed-ai/image-face-swap-pro
- Cost: $0.025
- Features: Enhanced blending, realistic results
- Best For: Professional quality swaps
3. Image Head Swap - wavespeed-ai/image-head-swap
- Cost: $0.025
- Features: Full head replacement (face + hair + outline)
- Best For: Complete head swaps, casting mockups
4. InfiniteYou - wavespeed-ai/infinite-you
- Cost: $0.05
- Features: High-quality identity preservation (ByteDance)
- Best For: High-quality swaps, identity preservation
5. Akool Multi-Face Swap - akool/image-face-swap
- Cost: $0.16
- Features: Multi-face swapping in group photos
- Best For: Group photos, multiple face replacements
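The "Best For" notes above suggest a simple routing rule. The following is a hedged sketch of one such rule; the function name, its parameters, and the `basic`/`pro`/`identity` quality labels are all hypothetical and only mirror the descriptions in this list.

```python
# Single-face models keyed by a hypothetical quality label.
FACE_SWAP_BY_QUALITY = {
    "basic": "wavespeed-ai/image-face-swap",      # $0.01, quick swaps
    "pro": "wavespeed-ai/image-face-swap-pro",    # $0.025, enhanced blending
    "identity": "wavespeed-ai/infinite-you",      # $0.05, identity preservation
}


def pick_face_swap_model(face_count: int, replace_hair: bool = False,
                         quality: str = "basic") -> str:
    """Route a face swap request to a model ID from the list above."""
    if face_count < 1:
        raise ValueError("face_count must be at least 1")
    if face_count > 1:
        # Only the Akool model is described as handling group photos.
        return "akool/image-face-swap"            # $0.16
    if replace_hair:
        # Head swap replaces face, hair, and outline together.
        return "wavespeed-ai/image-head-swap"     # $0.025
    if quality not in FACE_SWAP_BY_QUALITY:
        raise ValueError(f"unknown quality label: {quality!r}")
    return FACE_SWAP_BY_QUALITY[quality]
```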
🔧 Specialized Editing Models
Erasing
- Image Eraser - wavespeed-ai/image-eraser ($0.025)
  - Remove objects, people, text with mask support
  - Multi-region removal, context-aware reconstruction
Expansion/Outpainting
- Bria Expand - bria/expand ($0.04)
  - Aspect ratio expansion, intelligent outpainting
  - Context-aware, maintains lighting/perspective
Background
- Bria Background Generation - bria/generate-background ($0.04)
  - Text or reference image-driven background replacement
  - Subject preservation, style options
Text Removal
- Image Text Remover - wavespeed-ai/image-text-remover ($0.15)
  - Automatic text detection and removal
  - High-fidelity inpainting
🌐 Translation Models (2 Models)
1. WaveSpeed Image Translator - wavespeed-ai/image-translator
- Cost: $0.15
- Features: 30+ languages, font preservation, layout-aware
- Best For: High-quality translation with visual fidelity
2. Alibaba Qwen Image Translate - alibaba/qwen-image/translate
- Cost: $0.01
- Features: OCR + translation, terminology control, sensitive word filtering
- Best For: Cost-effective translation, document processing
🎮 3D Generation Models (10 Models)
Budget Tier ($0.02)
1. SAM 3D Body - wavespeed-ai/sam-3d-body
- Cost: $0.02
- Input: Single image + optional mask
- Output: 3D human body model
- Best For: Character modeling, avatar creation
2. SAM 3D Objects - wavespeed-ai/sam-3d-objects
- Cost: $0.02
- Input: Single image + optional mask + prompt
- Output: 3D object model
- Best For: Product visualization, props
3. Hunyuan3D V2 Multi-View - wavespeed-ai/hunyuan3d/v2-multi-view
- Cost: $0.02
- Input: Front + back + left images
- Output: High-fidelity 3D with 4K textures
- Best For: Accurate reconstruction, digital twins
Premium Tier ($0.25-$0.30)
4. Tripo3D V2.5 Image-to-3D - tripo3d/v2.5/image-to-3d
- Cost: $0.30
- Input: Single image
- Output: High-quality 3D asset
- Best For: Game assets, e-commerce, AR/VR
5. Hunyuan3D V2.1 - wavespeed-ai/hunyuan3d/v2.1
- Cost: $0.30
- Input: Single image
- Output: Scalable 3D with PBR textures
- Best For: Production workflows, game art
6. Hunyuan3D V3 Image-to-3D - wavespeed-ai/hunyuan3d-v3/image-to-3d
- Cost: $0.25
- Input: Single image + optional multi-view
- Output: Ultra-high-resolution 3D
- Best For: Film-quality geometry
7. Hyper3D Rodin v2 Image-to-3D - hyper3d/rodin-v2/image-to-3d
- Cost: $0.30
- Input: Single/multiple images + optional prompt
- Output: Production-ready 3D with UVs/textures
- Best For: Game art, film/TV, XR
8. Tripo3D V2.5 Multiview - tripo3d/v2.5/multiview-to-3d
- Cost: $0.30
- Input: Multiple views
- Output: Higher-fidelity 3D
- Best For: Digital twins, 3D catalogs
Text-to-3D ($0.30)
9. Hyper3D Rodin v2 Text-to-3D - hyper3d/rodin-v2/text-to-3d
- Cost: $0.30
- Input: Text prompt
- Output: Production-ready 3D with UVs/textures
- Best For: Concept to 3D, rapid prototyping
Sketch-to-3D ($0.375)
10. Hunyuan3D V3 Sketch-to-3D - wavespeed-ai/hunyuan3d-v3/sketch-to-3d
- Cost: $0.375
- Input: Sketch image + optional prompt
- Output: 3D model with optional PBR
- Best For: Concept art to 3D, game development
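Because the 3D models differ mainly in what input they accept, a lookup by input kind is a natural first filter. This sketch is an illustration under stated assumptions: the `input_kind` labels (`text`, `sketch`, `single_image`, `multi_view`) are hypothetical, and models whose multi-view input is only optional are simplified to `single_image`.

```python
from decimal import Decimal

# model_id -> (cost in USD, accepted input kinds), taken from the list above.
MODELS_3D = {
    "wavespeed-ai/sam-3d-body": (Decimal("0.02"), {"single_image"}),
    "wavespeed-ai/sam-3d-objects": (Decimal("0.02"), {"single_image"}),
    "wavespeed-ai/hunyuan3d/v2-multi-view": (Decimal("0.02"), {"multi_view"}),
    "tripo3d/v2.5/image-to-3d": (Decimal("0.30"), {"single_image"}),
    "wavespeed-ai/hunyuan3d/v2.1": (Decimal("0.30"), {"single_image"}),
    "wavespeed-ai/hunyuan3d-v3/image-to-3d": (Decimal("0.25"), {"single_image"}),
    "hyper3d/rodin-v2/image-to-3d": (Decimal("0.30"), {"single_image", "multi_view"}),
    "tripo3d/v2.5/multiview-to-3d": (Decimal("0.30"), {"multi_view"}),
    "hyper3d/rodin-v2/text-to-3d": (Decimal("0.30"), {"text"}),
    "wavespeed-ai/hunyuan3d-v3/sketch-to-3d": (Decimal("0.375"), {"sketch"}),
}


def models_for_input(input_kind: str) -> list:
    """Return model IDs that accept `input_kind`, cheapest first (ties by ID)."""
    matches = [
        (cost, model_id)
        for model_id, (cost, kinds) in MODELS_3D.items()
        if input_kind in kinds
    ]
    return [model_id for _cost, model_id in sorted(matches)]
```

The cheapest match is not necessarily the right one: SAM 3D Body, for instance, targets human bodies only, so a real selector would also need a subject-type check.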
📝 Utility Models
Image Captioning
- Image Captioner - wavespeed-ai/image-captioner ($0.001)
  - Generate detailed image descriptions
  - SEO/accessibility, dataset labeling
Additional Inpainting
- Z-Image Turbo Inpaint - wavespeed-ai/z-image/turbo-inpaint ($0.02)
  - Ultra-fast inpainting with natural language
  - Best for: Product photo cleanup, object removal
Additional Outpainting
- Image Zoom-Out - wavespeed-ai/image-zoom-out ($0.02)
  - Professional outpainting/expansion
  - Best for: Expanding images, cinematic compositions
Enhanced Generation
- WAN 2.2 Text-to-Image Realism - wavespeed-ai/wan-2.2/text-to-image-realism ($0.025)
  - Ultra-realistic photorealistic generation
  - Best for: Lifestyle photography, stock imagery
🎯 Model Selection Strategy
By Cost
- Budget ($0.01-$0.03): Qwen Edit, Step1X, Face Swap, Image Upscaler
- Mid-Range ($0.04-$0.05): FLUX Kontext Pro, Bria models, InfiniteYou
- Premium ($0.08-$0.20): FLUX Kontext Max, Ideogram Character, Nano Banana Pro
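The three price bands above can be checked mechanically. The sketch below is a hypothetical helper (`cost_tier` is not an Image Studio function) that treats each band as an inclusive range and returns `None` for prices that fall between bands.

```python
from decimal import Decimal
from typing import Optional

# Inclusive price ranges in USD, copied from the "By Cost" list above.
COST_TIERS = [
    ("Budget", Decimal("0.01"), Decimal("0.03")),
    ("Mid-Range", Decimal("0.04"), Decimal("0.05")),
    ("Premium", Decimal("0.08"), Decimal("0.20")),
]


def cost_tier(price: str) -> Optional[str]:
    """Return the tier name for a per-image price, or None if it fits no band."""
    amount = Decimal(price)
    for name, low, high in COST_TIERS:
        if low <= amount <= high:
            return name
    return None
```

Note that some listed prices sit in the gaps between bands, for example $0.035 (Alibaba WAN 2.5 Image Edit) and $0.06 (Ultimate Image Upscaler); this sketch reports those as `None` rather than guessing a tier.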
By Quality
- Good: Qwen, Step1X, HiDream, SeedEdit
- Excellent: FLUX Kontext Pro/Max, GPT Image 1, Ideogram Character
- Premium: Nano Banana Pro Edit Ultra (4K/8K)
By Use Case
- Quick Edits: Qwen Edit ($0.02), Step1X ($0.03)
- Professional Work: Nano Banana Pro ($0.15), FLUX Kontext Max ($0.08)
- Character Work: Ideogram Character ($0.10-$0.20), HiDream ($0.024)
- Typography: FLUX Kontext Pro ($0.04), Ideogram V3 Turbo ($0.03)
- Multi-Image: FLUX Kontext Pro Multi ($0.04), Qwen Edit Plus ($0.02)
💡 Smart Model Selection
Auto-Select Based On:
- Budget Mode: Select cheapest model
- Quality Mode: Select best quality model
- Balanced Mode: Select best value model
- Use Case: Select model optimized for specific task
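The three cost/quality modes above could be sketched as follows. Everything here is an assumption made for illustration: the numeric quality ranks map the Good/Excellent/Premium labels from the "By Quality" list to 1/2/3, and "best value" is interpreted as the cheapest model rated Excellent or better. The source does not define either.

```python
from decimal import Decimal
from typing import NamedTuple


class EditModel(NamedTuple):
    name: str
    cost: Decimal   # USD per edit, from the editing model list
    quality: int    # assumed rank: 1 = Good, 2 = Excellent, 3 = Premium


EDIT_MODELS = [
    EditModel("Qwen Image Edit", Decimal("0.02"), 1),
    EditModel("Step1X Edit", Decimal("0.03"), 1),
    EditModel("FLUX Kontext Pro", Decimal("0.04"), 2),
    EditModel("FLUX Kontext Max", Decimal("0.08"), 2),
    EditModel("Nano Banana Pro Edit Ultra", Decimal("0.15"), 3),
]


def auto_select(mode: str, models=EDIT_MODELS) -> EditModel:
    """Pick a model for the given mode: 'budget', 'quality', or 'balanced'."""
    if mode == "budget":
        # Cheapest model overall.
        return min(models, key=lambda m: m.cost)
    if mode == "quality":
        # Highest quality rank; the cheaper model wins a tie.
        return max(models, key=lambda m: (m.quality, -m.cost))
    if mode == "balanced":
        # Assumed rule: cheapest model rated Excellent (2) or better.
        eligible = [m for m in models if m.quality >= 2]
        return min(eligible, key=lambda m: m.cost)
    raise ValueError(f"unknown mode: {mode!r}")
```

With the sample data, `auto_select("balanced")` returns FLUX Kontext Pro, which happens to agree with the balanced portrait-editing recommendation elsewhere in this reference.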
User Choice:
- Show all available models with cost/quality comparison
- Allow manual selection
- Display recommendations based on edit type
📊 Cost Comparison Examples
Editing a Portrait:
- Budget: Qwen Edit ($0.02) or Step1X ($0.03)
- Balanced: FLUX Kontext Pro ($0.04) or SeedEdit ($0.027)
- Premium: Nano Banana Pro ($0.15) or FLUX Kontext Max ($0.08)
Upscaling an Image:
- Budget: Image Upscaler ($0.01)
- Balanced: Bria Increase Resolution ($0.04)
- Premium: Ultimate Upscaler ($0.06)
Face Swapping:
- Budget: Face Swap ($0.01)
- Balanced: Face Swap Pro ($0.025) or InfiniteYou ($0.05)
- Premium: Multi-Face Swap ($0.16)
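To see how these per-step prices add up across a workflow, the arithmetic can be sketched as below. The `workflow_cost` helper and the step names are hypothetical; where a tier lists two options, the first one listed is used.

```python
from decimal import Decimal

# USD per image for each step, using the first option listed per tier above.
WORKFLOW_PRICES = {
    "budget": {
        "edit": Decimal("0.02"),        # Qwen Edit
        "upscale": Decimal("0.01"),     # Image Upscaler
        "face_swap": Decimal("0.01"),   # Face Swap
    },
    "balanced": {
        "edit": Decimal("0.04"),        # FLUX Kontext Pro
        "upscale": Decimal("0.04"),     # Bria Increase Resolution
        "face_swap": Decimal("0.025"),  # Face Swap Pro
    },
    "premium": {
        "edit": Decimal("0.15"),        # Nano Banana Pro
        "upscale": Decimal("0.06"),     # Ultimate Upscaler
        "face_swap": Decimal("0.16"),   # Multi-Face Swap
    },
}


def workflow_cost(tier: str, steps: list, images: int = 1) -> Decimal:
    """Sum the per-image price of each step and scale by the image count."""
    prices = WORKFLOW_PRICES[tier]
    per_image = sum((prices[step] for step in steps), Decimal("0"))
    return per_image * images
```

Running all three steps on one image comes to $0.04 at the budget tier, $0.105 at the balanced tier, and $0.37 at the premium tier, so the premium path costs roughly nine times the budget path per image.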
🔗 Integration Points
Edit Studio
- Add model selector dropdown
- Show cost comparison
- Display quality recommendations
- Allow side-by-side comparison
Upscale Studio
- Add WaveSpeed models as alternatives to Stability
- Cost comparison UI
- Quality preview
Face Swap Studio (New)
- Model selection with use case recommendations
- Cost/quality comparison
- Batch processing support
Translation Studio (New)
- Model selector (high-quality vs. budget)
- Language support comparison
- Batch translation
📚 Related Documentation
- Image Studio Enhancement Proposal
- Image Studio Implementation Review
- WaveSpeed Implementation Roadmap
Document Version: 2.0
Last Updated: Current Session
Total Models: 40+ WaveSpeed AI models
📊 Complete Model Count
- Image Editing: 12 models
- Upscaling: 3 models
- Face Swapping: 5 models
- 3D Generation: 10 models
- Translation: 2 models
- Specialized and Utility: 8 models (erasing, expansion, background, text removal, captioning, inpainting, outpainting, generation)
- Total: 40+ WaveSpeed AI models