Image Studio Editing Feature - Implementation Status

Status: 🚧 IN PROGRESS - Foundation Complete, First Model Integrated
Started: Current Session
Current Phase: Steps 1-2 and 4-7 Complete, Step 3 In Progress (5 of 14 models), Step 8 Pending


Completed (Steps 1-2)

Step 1: Protocol & Options

File: backend/services/llm_providers/image_generation/base.py

Added:

  • ImageEditOptions dataclass - Complete with all fields
  • ImageEditProvider protocol - Follows same pattern as ImageGenerationProvider
  • to_dict() method - Converts options to API-friendly format

Status: Complete and tested
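
A minimal sketch of how the options dataclass and the provider protocol might fit together. Only the names ImageEditOptions, ImageEditProvider, and to_dict() come from this document; every field, the edit() method name, and the choice to drop None values are assumptions made for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional, Protocol


@dataclass
class ImageEditOptions:
    # Field names are illustrative assumptions, not the real schema.
    image_base64: str
    prompt: str
    model: Optional[str] = None
    width: Optional[int] = None
    height: Optional[int] = None
    seed: Optional[int] = None

    def to_dict(self) -> Dict[str, Any]:
        """Convert to an API-friendly dict, dropping unset (None) fields."""
        return {k: v for k, v in asdict(self).items() if v is not None}


class ImageEditProvider(Protocol):
    # Hypothetical method name and return type.
    def edit(self, options: ImageEditOptions) -> bytes: ...


class EchoEditProvider:
    """Toy provider that satisfies the protocol structurally (no inheritance)."""

    def edit(self, options: ImageEditOptions) -> bytes:
        return options.prompt.encode("utf-8")


def run_edit(provider: ImageEditProvider, options: ImageEditOptions) -> bytes:
    # Any object with a matching edit() method is accepted here.
    return provider.edit(options)
```

Because the protocol is structural, a new provider only needs a matching edit() method; it does not have to subclass anything.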


Step 2: WaveSpeedEditProvider Structure

File: backend/services/llm_providers/image_generation/wavespeed_edit_provider.py

Created:

  • Provider class structure following WaveSpeedImageProvider pattern
  • SUPPORTED_MODELS dict (initially empty, sized for 14 models)
  • Validation methods (_validate_options())
  • Helper methods (get_available_models(), get_models_by_tier(), get_models_by_operation())
  • API call method (_call_wavespeed_edit_api()), initially a placeholder

Status: Structure complete, API implemented

  • SUPPORTED_MODELS dict structure ready
  • API call method (_call_wavespeed_edit_api()) implemented
  • Helper methods (_extract_image_url(), _download_image()) added
  • 5 models added: qwen-edit, qwen-edit-plus, nano-banana-pro-edit-ultra, seedream-v4.5-edit, flux-kontext-pro (waiting for remaining 9 model docs)
  • Model-specific parameter handling: Supports different API formats (size vs aspect_ratio/resolution, image vs images)
  • Verified against official WaveSpeed API documentation
  • Qwen Image Edit: Verified against https://wavespeed.ai/docs/docs-api/wavespeed-ai/qwen-image-edit
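
The model-specific parameter handling described above could be driven by per-model flags, roughly as sketched here. The keys size, aspect_ratio, resolution, image, and images come from this document; the PARAM_STYLES table, which model uses which style, the "width*height" string format, and the default values are all assumptions.

```python
from typing import Any, Dict, List

# Hypothetical per-model flags; the real SUPPORTED_MODELS entries may differ.
PARAM_STYLES: Dict[str, Dict[str, Any]] = {
    "qwen-edit": {"sizing": "size", "multi_image": False},
    "qwen-edit-plus": {"sizing": "size", "multi_image": True},
    "nano-banana-pro-edit-ultra": {"sizing": "aspect_ratio", "multi_image": True},
}


def build_edit_payload(
    model_id: str,
    prompt: str,
    images: List[str],
    width: int = 1024,
    height: int = 1024,
    aspect_ratio: str = "1:1",
    resolution: str = "2k",
) -> Dict[str, Any]:
    """Build a request body in the shape the given model is assumed to expect."""
    if model_id not in PARAM_STYLES:
        raise ValueError(f"Unsupported edit model: {model_id}")
    if not images:
        raise ValueError("At least one input image is required")

    style = PARAM_STYLES[model_id]
    payload: Dict[str, Any] = {"prompt": prompt}

    # Single-image models take "image"; multi-image models take "images".
    if style["multi_image"]:
        payload["images"] = list(images)
    else:
        payload["image"] = images[0]

    # Some models take an explicit size, others an aspect ratio plus resolution.
    if style["sizing"] == "size":
        payload["size"] = f"{width}*{height}"  # assumed string format
    else:
        payload["aspect_ratio"] = aspect_ratio
        payload["resolution"] = resolution
    return payload
```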

📋 Ready for Model Integration

What I Need from You

  1. Model Documentation for each of the 14 editing models:

    • Model ID (e.g., "qwen-edit")
    • Model path/endpoint (e.g., "wavespeed-ai/qwen-image/edit")
    • Display name
    • Cost per edit
    • Max resolution
    • Supported operations/capabilities
    • Any model-specific parameters
  2. WaveSpeed API Documentation for editing:

    • API endpoint structure
    • Request format
    • Response format
    • Authentication method
    • Any special requirements

Model Structure Example

Qwen Image Edit Plus (Added):

"qwen-edit-plus": {
    "model_path": "wavespeed-ai/qwen-image/edit-plus",
    "name": "Qwen Image Edit Plus",
    "description": "20B MMDiT image editor with multi-image editing...",
    "cost": 0.02,
    "max_resolution": (1536, 1536),
    "capabilities": ["general_edit", "style_transfer", "text_edit", "multi_image"],
    "tier": "budget",
    "supports_multi_image": True,  # Up to 3 reference images
    "supports_controlnet": True,
    "languages": ["en", "zh"],
}

Template for Remaining Models:

"model-id": {
    "model_path": "wavespeed-ai/model-path",
    "name": "Model Display Name",
    "description": "Model description",
    "cost": 0.02,  # Cost per edit
    "max_resolution": (2048, 2048),
    "capabilities": ["general_edit", "inpaint", "outpaint"],
    "tier": "budget",  # "budget", "mid", "premium"
    # Model-specific parameters
}
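
As a rough illustration of how entries in this shape can be checked and queried, the sketch below validates required keys and filters by tier and capability. The helper names get_models_by_tier() and get_models_by_operation() appear in this document, but their signatures here are assumptions, as are REQUIRED_KEYS, missing_keys(), and the "example-premium-edit" entry.

```python
from typing import Any, Dict, List

# Assumed set of mandatory keys, taken from the template's fields.
REQUIRED_KEYS = {
    "model_path", "name", "description", "cost",
    "max_resolution", "capabilities", "tier",
}

SUPPORTED_MODELS: Dict[str, Dict[str, Any]] = {
    "qwen-edit-plus": {  # values copied from the Qwen Image Edit Plus entry
        "model_path": "wavespeed-ai/qwen-image/edit-plus",
        "name": "Qwen Image Edit Plus",
        "description": "20B MMDiT image editor with multi-image editing...",
        "cost": 0.02,
        "max_resolution": (1536, 1536),
        "capabilities": ["general_edit", "style_transfer", "text_edit", "multi_image"],
        "tier": "budget",
    },
    "example-premium-edit": {  # hypothetical entry, for illustration only
        "model_path": "wavespeed-ai/model-path",
        "name": "Example Premium Editor",
        "description": "Placeholder entry",
        "cost": 0.10,
        "max_resolution": (2048, 2048),
        "capabilities": ["general_edit", "inpaint", "outpaint"],
        "tier": "premium",
    },
}


def missing_keys(entry: Dict[str, Any]) -> List[str]:
    """Return the required keys that an entry does not define, sorted."""
    return sorted(REQUIRED_KEYS - set(entry))


def get_models_by_tier(tier: str) -> List[str]:
    return [mid for mid, meta in SUPPORTED_MODELS.items() if meta["tier"] == tier]


def get_models_by_operation(operation: str) -> List[str]:
    return [
        mid for mid, meta in SUPPORTED_MODELS.items()
        if operation in meta["capabilities"]
    ]
```

Running missing_keys() over each new entry before registering it is a cheap way to catch incomplete model data as the remaining models are added.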

🔄 Next Steps (After Model Docs)

Step 3: Add Models (In Progress - 5/14 Complete)

  • Qwen Image Edit, Qwen Image Edit Plus, Google Nano Banana Pro Edit Ultra, Seedream V4.5 Edit, and FLUX Kontext Pro added (from provided docs)
  • 9 models remaining - waiting for model documentation
  • Model-specific parameter handling: Supports both size (Qwen) and aspect_ratio/resolution (Nano Banana) formats

Step 4: Implement API Call - COMPLETE

  • _call_wavespeed_edit_api() method implemented
  • Follows same pattern as ImageGenerator.generate_image()
  • Handles sync/async modes
  • Polling support via WaveSpeedClient.poll_until_complete()
  • Helper methods: _extract_image_url(), _download_image()
  • Tested with Qwen Image Edit Plus API structure
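
The async path (submit, poll, then extract the image URL) might look roughly like the sketch below. It does not reproduce WaveSpeedClient.poll_until_complete(); poll_task() is a hypothetical stand-in, and the status strings ("completed", "failed") and the "outputs" key are assumptions about the response shape.

```python
import time
from typing import Any, Callable, Dict


def poll_task(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status() until the task completes, fails, or times out."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "completed":
            return result
        if status == "failed":
            raise RuntimeError(result.get("error", "edit task failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError("edit task did not finish in time")
        sleep(interval_s)


def extract_image_url(result: Dict[str, Any]) -> str:
    """Return the first output URL, assuming an 'outputs' list in the result."""
    outputs = result.get("outputs") or []
    if not outputs:
        raise ValueError("no image URL in edit result")
    return outputs[0]
```

Passing fetch_status and sleep as callables keeps the loop testable without network access or real delays.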

Step 5: Unified Entry Point - COMPLETE

  • generate_image_edit() added to main_image_generation.py
  • Reuses Phase 1 helpers (_validate_image_operation(), _track_image_operation_usage())
  • Provider selection helper (_get_edit_provider()) added
  • Follows same pattern as generate_image()
  • Error handling and logging consistent
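
The flow listed above (validate, select a provider, run the edit, track usage) can be sketched as follows. The function names echo generate_image_edit() and _get_edit_provider() from this document, but the signatures, the injected validate and track_usage callables, and the provider registry are assumptions, not the actual code in main_image_generation.py.

```python
from typing import Callable, Dict

# Hypothetical provider shape: (prompt, image bytes) -> edited image bytes.
EditFn = Callable[[str, bytes], bytes]


def get_edit_provider(name: str, registry: Dict[str, EditFn]) -> EditFn:
    """Look up a provider by name, failing loudly on unknown names."""
    try:
        return registry[name]
    except KeyError:
        raise ValueError(f"Unknown edit provider: {name}") from None


def generate_image_edit(
    prompt: str,
    image: bytes,
    provider_name: str,
    registry: Dict[str, EditFn],
    validate: Callable[[str], None],
    track_usage: Callable[[str, str], None],
    user_id: str,
) -> bytes:
    validate(user_id)  # pre-flight check; expected to raise on failure
    provider = get_edit_provider(provider_name, registry)
    result = provider(prompt, image)
    track_usage(user_id, provider_name)  # recorded only after a successful edit
    return result
```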

Step 6: Service Integration - COMPLETE

  • Refactored _handle_general_edit() to use unified entry point for WaveSpeed models
  • Added model detection logic (WaveSpeed vs HuggingFace)
  • Maintained backward compatibility with Stability AI and HuggingFace
  • API endpoint already supports model parameter (no changes needed)
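
One simple way to express the model detection is a membership check against the known WaveSpeed model IDs, falling back to the existing path otherwise. The five IDs are the ones listed in this document; the function name, the "legacy" label, and the fallback behavior for a missing or unknown model are assumptions.

```python
from typing import Optional

# Model IDs listed in this document as already added.
WAVESPEED_EDIT_MODELS = frozenset({
    "qwen-edit",
    "qwen-edit-plus",
    "nano-banana-pro-edit-ultra",
    "seedream-v4.5-edit",
    "flux-kontext-pro",
})


def detect_edit_backend(model: Optional[str]) -> str:
    """Return 'wavespeed' for known WaveSpeed models, else 'legacy'.

    'legacy' stands for the pre-existing Stability AI / HuggingFace path,
    which keeps requests without a model parameter working unchanged.
    """
    if model is not None and model in WAVESPEED_EDIT_MODELS:
        return "wavespeed"
    return "legacy"
```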

Step 7: Backend APIs - COMPLETE

  • GET /api/image-studio/edit/models - List available models with metadata
  • POST /api/image-studio/edit/recommend - Get smart recommendations
  • Auto-detection logic implemented in _handle_general_edit()
  • Recommendation algorithm with scoring (cost, quality, user tier, resolution)
  • Model metadata methods (get_available_models(), recommend_model())
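
A scoring-based recommendation along the lines described above might be sketched like this. The four factors (cost, quality, user tier, resolution) come from this document; the "quality" field, the tier ranking, the cost weight, and the function signature are assumptions and do not reflect the actual recommend_model() implementation.

```python
from typing import Any, Dict, Optional, Tuple

TIER_RANK = {"budget": 0, "mid": 1, "premium": 2}


def recommend_model(
    models: Dict[str, Dict[str, Any]],
    user_tier: str,
    required_resolution: Tuple[int, int],
    cost_weight: float = 2.0,
) -> Optional[str]:
    """Pick the highest-scoring model the user may use, or None if none fit."""
    best_id: Optional[str] = None
    best_score = float("-inf")
    for model_id, meta in models.items():
        max_w, max_h = meta["max_resolution"]
        # Hard filters: the model must cover the resolution and the user's tier.
        if max_w < required_resolution[0] or max_h < required_resolution[1]:
            continue
        if TIER_RANK[meta["tier"]] > TIER_RANK[user_tier]:
            continue
        # Soft score: reward quality, penalize cost per edit.
        score = meta["quality"] - cost_weight * meta["cost"]
        if score > best_score:
            best_id, best_score = model_id, score
    return best_id
```

Treating tier and resolution as hard filters and cost and quality as a weighted score keeps the result explainable: a model is either ineligible or ranked by a single number.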

Step 8: Frontend Integration ⏸️ PENDING

  • ⏸️ Create ModelSelector component
  • ⏸️ Create ModelInfoCard component
  • ⏸️ Create ModelComparisonDialog component
  • ⏸️ Integrate into EditStudio.tsx
  • ⏸️ Add API calls to useImageStudio hook
  • ⏸️ Display cost estimates and model information

📁 Files Created/Modified

New Files

  1. backend/services/llm_providers/image_generation/wavespeed_edit_provider.py - Provider structure

Modified Files

  1. backend/services/llm_providers/image_generation/base.py - Added protocol & options
  2. backend/services/llm_providers/image_generation/__init__.py - Exported new types
  3. backend/services/llm_providers/main_image_generation.py - Added generate_image_edit() function
  4. backend/services/image_studio/edit_service.py - Added model listing, recommendations, auto-detection
  5. backend/services/image_studio/studio_manager.py - Added model API methods
  6. backend/routers/image_studio.py - Added /edit/models and /edit/recommend endpoints

🎯 Current Status Summary

| Step | Status | Notes |
|------|--------|-------|
| Step 1: Protocol & Options | Complete | Ready to use |
| Step 2: Provider Structure | Complete | Structure ready |
| Step 3: Add Models | 🚧 In Progress | 5 of 14 models added (Qwen Edit, Qwen Edit Plus, Nano Banana Pro Edit Ultra, Seedream V4.5 Edit, FLUX Kontext Pro) |
| Step 4: API Implementation | Complete | API call method implemented |
| Step 5: Unified Entry | Complete | Ready to use |
| Step 6: Service Integration | Complete | WaveSpeed models integrated, backward compatible |
| Step 7: Backend APIs | Complete | /edit/models and /edit/recommend endpoints added |
| Step 8: Frontend | ⏸️ Pending | Add model selector UI |

📝 Notes

  1. Reusability: All code follows established patterns from Phase 1
  2. API Call: _call_wavespeed_edit_api() started as a placeholder and is now implemented (Step 4)
  3. Model Registry: Structure ready and populated with 5 of 14 models; the remaining 9 still need model data
  4. Backward Compatibility: Maintained with Stability AI and HuggingFace in the EditStudioService integration (Step 6)

Foundation complete - Ready for model documentation