# Stability AI Integration Documentation

This document provides comprehensive documentation for the Stability AI integration in the ALwrity backend.

## Overview

The Stability AI integration provides access to all major Stability AI services including:

- **Image Generation**: Ultra, Core, and SD3.5 models
- **Image Editing**: Erase, Inpaint, Outpaint, Search & Replace, Search & Recolor, Background Removal
- **Image Upscaling**: Fast, Conservative, and Creative upscaling
- **Image Control**: Sketch, Structure, Style, and Style Transfer control
- **3D Generation**: Fast 3D and Point-Aware 3D model generation
- **Audio Generation**: Text-to-Audio, Audio-to-Audio, and Audio Inpainting
- **Legacy V1 APIs**: SDXL 1.0 and other V1 engines

## Architecture

### Modular Structure

```
backend/
├── models/
│   └── stability_models.py          # Pydantic models for all API schemas
├── services/
│   └── stability_service.py         # Core service class with HTTP client
├── routers/
│   ├── stability.py                 # Main API endpoints
│   ├── stability_advanced.py        # Advanced workflows and features
│   └── stability_admin.py           # Admin and monitoring endpoints
├── middleware/
│   └── stability_middleware.py      # Rate limiting, caching, monitoring
├── utils/
│   └── stability_utils.py           # Utility functions and validators
├── config/
│   └── stability_config.py          # Configuration and constants
└── test/
    └── test_stability_endpoints.py  # Comprehensive test suite
```

### Key Components

1. **StabilityAIService**: Core service class handling all API interactions
2. **Pydantic Models**: Comprehensive request/response models with validation
3. **FastAPI Routers**: Organized endpoints for different service categories
4. **Middleware**: Rate limiting, caching, monitoring, and content moderation
5. **Utilities**: File handling, validation, optimization, and workflow management

## API Endpoints

### Generation Endpoints

#### POST `/api/stability/generate/ultra`

Generate high-quality images using Stable Image Ultra.
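
The parameter rules documented for this endpoint (a `seed` between 0 and 4294967294, and `strength` required whenever an `image` is supplied) can be checked on the client before a request is sent. The sketch below is illustrative only: `build_ultra_form` and its `has_image` flag are hypothetical names, not part of the ALwrity codebase, and the backend's own Pydantic models remain the authority on validation.

```python
from typing import Any, Dict, Optional

MAX_SEED = 4294967294  # upper bound documented for `seed`


def build_ultra_form(
    prompt: str,
    seed: int = 0,
    has_image: bool = False,
    strength: Optional[float] = None,
    aspect_ratio: str = "1:1",
) -> Dict[str, Any]:
    """Hypothetical helper: validate and assemble form fields for generate/ultra."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if not 0 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be between 0 and {MAX_SEED}")
    if has_image and strength is None:
        raise ValueError("strength is required when an image is provided")

    form: Dict[str, Any] = {
        "prompt": prompt,
        "seed": seed,
        "aspect_ratio": aspect_ratio,
    }
    if has_image:
        # Only sent for image-to-image requests
        form["strength"] = strength
    return form
```

The returned dictionary could be passed as the `data=` argument of `requests.post`, with the image itself supplied through `files=`.
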

**Parameters:**

- `prompt` (required): Text description of desired image
- `image` (optional): Input image for image-to-image generation
- `negative_prompt` (optional): What you don't want to see
- `aspect_ratio` (optional): Image aspect ratio (default: "1:1")
- `seed` (optional): Random seed (0-4294967294)
- `output_format` (optional): Output format (jpeg, png, webp)
- `style_preset` (optional): Style preset
- `strength` (optional): Image influence strength (required if image provided)

**Response:** Image bytes or JSON with generation ID

**Cost:** 8 credits per generation

#### POST `/api/stability/generate/core`

Fast and affordable image generation.

**Parameters:**

- `prompt` (required): Text description
- `negative_prompt` (optional): Negative prompt
- `aspect_ratio` (optional): Image aspect ratio
- `seed` (optional): Random seed
- `output_format` (optional): Output format
- `style_preset` (optional): Style preset

**Cost:** 3 credits per generation

#### POST `/api/stability/generate/sd3`

Generate using Stable Diffusion 3.5 models.

**Parameters:**

- `prompt` (required): Text description
- `mode` (optional): "text-to-image" or "image-to-image"
- `image` (optional): Input image (required for image-to-image)
- `strength` (optional): Image influence (required for image-to-image)
- `aspect_ratio` (optional): Image aspect ratio (text-to-image only)
- `model` (optional): SD3 model variant
- `cfg_scale` (optional): CFG scale (1-10)

**Cost:** 2.5-6.5 credits depending on model

### Edit Endpoints

#### POST `/api/stability/edit/erase`

Remove unwanted objects using masks.

**Parameters:**

- `image` (required): Image file to edit
- `mask` (optional): Mask image (or use alpha channel)
- `grow_mask` (optional): Mask edge growth (0-20 pixels)
- `seed` (optional): Random seed
- `output_format` (optional): Output format

**Cost:** 5 credits per generation

#### POST `/api/stability/edit/inpaint`

Fill or replace specified areas with new content.

**Parameters:**

- `image` (required): Image file to edit
- `prompt` (required): Description of desired content
- `mask` (optional): Mask image
- `negative_prompt` (optional): Negative prompt
- `grow_mask` (optional): Mask edge growth (0-100 pixels)
- `style_preset` (optional): Style preset

**Cost:** 5 credits per generation

#### POST `/api/stability/edit/outpaint`

Expand image in specified directions.

**Parameters:**

- `image` (required): Image file to expand
- `left` (optional): Pixels to expand left (0-2000)
- `right` (optional): Pixels to expand right (0-2000)
- `up` (optional): Pixels to expand up (0-2000)
- `down` (optional): Pixels to expand down (0-2000)
- `creativity` (optional): Creativity level (0-1)
- `prompt` (optional): Guidance prompt

**Note:** At least one direction must be specified.

**Cost:** 4 credits per generation

#### POST `/api/stability/edit/search-and-replace`

Replace objects using text prompts instead of masks.

**Parameters:**

- `image` (required): Image file to edit
- `prompt` (required): Description of replacement
- `search_prompt` (required): What to search for
- `grow_mask` (optional): Mask edge growth (0-20 pixels)

**Cost:** 5 credits per generation

#### POST `/api/stability/edit/search-and-recolor`

Change colors of specific objects using prompts.

**Parameters:**

- `image` (required): Image file to edit
- `prompt` (required): Description of new colors
- `select_prompt` (required): What to select for recoloring

**Cost:** 5 credits per generation

#### POST `/api/stability/edit/remove-background`

Remove background from images.

**Parameters:**

- `image` (required): Image file
- `output_format` (optional): Output format (png, webp)

**Cost:** 5 credits per generation

### Upscale Endpoints

#### POST `/api/stability/upscale/fast`

Fast 4x upscaling (~1 second processing).
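
Fast upscaling only accepts small inputs: the Image Requirements section lists 1,024 to 1,048,576 total pixels, with each side between 32 and 1536 px. A minimal pre-flight check under those documented limits might look like the following; `fits_fast_upscale` is a hypothetical name used for illustration, not a function from `stability_utils.py`.

```python
# Limits copied from the "Upscale Operations" requirements in this document
FAST_MIN_PIXELS = 1_024
FAST_MAX_PIXELS = 1_048_576
FAST_MIN_SIDE = 32
FAST_MAX_SIDE = 1_536


def fits_fast_upscale(width: int, height: int) -> bool:
    """Return True if a width x height image is within the fast-upscale limits."""
    sides_ok = all(
        FAST_MIN_SIDE <= side <= FAST_MAX_SIDE for side in (width, height)
    )
    pixels_ok = FAST_MIN_PIXELS <= width * height <= FAST_MAX_PIXELS
    return sides_ok and pixels_ok
```

Note that both conditions matter: a 1536x1536 image satisfies the per-side limit but exceeds the total pixel limit, so it would be rejected.
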

**Parameters:**

- `image` (required): Image file to upscale
- `output_format` (optional): Output format

**Cost:** 2 credits per generation

#### POST `/api/stability/upscale/conservative`

Conservative upscaling to 4K with minimal changes.

**Parameters:**

- `image` (required): Image file to upscale
- `prompt` (required): Description for guidance
- `creativity` (optional): Creativity level (0.2-0.5)

**Cost:** 40 credits per generation

#### POST `/api/stability/upscale/creative`

Creative upscaling for highly degraded images (async).

**Parameters:**

- `image` (required): Image file to upscale
- `prompt` (required): Description for guidance
- `creativity` (optional): Creativity level (0.1-0.5)
- `style_preset` (optional): Style preset

**Cost:** 60 credits per generation

### Control Endpoints

#### POST `/api/stability/control/sketch`

Generate refined images from sketches.

**Parameters:**

- `image` (required): Sketch or line art
- `prompt` (required): Description of desired result
- `control_strength` (optional): Control strength (0-1)

**Cost:** 5 credits per generation

#### POST `/api/stability/control/structure`

Maintain structure while changing content.

**Parameters:**

- `image` (required): Structure reference image
- `prompt` (required): Description of desired result
- `control_strength` (optional): Control strength (0-1)

**Cost:** 5 credits per generation

#### POST `/api/stability/control/style`

Extract and apply style from reference image.

**Parameters:**

- `image` (required): Style reference image
- `prompt` (required): Description of desired result
- `aspect_ratio` (optional): Output aspect ratio
- `fidelity` (optional): Style fidelity (0-1)

**Cost:** 5 credits per generation

#### POST `/api/stability/control/style-transfer`

Transfer style between two images.

**Parameters:**

- `init_image` (required): Image to restyle
- `style_image` (required): Style reference
- `style_strength` (optional): Style strength (0-1)
- `composition_fidelity` (optional): Composition preservation (0-1)

**Cost:** 8 credits per generation

### 3D Endpoints

#### POST `/api/stability/3d/stable-fast-3d`

Generate 3D models from 2D images (fast).

**Parameters:**

- `image` (required): 2D image to convert
- `texture_resolution` (optional): Texture resolution (512, 1024, 2048)
- `foreground_ratio` (optional): Object size ratio (0.1-1)
- `remesh` (optional): Remesh algorithm (none, triangle, quad)

**Output:** GLB 3D model file

**Cost:** 10 credits per generation

#### POST `/api/stability/3d/stable-point-aware-3d`

Advanced 3D generation with editing capabilities.

**Parameters:**

- `image` (required): 2D image to convert
- `texture_resolution` (optional): Texture resolution
- `foreground_ratio` (optional): Object size ratio (1-2)
- `target_type` (optional): Simplification target (none, vertex, face)
- `guidance_scale` (optional): Guidance scale (1-10)

**Cost:** 4 credits per generation

### Audio Endpoints

#### POST `/api/stability/audio/text-to-audio`

Generate audio from text descriptions.

**Parameters:**

- `prompt` (required): Audio description
- `duration` (optional): Duration in seconds (1-190)
- `model` (optional): Audio model (stable-audio-2, stable-audio-2.5)
- `steps` (optional): Sampling steps (model-dependent)
- `cfg_scale` (optional): CFG scale (1-25)

**Cost:** 20 credits per generation

#### POST `/api/stability/audio/audio-to-audio`

Transform audio using text instructions.

**Parameters:**

- `prompt` (required): Transformation description
- `audio` (required): Input audio file
- `duration` (optional): Output duration (1-190)
- `strength` (optional): Input influence (0-1)

**Cost:** 20 credits per generation

### Results Endpoint

#### GET `/api/stability/results/{generation_id}`

Get results from async generations.
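
Asynchronous operations such as creative upscaling hand back a generation ID, which then has to be polled at this endpoint. The sketch below shows one way to structure that loop. It assumes, without confirmation from this document, that a pending result is signalled with HTTP 202 and a finished one with HTTP 200; `poll_result` and the injected `fetch` callable are hypothetical names, chosen so the loop can be exercised without a running server.

```python
import time
from typing import Callable, Tuple

# (status_code, body) pair returned by whatever HTTP client performs the GET
FetchFn = Callable[[str], Tuple[int, bytes]]


def poll_result(
    generation_id: str,
    fetch: FetchFn,
    interval: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Poll the results endpoint until content is ready or attempts run out."""
    path = f"/api/stability/results/{generation_id}"
    for _ in range(max_attempts):
        status, body = fetch(path)
        if status == 200:  # assumed: finished, body holds the generated content
            return body
        if status != 202:  # assumed: 202 means the job is still in progress
            raise RuntimeError(f"unexpected status {status} for {path}")
        sleep(interval)
    raise TimeoutError(f"{generation_id} not ready after {max_attempts} attempts")
```

In practice `fetch` would wrap an HTTP client call, for example a function that issues `requests.get` against the backend's base URL and returns `(response.status_code, response.content)`.
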

**Parameters:**

- `generation_id` (required): ID from async operation
- `accept_type` (optional): Response format preference

**Response:** Generated content or status update

## Advanced Features

### Workflow Processing

The integration supports complex multi-step workflows:

```python
# Example workflow
workflow = [
    {"operation": "generate_core", "parameters": {"prompt": "a landscape"}},
    {"operation": "upscale_fast", "parameters": {}},
    {"operation": "inpaint", "parameters": {"prompt": "add a house"}}
]
```

### Batch Processing

Process multiple images with the same operation:

```
POST /api/stability/advanced/batch/process-folder
```

### Model Comparison

Compare results across different models:

```
POST /api/stability/advanced/compare/models
```

### AI Director Mode

Automated creative decision making:

```
POST /api/stability/advanced/experimental/ai-director
```

## Configuration

### Environment Variables

```bash
STABILITY_API_KEY=your_api_key_here
STABILITY_BASE_URL=https://api.stability.ai  # Optional
STABILITY_TIMEOUT=300                        # Optional
STABILITY_MAX_RETRIES=3                      # Optional
STABILITY_MAX_FILE_SIZE=10485760             # Optional (10MB)
```

### Rate Limiting

- **Default Limit**: 150 requests per 10 seconds
- **Timeout**: 60 seconds when limit exceeded
- **Configurable**: Can be adjusted in middleware

### File Size Limits

- **Images**: 10MB maximum
- **Audio**: 50MB maximum
- **3D Models**: 10MB maximum

### Image Requirements

#### Generate Operations

- **Minimum**: 4,096 pixels total
- **Maximum**: 16,777,216 pixels total (16MP)
- **Dimensions**: At least 64x64 pixels

#### Edit Operations

- **Minimum**: 4,096 pixels total
- **Maximum**: 9,437,184 pixels total (~9.4MP)
- **Aspect Ratio**: Between 1:2.5 and 2.5:1

#### Upscale Operations

- **Fast**: 1,024 to 1,048,576 pixels, 32-1536px dimensions
- **Conservative**: 4,096 to 9,437,184 pixels
- **Creative**: 4,096 to 1,048,576 pixels

## Usage Examples

### Basic Text-to-Image Generation

```python
import requests

response = requests.post(
    "http://localhost:8000/api/stability/generate/ultra",
    data={
        "prompt": "A majestic mountain landscape at sunset",
        "aspect_ratio": "16:9",
        "style_preset": "photographic"
    }
)

if response.status_code == 200:
    with open("generated_image.png", "wb") as f:
        f.write(response.content)
```

### Image Editing with Inpainting

```python
import requests

data = {
    "prompt": "a beautiful garden",
    "grow_mask": 10
}

# Context managers close both files once the upload finishes
with open("input.png", "rb") as image, open("mask.png", "rb") as mask:
    response = requests.post(
        "http://localhost:8000/api/stability/edit/inpaint",
        files={"image": image, "mask": mask},
        data=data
    )
```

### Audio Generation

```python
import requests

response = requests.post(
    "http://localhost:8000/api/stability/audio/text-to-audio",
    data={
        "prompt": "Peaceful piano music with nature sounds",
        "duration": 60,
        "model": "stable-audio-2.5"
    }
)

if response.status_code == 200:
    with open("generated_audio.mp3", "wb") as f:
        f.write(response.content)
```

### 3D Model Generation

```python
import requests

# The context manager closes the input file once the upload finishes
with open("object.png", "rb") as image:
    response = requests.post(
        "http://localhost:8000/api/stability/3d/stable-fast-3d",
        files={"image": image},
        data={
            "texture_resolution": "1024",
            "foreground_ratio": 0.85
        }
    )

if response.status_code == 200:
    with open("model.glb", "wb") as f:
        f.write(response.content)
```

## Error Handling

The API reports failures with standard HTTP status codes and a structured JSON error body.

### Common Error Codes

- **400**: Invalid parameters or file format
- **403**: Content moderation flag or insufficient permissions
- **413**: File too large
- **422**: Request well-formed but rejected
- **429**: Rate limit exceeded
- **500**: Internal server error

### Error Response Format

```json
{
  "id": "error_id",
  "name": "error_name",
  "errors": ["Detailed error messages"]
}
```

## Monitoring and Analytics

### Health Check Endpoints

- `GET /api/stability/health` - Basic health check
- `GET /api/stability/admin/health/detailed` - Comprehensive health check

### Statistics Endpoints

- `GET /api/stability/admin/stats` - Service statistics
- `GET /api/stability/admin/usage/summary` - Usage summary
- `GET /api/stability/admin/request-logs` - Request logs

### Cost Estimation

- `GET /api/stability/admin/costs/estimate` - Estimate operation costs

## Best Practices

### Prompt Optimization

1. **Be Specific**: Use detailed, descriptive language
2. **Include Style**: Specify artistic style or photographic type
3. **Add Quality Terms**: Include "high quality", "detailed", "sharp"
4. **Use Negative Prompts**: Specify what you don't want

### Image Preparation

1. **Check Dimensions**: Ensure images meet size requirements
2. **Optimize File Size**: Compress large images before upload
3. **Use Appropriate Formats**: PNG for transparency, JPEG for photos
4. **Validate Aspect Ratios**: Check ratio requirements for operations

### Performance Optimization

1. **Use Appropriate Models**: Choose model based on speed vs quality needs
2. **Batch Operations**: Use batch endpoints for multiple similar operations
3. **Cache Results**: Enable caching for repeated operations
4. **Monitor Usage**: Track credit usage and optimize accordingly

## Security Considerations

### API Key Management

- Store API keys securely in environment variables
- Never commit API keys to version control
- Rotate keys regularly
- Monitor key usage for unauthorized access

### Content Moderation

- Built-in content moderation middleware
- Configurable blocked terms
- Automatic flagging of inappropriate content
- Audit logging for compliance

### Rate Limiting

- Automatic rate limiting per client
- Configurable limits and timeouts
- IP-based and API key-based limiting
- Graceful handling of limit exceeded scenarios

## Troubleshooting

### Common Issues

#### "API key missing or invalid"

- Check the `STABILITY_API_KEY` environment variable
- Verify key is correct and active
- Check account balance

#### "Rate limit exceeded"

- Wait for timeout period (60 seconds)
- Implement request queuing
- Consider upgrading API plan

#### "File too large"

- Compress images before upload
- Check file size limits for operation
- Use appropriate image formats

#### "Invalid image dimensions"

- Check minimum/maximum pixel requirements
- Validate aspect ratio constraints
- Resize image if necessary

### Debug Endpoints

- `POST /api/stability/admin/debug/test-connection` - Test API connectivity
- `GET /api/stability/admin/debug/request-logs` - View recent requests
- `POST /api/stability/utils/image-info` - Analyze image properties

## Integration Examples

### React Frontend Integration

```javascript
// Upload and generate
const formData = new FormData();
formData.append('prompt', 'A beautiful landscape');
formData.append('aspect_ratio', '16:9');

const response = await fetch('/api/stability/generate/ultra', {
  method: 'POST',
  body: formData
});

if (response.ok) {
  const blob = await response.blob();
  const imageUrl = URL.createObjectURL(blob);
  // Display image
}
```

### Python Service Integration

```python
from typing import List

from services.stability_service import StabilityAIService

async def generate_content_images(prompts: List[str]):
    service = StabilityAIService()

    # The service is used as an async context manager for the whole batch
    async with service:
        results = []
        for prompt in prompts:
            result = await service.generate_core(
                prompt=prompt,
                aspect_ratio="16:9"
            )
            results.append(result)

    return results
```

## Performance Metrics

### Typical Response Times

- **Fast Operations** (Fast Upscale): ~1-2 seconds
- **Standard Operations** (Core Generation): ~5-10 seconds
- **Complex Operations** (Ultra Generation): ~10-20 seconds
- **Heavy Operations** (Creative Upscale): ~30-60 seconds

### Throughput

- **Rate Limit**: 150 requests per 10 seconds
- **Concurrent Requests**: Limited by API key
- **Batch Processing**: Recommended for multiple operations

## Future Enhancements

### Planned Features

1. **Advanced Caching**: Redis-based caching for better performance
2. **Queue Management**: Async job queue for heavy operations
3. **Result Storage**: Persistent storage for generated content
4. **Analytics Dashboard**: Real-time usage analytics
5. **Custom Workflows**: Visual workflow builder
6. **A/B Testing**: Compare different approaches automatically

### API Extensions

1. **Webhook Support**: Real-time notifications for async operations
2. **Streaming Responses**: Progressive image generation updates
3. **Template System**: Predefined generation templates
4. **Collaboration Features**: Shared workspaces and results

## Support

For issues and questions:

1. Check the troubleshooting section above
2. Review the test suite for usage examples
3. Check Stability AI documentation: https://platform.stability.ai/docs
4. Contact support through the admin panel

## Version History

- **v1.0.0**: Initial implementation with all major Stability AI features
  - Complete API coverage for v2beta endpoints
  - Legacy v1 API support
  - Comprehensive middleware and utilities
  - Full test suite and documentation