Stability AI Integration Documentation
This document describes the Stability AI integration in the ALwrity backend: its architecture, endpoints, configuration, and usage.
Overview
The Stability AI integration provides access to all major Stability AI services including:
- Image Generation: Ultra, Core, and SD3.5 models
- Image Editing: Erase, Inpaint, Outpaint, Search & Replace, Search & Recolor, Background Removal
- Image Upscaling: Fast, Conservative, and Creative upscaling
- Image Control: Sketch, Structure, Style, and Style Transfer control
- 3D Generation: Fast 3D and Point-Aware 3D model generation
- Audio Generation: Text-to-Audio, Audio-to-Audio, and Audio Inpainting
- Legacy V1 APIs: SDXL 1.0 and other V1 engines
Architecture
Modular Structure
backend/
├── models/
│ └── stability_models.py # Pydantic models for all API schemas
├── services/
│ └── stability_service.py # Core service class with HTTP client
├── routers/
│ ├── stability.py # Main API endpoints
│ ├── stability_advanced.py # Advanced workflows and features
│ └── stability_admin.py # Admin and monitoring endpoints
├── middleware/
│ └── stability_middleware.py # Rate limiting, caching, monitoring
├── utils/
│ └── stability_utils.py # Utility functions and validators
├── config/
│ └── stability_config.py # Configuration and constants
└── test/
  └── test_stability_endpoints.py # Comprehensive test suite
Key Components
- StabilityAIService: Core service class handling all API interactions
- Pydantic Models: Comprehensive request/response models with validation
- FastAPI Routers: Organized endpoints for different service categories
- Middleware: Rate limiting, caching, monitoring, and content moderation
- Utilities: File handling, validation, optimization, and workflow management
API Endpoints
Generation Endpoints
POST /api/stability/generate/ultra
Generate high-quality images using Stable Image Ultra.
Parameters:
- prompt (required): Text description of desired image
- image (optional): Input image for image-to-image generation
- negative_prompt (optional): What you don't want to see
- aspect_ratio (optional): Image aspect ratio (default: "1:1")
- seed (optional): Random seed (0-4294967294)
- output_format (optional): Output format (jpeg, png, webp)
- style_preset (optional): Style preset
- strength (optional): Image influence strength (required if image provided)
Response: Image bytes or JSON with generation ID
Cost: 8 credits per generation
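The constraints in the parameter list above (seed range, allowed output formats, `strength` required when `image` is supplied) can be checked on the client before a request is sent. The following is a minimal sketch only; `validate_ultra_params` and its `has_image` flag are hypothetical names, not part of the integration.

```python
from typing import List, Optional

MAX_SEED = 4294967294                      # upper bound of the documented seed range
ALLOWED_FORMATS = {"jpeg", "png", "webp"}  # documented output formats


def validate_ultra_params(
    prompt: str,
    seed: int = 0,
    output_format: Optional[str] = None,
    has_image: bool = False,
    strength: Optional[float] = None,
) -> List[str]:
    """Hypothetical helper: return a list of problems, empty if none."""
    problems = []
    if not prompt.strip():
        problems.append("prompt is required")
    if not 0 <= seed <= MAX_SEED:
        problems.append(f"seed must be between 0 and {MAX_SEED}")
    if output_format is not None and output_format not in ALLOWED_FORMATS:
        problems.append("output_format must be jpeg, png, or webp")
    if has_image and strength is None:
        problems.append("strength is required when image is provided")
    return problems
```

Returning a list instead of raising on the first problem lets a caller report every issue at once.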
POST /api/stability/generate/core
Fast and affordable image generation.
Parameters:
- prompt (required): Text description
- negative_prompt (optional): Negative prompt
- aspect_ratio (optional): Image aspect ratio
- seed (optional): Random seed
- output_format (optional): Output format
- style_preset (optional): Style preset
Cost: 3 credits per generation
POST /api/stability/generate/sd3
Generate using Stable Diffusion 3.5 models.
Parameters:
- prompt (required): Text description
- mode (optional): "text-to-image" or "image-to-image"
- image (optional): Input image (required for image-to-image)
- strength (optional): Image influence (required for image-to-image)
- aspect_ratio (optional): Image aspect ratio (text-to-image only)
- model (optional): SD3 model variant
- cfg_scale (optional): CFG scale (1-10)
Cost: 2.5-6.5 credits depending on model
Edit Endpoints
POST /api/stability/edit/erase
Remove unwanted objects using masks.
Parameters:
- image (required): Image file to edit
- mask (optional): Mask image (or use alpha channel)
- grow_mask (optional): Mask edge growth (0-20 pixels)
- seed (optional): Random seed
- output_format (optional): Output format
Cost: 5 credits per generation
POST /api/stability/edit/inpaint
Fill or replace specified areas with new content.
Parameters:
- image (required): Image file to edit
- prompt (required): Description of desired content
- mask (optional): Mask image
- negative_prompt (optional): Negative prompt
- grow_mask (optional): Mask edge growth (0-100 pixels)
- style_preset (optional): Style preset
Cost: 5 credits per generation
POST /api/stability/edit/outpaint
Expand image in specified directions.
Parameters:
- image (required): Image file to expand
- left (optional): Pixels to expand left (0-2000)
- right (optional): Pixels to expand right (0-2000)
- up (optional): Pixels to expand up (0-2000)
- down (optional): Pixels to expand down (0-2000)
- creativity (optional): Creativity level (0-1)
- prompt (optional): Guidance prompt
Note: At least one direction must be specified.
Cost: 4 credits per generation
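The "at least one direction" rule can be enforced before uploading. This sketch assumes that "specified" means a value greater than zero; `validate_outpaint` is a hypothetical helper name.

```python
from typing import Dict

MAX_EXPAND = 2000  # documented upper bound per direction, in pixels


def validate_outpaint(left: int = 0, right: int = 0,
                      up: int = 0, down: int = 0) -> Dict[str, int]:
    """Hypothetical helper: check outpaint directions and return them."""
    directions = {"left": left, "right": right, "up": up, "down": down}
    for name, pixels in directions.items():
        if not 0 <= pixels <= MAX_EXPAND:
            raise ValueError(f"{name} must be between 0 and {MAX_EXPAND}")
    # Assumption: a direction counts as "specified" only if it is non-zero
    if not any(directions.values()):
        raise ValueError("at least one of left, right, up, down must be > 0")
    return directions
```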
POST /api/stability/edit/search-and-replace
Replace objects using text prompts instead of masks.
Parameters:
- image (required): Image file to edit
- prompt (required): Description of replacement
- search_prompt (required): What to search for
- grow_mask (optional): Mask edge growth (0-20 pixels)
Cost: 5 credits per generation
POST /api/stability/edit/search-and-recolor
Change colors of specific objects using prompts.
Parameters:
- image (required): Image file to edit
- prompt (required): Description of new colors
- select_prompt (required): What to select for recoloring
Cost: 5 credits per generation
POST /api/stability/edit/remove-background
Remove background from images.
Parameters:
- image (required): Image file
- output_format (optional): Output format (png, webp)
Cost: 5 credits per generation
Upscale Endpoints
POST /api/stability/upscale/fast
Fast 4x upscaling (~1 second processing).
Parameters:
- image (required): Image file to upscale
- output_format (optional): Output format
Cost: 2 credits per generation
POST /api/stability/upscale/conservative
Conservative upscaling to 4K with minimal changes.
Parameters:
- image (required): Image file to upscale
- prompt (required): Description for guidance
- creativity (optional): Creativity level (0.2-0.5)
Cost: 40 credits per generation
POST /api/stability/upscale/creative
Creative upscaling for highly degraded images (async).
Parameters:
- image (required): Image file to upscale
- prompt (required): Description for guidance
- creativity (optional): Creativity level (0.1-0.5)
- style_preset (optional): Style preset
Cost: 60 credits per generation
Control Endpoints
POST /api/stability/control/sketch
Generate refined images from sketches.
Parameters:
- image (required): Sketch or line art
- prompt (required): Description of desired result
- control_strength (optional): Control strength (0-1)
Cost: 5 credits per generation
POST /api/stability/control/structure
Maintain structure while changing content.
Parameters:
- image (required): Structure reference image
- prompt (required): Description of desired result
- control_strength (optional): Control strength (0-1)
Cost: 5 credits per generation
POST /api/stability/control/style
Extract and apply style from reference image.
Parameters:
- image (required): Style reference image
- prompt (required): Description of desired result
- aspect_ratio (optional): Output aspect ratio
- fidelity (optional): Style fidelity (0-1)
Cost: 5 credits per generation
POST /api/stability/control/style-transfer
Transfer style between two images.
Parameters:
- init_image (required): Image to restyle
- style_image (required): Style reference
- style_strength (optional): Style strength (0-1)
- composition_fidelity (optional): Composition preservation (0-1)
Cost: 8 credits per generation
3D Endpoints
POST /api/stability/3d/stable-fast-3d
Generate 3D models from 2D images (fast).
Parameters:
- image (required): 2D image to convert
- texture_resolution (optional): Texture resolution (512, 1024, 2048)
- foreground_ratio (optional): Object size ratio (0.1-1)
- remesh (optional): Remesh algorithm (none, triangle, quad)
Output: GLB 3D model file
Cost: 10 credits per generation
POST /api/stability/3d/stable-point-aware-3d
Advanced 3D generation with editing capabilities.
Parameters:
- image (required): 2D image to convert
- texture_resolution (optional): Texture resolution
- foreground_ratio (optional): Object size ratio (1-2)
- target_type (optional): Simplification target (none, vertex, face)
- guidance_scale (optional): Guidance scale (1-10)
Cost: 4 credits per generation
Audio Endpoints
POST /api/stability/audio/text-to-audio
Generate audio from text descriptions.
Parameters:
- prompt (required): Audio description
- duration (optional): Duration in seconds (1-190)
- model (optional): Audio model (stable-audio-2, stable-audio-2.5)
- steps (optional): Sampling steps (model-dependent)
- cfg_scale (optional): CFG scale (1-25)
Cost: 20 credits per generation
POST /api/stability/audio/audio-to-audio
Transform audio using text instructions.
Parameters:
- prompt (required): Transformation description
- audio (required): Input audio file
- duration (optional): Output duration in seconds (1-190)
- strength (optional): Input influence (0-1)
Cost: 20 credits per generation
Results Endpoint
GET /api/stability/results/{generation_id}
Get results from async generations.
Parameters:
- generation_id (required): ID from async operation
- accept_type (optional): Response format preference
Response: Generated content or status update
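Because the results endpoint returns either the finished content or a status update, a client typically polls it. The sketch below assumes that HTTP 202 means "still in progress" and 200 means "finished"; the source does not confirm these codes, so adjust them to what the backend actually returns. `poll_result` and its `fetch` callable are hypothetical.

```python
import time
from typing import Callable, Tuple


def poll_result(
    fetch: Callable[[], Tuple[int, bytes]],
    interval: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call fetch() until it reports completion, then return the body.

    fetch() must return (status_code, body). Assumption: 202 means the
    generation is still running and 200 means the body is the result.
    """
    for _ in range(max_attempts):
        status, body = fetch()
        if status == 200:
            return body
        if status != 202:
            raise RuntimeError(f"unexpected status {status}")
        sleep(interval)
    raise TimeoutError("generation did not finish in time")
```

With `requests`, `fetch` could wrap a `requests.get` call on `/api/stability/results/{generation_id}` and return `(response.status_code, response.content)`.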
Advanced Features
Workflow Processing
The integration supports complex multi-step workflows:
# Example workflow
workflow = [
    {"operation": "generate_core", "parameters": {"prompt": "a landscape"}},
    {"operation": "upscale_fast", "parameters": {}},
    {"operation": "inpaint", "parameters": {"prompt": "add a house"}},
]
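One way such a step list could be executed is to dispatch each operation name to a handler and feed each step's output into the next. This is an illustrative sketch, not the integration's actual workflow engine; `run_workflow` and the handler signature are assumptions.

```python
from typing import Any, Callable, Dict, List

# Hypothetical handler signature: (previous_result, parameters) -> new_result
Handler = Callable[[Any, Dict[str, Any]], Any]


def run_workflow(steps: List[Dict[str, Any]],
                 handlers: Dict[str, Handler],
                 initial: Any = None) -> Any:
    """Run steps in order, passing each result to the next step."""
    result = initial
    for index, step in enumerate(steps):
        name = step["operation"]
        if name not in handlers:
            raise KeyError(f"step {index}: unknown operation {name!r}")
        result = handlers[name](result, step.get("parameters", {}))
    return result


# Stub handlers stand in for real service calls
stub_handlers = {
    "generate_core": lambda prev, p: f"image({p['prompt']})",
    "upscale_fast": lambda prev, p: f"upscaled({prev})",
}
```

Failing fast on an unknown operation name avoids spending credits on earlier steps of a workflow that cannot complete; validating all names before the first step would be stricter still.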
Batch Processing
Process multiple images with the same operation:
POST /api/stability/advanced/batch/process-folder
Model Comparison
Compare results across different models:
POST /api/stability/advanced/compare/models
AI Director Mode
Automated creative decision making:
POST /api/stability/advanced/experimental/ai-director
Configuration
Environment Variables
STABILITY_API_KEY=your_api_key_here
STABILITY_BASE_URL=https://api.stability.ai # Optional
STABILITY_TIMEOUT=300 # Optional
STABILITY_MAX_RETRIES=3 # Optional
STABILITY_MAX_FILE_SIZE=10485760 # Optional (10MB)
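These variables could be read into a typed settings object at startup. The sketch assumes the values shown above for the optional variables are also the defaults, which the source does not state; `StabilitySettings` and `load_settings` are hypothetical names, not the contents of `stability_config.py`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class StabilitySettings:
    api_key: str
    base_url: str
    timeout: int        # seconds
    max_retries: int
    max_file_size: int  # bytes


def load_settings(env: Optional[Mapping[str, str]] = None) -> StabilitySettings:
    """Build settings from environment variables (assumed defaults)."""
    env = os.environ if env is None else env
    api_key = env.get("STABILITY_API_KEY", "")
    if not api_key:
        raise RuntimeError("STABILITY_API_KEY is not set")
    return StabilitySettings(
        api_key=api_key,
        base_url=env.get("STABILITY_BASE_URL", "https://api.stability.ai"),
        timeout=int(env.get("STABILITY_TIMEOUT", "300")),
        max_retries=int(env.get("STABILITY_MAX_RETRIES", "3")),
        max_file_size=int(env.get("STABILITY_MAX_FILE_SIZE", "10485760")),
    )
```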
Rate Limiting
- Default Limit: 150 requests per 10 seconds
- Timeout: Clients that exceed the limit are blocked for 60 seconds
- Configurable: Can be adjusted in middleware
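A limit of this shape can be enforced with a sliding window over recent request timestamps. The class below is a generic sketch of that technique, not the code in `stability_middleware.py`, and it does not model the 60-second penalty; `SlidingWindowLimiter` is a hypothetical name.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls in any window_seconds span."""

    def __init__(self, max_requests: int = 150, window_seconds: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._stamps: Deque[float] = deque()

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have aged out of the window
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_requests:
            return False
        self._stamps.append(now)
        return True
```

Injecting the clock keeps the limiter testable without real delays.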
File Size Limits
- Images: 10MB maximum
- Audio: 50MB maximum
- 3D Models: 10MB maximum
Image Requirements
Generate Operations
- Minimum: 4,096 pixels total
- Maximum: 16,777,216 pixels total (~16.8MP)
- Dimensions: At least 64x64 pixels
Edit Operations
- Minimum: 4,096 pixels total
- Maximum: 9,437,184 pixels total (~9.4MP)
- Aspect Ratio: Between 1:2.5 and 2.5:1
Upscale Operations
- Fast: 1,024 to 1,048,576 pixels, 32-1536px dimensions
- Conservative: 4,096 to 9,437,184 pixels
- Creative: 4,096 to 1,048,576 pixels
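As an illustration, the edit-operation limits above (pixel count and aspect ratio) can be checked from an image's width and height alone. `check_edit_image` is a hypothetical helper; reading the dimensions from a file (for example with Pillow) is left out.

```python
from typing import List

EDIT_MIN_PIXELS = 4096      # documented minimum total pixels
EDIT_MAX_PIXELS = 9437184   # documented maximum total pixels


def check_edit_image(width: int, height: int) -> List[str]:
    """Return problems with the dimensions for edit operations, if any."""
    if width <= 0 or height <= 0:
        return ["width and height must be positive"]
    problems = []
    pixels = width * height
    if pixels < EDIT_MIN_PIXELS:
        problems.append(f"too few pixels: {pixels} < {EDIT_MIN_PIXELS}")
    if pixels > EDIT_MAX_PIXELS:
        problems.append(f"too many pixels: {pixels} > {EDIT_MAX_PIXELS}")
    # Aspect ratio must lie between 1:2.5 and 2.5:1; integer arithmetic
    # (2w <= 5h and 2h <= 5w) avoids floating-point edge cases.
    if 2 * width > 5 * height or 2 * height > 5 * width:
        problems.append("aspect ratio must be between 1:2.5 and 2.5:1")
    return problems
```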
Usage Examples
Basic Text-to-Image Generation
import requests

response = requests.post(
    "http://localhost:8000/api/stability/generate/ultra",
    data={
        "prompt": "A majestic mountain landscape at sunset",
        "aspect_ratio": "16:9",
        "style_preset": "photographic",
        "output_format": "png",  # match the .png file written below
    },
)

if response.status_code == 200:
    # On success, write the returned image bytes to disk
    with open("generated_image.png", "wb") as f:
        f.write(response.content)
Image Editing with Inpainting
import requests

# Open both files in a context manager so they are closed after the upload
with open("input.png", "rb") as image, open("mask.png", "rb") as mask:
    response = requests.post(
        "http://localhost:8000/api/stability/edit/inpaint",
        files={"image": image, "mask": mask},
        data={
            "prompt": "a beautiful garden",
            "grow_mask": 10,
        },
    )
Audio Generation
import requests

response = requests.post(
    "http://localhost:8000/api/stability/audio/text-to-audio",
    data={
        "prompt": "Peaceful piano music with nature sounds",
        "duration": 60,
        "model": "stable-audio-2.5",
    },
)

if response.status_code == 200:
    with open("generated_audio.mp3", "wb") as f:
        f.write(response.content)
3D Model Generation
import requests

# Close the input file once the upload completes
with open("object.png", "rb") as image:
    response = requests.post(
        "http://localhost:8000/api/stability/3d/stable-fast-3d",
        files={"image": image},
        data={
            "texture_resolution": "1024",
            "foreground_ratio": 0.85,
        },
    )

if response.status_code == 200:
    # The endpoint returns a GLB 3D model file
    with open("model.glb", "wb") as f:
        f.write(response.content)
Error Handling
The API reports failures with standard HTTP status codes and a structured JSON body.
Common Error Codes
- 400: Invalid parameters or file format
- 403: Content moderation flag or insufficient permissions
- 413: File too large
- 422: Request well-formed but rejected
- 429: Rate limit exceeded
- 500: Internal server error
Error Response Format
{
  "id": "error_id",
  "name": "error_name",
  "errors": ["Detailed error messages"]
}
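A client can turn this body into a typed exception so callers do not have to inspect raw JSON. The sketch below assumes the three fields shown above and falls back to placeholders when the body is not valid JSON; `StabilityAPIError` and `parse_error` are hypothetical names.

```python
import json
from typing import List


class StabilityAPIError(Exception):
    """Hypothetical exception carrying the documented error fields."""

    def __init__(self, status_code: int, error_id: str,
                 name: str, errors: List[str]) -> None:
        super().__init__(f"{status_code} {name}: {'; '.join(errors)}")
        self.status_code = status_code
        self.error_id = error_id
        self.name = name
        self.errors = errors


def parse_error(status_code: int, body: bytes) -> StabilityAPIError:
    """Build an exception from an error response body."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    return StabilityAPIError(
        status_code=status_code,
        error_id=str(payload.get("id", "")),
        name=str(payload.get("name", "unknown_error")),
        errors=[str(e) for e in (payload.get("errors") or [])],
    )
```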
Monitoring and Analytics
Health Check Endpoints
- GET /api/stability/health - Basic health check
- GET /api/stability/admin/health/detailed - Comprehensive health check
Statistics Endpoints
- GET /api/stability/admin/stats - Service statistics
- GET /api/stability/admin/usage/summary - Usage summary
- GET /api/stability/admin/request-logs - Request logs
Cost Estimation
- GET /api/stability/admin/costs/estimate - Estimate operation costs
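For a rough local estimate without calling the endpoint, the per-operation costs listed in this document can be summed. This is a sketch only: the table copies the credit values quoted above, the operation keys beyond `generate_core`, `upscale_fast`, and `inpaint` are hypothetical names, and the endpoint's own response format is not shown here.

```python
from typing import Dict, Iterable

# Credits per generation, as listed in the endpoint sections of this document
CREDITS_PER_OPERATION: Dict[str, int] = {
    "generate_ultra": 8,
    "generate_core": 3,
    "erase": 5,
    "inpaint": 5,
    "outpaint": 4,
    "upscale_fast": 2,
    "upscale_conservative": 40,
    "upscale_creative": 60,
}


def estimate_credits(operations: Iterable[str]) -> int:
    """Sum the documented credit cost of each named operation."""
    total = 0
    for name in operations:
        if name not in CREDITS_PER_OPERATION:
            raise KeyError(f"no documented cost for {name!r}")
        total += CREDITS_PER_OPERATION[name]
    return total
```

For example, a workflow of `generate_core`, `upscale_fast`, and `inpaint` comes to 3 + 2 + 5 = 10 credits.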
Best Practices
Prompt Optimization
- Be Specific: Use detailed, descriptive language
- Include Style: Specify artistic style or photographic type
- Add Quality Terms: Include "high quality", "detailed", "sharp"
- Use Negative Prompts: Specify what you don't want
Image Preparation
- Check Dimensions: Ensure images meet size requirements
- Optimize File Size: Compress large images before upload
- Use Appropriate Formats: PNG for transparency, JPEG for photos
- Validate Aspect Ratios: Check ratio requirements for operations
Performance Optimization
- Use Appropriate Models: Choose model based on speed vs quality needs
- Batch Operations: Use batch endpoints for multiple similar operations
- Cache Results: Enable caching for repeated operations
- Monitor Usage: Track credit usage and optimize accordingly
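Caching repeated operations needs a stable key for "the same request". One possible approach, sketched here, hashes the operation name, its parameters, and any uploaded bytes; `cache_key` is a hypothetical helper and not the middleware's actual scheme. It assumes requests use a fixed seed, since otherwise identical parameters need not produce identical output.

```python
import hashlib
import json
from typing import Any, Dict


def cache_key(operation: str, params: Dict[str, Any],
              file_bytes: bytes = b"") -> str:
    """Derive a deterministic key from a request's contents."""
    # sort_keys makes the key independent of parameter order
    payload = json.dumps({"op": operation, "params": params},
                         sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(payload)
    digest.update(file_bytes)
    return digest.hexdigest()
```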
Security Considerations
API Key Management
- Store API keys securely in environment variables
- Never commit API keys to version control
- Rotate keys regularly
- Monitor key usage for unauthorized access
Content Moderation
- Built-in content moderation middleware
- Configurable blocked terms
- Automatic flagging of inappropriate content
- Audit logging for compliance
Rate Limiting
- Automatic rate limiting per client
- Configurable limits and timeouts
- IP-based and API key-based limiting
- Graceful handling of limit exceeded scenarios
Troubleshooting
Common Issues
"API key missing or invalid"
- Check STABILITY_API_KEY environment variable
- Verify key is correct and active
- Check account balance
"Rate limit exceeded"
- Wait for timeout period (60 seconds)
- Implement request queuing
- Consider upgrading API plan
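Waiting out the timeout can be automated on the client side. The sketch below retries a request after a fixed 60-second wait whenever it receives HTTP 429; `call_with_retry` and its `send` callable are hypothetical, and a real client may prefer a queue or exponential backoff.

```python
import time
from typing import Callable, Tuple


def call_with_retry(
    send: Callable[[], Tuple[int, bytes]],
    max_retries: int = 3,
    wait_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call send() again after a wait while it returns status 429.

    send() must return (status_code, body). The last response is
    returned once it succeeds or the retries are used up.
    """
    attempt = 0
    while True:
        status, body = send()
        if status != 429 or attempt >= max_retries:
            return status, body
        attempt += 1
        sleep(wait_seconds)
```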
"File too large"
- Compress images before upload
- Check file size limits for operation
- Use appropriate image formats
"Invalid image dimensions"
- Check minimum/maximum pixel requirements
- Validate aspect ratio constraints
- Resize image if necessary
Debug Endpoints
- POST /api/stability/admin/debug/test-connection - Test API connectivity
- GET /api/stability/admin/debug/request-logs - View recent requests
- POST /api/stability/utils/image-info - Analyze image properties
Integration Examples
React Frontend Integration
// Upload and generate
const formData = new FormData();
formData.append('prompt', 'A beautiful landscape');
formData.append('aspect_ratio', '16:9');

const response = await fetch('/api/stability/generate/ultra', {
  method: 'POST',
  body: formData
});

if (response.ok) {
  const blob = await response.blob();
  const imageUrl = URL.createObjectURL(blob);
  // Display image
}
Python Service Integration
from typing import List

from services.stability_service import StabilityAIService


async def generate_content_images(prompts: List[str]):
    """Generate one Core image per prompt, one request at a time."""
    service = StabilityAIService()
    async with service:
        results = []
        for prompt in prompts:
            result = await service.generate_core(
                prompt=prompt,
                aspect_ratio="16:9"
            )
            results.append(result)
    return results
Performance Metrics
Typical Response Times
- Fast Operations (Fast Upscale): ~1-2 seconds
- Standard Operations (Core Generation): ~5-10 seconds
- Complex Operations (Ultra Generation): ~10-20 seconds
- Heavy Operations (Creative Upscale): ~30-60 seconds
Throughput
- Rate Limit: 150 requests per 10 seconds
- Concurrent Requests: Limited by API key
- Batch Processing: Recommended for multiple operations
Future Enhancements
Planned Features
- Advanced Caching: Redis-based caching for better performance
- Queue Management: Async job queue for heavy operations
- Result Storage: Persistent storage for generated content
- Analytics Dashboard: Real-time usage analytics
- Custom Workflows: Visual workflow builder
- A/B Testing: Compare different approaches automatically
API Extensions
- Webhook Support: Real-time notifications for async operations
- Streaming Responses: Progressive image generation updates
- Template System: Predefined generation templates
- Collaboration Features: Shared workspaces and results
Support
For issues and questions:
- Check the troubleshooting section above
- Review the test suite for usage examples
- Check Stability AI documentation: https://platform.stability.ai/docs
- Contact support through the admin panel
Version History
- v1.0.0: Initial implementation with all major Stability AI features
  - Complete API coverage for v2beta endpoints
  - Legacy v1 API support
  - Comprehensive middleware and utilities
  - Full test suite and documentation