5.0 KiB
5.0 KiB
LTX-2 Pro Text-to-Video Implementation - Complete ✅
Summary
Successfully implemented Lightricks LTX-2 Pro text-to-video generation following the same modular architecture pattern as HunyuanVideo-1.5.
Implementation Details
1. Service Structure ✅
File: backend/services/llm_providers/video_generation/wavespeed_provider.py
LTX2ProService: Complete implementation- Model-specific validation (duration: 6, 8, or 10 seconds)
- Fixed 1080p resolution (no resolution parameter needed)
generate_audioparameter support (boolean, default: True)- Cost calculation (placeholder - update with actual pricing)
- Full API integration (submit → poll → download)
- Progress callback support
- Comprehensive error handling
2. Key Differences from HunyuanVideo-1.5
| Feature | HunyuanVideo-1.5 | LTX-2 Pro |
|---|---|---|
| Duration | 5, 8, 10 seconds | 6, 8, 10 seconds |
| Resolution | 480p, 720p (selectable) | 1080p (fixed) |
| Audio | Not supported | generate_audio parameter (boolean) |
| Negative Prompt | Supported | Not supported |
| Seed | Supported | Not supported |
| Size Format | width*height (selectable) | Fixed 1080p |
3. API Integration ✅
Model: lightricks/ltx-2-pro/text-to-video
Parameters Supported:
- ✅
prompt(required) - ✅
duration(6, 8, or 10 seconds) - ✅
generate_audio(boolean, default: True) - ❌
negative_prompt(not supported - ignored with warning) - ❌
seed(not supported - ignored with warning) - ❌
audio_base64(not supported - ignored with warning) - ❌
enable_prompt_expansion(not supported - ignored with warning) - ❌
resolution(ignored - fixed at 1080p)
Workflow:
- ✅ Submit request to WaveSpeed API
- ✅ Get prediction ID
- ✅ Poll
/api/v3/predictions/{id}/resultwith progress callbacks - ✅ Download video from
outputs[0] - ✅ Return metadata dict
4. Features ✅
- ✅ Pre-flight validation: Subscription limits checked before API calls
- ✅ Usage tracking: Integrated with existing tracking system
- ✅ Progress callbacks: Real-time progress updates (10% → 20-80% → 90% → 100%)
- ✅ Error handling: Comprehensive error messages with prediction_id for resume
- ✅ Cost calculation: Placeholder pricing (update with actual pricing)
- ✅ Metadata return: Full metadata including dimensions (1920x1080), cost, prediction_id
- ✅ Audio generation: Optional synchronized audio via
generate_audioparameter
5. Validation ✅
LTX-2 Pro Specific:
- Duration: Must be 6, 8, or 10 seconds
- Resolution: Fixed at 1080p (parameter ignored)
- Prompt: Required and cannot be empty
- Generate Audio: Boolean (default: True)
6. Factory Function ✅
Updated: get_wavespeed_text_to_video_service()
Model Mappings:
"ltx-2-pro"→LTX2ProService"lightricks/ltx-2-pro"→LTX2ProService"lightricks/ltx-2-pro/text-to-video"→LTX2ProService
Usage Example
from services.llm_providers.main_video_generation import ai_video_generate
result = await ai_video_generate(
prompt="A cinematic scene with synchronized audio",
operation_type="text-to-video",
provider="wavespeed",
model="ltx-2-pro",
duration=6,
generate_audio=True, # LTX-2 Pro specific parameter
user_id="user123",
progress_callback=lambda progress, msg: print(f"{progress}%: {msg}")
)
video_bytes = result["video_bytes"]
cost = result["cost"]
resolution = result["resolution"] # Always "1080p"
Testing Checklist
- Test with valid prompt
- Test with 6-second duration
- Test with 8-second duration
- Test with 10-second duration
- Test with
generate_audio=True - Test with
generate_audio=False - Test progress callbacks
- Test error handling (invalid duration)
- Test cost calculation
- Test metadata return
- Test that unsupported parameters are ignored with warnings
Next Steps
- ✅ HunyuanVideo-1.5: Complete
- ✅ LTX-2 Pro: Complete
- ⏳ LTX-2 Fast: Pending documentation
- ⏳ LTX-2 Retake: Pending documentation
Notes
- Fixed Resolution: LTX-2 Pro always generates 1080p videos (1920x1080)
- Audio Generation: Unique feature - can generate synchronized audio with video
- Pricing: Placeholder cost calculation - update with actual pricing from WaveSpeed docs
- Unsupported Parameters:
negative_prompt,seed,audio_base64,enable_prompt_expansionare ignored with warnings - Polling interval: 0.5 seconds (same as HunyuanVideo-1.5)
- Timeout: 10 minutes maximum
Official Documentation
- API Docs: https://wavespeed.ai/docs/docs-api/lightricks/ltx-2-pro/text-to-video
- Model Playground: https://wavespeed.ai/models/lightricks/ltx-2-pro/text-to-video
Ready for Testing ✅
The implementation is complete and ready for testing. All features are implemented following the modular architecture with separation of concerns, matching the pattern established by HunyuanVideo-1.5.