Base code

2026-01-08 22:39:53 +07:00
parent 697115c61a
commit c35fa52117
2169 changed files with 626670 additions and 0 deletions
--- a/docs/LTX2_PRO_IMPLEMENTATION_COMPLETE.md
+++ b/docs/LTX2_PRO_IMPLEMENTATION_COMPLETE.md
@@ -0,0 +1,139 @@
+# LTX-2 Pro Text-to-Video Implementation - Complete ✅
+
+## Summary
+
+Successfully implemented Lightricks LTX-2 Pro text-to-video generation following the same modular architecture pattern as HunyuanVideo-1.5.
+
+## Implementation Details
+
+### 1. Service Structure ✅
+
+**File**: `backend/services/llm_providers/video_generation/wavespeed_provider.py`
+
+- **`LTX2ProService`**: Complete implementation
+  - Model-specific validation (duration: 6, 8, or 10 seconds)
+  - Fixed 1080p resolution (no resolution parameter needed)
+  - `generate_audio` parameter support (boolean, default: True)
+  - Cost calculation (placeholder - update with actual pricing)
+  - Full API integration (submit → poll → download)
+  - Progress callback support
+  - Comprehensive error handling
+
+### 2. Key Differences from HunyuanVideo-1.5
+
+| Feature | HunyuanVideo-1.5 | LTX-2 Pro |
+|---------|------------------|-----------|
+| **Duration** | 5, 8, 10 seconds | 6, 8, 10 seconds |
+| **Resolution** | 480p, 720p (selectable) | 1080p (fixed) |
+| **Audio** | Not supported | `generate_audio` parameter (boolean) |
+| **Negative Prompt** | Supported | Not supported |
+| **Seed** | Supported | Not supported |
+| **Size Format** | width*height (selectable) | Fixed 1080p |
+
+### 3. API Integration ✅
+
+**Model**: `lightricks/ltx-2-pro/text-to-video`
+
+**Parameters Supported**:
+- ✅ `prompt` (required)
+- ✅ `duration` (6, 8, or 10 seconds)
+- ✅ `generate_audio` (boolean, default: True)
+- ❌ `negative_prompt` (not supported - ignored with warning)
+- ❌ `seed` (not supported - ignored with warning)
+- ❌ `audio_base64` (not supported - ignored with warning)
+- ❌ `enable_prompt_expansion` (not supported - ignored with warning)
+- ❌ `resolution` (ignored - fixed at 1080p)
+
+**Workflow**:
+1. ✅ Submit request to WaveSpeed API
+2. ✅ Get prediction ID
+3. ✅ Poll `/api/v3/predictions/{id}/result` with progress callbacks
+4. ✅ Download video from `outputs[0]`
+5. ✅ Return metadata dict
+
+### 4. Features ✅
+
+- ✅ **Pre-flight validation**: Subscription limits checked before API calls
+- ✅ **Usage tracking**: Integrated with existing tracking system
+- ✅ **Progress callbacks**: Real-time progress updates (10% → 20-80% → 90% → 100%)
+- ✅ **Error handling**: Comprehensive error messages with prediction_id for resume
+- ✅ **Cost calculation**: Placeholder pricing (update with actual pricing)
+- ✅ **Metadata return**: Full metadata including dimensions (1920x1080), cost, prediction_id
+- ✅ **Audio generation**: Optional synchronized audio via `generate_audio` parameter
+
+### 5. Validation ✅
+
+**LTX-2 Pro Specific**:
+- Duration: Must be 6, 8, or 10 seconds
+- Resolution: Fixed at 1080p (parameter ignored)
+- Prompt: Required and cannot be empty
+- Generate Audio: Boolean (default: True)
+
+### 6. Factory Function ✅
+
+**Updated**: `get_wavespeed_text_to_video_service()`
+
+**Model Mappings**:
+- `"ltx-2-pro"` → `LTX2ProService`
+- `"lightricks/ltx-2-pro"` → `LTX2ProService`
+- `"lightricks/ltx-2-pro/text-to-video"` → `LTX2ProService`
+
+## Usage Example
+
+```python
+from services.llm_providers.main_video_generation import ai_video_generate
+
+result = await ai_video_generate(
+    prompt="A cinematic scene with synchronized audio",
+    operation_type="text-to-video",
+    provider="wavespeed",
+    model="ltx-2-pro",
+    duration=6,
+    generate_audio=True,  # LTX-2 Pro specific parameter
+    user_id="user123",
+    progress_callback=lambda progress, msg: print(f"{progress}%: {msg}")
+)
+
+video_bytes = result["video_bytes"]
+cost = result["cost"]
+resolution = result["resolution"]  # Always "1080p"
+```
+
+## Testing Checklist
+
+- [ ] Test with valid prompt
+- [ ] Test with 6-second duration
+- [ ] Test with 8-second duration
+- [ ] Test with 10-second duration
+- [ ] Test with `generate_audio=True`
+- [ ] Test with `generate_audio=False`
+- [ ] Test progress callbacks
+- [ ] Test error handling (invalid duration)
+- [ ] Test cost calculation
+- [ ] Test metadata return
+- [ ] Test that unsupported parameters are ignored with warnings
+
+## Next Steps
+
+1. ✅ **HunyuanVideo-1.5**: Complete
+2. ✅ **LTX-2 Pro**: Complete
+3. ⏳ **LTX-2 Fast**: Pending documentation
+4. ⏳ **LTX-2 Retake**: Pending documentation
+
+## Notes
+
+- **Fixed Resolution**: LTX-2 Pro always generates 1080p videos (1920x1080)
+- **Audio Generation**: Unique feature - can generate synchronized audio with video
+- **Pricing**: Placeholder cost calculation - update with actual pricing from WaveSpeed docs
+- **Unsupported Parameters**: `negative_prompt`, `seed`, `audio_base64`, `enable_prompt_expansion` are ignored with warnings
+- **Polling interval**: 0.5 seconds (same as HunyuanVideo-1.5)
+- **Timeout**: 10 minutes maximum
+
+## Official Documentation
+
+- **API Docs**: https://wavespeed.ai/docs/docs-api/lightricks/ltx-2-pro/text-to-video
+- **Model Playground**: https://wavespeed.ai/models/lightricks/ltx-2-pro/text-to-video
+
+## Ready for Testing ✅
+
+The implementation is complete and ready for testing. All features are implemented following the modular architecture with separation of concerns, matching the pattern established by HunyuanVideo-1.5.