Base code

This commit is contained in:
Kunthawat Greethong
2026-01-08 22:39:53 +07:00
parent 697115c61a
commit c35fa52117
2169 changed files with 626670 additions and 0 deletions

View File

@@ -0,0 +1,147 @@
# HunyuanVideo-1.5 Text-to-Video Implementation - Complete ✅
## Summary
Successfully implemented HunyuanVideo-1.5 text-to-video generation with modular architecture, following separation of concerns principles.
## Implementation Details
### 1. Service Structure ✅
**File**: `backend/services/llm_providers/video_generation/wavespeed_provider.py`
- **`HunyuanVideoService`**: Complete implementation
- Model-specific validation (duration: 5, 8, or 10 seconds, resolution: 480p or 720p)
- Based on official API docs: https://wavespeed.ai/docs/docs-api/wavespeed-ai/hunyuan-video-1.5-text-to-video
- Size format conversion (resolution + aspect_ratio → "width*height")
- Cost calculation ($0.02/s for 480p, $0.04/s for 720p)
- Full API integration (submit → poll → download)
- Progress callback support
- Comprehensive error handling
### 2. Unified Entry Point Integration ✅
**File**: `backend/services/llm_providers/main_video_generation.py`
- **`_generate_text_to_video_wavespeed()`**: New async function
- Routes to appropriate service based on model
- Handles all parameters
- Returns standardized metadata dict
- **`ai_video_generate()`**: Updated
- Now supports WaveSpeed text-to-video
- Default model: `hunyuan-video-1.5`
- Async/await properly handled
### 3. API Integration ✅
**Model**: `wavespeed-ai/hunyuan-video-1.5/text-to-video`
**Parameters Supported**:
-`prompt` (required)
-`negative_prompt` (optional)
-`size` (auto-calculated from resolution + aspect_ratio)
-`duration` (5, 8, or 10 seconds)
-`seed` (optional, default: -1)
**Workflow**:
1. ✅ Submit request to WaveSpeed API
2. ✅ Get prediction ID
3. ✅ Poll `/api/v3/predictions/{id}/result` with progress callbacks
4. ✅ Download video from `outputs[0]`
5. ✅ Return metadata dict
### 4. Features ✅
-**Pre-flight validation**: Subscription limits checked before API calls
-**Usage tracking**: Integrated with existing tracking system
-**Progress callbacks**: Real-time progress updates (10% → 20-80% → 90% → 100%)
-**Error handling**: Comprehensive error messages with prediction_id for resume
-**Cost calculation**: Accurate pricing ($0.02/s 480p, $0.04/s 720p)
-**Metadata return**: Full metadata including dimensions, cost, prediction_id
### 5. Size Format Mapping ✅
**Resolution → Size Format**:
- `480p` + `16:9``"832*480"` (landscape)
- `480p` + `9:16``"480*832"` (portrait)
- `720p` + `16:9``"1280*720"` (landscape)
- `720p` + `9:16``"720*1280"` (portrait)
### 6. Validation ✅
**HunyuanVideo-1.5 Specific**:
- Duration: Must be 5, 8, or 10 seconds (per official API docs)
- Resolution: Must be 480p or 720p (not 1080p)
- Prompt: Required and cannot be empty
## Code Structure
```
backend/services/llm_providers/
├── main_video_generation.py # Unified entry point
│ ├── ai_video_generate() # Main function (async)
│ └── _generate_text_to_video_wavespeed() # WaveSpeed router
└── video_generation/ # Modular services
├── base.py # Base classes
└── wavespeed_provider.py # WaveSpeed services
├── BaseWaveSpeedTextToVideoService # Base class
├── HunyuanVideoService # ✅ Implemented
└── get_wavespeed_text_to_video_service() # Factory
```
## Usage Example
```python
from services.llm_providers.main_video_generation import ai_video_generate
result = await ai_video_generate(
prompt="A tiny robot hiking across a kitchen table",
operation_type="text-to-video",
provider="wavespeed",
model="hunyuan-video-1.5",
duration=5,
resolution="720p",
user_id="user123",
progress_callback=lambda progress, msg: print(f"{progress}%: {msg}")
)
video_bytes = result["video_bytes"]
cost = result["cost"] # $0.20 for 5s @ 720p
```
## Testing Checklist
- [ ] Test with valid prompt
- [ ] Test with 5-second duration
- [ ] Test with 8-second duration
- [ ] Test with 10-second duration
- [ ] Test with 480p resolution
- [ ] Test with 720p resolution
- [ ] Test with negative_prompt
- [ ] Test with seed
- [ ] Test progress callbacks
- [ ] Test error handling (invalid duration)
- [ ] Test error handling (invalid resolution)
- [ ] Test cost calculation
- [ ] Test metadata return
## Next Steps
1.**HunyuanVideo-1.5**: Complete
2.**LTX-2 Pro**: Pending documentation
3.**LTX-2 Fast**: Pending documentation
4.**LTX-2 Retake**: Pending documentation
## Notes
- **Audio support**: Not supported by HunyuanVideo-1.5 (ignored with warning)
- **Prompt expansion**: Not supported by HunyuanVideo-1.5 (ignored with warning)
- **Aspect ratio**: Used for size calculation (landscape vs portrait)
- **Polling interval**: 0.5 seconds (as per example code)
- **Timeout**: 10 minutes maximum
## Ready for Testing ✅
The implementation is complete and ready for testing. All features are implemented following the modular architecture with separation of concerns.