Files
moreminimore-marketing/docs/LTX2_PRO_IMPLEMENTATION_COMPLETE.md
Kunthawat Greethong c35fa52117 Base code
2026-01-08 22:39:53 +07:00

5.0 KiB

LTX-2 Pro Text-to-Video Implementation - Complete

Summary

Successfully implemented Lightricks LTX-2 Pro text-to-video generation following the same modular architecture pattern as HunyuanVideo-1.5.

Implementation Details

1. Service Structure

File: backend/services/llm_providers/video_generation/wavespeed_provider.py

  • LTX2ProService: Complete implementation
    • Model-specific validation (duration: 6, 8, or 10 seconds)
    • Fixed 1080p resolution (no resolution parameter needed)
    • generate_audio parameter support (boolean, default: True)
    • Cost calculation (placeholder - update with actual pricing)
    • Full API integration (submit → poll → download)
    • Progress callback support
    • Comprehensive error handling

2. Key Differences from HunyuanVideo-1.5

Feature HunyuanVideo-1.5 LTX-2 Pro
Duration 5, 8, 10 seconds 6, 8, 10 seconds
Resolution 480p, 720p (selectable) 1080p (fixed)
Audio Not supported generate_audio parameter (boolean)
Negative Prompt Supported Not supported
Seed Supported Not supported
Size Format width*height (selectable) Fixed 1080p

3. API Integration

Model: lightricks/ltx-2-pro/text-to-video

Parameters Supported:

  • prompt (required)
  • duration (6, 8, or 10 seconds)
  • generate_audio (boolean, default: True)
  • negative_prompt (not supported - ignored with warning)
  • seed (not supported - ignored with warning)
  • audio_base64 (not supported - ignored with warning)
  • enable_prompt_expansion (not supported - ignored with warning)
  • resolution (ignored - fixed at 1080p)

Workflow:

  1. Submit request to WaveSpeed API
  2. Get prediction ID
  3. Poll /api/v3/predictions/{id}/result with progress callbacks
  4. Download video from outputs[0]
  5. Return metadata dict

4. Features

  • Pre-flight validation: Subscription limits checked before API calls
  • Usage tracking: Integrated with existing tracking system
  • Progress callbacks: Real-time progress updates (10% → 20-80% → 90% → 100%)
  • Error handling: Comprehensive error messages with prediction_id for resume
  • Cost calculation: Placeholder pricing (update with actual pricing)
  • Metadata return: Full metadata including dimensions (1920x1080), cost, prediction_id
  • Audio generation: Optional synchronized audio via generate_audio parameter

5. Validation

LTX-2 Pro Specific:

  • Duration: Must be 6, 8, or 10 seconds
  • Resolution: Fixed at 1080p (parameter ignored)
  • Prompt: Required and cannot be empty
  • Generate Audio: Boolean (default: True)

6. Factory Function

Updated: get_wavespeed_text_to_video_service()

Model Mappings:

  • "ltx-2-pro"LTX2ProService
  • "lightricks/ltx-2-pro"LTX2ProService
  • "lightricks/ltx-2-pro/text-to-video"LTX2ProService

Usage Example

from services.llm_providers.main_video_generation import ai_video_generate

result = await ai_video_generate(
    prompt="A cinematic scene with synchronized audio",
    operation_type="text-to-video",
    provider="wavespeed",
    model="ltx-2-pro",
    duration=6,
    generate_audio=True,  # LTX-2 Pro specific parameter
    user_id="user123",
    progress_callback=lambda progress, msg: print(f"{progress}%: {msg}")
)

video_bytes = result["video_bytes"]
cost = result["cost"]
resolution = result["resolution"]  # Always "1080p"

Testing Checklist

  • Test with valid prompt
  • Test with 6-second duration
  • Test with 8-second duration
  • Test with 10-second duration
  • Test with generate_audio=True
  • Test with generate_audio=False
  • Test progress callbacks
  • Test error handling (invalid duration)
  • Test cost calculation
  • Test metadata return
  • Test that unsupported parameters are ignored with warnings

Next Steps

  1. HunyuanVideo-1.5: Complete
  2. LTX-2 Pro: Complete
  3. LTX-2 Fast: Pending documentation
  4. LTX-2 Retake: Pending documentation

Notes

  • Fixed Resolution: LTX-2 Pro always generates 1080p videos (1920x1080)
  • Audio Generation: Unique feature - can generate synchronized audio with video
  • Pricing: Placeholder cost calculation - update with actual pricing from WaveSpeed docs
  • Unsupported Parameters: negative_prompt, seed, audio_base64, enable_prompt_expansion are ignored with warnings
  • Polling interval: 0.5 seconds (same as HunyuanVideo-1.5)
  • Timeout: 10 minutes maximum

Official Documentation

Ready for Testing

The implementation is complete and ready for testing. All features are implemented following the modular architecture with separation of concerns, matching the pattern established by HunyuanVideo-1.5.