# Provider Tracking Improvement ## Problem Statement The billing dashboard's API Usage Logs were showing generic provider names (e.g., "Video", "Audio", "Stability") instead of the actual providers (WaveSpeed, Google/Gemini, HuggingFace). This made it difficult to: - Understand which providers are actually being used - Analyze costs by provider - Make informed decisions about provider usage - Track provider-specific trends and patterns ## Solution Added `actual_provider_name` field to track the real provider behind generic enum values, with intelligent detection based on model names and endpoints. ## Implementation ### 1. Database Model Update **File**: `backend/models/subscription_models.py` Added `actual_provider_name` field to `APIUsageLog`: ```python actual_provider_name = Column(String(50), nullable=True) # e.g., "wavespeed", "google", "huggingface" ``` ### 2. Provider Detection Utility **File**: `backend/services/subscription/provider_detection.py` Created intelligent provider detection function that identifies actual providers from: - Model names (e.g., "alibaba/wan-2.5/text-to-video" → "wavespeed") - Endpoints (e.g., "/video-generation/wavespeed" → "wavespeed") - Provider enum values (with fallback logic) **Supported Providers**: - **WaveSpeed**: OSS models (Qwen, Ideogram, FLUX, WAN 2.5, Minimax Speech) - **Google**: Gemini models (gemini-2.5-flash, gemini-2.5-pro, etc.) - **HuggingFace**: GPT-OSS-120B, Tencent HunyuanVideo, etc. - **Stability AI**: Stable Diffusion models - **OpenAI**: GPT-4o, GPT-4o-mini, TTS-1 - **Anthropic**: Claude 3.5 Sonnet ### 3. Service Updates Updated all media generation services to use provider detection: - **Video Generation** (`backend/services/llm_providers/main_video_generation.py`) - **Image Generation** (`backend/services/llm_providers/main_image_generation.py`) - **Audio Generation** (`backend/services/llm_providers/main_audio_generation.py`) - **Usage Tracking Service** (`backend/services/subscription/usage_tracking_service.py`) All services now automatically detect and store the actual provider name when tracking API usage. ### 4. API Endpoint Update **File**: `backend/api/subscription_api.py` Updated `/api/subscription/usage-logs` endpoint to: - Return `actual_provider_name` in response - Use `actual_provider_name` for display if available - Fallback to enum value with special handling for MISTRAL → HuggingFace ### 5. Frontend Updates **Files**: - `frontend/src/types/billing.ts` - Added `actual_provider_name` to `UsageLog` interface - `frontend/src/components/billing/UsageLogsTable.tsx` - Display actual provider name prominently **UI Display**: - Shows actual provider name (e.g., "WaveSpeed") in bold - Shows generic enum value (e.g., "video") in smaller text below if different - Example: "**WaveSpeed**" (video) ### 6. Database Migration **File**: `backend/scripts/add_actual_provider_name_column.py` Migration script that: - Adds `actual_provider_name` column to `api_usage_logs` table - Backfills existing records with detected provider names - Safe to run multiple times (checks if column exists) ## Usage ### Running the Migration ```bash cd backend python scripts/add_actual_provider_name_column.py ``` ### Provider Detection Examples ```python from services.subscription.provider_detection import detect_actual_provider from models.subscription_models import APIProvider # Video generation - WaveSpeed provider = detect_actual_provider( provider_enum=APIProvider.VIDEO, model_name="alibaba/wan-2.5/text-to-video", endpoint="/video-generation/wavespeed" ) # Returns: "wavespeed" # Image generation - WaveSpeed OSS provider = detect_actual_provider( provider_enum=APIProvider.STABILITY, model_name="qwen-image", endpoint="/image-generation/wavespeed" ) # Returns: "wavespeed" # Audio generation - WaveSpeed provider = detect_actual_provider( provider_enum=APIProvider.AUDIO, model_name="minimax/speech-02-hd", endpoint="/audio-generation/wavespeed" ) # Returns: "wavespeed" # LLM - Google Gemini provider = detect_actual_provider( provider_enum=APIProvider.GEMINI, model_name="gemini-2.5-flash" ) # Returns: "google" # LLM - HuggingFace (MISTRAL enum) provider = detect_actual_provider( provider_enum=APIProvider.MISTRAL, model_name="openai/gpt-oss-120b:groq" ) # Returns: "huggingface" ``` ## Benefits 1. **Accurate Provider Tracking**: Know exactly which providers (WaveSpeed, Google, HuggingFace) are being used 2. **Better Cost Analysis**: Analyze costs by actual provider, not generic categories 3. **Usage Insights**: Understand provider usage patterns and trends 4. **Informed Decisions**: Make data-driven decisions about provider selection 5. **Backward Compatible**: Existing records are backfilled, new records automatically tracked ## Future Enhancements 1. **Provider Analytics Dashboard**: Visualize usage and costs by actual provider 2. **Provider Recommendations**: Suggest provider switches based on cost/performance 3. **Provider Cost Comparison**: Compare costs across providers for similar operations 4. **Provider Performance Metrics**: Track response times, success rates by provider ## Testing After running the migration, verify: 1. **Database**: Check that `actual_provider_name` column exists and has values ```sql SELECT provider, actual_provider_name, model_used, COUNT(*) FROM api_usage_logs GROUP BY provider, actual_provider_name, model_used; ``` 2. **API**: Check that `/api/subscription/usage-logs` returns `actual_provider_name` ```bash curl http://localhost:8000/api/subscription/usage-logs?user_id=YOUR_USER_ID ``` 3. **UI**: Check that billing dashboard shows actual provider names in Usage Logs table ## Notes - The `provider` enum field is still used for limit enforcement (VIDEO, AUDIO, STABILITY, etc.) - The `actual_provider_name` field is for display and analytics only - Detection is based on heuristics (model names, endpoints) - may need refinement for edge cases - Existing records are backfilled, but may not be 100% accurate if model names are ambiguous