ALwrity/docs/Billing_Subscription/PROVIDER_TRACKING_IMPROVEMENT.md

Provider Tracking Improvement

Problem Statement

The billing dashboard's API Usage Logs showed generic provider names (e.g., "Video", "Audio", "Stability") instead of the actual providers (WaveSpeed, Google/Gemini, HuggingFace). This made it difficult to:

  • Understand which providers are actually being used
  • Analyze costs by provider
  • Make informed decisions about provider usage
  • Track provider-specific trends and patterns

Solution

Added an actual_provider_name field that records the real provider behind each generic enum value. The value is detected heuristically from model names and endpoints.

Implementation

1. Database Model Update

File: backend/models/subscription_models.py

Added actual_provider_name field to APIUsageLog:

actual_provider_name = Column(String(50), nullable=True)  # e.g., "wavespeed", "google", "huggingface"

2. Provider Detection Utility

File: backend/services/subscription/provider_detection.py

Created a provider detection function, detect_actual_provider, that identifies the actual provider from:

  • Model names (e.g., "alibaba/wan-2.5/text-to-video" → "wavespeed")
  • Endpoints (e.g., "/video-generation/wavespeed" → "wavespeed")
  • Provider enum values (with fallback logic)

Supported Providers:

  • WaveSpeed: OSS models (Qwen, Ideogram, FLUX, WAN 2.5, Minimax Speech)
  • Google: Gemini models (gemini-2.5-flash, gemini-2.5-pro, etc.)
  • HuggingFace: GPT-OSS-120B, Tencent HunyuanVideo, etc.
  • Stability AI: Stable Diffusion models
  • OpenAI: GPT-4o, GPT-4o-mini, TTS-1
  • Anthropic: Claude 3.5 Sonnet
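
The detection order described above (model name, then endpoint, then enum fallback) can be sketched roughly as follows. This is a minimal illustration, not the contents of provider_detection.py: the function name sketch_detect_provider, the keyword tables, and the lowercase labels "stability", "openai", and "anthropic" are assumptions made for the example, and the real rules may differ.

```python
from typing import Optional

# Hypothetical keyword tables (assumptions for illustration only).
# Each entry maps a substring to a provider label; the first match wins.
_MODEL_HINTS = (
    ("gemini", "google"),
    ("gpt-oss", "huggingface"),
    ("hunyuan", "huggingface"),
    ("wan-2.5", "wavespeed"),
    ("qwen", "wavespeed"),
    ("ideogram", "wavespeed"),
    ("flux", "wavespeed"),
    ("minimax", "wavespeed"),
    ("stable-diffusion", "stability"),
    ("gpt-4o", "openai"),
    ("tts-1", "openai"),
    ("claude", "anthropic"),
)
_ENDPOINT_HINTS = (
    ("wavespeed", "wavespeed"),
    ("huggingface", "huggingface"),
    ("stability", "stability"),
)
# Used only when neither the model name nor the endpoint is conclusive.
# Generic enums such as "video" or "audio" have no entry and yield None.
_ENUM_FALLBACK = {
    "gemini": "google",
    "mistral": "huggingface",
    "openai": "openai",
    "anthropic": "anthropic",
    "stability": "stability",
}


def sketch_detect_provider(
    provider_enum: str,
    model_name: Optional[str] = None,
    endpoint: Optional[str] = None,
) -> Optional[str]:
    """Return a lowercase provider label, or None if nothing matches."""
    model = (model_name or "").lower()
    for needle, provider in _MODEL_HINTS:
        if needle in model:
            return provider
    path = (endpoint or "").lower()
    for needle, provider in _ENDPOINT_HINTS:
        if needle in path:
            return provider
    return _ENUM_FALLBACK.get(provider_enum.lower())
```

Substring matching keeps the rules easy to extend, but it is order-sensitive: a model string such as "openai/gpt-oss-120b:groq" resolves to "huggingface" here only because "gpt-oss" is listed and the bare string "openai" is not used as a model hint.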

3. Service Updates

Updated all media generation services to use provider detection:

  • Video Generation (backend/services/llm_providers/main_video_generation.py)
  • Image Generation (backend/services/llm_providers/main_image_generation.py)
  • Audio Generation (backend/services/llm_providers/main_audio_generation.py)
  • Usage Tracking Service (backend/services/subscription/usage_tracking_service.py)

All services now automatically detect and store the actual provider name when tracking API usage.

4. API Endpoint Update

File: backend/api/subscription_api.py

Updated /api/subscription/usage-logs endpoint to:

  • Return actual_provider_name in response
  • Use actual_provider_name for display if available
  • Fall back to the enum value, with special handling for MISTRAL → HuggingFace
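
The display rule above could look something like the sketch below. The helper name display_provider and the label table are hypothetical, chosen only to illustrate the fallback order; the actual endpoint code in subscription_api.py may format names differently.

```python
from typing import Optional

# Hypothetical display labels (assumption for illustration).
_DISPLAY_NAMES = {
    "wavespeed": "WaveSpeed",
    "google": "Google",
    "huggingface": "HuggingFace",
    "stability": "Stability AI",
    "openai": "OpenAI",
    "anthropic": "Anthropic",
}


def display_provider(provider_enum: str, actual_provider_name: Optional[str]) -> str:
    """Pick the name shown in the usage logs for one record."""
    if actual_provider_name:
        # Prefer the detected provider; unknown labels are shown unchanged.
        return _DISPLAY_NAMES.get(actual_provider_name.lower(), actual_provider_name)
    if provider_enum.lower() == "mistral":
        # Special case from the list above: the MISTRAL enum is displayed
        # as HuggingFace.
        return "HuggingFace"
    # Records with no detected provider fall back to the raw enum value.
    return provider_enum
```

Under these assumptions, a record with no detected provider and enum "video" is still displayed as "video", so older rows that the backfill could not classify remain readable.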

5. Frontend Updates

Files:

  • frontend/src/types/billing.ts - Added actual_provider_name to UsageLog interface
  • frontend/src/components/billing/UsageLogsTable.tsx - Display actual provider name prominently

UI Display:

  • Shows actual provider name (e.g., "WaveSpeed") in bold
  • Shows generic enum value (e.g., "video") in smaller text below if different
  • Example: "WaveSpeed" (video)

6. Database Migration

File: backend/scripts/add_actual_provider_name_column.py

Migration script that:

  • Adds actual_provider_name column to api_usage_logs table
  • Backfills existing records with detected provider names
  • Is safe to run multiple times (checks whether the column exists)
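
The three steps above can be sketched as an idempotent migration. This example assumes SQLite and uses an in-memory database so it runs on its own; the id column, the guess_provider placeholder, and the helper names are hypothetical, and the real script may target a different database and call detect_actual_provider instead.

```python
import sqlite3
from typing import Optional


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    # PRAGMA table_info returns one row per column; index 1 is the name.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def guess_provider(provider: str, model_used: Optional[str]) -> Optional[str]:
    # Placeholder heuristic standing in for the real detection function.
    model = (model_used or "").lower()
    if "gemini" in model:
        return "google"
    if "wan-2.5" in model:
        return "wavespeed"
    return None


def migrate(conn: sqlite3.Connection) -> int:
    """Add the column if missing, backfill NULL rows, return rows updated."""
    if not column_exists(conn, "api_usage_logs", "actual_provider_name"):
        conn.execute(
            "ALTER TABLE api_usage_logs ADD COLUMN actual_provider_name VARCHAR(50)"
        )
    rows = conn.execute(
        "SELECT id, provider, model_used FROM api_usage_logs "
        "WHERE actual_provider_name IS NULL"
    ).fetchall()
    updated = 0
    for row_id, provider, model_used in rows:
        detected = guess_provider(provider, model_used)
        if detected is not None:
            conn.execute(
                "UPDATE api_usage_logs SET actual_provider_name = ? WHERE id = ?",
                (detected, row_id),
            )
            updated += 1
    conn.commit()
    return updated


# Demo against a throwaway in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE api_usage_logs "
    "(id INTEGER PRIMARY KEY, provider TEXT, model_used TEXT)"
)
conn.executemany(
    "INSERT INTO api_usage_logs (provider, model_used) VALUES (?, ?)",
    [
        ("video", "alibaba/wan-2.5/text-to-video"),
        ("gemini", "gemini-2.5-flash"),
        ("audio", "unknown-model"),
    ],
)
first_run = migrate(conn)   # adds the column and backfills matching rows
second_run = migrate(conn)  # column already exists; nothing left to detect
```

Restricting the backfill to rows where actual_provider_name IS NULL is what makes a rerun harmless: rows that were already classified are never touched again, and rows the heuristic cannot classify simply stay NULL.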

Usage

Running the Migration

cd backend
python scripts/add_actual_provider_name_column.py

Provider Detection Examples

from services.subscription.provider_detection import detect_actual_provider
from models.subscription_models import APIProvider

# Video generation - WaveSpeed
provider = detect_actual_provider(
    provider_enum=APIProvider.VIDEO,
    model_name="alibaba/wan-2.5/text-to-video",
    endpoint="/video-generation/wavespeed"
)
# Returns: "wavespeed"

# Image generation - WaveSpeed OSS
provider = detect_actual_provider(
    provider_enum=APIProvider.STABILITY,
    model_name="qwen-image",
    endpoint="/image-generation/wavespeed"
)
# Returns: "wavespeed"

# Audio generation - WaveSpeed
provider = detect_actual_provider(
    provider_enum=APIProvider.AUDIO,
    model_name="minimax/speech-02-hd",
    endpoint="/audio-generation/wavespeed"
)
# Returns: "wavespeed"

# LLM - Google Gemini
provider = detect_actual_provider(
    provider_enum=APIProvider.GEMINI,
    model_name="gemini-2.5-flash"
)
# Returns: "google"

# LLM - HuggingFace (MISTRAL enum)
provider = detect_actual_provider(
    provider_enum=APIProvider.MISTRAL,
    model_name="openai/gpt-oss-120b:groq"
)
# Returns: "huggingface"

Benefits

  1. Accurate Provider Tracking: Know exactly which providers (WaveSpeed, Google, HuggingFace) are being used
  2. Better Cost Analysis: Analyze costs by actual provider, not generic categories
  3. Usage Insights: Understand provider usage patterns and trends
  4. Informed Decisions: Make data-driven decisions about provider selection
  5. Backward Compatible: Existing records are backfilled, and new records are tracked automatically

Future Enhancements

  1. Provider Analytics Dashboard: Visualize usage and costs by actual provider
  2. Provider Recommendations: Suggest provider switches based on cost/performance
  3. Provider Cost Comparison: Compare costs across providers for similar operations
  4. Provider Performance Metrics: Track response times, success rates by provider

Testing

After running the migration, verify:

  1. Database: Check that actual_provider_name column exists and has values

    SELECT provider, actual_provider_name, model_used, COUNT(*) 
    FROM api_usage_logs 
    GROUP BY provider, actual_provider_name, model_used;
    
  2. API: Check that /api/subscription/usage-logs returns actual_provider_name

    curl "http://localhost:8000/api/subscription/usage-logs?user_id=YOUR_USER_ID"
    
  3. UI: Check that billing dashboard shows actual provider names in Usage Logs table

Notes

  • The provider enum field is still used for limit enforcement (VIDEO, AUDIO, STABILITY, etc.)
  • The actual_provider_name field is for display and analytics only
  • Detection is based on heuristics (model names, endpoints) and may need refinement for edge cases
  • Existing records are backfilled, but the backfilled values may be inaccurate where model names are ambiguous