HuggingFace Pricing Configuration

Overview

HuggingFace API calls (specifically those to the GPT-OSS-120B model served via Groq) are tracked and billed using configurable per-token pricing. The pricing is set through environment variables in your .env file.

Environment Variables

`HUGGINGFACE_INPUT_TOKEN_COST`

Description: Cost per input token for HuggingFace API calls
Format: Float (decimal number)
Default: 0.000001 ($1 per 1M input tokens)
Example: HUGGINGFACE_INPUT_TOKEN_COST=0.000001

`HUGGINGFACE_OUTPUT_TOKEN_COST`

Description: Cost per output token for HuggingFace API calls
Format: Float (decimal number)
Default: 0.000003 ($3 per 1M output tokens)
Example: HUGGINGFACE_OUTPUT_TOKEN_COST=0.000003
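
To illustrate how these two variables and their documented defaults could be read, here is a minimal Python sketch. The helper name `read_token_cost` is hypothetical, and falling back to the default when a value is empty, invalid, or negative is an assumption; the backend's actual parsing may differ.

```python
import math
import os

# Defaults as documented above.
DEFAULT_INPUT_TOKEN_COST = 0.000001   # $1 per 1M input tokens
DEFAULT_OUTPUT_TOKEN_COST = 0.000003  # $3 per 1M output tokens


def read_token_cost(name, default, environ=None):
    """Return the per-token cost stored in variable `name`.

    Hypothetical helper: falls back to `default` when the variable is
    unset, empty, not a number, not finite, or negative (assumed behavior).
    """
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        value = float(raw)
    except ValueError:
        return default
    if not math.isfinite(value) or value < 0:
        return default
    return value


input_cost = read_token_cost("HUGGINGFACE_INPUT_TOKEN_COST", DEFAULT_INPUT_TOKEN_COST)
output_cost = read_token_cost("HUGGINGFACE_OUTPUT_TOKEN_COST", DEFAULT_OUTPUT_TOKEN_COST)
```

Passing a plain dictionary as `environ` makes the helper easy to exercise without touching the real environment.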

Configuration

Step 1: Add to .env File

Add the following lines to your .env file:

# HuggingFace Pricing (for GPT-OSS-120B via Groq)
# Pricing is per token (e.g., 0.000001 = $1 per 1M tokens)
HUGGINGFACE_INPUT_TOKEN_COST=0.000001
HUGGINGFACE_OUTPUT_TOKEN_COST=0.000003

Step 2: Initialize/Update Pricing

The pricing is automatically initialized when the database is set up. To update pricing after changing environment variables:

Option 1: Restart the backend server (pricing will be updated on next initialization)
Option 2: Run the database setup script to update pricing:
```
python backend/scripts/create_subscription_tables.py
```

Step 3: Verify Pricing

Check that pricing is correctly configured by:

Checking the database api_provider_pricing table
Making a test API call and checking the cost in usage logs
Viewing the billing dashboard to see cost calculations

Pricing Calculation

The cost calculation works as follows:

1. Database Lookup: The system first tries to find pricing in the database for the specific model.
2. Model Matching: It tries several model name variations in turn:
   - Exact model name (e.g., "openai/gpt-oss-120b:groq")
   - Short model name (e.g., "gpt-oss-120b")
   - Default model name ("default")
3. Environment Variable Fallback: If no pricing is found in the database, it uses the environment variables for the HuggingFace/Mistral provider.
4. Default Estimates: As a last resort, it uses default estimates ($1 per 1M tokens for both input and output).
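
The lookup order described above can be sketched in Python as follows. This is only an illustration: the function names, the dictionary standing in for the api_provider_pricing table, and the way the short model name is derived from the exact name are all assumptions, not the backend's actual code.

```python
from typing import List, Mapping, Optional, Tuple

# (input cost per token, output cost per token)
Price = Tuple[float, float]

# Last-resort estimate from the text: $1 per 1M tokens for input and output.
LAST_RESORT: Price = (0.000001, 0.000001)


def candidate_names(model: str) -> List[str]:
    """Model name variations, most specific first (derivation is assumed)."""
    # "openai/gpt-oss-120b:groq" -> "gpt-oss-120b"
    short = model.split("/")[-1].split(":")[0]
    # dict.fromkeys drops duplicates while preserving order.
    return list(dict.fromkeys([model, short, "default"]))


def resolve_pricing(
    model: str,
    db_pricing: Mapping[str, Price],
    env_pricing: Optional[Price],
) -> Price:
    """Resolve pricing: database first, then environment, then estimates."""
    for name in candidate_names(model):
        if name in db_pricing:
            return db_pricing[name]
    if env_pricing is not None:
        return env_pricing
    return LAST_RESORT
```

For example, `resolve_pricing("openai/gpt-oss-120b:groq", {}, (0.000001, 0.000003))` finds nothing in the empty stand-in table and returns the environment-variable pricing.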

Cost Calculation Formula

cost_input = tokens_input * HUGGINGFACE_INPUT_TOKEN_COST
cost_output = tokens_output * HUGGINGFACE_OUTPUT_TOKEN_COST
cost_total = cost_input + cost_output

Example

For a HuggingFace API call with:

Input tokens: 1000
Output tokens: 500
HUGGINGFACE_INPUT_TOKEN_COST: 0.000001 ($1 per 1M tokens)
HUGGINGFACE_OUTPUT_TOKEN_COST: 0.000003 ($3 per 1M tokens)

Calculation:

cost_input = 1000 * 0.000001 = 0.001 ($0.001)
cost_output = 500 * 0.000003 = 0.0015 ($0.0015)
cost_total = 0.001 + 0.0015 = 0.0025 ($0.0025)
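
The same calculation can be reproduced with a short Python sketch. The function name `calculate_cost` is hypothetical, and using `Decimal` with string inputs is a choice made here to avoid binary floating-point rounding in the illustration; the backend may well use plain floats.

```python
from decimal import Decimal


def calculate_cost(tokens_input, tokens_output, input_token_cost, output_token_cost):
    """Apply the cost formula; per-token costs are passed as strings."""
    cost_input = Decimal(tokens_input) * Decimal(input_token_cost)
    cost_output = Decimal(tokens_output) * Decimal(output_token_cost)
    cost_total = cost_input + cost_output
    return cost_input, cost_output, cost_total


# Values from the worked example: 1000 input tokens, 500 output tokens.
cost_input, cost_output, cost_total = calculate_cost(
    1000, 500, "0.000001", "0.000003"
)
print(cost_input, cost_output, cost_total)
```

With the example values, the three results compare equal to 0.001, 0.0015, and 0.0025 respectively, matching the figures above.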

Testing

To test the pricing configuration:

Set environment variables in .env
Restart the backend server
Make a HuggingFace API call
Check the usage logs in the billing dashboard
Verify the cost is calculated correctly

Notes

Pricing is stored in the api_provider_pricing table
Pricing is updated automatically when initialize_default_pricing() is called
Environment variables are used only as a fallback: database values take precedence, and the environment variables apply when no matching pricing is found in the database
The pricing applies to all HuggingFace models that map to the MISTRAL provider enum
Default pricing is based on Groq's estimated pricing for the GPT-OSS-120B model
