AI Analysis and Content Strategy fixes. Enhanced Strategy Routes refactoring.

This commit is contained in:
ajaysi
2026-01-10 19:32:50 +05:30
parent 0b63ae7fc1
commit 8193cdba67
298 changed files with 45678 additions and 10952 deletions


@@ -0,0 +1,237 @@
# Billing Dashboard Consolidation Analysis
## Current State
### Component Inventory
| Component | Status | Usage | Purpose |
|-----------|--------|-------|---------|
| **BillingDashboard** | ❌ **UNUSED** | Not imported anywhere | Legacy full-featured dashboard |
| **EnhancedBillingDashboard** | ✅ **ACTIVE** | MainDashboard, BillingPage | Smart wrapper with view mode toggle |
| **CompactBillingDashboard** | ✅ **ACTIVE** | Used by EnhancedBillingDashboard | Compact view implementation |
| **BillingPage** | ✅ **ACTIVE** | Route: `/billing` | Dedicated billing page wrapper |
| **BillingOverview** | ✅ **ACTIVE** | Sub-component | Usage stats overview card |
| **CostBreakdown** | ✅ **ACTIVE** | Sub-component | Provider cost breakdown |
| **UsageTrends** | ✅ **ACTIVE** | Sub-component | Usage trends chart |
| **UsageAlerts** | ✅ **ACTIVE** | Sub-component | Alert notifications |
| **ComprehensiveAPIBreakdown** | ✅ **ACTIVE** | Sub-component | Detailed API breakdown |
| **SubscriptionRenewalHistory** | ✅ **ACTIVE** | BillingPage only | Renewal history table |
| **UsageLogsTable** | ✅ **ACTIVE** | BillingPage only | Usage logs table |
---
## Architecture Analysis
### Current Structure
```
BillingPage (/billing route)
├── EnhancedBillingDashboard (terminalTheme=true)
│   ├── View Mode Toggle (compact/detailed)
│   ├── Compact Mode → CompactBillingDashboard
│   └── Detailed Mode → Grid Layout
│       ├── BillingOverview
│       ├── SystemHealthIndicator
│       ├── UsageAlerts
│       ├── CostBreakdown
│       ├── UsageTrends
│       └── ComprehensiveAPIBreakdown
├── SubscriptionRenewalHistory
└── UsageLogsTable

MainDashboard
└── EnhancedBillingDashboard (terminalTheme=true)
    └── [Same structure as above]
```
### Key Findings
1. **BillingDashboard.tsx is UNUSED**
- Not imported anywhere in the codebase
- Legacy implementation with auto-refresh every 30 seconds
- No view mode toggle
- No terminal theme support
- **Recommendation: DEPRECATE and REMOVE**
2. **EnhancedBillingDashboard is the Main Component**
- ✅ Used in both MainDashboard and BillingPage
- ✅ Supports view mode toggle (compact/detailed)
- ✅ Supports terminal theme
- ✅ Event-driven refresh (no polling)
- ✅ Properly structured with sub-components
3. **CompactBillingDashboard is Well-Designed**
- ✅ Used only by EnhancedBillingDashboard
- ✅ Minimal, focused implementation
- ✅ Supports terminal theme
- ✅ Event-driven refresh
4. **BillingPage Adds Value**
- ✅ Dedicated route for billing
- ✅ Adds SubscriptionRenewalHistory (not in dashboard)
- ✅ Adds UsageLogsTable (not in dashboard)
- ✅ Terminal-themed container
---
## Consolidation Recommendations
### ✅ **RECOMMENDED: Remove BillingDashboard.tsx**
**Reason:**
- Not used anywhere in the codebase
- Functionality fully replaced by EnhancedBillingDashboard
- Reduces code duplication and maintenance burden
**Action:**
```bash
# Delete unused file
rm frontend/src/components/billing/BillingDashboard.tsx
```
**Impact:**
- ✅ Zero breaking changes (not imported)
- ✅ Reduces codebase size
- ✅ Eliminates confusion about which component to use
---
### ✅ **KEEP: EnhancedBillingDashboard Architecture**
**Current Design is Optimal:**
- ✅ Single component handles both compact and detailed views
- ✅ View mode toggle provides flexibility
- ✅ Reusable across MainDashboard and BillingPage
- ✅ Proper separation of concerns with sub-components
**No Changes Needed**
---
### ✅ **KEEP: CompactBillingDashboard**
**Current Design is Optimal:**
- ✅ Focused, minimal implementation
- ✅ Used only by EnhancedBillingDashboard
- ✅ Proper encapsulation
**No Changes Needed**
---
### ✅ **KEEP: BillingPage Structure**
**Current Design is Optimal:**
- ✅ Dedicated route for comprehensive billing view
- ✅ Adds unique components (RenewalHistory, UsageLogsTable)
- ✅ Terminal-themed for consistency
**No Changes Needed**
---
## Proposed Consolidation Plan
### Phase 1: Cleanup (Immediate)
1. **Delete BillingDashboard.tsx**
- File is unused and legacy
- No imports to update
- Zero risk
### Phase 2: Documentation (Optional)
1. **Update Component Documentation**
- Document EnhancedBillingDashboard as the primary component
- Document view mode toggle behavior
- Document terminal theme support
2. **Update Architecture Docs**
- Document component hierarchy
- Document usage patterns
### Phase 3: Future Enhancements (Optional)
1. **Consider Renaming**
- `EnhancedBillingDashboard` → `BillingDashboard` (after removing legacy)
- `CompactBillingDashboard` → `BillingDashboardCompact` (for clarity)
2. **Consider Component Props Standardization**
- Standardize `terminalTheme` prop across all billing components
- Standardize `userId` prop handling
---
## Component Usage Matrix
| Component | MainDashboard | BillingPage | Standalone |
|-----------|---------------|-------------|------------|
| EnhancedBillingDashboard | ✅ | ✅ | ❌ |
| CompactBillingDashboard | ✅ (via Enhanced) | ✅ (via Enhanced) | ❌ |
| BillingDashboard | ❌ | ❌ | ❌ |
| BillingOverview | ✅ (via Enhanced) | ✅ (via Enhanced) | ❌ |
| CostBreakdown | ✅ (via Enhanced) | ✅ (via Enhanced) | ❌ |
| UsageTrends | ✅ (via Enhanced) | ✅ (via Enhanced) | ❌ |
| UsageAlerts | ✅ (via Enhanced) | ✅ (via Enhanced) | ❌ |
| ComprehensiveAPIBreakdown | ✅ (via Enhanced) | ✅ (via Enhanced) | ❌ |
| SubscriptionRenewalHistory | ❌ | ✅ | ❌ |
| UsageLogsTable | ❌ | ✅ | ❌ |
---
## Summary
### ✅ **Consolidation Needed: YES**
**Action Items:**
1. **DELETE** `BillingDashboard.tsx` (unused legacy component)
2. **KEEP** current EnhancedBillingDashboard architecture (optimal)
3. **KEEP** CompactBillingDashboard (well-designed)
4. **KEEP** BillingPage structure (adds unique value)
### **Current Architecture Assessment: EXCELLENT**
The current architecture is well-designed:
- ✅ Single source of truth (EnhancedBillingDashboard)
- ✅ Proper component hierarchy
- ✅ Reusable across contexts
- ✅ Flexible view modes
- ✅ Clean separation of concerns
**Only cleanup needed:** Remove unused legacy component.
---
## Migration Checklist
- [ ] Verify no imports reference BillingDashboard
- [ ] Delete `frontend/src/components/billing/BillingDashboard.tsx`
- [ ] Update any documentation referencing BillingDashboard
- [ ] Test MainDashboard billing section
- [ ] Test BillingPage route
- [ ] Verify view mode toggle works
- [ ] Verify terminal theme works
- [ ] Verify event-driven refresh works
---
## Risk Assessment
| Action | Risk Level | Impact | Mitigation |
|--------|------------|--------|------------|
| Delete BillingDashboard.tsx | 🟢 **LOW** | None (unused) | Verify no imports first |
| Keep EnhancedBillingDashboard | 🟢 **NONE** | None | No changes needed |
| Keep CompactBillingDashboard | 🟢 **NONE** | None | No changes needed |
| Keep BillingPage | 🟢 **NONE** | None | No changes needed |
---
## Conclusion
**The billing dashboard architecture is well-designed and requires minimal consolidation.**
**Primary Action:** Remove unused `BillingDashboard.tsx` legacy component.
**Secondary Action:** Consider renaming `EnhancedBillingDashboard` to `BillingDashboard` after cleanup for clarity.
**No architectural changes needed** - the current design is optimal for the use cases.


@@ -0,0 +1,373 @@
# Billing Dashboard Cost Transparency Review
## Executive Summary
This document reviews the current billing dashboard implementation (`CompactBillingDashboard`, `CostBreakdown`, `BillingOverview`, `ComprehensiveAPIBreakdown`) to assess cost transparency and pricing visibility for end users.
**Status**: ✅ **Good Foundation** | ⚠️ **Needs Enhancement**
---
## Current Implementation Analysis
### ✅ **Strengths**
1. **Total Cost Display**
- Clear display of total monthly cost (`$X.XXXX`)
- Shows usage against monthly budget limit
- Progress bars with color-coded warnings (green/yellow/red)
- Tooltips explaining what "Total Cost" includes
2. **Provider Breakdown**
- `CostBreakdown` component shows cost by provider (Gemini, OpenAI, etc.)
- Pie chart visualization with percentages
- Shows cost, calls, and tokens per provider
- Hover tooltips with detailed metrics
3. **Usage Metrics**
- API calls count
- Token usage
- System health status
- Monthly budget usage percentage
4. **Comprehensive API Information**
- `ComprehensiveAPIBreakdown` shows API categories
- Includes pricing information (static/hardcoded)
- Shows use cases and descriptions
- Displays active vs inactive providers
---
## ⚠️ **Areas Needing Improvement**
### 1. **Missing: Per-Operation Cost Display**
**Issue**: Users cannot see how much each operation costs before or after execution.
**Current State**:
- Shows total cost but not cost per API call
- No cost breakdown per operation type (blog generation, image generation, etc.)
- No "cost per call" or "cost per token" metrics
**Recommendation**:
```
Add to CompactBillingDashboard or CostBreakdown:
- Average cost per API call: $X.XXXX
- Cost per 1K tokens: $X.XX
- Cost per image generation: $X.XX
- Cost per video generation: $X.XX
```
### 2. **Missing: Real-Time Pricing Information**
**Issue**: `ComprehensiveAPIBreakdown` shows static pricing that may not match actual costs.
**Current State**:
- Hardcoded pricing in component (e.g., "From $0.10/1M tokens")
- No connection to actual backend pricing
- No dynamic pricing updates
**Recommendation**:
- Fetch pricing from `/api/subscription/pricing` endpoint
- Display actual current pricing per provider/model
- Show pricing tiers (input vs output tokens)
- Update pricing dynamically when backend changes
### 3. **Missing: Cost Estimation Before Operations**
**Issue**: Users don't know how much an operation will cost before executing it.
**Current State**:
- No pre-operation cost estimation
- Users discover costs only after usage
**Recommendation**:
- Add cost estimation tooltips/modals before operations
- Show estimated cost based on:
- Operation type (blog generation, image generation, etc.)
- Selected model/provider
- Estimated tokens/parameters
- Use `preflightCheck` API to get cost estimates
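A minimal sketch of the arithmetic behind such a pre-operation estimate, assuming per-1M-token pricing with separate input and output rates. The `ModelPricing` shape and `estimateOperationCost` are hypothetical names for illustration; the actual `preflightCheck` response is not shown in this review and may differ.

```typescript
// Hypothetical pricing shape (assumption): rates are USD per 1M tokens.
interface ModelPricing {
  input_cost: number;  // USD per 1M input tokens
  output_cost: number; // USD per 1M output tokens
}

// Estimate the cost of one operation from expected token counts.
function estimateOperationCost(
  pricing: ModelPricing,
  estimatedInputTokens: number,
  estimatedOutputTokens: number
): number {
  const inputCost = (estimatedInputTokens / 1e6) * pricing.input_cost;
  const outputCost = (estimatedOutputTokens / 1e6) * pricing.output_cost;
  return inputCost + outputCost;
}
```

The estimate is only as good as the token guess, so the UI should present it as approximate.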
### 4. **Missing: Cost Breakdown by Tool/Feature**
**Issue**: Users cannot see which tools/features are consuming their budget.
**Current State**:
- Shows provider breakdown (Gemini, OpenAI, etc.)
- Does not show tool breakdown (Blog Writer, Image Studio, etc.)
**Recommendation**:
```
Add tool-level breakdown:
- Blog Writer: $X.XX (Y calls)
- Image Studio: $X.XX (Y images)
- Video Studio: $X.XX (Y videos)
- Research Tools: $X.XX (Y searches)
```
### 5. **Missing: Cost Per Unit Metrics**
**Issue**: Cost display shows totals but not unit costs.
**Current State**:
- Total cost: $X.XXXX
- Total calls: X,XXX
- Total tokens: X,XXX
**Missing**:
- Cost per call: $X.XXXX
- Cost per 1K tokens: $X.XX
- Cost per image: $X.XX
**Recommendation**:
Add calculated metrics:
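```typescript
// Guard against division by zero when there is no usage yet.
const costPerCall = totalCalls > 0 ? totalCost / totalCalls : 0;
const costPer1KTokens = totalTokens > 0 ? (totalCost / totalTokens) * 1000 : 0;
const costPerImage = imageCount > 0 ? imageCost / imageCount : 0;
```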
### 6. **Missing: Historical Cost Trends**
**Issue**: Users cannot see how their costs are trending over time.
**Current State**:
- `UsageTrends` component exists but may not show cost trends clearly
- No cost projection/forecast
**Recommendation**:
- Enhance `UsageTrends` to show:
- Daily/weekly cost trends
- Cost projection for remainder of month
- Comparison to previous months
- Cost velocity (spending rate)
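One simple way to derive cost velocity and a month-end projection is a linear extrapolation from spend to date. The sketch below assumes that approach; `projectMonthEndCost` and its parameters are hypothetical and not part of the existing `UsageTrends` component.

```typescript
interface CostProjection {
  dailyVelocity: number;  // average spend per elapsed day
  projectedTotal: number; // linear extrapolation to the end of the month
}

// Linear projection: assumes the rest of the month spends at the average rate so far.
function projectMonthEndCost(
  costToDate: number,
  dayOfMonth: number,
  daysInMonth: number
): CostProjection {
  if (dayOfMonth <= 0) {
    return { dailyVelocity: 0, projectedTotal: costToDate };
  }
  const dailyVelocity = costToDate / dayOfMonth;
  return { dailyVelocity, projectedTotal: dailyVelocity * daysInMonth };
}
```

A linear projection overreacts early in the month, so it may be worth hiding the forecast until a few days of data exist.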
### 7. **Missing: Cost Alerts & Warnings**
**Issue**: Cost warnings exist but may not be prominent enough.
**Current State**:
- Shows usage percentage
- Color-coded progress bars
- Alerts section exists
**Recommendation**:
- Add prominent cost warnings at:
- 50% of budget: "You've used 50% of your monthly budget"
- 80% of budget: "⚠️ Warning: 80% of budget used"
- 95% of budget: "🚨 Critical: Approaching budget limit"
- Show estimated days until budget exhaustion
- Suggest cost-saving actions
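The thresholds above can be sketched as a small classifier plus a days-remaining estimate. The function names and the `AlertLevel` type are hypothetical, and the exhaustion estimate assumes spending continues at the month's average daily rate.

```typescript
type AlertLevel = 'none' | 'info' | 'warning' | 'critical';

// Map budget usage to the 50% / 80% / 95% thresholds.
function budgetAlertLevel(totalCost: number, budgetLimit: number): AlertLevel {
  if (budgetLimit <= 0) return 'none';
  const ratio = totalCost / budgetLimit;
  if (ratio >= 0.95) return 'critical';
  if (ratio >= 0.8) return 'warning';
  if (ratio >= 0.5) return 'info';
  return 'none';
}

// Days until the budget runs out at the current average daily rate,
// or null when there is no spend to extrapolate from.
function daysUntilExhaustion(
  totalCost: number,
  budgetLimit: number,
  dayOfMonth: number
): number | null {
  if (totalCost <= 0 || dayOfMonth <= 0) return null;
  const dailyRate = totalCost / dayOfMonth;
  const remaining = Math.max(0, budgetLimit - totalCost);
  return remaining / dailyRate;
}
```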
### 8. **Missing: Cost Comparison & Optimization Tips**
**Issue**: Users cannot see which providers/models are more cost-effective.
**Current State**:
- Shows provider costs but not comparisons
- No optimization suggestions
**Recommendation**:
- Add cost comparison:
- "Gemini Flash is 80% cheaper than GPT-4o for similar tasks"
- "Consider using Qwen Image ($0.03) instead of Stability ($0.04)"
- Show cost savings if user switches models
- Provide optimization tips based on usage patterns
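The comparison itself is simple arithmetic. As a sketch, with hypothetical helper names and using the illustrative image prices quoted above ($0.04 vs $0.03):

```typescript
// Percentage saved per unit by switching from the current option to an alternative.
function savingsPercent(currentUnitCost: number, alternativeUnitCost: number): number {
  if (currentUnitCost <= 0) return 0;
  return ((currentUnitCost - alternativeUnitCost) / currentUnitCost) * 100;
}

// Absolute savings over a number of units (for example, images generated this month).
function potentialSavings(
  units: number,
  currentUnitCost: number,
  alternativeUnitCost: number
): number {
  return units * (currentUnitCost - alternativeUnitCost);
}
```

With these inputs, `savingsPercent(0.04, 0.03)` comes out at roughly 25, up to floating-point rounding.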
---
## Recommended Enhancements
### Priority 1: High Impact, Low Effort
1. **Add Cost Per Call/Token Metrics**
```typescript
// In CompactBillingDashboard.tsx
<Grid item xs={6} sm={3}>
  <Box>
    <Typography>Avg Cost per Call</Typography>
    <Typography variant="h6">
      {current_usage.total_calls > 0
        ? formatCurrency(current_usage.total_cost / current_usage.total_calls)
        : formatCurrency(0)}
    </Typography>
  </Box>
</Grid>
```
2. **Add Tool-Level Cost Breakdown**
- Use `source_module` from usage logs
- Group costs by tool (blog_writer, image_studio, etc.)
- Display in `CostBreakdown` component
3. **Enhance Cost Warnings**
- More prominent alerts at 50%, 80%, 95%
- Show days until budget exhaustion
- Add action buttons (upgrade plan, set alerts)
### Priority 2: Medium Impact, Medium Effort
4. **Dynamic Pricing Display**
- Fetch pricing from `/api/subscription/pricing`
- Update `ComprehensiveAPIBreakdown` to use real pricing
- Show pricing per model/provider dynamically
5. **Cost Estimation Before Operations**
- Add cost estimation modals/tooltips
- Use `preflightCheck` API
- Show estimated cost in operation UI
6. **Historical Cost Trends**
- Enhance `UsageTrends` component
- Add cost projection charts
- Show cost velocity
### Priority 3: High Impact, High Effort
7. **Cost Optimization Recommendations**
- Analyze usage patterns
- Suggest cheaper alternatives
- Show potential savings
8. **Advanced Cost Analytics**
- Cost breakdown by time of day
- Cost breakdown by user action
- Cost efficiency metrics
---
## Implementation Plan
### Phase 1: Quick Wins (1-2 days)
1. ✅ Add cost per call/token metrics to `CompactBillingDashboard`
2. ✅ Enhance cost warnings (50%, 80%, 95% thresholds)
3. ✅ Add tool-level cost breakdown (if `source_module` available)
### Phase 2: Enhanced Transparency (3-5 days)
4. ✅ Fetch and display dynamic pricing from API
5. ✅ Add cost estimation before operations
6. ✅ Enhance `UsageTrends` with cost projections
### Phase 3: Advanced Features (1-2 weeks)
7. ✅ Cost optimization recommendations
8. ✅ Advanced cost analytics dashboard
---
## Code Examples
### Example 1: Add Cost Per Call Metric
```typescript
// In CompactBillingDashboard.tsx, add after Total Cost grid item:
{/* Average Cost Per Call */}
<Grid item xs={6} sm={3}>
  <Tooltip title="Average cost per API call this month">
    <Box sx={{ textAlign: 'center', p: 2.5, /* styling */ }}>
      <TypographyComponent variant="h5">
        {current_usage.total_calls > 0
          ? formatCurrency(current_usage.total_cost / current_usage.total_calls)
          : '$0.0000'
        }
      </TypographyComponent>
      <TypographyComponent variant="body2">
        Avg Cost/Call
      </TypographyComponent>
    </Box>
  </Tooltip>
</Grid>
```
### Example 2: Add Tool-Level Breakdown
```typescript
// New component: ToolCostBreakdown.tsx
import React, { useMemo } from 'react';
// Assumes MUI v5, consistent with the `sx` prop used elsewhere in the dashboard.
import { Box, Card, CardContent, Typography } from '@mui/material';
// `UsageLog` and `formatCurrency` are assumed to come from the project's
// existing billing types and formatting utilities.

interface ToolCostBreakdownProps {
  usageLogs: UsageLog[];
}

const ToolCostBreakdown: React.FC<ToolCostBreakdownProps> = ({ usageLogs }) => {
  const toolCosts = useMemo(() => {
    // Sum cost and call count per source module.
    const grouped = usageLogs.reduce((acc, log) => {
      const tool = log.source_module || 'unknown';
      if (!acc[tool]) {
        acc[tool] = { cost: 0, calls: 0 };
      }
      acc[tool].cost += log.cost || 0;
      acc[tool].calls += 1;
      return acc;
    }, {} as Record<string, { cost: number; calls: number }>);

    // Convert "blog_writer" to "Blog Writer" and sort by cost, highest first.
    return Object.entries(grouped).map(([tool, data]) => ({
      tool: tool.replace(/_/g, ' ').replace(/\b\w/g, l => l.toUpperCase()),
      ...data
    })).sort((a, b) => b.cost - a.cost);
  }, [usageLogs]);

  return (
    <Card>
      <CardContent>
        <Typography variant="h6">Cost by Tool</Typography>
        {toolCosts.map(({ tool, cost, calls }) => (
          <Box key={tool} sx={{ display: 'flex', justifyContent: 'space-between', mb: 1 }}>
            <Typography>{tool}</Typography>
            <Typography>{formatCurrency(cost)} ({calls} calls)</Typography>
          </Box>
        ))}
      </CardContent>
    </Card>
  );
};
```
### Example 3: Dynamic Pricing Display
```typescript
// Update ComprehensiveAPIBreakdown.tsx
const [pricing, setPricing] = useState<APIPricing[]>([]);

useEffect(() => {
  billingService
    .getAPIPricing()
    .then(setPricing)
    // On failure, keep the empty list so the static pricing fallback is shown.
    .catch((err) => console.error('Failed to load API pricing', err));
}, []);

// Replace hardcoded pricing with:
const apiPricing = pricing.find(p =>
  p.provider.toLowerCase() === api.name.toLowerCase()
);

<Typography variant="caption">
  Pricing: {apiPricing
    ? `$${apiPricing.input_cost}/1M input, $${apiPricing.output_cost}/1M output tokens`
    : api.pricing // fallback to static
  }
</Typography>
```
---
## Testing Checklist
- [ ] Cost per call/token metrics display correctly
- [ ] Tool-level breakdown shows accurate costs
- [ ] Cost warnings appear at correct thresholds
- [ ] Dynamic pricing updates when backend changes
- [ ] Cost estimation is accurate (±10%)
- [ ] Historical trends display correctly
- [ ] Cost comparisons are accurate
- [ ] Optimization tips are relevant
---
## Conclusion
The current billing dashboard provides a **good foundation** for cost transparency but needs **enhancements** to provide complete transparency. The recommended improvements will help users:
1. **Understand costs** before and after operations
2. **Optimize spending** by choosing cost-effective options
3. **Monitor usage** with better warnings and projections
4. **Make informed decisions** about plan upgrades
**Next Steps**: Implement Phase 1 quick wins, then proceed with Phase 2 enhancements based on user feedback.


@@ -0,0 +1,120 @@
# Billing Dashboard Improvements
## Summary of Changes
### 1. ✅ Migration Script - Add `actual_provider_name` Column
**Status**: Completed successfully
- Added `actual_provider_name` column to `api_usage_logs` table
- Migration script handles SQLite and MySQL/PostgreSQL
- Backfills existing records with detected provider names (this run had no existing records, so 0 rows were backfilled)
- Column now tracks real providers: WaveSpeed, Google, HuggingFace, etc.
### 2. ✅ Provider Breakdown in Monthly Budget Usage
**Status**: Completed
**Changes Made**:
- Updated `usage_tracking_service.py` to include all providers in breakdown:
- Video (WaveSpeed, HuggingFace)
- Audio (WaveSpeed)
- Image (Stability, WaveSpeed)
- Image Edit (WaveSpeed)
- Search APIs (Tavily, Serper, Exa)
- Added provider breakdown display in `CompactBillingDashboard.tsx`:
- Shows top 5 providers by cost
- Displays as chips below the progress bar
- Format: "Provider: $X.XX"
- Updated `ProviderBreakdown` TypeScript interface to include all providers
**Location**: `frontend/src/components/billing/CompactBillingDashboard.tsx` (lines ~1040-1063)
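A minimal sketch of the top-5 selection and the "Provider: $X.XX" chip label, assuming the breakdown can be reduced to a map of provider name to cost. The `ProviderCosts` type and `topProviderLabels` are hypothetical; the real `ProviderBreakdown` interface likely carries more fields per provider.

```typescript
// Assumed simplified shape: provider name -> cost in USD.
type ProviderCosts = Record<string, number>;

// Pick the most expensive providers and format them as chip labels.
function topProviderLabels(costs: ProviderCosts, limit: number = 5): string[] {
  return Object.entries(costs)
    .filter(([, cost]) => cost > 0)         // skip providers with no spend
    .sort(([, a], [, b]) => b - a)          // highest cost first
    .slice(0, limit)
    .map(([name, cost]) => `${name}: $${cost.toFixed(2)}`);
}
```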
### 3. ✅ System Health Card Fix
**Status**: Fixed
**Problem**: System Health was showing zeros for all metrics (recent_requests, recent_errors, error_rate)
**Solution**: Updated `get_lightweight_stats()` in `monitoring_middleware.py` to:
- Query `APIRequest` table for last 5 minutes
- Calculate real `recent_requests` count
- Calculate real `recent_errors` count (status >= 400)
- Calculate real `error_rate` percentage
- Determine status based on error rate:
- `critical`: error_rate > 10%
- `warning`: error_rate > 5%
- `healthy`: error_rate <= 5%
**Location**: `backend/services/subscription/monitoring_middleware.py` (lines 371-389)
### 4. ✅ API Error Handling for `actual_provider_name`
**Status**: Fixed
**Problem**: API was trying to access `actual_provider_name` column that didn't exist, causing errors
**Solution**:
- Added safe access using `getattr()` with try/except
- Falls back to enum value if column doesn't exist
- Migration script ensures column exists
**Location**: `backend/api/subscription_api.py` (lines 1247-1251)
### 5. ✅ Subscription API Review (Lines 611-1017)
**Status**: Reviewed and Fixed
**Issues Found and Fixed**:
1. **Missing limits in subscribe response**: Added `video_calls`, `audio_calls`, `image_edit_calls`, `exa_calls` to limits response
2. **Provider breakdown calculation**: Updated to include all providers, not just Gemini and HuggingFace
3. **Cost calculation**: Updated to sum all provider costs, not just LLM providers
**Code Quality**:
- Error handling is comprehensive
- Logging is detailed and helpful
- Cache management is properly implemented
- Database transaction handling is correct
## Files Modified
### Backend
1. `backend/models/subscription_models.py` - Added `actual_provider_name` field
2. `backend/services/subscription/provider_detection.py` - New utility for provider detection
3. `backend/services/subscription/usage_tracking_service.py` - Enhanced provider breakdown
4. `backend/services/subscription/monitoring_middleware.py` - Fixed System Health stats
5. `backend/services/llm_providers/main_video_generation.py` - Added provider detection
6. `backend/services/llm_providers/main_image_generation.py` - Added provider detection
7. `backend/services/llm_providers/main_audio_generation.py` - Added provider detection
8. `backend/api/subscription_api.py` - Fixed error handling, added missing limits
9. `backend/scripts/add_actual_provider_name_column.py` - Migration script
### Frontend
1. `frontend/src/types/billing.ts` - Updated `ProviderBreakdown` interface
2. `frontend/src/components/billing/CompactBillingDashboard.tsx` - Added provider breakdown display
3. `frontend/src/components/billing/UsageLogsTable.tsx` - Display actual provider name
4. `frontend/src/components/monitoring/SystemHealthIndicator.tsx` - Already correct (needs `onRefresh` prop)
## Testing Checklist
- [x] Migration script runs successfully
- [x] Provider breakdown shows in Monthly Budget Usage
- [x] System Health displays real data (not zeros)
- [x] API Usage Logs show actual provider names
- [ ] Test with existing data (backfill)
- [ ] Test with new API calls (provider detection)
- [ ] Verify all providers appear in breakdown
## Next Steps
1. **Monitor**: Watch for any errors related to `actual_provider_name` column
2. **Verify**: Check that System Health shows real data after API calls
3. **Test**: Verify provider breakdown appears correctly in compact view
4. **Enhance**: Consider adding provider breakdown to detailed view as well
## Notes
- The migration script successfully added the column and backfilled 0 records (no existing records to backfill)
- System Health now queries real data from `APIRequest` table
- Provider breakdown includes all providers, sorted by cost (top 5 displayed)
- All changes are backward compatible (fallback to enum values if `actual_provider_name` is missing)


@@ -0,0 +1,673 @@
# Billing Dashboard Visualization & Animation Opportunities
## Executive Summary
This document reviews the existing Recharts utilities, current chart implementations in the billing dashboard, and provides recommendations for additional visualizations and Framer Motion animations to enhance user experience and data comprehension.
---
## 1. Current Recharts Infrastructure
### 1.1 Lazy Loading Wrapper (`frontend/src/utils/lazyRecharts.tsx`)
**Available Components:**
- `LazyLineChart` - Line charts (lazy loaded)
- `LazyBarChart` - Bar charts (lazy loaded)
- `LazyPieChart` - Pie charts (lazy loaded)
- `LazyAreaChart` - Area charts (lazy loaded)
- `LazyRadarChart` - Radar charts (lazy loaded)
- `LazyComposedChart` - Combined charts (lazy loaded)
**Lightweight Direct Imports:**
- `Line`, `Bar`, `Pie`, `Area`, `Radar`
- `XAxis`, `YAxis`, `CartesianGrid`
- `Tooltip`, `Legend`, `ResponsiveContainer`
- `Cell`, `PolarGrid`, `PolarAngleAxis`, `PolarRadiusAxis`
**Best Practice:** Always use lazy-loaded components wrapped in `<Suspense>` with `ChartLoadingFallback` for optimal performance.
---
## 2. Current Chart Implementations
### 2.1 Existing Charts in Billing Dashboard
#### ✅ **CostBreakdown.tsx** - Pie Chart
- **Type:** Pie chart showing provider cost distribution
- **Data:** `ProviderBreakdown` (cost per provider)
- **Features:**
- Custom tooltip with provider icon, cost, calls, tokens
- Custom label showing percentage
- Color-coded by provider
- Framer Motion: Basic fade-in animation
#### ✅ **UsageTrends.tsx** - Line/Area Charts
- **Type:** Line and Area charts for historical trends
- **Data:** `UsageTrends` (periods, costs, calls, tokens)
- **Features:**
- Multi-series line chart (cost, calls, tokens)
- Area chart for cost projections
- Growth rate indicators
- Cost velocity calculations
- Custom tooltips
- Framer Motion: Card-level animations
#### ✅ **AdvancedCostAnalytics.tsx** - Bar/Pie Charts
- **Type:** Bar charts (time of day, user actions) and Pie charts
- **Data:** `UsageLog[]` (aggregated by hour, endpoint)
- **Features:**
- Time-of-day cost distribution (bar chart)
- Tool/endpoint cost breakdown (pie chart)
- Efficiency metrics
- Tabbed interface
- Framer Motion: Tab transitions
#### ✅ **ToolCostBreakdown.tsx** - No Charts (Text-based)
- **Type:** Grid-based tool cost display
- **Data:** `UsageLog[]` (grouped by tool/endpoint)
- **Opportunity:** Could benefit from bar or pie chart visualization
---
## 3. Recommended New Visualizations
### 3.1 Compact Dashboard Enhancements
#### 📊 **Mini Sparkline Charts** (High Priority)
**Location:** `CompactBillingDashboard.tsx` - Metric cards
**Purpose:** Show trend at a glance without expanding
**Implementation:**
```typescript
// Add to each metric card (Total Cost, Total Calls, etc.)
<Box sx={{ height: 40, mt: 1 }}>
  <ResponsiveContainer width="100%" height="100%">
    <LazyLineChart data={last7DaysData}>
      <Line
        type="monotone"
        dataKey="value"
        stroke={getStatusColor(status)}
        strokeWidth={2}
        dot={false}
      />
    </LazyLineChart>
  </ResponsiveContainer>
</Box>
```
**Data Source:** Last 7 days from `UsageTrends`
**Animation:** Fade-in on card hover
---
#### 📈 **Provider Cost Comparison Bar Chart** (Medium Priority)
**Location:** `CompactBillingDashboard.tsx` - Below Monthly Budget Usage
**Purpose:** Quick visual comparison of provider costs
**Implementation:**
- Horizontal bar chart
- Top 5 providers by cost
- Color-coded bars matching provider colors
- Click to expand to detailed view
**Data Source:** `current_usage.provider_breakdown`
---
#### 🎯 **Usage Limit Progress Rings** (High Priority)
**Location:** `CompactBillingDashboard.tsx` - Replace linear progress bars
**Purpose:** More visually appealing circular progress indicators
**Implementation:**
- Circular progress rings (using SVG or Recharts RadialBar)
- Color-coded by usage level (green/yellow/red)
- Percentage and absolute values displayed
- Animated fill on load
**Data Source:** `usage_percentages` from `UsageStats`
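If the rings are drawn with plain SVG, the fill is usually controlled through `stroke-dasharray` and `stroke-dashoffset` on a `<circle>`. The sketch below shows that calculation; `ringDash` is a hypothetical helper, and the clamping to 0-100 is an assumption about how over-limit usage should render.

```typescript
interface RingDash {
  strokeDasharray: number;  // full circumference of the ring
  strokeDashoffset: number; // portion of the circumference left unfilled
}

// Compute SVG dash values for a circular progress ring.
function ringDash(percent: number, radius: number): RingDash {
  const clamped = Math.min(100, Math.max(0, percent));
  const circumference = 2 * Math.PI * radius;
  return {
    strokeDasharray: circumference,
    strokeDashoffset: circumference * (1 - clamped / 100),
  };
}
```

Animating `strokeDashoffset` from the full circumference down to the computed value produces the fill-on-load effect.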
---
### 3.2 Detailed Dashboard Enhancements
#### 📊 **Cost Over Time - Multi-Series Area Chart** (High Priority)
**Location:** `UsageTrends.tsx` - Enhance existing
**Purpose:** Show cost trends with provider breakdown
**Implementation:**
- Stacked area chart showing:
- Total cost (area)
- Individual provider costs (stacked)
- Projected cost (dashed line)
- Interactive legend to toggle providers
- Zoom/pan capabilities
**Data Source:** `trends.provider_trends`
---
#### 📈 **Daily Cost Heatmap** (Medium Priority)
**Location:** New component or `AdvancedCostAnalytics.tsx`
**Purpose:** Visualize cost patterns by day of month and hour of day
**Implementation:**
- Calendar-style heatmap
- X-axis: Days of month
- Y-axis: Hours of day
- Color intensity: Cost amount
- Tooltip: Exact cost, calls, date/time
**Data Source:** `UsageLog[]` aggregated by day/hour
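The day/hour aggregation could be sketched as follows. `UsageLogLike` is a hypothetical minimal shape (an ISO timestamp plus a cost); the real `UsageLog` type may name these fields differently. UTC is used here for determinism, whereas a real dashboard would probably bucket by the user's local time.

```typescript
// Assumed minimal log shape for illustration.
interface UsageLogLike {
  timestamp: string; // ISO 8601
  cost: number;
}

interface HeatmapCell {
  day: number;  // day of month, 1-31
  hour: number; // hour of day, 0-23
  cost: number;
  calls: number;
}

// Bucket logs into (day of month, hour) cells, summing cost and counting calls.
function buildCostHeatmap(logs: UsageLogLike[]): HeatmapCell[] {
  const cells = new Map<string, HeatmapCell>();
  for (const log of logs) {
    const date = new Date(log.timestamp);
    if (Number.isNaN(date.getTime())) continue; // skip unparseable timestamps
    const day = date.getUTCDate();
    const hour = date.getUTCHours();
    const key = `${day}-${hour}`;
    const cell = cells.get(key) ?? { day, hour, cost: 0, calls: 0 };
    cell.cost += log.cost;
    cell.calls += 1;
    cells.set(key, cell);
  }
  return Array.from(cells.values());
}
```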
---
#### 🎨 **Provider Efficiency Radar Chart** (Low Priority)
**Location:** `AdvancedCostAnalytics.tsx` or new component
**Purpose:** Compare providers across multiple dimensions
**Implementation:**
- Radar chart with axes:
- Cost per call
- Average response time
- Success rate
- Token efficiency
- Usage volume
- Multiple providers overlaid
- Interactive legend
**Data Source:** Aggregated `UsageLog[]` by provider
---
#### 📉 **Cost Velocity Trend Line** (High Priority)
**Location:** `UsageTrends.tsx` or `BillingOverview.tsx`
**Purpose:** Show spending velocity (daily cost rate) over time
**Implementation:**
- Line chart showing:
- Daily spending rate (calculated)
- 7-day moving average
- Projected monthly cost (horizontal line)
- Budget limit (horizontal line)
- Annotations for budget warnings
**Data Source:** Calculated from `UsageTrends`
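The 7-day moving average can be computed with a trailing window over the daily cost series. This is a sketch under the assumption that daily costs are available as a plain number array; `movingAverage` is a hypothetical helper, and early points average over however many days exist so far.

```typescript
// Trailing moving average; the first (windowSize - 1) points use a shorter window.
function movingAverage(values: number[], windowSize: number = 7): number[] {
  return values.map((_, i) => {
    const start = Math.max(0, i - windowSize + 1);
    const slice = values.slice(start, i + 1);
    return slice.reduce((sum, v) => sum + v, 0) / slice.length;
  });
}
```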
---
#### 🎯 **Tool Usage Sankey Diagram** (Low Priority - Complex)
**Location:** New component or `ToolCostBreakdown.tsx`
**Purpose:** Show flow of usage across tools and providers
**Implementation:**
- Sankey diagram (may need custom library or D3)
- Left: Tools (Blog Writer, Image Studio, etc.)
- Right: Providers (Gemini, WaveSpeed, etc.)
- Flow width: Cost amount
- Interactive: Click to filter
**Data Source:** `UsageLog[]` grouped by tool → provider
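Whichever Sankey library is chosen, it will most likely want a list of source/target/value links. A sketch of that grouping, assuming each log exposes `source_module`, a provider name, and a cost (the `ToolProviderLog` shape and `buildSankeyLinks` are hypothetical):

```typescript
// Assumed minimal log shape for illustration.
interface ToolProviderLog {
  source_module?: string;
  provider: string;
  cost: number;
}

interface SankeyLink {
  source: string; // tool
  target: string; // provider
  value: number;  // total cost flowing from tool to provider
}

// Sum cost per (tool, provider) pair and return links sorted by cost, highest first.
function buildSankeyLinks(logs: ToolProviderLog[]): SankeyLink[] {
  const totals = new Map<string, SankeyLink>();
  for (const log of logs) {
    const source = log.source_module || 'unknown';
    const key = `${source}|${log.provider}`;
    const link = totals.get(key) ?? { source, target: log.provider, value: 0 };
    link.value += log.cost;
    totals.set(key, link);
  }
  return Array.from(totals.values()).sort((a, b) => b.value - a.value);
}
```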
---
### 3.3 Real-time Monitoring Visualizations
#### ⚡ **Live Cost Counter** (High Priority)
**Location:** `BillingOverview.tsx` or header
**Purpose:** Animated counter showing real-time cost accumulation
**Implementation:**
- Animated number counter (using Framer Motion)
- Updates on data refresh
- Color changes based on velocity
- Pulse animation when cost increases
**Data Source:** `current_usage.total_cost`
---
#### 📊 **Error Rate Gauge** (Medium Priority)
**Location:** `SystemHealthIndicator.tsx` or `BillingOverview.tsx`
**Purpose:** Visual gauge showing API error rate
**Implementation:**
- Semi-circular gauge chart
- Green (0-5%), Yellow (5-10%), Red (>10%)
- Animated needle
- Current value and target displayed
**Data Source:** `systemHealth.error_rate`
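The gauge zones can be sketched as a small classifier with cut-offs at 5% and 10%. The needle angle assumes a semi-circle from -90 to +90 degrees and a gauge maximum of 20%, which is an arbitrary choice for illustration; both function names are hypothetical.

```typescript
type GaugeZone = 'green' | 'yellow' | 'red';

// Zone boundaries: > 10% is red, > 5% is yellow, otherwise green.
function errorRateZone(errorRatePercent: number): GaugeZone {
  if (errorRatePercent > 10) return 'red';
  if (errorRatePercent > 5) return 'yellow';
  return 'green';
}

// Needle rotation in degrees for a semi-circular gauge (-90 = empty, +90 = full scale).
function needleAngle(errorRatePercent: number, maxPercent: number = 20): number {
  const clamped = Math.min(maxPercent, Math.max(0, errorRatePercent));
  return -90 + (clamped / maxPercent) * 180;
}
```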
---
## 4. Framer Motion Animation Opportunities
### 4.1 Current Animation Usage
**Existing:**
- ✅ Card-level fade-in (`motion.div` with `initial`, `animate`)
- ✅ View mode transitions (`AnimatePresence` with slide)
- ✅ Hover effects (`whileHover` on cards)
- ✅ Loading spinner rotation
**Missing Opportunities:**
- ❌ Stagger animations for metric cards
- ❌ Number counting animations
- ❌ Progress bar fill animations
- ❌ Chart data entry animations
- ❌ Error/warning pulse animations
- ❌ Refresh button rotation
- ❌ Tooltip entrance animations
---
### 4.2 Recommended Animations
#### 🎬 **Staggered Card Entrance** (High Priority)
**Location:** `CompactBillingDashboard.tsx` - Metric cards grid
**Implementation:**
```typescript
<Grid container spacing={2}>
  {metrics.map((metric, index) => (
    <Grid item key={metric.id}>
      <motion.div
        initial={{ opacity: 0, y: 20 }}
        animate={{ opacity: 1, y: 0 }}
        transition={{
          delay: index * 0.1,
          duration: 0.4,
          ease: "easeOut"
        }}
      >
        <MetricCard {...metric} />
      </motion.div>
    </Grid>
  ))}
</Grid>
```
---
#### 🔢 **Animated Number Counter** (High Priority)
**Location:** All cost/call/token displays
**Implementation:**
```typescript
import React, { useEffect } from 'react';
import { motion, useMotionValue, useSpring, useTransform } from 'framer-motion';

const AnimatedNumber: React.FC<{ value: number; format?: (n: number) => string }> = ({
  value,
  // Default rounds to whole numbers (calls, tokens); pass a currency formatter for costs.
  format = (n) => Math.round(n).toLocaleString()
}) => {
  const motionValue = useMotionValue(0);
  const spring = useSpring(motionValue, {
    stiffness: 50,
    damping: 30
  });
  // Do not round here, otherwise fractional costs such as $0.0042 would display as 0.
  const display = useTransform(spring, (latest) => format(latest));

  useEffect(() => {
    motionValue.set(value);
  }, [value, motionValue]);

  return <motion.span>{display}</motion.span>;
};
```
---
#### 📊 **Chart Data Entry Animation** (Medium Priority)
**Location:** All chart components
**Implementation:**
```typescript
// Recharts series components do not accept Framer Motion props such as
// `initial`/`animate`, so use Recharts' built-in animation props instead.

// For line/area charts
<Area
  dataKey="cost"
  fill="url(#colorCost)"
  stroke="#667eea"
  strokeWidth={2}
  isAnimationActive
  animationDuration={1000}
  animationEasing="ease-in-out"
/>

// For bar charts (all bars animate together; per-bar stagger via <Cell>
// is not available through these props)
<Bar
  dataKey="cost"
  isAnimationActive
  animationBegin={200}
  animationDuration={500}
  animationEasing="ease-out"
/>
```
---
#### 🎯 **Progress Bar Fill Animation** (High Priority)
**Location:** All progress bars (usage limits, budget)
**Implementation:**
```typescript
<motion.div
  initial={{ scaleX: 0 }}
  // Clamp to 0-100 so usage above the limit does not overflow the track
  animate={{ scaleX: Math.min(Math.max(usagePercentage, 0), 100) / 100 }}
  transition={{
    duration: 1,
    ease: "easeOut",
    delay: 0.2
  }}
  style={{
    transformOrigin: "left",
    height: "100%",
    backgroundColor: getProgressColor(usagePercentage)
  }}
/>
```
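The progress bar snippet calls `getProgressColor`, which this document does not define. A minimal sketch of what such a helper might look like follows. The thresholds (70 and 90) and the hex colors are illustrative assumptions, not values taken from the codebase.

```typescript
// Hypothetical helpers: thresholds and colors are assumptions for illustration.
function clampPercentage(value: number): number {
  // Treat NaN/Infinity as 0 so the bar never renders with an invalid scale
  if (!Number.isFinite(value)) return 0;
  return Math.min(100, Math.max(0, value));
}

function getProgressColor(usagePercentage: number): string {
  const pct = clampPercentage(usagePercentage);
  if (pct >= 90) return '#f44336'; // critical: close to or over the limit
  if (pct >= 70) return '#ff9800'; // warning
  return '#4caf50'; // healthy
}
```

Keeping the clamp in a separate function lets the same value drive both the bar width and its color.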
---
#### ⚠️ **Alert Pulse Animation** (Medium Priority)
**Location:** `UsageAlerts.tsx` and alert indicators
**Implementation:**
```typescript
<motion.div
animate={{
scale: [1, 1.05, 1],
opacity: [1, 0.8, 1]
}}
transition={{
duration: 2,
repeat: Infinity,
ease: "easeInOut"
}}
>
<Alert severity="warning">...</Alert>
</motion.div>
```
---
#### 🔄 **Refresh Button Rotation** (Low Priority - Already has CSS)
**Location:** All refresh buttons
**Implementation:**
```typescript
<motion.div
animate={{ rotate: loading ? 360 : 0 }}
transition={{
duration: 1,
repeat: loading ? Infinity : 0,
ease: "linear"
}}
>
<RefreshCw />
</motion.div>
```
---
#### 💬 **Tooltip Entrance** (Low Priority)
**Location:** All tooltips
**Implementation:**
```typescript
// Note: the `exit` animation only runs when this element is rendered inside <AnimatePresence>
<motion.div
  initial={{ opacity: 0, scale: 0.8, y: 10 }}
  animate={{ opacity: 1, scale: 1, y: 0 }}
  exit={{ opacity: 0, scale: 0.8, y: 10 }}
  transition={{ duration: 0.2 }}
>
  <TooltipContent />
</motion.div>
```
---
## 5. Implementation Priority
### Phase 1: High Impact, Low Effort (Week 1)
1. ✅ Animated number counters
2. ✅ Progress bar fill animations
3. ✅ Staggered card entrance
4. ✅ Mini sparkline charts in compact view
### Phase 2: Medium Impact, Medium Effort (Week 2)
5. ✅ Cost velocity trend line
6. ✅ Provider cost comparison bar chart
7. ✅ Usage limit progress rings
8. ✅ Chart data entry animations
### Phase 3: High Impact, High Effort (Week 3-4)
9. ✅ Multi-series area chart (cost over time)
10. ✅ Daily cost heatmap
11. ✅ Live cost counter
12. ✅ Error rate gauge
### Phase 4: Nice to Have (Future)
13. ⏳ Provider efficiency radar chart
14. ⏳ Tool usage Sankey diagram
15. ⏳ Alert pulse animations
16. ⏳ Enhanced tooltip animations
---
## 6. Code Examples
### 6.1 Mini Sparkline Component
```typescript
// components/billing/MiniSparkline.tsx
import React, { Suspense } from 'react';
import { Box } from '@mui/material';
import { LazyLineChart, Line, ResponsiveContainer, ChartLoadingFallback } from '../../utils/lazyRecharts';
interface MiniSparklineProps {
data: Array<{ date: string; value: number }>;
color: string;
height?: number;
}
export const MiniSparkline: React.FC<MiniSparklineProps> = ({
data,
color,
height = 40
}) => {
return (
<Box sx={{ height, width: '100%', mt: 1 }}>
<Suspense fallback={<ChartLoadingFallback />}>
<ResponsiveContainer width="100%" height="100%">
<LazyLineChart data={data}>
<Line
type="monotone"
dataKey="value"
stroke={color}
strokeWidth={2}
dot={false}
isAnimationActive={true}
animationDuration={1000}
/>
</LazyLineChart>
</ResponsiveContainer>
</Suspense>
</Box>
);
};
```
### 6.2 Animated Number Component
```typescript
// components/shared/AnimatedNumber.tsx
import React, { useEffect } from 'react';
import { motion, useMotionValue, useSpring, useTransform } from 'framer-motion';

interface AnimatedNumberProps {
  value: number;
  format?: (n: number) => string;
}

export const AnimatedNumber: React.FC<AnimatedNumberProps> = ({
  value,
  // Round inside the formatter so a currency formatter can keep its decimals
  format = (n) => Math.round(n).toLocaleString()
}) => {
  const motionValue = useMotionValue(0);
  // Animation speed is controlled by the spring settings below
  const spring = useSpring(motionValue, {
    stiffness: 50,
    damping: 30
  });
  const display = useTransform(spring, (latest) => format(latest));

  // Animate toward the new value whenever it changes
  useEffect(() => {
    motionValue.set(value);
  }, [value, motionValue]);

  return <motion.span>{display}</motion.span>;
};
```
### 6.3 Usage Limit Progress Ring
```typescript
// components/billing/UsageLimitRing.tsx
import React, { Suspense } from 'react';
import { Box, Typography } from '@mui/material';
import { LazyPieChart, Pie, Cell, ResponsiveContainer, ChartLoadingFallback } from '../../utils/lazyRecharts';
interface UsageLimitRingProps {
used: number;
limit: number;
label: string;
color: string;
}
export const UsageLimitRing: React.FC<UsageLimitRingProps> = ({
used,
limit,
label,
color
}) => {
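// Guard against a zero or missing limit, which would otherwise yield NaN or Infinity
const percentage = limit > 0 ? Math.min((used / limit) * 100, 100) : 0;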
const data = [
{ name: 'Used', value: used },
{ name: 'Remaining', value: Math.max(0, limit - used) }
];
return (
<Box sx={{ position: 'relative', width: 120, height: 120 }}>
<Suspense fallback={<ChartLoadingFallback />}>
<ResponsiveContainer width="100%" height="100%">
<LazyPieChart>
<Pie
data={data}
cx="50%"
cy="50%"
innerRadius={40}
outerRadius={50}
startAngle={90}
endAngle={-270}
dataKey="value"
animationBegin={0}
animationDuration={1000}
>
<Cell fill={color} />
<Cell fill="rgba(255,255,255,0.1)" />
</Pie>
</LazyPieChart>
</ResponsiveContainer>
</Suspense>
<Box sx={{
position: 'absolute',
top: '50%',
left: '50%',
transform: 'translate(-50%, -50%)',
textAlign: 'center'
}}>
<Typography variant="h6" sx={{ fontWeight: 'bold' }}>
{Math.round(percentage)}%
</Typography>
<Typography variant="caption" sx={{ fontSize: '0.7rem' }}>
{label}
</Typography>
</Box>
</Box>
);
};
```
---
## 7. Performance Considerations
### 7.1 Chart Optimization
- ✅ Use lazy loading for all charts
- ✅ Implement `Suspense` boundaries
- ✅ Limit data points (max 30-50 for line charts)
- ✅ Use `ResponsiveContainer` for responsive sizing
- ✅ Debounce chart updates on window resize
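The data point limit above can be sketched as a small sampling helper. This is one possible approach (even-interval sampling), not the project's implementation. The name `limitDataPoints` and the default of 50 are assumptions.

```typescript
// Hypothetical helper: reduce a series to at most maxPoints entries by sampling
// at even intervals. The first and last points are always kept so the chart's
// time range is preserved.
function limitDataPoints<T>(data: readonly T[], maxPoints = 50): T[] {
  if (maxPoints < 2) throw new RangeError('maxPoints must be at least 2');
  if (data.length <= maxPoints) return [...data];

  const step = (data.length - 1) / (maxPoints - 1);
  const result: T[] = [];
  for (let i = 0; i < maxPoints; i++) {
    result.push(data[Math.round(i * step)]);
  }
  return result;
}
```

Even-interval sampling can miss short spikes. If spikes matter for cost charts, aggregating per bucket (for example a daily maximum) on the backend may be a better fit.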
### 7.2 Animation Optimization
- ✅ Use `will-change` CSS property for animated elements
- ✅ Prefer `transform` and `opacity` over layout properties
- ✅ Limit simultaneous animations (max 10-15)
- ✅ Use `useReducedMotion` hook for accessibility
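The reduced-motion and animation-limit points above can be combined in one small helper. This is a sketch under assumptions: the function name and `maxAnimated` default are hypothetical, while the 0.1s stagger and 0.4s duration mirror the staggered card example earlier in this document. The `prefersReducedMotion` argument would typically come from framer-motion's `useReducedMotion()` hook, which can return `null` before the preference is known.

```typescript
interface EntranceTransition {
  delay: number;    // seconds
  duration: number; // seconds
}

// Hypothetical helper: returns a zero-length transition when the user prefers
// reduced motion or when the item is beyond the animated-item budget.
function entranceTransition(
  index: number,
  prefersReducedMotion: boolean | null,
  maxAnimated = 10
): EntranceTransition {
  if (prefersReducedMotion === true || index >= maxAnimated) {
    return { delay: 0, duration: 0 };
  }
  return { delay: Math.max(index, 0) * 0.1, duration: 0.4 };
}
```

In a component, the result could be passed as `transition={entranceTransition(index, reduceMotion)}` on each `motion.div`.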
### 7.3 Data Aggregation
- ✅ Pre-aggregate data on backend when possible
- ✅ Cache chart data with appropriate TTL
- ✅ Use virtual scrolling for large datasets
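Caching chart data with a TTL can be sketched as a small in-memory cache. This is a minimal illustration, not the project's caching layer. The class name `TtlCache` is hypothetical, and the clock is injectable only so the expiry logic is easy to test.

```typescript
// Hypothetical in-memory cache with a fixed time-to-live per entry.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    // Expired entries are removed lazily on read
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

A key such as `` `${userId}:${billingPeriod}` `` would keep one cached series per user and period. The appropriate TTL depends on how fresh the cost data needs to be.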
---
## 8. Accessibility
### 8.1 Chart Accessibility
- Add `aria-label` to all charts
- Provide text alternatives for chart data
- Ensure color contrast meets WCAG AA standards
- Support keyboard navigation for interactive charts
### 8.2 Animation Accessibility
- Respect `prefers-reduced-motion` media query
- Provide option to disable animations
- Ensure animations don't interfere with screen readers
---
## 9. Testing Recommendations
### 9.1 Visual Regression Testing
- Screenshot tests for all chart types
- Test with various data scenarios (empty, single point, many points)
- Test responsive behavior at different screen sizes
### 9.2 Animation Testing
- Verify animations complete within performance budget (60fps)
- Test with reduced motion preferences
- Verify animations don't cause layout shifts
---
## 10. Conclusion
The billing dashboard has a solid foundation with existing charts and animations. The recommended enhancements will:
1. **Improve Data Comprehension:** More visualizations make patterns easier to spot
2. **Enhance User Experience:** Smooth animations create a polished, professional feel
3. **Increase Engagement:** Interactive charts encourage exploration
4. **Support Decision Making:** Better visualizations help users optimize costs
**Next Steps:**
1. Review and prioritize recommendations with stakeholders
2. Create detailed implementation tickets
3. Start with Phase 1 (high impact, low effort) items
4. Gather user feedback and iterate
---
**Document Version:** 1.0
**Last Updated:** 2025-01-07
**Author:** AI Assistant
**Review Status:** Ready for Review

View File

@@ -0,0 +1,309 @@
# Cost Estimation Integration Guide
## Overview
The cost estimation feature allows users to see estimated costs before executing operations. This helps users make informed decisions and avoid unexpected charges.
## Components
### 1. `CostEstimationModal` Component
A reusable modal component that displays cost estimates for operations.
**Location**: `frontend/src/components/billing/CostEstimationModal.tsx`
**Props**:
```typescript
interface CostEstimationModalProps {
open: boolean;
onClose: () => void;
onConfirm: () => void;
operations: PreflightOperation[];
userId?: string;
}
```
### 2. `useCostEstimation` Hook
A React hook that manages cost estimation state.
**Location**: `frontend/src/hooks/useCostEstimation.ts`
**Returns**:
```typescript
{
showEstimation: (operations: PreflightOperation[]) => void;
estimationOperations: PreflightOperation[];
isEstimationOpen: boolean;
closeEstimation: () => void;
}
```
## Usage Example
### Basic Integration
```typescript
import React from 'react';
import { useCostEstimation } from '../../hooks/useCostEstimation';
import CostEstimationModal from '../billing/CostEstimationModal';
import { PreflightOperation } from '../../services/billingService';
const MyComponent: React.FC = () => {
const {
showEstimation,
estimationOperations,
isEstimationOpen,
closeEstimation
} = useCostEstimation();
const handleGenerate = () => {
// Define operations that will be performed
const operations: PreflightOperation[] = [
{
provider: 'gemini',
model: 'gemini-2.5-flash',
operation_type: 'text_generation',
tokens_requested: 2000
}
];
// Show cost estimation modal
showEstimation(operations);
};
const performActualOperation = async () => {
// Your actual operation logic here
console.log('Performing operation...');
};
return (
<>
<button onClick={handleGenerate}>
Generate Content
</button>
<CostEstimationModal
open={isEstimationOpen}
onClose={closeEstimation}
onConfirm={performActualOperation}
operations={estimationOperations}
/>
</>
);
};
```
### Advanced Example: Blog Writer
```typescript
import React, { useState } from 'react';
import { useCostEstimation } from '../../hooks/useCostEstimation';
import CostEstimationModal from '../billing/CostEstimationModal';
import { PreflightOperation } from '../../services/billingService';
const BlogWriter: React.FC = () => {
const [keywords, setKeywords] = useState('');
const {
showEstimation,
estimationOperations,
isEstimationOpen,
closeEstimation
} = useCostEstimation();
const handleGenerateBlog = () => {
// Estimate costs for blog generation workflow
// Typically involves: research (1 call) + outline (1 call) + content (1-3 calls)
const operations: PreflightOperation[] = [
{
provider: 'gemini',
model: 'gemini-2.5-flash',
operation_type: 'research',
tokens_requested: 1500
},
{
provider: 'gemini',
model: 'gemini-2.5-flash',
operation_type: 'outline_generation',
tokens_requested: 1000
},
{
provider: 'gemini',
model: 'gemini-2.5-flash',
operation_type: 'content_generation',
tokens_requested: 3000
}
];
showEstimation(operations);
};
const performBlogGeneration = async () => {
// Actual blog generation logic
// This will only be called if user confirms in the modal
console.log('Generating blog...');
};
return (
<>
<div>
<input
value={keywords}
onChange={(e) => setKeywords(e.target.value)}
placeholder="Enter blog topic..."
/>
<button onClick={handleGenerateBlog}>
Generate Blog
</button>
</div>
<CostEstimationModal
open={isEstimationOpen}
onClose={closeEstimation}
onConfirm={performBlogGeneration}
operations={estimationOperations}
/>
</>
);
};
```
### Example: Image Generation
```typescript
const ImageStudio: React.FC = () => {
const { showEstimation, estimationOperations, isEstimationOpen, closeEstimation } = useCostEstimation();
const handleGenerateImage = () => {
const operations: PreflightOperation[] = [
{
provider: 'stability',
operation_type: 'image_generation',
// tokens_requested not needed for image generation
}
];
showEstimation(operations);
};
return (
<>
<button onClick={handleGenerateImage}>
Generate Image
</button>
<CostEstimationModal
open={isEstimationOpen}
onClose={closeEstimation}
onConfirm={() => generateImage()}
operations={estimationOperations}
/>
</>
);
};
```
## Operation Types
Common operation types you can use:
### LLM Operations
- `text_generation` - General LLM text generation
- `research` - Research operations (typically includes search + LLM analysis)
- `outline_generation` - Content outline generation
- `content_generation` - Full content generation
- `seo_analysis` - SEO analysis and optimization
- `content_optimization` - Content refinement and optimization
- `title_generation` - Title/headline generation
- `summary_generation` - Content summarization
### Media Generation Operations
- `image_generation` - Image generation (text-to-image)
- `image_editing` - Image editing operations (inpaint, outpaint, recolor, etc.)
- `image_upscaling` - Image upscaling operations
- `face_swap` - Face swap operations
- `video_generation` - Video generation (text-to-video, image-to-video)
- `video_editing` - Video editing operations
- `audio_generation` - Audio/TTS generation
- `audio_editing` - Audio editing operations
### Search & Research Operations
- `search` - Generic search API operations
- `exa_search` - Exa neural search
- `tavily_search` - Tavily AI search
- `serper_search` - Serper Google search
- `metaphor_search` - Metaphor search
- `firecrawl_extract` - Firecrawl web page extraction
### Specialized Operations
- `character_image_generation` - Character-consistent image generation
- `product_image_generation` - Product-focused image generation
- `avatar_generation` - Avatar/talking head generation
- `scene_generation` - Scene-based video/image generation
- `batch_operation` - Batch processing operations
## Providers
Supported providers:
### LLM Providers
- `gemini` - Google Gemini (default: gemini-2.5-flash)
- `openai` - OpenAI GPT models (default: gpt-4o-mini)
- `anthropic` - Anthropic Claude (default: claude-3.5-sonnet)
- `mistral` - Mistral AI / HuggingFace (default: gpt-oss-120b)
### Search Providers
- `tavily` - Tavily AI Search ($0.001 per search)
- `serper` - Serper Google Search ($0.001 per search)
- `metaphor` - Metaphor Search ($0.003 per search)
- `exa` - Exa Neural Search ($0.005 per search)
- `firecrawl` - Firecrawl Web Extraction ($0.002 per page)
### Media Providers
- `stability` - Stability AI (images: $0.04/image, includes OSS models)
  - OSS Models: `qwen-image` ($0.03), `ideogram-v3-turbo` ($0.05)
- `wavespeed` - WaveSpeed AI (OSS models via Stability provider)
  - Image: `qwen-image`, `ideogram-v3-turbo`
  - Image Edit: `qwen-edit` ($0.02), `flux-kontext-pro` ($0.04)
  - Video: `wan-2.5` ($0.25), `seedance-1.5-pro` ($0.40)
  - Audio: `minimax-speech-02-hd` ($0.05 per 1K chars)
- `video` - Video generation (default: wan-2.5 OSS $0.25)
- `image_edit` - Image editing (default: qwen-edit OSS $0.02)
- `audio` - Audio generation (default: minimax-speech-02-hd OSS)
## Best Practices
1. **Always show estimation before expensive operations** - Operations that cost > $0.01 should show estimation
2. **Group related operations** - If a workflow involves multiple API calls, include all of them in the estimation
3. **Provide accurate token estimates** - More accurate token estimates lead to better cost predictions
4. **Handle errors gracefully** - If estimation fails, allow users to proceed with a warning
5. **Cache estimations** - The API returns a `cached` flag - consider caching for better UX
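The first two practices can be illustrated with a local pre-check that sums flat per-call prices and compares the total with the $0.01 threshold. This is only a sketch: the function names are hypothetical, the prices are the ones listed under Search Providers in this guide, and the authoritative estimate is still the one the modal fetches from the API.

```typescript
// Per-call prices in thousandths of a dollar, copied from the Search Providers list.
// Integer units avoid floating-point drift when summing.
const SEARCH_PRICE_MILLIS: Record<string, number> = {
  tavily: 1,
  serper: 1,
  metaphor: 3,
  exa: 5,
  firecrawl: 2,
};

const ESTIMATION_THRESHOLD_MILLIS = 10; // $0.01

interface PlannedSearch {
  provider: string;
  calls: number;
}

// Hypothetical helper: total cost of a workflow's search calls, in thousandths of a dollar.
function estimateSearchCostMillis(plan: readonly PlannedSearch[]): number {
  return plan.reduce((total, { provider, calls }) => {
    const price = SEARCH_PRICE_MILLIS[provider];
    if (price === undefined) throw new Error(`Unknown search provider: ${provider}`);
    return total + price * calls;
  }, 0);
}

// Hypothetical helper: show the modal only when the grouped workflow exceeds the threshold.
function shouldShowEstimation(plan: readonly PlannedSearch[]): boolean {
  return estimateSearchCostMillis(plan) > ESTIMATION_THRESHOLD_MILLIS;
}
```

Grouping all calls of a workflow into one `plan` matters here: two Exa searches alone sit exactly at the threshold, but adding a single Tavily search pushes the workflow over it.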
## Integration Checklist
- [ ] Import `useCostEstimation` hook
- [ ] Import `CostEstimationModal` component
- [ ] Define operations array with `PreflightOperation[]`
- [ ] Call `showEstimation(operations)` before operation
- [ ] Render `CostEstimationModal` with proper props
- [ ] Move actual operation logic to `onConfirm` callback
- [ ] Test with various operation types
- [ ] Handle error states gracefully
## Testing
Test the cost estimation with:
1. **Single operation** - Simple text generation
2. **Multiple operations** - Blog generation workflow
3. **Different providers** - Gemini, OpenAI, etc.
4. **Limit exceeded** - Test when limits are reached
5. **Error handling** - Network errors, API failures
## Notes
- The modal automatically fetches cost estimates when opened
- Users can proceed only if `can_proceed` is `true`
- The modal shows detailed breakdown per operation
- Usage limits are displayed if available
- Actual costs may vary slightly from estimates

View File

@@ -0,0 +1,280 @@
# Log Storage and Retention Review
## Executive Summary
This document reviews the storage limits, retention policies, and log management mechanisms for:
1. **API Usage Logs** (`api_usage_logs` table)
2. **Subscription Renewal History** (`subscription_renewal_history` table)
## 1. API Usage Logs
### Current Storage Limits
**Per-User Limit:**
- **Maximum Logs Per User**: `5,000` logs (defined in `LogWrappingService.MAX_LOGS_PER_USER`)
- **Detailed Logs Kept**: `4,000` most recent logs
- **Aggregation Threshold**: Logs older than 30 days OR beyond the 4,000 limit are aggregated
**API Query Limits:**
- **Frontend Default**: 50 logs per page (configurable: 10, 25, 50, 100)
- **Backend Maximum**: 5,000 logs per query (`limit` parameter: `ge=1, le=5000`)
- **Pagination**: Fully supported with `offset` and `limit` parameters
### Log Wrapping/Aggregation Mechanism
**Service**: `LogWrappingService` (`backend/services/subscription/log_wrapping_service.py`)
**How It Works:**
1. **Automatic Check**: Triggered on every `/usage-logs` API call via `check_and_wrap_logs()`
2. **Threshold Detection**: When user exceeds 5,000 logs
3. **Aggregation Strategy**:
   - Keeps most recent 4,000 logs as detailed records
   - Aggregates oldest logs beyond 4,000 limit
   - Groups by provider and billing period
   - Creates aggregated log entries with:
     - Total counts, tokens, costs
     - Average response time
     - Success/failure counts
     - Time range (oldest to newest timestamp)
   - Deletes individual logs that were aggregated
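The aggregation steps above can be sketched as follows. The actual `LogWrappingService` is a Python backend service. This TypeScript version only illustrates the described logic, and every type and function name in it is hypothetical.

```typescript
// Hypothetical shapes for illustration; the real APIUsageLog model has more fields.
interface UsageLog {
  provider: string;
  billingPeriod: string;
  timestamp: number; // epoch milliseconds
  tokens: number;
  cost: number;
  responseTimeMs: number;
  success: boolean;
}

interface AggregatedLog {
  provider: string;
  billingPeriod: string;
  callCount: number;
  totalTokens: number;
  totalCost: number;
  avgResponseTimeMs: number;
  successCount: number;
  failureCount: number;
  oldestTimestamp: number;
  newestTimestamp: number;
}

// Collapse one provider/billing-period group into a single summary record
function aggregateGroup(group: UsageLog[]): AggregatedLog {
  const successCount = group.filter((log) => log.success).length;
  const sum = (pick: (log: UsageLog) => number) =>
    group.reduce((total, log) => total + pick(log), 0);
  return {
    provider: group[0].provider,
    billingPeriod: group[0].billingPeriod,
    callCount: group.length,
    totalTokens: sum((log) => log.tokens),
    totalCost: sum((log) => log.cost),
    avgResponseTimeMs: sum((log) => log.responseTimeMs) / group.length,
    successCount,
    failureCount: group.length - successCount,
    oldestTimestamp: group.reduce((min, log) => Math.min(min, log.timestamp), Infinity),
    newestTimestamp: group.reduce((max, log) => Math.max(max, log.timestamp), -Infinity),
  };
}

function wrapLogs(
  logs: readonly UsageLog[],
  maxLogs = 5000,
  keepDetailed = 4000
): { detailed: UsageLog[]; aggregated: AggregatedLog[] } {
  // Below the threshold nothing is wrapped
  if (logs.length <= maxLogs) return { detailed: [...logs], aggregated: [] };

  const newestFirst = [...logs].sort((a, b) => b.timestamp - a.timestamp);
  const detailed = newestFirst.slice(0, keepDetailed);

  // Group everything older than the detailed window by provider and billing period
  const groups = new Map<string, UsageLog[]>();
  for (const log of newestFirst.slice(keepDetailed)) {
    const key = `${log.provider}|${log.billingPeriod}`;
    const group = groups.get(key) ?? [];
    group.push(log);
    groups.set(key, group);
  }
  return { detailed, aggregated: Array.from(groups.values()).map(aggregateGroup) };
}
```

The sketch returns the new state instead of deleting rows. In the real service the deletion of the individual logs happens in the database after the aggregated entries are written.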
**Aggregated Log Format:**
- `endpoint`: `"[AGGREGATED]"`
- `method`: `"AGGREGATED"`
- `model_used`: `"[{count} calls aggregated]"`
- `error_message`: Contains summary (e.g., "Aggregated 150 calls: 145 success, 5 failed")
- `is_aggregated`: Flag set to `true` in frontend
**Context Preservation:**
- **Preserved**: Total costs, tokens, call counts, success/failure rates, time ranges
- **Preserved**: Provider and billing period grouping
- **Preserved**: Average response time
- **Lost**: Individual endpoint details, specific error messages, request/response sizes
### Current Implementation Status
**✅ Implemented:**
- Automatic log wrapping when limit exceeded
- Aggregation by provider and billing period
- Context preservation for aggregated data
- Frontend display of aggregated logs with special formatting
**⚠️ Potential Issues:**
1. **No Time-Based Retention**: Only count-based, not age-based cleanup
2. **No Manual Cleanup Script**: No scheduled job to clean very old logs
3. **Database Growth**: Aggregated logs still count toward the 5,000 limit
4. **No Archive Strategy**: No mechanism to move old logs to archive tables
### Recommendations
1. **Add Time-Based Retention**:
   - Archive logs older than 12 months
   - Keep aggregated logs for 24 months
   - Delete logs older than 24 months
2. **Improve Aggregation Strategy**:
   - Consider aggregating by month for logs older than 90 days
   - Create separate archive table for very old logs
   - Implement tiered storage (hot/warm/cold)
3. **Add Cleanup Script**:
   - Scheduled job to run monthly
   - Archive old logs before deletion
   - Maintain audit trail
## 2. Subscription Renewal History
### Current Storage Limits
**Per-User Limit:**
- **No Hard Limit**: Unlimited storage (no cleanup/aggregation)
- **API Query Limit**: Maximum 100 records per query (`limit` parameter: `ge=1, le=100`)
- **Frontend Default**: 20 records per page (configurable: 10, 20, 50, 100)
**Storage Characteristics:**
- One record per renewal/upgrade/downgrade event
- Includes usage snapshot before renewal (`usage_before_renewal` JSON field)
- Includes payment information
- Includes period information (start/end dates)
### Current Implementation Status
**✅ Implemented:**
- Full history tracking for all subscription events
- Usage snapshots preserved in JSON format
- Pagination support
- No automatic cleanup (preserves all history)
**⚠️ Potential Issues:**
1. **Unlimited Growth**: No retention policy - will grow indefinitely
2. **Large JSON Snapshots**: `usage_before_renewal` can be large for active users
3. **No Archive Strategy**: All records kept in primary table
4. **No Cleanup Script**: No mechanism to archive old records
### Recommendations
1. **Add Retention Policy**:
   - Keep detailed records for last 24 months
   - Archive records older than 24 months
   - Keep summary records (without full usage snapshots) for 7 years (tax/audit)
2. **Optimize Storage**:
   - Compress `usage_before_renewal` JSON for old records
   - Create summary table for very old records
   - Remove detailed usage snapshots after 12 months
3. **Add Cleanup Script**:
   - Monthly job to archive records older than 24 months
   - Maintain summary records for compliance
   - Preserve payment information indefinitely
## 3. Log Replay Mechanism
### Current Status
**❌ No Log Replay**: There is no mechanism to replay or reconstruct usage from logs.
**What Would Be Needed:**
1. **Event Sourcing Pattern**: Store events that can be replayed
2. **Replay Service**: Service to process logs and rebuild state
3. **State Reconstruction**: Ability to rebuild `UsageSummary` from `APIUsageLog` entries
### Current Data Flow
```
API Call → monitoring_middleware → UsageTrackingService.track_api_usage()
    ↓
APIUsageLog (individual record)
    ↓
UsageSummary (aggregated by billing period)
```
**Issue**: If `UsageSummary` is corrupted or lost, it cannot be fully reconstructed from `APIUsageLog` because:
- Aggregation happens in real-time
- No event sourcing pattern
- No replay mechanism
### Recommendations
1. **Add Replay Capability**:
   - Create `replay_usage_logs()` function in `UsageTrackingService`
   - Rebuild `UsageSummary` from `APIUsageLog` entries
   - Support replay for specific billing periods
2. **Add Validation**:
   - Periodic job to validate `UsageSummary` against `APIUsageLog`
   - Detect discrepancies and auto-correct
   - Alert on data inconsistencies
3. **Consider Event Sourcing** (Future):
   - Store events instead of just logs
   - Enable full state reconstruction
   - Support time-travel queries
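The replay and validation recommendations can be sketched as two pure functions: one rebuilds per-period summaries from log entries, the other reports periods where the stored and rebuilt summaries disagree. The proposed backend function would live in the Python `UsageTrackingService`. This TypeScript version only illustrates the logic, with hypothetical types and integer cents to keep comparisons exact. It also assumes the detailed logs for the period are still available.

```typescript
// Hypothetical shapes for illustration only
interface LogEntry {
  billingPeriod: string;
  tokens: number;
  costCents: number;
}

interface PeriodSummary {
  totalCalls: number;
  totalTokens: number;
  totalCostCents: number;
}

// Rebuild one summary per billing period from individual log entries
function replayUsageLogs(logs: readonly LogEntry[]): Map<string, PeriodSummary> {
  const summaries = new Map<string, PeriodSummary>();
  for (const log of logs) {
    const current = summaries.get(log.billingPeriod) ??
      { totalCalls: 0, totalTokens: 0, totalCostCents: 0 };
    summaries.set(log.billingPeriod, {
      totalCalls: current.totalCalls + 1,
      totalTokens: current.totalTokens + log.tokens,
      totalCostCents: current.totalCostCents + log.costCents,
    });
  }
  return summaries;
}

// Return the billing periods whose stored summary is missing or differs from the replay
function findDiscrepancies(
  stored: Map<string, PeriodSummary>,
  rebuilt: Map<string, PeriodSummary>
): string[] {
  const periods = new Set<string>(
    Array.from(stored.keys()).concat(Array.from(rebuilt.keys()))
  );
  return Array.from(periods)
    .filter((period) => {
      const a = stored.get(period);
      const b = rebuilt.get(period);
      return !a || !b ||
        a.totalCalls !== b.totalCalls ||
        a.totalTokens !== b.totalTokens ||
        a.totalCostCents !== b.totalCostCents;
    })
    .sort();
}
```

A periodic validation job could run `findDiscrepancies` per user and alert on a non-empty result before attempting any auto-correction.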
## 4. Summary and Action Items
### Current State
| Metric | API Usage Logs | Renewal History |
|--------|---------------|----------------|
| **Per-User Limit** | 5,000 logs | Unlimited |
| **Aggregation** | ✅ Yes (automatic) | ❌ No |
| **Retention Policy** | ⚠️ Count-based only | ❌ None |
| **Cleanup Script** | ❌ No | ❌ No |
| **Log Replay** | ❌ No | ❌ No |
| **Archive Strategy** | ❌ No | ❌ No |
### Priority Actions
**High Priority:**
1. ✅ **Log Wrapping Works**: Already implemented and functional
2. ⚠️ **Add Time-Based Retention**: Implement age-based cleanup for API logs
3. ⚠️ **Add Renewal History Retention**: Implement retention policy for renewal history
**Medium Priority:**
4. **Add Cleanup Scripts**: Create scheduled jobs for both tables
5. **Add Archive Tables**: Create archive tables for old data
6. **Add Replay Capability**: Enable reconstruction of UsageSummary from logs
**Low Priority:**
7. **Optimize Storage**: Compress JSON fields, optimize indexes
8. **Add Monitoring**: Alert on storage growth, aggregation events
9. **Documentation**: Document retention policies for users
### Code Locations
**Log Wrapping:**
- `backend/services/subscription/log_wrapping_service.py`
- Triggered in: `backend/api/subscription/routes/logs.py` (line 86-89)
**Usage Logs API:**
- `backend/api/subscription/routes/logs.py`
- Frontend: `frontend/src/components/billing/UsageLogsTable.tsx`
**Renewal History API:**
- `backend/api/subscription/routes/subscriptions.py` (line 519-586)
- Frontend: `frontend/src/components/billing/SubscriptionRenewalHistory.tsx`
**Models:**
- `backend/models/subscription_models.py`
  - `APIUsageLog` (line 127-173)
  - `SubscriptionRenewalHistory` (line 341-389)
## 5. Recommended Retention Policies
### API Usage Logs
```
┌─────────────────────────────────────────────────────────────┐
│ Retention Policy: API Usage Logs │
├─────────────────────────────────────────────────────────────┤
│ │
│ 0-30 days: Detailed logs (all fields) │
│ 30-90 days: Detailed logs (keep 4,000 most recent) │
│ 90-365 days: Aggregated by month │
│ 365-730 days: Aggregated by quarter │
│ 730+ days: Archive to separate table │
│ │
│ Max per user: 5,000 records (detailed + aggregated) │
│ Archive table: Unlimited (for compliance/audit) │
└─────────────────────────────────────────────────────────────┘
```
### Subscription Renewal History
```
┌─────────────────────────────────────────────────────────────┐
│ Retention Policy: Renewal History │
├─────────────────────────────────────────────────────────────┤
│ │
│ 0-12 months: Full records with usage snapshots │
│ 12-24 months: Full records (compressed snapshots) │
│ 24-84 months: Summary records (no usage snapshots) │
│ 84+ months: Archive to separate table │
│ │
│ Payment data: Keep indefinitely (tax/audit compliance) │
│ Usage snapshots: Remove after 12 months │
└─────────────────────────────────────────────────────────────┘
```
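The renewal history tiers in the table above can be expressed as a lookup by record age. This is a sketch of the proposed policy, not existing code. The function and tier names are hypothetical, and treating each boundary as the start of the next tier is an assumption, since the table's ranges share their endpoints.

```typescript
type RenewalTier = 'full' | 'compressed' | 'summary' | 'archive';

// Hypothetical helper: map a record's age in months to its proposed retention tier.
// Boundaries (12, 24, 84) belong to the later tier, which is an assumption.
function renewalRetentionTier(ageMonths: number): RenewalTier {
  if (!Number.isFinite(ageMonths) || ageMonths < 0) {
    throw new RangeError('ageMonths must be a non-negative number');
  }
  if (ageMonths < 12) return 'full';       // full record with usage snapshot
  if (ageMonths < 24) return 'compressed'; // full record, compressed snapshot
  if (ageMonths < 84) return 'summary';    // summary record, no snapshot
  return 'archive';                        // moved to a separate table
}
```

A monthly cleanup job could call this for each record and act only when the tier has changed since the last run.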
## 6. Implementation Plan
### Phase 1: Immediate (No Breaking Changes)
1. Document current behavior
2. Add monitoring/alerts for log counts
3. Add database indexes for performance
### Phase 2: Retention Policies (Backward Compatible)
1. Add time-based retention to log wrapping
2. Create archive tables
3. Add cleanup scripts (manual execution)
### Phase 3: Automation
1. Schedule cleanup jobs (cron/scheduler)
2. Add replay capability
3. Add validation/audit jobs
### Phase 4: Optimization
1. Compress JSON fields
2. Optimize queries with better indexes
3. Add caching for frequently accessed data

View File

@@ -0,0 +1,106 @@
# Priority 2 Alerts Architecture Explanation
## Why Both Common and Tool-Specific Integrations?
**Common components should be updated once** and picked up automatically everywhere they are used. Tool-specific integrations are only needed for contextual cases. The architecture is as follows:
### Common Component Integration (UsageDashboard)
**Location**: `frontend/src/components/shared/UsageDashboard.tsx`
**Used In**:
- `UserBadge` (in `HeaderControls`) - appears in ALL tool headers
- `WizardHeader` (onboarding)
- Various tool headers directly
**What It Should Show**:
- **Global cost trends** (spending velocity, budget projections)
- **Overall OSS recommendations** (general cost savings opportunities)
- **Usage statistics** (current cost, calls, limits)
**Update Once**: Add Priority 2 alerts here → automatically appears in ALL tool headers
### Tool-Specific Integrations (Optional)
**Purpose**: Contextual alerts and pre-operation cost estimation
**When Needed**:
1. **Pre-Operation Cost Estimation**: Before clicking "Generate Blog" or "Generate Image", show cost estimate
2. **Contextual Recommendations**: In Image Studio, recommend OSS models based on selected provider/model
3. **Workflow-Specific Alerts**: Blog Writer showing cost breakdown for the entire blog generation workflow
**Example**:
- **Common**: "You're spending at a high rate" (shown everywhere)
- **Tool-Specific**: "This blog generation will cost ~$0.05" (shown only in Blog Writer before generation)
## Recommended Architecture
### ✅ **Primary Integration: UsageDashboard**
Add Priority 2 alerts to `UsageDashboard.tsx`:
- Shows cost trends, spending velocity, OSS recommendations
- Automatically appears in all tool headers via `UserBadge`/`HeaderControls`
- **One update, everywhere**
### ✅ **Optional: Tool-Specific Hooks**
Keep tool-specific hooks for:
- Pre-operation cost estimation (before expensive operations)
- Contextual recommendations (based on user's current selection)
**Example Flow**:
1. User opens Blog Writer
2. `UsageDashboard` (in header) shows: "High spending velocity detected"
3. User clicks "Generate Blog"
4. Tool-specific hook shows: "This will cost ~$0.05. Proceed?"
## Implementation Plan
### Phase 1: Common Integration (Recommended)
**Add to `UsageDashboard.tsx`**:
```typescript
import { usePriority2Alerts } from '../../hooks/usePriority2Alerts';
import Priority2AlertBanner from '../shared/Priority2AlertBanner';
// In UsageDashboard component
const { alerts, dismissAlert } = usePriority2Alerts({
userId,
enabled: !!userId && subscription?.active,
});
// Show alerts above usage stats
{alerts.length > 0 && (
<Priority2AlertBanner
alerts={alerts}
onDismiss={dismissAlert}
maxAlerts={2}
/>
)}
```
**Result**: Priority 2 alerts appear in ALL tool headers automatically!
### Phase 2: Tool-Specific (Optional)
Only add tool-specific integrations where you need:
- Pre-operation cost estimation
- Contextual recommendations
**Example**: Blog Writer
```typescript
// Only for pre-operation cost estimation
const { estimateAndProceed } = useBlogWriterCostEstimation();
const handleGenerate = () => {
estimateAndProceed('content', () => {
// Actual generation logic
}, userId);
};
```
## Summary
- **Common Integration**: ✅ Add to `UsageDashboard` → appears everywhere
- **Tool-Specific**: ⚠️ Only for pre-operation estimation and contextual recommendations
- **Best Practice**: Start with common integration, add tool-specific only when needed

View File

@@ -0,0 +1,632 @@
# Priority 2 Alerts Integration Guide
## Overview
This guide explains how to integrate **Priority 2 features** from the cost transparency review as alerts in the main dashboard and individual tool components.
**Priority 2 Features** (from `BILLING_DASHBOARD_COST_TRANSPARENCY_REVIEW.md`):
1. **Dynamic Pricing Display** - Show pricing changes and OSS model recommendations
2. **Cost Estimation Before Operations** - Warn users before expensive operations
3. **Historical Cost Trends** - Alert on high spending velocity and budget projections
---
## Architecture
### Components
1. **`usePriority2Alerts` Hook** (`frontend/src/hooks/usePriority2Alerts.ts`)
   - Fetches dashboard data and generates Priority 2 alerts
   - Monitors cost trends, spending velocity, and OSS recommendations
   - Auto-refreshes at configurable intervals
2. **`Priority2AlertBanner` Component** (`frontend/src/components/shared/Priority2AlertBanner.tsx`)
   - Displays alerts in a prominent banner format
   - Supports dismissible alerts with localStorage persistence
   - Shows action buttons for alerts
3. **Tool-Specific Alert Components**:
   - `BlogWriterCostAlerts` - Blog Writer integration
   - `CreateStudioCostAlerts` - Image Studio integration
---
## Main Dashboard Integration
### Step 1: Add Priority 2 Alerts to Main Dashboard
```typescript
// In your main dashboard component (e.g., MainDashboard.tsx or Dashboard.tsx)
import React from 'react';
import { Box } from '@mui/material';
import { usePriority2Alerts } from '../hooks/usePriority2Alerts';
import Priority2AlertBanner from '../components/shared/Priority2AlertBanner';
import { useSubscription } from '../contexts/SubscriptionContext';

const MainDashboard: React.FC = () => {
  const { subscription } = useSubscription();
  const userId = subscription?.user_id; // Get from your auth context

  // The hook also returns refreshAlerts for manual refreshes; it is not needed here
  const { alerts, dismissAlert } = usePriority2Alerts({
    userId,
    enabled: !!userId && subscription?.active,
    checkInterval: 120000, // Check every 2 minutes
  });

  return (
    <Box>
      {/* Priority 2 Alert Banner - Show at top of dashboard */}
      <Priority2AlertBanner
        alerts={alerts}
        onDismiss={dismissAlert}
        maxAlerts={3}
      />
      {/* Rest of dashboard content */}
      {/* ... */}
    </Box>
  );
};
```
### Step 2: Integrate with Existing Alert System
The Priority 2 alerts complement the existing `UsageAlerts` component:
```typescript
// In EnhancedBillingDashboard or CompactBillingDashboard
import Priority2AlertBanner from '../shared/Priority2AlertBanner';
import UsageAlerts from '../billing/UsageAlerts';
// Show both alert types
<Grid container spacing={3}>
<Grid item xs={12}>
{/* Priority 2 Alerts (cost trends, OSS recommendations) */}
<Priority2AlertBanner
alerts={priority2Alerts}
onDismiss={dismissPriority2Alert}
/>
</Grid>
<Grid item xs={12} md={4}>
{/* Existing Usage Alerts (limit warnings) */}
<UsageAlerts
alerts={dashboardData.alerts}
onMarkRead={handleMarkRead}
/>
</Grid>
</Grid>
```
---
## Blog Writer Integration Example
### Full Integration
```typescript
// In BlogWriter.tsx
import React from 'react';
import { BlogWriterCostAlerts, useBlogWriterCostEstimation } from './BlogWriterUtils/BlogWriterCostAlerts';
import { useSubscription } from '../../contexts/SubscriptionContext';
export const BlogWriter: React.FC = () => {
const { subscription } = useSubscription();
const userId = subscription?.user_id;
const { estimateAndProceed } = useBlogWriterCostEstimation();
// Wrap research action with cost estimation
const handleResearchAction = async () => {
await estimateAndProceed('research', () => {
// Your actual research logic here
blogWriterApi.startResearch(payload);
}, userId);
};
// Wrap outline generation with cost estimation
const handleOutlineGeneration = async () => {
await estimateAndProceed('outline', () => {
// Your actual outline generation logic here
outlineGenRef.current?.generateNow();
}, userId);
};
// Wrap content generation with cost estimation
const handleContentGeneration = async () => {
await estimateAndProceed('content', () => {
// Your actual content generation logic here
generateContent();
}, userId);
};
return (
<div>
{/* Priority 2 Alerts Banner */}
<BlogWriterCostAlerts
userId={userId}
onResearchStart={handleResearchAction}
onOutlineStart={handleOutlineGeneration}
onContentStart={handleContentGeneration}
/>
{/* Rest of Blog Writer UI */}
{/* ... */}
</div>
);
};
```
### Minimal Integration (Just Alerts)
```typescript
// Simple integration - just show alerts, no cost estimation
import { BlogWriterCostAlerts } from './BlogWriterUtils/BlogWriterCostAlerts';
// In your Blog Writer component
<BlogWriterCostAlerts userId={userId} />
```
---
## Image Studio Integration Example
### Full Integration
```typescript
// In CreateStudio.tsx
import React, { useState } from 'react';
import { Box } from '@mui/material'; // Assumes MUI; adjust if Box comes from elsewhere
import { CreateStudioCostAlerts, useImageStudioCostEstimation } from './CreateStudioCostAlerts';
import { useSubscription } from '../../contexts/SubscriptionContext';
export const CreateStudio: React.FC = () => {
const { subscription } = useSubscription();
const userId = subscription?.user_id;
const [provider, setProvider] = useState('wavespeed');
const [model, setModel] = useState('qwen-image');
const [numVariations, setNumVariations] = useState(1);
const { estimateAndGenerate } = useImageStudioCostEstimation();
const handleGenerate = async () => {
await estimateAndGenerate(
provider,
model,
numVariations,
() => {
// Your actual image generation logic
generateImage(prompt, { provider, model, numVariations });
},
userId
);
};
return (
<Box>
{/* Priority 2 Alerts with Cost Estimation */}
<CreateStudioCostAlerts
userId={userId}
provider={provider}
model={model}
numVariations={numVariations}
onGenerate={handleGenerate}
/>
{/* Image generation form */}
{/* ... */}
</Box>
);
};
```
---
## Operation Type Examples
### Blog Writer Operations
```typescript
// Research Phase
const researchOperations: PreflightOperation[] = [
{
provider: 'exa',
operation_type: 'research',
tokens_requested: 0, // Exa is per-search, not token-based
},
{
provider: 'exa',
operation_type: 'research',
tokens_requested: 0,
},
{
provider: 'gemini',
model: 'gemini-2.5-flash',
operation_type: 'research',
tokens_requested: 2000, // Analysis tokens
}
];
// Outline Generation
const outlineOperations: PreflightOperation[] = [
{
provider: 'gemini',
model: 'gemini-2.5-flash',
operation_type: 'outline_generation',
tokens_requested: 1500,
}
];
// Content Generation (per section)
const contentOperations: PreflightOperation[] = [
{
provider: 'gemini',
model: 'gemini-2.5-flash',
operation_type: 'content_generation',
tokens_requested: 3000, // Per section
},
{
provider: 'gemini',
model: 'gemini-2.5-flash',
operation_type: 'content_generation',
tokens_requested: 3000,
}
];
```
### Image Studio Operations
```typescript
// Single Image Generation (OSS Model)
const singleImageOperation: PreflightOperation[] = [
{
provider: 'stability', // WaveSpeed OSS models use 'stability' provider
model: 'qwen-image', // OSS model
operation_type: 'image_generation',
tokens_requested: 0, // Not token-based
}
];
// Multiple Images (Batch)
const batchImageOperations: PreflightOperation[] = Array(5).fill(null).map(() => ({
provider: 'stability',
model: 'ideogram-v3-turbo', // Premium OSS model
operation_type: 'image_generation',
tokens_requested: 0,
}));
// Image Editing
const imageEditOperation: PreflightOperation[] = [
{
provider: 'image_edit',
model: 'qwen-edit', // OSS model
operation_type: 'image_editing',
tokens_requested: 0,
}
];
```
### Story Writer Operations
```typescript
// Complete Story Generation (with images, audio, video)
const storyOperations: PreflightOperation[] = [
// Outline
{
provider: 'gemini',
model: 'gemini-2.5-flash',
operation_type: 'outline_generation',
tokens_requested: 1500,
},
// Script
{
provider: 'gemini',
model: 'gemini-2.5-flash',
operation_type: 'content_generation',
tokens_requested: 2000,
},
// Images (5 scenes)
...Array(5).fill(null).map(() => ({
provider: 'stability',
model: 'qwen-image',
operation_type: 'image_generation',
tokens_requested: 0,
})),
// Audio (5 scenes)
...Array(5).fill(null).map(() => ({
provider: 'audio',
model: 'minimax-speech-02-hd',
operation_type: 'audio_generation',
tokens_requested: 2000, // ~2000 characters per scene
})),
// Videos (5 scenes)
...Array(5).fill(null).map(() => ({
provider: 'video',
model: 'wan-2.5',
operation_type: 'video_generation',
tokens_requested: 0,
})),
];
```
### Podcast Maker Operations
```typescript
// Podcast Generation Workflow
const podcastOperations: PreflightOperation[] = [
// Research
{
provider: 'exa',
operation_type: 'research',
tokens_requested: 0,
},
{
provider: 'gemini',
model: 'gemini-2.5-flash',
operation_type: 'research',
tokens_requested: 2000,
},
// Script Generation
{
provider: 'gemini',
model: 'gemini-2.5-flash',
operation_type: 'content_generation',
tokens_requested: 5000, // Longer script
},
// Audio Generation (10 minutes = ~1500 words = ~7500 characters)
{
provider: 'audio',
model: 'minimax-speech-02-hd',
operation_type: 'audio_generation',
tokens_requested: 7500, // Characters = tokens for audio
},
// Optional: Video Generation (5 scenes)
...Array(5).fill(null).map(() => ({
provider: 'video',
model: 'wan-2.5',
operation_type: 'video_generation',
tokens_requested: 0,
})),
];
```
### Video Studio Operations
```typescript
// Text-to-Video Generation
const textToVideoOperation: PreflightOperation[] = [
{
provider: 'video',
model: 'wan-2.5', // OSS model (default)
operation_type: 'video_generation',
tokens_requested: 0,
}
];
// Image-to-Video Generation
const imageToVideoOperation: PreflightOperation[] = [
{
provider: 'video',
model: 'wan-2.5',
operation_type: 'video_generation',
tokens_requested: 0,
}
];
// Premium Video (Longer Duration)
const premiumVideoOperation: PreflightOperation[] = [
{
provider: 'video',
model: 'seedance-1.5-pro', // OSS model for longer videos
operation_type: 'video_generation',
tokens_requested: 0,
}
];
```
### Social Media Writer Operations
```typescript
// Facebook/LinkedIn Post Generation
const socialPostOperations: PreflightOperation[] = [
{
provider: 'gemini',
model: 'gemini-2.5-flash',
operation_type: 'content_generation',
tokens_requested: 1000, // Short post
},
// Optional: Image Generation
{
provider: 'stability',
model: 'qwen-image',
operation_type: 'image_generation',
tokens_requested: 0,
}
];
// Twitter Thread Generation
const twitterThreadOperations: PreflightOperation[] = [
{
provider: 'gemini',
model: 'gemini-2.5-flash',
operation_type: 'content_generation',
tokens_requested: 2000, // Multiple tweets
}
];
```
### SEO Tools Operations
```typescript
// SEO Analysis
const seoAnalysisOperations: PreflightOperation[] = [
{
provider: 'gemini',
model: 'gemini-2.5-flash',
operation_type: 'seo_analysis',
tokens_requested: 2500, // Comprehensive analysis
}
];
// Content Gap Analysis
const contentGapOperations: PreflightOperation[] = [
{
provider: 'exa',
operation_type: 'research',
tokens_requested: 0,
},
{
provider: 'gemini',
model: 'gemini-2.5-flash',
operation_type: 'research',
tokens_requested: 3000,
}
];
```
---
## Alert Types Generated
### 1. Cost Trend Alerts
**Triggered When**:
- Spending velocity projects budget exhaustion
- Projected cost exceeds 95% of monthly limit
- Daily spending rate is unusually high
**Example Alert**:
```typescript
{
  id: 'cost-velocity-high',
  type: 'cost_trend',
  severity: 'warning',
  title: 'High Spending Velocity Detected',
  // Double quotes so the apostrophe in "you'll" does not terminate the string
  message: "Your current spending rate projects to $42.50 this month (94% of limit). At this rate, you'll exhaust your budget in ~8 days.",
  action: {
    label: 'View Cost Trends',
    onClick: () => window.location.href = '/billing'
  }
}
```
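The velocity trigger can be sketched as a simple linear projection. This is a minimal Python sketch under stated assumptions, not the project's implementation: the function name `project_month_end`, its parameters, and the default 95% threshold are hypothetical, and it assumes spending continues at the average daily rate observed so far.

```python
def project_month_end(spent_so_far: float, day_of_month: int,
                      days_in_month: int, monthly_limit: float,
                      alert_ratio: float = 0.95) -> dict:
    """Linearly project month-end spend from the average daily rate so far."""
    if day_of_month < 1 or monthly_limit <= 0:
        raise ValueError("day_of_month must be >= 1 and monthly_limit > 0")
    daily_rate = spent_so_far / day_of_month
    projected = daily_rate * days_in_month
    remaining = max(monthly_limit - spent_so_far, 0.0)
    # With no spending yet, the budget is never exhausted at the current rate.
    days_left = remaining / daily_rate if daily_rate > 0 else float("inf")
    return {
        "projected_cost": projected,
        "projected_ratio": projected / monthly_limit,
        "days_until_exhausted": days_left,
        "should_alert": projected >= alert_ratio * monthly_limit,
    }


# Illustrative numbers: $18 spent by day 12 of a 30-day month, $45 limit.
result = project_month_end(18.0, 12, 30, 45.0)
print(result)
```

A linear projection is deliberately simple; a production check might weight recent days more heavily, which this sketch does not attempt.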
### 2. OSS Recommendation Alerts
**Triggered When**:
- User is using expensive models when cheaper OSS alternatives exist
- Significant cost savings available by switching models
**Example Alert**:
```typescript
{
  id: 'oss-image-recommendation',
  type: 'oss_recommendation',
  severity: 'info',
  title: '💡 Cost Savings Opportunity',
  // Double quotes so the apostrophe in "You've" does not terminate the string
  message: "You've spent $2.00 on image generation. Switch to Qwen Image OSS model to save ~$0.50 (25% cheaper at $0.03/image vs $0.04/image).",
  action: {
    label: 'Learn More',
    onClick: () => showToastNotification('OSS models are automatically used as defaults in Basic tier', 'info')
  }
}
```
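The savings figure in the example message follows from the two per-image prices. A minimal Python sketch of that arithmetic, assuming the same number of images would be generated with the cheaper model (`oss_savings` is a hypothetical helper, not an existing function in the codebase):

```python
def oss_savings(amount_spent: float, current_unit_cost: float,
                oss_unit_cost: float) -> tuple[float, float]:
    """Return (estimated savings in USD, percent cheaper) for the same volume."""
    units = amount_spent / current_unit_cost
    savings = amount_spent - units * oss_unit_cost
    percent_cheaper = 100 * (1 - oss_unit_cost / current_unit_cost)
    return savings, percent_cheaper


# Values from the example alert: $2.00 spent at $0.04/image, OSS at $0.03/image.
savings, percent = oss_savings(2.00, 0.04, 0.03)
print(f"save ~${savings:.2f} ({percent:.0f}% cheaper)")
```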
### 3. Cost Estimation Alerts
**Triggered When**:
- User is about to perform an expensive operation (>$0.01)
- Operation represents significant portion of monthly budget (>5%)
**Example Alert**:
```typescript
{
  id: 'cost-estimation-high',
  type: 'cost_estimation',
  severity: 'warning',
  title: 'High-Cost Operation Warning',
  message: 'This video generation will cost approximately $1.25. This represents 2.8% of your monthly budget.',
  action: {
    label: 'Proceed',
    onClick: () => performOperation()
  }
}
```
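The two trigger conditions can be sketched as a single check. This Python sketch assumes the conditions are combined with OR, which is what the example implies (a $1.25 operation warns at only 2.8% of budget); the function name and defaults are hypothetical.

```python
def cost_estimation_check(estimated_cost: float, monthly_budget: float,
                          min_cost: float = 0.01,
                          budget_share: float = 0.05) -> tuple[bool, float]:
    """Return (show_warning, share_of_budget) for a pending operation."""
    share = estimated_cost / monthly_budget if monthly_budget > 0 else 1.0
    show_warning = estimated_cost > min_cost or share > budget_share
    return show_warning, share


# Example from the alert above: a $1.25 video against a $45 monthly budget.
show, share = cost_estimation_check(1.25, 45.0)
print(show, f"{share:.1%}")
```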
---
## Integration Checklist
### Main Dashboard
- [ ] Import `usePriority2Alerts` hook
- [ ] Import `Priority2AlertBanner` component
- [ ] Add alert banner at top of dashboard
- [ ] Configure refresh interval (default: 2 minutes)
- [ ] Test alert generation and dismissal
### Blog Writer
- [ ] Import `BlogWriterCostAlerts` component
- [ ] Add component to Blog Writer layout
- [ ] Wrap research/outline/content actions with cost estimation
- [ ] Test cost estimation before operations
- [ ] Verify OSS recommendations appear
### Image Studio
- [ ] Import `CreateStudioCostAlerts` component
- [ ] Add component to Create Studio layout
- [ ] Pass provider/model/numVariations props
- [ ] Integrate cost estimation with generate button
- [ ] Test OSS model recommendations
### Other Tools
- [ ] Story Writer: Add cost alerts for story generation
- [ ] Podcast Maker: Add cost alerts for podcast generation
- [ ] Video Studio: Add cost alerts for video generation
- [ ] Social Media Writers: Add cost alerts for post generation
---
## Testing
### Test Cases
1. **Cost Trend Alerts**
- [ ] High spending velocity detected
- [ ] Budget exhaustion projection shown
- [ ] Alert appears at correct thresholds
2. **OSS Recommendations**
- [ ] Recommendation appears when using expensive models
- [ ] Savings calculation is accurate
- [ ] Alert is dismissible
3. **Cost Estimation**
- [ ] Estimation shown before expensive operations
- [ ] User can proceed or cancel
- [ ] Estimation is accurate (±10%)
4. **Alert Persistence**
- [ ] Dismissed alerts don't reappear
- [ ] Alerts refresh at configured interval
- [ ] Critical alerts cannot be dismissed
---
## Best Practices
1. **Don't Block Users**: Always allow operations to proceed even if estimation fails
2. **Cache Alerts**: Use localStorage to avoid showing the same alert repeatedly
3. **Progressive Enhancement**: Alerts enhance UX but shouldn't break functionality
4. **Clear Actions**: Provide actionable buttons in alerts (e.g., "View Billing", "Upgrade Plan")
5. **Contextual Alerts**: Show alerts relevant to current tool/operation
6. **Respect User Preferences**: Allow users to dismiss non-critical alerts
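The first practice (never block the user when estimation fails) amounts to a fail-open wrapper. The sketch below illustrates the idea in Python for brevity; `run_with_estimate`, `failing_estimator`, and the callback shapes are hypothetical and do not mirror the actual `estimateAndProceed` hook.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_estimate(estimate_cost: Callable[[], float],
                      confirm: Callable[[float], bool],
                      operation: Callable[[], T]) -> Optional[T]:
    """Run `operation` after an optional cost confirmation, failing open."""
    try:
        estimate = estimate_cost()
    except Exception:
        # Estimation is an enhancement: if it fails, do not block the user.
        return operation()
    if confirm(estimate):
        return operation()
    return None  # The user declined after seeing the estimate.


def failing_estimator() -> float:
    """Stand-in for an estimation call that times out."""
    raise TimeoutError("estimation service unavailable")


# Estimation fails, so the operation still runs.
print(run_with_estimate(failing_estimator, lambda cost: False, lambda: "generated"))
# Estimation succeeds, but the user declines the $1.25 cost.
print(run_with_estimate(lambda: 1.25, lambda cost: cost < 1.0, lambda: "generated"))
```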
---
## Next Steps
1. **Integrate into Main Dashboard**: Add `Priority2AlertBanner` to main dashboard
2. **Add to Blog Writer**: Integrate `BlogWriterCostAlerts` component
3. **Add to Image Studio**: Integrate `CreateStudioCostAlerts` component
4. **Extend to Other Tools**: Add similar integrations to Story Writer, Podcast Maker, etc.
5. **Monitor Performance**: Track alert generation performance and user engagement
---
**Last Updated**: January 2026


@@ -0,0 +1,899 @@
# Production Pricing Strategy - Basic Tier Launch (OSS-Focused)
## Executive Summary
This document provides a comprehensive pricing strategy for ALwrity's production launch with **Basic Tier only**. All features and tools will be accessible to Basic tier users, requiring careful cost calculation and limit setting to ensure sustainability while providing value.
**Critical Goals**:
1. **OSS-First Strategy**: Prioritize Open-Source AI models (WaveSpeed OSS models) for cost efficiency
2. **Hard Cost Cap**: $40-50 per user per month maximum (protects against losses)
3. **Maximum User Value**: Provide generous limits while staying within cost constraints
4. **Fair Pricing**: Balance between sustainability and user value (not excessive profit margins)
**Strategy**: Use WaveSpeed's OSS models (Qwen, FLUX, Ideogram, WAN 2.5) which offer better pricing than proprietary alternatives, allowing us to provide more value to users while maintaining profitability.
---
## Current State Analysis
### Current Basic Tier (Code Implementation)
**Price**: $29/month ($290/year)
**Limits**:
- **AI Text Generation**: 10 unified calls/month (across all LLM providers)
- **Tokens**: 20,000 per provider (Gemini, OpenAI, Anthropic, Mistral)
- **Search APIs**: 200 Tavily, 200 Serper, 100 Metaphor, 100 Firecrawl, 500 Exa
- **Image Generation**: 5 Stability AI images/month
- **Image Editing**: 30 AI image edits/month
- **Video Generation**: 20 videos/month
- **Audio Generation**: 50 TTS generations/month
- **Monthly Cost Cap**: $50.00
**Problem**: 10 unified AI text generation calls is **too restrictive** for production launch where users need to experience all features.
---
## ALwrity Tools & Content Generation Analysis
### Content Generation Tools
#### 1. **Text Generation Tools** (Primary LLM Usage)
| Tool | API Calls per Generation | Typical Usage | Cost per Generation |
|------|--------------------------|---------------|---------------------|
| **Blog Writer** | 3-5 calls | 1 blog = research (1) + outline (1) + content (1-3) | $0.01 - $0.05 |
| **Story Writer** | 2-3 calls | 1 story = outline (1) + script (1-2) | $0.01 - $0.03 |
| **Podcast Maker** | 3-4 calls | 1 podcast = research (1) + script (1) + outline (1-2) | $0.01 - $0.04 |
| **Facebook Writer** | 1-2 calls | 1 post = generation (1) + optional optimization (1) | $0.005 - $0.01 |
| **LinkedIn Writer** | 1-2 calls | 1 post = generation (1) + optional optimization (1) | $0.005 - $0.01 |
| **SEO Tools** | 1-3 calls | Varies by tool complexity | $0.005 - $0.02 |
| **Content Planning** | 2-4 calls | Strategy generation + analysis | $0.01 - $0.03 |
**Average**: ~2-3 LLM calls per content generation workflow
#### 2. **Image Generation Tools**
| Tool | API Calls | Cost per Generation |
|------|-----------|---------------------|
| **Image Generator** | 1 Stability call | $0.04 per image |
| **Image Editor** | 1 Image Edit call | $0.04 per edit operation |
**Current Limit**: 5 images/month (too low for production)
#### 3. **Video Generation Tools**
| Tool | API Calls | Cost per Video | Notes |
|------|-----------|-----------------|-------|
| **Video Studio** | 1 video call | $0.10 - $0.42 | Depends on model/duration |
| **YouTube Creator** | 1 video call per scene | $0.10 - $0.42 per scene | 5 scenes = $0.50 - $2.10 |
| **Story Writer Video** | 1 video call per scene | $0.10 - $0.42 per scene | Variable scenes |
| **Podcast Maker Video** | 1 video call per scene | $0.10 - $0.42 per scene | Optional video generation |
**Current Limit**: 20 videos/month (reasonable)
#### 4. **Audio Generation Tools**
| Tool | API Calls | Cost per Generation | Notes |
|------|-----------|---------------------|-------|
| **Audio Generator** | 1 audio call | $0.05 per 1,000 chars | ~$0.10 - $0.50 per audio |
| **Podcast Maker TTS** | 1 audio call per scene | $0.05 per 1,000 chars | Multiple scenes |
| **Story Writer Narration** | 1 audio call per scene | $0.05 per 1,000 chars | Multiple scenes |
**Current Limit**: 50 audio generations/month (reasonable)
---
## API Cost Breakdown
### LLM Provider Costs (Per 1M Tokens)
| Provider | Model | Input Cost | Output Cost | Typical Use |
|----------|-------|------------|-------------|-------------|
| **Gemini** | 2.5 Flash | $0.30 | $2.50 | Default (cost-effective) |
| **Gemini** | 2.5 Pro | $1.25 | $10.00 | Premium quality |
| **OpenAI** | GPT-4o Mini | $0.15 | $0.60 | Cost-effective |
| **OpenAI** | GPT-4o | $2.50 | $10.00 | Premium quality |
| **Anthropic** | Claude 3.5 Sonnet | $3.00 | $15.00 | Premium quality |
| **HuggingFace** | GPT-OSS-120B | $1.00 | $3.00 | Alternative option |
**Average Cost per LLM Call** (assuming 1K input + 2K output tokens):
- Gemini Flash: ~$0.0053 per call (the estimates below use a slightly more conservative ~$0.0056)
- GPT-4o Mini: ~$0.0014 per call
- Claude 3.5: ~$0.033 per call
**Recommendation**: Use Gemini Flash as default for cost efficiency.
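The per-call figures can be reproduced from the per-1M-token prices in the table. A minimal Python sketch, assuming the stated 1K input + 2K output tokens per call; the dictionary keys and the function name `llm_call_cost` are illustrative, not identifiers from the codebase. Computed this way, Gemini Flash comes to about $0.0053 and GPT-4o Mini to about $0.00135 per call, slightly below the rounded figures listed above.

```python
# Prices in USD per 1M tokens (input, output), copied from the table above.
PRICES_PER_MILLION = {
    "gemini-2.5-flash": (0.30, 2.50),
    "gpt-4o-mini": (0.15, 0.60),
    "claude-3.5-sonnet": (3.00, 15.00),
}


def llm_call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single LLM call."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


for model in PRICES_PER_MILLION:
    print(f"{model}: ${llm_call_cost(model, 1_000, 2_000):.5f}")
```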
### Search API Costs
| Provider | Cost per Search | Typical Usage |
|----------|----------------|---------------|
| **Tavily** | $0.001 | Research operations |
| **Serper** | $0.001 | Research operations |
| **Metaphor** | $0.003 | Research operations |
| **Exa** | $0.005 | Neural search (premium) |
| **Firecrawl** | $0.002 | Web page extraction |
**Average**: ~$0.002 per search operation
### Media Generation Costs (OSS-Focused via WaveSpeed)
#### **Image Generation** (OSS Models via WaveSpeed)
| Model | Cost | Type | Notes |
|------|------|------|-------|
| **Qwen Image** | $0.03 per image | OSS | Fast generation, cost-effective |
| **Ideogram V3 Turbo** | $0.05 per image | OSS | Photorealistic, text rendering |
| **Default (Qwen)** | $0.03 per image | OSS | **Recommended for Basic tier** |
#### **Image Editing** (OSS Models via WaveSpeed)
| Model | Cost | Type | Use Case |
|------|------|------|----------|
| **Qwen Image Edit** | $0.02 per edit | OSS | Budget editing, bilingual |
| **Qwen Image Edit Plus** | $0.02 per edit | OSS | Multi-image editing |
| **FLUX Kontext Pro** | $0.04 per edit | OSS | Typography, professional |
| **Default (Qwen Edit)** | $0.02 per edit | OSS | **Recommended for Basic tier** |
#### **Video Generation** (OSS Models via WaveSpeed)
| Model | Cost | Type | Duration | Notes |
|------|------|------|----------|-------|
| **WAN 2.5** | $0.05/sec | OSS | 5-15 sec | Text-to-Video, Image-to-Video |
| **Seedance 1.5 Pro** | $0.08/sec | OSS | 10-30 sec | Longer duration |
| **Kling v2.5 Turbo (5s)** | $0.21 per video | OSS | 5 sec | Image-to-Video |
| **Kling v2.5 Turbo (10s)** | $0.42 per video | OSS | 10 sec | Extended duration |
| **Default (WAN 2.5)** | $0.25 per video | OSS | ~5 sec | **Recommended for Basic tier** |
#### **Audio Generation** (OSS Models via WaveSpeed)
| Model | Cost | Type | Notes |
|------|------|------|-------|
| **Minimax Speech 02 HD** | $0.05 per 1K chars | OSS | High-quality TTS |
| **Default** | $0.05 per 1K chars | OSS | ~$0.10-0.50 per audio |
#### **Face Swap & Specialized** (OSS Models via WaveSpeed)
| Operation | Cost | Type | Notes |
|-----------|------|------|-------|
| **Face Swap** | $0.01-$0.03 | OSS | Basic to premium quality |
| **Image Upscaling** | $0.01-$0.06 | OSS | 2K/4K/8K options |
| **3D Generation** | $0.02-$0.30 | OSS | Budget to premium |
**OSS Advantage**: WaveSpeed provides access to OSS models (Qwen, FLUX, Ideogram, WAN 2.5) at significantly lower costs than proprietary alternatives, enabling better value for users.
---
## Production-Ready Basic Tier Proposal
### Revised Limits for Production Launch
**Price**: $29/month ($290/year) - **KEEP CURRENT PRICING**
**Rationale**: A competitive price point that allows sustainable margins when paired with appropriate limits.
### Proposed Limits
#### 1. **AI Text Generation** (Unified Limit)
- **Current**: 10 calls/month ❌ **TOO LOW**
- **Proposed**: **50 calls/month**
- **Rationale**:
- Allows ~16-25 content generations/month (assuming 2-3 calls each)
- Enables users to experience Blog Writer, Story Writer, Podcast Maker, Social Writers
- Sustainable cost: ~$0.28/month (50 calls × $0.0056 average)
#### 2. **Token Limits** (Per Provider)
- **Current**: 20,000 tokens/provider
- **Proposed**: **100,000 tokens/provider**
- **Rationale**:
- Allows ~33-50 LLM calls per provider (assuming 2K-3K tokens/call)
- Provides buffer for longer content generation
- Aligns with unified call limit (50 calls × 2K tokens = 100K tokens)
#### 3. **Search APIs**
- **Tavily**: 200 calls/month ✅ (Keep)
- **Serper**: 200 calls/month ✅ (Keep)
- **Metaphor**: 100 calls/month ✅ (Keep)
- **Firecrawl**: 100 calls/month ✅ (Keep)
- **Exa**: 500 calls/month ✅ (Keep)
- **Rationale**: Sufficient for research-heavy tools (Blog Writer, Podcast Maker, SEO tools)
#### 4. **Image Generation** (OSS Models via WaveSpeed)
- **Current**: 5 images/month ❌ **TOO LOW**
- **Proposed**: **50 images/month** ✅ (INCREASED - OSS models are cheaper)
- **Rationale**:
- OSS models (Qwen Image $0.03) are cheaper than Stability ($0.04)
- Allows users to generate images for Story Writer, Blog Writer, Social Media
- Cost: ~$1.50/month (50 × $0.03 using Qwen Image OSS model)
- Enables visual content creation workflows
- **Default to Qwen Image OSS model** for cost efficiency
#### 5. **Image Editing** (OSS Models via WaveSpeed)
- **Current**: 30 edits/month
- **Proposed**: **50 edits/month** ✅ (INCREASED - OSS models are cheaper)
- **Rationale**:
- OSS models (Qwen Edit $0.02) are cheaper than Stability ($0.04)
- Cost: ~$1.00/month (50 × $0.02 using Qwen Edit OSS model)
- Sufficient for image optimization workflows
- **Default to Qwen Edit OSS model** for cost efficiency
#### 6. **Video Generation** (OSS Models via WaveSpeed)
- **Current**: 20 videos/month
- **Proposed**: **30 videos/month** ✅ (INCREASED - OSS models available)
- **Rationale**:
- OSS models (WAN 2.5 $0.25 per 5s video) provide good value
- Allows ~6-10 full video projects/month (assuming 3-5 scenes each)
- Cost: ~$7.50/month (30 × $0.25 using WAN 2.5 OSS model)
- Enables Video Studio, YouTube Creator, Story Writer video features
- **Default to WAN 2.5 OSS model** for cost efficiency
#### 7. **Audio Generation** (OSS Models via WaveSpeed)
- **Current**: 50 generations/month
- **Proposed**: **100 generations/month** ✅ (INCREASED - OSS models are affordable)
- **Rationale**:
- OSS models (Minimax Speech 02 HD) provide high quality at $0.05/1K chars
- Sufficient for Podcast Maker, Story Writer narration
- Cost: ~$10.00-$25.00/month (depending on length, assuming 2K-5K chars per audio)
- Enables audio content workflows
- **Default to Minimax Speech 02 HD OSS model**
#### 8. **Monthly Cost Cap**
- **Current**: $50.00
- **Proposed**: **$45.00** ✅ (ADJUSTED - aligns with $40-50 target)
- **Rationale**:
- Protects against unexpected high usage
- Allows flexibility within limits
- Provides safety margin
- Aligns with $40-50 hard limit requirement
---
## Cost Analysis: Proposed Basic Tier (OSS-Focused)
### Monthly Cost Breakdown (Per User) - Using OSS Models
| Category | Usage | Cost per Unit (OSS) | Monthly Cost |
|----------|-------|---------------------|--------------|
| **LLM Calls** | 50 calls | $0.0056 avg (Gemini Flash) | **$0.28** |
| **Search APIs** | 200 searches | $0.002 avg | **$0.40** |
| **Image Generation** | 50 images | $0.03 (Qwen Image OSS) | **$1.50** |
| **Image Editing** | 50 edits | $0.02 (Qwen Edit OSS) | **$1.00** |
| **Video Generation** | 30 videos | $0.25 (WAN 2.5 OSS, ~5s) | **$7.50** |
| **Audio Generation** | 100 audios | $0.10-$0.25 avg (2K-5K chars at $0.05/1K) | **$10.00-$25.00** |
| **Total Variable Cost** | | | **$20.68-$35.68** |
### Margin Analysis (OSS-Focused)
**Subscription Revenue**: $29.00/month
**Variable Costs (OSS Models)**: $20.68-$35.68/month (depending on usage)
**Gross Margin**: **-$6.68 to +$8.32/month**
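The totals and the margin range can be checked with a few lines of Python. This is a sketch of the arithmetic only; it assumes every limit is fully used and that audio costs $0.10-$0.25 per generation (2K-5K characters at $0.05 per 1K), which is what the $10.00-$25.00 monthly figure implies. The dictionary keys are illustrative names.

```python
# (units per month, low unit cost, high unit cost) in USD at full usage.
USAGE_AT_LIMITS = {
    "llm_calls": (50, 0.0056, 0.0056),
    "searches": (200, 0.002, 0.002),
    "images": (50, 0.03, 0.03),
    "image_edits": (50, 0.02, 0.02),
    "videos": (30, 0.25, 0.25),
    "audio": (100, 0.10, 0.25),
}
SUBSCRIPTION_PRICE = 29.00


def variable_cost_range(usage: dict) -> tuple[float, float]:
    """Return the (low, high) monthly variable cost for the given usage."""
    low = sum(units * lo for units, lo, _ in usage.values())
    high = sum(units * hi for units, _, hi in usage.values())
    return low, high


low, high = variable_cost_range(USAGE_AT_LIMITS)
print(f"variable cost: {low:.2f} to {high:.2f} USD")
print(f"gross margin: {SUBSCRIPTION_PRICE - high:.2f} to {SUBSCRIPTION_PRICE - low:.2f} USD")
```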
**✅ IMPROVEMENT**: OSS models reduce costs significantly:
- Image generation: $0.03 vs $0.04 (25% savings)
- Image editing: $0.02 vs $0.04 (50% savings)
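- Video generation: $0.25 vs $0.42 (40% savings; note this compares a ~5s WAN 2.5 clip with a 10s Kling v2.5 Turbo clip, while the 5s Kling clip costs $0.21)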
**Mitigation Strategy**:
1. **Cost cap enforcement**: Monthly cost cap of $45 limits the worst-case loss to $16/user/month ($45 cap minus $29 revenue)
2. **OSS model defaults**: Default to cheaper OSS models (Qwen, WAN 2.5)
3. **Realistic usage**: Most users won't hit all limits simultaneously
4. **Average usage assumption**: ~60-70% of limits = $12-25 cost = $4-17 margin
5. **Hard limit protection**: $45 cap ensures we never exceed $50/user/month
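The average-usage assumption and the capped worst case can be sketched as follows. This is illustrative Python, assuming cost scales linearly with utilization and that the cap is strictly enforced; the constant and function names are hypothetical.

```python
SUBSCRIPTION_PRICE = 29.00
MONTHLY_COST_CAP = 45.00
FULL_USAGE_COST_RANGE = (20.68, 35.68)  # From the cost breakdown table


def margin_at_utilization(utilization: float, full_usage_cost: float) -> float:
    """Margin per user when `utilization` (0.0-1.0) of the limits is consumed."""
    cost = min(utilization * full_usage_cost, MONTHLY_COST_CAP)
    return SUBSCRIPTION_PRICE - cost


best = margin_at_utilization(0.60, FULL_USAGE_COST_RANGE[0])
worst = margin_at_utilization(0.70, FULL_USAGE_COST_RANGE[1])
worst_case_loss = MONTHLY_COST_CAP - SUBSCRIPTION_PRICE
print(f"typical margin: {worst:.2f} to {best:.2f} USD; capped loss: {worst_case_loss:.2f} USD")
```

Under these assumptions the typical margin lands at roughly $4-$17 per user, and the cap bounds the loss on any single user at $16 per month.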
---
## Revised Basic Tier Limits (Production-Ready, OSS-Focused)
```python
{
"name": "Basic",
"tier": SubscriptionTier.BASIC,
"price_monthly": 29.0,
"price_yearly": 290.0,
# AI Text Generation (Unified Limit)
"ai_text_generation_calls_limit": 50, # INCREASED from 10
# Token Limits (Per Provider)
"gemini_tokens_limit": 100000, # INCREASED from 20,000
"openai_tokens_limit": 100000, # INCREASED from 20,000
"anthropic_tokens_limit": 100000, # INCREASED from 20,000
"mistral_tokens_limit": 100000, # INCREASED from 20,000
# Search APIs
"tavily_calls_limit": 200, # Keep
"serper_calls_limit": 200, # Keep
"metaphor_calls_limit": 100, # Keep
"firecrawl_calls_limit": 100, # Keep
"exa_calls_limit": 500, # Keep
# Media Generation (OSS Models via WaveSpeed)
"stability_calls_limit": 50, # INCREASED from 5 (using Qwen Image OSS $0.03)
"image_edit_calls_limit": 50, # INCREASED from 30 (using Qwen Edit OSS $0.02)
"video_calls_limit": 30, # INCREASED from 20 (using WAN 2.5 OSS $0.25)
"audio_calls_limit": 100, # INCREASED from 50 (using Minimax Speech OSS)
# Cost Protection
"monthly_cost_limit": 45.0, # ADJUSTED from 50.0 (aligns with $40-50 target)
# OSS Model Defaults
"default_image_model": "qwen-image", # OSS model via WaveSpeed
"default_image_edit_model": "qwen-edit", # OSS model via WaveSpeed
"default_video_model": "wan-2.5", # OSS model via WaveSpeed
"default_audio_model": "minimax-speech-02-hd", # OSS model via WaveSpeed
# Features
"features": [
"full_content_generation",
"advanced_research",
"basic_analytics",
"all_tools_access", # All ALwrity tools accessible
"billing_dashboard",
"usage_tracking",
"oss_models_priority" # NEW: OSS models prioritized for cost efficiency
],
"description": "Perfect for individuals and small teams. Access all ALwrity features with generous limits powered by OSS AI models."
}
```
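To see exactly which fields a migration needs to touch, the current and proposed values can be diffed. The sketch below uses plain dictionaries with a subset of the fields above; `changed_limits` is a hypothetical helper, not part of `pricing_service.py`.

```python
CURRENT_LIMITS = {
    "ai_text_generation_calls_limit": 10,
    "gemini_tokens_limit": 20_000,
    "exa_calls_limit": 500,
    "stability_calls_limit": 5,
    "image_edit_calls_limit": 30,
    "video_calls_limit": 20,
    "audio_calls_limit": 50,
    "monthly_cost_limit": 50.0,
}
PROPOSED_LIMITS = {
    "ai_text_generation_calls_limit": 50,
    "gemini_tokens_limit": 100_000,
    "exa_calls_limit": 500,
    "stability_calls_limit": 50,
    "image_edit_calls_limit": 50,
    "video_calls_limit": 30,
    "audio_calls_limit": 100,
    "monthly_cost_limit": 45.0,
}


def changed_limits(current: dict, proposed: dict) -> dict:
    """Return {field: (old, new)} for every field whose value differs."""
    return {
        key: (current.get(key), new_value)
        for key, new_value in proposed.items()
        if current.get(key) != new_value
    }


for field, (old, new) in changed_limits(CURRENT_LIMITS, PROPOSED_LIMITS).items():
    print(f"{field}: {old} -> {new}")
```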
---
## Tool Usage Scenarios & Limits
### Scenario 1: Blog Writer User
- **Workflow**: 1 blog post = 3-5 LLM calls + 3-5 search calls + 1-2 images
- **Monthly Capacity**: ~10-16 blog posts (with 50 LLM calls)
- **Cost**: ~$0.06-$0.11 per blog post at the unit costs above (LLM $0.0056/call, Exa $0.005/search, Qwen Image $0.03/image)
- **Status**: ✅ **FEASIBLE**
### Scenario 2: Story Writer User
- **Workflow**: 1 story = 2-3 LLM calls + 5-10 images + 5-10 audio + 5-10 videos
- **Monthly Capacity**: ~16-25 stories (LLM limit) OR ~3-6 stories (image/video limits)
- **Cost**: ~$2.00-$5.00 per story
- **Status**: ✅ **FEASIBLE** (limited by media, not LLM)
### Scenario 3: Podcast Maker User
- **Workflow**: 1 podcast = 3-4 LLM calls + 3-5 search calls + 5-10 audio + optional 5-10 videos
- **Monthly Capacity**: ~12-16 podcasts (LLM limit) OR ~10-20 podcasts (audio limit of 100 generations)
- **Cost**: ~$1.00-$3.00 per podcast (without video)
- **Status**: ✅ **FEASIBLE**
### Scenario 4: Social Media Content Creator
- **Workflow**: 1 post = 1-2 LLM calls + 1 image (optional)
- **Monthly Capacity**: ~25-50 posts (LLM limit) OR ~50 posts (image limit)
- **Cost**: ~$0.04 per post with an image (1-2 LLM calls at $0.0056 plus one $0.03 image)
- **Status**: ✅ **FEASIBLE**
### Scenario 5: Video Creator (YouTube Creator)
- **Workflow**: 1 video = 2-3 LLM calls + 5 scenes × (1 image + 1 audio + 1 video)
- **Monthly Capacity**: ~6 full videos (video limit: 30 clips / 5 scenes) OR ~16-25 videos (LLM limit)
- **Cost**: ~$3.00-$5.00 per video
- **Status**: ✅ **FEASIBLE** (limited by video limit, not LLM)
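Each scenario's capacity is the smallest quotient of limit over per-item usage, and that resource is the binding limit. A minimal Python sketch, assuming the proposed limits (50 LLM calls, 50 images, 30 videos, 100 audio generations); the resource keys and `monthly_capacity` are illustrative names.

```python
PROPOSED_LIMITS = {"llm_calls": 50, "images": 50, "videos": 30, "audio": 100}


def monthly_capacity(per_item_usage: dict, limits: dict = PROPOSED_LIMITS) -> tuple[int, str]:
    """Return (items per month, binding resource) for one workflow."""
    capacities = {
        resource: limits[resource] // units
        for resource, units in per_item_usage.items()
        if units > 0
    }
    binding = min(capacities, key=capacities.get)
    return capacities[binding], binding


# A 5-scene YouTube video: 3 LLM calls plus one image, audio and video per scene.
youtube_video = {"llm_calls": 3, "images": 5, "audio": 5, "videos": 5}
# A 10-scene story at the upper end of Scenario 2.
long_story = {"llm_calls": 3, "images": 10, "audio": 10, "videos": 10}
print(monthly_capacity(youtube_video))
print(monthly_capacity(long_story))
```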
---
## Risk Mitigation Strategies
### 1. **Cost Cap Enforcement**
- **Monthly cost cap**: $45.00 (hard limit, down from the current $50.00)
- **Behavior**: When cap reached, all API calls blocked until next billing period
- **Protection**: Prevents losses from extreme usage
### 2. **Pre-flight Validation**
- **Implementation**: Already in place
- **Function**: Validates limits BEFORE making API calls
- **Benefit**: Prevents wasted API calls on operations that would fail
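As an illustration of the idea, a pre-flight check compares the requested operation against both the call limit and the cost cap before any provider is contacted. The sketch below is not the existing implementation; `UsageState`, `preflight_check`, and the reason strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UsageState:
    calls_used: int
    calls_limit: int
    cost_used: float
    cost_limit: float


def preflight_check(state: UsageState, estimated_cost: float,
                    calls_requested: int = 1) -> tuple[bool, str]:
    """Return (allowed, reason) without calling any provider."""
    if state.calls_used + calls_requested > state.calls_limit:
        return False, "call_limit_exceeded"
    if state.cost_used + estimated_cost > state.cost_limit:
        return False, "cost_cap_exceeded"
    return True, "ok"


# A user close to the $45 cap asks for one more $0.25 video.
state = UsageState(calls_used=12, calls_limit=30, cost_used=44.90, cost_limit=45.00)
print(preflight_check(state, estimated_cost=0.25))
```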
### 3. **Usage Monitoring & Alerts**
- **80% Warning**: Alert users at 80% of limits
- **100% Block**: Block operations at 100% of limits
- **Dashboard**: Real-time usage tracking
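The 80% and 100% thresholds map onto a small status function. This Python sketch is an assumption about how the thresholds might be applied; the name `usage_status` and the status strings are hypothetical.

```python
def usage_status(used: float, limit: float, warn_at: float = 0.80) -> str:
    """Classify usage as 'ok', 'warning' (>= 80%) or 'blocked' (>= 100%)."""
    if limit <= 0:
        return "blocked"  # A zero limit means the feature is unavailable.
    ratio = used / limit
    if ratio >= 1.0:
        return "blocked"
    if ratio >= warn_at:
        return "warning"
    return "ok"


for used in (39, 40, 50):
    print(used, usage_status(used, 50))
```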
### 4. **Optimized Default Models**
- **Strategy**: Use cost-effective models by default (Gemini Flash, GPT-4o Mini)
- **Benefit**: Reduces costs while maintaining quality
- **User Control**: Allow model selection for power users
### 5. **Efficient API Usage**
- **Batching**: Batch multiple operations where possible
- **Caching**: Cache research results and common queries
- **Optimization**: Continue optimizing tool workflows to reduce API calls
---
## Pricing Page Updates Required
### Current Issues
1. Pricing page shows outdated limits
2. Missing unified `ai_text_generation_calls_limit` explanation
3. Token limits don't match code (shows 1M/500K, code has 20K)
4. Missing video/audio/image editing limits
5. Missing cost transparency information
### Required Updates
#### Basic Tier Display
```
💰 Basic Plan - $29/month ($290/year)
✨ All ALwrity Features Included:
✅ Blog Writer, Story Writer, Podcast Maker
✅ Image Generator & Editor
✅ Video Studio & YouTube Creator
✅ Audio Generator
✅ All Social Media Writers
✅ All SEO Tools & Dashboards
✅ Content Planning & Strategy Tools
📊 Usage Limits:
• 50 AI Text Generations/month (unified across all LLM providers)
• 100,000 tokens per provider (Gemini, OpenAI, Anthropic, Mistral)
• 200 Research Searches/month (Tavily, Serper)
• 500 Neural Searches/month (Exa)
• 50 AI Images/month
• 50 Image Edits/month
• 30 AI Videos/month
• 100 AI Audio Generations/month
• $45 Monthly Cost Cap (protects you from overages)
💡 Perfect for: Individuals, content creators, small teams
```
---
## Implementation Checklist
### Phase 1: Update Code Limits
- [ ] Update `pricing_service.py` Basic tier limits:
- [ ] `ai_text_generation_calls_limit`: 10 → 50
- [ ] `gemini_tokens_limit`: 20,000 → 100,000
- [ ] `openai_tokens_limit`: 20,000 → 100,000
- [ ] `anthropic_tokens_limit`: 20,000 → 100,000
- [ ] `mistral_tokens_limit`: 20,000 → 100,000
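- [ ] `stability_calls_limit`: 5 → 50
- [ ] `image_edit_calls_limit`: 30 → 50
- [ ] `video_calls_limit`: 20 → 30
- [ ] `audio_calls_limit`: 50 → 100
- [ ] `monthly_cost_limit`: 50.0 → 45.0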
- [ ] Run database migration script
- [ ] Test limit enforcement
### Phase 2: Update Pricing Page
- [ ] Update `docs-site/docs/features/subscription/pricing.md`
- [ ] Update frontend pricing page component
- [ ] Add cost transparency section
- [ ] Add tool usage examples
- [ ] Add FAQ section
### Phase 3: Update Documentation
- [ ] Update subscription rule file (`.cursor/rules/subscription.mdc`)
- [ ] Update API documentation
- [ ] Create user-facing pricing guide
### Phase 4: Testing
- [ ] Test all tools with new limits
- [ ] Verify cost calculations
- [ ] Test limit enforcement
- [ ] Test cost cap enforcement
- [ ] Verify pre-flight validation
---
## Cost Calculation Examples
### Example 1: Blog Writer - 1 Blog Post (OSS Models)
```
Research: 3 Exa searches = $0.015
Outline: 1 LLM call (Gemini Flash) = $0.0056
Content: 2 LLM calls (Gemini Flash) = $0.0112
Image: 1 Qwen Image OSS = $0.03 (vs $0.04 Stability)
Total: ~$0.06 per blog post (saved $0.01 with OSS)
```
### Example 2: Story Writer - 1 Story (5 scenes, OSS Models)
```
Outline: 1 LLM call = $0.0056
Script: 1 LLM call = $0.0056
Images: 5 × $0.03 (Qwen Image OSS) = $0.15 (vs $0.20)
Audio: 5 × $0.10 = $0.50
Videos: 5 × $0.25 (WAN 2.5 OSS) = $1.25 (vs $0.50-$2.10)
Total: ~$1.91 per story (higher video cost, but better quality)
```
### Example 3: Podcast Maker - 1 Episode (10 min, 5 scenes, OSS Models)
```
Research: 3 Exa searches = $0.015
Script: 1 LLM call = $0.0056
Outline: 1 LLM call = $0.0056
Audio: 5 × $0.20 (Minimax Speech OSS) = $1.00
Video (optional): 5 × $0.25 (WAN 2.5 OSS) = $1.25
Total: ~$1.03 per podcast (without video)
Total: ~$2.28 per podcast (with video, OSS models)
```
### Example 4: Social Media - 10 Posts (OSS Models)
```
Generation: 10 × 1 LLM call = 10 calls × $0.0056 = $0.056
Images: 10 × $0.03 (Qwen Image OSS) = $0.30 (vs $0.40)
Total: ~$0.36 for 10 posts (saved $0.10 with OSS)
```
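The arithmetic behind these examples can be re-checked with a few lines of Python. This is a minimal sketch that only re-adds the unit prices quoted above. The `UNIT_PRICES` keys and the `workflow_cost` helper are illustrative names (not part of the ALwrity codebase), and the per-search Exa price is derived from "3 Exa searches = $0.015".

```python
# Unit prices (USD) copied or derived from the examples above.
# Key names are illustrative, not identifiers from the codebase.
UNIT_PRICES = {
    "exa_search": 0.005,     # derived: 3 searches = $0.015
    "llm_call": 0.0056,      # Gemini Flash, per call
    "qwen_image": 0.03,
    "podcast_audio": 0.20,   # Minimax Speech, per scene
    "wan_video": 0.25,
}


def workflow_cost(steps):
    """Sum unit price x quantity for each {item: quantity} entry."""
    return sum(UNIT_PRICES[item] * qty for item, qty in steps.items())


blog_post = workflow_cost({"exa_search": 3, "llm_call": 3, "qwen_image": 1})
podcast = workflow_cost({"exa_search": 3, "llm_call": 2, "podcast_audio": 5})
podcast_with_video = podcast + workflow_cost({"wan_video": 5})
social_batch = workflow_cost({"llm_call": 10, "qwen_image": 10})

for label, cost in [
    ("Blog post", blog_post),
    ("Podcast (audio only)", podcast),
    ("Podcast (with video)", podcast_with_video),
    ("10 social posts", social_batch),
]:
    print(f"{label}: ${cost:.2f}")
```

Keeping the prices in one table makes it easy to re-run the estimates whenever a provider price changes.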
---
## Competitive Analysis
### Similar AI Content Platforms
| Platform | Price | Limits | Notes |
|----------|-------|--------|-------|
| **Jasper** | $49/month | 50K words | Text-focused |
| **Copy.ai** | $49/month | Unlimited words | Text-focused |
| **Writesonic** | $19/month | 100K words | Text-focused |
| **ALwrity Basic** | $29/month | 50 LLM calls + media | **Full platform** |
**ALwrity Advantage**:
- Lower price point than Jasper and Copy.ai ($29 vs $49); Writesonic is cheaper at $19 but text-focused
- Includes video, image, audio generation (competitors don't)
- Comprehensive tool suite (not just text)
- Better value proposition
---
## Recommendations Summary
### ✅ **APPROVED: Production-Ready Basic Tier (OSS-Focused)**
**Price**: $29/month ($290/year) - **KEEP**
**Key Changes** (OSS-Focused):
1. **Increase AI Text Generation**: 10 → **50 calls/month**
2. **Increase Token Limits**: 20K → **100K per provider**
3. **Increase Image Generation**: 5 → **50 images/month** (OSS: Qwen Image $0.03)
4. **Increase Image Editing**: 30 → **50 edits/month** (OSS: Qwen Edit $0.02)
5. **Increase Video Generation**: 20 → **30 videos/month** (OSS: WAN 2.5 $0.25)
6. **Increase Audio Generation**: 50 → **100 generations/month** (OSS: Minimax Speech)
7. **Adjust Cost Cap**: $50 → **$45** (aligns with $40-50 target)
8. **Default to OSS Models**: Qwen, WAN 2.5, Minimax Speech (cost-efficient)
**Expected Outcomes**:
- Users can experience all ALwrity features with generous limits
- Sustainable cost structure (~$20-35/user/month average with OSS models)
- Competitive pricing ($29 vs $49 for Jasper and Copy.ai)
- Room for margin ($4-17/user/month average)
- Cost cap ($45) limits worst-case cost per user, which protects against runaway losses (hard limit $40-50)
- **OSS models provide 25-50% cost savings** vs proprietary alternatives
**Risk Level**: 🟢 **LOW** (with cost cap enforcement and OSS model defaults)
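As a rough sanity check on the cost range, the sketch below computes what a user would cost if they exhausted every proposed limit on the default OSS models. It is an estimate under stated assumptions, not a figure from the billing system: the LLM price is the Gemini Flash per-call price from the cost examples, the audio price is taken as the $0.10 to $0.20 per-generation range used in those examples, and search API calls are excluded.

```python
# Basic tier monthly limits proposed in this document.
LIMITS = {"llm_call": 50, "image": 50, "image_edit": 50, "video": 30, "audio": 100}

# Per-unit prices (USD) quoted in this document for the default OSS models.
PRICES = {"llm_call": 0.0056, "image": 0.03, "image_edit": 0.02, "video": 0.25}
AUDIO_PRICE_RANGE = (0.10, 0.20)  # assumption: per generation, from the cost examples
COST_CAP = 45.0


def full_usage_cost(audio_price):
    """Monthly cost if a user exhausts every limit (search APIs excluded)."""
    fixed = sum(LIMITS[item] * price for item, price in PRICES.items())
    return fixed + LIMITS["audio"] * audio_price


low, high = (full_usage_cost(price) for price in AUDIO_PRICE_RANGE)
print(f"Full-usage cost: ${low:.2f} to ${high:.2f} (cap ${COST_CAP:.2f})")
```

Under these assumptions full usage lands at roughly $20 to $30 per month, below the $45 cap, with audio and video accounting for most of the spend.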
---
## Implementation Plan
### Phase 1: Update Pricing Service & Database (Priority: HIGH)
#### 1.1 Update `pricing_service.py` Basic Tier Limits
**File**: `backend/services/subscription/pricing_service.py`
**Changes Required**:
```python
# In initialize_default_plans() method
{
"name": "Basic",
"tier": SubscriptionTier.BASIC,
"price_monthly": 29.0,
"price_yearly": 290.0,
# AI Text Generation (Unified Limit)
"ai_text_generation_calls_limit": 50, # Changed from 10
# Token Limits (Per Provider)
"gemini_tokens_limit": 100000, # Changed from 20,000
"openai_tokens_limit": 100000, # Changed from 20,000
"anthropic_tokens_limit": 100000, # Changed from 20,000
"mistral_tokens_limit": 100000, # Changed from 20,000
# Search APIs (Keep existing)
"tavily_calls_limit": 200,
"serper_calls_limit": 200,
"metaphor_calls_limit": 100,
"firecrawl_calls_limit": 100,
"exa_calls_limit": 500,
# Media Generation (OSS Models via WaveSpeed)
"stability_calls_limit": 50, # Changed from 5 (now includes WaveSpeed OSS)
"image_edit_calls_limit": 50, # Changed from 30
"video_calls_limit": 30, # Changed from 20
"audio_calls_limit": 100, # Changed from 50
# Cost Protection
"monthly_cost_limit": 45.0, # Changed from 50.0
}
```
**Action Items**:
- [ ] Update `initialize_default_plans()` method in `pricing_service.py`
- [ ] Run database migration to update existing Basic tier subscriptions
- [ ] Test limit enforcement with new values
- [ ] Verify cost calculations reflect OSS model pricing
#### 1.2 Update WaveSpeed Model Pricing in `pricing_service.py`
**File**: `backend/services/subscription/pricing_service.py`
**Changes Required**:
```python
# In initialize_default_pricing() method, update/add WaveSpeed OSS model pricing:
# Image Generation (OSS Models via WaveSpeed)
{
"provider": APIProvider.IMAGE,
"model_name": "qwen-image",
"cost_per_request": 0.03, # OSS model via WaveSpeed
"description": "WaveSpeed Qwen Image (OSS) - Fast generation"
},
{
"provider": APIProvider.IMAGE,
"model_name": "ideogram-v3-turbo",
"cost_per_request": 0.05, # OSS model via WaveSpeed
"description": "WaveSpeed Ideogram V3 Turbo (OSS) - Photorealistic"
},
# Image Editing (OSS Models via WaveSpeed)
{
"provider": APIProvider.IMAGE_EDIT,
"model_name": "qwen-edit",
"cost_per_request": 0.02, # OSS model via WaveSpeed
"description": "WaveSpeed Qwen Image Edit (OSS) - Budget editing"
},
{
"provider": APIProvider.IMAGE_EDIT,
"model_name": "qwen-edit-plus",
"cost_per_request": 0.02, # OSS model via WaveSpeed
"description": "WaveSpeed Qwen Image Edit Plus (OSS) - Multi-image"
},
{
"provider": APIProvider.IMAGE_EDIT,
"model_name": "flux-kontext-pro",
"cost_per_request": 0.04, # OSS model via WaveSpeed
"description": "WaveSpeed FLUX Kontext Pro (OSS) - Professional"
},
# Video Generation (OSS Models via WaveSpeed)
{
"provider": APIProvider.VIDEO,
"model_name": "wan-2.5",
"cost_per_request": 0.25, # OSS model via WaveSpeed (~5 seconds)
"description": "WaveSpeed WAN 2.5 (OSS) - Text-to-Video, Image-to-Video"
},
{
"provider": APIProvider.VIDEO,
"model_name": "seedance-1.5-pro",
"cost_per_request": 0.40, # OSS model via WaveSpeed (~5 seconds)
"description": "WaveSpeed Seedance 1.5 Pro (OSS) - Longer duration"
},
# Audio Generation (OSS Models via WaveSpeed)
{
"provider": APIProvider.AUDIO,
"model_name": "minimax-speech-02-hd",
"cost_per_input_token": 0.00005, # $0.05 per 1K chars
"cost_per_output_token": 0.0,
"cost_per_request": 0.0,
"description": "WaveSpeed Minimax Speech 02 HD (OSS) - High-quality TTS"
},
```
**Action Items**:
- [ ] Add WaveSpeed OSS model pricing entries
- [ ] Update default model selection logic to prefer OSS models
- [ ] Test cost calculation with OSS models
- [ ] Verify pricing accuracy against WaveSpeed API documentation
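For testing the cost calculation, it helps to see how a pricing entry turns into a charge. The sketch below is a simplified stand-in, assuming the cost of a call is the flat per-request fee plus any metered per-unit fees. `estimate_cost` is a hypothetical helper, not the actual method in `pricing_service.py`.

```python
def estimate_cost(entry, input_units=0, output_units=0):
    """Cost of one call: flat per-request fee plus metered per-unit fees."""
    return (
        entry.get("cost_per_request", 0.0)
        + entry.get("cost_per_input_token", 0.0) * input_units
        + entry.get("cost_per_output_token", 0.0) * output_units
    )


# Entries trimmed down from the pricing list above.
qwen_image = {"model_name": "qwen-image", "cost_per_request": 0.03}
minimax_tts = {
    "model_name": "minimax-speech-02-hd",
    "cost_per_input_token": 0.00005,  # $0.05 per 1K chars
    "cost_per_output_token": 0.0,
    "cost_per_request": 0.0,
}

print(estimate_cost(qwen_image))                      # one image
print(estimate_cost(minimax_tts, input_units=1000))   # 1,000 characters of TTS
```

The audio entry is metered per character rather than per request, which is why its `cost_per_request` is zero.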
#### 1.3 Update Default Model Selection Logic
**Files**:
- `backend/services/llm_providers/main_image_generation.py`
- `backend/services/image_studio/create_service.py`
- `backend/services/image_studio/edit_service.py`
- `backend/services/video_studio/video_service.py`
- `backend/services/audio_generation/audio_service.py`
**Changes Required**:
- Default image generation to `qwen-image` (OSS) instead of Stability
- Default image editing to `qwen-edit` (OSS) instead of Stability
- Default video generation to `wan-2.5` (OSS) instead of HuggingFace
- Default audio generation to `minimax-speech-02-hd` (OSS)
**Action Items**:
- [ ] Update `get_default_provider()` methods to prefer WaveSpeed OSS models
- [ ] Update model selection UI to show OSS models as default/recommended
- [ ] Add cost comparison tooltips showing OSS model savings
- [ ] Test all tools with OSS model defaults
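One possible shape for the default selection is a lookup table with an explicit fallback. This is a sketch only: `OSS_DEFAULTS` and `pick_default_model` are hypothetical names, and the real `get_default_provider()` methods may be structured differently.

```python
# Preferred default per operation, taken from the list above.
OSS_DEFAULTS = {
    "image": "qwen-image",
    "image_edit": "qwen-edit",
    "video": "wan-2.5",
    "audio": "minimax-speech-02-hd",
}


def pick_default_model(operation, available_models, fallback=None):
    """Prefer the OSS default; use the fallback only if the default is unavailable."""
    preferred = OSS_DEFAULTS.get(operation)
    if preferred in available_models:
        return preferred
    if fallback is not None and fallback in available_models:
        return fallback
    raise LookupError(f"no usable model for operation {operation!r}")
```

Raising when neither model is available keeps a misconfiguration visible instead of silently routing to an unexpected (and possibly more expensive) provider.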
### Phase 2: Update Frontend & Documentation (Priority: HIGH)
#### 2.1 Update Pricing Page
**File**: `docs-site/docs/features/subscription/pricing.md`
**Changes Required**:
- Update Basic tier limits to reflect new values (50 images, 50 edits, 30 videos, 100 audio)
- Add OSS model information and cost savings messaging
- Update cost examples to use OSS model pricing
- Add FAQ about OSS models and cost efficiency
**Action Items**:
- [ ] Update pricing page markdown
- [ ] Update frontend pricing component (if exists)
- [ ] Add OSS model badges/indicators
- [ ] Add cost comparison table (OSS vs proprietary)
#### 2.2 Update Subscription Context & Components
**Files**:
- `frontend/src/contexts/SubscriptionContext.tsx`
- `frontend/src/components/billing/EnhancedBillingDashboard.tsx`
- `frontend/src/components/shared/UsageDashboard.tsx`
**Changes Required**:
- Display OSS model indicators in usage dashboard
- Show cost savings from using OSS models
- Update limit displays to show new Basic tier limits
- Add tooltips explaining OSS model benefits
**Action Items**:
- [ ] Update limit displays in billing dashboard
- [ ] Add OSS model indicators in cost breakdown
- [ ] Update usage statistics to reflect new limits
- [ ] Test UI with new limit values
### Phase 3: Testing & Validation (Priority: CRITICAL)
#### 3.1 Limit Enforcement Testing
**Test Cases**:
- [ ] Test 50 AI text generation calls limit
- [ ] Test 50 image generation limit (OSS models)
- [ ] Test 50 image editing limit (OSS models)
- [ ] Test 30 video generation limit (OSS models)
- [ ] Test 100 audio generation limit (OSS models)
- [ ] Test $45 monthly cost cap enforcement
- [ ] Test pre-flight validation with new limits
- [ ] Test limit exceeded error messages
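The test cases above exercise two independent checks: a per-operation call limit and the monthly cost cap. A minimal sketch of a pre-flight check that covers both is shown below. `PreflightResult` and `preflight_check` are hypothetical names, and the real validation may consider more inputs (for example token limits).

```python
from dataclasses import dataclass


@dataclass
class PreflightResult:
    allowed: bool
    reason: str = ""


def preflight_check(calls_used, calls_limit, cost_so_far, estimated_cost, cost_cap=45.0):
    """Reject a request that would exceed the call limit or the monthly cost cap."""
    if calls_used >= calls_limit:
        return PreflightResult(False, f"call limit reached ({calls_used}/{calls_limit})")
    if cost_so_far + estimated_cost > cost_cap:
        return PreflightResult(False, f"monthly cost cap of ${cost_cap:.2f} would be exceeded")
    return PreflightResult(True)


# A video request ($0.25) that would push a user from $44.90 past the $45 cap.
print(preflight_check(calls_used=12, calls_limit=30, cost_so_far=44.90, estimated_cost=0.25))
```

Checking the estimated cost before the provider call is made is what lets the cap act as a hard limit rather than a report after the fact.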
#### 3.2 Cost Calculation Testing
**Test Cases**:
- [ ] Verify Qwen Image cost: $0.03 per image
- [ ] Verify Qwen Edit cost: $0.02 per edit
- [ ] Verify WAN 2.5 video cost: $0.25 per video
- [ ] Verify Minimax Speech cost: $0.05 per 1K chars
- [ ] Test cost aggregation across all operations
- [ ] Test cost cap enforcement at $45
- [ ] Verify cost display in billing dashboard
#### 3.3 OSS Model Integration Testing
**Test Cases**:
- [ ] Test Qwen Image generation via WaveSpeed
- [ ] Test Qwen Edit editing via WaveSpeed
- [ ] Test WAN 2.5 video generation via WaveSpeed
- [ ] Test Minimax Speech audio generation via WaveSpeed
- [ ] Verify default model selection uses OSS models
- [ ] Test model fallback if OSS model unavailable
- [ ] Verify cost tracking for OSS models
### Phase 4: Database Migration (Priority: HIGH)
#### 4.1 Create Migration Script
**File**: `backend/database/migrations/update_basic_tier_limits_oss.py`
**Script Requirements**:
```python
"""
Migration: Update Basic Tier Limits for OSS-Focused Pricing Strategy
- Increase AI text generation: 10 → 50
- Increase token limits: 20K → 100K per provider
- Increase image generation: 5 → 50
- Increase image editing: 30 → 50
- Increase video generation: 20 → 30
- Increase audio generation: 50 → 100
- Adjust cost cap: $50 → $45
"""
def upgrade():
    # Update SubscriptionPlan for Basic tier
    # Update existing UserSubscription records
    # Clear pricing service cache
    pass


def downgrade():
    # Revert to previous limits if needed
    pass
```
**Action Items**:
- [ ] Create migration script
- [ ] Test migration on staging database
- [ ] Backup production database before migration
- [ ] Run migration during maintenance window
- [ ] Verify all subscriptions updated correctly
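To make the stub more concrete, here is one way the plan update could look against a SQLite database. This is a sketch under assumptions: the table name `subscription_plans` and the tier value `"basic"` are hypothetical, the column names are the limit fields listed in section 1.1, and the token limits and `UserSubscription` updates are left out for brevity.

```python
import sqlite3

NEW_LIMITS = {
    "ai_text_generation_calls_limit": 50,
    "stability_calls_limit": 50,
    "image_edit_calls_limit": 50,
    "video_calls_limit": 30,
    "audio_calls_limit": 100,
    "monthly_cost_limit": 45.0,
}
OLD_LIMITS = {
    "ai_text_generation_calls_limit": 10,
    "stability_calls_limit": 5,
    "image_edit_calls_limit": 30,
    "video_calls_limit": 20,
    "audio_calls_limit": 50,
    "monthly_cost_limit": 50.0,
}


def _apply(conn, limits):
    # Column names come from the fixed dicts above, never from user input.
    assignments = ", ".join(f"{column} = ?" for column in limits)
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            f"UPDATE subscription_plans SET {assignments} WHERE tier = ?",
            (*limits.values(), "basic"),
        )
    return cursor.rowcount


def upgrade(conn):
    """Apply the new Basic tier limits; returns the number of rows updated."""
    return _apply(conn, NEW_LIMITS)


def downgrade(conn):
    """Restore the previous Basic tier limits."""
    return _apply(conn, OLD_LIMITS)
```

Keeping the old values next to the new ones makes `downgrade()` a mirror of `upgrade()`, which is convenient when rehearsing the migration on staging.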
### Phase 5: Monitoring & Adjustment (Priority: MEDIUM)
#### 5.1 Set Up Monitoring
**Metrics to Track**:
- Average cost per user per month
- Users hitting $45 cost cap
- Users hitting individual limits
- OSS model usage vs proprietary model usage
- Cost savings from OSS models
**Action Items**:
- [ ] Set up cost monitoring dashboard
- [ ] Create alerts for cost cap breaches
- [ ] Track OSS model adoption rate
- [ ] Monitor user satisfaction with limits
#### 5.2 Adjustment Plan
**Triggers for Adjustment**:
- If average cost > $35/user: Consider reducing limits
- If >15% users hit cost cap: Consider increasing cost cap to $50
- If <20% users use video/audio: Consider reducing those limits
- If OSS models are unavailable: Fall back to proprietary models
**Action Items**:
- [ ] Define adjustment criteria
- [ ] Create adjustment workflow
- [ ] Plan communication strategy for limit changes
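The triggers in section 5.2 are simple threshold checks, so they can be encoded directly once the metrics exist. The sketch below uses the thresholds from 5.2. The function name and the way metrics are passed in (shares as fractions between 0 and 1) are assumptions for illustration.

```python
def adjustment_actions(avg_cost_per_user, share_hitting_cap, share_using_video_audio,
                       oss_available=True):
    """Return the adjustments suggested by the section 5.2 triggers."""
    actions = []
    if avg_cost_per_user > 35.0:
        actions.append("consider reducing limits")
    if share_hitting_cap > 0.15:
        actions.append("consider increasing cost cap to $50")
    if share_using_video_audio < 0.20:
        actions.append("consider reducing video/audio limits")
    if not oss_available:
        actions.append("fall back to proprietary models")
    return actions


print(adjustment_actions(avg_cost_per_user=36.0, share_hitting_cap=0.10,
                         share_using_video_audio=0.50))
```

Returning a list of suggestions rather than acting automatically keeps a human in the loop for limit changes that affect paying users.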
---
## Next Steps (Priority Order)
1. **CRITICAL**: Update `pricing_service.py` with new Basic tier limits
2. **CRITICAL**: Add WaveSpeed OSS model pricing to `pricing_service.py`
3. **HIGH**: Update default model selection to prefer OSS models
4. **HIGH**: Create and run database migration
5. **HIGH**: Update pricing page documentation
6. **HIGH**: Test limit enforcement and cost calculations
7. **MEDIUM**: Update frontend components with new limits
8. **MEDIUM**: Set up monitoring and alerts
9. **LOW**: Add OSS model indicators to UI
---
## Monitoring & Adjustment Plan
### Key Metrics to Track
- Average LLM calls per user per month
- Average media generation per user per month
- Average cost per user per month
- Users hitting cost cap
- Users hitting individual limits
### Adjustment Triggers
- **If average cost > $25/user**: Consider reducing limits
- **If >20% users hit cost cap**: Consider increasing cost cap
- **If <10% users use video/audio**: Consider reducing those limits
- **If churn rate high**: Consider increasing limits
### Review Schedule
- **Week 1-2**: Daily monitoring
- **Month 1**: Weekly review
- **Month 2-3**: Bi-weekly review
- **Month 4+**: Monthly review
---
## Conclusion
The proposed Basic tier limits (OSS-Focused) provide:
- **Access to all ALwrity features** with generous limits
- **Sustainable cost structure** using OSS models (25-50% savings)
- **Competitive pricing** ($29 vs $49 for Jasper and Copy.ai)
- **Protection against losses** ($45 cost cap, hard limit $40-50)
- **Room for growth** (can adjust based on usage)
- **OSS-first strategy** (Qwen, FLUX, Ideogram, WAN 2.5, Minimax Speech)
- **Maximum user value** while staying within cost constraints
**Key Advantages of OSS-Focused Strategy**:
1. **Cost Efficiency**: 25-50% cost savings vs proprietary models
2. **Better Limits**: Can offer more generations due to lower costs
3. **User Value**: More value for the same $29/month price
4. **Sustainability**: Lower costs = better margins = sustainable business
5. **Flexibility**: Can adjust limits based on actual usage patterns
**Recommendation**: **APPROVE** for production launch with OSS-focused strategy.
**Confidence Level**: 🟢 **HIGH** (with proper monitoring, cost cap enforcement, and OSS model defaults)
**Risk Mitigation**:
- $45 cost cap protects against losses (hard limit $40-50)
- OSS model defaults ensure cost efficiency
- Monitoring allows quick adjustment if needed
- Realistic usage assumptions (60-70% of limits)


@@ -0,0 +1,175 @@
# Provider Tracking Improvement
## Problem Statement
The billing dashboard's API Usage Logs were showing generic provider names (e.g., "Video", "Audio", "Stability") instead of the actual providers (WaveSpeed, Google/Gemini, HuggingFace). This made it difficult to:
- Understand which providers are actually being used
- Analyze costs by provider
- Make informed decisions about provider usage
- Track provider-specific trends and patterns
## Solution
Added `actual_provider_name` field to track the real provider behind generic enum values, with intelligent detection based on model names and endpoints.
## Implementation
### 1. Database Model Update
**File**: `backend/models/subscription_models.py`
Added `actual_provider_name` field to `APIUsageLog`:
```python
actual_provider_name = Column(String(50), nullable=True) # e.g., "wavespeed", "google", "huggingface"
```
### 2. Provider Detection Utility
**File**: `backend/services/subscription/provider_detection.py`
Created intelligent provider detection function that identifies actual providers from:
- Model names (e.g., "alibaba/wan-2.5/text-to-video" → "wavespeed")
- Endpoints (e.g., "/video-generation/wavespeed" → "wavespeed")
- Provider enum values (with fallback logic)
**Supported Providers**:
- **WaveSpeed**: OSS models (Qwen, Ideogram, FLUX, WAN 2.5, Minimax Speech)
- **Google**: Gemini models (gemini-2.5-flash, gemini-2.5-pro, etc.)
- **HuggingFace**: GPT-OSS-120B, Tencent HunyuanVideo, etc.
- **Stability AI**: Stable Diffusion models
- **OpenAI**: GPT-4o, GPT-4o-mini, TTS-1
- **Anthropic**: Claude 3.5 Sonnet
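To illustrate the kind of heuristic involved, here is a simplified sketch of endpoint-first, then model-name, then enum-fallback detection. It is not the contents of `provider_detection.py`: the function name, the keyword rules, and the use of plain strings instead of the `APIProvider` enum are all assumptions made for the example.

```python
KNOWN_PROVIDERS = ("wavespeed", "huggingface", "google", "stability", "openai", "anthropic")

# Ordered (substring, provider) rules; the first match wins.
# The keyword list is an illustrative guess, not the project's real table.
MODEL_RULES = [
    ("gpt-oss", "huggingface"),  # checked before "gpt-4" and friends
    ("hunyuan", "huggingface"),
    ("gemini", "google"),
    ("claude", "anthropic"),
    ("gpt-4", "openai"),
    ("tts-1", "openai"),
    ("qwen", "wavespeed"),
    ("wan-2.5", "wavespeed"),
    ("ideogram", "wavespeed"),
    ("flux", "wavespeed"),
    ("minimax", "wavespeed"),
    ("stable-diffusion", "stability"),
]

# Used when neither the endpoint nor the model name is conclusive.
ENUM_FALLBACKS = {"gemini": "google", "mistral": "huggingface"}


def detect_provider_sketch(provider_enum, model_name=None, endpoint=None):
    """Guess the actual provider from endpoint, then model name, then enum."""
    endpoint = (endpoint or "").lower()
    for provider in KNOWN_PROVIDERS:
        if provider in endpoint:
            return provider
    model = (model_name or "").lower()
    for keyword, provider in MODEL_RULES:
        if keyword in model:
            return provider
    enum_value = provider_enum.lower()
    return ENUM_FALLBACKS.get(enum_value, enum_value)
```

Rule order matters: a model such as `openai/gpt-oss-120b:groq` contains "openai" in its name but is served through HuggingFace, so the more specific `gpt-oss` rule has to come first.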
### 3. Service Updates
Updated all media generation services to use provider detection:
- **Video Generation** (`backend/services/llm_providers/main_video_generation.py`)
- **Image Generation** (`backend/services/llm_providers/main_image_generation.py`)
- **Audio Generation** (`backend/services/llm_providers/main_audio_generation.py`)
- **Usage Tracking Service** (`backend/services/subscription/usage_tracking_service.py`)
All services now automatically detect and store the actual provider name when tracking API usage.
### 4. API Endpoint Update
**File**: `backend/api/subscription_api.py`
Updated `/api/subscription/usage-logs` endpoint to:
- Return `actual_provider_name` in response
- Use `actual_provider_name` for display if available
- Fall back to the enum value, with special handling for MISTRAL → HuggingFace
### 5. Frontend Updates
**Files**:
- `frontend/src/types/billing.ts` - Added `actual_provider_name` to `UsageLog` interface
- `frontend/src/components/billing/UsageLogsTable.tsx` - Display actual provider name prominently
**UI Display**:
- Shows actual provider name (e.g., "WaveSpeed") in bold
- Shows generic enum value (e.g., "video") in smaller text below if different
- Example: "**WaveSpeed**" (video)
### 6. Database Migration
**File**: `backend/scripts/add_actual_provider_name_column.py`
Migration script that:
- Adds `actual_provider_name` column to `api_usage_logs` table
- Backfills existing records with detected provider names
- Safe to run multiple times (checks if column exists)
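The "safe to run multiple times" property comes from checking for the column before altering the table. A minimal sketch of that check is below, assuming a SQLite database. The helper name is hypothetical, the actual script may differ, and the backfill step is omitted.

```python
import sqlite3


def add_column_if_missing(conn, table="api_usage_logs",
                          column="actual_provider_name", ddl_type="VARCHAR(50)"):
    """Add the column only if it does not exist yet; returns True if it was added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    with conn:  # commit the schema change
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type}")
    return True
```

Calling the helper a second time is a no-op, so the migration can be re-run after a partial failure without manual cleanup.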
## Usage
### Running the Migration
```bash
cd backend
python scripts/add_actual_provider_name_column.py
```
### Provider Detection Examples
```python
from services.subscription.provider_detection import detect_actual_provider
from models.subscription_models import APIProvider
# Video generation - WaveSpeed
provider = detect_actual_provider(
provider_enum=APIProvider.VIDEO,
model_name="alibaba/wan-2.5/text-to-video",
endpoint="/video-generation/wavespeed"
)
# Returns: "wavespeed"
# Image generation - WaveSpeed OSS
provider = detect_actual_provider(
provider_enum=APIProvider.STABILITY,
model_name="qwen-image",
endpoint="/image-generation/wavespeed"
)
# Returns: "wavespeed"
# Audio generation - WaveSpeed
provider = detect_actual_provider(
provider_enum=APIProvider.AUDIO,
model_name="minimax/speech-02-hd",
endpoint="/audio-generation/wavespeed"
)
# Returns: "wavespeed"
# LLM - Google Gemini
provider = detect_actual_provider(
provider_enum=APIProvider.GEMINI,
model_name="gemini-2.5-flash"
)
# Returns: "google"
# LLM - HuggingFace (MISTRAL enum)
provider = detect_actual_provider(
provider_enum=APIProvider.MISTRAL,
model_name="openai/gpt-oss-120b:groq"
)
# Returns: "huggingface"
```
## Benefits
1. **Accurate Provider Tracking**: Know exactly which providers (WaveSpeed, Google, HuggingFace) are being used
2. **Better Cost Analysis**: Analyze costs by actual provider, not generic categories
3. **Usage Insights**: Understand provider usage patterns and trends
4. **Informed Decisions**: Make data-driven decisions about provider selection
5. **Backward Compatible**: Existing records are backfilled, new records automatically tracked
## Future Enhancements
1. **Provider Analytics Dashboard**: Visualize usage and costs by actual provider
2. **Provider Recommendations**: Suggest provider switches based on cost/performance
3. **Provider Cost Comparison**: Compare costs across providers for similar operations
4. **Provider Performance Metrics**: Track response times, success rates by provider
## Testing
After running the migration, verify:
1. **Database**: Check that `actual_provider_name` column exists and has values
```sql
SELECT provider, actual_provider_name, model_used, COUNT(*)
FROM api_usage_logs
GROUP BY provider, actual_provider_name, model_used;
```
2. **API**: Check that `/api/subscription/usage-logs` returns `actual_provider_name`
```bash
curl "http://localhost:8000/api/subscription/usage-logs?user_id=YOUR_USER_ID"
```
3. **UI**: Check that billing dashboard shows actual provider names in Usage Logs table
## Notes
- The `provider` enum field is still used for limit enforcement (VIDEO, AUDIO, STABILITY, etc.)
- The `actual_provider_name` field is for display and analytics only
- Detection is based on heuristics (model names, endpoints) - may need refinement for edge cases
- Existing records are backfilled, but may not be 100% accurate if model names are ambiguous


@@ -0,0 +1,281 @@
# Renewal History Retention Policy Implementation
## Overview
Implemented a tiered retention policy for subscription renewal history records. The policy keeps storage efficient while preserving critical payment and subscription data for tax/audit compliance.
## Retention Policy
### Tiered Retention Strategy
```
┌─────────────────────────────────────────────────────────────┐
│ Retention Policy: Subscription Renewal History │
├─────────────────────────────────────────────────────────────┤
│ │
│ 0-12 months: Full records with usage snapshots │
│ - Complete usage_before_renewal JSON │
│ - All subscription and payment data │
│ │
│ 12-24 months: Compressed records │
│ - Compressed usage snapshot (key metrics) │
│ - All subscription and payment data │
│ │
│ 24-84 months: Summary records │
│ - No usage snapshots │
│ - All subscription and payment data │
│ │
│ 84+ months: Archive-ready records │
│ - No usage snapshots │
│ - Payment data preserved (tax/audit) │
│ │
│ Payment Data: Preserved indefinitely (compliance) │
└─────────────────────────────────────────────────────────────┘
```
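The tiers in the diagram map to three day-count thresholds (365, 730, and 2555 days). The sketch below shows the classification as a small function. `retention_tier` is a hypothetical helper, and treating each boundary as inclusive is an assumption, since the diagram does not say which tier a record exactly 12 months old falls into.

```python
COMPRESS_SNAPSHOT_DAYS = 365   # 12 months
SUMMARY_RECORDS_DAYS = 730     # 24 months
ARCHIVE_DAYS = 2555            # 84 months / 7 years


def retention_tier(age_days):
    """Map a record's age in days to its retention tier."""
    if age_days >= ARCHIVE_DAYS:
        return "archive"
    if age_days >= SUMMARY_RECORDS_DAYS:
        return "summary"
    if age_days >= COMPRESS_SNAPSHOT_DAYS:
        return "compressed"
    return "full"


for age in (100, 400, 760, 2600):
    print(age, retention_tier(age))
```

Checking the oldest tier first keeps the function correct even if the thresholds are later changed, as long as they stay in ascending order.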
## Implementation Details
### New Service
**File**: `backend/services/subscription/renewal_history_retention.py`
**Class**: `RenewalHistoryRetentionService`
### Key Methods
#### 1. `check_and_apply_retention(user_id: str)`
Main method that applies retention policies automatically.
**Process**:
1. Identifies records in each retention tier
2. Compresses usage snapshots for 12-24 month old records
3. Removes usage snapshots for 24-84 month old records
4. Ensures 84+ month old records have no snapshots
5. Returns statistics about processed records
**Returns**:
```python
{
'retention_applied': True,
'total_records': 150,
'compressed_count': 10,
'summarized_count': 5,
'archived_count': 2,
'total_processed': 17,
'message': 'Processed 17 records: 10 compressed, 5 summarized, 2 archived'
}
```
#### 2. `_compress_usage_snapshots(records)`
Compresses detailed usage snapshots to key metrics only.
**Before Compression**:
```json
{
"total_calls": 1500,
"total_tokens": 500000,
"total_cost": 45.50,
"provider_breakdown": {...},
"detailed_metrics": {...},
"trends": {...}
}
```
**After Compression**:
```json
{
"total_calls": 1500,
"total_tokens": 500000,
"total_cost": 45.50,
"compressed_at": "2025-01-15T10:30:00",
"note": "Usage snapshot compressed after 12 months"
}
```
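The before/after pair above amounts to keeping three totals and adding two bookkeeping fields. A sketch of that transformation on a single snapshot dict follows. It is illustrative only: `compress_snapshot` and `KEY_METRICS` are hypothetical names, and the real `_compress_usage_snapshots()` operates on database records.

```python
from datetime import datetime

KEY_METRICS = ("total_calls", "total_tokens", "total_cost")


def compress_snapshot(snapshot, now=None):
    """Keep only the key totals and record when compression happened."""
    now = now or datetime.now()
    compressed = {key: snapshot[key] for key in KEY_METRICS if key in snapshot}
    compressed["compressed_at"] = now.replace(microsecond=0).isoformat()
    compressed["note"] = "Usage snapshot compressed after 12 months"
    return compressed
```

Because the function builds a new dict instead of deleting keys in place, any field not listed in `KEY_METRICS` is dropped automatically, including fields added to snapshots in the future.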
#### 3. `_create_summary_records(records)`
Removes usage snapshots entirely, keeping only subscription and payment data.
#### 4. `_mark_for_archive(records)`
Ensures very old records have no snapshots (should already be done by previous stages).
#### 5. `get_retention_stats(user_id: str)`
Returns statistics about records in each retention tier.
**Returns**:
```python
{
'total_records': 150,
'recent_records': 120, # 0-12 months
'records_to_compress': 15, # 12-24 months
'records_to_summarize': 10, # 24-84 months
'records_to_archive': 5, # 84+ months
'retention_policy': {
'compress_after_days': 365,
'summarize_after_days': 730,
'archive_after_days': 2555
}
}
```
## Integration
### Automatic Application
Retention is automatically applied when fetching renewal history:
```python
# In backend/api/subscription/routes/subscriptions.py
@router.get("/renewal-history/{user_id}")
async def get_renewal_history(...):
    # Apply retention before fetching
    retention_service = RenewalHistoryRetentionService(db)
    retention_service.check_and_apply_retention(user_id)
    # ... fetch and return records
```
### New Endpoint
Added endpoint to get retention statistics:
```
GET /api/subscription/renewal-history/{user_id}/retention-stats
```
Returns breakdown of records by retention tier.
## Configuration
### Retention Periods
Currently set to:
- **Compress after**: 365 days (12 months)
- **Summarize after**: 730 days (24 months)
- **Archive after**: 2555 days (84 months / 7 years)
To change:
```python
# In RenewalHistoryRetentionService class
COMPRESS_SNAPSHOT_DAYS = 365 # Change this value
SUMMARY_RECORDS_DAYS = 730 # Change this value
ARCHIVE_DAYS = 2555 # Change this value
```
## Data Preservation
### What's Preserved
**Always Preserved**:
- Payment amount
- Payment status
- Payment date
- Stripe invoice ID
- Plan name and tier
- Billing cycle
- Period start/end dates
- Renewal type and count
**Preserved for 12-24 months**:
- Compressed usage snapshot (key metrics only)
**Removed after 12 months**:
- Detailed usage breakdowns
- Provider-specific metrics
- Trend data
- Detailed usage snapshots
### Compliance
- **Payment Data**: Preserved indefinitely for tax/audit compliance
- **Subscription Data**: Preserved indefinitely for billing history
- **Usage Snapshots**: Compressed after 12 months and removed after 24 months (not required for compliance)
## Benefits
1. **Storage Efficiency**: Reduces database size by removing large JSON snapshots
2. **Compliance**: Preserves all payment data for tax/audit requirements
3. **Performance**: Smaller records = faster queries
4. **Automatic**: No manual intervention required
5. **Gradual**: Applies retention in stages, not all at once
## Example Scenarios
### Scenario 1: New User (0-12 months)
- 5 renewal records, all recent
- **Result**: All records kept with full usage snapshots
### Scenario 2: Active User (12-24 months)
- 20 renewal records
- 3 records are 13 months old
- **Result**: 3 records get compressed snapshots, 17 remain full
### Scenario 3: Long-term User (24+ months)
- 50 renewal records
- 10 records are 25 months old
- **Result**: 10 records have snapshots removed, payment data preserved
### Scenario 4: Very Old Records (84+ months)
- 100 renewal records
- 5 records are 7+ years old
- **Result**: 5 records have no snapshots, ready for archive
## Testing
### Manual Testing
1. **Create test records with old timestamps**:
```sql
UPDATE subscription_renewal_history
SET created_at = datetime('now', '-400 days')
WHERE id IN (
    SELECT id FROM subscription_renewal_history
    WHERE user_id = 'test_user'
    LIMIT 5
);
```
2. **Trigger retention** by calling `/api/subscription/renewal-history/{user_id}`
3. **Verify**:
- Records 12-24 months old have compressed snapshots
- Records 24+ months old have no snapshots
- Payment data is preserved in all records
### Expected Behavior
- Records are processed automatically on history queries
- Usage snapshots are compressed/removed based on age
- Payment data is never removed
- All subscription data is preserved
## Monitoring
The service logs detailed information:
```
[RenewalRetention] Applied retention for user {user_id}: 10 compressed, 5 summarized, 2 archived
```
## Future Enhancements
1. **Archive Table**: Move very old records to separate archive table
2. **Scheduled Jobs**: Run retention on a schedule instead of on-demand
3. **Configurable Periods**: Make retention periods configurable via environment variables
4. **Metrics Dashboard**: Show retention statistics in admin dashboard
5. **Export Functionality**: Allow export of old records before archive
## Backward Compatibility
✅ **Fully backward compatible**:
- Existing records are processed automatically
- No breaking changes to API responses
- Old records without snapshots are handled correctly
- Payment data is always preserved
## Related Files
- `backend/services/subscription/renewal_history_retention.py` - Main implementation
- `backend/api/subscription/routes/subscriptions.py` - API endpoint integration
- `frontend/src/components/billing/SubscriptionRenewalHistory.tsx` - Frontend display
- `docs/Billing_Subscription/LOG_STORAGE_AND_RETENTION_REVIEW.md` - Review document


@@ -0,0 +1,206 @@
# Time-Based Retention Implementation for API Usage Logs
## Overview
Implemented time-based retention for API usage logs in addition to the existing count-based retention. This ensures that logs older than a specified retention period are automatically aggregated, regardless of the total log count.
## Implementation Details
### Changes Made
**File**: `backend/services/subscription/log_wrapping_service.py`
#### 1. Added Time-Based Retention Constant
```python
RETENTION_DAYS = 90 # Time-based retention: aggregate logs older than 90 days
```
#### 2. Enhanced `check_and_wrap_logs()` Method
**Before**: Only checked count-based limit (5,000 logs)
**After**: Checks both:
- **Count-based**: If user has more than 5,000 logs
- **Time-based**: If user has logs older than 90 days
**Key Features**:
- Detects logs older than retention period
- Excludes already aggregated logs from time-based checks
- Provides detailed trigger reasons in response
- Reports how many old logs were aggregated
#### 3. Enhanced `_wrap_old_logs()` Method
**New Parameters**:
- `time_based`: Boolean flag to prioritize time-based retention
**Aggregation Strategy**:
1. **Time-based mode**: Aggregates ALL logs older than 90 days (excluding already aggregated)
2. **Count-based mode**: Aggregates oldest logs beyond 4,000 limit
3. **Combined mode**: When the count-based trigger is primary, logs older than 90 days are also included, so very old logs are not kept just because they fall within the count limit
**Key Improvements**:
- Prevents re-aggregation of already aggregated logs (`endpoint != '[AGGREGATED]'`)
- Prioritizes old logs even in count-based mode
- Better logging for debugging and monitoring
## How It Works
### Automatic Triggering
The log wrapping is automatically triggered on every `/usage-logs` API call:
```python
# In backend/api/subscription/routes/logs.py
wrapping_service = LogWrappingService(db)
wrap_result = wrapping_service.check_and_wrap_logs(user_id)
```
### Retention Logic Flow
```
1. Check total log count
├─ If > 5,000 → Count-based trigger
└─ If ≤ 5,000 → Continue
2. Check for old logs (> 90 days)
├─ If found → Time-based trigger
└─ If none → No action needed
3. If either trigger active:
├─ Time-based: Aggregate ALL logs older than 90 days
├─ Count-based: Aggregate oldest logs beyond 4,000 limit
└─ Combined: Merge both sets (prioritize old logs)
4. Create aggregated records
├─ Group by provider + billing period
├─ Preserve: costs, tokens, counts, success rates
└─ Delete individual logs that were aggregated
```
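Step 4 of the flow can be sketched as a group-by over plain dicts. This is a simplified illustration rather than the service's code: the field names (`billing_period`, `cost`, `tokens`, `success`) and the helper name are assumptions, and only the `[AGGREGATED]` endpoint marker is taken from this document.

```python
from collections import defaultdict


def aggregate_logs(logs):
    """Collapse log dicts into one summary per (provider, billing_period)."""
    groups = defaultdict(list)
    for log in logs:
        groups[(log["provider"], log["billing_period"])].append(log)

    summaries = []
    for (provider, period), items in sorted(groups.items()):
        summaries.append({
            "provider": provider,
            "billing_period": period,
            "endpoint": "[AGGREGATED]",  # marker that prevents re-aggregation
            "call_count": len(items),
            "total_cost": sum(item["cost"] for item in items),
            "total_tokens": sum(item["tokens"] for item in items),
            "success_rate": sum(1 for item in items if item["success"]) / len(items),
        })
    return summaries
```

In the real service the individual logs would be deleted only after the summary rows are written, ideally in the same transaction, so a failure cannot lose data.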
### Example Scenarios
**Scenario 1: Time-Based Only**
- User has 3,000 logs
- 500 logs are older than 90 days
- **Result**: 500 old logs aggregated, 2,500 detailed logs kept
**Scenario 2: Count-Based Only**
- User has 6,000 logs (all recent)
- **Result**: 2,000 oldest logs aggregated, 4,000 detailed logs kept
**Scenario 3: Both Triggers**
- User has 6,000 logs
- 1,000 logs are older than 90 days
- **Result**: All 1,000 old logs + 1,000 additional oldest logs aggregated, 4,000 detailed logs kept
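The three scenarios can be reproduced with a small selection function. The sketch below works on timestamps only and assumes already-aggregated records have been filtered out beforehand. `select_logs_to_aggregate` is a hypothetical helper, not the body of `_wrap_old_logs()`.

```python
from datetime import datetime, timedelta

RETENTION_DAYS = 90
MAX_LOGS_PER_USER = 5000
LOGS_TO_KEEP = 4000


def select_logs_to_aggregate(timestamps, now):
    """Return the timestamps to aggregate, oldest first."""
    ordered = sorted(timestamps)
    cutoff = now - timedelta(days=RETENTION_DAYS)
    # Time-based trigger: everything older than the retention window.
    n_old = sum(1 for ts in ordered if ts < cutoff)
    # Count-based trigger: everything beyond the newest LOGS_TO_KEEP logs.
    n_over = len(ordered) - LOGS_TO_KEEP if len(ordered) > MAX_LOGS_PER_USER else 0
    # Both sets are prefixes of the oldest-first list, so their union is the longer one.
    return ordered[: max(n_old, n_over)]


now = datetime(2026, 1, 1)
old, recent = now - timedelta(days=100), now - timedelta(days=1)
print(len(select_logs_to_aggregate([old] * 1000 + [recent] * 5000, now)))
```

The example call mirrors Scenario 3: the 1,000 old logs are selected first, and another 1,000 of the oldest remaining logs are added to bring the detailed set down to 4,000.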
## Configuration
### Retention Period
Currently set to **90 days**. To change:
```python
# In LogWrappingService class
RETENTION_DAYS = 90 # Change this value
```
**Recommended Values**:
- **90 days** (current): Good balance for most use cases
- **60 days**: More aggressive, faster aggregation
- **180 days**: Less aggressive, keeps more detailed history
### Count Limits
```python
MAX_LOGS_PER_USER = 5000 # Total logs per user
logs_to_keep = 4000 # Detailed logs to keep
```
## Response Format
The `check_and_wrap_logs()` method now returns enhanced information:
```python
{
'wrapped': True,
'total_logs_before': 6000,
'total_logs_after': 4500,
'aggregated_logs': 1500,
'aggregated_periods': [...],
'trigger_reasons': [
'count limit (6000 > 5000)',
'time-based retention (500 logs older than 90 days)'
],
'old_logs_aggregated': 500,
'message': 'Wrapped 1500 logs into 12 aggregated records'
}
```
## Benefits
1. **Automatic Cleanup**: Old logs are automatically aggregated without manual intervention
2. **Storage Efficiency**: Prevents indefinite growth of detailed logs
3. **Context Preservation**: Aggregated logs maintain all important metrics
4. **Dual Protection**: Both count and time limits ensure efficient storage
5. **No Data Loss**: Historical data is preserved in aggregated form
## Testing
### Manual Testing
1. **Create old logs** (for testing, you can manually update timestamps in database):
```sql
UPDATE api_usage_logs
SET timestamp = datetime('now', '-100 days')
WHERE id IN (
    SELECT id FROM api_usage_logs
    WHERE user_id = 'test_user'
    LIMIT 10
);
```
2. **Trigger wrapping** by calling `/api/subscription/usage-logs`
3. **Verify**:
- Old logs are aggregated
- Aggregated logs have `endpoint = '[AGGREGATED]'`
- Total log count reduced
- Costs and tokens preserved in aggregated records
### Expected Behavior
- Logs older than 90 days are automatically aggregated
- Aggregated logs are not re-aggregated
- Up to 4,000 of the most recent logs (those within the 90-day window) remain detailed
- All historical data is preserved in aggregated form
## Monitoring
The service logs detailed information:
```
[LogWrapping] User {user_id} needs log wrapping. Total: 6000, Old logs: 500. Triggers: count limit, time-based retention
[LogWrapping] Time-based aggregation: Found 500 logs older than 90 days
[LogWrapping] Wrapped 1500 logs into 12 aggregated records. Remaining logs: 4500
```
## Future Enhancements
1. **Configurable Retention**: Make `RETENTION_DAYS` configurable via environment variable
2. **Tiered Retention**: Different retention periods for different log types
3. **Archive Tables**: Move very old aggregated logs to separate archive tables
4. **Scheduled Jobs**: Run aggregation on a schedule instead of on-demand
5. **Metrics**: Track aggregation statistics over time
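For the first item, one possible shape is sketched below. The environment variable name `LOG_RETENTION_DAYS` and the helper `get_retention_days` are assumptions made for illustration; neither exists in the codebase as described here.

```python
import os
from typing import Mapping, Optional

DEFAULT_RETENTION_DAYS = 90  # current hard-coded value


def get_retention_days(env: Optional[Mapping[str, str]] = None) -> int:
    """Read the retention period from the environment, falling back to 90 days."""
    env = os.environ if env is None else env
    raw = env.get("LOG_RETENTION_DAYS")  # hypothetical variable name
    if raw is None:
        return DEFAULT_RETENTION_DAYS
    try:
        value = int(raw)
    except ValueError:
        # Non-numeric values are ignored rather than crashing the service.
        return DEFAULT_RETENTION_DAYS
    return value if value > 0 else DEFAULT_RETENTION_DAYS
```

Falling back to the default on invalid input keeps a misconfigured deployment on the current 90-day behavior.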
## Backward Compatibility
✅ **Fully backward compatible**:
- Existing count-based logic still works
- No breaking changes to API responses
- Old logs without `actual_provider_name` are handled correctly
- Aggregated logs are properly identified and displayed
## Related Files
- `backend/services/subscription/log_wrapping_service.py` - Main implementation
- `backend/api/subscription/routes/logs.py` - API endpoint that triggers wrapping
- `frontend/src/components/billing/UsageLogsTable.tsx` - Frontend display
- `docs/Billing_Subscription/LOG_STORAGE_AND_RETENTION_REVIEW.md` - Review document


@@ -0,0 +1,65 @@
# Usage Dashboard Cost Display Fix
## Issue
The UsageDashboard component (used in dashboard headers) was showing cost as $0.00 even when there was actual API usage cost.
## Root Cause
The component was reading cost from `dashboardData.summary.total_cost_this_month` instead of `dashboardData.current_usage.total_cost`. The backend populates both fields, but `current_usage.total_cost` is more reliable because the frontend's `billingService.coerceUsageStats()`:
1. Coerces `current_usage.total_cost`
2. Recalculates the cost from the provider breakdown when the backend cost is 0
3. Uses `Math.max(backendTotalCost, calculatedTotalCost)`, so the larger of the two values is displayed
## Solution
Updated `UsageDashboard.tsx` to:
1. **Primary source**: Use `dashboardData.current_usage.total_cost`
2. **Fallback**: Use `dashboardData.summary.total_cost_this_month` if current_usage is unavailable
3. **Safety**: Added nullish coalescing (`??`) with a default value of 0
## Changes Made
### File: `frontend/src/components/shared/UsageDashboard.tsx`
**Before:**
```typescript
const totalCost = dashboardData.summary.total_cost_this_month;
```
**After:**
```typescript
// Use current_usage for accurate cost (properly coerced from provider breakdown)
// Fallback to summary if current_usage is not available
const totalCalls = dashboardData.current_usage?.total_calls ?? dashboardData.summary.total_api_calls_this_month;
const totalCost = dashboardData.current_usage?.total_cost ?? dashboardData.summary.total_cost_this_month ?? 0;
const monthlyLimit = dashboardData.limits.limits.monthly_cost;
const usagePercentage = monthlyLimit > 0 ? (totalCost / monthlyLimit) * 100 : 0;
```
**Also updated:**
- Full dashboard view to use `current_usage.total_cost` with fallback
- Total calls to use `current_usage.total_calls` with fallback
- Added safety check for division by zero in usage percentage calculation
## Components Affected
- `UsageDashboard` - Used in:
  - `DashboardHeader` (main dashboard header)
  - `UserBadge` (user menu dropdown)
  - `WizardHeader` (onboarding wizard header)
  - Various tool headers across the application
## Testing
1. ✅ Verify cost displays correctly in dashboard header
2. ✅ Verify cost displays correctly in user badge menu
3. ✅ Verify cost displays correctly during onboarding
4. ✅ Verify fallback works if current_usage is missing
5. ✅ Verify division by zero protection for usage percentage
## Related Files
- `frontend/src/components/shared/UsageDashboard.tsx` - Fixed component
- `frontend/src/services/billingService.ts` - Cost coercion logic (already correct)
- `backend/api/subscription_api.py` - Backend API endpoint (already correct)
- `backend/services/subscription/usage_tracking_service.py` - Backend cost calculation (already correct)
## Notes
- The backend correctly calculates and returns the total cost in both `current_usage.total_cost` and `summary.total_cost_this_month`
- The frontend's `billingService.coerceUsageStats()` properly handles cost calculation from provider breakdown
- The fix ensures we use the most accurate cost value available