8.6 KiB
Renewal History Retention Policy Implementation
Overview
Implemented tiered retention policy for subscription renewal history records. This ensures efficient storage while preserving critical payment and subscription data for tax/audit compliance.
Retention Policy
Tiered Retention Strategy
┌─────────────────────────────────────────────────────────────┐
│ Retention Policy: Subscription Renewal History │
├─────────────────────────────────────────────────────────────┤
│ │
│ 0-12 months: Full records with usage snapshots │
│ - Complete usage_before_renewal JSON │
│ - All subscription and payment data │
│ │
│ 12-24 months: Compressed records │
│ - Compressed usage snapshot (key metrics) │
│ - All subscription and payment data │
│ │
│ 24-84 months: Summary records │
│ - No usage snapshots │
│ - All subscription and payment data │
│ │
│ 84+ months: Archive-ready records │
│ - No usage snapshots │
│ - Payment data preserved (tax/audit) │
│ │
│ Payment Data: Preserved indefinitely (compliance) │
└─────────────────────────────────────────────────────────────┘
Implementation Details
New Service
File: backend/services/subscription/renewal_history_retention.py
Class: RenewalHistoryRetentionService
Key Methods
1. check_and_apply_retention(user_id: str)
Main method that applies retention policies automatically.
Process:
- Identifies records in each retention tier
- Compresses usage snapshots for 12-24 month old records
- Removes usage snapshots for 24-84 month old records
- Ensures 84+ month old records have no snapshots
- Returns statistics about processed records
Returns:
{
'retention_applied': True,
'total_records': 150,
'compressed_count': 10,
'summarized_count': 5,
'archived_count': 2,
'total_processed': 17,
'message': 'Processed 17 records: 10 compressed, 5 summarized, 2 archived'
}
2. _compress_usage_snapshots(records)
Compresses detailed usage snapshots to key metrics only.
Before Compression:
{
"total_calls": 1500,
"total_tokens": 500000,
"total_cost": 45.50,
"provider_breakdown": {...},
"detailed_metrics": {...},
"trends": {...}
}
After Compression:
{
"total_calls": 1500,
"total_tokens": 500000,
"total_cost": 45.50,
"compressed_at": "2025-01-15T10:30:00",
"note": "Usage snapshot compressed after 12 months"
}
3. _create_summary_records(records)
Removes usage snapshots entirely, keeping only subscription and payment data.
4. _mark_for_archive(records)
Ensures very old records have no snapshots (should already be done by previous stages).
5. get_retention_stats(user_id: str)
Returns statistics about records in each retention tier.
Returns:
{
'total_records': 150,
'recent_records': 120, # 0-12 months
'records_to_compress': 15, # 12-24 months
'records_to_summarize': 10, # 24-84 months
'records_to_archive': 5, # 84+ months
'retention_policy': {
'compress_after_days': 365,
'summarize_after_days': 730,
'archive_after_days': 2555
}
}
Integration
Automatic Application
Retention is automatically applied when fetching renewal history:
# In backend/api/subscription/routes/subscriptions.py
@router.get("/renewal-history/{user_id}")
async def get_renewal_history(...):
# Apply retention before fetching
retention_service = RenewalHistoryRetentionService(db)
retention_service.check_and_apply_retention(user_id)
# ... fetch and return records
New Endpoint
Added endpoint to get retention statistics:
GET /api/subscription/renewal-history/{user_id}/retention-stats
Returns breakdown of records by retention tier.
Configuration
Retention Periods
Currently set to:
- Compress after: 365 days (12 months)
- Summarize after: 730 days (24 months)
- Archive after: 2555 days (84 months / 7 years)
To change:
# In RenewalHistoryRetentionService class
COMPRESS_SNAPSHOT_DAYS = 365 # Change this value
SUMMARY_RECORDS_DAYS = 730 # Change this value
ARCHIVE_DAYS = 2555 # Change this value
Data Preservation
What's Preserved
✅ Always Preserved:
- Payment amount
- Payment status
- Payment date
- Stripe invoice ID
- Plan name and tier
- Billing cycle
- Period start/end dates
- Renewal type and count
✅ Preserved for 12-24 months:
- Compressed usage snapshot (key metrics only)
❌ Removed after 12 months:
- Detailed usage breakdowns
- Provider-specific metrics
- Trend data
- Detailed usage snapshots
Compliance
- Payment Data: Preserved indefinitely for tax/audit compliance
- Subscription Data: Preserved indefinitely for billing history
- Usage Snapshots: Removed after 12 months (not required for compliance)
Benefits
- Storage Efficiency: Reduces database size by removing large JSON snapshots
- Compliance: Preserves all payment data for tax/audit requirements
- Performance: Smaller records = faster queries
- Automatic: No manual intervention required
- Gradual: Applies retention in stages, not all at once
Example Scenarios
Scenario 1: New User (0-12 months)
- 5 renewal records, all recent
- Result: All records kept with full usage snapshots
Scenario 2: Active User (12-24 months)
- 20 renewal records
- 3 records are 13 months old
- Result: 3 records get compressed snapshots, 17 remain full
Scenario 3: Long-term User (24+ months)
- 50 renewal records
- 10 records are 25 months old
- Result: 10 records have snapshots removed, payment data preserved
Scenario 4: Very Old Records (84+ months)
- 100 renewal records
- 5 records are 7+ years old
- Result: 5 records have no snapshots, ready for archive
Testing
Manual Testing
-
Create test records with old timestamps:
UPDATE subscription_renewal_history SET created_at = datetime('now', '-400 days') WHERE user_id = 'test_user' AND id IN (SELECT id FROM subscription_renewal_history LIMIT 5); -
Trigger retention by calling
/api/subscription/renewal-history/{user_id} -
Verify:
- Records 12-24 months old have compressed snapshots
- Records 24+ months old have no snapshots
- Payment data is preserved in all records
Expected Behavior
- Records are processed automatically on history queries
- Usage snapshots are compressed/removed based on age
- Payment data is never removed
- All subscription data is preserved
Monitoring
The service logs detailed information:
[RenewalRetention] Applied retention for user {user_id}: 10 compressed, 5 summarized, 2 archived
Future Enhancements
- Archive Table: Move very old records to separate archive table
- Scheduled Jobs: Run retention on a schedule instead of on-demand
- Configurable Periods: Make retention periods configurable via environment variables
- Metrics Dashboard: Show retention statistics in admin dashboard
- Export Functionality: Allow export of old records before archive
Backward Compatibility
✅ Fully backward compatible:
- Existing records are processed automatically
- No breaking changes to API responses
- Old records without snapshots are handled correctly
- Payment data is always preserved
Related Files
backend/services/subscription/renewal_history_retention.py- Main implementationbackend/api/subscription/routes/subscriptions.py- API endpoint integrationfrontend/src/components/billing/SubscriptionRenewalHistory.tsx- Frontend displaydocs/Billing_Subscription/LOG_STORAGE_AND_RETENTION_REVIEW.md- Review document