ALwrity persona system

ajaysi
2025-09-05 15:22:43 +05:30
parent ccbdc9e8c6
commit f82ada0361
38 changed files with 5673 additions and 1240 deletions

@@ -1,473 +0,0 @@
# LinkedIn Content Generation Service
A comprehensive FastAPI-based service for generating professional LinkedIn content using AI. This service has been migrated from the legacy Streamlit implementation to provide robust API endpoints for LinkedIn content creation.
## Overview
The LinkedIn Content Generation Service provides AI-powered tools for creating various types of LinkedIn content:
- **Posts**: Short-form professional posts with research-backed content
- **Articles**: Long-form articles with SEO optimization
- **Carousels**: Multi-slide visual content
- **Video Scripts**: Structured scripts for LinkedIn videos
- **Comment Responses**: Professional responses to LinkedIn comments
## Features
### 🚀 Core Capabilities
- **Multi-format Content Generation**: Posts, articles, carousels, video scripts, and comment responses
- **Research Integration**: Automated research using multiple search engines (Metaphor, Google, Tavily)
- **AI-Powered Optimization**: Industry-specific content optimization using Gemini AI
- **SEO Enhancement**: Built-in SEO optimization for LinkedIn articles
- **Engagement Prediction**: AI-based engagement metrics prediction
- **Professional Tone Control**: Multiple tone options (professional, conversational, authoritative, etc.)
### 🛠 Technical Features
- **FastAPI Integration**: RESTful API with automatic documentation
- **Comprehensive Error Handling**: Robust exception handling and logging
- **Database Monitoring**: Request logging and performance monitoring
- **Async/Await Support**: Non-blocking operations for better performance
- **Pydantic Validation**: Strong request/response validation
- **Structured JSON Responses**: Consistent API response format
## API Endpoints
### Base URL
```
/api/linkedin
```
### Available Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Health check for service status |
| `/generate-post` | POST | Generate LinkedIn posts |
| `/generate-article` | POST | Generate LinkedIn articles |
| `/generate-carousel` | POST | Generate LinkedIn carousels |
| `/generate-video-script` | POST | Generate video scripts |
| `/generate-comment-response` | POST | Generate comment responses |
| `/content-types` | GET | Get available content types |
| `/usage-stats` | GET | Get usage statistics |
## Quick Start
### 1. Prerequisites
```bash
# Install dependencies
pip install -r requirements.txt
# Set environment variables
export GEMINI_API_KEY="your_gemini_api_key"
export DATABASE_URL="sqlite:///./alwrity.db"
```
### 2. Start the Service
```bash
# Start FastAPI server
uvicorn app:app --host 0.0.0.0 --port 8000 --reload
```
### 3. Access Documentation
- **Interactive API Docs**: http://localhost:8000/docs
- **Alternative Docs**: http://localhost:8000/redoc
## Usage Examples
### Generate a LinkedIn Post
```python
import requests

# Request payload
payload = {
    "topic": "Artificial Intelligence in Healthcare",
    "industry": "Healthcare",
    "post_type": "thought_leadership",
    "tone": "professional",
    "target_audience": "Healthcare executives",
    "key_points": ["AI diagnostics", "Patient outcomes", "Cost reduction"],
    "include_hashtags": True,
    "include_call_to_action": True,
    "research_enabled": True,
    "max_length": 2000,
}
# Make request
response = requests.post(
    "http://localhost:8000/api/linkedin/generate-post",
    json=payload,
)
# Process response
if response.status_code == 200:
    data = response.json()
    print(f"Generated post: {data['data']['content']}")
    print(f"Hashtags: {[h['hashtag'] for h in data['data']['hashtags']]}")
else:
    print(f"Error: {response.status_code}")
```
### Generate a LinkedIn Article
```python
import requests

# Request payload for a long-form article
payload = {
    "topic": "Digital Transformation in Manufacturing",
    "industry": "Manufacturing",
    "tone": "professional",
    "target_audience": "Manufacturing leaders",
    "key_sections": ["Current challenges", "Technology solutions", "Implementation strategies"],
    "include_images": True,
    "seo_optimization": True,
    "research_enabled": True,
    "word_count": 1500,
}
response = requests.post(
    "http://localhost:8000/api/linkedin/generate-article",
    json=payload,
)
```
### Generate a LinkedIn Carousel
```python
import requests

# Request payload for a multi-slide carousel
payload = {
    "topic": "5 Ways to Improve Team Productivity",
    "industry": "Business Management",
    "slide_count": 8,
    "tone": "professional",
    "target_audience": "Team leaders and managers",
    "key_takeaways": ["Clear communication", "Goal setting", "Tool optimization"],
    "include_cover_slide": True,
    "include_cta_slide": True,
    "visual_style": "modern",
}
response = requests.post(
    "http://localhost:8000/api/linkedin/generate-carousel",
    json=payload,
)
```
## Request/Response Models
### LinkedIn Post Request
```json
{
"topic": "string",
"industry": "string",
"post_type": "professional|thought_leadership|industry_news|personal_story|company_update|poll",
"tone": "professional|conversational|authoritative|inspirational|educational|friendly",
"target_audience": "string (optional)",
"key_points": ["string (optional)"],
"include_hashtags": true,
"include_call_to_action": true,
"research_enabled": true,
"search_engine": "metaphor|google|tavily",
"max_length": 3000
}
```
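The request schema above can be mirrored in code. The following is a minimal sketch that uses only the standard library as a stand-in for the service's Pydantic models, which are not shown here. The class name `PostRequestSketch`, the default values, and the enum names are assumptions for illustration; the 3-character minimum for `topic` is taken from the 422 example in the Error Handling section.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class PostType(str, Enum):
    PROFESSIONAL = "professional"
    THOUGHT_LEADERSHIP = "thought_leadership"
    INDUSTRY_NEWS = "industry_news"
    PERSONAL_STORY = "personal_story"
    COMPANY_UPDATE = "company_update"
    POLL = "poll"


class Tone(str, Enum):
    PROFESSIONAL = "professional"
    CONVERSATIONAL = "conversational"
    AUTHORITATIVE = "authoritative"
    INSPIRATIONAL = "inspirational"
    EDUCATIONAL = "educational"
    FRIENDLY = "friendly"


SEARCH_ENGINES = {"metaphor", "google", "tavily"}


@dataclass
class PostRequestSketch:
    """Hypothetical stand-in for LinkedInPostRequest; defaults are assumed."""

    topic: str
    industry: str
    post_type: PostType = PostType.PROFESSIONAL
    tone: Tone = Tone.PROFESSIONAL
    target_audience: Optional[str] = None
    key_points: List[str] = field(default_factory=list)
    include_hashtags: bool = True
    include_call_to_action: bool = True
    research_enabled: bool = True
    search_engine: str = "metaphor"
    max_length: int = 3000

    def __post_init__(self):
        # Coerce plain strings to enum members; unknown values raise ValueError.
        self.post_type = PostType(self.post_type)
        self.tone = Tone(self.tone)
        if len(self.topic) < 3:
            raise ValueError("topic must have at least 3 characters")
        if self.search_engine not in SEARCH_ENGINES:
            raise ValueError(f"unsupported search engine: {self.search_engine}")


request = PostRequestSketch(
    topic="Artificial Intelligence in Healthcare",
    industry="Healthcare",
    post_type="thought_leadership",
)
```

Rejecting bad values at construction time is the same idea Pydantic applies when FastAPI parses a request body, which is why invalid input surfaces as a 422 before any generation work starts.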
### LinkedIn Post Response
```json
{
"success": true,
"data": {
"content": "Generated post content...",
"character_count": 1250,
"hashtags": [
{
"hashtag": "#AIinHealthcare",
"category": "industry",
"popularity_score": 0.9
}
],
"call_to_action": "What's your experience with AI in healthcare?",
"engagement_prediction": {
"estimated_likes": 120,
"estimated_comments": 15,
"estimated_shares": 8
}
},
"research_sources": [
{
"title": "AI in Healthcare: Current Trends",
"url": "https://example.com/ai-healthcare",
"content": "Summary of AI healthcare trends...",
"relevance_score": 0.95
}
],
"generation_metadata": {
"generation_time": 3.2,
"timestamp": "2025-01-27T10:00:00Z",
"model_used": "gemini-2.0-flash-001"
}
}
```
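A client usually needs only a few fields from this response. The sketch below, using field names from the example response above, pulls out the hashtags, the research source URLs, and a combined engagement estimate. The helper name `summarize_post_response` is hypothetical and the sample body is abridged from the example.

```python
import json

SAMPLE_BODY = json.loads("""
{
  "success": true,
  "data": {
    "content": "Generated post content...",
    "hashtags": [{"hashtag": "#AIinHealthcare", "category": "industry", "popularity_score": 0.9}],
    "engagement_prediction": {"estimated_likes": 120, "estimated_comments": 15, "estimated_shares": 8}
  },
  "research_sources": [{"title": "AI in Healthcare: Current Trends", "url": "https://example.com/ai-healthcare"}]
}
""")


def summarize_post_response(body):
    """Reduce a post response to the fields a client typically displays."""
    data = body.get("data") or {}
    prediction = data.get("engagement_prediction") or {}
    metrics = ("estimated_likes", "estimated_comments", "estimated_shares")
    return {
        "ok": bool(body.get("success")),
        "hashtags": [h["hashtag"] for h in data.get("hashtags", [])],
        "total_engagement": sum(prediction.get(name, 0) for name in metrics),
        "source_urls": [s["url"] for s in body.get("research_sources", [])],
    }


summary = summarize_post_response(SAMPLE_BODY)
```

Using `.get` with defaults keeps the helper tolerant of responses where research was disabled or no hashtags were requested.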
## Configuration
### Environment Variables
| Variable | Description | Required | Default |
|----------|-------------|----------|---------|
| `GEMINI_API_KEY` | Google Gemini API key | Yes | - |
| `DATABASE_URL` | Database connection string | No | `sqlite:///./alwrity.db` |
| `LOG_LEVEL` | Logging level | No | `INFO` |
### Content Generation Settings
The service supports various customization options:
#### Post Types
- `professional`: Standard professional posts
- `thought_leadership`: Industry insights and expertise
- `industry_news`: News and updates
- `personal_story`: Personal experiences and stories
- `company_update`: Company news and announcements
- `poll`: Interactive polls
#### Tone Options
- `professional`: Formal business tone
- `conversational`: Casual but professional
- `authoritative`: Expert and confident
- `inspirational`: Motivational and uplifting
- `educational`: Informative and teaching
- `friendly`: Warm and approachable
#### Search Engines
- `metaphor`: Metaphor AI search (recommended)
- `google`: Google Search API
- `tavily`: Tavily AI search
## Architecture
### Service Structure
```
backend/
├── models/
│   └── linkedin_models.py           # Pydantic models for requests/responses
├── services/
│   └── linkedin_service.py          # Core business logic
├── routers/
│   └── linkedin.py                  # FastAPI route handlers
├── middleware/
│   └── monitoring_middleware.py     # Request monitoring
└── docs/
    └── LINKEDIN_CONTENT_GENERATION.md
```
### Key Components
#### LinkedInContentService
The core service class that handles all content generation logic:
- **Content Generation**: AI-powered content creation
- **Research Integration**: Multi-source research capabilities
- **Error Handling**: Comprehensive exception management
- **Logging**: Detailed operation logging
#### Request Models
Pydantic models for strong typing and validation:
- `LinkedInPostRequest`
- `LinkedInArticleRequest`
- `LinkedInCarouselRequest`
- `LinkedInVideoScriptRequest`
- `LinkedInCommentResponseRequest`
#### Response Models
Structured response models with metadata:
- `LinkedInPostResponse`
- `LinkedInArticleResponse`
- `LinkedInCarouselResponse`
- `LinkedInVideoScriptResponse`
- `LinkedInCommentResponseResult`
## Performance Considerations
### Response Times
- **Posts**: 3-8 seconds (with research)
- **Articles**: 15-45 seconds (depending on length)
- **Carousels**: 5-15 seconds
- **Video Scripts**: 3-10 seconds
- **Comment Responses**: 1-3 seconds
### Rate Limiting
The service respects API rate limits:
- Gemini API: Built-in retry logic with exponential backoff
- Research APIs: Configurable rate limiting
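Retry with exponential backoff, as mentioned for the Gemini API, can be sketched as follows. This is an illustrative helper, not the service's own retry code: the function name, the default attempt count, and the delay values are all assumptions.

```python
import random
import time


def retry_with_backoff(call, max_attempts=4, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call `call()` until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            # A little random jitter avoids many clients retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` parameter is injectable so the helper can be exercised in tests without real waiting. In practice the `except` clause would be narrowed to the transient errors of the API being called.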
### Caching
- Research results caching (planned)
- Response caching for similar requests (planned)
## Error Handling
### Common Error Scenarios
#### 422 Validation Error
```json
{
"detail": [
{
"loc": ["body", "topic"],
"msg": "ensure this value has at least 3 characters",
"type": "value_error.any_str.min_length"
}
]
}
```
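On the client side, the `detail` list in a 422 body can be flattened into readable messages. A small sketch, assuming the body has the shape shown above; the helper name is hypothetical.

```python
from typing import List


def format_validation_errors(body: dict) -> List[str]:
    """Turn a 422 response body into one 'field: message' string per error."""
    messages = []
    for item in body.get("detail", []):
        # "loc" is a path such as ["body", "topic"]; drop the leading "body".
        path = [str(part) for part in item.get("loc", []) if part != "body"]
        field_name = ".".join(path) or "<request>"
        messages.append(f"{field_name}: {item.get('msg', 'invalid value')}")
    return messages


example = {
    "detail": [
        {
            "loc": ["body", "topic"],
            "msg": "ensure this value has at least 3 characters",
            "type": "value_error.any_str.min_length",
        }
    ]
}
errors = format_validation_errors(example)
```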
#### 500 Internal Server Error
```json
{
"success": false,
"error": "Content generation failed: API key not configured",
"generation_metadata": {
"service_version": "1.0.0",
"timestamp": "2025-01-27T10:00:00Z"
}
}
```
### Error Recovery
- Automatic retry logic for transient failures
- Graceful fallback for content generation
- Detailed error logging for debugging
## Monitoring and Logging
### Request Monitoring
All API requests are logged with:
- Request path and method
- Response time and status code
- User information (if available)
- Request/response sizes
### Performance Metrics
- Generation time tracking
- Success/failure rates
- Popular content types
- Error frequency analysis
### Health Checks
```bash
curl http://localhost:8000/api/linkedin/health
```
## Migration from Streamlit
### Key Changes
1. **Architecture**: Streamlit UI → FastAPI REST API
2. **Dependencies**: Integrated with existing backend services
3. **Error Handling**: Enhanced exception handling and logging
4. **Monitoring**: Database-backed request monitoring
5. **Validation**: Strong request/response validation
6. **Documentation**: Automatic API documentation
### Compatibility
- All original functionality preserved
- Enhanced features and capabilities
- Better integration with existing systems
- Improved performance and scalability
## Testing
### Running Tests
```bash
# Structure validation
python3 validate_linkedin_structure.py
# Full functionality tests (requires dependencies)
python3 test_linkedin_endpoints.py
```
### Test Coverage
- ✅ Post generation
- ✅ Article generation
- ✅ Carousel generation
- ✅ Video script generation
- ✅ Comment response generation
- ✅ Error handling
- ✅ Structure validation
## Troubleshooting
### Common Issues
#### 1. Import Errors
```bash
ModuleNotFoundError: No module named 'pydantic'
```
**Solution**: Install dependencies
```bash
pip install -r requirements.txt
```
#### 2. API Key Issues
```bash
Error: GEMINI_API_KEY environment variable is not set
```
**Solution**: Set the environment variable
```bash
export GEMINI_API_KEY="your_api_key_here"
```
#### 3. Database Connection Issues
```bash
Error creating database session
```
**Solution**: Check database configuration and permissions
#### 4. Generation Timeouts
**Solution**: Increase timeout settings or reduce content complexity
### Debug Mode
Enable debug logging:
```bash
export LOG_LEVEL=DEBUG
```
## Future Enhancements
### Planned Features
- [ ] Real search engine integration (Metaphor, Google, Tavily)
- [ ] Content scheduling and calendar integration
- [ ] A/B testing capabilities
- [ ] Advanced analytics and reporting
- [ ] Multi-language support
- [ ] Custom templates and brand voice
- [ ] LinkedIn API integration for direct posting
- [ ] Content performance tracking
### Performance Improvements
- [ ] Response caching
- [ ] Parallel processing for multiple requests
- [ ] Background job processing
- [ ] CDN integration for static assets
## Support
For issues and questions:
1. Check the [troubleshooting section](#troubleshooting)
2. Review the API documentation at `/docs`
3. Check the logs for detailed error information
4. Validate your request format against the examples
## License
This LinkedIn Content Generation Service is part of the ALwrity platform and follows the same licensing terms.

@@ -1,401 +0,0 @@
# AI SEO Tools Migration Documentation
## Overview
This document describes the successful migration of AI SEO tools from the `ToBeMigrated/ai_seo_tools` directory to FastAPI endpoints in the backend services. The migration maintains all existing functionality while adding intelligent logging, exception handling, and structured API responses.
## Migration Summary
### What Was Migrated
The following SEO tools have been converted to FastAPI endpoints:
1. **Meta Description Generator** - AI-powered meta description generation
2. **Google PageSpeed Insights Analyzer** - Performance analysis with AI insights
3. **Sitemap Analyzer** - Website structure and content trend analysis
4. **Image Alt Text Generator** - AI-powered alt text generation
5. **OpenGraph Tags Generator** - Social media optimization tags
6. **On-Page SEO Analyzer** - Comprehensive on-page SEO analysis
7. **Technical SEO Analyzer** - Website crawling and technical analysis
8. **Enterprise SEO Suite** - Complete SEO audit workflows
9. **Content Strategy Analyzer** - AI-powered content gap analysis
### New Architecture
```
backend/
├── services/seo_tools/                  # SEO tool services
│   ├── meta_description_service.py
│   ├── pagespeed_service.py
│   ├── sitemap_service.py
│   ├── image_alt_service.py
│   ├── opengraph_service.py
│   ├── on_page_seo_service.py
│   ├── technical_seo_service.py
│   ├── enterprise_seo_service.py
│   └── content_strategy_service.py
├── routers/seo_tools.py                 # FastAPI router
├── middleware/logging_middleware.py     # Intelligent logging
└── logs/seo_tools/                      # Structured log files
```
## API Endpoints
### Base URL
All SEO tools are available under: `/api/seo`
### Individual Tool Endpoints
#### 1. Meta Description Generation
- **Endpoint**: `POST /api/seo/meta-description`
- **Purpose**: Generate AI-powered SEO meta descriptions
- **Request**:
```json
{
"keywords": ["SEO", "content marketing"],
"tone": "Professional",
"search_intent": "Informational Intent",
"language": "English",
"custom_prompt": "Optional custom prompt"
}
```
- **Response**: Structured response with 5 meta descriptions, analysis, and recommendations
#### 2. PageSpeed Analysis
- **Endpoint**: `POST /api/seo/pagespeed-analysis`
- **Purpose**: Analyze website performance using Google PageSpeed Insights
- **Request**:
```json
{
"url": "https://example.com",
"strategy": "DESKTOP",
"locale": "en",
"categories": ["performance", "accessibility", "best-practices", "seo"]
}
```
- **Response**: Performance metrics, Core Web Vitals, AI insights, and optimization plan
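For orientation, here is one way a request like the one above could be translated into a call to the public Google PageSpeed Insights v5 API. How the service actually performs this mapping is not shown in this document, so the function name and the category conversion (upper-casing and replacing hyphens with underscores) are assumptions.

```python
from urllib.parse import urlencode

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def build_pagespeed_url(request, api_key=None):
    """Build a PageSpeed Insights query URL from the endpoint's request body."""
    params = [
        ("url", request["url"]),
        ("strategy", request.get("strategy", "DESKTOP")),
        ("locale", request.get("locale", "en")),
    ]
    # The API takes one `category` parameter per requested category.
    for category in request.get("categories", []):
        params.append(("category", category.upper().replace("-", "_")))
    # The key is optional; it mainly raises the rate limit.
    if api_key:
        params.append(("key", api_key))
    return f"{PSI_ENDPOINT}?{urlencode(params)}"


psi_url = build_pagespeed_url(
    {
        "url": "https://example.com",
        "strategy": "DESKTOP",
        "locale": "en",
        "categories": ["performance", "accessibility", "best-practices", "seo"],
    }
)
```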
#### 3. Sitemap Analysis
- **Endpoint**: `POST /api/seo/sitemap-analysis`
- **Purpose**: Analyze website sitemap structure and content patterns
- **Request**:
```json
{
"sitemap_url": "https://example.com/sitemap.xml",
"analyze_content_trends": true,
"analyze_publishing_patterns": true
}
```
- **Response**: Structure analysis, content trends, publishing patterns, and AI insights
#### 4. Image Alt Text Generation
- **Endpoint**: `POST /api/seo/image-alt-text`
- **Purpose**: Generate SEO-optimized alt text for images
- **Request**: Form data with image file or JSON with image URL
- **Response**: Generated alt text with confidence score and suggestions
#### 5. OpenGraph Tags Generation
- **Endpoint**: `POST /api/seo/opengraph-tags`
- **Purpose**: Generate OpenGraph tags for social media optimization
- **Request**:
```json
{
"url": "https://example.com",
"title_hint": "Optional title hint",
"description_hint": "Optional description hint",
"platform": "General"
}
```
- **Response**: Complete OpenGraph tags with platform-specific optimizations
#### 6. On-Page SEO Analysis
- **Endpoint**: `POST /api/seo/on-page-analysis`
- **Purpose**: Comprehensive on-page SEO analysis
- **Request**:
```json
{
"url": "https://example.com",
"target_keywords": ["keyword1", "keyword2"],
"analyze_images": true,
"analyze_content_quality": true
}
```
- **Response**: SEO score, content analysis, keyword optimization, and recommendations
#### 7. Technical SEO Analysis
- **Endpoint**: `POST /api/seo/technical-seo`
- **Purpose**: Technical SEO crawling and analysis
- **Request**:
```json
{
"url": "https://example.com",
"crawl_depth": 3,
"include_external_links": true,
"analyze_performance": true
}
```
- **Response**: Technical issues, site structure, performance metrics, and recommendations
### Workflow Endpoints
#### 1. Complete Website Audit
- **Endpoint**: `POST /api/seo/workflow/website-audit`
- **Purpose**: Execute comprehensive SEO audit workflow
- **Request**:
```json
{
"website_url": "https://example.com",
"workflow_type": "complete_audit",
"competitors": ["https://competitor1.com"],
"target_keywords": ["keyword1", "keyword2"]
}
```
#### 2. Content Analysis Workflow
- **Endpoint**: `POST /api/seo/workflow/content-analysis`
- **Purpose**: AI-powered content strategy analysis
- **Request**:
```json
{
"website_url": "https://example.com",
"workflow_type": "content_analysis",
"competitors": ["https://competitor1.com"],
"target_keywords": ["content", "strategy"]
}
```
### Health and Status Endpoints
- **GET** `/api/seo/health` - Health check for SEO tools
- **GET** `/api/seo/tools/status` - Status of all SEO tools and dependencies
## Key Features
### 1. Intelligent Logging
- **Structured Logging**: All operations logged to JSONL files
- **Performance Tracking**: Execution time monitoring
- **Error Logging**: Comprehensive error tracking with stack traces
- **AI Analysis Logging**: Prompt/response tracking for AI operations
**Log Files**:
- `/backend/logs/seo_tools/operations.jsonl` - Successful operations
- `/backend/logs/seo_tools/errors.jsonl` - Error logs
- `/backend/logs/seo_tools/ai_analysis.jsonl` - AI prompt/response logs
- `/backend/logs/seo_tools/external_apis.jsonl` - External API calls
- `/backend/logs/seo_tools/crawling.jsonl` - Web crawling operations
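JSONL means one JSON object per line, which makes the files easy to append to and to scan. The sketch below shows the idea with the standard library only; the record fields and helper names are assumptions, and the real middleware may write richer entries.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def append_jsonl(path: Path, record: dict) -> None:
    """Append one record as a single JSON line, adding a UTC timestamp."""
    path.parent.mkdir(parents=True, exist_ok=True)
    entry = {"timestamp": datetime.now(timezone.utc).isoformat(), **record}
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")


def read_jsonl(path: Path) -> list:
    """Load every non-empty line of a JSONL file as a dict."""
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Write to a temporary directory so the sketch does not touch real logs.
log_file = Path(tempfile.mkdtemp()) / "seo_tools" / "operations.jsonl"
append_jsonl(log_file, {"tool": "meta_description", "execution_time": 2.45})
append_jsonl(log_file, {"tool": "pagespeed", "execution_time": 6.1})
entries = read_jsonl(log_file)
```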
### 2. Exception Handling
- **Never Mock Data**: Real API failures return proper error responses
- **Graceful Degradation**: AI analysis failures don't break core functionality
- **Detailed Error Messages**: Clear error descriptions for debugging
- **Error IDs**: Unique error identifiers for tracking
### 3. AI Enhancement
- **Gemini Integration**: Uses `gemini_provide` functionality for AI analysis
- **Structured Responses**: AI responses parsed into structured data
- **Context-Aware Analysis**: AI considers user type (content creators, marketers)
- **Business Impact Focus**: AI recommendations focus on practical business outcomes
### 4. Background Processing
- **Async Operations**: All heavy operations run asynchronously
- **Background Tasks**: Logging and cleanup run in background
- **Non-blocking**: API responses don't wait for logging operations
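The non-blocking behaviour described above can be illustrated with plain `asyncio`. This is a sketch of the pattern only: the handler and helper names are hypothetical, and the service may instead rely on FastAPI's own background task support.

```python
import asyncio


async def write_log(records, entry):
    # Stand-in for slow file or database I/O.
    await asyncio.sleep(0.05)
    records.append(entry)


async def handle_request(payload, records, background):
    result = {"success": True, "data": payload}
    # Schedule logging without awaiting it, so the response is not delayed.
    task = asyncio.create_task(write_log(records, {"event": "request", "payload": payload}))
    # Keep a reference so the task is not garbage-collected before it finishes.
    background.add(task)
    task.add_done_callback(background.discard)
    return result


async def main():
    records, background = [], set()
    response = await handle_request({"url": "https://example.com"}, records, background)
    logged_at_response = len(records)  # the log write has not run yet
    await asyncio.gather(*background)  # drain pending log writes before exit
    return response, logged_at_response, len(records)


response, logged_at_response, logged_after_drain = asyncio.run(main())
```

The response is available before the log entry exists, and the entry appears once the pending tasks are drained.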
## Response Format
All endpoints follow a consistent response format:
```json
{
"success": true,
"message": "Operation completed successfully",
"timestamp": "2024-01-15T10:30:00Z",
"execution_time": 2.45,
"data": {
// Tool-specific data
}
}
```
**Error Response**:
```json
{
"success": false,
"message": "Error description",
"timestamp": "2024-01-15T10:30:00Z",
"execution_time": 1.23,
"error_type": "ValueError",
"error_details": "Detailed error message",
"traceback": "Full traceback (only in debug mode)"
}
```
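Both envelopes share the same core fields, so they can be produced by one pair of helpers. The following is a minimal sketch under the assumption that the traceback is attached only when debug mode is on; the function names and the default error message are illustrative and not taken from the service code.

```python
import time
import traceback
from datetime import datetime, timezone


def _envelope(success, message, started):
    """Fields common to success and error responses."""
    return {
        "success": success,
        "message": message,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "execution_time": round(time.perf_counter() - started, 2),
    }


def success_response(data, started, message="Operation completed successfully"):
    body = _envelope(True, message, started)
    body["data"] = data
    return body


def error_response(exc, started, debug=False, message="Operation failed"):
    body = _envelope(False, message, started)
    body["error_type"] = type(exc).__name__
    body["error_details"] = str(exc)
    if debug:
        # Full traceback only in debug mode, to avoid leaking internals.
        body["traceback"] = "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
    return body
```

A handler would record `started = time.perf_counter()` on entry and pass it to whichever helper matches the outcome.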
## Dependencies
### New Dependencies Added
```
aiofiles>=23.2.0 # Async file operations
crawl4ai>=0.2.0 # Web crawling (placeholder)
```
### Existing Dependencies Used
- `fastapi` - Web framework
- `pydantic` - Data validation
- `aiohttp` - Async HTTP client
- `beautifulsoup4` - HTML parsing
- `advertools` - SEO analysis
- `loguru` - Logging
- `google-genai` - AI analysis
## Testing
### Test Script
Run the comprehensive test suite:
```bash
cd /workspace/backend
python test_seo_tools.py
```
### Manual Testing
1. Start the FastAPI server:
```bash
uvicorn app:app --reload --host 0.0.0.0 --port 8000
```
2. Access API documentation:
- Swagger UI: `http://localhost:8000/docs`
- ReDoc: `http://localhost:8000/redoc`
3. Test individual endpoints using the documentation interface
## Configuration
### Environment Variables
Set these environment variables for full functionality:
```bash
# Google PageSpeed Insights API Key (optional)
GOOGLE_PAGESPEED_API_KEY=your_api_key_here
# AI Provider API Keys (at least one required)
GEMINI_API_KEY=your_gemini_key
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
# Debug mode (optional)
DEBUG=false
```
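A startup check can enforce the "at least one AI provider key" rule before any request is served. This is a sketch of such a check, not the service's actual configuration loader; the function name and the returned dictionary layout are assumptions.

```python
import os

AI_PROVIDER_KEYS = ("GEMINI_API_KEY", "OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def load_seo_config(env=None):
    """Read SEO tool settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    providers = [name for name in AI_PROVIDER_KEYS if env.get(name)]
    if not providers:
        raise RuntimeError(
            "Set at least one of: " + ", ".join(AI_PROVIDER_KEYS)
        )
    return {
        "ai_providers": providers,
        # Optional: PageSpeed works without a key, at lower rate limits.
        "pagespeed_api_key": env.get("GOOGLE_PAGESPEED_API_KEY"),
        "debug": env.get("DEBUG", "false").strip().lower() == "true",
    }
```

Calling `load_seo_config()` with no argument reads the real environment; passing a dictionary makes the check easy to test.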
### Logging Configuration
Logs are automatically rotated daily and retained for 30 days. Configure in:
`/workspace/backend/middleware/logging_middleware.py`
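To make the rotation policy concrete, here is a sketch of daily rotation with 30 retained files using the standard library's `TimedRotatingFileHandler`. It is a stand-in only: the middleware itself is not shown here and, given that `loguru` is listed as a dependency, it more likely sets the `rotation` and `retention` arguments of `logger.add`. The logger name and file name below are assumptions.

```python
import logging
import tempfile
from logging.handlers import TimedRotatingFileHandler
from pathlib import Path


def build_rotating_handler(log_dir, filename="operations.log"):
    """Rotate at midnight and keep the 30 most recent rotated files."""
    handler = TimedRotatingFileHandler(
        Path(log_dir) / filename,
        when="midnight",
        backupCount=30,
        encoding="utf-8",
        delay=True,  # do not create the file until the first record is written
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    return handler


# A temporary directory keeps the sketch away from real log locations.
handler = build_rotating_handler(tempfile.mkdtemp())
logger = logging.getLogger("seo_tools_sketch")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
```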
## Migration Benefits
### For Content Creators
- **User-Friendly**: API responses tailored for non-technical users
- **Actionable Insights**: Clear recommendations with business impact
- **Comprehensive Analysis**: All-in-one SEO analysis platform
- **AI-Enhanced**: Advanced AI provides strategic insights
### For Digital Marketers
- **Performance Tracking**: Detailed metrics and optimization plans
- **Competitive Analysis**: Built-in competitor intelligence
- **Workflow Automation**: Complete audit workflows
- **ROI Focus**: Recommendations tied to business outcomes
### For Solopreneurs
- **Cost-Effective**: Single API for multiple SEO tools
- **Time-Saving**: Automated analysis and recommendations
- **Easy Integration**: RESTful API with clear documentation
- **Scalable**: Handles small to enterprise-level analysis
### For Developers
- **Modern Architecture**: FastAPI with async support
- **Comprehensive Logging**: Full observability
- **Error Handling**: Robust error management
- **Documentation**: Auto-generated API docs
## Monitoring and Maintenance
### Log Analysis
Use the built-in log analyzer for insights:
```python
import asyncio
from middleware.logging_middleware import log_analyzer

async def main():
    # Get performance summary
    performance = await log_analyzer.get_performance_summary(hours=24)
    # Get error summary
    errors = await log_analyzer.get_error_summary(hours=24)
    return performance, errors

performance, errors = asyncio.run(main())
```
### Health Monitoring
Monitor service health via:
- `/api/seo/health` - Overall health
- `/api/seo/tools/status` - Individual tool status
### Performance Optimization
- Monitor execution times in logs
- Optimize slow-performing tools
- Scale based on usage patterns
## Future Enhancements
### Planned Features
1. **Real-time Monitoring Dashboard** - Visual monitoring interface
2. **Batch Processing** - Process multiple URLs simultaneously
3. **Webhook Support** - Async notifications for long-running operations
4. **Rate Limiting** - Prevent API abuse
5. **Caching** - Cache frequently requested analyses
6. **Authentication** - API key-based authentication
7. **Usage Analytics** - Track API usage and popular tools
### Extension Points
1. **New SEO Tools** - Easy to add new tools following existing patterns
2. **Custom AI Models** - Support for additional AI providers
3. **Export Formats** - PDF, Excel, CSV export options
4. **Integration APIs** - Connect with popular marketing tools
## Troubleshooting
### Common Issues
1. **Import Errors**
- Ensure all dependencies are installed: `pip install -r requirements.txt`
- Check Python path configuration
2. **AI Analysis Failures**
- Verify API keys are set correctly
- Check internet connectivity
- Review error logs for specific issues
3. **PageSpeed API Errors**
- Get Google PageSpeed API key for higher rate limits
- Verify URL format and accessibility
4. **Logging Issues**
- Ensure write permissions to `/workspace/backend/logs/`
- Check disk space availability
### Debug Mode
Enable debug mode for detailed error information:
```bash
export DEBUG=true
```
This will include full tracebacks in API responses.
## Conclusion
The AI SEO Tools migration successfully transforms individual Python scripts into a cohesive, scalable FastAPI service. The new architecture provides:
- **Complete Functionality Preservation**
- **Enhanced Error Handling**
- **Intelligent Logging**
- **AI-Powered Insights**
- **Workflow Automation**
- **Developer-Friendly API**
- **Business-Focused Outputs**
The system is now ready for production use and can easily scale to serve content creators, digital marketers, and solopreneurs with professional-grade SEO analysis capabilities.