Content Calendar, Content Gap Analysis, and Content Optimization

ajaysi
2025-05-27 09:15:08 +05:30
parent 4049d19787
commit 889021c078
100 changed files with 18504 additions and 1251 deletions

@@ -0,0 +1,249 @@
# Content Gap Analysis Utils
This directory contains utility modules that power the Content Gap Analysis tool. These modules provide core functionality for data collection, processing, analysis, and storage.
## Directory Structure
```
utils/
├── README.md
├── ai_processor.py # AI-powered content analysis and processing
├── content_parser.py # Content structure parsing and analysis
├── data_collector.py # Website data collection and processing
└── storage.py # Analysis results storage and retrieval
```
## Module Descriptions
### 1. AI Processor (`ai_processor.py`)
The AI Processor module enhances content analysis using AI techniques. It provides intelligent analysis of website content, competitor data, and keyword research.
#### Key Features:
- Content quality assessment
- Topic analysis and clustering
- Performance metrics analysis
- Strategic recommendations generation
- Progress tracking for analysis tasks
#### Main Components:
- `AIProcessor`: Main class for AI-powered analysis
- `ProgressTracker`: Tracks analysis progress and status
#### Usage Example:
```python
from utils.ai_processor import AIProcessor
processor = AIProcessor()
analysis = processor.analyze_content({
    'url': 'https://example.com',
    'industry': 'technology',
    'content': content_data
})
```
### 2. Content Parser (`content_parser.py`)
The Content Parser module handles the parsing and analysis of website content structure. It provides detailed insights into content organization and quality.
#### Key Features:
- Content structure analysis
- Text statistics calculation
- Topic extraction
- Readability analysis
- Content hierarchy analysis
#### Main Components:
- `ContentParser`: Main class for content parsing and analysis
#### Usage Example:
```python
from utils.content_parser import ContentParser
parser = ContentParser()
structure = parser.parse_structure({
    'main_content': content,
    'html': html_content,
    'headings': headings_data
})
```
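Judging from `_analyze_hierarchy` in `content_parser.py`, each entry of `headings` is expected to be a dict with a `level` key such as `'h2'`. The stdlib-only sketch below illustrates that input shape and the hierarchy rule the parser applies (exactly one `h1`, no skipped levels). `check_hierarchy` and the `text` key are hypothetical names used for illustration, not part of the module.

```python
from collections import Counter
from typing import Dict, List


def check_hierarchy(headings: List[Dict[str, str]]) -> bool:
    """Return True when there is exactly one h1 and no heading level is skipped."""
    counts = Counter(h["level"] for h in headings)
    if counts.get("h1", 0) != 1:
        return False
    depths = sorted(int(level[1]) for level in counts)
    return all(b - a <= 1 for a, b in zip(depths, depths[1:]))


# Hypothetical input; only the 'level' key is read by the hierarchy check.
headings_data = [
    {"level": "h1", "text": "Guide"},
    {"level": "h2", "text": "Setup"},
    {"level": "h3", "text": "Install"},
]
good = check_hierarchy(headings_data)
skipped = check_hierarchy([
    {"level": "h1", "text": "Guide"},
    {"level": "h3", "text": "Install"},  # jumps from h1 straight to h3
])
```

Here `good` is true and `skipped` is false, because the second list has no `h2` between `h1` and `h3`.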
### 3. Data Collector (`data_collector.py`)
The Data Collector module is responsible for gathering website data for analysis. It handles web scraping and data extraction.
#### Key Features:
- Website content collection
- Metadata extraction
- Heading structure analysis
- Link and image extraction
- Error handling and retry logic
#### Main Components:
- `DataCollector`: Main class for data collection
#### Usage Example:
```python
from utils.data_collector import DataCollector
collector = DataCollector()
data = collector.collect('https://example.com')
```
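The feature list above mentions retry logic. A minimal sketch of one way to retry a fetch with exponential backoff follows, assuming a caller-supplied `fetch` callable. `fetch_with_retry`, `max_attempts`, `base_delay`, and the injectable `sleep` are hypothetical names, not the module's API.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying on any exception with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # Delay doubles after each failure: base, 2*base, 4*base, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

In practice `fetch` could wrap an HTTP request, and catching a narrower exception type than `Exception` is usually preferable.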
### 4. Storage (`storage.py`)
The Storage module manages the persistence and retrieval of analysis results. It provides a robust database interface for storing and accessing analysis data.
#### Key Features:
- Analysis results storage
- Historical data management
- Recommendation tracking
- User-specific analysis storage
- Error handling and rollback support
#### Main Components:
- `ContentGapAnalysisStorage`: Main class for storage operations
#### Usage Example:
```python
from utils.storage import ContentGapAnalysisStorage
storage = ContentGapAnalysisStorage(db_session)
analysis_id = storage.save_analysis(
    user_id=1,
    website_url='https://example.com',
    industry='technology',
    results=analysis_results
)
```
## Integration Points
### 1. Website Analysis Integration
```python
from utils.data_collector import DataCollector
from utils.content_parser import ContentParser
from utils.ai_processor import AIProcessor
# Collect data
collector = DataCollector()
data = collector.collect(url)
# Parse content
parser = ContentParser()
structure = parser.parse_structure(data)
# Process with AI
processor = AIProcessor()
analysis = processor.analyze_content({
    'url': url,
    'content': structure
})
```
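In the modules shown in this commit, `DataCollector.collect` returns the page text under `content` and the headings as a dict keyed by tag name, while `ContentParser.parse_structure` reads `main_content`, `html`, and a list of heading dicts with a `level` key. A minimal adapter sketch, assuming those shapes, might look like this. `to_parser_input` and the `text` key are hypothetical.

```python
from typing import Any, Dict, List


def to_parser_input(collected: Dict[str, Any]) -> Dict[str, Any]:
    """Reshape collector output into the keys the parser reads."""
    headings: List[Dict[str, str]] = [
        {"level": level, "text": text}
        for level, texts in collected.get("headings", {}).items()
        for text in texts
    ]
    return {
        "main_content": collected.get("content", ""),
        # The collector shown here does not return raw HTML, so this stays empty
        # unless the caller adds an 'html' key.
        "html": collected.get("html", ""),
        "headings": headings,
        "metadata": {
            "title": collected.get("title", ""),
            "meta_description": collected.get("meta_description", ""),
        },
    }


sample = {
    "url": "https://example.com",
    "title": "Example",
    "content": "Some page text.",
    "headings": {"h1": ["Example"], "h2": ["First", "Second"], "h3": []},
}
parser_input = to_parser_input(sample)
```

With an adapter of this kind, `parser.parse_structure(to_parser_input(data))` would receive the keys it looks up, although section extraction would still find nothing without the page HTML.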
### 2. Storage Integration
```python
from utils.storage import ContentGapAnalysisStorage
# Store analysis results
storage = ContentGapAnalysisStorage(db_session)
analysis_id = storage.save_analysis(
    user_id=user_id,
    website_url=url,
    industry=industry,
    results=analysis_results
)
# Retrieve analysis
results = storage.get_analysis(analysis_id)
```
## Error Handling
All modules implement comprehensive error handling:
1. **Data Collection Errors**
   - Network timeouts
   - Invalid URLs
   - Access restrictions
   - Parsing errors
2. **Processing Errors**
   - Invalid data formats
   - AI processing failures
   - Resource limitations
   - Analysis timeouts
3. **Storage Errors**
   - Database connection issues
   - Transaction failures
   - Data validation errors
   - Concurrent access conflicts
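
In the modules shown in this commit, `collect` and `parse_structure` catch exceptions and return a dict with an `error` key rather than raising. The sketch below shows how a caller might stop a pipeline at the first failure under that convention. `run_steps` is a hypothetical helper.

```python
from typing import Any, Callable, Dict, Iterable


def run_steps(
    initial: Dict[str, Any],
    steps: Iterable[Callable[[Dict[str, Any]], Dict[str, Any]]],
) -> Dict[str, Any]:
    """Feed each step's result into the next, stopping at the first 'error' key."""
    result = initial
    for step in steps:
        result = step(result)
        if "error" in result:
            break  # Later steps would only operate on incomplete data.
    return result
```

The returned dict is either the final result or the first error dict, so the caller only has to check for `error` once.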
## Best Practices
1. **Data Collection**
   - Implement rate limiting
   - Use proper user agents
   - Handle redirects
   - Validate input data
2. **Content Processing**
   - Clean and normalize data
   - Handle encoding issues
   - Implement fallback strategies
   - Cache processed results
3. **Storage Management**
   - Use transactions
   - Implement data validation
   - Handle concurrent access
   - Maintain data integrity
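
As one illustration of the rate-limiting advice, the sketch below enforces a minimum interval between requests. `MinIntervalLimiter` is a hypothetical helper, and the injectable `clock` and `sleep` parameters exist only to make it testable.

```python
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Block so that consecutive wait() calls are at least min_interval seconds apart."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        """Sleep for whatever remains of the interval, then record the call time."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `limiter.wait()` before each request keeps a crawler under one request per `min_interval` seconds.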
## Future Enhancements
1. **Performance Optimizations**
   - Implement parallel processing
   - Add caching layer
   - Optimize database queries
   - Enhance error recovery
2. **Feature Additions**
   - Content performance tracking
   - Automated content planning
   - Enhanced competitive intelligence
   - Advanced topic clustering
3. **Integration Improvements**
   - API endpoints
   - Export capabilities
   - Data visualization
   - Progress tracking
4. **UI/UX Enhancements**
   - Interactive visualizations
   - Real-time progress updates
   - Export interfaces
   - Customization options
## Contributing
When contributing to these utility modules:
1. Follow the existing code structure
2. Add comprehensive error handling
3. Include unit tests
4. Update documentation
5. Follow the PEP 8 style guide
## Dependencies
- BeautifulSoup4: HTML parsing
- NLTK: Natural language processing
- SQLAlchemy: Database operations
- Streamlit: UI components
- Requests: HTTP requests
## License
This project is licensed under the MIT License - see the LICENSE file for details.

@@ -0,0 +1,13 @@
"""
Utility modules for content gap analysis.
"""
from .data_collector import DataCollector
from .content_parser import ContentParser
from .ai_processor import AIProcessor
__all__ = [
    'DataCollector',
    'ContentParser',
    'AIProcessor'
]

File diff suppressed because it is too large

@@ -0,0 +1,236 @@
"""
Content parser utility for analyzing website content structure.
"""
from typing import Dict, Any, List
import re
from bs4 import BeautifulSoup
import nltk
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.corpus import stopwords
from collections import Counter
class ContentParser:
    """Parser for analyzing website content structure."""

    def __init__(self):
        """Initialize the content parser."""
        try:
            nltk.data.find('tokenizers/punkt')
        except LookupError:
            nltk.download('punkt')
        try:
            nltk.data.find('corpora/stopwords')
        except LookupError:
            nltk.download('stopwords')
        self.stop_words = set(stopwords.words('english'))
    def parse_structure(self, content: Dict[str, Any]) -> Dict[str, Any]:
        """
        Parse and analyze the structure of website content.

        Args:
            content: Dictionary containing website content

        Returns:
            Dictionary containing parsed content structure
        """
        try:
            # Parse main content
            main_content = content.get('main_content', '')
            soup = BeautifulSoup(content.get('html', ''), 'html.parser')
            # Extract text statistics
            text_stats = self._analyze_text(main_content)
            # Extract content sections
            sections = self._extract_sections(soup)
            # Extract topics
            topics = self._extract_topics(main_content)
            # Analyze readability
            readability = self._analyze_readability(main_content)
            # Analyze content hierarchy
            hierarchy = self._analyze_hierarchy(content.get('headings', []))
            return {
                'text_statistics': text_stats,
                'sections': sections,
                'topics': topics,
                'readability': readability,
                'hierarchy': hierarchy,
                'metadata': content.get('metadata', {})
            }
        except Exception as e:
            return {
                'error': str(e),
                'text_statistics': {},
                'sections': [],
                'topics': [],
                'readability': {},
                'hierarchy': {},
                'metadata': {}
            }
    def _analyze_text(self, text: str) -> Dict[str, Any]:
        """Analyze text statistics."""
        sentences = sent_tokenize(text)
        words = word_tokenize(text.lower())
        words = [w for w in words if w.isalnum() and w not in self.stop_words]
        return {
            'word_count': len(words),
            'sentence_count': len(sentences),
            'average_sentence_length': len(words) / max(len(sentences), 1),
            'unique_words': len(set(words)),
            'stop_words': len([w for w in word_tokenize(text.lower()) if w in self.stop_words]),
            'characters': len(text),
            'paragraphs': len(text.split('\n\n')),
            'sentences': sentences
        }
    def _extract_sections(self, soup: BeautifulSoup) -> List[Dict[str, Any]]:
        """Extract content sections."""
        sections = []
        # Find main content containers
        containers = soup.find_all(['article', 'section', 'div'], class_=re.compile(r'content|main|article|section'))
        for container in containers:
            # Get section heading
            heading = container.find(['h1', 'h2', 'h3'])
            heading_text = heading.get_text().strip() if heading else 'Untitled Section'
            # Get section content
            content = container.get_text().strip()
            # Get section type
            section_type = container.name
            if container.get('class'):
                section_type = ' '.join(container.get('class'))
            sections.append({
                'heading': heading_text,
                'content': content,
                'type': section_type,
                'word_count': len(word_tokenize(content)),
                'position': self._get_element_position(container)
            })
        return sections
    def _extract_topics(self, text: str) -> List[Dict[str, Any]]:
        """Extract main topics from content."""
        # Tokenize and clean text
        words = word_tokenize(text.lower())
        words = [w for w in words if w.isalnum() and w not in self.stop_words]
        # Get word frequencies
        word_freq = Counter(words)
        # Get top topics
        topics = []
        for word, freq in word_freq.most_common(10):
            topics.append({
                'topic': word,
                'frequency': freq,
                'percentage': freq / len(words) * 100
            })
        return topics
    def _analyze_readability(self, text: str) -> Dict[str, float]:
        """Analyze text readability."""
        sentences = sent_tokenize(text)
        words = word_tokenize(text.lower())
        words = [w for w in words if w.isalnum()]
        # Calculate average sentence length
        avg_sentence_length = len(words) / max(len(sentences), 1)
        # Calculate average word length
        avg_word_length = sum(len(w) for w in words) / max(len(words), 1)
        # Calculate Flesch Reading Ease score
        # Formula: 206.835 - 1.015(total words/total sentences) - 84.6(total syllables/total words)
        syllables = sum(self._count_syllables(w) for w in words)
        flesch_score = 206.835 - 1.015 * avg_sentence_length - 84.6 * (syllables / max(len(words), 1))
        return {
            'flesch_score': max(0, min(100, flesch_score)),
            'avg_sentence_length': avg_sentence_length,
            'avg_word_length': avg_word_length,
            'syllables_per_word': syllables / max(len(words), 1)
        }
    def _analyze_hierarchy(self, headings: List[Dict[str, Any]]) -> Dict[str, Any]:
        """Analyze content hierarchy."""
        # Group headings by level
        heading_levels = {}
        for heading in headings:
            level = heading['level']
            if level not in heading_levels:
                heading_levels[level] = []
            heading_levels[level].append(heading)
        # Calculate hierarchy metrics
        total_headings = len(headings)
        max_depth = max(int(level[1]) for level in heading_levels.keys()) if heading_levels else 0
        return {
            'total_headings': total_headings,
            'max_depth': max_depth,
            'heading_distribution': {level: len(items) for level, items in heading_levels.items()},
            'has_proper_hierarchy': self._check_proper_hierarchy(heading_levels)
        }
    def _check_proper_hierarchy(self, heading_levels: Dict[str, List[Dict[str, Any]]]) -> bool:
        """Check if headings follow proper hierarchy."""
        if not heading_levels:
            return False
        # Check if h1 exists
        if 'h1' not in heading_levels:
            return False
        # Check if h1 is unique
        if len(heading_levels['h1']) > 1:
            return False
        # Check if levels are sequential
        levels = sorted(int(level[1]) for level in heading_levels.keys())
        return all(levels[i] - levels[i-1] <= 1 for i in range(1, len(levels)))
    def _count_syllables(self, word: str) -> int:
        """Count syllables in a word using a vowel-group heuristic."""
        word = word.lower()
        if not word:
            # Guard against an IndexError on word[0] for empty input
            return 0
        count = 0
        vowels = 'aeiouy'
        if word[0] in vowels:
            count += 1
        # Count each vowel that starts a new vowel group
        for index in range(1, len(word)):
            if word[index] in vowels and word[index - 1] not in vowels:
                count += 1
        # Treat a trailing 'e' as silent
        if word.endswith('e'):
            count -= 1
        if count == 0:
            count += 1
        return count
    def _get_element_position(self, element) -> Dict[str, int]:
        """Get element position in the document."""
        try:
            return {
                'top': element.sourceline,
                'left': element.sourcepos
            }
        except Exception:
            # Fall back to the origin when the parser provides no position data
            return {
                'top': 0,
                'left': 0
            }
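
As a worked illustration of the Flesch Reading Ease formula quoted in `_analyze_readability`, the stdlib-only sketch below uses regex tokenization in place of NLTK (an assumption made to keep it self-contained) and the same vowel-group syllable heuristic. `flesch_reading_ease` and `count_syllables` are hypothetical standalone functions, not part of the class above.

```python
import re


def count_syllables(word: str) -> int:
    """Vowel-group heuristic: count vowel runs, drop a trailing 'e', minimum one."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e"):
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words), clamped to 0..100."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        # Assumption: score empty text as 0 rather than applying the formula to zero counts.
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    score = (
        206.835
        - 1.015 * (len(words) / max(len(sentences), 1))
        - 84.6 * (syllables / len(words))
    )
    return max(0.0, min(100.0, score))


score = flesch_reading_ease("The quick brown fox jumps over the lazy dog.")
```

The sample sentence has 9 words, 1 sentence, and 11 heuristic syllables, so the unclamped score is 206.835 - 9.135 - 103.4, or about 94.3.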

@@ -0,0 +1,112 @@
"""
Data collector utility for content gap analysis.
"""
import requests
from bs4 import BeautifulSoup
from typing import Dict, Any
class DataCollector:
    """
    Collects and processes website data for analysis.
    """

    def __init__(self):
        """Initialize the data collector."""
        self.headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
        }
    def collect(self, url: str) -> Dict[str, Any]:
        """
        Collect website data for analysis.

        Args:
            url (str): The URL to collect data from

        Returns:
            dict: Collected website data, or a dict with an 'error' key on failure
        """
        try:
            # Fetch webpage content; the timeout keeps a stalled server from
            # blocking the call indefinitely (requests has no default timeout)
            response = requests.get(url, headers=self.headers, timeout=10)
            response.raise_for_status()
            # Parse HTML content
            soup = BeautifulSoup(response.text, 'html.parser')
            # Extract relevant data
            data = {
                'url': url,
                'title': self._extract_title(soup),
                'meta_description': self._extract_meta_description(soup),
                'headings': self._extract_headings(soup),
                'content': self._extract_content(soup),
                'links': self._extract_links(soup),
                'images': self._extract_images(soup)
            }
            return data
        except Exception as e:
            return {
                'error': str(e),
                'url': url
            }
    def _extract_title(self, soup: BeautifulSoup) -> str:
        """Extract page title."""
        title = soup.find('title')
        return title.text if title else ''

    def _extract_meta_description(self, soup: BeautifulSoup) -> str:
        """Extract meta description."""
        meta = soup.find('meta', attrs={'name': 'description'})
        return meta.get('content', '') if meta else ''

    def _extract_headings(self, soup: BeautifulSoup) -> Dict[str, list]:
        """Extract all headings."""
        headings = {}
        for i in range(1, 7):
            tags = soup.find_all(f'h{i}')
            headings[f'h{i}'] = [tag.text.strip() for tag in tags]
        return headings
    def _extract_content(self, soup: BeautifulSoup) -> str:
        """Extract main content."""
        # Remove script and style elements
        for script in soup(['script', 'style']):
            script.decompose()
        # Get text content
        text = soup.get_text()
        # Clean up text
        lines = (line.strip() for line in text.splitlines())
        chunks = (phrase.strip() for line in lines for phrase in line.split(" "))
        text = ' '.join(chunk for chunk in chunks if chunk)
        return text
    def _extract_links(self, soup: BeautifulSoup) -> list:
        """Extract all links."""
        links = []
        for link in soup.find_all('a'):
            href = link.get('href')
            if href:
                links.append({
                    'url': href,
                    'text': link.text.strip()
                })
        return links

    def _extract_images(self, soup: BeautifulSoup) -> list:
        """Extract all images."""
        images = []
        for img in soup.find_all('img'):
            images.append({
                'src': img.get('src', ''),
                'alt': img.get('alt', ''),
                'title': img.get('title', '')
            })
        return images

@@ -0,0 +1,237 @@
"""
SEO analyzer utility for content gap analysis.
"""
import requests
from bs4 import BeautifulSoup
from urllib.parse import urlparse, urljoin
import re
from typing import Dict, Any, List, Optional
from ....utils.website_analyzer.analyzer import WebsiteAnalyzer
def analyze_onpage_seo(url: str) -> Dict[str, Any]:
    """
    Analyze on-page SEO elements of a website.

    Args:
        url: The URL to analyze

    Returns:
        Dictionary containing SEO analysis results
    """
    try:
        # Use the combined website analyzer
        analyzer = WebsiteAnalyzer()
        analysis = analyzer.analyze_website(url)
        if not analysis.get('success', False):
            return {
                'error': analysis.get('error', 'Unknown error in SEO analysis'),
                'meta_title': '',
                'meta_description': '',
                'has_robots_txt': False,
                'has_sitemap': False,
                'mobile_friendly': False,
                'load_time': 0
            }
        # Extract relevant information from the analysis
        seo_info = analysis['data']['analysis']['seo_info']
        basic_info = analysis['data']['analysis']['basic_info']
        performance = analysis['data']['analysis']['performance']
        return {
            'meta_tags': seo_info.get('meta_tags', {}),
            'content': seo_info.get('content', {}),
            'meta_title': basic_info.get('title', ''),
            'meta_description': basic_info.get('meta_description', ''),
            'has_robots_txt': bool(basic_info.get('robots_txt')),
            'has_sitemap': bool(basic_info.get('sitemap')),
            'mobile_friendly': True,  # This would need to be implemented separately
            'load_time': performance.get('load_time', 0)
        }
    except Exception as e:
        return {
            'error': str(e),
            'meta_title': '',
            'meta_description': '',
            'has_robots_txt': False,
            'has_sitemap': False,
            'mobile_friendly': False,
            'load_time': 0
        }
def _analyze_meta_tags(soup: BeautifulSoup) -> Dict[str, Any]:
    """Analyze meta tags of the webpage."""
    meta_tags = {}
    # Title tag (get_text() avoids a None from .string on empty or nested titles)
    title_tag = soup.find('title')
    if title_tag:
        meta_tags['title'] = title_tag.get_text().strip()
    # Meta description
    meta_desc = soup.find('meta', {'name': 'description'})
    if meta_desc:
        meta_tags['description'] = meta_desc.get('content', '').strip()
    # Meta keywords
    meta_keywords = soup.find('meta', {'name': 'keywords'})
    if meta_keywords:
        meta_tags['keywords'] = meta_keywords.get('content', '').strip()
    # Open Graph tags
    og_tags = {}
    for tag in soup.find_all('meta', property=re.compile(r'^og:')):
        og_tags[tag['property']] = tag.get('content', '')
    meta_tags['og_tags'] = og_tags
    # Twitter Card tags. The 'name' attribute must be passed through attrs:
    # as a keyword it is find_all()'s tag-name parameter and clashes with 'meta'.
    twitter_tags = {}
    for tag in soup.find_all('meta', attrs={'name': re.compile(r'^twitter:')}):
        twitter_tags[tag['name']] = tag.get('content', '')
    meta_tags['twitter_tags'] = twitter_tags
    return meta_tags
def _analyze_headings(soup: BeautifulSoup) -> Dict[str, Any]:
    """Analyze heading structure of the webpage."""
    headings = {
        'h1': [],
        'h2': [],
        'h3': [],
        'h4': [],
        'h5': [],
        'h6': []
    }
    for tag in ['h1', 'h2', 'h3', 'h4', 'h5', 'h6']:
        for heading in soup.find_all(tag):
            headings[tag].append(heading.get_text().strip())
    return headings
def _analyze_content(soup: BeautifulSoup) -> Dict[str, Any]:
    """Analyze main content of the webpage."""
    # Find main content
    main_content = soup.find('main') or soup.find('article') or soup.find('div', class_=re.compile(r'content|main|article'))
    if not main_content:
        return {
            'word_count': 0,
            'paragraph_count': 0,
            'content': ''
        }
    # Get text content
    content = main_content.get_text()
    # Count words and paragraphs
    words = content.split()
    paragraphs = main_content.find_all('p')
    return {
        'word_count': len(words),
        'paragraph_count': len(paragraphs),
        'content': content
    }
def _analyze_links(soup: BeautifulSoup, base_url: str) -> Dict[str, Any]:
    """Analyze links on the webpage."""
    links = {
        'internal': [],
        'external': [],
        'broken': []
    }
    base_domain = urlparse(base_url).netloc
    for link in soup.find_all('a', href=True):
        href = link['href']
        # Handle relative URLs
        if not href.startswith(('http://', 'https://')):
            href = urljoin(base_url, href)
        # Categorize link
        if urlparse(href).netloc == base_domain:
            links['internal'].append({
                'url': href,
                'text': link.get_text().strip(),
                'title': link.get('title', '')
            })
        else:
            links['external'].append({
                'url': href,
                'text': link.get_text().strip(),
                'title': link.get('title', '')
            })
    return links
def _analyze_images(soup: BeautifulSoup) -> Dict[str, Any]:
    """Analyze images on the webpage."""
    images = []
    for img in soup.find_all('img'):
        image_data = {
            'src': img.get('src', ''),
            'alt': img.get('alt', ''),
            'title': img.get('title', ''),
            'width': img.get('width', ''),
            'height': img.get('height', ''),
            'has_alt': bool(img.get('alt')),
            'has_title': bool(img.get('title')),
            'has_dimensions': bool(img.get('width') and img.get('height'))
        }
        images.append(image_data)
    return {
        'total': len(images),
        'with_alt': sum(1 for img in images if img['has_alt']),
        'with_title': sum(1 for img in images if img['has_title']),
        'with_dimensions': sum(1 for img in images if img['has_dimensions']),
        'images': images
    }
def _check_technical_elements(soup: BeautifulSoup, url: str) -> Dict[str, Any]:
    """Check technical SEO elements."""
    base_url = urlparse(url)
    domain = base_url.netloc
    # Check robots.txt
    robots_url = f"{base_url.scheme}://{domain}/robots.txt"
    try:
        robots_response = requests.get(robots_url, timeout=5)
        has_robots_txt = robots_response.status_code == 200
    except requests.RequestException:
        has_robots_txt = False
    # Check sitemap
    sitemap_url = f"{base_url.scheme}://{domain}/sitemap.xml"
    try:
        sitemap_response = requests.get(sitemap_url, timeout=5)
        has_sitemap = sitemap_response.status_code == 200
    except requests.RequestException:
        has_sitemap = False
    # Check mobile friendliness (presence of a viewport meta tag)
    viewport = soup.find('meta', {'name': 'viewport'})
    has_viewport = bool(viewport)
    # Check canonical URL
    canonical = soup.find('link', {'rel': 'canonical'})
    has_canonical = bool(canonical)
    # Check language; guard against documents without an <html> element
    html_tag = soup.find('html')
    html_lang = html_tag.get('lang', '') if html_tag else ''
    has_language = bool(html_lang)
    return {
        'has_robots_txt': has_robots_txt,
        'has_sitemap': has_sitemap,
        'mobile_friendly': has_viewport,
        'has_canonical': has_canonical,
        'has_language': has_language,
        'language': html_lang
    }
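
The internal/external split in `_analyze_links` rests on resolving each `href` against the page URL and comparing hosts. A stdlib-only sketch of that single step follows, with `classify_link` as a hypothetical helper that is not part of the module.

```python
from typing import Tuple
from urllib.parse import urljoin, urlparse


def classify_link(base_url: str, href: str) -> Tuple[str, str]:
    """Resolve href against base_url and label it 'internal' or 'external' by host."""
    absolute = urljoin(base_url, href)
    same_host = urlparse(absolute).netloc == urlparse(base_url).netloc
    return absolute, ("internal" if same_host else "external")


relative = classify_link("https://example.com/blog/", "post-1")
rooted = classify_link("https://example.com/blog/", "/about")
offsite = classify_link("https://example.com/", "https://other.org/x")
```

Under this rule, scheme-only links such as `mailto:` have an empty host and land in the external bucket, and `www.example.com` counts as a different host from `example.com`.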

@@ -0,0 +1,270 @@
"""
Storage module for content gap analysis results.
"""
from typing import Dict, Any, List, Optional
from datetime import datetime
from sqlalchemy.orm import Session
from sqlalchemy.exc import SQLAlchemyError
import streamlit as st
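# NOTE: the ORM models used below (ContentGapAnalysis, WebsiteAnalysis,
# CompetitorAnalysis, KeywordAnalysis, ContentRecommendation, AnalysisHistory)
# are not imported in this file as shown and must be imported before use.
class ContentGapAnalysisStorage:
    """Handles storage and retrieval of content gap analysis results."""

    def __init__(self, db_session: Session):
        """Initialize the storage handler."""
        self.db = db_session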
    def save_analysis(self, user_id: int, website_url: str, industry: str, results: Dict[str, Any]) -> Optional[int]:
        """
        Save content gap analysis results.

        Args:
            user_id: User ID
            website_url: Target website URL
            industry: Industry category
            results: Analysis results dictionary

        Returns:
            Analysis ID if successful, None otherwise
        """
        try:
            # Create main analysis record
            analysis = ContentGapAnalysis(
                user_id=user_id,
                website_url=website_url,
                industry=industry,
                status='completed',
                metadata={'version': '1.0'}
            )
            self.db.add(analysis)
            self.db.flush()  # Get the ID without committing
            # Save website analysis
            website_analysis = WebsiteAnalysis(
                content_gap_analysis_id=analysis.id,
                content_score=results.get('website', {}).get('content_score', 0),
                seo_score=results.get('website', {}).get('seo_score', 0),
                structure_score=results.get('website', {}).get('structure_score', 0),
                content_metrics=results.get('website', {}).get('content_metrics', {}),
                seo_metrics=results.get('website', {}).get('seo_metrics', {}),
                technical_metrics=results.get('website', {}).get('technical_metrics', {}),
                ai_insights=results.get('website', {}).get('ai_insights', {})
            )
            self.db.add(website_analysis)
            # Save competitor analysis if available
            if 'competitors' in results:
                for competitor in results['competitors']:
                    competitor_analysis = CompetitorAnalysis(
                        content_gap_analysis_id=analysis.id,
                        competitor_url=competitor.get('url'),
                        market_position=competitor.get('market_position', {}),
                        content_gaps=competitor.get('content_gaps', []),
                        competitive_advantages=competitor.get('competitive_advantages', []),
                        trend_analysis=competitor.get('trend_analysis', {})
                    )
                    self.db.add(competitor_analysis)
            # Save keyword analysis
            keyword_analysis = KeywordAnalysis(
                content_gap_analysis_id=analysis.id,
                top_keywords=results.get('keywords', {}).get('top_keywords', []),
                search_intent=results.get('keywords', {}).get('search_intent', {}),
                opportunities=results.get('keywords', {}).get('opportunities', []),
                trend_analysis=results.get('keywords', {}).get('trend_analysis', {})
            )
            self.db.add(keyword_analysis)
            # Save recommendations
            for recommendation in results.get('recommendations', []):
                content_recommendation = ContentRecommendation(
                    content_gap_analysis_id=analysis.id,
                    recommendation_type=recommendation.get('type'),
                    priority_score=recommendation.get('priority_score', 0),
                    recommendation=recommendation.get('recommendation', ''),
                    implementation_steps=recommendation.get('implementation_steps', []),
                    expected_impact=recommendation.get('expected_impact', {}),
                    status='pending'
                )
                self.db.add(content_recommendation)
            # Save analysis history
            history = AnalysisHistory(
                content_gap_analysis_id=analysis.id,
                status='completed',
                metrics={'duration': results.get('duration', 0)}
            )
            self.db.add(history)
            # Commit all changes
            self.db.commit()
            return analysis.id
        except SQLAlchemyError as e:
            self.db.rollback()
            st.error(f"Error saving analysis results: {str(e)}")
            return None
    def get_analysis(self, analysis_id: int) -> Optional[Dict[str, Any]]:
        """
        Retrieve content gap analysis results.

        Args:
            analysis_id: Analysis ID

        Returns:
            Dictionary containing analysis results if found, None otherwise
        """
        try:
            analysis = self.db.query(ContentGapAnalysis).get(analysis_id)
            if not analysis:
                return None
            # Get website analysis
            website_analysis = self.db.query(WebsiteAnalysis).filter_by(
                content_gap_analysis_id=analysis_id
            ).first()
            # Get competitor analysis
            competitor_analyses = self.db.query(CompetitorAnalysis).filter_by(
                content_gap_analysis_id=analysis_id
            ).all()
            # Get keyword analysis
            keyword_analysis = self.db.query(KeywordAnalysis).filter_by(
                content_gap_analysis_id=analysis_id
            ).first()
            # Get recommendations
            recommendations = self.db.query(ContentRecommendation).filter_by(
                content_gap_analysis_id=analysis_id
            ).all()
            # Get analysis history
            history = self.db.query(AnalysisHistory).filter_by(
                content_gap_analysis_id=analysis_id
            ).order_by(AnalysisHistory.run_date.desc()).all()
            return {
                'id': analysis.id,
                'website_url': analysis.website_url,
                'industry': analysis.industry,
                'analysis_date': analysis.analysis_date,
                'status': analysis.status,
                'website': {
                    'content_score': website_analysis.content_score,
                    'seo_score': website_analysis.seo_score,
                    'structure_score': website_analysis.structure_score,
                    'content_metrics': website_analysis.content_metrics,
                    'seo_metrics': website_analysis.seo_metrics,
                    'technical_metrics': website_analysis.technical_metrics,
                    'ai_insights': website_analysis.ai_insights
                } if website_analysis else {},
                'competitors': [{
                    'url': ca.competitor_url,
                    'market_position': ca.market_position,
                    'content_gaps': ca.content_gaps,
                    'competitive_advantages': ca.competitive_advantages,
                    'trend_analysis': ca.trend_analysis
                } for ca in competitor_analyses],
                'keywords': {
                    'top_keywords': keyword_analysis.top_keywords,
                    'search_intent': keyword_analysis.search_intent,
                    'opportunities': keyword_analysis.opportunities,
                    'trend_analysis': keyword_analysis.trend_analysis
                } if keyword_analysis else {},
                'recommendations': [{
                    'type': r.recommendation_type,
                    'priority_score': r.priority_score,
                    'recommendation': r.recommendation,
                    'implementation_steps': r.implementation_steps,
                    'expected_impact': r.expected_impact,
                    'status': r.status
                } for r in recommendations],
                'history': [{
                    'run_date': h.run_date,
                    'status': h.status,
                    'metrics': h.metrics,
                    'error_log': h.error_log
                } for h in history]
            }
        except SQLAlchemyError as e:
            st.error(f"Error retrieving analysis results: {str(e)}")
            return None
    def get_user_analyses(self, user_id: int) -> List[Dict[str, Any]]:
        """
        Get all analyses for a user.

        Args:
            user_id: User ID

        Returns:
            List of analysis summaries
        """
        try:
            analyses = self.db.query(ContentGapAnalysis).filter_by(
                user_id=user_id
            ).order_by(ContentGapAnalysis.analysis_date.desc()).all()
            return [{
                'id': analysis.id,
                'website_url': analysis.website_url,
                'industry': analysis.industry,
                'analysis_date': analysis.analysis_date,
                'status': analysis.status
            } for analysis in analyses]
        except SQLAlchemyError as e:
            st.error(f"Error retrieving user analyses: {str(e)}")
            return []
    def update_recommendation_status(self, recommendation_id: int, status: str) -> bool:
        """
        Update the status of a recommendation.

        Args:
            recommendation_id: Recommendation ID
            status: New status

        Returns:
            True if successful, False otherwise
        """
        try:
            recommendation = self.db.query(ContentRecommendation).get(recommendation_id)
            if recommendation:
                recommendation.status = status
                recommendation.updated_at = datetime.utcnow()
                self.db.commit()
                return True
            return False
        except SQLAlchemyError as e:
            self.db.rollback()
            st.error(f"Error updating recommendation status: {str(e)}")
            return False
    def delete_analysis(self, analysis_id: int) -> bool:
        """
        Delete an analysis and all related data.

        Args:
            analysis_id: Analysis ID

        Returns:
            True if successful, False otherwise
        """
        try:
            analysis = self.db.query(ContentGapAnalysis).get(analysis_id)
            if analysis:
                self.db.delete(analysis)
                self.db.commit()
                return True
            return False
        except SQLAlchemyError as e:
            self.db.rollback()
            st.error(f"Error deleting analysis: {str(e)}")
            return False
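
`save_analysis` writes several related rows and commits once, rolling back on any database error. The same all-or-nothing pattern is sketched below with the stdlib `sqlite3` module standing in for SQLAlchemy. The two-table layout and `save_with_children` are hypothetical and much simpler than the models used above.

```python
import sqlite3
from typing import List, Optional


def save_with_children(
    conn: sqlite3.Connection, url: str, notes: List[Optional[str]]
) -> Optional[int]:
    """Insert a parent row and its child rows atomically; return the parent id or None."""
    try:
        cur = conn.execute("INSERT INTO analysis (url) VALUES (?)", (url,))
        analysis_id = cur.lastrowid  # Known before commit, much like flush() in the ORM code.
        conn.executemany(
            "INSERT INTO recommendation (analysis_id, note) VALUES (?, ?)",
            [(analysis_id, note) for note in notes],
        )
        conn.commit()
        return analysis_id
    except sqlite3.Error:
        conn.rollback()  # Undo the parent insert and any child rows already written.
        return None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE analysis (id INTEGER PRIMARY KEY, url TEXT NOT NULL)")
conn.execute("CREATE TABLE recommendation (analysis_id INTEGER NOT NULL, note TEXT NOT NULL)")
conn.commit()
```

If any child insert violates a constraint (for example a `None` note), the rollback also removes the parent row, so no partial analysis is left behind.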