Base code
288 backend/services/seo_analyzer/README.md Normal file
@@ -0,0 +1,288 @@
# SEO Analyzer Module

A comprehensive, modular SEO analysis system for web applications that provides detailed insights and actionable recommendations for improving search engine optimization.

## 🚀 Features

### ✅ **Currently Implemented**

#### **Core Analysis Components**
- **URL Structure Analysis**: Checks URL length, HTTPS usage, special characters, and URL formatting
- **Meta Data Analysis**: Analyzes title tags, meta descriptions, viewport settings, and character encoding
- **Content Analysis**: Evaluates content quality, word count, heading structure, and readability
- **Technical SEO Analysis**: Checks robots.txt, sitemaps, structured data, and canonical URLs
- **Performance Analysis**: Measures page load speed, compression, caching, and optimization
- **Accessibility Analysis**: Checks alt text, form labels, heading structure, and color contrast
- **User Experience Analysis**: Checks mobile responsiveness, navigation, contact info, and social links
- **Security Headers Analysis**: Analyzes security headers for protection against common vulnerabilities
- **Keyword Analysis**: Evaluates keyword usage and optimization for target keywords

#### **AI-Powered Insights**
- **Intelligent Issue Detection**: Automatically identifies critical SEO problems
- **Actionable Recommendations**: Provides specific fixes with code examples
- **Priority-Based Suggestions**: Categorizes issues by severity and impact
- **Context-Aware Solutions**: Offers location-specific fixes and improvements

#### **Advanced Features**
- **Progressive Analysis**: Runs faster analyses first, then slower ones with graceful fallbacks
- **Timeout Handling**: Robust error handling for network issues and timeouts
- **Detailed Reporting**: Comprehensive analysis with scores, issues, warnings, and recommendations
- **Modular Architecture**: Reusable components for easy maintenance and extension

### 🔄 **Coming Soon**

#### **Enhanced Analysis Features**
- **Core Web Vitals Analysis**: LCP, FID, CLS measurements
- **Mobile-First Analysis**: Comprehensive mobile optimization checks
- **Schema Markup Validation**: Advanced structured data analysis
- **Image Optimization Analysis**: Alt text, compression, and format recommendations
- **Internal Linking Analysis**: Site structure and internal link optimization
- **Social Media Optimization**: Open Graph and Twitter Card analysis

#### **AI-Powered Enhancements**
- **Natural Language Processing**: Advanced content analysis using NLP
- **Competitive Analysis**: Compare against competitor websites
- **Trend Analysis**: Identify SEO trends and opportunities
- **Predictive Insights**: Forecast potential ranking improvements
- **Automated Fix Generation**: AI-generated code fixes and optimizations

#### **Advanced Features**
- **Bulk Analysis**: Analyze multiple URLs simultaneously
- **Historical Tracking**: Monitor SEO improvements over time
- **Custom Rule Engine**: User-defined analysis rules and thresholds
- **API Integration**: Connect with Google Search Console, Analytics, and other tools
- **White-Label Support**: Customizable branding and reporting

#### **Enterprise Features**
- **Multi-User Support**: Team collaboration and role-based access
- **Advanced Reporting**: Custom dashboards and detailed analytics
- **API Rate Limiting**: Intelligent request management
- **Caching System**: Optimized performance for repeated analyses
- **Webhook Support**: Real-time notifications and integrations

## 📁 **Module Structure**

```
seo_analyzer/
├── __init__.py     # Package initialization and exports
├── core.py         # Main analyzer class and data structures
├── analyzers.py    # Individual analysis components
├── utils.py        # Utility classes (HTML fetcher, AI insights)
├── service.py      # Database service for storing/retrieving results
└── README.md       # This documentation
```

### **Core Components**

#### **`core.py`**
- `ComprehensiveSEOAnalyzer`: Main orchestrator class
- `SEOAnalysisResult`: Data structure for analysis results
- Progressive analysis with error handling

#### **`analyzers.py`**
- `BaseAnalyzer`: Base class for all analyzers
- `URLStructureAnalyzer`: URL analysis and security checks
- `MetaDataAnalyzer`: Meta tags and technical SEO
- `ContentAnalyzer`: Content quality and structure
- `TechnicalSEOAnalyzer`: Technical SEO elements
- `PerformanceAnalyzer`: Page speed and optimization
- `AccessibilityAnalyzer`: Accessibility compliance
- `UserExperienceAnalyzer`: UX and mobile optimization
- `SecurityHeadersAnalyzer`: Security header analysis
- `KeywordAnalyzer`: Keyword optimization

#### **`utils.py`**
- `HTMLFetcher`: Robust HTML content fetching
- `AIInsightGenerator`: AI-powered insights generation

#### **`service.py`**
- `SEOAnalysisService`: Database operations for storing and retrieving analysis results
- Analysis history tracking
- Statistics and reporting
- CRUD operations for analysis data

## 🛠 **Usage**

### **Basic Usage**

```python
from services.seo_analyzer import ComprehensiveSEOAnalyzer

# Initialize analyzer
analyzer = ComprehensiveSEOAnalyzer()

# Analyze a URL
result = analyzer.analyze_url_progressive(
    url="https://example.com",
    target_keywords=["seo", "optimization"]
)

# Access results
print(f"Overall Score: {result.overall_score}")
print(f"Health Status: {result.health_status}")
print(f"Critical Issues: {len(result.critical_issues)}")
```

### **Individual Analyzer Usage**

```python
from services.seo_analyzer import URLStructureAnalyzer, MetaDataAnalyzer, HTMLFetcher

# URL analysis
url_analyzer = URLStructureAnalyzer()
url_result = url_analyzer.analyze("https://example.com")

# Meta data analysis (fetch the HTML first)
html_content = HTMLFetcher().fetch_html("https://example.com")
meta_analyzer = MetaDataAnalyzer()
meta_result = meta_analyzer.analyze(html_content, "https://example.com")
```

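### **Storing Results (Illustrative)**

Analysis results can be persisted with `SEOAnalysisService` from `service.py`. A minimal sketch, assuming your application already provides a SQLAlchemy session factory (the `SessionLocal` import path below is hypothetical and depends on your setup):

```python
from services.seo_analyzer import ComprehensiveSEOAnalyzer, SEOAnalysisService
from services.database import SessionLocal  # hypothetical session factory

analyzer = ComprehensiveSEOAnalyzer()
result = analyzer.analyze_url_progressive(url="https://example.com")

db = SessionLocal()
try:
    service = SEOAnalysisService(db)
    record = service.store_analysis_result(result)  # returns the stored record or None
    history = service.get_analysis_history("https://example.com", limit=5)
finally:
    db.close()
```
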
## 📊 **Analysis Categories**

### **URL Structure & Security**
- URL length optimization
- HTTPS implementation
- Special character handling
- URL readability and formatting

### **Meta Data & Technical SEO**
- Title tag optimization (30-60 characters)
- Meta description analysis (70-160 characters)
- Viewport meta tag presence
- Character encoding declaration

### **Content Analysis**
- Word count evaluation (minimum 300 words)
- Heading hierarchy (H1, H2, H3 structure)
- Image alt text compliance
- Internal linking analysis
- Spelling error detection

### **Technical SEO**
- Robots.txt accessibility
- XML sitemap presence
- Structured data markup
- Canonical URL implementation

### **Performance**
- Page load time measurement
- GZIP compression detection
- Caching header analysis
- Resource optimization recommendations

### **Accessibility**
- Image alt text compliance
- Form label associations
- Heading hierarchy validation
- Color contrast recommendations

### **User Experience**
- Mobile responsiveness checks
- Navigation menu analysis
- Contact information presence
- Social media link integration

### **Security Headers**
- X-Frame-Options
- X-Content-Type-Options
- X-XSS-Protection
- Strict-Transport-Security
- Content-Security-Policy
- Referrer-Policy

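The `SecurityHeadersAnalyzer` can also be run on its own to see which of these headers a page currently sends. A short sketch (the keys shown match the analyzer's return value):

```python
from services.seo_analyzer import SecurityHeadersAnalyzer

headers_result = SecurityHeadersAnalyzer().analyze("https://example.com")
print(headers_result["present_headers"])  # e.g. ['Strict-Transport-Security']
print(headers_result["missing_headers"])  # headers to add server-side
print(headers_result["score"])            # 16 points per header present
```
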
### **Keyword Analysis**
- Title keyword presence
- Content keyword density
- Natural keyword integration
- Target keyword optimization

## 🎯 **Scoring System**

### **Overall Health Status**
- **Excellent (80-100)**: Optimal SEO performance
- **Good (60-79)**: Good performance with minor improvements needed
- **Needs Improvement (40-59)**: Significant issues requiring attention
- **Poor (0-39)**: Critical issues requiring immediate action

### **Issue Categories**
- **Critical Issues**: Major problems affecting rankings (25 points each)
- **Warnings**: Important improvements for better performance (10 points each)
- **Recommendations**: Optional enhancements for optimal results

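Each category score starts at 100 and is reduced by these penalties; the overall score is the integer average of the category scores. An illustrative calculation mirroring the formula used in the analyzers and in `core.py`:

```python
def category_score(critical_issues: int, warnings: int) -> int:
    # Formula used by the individual analyzers
    return max(0, 100 - critical_issues * 25 - warnings * 10)

scores = [category_score(1, 2), category_score(0, 1), category_score(0, 0)]  # 55, 90, 100
overall = sum(scores) // len(scores)  # 81 -> "excellent"
```
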
## 🔧 **Configuration**

### **Timeout Settings**
- HTML Fetching: 30 seconds
- Security Headers: 15 seconds
- Performance Analysis: 20 seconds
- Progressive Analysis: Graceful fallbacks

### **Scoring Thresholds**
- URL Length: 2000 characters maximum
- Title Length: 30-60 characters optimal
- Meta Description: 70-160 characters optimal
- Content Length: 300 words minimum
- Load Time: 3 seconds maximum

## 🚀 **Performance Features**

### **Progressive Analysis**
1. **Fast Analyses**: URL structure, meta data, content, technical SEO, accessibility, UX
2. **Slower Analyses**: Security headers, performance (with timeout handling)
3. **Graceful Fallbacks**: Partial results when analyses fail

### **Error Handling**
- Network timeout management
- Partial result generation
- Detailed error reporting
- Fallback recommendations

## 📈 **Future Roadmap**

### **Phase 1 (Q1 2024)**
- [ ] Core Web Vitals integration
- [ ] Enhanced mobile analysis
- [ ] Schema markup validation
- [ ] Image optimization analysis

### **Phase 2 (Q2 2024)**
- [ ] NLP-powered content analysis
- [ ] Competitive analysis features
- [ ] Bulk analysis capabilities
- [ ] Historical tracking

### **Phase 3 (Q3 2024)**
- [ ] Predictive insights
- [ ] Automated fix generation
- [ ] API integrations
- [ ] White-label support

### **Phase 4 (Q4 2024)**
- [ ] Enterprise features
- [ ] Advanced reporting
- [ ] Multi-user support
- [ ] Webhook integrations

## 🤝 **Contributing**

### **Adding New Analyzers**
1. Create a new analyzer class inheriting from `BaseAnalyzer`
2. Implement the `analyze()` method
3. Return standardized result format
4. Add to the main orchestrator in `core.py`

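A minimal sketch of a new analyzer following these steps (the Open Graph check itself is only an example, not part of the current module):

```python
from typing import Any, Dict
from bs4 import BeautifulSoup

from services.seo_analyzer.analyzers import BaseAnalyzer


class OpenGraphAnalyzer(BaseAnalyzer):
    """Example analyzer: checks for an og:title meta tag."""

    def analyze(self, html_content: str, url: str) -> Dict[str, Any]:
        soup = BeautifulSoup(html_content, 'html.parser')
        issues = []
        if not soup.find('meta', attrs={'property': 'og:title'}):
            issues.append({
                'type': 'critical',
                'message': 'Missing og:title meta tag',
                'location': '<head>',
                'fix': 'Add an og:title meta tag',
                'code_example': '<meta property="og:title" content="Page Title">',
                'action': 'add_og_title'
            })
        score = max(0, 100 - len(issues) * 25)
        return {'score': score, 'issues': issues, 'warnings': [], 'recommendations': []}
```

Register the new analyzer in `ComprehensiveSEOAnalyzer.__init__` and include its output in `analysis_data` inside `analyze_url_progressive` so it contributes to the overall score.
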
### **Extending Existing Features**
1. Follow the modular architecture
2. Maintain backward compatibility
3. Add comprehensive error handling
4. Include detailed documentation

## 📝 **License**

This module is part of the AI-Writer project and follows the same licensing terms.

---

**Version**: 1.0.0
**Last Updated**: January 2024
**Maintainer**: AI-Writer Team

52 backend/services/seo_analyzer/__init__.py Normal file
@@ -0,0 +1,52 @@
"""
SEO Analyzer Package
A comprehensive, modular SEO analysis system for web applications.

This package provides:
- URL structure analysis
- Meta data analysis
- Content analysis
- Technical SEO analysis
- Performance analysis
- Accessibility analysis
- User experience analysis
- Security headers analysis
- Keyword analysis
- AI-powered insights generation
- Database service for storing and retrieving analysis results
"""

from .core import ComprehensiveSEOAnalyzer, SEOAnalysisResult
from .analyzers import (
    URLStructureAnalyzer,
    MetaDataAnalyzer,
    ContentAnalyzer,
    TechnicalSEOAnalyzer,
    PerformanceAnalyzer,
    AccessibilityAnalyzer,
    UserExperienceAnalyzer,
    SecurityHeadersAnalyzer,
    KeywordAnalyzer
)
from .utils import HTMLFetcher, AIInsightGenerator
from .service import SEOAnalysisService

__version__ = "1.0.0"
__author__ = "AI-Writer Team"

__all__ = [
    'ComprehensiveSEOAnalyzer',
    'SEOAnalysisResult',
    'URLStructureAnalyzer',
    'MetaDataAnalyzer',
    'ContentAnalyzer',
    'TechnicalSEOAnalyzer',
    'PerformanceAnalyzer',
    'AccessibilityAnalyzer',
    'UserExperienceAnalyzer',
    'SecurityHeadersAnalyzer',
    'KeywordAnalyzer',
    'HTMLFetcher',
    'AIInsightGenerator',
    'SEOAnalysisService'
]

796 backend/services/seo_analyzer/analyzers.py Normal file
@@ -0,0 +1,796 @@
"""
SEO Analyzers Module
Contains all individual SEO analysis components.
"""

import re
import time
import requests
from urllib.parse import urlparse, urljoin
from typing import Dict, List, Any, Optional
from bs4 import BeautifulSoup
from loguru import logger


class BaseAnalyzer:
    """Base class for all SEO analyzers"""

    def __init__(self):
        self.session = requests.Session()
        self.session.headers.update({
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
        })


class URLStructureAnalyzer(BaseAnalyzer):
    """Analyzes URL structure and security"""

    def analyze(self, url: str) -> Dict[str, Any]:
        """Enhanced URL structure analysis with specific fixes"""
        parsed = urlparse(url)
        issues = []
        warnings = []
        recommendations = []

        # Check URL length
        if len(url) > 2000:
            issues.append({
                'type': 'critical',
                'message': f'URL is too long ({len(url)} characters)',
                'location': 'URL',
                'current_value': url,
                'fix': 'Shorten URL to under 2000 characters',
                'code_example': '<a href="/shorter-path">Link</a>',
                'action': 'shorten_url'
            })

        # Check for hyphens
        if '_' in parsed.path and '-' not in parsed.path:
            issues.append({
                'type': 'critical',
                'message': 'URL uses underscores instead of hyphens',
                'location': 'URL',
                'current_value': parsed.path,
                'fix': 'Replace underscores with hyphens',
                'code_example': f'<a href="{parsed.path.replace("_", "-")}">Link</a>',
                'action': 'replace_underscores'
            })

        # Check for special characters
        special_chars = re.findall(r'[^a-zA-Z0-9\-_/]', parsed.path)
        if special_chars:
            warnings.append({
                'type': 'warning',
                'message': f'URL contains special characters: {", ".join(set(special_chars))}',
                'location': 'URL',
                'current_value': parsed.path,
                'fix': 'Remove special characters from URL',
                'code_example': '<a href="/clean-url">Link</a>',
                'action': 'remove_special_chars'
            })

        # Check for HTTPS
        if parsed.scheme != 'https':
            issues.append({
                'type': 'critical',
                'message': 'URL is not using HTTPS',
                'location': 'URL',
                'current_value': parsed.scheme,
                'fix': 'Redirect to HTTPS',
                'code_example': 'RewriteEngine On\nRewriteCond %{HTTPS} off\nRewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]',
                'action': 'enable_https'
            })

        score = max(0, 100 - len(issues) * 25 - len(warnings) * 10)

        return {
            'score': score,
            'issues': issues,
            'warnings': warnings,
            'recommendations': recommendations,
            'url_length': len(url),
            'has_https': parsed.scheme == 'https',
            'has_hyphens': '-' in parsed.path,
            'special_chars_count': len(special_chars)
        }


class MetaDataAnalyzer(BaseAnalyzer):
    """Analyzes meta data and technical SEO elements"""

    def analyze(self, html_content: str, url: str) -> Dict[str, Any]:
        """Enhanced meta data analysis with specific element locations"""
        soup = BeautifulSoup(html_content, 'html.parser')
        issues = []
        warnings = []
        recommendations = []

        # Title analysis
        title_tag = soup.find('title')
        if not title_tag:
            issues.append({
                'type': 'critical',
                'message': 'Missing title tag',
                'location': '<head>',
                'fix': 'Add title tag to head section',
                'code_example': '<title>Your Page Title</title>',
                'action': 'add_title_tag'
            })
        else:
            title_text = title_tag.get_text().strip()
            if len(title_text) < 30:
                warnings.append({
                    'type': 'warning',
                    'message': f'Title too short ({len(title_text)} characters)',
                    'location': '<title>',
                    'current_value': title_text,
                    'fix': 'Make title 30-60 characters',
                    'code_example': f'<title>{title_text} - Additional Context</title>',
                    'action': 'extend_title'
                })
            elif len(title_text) > 60:
                warnings.append({
                    'type': 'warning',
                    'message': f'Title too long ({len(title_text)} characters)',
                    'location': '<title>',
                    'current_value': title_text,
                    'fix': 'Shorten title to 30-60 characters',
                    'code_example': f'<title>{title_text[:55]}...</title>',
                    'action': 'shorten_title'
                })

        # Meta description analysis
        meta_desc = soup.find('meta', attrs={'name': 'description'})
        if not meta_desc:
            issues.append({
                'type': 'critical',
                'message': 'Missing meta description',
                'location': '<head>',
                'fix': 'Add meta description',
                'code_example': '<meta name="description" content="Your page description here">',
                'action': 'add_meta_description'
            })
        else:
            desc_content = meta_desc.get('content', '').strip()
            if len(desc_content) < 70:
                warnings.append({
                    'type': 'warning',
                    'message': f'Meta description too short ({len(desc_content)} characters)',
                    'location': '<meta name="description">',
                    'current_value': desc_content,
                    'fix': 'Extend description to 70-160 characters',
                    'code_example': f'<meta name="description" content="{desc_content} - Additional context about your page">',
                    'action': 'extend_meta_description'
                })
            elif len(desc_content) > 160:
                warnings.append({
                    'type': 'warning',
                    'message': f'Meta description too long ({len(desc_content)} characters)',
                    'location': '<meta name="description">',
                    'current_value': desc_content,
                    'fix': 'Shorten description to 70-160 characters',
                    'code_example': f'<meta name="description" content="{desc_content[:155]}...">',
                    'action': 'shorten_meta_description'
                })

        # Viewport meta tag
        viewport = soup.find('meta', attrs={'name': 'viewport'})
        if not viewport:
            issues.append({
                'type': 'critical',
                'message': 'Missing viewport meta tag',
                'location': '<head>',
                'fix': 'Add viewport meta tag for mobile optimization',
                'code_example': '<meta name="viewport" content="width=device-width, initial-scale=1.0">',
                'action': 'add_viewport_meta'
            })

        # Charset declaration
        charset = soup.find('meta', attrs={'charset': True}) or soup.find('meta', attrs={'http-equiv': 'Content-Type'})
        if not charset:
            warnings.append({
                'type': 'warning',
                'message': 'Missing charset declaration',
                'location': '<head>',
                'fix': 'Add charset meta tag',
                'code_example': '<meta charset="UTF-8">',
                'action': 'add_charset_meta'
            })

        score = max(0, 100 - len(issues) * 25 - len(warnings) * 10)

        return {
            'score': score,
            'issues': issues,
            'warnings': warnings,
            'recommendations': recommendations,
            'title_length': len(title_tag.get_text().strip()) if title_tag else 0,
            'description_length': len(meta_desc.get('content', '')) if meta_desc else 0,
            'has_viewport': bool(viewport),
            'has_charset': bool(charset)
        }


class ContentAnalyzer(BaseAnalyzer):
    """Analyzes content quality and structure"""

    def analyze(self, html_content: str, url: str) -> Dict[str, Any]:
        """Enhanced content analysis with specific text locations"""
        soup = BeautifulSoup(html_content, 'html.parser')
        issues = []
        warnings = []
        recommendations = []

        # Get all text content
        text_content = soup.get_text()
        words = text_content.split()
        word_count = len(words)

        # Check word count
        if word_count < 300:
            issues.append({
                'type': 'critical',
                'message': f'Content too short ({word_count} words)',
                'location': 'Page content',
                'current_value': f'{word_count} words',
                'fix': 'Add more valuable content (minimum 300 words)',
                'code_example': 'Add relevant paragraphs with useful information',
                'action': 'add_more_content'
            })

        # Check for H1 tags
        h1_tags = soup.find_all('h1')
        if len(h1_tags) == 0:
            issues.append({
                'type': 'critical',
                'message': 'Missing H1 tag',
                'location': 'Page structure',
                'fix': 'Add one H1 tag per page',
                'code_example': '<h1>Your Main Page Title</h1>',
                'action': 'add_h1_tag'
            })
        elif len(h1_tags) > 1:
            warnings.append({
                'type': 'warning',
                'message': f'Multiple H1 tags found ({len(h1_tags)})',
                'location': 'Page structure',
                'current_value': f'{len(h1_tags)} H1 tags',
                'fix': 'Use only one H1 tag per page',
                'code_example': 'Keep only the main H1, change others to H2',
                'action': 'reduce_h1_tags'
            })

        # Check for images without alt text
        images = soup.find_all('img')
        images_without_alt = [img for img in images if not img.get('alt')]
        if images_without_alt:
            warnings.append({
                'type': 'warning',
                'message': f'Images without alt text ({len(images_without_alt)} found)',
                'location': 'Images',
                'current_value': f'{len(images_without_alt)} images without alt',
                'fix': 'Add descriptive alt text to all images',
                'code_example': '<img src="image.jpg" alt="Descriptive text about the image">',
                'action': 'add_alt_text'
            })

        # Check for internal links (hrefs that do not start with http(s)://)
        internal_links = soup.find_all('a', href=re.compile(r'^(?!https?://)'))
        if len(internal_links) < 3:
            warnings.append({
                'type': 'warning',
                'message': f'Few internal links ({len(internal_links)} found)',
                'location': 'Page content',
                'current_value': f'{len(internal_links)} internal links',
                'fix': 'Add more internal links to improve site structure',
                'code_example': '<a href="/related-page">Related content</a>',
                'action': 'add_internal_links'
            })

        # Check for spelling errors (basic check)
        common_words = ['the', 'and', 'or', 'but', 'in', 'on', 'at', 'to', 'for', 'of', 'with', 'by']
        potential_errors = []
        for word in words[:100]:  # Check first 100 words
            if len(word) > 3 and word.lower() not in common_words:
                # Basic spell check (this is simplified - in production you'd use a proper spell checker)
                if re.search(r'[a-z]{15,}', word.lower()):  # Very long words might be misspelled
                    potential_errors.append(word)

        if potential_errors:
            issues.append({
                'type': 'critical',
                'message': f'Potential spelling errors found: {", ".join(potential_errors[:5])}',
                'location': 'Page content',
                'current_value': f'{len(potential_errors)} potential errors',
                'fix': 'Review and correct spelling errors',
                'code_example': 'Use spell checker or proofread content',
                'action': 'fix_spelling'
            })

        score = max(0, 100 - len(issues) * 25 - len(warnings) * 10)

        return {
            'score': score,
            'issues': issues,
            'warnings': warnings,
            'recommendations': recommendations,
            'word_count': word_count,
            'h1_count': len(h1_tags),
            'images_count': len(images),
            'images_without_alt': len(images_without_alt),
            'internal_links_count': len(internal_links),
            'potential_spelling_errors': len(potential_errors)
        }


class TechnicalSEOAnalyzer(BaseAnalyzer):
    """Analyzes technical SEO elements"""

    def analyze(self, html_content: str, url: str) -> Dict[str, Any]:
        """Enhanced technical SEO analysis with specific fixes"""
        soup = BeautifulSoup(html_content, 'html.parser')
        issues = []
        warnings = []
        recommendations = []

        # Check for robots.txt
        robots_url = urljoin(url, '/robots.txt')
        try:
            robots_response = self.session.get(robots_url, timeout=5)
            if robots_response.status_code != 200:
                warnings.append({
                    'type': 'warning',
                    'message': 'Robots.txt not accessible',
                    'location': 'Server',
                    'fix': 'Create robots.txt file',
                    'code_example': 'User-agent: *\nAllow: /',
                    'action': 'create_robots_txt'
                })
        except requests.RequestException:
            warnings.append({
                'type': 'warning',
                'message': 'Robots.txt not found',
                'location': 'Server',
                'fix': 'Create robots.txt file',
                'code_example': 'User-agent: *\nAllow: /',
                'action': 'create_robots_txt'
            })

        # Check for sitemap
        sitemap_url = urljoin(url, '/sitemap.xml')
        try:
            sitemap_response = self.session.get(sitemap_url, timeout=5)
            if sitemap_response.status_code != 200:
                warnings.append({
                    'type': 'warning',
                    'message': 'Sitemap not accessible',
                    'location': 'Server',
                    'fix': 'Create XML sitemap',
                    'code_example': '<?xml version="1.0" encoding="UTF-8"?>\n<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n<url>\n<loc>https://example.com/</loc>\n</url>\n</urlset>',
                    'action': 'create_sitemap'
                })
        except requests.RequestException:
            warnings.append({
                'type': 'warning',
                'message': 'Sitemap not found',
                'location': 'Server',
                'fix': 'Create XML sitemap',
                'code_example': '<?xml version="1.0" encoding="UTF-8"?>\n<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n<url>\n<loc>https://example.com/</loc>\n</url>\n</urlset>',
                'action': 'create_sitemap'
            })

        # Check for structured data
        structured_data = soup.find_all('script', type='application/ld+json')
        if not structured_data:
            warnings.append({
                'type': 'warning',
                'message': 'No structured data found',
                'location': '<head> or <body>',
                'fix': 'Add structured data markup',
                'code_example': '<script type="application/ld+json">{"@context":"https://schema.org","@type":"WebPage","name":"Page Title"}</script>',
                'action': 'add_structured_data'
            })

        # Check for canonical URL
        canonical = soup.find('link', rel='canonical')
        if not canonical:
            issues.append({
                'type': 'critical',
                'message': 'Missing canonical URL',
                'location': '<head>',
                'fix': 'Add canonical URL',
                'code_example': '<link rel="canonical" href="https://example.com/page">',
                'action': 'add_canonical_url'
            })

        score = max(0, 100 - len(issues) * 25 - len(warnings) * 10)

        return {
            'score': score,
            'issues': issues,
            'warnings': warnings,
            'recommendations': recommendations,
            'has_robots_txt': len([w for w in warnings if 'robots.txt' in w['message'].lower()]) == 0,
            'has_sitemap': len([w for w in warnings if 'sitemap' in w['message'].lower()]) == 0,
            'has_structured_data': bool(structured_data),
            'has_canonical': bool(canonical)
        }


class PerformanceAnalyzer(BaseAnalyzer):
    """Analyzes page performance"""

    def analyze(self, url: str) -> Dict[str, Any]:
        """Enhanced performance analysis with specific fixes"""
        try:
            start_time = time.time()
            response = self.session.get(url, timeout=20)
            load_time = time.time() - start_time

            issues = []
            warnings = []
            recommendations = []

            # Check load time
            if load_time > 3:
                issues.append({
                    'type': 'critical',
                    'message': f'Page load time too slow ({load_time:.2f}s)',
                    'location': 'Page performance',
                    'current_value': f'{load_time:.2f}s',
                    'fix': 'Optimize page speed (target < 3 seconds)',
                    'code_example': 'Optimize images, minify CSS/JS, use CDN',
                    'action': 'optimize_page_speed'
                })
            elif load_time > 2:
                warnings.append({
                    'type': 'warning',
                    'message': f'Page load time could be improved ({load_time:.2f}s)',
                    'location': 'Page performance',
                    'current_value': f'{load_time:.2f}s',
                    'fix': 'Optimize for faster loading',
                    'code_example': 'Compress images, enable caching',
                    'action': 'improve_page_speed'
                })

            # Check for compression
            content_encoding = response.headers.get('Content-Encoding')
            if not content_encoding:
                warnings.append({
                    'type': 'warning',
                    'message': 'No compression detected',
                    'location': 'Server configuration',
                    'fix': 'Enable GZIP compression',
                    'code_example': 'Add to .htaccess: SetOutputFilter DEFLATE',
                    'action': 'enable_compression'
                })

            # Check for caching headers
            cache_headers = ['Cache-Control', 'Expires', 'ETag']
            has_cache = any(response.headers.get(header) for header in cache_headers)
            if not has_cache:
                warnings.append({
                    'type': 'warning',
                    'message': 'No caching headers found',
                    'location': 'Server configuration',
                    'fix': 'Add caching headers',
                    'code_example': 'Cache-Control: max-age=31536000',
                    'action': 'add_caching_headers'
                })

            score = max(0, 100 - len(issues) * 25 - len(warnings) * 10)

            return {
                'score': score,
                'load_time': load_time,
                'is_compressed': bool(content_encoding),
                'has_cache': has_cache,
                'issues': issues,
                'warnings': warnings,
                'recommendations': recommendations
            }
        except Exception as e:
            logger.warning(f"Performance analysis failed for {url}: {e}")
            return {
                'score': 0, 'error': f'Performance analysis failed: {str(e)}',
                'load_time': 0, 'is_compressed': False, 'has_cache': False,
                'issues': [{'type': 'critical', 'message': 'Performance analysis failed', 'location': 'Page', 'fix': 'Check page speed manually', 'action': 'manual_check'}],
                'warnings': [{'type': 'warning', 'message': 'Could not analyze performance', 'location': 'Page', 'fix': 'Use PageSpeed Insights', 'action': 'manual_check'}],
                'recommendations': [{'type': 'recommendation', 'message': 'Check page speed manually', 'priority': 'medium', 'action': 'manual_check'}]
            }


class AccessibilityAnalyzer(BaseAnalyzer):
    """Analyzes accessibility features"""

    def analyze(self, html_content: str) -> Dict[str, Any]:
        """Enhanced accessibility analysis with specific fixes"""
        soup = BeautifulSoup(html_content, 'html.parser')
        issues = []
        warnings = []
        recommendations = []

        # Check for alt text on images
        images = soup.find_all('img')
        images_without_alt = [img for img in images if not img.get('alt')]
        if images_without_alt:
            issues.append({
                'type': 'critical',
                'message': f'Images without alt text ({len(images_without_alt)} found)',
                'location': 'Images',
                'current_value': f'{len(images_without_alt)} images without alt',
                'fix': 'Add descriptive alt text to all images',
                'code_example': '<img src="image.jpg" alt="Descriptive text about the image">',
                'action': 'add_alt_text'
            })

        # Check for form labels
        forms = soup.find_all('form')
        for form in forms:
            inputs = form.find_all(['input', 'textarea', 'select'])
            for input_elem in inputs:
                if input_elem.get('type') not in ['hidden', 'submit', 'button']:
                    input_id = input_elem.get('id')
                    if input_id:
                        label = soup.find('label', attrs={'for': input_id})
                        if not label:
                            warnings.append({
                                'type': 'warning',
                                'message': f'Input without label (ID: {input_id})',
                                'location': 'Form',
                                'current_value': f'Input ID: {input_id}',
                                'fix': 'Add label for input field',
                                'code_example': f'<label for="{input_id}">Field Label</label>',
                                'action': 'add_form_label'
                            })

        # Check for heading hierarchy
        headings = soup.find_all(['h1', 'h2', 'h3', 'h4', 'h5', 'h6'])
        if headings:
            h1_count = len([h for h in headings if h.name == 'h1'])
            if h1_count == 0:
                issues.append({
                    'type': 'critical',
                    'message': 'No H1 heading found',
                    'location': 'Page structure',
                    'fix': 'Add H1 heading for main content',
                    'code_example': '<h1>Main Page Heading</h1>',
                    'action': 'add_h1_heading'
                })

        # Check for color contrast (basic check)
        style_tags = soup.find_all('style')
        inline_styles = soup.find_all(style=True)
        if style_tags or inline_styles:
            warnings.append({
                'type': 'warning',
                'message': 'Custom styles found - check color contrast',
                'location': 'CSS',
                'fix': 'Ensure sufficient color contrast (4.5:1 for normal text)',
                'code_example': 'Use tools like WebAIM Contrast Checker',
                'action': 'check_color_contrast'
            })

        score = max(0, 100 - len(issues) * 25 - len(warnings) * 10)

        return {
            'score': score,
            'issues': issues,
            'warnings': warnings,
            'recommendations': recommendations,
            'images_count': len(images),
            'images_without_alt': len(images_without_alt),
            'forms_count': len(forms),
            'headings_count': len(headings)
        }


class UserExperienceAnalyzer(BaseAnalyzer):
    """Analyzes user experience elements"""

    def analyze(self, html_content: str, url: str) -> Dict[str, Any]:
        """Enhanced user experience analysis with specific fixes"""
        soup = BeautifulSoup(html_content, 'html.parser')
        issues = []
        warnings = []
        recommendations = []

        # Check for mobile responsiveness indicators
        viewport = soup.find('meta', attrs={'name': 'viewport'})
        if not viewport:
            issues.append({
                'type': 'critical',
                'message': 'Missing viewport meta tag for mobile',
                'location': '<head>',
                'fix': 'Add viewport meta tag',
                'code_example': '<meta name="viewport" content="width=device-width, initial-scale=1.0">',
                'action': 'add_viewport_meta'
            })

        # Check for navigation menu
        nav_elements = soup.find_all(['nav', 'ul', 'ol'])
        if not nav_elements:
            warnings.append({
                'type': 'warning',
                'message': 'No navigation menu found',
                'location': 'Page structure',
                'fix': 'Add navigation menu',
                'code_example': '<nav><ul><li><a href="/">Home</a></li></ul></nav>',
                'action': 'add_navigation'
            })

        # Check for contact information
        contact_patterns = ['contact', 'phone', 'email', '@', 'tel:']
        page_text = soup.get_text().lower()
        has_contact = any(pattern in page_text for pattern in contact_patterns)
        if not has_contact:
            warnings.append({
                'type': 'warning',
                'message': 'No contact information found',
                'location': 'Page content',
                'fix': 'Add contact information',
                'code_example': '<p>Contact us: <a href="mailto:info@example.com">info@example.com</a></p>',
                'action': 'add_contact_info'
            })

        # Check for social media links
        social_patterns = ['facebook', 'twitter', 'linkedin', 'instagram']
        has_social = any(pattern in page_text for pattern in social_patterns)
        if not has_social:
            recommendations.append({
                'type': 'recommendation',
                'message': 'No social media links found',
                'location': 'Page content',
                'fix': 'Add social media links',
                'code_example': '<a href="https://facebook.com/yourpage">Facebook</a>',
                'action': 'add_social_links',
                'priority': 'low'
            })

        score = max(0, 100 - len(issues) * 25 - len(warnings) * 10)

        return {
            'score': score,
            'issues': issues,
            'warnings': warnings,
            'recommendations': recommendations,
            'has_viewport': bool(viewport),
            'has_navigation': bool(nav_elements),
            'has_contact': has_contact,
            'has_social': has_social
        }


class SecurityHeadersAnalyzer(BaseAnalyzer):
    """Analyzes security headers"""

    def analyze(self, url: str) -> Dict[str, Any]:
        """Enhanced security headers analysis with specific fixes"""
        try:
            response = self.session.get(url, timeout=15, allow_redirects=True)
            security_headers = {
                'X-Frame-Options': response.headers.get('X-Frame-Options'),
                'X-Content-Type-Options': response.headers.get('X-Content-Type-Options'),
                'X-XSS-Protection': response.headers.get('X-XSS-Protection'),
                'Strict-Transport-Security': response.headers.get('Strict-Transport-Security'),
                'Content-Security-Policy': response.headers.get('Content-Security-Policy'),
                'Referrer-Policy': response.headers.get('Referrer-Policy')
            }

            issues = []
            warnings = []
            recommendations = []
            present_headers = []
            missing_headers = []

            for header_name, header_value in security_headers.items():
                if header_value:
                    present_headers.append(header_name)
                else:
                    missing_headers.append(header_name)
                    if header_name in ['X-Frame-Options', 'X-Content-Type-Options']:
                        issues.append({
                            'type': 'critical',
                            'message': f'Missing {header_name} header',
                            'location': 'Server configuration',
                            'fix': f'Add {header_name} header',
                            'code_example': f'{header_name}: DENY' if header_name == 'X-Frame-Options' else f'{header_name}: nosniff',
                            'action': f'add_{header_name.lower().replace("-", "_")}_header'
                        })
                    else:
                        warnings.append({
                            'type': 'warning',
                            'message': f'Missing {header_name} header',
                            'location': 'Server configuration',
                            'fix': f'Add {header_name} header for better security',
                            'code_example': f'{header_name}: max-age=31536000',
                            'action': f'add_{header_name.lower().replace("-", "_")}_header'
                        })

            score = min(100, len(present_headers) * 16)

            return {
                'score': score,
                'present_headers': present_headers,
                'missing_headers': missing_headers,
                'total_headers': len(present_headers),
                'issues': issues,
                'warnings': warnings,
                'recommendations': recommendations
            }
        except Exception as e:
            logger.warning(f"Security headers analysis failed for {url}: {e}")
            return {
                'score': 0, 'error': f'Error analyzing headers: {str(e)}',
                'present_headers': [], 'missing_headers': ['All security headers'],
                'total_headers': 0, 'issues': [{'type': 'critical', 'message': 'Could not analyze security headers', 'location': 'Server', 'fix': 'Check security headers manually', 'action': 'manual_check'}],
                'warnings': [{'type': 'warning', 'message': 'Security headers analysis failed', 'location': 'Server', 'fix': 'Verify security headers manually', 'action': 'manual_check'}],
                'recommendations': [{'type': 'recommendation', 'message': 'Check security headers manually', 'priority': 'medium', 'action': 'manual_check'}]
            }


class KeywordAnalyzer(BaseAnalyzer):
    """Analyzes keyword usage and optimization"""

    def analyze(self, html_content: str, target_keywords: Optional[List[str]] = None) -> Dict[str, Any]:
        """Enhanced keyword analysis with specific locations"""
        if not target_keywords:
            return {'score': 0, 'issues': [], 'warnings': [], 'recommendations': []}

        soup = BeautifulSoup(html_content, 'html.parser')
        issues = []
        warnings = []
        recommendations = []

        page_text = soup.get_text().lower()
        title_text = soup.find('title')
        title_text = title_text.get_text().lower() if title_text else ""

        for keyword in target_keywords:
            keyword_lower = keyword.lower()

            # Check if keyword is in title
            if keyword_lower not in title_text:
                issues.append({
                    'type': 'critical',
                    'message': f'Target keyword "{keyword}" not in title',
                    'location': '<title>',
                    'current_value': title_text,
                    'fix': f'Include keyword "{keyword}" in title',
                    'code_example': f'<title>{keyword} - Your Page Title</title>',
                    'action': 'add_keyword_to_title'
                })

            # Check keyword density
            keyword_count = page_text.count(keyword_lower)
            if keyword_count == 0:
                issues.append({
                    'type': 'critical',
                    'message': f'Target keyword "{keyword}" not found in content',
                    'location': 'Page content',
                    'current_value': '0 occurrences',
                    'fix': f'Include keyword "{keyword}" naturally in content',
                    'code_example': f'Add "{keyword}" to your page content',
                    'action': 'add_keyword_to_content'
                })
            elif keyword_count < 2:
                warnings.append({
                    'type': 'warning',
                    'message': f'Target keyword "{keyword}" appears only {keyword_count} time(s)',
                    'location': 'Page content',
                    'current_value': f'{keyword_count} occurrence(s)',
                    'fix': f'Include keyword "{keyword}" more naturally',
                    'code_example': f'Add more instances of "{keyword}" to content',
                    'action': 'increase_keyword_density'
                })

        score = max(0, 100 - len(issues) * 25 - len(warnings) * 10)

        return {
            'score': score,
            'issues': issues,
            'warnings': warnings,
            'recommendations': recommendations,
            'target_keywords': target_keywords,
            'keywords_found': [kw for kw in target_keywords if kw.lower() in page_text]
        }

208 backend/services/seo_analyzer/core.py Normal file
@@ -0,0 +1,208 @@
"""
Core SEO Analyzer Module
Contains the main ComprehensiveSEOAnalyzer class and data structures.
"""

from datetime import datetime
from dataclasses import dataclass
from typing import Dict, List, Any, Optional
from loguru import logger

from .analyzers import (
    URLStructureAnalyzer,
    MetaDataAnalyzer,
    ContentAnalyzer,
    TechnicalSEOAnalyzer,
    PerformanceAnalyzer,
    AccessibilityAnalyzer,
    UserExperienceAnalyzer,
    SecurityHeadersAnalyzer,
    KeywordAnalyzer
)
from .utils import HTMLFetcher, AIInsightGenerator


@dataclass
class SEOAnalysisResult:
    """Data class for SEO analysis results"""
    url: str
    timestamp: datetime
    overall_score: int
    health_status: str
    critical_issues: List[Dict[str, Any]]
    warnings: List[Dict[str, Any]]
    recommendations: List[Dict[str, Any]]
    data: Dict[str, Any]


class ComprehensiveSEOAnalyzer:
    """
    Comprehensive SEO Analyzer
    Orchestrates all individual analyzers to provide complete SEO analysis.
    """

    def __init__(self):
        """Initialize the comprehensive SEO analyzer with all sub-analyzers"""
        self.html_fetcher = HTMLFetcher()
        self.ai_insight_generator = AIInsightGenerator()

        # Initialize all analyzers
        self.url_analyzer = URLStructureAnalyzer()
        self.meta_analyzer = MetaDataAnalyzer()
        self.content_analyzer = ContentAnalyzer()
        self.technical_analyzer = TechnicalSEOAnalyzer()
        self.performance_analyzer = PerformanceAnalyzer()
        self.accessibility_analyzer = AccessibilityAnalyzer()
        self.ux_analyzer = UserExperienceAnalyzer()
        self.security_analyzer = SecurityHeadersAnalyzer()
        self.keyword_analyzer = KeywordAnalyzer()

    def analyze_url_progressive(self, url: str, target_keywords: Optional[List[str]] = None) -> SEOAnalysisResult:
        """
        Progressive analysis method that runs all analyses with enhanced AI insights
        """
        try:
            logger.info(f"Starting enhanced SEO analysis for URL: {url}")

            # Fetch HTML content
            html_content = self.html_fetcher.fetch_html(url)
            if not html_content:
                return self._create_error_result(url, "Failed to fetch HTML content")

            # Run all analyzers
            analysis_data = {}

            logger.info("Running enhanced analyses...")
            analysis_data.update({
                'url_structure': self.url_analyzer.analyze(url),
                'meta_data': self.meta_analyzer.analyze(html_content, url),
                'content_analysis': self.content_analyzer.analyze(html_content, url),
                'keyword_analysis': self.keyword_analyzer.analyze(html_content, target_keywords) if target_keywords else {},
                'technical_seo': self.technical_analyzer.analyze(html_content, url),
                'accessibility': self.accessibility_analyzer.analyze(html_content),
                'user_experience': self.ux_analyzer.analyze(html_content, url)
            })

            # Run potentially slower analyses with error handling
            logger.info("Running security headers analysis...")
            try:
                analysis_data['security_headers'] = self.security_analyzer.analyze(url)
            except Exception as e:
                logger.warning(f"Security headers analysis failed: {e}")
                analysis_data['security_headers'] = self._create_fallback_result('security_headers', str(e))

            logger.info("Running performance analysis...")
            try:
                analysis_data['performance'] = self.performance_analyzer.analyze(url)
            except Exception as e:
                logger.warning(f"Performance analysis failed: {e}")
                analysis_data['performance'] = self._create_fallback_result('performance', str(e))

            # Generate AI-powered insights
            ai_insights = self.ai_insight_generator.generate_insights(analysis_data, url)

            # Calculate overall health
            overall_score, health_status, critical_issues, warnings, recommendations = self._calculate_overall_health(analysis_data, ai_insights)

            result = SEOAnalysisResult(
                url=url,
                timestamp=datetime.now(),
                overall_score=overall_score,
                health_status=health_status,
                critical_issues=critical_issues,
                warnings=warnings,
                recommendations=recommendations,
                data=analysis_data
            )

            logger.info(f"Enhanced SEO analysis completed for {url}. Overall score: {overall_score}")
            return result

        except Exception as e:
            logger.error(f"Error in enhanced SEO analysis for {url}: {str(e)}")
            return self._create_error_result(url, str(e))

    def _calculate_overall_health(self, analysis_data: Dict[str, Any], ai_insights: List[Dict[str, Any]]) -> tuple:
        """Calculate overall health with enhanced scoring"""
        scores = []
        all_critical_issues = []
        all_warnings = []
        all_recommendations = []

        for category, data in analysis_data.items():
            if isinstance(data, dict) and 'score' in data:
                scores.append(data['score'])
                all_critical_issues.extend(data.get('issues', []))
                all_warnings.extend(data.get('warnings', []))
                all_recommendations.extend(data.get('recommendations', []))

        # Calculate overall score
        overall_score = sum(scores) // len(scores) if scores else 0

        # Determine health status
        if overall_score >= 80:
            health_status = 'excellent'
        elif overall_score >= 60:
            health_status = 'good'
        elif overall_score >= 40:
            health_status = 'needs_improvement'
        else:
            health_status = 'poor'

        # Add AI insights to recommendations
        for insight in ai_insights:
            all_recommendations.append({
                'type': 'ai_insight',
                'message': insight['message'],
                'priority': insight['priority'],
                'action': insight['action'],
                'description': insight['description']
            })

        return overall_score, health_status, all_critical_issues, all_warnings, all_recommendations

    def _create_fallback_result(self, category: str, error_message: str) -> Dict[str, Any]:
        """Create a fallback result when analysis fails"""
        return {
            'score': 0,
            'error': f'{category} analysis failed: {error_message}',
            'issues': [{
                'type': 'critical',
                'message': f'{category} analysis timed out',
                'location': 'System',
                'fix': f'Check {category} manually',
                'action': 'manual_check'
            }],
            'warnings': [{
                'type': 'warning',
                'message': f'Could not analyze {category}',
                'location': 'System',
                'fix': f'Verify {category} manually',
                'action': 'manual_check'
            }],
            'recommendations': [{
                'type': 'recommendation',
                'message': f'Check {category} manually',
                'priority': 'medium',
                'action': 'manual_check'
            }]
        }

    def _create_error_result(self, url: str, error_message: str) -> SEOAnalysisResult:
        """Create error result with enhanced structure"""
        return SEOAnalysisResult(
            url=url,
            timestamp=datetime.now(),
            overall_score=0,
            health_status='error',
            critical_issues=[{
                'type': 'critical',
                'message': f'Analysis failed: {error_message}',
                'location': 'System',
                'fix': 'Check URL accessibility and try again',
                'action': 'retry_analysis'
            }],
            warnings=[],
            recommendations=[],
            data={}
        )

268 backend/services/seo_analyzer/service.py Normal file
@@ -0,0 +1,268 @@
|
||||
"""
|
||||
SEO Analysis Service
|
||||
Handles storing and retrieving SEO analysis data from the database.
|
||||
"""
|
||||
|
||||
from typing import Optional, List, Dict, Any
|
||||
from datetime import datetime
|
||||
from sqlalchemy.orm import Session
|
||||
from sqlalchemy import func
|
||||
from loguru import logger
|
||||
|
||||
from models.seo_analysis import (
|
||||
SEOAnalysis,
|
||||
SEOIssue,
|
||||
SEOWarning,
|
||||
SEORecommendation,
|
||||
SEOCategoryScore,
|
||||
SEOAnalysisHistory,
|
||||
create_analysis_from_result,
|
||||
create_issues_from_result,
|
||||
create_warnings_from_result,
|
||||
create_recommendations_from_result,
|
||||
create_category_scores_from_result
|
||||
)
|
||||
from .core import SEOAnalysisResult
|
||||
|
||||
class SEOAnalysisService:
|
||||
"""Service for managing SEO analysis data in the database."""
|
||||
|
||||
def __init__(self, db_session: Session):
|
||||
self.db = db_session
|
||||
|
||||
def store_analysis_result(self, result: SEOAnalysisResult) -> Optional[SEOAnalysis]:
|
||||
"""
|
||||
Store SEO analysis result in the database.
|
||||
|
||||
Args:
|
||||
result: SEOAnalysisResult from the analyzer
|
||||
|
||||
Returns:
|
||||
Stored SEOAnalysis record or None if failed
|
||||
"""
|
||||
try:
|
||||
# Create main analysis record
|
||||
analysis_record = create_analysis_from_result(result)
|
||||
self.db.add(analysis_record)
|
||||
self.db.flush() # Get the ID
|
||||
|
||||
# Create related records
|
||||
issues = create_issues_from_result(analysis_record.id, result)
|
||||
warnings = create_warnings_from_result(analysis_record.id, result)
|
||||
recommendations = create_recommendations_from_result(analysis_record.id, result)
|
||||
category_scores = create_category_scores_from_result(analysis_record.id, result)
|
||||
|
||||
# Add all related records
|
||||
for issue in issues:
|
||||
self.db.add(issue)
|
||||
for warning in warnings:
|
||||
self.db.add(warning)
|
||||
for recommendation in recommendations:
|
||||
self.db.add(recommendation)
|
||||
for score in category_scores:
|
||||
self.db.add(score)
|
||||
|
||||
# Create history record
|
||||
history_record = SEOAnalysisHistory(
|
||||
url=result.url,
|
||||
analysis_date=result.timestamp,
|
||||
overall_score=result.overall_score,
|
||||
health_status=result.health_status,
|
||||
score_change=0, # Will be calculated later
|
||||
critical_issues_count=len(result.critical_issues),
|
||||
warnings_count=len(result.warnings),
|
||||
recommendations_count=len(result.recommendations)
|
||||
)
|
||||
|
||||
# Add category scores to history
|
||||
for category, data in result.data.items():
|
||||
if isinstance(data, dict) and 'score' in data:
|
||||
if category == 'url_structure':
|
||||
history_record.url_structure_score = data['score']
|
||||
elif category == 'meta_data':
|
||||
history_record.meta_data_score = data['score']
|
||||
elif category == 'content_analysis':
|
||||
history_record.content_score = data['score']
|
||||
elif category == 'technical_seo':
|
||||
history_record.technical_score = data['score']
|
||||
elif category == 'performance':
|
||||
history_record.performance_score = data['score']
|
||||
elif category == 'accessibility':
|
||||
history_record.accessibility_score = data['score']
|
||||
elif category == 'user_experience':
|
||||
history_record.user_experience_score = data['score']
|
||||
elif category == 'security_headers':
|
||||
history_record.security_score = data['score']
|
||||
|
||||
self.db.add(history_record)
|
||||
self.db.commit()
|
||||
|
||||
logger.info(f"Stored SEO analysis for {result.url} with score {result.overall_score}")
|
||||
return analysis_record
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error storing SEO analysis: {str(e)}")
|
||||
self.db.rollback()
|
||||
return None
|
||||
|
    def get_latest_analysis(self, url: str) -> Optional[SEOAnalysis]:
        """
        Get the latest SEO analysis for a URL.

        Args:
            url: The URL to get analysis for

        Returns:
            Latest SEOAnalysis record or None
        """
        try:
            return self.db.query(SEOAnalysis).filter(
                SEOAnalysis.url == url
            ).order_by(SEOAnalysis.timestamp.desc()).first()
        except Exception as e:
            logger.error(f"Error getting latest analysis for {url}: {str(e)}")
            return None

    def get_analysis_history(self, url: str, limit: int = 10) -> List[SEOAnalysisHistory]:
        """
        Get analysis history for a URL.

        Args:
            url: The URL to get history for
            limit: Maximum number of records to return

        Returns:
            List of SEOAnalysisHistory records
        """
        try:
            return self.db.query(SEOAnalysisHistory).filter(
                SEOAnalysisHistory.url == url
            ).order_by(SEOAnalysisHistory.analysis_date.desc()).limit(limit).all()
        except Exception as e:
            logger.error(f"Error getting analysis history for {url}: {str(e)}")
            return []

    def get_analysis_by_id(self, analysis_id: int) -> Optional[SEOAnalysis]:
        """
        Get SEO analysis by ID.

        Args:
            analysis_id: The analysis ID

        Returns:
            SEOAnalysis record or None
        """
        try:
            return self.db.query(SEOAnalysis).filter(
                SEOAnalysis.id == analysis_id
            ).first()
        except Exception as e:
            logger.error(f"Error getting analysis by ID {analysis_id}: {str(e)}")
            return None

    def get_all_analyses(self, limit: int = 50) -> List[SEOAnalysis]:
        """
        Get the most recent SEO analyses, up to a limit.

        Args:
            limit: Maximum number of records to return

        Returns:
            List of SEOAnalysis records
        """
        try:
            return self.db.query(SEOAnalysis).order_by(
                SEOAnalysis.timestamp.desc()
            ).limit(limit).all()
        except Exception as e:
            logger.error(f"Error getting all analyses: {str(e)}")
            return []

    def delete_analysis(self, analysis_id: int) -> bool:
        """
        Delete an SEO analysis.

        Args:
            analysis_id: The analysis ID to delete

        Returns:
            True if successful, False otherwise
        """
        try:
            analysis = self.db.query(SEOAnalysis).filter(
                SEOAnalysis.id == analysis_id
            ).first()

            if analysis:
                self.db.delete(analysis)
                self.db.commit()
                logger.info(f"Deleted SEO analysis {analysis_id}")
                return True
            else:
                logger.warning(f"Analysis {analysis_id} not found for deletion")
                return False

        except Exception as e:
            logger.error(f"Error deleting analysis {analysis_id}: {str(e)}")
            self.db.rollback()
            return False

    def get_analysis_statistics(self) -> Dict[str, Any]:
        """
        Get overall statistics for SEO analyses.

        Returns:
            Dictionary with analysis statistics
        """
        try:
            total_analyses = self.db.query(SEOAnalysis).count()
            total_urls = self.db.query(SEOAnalysis.url).distinct().count()

            # Count analyses by health status
            excellent_count = self.db.query(SEOAnalysis).filter(
                SEOAnalysis.health_status == 'excellent'
            ).count()

            good_count = self.db.query(SEOAnalysis).filter(
                SEOAnalysis.health_status == 'good'
            ).count()

            needs_improvement_count = self.db.query(SEOAnalysis).filter(
                SEOAnalysis.health_status == 'needs_improvement'
            ).count()

            poor_count = self.db.query(SEOAnalysis).filter(
                SEOAnalysis.health_status == 'poor'
            ).count()

            # Calculate average overall score
            avg_score_result = self.db.query(
                func.avg(SEOAnalysis.overall_score)
            ).scalar()
            avg_score = float(avg_score_result) if avg_score_result is not None else 0.0

            return {
                'total_analyses': total_analyses,
                'total_urls': total_urls,
                'average_score': round(avg_score, 2),
                'health_distribution': {
                    'excellent': excellent_count,
                    'good': good_count,
                    'needs_improvement': needs_improvement_count,
                    'poor': poor_count
                }
            }

        except Exception as e:
            logger.error(f"Error getting analysis statistics: {str(e)}")
            return {
                'total_analyses': 0,
                'total_urls': 0,
                'average_score': 0,
                'health_distribution': {
                    'excellent': 0,
                    'good': 0,
                    'needs_improvement': 0,
                    'poor': 0
                }
            }
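A minimal usage sketch of the storage service above (illustrative only; the class name `SEODatabaseService`, the module path, and the `SessionLocal` session factory below are assumptions for the example, not part of this commit):

```python
from backend.database import SessionLocal  # assumed session factory
from backend.services.seo_analyzer.database_service import SEODatabaseService  # assumed name/path

db = SessionLocal()
service = SEODatabaseService(db)

# Read back stored results for a URL
latest = service.get_latest_analysis("https://example.com")
if latest:
    print(latest.overall_score, latest.health_status)

history = service.get_analysis_history("https://example.com", limit=5)
stats = service.get_analysis_statistics()
print(stats["average_score"], stats["health_distribution"])
```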
135
backend/services/seo_analyzer/utils.py
Normal file
@@ -0,0 +1,135 @@
"""
|
||||
SEO Analyzer Utilities
|
||||
Contains utility classes for HTML fetching and AI insight generation.
|
||||
"""
|
||||
|
||||
import requests
|
||||
from typing import Optional, Dict, List, Any
|
||||
from loguru import logger
|
||||
|
||||
|
||||
class HTMLFetcher:
|
||||
"""Utility class for fetching HTML content from URLs"""
|
||||
|
||||
def __init__(self):
|
||||
self.session = requests.Session()
|
||||
self.session.headers.update({
|
||||
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
|
||||
})
|
||||
|
||||
def fetch_html(self, url: str) -> Optional[str]:
|
||||
"""Fetch HTML content with retries and protocol fallback."""
|
||||
def _try_fetch(target_url: str, timeout_s: int = 30) -> Optional[str]:
|
||||
try:
|
||||
response = self.session.get(
|
||||
target_url,
|
||||
timeout=timeout_s,
|
||||
allow_redirects=True,
|
||||
)
|
||||
response.raise_for_status()
|
||||
return response.text
|
||||
except Exception as inner_e:
|
||||
logger.error(f"Error fetching HTML from {target_url}: {inner_e}")
|
||||
return None
|
||||
|
||||
# First attempt
|
||||
html = _try_fetch(url, timeout_s=30)
|
||||
if html is not None:
|
||||
return html
|
||||
|
||||
# Retry once (shorter timeout)
|
||||
html = _try_fetch(url, timeout_s=15)
|
||||
if html is not None:
|
||||
return html
|
||||
|
||||
# If https fails due to resets, try http fallback once
|
||||
try:
|
||||
if url.startswith("https://"):
|
||||
http_url = "http://" + url[len("https://"):]
|
||||
logger.info(f"SEO Analyzer: Falling back to HTTP for {http_url}")
|
||||
html = _try_fetch(http_url, timeout_s=15)
|
||||
if html is not None:
|
||||
return html
|
||||
except Exception:
|
||||
# Best-effort fallback; errors already logged in _try_fetch
|
||||
pass
|
||||
|
||||
return None
|
||||
|
||||
|
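# Usage sketch (illustrative only, not part of the module above):
#
#     fetcher = HTMLFetcher()
#     html = fetcher.fetch_html("https://example.com")
#     if html is None:
#         # the initial attempt, the retry, and the HTTP fallback all failed
#         ...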
class AIInsightGenerator:
    """Utility class for generating AI-powered insights from analysis data"""

    def generate_insights(self, analysis_data: Dict[str, Any], url: str) -> List[Dict[str, Any]]:
        """Generate AI-powered insights based on analysis data"""
        insights = []

        # Analyze overall performance
        total_issues = sum(len(data.get('issues', [])) for data in analysis_data.values() if isinstance(data, dict))
        total_warnings = sum(len(data.get('warnings', [])) for data in analysis_data.values() if isinstance(data, dict))

        if total_issues > 5:
            insights.append({
                'type': 'critical',
                'message': f'High number of critical issues ({total_issues}) detected',
                'priority': 'high',
                'action': 'fix_critical_issues',
                'description': 'Multiple critical SEO issues need immediate attention to improve search rankings.'
            })

        # Content quality insights
        content_data = analysis_data.get('content_analysis', {})
        if content_data.get('word_count', 0) < 300:
            insights.append({
                'type': 'warning',
                'message': 'Content is too thin for good SEO',
                'priority': 'medium',
                'action': 'expand_content',
                'description': 'Add more valuable, relevant content to improve search rankings and user engagement.'
            })

        # Technical SEO insights
        technical_data = analysis_data.get('technical_seo', {})
        if not technical_data.get('has_canonical', False):
            insights.append({
                'type': 'critical',
                'message': 'Missing canonical URL can cause duplicate content issues',
                'priority': 'high',
                'action': 'add_canonical',
                'description': 'Canonical URLs help prevent duplicate content penalties.'
            })

        # Security insights
        security_data = analysis_data.get('security_headers', {})
        if security_data.get('total_headers', 0) < 3:
            insights.append({
                'type': 'warning',
                'message': 'Insufficient security headers',
                'priority': 'medium',
                'action': 'improve_security',
                'description': 'Security headers protect against common web vulnerabilities.'
            })

        # Performance insights
        performance_data = analysis_data.get('performance', {})
        if performance_data.get('load_time', 0) > 3:
            insights.append({
                'type': 'critical',
                'message': 'Page load time is too slow',
                'priority': 'high',
                'action': 'optimize_performance',
                'description': 'Slow loading pages negatively impact user experience and search rankings.'
            })

        # URL structure insights
        url_data = analysis_data.get('url_structure', {})
        if not url_data.get('has_https', False):
            insights.append({
                'type': 'critical',
                'message': 'Website is not using HTTPS',
                'priority': 'high',
                'action': 'enable_https',
                'description': 'HTTPS is required for security and is a ranking factor for search engines.'
            })

        return insights
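A rough usage sketch of the utilities above (illustrative only; the import path is assumed from the file location, and the `analysis_data` dict is hand-built to mirror the keys `generate_insights()` reads rather than real analyzer output):

```python
from backend.services.seo_analyzer.utils import HTMLFetcher, AIInsightGenerator  # assumed import path

fetcher = HTMLFetcher()
html = fetcher.fetch_html("https://example.com")  # None if all attempts fail

# Hand-built example data mirroring the keys generate_insights() inspects;
# in the real pipeline this would come from the analyzer components.
analysis_data = {
    'content_analysis': {'word_count': 180, 'issues': [], 'warnings': []},
    'technical_seo': {'has_canonical': False, 'issues': [], 'warnings': []},
    'url_structure': {'has_https': True, 'issues': [], 'warnings': []},
}

insights = AIInsightGenerator().generate_insights(analysis_data, "https://example.com")
for insight in insights:
    print(insight['priority'], '-', insight['message'])
```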