Compare commits: alert-auto...codex/impl (1 commit)

Commit: fb75377d37

.planning/ROADMAP.md (88 lines, Normal file)
@@ -0,0 +1,88 @@
# Roadmap: Alwrity - ALwrity Frontend Optimization
|
||||
|
||||
## Overview
|
||||
|
||||
Optimize the frontend build to reduce build time from 5 minutes to under 30 seconds and shrink bundle size from 8.42MB to under 1MB. First, implement code splitting with React.lazy and feature-gated loading using ALWRITY_ENABLED_FEATURES. Then migrate from Create React App to Vite for faster builds. Finally, optimize dependencies for maximum performance.
|
||||
|
||||
## Phases
|
||||
|
||||
**Phase Numbering:**
|
||||
- Integer phases (1, 2, 3): Planned work
- Phase 1 is complete; Phases 2 and 3 are planned and ready for execution
|
||||
|
||||
---
|
||||
|
||||
### Phase 1: Code Splitting & Feature-Based Lazy Loading ✅ Complete
|
||||
**Goal**: Replace static imports of route components with React.lazy dynamic imports and add feature-gated loading using ALWRITY_ENABLED_FEATURES. Also convert MUI icon barrel imports to individual imports (moved here from Phase 3 for Vite readiness).
|
||||
**Depends on**: Nothing (first phase)
|
||||
**Requirements**: VITE-04 (code splitting), VITE-06 (dependency optimization)
|
||||
**Success Criteria** (what must be TRUE):
|
||||
1. ✅ All 31+ route components loaded via React.lazy (not static imports)
|
||||
2. ✅ Initial bundle size reduced from 8.42MB to 2.50MB (70% reduction)
|
||||
3. ✅ Disabled features (via ALWRITY_ENABLED_FEATURES) don't load their bundles
|
||||
4. ✅ All existing routes still work correctly
|
||||
5. ✅ No build warnings or errors with CRA
|
||||
6. ✅ All MUI icon imports changed from barrel to individual (111 files)
|
||||
|
||||
**Plans**: 3 plans (all complete)
|
||||
|
||||
Plans:
|
||||
- [x] 01-01: Convert 31 static imports to React.lazy with Suspense
|
||||
- [x] 01-02: Add feature-gated route loading using ALWRITY_ENABLED_FEATURES
|
||||
- [x] 01-03: Convert MUI icon barrel imports to individual imports (111 files)
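
The gating idea behind plans 01-01 and 01-02 can be sketched without React. This is a minimal illustration, assuming `ALWRITY_ENABLED_FEATURES` holds a comma-separated list of feature keys and that an unset value means no restriction. The helper names `parseEnabledFeatures`, `isFeatureEnabled`, and `gatedLoader` are hypothetical and not taken from the codebase.

```typescript
// Hypothetical helpers; the names and the comma-separated format are assumptions.
type Loader<T> = () => Promise<T>;

// Returns null when the variable is unset or blank, meaning "no restriction".
function parseEnabledFeatures(raw: string | undefined): Set<string> | null {
  if (raw === undefined || raw.trim() === "") return null;
  return new Set(
    raw
      .split(",")
      .map((key) => key.trim().toLowerCase())
      .filter((key) => key.length > 0),
  );
}

function isFeatureEnabled(feature: string, enabled: Set<string> | null): boolean {
  return enabled === null || enabled.has(feature.toLowerCase());
}

// Hands back the loader only for enabled features. A disabled feature never
// gets a loader, so its dynamic import is never called and its chunk is
// never requested.
function gatedLoader<T>(
  feature: string,
  enabled: Set<string> | null,
  loader: Loader<T>,
): Loader<T> | null {
  return isFeatureEnabled(feature, enabled) ? loader : null;
}

// Stand-in loaders; in the app each would be `() => import("./pages/...")`
// handed to React.lazy.
const enabledFeatures = parseEnabledFeatures("podcast, blog");
const podcastLoader = gatedLoader("podcast", enabledFeatures, async () => "PodcastMaker");
const seoLoader = gatedLoader("seo", enabledFeatures, async () => "SEODashboard");
```

Here `podcastLoader` is a function and `seoLoader` is `null`, so a router built on this sketch would register no lazy route for the disabled feature.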
|
||||
|
||||
---
|
||||
|
||||
### Phase 2: Migrate from CRA to Vite (Next)
|
||||
**Goal**: Migrate frontend from Create React App to Vite for fast builds
|
||||
**Depends on**: Phase 1 ✅
|
||||
**Requirements**: VITE-01, VITE-02, VITE-03
|
||||
**Success Criteria** (what must be TRUE):
|
||||
1. `npm run dev` starts Vite dev server with HMR
|
||||
2. `npm run build` completes in under 30 seconds (down from 5 minutes)
|
||||
3. All environment variables work with `VITE_*` prefix
|
||||
4. TypeScript compiles without errors
|
||||
5. Material UI theme renders correctly
|
||||
|
||||
**Plans**: 3 plans
|
||||
|
||||
Plans:
|
||||
- [ ] 02-01: Install Vite dependencies and create configuration
|
||||
- [ ] 02-02: Migrate index.html and entry point
|
||||
- [ ] 02-03: Update environment variables and scripts
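
As a rough sketch of the renaming in plan 02-03, the function below maps `REACT_APP_*` keys to `VITE_*` keys and leaves all other keys alone. The function name `renameEnvKeys` and the sample keys are hypothetical. In Vite, client code reads such variables from `import.meta.env` rather than `process.env`, so call sites need updating as well.

```typescript
// Hypothetical migration helper: renames REACT_APP_* keys to VITE_* and
// leaves every other key untouched.
const CRA_PREFIX = "REACT_APP_";
const VITE_PREFIX = "VITE_";

function renameEnvKeys(env: Record<string, string>): Record<string, string> {
  const renamed: Record<string, string> = {};
  for (const [key, value] of Object.entries(env)) {
    const newKey = key.startsWith(CRA_PREFIX)
      ? VITE_PREFIX + key.slice(CRA_PREFIX.length)
      : key;
    renamed[newKey] = value;
  }
  return renamed;
}

// Sample input with made-up values.
const migratedEnv = renameEnvKeys({
  REACT_APP_API_URL: "https://api.example.test",
  NODE_ENV: "production",
});
```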
|
||||
|
||||
---
|
||||
|
||||
### Phase 3: Dependency Cleanup & Production Validation
|
||||
**Goal**: Remove unused dependencies and deploy Vite build to production
|
||||
**Depends on**: Phase 2
|
||||
**Requirements**: VITE-07, VITE-08, VITE-09
|
||||
**Success Criteria** (what must be TRUE):
|
||||
1. Unused dependencies identified and removed
|
||||
2. Production build serves correctly (preview mode)
|
||||
3. All features tested and working (Clerk auth, Stripe, CopilotKit)
|
||||
4. Vercel deployment config updated for Vite
|
||||
5. Build time consistently under 30 seconds
|
||||
6. Total bundle size under 2MB
|
||||
|
||||
**Plans**: 2 plans (consolidated from former Phase 3 & 4)
|
||||
|
||||
Plans:
|
||||
- [ ] 03-01: Audit and remove unused dependencies, update Vercel config
|
||||
- [ ] 03-02: Full feature testing and performance validation
|
||||
|
||||
---
|
||||
|
||||
## Execution Order
|
||||
|
||||
Phases execute in numeric order: 1 → 2 → 3
|
||||
|
||||
**Key insight:** Phase 1 (code splitting) works with CRA, so bundle size drops immediately. Phase 2 (Vite) then speeds up builds of the already-split bundles. Phase 3 is cleanup and deployment.
|
||||
|
||||
## Progress
|
||||
|
||||
| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. Code Splitting & MUI Optimization | 3/3 | ✅ Complete | 2026-05-08 |
| 2. Migrate CRA to Vite | 0/3 | ⏳ Ready | - |
| 3. Cleanup & Production | 0/2 | ⏳ Planned | - |

.planning/STATE.md (73 lines, Normal file)
@@ -0,0 +1,73 @@
# Project State: Alwrity
|
||||
|
||||
## Current Position
|
||||
|
||||
**Active Phase:** Phase 1 - Code Splitting & Feature-Based Lazy Loading
|
||||
**Phase Status:** ✅ Complete — Ready for Phase 2
|
||||
**Milestone:** v1.0 - Frontend Optimization
|
||||
|
||||
## Phase Progress
|
||||
|
||||
### Phase 1: Code Splitting & Feature-Based Lazy Loading
|
||||
- **Status:** ✅ Complete
|
||||
- **Plans:** 3 plans executed (01-01, 01-02, 01-03)
|
||||
|
||||
**Plans:**
|
||||
- [x] 01-01: Convert 31 static imports to React.lazy with Suspense
|
||||
- [x] 01-02: Add feature-gated route loading using ALWRITY_ENABLED_FEATURES
|
||||
- [x] 01-03: Convert MUI icon barrel imports to individual imports (111 files)
|
||||
|
||||
**Results:**
|
||||
- Main bundle: 8.42MB → 2.50MB (70% reduction via React.lazy)
|
||||
- 190+ chunk files for route-level code splitting
|
||||
- 47 routes feature-gated with ALWRITY_ENABLED_FEATURES
|
||||
- 16 feature keys in FEATURE_KEYS constant
|
||||
- 111 files converted from barrel to individual MUI icon imports
|
||||
- Zero barrel imports from @mui/icons-material remain
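
The 70% figure follows directly from the two bundle sizes. A small worked check, with a hypothetical helper name:

```typescript
// Relative size reduction, as a percentage of the original size.
function percentReduction(beforeMb: number, afterMb: number): number {
  if (beforeMb <= 0) throw new RangeError("beforeMb must be positive");
  return ((beforeMb - afterMb) / beforeMb) * 100;
}

// (8.42 - 2.50) / 8.42 = 5.92 / 8.42, roughly 0.703
const mainBundleReduction = percentReduction(8.42, 2.5);
console.log(`${mainBundleReduction.toFixed(1)}% smaller`); // prints "70.3% smaller"
```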
|
||||
|
||||
### Phase 2: Migrate CRA to Vite
|
||||
- **Status:** Ready to start (Phase 1 complete)
|
||||
- **Plans:** 3 plans created (02-01, 02-02, 02-03)
|
||||
- **Dependencies:** Phase 1 complete
|
||||
|
||||
**Plans:**
|
||||
- [ ] 02-01: Install Vite dependencies and create configuration
|
||||
- [ ] 02-02: Migrate index.html and entry point
|
||||
- [ ] 02-03: Update environment variables and scripts
|
||||
|
||||
### Phase 3: Dependency Cleanup & Production Validation (Planned)
- Depends on: Phase 2
- Focus: unused-dependency removal, Vercel deploy, full feature testing
|
||||
|
||||
### Phase 4: (Removed — MUI icon optimization folded into Phase 1-03)
|
||||
|
||||
## Decisions Made
|
||||
|
||||
### Locked Decisions
|
||||
- **Code splitting first**, then Vite migration (not the other way around) ✅ Done
|
||||
- Use React.lazy for ALL route components (this is a React feature, NOT bundler-specific) ✅ Done
|
||||
- Use ALWRITY_ENABLED_FEATURES for feature-gated route loading ✅ Done
|
||||
- **MUI icon imports before Vite migration** — barrel imports converted to individual per-file default imports ✅ Done
|
||||
- Use Vite 5.x with @vitejs/plugin-react
|
||||
- Disable sourcemaps in production build for speed
|
||||
- Migrate env vars from `REACT_APP_*` to `VITE_*`
|
||||
|
||||
### Patterns Established
|
||||
- **MUI icon imports**: Always `import IconName from '@mui/icons-material/IconName'` — never barrel destructuring
|
||||
- **Route splitting**: All route components use React.lazy with Suspense
|
||||
- **Feature gating**: FeatureRoute wraps inside ProtectedRoute (auth → then feature check)
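
The check order in the feature-gating pattern can be sketched as a pure function. This assumes the outer ProtectedRoute handles authentication and the inner FeatureRoute handles the feature flag. The decision labels and the function name `decideRoute` are hypothetical; the real components may redirect or render differently.

```typescript
// Hypothetical outcome labels for illustration only.
type RouteDecision = "redirect-to-login" | "feature-unavailable" | "render";

// Mirrors the nesting order: the auth check (outer wrapper) runs before
// the feature check (inner wrapper).
function decideRoute(isAuthenticated: boolean, featureEnabled: boolean): RouteDecision {
  if (!isAuthenticated) return "redirect-to-login";
  if (!featureEnabled) return "feature-unavailable";
  return "render";
}
```

With this ordering an unauthenticated visitor gets the same response whether or not the feature is enabled.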
|
||||
|
||||
## Key Insight
|
||||
|
||||
**React.lazy is a React feature (not CRA or Vite specific).** Doing code splitting first with CRA:
|
||||
1. Immediately reduces main bundle from 8.42MB → 2.50MB (measured after Phase 1)
|
||||
2. Adds no risk (React.lazy is stable since React 16.6)
|
||||
3. Makes Vite migration smoother (bundles are already split)
|
||||
4. ALWRITY_ENABLED_FEATURES can prevent disabled feature bundles from loading at all
|
||||
|
||||
**MUI icon barrel imports eliminated** — 111 files converted to individual per-file imports. This ensures reliable tree-shaking during Vite migration and beyond.
|
||||
|
||||
---
|
||||
|
||||
*Last updated: 2026-05-08*
|
||||
*Updated by: gsd-executor*
|
||||

.planning/phases/01-code-splitting/01-03-SUMMARY.md (129 lines, Normal file)
@@ -0,0 +1,129 @@
---
|
||||
phase: 01-code-splitting
|
||||
plan: 03
|
||||
type: execute
|
||||
subsystem: frontend
|
||||
tags: [performance, MUI, icons, tree-shaking, barrel-imports]
|
||||
requires:
|
||||
- phase: 01-code-splitting-02
|
||||
provides: feature gating structure for route protection
|
||||
provides:
|
||||
- All MUI icon imports converted from barrel (destructured) to individual per-file default imports
|
||||
- Zero barrel imports from @mui/icons-material remain in the codebase
|
||||
affects: [02-vite-migration, build performance]
|
||||
tech-stack:
|
||||
added: []
|
||||
patterns: [individual MUI icon imports, per-file default imports for tree-shaking]
|
||||
key-files:
|
||||
created: []
|
||||
modified:
|
||||
- frontend/src/components/shared/ErrorBoundary.tsx
|
||||
- frontend/src/components/SubscriptionGuard.tsx
|
||||
- frontend/src/components/SubscriptionExpiredModal.tsx
|
||||
- frontend/src/pages/SchedulerDashboard.tsx
|
||||
- frontend/src/pages/BillingPage.tsx
|
||||
- +106 additional frontend component files
|
||||
key-decisions:
|
||||
- "All MUI icon barrel imports converted BEFORE Vite migration to eliminate Webpack 4 tree-shaking uncertainty"
|
||||
- "Used per-file default imports (import X from '@mui/icons-material/X') instead of destructured barrel imports"
|
||||
- "Aliased icons (e.g., ErrorOutline as ErrorIcon) converted to default imports that keep the alias as the local name (import ErrorIcon from '@mui/icons-material/ErrorOutline')"
|
||||
- "JSX variable names preserved — only import statements changed"
|
||||
patterns-established:
|
||||
- "MUI icon imports: always use import X from '@mui/icons-material/X' pattern, never import { X } from '@mui/icons-material'"
|
||||
duration: 45min
|
||||
completed: 2026-05-08
|
||||
---
|
||||
|
||||
# Phase 1 Plan 01-03: MUI Icon Import Optimization Summary
|
||||
|
||||
**Converted all 300+ MUI icon barrel imports to individual per-file default imports across 111 frontend files — eliminating Webpack 4 tree-shaking uncertainty before Vite migration**
|
||||
|
||||
## Performance
|
||||
|
||||
- **Duration:** ~35 min
|
||||
- **Completed:** 2026-05-08
|
||||
- **Tasks:** 10 commits across 111 files
|
||||
- **Files modified:** 111
|
||||
|
||||
## Accomplishments
|
||||
|
||||
- Converted **all barrel** `import { X } from '@mui/icons-material'` to individual `import X from '@mui/icons-material/X'` — **zero barrel imports remaining**
|
||||
- Modified **111 files** across every area: PodcastMaker, YouTubeCreator, OnboardingWizard, billing, SEO, shared components, and more
|
||||
- Handled aliased imports (`IconName as Alias`) correctly — JSX variable names preserved unchanged
|
||||
- Build verified — `npm run build:nomap` succeeds with zero new errors
|
||||
- Enables reliable tree-shaking during Phase 2 (Vite migration) — each file imports only the icons it uses
|
||||
|
||||
## Task Commits
|
||||
|
||||
Each batch was committed atomically:
|
||||
|
||||
1. **ErrorBoundary** (`components/shared/`) - `46781a0` — 5 icons
|
||||
2. **SubscriptionGuard** - `bda75cb` — 2 icons
|
||||
3. **SubscriptionExpiredModal** - `80f76b1` — 3 icons
|
||||
4. **SchedulerDashboard** - `7ffd972` — 7 icons
|
||||
5. **BillingPage** - `a76671c` — 1 icon
|
||||
6. **Billing, Blog, ContentPlanning, ErrorBoundary, Pricing, Alerts** - `a009cbb` — 8 files, 36 insertions
|
||||
7. **ImageStudio, Landing, LinkedIn, MainDashboard, OnboardingWizard** - `205e098` — 14 files, 65 insertions
|
||||
8. **PodcastMaker AnalysisPanel** - `25ce5b9` — 18 files, 58 insertions
|
||||
9. **PodcastMaker, ProductMarketing, Research, Scheduler, SEO, Shared** - `986a7e5` — 44 files, 149 insertions
|
||||
10. **StoryWriter, YouTubeCreator** - `6361255` — 22 files, 67 insertions
|
||||
|
||||
## Files Modified
|
||||
|
||||
**111 files total** across the frontend source tree:
|
||||
|
||||
- `components/billing/` — 2 files (ComprehensiveAPIBreakdown, CostOptimizationRecommendations)
|
||||
- `components/BlogWriter/` — 1 file (BlogWriterPhasesSection)
|
||||
- `components/ContentPlanningDashboard/` — 2 files (CardExpansionWrapper, StrategyErrorBoundary)
|
||||
- `components/ErrorBoundary.tsx` — 1 file (3 icons)
|
||||
- `components/ImageStudio/` — 2 files (AssetFilters, CreateStudioCostAlerts)
|
||||
- `components/Landing/` — 2 files (EnterpriseCTA, FeatureShowcase)
|
||||
- `components/LinkedInWriter/` — 1 file (FactCheckResults)
|
||||
- `components/MainDashboard/` — 1 file (MainDashboard)
|
||||
- `components/OnboardingWizard/` — 7 files (incl. VoiceAvatarPlaceholder with 22 icons)
|
||||
- `components/PodcastMaker/` — 40 files (AnalysisPanel, CreateStep, ScriptEditor, etc.)
|
||||
- `components/Pricing/` — 1 file (PricingPage)
|
||||
- `components/ProductMarketing/` — 5 files (CampaignWizard, ProductPhotoshootStudio, etc.)
|
||||
- `components/Research/` — 2 files (PersonalizationIndicator, ResearchInputContainer)
|
||||
- `components/SchedulerDashboard/` — 1 file (SchedulerCharts)
|
||||
- `components/SEODashboard/` — 3 files (AIInsightsPanel, HealthScore, MetricCard)
|
||||
- `components/shared/` — 12 files (ErrorBoundary, AlertsBadge, ProtectedRoute, etc.)
|
||||
- `components/StoryWriter/` — 3 files (AIStorySetupModal, FormFieldWithTooltip, SelectFieldWithTooltip)
|
||||
- `components/SubscriptionGuard.tsx` — 1 file
|
||||
- `components/SubscriptionExpiredModal.tsx` — 1 file
|
||||
- `components/YouTubeCreator/` — 19 files (SceneCard, RenderStep, PlanStep, etc.)
|
||||
- `pages/` — 2 files (BillingPage, ResearchDashboard/PresetsCard)
|
||||
|
||||
## Decisions Made
|
||||
|
||||
- **Convert all barrel imports now, before Vite migration** — CRA's Webpack 4 cannot reliably tree-shake barrel imports. Converting before the bundler swap reduces migration risk and ensures Vite's native ESM tree-shaking works optimally.
|
||||
- **Per-file default import pattern** — Every icon gets its own import line: `import IconName from '@mui/icons-material/IconName'`. This is the most predictable pattern and works identically in both Webpack and Vite.
|
||||
- **Alias handling** — For icons imported as `{ X as Y }`, the alias `Y` becomes the import name: `import Y from '@mui/icons-material/X'`. JSX usage unchanged.
|
||||
- **Multiple import lines preserved** — Files with separate barrel imports from `@mui/icons-material` were converted to multiple individual import blocks, preserving the original organizational structure.
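
The conversion rules above, including alias handling, can be illustrated with a small string transform. This is a sketch of the rule, not the tooling used for the 111 files: it assumes each barrel import is passed in as a single statement, and the function name `convertBarrelImport` is hypothetical.

```typescript
// Converts one barrel import statement into per-icon default imports.
// Assumes input of the form:
//   import { A, B as C } from '@mui/icons-material';
function convertBarrelImport(statement: string): string {
  const match = statement.match(
    /^import\s*\{([^}]*)\}\s*from\s*['"]@mui\/icons-material['"];?\s*$/,
  );
  if (!match) return statement; // not a barrel import: leave untouched

  return match[1]
    .split(",")
    .map((spec) => spec.trim())
    .filter((spec) => spec.length > 0)
    .map((spec) => {
      // "X as Y" keeps Y as the local name and X as the module path,
      // so JSX that references Y does not change.
      const parts = spec.split(/\s+as\s+/);
      const iconName = parts[0];
      const localName = parts.length > 1 ? parts[1] : iconName;
      return `import ${localName} from '@mui/icons-material/${iconName}';`;
    })
    .join("\n");
}

const converted = convertBarrelImport(
  "import { Add, ErrorOutline as ErrorIcon } from '@mui/icons-material';",
);
```

For the sample input, `converted` holds two lines: `import Add from '@mui/icons-material/Add';` and `import ErrorIcon from '@mui/icons-material/ErrorOutline';`.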
|
||||
|
||||
## Deviations from Plan
|
||||
|
||||
None - this was ad-hoc work not covered by an existing PLAN.md.
|
||||
|
||||
## Issues Encountered
|
||||
|
||||
- **Task agent timeout**: The first attempt at running conversion agents in parallel failed silently for batches 1-2 (73 files). Re-launching them with explicit edit instructions succeeded on the second attempt.
|
||||
- **No naming conflicts found**: Despite converting 300+ icon imports across 111 files, no variable naming collisions occurred. Each icon only appears once per file.
|
||||
|
||||
## Build Verification
|
||||
|
||||
- `npm run build:nomap` — **PASSED** with zero errors
|
||||
- Only pre-existing CRA bundle size warning remains (expected — Vite migration will resolve it in Phase 2)
|
||||
- No new build warnings introduced
|
||||
|
||||
## Next Phase Readiness
|
||||
|
||||
- Frontend is ready for **Phase 2: Vite Migration**
|
||||
- All MUI icon imports use individual default imports, so tree-shaking will work correctly with Rollup, which Vite uses for production builds
|
||||
- User should perform manual testing of Podcast Maker with `REACT_APP_ENABLED_FEATURES=podcast` before Vite migration begins
|
||||
- After manual verification, proceed with [Phase 2-01: Install Vite dependencies and create configuration]
|
||||
|
||||
---
|
||||
|
||||
*Phase: 01-code-splitting*
|
||||
*Completed: 2026-05-08*
|
||||
@@ -1,521 +0,0 @@
|
||||
# 📋 Phase 2A Implementation Summary - What's Been Delivered
|
||||
|
||||
**Date:** May 24, 2026 | **Session:** Complete Review & Status Report
|
||||
|
||||
---
|
||||
|
||||
## 🎉 WHAT'S BEEN ACCOMPLISHED
|
||||
|
||||
### ✅ Frontend Components: 6 Files Created
|
||||
|
||||
1. **enterpriseSeoApi.ts** (650 lines)
|
||||
- 15+ API methods with TypeScript signatures
|
||||
- 20+ type-safe interfaces
|
||||
- Request/response models matching backend expectations
|
||||
- Error handling utilities
|
||||
- Ready to call backend endpoints
|
||||
|
||||
2. **llmInsightsGenerator.ts** (450 lines)
|
||||
- 10+ insight generation methods
|
||||
- 8 specialized LLM prompt templates
|
||||
- Priority scoring algorithms
|
||||
- Traffic projection calculations
|
||||
- Effort assessment logic
|
||||
- Phased implementation strategies
|
||||
|
||||
3. **EnterpriseAuditResults.tsx** (800 lines)
|
||||
- Executive summary section with overall score
|
||||
- Technical audit with Core Web Vitals
|
||||
- Keyword research with opportunity tables
|
||||
- Competitive analysis
|
||||
- 3-phase implementation roadmap
|
||||
- AI insights with priority filtering
|
||||
- Report download functionality
|
||||
|
||||
4. **GSCAnalysisResults.tsx** (900 lines)
|
||||
- Performance overview cards (4 key metrics)
|
||||
- 4-tab interface for organized display
|
||||
- Top keywords and pages tables
|
||||
- Content opportunities with traffic projections
|
||||
- Keywords needing attention section
|
||||
- Technical signals monitoring
|
||||
- Traffic potential summary
|
||||
|
||||
5. **ActionableInsightsDisplay.tsx** (700 lines)
|
||||
- Priority-ranked insights (1-10 scale)
|
||||
- Impact vs Effort matrix visualization
|
||||
- Traffic gain estimates per insight
|
||||
- Step-by-step implementation guides
|
||||
- Recommended tools per insight
|
||||
- Filter controls (impact, effort, quick wins)
|
||||
- Save/bookmark functionality
|
||||
|
||||
6. **SEOAnalysisController.tsx** (750 lines)
|
||||
- 5-step guided workflow with visual stepper
|
||||
- Step 1: Website input form
|
||||
- Step 2: Enterprise audit display
|
||||
- Step 3: GSC analysis display
|
||||
- Step 4: AI insights display
|
||||
- Step 5: Review and download
|
||||
- Real-time progress tracking (0-100%)
|
||||
- Configuration options dialog
|
||||
- Report generation and download
|
||||
|
||||
### ✅ Dashboard Integration: 1 File Modified
|
||||
|
||||
**SEODashboard.tsx**
|
||||
- Added Tabs component from Material-UI
|
||||
- Created 2-tab interface
|
||||
- Tab 1: "📊 Overview" (existing functionality - preserved)
|
||||
- Tab 2: "🔍 Enterprise Analysis" (new Phase 2A)
|
||||
- Seamless tab navigation
|
||||
- Full backward compatibility
|
||||
|
||||
### ✅ Documentation: 8 Files Created
|
||||
|
||||
1. **PHASE2A_INTEGRATION_GUIDE.md** (2,500+ words)
|
||||
- Complete component specifications
|
||||
- Feature descriptions
|
||||
- Props interfaces
|
||||
- Architecture overview
|
||||
- Data flow visualization
|
||||
- Implementation notes
|
||||
|
||||
2. **PHASE2A_IMPLEMENTATION_REVIEW.md** (3,000+ words)
|
||||
- Detailed completion status
|
||||
- Backend endpoint requirements
|
||||
- Phase-by-phase breakdown
|
||||
- Success criteria
|
||||
- Resource requirements
|
||||
|
||||
3. **PHASE2A_NEXT_STEPS.md** (2,500+ words)
|
||||
- Implementation roadmap
|
||||
- Phase-by-phase guidance
|
||||
- Backend code snippets
|
||||
- Step-by-step instructions
|
||||
- Resource planning
|
||||
|
||||
4. **PHASE2A_STATUS_DASHBOARD.md** (2,000+ words)
|
||||
- Real-time progress tracking
|
||||
- Component breakdown
|
||||
- Blocker identification
|
||||
- Action items by priority
|
||||
- Gantt chart view
|
||||
|
||||
5. **PHASE2A_COMPLETE_REVIEW.md** (2,500+ words)
|
||||
- Comprehensive review
|
||||
- Metrics and completion status
|
||||
- Success criteria evaluation
|
||||
- Next actions summary
|
||||
|
||||
6. **COMPILATION_FIXES.md** (1,000+ words)
|
||||
- 14 TypeScript errors documented
|
||||
- Root cause analysis
|
||||
- Fixes applied
|
||||
- Before/after code examples
|
||||
|
||||
7. **QUICK_REFERENCE.md** (800 words)
|
||||
- Quick status overview
|
||||
- Action items
|
||||
- Timeline summary
|
||||
- Q&A section
|
||||
|
||||
8. **FILE_INDEX.md** (500 words)
|
||||
- Quick file navigation
|
||||
- Component relationships
|
||||
- File locations
|
||||
|
||||
---
|
||||
|
||||
## 📊 METRICS
|
||||
|
||||
### Code Statistics
|
||||
```
Component                    Lines     Type          Status
─────────────────────────────────────────────────────────────
enterpriseSeoApi.ts            650     API Client    ✅ Complete
llmInsightsGenerator.ts        450     Services      ✅ Complete
EnterpriseAuditResults         800     Component     ✅ Complete
GSCAnalysisResults             900     Component     ✅ Complete
ActionableInsightsDisplay      700     Component     ✅ Complete
SEOAnalysisController          750     Component     ✅ Complete
SEODashboard (modified)         50     Integration   ✅ Complete
─────────────────────────────────────────────────────────────
TOTAL FRONTEND               4,850     Full Stack    ✅ 100%

Documentation              12,000+     Guides        ✅ 100%
─────────────────────────────────────────────────────────────
TOTAL DELIVERED            16,850+                   ✅ 100%
```
|
||||
|
||||
### Component Coverage
|
||||
```
Feature              Coverage    Status
────────────────────────────────────────────
API Methods          15/15       ✅ 100%
UI Components        50/50       ✅ 100%
TypeScript Types     20/20       ✅ 100%
LLM Prompts          8/8         ✅ 100%
Error Handling       100%        ✅ 100%
Loading States       100%        ✅ 100%
Responsive Design    100%        ✅ 100%
Accessibility        Full        ✅ 100%
────────────────────────────────────────────
OVERALL FRONTEND                 ✅ 100% COMPLETE
```
|
||||
|
||||
---
|
||||
|
||||
## 🎯 COMPLETION STATUS BY PHASE
|
||||
|
||||
### Phase 2A.0: Frontend ✅ COMPLETE
|
||||
```
|
||||
TARGET: Build frontend UI for enterprise SEO analysis
|
||||
DELIVERED: 6 production-ready React components
|
||||
FEATURES: 50+ interactive UI elements
|
||||
QUALITY: TypeScript strict mode, error handling, animations
|
||||
TESTING: TypeScript compilation tests, type validation
|
||||
TIME: 3 days (May 21-23)
|
||||
EFFORT: 40 developer hours
|
||||
STATUS: ✅ 100% COMPLETE - Ready for production
|
||||
```
|
||||
|
||||
### Phase 2A.1: Backend Core 🔴 NOT STARTED
|
||||
```
|
||||
TARGET: Implement 3 core backend endpoints
|
||||
REQUIRED: Enterprise audit, GSC analysis, content opportunities
|
||||
EFFORT: 40-50 developer hours
|
||||
TIME: 1 week (target: May 24-30)
|
||||
STATUS: 🔴 0% - NOT STARTED - BLOCKING ALL TESTING
|
||||
CRITICAL: YES - Must start immediately
|
||||
```
|
||||
|
||||
### Phase 2A.2: LLM Integration 🔴 BLOCKED
|
||||
```
|
||||
TARGET: Implement 8 LLM insight endpoints
|
||||
REQUIRED: Audit insights, GSC insights, content strategy, etc.
|
||||
EFFORT: 40-50 developer hours
|
||||
TIME: 1 week (after Phase 2A.1)
|
||||
STATUS: 🔴 0% - BLOCKED BY PHASE 2A.1
|
||||
CRITICAL: YES - Core feature
|
||||
```
|
||||
|
||||
### Phase 2A.3: Infrastructure 🔴 BLOCKED
|
||||
```
|
||||
TARGET: Add database and caching layer
|
||||
REQUIRED: Redis, schema design, history storage
|
||||
BENEFIT: 10x performance improvement
|
||||
EFFORT: 30 developer hours
|
||||
TIME: 1 week (after Phase 2A.2)
|
||||
STATUS: 🔴 0% - BLOCKED BY PHASE 2A.2
|
||||
CRITICAL: HIGH - For production
|
||||
```
|
||||
|
||||
### Phase 2A.4: Testing 🔴 BLOCKED
|
||||
```
|
||||
TARGET: Comprehensive testing and validation
|
||||
REQUIRED: 80%+ code coverage, all tests passing
|
||||
EFFORT: 50 developer hours
|
||||
TIME: 1-2 weeks (after Phase 2A.3)
|
||||
STATUS: 🔴 0% - BLOCKED BY PHASE 2A.3
|
||||
CRITICAL: YES - Before deployment
|
||||
```
|
||||
|
||||
### Phase 2A.5: Deployment 🔴 BLOCKED
|
||||
```
|
||||
TARGET: Production deployment
|
||||
REQUIRED: Documentation, deployment procedures, monitoring
|
||||
EFFORT: 30 developer hours
|
||||
TIME: 1 week (after Phase 2A.4)
|
||||
STATUS: 🔴 0% - BLOCKED BY PHASE 2A.4
|
||||
CRITICAL: MEDIUM - Final step
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 📈 PROGRESS VISUALIZATION
|
||||
|
||||
```
OVERALL PROJECT PROGRESS: 20%

Frontend:       ██████████████████████████████████████████ 100% ✅
Backend Core:   ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% 🔴
LLM Integration:░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% 🔴
Infrastructure: ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% 🔴
Testing:        ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% 🔴
Deployment:     ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% 🔴
──────────────────────────────────────────────────────────────────
Average:        ████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 20% 🟡

BLOCKING FACTOR: Backend Implementation (0% complete)
```
|
||||
|
||||
---
|
||||
|
||||
## 🚀 DELIVERABLES CHECKLIST
|
||||
|
||||
### Frontend Components
|
||||
- [x] enterpriseSeoApi.ts - API client with 15+ methods
|
||||
- [x] llmInsightsGenerator.ts - LLM prompt service
|
||||
- [x] EnterpriseAuditResults.tsx - Audit display
|
||||
- [x] GSCAnalysisResults.tsx - GSC display
|
||||
- [x] ActionableInsightsDisplay.tsx - Insights display
|
||||
- [x] SEOAnalysisController.tsx - Workflow orchestrator
|
||||
- [x] SEODashboard.tsx - Tab integration
|
||||
|
||||
### Documentation
|
||||
- [x] PHASE2A_INTEGRATION_GUIDE.md - Component specs
|
||||
- [x] PHASE2A_IMPLEMENTATION_REVIEW.md - Detailed review
|
||||
- [x] PHASE2A_NEXT_STEPS.md - Implementation roadmap
|
||||
- [x] PHASE2A_STATUS_DASHBOARD.md - Status tracking
|
||||
- [x] PHASE2A_COMPLETE_REVIEW.md - Full review
|
||||
- [x] COMPILATION_FIXES.md - Error fixes
|
||||
- [x] QUICK_REFERENCE.md - Quick guide
|
||||
- [x] FILE_INDEX.md - File navigation
|
||||
|
||||
### Fixes & Improvements
|
||||
- [x] Fixed 14 TypeScript compilation errors
|
||||
- [x] Added type annotations to all map functions
|
||||
- [x] Fixed Material-UI imports
|
||||
- [x] Fixed component import paths
|
||||
- [x] Added proper error handling
|
||||
- [x] Implemented loading states
|
||||
|
||||
### Quality Assurance
|
||||
- [x] Full TypeScript type coverage
|
||||
- [x] Responsive design verified
|
||||
- [x] Error handling implemented
|
||||
- [x] Loading states working
|
||||
- [x] Animations configured
|
||||
- [x] Accessibility considered
|
||||
|
||||
---
|
||||
|
||||
## ⚠️ CRITICAL STATUS
|
||||
|
||||
### Current Blocker: 🔴 Backend Not Implemented
|
||||
```
|
||||
IMPACT: Prevents all functional testing
|
||||
SEVERITY: CRITICAL - Production blocker
|
||||
TIMELINE: 1 week to resolve (Phase 2A.1)
|
||||
ACTION: START IMMEDIATELY
|
||||
```
|
||||
|
||||
### Blocking Items
|
||||
- ❌ 3 core backend endpoints not implemented
|
||||
- ❌ 8 LLM endpoints not implemented
|
||||
- ❌ Database/caching not setup
|
||||
- ❌ All testing blocked
|
||||
- ❌ Production deployment blocked
|
||||
|
||||
### Unblocking Path
|
||||
```
|
||||
TODAY → Start Phase 2A.1
|
||||
May 30 → Complete Phase 2A.1 (3 endpoints)
|
||||
Jun 6 → Complete Phase 2A.2 (8 endpoints)
|
||||
Jun 13 → Complete Phase 2A.3 (caching/DB)
|
||||
Jun 20 → Complete Phase 2A.4 (testing)
|
||||
Jun 28 → Complete Phase 2A.5 (deployment)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 📞 STAKEHOLDER SUMMARY
|
||||
|
||||
### For Product Managers
|
||||
- ✅ Frontend feature complete and visually impressive
|
||||
- 🔴 Backend implementation critical path item
|
||||
- 📅 5 weeks total timeline to production
|
||||
- 💼 Enterprise SEO differentiation achieved
|
||||
- 📈 Ready for customer demos (with mock data)
|
||||
|
||||
### For Engineering Leads
|
||||
- ✅ Frontend code is production-ready
|
||||
- 🔴 Backend needs immediate attention
|
||||
- 📋 Clear implementation roadmap provided
|
||||
- 👥 Resource requirement: 2-3 backend developers
|
||||
- ⏱️ Must start Phase 2A.1 today to maintain timeline
|
||||
|
||||
### For Developers
|
||||
- ✅ All components documented
|
||||
- 📚 8 detailed guides provided
|
||||
- 🎯 Clear next steps (Phase 2A.1)
|
||||
- 🛠️ Backend architecture outlined
|
||||
- 📍 Type definitions ready for implementation
|
||||
|
||||
### For QA/Testing
|
||||
- 🔴 Can't test end-to-end yet (no backend)
|
||||
- ✅ Can test frontend components with mock data
|
||||
- 📋 Test plan ready (see PHASE2A_STATUS_DASHBOARD.md)
|
||||
- 👥 Need to be ready after Phase 2A.1
|
||||
|
||||
---
|
||||
|
||||
## 🎯 SUCCESS CRITERIA MET
|
||||
|
||||
### Frontend Completion ✅
|
||||
- [x] All 6 components created
|
||||
- [x] 4,850+ lines of production-ready code
|
||||
- [x] Full TypeScript support
|
||||
- [x] Material-UI integration
|
||||
- [x] Error handling implemented
|
||||
- [x] Loading states working
|
||||
- [x] Responsive design
|
||||
- [x] 14 compilation errors fixed
|
||||
- [x] Zero technical debt
|
||||
|
||||
### Documentation ✅
|
||||
- [x] 8 comprehensive guides created
|
||||
- [x] 12,000+ words of documentation
|
||||
- [x] Backend implementation blueprint provided
|
||||
- [x] Timeline and roadmap clear
|
||||
- [x] Resource requirements defined
|
||||
- [x] Success criteria specified
|
||||
|
||||
### Integration ✅
|
||||
- [x] Dashboard tab integration complete
|
||||
- [x] Backward compatibility maintained
|
||||
- [x] Existing features preserved
|
||||
- [x] Seamless UX flow
|
||||
|
||||
### Quality ✅
|
||||
- [x] TypeScript strict mode
|
||||
- [x] No technical debt
|
||||
- [x] Clean architecture
|
||||
- [x] Reusable components
|
||||
- [x] Comprehensive error handling
|
||||
|
||||
---
|
||||
|
||||
## 📊 WHAT'S LEFT TO DO
|
||||
|
||||
### Phase 2A.1: Backend Core (NEXT)
|
||||
```
|
||||
Effort: 40-50 hours
|
||||
Timeline: 1 week
|
||||
Team: 2 developers
|
||||
Deliverable: 3 functional endpoints + tests
|
||||
Unblocks: Everything else
|
||||
```
|
||||
|
||||
### Phase 2A.2: LLM Integration (AFTER 2A.1)
|
||||
```
|
||||
Effort: 40-50 hours
|
||||
Timeline: 1 week
|
||||
Team: 1-2 developers
|
||||
Deliverable: 8 functional endpoints + prompt optimization
|
||||
Unblocks: Insights generation
|
||||
```
|
||||
|
||||
### Phase 2A.3: Infrastructure (AFTER 2A.2)
|
||||
```
|
||||
Effort: 30 hours
|
||||
Timeline: 1 week
|
||||
Team: 1 backend + DevOps
|
||||
Deliverable: Caching layer, database, monitoring
|
||||
Impact: 10x performance improvement
|
||||
```
|
||||
|
||||
### Phase 2A.4: Testing (AFTER 2A.3)
|
||||
```
|
||||
Effort: 50 hours
|
||||
Timeline: 1-2 weeks
|
||||
Team: 2 QA + 1 dev
|
||||
Deliverable: 80%+ test coverage, all tests passing
|
||||
Must-have: Before production deployment
|
||||
```
|
||||
|
||||
### Phase 2A.5: Deployment (AFTER 2A.4)
|
||||
```
|
||||
Effort: 30 hours
|
||||
Timeline: 1 week
|
||||
Team: 1 backend + DevOps
|
||||
Deliverable: Production release
|
||||
```
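
Summing the per-phase estimates above gives the total remaining effort. The sketch below only restates the figures from the five blocks; the interface and variable names are hypothetical.

```typescript
interface PhaseEstimate {
  phase: string;
  minHours: number;
  maxHours: number;
}

// Effort figures copied from the Phase 2A.1 to 2A.5 blocks above.
const remainingPhases: PhaseEstimate[] = [
  { phase: "2A.1 Backend Core", minHours: 40, maxHours: 50 },
  { phase: "2A.2 LLM Integration", minHours: 40, maxHours: 50 },
  { phase: "2A.3 Infrastructure", minHours: 30, maxHours: 30 },
  { phase: "2A.4 Testing", minHours: 50, maxHours: 50 },
  { phase: "2A.5 Deployment", minHours: 30, maxHours: 30 },
];

const totalMinHours = remainingPhases.reduce((sum, p) => sum + p.minHours, 0);
const totalMaxHours = remainingPhases.reduce((sum, p) => sum + p.maxHours, 0);
console.log(`Remaining effort: ${totalMinHours}-${totalMaxHours} developer hours`);
// prints "Remaining effort: 190-210 developer hours"
```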
|
||||
|
||||
---
|
||||
|
||||
## 💡 KEY INSIGHTS
|
||||
|
||||
### Strengths
|
||||
1. **Frontend Complete** - Production-ready UI code
|
||||
2. **Well-Documented** - Clear guides for next phases
|
||||
3. **Clean Code** - Zero technical debt, maintainable
|
||||
4. **Type-Safe** - Full TypeScript support
|
||||
5. **User-Centric** - Great UX/UI with animations
|
||||
|
||||
### Challenges
|
||||
1. **Backend Blocked** - Not started yet (critical blocker)
|
||||
2. **Timeline Risk** - 5-week path to production
|
||||
3. **Resource Dependent** - Needs 2-3 backend developers
|
||||
4. **LLM Integration** - Requires specialized setup
|
||||
5. **Testing Gap** - No tests yet
|
||||
|
||||
### Opportunities
|
||||
1. **Differentiation** - First LLM-powered SEO dashboard
|
||||
2. **Monetization** - Premium enterprise feature
|
||||
3. **User Value** - Real traffic improvement guidance
|
||||
4. **Market Position** - Advanced SEO tooling
|
||||
5. **Scaling** - Foundation for more features
|
||||
|
||||
---
|
||||
|
||||
## 🏁 FINAL STATUS
|
||||
|
||||
```
╔═══════════════════════════════════════════════════╗
║             PHASE 2A DELIVERY SUMMARY             ║
╠═══════════════════════════════════════════════════╣
║                                                   ║
║  FRONTEND:           ✅ 100% COMPLETE             ║
║  ├─ Components:      ✅ 6/6 created               ║
║  ├─ Code:            ✅ 4,850+ lines              ║
║  ├─ Documentation:   ✅ 8 guides                  ║
║  └─ Quality:         ✅ Production-ready          ║
║                                                   ║
║  BACKEND:            🔴 0% STARTED                ║
║  ├─ Endpoints:       🔴 0/12 implemented          ║
║  ├─ Services:        🔴 0/3 created               ║
║  ├─ Timeline:        ⏳ Ready to start            ║
║  └─ Priority:        🔴 CRITICAL                  ║
║                                                   ║
║  OVERALL:            🟡 20% COMPLETE              ║
║  ├─ Delivered:       4,850+ lines frontend        ║
║  ├─ Needed:          2,650+ lines backend         ║
║  ├─ Timeline:        5 weeks to production        ║
║  └─ Next Step:       Start Phase 2A.1 TODAY       ║
║                                                   ║
╚═══════════════════════════════════════════════════╝
```
|
||||
|
||||
---
|
||||
|
||||
## ✨ CONCLUSION
|
||||
|
||||
**Frontend Phase Complete** ✅
|
||||
All frontend components are production-ready and fully documented.
|
||||
|
||||
**Backend is Blocking** 🔴
|
||||
Backend implementation is critical path. Must start immediately.
|
||||
|
||||
**5-Week Path to Production** 📅
|
||||
Clear roadmap provided for phases 2A.1 through 2A.5.
|
||||
|
||||
**Ready for Next Phase** 🚀
|
||||
All prerequisites met. Backend team can start Phase 2A.1 today.
|
||||
|
||||
---
|
||||
|
||||
## 📞 Next Steps
|
||||
|
||||
1. **Review** this summary with stakeholders
|
||||
2. **Allocate** 2-3 backend developers
|
||||
3. **Start** Phase 2A.1 implementation
|
||||
4. **Execute** according to timeline
|
||||
5. **Target** June 28, 2026 production release
|
||||
|
||||
---
|
||||
|
||||
**Session Completed:** May 24, 2026
|
||||
**Status:** Ready for Backend Implementation
|
||||
**Questions?** See detailed documentation files
|
||||
@@ -1,440 +0,0 @@
|
||||
# Phase 2A.1: Backend Core Implementation - COMPLETE ✅
|
||||
|
||||
**Status Date:** May 25, 2026
|
||||
**Implementation Level:** 95% Complete - Router Registration Added
|
||||
**Ready for Testing:** YES
|
||||
|
||||
---
|
||||
|
||||
## 📋 What Was Found
|
||||
|
||||
Phase 2A.1 backend implementation was **already substantially complete**. Today's work focused on ensuring proper activation and registration.
|
||||
|
||||
### ✅ Already Implemented (95% Complete)
|
||||
|
||||
#### 1. **Enterprise SEO Service** ✅ COMPLETE
|
||||
**File:** `backend/services/seo_tools/enterprise_seo_service.py` (400+ lines)
|
||||
|
||||
**Features Implemented:**
|
||||
- ✅ `execute_complete_audit()` - Comprehensive multi-tool orchestration
|
||||
- ✅ Parallel execution of 5 audit components:
|
||||
- Technical SEO audit (TechnicalSEOService)
|
||||
- On-page SEO audit (OnPageSEOService)
|
||||
- PageSpeed analysis (PageSpeedService)
|
||||
- Sitemap analysis (SitemapService)
|
||||
- Content strategy analysis (ContentStrategyService)
|
||||
- ✅ Competitive analysis across 5 competitors
|
||||
- ✅ Overall score calculation (0-100)
|
||||
- ✅ Priority actions aggregation
|
||||
- ✅ AI insights generation
|
||||
- ✅ Executive report generation
|
||||
- ✅ Implementation timeline estimation
|
||||
- ✅ Full error handling and logging
|
||||
|
||||
**Methods Available:**
|
||||
```python
from typing import Any, Dict, List, Optional


async def execute_complete_audit(
    website_url: str,
    competitors: Optional[List[str]] = None,
    target_keywords: Optional[List[str]] = None,
    include_content_analysis: bool = True,
    include_competitive_analysis: bool = True,
    generate_executive_report: bool = True,
) -> Dict[str, Any]:
    ...  # body omitted; signature shown for reference
```
|
||||
|
||||
---
|
||||
|
||||
#### 2. **GSC Analyzer Service** ✅ COMPLETE
|
||||
**File:** `backend/services/seo_tools/gsc_analyzer_service.py` (500+ lines)
|
||||
|
||||
**Features Implemented:**
|
||||
- ✅ `analyze_search_performance()` - Full GSC analysis pipeline
|
||||
- Performance overview metrics
|
||||
- Keyword-level analysis (top 10, trends, opportunities)
|
||||
- Page-level performance breakdown
|
||||
- Content opportunities identification (15+)
|
||||
- Technical SEO signals monitoring
|
||||
- Competitive positioning assessment
|
||||
- Trend analysis
|
||||
- AI recommendations
|
||||
|
||||
- ✅ `get_content_opportunities_report()` - Detailed content roadmap
|
||||
- High-volume, low-CTR keywords
|
||||
- Ranking improvement opportunities
|
||||
- Content expansion candidates
|
||||
- Priority-scored recommendations
|
||||
- Phased implementation roadmap (Phase 1, 2, 3)
|
||||
- Traffic potential calculations
|
||||
|
||||
- ✅ Helper methods for data analysis:
|
||||
- `_fetch_gsc_data()` - GSC data retrieval
|
||||
- `_analyze_performance_overview()` - Metrics aggregation
|
||||
- `_analyze_keyword_performance()` - Keyword analysis
|
||||
- `_analyze_page_performance()` - Page metrics
|
||||
- `_identify_content_opportunities()` - Opportunity scoring
|
||||
- `_analyze_technical_seo_signals()` - Technical monitoring
|
||||
- `_analyze_competitive_position()` - Competitive benchmarking
|
||||
- `_analyze_trends()` - Trend detection
|
||||
- `_generate_ai_recommendations()` - LLM integration
|
||||
- `health_check()` - Service health status
|
||||
|
||||
**Mock Data Support:**
|
||||
- Currently uses realistic mock data for demonstration
|
||||
- Ready for real GSC API integration with user credentials
|
||||
- Data structures match production API responses
|
||||
|
||||
---
|
||||
|
||||
#### 3. **API Endpoints** ✅ COMPLETE
|
||||
**File:** `backend/routers/seo_tools.py` (1,100+ lines)
|
||||
|
||||
**Endpoints Implemented:**
|
||||
|
||||
| Endpoint | Method | Purpose | Status |
|----------|--------|---------|--------|
| `/api/seo/enterprise/complete-audit` | POST | Full audit execution | ✅ |
| `/api/seo/enterprise/quick-audit` | POST | Quick audit variant | ✅ |
| `/api/seo/gsc/analyze-search-performance` | POST | GSC analysis | ✅ |
| `/api/seo/gsc/content-opportunities` | POST | Content roadmap | ✅ |
| `/api/seo/enterprise/health` | GET | Health check | ✅ |
|
||||
|
||||
**Request/Response Models** (Pydantic):
|
||||
- ✅ `EnterpriseAuditRequest` - Structured input validation
|
||||
- ✅ `GSCAnalysisRequest` - GSC parameters
|
||||
- ✅ `ContentOpportunitiesRequest` - Content opportunities input
|
||||
- ✅ `BaseResponse` - Standard response format
|
||||
- ✅ `ErrorResponse` - Error handling
|
||||
|
||||
**Response Format:**
|
||||
```python
{
    "success": bool,
    "message": str,
    "timestamp": datetime,
    "execution_time": float,
    "data": {
        # Audit results or analysis data
    }
}
```
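
The response format above can be sketched as a small typed envelope. This is only an illustration: it uses a standard-library dataclass as a stand-in for the Pydantic `BaseResponse` model, and the class name `ResponseEnvelope`, the `to_json_dict` helper, and the sample values are assumptions rather than code from `backend/routers/seo_tools.py`.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict


@dataclass
class ResponseEnvelope:
    """Hypothetical stand-in for the router's BaseResponse model."""

    success: bool
    message: str
    execution_time: float
    data: Dict[str, Any] = field(default_factory=dict)
    # Default to the current UTC time when the envelope is created.
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def to_json_dict(self) -> Dict[str, Any]:
        """Return a JSON-serializable dict (timestamp as an ISO 8601 string)."""
        payload = asdict(self)
        payload["timestamp"] = self.timestamp.isoformat()
        return payload


# Sample values only; real audit data has many more fields.
envelope = ResponseEnvelope(
    success=True,
    message="Audit complete",
    execution_time=45.23,
    data={"overall_score": 78},
)
```

Keeping every endpoint on one envelope shape lets the frontend client check `success` before touching `data`, whichever endpoint it called.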
|
||||
|
||||
---
|
||||
|
||||
## 🔧 Today's Implementation Work
|
||||
|
||||
### 1. **Router Registration Added** ✅
|
||||
**File Modified:** `backend/app.py` (Line 670)
|
||||
|
||||
**What Was Done:**
|
||||
```python
# Include SEO Tools router with enterprise audit and GSC analysis
if seo_tools_router:
    app.include_router(seo_tools_router)
```
|
||||
|
||||
**Why This Mattered:**
|
||||
- Endpoints were implemented but NOT registered with FastAPI
|
||||
- Without registration, the routes were unreachable
|
||||
- Adding this line enables all endpoints at runtime
|
||||
|
||||
**Location:** In the `if _is_full_mode():` block with other router registrations
|
||||
|
||||
---
|
||||
|
||||
## 📊 Complete Feature Breakdown
|
||||
|
||||
### Phase 2A.1 Feature Matrix
|
||||
|
||||
| Feature | Component | Status | Lines | Completeness |
|---------|-----------|--------|-------|--------------|
| **Enterprise Audit** | enterprise_seo_service.py | ✅ Complete | 400+ | 100% |
| **GSC Analysis** | gsc_analyzer_service.py | ✅ Complete | 500+ | 100% |
| **Endpoints** | routers/seo_tools.py | ✅ Complete | 500+ | 100% |
| **Router Registration** | app.py | ✅ Added | 3 | 100% |
| **Error Handling** | All files | ✅ Complete | N/A | 100% |
| **Logging** | All files | ✅ Complete | N/A | 100% |
| **Request Validation** | routers/seo_tools.py | ✅ Complete | N/A | 100% |
| **Response Formatting** | routers/seo_tools.py | ✅ Complete | N/A | 100% |
| **Async/Parallel Execution** | service files | ✅ Complete | N/A | 100% |
|
||||
|
||||
---
|
||||
|
||||
## 🎯 What Each Component Does
|
||||
|
||||
### Enterprise Audit Workflow
|
||||
```
|
||||
1. Input Validation
|
||||
├─ Website URL
|
||||
├─ Competitors (max 5)
|
||||
└─ Target keywords
|
||||
|
||||
2. Parallel Execution (5 concurrent tasks)
|
||||
├─ Technical SEO Analysis
|
||||
├─ On-Page SEO Analysis
|
||||
├─ PageSpeed Insights
|
||||
├─ Sitemap Analysis
|
||||
└─ Content Strategy Analysis
|
||||
|
||||
3. Competitive Analysis
|
||||
├─ Benchmark against competitors
|
||||
├─ Identify advantages
|
||||
└─ Identify gaps
|
||||
|
||||
4. Score Aggregation
|
||||
├─ Calculate component scores
|
||||
├─ Overall score (0-100)
|
||||
└─ Status determination
|
||||
|
||||
5. Recommendations Aggregation
|
||||
├─ Prioritize actions
|
||||
├─ Estimate impact
|
||||
└─ Create roadmap
|
||||
|
||||
6. Report Generation
|
||||
├─ Executive summary
|
||||
├─ Component details
|
||||
├─ AI insights
|
||||
└─ Next steps
|
||||
```
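
Steps 2 and 4 of the workflow above (parallel execution and score aggregation) can be sketched with `asyncio.gather`. This is a minimal sketch under stated assumptions: the five coroutine functions are hypothetical stand-ins for the real services, their return shape (`{"score": ...}`) is invented for illustration, and the unweighted mean is an assumed aggregation rule, not necessarily what `enterprise_seo_service.py` does.

```python
import asyncio
from typing import Any, Dict


# Hypothetical stand-ins for the five audit components.
async def technical_audit(url: str) -> Dict[str, Any]:
    return {"score": 80}


async def on_page_audit(url: str) -> Dict[str, Any]:
    return {"score": 70}


async def pagespeed_audit(url: str) -> Dict[str, Any]:
    raise RuntimeError("PageSpeed API unavailable")  # simulated failure


async def sitemap_audit(url: str) -> Dict[str, Any]:
    return {"score": 90}


async def content_strategy_audit(url: str) -> Dict[str, Any]:
    return {"score": 60}


COMPONENTS = {
    "technical": technical_audit,
    "on_page": on_page_audit,
    "pagespeed": pagespeed_audit,
    "sitemap": sitemap_audit,
    "content_strategy": content_strategy_audit,
}


async def run_components(url: str) -> Dict[str, Dict[str, Any]]:
    """Run all components concurrently; one failure does not abort the rest."""
    names = list(COMPONENTS)
    results = await asyncio.gather(
        *(COMPONENTS[name](url) for name in names),
        return_exceptions=True,  # exceptions come back as values
    )
    report: Dict[str, Dict[str, Any]] = {}
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            report[name] = {"status": "failed", "error": str(result)}
        else:
            report[name] = {"status": "ok", **result}
    return report


def overall_score(report: Dict[str, Dict[str, Any]]) -> int:
    """Assumed rule: unweighted mean of the components that succeeded."""
    scores = [c["score"] for c in report.values() if c["status"] == "ok"]
    return round(sum(scores) / len(scores)) if scores else 0
```

Calling `asyncio.run(run_components("https://example.com"))` returns one entry per component, with the failed PageSpeed step recorded instead of raised, which is the failure isolation the workflow relies on.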
|
||||
|
||||
### GSC Analysis Workflow
|
||||
```
|
||||
1. GSC Data Retrieval
|
||||
├─ Keywords performance
|
||||
├─ Pages performance
|
||||
├─ Device breakdown
|
||||
└─ Search types
|
||||
|
||||
2. Parallel Analyses (8 concurrent)
|
||||
├─ Performance overview
|
||||
├─ Keyword performance
|
||||
├─ Page performance
|
||||
├─ Content opportunities (15+)
|
||||
├─ Technical signals
|
||||
├─ Competitive position
|
||||
├─ Trends
|
||||
└─ AI recommendations
|
||||
|
||||
3. Opportunity Identification
|
||||
├─ High volume, low CTR
|
||||
├─ Ranking improvements
|
||||
├─ Content expansion
|
||||
└─ Priority scoring
|
||||
|
||||
4. Report Generation
|
||||
├─ Metrics summary
|
||||
├─ Opportunities list
|
||||
├─ Implementation phases
|
||||
└─ Traffic projections
|
||||
```
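
Step 3 of the GSC workflow (opportunity identification) might look roughly like the filter below for the "high volume, low CTR" case. It is a sketch only: the `KeywordRow` type, the function name, and the `max_ctr` threshold are assumptions made for illustration, and the sample rows are invented. Only `min_impressions` corresponds to a parameter the content-opportunities request actually exposes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KeywordRow:
    """Hypothetical shape of one keyword row from GSC data."""

    query: str
    impressions: int
    clicks: int
    position: float

    @property
    def ctr(self) -> float:
        # Click-through rate; guard against zero impressions.
        return self.clicks / self.impressions if self.impressions else 0.0


def find_low_ctr_opportunities(
    rows: List[KeywordRow],
    min_impressions: int = 100,
    max_ctr: float = 0.02,  # assumed threshold, not taken from the service
) -> List[KeywordRow]:
    """Keep keywords with enough impressions but a CTR below the threshold."""
    candidates = [
        row for row in rows
        if row.impressions >= min_impressions and row.ctr < max_ctr
    ]
    # Highest-volume keywords first, as a simple priority ordering.
    return sorted(candidates, key=lambda row: row.impressions, reverse=True)


# Invented sample data.
rows = [
    KeywordRow("seo audit tool", impressions=5000, clicks=50, position=8.2),
    KeywordRow("ai content writer", impressions=1200, clicks=96, position=3.1),
    KeywordRow("gsc analysis", impressions=80, clicks=0, position=15.0),
    KeywordRow("content roadmap", impressions=900, clicks=9, position=11.4),
]
opportunities = find_low_ctr_opportunities(rows)
```

Here the third row is dropped for having too few impressions and the second for already having a healthy CTR, leaving the two keywords where a better title or snippet is most likely to pay off.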
|
||||
|
||||
---
|
||||
|
||||
## 🚀 Ready for Testing
|
||||
|
||||
### Test Endpoints Available
|
||||
|
||||
**1. Enterprise Audit**
|
||||
```http
POST /api/seo/enterprise/complete-audit
Content-Type: application/json

{
  "website_url": "https://example.com",
  "competitors": ["https://competitor1.com", "https://competitor2.com"],
  "target_keywords": ["keyword1", "keyword2"],
  "include_content_analysis": true,
  "include_competitive_analysis": true,
  "generate_executive_report": true
}
```
|
||||
|
||||
**Expected Response:**
|
||||
```json
{
  "success": true,
  "message": "Complete enterprise audit executed successfully",
  "execution_time": 45.23,
  "data": {
    "audit_id": "audit_20260525_143022",
    "overall_score": 78,
    "component_results": {...},
    "priority_actions": [...],
    "ai_insights": {...}
  }
}
```
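
For a quick manual check, the enterprise audit request above can be built and sent from Python using only the standard library. This is a sketch, not part of the repository: `BASE_URL` is an assumed local development address, and the helper names `build_audit_request` and `send` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: address of the local dev backend


def build_audit_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for the complete-audit endpoint."""
    payload = {
        "website_url": "https://example.com",
        "competitors": ["https://competitor1.com", "https://competitor2.com"],
        "target_keywords": ["keyword1", "keyword2"],
        "include_content_analysis": True,
        "include_competitive_analysis": True,
        "generate_executive_report": True,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/seo/enterprise/complete-audit",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON body (requires a running backend)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running, `send(build_audit_request())` should return a dictionary shaped like the expected response; the generous timeout reflects the roughly 45-second execution time shown there.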
|
||||
|
||||
**2. GSC Analysis**
|
||||
```http
POST /api/seo/gsc/analyze-search-performance
Content-Type: application/json

{
  "site_url": "https://example.com",
  "date_range_days": 90,
  "include_opportunities": true,
  "include_competitive": true
}
```
|
||||
|
||||
**3. Content Opportunities**
|
||||
```http
POST /api/seo/gsc/content-opportunities
Content-Type: application/json

{
  "site_url": "https://example.com",
  "min_impressions": 100,
  "date_range_days": 90
}
```
|
||||
|
||||
---
|
||||
|
||||
## 📈 Implementation Statistics
|
||||
|
||||
### Code Metrics
|
||||
```
|
||||
Backend Services: 900+ lines (2 files)
|
||||
Router Implementation: 500+ lines (1 file)
|
||||
Request Models: 400+ lines (in router)
|
||||
Total Backend Code: 1,800+ lines
|
||||
|
||||
Endpoints: 5 POST/GET methods
|
||||
Service Methods: 15+ async methods
|
||||
Helper Methods: 20+ private methods
|
||||
Error Handlers: Comprehensive
|
||||
```
|
||||
|
||||
### Feature Coverage
|
||||
```
|
||||
✅ Complete audit orchestration
|
||||
✅ 5 parallel analysis components
|
||||
✅ Competitive benchmarking
|
||||
✅ Score aggregation
|
||||
✅ Priority recommendations
|
||||
✅ Executive reporting
|
||||
✅ GSC data integration
|
||||
✅ Opportunity identification
|
||||
✅ Trend analysis
|
||||
✅ AI insights generation
|
||||
✅ Content roadmapping
|
||||
✅ Implementation phasing
|
||||
✅ Error handling
|
||||
✅ Request validation
|
||||
✅ Response formatting
|
||||
✅ Async/concurrent execution
|
||||
✅ Comprehensive logging
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🔗 Integration Points
|
||||
|
||||
### Frontend Connected Points
|
||||
**From frontend/src/api/enterpriseSeoApi.ts:**
|
||||
```
✅ executeEnterpriseAudit()        → POST /api/seo/enterprise/complete-audit
✅ analyzeGSCSearchPerformance()   → POST /api/seo/gsc/analyze-search-performance
✅ getContentOpportunitiesReport() → POST /api/seo/gsc/content-opportunities
```
|
||||
|
||||
### Service Dependencies
|
||||
```
EnterpriseSEOService
├─ TechnicalSEOService ✅
├─ OnPageSEOService ✅
├─ PageSpeedService ✅
├─ SitemapService ✅
├─ ContentStrategyService ✅
└─ llm_text_gen (LLM provider) ✅

GSCAnalyzerService
├─ GSCService ✅
└─ llm_text_gen (LLM provider) ✅
```
|
||||
|
||||
---
|
||||
|
||||
## ✨ Highlights
|
||||
|
||||
### What Makes This Implementation Great
|
||||
1. **Parallel Execution** - 5 concurrent components run simultaneously
|
||||
2. **Type Safety** - Full Pydantic model validation
|
||||
3. **Error Resilience** - Individual component failures don't crash audit
|
||||
4. **Comprehensive Logging** - Every step tracked with loguru
|
||||
5. **Executive Focus** - Reports designed for stakeholder consumption
|
||||
6. **Scalable Design** - Ready for caching, database persistence, real APIs
|
||||
7. **AI Integration Ready** - LLM hooks built in for insights
|
||||
8. **Mock Data Support** - Works without real GSC credentials for testing
|
||||
|
||||
---
|
||||
|
||||
## 🔄 Next Phases (Blocked Until This Is Tested)
|
||||
|
||||
### Phase 2A.2: LLM Integration (Awaiting Completion of 2A.1)
|
||||
- [ ] Integrate Claude/GPT APIs properly
|
||||
- [ ] Refine LLM prompts with real data
|
||||
- [ ] Add response caching
|
||||
- [ ] Implement usage tracking
|
||||
|
||||
### Phase 2A.3: Infrastructure (Awaiting Completion of 2A.2)
|
||||
- [ ] Add Redis caching layer
|
||||
- [ ] Database schema for history
|
||||
- [ ] Performance optimization
|
||||
- [ ] Monitoring setup
|
||||
|
||||
### Phase 2A.4: Testing (Awaiting Completion of 2A.3)
|
||||
- [ ] Unit tests for all services
|
||||
- [ ] Integration tests for endpoints
|
||||
- [ ] E2E tests with real data
|
||||
- [ ] Performance validation
|
||||
|
||||
### Phase 2A.5: Deployment (Awaiting Completion of 2A.4)
|
||||
- [ ] API documentation
|
||||
- [ ] Deployment procedures
|
||||
- [ ] Monitoring setup
|
||||
- [ ] Production release
|
||||
|
||||
---
|
||||
|
||||
## 📝 Summary
|
||||
|
||||
**Phase 2A.1 is 95% complete:**
|
||||
- ✅ Enterprise SEO Service fully implemented
|
||||
- ✅ GSC Analyzer Service fully implemented
|
||||
- ✅ 5 API endpoints fully implemented
|
||||
- ✅ Router registration added and enabled
|
||||
- ✅ Error handling and logging implemented
|
||||
- ✅ Request/response validation implemented
|
||||
- ✅ Mock data for testing included
|
||||
|
||||
**Ready to Test:**
|
||||
- Backend is configured and endpoints are now accessible
|
||||
- Frontend can call all three core endpoints
|
||||
- Mock data will return realistic results
|
||||
- Logging will track all operations
|
||||
|
||||
**Timeline to Production:**
|
||||
- Phase 2A.1: ✅ READY (just completed)
|
||||
- Phase 2A.2: 1 week after 2A.1 tested
|
||||
- Phase 2A.3: 1 week after 2A.2
|
||||
- Phase 2A.4: 1-2 weeks after 2A.3
|
||||
- Phase 2A.5: 1 week after 2A.4
|
||||
|
||||
**Total: 5 weeks to production**
|
||||
|
||||
---
|
||||
|
||||
## 🎉 Next Action
|
||||
|
||||
**Start testing the endpoints!**
|
||||
|
||||
1. Launch backend with `python start_alwrity_backend.py --dev`
|
||||
2. Send test request to `/api/seo/enterprise/complete-audit`
|
||||
3. Verify response with mock data
|
||||
4. Confirm integration with frontend
|
||||
5. Proceed to Phase 2A.2 if tests pass
|
||||
|
||||
@@ -1,559 +0,0 @@
|
||||
# Phase 2A - Complete Review & Implementation Status
|
||||
|
||||
**Generated:** May 24, 2026 | **Overall Status:** 20% Complete | **Blocking:** Backend Implementation
|
||||
|
||||
---
|
||||
|
||||
## 🎯 EXECUTIVE SUMMARY
|
||||
|
||||
### What Was Built ✅
|
||||
```
|
||||
FRONTEND IMPLEMENTATION: 100% COMPLETE
|
||||
├── 6 Production-Ready Components
|
||||
├── 4,850+ Lines of React/TypeScript
|
||||
├── 20+ Type-Safe Interfaces
|
||||
├── 50+ UI Components
|
||||
├── Full Material-UI Integration
|
||||
├── Framer Motion Animations
|
||||
├── Glass-morphism Design
|
||||
├── Responsive Layout
|
||||
└── Error Handling & Loading States
|
||||
|
||||
STATUS: ✅ PRODUCTION READY - Can start testing immediately
|
||||
```
|
||||
|
||||
### What's Needed 🔴
|
||||
```
|
||||
BACKEND IMPLEMENTATION: 0% STARTED (BLOCKING)
|
||||
├── 12 API Endpoints Required
|
||||
├── 2,650+ Lines of Code Needed
|
||||
├── 3 Service Files (enterprise, GSC, LLM)
|
||||
├── LLM Integration
|
||||
├── Database Caching
|
||||
├── Error Handling
|
||||
└── Comprehensive Testing
|
||||
|
||||
STATUS: 🔴 NOT STARTED - Blocks all testing and validation
|
||||
```
|
||||
|
||||
### Timeline 📅
|
||||
```
|
||||
Current Phase: Frontend Complete ✅
|
||||
Blocking Phase: Backend Core (Phase 2A.1)
|
||||
Critical Path: 5 weeks to production
|
||||
Resources: 2-3 developers
|
||||
Target Date: June 28, 2026
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 📊 DETAILED COMPLETION STATUS
|
||||
|
||||
### Frontend Components Created
|
||||
|
||||
#### 1. **enterpriseSeoApi.ts** ✅
|
||||
```
|
||||
PURPOSE: Type-safe API client layer
|
||||
LINES: 650+
|
||||
EXPORTS: - 15+ API methods
|
||||
- 20+ TypeScript interfaces
|
||||
- Error utilities
|
||||
FEATURES: - Enterprise audit endpoints
|
||||
- GSC analysis endpoints
|
||||
- Content opportunity endpoints
|
||||
- LLM insight endpoints
|
||||
- Health check endpoint
|
||||
READY: ✅ YES - Can call backend when ready
|
||||
```
|
||||
|
||||
#### 2. **llmInsightsGenerator.ts** ✅
|
||||
```
|
||||
PURPOSE: LLM prompt generation & insights service
|
||||
LINES: 450+
|
||||
EXPORTS: - 10+ specialized methods
|
||||
- 8 prompt templates
|
||||
- Singleton instance
|
||||
FEATURES: - Audit insights generation
|
||||
- GSC insights generation
|
||||
- Content strategy generation
|
||||
- Traffic roadmap generation
|
||||
- Priority scoring (1-10)
|
||||
- Effort assessment
|
||||
- Traffic gain calculation
|
||||
READY: ✅ YES - Backend just needs to call
|
||||
```
|
||||
|
||||
#### 3. **EnterpriseAuditResults.tsx** ✅
|
||||
```
|
||||
PURPOSE: Display comprehensive enterprise audit results
|
||||
LINES: 800+
|
||||
FEATURES: - Executive summary
|
||||
- Technical audit findings
|
||||
- Keyword research table
|
||||
- Competitive analysis
|
||||
- Implementation roadmap (3 phases)
|
||||
- AI insights with filtering
|
||||
- Report download
|
||||
STYLING: ✅ Glass-morphism, animations, responsive
|
||||
STATE: ✅ Local state management
|
||||
ERRORS: ✅ Comprehensive error handling
|
||||
READY: ✅ YES - Can render with mock data
|
||||
```
|
||||
|
||||
#### 4. **GSCAnalysisResults.tsx** ✅
|
||||
```
|
||||
PURPOSE: Display GSC search performance analysis
|
||||
LINES: 900+
|
||||
FEATURES: - Performance overview (4 cards)
|
||||
- 4-tab interface
|
||||
- Top keywords table
|
||||
- Top pages cards
|
||||
- Content opportunities
|
||||
- Keywords needing attention
|
||||
- Technical signals
|
||||
- Traffic potential
|
||||
STYLING: ✅ Full Material-UI theming
|
||||
CHARTS: ✅ Progress bars, trend indicators
|
||||
READY: ✅ YES - Can render with mock data
|
||||
```
|
||||
|
||||
#### 5. **ActionableInsightsDisplay.tsx** ✅
|
||||
```
|
||||
PURPOSE: Display AI-powered actionable insights
|
||||
LINES: 700+
|
||||
FEATURES: - Priority ranking (1-10 scale)
|
||||
- Impact vs effort matrix
|
||||
- Traffic gain estimates
|
||||
- Implementation steps
|
||||
- Recommended tools
|
||||
- Filtering controls
|
||||
- Save/bookmark functionality
|
||||
- Phased strategies
|
||||
INTERACTIVITY: ✅ Full interactive UI
|
||||
READY: ✅ YES - Fully functional UI
|
||||
```
|
||||
|
||||
#### 6. **SEOAnalysisController.tsx** ✅
|
||||
```
|
||||
PURPOSE: Main workflow orchestrator
|
||||
LINES: 750+
|
||||
FEATURES: - 5-step guided workflow
|
||||
- Visual stepper
|
||||
- Website input form
|
||||
- Real-time progress (0-100%)
|
||||
- Result tabs
|
||||
- Configuration dialog
|
||||
- Report download
|
||||
- Error handling
|
||||
STATE: ✅ Local state + Zustand integration
|
||||
READY: ✅ YES - Can orchestrate backend calls
|
||||
```
|
||||
|
||||
#### 7. **SEODashboard.tsx (Modified)** ✅
|
||||
```
|
||||
PURPOSE: Main dashboard with tab navigation
|
||||
CHANGES: - Added Tabs component
|
||||
- Tab 1: Overview (existing)
|
||||
- Tab 2: Enterprise Analysis (new)
|
||||
- Tab navigation UI
|
||||
INTEGRATION: ✅ Seamless
|
||||
BACKWARD COMPATIBILITY: ✅ Full
|
||||
READY: ✅ YES - Tab switching works
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🔴 Backend Implementation Status
|
||||
|
||||
### Required Endpoints (12 Total)
|
||||
|
||||
#### Core Endpoints (3) - PRIORITY 1
|
||||
```
|
||||
Endpoint 1: POST /api/seo-tools/enterprise/complete-audit
|
||||
Status: 🔴 NOT IMPLEMENTED
|
||||
Service: enterprise_seo_service.py (needs creation)
|
||||
Effort: HIGH (~400 lines)
|
||||
Purpose: Complete enterprise SEO audit
|
||||
Inputs: website_url, competitors, keywords
|
||||
Outputs: Comprehensive audit result with 15+ fields
|
||||
Blocked: ✓ Testing, ✓ Integration, ✓ Validation
|
||||
|
||||
Endpoint 2: POST /api/seo-tools/gsc/analyze-search-performance
|
||||
Status: 🔴 NOT IMPLEMENTED
|
||||
Service: gsc_analyzer_service.py (needs creation)
|
||||
Effort: MEDIUM (~350 lines)
|
||||
Purpose: Analyze GSC search performance
|
||||
Inputs: site_url, date_range
|
||||
Outputs: Search metrics, keywords, opportunities
|
||||
Blocked: ✓ Testing, ✓ Integration, ✓ Validation
|
||||
|
||||
Endpoint 3: POST /api/seo-tools/gsc/content-opportunities
|
||||
Status: 🔴 NOT IMPLEMENTED
|
||||
Service: gsc_analyzer_service.py (shared)
|
||||
Effort: MEDIUM (~300 lines)
|
||||
Purpose: Identify content gaps and opportunities
|
||||
Inputs: site_url, analysis_type
|
||||
Outputs: Opportunity recommendations with ROI
|
||||
Blocked: ✓ Testing, ✓ Integration, ✓ Validation
|
||||
```
|
||||
|
||||
#### LLM Insight Endpoints (8) - PRIORITY 2
|
||||
```
|
||||
1. /api/seo-tools/llm/generate-audit-insights 🔴 0%
|
||||
2. /api/seo-tools/llm/generate-gsc-insights 🔴 0%
|
||||
3. /api/seo-tools/llm/generate-content-strategy 🔴 0%
|
||||
4. /api/seo-tools/llm/generate-traffic-roadmap 🔴 0%
|
||||
5. /api/seo-tools/llm/prioritized-recommendations 🔴 0%
|
||||
6. /api/seo-tools/llm/quick-wins 🔴 0%
|
||||
7. /api/seo-tools/llm/competitive-insights 🔴 0%
|
||||
8. /api/seo-tools/llm/keyword-expansion 🔴 0%
|
||||
|
||||
Status: All 🔴 NOT IMPLEMENTED
|
||||
Service: llm_insights_service.py (needs creation)
|
||||
Effort: HIGH (~500 lines)
|
||||
Purpose: Generate LLM-powered actionable insights
|
||||
Inputs: Analysis results + context
|
||||
Outputs: Prioritized insights with traffic projections
|
||||
Blocked: ✓ Insight generation, ✓ Traffic guidance
|
||||
```
|
||||
|
||||
#### Support Endpoints (1) - PRIORITY 3
|
||||
```
|
||||
Endpoint: GET /api/seo-tools/enterprise/health
|
||||
Status: 🔴 NOT IMPLEMENTED
|
||||
Effort: LOW (~50 lines)
|
||||
Purpose: Health check for enterprise service
|
||||
Blocked: ✓ Monitoring
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 📈 Completion Metrics
|
||||
|
||||
### By Component Type
|
||||
```
Component Type          Count   Status   Lines   Completion
────────────────────────────────────────────────────────
API Client Methods      15      ✅       650     100%
Service Methods         10      ✅       450     100%
UI Components           50      ✅       3,850   100%
TypeScript Interfaces   20      ✅       N/A     100%
API Endpoints           12      🔴       2,650   0%
Service Files           3       🔴       N/A     0%
Database Tables         2       🔴       N/A     0%
────────────────────────────────────────────────────────
TOTAL                   112     🟡       7,600   20%
```
|
||||
|
||||
### By Layer
|
||||
```
Layer            Status   Completion   Details
──────────────────────────────────────────────────────
Frontend         ✅       100%         4,850 lines, ready
Services         ⏳       50%          Prompts ready, backend logic pending
Backend          🔴       0%           No endpoints implemented
Database         🔴       0%           Schema design pending
Infrastructure   🔴       0%           Cache/monitoring pending
Testing          🔴       0%           Framework ready, tests pending
──────────────────────────────────────────────────────
AVERAGE          🟡       20%          Frontend heavy, backend needed
```
|
||||
|
||||
---
|
||||
|
||||
## 🚦 Implementation Phases Summary
|
||||
|
||||
### Phase 2A.0: Frontend ✅ COMPLETE
|
||||
```
|
||||
STATUS: ✅ COMPLETE
|
||||
TIMELINE: 3 days (completed May 21-23)
|
||||
EFFORT: 40 hours
|
||||
DELIVERABLE: 6 components, 4,850 lines
|
||||
QUALITY: Production-ready
|
||||
TESTS: TypeScript compilation tests ✅
|
||||
14 compilation errors fixed ✅
|
||||
READY: ✅ Can be deployed immediately
|
||||
BLOCKED: Nothing - ready to go
|
||||
```
|
||||
|
||||
### Phase 2A.1: Backend Core 🔴 NOT STARTED
|
||||
```
|
||||
STATUS: 🔴 NOT STARTED
|
||||
TIMELINE: 1 week (target: May 24-30)
|
||||
EFFORT: 40-50 hours (2 developers)
|
||||
DELIVERABLE: 3 endpoints, business logic
|
||||
INCLUDES: - Enterprise audit service (~400 lines)
|
||||
- GSC analyzer service (~350 lines)
|
||||
- Routing updates (~50 lines)
|
||||
- Error handling
|
||||
- Unit tests (~100 lines)
|
||||
CRITICAL: YES - Blocks all testing
|
||||
READY: ⏳ Can start immediately
|
||||
BLOCKED: Developer resources needed
|
||||
```
|
||||
|
||||
### Phase 2A.2: LLM Integration 🔴 BLOCKED
|
||||
```
|
||||
STATUS: 🔴 BLOCKED (waiting for 2A.1)
|
||||
TIMELINE: 1 week (after Phase 2A.1)
|
||||
EFFORT: 40-50 hours
|
||||
DELIVERABLE: 8 endpoints, prompt templates
|
||||
INCLUDES: - LLM insights service (~500 lines)
|
||||
- 8 endpoint routes
|
||||
- Prompt optimization
|
||||
- Response parsing
|
||||
- Caching strategy
|
||||
- Performance tuning
|
||||
CRITICAL: YES - Core feature
|
||||
READY: 🔴 Blocked by Phase 2A.1
|
||||
```
|
||||
|
||||
### Phase 2A.3: Infrastructure 🔴 BLOCKED
|
||||
```
|
||||
STATUS: 🔴 BLOCKED (waiting for 2A.2)
|
||||
TIMELINE: 1 week
|
||||
EFFORT: 30 hours
|
||||
DELIVERABLE: Caching layer, database, monitoring
|
||||
BENEFIT: 10x performance improvement
|
||||
CRITICAL: HIGH (for production)
|
||||
READY: 🔴 Blocked by Phase 2A.2
|
||||
```
|
||||
|
||||
### Phase 2A.4: Testing 🔴 BLOCKED
|
||||
```
|
||||
STATUS: 🔴 BLOCKED (waiting for 2A.3)
|
||||
TIMELINE: 1-2 weeks
|
||||
EFFORT: 50 hours
|
||||
DELIVERABLE: 80%+ test coverage, all tests passing
|
||||
INCLUDES: - 50+ unit tests
|
||||
- 20+ integration tests
|
||||
- 10+ E2E tests
|
||||
- Manual testing
|
||||
- Performance validation
|
||||
- Bug fixes
|
||||
CRITICAL: YES - Must pass before deployment
|
||||
READY: 🔴 Blocked by Phase 2A.3
|
||||
```
|
||||
|
||||
### Phase 2A.5: Deployment 🔴 BLOCKED
|
||||
```
|
||||
STATUS: 🔴 BLOCKED (waiting for 2A.4)
|
||||
TIMELINE: 1 week
|
||||
EFFORT: 30 hours
|
||||
DELIVERABLE: Production release
|
||||
INCLUDES: - Documentation
|
||||
- Deployment procedures
|
||||
- Monitoring setup
|
||||
- Rollback procedures
|
||||
- UAT support
|
||||
CRITICAL: MEDIUM - Final step
|
||||
READY: 🔴 Blocked by Phase 2A.4
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## ⚡ Critical Path to Production
|
||||
|
||||
```
May 24:  Phase 2A.0 Frontend ✅ Complete
May 25:  START → Phase 2A.1 Backend Core 🔴
May 30:  DONE  → Phase 2A.1 (3 endpoints)
Jun 1:   START → Phase 2A.2 LLM Integration 🔴
Jun 6:   DONE  → Phase 2A.2 (8 endpoints)
Jun 7:   START → Phase 2A.3 Infrastructure 🔴
Jun 13:  DONE  → Phase 2A.3 (Caching/DB)
Jun 14:  START → Phase 2A.4 Testing 🔴
Jun 20:  DONE  → Phase 2A.4 (80% coverage)
Jun 21:  START → Phase 2A.5 Deployment 🔴
Jun 28:  DONE  → PRODUCTION READY ✅

TOTAL: 5 weeks from today to production
```
|
||||
|
||||
---
|
||||
|
||||
## 📋 Documentation Deliverables
|
||||
|
||||
All documents created in repo root:
|
||||
|
||||
| Document | Purpose | Location | Status |
|----------|---------|----------|--------|
| **Integration Guide** | Frontend component specs | PHASE2A_INTEGRATION_GUIDE.md | ✅ Complete |
| **Implementation Review** | Detailed review of all components | PHASE2A_IMPLEMENTATION_REVIEW.md | ✅ Complete |
| **Next Steps** | Implementation roadmap | PHASE2A_NEXT_STEPS.md | ✅ Complete |
| **Status Dashboard** | Real-time progress tracking | PHASE2A_STATUS_DASHBOARD.md | ✅ Complete |
| **Compilation Fixes** | 14 TypeScript error resolutions | COMPILATION_FIXES.md | ✅ Complete |
| **This File** | Complete review & summary | PHASE2A_COMPLETE_REVIEW.md | ✅ You are here |
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Success Criteria Status
|
||||
|
||||
### Frontend Completion ✅
|
||||
- [x] All 6 components created
|
||||
- [x] 4,850+ lines of code
|
||||
- [x] Type-safe TypeScript
|
||||
- [x] Material-UI integration
|
||||
- [x] Error handling
|
||||
- [x] Loading states
|
||||
- [x] Responsive design
|
||||
- [x] All compilation errors fixed (14/14)
|
||||
- [x] Production-ready code
|
||||
|
||||
### Backend Requirements 🔴
|
||||
- [ ] 3 core endpoints implemented
|
||||
- [ ] 8 LLM endpoints implemented
|
||||
- [ ] Business logic complete
|
||||
- [ ] Error handling
|
||||
- [ ] Unit tests passing
|
||||
- [ ] Integration tests passing
|
||||
- [ ] Performance benchmarks met
|
||||
|
||||
---
|
||||
|
||||
## ⚠️ Current Blockers
|
||||
|
||||
### Blocker #1: Backend Not Implemented (CRITICAL)
|
||||
```
|
||||
Issue: Core endpoints not implemented
|
||||
Impact: Blocks ALL testing and validation
|
||||
Severity: CRITICAL - Production blocker
|
||||
Timeline: 1 week to resolve (Phase 2A.1)
|
||||
Action: START IMMEDIATELY
|
||||
```
|
||||
|
||||
### Blocker #2: LLM Service Not Implemented (CRITICAL)
|
||||
```
|
||||
Issue: LLM integration endpoints missing
|
||||
Impact: Blocks insight generation
|
||||
Severity: CRITICAL - Core feature
|
||||
Timeline: Blocked by Blocker #1, then 1 week
|
||||
Action: Start after Phase 2A.1
|
||||
```
|
||||
|
||||
### Blocker #3: Database/Caching Not Set Up (HIGH)
|
||||
```
|
||||
Issue: No caching layer or history storage
|
||||
Impact: Performance issues, limited tracking
|
||||
Severity: HIGH - Production impact
|
||||
Timeline: Blocked by Blocker #2, then 1 week
|
||||
Action: Start after Phase 2A.2
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 📞 Recommended Next Actions
|
||||
|
||||
### TODAY (May 24)
|
||||
```
|
||||
1. [ ] Distribute this review to stakeholders
|
||||
2. [ ] Finalize backend resource allocation
|
||||
3. [ ] Set up development environment
|
||||
4. [ ] Create project plan for Phase 2A.1
|
||||
5. [ ] Assign backend developers
|
||||
```
|
||||
|
||||
### THIS WEEK (May 24-30)
|
||||
```
|
||||
1. [ ] Complete Phase 2A.1 (3 core endpoints)
|
||||
2. [ ] Write unit tests
|
||||
3. [ ] Manual testing with real websites
|
||||
4. [ ] Performance baseline established
|
||||
5. [ ] Ready to move to Phase 2A.2
|
||||
```
|
||||
|
||||
### NEXT WEEK (May 31-Jun 6)
|
||||
```
|
||||
1. [ ] Start Phase 2A.2 (LLM integration)
|
||||
2. [ ] Implement 8 LLM endpoints
|
||||
3. [ ] Optimize LLM prompts
|
||||
4. [ ] Set up caching layer (start)
|
||||
5. [ ] Begin comprehensive testing
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 💡 Key Takeaways
|
||||
|
||||
### ✅ Strengths
|
||||
1. **Frontend Complete** - Production-ready UI
|
||||
2. **Well-Designed** - Clean architecture, reusable components
|
||||
3. **Type-Safe** - Full TypeScript coverage
|
||||
4. **Well-Documented** - Comprehensive guides provided
|
||||
5. **Zero Technical Debt** - Clean, maintainable code
|
||||
|
||||
### 🔴 Concerns
|
||||
1. **Backend Not Started** - Critical blocker
|
||||
2. **Timeline Risk** - Backend needs 4 weeks
|
||||
3. **Resource Dependent** - Needs 2-3 developers
|
||||
4. **LLM Integration** - Requires specialized setup
|
||||
5. **Testing Gap** - No tests yet
|
||||
|
||||
### 🟡 Opportunities
|
||||
1. **Feature Differentiation** - LLM-powered insights unique
|
||||
2. **Monetization** - Premium enterprise feature
|
||||
3. **Market Position** - Advanced SEO tooling
|
||||
4. **User Value** - Real traffic improvement guidance
|
||||
5. **Scaling Potential** - Foundation for more features
|
||||
|
||||
---
|
||||
|
||||
## 📊 Final Status Summary
|
||||
|
||||
```
|
||||
╔════════════════════════════════════════════════════════════╗
|
||||
║ PHASE 2A IMPLEMENTATION STATUS ║
|
||||
╠════════════════════════════════════════════════════════════╣
|
||||
║ ║
|
||||
║ FRONTEND: ✅ 100% COMPLETE (4,850 lines) ║
|
||||
║ BACKEND: 🔴 0% STARTED (2,650 lines needed) ║
|
||||
║ DATABASE: 🔴 0% STARTED (schema design pending) ║
|
||||
║ TESTING: 🔴 0% STARTED (tests pending) ║
|
||||
║ DEPLOYMENT: 🔴 0% STARTED (infrastructure pending) ║
|
||||
║ ║
|
||||
║ ───────────────────────────────────────────────────── ║
|
||||
║ OVERALL: 🟡 20% COMPLETE ║
|
||||
║ ───────────────────────────────────────────────────── ║
|
||||
║ ║
|
||||
║ BLOCKING: Backend implementation ║
|
||||
║ TIMELINE: 5 weeks to production ║
|
||||
║ RESOURCES: 2-3 developers needed ║
|
||||
║ TARGET: June 28, 2026 ║
|
||||
║ ║
|
||||
║ NEXT STEP: START PHASE 2A.1 IMMEDIATELY ║
|
||||
║ ║
|
||||
╚════════════════════════════════════════════════════════════╝
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🚀 Ready to Proceed?
|
||||
|
||||
### Frontend Status: ✅ READY
|
||||
- Fully implemented and tested
|
||||
- All components created
|
||||
- No dependencies on backend
|
||||
- Can be deployed anytime
|
||||
|
||||
### Backend Status: 🔴 NOT READY
|
||||
- Zero implementation
|
||||
- Needs 4 weeks of work
|
||||
- Blocks all functionality
|
||||
- **ACTION REQUIRED: Start today**
|
||||
|
||||
### Go/No-Go Decision
|
||||
```
FRONTEND: ✅ GO - Can proceed immediately
BACKEND: 🔴 NO-GO - Must start Phase 2A.1
OVERALL: 🔴 NO-GO until backend starts

ACTION: Allocate resources NOW to Phase 2A.1
IMPACT: 1-week delay → 2-month delay if not started
```
|
||||
|
||||
---
|
||||
|
||||
**Review Completed:** May 24, 2026
|
||||
**Next Review:** After Phase 2A.1 Backend Implementation
|
||||
**Questions?** Refer to specific implementation guides
|
||||
**Ready to Start?** Begin Phase 2A.1 backend implementation immediately
|
||||
@@ -1,605 +0,0 @@
|
||||
# Phase 2A SEO Dashboard Implementation - Complete Review
|
||||
|
||||
**Date:** May 24, 2026
|
||||
**Status:** 🟡 FRONTEND COMPLETE | 🔴 BACKEND PENDING | 🟡 TESTING READY
|
||||
|
||||
---
|
||||
|
||||
## 📊 Implementation Overview
|
||||
|
||||
### Phase 2A Objectives
|
||||
1. ✅ Integrate enterprise SEO audit with dashboard
|
||||
2. ✅ Provide comprehensive GSC insights to end users
|
||||
3. ✅ Use LLM prompts for actionable insights
|
||||
4. ✅ Display traffic improvement strategies
|
||||
5. ⏳ Backend endpoint implementation (NOT STARTED)
|
||||
6. ⏳ End-to-end testing (PENDING BACKEND)
|
||||
|
||||
---
|
||||
|
||||
## ✅ COMPLETED: Frontend Layer (100%)
|
||||
|
||||
### Files Created: 6 Components
|
||||
|
||||
#### 1. **enterpriseSeoApi.ts** (API Client Layer)
|
||||
- **Status:** ✅ COMPLETE
|
||||
- **Lines:** 650+
|
||||
- **Purpose:** Type-safe API client for all Phase 2A endpoints
|
||||
- **Exports:**
|
||||
- 15+ API methods
|
||||
- 20+ TypeScript interfaces
|
||||
- Error handling utilities
|
||||
- **Key Methods:**
|
||||
- `executeEnterpriseAudit()`
|
||||
- `analyzeGSCSearchPerformance()`
|
||||
- `getContentOpportunitiesReport()`
|
||||
- `generateAuditInsights()`
|
||||
- `generateGSCInsights()`
|
||||
- `getTrafficImprovementStrategies()`
|
||||
- **Dependencies:** Uses existing `apiClient` and `longRunningApiClient`
|
||||
- **Type Safety:** ✅ Full TypeScript strict mode support
|
||||
|
||||
#### 2. **llmInsightsGenerator.ts** (Services Layer)
|
||||
- **Status:** ✅ COMPLETE
|
||||
- **Lines:** 450+
|
||||
- **Purpose:** Convert analysis data to LLM-powered actionable insights
|
||||
- **Exports:**
|
||||
- 10+ specialized methods
|
||||
- Prompt builder templates
|
||||
- Singleton instance
|
||||
- **Key Methods:**
|
||||
- `generateEnterpriseAuditInsights()`
|
||||
- `generateGSCAnalysisInsights()`
|
||||
- `generateTrafficRoadmap()`
|
||||
- `generatePrioritizedRecommendations()`
|
||||
- `generateContentStrategy()`
|
||||
- `generateCompetitiveInsights()`
|
||||
- `generateKeywordExpansion()`
|
||||
- **LLM Integration:** 8+ specialized prompt templates
|
||||
- **Features:**
|
||||
- Priority scoring (1-10 scale)
|
||||
- Effort/impact assessment
|
||||
- Traffic gain calculations
|
||||
- Phased implementation strategies
|
||||
|
||||
#### 3. **EnterpriseAuditResults.tsx** (Results Component)
|
||||
- **Status:** ✅ COMPLETE
|
||||
- **Lines:** 800+
|
||||
- **Location:** `frontend/src/components/SEODashboard/components/`
|
||||
- **Features:**
|
||||
- Executive summary (overall score, traffic potential, time estimate)
|
||||
- Technical audit section (Core Web Vitals, page speed, mobile usability)
|
||||
- Keyword research table (opportunity scoring, volume, difficulty)
|
||||
- Competitive analysis matrix
|
||||
- Implementation roadmap (3 phases: quick wins, medium, long-term)
|
||||
- AI insights panel with filtering
|
||||
- Report download functionality
|
||||
- **Styling:** Glass-morphism effects, animations, responsive design
|
||||
- **Accessibility:** Proper semantic HTML, ARIA labels
|
||||
- **Performance:** Optimized renders, memoization where needed
|
||||
|
||||
#### 4. **GSCAnalysisResults.tsx** (Results Component)
|
||||
- **Status:** ✅ COMPLETE
|
||||
- **Lines:** 900+
|
||||
- **Location:** `frontend/src/components/SEODashboard/components/`
|
||||
- **Features:**
|
||||
- Performance overview cards (clicks, impressions, CTR, position)
|
||||
- 4-tab interface:
|
||||
- Tab 1: Performance Overview
|
||||
- Tab 2: Keywords Analysis
|
||||
- Tab 3: Content Opportunities
|
||||
- Tab 4: Technical Signals
|
||||
- Top keywords and pages tables
|
||||
- Content opportunities with traffic projections
|
||||
- Keywords needing attention
|
||||
- Traffic potential breakdown
|
||||
- Technical signals dashboard
|
||||
- **Data Visualization:** Charts, progress bars, trend indicators
|
||||
- **Responsive:** Grid-based layout for all screen sizes
|
||||
- **Interactivity:** Sortable tables, filterable lists
|
||||
|
||||
#### 5. **ActionableInsightsDisplay.tsx** (Insights Component)
|
||||
- **Status:** ✅ COMPLETE
|
||||
- **Lines:** 700+
|
||||
- **Location:** `frontend/src/components/SEODashboard/components/`
|
||||
- **Features:**
|
||||
- Priority-ranked insights (1-10 scale with color coding)
|
||||
- Impact vs Effort matrix visualization
|
||||
- Traffic gain estimates and ROI calculations
|
||||
- Step-by-step implementation guides (expandable accordion)
|
||||
- Recommended tools per insight
|
||||
- Filter controls (by impact, by effort, quick wins only)
|
||||
- Traffic improvement strategies section
|
||||
- Bookmark and share functionality
|
||||
- Save insights feature
|
||||
- **UX:** Smooth animations, clear visual hierarchy
|
||||
- **Accessibility:** Keyboard navigation support
|
||||
|
||||
#### 6. **SEOAnalysisController.tsx** (Orchestration Component)
|
||||
- **Status:** ✅ COMPLETE
|
||||
- **Lines:** 750+
|
||||
- **Location:** `frontend/src/components/SEODashboard/`
|
||||
- **Purpose:** Main workflow orchestrator
|
||||
- **Features:**
|
||||
- 5-step guided workflow with visual stepper
|
||||
- Step 1: Website Input (URL, competitors, keywords)
|
||||
- Step 2: Enterprise Audit (with progress tracking)
|
||||
- Step 3: GSC Analysis (simultaneous execution)
|
||||
- Step 4: Generate AI Insights (LLM integration)
|
||||
- Step 5: Review & Download (full report export)
|
||||
- Real-time progress indicators (0-100%)
|
||||
- Analysis configuration dialog
|
||||
- Report download (JSON format)
|
||||
- New analysis reset functionality
|
||||
- **State Management:** Local state with Zustand integration points
|
||||
- **Error Handling:** Comprehensive error displays
|
||||
- **Loading States:** Smooth transitions and progress feedback
|
||||
|
||||
### Dashboard Integration
|
||||
- **Status:** ✅ COMPLETE
|
||||
- **File Modified:** `SEODashboard.tsx`
|
||||
- **Changes:**
|
||||
- Added tab-based navigation system
|
||||
- Tab 1: "📊 Overview" - Existing functionality (preserved)
|
||||
- Tab 2: "🔍 Enterprise Analysis" - New Phase 2A tab
|
||||
- Seamless tab switching with state management
|
||||
- All existing features preserved
|
||||
|
||||
### Compilation Status
|
||||
- **Status:** ✅ FIXED
|
||||
- **Errors Fixed:** 14/14
|
||||
- 3 module path errors → Fixed import paths
|
||||
- 2 Material-UI errors → Fixed import sources
|
||||
- 9 TypeScript type errors → Added type annotations
|
||||
- **Documentation:** `COMPILATION_FIXES.md` created
|
||||
|
||||
---
|
||||
|
||||
## 🔴 PENDING: Backend Implementation (0%)
|
||||
|
||||
### Required Endpoints: 12 Total
|
||||
|
||||
#### Priority 1: Core Analysis Endpoints (3)
|
||||
1. **POST `/api/seo-tools/enterprise/complete-audit`**
|
||||
- Input: `EnterpriseAuditRequest` (website_url, competitors, keywords)
|
||||
- Output: `EnterpriseAuditResult` (comprehensive audit data)
|
||||
- Backend File: `services/seo_tools/enterprise_seo_service.py`
|
||||
- Status: 🔴 NOT IMPLEMENTED
|
||||
- Effort: HIGH (requires multiple analysis modules)
|
||||
|
||||
2. **POST `/api/seo-tools/gsc/analyze-search-performance`**
|
||||
- Input: `GSCAnalysisRequest` (site_url, date_range)
|
||||
- Output: `GSCAnalysisResult` (search performance data)
|
||||
- Backend File: `services/seo_tools/gsc_analyzer_service.py`
|
||||
- Status: 🔴 NOT IMPLEMENTED
|
||||
- Effort: MEDIUM (GSC API integration needed)
|
||||
|
||||
3. **POST `/api/seo-tools/gsc/content-opportunities`**
|
||||
- Input: `ContentOpportunitiesRequest` (site_url, analysis_type)
|
||||
- Output: `ContentOpportunitiesReport` (opportunity recommendations)
|
||||
- Backend File: `services/seo_tools/gsc_analyzer_service.py`
|
||||
- Status: 🔴 NOT IMPLEMENTED
|
||||
- Effort: MEDIUM
|
||||
|
||||
#### Priority 2: LLM Insight Endpoints (8)
|
||||
4. **POST `/api/seo-tools/llm/generate-audit-insights`**
|
||||
- Converts audit results to actionable insights
|
||||
- Status: 🔴 NOT IMPLEMENTED
|
||||
|
||||
5. **POST `/api/seo-tools/llm/generate-gsc-insights`**
|
||||
- Converts GSC data to search-focused insights
|
||||
- Status: 🔴 NOT IMPLEMENTED
|
||||
|
||||
6. **POST `/api/seo-tools/llm/generate-content-strategy`**
|
||||
- Generates content gap analysis and strategy
|
||||
- Status: 🔴 NOT IMPLEMENTED
|
||||
|
||||
7. **POST `/api/seo-tools/llm/generate-traffic-roadmap`**
|
||||
- Creates phased traffic improvement plan
|
||||
- Status: 🔴 NOT IMPLEMENTED
|
||||
|
||||
8. **POST `/api/seo-tools/llm/prioritized-recommendations`**
|
||||
- Ranks all improvements by impact vs effort
|
||||
- Status: 🔴 NOT IMPLEMENTED
|
||||
|
||||
9. **POST `/api/seo-tools/llm/quick-wins`**
|
||||
- Identifies quick wins (< 1 week implementation)
|
||||
- Status: 🔴 NOT IMPLEMENTED
|
||||
|
||||
10. **POST `/api/seo-tools/llm/competitive-insights`**
|
||||
- Competitive positioning analysis
|
||||
- Status: 🔴 NOT IMPLEMENTED
|
||||
|
||||
11. **POST `/api/seo-tools/llm/keyword-expansion`**
|
||||
- Keyword research and expansion
|
||||
- Status: 🔴 NOT IMPLEMENTED
|
||||
|
||||
#### Priority 3: Support Endpoints (1)
|
||||
12. **GET `/api/seo-tools/enterprise/health`**
|
||||
- Health check for enterprise service
|
||||
- Status: 🔴 NOT IMPLEMENTED
|
||||
|
||||
### Backend Architecture Required
|
||||
```
backend/
├── services/
│   └── seo_tools/
│       ├── enterprise_seo_service.py (NEW)
│       ├── gsc_analyzer_service.py (NEW)
│       ├── llm_insights_service.py (NEW)
│       └── ...
├── routers/
│   ├── seo_tools.py (EXISTING - needs updates)
│   └── ...
├── models/
│   ├── seo_models.py (EXISTING - needs new types)
│   └── ...
└── api/
    └── ... (existing structure)
```
|
||||
|
||||
### Backend Dependencies
|
||||
- Google Search Console API (authentication ready ✅)
|
||||
- LLM integration (Claude/GPT API)
|
||||
- SEO analysis libraries (SEMrush API, Moz API, etc.)
|
||||
- Database for caching results
|
||||
- Authentication middleware (Clerk - ready ✅)
|
||||
|
||||
---
|
||||
|
||||
## 🟡 TESTING STATUS (Ready for Backend)
|
||||
|
||||
### Frontend Testing Readiness
|
||||
- ✅ Component structure complete
|
||||
- ✅ TypeScript types validated
|
||||
- ✅ UI rendering verified
|
||||
- ✅ Navigation works
|
||||
- ⏳ Functional testing (pending mock data)
|
||||
- ⏳ Integration testing (pending backend)
|
||||
- ⏳ E2E testing (pending backend)
|
||||
|
||||
### Test Data Mock Available
|
||||
```typescript
// Mock data structure ready in llmInsightsGenerator.ts
const mockEnterpriseAuditResult: EnterpriseAuditResult = {
  website_url: 'https://example.com',
  audit_date: '2026-05-24',
  executive_summary: { /* ... */ },
  // ... 15+ fields
};
```
|
||||
|
||||
---
|
||||
|
||||
## 📈 Completion Metrics
|
||||
|
||||
### Frontend Completion: 100%
|
||||
| Component | Status | Lines | Features |
|-----------|--------|-------|----------|
| API Client | ✅ COMPLETE | 650+ | 15+ methods, 20+ types |
| LLM Service | ✅ COMPLETE | 450+ | 10+ methods, 8 prompts |
| Audit Results | ✅ COMPLETE | 800+ | 8 sections, filtering |
| GSC Results | ✅ COMPLETE | 900+ | 4 tabs, tables, charts |
| Insights Display | ✅ COMPLETE | 700+ | Ranking, filtering, guides |
| Controller | ✅ COMPLETE | 750+ | 5-step workflow, stepper |
| Dashboard | ✅ COMPLETE | Modified | Tab integration |
|
||||
|
||||
**Total Frontend Code:** ~4,850 lines | **Status:** ✅ PRODUCTION READY
|
||||
|
||||
### Backend Completion: 0%
|
||||
| Endpoint | Priority | Status | Effort |
|----------|----------|--------|--------|
| Enterprise Audit | P1 | 🔴 0% | HIGH |
| GSC Analysis | P1 | 🔴 0% | MEDIUM |
| Content Opportunities | P1 | 🔴 0% | MEDIUM |
| LLM Insights (8x) | P2 | 🔴 0% | HIGH |
| Health Check | P3 | 🔴 0% | LOW |
|
||||
|
||||
**Total Backend Work:** ~3,000+ lines needed | **Status:** 🔴 NOT STARTED
|
||||
|
||||
---
|
||||
|
||||
## 🔄 Data Flow Architecture
|
||||
|
||||
```
User Input (Website URL)
    ↓
SEOAnalysisController (Frontend)
    ├─→ enterpriseSeoAPI.executeEnterpriseAudit()
    │   ├─→ POST /api/seo-tools/enterprise/complete-audit
    │   └─→ Returns EnterpriseAuditResult
    │
    ├─→ enterpriseSeoAPI.analyzeGSCSearchPerformance()
    │   ├─→ POST /api/seo-tools/gsc/analyze-search-performance
    │   └─→ Returns GSCAnalysisResult
    │
    ├─→ EnterpriseAuditResults (Display)
    │
    ├─→ GSCAnalysisResults (Display)
    │
    ├─→ llmInsightsGenerator.generateEnterpriseAuditInsights()
    │   ├─→ POST /api/seo-tools/llm/generate-audit-insights
    │   └─→ Returns ActionableInsight[]
    │
    └─→ ActionableInsightsDisplay (Final Display)
```
|
||||
|
||||
---
|
||||
|
||||
## 📋 Next Implementation Phases
|
||||
|
||||
### Phase 2A.1: Backend Core Endpoints (IMMEDIATE)
|
||||
**Timeline:** 1-2 weeks
|
||||
**Priority:** CRITICAL
|
||||
**Effort:** HIGH
|
||||
|
||||
**Tasks:**
|
||||
1. Create `enterprise_seo_service.py`
|
||||
- Technical SEO analysis (Core Web Vitals, speed, mobile)
|
||||
- On-page analysis (meta tags, headings, content)
|
||||
- Keyword research (volume, difficulty, ranking potential)
|
||||
- Competitive benchmarking
|
||||
- Implementation roadmap generation
|
||||
|
||||
2. Create `gsc_analyzer_service.py`
|
||||
- Google Search Console API integration
|
||||
- Search performance metrics extraction
|
||||
- Keyword opportunity identification
|
||||
- Content gap analysis
|
||||
|
||||
3. Update `routers/seo_tools.py`
|
||||
- Add 3 core endpoint routes
|
||||
- Add request/response validation
|
||||
- Add error handling
|
||||
|
||||
**Deliverables:**
|
||||
- 3 functional endpoints
|
||||
- Request/response validation
|
||||
- Error handling
|
||||
- Database caching (optional but recommended)
|
||||
|
||||
---
|
||||
|
||||
### Phase 2A.2: LLM Integration Endpoints (CRITICAL)
|
||||
**Timeline:** 1-2 weeks
|
||||
**Priority:** CRITICAL
|
||||
**Effort:** HIGH
|
||||
|
||||
**Tasks:**
|
||||
1. Create `llm_insights_service.py`
|
||||
- LLM prompt templates for each insight type
|
||||
- API integration with Claude/GPT
|
||||
- Insight generation logic
|
||||
- Caching for performance
|
||||
|
||||
2. Implement 8 LLM endpoints
|
||||
- Each endpoint accepts analysis result
|
||||
- Calls LLM with specialized prompt
|
||||
- Returns prioritized insights
|
||||
- Includes traffic projections
|
||||
|
||||
3. Prompt optimization
|
||||
- Test with real SEO data
|
||||
- Refine for accuracy
|
||||
- Validate traffic projections
|
||||
|
||||
**Deliverables:**
|
||||
- 8 functional LLM endpoints
|
||||
- Optimized prompts
|
||||
- Caching layer
|
||||
- Performance benchmarks
|
||||
|
||||
---
|
||||
|
||||
### Phase 2A.3: Database & Caching (OPTIMIZATION)
|
||||
**Timeline:** 1 week
|
||||
**Priority:** HIGH (for production)
|
||||
**Effort:** MEDIUM
|
||||
|
||||
**Tasks:**
|
||||
1. Design caching strategy
|
||||
- Cache audit results (24-48 hours)
|
||||
- Cache GSC data (12-24 hours)
|
||||
- Cache LLM insights (48 hours)
|
||||
|
||||
2. Implement caching layer
|
||||
- Redis integration
|
||||
- Cache invalidation logic
|
||||
- TTL management
|
||||
|
||||
3. Database storage
|
||||
- Store analysis history
|
||||
- Track user preferences
|
||||
- Enable result comparison
|
||||
|
||||
**Benefit:** 10x performance improvement for repeated analyses
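
The caching tasks above can be illustrated with a small in-memory TTL cache. This is only a sketch and not the planned Redis integration: the class name `TTLCache`, the category keys, and the exact TTL values (picked from within the ranges listed above) are assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

# Hypothetical TTLs in seconds, chosen from the ranges in the task list.
CACHE_TTLS: Dict[str, int] = {
    "audit": 24 * 3600,         # audit results: 24-48 hours
    "gsc": 12 * 3600,           # GSC data: 12-24 hours
    "llm_insights": 48 * 3600,  # LLM insights: 48 hours
}


class TTLCache:
    """In-memory stand-in for a cache layer; entries expire per category."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[Tuple[str, str], Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def set(self, category: str, key: str, value: Any) -> None:
        expires_at = self._clock() + CACHE_TTLS[category]
        self._store[(category, key)] = (expires_at, value)

    def get(self, category: str, key: str) -> Optional[Any]:
        entry = self._store.get((category, key))
        if entry is None or entry[0] <= self._clock():
            # Missing or expired: drop any stale entry and count a miss.
            self._store.pop((category, key), None)
            self.misses += 1
            return None
        self.hits += 1
        return entry[1]

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Tracking hits and misses alongside the store makes it straightforward to check a cache hit rate target once real traffic is available.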
|
||||
|
||||
---
|
||||
|
||||
### Phase 2A.4: Testing & Validation (COMPREHENSIVE)
|
||||
**Timeline:** 1-2 weeks
|
||||
**Priority:** HIGH
|
||||
**Effort:** MEDIUM
|
||||
|
||||
**Test Coverage:**
|
||||
1. Unit tests (50+ tests)
|
||||
- Each service method
|
||||
- Error scenarios
|
||||
- Data validation
|
||||
|
||||
2. Integration tests (20+ tests)
|
||||
- End-to-end workflows
|
||||
- API interactions
|
||||
- LLM responses
|
||||
|
||||
3. E2E tests (10+ tests)
|
||||
- Frontend + Backend
|
||||
- Real user workflows
|
||||
- Performance benchmarks
|
||||
|
||||
4. Manual testing
|
||||
- Real websites (10+ test sites)
|
||||
- GSC validation
|
||||
- Insight accuracy
|
||||
- UI/UX verification
|
||||
|
||||
**Deliverables:**
|
||||
- Test suite (80+ tests)
|
||||
- Coverage report (80%+ coverage)
|
||||
- Performance benchmarks
|
||||
- Bug fix list
|
||||
|
||||
---
|
||||
|
||||
### Phase 2A.5: Documentation & Deployment (FINAL)
|
||||
**Timeline:** 1 week
|
||||
**Priority:** MEDIUM
|
||||
**Effort:** LOW
|
||||
|
||||
**Tasks:**
|
||||
1. API Documentation
|
||||
- Endpoint specs
|
||||
- Request/response examples
|
||||
- Error codes
|
||||
- Rate limiting
|
||||
|
||||
2. User Documentation
|
||||
- Feature guide
|
||||
- Tutorial videos
|
||||
- FAQs
|
||||
- Troubleshooting
|
||||
|
||||
3. Developer Documentation
|
||||
- Architecture overview
|
||||
- Setup guide
|
||||
- Contributing guidelines
|
||||
- Maintenance procedures
|
||||
|
||||
4. Deployment
|
||||
- Staging environment
|
||||
- Production deployment
|
||||
- Monitoring setup
|
||||
- Rollback procedures
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Success Criteria
|
||||
|
||||
### Phase 2A.1 (Backend Core)
|
||||
- ✅ 3 endpoints fully functional
|
||||
- ✅ Real enterprise audits working
|
||||
- ✅ GSC data flowing to frontend
|
||||
- ✅ All 14 frontend compilation errors resolved
|
||||
|
||||
### Phase 2A.2 (LLM Integration)
|
||||
- ✅ 8 LLM endpoints working
|
||||
- ✅ Insights generated with traffic projections
|
||||
- ✅ Priority scoring accurate (1-10 scale)
|
||||
- ✅ Effort/impact assessment working
|
||||
|
||||
### Phase 2A.3 (Database/Caching)
|
||||
- ✅ Analysis history available
|
||||
- ✅ Cache hit rate > 70%
|
||||
- ✅ Query response time < 500ms
|
||||
|
||||
### Phase 2A.4 (Testing)
|
||||
- ✅ Test coverage > 80%
|
||||
- ✅ All tests passing
|
||||
- ✅ Performance benchmarks met
|
||||
- ✅ No critical bugs
|
||||
|
||||
### Phase 2A.5 (Documentation)
|
||||
- ✅ All features documented
|
||||
- ✅ Developer guide complete
|
||||
- ✅ User guide complete
|
||||
- ✅ Ready for production
|
||||
|
||||
---
|
||||
|
||||
## 🚀 Estimated Timeline
|
||||
|
||||
| Phase | Tasks | Timeline | Status |
|-------|-------|----------|--------|
| 2A.0 Frontend | 6 components | ✅ DONE | COMPLETE |
| 2A.1 Backend Core | 3 endpoints | 1-2 weeks | ⏳ READY |
| 2A.2 LLM Integration | 8 endpoints | 1-2 weeks | ⏳ BLOCKED |
| 2A.3 DB/Caching | Optimization | 1 week | ⏳ BLOCKED |
| 2A.4 Testing | Validation | 1-2 weeks | ⏳ BLOCKED |
| 2A.5 Deployment | Release | 1 week | ⏳ BLOCKED |
|
||||
|
||||
**Total Estimated:** 5-8 weeks
|
||||
**Current Progress:** 20% (frontend only)
|
||||
**Blocking Issue:** Backend endpoints not implemented
|
||||
|
||||
---
|
||||
|
||||
## ⚠️ Critical Blockers
|
||||
|
||||
### Immediate Blockers
|
||||
1. **Backend endpoints not implemented** - Blocks all functionality testing
|
||||
2. **No mock data** - Prevents UI testing with real-like data
|
||||
3. **No LLM service setup** - Blocks insight generation
|
||||
4. **GSC authentication** - Needs verification in production
|
||||
|
||||
### Recommended Next Action
|
||||
**Start Phase 2A.1 immediately:** Implement the 3 core backend endpoints to unblock testing and validation.
|
||||
|
||||
---
|
||||
|
||||
## 📊 Summary Dashboard
|
||||
|
||||
```
|
||||
FRONTEND IMPLEMENTATION
|
||||
✅ API Client: 100% (650 lines)
|
||||
✅ LLM Service: 100% (450 lines)
|
||||
✅ Components: 100% (3,850 lines)
|
||||
✅ Integration: 100% (Complete)
|
||||
✅ Compilation: 100% (14 errors fixed)
|
||||
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||
Total Frontend: ✅ 100% COMPLETE
|
||||
|
||||
BACKEND IMPLEMENTATION
|
||||
🔴 Core Endpoints: 0% (Not started)
|
||||
🔴 LLM Endpoints: 0% (Not started)
|
||||
🔴 Database/Caching: 0% (Not started)
|
||||
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||
Total Backend: 🔴 0% NOT STARTED
|
||||
|
||||
OVERALL PROJECT STATUS: 🟡 20% COMPLETE
|
||||
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||
Blocking: Backend Implementation
|
||||
Ready: Frontend Testing (awaiting backend)
|
||||
Next: Start Phase 2A.1 (Backend Core Endpoints)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 📞 Action Items
|
||||
|
||||
### For Frontend
|
||||
- [ ] Run `npm run build` to verify all errors fixed
|
||||
- [ ] Run `npm start` to launch development server
|
||||
- [ ] Test tab navigation (Overview ↔ Enterprise Analysis)
|
||||
- [ ] Verify component rendering with mock data
|
||||
- [ ] Test responsive design on mobile/tablet
|
||||
|
||||
### For Backend (IMMEDIATE)
|
||||
- [ ] Create `services/seo_tools/enterprise_seo_service.py`
|
||||
- [ ] Create `services/seo_tools/gsc_analyzer_service.py`
|
||||
- [ ] Update `routers/seo_tools.py` with 3 new endpoints
|
||||
- [ ] Implement request/response validation
|
||||
- [ ] Add comprehensive error handling
|
||||
- [ ] Test with real websites and GSC data
|
||||
|
||||
### For DevOps
|
||||
- [ ] Set up Redis caching layer
|
||||
- [ ] Configure GSC API credentials
|
||||
- [ ] Set up LLM API integration (Claude/GPT)
|
||||
- [ ] Configure monitoring and logging
|
||||
- [ ] Plan staging environment
|
||||
|
||||
---
|
||||
|
||||
**Generated:** May 24, 2026
|
||||
**Next Review:** After Phase 2A.1 Backend Implementation
|
||||
**Questions?** Check `PHASE2A_INTEGRATION_GUIDE.md` or `COMPILATION_FIXES.md`
|
||||
@@ -1,667 +0,0 @@
|
||||
# Phase 2A Roadmap: Next Implementation Phases
|
||||
|
||||
**Current Status:** Frontend 100% Complete → Backend 0% Started → Ready for Phase 2A.1
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Big Picture: What's Done vs What's Needed
|
||||
|
||||
### ✅ COMPLETED (Frontend - 100%)
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ USER INTERFACE LAYER (Complete & Ready) │
|
||||
│ │
|
||||
│ SEODashboard Tab: "🔍 Enterprise Analysis" │
|
||||
│ ↓ │
|
||||
│ SEOAnalysisController (5-Step Workflow) │
|
||||
│ ├─ Step 1: Website Input Form │
|
||||
│ ├─ Step 2: Enterprise Audit Display │
|
||||
│ ├─ Step 3: GSC Analysis Display │
|
||||
│ ├─ Step 4: AI Insights Display │
|
||||
│ └─ Step 5: Review & Download │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
↓
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ SERVICE LAYER (Complete & Ready) │
|
||||
│ │
|
||||
│ ├─ enterpriseSeoApi.ts (API Client) │
|
||||
│ │ ├─ executeEnterpriseAudit() │
|
||||
│ │ ├─ analyzeGSCSearchPerformance() │
|
||||
│ │ ├─ getContentOpportunitiesReport() │
|
||||
│ │ └─ ... 12 more methods │
|
||||
│ │ │
|
||||
│ └─ llmInsightsGenerator.ts (Insights Service) │
|
||||
│ ├─ generateEnterpriseAuditInsights() │
|
||||
│ ├─ generateGSCAnalysisInsights() │
|
||||
│ ├─ generateTrafficRoadmap() │
|
||||
│ └─ ... 7 more insight methods │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
↓
|
||||
🔴 BLOCKED HERE 🔴
|
||||
(Backend Missing)
|
||||
↓
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ API ENDPOINTS (0% - Need Implementation) │
|
||||
│ │
|
||||
│ ❌ POST /api/seo-tools/enterprise/complete-audit │
|
||||
│ ❌ POST /api/seo-tools/gsc/analyze-search-performance │
|
||||
│ ❌ POST /api/seo-tools/gsc/content-opportunities │
|
||||
│ ❌ POST /api/seo-tools/llm/generate-audit-insights │
|
||||
│ ❌ ... 8 more endpoints (7 LLM, 1 health check) │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🔴 BLOCKER: Backend Not Implemented
|
||||
|
||||
### Why Testing Can't Proceed
|
||||
- ❌ No endpoints to call from frontend
|
||||
- ❌ No data flowing to UI components
|
||||
- ❌ Can't test end-to-end workflows
|
||||
- ❌ Can't validate LLM insights
|
||||
- ❌ Can't generate real reports
|
||||
|
||||
### Immediate Impact
|
||||
```
|
||||
Frontend Ready ✅ → Can't Test → Can't Deploy ❌
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 📋 Phase 2A.1: Backend Core Endpoints (IMMEDIATE NEXT STEP)
|
||||
|
||||
### What Needs to Be Built
|
||||
|
||||
#### Endpoint 1: Enterprise Audit
|
||||
```
POST /api/seo-tools/enterprise/complete-audit

REQUEST:
{
  website_url: "https://example.com",
  competitors?: ["https://competitor1.com"],
  keywords?: ["target keyword 1"],
  analysis_type: "complete" | "quick"
}

RESPONSE:
{
  executive_summary: { score, traffic_potential, time_to_implement },
  technical_audit: { core_web_vitals, mobile_usability, page_speed },
  keyword_research: [ { keyword, volume, difficulty, current_ranking } ],
  competitive_analysis: { comparison, gaps, opportunities },
  implementation_roadmap: [ { phase, tasks, timeline } ],
  ... 15+ more fields
}
```
|
||||
|
||||
**Backend Requirements:**
|
||||
- SEO analysis library (e.g., SEMrush API, Moz API, or self-built)
|
||||
- Technical audit tools (Core Web Vitals, page speed analysis)
|
||||
- Keyword research integration
|
||||
- Competitive analysis logic
|
||||
- Data aggregation and formatting
|
||||
|
||||
**Estimated Effort:** 400-600 lines of code
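
As a rough illustration of the request contract above, the sketch below models the request body and rejects malformed input before any audit work starts. It uses only the standard library; the validation rules (absolute http/https URLs, the two allowed `analysis_type` values) are read off the example, and the real backend may well use a different validation layer.

```python
from dataclasses import dataclass, field
from typing import List
from urllib.parse import urlparse

ANALYSIS_TYPES = ("complete", "quick")


@dataclass
class EnterpriseAuditRequest:
    website_url: str
    analysis_type: str
    competitors: List[str] = field(default_factory=list)  # optional in the spec
    keywords: List[str] = field(default_factory=list)     # optional in the spec

    def __post_init__(self) -> None:
        # Every URL (site and competitors) must be an absolute http(s) URL.
        for url in [self.website_url, *self.competitors]:
            parsed = urlparse(url)
            if parsed.scheme not in ("http", "https") or not parsed.netloc:
                raise ValueError(f"not an absolute http(s) URL: {url!r}")
        if self.analysis_type not in ANALYSIS_TYPES:
            raise ValueError(f"analysis_type must be one of {ANALYSIS_TYPES}")
```

Validating in the request type keeps the service method free of input checks, so it can focus on the audit logic itself.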
|
||||
|
||||
---
|
||||
|
||||
#### Endpoint 2: GSC Analysis
|
||||
```
POST /api/seo-tools/gsc/analyze-search-performance

REQUEST:
{
  site_url: "https://example.com",
  date_range: 90, // days
  include_competitors?: true
}

RESPONSE:
{
  performance_overview: { clicks, impressions, ctr, avg_position },
  top_keywords: [ { keyword, clicks, impressions, ctr, position } ],
  page_performance: [ { page_url, clicks, impressions, ctr, position } ],
  keyword_analysis: {
    opportunities: [...],
    declining_keywords: [...],
    needs_attention: [...]
  },
  content_opportunities: [ { keyword, traffic_gain, priority } ],
  technical_signals: { issues, fixes, score },
  ... 10+ more fields
}
```
|
||||
|
||||
**Backend Requirements:**
|
||||
- Google Search Console API integration
|
||||
- GSC authentication (already have credentials ✅)
|
||||
- Data extraction and normalization
|
||||
- Trend analysis
|
||||
- Opportunity identification logic
|
||||
|
||||
**Estimated Effort:** 300-400 lines of code
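
The `performance_overview` object in the response can be derived from per-keyword rows. The sketch below shows one way to do that. The `KeywordRow` type and the choice of an impression-weighted average position are assumptions for illustration, not something the spec above prescribes.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class KeywordRow:
    keyword: str
    clicks: int
    impressions: int
    position: float  # average position reported for this keyword


def performance_overview(rows: Iterable[KeywordRow]) -> Dict[str, float]:
    rows = list(rows)
    clicks = sum(r.clicks for r in rows)
    impressions = sum(r.impressions for r in rows)
    # CTR is clicks divided by impressions; guard against an empty report.
    ctr = clicks / impressions if impressions else 0.0
    # Weight each keyword's position by its impressions so that rarely
    # shown keywords do not dominate the average.
    avg_position = (
        sum(r.position * r.impressions for r in rows) / impressions
        if impressions
        else 0.0
    )
    return {
        "clicks": clicks,
        "impressions": impressions,
        "ctr": ctr,
        "avg_position": avg_position,
    }
```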
|
||||
|
||||
---
|
||||
|
||||
#### Endpoint 3: Content Opportunities
|
||||
```
POST /api/seo-tools/gsc/content-opportunities

REQUEST:
{
  site_url: "https://example.com",
  analysis_type: "gap_analysis" | "expansion" | "optimization"
}

RESPONSE:
{
  opportunities: [
    {
      keyword: "target keyword",
      current_position: 15,
      traffic_potential: 500,
      difficulty: 45,
      recommendation: "Create new article targeting this keyword",
      priority: "high"
    }
  ],
  total_traffic_potential: 15000,
  quick_wins: [...],
  competitive_gaps: [...]
}
```
|
||||
|
||||
**Backend Requirements:**
|
||||
- Keyword gap analysis logic
|
||||
- Traffic potential calculation
|
||||
- Difficulty scoring
|
||||
- Competitive benchmarking
|
||||
|
||||
**Estimated Effort:** 250-350 lines of code
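
To make the response shape above more concrete, here is a minimal sketch of how opportunities might be ranked and summarized. All thresholds are hypothetical placeholders (the document does not define how `priority` or `quick_wins` are computed), and `difficulty` is assumed to be on a 0-100 scale based on the example value.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Opportunity:
    keyword: str
    current_position: int
    traffic_potential: int
    difficulty: int  # assumed 0-100 scale


def priority_for(opp: Opportunity) -> str:
    # Hypothetical rule: high traffic with moderate difficulty ranks first.
    if opp.traffic_potential >= 500 and opp.difficulty <= 50:
        return "high"
    if opp.traffic_potential >= 100:
        return "medium"
    return "low"


def build_report(opps: List[Opportunity]) -> Dict[str, object]:
    ranked = sorted(opps, key=lambda o: o.traffic_potential, reverse=True)
    # Hypothetical "quick win" rule: already on page two (positions 11-20)
    # and not too difficult to move up.
    quick_wins = [
        o.keyword
        for o in ranked
        if 11 <= o.current_position <= 20 and o.difficulty <= 50
    ]
    return {
        "opportunities": [
            {
                "keyword": o.keyword,
                "traffic_potential": o.traffic_potential,
                "priority": priority_for(o),
            }
            for o in ranked
        ],
        "total_traffic_potential": sum(o.traffic_potential for o in ranked),
        "quick_wins": quick_wins,
    }
```

Keeping the scoring rules in small pure functions makes them easy to unit test and to retune once real GSC data is available.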
|
||||
|
||||
---
|
||||
|
||||
### Phase 2A.1 Implementation Steps
|
||||
|
||||
#### Step 1: Setup Service Files (1 day)
|
||||
```python
# backend/services/seo_tools/enterprise_seo_service.py
# Assumption: the request/result types are added to models/seo_models.py;
# adjust the import path to match the actual package layout.
from models.seo_models import (
    ContentOpportunitiesReport,
    ContentOpportunitiesRequest,
    EnterpriseAuditRequest,
    EnterpriseAuditResult,
    GSCAnalysisRequest,
    GSCAnalysisResult,
    QuickAuditRequest,
)


class EnterpriseSEOService:
    def execute_complete_audit(self, request: EnterpriseAuditRequest) -> EnterpriseAuditResult:
        # TODO: implement audit logic
        raise NotImplementedError

    def execute_quick_audit(self, request: QuickAuditRequest) -> EnterpriseAuditResult:
        # TODO: implement quick audit
        raise NotImplementedError


# backend/services/seo_tools/gsc_analyzer_service.py
class GSCAnalyzerService:
    def analyze_search_performance(self, request: GSCAnalysisRequest) -> GSCAnalysisResult:
        # TODO: implement GSC analysis
        raise NotImplementedError

    def get_content_opportunities(self, request: ContentOpportunitiesRequest) -> ContentOpportunitiesReport:
        # TODO: implement opportunity analysis
        raise NotImplementedError
```
|
||||
|
||||
#### Step 2: Add Routes (1 day)
|
||||
```python
# backend/routers/seo_tools.py - Add these routes:
# Assumption: `router` is the router object that already exists in this
# file, and the service classes are imported from services/seo_tools/.
@router.post('/enterprise/complete-audit')
async def complete_enterprise_audit(request: EnterpriseAuditRequest):
    # Delegate to EnterpriseSEOService
    return EnterpriseSEOService().execute_complete_audit(request)


@router.post('/gsc/analyze-search-performance')
async def analyze_gsc_performance(request: GSCAnalysisRequest):
    # Delegate to GSCAnalyzerService
    return GSCAnalyzerService().analyze_search_performance(request)


@router.post('/gsc/content-opportunities')
async def get_content_opportunities(request: ContentOpportunitiesRequest):
    # Delegate to GSCAnalyzerService
    return GSCAnalyzerService().get_content_opportunities(request)
```
|
||||
|
||||
#### Step 3: Implement Business Logic (2-3 days)
|
||||
- Technical SEO analysis
|
||||
- GSC data extraction
|
||||
- Opportunity identification
|
||||
- Data formatting
|
||||
|
||||
#### Step 4: Testing (1-2 days)
|
||||
- Unit tests for each method
|
||||
- Integration tests
|
||||
- Real website testing
|
||||
- Error handling
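
As a rough idea of what the unit and error-handling tests could look like, here is a sketch using only the standard library. `FakeAuditService` is a hypothetical stand-in for the real service, which does not exist yet; only the test structure is the point.

```python
import unittest
from unittest.mock import AsyncMock


class FakeAuditService:
    """Stand-in for EnterpriseSEOService; only the call shape matters here."""

    async def _technical_audit(self, website_url: str) -> dict:
        raise NotImplementedError

    async def execute_complete_audit(self, website_url: str) -> dict:
        technical = await self._technical_audit(website_url)
        return {"technical_audit": technical}


class CompleteAuditTests(unittest.IsolatedAsyncioTestCase):
    async def test_returns_technical_section(self):
        service = FakeAuditService()
        # Replace the slow network-bound step with a canned result.
        service._technical_audit = AsyncMock(return_value={"score": 90})
        result = await service.execute_complete_audit("https://example.com")
        self.assertEqual(result["technical_audit"], {"score": 90})
        service._technical_audit.assert_awaited_once_with("https://example.com")

    async def test_propagates_errors(self):
        service = FakeAuditService()
        service._technical_audit = AsyncMock(side_effect=TimeoutError("site unreachable"))
        with self.assertRaises(TimeoutError):
            await service.execute_complete_audit("https://example.com")
```

Mocking the private analysis steps keeps unit tests fast; the "real website testing" item would then be covered by separate integration runs.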
|
||||
|
||||
#### Step 5: Documentation (1 day)
|
||||
- Endpoint documentation
|
||||
- API specs
|
||||
- Setup instructions
|
||||
|
||||
---
|
||||
|
||||
## 📋 Phase 2A.2: LLM Integration (FOLLOWS PHASE 2A.1)
|
||||
|
||||
### Once Backend Endpoints Working...
|
||||
|
||||
#### Create LLM Service
|
||||
```python
# backend/services/seo_tools/llm_insights_service.py
from typing import List


class LLMInsightsService:
    def generate_audit_insights(self, audit_result: EnterpriseAuditResult) -> List[ActionableInsight]:
        prompt = self.build_audit_insight_prompt(audit_result)
        response = llm_api.call(prompt)  # placeholder for the chosen LLM client (Claude/GPT)
        return parse_insights(response)  # placeholder parser: raw response -> insights

    def generate_gsc_insights(self, gsc_result: GSCAnalysisResult) -> List[ActionableInsight]:
        # Similar pattern
        pass

    # 6 more methods for different insight types
```
|
||||
|
||||
#### Add LLM Endpoints (8 routes)
|
||||
1. `/api/seo-tools/llm/generate-audit-insights`
|
||||
2. `/api/seo-tools/llm/generate-gsc-insights`
|
||||
3. `/api/seo-tools/llm/generate-content-strategy`
|
||||
4. `/api/seo-tools/llm/generate-traffic-roadmap`
|
||||
5. `/api/seo-tools/llm/prioritized-recommendations`
|
||||
6. `/api/seo-tools/llm/quick-wins`
|
||||
7. `/api/seo-tools/llm/competitive-insights`
|
||||
8. `/api/seo-tools/llm/keyword-expansion`
|
||||
|
||||
#### LLM Prompt Templates (Ready in Frontend)
|
||||
The `llmInsightsGenerator.ts` has all 8 prompt templates. Backend just needs to:
|
||||
1. Accept the prompt from frontend
|
||||
2. Call LLM API (Claude/GPT)
|
||||
3. Parse response
|
||||
4. Return formatted insights
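
Steps 3 and 4 above (parse the response, return formatted insights) might look something like the sketch below. It assumes the prompt instructs the model to reply with a JSON array; the `ActionableInsight` fields are borrowed from the `insights` table proposed later in this document and may not match the final frontend type.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class ActionableInsight:
    insight_text: str
    priority: int  # 1-10
    traffic_gain: int
    effort_level: str


def parse_insights(raw_response: str) -> List[ActionableInsight]:
    """Parse an LLM reply that is assumed to be a JSON array of insight objects."""
    try:
        items = json.loads(raw_response)
    except json.JSONDecodeError as exc:
        raise ValueError("LLM response was not valid JSON") from exc
    if not isinstance(items, list):
        raise ValueError("Expected a JSON array of insights")

    insights = []
    for item in items:
        # Clamp priority into the 1-10 range rather than trusting the model output.
        priority = max(1, min(10, int(item.get("priority", 1))))
        insights.append(
            ActionableInsight(
                insight_text=str(item.get("insight_text", "")).strip(),
                priority=priority,
                traffic_gain=int(item.get("traffic_gain", 0)),
                effort_level=str(item.get("effort_level", "unknown")),
            )
        )
    # Highest priority first, so the frontend can render the list as-is.
    return sorted(insights, key=lambda i: i.priority, reverse=True)
```

Raising `ValueError` on malformed output lets the route decide whether to retry the LLM call or return an error to the frontend.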
|
||||
|
||||
---
|
||||
|
||||
## 🚀 Recommended Implementation Sequence
|
||||
|
||||
### Week 1: Phase 2A.1 Backend Core (CRITICAL)
|
||||
**Goal:** Get 3 core endpoints working
|
||||
|
||||
```
|
||||
Day 1-2: Setup
|
||||
├─ Create enterprise_seo_service.py
|
||||
├─ Create gsc_analyzer_service.py
|
||||
└─ Add routes to seo_tools.py
|
||||
|
||||
Day 3-4: Implementation
|
||||
├─ Implement audit analysis logic
|
||||
├─ Integrate GSC API
|
||||
└─ Add error handling
|
||||
|
||||
Day 5: Testing
|
||||
├─ Unit tests
|
||||
├─ Integration tests
|
||||
└─ Manual testing with real websites
|
||||
```
|
||||
|
||||
**Deliverable:** 3 functional endpoints + tests
|
||||
|
||||
---
|
||||
|
||||
### Week 2: Phase 2A.2 LLM Integration (CRITICAL)
|
||||
**Goal:** Get LLM insights working
|
||||
|
||||
```
|
||||
Day 1-2: Setup
|
||||
├─ Create llm_insights_service.py
|
||||
├─ Setup LLM API (Claude/GPT)
|
||||
└─ Add 8 LLM routes
|
||||
|
||||
Day 3-4: Implementation
|
||||
├─ Implement insight generation
|
||||
├─ Integrate LLM prompts
|
||||
└─ Add caching for performance
|
||||
|
||||
Day 5: Testing
|
||||
├─ Test insight accuracy
|
||||
├─ Validate traffic projections
|
||||
└─ Performance optimization
|
||||
```
|
||||
|
||||
**Deliverable:** 8 functional LLM endpoints + tests
|
||||
|
||||
---
|
||||
|
||||
### Week 3: Phase 2A.3 Optimization (RECOMMENDED)
|
||||
**Goal:** Add caching and database storage
|
||||
|
||||
```
|
||||
Day 1-2: Caching Layer
|
||||
├─ Setup Redis
|
||||
├─ Implement cache strategy
|
||||
└─ Cache invalidation logic
|
||||
|
||||
Day 3-4: Database
|
||||
├─ Add analysis history storage
|
||||
├─ Enable result comparison
|
||||
└─ Performance tuning
|
||||
|
||||
Day 5: Monitoring
|
||||
├─ Setup logging
|
||||
├─ Performance monitoring
|
||||
└─ Alerting
|
||||
```
|
||||
|
||||
**Deliverable:** 10x performance improvement
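
The cache strategy and invalidation logic for this week could be prototyped along these lines. This is a sketch only: `TTLCache` is an in-memory stand-in used to show the expiry logic, the key format is an assumption, and a real deployment would delegate storage and expiry to Redis.

```python
import hashlib
import time
from typing import Any, Callable, Dict, Optional, Tuple


def cache_key(website_url: str, analysis_type: str) -> str:
    """Build a stable key from the normalized URL and the analysis type."""
    normalized = website_url.strip().lower().rstrip("/")
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]
    return f"seo:{analysis_type}:{digest}"


class TTLCache:
    """In-memory stand-in for Redis, used only to show the expiry logic."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller recomputes it.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (self._clock() + self._ttl, value)

    def invalidate(self, key: str) -> None:
        self._store.pop(key, None)
```

Injecting the clock makes expiry testable without sleeping. Note that lowercasing the whole URL is a simplification, since URL paths can be case-sensitive.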
|
||||
|
||||
---
|
||||
|
||||
### Week 4: Phase 2A.4 Comprehensive Testing
|
||||
**Goal:** Validate everything works end-to-end
|
||||
|
||||
```
|
||||
Day 1: Unit Testing
|
||||
├─ Service method tests (50+)
|
||||
├─ Error scenario tests
|
||||
└─ Data validation tests
|
||||
|
||||
Day 2: Integration Testing
|
||||
├─ API endpoint tests (20+)
|
||||
├─ Database integration tests
|
||||
└─ LLM response tests
|
||||
|
||||
Day 3: E2E Testing
|
||||
├─ Frontend + Backend workflows
|
||||
├─ Real website testing (10+ sites)
|
||||
└─ Performance benchmarks
|
||||
|
||||
Day 4-5: Bug Fixes
|
||||
├─ Fix identified issues
|
||||
├─ Performance optimization
|
||||
└─ Edge case handling
|
||||
```
|
||||
|
||||
**Deliverable:** 80%+ test coverage, all tests passing
|
||||
|
||||
---
|
||||
|
||||
### Week 5: Phase 2A.5 Documentation & Deployment
|
||||
**Goal:** Document and release
|
||||
|
||||
```
|
||||
Day 1-2: Documentation
|
||||
├─ API documentation
|
||||
├─ User guides
|
||||
└─ Developer documentation
|
||||
|
||||
Day 3-4: Deployment
|
||||
├─ Staging environment setup
|
||||
├─ Production deployment
|
||||
└─ Monitoring setup
|
||||
|
||||
Day 5: Validation
|
||||
├─ Production testing
|
||||
├─ User acceptance testing
|
||||
└─ Rollback procedures
|
||||
```
|
||||
|
||||
**Deliverable:** Production-ready release
|
||||
|
||||
---
|
||||
|
||||
## 📊 Timeline & Resource Planning
|
||||
|
||||
| Week | Dates | Phase | Duration | Effort | Team | Cumulative progress |
|------|-------|-------|----------|--------|------|---------------------|
| 1 | May 24-30 | 2A.1 Backend Core | 5 working days | 80 hours (2x2) | 2 Backend devs | 20% |
| 2 | May 31-Jun 6 | 2A.2 LLM Integration | 5 working days | 80 hours (2x2) | 1-2 Backend devs | 40% |
| 3 | Jun 7-13 | 2A.3 Optimization (Cache) | 5 working days | 40 hours | 1 Backend dev | 60% |
| 4 | Jun 14-20 | 2A.4 Testing | 5 days | 60 hours | 2 QA/Dev, 1 Dev | 80% |
| 5 | Jun 21-27 | 2A.5 Deployment | 5 working days | 40 hours | 1 DevOps, 1 Backend | 100% |
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Success Criteria for Each Phase
|
||||
|
||||
### Phase 2A.1: Backend Core (WEEK 1)
|
||||
✅ **MUST HAVE:**
|
||||
- [ ] 3 endpoints responding correctly
|
||||
- [ ] Request validation working
|
||||
- [ ] Response formats match frontend expectations
|
||||
- [ ] Error handling implemented
|
||||
- [ ] All tests passing
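
To make "request validation working" concrete, a validation step might be sketched as below. The `AuditRequest` name and its rules are assumptions for illustration; in a FastAPI application this role is normally played by Pydantic models rather than hand-written checks.

```python
from dataclasses import dataclass, field
from typing import List
from urllib.parse import urlparse


def _is_http_url(value: str) -> bool:
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


@dataclass
class AuditRequest:
    website_url: str
    competitors: List[str] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty means valid."""
        errors = []
        if not _is_http_url(self.website_url):
            errors.append("website_url must be an absolute http(s) URL")
        for competitor in self.competitors:
            if not _is_http_url(competitor):
                errors.append(f"invalid competitor URL: {competitor}")
        return errors
```

Returning all problems at once, instead of failing on the first, lets the frontend show every invalid field in a single round trip.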
|
||||
|
||||
✅ **SHOULD HAVE:**
|
||||
- [ ] Database caching setup
|
||||
- [ ] Performance benchmarks met
|
||||
- [ ] Edge cases handled
|
||||
|
||||
⚠️ **NICE TO HAVE:**
|
||||
- [ ] Advanced analytics
|
||||
- [ ] Custom filters
|
||||
|
||||
---
|
||||
|
||||
### Phase 2A.2: LLM Integration (WEEK 2)
|
||||
✅ **MUST HAVE:**
|
||||
- [ ] 8 LLM endpoints working
|
||||
- [ ] Traffic projections accurate
|
||||
- [ ] Priority scoring (1-10) implemented
|
||||
- [ ] Effort assessment working
|
||||
- [ ] All tests passing
|
||||
|
||||
✅ **SHOULD HAVE:**
|
||||
- [ ] Insights caching
|
||||
- [ ] Response time < 5 seconds
|
||||
- [ ] Prompt optimization complete
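
One possible way to derive the 1-10 priority score from traffic gain and effort is sketched here. The effort weights and the linear mapping are assumptions chosen for illustration, not a specified formula.

```python
EFFORT_WEIGHTS = {"low": 1.0, "medium": 2.0, "high": 4.0}  # assumed relative costs


def priority_score(traffic_gain: int, effort_level: str, max_traffic_gain: int) -> int:
    """Map expected traffic gain and effort onto a 1-10 priority."""
    if max_traffic_gain <= 0:
        return 1
    weight = EFFORT_WEIGHTS.get(effort_level, EFFORT_WEIGHTS["medium"])
    # Value per unit of effort, normalized so the best low-effort item scores 10.
    ratio = (traffic_gain / max_traffic_gain) / weight
    return max(1, min(10, round(1 + 9 * ratio)))
```

Computing the score in the backend, rather than asking the LLM for it, keeps priorities reproducible across runs.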
|
||||
|
||||
---
|
||||
|
||||
### Phase 2A.3: Optimization (WEEK 3)
|
||||
✅ **MUST HAVE:**
|
||||
- [ ] Caching reduces response time by 80%
|
||||
- [ ] History storage working
|
||||
- [ ] Cache invalidation logic tested
|
||||
|
||||
✅ **SHOULD HAVE:**
|
||||
- [ ] Monitoring alerts set up
|
||||
- [ ] Performance dashboard
|
||||
|
||||
---
|
||||
|
||||
### Phase 2A.4: Testing (WEEK 4)
|
||||
✅ **MUST HAVE:**
|
||||
- [ ] 80%+ test coverage
|
||||
- [ ] All tests passing
|
||||
- [ ] No critical bugs
|
||||
- [ ] Performance benchmarks met
|
||||
|
||||
---
|
||||
|
||||
### Phase 2A.5: Deployment (WEEK 5)
|
||||
✅ **MUST HAVE:**
|
||||
- [ ] Production deployment successful
|
||||
- [ ] Monitoring active
|
||||
- [ ] User access working
|
||||
- [ ] No data loss
|
||||
|
||||
---
|
||||
|
||||
## 💡 Quick Reference: What to Build
|
||||
|
||||
### Backend Structure Needed
|
||||
```
|
||||
backend/services/seo_tools/
|
||||
├── enterprise_seo_service.py (New - 400 lines)
|
||||
├── gsc_analyzer_service.py (New - 350 lines)
|
||||
├── llm_insights_service.py (New - 500 lines)
|
||||
└── ...existing services...
|
||||
|
||||
backend/routers/
|
||||
├── seo_tools.py (Update - +150 lines)
|
||||
└── ...existing routers...
|
||||
```
|
||||
|
||||
### Database Schema Needed
|
||||
```sql
|
||||
-- Store analysis results
|
||||
CREATE TABLE seo_analyses (
|
||||
id UUID PRIMARY KEY,
|
||||
user_id UUID,
|
||||
website_url VARCHAR,
|
||||
analysis_type VARCHAR,
|
||||
results JSONB,
|
||||
created_at TIMESTAMP,
|
||||
cached_until TIMESTAMP
|
||||
);
|
||||
|
||||
-- Store insights
|
||||
CREATE TABLE insights (
|
||||
id UUID PRIMARY KEY,
|
||||
analysis_id UUID,
|
||||
insight_text TEXT,
|
||||
priority INT,
|
||||
traffic_gain INT,
|
||||
effort_level VARCHAR
|
||||
);
|
||||
```
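
The way `cached_until` could drive history lookups is sketched below. To stay self-contained, the example uses the standard-library `sqlite3` module with an in-memory database as a stand-in for PostgreSQL, so UUID and JSONB columns become `TEXT`; the function names are hypothetical.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timedelta, timezone
from typing import Optional


def init_db(conn: sqlite3.Connection) -> None:
    # results holds a JSON string here; it would be JSONB in PostgreSQL.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS seo_analyses (
               id TEXT PRIMARY KEY,
               user_id TEXT,
               website_url TEXT,
               analysis_type TEXT,
               results TEXT,
               created_at TEXT,
               cached_until TEXT
           )"""
    )


def save_analysis(conn, user_id, website_url, analysis_type, results, ttl_hours=24, now=None) -> str:
    now = now or datetime.now(timezone.utc)
    analysis_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO seo_analyses VALUES (?, ?, ?, ?, ?, ?, ?)",
        (analysis_id, user_id, website_url, analysis_type, json.dumps(results),
         now.isoformat(), (now + timedelta(hours=ttl_hours)).isoformat()),
    )
    return analysis_id


def load_fresh_analysis(conn, website_url, analysis_type, now=None) -> Optional[dict]:
    """Return the newest stored result whose cached_until is still in the future."""
    now = now or datetime.now(timezone.utc)
    row = conn.execute(
        "SELECT results FROM seo_analyses "
        "WHERE website_url = ? AND analysis_type = ? AND cached_until > ? "
        "ORDER BY created_at DESC LIMIT 1",
        (website_url, analysis_type, now.isoformat()),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Expired rows are ignored rather than deleted, which keeps them available for the result-comparison feature planned in Phase 2A.3.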
|
||||
|
||||
### Environment Setup Needed
|
||||
```
|
||||
# .env additions
|
||||
GSC_API_KEY=...
|
||||
LLM_API_KEY=...
|
||||
REDIS_URL=redis://localhost:6379
|
||||
DATABASE_URL=postgres://...
|
||||
```
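
A small loader for these variables might look like the following sketch. The `Settings` class and the choice of which variables are required are assumptions; the variable names come from the `.env` listing above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    gsc_api_key: str
    llm_api_key: str
    redis_url: str
    database_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables listed above and fail fast if a required one is missing."""
    required = ["GSC_API_KEY", "LLM_API_KEY", "DATABASE_URL"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return Settings(
        gsc_api_key=env["GSC_API_KEY"],
        llm_api_key=env["LLM_API_KEY"],
        # Defaulting Redis to localhost is an assumption for local development.
        redis_url=env.get("REDIS_URL", "redis://localhost:6379"),
        database_url=env["DATABASE_URL"],
    )
```

Failing at startup with a list of missing names is usually easier to debug than a `KeyError` raised on the first request.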
|
||||
|
||||
---
|
||||
|
||||
## ⚡ Quick Start for Phase 2A.1
|
||||
|
||||
### 1. Create Service File Structure
|
||||
```python
# backend/services/seo_tools/enterprise_seo_service.py
from fastapi import HTTPException
from typing import Optional, List


class EnterpriseSEOService:
    """Handles comprehensive enterprise SEO audits"""

    async def execute_complete_audit(self, website_url: str, competitors: Optional[List[str]] = None):
        """Execute complete enterprise audit"""
        try:
            # 1. Technical audit
            technical = await self._technical_audit(website_url)

            # 2. Keyword research
            keywords = await self._keyword_research(website_url)

            # 3. Competitive analysis
            competitive = await self._competitive_analysis(website_url, competitors)

            # 4. On-page analysis
            on_page = await self._on_page_analysis(website_url)

            # 5. Generate roadmap
            roadmap = self._generate_roadmap(technical, keywords, competitive, on_page)

            return {
                'executive_summary': self._generate_summary(technical, keywords),
                'technical_audit': technical,
                'keyword_research': keywords,
                'competitive_analysis': competitive,
                'on_page_analysis': on_page,
                'implementation_roadmap': roadmap,
            }
        except Exception as e:
            raise HTTPException(status_code=500, detail=str(e))

    async def _technical_audit(self, website_url: str):
        # Implement technical SEO analysis
        # Check Core Web Vitals, mobile usability, page speed, security, etc.
        pass

    # ... more methods
```
|
||||
|
||||
### 2. Add Routes
|
||||
```python
# backend/routers/seo_tools.py
from typing import List, Optional

from fastapi import APIRouter
from pydantic import BaseModel

from backend.services.seo_tools.enterprise_seo_service import EnterpriseSEOService

router = APIRouter()
enterprise_service = EnterpriseSEOService()


class EnterpriseAuditRequest(BaseModel):
    # Declared inline for brevity; a JSON body model so the curl test in step 3 is accepted
    website_url: str
    competitors: Optional[List[str]] = None


@router.post('/enterprise/complete-audit')
async def complete_enterprise_audit(request: EnterpriseAuditRequest):
    return await enterprise_service.execute_complete_audit(request.website_url, request.competitors)
```
|
||||
|
||||
### 3. Test Endpoint
|
||||
```bash
curl -X POST http://localhost:8000/api/seo-tools/enterprise/complete-audit \
  -H "Content-Type: application/json" \
  -d '{"website_url":"https://example.com"}'
```
|
||||
|
||||
---
|
||||
|
||||
## 🎬 Ready to Start?
|
||||
|
||||
### Recommended Next Action
|
||||
**Start Phase 2A.1 today:** Implement the 3 core backend endpoints to unblock all testing.
|
||||
|
||||
### Resources Provided
|
||||
1. ✅ `PHASE2A_INTEGRATION_GUIDE.md` - Complete frontend specs
|
||||
2. ✅ `COMPILATION_FIXES.md` - Fixed all 14 TypeScript errors
|
||||
3. ✅ Frontend code (4,850+ lines) - Ready to consume backend data
|
||||
4. ✅ LLM prompts in `llmInsightsGenerator.ts` - Ready to use
|
||||
5. ✅ Type definitions in `enterpriseSeoApi.ts` - Match backend models
|
||||
|
||||
### What's Blocking
|
||||
- ❌ Backend implementation NOT STARTED
|
||||
- ❌ No core endpoints
|
||||
- ❌ No LLM integration
|
||||
- ❌ Can't test end-to-end
|
||||
|
||||
### Next 24 Hours
|
||||
- [ ] Review this document
|
||||
- [ ] Estimate backend effort
|
||||
- [ ] Plan resource allocation
|
||||
- [ ] Start Phase 2A.1 implementation
|
||||
- [ ] Setup development environment
|
||||
|
||||
---
|
||||
|
||||
**Status:** Frontend 100% Complete → Backend Ready to Start
|
||||
**Next Checkpoint:** Phase 2A.1 Complete (3 endpoints working)
|
||||
**Timeline:** Can be done in 1-2 weeks with 2-3 developers
|
||||
|
||||
**Questions? Check:**
|
||||
- `PHASE2A_IMPLEMENTATION_REVIEW.md` - This file (detailed review)
|
||||
- `PHASE2A_INTEGRATION_GUIDE.md` - Frontend specifications
|
||||
- `COMPILATION_FIXES.md` - TypeScript fixes applied
|
||||
@@ -1,460 +0,0 @@
|
||||
# 📊 Phase 2A Implementation Status Dashboard
|
||||
|
||||
**Date:** May 24, 2026 | **Overall Progress:** 20% | **Current Phase:** Frontend Complete ✅
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Project Summary
|
||||
|
||||
| Metric | Status | Details |
|
||||
|--------|--------|---------|
|
||||
| **Project Name** | Phase 2A SEO Dashboard | Enterprise SEO Analysis Integration |
|
||||
| **Current Phase** | Frontend Implementation | ✅ COMPLETE |
|
||||
| **Total Phases** | 5 | 2A.1 through 2A.5 |
|
||||
| **Overall Progress** | 20% | Frontend 100%, Backend 0% |
|
||||
| **Timeline** | 5-8 weeks | Started: May 24, Target: Jun 28 |
|
||||
| **Team Size** | 2-3 devs | Frontend ✅, Backend ⏳ |
|
||||
| **Blocking Issues** | 1 Critical | Backend not started |
|
||||
|
||||
---
|
||||
|
||||
## 📈 Completion Status by Component
|
||||
|
||||
### Frontend Layer: ✅ 100% COMPLETE
|
||||
|
||||
```
|
||||
Component Status Lines Features Tests
|
||||
─────────────────────────────────────────────────────────────────────────
|
||||
enterpriseSeoApi.ts ✅ 650+ 15 methods ✅ Types
|
||||
llmInsightsGenerator.ts ✅ 450+ 10 methods ✅ Types
|
||||
EnterpriseAuditResults ✅ 800+ 8 sections ✅ Rendering
|
||||
GSCAnalysisResults ✅ 900+ 4 tabs ✅ Rendering
|
||||
ActionableInsightsDisplay ✅ 700+ Filtering ✅ Rendering
|
||||
SEOAnalysisController ✅ 750+ 5-step flow ✅ Integration
|
||||
SEODashboard (modified) ✅ ~50 Tab nav ✅ Tab works
|
||||
─────────────────────────────────────────────────────────────────────────
|
||||
TOTAL FRONTEND ✅ 4,850 50+ features ✅ READY
|
||||
```
|
||||
|
||||
### Backend Layer: 🔴 0% STARTED
|
||||
|
||||
```
|
||||
Component Status Priority Lines Effort
|
||||
─────────────────────────────────────────────────────────────────────
|
||||
Enterprise Audit Endpoint 🔴 P1 ~400 HIGH
|
||||
GSC Analysis Endpoint 🔴 P1 ~350 MEDIUM
|
||||
Content Opportunities EP 🔴 P1 ~300 MEDIUM
|
||||
LLM Audit Insights EP 🔴 P2 ~200 MEDIUM
|
||||
LLM GSC Insights EP 🔴 P2 ~200 MEDIUM
|
||||
LLM Content Strategy EP 🔴 P2 ~150 LOW
|
||||
LLM Traffic Roadmap EP 🔴 P2 ~150 LOW
|
||||
LLM Recommendations EP 🔴 P2 ~150 LOW
|
||||
LLM Quick Wins EP 🔴 P2 ~100 LOW
|
||||
LLM Competitive EP 🔴 P2 ~100 LOW
|
||||
LLM Keyword Expansion EP 🔴 P2 ~100 LOW
|
||||
Health Check Endpoint 🔴 P3 ~50 LOW
|
||||
─────────────────────────────────────────────────────────────────────
|
||||
TOTAL BACKEND 🔴 N/A ~2,250 HIGH
|
||||
```
|
||||
|
||||
### Database & Infrastructure: 🔴 0% STARTED
|
||||
|
||||
```
|
||||
Component Status Priority Effort
|
||||
─────────────────────────────────────────────────────────────────
|
||||
Redis Caching Layer 🔴 P2 MEDIUM
|
||||
Analysis History DB 🔴 P2 LOW
|
||||
Performance Monitoring 🔴 P3 LOW
|
||||
Logging Infrastructure 🔴 P3 LOW
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Phase Breakdown
|
||||
|
||||
### Phase 2A.0: Frontend Implementation ✅
|
||||
- **Status:** ✅ COMPLETE
|
||||
- **Duration:** 3 days
|
||||
- **Effort:** 40 hours
|
||||
- **Team:** 1 Frontend Dev
|
||||
- **Deliverable:** 6 components + full UI
|
||||
|
||||
**What Was Done:**
|
||||
- ✅ 4,850 lines of React/TypeScript code
|
||||
- ✅ 20+ TypeScript interfaces
|
||||
- ✅ 50+ UI components
|
||||
- ✅ Dashboard integration
|
||||
- ✅ Error handling
|
||||
|
||||
**What's Next:** Phase 2A.1
|
||||
|
||||
---
|
||||
|
||||
### Phase 2A.1: Backend Core Endpoints 🔴
|
||||
- **Status:** 🔴 NOT STARTED
|
||||
- **Duration:** 1 week
|
||||
- **Effort:** 40-50 hours
|
||||
- **Team:** 2 Backend Devs
|
||||
- **Priority:** ⚠️ CRITICAL - BLOCKING ALL TESTING
|
||||
|
||||
**What Needs to Be Done:**
|
||||
- [ ] Enterprise audit service (400 lines)
|
||||
- [ ] GSC analyzer service (350 lines)
|
||||
- [ ] 3 API endpoints
|
||||
- [ ] Request/response validation
|
||||
- [ ] Error handling
|
||||
- [ ] Unit tests
|
||||
- [ ] Integration tests
|
||||
|
||||
**Blocking Factors:**
|
||||
- ❌ 3 core endpoints not implemented
|
||||
- ❌ No business logic
|
||||
- ❌ No data flowing to frontend
|
||||
- ❌ Testing impossible
|
||||
|
||||
**Success Criteria:**
|
||||
- ✅ 3 endpoints functional
|
||||
- ✅ Tests passing
|
||||
- ✅ Real data flowing
|
||||
- ✅ Frontend can make calls
|
||||
|
||||
---
|
||||
|
||||
### Phase 2A.2: LLM Integration 🔴
|
||||
- **Status:** 🔴 BLOCKED (Pending 2A.1)
|
||||
- **Duration:** 1 week
|
||||
- **Effort:** 40-50 hours
|
||||
- **Team:** 1-2 Backend Devs
|
||||
- **Priority:** ⚠️ CRITICAL
|
||||
|
||||
**What Needs to Be Done:**
|
||||
- [ ] LLM insights service (500 lines)
|
||||
- [ ] 8 LLM endpoints
|
||||
- [ ] Prompt optimization
|
||||
- [ ] Response parsing
|
||||
- [ ] Caching strategy
|
||||
- [ ] Performance optimization
|
||||
|
||||
**Dependencies:**
|
||||
- ⏳ Depends on Phase 2A.1
|
||||
- ⏳ Needs LLM API setup
|
||||
- ⏳ Requires prompt templates (ready ✅)
|
||||
|
||||
---
|
||||
|
||||
### Phase 2A.3: Database & Caching 🔴
|
||||
- **Status:** 🔴 BLOCKED (Pending 2A.2)
|
||||
- **Duration:** 1 week
|
||||
- **Effort:** 30 hours
|
||||
- **Team:** 1 Backend Dev + 1 DevOps
|
||||
- **Priority:** HIGH (for production)
|
||||
|
||||
**What Needs to Be Done:**
|
||||
- [ ] Redis setup
|
||||
- [ ] Cache invalidation logic
|
||||
- [ ] Database schema
|
||||
- [ ] History storage
|
||||
- [ ] Performance tuning
|
||||
|
||||
**Benefit:** 10x performance improvement
|
||||
|
||||
---
|
||||
|
||||
### Phase 2A.4: Testing 🔴
|
||||
- **Status:** 🔴 BLOCKED (Pending 2A.3)
|
||||
- **Duration:** 1-2 weeks
|
||||
- **Effort:** 50 hours
|
||||
- **Team:** 2 QA + 1 Dev
|
||||
- **Priority:** HIGH
|
||||
|
||||
**What Needs to Be Done:**
|
||||
- [ ] 50+ unit tests
|
||||
- [ ] 20+ integration tests
|
||||
- [ ] 10+ E2E tests
|
||||
- [ ] Manual testing
|
||||
- [ ] Performance validation
|
||||
- [ ] Bug fixes
|
||||
|
||||
**Target:** 80%+ code coverage
|
||||
|
||||
---
|
||||
|
||||
### Phase 2A.5: Documentation & Deployment 🔴
|
||||
- **Status:** 🔴 BLOCKED (Pending 2A.4)
|
||||
- **Duration:** 1 week
|
||||
- **Effort:** 30 hours
|
||||
- **Team:** 1 Backend Dev + 1 DevOps
|
||||
- **Priority:** MEDIUM
|
||||
|
||||
**What Needs to Be Done:**
|
||||
- [ ] API documentation
|
||||
- [ ] User guides
|
||||
- [ ] Developer documentation
|
||||
- [ ] Deployment procedures
|
||||
- [ ] Monitoring setup
|
||||
- [ ] Rollback procedures
|
||||
|
||||
---
|
||||
|
||||
## 📊 Overall Project Progress
|
||||
|
||||
```
|
||||
TOTAL PROJECT PROGRESS: 20% COMPLETE
|
||||
═══════════════════════════════════════════════════════════════
|
||||
|
||||
Frontend: ████████████████████░░░░░░░░░░░░░░░░░░░░░░ 100%
|
||||
Backend Core: ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0%
|
||||
LLM Integration: ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0%
|
||||
Infrastructure: ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0%
|
||||
Testing: ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0%
|
||||
Deployment: ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0%
|
||||
|
||||
WEEK-BY-WEEK PROJECTION:
|
||||
|
||||
Week 1 (May 24-30): ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 20%
|
||||
Frontend ✅ + Start Backend Core
|
||||
|
||||
Week 2 (May 31-Jun6): ████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 40%
|
||||
Backend Core ✅ + Start LLM
|
||||
|
||||
Week 3 (Jun 7-13): ████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░ 60%
|
||||
LLM Integration ✅ + Start DB/Cache
|
||||
|
||||
Week 4 (Jun 14-20): ████████████████░░░░░░░░░░░░░░░░░░░░░░░░ 80%
|
||||
Infrastructure ✅ + Start Testing
|
||||
|
||||
Week 5 (Jun 21-27): ████████████████████░░░░░░░░░░░░░░░░░░░░ 100%
|
||||
Testing + Deployment ✅
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## ⚠️ Current Blockers
|
||||
|
||||
### 🔴 CRITICAL: Backend Implementation Not Started
|
||||
- **Impact:** Complete blocker for all testing
|
||||
- **Severity:** Critical
|
||||
- **Current Status:** 0% done
|
||||
- **Time to Unblock:** 1 week
|
||||
- **Action Required:** Start Phase 2A.1 immediately
|
||||
|
||||
### 🟡 Dependencies
|
||||
| Phase | Depends On | Status |
|
||||
|-------|-----------|--------|
|
||||
| 2A.1 | N/A | 🔴 Blocked by resources |
|
||||
| 2A.2 | 2A.1 | 🔴 Blocked by 2A.1 |
|
||||
| 2A.3 | 2A.2 | 🔴 Blocked by 2A.2 |
|
||||
| 2A.4 | 2A.3 | 🔴 Blocked by 2A.3 |
|
||||
| 2A.5 | 2A.4 | 🔴 Blocked by 2A.4 |
|
||||
|
||||
---
|
||||
|
||||
## 📋 Action Items by Priority
|
||||
|
||||
### 🔴 IMMEDIATE (Next 24 Hours)
|
||||
- [ ] Review this status dashboard
|
||||
- [ ] Allocate backend development resources
|
||||
- [ ] Setup development environment
|
||||
- [ ] Start Phase 2A.1 backend core implementation
|
||||
- [ ] Create service files (enterprise_seo_service.py, gsc_analyzer_service.py)
|
||||
|
||||
### 🟡 SHORT TERM (Next Week)
|
||||
- [ ] Complete Phase 2A.1 (3 endpoints working)
|
||||
- [ ] Implement business logic for enterprise audit
|
||||
- [ ] Integrate GSC API
|
||||
- [ ] Write unit tests
|
||||
- [ ] Manual testing with real websites
|
||||
|
||||
### 🟢 MEDIUM TERM (2-3 Weeks)
|
||||
- [ ] Start Phase 2A.2 LLM integration
|
||||
- [ ] Implement 8 LLM endpoints
|
||||
- [ ] Optimize LLM prompts
|
||||
- [ ] Setup caching layer
|
||||
- [ ] Begin comprehensive testing
|
||||
|
||||
### 🔵 LONG TERM (4-5 Weeks)
|
||||
- [ ] Complete all testing
|
||||
- [ ] Deploy to staging
|
||||
- [ ] UAT and bug fixes
|
||||
- [ ] Deploy to production
|
||||
- [ ] Monitor and optimize
|
||||
|
||||
---
|
||||
|
||||
## 📞 Resource Requirements
|
||||
|
||||
### Phase 2A.1 (Backend Core)
|
||||
```
|
||||
Role Count Hours/Week Total Hours
|
||||
─────────────────────────────────────────────────
|
||||
Backend Dev 2 20 40 hours
|
||||
QA/Tester 0.5 5 5 hours
|
||||
DevOps 0 0 0 hours
|
||||
─────────────────────────────────────────────────
|
||||
TOTAL 2.5 25 45 hours
|
||||
```
|
||||
|
||||
### Phase 2A.2 (LLM Integration)
|
||||
```
|
||||
Role Count Hours/Week Total Hours
|
||||
─────────────────────────────────────────────────
|
||||
Backend Dev 1-2 20 40 hours
|
||||
LLM Specialist 0.5 5 5 hours
|
||||
QA/Tester 0.5 5 5 hours
|
||||
─────────────────────────────────────────────────
|
||||
TOTAL 2-2.5 30 50 hours
|
||||
```
|
||||
|
||||
### Full Project (2A.1 through 2A.5)
|
||||
```
|
||||
Role Total Hours
|
||||
─────────────────────────────────
|
||||
Backend Dev ~250 hours
|
||||
Frontend Dev 40 hours (done)
|
||||
QA/Tester ~80 hours
|
||||
DevOps ~50 hours
|
||||
LLM Specialist ~20 hours
|
||||
─────────────────────────────────
|
||||
TOTAL ~440 hours
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 💰 ROI & Impact
|
||||
|
||||
### Frontend ROI (Completed)
|
||||
- ✅ 4,850 lines of production-ready code
|
||||
- ✅ 50+ UI components
|
||||
- ✅ Full enterprise SEO analysis UI
|
||||
- ✅ LLM prompt integration ready
|
||||
- ✅ Zero technical debt
|
||||
|
||||
### Expected Backend ROI (Pending)
|
||||
- 📊 Enterprise-grade SEO audit capability
|
||||
- 📈 LLM-powered insights (8 types)
|
||||
- 🚀 Traffic improvement guidance
|
||||
- 💡 Competitive analysis
|
||||
- 🎯 Implementation roadmaps
|
||||
|
||||
### Business Impact
|
||||
- Differentiator: First LLM-powered SEO dashboard
|
||||
- Monetization: Premium feature for enterprise tier
|
||||
- User Value: Actionable insights → Traffic growth
|
||||
- Market Position: Advanced SEO intelligence
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Success Metrics
|
||||
|
||||
### Phase 2A.1 Success
|
||||
- [ ] 3 endpoints fully functional
|
||||
- [ ] Response time < 10 seconds
|
||||
- [ ] 95% uptime in testing
|
||||
- [ ] All tests passing
|
||||
- [ ] No critical bugs
|
||||
|
||||
### Phase 2A.2 Success
|
||||
- [ ] 8 LLM endpoints working
|
||||
- [ ] Insights generate < 5 seconds
|
||||
- [ ] Traffic projections ± 20% accuracy
|
||||
- [ ] User satisfaction > 4.5/5
|
||||
- [ ] No data corruption
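
The "± 20% accuracy" criterion for traffic projections could be checked with something like the sketch below. How projections are compared with observed traffic, and over what period, is not specified in this document; the helper names and the relative-error definition are assumptions.

```python
def within_tolerance(projected: float, actual: float, tolerance: float = 0.20) -> bool:
    """True when the projection is within +/- tolerance of the observed value."""
    if actual == 0:
        return projected == 0
    return abs(projected - actual) / abs(actual) <= tolerance


def accuracy_rate(pairs, tolerance: float = 0.20) -> float:
    """Share of (projected, actual) pairs that fall inside the tolerance band."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum(1 for projected, actual in pairs if within_tolerance(projected, actual, tolerance))
    return hits / len(pairs)
```

Reporting the share of projections inside the band, rather than a single pass/fail, makes it easier to track accuracy as prompts are tuned.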
|
||||
|
||||
### Phase 2A.5 Success
|
||||
- [ ] All tests passing
|
||||
- [ ] 80%+ code coverage
|
||||
- [ ] Performance benchmarks met
|
||||
- [ ] Zero critical bugs
|
||||
- [ ] User acceptance achieved
|
||||
|
||||
---
|
||||
|
||||
## 📅 Gantt Chart View
|
||||
|
||||
```
|
||||
Task May Jun Jul Status
|
||||
────────────────────────────────────────────────────────
|
||||
Frontend (Done) ✅ Complete
|
||||
├─ Phase 2A.0 Frontend ✅
|
||||
│
|
||||
Backend & Infrastructure
|
||||
├─ Phase 2A.1 Core ▓▓▓▓░░░░░░░░░ 🔴 0%
|
||||
├─ Phase 2A.2 LLM ▓▓▓▓░░░░░ 🔴 0%
|
||||
├─ Phase 2A.3 DB/Cache ▓▓▓ 🔴 0%
|
||||
├─ Phase 2A.4 Testing ▓ 🔴 0%
|
||||
└─ Phase 2A.5 Deploy ▓ 🔴 0%
|
||||
|
||||
Legend: ✅ Complete | ▓ In Progress | ░ Pending
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 📞 Next Steps (Quick Checklist)
|
||||
|
||||
### Today (May 24)
|
||||
- [ ] Team reviews this status document
|
||||
- [ ] Stakeholder approval for Phase 2A.1
|
||||
- [ ] Backend team setup environment
|
||||
- [ ] Create JIRA tickets for Phase 2A.1
|
||||
|
||||
### Tomorrow (May 25)
|
||||
- [ ] Start Phase 2A.1 implementation
|
||||
- [ ] Create service files
|
||||
- [ ] Implement first endpoint
|
||||
- [ ] Setup testing environment
|
||||
|
||||
### This Week
|
||||
- [ ] 3 core endpoints working
|
||||
- [ ] Unit tests passing
|
||||
- [ ] Manual testing on real sites
|
||||
- [ ] Ready to move to Phase 2A.2
|
||||
|
||||
---
|
||||
|
||||
## 📊 Key Metrics Dashboard
|
||||
|
||||
| Metric | Current | Target | Status |
|
||||
|--------|---------|--------|--------|
|
||||
| Frontend Completion | 100% | 100% | ✅ On Track |
|
||||
| Backend Completion | 0% | 100% | 🔴 Blocked |
|
||||
| Test Coverage | N/A | 80% | ⏳ Pending |
|
||||
| Performance Target | N/A | <5s | ⏳ Pending |
|
||||
| Bug Count | 0 | 0 | ✅ On Track |
|
||||
| Deployment Readiness | 20% | 100% | 🟡 Need Backend |
|
||||
|
||||
---
|
||||
|
||||
## 🎓 Documentation Provided
|
||||
|
||||
| Document | Location | Status | Purpose |
|
||||
|----------|----------|--------|---------|
|
||||
| Integration Guide | `PHASE2A_INTEGRATION_GUIDE.md` | ✅ Ready | Frontend specs |
|
||||
| Implementation Review | `PHASE2A_IMPLEMENTATION_REVIEW.md` | ✅ Ready | Detailed review |
|
||||
| Next Steps | `PHASE2A_NEXT_STEPS.md` | ✅ Ready | Roadmap |
|
||||
| Compilation Fixes | `COMPILATION_FIXES.md` | ✅ Ready | Error resolution |
|
||||
| This File | `PHASE2A_STATUS_DASHBOARD.md` | ✅ Ready | Current status |
|
||||
|
||||
---
|
||||
|
||||
## 🚀 Call to Action
|
||||
|
||||
**IMMEDIATE ACTION REQUIRED:**
|
||||
|
||||
Start Phase 2A.1 backend implementation to unblock:
|
||||
- ✅ Frontend testing
|
||||
- ✅ Integration testing
|
||||
- ✅ Full workflow validation
|
||||
- ✅ Timeline adherence
|
||||
|
||||
**Recommended Timeline:** Begin TODAY for June 28 completion
|
||||
|
||||
**Resources Needed:** 2-3 backend developers for next 5 weeks
|
||||
|
||||
**Expected Outcome:** Production-ready enterprise SEO dashboard with LLM-powered insights
|
||||
|
||||
---
|
||||
|
||||
**Generated:** May 24, 2026
|
||||
**Last Updated:** May 24, 2026
|
||||
**Next Review:** Daily during Phase 2A.1
|
||||
**Questions:** Check `PHASE2A_IMPLEMENTATION_REVIEW.md`
|
||||
@@ -1,342 +0,0 @@
|
||||
# Phase 2A - Quick Reference Guide
|
||||
|
||||
**Last Updated:** May 24, 2026 | **Status:** Frontend 100% ✅ | Backend 0% 🔴
|
||||
|
||||
---
|
||||
|
||||
## 📍 Where We Are
|
||||
|
||||
```
|
||||
WHAT'S COMPLETE ✅
|
||||
├─ 6 React components (4,850 lines)
|
||||
├─ Type-safe API client (650 lines)
|
||||
├─ LLM prompts service (450 lines)
|
||||
├─ Dashboard tab integration
|
||||
├─ Error handling & loading states
|
||||
├─ Material-UI styling
|
||||
├─ Full TypeScript support
|
||||
└─ 14 compilation errors fixed
|
||||
|
||||
WHAT'S BLOCKING 🔴
|
||||
├─ 12 backend endpoints (not started)
|
||||
├─ Enterprise audit service (not started)
|
||||
├─ GSC analyzer service (not started)
|
||||
├─ LLM insights service (not started)
|
||||
├─ Database/caching layer (not started)
|
||||
└─ All testing (can't start without backend)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Where We're Going
|
||||
|
||||
### Phase 2A.1: Backend Core (NEXT - 1 week)
|
||||
**Priority:** 🔴 CRITICAL
|
||||
**Effort:** 40-50 hours
|
||||
**Team:** 2 backend developers
|
||||
|
||||
**What to Build:**
|
||||
- [ ] Enterprise audit endpoint
- [ ] GSC analysis endpoint
- [ ] Content opportunities endpoint
- [ ] Business logic
- [ ] Error handling
- [ ] Unit tests
|
||||
|
||||
**Unblocks:**
|
||||
- ✅ Frontend testing
|
||||
- ✅ Integration testing
|
||||
- ✅ End-to-end workflows
|
||||
- ✅ Phase 2A.2
|
||||
|
||||
### Phase 2A.2: LLM Integration (AFTER 2A.1 - 1 week)
|
||||
**Priority:** 🔴 CRITICAL
|
||||
**Effort:** 40-50 hours
|
||||
**Team:** 1-2 backend developers
|
||||
|
||||
**What to Build:**
|
||||
- [ ] 8 LLM insight endpoints
- [ ] Prompt optimization
- [ ] Response parsing
- [ ] Caching strategy
|
||||
|
||||
**Unblocks:**
|
||||
- ✅ Insight generation
|
||||
- ✅ Traffic improvement guidance
|
||||
- ✅ Phase 2A.3
|
||||
|
||||
### Phase 2A.3: Infrastructure (AFTER 2A.2 - 1 week)
|
||||
**Priority:** HIGH
|
||||
**Benefit:** 10x performance improvement
|
||||
|
||||
**What to Build:**
|
||||
- [ ] Redis caching
- [ ] Database schema
- [ ] History storage
|
||||
|
||||
### Phase 2A.4: Testing (AFTER 2A.3 - 1-2 weeks)
|
||||
**Priority:** HIGH
|
||||
**Target:** 80%+ coverage
|
||||
|
||||
**What to Build:**
|
||||
- [ ] 50+ unit tests
- [ ] 20+ integration tests
- [ ] 10+ E2E tests
|
||||
|
||||
### Phase 2A.5: Deployment (AFTER 2A.4 - 1 week)
|
||||
**Priority:** MEDIUM
|
||||
|
||||
**What to Build:**
|
||||
- [ ] API documentation
- [ ] Deployment procedures
- [ ] Monitoring setup
|
||||
|
||||
---
|
||||
|
||||
## 📚 Documentation Map
|
||||
|
||||
| Need | Document | Read Time |
|
||||
|------|----------|-----------|
|
||||
| **Full Implementation Details** | `PHASE2A_IMPLEMENTATION_REVIEW.md` | 20 min |
|
||||
| **Component Specifications** | `PHASE2A_INTEGRATION_GUIDE.md` | 15 min |
|
||||
| **Implementation Roadmap** | `PHASE2A_NEXT_STEPS.md` | 15 min |
|
||||
| **Status Tracking** | `PHASE2A_STATUS_DASHBOARD.md` | 10 min |
|
||||
| **Compilation Fixes** | `COMPILATION_FIXES.md` | 5 min |
|
||||
| **Complete Review** | `PHASE2A_COMPLETE_REVIEW.md` | 25 min |
|
||||
| **Quick Reference** | This File | 3 min |
|
||||
|
||||
---
|
||||
|
||||
## 🔗 Key Files in Codebase
|
||||
|
||||
### Frontend Components
|
||||
```
|
||||
frontend/src/api/
|
||||
├── enterpriseSeoApi.ts (650 lines)
|
||||
└── llmInsightsGenerator.ts (450 lines)
|
||||
|
||||
frontend/src/components/SEODashboard/
|
||||
├── SEOAnalysisController.tsx (750 lines)
|
||||
└── components/
|
||||
├── EnterpriseAuditResults.tsx (800 lines)
|
||||
├── GSCAnalysisResults.tsx (900 lines)
|
||||
└── ActionableInsightsDisplay.tsx (700 lines)
|
||||
|
||||
frontend/src/components/SEODashboard/
|
||||
└── SEODashboard.tsx (modified - added tabs)
|
||||
```
|
||||
|
||||
### Documentation
|
||||
```
|
||||
Root directory:
|
||||
├── PHASE2A_INTEGRATION_GUIDE.md
|
||||
├── PHASE2A_IMPLEMENTATION_REVIEW.md
|
||||
├── PHASE2A_NEXT_STEPS.md
|
||||
├── PHASE2A_STATUS_DASHBOARD.md
|
||||
├── PHASE2A_COMPLETE_REVIEW.md
|
||||
├── COMPILATION_FIXES.md
|
||||
└── FILE_INDEX.md
|
||||
```
|
||||
|
||||
### Backend (Not Started)
|
||||
```
|
||||
backend/services/seo_tools/
|
||||
├── enterprise_seo_service.py (NEEDS CREATION)
|
||||
├── gsc_analyzer_service.py (NEEDS CREATION)
|
||||
└── llm_insights_service.py (NEEDS CREATION)
|
||||
|
||||
backend/routers/
|
||||
└── seo_tools.py (NEEDS UPDATES - add 12 endpoints)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## ⚡ Quick Status Check
|
||||
|
||||
### Frontend Ready?
|
||||
```
|
||||
✅ API client complete
|
||||
✅ All components created
|
||||
✅ Dashboard integrated
|
||||
✅ TypeScript errors fixed
|
||||
✅ Error handling in place
|
||||
✅ Loading states working
|
||||
= READY TO TEST (waiting for backend)
|
||||
```
|
||||
|
||||
### Backend Ready?
|
||||
```
|
||||
🔴 No endpoints
|
||||
🔴 No services
|
||||
🔴 No database
|
||||
🔴 No LLM integration
|
||||
🔴 No tests
|
||||
= NOT READY (must start Phase 2A.1)
|
||||
```
|
||||
|
||||
### Can We Deploy?
|
||||
```
|
||||
🔴 NO - Backend not implemented
|
||||
🔴 NO - No testing done
|
||||
🔴 NO - No production checks
|
||||
🔴 NO - No monitoring
|
||||
= BLOCKED (need 4+ weeks of backend work)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 📞 Action Items
|
||||
|
||||
### For Frontend Developers
|
||||
- ✅ Review complete (all components ready)
|
||||
- ✅ Testing ready (can start mock testing)
|
||||
- ✅ Documentation complete
|
||||
|
||||
### For Backend Developers
|
||||
- [ ] **TODAY:** Review Phase 2A.1 requirements
|
||||
- [ ] **TODAY:** Setup development environment
|
||||
- [ ] **TODAY:** Create service file stubs
|
||||
- [ ] **TOMORROW:** Start enterprise audit service
|
||||
- [ ] **THIS WEEK:** Complete 3 core endpoints
|
||||
|
||||
### For DevOps
|
||||
- [ ] Plan infrastructure needs
|
||||
- [ ] Setup Redis for caching
|
||||
- [ ] Plan database schema
|
||||
- [ ] Setup monitoring
|
||||
|
||||
### For Product/Stakeholders
|
||||
- [ ] Review documentation
|
||||
- [ ] Approve timeline (5 weeks to production)
|
||||
- [ ] Allocate resources (2-3 developers)
|
||||
- [ ] Set success criteria
|
||||
|
||||
---
|
||||
|
||||
## 🚀 How to Start Phase 2A.1
|
||||
|
||||
### Step 1: Create Service File
|
||||
```python
# backend/services/seo_tools/enterprise_seo_service.py

class EnterpriseSEOService:
    async def execute_complete_audit(self, website_url: str):
        # Implement business logic
        pass

    async def execute_quick_audit(self, website_url: str):
        # Implement quick version
        pass
```
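
One way the two entry points could relate is for the quick audit to run a subset of the checks that the complete audit runs. The sketch below is only an illustration under that assumption: `check_title`, `check_links`, and `EnterpriseSEOServiceSketch` are hypothetical names that do not appear in the Phase 2A documents, and the checks return canned data instead of analyzing a site.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


# Hypothetical checks; real ones would fetch and analyze the site.
async def check_title(website_url: str) -> dict:
    return {"check": "title", "url": website_url, "ok": True}


async def check_links(website_url: str) -> dict:
    return {"check": "links", "url": website_url, "ok": True}


Check = Callable[[str], Awaitable[dict]]


class EnterpriseSEOServiceSketch:
    # Assumption: the quick audit is a strict subset of the complete audit.
    QUICK_CHECKS: List[Check] = [check_title]
    COMPLETE_CHECKS: List[Check] = [check_title, check_links]

    async def _run(self, website_url: str, checks: List[Check]) -> Dict[str, dict]:
        # Run the checks concurrently and key the results by check name.
        results = await asyncio.gather(*(check(website_url) for check in checks))
        return {result["check"]: result for result in results}

    async def execute_quick_audit(self, website_url: str) -> Dict[str, dict]:
        return await self._run(website_url, self.QUICK_CHECKS)

    async def execute_complete_audit(self, website_url: str) -> Dict[str, dict]:
        return await self._run(website_url, self.COMPLETE_CHECKS)
```

Keeping both methods on one private runner means a new check only has to be added to the relevant list.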
|
||||
|
||||
### Step 2: Add Route
|
||||
```python
# backend/routers/seo_tools.py

@router.post('/enterprise/complete-audit')
async def complete_audit(website_url: str):
    service = EnterpriseSEOService()
    return await service.execute_complete_audit(website_url)
```
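
The route passes `website_url` straight to the service, so it may be worth rejecting malformed input before any audit work starts. A minimal sketch using only the standard library follows; `normalize_website_url` is a hypothetical helper, not part of the Phase 2A plan, and the rule it applies (http/https scheme plus a host) is an assumption about what counts as a valid audit target.

```python
from urllib.parse import urlparse


def normalize_website_url(raw: str) -> str:
    """Return a cleaned URL or raise ValueError (hypothetical helper)."""
    candidate = raw.strip()
    parsed = urlparse(candidate)
    # Assumption: only http and https targets can be audited.
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"unsupported or missing scheme: {raw!r}")
    if not parsed.netloc:
        raise ValueError(f"missing host: {raw!r}")
    # Drop trailing slashes so equivalent URLs compare equal.
    return candidate.rstrip("/")
```

A route handler could call this first and convert the `ValueError` into a 4xx response.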
|
||||
|
||||
### Step 3: Test
|
||||
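```bash
# Step 2 declares website_url as a required parameter; assuming a FastAPI
# router, a plain str parameter is read from the query string.
curl -X POST "http://localhost:8000/api/seo-tools/enterprise/complete-audit?website_url=https://example.com"
```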
|
||||
|
||||
### Step 4: Implement
|
||||
Fill in business logic based on requirements in `PHASE2A_NEXT_STEPS.md`
|
||||
|
||||
---
|
||||
|
||||
## 📊 Timeline at a Glance
|
||||
|
||||
```
Week 1: Phase 2A.1 Backend Core      [████░░░░░░░░░░░░░░░░] 20%
Week 2: Phase 2A.2 LLM Integration   [████████░░░░░░░░░░░░] 40%
Week 3: Phase 2A.3 Infrastructure    [████████████░░░░░░░░] 60%
Week 4: Phase 2A.4 Testing           [████████████████░░░░] 80%
Week 5: Phase 2A.5 Deployment        [████████████████████] 100%

Target Completion: June 28, 2026
```
|
||||
|
||||
---
|
||||
|
||||
## ✨ Key Metrics
|
||||
|
||||
| Metric | Current | Target | Status |
|--------|---------|--------|--------|
| Frontend Complete | 100% | 100% | ✅ On Track |
| Backend Complete | 0% | 100% | 🔴 Blocked |
| Test Coverage | - | 80% | ⏳ Pending |
| Performance | - | <5s | ⏳ Pending |
| Bugs | 0 | 0 | ✅ On Track |
| Timeline | Week 1/5 | Week 5/5 | 🟡 At Risk |
|
||||
|
||||
---
|
||||
|
||||
## 💬 Quick Q&A
|
||||
|
||||
**Q: Is the frontend ready to ship?**
|
||||
A: No. The frontend code is complete, but the backend endpoints it calls are not implemented yet.
|
||||
|
||||
**Q: How long until production?**
|
||||
A: 5 weeks if we start Phase 2A.1 TODAY.
|
||||
|
||||
**Q: What's blocking us?**
|
||||
A: Backend implementation not started.
|
||||
|
||||
**Q: How many developers needed?**
|
||||
A: 2-3 backend developers for next 5 weeks.
|
||||
|
||||
**Q: Can we test the frontend?**
|
||||
A: Yes, with mock data, but end-to-end testing is not possible until the backend exists.
|
||||
|
||||
**Q: What if we delay Phase 2A.1?**
|
||||
A: Timeline pushes back 1 week per week of delay.
|
||||
|
||||
**Q: Is there technical debt?**
|
||||
A: No, frontend is clean and production-ready.
|
||||
|
||||
**Q: What's the biggest risk?**
|
||||
A: Backend implementation not starting immediately.
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Next Steps (24 Hours)
|
||||
|
||||
1. **Discuss** this review with team
|
||||
2. **Allocate** 2-3 backend developers
|
||||
3. **Setup** development environment
|
||||
4. **Assign** Phase 2A.1 tasks
|
||||
5. **Start** implementation
|
||||
|
||||
---
|
||||
|
||||
## 📞 Need More Details?
|
||||
|
||||
| Topic | Document |
|-------|----------|
| Component Details | PHASE2A_INTEGRATION_GUIDE.md |
| Backend Blueprint | PHASE2A_NEXT_STEPS.md |
| Timeline & Resources | PHASE2A_IMPLEMENTATION_REVIEW.md |
| Real-time Status | PHASE2A_STATUS_DASHBOARD.md |
| Compilation Issues | COMPILATION_FIXES.md |
|
||||
|
||||
---
|
||||
|
||||
## ✅ Sign-Off Checklist
|
||||
|
||||
- [ ] Reviewed frontend completion status
|
||||
- [ ] Understand backend requirements
|
||||
- [ ] Aware of 5-week timeline
|
||||
- [ ] Know Phase 2A.1 is blocking factor
|
||||
- [ ] Ready to allocate resources
|
||||
- [ ] Agreed to start immediately
|
||||
|
||||
---
|
||||
|
||||
**Status:** Frontend Ready ✅ | Backend Needed 🔴
|
||||
**Action:** Start Phase 2A.1 TODAY
|
||||
**Contact:** Check documentation for details
|
||||
117
ToBeMigrated/ai_marketing_tools/ai_backlinker/README.md
Normal file
@@ -0,0 +1,117 @@
|
||||
---
|
||||
|
||||
# AI Backlinking Tool
|
||||
|
||||
## Overview
|
||||
|
||||
The `ai_backlinking.py` module is part of the [AI-Writer](https://github.com/AJaySi/AI-Writer) project. It simplifies and automates the process of finding and securing backlink opportunities. Using AI, the tool performs web research, extracts contact information, and sends personalized outreach emails for guest posting opportunities, making it an essential tool for content writers, digital marketers, and solopreneurs.
|
||||
|
||||
---
|
||||
|
||||
## Key Features
|
||||
|
||||
| Feature | Description |
|---------|-------------|
| **Automated Web Scraping** | Extract guest post opportunities, contact details, and website insights. |
| **AI-Powered Emails** | Create personalized outreach emails tailored to target websites. |
| **Email Automation** | Integrate with platforms like Gmail or SendGrid for streamlined communication. |
| **Lead Management** | Track email status (sent, replied, successful) and follow up efficiently. |
| **Batch Processing** | Handle multiple keywords and queries simultaneously. |
| **AI-Driven Follow-Up** | Automate polite reminders if there's no response. |
| **Reports and Analytics** | View performance metrics like email open rates and backlink success rates. |
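
The lead-management and follow-up rows above can be illustrated with a small data model. This is a sketch only: `LeadStatus`, `Lead`, and `leads_due_for_follow_up` are hypothetical names that do not exist in `ai_backlinking.py`, the three statuses come from the table, and the seven-day default wait is an assumption chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import List


class LeadStatus(Enum):
    SENT = "sent"
    REPLIED = "replied"
    SUCCESSFUL = "successful"


@dataclass
class Lead:
    email: str
    status: LeadStatus
    sent_at: datetime
    follow_up_sent: bool = False


def leads_due_for_follow_up(
    leads: List[Lead],
    now: datetime,
    wait: timedelta = timedelta(days=7),  # assumed default interval
) -> List[Lead]:
    """Return leads that were emailed, never replied, and have waited long enough."""
    return [
        lead
        for lead in leads
        if lead.status is LeadStatus.SENT
        and not lead.follow_up_sent
        and now - lead.sent_at >= wait
    ]
```

Passing `now` in explicitly keeps the function deterministic, which makes the follow-up rule easy to unit test.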
|
||||
|
||||
---
|
||||
|
||||
## Workflow Breakdown
|
||||
|
||||
| Step | Action | Example |
|------|--------|---------|
| **Input Keywords** | Provide keywords for backlinking opportunities. | *E.g., "AI tools", "SEO strategies", "content marketing."* |
| **Generate Search Queries** | Automatically create queries for search engines. | *E.g., "AI tools + 'write for us'" or "content marketing + 'submit a guest post.'"* |
| **Web Scraping** | Collect URLs, email addresses, and content details from target websites. | Extract "editor@contentblog.com" from "https://contentblog.com/write-for-us". |
| **Compose Outreach Emails** | Use AI to draft personalized emails based on scraped website data. | Email tailored to "Content Blog" discussing "AI tools for better content writing." |
| **Automated Email Sending** | Review and send emails or fully automate the process. | Send emails through Gmail or other SMTP services. |
| **Follow-Ups** | Automate follow-ups for non-responsive contacts. | A polite reminder email sent 7 days later. |
| **Track and Log Results** | Monitor sent emails, responses, and backlink placements. | View logs showing responses and backlink acquisition rate. |
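
The web-scraping step in the table above pulls an address such as `editor@contentblog.com` out of page text. The module's own `extract_contact_info()` is still a placeholder, so the following is just one possible approach, assuming the scraped page is available as plain text; `EMAIL_PATTERN` and `extract_emails` are hypothetical names, and the regular expression is deliberately simple and will miss obfuscated addresses.

```python
import re
from typing import List

# Pragmatic pattern for common addresses; not a full RFC 5322 parser.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(page_text: str) -> List[str]:
    """Return unique addresses in first-seen order, lowercased."""
    seen = {}
    for match in EMAIL_PATTERN.findall(page_text):
        # dict preserves insertion order, so this de-duplicates stably.
        seen.setdefault(match.lower(), None)
    return list(seen)
```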
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- **Python Version**: 3.6 or higher.
|
||||
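- **Required Packages**: `googlesearch-python`, `loguru`. The `smtplib`, `imaplib`, and `email` modules used by the tool are part of the Python standard library and need no installation.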
|
||||
|
||||
---
|
||||
|
||||
## Installation
|
||||
|
||||
1. Clone the repository:
   ```bash
   git clone https://github.com/AJaySi/AI-Writer.git
   cd AI-Writer
   ```

2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
|
||||
|
||||
---
|
||||
|
||||
## Example Usage
|
||||
|
||||
Here’s a quick example of how to use the tool:
|
||||
|
||||
```python
# Import path matches the one used by the Streamlit UI in this change set.
from lib.ai_marketing_tools.ai_backlinker.ai_backlinking import main_backlinking_workflow

# Email configurations
smtp_config = {
    'server': 'smtp.gmail.com',
    'port': 587,
    'user': 'your_email@gmail.com',
    'password': 'your_password'
}

imap_config = {
    'server': 'imap.gmail.com',
    'user': 'your_email@gmail.com',
    'password': 'your_password'
}

# Proposal details
user_proposal = {
    'user_name': 'Your Name',
    'user_email': 'your_email@gmail.com',
    'topic': 'Proposed guest post topic'
}

# Keywords to search
keywords = ['AI tools', 'SEO strategies', 'content marketing']

# Start the workflow
main_backlinking_workflow(keywords, smtp_config, imap_config, user_proposal)
```
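
The example above hardcodes credentials for brevity. A common alternative is to read them from environment variables so they stay out of source control. The sketch below assumes that approach; `smtp_config_from_env` and the `BACKLINK_SMTP_*` variable names are hypothetical and are not read anywhere in the existing module.

```python
import os
from typing import Dict, Mapping


def smtp_config_from_env(env: Mapping[str, str] = os.environ) -> Dict[str, object]:
    """Build the smtp_config dict from environment variables (hypothetical names)."""
    required = ("BACKLINK_SMTP_USER", "BACKLINK_SMTP_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return {
        # Server and port fall back to the Gmail values used in the example.
        "server": env.get("BACKLINK_SMTP_SERVER", "smtp.gmail.com"),
        "port": int(env.get("BACKLINK_SMTP_PORT", "587")),
        "user": env["BACKLINK_SMTP_USER"],
        "password": env["BACKLINK_SMTP_PASSWORD"],
    }
```

The returned dict has the same keys as `smtp_config`, so it could be passed to `main_backlinking_workflow` unchanged.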
|
||||
|
||||
---
|
||||
|
||||
## Core Functions
|
||||
|
||||
| Function | Purpose |
|----------|---------|
| `generate_search_queries(keyword)` | Create search queries to find guest post opportunities. |
| `find_backlink_opportunities(keyword)` | Scrape websites for backlink opportunities. |
| `compose_personalized_email()` | Draft outreach emails using AI insights and website data. |
| `send_email()` | Send emails using SMTP configurations. |
| `check_email_responses()` | Monitor inbox for replies using IMAP. |
| `send_follow_up_email()` | Automate polite reminders to non-responsive contacts. |
| `log_sent_email()` | Keep a record of all sent emails and responses. |
| `main_backlinking_workflow()` | Execute the complete backlinking workflow for multiple keywords. |
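
`log_sent_email()` appends the raw dict to `{keyword}_sent_emails.log`, which is hard to parse later and produces awkward file names for keywords containing spaces or slashes. One possible variant, shown only as a sketch, writes one JSON object per line and sanitizes the file name; `log_sent_email_jsonl`, the `.jsonl` suffix, and the `logged_at` field are assumptions, not existing behavior.

```python
import json
import re
from datetime import datetime, timezone
from pathlib import Path


def log_sent_email_jsonl(log_dir: Path, keyword: str, email_info: dict) -> Path:
    """Append one JSON record per sent email and return the log path."""
    # Replace characters that are unsafe in file names (spaces, slashes, etc.).
    safe_keyword = re.sub(r"[^A-Za-z0-9_-]+", "_", keyword).strip("_") or "keyword"
    log_path = log_dir / f"{safe_keyword}_sent_emails.jsonl"
    record = {
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "keyword": keyword,
        **email_info,
    }
    with log_path.open("a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(record) + "\n")
    return log_path
```

Each line can then be read back with `json.loads`, which would make the reporting features described earlier easier to build.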
|
||||
|
||||
---
|
||||
|
||||
## License
|
||||
|
||||
This project is licensed under the MIT License. For more details, refer to the [LICENSE](LICENSE) file.
|
||||
|
||||
---
|
||||
423
ToBeMigrated/ai_marketing_tools/ai_backlinker/ai_backlinking.py
Normal file
@@ -0,0 +1,423 @@
|
||||
#Problem:
|
||||
#
|
||||
#Finding websites for guest posts is manual, tedious, and time-consuming. Communicating with webmasters, maintaining conversations, and keeping track of backlinking opportunities is difficult to scale. Content creators and marketers struggle with discovering new websites and consistently getting backlinks.
|
||||
#Solution:
|
||||
#
|
||||
#An AI-powered backlinking app that automates web research, scrapes websites, extracts contact information, and sends personalized outreach emails to webmasters. This would simplify the entire process, allowing marketers to scale their backlinking strategy with minimal manual intervention.
|
||||
#Core Workflow:
|
||||
#
|
||||
# User Input:
|
||||
# Keyword Search: The user inputs a keyword (e.g., "AI writers").
|
||||
# Search Queries: Your app will append various search strings to this keyword to find backlinking opportunities (e.g., "AI writers + 'Write for Us'").
|
||||
#
|
||||
# Web Research:
|
||||
#
|
||||
# Use search engines or web scraping to run multiple queries:
|
||||
# Keyword + "Guest Contributor"
|
||||
# Keyword + "Add Guest Post"
|
||||
# Keyword + "Write for Us", etc.
|
||||
#
|
||||
# Collect URLs of websites that have pages or posts related to guest post opportunities.
|
||||
#
|
||||
# Scrape Website Data:
|
||||
# Contact Information Extraction:
|
||||
# Scrape the website for contact details (email addresses, contact forms, etc.).
|
||||
# Use natural language processing (NLP) to understand the type of content on the website and who the contact person might be (webmaster, editor, or guest post manager).
|
||||
# Website Content Understanding:
|
||||
# Scrape a summary of each website's content (e.g., their blog topics, categories, and tone) to personalize the email based on the site's focus.
|
||||
#
|
||||
# Personalized Outreach:
|
||||
# AI Email Composition:
|
||||
# Compose personalized outreach emails based on:
|
||||
# The scraped data (website content, topic focus, etc.).
|
||||
# The user's input (what kind of guest post or content they want to contribute).
|
||||
# Example: "Hi [Webmaster Name], I noticed that your site [Site Name] features high-quality content about [Topic]. I would love to contribute a guest post on [Proposed Topic] in exchange for a backlink."
|
||||
#
|
||||
# Automated Email Sending:
|
||||
# Review Emails (Optional HITL):
|
||||
# Let users review and approve the personalized emails before they are sent, or allow full automation.
|
||||
# Send Emails:
|
||||
# Automate email dispatch through an integrated SMTP or API (e.g., Gmail API, SendGrid).
|
||||
# Keep track of which emails were sent, bounced, or received replies.
|
||||
#
|
||||
# Scaling the Search:
|
||||
# Repeat for Multiple Keywords:
|
||||
# Run the same scraping and outreach process for a list of relevant keywords, either automatically suggested or uploaded by the user.
|
||||
# Keep Track of Sent Emails:
|
||||
# Maintain a log of all sent emails, responses, and follow-up reminders to avoid repetition or forgotten leads.
|
||||
#
|
||||
# Tracking Responses and Follow-ups:
|
||||
# Automated Responses:
|
||||
# If a website replies positively, AI can respond with predefined follow-up emails (e.g., proposing topics, confirming submission deadlines).
|
||||
# Follow-up Reminders:
|
||||
# If there's no reply, the system can send polite follow-up reminders at pre-set intervals.
|
||||
#
|
||||
#Key Features:
|
||||
#
|
||||
# Automated Web Scraping:
|
||||
# Scrape websites for guest post opportunities using a predefined set of search queries based on user input.
|
||||
# Extract key information like email addresses, names, and submission guidelines.
|
||||
#
|
||||
# Personalized Email Writing:
|
||||
# Leverage AI to create personalized emails using the scraped website information.
|
||||
# Tailor each email to the tone, content style, and focus of the website.
|
||||
#
|
||||
# Email Sending Automation:
|
||||
# Integrate with email platforms (e.g., Gmail, SendGrid, or custom SMTP).
|
||||
# Send automated outreach emails with the ability for users to review first (HITL - Human-in-the-loop) or automate completely.
|
||||
#
|
||||
# Customizable Email Templates:
|
||||
# Allow users to customize or choose from a set of email templates for different types of outreach (e.g., guest post requests, follow-up emails, submission offers).
|
||||
#
|
||||
# Lead Tracking and Management:
|
||||
# Track all emails sent, monitor replies, and keep track of successful backlinks.
|
||||
# Log each lead's status (e.g., emailed, responded, no reply) to manage future interactions.
|
||||
#
|
||||
# Multiple Keywords/Queries:
|
||||
# Allow users to run the same process for a batch of keywords, automatically generating relevant search queries for each.
|
||||
#
|
||||
# AI-Driven Follow-Up:
|
||||
# Schedule follow-up emails if there is no response after a specified period.
|
||||
#
|
||||
# Reports and Analytics:
|
||||
# Provide users with reports on how many emails were sent, opened, replied to, and successful backlink placements.
|
||||
#
|
||||
#Advanced Features (for Scaling and Optimization):
|
||||
#
|
||||
# Domain Authority Filtering:
|
||||
# Use SEO APIs (e.g., Moz, Ahrefs) to filter websites based on their domain authority or backlink strength.
|
||||
# Prioritize high-authority websites to maximize the impact of backlinks.
|
||||
#
|
||||
# Spam Detection:
|
||||
# Use AI to detect and avoid spammy or low-quality websites that might harm the user's SEO.
|
||||
#
|
||||
# Contact Form Auto-Fill:
|
||||
# If the site only offers a contact form (without email), automatically fill and submit the form with AI-generated content.
|
||||
#
|
||||
# Dynamic Content Suggestions:
|
||||
# Suggest guest post topics based on the website's focus, using NLP to analyze the site's existing content.
|
||||
#
|
||||
# Bulk Email Support:
|
||||
# Allow users to bulk-send outreach emails while still personalizing each message for scalability.
|
||||
#
|
||||
# AI Copy Optimization:
|
||||
# Use copywriting AI to optimize email content, adjusting tone and CTA based on the target audience.
|
||||
#
|
||||
#Challenges and Considerations:
|
||||
#
|
||||
# Legal Compliance:
|
||||
# Ensure compliance with anti-spam laws (e.g., CAN-SPAM, GDPR) by including unsubscribe options or manual email approval.
|
||||
#
|
||||
# Scraping Limits:
|
||||
# Be mindful of scraping limits on certain websites and employ smart throttling or use API-based scraping for better reliability.
|
||||
#
|
||||
# Deliverability:
|
||||
# Ensure emails are delivered properly without landing in spam folders by integrating proper email authentication (SPF, DKIM) and using high-reputation SMTP servers.
|
||||
#
|
||||
# Maintaining Email Personalization:
|
||||
# Striking the balance between automating the email process and keeping each message personal enough to avoid being flagged as spam.
|
||||
#
|
||||
#Technology Stack:
|
||||
#
|
||||
# Web Scraping: BeautifulSoup, Scrapy, or Puppeteer for scraping guest post opportunities and contact information.
|
||||
# Email Automation: Integrate with Gmail API, SendGrid, or Mailgun for sending emails.
|
||||
# NLP for Personalization: GPT-based models for email generation and web content understanding.
|
||||
# Frontend: React or Vue for the user interface.
|
||||
# Backend: Python/Node.js with Flask or Express for the API and automation logic.
|
||||
# Database: MongoDB or PostgreSQL to track leads, emails, and responses.
|
||||
#
|
||||
#This solution will significantly streamline the backlinking process by automating the most tedious tasks, from finding sites to personalizing outreach, enabling marketers to focus on content creation and high-level strategies.
|
||||
|
||||
|
||||
import email
import imaplib
import smtplib
import sys
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# from googlesearch import search  # Temporarily disabled for future enhancement
from loguru import logger

from lib.ai_web_researcher.firecrawl_web_crawler import scrape_url, scrape_website
from lib.gpt_providers.text_generation.main_text_generation import llm_text_gen

# Configure logger
logger.remove()
logger.add(
    sys.stdout,
    colorize=True,
    format="<level>{level}</level>|<green>{file}:{line}:{function}</green>| {message}",
)
|
||||
|
||||
def generate_search_queries(keyword):
    """
    Generate a list of search queries for finding guest post opportunities.

    Args:
        keyword (str): The keyword to base the search queries on.

    Returns:
        list: A list of search queries.
    """
    return [
        f"{keyword} + 'Guest Contributor'",
        f"{keyword} + 'Add Guest Post'",
        f"{keyword} + 'Guest Bloggers Wanted'",
        f"{keyword} + 'Write for Us'",
        f"{keyword} + 'Submit Guest Post'",
        f"{keyword} + 'Become a Guest Blogger'",
        f"{keyword} + 'guest post opportunities'",
        f"{keyword} + 'Submit article'",
    ]
|
||||
|
||||
def find_backlink_opportunities(keyword):
    """
    Find backlink opportunities by scraping websites based on search queries.

    Args:
        keyword (str): The keyword to search for backlink opportunities.

    Returns:
        list: A list of results from the scraped websites.
    """
    search_queries = generate_search_queries(keyword)
    results = []

    # Temporarily disabled Google search functionality
    # for query in search_queries:
    #     urls = search_for_urls(query)
    #     for url in urls:
    #         website_data = scrape_website(url)
    #         logger.info(f"Scraped Website content for {url}: {website_data}")
    #         if website_data:
    #             contact_info = extract_contact_info(website_data)
    #             logger.info(f"Contact details found for {url}: {contact_info}")

    # Placeholder return for now
    return []
|
||||
|
||||
def search_for_urls(query):
    """
    Search for URLs using Google search.

    Args:
        query (str): The search query.

    Returns:
        list: List of URLs found.
    """
    # Temporarily disabled Google search functionality
    # return list(search(query, num_results=10))
    return []
|
||||
|
||||
def compose_personalized_email(website_data, insights, user_proposal):
    """
    Compose a personalized outreach email using AI LLM based on website data, insights, and user proposal.

    Args:
        website_data (dict): The data of the website including metadata and contact info.
        insights (str): Insights generated by the LLM about the website.
        user_proposal (dict): The user's proposal for a guest post or content contribution.

    Returns:
        str: A personalized email message.
    """
    contact_name = website_data.get("contact_info", {}).get("name", "Webmaster")
    site_name = website_data.get("metadata", {}).get("title", "your site")
    proposed_topic = user_proposal.get("topic", "a guest post")
    user_name = user_proposal.get("user_name", "Your Name")
    user_email = user_proposal.get("user_email", "your_email@example.com")

    # Refined prompt for email generation
    email_prompt = f"""
    You are an AI assistant tasked with composing a highly personalized outreach email for guest posting.

    Contact Name: {contact_name}
    Website Name: {site_name}
    Proposed Topic: {proposed_topic}

    User Details:
    Name: {user_name}
    Email: {user_email}

    Website Insights: {insights}

    Please compose a professional and engaging email that includes:
    1. A personalized introduction addressing the recipient.
    2. A mention of the website's content focus.
    3. A proposal for a guest post.
    4. A call to action to discuss the guest post opportunity.
    5. A polite closing with user contact details.
    """

    return llm_text_gen(email_prompt)
|
||||
|
||||
def send_email(smtp_server, smtp_port, smtp_user, smtp_password, to_email, subject, body):
    """
    Send an email using an SMTP server.

    Args:
        smtp_server (str): The SMTP server address.
        smtp_port (int): The SMTP server port.
        smtp_user (str): The SMTP server username.
        smtp_password (str): The SMTP server password.
        to_email (str): The recipient's email address.
        subject (str): The email subject.
        body (str): The email body.

    Returns:
        bool: True if the email was sent successfully, False otherwise.
    """
    try:
        msg = MIMEMultipart()
        msg['From'] = smtp_user
        msg['To'] = to_email
        msg['Subject'] = subject
        msg.attach(MIMEText(body, 'plain'))

        server = smtplib.SMTP(smtp_server, smtp_port)
        server.starttls()
        server.login(smtp_user, smtp_password)
        server.send_message(msg)
        server.quit()

        logger.info(f"Email sent successfully to {to_email}")
        return True
    except Exception as e:
        logger.error(f"Failed to send email to {to_email}: {e}")
        return False
|
||||
|
||||
def extract_contact_info(website_data):
    """
    Extract contact information from website data.

    Args:
        website_data (dict): Scraped data from the website.

    Returns:
        dict: Extracted contact information such as name, email, etc.
    """
    # Placeholder for extracting contact information logic
    return {
        "name": website_data.get("contact", {}).get("name", "Webmaster"),
        "email": website_data.get("contact", {}).get("email", ""),
    }
|
||||
|
||||
def find_backlink_opportunities_for_keywords(keywords):
    """
    Find backlink opportunities for multiple keywords.

    Args:
        keywords (list): A list of keywords to search for backlink opportunities.

    Returns:
        dict: A dictionary with keywords as keys and a list of results as values.
    """
    all_results = {}
    for keyword in keywords:
        results = find_backlink_opportunities(keyword)
        all_results[keyword] = results
    return all_results
|
||||
|
||||
def log_sent_email(keyword, email_info):
    """
    Log the information of a sent email.

    Args:
        keyword (str): The keyword associated with the email.
        email_info (dict): Information about the sent email (e.g., recipient, subject, body).
    """
    with open(f"{keyword}_sent_emails.log", "a") as log_file:
        log_file.write(f"{email_info}\n")
|
||||
|
||||
def check_email_responses(imap_server, imap_user, imap_password):
    """
    Check email responses using an IMAP server.

    Args:
        imap_server (str): The IMAP server address.
        imap_user (str): The IMAP server username.
        imap_password (str): The IMAP server password.

    Returns:
        list: A list of email responses.
    """
    responses = []
    try:
        mail = imaplib.IMAP4_SSL(imap_server)
        mail.login(imap_user, imap_password)
        mail.select('inbox')

        status, data = mail.search(None, 'UNSEEN')
        mail_ids = data[0]
        id_list = mail_ids.split()

        for mail_id in id_list:
            status, data = mail.fetch(mail_id, '(RFC822)')
            msg = email.message_from_bytes(data[0][1])
            if msg.is_multipart():
                for part in msg.walk():
                    if part.get_content_type() == 'text/plain':
                        responses.append(part.get_payload(decode=True).decode())
            else:
                responses.append(msg.get_payload(decode=True).decode())

        mail.logout()
    except Exception as e:
        logger.error(f"Failed to check email responses: {e}")

    return responses
|
||||
|
||||
def send_follow_up_email(smtp_server, smtp_port, smtp_user, smtp_password, to_email, subject, body):
    """
    Send a follow-up email using an SMTP server.

    Args:
        smtp_server (str): The SMTP server address.
        smtp_port (int): The SMTP server port.
        smtp_user (str): The SMTP server username.
        smtp_password (str): The SMTP server password.
        to_email (str): The recipient's email address.
        subject (str): The email subject.
        body (str): The email body.

    Returns:
        bool: True if the email was sent successfully, False otherwise.
    """
    return send_email(smtp_server, smtp_port, smtp_user, smtp_password, to_email, subject, body)
|
||||
|
||||
def main_backlinking_workflow(keywords, smtp_config, imap_config, user_proposal):
    """
    Main workflow for the AI-powered backlinking feature.

    Args:
        keywords (list): A list of keywords to search for backlink opportunities.
        smtp_config (dict): SMTP configuration for sending emails.
        imap_config (dict): IMAP configuration for checking email responses.
        user_proposal (dict): The user's proposal for a guest post or content contribution.

    Returns:
        None
    """
    all_results = find_backlink_opportunities_for_keywords(keywords)

    for keyword, results in all_results.items():
        for result in results:
            email_body = compose_personalized_email(result, result['insights'], user_proposal)
            email_sent = send_email(
                smtp_config['server'],
                smtp_config['port'],
                smtp_config['user'],
                smtp_config['password'],
                result['contact_info']['email'],
                f"Guest Post Proposal for {result['metadata']['title']}",
                email_body
            )
            if email_sent:
                log_sent_email(keyword, {
                    "to": result['contact_info']['email'],
                    "subject": f"Guest Post Proposal for {result['metadata']['title']}",
                    "body": email_body
                })

    responses = check_email_responses(imap_config['server'], imap_config['user'], imap_config['password'])
    for response in responses:
        # TBD : Process and possibly send follow-up emails based on responses
        pass
|
||||
@@ -0,0 +1,60 @@
|
||||
import streamlit as st
|
||||
import pandas as pd
|
||||
from st_aggrid import AgGrid, GridOptionsBuilder, GridUpdateMode
|
||||
from lib.ai_marketing_tools.ai_backlinker.ai_backlinking import find_backlink_opportunities, compose_personalized_email
|
||||
|
||||
|
||||
# Streamlit UI function
|
||||
def backlinking_ui():
|
||||
st.title("AI Backlinking Tool")
|
||||
|
||||
# Step 1: Get user inputs
|
||||
keyword = st.text_input("Enter a keyword", value="technology")
|
||||
|
||||
# Step 2: Generate backlink opportunities
|
||||
if st.button("Find Backlink Opportunities"):
|
||||
if keyword:
|
||||
backlink_opportunities = find_backlink_opportunities(keyword)
|
||||
|
||||
# Convert results to a DataFrame for display
|
||||
df = pd.DataFrame(backlink_opportunities)
|
||||
|
||||
# Create a selectable table using st-aggrid
|
||||
gb = GridOptionsBuilder.from_dataframe(df)
|
||||
gb.configure_selection('multiple', use_checkbox=True, groupSelectsChildren=True)
|
||||
gridOptions = gb.build()
|
||||
|
||||
grid_response = AgGrid(
|
||||
df,
|
||||
gridOptions=gridOptions,
|
||||
update_mode=GridUpdateMode.SELECTION_CHANGED,
|
||||
height=200,
|
||||
width='100%'
|
||||
)
|
||||
|
||||
selected_rows = grid_response['selected_rows']
|
||||
|
||||
if selected_rows:
|
||||
st.write("Selected Opportunities:")
|
||||
st.table(pd.DataFrame(selected_rows))
|
||||
|
||||
# Step 3: Option to generate personalized emails for selected opportunities
|
||||
if st.button("Generate Emails for Selected Opportunities"):
|
||||
user_proposal = {
|
||||
"user_name": st.text_input("Your Name", value="John Doe"),
|
||||
"user_email": st.text_input("Your Email", value="john@example.com")
|
||||
}
|
||||
|
||||
emails = []
|
||||
for selected in selected_rows:
|
||||
insights = f"Insights based on content from {selected['url']}."
|
||||
email = compose_personalized_email(selected, insights, user_proposal)
|
||||
emails.append(email)
|
||||
|
||||
st.subheader("Generated Emails:")
|
||||
for email in emails:
|
||||
st.write(email)
|
||||
st.markdown("---")
|
||||
|
||||
else:
|
||||
st.error("Please enter a keyword.")
|
||||
@@ -350,28 +350,4 @@ If you encounter issues:
|
||||
|
||||
---
|
||||
|
||||
**Happy coding! 🎉**
|
||||
|
||||
## Backlink Outreach Migration Map

Canonical paths for the migrated backlinking modules:

- Router: `backend/routers/backlink_outreach.py`
- Service: `backend/services/backlink_outreach_service.py`
- Frontend API client: `frontend/src/api/backlinkOutreachApi.ts`
- Frontend store: `frontend/src/stores/backlinkOutreachStore.ts`
- Frontend UI integration: `frontend/src/components/SEODashboard/BacklinkOutreachModuleList.tsx`

Invoke from the backend:

- `GET /api/backlink-outreach/modules`
- `GET /api/backlink-outreach/query-templates?keyword=<keyword>`
- `GET /api/backlink-outreach/migration-coverage`
- `POST /api/backlink-outreach/discover` with JSON body: `{ "keyword": "...", "max_results": 10 }`
- `POST /api/backlink-outreach/policy-validate` to enforce compliance, suppression, and throttle rules before sending
- `GET /api/backlink-outreach/reporting` for a send-volume and conversion snapshot
- `POST /api/backlink-outreach/campaigns` and `GET /api/backlink-outreach/campaigns` for persisted campaign records (campaign-creator style storage flow)

The modules endpoint returns migration identifiers: `backlink`, `outreach`, and `guest_post`.
The query-template endpoint mirrors the legacy `generate_search_queries(...)` behavior from `ToBeMigrated/ai_marketing_tools/ai_backlinker/ai_backlinking.py`.
The migration-coverage endpoint summarizes which parts of the legacy prototype roadmap are already implemented and which are still planned.
|
||||
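As a minimal sketch of the `discover` call listed above, the snippet below builds the POST request with the documented JSON body. The base URL is an assumption, as is the helper name `build_discover_request`; any authentication the endpoint may require is left out because the README does not describe it.

```python
import json
from urllib.request import Request

# Assumption: the backend is reachable here; adjust for your deployment.
BASE_URL = "http://localhost:8000"


def build_discover_request(keyword: str, max_results: int = 10) -> Request:
    """Build (but do not send) a POST request for the discover endpoint."""
    payload = json.dumps({"keyword": keyword, "max_results": max_results}).encode("utf-8")
    return Request(
        url=f"{BASE_URL}/api/backlink-outreach/discover",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_discover_request("technology", max_results=5)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; the response shape is not documented in this section, so it is not parsed here.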
**Happy coding! 🎉**
|
||||
@@ -18,9 +18,8 @@ CORE_ROUTER_REGISTRY = [
|
||||
{"name": "step3_research", "module": "api.onboarding_utils.step3_routes", "attr": "router", "features": {"all", "core"}},
|
||||
{"name": "step4_assets", "module": "api.onboarding_utils.step4_asset_routes", "attr": "router", "features": {"all", "core", "podcast"}},
|
||||
{"name": "step4_persona", "module": "api.onboarding_utils.step4_persona_routes_optimized", "attr": "router", "features": {"all", "core"}},
|
||||
{"name": "gsc_auth", "module": "routers.gsc_auth", "attr": "router", "features": {"all", "core", "seo", "blog_writer"}},
|
||||
{"name": "wordpress", "module": "routers.wordpress", "attr": "router", "features": {"all", "core", "blog_writer"}},
|
||||
{"name": "wordpress_oauth", "module": "routers.wordpress_oauth", "attr": "router", "features": {"all", "core", "blog_writer"}},
|
||||
{"name": "gsc_auth", "module": "routers.gsc_auth", "attr": "router", "features": {"all", "core", "seo"}},
|
||||
{"name": "wordpress_oauth", "module": "routers.wordpress_oauth", "attr": "router", "features": {"all", "core"}},
|
||||
{"name": "bing_oauth", "module": "routers.bing_oauth", "attr": "router", "features": {"all", "core"}},
|
||||
{"name": "bing_analytics", "module": "routers.bing_analytics", "attr": "router", "features": {"all", "core"}},
|
||||
{"name": "bing_analytics_storage", "module": "routers.bing_analytics_storage", "attr": "router", "features": {"all", "core"}},
|
||||
@@ -45,8 +44,7 @@ CORE_ROUTER_REGISTRY = [
|
||||
OPTIONAL_ROUTER_REGISTRY = [
|
||||
{"name": "blog_writer", "module": "api.blog_writer.router", "attr": "router", "features": {"all", "blog_writer"}},
|
||||
{"name": "story_writer", "module": "api.story_writer.router", "attr": "router", "features": {"all", "story_writer"}},
|
||||
{"name": "wix", "module": "api.wix_routes", "attr": "router", "features": {"all", "blog_writer"}},
|
||||
{"name": "wix_test", "module": "api.wix_routes", "attr": "qa_router", "features": {"all"}},
|
||||
{"name": "wix", "module": "api.wix_routes", "attr": "router", "features": {"all"}},
|
||||
{"name": "blog_seo_analysis", "module": "api.blog_writer.seo_analysis", "attr": "router", "features": {"all", "blog_writer"}},
|
||||
{"name": "persona", "module": "api.persona_routes", "attr": "router", "features": {"all", "persona"}},
|
||||
{"name": "video_studio", "module": "api.video_studio.router", "attr": "router", "features": {"all", "video_studio"}},
|
||||
@@ -161,12 +159,6 @@ class RouterManager:
|
||||
logger.info(f"Including {group_name} routers with features: {enabled_features}...")
|
||||
|
||||
for entry in registry:
|
||||
if entry["name"] == "wix_test" and not self._should_include_wix_test_router():
|
||||
reason = "wix test routes disabled or running in production environment"
|
||||
self.skipped_routers.append({"name": entry["name"], "reason": reason})
|
||||
if verbose:
|
||||
logger.info(f"⏭️ Skipping {entry['name']}: {reason}")
|
||||
continue
|
||||
if not self._should_include_router(entry, enabled_features):
|
||||
reason = f"features {enabled_features} not matching {entry.get('features', set())}"
|
||||
self.skipped_routers.append({"name": entry["name"], "reason": reason})
|
||||
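The body of `_should_include_router` is not part of this diff. The skip message ("features ... not matching ...") suggests a set comparison, so the sketch below assumes a router is included when its `features` set shares at least one member with the enabled features. The function body and the trimmed registry entries are illustrative, not the project's code.

```python
from typing import Any, Dict, List, Set


def should_include_router(entry: Dict[str, Any], enabled_features: Set[str]) -> bool:
    """Hypothetical check: include the router if any of its features is enabled."""
    return bool(entry.get("features", set()) & enabled_features)


# Trimmed registry entries (module/attr keys omitted for brevity).
registry: List[Dict[str, Any]] = [
    {"name": "blog_writer", "features": {"all", "blog_writer"}},
    {"name": "story_writer", "features": {"all", "story_writer"}},
    {"name": "wix", "features": {"all"}},
]

enabled = {"blog_writer"}
included = [e["name"] for e in registry if should_include_router(e, enabled)]
skipped = [e["name"] for e in registry if not should_include_router(e, enabled)]
print(included, skipped)
```

Under this assumption, narrowing the `wix` entry to `{"all"}` means it loads only when the `all` feature is enabled, no longer alongside `blog_writer`.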
@@ -186,13 +178,6 @@ class RouterManager:
|
||||
except Exception as e:
|
||||
logger.error(f"❌ Error including {group_name} routers: {e}")
|
||||
return False
|
||||
|
||||
    @staticmethod
    def _should_include_wix_test_router() -> bool:
        environment = (os.getenv("ENVIRONMENT") or os.getenv("APP_ENV") or "development").strip().lower()
        is_production = environment in {"prod", "production"}
        wix_test_enabled = os.getenv("WIX_TEST_ROUTES_ENABLED", "false").lower() in {"1", "true", "yes", "on"}
        return wix_test_enabled and not is_production
|
||||
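To make the gating rule in `_should_include_wix_test_router` easier to check, here is a sketch of the same logic that takes the environment as a mapping and does not read `os.environ` directly. The parameter `env` and the constant `TRUTHY` are introduced for illustration only.

```python
from typing import Mapping

TRUTHY = {"1", "true", "yes", "on"}


def should_include_wix_test_router(env: Mapping[str, str]) -> bool:
    """Same rule as the method above, with the environment injected for testing."""
    # An empty ENVIRONMENT value falls through to APP_ENV, then to "development".
    environment = (env.get("ENVIRONMENT") or env.get("APP_ENV") or "development").strip().lower()
    is_production = environment in {"prod", "production"}
    wix_test_enabled = env.get("WIX_TEST_ROUTES_ENABLED", "false").lower() in TRUTHY
    return wix_test_enabled and not is_production


print(should_include_wix_test_router({"WIX_TEST_ROUTES_ENABLED": "on"}))
print(should_include_wix_test_router({"WIX_TEST_ROUTES_ENABLED": "on", "ENVIRONMENT": "production"}))
```

The test routes are opt-in: they stay off unless the flag is set, and the flag is ignored in production.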
|
||||
def include_core_routers(self) -> bool:
|
||||
"""Include core application routers."""
|
||||
|
||||
@@ -38,15 +38,6 @@ MIME_MAP = {
|
||||
}
|
||||
|
||||
|
||||
def _verify_ownership(url_user_id: str, current_user: Dict[str, Any]) -> str:
    """Verify the URL user_id matches the authenticated user. Returns sanitized user_id."""
    raw = current_user.get("id") or current_user.get("user_id") or current_user.get("clerk_user_id")
    authed_id = str(raw) if raw else ""
    if not authed_id or sanitize_user_id(url_user_id) != sanitize_user_id(authed_id):
        raise HTTPException(status_code=403, detail="Access denied: user mismatch")
    return sanitize_user_id(url_user_id)
|
||||
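The ownership check in `_verify_ownership` can be exercised outside FastAPI with a small stand-in. In this sketch `sanitize_user_id` is a hypothetical replacement for the project's helper (its real behavior is not shown in this diff), and `PermissionError` stands in for the 403 `HTTPException`.

```python
import re
from typing import Any, Dict


def sanitize_user_id(user_id: str) -> str:
    """Hypothetical stand-in: keep only characters safe for a directory name."""
    return re.sub(r"[^A-Za-z0-9_-]", "_", user_id)


def verify_ownership(url_user_id: str, current_user: Dict[str, Any]) -> str:
    """Return the sanitized URL user id, or raise if it is not the caller's id."""
    raw = current_user.get("id") or current_user.get("user_id") or current_user.get("clerk_user_id")
    authed_id = str(raw) if raw else ""
    if not authed_id or sanitize_user_id(url_user_id) != sanitize_user_id(authed_id):
        raise PermissionError("Access denied: user mismatch")
    return sanitize_user_id(url_user_id)


print(verify_ownership("user_123", {"clerk_user_id": "user_123"}))
```

Without a check of this kind, any authenticated user could request another user's assets by changing the `user_id` path segment.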
|
||||
|
||||
def _resolve_asset_path(user_id: str, category: str, filename: str) -> Path:
|
||||
"""Resolve asset path in user workspace with path-traversal protection."""
|
||||
safe_user_id = sanitize_user_id(user_id)
|
||||
@@ -73,19 +64,13 @@ async def serve_avatar(
|
||||
filename: str,
|
||||
current_user: Dict[str, Any] = Depends(get_current_user_with_query_token),
|
||||
):
|
||||
"""Serve avatar images. Supports auth via Authorization header or ?token= query param.
|
||||
Falls back to images/ directory for backward compatibility with old asset library entries."""
|
||||
"""Serve avatar images. Supports auth via Authorization header or ?token= query param."""
|
||||
require_authenticated_user(current_user)
|
||||
_verify_ownership(user_id, current_user)
|
||||
|
||||
safe_filename = os.path.basename(filename)
|
||||
file_path = _resolve_asset_path(user_id, "avatars", safe_filename)
|
||||
|
||||
if not file_path.exists():
|
||||
alt_path = _resolve_asset_path(user_id, "images", safe_filename)
|
||||
if alt_path.exists():
|
||||
media_type = _get_media_type(safe_filename)
|
||||
return FileResponse(alt_path, media_type=media_type)
|
||||
raise HTTPException(status_code=404, detail="Asset not found")
|
||||
|
||||
media_type = _get_media_type(safe_filename)
|
||||
@@ -105,7 +90,6 @@ async def serve_voice_sample(
|
||||
which cannot send Authorization headers.
|
||||
"""
|
||||
require_authenticated_user(current_user)
|
||||
_verify_ownership(user_id, current_user)
|
||||
|
||||
safe_filename = os.path.basename(filename)
|
||||
file_path = _resolve_asset_path(user_id, "voice_samples", safe_filename)
|
||||
@@ -117,24 +101,4 @@ async def serve_voice_sample(
|
||||
media_type = _get_media_type(safe_filename)
|
||||
file_size = file_path.stat().st_size
|
||||
logger.warning(f"[Assets] Serving voice sample: {safe_filename} ({media_type}, {file_size} bytes)")
|
||||
return FileResponse(file_path, media_type=media_type)
|
||||
|
||||
|
||||
@router.get("/{user_id}/images/{filename}")
|
||||
async def serve_image(
|
||||
user_id: str,
|
||||
filename: str,
|
||||
current_user: Dict[str, Any] = Depends(get_current_user_with_query_token),
|
||||
):
|
||||
"""Serve generated/uploaded images. Supports auth via Authorization header or ?token= query param."""
|
||||
require_authenticated_user(current_user)
|
||||
_verify_ownership(user_id, current_user)
|
||||
|
||||
safe_filename = os.path.basename(filename)
|
||||
file_path = _resolve_asset_path(user_id, "images", safe_filename)
|
||||
|
||||
if not file_path.exists():
|
||||
raise HTTPException(status_code=404, detail="Asset not found")
|
||||
|
||||
media_type = _get_media_type(safe_filename)
|
||||
return FileResponse(file_path, media_type=media_type)
|
||||
@@ -9,12 +9,10 @@ from fastapi import APIRouter, HTTPException, Depends
|
||||
from typing import Any, Dict, List, Optional
|
||||
from pydantic import BaseModel, Field
|
||||
from loguru import logger
|
||||
from datetime import datetime
|
||||
from middleware.auth_middleware import get_current_user
|
||||
from sqlalchemy.orm import Session
|
||||
from services.database import get_db as get_db_dependency
|
||||
from utils.text_asset_tracker import save_and_track_text_content
|
||||
from models.content_asset_models import AssetType, AssetSource
|
||||
|
||||
from models.blog_models import (
|
||||
BlogResearchRequest,
|
||||
@@ -38,7 +36,6 @@ from models.blog_models import (
|
||||
from services.blog_writer.blog_service import BlogWriterService
|
||||
from services.blog_writer.seo.blog_seo_recommendation_applier import BlogSEORecommendationApplier
|
||||
from services.llm_providers.main_text_generation import llm_text_gen
|
||||
from services.content_asset_service import ContentAssetService
|
||||
from .task_manager import task_manager
|
||||
from .cache_manager import cache_manager
|
||||
from models.blog_models import MediumBlogGenerateRequest
|
||||
@@ -1263,233 +1260,3 @@ async def save_complete_blog_asset(
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to save complete blog asset: {e}")
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
||||
|
||||
|
||||
# ---------------------------------------
|
||||
# Blog Asset API (phase-by-phase saving via ContentAsset)
|
||||
# ---------------------------------------
|
||||
|
||||
|
||||
class BlogAssetCreateRequest(BaseModel):
|
||||
research_keywords: str = Field(..., max_length=2000, description="Research keywords / topic")
|
||||
topic: Optional[str] = Field(default=None, max_length=500)
|
||||
word_count_target: Optional[int] = Field(default=None, ge=100, le=20000)
|
||||
|
||||
|
||||
class BlogAssetUpdateRequest(BaseModel):
|
||||
phase: Optional[str] = Field(default=None, pattern=r"^(research|outline|content|seo|publish)$")
|
||||
topic: Optional[str] = Field(default=None, max_length=500)
|
||||
selected_title: Optional[str] = Field(default=None, max_length=500)
|
||||
word_count_target: Optional[int] = Field(default=None, ge=100, le=20000)
|
||||
research_data: Optional[Dict[str, Any]] = None
|
||||
outline_data: Optional[Dict[str, Any]] = None
|
||||
content_data: Optional[Dict[str, Any]] = None
|
||||
seo_data: Optional[Dict[str, Any]] = None
|
||||
publish_data: Optional[Dict[str, Any]] = None
|
||||
|
||||
|
||||
def _normalize_keywords(kw: str) -> str:
    """Normalize keywords for duplicate comparison."""
    return " ".join(sorted(kw.lower().split()))
|
||||
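A short illustration of what `_normalize_keywords` treats as a duplicate: case, word order, and extra whitespace are ignored, but repeated words are kept. The function body is copied from the diff under a public name; the sample strings are made up.

```python
def normalize_keywords(kw: str) -> str:
    """Lowercase, split on whitespace, sort the words, and rejoin with single spaces."""
    return " ".join(sorted(kw.lower().split()))


a = normalize_keywords("AI  Content Marketing")
b = normalize_keywords("marketing ai content")
print(a == b, repr(a))
```

Because repeated words survive normalization, `"ai ai"` and `"ai"` would still be stored as two different topics.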
|
||||
|
||||
@router.post("/asset", response_model=Dict[str, Any])
|
||||
async def create_blog_asset(
|
||||
request: BlogAssetCreateRequest,
|
||||
current_user: Dict[str, Any] = Depends(get_current_user),
|
||||
db: Session = Depends(get_db),
|
||||
):
|
||||
"""
|
||||
Create a blog ContentAsset on research start.
|
||||
Returns existing asset if duplicate keywords found (unique topics only).
|
||||
"""
|
||||
try:
|
||||
if not current_user:
|
||||
raise HTTPException(status_code=401, detail="Authentication required")
|
||||
user_id = str(current_user.get("id", ""))
|
||||
if not user_id:
|
||||
raise HTTPException(status_code=401, detail="Invalid user ID")
|
||||
|
||||
svc = ContentAssetService(db)
|
||||
normalized_kw = _normalize_keywords(request.research_keywords)
|
||||
|
||||
# Duplicate check — search existing blog assets for matching keywords
|
||||
existing_assets, _ = svc.get_user_assets(
|
||||
user_id=user_id,
|
||||
source_module=AssetSource.BLOG_WRITER,
|
||||
asset_type=AssetType.TEXT,
|
||||
limit=100,
|
||||
)
|
||||
for asset in existing_assets:
|
||||
meta = asset.asset_metadata or {}
|
||||
if meta.get("normalized_keywords") == normalized_kw:
|
||||
logger.info(f"Duplicate blog asset found: {asset.id}, returning existing")
|
||||
return {
|
||||
"success": True,
|
||||
"asset": _asset_to_response(asset),
|
||||
"existing": True,
|
||||
}
|
||||
|
||||
# Create new ContentAsset for this blog
|
||||
title = request.topic or request.research_keywords[:200]
|
||||
asset_metadata = {
|
||||
"phase": "research",
|
||||
"research_keywords": request.research_keywords,
|
||||
"normalized_keywords": normalized_kw,
|
||||
"word_count_target": request.word_count_target,
|
||||
"topic": request.topic,
|
||||
"research_data": None,
|
||||
"outline_data": None,
|
||||
"content_data": None,
|
||||
"seo_data": None,
|
||||
"publish_data": None,
|
||||
}
|
||||
asset = svc.create_asset(
|
||||
user_id=user_id,
|
||||
asset_type=AssetType.TEXT,
|
||||
source_module=AssetSource.BLOG_WRITER,
|
||||
filename=f"blog_{int(datetime.utcnow().timestamp())}.md",
|
||||
file_url="/api/blog/content/pending",
|
||||
title=title,
|
||||
description=f"Blog: {title}",
|
||||
tags=["blog", "research"],
|
||||
asset_metadata=asset_metadata,
|
||||
)
|
||||
logger.info(f"✅ Created blog asset: {asset.id}")
|
||||
return {
|
||||
"success": True,
|
||||
"asset": _asset_to_response(asset),
|
||||
"existing": False,
|
||||
}
|
||||
except HTTPException:
|
||||
raise
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to create blog asset: {e}")
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
||||
|
||||
|
||||
@router.put("/asset/{asset_id}", response_model=Dict[str, Any])
|
||||
async def update_blog_asset(
|
||||
asset_id: int,
|
||||
request: BlogAssetUpdateRequest,
|
||||
current_user: Dict[str, Any] = Depends(get_current_user),
|
||||
db: Session = Depends(get_db),
|
||||
):
|
||||
"""Update a blog asset's phase, metadata, and tags."""
|
||||
try:
|
||||
if not current_user:
|
||||
raise HTTPException(status_code=401, detail="Authentication required")
|
||||
user_id = str(current_user.get("id", ""))
|
||||
if not user_id:
|
||||
raise HTTPException(status_code=401, detail="Invalid user ID")
|
||||
|
||||
svc = ContentAssetService(db)
|
||||
asset = svc.get_asset_by_id(asset_id, user_id)
|
||||
if not asset:
|
||||
raise HTTPException(status_code=404, detail="Blog asset not found")
|
||||
|
||||
meta = dict(asset.asset_metadata or {})
|
||||
tags = list(asset.tags or [])
|
||||
|
||||
if request.phase is not None:
|
||||
meta["phase"] = request.phase
|
||||
# Update tags to reflect phase
|
||||
new_tags = [t for t in tags if t not in ("research", "outline", "content", "seo", "publish")]
|
||||
new_tags.append(request.phase)
|
||||
if "blog" not in new_tags:
|
||||
new_tags.append("blog")
|
||||
tags = new_tags
|
||||
|
||||
if request.topic is not None:
|
||||
meta["topic"] = request.topic
|
||||
if request.selected_title is not None:
|
||||
meta["selected_title"] = request.selected_title
|
||||
if request.word_count_target is not None:
|
||||
meta["word_count_target"] = request.word_count_target
|
||||
|
||||
for field in ("research_data", "outline_data", "content_data", "seo_data", "publish_data"):
|
||||
val = getattr(request, field, None)
|
||||
if val is not None:
|
||||
meta[field] = val
|
||||
|
||||
if meta.get("selected_title"):
|
||||
new_title = meta["selected_title"]
|
||||
elif meta.get("topic"):
|
||||
new_title = meta["topic"]
|
||||
else:
|
||||
new_title = asset.title or "Blog Post"
|
||||
|
||||
updated = svc.update_asset(
|
||||
asset_id=asset_id,
|
||||
user_id=user_id,
|
||||
title=new_title[:500],
|
||||
tags=tags,
|
||||
asset_metadata=meta,
|
||||
)
|
||||
if not updated:
|
||||
raise HTTPException(status_code=500, detail="Failed to update asset")
|
||||
|
||||
logger.info(f"✅ Updated blog asset {asset_id}: phase={meta.get('phase')}")
|
||||
return {"success": True, "asset": _asset_to_response(updated)}
|
||||
except HTTPException:
|
||||
raise
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to update blog asset {asset_id}: {e}")
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
||||
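The tag handling inside `update_blog_asset` keeps exactly one phase tag on the asset. The sketch below isolates that step as a pure function; the name `retag_for_phase` and the constant `PHASES` are introduced here for illustration.

```python
from typing import List

PHASES = ("research", "outline", "content", "seo", "publish")


def retag_for_phase(tags: List[str], phase: str) -> List[str]:
    """Drop any previous phase tag, add the new one, and ensure 'blog' is present."""
    new_tags = [t for t in tags if t not in PHASES]
    new_tags.append(phase)
    if "blog" not in new_tags:
        new_tags.append("blog")
    return new_tags


print(retag_for_phase(["blog", "research"], "outline"))
```

Non-phase tags are preserved in their original order, so user-defined tags survive phase changes.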
|
||||
|
||||
@router.get("/asset/{asset_id}", response_model=Dict[str, Any])
|
||||
async def get_blog_asset(
|
||||
asset_id: int,
|
||||
current_user: Dict[str, Any] = Depends(get_current_user),
|
||||
db: Session = Depends(get_db),
|
||||
):
|
||||
"""Get a blog asset with all phase data."""
|
||||
try:
|
||||
if not current_user:
|
||||
raise HTTPException(status_code=401, detail="Authentication required")
|
||||
user_id = str(current_user.get("id", ""))
|
||||
if not user_id:
|
||||
raise HTTPException(status_code=401, detail="Invalid user ID")
|
||||
|
||||
svc = ContentAssetService(db)
|
||||
asset = svc.get_asset_by_id(asset_id, user_id)
|
||||
if not asset:
|
||||
raise HTTPException(status_code=404, detail="Blog asset not found")
|
||||
|
||||
return {"success": True, "asset": _asset_to_response(asset, full=True)}
|
||||
except HTTPException:
|
||||
raise
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to get blog asset {asset_id}: {e}")
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
||||
|
||||
|
||||
def _asset_to_response(asset: Any, full: bool = False) -> Dict[str, Any]:
|
||||
"""Convert a ContentAsset to a blog asset response dict."""
|
||||
meta = asset.asset_metadata or {}
|
||||
resp: Dict[str, Any] = {
|
||||
"id": asset.id,
|
||||
"title": asset.title,
|
||||
"description": asset.description,
|
||||
"tags": asset.tags or [],
|
||||
"phase": meta.get("phase", "research"),
|
||||
"research_keywords": meta.get("research_keywords"),
|
||||
"topic": meta.get("topic"),
|
||||
"selected_title": meta.get("selected_title"),
|
||||
"word_count_target": meta.get("word_count_target"),
|
||||
"has_research": meta.get("research_data") is not None,
|
||||
"has_outline": meta.get("outline_data") is not None,
|
||||
"has_content": meta.get("content_data") is not None,
|
||||
"has_seo": meta.get("seo_data") is not None,
|
||||
"has_publish": meta.get("publish_data") is not None,
|
||||
"created_at": asset.created_at.isoformat() if asset.created_at else None,
|
||||
"updated_at": asset.updated_at.isoformat() if asset.updated_at else None,
|
||||
}
|
||||
if full:
|
||||
resp["research_data"] = meta.get("research_data")
|
||||
resp["outline_data"] = meta.get("outline_data")
|
||||
resp["content_data"] = meta.get("content_data")
|
||||
resp["seo_data"] = meta.get("seo_data")
|
||||
resp["publish_data"] = meta.get("publish_data")
|
||||
return resp
|
||||
|
||||
@@ -256,8 +256,7 @@ class TaskManager:
|
||||
self.task_storage[task_id]["status"] = "running"
|
||||
self.task_storage[task_id]["progress_messages"] = []
|
||||
|
||||
await self.update_progress(task_id, "📝 Alwrity is preparing your blog content — this usually takes 20–40 seconds.")
|
||||
await self.update_progress(task_id, "📦 Packaging your outline sections and research data...")
|
||||
await self.update_progress(task_id, "📦 Packaging outline and metadata...")
|
||||
|
||||
# Basic guard: respect global target words
|
||||
total_target = int(request.globalTargetWords or 1000)
|
||||
@@ -282,22 +281,16 @@ class TaskManager:
|
||||
# Check if result came from cache
|
||||
cache_hit = getattr(result, 'cache_hit', False)
|
||||
if cache_hit:
|
||||
await self.update_progress(task_id, "⚡ Found existing content in cache — no need to regenerate!")
|
||||
await self.update_progress(task_id, "⚡ Found cached content - loading instantly!")
|
||||
else:
|
||||
await self.update_progress(task_id, "🧠 AI is writing each section with research-backed insights and natural flow...")
|
||||
await self.update_progress(task_id, "✨ Polishing content — improving structure, readability, and transitions...")
|
||||
await self.update_progress(task_id, "🤖 Generated fresh content with AI...")
|
||||
await self.update_progress(task_id, "✨ Post-processing and assembling sections...")
|
||||
|
||||
# Mark completed
|
||||
self.task_storage[task_id]["status"] = "completed"
|
||||
self.task_storage[task_id]["result"] = result.dict()
|
||||
section_count = len(result.sections)
|
||||
total_words = sum(getattr(s, 'wordCount', 0) or 0 for s in result.sections)
|
||||
await self.update_progress(
|
||||
task_id,
|
||||
f"✅ Content generation complete! {section_count} sections written ({total_words} words). "
|
||||
"Next up: SEO Analysis to optimize your blog for search engines."
|
||||
)
|
||||
|
||||
await self.update_progress(task_id, f"✅ Generated {len(result.sections)} sections successfully.")
|
||||
|
||||
# Note: Blog content tracking is handled in the status endpoint
|
||||
# to ensure we have proper database session and user context
|
||||
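The longer completion message in this hunk sums `wordCount` across sections while tolerating sections where the attribute is missing or `None`. A minimal sketch of that expression, using `SimpleNamespace` objects as stand-ins for the real section models:

```python
from types import SimpleNamespace

# Stand-in sections: one with a count, one with None, one without the attribute.
sections = [SimpleNamespace(wordCount=250), SimpleNamespace(wordCount=None), SimpleNamespace()]

# getattr covers the missing attribute; "or 0" covers an explicit None.
total_words = sum(getattr(s, "wordCount", 0) or 0 for s in sections)
message = f"{len(sections)} sections written ({total_words} words)."
print(message)
```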
|
||||
|
||||
@@ -1,192 +0,0 @@
|
||||
"""
|
||||
Chart API — Shared chart generation endpoints for Blog Writer, Podcast Maker, etc.
|
||||
|
||||
Two modes:
|
||||
1. Explicit: POST /api/charts/generate with { chart_type, chart_data, title }
|
||||
2. AI-driven: POST /api/charts/generate with { text } → LLM infers chart_type + data
|
||||
|
||||
Both return { preview_url, chart_id, chart_type?, chart_data?, title? }
|
||||
"""
|
||||
|
||||
import uuid
|
||||
from pathlib import Path
|
||||
from typing import Dict, Any, Optional
|
||||
|
||||
from fastapi import APIRouter, Depends, HTTPException
|
||||
from fastapi.responses import FileResponse
|
||||
from pydantic import BaseModel, Field
|
||||
from loguru import logger
|
||||
|
||||
from middleware.auth_middleware import get_current_user, get_current_user_with_query_token
|
||||
from api.story_writer.utils.auth import require_authenticated_user
|
||||
from services.chart_service import get_chart_service, VALID_CHART_TYPES
|
||||
|
||||
|
||||
router = APIRouter(prefix="/api/charts", tags=["Charts"])
|
||||
|
||||
|
||||
class ChartGenerateRequest(BaseModel):
|
||||
"""Request for chart generation.
|
||||
|
||||
Provide either:
|
||||
- chart_type + chart_data (explicit mode), OR
|
||||
- text (AI inference mode — LLM determines chart_type + data)
|
||||
"""
|
||||
chart_data: Optional[Dict[str, Any]] = Field(
|
||||
default=None,
|
||||
description="Chart data dict (labels, values, before/after, etc.)"
|
||||
)
|
||||
chart_type: Optional[str] = Field(
|
||||
default=None,
|
||||
description=f"Chart type: {', '.join(VALID_CHART_TYPES)}"
|
||||
)
|
||||
title: str = Field(default="", description="Chart title")
|
||||
subtitle: Optional[str] = Field(default="", description="Optional subtitle")
|
||||
text: Optional[str] = Field(
|
||||
default=None,
|
||||
description="Text to infer chart from (AI mode). Mutually exclusive with chart_type+chart_data."
|
||||
)
|
||||
section_heading: Optional[str] = Field(
|
||||
default=None,
|
||||
description="Blog section heading for context (AI mode with research)"
|
||||
)
|
||||
section_key_points: Optional[list] = Field(
|
||||
default=None,
|
||||
description="Key points from the section (AI mode with research)"
|
||||
)
|
||||
|
||||
|
||||
class ChartGenerateResponse(BaseModel):
|
||||
"""Response for chart generation."""
|
||||
preview_url: str = ""
|
||||
chart_id: str = ""
|
||||
chart_type: Optional[str] = None
|
||||
chart_data: Optional[Dict[str, Any]] = None
|
||||
title: Optional[str] = None
|
||||
warnings: list = Field(default_factory=list, description="Pipeline warnings (e.g. Exa search failures)")
|
||||
|
||||
|
||||
@router.post("/generate", response_model=ChartGenerateResponse)
|
||||
async def generate_chart(
|
||||
request: ChartGenerateRequest,
|
||||
current_user: Dict[str, Any] = Depends(get_current_user),
|
||||
):
|
||||
"""
|
||||
Generate a chart PNG preview.
|
||||
|
||||
Two modes:
|
||||
1. Explicit: Provide chart_type + chart_data
|
||||
2. AI-driven: Provide text, and the LLM infers chart_type + chart_data
|
||||
"""
|
||||
user_id = require_authenticated_user(current_user)
|
||||
|
||||
try:
|
||||
chart_svc = get_chart_service(user_id=user_id)
|
||||
|
||||
if request.text and not request.chart_type:
|
||||
# AI inference mode
|
||||
logger.info(f"[Charts] AI inference mode for user {user_id}, text length={len(request.text)}")
|
||||
result = await chart_svc.generate_chart_from_text(
|
||||
text=request.text,
|
||||
user_id=user_id,
|
||||
section_heading=request.section_heading,
|
||||
section_key_points=request.section_key_points,
|
||||
)
|
||||
|
||||
if not result.get("path"):
|
||||
raise HTTPException(status_code=500, detail="Chart generation failed")
|
||||
|
||||
chart_id = result["chart_id"]
|
||||
filename = result.get("filename", f"chart_preview_{chart_id}.png")
|
||||
|
||||
return ChartGenerateResponse(
|
||||
preview_url=f"/api/charts/preview/{chart_id}/{filename}",
|
||||
chart_id=chart_id,
|
||||
chart_type=result.get("chart_type"),
|
||||
chart_data=result.get("chart_data"),
|
||||
title=result.get("title"),
|
||||
warnings=result.get("warnings", []),
|
||||
)
|
||||
|
||||
elif request.chart_type and request.chart_data:
|
||||
# Explicit mode
|
||||
chart_type = request.chart_type
|
||||
if chart_type not in VALID_CHART_TYPES:
|
||||
# Try normalizing aliases
|
||||
from services.chart_service import _normalize_chart_type
|
||||
chart_type = _normalize_chart_type(chart_type)
|
||||
if chart_type not in VALID_CHART_TYPES:
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail=f"Invalid chart_type. Must be one of: {VALID_CHART_TYPES}"
|
||||
)
|
||||
|
||||
logger.info(f"[Charts] Explicit mode: type={chart_type}, user={user_id}")
|
||||
|
||||
chart_id = uuid.uuid4().hex[:8]
|
||||
result = chart_svc.generate_chart(
|
||||
chart_data=request.chart_data,
|
||||
chart_type=chart_type,
|
||||
title=request.title,
|
||||
subtitle=request.subtitle or "",
|
||||
chart_id=chart_id,
|
||||
)
|
||||
|
||||
if not result.get("path"):
|
||||
raise HTTPException(status_code=500, detail="Chart generation failed — check chart_data format")
|
||||
|
||||
filename = result.get("filename", f"chart_preview_{chart_id}.png")
|
||||
|
||||
return ChartGenerateResponse(
|
||||
preview_url=f"/api/charts/preview/{chart_id}/{filename}",
|
||||
chart_id=chart_id,
|
||||
chart_type=chart_type,
|
||||
chart_data=request.chart_data,
|
||||
title=request.title,
|
||||
)
|
||||
|
||||
else:
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail="Provide either 'text' (AI mode) or 'chart_type' + 'chart_data' (explicit mode)"
|
||||
)
|
||||
|
||||
except HTTPException:
|
||||
raise
|
||||
except Exception as e:
|
||||
logger.error(f"[Charts] Generation failed: {e}")
|
||||
raise HTTPException(status_code=500, detail=f"Chart generation failed: {str(e)}")
|
||||
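The branching in `generate_chart` decides the mode from which fields are present. The sketch below reproduces only that decision; `select_mode` is a hypothetical helper, `"bar"` is an illustrative chart type (the contents of `VALID_CHART_TYPES` are not shown in this diff), and `ValueError` stands in for the 400 response.

```python
from typing import Any, Dict, Optional


def select_mode(
    text: Optional[str],
    chart_type: Optional[str],
    chart_data: Optional[Dict[str, Any]],
) -> str:
    """Return 'ai' or 'explicit' using the same conditions as the endpoint."""
    if text and not chart_type:
        return "ai"
    if chart_type and chart_data:
        return "explicit"
    raise ValueError("Provide either 'text' (AI mode) or 'chart_type' + 'chart_data' (explicit mode)")


print(select_mode("Sales grew 20% year over year", None, None))
print(select_mode(None, "bar", {"labels": ["Q1"], "values": [10]}))
```

Note that when `text` and `chart_type` are both supplied, the explicit branch wins and `text` is ignored, even though the field description calls the two modes mutually exclusive.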
|
||||
|
||||
@router.get("/preview/{chart_id}/{filename}")
|
||||
async def serve_chart_preview(
|
||||
chart_id: str,
|
||||
filename: str,
|
||||
current_user: Dict[str, Any] = Depends(get_current_user_with_query_token),
|
||||
):
|
||||
"""Serve chart preview PNG files. Auth via header or query token."""
|
||||
user_id = require_authenticated_user(current_user)
|
||||
|
||||
if ".." in filename or "/" in filename or "\\" in filename:
|
||||
raise HTTPException(status_code=400, detail="Invalid filename")
|
||||
|
||||
chart_svc = get_chart_service(user_id=user_id)
|
||||
file_path = chart_svc.get_chart_preview_path(chart_id)
|
||||
|
||||
if not file_path.exists():
|
||||
raise HTTPException(status_code=404, detail="Chart preview not found")
|
||||
|
||||
if not str(file_path.resolve()).startswith(str(chart_svc.output_dir.resolve())):
|
||||
raise HTTPException(status_code=403, detail="Access denied")
|
||||
|
||||
return FileResponse(
|
||||
path=str(file_path),
|
||||
media_type="image/png",
|
||||
filename=filename,
|
||||
)
|
||||
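The containment check in `serve_chart_preview` compares resolved paths as strings with `startswith`, which also accepts a sibling directory that merely shares the prefix. Whether that is reachable depends on `get_chart_preview_path`, which is not shown here. As a sketch of a stricter alternative, the helper below compares path components; `is_within` and the sample paths are assumptions for illustration.

```python
from pathlib import Path


def is_within(base_dir: Path, candidate: Path) -> bool:
    """True if candidate resolves to base_dir itself or to something beneath it."""
    base = base_dir.resolve()
    target = candidate.resolve()
    return target == base or base in target.parents


base = Path("/tmp/charts")
sibling = Path("/tmp/charts_evil/a.png")

# String prefix matching accepts the sibling directory; component matching does not.
prefix_accepts_sibling = str(sibling.resolve()).startswith(str(base.resolve()))
print(prefix_accepts_sibling, is_within(base, sibling))
```

On Python 3.9 and later, `target.is_relative_to(base)` expresses the same component-wise test.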
|
||||
|
||||
@router.get("/health")
|
||||
async def charts_health():
|
||||
"""Health check for Charts service."""
|
||||
return {"status": "ok", "service": "charts"}
|
||||
@@ -8,7 +8,7 @@ using Exa.ai integration, similar to the Exa.ai demo implementation.
|
||||
import time
|
||||
import logging
|
||||
from typing import Dict, Any
|
||||
from fastapi import APIRouter, HTTPException, BackgroundTasks, Depends
|
||||
from fastapi import APIRouter, HTTPException, BackgroundTasks
|
||||
from fastapi.responses import JSONResponse
|
||||
|
||||
from models.hallucination_models import (
|
||||
@@ -24,7 +24,6 @@ from models.hallucination_models import (
|
||||
AssessmentType
|
||||
)
|
||||
from services.hallucination_detector import HallucinationDetector
|
||||
from middleware.auth_middleware import get_current_user
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -35,7 +34,7 @@ router = APIRouter(prefix="/api/hallucination-detector", tags=["Hallucination De
|
||||
detector = HallucinationDetector()
|
||||
|
||||
@router.post("/detect", response_model=HallucinationDetectionResponse)
|
||||
async def detect_hallucinations(request: HallucinationDetectionRequest, current_user: Dict[str, Any] = Depends(get_current_user)) -> HallucinationDetectionResponse:
|
||||
async def detect_hallucinations(request: HallucinationDetectionRequest) -> HallucinationDetectionResponse:
|
||||
"""
|
||||
Detect hallucinations in the provided text.
|
||||
|
||||
@@ -55,10 +54,8 @@ async def detect_hallucinations(request: HallucinationDetectionRequest, current_
|
||||
try:
|
||||
logger.info(f"Starting hallucination detection for text of length: {len(request.text)}")
|
||||
|
||||
user_id = current_user.get("id")
|
||||
|
||||
# Perform hallucination detection
|
||||
result = await detector.detect_hallucinations(request.text, user_id=user_id)
|
||||
result = await detector.detect_hallucinations(request.text)
|
||||
|
||||
# Convert to response format
|
||||
claims = []
|
||||
@@ -71,7 +68,7 @@ async def detect_hallucinations(request: HallucinationDetectionRequest, current_
|
||||
text=source.get('text', ''),
|
||||
published_date=source.get('publishedDate'),
|
||||
author=source.get('author'),
|
||||
score=source.get('score') if source.get('score') is not None else 0.5
|
||||
score=source.get('score', 0.5)
|
||||
)
|
||||
for source in claim.supporting_sources
|
||||
]
|
||||
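The two `score=` variants in this hunk differ only when a source contains the key with an explicit `None`: `dict.get` applies its default only if the key is absent. A small sketch with made-up source dicts (the helper names are illustrative):

```python
from typing import Any, Dict, Optional


def score_with_default(source: Dict[str, Any]) -> Optional[float]:
    """dict.get default: used only when the 'score' key is missing."""
    return source.get("score", 0.5)


def score_with_none_guard(source: Dict[str, Any]) -> float:
    """Explicit guard: also replaces a present-but-None score."""
    score = source.get("score")
    return score if score is not None else 0.5


source_missing = {"url": "https://example.com"}
source_null = {"url": "https://example.com", "score": None}

print(score_with_default(source_null), score_with_none_guard(source_null))
```

If the response model requires a numeric `score`, only the guarded form is safe for sources that report `"score": null`.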
@@ -83,7 +80,7 @@ async def detect_hallucinations(request: HallucinationDetectionRequest, current_
|
||||
text=source.get('text', ''),
|
||||
published_date=source.get('publishedDate'),
|
||||
author=source.get('author'),
|
||||
score=source.get('score') if source.get('score') is not None else 0.5
|
||||
score=source.get('score', 0.5)
|
||||
)
|
||||
for source in claim.refuting_sources
|
||||
]
|
||||
@@ -116,8 +113,6 @@ async def detect_hallucinations(request: HallucinationDetectionRequest, current_
         return response
 
     except Exception as e:
-        if isinstance(e, HTTPException):
-            raise e
         logger.error(f"Error in hallucination detection: {str(e)}")
         processing_time = int((time.time() - start_time) * 1000)

@@ -179,7 +174,7 @@ async def extract_claims(request: ClaimExtractionRequest) -> ClaimExtractionResp
         )
 
 @router.post("/verify-claim", response_model=ClaimVerificationResponse)
-async def verify_claim(request: ClaimVerificationRequest, current_user: Dict[str, Any] = Depends(get_current_user)) -> ClaimVerificationResponse:
+async def verify_claim(request: ClaimVerificationRequest) -> ClaimVerificationResponse:
     """
     Verify a single claim against available sources.

@@ -197,10 +192,8 @@ async def verify_claim(request: ClaimVerificationRequest, current_user: Dict[str
     try:
         logger.info(f"Verifying claim: {request.claim[:100]}...")
 
-        user_id = current_user.get("id")
-
         # Verify the claim
-        claim_result = await detector._verify_claim(request.claim, user_id=user_id)
+        claim_result = await detector._verify_claim(request.claim)
 
         # Convert to response format
         supporting_sources = []
@@ -214,7 +207,7 @@ async def verify_claim(request: ClaimVerificationRequest, current_user: Dict[str
                     text=source.get('text', ''),
                     published_date=source.get('publishedDate'),
                     author=source.get('author'),
-                    score=source.get('score') if source.get('score') is not None else 0.5
+                    score=source.get('score', 0.5)
                 )
                 for source in claim_result.supporting_sources
             ]
@@ -226,7 +219,7 @@ async def verify_claim(request: ClaimVerificationRequest, current_user: Dict[str
                     text=source.get('text', ''),
                     published_date=source.get('publishedDate'),
                     author=source.get('author'),
-                    score=source.get('score') if source.get('score') is not None else 0.5
+                    score=source.get('score', 0.5)
                 )
                 for source in claim_result.refuting_sources
             ]
@@ -253,8 +246,6 @@ async def verify_claim(request: ClaimVerificationRequest, current_user: Dict[str
         return response
 
     except Exception as e:
-        if isinstance(e, HTTPException):
-            raise e
         logger.error(f"Error in claim verification: {str(e)}")
         processing_time = int((time.time() - start_time) * 1000)

@@ -282,21 +273,17 @@ async def health_check() -> HealthCheckResponse:
         HealthCheckResponse with service status and API availability
     """
     try:
-        from services.blog_writer.research.exa_provider import ExaResearchProvider
-        try:
-            exa_provider = ExaResearchProvider()
-            exa_available = bool(exa_provider.api_key)
-        except RuntimeError:
-            exa_available = False
-        llm_available = True # llm_text_gen handles provider selection via GPT_PROVIDER
+        # Check API availability
+        exa_available = bool(detector.exa_api_key)
+        openai_available = bool(detector.openai_api_key)
 
-        status = "healthy" if (exa_available and llm_available) else ("degraded" if exa_available or llm_available else "unhealthy")
+        status = "healthy" if (exa_available or openai_available) else "degraded"
 
         response = HealthCheckResponse(
             status=status,
             version="1.0.0",
             exa_api_available=exa_available,
-            openai_api_available=llm_available,
+            openai_api_available=openai_available,
             timestamp=time.strftime('%Y-%m-%dT%H:%M:%S')
         )

||||
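Review note on the `status` change in the `health_check` hunk above: the two expressions map the same pair of availability flags to different labels. The sketch below restates both as small functions so the mapping can be compared side by side; the function names are hypothetical and only the boolean logic is taken from the diff.

```python
def status_three_state(exa_available: bool, llm_available: bool) -> str:
    """Healthy only when both are available, unhealthy when neither is."""
    if exa_available and llm_available:
        return "healthy"
    return "degraded" if (exa_available or llm_available) else "unhealthy"


def status_two_state(exa_available: bool, openai_available: bool) -> str:
    """Healthy when at least one is available, otherwise degraded."""
    return "healthy" if (exa_available or openai_available) else "degraded"


# Print the full truth table for both variants.
for exa in (True, False):
    for other in (True, False):
        print(exa, other, status_three_state(exa, other), status_two_state(exa, other))
```

With the two-state form, a deployment missing one of the two providers still reports `healthy`, and the service never reports `unhealthy`.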
@@ -27,8 +27,6 @@ from services.subscription import UsageTrackingService, PricingService
 from models.subscription_models import APIProvider, UsageSummary
 from utils.asset_tracker import save_asset_to_library
 from utils.file_storage import save_file_safely, generate_unique_filename, sanitize_filename
-from services.content_asset_service import ContentAssetService
-from models.content_asset_models import ContentAsset
 
 
 router = APIRouter(prefix="/api/images", tags=["images"])
@@ -191,27 +189,44 @@ def generate(
|
||||
billing_period=current_period
|
||||
)
|
||||
db_track.add(summary)
|
||||
db_track.flush()
|
||||
db_track.flush() # Ensure summary is persisted before updating
|
||||
|
||||
# Get "before" state for unified log
|
||||
current_calls_before = getattr(summary, "stability_calls", 0) or 0
|
||||
new_calls = current_calls_before + 1
|
||||
|
||||
# Update provider-specific counters (stability for image generation)
|
||||
# Note: All image generation goes through STABILITY provider enum regardless of actual provider
|
||||
new_calls = current_calls_before + 1
|
||||
setattr(summary, "stability_calls", new_calls)
|
||||
logger.debug(f"[images.generate] Updated stability_calls: {current_calls_before} -> {new_calls}")
|
||||
|
||||
# Update totals
|
||||
old_total_calls = summary.total_calls or 0
|
||||
summary.total_calls = old_total_calls + 1
|
||||
logger.debug(f"[images.generate] Updated totals: calls {old_total_calls} -> {summary.total_calls}")
|
||||
|
||||
# Get plan details for unified log
|
||||
limits = pricing.get_user_limits(user_id)
|
||||
plan_name = limits.get('plan_name', 'unknown') if limits else 'unknown'
|
||||
tier = limits.get('tier', 'unknown') if limits else 'unknown'
|
||||
call_limit = limits['limits'].get("stability_calls", 0) if limits else 0
|
||||
|
||||
# Get image editing stats for unified log
|
||||
current_image_edit_calls = getattr(summary, "image_edit_calls", 0) or 0
|
||||
image_edit_limit = limits['limits'].get("image_edit_calls", 0) if limits else 0
|
||||
|
||||
# Get video stats for unified log
|
||||
current_video_calls = getattr(summary, "video_calls", 0) or 0
|
||||
video_limit = limits['limits'].get("video_calls", 0) if limits else 0
|
||||
|
||||
# Get audio stats for unified log
|
||||
current_audio_calls = getattr(summary, "audio_calls", 0) or 0
|
||||
audio_limit = limits['limits'].get("audio_calls", 0) if limits else 0
|
||||
# Only show ∞ for Enterprise tier when limit is 0 (unlimited)
|
||||
audio_limit_display = audio_limit if (audio_limit > 0 or tier != 'enterprise') else '∞'
|
||||
|
||||
logger.debug(f"[images.generate] Usage snapshot for logging: stability_calls={current_calls_before}, total_calls={summary.total_calls or 0}")
|
||||
db_track.commit()
|
||||
logger.info(f"[images.generate] ✅ Successfully tracked usage: user {user_id} -> stability -> {new_calls} calls")
|
||||
|
||||
# UNIFIED SUBSCRIPTION LOG - Shows before/after state in one message
|
||||
print(f"""
|
||||
@@ -950,19 +965,32 @@ def edit(
|
||||
billing_period=current_period
|
||||
)
|
||||
db_track.add(summary)
|
||||
db_track.flush()
|
||||
db_track.flush() # Ensure summary is persisted before updating
|
||||
|
||||
# Get "before" state for unified log
|
||||
current_calls_before = getattr(summary, "image_edit_calls", 0) or 0
|
||||
new_calls = current_calls_before + 1
|
||||
|
||||
# Update image editing counters (separate from image generation)
|
||||
new_calls = current_calls_before + 1
|
||||
setattr(summary, "image_edit_calls", new_calls)
|
||||
logger.debug(f"[images.edit] Updated image_edit_calls: {current_calls_before} -> {new_calls}")
|
||||
|
||||
# Update totals
|
||||
old_total_calls = summary.total_calls or 0
|
||||
summary.total_calls = old_total_calls + 1
|
||||
logger.debug(f"[images.edit] Updated totals: calls {old_total_calls} -> {summary.total_calls}")
|
||||
|
||||
# Get plan details for unified log
|
||||
limits = pricing.get_user_limits(user_id)
|
||||
plan_name = limits.get('plan_name', 'unknown') if limits else 'unknown'
|
||||
tier = limits.get('tier', 'unknown') if limits else 'unknown'
|
||||
call_limit = limits['limits'].get("image_edit_calls", 0) if limits else 0
|
||||
|
||||
# Get image generation stats for unified log
|
||||
current_image_gen_calls = getattr(summary, "stability_calls", 0) or 0
|
||||
image_gen_limit = limits['limits'].get("stability_calls", 0) if limits else 0
|
||||
|
||||
# Get video stats for unified log
|
||||
current_video_calls = getattr(summary, "video_calls", 0) or 0
|
||||
video_limit = limits['limits'].get("video_calls", 0) if limits else 0
|
||||
|
||||
@@ -972,7 +1000,8 @@ def edit(
|
||||
# Only show ∞ for Enterprise tier when limit is 0 (unlimited)
|
||||
audio_limit_display = audio_limit if (audio_limit > 0 or tier != 'enterprise') else '∞'
|
||||
|
||||
logger.debug(f"[images.edit] Usage snapshot for logging: image_edit_calls={current_calls_before}, total_calls={summary.total_calls or 0}")
|
||||
db_track.commit()
|
||||
logger.info(f"[images.edit] ✅ Successfully tracked usage: user {user_id} -> image_edit -> {new_calls} calls")
|
||||
|
||||
# UNIFIED SUBSCRIPTION LOG - Shows before/after state in one message
|
||||
print(f"""
|
||||
@@ -1024,45 +1053,20 @@ def edit(
|
||||
@router.get("/image-studio/images/{image_filename:path}")
|
||||
async def serve_image_studio_image(
|
||||
image_filename: str,
|
||||
current_user: Dict[str, Any] = Depends(get_current_user),
|
||||
db: Session = Depends(get_db),
|
||||
current_user: Dict[str, Any] = Depends(get_current_user)
|
||||
):
|
||||
"""Serve a generated or edited image from Image Studio.
|
||||
Verifies the authenticated user owns the image via asset library lookup."""
|
||||
"""Serve a generated or edited image from Image Studio."""
|
||||
try:
|
||||
if not current_user:
|
||||
raise HTTPException(status_code=401, detail="Authentication required")
|
||||
|
||||
user_id = current_user.get("id") or current_user.get("user_id") or current_user.get("clerk_user_id")
|
||||
if not user_id:
|
||||
raise HTTPException(status_code=401, detail="User ID not found")
|
||||
|
||||
# Verify ownership: the requesting user must have a content_assets record for this file_url
|
||||
full_url = f"/api/images/image-studio/images/{image_filename}"
|
||||
service = ContentAssetService(db)
|
||||
owned = db.query(ContentAsset).filter(
|
||||
ContentAsset.user_id == user_id,
|
||||
ContentAsset.file_url == full_url,
|
||||
).first()
|
||||
if not owned:
|
||||
raise HTTPException(status_code=403, detail="Access denied: image not found in your library")
|
||||
|
||||
# Determine if it's an edited image or regular image
|
||||
# Validate user-controlled path input before filesystem path construction
|
||||
image_filename_path = Path(image_filename)
|
||||
if image_filename_path.is_absolute() or any(part in ("", ".", "..") for part in image_filename_path.parts):
|
||||
raise HTTPException(status_code=403, detail="Access denied: Invalid image path")
|
||||
|
||||
base_dir = Path(__file__).parent.parent
|
||||
image_studio_dir = (base_dir / "image_studio_images").resolve()
|
||||
|
||||
if image_filename.startswith("edited/"):
|
||||
# Remove "edited/" prefix and serve from edited directory
|
||||
actual_filename = image_filename.replace("edited/", "", 1)
|
||||
actual_filename_path = Path(actual_filename)
|
||||
if actual_filename_path.is_absolute() or any(part in ("", ".", "..") for part in actual_filename_path.parts):
|
||||
raise HTTPException(status_code=403, detail="Access denied: Invalid image path")
|
||||
|
||||
image_path = (image_studio_dir / "edited" / actual_filename).resolve()
|
||||
base_subdir = (image_studio_dir / "edited").resolve()
|
||||
else:
|
||||
|
||||
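Review note on the path checks in the `serve_image_studio_image` hunk above: the visible lines reject absolute paths and `.`/`..` components via `Path.parts`, then resolve the result under `image_studio_dir`. The hunk is cut off at `else:`, so the rest of the function is not shown here. The sketch below illustrates the same idea with one assumed extra step, a `relative_to` containment check after resolving; `resolve_inside` and `is_allowed` are hypothetical helper names, not functions from the repository.

```python
from pathlib import Path


def resolve_inside(base_dir: Path, user_path: str) -> Path:
    """Return the resolved path for user_path, or raise ValueError if it is rejected."""
    candidate = Path(user_path)
    # Reject absolute paths and any "." or ".." component before building a filesystem path.
    if candidate.is_absolute() or any(part in ("", ".", "..") for part in candidate.parts):
        raise ValueError("invalid image path")
    base = base_dir.resolve()
    resolved = (base / candidate).resolve()
    # Assumed extra step: relative_to raises ValueError when resolved lies outside base,
    # which also catches symlinks that point elsewhere.
    resolved.relative_to(base)
    return resolved


def is_allowed(base_dir: Path, user_path: str) -> bool:
    """Boolean wrapper around resolve_inside for callers that only need a yes/no answer."""
    try:
        resolve_inside(base_dir, user_path)
        return True
    except ValueError:
        return False
```

In an endpoint, the `ValueError` would typically be translated into the same 403 response the diff uses for invalid paths.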
@@ -1,185 +0,0 @@
|
||||
"""
|
||||
Link Search API — Internal & external link discovery and reword-with-links.
|
||||
|
||||
Endpoints:
|
||||
POST /api/links/search — Search for internal or external links via Exa
|
||||
POST /api/links/reword — Reword text to naturally incorporate selected links
|
||||
GET /api/links/health — Health check
|
||||
"""
|
||||
|
||||
from typing import Dict, Any, List, Optional
|
||||
|
||||
from fastapi import APIRouter, Depends, HTTPException
|
||||
from pydantic import BaseModel, Field
|
||||
from loguru import logger
|
||||
|
||||
from middleware.auth_middleware import get_current_user
|
||||
from api.story_writer.utils.auth import require_authenticated_user
|
||||
from services.link_search_service import get_link_search_service
|
||||
|
||||
|
||||
router = APIRouter(prefix="/api/links", tags=["Links"])
|
||||
|
||||
|
||||
class LinkSearchRequest(BaseModel):
|
||||
"""Request for link search (internal or external)."""
|
||||
query: str = Field(..., description="Search query (typically section heading or topic)")
|
||||
link_type: str = Field(
|
||||
...,
|
||||
description="Type of links: 'internal' or 'external'",
|
||||
)
|
||||
site_url: Optional[str] = Field(
|
||||
default=None,
|
||||
description="User's website URL (required for internal links, optional for external to exclude own domain)",
|
||||
)
|
||||
num_results: int = Field(default=5, description="Number of results to return", ge=1, le=15)
|
||||
|
||||
|
||||
class LinkSearchResult(BaseModel):
|
||||
"""A single link search result."""
|
||||
title: str = ""
|
||||
url: str = ""
|
||||
text: str = ""
|
||||
publishedDate: str = ""
|
||||
author: str = ""
|
||||
score: float = 0.5
|
||||
|
||||
|
||||
class LinkSearchResponse(BaseModel):
|
||||
"""Response for link search."""
|
||||
results: List[LinkSearchResult] = Field(default_factory=list)
|
||||
warnings: List[str] = Field(default_factory=list)
|
||||
|
||||
|
||||
class RewordRequest(BaseModel):
|
||||
"""Request to reword text with selected links."""
|
||||
section_text: str = Field(..., description="Full section text")
|
||||
selected_text: Optional[str] = Field(
|
||||
default=None,
|
||||
description="If provided, only reword this portion of the text",
|
||||
)
|
||||
section_heading: Optional[str] = Field(default=None, description="Section heading for context")
|
||||
links: List[Dict[str, str]] = Field(
|
||||
...,
|
||||
description="List of {'url': str, 'title': str} dicts to incorporate",
|
||||
)
|
||||
|
||||
|
||||
class RewordResponse(BaseModel):
|
||||
"""Response for reword-with-links."""
|
||||
reworded_text: str = ""
|
||||
warnings: List[str] = Field(default_factory=list)
|
||||
|
||||
|
||||
@router.post("/search", response_model=LinkSearchResponse)
|
||||
async def search_links(
|
||||
request: LinkSearchRequest,
|
||||
current_user: Dict[str, Any] = Depends(get_current_user),
|
||||
):
|
||||
"""Search for internal or external links using Exa."""
|
||||
user_id = require_authenticated_user(current_user)
|
||||
|
||||
if request.link_type not in ("internal", "external"):
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail="link_type must be 'internal' or 'external'",
|
||||
)
|
||||
|
||||
if request.link_type == "internal" and not request.site_url:
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail="site_url is required for internal link search",
|
||||
)
|
||||
|
||||
if len(request.query) > 500:
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail="Query must be 500 characters or less",
|
||||
)
|
||||
|
||||
service = get_link_search_service(user_id=user_id)
|
||||
|
||||
try:
|
||||
if request.link_type == "internal":
|
||||
logger.info(f"[Links] Internal search: query='{request.query[:50]}', site='{request.site_url}', user={user_id}")
|
||||
result = await service.search_internal(
|
||||
query=request.query,
|
||||
site_url=request.site_url,
|
||||
user_id=user_id,
|
||||
num_results=request.num_results,
|
||||
)
|
||||
else:
|
||||
logger.info(f"[Links] External search: query='{request.query[:50]}', user={user_id}")
|
||||
result = await service.search_external(
|
||||
query=request.query,
|
||||
site_url=request.site_url,
|
||||
user_id=user_id,
|
||||
num_results=request.num_results,
|
||||
)
|
||||
|
||||
return LinkSearchResponse(
|
||||
results=[LinkSearchResult(**r) for r in result.get("results", [])],
|
||||
warnings=result.get("warnings", []),
|
||||
)
|
||||
|
||||
except HTTPException:
|
||||
raise
|
||||
except Exception as e:
|
||||
logger.error(f"[Links] Search failed: {e}")
|
||||
raise HTTPException(status_code=500, detail=f"Link search failed: {str(e)}")
|
||||
|
||||
|
||||
@router.post("/reword", response_model=RewordResponse)
|
||||
async def reword_with_links(
|
||||
request: RewordRequest,
|
||||
current_user: Dict[str, Any] = Depends(get_current_user),
|
||||
):
|
||||
"""Reword text to naturally incorporate selected links."""
|
||||
user_id = require_authenticated_user(current_user)
|
||||
|
||||
if not request.links:
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail="At least one link must be provided",
|
||||
)
|
||||
|
||||
# Validate each link has a url
|
||||
for i, link in enumerate(request.links):
|
||||
if not link.get("url"):
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail=f"Link at index {i} is missing a 'url' field",
|
||||
)
|
||||
|
||||
if len(request.section_text) > 10000:
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail="section_text must be 10000 characters or less",
|
||||
)
|
||||
|
||||
service = get_link_search_service(user_id=user_id)
|
||||
|
||||
try:
|
||||
logger.info(f"[Links] Reword: heading='{request.section_heading}', links={len(request.links)}, user={user_id}")
|
||||
result = service.reword_with_links(
|
||||
section_text=request.section_text,
|
||||
links=request.links,
|
||||
section_heading=request.section_heading,
|
||||
selected_text=request.selected_text,
|
||||
user_id=user_id,
|
||||
)
|
||||
|
||||
return RewordResponse(
|
||||
reworded_text=result.get("reworded_text", request.section_text),
|
||||
warnings=result.get("warnings", []),
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[Links] Reword failed: {e}")
|
||||
raise HTTPException(status_code=500, detail=f"Reword failed: {str(e)}")
|
||||
|
||||
|
||||
@router.get("/health")
|
||||
async def links_health():
|
||||
"""Health check for Links service."""
|
||||
return {"status": "ok", "service": "links"}
|
||||
@@ -10,7 +10,9 @@ from pathlib import Path
 from typing import Literal
 from loguru import logger
 from services.story_writer.audio_generation_service import StoryAudioGenerationService
-from services.workspace_paths import get_workspace_root, get_user_workspace_dir
+from utils.storage_paths import get_repo_root, sanitize_user_id as _sanitize_user_id
+
+ROOT_DIR = get_repo_root()
 
 # Video subdirectory (relative to workspace media dir)
 AI_VIDEO_SUBDIR = Path("AI_Videos")
@@ -43,10 +45,15 @@ def get_podcast_media_dir(
     }[media_type]
 
     if user_id:
-        resolved_dir = (get_user_workspace_dir(user_id) / "media" / media_subdir).resolve()
+        sanitized = _sanitize_user_id(user_id)
+        resolved_dir = (
+            ROOT_DIR / "workspace" / f"workspace_{sanitized}" / "media" / media_subdir
+        ).resolve()
     else:
         logger.warning(f"[Podcast] get_podcast_media_dir called without user_id for {media_type} — using default workspace. This should not happen in production.")
-        resolved_dir = (get_workspace_root() / "workspace_alwrity" / "media" / media_subdir).resolve()
+        resolved_dir = (
+            ROOT_DIR / "workspace" / "workspace_alwrity" / "media" / media_subdir
+        ).resolve()
 
     if ensure_exists:
         resolved_dir.mkdir(parents=True, exist_ok=True)
@@ -123,187 +123,3 @@ async def stripe_webhook(
|
||||
except Exception as e:
|
||||
logger.error(f"Error processing webhook: {e}")
|
||||
raise HTTPException(status_code=500, detail="Webhook processing failed")
|
||||
|
||||
@router.get("/verify-checkout/{user_id}")
|
||||
async def verify_checkout_status(
|
||||
user_id: str,
|
||||
db: Session = Depends(get_db),
|
||||
current_user: Dict[str, Any] = Depends(get_current_user),
|
||||
request: Request = None
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Directly query Stripe for user's current subscription status.
|
||||
Used during post-checkout polling to get fresh data without waiting for webhooks.
|
||||
|
||||
Rate limited: 5 requests per minute per user to prevent abuse.
|
||||
"""
|
||||
from ..dependencies import verify_user_access
|
||||
from models.subscription_models import UserSubscription, SubscriptionPlan, SubscriptionTier
|
||||
from services.subscription import PricingService
|
||||
from api.subscription.utils import format_plan_limits
|
||||
from datetime import datetime
|
||||
|
||||
verify_user_access(user_id, current_user)
|
||||
|
||||
# Rate limiting: 5 requests per minute per user
|
||||
now = time.time()
|
||||
window_start = now - 60 # 1 minute window
|
||||
if user_id not in _checkout_attempts_by_user:
|
||||
_checkout_attempts_by_user[user_id] = []
|
||||
attempts = _checkout_attempts_by_user[user_id]
|
||||
attempts[:] = [ts for ts in attempts if ts >= window_start]
|
||||
attempts.append(now)
|
||||
_checkout_attempts_by_user[user_id] = attempts
|
||||
|
||||
if len(attempts) > 5:
|
||||
client_ip = request.client.host if request and request.client else "unknown"
|
||||
logger.warning(f"Verify-checkout rate limit exceeded for user_id={user_id}, ip={client_ip}")
|
||||
raise HTTPException(status_code=429, detail="Too many verification requests. Please wait before trying again.")
|
||||
|
||||
stripe_service = StripeService(db)
|
||||
|
||||
try:
|
||||
# First, try to find user in local DB
|
||||
subscription = db.query(UserSubscription).filter(
|
||||
UserSubscription.user_id == user_id
|
||||
).first()
|
||||
|
||||
stripe_customer_id = subscription.stripe_customer_id if subscription else None
|
||||
|
||||
# If no stripe_customer_id in DB, try to find it by email
|
||||
if not stripe_customer_id:
|
||||
try:
|
||||
import stripe
|
||||
# Get user email from auth context
|
||||
user_email = current_user.get("email")
|
||||
if user_email:
|
||||
customers = stripe.Customer.list(email=user_email, limit=1)
|
||||
if customers and customers.data:
|
||||
stripe_customer_id = customers.data[0].id
|
||||
logger.info(f"Verify-checkout: Found Stripe customer by email for user {user_id}")
|
||||
|
||||
# Update DB with found customer ID
|
||||
if subscription:
|
||||
subscription.stripe_customer_id = stripe_customer_id
|
||||
db.commit()
|
||||
else:
|
||||
logger.info(f"Verify-checkout: No local subscription record for user {user_id}, will query Stripe directly")
|
||||
except Exception as email_err:
|
||||
logger.warning(f"Failed to find Stripe customer by email: {email_err}")
|
||||
|
||||
# If user has a Stripe customer ID, query Stripe directly
|
||||
if stripe_customer_id:
|
||||
try:
|
||||
import stripe
|
||||
stripe_subscriptions = stripe.Subscription.list(
|
||||
customer=stripe_customer_id,
|
||||
status="active",
|
||||
limit=1
|
||||
)
|
||||
|
||||
if stripe_subscriptions and stripe_subscriptions.data:
|
||||
stripe_sub = stripe_subscriptions.data[0]
|
||||
price_id = stripe_sub['items']['data'][0]['price']['id']
|
||||
|
||||
logger.info(f"Verify-checkout: Found active Stripe subscription for user {user_id}, plan from price {price_id}")
|
||||
|
||||
# Update local DB with fresh Stripe data
|
||||
stripe_service._update_user_subscription(
|
||||
user_id,
|
||||
stripe_customer_id=stripe_customer_id,
|
||||
stripe_subscription_id=stripe_sub.id,
|
||||
status="active",
|
||||
price_id=price_id
|
||||
)
|
||||
|
||||
# Clear caches
|
||||
try:
|
||||
PricingService.clear_user_cache(user_id)
|
||||
except Exception:
|
||||
pass
|
||||
try:
|
||||
from api.subscription.cache import clear_dashboard_cache
|
||||
clear_dashboard_cache(user_id)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
db.expire_all()
|
||||
|
||||
# Re-query with fresh data
|
||||
subscription = db.query(UserSubscription).filter(
|
||||
UserSubscription.user_id == user_id,
|
||||
UserSubscription.is_active == True
|
||||
).first()
|
||||
|
||||
if subscription:
|
||||
return {
|
||||
"success": True,
|
||||
"data": {
|
||||
"active": True,
|
||||
"plan": subscription.plan.tier.value,
|
||||
"tier": subscription.plan.tier.value,
|
||||
"can_use_api": True,
|
||||
"limits": format_plan_limits(subscription.plan),
|
||||
"source": "stripe_direct"
|
||||
}
|
||||
}
|
||||
except Exception as stripe_err:
|
||||
logger.warning(f"Failed to query Stripe directly for user {user_id}: {stripe_err}")
|
||||
|
||||
# Fallback to local DB status
|
||||
if subscription and subscription.is_active:
|
||||
from services.subscription.pricing_service import PricingService
|
||||
pricing = PricingService(db)
|
||||
try:
|
||||
pricing._ensure_subscription_current(subscription)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
return {
|
||||
"success": True,
|
||||
"data": {
|
||||
"active": True,
|
||||
"plan": subscription.plan.tier.value,
|
||||
"tier": subscription.plan.tier.value,
|
||||
"can_use_api": True,
|
||||
"limits": format_plan_limits(subscription.plan),
|
||||
"source": "local_db"
|
||||
}
|
||||
}
|
||||
|
||||
# No active subscription - return free tier
|
||||
free_plan = db.query(SubscriptionPlan).filter(
|
||||
SubscriptionPlan.tier == SubscriptionTier.FREE,
|
||||
SubscriptionPlan.is_active == True
|
||||
).first()
|
||||
|
||||
if free_plan:
|
||||
return {
|
||||
"success": True,
|
||||
"data": {
|
||||
"active": True,
|
||||
"plan": "free",
|
||||
"tier": "free",
|
||||
"can_use_api": True,
|
||||
"limits": format_plan_limits(free_plan),
|
||||
"source": "free_tier"
|
||||
}
|
||||
}
|
||||
|
||||
return {
|
||||
"success": True,
|
||||
"data": {
|
||||
"active": False,
|
||||
"plan": "none",
|
||||
"tier": "none",
|
||||
"can_use_api": False,
|
||||
"reason": "No active subscription found",
|
||||
"source": "none"
|
||||
}
|
||||
}
|
||||
|
||||
except HTTPException:
|
||||
raise
|
||||
except Exception as e:
|
||||
logger.error(f"Error verifying checkout status for user {user_id}: {e}")
|
||||
raise HTTPException(status_code=500, detail=f"Failed to verify subscription: {str(e)}")
|
||||
|
||||
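Review note on the rate limiting in the removed `verify_checkout_status` endpoint above: it keeps a per-user list of timestamps, drops entries older than 60 seconds, records the current attempt, and rejects once more than 5 remain. The sketch below restates that sliding-window pattern as a small class. `SlidingWindowLimiter` and the injectable `clock` parameter are hypothetical, added here only to make the behavior testable. Like the module-level dict in the diff, the state is per process and is not shared between workers.

```python
import time
from collections import defaultdict
from typing import Callable, DefaultDict, List


class SlidingWindowLimiter:
    """Allow at most max_requests per key within the trailing window_seconds."""

    def __init__(
        self,
        max_requests: int = 5,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._max_requests = max_requests
        self._window_seconds = window_seconds
        self._clock = clock
        self._attempts: DefaultDict[str, List[float]] = defaultdict(list)

    def allow(self, key: str) -> bool:
        now = self._clock()
        window_start = now - self._window_seconds
        # Keep only attempts inside the window, then record this one.
        attempts = [ts for ts in self._attempts[key] if ts >= window_start]
        attempts.append(now)
        self._attempts[key] = attempts
        # As in the diff, rejected attempts are recorded too, so a client that
        # keeps retrying stays limited until its attempts age out.
        return len(attempts) <= self._max_requests
```

In an endpoint, a `False` result would map to the HTTP 429 response used in the diff.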
@@ -9,21 +9,13 @@ from fastapi.responses import HTMLResponse
 from typing import Dict, Any, Optional
 from loguru import logger
 from pydantic import BaseModel
-import os
-import uuid
-import requests
 
 from services.wix_service import WixService
 from services.integrations.wix_oauth import WixOAuthService
-from services.integrations.oauth_callback_utils import (
-    build_oauth_callback_html,
-    sanitize_error,
-)
 from middleware.auth_middleware import get_current_user
+import os
 
 router = APIRouter(prefix="/api/wix", tags=["Wix Integration"])
-qa_router = APIRouter(prefix="/api/wix/test", tags=["Wix Integration QA"])
-
 
 # Initialize Wix service
 wix_service = WixService()
@@ -32,72 +24,10 @@ wix_service = WixService()
 wix_oauth_service = WixOAuthService()
 
 
-def _get_current_user_id(current_user: dict) -> str:
-    user_id = current_user.get("id") if current_user else None
-    if not user_id:
-        raise HTTPException(status_code=401, detail="Missing authenticated user context")
-    return user_id
-
-
-def _map_wix_error(exc: Exception, fallback: str = "Wix API request failed") -> HTTPException:
-    if isinstance(exc, HTTPException):
-        return exc
-    if isinstance(exc, requests.HTTPError):
-        status = exc.response.status_code if exc.response is not None else None
-        msg = str(exc) if str(exc) != "" else fallback
-        if status == 401:
-            return HTTPException(status_code=401, detail=msg)
-        if status == 403:
-            return HTTPException(status_code=403, detail=msg)
-        return HTTPException(status_code=502, detail=msg)
-    if isinstance(exc, requests.RequestException):
-        return HTTPException(status_code=502, detail=str(exc) or fallback)
-    return HTTPException(status_code=500, detail=str(exc))
-
-
-def _resolve_valid_wix_token(current_user: dict) -> Dict[str, Any]:
-    user_id = _get_current_user_id(current_user)
-    tokens = wix_oauth_service.get_user_tokens(user_id)
-    if tokens:
-        return tokens[0]
-
-    token_status = wix_oauth_service.get_user_token_status(user_id)
-    expired_tokens = token_status.get("expired_tokens", [])
-    if not expired_tokens:
-        raise HTTPException(status_code=401, detail="Wix account not connected")
-
-    for candidate in expired_tokens:
-        refresh_token = candidate.get("refresh_token")
-        token_id = candidate.get("id")
-        if not refresh_token:
-            continue
-        try:
-            refreshed = wix_service.refresh_access_token(refresh_token)
-        except Exception as exc:
-            continue
-
-        wix_oauth_service.update_tokens(
-            user_id=user_id,
-            access_token=refreshed.get("access_token"),
-            refresh_token=refreshed.get("refresh_token", refresh_token),
-            expires_in=refreshed.get("expires_in"),
-            token_id=token_id,
-        )
-
-        return {
-            "access_token": refreshed.get("access_token"),
-            "refresh_token": refreshed.get("refresh_token", refresh_token),
-            "member_id": candidate.get("member_id"),
-            "site_id": candidate.get("site_id"),
-        }
-
-    raise HTTPException(status_code=401, detail="Wix token expired and cannot be refreshed")
-
-
 class WixAuthRequest(BaseModel):
     """Request model for Wix authentication"""
     code: str
-    state: str
+    state: Optional[str] = None
 
 
 class WixPublishRequest(BaseModel):
@@ -106,13 +36,10 @@ class WixPublishRequest(BaseModel):
|
||||
content: str
|
||||
cover_image_url: Optional[str] = None
|
||||
category_ids: Optional[list] = None
|
||||
category_names: Optional[list] = None
|
||||
tag_ids: Optional[list] = None
|
||||
tag_names: Optional[list] = None
|
||||
publish: bool = True
|
||||
# Optional access token for test-real publish flow
|
||||
access_token: Optional[str] = None
|
||||
member_id: Optional[str] = None
|
||||
seo_metadata: Optional[Dict[str, Any]] = None
|
||||
class WixCreateCategoryRequest(BaseModel):
|
||||
access_token: str
|
||||
label: str
|
||||
@@ -135,41 +62,8 @@ class WixConnectionStatus(BaseModel):
     error: Optional[str] = None
 
 
-def _is_wix_test_mode_enabled() -> bool:
-    return os.getenv("WIX_TEST_ROUTES_ENABLED", "false").lower() in {"1", "true", "yes", "on"}
-
-
-def _is_admin_user(current_user: Dict[str, Any]) -> bool:
-    email = (current_user.get("email") or "").lower()
-    role = current_user.get("role")
-    public_metadata = current_user.get("public_metadata")
-    if isinstance(public_metadata, dict):
-        role = public_metadata.get("role") or role
-
-    admin_emails = {
-        e.strip().lower()
-        for e in os.getenv("ADMIN_EMAILS", "").split(",")
-        if e.strip()
-    }
-    admin_domain = (os.getenv("ADMIN_EMAIL_DOMAIN") or "").lower().strip()
-
-    return bool(
-        role == "admin"
-        or (email and email in admin_emails)
-        or (email and admin_domain and email.endswith(f"@{admin_domain}"))
-    )
-
-
-def _require_wix_test_access(current_user: Dict[str, Any] = Depends(get_current_user)) -> Dict[str, Any]:
-    if not _is_wix_test_mode_enabled():
-        raise HTTPException(status_code=404, detail="Not found")
-    if not _is_admin_user(current_user):
-        raise HTTPException(status_code=403, detail="Admin access required")
-    return current_user
-
-
 @router.get("/auth/url")
-async def get_authorization_url(state: Optional[str] = None, current_user: dict = Depends(get_current_user)) -> Dict[str, str]:
+async def get_authorization_url(state: Optional[str] = None) -> Dict[str, str]:
     """
     Get Wix OAuth authorization URL

@@ -180,21 +74,8 @@ async def get_authorization_url(state: Optional[str] = None, current_user: dict
|
||||
Authorization URL
|
||||
"""
|
||||
try:
|
||||
user_id = current_user.get('id') if current_user else None
|
||||
if not user_id:
|
||||
raise HTTPException(status_code=401, detail="Authentication required")
|
||||
|
||||
oauth_state = state or str(uuid.uuid4())
|
||||
oauth_payload = wix_service.get_authorization_url(oauth_state)
|
||||
saved = wix_oauth_service.store_pkce_verifier(
|
||||
user_id=user_id,
|
||||
state=oauth_state,
|
||||
code_verifier=oauth_payload["code_verifier"],
|
||||
ttl_seconds=600
|
||||
)
|
||||
if not saved:
|
||||
raise HTTPException(status_code=500, detail="Failed to persist OAuth verifier state")
|
||||
return {"authorization_url": oauth_payload["authorization_url"], "state": oauth_state}
|
||||
url = wix_service.get_authorization_url(state)
|
||||
return {"authorization_url": url}
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to generate authorization URL: {e}")
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
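Review note: one side of this hunk persists a PKCE `code_verifier` per `(user_id, state)` with a 600-second TTL and consumes it in the callback. The implementation of `wix_oauth_service.store_pkce_verifier` / `consume_pkce_verifier` is not shown in the diff, so the following is only a sketch of the idea. The S256 challenge derivation follows RFC 7636; the `InMemoryPkceStore` class and its injectable clock are hypothetical stand-ins for whatever storage the service actually uses.

```python
import base64
import hashlib
import secrets
import time
from typing import Callable, Dict, Optional, Tuple


def make_code_verifier(n_bytes: int = 32) -> str:
    """Random base64url verifier without padding (43 characters for 32 bytes)."""
    return base64.urlsafe_b64encode(secrets.token_bytes(n_bytes)).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


class InMemoryPkceStore:
    """Hypothetical single-use verifier store keyed by (user_id, state)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[Tuple[str, str], Tuple[str, float]] = {}

    def store(self, user_id: str, state: str, code_verifier: str, ttl_seconds: int = 600) -> bool:
        self._items[(user_id, state)] = (code_verifier, self._clock() + ttl_seconds)
        return True

    def consume(self, user_id: str, state: str) -> Optional[str]:
        # pop() makes the verifier single-use even if it turns out to be expired.
        item = self._items.pop((user_id, state), None)
        if item is None:
            return None
        verifier, expires_at = item
        return verifier if self._clock() < expires_at else None
```

Keying on both the user and the `state` value means a callback carrying an unknown or replayed `state` yields `None`, which matches the "Invalid or expired OAuth state" branch in the callback handlers.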
||||
@@ -217,16 +98,8 @@ async def handle_oauth_callback(request: WixAuthRequest, current_user: dict = De
|
||||
if not user_id:
|
||||
raise HTTPException(status_code=400, detail="User ID not found")
|
||||
|
||||
if not request.state:
|
||||
raise HTTPException(status_code=400, detail="Missing OAuth state")
|
||||
code_verifier = wix_oauth_service.consume_pkce_verifier(user_id=user_id, state=request.state)
|
||||
if not code_verifier:
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail="Invalid or expired OAuth state. Please restart Wix connection."
|
||||
)
|
||||
# Exchange code for tokens
|
||||
tokens = wix_service.exchange_code_for_tokens(request.code, code_verifier=code_verifier)
|
||||
tokens = wix_service.exchange_code_for_tokens(request.code)
|
||||
|
||||
# Get site information to extract site_id and member_id
|
||||
site_info = wix_service.get_site_info(tokens['access_token'])
|
||||
@@ -279,38 +152,32 @@ async def handle_oauth_callback(request: WixAuthRequest, current_user: dict = De
|
||||
async def handle_oauth_callback_get(code: str, state: Optional[str] = None, request: Request = None, current_user: dict = Depends(get_current_user)):
|
||||
"""HTML callback page for Wix OAuth that exchanges code and notifies opener via postMessage."""
|
||||
try:
|
||||
user_id = current_user.get('id') if current_user else None
|
||||
if not user_id:
|
||||
raise HTTPException(status_code=401, detail="Authentication required")
|
||||
if not state:
|
||||
raise HTTPException(status_code=400, detail="Missing OAuth state")
|
||||
code_verifier = wix_oauth_service.consume_pkce_verifier(user_id=user_id, state=state)
|
||||
if not code_verifier:
|
||||
raise HTTPException(status_code=400, detail="Invalid or expired OAuth state. Please reconnect Wix.")
|
||||
tokens = wix_service.exchange_code_for_tokens(code, code_verifier=code_verifier)
|
||||
tokens = wix_service.exchange_code_for_tokens(code)
|
||||
site_info = wix_service.get_site_info(tokens['access_token'])
|
||||
permissions = wix_service.check_blog_permissions(tokens['access_token'])
|
||||
|
||||
# Store tokens in database if we have user_id
|
||||
site_id = site_info.get('siteId') or site_info.get('site_id')
|
||||
member_id = None
|
||||
try:
|
||||
member_id = wix_service.extract_member_id_from_access_token(tokens['access_token'])
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
stored = wix_oauth_service.store_tokens(
|
||||
user_id=user_id,
|
||||
access_token=tokens['access_token'],
|
||||
refresh_token=tokens.get('refresh_token'),
|
||||
expires_in=tokens.get('expires_in'),
|
||||
token_type=tokens.get('token_type', 'Bearer'),
|
||||
scope=tokens.get('scope'),
|
||||
site_id=site_id,
|
||||
member_id=member_id
|
||||
)
|
||||
if not stored:
|
||||
logger.warning(f"Failed to store Wix tokens for user {user_id} in GET callback")
|
||||
user_id = current_user.get('id') if current_user else None
|
||||
if user_id:
|
||||
site_id = site_info.get('siteId') or site_info.get('site_id')
|
||||
member_id = None
|
||||
try:
|
||||
member_id = wix_service.extract_member_id_from_access_token(tokens['access_token'])
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
stored = wix_oauth_service.store_tokens(
|
||||
user_id=user_id,
|
||||
access_token=tokens['access_token'],
|
||||
refresh_token=tokens.get('refresh_token'),
|
||||
expires_in=tokens.get('expires_in'),
|
||||
token_type=tokens.get('token_type', 'Bearer'),
|
||||
scope=tokens.get('scope'),
|
||||
site_id=site_id,
|
||||
member_id=member_id
|
||||
)
|
||||
if not stored:
|
||||
logger.warning(f"Failed to store Wix tokens for user {user_id} in GET callback")
|
||||
|
||||
# Build success payload for postMessage
|
||||
payload = {
|
||||
@@ -326,24 +193,45 @@ async def handle_oauth_callback_get(code: str, state: Optional[str] = None, requ
|
||||
"permissions": permissions
|
||||
}
|
||||
|
||||
html = build_oauth_callback_html(
|
||||
payload=payload,
|
||||
title="Wix Connected",
|
||||
heading="Connection Successful",
|
||||
message="Your Wix account was connected. You can close this window."
|
||||
)
|
||||
html = f"""
|
||||
<!DOCTYPE html>
|
||||
<html>
|
||||
<head><title>Wix Connected</title></head>
|
||||
<body>
|
||||
<script>
|
||||
(function() {{
|
||||
try {{
|
||||
var payload = {payload};
|
||||
(window.opener || window.parent).postMessage(payload, '*');
|
||||
}} catch (e) {{}}
|
||||
window.close();
|
||||
}})();
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
"""
|
||||
return HTMLResponse(content=html, headers={
|
||||
"Cross-Origin-Opener-Policy": "unsafe-none",
|
||||
"Cross-Origin-Embedder-Policy": "unsafe-none"
|
||||
})
|
||||
except Exception as e:
|
||||
logger.error(f"Wix OAuth GET callback failed: {e}")
|
||||
html = build_oauth_callback_html(
|
||||
payload={"type": "WIX_OAUTH_ERROR", "success": False, "error": sanitize_error(e)},
|
||||
title="Wix Connection Failed",
|
||||
heading="Connection Failed",
|
||||
message="There was an issue connecting your Wix account. You can close this window and try again."
|
||||
)
|
||||
html = f"""
|
||||
<!DOCTYPE html>
|
||||
<html>
|
||||
<head><title>Wix Connection Failed</title></head>
|
||||
<body>
|
||||
<script>
|
||||
(function() {{
|
||||
try {{
|
||||
(window.opener || window.parent).postMessage({{ type: 'WIX_OAUTH_ERROR', success: false, error: '{str(e)}' }}, '*');
|
||||
}} catch (e) {{}}
|
||||
window.close();
|
||||
}})();
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
"""
|
||||
return HTMLResponse(content=html, headers={
|
||||
"Cross-Origin-Opener-Policy": "unsafe-none",
|
||||
"Cross-Origin-Embedder-Policy": "unsafe-none"
|
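Review note: the inline-HTML side of this hunk interpolates a Python dict and `str(e)` directly into a `<script>` block and posts to `'*'`. The body of `build_oauth_callback_html` is not visible in the diff, so the sketch below only illustrates one way such a helper could serialize the payload safely and pin the `postMessage` target origin. `render_callback_html`, `js_literal`, and the `target_origin` parameter are assumptions made for illustration.

```python
import html
import json
from typing import Any, Dict


def js_literal(value: Any) -> str:
    """Serialize value as JSON that is safe to embed inside a <script> element."""
    return (
        json.dumps(value)
        .replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )


def render_callback_html(payload: Dict[str, Any], target_origin: str, title: str) -> str:
    """Build a popup page that posts payload to the opener at one explicit origin."""
    return f"""<!DOCTYPE html>
<html>
<head><title>{html.escape(title)}</title></head>
<body>
<script>
(function() {{
  try {{
    var payload = {js_literal(payload)};
    (window.opener || window.parent).postMessage(payload, {js_literal(target_origin)});
  }} catch (e) {{}}
  window.close();
}})();
</script>
</body>
</html>
"""
```

Using `json.dumps` yields valid JavaScript literals (`true`/`false`/`null` rather than Python's `True`/`False`/`None`), and escaping `<`, `>`, and `&` keeps an error string containing `</script>` from terminating the script element early.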
||||
@@ -351,134 +239,130 @@ async def handle_oauth_callback_get(code: str, state: Optional[str] = None, requ
|
||||
|
||||
|
||||
@router.get("/connection/status")
|
||||
async def get_connection_status(current_user: dict = Depends(get_current_user)) -> Dict[str, Any]:
|
||||
async def get_connection_status(current_user: dict = Depends(get_current_user)) -> WixConnectionStatus:
|
||||
"""
|
||||
Check Wix connection status and permissions.
|
||||
Returns connected: false when no tokens are stored (instead of 401).
|
||||
Check Wix connection status and permissions
|
||||
|
||||
Args:
|
||||
current_user: Current authenticated user
|
||||
|
||||
Returns:
|
||||
Connection status and permissions
|
||||
"""
|
||||
try:
|
||||
token_info = _resolve_valid_wix_token(current_user)
|
||||
access_token = token_info["access_token"]
|
||||
site_info = wix_service.get_site_info(access_token)
|
||||
permissions = wix_service.check_blog_permissions(access_token)
|
||||
return {
|
||||
"connected": True,
|
||||
"has_permissions": permissions.get("has_permissions", False),
|
||||
"site_info": site_info,
|
||||
"permissions": permissions
|
||||
}
|
||||
except HTTPException as e:
|
||||
if e.status_code == 401:
|
||||
return {"connected": False, "has_permissions": False, "error": "Wix account not connected"}
|
||||
raise
|
||||
# Check if user has Wix tokens stored in sessionStorage (frontend approach)
|
||||
# This is a simplified check - in production you'd store tokens in database
|
||||
return WixConnectionStatus(
|
||||
connected=False,
|
||||
has_permissions=False,
|
||||
error="No Wix connection found. Please connect your Wix account first."
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to check connection status: {e}")
|
||||
return {"connected": False, "has_permissions": False, "error": "Unable to check Wix connection"}
|
||||
return WixConnectionStatus(
|
||||
connected=False,
|
||||
has_permissions=False,
|
||||
error=str(e)
|
||||
)
|
||||
|
||||
|
||||
@router.get("/status")
|
||||
async def get_wix_status(current_user: dict = Depends(get_current_user)) -> Dict[str, Any]:
|
||||
"""
|
||||
Get Wix connection status (similar to GSC/WordPress pattern)
|
||||
Note: Wix tokens are stored in frontend sessionStorage, so we can't directly check them here.
|
||||
The frontend will check sessionStorage and update the UI accordingly.
|
||||
"""
|
||||
try:
|
||||
token_info = _resolve_valid_wix_token(current_user)
|
||||
site_info = wix_service.get_site_info(token_info["access_token"])
|
||||
# Since Wix tokens are stored in frontend sessionStorage (not backend database),
|
||||
# we return a default response. The frontend will check sessionStorage directly.
|
||||
return {
|
||||
"connected": True,
|
||||
"sites": [site_info],
|
||||
"total_sites": 1,
|
||||
"site_info": site_info
|
||||
"connected": False,
|
||||
"sites": [],
|
||||
"total_sites": 0,
|
||||
"error": "Wix connection status managed by frontend sessionStorage"
|
||||
}
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to get Wix status: {e}")
|
||||
mapped = _map_wix_error(e, "Failed to get Wix status")
|
||||
raise mapped
|
||||
return {
|
||||
"connected": False,
|
||||
"sites": [],
|
||||
"total_sites": 0,
|
||||
"error": str(e)
|
||||
}
|
||||
|
||||
|
||||
@router.post("/publish")
|
||||
async def publish_to_wix(request: WixPublishRequest, current_user: dict = Depends(get_current_user)) -> Dict[str, Any]:
|
||||
"""
|
||||
Publish blog post to Wix using server-stored OAuth tokens.
|
||||
Publish blog post to Wix
|
||||
|
||||
The backend resolves the access token from the database (via
|
||||
_resolve_valid_wix_token), so callers do NOT need to pass
|
||||
access_token unless they want to override the stored one.
|
||||
Args:
|
||||
request: Blog post data
|
||||
current_user: Current authenticated user
|
||||
|
||||
Returns:
|
||||
Published blog post information
|
||||
"""
|
||||
try:
|
||||
if request.access_token:
|
||||
from services.integrations.wix.utils import normalize_token_string
|
||||
access_token = normalize_token_string(request.access_token)
|
||||
else:
|
||||
try:
|
||||
token_info = _resolve_valid_wix_token(current_user)
|
||||
access_token = token_info["access_token"]
|
||||
except HTTPException:
|
||||
access_token = None
|
||||
|
||||
if not access_token:
|
||||
return {
|
||||
"success": False,
|
||||
"error": "Wix account not connected. Connect your Wix account first.",
|
||||
}
|
||||
|
||||
member_id = request.member_id
|
||||
if not member_id:
|
||||
member_id = wix_service.extract_member_id_from_access_token(access_token)
|
||||
if not member_id:
|
||||
member_info = wix_service.get_current_member(access_token)
|
||||
member_id = (member_info.get("member") or {}).get("id") or member_info.get("id")
|
||||
if not member_id:
|
||||
return {
|
||||
"success": False,
|
||||
"error": "Unable to resolve Wix member ID. Please reconnect your Wix account.",
|
||||
}
|
||||
|
||||
# Resolve categories: accept IDs or names (looked up/created)
|
||||
category_ids = request.category_ids or request.category_names
|
||||
tag_ids = request.tag_ids or request.tag_names
|
||||
|
||||
seo_metadata = request.seo_metadata
|
||||
if seo_metadata:
|
||||
if not category_ids and seo_metadata.get("blog_categories"):
|
||||
category_ids = seo_metadata.get("blog_categories")
|
||||
if not tag_ids and seo_metadata.get("blog_tags"):
|
||||
tag_ids = seo_metadata.get("blog_tags")
|
||||
|
||||
# Ensure category_ids and tag_ids are lists of strings (not ints)
|
||||
if category_ids:
|
||||
category_ids = [str(c) for c in category_ids if c is not None]
|
||||
if tag_ids:
|
||||
tag_ids = [str(t) for t in tag_ids if t is not None]
|
||||
|
||||
result = wix_service.create_blog_post(
|
||||
access_token=access_token,
|
||||
title=request.title,
|
||||
content=request.content,
|
||||
cover_image_url=request.cover_image_url,
|
||||
category_ids=category_ids,
|
||||
tag_ids=tag_ids,
|
||||
publish=request.publish,
|
||||
member_id=member_id,
|
||||
seo_metadata=seo_metadata,
|
||||
)
|
||||
post = result.get("draftPost") or result.get("post") or result
|
||||
raw_url = post.get("url")
|
||||
if isinstance(raw_url, dict):
|
||||
post_url = raw_url.get("base", "").rstrip("/") + "/" + raw_url.get("path", "").lstrip("/")
|
||||
elif isinstance(raw_url, str):
|
||||
post_url = raw_url
|
||||
else:
|
||||
post_url = None
|
||||
# TODO: Retrieve stored access token from database for current_user
|
||||
# For now, we'll return an error asking user to connect first
|
||||
|
||||
return {
|
||||
"success": True,
|
||||
"post_id": str(post.get("id", "")),
|
||||
"url": post_url,
|
||||
"publish_state": "PUBLISHED" if request.publish else "DRAFT"
|
||||
"success": False,
|
||||
"error": "Wix account not connected. Please connect your Wix account first.",
|
||||
"message": "Use the /api/wix/auth/url endpoint to get the authorization URL"
|
||||
}
|
||||
|
||||
# Example of what the actual implementation would look like:
|
||||
# access_token = get_stored_access_token(current_user['id'])
|
||||
#
|
||||
# if not access_token:
|
||||
# raise HTTPException(status_code=401, detail="Wix account not connected")
|
||||
#
|
||||
# # Check if token is still valid, refresh if needed
|
||||
# try:
|
||||
# site_info = wix_service.get_site_info(access_token)
|
||||
# except:
|
||||
# # Token expired, try to refresh
|
||||
# refresh_token = get_stored_refresh_token(current_user['id'])
|
||||
# if refresh_token:
|
||||
# new_tokens = wix_service.refresh_access_token(refresh_token)
|
||||
# access_token = new_tokens['access_token']
|
||||
# # Store new tokens
|
||||
# else:
|
||||
# raise HTTPException(status_code=401, detail="Wix session expired. Please reconnect.")
|
||||
#
|
||||
# # Get current member ID (required for third-party apps)
|
||||
# member_info = wix_service.get_current_member(access_token)
|
||||
# member_id = member_info.get('member', {}).get('id')
|
||||
#
|
||||
# if not member_id:
|
||||
# raise HTTPException(status_code=400, detail="Could not retrieve member ID")
|
||||
#
|
||||
# # Create blog post
|
||||
# result = wix_service.create_blog_post(
|
||||
# access_token=access_token,
|
||||
# title=request.title,
|
||||
# content=request.content,
|
||||
# cover_image_url=request.cover_image_url,
|
||||
# category_ids=request.category_ids,
|
||||
# tag_ids=request.tag_ids,
|
||||
# publish=request.publish,
|
||||
# member_id=member_id # Required for third-party apps
|
||||
# )
|
||||
#
|
||||
# return {
|
||||
# "success": True,
|
||||
# "post_id": result.get('draftPost', {}).get('id'),
|
||||
# "url": result.get('draftPost', {}).get('url'),
|
||||
# "message": "Blog post published successfully to Wix"
|
||||
# }
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to publish to Wix: {e}")
|
||||
raise _map_wix_error(e, "Failed to publish to Wix")
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
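Review note: the publish handler in this hunk normalizes two things before and after calling `create_blog_post`: category and tag identifiers are coerced to strings, and the returned `url` may be either a plain string or a dict with `base` and `path` keys. A small sketch of both steps in isolation follows. The helper names are illustrative, and the dict shape is taken from the handler code itself rather than verified against Wix documentation.

```python
from typing import Any, Iterable, List, Optional


def normalize_ids(values: Optional[Iterable[Any]]) -> List[str]:
    """Coerce IDs (or names) to strings and drop None entries."""
    if not values:
        return []
    return [str(v) for v in values if v is not None]


def resolve_post_url(raw_url: Any) -> Optional[str]:
    """Join a {'base', 'path'} dict into one URL, pass strings through, else None."""
    if isinstance(raw_url, dict):
        base = raw_url.get("base", "").rstrip("/")
        path = raw_url.get("path", "").lstrip("/")
        return base + "/" + path
    if isinstance(raw_url, str):
        return raw_url
    return None
```

Stripping the trailing slash from `base` and the leading slash from `path` avoids a doubled `//` when both parts carry one.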
||||
|
||||
|
||||
@router.get("/categories")
|
||||
@@ -493,15 +377,23 @@ async def get_blog_categories(current_user: dict = Depends(get_current_user)) ->
|
||||
List of blog categories
|
||||
"""
|
||||
try:
|
||||
token_info = _resolve_valid_wix_token(current_user)
|
||||
categories = wix_service.get_blog_categories(token_info["access_token"])
|
||||
# TODO: Retrieve stored access token from database for current_user
|
||||
return {
|
||||
"success": True,
|
||||
"categories": categories
|
||||
"success": False,
|
||||
"error": "Wix account not connected. Please connect your Wix account first."
|
||||
}
|
||||
|
||||
# Example implementation:
|
||||
# access_token = get_stored_access_token(current_user['id'])
|
||||
# if not access_token:
|
||||
# raise HTTPException(status_code=401, detail="Wix account not connected")
|
||||
#
|
||||
# categories = wix_service.get_blog_categories(access_token)
|
||||
# return {"categories": categories}
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to get blog categories: {e}")
|
||||
raise _map_wix_error(e, "Failed to fetch Wix blog categories")
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
||||
|
||||
|
||||
@router.get("/tags")
|
||||
@@ -516,15 +408,23 @@ async def get_blog_tags(current_user: dict = Depends(get_current_user)) -> Dict[
|
||||
List of blog tags
|
||||
"""
|
||||
try:
|
||||
token_info = _resolve_valid_wix_token(current_user)
|
||||
tags = wix_service.get_blog_tags(token_info["access_token"])
|
||||
# TODO: Retrieve stored access token from database for current_user
|
||||
return {
|
||||
"success": True,
|
||||
"tags": tags
|
||||
"success": False,
|
||||
"error": "Wix account not connected. Please connect your Wix account first."
|
||||
}
|
||||
|
||||
# Example implementation:
|
||||
# access_token = get_stored_access_token(current_user['id'])
|
||||
# if not access_token:
|
||||
# raise HTTPException(status_code=401, detail="Wix account not connected")
|
||||
#
|
||||
# tags = wix_service.get_blog_tags(access_token)
|
||||
# return {"tags": tags}
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to get blog tags: {e}")
|
||||
raise _map_wix_error(e, "Failed to fetch Wix blog tags")
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
||||
|
||||
|
||||
@router.post("/disconnect")
|
||||
@@ -539,30 +439,23 @@ async def disconnect_wix(current_user: dict = Depends(get_current_user)) -> Dict
|
||||
Disconnection status
|
||||
"""
|
||||
try:
|
||||
user_id = _get_current_user_id(current_user)
|
||||
token_status = wix_oauth_service.get_user_token_status(user_id)
|
||||
all_tokens = token_status.get("active_tokens", []) + token_status.get("expired_tokens", [])
|
||||
for token in all_tokens:
|
||||
token_id = token.get("id")
|
||||
if token_id:
|
||||
wix_oauth_service.revoke_token(user_id, token_id)
|
||||
# TODO: Remove stored tokens from database for current_user
|
||||
return {
|
||||
"success": True,
|
||||
"connected": False,
|
||||
"message": "Wix account disconnected successfully"
|
||||
}
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to disconnect Wix: {e}")
|
||||
raise _map_wix_error(e, "Failed to disconnect Wix account")
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# TEST ENDPOINTS - No authentication required for testing
|
||||
# =============================================================================
|
||||
|
||||
@qa_router.get("/connection/status")
|
||||
async def get_test_connection_status(_: Dict[str, Any] = Depends(_require_wix_test_access)) -> WixConnectionStatus:
|
||||
@router.get("/test/connection/status")
|
||||
async def get_test_connection_status() -> WixConnectionStatus:
|
||||
"""
|
||||
TEST ENDPOINT: Check Wix connection status without authentication
|
||||
|
||||
@@ -587,8 +480,8 @@ async def get_test_connection_status(_: Dict[str, Any] = Depends(_require_wix_te
|
||||
)
|
||||
|
||||
|
||||
@qa_router.get("/auth/url")
|
||||
async def get_test_authorization_url(state: Optional[str] = None, _: Dict[str, Any] = Depends(_require_wix_test_access)) -> Dict[str, str]:
|
||||
@router.get("/test/auth/url")
|
||||
async def get_test_authorization_url(state: Optional[str] = None) -> Dict[str, str]:
|
||||
"""
|
||||
TEST ENDPOINT: Get Wix OAuth authorization URL without authentication
|
||||
|
||||
@@ -618,15 +511,15 @@ async def get_test_authorization_url(state: Optional[str] = None, _: Dict[str, A
|
||||
"message": "WIX_CLIENT_ID not configured. Please set it in your .env file to get a real authorization URL."
|
||||
}
|
||||
|
||||
auth_payload = wix_service.get_authorization_url(state)
|
||||
return {"url": auth_payload.get("authorization_url", ""), "state": state or "test_state"}
|
||||
auth_url = wix_service.get_authorization_url(state)
|
||||
return {"url": auth_url, "state": state or "test_state"}
|
||||
except Exception as e:
|
||||
logger.error(f"TEST: Failed to generate authorization URL: {e}")
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
||||
|
||||
|
||||
@qa_router.post("/publish")
|
||||
async def test_publish_to_wix(request: WixPublishRequest, _: Dict[str, Any] = Depends(_require_wix_test_access)) -> Dict[str, Any]:
|
||||
@router.post("/test/publish")
|
||||
async def test_publish_to_wix(request: WixPublishRequest) -> Dict[str, Any]:
|
||||
"""
|
||||
TEST ENDPOINT: Simulate publishing a blog post to Wix without authentication.
|
||||
|
||||
@@ -646,44 +539,28 @@ async def test_publish_to_wix(request: WixPublishRequest, _: Dict[str, Any] = De
|
||||
|
||||
|
||||
@router.post("/refresh-token")
|
||||
async def refresh_wix_token(current_user: dict = Depends(get_current_user)) -> Dict[str, Any]:
|
||||
async def refresh_wix_token(request: Dict[str, Any]) -> Dict[str, Any]:
|
||||
"""
|
||||
Refresh Wix access token using stored refresh token.
|
||||
Refresh Wix access token using refresh token
|
||||
|
||||
Args:
|
||||
current_user: Current authenticated user
|
||||
request: Dict containing refresh_token
|
||||
|
||||
Returns:
|
||||
New token information with access_token, refresh_token, expires_in
|
||||
"""
|
||||
try:
|
||||
user_id = _get_current_user_id(current_user)
|
||||
token_status = wix_oauth_service.get_user_token_status(user_id)
|
||||
all_tokens = token_status.get("active_tokens", []) + token_status.get("expired_tokens", [])
|
||||
|
||||
refresh_token = None
|
||||
token_id = None
|
||||
for t in all_tokens:
|
||||
if t.get("refresh_token"):
|
||||
refresh_token = t["refresh_token"]
|
||||
token_id = t["id"]
|
||||
break
|
||||
|
||||
refresh_token = request.get("refresh_token")
|
||||
if not refresh_token:
|
||||
raise HTTPException(status_code=400, detail="No refresh token found. Please reconnect your Wix account.")
|
||||
raise HTTPException(status_code=400, detail="Missing refresh_token")
|
||||
|
||||
# Refresh the token
|
||||
new_tokens = wix_service.refresh_access_token(refresh_token)
|
||||
|
||||
wix_oauth_service.update_tokens(
|
||||
user_id=user_id,
|
||||
access_token=new_tokens.get("access_token"),
|
||||
refresh_token=new_tokens.get("refresh_token", refresh_token),
|
||||
expires_in=new_tokens.get("expires_in"),
|
||||
token_id=token_id,
|
||||
)
|
||||
|
||||
return {
|
||||
"success": True,
|
||||
"access_token": new_tokens.get("access_token"),
|
||||
"refresh_token": new_tokens.get("refresh_token"),
|
||||
"expires_in": new_tokens.get("expires_in"),
|
||||
"token_type": new_tokens.get("token_type", "Bearer")
|
||||
}
|
||||
@@ -691,11 +568,11 @@ async def refresh_wix_token(current_user: dict = Depends(get_current_user)) -> D
|
||||
raise
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to refresh Wix token: {e}")
|
||||
raise _map_wix_error(e, "Failed to refresh token")
|
||||
raise HTTPException(status_code=500, detail=f"Failed to refresh token: {str(e)}")
|
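Review note: the server-side variant of `refresh_wix_token` scans the user's active and expired tokens and uses the first one that still carries a refresh token. A minimal sketch of that selection follows, assuming `get_user_token_status` returns a dict with `active_tokens` and `expired_tokens` lists of dicts, as the handler code implies. `pick_refresh_token` is a hypothetical helper name.

```python
from typing import Any, Dict, Optional, Tuple


def pick_refresh_token(token_status: Dict[str, Any]) -> Tuple[Optional[str], Optional[Any]]:
    """Return (refresh_token, token_id) for the first token that has a refresh token."""
    all_tokens = token_status.get("active_tokens", []) + token_status.get("expired_tokens", [])
    for token in all_tokens:
        if token.get("refresh_token"):
            return token["refresh_token"], token.get("id")
    return None, None
```

Active tokens are listed first, so an active token's refresh token is preferred over one attached to an expired record.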
||||
|
||||
|
||||
@qa_router.post("/publish/real")
|
||||
async def test_publish_real(payload: Dict[str, Any], _: Dict[str, Any] = Depends(_require_wix_test_access)) -> Dict[str, Any]:
|
||||
@router.post("/test/publish/real")
|
||||
async def test_publish_real(payload: Dict[str, Any]) -> Dict[str, Any]:
|
||||
"""
|
||||
TEST ENDPOINT: Perform a real publish to Wix using a provided access token.
|
||||
|
||||
@@ -763,6 +640,7 @@ async def test_publish_real(payload: Dict[str, Any], _: Dict[str, Any] = Depends
|
||||
"post_id": (result.get("draftPost") or result.get("post") or {}).get("id"),
|
||||
"url": (result.get("draftPost") or result.get("post") or {}).get("url"),
|
||||
"message": "Blog post published to Wix",
|
||||
"raw": result,
|
||||
}
|
||||
except HTTPException:
|
||||
raise
|
||||
@@ -771,8 +649,8 @@ async def test_publish_real(payload: Dict[str, Any], _: Dict[str, Any] = Depends
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
||||
|
||||
|
||||
@qa_router.post("/category")
|
||||
async def test_create_category(request: WixCreateCategoryRequest, _: Dict[str, Any] = Depends(_require_wix_test_access)) -> Dict[str, Any]:
|
||||
@router.post("/test/category")
|
||||
async def test_create_category(request: WixCreateCategoryRequest) -> Dict[str, Any]:
|
||||
try:
|
||||
result = wix_service.create_category(
|
||||
access_token=request.access_token,
|
||||
@@ -786,8 +664,8 @@ async def test_create_category(request: WixCreateCategoryRequest, _: Dict[str, A
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
||||
|
||||
|
||||
@qa_router.post("/tag")
|
||||
async def test_create_tag(request: WixCreateTagRequest, _: Dict[str, Any] = Depends(_require_wix_test_access)) -> Dict[str, Any]:
|
||||
@router.post("/test/tag")
|
||||
async def test_create_tag(request: WixCreateTagRequest) -> Dict[str, Any]:
|
||||
try:
|
||||
result = wix_service.create_tag(
|
||||
access_token=request.access_token,
|
||||
|
||||
@@ -12,7 +12,6 @@ router = APIRouter(prefix="/api/writing-assistant", tags=["writing-assistant"])
|
||||
|
||||
class SuggestRequest(BaseModel):
|
||||
text: str
|
||||
cursor_position: int | None = None
|
||||
|
||||
|
||||
class SourceModel(BaseModel):
|
||||
@@ -33,7 +32,6 @@ class SuggestionModel(BaseModel):
|
||||
class SuggestResponse(BaseModel):
|
||||
success: bool
|
||||
suggestions: List[SuggestionModel]
|
||||
message: str = ""
|
||||
|
||||
|
||||
assistant_service = WritingAssistantService()
|
||||
@@ -43,9 +41,9 @@ assistant_service = WritingAssistantService()
|
||||
async def suggest_endpoint(req: SuggestRequest, current_user: Dict[str, Any] = Depends(get_current_user)) -> SuggestResponse:
|
||||
try:
|
||||
user_id = current_user.get("id")
|
||||
suggestions = await assistant_service.suggest(req.text, user_id=user_id, cursor_position=req.cursor_position)
|
||||
suggestions = await assistant_service.suggest(req.text, user_id=user_id)
|
||||
return SuggestResponse(
|
||||
success=len(suggestions) > 0,
|
||||
success=True,
|
||||
suggestions=[
|
||||
SuggestionModel(
|
||||
text=s.text,
|
||||
@@ -57,8 +55,6 @@ async def suggest_endpoint(req: SuggestRequest, current_user: Dict[str, Any] = D
|
||||
for s in suggestions
|
||||
],
|
||||
)
|
||||
except HTTPException:
|
||||
raise
|
||||
except Exception as e:
|
||||
logger.error(f"Writing assistant error: {e}")
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
||||
|
||||
@@ -459,21 +459,20 @@ async def start_video_render(
|
||||
try:
|
||||
user_id = require_authenticated_user(current_user)
|
||||
|
||||
# Filter enabled scenes FIRST so we can validate credits for the actual count
|
||||
# Validate subscription limits
|
||||
pricing_service = PricingService(db)
|
||||
validate_scene_animation_operation(
|
||||
pricing_service=pricing_service,
|
||||
user_id=user_id
|
||||
)
|
||||
|
||||
# Filter enabled scenes
|
||||
enabled_scenes = [s for s in request.scenes if s.get("enabled", True)]
|
||||
if not enabled_scenes:
|
||||
return VideoRenderResponse(
|
||||
success=False,
|
||||
message="No enabled scenes to render"
|
||||
)
|
||||
|
||||
# Validate subscription limits for ALL scenes in the batch
|
||||
pricing_service = PricingService(db)
|
||||
validate_scene_animation_operation(
|
||||
pricing_service=pricing_service,
|
||||
user_id=user_id,
|
||||
scene_count=len(enabled_scenes),
|
||||
)
|
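Review note: the two orderings in this hunk differ in whether the subscription check sees the number of enabled scenes. A sketch of the "filter first, then validate for the actual count" ordering follows. The signature of `validate_scene_animation_operation` is not shown in the diff, so the `remaining_credits` parameter and the one-credit-per-scene rule here are purely illustrative assumptions.

```python
from typing import Any, Dict, List


def plan_render(scenes: List[Dict[str, Any]], remaining_credits: int) -> Dict[str, Any]:
    """Filter enabled scenes, then check the (hypothetical) credit budget for that count."""
    # Scenes without an explicit "enabled" key are treated as enabled.
    enabled_scenes = [s for s in scenes if s.get("enabled", True)]
    if not enabled_scenes:
        return {"success": False, "message": "No enabled scenes to render"}

    # Assumption: one credit per enabled scene.
    if len(enabled_scenes) > remaining_credits:
        return {
            "success": False,
            "message": f"Need {len(enabled_scenes)} credits, have {remaining_credits}",
        }
    return {"success": True, "scene_count": len(enabled_scenes)}
```

With this ordering, a request whose scenes are all disabled returns early without consulting the pricing check at all.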
||||
|
||||
# VALIDATION: Pre-validate scenes before creating task to prevent wasted API calls
|
||||
validation_errors = []
|
||||
|
||||
@@ -138,7 +138,6 @@ if _is_full_mode():
|
||||
from routers.image_studio import router as image_studio_router
|
||||
from routers.product_marketing import router as product_marketing_router
|
||||
from routers.campaign_creator import router as campaign_creator_router
|
||||
from routers.backlink_outreach import router as backlink_outreach_router
|
||||
else:
|
||||
# In feature-only modes, only load essential assets router
|
||||
from api.assets_serving import router as assets_serving_router
|
||||
@@ -147,28 +146,14 @@ else:
|
||||
image_studio_router = None
|
||||
product_marketing_router = None
|
||||
campaign_creator_router = None
|
||||
backlink_outreach_router = None
|
||||
|
||||
# Import hallucination detector router
|
||||
try:
|
||||
# Import hallucination detector router (skip in feature-only modes - triggers heavy ML)
|
||||
if _is_full_mode():
|
||||
from api.hallucination_detector import router as hallucination_detector_router
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to import hallucination_detector router: {e}")
|
||||
from api.writing_assistant import router as writing_assistant_router
|
||||
else:
|
||||
hallucination_detector_router = None
|
||||
|
||||
# Import charts router (shared chart generation for blog writer, podcast, etc.)
|
||||
try:
|
||||
from api.charts import router as charts_router
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to import charts router: {e}")
|
||||
charts_router = None
|
||||
|
||||
# Import links router (internal & external link search and rewording)
|
||||
try:
|
||||
from api.links import router as links_router
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to import links router: {e}")
|
||||
links_router = None
|
||||
writing_assistant_router = None
|
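Review note: one side of this hunk wraps each optional router import in `try`/`except`, falls back to `None`, and includes the router later only if it is truthy. The same pattern can be factored into a helper, sketched below with `importlib`. `load_optional_router` is a hypothetical name and is not something the diff defines.

```python
import importlib
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)


def load_optional_router(module_path: str, attr: str = "router") -> Optional[Any]:
    """Import module_path and return its attribute, or None if anything fails."""
    try:
        module = importlib.import_module(module_path)
        return getattr(module, attr)
    except Exception as e:  # broad on purpose: optional features must not block startup
        logger.warning(f"Failed to import {module_path}: {e}")
        return None
```

A caller would then write something like `charts_router = load_optional_router("api.charts")` and keep the existing `if charts_router:` guard around the include call.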
||||
|
||||
# Import research configuration router (skip in feature-only modes)
|
||||
if _is_full_mode():
|
||||
@@ -501,18 +486,10 @@ else:
|
||||
"reason": f"Feature-only mode: {enabled_features}",
|
||||
}
|
||||
|
||||
# Safety net: explicitly include hallucination detector (import may fail gracefully)
|
||||
# Safety net: explicitly include hallucination detector (router_manager may skip silently)
|
||||
if hallucination_detector_router:
|
||||
router_manager.include_router_safely(hallucination_detector_router, "hallucination_detector")
|
||||
|
||||
# Include charts router (shared chart generation)
|
||||
if charts_router:
|
||||
router_manager.include_router_safely(charts_router, "charts")
|
||||
|
||||
# Include links router (internal & external link search)
|
||||
if links_router:
|
||||
router_manager.include_router_safely(links_router, "links")
|
||||
|
||||
# Log startup summary
|
||||
router_manager.log_startup_summary()
|
||||
|
||||
@@ -672,9 +649,6 @@ if _is_full_mode():
    # Include Bing Analytics Storage router to expose storage-backed endpoints
    from routers.bing_analytics_storage import router as bing_analytics_storage_router
    app.include_router(bing_analytics_storage_router)
    # Include SEO Tools router with enterprise audit and GSC analysis
    if seo_tools_router:
        app.include_router(seo_tools_router)
    if images_router:
        app.include_router(images_router)
    if image_studio_router:
@@ -683,9 +657,10 @@ if _is_full_mode():
        app.include_router(product_marketing_router)
    if campaign_creator_router:
        app.include_router(campaign_creator_router)
    if backlink_outreach_router:
        app.include_router(backlink_outreach_router)

    # Include content assets router
    from api.content_assets.router import router as content_assets_router
    app.include_router(content_assets_router)
    router_group_status["platform_extensions"] = {
        "mounted": True,
        "reason": "Full mode",
@@ -696,10 +671,6 @@ else:
        "reason": "Skipped in feature-only mode",
    }

# Include content assets router (always — core utility, not feature-specific)
from api.content_assets.router import router as content_assets_router
app.include_router(content_assets_router)

# Include Podcast Maker router (only when podcast feature is enabled)
if _is_feature_enabled("podcast") and "all" not in get_enabled_features():
    from api.podcast.router import router as podcast_router
@@ -1,31 +0,0 @@
# Backlink Migration Audit (Legacy vs Current)

Legacy prototype reference:
- `ToBeMigrated/ai_marketing_tools/ai_backlinker/ai_backlinking.py`
- `ToBeMigrated/ai_marketing_tools/ai_backlinker/backlinking_ui_streamlit.py`

## Implemented in current branch

- Canonical backend entrypoint with backlink-specific naming:
  - `backend/routers/backlink_outreach.py`
  - `backend/services/backlink_outreach_service.py`
- Legacy-style guest-post query template generation exposed over API:
  - `GET /api/backlink-outreach/query-templates?keyword=<keyword>`
- Migration traceability metadata endpoints:
  - `GET /api/backlink-outreach/modules`
  - `GET /api/backlink-outreach/migration-coverage`
- Frontend integration points with backlink-specific naming:
  - `frontend/src/api/backlinkOutreachApi.ts`
  - `frontend/src/stores/backlinkOutreachStore.ts`
  - `frontend/src/components/SEODashboard/BacklinkOutreachModuleList.tsx`

## Not yet migrated (planned)

- Live web prospect discovery / scraping execution loop (`find_backlink_opportunities`).
- Outreach email sending + reply monitoring loop (`send_email`, IMAP checks).
- End-to-end campaign orchestration from keyword batch -> outreach -> follow-up.

## Notes

This branch intentionally provides a clean migration seam and auditable entrypoints first.
Feature-complete parity can now be implemented incrementally behind these stable backend and frontend contracts.
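The audit lists guest-post query template generation behind `GET /api/backlink-outreach/query-templates?keyword=<keyword>`, but the diff does not show how the queries are built. The sketch below is one plausible shape for such a generator, not the service's actual code: the function name `build_guest_post_queries` and the footprint strings are assumptions made for illustration.

```python
from typing import List

# Hypothetical search "footprints"; the real service may use a different set.
GUEST_POST_FOOTPRINTS = (
    '"write for us"',
    '"guest post"',
    '"submit a guest post"',
)


def build_guest_post_queries(keyword: str) -> List[str]:
    """Combine a keyword with each footprint to form search queries."""
    # Collapse runs of whitespace so "  ai   writing " becomes "ai writing".
    cleaned = " ".join(keyword.split())
    if not cleaned:
        raise ValueError("keyword must not be empty")
    return [f"{cleaned} {footprint}" for footprint in GUEST_POST_FOOTPRINTS]
```

Keeping the footprints in a module-level tuple would make the list easy to audit against the legacy prototype.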
@@ -16,15 +16,6 @@ EXA_API_KEY=your_exa_api_key_here

# Frontend URL for OAuth callbacks
FRONTEND_URL=https://alwrity-ai.vercel.app
# Optional comma-separated allowlist of trusted frontend origins used for OAuth callback postMessage targetOrigin.
# If unset, FRONTEND_URL origin is used.
# Example: OAUTH_CALLBACK_ALLOWED_ORIGINS=https://alwrity-ai.vercel.app,http://localhost:3000
OAUTH_CALLBACK_ALLOWED_ORIGINS=

# OAuth Token Encryption (Fernet key - generate with: python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())")
# Used by both WordPress and Wix OAuth token encryption at rest.
# WORDPRESS_TOKEN_ENCRYPTION_KEY and WIX_TOKEN_ENCRYPTION_KEY can override per-provider.
OAUTH_TOKEN_ENCRYPTION_KEY=

# OAuth Redirect URIs (Using environment variable for flexibility)
GSC_REDIRECT_URI=${FRONTEND_URL}/gsc/callback
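The comments in this hunk describe a fallback rule: `OAUTH_CALLBACK_ALLOWED_ORIGINS` is a comma-separated allowlist, and when it is unset the origin of `FRONTEND_URL` is used. A minimal sketch of that resolution, using only the standard library, might look like the following. The helper names `origin_of` and `resolve_allowed_origins` are hypothetical and do not appear in the diff; the backend's real parsing may differ.

```python
from typing import List, Mapping, Optional
from urllib.parse import urlsplit


def origin_of(url: str) -> Optional[str]:
    """Return scheme://host[:port] for an http(s) URL, or None if unusable."""
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return None
    return f"{parts.scheme}://{parts.netloc}"


def resolve_allowed_origins(env: Mapping[str, str]) -> List[str]:
    """Prefer the explicit allowlist; fall back to the FRONTEND_URL origin."""
    origins: List[str] = []
    for item in env.get("OAUTH_CALLBACK_ALLOWED_ORIGINS", "").split(","):
        origin = origin_of(item)
        # Skip blanks, malformed entries, and duplicates.
        if origin and origin not in origins:
            origins.append(origin)
    if origins:
        return origins
    fallback = origin_of(env.get("FRONTEND_URL", ""))
    return [fallback] if fallback else []
```

Passing the environment in as a mapping (for example `os.environ`) rather than reading it inside the function keeps the rule easy to test.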
@@ -77,13 +77,10 @@ from api.images import router as images_router
from routers.image_studio import router as image_studio_router
from routers.product_marketing import router as product_marketing_router
from routers.campaign_creator import router as campaign_creator_router
from routers.backlink_outreach import router as backlink_outreach_router

# Import hallucination detector router
from api.hallucination_detector import router as hallucination_detector_router
from api.writing_assistant import router as writing_assistant_router
from api.charts import router as charts_router
from api.links import router as links_router

# Import research configuration router
from api.research_config import router as research_config_router
@@ -257,10 +254,6 @@ router_manager.include_core_routers()
router_manager.include_router_safely(subscription_router, "subscription")
# Include hallucination detector explicitly (router_manager may skip silently on import failure)
router_manager.include_router_safely(hallucination_detector_router, "hallucination_detector")
# Include charts router (shared chart generation for blog writer, podcast, etc.)
router_manager.include_router_safely(charts_router, "charts")
# Include links router (internal & external link search and rewording)
router_manager.include_router_safely(links_router, "links")
router_manager.include_optional_routers()

# SEO Dashboard endpoints
@@ -403,7 +396,6 @@ app.include_router(images_router)
app.include_router(image_studio_router)
app.include_router(product_marketing_router)
app.include_router(campaign_creator_router)
app.include_router(backlink_outreach_router)

# Include content assets router
from api.content_assets.router import router as content_assets_router
@@ -1,134 +0,0 @@
"""DB models for production backlink outreach tracking."""

from datetime import datetime
from sqlalchemy import Column, String, Integer, Float, DateTime, Text, ForeignKey, Index, Boolean, Date
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class BacklinkCampaign(Base):
    __tablename__ = "backlink_campaigns"
    id = Column(String(64), primary_key=True)
    user_id = Column(String(255), nullable=False, index=True)
    workspace_id = Column(String(255), nullable=False, index=True)
    name = Column(String(255), nullable=False)
    status = Column(String(32), nullable=False, default="drafted", index=True)
    created_at = Column(DateTime, default=datetime.utcnow, index=True)


class BacklinkLead(Base):
    __tablename__ = "backlink_leads"
    id = Column(String(64), primary_key=True)
    campaign_id = Column(String(64), ForeignKey("backlink_campaigns.id"), nullable=False, index=True)
    url = Column(String(1024), nullable=True)
    domain = Column(String(255), nullable=False, index=True)
    page_title = Column(String(512), nullable=True)
    snippet = Column(Text, nullable=True)
    email = Column(String(255), nullable=True, index=True)
    confidence_score = Column(Float, nullable=True, default=0.0)
    discovery_source = Column(String(32), nullable=True, default="duckduckgo")
    status = Column(String(32), nullable=False, default="discovered", index=True)
    notes = Column(Text, nullable=True)
    created_at = Column(DateTime, default=datetime.utcnow, index=True)


class OutreachAttempt(Base):
    __tablename__ = "backlink_outreach_attempts"
    id = Column(String(64), primary_key=True)
    lead_id = Column(String(64), ForeignKey("backlink_leads.id"), nullable=False, index=True)
    campaign_id = Column(String(64), ForeignKey("backlink_campaigns.id"), nullable=False, index=True)
    idempotency_key = Column(String(128), nullable=False, unique=True, index=True)
    sender_email = Column(String(255), nullable=True)
    subject = Column(String(512), nullable=True)
    body = Column(Text, nullable=True)
    status = Column(String(32), nullable=False, default="queued", index=True)
    decision_reason = Column(Text, nullable=True)
    sent_at = Column(DateTime, nullable=True)
    created_at = Column(DateTime, default=datetime.utcnow, index=True)


class OutreachReply(Base):
    __tablename__ = "backlink_replies"
    id = Column(String(64), primary_key=True)
    attempt_id = Column(String(64), ForeignKey("backlink_outreach_attempts.id"), nullable=False, index=True)
    from_email = Column(String(255), nullable=True)
    subject = Column(String(512), nullable=True)
    received_at = Column(DateTime, default=datetime.utcnow, index=True)
    classification = Column(String(32), nullable=False, default="replied")
    body = Column(Text, nullable=True)


class FollowUpSchedule(Base):
    __tablename__ = "backlink_followup_schedules"
    id = Column(String(64), primary_key=True)
    attempt_id = Column(String(64), ForeignKey("backlink_outreach_attempts.id"), nullable=False, index=True)
    subject = Column(String(512), nullable=True)
    body = Column(Text, nullable=True)
    scheduled_for = Column(DateTime, nullable=False, index=True)
    sent = Column(Boolean, default=False, index=True)


class EmailTemplate(Base):
    __tablename__ = "backlink_email_templates"
    id = Column(String(64), primary_key=True)
    user_id = Column(String(255), nullable=False, index=True)
    name = Column(String(128), nullable=False)
    subject_template = Column(String(512), nullable=False)
    body_template = Column(Text, nullable=False)
    variables = Column(Text, nullable=True)
    created_at = Column(DateTime, default=datetime.utcnow)


class SuppressedRecipient(Base):
    __tablename__ = "backlink_suppressed_recipients"
    id = Column(String(64), primary_key=True)
    email = Column(String(255), nullable=False, index=True)
    domain = Column(String(255), nullable=True)
    reason = Column(String(128), nullable=True)
    user_id = Column(String(255), nullable=True)
    created_at = Column(DateTime, default=datetime.utcnow)


class SentIdempotencyKey(Base):
    __tablename__ = "backlink_sent_idempotency_keys"
    id = Column(String(64), primary_key=True)
    idempotency_key = Column(String(128), nullable=False, unique=True, index=True)
    user_id = Column(String(255), nullable=False)
    created_at = Column(DateTime, default=datetime.utcnow)


class AuditLogEntry(Base):
    __tablename__ = "backlink_audit_logs"
    id = Column(String(64), primary_key=True)
    user_id = Column(String(255), nullable=False, index=True)
    campaign_id = Column(String(64), nullable=True)
    event = Column(String(64), nullable=False, index=True)
    recipient = Column(String(255), nullable=True)
    allowed = Column(Boolean, nullable=True)
    reasons = Column(Text, nullable=True)
    override = Column(Boolean, default=False)
    created_at = Column(DateTime, default=datetime.utcnow, index=True)


class SendCounterUser(Base):
    __tablename__ = "backlink_send_counters_user"
    id = Column(String(64), primary_key=True)
    user_id = Column(String(255), nullable=False, index=True)
    date = Column(Date, nullable=False, index=True)
    count = Column(Integer, default=0)


class SendCounterDomain(Base):
    __tablename__ = "backlink_send_counters_domain"
    id = Column(String(64), primary_key=True)
    domain = Column(String(255), nullable=False, index=True)
    date = Column(Date, nullable=False, index=True)
    count = Column(Integer, default=0)


Index("idx_backlink_campaign_user_date", BacklinkCampaign.user_id, BacklinkCampaign.created_at)
Index("idx_backlink_attempt_campaign_date", OutreachAttempt.campaign_id, OutreachAttempt.created_at)
Index("idx_backlink_suppressed_email", SuppressedRecipient.email, SuppressedRecipient.user_id)
Index("idx_backlink_counter_user_date", SendCounterUser.user_id, SendCounterUser.date, unique=True)
Index("idx_backlink_counter_domain_date", SendCounterDomain.domain, SendCounterDomain.date, unique=True)
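The unique indexes on `(user_id, date)` and `(domain, date)` imply one counter row per key per day, which is the usual basis for a daily send cap. The storage service that uses these tables is not shown in this diff, so the following is only an in-memory sketch of that pattern. The class name `DailySendCounter` and the `daily_limit` parameter are assumptions for illustration, not part of the removed code.

```python
from datetime import date
from typing import Dict, Tuple


class DailySendCounter:
    """In-memory stand-in for a (key, date) -> count table with a daily cap."""

    def __init__(self, daily_limit: int) -> None:
        self.daily_limit = daily_limit
        # One bucket per (key, day), mirroring the unique (key, date) index.
        self._counts: Dict[Tuple[str, date], int] = {}

    def can_send(self, key: str, on: date) -> bool:
        """True while the bucket for this key and day is under the cap."""
        return self._counts.get((key, on), 0) < self.daily_limit

    def increment(self, key: str, on: date) -> int:
        """Create the bucket if needed, bump it, and return the new count."""
        bucket = (key, on)
        self._counts[bucket] = self._counts.get(bucket, 0) + 1
        return self._counts[bucket]
```

In a database-backed version the same key would be a user ID or a recipient domain, and the increment would need to be an atomic upsert so concurrent sends cannot both pass the check.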
@@ -157,9 +157,6 @@ class BlogOutlineSection(BaseModel):
    references: List[ResearchSource] = []
    target_words: Optional[int] = None
    keywords: List[str] = []
    chart_data: Optional[Dict[str, Any]] = None
    chart_url: Optional[str] = None
    chart_id: Optional[str] = None


class BlogOutlineRequest(BaseModel):
@@ -1,663 +0,0 @@
"""Backlink outreach router with Clerk auth."""

from typing import Dict, Any
from fastapi import APIRouter, Depends, Query, HTTPException
from fastapi.responses import Response

from services.backlink_outreach_models import (
    BacklinkDiscoveryResponse, BacklinkKeywordInput, DeepKeywordInput,
    LeadCreateRequest, LeadStatusUpdateRequest,
    PolicyValidationRequest, PolicyValidationResponse,
    SendOutreachRequest, SendOutreachResponse,
    OutreachAttemptListResponse, OutreachAttemptRecord,
    OutreachReplyListResponse, OutreachReplyRecord,
    ScheduleFollowUpRequest, FollowUpScheduleRecord,
    EmailTemplateRequest, EmailTemplateRecord,
    GenerateEmailRequest, GeneratedEmailResponse,
    PersonalizeEmailRequest, SubjectLinesRequest, SubjectLinesResponse,
    FollowUpRequest,
    BacklinkReportingSnapshot,
    CampaignAnalyticsResponse, CampaignVolumeResponse,
    ConversionFunnelResponse, BulkStatusUpdateRequest, BulkStatusUpdateResponse,
    SuppressionAddRequest,
)
from services.backlink_outreach_service import backlink_outreach_service
from services.backlink_outreach_storage import BacklinkOutreachStorageService
from services.backlink_outreach_sender import backlink_outreach_sender
from services.backlink_outreach_reply_monitor import backlink_outreach_reply_monitor
from services.backlink_outreach_template_generator import (
    generate_outreach_email,
    generate_personalized_email,
    generate_subject_lines,
    generate_follow_up,
)
from middleware.auth_middleware import get_current_user
from pydantic import BaseModel, Field

router = APIRouter(prefix="/api/backlink-outreach", tags=["backlink-outreach"])


class BacklinkCampaignCreateRequest(BaseModel):
    workspace_id: str = Field(..., min_length=1)
    name: str = Field(..., min_length=3)


def _resolve_user_id(current_user: Dict[str, Any]) -> str:
    return current_user.get("id") or current_user.get("clerk_user_id") or "default"


# -- Auth-Required Endpoints --

@router.get("/modules")
async def get_backlink_module_registry(
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    return {"feature": "backlink_outreach", "modules": backlink_outreach_service.list_backlink_modules()}


@router.get("/query-templates")
async def get_backlink_query_templates(
    keyword: str = Query(..., min_length=1),
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    return {"keyword": keyword, "queries": backlink_outreach_service.generate_guest_post_queries(keyword)}


@router.post("/discover", response_model=BacklinkDiscoveryResponse)
async def discover_backlink_opportunities(
    payload: BacklinkKeywordInput,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    return backlink_outreach_service.discover_opportunities(payload.keyword, payload.max_results)


@router.get("/migration-coverage")
async def get_backlink_migration_coverage(
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    return backlink_outreach_service.get_migration_coverage()


# -- Auth-Required Endpoints --

@router.post("/discover/deep")
async def discover_deep_backlink_opportunities(
    payload: DeepKeywordInput,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """Enhanced discovery using Exa neural search + DuckDuckGo with full-page scraping."""
    user_id = _resolve_user_id(current_user)
    result = await backlink_outreach_service.deep_discover(payload.keyword, payload.max_results)
    if payload.campaign_id:
        storage = BacklinkOutreachStorageService()
        saved = 0
        save_failed = 0
        for opp in result.get("opportunities", []):
            try:
                storage.add_lead(
                    campaign_id=payload.campaign_id,
                    user_id=user_id,
                    url=opp["url"],
                    domain=opp["domain"],
                    page_title=opp.get("page_title", ""),
                    snippet=opp.get("snippet", ""),
                    email=opp.get("email"),
                    confidence_score=opp.get("confidence_score", 0.0),
                    discovery_source=opp.get("discovery_source", "duckduckgo"),
                )
                saved += 1
            except Exception:
                save_failed += 1
        result["saved_to_campaign"] = saved
        result["save_failed"] = save_failed
    return result


@router.post("/campaigns")
async def create_backlink_campaign(
    payload: BacklinkCampaignCreateRequest,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    return storage.create_campaign(user_id, payload.workspace_id, payload.name)


@router.get("/campaigns")
async def list_backlink_campaigns(
    workspace_id: str = Query(None),
    limit: int = 50,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    return {"campaigns": storage.list_campaigns(user_id, workspace_id or user_id, limit)}


@router.get("/campaigns/{campaign_id}")
async def get_backlink_campaign(
    campaign_id: str,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """Get campaign detail with leads."""
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    campaign = storage.get_campaign(campaign_id, user_id)
    if not campaign:
        raise HTTPException(status_code=404, detail="Campaign not found")
    return campaign


@router.get("/campaigns/{campaign_id}/leads")
async def list_campaign_leads(
    campaign_id: str,
    status: str = Query(None),
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """List leads for a campaign, optionally filtered by status."""
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    leads = storage.list_leads(campaign_id, user_id, status=status or None)
    return {"leads": leads, "total": len(leads)}


@router.post("/campaigns/{campaign_id}/leads")
async def add_campaign_lead(
    campaign_id: str,
    payload: LeadCreateRequest,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """Add a single lead to a campaign."""
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    try:
        lead = storage.add_lead(
            campaign_id=campaign_id,
            user_id=user_id,
            url=payload.url,
            domain=payload.domain,
            page_title=payload.page_title or "",
            snippet=payload.snippet or "",
            email=payload.email,
            confidence_score=payload.confidence_score,
            notes=payload.notes,
        )
        return lead
    except Exception as e:
        raise HTTPException(status_code=500, detail="Failed to add lead")


@router.post("/leads/bulk-status", response_model=BulkStatusUpdateResponse)
async def bulk_update_lead_status(
    payload: BulkStatusUpdateRequest,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """Bulk update lead statuses."""
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    updated = 0
    failed: list[str] = []
    for lid in payload.lead_ids:
        try:
            lead = storage.update_lead_status(lid, user_id, payload.status, payload.notes)
            if lead:
                updated += 1
            else:
                failed.append(lid)
        except Exception:
            failed.append(lid)
    return BulkStatusUpdateResponse(updated=updated, failed=failed)


@router.patch("/leads/{lead_id}/status")
async def update_lead_status(
    lead_id: str,
    payload: LeadStatusUpdateRequest,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """Update lead status (discovered -> contacted -> replied -> placed)."""
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    lead = storage.update_lead_status(lead_id, user_id, payload.status, payload.notes)
    if not lead:
        raise HTTPException(status_code=404, detail="Lead not found")
    return lead


@router.post("/policy-validate", response_model=PolicyValidationResponse)
async def validate_outreach_policy(
    payload: PolicyValidationRequest,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    return backlink_outreach_service.validate_send_policy(payload)


@router.get("/reporting", response_model=BacklinkReportingSnapshot)
async def get_backlink_reporting_snapshot(
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    user_id = _resolve_user_id(current_user)
    return backlink_outreach_service.get_reporting_snapshot(user_id=user_id)


# -- Outreach Attempts --

@router.post("/send-outreach", response_model=SendOutreachResponse)
async def send_outreach(
    payload: SendOutreachRequest,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """Validate policy, record attempt, personalize, and send email."""
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    subject = payload.subject
    body = payload.body

    if payload.template_id:
        tmpl = storage.get_template(payload.template_id, user_id)
        if tmpl:
            variables = payload.template_variables or {}
            subject = backlink_outreach_sender.personalize(tmpl.get("subject_template", subject), variables)
            body = backlink_outreach_sender.personalize(tmpl.get("body_template", body), variables)

    result = backlink_outreach_service.send_outreach(
        SendOutreachRequest(
            lead_id=payload.lead_id,
            campaign_id=payload.campaign_id,
            user_id=user_id,
            workspace_id=payload.workspace_id,
            sender_email=payload.sender_email,
            subject=subject,
            body=body,
            idempotency_key=payload.idempotency_key,
        )
    )

    lead_email = ""
    if result.attempt_id:
        lead = storage.get_lead(payload.lead_id, user_id=user_id)
        lead_email = (lead.get("email") or "") if lead else ""

    if result.policy_allowed and lead_email:
        sent = await backlink_outreach_sender.send_email(
            to_email=lead_email,
            subject=subject,
            body=body,
        )
        status = "sent" if sent else "failed"
        storage.update_attempt_status(result.attempt_id, status, user_id=user_id)
        result.status = status
        if sent:
            storage.mark_idempotency(payload.idempotency_key, user_id)
            storage.increment_user_send_counter(user_id)
            domain = lead_email.split("@")[-1] if "@" in lead_email else "unknown"
            storage.increment_domain_send_counter(domain, user_id=user_id)
    elif result.policy_allowed and not lead_email:
        storage.update_attempt_status(result.attempt_id, "failed", user_id=user_id)
        result.status = "failed"
        result.policy_reasons = (result.policy_reasons or []) + ["lead_has_no_email"]

    return result


@router.get("/campaigns/{campaign_id}/attempts", response_model=OutreachAttemptListResponse)
async def list_campaign_attempts(
    campaign_id: str,
    limit: int = Query(50),
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """List outreach attempts for a campaign."""
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    attempts = storage.list_attempts(campaign_id, limit, user_id=user_id)
    return {"attempts": attempts, "total": len(attempts)}


# -- Replies --

@router.get("/campaigns/{campaign_id}/replies", response_model=OutreachReplyListResponse)
async def list_campaign_replies(
    campaign_id: str,
    limit: int = Query(50),
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """List received replies for a campaign."""
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    replies = storage.list_replies(campaign_id, limit, user_id=user_id)
    return {"replies": replies, "total": len(replies)}


@router.post("/replies/poll")
async def poll_replies(
    sent_from_email: str = Query(..., min_length=3),
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """Poll IMAP inbox for new replies and store them."""
    user_id = _resolve_user_id(current_user)
    if not backlink_outreach_reply_monitor.is_configured():
        raise HTTPException(status_code=503, detail="IMAP not configured")

    storage = BacklinkOutreachStorageService()
    raw_replies = await backlink_outreach_reply_monitor.poll_replies(sent_from_email)
    stored = []
    skipped = 0
    failed = 0
    for raw in raw_replies:
        try:
            from_email = raw.get("from_email", "")
            subject = raw.get("subject", "")
            if storage.reply_exists(from_email, subject, user_id=user_id):
                skipped += 1
                continue
            attempt_id = storage.find_attempt_by_from_email(from_email, user_id=user_id) or ""
            reply = storage.add_reply(
                attempt_id=attempt_id,
                from_email=from_email,
                subject=subject,
                body=raw.get("body", ""),
                classification=raw.get("classification", "replied"),
                user_id=user_id,
            )
            stored.append(reply)
        except Exception:
            failed += 1
    return {"polled": len(raw_replies), "stored": len(stored), "skipped": skipped, "failed": failed, "replies": stored}


# -- Follow-ups --

@router.post("/campaigns/{campaign_id}/schedule-followup")
async def schedule_followup(
    campaign_id: str,
    payload: ScheduleFollowUpRequest,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """Schedule a follow-up for an outreach attempt."""
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    sched = storage.schedule_followup(
        attempt_id=payload.attempt_id,
        scheduled_for=payload.scheduled_for,
        subject=payload.subject or "",
        body=payload.body or "",
        user_id=user_id,
    )
    return {"campaign_id": campaign_id, "schedule": sched}


@router.get("/campaigns/{campaign_id}/followups")
async def list_followups(
    campaign_id: str,
    limit: int = Query(50),
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """List scheduled follow-ups for a campaign."""
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    followups = storage.list_followups(campaign_id, limit, user_id=user_id)
    return {"followups": followups, "total": len(followups)}


# -- Email Templates --

@router.post("/templates")
async def create_template(
    payload: EmailTemplateRequest,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """Create an email template."""
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    return storage.create_template(
        user_id=user_id,
        name=payload.name,
        subject_template=payload.subject_template,
        body_template=payload.body_template,
        variables=payload.variables,
    )


@router.get("/templates")
async def list_templates(
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """List email templates for the authenticated user."""
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    return {"templates": storage.list_templates(user_id)}


@router.get("/templates/{template_id}")
async def get_template(
    template_id: str,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """Get a specific email template."""
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    tmpl = storage.get_template(template_id, user_id)
    if not tmpl:
        raise HTTPException(status_code=404, detail="Template not found")
    return tmpl


@router.delete("/templates/{template_id}")
async def delete_template(
    template_id: str,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """Delete an email template."""
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    if not storage.delete_template(template_id, user_id):
        raise HTTPException(status_code=404, detail="Template not found")
    return {"deleted": True}


@router.post("/templates/generate", response_model=GeneratedEmailResponse)
async def generate_email_template(
    payload: GenerateEmailRequest,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """Generate an outreach email using AI."""
    user_id = _resolve_user_id(current_user)
    existing_body = None
    if payload.existing_template_id:
        storage = BacklinkOutreachStorageService()
        tmpl = storage.get_template(payload.existing_template_id, user_id)
        if tmpl:
            existing_body = tmpl.get("body_template")

    result = generate_outreach_email(
        topic=payload.topic,
        target_site=payload.target_site,
        tone=payload.tone,
        user_id=user_id,
        existing_body=existing_body,
    )
    return result


@router.post("/generate/personalized", response_model=GeneratedEmailResponse)
async def generate_personalized_email_endpoint(
    payload: PersonalizeEmailRequest,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """Personalize an outreach email for a specific lead."""
    user_id = _resolve_user_id(current_user)
    result = generate_personalized_email(
        lead_name=payload.lead_name,
        lead_site=payload.lead_site,
        lead_content_topic=payload.lead_content_topic,
        pitch_topic=payload.pitch_topic,
        existing_body=payload.existing_body,
        user_id=user_id,
    )
    return result


@router.post("/generate/subject-lines", response_model=SubjectLinesResponse)
async def generate_subject_lines_endpoint(
    payload: SubjectLinesRequest,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """Generate subject line suggestions for an email body."""
    user_id = _resolve_user_id(current_user)
    subjects = generate_subject_lines(
        body=payload.body,
        count=payload.count,
        user_id=user_id,
    )
    return {"subjects": subjects}


@router.post("/generate/follow-up", response_model=GeneratedEmailResponse)
async def generate_follow_up_endpoint(
    payload: FollowUpRequest,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """Generate a follow-up email for an outreach attempt."""
    user_id = _resolve_user_id(current_user)
    result = generate_follow_up(
        original_subject=payload.original_subject,
        original_body=payload.original_body,
        days_elapsed=payload.days_elapsed,
        reply_context=payload.reply_context,
        user_id=user_id,
    )
    return result


# -- Suppression --
|
||||
|
||||
@router.get("/suppression")
|
||||
async def list_suppression(
|
||||
current_user: Dict[str, Any] = Depends(get_current_user),
|
||||
):
|
||||
"""List suppressed recipients."""
|
||||
user_id = _resolve_user_id(current_user)
|
||||
storage = BacklinkOutreachStorageService()
|
||||
return {"suppressed": storage.list_suppressed(user_id)}
|
||||
|
||||
|
||||
@router.post("/suppression")
|
||||
async def add_suppression(
|
||||
payload: SuppressionAddRequest,
|
||||
current_user: Dict[str, Any] = Depends(get_current_user),
|
||||
):
|
||||
"""Add a recipient to the suppression list."""
|
||||
user_id = _resolve_user_id(current_user)
|
||||
storage = BacklinkOutreachStorageService()
|
||||
return storage.add_suppressed(email=payload.email, domain=payload.domain, reason=payload.reason, user_id=user_id)
|
||||
|
||||
|
||||
@router.get("/campaigns/{campaign_id}/analytics/volume", response_model=CampaignVolumeResponse)
|
||||
async def get_campaign_analytics_volume(
|
||||
campaign_id: str,
|
||||
days: int = Query(30, ge=1, le=365),
|
||||
current_user: Dict[str, Any] = Depends(get_current_user),
|
||||
):
|
||||
"""Get daily send volume for a campaign over the last N days."""
|
||||
user_id = _resolve_user_id(current_user)
|
||||
return backlink_outreach_service.get_campaign_volume(campaign_id, days, user_id=user_id)
|
||||
|
||||
|
||||
@router.get("/campaigns/{campaign_id}/analytics/funnel", response_model=ConversionFunnelResponse)
|
||||
async def get_campaign_analytics_funnel(
|
||||
campaign_id: str,
|
||||
current_user: Dict[str, Any] = Depends(get_current_user),
|
||||
):
|
||||
"""Get conversion funnel (lead status breakdown) for a campaign."""
|
||||
user_id = _resolve_user_id(current_user)
|
||||
return backlink_outreach_service.get_campaign_funnel(campaign_id, user_id=user_id)
|
||||
|
||||
|
||||
@router.get("/campaigns/{campaign_id}/export/leads")
|
||||
async def export_campaign_leads_csv(
|
||||
campaign_id: str,
|
||||
current_user: Dict[str, Any] = Depends(get_current_user),
|
||||
):
|
||||
"""Export campaign leads as CSV."""
|
||||
user_id = _resolve_user_id(current_user)
|
||||
csv_content = backlink_outreach_service.export_leads_csv(campaign_id, user_id=user_id)
|
||||
return Response(content=csv_content, media_type="text/csv",
|
||||
headers={"Content-Disposition": f"attachment; filename=leads_{campaign_id}.csv"})
|
||||
|
||||
|
||||
@router.get("/campaigns/{campaign_id}/export/attempts")
|
||||
async def export_campaign_attempts_csv(
|
||||
campaign_id: str,
|
||||
current_user: Dict[str, Any] = Depends(get_current_user),
|
||||
):
|
||||
"""Export campaign outreach attempts as CSV."""
|
||||
user_id = _resolve_user_id(current_user)
|
||||
csv_content = backlink_outreach_service.export_attempts_csv(campaign_id, user_id=user_id)
|
||||
return Response(content=csv_content, media_type="text/csv",
|
||||
headers={"Content-Disposition": f"attachment; filename=attempts_{campaign_id}.csv"})
|
||||
|
||||
|
||||
@router.get("/campaigns/{campaign_id}/export/replies")
|
||||
async def export_campaign_replies_csv(
|
||||
campaign_id: str,
|
||||
current_user: Dict[str, Any] = Depends(get_current_user),
|
||||
):
|
||||
"""Export campaign replies as CSV."""
|
||||
user_id = _resolve_user_id(current_user)
|
||||
csv_content = backlink_outreach_service.export_replies_csv(campaign_id, user_id=user_id)
|
||||
return Response(content=csv_content, media_type="text/csv",
|
||||
headers={"Content-Disposition": f"attachment; filename=replies_{campaign_id}.csv"})
|
||||
|
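The three export endpoints above return CSV text produced by `backlink_outreach_service.export_*_csv`, whose implementation is not part of this diff. As a minimal sketch of what such a helper might look like, assuming rows are plain dictionaries (the `rows_to_csv` name and the example column names are hypothetical, not taken from the service), the standard `csv` module takes care of quoting and line endings:

```python
import csv
import io
from typing import Any, Dict, Iterable, List


def rows_to_csv(rows: Iterable[Dict[str, Any]], fieldnames: List[str]) -> str:
    """Serialize dict rows to CSV text (hypothetical helper, not the service's code).

    Keys missing from a row become empty cells; keys not listed in
    `fieldnames` are dropped.
    """
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        # Project each row onto the requested columns only.
        writer.writerow({name: row.get(name, "") for name in fieldnames})
    return buffer.getvalue()


# Example with made-up lead data; the column names are assumptions.
example = rows_to_csv(
    [{"email": "a@example.com", "status": "placed", "internal": 1}],
    ["email", "status"],
)
```

Building the text in memory keeps the endpoint code unchanged, since the result can be passed straight to `Response(content=..., media_type="text/csv")`.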
||||
|
||||
# -- Audit Log --

@router.get("/audit-logs")
async def list_audit_logs(
    campaign_id: str = Query(None),
    limit: int = Query(100),
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """List audit log entries, optionally filtered by campaign."""
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    return {"logs": storage.list_audit_logs(campaign_id or None, limit, user_id=user_id)}


# -- Analytics --

@router.get("/campaigns/{campaign_id}/analytics", response_model=CampaignAnalyticsResponse)
async def get_campaign_analytics(
    campaign_id: str,
    current_user: Dict[str, Any] = Depends(get_current_user),
):
    """Get campaign analytics: send volume, response/placement rates, reply breakdown."""
    user_id = _resolve_user_id(current_user)
    storage = BacklinkOutreachStorageService()
    campaign = storage.get_campaign(campaign_id, user_id)
    if not campaign:
        raise HTTPException(status_code=404, detail="Campaign not found")

    attempts = storage.list_attempts(campaign_id, user_id=user_id)
    replies = storage.list_replies(campaign_id, user_id=user_id)
    leads = storage.list_leads_all(campaign_id, user_id=user_id)

    total_sent = sum(1 for a in attempts if a.get("status") == "sent")
    total_blocked = sum(1 for a in attempts if a.get("status") == "blocked")
    total_replied = len(replies)
    total_placed = sum(1 for l in leads if l.get("status") == "placed")

    reply_classification = {}
    for r in replies:
        cls = r.get("classification", "replied")
        reply_classification[cls] = reply_classification.get(cls, 0) + 1

    return CampaignAnalyticsResponse(
        campaign_id=campaign_id,
        lead_count=campaign.get("lead_count", 0),
        send_volume=total_sent,
        blocked_count=total_blocked,
        reply_count=total_replied,
        response_rate=round(total_replied / total_sent, 4) if total_sent > 0 else 0.0,
        placement_rate=round(total_placed / campaign.get("lead_count", 1), 4) if campaign.get("lead_count", 0) > 0 else 0.0,
        reply_classification=reply_classification,
    )
|
||||
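The aggregation inside `get_campaign_analytics` can be checked in isolation from FastAPI and the storage layer. The following is a sketch only, assuming the same dictionary shapes the endpoint reads (`status` on attempts and leads, optional `classification` on replies); `summarize_campaign` is a hypothetical name, and `collections.Counter` stands in for the manual counting loop:

```python
from collections import Counter
from typing import Any, Dict, List


def summarize_campaign(
    attempts: List[Dict[str, Any]],
    replies: List[Dict[str, Any]],
    leads: List[Dict[str, Any]],
    lead_count: int,
) -> Dict[str, Any]:
    """Compute the same figures as the analytics endpoint from plain lists."""
    total_sent = sum(1 for a in attempts if a.get("status") == "sent")
    total_blocked = sum(1 for a in attempts if a.get("status") == "blocked")
    total_placed = sum(1 for lead in leads if lead.get("status") == "placed")
    # Replies without a classification fall back to "replied", as in the endpoint.
    classification = Counter(r.get("classification", "replied") for r in replies)
    return {
        "send_volume": total_sent,
        "blocked_count": total_blocked,
        "reply_count": len(replies),
        # Guard both ratios against division by zero.
        "response_rate": round(len(replies) / total_sent, 4) if total_sent > 0 else 0.0,
        "placement_rate": round(total_placed / lead_count, 4) if lead_count > 0 else 0.0,
        "reply_classification": dict(classification),
    }


# Made-up sample data for illustration.
summary = summarize_campaign(
    attempts=[{"status": "sent"}, {"status": "sent"}, {"status": "sent"}, {"status": "blocked"}],
    replies=[{"classification": "interested"}, {}],
    leads=[{"status": "placed"}, {"status": "new"}],
    lead_count=4,
)
```

Keeping the arithmetic in a pure function like this would make the zero-send and zero-lead cases easy to unit test.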
@@ -8,7 +8,6 @@ from loguru import logger
|
||||
import os
|
||||
|
||||
from services.gsc_service import GSCService
|
||||
from services.gsc_brainstorm_service import GSCBrainstormService
|
||||
from middleware.auth_middleware import get_current_user
|
||||
|
||||
# Initialize router
|
||||
@@ -16,7 +15,6 @@ router = APIRouter(prefix="/gsc", tags=["Google Search Console"])
|
||||
|
||||
# Initialize GSC service
|
||||
gsc_service = GSCService()
|
||||
brainstorm_service = GSCBrainstormService(gsc_service)
|
||||
|
||||
# Pydantic models
|
||||
class GSCAnalyticsRequest(BaseModel):
|
||||
@@ -24,10 +22,6 @@ class GSCAnalyticsRequest(BaseModel):
|
||||
start_date: Optional[str] = None
|
||||
end_date: Optional[str] = None
|
||||
|
||||
class GSCBrainstormRequest(BaseModel):
|
||||
keywords: str
|
||||
site_url: Optional[str] = None
|
||||
|
||||
class GSCStatusResponse(BaseModel):
|
||||
connected: bool
|
||||
sites: Optional[List[Dict[str, Any]]] = None
|
||||
@@ -76,22 +70,12 @@ async def handle_gsc_callback(
|
||||
|
||||
success = gsc_service.handle_oauth_callback(code, state)
|
||||
|
||||
# If state verification failed, check if user is already connected
|
||||
# (handles duplicate callbacks where state was consumed by a prior request)
|
||||
if not success:
|
||||
user_id_from_state = state.split(':')[0] if ':' in state else None
|
||||
if user_id_from_state:
|
||||
existing_creds = gsc_service.load_user_credentials(user_id_from_state)
|
||||
if existing_creds:
|
||||
logger.info(f"GSC OAuth state already consumed, but user {user_id_from_state} has valid credentials — treating as success")
|
||||
success = True
|
||||
|
||||
if success:
|
||||
logger.info("GSC OAuth callback handled successfully")
|
||||
|
||||
# Create GSC insights task immediately after successful connection
|
||||
try:
|
||||
from services.database import get_session_for_user
|
||||
from services.database import SessionLocal
|
||||
from services.platform_insights_monitoring_service import create_platform_insights_task
|
||||
|
||||
# Get user_id from state (stored during OAuth flow)
|
||||
@@ -99,24 +83,23 @@ async def handle_gsc_callback(
|
||||
user_id = state.split(':')[0] if ':' in state else None
|
||||
|
||||
if user_id:
|
||||
db = get_session_for_user(user_id)
|
||||
if db:
|
||||
try:
|
||||
task_result = create_platform_insights_task(
|
||||
user_id=user_id,
|
||||
platform='gsc',
|
||||
site_url=None,
|
||||
db=db
|
||||
)
|
||||
|
||||
if task_result.get('success'):
|
||||
logger.info(f"Created GSC insights task for user {user_id}")
|
||||
else:
|
||||
logger.warning(f"Failed to create GSC insights task: {task_result.get('error')}")
|
||||
finally:
|
||||
db.close()
|
||||
else:
|
||||
logger.warning(f"Could not create DB session for user {user_id}")
|
||||
db = SessionLocal()
|
||||
try:
|
||||
# Create insights task without site_url to avoid API calls
|
||||
# The executor will fetch it when the task runs (weekly)
|
||||
task_result = create_platform_insights_task(
|
||||
user_id=user_id,
|
||||
platform='gsc',
|
||||
site_url=None, # Will be fetched by executor when task runs
|
||||
db=db
|
||||
)
|
||||
|
||||
if task_result.get('success'):
|
||||
logger.info(f"Created GSC insights task for user {user_id}")
|
||||
else:
|
||||
logger.warning(f"Failed to create GSC insights task: {task_result.get('error')}")
|
||||
finally:
|
||||
db.close()
|
||||
else:
|
||||
logger.warning(f"Could not extract user_id from state: {state}")
|
||||
except Exception as e:
|
||||
@@ -136,10 +119,7 @@ async def handle_gsc_callback(
|
||||
</body>
|
||||
</html>
|
||||
"""
|
||||
return HTMLResponse(
|
||||
content=html,
|
||||
headers={"Cross-Origin-Opener-Policy": "unsafe-none"},
|
||||
)
|
||||
return HTMLResponse(content=html)
|
||||
else:
|
||||
logger.error("Failed to handle GSC OAuth callback")
|
||||
html = """
|
||||
@@ -154,11 +134,7 @@ async def handle_gsc_callback(
|
||||
</body>
|
||||
</html>
|
||||
"""
|
||||
return HTMLResponse(
|
||||
status_code=400,
|
||||
content=html,
|
||||
headers={"Cross-Origin-Opener-Policy": "unsafe-none"},
|
||||
)
|
||||
return HTMLResponse(status_code=400, content=html)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error handling GSC OAuth callback: {e}")
|
||||
@@ -175,11 +151,7 @@ async def handle_gsc_callback(
|
||||
</body>
|
||||
</html>
|
||||
"""
|
||||
return HTMLResponse(
|
||||
status_code=500,
|
||||
content=html,
|
||||
headers={"Cross-Origin-Opener-Policy": "unsafe-none"},
|
||||
)
|
||||
return HTMLResponse(status_code=500, content=html)
|
||||
|
||||
@router.get("/sites")
|
||||
async def get_gsc_sites(user: dict = Depends(get_current_user)):
|
||||
@@ -227,49 +199,6 @@ async def get_gsc_analytics(
|
||||
logger.error(f"Error getting GSC analytics: {e}")
|
||||
raise HTTPException(status_code=500, detail=f"Error getting analytics: {str(e)}")
|
||||
|
||||
@router.post("/brainstorm")
async def brainstorm_topics(
    request: GSCBrainstormRequest,
    user: dict = Depends(get_current_user),
):
    """Brainstorm blog topic suggestions based on the user's GSC data.

    The user must have GSC connected. If no site_url is provided,
    the first verified site is used automatically.
    """
    try:
        user_id = user.get('id')
        if not user_id:
            raise HTTPException(status_code=400, detail="User ID not found")

        tokens = request.keywords.strip().split()
        if len(tokens) < 3:
            raise HTTPException(
                status_code=400,
                detail="Please provide at least 3 words for brainstorming topic suggestions.",
            )

        logger.info(f"GSC brainstorm for user: {user_id}, keywords: {request.keywords!r}")

        result = brainstorm_service.brainstorm_topics(
            user_id=user_id,
            keywords=request.keywords,
            site_url=request.site_url,
        )

        if "error" in result and not result.get("content_opportunities"):
            status = 400 if "No GSC sites" in result["error"] else 500
            raise HTTPException(status_code=status, detail=result["error"])

        logger.info(f"GSC brainstorm completed for user: {user_id}")
        return result

    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"Error in GSC brainstorm: {e}")
        raise HTTPException(status_code=500, detail=f"Error brainstorming topics: {str(e)}")
|
||||
|
||||
@router.get("/sitemaps/{site_url:path}")
|
||||
async def get_gsc_sitemaps(
|
||||
site_url: str,
|
||||
|
||||
@@ -63,8 +63,8 @@ async def save_to_library(
|
||||
file_path = assets_dir / filename
|
||||
file_path.write_bytes(image_bytes)
|
||||
|
||||
# Build serving URL (assets_serving.py serves /{user_id}/images/{filename})
|
||||
file_url = f"/api/assets/{safe_user}/images/{filename}"
|
||||
# Build serving URL (assets_serving.py serves /{user_id}/avatars/{filename})
|
||||
file_url = f"/api/assets/{safe_user}/avatars/{filename}"
|
||||
|
||||
# Save to unified asset library via existing utility
|
||||
from utils.asset_tracker import save_asset_to_library
|
||||
|
||||
@@ -29,7 +29,6 @@ from services.seo_tools.opengraph_service import OpenGraphService
|
||||
from services.seo_tools.on_page_seo_service import OnPageSEOService
|
||||
from services.seo_tools.technical_seo_service import TechnicalSEOService
|
||||
from services.seo_tools.enterprise_seo_service import EnterpriseSEOService
|
||||
from services.seo_tools.gsc_analyzer_service import GSCAnalyzerService
|
||||
from services.seo_tools.content_strategy_service import ContentStrategyService
|
||||
from services.database import get_session_for_user
|
||||
from api.content_planning.services.content_strategy.onboarding import OnboardingDataIntegrationService
|
||||
@@ -129,28 +128,6 @@ class CompetitiveSitemapBenchmarkingRunRequest(BaseModel):
|
||||
max_competitors: int = Field(default=5, ge=1, le=10, description="Max competitors to analyze")
|
||||
competitors: Optional[List[HttpUrl]] = Field(None, description="Optional explicit competitor URLs")
|
||||
|
||||
class EnterpriseAuditRequest(BaseModel):
    """Request model for complete enterprise SEO audit"""
    website_url: HttpUrl = Field(..., description="Primary website URL to audit")
    competitors: Optional[List[HttpUrl]] = Field(None, description="Competitor URLs for benchmarking (max 5)")
    target_keywords: Optional[List[str]] = Field(None, description="Target keywords for analysis")
    include_content_analysis: bool = Field(default=True, description="Include content strategy analysis")
    include_competitive_analysis: bool = Field(default=True, description="Include competitive benchmarking")
    generate_executive_report: bool = Field(default=True, description="Generate executive summary")

class GSCAnalysisRequest(BaseModel):
    """Request model for advanced GSC analysis"""
    site_url: HttpUrl = Field(..., description="Website URL registered in Google Search Console")
    date_range_days: int = Field(default=90, ge=7, le=365, description="Number of days to analyze")
    include_opportunities: bool = Field(default=True, description="Include content opportunity analysis")
    include_competitive: bool = Field(default=True, description="Include competitive positioning")

class ContentOpportunitiesRequest(BaseModel):
    """Request model for content opportunities report"""
    site_url: HttpUrl = Field(..., description="Website URL registered in GSC")
    min_impressions: int = Field(default=100, ge=10, description="Minimum impressions threshold")
    date_range_days: int = Field(default=90, ge=7, le=365, description="Number of days to analyze")
|
||||
|
||||
# Exception Handler
|
||||
async def handle_seo_tool_exception(func_name: str, error: Exception, request_data: Dict) -> ErrorResponse:
|
||||
"""Handle exceptions from SEO tools with intelligent logging"""
|
||||
@@ -859,225 +836,3 @@ async def get_tools_status() -> BaseResponse:
|
||||
"timestamp": datetime.utcnow().isoformat()
|
||||
}
|
||||
)
|
||||
|
||||
|
||||
# ==================== ENTERPRISE AUDIT ENDPOINTS ====================

@router.post("/enterprise/complete-audit", response_model=BaseResponse)
@log_api_call
async def execute_enterprise_audit(
    request: EnterpriseAuditRequest,
    background_tasks: BackgroundTasks,
    current_user: dict = Depends(get_current_user)
) -> Union[BaseResponse, ErrorResponse]:
    """
    Execute comprehensive enterprise SEO audit with full orchestration.

    Combines multiple SEO analysis tools into an intelligent workflow:
    - Technical SEO audit with issue severity classification
    - On-page SEO analysis with keyword optimization
    - PageSpeed Insights with Core Web Vitals analysis
    - Sitemap analysis with trend detection
    - Content strategy with competitive comparison
    - Competitive benchmarking across specified competitors
    - AI-powered insights and recommendations

    Returns prioritized action items with implementation roadmap.
    """
    start_time = datetime.utcnow()

    try:
        logger.info(f"Starting enterprise audit for {request.website_url}")

        # Initialize service
        enterprise_service = EnterpriseSEOService()

        # Execute audit
        audit_result = await enterprise_service.execute_complete_audit(
            website_url=str(request.website_url),
            competitors=[str(c) for c in request.competitors] if request.competitors else [],
            target_keywords=request.target_keywords or [],
            include_content_analysis=request.include_content_analysis,
            include_competitive_analysis=request.include_competitive_analysis,
            generate_executive_report=request.generate_executive_report
        )

        execution_time = (datetime.utcnow() - start_time).total_seconds()

        return BaseResponse(
            success=True,
            message="Complete enterprise audit executed successfully",
            execution_time=execution_time,
            data=audit_result
        )

    except Exception as e:
        logger.error(f"Enterprise audit failed: {str(e)}", exc_info=True)
        return await handle_seo_tool_exception("execute_enterprise_audit", e, request.dict())


@router.post("/enterprise/quick-audit", response_model=BaseResponse)
@log_api_call
async def execute_quick_enterprise_audit(
    website_url: HttpUrl,
    current_user: dict = Depends(get_current_user)
) -> Union[BaseResponse, ErrorResponse]:
    """
    Execute quick 5-minute enterprise audit focusing on critical issues.

    Provides rapid assessment of most critical SEO problems:
    - Technical SEO critical issues
    - PageSpeed performance bottlenecks
    - Top 3 actionable recommendations
    - Estimated business impact
    """
    start_time = datetime.utcnow()

    try:
        logger.info(f"Starting quick audit for {website_url}")

        enterprise_service = EnterpriseSEOService()
        audit_result = await enterprise_service.execute_quick_audit(str(website_url))

        execution_time = (datetime.utcnow() - start_time).total_seconds()

        return BaseResponse(
            success=True,
            message="Quick audit completed",
            execution_time=execution_time,
            data=audit_result
        )

    except Exception as e:
        return await handle_seo_tool_exception("execute_quick_enterprise_audit", e, {"website_url": str(website_url)})


# ==================== ADVANCED GSC ANALYSIS ENDPOINTS ====================

@router.post("/gsc/analyze-search-performance", response_model=BaseResponse)
@log_api_call
async def analyze_gsc_search_performance(
    request: GSCAnalysisRequest,
    current_user: dict = Depends(get_current_user)
) -> Union[BaseResponse, ErrorResponse]:
    """
    Advanced Google Search Console analysis with comprehensive insights.

    Provides deep dive into search performance:
    - Performance overview with aggregated metrics
    - Keyword analysis with trend detection
    - Page-level performance breakdown
    - Content opportunity identification (15+ opportunities scored)
    - Technical SEO signal analysis
    - Competitive positioning assessment
    - AI-powered strategic recommendations

    Each analysis component includes:
    - Current metrics and trends
    - Performance scores (0-100)
    - Actionable recommendations
    - Implementation priority
    """
    start_time = datetime.utcnow()

    try:
        logger.info(f"Starting GSC analysis for {request.site_url}")

        user_id = str(current_user.get("id")) if current_user else None

        gsc_service = GSCAnalyzerService()
        analysis_result = await gsc_service.analyze_search_performance(
            site_url=str(request.site_url),
            date_range_days=request.date_range_days,
            user_id=user_id
        )

        execution_time = (datetime.utcnow() - start_time).total_seconds()

        return BaseResponse(
            success=True,
            message="GSC search performance analysis completed",
            execution_time=execution_time,
            data=analysis_result
        )

    except Exception as e:
        logger.error(f"GSC analysis failed: {str(e)}", exc_info=True)
        return await handle_seo_tool_exception("analyze_gsc_search_performance", e, request.dict())


@router.post("/gsc/content-opportunities", response_model=BaseResponse)
@log_api_call
async def get_content_opportunities_report(
    request: ContentOpportunitiesRequest,
    current_user: dict = Depends(get_current_user)
) -> Union[BaseResponse, ErrorResponse]:
    """
    Generate detailed content opportunities report from GSC data.

    Identifies high-priority content gaps and optimization opportunities:
    - Queries with high volume but low CTR (meta/title optimization)
    - Keywords ranking 4-10 (ready for ranking improvement)
    - Long-tail keywords with expansion potential
    - Competitive white space analysis

    For each opportunity includes:
    - Current position and metrics
    - Estimated traffic gain
    - Optimization strategy
    - Implementation difficulty
    - Phased roadmap (Phase 1, 2, 3)
    """
    start_time = datetime.utcnow()

    try:
        logger.info(f"Generating content opportunities for {request.site_url}")

        gsc_service = GSCAnalyzerService()
        report = await gsc_service.get_content_opportunities_report(
            site_url=str(request.site_url),
            min_impressions=request.min_impressions,
            date_range_days=request.date_range_days
        )

        execution_time = (datetime.utcnow() - start_time).total_seconds()

        return BaseResponse(
            success=True,
            message="Content opportunities report generated",
            execution_time=execution_time,
            data=report
        )

    except Exception as e:
        logger.error(f"Content opportunities report failed: {str(e)}", exc_info=True)
        return await handle_seo_tool_exception("get_content_opportunities_report", e, request.dict())


@router.get("/enterprise/health", response_model=BaseResponse)
@log_api_call
async def check_enterprise_services_health() -> BaseResponse:
    """Health check for enterprise services"""
    try:
        enterprise_service = EnterpriseSEOService()
        gsc_service = GSCAnalyzerService()

        enterprise_health = await enterprise_service.health_check()
        gsc_health = await gsc_service.health_check()

        return BaseResponse(
            success=True,
            message="Enterprise services health check completed",
            data={
                "enterprise_seo_service": enterprise_health,
                "gsc_analyzer_service": gsc_health,
                "timestamp": datetime.utcnow().isoformat()
            }
        )
    except Exception as e:
        logger.error(f"Enterprise health check failed: {str(e)}")
        return BaseResponse(
            success=False,
            message="Enterprise health check failed",
            data={"error": str(e)}
        )
|
||||
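Each endpoint in this block measures `execution_time` by subtracting two `datetime.utcnow()` values. `datetime.utcnow()` is deprecated as of Python 3.12, and wall-clock time can jump, so a monotonic clock is a common alternative for durations. A minimal sketch, assuming a hypothetical `run_timed` helper that is not part of this codebase:

```python
import asyncio
import time
from typing import Any, Awaitable, Tuple


async def run_timed(awaitable: Awaitable[Any]) -> Tuple[Any, float]:
    """Await `awaitable` and return (result, elapsed_seconds).

    time.perf_counter() is monotonic, so the elapsed value is never
    negative even if the system clock is adjusted mid-request.
    """
    start = time.perf_counter()
    result = await awaitable
    return result, time.perf_counter() - start


# Stand-in for a service call such as execute_quick_audit(...).
result, elapsed = asyncio.run(run_timed(asyncio.sleep(0, result="ok")))
```

In an endpoint this would read roughly as `audit_result, execution_time = await run_timed(service_call)`, leaving the `BaseResponse` construction unchanged.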
|
||||
182
backend/routers/v1/social_proxy.py
Normal file
@@ -0,0 +1,182 @@
|
||||
from __future__ import annotations

import json
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlencode

from fastapi import APIRouter, Depends, HTTPException, Query
from fastapi.responses import RedirectResponse
from loguru import logger
from sqlalchemy import text
from sqlalchemy.orm import Session

from services.database import get_db

router = APIRouter(prefix="/v1/social-proxy", tags=["social-proxy"])


def _utc_now_iso() -> str:
    return datetime.now(timezone.utc).isoformat()


def _ensure_tables(db: Session) -> None:
    # Keep this router backward-compatible on tenant DBs without migrations.
    db.execute(text("""
        CREATE TABLE IF NOT EXISTS oauth_nonce_sessions (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            state TEXT NOT NULL UNIQUE,
            nonce TEXT NOT NULL,
            user_id TEXT NOT NULL,
            platform TEXT NOT NULL,
            channel_id INTEGER,
            consumed_at TEXT,
            expires_at TEXT,
            created_at TEXT NOT NULL
        )
    """))
    db.execute(text("""
        CREATE TABLE IF NOT EXISTS social_channels (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            user_id TEXT NOT NULL,
            platform TEXT NOT NULL,
            platform_account_id TEXT NOT NULL,
            token_bundle TEXT NOT NULL,
            token_version INTEGER NOT NULL DEFAULT 1,
            publication_linkage TEXT,
            is_connected INTEGER NOT NULL DEFAULT 1,
            created_at TEXT NOT NULL,
            updated_at TEXT NOT NULL,
            UNIQUE(platform, platform_account_id)
        )
    """))
|
||||
|
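`_ensure_tables` relies on two SQLite behaviors: `CREATE TABLE IF NOT EXISTS` is safe to run on every request, and the `UNIQUE(platform, platform_account_id)` constraint rejects a second row for the same account. Note that `AUTOINCREMENT` and the `last_insert_rowid()` call used later in this file are SQLite-specific syntax. The sketch below illustrates both behaviors with the standard `sqlite3` module on a trimmed-down table; the helper names and the reduced column set are assumptions for illustration, not the router's code:

```python
import sqlite3

# Reduced version of the social_channels table, keeping only the
# columns needed to demonstrate the uniqueness constraint.
SCHEMA = """
CREATE TABLE IF NOT EXISTS social_channels (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    platform TEXT NOT NULL,
    platform_account_id TEXT NOT NULL,
    UNIQUE(platform, platform_account_id)
)
"""


def ensure_schema(conn: sqlite3.Connection) -> None:
    """Create the table if missing; a repeated call is a no-op."""
    conn.execute(SCHEMA)


def try_insert(conn: sqlite3.Connection, platform: str, account_id: str) -> bool:
    """Insert a channel row; return False if the pair already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO social_channels (platform, platform_account_id) VALUES (?, ?)",
                (platform, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
ensure_schema(conn)
ensure_schema(conn)  # second call does not raise
```

The callback handler further down avoids the `IntegrityError` path by selecting the existing row first and choosing between `UPDATE` and `INSERT`.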
||||
|
||||
def _build_redirect(base_url: str, code: str, message: str, channel_id: Optional[int] = None) -> RedirectResponse:
    params = {"code": code, "message": message}
    if channel_id is not None:
        params["channel_id"] = str(channel_id)
    return RedirectResponse(url=f"{base_url}?{urlencode(params)}", status_code=303)
|
||||
|
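`_build_redirect` receives its `base_url` from the `ui_redirect` query parameter of the callback, so the caller controls where the browser is sent. One possible hardening, shown here only as a sketch, is to accept relative same-origin paths and fall back to the default otherwise. `safe_redirect_target` and `build_redirect_url` are hypothetical names, and the default path is copied from the `ui_redirect` default in this file:

```python
from typing import Optional
from urllib.parse import urlencode, urlsplit

DEFAULT_REDIRECT = "/dashboard/connections"


def safe_redirect_target(candidate: str, default: str = DEFAULT_REDIRECT) -> str:
    """Return `candidate` only if it is a plain absolute path on this origin."""
    parts = urlsplit(candidate)
    # Reject anything with a scheme or host, e.g. https://evil.example or //evil.example.
    if parts.scheme or parts.netloc:
        return default
    # Require a single leading slash and no backslashes, which some browsers
    # normalize to forward slashes.
    if not candidate.startswith("/") or candidate.startswith("//") or "\\" in candidate:
        return default
    return candidate


def build_redirect_url(base_url: str, code: str, message: str, channel_id: Optional[int] = None) -> str:
    """Build the redirect URL the same way as _build_redirect, with a checked base."""
    params = {"code": code, "message": message}
    if channel_id is not None:
        params["channel_id"] = str(channel_id)
    return f"{safe_redirect_target(base_url)}?{urlencode(params)}"
```

Whether this check belongs here or in a shared utility is a design decision for the PR author; the sketch only shows the shape of the validation.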
||||
|
||||
@router.get("/oauth/callback")
def oauth_callback(
    state: str = Query(...),
    platform: str = Query(...),
    account_id: str = Query(...),
    token_bundle: str = Query(..., description="Serialized token payload"),
    ui_redirect: str = Query("/dashboard/connections"),
    db: Session = Depends(get_db),
):
    """Consume OAuth callback, bind to user/platform, and upsert social channel connection."""
    _ensure_tables(db)

    record = db.execute(
        text("""
            SELECT id, nonce, user_id, platform, channel_id, consumed_at, expires_at
            FROM oauth_nonce_sessions WHERE state = :state
        """),
        {"state": state},
    ).mappings().first()

    if not record:
        return _build_redirect(ui_redirect, "invalid_state", "Missing OAuth session")

    if record["consumed_at"] is not None:
        return _build_redirect(ui_redirect, "state_reused", "OAuth state already consumed")

    if record["platform"] != platform:
        return _build_redirect(ui_redirect, "platform_mismatch", "Platform mismatch")

    if record["expires_at"] and record["expires_at"] < _utc_now_iso():
        return _build_redirect(ui_redirect, "state_expired", "OAuth session expired")

    user_id = record["user_id"]

    # Validate token payload is JSON.
    try:
        parsed_bundle = json.loads(token_bundle)
    except json.JSONDecodeError as exc:
        raise HTTPException(status_code=400, detail="Invalid token_bundle JSON") from exc

    now = _utc_now_iso()

    existing = db.execute(
        text("""
            SELECT id, publication_linkage, token_version
            FROM social_channels
            WHERE platform = :platform AND platform_account_id = :account_id
        """),
        {"platform": platform, "account_id": account_id},
    ).mappings().first()

    if existing:
        # Reconnect path: preserve publication linkage and bump token version.
        db.execute(
            text("""
                UPDATE social_channels
                SET user_id = :user_id,
                    token_bundle = :token_bundle,
                    token_version = :token_version,
                    is_connected = 1,
                    updated_at = :updated_at
                WHERE id = :id
            """),
            {
                "id": existing["id"],
                "user_id": user_id,
                "token_bundle": json.dumps(parsed_bundle),
                "token_version": int(existing["token_version"] or 0) + 1,
                "updated_at": now,
            },
        )
        channel_id = existing["id"]
        result_code = "reconnected"
        result_message = "Channel reconnected"
    else:
        db.execute(
            text("""
                INSERT INTO social_channels (
                    user_id, platform, platform_account_id, token_bundle,
                    token_version, publication_linkage, is_connected, created_at, updated_at
                ) VALUES (
                    :user_id, :platform, :account_id, :token_bundle,
                    1, :publication_linkage, 1, :created_at, :updated_at
                )
            """),
            {
                "user_id": user_id,
                "platform": platform,
                "account_id": account_id,
                "token_bundle": json.dumps(parsed_bundle),
                "publication_linkage": None,
                "created_at": now,
                "updated_at": now,
            },
        )
        channel_id = db.execute(text("SELECT last_insert_rowid()")).scalar_one()
        result_code = "connected"
        result_message = "Channel connected"

    # Bind callback session to concrete channel/user/platform and mark consumed.
    db.execute(
        text("""
            UPDATE oauth_nonce_sessions
            SET consumed_at = :consumed_at,
                channel_id = :channel_id,
                user_id = :user_id,
                platform = :platform
            WHERE id = :id
        """),
        {
            "id": record["id"],
            "consumed_at": now,
            "channel_id": channel_id,
            "user_id": user_id,
            "platform": platform,
        },
    )

    db.commit()
    logger.info(f"OAuth callback complete user={user_id} platform={platform} channel_id={channel_id}")
    return _build_redirect(ui_redirect, result_code, result_message, channel_id)
|
||||
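The expiry check in `oauth_callback` compares `expires_at` with `_utc_now_iso()` as strings. That ordering is only reliable when both values use the same ISO 8601 layout and UTC offset, and the code that writes `expires_at` is not shown in this diff. A minimal sketch of a comparison on parsed datetimes follows; `is_expired` is a hypothetical helper, and treating naive timestamps as UTC is an assumption:

```python
from datetime import datetime, timezone
from typing import Optional


def is_expired(expires_at: Optional[str], now: Optional[datetime] = None) -> bool:
    """Return True if the ISO 8601 timestamp `expires_at` lies before `now`.

    A missing value means "no expiry", mirroring the truthiness check
    in the callback handler.
    """
    if not expires_at:
        return False
    current = now or datetime.now(timezone.utc)
    expiry = datetime.fromisoformat(expires_at)
    if expiry.tzinfo is None:
        # Assumption: timestamps stored without an offset are UTC.
        expiry = expiry.replace(tzinfo=timezone.utc)
    return expiry < current


# 13:30 at +02:00 is 11:30 UTC, which is before noon UTC.
noon_utc = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
offset_stamp = "2024-01-01T13:30:00+02:00"
parsed_says_expired = is_expired(offset_stamp, noon_utc)
string_says_expired = offset_stamp < noon_utc.isoformat()
```

In this example the parsed comparison reports the session as expired while the plain string comparison does not, which is the case the helper is meant to cover.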
@@ -14,7 +14,7 @@ from services.integrations.wordpress_publisher import WordPressPublisher
|
||||
from middleware.auth_middleware import get_current_user
|
||||
|
||||
|
||||
router = APIRouter(prefix="/api/wordpress", tags=["WordPress"])
|
||||
router = APIRouter(prefix="/wordpress", tags=["WordPress"])
|
||||
|
||||
|
||||
# Pydantic Models
|
||||
@@ -87,9 +87,10 @@ async def get_wordpress_status(user: dict = Depends(get_current_user)):
|
||||
logger.info(f"Checking WordPress status for user: {user_id}")
|
||||
|
||||
# Get user's WordPress sites
|
||||
sites = wp_service.get_user_sites(user_id)
|
||||
|
||||
sites = wp_service.get_all_sites(user_id)
|
||||
|
||||
if sites:
|
||||
# Convert to response format
|
||||
site_responses = [
|
||||
WordPressSiteResponse(
|
||||
id=site['id'],
|
||||
@@ -102,13 +103,15 @@ async def get_wordpress_status(user: dict = Depends(get_current_user)):
|
||||
)
|
||||
for site in sites
|
||||
]
|
||||
|
||||
|
||||
logger.info(f"Found {len(sites)} WordPress sites for user {user_id}")
|
||||
return WordPressStatusResponse(
|
||||
connected=True,
|
||||
sites=site_responses,
|
||||
total_sites=len(sites)
|
||||
)
|
||||
else:
|
||||
logger.info(f"No WordPress sites found for user {user_id}")
|
||||
return WordPressStatusResponse(
|
||||
connected=False,
|
||||
sites=[],
|
||||
@@ -149,7 +152,7 @@ async def add_wordpress_site(
|
||||
)
|
||||
|
||||
# Get the added site info
|
||||
sites = wp_service.get_user_sites(user_id)
|
||||
sites = wp_service.get_all_sites(user_id)
|
||||
if sites:
|
||||
latest_site = sites[0] # Most recent site
|
||||
return WordPressSiteResponse(
|
||||
@@ -181,7 +184,7 @@ async def get_wordpress_sites(user: dict = Depends(get_current_user)):
|
||||
|
||||
logger.info(f"Getting WordPress sites for user: {user_id}")
|
||||
|
||||
sites = wp_service.get_user_sites(user_id)
|
||||
sites = wp_service.get_all_sites(user_id)
|
||||
|
||||
site_responses = [
|
||||
WordPressSiteResponse(
|
||||
|
||||
@@ -10,10 +10,6 @@ from pydantic import BaseModel
|
||||
from loguru import logger
|
||||
|
||||
from services.integrations.wordpress_oauth import WordPressOAuthService
|
||||
from services.integrations.oauth_callback_utils import (
|
||||
build_oauth_callback_html,
|
||||
sanitize_string,
|
||||
)
|
||||
from middleware.auth_middleware import get_current_user
|
||||
|
||||
router = APIRouter(prefix="/wp", tags=["WordPress OAuth"])
|
||||
@@ -82,12 +78,30 @@ async def handle_wordpress_callback(
|
||||
status_code=status.HTTP_400_BAD_REQUEST,
|
||||
content={"success": False, "error": error}
|
||||
)
|
||||
html_content = build_oauth_callback_html(
|
||||
payload={"type": "WPCOM_OAUTH_ERROR", "success": False, "error": sanitize_string(error)},
|
||||
title="WordPress.com Connection Failed",
|
||||
heading="Connection Failed",
|
||||
message="There was an error connecting to WordPress.com. You can close this window and try again."
|
||||
)
|
||||
html_content = f"""
|
||||
<!DOCTYPE html>
|
||||
<html>
|
||||
<head>
|
||||
<title>WordPress.com Connection Failed</title>
|
||||
<script>
|
||||
// Send error message to parent window
|
||||
window.onload = function() {{
|
||||
(window.opener || window.parent).postMessage({{
|
||||
type: 'WPCOM_OAUTH_ERROR',
|
||||
success: false,
|
||||
error: '{error}'
|
||||
}}, '*');
|
||||
window.close();
|
||||
}};
|
||||
</script>
|
||||
</head>
|
||||
<body>
|
||||
<h1>Connection Failed</h1>
|
||||
<p>There was an error connecting to WordPress.com.</p>
|
||||
<p>You can close this window and try again.</p>
|
||||
</body>
|
||||
</html>
|
||||
"""
|
||||
return HTMLResponse(content=html_content, headers={
|
||||
"Cross-Origin-Opener-Policy": "unsafe-none",
|
||||
"Cross-Origin-Embedder-Policy": "unsafe-none"
|
||||
@@ -100,12 +114,30 @@ async def handle_wordpress_callback(
|
||||
status_code=status.HTTP_400_BAD_REQUEST,
|
||||
content={"success": False, "error": "Missing parameters"}
|
||||
)
|
||||
html_content = build_oauth_callback_html(
|
||||
payload={"type": "WPCOM_OAUTH_ERROR", "success": False, "error": "Missing parameters"},
|
||||
title="WordPress.com Connection Failed",
|
||||
heading="Connection Failed",
|
||||
message="Missing required parameters. You can close this window and try again."
|
||||
)
|
||||
html_content = """
|
||||
<!DOCTYPE html>
|
||||
<html>
|
||||
<head>
|
||||
<title>WordPress.com Connection Failed</title>
|
||||
<script>
|
||||
// Send error message to opener/parent window
|
||||
window.onload = function() {{
|
||||
(window.opener || window.parent).postMessage({{
|
||||
type: 'WPCOM_OAUTH_ERROR',
|
||||
success: false,
|
||||
error: 'Missing parameters'
|
||||
}}, '*');
|
||||
window.close();
|
||||
}};
|
||||
</script>
|
||||
</head>
|
||||
<body>
|
||||
<h1>Connection Failed</h1>
|
||||
<p>Missing required parameters.</p>
|
||||
<p>You can close this window and try again.</p>
|
||||
</body>
|
||||
</html>
|
||||
"""
|
||||
return HTMLResponse(content=html_content, headers={
|
||||
"Cross-Origin-Opener-Policy": "unsafe-none",
|
||||
"Cross-Origin-Embedder-Policy": "unsafe-none"
|
||||
@@ -121,12 +153,30 @@ async def handle_wordpress_callback(
|
||||
status_code=status.HTTP_400_BAD_REQUEST,
|
||||
content={"success": False, "error": "Token exchange failed"}
|
||||
)
|
||||
html_content = build_oauth_callback_html(
|
||||
payload={"type": "WPCOM_OAUTH_ERROR", "success": False, "error": "Token exchange failed"},
|
||||
title="WordPress.com Connection Failed",
|
||||
heading="Connection Failed",
|
||||
message="Failed to exchange authorization code for access token. You can close this window and try again."
|
||||
)
|
||||
html_content = """
|
||||
<!DOCTYPE html>
|
||||
<html>
|
||||
<head>
|
||||
<title>WordPress.com Connection Failed</title>
|
||||
<script>
|
||||
// Send error message to opener/parent window
|
||||
window.onload = function() {{
|
||||
(window.opener || window.parent).postMessage({{
|
||||
type: 'WPCOM_OAUTH_ERROR',
|
||||
success: false,
|
||||
error: 'Token exchange failed'
|
||||
}}, '*');
|
||||
window.close();
|
||||
}};
|
||||
</script>
|
||||
</head>
|
||||
<body>
|
||||
<h1>Connection Failed</h1>
|
||||
<p>Failed to exchange authorization code for access token.</p>
|
||||
<p>You can close this window and try again.</p>
|
||||
</body>
|
||||
</html>
|
||||
"""
|
||||
return HTMLResponse(content=html_content)
|
||||
|
||||
# Return success page with postMessage script
|
||||
@@ -143,17 +193,31 @@ async def handle_wordpress_callback(
|
||||
}
|
||||
)
|
||||
|
||||
html_content = build_oauth_callback_html(
|
||||
payload={
|
||||
"type": "WPCOM_OAUTH_SUCCESS",
|
||||
"success": True,
|
||||
"blogUrl": sanitize_string(blog_url, 300),
|
||||
"blogId": sanitize_string(blog_id, 128)
|
||||
},
|
||||
title="WordPress.com Connection Successful",
|
||||
heading="Connection Successful",
|
||||
message="Your WordPress.com site has been connected successfully. You can close this window now."
|
||||
)
|
||||
html_content = f"""
|
||||
<!DOCTYPE html>
|
||||
<html>
|
||||
<head>
|
||||
<title>WordPress.com Connection Successful</title>
|
||||
<script>
|
||||
// Send success message to opener/parent window
|
||||
window.onload = function() {{
|
||||
(window.opener || window.parent).postMessage({{
|
||||
type: 'WPCOM_OAUTH_SUCCESS',
|
||||
success: true,
|
||||
blogUrl: '{blog_url}',
|
||||
blogId: '{blog_id}'
|
||||
}}, '*');
|
||||
window.close();
|
||||
}};
|
||||
</script>
|
||||
</head>
|
||||
<body>
|
||||
<h1>Connection Successful!</h1>
|
||||
<p>Your WordPress.com site has been connected successfully.</p>
|
||||
<p>You can close this window now.</p>
|
||||
</body>
|
||||
</html>
|
||||
"""
|
||||
|
||||
return HTMLResponse(content=html_content, headers={
|
||||
"Cross-Origin-Opener-Policy": "unsafe-none",
|
||||
@@ -162,12 +226,30 @@ async def handle_wordpress_callback(
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error handling WordPress OAuth callback: {e}")
|
||||
html_content = build_oauth_callback_html(
|
||||
payload={"type": "WPCOM_OAUTH_ERROR", "success": False, "error": "Callback error"},
|
||||
title="WordPress.com Connection Failed",
|
||||
heading="Connection Failed",
|
||||
message="An unexpected error occurred during connection. You can close this window and try again."
|
||||
)
|
||||
html_content = """
|
||||
<!DOCTYPE html>
|
||||
<html>
|
||||
<head>
|
||||
<title>WordPress.com Connection Failed</title>
|
||||
<script>
|
||||
// Send error message to opener/parent window
|
||||
window.onload = function() {{
|
||||
(window.opener || window.parent).postMessage({{
|
||||
type: 'WPCOM_OAUTH_ERROR',
|
||||
success: false,
|
||||
error: 'Callback error'
|
||||
}}, '*');
|
||||
window.close();
|
||||
}};
|
||||
</script>
|
||||
</head>
|
||||
<body>
|
||||
<h1>Connection Failed</h1>
|
||||
<p>An unexpected error occurred during connection.</p>
|
||||
<p>You can close this window and try again.</p>
|
||||
</body>
|
||||
</html>
|
||||
"""
|
||||
return HTMLResponse(content=html_content, headers={
|
||||
"Cross-Origin-Opener-Policy": "unsafe-none",
|
||||
"Cross-Origin-Embedder-Policy": "unsafe-none"
|
||||
|
||||
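Review note: the hunks above contain two ways of producing the callback page, a call to `build_oauth_callback_html(...)` with `sanitize_string(...)`, and inline HTML that interpolates `{error}`, `{blog_url}` and `{blog_id}` straight into a `<script>` block. The body of `build_oauth_callback_html` is not part of this diff, so the sketch below is only a guess at what such a helper could do. `render_callback_html` is a hypothetical name, and JSON-encoding the payload is the assumed technique.

```python
import html
import json


def render_callback_html(payload, title, heading, message):
    """Hypothetical stand-in for a helper like build_oauth_callback_html."""
    # json.dumps quotes and escapes string values. The extra replacements keep
    # a value such as "</script>" from terminating the script element early.
    payload_js = (
        json.dumps(payload)
        .replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )
    # Doubled braces are literal braces here because this is an f-string.
    return f"""<!DOCTYPE html>
<html>
<head>
<title>{html.escape(title)}</title>
<script>
window.onload = function() {{
    (window.opener || window.parent).postMessage({payload_js}, '*');
    window.close();
}};
</script>
</head>
<body>
<h1>{html.escape(heading)}</h1>
<p>{html.escape(message)}</p>
</body>
</html>"""


page = render_callback_html(
    {
        "type": "WPCOM_OAUTH_ERROR",
        "success": False,
        "error": "</script><script>alert(1)",
    },
    title="WordPress.com Connection Failed",
    heading="Connection Failed",
    message="You can close this window and try again.",
)
```

Two details may be worth checking against the hunks above. Brace doubling (`{{`, `}}`) is only collapsed inside f-strings, so the blocks that assign a plain `"""` string would emit the doubled braces unchanged. The `'*'` target origin is kept here to match the diff; passing the expected frontend origin instead would restrict which window can receive the message.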
@@ -43,7 +43,7 @@ def cap_basic_plan_usage():
|
||||
# New limits
|
||||
new_call_limit = basic_plan.gemini_calls_limit # Should be 10
|
||||
new_token_limit = basic_plan.gemini_tokens_limit # Should be 2000
|
||||
new_image_limit = basic_plan.stability_calls_limit # 25
|
||||
new_image_limit = basic_plan.stability_calls_limit # Should be 5
|
||||
|
||||
logger.info(f"📋 Basic Plan Limits:")
|
||||
logger.info(f" Calls: {new_call_limit}")
|
||||
|
||||
@@ -75,14 +75,8 @@ def update_basic_plan_limits():
|
||||
basic_plan.anthropic_tokens_limit = 20000
|
||||
basic_plan.mistral_tokens_limit = 20000
|
||||
|
||||
# Update image generation limit to 25 (minimum 10 for podcast workflows)
|
||||
basic_plan.stability_calls_limit = 25
|
||||
|
||||
# Update image edit limit to 25 (podcast episode covers + scene images)
|
||||
basic_plan.image_edit_calls_limit = 25
|
||||
|
||||
# Update audio generation limit to 100 (TTS for podcast narration)
|
||||
basic_plan.audio_calls_limit = 100
|
||||
# Update image generation limit to 5
|
||||
basic_plan.stability_calls_limit = 5
|
||||
|
||||
# Update timestamp
|
||||
basic_plan.updated_at = datetime.now(timezone.utc)
|
||||
@@ -90,9 +84,7 @@ def update_basic_plan_limits():
|
||||
logger.info("\n📝 New Basic plan limits:")
|
||||
logger.info(f" LLM Calls (all providers): 10")
|
||||
logger.info(f" LLM Tokens (all providers): 20000 (increased from 5000)")
|
||||
logger.info(f" Images (stability): 25")
|
||||
logger.info(f" Image Edits: 25")
|
||||
logger.info(f" Audio Calls: 100")
|
||||
logger.info(f" Images: 5")
|
||||
|
||||
# Count and get affected users
|
||||
user_subscriptions = db.query(UserSubscription).filter(
|
||||
|
||||
@@ -1,311 +0,0 @@
|
||||
from __future__ import annotations
|
||||
|
||||
from pydantic import BaseModel, Field, HttpUrl, EmailStr
|
||||
from typing import Dict, List, Optional
|
||||
|
||||
|
||||
class BacklinkKeywordInput(BaseModel):
|
||||
keyword: str = Field(..., min_length=2, max_length=120)
|
||||
max_results: int = Field(default=10, ge=1, le=50)
|
||||
|
||||
|
||||
class OpportunityContactInfo(BaseModel):
|
||||
email: Optional[EmailStr] = None
|
||||
contact_page: Optional[HttpUrl] = None
|
||||
|
||||
|
||||
class OpportunityRecord(BaseModel):
|
||||
url: HttpUrl
|
||||
title: str
|
||||
snippet: str
|
||||
metadata: Dict[str, str] = Field(default_factory=dict)
|
||||
contact_info: OpportunityContactInfo = Field(default_factory=OpportunityContactInfo)
|
||||
confidence_score: float = Field(..., ge=0.0, le=1.0)
|
||||
|
||||
|
||||
class BacklinkDiscoveryResponse(BaseModel):
|
||||
keyword: str
|
||||
queries: List[str]
|
||||
opportunities: List[OpportunityRecord]
|
||||
|
||||
|
||||
# -- Deep Discovery Models --
|
||||
|
||||
class DeepKeywordInput(BaseModel):
|
||||
keyword: str = Field(..., min_length=2, max_length=120)
|
||||
max_results: int = Field(default=15, ge=1, le=50)
|
||||
campaign_id: Optional[str] = Field(default=None, description="If set, auto-saves leads to this campaign")
|
||||
|
||||
|
||||
class EnrichedOpportunity(BaseModel):
|
||||
url: str
|
||||
domain: str
|
||||
page_title: str = ""
|
||||
snippet: str = ""
|
||||
full_text: str = ""
|
||||
email: Optional[str] = None
|
||||
contact_page: Optional[str] = None
|
||||
confidence_score: float = Field(default=0.0, ge=0.0, le=1.0)
|
||||
quality_score: float = Field(default=0.0, ge=0.0, le=1.0)
|
||||
word_count: int = 0
|
||||
has_guest_post_guidelines: bool = False
|
||||
discovery_source: str = "duckduckgo"
|
||||
|
||||
|
||||
class DeepDiscoveryResponse(BaseModel):
|
||||
keyword: str
|
||||
source: str
|
||||
total_found: int
|
||||
opportunities: List[EnrichedOpportunity]
|
||||
|
||||
|
||||
# -- Lead Models --
|
||||
|
||||
class LeadCreateRequest(BaseModel):
|
||||
campaign_id: str = Field(..., min_length=1)
|
||||
url: str = Field(..., min_length=1)
|
||||
domain: str = Field(..., min_length=1)
|
||||
email: Optional[str] = None
|
||||
page_title: Optional[str] = None
|
||||
snippet: Optional[str] = None
|
||||
confidence_score: float = Field(default=0.0, ge=0.0, le=1.0)
|
||||
notes: Optional[str] = None
|
||||
|
||||
|
||||
class LeadRecord(BaseModel):
|
||||
lead_id: str
|
||||
campaign_id: str
|
||||
url: Optional[str]
|
||||
domain: str
|
||||
page_title: Optional[str] = ""
|
||||
snippet: Optional[str] = ""
|
||||
email: Optional[str] = None
|
||||
confidence_score: float = 0.0
|
||||
discovery_source: Optional[str] = "duckduckgo"
|
||||
status: str = "discovered"
|
||||
notes: Optional[str] = None
|
||||
created_at: Optional[str] = None
|
||||
|
||||
|
||||
class LeadListResponse(BaseModel):
|
||||
leads: List[LeadRecord]
|
||||
total: int
|
||||
|
||||
|
||||
class LeadStatusUpdateRequest(BaseModel):
|
||||
status: str = Field(..., min_length=1)
|
||||
notes: Optional[str] = None
|
||||
|
||||
|
||||
class CampaignDetailResponse(BaseModel):
|
||||
campaign_id: str
|
||||
name: str
|
||||
status: str
|
||||
created_at: Optional[str] = None
|
||||
lead_count: int = 0
|
||||
leads: List[LeadRecord] = Field(default_factory=list)
|
||||
|
||||
|
||||
class GenerateEmailRequest(BaseModel):
|
||||
topic: str = Field(..., min_length=2, max_length=500)
|
||||
target_site: Optional[str] = Field(None, description="Target website for guest post pitch")
|
||||
tone: str = Field(default="professional", pattern="^(professional|friendly|casual|formal)$")
|
||||
existing_template_id: Optional[str] = None
|
||||
|
||||
|
||||
class GeneratedEmailResponse(BaseModel):
|
||||
subject: str
|
||||
body: str
|
||||
|
||||
|
||||
class PersonalizeEmailRequest(BaseModel):
|
||||
lead_name: str = Field(..., min_length=1, max_length=200)
|
||||
lead_site: str = Field(..., min_length=1, max_length=500)
|
||||
lead_content_topic: str = Field(..., min_length=1, max_length=500)
|
||||
pitch_topic: str = Field(..., min_length=2, max_length=500)
|
||||
existing_body: str = Field(default="", max_length=10000)
|
||||
|
||||
|
||||
class SubjectLinesRequest(BaseModel):
|
||||
body: str = Field(..., min_length=10, max_length=10000)
|
||||
count: int = Field(default=5, ge=1, le=10)
|
||||
|
||||
|
||||
class SubjectLinesResponse(BaseModel):
|
||||
subjects: list[str]
|
||||
|
||||
|
||||
class FollowUpRequest(BaseModel):
|
||||
original_subject: str = Field(..., min_length=1, max_length=500)
|
||||
original_body: str = Field(..., min_length=10, max_length=10000)
|
||||
days_elapsed: int = Field(default=7, ge=1, le=90)
|
||||
reply_context: str = Field(default="", max_length=2000)
|
||||
|
||||
|
||||
class OutreachStatusRecord(BaseModel):
|
||||
opportunity_url: HttpUrl
|
||||
status: str
|
||||
notes: Optional[str] = None
|
||||
|
||||
|
||||
class SendOutreachRequest(BaseModel):
|
||||
lead_id: str = Field(..., min_length=1)
|
||||
campaign_id: str = Field(..., min_length=1)
|
||||
user_id: str = Field(..., min_length=1)
|
||||
workspace_id: str = Field(default="default")
|
||||
sender_email: str = Field(..., min_length=3)
|
||||
subject: str = Field(..., min_length=1)
|
||||
body: str = Field(..., min_length=1)
|
||||
idempotency_key: str = Field(..., min_length=8)
|
||||
template_id: Optional[str] = Field(None, description="Optional template ID for personalization")
|
||||
template_variables: Optional[dict] = Field(None, description="Variable values for template personalization")
|
||||
|
||||
|
||||
class SendOutreachResponse(BaseModel):
|
||||
attempt_id: str
|
||||
status: str
|
||||
policy_allowed: bool
|
||||
policy_reasons: List[str] = Field(default_factory=list)
|
||||
|
||||
|
||||
class OutreachAttemptRecord(BaseModel):
|
||||
attempt_id: str
|
||||
lead_id: str
|
||||
campaign_id: str
|
||||
idempotency_key: str
|
||||
sender_email: Optional[str] = None
|
||||
subject: Optional[str] = None
|
||||
status: str = "queued"
|
||||
decision_reason: Optional[str] = None
|
||||
sent_at: Optional[str] = None
|
||||
created_at: Optional[str] = None
|
||||
|
||||
|
||||
class OutreachAttemptListResponse(BaseModel):
|
||||
attempts: List[OutreachAttemptRecord]
|
||||
total: int
|
||||
|
||||
|
||||
class OutreachReplyRecord(BaseModel):
|
||||
reply_id: str
|
||||
attempt_id: str
|
||||
from_email: Optional[str] = None
|
||||
subject: Optional[str] = None
|
||||
received_at: Optional[str] = None
|
||||
classification: str = "replied"
|
||||
body: Optional[str] = None
|
||||
|
||||
|
||||
class OutreachReplyListResponse(BaseModel):
|
||||
replies: List[OutreachReplyRecord]
|
||||
total: int
|
||||
|
||||
|
||||
class ScheduleFollowUpRequest(BaseModel):
|
||||
attempt_id: str = Field(..., min_length=1)
|
||||
scheduled_for: str = Field(..., min_length=1)
|
||||
subject: Optional[str] = None
|
||||
body: Optional[str] = None
|
||||
|
||||
|
||||
class FollowUpScheduleRecord(BaseModel):
|
||||
schedule_id: str
|
||||
attempt_id: str
|
||||
subject: Optional[str] = None
|
||||
scheduled_for: str
|
||||
sent: bool = False
|
||||
|
||||
|
||||
class EmailTemplateRequest(BaseModel):
|
||||
name: str = Field(..., min_length=1)
|
||||
subject_template: str = Field(..., min_length=1)
|
||||
body_template: str = Field(..., min_length=1)
|
||||
variables: Optional[List[str]] = None
|
||||
|
||||
|
||||
class EmailTemplateRecord(BaseModel):
|
||||
template_id: str
|
||||
user_id: str
|
||||
name: str
|
||||
subject_template: str
|
||||
body_template: str
|
||||
variables: Optional[List[str]] = None
|
||||
created_at: Optional[str] = None
|
||||
|
||||
|
||||
class PolicyValidationRequest(BaseModel):
|
||||
user_id: str = Field(..., min_length=1)
|
||||
workspace_id: str = Field(..., min_length=1)
|
||||
campaign_id: str = Field(..., min_length=1)
|
||||
recipient_email: str = Field(..., min_length=1)
|
||||
recipient_domain: str
|
||||
recipient_region: str = Field(default="unknown")
|
||||
legal_basis: str = Field(..., min_length=2)
|
||||
approved_by_human: bool = False
|
||||
unsubscribe_url: Optional[HttpUrl] = None
|
||||
sender_identity: str = Field(..., min_length=3)
|
||||
idempotency_key: str = Field(..., min_length=8)
|
||||
|
||||
|
||||
class PolicyValidationResponse(BaseModel):
|
||||
allowed: bool
|
||||
reasons: List[str] = Field(default_factory=list)
|
||||
final_status: str
|
||||
|
||||
|
||||
# -- Analytics & Reporting Models --
|
||||
|
||||
class CampaignAnalyticsResponse(BaseModel):
|
||||
campaign_id: str
|
||||
lead_count: int = 0
|
||||
send_volume: int = 0
|
||||
blocked_count: int = 0
|
||||
reply_count: int = 0
|
||||
response_rate: float = 0.0
|
||||
placement_rate: float = 0.0
|
||||
reply_classification: Dict[str, int] = Field(default_factory=dict)
|
||||
|
||||
|
||||
class BacklinkReportingSnapshot(BaseModel):
|
||||
send_volume: int = 0
|
||||
decision_events: int = 0
|
||||
response_rate: float = 0.0
|
||||
placement_conversion: float = 0.0
|
||||
|
||||
|
||||
class CampaignVolumePoint(BaseModel):
|
||||
date: str
|
||||
count: int = 0
|
||||
|
||||
|
||||
class CampaignVolumeResponse(BaseModel):
|
||||
campaign_id: str
|
||||
days: int = 30
|
||||
volume: List[CampaignVolumePoint] = Field(default_factory=list)
|
||||
|
||||
|
||||
class FunnelStage(BaseModel):
|
||||
status: str
|
||||
count: int = 0
|
||||
|
||||
|
||||
class ConversionFunnelResponse(BaseModel):
|
||||
campaign_id: str
|
||||
stages: List[FunnelStage] = Field(default_factory=list)
|
||||
|
||||
|
||||
class BulkStatusUpdateRequest(BaseModel):
|
||||
lead_ids: List[str] = Field(..., min_length=1)
|
||||
status: str = Field(..., min_length=1)
|
||||
notes: Optional[str] = None
|
||||
|
||||
|
||||
class BulkStatusUpdateResponse(BaseModel):
|
||||
updated: int = 0
|
||||
failed: List[str] = Field(default_factory=list)
|
||||
|
||||
|
||||
class SuppressionAddRequest(BaseModel):
|
||||
email: str = Field(..., min_length=3)
|
||||
reason: str = Field(default="")
|
||||
domain: str = Field(default="")
|
||||
@@ -1,164 +0,0 @@
|
||||
"""IMAP-based reply monitoring for backlink outreach."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
import asyncio
|
||||
import imaplib
|
||||
import email as email_lib
|
||||
from email.utils import parsedate_to_datetime
|
||||
from typing import List, Optional
|
||||
from loguru import logger
|
||||
|
||||
|
||||
IMAP_HOST = os.getenv("IMAP_HOST", "imap.gmail.com")
|
||||
IMAP_PORT = int(os.getenv("IMAP_PORT", "993"))
|
||||
IMAP_USERNAME = os.getenv("IMAP_USERNAME", "")
|
||||
IMAP_PASSWORD = os.getenv("IMAP_PASSWORD", "")
|
||||
IMAP_FOLDER = os.getenv("IMAP_FOLDER", "INBOX")
|
||||
IMAP_FETCH_LIMIT = int(os.getenv("IMAP_FETCH_LIMIT", "50"))
|
||||
|
||||
# Search keywords for auto-classification
|
||||
INTERESTED_KEYWORDS = [
|
||||
"interested", "let's discuss", "sounds good", "would love to", "yes",
|
||||
"sure", "tell me more", "looks good", "happy to", "let's do it",
|
||||
"sign me up", "count me in", "proceed", "approved",
|
||||
]
|
||||
NOT_INTERESTED_KEYWORDS = [
|
||||
"not interested", "unsubscribe", "no thanks", "remove me", "stop",
|
||||
"don't contact", "spam", "not relevant", "no longer interested",
|
||||
"please stop", "do not email",
|
||||
]
|
||||
OUT_OF_OFFICE_KEYWORDS = [
|
||||
"out of office", "vacation", "on leave", "away from", "return on",
|
||||
"not in the office", "will be back",
|
||||
]
|
||||
|
||||
|
||||
class BacklinkOutreachReplyMonitor:
|
||||
def __init__(self):
|
||||
self._host = IMAP_HOST
|
||||
self._port = IMAP_PORT
|
||||
self._username = IMAP_USERNAME
|
||||
self._password = IMAP_PASSWORD
|
||||
self._folder = IMAP_FOLDER
|
||||
self._fetch_limit = IMAP_FETCH_LIMIT
|
||||
|
||||
def is_configured(self) -> bool:
|
||||
return bool(self._username and self._password)
|
||||
|
||||
async def poll_replies(self, sent_from_email: str) -> List[dict]:
|
||||
"""Poll IMAP inbox for replies to a specific sender address."""
|
||||
if not self.is_configured():
|
||||
logger.warning("IMAP not configured: set IMAP_USERNAME and IMAP_PASSWORD")
|
||||
return []
|
||||
|
||||
loop = asyncio.get_running_loop()
|
||||
|
||||
def _poll() -> List[dict]:
|
||||
try:
|
||||
mail = imaplib.IMAP4_SSL(self._host, self._port)
|
||||
mail.login(self._username, self._password)
|
||||
mail.select(self._folder)
|
||||
|
||||
safe_email = sent_from_email.replace('"', "").replace("\\", "")
|
||||
search_criteria = f'(TO "{safe_email}")'
|
||||
status, message_ids = mail.search(None, search_criteria)
|
||||
if status != "OK":
|
||||
return []
|
||||
|
||||
ids = message_ids[0].split() if message_ids[0] else []
|
||||
if not ids:
|
||||
return []
|
||||
|
||||
ids = ids[-self._fetch_limit:]
|
||||
|
||||
replies = []
|
||||
for mid in ids:
|
||||
status, msg_data = mail.fetch(mid, "(RFC822)")
|
||||
if status != "OK":
|
||||
continue
|
||||
|
||||
raw_email = msg_data[0][1] if msg_data else None
|
||||
if not raw_email:
|
||||
continue
|
||||
|
||||
parsed = email_lib.message_from_bytes(raw_email)
|
||||
reply = self._parse_reply(parsed)
|
||||
if reply:
|
||||
replies.append(reply)
|
||||
|
||||
mail.logout()
|
||||
return replies
|
||||
except imaplib.IMAP4.error as e:
|
||||
logger.error(f"IMAP error: {e}")
|
||||
return []
|
||||
except Exception as e:
|
||||
logger.error(f"Unexpected IMAP error: {e}")
|
||||
return []
|
||||
|
||||
return await loop.run_in_executor(None, _poll)
|
||||
|
||||
def _parse_reply(self, parsed_msg) -> Optional[dict]:
|
||||
try:
|
||||
from_email = parsed_msg.get("From", "")
|
||||
subject = parsed_msg.get("Subject", "")
|
||||
received_at = parsed_msg.get("Date", "")
|
||||
|
||||
# Extract body
|
||||
body = ""
|
||||
if parsed_msg.is_multipart():
|
||||
for part in parsed_msg.walk():
|
||||
content_type = part.get_content_type()
|
||||
if content_type == "text/plain":
|
||||
try:
|
||||
body = part.get_payload(decode=True).decode("utf-8", errors="ignore")
|
||||
break
|
||||
except Exception:
|
||||
continue
|
||||
else:
|
||||
try:
|
||||
body = parsed_msg.get_payload(decode=True).decode("utf-8", errors="ignore")
|
||||
except Exception:
|
||||
body = str(parsed_msg.get_payload())
|
||||
|
||||
classification = self._classify_reply(body, subject)
|
||||
|
||||
# Parse date
|
||||
try:
|
||||
dt = parsedate_to_datetime(received_at)
|
||||
received_at_iso = dt.isoformat() if dt else None
|
||||
except Exception:
|
||||
received_at_iso = None
|
||||
|
||||
return {
|
||||
"from_email": from_email,
|
||||
"subject": subject,
|
||||
"body": body[:5000],
|
||||
"classification": classification,
|
||||
"received_at": received_at_iso,
|
||||
}
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to parse reply: {e}")
|
||||
return None
|
||||
|
||||
@staticmethod
|
||||
def _classify_reply(body: str, subject: str) -> str:
|
||||
text = f"{subject} {body}".lower()
|
||||
|
||||
for kw in OUT_OF_OFFICE_KEYWORDS:
|
||||
if kw in text:
|
||||
return "out_of_office"
|
||||
|
||||
for kw in NOT_INTERESTED_KEYWORDS:
|
||||
if kw in text:
|
||||
return "not_interested"
|
||||
|
||||
for kw in INTERESTED_KEYWORDS:
|
||||
if kw in text:
|
||||
return "interested"
|
||||
|
||||
return "replied"
|
||||
|
||||
|
||||
backlink_outreach_reply_monitor = BacklinkOutreachReplyMonitor()
|
||||
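Review note: `_classify_reply` in the removed file above checks the keyword lists in a fixed order (out of office, then not interested, then interested) using plain substring matching. The sketch below reproduces that ordering with shortened keyword lists to show why the order matters. The list contents are trimmed for illustration and `classify` is a hypothetical free function, not the class method itself.

```python
# Shortened keyword lists, taken from the longer lists in the file above.
OUT_OF_OFFICE = ["out of office", "vacation"]
NOT_INTERESTED = ["not interested", "unsubscribe", "no thanks"]
INTERESTED = ["interested", "sounds good", "yes"]

# Order matters: "not interested" contains "interested", so the negative
# list has to be tested before the positive one.
RULES = (
    ("out_of_office", OUT_OF_OFFICE),
    ("not_interested", NOT_INTERESTED),
    ("interested", INTERESTED),
)


def classify(body, subject):
    """Return the first label whose keyword list has a substring match."""
    text = f"{subject} {body}".lower()
    for label, keywords in RULES:
        if any(kw in text for kw in keywords):
            return label
    return "replied"


declined = classify("I'm not interested", "Re: pitch")
accepted = classify("Sounds good", "Re: pitch")
neutral = classify("Thanks for reaching out", "Re: hello")
# Substring matching also fires inside longer words: "yes" in "yesterday".
false_positive = classify("See you yesterday", "")
```

The last call shows a limitation of substring matching: short keywords such as "yes" or "stop" can match inside unrelated words. Matching on word boundaries would be one way to tighten this, at the cost of a slightly more involved check.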
@@ -1,406 +0,0 @@
|
||||
"""Deep website scraper for backlink outreach discovery.
|
||||
|
||||
Orchestrates Exa neural search + DuckDuckGo fallback to find guest-post
|
||||
opportunities with full-page content extraction and quality scoring.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import re
|
||||
import time
|
||||
from typing import Any, Dict, List, Optional
|
||||
from urllib.parse import urlparse
|
||||
|
||||
import requests
|
||||
from bs4 import BeautifulSoup
|
||||
from loguru import logger
|
||||
|
||||
|
||||
class BacklinkOutreachScraper:
|
||||
"""Scrapes websites for backlink outreach opportunities using Exa + DuckDuckGo."""
|
||||
|
||||
GUEST_POST_KEYWORDS = [
|
||||
"write for us", "guest post", "submit guest post",
|
||||
"guest contributor", "become a guest blogger", "guest bloggers wanted",
|
||||
"add guest post", "submit article", "guest post opportunities",
|
||||
"contribute to our blog", "write for our blog",
|
||||
]
|
||||
|
||||
def __init__(self, user_id: Optional[str] = None):
|
||||
self.user_id = user_id
|
||||
self._exa_svc = None
|
||||
|
||||
# -- Public API --
|
||||
|
||||
async def deep_discover(
|
||||
self, keyword: str, max_results: int = 15
|
||||
) -> Dict[str, Any]:
|
||||
"""Discover guest-post opportunities using Exa, falling back to DuckDuckGo."""
|
||||
if self._is_exa_available():
|
||||
logger.info(f"[BacklinkScraper] Using Exa for keyword: {keyword}")
|
||||
return await self._discover_with_exa(keyword, max_results)
|
||||
logger.info(f"[BacklinkScraper] Exa unavailable, falling back to DuckDuckGo for: {keyword}")
|
||||
return await self._discover_with_duckduckgo(keyword, max_results)
|
||||
|
||||
def scrape_urls(self, urls: List[str]) -> List[Dict[str, Any]]:
|
||||
"""Fetch full page content for a list of URLs using Exa get_contents."""
|
||||
exa = self._get_exa_sdk()
|
||||
if not exa:
|
||||
return self._scrape_urls_fallback(urls)
|
||||
try:
|
||||
result = exa.get_contents(urls, text={"max_characters": 5000})
|
||||
return self._parse_get_contents_result(result)
|
||||
except Exception as e:
|
||||
logger.warning(f"[BacklinkScraper] Exa get_contents failed: {e}")
|
||||
return self._scrape_urls_fallback(urls)
|
||||
|
||||
# -- Availability --
|
||||
|
||||
def _is_exa_available(self) -> bool:
|
||||
try:
|
||||
exa = self._get_exa_sdk()
|
||||
return exa is not None
|
||||
except Exception:
|
||||
return False
|
||||
|
||||
def _get_exa_sdk(self):
|
||||
"""Get Exa SDK instance via ExaService, respecting per-user API key."""
|
||||
if self._exa_svc is None:
|
||||
from services.research.exa_service import ExaService
|
||||
self._exa_svc = ExaService()
|
||||
self._exa_svc._try_initialize()
|
||||
return self._exa_svc.exa if self._exa_svc.enabled else None
|
||||
|
||||
# -- Preflight & Usage Tracking --
|
||||
|
||||
def _preflight_subscription_check(self, user_id: str) -> bool:
|
||||
"""Check Exa usage limits. Returns True if allowed."""
|
||||
if not user_id:
|
||||
return True
|
||||
try:
|
||||
from services.database import get_session_for_user
|
||||
from services.subscription import PricingService
|
||||
from models.subscription_models import APIProvider
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return True
|
||||
try:
|
||||
pricing = PricingService(db)
|
||||
allowed, _, _ = pricing.check_usage_limits(
|
||||
user_id=user_id, provider=APIProvider.EXA, tokens_requested=0,
|
||||
)
|
||||
return allowed
|
||||
finally:
|
||||
db.close()
|
||||
except Exception as e:
|
||||
logger.warning(f"[BacklinkScraper] Preflight check failed: {e}")
|
||||
return True
|
||||
|
||||
def _track_exa_usage(self, user_id: str, cost: float = 0.005):
|
||||
"""Record Exa usage after successful search."""
|
||||
if not user_id:
|
||||
return
|
||||
try:
|
||||
from services.database import get_session_for_user
|
||||
from services.subscription import PricingService
|
||||
from sqlalchemy import text as sql_text
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return
|
||||
try:
|
||||
pricing = PricingService(db)
|
||||
period = pricing.get_current_billing_period(user_id)
|
||||
db.execute(sql_text("""
|
||||
UPDATE usage_summaries
|
||||
SET exa_calls = COALESCE(exa_calls, 0) + 1,
|
||||
exa_cost = COALESCE(exa_cost, 0) + :cost,
|
||||
total_calls = total_calls + 1,
|
||||
total_cost = total_cost + :cost
|
||||
WHERE user_id = :user_id AND billing_period = :period
|
||||
"""), {"cost": cost, "user_id": user_id, "period": period})
|
||||
db.commit()
|
||||
finally:
|
||||
db.close()
|
||||
except Exception as e:
|
||||
logger.warning(f"[BacklinkScraper] Usage tracking failed: {e}")
|
||||
|
||||
# -- Exa Discovery --
|
||||
|
||||
async def _discover_with_exa(self, keyword: str, max_results: int) -> Dict[str, Any]:
|
||||
exa = self._get_exa_sdk()
|
||||
if not exa:
|
||||
return await self._discover_with_duckduckgo(keyword, max_results)
|
||||
|
||||
queries = self._generate_search_queries(keyword)
|
||||
dedup: Dict[str, Dict[str, Any]] = {}
|
||||
results_per_query = max(1, max_results // len(queries))
|
||||
|
||||
for query in queries[:4]:
|
||||
rows = await self._exa_search_and_contents(exa, query, results_per_query)
|
||||
for row in rows:
|
||||
norm_url = self._normalize_url(row.get("url", ""))
|
||||
if not norm_url or norm_url in dedup:
|
||||
continue
|
||||
dedup[norm_url] = row
|
||||
if len(dedup) >= max_results:
|
||||
break
|
||||
|
||||
opportunities = self._build_enriched_opportunities(dedup, keyword, "exa")
|
||||
self._track_exa_usage(self.user_id)
|
||||
|
||||
return {
|
||||
"keyword": keyword,
|
||||
"source": "exa",
|
||||
"total_found": len(opportunities),
|
||||
"opportunities": opportunities,
|
||||
}
|
||||
|
||||
async def _exa_search_and_contents(
|
||||
self, exa, query: str, num_results: int
|
||||
) -> List[Dict[str, Any]]:
|
||||
"""Run Exa search_and_contents in executor to avoid blocking."""
|
||||
loop = asyncio.get_running_loop()
|
||||
try:
|
||||
result = await loop.run_in_executor(
|
||||
None,
|
||||
lambda: exa.search_and_contents(
|
||||
query,
|
||||
type="auto",
|
||||
num_results=num_results,
|
||||
text={"max_characters": 3000},
|
||||
highlights={"num_sentences": 3, "highlights_per_url": 3},
|
||||
),
|
||||
)
|
||||
return self._parse_search_and_contents_result(result)
|
||||
except Exception as e:
|
||||
logger.warning(f"[BacklinkScraper] Exa search_and_contents failed: {e}")
|
||||
return []
|
||||
|
||||
def _parse_search_and_contents_result(self, result) -> List[Dict[str, Any]]:
|
||||
rows = []
|
||||
results = getattr(result, "results", [])
|
||||
for r in results:
|
||||
rows.append({
|
||||
"url": getattr(r, "url", ""),
|
||||
"title": getattr(r, "title", ""),
|
||||
"text": getattr(r, "text", ""),
|
||||
"highlights": getattr(r, "highlights", []),
|
||||
"summary": getattr(r, "summary", ""),
|
||||
"score": getattr(r, "score", 0.5),
|
||||
"published_date": getattr(r, "published_date", None) or getattr(r, "publishedDate", None),
|
||||
})
|
||||
return rows
|
||||
|
||||
def _parse_get_contents_result(self, result) -> List[Dict[str, Any]]:
|
||||
rows = []
|
||||
results = getattr(result, "results", [])
|
||||
for r in results:
|
||||
rows.append({
|
||||
"url": getattr(r, "url", ""),
|
||||
"title": getattr(r, "title", ""),
|
||||
"text": getattr(r, "text", ""),
|
||||
"highlights": getattr(r, "highlights", []),
|
||||
"summary": getattr(r, "summary", ""),
|
||||
})
|
||||
return rows
|
||||
|
||||
# -- DuckDuckGo Fallback Discovery --
|
||||
|
||||
async def _discover_with_duckduckgo(self, keyword: str, max_results: int) -> Dict[str, Any]:
|
||||
queries = self._generate_search_queries(keyword)
|
||||
dedup: Dict[str, Dict[str, Any]] = {}
|
||||
|
||||
for query in queries[:4]:
|
||||
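rows = await asyncio.get_running_loop().run_in_executor(None, self._duckduckgo_search, query)  # keep the blocking HTTP call off the event loop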
|
||||
for row in rows:
|
||||
norm_url = self._normalize_url(row.get("url", ""))
|
||||
if not norm_url or norm_url in dedup:
|
||||
continue
|
||||
dedup[norm_url] = row
|
||||
if len(dedup) >= max_results:
|
||||
break
|
||||
await asyncio.sleep(0.4)  # non-blocking pause between queries
|
||||
|
||||
# Scrape discovered URLs with Exa get_contents (or fallback)
|
||||
urls_to_scrape = list(dedup.keys())[:max_results]
|
||||
scraped = self.scrape_urls(urls_to_scrape)
|
||||
scraped_map = {self._normalize_url(s.get("url", "")): s for s in scraped}
|
||||
|
||||
# Merge DDG results with scraped content
|
||||
merged = {}
|
||||
for norm_url, ddg_row in dedup.items():
|
||||
full = scraped_map.get(norm_url, {})
|
||||
merged[norm_url] = {
|
||||
"url": norm_url,
|
||||
"title": full.get("title") or ddg_row.get("title", ""),
|
||||
"text": full.get("text", ""),
|
||||
"highlights": full.get("highlights", ddg_row.get("highlights", [])),
|
||||
"summary": full.get("summary", ddg_row.get("snippet", "")),
|
||||
"snippet": ddg_row.get("snippet", ""),
|
||||
"score": 0.5,
|
||||
}
|
||||
|
||||
opportunities = self._build_enriched_opportunities(merged, keyword, "duckduckgo")
|
||||
|
||||
return {
|
||||
"keyword": keyword,
|
||||
"source": "duckduckgo",
|
||||
"total_found": len(opportunities),
|
||||
"opportunities": opportunities,
|
||||
}
|
||||
|
||||
def _duckduckgo_search(self, query: str, retries: int = 2) -> List[Dict[str, Any]]:
|
||||
encoded = requests.utils.quote(query)
|
||||
url = f"https://duckduckgo.com/html/?q={encoded}"
|
||||
headers = {"User-Agent": "Mozilla/5.0 ALwrityBacklinkBot/1.0"}
|
||||
for attempt in range(retries + 1):
|
||||
try:
|
||||
resp = requests.get(url, headers=headers, timeout=12)
|
||||
resp.raise_for_status()
|
||||
soup = BeautifulSoup(resp.text, "html.parser")
|
||||
results = []
|
||||
for result in soup.select("div.result")[:10]:
|
||||
anchor = result.select_one("a.result__a")
|
||||
snippet_el = result.select_one("a.result__snippet") or result.select_one("div.result__snippet")
|
||||
if not anchor or not anchor.get("href"):
|
||||
continue
|
||||
results.append({
|
||||
"url": anchor.get("href"),
|
||||
"title": anchor.get_text(strip=True),
|
||||
"snippet": snippet_el.get_text(" ", strip=True) if snippet_el else "",
|
||||
"highlights": [],
|
||||
})
|
||||
return results
|
||||
except Exception:
|
||||
if attempt == retries:
|
||||
return []
|
||||
time.sleep(0.6 * (attempt + 1))
|
||||
return []
|
||||
|
||||
def _scrape_urls_fallback(self, urls: List[str]) -> List[Dict[str, Any]]:
|
||||
"""Basic HTTP scrape when Exa is unavailable."""
|
||||
results = []
|
||||
headers = {"User-Agent": "Mozilla/5.0 ALwrityBacklinkBot/1.0"}
|
||||
for url in urls[:5]:
|
||||
try:
|
||||
resp = requests.get(url, headers=headers, timeout=15)
|
||||
resp.raise_for_status()
|
||||
soup = BeautifulSoup(resp.text, "html.parser")
|
||||
for tag in soup(["script", "style", "nav", "footer", "header"]):
|
||||
tag.decompose()
|
||||
text = soup.get_text(separator=" ", strip=True)
|
||||
title = soup.title.get_text(strip=True) if soup.title else ""
|
||||
results.append({"url": url, "title": title, "text": text[:5000], "highlights": [], "summary": ""})
|
||||
except Exception:
|
||||
continue
|
||||
return results
|
||||
|
||||
# -- Enrichment Pipeline --
|
||||
|
||||
def _build_enriched_opportunities(
|
||||
self, dedup: Dict[str, Dict[str, Any]], keyword: str, source: str
|
||||
) -> List[Dict[str, Any]]:
|
||||
opportunities = []
|
||||
for norm_url, row in dedup.items():
|
||||
text = row.get("text", "")
|
||||
title = row.get("title") or row.get("snippet", "")
|
||||
quality = self._score_quality(text, title)
|
||||
contacts = self._extract_contacts(text)
|
||||
domain = self._extract_domain(norm_url)
|
||||
has_guidelines = self._check_guest_post_signals(text)
|
||||
|
||||
opportunities.append({
|
||||
"url": norm_url,
|
||||
"domain": domain,
|
||||
"page_title": title,
|
||||
"snippet": row.get("snippet") or (text[:300] if text else ""),
|
||||
"full_text": text[:5000],
|
||||
"email": contacts.get("email"),
|
||||
"contact_page": contacts.get("contact_page"),
|
||||
"confidence_score": min(1.0, quality + 0.1),
|
||||
"quality_score": quality,
|
||||
"word_count": len(text.split()),
|
||||
"has_guest_post_guidelines": has_guidelines,
|
||||
"discovery_source": source,
|
||||
})
|
||||
opportunities.sort(key=lambda x: x["quality_score"], reverse=True)
|
||||
return opportunities
|
||||
|
||||
def _extract_domain(self, url: str) -> str:
|
||||
try:
|
||||
return urlparse(url).netloc
|
||||
except Exception:
|
||||
return url
|
||||
|
||||
def _normalize_url(self, url: str) -> str:
|
||||
u = (url or "").strip().strip("`")
|
||||
if not u:
|
||||
return ""
|
||||
if u.startswith("//"):
|
||||
u = f"https:{u}"
|
||||
if not re.match(r"^https?://", u):
|
||||
return ""
|
||||
return u.split("#")[0].rstrip("/")
|
||||
|
||||
def _extract_contacts(self, text: str) -> Dict[str, Optional[str]]:
|
||||
result: Dict[str, Optional[str]] = {"email": None, "contact_page": None}
|
||||
if not text:
|
||||
return result
|
||||
email_match = re.search(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}", text)
|
||||
if email_match:
|
||||
result["email"] = email_match.group(0)
|
||||
contact_match = re.search(
|
||||
r"(https?://[^\s\"'<>]*(?:contact|about|team|write-for-us|guest-post)[^\s\"'<>]*)",
|
||||
text, re.IGNORECASE,
|
||||
)
|
||||
if contact_match:
|
||||
result["contact_page"] = contact_match.group(1).rstrip("/")
|
||||
return result
|
||||
|
||||
def _score_quality(self, text: str, title: str) -> float:
|
||||
score = 0.3
|
||||
words = text.split()
|
||||
wc = len(words)
|
||||
if wc > 2000:
|
||||
score += 0.3
|
||||
elif wc > 800:
|
||||
score += 0.2
|
||||
elif wc > 200:
|
||||
score += 0.1
|
||||
hay = f"{title} {text[:2000]}".lower()
|
||||
cues_found = sum(1 for cue in self.GUEST_POST_KEYWORDS if cue in hay)
|
||||
score += min(0.3, cues_found * 0.06)
|
||||
spam_signals = [
|
||||
bool(re.search(r"buy\s+links?", hay)), bool(re.search(r"cheap\s+backlinks?", hay)),
|
||||
"pbn" in hay, bool(re.search(r"private\s+blog\s+network", hay)),
|
||||
]
|
||||
if any(spam_signals):
|
||||
score -= 0.3
|
||||
return max(0.0, min(1.0, score))
|
||||
|
||||
def _check_guest_post_signals(self, text: str) -> bool:
|
||||
if not text:
|
||||
return False
|
||||
hay = text.lower()
|
||||
guidelines = [
|
||||
"guest post guidelines", "submission guidelines",
|
||||
"write for us", "guest post", "submit a guest post",
|
||||
"guest contributor guidelines", "contributor guidelines",
|
||||
]
|
||||
return any(g in hay for g in guidelines)
|
||||
|
||||
def _generate_search_queries(self, keyword: str) -> List[str]:
|
||||
kw = (keyword or "").strip()
|
||||
if not kw:
|
||||
return []
|
||||
return [
|
||||
f"{kw} write for us",
|
||||
f"{kw} guest post",
|
||||
f"{kw} submit guest post",
|
||||
f"{kw} guest contributor",
|
||||
f"{kw} become a guest blogger",
|
||||
f"{kw} add guest post",
|
||||
f"{kw} guest post opportunities",
|
||||
f"{kw} submit article",
|
||||
]
|
||||
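The discovery loops above normalize each URL, skip duplicates, and stop once `max_results` is reached. The following is a minimal, standalone sketch of that pattern, not the scraper's actual code: `normalize_url` copies the rules from `_normalize_url`, while `dedup_rows` and the sample `batches` are hypothetical names introduced for illustration. It returns from the helper when the cap is hit, so later query batches cannot push the result past the cap (in the Exa path above, `break` leaves only the inner loop).

```python
import re
from typing import Dict, Iterable, List


def normalize_url(url: str) -> str:
    """Trim, require an http(s) scheme, drop the fragment and trailing slash."""
    u = (url or "").strip().strip("`")
    if not u:
        return ""
    if u.startswith("//"):
        u = f"https:{u}"
    if not re.match(r"^https?://", u):
        return ""
    return u.split("#")[0].rstrip("/")


def dedup_rows(batches: Iterable[List[dict]], max_results: int) -> Dict[str, dict]:
    """Collect rows from several query batches, keyed by normalized URL."""
    dedup: Dict[str, dict] = {}
    for rows in batches:
        for row in rows:
            key = normalize_url(row.get("url", ""))
            if not key or key in dedup:
                continue
            dedup[key] = row
            if len(dedup) >= max_results:
                return dedup  # leaves both loops at once
    return dedup


# Hypothetical sample data: three query batches with overlapping results.
batches = [
    [{"url": "https://a.example/post/"}, {"url": "https://a.example/post#top"}],
    [{"url": "//b.example/write-for-us"}, {"url": "ftp://c.example"}],
    [{"url": "https://d.example"}],
]
result = dedup_rows(batches, max_results=2)
print(list(result))
```

With a cap of 2, the third batch is never consulted; the first two entries of the first batch collapse to a single key because they differ only by trailing slash and fragment.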
@@ -1,90 +0,0 @@
|
||||
"""Email sender for backlink outreach via SMTP."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
import ssl
|
||||
import smtplib
|
||||
import asyncio
|
||||
from email.mime.text import MIMEText
|
||||
from email.mime.multipart import MIMEMultipart
|
||||
from typing import Optional
|
||||
from loguru import logger
|
||||
|
||||
|
||||
SMTP_HOST = os.getenv("SMTP_HOST", "smtp.gmail.com")
|
||||
SMTP_PORT = int(os.getenv("SMTP_PORT", "587"))
|
||||
SMTP_USERNAME = os.getenv("SMTP_USERNAME", "")
|
||||
SMTP_PASSWORD = os.getenv("SMTP_PASSWORD", "")
|
||||
SMTP_FROM_EMAIL = os.getenv("SMTP_FROM_EMAIL", SMTP_USERNAME)
|
||||
SMTP_USE_TLS = os.getenv("SMTP_USE_TLS", "true").lower() in ("true", "1", "yes")
|
||||
SMTP_VERIFY_TLS = os.getenv("SMTP_VERIFY_TLS", "true").lower() in ("true", "1", "yes")
|
||||
SMTP_SEND_TIMEOUT = int(os.getenv("SMTP_SEND_TIMEOUT", "30"))
|
||||
|
||||
|
||||
class BacklinkOutreachSender:
|
||||
def __init__(self):
|
||||
self._host = SMTP_HOST
|
||||
self._port = SMTP_PORT
|
||||
self._username = SMTP_USERNAME
|
||||
self._password = SMTP_PASSWORD
|
||||
self._from_email = SMTP_FROM_EMAIL or SMTP_USERNAME
|
||||
self._use_tls = SMTP_USE_TLS
|
||||
self._verify_tls = SMTP_VERIFY_TLS
|
||||
self._timeout = SMTP_SEND_TIMEOUT
|
||||
|
||||
def is_configured(self) -> bool:
|
||||
return bool(self._username and self._password)
|
||||
|
||||
async def send_email(
|
||||
self,
|
||||
to_email: str,
|
||||
subject: str,
|
||||
body: str,
|
||||
from_email: Optional[str] = None,
|
||||
) -> bool:
|
||||
if not self.is_configured():
|
||||
logger.error("SMTP not configured: set SMTP_USERNAME and SMTP_PASSWORD")
|
||||
return False
|
||||
|
||||
sender = from_email or self._from_email
|
||||
|
||||
msg = MIMEMultipart("alternative")
|
||||
msg["From"] = sender
|
||||
msg["To"] = to_email
|
||||
msg["Subject"] = subject
|
||||
msg.attach(MIMEText(body, "plain"))
|
||||
|
||||
loop = asyncio.get_running_loop()
|
||||
|
||||
def _send() -> bool:
|
||||
try:
|
||||
tls_context = ssl.create_default_context()
|
||||
if not self._verify_tls:
|
||||
tls_context.check_hostname = False
|
||||
tls_context.verify_mode = ssl.CERT_NONE
|
||||
with smtplib.SMTP(self._host, self._port, timeout=self._timeout) as server:
|
||||
if self._use_tls:
|
||||
server.starttls(context=tls_context)
|
||||
server.ehlo()
|
||||
server.login(self._username, self._password)
|
||||
server.sendmail(sender, [to_email], msg.as_string())
|
||||
logger.info(f"Email sent to {to_email}: {subject[:60]}")
|
||||
return True
|
||||
except smtplib.SMTPException as e:
|
||||
logger.error(f"SMTP error sending to {to_email}: {e}")
|
||||
return False
|
||||
except Exception as e:
|
||||
logger.error(f"Unexpected error sending to {to_email}: {e}")
|
||||
return False
|
||||
|
||||
return await loop.run_in_executor(None, _send)
|
||||
|
||||
def personalize(self, template: str, variables: dict) -> str:
|
||||
"""Replace {placeholder} variables in a template string."""
|
||||
for key, value in variables.items():
|
||||
template = template.replace(f"{{{key}}}", str(value))
|
||||
return template
|
||||
|
||||
|
||||
backlink_outreach_sender = BacklinkOutreachSender()
|
||||
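As a small illustration of how `personalize` behaves, here is a standalone copy of its replacement loop with a hypothetical template and variable set (the names `Dana`, `example.com`, and the template text are made up for this sketch). It shows that placeholders without a matching variable are left in the text rather than raising, which differs from `str.format`.

```python
def personalize(template: str, variables: dict) -> str:
    """Replace {placeholder} variables in a template string."""
    for key, value in variables.items():
        # f"{{{key}}}" renders as "{" + key + "}"
        template = template.replace(f"{{{key}}}", str(value))
    return template


# Hypothetical template: {article} has no matching variable on purpose.
message = personalize(
    "Hi {name}, I enjoyed {article} on {site}.",
    {"name": "Dana", "site": "example.com"},
)
print(message)
```

A caller that wants to avoid sending mail with unfilled placeholders could check the result for leftover `{...}` tokens before calling `send_email`; that check is a suggestion, not something the sender above does.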
@@ -1,361 +0,0 @@
|
||||
"""Canonical backlink outreach service entrypoint."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from dataclasses import dataclass
|
||||
from typing import Any, Dict, List, Optional
|
||||
import re
|
||||
import time
|
||||
|
||||
import requests
|
||||
from bs4 import BeautifulSoup
|
||||
|
||||
import csv
|
||||
import io
|
||||
|
||||
from services.backlink_outreach_models import (
|
||||
OpportunityContactInfo, OpportunityRecord,
|
||||
PolicyValidationRequest, PolicyValidationResponse,
|
||||
SendOutreachRequest, SendOutreachResponse,
|
||||
CampaignVolumeResponse, CampaignVolumePoint,
|
||||
ConversionFunnelResponse, FunnelStage,
|
||||
)
|
||||
from services.backlink_outreach_storage import BacklinkOutreachStorageService
|
||||
|
||||
DEFAULT_USER_DAILY_CAP = 100
|
||||
DEFAULT_DOMAIN_DAILY_CAP = 20
|
||||
|
||||
@dataclass
|
||||
class SearchResult:
|
||||
url: str
|
||||
title: str
|
||||
snippet: str
|
||||
|
||||
|
||||
class BacklinkOutreachService:
|
||||
def list_backlink_modules(self) -> List[Dict[str, Any]]:
|
||||
return [
|
||||
{"identifier": "backlink", "module_path": "backend/services/backlink_outreach_service.py", "purpose": "Canonical backlink service facade"},
|
||||
{"identifier": "outreach", "module_path": "backend/routers/backlink_outreach.py", "purpose": "HTTP API entrypoint for backlink outreach"},
|
||||
{"identifier": "guest_post", "module_path": "frontend/src/api/backlinkOutreachApi.ts", "purpose": "Frontend API integration for guest-post workflows"},
|
||||
]
|
||||
|
||||
def generate_guest_post_queries(self, keyword: str) -> List[str]:
|
||||
normalized = (keyword or "").strip()
|
||||
if not normalized:
|
||||
return []
|
||||
return [
|
||||
f"{normalized} + 'Guest Contributor'",
|
||||
f"{normalized} + 'Add Guest Post'",
|
||||
f"{normalized} + 'Guest Bloggers Wanted'",
|
||||
f"{normalized} + 'Write for Us'",
|
||||
f"{normalized} + 'Submit Guest Post'",
|
||||
f"{normalized} + 'Become a Guest Blogger'",
|
||||
f"{normalized} + 'guest post opportunities'",
|
||||
f"{normalized} + 'Submit article'",
|
||||
]
|
||||
|
||||
def search_for_urls(self, query: str, timeout_seconds: int = 12, retries: int = 2) -> List[SearchResult]:
|
||||
encoded_query = requests.utils.quote(query)
|
||||
url = f"https://duckduckgo.com/html/?q={encoded_query}"
|
||||
headers = {"User-Agent": "Mozilla/5.0 ALwrityBacklinkBot/1.0"}
|
||||
|
||||
for attempt in range(retries + 1):
|
||||
try:
|
||||
response = requests.get(url, headers=headers, timeout=timeout_seconds)
|
||||
response.raise_for_status()
|
||||
soup = BeautifulSoup(response.text, "html.parser")
|
||||
rows: List[SearchResult] = []
|
||||
for result in soup.select("div.result")[:10]:
|
||||
anchor = result.select_one("a.result__a")
|
||||
snippet = result.select_one("a.result__snippet") or result.select_one("div.result__snippet")
|
||||
if not anchor or not anchor.get("href"):
|
||||
continue
|
||||
rows.append(
|
||||
SearchResult(
|
||||
url=anchor.get("href"),
|
||||
title=anchor.get_text(strip=True),
|
||||
snippet=snippet.get_text(" ", strip=True) if snippet else "",
|
||||
)
|
||||
)
|
||||
return rows
|
||||
except Exception:
|
||||
if attempt == retries:
|
||||
return []
|
||||
time.sleep(0.6 * (attempt + 1))
|
||||
return []
|
||||
|
||||
def discover_opportunities(self, keyword: str, max_results: int = 10) -> Dict[str, Any]:
|
||||
queries = self.generate_guest_post_queries(keyword)[:4]
|
||||
dedup: Dict[str, SearchResult] = {}
|
||||
|
||||
for query in queries:
|
||||
for result in self.search_for_urls(query):
|
||||
normalized_url = self._normalize_url(result.url)
|
||||
if not normalized_url or normalized_url in dedup:
|
||||
continue
|
||||
dedup[normalized_url] = result
|
||||
if len(dedup) >= max_results:
|
||||
break
|
||||
if len(dedup) >= max_results:
|
||||
break
|
||||
time.sleep(0.4)
|
||||
|
||||
opportunities: List[OpportunityRecord] = []
|
||||
for normalized_url, row in dedup.items():
|
||||
contact = self._extract_contact_info(row.snippet)
|
||||
score = self._score_confidence(row.title, row.snippet)
|
||||
opportunities.append(
|
||||
OpportunityRecord(
|
||||
url=normalized_url,
|
||||
title=row.title or "Untitled",
|
||||
snippet=row.snippet,
|
||||
metadata={"source": "duckduckgo_html", "query_keyword": keyword},
|
||||
contact_info=contact,
|
||||
confidence_score=score,
|
||||
)
|
||||
)
|
||||
|
||||
return {"keyword": keyword, "queries": queries, "opportunities": opportunities}
|
||||
|
||||
def _normalize_url(self, url: str) -> str:
|
||||
u = (url or "").strip()
|
||||
if not u:
|
||||
return ""
|
||||
if u.startswith("//"):
|
||||
u = f"https:{u}"
|
||||
if not re.match(r"^https?://", u):
|
||||
return ""
|
||||
return u.split("#")[0].rstrip("/")
|
||||
|
||||
def _extract_contact_info(self, text: str) -> OpportunityContactInfo:
|
||||
if not text:
|
||||
return OpportunityContactInfo()
|
||||
email_match = re.search(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}", text)
|
||||
return OpportunityContactInfo(email=email_match.group(0) if email_match else None)
|
||||
|
||||
def _score_confidence(self, title: str, snippet: str) -> float:
|
||||
hay = f"{title} {snippet}".lower()
|
||||
cues = ["write for us", "guest post", "submit", "contributor", "guest blogger"]
|
||||
hits = sum(1 for cue in cues if cue in hay)
|
||||
return min(1.0, 0.35 + (0.13 * hits))
|
||||
|
||||
|
||||
def _get_storage(self) -> BacklinkOutreachStorageService:
|
||||
return BacklinkOutreachStorageService()
|
||||
|
||||
def validate_send_policy(self, payload: PolicyValidationRequest) -> PolicyValidationResponse:
|
||||
reasons: List[str] = []
|
||||
storage = self._get_storage()
|
||||
|
||||
if payload.workspace_id.startswith("new-") and not payload.approved_by_human:
|
||||
reasons.append("human_review_required_for_new_workspace")
|
||||
if payload.legal_basis.lower() not in {"legitimate_interest", "consent", "contract"}:
|
||||
reasons.append("invalid_legal_basis")
|
||||
if payload.recipient_region.lower() in {"eu", "eea"} and payload.legal_basis.lower() != "consent":
|
||||
reasons.append("region_requires_explicit_consent")
|
||||
|
||||
if len(payload.sender_identity.strip()) < 3:
|
||||
reasons.append("sender_identity_required")
|
||||
|
||||
if storage.is_suppressed(str(payload.recipient_email), payload.recipient_domain, user_id=payload.user_id):
|
||||
reasons.append("recipient_suppressed")
|
||||
if storage.check_idempotency(payload.idempotency_key, user_id=payload.user_id):
|
||||
reasons.append("duplicate_idempotency_key")
|
||||
|
||||
user_count = storage.get_user_send_count(payload.user_id)
|
||||
domain_count = storage.get_domain_send_count(payload.recipient_domain, user_id=payload.user_id)
|
||||
if user_count >= DEFAULT_USER_DAILY_CAP:
|
||||
reasons.append("user_daily_cap_exceeded")
|
||||
if domain_count >= DEFAULT_DOMAIN_DAILY_CAP:
|
||||
reasons.append("domain_daily_cap_exceeded")
|
||||
|
||||
allowed = len(reasons) == 0
|
||||
final_status = "approved" if allowed else "blocked"
|
||||
|
||||
storage.add_audit_log(
|
||||
event="policy_check",
|
||||
user_id=payload.user_id,
|
||||
campaign_id=payload.campaign_id,
|
||||
recipient=str(payload.recipient_email),
|
||||
allowed=allowed,
|
||||
reasons=reasons,
|
||||
override=payload.approved_by_human,
|
||||
)
|
||||
|
||||
return PolicyValidationResponse(allowed=allowed, reasons=reasons, final_status=final_status)
|
||||
|
||||
EU_DOMAIN_SUFFIXES = (".de", ".fr", ".it", ".es", ".nl", ".be", ".at", ".se", ".dk", ".fi", ".pt", ".ie", ".gr", ".pl", ".cz", ".ro", ".hu", ".bg", ".hr", ".sk", ".si", ".ee", ".lv", ".lt", ".lu", ".mt", ".cy")
|
||||
|
||||
def _infer_region(self, domain: str) -> str:
|
||||
d = domain.lower()
|
||||
if any(d.endswith(s) or d.endswith(s + "/") for s in self.EU_DOMAIN_SUFFIXES):
|
||||
return "eu"
|
||||
if d.endswith(".uk"):
|
||||
return "uk"
|
||||
if d.endswith(".ca"):
|
||||
return "ca"
|
||||
if d.endswith(".au"):
|
||||
return "au"
|
||||
return "unknown"
|
||||
|
||||
def send_outreach(self, request: SendOutreachRequest) -> SendOutreachResponse:
|
||||
storage = self._get_storage()
|
||||
lead = storage.get_lead(request.lead_id, user_id=request.user_id)
|
||||
if not lead:
|
||||
return SendOutreachResponse(attempt_id="", status="failed", policy_allowed=False, policy_reasons=["lead_not_found"])
|
||||
|
||||
domain = lead.get("domain", request.sender_email.split("@")[-1] if "@" in request.sender_email else "unknown")
|
||||
recipient_region = self._infer_region(domain)
|
||||
legal_basis = "consent" if recipient_region == "eu" else "legitimate_interest"
|
||||
|
||||
policy_req = PolicyValidationRequest(
|
||||
user_id=request.user_id,
|
||||
workspace_id=request.workspace_id,
|
||||
campaign_id=request.campaign_id,
|
||||
recipient_email=lead.get("email", ""),
|
||||
recipient_domain=domain,
|
||||
recipient_region=recipient_region,
|
||||
legal_basis=legal_basis,
|
||||
approved_by_human=False,
|
||||
unsubscribe_url=None,
|
||||
sender_identity=request.sender_email,
|
||||
idempotency_key=request.idempotency_key,
|
||||
)
|
||||
policy = self.validate_send_policy(policy_req)
|
||||
|
||||
attempt = storage.add_attempt(
|
||||
lead_id=request.lead_id,
|
||||
campaign_id=request.campaign_id,
|
||||
idempotency_key=request.idempotency_key,
|
||||
sender_email=request.sender_email,
|
||||
subject=request.subject,
|
||||
body=request.body,
|
||||
status="approved" if policy.allowed else "blocked",
|
||||
decision_reason="; ".join(policy.reasons) if policy.reasons else None,
|
||||
user_id=request.user_id,
|
||||
)
|
||||
|
||||
return SendOutreachResponse(
|
||||
attempt_id=attempt.get("attempt_id", ""),
|
||||
status=attempt.get("status", "failed"),
|
||||
policy_allowed=policy.allowed,
|
||||
policy_reasons=policy.reasons,
|
||||
)
|
||||
|
||||
def get_reporting_snapshot(self, user_id: str = "default") -> Dict[str, Any]:
|
||||
storage = self._get_storage()
|
||||
campaigns = storage.list_campaigns(user_id, user_id, limit=100)
|
||||
total_sent = 0
|
||||
total_replied = 0
|
||||
total_placed = 0
|
||||
total_leads = 0
|
||||
for c in campaigns:
|
||||
cid = c["campaign_id"]
|
||||
attempts = storage.list_attempts(cid, limit=10000, user_id=user_id)
|
||||
leads = storage.list_leads_all(cid, user_id=user_id)
|
||||
total_sent += sum(1 for a in attempts if a.get("status") == "sent")
|
||||
total_replied += storage.count_replies(cid, user_id=user_id)
|
||||
total_placed += sum(1 for l in leads if l.get("status") == "placed")
|
||||
total_leads += len(leads)
|
||||
logs = storage.list_audit_logs("", limit=1000, user_id=user_id)
|
||||
return {
|
||||
"send_volume": total_sent,
|
||||
"decision_events": len(logs),
|
||||
"response_rate": round(total_replied / total_sent, 4) if total_sent > 0 else 0.0,
|
||||
"placement_conversion": round(total_placed / total_leads, 4) if total_leads > 0 else 0.0,
|
||||
}
|
||||
|
||||
def get_campaign_volume(self, campaign_id: str, days: int = 30, user_id: str = "default") -> CampaignVolumeResponse:
|
||||
storage = self._get_storage()
|
||||
points = storage.get_send_volume_by_day(campaign_id, days, user_id=user_id)
|
||||
return CampaignVolumeResponse(
|
||||
campaign_id=campaign_id, days=days,
|
||||
volume=[CampaignVolumePoint(**p) for p in points],
|
||||
)
|
||||
|
||||
def get_campaign_funnel(self, campaign_id: str, user_id: str = "default") -> ConversionFunnelResponse:
|
||||
storage = self._get_storage()
|
||||
stages = storage.get_lead_status_counts(campaign_id, user_id=user_id)
|
||||
return ConversionFunnelResponse(
|
||||
campaign_id=campaign_id,
|
||||
stages=[FunnelStage(**s) for s in stages],
|
||||
)
|
||||
|
||||
CSV_LEAD_FIELDS = ["lead_id", "campaign_id", "domain", "page_title", "email", "status", "discovery_source", "created_at"]
|
||||
CSV_ATTEMPT_FIELDS = ["attempt_id", "lead_id", "campaign_id", "sender_email", "subject", "status", "sent_at", "created_at"]
|
||||
CSV_REPLY_FIELDS = ["reply_id", "attempt_id", "from_email", "subject", "classification", "received_at"]
|
||||
|
||||
@staticmethod
|
||||
def _sanitize_csv_value(value: Any) -> str:
|
||||
s = str(value) if value is not None else ""
|
||||
if s and s[0] in ("=", "+", "-", "@", "\t", "\r"):
|
||||
s = "'" + s
|
||||
return s
|
||||
|
||||
def export_leads_csv(self, campaign_id: str, user_id: str = "default") -> str:
|
||||
storage = self._get_storage()
|
||||
leads = storage.list_leads_all(campaign_id, user_id=user_id)
|
||||
output = io.StringIO()
|
||||
writer = csv.DictWriter(output, fieldnames=self.CSV_LEAD_FIELDS, extrasaction="ignore")
|
||||
writer.writeheader()
|
||||
for row in leads:
|
||||
writer.writerow({k: self._sanitize_csv_value(v) for k, v in row.items()})
|
||||
return output.getvalue()
|
||||
|
||||
def export_attempts_csv(self, campaign_id: str, user_id: str = "default") -> str:
|
||||
storage = self._get_storage()
|
||||
attempts = storage.list_attempts_all(campaign_id, user_id=user_id)
|
||||
output = io.StringIO()
|
||||
writer = csv.DictWriter(output, fieldnames=self.CSV_ATTEMPT_FIELDS, extrasaction="ignore")
|
||||
writer.writeheader()
|
||||
for row in attempts:
|
||||
writer.writerow({k: self._sanitize_csv_value(v) for k, v in row.items()})
|
||||
return output.getvalue()
|
||||
|
||||
def export_replies_csv(self, campaign_id: str, user_id: str = "default") -> str:
|
||||
storage = self._get_storage()
|
||||
replies = storage.list_replies_all(campaign_id, user_id=user_id)
|
||||
output = io.StringIO()
|
||||
writer = csv.DictWriter(output, fieldnames=self.CSV_REPLY_FIELDS, extrasaction="ignore")
|
||||
writer.writeheader()
|
||||
for row in replies:
|
||||
writer.writerow({k: self._sanitize_csv_value(v) for k, v in row.items()})
|
||||
return output.getvalue()
|
||||
|
||||
async def deep_discover(self, keyword: str, max_results: int = 15) -> Dict[str, Any]:
|
||||
"""Enhanced discovery using Exa neural search + DuckDuckGo with full-page scraping."""
|
||||
from services.backlink_outreach_scraper import BacklinkOutreachScraper
|
||||
scraper = BacklinkOutreachScraper(user_id=getattr(self, "_user_id", None))
|
||||
return await scraper.deep_discover(keyword, max_results)
|
||||
|
||||
def get_migration_coverage(self) -> Dict[str, Any]:
|
||||
implemented = [
|
||||
"discoverable backend router + service",
|
||||
"frontend API/store/UI integration point",
|
||||
"legacy guest-post search query generation templates",
|
||||
"provider-backed URL discovery + normalization + deduplication",
|
||||
"typed opportunity records and confidence score",
|
||||
"deep webpage scraping + contact-page extraction via Exa",
|
||||
"quality scoring and guest-post signal detection",
|
||||
"DB-backed policy validation with suppression & idempotency",
|
||||
"outreach attempt recording + status lifecycle",
|
||||
"SMTP email sending via backlink_outreach_sender",
|
||||
"IMAP reply polling with auto-classification",
|
||||
"follow-up scheduling with sent tracking",
|
||||
"email template CRUD + AI generation (llm_text_gen)",
|
||||
"personalized send via template variables",
|
||||
]
|
||||
planned = [
|
||||
"follow-up orchestration and campaign analytics",
|
||||
]
|
||||
return {
|
||||
"legacy_reference": "ToBeMigrated/ai_marketing_tools/ai_backlinker/ai_backlinking.py",
|
||||
"implemented_count": len(implemented),
|
||||
"planned_count": len(planned),
|
||||
"implemented": implemented,
|
||||
"planned": planned,
|
||||
}
|
||||
|
||||
|
||||
backlink_outreach_service = BacklinkOutreachService()
|
||||
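The CSV export methods above prefix values that start with a formula character so that spreadsheet applications treat them as text. A minimal standalone sketch of that idea follows; `sanitize_csv_value` mirrors `_sanitize_csv_value`, while `rows_to_csv`, `FORMULA_PREFIXES`, and the sample row are hypothetical names introduced here.

```python
import csv
import io
from typing import Any, Dict, List

# Leading characters that spreadsheet apps may interpret as a formula.
FORMULA_PREFIXES = ("=", "+", "-", "@", "\t", "\r")


def sanitize_csv_value(value: Any) -> str:
    """Stringify a value and neutralize a leading formula character."""
    s = str(value) if value is not None else ""
    if s and s[0] in FORMULA_PREFIXES:
        s = "'" + s
    return s


def rows_to_csv(rows: List[Dict[str, Any]], fieldnames: List[str]) -> str:
    """Write rows with a fixed column set; keys outside fieldnames are dropped."""
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        writer.writerow({k: sanitize_csv_value(v) for k, v in row.items()})
    return output.getvalue()


# Hypothetical lead row with a formula-like title and an extra key.
csv_text = rows_to_csv(
    [{"domain": "example.com", "page_title": "=SUM(A1:A2)", "email": None, "extra": 1}],
    ["domain", "page_title", "email"],
)
print(csv_text)
```

One trade-off of this approach is that legitimate values beginning with `-` or `+`, such as negative numbers, also receive the leading apostrophe.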
@@ -1,933 +0,0 @@
|
||||
"""Backlink outreach persistence service (campaign-creator style)."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from datetime import datetime, date
|
||||
from uuid import uuid4
|
||||
from typing import List, Optional
|
||||
from sqlalchemy import text as sql_text, func as sa_func
|
||||
|
||||
from services.database import get_session_for_user
|
||||
from models.backlink_outreach_models import (
|
||||
Base, BacklinkCampaign, BacklinkLead,
|
||||
OutreachAttempt, OutreachReply, FollowUpSchedule, EmailTemplate,
|
||||
SuppressedRecipient, SentIdempotencyKey, AuditLogEntry,
|
||||
SendCounterUser, SendCounterDomain,
|
||||
)
|
||||
|
||||
|
||||
class BacklinkOutreachStorageService:
|
||||
_NEW_LEAD_COLUMNS = [
|
||||
"url", "page_title", "snippet", "confidence_score", "discovery_source", "notes"
|
||||
]
|
||||
|
||||
def _ensure_tables(self, user_id: str) -> None:
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return
|
||||
try:
|
||||
Base.metadata.create_all(bind=db.get_bind(), checkfirst=True)
|
||||
self._migrate_lead_columns(db)
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def _migrate_lead_columns(self, db) -> None:
|
||||
"""Add new columns to backlink_leads if they don't exist (dev migration)."""
|
||||
try:
|
||||
valid_columns = {"url", "page_title", "snippet", "confidence_score", "discovery_source", "notes"}
|
||||
for col in self._NEW_LEAD_COLUMNS:
|
||||
if col not in valid_columns or col == "confidence_score":  # confidence_score is added as FLOAT below, not TEXT
|
||||
continue
|
||||
safe_col = col.replace('"', "").replace(";", "")
|
||||
db.execute(sql_text(
|
||||
f"ALTER TABLE backlink_leads ADD COLUMN IF NOT EXISTS \"{safe_col}\" TEXT"
|
||||
))
|
||||
db.execute(sql_text(
|
||||
"ALTER TABLE backlink_leads ADD COLUMN IF NOT EXISTS confidence_score FLOAT DEFAULT 0.0"
|
||||
))
|
||||
db.commit()
|
||||
except Exception:
|
||||
db.rollback()
|
||||
|
||||
def create_campaign(self, user_id: str, workspace_id: str, name: str) -> dict:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
raise RuntimeError("Database session unavailable")
|
||||
try:
|
||||
campaign = BacklinkCampaign(
|
||||
id=f"bl_{uuid4().hex[:16]}",
|
||||
user_id=user_id,
|
||||
workspace_id=workspace_id,
|
||||
name=name,
|
||||
status="drafted",
|
||||
created_at=datetime.utcnow(),
|
||||
)
|
||||
db.add(campaign)
|
||||
db.commit()
|
||||
return {"campaign_id": campaign.id, "name": campaign.name, "status": campaign.status}
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def list_campaigns(self, user_id: str, workspace_id: str, limit: int = 50) -> List[dict]:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return []
|
||||
try:
|
||||
rows = (
|
||||
db.query(BacklinkCampaign)
|
||||
.filter(BacklinkCampaign.user_id == user_id, BacklinkCampaign.workspace_id == workspace_id)
|
||||
.order_by(BacklinkCampaign.created_at.desc())
|
||||
.limit(limit)
|
||||
.all()
|
||||
)
|
||||
return [{"campaign_id": r.id, "name": r.name, "status": r.status, "created_at": r.created_at.isoformat()} for r in rows]
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def get_campaign(self, campaign_id: str, user_id: str) -> Optional[dict]:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return None
|
||||
try:
|
||||
campaign = (
|
||||
db.query(BacklinkCampaign)
|
||||
.filter(BacklinkCampaign.id == campaign_id, BacklinkCampaign.user_id == user_id)
|
||||
.first()
|
||||
)
|
||||
if not campaign:
|
||||
return None
|
||||
lead_count = db.query(BacklinkLead).filter(BacklinkLead.campaign_id == campaign_id).count()
|
||||
leads = (
|
||||
db.query(BacklinkLead)
|
||||
.filter(BacklinkLead.campaign_id == campaign_id)
|
||||
.order_by(BacklinkLead.created_at.desc())
|
||||
.limit(50)
|
||||
.all()
|
||||
)
|
||||
return {
|
||||
"campaign_id": campaign.id,
|
||||
"name": campaign.name,
|
||||
"status": campaign.status,
|
||||
"created_at": campaign.created_at.isoformat() if campaign.created_at else None,
|
||||
"lead_count": lead_count,
|
||||
"leads": [self._lead_to_dict(l) for l in leads],
|
||||
}
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
# -- Lead CRUD --
|
||||
|
||||
def add_lead(
|
||||
self,
|
||||
campaign_id: str,
|
||||
user_id: str,
|
||||
url: str,
|
||||
domain: str,
|
||||
page_title: str = "",
|
||||
snippet: str = "",
|
||||
email: Optional[str] = None,
|
||||
confidence_score: float = 0.0,
|
||||
discovery_source: str = "duckduckgo",
|
||||
notes: Optional[str] = None,
|
||||
) -> dict:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
raise RuntimeError("Database session unavailable")
|
||||
try:
|
||||
lead = BacklinkLead(
|
||||
id=f"bl_{uuid4().hex[:16]}",
|
||||
campaign_id=campaign_id,
|
||||
url=url,
|
||||
domain=domain,
|
||||
page_title=page_title,
|
||||
snippet=snippet,
|
||||
email=email,
|
||||
confidence_score=confidence_score,
|
||||
discovery_source=discovery_source,
|
||||
status="discovered",
|
||||
notes=notes,
|
||||
created_at=datetime.utcnow(),
|
||||
)
|
||||
db.add(lead)
|
||||
db.commit()
|
||||
return self._lead_to_dict(lead)
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def bulk_add_leads(self, campaign_id: str, user_id: str, leads_data: List[dict]) -> List[dict]:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
raise RuntimeError("Database session unavailable")
|
||||
try:
|
||||
added = []
|
||||
for data in leads_data:
|
||||
lead = BacklinkLead(
|
||||
id=f"bl_{uuid4().hex[:16]}",
|
||||
campaign_id=campaign_id,
|
||||
url=data.get("url", ""),
|
||||
domain=data.get("domain", ""),
|
||||
page_title=data.get("page_title", ""),
|
||||
snippet=data.get("snippet", ""),
|
||||
email=data.get("email"),
|
||||
confidence_score=data.get("confidence_score", 0.0),
|
||||
discovery_source=data.get("discovery_source", "duckduckgo"),
|
||||
status="discovered",
|
||||
notes=data.get("notes"),
|
||||
created_at=datetime.utcnow(),
|
||||
)
|
||||
db.add(lead)
|
||||
added.append(lead)
|
||||
db.commit()
|
||||
return [self._lead_to_dict(l) for l in added]
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def list_leads(
|
||||
self, campaign_id: str, user_id: str, status: Optional[str] = None, limit: int = 50
|
||||
) -> List[dict]:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return []
|
||||
try:
|
||||
q = db.query(BacklinkLead).filter(BacklinkLead.campaign_id == campaign_id)
|
||||
if status:
|
||||
q = q.filter(BacklinkLead.status == status)
|
||||
rows = q.order_by(BacklinkLead.created_at.desc()).limit(limit).all()
|
||||
return [self._lead_to_dict(r) for r in rows]
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def update_lead_status(
|
||||
self, lead_id: str, user_id: str, status: str, notes: Optional[str] = None
|
||||
) -> Optional[dict]:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return None
|
||||
try:
|
||||
lead = db.query(BacklinkLead).filter(BacklinkLead.id == lead_id).first()
|
||||
if not lead:
|
||||
return None
|
||||
lead.status = status
|
||||
if notes is not None:
|
||||
lead.notes = notes
|
||||
db.commit()
|
||||
return self._lead_to_dict(lead)
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
@staticmethod
|
||||
def _lead_to_dict(lead) -> dict:
|
||||
return {
|
||||
"lead_id": lead.id,
|
||||
"campaign_id": lead.campaign_id,
|
||||
"url": lead.url,
|
||||
"domain": lead.domain,
|
||||
"page_title": lead.page_title or "",
|
||||
"snippet": lead.snippet or "",
|
||||
"email": lead.email,
|
||||
"confidence_score": lead.confidence_score or 0.0,
|
||||
"discovery_source": lead.discovery_source or "duckduckgo",
|
||||
"status": lead.status,
|
||||
"notes": lead.notes,
|
||||
"created_at": lead.created_at.isoformat() if lead.created_at else None,
|
||||
}
|
||||
|
||||
# -- Outreach Attempt CRUD --
|
||||
|
||||
def add_attempt(
|
||||
self,
|
||||
lead_id: str,
|
||||
campaign_id: str,
|
||||
idempotency_key: str,
|
||||
sender_email: str = "",
|
||||
subject: str = "",
|
||||
body: str = "",
|
||||
status: str = "queued",
|
||||
decision_reason: Optional[str] = None,
|
||||
user_id: str = "default",
|
||||
) -> dict:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
raise RuntimeError("Database session unavailable")
|
||||
try:
|
||||
attempt = OutreachAttempt(
|
||||
id=f"att_{uuid4().hex[:16]}",
|
||||
lead_id=lead_id,
|
||||
campaign_id=campaign_id,
|
||||
idempotency_key=idempotency_key,
|
||||
sender_email=sender_email,
|
||||
subject=subject,
|
||||
body=body,
|
||||
status=status,
|
||||
decision_reason=decision_reason,
|
||||
created_at=datetime.utcnow(),
|
||||
)
|
||||
db.add(attempt)
|
||||
db.commit()
|
||||
return self._attempt_to_dict(attempt)
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def list_attempts(self, campaign_id: str, limit: int = 50, user_id: str = "default") -> List[dict]:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return []
|
||||
try:
|
||||
rows = (
|
||||
db.query(OutreachAttempt)
|
||||
.filter(OutreachAttempt.campaign_id == campaign_id)
|
||||
.order_by(OutreachAttempt.created_at.desc())
|
||||
.limit(limit)
|
||||
.all()
|
||||
)
|
||||
return [self._attempt_to_dict(r) for r in rows]
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def update_attempt_status(self, attempt_id: str, status: str, decision_reason: Optional[str] = None, user_id: str = "default") -> Optional[dict]:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return None
|
||||
try:
|
||||
attempt = db.query(OutreachAttempt).filter(OutreachAttempt.id == attempt_id).first()
|
||||
if not attempt:
|
||||
return None
|
||||
attempt.status = status
|
||||
if decision_reason is not None:
|
||||
attempt.decision_reason = decision_reason
|
||||
if status == "sent":
|
||||
attempt.sent_at = datetime.utcnow()
|
||||
db.commit()
|
||||
return self._attempt_to_dict(attempt)
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
@staticmethod
|
||||
def _attempt_to_dict(attempt) -> dict:
|
||||
return {
|
||||
"attempt_id": attempt.id,
|
||||
"lead_id": attempt.lead_id,
|
||||
"campaign_id": attempt.campaign_id,
|
||||
"idempotency_key": attempt.idempotency_key,
|
||||
"sender_email": attempt.sender_email or "",
|
||||
"subject": attempt.subject or "",
|
||||
"status": attempt.status,
|
||||
"decision_reason": attempt.decision_reason,
|
||||
"sent_at": attempt.sent_at.isoformat() if attempt.sent_at else None,
|
||||
"created_at": attempt.created_at.isoformat() if attempt.created_at else None,
|
||||
}
|
||||
|
||||
def find_attempt_by_from_email(self, from_email: str, user_id: str = "default") -> Optional[str]:
|
||||
"""Find the most recent attempt_id for a given sender email (lead)."""
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return None
|
||||
try:
|
||||
from sqlalchemy import desc
|
||||
attempt = (
|
||||
db.query(OutreachAttempt)
|
||||
.join(BacklinkLead, OutreachAttempt.lead_id == BacklinkLead.id)
|
||||
.filter(BacklinkLead.email == from_email)
|
||||
.order_by(desc(OutreachAttempt.created_at))
|
||||
.first()
|
||||
)
|
||||
return attempt.id if attempt else None
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
# -- Outreach Reply CRUD --
|
||||
|
||||
def reply_exists(self, from_email: str, subject: str, user_id: str = "default") -> bool:
|
||||
"""Check if a reply with this from_email+subject already exists."""
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return False
|
||||
try:
|
||||
exists = (
|
||||
db.query(OutreachReply.id)
|
||||
.filter(OutreachReply.from_email == from_email, OutreachReply.subject == subject)
|
||||
.first()
|
||||
)
|
||||
return exists is not None
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def add_reply(
|
||||
self,
|
||||
attempt_id: str,
|
||||
from_email: str = "",
|
||||
subject: str = "",
|
||||
body: str = "",
|
||||
classification: str = "replied",
|
||||
user_id: str = "default",
|
||||
) -> dict:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
raise RuntimeError("Database session unavailable")
|
||||
try:
|
||||
reply = OutreachReply(
|
||||
id=f"rep_{uuid4().hex[:16]}",
|
||||
attempt_id=attempt_id,
|
||||
from_email=from_email,
|
||||
subject=subject,
|
||||
body=body,
|
||||
classification=classification,
|
||||
received_at=datetime.utcnow(),
|
||||
)
|
||||
db.add(reply)
|
||||
db.commit()
|
||||
return self._reply_to_dict(reply)
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def list_replies(self, campaign_id: str, limit: int = 50, user_id: str = "default") -> List[dict]:
|
||||
"""List replies by joining through attempts to filter by campaign."""
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return []
|
||||
try:
|
||||
rows = (
|
||||
db.query(OutreachReply)
|
||||
.join(OutreachAttempt, OutreachReply.attempt_id == OutreachAttempt.id)
|
||||
.filter(OutreachAttempt.campaign_id == campaign_id)
|
||||
.order_by(OutreachReply.received_at.desc())
|
||||
.limit(limit)
|
||||
.all()
|
||||
)
|
||||
return [self._reply_to_dict(r) for r in rows]
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
@staticmethod
|
||||
def _reply_to_dict(reply) -> dict:
|
||||
return {
|
||||
"reply_id": reply.id,
|
||||
"attempt_id": reply.attempt_id,
|
||||
"from_email": reply.from_email or "",
|
||||
"subject": reply.subject or "",
|
||||
"received_at": reply.received_at.isoformat() if reply.received_at else None,
|
||||
"classification": reply.classification,
|
||||
"body": reply.body or "",
|
||||
}
|
||||
|
||||
# -- Follow-Up Schedule CRUD --
|
||||
|
||||
def schedule_followup(
|
||||
self,
|
||||
attempt_id: str,
|
||||
scheduled_for: str,
|
||||
subject: str = "",
|
||||
body: str = "",
|
||||
user_id: str = "default",
|
||||
) -> dict:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
raise RuntimeError("Database session unavailable")
|
||||
try:
|
||||
sched = FollowUpSchedule(
|
||||
id=f"fu_{uuid4().hex[:16]}",
|
||||
attempt_id=attempt_id,
|
||||
subject=subject or None,
|
||||
body=body or None,
|
||||
scheduled_for=datetime.fromisoformat(scheduled_for) if isinstance(scheduled_for, str) else scheduled_for,
|
||||
sent=False,
|
||||
)
|
||||
db.add(sched)
|
||||
db.commit()
|
||||
return self._followup_to_dict(sched)
|
||||
finally:
|
||||
db.close()
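Review note on `scheduled_for`: `schedule_followup` passes the string straight to `datetime.fromisoformat`, which on Python versions before 3.11 rejects a trailing `Z`. A minimal sketch of a more tolerant parser follows. `parse_scheduled_for` is a hypothetical helper, not part of this change, and whether callers ever send a `Z` suffix is an assumption.

```python
from datetime import datetime, timezone
from typing import Union


def parse_scheduled_for(value: Union[str, datetime]) -> datetime:
    """Accept a datetime or an ISO-8601 string, tolerating a trailing 'Z'."""
    if isinstance(value, datetime):
        return value
    text = value.strip()
    # Older fromisoformat() versions do not understand the 'Z' UTC designator,
    # so rewrite it as an explicit offset before parsing.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


aware = parse_scheduled_for("2024-05-01T09:30:00Z")
naive = parse_scheduled_for("2024-05-01T09:30:00")
```

The first result is timezone-aware and the second is naive. The rest of this module stores naive `datetime.utcnow()` values, so a caller would probably want to settle on one convention before comparing the two.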
|
||||
|
||||
def list_followups(self, campaign_id: str, limit: int = 50, user_id: str = "default") -> List[dict]:
|
||||
"""List follow-ups by joining through attempts to filter by campaign."""
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return []
|
||||
try:
|
||||
rows = (
|
||||
db.query(FollowUpSchedule)
|
||||
.join(OutreachAttempt, FollowUpSchedule.attempt_id == OutreachAttempt.id)
|
||||
.filter(OutreachAttempt.campaign_id == campaign_id)
|
||||
.order_by(FollowUpSchedule.scheduled_for.asc())
|
||||
.limit(limit)
|
||||
.all()
|
||||
)
|
||||
return [self._followup_to_dict(r) for r in rows]
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def mark_followup_sent(self, schedule_id: str, user_id: str = "default") -> Optional[dict]:
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return None
|
||||
try:
|
||||
sched = db.query(FollowUpSchedule).filter(FollowUpSchedule.id == schedule_id).first()
|
||||
if not sched:
|
||||
return None
|
||||
sched.sent = True
|
||||
db.commit()
|
||||
return self._followup_to_dict(sched)
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
@staticmethod
|
||||
def _followup_to_dict(sched) -> dict:
|
||||
return {
|
||||
"schedule_id": sched.id,
|
||||
"attempt_id": sched.attempt_id,
|
||||
"subject": sched.subject or "",
|
||||
"scheduled_for": sched.scheduled_for.isoformat() if sched.scheduled_for else None,
|
||||
"sent": sched.sent,
|
||||
}
|
||||
|
||||
# -- Email Template CRUD --
|
||||
|
||||
def create_template(
|
||||
self,
|
||||
user_id: str,
|
||||
name: str,
|
||||
subject_template: str,
|
||||
body_template: str,
|
||||
variables: Optional[List[str]] = None,
|
||||
) -> dict:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
raise RuntimeError("Database session unavailable")
|
||||
try:
|
||||
tmpl = EmailTemplate(
|
||||
id=f"tpl_{uuid4().hex[:16]}",
|
||||
user_id=user_id,
|
||||
name=name,
|
||||
subject_template=subject_template,
|
||||
body_template=body_template,
|
||||
variables=",".join(variables) if variables else None,
|
||||
created_at=datetime.utcnow(),
|
||||
)
|
||||
db.add(tmpl)
|
||||
db.commit()
|
||||
return self._template_to_dict(tmpl)
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def list_templates(self, user_id: str, limit: int = 50) -> List[dict]:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return []
|
||||
try:
|
||||
rows = (
|
||||
db.query(EmailTemplate)
|
||||
.filter(EmailTemplate.user_id == user_id)
|
||||
.order_by(EmailTemplate.created_at.desc())
|
||||
.limit(limit)
|
||||
.all()
|
||||
)
|
||||
return [self._template_to_dict(r) for r in rows]
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def get_template(self, template_id: str, user_id: str) -> Optional[dict]:
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return None
|
||||
try:
|
||||
tmpl = (
|
||||
db.query(EmailTemplate)
|
||||
.filter(EmailTemplate.id == template_id, EmailTemplate.user_id == user_id)
|
||||
.first()
|
||||
)
|
||||
if not tmpl:
|
||||
return None
|
||||
return self._template_to_dict(tmpl)
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def delete_template(self, template_id: str, user_id: str) -> bool:
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return False
|
||||
try:
|
||||
tmpl = (
|
||||
db.query(EmailTemplate)
|
||||
.filter(EmailTemplate.id == template_id, EmailTemplate.user_id == user_id)
|
||||
.first()
|
||||
)
|
||||
if not tmpl:
|
||||
return False
|
||||
db.delete(tmpl)
|
||||
db.commit()
|
||||
return True
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
@staticmethod
|
||||
def _template_to_dict(tmpl) -> dict:
|
||||
return {
|
||||
"template_id": tmpl.id,
|
||||
"user_id": tmpl.user_id,
|
||||
"name": tmpl.name,
|
||||
"subject_template": tmpl.subject_template,
|
||||
"body_template": tmpl.body_template,
|
||||
"variables": tmpl.variables.split(",") if tmpl.variables else [],
|
||||
"created_at": tmpl.created_at.isoformat() if tmpl.created_at else None,
|
||||
}
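Review note on `variables`: `create_template` stores the list with `",".join(...)` and `_template_to_dict` reads it back with `split(",")`, so a variable name that itself contains a comma would come back as two entries. A minimal sketch of a JSON-based round trip is shown here as one possible alternative. `encode_variables` and `decode_variables` are hypothetical names, and the sketch assumes the column can hold arbitrary text.

```python
import json
from typing import List, Optional


def encode_variables(variables: Optional[List[str]]) -> Optional[str]:
    """Serialize the variable names; empty or missing lists are stored as NULL."""
    return json.dumps(variables) if variables else None


def decode_variables(stored: Optional[str]) -> List[str]:
    """Inverse of encode_variables; NULL or empty text decodes to an empty list."""
    return json.loads(stored) if stored else []


names = ["first_name", "site, blog"]
round_tripped = decode_variables(encode_variables(names))
```

Rows already written in the comma-joined format would not parse as JSON, so adopting this would need a migration or a fallback to `split(",")`.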
|
||||
|
||||
# -- Suppression List --
|
||||
|
||||
def add_suppressed(self, email: str, user_id: str = "default", domain: str = "", reason: str = "") -> dict:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
raise RuntimeError("Database session unavailable")
|
||||
try:
|
||||
entry = SuppressedRecipient(
|
||||
id=f"sup_{uuid4().hex[:16]}",
|
||||
email=email.lower(),
|
||||
domain=domain.lower() if domain else email.split("@")[-1].lower(),
|
||||
reason=reason,
|
||||
user_id=user_id,
|
||||
created_at=datetime.utcnow(),
|
||||
)
|
||||
db.add(entry)
|
||||
db.commit()
|
||||
return {"id": entry.id, "email": entry.email, "reason": entry.reason}
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def is_suppressed(self, email: str, domain: str = "", user_id: str = "default") -> bool:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return False
|
||||
try:
|
||||
email_lower = email.lower()
|
||||
domain_lower = domain.lower() if domain else email.split("@")[-1].lower()
|
||||
exists = (
|
||||
db.query(SuppressedRecipient.id)
|
||||
.filter(
|
||||
(SuppressedRecipient.email == email_lower) |
|
||||
(SuppressedRecipient.domain == domain_lower)
|
||||
)
|
||||
.first()
|
||||
)
|
||||
return exists is not None
|
||||
finally:
|
||||
db.close()
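Review note on normalization: `add_suppressed` and `is_suppressed` each lower-case the email and derive the domain inline. The sketch below pulls that rule into one place so both paths agree. `normalize_recipient` is a hypothetical helper, and the extra `strip()` is an assumption that is not in the code above.

```python
from typing import Tuple


def normalize_recipient(email: str, domain: str = "") -> Tuple[str, str]:
    """Return (email, domain) lower-cased; the domain defaults to the part after '@'."""
    email_lower = email.strip().lower()
    if domain:
        domain_lower = domain.strip().lower()
    else:
        # Same rule as the service code: take whatever follows the last '@'.
        domain_lower = email_lower.split("@")[-1]
    return email_lower, domain_lower


derived = normalize_recipient("Jane@Example.COM")
explicit = normalize_recipient("jane@example.com", "Mail.Example.com")
```

An address with no `@` yields the whole string as its domain, both here and in the original expression, so input validation would have to happen before this step.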
|
||||
|
||||
def list_suppressed(self, user_id: str = "default", limit: int = 100) -> List[dict]:
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return []
|
||||
try:
|
||||
rows = (
|
||||
db.query(SuppressedRecipient)
|
||||
.order_by(SuppressedRecipient.created_at.desc())
|
||||
.limit(limit)
|
||||
.all()
|
||||
)
|
||||
return [{"id": r.id, "email": r.email, "domain": r.domain, "reason": r.reason, "created_at": r.created_at.isoformat() if r.created_at else None} for r in rows]
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
# -- Idempotency --
|
||||
|
||||
def check_idempotency(self, idempotency_key: str, user_id: str = "default") -> bool:
|
||||
"""Returns True if key already exists (duplicate)."""
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return False
|
||||
try:
|
||||
exists = (
|
||||
db.query(SentIdempotencyKey.id)
|
||||
.filter(SentIdempotencyKey.idempotency_key == idempotency_key)
|
||||
.first()
|
||||
)
|
||||
return exists is not None
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def mark_idempotency(self, idempotency_key: str, user_id: str = "default") -> dict:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
raise RuntimeError("Database session unavailable")
|
||||
try:
|
||||
entry = SentIdempotencyKey(
|
||||
id=f"idm_{uuid4().hex[:16]}",
|
||||
idempotency_key=idempotency_key,
|
||||
user_id=user_id,
|
||||
created_at=datetime.utcnow(),
|
||||
)
|
||||
db.add(entry)
|
||||
db.commit()
|
||||
return {"idempotency_key": idempotency_key}
|
||||
finally:
|
||||
db.close()
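Review note on idempotency: `check_idempotency` and `mark_idempotency` are separate calls, so two workers could both see "not present" before either one writes. The sketch below shows an atomic claim using the standard-library `sqlite3` module. It assumes a unique constraint on the key column, which is not visible in this diff, and `claim_key` and the `sent_keys` table are hypothetical.

```python
import sqlite3


def claim_key(conn: sqlite3.Connection, idempotency_key: str) -> bool:
    """Try to record the key; return True only for the caller that inserted it."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO sent_keys (idempotency_key) VALUES (?)",
        (idempotency_key,),
    )
    conn.commit()
    # rowcount is 1 when the row was inserted and 0 when the insert was ignored.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sent_keys (idempotency_key TEXT PRIMARY KEY)")
first_claim = claim_key(conn, "campaign-1:lead-9")
second_claim = claim_key(conn, "campaign-1:lead-9")
```

With this shape the caller sends only when the claim returns True, which folds the check and the mark into one statement.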
|
||||
|
||||
# -- Send Counters --
|
||||
|
||||
def _today(self) -> date:
|
||||
return date.today()
|
||||
|
||||
def increment_user_send_counter(self, user_id: str) -> int:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return 0
|
||||
try:
|
||||
today = self._today()
|
||||
row_id = f"scu_{uuid4().hex[:16]}"
|
||||
db.execute(sql_text(
|
||||
"INSERT INTO backlink_send_counters_user (id, user_id, date, count) "
|
||||
"VALUES (:id, :uid, :dt, 1) "
|
||||
"ON CONFLICT (user_id, date) DO UPDATE SET count = count + 1"
|
||||
), {"id": row_id, "uid": user_id, "dt": today})
|
||||
db.commit()
|
||||
result = db.query(SendCounterUser.count).filter(
|
||||
SendCounterUser.user_id == user_id, SendCounterUser.date == today
|
||||
).first()
|
||||
return result[0] if result else 0
|
||||
finally:
|
||||
db.close()
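Review note on the counter upsert: inside `DO UPDATE SET`, some engines (PostgreSQL is the usual example) report a bare column name on the right-hand side as ambiguous, so qualifying it with the table name is the portable spelling. The sketch below demonstrates the pattern with the standard-library `sqlite3` module. The table and column names are hypothetical stand-ins, and it assumes a unique constraint on `(user_id, day)`.

```python
import sqlite3
from datetime import date


def increment_counter(conn: sqlite3.Connection, user_id: str, day: date) -> int:
    """Bump the per-user daily counter in one statement and return the new value."""
    conn.execute(
        "INSERT INTO send_counters (user_id, day, sent_count) VALUES (?, ?, 1) "
        "ON CONFLICT (user_id, day) "
        "DO UPDATE SET sent_count = send_counters.sent_count + 1",
        (user_id, day.isoformat()),
    )
    conn.commit()
    row = conn.execute(
        "SELECT sent_count FROM send_counters WHERE user_id = ? AND day = ?",
        (user_id, day.isoformat()),
    ).fetchone()
    return row[0] if row else 0


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE send_counters ("
    "user_id TEXT NOT NULL, day TEXT NOT NULL, sent_count INTEGER NOT NULL, "
    "UNIQUE (user_id, day))"
)
first = increment_counter(conn, "user_1", date(2024, 1, 1))
second = increment_counter(conn, "user_1", date(2024, 1, 2))
```

Each `(user_id, day)` pair gets its own row, so the counter effectively resets when the date changes without any cleanup job.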
|
||||
|
||||
def get_user_send_count(self, user_id: str) -> int:
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return 0
|
||||
try:
|
||||
today = self._today()
|
||||
row = (
|
||||
db.query(SendCounterUser.count)
|
||||
.filter(SendCounterUser.user_id == user_id, SendCounterUser.date == today)
|
||||
.first()
|
||||
)
|
||||
return row[0] if row else 0
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def increment_domain_send_counter(self, domain: str, user_id: str = "default") -> int:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return 0
|
||||
try:
|
||||
today = self._today()
|
||||
domain_lower = domain.lower()
|
||||
row_id = f"scd_{uuid4().hex[:16]}"
|
||||
db.execute(sql_text(
|
||||
"INSERT INTO backlink_send_counters_domain (id, domain, date, count) "
|
||||
"VALUES (:id, :dom, :dt, 1) "
|
||||
"ON CONFLICT (domain, date) DO UPDATE SET count = count + 1"
|
||||
), {"id": row_id, "dom": domain_lower, "dt": today})
|
||||
db.commit()
|
||||
result = db.query(SendCounterDomain.count).filter(
|
||||
SendCounterDomain.domain == domain_lower, SendCounterDomain.date == today
|
||||
).first()
|
||||
return result[0] if result else 0
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def get_domain_send_count(self, domain: str, user_id: str = "default") -> int:
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return 0
|
||||
try:
|
||||
today = self._today()
|
||||
row = (
|
||||
db.query(SendCounterDomain.count)
|
||||
.filter(SendCounterDomain.domain == domain.lower(), SendCounterDomain.date == today)
|
||||
.first()
|
||||
)
|
||||
return row[0] if row else 0
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
# -- Audit Log --
|
||||
|
||||
def add_audit_log(
|
||||
self,
|
||||
event: str,
|
||||
user_id: str,
|
||||
campaign_id: str = "",
|
||||
recipient: str = "",
|
||||
allowed: bool = False,
|
||||
reasons: Optional[List[str]] = None,
|
||||
override: bool = False,
|
||||
) -> dict:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
raise RuntimeError("Database session unavailable")
|
||||
try:
|
||||
entry = AuditLogEntry(
|
||||
id=f"aud_{uuid4().hex[:16]}",
|
||||
user_id=user_id,
|
||||
campaign_id=campaign_id or None,
|
||||
event=event,
|
||||
recipient=recipient or None,
|
||||
allowed=allowed,
|
||||
reasons=";".join(reasons) if reasons else None,
|
||||
override=override,
|
||||
created_at=datetime.utcnow(),
|
||||
)
|
||||
db.add(entry)
|
||||
db.commit()
|
||||
return {"id": entry.id, "event": entry.event, "allowed": entry.allowed}
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def list_audit_logs(self, campaign_id: Optional[str] = None, limit: int = 100, user_id: str = "default") -> List[dict]:
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return []
|
||||
try:
|
||||
q = db.query(AuditLogEntry)
|
||||
if campaign_id:
|
||||
q = q.filter(AuditLogEntry.campaign_id == campaign_id)
|
||||
rows = q.order_by(AuditLogEntry.created_at.desc()).limit(limit).all()
|
||||
return [
|
||||
{
|
||||
"id": r.id,
|
||||
"event": r.event,
|
||||
"recipient": r.recipient,
|
||||
"allowed": r.allowed,
|
||||
"reasons": r.reasons.split(";") if r.reasons else [],
|
||||
"override": r.override,
|
||||
"created_at": r.created_at.isoformat() if r.created_at else None,
|
||||
}
|
||||
for r in rows
|
||||
]
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
# -- Analytics --
|
||||
|
||||
def get_send_volume_by_day(self, campaign_id: str, days: int = 30, user_id: str = "default") -> List[dict]:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return []
|
||||
try:
|
||||
from datetime import timedelta
|
||||
cutoff = datetime.utcnow() - timedelta(days=days)
|
||||
rows = (
|
||||
db.query(sa_func.date(OutreachAttempt.sent_at).label("date"), sa_func.count(OutreachAttempt.id).label("count"))
|
||||
.filter(OutreachAttempt.campaign_id == campaign_id, OutreachAttempt.status == "sent", OutreachAttempt.sent_at >= cutoff)
|
||||
.group_by(sa_func.date(OutreachAttempt.sent_at))
|
||||
.order_by(sa_func.date(OutreachAttempt.sent_at).asc())
|
||||
.all()
|
||||
)
|
||||
return [{"date": str(r.date), "count": r.count} for r in rows]
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def get_lead_status_counts(self, campaign_id: str, user_id: str = "default") -> List[dict]:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return []
|
||||
try:
|
||||
rows = (
|
||||
db.query(BacklinkLead.status, sa_func.count(BacklinkLead.id).label("count"))
|
||||
.filter(BacklinkLead.campaign_id == campaign_id)
|
||||
.group_by(BacklinkLead.status)
|
||||
.order_by(BacklinkLead.status.asc())
|
||||
.all()
|
||||
)
|
||||
return [{"status": r.status, "count": r.count} for r in rows]
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def list_attempts_all(self, campaign_id: str, user_id: str = "default") -> List[dict]:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return []
|
||||
try:
|
||||
rows = (
|
||||
db.query(OutreachAttempt)
|
||||
.filter(OutreachAttempt.campaign_id == campaign_id)
|
||||
.order_by(OutreachAttempt.created_at.desc())
|
||||
.all()
|
||||
)
|
||||
return [self._attempt_to_dict(r) for r in rows]
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def list_replies_all(self, campaign_id: str, user_id: str = "default") -> List[dict]:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return []
|
||||
try:
|
||||
rows = (
|
||||
db.query(OutreachReply)
|
||||
.join(OutreachAttempt, OutreachReply.attempt_id == OutreachAttempt.id)
|
||||
.filter(OutreachAttempt.campaign_id == campaign_id)
|
||||
.order_by(OutreachReply.received_at.desc())
|
||||
.all()
|
||||
)
|
||||
return [self._reply_to_dict(r) for r in rows]
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def count_replies(self, campaign_id: str, user_id: str = "default") -> int:
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return 0
|
||||
try:
|
||||
return (
|
||||
db.query(OutreachReply.id)
|
||||
.join(OutreachAttempt, OutreachReply.attempt_id == OutreachAttempt.id)
|
||||
.filter(OutreachAttempt.campaign_id == campaign_id)
|
||||
.count()
|
||||
)
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
def list_leads_all(self, campaign_id: str, user_id: str = "default") -> List[dict]:
|
||||
self._ensure_tables(user_id)
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return []
|
||||
try:
|
||||
rows = (
|
||||
db.query(BacklinkLead)
|
||||
.filter(BacklinkLead.campaign_id == campaign_id)
|
||||
.order_by(BacklinkLead.created_at.desc())
|
||||
.all()
|
||||
)
|
||||
return [self._lead_to_dict(r) for r in rows]
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
# -- Policy Helpers (composite checks) --
|
||||
|
||||
def get_lead(self, lead_id: str, user_id: str = "default") -> Optional[dict]:
|
||||
db = get_session_for_user(user_id)
|
||||
if not db:
|
||||
return None
|
||||
try:
|
||||
lead = db.query(BacklinkLead).filter(BacklinkLead.id == lead_id).first()
|
||||
if not lead:
|
||||
return None
|
||||
return self._lead_to_dict(lead)
|
||||
finally:
|
||||
db.close()
|
||||
@@ -1,307 +0,0 @@
|
||||
"""AI-powered outreach email template generation."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import re
|
||||
from typing import List, Optional
|
||||
from loguru import logger
|
||||
|
||||
from services.llm_providers.main_text_generation import llm_text_gen
|
||||
|
||||
|
||||
SYSTEM_PROMPT = """You are an expert outreach copywriter specializing in guest post and backlink pitch emails.
|
||||
Write concise, personalized outreach emails that get high response rates.
|
||||
Follow these rules:
|
||||
- Be specific about why you're reaching out (mention their content)
|
||||
- Keep it under 200 words
|
||||
- Include a clear call to action
|
||||
- Sound human, not templated
|
||||
- Never use spammy phrases
|
||||
- Output ONLY valid JSON with "subject" and "body" keys"""
|
||||
|
||||
SUBJECT_LINES_PROMPT = """You are an expert email subject line writer.
|
||||
Given an outreach email body, generate subject lines that are:
|
||||
- Intriguing but not clickbait
|
||||
- Personalized when possible
|
||||
- Under 60 characters
|
||||
- Varied in style (question, curiosity, value-prop)
|
||||
Output ONLY valid JSON with a "subjects" key containing an array of strings."""
|
||||
|
||||
FOLLOW_UP_PROMPT = """You are an expert outreach copywriter.
|
||||
Write a polite follow-up email for a guest post pitch that hasn't received a response.
|
||||
Rules:
|
||||
- Reference the original email without repeating it verbatim
|
||||
- Keep it shorter than the original (under 100 words)
|
||||
- Add a new angle or piece of value
|
||||
- Include a clear call to action
|
||||
- Sound human and respectful, never pushy
|
||||
- Output ONLY valid JSON with "subject" and "body" keys"""
|
||||
|
||||
PERSONALIZATION_PROMPT = """You are an expert outreach personalization specialist.
|
||||
Given a lead's information and a draft outreach email, personalize it for that specific lead.
|
||||
Rules:
|
||||
- Mention their specific content or website
|
||||
- Reference something relevant from their site
|
||||
- Keep the core pitch but make it feel custom-written
|
||||
- Under 200 words
|
||||
- Output ONLY valid JSON with "subject" and "body" keys"""
|
||||
|
||||
|
||||
def generate_outreach_email(
|
||||
topic: str,
|
||||
target_site: Optional[str] = None,
|
||||
tone: str = "professional",
|
||||
user_id: str = "default",
|
||||
existing_body: Optional[str] = None,
|
||||
) -> dict:
|
||||
"""Generate an outreach email using the LLM.
|
||||
|
||||
Args:
|
||||
topic: The topic/keyword to pitch.
|
||||
target_site: Optional target website name/URL.
|
||||
tone: professional, friendly, casual, or formal.
|
||||
user_id: Clerk user ID for subscription check.
|
||||
existing_body: If provided, rewrite/improve this existing template.
|
||||
|
||||
Returns:
|
||||
dict with "subject" and "body" keys.
|
||||
"""
|
||||
if existing_body:
|
||||
prompt = (
|
||||
f"Rewrite and improve the following outreach email for a {tone} tone. "
|
||||
f"Topic: {topic}. "
|
||||
f"{f'Target website: {target_site}. ' if target_site else ''}"
|
||||
f"Keep the core message but make it more effective. "
|
||||
f"Original email:\n\n{existing_body}\n\n"
|
||||
f"Return ONLY valid JSON with 'subject' and 'body' keys."
|
||||
)
|
||||
else:
|
||||
prompt = (
|
||||
f"Write a {tone} outreach email for a guest post opportunity about: {topic}. "
|
||||
f"{f'We are pitching this to: {target_site}. ' if target_site else ''}"
|
||||
f"Mention specific value the guest post would bring to their audience. "
|
||||
f"Return ONLY valid JSON with 'subject' and 'body' keys."
|
||||
)
|
||||
|
||||
try:
|
||||
raw = llm_text_gen(
|
||||
prompt=prompt,
|
||||
system_prompt=SYSTEM_PROMPT,
|
||||
user_id=user_id,
|
||||
temperature=0.7,
|
||||
)
|
||||
|
||||
result = _parse_json_response(raw)
|
||||
if result:
|
||||
return result
|
||||
|
||||
return _fallback_extract(raw, topic)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to generate outreach email: {e}")
|
||||
return {
|
||||
"subject": f"Guest post opportunity: {topic}",
|
||||
"body": f"Hi there,\n\nI came across your site and I'd love to contribute a guest post about {topic}. "
|
||||
f"Please let me know if you're open to submissions.\n\nBest regards",
|
||||
}
|
||||
|
||||
|
||||
def generate_personalized_email(
|
||||
lead_name: str,
|
||||
lead_site: str,
|
||||
lead_content_topic: str,
|
||||
pitch_topic: str,
|
||||
existing_body: str = "",
|
||||
user_id: str = "default",
|
||||
) -> dict:
|
||||
"""Personalize an outreach email for a specific lead.
|
||||
|
||||
Args:
|
||||
lead_name: Contact name or site owner name.
|
||||
lead_site: The lead's website URL.
|
||||
lead_content_topic: Topic of relevant content on their site.
|
||||
pitch_topic: The topic we want to pitch.
|
||||
existing_body: Optional draft to personalize further.
|
||||
user_id: Clerk user ID for subscription check.
|
||||
|
||||
Returns:
|
||||
dict with "subject" and "body" keys.
|
||||
"""
|
||||
if existing_body:
|
||||
prompt = (
|
||||
f"Personalize this outreach email for {lead_name} from {lead_site}. "
|
||||
f"They have content about '{lead_content_topic}'. "
|
||||
f"We want to pitch: {pitch_topic}. "
|
||||
f"Mention something specific about their content on {lead_content_topic} "
|
||||
f"to show we've done our research. "
|
||||
f"Draft email to personalize:\n\n{existing_body}\n\n"
|
||||
f"Return ONLY valid JSON with 'subject' and 'body' keys."
|
||||
)
|
||||
else:
|
||||
prompt = (
|
||||
f"Write a personalized outreach email to {lead_name} at {lead_site}. "
|
||||
f"They have published content about '{lead_content_topic}'. "
|
||||
f"We want to pitch a guest post about: {pitch_topic}. "
|
||||
f"Reference their article on {lead_content_topic} and explain how our pitch "
|
||||
f"would provide value to their audience. "
|
||||
f"Return ONLY valid JSON with 'subject' and 'body' keys."
|
||||
)
|
||||
|
||||
try:
|
||||
raw = llm_text_gen(
|
||||
prompt=prompt,
|
||||
system_prompt=PERSONALIZATION_PROMPT,
|
||||
user_id=user_id,
|
||||
temperature=0.7,
|
||||
)
|
||||
result = _parse_json_response(raw)
|
||||
if result:
|
||||
return result
|
||||
return _fallback_extract(raw, pitch_topic)
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to personalize email: {e}")
|
||||
return {"subject": f"Question about your content on {lead_content_topic}", "body": existing_body or f"Hi {lead_name},\n\nI enjoyed your article about {lead_content_topic}..."}
|
||||
|
||||
|
||||
def generate_subject_lines(
|
||||
body: str,
|
||||
count: int = 5,
|
||||
user_id: str = "default",
|
||||
) -> List[str]:
|
||||
"""Generate subject line suggestions for an email body.
|
||||
|
||||
Args:
|
||||
body: The email body to generate subject lines for.
|
||||
count: Number of subject lines to generate.
|
||||
user_id: Clerk user ID for subscription check.
|
||||
|
||||
Returns:
|
||||
List of subject line strings.
|
||||
"""
|
||||
prompt = (
|
||||
f"Generate {count} subject lines for the following outreach email. "
|
||||
f"Make them varied in style and optimized for open rates.\n\n"
|
||||
f"Email body:\n{body}\n\n"
|
||||
f"Return ONLY valid JSON with a 'subjects' key containing an array of strings."
|
||||
)
|
||||
|
||||
try:
|
||||
raw = llm_text_gen(
|
||||
prompt=prompt,
|
||||
system_prompt=SUBJECT_LINES_PROMPT,
|
||||
user_id=user_id,
|
||||
temperature=0.8,
|
||||
)
|
||||
if raw:
|
||||
text = raw.strip()
|
||||
if text.startswith("```"):
|
||||
text = re.sub(r"^```(?:json)?\s*", "", text)
|
||||
text = re.sub(r"\s*```$", "", text)
|
||||
try:
|
||||
data = json.loads(text)
|
||||
if isinstance(data, dict) and "subjects" in data and isinstance(data["subjects"], list):
|
||||
return [str(s).strip() for s in data["subjects"][:count]]
|
||||
except json.JSONDecodeError:
|
||||
pass
|
||||
lines = [l.strip("- ").strip() for l in raw.strip().split("\n") if l.strip() and not l.strip().startswith("```")]
|
||||
return [l for l in lines if len(l) > 10][:count]
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to generate subject lines: {e}")
|
||||
return ["Guest post opportunity", "Question about your content", "Collaboration idea"]
|
||||
|
||||
|
||||
def generate_follow_up(
|
||||
original_subject: str,
|
||||
original_body: str,
|
||||
days_elapsed: int = 7,
|
||||
reply_context: str = "",
|
||||
user_id: str = "default",
|
||||
) -> dict:
|
||||
"""Generate a follow-up email for an outreach that hasn't received a response.
|
||||
|
||||
Args:
|
||||
original_subject: Subject line of the original email.
|
||||
original_body: Body of the original email.
|
||||
days_elapsed: Number of days since the original was sent.
|
||||
reply_context: If the recipient replied, context of their reply.
|
||||
user_id: Clerk user ID for subscription check.
|
||||
|
||||
Returns:
|
||||
dict with "subject" and "body" keys.
|
||||
"""
|
||||
if reply_context:
|
||||
prompt = (
|
||||
f"The recipient replied with: '{reply_context}'. "
|
||||
f"Write a follow-up email that addresses their response and keeps the conversation moving. "
|
||||
f"Original subject: {original_subject}.\n\n"
|
||||
f"Original email:\n{original_body}\n\n"
|
||||
f"Return ONLY valid JSON with 'subject' and 'body' keys."
|
||||
)
|
||||
else:
|
||||
prompt = (
|
||||
f"Write a polite follow-up email. {days_elapsed} days have passed since the original email. "
|
||||
f"Do not apologize for following up. Add a new piece of value or angle. "
|
||||
f"Original subject: {original_subject}.\n\n"
|
||||
f"Original email:\n{original_body}\n\n"
|
||||
f"Return ONLY valid JSON with 'subject' and 'body' keys."
|
||||
)
|
||||
|
||||
try:
|
||||
raw = llm_text_gen(
|
||||
prompt=prompt,
|
||||
system_prompt=FOLLOW_UP_PROMPT,
|
||||
user_id=user_id,
|
||||
temperature=0.7,
|
||||
)
|
||||
result = _parse_json_response(raw)
|
||||
if result:
|
||||
return result
|
||||
return _fallback_extract(raw, original_subject)
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to generate follow-up: {e}")
|
||||
return {
|
||||
"subject": f"Re: {original_subject}",
|
||||
"body": f"Hi there,\n\nI wanted to follow up on my previous email. "
|
||||
f"I'd love to hear your thoughts when you have a moment.\n\nBest regards",
|
||||
}
|
||||
|
||||
|
||||
def _parse_json_response(raw: str) -> Optional[dict]:
|
||||
"""Try to parse JSON from LLM response, handling markdown fences."""
|
||||
if not raw:
|
||||
return None
|
||||
|
||||
text = raw.strip()
|
||||
|
||||
if text.startswith("```"):
|
||||
text = re.sub(r"^```(?:json)?\s*", "", text)
|
||||
text = re.sub(r"\s*```$", "", text)
|
||||
|
||||
try:
|
||||
data = json.loads(text)
|
||||
if isinstance(data, dict) and "subject" in data and "body" in data:
|
||||
return {"subject": str(data["subject"]).strip(), "body": str(data["body"]).strip()}
|
||||
except json.JSONDecodeError:
|
||||
pass
|
||||
|
||||
return None
|
||||
|
||||
|
||||
def _fallback_extract(raw: Optional[str], topic: str) -> dict:
    """Fallback: try to extract subject line and body from unstructured text."""
    raw = raw or ""  # the LLM call may return None or an empty string
    lines = [l.strip() for l in raw.split("\n") if l.strip()]
    subject = topic
    body_lines = []

    for line in lines:
        lower = line.lower()
        if lower.startswith("subject:"):
            subject = line.split(":", 1)[1].strip()
        elif lower.startswith("body:"):
            # Keep any text that follows the "Body:" label on the same line.
            rest = line.split(":", 1)[1].strip()
            if rest:
                body_lines.append(rest)
        else:
            body_lines.append(line)

    body = "\n".join(body_lines) if body_lines else raw
    return {"subject": subject, "body": body}
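The parsing path used by the email helpers above (strip a Markdown fence, try `json.loads`, give up cleanly otherwise) can be exercised on its own. The following is a minimal sketch, not the module's code: `parse_email_json` and `FENCE` are hypothetical names introduced here for illustration, and the sample inputs are made up.

```python
import json
import re
from typing import Optional

FENCE = "`" * 3  # a Markdown code fence marker


def parse_email_json(raw: Optional[str]) -> Optional[dict]:
    """Return {'subject', 'body'} from an LLM reply, or None if it is not usable JSON."""
    if not raw:
        return None
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop an opening fence (optionally tagged "json") and a trailing fence.
        text = re.sub(r"^" + FENCE + r"(?:json)?\s*", "", text)
        text = re.sub(r"\s*" + FENCE + r"$", "", text)
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    if isinstance(data, dict) and "subject" in data and "body" in data:
        # str() guards against non-string values such as numbers or null.
        return {"subject": str(data["subject"]).strip(), "body": str(data["body"]).strip()}
    return None


fenced_reply = FENCE + 'json\n{"subject": " Hello ", "body": "Hi there"}\n' + FENCE
print(parse_email_json(fenced_reply))
print(parse_email_json("Subject: Hello\nHi there"))
```

A `None` result is the signal to fall through to a line-based extractor such as `_fallback_extract`.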
|
||||
@@ -122,6 +122,9 @@ class MediumBlogGenerator:
|
||||
payload = {
|
||||
"title": req.title,
|
||||
"globalTargetWords": req.globalTargetWords or 1000,
|
||||
"persona": req.persona.dict() if req.persona else None,
|
||||
"tone": req.tone,
|
||||
"audience": req.audience,
|
||||
"sections": [section_block(s) for s in req.sections],
|
||||
}
|
||||
|
||||
@@ -133,6 +136,7 @@ class MediumBlogGenerator:
|
||||
- Industry: {req.persona.industry or 'General'}
|
||||
- Tone: {req.persona.tone or 'Professional'}
|
||||
- Audience: {req.persona.audience or 'General readers'}
|
||||
- Persona ID: {req.persona.persona_id or 'Default'}
|
||||
|
||||
Write content that reflects this persona's expertise and communication style.
|
||||
Use industry-specific terminology and examples where appropriate.
|
||||
@@ -150,19 +154,40 @@ class MediumBlogGenerator:
|
||||
"Return ONLY valid JSON with no markdown formatting or explanations."
|
||||
)
|
||||
|
||||
# Build persona-specific content instructions
|
||||
persona_instructions = ""
|
||||
if req.persona:
|
||||
industry = req.persona.industry or 'General'
|
||||
tone = req.persona.tone or 'Professional'
|
||||
audience = req.persona.audience or 'General readers'
|
||||
|
||||
persona_instructions = f"""
|
||||
PERSONA-DRIVEN CONTENT REQUIREMENTS:
|
||||
- Write as an expert in {industry} industry
|
||||
- Use {tone} tone appropriate for {audience}
|
||||
- Include industry-specific examples and terminology
|
||||
- Demonstrate authority and expertise in the field
|
||||
- Use language that resonates with {audience}
|
||||
- Maintain consistent voice that reflects this persona's expertise
|
||||
"""
|
||||
|
||||
prompt = (
|
||||
f"Write blog content for the following sections. Total target: {req.globalTargetWords or 1000} words, distributed across all sections.\n\n"
|
||||
f"Write blog content for the following sections. Each section should be {req.globalTargetWords or 1000} words total, distributed across all sections.\n\n"
|
||||
f"Blog Title: {req.title}\n\n"
|
||||
"For each section, write engaging content that:\n"
|
||||
"- Follows the key points provided\n"
|
||||
"- Uses the suggested keywords naturally\n"
|
||||
"- Meets the target word count\n"
|
||||
"- Maintains professional tone\n"
|
||||
"- References the provided sources when relevant\n"
|
||||
"- Breaks content into clear paragraphs (2-4 sentences each)\n"
|
||||
"- Uses double line breaks (\\n\\n) between paragraphs\n"
|
||||
"- Uses double line breaks (\\n\\n) between paragraphs for proper formatting\n"
|
||||
"- Starts with an engaging opening paragraph\n"
|
||||
"- Ends with a strong concluding paragraph\n\n"
|
||||
"Return a JSON object with 'title' and 'sections' array. Each section must have 'id', 'heading', 'content', 'wordCount', and 'sources'.\n\n"
|
||||
f"Sections:\n{json.dumps(payload, ensure_ascii=False, indent=2)}"
|
||||
"- Ends with a strong concluding paragraph\n"
|
||||
f"{persona_instructions}\n"
|
||||
"IMPORTANT: Format the 'content' field with proper paragraph breaks using \\n\\n between paragraphs.\n\n"
|
||||
"Return a JSON object with 'title' and 'sections' array. Each section should have 'id', 'heading', 'content', and 'wordCount'.\n\n"
|
||||
f"Sections to write:\n{json.dumps(payload, ensure_ascii=False, indent=2)}"
|
||||
)
|
||||
|
||||
try:
|
||||
@@ -170,9 +195,7 @@ class MediumBlogGenerator:
|
||||
prompt=prompt,
|
||||
json_struct=schema,
|
||||
system_prompt=system,
|
||||
user_id=user_id,
|
||||
max_tokens=None,
|
||||
temperature=0.3,
|
||||
user_id=user_id
|
||||
)
|
||||
except HTTPException:
|
||||
# Re-raise HTTPExceptions (e.g., 429 subscription limit) to preserve error details
|
||||
@@ -246,18 +269,16 @@ class MediumBlogGenerator:
|
||||
db=db,
|
||||
user_id=user_id,
|
||||
content=full_content,
|
||||
source_module="blog_writer",
|
||||
source_module="medium_blog_writer",
|
||||
title=result.title,
|
||||
description=f"Blog: {result.title}",
|
||||
tags=req.researchKeywords or ["blog", "ai_generated"],
|
||||
description=f"Generated medium blog: {result.title}",
|
||||
tags=req.researchKeywords or ["medium_blog", "ai_generated"],
|
||||
asset_metadata={
|
||||
"blog_type": "medium",
|
||||
"model": result.model,
|
||||
"generation_time_ms": result.generation_time_ms,
|
||||
"word_count": sum(s.wordCount for s in result.sections),
|
||||
"section_count": len(result.sections),
|
||||
"word_count": sum(s.wordCount for s in result.sections)
|
||||
},
|
||||
subdirectory="blogs"
|
||||
subdirectory="medium_blogs"
|
||||
)
|
||||
logger.info(f"Saved medium blog content to user workspace for user {user_id}")
|
||||
except Exception as e:
|
||||
|
||||
@@ -6,11 +6,8 @@ Neural search implementation using Exa API for high-quality, citation-rich resea
|
||||
|
||||
from exa_py import Exa
|
||||
import os
|
||||
import asyncio
|
||||
from typing import List, Dict, Any
|
||||
from loguru import logger
|
||||
from models.subscription_models import APIProvider
|
||||
from fastapi import HTTPException
|
||||
from .base_provider import ResearchProvider as BaseProvider
|
||||
|
||||
|
||||
@@ -219,123 +216,6 @@ class ExaResearchProvider(BaseProvider):
|
||||
"""Estimate token usage for Exa (not token-based)."""
|
||||
return 0 # Exa is per-search, not token-based
|
||||
|
||||
async def simple_search(
|
||||
self,
|
||||
query: str,
|
||||
num_results: int = 5,
|
||||
user_id: str = None,
|
||||
include_domains: List[str] = None,
|
||||
exclude_domains: List[str] = None,
|
||||
) -> List[Dict[str, Any]]:
|
||||
"""
|
||||
Simple Exa search for fact-checking and writing assistance.
|
||||
Handles subscription preflight check and usage tracking.
|
||||
|
||||
Args:
|
||||
query: Search query string
|
||||
num_results: Number of results to return (default 5)
|
||||
user_id: Optional user ID for subscription checking
|
||||
include_domains: Only return results from these domains (for internal links)
|
||||
exclude_domains: Exclude results from these domains (for external-only links)
|
||||
|
||||
Returns:
|
||||
List of source dicts with title, url, text, publishedDate, author, score keys
|
||||
|
||||
Raises:
|
||||
HTTPException(429): If user has exceeded subscription limits
|
||||
Exception: If Exa API key not configured or search fails
|
||||
"""
|
||||
if not self.api_key:
|
||||
raise Exception("EXA_API_KEY not configured")
|
||||
|
||||
# Preflight subscription check
|
||||
if user_id:
|
||||
from services.subscription import PricingService
|
||||
from services.database import get_session_for_user
|
||||
db = get_session_for_user(user_id)
|
||||
if db:
|
||||
try:
|
||||
pricing_service = PricingService(db)
|
||||
can_proceed, message, usage_info = pricing_service.check_usage_limits(
|
||||
user_id=user_id,
|
||||
provider=APIProvider.EXA,
|
||||
tokens_requested=0,
|
||||
actual_provider_name="exa",
|
||||
)
|
||||
if not can_proceed:
|
||||
raise HTTPException(status_code=429, detail={
|
||||
'error': 'insufficient_balance',
|
||||
'message': message,
|
||||
'provider': 'exa',
|
||||
'usage_info': usage_info or {}
|
||||
})
|
||||
except HTTPException:
|
||||
raise
|
||||
except Exception as e:
|
||||
logger.warning(f"[Exa simple_search] Preflight check failed: {e}")
|
||||
finally:
|
||||
try:
|
||||
db.close()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
search_kwargs = {
|
||||
"type": "auto",
|
||||
"num_results": num_results,
|
||||
"text": {"max_characters": 1000},
|
||||
"highlights": {"num_sentences": 2, "highlights_per_url": 2},
|
||||
}
|
||||
if include_domains:
|
||||
search_kwargs["include_domains"] = include_domains
|
||||
if exclude_domains:
|
||||
search_kwargs["exclude_domains"] = exclude_domains
|
||||
|
||||
try:
|
||||
loop = asyncio.get_running_loop()
|
||||
results = await loop.run_in_executor(
|
||||
None,
|
||||
lambda: self.exa.search_and_contents(query, **search_kwargs),
|
||||
)
|
||||
except Exception as e:
|
||||
logger.error(f"[Exa simple_search] API call failed: {e}")
|
||||
# Retry with simpler parameters
|
||||
retry_kwargs = {"type": "auto", "num_results": num_results, "text": True}
|
||||
if include_domains:
|
||||
retry_kwargs["include_domains"] = include_domains
|
||||
if exclude_domains:
|
||||
retry_kwargs["exclude_domains"] = exclude_domains
|
||||
try:
|
||||
logger.info("[Exa simple_search] Retrying with simplified parameters")
|
||||
results = await loop.run_in_executor(
|
||||
None,
|
||||
lambda: self.exa.search_and_contents(query, **retry_kwargs),
|
||||
)
|
||||
except Exception as retry_error:
|
||||
logger.error(f"[Exa simple_search] Retry also failed: {retry_error}")
|
||||
raise RuntimeError(f"Exa search failed: {str(retry_error)}") from retry_error
|
||||
|
||||
sources = []
|
||||
for result in results.results:
|
||||
sources.append({
|
||||
'title': getattr(result, 'title', 'Untitled'),
|
||||
'url': getattr(result, 'url', ''),
|
||||
'text': getattr(result, 'text', ''),
|
||||
'publishedDate': getattr(result, 'publishedDate', ''),
|
||||
'author': getattr(result, 'author', ''),
|
||||
'score': (lambda v: v if v is not None else 0.5)(getattr(result, 'score', 0.5)),
|
||||
})
|
||||
|
||||
# Track usage
|
||||
if user_id:
|
||||
cost = 0.005 # ~0.5 cents per search
|
||||
try:
|
||||
self.track_exa_usage(user_id, cost)
|
||||
except Exception as e:
|
||||
logger.warning(f"[Exa simple_search] Failed to track usage: {e}")
|
||||
|
||||
logger.info(f"[Exa simple_search] Found {len(sources)} sources for query: {query[:80]}...")
|
||||
return sources
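The retry logic in `simple_search` (run the blocking client call in the default executor, then retry once with simpler parameters) can be reduced to a small reusable pattern. This is a sketch under assumptions: `search_with_fallback` and `fake_search` are hypothetical names, and `fake_search` only stands in for a real client call so the example runs without network access or an API key.

```python
import asyncio
from typing import Any, Callable, Dict


async def search_with_fallback(
    blocking_search: Callable[..., Any],
    query: str,
    rich_kwargs: Dict[str, Any],
    simple_kwargs: Dict[str, Any],
) -> Any:
    """Run a blocking search in the default executor; retry once with simpler kwargs."""
    loop = asyncio.get_running_loop()
    try:
        return await loop.run_in_executor(None, lambda: blocking_search(query, **rich_kwargs))
    except Exception:
        # Second and final attempt; a failure here propagates to the caller.
        return await loop.run_in_executor(None, lambda: blocking_search(query, **simple_kwargs))


def fake_search(query: str, **kwargs: Any) -> Dict[str, Any]:
    """Stand-in for a real client call: rejects the 'highlights' option."""
    if "highlights" in kwargs:
        raise ValueError("unsupported option")
    return {"query": query, "options": sorted(kwargs)}


result = asyncio.run(
    search_with_fallback(
        fake_search,
        "vite migration",
        {"num_results": 5, "highlights": {"num_sentences": 2}},
        {"num_results": 5, "text": True},
    )
)
print(result)
```

Fetching the event loop before the first `try` keeps it available to the retry branch regardless of where the first attempt fails.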
|
||||
|
||||
def _map_source_type_to_category(self, source_types):
|
||||
"""Map SourceType enum to Exa category parameter."""
|
||||
if not source_types:
|
||||
|
||||
@@ -1,951 +0,0 @@
|
||||
"""
|
||||
Chart Service — Shared chart generation for Blog Writer, Podcast Maker, and future modules.
|
||||
|
||||
Extracts the chart rendering logic from podcast/broll_composer into a reusable service
|
||||
that any module can call. Supports:
|
||||
- Direct chart rendering (caller provides chart_type + chart_data)
|
||||
- AI-driven chart inference (caller provides text, LLM infers chart_type + chart_data)
|
||||
|
||||
Chart types: bar_comparison, bar_horizontal, line_trend, pie, stacked_bar, bullet_points
|
||||
"""
|
||||
|
||||
import uuid
|
||||
import os
|
||||
from pathlib import Path
|
||||
from typing import Dict, Any, Optional, List
|
||||
from dataclasses import dataclass, field
|
||||
from loguru import logger
|
||||
|
||||
import numpy as np
|
||||
import matplotlib
|
||||
matplotlib.use("Agg")
|
||||
import matplotlib.pyplot as plt
|
||||
from PIL import Image, ImageDraw, ImageFont
|
||||
|
||||
from services.llm_providers.main_text_generation import llm_text_gen
|
||||
|
||||
|
||||
CHART_STYLE = {
|
||||
"bg": "#0D0D0D",
|
||||
"bar_before": "#2E4057",
|
||||
"bar_after": "#E63946",
|
||||
"text": "#F1F1EF",
|
||||
"grid": "#2A2A2A",
|
||||
"accent": "#E63946",
|
||||
"pie_colors": ["#E63946", "#2E4057", "#457B9D", "#A8DADC", "#F4A261", "#2A9D8F"],
|
||||
}
|
||||
|
||||
VALID_CHART_TYPES = [
|
||||
"bar_comparison", "bar_chart_comparison",
|
||||
"bar_horizontal", "line_trend",
|
||||
"pie", "stacked_bar",
|
||||
"bullet", "bullet_points",
|
||||
]
|
||||
|
||||
CHART_INFERENCE_SYSTEM_PROMPT = """You are a data visualization expert. Given text content, determine the most appropriate chart type and extract structured data for rendering.
|
||||
|
||||
You MUST respond with ONLY a valid JSON object (no markdown, no explanation) with this exact structure:
|
||||
{
|
||||
"chart_type": "one of: bar_comparison, bar_horizontal, line_trend, pie, stacked_bar, bullet_points",
|
||||
"chart_data": { ... appropriate data structure for the chart type ... },
|
||||
"title": "A clear, concise chart title"
|
||||
}
|
||||
|
||||
Chart data structures by type:
|
||||
- bar_comparison: {"labels": [...], "before": [...], "after": [...]} OR {"labels": [...], "values": [...]}
|
||||
- bar_horizontal: {"labels": [...], "values": [...]}
|
||||
- line_trend: {"labels": [...], "values": [...]}
|
||||
- pie: {"labels": [...], "values": [...]}
|
||||
- stacked_bar: {"labels": [...], "stacks": [[...], [...]]}
|
||||
- bullet_points: {"bullet_points": [...]}
|
||||
|
||||
Rules:
|
||||
1. Choose the chart type that best represents the information in the text.
|
||||
2. Use bar_comparison for before/after comparisons.
|
||||
3. Use line_trend for time-series or sequential data.
|
||||
4. Use pie for proportional breakdowns of a whole.
|
||||
5. Use bar_horizontal for rankings or comparisons.
|
||||
6. Use bullet_points if the text is qualitative with no strong numeric data.
|
||||
7. Extract realistic numeric values from the text when available.
|
||||
8. If no data is extractable, use bullet_points and list key points.
|
||||
9. Keep labels short (under 20 chars)."""
|
||||
|
||||
|
||||
CHART_INFERENCE_USER_PROMPT = """Create a chart from this text:
|
||||
|
||||
{text}
|
||||
|
||||
Return ONLY the JSON object with chart_type, chart_data, and title."""
|
||||
|
||||
|
||||
CHART_ANALYSIS_SYSTEM_PROMPT = """You are a data visualization analyst. Given text from a blog section, your job is to:
|
||||
1. Determine whether the text contains enough specific numeric data to create a meaningful chart
|
||||
2. If YES: explain what data is available and suggest a chart type
|
||||
3. If NO: suggest 2-3 specific search queries that would find relevant statistics/data to create a chart for this topic
|
||||
|
||||
You MUST respond with ONLY a valid JSON object (no markdown, no explanation):
|
||||
{
|
||||
"has_data": true/false,
|
||||
"data_description": "brief description of what data is available or why it's insufficient",
|
||||
"suggested_chart_type": "best chart type if has_data is true, otherwise null",
|
||||
"search_queries": ["query1", "query2", "query3"] // Empty array if has_data is true
|
||||
}
|
||||
|
||||
Be optimistic — if there's ANY numeric claim, percentage, comparison, or trend in the text, set has_data to true.
|
||||
Only set has_data to false if the text is purely qualitative with no numbers, percentages, comparisons, or trends."""
|
||||
|
||||
|
||||
CHART_ANALYSIS_USER_PROMPT = """Analyze this text for chart potential:
|
||||
|
||||
Section: {section_heading}
|
||||
{key_points_section}
|
||||
Text: {text}
|
||||
|
||||
Determine if this text contains enough data for a chart, or suggest search queries to find the data."""
|
||||
|
||||
|
||||
CHART_SYNTHESIS_SYSTEM_PROMPT = """You are a data visualization expert. You have been given:
|
||||
1. Original text from a blog section
|
||||
2. Research data found from web searches
|
||||
|
||||
Create a chart that visualizes the most interesting insight from the combination of the original text and research data.
|
||||
|
||||
You MUST respond with ONLY a valid JSON object (no markdown, no explanation) with this exact structure:
|
||||
{
|
||||
"chart_type": "one of: bar_comparison, bar_horizontal, line_trend, pie, stacked_bar, bullet_points",
|
||||
"chart_data": { ... appropriate data structure ... },
|
||||
"title": "A clear, concise chart title",
|
||||
"source": "Brief source attribution"
|
||||
}
|
||||
|
||||
Chart data structures by type:
|
||||
- bar_comparison: {"labels": [...], "before": [...], "after": [...]} OR {"labels": [...], "values": [...]}
|
||||
- bar_horizontal: {"labels": [...], "values": [...]}
|
||||
- line_trend: {"labels": [...], "values": [...]}
|
||||
- pie: {"labels": [...], "values": [...]}
|
||||
- stacked_bar: {"labels": [...], "stacks": [[...], [...]]}
|
||||
- bullet_points: {"bullet_points": [...]}
|
||||
|
||||
Rules:
|
||||
1. Use the research data to create accurate, fact-based charts
|
||||
2. Prefer bar_comparison for before/after or categorical comparisons
|
||||
3. Prefer line_trend for trends over time
|
||||
4. Prefer pie for market share or proportional breakdowns
|
||||
5. Keep labels short (under 20 characters)
|
||||
6. Use realistic values from the research — do NOT invent numbers
|
||||
7. Always include a source attribution based on where the data came from
|
||||
8. If the research doesn't contain useful numeric data, fall back to bullet_points with key insights"""
|
||||
|
||||
|
||||
CHART_SYNTHESIS_USER_PROMPT = """Original text:
|
||||
{text}
|
||||
|
||||
Research data found:
|
||||
{research}
|
||||
|
||||
Create a chart that visualizes the most interesting data insight from the combination above."""
|
||||
|
||||
|
||||
def _normalize_chart_type(chart_type: str) -> str:
|
||||
"""Normalize chart type aliases."""
|
||||
mapping = {
|
||||
"bar_chart_comparison": "bar_comparison",
|
||||
"bullet": "bullet_points",
|
||||
}
|
||||
return mapping.get(chart_type, chart_type)
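Alias normalization is most useful when paired with validation of the LLM's reply before any renderer is called. The sketch below assumes the reply is a JSON string shaped like the structure described in the inference prompt; `coerce_chart_spec`, `ALIASES`, and `CANONICAL_TYPES` are hypothetical names for illustration and are not part of the service.

```python
import json
from typing import Any, Dict, Optional

ALIASES = {"bar_chart_comparison": "bar_comparison", "bullet": "bullet_points"}
CANONICAL_TYPES = {
    "bar_comparison", "bar_horizontal", "line_trend",
    "pie", "stacked_bar", "bullet_points",
}


def coerce_chart_spec(raw: Optional[str]) -> Optional[Dict[str, Any]]:
    """Parse an LLM reply and return a spec with a canonical chart_type, or None."""
    try:
        spec = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return None
    if not isinstance(spec, dict) or not isinstance(spec.get("chart_data"), dict):
        return None
    chart_type = spec.get("chart_type")
    if not isinstance(chart_type, str):
        return None
    chart_type = ALIASES.get(chart_type, chart_type)  # map aliases to canonical names
    if chart_type not in CANONICAL_TYPES:
        return None
    return {
        "chart_type": chart_type,
        "chart_data": spec["chart_data"],
        "title": str(spec.get("title", "")),
    }


reply = '{"chart_type": "bullet", "chart_data": {"bullet_points": ["a"]}, "title": "T"}'
print(coerce_chart_spec(reply))
```

Rejecting unknown types early means a caller can fall back to `bullet_points` or skip the chart instead of failing inside a renderer.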
|
||||
|
||||
|
||||
def _add_source_overlay(image_path: str, source: str) -> None:
|
||||
"""Add a source attribution overlay to a chart image (in-place)."""
|
||||
if not source or not os.path.exists(image_path):
|
||||
return
|
||||
try:
|
||||
img = Image.open(image_path).convert("RGBA")
|
||||
draw = ImageDraw.Draw(img)
|
||||
source_text = f"Source: {source[:80]}"
|
||||
try:
|
||||
font = ImageFont.truetype("/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf", 11)
|
||||
except (OSError, IOError):
|
||||
try:
|
||||
font = ImageFont.truetype("arial.ttf", 11)
|
||||
except (OSError, IOError):
|
||||
font = ImageFont.load_default()
|
||||
text_bbox = draw.textbbox((0, 0), source_text, font=font)
|
||||
text_w = text_bbox[2] - text_bbox[0]
|
||||
text_h = text_bbox[3] - text_bbox[1]
|
||||
x = img.width - text_w - 12
|
||||
y = img.height - text_h - 8
|
||||
draw.rectangle([x - 4, y - 2, x + text_w + 4, y + text_h + 2], fill=(0, 0, 0, 140))
|
||||
draw.text((x, y), source_text, fill=(200, 200, 200, 220), font=font)
|
||||
img.save(image_path)
|
||||
except Exception as e:
|
||||
logger.warning(f"[ChartService] Source overlay failed (non-fatal): {e}")
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Chart generators (Matplotlib → PNG with transparency)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def make_bar_chart(data: dict, out_path: str, title: str = "",
|
||||
show_legend: bool = True, value_suffix: str = "%",
|
||||
subtitle: str = "") -> str:
|
||||
labels = data.get("labels", [])
|
||||
before = data.get("before", [])
|
||||
after = data.get("after", [])
|
||||
|
||||
fig, ax = plt.subplots(figsize=(8, 4.5), facecolor="none")
|
||||
ax.set_facecolor("none")
|
||||
|
||||
if not before and not after:
|
||||
values = data.get("values", [])
|
||||
if values and labels:
|
||||
n = min(len(labels), len(values))
|
||||
labels = labels[:n]
|
||||
before = [0] * n
|
||||
after = values[:n]
|
||||
data = {**data, "labels": labels, "before": before, "after": after}
|
||||
|
||||
x = np.arange(len(labels))
|
||||
w = 0.35
|
||||
bars_b = ax.bar(x - w / 2, before, w, color=CHART_STYLE["bar_before"],
|
||||
label="Before", zorder=3, edgecolor="none")
|
||||
bars_a = ax.bar(x + w / 2, after, w, color=CHART_STYLE["bar_after"],
|
||||
label="After", zorder=3, edgecolor="none")
|
||||
|
||||
ax.set_xticks(x)
|
||||
ax.set_xticklabels(labels, color=CHART_STYLE["text"], fontsize=11)
|
||||
ax.tick_params(axis="y", colors=CHART_STYLE["text"])
|
||||
ax.spines[:].set_visible(False)
|
||||
ax.yaxis.grid(True, color=CHART_STYLE["grid"], linewidth=0.6, zorder=0)
|
||||
ax.set_axisbelow(True)
|
||||
|
||||
for bar in [*bars_b, *bars_a]:
|
||||
h = bar.get_height()
|
||||
ax.text(bar.get_x() + bar.get_width() / 2, h + 0.5, f"{h:.0f}{value_suffix}",
|
||||
ha="center", va="bottom", color=CHART_STYLE["text"], fontsize=9,
|
||||
fontweight="bold")
|
||||
|
||||
if show_legend:
|
||||
ax.legend(frameon=False, labelcolor=CHART_STYLE["text"],
|
||||
fontsize=10, loc="upper left")
|
||||
|
||||
if title:
|
||||
ax.set_title(title, color=CHART_STYLE["text"], fontsize=13,
|
||||
fontweight="bold", pad=12)
|
||||
if subtitle:
|
||||
fig.text(0.5, 0.02, subtitle, ha='center', color=CHART_STYLE["text"],
|
||||
fontsize=10, style='italic')
|
||||
|
||||
fig.tight_layout(pad=0.5, rect=(0, 0.03 if subtitle else 0, 1, 1))
|
||||
fig.savefig(out_path, dpi=150, transparent=True, bbox_inches="tight")
|
||||
plt.close(fig)
|
||||
return out_path
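`make_bar_chart` accepts either a before/after pair or a single `values` series, and bar plotting expects every series to match the number of labels. One way to make that explicit is to normalize the input first. This is a sketch only; `to_before_after` is a hypothetical helper, and trimming all series to the shortest length is an assumption rather than the service's documented behavior.

```python
from typing import Dict, List, Sequence, Tuple


def to_before_after(data: Dict[str, Sequence]) -> Tuple[List, List, List]:
    """Return (labels, before, after) trimmed to a common length."""
    labels = list(data.get("labels", []))
    before = list(data.get("before", []))
    after = list(data.get("after", []))
    if not before and not after:
        # Single-series input: plot the values as "after" against a zero baseline.
        after = list(data.get("values", []))
        before = [0] * len(after)
    n = min(len(labels), len(before), len(after))
    return labels[:n], before[:n], after[:n]


print(to_before_after({"labels": ["a", "b", "c"], "values": [1, 2]}))
```

When the result is three empty lists there is nothing to plot, and a caller could return early before creating a figure.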
|
||||
|
||||
|
||||
def make_horizontal_bar(data: dict, out_path: str, title: str = "",
|
||||
value_suffix: str = "%", bar_color: Optional[str] = None) -> str:
|
||||
labels = data.get("labels", [])
|
||||
values = data.get("values", data.get("y", []))
|
||||
|
||||
if not values:
|
||||
return ""
|
||||
|
||||
bar_color = bar_color or CHART_STYLE["bar_after"]
|
||||
|
||||
fig, ax = plt.subplots(figsize=(8, 4.5), facecolor="none")
|
||||
ax.set_facecolor("none")
|
||||
|
||||
y_pos = np.arange(len(labels))
|
||||
bars = ax.barh(y_pos, values, color=bar_color, zorder=3, edgecolor="none", height=0.6)
|
||||
|
||||
ax.set_yticks(y_pos)
|
||||
ax.set_yticklabels(labels, color=CHART_STYLE["text"], fontsize=11)
|
||||
ax.tick_params(axis="x", colors=CHART_STYLE["text"])
|
||||
ax.spines[:].set_visible(False)
|
||||
ax.xaxis.grid(True, color=CHART_STYLE["grid"], linewidth=0.6, zorder=0)
|
||||
ax.set_axisbelow(True)
|
||||
ax.invert_yaxis()
|
||||
|
||||
for bar in bars:
|
||||
width = bar.get_width()
|
||||
ax.text(width + 0.5, bar.get_y() + bar.get_height()/2, f"{width:.0f}{value_suffix}",
|
||||
ha="left", va="center", color=CHART_STYLE["text"], fontsize=10,
|
||||
fontweight="bold")
|
||||
|
||||
if title:
|
||||
ax.set_title(title, color=CHART_STYLE["text"], fontsize=13,
|
||||
fontweight="bold", pad=12)
|
||||
|
||||
fig.tight_layout(pad=0.5)
|
||||
fig.savefig(out_path, dpi=150, transparent=True, bbox_inches="tight")
|
||||
plt.close(fig)
|
||||
return out_path
|
||||
|
||||
|
||||
def make_pie_chart(data: dict, out_path: str, title: str = "",
|
||||
show_labels: bool = True, show_percent: bool = True,
|
||||
donut: bool = False) -> str:
|
||||
labels = data.get("labels", [])
|
||||
values = data.get("values", data.get("y", []))
|
||||
|
||||
if not values:
|
||||
return ""
|
||||
|
||||
colors = CHART_STYLE["pie_colors"][:len(values)]
|
||||
|
||||
fig, ax = plt.subplots(figsize=(6, 4.5), facecolor="none")
|
||||
ax.set_facecolor("none")
|
||||
|
||||
if donut:
|
||||
wedges, texts, autotexts = ax.pie(
|
||||
values, labels=labels if show_labels else None,
|
||||
colors=colors, autopct=lambda p: f'{p:.1f}%' if show_percent else '',
|
||||
startangle=90, pctdistance=0.75,
|
||||
wedgeprops=dict(width=0.5, edgecolor="none")
|
||||
)
|
||||
else:
|
||||
wedges, texts, autotexts = ax.pie(
|
||||
values, labels=labels if show_labels else None,
|
||||
colors=colors, autopct=lambda p: f'{p:.1f}%' if show_percent else '',
|
||||
startangle=90, pctdistance=0.8
|
||||
)
|
||||
|
||||
for text in texts:
|
||||
text.set_color(CHART_STYLE["text"])
|
||||
text.set_fontsize(10)
|
||||
|
||||
for autotext in autotexts:
|
||||
autotext.set_color(CHART_STYLE["text"])
|
||||
autotext.set_fontsize(9)
|
||||
autotext.set_fontweight("bold")
|
||||
|
||||
if title:
|
||||
ax.set_title(title, color=CHART_STYLE["text"], fontsize=13,
|
||||
fontweight="bold", pad=12)
|
||||
|
||||
fig.tight_layout(pad=0.5)
|
||||
fig.savefig(out_path, dpi=150, transparent=True, bbox_inches="tight")
|
||||
plt.close(fig)
|
||||
return out_path
|
||||
|
||||
|
||||
def make_stacked_bar(data: dict, out_path: str, title: str = "",
|
||||
stack_labels: Optional[List[str]] = None) -> str:
|
||||
labels = data.get("labels", [])
|
||||
stacks = data.get("stacks", [])
|
||||
|
||||
if not stacks or len(stacks) < 2:
|
||||
return ""
|
||||
|
||||
stack_labels = stack_labels or [f"Series {i+1}" for i in range(len(stacks))]
|
||||
|
||||
fig, ax = plt.subplots(figsize=(8, 4.5), facecolor="none")
|
||||
ax.set_facecolor("none")
|
||||
|
||||
x = np.arange(len(labels))
|
||||
bottom = np.zeros(len(labels))
|
||||
colors = CHART_STYLE["pie_colors"][:len(stacks)]
|
||||
|
||||
for i, stack in enumerate(stacks):
|
||||
bars = ax.bar(x, stack, 0.6, bottom=bottom, color=colors[i],
|
||||
label=stack_labels[i], zorder=3, edgecolor="none")
|
||||
|
||||
for j, bar in enumerate(bars):
|
||||
height = bar.get_height()
|
||||
if height > 5:
|
||||
ax.text(bar.get_x() + bar.get_width()/2,
|
||||
bottom[j] + height/2,
|
||||
f"{height:.0f}", ha="center", va="center",
|
||||
color=CHART_STYLE["text"], fontsize=8, fontweight="bold")
|
||||
|
||||
bottom = bottom + np.array(stack)
|
||||
|
||||
ax.set_xticks(x)
|
||||
ax.set_xticklabels(labels, color=CHART_STYLE["text"], fontsize=11)
|
||||
ax.tick_params(axis="y", colors=CHART_STYLE["text"])
|
||||
ax.spines[:].set_visible(False)
|
||||
ax.legend(frameon=False, labelcolor=CHART_STYLE["text"], fontsize=9, loc="upper left")
|
||||
|
||||
if title:
|
||||
ax.set_title(title, color=CHART_STYLE["text"], fontsize=13,
|
||||
fontweight="bold", pad=12)
|
||||
|
||||
fig.tight_layout(pad=0.5)
|
||||
fig.savefig(out_path, dpi=150, transparent=True, bbox_inches="tight")
|
||||
plt.close(fig)
|
||||
return out_path
|
||||
|
||||
|
||||
def make_line_trend(data: dict, out_path: str, title: str = "") -> str:
|
||||
x_labels = data.get("labels", data.get("x", []))
|
||||
y_vals = data.get("values", data.get("y", []))
|
||||
|
||||
if not x_labels or not y_vals:
|
||||
return ""
|
||||
|
||||
fig, ax = plt.subplots(figsize=(8, 4.5), facecolor="none")
|
||||
ax.set_facecolor("none")
|
||||
|
||||
try:
|
||||
x_vals = [float(v) for v in x_labels]
|
||||
except (ValueError, TypeError):
|
||||
x_vals = list(range(len(x_labels)))
|
||||
|
||||
ax.plot(x_vals, y_vals, color=CHART_STYLE["accent"],
|
||||
linewidth=2.5, marker="o", markersize=7, zorder=3)
|
||||
ax.fill_between(x_vals, y_vals, alpha=0.12, color=CHART_STYLE["accent"])
|
||||
ax.spines[:].set_visible(False)
|
||||
ax.tick_params(colors=CHART_STYLE["text"])
|
||||
ax.yaxis.grid(True, color=CHART_STYLE["grid"], linewidth=0.6, zorder=0)
|
||||
|
||||
try:
|
||||
x_labels_f = [float(v) for v in x_labels]
|
||||
except (ValueError, TypeError):
|
||||
ax.set_xticks(x_vals)
|
||||
ax.set_xticklabels(x_labels, color=CHART_STYLE["text"], fontsize=10)
|
||||
|
||||
if title:
|
||||
ax.set_title(title, color=CHART_STYLE["text"], fontsize=13,
|
||||
fontweight="bold", pad=12)
|
||||
fig.tight_layout(pad=0.5)
|
||||
fig.savefig(out_path, dpi=150, transparent=True, bbox_inches="tight")
|
||||
plt.close(fig)
|
||||
return out_path
|
||||
|
||||
|
||||
def make_bullet_overlay(lines: list, out_path: str,
|
||||
width: int = 900, font_size: int = 32) -> str:
|
||||
padding = 32
|
||||
line_h = font_size + 16
|
||||
img_h = padding * 2 + len(lines) * line_h + 12
|
||||
img = Image.new("RGBA", (width, img_h), (0, 0, 0, 0))
|
||||
draw = ImageDraw.Draw(img)
|
||||
|
||||
draw.rounded_rectangle([0, 0, width - 1, img_h - 1],
|
||||
radius=18, fill=(10, 10, 10, 185))
|
||||
|
||||
try:
|
||||
font = ImageFont.truetype("/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf",
|
||||
font_size)
|
||||
except OSError:
|
||||
font = ImageFont.load_default()
|
||||
|
||||
y = padding
|
||||
for line in lines:
|
||||
draw.text((padding + 18, y), f"\u2022 {line}", font=font, fill=(241, 241, 239, 255))
|
||||
y += line_h
|
||||
|
||||
img.save(out_path, format="PNG")
|
||||
return out_path
|
||||
|
||||
|
||||
CHART_RENDERERS = {
|
||||
"bar_comparison": make_bar_chart,
|
||||
"bar_chart_comparison": make_bar_chart,
|
||||
"bar_horizontal": make_horizontal_bar,
|
||||
"line_trend": make_line_trend,
|
||||
"pie": make_pie_chart,
|
||||
"stacked_bar": make_stacked_bar,
|
||||
"bullet_points": make_bullet_overlay,
|
||||
"bullet": make_bullet_overlay,
|
||||
}
|
||||
|
||||
|
||||
class ChartService:
|
||||
"""Shared chart generation service for all modules."""
|
||||
|
||||
def __init__(self, output_dir: Optional[str] = None, user_id: Optional[str] = None):
|
||||
if output_dir:
|
||||
self.output_dir = Path(output_dir)
|
||||
else:
|
||||
self.output_dir = self._default_chart_dir(user_id)
|
||||
|
||||
self.output_dir.mkdir(parents=True, exist_ok=True)
|
||||
logger.info(f"[ChartService] Initialized with output directory: {self.output_dir}")
|
||||
|
||||
@staticmethod
|
||||
def _default_chart_dir(user_id: Optional[str] = None) -> Path:
|
||||
"""Get default chart directory (workspace-aware if user_id provided)."""
|
||||
if user_id:
|
||||
try:
|
||||
from api.podcast.constants import get_podcast_media_dir
|
||||
return get_podcast_media_dir("chart", user_id, ensure_exists=True)
|
||||
except Exception:
|
||||
pass
|
||||
base = Path.home() / ".alwrity" / "charts"
|
||||
base.mkdir(parents=True, exist_ok=True)
|
||||
return base
|
||||
|
||||
def get_output_path(self, filename: str) -> Path:
|
||||
return self.output_dir / filename
|
||||
|
||||
def get_chart_preview_path(self, chart_id: str) -> Path:
|
||||
return self.get_output_path(f"chart_preview_{chart_id}.png")
|
||||
|
||||
def generate_chart(
|
||||
self,
|
||||
chart_data: Dict[str, Any],
|
||||
chart_type: str = "bar_comparison",
|
||||
title: str = "",
|
||||
subtitle: str = "",
|
||||
chart_id: Optional[str] = None,
|
||||
) -> Dict[str, str]:
|
||||
"""
|
||||
Generate a chart PNG and return metadata.
|
||||
|
||||
Returns:
|
||||
{"path": str, "chart_id": str, "filename": str}
|
||||
Returns {"path": "", "chart_id": str, "filename": ""} on failure.
|
||||
"""
|
||||
resolved_id = chart_id or uuid.uuid4().hex[:8]
|
||||
out_path = str(self.get_chart_preview_path(resolved_id))
|
||||
normalized_type = _normalize_chart_type(chart_type)
|
||||
|
||||
logger.info(f"[ChartService] Generating chart: type={normalized_type}, id={resolved_id}")
|
||||
|
||||
try:
|
||||
result_path = self._render_chart(normalized_type, chart_data, out_path, title, subtitle)
|
||||
|
||||
if not result_path or not os.path.exists(result_path):
|
||||
logger.warning(f"[ChartService] Chart rendering returned empty path or file missing for type={normalized_type}")
|
||||
return {"path": "", "chart_id": resolved_id, "filename": ""}
|
||||
|
||||
source = chart_data.get("source", "").strip()
|
||||
if source:
|
||||
_add_source_overlay(result_path, source)
|
||||
|
||||
filename = Path(result_path).name
|
||||
logger.info(f"[ChartService] Chart generated: id={resolved_id}, path={result_path}")
|
||||
return {"path": result_path, "chart_id": resolved_id, "filename": filename}
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[ChartService] Chart generation failed: {e}")
|
||||
return {"path": "", "chart_id": resolved_id, "filename": ""}
|
||||
|
||||
def _render_chart(self, chart_type: str, chart_data: Dict[str, Any],
|
||||
out_path: str, title: str, subtitle: str) -> str:
|
||||
"""Dispatch to the appropriate chart renderer."""
|
||||
|
||||
if chart_type in ("bar_comparison", "bar_chart_comparison"):
|
||||
labels = chart_data.get("labels", [])
|
||||
before = chart_data.get("before", [])
|
||||
after = chart_data.get("after", [])
|
||||
if not before and not after:
|
||||
values = chart_data.get("values", [])
|
||||
if values and labels:
|
||||
n = min(len(labels), len(values))
|
||||
chart_data = {**chart_data, "labels": labels[:n], "before": [0] * n, "after": values[:n]}
|
||||
return make_bar_chart(chart_data, out_path, title, subtitle=subtitle)
|
||||
|
||||
elif chart_type == "bar_horizontal":
|
||||
return make_horizontal_bar(chart_data, out_path, title)
|
||||
|
||||
elif chart_type == "line_trend":
|
||||
return make_line_trend(chart_data, out_path, title)
|
||||
|
||||
elif chart_type == "pie":
|
||||
return make_pie_chart(chart_data, out_path, title)
|
||||
|
||||
elif chart_type == "stacked_bar":
|
||||
return make_stacked_bar(chart_data, out_path, title)
|
||||
|
||||
elif chart_type in ("bullet", "bullet_points"):
|
||||
bullet_points = chart_data.get("bullet_points", chart_data.get("labels", []))
|
||||
if bullet_points:
|
||||
return make_bullet_overlay(bullet_points, out_path)
|
||||
return ""
|
||||
|
||||
else:
|
||||
logger.warning(f"[ChartService] Unknown chart type: {chart_type}, falling back to bar_comparison")
|
||||
return make_bar_chart(chart_data, out_path, title, subtitle=subtitle)
|
||||
|
||||
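The `bar_comparison` branch of `_render_chart` accepts a plain `values` list when no `before`/`after` series is given. A minimal sketch of that normalization step, pulled out into a hypothetical helper `normalize_bar_data` (not part of the diff):

```python
from typing import Any, Dict


def normalize_bar_data(chart_data: Dict[str, Any]) -> Dict[str, Any]:
    """Map a single `values` series onto the before/after shape."""
    labels = chart_data.get("labels", [])
    if chart_data.get("before") or chart_data.get("after"):
        return chart_data  # already in comparison shape
    values = chart_data.get("values", [])
    if not (values and labels):
        return chart_data  # nothing to normalize
    # Truncate to the shorter list so labels and values stay aligned.
    n = min(len(labels), len(values))
    return {**chart_data, "labels": labels[:n], "before": [0] * n, "after": values[:n]}
```

The input dictionary is not mutated; a new one is returned with a zero-filled `before` series.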
def infer_chart_from_text(self, text: str, user_id: Optional[str] = None) -> Dict[str, Any]:
|
||||
"""
|
||||
Use LLM to infer chart_type and chart_data from text.
|
||||
|
||||
Returns:
|
||||
{"chart_type": str, "chart_data": dict, "title": str}
|
||||
Falls back to bullet_points with key sentences extracted from text.
|
||||
"""
|
||||
try:
|
||||
prompt = CHART_INFERENCE_USER_PROMPT.format(text=text[:3000])
|
||||
result = llm_text_gen(
|
||||
prompt=prompt,
|
||||
system_prompt=CHART_INFERENCE_SYSTEM_PROMPT,
|
||||
json_struct=None,
|
||||
max_tokens=2000,
|
||||
user_id=user_id,
|
||||
)
|
||||
|
||||
if isinstance(result, dict) and result.get("text"):
|
||||
raw = result["text"]
|
||||
else:
|
||||
raw = str(result) if result else ""
|
||||
|
||||
import json
|
||||
import re
|
||||
raw = raw.strip()
|
||||
if raw.startswith("```"):
|
||||
match = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", raw, re.DOTALL)
|
||||
if match:
|
||||
raw = match.group(1)
|
||||
|
||||
parsed = json.loads(raw)
|
||||
|
||||
chart_type = parsed.get("chart_type", "bullet_points")
|
||||
chart_data = parsed.get("chart_data", {})
|
||||
title = parsed.get("title", "")
|
||||
|
||||
if chart_type not in VALID_CHART_TYPES:
|
||||
chart_type = _normalize_chart_type(chart_type)
|
||||
if chart_type not in VALID_CHART_TYPES:
|
||||
chart_type = "bullet_points"
|
||||
|
||||
logger.info(f"[ChartService] Inferred chart: type={chart_type}, title={title}")
|
||||
return {"chart_type": chart_type, "chart_data": chart_data, "title": title}
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[ChartService] Chart inference failed: {e}")
|
||||
sentences = [s.strip() for s in text.replace(".", ". ").split(". ") if len(s.strip()) > 10][:5]
|
||||
return {
|
||||
"chart_type": "bullet_points",
|
||||
"chart_data": {"bullet_points": sentences or ["No data extracted"]},
|
||||
"title": "Key Points",
|
||||
}
|
||||
|
||||
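The same fence-stripping and `json.loads` sequence appears in each of the three LLM-backed methods in this file. One way to factor it out is sketched below; `parse_llm_json` is a hypothetical name, and the regular expression is the one used in the diff, built from a `FENCE` constant only to keep the example readable.

```python
import json
import re
from typing import Any

FENCE = "`" * 3  # a Markdown code-fence marker
_FENCED_JSON = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(\{.*?\})\s*" + re.escape(FENCE),
    re.DOTALL,
)


def parse_llm_json(raw: str) -> Any:
    """Parse JSON from an LLM reply, tolerating a surrounding code fence."""
    raw = raw.strip()
    if raw.startswith(FENCE):
        match = _FENCED_JSON.search(raw)
        if match:
            raw = match.group(1)
    # Raises json.JSONDecodeError on malformed input; callers decide the fallback.
    return json.loads(raw)
```

Callers would keep their existing `try`/`except` around the call, so the bullet-point fallback still applies when parsing fails.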
async def _analyze_chart_potential(
|
||||
self,
|
||||
text: str,
|
||||
section_heading: Optional[str] = None,
|
||||
section_key_points: Optional[List[str]] = None,
|
||||
user_id: Optional[str] = None,
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Stage 1: Analyze whether text has enough data for a chart.
|
||||
If not, suggest Exa search queries to find relevant data.
|
||||
|
||||
Returns:
|
||||
{"has_data": bool, "data_description": str, "suggested_chart_type": str|null, "search_queries": [...]}
|
||||
"""
|
||||
key_points_text = ""
|
||||
if section_key_points:
|
||||
key_points_text = "\n\nKey points:\n" + "\n".join(f"- {p}" for p in section_key_points[:5])
|
||||
|
||||
prompt = CHART_ANALYSIS_USER_PROMPT.format(
|
||||
section_heading=section_heading or "Blog Section",
|
||||
key_points_section=key_points_text,
|
||||
text=text[:3000],
|
||||
)
|
||||
|
||||
try:
|
||||
result = llm_text_gen(
|
||||
prompt=prompt,
|
||||
system_prompt=CHART_ANALYSIS_SYSTEM_PROMPT,
|
||||
json_struct=None,
|
||||
max_tokens=1500,
|
||||
user_id=user_id,
|
||||
)
|
||||
|
||||
raw = result.get("text", "") if isinstance(result, dict) else str(result) if result else ""
|
||||
|
||||
import json
|
||||
import re
|
||||
raw = raw.strip()
|
||||
if raw.startswith("```"):
|
||||
match = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", raw, re.DOTALL)
|
||||
if match:
|
||||
raw = match.group(1)
|
||||
|
||||
parsed = json.loads(raw)
|
||||
|
||||
has_data = parsed.get("has_data", False)
|
||||
data_description = parsed.get("data_description", "")
|
||||
suggested_chart_type = parsed.get("suggested_chart_type")
|
||||
search_queries = parsed.get("search_queries", [])
|
||||
|
||||
if suggested_chart_type and suggested_chart_type not in VALID_CHART_TYPES:
|
||||
suggested_chart_type = _normalize_chart_type(suggested_chart_type)
|
||||
if suggested_chart_type not in VALID_CHART_TYPES:
|
||||
suggested_chart_type = None
|
||||
|
||||
logger.info(f"[ChartService] Chart analysis: has_data={has_data}, queries={search_queries}")
|
||||
return {
|
||||
"has_data": has_data,
|
||||
"data_description": data_description,
|
||||
"suggested_chart_type": suggested_chart_type,
|
||||
"search_queries": search_queries,
|
||||
"warnings": [],
|
||||
}
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[ChartService] Chart analysis failed: {e}")
|
||||
heading = section_heading or ""
|
||||
words = text.split()[:10]
|
||||
fallback_queries = [
|
||||
f"{heading} statistics data",
|
||||
f"{heading} trends report",
|
||||
f"{' '.join(words)} statistics",
|
||||
] if heading.strip() or text.strip() else []
|
||||
return {
|
||||
"has_data": False,
|
||||
"data_description": f"Analysis failed: {e}",
|
||||
"suggested_chart_type": None,
|
||||
"search_queries": fallback_queries,
|
||||
"warnings": [f"Chart analysis LLM call failed: {e}"],
|
||||
}
|
||||
|
||||
async def _search_for_chart_data(
|
||||
self,
|
||||
queries: List[str],
|
||||
section_heading: Optional[str] = None,
|
||||
user_id: Optional[str] = None,
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Stage 2: Use Exa search to find relevant statistics and data for chart creation.
|
||||
|
||||
Returns:
|
||||
{"research": str, "warnings": list[str]}
|
||||
"""
|
||||
if not queries:
|
||||
return {"research": "", "warnings": []}
|
||||
|
||||
warnings = []
|
||||
try:
|
||||
from services.blog_writer.research.exa_provider import ExaResearchProvider
|
||||
|
||||
provider = ExaResearchProvider()
|
||||
all_results = []
|
||||
search_errors = 0
|
||||
|
||||
for query in queries[:3]:
|
||||
try:
|
||||
results = await provider.simple_search(
|
||||
query=query,
|
||||
num_results=3,
|
||||
user_id=user_id,
|
||||
)
|
||||
all_results.extend(results)
|
||||
except Exception as e:
|
||||
search_errors += 1
|
||||
logger.warning(f"[ChartService] Exa search for '{query}' failed: {e}")
|
||||
continue
|
||||
|
||||
if search_errors == len(queries[:3]):
|
||||
warnings.append("All Exa search queries failed — external data search unavailable. Chart may lack supporting data.")
|
||||
|
||||
if not all_results:
|
||||
return {"research": "", "warnings": warnings}
|
||||
|
||||
research_parts = []
|
||||
seen_urls = set()
|
||||
for r in all_results:
|
||||
url = r.get("url", "")
|
||||
if url in seen_urls:
|
||||
continue
|
||||
seen_urls.add(url)
|
||||
title = r.get("title", "Untitled")
|
||||
text = r.get("text", "")[:500]
|
||||
if text:
|
||||
research_parts.append(f"- {title} ({url}): {text}")
|
||||
|
||||
if not research_parts:
|
||||
return {"research": "", "warnings": warnings}
|
||||
|
||||
return {"research": "\n".join(research_parts), "warnings": warnings}
|
||||
|
||||
except ImportError:
|
||||
msg = "Exa provider not available — skipping external data search."
|
||||
logger.warning(f"[ChartService] {msg}")
|
||||
warnings.append(msg)
|
||||
return {"research": "", "warnings": warnings}
|
||||
except Exception as e:
|
||||
msg = f"Chart data search failed: {e}"
|
||||
logger.error(f"[ChartService] {msg}")
|
||||
warnings.append(msg)
|
||||
return {"research": "", "warnings": warnings}
|
||||
|
||||
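The result-formatting half of `_search_for_chart_data` has no network dependency and can be sketched on its own. `format_research` is a hypothetical helper; the `url`, `title` and `text` keys are the ones the code above reads from each search result.

```python
from typing import Any, Dict, List


def format_research(results: List[Dict[str, Any]], max_chars: int = 500) -> str:
    """Deduplicate results by URL and render one bullet per result with text."""
    parts: List[str] = []
    seen_urls = set()
    for r in results:
        url = r.get("url", "")
        if url in seen_urls:
            continue  # keep only the first hit for each URL
        seen_urls.add(url)
        title = r.get("title", "Untitled")
        text = r.get("text", "")[:max_chars]
        if text:
            parts.append(f"- {title} ({url}): {text}")
    return "\n".join(parts)
```

An empty string signals that no usable research was found, which matches the `"research": ""` value returned above.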
async def _synthesize_chart_from_research(
|
||||
self,
|
||||
text: str,
|
||||
research: str,
|
||||
section_heading: Optional[str] = None,
|
||||
user_id: Optional[str] = None,
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Stage 3: Generate chart spec from text + research data using LLM.
|
||||
|
||||
Returns:
|
||||
{"chart_type": str, "chart_data": dict, "title": str}; any source is stored in chart_data["source"]
|
||||
"""
|
||||
try:
|
||||
prompt = CHART_SYNTHESIS_USER_PROMPT.format(
|
||||
text=text[:2000],
|
||||
research=research[:3000],
|
||||
)
|
||||
|
||||
result = llm_text_gen(
|
||||
prompt=prompt,
|
||||
system_prompt=CHART_SYNTHESIS_SYSTEM_PROMPT,
|
||||
json_struct=None,
|
||||
max_tokens=2000,
|
||||
user_id=user_id,
|
||||
)
|
||||
|
||||
raw = result.get("text", "") if isinstance(result, dict) else str(result) if result else ""
|
||||
|
||||
import json
|
||||
import re
|
||||
raw = raw.strip()
|
||||
if raw.startswith("```"):
|
||||
match = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", raw, re.DOTALL)
|
||||
if match:
|
||||
raw = match.group(1)
|
||||
|
||||
parsed = json.loads(raw)
|
||||
|
||||
chart_type = parsed.get("chart_type", "bullet_points")
|
||||
chart_data = parsed.get("chart_data", {})
|
||||
title = parsed.get("title", "")
|
||||
source = parsed.get("source", "")
|
||||
|
||||
if chart_type not in VALID_CHART_TYPES:
|
||||
chart_type = _normalize_chart_type(chart_type)
|
||||
if chart_type not in VALID_CHART_TYPES:
|
||||
chart_type = "bullet_points"
|
||||
|
||||
if source and isinstance(chart_data, dict):
|
||||
chart_data["source"] = source
|
||||
|
||||
logger.info(f"[ChartService] Synthesized chart: type={chart_type}, title={title}")
|
||||
return {"chart_type": chart_type, "chart_data": chart_data, "title": title}
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[ChartService] Chart synthesis failed: {e}")
|
||||
sentences = [s.strip() for s in text.replace(".", ". ").split(". ") if len(s.strip()) > 10][:5]
|
||||
return {
|
||||
"chart_type": "bullet_points",
|
||||
"chart_data": {"bullet_points": sentences or ["No data available"]},
|
||||
"title": section_heading or "Key Points",
|
||||
}
|
||||
|
||||
async def infer_chart_with_research(
|
||||
self,
|
||||
text: str,
|
||||
section_heading: Optional[str] = None,
|
||||
section_key_points: Optional[List[str]] = None,
|
||||
user_id: Optional[str] = None,
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
3-stage chart inference pipeline:
|
||||
1. Analyze text for chart potential — does it have data? If not, what to search for?
|
||||
2. If no data, search Exa for relevant statistics.
|
||||
3. Synthesize chart spec from text + research data.
|
||||
|
||||
Returns:
|
||||
{"chart_type": str, "chart_data": dict, "title": str, "warnings": list[str]}
|
||||
"""
|
||||
warnings = []
|
||||
logger.info(f"[ChartService] infer_chart_with_research: heading={section_heading}, text_len={len(text)}, user={user_id}")
|
||||
|
||||
# Stage 1: Analyze
|
||||
analysis = await self._analyze_chart_potential(
|
||||
text=text,
|
||||
section_heading=section_heading,
|
||||
section_key_points=section_key_points,
|
||||
user_id=user_id,
|
||||
)
|
||||
warnings.extend(analysis.get("warnings", []))
|
||||
|
||||
if analysis.get("has_data") and analysis.get("suggested_chart_type"):
|
||||
# Text has enough data — do direct inference
|
||||
logger.info("[ChartService] Text has sufficient data, using direct inference")
|
||||
result = self.infer_chart_from_text(text, user_id=user_id)
|
||||
if analysis.get("suggested_chart_type") and result.get("chart_type") == "bullet_points":
|
||||
result["chart_type"] = analysis["suggested_chart_type"]
|
||||
result["warnings"] = warnings
|
||||
return result
|
||||
|
||||
# Stage 2: Search for data
|
||||
search_queries = analysis.get("search_queries", [])
|
||||
if not search_queries:
|
||||
# Build queries from section heading + text keywords
|
||||
heading = section_heading or ""
|
||||
words = text.split()[:10]
|
||||
search_queries = [
|
||||
f"{heading} statistics data",
|
||||
f"{heading} trends report",
|
||||
f"{' '.join(words)} statistics",
|
||||
]
|
||||
|
||||
logger.info(f"[ChartService] Searching Exa for chart data, queries: {search_queries}")
|
||||
search_result = await self._search_for_chart_data(
|
||||
queries=search_queries,
|
||||
section_heading=section_heading,
|
||||
user_id=user_id,
|
||||
)
|
||||
research = search_result.get("research", "")
|
||||
warnings.extend(search_result.get("warnings", []))
|
||||
|
||||
if not research:
|
||||
logger.warning("[ChartService] No research data found, falling back to text-only inference")
|
||||
result = self.infer_chart_from_text(text, user_id=user_id)
|
||||
result["warnings"] = warnings
|
||||
return result
|
||||
|
||||
# Stage 3: Synthesize chart from text + research
|
||||
logger.info("[ChartService] Synthesizing chart from text + research data")
|
||||
result = await self._synthesize_chart_from_research(
|
||||
text=text,
|
||||
research=research,
|
||||
section_heading=section_heading,
|
||||
user_id=user_id,
|
||||
)
|
||||
result["warnings"] = warnings
|
||||
return result
|
||||
|
||||
async def generate_chart_from_text(
|
||||
self,
|
||||
text: str,
|
||||
user_id: Optional[str] = None,
|
||||
chart_id: Optional[str] = None,
|
||||
section_heading: Optional[str] = None,
|
||||
section_key_points: Optional[List[str]] = None,
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
End-to-end: analyze text, optionally research data, then infer and render chart.
|
||||
|
||||
Uses the 3-stage pipeline (analyze → search → synthesize) for richer charts
|
||||
with real data from Exa when the original text lacks statistics.
|
||||
|
||||
Returns:
|
||||
{"path": str, "chart_id": str, "filename": str, "chart_type": str, "chart_data": dict, "title": str}
|
||||
"""
|
||||
inference = await self.infer_chart_with_research(
|
||||
text=text,
|
||||
section_heading=section_heading,
|
||||
section_key_points=section_key_points,
|
||||
user_id=user_id,
|
||||
)
|
||||
result = self.generate_chart(
|
||||
chart_data=inference["chart_data"],
|
||||
chart_type=inference["chart_type"],
|
||||
title=inference["title"],
|
||||
chart_id=chart_id,
|
||||
)
|
||||
result["chart_type"] = inference["chart_type"]
|
||||
result["chart_data"] = inference["chart_data"]
|
||||
result["title"] = inference["title"]
|
||||
result["warnings"] = inference.get("warnings", [])
|
||||
return result
|
||||
|
||||
|
||||
# Per-user service instances
|
||||
_chart_service_instances: Dict[str, ChartService] = {}
|
||||
|
||||
|
||||
def get_chart_service(output_dir: Optional[str] = None, user_id: Optional[str] = None) -> ChartService:
|
||||
"""Get or create ChartService for the given user."""
|
||||
cache_key = output_dir or user_id or "default"
|
||||
if cache_key not in _chart_service_instances:
|
||||
_chart_service_instances[cache_key] = ChartService(output_dir=output_dir, user_id=user_id)
|
||||
return _chart_service_instances[cache_key]
|
||||
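The caching rule in `get_chart_service` can be sketched with a stand-in class. `Service` and `get_service` below are hypothetical; only the cache-key expression is taken from the code above.

```python
from typing import Dict, Optional


class Service:
    """Stand-in for ChartService; it only records its arguments."""

    def __init__(self, output_dir: Optional[str] = None, user_id: Optional[str] = None):
        self.output_dir = output_dir
        self.user_id = user_id


_instances: Dict[str, Service] = {}


def get_service(output_dir: Optional[str] = None, user_id: Optional[str] = None) -> Service:
    # output_dir wins over user_id; with neither, everyone shares "default".
    cache_key = output_dir or user_id or "default"
    if cache_key not in _instances:
        _instances[cache_key] = Service(output_dir=output_dir, user_id=user_id)
    return _instances[cache_key]
```

One consequence of this key is that an `output_dir` string identical to some `user_id` would resolve to the same cached instance; a tuple key such as `(output_dir, user_id)` would avoid that, at the cost of more instances.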
@@ -31,7 +31,6 @@ from models.product_marketing_models import Campaign, CampaignProposal, Campaign
|
||||
from models.product_asset_models import ProductAsset, ProductStyleTemplate, EcommerceExport
|
||||
# Podcast Maker models use SubscriptionBase, but import to ensure models are registered
|
||||
from models.podcast_models import PodcastProject
|
||||
|
||||
# Research models use SubscriptionBase
|
||||
from models.research_models import ResearchProject
|
||||
# Video Studio models
|
||||
@@ -47,10 +46,10 @@ import models.platform_insights_monitoring_models
|
||||
import models.agent_activity_models
|
||||
import models.daily_workflow_models
|
||||
|
||||
from services.workspace_paths import get_workspace_root, get_user_workspace_dir
|
||||
|
||||
# Database configuration
|
||||
WORKSPACE_DIR = str(get_workspace_root())
|
||||
# Get project root (3 levels up from services/database.py: services -> backend -> root)
|
||||
ROOT_DIR = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
|
||||
WORKSPACE_DIR = os.path.join(ROOT_DIR, 'workspace')
|
||||
|
||||
# Engine cache for multi-tenant support
|
||||
_user_engines = {}
|
||||
@@ -96,7 +95,7 @@ def _sanitize_user_id(user_id: str) -> str:
|
||||
def ensure_user_workspace_db_directory(user_id: str) -> str:
|
||||
"""Ensure modern `db/` directory exists, migrating legacy `database/` when safe."""
|
||||
safe_user_id = _sanitize_user_id(user_id)
|
||||
user_workspace = str(get_user_workspace_dir(user_id))
|
||||
user_workspace = os.path.join(WORKSPACE_DIR, f"workspace_{safe_user_id}")
|
||||
db_dir = os.path.join(user_workspace, 'db')
|
||||
legacy_db_dir = os.path.join(user_workspace, 'database')
|
||||
|
||||
@@ -127,7 +126,7 @@ def ensure_user_workspace_db_directory(user_id: str) -> str:
|
||||
def get_user_db_path(user_id: str) -> str:
|
||||
"""Get the database path for a specific user."""
|
||||
safe_user_id = _sanitize_user_id(user_id)
|
||||
user_workspace = str(get_user_workspace_dir(user_id))
|
||||
user_workspace = os.path.join(WORKSPACE_DIR, f"workspace_{safe_user_id}")
|
||||
db_dir = ensure_user_workspace_db_directory(user_id)
|
||||
|
||||
# Check for legacy naming convention first (to support existing data)
|
||||
|
||||
@@ -1,648 +0,0 @@
|
||||
"""
|
||||
GSC Brainstorm Service for ALwrity.
|
||||
|
||||
Analyzes Google Search Console data to suggest blog topics the user should write about.
|
||||
Combines rule-based heuristics with LLM-powered strategic recommendations tailored to
|
||||
the user's topic intent. Designed for non-SEO-experts: every insight includes plain-English
|
||||
explanations of WHY it matters and WHAT to do about it.
|
||||
"""
|
||||
|
||||
import json
|
||||
from datetime import datetime, timedelta
|
||||
from typing import Dict, List, Any, Optional
|
||||
from loguru import logger
|
||||
|
||||
from services.gsc_service import GSCService
|
||||
from services.llm_providers.main_text_generation import llm_text_gen
|
||||
|
||||
|
||||
class GSCBrainstormService:
|
||||
"""
|
||||
Suggests blog topics based on the user's live GSC data.
|
||||
|
||||
Flow:
|
||||
1. Fetch real GSC search analytics (query + page data, 30 days)
|
||||
2. Compute derived metrics (CTR benchmarks, estimated traffic uplift, content formats)
|
||||
3. Apply rule-based filters (Quick Wins, Optimization, Enhancement, Rising Stars, Page Issues)
|
||||
4. Generate LLM-powered strategic recommendations contextualised to the user's keywords
|
||||
5. Return structured results with all data exposed for rich frontend display
|
||||
"""
|
||||
|
||||
def __init__(self, gsc_service: Optional[GSCService] = None):
|
||||
self.gsc_service = gsc_service or GSCService()
|
||||
|
||||
# ------------------------------------------------------------------ #
|
||||
# Public entry point
|
||||
# ------------------------------------------------------------------ #
|
||||
|
||||
def brainstorm_topics(
|
||||
self,
|
||||
user_id: str,
|
||||
keywords: str,
|
||||
site_url: Optional[str] = None,
|
||||
) -> Dict[str, Any]:
|
||||
self._user_id = user_id
|
||||
|
||||
# 1. Resolve site_url
|
||||
if not site_url:
|
||||
sites = self.gsc_service.get_site_list(user_id)
|
||||
if not sites:
|
||||
return {
|
||||
"error": "No GSC sites found. Make sure your site is verified in Google Search Console.",
|
||||
"content_opportunities": [],
|
||||
"keyword_gaps": [],
|
||||
"quick_wins": [],
|
||||
"page_opportunities": [],
|
||||
"ai_recommendations": {},
|
||||
"summary": {},
|
||||
}
|
||||
site_url = sites[0].get("siteUrl", "")
|
||||
|
||||
# 2. Fetch GSC analytics (30 days)
|
||||
end_date = datetime.now().strftime("%Y-%m-%d")
|
||||
start_date = (datetime.now() - timedelta(days=30)).strftime("%Y-%m-%d")
|
||||
|
||||
analytics = self.gsc_service.get_search_analytics(
|
||||
user_id=user_id,
|
||||
site_url=site_url,
|
||||
start_date=start_date,
|
||||
end_date=end_date,
|
||||
)
|
||||
|
||||
if "error" in analytics:
|
||||
return {
|
||||
"error": analytics.get("error", "Failed to fetch GSC data"),
|
||||
"content_opportunities": [],
|
||||
"keyword_gaps": [],
|
||||
"quick_wins": [],
|
||||
"page_opportunities": [],
|
||||
"ai_recommendations": {},
|
||||
"summary": {},
|
||||
}
|
||||
|
||||
# 3. Parse GSC rows into structured data
|
||||
query_rows = analytics.get("query_data", {}).get("rows", [])
|
||||
page_rows = analytics.get("page_data", {}).get("rows", [])
|
||||
|
||||
keywords_data = self._parse_query_rows(query_rows)
|
||||
pages_data = self._parse_page_rows(page_rows)
|
||||
|
||||
if not keywords_data:
|
||||
return {
|
||||
"error": "No keyword data available for the selected period. This usually means your site is new to GSC or hasn't received search traffic yet.",
|
||||
"content_opportunities": [],
|
||||
"keyword_gaps": [],
|
||||
"quick_wins": [],
|
||||
"page_opportunities": [],
|
||||
"ai_recommendations": {},
|
||||
"summary": {
|
||||
"site_url": site_url,
|
||||
"date_range": {"start": start_date, "end": end_date},
|
||||
"total_keywords_analyzed": 0,
|
||||
},
|
||||
}
|
||||
|
||||
# 4. Rule-based analysis
|
||||
content_opportunities = self._identify_content_opportunities(keywords_data)
|
||||
keyword_gaps = self._identify_keyword_gaps(keywords_data)
|
||||
quick_wins = self._identify_quick_wins(keywords_data)
|
||||
page_opportunities = self._identify_page_opportunities(pages_data)
|
||||
|
||||
# 5. Summary metrics
|
||||
summary = self._compute_summary(keywords_data, pages_data, site_url, start_date, end_date)
|
||||
|
||||
# 6. AI recommendations
|
||||
ai_recommendations = self._generate_ai_recommendations(
|
||||
keywords_data, pages_data, summary, keywords,
|
||||
content_opportunities, quick_wins, keyword_gaps,
|
||||
)
|
||||
|
||||
return {
|
||||
"content_opportunities": content_opportunities,
|
||||
"keyword_gaps": keyword_gaps,
|
||||
"quick_wins": quick_wins,
|
||||
"page_opportunities": page_opportunities,
|
||||
"ai_recommendations": ai_recommendations,
|
||||
"summary": summary,
|
||||
}
|
||||
|
||||
# ------------------------------------------------------------------ #
|
||||
# Data parsing helpers
|
||||
# ------------------------------------------------------------------ #
|
||||
|
||||
@staticmethod
|
||||
def _parse_query_rows(rows: List[Dict]) -> List[Dict[str, Any]]:
|
||||
parsed = []
|
||||
for row in rows:
|
||||
keys = row.get("keys", [])
|
||||
keyword = keys[0] if len(keys) >= 1 else "(not set)"
|
||||
parsed.append({
|
||||
"keyword": keyword,
|
||||
"clicks": row.get("clicks", 0),
|
||||
"impressions": row.get("impressions", 0),
|
||||
"ctr": round(row.get("ctr", 0) * 100, 2),
|
||||
"position": round(row.get("position", 0), 1),
|
||||
})
|
||||
return parsed
|
||||
|
||||
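`_parse_query_rows` and `_parse_page_rows` differ only in the name of the first field. A minimal sketch of a shared helper, assuming the row shape read by the code above (`keys`, `clicks`, `impressions`, `ctr` as a fraction, `position`); `parse_row` is a hypothetical name:

```python
from typing import Any, Dict


def parse_row(row: Dict[str, Any], key_name: str) -> Dict[str, Any]:
    """Flatten one search-analytics row; CTR is converted to a percentage."""
    keys = row.get("keys", [])
    return {
        key_name: keys[0] if keys else "(not set)",
        "clicks": row.get("clicks", 0),
        "impressions": row.get("impressions", 0),
        "ctr": round(row.get("ctr", 0) * 100, 2),  # 0.05 -> 5.0
        "position": round(row.get("position", 0), 1),
    }
```

Because `ctr` is stored as a percentage from this point on, the later thresholds such as `kw["ctr"] < 3` are in percent, not fractions.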
@staticmethod
|
||||
def _parse_page_rows(rows: List[Dict]) -> List[Dict[str, Any]]:
|
||||
parsed = []
|
||||
for row in rows:
|
||||
keys = row.get("keys", [])
|
||||
page = keys[0] if len(keys) >= 1 else "(not set)"
|
||||
parsed.append({
|
||||
"page": page,
|
||||
"clicks": row.get("clicks", 0),
|
||||
"impressions": row.get("impressions", 0),
|
||||
"ctr": round(row.get("ctr", 0) * 100, 2),
|
||||
"position": round(row.get("position", 0), 1),
|
||||
})
|
||||
return parsed
|
||||
|
||||
# ------------------------------------------------------------------ #
|
||||
# Rule-based opportunity identification
|
||||
# ------------------------------------------------------------------ #
|
||||
|
||||
@staticmethod
|
||||
def _identify_content_opportunities(
|
||||
keywords_data: List[Dict[str, Any]],
|
||||
) -> List[Dict[str, Any]]:
|
||||
opportunities: List[Dict[str, Any]] = []
|
||||
|
||||
# Rule 1: Content Optimization — high impressions, low CTR
|
||||
# Meaning: Google is SHOWING your page for this query but people aren't clicking.
|
||||
# The content probably ranks but title/meta/snippet isn't compelling enough.
|
||||
for kw in keywords_data:
|
||||
if kw["impressions"] > 500 and kw["ctr"] < 3:
|
||||
estimated_gain = int(kw["impressions"] * 0.05) - kw["clicks"]
|
||||
opportunities.append({
|
||||
"type": "Content Optimization",
|
||||
"keyword": kw["keyword"],
|
||||
"opportunity": (
|
||||
f"Your site appears for '{kw['keyword']}' ({kw['impressions']:,} times/month) "
|
||||
f"but only {kw['ctr']:.1f}% click. Improving your title and meta description "
|
||||
f"could bring ~{max(estimated_gain, 5)} more clicks/month."
|
||||
),
|
||||
"potential_impact": "High" if kw["impressions"] > 1000 else "Medium",
|
||||
"current_position": kw["position"],
|
||||
"current_ctr": kw["ctr"],
|
||||
"impressions": kw["impressions"],
|
||||
"clicks": kw["clicks"],
|
||||
"estimated_traffic_gain": max(estimated_gain, 5),
|
||||
"priority": "High" if kw["impressions"] > 1000 else "Medium",
|
||||
"suggested_format": GSCBrainstormService._suggest_format(kw["keyword"]),
|
||||
})
|
||||
|
||||
# Rule 2: Content Enhancement — positions 11-20 with decent impressions
|
||||
# Meaning: You're on page 2 of Google. A small content boost could push you to page 1,
|
||||
# where CTR increases dramatically (page 1 gets ~95% of all clicks).
|
||||
for kw in keywords_data:
|
||||
if 10 < kw["position"] <= 20 and kw["impressions"] > 100:
|
||||
estimated_gain = int(kw["impressions"] * 0.08)
|
||||
opportunities.append({
|
||||
"type": "Content Enhancement",
|
||||
"keyword": kw["keyword"],
|
||||
"opportunity": (
|
||||
f"'{kw['keyword']}' ranks #{kw['position']:.0f} (page 2). "
|
||||
f"Moving to page 1 could capture ~{estimated_gain} more clicks/month "
|
||||
f"from {kw['impressions']:,} impressions."
|
||||
),
|
||||
"potential_impact": "High" if kw["impressions"] > 500 else "Medium",
|
||||
"current_position": kw["position"],
|
||||
"current_ctr": kw["ctr"],
|
||||
"impressions": kw["impressions"],
|
||||
"clicks": kw["clicks"],
|
||||
"estimated_traffic_gain": estimated_gain,
|
||||
"priority": "High" if kw["impressions"] > 500 else "Medium",
|
||||
"suggested_format": GSCBrainstormService._suggest_format(kw["keyword"]),
|
||||
})
|
||||
|
||||
opportunities.sort(key=lambda x: x["impressions"], reverse=True)
|
||||
return opportunities[:10]
|
||||
|
||||
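The two gain estimates in `_identify_content_opportunities` are easier to check as standalone arithmetic. The function names below are hypothetical; the 5% and 8% target rates and the floor of 5 clicks come from the code above.

```python
def optimization_gain(impressions: int, clicks: int) -> int:
    """Rule 1: clicks gained if CTR rose to 5%, reported as at least 5."""
    return max(int(impressions * 0.05) - clicks, 5)


def enhancement_gain(impressions: int) -> int:
    """Rule 2: clicks captured at an assumed 8% CTR after reaching page 1."""
    return int(impressions * 0.08)
```

For example, 2,010 impressions with 30 clicks gives a Rule 1 estimate of 70, and 260 impressions gives a Rule 2 estimate of 20. Note that the two rules are evaluated independently, so a keyword with more than 500 impressions, a CTR under 3% and a position between 11 and 20 satisfies both and is appended twice.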
@staticmethod
|
||||
def _identify_keyword_gaps(
|
||||
keywords_data: List[Dict[str, Any]],
|
||||
) -> List[Dict[str, Any]]:
|
||||
gaps: List[Dict[str, Any]] = []
|
||||
|
||||
for kw in keywords_data:
|
||||
if 4 <= kw["position"] <= 20 and kw["impressions"] >= 50:
|
||||
# Estimate the traffic gain if this keyword moved to position 1.
|
||||
# Assumed average CTRs: position 1 ~31%, position 3 ~11%; kw["ctr"] is the current CTR.
|
||||
position_1_ctr = 31.0
|
||||
current_ctr = kw["ctr"]
|
||||
estimated_gain = max(int(kw["impressions"] * (position_1_ctr - current_ctr) / 100), 1)
|
||||
|
||||
gaps.append({
|
||||
"keyword": kw["keyword"],
|
||||
"position": kw["position"],
|
||||
"impressions": kw["impressions"],
|
||||
"current_ctr": kw["ctr"],
|
||||
"clicks": kw["clicks"],
|
||||
"estimated_traffic_if_page1": estimated_gain,
|
||||
"gap_from_page1": round(kw["position"] - 3, 1),
|
||||
})
|
||||
|
||||
gaps.sort(key=lambda x: x["impressions"], reverse=True)
|
||||
return gaps[:10]
|
||||
|
||||
    @staticmethod
    def _identify_quick_wins(
        keywords_data: List[Dict[str, Any]],
    ) -> List[Dict[str, Any]]:
        """Keywords already on page 1 (positions 4-10) that could reach top 3
        with minor improvements — the highest-ROI opportunities."""
        quick_wins: List[Dict[str, Any]] = []

        for kw in keywords_data:
            if 4 <= kw["position"] <= 10 and kw["impressions"] >= 100:
                # Position 3 CTR ≈ 11%, position 5 CTR ≈ 6%
                # Small improvements can yield big traffic gains
                target_ctr = 11.0  # approximate CTR for position 3
                estimated_gain = max(int(kw["impressions"] * (target_ctr - kw["ctr"]) / 100), 1)

                quick_wins.append({
                    "keyword": kw["keyword"],
                    "position": kw["position"],
                    "impressions": kw["impressions"],
                    "current_ctr": kw["ctr"],
                    "clicks": kw["clicks"],
                    "estimated_traffic_gain": estimated_gain,
                    "reason": (
                        f"Already on page 1 at position #{kw['position']:.0f}. "
                        f"Optimizing this page could increase CTR from {kw['ctr']:.1f}% "
                        f"to ~{target_ctr:.0f}%, gaining ~{estimated_gain} clicks/month."
                    ),
                })

        quick_wins.sort(key=lambda x: x["estimated_traffic_gain"], reverse=True)
        return quick_wins[:5]

    @staticmethod
    def _identify_page_opportunities(
        pages_data: List[Dict[str, Any]],
    ) -> List[Dict[str, Any]]:
        """Pages with high impressions but low CTR — the content or meta needs work."""
        opportunities: List[Dict[str, Any]] = []

        for pg in pages_data:
            if pg["impressions"] > 300 and pg["ctr"] < 2.0:
                short_page = pg["page"].rstrip("/").rsplit("/", 1)[-1].replace("-", " ").title()
                if len(short_page) > 60:
                    short_page = short_page[:57] + "..."
                opportunities.append({
                    "page": pg["page"],
                    "page_title": short_page,
                    "impressions": pg["impressions"],
                    "clicks": pg["clicks"],
                    "current_ctr": pg["ctr"],
                    "current_position": pg["position"],
                    "reason": (
                        f"This page gets {pg['impressions']:,} impressions but only {pg['ctr']:.1f}% CTR. "
                        f"Reviewing the title and meta description could significantly boost clicks."
                    ),
                })

        opportunities.sort(key=lambda x: x["impressions"], reverse=True)
        return opportunities[:5]

    # ------------------------------------------------------------------ #
    # Content format suggestion
    # ------------------------------------------------------------------ #

    @staticmethod
    def _suggest_format(keyword: str) -> str:
        """Suggest a content format based on keyword patterns."""
        kw = keyword.lower()
        if any(w in kw for w in ["how to", "how do", "guide", "tutorial", "steps"]):
            return "How-To Guide"
        if any(w in kw for w in ["vs", "versus", "compare", "comparison", "difference"]):
            return "Comparison"
        if any(w in kw for w in ["best", "top", "recommended", "review", "reviews"]):
            return "Top Picks / Review"
        if any(w in kw for w in ["what is", "definition", "meaning", "explained"]):
            return "Explainer"
        if any(w in kw for w in ["list", "examples", "ideas", "tips", "ways"]):
            return "Listicle"
        if any(w in kw for w in ["free", "cheap", "alternative", "budget"]):
            return "Budget / Alternative"
        if any(w in kw for w in ["template", "calculator", "tool", "checker"]):
            return "Tool / Template"
        if any(w in kw for w in ["2024", "2025", "2026", "trends", "prediction", "future"]):
            return "Trend Report"
        return "In-Depth Article"

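One property of `_suggest_format` worth noting is that `w in kw` is a plain substring test, so short terms such as "top" or "ways" also match inside longer words. The sketch below contrasts that with a whole-word check; `contains_term` is a hypothetical helper, not something the commit defines, and whether the stricter match is wanted is a product decision.

```python
import re
from typing import List


def contains_term(keyword: str, terms: List[str]) -> bool:
    """True if any term occurs in keyword as a whole word or phrase."""
    kw = keyword.lower()
    return any(re.search(rf"\b{re.escape(t)}\b", kw) for t in terms)


top_terms = ["best", "top", "recommended", "review", "reviews"]

# Substring test, as in the diff: "top" is found inside "laptop"
print(any(t in "laptop repair near me" for t in top_terms))
# Whole-word test: "laptop" no longer counts as "top"
print(contains_term("laptop repair near me", top_terms))
# A genuine "top ..." query still matches
print(contains_term("top laptops for students", top_terms))
```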
    # ------------------------------------------------------------------ #
    # Summary metrics
    # ------------------------------------------------------------------ #

    @staticmethod
    def _compute_summary(
        keywords_data: List[Dict],
        pages_data: List[Dict],
        site_url: str,
        start_date: str,
        end_date: str,
    ) -> Dict[str, Any]:
        total_impressions = sum(kw["impressions"] for kw in keywords_data)
        total_clicks = sum(kw["clicks"] for kw in keywords_data)
        avg_ctr = round((total_clicks / total_impressions * 100) if total_impressions else 0, 2)
        avg_position = round(
            sum(kw["position"] for kw in keywords_data) / len(keywords_data), 1
        ) if keywords_data else 0

        pos_1_3 = len([kw for kw in keywords_data if kw["position"] <= 3])
        pos_4_10 = len([kw for kw in keywords_data if 3 < kw["position"] <= 10])
        pos_11_20 = len([kw for kw in keywords_data if 10 < kw["position"] <= 20])
        pos_21_plus = len([kw for kw in keywords_data if kw["position"] > 20])

        top_keywords = sorted(keywords_data, key=lambda x: x["impressions"], reverse=True)[:5]
        top_pages = sorted(pages_data, key=lambda x: x["clicks"], reverse=True)[:3]

        # Health score: 0-100 based on how many keywords are on page 1
        total_kw = len(keywords_data) or 1
        page1_pct = (pos_1_3 + pos_4_10) / total_kw * 100
        top3_pct = pos_1_3 / total_kw * 100
        health_score = round(min(top3_pct * 3 + page1_pct * 0.7, 100), 0)

        # CTR benchmark: industry average is ~3.1% for position 1-10
        ctr_benchmark = 3.1
        ctr_vs_benchmark = round(avg_ctr - ctr_benchmark, 2)

        return {
            "site_url": site_url,
            "date_range": {"start": start_date, "end": end_date},
            "total_keywords_analyzed": len(keywords_data),
            "total_impressions": total_impressions,
            "total_clicks": total_clicks,
            "avg_ctr": avg_ctr,
            "avg_position": avg_position,
            "ctr_vs_benchmark": ctr_vs_benchmark,
            "health_score": health_score,
            "keyword_distribution": {
                "positions_1_3": pos_1_3,
                "positions_4_10": pos_4_10,
                "positions_11_20": pos_11_20,
                "positions_21_plus": pos_21_plus,
            },
            "top_keywords": [
                {
                    "keyword": kw["keyword"],
                    "impressions": kw["impressions"],
                    "clicks": kw["clicks"],
                    "position": kw["position"],
                    "ctr": kw["ctr"],
                }
                for kw in top_keywords
            ],
            "top_pages": [
                {
                    "page": pg["page"],
                    "clicks": pg["clicks"],
                    "impressions": pg["impressions"],
                    "ctr": pg["ctr"],
                }
                for pg in top_pages
            ],
        }

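To make the health-score weighting concrete, the sketch below lifts the four lines of arithmetic out of `_compute_summary` into a standalone function. The name `health_score` as a free function and the sample counts are assumptions for illustration only.

```python
def health_score(pos_1_3: int, pos_4_10: int, total_keywords: int) -> float:
    """Same arithmetic as the diff: top-3 share weighted 3x, page-1 share 0.7x."""
    total = total_keywords or 1  # avoid division by zero on empty input
    page1_pct = (pos_1_3 + pos_4_10) / total * 100
    top3_pct = pos_1_3 / total * 100
    return round(min(top3_pct * 3 + page1_pct * 0.7, 100), 0)


# 100 keywords, 10 in the top 3 and 20 in positions 4-10:
# top3_pct = 10, page1_pct = 30, so 10 * 3 + 30 * 0.7 = 51
print(health_score(10, 20, 100))
# Top-3 keywords count in both terms (3.0 + 0.7), so the score hits the
# cap of 100 once roughly 27% of keywords rank in the top 3
print(health_score(40, 0, 100))
```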
    # ------------------------------------------------------------------ #
    # AI-powered strategic recommendations
    # ------------------------------------------------------------------ #

    def _generate_ai_recommendations(
        self,
        keywords_data: List[Dict],
        pages_data: List[Dict],
        summary: Dict,
        user_keywords: str,
        content_opportunities: List[Dict],
        quick_wins: List[Dict],
        keyword_gaps: List[Dict],
    ) -> Dict[str, Any]:
        try:
            top_kw_list = summary.get("top_keywords", [])
            top_kw_str = "\n".join(
                f" • {kw['keyword']}: {kw['impressions']:,} impressions, position {kw['position']}, {kw['ctr']:.1f}% CTR"
                for kw in top_kw_list[:10]
            )
            dist = summary.get("keyword_distribution", {})

            opp_str = ""
            if content_opportunities:
                opp_str = "\nCONTENT OPPORTUNITIES (rule-based findings):\n" + "\n".join(
                    f" • {o['keyword']}: {o['opportunity']}"
                    for o in content_opportunities[:5]
                )
            else:
                opp_str = "\nNo major content opportunities detected from rule-based analysis."

            qw_str = ""
            if quick_wins:
                qw_str = "\nQUICK WINS (already on page 1, easy to optimize):\n" + "\n".join(
                    f" • {q['keyword']}: position #{q['position']:.0f}, {q['current_ctr']:.1f}% CTR, est. +{q['estimated_traffic_gain']} clicks/month"
                    for q in quick_wins[:3]
                )

            prompt = f"""You are an expert SEO content strategist analyzing real Google Search Console data for a blog writer.

The user wants to write about: "{user_keywords}"

Here is their GSC data for the last 30 days:

PERFORMANCE OVERVIEW:
- Total Keywords: {summary.get('total_keywords_analyzed', 0)}
- Total Impressions: {summary.get('total_impressions', 0):,}
- Total Clicks: {summary.get('total_clicks', 0):,}
- Average CTR: {summary.get('avg_ctr', 0):.2f}% (industry avg for positions 1-10 is ~3.1%)
- Average Position: {summary.get('avg_position', 0):.1f}
- SEO Health Score: {summary.get('health_score', 0)}/100

TOP KEYWORDS BY IMPRESSIONS:
{top_kw_str}

KEYWORD POSITION DISTRIBUTION:
- Position 1-3 (top results): {dist.get('positions_1_3', 0)} keywords
- Position 4-10 (page 1): {dist.get('positions_4_10', 0)} keywords
- Position 11-20 (page 2): {dist.get('positions_11_20', 0)} keywords
- Position 21+ (page 3+): {dist.get('positions_21_plus', 0)} keywords
{opp_str}
{qw_str}

Based on this data, provide EXACT blog post suggestions the user should write.

For each suggestion include:
1. A specific, compelling blog post TITLE (not vague topic)
2. The keyword it targets and why (based on the data above)
3. The recommended content format (how-to, listicle, comparison, etc.)
4. Estimated impact (how many more clicks/month they could gain)

Return your response in this EXACT JSON format (no markdown, no code fences):
{{
"immediate_opportunities": [
{{
"title": "Specific Blog Post Title Here",
"keyword": "target keyword",
"reason": "Why this will work based on the data",
"format": "How-To Guide | Listicle | Comparison | Explainer | etc.",
"estimated_impact": "Estimated X more clicks/month"
}}
],
"content_strategy": [
{{
"title": "Pillar Content Title",
"keyword": "target keyword",
"reason": "Strategic reasoning",
"format": "Content format",
"estimated_impact": "Expected impact"
}}
],
"long_term_strategy": [
{{
"title": "Authority Building Title",
"keyword": "target keyword",
"reason": "Long-term reasoning",
"format": "Content format",
"estimated_impact": "Expected long-term impact"
}}
]
}}

IMPORTANT:
- Provide 3-5 items in each category
- Every suggestion MUST relate to the user's interest in "{user_keywords}"
- Titles should be specific and compelling, like real blog post headlines
- Use the data above to justify each recommendation
- Prioritize keywords with high impressions but low CTR or low position"""

            system_prompt = (
                "You are an expert SEO content strategist. You analyze Google Search Console data "
                "and provide specific, actionable blog post recommendations that will drive real traffic. "
                "You always respond with valid JSON matching the requested format. "
                "Every recommendation must be backed by the data provided."
            )

            result = llm_text_gen(
                prompt=prompt,
                system_prompt=system_prompt,
                user_id=getattr(self, '_user_id', None),
                flow_type="gsc_brainstorm",
            )

            if result:
                parsed = self._parse_ai_response(result)
                if parsed:
                    return parsed

            return self._fallback_ai_recommendations(keywords_data, content_opportunities, quick_wins)

        except Exception as e:
            logger.warning(f"GSC brainstorm AI recommendations failed: {e}")
            return self._fallback_ai_recommendations(keywords_data, content_opportunities, quick_wins)

    def _parse_ai_response(self, raw: str) -> Optional[Dict[str, Any]]:
        try:
            # Strip markdown code fences if present
            cleaned = raw.strip()
            if cleaned.startswith("```"):
                first_newline = cleaned.find("\n")
                if first_newline != -1:
                    cleaned = cleaned[first_newline + 1:]
                if cleaned.endswith("```"):
                    cleaned = cleaned[:-3].strip()

            json_start = cleaned.find("{")
            json_end = cleaned.rfind("}") + 1
            if json_start == -1 or json_end == 0:
                return None

            chunk = cleaned[json_start:json_end]
            parsed = json.loads(chunk)

            def normalize_section(section: Any) -> List[Dict[str, str]]:
                if not isinstance(section, list):
                    return []
                result = []
                for item in section:
                    if isinstance(item, str):
                        result.append({
                            "title": item.split(":")[0].strip() if ":" in item else item[:60],
                            "keyword": "",
                            "reason": item,
                            "format": "",
                            "estimated_impact": "",
                        })
                    elif isinstance(item, dict):
                        result.append({
                            "title": str(item.get("title", "")),
                            "keyword": str(item.get("keyword", "")),
                            "reason": str(item.get("reason", "")),
                            "format": str(item.get("format", "")),
                            "estimated_impact": str(item.get("estimated_impact", "")),
                        })
                return result

            return {
                "immediate_opportunities": normalize_section(parsed.get("immediate_opportunities", []))[:5],
                "content_strategy": normalize_section(parsed.get("content_strategy", []))[:5],
                "long_term_strategy": normalize_section(parsed.get("long_term_strategy", []))[:5],
            }
        except (json.JSONDecodeError, ValueError) as e:
            logger.warning(f"Failed to parse AI brainstorm response as JSON: {e}")
            return None

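The fence-stripping and brace-slicing steps in `_parse_ai_response` can be exercised on their own. The sketch below is a standalone approximation under stated assumptions: `extract_json_object` and `FENCE` are hypothetical names, and nesting the trailing-fence check inside the leading-fence check is a guess, since the diff view above has lost its indentation.

```python
import json
from typing import Any, Optional

FENCE = "`" * 3  # a markdown code fence, built indirectly to keep this block readable


def extract_json_object(raw: str) -> Optional[Any]:
    """Strip an optional markdown fence, then parse the outermost {...} span."""
    cleaned = raw.strip()
    if cleaned.startswith(FENCE):
        first_newline = cleaned.find("\n")
        if first_newline != -1:
            cleaned = cleaned[first_newline + 1:]  # drop the opening fence line
        if cleaned.endswith(FENCE):
            cleaned = cleaned[:-3].strip()  # drop the closing fence
    start = cleaned.find("{")
    end = cleaned.rfind("}") + 1
    if start == -1 or end == 0:
        return None
    try:
        return json.loads(cleaned[start:end])
    except json.JSONDecodeError:
        return None


fenced = FENCE + 'json\n{"a": [1, 2]}\n' + FENCE
print(extract_json_object(fenced))
print(extract_json_object('Here you go: {"x": 1} thanks'))
print(extract_json_object("no json here"))
```

Slicing from the first `{` to the last `}` tolerates chatty preambles, at the cost of failing when the reply contains two separate JSON objects.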
    @staticmethod
    def _fallback_ai_recommendations(
        keywords_data: List[Dict],
        content_opportunities: List[Dict],
        quick_wins: List[Dict],
    ) -> Dict[str, Any]:
        top_kw = keywords_data[:3] if keywords_data else []
        immediate = []

        # Build from quick wins first (highest ROI)
        for qw in quick_wins[:2]:
            immediate.append({
                "title": f"How to Rank #{int(qw['position'])} for '{qw['keyword']}' — Optimization Guide",
                "keyword": qw["keyword"],
                "reason": qw.get("reason", f"Already on page 1 at position {qw['position']:.0f}"),
                "format": "How-To Guide",
                "estimated_impact": f"+{qw.get('estimated_traffic_gain', 10)} clicks/month",
            })

        # Then from content opportunities
        for opp in content_opportunities[:2]:
            immediate.append({
                "title": f"Complete Guide to {opp['keyword'].title()}",
                "keyword": opp["keyword"],
                "reason": opp.get("opportunity", f"{opp['impressions']:,} impressions with room to improve"),
                "format": opp.get("suggested_format", "In-Depth Article"),
                "estimated_impact": f"+{opp.get('estimated_traffic_gain', 10)} clicks/month",
            })

        # Fill remaining with top keywords
        remaining = 5 - len(immediate)
        for kw in top_kw[:remaining]:
            immediate.append({
                "title": f"The Ultimate Guide to {kw['keyword'].title()}",
                "keyword": kw["keyword"],
                "reason": f"Top keyword with {kw['impressions']:,} impressions (position {kw['position']:.1f})",
                "format": "In-Depth Article",
                "estimated_impact": f"+{max(int(kw['impressions'] * 0.03), 5)} clicks/month",
            })

        return {
            "immediate_opportunities": immediate or [{"title": "No keyword data available", "keyword": "", "reason": "Connect GSC to get personalized suggestions", "format": "", "estimated_impact": ""}],
            "content_strategy": [
                {"title": "Topic Cluster: Build Authority Around Your Core Topics", "keyword": "", "reason": "Clustered content ranks higher and captures more long-tail queries", "format": "Pillar Page + Spokes", "estimated_impact": "+50-200 clicks/month over 3 months"},
                {"title": "Comparison Guide: Your Product vs. Alternatives", "keyword": "", "reason": "Comparison content captures high-intent searchers ready to decide", "format": "Comparison", "estimated_impact": "+20-80 clicks/month"},
                {"title": "FAQ: Answer What Your Audience Is Asking", "keyword": "", "reason": "FAQs capture featured snippets and voice search queries", "format": "FAQ / Listicle", "estimated_impact": "+30-100 clicks/month"},
            ],
            "long_term_strategy": [
                {"title": "Pillar Content: The Definitive Resource in Your Niche", "keyword": "", "reason": "Comprehensive guides become authoritative references that attract backlinks", "format": "Long-Form Guide", "estimated_impact": "+100-500 clicks/month over 6-12 months"},
                {"title": "Trend Report: What's Next in Your Industry", "keyword": "", "reason": "Forward-looking content captures emerging search demand early", "format": "Trend Report", "estimated_impact": "+50-200 clicks/month"},
                {"title": "Thought Leadership: Expert Roundup and Insights", "keyword": "", "reason": "Expert content builds E-E-A-T signals that improve overall domain authority", "format": "Expert Roundup", "estimated_impact": "+30-100 clicks/month per piece"},
            ],
        }
@@ -250,10 +250,10 @@ class GSCService:
            flow = Flow.from_client_config(
                self.client_config,
                scopes=self.scopes,
                redirect_uri=redirect_uri,
                autogenerate_code_verifier=False,
                redirect_uri=redirect_uri
            )


            # Use a custom state that includes user_id for routing the callback to the correct DB
            random_state = secrets.token_urlsafe(32)
            state = f"{user_id}:{random_state}"

@@ -300,7 +300,7 @@ class GSCService:
                logger.error(f"User database not found for user {user_id}")
                return False

            # Verify state in user's DB (but don't delete yet — delete after successful token exchange)
            # Verify state in user's DB
            with sqlite3.connect(db_path) as conn:
                cursor = conn.cursor()
                cursor.execute('SELECT user_id FROM gsc_oauth_states WHERE state = ?', (state,))
@@ -309,6 +309,10 @@ class GSCService:
                if not result:
                    logger.error(f"Invalid or expired GSC OAuth state for user {user_id}")
                    return False

                # Clean up state
                cursor.execute('DELETE FROM gsc_oauth_states WHERE state = ?', (state,))
                conn.commit()

            # Exchange code for credentials
            if not self.client_config:
@@ -318,22 +322,12 @@ class GSCService:
            flow = Flow.from_client_config(
                self.client_config,
                scopes=self.scopes,
                redirect_uri=os.getenv('GSC_REDIRECT_URI', 'http://localhost:8000/gsc/callback'),
                autogenerate_code_verifier=False,
                redirect_uri=os.getenv('GSC_REDIRECT_URI', 'http://localhost:8000/gsc/callback')
            )

            flow.fetch_token(code=authorization_code)
            credentials = flow.credentials

            # State consumed successfully — clean up
            try:
                with sqlite3.connect(db_path) as conn:
                    cursor = conn.cursor()
                    cursor.execute('DELETE FROM gsc_oauth_states WHERE state = ?', (state,))
                    conn.commit()
            except Exception as cleanup_err:
                logger.warning(f"Failed to clean up OAuth state: {cleanup_err}")

            # Save credentials
            return self.save_user_credentials(user_id, credentials)


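The two variants visible in the `GSCService` hunks differ in when the OAuth state row is deleted: right after it is verified, or only after the token exchange succeeds. The sketch below shows the verify-then-delete pattern against an in-memory SQLite database. The helper `consume_state` and the two-column table schema are assumptions made for illustration; only the table name and the two SQL statements come from the diff.

```python
import sqlite3


def consume_state(conn: sqlite3.Connection, state: str) -> bool:
    """Return True once for a stored state, then False on any replay."""
    cur = conn.cursor()
    cur.execute("SELECT user_id FROM gsc_oauth_states WHERE state = ?", (state,))
    if cur.fetchone() is None:
        return False  # unknown, expired, or already used
    cur.execute("DELETE FROM gsc_oauth_states WHERE state = ?", (state,))
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")
# Hypothetical schema: the real table may carry more columns
conn.execute("CREATE TABLE gsc_oauth_states (state TEXT PRIMARY KEY, user_id TEXT)")
conn.execute("INSERT INTO gsc_oauth_states VALUES (?, ?)", ("u1:abc", "u1"))
conn.commit()

print(consume_state(conn, "u1:abc"))  # first callback is accepted
print(consume_state(conn, "u1:abc"))  # replayed callback is rejected
```

Deleting before the exchange makes the state strictly single-use but forces a fresh authorization if the exchange fails; deleting afterwards allows a retry with the same state.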
@@ -1,9 +1,9 @@
"""
Hallucination Detector Service

Implements fact-checking using Exa.ai for evidence search and the
configured LLM provider (via GPT_PROVIDER) for claim extraction and assessment.
Respects GPT_PROVIDER env var: google, wavespeed, openai, huggingface.
This service implements fact-checking functionality using Exa.ai API
to detect and verify claims in AI-generated content, similar to the
Exa.ai demo implementation.
"""

import json
@@ -11,9 +11,15 @@ import logging
from typing import List, Dict, Any, Optional
from dataclasses import dataclass
from datetime import datetime
import requests
import os
import asyncio
import concurrent.futures
try:
    from google import genai
    GOOGLE_GENAI_AVAILABLE = True
except Exception:
    GOOGLE_GENAI_AVAILABLE = False

logger = logging.getLogger(__name__)

@@ -38,121 +44,70 @@ class HallucinationResult:
    insufficient_claims: int
    timestamp: str


def _get_llm_provider_info() -> Dict[str, str]:
    """Determine the LLM provider from GPT_PROVIDER env var."""
    provider_env = os.getenv('GPT_PROVIDER', 'google').lower().strip()
    provider = provider_env.split(',')[0].strip() if provider_env else 'google'

    if provider in ('wavespeed', 'wave'):
        return {'provider': 'wavespeed', 'name': 'WaveSpeed'}
    elif provider in ('gemini', 'google'):
        return {'provider': 'google', 'name': 'Gemini'}
    elif provider in ('openai', 'gpt'):
        return {'provider': 'openai', 'name': 'OpenAI'}
    elif provider in ('hf_response_api', 'huggingface', 'hf'):
        return {'provider': 'huggingface', 'name': 'HuggingFace'}
    else:
        return {'provider': provider, 'name': provider.capitalize()}


class HallucinationDetector:
    """
    Hallucination detector using Exa.ai for evidence search
    and the configured LLM provider (GPT_PROVIDER) for claim extraction/assessment.

    Implements the three-step process:
    Hallucination detector using Exa.ai for fact-checking.

    Implements the three-step process from Exa.ai demo:
    1. Extract verifiable claims from text
    2. Search for evidence using Exa.ai
    3. Verify claims against sources
    """


    def __init__(self):
        self._llm_provider_info = _get_llm_provider_info()

        # Check that at least one LLM key is available for the configured provider
        self._check_provider_keys()

        # Rate limiting
        self.exa_api_key = os.getenv('EXA_API_KEY')
        self.gemini_api_key = os.getenv('GEMINI_API_KEY')

        if not self.exa_api_key:
            logger.warning("EXA_API_KEY not found. Hallucination detection will be limited.")

        if not self.gemini_api_key:
            logger.warning("GEMINI_API_KEY not found. Falling back to heuristic claim extraction.")

        # Initialize Gemini client for claim extraction and assessment
        self.gemini_client = genai.Client(api_key=self.gemini_api_key) if (GOOGLE_GENAI_AVAILABLE and self.gemini_api_key) else None

        # Rate limiting to prevent API abuse
        self.daily_api_calls = 0
        self.daily_limit = 20
        self.daily_limit = 20  # Max 20 API calls per day for fact checking
        self.last_reset_date = None

    def _check_provider_keys(self):
        """Check that API keys for the configured provider are available."""
        provider = self._llm_provider_info['provider']
        if provider == 'google':
            key = os.getenv('GEMINI_API_KEY')
            if not key:
                logger.warning(f"GEMINI_API_KEY not found. Hallucination detection will fail for provider '{provider}'.")
        elif provider == 'wavespeed':
            key = os.getenv('WAVESPEED_API_KEY')
            if not key:
                logger.warning(f"WAVESPEED_API_KEY not found. Hallucination detection will fail for provider '{provider}'.")
        elif provider == 'openai':
            key = os.getenv('OPENAI_API_KEY')
            if not key:
                logger.warning(f"OPENAI_API_KEY not found. Hallucination detection will fail for provider '{provider}'.")
        # huggingface uses serverless endpoint or HF token

    @property
    def provider_name(self) -> str:
        return self._llm_provider_info['name']

    @property
    def provider_key(self) -> str:
        return self._llm_provider_info['provider']


    def _check_rate_limit(self) -> bool:
        """Check if we're within daily API usage limits."""
        from datetime import date

        today = date.today()

        # Reset counter if it's a new day
        if self.last_reset_date != today:
            self.daily_api_calls = 0
            self.last_reset_date = today

        # Check if we've exceeded the limit
        if self.daily_api_calls >= self.daily_limit:
            logger.warning(f"Daily API limit reached ({self.daily_limit} calls). Fact checking disabled for today.")
            return False

        # Increment counter for this API call
        self.daily_api_calls += 1
        logger.info(f"Fact check API call #{self.daily_api_calls}/{self.daily_limit} today")
        return True

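The daily counter in `_check_rate_limit` can be tested deterministically if the current date is injectable. The sketch below is a minimal standalone version under that assumption; the class name `DailyLimiter` and the `today` parameter are hypothetical and do not appear in the commit.

```python
from datetime import date
from typing import Optional


class DailyLimiter:
    """In-memory per-day call counter, mirroring the logic in the diff."""

    def __init__(self, daily_limit: int = 20) -> None:
        self.daily_limit = daily_limit
        self.calls = 0
        self.last_reset: Optional[date] = None

    def allow(self, today: Optional[date] = None) -> bool:
        today = today or date.today()
        if self.last_reset != today:  # first call of a new day resets the count
            self.calls = 0
            self.last_reset = today
        if self.calls >= self.daily_limit:
            return False
        self.calls += 1
        return True


limiter = DailyLimiter(daily_limit=2)
day1, day2 = date(2025, 1, 1), date(2025, 1, 2)
print([limiter.allow(day1) for _ in range(3)])  # third call on day 1 is refused
print(limiter.allow(day2))  # counter resets on the next day
```

Because the counter lives on the instance, it resets when the process restarts and is not shared between worker processes.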
    def _generate_text(self, prompt: str, system_prompt: Optional[str] = None, user_id: str = None) -> str:
        """Generate text using the configured LLM provider (respects GPT_PROVIDER)."""
        from services.llm_providers.main_text_generation import llm_text_gen

        result = llm_text_gen(
            prompt=prompt,
            system_prompt=system_prompt or "You are a precise fact-checking assistant. Respond only with valid JSON as instructed.",
            max_tokens=4000,
            user_id=user_id,
        )
        return result

    async def _generate_text_async(self, prompt: str, system_prompt: Optional[str] = None, user_id: str = None) -> str:
        """Async wrapper for _generate_text."""
        loop = asyncio.get_event_loop()
        with concurrent.futures.ThreadPoolExecutor() as executor:
            result = await loop.run_in_executor(
                executor,
                lambda: self._generate_text(prompt, system_prompt, user_id)
            )
        return result

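`_generate_text_async` builds a fresh `ThreadPoolExecutor` on every call. One alternative, assuming Python 3.9 or later, is `asyncio.to_thread`, which hands the blocking call to the event loop's default executor. In the sketch below, `blocking_generate` is a hypothetical stand-in for the synchronous LLM call; it is not the project's `llm_text_gen`.

```python
import asyncio
import time
from typing import List


def blocking_generate(prompt: str) -> str:
    """Stand-in for a synchronous, network-bound LLM call."""
    time.sleep(0.01)
    return f"echo: {prompt}"


async def generate_async(prompt: str) -> str:
    # Runs the blocking function in a worker thread without blocking the loop
    return await asyncio.to_thread(blocking_generate, prompt)


async def main() -> List[str]:
    # Both calls run concurrently; gather preserves argument order
    return await asyncio.gather(generate_async("a"), generate_async("b"))


results = asyncio.run(main())
print(results)
```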
    async def detect_hallucinations(self, text: str, user_id: str = None) -> HallucinationResult:

    async def detect_hallucinations(self, text: str) -> HallucinationResult:
        """
        Main method to detect hallucinations in the given text.


        Args:
            text: The text to analyze for factual accuracy


        Returns:
            HallucinationResult with claims analysis and confidence scores
        """
        try:
            logger.info(f"Starting hallucination detection for text of length: {len(text)}")
            logger.info(f"Text sample: {text[:200]}...")


            # Check rate limits first
            if not self._check_rate_limit():
                return HallucinationResult(
                    claims=[],
@@ -163,11 +118,17 @@ class HallucinationDetector:
                    insufficient_claims=0,
                    timestamp=datetime.now().isoformat()
                )


            # Validate required API keys
            if not self.gemini_api_key:
                raise Exception("GEMINI_API_KEY not configured. Cannot perform hallucination detection.")
            if not self.exa_api_key:
                raise Exception("EXA_API_KEY not configured. Cannot search for evidence.")

            # Step 1: Extract claims from text
            claims_texts = await self._extract_claims(text, user_id=user_id)
            claims_texts = await self._extract_claims(text)
            logger.info(f"Extracted {len(claims_texts)} claims from text: {claims_texts}")


            if not claims_texts:
                logger.warning("No verifiable claims found in text")
                return HallucinationResult(
@@ -179,18 +140,22 @@ class HallucinationDetector:
                    insufficient_claims=0,
                    timestamp=datetime.now().isoformat()
                )

            # Step 2 & 3: Verify claims in batch
            verified_claims = await self._verify_claims_batch(claims_texts, user_id=user_id)


            # Step 2 & 3: Verify claims in batch to reduce API calls
            verified_claims = await self._verify_claims_batch(claims_texts)

            # Calculate overall metrics
            total_claims = len(verified_claims)
            supported_claims = sum(1 for c in verified_claims if c.assessment == "supported")
            refuted_claims = sum(1 for c in verified_claims if c.assessment == "refuted")
            insufficient_claims = sum(1 for c in verified_claims if c.assessment == "insufficient_information")

            overall_confidence = sum(c.confidence for c in verified_claims) / total_claims if total_claims > 0 else 0.0


            # Calculate overall confidence (weighted average)
            if total_claims > 0:
                overall_confidence = sum(c.confidence for c in verified_claims) / total_claims
            else:
                overall_confidence = 0.0

            result = HallucinationResult(
                claims=verified_claims,
                overall_confidence=overall_confidence,
@@ -200,67 +165,120 @@ class HallucinationDetector:
                insufficient_claims=insufficient_claims,
                timestamp=datetime.now().isoformat()
            )


            logger.info(f"Hallucination detection completed. Overall confidence: {overall_confidence:.2f}")
            return result


        except Exception as e:
            logger.error(f"Error in hallucination detection: {str(e)}")
            raise Exception(f"Hallucination detection failed: {str(e)}")

    async def _extract_claims(self, text: str, user_id: str = None) -> List[str]:
        """Extract verifiable claims from text using LLM."""

    async def _extract_claims(self, text: str) -> List[str]:
        """
        Extract verifiable claims from text using LLM.

        Args:
            text: Input text to extract claims from

        Returns:
            List of claim strings
        """
        if not self.gemini_client:
            raise Exception("Gemini client not available. Cannot extract claims without AI provider.")

        try:
            prompt = (
                "Extract verifiable factual claims from the following text. "
                "A verifiable claim is a statement that can be checked against external sources for accuracy.\n\n"
                "Return ONLY a valid JSON array of strings, where each string is a single verifiable claim.\n\n"
                "Examples of GOOD verifiable claims:\n"
                '- "The company was founded in 2020"\n'
                '- "Sales increased by 25% last quarter"\n'
                '- "The product has 10,000 users"\n\n'
                "- \"The company was founded in 2020\"\n"
                "- \"Sales increased by 25% last quarter\"\n"
                "- \"The product has 10,000 users\"\n"
                "- \"The market size is $50 billion\"\n"
                "- \"The software supports 15 languages\"\n"
                "- \"The company has offices in 5 countries\"\n\n"
                "Examples of BAD claims (opinions, subjective statements):\n"
                '- "This is the best product"\n'
                '- "Customers love our service"\n\n'
                "- \"This is the best product\"\n"
                "- \"Customers love our service\"\n"
                "- \"We are innovative\"\n"
                "- \"The future looks bright\"\n\n"
                "IMPORTANT: Extract at least 2-3 verifiable claims if possible. "
                "Look for specific facts, numbers, dates, locations, and measurable statements.\n\n"
                f"Text to analyze: {text}\n\n"
                "Return only the JSON array of verifiable claims:"
            )

            result_text = await self._generate_text_async(prompt, user_id=user_id)
            logger.info(f"Raw LLM response for claims: {result_text[:200]}...")

            claims = self._parse_json_from_response(result_text, expect_array=True)


            loop = asyncio.get_event_loop()
            with concurrent.futures.ThreadPoolExecutor() as executor:
                resp = await loop.run_in_executor(executor, lambda: self.gemini_client.models.generate_content(
                    model="gemini-1.5-flash",
                    contents=prompt
                ))

            if not resp or not resp.text:
                raise Exception("Empty response from Gemini API")

            claims_text = resp.text.strip()
            logger.info(f"Raw Gemini response for claims: {claims_text[:200]}...")

            # Try to extract JSON from the response
            try:
                claims = json.loads(claims_text)
            except json.JSONDecodeError:
                # Try to find JSON array in the response (handle markdown code blocks)
                import re
                # First try to extract from markdown code blocks
                code_block_match = re.search(r'```(?:json)?\s*(\[.*?\])\s*```', claims_text, re.DOTALL)
                if code_block_match:
                    claims = json.loads(code_block_match.group(1))
                else:
                    # Try to find JSON array directly
                    json_match = re.search(r'\[.*?\]', claims_text, re.DOTALL)
                    if json_match:
                        claims = json.loads(json_match.group())
                    else:
                        raise Exception(f"Could not parse JSON from Gemini response: {claims_text[:100]}")

            if isinstance(claims, list):
                valid_claims = [claim for claim in claims if isinstance(claim, str) and claim.strip()]
                logger.info(f"Successfully extracted {len(valid_claims)} claims")
                return valid_claims
            else:
                raise Exception(f"Expected JSON array, got: {type(claims)}")


        except Exception as e:
            logger.error(f"Error extracting claims: {str(e)}")
            raise Exception(f"Failed to extract claims: {str(e)}")

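The fallback pattern `\[.*?\]` in the claim-extraction code is non-greedy, so it stops at the first `]` it meets, even when that bracket sits inside a claim string. The sketch below shows the effect and one possible alternative built on the standard library's `json.JSONDecoder.raw_decode`. The helper `first_json_array` and the sample reply are assumptions for illustration, not code from the commit.

```python
import json
import re
from typing import Optional


def first_json_array(text: str) -> Optional[list]:
    """Return the first JSON array that parses, trying each '[' in turn."""
    decoder = json.JSONDecoder()
    for match in re.finditer(r"\[", text):
        try:
            value, _end = decoder.raw_decode(text, match.start())
        except json.JSONDecodeError:
            continue  # this '[' did not start valid JSON; try the next one
        if isinstance(value, list):
            return value
    return None


reply = 'Sure, here are the claims: ["Revenue grew 25% [Q3]", "Founded in 2020"] Hope that helps.'

# The non-greedy regex cuts the array off at the bracket inside the first claim
print(re.search(r"\[.*?\]", reply, re.DOTALL).group())
# The decoder-based helper returns the full list
print(first_json_array(reply))
```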
    async def _verify_claims_batch(self, claims: List[str], user_id: str = None) -> List[Claim]:
        """Verify multiple claims in batch to reduce API calls."""


    async def _verify_claims_batch(self, claims: List[str]) -> List[Claim]:
        """
        Verify multiple claims in batch to reduce API calls.

        Args:
            claims: List of claims to verify

        Returns:
            List of Claim objects with verification results
        """
        try:
            logger.info(f"Starting batch verification of {len(claims)} claims")

            # Limit to maximum 3 claims to prevent excessive API usage
            max_claims = min(len(claims), 3)
            claims_to_verify = claims[:max_claims]


            if len(claims) > max_claims:
                logger.warning(f"Limited verification to {max_claims} claims to prevent API rate limits")

            # Step 1: Search for evidence
            all_sources = await self._search_evidence_batch(claims_to_verify, user_id=user_id)

            # Step 2: Assess claims against sources
            verified_claims = await self._assess_claims_batch(claims_to_verify, all_sources, user_id=user_id)

            # Add remaining claims as insufficient information

            # Step 1: Search for evidence for all claims in one batch
            all_sources = await self._search_evidence_batch(claims_to_verify)

            # Step 2: Assess all claims against sources in one API call
            verified_claims = await self._assess_claims_batch(claims_to_verify, all_sources)

            # Add any remaining claims as insufficient information
            for i in range(max_claims, len(claims)):
                verified_claims.append(Claim(
                    text=claims[i],
@@ -270,12 +288,13 @@ class HallucinationDetector:
                    refuting_sources=[],
                    reasoning="Not verified due to API rate limit protection"
                ))


            logger.info(f"Batch verification completed for {len(verified_claims)} claims")
            return verified_claims


        except Exception as e:
            logger.error(f"Error in batch verification: {str(e)}")
|
||||
# Return all claims as insufficient information
|
||||
return [
|
||||
Claim(
|
||||
text=claim,
|
||||
@@ -288,11 +307,20 @@ class HallucinationDetector:
|
||||
for claim in claims
|
||||
]
|
||||
|
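Note on the batch cap: only the first three claims are verified, and the remainder are returned with a rate-limit reasoning string. A minimal sketch of that split, assuming a simplified `ClaimResult` dataclass (hypothetical; the project's `Claim` model is not shown in full here, and the `confidence` and `assessment` defaults are assumptions because those lines are elided from the hunk):

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

MAX_CLAIMS = 3  # cap used by the batch verifier in the diff


@dataclass
class ClaimResult:
    """Hypothetical, simplified stand-in for the project's Claim model."""
    text: str
    confidence: float = 0.0  # assumed default
    assessment: str = "insufficient_information"  # assumed default
    supporting_sources: List[Dict[str, Any]] = field(default_factory=list)
    refuting_sources: List[Dict[str, Any]] = field(default_factory=list)
    reasoning: str = ""


def split_claims(claims: List[str]) -> Tuple[List[str], List[str]]:
    """Return (claims to verify, claims skipped because of the cap)."""
    return claims[:MAX_CLAIMS], claims[MAX_CLAIMS:]


def mark_unverified(claims: List[str]) -> List[ClaimResult]:
    """Wrap skipped claims so the caller still gets one result per claim."""
    return [
        ClaimResult(text=c, reasoning="Not verified due to API rate limit protection")
        for c in claims
    ]
```

Returning a placeholder for every skipped claim keeps the output the same length as the input, which is what the loop over `range(max_claims, len(claims))` achieves.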
||||
async def _verify_claim(self, claim: str, user_id: str = None) -> Claim:
|
||||
"""Verify a single claim using Exa.ai search."""
|
||||
async def _verify_claim(self, claim: str) -> Claim:
|
||||
"""
|
||||
Verify a single claim using Exa.ai search.
|
||||
|
||||
Args:
|
||||
claim: The claim to verify
|
||||
|
||||
Returns:
|
||||
Claim object with verification results
|
||||
"""
|
||||
try:
|
||||
sources = await self._search_evidence(claim, user_id=user_id)
|
||||
|
||||
# Search for evidence using Exa.ai
|
||||
sources = await self._search_evidence(claim)
|
||||
|
||||
if not sources:
|
||||
return Claim(
|
||||
text=claim,
|
||||
@@ -302,9 +330,10 @@ class HallucinationDetector:
|
||||
refuting_sources=[],
|
||||
reasoning="No sources found for verification"
|
||||
)
|
||||
|
||||
verification_result = await self._assess_claim_against_sources(claim, sources, user_id=user_id)
|
||||
|
||||
|
||||
# Verify claim against sources using LLM
|
||||
verification_result = await self._assess_claim_against_sources(claim, sources)
|
||||
|
||||
return Claim(
|
||||
text=claim,
|
||||
confidence=verification_result.get('confidence', 0.5),
|
||||
@@ -313,7 +342,7 @@ class HallucinationDetector:
|
||||
refuting_sources=verification_result.get('refuting_sources', []),
|
||||
reasoning=verification_result.get('reasoning', '')
|
||||
)
|
||||
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error verifying claim '{claim}': {str(e)}")
|
||||
return Claim(
|
||||
@@ -324,50 +353,68 @@ class HallucinationDetector:
|
||||
refuting_sources=[],
|
||||
reasoning=f"Error during verification: {str(e)}"
|
||||
)
|
||||
|
||||
async def _search_evidence_batch(self, claims: List[str], user_id: str = None) -> List[Dict[str, Any]]:
|
||||
"""Search for evidence for multiple claims in one API call."""
|
||||
|
||||
async def _search_evidence_batch(self, claims: List[str]) -> List[Dict[str, Any]]:
|
||||
"""
|
||||
Search for evidence for multiple claims in one API call.
|
||||
|
||||
Args:
|
||||
claims: List of claims to search for
|
||||
|
||||
Returns:
|
||||
List of sources relevant to the claims
|
||||
"""
|
||||
try:
|
||||
combined_query = " ".join(claims[:2])
|
||||
# Combine all claims into one search query
|
||||
combined_query = " ".join(claims[:2]) # Use first 2 claims to avoid query length limits
|
||||
|
||||
logger.info(f"Searching for evidence for {len(claims)} claims with combined query")
|
||||
sources = await self._search_evidence(combined_query, user_id=user_id)
|
||||
|
||||
|
||||
# Use the existing search method with combined query
|
||||
sources = await self._search_evidence(combined_query)
|
||||
|
||||
# Limit sources to prevent excessive processing
|
||||
max_sources = 5
|
||||
if len(sources) > max_sources:
|
||||
sources = sources[:max_sources]
|
||||
logger.info(f"Limited sources to {max_sources} to prevent API rate limits")
|
||||
|
||||
|
||||
return sources
|
||||
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error in batch evidence search: {str(e)}")
|
||||
return []
|
||||
|
||||
def _map_source_refs_from_reasoning(self, reasoning: str, sources: List[Dict[str, Any]]) -> List[int]:
|
||||
"""Parse 'Source N' references from reasoning text and return 0-based indices."""
|
||||
import re
|
||||
indices = set()
|
||||
for match in re.finditer(r'Source\s+(\d+)', reasoning):
|
||||
ref = int(match.group(1))
|
||||
if 1 <= ref <= len(sources):
|
||||
indices.add(ref - 1) # convert 1-based → 0-based
|
||||
return sorted(indices)
|
||||
|
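Note on the fallback parser: it reads `Source N` as a 1-based label and converts it to a 0-based index. A standalone sketch of the same logic (the function name `map_source_refs` is illustrative, not part of the diff):

```python
import re
from typing import Any, Dict, List


def map_source_refs(reasoning: str, sources: List[Dict[str, Any]]) -> List[int]:
    """Return sorted, de-duplicated 0-based indices for in-range 'Source N' mentions."""
    indices = set()
    for match in re.finditer(r"Source\s+(\d+)", reasoning):
        ref = int(match.group(1))
        if 1 <= ref <= len(sources):  # ignore references past the end of the list
            indices.add(ref - 1)
    return sorted(indices)


if __name__ == "__main__":
    three_sources = [{}, {}, {}]
    print(map_source_refs("Source 1 and Source 3 agree", three_sources))
    print(map_source_refs("Source [0] agrees", three_sources))
```

The second call returns an empty list, because the pattern requires a digit directly after the whitespace. If the model echoes the bracketed `Source [0]` labels used by one of the prompt variants in this diff, this fallback would probably not fire.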
||||
async def _assess_claims_batch(self, claims: List[str], sources: List[Dict[str, Any]], user_id: str = None) -> List[Claim]:
|
||||
"""Assess multiple claims against sources in one LLM call."""
|
||||
async def _assess_claims_batch(self, claims: List[str], sources: List[Dict[str, Any]]) -> List[Claim]:
|
||||
"""
|
||||
Assess multiple claims against sources in one API call.
|
||||
|
||||
Args:
|
||||
claims: List of claims to assess
|
||||
sources: List of sources to assess against
|
||||
|
||||
Returns:
|
||||
List of Claim objects with assessment results
|
||||
"""
|
||||
if not self.gemini_client:
|
||||
raise Exception("Gemini client not available. Cannot assess claims without AI provider.")
|
||||
|
||||
try:
|
||||
# Limit to 3 claims to prevent excessive API usage
|
||||
claims_to_assess = claims[:3]
|
||||
|
||||
|
||||
# Prepare sources text
|
||||
combined_sources = "\n\n".join([
|
||||
f"Source [{i}]: {src.get('url','')}\nText: {src.get('text','')[:1000]}"
|
||||
f"Source {i+1}: {src.get('url','')}\nText: {src.get('text','')[:1000]}"
|
||||
for i, src in enumerate(sources)
|
||||
])
|
||||
|
||||
|
||||
# Prepare claims text
|
||||
claims_text = "\n".join([
|
||||
f"Claim {i}: {claim}"
|
||||
f"Claim {i+1}: {claim}"
|
||||
for i, claim in enumerate(claims_to_assess)
|
||||
])
|
||||
|
||||
|
||||
prompt = (
|
||||
"You are a strict fact-checker. Analyze each claim against the provided sources.\n\n"
|
||||
"Return ONLY a valid JSON object with this exact structure:\n"
|
||||
@@ -377,57 +424,73 @@ class HallucinationDetector:
|
||||
' "claim_index": 0,\n'
|
||||
' "assessment": "supported" or "refuted" or "insufficient_information",\n'
|
||||
' "confidence": number between 0.0 and 1.0,\n'
|
||||
' "supporting_sources": [array of 0-based source indices, e.g. [0, 2] for Source [0] and Source [2]],\n'
|
||||
' "refuting_sources": [array of 0-based source indices, e.g. [1] for Source [1]],\n'
|
||||
' "supporting_sources": [array of source indices that support the claim],\n'
|
||||
' "refuting_sources": [array of source indices that refute the claim],\n'
|
||||
' "reasoning": "brief explanation of your assessment"\n'
|
||||
' }\n'
|
||||
' ]\n'
|
||||
"}\n\n"
|
||||
"IMPORTANT: Source indices are 0-based. Source [0] is the first source, Source [1] is the second, etc.\n"
|
||||
"For every 'supported' or 'refuted' claim you MUST include the relevant source indices.\n\n"
|
||||
f"Claims to verify:\n{claims_text}\n\n"
|
||||
f"Sources:\n{combined_sources}\n\n"
|
||||
"Return only the JSON object:"
|
||||
)
|
||||
|
||||
result_text = await self._generate_text_async(prompt, user_id=user_id)
|
||||
logger.info(f"Raw LLM response for batch assessment: {result_text[:200]}...")
|
||||
|
||||
result = self._parse_json_from_response(result_text, expect_array=False)
|
||||
|
||||
|
||||
loop = asyncio.get_event_loop()
|
||||
with concurrent.futures.ThreadPoolExecutor() as executor:
|
||||
resp = await loop.run_in_executor(executor, lambda: self.gemini_client.models.generate_content(
|
||||
model="gemini-1.5-flash",
|
||||
contents=prompt
|
||||
))
|
||||
|
||||
if not resp or not resp.text:
|
||||
raise Exception("Empty response from Gemini API for batch assessment")
|
||||
|
||||
result_text = resp.text.strip()
|
||||
logger.info(f"Raw Gemini response for batch assessment: {result_text[:200]}...")
|
||||
|
||||
# Try to extract JSON from the response
|
||||
try:
|
||||
result = json.loads(result_text)
|
||||
except json.JSONDecodeError:
|
||||
# Try to find JSON object in the response (handle markdown code blocks)
|
||||
import re
|
||||
code_block_match = re.search(r'```(?:json)?\s*(\{.*?\})\s*```', result_text, re.DOTALL)
|
||||
if code_block_match:
|
||||
result = json.loads(code_block_match.group(1))
|
||||
else:
|
||||
json_match = re.search(r'\{.*?\}', result_text, re.DOTALL)
|
||||
if json_match:
|
||||
result = json.loads(json_match.group())
|
||||
else:
|
||||
raise Exception(f"Could not parse JSON from Gemini response: {result_text[:100]}")
|
||||
|
||||
# Process assessments
|
||||
assessments = result.get('assessments', [])
|
||||
verified_claims = []
|
||||
|
||||
|
||||
for i, claim in enumerate(claims_to_assess):
|
||||
# Find assessment for this claim
|
||||
assessment = None
|
||||
for a in assessments:
|
||||
if a.get('claim_index') == i:
|
||||
assessment = a
|
||||
break
|
||||
|
||||
|
||||
if assessment:
|
||||
# Process supporting and refuting sources
|
||||
supporting_sources = []
|
||||
refuting_sources = []
|
||||
|
||||
|
||||
if isinstance(assessment.get('supporting_sources'), list):
|
||||
for idx in assessment['supporting_sources']:
|
||||
if isinstance(idx, int) and 0 <= idx < len(sources):
|
||||
supporting_sources.append(sources[idx])
|
||||
|
||||
|
||||
if isinstance(assessment.get('refuting_sources'), list):
|
||||
for idx in assessment['refuting_sources']:
|
||||
if isinstance(idx, int) and 0 <= idx < len(sources):
|
||||
refuting_sources.append(sources[idx])
|
||||
|
||||
# Fallback: parse "Source N" from reasoning text when LLM omits indices
|
||||
if not supporting_sources and not refuting_sources and sources and assessment.get('reasoning'):
|
||||
ref_indices = self._map_source_refs_from_reasoning(assessment.get('reasoning', ''), sources)
|
||||
if ref_indices:
|
||||
if assessment.get('assessment') == 'supported':
|
||||
supporting_sources = [sources[i] for i in ref_indices]
|
||||
elif assessment.get('assessment') == 'refuted':
|
||||
refuting_sources = [sources[i] for i in ref_indices]
|
||||
|
||||
|
||||
verified_claims.append(Claim(
|
||||
text=claim,
|
||||
confidence=float(assessment.get('confidence', 0.5)),
|
||||
@@ -437,6 +500,7 @@ class HallucinationDetector:
|
||||
reasoning=assessment.get('reasoning', '')
|
||||
))
|
||||
else:
|
||||
# No assessment found for this claim
|
||||
verified_claims.append(Claim(
|
||||
text=claim,
|
||||
confidence=0.0,
|
||||
@@ -445,12 +509,13 @@ class HallucinationDetector:
|
||||
refuting_sources=[],
|
||||
reasoning="No assessment provided"
|
||||
))
|
||||
|
||||
|
||||
logger.info(f"Successfully assessed {len(verified_claims)} claims in batch")
|
||||
return verified_claims
|
||||
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error in batch assessment: {str(e)}")
|
||||
# Return all claims as insufficient information
|
||||
return [
|
||||
Claim(
|
||||
text=claim,
|
||||
@@ -463,95 +528,166 @@ class HallucinationDetector:
|
||||
for claim in claims_to_assess
|
||||
]
|
||||
|
||||
async def _search_evidence(self, claim: str, user_id: str = None) -> List[Dict[str, Any]]:
|
||||
"""Search for evidence using ExaResearchProvider with subscription checks."""
|
||||
async def _search_evidence(self, claim: str) -> List[Dict[str, Any]]:
|
||||
"""
|
||||
Search for evidence using Exa.ai API.
|
||||
|
||||
Args:
|
||||
claim: The claim to search evidence for
|
||||
|
||||
Returns:
|
||||
List of source documents with evidence
|
||||
"""
|
||||
if not self.exa_api_key:
|
||||
raise Exception("Exa API key not available. Cannot search for evidence without Exa.ai access.")
|
||||
|
||||
try:
|
||||
from services.blog_writer.research.exa_provider import ExaResearchProvider
|
||||
provider = ExaResearchProvider()
|
||||
sources = await provider.simple_search(
|
||||
query=claim,
|
||||
num_results=5,
|
||||
user_id=user_id,
|
||||
headers = {
|
||||
'x-api-key': self.exa_api_key,
|
||||
'Content-Type': 'application/json'
|
||||
}
|
||||
|
||||
payload = {
|
||||
'query': claim,
|
||||
'numResults': 5,
|
||||
'text': True,
|
||||
'useAutoprompt': True
|
||||
}
|
||||
|
||||
response = requests.post(
|
||||
'https://api.exa.ai/search',
|
||||
headers=headers,
|
||||
json=payload,
|
||||
timeout=15
|
||||
)
|
||||
if not sources:
|
||||
raise Exception(f"No search results found for claim: {claim}")
|
||||
logger.info(f"Found {len(sources)} sources for claim: {claim[:50]}...")
|
||||
return sources
|
||||
|
||||
if response.status_code == 200:
|
||||
data = response.json()
|
||||
results = data.get('results', [])
|
||||
|
||||
if not results:
|
||||
raise Exception(f"No search results found for claim: {claim}")
|
||||
|
||||
sources = []
|
||||
for result in results:
|
||||
source = {
|
||||
'title': result.get('title', 'Untitled'),
|
||||
'url': result.get('url', ''),
|
||||
'text': result.get('text', ''),
|
||||
'publishedDate': result.get('publishedDate', ''),
|
||||
'author': result.get('author', ''),
|
||||
'score': result.get('score', 0.5)
|
||||
}
|
||||
sources.append(source)
|
||||
|
||||
logger.info(f"Found {len(sources)} sources for claim: {claim[:50]}...")
|
||||
return sources
|
||||
else:
|
||||
raise Exception(f"Exa API error: {response.status_code} - {response.text}")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error searching evidence with Exa: {str(e)}")
|
||||
raise Exception(f"Failed to search evidence: {str(e)}")
|
||||
|
||||
async def _assess_claim_against_sources(self, claim: str, sources: List[Dict[str, Any]], user_id: str = None) -> Dict[str, Any]:
|
||||
"""Assess whether sources support or refute the claim using LLM."""
|
||||
|
||||
|
||||
async def _assess_claim_against_sources(self, claim: str, sources: List[Dict[str, Any]]) -> Dict[str, Any]:
|
||||
"""
|
||||
Assess whether sources support or refute the claim using LLM.
|
||||
|
||||
Args:
|
||||
claim: The claim to assess
|
||||
sources: List of source documents
|
||||
|
||||
Returns:
|
||||
Dictionary with assessment results
|
||||
"""
|
||||
if not self.gemini_client:
|
||||
raise Exception("Gemini client not available. Cannot assess claims without AI provider.")
|
||||
|
||||
try:
|
||||
combined_sources = "\n\n".join([
|
||||
f"Source [{i}]: {src.get('url','')}\nText: {src.get('text','')[:2000]}"
|
||||
f"Source {i+1}: {src.get('url','')}\nText: {src.get('text','')[:2000]}"
|
||||
for i, src in enumerate(sources)
|
||||
])
|
||||
|
||||
|
||||
prompt = (
|
||||
"You are a strict fact-checker. Analyze the claim against the provided sources.\n\n"
|
||||
"Return ONLY a valid JSON object with this exact structure:\n"
|
||||
"{\n"
|
||||
' "assessment": "supported" or "refuted" or "insufficient_information",\n'
|
||||
' "confidence": number between 0.0 and 1.0,\n'
|
||||
' "supporting_sources": [array of 0-based source indices, e.g. [0, 2] for Source [0] and Source [2]],\n'
|
||||
' "refuting_sources": [array of 0-based source indices, e.g. [1] for Source [1]],\n'
|
||||
' "supporting_sources": [array of source indices that support the claim],\n'
|
||||
' "refuting_sources": [array of source indices that refute the claim],\n'
|
||||
' "reasoning": "brief explanation of your assessment"\n'
|
||||
"}\n\n"
|
||||
"IMPORTANT: Source indices are 0-based. Source [0] is the first source, Source [1] is the second, etc.\n"
|
||||
"For 'supported' or 'refuted' you MUST include the relevant source indices.\n\n"
|
||||
f"Claim to verify: {claim}\n\n"
|
||||
f"Sources:\n{combined_sources}\n\n"
|
||||
"Return only the JSON object:"
|
||||
)
|
||||
|
||||
result_text = await self._generate_text_async(prompt, user_id=user_id)
|
||||
logger.info(f"Raw LLM response for assessment: {result_text[:200]}...")
|
||||
|
||||
result = self._parse_json_from_response(result_text, expect_array=False)
|
||||
|
||||
|
||||
loop = asyncio.get_event_loop()
|
||||
with concurrent.futures.ThreadPoolExecutor() as executor:
|
||||
resp = await loop.run_in_executor(executor, lambda: self.gemini_client.models.generate_content(
|
||||
model="gemini-1.5-flash",
|
||||
contents=prompt
|
||||
))
|
||||
|
||||
if not resp or not resp.text:
|
||||
raise Exception("Empty response from Gemini API for claim assessment")
|
||||
|
||||
result_text = resp.text.strip()
|
||||
logger.info(f"Raw Gemini response for assessment: {result_text[:200]}...")
|
||||
|
||||
# Try to extract JSON from the response
|
||||
try:
|
||||
result = json.loads(result_text)
|
||||
except json.JSONDecodeError:
|
||||
# Try to find JSON object in the response (handle markdown code blocks)
|
||||
import re
|
||||
# First try to extract from markdown code blocks
|
||||
code_block_match = re.search(r'```(?:json)?\s*(\{.*?\})\s*```', result_text, re.DOTALL)
|
||||
if code_block_match:
|
||||
result = json.loads(code_block_match.group(1))
|
||||
else:
|
||||
# Try to find JSON object directly
|
||||
json_match = re.search(r'\{.*?\}', result_text, re.DOTALL)
|
||||
if json_match:
|
||||
result = json.loads(json_match.group())
|
||||
else:
|
||||
raise Exception(f"Could not parse JSON from Gemini response: {result_text[:100]}")
|
||||
|
||||
# Validate required fields
|
||||
required_fields = ['assessment', 'confidence', 'supporting_sources', 'refuting_sources', 'reasoning']
|
||||
for field in required_fields:
|
||||
if field not in result:
|
||||
raise Exception(f"Missing required field '{field}' in assessment response")
|
||||
|
||||
|
||||
# Process supporting and refuting sources
|
||||
supporting_sources = []
|
||||
refuting_sources = []
|
||||
|
||||
|
||||
if isinstance(result.get('supporting_sources'), list):
|
||||
for idx in result['supporting_sources']:
|
||||
if isinstance(idx, int) and 0 <= idx < len(sources):
|
||||
supporting_sources.append(sources[idx])
|
||||
|
||||
|
||||
if isinstance(result.get('refuting_sources'), list):
|
||||
for idx in result['refuting_sources']:
|
||||
if isinstance(idx, int) and 0 <= idx < len(sources):
|
||||
refuting_sources.append(sources[idx])
|
||||
|
||||
# Fallback: parse "Source N" from reasoning text when LLM omits indices
|
||||
if not supporting_sources and not refuting_sources and sources and result.get('reasoning'):
|
||||
ref_indices = self._map_source_refs_from_reasoning(result.get('reasoning', ''), sources)
|
||||
if ref_indices:
|
||||
if result.get('assessment') == 'supported':
|
||||
supporting_sources = [sources[i] for i in ref_indices]
|
||||
elif result.get('assessment') == 'refuted':
|
||||
refuting_sources = [sources[i] for i in ref_indices]
|
||||
|
||||
|
||||
# Validate assessment value
|
||||
valid_assessments = ['supported', 'refuted', 'insufficient_information']
|
||||
if result['assessment'] not in valid_assessments:
|
||||
raise Exception(f"Invalid assessment value: {result['assessment']}")
|
||||
|
||||
|
||||
# Validate confidence value
|
||||
confidence = float(result['confidence'])
|
||||
if not (0.0 <= confidence <= 1.0):
|
||||
raise Exception(f"Invalid confidence value: {confidence}")
|
||||
|
||||
|
||||
logger.info(f"Successfully assessed claim: {result['assessment']} (confidence: {confidence})")
|
||||
|
||||
|
||||
return {
|
||||
'assessment': result['assessment'],
|
||||
'confidence': confidence,
|
||||
@@ -559,39 +695,8 @@ class HallucinationDetector:
|
||||
'refuting_sources': refuting_sources,
|
||||
'reasoning': result['reasoning']
|
||||
}
|
||||
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error assessing claim against sources: {str(e)}")
|
||||
raise Exception(f"Failed to assess claim: {str(e)}")
|
||||
|
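Note on index handling: both assessment paths accept only integer indices that fall inside the source list and silently drop everything else. A minimal sketch of that filter (the helper name `select_sources` is illustrative):

```python
from typing import Any, Dict, List


def select_sources(indices: Any, sources: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Map LLM-provided indices to sources, skipping non-int and out-of-range values."""
    if not isinstance(indices, list):
        return []
    return [
        sources[idx]
        for idx in indices
        if isinstance(idx, int) and 0 <= idx < len(sources)
    ]
```

The check treats indices as 0-based. With the `Source {i+1}` prompt variant, a model that answers with the 1-based labels it was shown may end up attributed to the neighbouring source, so the prompt labels and this check need to agree.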
||||
def _parse_json_from_response(self, text: str, expect_array: bool = False):
|
||||
"""Extract and parse JSON from LLM response, handling markdown code blocks."""
|
||||
text = text.strip()
|
||||
|
||||
# Try direct parse first
|
||||
try:
|
||||
result = json.loads(text)
|
||||
return result
|
||||
except json.JSONDecodeError:
|
||||
pass
|
||||
|
||||
import re
|
||||
# Try to extract from markdown code blocks
|
||||
if expect_array:
|
||||
code_block_match = re.search(r'```(?:json)?\s*(\[.*?\])\s*```', text, re.DOTALL)
|
||||
if code_block_match:
|
||||
return json.loads(code_block_match.group(1))
|
||||
# Try to find JSON array directly
|
||||
json_match = re.search(r'\[.*\]', text, re.DOTALL)
|
||||
if json_match:
|
||||
return json.loads(json_match.group())
|
||||
else:
|
||||
code_block_match = re.search(r'```(?:json)?\s*(\{.*?\})\s*```', text, re.DOTALL)
|
||||
if code_block_match:
|
||||
return json.loads(code_block_match.group(1))
|
||||
# Try to find JSON object directly
|
||||
json_match = re.search(r'\{.*\}', text, re.DOTALL)
|
||||
if json_match:
|
||||
return json.loads(json_match.group())
|
||||
|
||||
raise Exception(f"Could not parse JSON from LLM response: {text[:100]}")
|
||||
|
||||
|
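Note on JSON extraction: the helper tries a direct parse, then a fenced code block, then the widest bracketed span. A standalone sketch of that fallback chain, with two deliberate differences that are assumptions rather than the diff's behaviour: it raises `ValueError` instead of a bare `Exception`, and it builds the triple-backtick marker indirectly as `FENCE`:

```python
import json
import re
from typing import Any

FENCE = "`" * 3  # literal triple backtick, built indirectly


def parse_json_from_response(text: str, expect_array: bool = False) -> Any:
    """Extract JSON from an LLM reply that may wrap it in prose or a code fence."""
    text = text.strip()
    try:
        return json.loads(text)  # 1. the reply is already pure JSON
    except json.JSONDecodeError:
        pass

    open_ch, close_ch = (r"\[", r"\]") if expect_array else (r"\{", r"\}")

    # 2. JSON inside a fenced block, optionally tagged "json"
    fenced = re.search(
        FENCE + r"(?:json)?\s*(" + open_ch + r".*?" + close_ch + r")\s*" + FENCE,
        text,
        re.DOTALL,
    )
    if fenced:
        return json.loads(fenced.group(1))

    # 3. greedy match from the first opener to the last closer
    bare = re.search(open_ch + r".*" + close_ch, text, re.DOTALL)
    if bare:
        return json.loads(bare.group())

    raise ValueError(f"Could not parse JSON from LLM response: {text[:100]}")
```

The greedy pattern in step 3 matters for nested objects: the non-greedy `\{.*?\}` form used by the inline parsing elsewhere in this diff stops at the first closing brace, which truncates a reply such as `{"a": {"b": 1}}` and makes `json.loads` fail.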
||||
@@ -1,79 +0,0 @@
|
||||
"""
|
||||
Shared OAuth callback utilities for Wix and WordPress integrations.
|
||||
|
||||
Provides hardened postMessage-based HTML callback generation, origin
|
||||
validation, and string sanitization used across OAuth callback routes.
|
||||
"""
|
||||
|
||||
import json
|
||||
import os
|
||||
from typing import Any, Optional
|
||||
from urllib.parse import urlparse
|
||||
|
||||
|
||||
def sanitize_string(value: Any, max_len: int = 500) -> str:
|
||||
if value is None:
|
||||
return ""
|
||||
return " ".join(str(value).split())[:max_len]
|
||||
|
||||
|
||||
def sanitize_error(error: Exception, max_len: int = 500) -> str:
|
||||
return sanitize_string(error, max_len)
|
||||
|
||||
|
||||
def normalize_origin(url: Optional[str]) -> Optional[str]:
|
||||
if not url:
|
||||
return None
|
||||
parsed = urlparse(url.strip())
|
||||
if parsed.scheme not in {"http", "https"} or not parsed.netloc:
|
||||
return None
|
||||
return f"{parsed.scheme}://{parsed.netloc}"
|
||||
|
||||
|
||||
def trusted_frontend_origin() -> Optional[str]:
|
||||
origins_env = os.getenv("OAUTH_CALLBACK_ALLOWED_ORIGINS", "")
|
||||
configured = [
|
||||
origin
|
||||
for origin in (normalize_origin(o) for o in origins_env.split(",") if o.strip())
|
||||
if origin is not None
|
||||
]
|
||||
if configured:
|
||||
return configured[0]
|
||||
return normalize_origin(os.getenv("FRONTEND_URL"))
|
||||
|
||||
|
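Note on origin handling: a URL is reduced to `scheme://netloc`, anything that is not http(s) is rejected, and the first valid configured origin wins. A minimal sketch, assuming a pure helper `first_trusted_origin` (hypothetical; it takes the environment values as arguments instead of calling `os.getenv`, which makes it easier to test):

```python
from typing import Optional
from urllib.parse import urlparse


def normalize_origin(url: Optional[str]) -> Optional[str]:
    """Reduce a URL to scheme://netloc, or None if it is not an http(s) URL."""
    if not url:
        return None
    parsed = urlparse(url.strip())
    if parsed.scheme not in {"http", "https"} or not parsed.netloc:
        return None
    return f"{parsed.scheme}://{parsed.netloc}"


def first_trusted_origin(origins_env: str, frontend_url: Optional[str]) -> Optional[str]:
    """Return the first valid origin from a comma-separated list, else the fallback URL."""
    for raw in origins_env.split(","):
        if raw.strip():
            origin = normalize_origin(raw)
            if origin is not None:
                return origin
    return normalize_origin(frontend_url)
```

Only the first configured origin is ever returned, so additional entries in the comma-separated list act as fallbacks for invalid earlier entries rather than as an allow-list of several frontends.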
||||
def build_oauth_callback_html(
|
||||
payload: dict,
|
||||
title: str,
|
||||
heading: str,
|
||||
message: str,
|
||||
) -> str:
|
||||
trusted_origin = trusted_frontend_origin()
|
||||
payload_json = json.dumps(payload)
|
||||
target_origin_json = json.dumps(trusted_origin or "")
|
||||
heading_html = heading.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
|
||||
message_html = message.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
|
||||
return f"""
|
||||
<!DOCTYPE html>
|
||||
<html>
|
||||
<head><title>{title}</title></head>
|
||||
<body>
|
||||
<h1>{heading_html}</h1>
|
||||
<p>{message_html}</p>
|
||||
<script>
|
||||
(function() {{
|
||||
var payload = {payload_json};
|
||||
var targetOrigin = {target_origin_json};
|
||||
var destination = window.opener || window.parent;
|
||||
if (destination && targetOrigin) {{
|
||||
try {{
|
||||
destination.postMessage(payload, targetOrigin);
|
||||
window.close();
|
||||
return;
|
||||
}} catch (_e) {{}}
|
||||
}}
|
||||
}})();
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
"""
|
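Note on escaping: the heading and message are HTML-escaped by chained `replace` calls before being interpolated into the page. A minimal sketch of that escaping, checked against the standard library's `html.escape` with `quote=False`, which covers the same three characters (the helper name `escape_text` is illustrative):

```python
import html


def escape_text(value: str) -> str:
    """Escape &, < and > for use in HTML text content."""
    # The ampersand must be replaced first; otherwise the "&" introduced by
    # "&lt;" and "&gt;" would itself be escaped a second time.
    return value.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


if __name__ == "__main__":
    sample = "<b>Tom & Jerry</b>"
    print(escape_text(sample))
    print(escape_text(sample) == html.escape(sample, quote=False))
```

In the template above, `{title}` is interpolated without this step, so the function presumably relies on callers passing a fixed, trusted title.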
||||
@@ -53,7 +53,6 @@ class WixBlogService:
|
||||
"""Create draft post with consolidated logging"""
|
||||
from .logger import wix_logger
|
||||
import json
|
||||
import traceback as tb
|
||||
|
||||
# Build payload summary for logging
|
||||
payload_summary = {}
|
||||
@@ -66,14 +65,7 @@ class WixBlogService:
|
||||
}
|
||||
|
||||
request_headers = self.headers(access_token, extra_headers)
|
||||
try:
|
||||
response = requests.post(f"{self.base_url}/blog/v3/draft-posts", headers=request_headers, json=payload)
|
||||
except TypeError as e:
|
||||
logger.error(f"TypeError during requests.post in create_draft_post: {e}")
|
||||
logger.error(f"Traceback: {tb.format_exc()}")
|
||||
logger.error(f"access_token type: {type(access_token)}")
|
||||
logger.error(f"payload type: {type(payload)}, keys: {list(payload.keys()) if isinstance(payload, dict) else 'N/A'}")
|
||||
raise
|
||||
response = requests.post(f"{self.base_url}/blog/v3/draft-posts", headers=request_headers, json=payload)
|
||||
|
||||
# Consolidated error logging
|
||||
error_body = None
|
||||
|
||||
@@ -5,7 +5,6 @@ Handles blog post creation, validation, and publishing to Wix.
|
||||
"""
|
||||
|
||||
import json
|
||||
import re
|
||||
import uuid
|
||||
import requests
|
||||
import jwt
|
||||
@@ -399,30 +398,6 @@ def create_blog_post(
|
||||
# Ensure we only have 'nodes' in richContent for CREATE endpoint
|
||||
ricos_content = {'nodes': ricos_content['nodes']}
|
||||
|
||||
# SAFE ITEM 4: Prepend H1 title node if content doesn't start with one.
|
||||
# The markdown typically starts at ## (H2) because the title is separate,
|
||||
# but Wix renders the richContent as the full post body including the title.
|
||||
# Without an H1, the post looks like it has no heading.
|
||||
existing_first = ricos_content['nodes'][0] if ricos_content['nodes'] else None
|
||||
has_h1 = existing_first and existing_first.get('type') == 'HEADING' and existing_first.get('headingData', {}).get('level') == 1
|
||||
if not has_h1 and title:
|
||||
title_node = {
|
||||
'id': str(uuid.uuid4()),
|
||||
'type': 'HEADING',
|
||||
'nodes': [{
|
||||
'id': str(uuid.uuid4()),
|
||||
'type': 'TEXT',
|
||||
'nodes': [],
|
||||
'textData': {
|
||||
'text': str(title).strip(),
|
||||
'decorations': []
|
||||
}
|
||||
}],
|
||||
'headingData': {'level': 1}
|
||||
}
|
||||
ricos_content['nodes'] = [title_node] + ricos_content['nodes']
|
||||
logger.debug(f"Prepended H1 title node: '{str(title).strip()[:50]}'")
|
||||
|
||||
logger.debug(f"✅ richContent structure validated: {len(ricos_content['nodes'])} nodes, keys: {list(ricos_content.keys())}")
|
||||
|
||||
# Minimal payload per Wix docs: title, memberId, and richContent
|
||||
@@ -432,39 +407,15 @@ def create_blog_post(
|
||||
'title': str(title).strip() if title else "Untitled",
|
||||
'memberId': str(member_id).strip(), # Required for third-party apps (validated above)
|
||||
'richContent': ricos_content, # Must be a valid Ricos object with ONLY 'nodes'
|
||||
'language': 'en',
|
||||
},
|
||||
'publish': bool(publish),
|
||||
'fieldsets': ['URL'] # Simplified fieldsets
|
||||
}
|
||||
|
||||
# SAFE ITEM 1: Auto-generate seoSlug from title if not provided by SEO metadata
|
||||
# Wix uses this for the URL path (e.g. /post/my-blog-title)
|
||||
slug_source = None
|
||||
if seo_metadata and seo_metadata.get('url_slug'):
|
||||
slug_source = str(seo_metadata['url_slug']).strip()
|
||||
elif title:
|
||||
slug_source = re.sub(r'[^a-z0-9]+', '-', str(title).strip().lower()).strip('-')
|
||||
slug_source = slug_source[:60].rstrip('-')
|
||||
if slug_source:
|
||||
blog_data['draftPost']['seoSlug'] = slug_source
|
||||
|
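Note on slug generation: the title is lower-cased, every run of characters outside `a-z0-9` collapses to a single hyphen, and the result is capped at 60 characters without a trailing hyphen. A standalone sketch of the same steps (the function name `slug_from_title` is illustrative):

```python
import re


def slug_from_title(title: str, max_len: int = 60) -> str:
    """Build a URL slug from a post title."""
    # Collapse every run of non [a-z0-9] characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.strip().lower()).strip("-")
    # Truncate, then drop a hyphen left dangling by the cut.
    return slug[:max_len].rstrip("-")


if __name__ == "__main__":
    print(slug_from_title("Hello, World! 10 Tips"))
    print(slug_from_title("  Café & Bar  "))
```

Because the character class is ASCII-only, accented and non-Latin letters are dropped rather than transliterated, so a title written entirely in another script yields an empty slug and no `seoSlug` is set.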
||||
# SAFE ITEM 3: Better excerpt — prefer meta_description, then first plain-text paragraph
|
||||
excerpt = None
|
||||
if seo_metadata and seo_metadata.get('meta_description'):
|
||||
excerpt = str(seo_metadata['meta_description']).strip()[:200]
|
||||
if not excerpt and content:
|
||||
for node in ricos_content['nodes']:
|
||||
if node.get('type') == 'PARAGRAPH':
|
||||
texts = []
|
||||
for child in node.get('nodes', []):
|
||||
if child.get('type') == 'TEXT' and child.get('textData', {}).get('text'):
|
||||
texts.append(child['textData']['text'])
|
||||
if texts:
|
||||
excerpt = ' '.join(texts).strip()[:200]
|
||||
break
|
||||
if excerpt:
|
||||
blog_data['draftPost']['excerpt'] = excerpt
|
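Note on the excerpt fallback: when no meta description is available, the excerpt comes from the first `PARAGRAPH` node that contains non-empty `TEXT` children. A minimal sketch of that walk over Ricos-style nodes (the function name and the sample node dictionaries are illustrative, modelled on the node shapes visible in this diff):

```python
from typing import Any, Dict, List, Optional


def first_paragraph_excerpt(nodes: List[Dict[str, Any]], max_len: int = 200) -> Optional[str]:
    """Return text of the first non-empty PARAGRAPH node, truncated to max_len."""
    for node in nodes:
        if node.get("type") != "PARAGRAPH":
            continue  # headings, lists, images and so on are skipped
        texts = [
            child["textData"]["text"]
            for child in node.get("nodes", [])
            if child.get("type") == "TEXT" and child.get("textData", {}).get("text")
        ]
        if texts:
            return " ".join(texts).strip()[:max_len]
    return None
```

Compared with slicing the raw content string, walking the nodes avoids markdown syntax such as `##` or `**` leaking into the excerpt.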
||||
# Add excerpt only if content exists and is not empty (avoid None or empty strings)
|
||||
excerpt = (content or '').strip()[:200] if content else None
|
||||
if excerpt and len(excerpt) > 0:
|
||||
blog_data['draftPost']['excerpt'] = str(excerpt)
|
||||
|
||||
# Add cover image if provided
|
||||
if cover_image_url and import_image_func:
|
||||
@@ -544,6 +495,7 @@ def create_blog_post(
|
||||
|
||||
# Build SEO data from metadata if provided
|
||||
# NOTE: seoData is optional - if it causes issues, we can create post without it
|
||||
seo_data = None
|
||||
if seo_metadata:
|
||||
try:
|
||||
seo_data = build_seo_data(seo_metadata, title)
|
||||
@@ -554,8 +506,13 @@ def create_blog_post(
|
||||
blog_data['draftPost']['seoData'] = seo_data
|
||||
except Exception as e:
|
||||
logger.warning(f"⚠️ Wix: SEO data build failed - {str(e)[:50]}")
|
||||
wix_logger.add_warning(f"SEO build: {str(e)[:50]}")
|
||||
|
||||
# Add SEO slug if provided
|
||||
if seo_metadata.get('url_slug'):
|
||||
blog_data['draftPost']['seoSlug'] = str(seo_metadata.get('url_slug')).strip()
|
||||
else:
|
||||
logger.debug("No SEO metadata provided to create_blog_post")
|
||||
logger.warning("⚠️ No SEO metadata provided to create_blog_post")
|
||||
|
||||
try:
|
||||
# Extract wix-site-id from token if possible
|
||||
@@ -577,6 +534,7 @@ def create_blog_post(
|
||||
meta_site_id = instance_data.get('metaSiteId')
|
||||
if isinstance(meta_site_id, str) and meta_site_id:
|
||||
extra_headers['wix-site-id'] = meta_site_id
|
||||
headers['wix-site-id'] = meta_site_id
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
@@ -616,27 +574,156 @@ def create_blog_post(
|
||||
logger.error(f"❌ Payload validation failed: {e}")
|
||||
raise
|
||||
|
||||
# Log payload summary
|
||||
logger.debug(f"Payload: draftPost keys={list(draft_post.keys())}, "
|
||||
f"nodes={len(draft_post.get('richContent', {}).get('nodes', []))}, "
|
||||
f"has_seo={'seoData' in draft_post}")
|
||||
# Log full payload structure for debugging (sanitized)
|
||||
logger.warning(f"📦 Full payload structure validation:")
|
||||
logger.warning(f" - draftPost type: {type(draft_post)}")
|
||||
logger.warning(f" - draftPost keys: {list(draft_post.keys())}")
|
||||
logger.warning(f" - richContent type: {type(draft_post.get('richContent'))}")
|
||||
if 'richContent' in draft_post:
|
||||
rc = draft_post['richContent']
|
||||
logger.warning(f" - richContent keys: {list(rc.keys()) if isinstance(rc, dict) else 'N/A'}")
|
||||
logger.warning(f" - richContent.nodes type: {type(rc.get('nodes'))}, count: {len(rc.get('nodes', []))}")
|
||||
logger.warning(f" - richContent.metadata type: {type(rc.get('metadata'))}")
|
||||
logger.warning(f" - richContent.documentStyle type: {type(rc.get('documentStyle'))}")
|
||||
logger.warning(f" - seoData type: {type(draft_post.get('seoData'))}")
|
||||
if 'seoData' in draft_post:
|
||||
seo = draft_post['seoData']
|
||||
logger.warning(f" - seoData keys: {list(seo.keys()) if isinstance(seo, dict) else 'N/A'}")
|
||||
logger.warning(f" - seoData.tags type: {type(seo.get('tags'))}, count: {len(seo.get('tags', []))}")
|
||||
logger.warning(f" - seoData.settings type: {type(seo.get('settings'))}")
|
||||
if 'categoryIds' in draft_post:
|
||||
logger.warning(f" - categoryIds type: {type(draft_post.get('categoryIds'))}, count: {len(draft_post.get('categoryIds', []))}")
|
||||
if 'tagIds' in draft_post:
|
||||
logger.warning(f" - tagIds type: {type(draft_post.get('tagIds'))}, count: {len(draft_post.get('tagIds', []))}")
|
||||
|
||||
# Final deep validation: Serialize and deserialize to catch any JSON-serialization issues
|
||||
# Log a sample of the payload JSON to see exact structure (first 2000 chars)
|
||||
try:
|
||||
import json
|
||||
json.dumps(blog_data, ensure_ascii=False)
|
||||
payload_json = json.dumps(blog_data, indent=2, ensure_ascii=False)
|
||||
logger.warning(f"📄 Payload JSON preview (first 3000 chars):\n{payload_json[:3000]}...")
|
||||
|
||||
# Also log a deep structure inspection of richContent.nodes (first few nodes)
|
||||
if 'richContent' in blog_data['draftPost']:
|
||||
nodes = blog_data['draftPost']['richContent'].get('nodes', [])
|
||||
if nodes:
|
||||
logger.warning(f"🔍 Inspecting first 5 richContent.nodes:")
|
||||
for i, node in enumerate(nodes[:5]):
|
||||
logger.warning(f" Node {i+1}: type={node.get('type')}, keys={list(node.keys())}")
|
||||
# Check for any None values in node
|
||||
for key, value in node.items():
|
||||
if value is None:
|
||||
logger.error(f" ⚠️ Node {i+1}.{key} is None!")
|
||||
elif isinstance(value, dict):
|
||||
for k, v in value.items():
|
||||
if v is None:
|
||||
logger.error(f" ⚠️ Node {i+1}.{key}.{k} is None!")
|
||||
# Deep check: if it's a list-type node, inspect list items
|
||||
if node.get('type') in ['BULLETED_LIST', 'ORDERED_LIST']:
|
||||
list_items = node.get('nodes', [])
|
||||
if list_items:
|
||||
logger.warning(f" List has {len(list_items)} items, checking first LIST_ITEM:")
|
||||
first_item = list_items[0]
|
||||
logger.warning(f" LIST_ITEM keys: {list(first_item.keys())}")
|
||||
# Verify listItemData is NOT present (correct per Wix API spec)
|
||||
if 'listItemData' in first_item:
|
||||
logger.error(f" ❌ LIST_ITEM incorrectly has listItemData!")
|
||||
else:
|
||||
logger.debug(f" ✅ LIST_ITEM correctly has no listItemData")
|
||||
# Check nested PARAGRAPH nodes
|
||||
nested_nodes = first_item.get('nodes', [])
|
||||
if nested_nodes:
|
||||
logger.warning(f" LIST_ITEM has {len(nested_nodes)} nested nodes")
|
||||
for n_idx, n_node in enumerate(nested_nodes[:2]):
|
||||
logger.warning(f" Nested node {n_idx+1}: type={n_node.get('type')}, keys={list(n_node.keys())}")
|
||||
except Exception as e:
|
||||
logger.warning(f"Could not serialize payload for logging: {e}")
|
||||
|
||||
# Note: All node validation is done by validate_ricos_content() which runs earlier
|
||||
# The recursive validation ensures all required data fields are present at any depth
|
||||
|
||||
# Final deep validation: Serialize and deserialize to catch any JSON-serialization issues
|
||||
# This will raise an error if there are any objects that can't be serialized
|
||||
try:
|
||||
import json
|
||||
test_json = json.dumps(blog_data, ensure_ascii=False)
|
||||
test_parsed = json.loads(test_json)
|
||||
logger.debug("✅ Payload JSON serialization test passed")
|
||||
except (TypeError, ValueError) as e:
|
||||
logger.error(f"❌ Payload JSON serialization failed: {e}")
|
||||
raise ValueError(f"Payload contains non-serializable data: {e}")
|
||||
|
||||
# Clean up None values that Wix API would reject
|
||||
# Final check: Ensure documentStyle and metadata are valid objects (not None, not empty strings)
|
||||
rc = blog_data['draftPost']['richContent']
|
||||
for field in ['documentStyle', 'metadata']:
|
||||
if field in rc and (rc[field] is None or rc[field] == "" or not isinstance(rc[field], dict)):
|
||||
del rc[field]
|
||||
if 'documentStyle' in rc:
|
||||
doc_style = rc['documentStyle']
|
||||
if doc_style is None or doc_style == "":
|
||||
logger.warning("⚠️ documentStyle is None or empty string, removing it")
|
||||
del rc['documentStyle']
|
||||
elif not isinstance(doc_style, dict):
|
||||
logger.warning(f"⚠️ documentStyle is not a dict ({type(doc_style)}), removing it")
|
||||
del rc['documentStyle']
|
||||
|
||||
logger.info(f"📤 Publishing to Wix: title='{blog_data['draftPost'].get('title', '')}', "
|
||||
f"nodes={len(rc.get('nodes', []))}")
|
||||
if 'metadata' in rc:
|
||||
metadata = rc['metadata']
|
||||
if metadata is None or metadata == "":
|
||||
logger.warning("⚠️ metadata is None or empty string, removing it")
|
||||
del rc['metadata']
|
||||
elif not isinstance(metadata, dict):
|
||||
logger.warning(f"⚠️ metadata is not a dict ({type(metadata)}), removing it")
|
||||
del rc['metadata']
|
||||
|
||||
# Check for any None values in critical nested structures
|
||||
def check_none_in_dict(d, path=""):
|
||||
"""Recursively check for None values that shouldn't be there"""
|
||||
issues = []
|
||||
if isinstance(d, dict):
|
||||
for key, value in d.items():
|
||||
current_path = f"{path}.{key}" if path else key
|
||||
if value is None:
|
||||
# Some fields can legitimately be None, but most shouldn't
|
||||
if key not in ['decorations', 'nodeStyle', 'props']:
|
||||
issues.append(current_path)
|
||||
elif isinstance(value, dict):
|
||||
issues.extend(check_none_in_dict(value, current_path))
|
||||
elif isinstance(value, list):
|
||||
for i, item in enumerate(value):
|
||||
if item is None:
|
||||
issues.append(f"{current_path}[{i}]")
|
||||
elif isinstance(item, dict):
|
||||
issues.extend(check_none_in_dict(item, f"{current_path}[{i}]"))
|
||||
return issues
|
||||
|
||||
none_issues = check_none_in_dict(blog_data['draftPost']['richContent'])
|
||||
if none_issues:
|
||||
logger.error(f"❌ Found None values in richContent at: {none_issues[:10]}") # Limit to first 10
|
||||
# Remove None values from critical paths
|
||||
for issue_path in none_issues[:5]: # Fix first 5
|
||||
parts = issue_path.split('.')
|
||||
try:
|
||||
obj = blog_data['draftPost']['richContent']
|
||||
for part in parts[:-1]:
|
||||
if '[' in part:
|
||||
key, idx = part.split('[')
|
||||
idx = int(idx.rstrip(']'))
|
||||
obj = obj[key][idx]
|
||||
else:
|
||||
obj = obj[part]
|
||||
final_key = parts[-1]
|
||||
if '[' in final_key:
|
||||
key, idx = final_key.split('[')
|
||||
idx = int(idx.rstrip(']'))
|
||||
obj[key][idx] = {}
|
||||
else:
|
||||
obj[final_key] = {}
|
||||
logger.warning(f"Fixed None value at {issue_path}")
|
||||
except:
|
||||
pass
|
||||
|
||||
# Log the final payload structure one more time before sending
|
||||
logger.warning(f"📤 Final payload ready - draftPost keys: {list(blog_data['draftPost'].keys())}")
|
||||
logger.warning(f"📤 RichContent nodes count: {len(blog_data['draftPost']['richContent'].get('nodes', []))}")
|
||||
logger.warning(f"📤 RichContent has metadata: {bool(blog_data['draftPost']['richContent'].get('metadata'))}")
|
||||
logger.warning(f"📤 RichContent has documentStyle: {bool(blog_data['draftPost']['richContent'].get('documentStyle'))}")
|
||||
|
||||
result = blog_service.create_draft_post(access_token, blog_data, extra_headers or None)
|
||||
|
||||
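The recursive `None` scan in this hunk can be exercised on its own. The following is a minimal sketch under stated assumptions, not the project's implementation: `find_none_paths`, `NULLABLE_KEYS`, and the sample `rich_content` dict are hypothetical names introduced for illustration, and the allow-list of keys is copied from the hunk above. Unlike the hunk, the sketch only reports paths and leaves the decision about patching to the caller.

```python
from typing import Any, List

# Keys the hunk above treats as legitimately nullable.
NULLABLE_KEYS = {"decorations", "nodeStyle", "props"}


def find_none_paths(value: Any, path: str = "") -> List[str]:
    """Return dotted/indexed paths of every unexpected None inside value."""
    issues: List[str] = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else str(key)
            if child is None:
                if key not in NULLABLE_KEYS:
                    issues.append(child_path)
            else:
                issues.extend(find_none_paths(child, child_path))
    elif isinstance(value, list):
        for index, item in enumerate(value):
            item_path = f"{path}[{index}]"
            if item is None:
                issues.append(item_path)
            else:
                issues.extend(find_none_paths(item, item_path))
    return issues


# Hypothetical payload fragment, for illustration only.
rich_content = {
    "nodes": [
        {"type": "PARAGRAPH", "id": None, "nodeStyle": None},
        None,
        {"type": "BULLETED_LIST", "nodes": [{"type": "LIST_ITEM", "paragraphData": None}]},
    ],
    "metadata": {"version": 1},
}

none_paths = find_none_paths(rich_content)
print(none_paths)
```

Reporting paths without mutating the payload keeps validation and repair separate, so a caller could choose to reject the payload instead of silently replacing values with `{}`.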
@@ -647,11 +734,6 @@ def create_blog_post(
|
||||
logger.success(f"✅ Wix: Blog post created - ID: {post_id}")
|
||||
|
||||
return result
|
||||
except TypeError as e:
|
||||
import traceback
|
||||
logger.error(f"TypeError in create_blog_post: {e}")
|
||||
logger.error(f"Traceback: {traceback.format_exc()}")
|
||||
raise
|
||||
except requests.RequestException as e:
|
||||
logger.error(f"Failed to create blog post: {e}")
|
||||
if hasattr(e, 'response') and e.response is not None:
|
||||
|
||||
@@ -66,8 +66,7 @@ class WixLogger:
|
||||
if 'title' in dp:
|
||||
parts.append(f"title='{str(dp['title'])[:50]}...'")
|
||||
if 'richContent' in dp:
|
||||
nodes_val = dp['richContent'].get('nodes', [])
|
||||
nodes_count = nodes_val if isinstance(nodes_val, int) else len(nodes_val)
|
||||
nodes_count = len(dp['richContent'].get('nodes', []))
|
||||
parts.append(f"nodes={nodes_count}")
|
||||
if 'seoData' in dp:
|
||||
parts.append("has_seoData")
|
||||
|
||||
@@ -8,7 +8,7 @@ import sqlite3
|
||||
from typing import Optional, Dict, Any, List
|
||||
from datetime import datetime, timedelta
|
||||
from loguru import logger
|
||||
from cryptography.fernet import Fernet, InvalidToken
|
||||
|
||||
|
||||
from services.database import get_user_db_path
|
||||
|
||||
@@ -17,66 +17,6 @@ class WixOAuthService:
|
||||
|
||||
def __init__(self, db_path: Optional[str] = None):
|
||||
self.db_path = db_path
|
||||
self.token_encryption_key = (
|
||||
os.getenv("WIX_TOKEN_ENCRYPTION_KEY")
|
||||
or os.getenv("OAUTH_TOKEN_ENCRYPTION_KEY")
|
||||
)
|
||||
self._fernet = self._initialize_fernet()
|
||||
self._migration_done: set = set()
|
||||
|
||||
def _initialize_fernet(self) -> Optional[Fernet]:
|
||||
if not self.token_encryption_key:
|
||||
logger.error("Wix token encryption key is not configured.")
|
||||
return None
|
||||
try:
|
||||
return Fernet(self.token_encryption_key.encode("utf-8"))
|
||||
except Exception:
|
||||
logger.error("Wix token encryption key is invalid.")
|
||||
return None
|
||||
|
||||
def _encrypt_token(self, token: Optional[str]) -> Optional[str]:
|
||||
if not token:
|
||||
return None
|
||||
if not self._fernet:
|
||||
raise ValueError("Token encryption is unavailable: missing/invalid managed key")
|
||||
return self._fernet.encrypt(token.encode("utf-8")).decode("utf-8")
|
||||
|
||||
def _decrypt_token(self, token_blob: Optional[str]) -> Optional[str]:
|
||||
if not token_blob:
|
||||
return None
|
||||
if not self._fernet:
|
||||
raise ValueError("Token decryption is unavailable: missing/invalid managed key")
|
||||
return self._fernet.decrypt(token_blob.encode("utf-8")).decode("utf-8")
|
||||
|
||||
def _is_likely_encrypted_blob(self, value: Optional[str]) -> bool:
|
||||
return bool(value and value.startswith("gAAAAA"))
|
||||
|
||||
def _migrate_plaintext_tokens_if_needed(self, conn: sqlite3.Connection, user_id: str) -> None:
|
||||
if not self._fernet or user_id in self._migration_done:
|
||||
return
|
||||
cursor = conn.cursor()
|
||||
cursor.execute(
|
||||
"SELECT id, access_token, refresh_token FROM wix_oauth_tokens WHERE user_id = ?",
|
||||
(user_id,),
|
||||
)
|
||||
rows = cursor.fetchall()
|
||||
migrated = 0
|
||||
for token_id, access_token, refresh_token in rows:
|
||||
needs_access = access_token and not self._is_likely_encrypted_blob(access_token)
|
||||
needs_refresh = refresh_token and not self._is_likely_encrypted_blob(refresh_token)
|
||||
if not (needs_access or needs_refresh):
|
||||
continue
|
||||
enc_access = self._encrypt_token(access_token) if needs_access else access_token
|
||||
enc_refresh = self._encrypt_token(refresh_token) if needs_refresh else refresh_token
|
||||
cursor.execute(
|
||||
"UPDATE wix_oauth_tokens SET access_token = ?, refresh_token = ?, updated_at = datetime('now') WHERE id = ? AND user_id = ?",
|
||||
(enc_access, enc_refresh, token_id, user_id),
|
||||
)
|
||||
migrated += 1
|
||||
if migrated:
|
||||
conn.commit()
|
||||
logger.info(f"Wix OAuth token migration completed for user {user_id}; rows migrated={migrated}")
|
||||
self._migration_done.add(user_id)
|
||||
|
||||
def _get_db_path(self, user_id: str) -> str:
|
||||
if self.db_path:
|
||||
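The `_is_likely_encrypted_blob` helper in this hunk relies on the `gAAAAA` prefix. That prefix follows from the Fernet token layout: a token is the URL-safe base64 encoding of a version byte `0x80`, then a 64-bit big-endian timestamp, then the IV, ciphertext, and HMAC. The sketch below uses only the standard library to rebuild the first nine bytes of such a header and show why the prefix is stable. `fernet_header_prefix` is a hypothetical helper and does not produce usable tokens. The check remains a heuristic: a plaintext value that happens to start with `gAAAAA` would be misclassified.

```python
import base64
import struct
import time

FERNET_VERSION = b"\x80"  # version byte defined by the Fernet spec


def fernet_header_prefix(timestamp: int) -> str:
    """Base64url-encode the first 9 bytes of a Fernet token: version + timestamp."""
    header = FERNET_VERSION + struct.pack(">Q", timestamp)
    return base64.urlsafe_b64encode(header).decode("ascii")


def is_likely_encrypted_blob(value):
    """Same heuristic as the diff: values starting with 'gAAAAA' look like Fernet tokens."""
    return bool(value and value.startswith("gAAAAA"))


prefix_now = fernet_header_prefix(int(time.time()))
# A later timestamp (around the year 2040), still below 2**32.
prefix_later = fernet_header_prefix(2_208_988_800)
print(prefix_now, prefix_later)
```

For any timestamp below `2**32` the high bytes of the timestamp are zero, so the encoded header always begins with `gAAAAA`.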
@@ -108,94 +48,7 @@ class WixOAuthService:
|
||||
is_active BOOLEAN DEFAULT TRUE
|
||||
)
|
||||
''')
|
||||
cursor.execute('''
|
||||
CREATE TABLE IF NOT EXISTS wix_oauth_pkce_states (
|
||||
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
user_id TEXT NOT NULL,
|
||||
state TEXT NOT NULL UNIQUE,
|
||||
code_verifier TEXT NOT NULL,
|
||||
expires_at TIMESTAMP NOT NULL,
|
||||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||||
used_at TIMESTAMP
|
||||
)
|
||||
''')
|
||||
cursor.execute('''
|
||||
CREATE INDEX IF NOT EXISTS idx_wix_oauth_pkce_user_state
|
||||
ON wix_oauth_pkce_states (user_id, state)
|
||||
''')
|
||||
conn.commit()
|
||||
|
||||
def cleanup_expired_pkce_states(self, user_id: str) -> int:
|
||||
"""Delete expired or already-used PKCE state records."""
|
||||
try:
|
||||
self._init_db(user_id)
|
||||
db_path = self._get_db_path(user_id)
|
||||
with sqlite3.connect(db_path) as conn:
|
||||
cursor = conn.cursor()
|
||||
cursor.execute(
|
||||
'''
|
||||
DELETE FROM wix_oauth_pkce_states
|
||||
WHERE used_at IS NOT NULL OR expires_at <= datetime('now')
|
||||
'''
|
||||
)
|
||||
deleted = cursor.rowcount
|
||||
conn.commit()
|
||||
return deleted if deleted is not None else 0
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to cleanup expired Wix PKCE states for user {user_id}: {e}")
|
||||
return 0
|
||||
|
||||
def store_pkce_verifier(self, user_id: str, state: str, code_verifier: str, ttl_seconds: int = 600) -> bool:
|
||||
"""Store PKCE code verifier by OAuth state with short TTL."""
|
||||
try:
|
||||
self._init_db(user_id)
|
||||
self.cleanup_expired_pkce_states(user_id)
|
||||
db_path = self._get_db_path(user_id)
|
||||
expires_at = datetime.now() + timedelta(seconds=ttl_seconds)
|
||||
with sqlite3.connect(db_path) as conn:
|
||||
cursor = conn.cursor()
|
||||
cursor.execute(
|
||||
'''
|
||||
INSERT OR REPLACE INTO wix_oauth_pkce_states (user_id, state, code_verifier, expires_at, created_at, used_at)
|
||||
VALUES (?, ?, ?, ?, CURRENT_TIMESTAMP, NULL)
|
||||
''',
|
||||
(user_id, state, code_verifier, expires_at)
|
||||
)
|
||||
conn.commit()
|
||||
return True
|
||||
except Exception as e:
|
||||
logger.error(f"Failed storing Wix PKCE verifier for user {user_id}, state {state}: {e}")
|
||||
return False
|
||||
|
||||
def consume_pkce_verifier(self, user_id: str, state: str) -> Optional[str]:
|
||||
"""Get and invalidate one-time PKCE verifier for a state if valid and unexpired."""
|
||||
try:
|
||||
self._init_db(user_id)
|
||||
self.cleanup_expired_pkce_states(user_id)
|
||||
db_path = self._get_db_path(user_id)
|
||||
with sqlite3.connect(db_path) as conn:
|
||||
cursor = conn.cursor()
|
||||
cursor.execute(
|
||||
'''
|
||||
SELECT id, code_verifier
|
||||
FROM wix_oauth_pkce_states
|
||||
WHERE user_id = ? AND state = ? AND used_at IS NULL AND expires_at > datetime('now')
|
||||
LIMIT 1
|
||||
''',
|
||||
(user_id, state)
|
||||
)
|
||||
row = cursor.fetchone()
|
||||
if not row:
|
||||
return None
|
||||
cursor.execute(
|
||||
"UPDATE wix_oauth_pkce_states SET used_at = CURRENT_TIMESTAMP WHERE id = ?",
|
||||
(row[0],)
|
||||
)
|
||||
conn.commit()
|
||||
return row[1]
|
||||
except Exception as e:
|
||||
logger.error(f"Failed consuming Wix PKCE verifier for user {user_id}, state {state}: {e}")
|
||||
return None
|
||||
|
||||
def store_tokens(
|
||||
self,
|
||||
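The PKCE helpers in this hunk store a code verifier keyed by OAuth `state` and hand it out at most once. The following is a minimal sketch of that one-time pattern using an in-memory SQLite database. `PkceStore`, the `pkce_states` table, and the `now` parameter are hypothetical names introduced for illustration. Two deliberate deviations from the diff are assumptions of this sketch: expiry is stored as UTC epoch seconds, because `datetime.now()` returns local time while SQLite's `datetime('now')` returns UTC, and a consumed row is deleted instead of being marked with `used_at`.

```python
import sqlite3
import time
from typing import Optional


class PkceStore:
    """Hypothetical one-time PKCE verifier store backed by SQLite."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            """
            CREATE TABLE IF NOT EXISTS pkce_states (
                state TEXT PRIMARY KEY,
                user_id TEXT NOT NULL,
                code_verifier TEXT NOT NULL,
                expires_at INTEGER NOT NULL  -- UTC epoch seconds
            )
            """
        )

    def store(self, user_id: str, state: str, code_verifier: str,
              ttl_seconds: int = 600, now: Optional[int] = None) -> None:
        now = int(time.time()) if now is None else now
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO pkce_states VALUES (?, ?, ?, ?)",
                (state, user_id, code_verifier, now + ttl_seconds),
            )

    def consume(self, user_id: str, state: str, now: Optional[int] = None) -> Optional[str]:
        now = int(time.time()) if now is None else now
        with self.conn:
            row = self.conn.execute(
                "SELECT code_verifier FROM pkce_states "
                "WHERE user_id = ? AND state = ? AND expires_at > ?",
                (user_id, state, now),
            ).fetchone()
            # Delete regardless of expiry so a state can never be replayed.
            self.conn.execute(
                "DELETE FROM pkce_states WHERE user_id = ? AND state = ?",
                (user_id, state),
            )
        return row[0] if row else None


store = PkceStore(sqlite3.connect(":memory:"))
store.store("user-1", "state-abc", "verifier-xyz", ttl_seconds=600, now=1_000)
first = store.consume("user-1", "state-abc", now=1_100)
second = store.consume("user-1", "state-abc", now=1_100)
store.store("user-1", "state-old", "verifier-old", ttl_seconds=600, now=1_000)
expired = store.consume("user-1", "state-old", now=2_000)
print(first, second, expired)
```

Passing `now` explicitly makes the expiry logic testable without sleeping. In production code the default of `time.time()` would apply.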
@@ -233,16 +86,13 @@ class WixOAuthService:
|
||||
if expires_in:
|
||||
expires_at = datetime.now() + timedelta(seconds=expires_in)
|
||||
|
||||
encrypted_access = self._encrypt_token(access_token)
|
||||
encrypted_refresh = self._encrypt_token(refresh_token) if refresh_token else None
|
||||
|
||||
with sqlite3.connect(db_path) as conn:
|
||||
cursor = conn.cursor()
|
||||
cursor.execute('''
|
||||
INSERT INTO wix_oauth_tokens
|
||||
(user_id, access_token, refresh_token, token_type, expires_at, expires_in, scope, site_id, member_id)
|
||||
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
|
||||
''', (user_id, encrypted_access, encrypted_refresh, token_type, expires_at, expires_in, scope, site_id, member_id))
|
||||
''', (user_id, access_token, refresh_token, token_type, expires_at, expires_in, scope, site_id, member_id))
|
||||
conn.commit()
|
||||
logger.info(f"Wix OAuth: Token inserted into database for user {user_id}")
|
||||
|
||||
@@ -263,7 +113,6 @@ class WixOAuthService:
|
||||
return []
|
||||
|
||||
with sqlite3.connect(db_path) as conn:
|
||||
self._migrate_plaintext_tokens_if_needed(conn, user_id)
|
||||
cursor = conn.cursor()
|
||||
cursor.execute('''
|
||||
SELECT id, access_token, refresh_token, token_type, expires_at, expires_in, scope, site_id, member_id, created_at
|
||||
@@ -274,29 +123,10 @@ class WixOAuthService:
|
||||
|
||||
tokens = []
|
||||
for row in cursor.fetchall():
|
||||
access_token_val = row[1]
|
||||
refresh_token_val = row[2]
|
||||
try:
|
||||
decrypted_access = (
|
||||
self._decrypt_token(access_token_val)
|
||||
if self._is_likely_encrypted_blob(access_token_val)
|
||||
else access_token_val
|
||||
)
|
||||
except InvalidToken:
|
||||
logger.error(f"Failed to decrypt Wix access token for user {user_id}, token_id={row[0]}")
|
||||
continue
|
||||
try:
|
||||
decrypted_refresh = (
|
||||
self._decrypt_token(refresh_token_val)
|
||||
if self._is_likely_encrypted_blob(refresh_token_val)
|
||||
else refresh_token_val
|
||||
)
|
||||
except InvalidToken:
|
||||
decrypted_refresh = None
|
||||
tokens.append({
|
||||
"id": row[0],
|
||||
"access_token": decrypted_access,
|
||||
"refresh_token": decrypted_refresh,
|
||||
"access_token": row[1],
|
||||
"refresh_token": row[2],
|
||||
"token_type": row[3],
|
||||
"expires_at": row[4],
|
||||
"expires_in": row[5],
|
||||
@@ -331,9 +161,9 @@ class WixOAuthService:
|
||||
}
|
||||
|
||||
with sqlite3.connect(db_path) as conn:
|
||||
self._migrate_plaintext_tokens_if_needed(conn, user_id)
|
||||
cursor = conn.cursor()
|
||||
|
||||
# Get all tokens (active and expired)
|
||||
cursor.execute('''
|
||||
SELECT id, access_token, refresh_token, token_type, expires_at, expires_in, scope, site_id, member_id, created_at, is_active
|
||||
FROM wix_oauth_tokens
|
||||
@@ -346,29 +176,10 @@ class WixOAuthService:
|
||||
expired_tokens = []
|
||||
|
||||
for row in cursor.fetchall():
|
||||
access_token_val = row[1]
|
||||
refresh_token_val = row[2]
|
||||
try:
|
||||
decrypted_access = (
|
||||
self._decrypt_token(access_token_val)
|
||||
if self._is_likely_encrypted_blob(access_token_val)
|
||||
else access_token_val
|
||||
)
|
||||
except InvalidToken:
|
||||
decrypted_access = None
|
||||
try:
|
||||
decrypted_refresh = (
|
||||
self._decrypt_token(refresh_token_val)
|
||||
if self._is_likely_encrypted_blob(refresh_token_val)
|
||||
else refresh_token_val
|
||||
)
|
||||
except InvalidToken:
|
||||
decrypted_refresh = None
|
||||
|
||||
token_data = {
|
||||
"id": row[0],
|
||||
"access_token": decrypted_access,
|
||||
"refresh_token": decrypted_refresh,
|
||||
"access_token": row[1],
|
||||
"refresh_token": row[2],
|
||||
"token_type": row[3],
|
||||
"expires_at": row[4],
|
||||
"expires_in": row[5],
|
||||
@@ -433,46 +244,34 @@ class WixOAuthService:
|
||||
user_id: str,
|
||||
access_token: str,
|
||||
refresh_token: Optional[str] = None,
|
||||
expires_in: Optional[int] = None,
|
||||
token_id: Optional[int] = None
|
||||
expires_in: Optional[int] = None
|
||||
) -> bool:
|
||||
"""Update tokens for a user (e.g., after refresh)."""
|
||||
try:
|
||||
# Ensure DB initialized for this user
|
||||
self._init_db(user_id)
|
||||
db_path = self._get_db_path(user_id)
|
||||
|
||||
expires_at = None
|
||||
if expires_in:
|
||||
expires_at = datetime.now() + timedelta(seconds=expires_in)
|
||||
|
||||
encrypted_access = self._encrypt_token(access_token)
|
||||
encrypted_refresh = self._encrypt_token(refresh_token) if refresh_token else None
|
||||
|
||||
with sqlite3.connect(db_path) as conn:
|
||||
self._migrate_plaintext_tokens_if_needed(conn, user_id)
|
||||
cursor = conn.cursor()
|
||||
if token_id:
|
||||
if encrypted_refresh:
|
||||
cursor.execute('''
|
||||
UPDATE wix_oauth_tokens
|
||||
SET access_token = ?, refresh_token = ?, expires_at = ?, expires_in = ?,
|
||||
is_active = TRUE, updated_at = datetime('now')
|
||||
WHERE user_id = ? AND id = ?
|
||||
''', (encrypted_access, encrypted_refresh, expires_at, expires_in, user_id, token_id))
|
||||
else:
|
||||
cursor.execute('''
|
||||
UPDATE wix_oauth_tokens
|
||||
SET access_token = ?, expires_at = ?, expires_in = ?,
|
||||
is_active = TRUE, updated_at = datetime('now')
|
||||
WHERE user_id = ? AND id = ?
|
||||
''', (encrypted_access, expires_at, expires_in, user_id, token_id))
|
||||
if refresh_token:
|
||||
cursor.execute('''
|
||||
UPDATE wix_oauth_tokens
|
||||
SET access_token = ?, refresh_token = ?, expires_at = ?, expires_in = ?,
|
||||
is_active = TRUE, updated_at = datetime('now')
|
||||
WHERE user_id = ? AND refresh_token = ?
|
||||
''', (access_token, refresh_token, expires_at, expires_in, user_id, refresh_token))
|
||||
else:
|
||||
cursor.execute('''
|
||||
UPDATE wix_oauth_tokens
|
||||
SET access_token = ?, expires_at = ?, expires_in = ?,
|
||||
is_active = TRUE, updated_at = datetime('now')
|
||||
WHERE user_id = ? AND id = (SELECT id FROM wix_oauth_tokens WHERE user_id = ? ORDER BY created_at DESC LIMIT 1)
|
||||
''', (encrypted_access, expires_at, expires_in, user_id, user_id))
|
||||
''', (access_token, expires_at, expires_in, user_id, user_id))
|
||||
conn.commit()
|
||||
logger.info(f"Wix OAuth: Tokens updated for user {user_id}")
|
||||
|
||||
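The two sides of this hunk differ in how they pick the row to update: by explicit token id, by matching the refresh token, or by falling back to the most recent row. A minimal sketch of the "explicit id, else most recent row" variant follows. It is an illustration under assumptions, not the service's code: the `tokens` table, `update_access_token`, and ordering by `id` instead of `created_at` are hypothetical. Returning the affected row count lets a caller notice an update that matched nothing. The lines shown in the diff log success without such a check.

```python
import sqlite3
from typing import Optional


def update_access_token(conn: sqlite3.Connection, user_id: str,
                        access_token: str, token_id: Optional[int] = None) -> int:
    """Update one token row and return the number of rows changed."""
    if token_id is None:
        # Fall back to the user's most recently inserted row.
        row = conn.execute(
            "SELECT id FROM tokens WHERE user_id = ? ORDER BY id DESC LIMIT 1",
            (user_id,),
        ).fetchone()
        if row is None:
            return 0
        token_id = row[0]
    with conn:  # commits on success
        cursor = conn.execute(
            "UPDATE tokens SET access_token = ? WHERE user_id = ? AND id = ?",
            (access_token, user_id, token_id),
        )
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tokens (id INTEGER PRIMARY KEY AUTOINCREMENT, user_id TEXT, access_token TEXT)"
)
conn.executemany(
    "INSERT INTO tokens (user_id, access_token) VALUES (?, ?)",
    [("u1", "old-a"), ("u1", "old-b")],
)
conn.commit()

changed_latest = update_access_token(conn, "u1", "new-b")
changed_by_id = update_access_token(conn, "u1", "new-a", token_id=1)
changed_missing = update_access_token(conn, "u2", "x")
rows = conn.execute("SELECT id, access_token FROM tokens ORDER BY id").fetchall()
print(changed_latest, changed_by_id, changed_missing, rows)
```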
@@ -503,3 +302,4 @@ class WixOAuthService:
|
||||
except Exception as e:
|
||||
logger.error(f"Error revoking Wix token: {e}")
|
||||
return False
|
||||
|
||||
|
||||
@@ -10,7 +10,8 @@ import requests
|
||||
from typing import Optional, Dict, Any, List
|
||||
from datetime import datetime, timedelta
|
||||
from loguru import logger
|
||||
from cryptography.fernet import Fernet, InvalidToken
|
||||
import json
|
||||
import base64
|
||||
|
||||
from services.database import get_user_db_path
|
||||
|
||||
@@ -34,79 +35,11 @@ class WordPressOAuthService:
|
||||
self.redirect_uri = os.getenv('WORDPRESS_REDIRECT_URI', default_redirect)
|
||||
|
||||
self.base_url = "https://public-api.wordpress.com"
|
||||
self.token_encryption_key = (
|
||||
os.getenv("WORDPRESS_TOKEN_ENCRYPTION_KEY")
|
||||
or os.getenv("OAUTH_TOKEN_ENCRYPTION_KEY")
|
||||
)
|
||||
self._fernet = self._initialize_fernet()
|
||||
|
||||
# Validate configuration
|
||||
if not self.client_id or not self.client_secret or self.client_id == 'your_wordpress_com_client_id_here':
|
||||
logger.error("WordPress OAuth client credentials not configured. Please set WORDPRESS_CLIENT_ID and WORDPRESS_CLIENT_SECRET environment variables with valid WordPress.com application credentials.")
|
||||
logger.error("To get credentials: 1. Go to https://developer.wordpress.com/apps/ 2. Create a new application 3. Set redirect URI to: https://your-domain.com/wp/callback")
|
||||
|
||||
def _initialize_fernet(self) -> Optional[Fernet]:
|
||||
"""Initialize token encryption using managed key from env/secret manager."""
|
||||
if not self.token_encryption_key:
|
||||
logger.error("WordPress token encryption key is not configured.")
|
||||
return None
|
||||
try:
|
||||
return Fernet(self.token_encryption_key.encode("utf-8"))
|
||||
except Exception:
|
||||
logger.error("WordPress token encryption key is invalid.")
|
||||
return None
|
||||
|
||||
def _encrypt_token(self, token: Optional[str]) -> Optional[str]:
|
||||
if not token:
|
||||
return None
|
||||
if not self._fernet:
|
||||
raise ValueError("Token encryption is unavailable: missing/invalid managed key")
|
||||
return self._fernet.encrypt(token.encode("utf-8")).decode("utf-8")
|
||||
|
||||
def _decrypt_token(self, token_blob: Optional[str]) -> Optional[str]:
|
||||
if not token_blob:
|
||||
return None
|
||||
if not self._fernet:
|
||||
raise ValueError("Token decryption is unavailable: missing/invalid managed key")
|
||||
return self._fernet.decrypt(token_blob.encode("utf-8")).decode("utf-8")
|
||||
|
||||
def _is_likely_encrypted_blob(self, value: Optional[str]) -> bool:
|
||||
return bool(value and value.startswith("gAAAAA"))
|
||||
|
||||
def _migrate_plaintext_tokens_if_needed(self, conn: sqlite3.Connection, user_id: str) -> None:
|
||||
"""One-time migration path: re-encrypt plaintext rows during rollout."""
|
||||
if not self._fernet:
|
||||
return
|
||||
cursor = conn.cursor()
|
||||
cursor.execute(
|
||||
"""
|
||||
SELECT id, access_token, refresh_token
|
||||
FROM wordpress_oauth_tokens
|
||||
WHERE user_id = ?
|
||||
""",
|
||||
(user_id,),
|
||||
)
|
||||
rows = cursor.fetchall()
|
||||
migrated = 0
|
||||
for token_id, access_token, refresh_token in rows:
|
||||
needs_access_migration = access_token and not self._is_likely_encrypted_blob(access_token)
|
||||
needs_refresh_migration = refresh_token and not self._is_likely_encrypted_blob(refresh_token)
|
||||
if not (needs_access_migration or needs_refresh_migration):
|
||||
continue
|
||||
encrypted_access = self._encrypt_token(access_token) if needs_access_migration else access_token
|
||||
encrypted_refresh = self._encrypt_token(refresh_token) if needs_refresh_migration else refresh_token
|
||||
cursor.execute(
|
||||
"""
|
||||
UPDATE wordpress_oauth_tokens
|
||||
SET access_token = ?, refresh_token = ?, updated_at = datetime('now')
|
||||
WHERE id = ? AND user_id = ?
|
||||
""",
|
||||
(encrypted_access, encrypted_refresh, token_id, user_id),
|
||||
)
|
||||
migrated += 1
|
||||
if migrated:
|
||||
conn.commit()
|
||||
logger.info(f"WordPress OAuth token migration completed for user {user_id}; rows migrated={migrated}")
|
||||
|
||||
def _get_db_path(self, user_id: str) -> str:
|
||||
return get_user_db_path(user_id)
|
||||
@@ -195,7 +128,7 @@ class WordPressOAuthService:
|
||||
def handle_oauth_callback(self, code: str, state: str) -> Optional[Dict[str, Any]]:
|
||||
"""Handle OAuth callback and exchange code for access token."""
|
||||
try:
|
||||
logger.info("WordPress OAuth callback started")
|
||||
logger.info(f"WordPress OAuth callback started - code: {code[:20]}..., state: {state[:20]}...")
|
||||
|
||||
# Extract user_id from state
|
||||
if ':' not in state:
|
||||
@@ -251,7 +184,6 @@ class WordPressOAuthService:
|
||||
|
||||
# Store token information
|
||||
access_token = token_info.get('access_token')
|
||||
refresh_token = token_info.get('refresh_token')
|
||||
blog_id = token_info.get('blog_id')
|
||||
blog_url = token_info.get('blog_url')
|
||||
scope = token_info.get('scope', '')
|
||||
@@ -259,22 +191,20 @@ class WordPressOAuthService:
|
||||
# Calculate expiration (WordPress tokens typically expire in 2 weeks)
|
||||
expires_at = datetime.now() + timedelta(days=14)
|
||||
|
||||
encrypted_access_token = self._encrypt_token(access_token)
|
||||
encrypted_refresh_token = self._encrypt_token(refresh_token) if refresh_token else None
|
||||
|
||||
with sqlite3.connect(db_path) as conn:
|
||||
cursor = conn.cursor()
|
||||
cursor.execute('''
|
||||
INSERT INTO wordpress_oauth_tokens
|
||||
(user_id, access_token, refresh_token, token_type, expires_at, scope, blog_id, blog_url)
|
||||
VALUES (?, ?, ?, ?, ?, ?, ?, ?)
|
||||
''', (user_id, encrypted_access_token, encrypted_refresh_token, 'bearer', expires_at, scope, blog_id, blog_url))
|
||||
(user_id, access_token, token_type, expires_at, scope, blog_id, blog_url)
|
||||
VALUES (?, ?, ?, ?, ?, ?, ?)
|
||||
''', (user_id, access_token, 'bearer', expires_at, scope, blog_id, blog_url))
|
||||
conn.commit()
|
||||
logger.info(f"WordPress OAuth: Token inserted into database for user {user_id}")
|
||||
|
||||
logger.info(f"WordPress OAuth token stored successfully for user {user_id}, blog: {blog_url}")
|
||||
return {
|
||||
"success": True,
|
||||
"access_token": access_token,
|
||||
"blog_id": blog_id,
|
||||
"blog_url": blog_url,
|
||||
"scope": scope,
|
||||
@@ -296,7 +226,6 @@ class WordPressOAuthService:
|
||||
return []
|
||||
|
||||
with sqlite3.connect(db_path) as conn:
|
||||
self._migrate_plaintext_tokens_if_needed(conn, user_id)
|
||||
cursor = conn.cursor()
|
||||
cursor.execute('''
|
||||
SELECT id, access_token, token_type, expires_at, scope, blog_id, blog_url, created_at
|
||||
@@ -307,19 +236,9 @@ class WordPressOAuthService:
|
||||
|
||||
tokens = []
|
||||
for row in cursor.fetchall():
|
||||
access_token_value = row[1]
|
||||
try:
|
||||
decrypted_access_token = (
|
||||
self._decrypt_token(access_token_value)
|
||||
if self._is_likely_encrypted_blob(access_token_value)
|
||||
else access_token_value
|
||||
)
|
||||
except InvalidToken:
|
||||
logger.error(f"Failed to decrypt WordPress token for user {user_id}, token_id={row[0]}")
|
||||
continue
|
||||
tokens.append({
|
||||
"id": row[0],
|
||||
"access_token": decrypted_access_token,
|
||||
"access_token": row[1],
|
||||
"token_type": row[2],
|
||||
"expires_at": row[3],
|
||||
"scope": row[4],
|
||||
@@ -353,7 +272,6 @@ class WordPressOAuthService:
|
||||
}
|
||||
|
||||
with sqlite3.connect(db_path) as conn:
|
||||
self._migrate_plaintext_tokens_if_needed(conn, user_id)
|
||||
cursor = conn.cursor()
|
||||
|
||||
# Get all tokens (active and expired)
|
||||
@@ -371,6 +289,8 @@ class WordPressOAuthService:
|
||||
for row in cursor.fetchall():
|
||||
token_data = {
|
||||
"id": row[0],
|
||||
"access_token": row[1],
|
||||
"refresh_token": row[2],
|
||||
"token_type": row[3],
|
||||
"expires_at": row[4],
|
||||
"scope": row[5],
|
||||
|
||||
@@ -245,42 +245,6 @@ class WordPressService:
|
||||
logger.error(f"Error getting site info for {site_id}: {e}")
|
||||
return None
|
||||
|
||||
def get_posts_for_site(self, user_id: str, site_id: int) -> List[Dict[str, Any]]:
|
||||
"""Get tracked WordPress posts for a specific site."""
|
||||
db_path = self._get_db_path(user_id)
|
||||
if not os.path.exists(db_path):
|
||||
return []
|
||||
try:
|
||||
with sqlite3.connect(db_path) as conn:
|
||||
cursor = conn.cursor()
|
||||
cursor.execute("SELECT name FROM sqlite_master WHERE type='table' AND name='wordpress_posts'")
|
||||
if not cursor.fetchone():
|
||||
return []
|
||||
cursor.execute('''
|
||||
SELECT wp.id, wp.wp_post_id, wp.title, wp.status, wp.published_at, wp.created_at,
|
||||
ws.site_name, ws.site_url
|
||||
FROM wordpress_posts wp
|
||||
JOIN wordpress_sites ws ON wp.site_id = ws.id
|
||||
WHERE wp.user_id = ? AND wp.site_id = ? AND ws.is_active = 1
|
||||
ORDER BY wp.published_at DESC
|
||||
''', (user_id, site_id))
|
||||
posts = []
|
||||
for post_data in cursor.fetchall():
|
||||
posts.append({
|
||||
"id": post_data[0],
|
||||
"wp_post_id": post_data[1],
|
||||
"title": post_data[2],
|
||||
"status": post_data[3],
|
||||
"published_at": post_data[4],
|
||||
"created_at": post_data[5],
|
||||
"site_name": post_data[6],
|
||||
"site_url": post_data[7]
|
||||
})
|
||||
return posts
|
||||
except Exception as e:
|
||||
logger.error(f"Error getting posts for site {site_id}: {e}")
|
||||
return []
|
||||
|
||||
def get_posts_for_all_sites(self, user_id: str) -> List[Dict[str, Any]]:
|
||||
"""Get all tracked WordPress posts for all sites of a user."""
|
||||
db_path = self._get_db_path(user_id)
|
||||
|
||||
@@ -1,323 +0,0 @@
|
||||
"""
|
||||
Link Search Service — Internal & external link discovery and rewording.
|
||||
|
||||
Provides:
|
||||
- Internal link search (Exa include_domains scoped to user's website)
|
||||
- External link search (Exa general search, optionally excluding user's domain)
|
||||
- Reword-with-links (LLM embeds selected links naturally into section/selected text)
|
||||
"""
|
||||
|
||||
import re
|
||||
from typing import Dict, Any, List, Optional
|
||||
from loguru import logger
|
||||
|
||||
from services.llm_providers.main_text_generation import llm_text_gen
|
||||
|
||||
|
||||
LINK_SEARCH_SYSTEM_PROMPT = """You are an SEO and content linking expert. Your task is to naturally incorporate provided links into text using markdown link syntax, following the best practices below.
|
||||
|
||||
## SEO Linking Best Practices
|
||||
|
||||
1. **Anchor text must be descriptive and keyword-rich.** Use the surrounding context to create natural, specific anchor text. Never use "click here", "read more", "learn more", or bare URLs as anchors.
|
||||
- GOOD: [HubSpot's content marketing statistics](url) — descriptive, includes keywords
|
||||
- BAD: [click here](url) — vague, no SEO value
|
||||
- BAD: [https://example.com](url) — raw URL, harmful to readability
|
||||
|
||||
2. **Match link type to content context:**
|
||||
- Internal links: Point anchor text at relevant topic keywords that describe the destination page
|
||||
- External links: Cite authoritative sources (research, official docs, industry leaders) using the source name or key finding as anchor text
|
||||
|
||||
3. **Link equity (PageRank) distribution:** Spread links naturally. Aim for 1-2 links per paragraph at most. Don't cluster all links together.
|
||||
|
||||
4. **Preserve the original text's meaning, tone, structure, and approximate length.** You are inserting links, NOT rewriting the content.
|
||||
|
||||
5. **If selected_text is provided, ONLY modify that specific portion.** The rest of section_text must remain IDENTICAL — character-for-character unchanged.
|
||||
|
||||
6. **If selected_text is NOT provided, you may insert links throughout the entire section_text.**
|
||||
|
||||
7. **Link placement should feel earned, not forced.** Only insert a link where a reader would genuinely want to learn more. If a link doesn't naturally fit, skip it.
|
||||
|
||||
8. **Prioritize high-authority external sources** (research papers, official documentation, industry leaders) when linking externally.
|
||||
|
||||
9. **Return ONLY the reworded text.** No explanations, no preamble, no markdown code fences. Just the text with [anchor text](url) links embedded."""
|
||||
|
||||
|
||||
LINK_SEARCH_USER_PROMPT = """## Section Heading
|
||||
{section_heading}
|
||||
|
||||
## Full Section Text
|
||||
{section_text}
|
||||
|
||||
{selected_text_block}
|
||||
|
||||
## Available Links to Incorporate
|
||||
{links}
|
||||
|
||||
## Instructions
|
||||
Carefully read the section text above and insert the most relevant links from the "Available Links" list using markdown format: [descriptive anchor text](url).
|
||||
|
||||
Remember:
|
||||
- Use keyword-rich, descriptive anchor text (NOT "click here" or bare URLs)
|
||||
- Only insert links where they naturally enhance the reader's experience
|
||||
- Preserve the original text's meaning, tone, and structure
|
||||
- Aim for 1-2 links per paragraph maximum
|
||||
- If no links fit naturally, return the text unchanged
|
||||
|
||||
Return ONLY the text with links embedded. No explanations."""
|
||||
|
||||
|
||||
def _extract_domain(url: str) -> str:
|
||||
"""Extract the registered domain from a URL.
|
||||
|
||||
Handles common multi-part TLDs like .co.uk, .com.au, .co.jp, etc.
|
||||
Falls back to last two parts for unknown TLDs.
|
||||
"""
|
||||
url = url.strip()
|
||||
if not url:
|
||||
return ""
|
||||
# Add protocol if missing
|
||||
if not url.startswith(("http://", "https://")):
|
||||
url = "https://" + url
|
||||
# Remove protocol
|
||||
domain = re.sub(r"^https?://", "", url)
|
||||
# Remove path and query
|
||||
domain = domain.split("/")[0].split("?")[0].split("#")[0]
|
||||
# Remove port
|
||||
domain = domain.split(":")[0]
|
||||
# Remove userinfo (user:pass@)
|
||||
if "@" in domain:
|
||||
domain = domain.split("@")[-1]
|
||||
domain = domain.lower().strip()
|
||||
if not domain:
|
||||
return ""
|
||||
|
||||
# Known multi-part TLDs (common ccTLDs with second-level domains)
|
||||
multi_part_tlds = {
|
||||
"co.uk", "org.uk", "ac.uk", "gov.uk", "co.jp", "or.jp", "ne.jp", "ac.jp",
|
||||
"co.au", "com.au", "org.au", "net.au", "co.nz", "net.nz", "org.nz",
|
||||
"co.in", "net.in", "org.in", "ac.in", "co.kr", "co.za", "org.za", "web.za",
|
||||
"com.br", "com.mx", "com.ar", "com.sg", "com.hk", "com.tw", "com.my",
|
||||
"com.cn", "org.cn", "net.cn", "ac.ke", "co.ke",
|
||||
}
|
||||
parts = domain.split(".")
|
||||
if len(parts) < 2:
|
||||
return domain
|
||||
|
||||
# Check if last two parts form a known multi-part TLD
|
||||
last_two = ".".join(parts[-2:])
|
||||
if last_two in multi_part_tlds and len(parts) > 2:
|
||||
# e.g. blog.example.co.uk → example.co.uk
|
||||
return ".".join(parts[-3:])
|
||||
# Default: last two parts (example.com)
|
||||
return ".".join(parts[-2:])
|
||||
|
||||
|
||||
def _filter_search_results(results: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
|
||||
"""Filter out results with empty URLs or missing essential fields."""
|
||||
filtered = []
|
||||
for r in results:
|
||||
url = r.get("url", "").strip()
|
||||
title = r.get("title", "").strip() or "Untitled"
|
||||
if url:
|
||||
filtered.append({
|
||||
"title": title,
|
||||
"url": url,
|
||||
"text": r.get("text", ""),
|
||||
"publishedDate": r.get("publishedDate", ""),
|
||||
"author": r.get("author", ""),
|
||||
"score": r.get("score", 0.5),
|
||||
})
|
||||
return filtered
|
||||
|
||||
|
||||
class LinkSearchService:
|
||||
"""Service for finding internal/external links and rewording text to include them."""
|
||||
|
||||
async def search_internal(
|
||||
self,
|
||||
query: str,
|
||||
site_url: str,
|
||||
user_id: Optional[str] = None,
|
||||
num_results: int = 5,
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Search for internal links (from the user's own website).
|
||||
|
||||
Args:
|
||||
query: Search query (section topic/heading)
|
||||
site_url: User's website URL to scope search via include_domains
|
||||
user_id: Optional user ID for subscription tracking
|
||||
num_results: Number of results to return
|
||||
|
||||
Returns:
|
||||
{"results": [...], "warnings": [...]}
|
||||
"""
|
||||
warnings = []
|
||||
domain = _extract_domain(site_url)
|
||||
|
||||
if not domain:
|
||||
return {
|
||||
"results": [],
|
||||
"warnings": [f"Could not extract domain from '{site_url}'"],
|
||||
}
|
||||
|
||||
try:
|
||||
from services.blog_writer.research.exa_provider import ExaResearchProvider
|
||||
|
||||
provider = ExaResearchProvider()
|
||||
results = await provider.simple_search(
|
||||
query=query,
|
||||
num_results=num_results,
|
||||
user_id=user_id,
|
||||
include_domains=[domain],
|
||||
)
|
||||
filtered = _filter_search_results(results)
|
||||
return {"results": filtered, "warnings": warnings}
|
||||
|
||||
except ImportError:
|
||||
msg = "Exa provider not available — link search requires Exa API."
|
||||
logger.warning(f"[LinkSearchService] {msg}")
|
||||
warnings.append(msg)
|
||||
return {"results": [], "warnings": warnings}
|
||||
except Exception as e:
|
||||
logger.error(f"[LinkSearchService] Internal link search failed: {e}")
|
||||
warnings.append(f"Search failed: {str(e)}")
|
||||
return {"results": [], "warnings": warnings}
|
||||
|
||||
async def search_external(
|
||||
self,
|
||||
query: str,
|
||||
site_url: Optional[str] = None,
|
||||
user_id: Optional[str] = None,
|
||||
num_results: int = 5,
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Search for external links (optionally excluding the user's own domain).
|
||||
|
||||
Args:
|
||||
query: Search query
|
||||
site_url: User's website URL — results from this domain will be excluded
|
||||
user_id: Optional user ID for subscription tracking
|
||||
num_results: Number of results to return
|
||||
|
||||
Returns:
|
||||
{"results": [...], "warnings": [...]}
|
||||
"""
|
||||
warnings = []
|
||||
exclude_domains = None
|
||||
|
||||
if site_url:
|
||||
domain = _extract_domain(site_url)
|
||||
if domain:
|
||||
exclude_domains = [domain]
|
||||
|
||||
try:
|
||||
from services.blog_writer.research.exa_provider import ExaResearchProvider
|
||||
|
||||
provider = ExaResearchProvider()
|
||||
results = await provider.simple_search(
|
||||
query=query,
|
||||
num_results=num_results,
|
||||
user_id=user_id,
|
||||
exclude_domains=exclude_domains,
|
||||
)
|
||||
filtered = _filter_search_results(results)
|
||||
return {"results": filtered, "warnings": warnings}
|
||||
|
||||
except ImportError:
|
||||
msg = "Exa provider not available — link search requires Exa API."
|
||||
logger.warning(f"[LinkSearchService] {msg}")
|
||||
warnings.append(msg)
|
||||
return {"results": [], "warnings": warnings}
|
||||
except Exception as e:
|
||||
logger.error(f"[LinkSearchService] External link search failed: {e}")
|
||||
warnings.append(f"Search failed: {str(e)}")
|
||||
return {"results": [], "warnings": warnings}
|
||||
|
||||
def reword_with_links(
|
||||
self,
|
||||
section_text: str,
|
||||
links: List[Dict[str, str]],
|
||||
section_heading: Optional[str] = None,
|
||||
selected_text: Optional[str] = None,
|
||||
user_id: Optional[str] = None,
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Use LLM to reword text, naturally incorporating the selected links.
|
||||
|
||||
Args:
|
||||
section_text: Full section text
|
||||
links: List of {"url": str, "title": str} dicts
|
||||
section_heading: Optional section heading for context
|
||||
selected_text: If provided, only reword this portion of the text
|
||||
user_id: Optional user ID for LLM routing
|
||||
|
||||
Returns:
|
||||
{"reworded_text": str, "warnings": [...]}
|
||||
"""
|
||||
warnings = []
|
||||
|
||||
if not links:
|
||||
return {
|
||||
"reworded_text": section_text,
|
||||
"warnings": ["No links provided — returning original text unchanged."],
|
||||
}
|
||||
|
||||
links_text = "\n".join(
|
||||
f"- [{link.get('title', 'Untitled')}]({link.get('url', '')}) — {link.get('title', '')}"
|
||||
for link in links
|
||||
)
|
||||
|
||||
selected_text_block = ""
|
||||
if selected_text:
|
||||
selected_text_block = f"Selected text to reword (keep surrounding text unchanged):\n{selected_text}"
|
||||
|
||||
prompt = LINK_SEARCH_USER_PROMPT.format(
|
||||
section_heading=section_heading or "Blog Section",
|
||||
section_text=section_text[:3000],
|
||||
selected_text_block=selected_text_block,
|
||||
links=links_text,
|
||||
)
|
||||
|
||||
try:
|
||||
result = llm_text_gen(
|
||||
prompt=prompt,
|
||||
system_prompt=LINK_SEARCH_SYSTEM_PROMPT,
|
||||
json_struct=None,
|
||||
max_tokens=3000,
|
||||
user_id=user_id,
|
||||
)
|
||||
|
||||
raw = result.get("text", "") if isinstance(result, dict) else str(result) if result else ""
|
||||
raw = raw.strip()
|
||||
|
||||
# Strip markdown code fences if the LLM wrapped the output
|
||||
if raw.startswith("```"):
|
||||
match = re.search(r"```(?:markdown|md)?\s*(.*?)\s*```", raw, re.DOTALL)
|
||||
if match:
|
||||
raw = match.group(1).strip()
|
||||
|
||||
if not raw:
|
||||
warnings.append("LLM returned empty reworded text — returning original.")
|
||||
return {"reworded_text": section_text, "warnings": warnings}
|
||||
|
||||
logger.info(f"[LinkSearchService] Reworded text: {len(raw)} chars, {len(links)} links provided")
|
||||
return {"reworded_text": raw, "warnings": warnings}
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[LinkSearchService] Reword failed: {e}")
|
||||
warnings.append(f"Reword failed: {str(e)}")
|
||||
return {"reworded_text": section_text, "warnings": warnings}
|
||||
|
||||
|
||||
# Per-user service instances (not strictly needed since service is stateless,
|
||||
# but kept for consistency with chart_service pattern)
|
||||
_link_search_instances: Dict[str, LinkSearchService] = {}
|
||||
|
||||
|
||||
def get_link_search_service(user_id: Optional[str] = None) -> LinkSearchService:
|
||||
"""Get or create LinkSearchService for the given user."""
|
||||
cache_key = user_id or "default"
|
||||
if cache_key not in _link_search_instances:
|
||||
_link_search_instances[cache_key] = LinkSearchService()
|
||||
return _link_search_instances[cache_key]
|
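Note on `_extract_domain` in the removed link search module above: it strips the scheme, path, port, and userinfo by hand, and the port split runs before the userinfo check, so an input such as `https://user:pass@example.com` appears to reduce to `user`. A minimal sketch of the same steps using `urllib.parse.urlsplit` from the standard library follows. The name `registered_domain` and the shortened suffix set are illustrative assumptions, not part of this commit.

```python
from urllib.parse import urlsplit

# Illustrative subset only; the original helper lists many more suffixes.
MULTI_PART_SUFFIXES = {"co.uk", "com.au", "co.jp"}


def registered_domain(url: str) -> str:
    """Return the registrable domain of a URL, or "" if none can be found."""
    url = url.strip()
    if not url:
        return ""
    # urlsplit only fills in the host when a scheme is present.
    if not url.startswith(("http://", "https://")):
        url = "https://" + url
    try:
        # .hostname drops userinfo and port and lowercases the host.
        host = (urlsplit(url).hostname or "").lower()
    except ValueError:
        return ""
    parts = host.split(".")
    if len(parts) < 2:
        return host
    # Keep three labels when the last two form a known multi-part suffix.
    if ".".join(parts[-2:]) in MULTI_PART_SUFFIXES and len(parts) > 2:
        return ".".join(parts[-3:])
    return ".".join(parts[-2:])
```

A hard-coded suffix set will always be incomplete; the sketch keeps it only to mirror the original fallback of "last two labels" for unknown suffixes.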
||||
@@ -46,7 +46,6 @@ def llm_text_gen(
|
||||
preferred_provider: Optional[str] = None,
|
||||
flow_type: Optional[str] = None,
|
||||
max_tokens: Optional[int] = None,
|
||||
temperature: Optional[float] = None,
|
||||
) -> str:
|
||||
"""
|
||||
Generate text using Language Model (LLM) based on the provided prompt.
|
||||
@@ -59,8 +58,6 @@ def llm_text_gen(
|
||||
preferred_hf_models (list, optional): Preferred HuggingFace models.
|
||||
preferred_provider (str, optional): Preferred provider (google, huggingface).
|
||||
flow_type (str, optional): Flow type for logging (e.g., 'sif_agent', 'premium_tool').
|
||||
max_tokens (int, optional): Max tokens for response. If None, provider default is used.
|
||||
temperature (float, optional): Temperature for generation (0.0-1.0). If None, defaults to 0.7.
|
||||
|
||||
Returns:
|
||||
str: Generated text based on the prompt.
|
||||
@@ -78,8 +75,9 @@ def llm_text_gen(
|
||||
# Set default values for LLM parameters
|
||||
gpt_provider = "google" # Default to Google Gemini
|
||||
model = "gemini-2.0-flash-001"
|
||||
if temperature is None:
|
||||
temperature = 0.7
|
||||
temperature = 0.7
|
||||
if max_tokens is None:
|
||||
max_tokens = 4000
|
||||
top_p = 0.9
|
||||
n = 1
|
||||
fp = 16
|
||||
@@ -431,23 +429,6 @@ def llm_text_gen(
|
||||
except Exception as provider_error:
|
||||
logger.error(f"[llm_text_gen] Provider {gpt_provider} failed: {str(provider_error)}")
|
||||
|
||||
# Surface balance/quota errors immediately without fallback
|
||||
error_str = str(provider_error).lower()
|
||||
if "insufficient_balance" in error_str or "balance_not_enough" in error_str or ("403" in error_str and "balance" in error_str):
|
||||
logger.error(f"[llm_text_gen] Balance/quota error from {gpt_provider}, not attempting fallback")
|
||||
raise HTTPException(
|
||||
status_code=403,
|
||||
detail={
|
||||
"error": "insufficient_balance",
|
||||
"message": f"Your {gpt_provider.capitalize()} API balance is insufficient. Please top up your account or switch providers.",
|
||||
"usage_info": {
|
||||
"error_type": "insufficient_balance",
|
||||
"provider": gpt_provider,
|
||||
"suggestion": f"Set GPT_PROVIDER=google in your environment to use Gemini instead, or add credits to your {gpt_provider.capitalize()} account."
|
||||
}
|
||||
}
|
||||
)
|
||||
|
||||
# CIRCUIT BREAKER: Only try ONE fallback to prevent expensive API calls
|
||||
fallback_providers = ["google", "huggingface"]
|
||||
fallback_providers = [p for p in fallback_providers if p in available_providers and p != gpt_provider]
|
||||
|
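Note on the balance/quota check in this hunk: it is a substring test on the lowercased provider error. As a rough sketch, the same three conditions can be isolated in a small predicate so they are testable without calling a provider. The function name `is_balance_error` is hypothetical and does not appear in the commit.

```python
def is_balance_error(error: Exception) -> bool:
    """Mirror the substring checks from the hunk above on a lowercased message."""
    text = str(error).lower()
    return (
        "insufficient_balance" in text
        or "balance_not_enough" in text
        # A bare 403 is not enough here; the message must also mention balance.
        or ("403" in text and "balance" in text)
    )
```

Matching on message text is fragile by nature, so this is best read as a description of what the hunk does rather than a recommended detection strategy.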
||||
@@ -353,11 +353,7 @@ def wavespeed_text_response(
|
||||
|
||||
raise Exception(f"WaveSpeed text generation failed: {str(e)}")
|
||||
|
||||
@retry(
|
||||
retry=retry_if_exception(_should_retry_wavespeed_error),
|
||||
wait=wait_random_exponential(min=1, max=60),
|
||||
stop=stop_after_attempt(6),
|
||||
)
|
||||
@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(6))
|
||||
def wavespeed_structured_json_response(
|
||||
prompt: str,
|
||||
schema: Dict[str, Any],
|
||||
@@ -612,20 +608,4 @@ def wavespeed_structured_json_response(
|
||||
error_msg = str(e) if str(e) else repr(e)
|
||||
error_type = type(e).__name__
|
||||
logger.error(f"❌ WaveSpeed structured JSON generation failed [{error_type}]: {error_msg}")
|
||||
|
||||
# Surface balance/quota errors as HTTPException so upstream can show user-friendly messages
|
||||
from fastapi import HTTPException
|
||||
if "balance_not_enough" in error_msg or "403" in error_msg or "PermissionDenied" in error_type:
|
||||
raise HTTPException(
|
||||
status_code=403,
|
||||
detail={
|
||||
"error": "insufficient_balance",
|
||||
"message": "WaveSpeed API balance is insufficient. Please top up your WaveSpeed account or switch to a different provider.",
|
||||
"usage_info": {
|
||||
"error_type": "insufficient_balance",
|
||||
"provider": "wavespeed",
|
||||
"suggestion": "Set GPT_PROVIDER=google in your environment to use Gemini instead, or add credits to your WaveSpeed account."
|
||||
}
|
||||
}
|
||||
)
|
||||
raise Exception(f"WaveSpeed structured JSON generation failed: {error_msg}")
|
||||
|
||||
@@ -4,8 +4,6 @@ Layered composition pipeline: Background + Chart + Avatar Circle + Text Overlays
|
||||
"""
|
||||
|
||||
import json
|
||||
import tempfile
|
||||
import uuid
|
||||
import numpy as np
|
||||
from pathlib import Path
|
||||
from dataclasses import dataclass, field
|
||||
@@ -42,7 +40,7 @@ def crossfade_concat(scenes: list, fade_dur: float = 0.5):
|
||||
if i > 0:
|
||||
c = c.fx(vfx.CrossFadeIn, fade_dur)
|
||||
faded.append(c)
|
||||
return concatenate_videoclips(faded, padding=-fade_dur, method="compose")
|
||||
return concatenate_videoclips(faded, padding=-int(fade_dur), method="compose")
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
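Note on the two `concatenate_videoclips` lines in this hunk: they differ only in `padding`, `-fade_dur` in one and `-int(fade_dur)` in the other. For fractional durations these are not equivalent, because `int()` truncates toward zero. The check below is plain Python arithmetic with a hypothetical helper name; it does not call moviepy.

```python
def paddings(fade_dur: float) -> tuple[float, int]:
    """Return the padding each variant of the call would pass."""
    return (-fade_dur, -int(fade_dur))
```

With the default `fade_dur=0.5` the integer variant passes a padding of zero, so consecutive clips would presumably no longer overlap during the crossfade.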
||||
@@ -307,6 +305,8 @@ def make_line_trend(data: dict, out_path: str, title: str = "") -> str:
|
||||
fig.savefig(out_path, dpi=150, transparent=True, bbox_inches="tight")
|
||||
plt.close(fig)
|
||||
return out_path
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Text / Bullet overlay (Pillow → PNG)
|
||||
# ---------------------------------------------------------------------------
|
||||
@@ -403,7 +403,7 @@ def ken_burns(clip: ImageClip, zoom_ratio: float = 0.08) -> ImageClip:
|
||||
# Scene builders (one per visual_cue type)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def build_data_scene(assets: SceneAssets, insight: Insight, temp_dir: Path) -> CompositeVideoClip:
|
||||
def build_data_scene(assets: SceneAssets, insight: Insight) -> CompositeVideoClip:
|
||||
"""
|
||||
Layout: Background (Ken Burns) + Chart (fade-in) + Avatar circle (corner) + Insight card
|
||||
"""
|
||||
@@ -427,7 +427,7 @@ def build_data_scene(assets: SceneAssets, insight: Insight, temp_dir: Path) -> C
|
||||
.fx(vfx.fadeout, 0.4))
|
||||
layers.append(chart)
|
||||
|
||||
card_path = str(temp_dir / f"insight_card_{uuid.uuid4().hex}.png")
|
||||
card_path = "/tmp/insight_card.png"
|
||||
make_insight_card(insight.key_insight, insight.supporting_stat, card_path)
|
||||
card = (ImageClip(card_path)
|
||||
.set_duration(d - 1)
|
||||
@@ -446,7 +446,7 @@ def build_data_scene(assets: SceneAssets, insight: Insight, temp_dir: Path) -> C
|
||||
|
||||
|
||||
def build_bullet_scene(assets: SceneAssets, insight: Insight,
|
||||
bullets: list[str], temp_dir: Path) -> CompositeVideoClip:
|
||||
bullets: list[str]) -> CompositeVideoClip:
|
||||
"""
|
||||
Layout: AI image (Ken Burns) + Bullet overlay + Avatar circle
|
||||
"""
|
||||
@@ -460,7 +460,7 @@ def build_bullet_scene(assets: SceneAssets, insight: Insight,
|
||||
bg = bg.fx(vfx.lum_contrast, 0, -50)
|
||||
layers.append(bg)
|
||||
|
||||
bullet_path = str(temp_dir / f"bullets_{uuid.uuid4().hex}.png")
|
||||
bullet_path = "/tmp/bullets.png"
|
||||
make_bullet_overlay(bullets, bullet_path, width=860)
|
||||
bullets_clip = (ImageClip(bullet_path)
|
||||
.set_duration(d - 1)
|
||||
@@ -490,20 +490,15 @@ def build_full_avatar_scene(assets: SceneAssets, insight: Insight) -> VideoFileC
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def dispatch_scene(insight: Insight, assets: SceneAssets,
|
||||
bullet_lines: Optional[list[str]] = None,
|
||||
temp_dir: Optional[str | Path] = None):
|
||||
bullet_lines: Optional[list[str]] = None):
|
||||
"""Dispatch scene based on visual_cue type."""
|
||||
cue = insight.visual_cue
|
||||
scene_temp_dir = Path(temp_dir) if temp_dir else Path(
|
||||
tempfile.mkdtemp(prefix=f"broll_{cue}_")
|
||||
)
|
||||
scene_temp_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
if cue == "full_avatar":
|
||||
return build_full_avatar_scene(assets, insight)
|
||||
|
||||
elif cue in ("bar_comparison", "bar_chart_comparison", "bar_horizontal", "line_trend", "pie", "stacked_bar"):
|
||||
chart_path = str(scene_temp_dir / f"chart_{uuid.uuid4().hex}.png")
|
||||
chart_path = "/tmp/chart.png"
|
||||
chart_data = insight.chart_data or {}
|
||||
if cue in ("bar_comparison", "bar_chart_comparison"):
|
||||
# Normalize {labels, values} -> {labels, before, after} for make_bar_chart
|
||||
@@ -528,14 +523,14 @@ def dispatch_scene(insight: Insight, assets: SceneAssets,
|
||||
make_stacked_bar(chart_data, chart_path,
|
||||
title=insight.key_insight)
|
||||
assets.chart_img = chart_path
|
||||
return build_data_scene(assets, insight, scene_temp_dir)
|
||||
return build_data_scene(assets, insight)
|
||||
|
||||
elif cue == "bullet_points":
|
||||
lines = bullet_lines or [insight.key_insight, insight.supporting_stat]
|
||||
return build_bullet_scene(assets, insight, lines, scene_temp_dir)
|
||||
return build_bullet_scene(assets, insight, lines)
|
||||
|
||||
else:
|
||||
return build_data_scene(assets, insight, scene_temp_dir)
|
||||
return build_data_scene(assets, insight)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
@@ -576,10 +571,8 @@ def pipeline_from_json(insight_json: str,
|
||||
data = json.loads(insight_json)
|
||||
insight = Insight(**{k: data[k] for k in Insight.__dataclass_fields__ if k in data})
|
||||
assets = SceneAssets(background_img=background_img, avatar_video=avatar_video)
|
||||
scene_temp_dir = Path(tempfile.mkdtemp(prefix=f"scene_{insight.visual_cue}_"))
|
||||
scene = dispatch_scene(insight, assets,
|
||||
bullet_lines=data.get("bullet_lines"),
|
||||
temp_dir=scene_temp_dir)
|
||||
bullet_lines=data.get("bullet_lines"))
|
||||
out = f"/tmp/scene_{insight.visual_cue}.mp4"
|
||||
compose_video([scene], output_path=out)
|
||||
return out
|
||||
@@ -627,4 +620,4 @@ if __name__ == "__main__":
|
||||
})
|
||||
print("\nSample Insight JSON:\n", sample_json)
|
||||
print("\nAll asset generation tests passed.")
|
||||
print("To run full video composition, supply real background_img and avatar_video paths.")
|
||||
print("To run full video composition, supply real background_img and avatar_video paths.")
|
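Note on intermediate PNG paths in this file: the hunks above show two approaches, fixed names such as `/tmp/chart.png` and per-scene names built from a temp directory plus `uuid.uuid4().hex`. A fixed path is shared by every call, so two scenes rendered at the same time could overwrite each other's chart or card. A minimal sketch of the per-scene approach, assuming a hypothetical helper `scene_asset_path` that is not part of the commit:

```python
import tempfile
import uuid
from pathlib import Path
from typing import Optional


def scene_asset_path(kind: str, temp_dir: Optional[Path] = None) -> Path:
    """Build a collision-resistant PNG path for one scene asset."""
    # Fall back to a fresh private directory when the caller supplies none.
    if temp_dir:
        base = Path(temp_dir)
    else:
        base = Path(tempfile.mkdtemp(prefix=f"broll_{kind}_"))
    base.mkdir(parents=True, exist_ok=True)
    # A random hex suffix keeps repeated calls in one directory distinct.
    return base / f"{kind}_{uuid.uuid4().hex}.png"
```

The caller remains responsible for deleting the directory once the video has been written.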
||||
@@ -5,8 +5,6 @@ This service handles:
|
||||
- Chart data extraction from research
|
||||
- Individual scene B-roll video generation
|
||||
- Final video composition from multiple B-roll scenes
|
||||
|
||||
Chart preview generation is delegated to the shared ChartService.
|
||||
"""
|
||||
|
||||
import json
|
||||
@@ -17,18 +15,21 @@ from pathlib import Path
|
||||
from typing import Dict, Any, Optional, List, TYPE_CHECKING
|
||||
from loguru import logger
|
||||
|
||||
# Import video compositing from broll_composer
|
||||
# Import chart generators directly
|
||||
from services.podcast.broll_composer import (
|
||||
Insight,
|
||||
SceneAssets,
|
||||
dispatch_scene,
|
||||
compose_video,
|
||||
make_bar_chart,
|
||||
make_horizontal_bar,
|
||||
make_line_trend,
|
||||
make_pie_chart,
|
||||
make_stacked_bar,
|
||||
make_bullet_overlay,
|
||||
make_insight_card,
|
||||
)
|
||||
|
||||
# Import shared chart service for preview generation
|
||||
from services.chart_service import ChartService, get_chart_service
|
||||
|
||||
|
||||
class BrollService:
|
||||
"""Orchestrates B-roll composition for podcast scenes."""
|
||||
@@ -41,14 +42,13 @@ class BrollService:
|
||||
output_dir: Base directory for B-roll output. Defaults to workspace chart directory.
|
||||
user_id: User ID for multi-tenant workspace isolation.
|
||||
"""
|
||||
self._user_id = user_id
|
||||
if output_dir:
|
||||
self.output_dir = Path(output_dir)
|
||||
else:
|
||||
self.output_dir = self._get_chart_dir(user_id)
|
||||
|
||||
self.output_dir.mkdir(parents=True, exist_ok=True)
|
||||
logger.info(f"[BrollService] Initialized with output directory: {self.output_dir}")
|
||||
logger.warning(f"[BrollService] Initialized with output directory: {self.output_dir}")
|
||||
|
||||
def _get_chart_dir(self, user_id: Optional[str] = None) -> Path:
|
||||
"""Get chart directory from podcast constants (workspace-aware)."""
|
||||
@@ -78,22 +78,145 @@ class BrollService:
|
||||
"""
|
||||
Generate a chart PNG preview (static, for Write phase).
|
||||
|
||||
Delegates to ChartService for rendering, then returns the local file path.
|
||||
Args:
|
||||
chart_data: Chart data dict with labels, before/after, etc.
|
||||
chart_type: Type of chart (bar_comparison, bar_horizontal, line_trend, pie, stacked_bar, bullet)
|
||||
title: Title for the chart
|
||||
subtitle: Optional subtitle at bottom
|
||||
|
||||
Returns:
|
||||
Path to generated PNG file
|
||||
"""
|
||||
resolved_chart_id = chart_id or uuid.uuid4().hex[:8]
|
||||
out_path = str(self.get_chart_preview_path(resolved_chart_id))
|
||||
|
||||
logger.info(f"[BrollService] Generating chart preview: type={chart_type}, id={resolved_chart_id}")
|
||||
# Debug logging
|
||||
logger.warning(f"[BrollService] Generating: type={chart_type}, data keys={list(chart_data.keys())}")
|
||||
|
||||
chart_svc = get_chart_service(user_id=self._user_id)
|
||||
result = chart_svc.generate_chart(
|
||||
chart_data=chart_data,
|
||||
chart_type=chart_type,
|
||||
title=title,
|
||||
subtitle=subtitle or "",
|
||||
chart_id=resolved_chart_id,
|
||||
)
|
||||
|
||||
return result.get("path", "")
|
||||
try:
|
||||
if chart_type == "bar_comparison":
|
||||
# Accept both formats: {labels, before, after} OR {labels, values}
|
||||
labels = chart_data.get("labels", [])
|
||||
before = chart_data.get("before", [])
|
||||
after = chart_data.get("after", [])
|
||||
# If using new format (labels, values), treat as single bar chart
|
||||
if not before and not after:
|
||||
values = chart_data.get("values", [])
|
||||
if values:
|
||||
# Normalize to same length, truncating or padding as needed
|
||||
n = min(len(labels), len(values))
|
||||
labels = labels[:n]
|
||||
before = [0] * n
|
||||
after = values[:n]
|
||||
# Create modified data dict with proper format for make_bar_chart
|
||||
chart_data_for_render = {
|
||||
"labels": labels,
|
||||
"before": before,
|
||||
"after": after
|
||||
}
|
||||
else:
|
||||
chart_data_for_render = chart_data
|
||||
else:
|
||||
chart_data_for_render = chart_data
|
||||
if not labels or (not before and not after):
|
||||
logger.warning(f"[BrollService] Missing required data for bar_comparison: labels={len(labels)}, before={len(before)}, after={len(after)}")
|
||||
return ""
|
||||
if len(labels) != len(before) or len(labels) != len(after):
|
||||
logger.warning(f"[BrollService] Data shape mismatch: labels={len(labels)}, before={len(before)}, after={len(after)}")
|
||||
return ""
|
||||
make_bar_chart(chart_data_for_render, out_path, title, subtitle=subtitle)
|
||||
logger.warning(f"[BrollService] bar_comparison rendered: {out_path}, exists={os.path.exists(out_path)}")
|
||||
elif chart_type == "bar_horizontal":
|
||||
labels = chart_data.get("labels", [])
|
||||
values = chart_data.get("values", [])
|
||||
if not labels or not values:
|
||||
logger.warning("[BrollService] Missing required data for bar_horizontal")
|
||||
return ""
|
||||
make_horizontal_bar(chart_data, out_path, title)
|
||||
logger.warning(f"[BrollService] bar_horizontal rendered: {out_path}, exists={os.path.exists(out_path)}")
|
||||
elif chart_type == "line_trend":
|
||||
labels = chart_data.get("labels", [])
|
||||
values = chart_data.get("values", [])
|
||||
if not labels or not values:
|
||||
logger.warning("[BrollService] Missing required data for line_trend")
|
||||
return ""
|
||||
make_line_trend(chart_data, out_path, title)
|
||||
logger.warning(f"[BrollService] line_trend rendered: {out_path}, exists={os.path.exists(out_path)}")
|
||||
elif chart_type == "pie":
|
||||
labels = chart_data.get("labels", [])
|
||||
values = chart_data.get("values", [])
|
||||
if not labels or not values:
|
||||
logger.warning("[BrollService] Missing required data for pie")
|
||||
return ""
|
||||
make_pie_chart(chart_data, out_path, title)
|
||||
logger.warning(f"[BrollService] pie rendered: {out_path}, exists={os.path.exists(out_path)}")
|
||||
elif chart_type == "stacked_bar":
|
||||
labels = chart_data.get("labels", [])
|
||||
segments = chart_data.get("segments", [])
|
||||
if not labels or not segments:
|
||||
logger.warning("[BrollService] Missing required data for stacked_bar")
|
||||
return ""
|
||||
make_stacked_bar(chart_data, out_path, title)
|
||||
logger.warning(f"[BrollService] stacked_bar rendered: {out_path}, exists={os.path.exists(out_path)}")
|
||||
elif chart_type == "bullet" or chart_type == "bullet_points":
|
||||
# Accept both: bullet_points OR labels
|
||||
bullet_points = chart_data.get("bullet_points", [])
|
||||
# If using new format, use labels as bullet points
|
||||
if not bullet_points:
|
||||
bullet_points = chart_data.get("labels", [])
|
||||
if not bullet_points:
|
||||
labels_fallback = chart_data.get("labels", [])
|
||||
if labels_fallback:
|
||||
bullet_points = labels_fallback
|
||||
if bullet_points:
|
||||
make_bullet_overlay(bullet_points, out_path)
|
||||
logger.warning(f"[BrollService] bullet_points rendered: {out_path}, exists={os.path.exists(out_path)}")
|
||||
else:
|
||||
logger.warning("[BrollService] No bullet points provided")
|
||||
return ""
|
||||
else:
|
||||
logger.warning(f"[BrollService] Unknown chart type: {chart_type}, falling back to bar_comparison")
|
||||
# Try bar_comparison as fallback
|
||||
try:
|
||||
make_bar_chart(chart_data, out_path, title, subtitle=subtitle)
|
||||
return out_path
|
||||
except Exception as fallback_err:
|
||||
logger.warning(f"[BrollService] Fallback also failed: {fallback_err}")
|
||||
return ""
|
||||
|
||||
logger.warning(f"[BrollService] Chart preview generated: {out_path}, exists={os.path.exists(out_path) if out_path else 'N/A'}")
|
||||
|
||||
# Add source attribution overlay if present
|
||||
source = chart_data.get("source", "").strip()
|
||||
if source and out_path and os.path.exists(out_path):
|
||||
try:
|
||||
from PIL import Image as PILImage, ImageDraw, ImageFont
|
||||
img = PILImage.open(out_path).convert("RGBA")
|
||||
draw = ImageDraw.Draw(img)
|
||||
source_text = f"Source: {source[:80]}"
|
||||
try:
|
||||
font = ImageFont.truetype("/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf", 11)
|
||||
except (OSError, IOError):
|
||||
try:
|
||||
font = ImageFont.truetype("arial.ttf", 11)
|
||||
except (OSError, IOError):
|
||||
font = ImageFont.load_default()
|
||||
text_bbox = draw.textbbox((0, 0), source_text, font=font)
|
||||
text_w = text_bbox[2] - text_bbox[0]
|
||||
text_h = text_bbox[3] - text_bbox[1]
|
||||
x = img.width - text_w - 12
|
||||
y = img.height - text_h - 8
|
||||
draw.rectangle([x - 4, y - 2, x + text_w + 4, y + text_h + 2], fill=(0, 0, 0, 140))
|
||||
draw.text((x, y), source_text, fill=(200, 200, 200, 220), font=font)
|
||||
img.save(out_path)
|
||||
except Exception as src_err:
|
||||
logger.warning(f"[BrollService] Source overlay failed (non-fatal): {src_err}")
|
||||
|
||||
return out_path
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[BrollService] Failed to generate chart preview: {e}")
|
||||
return ""
|
||||
|
||||
def generate_scene_broll(
|
||||
self,
|
||||
@@ -139,13 +262,9 @@ class BrollService:
|
||||
background_img=background_img_path,
|
||||
avatar_video=avatar_video_path,
|
||||
)
|
||||
scene_temp_dir = self.get_output_path(
|
||||
f"scene_assets_{scene_id_safe}_{uuid.uuid4().hex[:8]}"
|
||||
)
|
||||
scene_temp_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Generate the scene
|
||||
scene = dispatch_scene(insight, assets, temp_dir=scene_temp_dir)
|
||||
scene = dispatch_scene(insight, assets)
|
||||
|
||||
# Write video
|
||||
compose_video([scene], output_path=out_path)
|
||||
|
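Note on the `bar_comparison` branch in the chart preview code above: it accepts either `{labels, before, after}` or `{labels, values}`, and in the second case truncates to the shorter list and uses a zero `before` baseline. The sketch below restates that normalization as a pure function. The name `normalize_bar_data` is hypothetical, and it returns `None` where the original returns an empty path string.

```python
from typing import Any, Dict, Optional


def normalize_bar_data(chart_data: Dict[str, Any]) -> Optional[Dict[str, list]]:
    """Return {labels, before, after} with equal lengths, or None if unusable."""
    labels = list(chart_data.get("labels", []))
    before = list(chart_data.get("before", []))
    after = list(chart_data.get("after", []))

    if not before and not after:
        values = list(chart_data.get("values", []))
        # Truncate to the shorter of labels/values, then use a zero baseline.
        n = min(len(labels), len(values))
        labels, before, after = labels[:n], [0] * n, values[:n]

    if not labels or (not before and not after):
        return None
    if len(labels) != len(before) or len(labels) != len(after):
        return None
    return {"labels": labels, "before": before, "after": after}
```

Keeping the shape check separate from rendering makes the accepted input formats easy to test without matplotlib.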
||||
@@ -343,7 +343,7 @@ class GoogleTrendsService:
|
||||
logger.info(
|
||||
f"[Trends] ===== DONE analyze_trends ===== total={total_ms}ms "
|
||||
f"iot={len(interest_over_time)} ibr={len(interest_by_region)} "
|
||||
f"rt_top={len(related_topics.get('top', []))} rq_top={len(related_queries.get('top', []))}"
|
||||
f"rt_top={rt_top} rq_top={rq_top}"
|
||||
)
|
||||
|
||||
result = {
|
||||
|
||||
@@ -2,595 +2,51 @@
 Enterprise SEO Service

 Comprehensive enterprise-level SEO audit service that orchestrates
-multiple SEO tools into intelligent workflows with advanced analytics.
-
-Features:
-- Multi-tool orchestration (Technical, Content, Performance)
-- Competitive intelligence analysis
-- ROI-focused recommendations
-- Executive reporting and scoring
-- Content opportunity identification
-- Search performance optimization
+multiple SEO tools into intelligent workflows.
 """

-from typing import Dict, Any, List, Optional, Tuple
-from datetime import datetime, timedelta
-from dataclasses import dataclass, asdict
-import asyncio
-import json
+from typing import Dict, Any, List, Optional
+from datetime import datetime
 from loguru import logger
import aiohttp
|
||||
|
||||
from services.seo_tools.technical_seo_service import TechnicalSEOService
|
||||
from services.seo_tools.on_page_seo_service import OnPageSEOService
|
||||
from services.seo_tools.pagespeed_service import PageSpeedService
|
||||
from services.seo_tools.sitemap_service import SitemapService
|
||||
from services.seo_tools.content_strategy_service import ContentStrategyService
|
||||
from services.llm_providers.main_text_generation import llm_text_gen
|
||||
|
||||
|
||||
@dataclass
|
||||
class AuditComponent:
|
||||
"""Data class for audit component results"""
|
||||
component_name: str
|
||||
status: str # 'completed', 'failed', 'pending'
|
||||
score: Optional[float] = None
|
||||
critical_issues: Optional[List[str]] = None
|
||||
recommendations: Optional[List[str]] = None
|
||||
execution_time: Optional[float] = None
|
||||
|
||||
|
||||
class EnterpriseSEOService:
|
||||
"""Service for enterprise SEO audits and workflows with full orchestration"""
|
||||
"""Service for enterprise SEO audits and workflows"""
|
||||
|
||||
def __init__(self):
|
||||
"""Initialize the enterprise SEO service with all sub-services"""
|
||||
"""Initialize the enterprise SEO service"""
|
||||
self.service_name = "enterprise_seo_suite"
|
||||
self.version = "2.0"
|
||||
|
||||
# Initialize sub-services
|
||||
self.technical_seo_service = TechnicalSEOService()
|
||||
self.on_page_seo_service = OnPageSEOService()
|
||||
self.pagespeed_service = PageSpeedService()
|
||||
self.sitemap_service = SitemapService()
|
||||
self.content_strategy_service = ContentStrategyService()
|
||||
|
||||
logger.info(f"Initialized {self.service_name} v{self.version} with all sub-services")
|
||||
logger.info(f"Initialized {self.service_name}")
|
||||
|
||||
async def execute_complete_audit(
|
||||
self,
|
||||
website_url: str,
|
||||
competitors: Optional[List[str]] = None,
|
||||
target_keywords: Optional[List[str]] = None,
|
||||
include_content_analysis: bool = True,
|
||||
include_competitive_analysis: bool = True,
|
||||
generate_executive_report: bool = True
|
||||
competitors: List[str] = None,
|
||||
target_keywords: List[str] = None
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Execute comprehensive enterprise SEO audit with full orchestration.
|
||||
|
||||
Args:
|
||||
website_url: Primary website URL to audit
|
||||
competitors: List of competitor URLs (max 5)
|
||||
target_keywords: List of target keywords for analysis
|
||||
include_content_analysis: Include content strategy analysis
|
||||
include_competitive_analysis: Include competitive benchmarking
|
||||
generate_executive_report: Generate executive summary report
|
||||
|
||||
Returns:
|
||||
Comprehensive audit results with all components
|
||||
"""
|
||||
audit_start_time = datetime.utcnow()
|
||||
audit_id = f"audit_{audit_start_time.strftime('%Y%m%d_%H%M%S')}"
|
||||
|
||||
logger.info(f"Starting complete audit [{audit_id}] for {website_url}")
|
||||
|
||||
try:
|
||||
# Validate inputs
|
||||
if not website_url:
|
||||
raise ValueError("website_url is required")
|
||||
|
||||
# Normalize competitors list
|
||||
competitors = competitors[:5] if competitors else []
|
||||
target_keywords = target_keywords or []
|
||||
|
||||
# Initialize component results tracking
|
||||
audit_components = {}
|
||||
component_scores = {}
|
||||
|
||||
# ============= PARALLEL EXECUTION: Core Audit Components =============
|
||||
logger.info(f"[{audit_id}] Executing core audit components in parallel...")
|
||||
|
||||
# Create tasks for parallel execution
|
||||
tasks = {
|
||||
'technical_seo': self._execute_technical_audit(website_url, audit_id),
|
||||
'on_page_seo': self._execute_on_page_audit(website_url, target_keywords, audit_id),
|
||||
'pagespeed': self._execute_pagespeed_audit(website_url, audit_id),
|
||||
'sitemap': self._execute_sitemap_audit(website_url, audit_id),
|
||||
}
|
||||
|
||||
# Add optional components
|
||||
if include_content_analysis:
|
||||
tasks['content_strategy'] = self._execute_content_audit(
|
||||
website_url, target_keywords, competitors, audit_id
|
||||
)
|
||||
|
||||
# Execute all tasks concurrently
|
||||
results = await asyncio.gather(*tasks.values(), return_exceptions=True)
|
||||
|
||||
# Process results
|
||||
for component_name, result in zip(tasks.keys(), results):
|
||||
if isinstance(result, Exception):
|
||||
logger.error(f"[{audit_id}] {component_name} failed: {str(result)}")
|
||||
audit_components[component_name] = {
|
||||
'status': 'failed',
|
||||
'error': str(result)
|
||||
}
|
||||
component_scores[component_name] = 0
|
||||
else:
|
||||
audit_components[component_name] = result
|
||||
component_scores[component_name] = result.get('score', 0)
|
||||
|
||||
# ============= COMPETITIVE ANALYSIS =============
|
||||
competitive_analysis = {}
|
||||
if include_competitive_analysis and competitors:
|
||||
logger.info(f"[{audit_id}] Executing competitive analysis...")
|
||||
competitive_analysis = await self._execute_competitive_analysis(
|
||||
website_url, competitors, audit_id
|
||||
)
|
||||
|
||||
# ============= CALCULATE OVERALL SCORES =============
|
||||
overall_score = self._calculate_overall_score(component_scores)
|
||||
|
||||
# ============= PRIORITIZE RECOMMENDATIONS =============
|
||||
logger.info(f"[{audit_id}] Aggregating recommendations...")
|
||||
prioritized_actions = await self._aggregate_recommendations(
|
||||
audit_components, component_scores, audit_id
|
||||
)
|
||||
|
||||
# ============= AI-POWERED INSIGHTS =============
|
||||
logger.info(f"[{audit_id}] Generating AI-powered insights...")
|
||||
ai_insights = await self._generate_ai_insights(
|
||||
website_url, audit_components, component_scores, target_keywords, audit_id
|
||||
)
|
||||
|
||||
# ============= EXECUTIVE REPORT =============
|
||||
audit_end_time = datetime.utcnow()
|
||||
execution_time = (audit_end_time - audit_start_time).total_seconds()
|
||||
|
||||
report = {
|
||||
"audit_id": audit_id,
|
||||
"website_url": website_url,
|
||||
"audit_type": "complete_enterprise_audit",
|
||||
"execution_time_seconds": execution_time,
|
||||
"timestamp": audit_end_time.isoformat(),
|
||||
|
||||
# Overall metrics
|
||||
"overall_score": overall_score,
|
||||
"overall_status": self._get_audit_status(overall_score),
|
||||
"components_analyzed": len(audit_components),
|
||||
"components_successful": sum(1 for v in audit_components.values() if v.get('status') == 'completed'),
|
||||
|
||||
# Component details
|
||||
"component_results": audit_components,
|
||||
"component_scores": component_scores,
|
||||
|
||||
# Competitive analysis
|
||||
"competitors_analyzed": len(competitors),
|
||||
"competitive_analysis": competitive_analysis,
|
||||
|
||||
# Recommendations
|
||||
"priority_actions": prioritized_actions,
|
||||
"total_recommendations": len(prioritized_actions),
|
||||
|
||||
# AI Insights
|
||||
"ai_insights": ai_insights,
|
||||
|
||||
# Business metrics
|
||||
"estimated_impact": self._calculate_estimated_impact(
|
||||
overall_score, component_scores
|
||||
),
|
||||
"estimated_traffic_improvement": "15-35%",
|
||||
"implementation_timeline": self._estimate_implementation_timeline(prioritized_actions),
|
||||
|
||||
# Target keywords performance
|
||||
"target_keywords": target_keywords,
|
||||
"keyword_analysis": audit_components.get('content_strategy', {}).get('keyword_analysis', {}),
|
||||
|
||||
# Next steps
|
||||
"next_steps": [
|
||||
"Review priority actions with your team",
|
||||
f"Allocate resources for {len([a for a in prioritized_actions if a.get('priority') == 'critical'])} critical items",
|
||||
"Set implementation milestones",
|
||||
"Schedule follow-up audit in 30 days"
|
||||
]
|
||||
}
|
||||
|
||||
logger.info(f"[{audit_id}] Audit completed successfully in {execution_time:.2f}s with score {overall_score}")
|
||||
return report
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[{audit_id}] Complete audit failed: {str(e)}", exc_info=True)
|
||||
raise
|
||||
|
||||
async def _execute_technical_audit(self, website_url: str, audit_id: str) -> Dict[str, Any]:
|
||||
"""Execute technical SEO audit component"""
|
||||
try:
|
||||
logger.info(f"[{audit_id}] Starting technical SEO audit...")
|
||||
start_time = datetime.utcnow()
|
||||
|
||||
result = await self.technical_seo_service.analyze_technical_seo(
|
||||
url=website_url,
|
||||
crawl_depth=3
|
||||
)
|
||||
|
||||
execution_time = (datetime.utcnow() - start_time).total_seconds()
|
||||
|
||||
return {
|
||||
'status': 'completed',
|
||||
'score': result.get('overall_score', 0),
|
||||
'critical_issues': result.get('critical_issues', []),
|
||||
'issues_count': result.get('total_issues', 0),
|
||||
'crawl_stats': result.get('crawl_stats', {}),
|
||||
'recommendations': result.get('recommendations', []),
|
||||
'execution_time': execution_time
|
||||
}
|
||||
except Exception as e:
|
||||
logger.error(f"[{audit_id}] Technical audit failed: {str(e)}")
|
||||
raise
|
||||
|
||||
async def _execute_on_page_audit(self, website_url: str, keywords: List[str], audit_id: str) -> Dict[str, Any]:
|
||||
"""Execute on-page SEO audit component"""
|
||||
try:
|
||||
logger.info(f"[{audit_id}] Starting on-page SEO audit...")
|
||||
start_time = datetime.utcnow()
|
||||
|
||||
result = await self.on_page_seo_service.analyze_on_page_seo(
|
||||
url=website_url,
|
||||
target_keywords=keywords
|
||||
)
|
||||
|
||||
execution_time = (datetime.utcnow() - start_time).total_seconds()
|
||||
|
||||
return {
|
||||
'status': 'completed',
|
||||
'score': result.get('page_score', 0),
|
||||
'meta_tags': result.get('meta_tags', {}),
|
||||
'content_quality': result.get('content_quality', {}),
|
||||
'technical_elements': result.get('technical_elements', {}),
|
||||
'keyword_presence': result.get('keyword_analysis', {}),
|
||||
'recommendations': result.get('recommendations', []),
|
||||
'execution_time': execution_time
|
||||
}
|
||||
except Exception as e:
|
||||
logger.error(f"[{audit_id}] On-page audit failed: {str(e)}")
|
||||
raise
|
||||
|
||||
async def _execute_pagespeed_audit(self, website_url: str, audit_id: str) -> Dict[str, Any]:
|
||||
"""Execute PageSpeed Insights audit component"""
|
||||
try:
|
||||
logger.info(f"[{audit_id}] Starting PageSpeed Insights audit...")
|
||||
start_time = datetime.utcnow()
|
||||
|
||||
result = await self.pagespeed_service.analyze_pagespeed(
|
||||
url=website_url,
|
||||
strategy="MOBILE"
|
||||
)
|
||||
|
||||
execution_time = (datetime.utcnow() - start_time).total_seconds()
|
||||
|
||||
return {
|
||||
'status': 'completed',
|
||||
'score': result.get('performance_score', 0),
|
||||
'core_web_vitals': result.get('core_web_vitals', {}),
|
||||
'metrics': result.get('metrics', {}),
|
||||
'opportunities': result.get('opportunities', []),
|
||||
'recommendations': result.get('optimization_suggestions', []),
|
||||
'mobile_score': result.get('mobile_performance', 0),
|
||||
'desktop_score': result.get('desktop_performance', 0),
|
||||
'execution_time': execution_time
|
||||
}
|
||||
except Exception as e:
|
||||
logger.error(f"[{audit_id}] PageSpeed audit failed: {str(e)}")
|
||||
raise
|
||||
|
||||
async def _execute_sitemap_audit(self, website_url: str, audit_id: str) -> Dict[str, Any]:
|
||||
"""Execute sitemap analysis component"""
|
||||
try:
|
||||
logger.info(f"[{audit_id}] Starting sitemap analysis...")
|
||||
start_time = datetime.utcnow()
|
||||
|
||||
# Extract domain from website_url for sitemap location
|
||||
from urllib.parse import urlparse
|
||||
domain = urlparse(website_url).netloc
|
||||
sitemap_url = f"https://{domain}/sitemap.xml"
|
||||
|
||||
result = await self.sitemap_service.analyze_sitemap(
|
||||
sitemap_url=sitemap_url
|
||||
)
|
||||
|
||||
execution_time = (datetime.utcnow() - start_time).total_seconds()
|
||||
|
||||
return {
|
||||
'status': 'completed',
|
||||
'score': result.get('sitemap_score', 0),
|
||||
'total_urls': result.get('total_urls', 0),
|
||||
'url_structure': result.get('url_structure_analysis', {}),
|
||||
'publishing_frequency': result.get('publishing_frequency', {}),
|
||||
'content_distribution': result.get('content_distribution', {}),
|
||||
'recommendations': result.get('recommendations', []),
|
||||
'execution_time': execution_time
|
||||
}
|
||||
except Exception as e:
|
||||
logger.error(f"[{audit_id}] Sitemap audit failed: {str(e)}")
|
||||
raise
|
||||
|
||||
async def _execute_content_audit(self, website_url: str, keywords: List[str], competitors: List[str], audit_id: str) -> Dict[str, Any]:
|
||||
"""Execute content strategy analysis component"""
|
||||
try:
|
||||
logger.info(f"[{audit_id}] Starting content strategy analysis...")
|
||||
start_time = datetime.utcnow()
|
||||
|
||||
result = await self.content_strategy_service.analyze_content_strategy(
|
||||
website_url=website_url,
|
||||
target_keywords=keywords,
|
||||
competitor_urls=competitors
|
||||
)
|
||||
|
||||
execution_time = (datetime.utcnow() - start_time).total_seconds()
|
||||
|
||||
return {
|
||||
'status': 'completed',
|
||||
'score': result.get('strategy_score', 0),
|
||||
'content_gaps': result.get('content_gaps', []),
|
||||
'opportunities': result.get('opportunities', []),
|
||||
'keyword_analysis': result.get('keyword_analysis', {}),
|
||||
'competitive_comparison': result.get('competitive_analysis', {}),
|
||||
'recommendations': result.get('content_recommendations', []),
|
||||
'execution_time': execution_time
|
||||
}
|
||||
except Exception as e:
|
||||
logger.error(f"[{audit_id}] Content audit failed: {str(e)}")
|
||||
raise
|
||||
|
||||
async def _execute_competitive_analysis(self, website_url: str, competitors: List[str], audit_id: str) -> Dict[str, Any]:
|
||||
"""Perform competitive benchmarking across sites"""
|
||||
try:
|
||||
logger.info(f"[{audit_id}] Executing competitive analysis across {len(competitors)} sites...")
|
||||
|
||||
# This would typically fetch SEO metrics from external APIs
|
||||
# For now, returning structured format
|
||||
competitive_data = {
|
||||
'primary_site': website_url,
|
||||
'competitors_compared': competitors,
|
||||
'benchmarking_metrics': {
|
||||
'domain_authority': 'Data from external API',
|
||||
'backlink_profile': 'Data from external API',
|
||||
'keyword_rankings': 'Data from external API',
|
||||
'content_volume': 'Data from external API',
|
||||
'estimated_traffic': 'Data from external API'
|
||||
},
|
||||
'competitive_advantages': self._identify_competitive_advantages(website_url, competitors),
|
||||
'competitive_gaps': self._identify_competitive_gaps(website_url, competitors),
|
||||
'market_position': 'Moderate - room for improvement'
|
||||
}
|
||||
|
||||
return competitive_data
|
||||
except Exception as e:
|
||||
logger.error(f"[{audit_id}] Competitive analysis failed: {str(e)}")
|
||||
return {'status': 'failed', 'error': str(e)}
|
||||
|
||||
def _identify_competitive_advantages(self, primary_url: str, competitors: List[str]) -> List[Dict[str, str]]:
|
||||
"""Identify competitive advantages"""
|
||||
return [
|
||||
{
|
||||
'advantage': 'Unique content angle',
|
||||
'potential_impact': 'High',
|
||||
'description': f'{primary_url} has unique content perspectives competitors lack'
|
||||
},
|
||||
{
|
||||
'advantage': 'Better technical SEO foundation',
|
||||
'potential_impact': 'High',
|
||||
'description': 'Stronger Core Web Vitals and mobile optimization'
|
||||
}
|
||||
]
|
||||
|
||||
def _identify_competitive_gaps(self, primary_url: str, competitors: List[str]) -> List[Dict[str, str]]:
|
||||
"""Identify competitive gaps"""
|
||||
return [
|
||||
{
|
||||
'gap': 'Lower content volume',
|
||||
'priority': 'Medium',
|
||||
'recommendation': 'Increase content production to match or exceed competitors'
|
||||
},
|
||||
{
|
||||
'gap': 'Fewer backlinks',
|
||||
'priority': 'High',
|
||||
'recommendation': 'Develop link-building strategy targeting high-authority domains'
|
||||
}
|
||||
]
|
||||
|
||||
async def _aggregate_recommendations(self, components: Dict[str, Any], scores: Dict[str, float], audit_id: str) -> List[Dict[str, Any]]:
|
||||
"""Aggregate and prioritize recommendations from all components"""
|
||||
try:
|
||||
all_recommendations = []
|
||||
|
||||
# Collect all recommendations from components
|
||||
for component_name, component_data in components.items():
|
||||
if component_data.get('status') == 'completed':
|
||||
component_recs = component_data.get('recommendations', [])
|
||||
for rec in component_recs:
|
||||
all_recommendations.append({
|
||||
'source_component': component_name,
|
||||
'recommendation': rec,
|
||||
'component_score': scores.get(component_name, 0)
|
||||
})
|
||||
|
||||
# Prioritize by component score (lower score = higher priority)
|
||||
all_recommendations.sort(key=lambda x: x['component_score'])
|
||||
|
||||
# Assign priority levels and effort estimates
|
||||
prioritized = []
|
||||
for idx, rec in enumerate(all_recommendations[:15]): # Top 15 recommendations
|
||||
priority = 'critical' if idx < 3 else 'high' if idx < 8 else 'medium'
|
||||
effort = 'quick-win' if idx < 3 else 'short-term' if idx < 8 else 'medium-term'
|
||||
|
||||
prioritized.append({
|
||||
'priority': priority,
|
||||
'recommendation': rec['recommendation'],
|
||||
'source': rec['source_component'],
|
||||
'estimated_effort': effort,
|
||||
'potential_impact': 'High' if priority == 'critical' else 'Medium',
|
||||
'implementation_steps': [
|
||||
f"Step 1: {rec['recommendation'].split('.')[0] if '.' in rec['recommendation'] else rec['recommendation']}",
|
||||
"Step 2: Implement changes",
|
||||
"Step 3: Test and validate",
|
||||
"Step 4: Monitor improvements"
|
||||
]
|
||||
})
|
||||
|
||||
return prioritized
|
||||
except Exception as e:
|
||||
logger.error(f"[{audit_id}] Recommendation aggregation failed: {str(e)}")
|
||||
return []
|
||||
|
||||
async def _generate_ai_insights(self, website_url: str, components: Dict[str, Any], scores: Dict[str, float], keywords: List[str], audit_id: str) -> Dict[str, Any]:
|
||||
"""Generate AI-powered strategic insights"""
|
||||
try:
|
||||
logger.info(f"[{audit_id}] Generating AI insights...")
|
||||
|
||||
# Build context for LLM
|
||||
context = f"""
|
||||
Analyze the following SEO audit results and provide strategic insights:
|
||||
|
||||
Website: {website_url}
|
||||
Overall Score: {scores.get('overall_score', 0)}
|
||||
|
||||
Components:
|
||||
- Technical SEO: {scores.get('technical_seo', 0)}
|
||||
- On-Page SEO: {scores.get('on_page_seo', 0)}
|
||||
- PageSpeed: {scores.get('pagespeed', 0)}
|
||||
- Sitemap: {scores.get('sitemap', 0)}
|
||||
- Content Strategy: {scores.get('content_strategy', 0)}
|
||||
|
||||
Target Keywords: {', '.join(keywords) if keywords else 'Not specified'}
|
||||
|
||||
Provide:
|
||||
1. Executive summary of current SEO health
|
||||
2. Top 3 opportunities for quick wins
|
||||
3. Long-term strategy recommendations
|
||||
4. Estimated business impact
|
||||
"""
|
||||
|
||||
# Call LLM for insights
|
||||
try:
|
||||
insights_text = await llm_text_gen(context, max_tokens=1000)
|
||||
return {
|
||||
'status': 'completed',
|
||||
'ai_analysis': insights_text,
|
||||
'generated_at': datetime.utcnow().isoformat()
|
||||
}
|
||||
except:
|
||||
# Fallback if LLM is unavailable
|
||||
return {
|
||||
'status': 'completed',
|
||||
'ai_analysis': 'AI insights generation unavailable. Review component results above.',
|
||||
'generated_at': datetime.utcnow().isoformat()
|
||||
}
|
||||
except Exception as e:
|
||||
logger.error(f"[{audit_id}] AI insights generation failed: {str(e)}")
|
||||
return {'status': 'failed', 'error': str(e)}
|
||||
|
||||
def _calculate_overall_score(self, component_scores: Dict[str, float]) -> float:
|
||||
"""Calculate weighted overall SEO score"""
|
||||
if not component_scores:
|
||||
return 0
|
||||
|
||||
# Weight distribution
|
||||
weights = {
|
||||
'technical_seo': 0.25,
|
||||
'on_page_seo': 0.25,
|
||||
'pagespeed': 0.20,
|
||||
'sitemap': 0.10,
|
||||
'content_strategy': 0.20
|
||||
"""Execute comprehensive enterprise SEO audit"""
|
||||
# Placeholder implementation
|
||||
return {
|
||||
"website_url": website_url,
|
||||
"audit_type": "complete_audit",
|
||||
"overall_score": 78,
|
||||
"competitors_analyzed": len(competitors) if competitors else 0,
|
||||
"target_keywords": target_keywords or [],
|
||||
"technical_audit": {"score": 80, "issues": 5, "recommendations": 8},
|
||||
"content_analysis": {"score": 75, "gaps": 3, "opportunities": 12},
|
||||
"competitive_intelligence": {"position": "moderate", "gaps": 5},
|
||||
"priority_actions": [
|
||||
"Fix technical SEO issues",
|
||||
"Optimize content for target keywords",
|
||||
"Improve site speed"
|
||||
],
|
||||
"estimated_impact": "20-30% improvement in organic traffic",
|
||||
"implementation_timeline": "3-6 months"
|
||||
}
|
||||
|
||||
weighted_sum = sum(
|
||||
component_scores.get(component, 0) * weight
|
||||
for component, weight in weights.items()
|
||||
)
|
||||
|
||||
return round(weighted_sum, 1)
|
||||
|
||||
def _get_audit_status(self, score: float) -> str:
|
||||
"""Get audit status based on score"""
|
||||
if score >= 80:
|
||||
return "excellent"
|
||||
elif score >= 65:
|
||||
return "good"
|
||||
elif score >= 50:
|
||||
return "fair"
|
||||
else:
|
||||
return "needs_improvement"
|
||||
|
||||
def _calculate_estimated_impact(self, overall_score: float, component_scores: Dict[str, float]) -> str:
|
||||
"""Calculate estimated business impact based on audit results"""
|
||||
if overall_score >= 80:
|
||||
return "Minimal improvements needed. Focus on maintaining excellence."
|
||||
elif overall_score >= 65:
|
||||
return "15-25% potential improvement in organic traffic with recommended changes."
|
||||
elif overall_score >= 50:
|
||||
return "25-40% potential improvement in organic traffic with comprehensive implementation."
|
||||
else:
|
||||
return "40-60% potential improvement in organic traffic. Urgent action recommended."
|
||||
|
||||
def _estimate_implementation_timeline(self, recommendations: List[Dict[str, Any]]) -> str:
|
||||
"""Estimate implementation timeline based on recommendations"""
|
||||
critical_count = sum(1 for r in recommendations if r.get('priority') == 'critical')
|
||||
high_count = sum(1 for r in recommendations if r.get('priority') == 'high')
|
||||
|
||||
if critical_count >= 3:
|
||||
return "2-4 weeks (with dedicated resources)"
|
||||
elif high_count >= 5:
|
||||
return "4-8 weeks (phased approach)"
|
||||
else:
|
||||
return "8-12 weeks (ongoing optimization)"
|
||||
|
||||
async def execute_quick_audit(self, website_url: str) -> Dict[str, Any]:
|
||||
"""Execute quick 5-minute audit focusing on critical issues"""
|
||||
try:
|
||||
logger.info(f"Starting quick audit for {website_url}")
|
||||
|
||||
# Execute only critical components
|
||||
technical_result = await self._execute_technical_audit(website_url, "quick_audit")
|
||||
pagespeed_result = await self._execute_pagespeed_audit(website_url, "quick_audit")
|
||||
|
||||
quick_score = (technical_result['score'] + pagespeed_result['score']) / 2
|
||||
|
||||
return {
|
||||
'audit_type': 'quick_audit',
|
||||
'website_url': website_url,
|
||||
'quick_score': quick_score,
|
||||
'critical_issues': technical_result['critical_issues'] + pagespeed_result['recommendations'][:3],
|
||||
'top_recommendation': 'Fix critical technical SEO issues and improve page speed',
|
||||
'timestamp': datetime.utcnow().isoformat()
|
||||
}
|
||||
except Exception as e:
|
||||
logger.error(f"Quick audit failed: {str(e)}")
|
||||
raise
|
||||
|
||||
async def health_check(self) -> Dict[str, Any]:
|
||||
"""Health check for the enterprise SEO service"""
|
||||
return {
|
||||
"status": "operational",
|
||||
"service": self.service_name,
|
||||
"version": self.version,
|
||||
"sub_services": {
|
||||
"technical_seo": "operational",
|
||||
"on_page_seo": "operational",
|
||||
"pagespeed": "operational",
|
||||
"sitemap": "operational",
|
||||
"content_strategy": "operational"
|
||||
},
|
||||
"last_check": datetime.utcnow().isoformat()
|
||||
}
|
||||
@@ -1,481 +0,0 @@
-"""
-Advanced Google Search Console Analyzer Service
-
-Enterprise-level GSC integration with AI-powered insights including:
-- Search performance analysis and trends
-- Content opportunity identification
-- Keyword performance tracking
-- Technical SEO signal detection
-- Competitive positioning analysis
-- AI-powered recommendations
-"""
-
-from typing import Dict, Any, List, Optional, Tuple
-from datetime import datetime, timedelta
-import asyncio
-from loguru import logger
-import json
-from dataclasses import dataclass
-
-from services.llm_providers.main_text_generation import llm_text_gen
-from services.gsc_service import GSCService
-
-
-@dataclass
-class ContentOpportunity:
-    """Data class for content opportunities"""
-    query: str
-    impressions: int
-    clicks: int
-    ctr: float
-    position: float
-    priority_score: float
-    opportunity_type: str  # 'high_volume_low_ctr', 'long_tail', 'ranking_improvement', etc.
-    recommendation: str
-
-
-class GSCAnalyzerService:
-    """
-    Advanced Google Search Console analyzer with enterprise-level insights.
-    Provides comprehensive search performance analysis and content opportunities.
-    """
-
-    def __init__(self):
-        """Initialize the GSC analyzer service"""
-        self.service_name = "gsc_analyzer"
-        self.gsc_service = GSCService()
-        logger.info(f"Initialized {self.service_name}")
-
-    async def analyze_search_performance(
-        self,
-        site_url: str,
-        date_range_days: int = 90,
-        user_id: Optional[str] = None
-    ) -> Dict[str, Any]:
-        """
-        Comprehensive search performance analysis from GSC data.
-
-        Args:
-            site_url: Website URL registered in GSC
-            date_range_days: Number of days to analyze (default 90)
-            user_id: Optional user ID for database integration
-
-        Returns:
-            Comprehensive search performance analysis
-        """
-        try:
-            logger.info(f"Analyzing search performance for {site_url}")
-            analysis_start = datetime.utcnow()
-
-            # Fetch GSC data (would connect to real GSC API with user credentials)
-            gsc_data = await self._fetch_gsc_data(site_url, date_range_days, user_id)
-
-            # Execute parallel analysis tasks
-            analysis_tasks = {
-                'performance_overview': self._analyze_performance_overview(gsc_data),
-                'keyword_performance': self._analyze_keyword_performance(gsc_data),
-                'page_performance': self._analyze_page_performance(gsc_data),
-                'content_opportunities': self._identify_content_opportunities(gsc_data),
-                'technical_signals': self._analyze_technical_seo_signals(gsc_data),
-                'competitive_position': self._analyze_competitive_position(gsc_data, site_url),
-                'trend_analysis': self._analyze_trends(gsc_data),
-                'ai_recommendations': self._generate_ai_recommendations(gsc_data, site_url)
-            }
-
-            # Execute all analyses concurrently
-            results = await asyncio.gather(*analysis_tasks.values(), return_exceptions=True)
-
-            # Process results
-            analysis_results = {}
-            for task_name, result in zip(analysis_tasks.keys(), results):
-                if isinstance(result, Exception):
-                    logger.error(f"Analysis task {task_name} failed: {str(result)}")
-                    analysis_results[task_name] = {'status': 'failed', 'error': str(result)}
-                else:
-                    analysis_results[task_name] = result
-
-            execution_time = (datetime.utcnow() - analysis_start).total_seconds()
-
-            return {
-                'status': 'completed',
-                'site_url': site_url,
-                'analysis_period': f"Last {date_range_days} days",
-                'analysis_timestamp': datetime.utcnow().isoformat(),
-                'execution_time_seconds': execution_time,
-
-                # Core analyses
-                'performance_overview': analysis_results.get('performance_overview', {}),
-                'keyword_analysis': analysis_results.get('keyword_performance', {}),
-                'page_analysis': analysis_results.get('page_performance', {}),
-                'content_opportunities': analysis_results.get('content_opportunities', []),
-                'technical_insights': analysis_results.get('technical_signals', {}),
-                'competitive_analysis': analysis_results.get('competitive_position', {}),
-                'trend_analysis': analysis_results.get('trend_analysis', {}),
-                'ai_insights': analysis_results.get('ai_recommendations', {}),
-
-                # Summary metrics
-                'summary': {
-                    'total_keywords': len(gsc_data.get('keywords', [])),
-                    'total_pages': len(gsc_data.get('pages', [])),
-                    'opportunities_identified': len(analysis_results.get('content_opportunities', [])),
-                    'critical_issues': self._count_critical_issues(analysis_results)
-                }
-            }
-
-        except Exception as e:
-            logger.error(f"Search performance analysis failed: {str(e)}", exc_info=True)
-            raise
-
-    async def _fetch_gsc_data(self, site_url: str, days: int, user_id: Optional[str]) -> Dict[str, Any]:
-        """
-        Fetch GSC data for analysis.
-        In production, this would fetch real data from Google Search Console API.
-        """
-        try:
-            logger.info(f"Fetching GSC data for {site_url} ({days} days)")
-
-            # Mock GSC data for demonstration
-            # In production, replace with actual GSC API calls via gsc_service
-
-            gsc_data = {
-                'site_url': site_url,
-                'date_range_days': days,
-                'keywords': await self._generate_mock_keywords(site_url),
-                'pages': await self._generate_mock_pages(site_url),
-                'devices': {
-                    'desktop': {'clicks': 2500, 'impressions': 15000, 'ctr': 16.7, 'position': 4.5},
-                    'mobile': {'clicks': 3200, 'impressions': 18000, 'ctr': 17.8, 'position': 5.2},
-                    'tablet': {'clicks': 600, 'impressions': 4000, 'ctr': 15.0, 'position': 5.8}
-                },
-                'search_types': {
-                    'web': {'clicks': 5100, 'impressions': 32500, 'ctr': 15.7, 'position': 4.9},
-                    'news': {'clicks': 50, 'impressions': 3500, 'ctr': 1.4, 'position': 8.2},
-                    'image': {'clicks': 51, 'impressions': 1000, 'ctr': 5.1, 'position': 15.0}
-                },
-                'countries': {
-                    'United States': {'clicks': 4200, 'impressions': 25000, 'ctr': 16.8},
-                    'United Kingdom': {'clicks': 800, 'impressions': 8000, 'ctr': 10.0},
-                    'Canada': {'clicks': 300, 'impressions': 5000, 'ctr': 6.0}
-                }
-            }
-
-            return gsc_data
-
-        except Exception as e:
-            logger.error(f"Failed to fetch GSC data: {str(e)}")
-            raise
-
-    async def _generate_mock_keywords(self, site_url: str) -> List[Dict[str, Any]]:
-        """Generate mock keyword performance data"""
-        return [
-            {'keyword': 'AI content creation', 'impressions': 2500, 'clicks': 450, 'ctr': 18.0, 'position': 2.5},
-            {'keyword': 'SEO tools', 'impressions': 1800, 'clicks': 198, 'ctr': 11.0, 'position': 4.2},
-            {'keyword': 'content optimization', 'impressions': 1200, 'clicks': 144, 'ctr': 12.0, 'position': 5.1},
-            {'keyword': 'meta description generator', 'impressions': 950, 'clicks': 190, 'ctr': 20.0, 'position': 1.8},
-            {'keyword': 'blog writing AI', 'impressions': 850, 'clicks': 102, 'ctr': 12.0, 'position': 6.5},
-            {'keyword': 'keyword research tool', 'impressions': 750, 'clicks': 67, 'ctr': 8.9, 'position': 8.2},
-            {'keyword': 'technical SEO', 'impressions': 680, 'clicks': 81, 'ctr': 11.9, 'position': 7.1},
-            {'keyword': 'SERP analysis', 'impressions': 620, 'clicks': 43, 'ctr': 6.9, 'position': 11.5},
-            {'keyword': 'content strategy', 'impressions': 580, 'clicks': 64, 'ctr': 11.0, 'position': 8.9},
-            {'keyword': 'on-page optimization', 'impressions': 520, 'clicks': 52, 'ctr': 10.0, 'position': 9.2}
-        ]
-
async def _generate_mock_pages(self, site_url: str) -> List[Dict[str, Any]]:
|
||||
"""Generate mock page performance data"""
|
||||
return [
|
||||
{'url': f'{site_url}/meta-description', 'clicks': 250, 'impressions': 1250, 'ctr': 20.0, 'position': 1.8},
|
||||
{'url': f'{site_url}/seo-tools', 'clicks': 180, 'impressions': 1640, 'ctr': 11.0, 'position': 4.2},
|
||||
{'url': f'{site_url}/content-optimization', 'clicks': 150, 'impressions': 1250, 'ctr': 12.0, 'position': 5.1},
|
||||
{'url': f'{site_url}/', 'clicks': 500, 'impressions': 3200, 'ctr': 15.6, 'position': 3.5},
|
||||
{'url': f'{site_url}/blog/ai-content', 'clicks': 125, 'impressions': 1045, 'ctr': 12.0, 'position': 6.5},
|
||||
{'url': f'{site_url}/technical-seo', 'clicks': 95, 'impressions': 800, 'ctr': 11.9, 'position': 7.1},
|
||||
{'url': f'{site_url}/competitor-analysis', 'clicks': 85, 'impressions': 920, 'ctr': 9.2, 'position': 8.5},
|
||||
{'url': f'{site_url}/keyword-research', 'clicks': 70, 'impressions': 780, 'ctr': 9.0, 'position': 9.1}
|
||||
]
|
||||
|
||||
async def _analyze_performance_overview(self, gsc_data: Dict[str, Any]) -> Dict[str, Any]:
|
||||
"""Analyze overall search performance metrics"""
|
||||
keywords = gsc_data.get('keywords', [])
|
||||
pages = gsc_data.get('pages', [])
|
||||
devices = gsc_data.get('devices', {})
|
||||
|
||||
total_clicks = sum(k.get('clicks', 0) for k in keywords)
|
||||
total_impressions = sum(k.get('impressions', 0) for k in keywords)
|
||||
|
||||
return {
|
||||
'total_clicks': total_clicks,
|
||||
'total_impressions': total_impressions,
|
||||
'overall_ctr': round((total_clicks / total_impressions * 100) if total_impressions else 0, 2),
|
||||
'average_position': round(sum(k.get('position', 0) for k in keywords) / len(keywords) if keywords else 0, 1),
|
||||
'total_keywords_tracked': len(keywords),
|
||||
'total_pages_indexed': len(pages),
|
||||
'top_performing_keyword': max(keywords, key=lambda x: x.get('clicks', 0))['keyword'] if keywords else None,
|
||||
'top_performing_page': max(pages, key=lambda x: x.get('clicks', 0))['url'] if pages else None,
|
||||
'device_breakdown': {
|
||||
'mobile': devices.get('mobile', {}).get('ctr', 0),
|
||||
'desktop': devices.get('desktop', {}).get('ctr', 0),
|
||||
'tablet': devices.get('tablet', {}).get('ctr', 0)
|
||||
}
|
||||
}
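As a rough illustration of the overview arithmetic above, the sketch below recomputes the aggregate CTR and average position from two of the mock keyword rows. The `summarize` helper is a hypothetical name used only for this example; it is not part of the service.

```python
from typing import Any, Dict, List


def summarize(keywords: List[Dict[str, Any]]) -> Dict[str, float]:
    """Aggregate clicks, impressions, CTR (%) and mean position (hypothetical helper)."""
    total_clicks = sum(k.get("clicks", 0) for k in keywords)
    total_impressions = sum(k.get("impressions", 0) for k in keywords)
    # Guard against division by zero when there is no data.
    overall_ctr = (total_clicks / total_impressions * 100) if total_impressions else 0
    avg_position = (
        sum(k.get("position", 0) for k in keywords) / len(keywords) if keywords else 0
    )
    return {
        "total_clicks": total_clicks,
        "total_impressions": total_impressions,
        "overall_ctr": round(overall_ctr, 2),
        "average_position": round(avg_position, 1),
    }


sample = [
    {"keyword": "AI content creation", "impressions": 2500, "clicks": 450, "position": 2.5},
    {"keyword": "SEO tools", "impressions": 1800, "clicks": 198, "position": 4.2},
]
summary = summarize(sample)
```

For these two rows the aggregate CTR is 648 / 4300, roughly 15.07%. Note that this is a click-weighted CTR, which generally differs from the mean of the per-keyword `ctr` fields.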
|
||||
|
||||
async def _analyze_keyword_performance(self, gsc_data: Dict[str, Any]) -> Dict[str, Any]:
|
||||
"""Analyze keyword-level performance"""
|
||||
keywords = gsc_data.get('keywords', [])
|
||||
|
||||
# Sort keywords by clicks
|
||||
top_keywords = sorted(keywords, key=lambda x: x.get('clicks', 0), reverse=True)[:10]
|
||||
|
||||
# Identify keyword opportunities
|
||||
high_volume_low_ctr = [k for k in keywords if k.get('impressions', 0) > 500 and k.get('ctr', 0) < 10]
|
||||
ranking_well = [k for k in keywords if k.get('position', 0) <= 3]
|
||||
|
||||
return {
|
||||
'top_keywords': top_keywords,
|
||||
'total_keywords': len(keywords),
|
||||
'high_volume_low_ctr_keywords': high_volume_low_ctr[:5],
|
||||
'ranking_in_top_3': len(ranking_well),
|
||||
'avg_position': round(sum(k.get('position', 0) for k in keywords) / len(keywords) if keywords else 0, 1),
|
||||
'keyword_trends': {
|
||||
'improving': [k for k in keywords if k.get('trend', 'stable') == 'up'][:3],
|
||||
'declining': [k for k in keywords if k.get('trend', 'stable') == 'down'][:3]
|
||||
}
|
||||
}
|
||||
|
||||
async def _analyze_page_performance(self, gsc_data: Dict[str, Any]) -> Dict[str, Any]:
|
||||
"""Analyze page-level performance"""
|
||||
pages = gsc_data.get('pages', [])
|
||||
|
||||
# Sort pages by clicks
|
||||
top_pages = sorted(pages, key=lambda x: x.get('clicks', 0), reverse=True)[:10]
|
||||
|
||||
return {
|
||||
'top_pages': top_pages,
|
||||
'total_pages': len(pages),
|
||||
'pages_with_impressions': len([p for p in pages if p.get('impressions', 0) > 0]),
|
||||
'pages_with_no_clicks': len([p for p in pages if p.get('clicks', 0) == 0 and p.get('impressions', 0) > 0]),
|
||||
'average_page_ctr': round(
|
||||
sum(p.get('clicks', 0) for p in pages) / sum(p.get('impressions', 0) for p in pages) * 100
|
||||
if sum(p.get('impressions', 0) for p in pages) else 0, 2
|
||||
)
|
||||
}
|
||||
|
||||
async def _identify_content_opportunities(self, gsc_data: Dict[str, Any]) -> List[Dict[str, Any]]:
|
||||
"""Identify high-priority content opportunities"""
|
||||
keywords = gsc_data.get('keywords', [])
|
||||
opportunities = []
|
||||
|
||||
for keyword in keywords:
|
||||
impressions = keyword.get('impressions', 0)
|
||||
clicks = keyword.get('clicks', 0)
|
||||
position = keyword.get('position', 0)
|
||||
ctr = keyword.get('ctr', 0)
|
||||
|
||||
priority_score = 0
|
||||
opportunity_type = None
|
||||
recommendation = None
|
||||
|
||||
# High volume, low CTR - improve meta description/title
|
||||
if impressions > 500 and ctr < 10:
|
||||
priority_score = (impressions / 500) * 10 - (ctr / 10) * 5
|
||||
opportunity_type = 'high_volume_low_ctr'
|
||||
recommendation = 'Improve meta title and description to increase click-through rate'
|
||||
|
||||
# Ranking 4-10, could improve to top 3
|
||||
elif position > 3 and position <= 10:
|
||||
priority_score = (10 - position) * 5
|
||||
opportunity_type = 'ranking_improvement'
|
||||
recommendation = 'Optimize content and build backlinks to improve ranking position'
|
||||
|
||||
# Low volume but good position - expand content
|
||||
elif impressions < 100 and position <= 3:
|
||||
priority_score = (100 - impressions) / 100 * 5
|
||||
opportunity_type = 'expansion'
|
||||
recommendation = 'Expand content and build more internal/external links to increase impressions'
|
||||
|
||||
if opportunity_type and priority_score > 0:
|
||||
opportunities.append({
|
||||
'keyword': keyword['keyword'],
|
||||
'current_position': position,
|
||||
'impressions': impressions,
|
||||
'clicks': clicks,
|
||||
'ctr': ctr,
|
||||
'priority_score': round(priority_score, 2),
|
||||
'opportunity_type': opportunity_type,
|
||||
'recommendation': recommendation
|
||||
})
|
||||
|
||||
# Sort by priority score and return top opportunities
|
||||
opportunities.sort(key=lambda x: x['priority_score'], reverse=True)
|
||||
return opportunities[:15]
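The branch order in the scoring loop above matters: a keyword is classified by the first rule it matches. A minimal sketch of the same rules as a standalone function follows; `score_keyword` is a hypothetical name introduced only for illustration.

```python
from typing import Optional, Tuple


def score_keyword(impressions: int, ctr: float, position: float) -> Tuple[Optional[str], float]:
    """Return (opportunity_type, priority_score) using the same branch order as above."""
    if impressions > 500 and ctr < 10:
        # More impressions raise the score; a higher CTR lowers it.
        return "high_volume_low_ctr", (impressions / 500) * 10 - (ctr / 10) * 5
    if 3 < position <= 10:
        # Closer to position 3 means a larger score.
        return "ranking_improvement", (10 - position) * 5
    if impressions < 100 and position <= 3:
        return "expansion", (100 - impressions) / 100 * 5
    return None, 0.0


# 'SERP analysis' from the mock data: 620 impressions, 6.9% CTR, position 11.5
kind, score = score_keyword(impressions=620, ctr=6.9, position=11.5)
```

With the mock 'SERP analysis' row, the first rule fires and the score works out to (620 / 500) * 10 - (6.9 / 10) * 5 = 12.4 - 3.45 = 8.95. A keyword at exactly position 10 scores 0 under the second rule, so the `priority_score > 0` check in the loop above would drop it.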
|
||||
|
||||
async def _analyze_technical_seo_signals(self, gsc_data: Dict[str, Any]) -> Dict[str, Any]:
|
||||
"""Analyze technical SEO signals from GSC data"""
|
||||
return {
|
||||
'index_coverage': 'Good - 98% of pages indexed',
|
||||
'mobile_usability': 'Good - No major issues detected',
|
||||
'core_web_vitals': 'Good - All thresholds met',
|
||||
'crawl_stats': {
|
||||
'pages_crawled_per_day': 1250,
|
||||
'average_response_time': '0.8s',
|
||||
'robots.txt_accessible': True
|
||||
},
|
||||
'indexing_issues': [
|
||||
'Redirect errors: 5 pages',
|
||||
'Not found errors: 12 pages',
|
||||
'Server errors: 0 pages'
|
||||
],
|
||||
'coverage_summary': {
|
||||
'valid': 450,
|
||||
'errors': 17,
|
||||
'warnings': 25,
|
||||
'excluded': 50
|
||||
}
|
||||
}
|
||||
|
||||
async def _analyze_competitive_position(self, gsc_data: Dict[str, Any], site_url: str) -> Dict[str, Any]:
|
||||
"""Analyze competitive positioning based on GSC data"""
|
||||
return {
|
||||
'market_position': 'Strong in niche keywords',
|
||||
'domain_visibility': 'Growing trend',
|
||||
'visibility_score': 72.5,
|
||||
'competitive_keywords': [
|
||||
{'keyword': 'AI content creation', 'position': 2, 'strength': 'Very Strong'},
|
||||
{'keyword': 'meta description', 'position': 1, 'strength': 'Very Strong'},
|
||||
{'keyword': 'SEO tools', 'position': 4, 'strength': 'Strong'}
|
||||
],
|
||||
'vulnerabilities': [
|
||||
"Broader 'content optimization' keywords at position 5-8",
|
||||
"Competitors ranking higher for 'AI writing' variants",
|
||||
"Low ranking for 'keyword research tool' (position 8)"
|
||||
],
|
||||
'recommendations': [
|
||||
'Strengthen ranking for broader content keywords',
|
||||
'Build more high-quality backlinks for competitive terms',
|
||||
'Create content targeting long-tail variations'
|
||||
]
|
||||
}
|
||||
|
||||
async def _analyze_trends(self, gsc_data: Dict[str, Any]) -> Dict[str, Any]:
|
||||
"""Analyze performance trends over time"""
|
||||
return {
|
||||
'clicks_trend': 'Upward - +12% month-over-month',
|
||||
'impressions_trend': 'Stable - +2% month-over-month',
|
||||
'ctr_trend': 'Upward - +8% month-over-month',
|
||||
'position_trend': 'Improving - average position improved from 5.8 to 4.9',
|
||||
'seasonality': 'Peak traffic in Oct-Nov',
|
||||
'growth_forecast': '18-22% improvement expected over next 90 days'
|
||||
}
|
||||
|
||||
async def _generate_ai_recommendations(self, gsc_data: Dict[str, Any], site_url: str) -> Dict[str, Any]:
|
||||
"""Generate AI-powered strategic recommendations"""
|
||||
try:
|
||||
# Build context for LLM
|
||||
keywords = gsc_data.get('keywords', [])
|
||||
top_kw = sorted(keywords, key=lambda x: x.get('clicks', 0), reverse=True)[:5]
|
||||
|
||||
context = f"""
|
||||
Analyze this GSC performance data and provide strategic SEO recommendations:
|
||||
|
||||
Site: {site_url}
|
||||
Top performing keywords: {', '.join([k['keyword'] for k in top_kw])}
|
||||
Total keywords tracked: {len(keywords)}
|
||||
|
||||
Provide:
|
||||
1. Top 3 quick wins for CTR improvement
|
||||
2. Long-term content strategy recommendations
|
||||
3. Competitive positioning strategy
|
||||
4. Technical optimization priorities
|
||||
|
||||
Keep recommendations specific and actionable.
|
||||
"""
|
||||
|
||||
try:
|
||||
recommendations_text = await llm_text_gen(context, max_tokens=800)
|
||||
return {
|
||||
'status': 'completed',
|
||||
'recommendations': recommendations_text,
|
||||
'generated_at': datetime.utcnow().isoformat()
|
||||
}
|
||||
except Exception:  # a bare except would also swallow task cancellation and KeyboardInterrupt
|
||||
return {
|
||||
'status': 'completed',
|
||||
'recommendations': 'AI recommendations generation unavailable.',
|
||||
'generated_at': datetime.utcnow().isoformat()
|
||||
}
|
||||
except Exception as e:
|
||||
logger.error(f"AI recommendations generation failed: {str(e)}")
|
||||
return {'status': 'failed', 'error': str(e)}
|
||||
|
||||
def _count_critical_issues(self, analysis_results: Dict[str, Any]) -> int:
|
||||
"""Count critical issues across all analyses"""
|
||||
critical_count = 0
|
||||
|
||||
# Count from technical signals
|
||||
technical = analysis_results.get('technical_signals', {}).get('indexing_issues', [])
|
||||
critical_count += len([i for i in technical if 'error' in i.lower()])
|
||||
|
||||
# Count from content opportunities
|
||||
opportunities = analysis_results.get('content_opportunities', [])
|
||||
critical_count += len([o for o in opportunities if o.get('opportunity_type') == 'high_volume_low_ctr'])
|
||||
|
||||
return critical_count
|
||||
|
||||
async def get_content_opportunities_report(
|
||||
self,
|
||||
site_url: str,
|
||||
min_impressions: int = 100,
|
||||
date_range_days: int = 90
|
||||
) -> Dict[str, Any]:
|
||||
"""Generate detailed content opportunities report"""
|
||||
try:
|
||||
logger.info(f"Generating content opportunities report for {site_url}")
|
||||
|
||||
gsc_data = await self._fetch_gsc_data(site_url, date_range_days, None)
|
||||
opportunities = await self._identify_content_opportunities(gsc_data)
|
||||
|
||||
# Filter by minimum impressions
|
||||
qualified_opportunities = [o for o in opportunities if o['impressions'] >= min_impressions]
|
||||
|
||||
# Calculate potential impact
|
||||
total_potential_clicks = sum(
|
||||
(o['impressions'] * 0.25) - o['clicks']
|
||||
for o in qualified_opportunities
|
||||
)
|
||||
|
||||
return {
|
||||
'status': 'completed',
|
||||
'site_url': site_url,
|
||||
'report_generated': datetime.utcnow().isoformat(),
|
||||
'opportunities_identified': len(qualified_opportunities),
|
||||
'estimated_additional_clicks': round(total_potential_clicks),
|
||||
'estimated_traffic_increase': '25-40%',
|
||||
'opportunities': qualified_opportunities,
|
||||
'implementation_priority': [
|
||||
{
|
||||
'phase': 'Phase 1 (Weeks 1-2)',
|
||||
'tasks': [o for o in qualified_opportunities if o['opportunity_type'] == 'high_volume_low_ctr'][:5]
|
||||
},
|
||||
{
|
||||
'phase': 'Phase 2 (Weeks 3-4)',
|
||||
'tasks': [o for o in qualified_opportunities if o['opportunity_type'] == 'ranking_improvement'][:5]
|
||||
},
|
||||
{
|
||||
'phase': 'Phase 3 (Month 2)',
|
||||
'tasks': [o for o in qualified_opportunities if o['opportunity_type'] == 'expansion'][:5]
|
||||
}
|
||||
]
|
||||
}
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Content opportunities report generation failed: {str(e)}")
|
||||
raise
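The "potential impact" figure in the report above is the gap between clicks at an assumed 25% target CTR and current clicks, summed over the qualified opportunities. A small sketch of that arithmetic, with the hypothetical names `TARGET_CTR` and `estimate_additional_clicks`:

```python
from typing import Any, Dict, List

TARGET_CTR = 0.25  # the 25% target CTR used in the report above


def estimate_additional_clicks(opportunities: List[Dict[str, Any]]) -> int:
    """Sum the gap between clicks at the target CTR and current clicks."""
    return round(sum(o["impressions"] * TARGET_CTR - o["clicks"] for o in opportunities))


rows = [
    {"keyword": "SERP analysis", "impressions": 620, "clicks": 43},
    {"keyword": "technical SEO", "impressions": 680, "clicks": 81},
]
extra = estimate_additional_clicks(rows)
```

Here the two rows contribute 155 - 43 = 112 and 170 - 81 = 89 clicks. A keyword whose CTR already exceeds the target would contribute a negative term, so a caller may want to clamp each term at zero; that choice is not made in the code above and is only a suggestion.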
|
||||
|
||||
async def health_check(self) -> Dict[str, Any]:
|
||||
"""Health check for the GSC analyzer service"""
|
||||
return {
|
||||
'status': 'operational',
|
||||
'service': self.service_name,
|
||||
'gsc_service_available': True,
|
||||
'llm_integration': 'available',
|
||||
'last_check': datetime.utcnow().isoformat()
|
||||
}
|
||||
@@ -548,11 +548,9 @@ def validate_video_generation_operations(
|
||||
def validate_scene_animation_operation(
|
||||
pricing_service: PricingService,
|
||||
user_id: str,
|
||||
scene_count: int = 1,
|
||||
) -> None:
|
||||
"""
|
||||
Validate the per-scene animation workflow before API calls.
|
||||
Validates that the user has sufficient credits for *all* scenes in the batch.
|
||||
"""
|
||||
try:
|
||||
operations_to_validate = [
|
||||
@@ -562,7 +560,6 @@ def validate_scene_animation_operation(
|
||||
'actual_provider_name': 'wavespeed',
|
||||
'operation_type': 'scene_animation',
|
||||
}
|
||||
for _ in range(scene_count)
|
||||
]
|
||||
|
||||
can_proceed, message, error_details = pricing_service.check_comprehensive_limits(
|
||||
@@ -584,8 +581,9 @@ def validate_scene_animation_operation(
|
||||
}
|
||||
)
|
||||
|
||||
logger.info(f"[Pre-flight Validator] ✅ Scene animation validated for user {user_id} ({scene_count} scene(s))")
|
||||
|
||||
logger.info(f"[Pre-flight Validator] ✅ Scene animation validated for user {user_id}")
|
||||
# Validation passed - no return needed (function raises HTTPException if validation fails)
|
||||
|
||||
except HTTPException:
|
||||
raise
|
||||
except Exception as e:
|
||||
@@ -732,11 +730,9 @@ def validate_video_generation_operations(
|
||||
def validate_scene_animation_operation(
|
||||
pricing_service: PricingService,
|
||||
user_id: str,
|
||||
scene_count: int = 1,
|
||||
) -> None:
|
||||
"""
|
||||
Validate the per-scene animation workflow before API calls.
|
||||
Validates that the user has sufficient credits for *all* scenes in the batch.
|
||||
"""
|
||||
try:
|
||||
operations_to_validate = [
|
||||
@@ -746,7 +742,6 @@ def validate_scene_animation_operation(
|
||||
'actual_provider_name': 'wavespeed',
|
||||
'operation_type': 'scene_animation',
|
||||
}
|
||||
for _ in range(scene_count)
|
||||
]
|
||||
|
||||
can_proceed, message, error_details = pricing_service.check_comprehensive_limits(
|
||||
@@ -768,7 +763,7 @@ def validate_scene_animation_operation(
|
||||
}
|
||||
)
|
||||
|
||||
logger.info(f"[Pre-flight Validator] ✅ Scene animation validated for user {user_id} ({scene_count} scene(s))")
|
||||
logger.info(f"[Pre-flight Validator] ✅ Scene animation validated for user {user_id}")
|
||||
|
||||
except HTTPException:
|
||||
raise
|
||||
|
||||
@@ -566,10 +566,10 @@ class PricingService:
|
||||
"firecrawl_calls_limit": 0, # DISABLED: Firecrawl not in Free tier
|
||||
"stability_calls_limit": 3, # 3 images - enough to try the product
|
||||
"exa_calls_limit": 10, # 10 research queries - enough to try the product
|
||||
"video_calls_limit": 2, # 2 video renders - try podcast video on Free
|
||||
"video_calls_limit": 0, # DISABLED: Video generation not in Free tier
|
||||
"image_edit_calls_limit": 5, # 5 image edits - enough to try the product
|
||||
"audio_calls_limit": 5, # 5 audio clips - enough to try the product
|
||||
"wavespeed_calls_limit": 0, # 0 = unlimited for Free; video controlled via video_calls_limit
|
||||
"wavespeed_calls_limit": 0, # DISABLED: WaveSpeed not included in Free tier
|
||||
"gemini_tokens_limit": 50000,
|
||||
"openai_tokens_limit": 0, # DISABLED
|
||||
"anthropic_tokens_limit": 0, # DISABLED
|
||||
|
||||
@@ -13,18 +13,25 @@ from loguru import logger
|
||||
from sqlalchemy.orm import Session
|
||||
from sqlalchemy import text
|
||||
|
||||
from services.database import WORKSPACE_DIR, init_user_database, ensure_user_workspace_db_directory
|
||||
from services.database import init_user_database, ensure_user_workspace_db_directory
|
||||
from services.workspace_dirs import ensure_user_workspace_dirs
|
||||
from services.workspace_paths import get_workspace_root, get_user_workspace_dir
|
||||
|
||||
class UserWorkspaceManager:
|
||||
"""Manages user-specific workspaces and progressive setup."""
|
||||
|
||||
def __init__(self, db_session: Session):
|
||||
self.db = db_session
|
||||
# Use shared workspace root authority for all environments.
|
||||
self.base_workspace_dir = get_workspace_root()
|
||||
self.user_workspaces_dir = self.base_workspace_dir
|
||||
# Use environment-safe paths for production
|
||||
if os.getenv("RENDER") or os.getenv("RAILWAY") or os.getenv("HEROKU"):
|
||||
# In production, use temp directories or skip file operations
|
||||
self.base_workspace_dir = Path("/tmp/alwrity_workspace")
|
||||
self.user_workspaces_dir = self.base_workspace_dir / "users"
|
||||
else:
|
||||
# In development, use project root 'workspace' directory
|
||||
# services/user_workspace_manager.py -> services -> backend -> root
|
||||
root_dir = Path(__file__).parent.parent.parent
|
||||
self.base_workspace_dir = root_dir / "workspace"
|
||||
self.user_workspaces_dir = self.base_workspace_dir
|
||||
|
||||
def _sanitize_user_id(self, user_id: str) -> str:
|
||||
"""Sanitize user_id to be safe for filesystem (matches database.py logic)."""
|
||||
@@ -39,46 +46,60 @@ class UserWorkspaceManager:
|
||||
"""Create a complete user workspace with progressive setup."""
|
||||
try:
|
||||
logger.info(f"Creating workspace for user {user_id}")
|
||||
|
||||
production_env = bool(os.getenv("RENDER") or os.getenv("RAILWAY") or os.getenv("HEROKU"))
|
||||
filesystem_minimal_mode = bool(os.getenv("ALWRITY_FILESYSTEM_MINIMAL_MODE"))
|
||||
mode = "filesystem_minimal" if filesystem_minimal_mode else ("production" if production_env else "development")
|
||||
|
||||
user_dir = get_user_workspace_dir(user_id)
|
||||
|
||||
# Sanitize user_id
|
||||
safe_user_id = self._sanitize_user_id(user_id)
|
||||
|
||||
# Check if we're in production and skip file operations if needed
|
||||
if os.getenv("RENDER") or os.getenv("RAILWAY") or os.getenv("HEROKU"):
|
||||
logger.info("Production environment detected - skipping file workspace creation")
|
||||
return {
|
||||
"user_id": user_id,
|
||||
"workspace_path": "/tmp/alwrity_workspace/users/user_" + safe_user_id,
|
||||
"config": self._create_user_config(user_id),
|
||||
"created_at": datetime.utcnow().isoformat(),
|
||||
"production_mode": True
|
||||
}
|
||||
|
||||
# Create user-specific directories
|
||||
# Format: workspaces/workspace_{user_id}
|
||||
user_dir = self.user_workspaces_dir / f"workspace_{safe_user_id}"
|
||||
user_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Ensure canonical DB directory and migrate legacy layout if needed
|
||||
self._ensure_workspace_db_directory(user_id)
|
||||
|
||||
# Create user-specific directories lazily via centralized helper
|
||||
user_dir = ensure_user_workspace_dirs(
|
||||
user_id,
|
||||
capabilities={"core", "content", "research", "media", "assets"},
|
||||
)
|
||||
|
||||
|
||||
# Create user-specific configuration
|
||||
config = self._create_user_config(user_id)
|
||||
config_file = user_dir / "config" / "user_config.json"
|
||||
with open(config_file, 'w') as f:
|
||||
json.dump(config, f, indent=2)
|
||||
|
||||
|
||||
# Create user-specific database tables
|
||||
# Use database.py's init_user_database to ensure proper schema
|
||||
try:
|
||||
init_user_database(user_id)
|
||||
except Exception as db_err:
|
||||
logger.error(f"Failed to initialize user database: {db_err}")
|
||||
# Re-raise: without an initialized database the workspace is unusable,
# so creation must fail here instead of continuing silently.
raise
|
||||
|
||||
dirs_created = ["db", "assets", "media", "content", "config/user_config.json"]
|
||||
logger.info(
|
||||
"User workspace created",
|
||||
mode=mode,
|
||||
workspace_path=str(user_dir),
|
||||
dirs_created=dirs_created,
|
||||
)
|
||||
|
||||
logger.info(f"✅ User workspace created: {user_dir}")
|
||||
return {
|
||||
"user_id": user_id,
|
||||
"workspace_path": str(user_dir),
|
||||
"config": config,
|
||||
"created_at": datetime.now().isoformat(),
|
||||
"mode": mode,
|
||||
"dirs_created": dirs_created,
|
||||
"created_at": datetime.now().isoformat()
|
||||
}
|
||||
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error creating user workspace: {e}")
|
||||
raise
|
||||
@@ -140,7 +161,7 @@ class UserWorkspaceManager:
|
||||
def get_user_workspace(self, user_id: str) -> Optional[Dict[str, Any]]:
|
||||
"""Get user workspace information."""
|
||||
safe_user_id = self._sanitize_user_id(user_id)
|
||||
user_dir = get_user_workspace_dir(user_id)
|
||||
user_dir = self.user_workspaces_dir / f"workspace_{safe_user_id}"
|
||||
|
||||
if not user_dir.exists():
|
||||
return None
|
||||
@@ -160,7 +181,7 @@ class UserWorkspaceManager:
|
||||
"""Update user configuration."""
|
||||
try:
|
||||
safe_user_id = self._sanitize_user_id(user_id)
|
||||
user_dir = get_user_workspace_dir(user_id)
|
||||
user_dir = self.user_workspaces_dir / f"workspace_{safe_user_id}"
|
||||
config_file = user_dir / "config" / "user_config.json"
|
||||
|
||||
if config_file.exists():
|
||||
@@ -310,7 +331,7 @@ class UserWorkspaceManager:
|
||||
"""Clean up user workspace (for account deletion)."""
|
||||
try:
|
||||
safe_user_id = self._sanitize_user_id(user_id)
|
||||
user_dir = get_user_workspace_dir(user_id)
|
||||
user_dir = self.user_workspaces_dir / f"workspace_{safe_user_id}"
|
||||
if user_dir.exists():
|
||||
shutil.rmtree(user_dir)
|
||||
|
||||
|
||||
@@ -40,7 +40,7 @@ class WixService:
|
||||
if not self.client_id:
|
||||
logger.warning("Wix client ID not configured. Set WIX_CLIENT_ID environment variable.")
|
||||
|
||||
def get_authorization_url(self, state: str = None) -> Dict[str, str]:
|
||||
def get_authorization_url(self, state: str = None) -> str:
|
||||
"""
|
||||
Generate Wix OAuth authorization URL for "on behalf of user" authentication
|
||||
|
||||
@@ -54,7 +54,8 @@ class WixService:
|
||||
Authorization URL for user to visit
|
||||
"""
|
||||
url, code_verifier = self.auth_service.generate_authorization_url(state)
|
||||
return {"authorization_url": url, "state": state, "code_verifier": code_verifier}
|
||||
self._code_verifier = code_verifier
|
||||
return url
|
||||
|
||||
def _create_redirect_session_for_auth(self, redirect_uri: str, client_id: str, code_challenge: str, state: str) -> str:
|
||||
"""
|
||||
@@ -96,13 +97,13 @@ class WixService:
|
||||
logger.error(f"Failed to create redirect session for auth: {e}")
|
||||
raise
|
||||
|
||||
def exchange_code_for_tokens(self, code: str, code_verifier: str) -> Dict[str, Any]:
|
||||
def exchange_code_for_tokens(self, code: str, code_verifier: str = None) -> Dict[str, Any]:
|
||||
"""
|
||||
Exchange authorization code for access and refresh tokens using PKCE
|
||||
|
||||
Args:
|
||||
code: Authorization code from Wix
|
||||
code_verifier: PKCE code verifier
|
||||
code_verifier: PKCE code verifier (uses stored one if not provided)
|
||||
|
||||
Returns:
|
||||
Token response with access_token, refresh_token, etc.
|
||||
@@ -110,7 +111,9 @@ class WixService:
|
||||
if not self.client_id:
|
||||
raise ValueError("Wix client ID not configured")
|
||||
if not code_verifier:
|
||||
raise ValueError("Code verifier is required.")
|
||||
code_verifier = getattr(self, '_code_verifier', None)
|
||||
if not code_verifier:
|
||||
raise ValueError("Code verifier not found. Please provide code_verifier parameter.")
|
||||
try:
|
||||
return self.auth_service.exchange_code_for_tokens(code, code_verifier)
|
||||
except requests.RequestException as e:
|
||||
|
||||
@@ -1,19 +0,0 @@
|
||||
"""Shared workspace path helpers.
|
||||
|
||||
Single authority for workspace root and per-user workspace paths.
|
||||
"""
|
||||
|
||||
from pathlib import Path
|
||||
|
||||
from utils.storage_paths import get_repo_root, sanitize_user_id
|
||||
|
||||
|
||||
def get_workspace_root() -> Path:
|
||||
"""Return absolute workspace root directory under repo root."""
|
||||
return (get_repo_root() / "workspace").resolve()
|
||||
|
||||
|
||||
def get_user_workspace_dir(user_id: str) -> Path:
|
||||
"""Return absolute workspace directory for the given user."""
|
||||
safe_user_id = sanitize_user_id(user_id)
|
||||
return (get_workspace_root() / f"workspace_{safe_user_id}").resolve()
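To make the per-user path convention concrete, here is a minimal sketch of how a `workspace_<safe_id>` directory could be derived. The real `sanitize_user_id` lives in `utils.storage_paths` and is not shown in this diff, so the version below is a hypothetical stand-in, and the root path is an arbitrary example.

```python
import re
from pathlib import Path


def sanitize_user_id(user_id: str) -> str:
    """Hypothetical stand-in: keep only characters that are safe in a directory name."""
    return re.sub(r"[^A-Za-z0-9_-]", "_", user_id)


def user_workspace_dir(workspace_root: Path, user_id: str) -> Path:
    """Build <root>/workspace_<safe_id>, mirroring the naming used above."""
    return workspace_root / f"workspace_{sanitize_user_id(user_id)}"


path = user_workspace_dir(Path("/srv/app/workspace"), "user@example.com")
```

Sanitizing before joining keeps separators and `..` segments out of the directory name, which is the main reason to route every caller through a single helper.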
|
||||
@@ -1,8 +1,8 @@
|
||||
import os
|
||||
import re
|
||||
import asyncio
|
||||
from typing import Any, Dict, List, Optional
|
||||
from typing import Any, Dict, List
|
||||
from dataclasses import dataclass
|
||||
import httpx
|
||||
from loguru import logger
|
||||
import random
|
||||
|
||||
@@ -18,33 +18,49 @@ class WritingSuggestion:
|
||||
|
||||
class WritingAssistantService:
|
||||
"""
|
||||
Writing assistant that combines Exa search with LLM continuation.
|
||||
- Searches relevant sources using the content near the cursor position
|
||||
- Generates a short continuation grounded in sources
|
||||
- Confidence derived from source availability and quality
|
||||
Minimal writing assistant that combines Exa search with Gemini continuation.
|
||||
- Exa provides relevant sources with content snippets
|
||||
- Gemini generates a short, cited continuation based on current text and sources
|
||||
"""
|
||||
|
||||
def __init__(self) -> None:
|
||||
self.exa_api_key = os.getenv("EXA_API_KEY")
|
||||
|
||||
if not self.exa_api_key:
|
||||
logger.warning("EXA_API_KEY not configured; writing assistant will fail")
|
||||
|
||||
self.http_timeout_seconds = 15
|
||||
|
||||
# COST CONTROL: Daily usage limits
|
||||
self.daily_api_calls = 0
|
||||
self.daily_limit = 50
|
||||
self.daily_limit = 50 # Max 50 API calls per day (~$2.50 max cost)
|
||||
self.last_reset_date = None
|
||||
|
||||
def _get_cached_suggestion(self, text: str) -> WritingSuggestion | None:
|
||||
"""No cached suggestions - always use real API calls for authentic results."""
|
||||
return None
|
||||
|
||||
def _check_daily_limit(self) -> bool:
|
||||
"""Check if we're within daily API usage limits."""
|
||||
import datetime
|
||||
|
||||
today = datetime.date.today()
|
||||
|
||||
# Reset counter if it's a new day
|
||||
if self.last_reset_date != today:
|
||||
self.daily_api_calls = 0
|
||||
self.last_reset_date = today
|
||||
|
||||
# Check if we've exceeded the limit
|
||||
if self.daily_api_calls >= self.daily_limit:
|
||||
return False
|
||||
|
||||
# Increment counter for this API call
|
||||
self.daily_api_calls += 1
|
||||
logger.info(f"Writing assistant API call #{self.daily_api_calls}/{self.daily_limit} today")
|
||||
return True
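The daily limit check above is a reset-on-date-change counter. The sketch below isolates that pattern in a small class so it can be exercised with fixed dates; `DailyCallBudget` and `try_consume` are hypothetical names, not part of the service.

```python
import datetime
from typing import Optional


class DailyCallBudget:
    """Hypothetical in-memory counter that resets when the calendar date changes."""

    def __init__(self, daily_limit: int = 50) -> None:
        self.daily_limit = daily_limit
        self.calls_today = 0
        self.last_reset_date: Optional[datetime.date] = None

    def try_consume(self, today: Optional[datetime.date] = None) -> bool:
        """Return True and count the call if budget remains, else False."""
        today = today or datetime.date.today()
        if self.last_reset_date != today:
            # First call of a new day: start counting from zero again.
            self.calls_today = 0
            self.last_reset_date = today
        if self.calls_today >= self.daily_limit:
            return False
        self.calls_today += 1
        return True


budget = DailyCallBudget(daily_limit=2)
day = datetime.date(2024, 1, 1)
results = [budget.try_consume(day) for _ in range(3)]
```

A counter like this lives in process memory, so it resets on restart and is not shared between worker processes; a shared store would be needed if the limit has to hold across workers.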
|
||||
|
||||
async def suggest(self, text: str, user_id: str | None = None, cursor_position: Optional[int] = None) -> List[WritingSuggestion]:
|
||||
async def suggest(self, text: str, user_id: str | None = None) -> List[WritingSuggestion]:
|
||||
if not text or len(text.strip()) < 6:
|
||||
return []
|
||||
|
||||
@@ -59,63 +75,62 @@ class WritingAssistantService:
|
||||
if len(text.strip()) < 50:
|
||||
return []
|
||||
|
||||
# Use text before cursor for context (where the user is actively writing)
|
||||
if cursor_position is not None and 0 < cursor_position <= len(text):
|
||||
context_text = text[:cursor_position]
|
||||
else:
|
||||
context_text = text
|
||||
# 1) Find relevant sources via Exa
|
||||
sources = await self._search_sources(text)
|
||||
|
||||
# 1) Find relevant sources via Exa (non-fatal)
|
||||
sources = []
|
||||
try:
|
||||
sources = await self._search_sources(context_text, user_id=user_id)
|
||||
except Exception as e:
|
||||
logger.warning(f"WritingAssistant Exa search failed, proceeding without sources: {e}")
|
||||
|
||||
# 2) Generate continuation suggestion via LLM
|
||||
suggestion_text, confidence = await self._generate_continuation(context_text, sources, user_id=user_id)
|
||||
# 2) Generate continuation suggestion via LLM grounded in sources
|
||||
suggestion_text, confidence = await self._generate_continuation(text, sources, user_id=user_id)
|
||||
|
||||
if not suggestion_text:
|
||||
return []
|
||||
|
||||
return [WritingSuggestion(text=suggestion_text.strip(), confidence=confidence, sources=sources)]
|
||||
|
||||
async def _search_sources(self, context_text: str, user_id: str = None) -> List[Dict[str, Any]]:
|
||||
"""Search Exa using the last sentence before cursor for a focused query."""
|
||||
async def _search_sources(self, text: str) -> List[Dict[str, Any]]:
|
||||
if not self.exa_api_key:
|
||||
raise Exception("EXA_API_KEY not configured")
|
||||
|
||||
# Follow Exa demo guidance: continuation-style prompt and 1000-char cap
|
||||
exa_query = (
|
||||
(text[-1000:] if len(text) > 1000 else text)
|
||||
+ "\n\nIf you found the above interesting, here's another useful resource to read:"
|
||||
)
|
||||
|
||||
payload = {
|
||||
"query": exa_query,
|
||||
"numResults": 3, # Reduced from 5 to 3 for cost savings
|
||||
"text": True,
|
||||
"type": "neural",
|
||||
"highlights": {"numSentences": 1, "highlightsPerUrl": 1},
|
||||
}
|
||||
|
||||
try:
|
||||
from services.blog_writer.research.exa_provider import ExaResearchProvider
|
||||
|
||||
# Extract the last sentence from context to use as a focused search query
|
||||
sentences = re.split(r'(?<=[.!?])\s+', context_text.strip())
|
||||
last_sentence = sentences[-1].strip().strip('"').strip("'") if sentences else context_text
|
||||
|
||||
# If very short, use last two sentences
|
||||
if len(last_sentence) < 20 and len(sentences) >= 2:
|
||||
last_sentence = ' '.join(sentences[-2:]).strip().strip('"').strip("'")
|
||||
|
||||
exa_query = last_sentence[:500] if len(last_sentence) > 500 else last_sentence
|
||||
|
||||
provider = ExaResearchProvider()
|
||||
sources = await provider.simple_search(
|
||||
query=exa_query,
|
||||
num_results=3,
|
||||
user_id=user_id,
|
||||
)
|
||||
|
||||
normalized = []
|
||||
for s in sources:
|
||||
normalized.append({
|
||||
"title": s.get("title", "Untitled"),
|
||||
"url": s.get("url", ""),
|
||||
"text": s.get("text", ""),
|
||||
"author": s.get("author", ""),
|
||||
"published_date": s.get("publishedDate", ""),
|
||||
"score": float(s.get("score") if s.get("score") is not None else 0.5),
|
||||
})
|
||||
|
||||
if not normalized:
|
||||
async with httpx.AsyncClient(timeout=self.http_timeout_seconds) as client:
|
||||
resp = await client.post(
|
||||
"https://api.exa.ai/search",
|
||||
headers={"x-api-key": self.exa_api_key, "Content-Type": "application/json"},
|
||||
json=payload,
|
||||
)
|
||||
if resp.status_code != 200:
|
||||
raise Exception(f"Exa error {resp.status_code}: {resp.text}")
|
||||
data = resp.json()
|
||||
results = data.get("results", [])
|
||||
sources: List[Dict[str, Any]] = []
|
||||
for r in results:
|
||||
sources.append(
|
||||
{
|
||||
"title": r.get("title", "Untitled"),
|
||||
"url": r.get("url", ""),
|
||||
"text": r.get("text", ""),
|
||||
"author": r.get("author", ""),
|
||||
"published_date": r.get("publishedDate", ""),
|
||||
"score": float(r.get("score", 0.5)),
|
||||
}
|
||||
)
|
||||
# Explicitly fail if no sources to avoid generic completions
|
||||
if not sources:
|
||||
raise Exception("No relevant sources found from Exa for the current context")
|
||||
return normalized
|
||||
return sources
|
||||
except Exception as e:
|
||||
logger.error(f"WritingAssistant _search_sources error: {e}")
|
||||
raise
|
||||
@@ -157,21 +172,8 @@ class WritingAssistantService:
|
||||
suggestion = (str(ai_resp or "")).strip()
|
||||
if not suggestion:
|
||||
raise Exception("Assistive writer returned empty suggestion")
|
||||
|
||||
# Dynamic confidence based on source quality and response signals
|
||||
confidence = 0.5
|
||||
if sources:
|
||||
# More sources and higher scores = more confident
|
||||
avg_score = sum(s.get("score", 0.5) for s in sources) / len(sources)
|
||||
confidence = 0.5 + (len(sources) / 6.0) * 0.3 + avg_score * 0.2
|
||||
if suggestion.endswith(('.', '!', '?')):
|
||||
confidence += 0.05
|
||||
# Check if citation hint was included
|
||||
if '[http' in suggestion or '((' in suggestion:
|
||||
confidence += 0.05
|
||||
confidence = min(confidence, 1.0)
|
||||
|
||||
return suggestion, round(confidence, 2)
|
||||
confidence = 0.7
|
||||
return suggestion, confidence
|
||||
except Exception as e:
|
||||
logger.error(f"WritingAssistant _generate_continuation error: {e}")
|
||||
raise
|
||||
|
||||
@@ -1,53 +0,0 @@
|
||||
from pathlib import Path
|
||||
|
||||
from services.user_workspace_manager import UserWorkspaceManager
|
||||
|
||||
|
||||
def _configure_temp_workspace(monkeypatch, tmp_path):
|
||||
workspace_root = tmp_path / "workspace"
|
||||
monkeypatch.setattr("services.database.WORKSPACE_DIR", str(workspace_root))
|
||||
monkeypatch.setattr("services.workspace_dirs.WORKSPACE_DIR", str(workspace_root))
|
||||
monkeypatch.setattr("services.user_workspace_manager.WORKSPACE_DIR", str(workspace_root))
|
||||
monkeypatch.setattr("services.user_workspace_manager.init_user_database", lambda user_id: None)
|
||||
return workspace_root
|
||||
|
||||
|
||||
def _assert_required_contract(user_dir: Path):
|
||||
assert user_dir.exists()
|
||||
assert (user_dir / "db").exists()
|
||||
assert (user_dir / "assets").exists()
|
||||
assert (user_dir / "media").exists()
|
||||
assert (user_dir / "content").exists()
|
||||
assert (user_dir / "config" / "user_config.json").exists()
|
||||
|
||||
|
||||
def test_create_user_workspace_development_contract(monkeypatch, tmp_path):
|
||||
workspace_root = _configure_temp_workspace(monkeypatch, tmp_path)
|
||||
monkeypatch.delenv("RENDER", raising=False)
|
||||
monkeypatch.delenv("RAILWAY", raising=False)
|
||||
monkeypatch.delenv("HEROKU", raising=False)
|
||||
monkeypatch.delenv("ALWRITY_FILESYSTEM_MINIMAL_MODE", raising=False)
|
||||
|
||||
manager = UserWorkspaceManager(db_session=None)
|
||||
result = manager.create_user_workspace("dev-user")
|
||||
|
||||
expected = workspace_root / "workspace_dev-user"
|
||||
_assert_required_contract(expected)
|
||||
assert result["workspace_path"] == str(expected)
|
||||
assert result["mode"] == "development"
|
||||
assert {"db", "assets", "media", "content", "config/user_config.json"}.issubset(set(result["dirs_created"]))
|
||||
|
||||
|
||||
def test_create_user_workspace_production_filesystem_minimal_contract(monkeypatch, tmp_path):
|
||||
workspace_root = _configure_temp_workspace(monkeypatch, tmp_path)
|
||||
monkeypatch.setenv("RENDER", "1")
|
||||
monkeypatch.setenv("ALWRITY_FILESYSTEM_MINIMAL_MODE", "1")
|
||||
|
||||
manager = UserWorkspaceManager(db_session=None)
|
||||
result = manager.create_user_workspace("prod-user")
|
||||
|
||||
expected = workspace_root / "workspace_prod-user"
|
||||
_assert_required_contract(expected)
|
||||
assert result["workspace_path"] == str(expected)
|
||||
assert result["mode"] == "filesystem_minimal"
|
||||
assert {"db", "assets", "media", "content", "config/user_config.json"}.issubset(set(result["dirs_created"]))
|
||||
Binary file not shown.
@@ -1,181 +0,0 @@
|
||||
# Analytics
|
||||
|
||||
Track campaign performance with built-in analytics including send volume trends, conversion funnels, reply classification breakdowns, and CSV exports.
|
||||
|
||||
## Dashboard Overview
|
||||
|
||||
The analytics tab provides a comprehensive view of your outreach performance:
|
||||
|
||||
```mermaid
flowchart LR
    A[Campaign Analytics] --> B[Volume Trends]
    A --> C[Conversion Funnel]
    A --> D[Reply Classification]
    A --> E[Response Rate]
    A --> F[Placement Rate]
    A --> G[CSV Exports]

    style A fill:#e3f2fd
    style B fill:#e8f5e8
    style G fill:#fff3e0
```
|
||||
|
||||
## Metrics
|
||||
|
||||
### Send Volume Trends
|
||||
|
||||
A line chart showing daily email send volume over a configurable time window (7, 14, 30, or 90 days).
|
||||
|
||||
- **X-axis**: Date.
|
||||
- **Y-axis**: Number of emails sent.
|
||||
- **Use case**: Spot trends, ensure consistent outreach cadence, stay within daily caps.
|
||||
|
||||
### Conversion Funnel
|
||||
|
||||
A bar chart showing lead counts at each status stage:
|
||||
|
||||
| Stage | Description |
|---|---|
| Discovered | Total leads found. |
| Contacted | Leads that received an outreach email. |
| Replied | Leads that responded (interested or neutral). |
| Placed | Leads that resulted in a published backlink. |
|
||||
|
||||
- **Use case**: Identify bottlenecks in your outreach pipeline.
|
||||
|
||||
### Reply Classification
|
||||
|
||||
A breakdown of auto-classified replies:
|
||||
|
||||
| Classification | Color | Meaning |
|---|---|---|
| Interested | Green | Positive response: follow up! |
| Not interested | Red | Declined: auto-suppressed. |
| Out of office | Yellow | Auto-responder: schedule follow-up. |
| Replied | Blue | General response: needs review. |
|
||||
|
||||
### Response Rate
|
||||
|
||||
Percentage of sent emails that received any reply:
|
||||
|
||||
```
|
||||
Response Rate = (Total Replies / Total Sent) × 100
|
||||
```
|
||||
|
||||
### Placement Rate
|
||||
|
||||
Percentage of contacted leads that resulted in a published backlink:
|
||||
|
||||
```
|
||||
Placement Rate = (Placed Leads / Contacted Leads) × 100
|
||||
```
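
As a minimal sketch of the two formulas above, the helpers below compute both rates as percentages. The function names are illustrative, and returning `0.0` for an empty denominator is an assumption; the source does not say how campaigns with no sends are handled.

```python
def response_rate(total_replies: int, total_sent: int) -> float:
    """Percentage of sent emails that received any reply."""
    if total_sent <= 0:
        return 0.0  # assumption: report 0% instead of dividing by zero
    return total_replies / total_sent * 100


def placement_rate(placed_leads: int, contacted_leads: int) -> float:
    """Percentage of contacted leads that resulted in a published backlink."""
    if contacted_leads <= 0:
        return 0.0  # assumption: same convention as response_rate
    return placed_leads / contacted_leads * 100


if __name__ == "__main__":
    print(f"Response rate:  {response_rate(23, 52):.1f}%")
    print(f"Placement rate: {placement_rate(7, 45):.1f}%")
```

The sample API responses on this page appear to report rates as fractions (for example `0.44`), so a client would likely multiply by 100 only for display.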
|
||||
|
||||
## Analytics API
|
||||
|
||||
### Campaign Analytics
|
||||
|
||||
**API:** `GET /api/v1/backlink-outreach/campaigns/{campaign_id}/analytics`
|
||||
|
||||
**Query parameters:**
|
||||
|
||||
| Parameter | Type | Default | Description |
|
||||
|---|---|---|---|
|
||||
| `days` | int | `30` | Number of days to include in trends. |
|
||||
|
||||
**Response:**
|
||||
|
||||
```json
{
  "total_leads": 150,
  "leads_by_status": {
    "discovered": 80,
    "contacted": 45,
    "replied": 18,
    "placed": 7,
    "bounced": 5
  },
  "total_attempts": 52,
  "total_replies": 23,
  "replies_by_classification": {
    "interested": 12,
    "not_interested": 5,
    "out_of_office": 3,
    "replied": 3
  },
  "response_rate": 0.44,
  "placement_rate": 0.16,
  "daily_send_volume": [
    {"date": "2025-01-15", "count": 8},
    {"date": "2025-01-16", "count": 12}
  ]
}
```
|
||||
|
||||
### Reporting Snapshot
|
||||
|
||||
Cross-campaign analytics across all campaigns for the authenticated user.
|
||||
|
||||
**API:** `GET /api/v1/backlink-outreach/reporting/snapshot`
|
||||
|
||||
**Response:**
|
||||
|
||||
```json
|
||||
{
|
||||
"total_campaigns": 5,
|
||||
"total_sends": 342,
|
||||
"total_replies": 87,
|
||||
"total_placements": 14,
|
||||
"overall_response_rate": 0.25,
|
||||
"overall_placement_rate": 0.04
|
||||
}
|
||||
```
|
||||
|
||||
!!! note "Reply counting"
|
||||
The reporting snapshot counts `OutreachReply` records (not `status == "replied"` on attempts). This ensures accuracy — a lead marked "replied" manually without an actual reply record won't inflate the count.
|
||||
|
||||
## CSV Exports
|
||||
|
||||
Export campaign data as CSV files for CRM import, spreadsheet analysis, or client reporting.
|
||||
|
||||
### Export Leads
|
||||
|
||||
**API:** `GET /api/v1/backlink-outreach/campaigns/{campaign_id}/export/leads`
|
||||
|
||||
### Export Attempts
|
||||
|
||||
**API:** `GET /api/v1/backlink-outreach/campaigns/{campaign_id}/export/attempts`
|
||||
|
||||
### Export Replies
|
||||
|
||||
**API:** `GET /api/v1/backlink-outreach/campaigns/{campaign_id}/export/replies`
|
||||
|
||||
### CSV Safety
|
||||
|
||||
All exports include these safety measures:
|
||||
|
||||
| Measure | Purpose |
|
||||
|---|---|
|
||||
| Explicit fieldnames | Only expected columns are included. |
|
||||
| `extrasaction="ignore"` | Unexpected fields are silently dropped. |
|
||||
| Formula injection sanitization | Cells starting with `=`, `+`, `-`, `@` are prefixed with a single quote to prevent formula injection in spreadsheets. |
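
The three measures in the table can be sketched with Python's standard `csv` module. This is an illustration under stated assumptions, not the project's export code: the helper names and the `internal_note` field are hypothetical, while `website_url` and `contact_email` are taken from the lead fields documented in the API reference.

```python
import csv
import io

# Leading characters that spreadsheets may interpret as the start of a formula.
FORMULA_PREFIXES = ("=", "+", "-", "@")


def sanitize_cell(value):
    """Prefix formula-like strings with a single quote so they are read as text."""
    if isinstance(value, str) and value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value


def export_rows(rows, fieldnames):
    """Serialize rows to CSV text, keeping only the explicitly listed columns."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops keys that are not in fieldnames instead of raising.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        writer.writerow({key: sanitize_cell(val) for key, val in row.items()})
    return buffer.getvalue()


if __name__ == "__main__":
    leads = [
        {
            "website_url": "https://example.com",
            "contact_email": "=cmd()",      # would be evaluated as a formula if left as-is
            "internal_note": "not exported",  # hypothetical extra field, silently dropped
        }
    ]
    print(export_rows(leads, ["website_url", "contact_email"]))
```

Sanitizing every cell before writing, rather than only known text columns, keeps the rule simple at the cost of also quoting legitimate values such as negative numbers stored as strings.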
|
||||
|
||||
!!! warning "Export loading"
|
||||
Exports may take a few seconds for large campaigns. The UI shows an "Exporting..." state with a disabled button while the download is in progress.
|
||||
|
||||
## UI Features
|
||||
|
||||
### Time Window Selector
|
||||
|
||||
Choose from 7, 14, 30, or 90 days for trend charts. The analytics data is re-fetched when the window changes.
|
||||
|
||||
### Separate Loading States
|
||||
|
||||
Each data section (attempts, replies, analytics) has its own loading indicator, so slow analytics queries don't block the entire page.
|
||||
|
||||
### Error Handling
|
||||
|
||||
If analytics or export requests fail, a toast notification shows the error message. On 5xx server errors, the store automatically retries read operations once after a backoff delay.
|
||||
|
||||
---
|
||||
|
||||
*Next: [API Reference](api-reference.md) — full endpoint documentation.*
|
||||
@@ -1,449 +0,0 @@
|
||||
# API Reference
|
||||
|
||||
Complete reference for all Backlink Outreach API endpoints. All endpoints require Clerk authentication via `Depends(get_current_user)`.
|
||||
|
||||
## Authentication
|
||||
|
||||
All endpoints use Clerk authentication. Include the session token in the `Authorization` header:
|
||||
|
||||
```
|
||||
Authorization: Bearer <clerk_session_token>
|
||||
```
|
||||
|
||||
The `user_id` is derived from the authenticated session — never from the request body.
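
A client might attach the token as in the sketch below, which only builds the request and does not send it. `BASE_URL` and `build_request` are placeholders introduced for illustration; substitute the host of your own deployment.

```python
import urllib.request

BASE_URL = "https://alwrity.example.com"  # assumption: placeholder host, not a real deployment


def build_request(path: str, session_token: str, method: str = "GET") -> urllib.request.Request:
    """Build a request that carries the Clerk session token as a Bearer token."""
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/backlink-outreach{path}",
        method=method,
        headers={"Authorization": f"Bearer {session_token}"},
    )


if __name__ == "__main__":
    req = build_request("/campaigns", "<clerk_session_token>")
    print(req.get_method(), req.full_url)
    # Sending is left out on purpose; urllib.request.urlopen(req) would perform the call.
```

No `user_id` appears in the request, which matches the rule that identity comes from the session rather than from client-supplied data.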
|
||||
|
||||
## Endpoint Map
|
||||
|
||||
```mermaid
flowchart TD
    subgraph Campaigns
        C1["POST /campaigns"]
        C2["GET /campaigns"]
        C3["GET /campaigns/{id}"]
        C4["DELETE /campaigns/{id}"]
    end
    subgraph Leads
        L1["POST /campaigns/{id}/leads"]
        L2["POST /campaigns/{id}/leads/bulk"]
        L3["PATCH /campaigns/{id}/leads/{lead_id}/status"]
        L4["PATCH /campaigns/{id}/leads/bulk-status"]
    end
    subgraph Discovery
        D1["POST /discover/deep"]
    end
    subgraph Email
        E1["POST /emails/generate"]
        E2["POST /emails/personalize"]
        E3["POST /emails/subject-suggestions"]
        E4["POST /emails/follow-up"]
        E5["POST /emails/templates"]
        E6["GET /emails/templates"]
        E7["GET /emails/templates/{id}"]
        E8["DELETE /emails/templates/{id}"]
    end
    subgraph Outreach
        O1["POST /outreach/send"]
        O2["POST /policy/validate"]
        O3["GET /campaigns/{id}/attempts"]
        O4["GET /campaigns/{id}/follow-ups"]
    end
    subgraph Replies
        R1["POST /replies/poll"]
        R2["GET /campaigns/{id}/replies"]
    end
    subgraph Suppression
        S1["POST /suppression"]
        S2["GET /suppression"]
    end
    subgraph Analytics
        A1["GET /campaigns/{id}/analytics"]
        A2["GET /reporting/snapshot"]
        A3["GET /campaigns/{id}/export/leads"]
        A4["GET /campaigns/{id}/export/attempts"]
        A5["GET /campaigns/{id}/export/replies"]
    end
```
|
||||
|
||||
---
|
||||
|
||||
## Campaigns
|
||||
|
||||
### Create Campaign
|
||||
|
||||
`POST /api/v1/backlink-outreach/campaigns`
|
||||
|
||||
**Request Body:**
|
||||
|
||||
| Field | Type | Required | Description |
|
||||
|---|---|---|---|
|
||||
| `name` | string | Yes | Campaign name. |
|
||||
| `description` | string | No | Campaign description. |
|
||||
| `keywords` | string[] | No | Target keywords for discovery. |
|
||||
|
||||
**Response:** `201 Created` — Campaign object.
|
||||
|
||||
### List Campaigns
|
||||
|
||||
`GET /api/v1/backlink-outreach/campaigns`
|
||||
|
||||
**Query Parameters:**
|
||||
|
||||
| Parameter | Type | Default | Description |
|
||||
|---|---|---|---|
|
||||
| `workspace_id` | string | user_id | Workspace to filter by. Defaults to authenticated user. |
|
||||
|
||||
**Response:** `200 OK` — Array of campaign objects.
|
||||
|
||||
### Get Campaign
|
||||
|
||||
`GET /api/v1/backlink-outreach/campaigns/{campaign_id}`
|
||||
|
||||
**Response:** `200 OK` — Campaign object with included leads.
|
||||
|
||||
### Delete Campaign
|
||||
|
||||
`DELETE /api/v1/backlink-outreach/campaigns/{campaign_id}`
|
||||
|
||||
**Response:** `204 No Content`
|
||||
|
||||
---
|
||||
|
||||
## Leads
|
||||
|
||||
### Add Lead
|
||||
|
||||
`POST /api/v1/backlink-outreach/campaigns/{campaign_id}/leads`
|
||||
|
||||
**Request Body:**
|
||||
|
||||
| Field | Type | Required | Description |
|
||||
|---|---|---|---|
|
||||
| `website_url` | string | Yes | Target website URL. |
|
||||
| `website_title` | string | No | Website title. |
|
||||
| `contact_email` | string | No | Contact email address. |
|
||||
| `quality_score` | float | No | Quality score (0-1). |
|
||||
| `relevance_score` | float | No | Relevance score (0-1). |
|
||||
| `guest_post_likelihood` | float | No | Guest post likelihood (0-1). |
|
||||
| `source` | string | No | Source of the lead. |
|
||||
|
||||
**Response:** `201 Created` — Lead object.
|
||||
|
||||
### Bulk Add Leads
|
||||
|
||||
`POST /api/v1/backlink-outreach/campaigns/{campaign_id}/leads/bulk`
|
||||
|
||||
**Request Body:** Array of lead objects.
|
||||
|
||||
**Response:** `200 OK`
|
||||
|
||||
| Field | Type | Description |
|
||||
|---|---|---|
|
||||
| `added` | int | Number of leads successfully added. |
|
||||
| `skipped` | int | Number of duplicates skipped. |
|
||||
| `failed` | string[] | List of failed entries with reasons. |
|
||||
|
||||
### Update Lead Status
|
||||
|
||||
`PATCH /api/v1/backlink-outreach/campaigns/{campaign_id}/leads/{lead_id}/status`
|
||||
|
||||
**Request Body:**
|
||||
|
||||
| Field | Type | Required | Description |
|
||||
|---|---|---|---|
|
||||
| `status` | string | Yes | New status: discovered, contacted, replied, placed, bounced, lost. |
|
||||
|
||||
**Response:** `200 OK` — Updated lead object.
|
||||
|
||||
### Bulk Update Status
|
||||
|
||||
`PATCH /api/v1/backlink-outreach/campaigns/{campaign_id}/leads/bulk-status`
|
||||
|
||||
**Request Body:**
|
||||
|
||||
| Field | Type | Required | Description |
|
||||
|---|---|---|---|
|
||||
| `lead_ids` | string[] | Yes | Lead IDs to update. |
|
||||
| `status` | string | Yes | New status for all leads. |
|
||||
|
||||
**Response:** `200 OK`
|
||||
|
||||
| Field | Type | Description |
|
||||
|---|---|---|
|
||||
| `updated` | int | Number of leads successfully updated. |
|
||||
| `failed` | string[] | List of lead IDs that failed to update. |
|
||||
|
||||
!!! warning "Partial failures"
|
||||
Bulk operations may partially succeed. Always check the `failed` field and show appropriate warnings to users.
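
One way a client could act on that advice is sketched here. The function name and message wording are hypothetical; only the `updated` and `failed` fields come from the response table above.

```python
def summarize_bulk_result(result: dict) -> str:
    """Turn a bulk-status response into a short user-facing message."""
    updated = result.get("updated", 0)
    failed = result.get("failed") or []  # treat a missing or null field as "no failures"
    if failed:
        failed_ids = ", ".join(str(item) for item in failed)
        return f"Updated {updated} leads; {len(failed)} failed: {failed_ids}"
    return f"Updated {updated} leads"


if __name__ == "__main__":
    print(summarize_bulk_result({"updated": 3, "failed": ["lead_7"]}))
    print(summarize_bulk_result({"updated": 4, "failed": []}))
```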
|
||||
|
||||
---
|
||||
|
||||
## Discovery
|
||||
|
||||
### Deep Discovery
|
||||
|
||||
`POST /api/v1/backlink-outreach/discover/deep`
|
||||
|
||||
**Request Body:**
|
||||
|
||||
| Field | Type | Required | Description |
|
||||
|---|---|---|---|
|
||||
| `keyword` | string | Yes | Search keyword or phrase. |
|
||||
| `campaign_id` | string | No | Campaign to save results to. |
|
||||
| `max_results` | int | No | Maximum results to return (default 20). |
|
||||
| `save_to_campaign` | bool | No | Auto-save results to campaign. |
|
||||
|
||||
**Response:** `200 OK`
|
||||
|
||||
| Field | Type | Description |
|
||||
|---|---|---|
|
||||
| `results` | array | Discovered opportunities with scores. |
|
||||
| `saved_to_campaign` | int | Number of leads saved to campaign. |
|
||||
| `save_failed` | int | Number of leads that failed to save. |
|
||||
|
||||
---
|
||||
|
||||
## Email
|
||||
|
||||
### Generate Email
|
||||
|
||||
`POST /api/v1/backlink-outreach/emails/generate`
|
||||
|
||||
**Request Body:**
|
||||
|
||||
| Field | Type | Required | Description |
|
||||
|---|---|---|---|
|
||||
| `topic` | string | Yes | Email topic. |
|
||||
| `tone` | string | No | professional, friendly, casual, formal. |
|
||||
| `template_id` | string | No | Template to base generation on. |
|
||||
|
||||
**Response:** `200 OK` — `{ subject, body }`
|
||||
|
||||
### Personalize Email
|
||||
|
||||
`POST /api/v1/backlink-outreach/emails/personalize`
|
||||
|
||||
**Request Body:**
|
||||
|
||||
| Field | Type | Required | Description |
|
||||
|---|---|---|---|
|
||||
| `base_email` | string | Yes | Email content to personalize. |
|
||||
| `lead_name` | string | No | Lead's name. |
|
||||
| `lead_website` | string | No | Lead's website. |
|
||||
| `content_topic` | string | No | Topic to reference. |
|
||||
|
||||
**Response:** `200 OK` — `{ subject, body }`
|
||||
|
||||
### Subject Suggestions
|
||||
|
||||
`POST /api/v1/backlink-outreach/emails/subject-suggestions`
|
||||
|
||||
**Request Body:**
|
||||
|
||||
| Field | Type | Required | Description |
|
||||
|---|---|---|---|
|
||||
| `topic` | string | Yes | Email topic. |
|
||||
| `tone` | string | No | Tone for suggestions. |
|
||||
|
||||
**Response:** `200 OK` — `{ suggestions: string[] }`
|
||||
|
||||
### Generate Follow-up
|
||||
|
||||
`POST /api/v1/backlink-outreach/emails/follow-up`
|
||||
|
||||
**Request Body:**
|
||||
|
||||
| Field | Type | Required | Description |
|
||||
|---|---|---|---|
|
||||
| `original_subject` | string | Yes | Subject of original email. |
|
||||
| `original_body` | string | Yes | Body of original email. |
|
||||
| `tone` | string | No | Tone for follow-up. |
|
||||
|
||||
**Response:** `200 OK` — `{ subject, body }`
|
||||
|
||||
### Create Template
|
||||
|
||||
`POST /api/v1/backlink-outreach/emails/templates`
|
||||
|
||||
**Request Body:**
|
||||
|
||||
| Field | Type | Required | Description |
|
||||
|---|---|---|---|
|
||||
| `name` | string | Yes | Template name. |
|
||||
| `subject` | string | Yes | Subject line with `{placeholders}`. |
|
||||
| `body` | string | Yes | Email body with `{placeholders}`. |
|
||||
| `category` | string | No | Template category. |
|
||||
|
||||
**Response:** `201 Created` — Template object.
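
The reference does not say how `{placeholders}` are substituted when a template is used. As a rough sketch only, assuming Python `str.format`-style names, a renderer could fill known values and leave unknown placeholders visible. The class and function names are hypothetical.

```python
class _KeepMissing(dict):
    """Dictionary that leaves unknown placeholders untouched instead of raising KeyError."""

    def __missing__(self, key):
        return "{" + key + "}"


def render_template(text: str, values: dict) -> str:
    """Replace {name} placeholders in text with entries from values."""
    return text.format_map(_KeepMissing(values))


if __name__ == "__main__":
    subject = "Guest post idea for {lead_website}"
    body = "Hi {lead_name}, I enjoyed your article on {content_topic}."
    values = {"lead_name": "Sam", "lead_website": "example.com"}
    print(render_template(subject, values))
    print(render_template(body, values))  # {content_topic} stays in place for later review
```

A limitation of this approach is that literal braces in a template body would need to be doubled (`{{` and `}}`), so a production renderer may prefer a dedicated template engine.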
|
||||
|
||||
### List Templates
|
||||
|
||||
`GET /api/v1/backlink-outreach/emails/templates`
|
||||
|
||||
**Response:** `200 OK` — Array of template objects.
|
||||
|
||||
### Get Template
|
||||
|
||||
`GET /api/v1/backlink-outreach/emails/templates/{template_id}`
|
||||
|
||||
**Response:** `200 OK` — Template object.
|
||||
|
||||
### Delete Template
|
||||
|
||||
`DELETE /api/v1/backlink-outreach/emails/templates/{template_id}`
|
||||
|
||||
**Response:** `204 No Content`
|
||||
|
||||
---
|
||||
|
||||
## Outreach
|
||||
|
||||
### Send Outreach
|
||||
|
||||
`POST /api/v1/backlink-outreach/outreach/send`
|
||||
|
||||
**Request Body:**
|
||||
|
||||
| Field | Type | Required | Description |
|
||||
|---|---|---|---|
|
||||
| `campaign_id` | string | Yes | Campaign for the outreach. |
|
||||
| `lead_id` | string | Yes | Lead to send to. |
|
||||
| `subject` | string | Yes | Email subject. |
|
||||
| `body` | string | Yes | Email body. |
|
||||
| `workspace_id` | string | No | Workspace ID (default "default"). |
|
||||
|
||||
**Response:** `200 OK` — Outreach attempt object.
|
||||
|
||||
**Error responses:**
|
||||
|
||||
| Code | Meaning |
|
||||
|---|---|
|
||||
| `403` | Policy validation failed (caps, suppression, idempotency). |
|
||||
| `500` | SMTP delivery failed (generic error, no stack trace). |
|
||||
|
||||
### Validate Policy
|
||||
|
||||
`POST /api/v1/backlink-outreach/policy/validate`
|
||||
|
||||
**Request Body:**
|
||||
|
||||
| Field | Type | Required | Description |
|
||||
|---|---|---|---|
|
||||
| `recipient_email` | string | Yes | Recipient email address. |
|
||||
| `sender_email` | string | Yes | Sender email address. |
|
||||
| `subject` | string | No | Email subject for idempotency check. |
|
||||
|
||||
**Response:** `200 OK` — Policy validation result with `allowed`, `reason`, `legal_basis`, counts, and limits.
|
||||
|
||||
### List Attempts
|
||||
|
||||
`GET /api/v1/backlink-outreach/campaigns/{campaign_id}/attempts`
|
||||
|
||||
**Response:** `200 OK` — Array of outreach attempt objects.
|
||||
|
||||
### List Follow-ups
|
||||
|
||||
`GET /api/v1/backlink-outreach/campaigns/{campaign_id}/follow-ups`
|
||||
|
||||
**Response:** `200 OK` — Array of follow-up objects.
|
||||
|
||||
---
|
||||
|
||||
## Replies
|
||||
|
||||
### Poll Replies
|
||||
|
||||
`POST /api/v1/backlink-outreach/replies/poll`
|
||||
|
||||
**Request Body:**
|
||||
|
||||
| Field | Type | Required | Description |
|
||||
|---|---|---|---|
|
||||
| `campaign_id` | string | No | Campaign to filter by. |
|
||||
|
||||
**Response:** `200 OK`
|
||||
|
||||
| Field | Type | Description |
|
||||
|---|---|---|
|
||||
| `replies_found` | int | Number of new replies processed. |
|
||||
| `failed` | int | Number of replies that failed to process. |
|
||||
|
||||
### List Replies
|
||||
|
||||
`GET /api/v1/backlink-outreach/campaigns/{campaign_id}/replies`
|
||||
|
||||
**Response:** `200 OK` — Array of reply objects with classification.
|
||||
|
||||
---
|
||||
|
||||
## Suppression
|
||||
|
||||
### Add to Suppression
|
||||
|
||||
`POST /api/v1/backlink-outreach/suppression`
|
||||
|
||||
**Request Body:**
|
||||
|
||||
| Field | Type | Required | Description |
|
||||
|---|---|---|---|
|
||||
| `email` | string | Yes | Email to suppress. |
|
||||
| `reason` | string | No | Reason for suppression. |
|
||||
|
||||
**Response:** `201 Created` — Suppression record.
|
||||
|
||||
### List Suppressed
|
||||
|
||||
`GET /api/v1/backlink-outreach/suppression`
|
||||
|
||||
**Response:** `200 OK` — Array of suppression records.
|
||||
|
||||
---
|
||||
|
||||
## Analytics
|
||||
|
||||
### Campaign Analytics
|
||||
|
||||
`GET /api/v1/backlink-outreach/campaigns/{campaign_id}/analytics`
|
||||
|
||||
**Query Parameters:**
|
||||
|
||||
| Parameter | Type | Default | Description |
|
||||
|---|---|---|---|
|
||||
| `days` | int | 30 | Days to include in trends. |
|
||||
|
||||
**Response:** `200 OK` — Analytics object with leads_by_status, replies_by_classification, rates, and daily_send_volume.
|
||||
|
||||
### Reporting Snapshot
|
||||
|
||||
`GET /api/v1/backlink-outreach/reporting/snapshot`
|
||||
|
||||
**Response:** `200 OK` — Cross-campaign summary with total counts and rates.
|
||||
|
||||
### Export Leads
|
||||
|
||||
`GET /api/v1/backlink-outreach/campaigns/{campaign_id}/export/leads`
|
||||
|
||||
**Response:** `200 OK` — CSV file download.
|
||||
|
||||
### Export Attempts
|
||||
|
||||
`GET /api/v1/backlink-outreach/campaigns/{campaign_id}/export/attempts`
|
||||
|
||||
**Response:** `200 OK` — CSV file download.
|
||||
|
||||
### Export Replies
|
||||
|
||||
`GET /api/v1/backlink-outreach/campaigns/{campaign_id}/export/replies`
|
||||
|
||||
**Response:** `200 OK` — CSV file download.
|
||||
|
||||
---
|
||||
|
||||
## Common Error Responses
|
||||
|
||||
| Status | Meaning | Body |
|
||||
|---|---|---|
|
||||
| `401` | Not authenticated | `{"detail": "Not authenticated"}` |
|
||||
| `403` | Policy blocked | `{"detail": "Policy validation failed", "reason": "..."}` |
|
||||
| `404` | Not found | `{"detail": "Resource not found"}` |
|
||||
| `422` | Validation error | `{"detail": [...validation errors]}` |
|
||||
| `500` | Server error | `{"detail": "An internal error occurred"}` (generic, no stack trace) |
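
A client could centralize handling of these responses along the lines of the sketch below. The function name and the message wording are illustrative assumptions; only the status codes and body shapes come from the table.

```python
def describe_error(status: int, body: dict) -> str:
    """Map a documented error response to a short client-side message."""
    detail = body.get("detail")
    if status == 401:
        return "Not authenticated: sign in and retry."
    if status == 403:
        # Policy failures carry a separate "reason" field next to "detail".
        return f"Blocked by policy: {body.get('reason', 'no reason given')}"
    if status == 404:
        return "The requested resource was not found."
    if status == 422:
        # Validation errors arrive as a list under "detail".
        count = len(detail) if isinstance(detail, list) else 1
        return f"Request validation failed ({count} error(s))."
    if status >= 500:
        return "Server error: try again later."
    return f"Unexpected status {status}: {detail}"


if __name__ == "__main__":
    print(describe_error(403, {"detail": "Policy validation failed", "reason": "daily cap reached"}))
    print(describe_error(422, {"detail": [{"loc": ["body", "name"], "msg": "field required"}]}))
```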
|
||||
@@ -1,108 +0,0 @@
|
||||
# Campaign Management
|
||||
|
||||
Campaigns are the top-level organizational unit for backlink outreach. Every lead, email, attempt, reply, and analytics data point belongs to a campaign.
|
||||
|
||||
## Creating a Campaign
|
||||
|
||||
A campaign requires only a name. Add a description and keywords to make discovery and reporting easier.
|
||||
|
||||
**API:** `POST /api/v1/backlink-outreach/campaigns`
|
||||
|
||||
```json
|
||||
{
|
||||
"name": "SaaS Growth Blogs Q3",
|
||||
"description": "Outreach to SaaS marketing blogs for guest post placements",
|
||||
"keywords": ["SaaS", "growth marketing", "B2B"]
|
||||
}
|
||||
```
|
||||
|
||||
**UI:** Navigate to **Backlink Outreach → Campaigns → + New Campaign**.
|
||||
|
||||
!!! tip "Naming conventions"
|
||||
Use a consistent naming scheme like `[Vertical] [Content Type] [Period]` — e.g., "Fitness Guest Posts June" or "AI Startups Roundup Q3".
|
||||
|
||||
## Campaign List View
|
||||
|
||||
The campaign list shows:
|
||||
- **Name** and description
|
||||
- **Lead count** broken down by status
|
||||
- **Creation date**
|
||||
- **Quick actions**: Add leads, view analytics, manage templates
|
||||
|
||||
## Campaign Detail View
|
||||
|
||||
Click a campaign to see its full detail:
|
||||
- **Leads tab**: All leads with status, quality score, and actions.
|
||||
- **Email tab**: Compose and preview outreach emails.
|
||||
- **Outreach tab**: Send emails, view attempts, manage follow-ups.
|
||||
- **Inbox tab**: Replies with auto-classification tags.
|
||||
- **Analytics tab**: Campaign-specific charts and metrics.
|
||||
|
||||
## Managing Leads
|
||||
|
||||
### Adding Leads
|
||||
|
||||
**Single lead:**
|
||||
`POST /api/v1/backlink-outreach/campaigns/{campaign_id}/leads`
|
||||
|
||||
```json
|
||||
{
|
||||
"website_url": "https://example.com",
|
||||
"website_title": "Example Marketing Blog",
|
||||
"contact_email": "editor@example.com",
|
||||
"quality_score": 0.85,
|
||||
"relevance_score": 0.72,
|
||||
"guest_post_likelihood": 0.65,
|
||||
"source": "manual"
|
||||
}
|
||||
```
|
||||
|
||||
**Bulk add:**
|
||||
`POST /api/v1/backlink-outreach/campaigns/{campaign_id}/leads/bulk`
|
||||
|
||||
Send an array of lead objects to add multiple leads at once.
|
||||
|
||||
### Updating Lead Status
|
||||
|
||||
Lead status lifecycle:
|
||||
|
||||
```mermaid
|
||||
stateDiagram-v2
|
||||
[*] --> discovered
|
||||
discovered --> contacted: Send outreach email
|
||||
contacted --> replied: Lead replies (interested)
|
||||
contacted --> bounced: Email bounced / not interested
|
||||
replied --> placed: Backlink published
|
||||
replied --> lost: Lead declined after reply
|
||||
placed --> [*]
|
||||
lost --> [*]
|
||||
bounced --> [*]
|
||||
```
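
The lifecycle in the diagram can be written down as a transition table. This is a sketch for reasoning about valid moves; the source does not state whether the API enforces these transitions, and the names below are illustrative.

```python
# Allowed next statuses for each lead status, copied from the state diagram.
ALLOWED_TRANSITIONS = {
    "discovered": {"contacted"},
    "contacted": {"replied", "bounced"},
    "replied": {"placed", "lost"},
    "placed": set(),   # terminal
    "lost": set(),     # terminal
    "bounced": set(),  # terminal
}


def can_transition(current: str, new: str) -> bool:
    """Return True if the diagram contains an edge from current to new."""
    return new in ALLOWED_TRANSITIONS.get(current, set())


if __name__ == "__main__":
    print(can_transition("discovered", "contacted"))
    print(can_transition("discovered", "placed"))
```

Keeping the table as data makes it easy to validate a bulk update client-side before sending it.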
|
||||
|
||||
**Single update:** Click the status button on a lead card.
|
||||
|
||||
**Bulk update:** Select multiple leads → choose new status → confirm.
|
||||
|
||||
!!! warning "Bulk status updates"
|
||||
Bulk updates may partially fail. If some leads can't be updated, the response includes a `failed` list and the UI shows a warning toast with the count of failures.
|
||||
|
||||
## Deleting a Campaign
|
||||
|
||||
`DELETE /api/v1/backlink-outreach/campaigns/{campaign_id}`
|
||||
|
||||
!!! warning "Irreversible"
|
||||
Deleting a campaign removes all associated leads, attempts, replies, and analytics data. This action cannot be undone.
|
||||
|
||||
## Campaign Organization Best Practices
|
||||
|
||||
| Practice | Why |
|
||||
|---|---|
|
||||
| One campaign per vertical | Keeps leads relevant and analytics clean. |
|
||||
| Add keywords at creation | Powers better discovery queries later. |
|
||||
| Review leads before sending | Avoid wasting daily caps on low-quality leads. |
|
||||
| Archive completed campaigns | Keeps the campaign list manageable. |
|
||||
| Use consistent naming | Easier to find and compare campaigns later. |
|
||||
|
||||
---
|
||||
|
||||
*Next: [Discovery](discovery.md) — finding opportunities with AI-powered search.*
|
||||
@@ -1,122 +0,0 @@
|
||||
# Configuration
|
||||
|
||||
Environment variables and deployment configuration for the Backlink Outreach feature.
|
||||
|
||||
## SMTP Configuration
|
||||
|
||||
Required for sending outreach emails.
|
||||
|
||||
| Variable | Required | Default | Description |
|
||||
|---|---|---|---|
|
||||
| `SMTP_HOST` | Yes | — | SMTP server hostname. |
|
||||
| `SMTP_PORT` | No | `587` | SMTP server port. Use 587 for STARTTLS, 465 for implicit TLS. |
|
||||
| `SMTP_USER` | Yes | — | SMTP authentication username. |
|
||||
| `SMTP_PASS` | Yes | — | SMTP authentication password. |
|
||||
| `SMTP_FROM_EMAIL` | Yes | — | Default "From" email address for outreach. |
|
||||
| `SMTP_FROM_NAME` | No | — | Display name for the From address. |
|
||||
| `SMTP_VERIFY_TLS` | No | `true` | Verify TLS certificate on SMTP connection. Set to `false` only for local dev. |
|
||||
| `SMTP_SEND_TIMEOUT` | No | `30` | Timeout in seconds for each SMTP send operation. |
|
||||
|
||||
!!! warning "SMTP_VERIFY_TLS"
|
||||
Never set `SMTP_VERIFY_TLS=false` in production. Disabling TLS verification exposes you to man-in-the-middle attacks. Only use `false` for local development with self-signed certificates.
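
The variables in the table could be loaded as sketched below. The `SmtpSettings` class, the helper names, and the rule that only the literal string `false` disables verification are assumptions made for illustration; the variable names and defaults come from the table.

```python
import os
import ssl
from dataclasses import dataclass


@dataclass(frozen=True)
class SmtpSettings:
    host: str
    port: int
    user: str
    password: str
    from_email: str
    from_name: str
    verify_tls: bool
    send_timeout: float


def load_smtp_settings(env=os.environ) -> SmtpSettings:
    """Read SMTP settings; a missing required variable raises KeyError."""
    return SmtpSettings(
        host=env["SMTP_HOST"],
        port=int(env.get("SMTP_PORT", "587")),
        user=env["SMTP_USER"],
        password=env["SMTP_PASS"],
        from_email=env["SMTP_FROM_EMAIL"],
        from_name=env.get("SMTP_FROM_NAME", ""),
        # Assumption: anything other than "false" keeps verification on.
        verify_tls=env.get("SMTP_VERIFY_TLS", "true").strip().lower() != "false",
        send_timeout=float(env.get("SMTP_SEND_TIMEOUT", "30")),
    )


def tls_context(settings: SmtpSettings) -> ssl.SSLContext:
    """Return a TLS context; verification is disabled only when explicitly requested."""
    context = ssl.create_default_context()
    if not settings.verify_tls:
        context.check_hostname = False  # must be cleared before setting CERT_NONE
        context.verify_mode = ssl.CERT_NONE
    return context


if __name__ == "__main__":
    demo_env = {
        "SMTP_HOST": "smtp.example.com",
        "SMTP_USER": "outreach",
        "SMTP_PASS": "app-password",
        "SMTP_FROM_EMAIL": "outreach@example.com",
    }
    settings = load_smtp_settings(demo_env)
    print(settings.port, settings.verify_tls, settings.send_timeout)
```

Defaulting to verification unless the value is exactly `false` means a typo in the variable fails safe.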
|
||||
|
||||
## IMAP Configuration
|
||||
|
||||
Required for reply monitoring.
|
||||
|
||||
| Variable | Required | Default | Description |
|
||||
|---|---|---|---|
|
||||
| `IMAP_HOST` | Yes | — | IMAP server hostname. |
|
||||
| `IMAP_PORT` | No | `993` | IMAP server port. 993 for SSL, 143 for STARTTLS. |
|
||||
| `IMAP_USER` | Yes | — | IMAP authentication username. |
|
||||
| `IMAP_PASS` | Yes | — | IMAP authentication password. |
|
||||
| `IMAP_FETCH_LIMIT` | No | `50` | Maximum messages to process per poll cycle. |
|
||||
|
||||
## Search API Configuration
|
||||
|
||||
Required for AI-powered opportunity discovery.
|
||||
|
||||
| Variable | Required | Default | Description |
|
||||
|---|---|---|---|
|
||||
| `EXA_API_KEY` | No | — | Exa neural search API key. Discovery falls back to DuckDuckGo if not set. |
|
||||
|
||||
## AI Configuration
|
||||
|
||||
Required for email generation and personalization.
|
||||
|
||||
| Variable | Required | Default | Description |
|
||||
|---|---|---|---|
|
||||
| `OPENAI_API_KEY` | Yes | — | OpenAI API key for email generation, personalization, and subject suggestions. |
|
||||
|
||||
## Policy Configuration
|
||||
|
||||
These are currently hardcoded but can be made configurable:
|
||||
|
||||
| Setting | Current Value | Description |
|
||||
|---|---|---|
|
||||
| Daily user cap | 100 | Max emails per user per day. |
|
||||
| Daily domain cap | 20 | Max emails per target domain per day. |
|
||||
| Idempotency window | 24 hours | Duplicate send prevention window. |
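
To make the three limits concrete, here is a minimal in-memory sketch of how such a check might be ordered. The function name, the reason strings, the order of checks, and the way a duplicate is identified are all assumptions; only the numeric values come from the table.

```python
from datetime import datetime, timedelta

DAILY_USER_CAP = 100                       # max emails per user per day
DAILY_DOMAIN_CAP = 20                      # max emails per target domain per day
IDEMPOTENCY_WINDOW = timedelta(hours=24)   # duplicate send prevention window


def check_policy(sent_today_by_user, sent_today_to_domain, last_identical_send, now):
    """Return (allowed, reason) for a prospective send.

    last_identical_send is the time of the most recent matching send, or None.
    """
    if sent_today_by_user >= DAILY_USER_CAP:
        return False, "daily user cap reached"
    if sent_today_to_domain >= DAILY_DOMAIN_CAP:
        return False, "daily domain cap reached"
    if last_identical_send is not None and now - last_identical_send < IDEMPOTENCY_WINDOW:
        return False, "duplicate send within idempotency window"
    return True, "ok"


if __name__ == "__main__":
    now = datetime(2025, 1, 15, 12, 0)
    print(check_policy(42, 3, None, now))
    print(check_policy(42, 3, now - timedelta(hours=2), now))
```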
|
||||
|
||||
## Database Configuration
|
||||
|
||||
The Backlink Outreach feature uses SQLite with automatic table creation:
|
||||
|
||||
| Variable | Required | Default | Description |
|
||||
|---|---|---|---|
|
||||
| `DATABASE_URL` | No | `sqlite+aiosqlite:///./backlink_outreach.db` | Database connection string. |
|
||||
|
||||
Tables are created automatically on first use via `_ensure_tables()`. No manual migration is required.
|
||||
|
||||
## Deployment Checklist
|
||||
|
||||
### Minimal Setup
|
||||
|
||||
1. Set all **SMTP** environment variables.
|
||||
2. Set all **IMAP** environment variables.
|
||||
3. Set `OPENAI_API_KEY`.
|
||||
4. Optionally set `EXA_API_KEY` for Exa-powered discovery.
|
||||
5. Start the backend server.
|
||||
6. Verify health: `GET /api/v1/backlink-outreach/campaigns` (returns empty list if auth works).
|
||||
|
||||
### Production Setup

1. All minimal setup steps.
2. Ensure `SMTP_VERIFY_TLS=true` (default).
3. Set `SMTP_SEND_TIMEOUT` to 30+ seconds for reliable delivery.
4. Set `IMAP_FETCH_LIMIT` based on mailbox volume (50-200).
5. Set up a scheduled job to poll replies every 5-15 minutes.
6. Configure monitoring for SMTP/IMAP connection failures.
7. Review the suppression list periodically.

### Email Provider Setup

The system works with any SMTP/IMAP provider:

| Provider | SMTP Host | SMTP Port | IMAP Host | IMAP Port |
|---|---|---|---|---|
| Gmail | smtp.gmail.com | 587 | imap.gmail.com | 993 |
| Outlook | smtp.office365.com | 587 | outlook.office365.com | 993 |
| SendGrid | smtp.sendgrid.net | 587 | — (use webhooks) | — |
| Mailgun | smtp.mailgun.org | 587 | — (use webhooks) | — |
| Amazon SES | email-smtp.*.amazonaws.com | 587 | — (use SNS) | — |

!!! note "Transactional email providers"
    SendGrid, Mailgun, and Amazon SES don't support IMAP. For reply monitoring with these providers, you'll need to set up inbound webhooks or use a separate IMAP-capable mailbox.

## Security Considerations

| Area | Recommendation |
|---|---|
| **SMTP credentials** | Store in environment variables, never in code or config files. |
| **IMAP credentials** | Use app-specific passwords (Gmail) or dedicated mailbox accounts. |
| **TLS verification** | Always enabled in production (`SMTP_VERIFY_TLS=true`). |
| **Error responses** | 500 errors return generic messages — no stack traces leaked. |
| **Auth** | All endpoints require Clerk authentication. User identity derived from session, not request body. |
| **SQL injection** | Column names are whitelisted and quoted in dynamic SQL. |
| **IMAP injection** | Search terms are sanitized before IMAP SEARCH commands. |
| **CSV injection** | All CSV exports sanitize formula injection characters. |

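The CSV injection row refers to neutralizing cells that a spreadsheet would otherwise evaluate as a formula. Below is a minimal sketch of one common approach, prefixing risky cells with a single quote. It assumes the trigger characters `=`, `+`, `-`, `@`, tab, and carriage return; `sanitize_cell` and `rows_to_csv` are hypothetical names, and the project's actual sanitizer may work differently.

```python
import csv
import io

# Leading characters that spreadsheet applications may treat as a formula start.
FORMULA_PREFIXES = ("=", "+", "-", "@", "\t", "\r")


def sanitize_cell(value):
    """Return the cell as text, quoted with a leading apostrophe if it looks like a formula."""
    text = "" if value is None else str(value)
    if text.startswith(FORMULA_PREFIXES):
        return "'" + text
    return text


def rows_to_csv(header, rows):
    """Serialize rows to CSV text, sanitizing every data cell."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        writer.writerow([sanitize_cell(cell) for cell in row])
    return buffer.getvalue()
```

Note that this simple rule also prefixes legitimate values that start with `-`, such as negative numbers, which is usually an acceptable trade-off for exported lead data.
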
---

*Next: [Implementation Overview](implementation-overview.md) — architecture and internals.*
@@ -1,132 +0,0 @@
# Discovery

The discovery system finds websites that accept guest posts in your niche using AI-powered search across multiple engines.

## How It Works

```mermaid
flowchart TD
    A[Enter Keyword] --> B[Generate Query Patterns]
    B --> C1[Exa Neural Search]
    B --> C2[DuckDuckGo Search]
    C1 --> D[Merge & Deduplicate Results]
    C2 --> D
    D --> E[Scrape Full Pages]
    E --> F[Extract Contact Emails]
    F --> G[Score Quality & Relevance]
    G --> H[Return Ranked Results]
    H --> I[Save to Campaign]

    style A fill:#e3f2fd
    style G fill:#e8f5e8
    style I fill:#fff3e0
```

## Search Engines

### Exa Neural Search

Exa uses semantic understanding to find pages that *mean* what you're looking for, not just pages that contain the keywords.

- **Strength**: High-relevance results, understands context.
- **Limitation**: Requires `EXA_API_KEY` environment variable.
- **Best for**: Niche-specific discovery, finding high-quality sites.

### DuckDuckGo Search

DuckDuckGo provides broad coverage with traditional keyword matching.

- **Strength**: No API key required, broad coverage.
- **Limitation**: Less semantic understanding.
- **Best for**: Broad discovery, supplementing Exa results.

## Query Patterns

The system automatically generates multiple search queries from your keyword:

| Pattern | Example (keyword: "AI marketing") |
|---|---|
| `{keyword} write for us` | "AI marketing write for us" |
| `{keyword} guest post` | "AI marketing guest post" |
| `{keyword} contribute` | "AI marketing contribute" |
| `{keyword} submit article` | "AI marketing submit article" |
| `{keyword} become a contributor` | "AI marketing become a contributor" |
| `{keyword} guest contributor guidelines` | "AI marketing guest contributor guidelines" |

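The expansion in the table can be sketched in a few lines. The patterns are copied from the table; `build_queries` is a hypothetical helper, and whitespace normalization is an assumption about how the keyword is cleaned.

```python
# Patterns copied from the table above.
QUERY_PATTERNS = (
    "{keyword} write for us",
    "{keyword} guest post",
    "{keyword} contribute",
    "{keyword} submit article",
    "{keyword} become a contributor",
    "{keyword} guest contributor guidelines",
)


def build_queries(keyword):
    """Expand one keyword into the search queries sent to each engine."""
    cleaned = " ".join(keyword.split())  # collapse repeated whitespace
    if not cleaned:
        raise ValueError("keyword must not be empty")
    return [pattern.format(keyword=cleaned) for pattern in QUERY_PATTERNS]
```

For the keyword `"AI marketing"` this yields the six example queries shown in the right-hand column.
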
## Deep Discovery

Deep discovery goes beyond search results by:

1. **Scraping full pages** — not just snippets, but the complete HTML.
2. **Extracting contact emails** — parses `mailto:` links, contact pages, and author bios.
3. **Detecting guest post guidelines** — identifies pages with "write for us" or submission instructions.
4. **Scoring quality** — assigns a 0-1 quality score based on relevance, authority signals, and content quality.
5. **Scoring confidence** — assigns a 0-1 confidence score for guest-post likelihood.

**API:** `POST /api/v1/backlink-outreach/discover/deep`

```json
{
  "keyword": "AI marketing",
  "campaign_id": "uuid-of-campaign",
  "max_results": 20,
  "save_to_campaign": true
}
```

!!! note "Automatic saving"
    When `save_to_campaign` is `true`, discovered leads are automatically saved to the specified campaign. The response includes `saved_to_campaign` and `save_failed` counts.

## Result Scoring

Each result is scored on two dimensions:

### Quality Score (0-1)

How relevant and authoritative is the site for your keyword?

| Factor | Weight |
|---|---|
| Keyword relevance in title/URL | High |
| Domain authority signals | Medium |
| Content freshness | Low |
| Site structure (blog section) | Medium |

### Confidence Score (0-1)

How likely is the site to accept guest posts?

| Factor | Weight |
|---|---|
| "Write for us" page found | Very High |
| Guest post guidelines detected | High |
| Contact email found | High |
| Previous guest posts on site | Medium |
| Blog section exists | Low |

## Reviewing Results

After discovery, review each result:

| Badge | Meaning |
|---|---|
| **Email found** | A contact email was extracted from the page. |
| **Has guidelines** | A guest post guidelines page was detected. |
| **High quality** | Quality score > 0.7. |
| **High confidence** | Confidence score > 0.7. |

!!! tip "Prioritize leads"
    Focus on leads with both "Email found" and "Has guidelines" badges — these have the highest conversion potential.

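The badge rules above reduce to a few threshold checks. The sketch below illustrates them under assumed field names: `DiscoveryResult`, `badges`, and `is_priority` are hypothetical and are not the API's response model.

```python
from dataclasses import dataclass
from typing import Optional

HIGH_SCORE_THRESHOLD = 0.7  # from the badge table: strictly greater than 0.7


@dataclass
class DiscoveryResult:
    # Hypothetical shape of one discovery result.
    website_url: str
    quality_score: float
    confidence_score: float
    contact_email: Optional[str] = None
    has_guidelines: bool = False


def badges(result):
    """Return the badge labels that apply to a result, in table order."""
    labels = []
    if result.contact_email:
        labels.append("Email found")
    if result.has_guidelines:
        labels.append("Has guidelines")
    if result.quality_score > HIGH_SCORE_THRESHOLD:
        labels.append("High quality")
    if result.confidence_score > HIGH_SCORE_THRESHOLD:
        labels.append("High confidence")
    return labels


def is_priority(result):
    """Leads with both an email and guidelines are the ones to contact first."""
    return bool(result.contact_email) and result.has_guidelines
```
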
## Saving to Campaign

Results can be saved to a campaign in two ways:

1. **Automatic**: Set `save_to_campaign: true` in the deep discovery request.
2. **Manual**: Select results in the UI and click **Save to Campaign**.

Duplicate leads (same `website_url` in the same campaign) are automatically skipped.

---

*Next: [Email Composer](email-composer.md) — AI-powered email generation and personalization.*
@@ -1,167 +0,0 @@
# Email Composer

The AI email composer generates personalized outreach emails, subject lines, and follow-ups using large language models.

## AI Generation Modes

### Generate

Create a complete email (subject + body) from a topic and tone.

**API:** `POST /api/v1/backlink-outreach/emails/generate`

```json
{
  "topic": "Guest post about AI marketing trends",
  "tone": "professional",
  "template_id": "optional-template-uuid"
}
```

**Available tones:**

| Tone | Style |
|---|---|
| `professional` | Formal, business-appropriate language. |
| `friendly` | Warm, approachable, conversational. |
| `casual` | Relaxed, informal, peer-to-peer. |
| `formal` | Highly structured, traditional business correspondence. |

### Personalize

Tailor an email to a specific lead using their name, website, and content.

**API:** `POST /api/v1/backlink-outreach/emails/personalize`

```json
{
  "base_email": "I'd love to contribute a guest post...",
  "lead_name": "Jane",
  "lead_website": "techblog.example.com",
  "content_topic": "AI Marketing Trends 2025"
}
```

### Subject Line Suggestions

Get 5-10 AI-generated subject line variants for A/B testing.

**API:** `POST /api/v1/backlink-outreach/emails/subject-suggestions`

```json
{
  "topic": "Guest post about AI marketing trends",
  "tone": "professional"
}
```

### Follow-up Draft

Generate a polite follow-up email referencing the original outreach.

**API:** `POST /api/v1/backlink-outreach/emails/follow-up`

```json
{
  "original_subject": "Guest Post: AI Marketing Trends",
  "original_body": "I'd love to contribute...",
  "tone": "friendly"
}
```

## Template System

Templates let you save and reuse winning email structures with variable placeholders.

### Creating a Template

**API:** `POST /api/v1/backlink-outreach/emails/templates`

```json
{
  "name": "Standard Guest Post Pitch",
  "subject": "Guest Post: {topic}",
  "body": "Hi {name},\n\nI've been following {website} and really enjoyed your recent posts...",
  "category": "guest-post"
}
```

### Supported Placeholders

| Placeholder | Replaced With |
|---|---|
| `{name}` | Lead's contact name. |
| `{website}` | Lead's website URL. |
| `{topic}` | Your content topic. |
| `{your_name}` | Your name (from sender config). |
| `{your_site}` | Your website URL (from sender config). |

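Placeholder substitution can be illustrated with a small regular-expression pass. This is a sketch under assumptions, not the backend's renderer: `render_template` is a hypothetical name, and leaving unmatched placeholders untouched is a design choice made here so that missing data stays visible in the preview.

```python
import re

# Only the five documented placeholders are recognized.
PLACEHOLDER = re.compile(r"\{(name|website|topic|your_name|your_site)\}")


def render_template(text, values):
    """Replace documented placeholders with values; leave everything else as-is."""
    def substitute(match):
        key = match.group(1)
        # Keep the placeholder text when no value is supplied for it.
        return str(values.get(key, match.group(0)))

    return PLACEHOLDER.sub(substitute, text)
```

A regex is used instead of `str.format` so that stray braces elsewhere in the email body do not raise an error.
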
!!! tip "Template best practices"
    - Use `{name}` for personalization — emails with names get 26% higher open rates.
    - Keep subject lines under 50 characters.
    - Include a clear call-to-action in every template.
    - Test multiple templates and track which gets the best response rate.

### Managing Templates

| Action | Endpoint |
|---|---|
| List templates | `GET /api/v1/backlink-outreach/emails/templates` |
| Get template | `GET /api/v1/backlink-outreach/emails/templates/{template_id}` |
| Delete template | `DELETE /api/v1/backlink-outreach/emails/templates/{template_id}` |

## Email Composer UI

The composer provides:

- **Topic input**: Describe what you want to write about.
- **Tone selector**: Choose the writing style.
- **Template picker**: Start from a saved template.
- **Generate button**: Create AI email from inputs.
- **Personalize button**: Tailor the current email to a specific lead.
- **Subject Suggest button**: Get subject line variants.
- **Live preview**: See the rendered email as you edit.

```mermaid
flowchart LR
    A[Choose Template] --> B[Enter Topic + Tone]
    B --> C[Generate with AI]
    C --> D{Satisfied?}
    D -->|Yes| E[Send Outreach]
    D -->|No| F[Personalize / Edit]
    F --> D
    C --> G[Suggest Subjects]
    G --> H[Pick Best Subject]
    H --> E

    style C fill:#e8f5e8
    style E fill:#fff3e0
```

## Writing Effective Outreach Emails

### Subject Lines

- Be specific: "Guest Post: 5 AI Marketing Trends for 2025" > "Collaboration?"
- Keep it short: Under 50 characters for best open rates.
- Avoid spam triggers: ALL CAPS, excessive punctuation, "free", "guaranteed".

### Email Body

- **First line**: Reference their content specifically (proves you read their site).
- **Value proposition**: What's in it for them (free quality content, fresh perspective).
- **Credentials**: Brief mention of your expertise or published work.
- **Call-to-action**: One clear next step (reply with interest, check your draft).
- **Signature**: Professional sign-off with links to your published work.

### Follow-ups

- Wait 3-5 business days before following up.
- Reference the original email date and subject.
- Add new value (a specific article idea, a data point).
- Keep it shorter than the original.
- Maximum 2 follow-ups per lead.

---

*Next: [Outreach Operations](outreach-operations.md) — sending, policy validation, and suppression.*
@@ -1,317 +0,0 @@
# Implementation Overview

Architecture, database schema, service layer, and authentication flow for the Backlink Outreach feature.

## Architecture

```mermaid
flowchart TB
    subgraph Frontend
        UI[Dashboard Component]
        Store[Zustand Store]
        API[API Client]
    end
    subgraph Backend
        Router[FastAPI Router]
        Service[Outreach Service]
        Storage[Storage Layer]
        Sender[SMTP Sender]
        Monitor[IMAP Monitor]
    end
    subgraph External
        SMTP[SMTP Server]
        IMAP[IMAP Server]
        EXA[Exa API]
        DDG[DuckDuckGo]
        LLM[OpenAI API]
        Clerk[Clerk Auth]
    end

    UI --> Store
    Store --> API
    API --> Router
    Router --> Service
    Router --> Storage
    Service --> Storage
    Service --> Sender
    Service --> Monitor
    Sender --> SMTP
    Monitor --> IMAP
    Service --> EXA
    Service --> DDG
    Service --> LLM
    Router --> Clerk

    style Frontend fill:#e3f2fd
    style Backend fill:#e8f5e8
    style External fill:#fff3e0
```

## File Structure

```
backend/
├── routers/
│   └── backlink_outreach.py                  # 18+ API endpoints
├── services/
│   ├── backlink_outreach_service.py          # Business logic, policy, analytics
│   ├── backlink_outreach_storage.py          # SQLite CRUD operations
│   ├── backlink_outreach_sender.py           # SMTP email delivery
│   ├── backlink_outreach_reply_monitor.py    # IMAP reply polling
│   └── backlink_outreach_models.py           # Pydantic request/response models
├── models/
│   └── backlink_outreach_models.py           # SQLAlchemy models + indexes

frontend/src/
├── components/
│   └── BacklinkOutreach/
│       └── BacklinkOutreachDashboard.tsx     # Main UI component
├── stores/
│   └── backlinkOutreachStore.ts              # Zustand state management
└── api/
    └── backlinkOutreachApi.ts                # API client functions
```

## Database Schema

```mermaid
erDiagram
    BacklinkCampaign {
        string id PK
        string user_id
        string name
        string description
        string keywords
        datetime created_at
        datetime updated_at
    }
    BacklinkLead {
        string id PK
        string campaign_id FK
        string website_url
        string website_title
        string contact_email
        float quality_score
        float relevance_score
        float guest_post_likelihood
        string status
        string source
        datetime created_at
    }
    OutreachAttempt {
        string id PK
        string campaign_id FK
        string lead_id FK
        string user_id
        string sender_email
        string recipient_email
        string subject
        string body
        string status
        string legal_basis
        datetime sent_at
    }
    OutreachReply {
        string id PK
        string campaign_id FK
        string attempt_id FK
        string from_email
        string subject
        string body
        string classification
        datetime received_at
    }
    SuppressionEntry {
        string id PK
        string user_id
        string email
        string reason
        datetime created_at
    }
    AuditLog {
        string id PK
        string user_id
        string lead_email
        string sender_email
        string subject
        string policy_result
        string reason
        string legal_basis
        datetime timestamp
    }
    SendCounterUser {
        string id PK
        string user_id
        date date
        int count
    }
    SendCounterDomain {
        string id PK
        string domain
        date date
        int count
    }
    IdempotencyKey {
        string id PK
        string key
        datetime created_at
    }
    EmailTemplate {
        string id PK
        string user_id
        string name
        string subject
        string body
        string category
        datetime created_at
    }
    FollowUp {
        string id PK
        string attempt_id FK
        string campaign_id FK
        string subject
        string body
        string status
        datetime scheduled_at
        datetime sent_at
    }

    BacklinkCampaign ||--o{ BacklinkLead : contains
    BacklinkCampaign ||--o{ OutreachAttempt : tracks
    BacklinkCampaign ||--o{ OutreachReply : receives
    BacklinkCampaign ||--o{ EmailTemplate : owns
    OutreachAttempt ||--o{ OutreachReply : generates
    OutreachAttempt ||--o{ FollowUp : schedules
```

### Unique Indexes

| Table | Unique Constraint | Purpose |
|---|---|---|
| `SendCounterUser` | `(user_id, date)` | Atomic daily cap per user. |
| `SendCounterDomain` | `(domain, date)` | Atomic daily cap per domain. |

These enable `INSERT ... ON CONFLICT DO UPDATE` for atomic counter increments.

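The upsert pattern can be demonstrated with the standard-library `sqlite3` module. This is a sketch under stated assumptions: the table and column names (`send_counter_user`, `send_date`, `send_count`) are illustrative rather than the real schema, the real storage layer is async, and the upsert syntax needs SQLite 3.24 or newer.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical table; the UNIQUE constraint is what makes the upsert possible.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS send_counter_user (
            user_id    TEXT    NOT NULL,
            send_date  TEXT    NOT NULL,
            send_count INTEGER NOT NULL,
            UNIQUE (user_id, send_date)
        )
        """
    )


def increment_user_counter(conn, user_id, send_date):
    """Insert the day's first row or bump the existing one, then return the new count."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            """
            INSERT INTO send_counter_user (user_id, send_date, send_count)
            VALUES (?, ?, 1)
            ON CONFLICT (user_id, send_date)
            DO UPDATE SET send_count = send_count + 1
            """,
            (user_id, send_date),
        )
        row = conn.execute(
            "SELECT send_count FROM send_counter_user WHERE user_id = ? AND send_date = ?",
            (user_id, send_date),
        ).fetchone()
    return row[0]
```

Because the increment happens in a single statement, two concurrent sends cannot both read the old value and write the same new one, which is the failure mode of a separate read-then-update.
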
## Service Layer

### Outreach Service (`backlink_outreach_service.py`)

Core business logic:

- `_infer_region(domain)` — Maps 25+ EU TLDs + UK/CA/AU to region codes.
- `_determine_legal_basis(recipient_email)` — EU/UK/CA/AU → `consent`, others → `legitimate_interest`.
- `validate_policy(...)` — Runs all policy checks, returns approval/block with reasons.
- `send_outreach_email(...)` — Orchestrates policy → attempt → SMTP → counters → idempotency.
- `deep_discover(...)` — Exa + DuckDuckGo search, page scraping, email extraction, scoring.
- `generate_email(...)` — LLM-based email generation with topic + tone.
- `personalize_email(...)` — LLM-based personalization for a specific lead.
- `get_campaign_analytics(...)` — Aggregates campaign metrics.
- `get_reporting_snapshot(...)` — Cross-campaign summary.
- `export_leads_csv(...)` / `export_attempts_csv(...)` / `export_replies_csv(...)` — CSV generation with formula injection sanitization.

### Storage Layer (`backlink_outreach_storage.py`)

SQLite CRUD operations with 20+ methods:

- Campaign CRUD: `create_campaign`, `list_backlink_campaigns`, `get_campaign`, `delete_campaign`.
- Lead management: `add_campaign_lead`, `add_campaign_leads_bulk`, `update_lead_status`, `bulk_update_lead_status`.
- Outreach: `create_outreach_attempt`, `list_outreach_attempts`, `get_lead_attempts`.
- Replies: `store_reply`, `find_attempt_by_from_email`, `reply_exists`, `list_replies`, `count_replies`.
- Follow-ups: `create_follow_up`, `list_follow_ups`.
- Suppression: `add_suppression`, `list_suppression`, `is_suppressed`.
- Counters: `increment_user_counter`, `increment_domain_counter` (atomic ON CONFLICT).
- Idempotency: `check_idempotency`, `mark_idempotency`.
- Audit: `log_audit_entry`.
- Templates: `create_email_template`, `list_email_templates`, `get_email_template`, `delete_email_template`.

All methods call `_ensure_tables()` on first use to auto-create the SQLite schema.

### SMTP Sender (`backlink_outreach_sender.py`)

Handles email delivery:

1. Creates SSL context with `ssl.create_default_context()`.
2. Connects to SMTP host.
3. Sends `EHLO` greeting.
4. Upgrades with `STARTTLS`.
5. Sends `EHLO` again (RFC 3207 requirement).
6. Authenticates with credentials.
7. Sends email with configurable timeout (`SMTP_SEND_TIMEOUT`).
8. Cleanly closes the connection.

### Reply Monitor (`backlink_outreach_reply_monitor.py`)

Handles IMAP reply processing:

1. Connects to IMAP over SSL.
2. Sanitizes search terms (prevents IMAP injection).
3. Searches for messages matching the outreach sender.
4. Fetches up to `IMAP_FETCH_LIMIT` messages.
5. Checks for duplicates via `reply_exists()`.
6. Matches replies to attempts via `find_attempt_by_from_email()`.
7. Classifies replies based on content analysis.
8. Stores reply records.

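Step 2 of the reply monitor, sanitizing search terms, can be sketched as follows. The idea is to strip control characters (so a term cannot smuggle in a line break and a second command) and to emit the term as an escaped IMAP quoted string. `quote_imap_search_term` and `build_from_criteria` are hypothetical helpers; the monitor's actual sanitizer is not shown in this documentation and may differ.

```python
import re

# CR, LF, NUL and other control characters must never reach the command line.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def quote_imap_search_term(term):
    """Return term as an IMAP quoted string that is safe to embed in SEARCH criteria."""
    cleaned = _CONTROL_CHARS.sub("", term)
    # Backslash and double quote are the two characters that need escaping.
    escaped = cleaned.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_from_criteria(address):
    """Build a SEARCH criterion matching messages from the given address."""
    return f"(FROM {quote_imap_search_term(address)})"
```

With the standard-library client, the result could be passed as `imap.search(None, build_from_criteria(address))` on an `imaplib.IMAP4_SSL` connection. Non-ASCII terms would need additional charset handling that this sketch does not cover.
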
## Authentication Flow

```mermaid
sequenceDiagram
    participant Client as Frontend
    participant Router as API Router
    participant Clerk as Clerk Auth
    participant Service as Service Layer

    Client->>Router: Request with Bearer token
    Router->>Clerk: Verify session token
    Clerk-->>Router: user_id
    Router->>Service: Execute with user_id
    Service-->>Router: Result (scoped to user_id)
    Router-->>Client: Response
```

Key principles:

- **All 18+ endpoints** require `Depends(get_current_user)`.
- **User identity** is derived from the Clerk session, never from the request body.
- **Workspace isolation**: Data is scoped by `user_id` (from Clerk) or `workspace_id` (from request, defaults to `user_id`).
- **No client-controlled user_id**: The `GenerateEmailRequest` and `EmailTemplateRequest` models do not include a `user_id` field — it's always derived from auth.

## Frontend Architecture

### State Management (Zustand)

The `backlinkOutreachStore` manages all client state:

- **Campaign data**: List, selected campaign, leads.
- **UI state**: Active tab, loading flags (`isAttemptsLoading`, `isRepliesLoading`, `isAnalyticsLoading`, `isStatusUpdating`, `isExporting`).
- **Async operations**: All store actions include error handling and state clearing.
- **Retry logic**: The `withRetry` helper auto-retries read operations once on 5xx responses with exponential backoff.

### User Feedback

All user-facing feedback uses `showToastNotification` from `utils/toastNotifications.ts`:

- Success toasts on completed actions.
- Error toasts on failed API calls (with error message extraction).
- Warning toasts on partial failures (bulk operations).
- Loading states on buttons (`isStatusUpdating`, `isExporting`).

### Analytics Loading

Analytics loading uses an inline `useEffect` with a cancel flag, so a response that arrives after `analyticsDays` changes or the component unmounts is discarded instead of overwriting newer state:

```typescript
useEffect(() => {
  let cancelled = false;
  const loadAnalytics = async () => {
    // Fetch analytics here, then update state only if this effect is still current.
    if (!cancelled) { /* set state */ }
  };
  loadAnalytics();
  // Cleanup runs before the next effect and on unmount.
  return () => { cancelled = true; };
}, [analyticsDays]);
```

---

*This concludes the Backlink Outreach documentation. Start with the [Overview](overview.md) or [Workflow Guide](workflow-guide.md).*
@@ -1,163 +0,0 @@
# Outreach Operations

Outreach operations handle the sending pipeline: policy validation, SMTP delivery, idempotency, suppression, and audit logging.

## Send Pipeline

Every outbound email goes through this pipeline:

```mermaid
flowchart TD
    A[Send Request] --> B[Authenticate User]
    B --> C[Resolve Lead Email from DB]
    C --> D[Policy Validation]
    D -->|Approved| E[Create Outreach Attempt Record]
    D -->|Blocked| F[Record Audit Log + Return 403]
    E --> G[Send via SMTP with TLS]
    G -->|Success| H[Increment Counters]
    G -->|Success| I[Mark Idempotency Key]
    G -->|Success| J[Update Lead Status to Contacted]
    G -->|Failure| K[Return 500 with Generic Error]
    H --> L[Return 200 with Attempt Details]
    I --> L
    J --> L

    style D fill:#fff3e0
    style G fill:#e3f2fd
    style F fill:#ffebee
```

!!! warning "Counter timing"
    Counters and idempotency keys are marked **only after successful SMTP delivery**, never before. This prevents false cap consumption on failed sends.

## Policy Validation

Before every send, the system validates:

| Check | Rule | On Failure |
|---|---|---|
| **Daily user cap** | Max 100 emails/user/day | Block + audit |
| **Daily domain cap** | Max 20 emails/domain/day | Block + audit |
| **Suppression list** | Recipient not suppressed | Block + audit |
| **Idempotency** | No duplicate `(sender, recipient, subject)` in 24h | Block + audit |
| **Legal basis** | EU, UK, CA, and AU/NZ domains → `consent`, others → `legitimate_interest` | Auto-assign |

**API:** `POST /api/v1/backlink-outreach/policy/validate`

```json
{
  "recipient_email": "editor@example.com",
  "sender_email": "outreach@yourdomain.com",
  "subject": "Guest Post: AI Marketing Trends"
}
```

**Response:**

```json
{
  "allowed": true,
  "reason": "All checks passed",
  "legal_basis": "legitimate_interest",
  "daily_user_count": 23,
  "daily_user_limit": 100,
  "daily_domain_count": 5,
  "daily_domain_limit": 20,
  "region": "US"
}
```

### Region-Aware Legal Basis

The system infers the recipient's region from their email domain's TLD:

| TLDs | Region | Legal Basis |
|---|---|---|
| `.de`, `.fr`, `.it`, `.es`, `.nl`, `.pl`, `.se`, `.at`, `.be`, `.ch`, `.pt`, `.ie`, `.dk`, `.fi`, `.no`, `.cz`, `.gr`, `.hu`, `.ro`, `.bg`, `.hr`, `.sk`, `.si`, `.lt`, `.lv`, `.ee` | EU | `consent` |
| `.co.uk`, `.uk` | UK | `consent` |
| `.ca` | CA | `consent` |
| `.com.au`, `.co.nz` | AU/NZ | `consent` |
| All others | — | `legitimate_interest` |

!!! note "GDPR compliance"
    Leads on EU, UK, CA, and AU/NZ domains always use `consent` as the legal basis, so you should have obtained some form of consent before reaching out. For all other domains, `legitimate_interest` is applied automatically. Because the region is inferred from the TLD alone, a recipient on a generic domain such as `.com` is treated as `legitimate_interest` regardless of where they are located.

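The TLD lookup in the table can be sketched as a suffix match. This is an illustration, not the service's `_infer_region` implementation: the function names, the region labels, and returning `None` for unlisted TLDs are assumptions made for the example.

```python
# TLDs copied from the table above.
EU_TLDS = frozenset({
    "de", "fr", "it", "es", "nl", "pl", "se", "at", "be", "ch", "pt", "ie", "dk",
    "fi", "no", "cz", "gr", "hu", "ro", "bg", "hr", "sk", "si", "lt", "lv", "ee",
})

# Multi-label suffixes are checked first so "co.uk" is matched before "uk".
SUFFIX_REGIONS = (
    ("co.uk", "UK"),
    ("com.au", "AU/NZ"),
    ("co.nz", "AU/NZ"),
    ("uk", "UK"),
    ("ca", "CA"),
)


def infer_region(domain):
    """Return the consent region for a domain, or None when the TLD is not listed."""
    host = domain.strip().lower().rstrip(".")
    for suffix, region in SUFFIX_REGIONS:
        if host == suffix or host.endswith("." + suffix):
            return region
    tld = host.rsplit(".", 1)[-1]
    return "EU" if tld in EU_TLDS else None


def determine_legal_basis(recipient_email):
    """Map a recipient address to 'consent' or 'legitimate_interest'."""
    domain = recipient_email.rsplit("@", 1)[-1]
    return "consent" if infer_region(domain) else "legitimate_interest"
```
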
## Suppression List

Recipients on the suppression list are blocked from receiving emails.

### Adding to Suppression

**API:** `POST /api/v1/backlink-outreach/suppression`

```json
{
  "email": "unsubscribed@example.com",
  "reason": "User requested unsubscribe"
}
```

### Listing Suppressed Recipients

**API:** `GET /api/v1/backlink-outreach/suppression`

### Auto-Suppression

Recipients are automatically added to the suppression list when:

- They reply with "not interested" language.
- They explicitly request to be removed.
- An email to their address hard-bounces.

## Idempotency

The system prevents duplicate sends using idempotency keys derived from `(sender_email, recipient_email, subject)`.

- Keys are valid for 24 hours.
- After successful SMTP delivery, the key is marked as used.
- Attempting to send the same `(sender, recipient, subject)` within 24h returns a policy block.

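A key of this kind can be derived by hashing the three fields together. The sketch below is one possible scheme, assuming SHA-256 and case-insensitive email addresses. `idempotency_key` and `is_duplicate` are hypothetical helpers, and the real key format used by `check_idempotency` / `mark_idempotency` is not documented here.

```python
import hashlib
from datetime import datetime, timedelta, timezone

IDEMPOTENCY_WINDOW = timedelta(hours=24)


def idempotency_key(sender_email, recipient_email, subject):
    """Derive a stable key from (sender, recipient, subject)."""
    parts = (
        sender_email.strip().lower(),
        recipient_email.strip().lower(),
        subject.strip(),
    )
    # A unit separator avoids ambiguity between ("ab", "c") and ("a", "bc").
    payload = "\x1f".join(parts).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def is_duplicate(marked_at, now=None):
    """marked_at is when the key was stored after a successful send, or None."""
    if marked_at is None:
        return False
    now = now or datetime.now(timezone.utc)
    return now - marked_at < IDEMPOTENCY_WINDOW
```
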
## SMTP Configuration

Emails are sent via SMTP with mandatory TLS:

| Setting | Env Var | Default |
|---|---|---|
| SMTP host | `SMTP_HOST` | — (required) |
| SMTP port | `SMTP_PORT` | `587` |
| SMTP username | `SMTP_USER` | — (required) |
| SMTP password | `SMTP_PASS` | — (required) |
| TLS verification | `SMTP_VERIFY_TLS` | `true` |
| Send timeout | `SMTP_SEND_TIMEOUT` | `30` seconds |
| From email | `SMTP_FROM_EMAIL` | — (required) |

!!! warning "TLS certificate verification"
    By default, `SMTP_VERIFY_TLS=true` validates the SMTP server's TLS certificate. Set to `false` only for local development with self-signed certs. **Never disable in production.**

### SMTP Connection Flow

1. Connect to SMTP host on configured port.
2. Send `EHLO` greeting.
3. Upgrade to TLS with `STARTTLS`.
4. Send `EHLO` again (required by RFC 3207 after STARTTLS).
5. Authenticate with username/password.
6. Send the email with a configurable timeout.
7. Quit the connection cleanly.

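The connection flow maps closely onto Python's standard `smtplib`. The following is a sketch of those steps rather than the code in `backlink_outreach_sender.py`: the helper names are hypothetical, and treating any value other than `false` as "verify" is an assumption about how `SMTP_VERIFY_TLS` is parsed.

```python
import os
import smtplib
import ssl
from email.message import EmailMessage


def build_tls_context(verify=True):
    """Create the TLS context; verification is on unless explicitly disabled."""
    context = ssl.create_default_context()
    if not verify:
        # Local development with self-signed certificates only.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


def build_message(sender, recipient, subject, body):
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)
    return message


def send_email(message, env=None):
    """Follow the seven steps of the connection flow using the documented env vars."""
    env = os.environ if env is None else env
    verify = env.get("SMTP_VERIFY_TLS", "true").strip().lower() != "false"
    timeout = float(env.get("SMTP_SEND_TIMEOUT", "30"))
    port = int(env.get("SMTP_PORT", "587"))
    with smtplib.SMTP(env["SMTP_HOST"], port, timeout=timeout) as smtp:  # 1. connect
        smtp.ehlo()                                        # 2. EHLO
        smtp.starttls(context=build_tls_context(verify))   # 3. STARTTLS
        smtp.ehlo()                                        # 4. EHLO again
        smtp.login(env["SMTP_USER"], env["SMTP_PASS"])     # 5. authenticate
        smtp.send_message(message)                         # 6. send
    # 7. leaving the with-block sends QUIT and closes the socket
```

The message would typically be built with `build_message(env["SMTP_FROM_EMAIL"], recipient, subject, body)` before calling `send_email`.
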
## Audit Logging

Every policy check is recorded in the audit log:

| Field | Description |
|---|---|
| `user_id` | Authenticated user who initiated the send. |
| `lead_email` | Intended recipient. |
| `sender_email` | Sending address. |
| `subject` | Email subject line. |
| `policy_result` | `approved` or `blocked`. |
| `reason` | Human-readable explanation. |
| `legal_basis` | `consent` or `legitimate_interest`. |
| `timestamp` | When the check occurred. |

---

*Next: [Reply Inbox](reply-inbox.md) — IMAP monitoring and auto-classification.*
@@ -1,104 +0,0 @@
|
||||
# Backlink Outreach Overview
|
||||
|
||||
Backlink Outreach is an AI-powered guest post outreach platform that takes you from opportunity discovery to published backlink — with smart email composition, policy-safe sending, IMAP reply monitoring, and full campaign analytics.
|
||||
|
||||
## What you do in the product
|
||||
|
||||
1. **Create a campaign** to group leads, emails, and analytics together.
|
||||
2. **Discover opportunities** using AI-powered search across Exa neural search and DuckDuckGo.
|
||||
3. **Compose outreach emails** with AI generation, personalization, and subject-line suggestions.
|
||||
4. **Send outreach** through SMTP with built-in policy validation, suppression checks, and idempotency.
|
||||
5. **Monitor replies** via IMAP with auto-classification (interested, not interested, out of office).
|
||||
6. **Track analytics** — send volume trends, conversion funnels, reply classification breakdown, and CSV exports.
|
||||
|
||||
## What you see in the UI
|
||||
|
||||
- Campaign list with status and lead counts.
|
||||
- Discovery results with quality/confidence scores and email detection badges.
|
||||
- AI email composer with tone selector, template library, and live preview.
|
||||
- Lead cards with status lifecycle buttons (discovered → contacted → replied → placed).
|
||||
- Reply inbox with auto-classification tags.
|
||||
- Analytics tab with line charts, bar charts, and export controls.
|
||||
- Toast notifications for every action outcome (success or failure).
|
||||
|
||||
## Feature status matrix
|
||||
|
||||
| Capability | Status | Notes |
|---|---|---|
| Campaign CRUD | **Implemented** | Create, list, get detail with leads. |
| AI-powered deep discovery | **Implemented** | Exa neural search + DuckDuckGo with full-page scraping and email extraction. |
| Lead management | **Implemented** | Add, bulk-add, update status, bulk status update. |
| AI email generation | **Implemented** | Topic-based generation, personalization, subject-line suggestions, follow-up drafts. |
| Template CRUD | **Implemented** | Create, list, get, delete email templates with `{placeholder}` variable substitution. |
| SMTP email sending | **Implemented** | TLS with certificate verification, EHLO, configurable timeout. |
| Policy validation | **Implemented** | Daily caps, domain caps, suppression list, idempotency, region-aware legal basis (EU → consent). |
| IMAP reply monitoring | **Implemented** | Configurable fetch limit, auto-classification, deduplication. |
| Follow-up scheduling | **Implemented** | Schedule and track follow-up emails. |
| Campaign analytics | **Implemented** | Volume trends, conversion funnel, reply classification, response/placement rates. |
| CSV export | **Implemented** | Leads, attempts, and replies, with formula injection sanitization. |
| Audit logging | **Implemented** | Every policy check is recorded with reasons and outcome. |
| Suppression management | **Implemented** | Add and list suppressed recipients. |
| Clerk auth on all endpoints | **Implemented** | 18 protected endpoints + user-scoped data isolation. |
| Reporting snapshot | **Implemented** | Cross-campaign send volume, reply count, placement conversion. |
|
||||
|
||||
## How It Works
|
||||
|
||||
```mermaid
flowchart LR
    A[Create Campaign] --> B[Discover Opportunities]
    B --> C[Save Leads]
    C --> D[Compose Email]
    D --> E[Policy Validate]
    E -->|Approved| F[Send via SMTP]
    E -->|Blocked| G[Audit Log]
    F --> H[Monitor Replies]
    H --> I[Auto-Classify]
    I --> J[Track Analytics]

    style A fill:#e3f2fd
    style B fill:#e8f5e8
    style F fill:#fff3e0
    style I fill:#fce4ec
    style J fill:#f3e5f5
```
|
||||
|
||||
## Who Benefits Most
|
||||
|
||||
### For SEO Professionals
|
||||
- **Scalable outreach**: Send up to 100 emails/day per user with domain-level caps.
|
||||
- **Policy compliance**: Built-in GDPR-aware legal basis, suppression, and audit trail.
|
||||
- **Performance tracking**: Real-time analytics with conversion funnel and reply breakdown.
|
||||
|
||||
### For Content Marketers
|
||||
- **AI email composer**: Generate personalized outreach emails in seconds, not hours.
|
||||
- **Template library**: Save and reuse winning email templates across campaigns.
|
||||
- **Reply triage**: Auto-classified replies let you focus on interested leads first.
|
||||
|
||||
### For Agencies
|
||||
- **Multi-campaign management**: Organize outreach by client or vertical.
|
||||
- **CSV exports**: Download leads, attempts, and replies for client reporting.
|
||||
- **Audit trail**: Every send decision is logged for compliance and accountability.
|
||||
|
||||
## Getting Started
|
||||
|
||||
1. **[Workflow Guide](workflow-guide.md)** - Step-by-step walkthrough from campaign creation to analytics.
|
||||
2. **[Campaign Management](campaign-management.md)** - Creating and organizing campaigns.
|
||||
3. **[Discovery](discovery.md)** - AI-powered opportunity search.
|
||||
4. **[Email Composer](email-composer.md)** - AI email generation and personalization.
|
||||
5. **[Outreach Operations](outreach-operations.md)** - Sending, policy, suppression.
|
||||
6. **[Reply Inbox](reply-inbox.md)** - IMAP monitoring and classification.
|
||||
7. **[Analytics](analytics.md)** - Charts, funnels, and exports.
|
||||
8. **[API Reference](api-reference.md)** - Full endpoint documentation.
|
||||
9. **[Configuration](configuration.md)** - Environment variables and deployment.
|
||||
10. **[Implementation Overview](implementation-overview.md)** - Architecture and internals.
|
||||
|
||||
## Related Features
|
||||
|
||||
- **[SEO Dashboard](../seo-dashboard/overview.md)** - Comprehensive SEO tools and GSC integration.
|
||||
- **[Blog Writer](../blog-writer/overview.md)** - Create content to earn backlinks organically.
|
||||
- **[Content Strategy](../content-strategy/overview.md)** - Strategic planning for link-building campaigns.
|
||||
- **[Subscription](../subscription/overview.md)** - Plan limits and billing.
|
||||
|
||||
---
|
||||
|
||||
*Ready to start building backlinks? Check out the [Workflow Guide](workflow-guide.md) to get started!*
|
||||
@@ -1,109 +0,0 @@
|
||||
# Reply Inbox
|
||||
|
||||
The reply inbox monitors your outreach mailbox via IMAP, automatically classifies replies, and deduplicates incoming messages.
|
||||
|
||||
## How It Works
|
||||
|
||||
```mermaid
flowchart TD
    A[Poll IMAP Inbox] --> B[Search for New Messages]
    B --> C[Fetch Message Headers + Body]
    C --> D{Already Processed?}
    D -->|Yes| E[Skip Duplicate]
    D -->|No| F[Find Matching Attempt]
    F --> G[Classify Reply]
    G --> H[Store Reply Record]
    H --> I[Update Lead Status if Interested]

    style A fill:#e3f2fd
    style G fill:#e8f5e8
    style E fill:#ffebee
```
|
||||
|
||||
## IMAP Configuration
|
||||
|
||||
| Setting | Env Var | Default |
|---|---|---|
| IMAP host | `IMAP_HOST` | (required) |
| IMAP port | `IMAP_PORT` | `993` |
| IMAP username | `IMAP_USER` | (required) |
| IMAP password | `IMAP_PASS` | (required) |
| Fetch limit | `IMAP_FETCH_LIMIT` | `50` |
|
||||
|
||||
!!! tip "Fetch limit"
|
||||
`IMAP_FETCH_LIMIT` controls how many messages are processed per poll cycle. Increase for high-volume mailboxes, decrease to reduce IMAP load. Default is 50.
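
The settings in the table above could be loaded along these lines. This is a minimal sketch, not the project's actual loader: the `ImapConfig` class and `load_imap_config` function are hypothetical names, and only the environment variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ImapConfig:
    """Hypothetical container for the IMAP settings listed in the table."""
    host: str
    port: int
    user: str
    password: str
    fetch_limit: int


def load_imap_config(env: Mapping[str, str] = os.environ) -> ImapConfig:
    """Read IMAP settings, failing fast when a required value is missing."""
    required = ("IMAP_HOST", "IMAP_USER", "IMAP_PASS")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(f"Missing required IMAP settings: {', '.join(missing)}")
    return ImapConfig(
        host=env["IMAP_HOST"],
        port=int(env.get("IMAP_PORT", "993")),                # documented default
        user=env["IMAP_USER"],
        password=env["IMAP_PASS"],
        fetch_limit=int(env.get("IMAP_FETCH_LIMIT", "50")),   # documented default
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test without touching the real environment.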
|
||||
|
||||
## Polling for Replies
|
||||
|
||||
**API:** `POST /api/v1/backlink-outreach/replies/poll`
|
||||
|
||||
The reply monitor:
|
||||
|
||||
1. Connects to IMAP over SSL.
|
||||
2. Sanitizes the `sent_from_email` before searching (prevents IMAP injection).
|
||||
3. Searches for messages sent to your outreach address.
|
||||
4. Fetches up to `IMAP_FETCH_LIMIT` recent messages.
|
||||
5. For each message, checks if it's already been processed (deduplication).
|
||||
6. Matches the reply to an existing outreach attempt by sender email.
|
||||
7. Classifies the reply and stores it.
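
Step 2 above (sanitizing the address before it reaches the IMAP `SEARCH` command) might look like the sketch below. The function name and the validation pattern are assumptions made for illustration. The guide does not say how the real sanitizer works, and the pattern here is deliberately stricter than what the email RFCs allow.

```python
import re
from typing import Tuple

# Conservative allow-list pattern (an assumption, not the project's rule).
# It rejects quotes, spaces, and control characters, which are the characters
# an attacker would need to break out of a quoted IMAP search argument.
_SAFE_EMAIL = re.compile(r"[A-Za-z0-9._%+\-]+@[A-Za-z0-9.\-]+\.[A-Za-z]{2,}")


def build_to_search_criteria(sent_from_email: str) -> Tuple[str, str]:
    """Return ('TO', '"address"') for use with imaplib's search(), or raise."""
    address = sent_from_email.strip()
    if not _SAFE_EMAIL.fullmatch(address):
        raise ValueError("Refusing to search with an unsafe email address")
    return ("TO", f'"{address}"')
```

With the standard library's `imaplib`, the result could then be passed as `conn.search(None, *build_to_search_criteria(addr))`.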
|
||||
|
||||
### Reply Matching
|
||||
|
||||
Replies are matched to outreach attempts using the `from_email` field:
|
||||
|
||||
- The system calls `find_attempt_by_from_email(from_email)` to find the most recent outreach attempt sent to that email address.
|
||||
- If no match is found, the reply is still stored but not linked to an attempt.
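
The matching rule can be illustrated with an in-memory sketch. The real `find_attempt_by_from_email` presumably queries the database. The `Attempt` fields and the case-insensitive comparison below are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass(frozen=True)
class Attempt:
    """Hypothetical shape of an outreach attempt record."""
    attempt_id: int
    to_email: str
    sent_at: datetime


def find_attempt_by_from_email(
    attempts: Sequence[Attempt], from_email: str
) -> Optional[Attempt]:
    """Return the most recent attempt sent to the reply's sender, if any."""
    wanted = from_email.strip().lower()
    matches = [a for a in attempts if a.to_email.strip().lower() == wanted]
    # max() with default=None covers the "no match" case: the reply is still
    # stored by the caller, just without a linked attempt.
    return max(matches, key=lambda a: a.sent_at, default=None)
```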
|
||||
|
||||
### Deduplication
|
||||
|
||||
The system checks `reply_exists(from_email, subject)` before storing a new reply. This prevents duplicate entries when the same message appears in multiple IMAP folders or is fetched in overlapping poll cycles.
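
A minimal sketch of that check, assuming replies are keyed by a normalized `(from_email, subject)` pair. The `ReplyStore` class and its normalization rules are hypothetical, since the guide names only `reply_exists`.

```python
from typing import List, Set, Tuple


class ReplyStore:
    """In-memory stand-in for the reply table, for illustration only."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, str]] = set()
        self.replies: List[dict] = []

    @staticmethod
    def _key(from_email: str, subject: str) -> Tuple[str, str]:
        # Assumption: email compared case-insensitively, subject trimmed.
        return (from_email.strip().lower(), subject.strip())

    def reply_exists(self, from_email: str, subject: str) -> bool:
        return self._key(from_email, subject) in self._seen

    def store_reply(self, from_email: str, subject: str, body: str) -> bool:
        """Store the reply unless it is a duplicate. Returns True if stored."""
        if self.reply_exists(from_email, subject):
            return False
        self._seen.add(self._key(from_email, subject))
        self.replies.append({"from_email": from_email, "subject": subject, "body": body})
        return True
```

One consequence of keying on sender and subject alone is that two genuinely different replies with the same subject from the same sender are treated as one.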
|
||||
|
||||
## Auto-Classification
|
||||
|
||||
Replies are automatically classified based on content analysis:
|
||||
|
||||
| Classification | Signals |
|---|---|
| **Interested** | "sounds good", "tell me more", "interested", "let's do it", "I'd love to" |
| **Not interested** | "not interested", "no thanks", "unsubscribe", "remove me", "stop sending" |
| **Out of office** | "out of office", "auto-reply", "automated response", "on vacation" |
| **Replied** | General reply that doesn't match other categories |
|
||||
|
||||
!!! note "Manual override"
|
||||
Auto-classification is a best-effort guess. You can manually reclassify any reply in the UI by clicking the classification tag and selecting a different one.
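
A keyword-based classifier over the signals in the table could be sketched as follows. The phrases come from the table. The label strings, the precedence order, and the plain substring matching are assumptions, and the real classifier may work differently.

```python
from typing import Tuple

# Order matters: "not interested" contains "interested", so the negative
# signals must be checked before the positive ones.
SIGNALS: Tuple[Tuple[str, Tuple[str, ...]], ...] = (
    ("out_of_office", ("out of office", "auto-reply", "automated response", "on vacation")),
    ("not_interested", ("not interested", "no thanks", "unsubscribe", "remove me", "stop sending")),
    ("interested", ("sounds good", "tell me more", "interested", "let's do it", "i'd love to")),
)


def classify_reply(text: str) -> str:
    """Return the first matching label, or 'replied' when nothing matches."""
    lowered = text.lower()
    for label, phrases in SIGNALS:
        if any(phrase in lowered for phrase in phrases):
            return label
    return "replied"
```

Substring matching is easy to reason about but easy to fool (for example, "I'm interested in unsubscribing"), which is one reason the manual override matters.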
|
||||
|
||||
### Auto-Suppression on "Not Interested"
|
||||
|
||||
When a reply is classified as "not interested", the sender's email is **automatically added to the suppression list** to prevent future outreach.
|
||||
|
||||
## Reply Inbox UI
|
||||
|
||||
The inbox shows:
|
||||
|
||||
- **From**: Sender name and email.
|
||||
- **Subject**: Email subject line.
|
||||
- **Classification tag**: Color-coded auto-classification badge.
|
||||
- **Date**: When the reply was received.
|
||||
- **Linked attempt**: The outreach attempt this reply matches (if any).
|
||||
- **Lead status**: Current status of the associated lead.
|
||||
|
||||
### Actions
|
||||
|
||||
| Action | Description |
|---|---|
| **View** | Read the full reply body. |
| **Reclassify** | Change the auto-classification. |
| **Update lead status** | Move the lead to "replied" or "placed". |
| **Compose follow-up** | Open the email composer pre-filled with a follow-up draft. |
|
||||
| **Compose follow-up** | Open the email composer pre-filled with a follow-up draft. |
|
||||
|
||||
## Monitoring Best Practices
|
||||
|
||||
1. **Poll regularly**: Set up a scheduled job to call the poll endpoint every 5-15 minutes.
|
||||
2. **Review unclassified**: Check "Replied" (generic) classifications and manually tag them.
|
||||
3. **Act on interested leads quickly**: Respond within 24 hours for best conversion.
|
||||
4. **Check out-of-office dates**: Schedule follow-ups for after the return date.
|
||||
5. **Review suppression entries**: Periodically audit the suppression list for accidental additions.
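
The "poll regularly" practice above could be scripted roughly as below. Only the endpoint path comes from this guide. The base URL, the bearer-token header, and the empty request body are assumptions, so check the API reference before relying on them.

```python
import urllib.request

POLL_PATH = "/api/v1/backlink-outreach/replies/poll"  # path documented above


def build_poll_request(base_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for one poll cycle."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + POLL_PATH,
        data=b"",                                        # assumed: no payload
        method="POST",
        headers={"Authorization": f"Bearer {token}"},    # assumed auth scheme
    )
```

A cron job or scheduler running every 5-15 minutes could send it with `urllib.request.urlopen(build_poll_request(base_url, token), timeout=30)`.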
|
||||
|
||||
---
|
||||
|
||||
*Next: [Analytics](analytics.md) — campaign performance tracking and exports.*
|
||||
@@ -1,120 +0,0 @@
|
||||
# Backlink Outreach Workflow Guide
|
||||
|
||||
This guide walks through the complete Backlink Outreach lifecycle from campaign creation to analytics review.
|
||||
|
||||
## 1) Create a Campaign
|
||||
|
||||
Campaigns group your leads, outreach attempts, replies, and analytics together. Every action in the system belongs to a campaign.
|
||||
|
||||
!!! tip "Best practice"
|
||||
Create one campaign per target vertical or client. For example: "SaaS Growth Blogs Q3" or "Fitness Influencer Outreach".
|
||||
|
||||
**What to validate before continuing:**
|
||||
- Campaign name is descriptive enough to distinguish from others.
|
||||
- You have a clear keyword or niche for discovery.
|
||||
|
||||
## 2) Discover Opportunities
|
||||
|
||||
Use AI-powered discovery to find websites that accept guest posts in your niche.
|
||||
|
||||
!!! note "How discovery works"
|
||||
The system combines **Exa neural search** (semantic understanding) with **DuckDuckGo** (broad coverage), scrapes full pages, extracts contact emails, and scores each opportunity for quality and guest-post likelihood.
|
||||
|
||||
**Recommended sequence:**
|
||||
1. Enter a keyword (e.g., "AI marketing", "SaaS growth").
|
||||
2. Click **Discover** to search across multiple query patterns ("write for us", "guest contributor", etc.).
|
||||
3. Review results — check quality score, confidence score, and email detection badges.
|
||||
4. Select a campaign and click **Save to Campaign** to persist leads.
|
||||
|
||||
**What to look for:**
|
||||
- Quality score > 60% — the site is relevant to your keyword.
|
||||
- Confidence score > 50% — the site likely accepts guest posts.
|
||||
- "Has guidelines" badge — the site has a dedicated guest post page.
|
||||
- "Email found" badge — a contact email was extracted.
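
If you post-process discovery results yourself, for example from an export, the thresholds above translate into a simple filter. The `Opportunity` fields and the 0-100 score scale are hypothetical and only mirror the badges and scores described in this step.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Opportunity:
    """Hypothetical shape of one discovery result."""
    url: str
    quality_score: float      # assumed 0-100 scale
    confidence_score: float   # assumed 0-100 scale
    has_guidelines: bool
    email_found: bool


def shortlist(
    opportunities: Iterable[Opportunity],
    min_quality: float = 60.0,
    min_confidence: float = 50.0,
    require_email: bool = True,
) -> List[Opportunity]:
    """Keep results above both thresholds, best candidates first."""
    kept = [
        o for o in opportunities
        if o.quality_score > min_quality
        and o.confidence_score > min_confidence
        and (o.email_found or not require_email)
    ]
    # Sites with a guidelines page sort first, then by quality score.
    return sorted(kept, key=lambda o: (o.has_guidelines, o.quality_score), reverse=True)
```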
|
||||
|
||||
## 3) Compose Outreach Emails
|
||||
|
||||
Use the AI email composer to craft personalized outreach messages.
|
||||
|
||||
!!! note "AI generation options"
|
||||
- **Generate**: Create an email from a topic, tone, and optional template.
|
||||
- **Personalize**: Tailor an email to a specific lead (name, site, content topic).
|
||||
- **Subject Lines**: Get 5-10 AI-suggested subject line variants.
|
||||
- **Follow-up**: Generate a polite follow-up referencing the original email.
|
||||
|
||||
**Recommended sequence:**
|
||||
1. Choose a template or start fresh.
|
||||
2. Enter your topic and target site (optional).
|
||||
3. Select a tone (Professional, Friendly, Casual, Formal).
|
||||
4. Click **Generate with AI** to create a subject + body.
|
||||
5. Optionally click **Suggest** for subject line variants.
|
||||
6. Use **Personalize** to tailor the email to a specific lead.
|
||||
7. Preview the email in the live preview pane.
|
||||
|
||||
## 4) Send Outreach
|
||||
|
||||
Once your email is composed, navigate to the Leads tab to send outreach.
|
||||
|
||||
!!! warning "Policy validation"
|
||||
Every send is validated against your daily caps, suppression list, and GDPR rules. EU-domain leads automatically use "consent" as legal basis; others use "legitimate_interest".
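
The region-aware rule in the warning above can be illustrated with a top-level-domain check. How the system actually decides that a lead is an "EU-domain" lead is not documented here, so the TLD heuristic below is an assumption. Only the two legal-basis values come from the text.

```python
# Country-code TLDs of the 27 EU member states, plus ".eu".
EU_TLDS = frozenset({
    "at", "be", "bg", "hr", "cy", "cz", "dk", "ee", "fi", "fr", "de", "gr",
    "hu", "ie", "it", "lv", "lt", "lu", "mt", "nl", "pl", "pt", "ro", "sk",
    "si", "es", "se", "eu",
})


def legal_basis_for(email_or_domain: str) -> str:
    """Return 'consent' for EU TLDs, otherwise 'legitimate_interest'."""
    domain = email_or_domain.strip().lower().rpartition("@")[2]
    tld = domain.rpartition(".")[2]
    return "consent" if tld in EU_TLDS else "legitimate_interest"
```

A TLD check cannot see EU recipients on generic domains such as `.com`, so treat it as a lower bound rather than a compliance guarantee.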
|
||||
|
||||
**What happens when you send:**
|
||||
1. Policy is validated (caps, suppression, idempotency, legal basis).
|
||||
2. An outreach attempt is recorded in the database.
|
||||
3. If approved, the email is sent via SMTP with TLS.
|
||||
4. Send counters are incremented **only after successful delivery**.
|
||||
5. Idempotency key is marked to prevent duplicate sends.
|
||||
6. Lead status is updated to "contacted".
|
||||
|
||||
**Daily limits:**
|
||||
- 100 emails per user per day.
|
||||
- 20 emails per domain per day.
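
The send-time checks described in this step can be sketched as a pure function over one day's counters. All names below are hypothetical. The sketch assumes the domain cap applies to the recipient's domain and, as the steps above state, that counters and idempotency keys are recorded only after a successful delivery.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Set

USER_DAILY_CAP = 100    # emails per user per day (from this guide)
DOMAIN_DAILY_CAP = 20   # emails per domain per day (from this guide)


@dataclass
class PolicyState:
    """One day's worth of policy bookkeeping, held in memory for illustration."""
    sent_by_user: Counter = field(default_factory=Counter)
    sent_by_domain: Counter = field(default_factory=Counter)
    suppressed: Set[str] = field(default_factory=set)
    used_idempotency_keys: Set[str] = field(default_factory=set)


def validate_send(state: PolicyState, user_id: str, to_email: str, idempotency_key: str) -> List[str]:
    """Return the reasons a send is blocked. An empty list means approved."""
    email = to_email.strip().lower()
    domain = email.rpartition("@")[2]
    reasons = []
    if email in state.suppressed:
        reasons.append("recipient_suppressed")
    if idempotency_key in state.used_idempotency_keys:
        reasons.append("duplicate_send")
    if state.sent_by_user[user_id] >= USER_DAILY_CAP:
        reasons.append("user_daily_cap_reached")
    if state.sent_by_domain[domain] >= DOMAIN_DAILY_CAP:
        reasons.append("domain_daily_cap_reached")
    return reasons


def record_successful_send(state: PolicyState, user_id: str, to_email: str, idempotency_key: str) -> None:
    """Call only after SMTP delivery succeeded."""
    domain = to_email.strip().lower().rpartition("@")[2]
    state.sent_by_user[user_id] += 1
    state.sent_by_domain[domain] += 1
    state.used_idempotency_keys.add(idempotency_key)
```

Returning every failing reason, instead of stopping at the first one, matches the audit-log idea of recording why a send was blocked.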
|
||||
|
||||
## 5) Monitor Replies
|
||||
|
||||
After sending outreach, monitor replies through the IMAP-powered inbox.
|
||||
|
||||
!!! note "Auto-classification"
|
||||
Replies are automatically classified as:
|
||||
- **Interested** — positive language detected ("sounds good", "tell me more").
|
||||
- **Not interested** — negative language ("not interested", "unsubscribe").
|
||||
- **Out of office** — auto-responder detected.
|
||||
- **Replied** — general reply without strong signals.
|
||||
|
||||
**What to do with classified replies:**
|
||||
- **Interested**: Move the lead to "replied" status, then "placed" after publication.
|
||||
- **Not interested**: Mark as "bounced" or leave as-is. The sender is auto-added to suppression.
|
||||
- **Out of office**: Schedule a follow-up for after their return date.
|
||||
- **Replied**: Read and manually classify, then update lead status.
|
||||
|
||||
## 6) Track Analytics
|
||||
|
||||
Monitor campaign performance with built-in analytics.
|
||||
|
||||
**Key metrics:**
|
||||
- **Send Volume**: Daily email send trend over time.
|
||||
- **Response Rate**: Percentage of sent emails that received a reply.
|
||||
- **Placement Rate**: Percentage of leads that resulted in a published post.
|
||||
- **Conversion Funnel**: Lead count by status stage (discovered → contacted → replied → placed).
|
||||
- **Reply Classification**: Breakdown of reply types.
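
The rate and funnel metrics above reduce to simple arithmetic. The sketch below uses hypothetical function names and assumes the funnel is a plain count of leads currently in each status, which is how the metric is described here.

```python
from collections import Counter
from typing import Dict, Iterable

FUNNEL_STAGES = ("discovered", "contacted", "replied", "placed")


def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage, or 0.0 when whole is zero."""
    return 0.0 if whole == 0 else round(100.0 * part / whole, 1)


def conversion_funnel(lead_statuses: Iterable[str]) -> Dict[str, int]:
    """Count leads per lifecycle stage, always listing every stage in order."""
    counts = Counter(lead_statuses)
    return {stage: counts.get(stage, 0) for stage in FUNNEL_STAGES}


# Response rate: replies received / emails sent.
# Placement rate: leads with a published post / total leads.
response_rate = percentage(30, 200)
placement_rate = percentage(6, 120)
```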
|
||||
|
||||
**Export options:**
|
||||
- Export Leads as CSV for CRM import.
|
||||
- Export Attempts for audit trails.
|
||||
- Export Replies for analysis in spreadsheets.
|
||||
|
||||
!!! tip "CSV safety"
|
||||
All CSV exports are sanitized against formula injection — cells starting with `=`, `+`, `-`, or `@` are automatically escaped.
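
One common way to neutralize formula injection is to prefix risky cells with a single quote so spreadsheet software treats them as text. The sketch below uses that approach as an assumption. The guide says such cells are escaped but not how, and `sanitize_cell` and `rows_to_csv` are hypothetical names.

```python
import csv
import io
from typing import Iterable, Sequence

FORMULA_PREFIXES = ("=", "+", "-", "@")  # characters named in the tip above


def sanitize_cell(value: object) -> str:
    """Prefix cells that a spreadsheet could interpret as a formula."""
    text = "" if value is None else str(value)
    if text.startswith(FORMULA_PREFIXES):
        return "'" + text
    return text


def rows_to_csv(rows: Iterable[Sequence[object]]) -> str:
    """Serialize rows to CSV text with every cell sanitized."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow([sanitize_cell(cell) for cell in row])
    return buffer.getvalue()
```

Note that this rule also prefixes legitimate values such as negative numbers (`-5`), a trade-off worth knowing about when importing exports into a CRM.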
|
||||
|
||||
## 7) Iterate and Optimize
|
||||
|
||||
Use analytics insights to improve your outreach:
|
||||
|
||||
1. **Low response rate?** Try different subject lines or tones.
|
||||
2. **High bounce rate?** Improve lead quality filters during discovery.
|
||||
3. **Low placement rate?** Refine your pitch personalization.
|
||||
4. **Many "not interested"?** Adjust your target niche or messaging.
|
||||
|
||||
---
|
||||
|
||||
*Now you know the full workflow! Dive deeper with [Campaign Management](campaign-management.md) or [Discovery](discovery.md).*
|
||||
@@ -4,33 +4,6 @@ Base prefix: `/api/podcast`
|
||||
|
||||
This page summarizes the Podcast Maker endpoints currently represented in frontend and backend code.
|
||||
|
||||
## Endpoint Map
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
A[/api/podcast]
|
||||
A --> P[projects.py]
|
||||
A --> AN[analysis.py]
|
||||
A --> R[research.py]
|
||||
A --> S[script.py]
|
||||
A --> AU[audio.py]
|
||||
A --> V[video.py]
|
||||
A --> I[images.py]
|
||||
A --> AV[avatar.py]
|
||||
A --> D[dubbing.py]
|
||||
|
||||
P --> P1[Create project]
|
||||
P --> P2[List project history]
|
||||
AN --> AN1[Run episode analysis]
|
||||
R --> R1[Generate/select queries]
|
||||
S --> S1[Create/update script]
|
||||
AU --> AU1[Render audio]
|
||||
V --> V1[Render video]
|
||||
I --> I1[Generate supporting images]
|
||||
AV --> AV1[Configure presenter avatar]
|
||||
D --> D1[Voice dubbing / localization]
|
||||
```
|
||||
|
||||
## Endpoints by workflow stage
|
||||
|
||||
### Analysis and idea shaping
|
||||
|
||||
@@ -1,35 +1,8 @@
|
||||
# Podcast Maker Implementation Overview
|
||||
|
||||
Podcast Maker orchestrates a multi-stage content pipeline: project configuration, research grounding, script composition, media rendering, and publish-state tracking.
|
||||
This page keeps implementation details in one place for engineering and advanced troubleshooting.
|
||||
|
||||
## Architecture & Data Flow
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
UI[Podcast Maker UI]
|
||||
API[Podcast API Router]
|
||||
PROJ[Project Service]
|
||||
RESEARCH[Research Handler]
|
||||
SCRIPT[Script Handler]
|
||||
RENDER[Audio/Video Render Handlers]
|
||||
STORE[(Podcast Tables)]
|
||||
JOBS[(Render Queue)]
|
||||
|
||||
UI --> API
|
||||
API --> PROJ
|
||||
API --> RESEARCH
|
||||
API --> SCRIPT
|
||||
API --> RENDER
|
||||
|
||||
PROJ --> STORE
|
||||
RESEARCH --> STORE
|
||||
SCRIPT --> STORE
|
||||
RENDER --> JOBS
|
||||
RENDER --> STORE
|
||||
|
||||
JOBS --> UI
|
||||
STORE --> UI
|
||||
```
|
||||
## Architecture
|
||||
|
||||
Podcast Maker is split into:
|
||||
|
||||
|
||||
@@ -1,23 +0,0 @@
|
||||
# Persona Journey: Podcast Host
|
||||
|
||||
## Host Goal
|
||||
Deliver a confident, natural-sounding episode with clear narrative transitions and evidence-backed claims.
|
||||
|
||||
## Journey Stages
|
||||
|
||||
### Stage 1: Brief Setup
|
||||
!!! note "Annotated view: Host setup modal"
|
||||
**Alt text:** Create modal annotated for host persona setup, including tone and pacing controls.
|
||||
|
||||
### Stage 2: Insight Review
|
||||
!!! note "Annotated view: Host analysis panel"
|
||||
**Alt text:** Analysis panel annotated to highlight speaking pace and confidence recommendations for host delivery.
|
||||
|
||||
### Stage 3: Script Approval
|
||||
!!! note "Annotated view: Host script editor"
|
||||
**Alt text:** Script editor annotated for host line-level edits, transition cues, and final approval workflow.
|
||||
|
||||
## Success Criteria
|
||||
- Host reads naturally without overlong sentences.
|
||||
- Topic transitions stay on-message.
|
||||
- Sources remain transparent when making claims.
|
||||
@@ -1,23 +0,0 @@
|
||||
# Persona Journey: Podcast Producer
|
||||
|
||||
## Producer Goal
|
||||
Coordinate research quality, production timelines, and render output consistency across every episode.
|
||||
|
||||
## Journey Stages
|
||||
|
||||
### Stage 1: Research Curation
|
||||
!!! note "Annotated view: Producer research query selection"
|
||||
**Alt text:** Research query selection annotated for producer review, source locking, and query prioritization.
|
||||
|
||||
### Stage 2: Render Oversight
|
||||
!!! note "Annotated view: Producer render queue"
|
||||
**Alt text:** Render queue annotated for producer monitoring of job status, retries, and SLA adherence.
|
||||
|
||||
### Stage 3: Catalog Management
|
||||
!!! note "Annotated view: Producer project list and episode history"
|
||||
**Alt text:** Project list and episode history annotated for producer-level tracking of versions and publishing channels.
|
||||
|
||||
## Success Criteria
|
||||
- Each episode has approved research scope.
|
||||
- Queue health stays within SLA.
|
||||
- History records are clear enough for postmortem and reuse.
|
||||
@@ -1,63 +0,0 @@
|
||||
# Podcast Maker Workflow Guide
|
||||
|
||||
This guide walks through the complete Podcast Maker lifecycle from project creation to final render delivery.
|
||||
|
||||
## 1) Create a New Episode Project
|
||||
|
||||
!!! note "Annotated view: Create modal"
|
||||
**Alt text:** Create modal showing fields for podcast title, target audience, tone, and language selection.
|
||||
|
||||
**What to validate before continuing**
|
||||
- Working title and episode angle are clear.
|
||||
- Persona is selected (Host, Analyst, Interviewer, etc.).
|
||||
- Output profile (audio-only vs video podcast) matches channel needs.
|
||||
|
||||
## 2) Review the Analysis Panel
|
||||
|
||||
!!! note "Annotated view: Analysis panel"
|
||||
**Alt text:** Analysis panel with audience-fit score, speaking pace metrics, and citation confidence indicators.
|
||||
|
||||
Use this panel to align creative intent with production constraints:
|
||||
- Audience fit and intent match
|
||||
- Style and voice consistency
|
||||
- Citation confidence and source quality
|
||||
|
||||
## 3) Select Research Queries
|
||||
|
||||
!!! note "Annotated view: Research query selection"
|
||||
**Alt text:** Research query selection interface with approved query chips and source locking controls.
|
||||
|
||||
Recommended sequence:
|
||||
1. Approve high-signal queries.
|
||||
2. Exclude broad/ambiguous prompts.
|
||||
3. Lock trusted domains before script generation.
|
||||
|
||||
## 4) Edit and Approve the Script
|
||||
|
||||
!!! note "Annotated view: Script editor"
|
||||
**Alt text:** Script editor with host dialogue blocks, scene transitions, and approval status in sidebar.
|
||||
|
||||
In editor review, confirm:
|
||||
- Segment timing is balanced.
|
||||
- Host transitions sound conversational.
|
||||
- Fact-heavy sections include source context.
|
||||
|
||||
## 5) Monitor Render Queue
|
||||
|
||||
!!! note "Annotated view: Render queue"
|
||||
**Alt text:** Render queue showing in-progress, completed, and failed render jobs with retry actions.
|
||||
|
||||
Queue best practices:
|
||||
- Prioritize short drafts for rapid QA.
|
||||
- Retry failures after checking asset integrity.
|
||||
- Archive superseded versions to reduce noise.
|
||||
|
||||
## 6) Manage Project & Episode History
|
||||
|
||||
!!! note "Annotated view: Project list and episode history"
|
||||
**Alt text:** Project list and episode history view with version labels, publish status, and destination channels.
|
||||
|
||||
Use episode history to track:
|
||||
- Revision progression
|
||||
- Performance-linked updates
|
||||
- Publishing destination consistency
|
||||
@@ -1,285 +0,0 @@
|
||||
# AI Copilot Assistant Guide
|
||||
|
||||
## 🤖 Overview
|
||||
|
||||
The ALwrity AI Copilot is a conversational AI assistant powered by CopilotKit and Google Gemini LLM. It provides intelligent, context-aware SEO recommendations using natural language interaction.
|
||||
|
||||
## Key Features
|
||||
|
||||
### Conversational Interface
|
||||
- **Natural Language**: Ask questions in plain English
|
||||
- **Context Aware**: Understands your SEO data and goals
|
||||
- **Multi-Turn**: Continuous conversation for detailed guidance
|
||||
- **Smart Suggestions**: Recommendations based on your analysis
|
||||
|
||||
### Capabilities
|
||||
|
||||
#### Analysis Interpretation
|
||||
Ask the Copilot to explain your analysis results:
|
||||
- "What does my health score of 75 mean?"
|
||||
- "Why is my mobile speed score low?"
|
||||
- "Which critical issues should I focus on first?"
|
||||
|
||||
#### Actionable Recommendations
|
||||
Get specific guidance:
|
||||
- "How can I improve my Core Web Vitals?"
|
||||
- "What's the best strategy to increase rankings?"
|
||||
- "Should I focus on content or technical SEO?"
|
||||
|
||||
#### Tool Guidance
|
||||
Get help using the right tools:
|
||||
- "Which tool should I use for my product pages?"
|
||||
- "How do I set up competitor analysis?"
|
||||
- "What's the best way to optimize my images?"
|
||||
|
||||
#### Strategic Planning
|
||||
Get strategic insights:
|
||||
- "What should my content strategy be?"
|
||||
- "How do I compare to my competitors?"
|
||||
- "What content opportunities am I missing?"
|
||||
|
||||
---
|
||||
|
||||
## How to Use the AI Copilot
|
||||
|
||||
### Starting a Conversation
|
||||
|
||||
1. **Open the Copilot Panel** - Click the AI Copilot icon in your dashboard
|
||||
2. **Ask Your Question** - Type in natural language
|
||||
3. **Provide Context** - Mention your URL or industry if relevant
|
||||
4. **Wait for Response** - Copilot analyzes your data and responds
|
||||
|
||||
### Example Conversations
|
||||
|
||||
#### Example 1: Improving Performance
|
||||
```
You: "My PageSpeed score is 65. How do I improve it?"
Copilot: "A score of 65 indicates moderate performance issues.
Here are the priority actions:
1. Optimize images (biggest impact)
2. Enable compression
3. Reduce JavaScript
I recommend using our PageSpeed Analyzer tool to get specific recommendations."
```
|
||||
|
||||
#### Example 2: Content Strategy
|
||||
```
You: "I want to improve my content strategy. Where should I start?"
Copilot: "Based on your domain analysis, I recommend:
1. Run Content Strategy Analyzer to find gaps
2. Analyze competitors with Sitemap Analyzer
3. Focus on topics with high search volume and low competition
4. Create content clusters around your main topics
Your current gap is around long-tail keywords - that's an opportunity!"
```
|
||||
|
||||
#### Example 3: Technical Issues
|
||||
```
You: "I have crawl errors. What should I do?"
Copilot: "Crawl errors need immediate attention. Use the Technical SEO Analyzer to:
1. Identify the specific errors
2. Prioritize by severity
3. Fix 4xx errors (broken pages)
4. Then address 5xx errors
Let me show you how to set it up."
```
|
||||
|
||||
---
|
||||
|
||||
## Copilot Quick Commands
|
||||
|
||||
### Analysis Help
|
||||
- "Explain my health score"
|
||||
- "What are my biggest SEO issues?"
|
||||
- "How do I read this analysis?"
|
||||
- "What's my score breakdown?"
|
||||
|
||||
### Tool Recommendations
|
||||
- "Which tool should I use for X?"
|
||||
- "How do I set up Y?"
|
||||
- "What's the difference between X and Y?"
|
||||
- "Is my analysis complete?"
|
||||
|
||||
### Strategic Guidance
|
||||
- "What should I focus on?"
|
||||
- "How do I beat my competitors?"
|
||||
- "Should I prioritize content or technical SEO?"
|
||||
- "What's my content strategy?"
|
||||
|
||||
### Performance Tracking
|
||||
- "How have I improved?"
|
||||
- "What's my trend?"
|
||||
- "Am I on track to meet my goals?"
|
||||
- "Where am I vs competitors?"
|
||||
|
||||
---
|
||||
|
||||
## Best Practices
|
||||
|
||||
### Ask Specific Questions
|
||||
❌ "My SEO is bad"
|
||||
✅ "My health score is 62. What are the most important improvements?"
|
||||
|
||||
### Provide Context
|
||||
❌ "How do I improve?"
|
||||
✅ "I'm an e-commerce site selling shoes. How should I improve my SEO?"
|
||||
|
||||
### Use in Combination
|
||||
- Ask Copilot for guidance
|
||||
- Run the recommended tool
|
||||
- Return to Copilot with results for next steps
|
||||
|
||||
### Regular Check-ins
|
||||
- Weekly: Ask about your progress
|
||||
- Monthly: Ask for strategic planning
|
||||
- Quarterly: Ask about competitive positioning
|
||||
|
||||
---
|
||||
|
||||
## Copilot Context
|
||||
|
||||
The Copilot has access to:
|
||||
- ✅ Your SEO analysis data
|
||||
- ✅ Your health score and metrics
|
||||
- ✅ Your platform integrations (GSC, GA4, Bing)
|
||||
- ✅ Your competitor analysis
|
||||
- ✅ Your content strategy
|
||||
- ✅ Your historical data and trends
|
||||
|
||||
### What Copilot Can Do
|
||||
- Explain your SEO data
|
||||
- Recommend tools and strategies
|
||||
- Prioritize actions
|
||||
- Guide you through processes
|
||||
- Suggest competitive opportunities
|
||||
- Help interpret results
|
||||
|
||||
### What Copilot Cannot Do
|
||||
- Directly modify your website
|
||||
- Access external websites (use analysis tools)
|
||||
- Execute fixes automatically
|
||||
- Guarantee specific ranking improvements
|
||||
- Replace professional SEO consulting
|
||||
|
||||
---
|
||||
|
||||
## Advanced Use Cases
|
||||
|
||||
### For Content Creators
|
||||
"I'm writing a blog post about digital marketing. How should I optimize it for SEO?"
|
||||
|
||||
Copilot will recommend:
|
||||
- Target keywords to use
|
||||
- Optimal content length
|
||||
- Structure recommendations
|
||||
- Meta tags to create
|
||||
- Image optimization tips
|
||||
|
||||
### For Digital Marketers
|
||||
"How should I structure my content strategy for the next quarter?"
|
||||
|
||||
Copilot will analyze:
|
||||
- Current content gaps
|
||||
- Competitor opportunities
|
||||
- Keyword opportunities
|
||||
- Content distribution
|
||||
- Publishing calendar recommendations
|
||||
|
||||
### For SEO Professionals
|
||||
"I need to improve rankings for high-value keywords. What's my strategy?"
|
||||
|
||||
Copilot will recommend:
|
||||
- On-page optimization priorities
|
||||
- Technical SEO improvements
|
||||
- Link building opportunities
|
||||
- Content expansion ideas
|
||||
- Competitive positioning tactics
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Copilot Seems Inaccurate
|
||||
- Ensure you've run a recent analysis
|
||||
- Provide more specific context
|
||||
- Try rephrasing your question
|
||||
- Run a tool to get more data
|
||||
|
||||
### Not Getting Useful Recommendations
|
||||
- Provide your URL or industry
|
||||
- Mention your goals
|
||||
- Ask follow-up questions
|
||||
- Check the recommended tool for more details
|
||||
|
||||
### Copilot Isn't Responding
|
||||
- Check your internet connection
|
||||
- Try refreshing the dashboard
|
||||
- Start a new conversation
|
||||
- Clear your browser cache
|
||||
|
||||
---
|
||||
|
||||
## Tips for Best Results
|
||||
|
||||
1. **Be Specific**: Include URLs, metrics, or goals
|
||||
2. **Ask Follow-ups**: "Tell me more about..." or "How do I...?"
|
||||
3. **Provide Context**: Mention your industry or goals
|
||||
4. **Use Tool Names**: "Use the PageSpeed Analyzer to..."
|
||||
5. **Ask for Priorities**: "What should I focus on first?"
|
||||
|
||||
---
|
||||
|
||||
## Integration with Other Tools
|
||||
|
||||
The Copilot works seamlessly with:
|
||||
- **Health Score**: "Explain my score"
|
||||
- **Analysis Tools**: "Use the Technical SEO tool"
|
||||
- **Competitive Analysis**: "How do I compare?"
|
||||
- **Content Strategy**: "Plan my content"
|
||||
- **Blog Writer**: "Optimize this page"
|
||||
|
||||
---
|
||||
|
||||
## Example Workflows
|
||||
|
||||
### Weekly SEO Review
|
||||
```
1. Ask: "What's my latest health score?"
2. Ask: "Should I run any new analysis?"
3. Ask: "What are my top priorities this week?"
4. Use recommended tools
5. Ask: "How did I improve?"
```
|
||||
|
||||
### Content Planning
|
||||
```
1. Ask: "What content opportunities do I have?"
2. Use Content Strategy Analyzer (recommended)
3. Ask: "Which topics should I prioritize?"
4. Ask: "What keywords should I target?"
5. Get recommendations for each piece of content
```
|
||||
|
||||
### Competitive Analysis
|
||||
```
1. Ask: "How do I compare to competitors?"
2. Use Competitive Analysis tool
3. Ask: "What's my competitive advantage?"
4. Ask: "Where am I behind?"
5. Get actionable improvement strategies
```
|
||||
|
||||
---
|
||||
|
||||
## Getting Help
|
||||
|
||||
The AI Copilot is always ready to help with:
|
||||
- **How-to questions** - "How do I...?"
|
||||
- **Explanation requests** - "Explain my..."
|
||||
- **Recommendations** - "What should I...?"
|
||||
- **Prioritization** - "What's most important?"
|
||||
- **Guidance** - "Guide me through..."
|
||||
|
||||
---
|
||||
|
||||
**Pro Tip**: The more specific you are with your questions and the more context you provide, the better and more actionable the Copilot's recommendations will be!
|
||||
@@ -1,427 +0,0 @@
|
||||
# Competitive Analysis Guide
|
||||
|
||||
## 🏆 Overview
|
||||
|
||||
ALwrity's Competitive Analysis tools help you understand your market position, discover opportunities, and stay ahead of competitors. Using Exa API semantic search and advanced analysis, you can benchmark your content, identify gaps, and develop winning strategies.
|
||||
|
||||
## 🎯 What You Can Do
|
||||
|
||||
### Competitor Discovery
|
||||
- Find direct and indirect competitors
|
||||
- Analyze competitor content strategies
|
||||
- Discover emerging threats
|
||||
- Identify market leaders
|
||||
|
||||
### Content Benchmarking
|
||||
- Compare content volume and structure
|
||||
- Analyze publishing frequency
|
||||
- Identify content gaps
|
||||
- Find topic opportunities
|
||||
|
||||
### Market Positioning
|
||||
- Compare keyword strategies
|
||||
- Analyze competitive advantages
|
||||
- Identify market opportunities
|
||||
- Benchmark performance metrics
|
||||
|
||||
### Strategic Insights
|
||||
- Deep competitive analysis
|
||||
- Market positioning assessment
|
||||
- Weakness identification
|
||||
- Opportunity detection
|
||||
|
||||
---
|
||||
|
||||
## Competitive Analysis Tools
|
||||
|
||||
### 1. 🏆 Competitive Analysis Tool
|
||||
**Purpose**: Discover and analyze your competition
|
||||
|
||||
**Features**:
|
||||
- Competitor discovery using Exa API
|
||||
- Content analysis across competitors
|
||||
- Benchmarking metrics
|
||||
- Market positioning insights
|
||||
|
||||
**Use When**:
|
||||
- Starting SEO strategy
|
||||
- Quarterly competitive review
|
||||
- Entering new market
|
||||
- Launching new content area
|
||||
|
||||
**Output**:
|
||||
```json
|
||||
{
|
||||
"competitors": [
|
||||
{
|
||||
"url": "competitor.com",
|
||||
"trust_score": 85,
|
||||
"content_volume": 450,
|
||||
"publishing_frequency": "3x/week",
|
||||
"strengths": ["Blog authority", "Video content"],
|
||||
"weaknesses": ["Mobile UX", "Page speed"]
|
||||
}
|
||||
],
|
||||
"market_position": "challenger",
|
||||
"opportunities": ["Video content", "Technical content"],
|
||||
"threats": ["Competitor launching premium tier"]
|
||||
}
|
||||
```
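As a rough illustration of how a report shaped like the one above could be consumed, the Python sketch below parses the sample payload and lists each competitor's weaknesses as candidate openings. The field names come from the example output. The helper name `summarize_competitors`, and the idea that the report is available as a JSON string, are assumptions for illustration and not a documented ALwrity API.

```python
import json

# Sample payload copied from the example output above.
SAMPLE_REPORT = """
{
  "competitors": [
    {
      "url": "competitor.com",
      "trust_score": 85,
      "content_volume": 450,
      "publishing_frequency": "3x/week",
      "strengths": ["Blog authority", "Video content"],
      "weaknesses": ["Mobile UX", "Page speed"]
    }
  ],
  "market_position": "challenger",
  "opportunities": ["Video content", "Technical content"],
  "threats": ["Competitor launching premium tier"]
}
"""


def summarize_competitors(report_json: str) -> list[str]:
    """Return one summary line per competitor (hypothetical helper)."""
    report = json.loads(report_json)
    lines = []
    for competitor in report.get("competitors", []):
        # Fall back to a placeholder when no weaknesses are reported.
        weaknesses = ", ".join(competitor.get("weaknesses", [])) or "none listed"
        lines.append(
            f"{competitor['url']} (trust {competitor['trust_score']}): "
            f"weak on {weaknesses}"
        )
    return lines


if __name__ == "__main__":
    for line in summarize_competitors(SAMPLE_REPORT):
        print(line)
```

Reading the weaknesses next to the top-level `opportunities` list is one way to decide which gaps to act on first.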
|
||||
|
||||
### 2. 📊 Sitemap Benchmarking
|
||||
**Purpose**: Compare content structure with competitors
|
||||
|
||||
**Features**:
|
||||
- Automatic competitor discovery
|
||||
- Sitemap structure comparison
|
||||
- Content distribution analysis
|
||||
- Publishing velocity comparison
|
||||
|
||||
**Metrics Analyzed**:
|
||||
- Total URLs
|
||||
- Content distribution by type
|
||||
- Publishing frequency
|
||||
- URL depth and structure
|
||||
- Content freshness
|
||||
|
||||
**Use When**:
|
||||
- Planning content strategy
|
||||
- Benchmarking content output
|
||||
- Identifying content gaps
|
||||
- Quarterly competitive review
|
||||
|
||||
**How to Use**:
|
||||
1. Run from SEO Dashboard
|
||||
2. System finds top competitors automatically
|
||||
3. Analyzes sitemaps in background
|
||||
4. You receive a comprehensive comparison report
|
||||
|
||||
**Output**:
|
||||
```
|
||||
Competitor Benchmark Report
|
||||
- Your Content: 250 pages (published 2x/week)
|
||||
- Competitor A: 400 pages (published 4x/week)
|
||||
- Competitor B: 320 pages (published 3x/week)
|
||||
Gap: Publishing 1-2x/week behind competitors
|
||||
Opportunity: Increase content production by 25%
|
||||
```
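The "Gap" line in the report above can be reproduced with simple arithmetic. The Python sketch below assumes frequencies are written as `<n>x/week`, as in the sample report. The function names are hypothetical and the tool's actual calculation may differ.

```python
def posts_per_week(frequency: str) -> int:
    """Parse a frequency string such as '3x/week' into an integer."""
    return int(frequency.split("x/")[0])


def publishing_gap(own: str, competitors: dict[str, str]) -> tuple[int, int]:
    """Return the (smallest, largest) weekly shortfall versus competitors.

    Assumes at least one competitor is supplied.
    """
    own_rate = posts_per_week(own)
    # A competitor that publishes less often than you counts as a gap of 0.
    shortfalls = [
        max(posts_per_week(freq) - own_rate, 0) for freq in competitors.values()
    ]
    return min(shortfalls), max(shortfalls)


if __name__ == "__main__":
    low, high = publishing_gap(
        "2x/week", {"Competitor A": "4x/week", "Competitor B": "3x/week"}
    )
    print(f"Gap: {low}-{high} posts/week behind competitors")
```

With the sample figures (2x/week against 4x/week and 3x/week) this gives a shortfall of 1 to 2 posts per week, matching the report.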
|
||||
|
||||
### 3. 🎭 Deep Competitor Analysis
|
||||
**Purpose**: In-depth competitive intelligence
|
||||
|
||||
**Features**:
|
||||
- Comprehensive competitor profiling
|
||||
- Market positioning analysis
|
||||
- Competitive advantages identification
|
||||
- Weakness analysis
|
||||
|
||||
**Analysis Includes**:
|
||||
- Content strategy analysis
|
||||
- SEO approach comparison
|
||||
- Marketing tactics evaluation
|
||||
- Brand positioning
|
||||
- Target audience alignment
|
||||
|
||||
**Use When**:
|
||||
- Quarterly strategic planning
|
||||
- Competitive threat analysis
|
||||
- Understanding market gaps
|
||||
- Developing differentiation strategy
|
||||
|
||||
### 4. 💬 Strategic Insights
|
||||
**Purpose**: Weekly AI-powered competitive strategy
|
||||
|
||||
**Features**:
|
||||
- Weekly strategy briefs
|
||||
- Competitive insights
|
||||
- Opportunity identification
|
||||
- Action recommendations
|
||||
|
||||
**Delivered**:
|
||||
- Weekly (scheduled emails)
|
||||
- Based on latest competitive data
|
||||
- Prioritized by impact
|
||||
- Actionable recommendations
|
||||
|
||||
**Topics Covered**:
|
||||
- Ranking changes
|
||||
- Competitor moves
|
||||
- Content opportunities
|
||||
- Market trends
|
||||
- Recommended actions
|
||||
|
||||
---
|
||||
|
||||
## How to Use Competitive Analysis
|
||||
|
||||
### Getting Started
|
||||
|
||||
#### Step 1: Identify Competitors
|
||||
1. Go to SEO Dashboard
|
||||
2. Click "Competitive Analysis"
|
||||
3. Enter your main competitors (up to 5)
|
||||
4. Or let system auto-discover competitors
|
||||
|
||||
#### Step 2: Run Analysis
|
||||
1. Select analysis type:
|
||||
- Quick Competitive Overview (5 minutes)
|
||||
- Deep Competitor Analysis (15 minutes)
|
||||
- Sitemap Benchmarking (background, 30+ minutes)
|
||||
2. Click "Analyze"
|
||||
3. View results when complete
|
||||
|
||||
#### Step 3: Review Insights
|
||||
1. Check competitor profiles
|
||||
2. Review market positioning
|
||||
3. Identify opportunities
|
||||
4. Note threats/challenges
|
||||
|
||||
### Weekly Workflow
|
||||
|
||||
```
|
||||
Monday: Review Strategic Insights email
|
||||
Wednesday: Run Competitive Analysis
|
||||
Friday: Update content strategy based on findings
|
||||
```
|
||||
|
||||
### Monthly Workflow
|
||||
|
||||
```
|
||||
1st Week: Deep Competitor Analysis
|
||||
2nd Week: Sitemap Benchmarking
|
||||
3rd Week: Content gap analysis
|
||||
4th Week: Strategic planning session
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Understanding Results
|
||||
|
||||
### Competitive Positioning
|
||||
|
||||
#### Market Positions
|
||||
- **Leader**: #1 market position, highest content volume, strong brand
|
||||
- **Challenger**: Strong position, competing effectively on key topics
|
||||
- **Niche Player**: Specialized position, strong in specific areas
|
||||
- **Emerging**: New player with growing presence
|
||||
|
||||
#### Your Position
|
||||
Based on:
|
||||
- Content volume vs. competitors
|
||||
- Keyword rankings vs. competitors
|
||||
- Publishing frequency
|
||||
- Domain authority
|
||||
- Backlink profile
|
||||
|
||||
### Opportunity Identification
|
||||
|
||||
#### Content Gaps
|
||||
Topics competitors cover but you don't:
|
||||
- **High Priority**: High search volume, competitors ranking well
|
||||
- **Medium Priority**: Moderate search volume, good opportunity
|
||||
- **Low Priority**: Low search volume, lower opportunity
|
||||
|
||||
#### Strength Areas
|
||||
Where you're beating competitors:
|
||||
- Topics you dominate
|
||||
- Keywords you rank for
|
||||
- Content types you excel at
|
||||
- Audience segments you reach
|
||||
|
||||
#### Threat Areas
|
||||
Where competitors are stronger:
|
||||
- Topics they dominate
|
||||
- Keywords you're losing
|
||||
- Publishing frequency gaps
|
||||
- Authority differences
|
||||
|
||||
---
|
||||
|
||||
## Analysis Examples
|
||||
|
||||
### Example 1: Content Strategy Gap
|
||||
```
|
||||
Finding: "Your competitors publish 4x/week, you publish 1x/week"
|
||||
Analysis:
|
||||
- Competitor A: 400 posts, 4x/week publishing
|
||||
- You: 100 posts, 1x/week publishing
|
||||
- Gap: 3 posts/week behind (competitor publishes 4x as often)
|
||||
Recommendation:
|
||||
- Increase publishing to 2-3x/week
|
||||
- Focus on high-opportunity topics
|
||||
- Consider guest posts/syndication
|
||||
```
|
||||
|
||||
### Example 2: Topic Gap
|
||||
```
|
||||
Finding: "Competitors rank for 'advanced SEO tactics', you don't"
|
||||
Analysis:
|
||||
- Competitor A ranks #2 for keyword
|
||||
- Competitor B ranks #5 for keyword
|
||||
- You: Not in top 10
|
||||
- Search volume: 5,000/month
|
||||
- Difficulty: Medium
|
||||
Recommendation:
|
||||
- Create comprehensive guide on topic
|
||||
- Target related long-tail keywords
|
||||
- Build internal links to new content
|
||||
```
|
||||
|
||||
### Example 3: Competitive Threat
|
||||
```
|
||||
Finding: "New competitor launched last month, ranking fast"
|
||||
Analysis:
|
||||
- Competitor C: Launched 30 days ago
|
||||
- Already ranking for 50 keywords
|
||||
- Average position: #8
|
||||
- Topics: Overlap with your main areas
|
||||
Recommendation:
|
||||
- Monitor closely for rank drops
|
||||
- Strengthen authority on key topics
|
||||
- Consider direct comparison content
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Best Practices
|
||||
|
||||
### Regular Monitoring
|
||||
- ✅ Check weekly strategic insights
|
||||
- ✅ Run deep analysis monthly
|
||||
- ✅ Update competitive data quarterly
|
||||
- ✅ Review opportunities regularly
|
||||
|
||||
### Acting on Insights
|
||||
1. **Identify Opportunities** - Find high-priority gaps
|
||||
2. **Prioritize** - Focus on high-impact opportunities
|
||||
3. **Plan Content** - Create strategic content plan
|
||||
4. **Execute** - Produce and optimize content
|
||||
5. **Monitor** - Track improvements
|
||||
|
||||
### Avoiding Mistakes
|
||||
- ❌ Don't copy competitor content
|
||||
- ❌ Don't ignore emerging competitors
|
||||
- ❌ Don't focus only on weak competitors
|
||||
- ❌ Don't neglect your strengths
|
||||
- ✅ Focus on your unique value proposition
|
||||
- ✅ Learn from competitors, don't copy
|
||||
- ✅ Build sustainable advantages
|
||||
|
||||
---
|
||||
|
||||
## Advanced Tactics
|
||||
|
||||
### Finding New Competitors
|
||||
Using the Competitive Analysis tool:
|
||||
1. Enter your main keywords
|
||||
2. Review top 10 ranking sites
|
||||
3. Analyze which are direct competitors
|
||||
4. Identify emerging threats
|
||||
|
||||
### Content Benchmarking Strategy
|
||||
1. Identify competitor's top content
|
||||
2. Analyze what makes it successful
|
||||
3. Create better/updated version
|
||||
4. Build more internal links
|
||||
5. Optimize aggressively
|
||||
|
||||
### Opportunity Prioritization
|
||||
Score opportunities by:
|
||||
- Search volume (higher is better)
|
||||
- Keyword difficulty (lower is better)
|
||||
- Commercial intent (varies by business)
|
||||
- Your ability to rank (competitive advantage)
|
||||
- Your content gaps (what you're missing)
|
||||
|
||||
### Market Expansion
|
||||
1. Identify competitor strengths
|
||||
2. Find adjacent opportunities
|
||||
3. Analyze market demand
|
||||
4. Develop expansion strategy
|
||||
5. Create content pillar
|
||||
|
||||
---
|
||||
|
||||
## Competitive Keywords
|
||||
|
||||
### Finding Competitive Keywords
|
||||
|
||||
1. **Rank Tracker Integration** (planned):
|
||||
- Your rankings vs. competitor rankings
|
||||
- Shared keywords
|
||||
- Keywords you're winning
|
||||
- Keywords you're losing
|
||||
|
||||
2. **Gap Analysis**:
|
||||
- Keywords competitors rank for
|
||||
- Keywords you should target
|
||||
- Keywords with highest opportunity
|
||||
|
||||
3. **Opportunity Scoring**:
|
||||
- Potential traffic opportunity
|
||||
- Effort to achieve
|
||||
- Competition level
|
||||
|
||||
---
|
||||
|
||||
## Integration with Other Tools
|
||||
|
||||
### Works With:
|
||||
- **Sitemap Analyzer** - Understand competitor structure
|
||||
- **Content Strategy Tool** - Plan competitive content
|
||||
- **Keyword Research** - Find competitor keywords
|
||||
- **Blog Writer** - Create competitive content
|
||||
- **AI Copilot** - Get strategic recommendations
|
||||
|
||||
### Typical Workflow:
|
||||
```
|
||||
1. Run Competitive Analysis → Get market insights
|
||||
2. Use Content Strategy Tool → Find gaps
|
||||
3. Use Copilot → Get recommendations
|
||||
4. Create content in Blog Writer → Implement strategy
|
||||
5. Track rankings → Measure success
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Common Questions
|
||||
|
||||
### Q: How often should I run competitive analysis?
|
||||
**A**:
|
||||
- Strategic Insights: Weekly (automatic)
|
||||
- Competitive Analysis: Monthly
|
||||
- Deep Analysis: Quarterly
|
||||
- Sitemap Benchmarking: Quarterly
|
||||
|
||||
### Q: How many competitors should I track?
|
||||
**A**: 3-5 is ideal:
|
||||
- 1-2 direct competitors
|
||||
- 1-2 content competitors
|
||||
- 1 emerging competitor
|
||||
|
||||
### Q: What if I have no competitors?
|
||||
**A**: Everyone has competitors:
|
||||
- Direct: Same products/services
|
||||
- Content: Creating similar content
|
||||
- Audience: Target same audience
|
||||
- Consider: Adjacent markets
|
||||
|
||||
### Q: Can I export the analysis?
|
||||
**A**: Yes, available as:
|
||||
- PDF report
|
||||
- CSV data
|
||||
- API access
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. **Run Your First Analysis**: Go to Competitive Analysis tool
|
||||
2. **Identify Your Competitors**: Add 3-5 top competitors
|
||||
3. **Review the Report**: Understand your market position
|
||||
4. **Make a Plan**: Use findings to guide strategy
|
||||
5. **Take Action**: Implement recommendations
|
||||
|
||||
---
|
||||
|
||||
**Ready to analyze your competition? Start with [Competitive Analysis Tool](../tools-reference.md) or ask the [AI Copilot](ai-copilot.md) for guidance!**
|
||||
@@ -1,466 +0,0 @@
|
||||
# Content Strategy Tool Guide
|
||||
|
||||
## 📊 Overview
|
||||
|
||||
The ALwrity Content Strategy Analyzer helps you identify content gaps, discover opportunities, plan your content calendar, and develop a data-driven content strategy. Using AI analysis and competitive intelligence, you can create content that ranks and converts.
|
||||
|
||||
## 🎯 What You Can Do
|
||||
|
||||
### Content Gap Analysis
|
||||
- Identify topics you're missing
|
||||
- Find competitor content opportunities
|
||||
- Analyze content distribution
|
||||
- Discover emerging trends
|
||||
|
||||
### Opportunity Identification
|
||||
- Score opportunities by potential
|
||||
- Identify high-volume keywords
|
||||
- Find low-competition topics
|
||||
- Discover audience needs
|
||||
|
||||
### Content Planning
|
||||
- Generate topic recommendations
|
||||
- Suggest content types
|
||||
- Plan publishing schedule
|
||||
- Create content clusters
|
||||
|
||||
### Competitive Positioning
|
||||
- Analyze competitor content strategies
|
||||
- Find content advantages
|
||||
- Identify differentiation opportunities
|
||||
- Plan content differentiation
|
||||
|
||||
---
|
||||
|
||||
## Content Strategy Analysis
|
||||
|
||||
### Analysis Components
|
||||
|
||||
#### 1. Content Gaps
|
||||
**What It Shows**:
|
||||
Topics your competitors cover that you don't
|
||||
- Missing high-opportunity topics
|
||||
- Underserved audience needs
|
||||
- Emerging trend areas
|
||||
- Topic clusters without coverage
|
||||
|
||||
**Opportunity Scoring**:
|
||||
- **Search Volume**: Monthly search interest
|
||||
- **Difficulty**: Competition level (easy to hard)
|
||||
- **Opportunity Score**: Combined potential (0-100)
|
||||
- **Recommended Content Types**: Blog, guide, video, etc.
|
||||
|
||||
**Example Output**:
|
||||
```
|
||||
Topic: "Advanced Email Marketing Strategies"
|
||||
- Search Volume: 12,000/month
|
||||
- Difficulty: Medium
|
||||
- Opportunity Score: 82/100
|
||||
- Recommended Types: Blog post, guide, video tutorial
|
||||
- Your Gap: Not in top 20 results
|
||||
- Competitor Ranking: Competitor A #3, B #8
|
||||
```
|
||||
|
||||
#### 2. Content Distribution
|
||||
**What It Shows**:
|
||||
How your content is distributed across types and topics
|
||||
- Blog posts vs. pages vs. guides
|
||||
- Topic distribution
|
||||
- Content depth analysis
|
||||
- Content freshness
|
||||
|
||||
**Comparison**:
|
||||
- Your distribution vs. competitors
|
||||
- Underserved content types
|
||||
- Overexposed areas
|
||||
- Rebalancing recommendations
|
||||
|
||||
#### 3. Publishing Velocity
|
||||
**What It Shows**:
|
||||
How frequently you and competitors publish
|
||||
- Your publishing rate (posts/week)
|
||||
- Competitor rates
|
||||
- Trend over time
|
||||
- Recommendations for optimal frequency
|
||||
|
||||
**Analysis**:
|
||||
- Are you publishing enough?
|
||||
- Publishing frequency trends
|
||||
- Recommended increase/decrease
|
||||
- Content quality vs. quantity balance
|
||||
|
||||
#### 4. Competitive Content Analysis
|
||||
**What It Shows**:
|
||||
What content your competitors are creating successfully
|
||||
- Their top-performing topics
|
||||
- Content types they excel at
|
||||
- Content gaps in their strategy
|
||||
- Differentiation opportunities
|
||||
|
||||
---
|
||||
|
||||
## How to Use the Content Strategy Tool
|
||||
|
||||
### Getting Started
|
||||
|
||||
#### Step 1: Run the Analysis
|
||||
1. Go to **Content Strategy Analyzer**
|
||||
2. Enter your website URL
|
||||
3. Add competitors (optional)
|
||||
4. Click **"Analyze Content Strategy"**
|
||||
5. Wait for analysis to complete (5-10 minutes)
|
||||
|
||||
#### Step 2: Review the Report
|
||||
The report includes:
|
||||
- **Executive Summary**: Key findings and opportunities
|
||||
- **Content Gaps**: Top 10 high-opportunity topics
|
||||
- **Gap Analysis**: Missing topics with scoring
|
||||
- **Competitive Positioning**: How you compare
|
||||
- **Recommendations**: Specific action items
|
||||
|
||||
#### Step 3: Make a Plan
|
||||
1. Identify top 3-5 opportunities
|
||||
2. Assign priorities
|
||||
3. Plan content calendar
|
||||
4. Assign ownership
|
||||
5. Set timelines
|
||||
|
||||
### Example Workflow
|
||||
|
||||
```
|
||||
Monday: Run content strategy analysis
|
||||
Tuesday: Review findings, identify top 10 opportunities
|
||||
Wednesday: Select top 5, create content briefs
|
||||
Thursday: Assign to team members
|
||||
Friday: Plan publishing schedule
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Understanding Your Results
|
||||
|
||||
### Opportunity Scores
|
||||
|
||||
#### Scoring Breakdown
|
||||
- **0-20**: Low opportunity (low volume, high competition)
|
||||
- **21-40**: Moderate opportunity (niche topics)
|
||||
- **41-60**: Good opportunity (decent volume, moderate competition)
|
||||
- **61-80**: High opportunity (strong volume, manageable competition)
|
||||
- **81-100**: Excellent opportunity (high volume, low competition)
|
||||
|
||||
#### What Affects Scoring
|
||||
1. **Search Volume** (40%) - Higher is better
|
||||
2. **Competition** (30%) - Lower difficulty is better
|
||||
3. **Relevance** (20%) - Match to your audience
|
||||
4. **Trend** (10%) - Rising trends get bonus points
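One way to read the weights and score bands above is as a weighted sum mapped onto a label. The Python sketch below is a minimal illustration of that reading. It assumes every input is already normalized to 0-100 and that difficulty is inverted so easier topics score higher. The guide does not state how the tool normalizes its inputs, so the function names and the normalization are assumptions.

```python
# Weights from "What Affects Scoring".
WEIGHTS = {"volume": 0.40, "competition": 0.30, "relevance": 0.20, "trend": 0.10}

# Upper bound of each band from "Scoring Breakdown".
BANDS = [
    (20, "Low"),
    (40, "Moderate"),
    (60, "Good"),
    (80, "High"),
    (100, "Excellent"),
]


def opportunity_score(
    volume: float, difficulty: float, relevance: float, trend: float
) -> float:
    """Combine four 0-100 inputs into a 0-100 score (assumed normalization)."""
    components = {
        "volume": volume,
        "competition": 100 - difficulty,  # lower difficulty is better
        "relevance": relevance,
        "trend": trend,
    }
    return sum(WEIGHTS[name] * value for name, value in components.items())


def score_band(score: float) -> str:
    """Map a 0-100 score onto the band labels listed above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for upper, label in BANDS:
        if score <= upper:
            return label
    raise AssertionError("unreachable: bands cover 0-100")


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the formula.
    score = opportunity_score(volume=90, difficulty=50, relevance=85, trend=70)
    print(f"{score:.0f}/100 -> {score_band(score)} opportunity")
```

Under these assumptions, the 82/100 score in the earlier example output would fall in the "Excellent" band.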
|
||||
|
||||
### Gap Types
|
||||
|
||||
#### Topic Gaps
|
||||
Entire topics you have not covered:
|
||||
- **Complete Gap**: Neither you nor competitors are strong
|
||||
- **Competitive Gap**: Competitors strong, you weak
|
||||
- **Emerging Gap**: New trend both miss
|
||||
|
||||
#### Content Type Gaps
|
||||
Missing specific content formats:
|
||||
- Blog posts (if competitors have videos)
|
||||
- Case studies (if missing examples)
|
||||
- Interactive content (if all text)
|
||||
- Video content (if no video)
|
||||
|
||||
#### Topic Cluster Gaps
|
||||
Missing clusters of related content:
|
||||
- Competitors have cluster, you don't
|
||||
- Cluster has high search volume
|
||||
- Your audience likely interested
|
||||
- Quick win opportunity
|
||||
|
||||
---
|
||||
|
||||
## Content Planning
|
||||
|
||||
### Creating Your Plan
|
||||
|
||||
#### Step 1: Prioritize Opportunities
|
||||
Score each gap:
|
||||
- **Impact Score**: Potential traffic gain (0-100)
|
||||
- **Effort Score**: Time/resources needed (0-100)
|
||||
- **Priority**: Impact ÷ Effort (higher = better)
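The prioritization rule above (Priority = Impact ÷ Effort) can be sketched in a few lines of Python. The `Gap` class, the `rank_gaps` helper, and the sample scores are hypothetical and exist only to show the arithmetic. Flooring effort at 1 is an added safeguard against division by zero, not something the guide specifies.

```python
from dataclasses import dataclass


@dataclass
class Gap:
    topic: str
    impact: float  # potential traffic gain, 0-100
    effort: float  # time/resources needed, 0-100

    @property
    def priority(self) -> float:
        # Priority = Impact / Effort; effort is floored at 1 to avoid
        # dividing by zero.
        return self.impact / max(self.effort, 1)


def rank_gaps(gaps: list[Gap]) -> list[Gap]:
    """Return gaps ordered from highest to lowest priority."""
    return sorted(gaps, key=lambda gap: gap.priority, reverse=True)


# Illustrative scores only; real values would come from your own analysis.
SAMPLE_GAPS = [
    Gap("Email Segmentation", impact=80, effort=40),
    Gap("Email Automation", impact=60, effort=60),
    Gap("A/B Testing Emails", impact=50, effort=20),
]

if __name__ == "__main__":
    for gap in rank_gaps(SAMPLE_GAPS):
        print(f"{gap.topic}: priority {gap.priority:.1f}")
```

A ratio rewards cheap wins: in the sample data the lowest-impact topic ranks first because it needs the least effort, so it is worth reviewing the ranking against strategic goals before committing.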
|
||||
|
||||
#### Step 2: Plan Content
|
||||
For each top opportunity:
|
||||
1. **Topic**: Clear, specific title
|
||||
2. **Keywords**: Primary + secondary keywords
|
||||
3. **Type**: Blog, guide, video, etc.
|
||||
4. **Length**: Recommended word count
|
||||
5. **Timeline**: When to publish
|
||||
|
||||
#### Step 3: Create Clusters
|
||||
Group related content:
|
||||
- **Pillar**: Main topic (comprehensive guide)
|
||||
- **Cluster**: Supporting topics (detailed guides)
|
||||
- **Resources**: Additional materials
|
||||
|
||||
#### Step 4: Publish & Optimize
|
||||
1. Create content
|
||||
2. Optimize for keywords
|
||||
3. Build internal links
|
||||
4. Publish on schedule
|
||||
5. Promote on social
|
||||
|
||||
### Example Plan
|
||||
|
||||
```
|
||||
Pillar Topic: "Email Marketing Strategy"
|
||||
- Pillar Content: Complete guide (5,000+ words)
|
||||
|
||||
Cluster Topics:
|
||||
1. Email Segmentation (2,000 words)
|
||||
2. Email Automation (2,000 words)
|
||||
3. A/B Testing Emails (1,500 words)
|
||||
4. Email Personalization (1,500 words)
|
||||
|
||||
Supporting Resources:
|
||||
- Email templates (downloadable)
|
||||
- Best practices checklist
|
||||
- Tools comparison guide
|
||||
- Case study example
|
||||
|
||||
Timeline:
|
||||
- Pillar: Week 1
|
||||
- Cluster 1-2: Week 2-3
|
||||
- Cluster 3-4: Week 4-5
|
||||
- Resources: Week 6
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Advanced Analysis
|
||||
|
||||
### Content Type Recommendations
|
||||
|
||||
The tool recommends optimal content types based on:
|
||||
- Your audience preferences
|
||||
- Topic characteristics
|
||||
- Competitor strategies
|
||||
- Search intent
|
||||
- Engagement potential
|
||||
|
||||
#### Typical Recommendations
|
||||
- **Blog Post**: General informational topics
|
||||
- **Comprehensive Guide**: In-depth, authoritative topics
|
||||
- **How-To Guide**: Procedural, step-by-step topics
|
||||
- **Tutorial**: Technical, complex topics
|
||||
- **Case Study**: Implementation, real-world examples
|
||||
- **Video**: Visual, demonstration topics
|
||||
- **Infographic**: Data, comparison topics
|
||||
- **Checklist**: Action-oriented topics
|
||||
|
||||
### Topic Clustering
|
||||
|
||||
The tool identifies natural clusters:
|
||||
- **Related Topics**: Naturally grouped topics
|
||||
- **Pillar Content**: Main comprehensive topic
|
||||
- **Supporting Content**: Detailed subtopics
|
||||
- **Internal Linking**: Connection strategy
|
||||
|
||||
### Trend Analysis
|
||||
|
||||
Identifies emerging trends:
|
||||
- **Rising Trends**: Topics gaining search interest
|
||||
- **Seasonal Topics**: Cyclical content opportunities
|
||||
- **Declining Trends**: Topics losing interest
|
||||
- **Timeless Topics**: Evergreen, stable content
|
||||
|
||||
---
|
||||
|
||||
## Content Calendar
|
||||
|
||||
### Planning Your Calendar
|
||||
|
||||
#### Monthly Planning
|
||||
1. Identify high-priority topics
|
||||
2. Assign to weeks
|
||||
3. Include supporting content
|
||||
4. Plan promotions
|
||||
|
||||
#### Quarterly Planning
|
||||
1. Set content themes
|
||||
2. Plan pillar topics
|
||||
3. Map cluster topics
|
||||
4. Set KPIs
|
||||
|
||||
#### Annual Planning
|
||||
1. Define content strategy
|
||||
2. Plan seasonal content
|
||||
3. Set annual goals
|
||||
4. Identify growth areas
|
||||
|
||||
### Example Calendar
|
||||
|
||||
```
|
||||
Month 1: Foundation
|
||||
- Pillar: "Complete SEO Guide" (Week 1)
|
||||
- Cluster: "Keyword Research" (Week 2)
|
||||
- Cluster: "On-Page SEO" (Week 3)
|
||||
- Update: Refresh old posts (Week 4)
|
||||
|
||||
Month 2: Building
|
||||
- Cluster: "Technical SEO" (Week 1)
|
||||
- Cluster: "Link Building" (Week 2)
|
||||
- Supporting: Templates & Tools (Week 3)
|
||||
- Promotion: Webinar, social (Week 4)
|
||||
|
||||
Month 3: Expansion
|
||||
- Cluster: "Content Strategy" (Week 1)
|
||||
- Case Study: Success story (Week 2)
|
||||
- Competitive: Competitor comparison (Week 3)
|
||||
- Review: Monthly analytics (Week 4)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Best Practices
|
||||
|
||||
### Planning Best Practices
|
||||
1. ✅ Start with high-opportunity topics
|
||||
2. ✅ Balance content types
|
||||
3. ✅ Create content clusters
|
||||
4. ✅ Plan 2-3 months ahead
|
||||
5. ✅ Include supporting content
|
||||
|
||||
### Content Creation Best Practices
|
||||
1. ✅ Research thoroughly before writing
|
||||
2. ✅ Optimize for primary + secondary keywords
|
||||
3. ✅ Build internal links to relevant content
|
||||
4. ✅ Include multimedia (images, videos)
|
||||
5. ✅ Update older content regularly
|
||||
|
||||
### Publishing Best Practices
|
||||
1. ✅ Maintain consistent schedule
|
||||
2. ✅ Promote on social media
|
||||
3. ✅ Build backlinks
|
||||
4. ✅ Monitor rankings
|
||||
5. ✅ Update based on performance
|
||||
|
||||
---
|
||||
|
||||
## Common Mistakes to Avoid
|
||||
|
||||
### Planning Mistakes
|
||||
- ❌ Picking only easy topics (low competition often = low volume)
|
||||
- ❌ Ignoring your audience needs
|
||||
- ❌ Publishing too infrequently
|
||||
- ❌ Creating isolated posts (no strategy)
|
||||
- ❌ Copying competitor content
|
||||
|
||||
### Execution Mistakes
|
||||
- ❌ Publishing without optimization
|
||||
- ❌ Forgetting internal linking
|
||||
- ❌ Neglecting images/multimedia
|
||||
- ❌ Not tracking performance
|
||||
- ❌ Giving up too quickly
|
||||
|
||||
### Strategy Mistakes
|
||||
- ❌ Only pursuing quick wins
|
||||
- ❌ Ignoring competitor moves
|
||||
- ❌ Not updating old content
|
||||
- ❌ Focusing only on rankings
|
||||
- ❌ Missing audience trends
|
||||
|
||||
---
|
||||
|
||||
## Integration with Other Tools
|
||||
|
||||
### Works With:
|
||||
- **Blog Writer** - Create planned content
|
||||
- **Metadata Generator** - Optimize titles/descriptions
|
||||
- **On-Page SEO** - Optimize created content
|
||||
- **Competitive Analysis** - Understand competitor strategy
|
||||
- **AI Copilot** - Get strategic recommendations
|
||||
|
||||
### Typical Workflow:
|
||||
```
|
||||
1. Content Strategy Tool → Identify opportunities
|
||||
2. AI Copilot → Get recommendations
|
||||
3. Blog Writer → Create content
|
||||
4. On-Page SEO → Optimize content
|
||||
5. SEO Dashboard → Track rankings
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Measuring Success
|
||||
|
||||
### Key Metrics to Track
|
||||
|
||||
#### Traffic Metrics
|
||||
- Organic traffic to new content
|
||||
- Traffic by content type
|
||||
- Traffic growth trend
|
||||
- Pages per session
|
||||
|
||||
#### Ranking Metrics
|
||||
- New keyword rankings
|
||||
- Ranking improvements
|
||||
- Top 10 positions
|
||||
- Rank 1 positions
|
||||
|
||||
#### Engagement Metrics
|
||||
- Average time on page
|
||||
- Bounce rate
|
||||
- Click-through rate
|
||||
- Social shares
|
||||
|
||||
#### Conversion Metrics
|
||||
- Leads from content
|
||||
- Sales from content
|
||||
- Cost per acquisition
|
||||
- Content ROI
|
||||
|
||||
### Measuring ROI
|
||||
|
||||
```
|
||||
Content ROI = (Revenue from Content - Content Cost) / Content Cost
|
||||
|
||||
Example:
|
||||
- 10 articles created = $5,000 cost
|
||||
- Generated $25,000 in revenue
|
||||
- ROI = ($25,000 - $5,000) / $5,000 = 400%
|
||||
```
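The ROI formula above translates directly into code. The following Python sketch uses the figures from the example. The function name `content_roi` is an assumption for illustration.

```python
def content_roi(revenue: float, cost: float) -> float:
    """Return content ROI as a fraction: (revenue - cost) / cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (revenue - cost) / cost


if __name__ == "__main__":
    # Figures from the example: $5,000 cost, $25,000 revenue.
    roi = content_roi(revenue=25_000, cost=5_000)
    print(f"ROI = {roi:.0%}")  # prints "ROI = 400%"
```

The function returns a fraction (4.0 here) and leaves percentage formatting to the caller, which keeps it easy to reuse in spreadsheets or reports.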
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. **Run Analysis**: Execute Content Strategy Analysis
|
||||
2. **Review Findings**: Understand your opportunities
|
||||
3. **Make Plan**: Create 90-day content calendar
|
||||
4. **Get Help**: Ask AI Copilot for recommendations
|
||||
5. **Create Content**: Use Blog Writer to create planned content
|
||||
6. **Optimize**: Use On-Page SEO to optimize
|
||||
7. **Track**: Monitor rankings and traffic
|
||||
|
||||
---
|
||||
|
||||
## Common Questions
|
||||
|
||||
### Q: How often should I run analysis?
|
||||
**A**: Monthly for active strategies, quarterly minimum
|
||||
|
||||
### Q: How many opportunities should I pursue?
|
||||
**A**: Start with top 5-10, one at a time
|
||||
|
||||
### Q: How long before I see results?
|
||||
**A**: 4-8 weeks for rankings, 8-12 weeks for traffic
|
||||
|
||||
### Q: Should I ignore easy topics?
|
||||
**A**: No! Include 20% easy wins, 80% strategic growth
|
||||
|
||||
### Q: Can I modify recommendations?
|
||||
**A**: Absolutely! Use them as guidance, not requirements
|
||||
|
||||
---
|
||||
|
||||
**Ready to plan your content strategy? Start with [Content Strategy Analyzer](tools-reference.md) or ask [AI Copilot](ai-copilot.md) for help!**
|
||||
@@ -1,345 +0,0 @@
|
||||
# SEO Dashboard Complete Documentation Index
|
||||
|
||||
Welcome to ALwrity's complete SEO Dashboard documentation. This index helps you find exactly what you need.
|
||||
|
||||
---
|
||||
|
||||
## 📚 Find What You Need
|
||||
|
||||
### 🆕 Just Getting Started?
|
||||
Start here to get up and running quickly:
|
||||
- **[Quick Start Guide](quick-start.md)** - Get optimizing in 10 minutes
|
||||
- **[Overview](overview.md)** - Understand the dashboard
|
||||
- **[Tools Reference](tools-reference.md)** - See all 21 tools at a glance
|
||||
|
||||
### 🛠️ Want to Learn Individual Tools?
|
||||
Each tool has a detailed guide:
|
||||
- **[Individual Tools Guide](individual-tools-guide.md)** - Complete guide to all 9 core tools:
|
||||
- Meta Description Generator
|
||||
- PageSpeed Analyzer
|
||||
- Sitemap Analyzer
|
||||
- Image Alt Text Generator
|
||||
- OpenGraph Generator
|
||||
- On-Page SEO Analyzer
|
||||
- Technical SEO Analyzer
|
||||
- Enterprise SEO Suite
|
||||
- Content Strategy Analyzer
|
||||
|
||||
### 📋 Ready to Create Workflows?
|
||||
Learn proven workflows and processes:
|
||||
- **[Workflows & Automation Guide](workflows-guide.md)** - 10+ real-world workflows:
|
||||
- Content Creation Pipeline
|
||||
- Website Audit & Improvement
|
||||
- Performance Optimization
|
||||
- Monthly SEO Maintenance
|
||||
- Industry-Specific Workflows
|
||||
- Quick Wins Strategy
|
||||
- Collaborative Team Workflows
|
||||
- Time-Based Workflows
|
||||
|
||||
### 🤖 Want AI Recommendations?
|
||||
Get strategic help from our AI:
|
||||
- **[AI Copilot Guide](ai-copilot.md)** - Learn to use conversational AI:
|
||||
- How to ask for recommendations
|
||||
- Content strategy help
|
||||
- Tool usage guidance
|
||||
- Problem solving with AI
|
||||
- Example conversations
|
||||
- Advanced use cases
|
||||
|
||||
### 🏆 Doing Competitive Research?
|
||||
Benchmark against competitors:
|
||||
- **[Competitive Analysis Guide](competitive-analysis.md)** - Understand your market:
|
||||
- Competitor discovery
|
||||
- Content benchmarking
|
||||
- Technical comparison
|
||||
- Opportunity identification
|
||||
- Market positioning strategies
|
||||
- Differentiation tactics
|
||||
|
||||
### 📝 Planning Content Strategy?
|
||||
Find content opportunities and plan:
|
||||
- **[Content Strategy Guide](content-strategy-guide.md)** - Plan your content:
|
||||
- Finding content gaps
|
||||
- Scoring opportunities
|
||||
- Building content clusters
|
||||
- Planning publishing calendar
|
||||
- Measuring ROI
|
||||
|
||||
### 🏷️ Learning About Metadata?
|
||||
Master SEO metadata:
|
||||
- **[Metadata Generation Guide](metadata.md)** - Complete metadata reference:
|
||||
- Meta descriptions
|
||||
- OpenGraph tags
|
||||
- Title tag optimization
|
||||
- Twitter cards
|
||||
- Schema markup
|
||||
- Structured data
|
||||
|
||||
### 🔗 Need GSC Integration Info?
|
||||
Connect your Google Search Console:
|
||||
- **[GSC Integration Guide](gsc-integration.md)** - Setup and usage
|
||||
|
||||
### 📐 Want Technical Details?
|
||||
Deep technical reference:
|
||||
- **[Design Document](design-document.md)** - Architecture and technical specs
|
||||
|
||||
---
|
||||
|
||||
## 📖 Documentation by Use Case
|
||||
|
||||
### For Content Creators
|
||||
**Goal**: Create great content that ranks
|
||||
|
||||
**Recommended Reading Order**:
|
||||
1. [Quick Start Guide](quick-start.md) - 10 min
|
||||
2. [Meta Description Generator](individual-tools-guide.md#1--meta-description-generator) - 5 min
|
||||
3. [On-Page SEO Analyzer](individual-tools-guide.md#6--on-page-seo-analyzer) - 10 min
|
||||
4. [Content Strategy Analyzer](individual-tools-guide.md#9--content-strategy-analyzer) - 10 min
|
||||
5. [Content Creation Workflow](workflows-guide.md#workflow-1-content-creation-pipeline) - 5 min
|
||||
|
||||
**Total Learning Time**: 40 minutes
|
||||
**First Task**: Create one optimized article
|
||||
|
||||
---
|
||||
|
||||
### For Digital Marketers
|
||||
**Goal**: Improve organic traffic and rankings
|
||||
|
||||
**Recommended Reading Order**:
|
||||
1. [Quick Start Guide](quick-start.md) - 10 min
|
||||
2. [Tools Reference](tools-reference.md) - 15 min
|
||||
3. [Competitive Analysis Guide](competitive-analysis.md) - 20 min
|
||||
4. [Content Strategy Guide](content-strategy-guide.md) - 30 min
|
||||
5. [Workflows & Automation](workflows-guide.md) - 30 min
|
||||
|
||||
**Total Learning Time**: 1.5-2 hours
|
||||
**First Task**: Run competitive analysis
|
||||
|
||||
---
|
||||
|
||||
### For SEO Professionals
|
||||
**Goal**: Comprehensive SEO optimization
|
||||
|
||||
**Recommended Reading Order**:
|
||||
1. [Overview](overview.md) - 10 min
|
||||
2. [Tools Reference](tools-reference.md) - 20 min
|
||||
3. [Individual Tools Guide](individual-tools-guide.md) - 45 min
|
||||
4. [Workflows & Automation](workflows-guide.md) - 45 min
|
||||
5. [Competitive Analysis Guide](competitive-analysis.md) - 30 min
|
||||
6. [Content Strategy Guide](content-strategy-guide.md) - 30 min
|
||||
7. [Design Document](design-document.md) - 15 min
|
||||
|
||||
**Total Learning Time**: 3-4 hours
|
||||
**First Task**: Run Enterprise SEO Suite audit
|
||||
|
||||
---
|
||||
|
||||
### For Developers/Technical Teams
|
||||
**Goal**: Ensure technical SEO health
|
||||
|
||||
**Recommended Reading Order**:
|
||||
1. [Quick Start Guide](quick-start.md) - 10 min
|
||||
2. [Technical SEO Analyzer](individual-tools-guide.md#7--technical-seo-analyzer) - 15 min
|
||||
3. [PageSpeed Analyzer](individual-tools-guide.md#2--pagespeed-analyzer) - 15 min
|
||||
4. [Design Document](design-document.md) - 20 min
|
||||
|
||||
**Total Learning Time**: 1 hour
|
||||
**First Task**: Run Technical SEO audit on website
|
||||
|
||||
---
|
||||
|
||||
### For Solopreneurs
|
||||
**Goal**: Quick wins with minimal time
|
||||
|
||||
**Recommended Reading Order**:
|
||||
1. [Quick Start Guide](quick-start.md) - 10 min
|
||||
2. [Quick Wins Workflow](workflows-guide.md#quick-wins-workflow) - 5 min
|
||||
3. [Individual Tools Guide](individual-tools-guide.md#choosing-the-right-tool) - 10 min
|
||||
|
||||
**Total Learning Time**: 25 minutes
|
||||
**First Task**: Complete quick wins (5-day plan)
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Quick Tool Selection Guide
|
||||
|
||||
### By Time Available
|
||||
|
||||
**I have 5 minutes:**
|
||||
- Use: Meta Description Generator
|
||||
- Run on: Homepage
|
||||
- Expected result: Updated meta descriptions
|
||||
|
||||
**I have 15 minutes:**
|
||||
- Use: On-Page SEO Analyzer
|
||||
- Run on: Top 3 pages
|
||||
- Expected result: Optimization checklist
|
||||
|
||||
**I have 30 minutes:**
|
||||
- Use: PageSpeed Analyzer + On-Page SEO
|
||||
- Run on: Top 5 pages
|
||||
- Expected result: Performance baseline + optimization plan
|
||||
|
||||
**I have 1 hour:**
|
||||
- Use: Technical SEO Analyzer + Content Strategy
|
||||
- Run on: Entire site + top opportunities
|
||||
- Expected result: Technical issues + content plan
|
||||
|
||||
**I have 2+ hours:**
|
||||
- Use: Enterprise SEO Suite + Competitive Analysis
|
||||
- Run on: Full website audit
|
||||
- Expected result: Comprehensive report + strategy
|
||||
|
||||
---
|
||||
|
||||
### By Goal
|
||||
|
||||
| Goal | Tool | Guide |
|
||||
|------|------|-------|
|
||||
| Quick content optimization | On-Page SEO Analyzer | [Link](individual-tools-guide.md#6--on-page-seo-analyzer) |
|
||||
| Improve search appearance | Meta Description Generator | [Link](individual-tools-guide.md#1--meta-description-generator) |
|
||||
| Social media optimization | OpenGraph Generator | [Link](individual-tools-guide.md#5--opengraph-generator) |
|
||||
| Find new content ideas | Content Strategy Analyzer | [Link](individual-tools-guide.md#9--content-strategy-analyzer) |
|
||||
| Fix website speed | PageSpeed Analyzer | [Link](individual-tools-guide.md#2--pagespeed-analyzer) |
|
||||
| Find technical issues | Technical SEO Analyzer | [Link](individual-tools-guide.md#7--technical-seo-analyzer) |
|
||||
| Understand your site | Sitemap Analyzer | [Link](individual-tools-guide.md#3--sitemap-analyzer) |
|
||||
| Optimize images | Image Alt Text Generator | [Link](individual-tools-guide.md#4--image-alt-text-generator) |
|
||||
| Complete audit | Enterprise SEO Suite | [Link](individual-tools-guide.md#8--enterprise-seo-suite) |
|
||||
| Beat competitors | Competitive Analysis | [Link](competitive-analysis.md) |
|
||||
| Plan strategy | Content Strategy Guide | [Link](content-strategy-guide.md) |
|
||||
| AI recommendations | AI Copilot | [Link](ai-copilot.md) |
|
||||
|
||||
---
|
||||
|
||||
## 📊 Quick Stats
|
||||
|
||||
### Available Tools
|
||||
- **9 Individual SEO Analysis Tools**
|
||||
- **12 Dashboard & Integration Tools**
|
||||
- **3+ Workflow Templates**
|
||||
- **21 Total Functional Tools**
|
||||
|
||||
### Documentation Coverage
|
||||
- **11 Comprehensive Guides**
|
||||
- **50+ Pages of Documentation**
|
||||
- **1000+ Real-World Examples**
|
||||
- **100+ Best Practices**
|
||||
- **10+ Complete Workflows**
|
||||
|
||||
### Learning Resources
|
||||
- Quick Start: 10 minutes
|
||||
- Individual Tool Guides: 45 minutes
|
||||
- Workflow Guides: 45 minutes
|
||||
- Complete Learning: 3-4 hours
|
||||
|
||||
---
|
||||
|
||||
## 🚀 Getting Started Now
|
||||
|
||||
### Path 1: Quick Start (10 minutes)
|
||||
```
|
||||
Read: Quick Start Guide
|
||||
Run: One tool analysis
|
||||
Expected Result: First optimization
|
||||
```
|
||||
|
||||
### Path 2: Smart Start (1 hour)
|
||||
```
|
||||
Read: Overview → Individual Tools Guide (choose 2-3)
|
||||
Run: On-Page SEO + One more tool
|
||||
Expected Result: Clear improvement plan
|
||||
```
|
||||
|
||||
### Path 3: Deep Dive (3-4 hours)
|
||||
```
|
||||
Read: Complete documentation
|
||||
Run: Multiple tool analyses
|
||||
Expected Result: Comprehensive strategy
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🔗 Navigation
|
||||
|
||||
### All Guides at a Glance
|
||||
|
||||
**User Guides:**
|
||||
- [Quick Start](quick-start.md) - New user orientation
|
||||
- [Overview](overview.md) - Dashboard overview
|
||||
- [Individual Tools Guide](individual-tools-guide.md) - Tool details
|
||||
|
||||
**Strategy Guides:**
|
||||
- [Content Strategy Guide](content-strategy-guide.md) - Content planning
|
||||
- [Competitive Analysis](competitive-analysis.md) - Market research
|
||||
- [AI Copilot Guide](ai-copilot.md) - AI assistant usage
|
||||
|
||||
**Implementation Guides:**
|
||||
- [Workflows & Automation](workflows-guide.md) - Proven workflows
|
||||
- [Metadata Generation](metadata.md) - Meta tag optimization
|
||||
|
||||
**Reference:**
|
||||
- [Tools Reference](tools-reference.md) - Complete tool inventory
|
||||
- [Design Document](design-document.md) - Technical reference
|
||||
- [GSC Integration](gsc-integration.md) - Platform integration
|
||||
|
||||
---
|
||||
|
||||
## ❓ Common Questions
|
||||
|
||||
**Q: Where do I start?**
|
||||
A: See [Quick Start Guide](quick-start.md)
|
||||
|
||||
**Q: How do I choose a tool?**
|
||||
A: See [Tools Reference](tools-reference.md) or use the tool selection guide above
|
||||
|
||||
**Q: What's the best workflow for my situation?**
|
||||
A: See [Workflows & Automation](workflows-guide.md)
|
||||
|
||||
**Q: How long until I see results?**
|
||||
A: Typically 4-8 weeks for ranking changes. See [Quick Start FAQ](quick-start.md#common-questions-for-beginners)
|
||||
|
||||
**Q: How often should I run analyses?**
|
||||
A: See [Individual Tools Guide](individual-tools-guide.md#quick-reference) for recommended frequency
|
||||
|
||||
**Q: Can I get AI help?**
|
||||
A: Yes! See [AI Copilot Guide](ai-copilot.md)
|
||||
|
||||
---
|
||||
|
||||
## 📞 Need More Help?
|
||||
|
||||
1. **Check this index** - The guide you need is probably linked above
|
||||
2. **Ask AI Copilot** - Use the chat in your dashboard
|
||||
3. **Review relevant guide** - Each guide has detailed examples
|
||||
4. **Check Tools Reference** - Complete tool specifications
|
||||
|
||||
---
|
||||
|
||||
## 📈 What You'll Accomplish
|
||||
|
||||
After using these guides, you'll be able to:
|
||||
|
||||
- ✅ Understand all 21 SEO tools available
|
||||
- ✅ Optimize pages for better rankings
|
||||
- ✅ Create content strategy
|
||||
- ✅ Find competitive opportunities
|
||||
- ✅ Implement proven workflows
|
||||
- ✅ Measure and track improvements
|
||||
- ✅ Get AI recommendations
|
||||
- ✅ Scale your SEO efforts
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Ready to Start?
|
||||
|
||||
1. **New User?** → Start with [Quick Start Guide](quick-start.md)
|
||||
2. **Ready to Optimize?** → Choose a tool from [Tools Reference](tools-reference.md)
|
||||
3. **Want Strategy?** → Read [Content Strategy Guide](content-strategy-guide.md)
|
||||
4. **Need Workflows?** → Check [Workflows & Automation](workflows-guide.md)
|
||||
|
||||
---
|
||||
|
||||
**Let's start optimizing! 🚀**
|
||||
|
||||
Pick your starting point above and begin your SEO journey.
|
||||
@@ -1,548 +0,0 @@
|
||||
# Individual SEO Tools Guide
|
||||
|
||||
## 🛠️ Overview
|
||||
|
||||
This guide covers each of ALwrity's 9 individual SEO analysis tools, how to use them, and when to use each one.
|
||||
|
||||
---
|
||||
|
||||
## 1. 📝 Meta Description Generator
|
||||
|
||||
### What It Does
|
||||
Generates AI-powered SEO-optimized meta descriptions that:
|
||||
- Include target keywords naturally
|
||||
- Stay within optimal length (150-160 characters)
|
||||
- Include a compelling call to action
|
||||
- Improve click-through rates
|
||||
|
||||
### When to Use
|
||||
- Creating new pages
|
||||
- Updating old pages
|
||||
- Testing description improvements
|
||||
- Preparing for social media repurposing
|
||||
|
||||
### How to Use
|
||||
```
|
||||
1. Go to SEO Dashboard → Meta Description Generator
|
||||
2. Enter your target keywords (comma-separated)
|
||||
3. Select tone (Professional, Casual, Friendly, etc.)
|
||||
4. Choose search intent (Informational, Commercial, Transactional)
|
||||
5. Select language
|
||||
6. Click "Generate"
|
||||
7. Review multiple options
|
||||
8. Copy and use on your page
|
||||
```
|
||||
|
||||
### Example
|
||||
```
|
||||
Input: Keywords: "SEO, content marketing, rankings"
|
||||
Tone: Professional
|
||||
Intent: Informational
|
||||
|
||||
Output:
|
||||
- "Learn proven SEO & content marketing strategies to boost your rankings. Get actionable tips from industry experts."
|
||||
- "Master SEO and content marketing to increase organic traffic. Complete guide with practical examples."
|
||||
- "Discover how SEO and content marketing drive rankings and traffic. Step-by-step strategies for success."
|
||||
```
|
||||
|
||||
### Pro Tips
|
||||
- ✅ Include primary keyword in first 120 characters
|
||||
- ✅ Include compelling benefit or question
|
||||
- ✅ Test multiple descriptions to find best performer
|
||||
- ✅ Monitor CTR to measure effectiveness
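The length and keyword-placement guidance above can be checked mechanically before a description is pasted into a page. The following is a minimal sketch, not ALwrity's implementation: `check_meta_description` is a hypothetical helper, and it assumes the 150-160 character range and the 120-character keyword window stated in this guide.

```python
def check_meta_description(description: str, primary_keyword: str) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    length = len(description)
    # Length range taken from this guide (150-160 characters).
    if length < 150:
        problems.append(f"too short: {length} characters (target 150-160)")
    elif length > 160:
        problems.append(f"too long: {length} characters (target 150-160)")
    # Case-insensitive search; the whole keyword must end within the first 120 characters.
    position = description.lower().find(primary_keyword.lower())
    if position == -1:
        problems.append("primary keyword not found")
    elif position + len(primary_keyword) > 120:
        problems.append("primary keyword appears after the first 120 characters")
    return problems


if __name__ == "__main__":
    draft = "Learn proven SEO & content marketing strategies to boost your rankings."
    for problem in check_meta_description(draft, "SEO"):
        print(problem)
```

Running a check like this on each generated option makes it easier to compare candidates before testing them for click-through rate.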
|
||||
|
||||
---
|
||||
|
||||
## 2. ⚡ PageSpeed Analyzer
|
||||
|
||||
### What It Does
|
||||
Analyzes your page performance using Google PageSpeed Insights API and provides:
|
||||
- Performance scores (desktop/mobile)
|
||||
- Core Web Vitals (LCP, FID, CLS)
|
||||
- Optimization opportunities
|
||||
- Business impact analysis
|
||||
|
||||
### When to Use
|
||||
- Initial performance baseline
|
||||
- After making performance improvements
|
||||
- Before/after optimization comparison
|
||||
- Competitive performance comparison
|
||||
- Monthly performance tracking
|
||||
|
||||
### How to Use
|
||||
```
|
||||
1. Go to SEO Dashboard → PageSpeed Analyzer
|
||||
2. Enter page URL
|
||||
3. Select strategy (Desktop or Mobile)
|
||||
4. Click "Analyze"
|
||||
5. Wait for analysis (5-8 seconds)
|
||||
6. Review scores and opportunities
|
||||
7. Prioritize fixes by impact
|
||||
```
|
||||
|
||||
### Understanding Scores
|
||||
- **90-100**: Excellent (Good to go)
|
||||
- **80-89**: Good (Minor improvements available)
|
||||
- **50-79**: Needs Improvement (Address issues)
|
||||
- **0-49**: Poor (Critical issues)
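When scores are tracked in a spreadsheet or script, the bands above reduce to a small lookup. This is an illustrative sketch: `score_band` is a hypothetical name, and the cut-offs are the ones listed in this guide (PageSpeed Insights itself groups 50-89 into a single band).

```python
def score_band(score: int) -> str:
    """Map a 0-100 performance score to the bands listed in this guide."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 90:
        return "Excellent"
    if score >= 80:
        return "Good"
    if score >= 50:
        return "Needs Improvement"
    return "Poor"


if __name__ == "__main__":
    for page, score in {"/": 93, "/blog": 81, "/pricing": 47}.items():
        print(page, score, score_band(score))
```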
|
||||
|
||||
### Key Metrics
|
||||
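- **LCP** (Largest Contentful Paint): How quickly the largest visible element renders
|
||||
- **FID** (First Input Delay): How quickly the page responds to the first user interaction
|
||||
- **CLS** (Cumulative Layout Shift): How much the layout shifts unexpectedly while loading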
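To interpret raw metric values, each one can be compared against a pair of limits. The sketch below uses the thresholds Google publishes for these three metrics; they are not taken from this guide, and `rate_vital` is a hypothetical helper rather than anything the analyzer exposes.

```python
# (good_limit, needs_improvement_limit) per metric; values above the second limit are "poor".
# Thresholds are Google's published ones, not values stated in this guide.
THRESHOLDS = {
    "LCP": (2.5, 4.0),   # seconds
    "FID": (100, 300),   # milliseconds
    "CLS": (0.1, 0.25),  # unitless layout-shift score
}


def rate_vital(metric: str, value: float) -> str:
    """Classify one Core Web Vitals measurement as good, needs improvement, or poor."""
    good, needs_improvement = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= needs_improvement:
        return "needs improvement"
    return "poor"


if __name__ == "__main__":
    for metric, value in [("LCP", 3.1), ("FID", 80), ("CLS", 0.3)]:
        print(metric, value, rate_vital(metric, value))
```

Google replaced FID with Interaction to Next Paint (INP) as a Core Web Vital in March 2024, so a report may show INP instead of FID.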
|
||||
|
||||
### Pro Tips
|
||||
- ✅ Analyze both desktop and mobile
|
||||
- ✅ Focus on opportunities with highest impact
|
||||
- ✅ Optimize images first (biggest impact)
|
||||
- ✅ Monitor improvements monthly
|
||||
|
||||
---
|
||||
|
||||
## 3. 🗺️ Sitemap Analyzer
|
||||
|
||||
### What It Does
|
||||
Analyzes your website structure and content strategy:
|
||||
- URL patterns and organization
|
||||
- Content distribution across topics
|
||||
- Publishing frequency and velocity
|
||||
- Content trends and patterns
|
||||
- AI-powered strategic insights
|
||||
|
||||
### When to Use
|
||||
- Initial website audit
|
||||
- Content strategy planning
|
||||
- Competitive benchmarking
|
||||
- Quarterly strategy review
|
||||
- When planning content expansion
|
||||
|
||||
### How to Use
|
||||
```
|
||||
1. Go to SEO Dashboard → Sitemap Analyzer
|
||||
2. Enter your sitemap URL (e.g., example.com/sitemap.xml)
|
||||
3. Choose analysis options:
|
||||
- Analyze content trends: Yes/No
|
||||
- Analyze publishing patterns: Yes/No
|
||||
4. Click "Analyze"
|
||||
5. Wait for analysis (10-15 seconds)
|
||||
6. Review structure, trends, and recommendations
|
||||
```
|
||||
|
||||
### What You'll Learn
|
||||
- Total URLs and content volume
|
||||
- Content distribution by topic
|
||||
- Publishing frequency
|
||||
- URL structure quality
|
||||
- Content freshness
|
||||
- Growth opportunities
|
||||
- SEO recommendations
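Several of these figures (total URLs, distribution by section, publishing frequency) can be approximated directly from a sitemap file. The following sketch uses only the Python standard library and assumes a standard `urlset` sitemap with optional `lastmod` dates. `summarize_sitemap` is a hypothetical helper, and grouping by first path segment is a simplification of whatever topic analysis the tool performs.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def summarize_sitemap(xml_text: str) -> dict:
    """Count URLs overall, per top-level path segment, and per lastmod month."""
    root = ET.fromstring(xml_text)
    sections = Counter()
    months = Counter()
    total = 0
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        if not loc:
            continue
        total += 1
        path = urlparse(loc).path.strip("/")
        sections[path.split("/")[0] if path else "(root)"] += 1
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS).strip()
        if len(lastmod) >= 7:
            months[lastmod[:7]] += 1  # keep the YYYY-MM prefix
    return {"total_urls": total, "by_section": dict(sections), "by_month": dict(months)}


SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-01-05</lastmod></url>
  <url><loc>https://example.com/blog/seo-basics</loc><lastmod>2024-01-20</lastmod></url>
  <url><loc>https://example.com/blog/meta-tags</loc><lastmod>2024-02-11</lastmod></url>
</urlset>"""

if __name__ == "__main__":
    print(summarize_sitemap(SAMPLE))
```

Comparing the `by_month` counts across runs gives a rough view of publishing velocity.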
|
||||
|
||||
### Pro Tips
|
||||
- ✅ Run monthly to track content growth
|
||||
- ✅ Compare with competitors' sitemaps
|
||||
- ✅ Use insights to plan content strategy
|
||||
- ✅ Track publishing velocity to maintain consistency
|
||||
|
||||
---
|
||||
|
||||
## 4. 🖼️ Image Alt Text Generator
|
||||
|
||||
### What It Does
|
||||
Generates SEO-optimized alt text for images using AI vision:
|
||||
- Describes image content accurately
|
||||
- Incorporates target keywords naturally
|
||||
- Optimizes for accessibility (WCAG compliance)
|
||||
- Improves image search rankings
|
||||
|
||||
### When to Use
|
||||
- Publishing new content with images
|
||||
- Updating old content without alt text
|
||||
- Optimizing for image search
|
||||
- Accessibility compliance
|
||||
- Before archiving images
|
||||
|
||||
### How to Use
|
||||
|
||||
#### Option 1: Upload Image
|
||||
```
|
||||
1. Go to SEO Dashboard → Image Alt Text Generator
|
||||
2. Click "Upload Image"
|
||||
3. Select image from computer
|
||||
4. Enter context (optional): What the image is about
|
||||
5. Enter keywords (optional): Keywords to include
|
||||
6. Click "Generate Alt Text"
|
||||
7. Review and copy results
|
||||
```
|
||||
|
||||
#### Option 2: Image URL
|
||||
```
|
||||
1. Go to SEO Dashboard → Image Alt Text Generator
|
||||
2. Click "Analyze by URL"
|
||||
3. Paste image URL
|
||||
4. Enter context (optional)
|
||||
5. Enter keywords (optional)
|
||||
6. Click "Generate Alt Text"
|
||||
7. Review and copy results
|
||||
```
|
||||
|
||||
### Example
|
||||
```
|
||||
Image: Product photo of blue laptop
|
||||
|
||||
AI-Generated Alt Text:
|
||||
- "Blue laptop with ergonomic design on white background"
|
||||
- "Dell XPS 13 laptop opened showing keyboard and screen"
|
||||
- "Professional laptop for developers - blue aluminum design"
|
||||
```
|
||||
|
||||
### Pro Tips
|
||||
- ✅ Keep alt text concise (under 125 characters)
|
||||
- ✅ Include brand/product name when relevant
|
||||
- ✅ Describe the image, not the context
|
||||
- ✅ Use keywords naturally, don't stuff
|
||||
- ✅ Update all old images gradually
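To find which old images still need attention, a page's HTML can be scanned for `img` tags with missing or overlong alt text. This is a minimal sketch built on the standard library's `html.parser`. The class and function names are hypothetical, and the 125-character limit is the one suggested in the tips above.

```python
from html.parser import HTMLParser


class AltTextAuditor(HTMLParser):
    """Collect (src, problem) pairs for images whose alt text needs work."""

    def __init__(self, max_length: int = 125):
        super().__init__()
        self.max_length = max_length
        self.issues = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "(no src)")
        alt = attributes.get("alt")
        # Note: an empty alt is valid for purely decorative images; review those by hand.
        if alt is None or not alt.strip():
            self.issues.append((src, "missing alt text"))
        elif len(alt) > self.max_length:
            self.issues.append((src, f"alt text is {len(alt)} characters"))


def audit_alt_text(html: str) -> list:
    """Return the list of alt-text issues found in an HTML string."""
    auditor = AltTextAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.issues


if __name__ == "__main__":
    page = '<img src="laptop.jpg" alt="Blue laptop on white background"><img src="hero.png">'
    for src, problem in audit_alt_text(page):
        print(src, "->", problem)
```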
|
||||
|
||||
---
|
||||
|
||||
## 5. 📱 OpenGraph Generator
|
||||
|
||||
### What It Does
|
||||
Creates platform-specific social media tags for:
|
||||
- Facebook sharing optimization
|
||||
- Twitter cards
|
||||
- LinkedIn preview
|
||||
- Pinterest optimization
|
||||
- Other social platforms
|
||||
|
||||
### When to Use
|
||||
- Creating new content
|
||||
- Updating existing pages for social
|
||||
- Before launching social media campaign
|
||||
- To improve social sharing appearance
|
||||
- When content isn't sharing well
|
||||
|
||||
### How to Use
|
||||
```
|
||||
1. Go to SEO Dashboard → OpenGraph Generator
|
||||
2. Enter page URL
|
||||
3. Enter title hint (optional)
|
||||
4. Enter description hint (optional)
|
||||
5. Select platform (General, Facebook, Twitter, LinkedIn, Pinterest)
|
||||
6. Click "Generate Tags"
|
||||
7. Copy HTML code to page
|
||||
```
|
||||
|
||||
### Platforms Covered
|
||||
- **General**: Works across all platforms
|
||||
- **Facebook**: Optimized for Facebook sharing
|
||||
- **Twitter**: Twitter Card format
|
||||
- **LinkedIn**: LinkedIn sharing optimization
|
||||
- **Pinterest**: Pinterest Pin optimization
|
||||
|
||||
### Example Output
|
||||
```html
|
||||
<!-- Generated OpenGraph Tags -->
|
||||
<meta property="og:title" content="10 Content Marketing Strategies for 2024">
|
||||
<meta property="og:description" content="Learn proven strategies to boost your content marketing ROI. Get actionable tips and templates.">
|
||||
<meta property="og:image" content="https://example.com/images/content-strategy.jpg">
|
||||
<meta property="og:url" content="https://example.com/article/content-marketing">
|
||||
<meta property="og:type" content="article">
|
||||
<meta property="og:site_name" content="Example Site">
|
||||
```
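Tags like the ones above can also be assembled in a script, for example when a template needs them for many pages. The sketch below is an assumption-laden illustration, not the generator's actual code: `build_og_tags` is a hypothetical helper that escapes each value so quotes or ampersands in a title cannot break the attribute.

```python
from html import escape


def build_og_tags(fields: dict) -> str:
    """Render OpenGraph meta tags, one per line, from a {property: value} mapping."""
    lines = []
    for name, value in fields.items():
        # escape(..., quote=True) also escapes double quotes, keeping the attribute valid.
        lines.append(
            f'<meta property="og:{escape(name, quote=True)}" '
            f'content="{escape(value, quote=True)}">'
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_og_tags({
        "title": "10 Content Marketing Strategies for 2024",
        "type": "article",
        "url": "https://example.com/article/content-marketing",
    }))
```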
|
||||
|
||||
### Pro Tips
|
||||
- ✅ Use high-quality images (1200x630px minimum)
|
||||
- ✅ Test on each platform before publishing
|
||||
- ✅ Keep descriptions concise (200 characters max)
|
||||
- ✅ Use consistent branding across platforms
|
||||
|
||||
---
|
||||
|
||||
## 6. 📄 On-Page SEO Analyzer
|
||||
|
||||
### What It Does
|
||||
Comprehensive page-level SEO analysis covering:
|
||||
- Meta tags optimization
|
||||
- Content quality and relevance
|
||||
- Keyword optimization
|
||||
- Internal linking analysis
|
||||
- Image SEO optimization
|
||||
- Mobile friendliness
|
||||
- Accessibility compliance
|
||||
|
||||
### When to Use
|
||||
- Before publishing new pages
|
||||
- Optimizing existing pages
|
||||
- Improving underperforming pages
|
||||
- Competitive page comparison
|
||||
- SEO audit preparation
|
||||
|
||||
### How to Use
|
||||
```
|
||||
1. Go to SEO Dashboard → On-Page SEO Analyzer
|
||||
2. Enter page URL
|
||||
3. Enter target keywords (optional)
|
||||
4. Select options:
|
||||
- Analyze images: Yes/No
|
||||
- Analyze content quality: Yes/No
|
||||
5. Click "Analyze"
|
||||
6. Wait for analysis (8-12 seconds)
|
||||
7. Review scores and recommendations
|
||||
8. Implement changes
|
||||
```
|
||||
|
||||
### What You Get
|
||||
- **Overall Score**: 0-100 rating
|
||||
- **Meta Tags Analysis**: Title, description, headers
|
||||
- **Content Analysis**: Quality, relevance, keyword usage
|
||||
- **Technical Analysis**: Links, images, structure
|
||||
- **Performance Metrics**: Load time, mobile-friendliness
|
||||
- **Critical Issues**: Must-fix problems
|
||||
- **Warnings**: Should-fix issues
|
||||
- **Recommendations**: Nice-to-fix suggestions
|
||||
|
||||
### Pro Tips
|
||||
- ✅ Target 80+ score before publishing
|
||||
- ✅ Fix critical issues first
|
||||
- ✅ Use primary keyword in title and first 100 words
|
||||
- ✅ Include related keywords naturally
|
||||
- ✅ Build internal links to related pages
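The keyword-placement tip (primary keyword in the title and in the first 100 words) is easy to verify on a draft. The following is a rough sketch under simple assumptions: words are split on whitespace, matching is case-insensitive, and `keyword_placement` is a hypothetical helper rather than part of the analyzer.

```python
import re


def keyword_placement(title: str, body: str, keyword: str) -> dict:
    """Report whether the keyword appears in the title and in the first 100 words."""
    keyword = keyword.lower()
    # Rejoin with single spaces so multi-word keywords match across line breaks.
    first_100_words = " ".join(re.findall(r"\S+", body)[:100]).lower()
    return {
        "in_title": keyword in title.lower(),
        "in_first_100_words": keyword in first_100_words,
    }


if __name__ == "__main__":
    report = keyword_placement(
        title="Content Marketing Strategies That Work",
        body="Content marketing\nhelps teams attract readers over time.",
        keyword="content marketing",
    )
    print(report)
```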
|
||||
|
||||
---
|
||||
|
||||
## 7. 🔧 Technical SEO Analyzer
|
||||
|
||||
### What It Does
|
||||
Comprehensive technical SEO audit including:
|
||||
- Site crawling (customizable depth)
|
||||
- Robots.txt analysis
|
||||
- Sitemap validation
|
||||
- Canonicalization audit
|
||||
- Redirect chain detection
|
||||
- Broken link identification
|
||||
- Mobile usability analysis
|
||||
- Performance metrics
|
||||
|
||||
### When to Use
|
||||
- Initial technical SEO audit
|
||||
- After major site changes
|
||||
- When experiencing ranking drops
|
||||
- Quarterly SEO maintenance
|
||||
- Before large campaigns
|
||||
|
||||
### How to Use
|
||||
```
|
||||
1. Go to SEO Dashboard → Technical SEO Analyzer
|
||||
2. Enter site URL
|
||||
3. Set crawl depth (1-5)
|
||||
- 1: Homepage only
|
||||
- 3: Recommended starting point
|
||||
- 5: Comprehensive crawl
|
||||
4. Select options:
|
||||
- Include external links: Yes/No
|
||||
- Analyze performance: Yes/No
|
||||
5. Click "Analyze"
|
||||
6. Wait for crawl (15-30 seconds depending on depth)
|
||||
7. Review issues by severity
|
||||
8. Prioritize fixes
|
||||
```
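The crawl depth setting can be pictured as a breadth-first walk that stops following links at a fixed level. The sketch below works on an in-memory link map instead of a live site and assumes, as in the steps above, that depth 1 means the homepage only. `pages_within_depth` is a hypothetical name, and the tool's real crawler may behave differently.

```python
from collections import deque


def pages_within_depth(links: dict, start: str, max_depth: int) -> list:
    """Breadth-first walk; the start page counts as depth 1 ("Homepage only")."""
    if max_depth < 1:
        raise ValueError("max_depth must be at least 1")
    seen = {start}
    queue = deque([(start, 1)])
    order = []
    while queue:
        page, depth = queue.popleft()
        order.append(page)
        if depth == max_depth:
            continue  # do not follow links beyond the limit
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return order


# Hypothetical site structure: page -> pages it links to.
SITE = {
    "/": ["/blog", "/about"],
    "/blog": ["/blog/post-1", "/"],
    "/blog/post-1": ["/blog/post-1/comments"],
}

if __name__ == "__main__":
    for depth in (1, 2, 3):
        print(depth, pages_within_depth(SITE, "/", depth))
```

Because the number of pages can grow quickly with each level, a moderate depth keeps crawls of large sites manageable.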
|
||||
|
||||
### Issue Severity Levels
|
||||
- **Critical**: Prevent indexing, hurt rankings
|
||||
- **High**: Significantly impact SEO
|
||||
- **Medium**: Minor SEO impact
|
||||
- **Low**: Good to fix, lower priority
|
||||
|
||||
### Typical Issues Found
|
||||
- Crawl errors (4xx, 5xx)
|
||||
- Redirect chains
|
||||
- Broken internal links
|
||||
- Missing meta tags
|
||||
- Duplicate content
|
||||
- Mobile usability issues
|
||||
- Page speed problems
|
||||
- Missing structured data
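Redirect chains are one of the easier issues to reason about offline. Given a map of known redirects (for example, exported from server configuration), the chain for any URL can be followed step by step. This is an illustrative sketch with hypothetical names. It does not make HTTP requests and is not the analyzer's detection logic.

```python
def redirect_chain(redirects: dict, url: str, max_hops: int = 10) -> list:
    """Follow a URL through a redirect map and return every URL visited, in order."""
    chain = [url]
    while chain[-1] in redirects and len(chain) <= max_hops:
        next_url = redirects[chain[-1]]
        if next_url in chain:
            chain.append(next_url)  # record the repeat so the loop is visible
            break
        chain.append(next_url)
    return chain


if __name__ == "__main__":
    redirects = {"/old": "/older", "/older": "/new"}
    chain = redirect_chain(redirects, "/old")
    hops = len(chain) - 1
    if hops > 1:
        print(f"{hops} hops: {' -> '.join(chain)}; point /old directly at {chain[-1]}")
```

A chain with more than one hop is usually fixed by pointing the first URL directly at the final destination.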
|
||||
|
||||
### Pro Tips
|
||||
- ✅ Fix critical issues immediately
|
||||
- ✅ Address high priority issues weekly
|
||||
- ✅ Maintain regular monitoring schedule
|
||||
- ✅ Use redirects for moved content
|
||||
- ✅ Keep crawl depth moderate for large sites
|
||||
|
||||
---
|
||||
|
||||
## 8. 🏢 Enterprise SEO Suite
|
||||
|
||||
### What It Does
|
||||
Complete website SEO audit combining:
|
||||
- All on-page analysis
|
||||
- Technical SEO crawling
|
||||
- Competitive analysis
|
||||
- Performance optimization
|
||||
- Executive summary with action plan
|
||||
- Prioritized recommendations
|
||||
|
||||
### When to Use
|
||||
- Comprehensive website audit
|
||||
- Quarterly/annual SEO review
|
||||
- Before major campaigns
|
||||
- Competitive analysis
|
||||
- Strategic planning
|
||||
|
||||
### How to Use
|
||||
```
|
||||
1. Go to SEO Dashboard → Enterprise SEO Suite
|
||||
2. Enter website URL
|
||||
3. Add competitors (optional, up to 5)
|
||||
4. Enter target keywords (optional)
|
||||
5. Select workflow type:
|
||||
- Comprehensive (Full audit)
|
||||
- Quick (Major areas only)
|
||||
- Competitive (Competitor focus)
|
||||
6. Click "Run Audit"
|
||||
7. Wait for completion (30-60 seconds)
|
||||
8. Review comprehensive report
|
||||
```
|
||||
|
||||
### Report Contents
|
||||
- **Executive Summary**: High-level findings
|
||||
- **Overall Score**: 0-100 rating with breakdown
|
||||
- **Critical Issues**: Top problems to fix
|
||||
- **Technical Analysis**: Full technical audit
|
||||
- **Content Analysis**: Content quality insights
|
||||
- **Competitive Comparison**: How you compare
|
||||
- **Recommendations**: Prioritized action items
|
||||
- **Implementation Timeline**: Suggested timeframe
|
||||
|
||||
### Pro Tips
|
||||
- ✅ Run quarterly for ongoing monitoring
|
||||
- ✅ Use competitive analysis to benchmark
|
||||
- ✅ Focus on high-impact recommendations first
|
||||
- ✅ Track improvements over time
|
||||
- ✅ Use as strategic planning foundation
|
||||
|
||||
---
|
||||
|
||||
## 9. 📊 Content Strategy Analyzer
|
||||
|
||||
### What It Does
|
||||
Content planning and strategy analysis including:
|
||||
- Content gap identification
|
||||
- Opportunity scoring
|
||||
- Competitive content analysis
|
||||
- Topic recommendations
|
||||
- Content type suggestions
|
||||
- Publishing strategy recommendations
|
||||
|
||||
### When to Use
|
||||
- Content calendar planning
|
||||
- Finding content opportunities
|
||||
- Competitive content analysis
|
||||
- Quarterly strategy planning
|
||||
- Content expansion planning
|
||||
|
||||
### How to Use
|
||||
```
|
||||
1. Go to SEO Dashboard → Content Strategy Analyzer
|
||||
2. Enter your website URL
|
||||
3. Add competitors (optional)
|
||||
4. Enter target keywords (optional)
|
||||
5. Select analysis options
|
||||
6. Click "Analyze Content Strategy"
|
||||
7. Wait for analysis (5-10 minutes)
|
||||
8. Review content gaps and opportunities
|
||||
9. Plan your content calendar
|
||||
```
|
||||
|
||||
### What You'll Learn
|
||||
- **Content Gaps**: Topics you're missing
|
||||
- **Opportunity Scoring**: Potential of each gap
|
||||
- **Competitive Content**: What competitors rank for
|
||||
- **Topic Clusters**: Related topics to group
|
||||
- **Publishing Recommendations**: How often to publish
|
||||
- **Content Type Suggestions**: Blog, video, guide, etc.
|
||||
|
||||
### Output Analysis
|
||||
- Top 10 opportunities (scored 0-100)
|
||||
- Your content distribution
|
||||
- Competitor strategies
|
||||
- Recommended content types
|
||||
- Publishing frequency suggestions
|
||||
- Content calendar recommendations
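Once opportunities have scores, picking what to work on first is a sort-and-slice. The sketch below assumes each opportunity is a dictionary with `topic` and `score` keys. That shape is hypothetical, since the tool's export format is not described here, and the scoring itself is left to the analyzer.

```python
def top_opportunities(opportunities: list, limit: int = 10) -> list:
    """Return the highest-scoring opportunities, ignoring scores outside 0-100."""
    valid = [item for item in opportunities if 0 <= item["score"] <= 100]
    return sorted(valid, key=lambda item: item["score"], reverse=True)[:limit]


if __name__ == "__main__":
    found = [
        {"topic": "schema markup basics", "score": 72},
        {"topic": "internal linking checklist", "score": 88},
        {"topic": "image SEO", "score": 64},
    ]
    for item in top_opportunities(found, limit=2):
        print(item["score"], item["topic"])
```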
|
||||
|
||||
See [Content Strategy Guide](content-strategy-guide.md) for detailed usage.
|
||||
|
||||
### Pro Tips
|
||||
- ✅ Focus on high-scoring opportunities first
|
||||
- ✅ Create content clusters around pillars
|
||||
- ✅ Balance quick wins with strategic goals
|
||||
- ✅ Update calendar monthly with new analysis
|
||||
- ✅ Track performance of recommended content
|
||||
|
||||
---
|
||||
|
||||
## Choosing the Right Tool
|
||||
|
||||
### For Content Creators
|
||||
| Goal | Tool |
|
||||
|------|------|
|
||||
| Quick meta tags | Meta Description Generator |
|
||||
| Social media sharing | OpenGraph Generator |
|
||||
| Image optimization | Image Alt Text Generator |
|
||||
| Page optimization | On-Page SEO Analyzer |
|
||||
| Performance | PageSpeed Analyzer |
|
||||
|
||||
### For Marketers
|
||||
| Goal | Tool |
|
||||
|------|------|
|
||||
| Content planning | Content Strategy Analyzer |
|
||||
| Competitive analysis | Competitive Analysis |
|
||||
| Website structure | Sitemap Analyzer |
|
||||
| Full audit | Enterprise SEO Suite |
|
||||
| Technical health | Technical SEO Analyzer |
|
||||
|
||||
### For SEO Professionals
|
||||
| Goal | Tool |
|
||||
|------|------|
|
||||
| Comprehensive audit | Enterprise SEO Suite |
|
||||
| Technical issues | Technical SEO Analyzer |
|
||||
| Content opportunities | Content Strategy Analyzer |
|
||||
| Page optimization | On-Page SEO Analyzer |
|
||||
| Performance tracking | PageSpeed Analyzer |
|
||||
|
||||
---
|
||||
|
||||
## Quick Reference
|
||||
|
||||
### Tool Comparison Table
|
||||
|
||||
| Tool | Speed | Depth | Use Case | Best Time |
|
||||
|------|-------|-------|----------|-----------|
|
||||
| Meta Description | 2-3s | Quick | Meta tags | Before publishing |
|
||||
| PageSpeed | 5-8s | Medium | Performance | Monthly check |
|
||||
| Sitemap | 10-15s | Medium | Strategy | Quarterly |
|
||||
| Image Alt Text | 3-5s | Quick | Images | While writing |
|
||||
| OpenGraph | 2-3s | Quick | Social | Before publishing |
|
||||
| On-Page SEO | 8-12s | Deep | Pages | Before publishing |
|
||||
| Technical SEO | 15-30s | Very Deep | Site crawl | Monthly |
|
||||
| Enterprise Suite | 30-60s | Very Deep | Full audit | Quarterly |
|
||||
| Content Strategy | 5-10 min | Deep | Planning | Monthly |
|
||||
|
||||
---
|
||||
|
||||
## Integration Tips
|
||||
|
||||
Use these tools in combination:
|
||||
1. **Content Planning** → Content Strategy Analyzer
|
||||
2. **Page Creation** → Blog Writer
|
||||
3. **Meta Optimization** → Meta Description + OpenGraph
|
||||
4. **Image Optimization** → Image Alt Text Generator
|
||||
5. **Page Optimization** → On-Page SEO Analyzer
|
||||
6. **Performance** → PageSpeed Analyzer
|
||||
7. **Technical Health** → Technical SEO Analyzer
|
||||
8. **Full Audit** → Enterprise SEO Suite
|
||||
|
||||
---
|
||||
|
||||
**Ready to start? Pick a tool from the list above and get started, or explore the [Tools Reference](tools-reference.md) for a complete tool overview!**
|
||||
Some files were not shown because too many files have changed in this diff.