Compare commits

...

779 Commits

Author SHA1 Message Date
dependabot[bot]
0d8a389406 Bump the npm_and_yarn group across 1 directory with 16 updates
Bumps the npm_and_yarn group with 15 updates in the /frontend directory:

| Package | From | To |
| --- | --- | --- |
| [axios](https://github.com/axios/axios) | `1.12.0` | `1.13.5` |
| [js-yaml](https://github.com/nodeca/js-yaml) | `3.14.1` | `3.14.2` |
| [picomatch](https://github.com/micromatch/picomatch) | `2.3.1` | `2.3.2` |
| [ajv](https://github.com/ajv-validator/ajv) | `6.12.6` | `6.14.0` |
| [diff](https://github.com/kpdecker/jsdiff) | `5.2.0` | `5.2.2` |
| [minimatch](https://github.com/isaacs/minimatch) | `3.1.2` | `3.1.5` |
| [flatted](https://github.com/WebReflection/flatted) | `3.3.3` | `3.4.2` |
| [jsonpath](https://github.com/dchester/jsonpath) | `1.1.1` | `1.3.0` |
| [lodash](https://github.com/lodash/lodash) | `4.17.21` | `4.17.23` |
| [mdast-util-to-hast](https://github.com/syntax-tree/mdast-util-to-hast) | `13.2.0` | `13.2.1` |
| [node-forge](https://github.com/digitalbazaar/forge) | `1.3.1` | `1.4.0` |
| [qs](https://github.com/ljharb/qs) | `6.13.0` | `6.14.2` |
| [react-router](https://github.com/remix-run/react-router/tree/HEAD/packages/react-router) | `6.20.1` | `6.30.3` |
| [rollup](https://github.com/rollup/rollup) | `2.79.2` | `2.80.0` |
| [webpack](https://github.com/webpack/webpack) | `5.101.3` | `5.105.4` |



Updates `axios` from 1.12.0 to 1.13.5
- [Release notes](https://github.com/axios/axios/releases)
- [Changelog](https://github.com/axios/axios/blob/v1.x/CHANGELOG.md)
- [Commits](https://github.com/axios/axios/compare/v1.12.0...v1.13.5)

Updates `js-yaml` from 3.14.1 to 3.14.2
- [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md)
- [Commits](https://github.com/nodeca/js-yaml/compare/3.14.1...3.14.2)

Updates `picomatch` from 2.3.1 to 2.3.2
- [Release notes](https://github.com/micromatch/picomatch/releases)
- [Changelog](https://github.com/micromatch/picomatch/blob/master/CHANGELOG.md)
- [Commits](https://github.com/micromatch/picomatch/compare/2.3.1...2.3.2)

Updates `ajv` from 6.12.6 to 6.14.0
- [Release notes](https://github.com/ajv-validator/ajv/releases)
- [Commits](https://github.com/ajv-validator/ajv/compare/v6.12.6...v6.14.0)

Updates `diff` from 5.2.0 to 5.2.2
- [Changelog](https://github.com/kpdecker/jsdiff/blob/master/release-notes.md)
- [Commits](https://github.com/kpdecker/jsdiff/compare/v5.2.0...v5.2.2)

Updates `minimatch` from 3.1.2 to 3.1.5
- [Changelog](https://github.com/isaacs/minimatch/blob/main/changelog.md)
- [Commits](https://github.com/isaacs/minimatch/compare/v3.1.2...v3.1.5)

Updates `flatted` from 3.3.3 to 3.4.2
- [Commits](https://github.com/WebReflection/flatted/compare/v3.3.3...v3.4.2)

Updates `jsonpath` from 1.1.1 to 1.3.0
- [Commits](https://github.com/dchester/jsonpath/commits)

Updates `lodash` from 4.17.21 to 4.17.23
- [Release notes](https://github.com/lodash/lodash/releases)
- [Commits](https://github.com/lodash/lodash/compare/4.17.21...4.17.23)

Updates `mdast-util-to-hast` from 13.2.0 to 13.2.1
- [Release notes](https://github.com/syntax-tree/mdast-util-to-hast/releases)
- [Commits](https://github.com/syntax-tree/mdast-util-to-hast/compare/13.2.0...13.2.1)

Updates `node-forge` from 1.3.1 to 1.4.0
- [Changelog](https://github.com/digitalbazaar/forge/blob/main/CHANGELOG.md)
- [Commits](https://github.com/digitalbazaar/forge/compare/v1.3.1...v1.4.0)

Updates `qs` from 6.13.0 to 6.14.2
- [Changelog](https://github.com/ljharb/qs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/ljharb/qs/compare/v6.13.0...v6.14.2)

Updates `react-router` from 6.20.1 to 6.30.3
- [Release notes](https://github.com/remix-run/react-router/releases)
- [Changelog](https://github.com/remix-run/react-router/blob/react-router@6.30.3/packages/react-router/CHANGELOG.md)
- [Commits](https://github.com/remix-run/react-router/commits/react-router@6.30.3/packages/react-router)

Updates `rollup` from 2.79.2 to 2.80.0
- [Release notes](https://github.com/rollup/rollup/releases)
- [Changelog](https://github.com/rollup/rollup/blob/v2.80.0/CHANGELOG.md)
- [Commits](https://github.com/rollup/rollup/compare/v2.79.2...v2.80.0)

Updates `underscore` from 1.12.1 to 1.13.6
- [Commits](https://github.com/jashkenas/underscore/compare/1.12.1...1.13.6)

Updates `webpack` from 5.101.3 to 5.105.4
- [Release notes](https://github.com/webpack/webpack/releases)
- [Changelog](https://github.com/webpack/webpack/blob/main/CHANGELOG.md)
- [Commits](https://github.com/webpack/webpack/compare/v5.101.3...v5.105.4)

---
updated-dependencies:
- dependency-name: axios
  dependency-version: 1.13.5
  dependency-type: direct:production
  dependency-group: npm_and_yarn
- dependency-name: js-yaml
  dependency-version: 3.14.2
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: picomatch
  dependency-version: 2.3.2
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: ajv
  dependency-version: 6.14.0
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: diff
  dependency-version: 5.2.2
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: minimatch
  dependency-version: 3.1.5
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: flatted
  dependency-version: 3.4.2
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: jsonpath
  dependency-version: 1.3.0
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: lodash
  dependency-version: 4.17.23
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: mdast-util-to-hast
  dependency-version: 13.2.1
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: node-forge
  dependency-version: 1.4.0
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: qs
  dependency-version: 6.14.2
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: react-router
  dependency-version: 6.30.3
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: rollup
  dependency-version: 2.80.0
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: underscore
  dependency-version: 1.13.6
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: webpack
  dependency-version: 5.105.4
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-26 07:29:33 +00:00
ajaysi
1fd9720dac Cleanup: Add migration scripts to gitignore and remove from tracking
- Add debug_usage.py, fix_database.py, migrate_usage_summaries.py,
  simple_migrate.py, validate_implementation.py to .gitignore
- Add CAMERA_SELFIE_IMPLEMENTATION.md to .gitignore
- Remove these files from git tracking (keep locally)
2026-03-22 12:47:00 +05:30
ajaysi
51bc76345f Add new analytics modules from PR #436
- backend/services/analytics/opportunity_scorer.py: Functions for scoring and ranking
  opportunities from search queries (high_impression_low_ctr_queries,
  rising_queries, declining_pages, score_and_rank_opportunities,
  categorize_opportunities)
- backend/services/gsc_service.py: GSC (Google Search Console) service
2026-03-22 11:36:38 +05:30
ajaysi
b28dc4b5f6 Add startup health module and readiness endpoint from PR #434
- Add services/startup_health.py with health check functions:
  - get_startup_status(): Returns current startup status
  - readiness_under_auth_context(): Validates tenant DB under auth context
  - run_startup_health_routine(): Runs all startup health checks
- Add /health/readiness endpoint for tenant DB validation
- Update startup_event() to use run_startup_health_routine()
- Add raise to startup_event to fail fast on errors
2026-03-22 11:33:20 +05:30
ajaysi
70d3677ac6 Fix environment_setup directory creation from PR #433
- Remove dependency on workspace_dirs module
- Use direct Path().mkdir() for directory creation
- Configure development directories correctly (lib/workspace/...)
- Skip directory creation in production mode
2026-03-22 11:29:25 +05:30
ajaysi
fdbba8f186 Add content strategy state models from PR #433
New models for managing content strategy runtime state:
- StrategyGenerationTaskState: Task lifecycle/status for polling-based AI generation
- LatestGeneratedStrategyState: References to latest generated strategy per user/resource
- StreamingCacheState: Short-lived streaming cache entries with TTL semantics

These models provide persistent state management for content strategy operations.
2026-03-22 11:28:18 +05:30
ajaysi
e8f282b7a9 Enhance main_text_generation with APIKeyManager and improved provider routing
- Import APIKeyManager for provider key checking
- Use APIKeyManager.get_api_key() instead of get_api_key() function
- Add wavespeed provider to available_providers check
- Add detailed provider preflight logging with flow_type tag
- Improve fallback logic when preferred provider is unavailable

These improvements come from PRs #423-#431 while maintaining the modular textgen_utils structure.
2026-03-22 11:23:38 +05:30
ajaysi
a26fa84263 Extract useful LLM provider improvements from PRs #423-#429
huggingface_provider.py:
- Add retry logic with _should_retry_hf_error and _is_non_retryable_hf_error
- Update default models from :groq to :cerebras (HF_FALLBACK_MODELS)
- Add fallback_models parameter to huggingface_text_response
- Add get_available_models with updated model list

main_text_generation.py:
- Add GPT_PROVIDER and TEXTGEN_AI_MODELS env var support
- Add preferred_provider and flow_type parameters to llm_text_gen
- Add HF_MODEL_MAPPING for short model name resolution
- Add flow_type logging tag for better observability

sif_agents.py:
- Add LOW_COST_SHARED_REMOTE_MODELS for SIF agents
- Update SharedLLMWrapper to use preferred_hf_models and flow_type

These changes preserve the modular textgen_utils structure while incorporating
the useful routing and retry logic improvements from the pending PRs.
2026-03-22 11:16:48 +05:30
ajaysi
16be2b21f4 Fix user data endpoints to require authenticated user ID
- Add get_current_user authentication to all user data endpoints
- Pass authenticated user_id from auth context to service methods
- Add proper HTTPException handling for missing data
- Fix user_id type from int to str in service methods
- Ensure endpoints only return data for authenticated user
2026-03-22 11:02:35 +05:30
ajaysi
1a2ec68095 Enhance logging with exception handlers and context tracking
- Add InterceptHandler to route stdlib logging to Loguru
- Add _patch_record_context for request/job/user ID tracking
- Add _uncaught_exception_hook to capture top-level exceptions
- Add _asyncio_exception_handler for asyncio task exceptions
- Add _register_global_exception_handlers to register all hooks
- Add _configure_uvicorn_loggers for unified uvicorn logging
- Improve log format with contextual fields (req, job, user)
2026-03-22 10:59:46 +05:30
ajaysi
d557bd4918 Fix merge conflicts and resolve circular import issues
- Resolve conflict markers in logging_config.py, main.py, app.py
- Fix circular imports in story_writer services (image/audio/video generation)
  by using lazy imports for get_story_media_write_dir
- Restore clean versions of:
  - sif_agents.py
  - tenant_provider_config.py
  - personalization_service.py
  - huggingface_provider.py
  - main_text_generation.py
  - logger_utils.py
- Use setup_clean_logging() consistently across app.py and main.py
- Restore verbose_mode handling in start_alwrity_backend.py
2026-03-22 10:45:05 +05:30
ajaysi
d412275748 "Merge_PR_422_unify_backend_logging_configuration" 2026-03-12 17:32:50 +05:30
ajaysi
c429c90860 "Merge_PR_421_structured_routing_logs_with_clean_modular_architecture" 2026-03-12 17:19:44 +05:30
ajaysi
27700ce272 "Add_structured_routing_logs_to_modular_text_generation" 2026-03-12 17:15:11 +05:30
ajaysi
482a600e14 "Add_structured_routing_logs_to_text_generation_modular" 2026-03-12 17:12:15 +05:30
ajaysi
e85c7d442e "Replace_main_text_generation.py_with_clean_modular_version" 2026-03-12 17:09:19 +05:30
ajaysi
1829f47893 "Extract_text_generation_utilities_into_modular_structure" 2026-03-12 16:59:45 +05:30
ajaysi
54396b8268 Merge_PR_420_add_tenant_aware_provider_config_resolver_across_llm_facades 2026-03-12 16:44:41 +05:30
ajaysi
f36cd8eea9 "Recreate_huggingface_provider_clean_functional_version" 2026-03-12 16:40:53 +05:30
ajaysi
1d68db8151 Merge_PR_437_repair_huggingface_provider_and_restore_explicit_retry_fallback 2026-03-12 16:36:37 +05:30
ي
968900858c Repair huggingface provider and restore explicit retry/fallback behavior 2026-03-12 16:29:50 +05:30
ajaysi
4d90a80b9c Merge_PR_419_refine_hf_provider_retries_and_client_reuse 2026-03-12 16:22:48 +05:30
ajaysi
acf526e7e1 Merge_PR_418_refine_hf_fallback_policy_and_sif_low_cost_routing 2026-03-12 16:19:19 +05:30
ajaysi
679c0e8c89 Merge_PR_417_centralized_text_routing_policy 2026-03-12 16:08:40 +05:30
ajaysi
8d421a158f Merge_PR_416_fix_textgen_ai_models_mapping 2026-03-12 16:05:47 +05:30
ajaysi
acc5e1f72c Merge_PR_415_enforce_runtime_only_workspace_creation 2026-03-12 16:01:23 +05:30
ajaysi
f1ee8fce50 Merge_PR_414_standardize_tenant_db_directory 2026-03-12 15:55:00 +05:30
ajaysi
e7171df5db Merge branch 'pr-413' 2026-03-12 15:46:43 +05:30
ajaysi
f23e99558f Merge branch 'pr-412' 2026-03-12 15:41:04 +05:30
ajaysi
d4bec3c791 Merge_PR_411_tenant_aware_video_studio_storage 2026-03-12 15:39:12 +05:30
ajaysi
d0267c7608 Merge branch 'pr-409' 2026-03-12 15:33:33 +05:30
ajaysi
901470eb8b Merge_PR_408_flat_context_and_txtai_file_tools 2026-03-12 15:29:08 +05:30
ajaysi
446b59e31d Add_local_development_files_and_media_cache_utilities 2026-03-12 15:25:49 +05:30
ajaysi
e90a29c27e Merge_PR_410_with_local_changes 2026-03-12 15:21:08 +05:30
ajaysi
ecf901c76f Merge branch 'pr-410' 2026-03-12 15:19:15 +05:30
ي
51313f60dc Unify backend logging configuration entrypoint 2026-03-12 15:05:30 +05:30
ي
d01d4af62f Add standardized structured routing logs for text generation 2026-03-12 15:05:07 +05:30
ي
feacbc6d59 Add tenant-aware provider config resolver across LLM facades 2026-03-12 15:04:42 +05:30
ي
7df7d870e5 Refine Hugging Face provider retries and client reuse 2026-03-12 15:04:16 +05:30
ي
bf191374a5 Refine HF fallback policy controls and SIF low-cost routing 2026-03-12 15:03:47 +05:30
ي
d4528fbc74 Add centralized text routing policy and premium HF defaults 2026-03-12 15:03:22 +05:30
ي
4b7f443509 Fix TEXTGEN_AI_MODELS full-name mapping and unify model resolution 2026-03-12 15:02:47 +05:30
ي
3ebe884a37 Enforce runtime-only workspace directory creation policy 2026-03-12 15:00:59 +05:30
ي
7557feb830 Standardize tenant DB directory to db with legacy migration 2026-03-12 15:00:20 +05:30
ي
22df52f9d6 Unify story media path resolution across services and routes 2026-03-12 14:59:45 +05:30
ي
d4baf8828e Refactor podcast media storage to lazy tenant resolver 2026-03-12 14:59:03 +05:30
ي
29c268dda8 Add tenant-aware video studio storage path resolver 2026-03-12 14:58:27 +05:30
ي
ad1756aaa2 Extract usage trends and reset logic into usage_tracking_helpers 2026-03-12 07:32:59 +05:30
ajaysi
01881bb405 "feat:enhance-podcast-topic-ai" 2026-03-11 19:09:27 +05:30
ي
7619604324 Harden logging config with safer overrides and optional JSON/file sinks 2026-03-11 16:31:28 +05:30
ي
cbe41ef8c7 Add Step 5 flat context and txtai file tools for agents 2026-03-11 10:42:05 +05:30
ajaysi
e472861967 Merge branch 'pr-405'
# Conflicts:
#	backend/services/intelligence/txtai_service.py
#	backend/services/llm_providers/huggingface_provider.py
2026-03-10 19:59:50 +05:30
ajaysi
b410ece4ca Commit_remaining_local_changes_after_PR_407_merge 2026-03-10 17:17:04 +05:30
ajaysi
97745356ac Merge branch 'pr-407'
# Conflicts:
#	backend/services/intelligence/txtai_service.py
2026-03-10 17:14:40 +05:30
ajaysi
8c2d88efb9 Commit_all_local_changes_after_PR_406_merge 2026-03-10 17:01:36 +05:30
ajaysi
f78b5f1e04 Resolve_txtai_service_conflict_after_PR_406_merge 2026-03-10 16:55:42 +05:30
ajaysi
13e45acbf9 Merge branch 'pr-406' 2026-03-10 16:35:52 +05:30
ي
3a88d09af8 Make SIF agent workflows non-blocking and guard SSE hangs 2026-03-10 14:05:00 +05:30
ي
7e4adce55f Make SIF agent workflows non-blocking and guard SSE hangs 2026-03-10 14:04:26 +05:30
ي
bc49329ed6 Make SIF fail fast and add low-cost remote LLM fallback 2026-03-09 18:57:11 +05:30
ي
4230385e70 Make SIF fail fast and add low-cost remote LLM fallback 2026-03-09 16:26:20 +05:30
ajaysi
651bd2b5f0 fix: Fix Unicode encoding issue in backend startup script
- Changed rocket emoji to ASCII [*] for Windows CP1252 compatibility
- This allows the backend to start without UnicodeEncodeError on Windows
2026-03-09 16:25:56 +05:30
ajaysi
098424f696 fix: Make TxtaiIntelligenceService initialization non-blocking
- Modified _ensure_initialized() to run in background thread (non-blocking)
- Added _ensure_initialized_async() for truly async initialization
- Updated index_content() to return immediately without waiting for initialization
- Weights now load in background thread instead of blocking event loop
- Added initialization tracking to prevent duplicate initialization
- Modified today_workflow API to handle non-blocking indexing gracefully
- This prevents dashboard refresh from blocking other services

When a user accesses the dashboard, the indexing now happens in background
instead of blocking the HTTP response, allowing other services to function
normally while weights are being loaded.
2026-03-09 16:25:56 +05:30
ajaysi
9713af0c1b fix: Add missing columns to daily_workflow_plans table
- Added generation_mode column (VARCHAR, default: 'llm_generation')
- Added committee_agent_count column (INTEGER, default: 0)
- Added fallback_used column (BOOLEAN, default: 0)

Also fixed:
- Imported daily_workflow_models in services/database.py to ensure models are registered
- Added _create_daily_workflow_tables() to database setup
- Created migration script to add columns to 35 existing databases
- Fixed WorkflowError type in frontend to use constructor for proper 'name' property

This resolves the 'no such column' sqlite3 errors when accessing the today-workflow API.
2026-03-09 16:25:56 +05:30
ajaysi
7747174f00 Merge branch 'pr-404' 2026-03-09 16:20:06 +05:30
ajaysi
217698c2ed Merge branch 'pr-403' 2026-03-09 16:05:18 +05:30
ajaysi
c54ad409a7 Merge branch 'pr-402'
# Conflicts:
#	backend/app.py
2026-03-09 15:40:30 +05:30
ي
8b0547cdb5 Make SIF fail fast and add low-cost remote LLM fallback 2026-03-09 15:38:03 +05:30
ajaysi
9fe9f819d8 Merge branch 'pr-401' 2026-03-09 15:33:08 +05:30
ajaysi
1c3524964e Merge branch 'pr-400' 2026-03-09 15:19:47 +05:30
ajaysi
931127bfcf Merge branch 'pr-399' 2026-03-09 14:26:13 +05:30
ajaysi
209a723584 fix: avoid creating/initializing databases during user discovery
get_all_user_ids now checks if DB file exists before calling get_session_for_user. This prevents get_engine_for_user from triggering init_user_database and creating tables for stale workspace folders. Without this fix, a read-only ID scan creates/initializes SQLite databases and default tables as a side effect, which can silently create fresh DBs for stale workspace folders and hide missing/corrupt-database states that discovery previously surfaced.
2026-03-09 14:23:29 +05:30
ajaysi
3cfd95d179 fix: revert user_id filtering in task loaders to preserve backward compatibility
Avoid filtering loader queries by canonical user_id. Calling loaders with user_id=user_id introduces an exact-ID filter path that can drop valid legacy tasks: several loaders (e.g., load_due_market_trends_tasks) apply ...where task.user_id == user_id, but this commit also shifts discovery toward canonical IDs, so tasks persisted earlier with workspace-safe/sanitized IDs in the same per-user DB will no longer be returned and therefore never execute. Before this change, loaders were invoked as task_loader(db) and did not regress on mixed ID formats.
2026-03-09 14:20:57 +05:30
ajaysi
6c5361ce06 Merge branch 'pr-398' 2026-03-09 13:50:12 +05:30
ajaysi
b3cc83ed6e fix: resolve onboarding session not found warnings and frontend build OOM
- Use canonical Clerk user id (clerk_user_id) across all onboarding entrypoints to ensure consistent OnboardingSession.user_id lookup
- Fix API key persistence in api_key_manager.py to use correct APIKey model columns (session_id, provider, key)
- Increase Node heap for frontend build to 8GB and add build:nomap script to disable sourcemaps and reduce memory usage
- Update onboarding endpoints (endpoints_core.py, onboarding_control_service.py, step_management_service.py) to prefer clerk_user_id over id
- Fix frontend workflowStore.ts TypeScript error by returning WorkflowError instance
- Add website_automation_service.py for onboarding automation
2026-03-09 13:36:34 +05:30
ي
952824a271 Stabilize txtai nprobe handling without dropping loaded index state 2026-03-09 12:27:16 +05:30
ي
cd8582eb8c Fix txtai nprobe fallback to avoid reloading incompatible faiss index 2026-03-09 12:21:43 +05:30
ajaysi
1565551765 fix: Pass user_id to style analysis functions in website_analysis_executor
- Fix missing user_id parameter in analyze_content_style() call
- Fix missing user_id parameter in analyze_style_patterns() call
- Fix missing user_id parameter in generate_style_guidelines() call
- user_id is required for subscription checking in llm_text_gen()
- Resolves errors: 'user_id is required for subscription checking'
- All style detection functions now properly pass user_id from executor context
2026-03-09 12:09:56 +05:30
ي
e8d76cd745 Sync SEO dashboard imports and add route smoke test 2026-03-09 12:07:57 +05:30
ي
a19a18d9b4 Add competitor_analysis fallback for deep competitor task scheduling 2026-03-09 12:07:18 +05:30
ي
c3bd04e259 Fix huddle SSE auth fallback with query token support 2026-03-09 12:06:49 +05:30
ي
6b141ee554 Merge branch 'main' into codex/implement-central-visibility-for-seo-onboarding-tasks 2026-03-08 23:13:08 +05:30
ي
936dd14e0d Add consolidated onboarding SEO task health API and dashboard panel 2026-03-08 23:09:02 +05:30
ajaysi
39bc3e3008 Merge PR #397: Add typed request model for task status endpoint
- Add TaskStatusEnum to enumerate valid status values (pending, in_progress, completed, skipped, dismissed)
- Add TaskStatusUpdateRequest Pydantic model with validation
- Constrain completion_notes to max 4000 characters
- Automatically enforce schema validation and improve OpenAPI docs
- Update set_task_status endpoint to use typed request body
- Remove need for manual status validation (FastAPI handles it)
- Preserve dependencies normalization helper and all usages
- Preserve date validation and narrower exception handling from PR #396
- Keep proper feedback scoring using task.status from database
- Keep contextuality validation response fields intact
- Maintain all observability and error handling improvements
- Improve API robustness through type safety
2026-03-08 22:51:17 +05:30
ajaysi
92715661e3 Merge PR #396: Validate plan.date and add narrower exception handling
- Add date validation: validate plan.date is ISO format before computing yesterday
- Log clear warning (plan_id, user_id, plan_date, reason) if date parsing fails
- Replace silent 'except Exception: pass' with explicit SQLAlchemyError handling
- Log detailed warnings (plan_id, user_id, plan_date, yesterday_date, error details) on DB errors
- Keep failures non-fatal to indexing behavior (continue with today's indexing)
- Preserve dependencies normalization helper and its usage in yesterday payloads
- Preserve proper feedback scoring (uses task.status, handles all negative statuses)
- Keep contextuality validation response fields (quality_status, contextuality_validation)
- Improve observability while maintaining system robustness
2026-03-08 18:39:55 +05:30
ajaysi
0aaaf07900 Merge PR #395: Normalize dependencies in today workflow API payloads
- Add _normalize_dependencies() helper to handle all dependency type variations
- Handle None, list, JSON string, and invalid types with safe fallback to []
- Apply normalization to today and yesterday task payloads for consistency
- Ensure indexing pipeline receives normalized list dependencies
- Preserve task status feedback scoring logic (uses task.status, handles all negative cases)
- Keep contextuality validation and quality status response fields
- Improve data consistency across API and indexing surfaces
2026-03-08 18:31:48 +05:30
ajaysi
38444f4508 Merge PR #394: Derive task memory feedback_score from persisted task.status
- Use canonical persisted task.status (from DB) instead of incoming request parameter
- Implement explicit status-to-score mapping: completed→+1, skipped/dismissed/rejected→-1, other→0
- Normalize all negative outcomes uniformly for self-learning memory
- Ensure memory feedback aligns with backend status normalization rules
- Preserve contextuality_validation and quality_status response fields
- Keep failures non-fatal to API behavior with exception handling
- Improve code clarity with explicit conditional logic over ternary operators
2026-03-08 18:28:52 +05:30
ajaysi
74b788a353 Merge PR #392: Add contextuality validation and low-context workflow status
- Replace provenance-based quality with contextuality validation framework
- Add evidence link tracking system (onboarding:key and alert:id formats)
- Implement plan contextuality validation function with configurable thresholds
- Calculate task-level context scores based on evidence link density
- Define contextual workflows (>65% threshold) vs low-context workflows (<65%)
- Add validation in plan persistence layer before database commit
- Integrate contextuality metrics into release readiness checks
- Add recovery strategies for low-context workflows (regeneration with guidance)
- Track evidence link validity against grounding context (onboarding data, alerts)
- Provide detailed contextuality reports in quality assessments
- Maintain backward compatibility while enabling contextual workflow detection
2026-03-08 18:18:43 +05:30
ajaysi
2d4c83e79f Merge PR #391: Add workflow provenance quality metrics and classification
- Introduce task provenance tracking: agent_proposal, llm_backfill, controlled_fallback
- Add quality computation function to classify workflows as 'AI-personalized' or 'guided baseline'
- Calculate agent origin ratio, fallback ratio, and per-pillar coverage metrics
- Implement configurable agent personalization threshold (default 35%)
- Enhance plan metadata with comprehensive quality dimensions:
  - agentOriginRatio, agentOriginPercent, agentOriginTaskCount
  - agentOriginPillars, fallbackRatio, fallbackPercent, fallbackTaskCount
  - totalTaskCount and configurable thresholds
- Simplify task provenance metadata handling in sanitization
- Add backfill logic for existing plans to populate missing quality metrics
- Maintain backward compatibility with existing plan storage
2026-03-08 18:16:54 +05:30
ajaysi
56854df016 Merge PR #390: Add degraded-mode workflow regeneration criteria and endpoint
- Add POST /api/today-workflow/regenerate endpoint for on-demand plan regeneration
- Implement rate limiting (3 requests per 60 seconds) to prevent abuse
- Add regeneration quality score tracking and onboarding completion status
- Compute task hashes for deduplication and change detection
- Extract plan metadata from plan_json for cleaner API responses
- Integrate onboarding progress service to track completion status
- Return quality_score and generated_with_agents metadata in responses
- Enable manual workflow refresh in degraded mode scenarios
- Maintain backward compatibility with simplified schema
2026-03-08 18:12:43 +05:30
ajaysi
35581316a8 Merge PR #389: Committee Health Precheck and Simplified Architecture
- Simplify workflow generation by removing complex dependency coercion and metadata normalization functions
- Add committee health precheck to detect insufficient active agents before proposal gathering
- Add orchestrator initialization state metadata tracking for observability
- Downgrade fastapi to 0.104.0 for stability
- Simplify database schema (remove generation_mode, committee_agent_count, fallback_used columns from DailyWorkflowPlan)
- Remove 'skipped' from suppressed task statuses in memory service (keep only dismissed, rejected)
- Update task status normalization logic for clarity
- Return simplified metadata structure with generation_path and degradation tracking
2026-03-08 18:10:27 +05:30
ajaysi
8f6ed3a616 Merge PR #388: Daily Workflow Integration & Enhanced Reliability
- Resolve merge conflicts in backend/services/today_workflow_service.py and frontend/src/stores/workflowStore.ts
- Backend: Keep robust handling for both dict and object types in TaskProposal conversion
- Backend: Combine dependencies coercion with task metadata normalization
- Frontend: Implement graceful fallback pattern (try server first, then local generation on unavailability)
- Add provenanceSummary integration from server responses
- Ensure degraded mode handling with appropriate messaging
2026-03-08 17:47:15 +05:30
ي
52563849d5 Fix onboarding-status user ID resolution in scheduler path 2026-03-07 12:15:25 +05:30
ajaysi
a25ec8302c fix: Resolve merge conflicts in workflowStore.ts by combining normalization and fallback logic (PR #387) 2026-03-07 12:08:29 +05:30
ajaysi
4f2a3d6e2d fix: Resolve conflicts in PR #386 for per-agent timeouts and partial committee handling 2026-03-07 12:05:51 +05:30
ajaysi
f0f73eb003 Merge branch 'pr-385' 2026-03-07 12:02:50 +05:30
ajaysi
a00212ca4d refactor: Unify canonical task outcome statuses (completed, skipped) across workflow and memory services (Closes #384) 2026-03-07 12:00:04 +05:30
ajaysi
5780deff2f fix: Normalize specialized agent pillar IDs and log invalid proposals (Closes #383) 2026-03-07 11:57:26 +05:30
ajaysi
8b554a35c4 fix: Resolve dependency conflicts, scheduler status error, and frontend config (Closes #382) 2026-03-07 11:51:59 +05:30
ي
62d5cf773e Add typed request model for today workflow task status updates 2026-03-06 21:45:48 +05:30
ي
e694e6172f Validate plan date before yesterday workflow indexing 2026-03-06 21:45:25 +05:30
ي
2403d92f9d Normalize today workflow task dependencies payload 2026-03-06 21:44:23 +05:30
ي
acecf2a3f4 Fix task outcome feedback scoring to use normalized status 2026-03-06 21:43:40 +05:30
ي
7096f03623 Add contextuality validation and low-context workflow status 2026-03-06 21:42:49 +05:30
ي
84babd0407 Add workflow provenance quality metrics and classification 2026-03-06 21:42:14 +05:30
ي
4621107988 Add degraded-mode workflow regeneration criteria and endpoint 2026-03-06 21:40:29 +05:30
ي
15a9eaa9a0 Add committee health precheck and orchestrator init state metadata 2026-03-06 21:40:03 +05:30
ي
81b29895b9 Improve daily workflow provenance modeling and UI labels 2026-03-06 21:39:32 +05:30
ي
ed625eae61 Harden workflow fallback handling and degraded mode UI 2026-03-06 21:38:39 +05:30
ي
198143e6ca Add per-agent timeout handling for daily committee proposals 2026-03-06 21:38:04 +05:30
ي
c3f478a763 Normalize today workflow task dependencies as arrays 2026-03-06 21:37:36 +05:30
ajaysi
5d49351c2d feat: Support explicit technical SEO audit states and surface task diagnostics in dashboard (PR #382) 2026-03-05 22:28:55 +05:30
ajaysi
afe79f188a refactor: Align SEO dashboard imports/routes and add app router smoke test (PR #381) 2026-03-05 22:21:05 +05:30
ajaysi
110f7318cc chore: Update backend services, intelligence integration, and documentation 2026-03-05 22:14:25 +05:30
ajaysi
5cccb89df8 feat: Add competitor_analysis fallback for deep competitor onboarding scheduling (PR #380) 2026-03-05 22:11:55 +05:30
ajaysi
6205ff8bbe Merge PR #379: fix preflight pricing/model drift and usage UI 2026-03-05 12:22:21 +05:30
ي
01bf56837f Fix unlimited video limit display in usage rings 2026-03-05 11:36:04 +05:30
ي
7d530b3220 Preserve full provider breakdown in billing UI coercion 2026-03-05 11:31:49 +05:30
ي
81f49f4ebd Add explicit usage summary uniqueness and billing indexes 2026-03-05 11:17:48 +05:30
ي
45dbf095f6 Add Stripe webhook idempotency persistence guard 2026-03-05 11:17:21 +05:30
ي
81052d06b4 Fix preflight model mapping when skipping invalid providers 2026-03-05 11:10:54 +05:30
ajaysi
6121ae1a92 Merge branch 'pr-378' 2026-03-05 10:52:31 +05:30
ajaysi
156beba7e0 Merge branch 'pr-377' 2026-03-05 10:49:08 +05:30
ajaysi
2610b1e35b Merge branch 'pr-376' 2026-03-05 10:44:12 +05:30
ajaysi
93406352d4 Merge branch 'pr-375' 2026-03-05 10:41:19 +05:30
ajaysi
806ab7b20d Merge branch 'pr-374' 2026-03-05 10:34:36 +05:30
ajaysi
c303a1040b Merge branch 'pr-373' 2026-03-05 10:24:58 +05:30
ajaysi
26131232c7 feat: enhance billing dashboard with historical data & security hardening
- Fix usage tracking zero-value bug with self-healing logic
- Add month selector for historical usage views
- Implement start-of-month graceful initialization
- Merge PR #372: Harden user-scoped access in subscription routes
- Fix UI bugs in UsageDashboard component
2026-03-05 10:21:56 +05:30
ي
c604dc87ec Add Stripe webhook event persistence and idempotency 2026-03-04 20:44:04 +05:30
ي
6e75f44ed5 Add subscription usage constraints and safe index migration 2026-03-04 20:43:14 +05:30
ي
cf4c08ff7c Align usage period keys with subscription window and reset audio counters 2026-03-04 20:42:44 +05:30
ي
d82569a1d0 Harden usage limit enforcement failure handling 2026-03-04 20:42:20 +05:30
ي
fc96e1218a Move Stripe plan price mapping to env with startup validation 2026-03-04 20:41:47 +05:30
ي
5a7b9e6c6b Refactor billing flows to require authenticated user IDs 2026-03-04 20:41:07 +05:30
ي
261c224dca Harden user-scoped subscription route access checks 2026-03-04 20:40:33 +05:30
ajaysi
2318fd8a48 Render deploy hardening: skip large model bootstrap on cloud, respect PORT, pin requests deps; add runtime.txt and render-build.sh 2026-03-04 12:55:35 +05:30
ajaysi
1d36ebe2f9 Update frontend components: TeamHuddleWidget, useAgentHuddleFeed, TeamActivityPage 2026-03-04 09:19:53 +05:30
ajaysi
45fb9636e2 Update Render build configuration: fix deps, force py3.11, add build script 2026-03-04 09:17:35 +05:30
ajaysi
460e1f398d Merge PR #364: Add competitor-aware originality checks and fix agent initialization 2026-03-03 18:57:46 +05:30
ajaysi
05dd4f1efb Merge branch 'pr-371' 2026-03-03 18:37:37 +05:30
ajaysi
65fede6839 Merge PR #370: Tiered agent activity responses and detailed approvals UI 2026-03-03 18:33:38 +05:30
ajaysi
2fbda8f803 Merge PR #369: Standardize agent activity events and update timeline UI 2026-03-03 18:25:05 +05:30
ajaysi
1e95198ec9 Fix: Type errors and linter warnings 2026-03-03 18:20:48 +05:30
ajaysi
6fefbf1121 Feat: Add SSE-powered Team Huddle feed and Activity page 2026-03-03 17:40:40 +05:30
ajaysi
23ad48c506 Merge branch 'pr-367' 2026-03-03 17:23:04 +05:30
ajaysi
6c7871bedd Fix: Agent orchestrator initialization, singleton LLM loading, and dashboard activity logging 2026-03-03 17:22:50 +05:30
ajaysi
a527ab3c76 Merge PR #366 and wire up Team Huddle CTA 2026-03-03 09:24:52 +05:30
ajaysi
1c09aedc6c Fix TypeScript errors in TeamHuddleWidget after merge 2026-03-03 09:17:25 +05:30
ajaysi
259cb8682d Merge branch 'review/pr-365' 2026-03-03 09:09:07 +05:30
ajaysi
2e04b8e27b Fix compilation errors and resolve ESLint warnings across multiple components 2026-03-03 08:57:36 +05:30
ي
79c2327861 docs: define Team Huddle contract and activity acceptance criteria 2026-03-02 22:04:45 +05:30
ي
4f19b993b4 Add tiered agent activity responses with redaction and UI toggle 2026-03-02 22:02:53 +05:30
ي
a7bf355703 Standardize agent event payloads and team activity timeline UI 2026-03-02 22:01:12 +05:30
ي
c0d9289d4d Add agent huddle SSE feed with frontend live subscriptions 2026-03-02 22:00:24 +05:30
ي
92b0255028 Add aggregated agent huddle feed endpoint 2026-03-02 21:49:57 +05:30
ي
ef55124a56 Add team activity page and dashboard navigation 2026-03-02 21:49:21 +05:30
ي
124de1379a feat: load team huddle agents from API endpoints 2026-03-02 21:48:50 +05:30
ي
60e6cbd34b Harden SIF release readiness gaps and add regression checks 2026-03-02 20:59:07 +05:30
ajaysi
cb6a3a8042 chore: Remove usage of unused isTitleLoading state 2026-03-02 15:48:44 +05:30
ajaysi
d49d2b627e chore: Fix additional ESLint warnings in BlogWriter components 2026-03-02 15:46:22 +05:30
ajaysi
b4549ebe39 fix: Remove duplicate state declarations in useCampaignCreator.ts 2026-03-02 15:18:02 +05:30
ajaysi
85aa808122 fix: Syntax error in useCampaignCreator.ts 2026-03-02 15:16:56 +05:30
ajaysi
e0376d0f1c chore: Fix ESLint warnings for Vercel deployment 2026-03-02 15:13:09 +05:30
ajaysi
4b641cc773 Cleanup obsolete test files and sync local changes 2026-03-02 11:49:20 +05:30
ajaysi
0f97d54318 Merge branch 'review/pr-363' 2026-03-02 11:46:02 +05:30
ي
cd9ffb5ef5 Implement evidence-based semantic gap detection for strategy agents 2026-03-02 11:42:52 +05:30
ajaysi
d4cdd89fbf Merge branch 'review/pr-362' 2026-03-02 11:38:58 +05:30
ي
6273d1de60 Unify semantic health response schema 2026-03-02 11:36:05 +05:30
ajaysi
673f6a22e1 Merge branch 'review/pr-361' 2026-03-02 11:34:04 +05:30
ي
9d34753d0f Add structured warning logs for memory and semantic fallback paths 2026-03-02 10:51:45 +05:30
ajaysi
07a4d86d61 Merge PR #360 and resolve conflict in today_workflow_service.py 2026-03-02 10:27:46 +05:30
ي
83f1edcd12 Add strategy architect agent to daily planning committee 2026-03-02 10:25:01 +05:30
ajaysi
89e33e2121 Fix test setup in test_today_workflow_service.py to avoid global module patching 2026-03-01 23:26:29 +05:30
ajaysi
677c65fe72 Merge branch 'review/pr-359' 2026-03-01 23:16:05 +05:30
ajaysi
fe6d3b6c66 Merge branch 'review/pr-358' 2026-03-01 23:08:51 +05:30
ajaysi
212538c406 Add note about competitor indices validation in gap analysis 2026-03-01 23:08:30 +05:30
ajaysi
9707e40eba Merge branch 'review/pr-357' 2026-03-01 22:19:50 +05:30
ajaysi
ed43f49d38 Fix workflow_config plumbing issue in today_workflow_service.py before merging PR 357 2026-03-01 22:19:36 +05:30
ي
036bbb45e1 Add strategy agent to daily planning committee 2026-03-01 21:59:12 +05:30
ي
77088bfc53 Implement evidence-driven semantic gap detection 2026-03-01 21:58:44 +05:30
ي
e7935af42a Add pillar coverage guardrails for today workflow plans 2026-03-01 21:58:20 +05:30
ي
e2718e3b79 Enable phase-3 task memory filtering and add coverage 2026-03-01 21:38:24 +05:30
ajaysi
f8f7ddeb2a feat: Implement Today's Workflow and Agent Huddle enhancements 2026-03-01 20:15:31 +05:30
ajaysi
62d9c2e836 feat: Cherry-pick Website Maker feature from remote backup 2026-03-01 18:42:27 +05:30
ajaysi
4828274cbf Release Candidate: Production Release with Multi-Tenant & Onboarding Enhancements 2026-02-28 20:08:00 +05:30
ajaysi
08a1f4a1d8 Save local changes (GSC/Bing integrations) before merging PR #354 2026-02-13 13:11:27 +05:30
ajaysi
43e66835ac Recovered critical missing components: PerformanceMonitor, MarketSignalDetector, and SemanticDashboard 2026-02-08 14:06:09 +05:30
ajaysi
e404a86502 Recovered state: integrated TrendSurferAgent, restored frontend/backend files, and cleaned up recovery scripts 2026-02-08 13:56:57 +05:30
ajaysi
1db10ccd0f Added documentation for the auto-population feature and the analytics integration. 2026-01-17 11:01:10 +05:30
ajaysi
8193cdba67 AI Analysis and Content Strategy fixes. Enhanced Strategy Routes refactoring. 2026-01-10 19:32:50 +05:30
ajaysi
0b63ae7fc1 AI Researcher and Video Studio implementation complete 2026-01-05 15:49:51 +05:30
ajaysi
b134e9dc7e Added video studio router and endpoints. Added research router and endpoints. Added youtube router and endpoints. Added onboarding utils router and endpoints. Added onboarding utils service. Added onboarding utils models. Added onboarding utils routes. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. Added onboarding utils utils. 2026-01-01 17:56:25 +05:30
ajaysi
7512933c65 AI Image and Audio Generation Improvements.
AI Video Generation Pre-Flight Checklist. Cost Estimate Improvements.
2025-12-25 16:26:08 +05:30
ajaysi
59913bffa9 Added YouTube Creator scene building flow documentation 2025-12-21 17:15:23 +05:30
ajaysi
1d745c9bc8 AI podcast project 2025-12-16 16:25:52 +05:30
ajaysi
eba5210577 AI podcast maker performance optimizations 2025-12-12 21:43:09 +05:30
ajaysi
81590cf4db WIP: AI Podcast Maker and YouTube Creator Studio integration 2025-12-10 09:37:55 +05:30
ajaysi
31f078c763 Stop tracking generated story media and improve podcast workflow 2025-11-28 16:01:53 +05:30
ajaysi
7dd25d08af AI Campaign and Ad Creator 2025-11-28 14:38:34 +05:30
ajaysi
49e2131715 AI Image Studio, AI podcast Maker, AI product Marketing 2025-11-28 14:33:52 +05:30
ajaysi
77d7c0cde6 AI Image Studio Progress Review
- Added new router for content assets
- Added new service for content assets
- Added new model for content assets
- Added new utils for content assets
- Added new docs for content assets
- Added new tests for content assets
- Added new examples for content assets
- Added new guides for content assets
2025-11-23 09:21:11 +05:30
ajaysi
eede21ad42 AI Image Studio Phase 1 2025-11-20 09:06:00 +05:30
ajaysi
e96525347b AI story writer enhancements, text to video and voice generation, subscription management, and more. 2025-11-19 09:55:32 +05:30
ajaysi
bf7493c366 AI Video Generation Implementation 2025-11-17 17:38:23 +05:30
ajaysi
4901b7eb72 AI Story Writer Backend Migration Complete, Frontend UI Components Added 2025-11-16 19:25:26 +05:30
ajaysi
3b9356e2c8 story writer backend migration complete, Blog writer SEO and story writer backend migration complete, Blog writer SEO and story writer frontend migration complete 2025-11-13 16:14:26 +05:30
ajaysi
7191c7e7f0 AI platform insights monitoring and website analysis monitoring services added 2025-11-11 15:57:45 +05:30
ajaysi
d99c7c83a7 Scheduled research persona generation 2025-11-05 08:51:00 +05:30
ajaysi
55087c4f37 Research Wizard and CopilotKit mitigation review 2025-11-04 08:11:57 +05:30
ajaysi
e69107b07c Research component integration, Copilotkit implementation, SEO copilotkit implementation, Wix SEO metadata complete, Wix SEO metadata review 2025-11-03 16:01:44 +05:30
ajaysi
de4328175d Subscription dashboard improvements, AI text generation limit, and other fixes. 2025-11-01 18:01:14 +05:30
ajaysi
cdb41aec1b Added image generation to blog writer 2025-10-31 15:59:16 +05:30
ajaysi
3219e6bbe4 Hugging Face Integration. Remove OpenAI and Anthropic and DeepSeek. Add Hugging Face. 2025-10-29 20:15:04 +05:30
ajaysi
4431cd9848 SEO Dashboard Fixes and content planning refactoring 2025-10-29 17:10:48 +05:30
ajaysi
5866f49325 LinkedIn and Facebook Persona Services Implementation 2025-10-26 10:06:24 +05:30
ajaysi
caeb6e56a9 Onboarding Progress Service Implementation 2025-10-24 17:22:06 +05:30
ajaysi
a3f25f23c9 Subscription implementation complete, Renewal system implemented 2025-10-23 21:47:52 +05:30
ajaysi
2240cefa30 Subscription API and API key injection middleware added 2025-10-19 17:56:09 +05:30
ajaysi
1f087aad4c Bing Analytics and Insights added, background jobs added, database setup updated, environment setup updated, frontend updated, backend updated.
Onboarding Manager and Router Manager refactored, analytics and background jobs added, database setup updated, environment setup updated, frontend updated, backend updated.
Critical onboarding database migration implemented.
2025-10-18 10:28:15 +05:30
ajaysi
40fb6ac95b ALwrity Backend and Frontend - Stability and Error Handling Improvements 2025-10-14 10:57:16 +05:30
ajaysi
b6debd80b7 Subscription Guard and Installation Guide 2025-10-13 15:27:48 +05:30
ajaysi
c38812b6c5 Pricing Page and Subscription Guard 2025-10-13 10:25:57 +05:30
ajaysi
20b01717cd Fix: Onboarding Completion Service Update 2025-10-12 16:03:52 +05:30
ajaysi
8851e6ee9b Fix: Onboarding Completion Service Update 2025-10-12 15:17:19 +05:30
ajaysi
08ce9588f4 Fix: Persona Polling Issue 2025-10-12 14:27:15 +05:30
ajaysi
2a3ad8addc Fix: Step 6 Data Retrieval Issue 2025-10-11 22:01:20 +05:30
ajaysi
bf65065265 Exa Service updated to use database 2025-10-11 20:22:25 +05:30
ajaysi
734a54acc3 Fixed local storage issue with website analysis data 2025-10-11 19:51:30 +05:30
ajaysi
5f066b6c0e Add robust fallback logic for Step 3 data passing
- Check multiple storage sources for website URL (props, localStorage, sessionStorage)
- Add validation to prevent API calls without website URL
- Improve debugging logs to track data flow
- Make CompetitorAnalysisStep more resilient to production edge cases
- No breaking changes, just defensive improvements
2025-10-11 18:08:10 +05:30
ajaysi
c506b1da76 Fix Step 3 router 404 and EXA_API_KEY deployment blocker
- Move Step 3 router from optional to core routers in router_manager.py
- Add direct import of step3_routes in app.py for visibility
- Make ExaService initialization optional to prevent deployment failures
- Fix 404 error on /api/onboarding/step3/discover-competitors endpoint
2025-10-11 17:59:46 +05:30
ajaysi
ffa1a078f4 Adjusted WebsiteStep container width for better layout 2025-10-11 17:37:24 +05:30
ajaysi
1df12a64a2 Add brand analysis columns to onboarding database and migration scripts 2025-10-11 17:05:42 +05:30
ajaysi
b1ebe1034e ALwrity onboarding final step 2025-10-10 23:19:28 +05:30
ajaysi
e3daebec16 ALwrity + Wix + Wordpress integration complete 2025-10-10 14:18:35 +05:30
ajaysi
e2dc043134 ALwrity onboarding: Removed user-specific files 2025-10-10 14:18:09 +05:30
ajaysi
c383a3d50b ALwrity copilot: CopilotKit integration complete 2025-10-10 13:39:34 +05:30
ajaysi
11f164ae21 ALwrity + Wix + Wordpress + GSC + Bug Fixes 2025-10-10 13:08:09 +05:30
ajaysi
af4c8afb5b ALwrity + Wix + Wordpress + GSC integration 2025-10-09 13:21:07 +05:30
ajaysi
9cc1ffd47e ALwrity + Wix + Wordpress + GSC integration + Production API calls fixes 2025-10-09 13:07:09 +05:30
ajaysi
0f6f8a4c6c ALwrity + Wix + Wordpress + GSC integration 2025-10-08 17:33:22 +05:30
ajaysi
4e633f32d9 ALwrity + Wix + Wordpress + GSC integration 2025-10-08 17:27:32 +05:30
ajaysi
4783c87bec ALwrity + Wordpress + Wix + GSC integration 2025-10-08 17:25:09 +05:30
ajaysi
96b240b8ba ALwrity + Wordpress + Wix + GSC integration 2025-10-08 16:12:26 +05:30
ajaysi
719ca06da0 ALwrity + Wordpress + Wix + GSC integration 2025-10-08 15:06:35 +05:30
ajaysi
5e3901c1c6 ALwrity + Wordpress + Wix + GSC integration 2025-10-08 14:25:59 +05:30
ajaysi
3bab3450dc ALwrity + Wordpress + Wix + GSC integration 2025-10-08 10:13:14 +05:30
ajaysi
14dfb2e5c0 ALwrity onboarding fixes 2025-10-04 13:24:41 +05:30
ajaysi
510b79bbf8 Added onboarding progress tracking & landing page 2025-10-02 13:20:15 +05:30
ajaysi
e57d2577f8 Alwrity technical documentation 2025-09-25 12:23:21 +05:30
ajaysi
f6d25151e9 ALwrity mkdocs documentation template 2025-09-24 18:10:52 +05:30
ajaysi
b2b5769ad9 feat: Add comprehensive documentation structure
- Created ALwrity introduction page with vision summary
- Added detailed installation guide with step-by-step instructions
- Created comprehensive configuration guide for API keys and settings
- Added first steps guide with complete user onboarding flow
- Updated navigation structure to include all new pages
- Preserved existing documentation while organizing it logically
- Optimized for SEO with proper structure and metadata
2025-09-24 18:09:58 +05:30
ajaysi
35175328bb fix: GitHub Pages workflow configuration 2025-09-24 17:45:05 +05:30
ajaysi
5573e11f6d fix: Update GitHub Pages workflow for correct build path
- Fixed build directory path in GitHub Actions workflow
- Updated artifact upload path to match build output
- This should resolve the GitHub Pages deployment issue
2025-09-24 17:02:16 +05:30
ajaysi
5bee5c0aa0 feat: Set up MkDocs documentation site with Material theme
- Created comprehensive MkDocs configuration with Material theme
- Set up organized documentation structure for ALwrity features
- Added GitHub Pages deployment workflow
- Created initial documentation pages:
  - Homepage with feature overview
  - Quick Start guide
  - Blog Writer and SEO Dashboard feature docs
  - Comprehensive troubleshooting guide
- Configured responsive design with dark/light mode toggle
- Added search functionality and navigation
- Set up automatic deployment to GitHub Pages

The documentation site will be available at https://alwrity.github.io/ALwrity
2025-09-24 16:53:29 +05:30
ajaysi
ac307683e0 Alpha Subscription Implementation Plan 2025-09-24 16:08:24 +05:30
ajaysi
580282baa1 feat: Enhance GitHub community profile with ALwrity-specific improvements
- Improve CONTRIBUTING.md with detailed setup instructions and ALwrity-specific guidelines
- Enhance SUPPORT.md with comprehensive troubleshooting for AI services and GSC integration
- Update SECURITY.md with ALwrity-specific security features and considerations
- Add issue template configuration and question form for better issue management
- Enhance PR template with ALwrity-specific checklist items
- Add comprehensive .github/README.md explaining all community health files

These improvements provide better contributor experience and project visibility.
2025-09-24 15:49:53 +05:30
ajaysi
6554549494 feat: Add GitHub community profile files
- Add CONTRIBUTING.md, CODE_OF_CONDUCT.md, SECURITY.md
- Add issue templates and PR template
- Add SUPPORT.md and FUNDING.yml
- Improve project visibility and contributor experience
2025-09-24 15:47:20 +05:30
ajaysi
8c924b3ee9 Include latest fixes in onboarding feature branch
- Include route protection implementation
- Include Google Search API warning removal
- Include Stability AI import fixes
- Include updated README and removed SETUP_GUIDE
- Ensure feature branch has all latest improvements
2025-09-24 15:22:00 +05:30
Om-Singh1808
d86336dcf1 feat: Complete onboarding system with No Website functionality
- Add No Website button to Step 2 with business description form
- Implement onboarding cache service for browser-side data storage
- Add business info database models and API endpoints
- Update API key manager to save keys to .env file immediately
- Add database migration scripts for business info table
- Create reset onboarding script for fresh starts
- Implement hybrid data storage (API keys to backend, other data to cache)
- Add comprehensive business info CRUD operations
- Include database table creation and migration tools
2025-09-24 15:22:00 +05:30
Om-Singh1808
dca2318235 feat: Add No Website button to onboarding Step 2 with business description form 2025-09-24 15:22:00 +05:30
Om-Singh1808
f715d3edbb Clean up: Remove all cache files and add comprehensive .gitignore 2025-09-24 15:22:00 +05:30
ajaysi
be3f837d05 Update .gitignore with additional exclusions
- Add .gitignore to ignore list
- Add .pytest* patterns for pytest cache files
- Improve gitignore coverage for Python development
2025-09-24 15:12:21 +05:30
Om-Singh1808
4ea933e643 fix: Remove hardcoded secrets from env template files
- Replace actual API keys with placeholder values in frontend/env_template.txt
- Ensure template files only contain example values, not real secrets
- Fix GitGuardian security scan issues
2025-09-24 15:12:21 +05:30
Om-Singh1808
0a7d9bfd21 feat: Complete Google Search Console integration with Clerk authentication
- Add GSC API service with OAuth2 authentication
- Implement Clerk authentication for frontend and backend
- Add GSC login button and OAuth callback handling
- Create comprehensive GSC data fetching and caching
- Add authentication middleware for backend API protection
- Implement real-time GSC data integration in SEO dashboard
- Add user-specific GSC site management
- Include comprehensive logging and error handling
- Add TypeScript support and proper type definitions
- Create environment templates and setup documentation
- Update gitignore to exclude sensitive credential files

Features added:
- GSC OAuth2 authentication flow
- Real-time search analytics data
- Site list management
- Sitemap analysis
- User-specific data isolation
- Comprehensive error handling
- Authentication token management
- Popup-based OAuth flow
- Data caching and refresh mechanisms

Note: gsc_credentials.json should be created locally with your Google OAuth credentials
2025-09-24 15:12:21 +05:30
Om-Singh1808
aeb7751d48 feat: Complete onboarding system with No Website functionality
- Add No Website button to Step 2 with business description form
- Implement onboarding cache service for browser-side data storage
- Add business info database models and API endpoints
- Update API key manager to save keys to .env file immediately
- Add database migration scripts for business info table
- Create reset onboarding script for fresh starts
- Implement hybrid data storage (API keys to backend, other data to cache)
- Add comprehensive business info CRUD operations
- Include database table creation and migration tools
2025-09-24 15:12:21 +05:30
Om-Singh1808
201960ce9d feat: Add No Website button to onboarding Step 2 with business description form 2025-09-24 15:12:21 +05:30
Om-Singh1808
5dc756f062 Clean up: Remove all cache files and add comprehensive .gitignore 2025-09-24 15:12:21 +05:30
ajaysi
9cd5b3a583 Alpha Subscription Implementation Plan 2025-09-24 14:44:42 +05:30
ajaysi
dee3e428bd Implement route protection for onboarding
- Create ProtectedRoute component to guard routes based on onboarding status
- Protect all non-onboarding routes (dashboard, blog-writer, seo, etc.)
- Users must complete onboarding before accessing any features
- Prevents confusion and broken functionality for new users
- Improves user experience by guiding users through proper setup
2025-09-24 12:48:49 +05:30
ajaysi
197720bea4 Fix onboarding issue: Remove .onboarding_progress.json from Git tracking
- Add .onboarding_progress.json to .gitignore to prevent tracking
- Remove existing .onboarding_progress.json from Git tracking
- Ensures new users always go through onboarding process
- Fixes issue where users were not presented with onboarding after cloning
2025-09-24 12:39:01 +05:30
ajaysi
7e1dfb8238 Remove all tracked .pyc files from Git repository
- Remove 25+ .pyc files from backend/api/__pycache__/
- Remove .pyc files from backend/api/facebook_writer/
- Remove .pyc files from backend/models/__pycache__/
- Remove .pyc files from backend/services/__pycache__/
- Remove .pyc files from backend/services/llm_providers/__pycache__/
- These files are automatically generated and should not be tracked
- Repository is now clean of Python cache files
- Future .pyc files will be ignored by .gitignore
2025-09-23 19:48:34 +05:30
ajaysi
1692ca3039 Remove tracked .pyc files from Git
- Remove backend/api/__pycache__/onboarding.cpython-313.pyc
- Remove backend/api/facebook_writer/services/__pycache__/ files
- Remove backend/services/llm_providers/__pycache__/gemini_provider.cpython-313.pyc
- These files should not be tracked in Git as they are generated automatically
- .gitignore will now prevent future .pyc files from being tracked
2025-09-23 19:47:59 +05:30
ajaysi
67cf1f3e34 Add comprehensive .gitignore file
- Add Python cache files (*.pyc, __pycache__/)
- Add environment files (.env, .env.local, etc.)
- Add database files (*.db, *.sqlite)
- Add IDE files (.vscode/, .idea/)
- Add OS files (.DS_Store, Thumbs.db)
- Add Node.js files (node_modules/, build/)
- Add logs, temporary files, and credentials
- Add test files and coverage reports
- Add virtual environments and lock files
- Comprehensive coverage for Python/FastAPI/React project
2025-09-23 19:47:19 +05:30
ajaysi
ca0463b826 Merge branch 'main' of https://github.com/AJaySi/ALwrity 2025-09-23 16:21:32 +05:30
ajaysi
a91677782e Blog SEO Analysis Modal - Updated with SEO Metadata Generator, Core Metadata Tab, and Metadata Display Components 2025-09-23 16:21:09 +05:30
Cursor Agent
f2c18a822b Add Stability AI integration with comprehensive endpoints and features
Co-authored-by: ajay.calsoft <ajay.calsoft@gmail.com>
2025-09-23 12:38:27 +05:30
ajaysi
12119d418b Updated SEO Analysis Modal 2025-09-22 21:02:32 +05:30
ajaysi
f98d49cea7 Allowing AI to generate suggestions for the blog writer 2025-09-20 22:15:17 +05:30
ajaysi
4d153b292d ALwrity AI Blog Writer - Added Google Grounding UI Implementation 2025-09-18 18:45:53 +05:30
ajaysi
9f13daf443 Added beta testing for user_id=1 for all requests 2025-09-14 18:56:30 +05:30
ajaysi
380bb19673 Added enhanced linguistic analyzer and persona quality improver 2025-09-14 10:21:36 +05:30
ajaysi
fe277afc62 Merge branch 'main' of https://github.com/AJaySi/ALwrity 2025-09-14 09:53:46 +05:30
ajaysi
1460ce3cb6 Added enhanced linguistic analyzer and persona quality improver 2025-09-14 09:53:27 +05:30
dependabot[bot]
8fa220184e Bump axios in /frontend in the npm_and_yarn group across 1 directory
Bumps the npm_and_yarn group with 1 update in the /frontend directory: [axios](https://github.com/axios/axios).


Updates `axios` from 1.11.0 to 1.12.0
- [Release notes](https://github.com/axios/axios/releases)
- [Changelog](https://github.com/axios/axios/blob/v1.x/CHANGELOG.md)
- [Commits](https://github.com/axios/axios/compare/v1.11.0...v1.12.0)

---
updated-dependencies:
- dependency-name: axios
  dependency-version: 1.12.0
  dependency-type: direct:production
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-09-12 19:15:23 +05:30
ajaysi
c63148e1ce Add comprehensive Stage 3 Content Generation implementation plan
- Detailed implementation strategy for content generation
- URL context integration with Gemini API
- Narrative flow engine and continuity system
- Comprehensive audit system design
- 8-week implementation roadmap with specific milestones
2025-09-12 18:36:26 +05:30
ajaysi
1d04d64d95 Merge main into cursor/implement-usage-based-subscription-and-monitoring-0179
- Resolved merge conflicts in gemini_grounded_provider.py
- Removed conflicting Python cache file
- Integrated latest LinkedIn Writer features from main branch
2025-09-12 16:58:26 +05:30
ajaysi
2ae0c4a8b9 AI Blog Writer - Implement modular architecture with research, outline, and core services 2025-09-12 16:53:16 +05:30
ajaysi
c0a366269d Added blog writer implementation - WIP 2025-09-12 10:26:08 +05:30
ajaysi
1b65a9487b ALwrity LinkedIn Writer: Billing Dashboard: Compact View, Billing Overview, System Health Indicator, Cost Breakdown, Usage Trends, Usage Alerts, Comprehensive API Breakdown 2025-09-11 11:09:10 +05:30
ajaysi
da091f7c47 ALwrity LinkedIn Writer: Brainstorm Flow, Copilot Actions, Feature Carousel, Info Modals, Welcome Message 2025-09-10 13:58:56 +05:30
ي
b156298e82 Merge branch 'main' into cursor/implement-usage-based-subscription-and-monitoring-0179 2025-09-10 13:56:54 +05:30
ajaysi
489a60e4a2 Add overall metrics to the hallucination detector 2025-09-08 22:09:45 +05:30
ajaysi
6fd9a4e354 ALwrity HALLUCINATION DETECTOR AND ASSISTIVE WRITING 2025-09-08 21:14:27 +05:30
ajaysi
5ba19c097a Analytics Insights and Tools Modal 2025-09-07 08:42:37 +05:30
ajaysi
7ac72c5382 Fixes to Generate Pillar Chips 2025-09-06 18:34:42 +05:30
ajaysi
ae42720c2a Alwrity today's tasks workflow implementation plan. 2025-09-06 15:28:05 +05:30
ajaysi
f82ada0361 ALwrity persona system 2025-09-05 15:22:43 +05:30
ajaysi
ccbdc9e8c6 Advanced Content Hyper-Personalization Implementation 2025-09-04 22:49:15 +05:30
Cursor Agent
e0a6150ed1 Add comprehensive usage-based subscription system with API tracking
Co-authored-by: ajay.calsoft <ajay.calsoft@gmail.com>
2025-09-04 17:18:27 +00:00
ajaysi
d57f7feb4a Phase 3 Complete: LinkedIn Editor Integration - Enhanced LinkedIn writer with persona-aware chat, multiple integration options, and comprehensive testing 2025-09-04 16:04:55 +05:30
ajaysi
ee39906672 Phase 2 Complete: CopilotKit Integration - PlatformPersonaChat component with comprehensive testing and integration examples 2025-09-04 14:33:27 +05:30
ajaysi
bf41db00e5 Phase 1 Complete: React Context Layer - TypeScript interfaces and PlatformPersonaProvider implemented 2025-09-04 14:27:36 +05:30
ajaysi
4c2e1daef9 Update documentation: PR #226 Writing Persona System fully implemented, next steps for React integration layer 2025-09-04 14:15:34 +05:30
ajaysi
266b215f50 Merge PR #226: Writing Persona System with platform-specific adaptations 2025-09-04 14:10:00 +05:30
ي
37aadd7e19 Delete backend/services/__pycache__/__init__.cpython-313.pyc 2025-09-04 13:57:14 +05:30
ajaysi
6eb7baee4b Content Hyper-Personalization Implementation Plan 2025-09-04 13:10:12 +05:30
ajaysi
c19fc3f225 ALwrity Prompts - AI Integration Plan 2025-09-03 23:16:39 +05:30
ajaysi
5efee4235d Added citation and quality metrics to the content editor. 2025-09-03 09:40:05 +05:30
ajaysi
10b50f9732 Alwrity Copilot Integration for LinkedIn Writer 2025-09-01 19:45:30 +05:30
ajaysi
64944104a3 merge: LinkedIn Writer PR #223 - resolve conflicts and integrate with existing routers 2025-08-31 23:33:10 +05:30
ajaysi
c8e765975e ALwrity Facebook Writer CopilotKit Implementation Plan 2025-08-31 23:31:29 +05:30
ajaysi
eb0789321d ALwrity Facebook Writer CopilotKit Implementation Plan 2025-08-31 18:41:07 +05:30
Cursor Agent
7dbebd45eb Implement persona generation system with platform-specific adaptations
Co-authored-by: ajay.calsoft <ajay.calsoft@gmail.com>
2025-08-31 08:26:51 +00:00
ajaysi
66c14e158c merge: resolve app.py conflicts; include SEO tools and Facebook Writer routers 2025-08-31 08:39:51 +05:30
ajaysi
1e0a13e204 ALwrity SEO CopilotKit Implementation Plan 2025-08-30 20:07:55 +05:30
ajaysi
0f16b855e1 merge: resolve conflicts favoring local for frontend packages 2025-08-30 16:16:13 +05:30
ajaysi
f5f3c09ecc feat(seo-copilot): caching + freshness UI; glassomorphic styling; CopilotKit HITL modular actions; provider fixes; DB sessions & action types; seed 17 actions 2025-08-30 16:12:41 +05:30
ajaysi
1fa2067301 Alwrity copilotkit integration - 0.5.7 2025-08-28 20:46:42 +05:30
Cursor Agent
58918d3ff1 Add LinkedIn content generation service to backend
Co-authored-by: ajay.calsoft <ajay.calsoft@gmail.com>
2025-08-28 09:42:17 +00:00
ajaysi
f76381030b Alwrity monitoring data service 2025-08-28 11:11:55 +05:30
Cursor Agent
40d33de1ab Add Facebook Writer API with models, routers, and migration summary
Co-authored-by: ajay.calsoft <ajay.calsoft@gmail.com>
2025-08-27 15:49:19 +00:00
ajaysi
be88e931ea Alwrity database fix 2025-08-27 15:10:54 +05:30
ي
d9833f30a6 Merge branch 'main' into cursor/migrate-and-enhance-ai-seo-tools-with-fastapi-dabb 2025-08-27 09:44:32 +05:30
ajaysi
6c72ef1a68 Alwrity calendar generation framework - step 1-3 completed with real database integration 2025-08-24 19:50:37 +05:30
Cursor Agent
512f82b7b0 Add AI SEO tools with FastAPI endpoints and comprehensive services
Co-authored-by: ajay.calsoft <ajay.calsoft@gmail.com>
2025-08-24 11:47:42 +00:00
ajaysi
5d8d1cfb73 ALwrity version 0.5.6 2025-08-22 14:08:54 +05:30
ajaysi
3f2f4d7b8c ALwrity version 0.5.5 2025-08-20 20:22:56 +05:30
ajaysi
74e22b421a ALwrity version 0.5.5 2025-08-19 21:48:33 +05:30
ajaysi
5f104bf427 ALwrity version 0.5.6 2025-08-16 20:40:09 +05:30
ajaysi
234eefb4bc ALwrity version 0.5.5 2025-08-15 23:02:18 +05:30
ajaysi
6bfa9f0fce ALwrity version 0.5.5 2025-08-15 16:13:01 +05:30
ajaysi
55a97b2fd4 ALwrity version 0.5.5 2025-08-15 08:28:34 +05:30
ajaysi
2b8c66c4d0 ALwrity version 0.5.5 2025-08-13 17:38:54 +05:30
ajaysi
66ece49705 Alwrity version 0.5.4 2025-08-12 22:35:21 +05:30
ajaysi
39b96c44da Alwrity version 0.5.4 2025-08-11 10:54:50 +05:30
ajaysi
13ca78f653 ALwrity version 0.5.4 2025-08-10 13:10:32 +05:30
ajaysi
5c08b6e007 ALwrity version 0.5.4 2025-08-09 23:14:16 +05:30
ajaysi
01fe1e0a9c ALwrity version 0.5.4 2025-08-08 10:50:31 +05:30
ajaysi
c5b54786f8 Always version 0.5.4 2025-08-07 20:06:26 +05:30
ajaysi
3670d0b5a0 Merge branch 'main' of https://github.com/AJaySi/AI-Writer 2025-08-06 16:35:44 +05:30
ajaysi
a3a1484b61 ALwrity Version 0.5.1 (Fastapi + React) 2025-08-06 16:35:23 +05:30
ي
7d856d9330 Delete .gitignore 2025-08-06 16:31:20 +05:30
ajaysi
2579c12ba4 ALwrity Version 0.5.1 (Fastapi + React) 2025-08-06 16:29:49 +05:30
ajaysi
dbf761c31f Alwrity version 0.5.1 (Fastapi + React) 2025-08-06 16:22:50 +05:30
ajaysi
c87f27e56d Merge branch 'main' of https://github.com/AJaySi/AI-Writer 2025-08-06 12:53:17 +05:30
ajaysi
32f97fa6b3 ALwrity Version 0.5.0 (Fastapi + React ) 2025-08-06 12:48:02 +05:30
Om-Singh1808
cc159f29ae fix: pin numpy and pandas-ta versions for compatibility 2025-07-30 17:40:28 +05:30
ajaysi
f28a919caa fix requirements with pandas-ta>=0.3.14b0 2025-07-06 20:05:43 +05:30
ajaysi
a73dd6bd0b Merge branch 'main' of https://github.com/AJaySi/AI-Writer 2025-06-30 07:51:04 +05:30
ajaysi
b21cbb68da Added new features to the project 2025-06-30 07:49:48 +05:30
ي
edd03dd199 Bug fix: Update Docker installation instructions and add docker-compose support 2025-06-10 22:19:46 +00:00
ajaysi
bbe56a364d ALwrity Chatbot, SEO, Social media, Settings, Dashboard UI styling changes 2025-06-08 05:59:22 +05:30
ajaysi
fad9647b46 AI writers, SEO, Social media, Settings, Dashboard UI styling changes 2025-06-03 09:14:47 +05:30
ajaysi
5ca2fd5977 alwrity chatbot assistant, content scheduler, and content repurposing 2025-06-02 00:00:18 +05:30
ajaysi
889021c078 Content Calendar, Content Gap Analysis, and Content Optimization 2025-05-27 09:15:08 +05:30
ajaysi
4049d19787 youtube shorts video generator 2025-05-16 21:53:56 +05:30
ajaysi
b2ce1ceb49 AI Outline Writer - Added image generation and display for sections 2025-05-07 16:39:28 +05:30
ajaysi
5f7d319859 AI Backlinker, Google Ads Generator, Letter Writer - WIP 2025-05-06 22:27:43 +05:30
ajaysi
26b02b9719 AI FAQ Generator & github blogs 2025-05-04 17:04:44 +05:30
ajaysi
c51e355d26 AI Blog Rewriter Updater feature complete 2025-05-04 10:56:41 +05:30
ajaysi
19ff21a8a1 story illustrator and story video generator, AI web researcher fixes 2025-05-02 23:09:43 +05:30
ajaysi
cda275f1cc AI Blog Writer enhancements & Streamlit UI updates 2025-05-01 20:41:41 +05:30
ajaysi
a27522d32e AI blog writer & blog metadata updates & improvements 2025-05-01 10:38:01 +05:30
ajaysi
a6e3ac2f8b Merge branch 'main' of https://github.com/AJaySi/AI-Writer 2025-04-30 16:06:55 +05:30
ajaysi
e19f933a19 copywriter and blog metadata updates 2025-04-30 16:06:33 +05:30
ي
5565e58cc2 Fixed Dockerfile to use the .env file for environment variables 2025-04-29 12:17:21 +00:00
Seva
4f20a5206f fix dockerfile bug https://github.com/AJaySi/AI-Writer/issues/179 2025-04-29 11:48:54 +05:30
ajaysi
ad9f401b60 Merge branch 'main' of https://github.com/AJaySi/AI-Writer 2025-04-29 08:56:28 +05:30
ajaysi
9db20db0d1 Blog writer enhancements & fixes 2025-04-29 08:55:47 +05:30
ي
7c3db7416f fix: update Dockerfile to use latest base image 2025-04-28 16:03:12 +00:00
ي
0b240e5574 Bug fixes and improvements to the README_dockerfile.md file 2025-04-26 11:01:05 +00:00
ي
5d8537acb5 Bug fixes and improvements 2025-04-26 09:52:46 +00:00
ajaysi
ef462f05f2 Onboarding changes and improvements 2025-04-25 11:28:11 +05:30
ajaysi
f13516f707 Merge branch 'main' of https://github.com/AJaySi/AI-Writer 2025-04-25 11:27:33 +05:30
ajaysi
b9359b04fb Fix environment variable handling in setup: PERSONALIZATION_DONE and FINAL_SETUP_COMPLETE 2025-04-25 11:27:17 +05:30
ي
20b4782951 Getting started Guides and Dockerfile, setup.py, and install.bat 2025-04-23 13:25:33 +00:00
ي
f48d58c7df Getting started Guides and Dockerfile, setup.py, and install.bat 2025-04-23 12:42:31 +00:00
ي
72a8aa373e https://github.com/AJaySi/AI-Writer/issues/165 2025-04-23 10:08:55 +00:00
ajaysi
92b433bf9a Docs & Roadmap updates 2025-04-21 21:07:35 +05:30
ajaysi
c5b47bd32f Detailed Docs & Onboarding improvements 2025-04-21 16:34:18 +05:30
ajaysi
6e60a9fd28 API Setup Updates and Tweaks 2025-04-17 23:55:20 +05:30
ajaysi
fa4097c9ae Revert changes to website_setup.py from commit 372bf71 2025-04-17 19:22:29 +05:30
ajaysi
5da0562aa5 Revert ai_research_setup.py and base.py to the state before commit 036fde9. 2025-04-16 19:36:15 +05:30
ajaysi
3a871d4de0 Getting the twitter dashboard working. 2025-04-16 19:31:51 +05:30
ajaysi
f854f0f30e Revert changes in lib/utils/api_key_manager/components/ai_providers.py to the state before commit 036fde9 2025-04-16 19:27:37 +05:30
ajaysi
60409395b4 Revert changes in lib/utils/api_key_manager/components/ai_providers_setup.py to the state before commit 036fde9. 2025-04-16 19:24:05 +05:30
ajaysi
74fd25d52e Resolved merge conflicts in LinkedIn README and Twitter dashboard 2025-04-16 18:32:43 +05:30
ajaysi
a551f97904 AI twitter and linkedin writers changes. 2025-04-16 18:27:37 +05:30
ajaysi
385d9f000f Merge branch 'main' of https://github.com/AJaySi/AI-Writer 2025-04-16 16:24:00 +05:30
ajaysi
5982ce558c Twitter and LinkedIn Writers 2025-04-16 16:23:42 +05:30
ajay.calsoft
372bf71a0a Onboarding Improvements 2025-04-16 10:32:59 +00:00
ajay.calsoft
036fde9e81 Onboarding changes 2025-04-15 17:03:04 +00:00
ajaysi
a2b92e27bc Link 2025-04-15 12:37:25 +05:30
ajaysi
d4f1fc77a1 LinkedIn AI Writer features like carousel, video script, and comment response generator 2025-04-14 22:34:10 +05:30
ajaysi
dd9a9e5f09 Facebook AI Writer - Ad Copy Generator, Event Generator, Group Post Generator, Hashtag Generator, Page About Generator, Facebook Carousel Generator, Facebook Reel Generator 2025-04-12 19:53:15 +05:30
ajaysi
e41be5789a Gemini AI common code and utils 2025-04-11 17:47:55 +05:30
ajaysi
b556edb989 Facebook AI Writer features like post and story generator 2025-04-10 15:24:00 +05:30
ajaysi
8ac2095ac4 Youtube AI Writer Tools like yt_shorts_scripts, tags_generator, community_post_generator, shorts_script_generator 2025-04-10 11:42:34 +05:30
ي
d734c66f1a Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
f04ab57f82 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ajaysi
70e81fd5aa content planning and competitor analysis. 2025-04-08 20:31:58 +05:30
ajaysi
da1f06928e AI Content planning and competitor analysis.
Tight integration with Alwrity, tavily and metaphor.
2025-04-08 20:31:58 +05:30
ي
2cf01daf57 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
cfc9d1847d Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ajaysi
0f5bedb7b6 content planning and competitor analysis. 2025-04-08 20:31:58 +05:30
ajaysi
cc4c66be38 AI Content planning and competitor analysis.
Tight integration with Alwrity, tavily and metaphor.
2025-04-08 20:31:58 +05:30
ي
eca147fac0 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
a2e1b493d4 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
1094daeeef Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
1233e15218 Update lib/utils/content_generators.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
2311d40b07 Update lib/utils/content_generators.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ajaysi
cc30fe6f27 content planning and competitor analysis. 2025-04-08 20:31:58 +05:30
ajaysi
68410d9163 AI Content planning and competitor analysis.
Tight integration with Alwrity, tavily and metaphor.
2025-04-08 20:31:58 +05:30
ي
26d57ea1ac Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
d4fc0c3918 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
2b0ab310fc Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ajaysi
ef94ac449c content planning and competitor analysis. 2025-04-08 20:31:58 +05:30
ajaysi
b939665125 AI Content planning and competitor analysis.
Tight integration with Alwrity, tavily and metaphor.
2025-04-08 20:31:58 +05:30
ي
0c0164453b Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
b36d89ab66 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ajaysi
fdd6b4951a content planning and competitor analysis. 2025-04-08 20:31:58 +05:30
ajaysi
6edbdbb621 AI Content planning and competitor analysis.
Tight integration with Alwrity, tavily and metaphor.
2025-04-08 20:31:58 +05:30
ي
ff76cc1c97 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
3edb1c0f11 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
299d320e69 Update lib/utils/content_generators.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
c45532b4db Update lib/utils/content_generators.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ajaysi
92abd97e2d content planning and competitor analysis. 2025-04-08 20:31:58 +05:30
ajaysi
e5bf7a35f0 AI Content planning and competitor analysis.
Tight integration with Alwrity, tavily and metaphor.
2025-04-08 20:31:58 +05:30
ي
ed9221efbb Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
adca7917e5 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
733de06963 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ajaysi
be9ef9e9a7 content planning and competitor analysis. 2025-04-08 20:31:58 +05:30
ajaysi
1e0850d444 AI Content planning and competitor analysis.
Tight integration with Alwrity, tavily and metaphor.
2025-04-08 20:31:58 +05:30
ي
e816d8fc9d Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
58abb6a792 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
694a75aa8d Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
d319298866 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ajaysi
d0a746286f content planning and competitor analysis. 2025-04-08 20:31:58 +05:30
ajaysi
6a3ff439ba AI Content planning and competitor analysis.
Tight integration with Alwrity, tavily and metaphor.
2025-04-08 20:31:58 +05:30
ي
0de88a34bd Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
2d0d4dc8fb Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
299d795d1f Update lib/utils/content_generators.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
60d8b86bf3 Update lib/utils/content_generators.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ajaysi
40dfd4003d content planning and competitor analysis. 2025-04-08 20:31:58 +05:30
ajaysi
be136fd21d AI Content planning and competitor analysis.
Tight integration with Alwrity, tavily and metaphor.
2025-04-08 20:31:58 +05:30
ي
edea74aaed Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
390a321340 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ajaysi
1e427ee48b content planning and competitor analysis. 2025-04-08 20:31:58 +05:30
ajaysi
b9bcc596c2 AI Content planning and competitor analysis.
Tight integration with Alwrity, tavily and metaphor.
2025-04-08 20:31:58 +05:30
ي
205d9bf09a Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
df9eeb909c Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
d4066a9be5 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
9dc1330ee0 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ajaysi
da6e715152 content planning and competitor analysis. 2025-04-08 20:31:58 +05:30
ajaysi
454681ea14 AI Content planning and competitor analysis.
Tight integration with Alwrity, tavily and metaphor.
2025-04-08 20:31:58 +05:30
ي
a886bdd9c1 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
435aecc67b Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
4f865b2d13 Update lib/utils/content_generators.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
561a50f68b Update lib/utils/content_generators.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ajaysi
2fa94f437d content planning and competitor analysis. 2025-04-08 20:31:58 +05:30
ajaysi
a2c44d6051 AI Content planning and competitor analysis.
Tight integration with Alwrity, tavily and metaphor.
2025-04-08 20:31:58 +05:30
ي
d5b27c0867 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
d99e68b50f Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ajaysi
38e2483548 content planning and competitor analysis. 2025-04-08 20:31:58 +05:30
ajaysi
3cf3ff4a68 AI Content planning and competitor analysis.
Tight integration with Alwrity, tavily and metaphor.
2025-04-08 20:31:58 +05:30
ي
1de9cf209a Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
4c99c96753 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
83943de3b0 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
dd2fb9eab0 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ajaysi
031c94eb8f content planning and competitor analysis. 2025-04-08 20:31:58 +05:30
ajaysi
55e22af7a0 AI Content planning and competitor analysis.
Tight integration with Alwrity, tavily and metaphor.
2025-04-08 20:31:58 +05:30
ي
2d380b8169 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
28aa27c4b5 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
8bbdad622f Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
940efaa9c5 Update lib/ai_web_researcher/metaphor_basic_neural_web_search.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
59a9198e22 Update lib/utils/content_generators.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ي
a1cf6b811c Update lib/utils/content_generators.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-08 20:31:58 +05:30
ajaysi
5797d031c8 content planning and competitor analysis. 2025-04-08 20:31:58 +05:30
ajaysi
33a608dcdc AI Content planning and competitor analysis.
Tight integration with Alwrity, tavily and metaphor.
2025-04-08 20:31:58 +05:30
ajaysi
8312dbaaac Youtube AI Writer Tools 2025-04-08 18:37:35 +05:30
ajaysi
940a4ad1fa ALwrity AI Copywriter app, aidppc, oath, quest, star, fab, pas, app, acca, 4c, 4r, emotional 2025-04-07 15:13:33 +05:30
ajaysi
3a57df6ecb AI Content planning and competitor analysis. 2025-04-06 17:34:36 +05:30
ajaysi
d31480eaf4 AI Content planning and competitor analysis. 2025-04-06 17:26:48 +05:30
ajaysi
be5adc328d AI Researcher changes 2025-04-06 12:55:55 +05:30
ajaysi
4c4e48de58 Merge branch 'alwrity_keyword_research' 2025-04-06 12:41:18 +05:30
ajaysi
eec99f4bdc Merge branches 'alwrity_keyword_research' and 'alwrity_keyword_research' of https://github.com/AJaySi/AI-Writer into alwrity_keyword_research 2025-04-06 11:08:46 +05:30
ajaysi
e05450d070 Stupid github 2025-04-06 11:08:39 +05:30
ajaysi
579bf7d0a6 Google trends data and keyword research 2025-04-06 11:08:39 +05:30
ajaysi
a2fb77f700 AI Web Researcher: Added Exa answer and Tavily answer to the metaphor search results
Added AI insights to the metaphor search results
Better display of AI search results
2025-04-06 11:08:39 +05:30
ajaysi
e1060ffc45 ALwrity AI Keyword Web Researcher 2025-04-06 11:08:39 +05:30
ajaysi
81e78563e2 Google Grounded Search and Styling Fixes 2025-04-06 11:08:39 +05:30
ajaysi
f19d57cba0 Google Search Grounded results, Content Calendar Ideator, Competitor Analysis, and Keyword Researcher 2025-04-06 11:08:36 +05:30
ajaysi
22d027d1c4 Made changes to Getting started with ALwrity and added lot of details on API keys 2025-04-06 11:07:07 +05:30
ajaysi
ce8c253f59 ALwrity AI Keyword Web Researcher 2025-04-06 11:06:48 +05:30
ajaysi
410d9e3e3d Google Grounded Search and Styling Fixes 2025-04-06 11:06:48 +05:30
ajaysi
d2f92ec791 Rebase: Google Search Grounded results, Content Calendar Ideator, Competitor Analysis, and Keyword Researcher 2025-04-06 11:06:48 +05:30
ajaysi
0d6e10f0cf Google Search Grounded results, Content Calendar Ideator, Competitor Analysis, and Keyword Researcher 2025-04-06 11:06:10 +05:30
ajaysi
55caaf4fff Made changes to Getting started with ALwrity and added lot of details on API keys 2025-04-06 11:03:08 +05:30
ajaysi
6d7f048a23 ALwrity AI Keyword Web Researcher 2025-04-06 10:53:57 +05:30
ajaysi
c59cb6048b Google Search Grounded results, Content Calendar Ideator, Competitor Analysis, and Keyword Researcher 2025-04-06 10:53:57 +05:30
ajaysi
171c20b619 Stupid github 2025-04-06 10:51:18 +05:30
ajaysi
284c61e776 Google trends data and keyword research 2025-04-05 22:50:43 +05:30
ajaysi
d7cfe2dd31 AI Web Researcher: Added Exa answer and Tavily answer to the metaphor search results
Added AI insights to the metaphor search results
Better display of AI search results
2025-04-05 14:55:28 +05:30
ajaysi
4fc7ba7055 ALwrity AI Keyword Web Researcher 2025-04-04 12:20:53 +05:30
ajaysi
163c9131e4 Google Grounded Search and Styling Fixes 2025-04-04 12:20:53 +05:30
ajaysi
b48a3b106b Google Search Grounded results, Content Calendar Ideator, Competitor Analysis, and Keyword Researcher 2025-04-04 12:20:53 +05:30
ajaysi
611fbd51a3 Made changes to Getting started with ALwrity and added lot of details on API keys 2025-04-04 12:20:53 +05:30
ajaysi
aa66fbe585 ALwrity AI Keyword Web Researcher 2025-04-04 12:20:53 +05:30
ajaysi
94c6a9c7a3 Google Grounded Search and Styling Fixes 2025-04-04 12:20:53 +05:30
ajaysi
c77070e399 Rebase: Google Search Grounded results, Content Calendar Ideator, Competitor Analysis, and Keyword Researcher 2025-04-04 12:20:43 +05:30
ajaysi
9237371971 Merge branch 'new_alwrity' of https://github.com/AJaySi/AI-Writer into new_alwrity 2025-04-04 11:50:15 +05:30
ajaysi
91072e8787 ALwrity AI Keyword Web Researcher 2025-04-04 11:50:09 +05:30
ajaysi
d2bfb95c30 Google Grounded Search and Styling Fixes 2025-04-04 11:50:09 +05:30
ajaysi
6861043101 Google Search Grounded results, Content Calendar Ideator, Competitor Analysis, and Keyword Researcher 2025-04-04 11:50:09 +05:30
ajaysi
5c16d4def7 Made changes to Getting started with ALwrity and added lot of details on API keys 2025-04-04 11:50:09 +05:30
ajaysi
e09e54ead0 ALwrity AI Keyword Web Researcher 2025-04-04 11:45:53 +05:30
ajaysi
42543f6ae1 Google Grounded Search and Styling Fixes 2025-04-04 11:45:53 +05:30
ajaysi
8e70578b38 Google Search Grounded results, Content Calendar Ideator, Competitor Analysis, and Keyword Researcher 2025-04-04 11:45:40 +05:30
ajaysi
ce216ae898 Merge branch 'new_alwrity' of https://github.com/AJaySi/AI-Writer into new_alwrity 2025-04-04 11:42:31 +05:30
ajaysi
46e28321b0 ALwrity AI Keyword Web Researcher 2025-04-04 11:40:07 +05:30
ajaysi
a3376f615a Google Grounded Search and Styling Fixes 2025-04-04 11:40:06 +05:30
ajaysi
f2d8cd32da Google Search Grounded results, Content Calendar Ideator, Competitor Analysis, and Keyword Researcher 2025-04-04 11:40:06 +05:30
ajaysi
ca58a2f2a5 Made changes to Getting started with ALwrity and added lot of details on API keys 2025-04-04 11:40:06 +05:30
ajaysi
1a803f7ac3 ALwrity AI Keyword Web Researcher 2025-04-04 11:32:22 +05:30
ajaysi
0ea99a485c Google Grounded Search and Styling Fixes 2025-04-04 11:32:22 +05:30
ajaysi
bf2b1f596f Google Search Grounded results, Content Calendar Ideator, Competitor Analysis, and Keyword Researcher 2025-04-04 11:32:22 +05:30
ajaysi
9d27d8469c Hmm, lets commit this: 2025-04-04 11:26:31 +05:30
ajaysi
b1d15b796c Merge branch 'new_alwrity' of https://github.com/AJaySi/AI-Writer into new_alwrity 2025-04-04 11:17:00 +05:30
ajaysi
3ffb563d40 ALwrity AI Keyword Web Researcher 2025-04-04 11:13:48 +05:30
ajaysi
b4660d9d98 Google Grounded Search and Styling Fixes 2025-04-04 11:13:48 +05:30
ajaysi
c40ce6ce4c Google Search Grounded results, Content Calendar Ideator, Competitor Analysis, and Keyword Researcher 2025-04-04 11:13:48 +05:30
ajaysi
8f09899dff Merge branch 'new_alwrity' of https://github.com/AJaySi/AI-Writer into new_alwrity 2025-04-04 10:54:58 +05:30
ajaysi
dcd8917805 ALwrity AI Keyword Web Researcher 2025-04-04 10:48:46 +05:30
ي
1e588a0f02 Merge branch 'main' into new_alwrity 2025-04-03 11:47:48 +05:30
ajaysi
65f452be05 Google Grounded Search and Styling Fixes 2025-04-02 22:44:16 +05:30
ajaysi
d6c0bc11ae Google Search Grounded results, Content Calendar Ideator, Competitor Analysis, and Keyword Researcher 2025-04-02 22:41:25 +05:30
ajaysi
7d6ea91e6a Made changes to Getting started with ALwrity and added lot of details on API keys 2025-04-01 21:46:10 +05:30
ajaysi
6c833e2773 Made changes to Getting started with ALwrity and added lot of details on API keys 2025-04-01 13:11:40 +05:30
Umesh
367f9bac2c Update README.md
I have added the remaining tools details which were missing here. Please check and review.

AI SEO tool
AI Social Tool
AI Content Generator Tool
2025-03-13 10:55:55 +05:30
cypheroxide
b1469eece9 Add install script info to README and implement error logging in install_dependencies.py 2025-03-12 09:04:58 +05:30
cypheroxide
e65b2b309e Add system prequesisties documentation and installation helper script 2025-03-12 09:04:58 +05:30
ي
2140b9bea4 Update README.md
https://storyme.app.io/
2025-03-01 14:47:38 +05:30
DikshaDisciplines
2d0ca90232 Update README.md 2025-02-17 10:46:17 +05:30
ي
6922e5f85f Update ai_news_article_writer.py - Issue 154
encountered an issue with the AI News Writer feature. The system successfully generates a news report (as indicated by the success message), but the actual content is not displayed in the UI.

Steps to Reproduce:

Open Alwrity and navigate to AI Writers.
Select Write News Reports.
Enter keywords (e.g., Upcoming AI Trends in 2025).
Select an origin country (e.g., India) and language (e.g., English).
Click Generate News Report.
2025-02-13 13:19:13 +05:30
ي
f1af927bd6 history chatbot and a document question-answering chatbot 2025-02-12 19:57:41 +05:30
ي
70f43b9546 Delete lib/chatbot_custom/chat_history_chatbot.py 2025-02-12 19:55:09 +05:30
ي
25784c4a98 Delete lib/chatbot_custom/chatbot_local_docqa.py 2025-02-12 19:54:47 +05:30
ي
942341a3f9 Provide alwrity_rag_chatbot module 2025-02-12 19:53:41 +05:30
ي
c4d6673da6 Update alwrity.py - Cleaning up the mess. 2025-01-30 19:42:24 +05:30
ي
80e2ab6bbb Update requirements.txt 2025-01-30 08:24:44 +05:30
ي
f8b7584be2 Update on_page_seo_analyzer.py 2025-01-30 08:24:02 +05:30
ي
6edb1fb29e Update requirements.txt 2025-01-30 08:07:27 +05:30
ي
69d95d3298 Update requirements.txt 2025-01-29 23:29:29 +05:30
ي
af96cee915 Update requirements.txt 2025-01-29 23:10:53 +05:30
ي
70b272fd84 Update content_planning_agents_alwrity_crew.py 2025-01-29 23:10:15 +05:30
ي
e758ab5695 Update on_page_seo_analyzer.py 2025-01-29 23:09:47 +05:30
ي
9782640923 Update deepseek text generation module 2025-01-28 12:18:53 +05:30
ي
fca5b269cb Update alwrity.py for DeepSeek AI model 2025-01-27 20:18:20 +05:30
ي
6a76570610 Support DeepSeek AI model for text gen 2025-01-27 19:45:04 +05:30
ي
960b1394b4 Include deepseek AI model for content generation 2025-01-27 19:40:31 +05:30
ي
bd39becb57 Update deepseek_text_gen
Changes Made:
Configured logging similar to openai_text_gen.py.
Added retry mechanism using tenacity similar to both openai_text_gen.py and gemini_pro_text.py.
Adapted the API call to use DeepSeek's reasoning.create method.
Handled streaming of responses and error logging.
2025-01-27 19:32:53 +05:30
ي
86a38aec85 Create deepseek_text_gen.py 2025-01-27 17:58:29 +05:30
ي
076a597d7a Update finance_data_researcher.py 2025-01-26 13:48:52 +05:30
ي
b58e9f519e Update firecrawl web crawler 2025-01-24 16:31:23 +05:30
ي
4b883408dd Update README.md 2025-01-24 16:17:38 +05:30
ي
348f84a99b Rename README to README.md 2025-01-24 16:05:54 +05:30
ي
9a819b2267 Update and rename README.md to README 2025-01-24 16:05:07 +05:30
ي
3653bd4e80 Update arxiv_schlorly_research.py
Function Definitions:
fetch_arxiv_data: Fetches arXiv data based on a query.
create_dataframe: Creates a DataFrame from the provided data.
get_arxiv_main_content: Returns the main content of an arXiv paper.
download_image: Downloads an image from a URL.
scrape_images_from_arxiv: Scrapes images from an arXiv page.
arxiv_bibtex: Generates the BibTeX entry for an arXiv paper.
extract_arxiv_ids_from_line: Extracts arXiv IDs from a given line of text.
read_written_ids: Reads already written arXiv IDs from a file.
append_id_to_file: Appends a single arXiv ID to a file.
Step 2: Suggest Code Improvements
Code Duplication:
Combine Similar Functions: Functions such as fetch_arxiv_data and create_dataframe can be combined or refactored to reduce redundancy.
Reuse Code: Ensure common functionality is abstracted into reusable functions.
Performance and Optimization:
Optimize API Calls: Ensure the arXiv API calls are optimized and handle rate limits.
Efficient Data Handling: Use more efficient data handling techniques, such as batch processing for large datasets.
Coding Standards and Best Practices:
Add Docstrings: Ensure all functions have detailed docstrings explaining their purpose, arguments, and return values.
Error Handling: Improve error handling to provide more informative error messages and handle different types of errors separately.
Logging: Use a consistent logging strategy to log important events and errors.
Code Structure: Group related functions into classes or modules for better organization and maintainability.
PEP 8 Compliance: Ensure the code follows PEP 8 standards for Python code style.
2025-01-24 15:52:06 +05:30
ي
f75375eaaa Delete lib/ai_web_researcher/__pycache__ directory 2025-01-18 09:35:29 +05:30
ي
2f37626e32 Update sitemap_analysis.py
Code Improvements:
Error Handling:

Improve error messages to be more informative.
Log errors for debugging purposes.
Code Readability:

Add docstrings and comments to explain the purpose of functions and complex code blocks.
Use consistent formatting and naming conventions.
Modularization:

Split large functions into smaller, reusable functions.
Group related functions together.
Optimization:

Use caching where possible to reduce redundant operations.
Optimize data processing steps for better performance.
User Experience Improvements:
User Feedback:

Provide immediate feedback on actions (e.g., loading spinners, success, and error messages).
Use placeholders and help text to guide users on what inputs are expected.
Interactive Elements:

Use more interactive elements like sliders, date pickers, and multi-selects to enhance the user interface.
2025-01-18 08:35:09 +05:30
ي
74c862faec Update textstaty.py
User-Friendly Interface: The Streamlit interface is intuitive, allowing users to easily input text and get readability scores.
Comprehensive Analysis: The tool covers a wide range of readability metrics, providing detailed insights into the text's readability.
Actionable Tips: Each readability score is accompanied by actionable tips, helping users improve their content based on the analysis.
Additional Insights: The inclusion of additional metrics like reading time, syllable count, and word count provides a thorough analysis of the text.
Suggested Improvements:
Error Handling:

Add error handling for cases where the text input might be empty or too short for certain readability metrics.
Code Modularization:

Refactor the code to encapsulate readability calculations and markdown generation into separate functions. This will make the code more modular and easier to maintain.
Performance Optimization:

Optimize the readability calculation by avoiding redundant calculations if the text hasn't changed.
Code Readability:

Add docstrings and comments to explain the purpose of functions and complex code blocks.
Ensure consistent formatting and adherence to PEP8 standards.
2025-01-18 08:34:32 +05:30
ي
ea5554a723 Update seo_structured_data.py
Code Improvements:
Error Handling:

Improve error messages to be more informative.
Log errors for debugging purposes.
Code Readability:

Add docstrings and comments to explain the purpose of functions and complex code blocks.
Modularization:

Split large functions into smaller, reusable functions.
Group related functions together.
Input Validation:

Ensure user inputs are validated to prevent errors later in the code.
User Experience Improvements:
User Feedback:

Provide immediate feedback on actions (e.g., loading spinners, success, and error messages).
Use placeholders and help text to guide users on what inputs are expected.
Interactive Elements:

Use more interactive elements like sliders, date pickers, and multi-selects to enhance the user interface.
2025-01-17 13:06:57 +05:30
ي
451164f5b2 Update optimize_images_for_upload.py
Add Type Hints: Improve code readability and help with static type checking by adding type hints.

Modular Logging Setup: Move the logging setup to a separate function for better modularity.

Error Handling Improvements: Add more specific error handling and informative logging.

Environment Variable Loading: Ensure environment variables are loaded only once and not inside functions.

Check for API Key Validity: Add a check to validate the Tinyfy API key before using it.

Optimize Image Saving: Use a temporary file to avoid saving directly to the filesystem.
2025-01-17 13:02:51 +05:30
ي
088975a70f Update opengraph_generator.py
Here are some improvements to the og_tag_generator function in opengraph_generator.py:

Refactor Code for Better Readability:

Group related operations together and add comments for better understanding.
Use helper functions to break down larger pieces of functionality into smaller, reusable code blocks.
Error Handling:

Improve error handling by providing more specific error messages and handling different types of errors separately.
User Experience Enhancements:

Add informative messages and examples to guide the user through the input process.
Provide feedback on the success or failure of the Open Graph tag generation.
2025-01-17 12:40:56 +05:30
ي
c9b22b3653 Update on_page_seo_analyzer.py
Additional Insights for Non-Technical Users
Content Quality Insights:

Readability Score: Use libraries like textstat to calculate a readability score (e.g., Flesch Reading Ease) for the webpage content.
Keywords Highlighting: Extract and highlight frequently used keywords in the content, helping users understand what topics are emphasized.
Duplicate Content Check: Flag if the meta description or titles are repeated multiple times in the page content.
SEO Health Checks:

Broken Links Detection: Identify broken internal or external links and recommend fixing them.
Image Optimization Tips:
Suggest reducing image sizes if the file sizes exceed a certain threshold.
Recommend modern formats like WebP for better performance.
Alt Text Suggestions: Provide actionable suggestions for missing or insufficient alt text, such as "Describe the image's purpose or key elements."
Social Media Enhancement:

Suggest best practices for Open Graph and Twitter tags, such as recommended tag content length or formats.
Generate suggested meta descriptions and Open Graph descriptions for improved click-through rates.
Accessibility Recommendations:

Heading Structure Audit: Check for skipped heading levels (e.g., h2 follows h4) and provide guidance on correcting them.
Contrast Ratio Check: Flag potential text-to-background contrast issues for visually impaired users (can use APIs like Lighthouse).
ARIA Tags: Check for the presence of ARIA (Accessible Rich Internet Applications) tags and recommend their addition if missing.
Performance Insights:

Lazy Loading Suggestions: Highlight images without loading="lazy" and recommend lazy loading to improve page load speed.
Critical CSS Suggestions: Advise inlining critical CSS for faster initial render.
Script Optimization: Highlight unminified or unused JavaScript and recommend optimization.
Custom Recommendations:

Call to Action (CTA) Suggestions: Analyze the text for actionable elements like buttons or links and recommend improving CTAs.
Internal Linking Suggestions: Suggest adding internal links for keywords or headings that lack links.
Schema Markup Expansion: Recommend additional schema types (e.g., FAQ, Product, Review) based on the page content.
Mobile Friendliness Enhancements:

Check for touch targets (buttons and links) being too small or too close together.
Flag pages without mobile-friendly navigation menus.
Enhancements to User Experience
Highlight Strengths and Weaknesses: Use color-coded sections to differentiate between well-optimized and underperforming areas.

Simplified Metrics:

Break down complex scores (like PageSpeed or SEO scores) into "Good," "Needs Improvement," and "Poor" categories.
Provide plain-English explanations for non-technical users.
Recommendations Section:

Provide step-by-step instructions or examples for fixing identified issues, such as "How to add a canonical tag" or "How to structure hreflang attributes."
Actionable Insights Dashboard: Summarize all findings in a visually appealing dashboard with prioritized to-do lists.

Export Reports: Allow users to export the analysis and recommendations in a PDF or CSV format for easier sharing and tracking.
2025-01-17 12:23:10 +05:30
ي
6bfc851a1c Update on_page_seo_analyzer.py
PEP 8 Compliance:

Ensure proper spacing around operators and after commas.
Group import statements by standard library, third-party, and local imports.
Error Handling:

Improve error messages to be more descriptive and helpful.
Code Structure:

Ensure consistent indentation and formatting.
Remove any unused imports or commented-out code.
Docstrings:

Ensure all functions have detailed docstrings explaining their purpose, arguments, and return values.
Optimization:

Reduce repeated calls to fetch_and_parse_html by reusing the soup object.
2025-01-17 11:16:51 +05:30
ي
b369e5f504 Update meta_desc_generator.py
Improve the readability, structure, and functionality of the code.
2025-01-17 10:49:36 +05:30
ي
d3eb02ef8e Update image_alt_text_generator.py
This version includes clear comments, detailed docstrings, and adheres to PEP 8 standards. It also uses environment variables for sensitive information and provides helpful tooltips for user inputs.
2025-01-17 10:44:34 +05:30
ي
9875cb8602 Optimize google_pagespeed_insights.py 2025-01-17 10:38:28 +05:30
ي
5722d852a4 Update content_title_generator.py 2025-01-17 10:04:40 +05:30
ي
99f1d15921 Update README.md 2025-01-17 09:56:26 +05:30
ي
6580653d80 Rename README.md to README 2025-01-17 09:49:44 +05:30
ي
cf03ff5f8c Documentation for AI Backlinging Tool 2025-01-17 09:48:55 +05:30
ي
d92873ffdb Update ai_backlinking.py
This script now includes the suggested improvements for:

Generating search queries
Finding backlink opportunities
Composing personalized emails
Sending emails using SMTP
Logging sent emails
Checking email responses
Sending follow-up emails
Handling multiple keywords
Main workflow integration
2025-01-17 09:44:03 +05:30
ي
a339aa6b29 Update README.md - Easier to get started with AI agents 2025-01-17 09:34:24 +05:30
ي
1d0a3db873 Add docstrings and comments to improve readability and maintainability of sidebar_configuration function
Add docstrings and comments to improve readability and maintainability of sidebar_configuration function
2025-01-15 20:45:06 +05:30
ي
a6cbfafa16 Debacles with AI coding, Reverting breaking changes. 2025-01-15 20:03:00 +05:30
ي
44e5f7dc1f Refactor, modular code & comments for maintainability
Modularize Code: Break down the large sidebar_configuration function into smaller, more manageable functions to improve readability and maintainability.

Error Handling: Implement error handling for critical operations, such as reading and writing files, to ensure the application handles exceptions gracefully.

Docstrings and Comments: Add docstrings and comments to functions and critical code sections to improve code documentation and readability.

Code Consistency: Ensure consistent use of naming conventions and code formatting.
2025-01-15 19:40:25 +05:30
ي
4374749fbc Refactored, reduced duplicate code, WIP 2025-01-15 18:16:24 +05:30
ي
b06a8e1234 Update README.md 2025-01-15 17:27:10 +05:30
ي
19f8d43729 Update README.md 2025-01-15 16:01:41 +05:30
ي
b41320ef10 Fix code scanning alert no. 12: Full server-side request forgery
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2025-01-14 18:07:55 +05:30
ي
b10e1af1b5 Fix code scanning alert no. 4: Uncontrolled data used in path expression
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2025-01-13 16:36:19 +05:30
ي
ed493a1951 Update README.md 2025-01-09 10:58:40 +05:30
ajaysi
77b9cea226 README changes 2024-11-07 12:25:16 +05:30
ajaysi
1471a3ec9b Merge branch 'main' of https://github.com/AJaySi/AI-Writer 2024-10-12 08:00:44 +05:30
ajaysi
e6f60feba5 YT to blog, bug fixes - WIP 2024-10-12 07:59:13 +05:30
Mason Dierkes
c7f3d714a8 Update requirements.txt 2024-10-10 07:33:51 +05:30
ajaysi
16bcd86bb7 API keys setup improvements & housekeeping 2024-10-06 16:22:12 +05:30
ajaysi (aider)
adc7f157ea refactor: Consolidate API key checks into single function 2024-10-06 14:36:39 +05:30
ajaysi (aider)
7b219c8cea refactor: Combine API key checks into a single function 2024-10-06 14:36:02 +05:30
ajaysi
80e777d568 style: Remove trailing blank lines in api_key_manager.py 2024-10-06 14:35:57 +05:30
ajaysi
a710e33f6d chore: Remove unnecessary blank line in alwrity.py 2024-10-06 14:08:03 +05:30
ajaysi (aider)
58f73bed91 fix: Allow app to run without API keys, prompting for input instead of stopping. 2024-10-06 14:04:03 +05:30
ajaysi
19c4391fed chore: Remove duplicate call to load_dotenv() in alwrity.py 2024-10-06 14:04:00 +05:30
ajaysi (aider)
2ec13af0fc fix: Collect all missing API keys using a single form submission. 2024-10-06 13:56:36 +05:30
ajaysi (aider)
53dfaaa5da feat: Collect all missing API keys using a Streamlit form. 2024-10-06 13:56:06 +05:30
ajaysi
54aceb28a8 feat: Add API key management and LLM provider selection 2024-10-06 13:56:02 +05:30
ajaysi
d2fd39ced3 fix: Remove reference to undefined function 'blog_from_audio' 2024-10-06 12:33:26 +05:30
ajaysi (aider)
39d1c45cf9 fix: Remove reference to undefined function 'blog_from_audio' 2024-10-06 12:33:12 +05:30
ajaysi (aider)
b272f72395 fix: Move widget commands outside cached function to avoid unexpected behavior 2024-10-06 12:32:03 +05:30
ajaysi (aider)
316b2c5aac feat: implement API key management with .env file and user prompts 2024-10-06 12:30:47 +05:30
ajaysi (aider)
4cee9f0293 chore: Load environment variables on startup and set them after saving 2024-10-06 12:26:31 +05:30
ajaysi (aider)
8deba8ec4e fix: Ensure API keys are loaded and set correctly from .env file. 2024-10-06 12:24:44 +05:30
ajaysi (aider)
f4f032f32e fix: ensure API keys are set as environment variables immediately after saving to .env 2024-10-06 12:23:08 +05:30
ajaysi
9409e57d2f Performance optimizations, Improved SEO tools & error handling 2024-09-26 08:16:22 +05:30
ajaysi
516144a728 revert to 93075dc 2024-09-18 14:35:51 +05:30
ajaysi (aider)
a377032e02 fix: Resolve NameError by importing find_backlink_opportunities 2024-09-17 23:37:10 +05:30
ajaysi (aider)
00b9330c96 fix: Resolve import errors and install missing packages for AI backlinking tool. 2024-09-17 23:35:50 +05:30
ajaysi
97b60a66e6 feat: add backlinking UI streamlit app 2024-09-17 23:35:47 +05:30
ajaysi (aider)
ada8b37251 feat: implement AI LLM for composing personalized outreach emails based on insights and website data 2024-09-17 12:12:08 +05:30
ajaysi (aider)
3ce8e8d70a feat: implement personalized email composition for outreach based on website data and user proposals 2024-09-17 12:10:35 +05:30
ajaysi (aider)
bbee9e472f feat: integrate LLM functions to generate insights for scraped website content 2024-09-17 12:05:53 +05:30
ajaysi (aider)
17eaa26ec8 feat: enhance backlink opportunity data with website metadata and context 2024-09-17 12:01:52 +05:30
ajaysi (aider)
3e9d641ac5 feat: implement contact information extraction using Firecrawl's LLM Extract feature 2024-09-17 11:58:01 +05:30
ajaysi
8930f3d2b2 feat: add scrape_url import to ai_backlinking.py for enhanced functionality 2024-09-17 11:57:59 +05:30
ajaysi (aider)
cfb5145947 feat: integrate firecrawl_web_crawler for URL search functionality in search_for_urls function 2024-09-17 11:50:23 +05:30
ajaysi (aider)
602243de0e feat: implement search query generation and initial scraping structure for backlink opportunities 2024-09-17 11:48:28 +05:30
ajaysi
09572cb130 feat: add AI backlinking tool module 2024-09-17 11:48:26 +05:30
ajaysi (aider)
99acb9b4f1 fix: import missing modules to resolve undefined name errors in firecrawl_web_crawler.py 2024-09-17 11:27:44 +05:30
ajaysi (aider)
1078c969ca refactor: move initialize_client function to a separate file for better modularity 2024-09-17 11:27:35 +05:30
ajaysi (aider)
0787f7e807 feat: implement Google Trends analysis with user input and intelligent defaults in content_generators 2024-09-14 22:12:47 +05:30
ajaysi (aider)
3f61a7715c fix: import openai module to resolve undefined name error in alwrity_utils.py 2024-09-14 21:54:18 +05:30
ajaysi (aider)
fd604c4708 refactor: Optimize imports and improve code readability in alwrity_utils.py 2024-09-14 21:39:49 +05:30
ajaysi (aider)
d4c20d0798 fix: optimize sidebar configuration by reducing unnecessary file writes and improving widget creation efficiency 2024-09-14 21:30:18 +05:30
ajaysi (aider)
891455149e fix: import missing functions and modules to resolve undefined names in alwrity.py 2024-09-14 21:27:32 +05:30
ajaysi (aider)
187c62468f fix: remove duplicate import of streamlit to improve loading time 2024-09-14 21:26:57 +05:30
ajaysi (aider)
44c2e0966b feat: implement web research functionality in content planning tools of alwrity.py 2024-09-14 20:00:39 +05:30
ajaysi (aider)
d8616b1645 fix: remove invalid imports to resolve Streamlit app exceptions 2024-09-14 19:59:56 +05:30
ajaysi (aider)
2a28af1d02 feat: streamline web research process and integrate Google Trends results into Streamlit UI 2024-09-14 19:55:16 +05:30
ajaysi (aider)
12d66e8ff2 feat: optimize metric display by including search term and region in output messages 2024-09-14 19:40:58 +05:30
ajaysi (aider)
6900c028f5 feat: fetch and display all metrics from pytrends on CLI 2024-09-14 19:29:41 +05:30
ajaysi (aider)
e7acd194e4 feat: display all metrics from pytrends on CLI with explanations of their significance 2024-09-14 19:22:42 +05:30
ajaysi (aider)
0ec58fc7c6 fix: handle empty data in get_related_topics_and_save_csv to prevent list index out of range error 2024-09-14 19:21:25 +05:30
ajaysi (aider)
5be849cfcb fix: import WordCloud to resolve undefined name error in generate_wordcloud function 2024-09-14 19:17:35 +05:30
ajaysi (aider)
cef400039e feat: implement missing features for multirange interest, historical hourly interest, trending searches, realtime trends, top charts, and suggestions in google_trends_researcher.py 2024-09-14 19:17:27 +05:30
ajaysi (aider)
93075dc833 fix: remove caching from record_voice function to resolve CachedWidgetWarning 2024-09-14 18:01:25 +05:30
ajaysi (aider)
5938dcf6e4 fix: remove undefined reference to blog_from_audio in content_generators.py 2024-09-14 18:00:01 +05:30
ajaysi (aider)
64f269ebb7 fix: remove invalid imports to resolve ImportError in alwrity.py and content_generators.py 2024-09-14 17:59:46 +05:30
ajaysi (aider)
3f98289214 feat: Enhance Streamlit UI with improved messaging and tooltips for better user experience 2024-09-14 17:57:33 +05:30
ajaysi
d8cef6eaa9 fix: update blog output format from HTML to markdown in main_config.json 2024-09-14 17:54:03 +05:30
ajaysi (aider)
7c62957652 fix: resolve unmatched parenthesis error in content_generators.py 2024-09-14 17:50:51 +05:30
ajaysi (aider)
eaed787fe3 fix: resolve undefined name errors and fix indentation in alwrity.py and content_generators.py 2024-09-14 17:50:16 +05:30
ajaysi (aider)
668e7f7f36 refactor: Fix undefined names by adding missing imports in alwrity.py, alwrity_utils.py, content_generators.py, and ui_setup.py 2024-09-14 17:49:55 +05:30
ajaysi (aider)
27072adbe5 refactor: modularize code by creating separate utility modules for environment, configuration, UI setup, API key management, content generation, SEO tools, file processing, and voice processing 2024-09-14 17:49:34 +05:30
ajaysi (aider)
6fe6c52d99 feat: improve performance by implementing lazy loading, caching, and efficient file handling in alwrity.py and alwrity_utils.py 2024-09-14 17:36:46 +05:30
ajaysi (aider)
dded2180f3 fix: Improve import organization, add error handling for file operations, and enhance code robustness in alwrity.py 2024-09-14 17:30:27 +05:30
ajaysi (aider)
719b863d95 fix: add placeholder functions for undefined blog_from_audio and process_folder_for_rag 2024-09-14 17:26:10 +05:30
ajaysi (aider)
85a3172387 fix: add placeholder imports for undefined functions blog_from_audio and process_folder_for_rag 2024-09-14 17:25:57 +05:30
ajaysi (aider)
16d1751c83 refactor: modularize main function and improve error handling in alwrity.py 2024-09-14 17:24:51 +05:30
ajaysi
c9b4508a42 Fixing Errors - WIP - Making improvements, content workflows 2024-09-13 19:44:24 +05:30
ajaysi
0ce8f8d433 Merge branch 'main' of https://github.com/AJaySi/AI-Writer 2024-09-13 19:42:24 +05:30
ajaysi
52753901f1 Fixing Errors - WIP - Making improvements, content workflows 2024-09-13 19:41:48 +05:30
Cristover Wurangian
2cb162b40a upd requirements.txt 2024-09-12 23:01:59 +05:30
Cristover Wurangian
408c42ef18 Add Docker support and remove advetools from requirements.txt
* Removed advetools from requirements.txt due to compatibility issues (see https://github.com/AJaySi/AI-Writer/issues/124)
2024-09-12 23:01:59 +05:30
ajaysi
ca8618a6a4 AI SEO tools - Readibility & Analysis, Agents content ideator 2024-09-04 21:51:52 +05:30
ajaysi
14879b9c97 New: AI SEO tools- cloudscraper, advertool sdk 2024-08-25 19:21:59 +05:30
ajaysi
4d71cc1f3b New: AI SEO tools- On Page SEO Analyzer tool 2024-08-25 17:14:26 +05:30
ajaysi
c66a11b8a7 New: AI SEO tools- Get Google PageSpeed Insights 2024-08-23 19:59:56 +05:30
ajaysi
c4af40f93d Use system instructions to steer the behavior of a model 2024-08-18 17:37:30 +05:30
ajaysi
b97ad5eb2b Use system instructions to steer the behavior of a model 2024-08-18 17:13:00 +05:30
ajaysi
f35649f129 Backlinking tool & Img optimization, PIL & Tinify API 2024-08-15 16:39:32 +05:30
ajaysi
d005cef45a Backlinking tool & Img optimization, PIL & Tinify API 2024-08-15 14:27:17 +05:30
ajaysi
fe59ec07cc Backlinking tool & Img optimization, PIL & Tinify API 2024-08-15 09:47:12 +05:30
ajaysi
43bba7e73e New: AI SEO tools- OpenGraph Tags generation, blog from pdf 2024-08-14 16:28:26 +05:30
ajaysi
ac43dee24f New: AI SEO tools- OpenGraph Tags generation, blog from pdf 2024-08-14 14:21:57 +05:30
ajaysi
20bda6c964 New: AI SEO tools, OpenGraph Tags generator 2024-08-09 16:17:09 +05:30
ajaysi
4d887c87eb AI SEO alt text generator 2024-08-05 10:17:30 +05:30
ajaysi
bd79fa5974 Long doc/pdf, chunking for insights 2024-08-01 22:56:13 +05:30
ajaysi
f204219edb New: AI SEO tools, Rich snippet, AI product description 2024-07-21 16:09:59 +05:30
ajaysi
8ab8d22fb1 Features: AI Rich snippet from url, AI product description writer 2024-07-17 17:00:22 +05:30
ajaysi
44d83e2b81 Features: AI Rich snippet from url, AI product description writer 2024-07-17 12:00:27 +05:30
ajaysi
c923435be2 Feature: AI SEO - Generate rich snippet from url 2024-07-12 19:12:11 +05:30
ajaysi
e06c4ffae3 Feature: AI SEO - Generate rich snippet from url 2024-07-12 19:05:20 +05:30
ajaysi
7abc396633 Blog from given image 2024-07-02 16:48:36 +05:30
ajaysi
94b938d31e Improved longform, Image, prompts 2024-07-01 19:21:28 +05:30
ajaysi
97ece766c9 Readme files for library modules 2024-07-01 12:31:24 +05:30
ajaysi
60f0bbaa07 Anthropic AI models support 2024-06-29 10:04:19 +05:30
ajaysi
591c373aef Merge branch 'main' of https://github.com/AJaySi/AI-Writer 2024-06-27 13:22:07 +05:30
ajaysi
8a59a9d7f0 AI story writer, UI & prompts improvements 2024-06-27 13:21:03 +05:30
Leonid Shamis
a99ba9da77 Update README.md
Change the order of commands when updating to the latest code for existing users and change command numbering for uniformity.
2024-06-25 16:40:56 +05:30
ajaysi
89b2e47b9c WIP - UI, Audio, firecrawl, long-form - V0.5 2024-06-21 22:20:43 +05:30
ajaysi
074ddf6210 WIP - UI, Audio, firecrawl, long-form - V0.5 2024-06-20 22:48:52 +05:30
ajaysi
899abad1ba WIP - UI, firecrawl, long-form - V0.5 2024-06-18 09:22:36 +05:30
ajaysi
3563ac29cb WIP - UI, firecrawl, long-form - V0.5 2024-06-17 16:29:38 +05:30
ajaysi
dc8893113a WIP - Streamlit UI, firecrawl - V0.5 2024-06-14 11:57:22 +05:30
ajaysi
128b6f3878 WIP - Streamlit UI, firecrawl - V0.5 2024-06-13 17:43:57 +05:30
ajaysi
8dfcb1f536 WIP - Streamlit UI, firecrawl - V0.5 2024-06-12 22:46:41 +05:30
ajaysi
bebca6612d WIP - Streamlit UI, firecrawl - V0.5 2024-06-12 17:12:19 +05:30
ajaysi
f2aa79264e WIP - Streamlit UI, firecrawl - V0.5 2024-06-12 16:01:46 +05:30
ajaysi
ccbaa0e4fa WIP - Streamlit UI, firecrawl - V0.5 2024-06-11 17:27:50 +05:30
ajaysi
f2fa8cfb47 WIP - Streamlit UI, Porting CLI 2024-06-09 09:19:24 +05:30
ajaysi
11aa649145 WIP - Streamlit UI, Porting CLI 2024-06-08 11:34:43 +05:30
ajaysi
adbefa79ef WIP - Streamlit UI, Porting CLI 2024-06-04 15:29:45 +05:30
ajaysi
433643b9db WIP - Streamlit UI, Porting CLI 2024-06-03 20:08:47 +05:30
ajaysi
850f3c80b7 WIP - Streamlit UI, Porting CLI 2024-06-03 19:55:07 +05:30
ajaysi
5f55cae175 WIP - Streamlit UI, Porting CLI 2024-06-03 16:14:57 +05:30
ajaysi
f372169daa WIP - Streamlit UI, Porting CLI 2024-06-03 16:10:47 +05:30
ajaysi
f9599bd346 WIP - Streamlit UI, Porting CLI 2024-06-03 12:59:30 +05:30
ajaysi
cfb6718716 WIP - Streamlit UI, Porting CLI 2024-06-03 12:14:50 +05:30
ajaysi
970a111f97 WIP - Streamlit UI, Porting CLI 2024-06-03 11:31:23 +05:30
ajaysi
0287a9f09e WIP - Streamlit UI, Porting CLI 2024-06-03 11:26:57 +05:30
ajaysi
2a1bb49020 WIP - Streamlit UI, Porting CLI 2024-06-02 23:05:27 +05:30
ajaysi
ae8c9d0ac3 WIP - Streamlit UI, Porting CLI 2024-06-01 14:07:21 +05:30
ajaysi
10b7326044 WIP - Streamlit UI, Porting CLI 2024-05-29 16:56:08 +05:30
ajaysi
bf83ff7a6b Streamlit UI, Porting CLI - WIP 2024-05-25 19:52:10 +05:30
ajaysi
458042afc4 Streamlit UI, Porting CLI - WIP 2024-05-25 18:51:53 +05:30
ajaysi
4153cf7d93 AI Agents content ideator team, prompt config 2024-05-20 18:22:08 +05:30
ajaysi
8f2bf02b65 AI Agents content ideator team, prompt config 2024-05-20 15:07:20 +05:30
ajaysi
b431bfcbd8 long-form, AI social, copywriter, prompt config 2024-05-19 14:03:16 +05:30
ajaysi
2499852452 UI, AI social writer, copywriter, WIP 2024-05-18 19:37:22 +05:30
ajaysi
4428e2dc60 UI, AI social writer, copywriter, WIP 2024-05-17 15:23:22 +05:30
ajaysi
8bd4436414 AI finance TA writer, long form writer, WIP 2024-05-16 22:46:12 +05:30
ajaysi
e3b51e2713 AI finance TA writer, yfinance, pandas_ta, WIP 2024-05-16 08:55:33 +05:30
ajaysi
45508d318b AI finance TA writer, yfinance, pandas_ta, WIP 2024-05-15 15:59:16 +05:30
ajaysi
26a35ee355 AI finance TA writer, yfinance, pandas_ta, WIP 2024-05-15 15:53:15 +05:30
ajaysi
417183a6d2 Improve blog, better prompts, WIP 2024-05-13 18:15:46 +05:30
ajaysi
6a95b96973 Agents team, better prompts, WIP 2024-05-12 19:13:02 +05:30
ajaysi
f9b9204349 AI agents team, research report 2024-05-08 10:06:08 +05:30
ajaysi
6f88f9ba34 Include Agent SERP task 2024-05-07 13:00:44 +05:30
ajaysi
f3f6d5e29c Agents team bug fixes 2024-05-07 10:51:47 +05:30
ajaysi
07247fe08e Create your Agentic content team 2024-05-04 18:17:33 +05:30
ajaysi
636c028036 ai news writer, web apps, prompts 2024-05-03 12:11:34 +05:30
ajaysi
ebed11e5ce ai news writer, web apps, prompts 2024-05-02 21:55:46 +05:30
ajaysi
bb2904039e Agentic content team, web researched 2024-04-29 23:48:36 +05:30
ajaysi
b68c6257c1 Agentic content team, web researched 2024-04-27 19:43:58 +05:30
ajaysi
82dd5f9159 Merge branch 'main' of https://github.com/AJaySi/AI-Writer 2024-04-27 19:41:23 +05:30
ajaysi
e852eb894e Agentic content team, web researched 2024-04-27 19:40:30 +05:30
Umesh Sharma
7b23a65cb0 Update .env 2024-04-27 19:22:22 +05:30
Umesh Sharma
62ea4be8a1 Requrements.txt is upadated (#55)
I have updated the requirements.txt 
issue #53 is solved.
2024-04-27 19:01:01 +05:30
Umesh Sharma
53e78d56fe requiremets.txt is updated 2024-04-27 18:55:15 +05:30
Umesh Sharma
8ac992f2df issue #53 is solved 2024-04-27 01:28:47 +05:30
Umesh Sharma
c517bf414e Issue #53 is solved 2024-04-27 01:27:12 +05:30
ajaysi
20c201f4f9 Agentic content creation, web researched 2024-04-26 16:07:17 +05:30
ajaysi
45d324a2a9 LICENSE 2024-04-25 07:45:10 +05:30
ajaysi
4aca57fbde AI news report writer, web researched 2024-04-25 07:42:03 +05:30
ajaysi
3a772b36e5 AI news report writer, web researched 2024-04-24 21:13:30 +05:30
ajaysi
1c7ba95b27 Long form content generation, web researched 2024-04-24 08:58:09 +05:30
ajaysi
48d4371fa5 Long form content generation, web researched 2024-04-23 19:40:07 +05:30
ajaysi
9c45762680 Create images for blogs - Stability AI 2024-04-22 20:04:09 +05:30
ajaysi
aec2d6b432 Create images for blogs - Stability AI 2024-04-22 19:02:55 +05:30
ajaysi
357cba36e4 Fixed issue with Gemini API 2024-04-22 10:09:07 +05:30
ajaysi
180f28a493 Fixed issue with Gemini API 2024-04-19 08:05:41 +05:30
ajaysi
a9a19f102d Try Competitor Analysis 2024-04-17 23:05:32 +05:30
ajaysi
d2bb42caff Try Essay, story, Audio to Blog 2024-04-17 20:02:51 +05:30
AjaySi
80c52facc5 Alwrity AI Essay writer 2024-04-14 17:54:28 +05:30
AjaySi
31995df984 Alwrity AI story writer 2024-04-14 09:06:42 +05:30
AjaySi
c17dd2b2c8 Alwrity AI story writer 2024-04-13 18:38:55 +05:30
AjaySi
f388c9ff66 Alwrity - Bug fixes 2024-04-12 19:58:37 +05:30
AjaySi
5d455c8c89 Alwrity - Bug fixes 2024-04-12 19:46:44 +05:30
AjaySi
b34e51c4f4 Alwrity - Bug fixes 2024-04-12 19:20:09 +05:30
AjaySi
44c50e9ecd Alwrity - Bug fixes 2024-04-12 18:56:20 +05:30
AjaySi
bdab2e9959 Encoding utf-8, bug fix 2024-04-12 18:46:42 +05:30
AjaySi
e3c3c03729 Alwrity - Bug fixes 2024-04-12 17:36:37 +05:30
AjaySi
cf6516eeee Keyword, Audio to Blog - WIP 2024-04-09 19:12:15 +05:30
AjaySi
c30adb3716 Keyword, Audio to Blog - WIP 2024-04-09 18:07:06 +05:30
AjaySi
d968e06a9d Keyword, Audio to Blog - WIP 2024-04-09 12:30:04 +05:30
AjaySi
8a2ef6ef26 Try Audio to Blog 2024-04-08 17:28:06 +05:30
AjaySi
54c51e5177 Try Audio to Blog 2024-04-08 17:15:49 +05:30
AjaySi
23b3c7f6e0 Alwrity - WIP - main_config 2024-04-07 20:47:49 +05:30
AjaySi
e33008659b WIP - Use Google Bard, Improving chatgpt3.5 2024-04-06 20:09:33 +05:30
AjaySi
aa004a05b8 WIP - Code refactoring 2024-04-04 18:19:49 +05:30
AjaySi
75e5d25981 WIP - Code refactoring 2024-04-03 22:46:17 +05:30
AjaySi
1833a85637 WIP - Code refactoring 2024-03-31 19:36:33 +05:30
AjaySi
bedd0ac422 WIP - Code refactoring 2024-03-28 23:22:58 +05:30
AjaySi
3920186fc7 main_config changes - WIP 2024-03-28 10:53:46 +05:30
AjaySi
b85783735f Merge branch 'main' of https://github.com/AJaySi/blog-gen 2024-03-27 22:19:59 +05:30
AjaySi
74b7bc3cbe main_config changes - WIP 2024-03-27 22:19:16 +05:30
AJSim
c65480f5d0 Create python-package.yml 2024-03-12 17:36:01 +05:30
2711 changed files with 764876 additions and 6711 deletions

75
.github/CODE_OF_CONDUCT.md vendored Normal file
View File

@@ -0,0 +1,75 @@
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our community include:
* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or advances of any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email address, without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at [INSERT CONTACT METHOD]. All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series of actions.
**Consequence**: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within the community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 2.0, available at https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.
Community Impact Guidelines were inspired by [Mozilla's code of conduct enforcement ladder](https://github.com/mozilla/diversity).
[homepage]: https://www.contributor-covenant.org
For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations.

182
.github/CONTRIBUTING.md vendored Normal file
View File

@@ -0,0 +1,182 @@
# Contributing to ALwrity
Thank you for your interest in contributing to ALwrity! 🚀 We welcome contributions from the community and appreciate your help in making this AI-powered digital marketing platform even better.
## 🤝 How to Contribute
### 1. **Report Issues**
- Use our [GitHub Issues](https://github.com/AJaySi/ALwrity/issues) to report bugs or request features
- Check existing issues before creating new ones
- Provide clear descriptions and steps to reproduce bugs
### 2. **Submit Pull Requests**
- Fork the repository
- Create a feature branch: `git checkout -b feature/amazing-feature`
- Make your changes and test thoroughly
- Submit a pull request with a clear description
### 3. **Code Contributions**
- Follow our coding standards (see below)
- Add tests for new functionality
- Update documentation as needed
- Ensure all tests pass before submitting
## 🛠️ Development Setup
### Prerequisites
- **Python 3.10+** (Backend: FastAPI, SQLAlchemy, AI integrations)
- **Node.js 18+** (Frontend: React, TypeScript, Material-UI)
- **Git** (Version control)
- **API Keys** (Gemini, OpenAI, Anthropic, etc.)
### Quick Start
```bash
# Clone the repository
git clone https://github.com/AJaySi/ALwrity.git
cd ALwrity
# Backend setup
cd backend
pip install -r requirements.txt
cp env_template.txt .env # Configure your API keys
python start_alwrity_backend.py
# Frontend setup (in a new terminal)
cd frontend
npm install
cp env_template.txt .env # Configure your environment
npm start
```
### Environment Configuration
1. **Backend**: Copy `backend/env_template.txt` to `backend/.env`
2. **Frontend**: Copy `frontend/env_template.txt` to `frontend/.env`
3. **API Keys**: Add your AI service API keys to the respective `.env` files
## 📝 Coding Standards
### Python (Backend)
- **Style**: Follow PEP 8 guidelines, use Black formatter
- **Type Hints**: Use type hints for all function parameters and return values
- **Documentation**: Add comprehensive docstrings using Google style
- **Error Handling**: Use proper exception handling with meaningful error messages
- **Logging**: Use structured logging with appropriate levels
- **API Design**: Follow RESTful principles, use FastAPI best practices
- **Database**: Use SQLAlchemy ORM, implement proper migrations
### TypeScript/React (Frontend)
- **TypeScript**: Strict mode enabled, no `any` types
- **Components**: Functional components with hooks, proper prop typing
- **State Management**: Use React hooks, consider context for global state
- **Styling**: Material-UI components, consistent theming
- **Error Boundaries**: Implement error boundaries for better UX
- **Performance**: Use React.memo, useMemo, useCallback where appropriate
- **Testing**: Jest + React Testing Library for unit tests
### ALwrity-Specific Guidelines
- **AI Integration**: Always handle API rate limits and errors gracefully
- **Content Generation**: Implement proper validation and sanitization
- **SEO Features**: Follow SEO best practices in generated content
- **User Experience**: Maintain consistent UI/UX across all features
- **Security**: Validate all inputs, implement proper authentication
## 🧪 Testing
### Backend Testing
```bash
cd backend
python -m pytest test/
```
### Frontend Testing
```bash
cd frontend
npm test
```
## 📋 Pull Request Guidelines
### Before Submitting
- [ ] Code follows project style guidelines
- [ ] Self-review completed
- [ ] Tests added/updated and passing
- [ ] Documentation updated
- [ ] No merge conflicts
### PR Description Template
```markdown
## Description
Brief description of changes
## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Documentation update
## Testing
- [ ] Backend tests pass
- [ ] Frontend tests pass
- [ ] Manual testing completed
## Screenshots (if applicable)
Add screenshots to help explain your changes
```
## 🏷️ Issue Labels
We use the following labels to categorize issues:
- `bug`: Something isn't working
- `enhancement`: New feature or request
- `documentation`: Improvements or additions to documentation
- `good first issue`: Good for newcomers
- `help wanted`: Extra attention is needed
- `priority: high`: High priority issues
- `priority: low`: Low priority issues
## 💬 Community Guidelines
- Be respectful and inclusive
- Help others learn and grow
- Provide constructive feedback
- Follow the [Code of Conduct](CODE_OF_CONDUCT.md)
## 🎯 Areas for Contribution
### High Priority
- **Bug Fixes**: Critical issues affecting core functionality
- **Performance**: API response times, database optimization
- **Documentation**: API docs, user guides, setup instructions
- **Test Coverage**: Unit tests, integration tests, E2E tests
- **Security**: Vulnerability fixes, security improvements
### Feature Areas
- **AI Content Generation**: Blog posts, social media content, SEO optimization
- **SEO Dashboard**: Google Search Console integration, analytics
- **Social Media**: LinkedIn, Facebook, Instagram content creation
- **Content Planning**: Calendar management, content strategy
- **User Experience**: Onboarding flow, dashboard improvements
- **Analytics**: Usage tracking, performance metrics
- **Integrations**: Third-party API integrations, webhooks
### Good First Issues
Look for issues labeled with `good first issue` - these are perfect for newcomers:
- Documentation improvements
- UI/UX enhancements
- Test additions
- Bug fixes with clear reproduction steps
- Feature requests with detailed specifications
## 📞 Getting Help
- Join our [Discussions](https://github.com/AJaySi/ALwrity/discussions)
- Check existing [Issues](https://github.com/AJaySi/ALwrity/issues)
- Review [Documentation](https://github.com/AJaySi/ALwrity/wiki)
## 🙏 Recognition
Contributors will be recognized in our README and release notes. Thank you for helping make ALwrity better for everyone!
---
**Happy Contributing!** 🎉

286
.github/INSTALLATION.md vendored Normal file
View File

@@ -0,0 +1,286 @@
# ALwrity Quick Start Guide
Complete setup guide for running ALwrity locally after cloning from GitHub.
## 🎯 **Prerequisites**
Before you begin, ensure you have:
- **Node.js** 16+ and npm installed ([Download](https://nodejs.org/))
- **Python** 3.8+ installed ([Download](https://www.python.org/downloads/))
- **Git** installed ([Download](https://git-scm.com/downloads))
- **Clerk Account** ([Sign up](https://clerk.com/))
- **API Keys** (Gemini, CopilotKit, etc.)
## 🚀 **Quick Setup (Automated)**
### **Option A: Windows**
```powershell
# 1. Clone the repository
git clone https://github.com/AJaySi/ALwrity.git
cd ALwrity
# 2. Run automated setup
.\setup_alwrity.bat
```
### **Option B: macOS/Linux**
```bash
# 1. Clone the repository
git clone https://github.com/AJaySi/ALwrity.git
cd ALwrity
# 2. Make script executable and run
chmod +x setup_alwrity.sh
./setup_alwrity.sh
```
## 📝 **Manual Setup (Step-by-Step)**
### **Step 1: Clone Repository**
```bash
git clone https://github.com/AJaySi/ALwrity.git
cd ALwrity
```
### **Step 2: Backend Setup**
```bash
# Navigate to backend
cd backend
# Create virtual environment
python -m venv .venv
# Activate virtual environment
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Create .env file
cp env_template.txt .env
# Edit .env and add your API keys:
# - CLERK_SECRET_KEY
# - CLERK_PUBLISHABLE_KEY
# - GEMINI_API_KEY (optional, can be provided in UI)
# Initialize database
python scripts/create_subscription_tables.py
python scripts/cleanup_alpha_plans.py
# Return to root
cd ..
```
### **Step 3: Frontend Setup**
```bash
# Navigate to frontend
cd frontend
# Clean install (important!)
rm -rf node_modules package-lock.json # macOS/Linux
# OR for Windows PowerShell:
# Remove-Item -Recurse -Force node_modules, package-lock.json -ErrorAction SilentlyContinue
# Install dependencies (THIS IS CRITICAL - DO NOT SKIP!)
npm install
# Create .env file
cp env_template.txt .env
# Edit .env and add:
# REACT_APP_CLERK_PUBLISHABLE_KEY=<your-clerk-publishable-key>
# REACT_APP_API_BASE_URL=http://localhost:8000
# Build the project (validates everything compiles)
npm run build
# Return to root
cd ..
```
### **Step 4: Start the Application**
**Terminal 1 - Backend:**
```bash
cd backend
python app.py
```
**Terminal 2 - Frontend:**
```bash
cd frontend
npm start
```
### **Step 5: Access the Application**
- **Frontend UI**: http://localhost:3000
- **Backend API Docs**: http://localhost:8000/api/docs
- **Health Check**: http://localhost:8000/health
## 🐛 **Troubleshooting Common Issues**
### **Issue 1: "CopilotSidebar is not exported" Error**
**Cause**: Did not run `npm install` in frontend directory
**Fix:**
```bash
cd frontend
rm -rf node_modules package-lock.json
npm install
npm run build
npm start
```
### **Issue 2: "Module not found" (Python)**
**Cause**: Did not install Python dependencies or activate virtual environment
**Fix:**
```bash
cd backend
source .venv/bin/activate # or .venv\Scripts\activate on Windows
pip install -r requirements.txt
```
### **Issue 3: "CORS Error" in Browser**
**Cause**: Backend not running or frontend connecting to wrong URL
**Fix:**
1. Ensure backend is running on `http://localhost:8000`
2. Check `frontend/.env` has `REACT_APP_API_BASE_URL=http://localhost:8000`
3. Restart both frontend and backend
### **Issue 4: "Clerk Publishable Key Missing"**
**Cause**: Frontend `.env` file not configured
**Fix:**
```bash
cd frontend
# Edit .env file and add:
# REACT_APP_CLERK_PUBLISHABLE_KEY=pk_test_xxx...
```
### **Issue 5: "Database Error" or "Subscription Plans Not Found"**
**Cause**: Database tables not created
**Fix:**
```bash
cd backend
python scripts/create_subscription_tables.py
python scripts/cleanup_alpha_plans.py
```
### **Issue 6: "Port Already in Use"**
**Backend (8000):**
```bash
# Find and kill process using port 8000
# Windows:
netstat -ano | findstr :8000
taskkill /PID <process_id> /F
# macOS/Linux:
lsof -ti:8000 | xargs kill -9
```
**Frontend (3000):**
```bash
# Find and kill process using port 3000
# Windows:
netstat -ano | findstr :3000
taskkill /PID <process_id> /F
# macOS/Linux:
lsof -ti:3000 | xargs kill -9
```
## ✅ **Verification Checklist**
After setup, verify:
- [ ] Backend health check returns 200 OK: `curl http://localhost:8000/health`
- [ ] Frontend loads without errors
- [ ] Can sign in with Clerk authentication
- [ ] Pricing page loads with 4 subscription tiers (Free, Basic, Pro, Enterprise)
- [ ] Can navigate to onboarding after selecting a plan
## 📚 **Environment Variables Required**
### **Backend (.env)**
```bash
# Required for authentication
CLERK_SECRET_KEY=sk_test_xxx...
CLERK_PUBLISHABLE_KEY=pk_test_xxx...
# Optional (can be provided via UI in Step 1 of onboarding)
GEMINI_API_KEY=AIzaSy...
EXA_API_KEY=xxx...
COPILOTKIT_API_KEY=xxx...
# Development settings
DISABLE_AUTH=false
DEPLOY_ENV=local
```
### **Frontend (.env)**
```bash
# Required
REACT_APP_CLERK_PUBLISHABLE_KEY=pk_test_xxx...
# Optional
REACT_APP_API_BASE_URL=http://localhost:8000
REACT_APP_COPILOTKIT_API_KEY=xxx...
```
## 🎯 **First-Time User Flow**
After setup:
1. **Start both servers** (backend + frontend)
2. **Navigate to** http://localhost:3000
3. **Sign in** with Clerk
4. **Select subscription plan** (Free or Basic for alpha testing)
5. **Complete onboarding** (6 steps):
- Step 1: API Keys
- Step 2: Website Analysis
- Step 3: Competitor Research
- Step 4: Persona Generation
- Step 5: Research Preferences
- Step 6: Final Review
6. **Access dashboard** with all features unlocked
## 🆘 **Getting Help**
If you encounter issues:
1. **Check logs**: Both terminal windows show detailed error messages
2. **GitHub Issues**: https://github.com/AJaySi/ALwrity/issues
3. **Documentation**: See `docs/` directory for detailed guides
4. **Common Issues**: See `docs/GITHUB_ISSUE_291_FIX.md` for CopilotSidebar error
## 📖 **Additional Documentation**
- **Onboarding System**: `docs/API_KEY_MANAGEMENT_ARCHITECTURE.md`
- **Subscription System**: `docs/Billing_Subscription/SUBSCRIPTION_IMPLEMENTATION_SUMMARY.md`
- **Deployment Guide**: `DEPLOY_ENV_REFERENCE.md`
- **API Key Management**: `docs/API_KEY_INJECTION_EXPLAINED.md`
---
**Need help? Open an issue on GitHub: https://github.com/AJaySi/ALwrity/issues**

65
.github/ISSUE_TEMPLATE/bug_report.md vendored Normal file
View File

@@ -0,0 +1,65 @@
---
name: Bug Report
about: Create a report to help us improve ALwrity
title: '[BUG] '
labels: ['bug', 'needs-triage']
assignees: ''
---
## 🐛 Bug Description
A clear and concise description of what the bug is.
## 🔄 Steps to Reproduce
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error
## ✅ Expected Behavior
A clear and concise description of what you expected to happen.
## ❌ Actual Behavior
A clear and concise description of what actually happened.
## 📸 Screenshots
If applicable, add screenshots to help explain your problem.
## 🖥️ Environment
**Desktop (please complete the following information):**
- OS: [e.g. Windows 10, macOS 12.0, Ubuntu 20.04]
- Browser: [e.g. Chrome 91, Firefox 89, Safari 14]
- ALwrity Version: [e.g. v1.2.3]
**Mobile (please complete the following information):**
- Device: [e.g. iPhone 12, Samsung Galaxy S21]
- OS: [e.g. iOS 14.6, Android 11]
- Browser: [e.g. Safari, Chrome Mobile]
## 📋 Additional Context
Add any other context about the problem here.
## 🔍 Error Logs
If applicable, paste any error logs or console output here:
```
Paste error logs here
```
## 🏷️ Component/Feature
Which component or feature is affected?
- [ ] Blog Writer
- [ ] SEO Dashboard
- [ ] Content Planning
- [ ] Facebook Writer
- [ ] LinkedIn Writer
- [ ] Onboarding
- [ ] Authentication
- [ ] API
- [ ] Other: _______________
## 🎯 Priority
- [ ] Critical (blocks core functionality)
- [ ] High (major impact on user experience)
- [ ] Medium (minor impact)
- [ ] Low (cosmetic issue)

11
.github/ISSUE_TEMPLATE/config.yml vendored Normal file
View File

@@ -0,0 +1,11 @@
blank_issues_enabled: false
contact_links:
- name: GitHub Community Support
url: https://github.com/AJaySi/ALwrity/discussions
about: Please ask and answer questions here.
- name: ALwrity Documentation
url: https://github.com/AJaySi/ALwrity/wiki
about: Check our documentation for setup guides and tutorials.
- name: Security Vulnerability
url: https://github.com/AJaySi/ALwrity/security/advisories/new
about: Report security vulnerabilities privately.

View File

@@ -0,0 +1,67 @@
---
name: Feature Request
about: Suggest an idea for ALwrity
title: '[FEATURE] '
labels: ['enhancement', 'needs-triage']
assignees: ''
---
## 🚀 Feature Description
A clear and concise description of the feature you'd like to see implemented.
## 💡 Motivation
Why is this feature important? What problem does it solve?
## 📝 Detailed Description
Provide a detailed description of how this feature should work.
## 🎯 Use Cases
Describe specific use cases for this feature:
1. Use case 1
2. Use case 2
3. Use case 3
## 🎨 Mockups/Designs
If applicable, add mockups, wireframes, or design concepts.
## 🔧 Technical Considerations
Any technical considerations or implementation notes:
- [ ] Requires backend changes
- [ ] Requires frontend changes
- [ ] Requires database changes
- [ ] Requires third-party integration
- [ ] Other: _______________
## 🏷️ Component/Feature Area
Which component or feature area does this relate to?
- [ ] Blog Writer
- [ ] SEO Dashboard
- [ ] Content Planning
- [ ] Facebook Writer
- [ ] LinkedIn Writer
- [ ] Onboarding
- [ ] Authentication
- [ ] API
- [ ] UI/UX
- [ ] Performance
- [ ] Other: _______________
## 🎯 Priority
- [ ] Critical (essential for core functionality)
- [ ] High (significant value add)
- [ ] Medium (nice to have)
- [ ] Low (future consideration)
## 🔄 Alternatives Considered
Describe any alternative solutions or features you've considered.
## 📚 Additional Context
Add any other context, research, or references about the feature request here.
## 🤝 Contribution
Are you willing to contribute to implementing this feature?
- [ ] Yes, I can help implement this
- [ ] Yes, I can help with testing
- [ ] Yes, I can help with documentation
- [ ] No, but I can provide feedback
- [ ] No, just suggesting the idea

56
.github/ISSUE_TEMPLATE/question.yml vendored Normal file
View File

@@ -0,0 +1,56 @@
name: Question
description: Ask a question about ALwrity
title: "[QUESTION] "
labels: ["question", "needs-triage"]
body:
- type: markdown
attributes:
value: |
Thanks for your question! Please provide as much detail as possible to help us help you.
- type: textarea
id: question
attributes:
label: What's your question?
description: Please describe your question in detail
placeholder: What would you like to know about ALwrity?
validations:
required: true
- type: textarea
id: context
attributes:
label: Additional context
description: Any additional context, screenshots, or information that might help
placeholder: Add any relevant context here...
- type: dropdown
id: component
attributes:
label: Which component/feature is this about?
description: Select the most relevant component
options:
- Blog Writer
- SEO Dashboard
- Content Planning
- Facebook Writer
- LinkedIn Writer
- Onboarding
- Authentication
- API
- Installation/Setup
- Other
validations:
required: true
- type: dropdown
id: priority
attributes:
label: Priority
description: How urgent is this question?
options:
- Low (general question)
- Medium (affecting workflow)
- High (blocking progress)
validations:
required: true

178
.github/README.md vendored Normal file
View File

@@ -0,0 +1,178 @@
<div align="center">
# 🚀 ALwrity — AI-Powered Digital Marketing Platform
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![FastAPI](https://img.shields.io/badge/FastAPI-0.115+-green.svg)](https://fastapi.tiangolo.com/)
[![React](https://img.shields.io/badge/React-18+-blue.svg)](https://react.dev/)
[![Stars](https://img.shields.io/github/stars/AJaySi/AI-Writer?style=social)](https://github.com/AJaySi/AI-Writer/stargazers)
**Core claim:
ALwrity is a contextual content OS: it understands your brand, website, competitors, and channels, then uses that understanding to drive every story, video, podcast, and campaign, with memory and analytics in one place.**
[🌐 Live Demo](https://www.alwrity.com) • [📚 Docs Site](https://ajaysi.github.io/ALwrity/) • [📖 Wiki](https://github.com/AJaySi/AI-Writer/wiki) • [💬 Discussions](https://github.com/AJaySi/AI-Writer/discussions) • [🐛 Issues](https://github.com/AJaySi/AI-Writer/issues)
</div>
<p align="center">
<a href="https://ajaysi.github.io/ALwrity/"><img src="https://raw.githubusercontent.com/AJaySi/AI-Writer/main/docs-site/docs/assests/hero-1.jpg" alt="ALwrity dashboard overview" width="30%"/></a>
<a href="https://ajaysi.github.io/ALwrity/features/blog-writer/overview/"><img src="https://raw.githubusercontent.com/AJaySi/AI-Writer/main/docs-site/docs/assests/hero-2.png" alt="Story Writer workflow" width="30%"/></a>
<a href="https://ajaysi.github.io/ALwrity/features/seo-dashboard/overview/"><img src="https://raw.githubusercontent.com/AJaySi/AI-Writer/main/docs-site/docs/assests/hero-3.png" alt="SEO dashboard insights" width="30%"/></a>
</p>
---
### What ALwrity is
- **Contextual content OS**: Ingests your website, competitors, and channels to build a reusable brand brain.
- **Multi-surface by design**: Blogs, stories, YouTube, podcasts, and video all read from the same understanding.
- **Agent-driven flows**: Orchestrated research, planning, writing, and optimization instead of one-off prompts.
- **Production-ready**: JWT/OAuth2 auth, usage tracking, limits, monitoring, and cost awareness built-in.
---
### Why ALwrity exists
ALwrity exists for people who care more about **context** than prompts.
Most tools either drown you in knobs or reset to a blank page every time.
We wanted a system that:
- Remembers what your brand stands for and who youre speaking to.
- Grounds content in real data (SEO, competitors, web) before it writes.
- Reuses that understanding across every surface instead of duplicating effort.
---
### Why it matters for creators & marketers
- **One brain, many surfaces**: The same insights power blog posts, stories, YouTube scripts, podcast outlines, and video scenes.
- **Less tool-juggling**: Guided flows replace “copy data between 5 SaaS tools and a spreadsheet”.
- **Safer, more factual content**: Grounding and citations reduce hallucinations and rewrites.
- **On-brand by default**: Personas and brand voice settings keep outputs consistent across channels.
- **Operational visibility**: Scheduler “tasks needing intervention”, alerts, and logs highlight issues before your audience does.
---
### Whats functional now
- **AI Blog Writer (Phases)**: Research → Outline → Content → SEO → Publish, with guarded navigation and local persistence (`frontend/src/hooks/usePhaseNavigation.ts`).
- **Story Writer**: Premise → Outline → Chapters → Export, with phase navigation (`frontend/src/hooks/useStoryWriterPhaseNavigation.ts`).
- **YouTube Creator Studio**: Plan → scenes → avatar → render workflow for YouTube videos (`frontend/src/components/YouTubeCreator`).
- **Podcast Maker / Test Persona**: Turn voice + avatar into short videos using the shared video pipeline.
- **Video Studio**: Multi-module video creation, editing, and transformation (`frontend/src/components/VideoStudio`).
- **SEO Dashboard**: Analysis, metadata, and Google Search Console insights (see docs under `docs-site/docs/features/seo-dashboard`).
- **LinkedIn (Factual, GoogleGrounded)**: Real Google grounding + citations + quality metrics for posts/articles/carousels/scripts (see `frontend/docs/linkedin_factual_google_grounded_url_content.md`).
- **Persona System**: Core personas and platform adaptations via APIs (`backend/api/persona.py`).
- **Facebook Persona Service**: Gemini structured JSON for Facebookspecific persona optimization (`backend/services/persona/facebook/facebook_persona_service.py`).
- **Personalization & Brand Voice**: Validation and configuration of writing style, tone, structure (`backend/services/component_logic/personalization_logic.py`).
See details in the Wiki: [Docs Home](https://github.com/AJaySi/AI-Writer/wiki)
---
### Quick Start
1) Clone & install
```bash
git clone https://github.com/AJaySi/AI-Writer.git
cd AI-Writer/backend && pip install -r requirements.txt
cd ../frontend && npm install
```
2) Run locally
```bash
# Backend
cd backend && python start_alwrity_backend.py
# Frontend
cd frontend && npm start
```
3) Open and create
- Frontend: http://localhost:3000
- API docs (local): http://localhost:8000/api/docs
- Complete onboarding → generate content → publish
---
### Integrations & Security
- **Integrations**: Google Search Console (SEO Dashboard), LinkedIn (factual/grounded content).
- **AI Models**: OpenAI, Google Gemini/Imagen, Hugging Face, Anthropic, Mistral.
- **Security**: JWT auth, OAuth2, rate limiting, monitoring/logging.
- **Reliability**: Grounding + retrieval and citation tracking for factual generation.
---
### Tech Stack
| Area | Technologies |
| --- | --- |
| Backend | FastAPI, Python 3.10+, SQLAlchemy |
| Frontend | React 18+, TypeScript, MaterialUI, CopilotKit |
| AI/Research | OpenAI, Gemini/Imagen, Hugging Face, Anthropic, Mistral; Exa, Tavily, Serper (auto provider selection: Gemini default, HF fallback) |
| Data | SQLite (PostgreSQLready) |
| Integrations | Google Search Console, LinkedIn |
| Ops | Loguru monitoring, rate limiting, JWT/OAuth2 |
---
### LLM Providers: Gemini & Hugging Face
- **Autoselection**: The backend autoselects the provider based on `GPT_PROVIDER` and available keys.
- Default: Gemini (if `GEMINI_API_KEY` present)
- Fallback: Hugging Face (if `HF_TOKEN` present)
- **Configure**:
- `GEMINI_API_KEY=...` (text + structured JSON; image via Imagen)
- `HF_TOKEN=...` (text via Inference API; image via supported HF models)
- Optional: `GPT_PROVIDER=gemini` or `GPT_PROVIDER=hf_response_api`
- **Text generation**:
- Gemini: optimized for structured outputs and fast general generation
- HF: broad model access via the Inference Providers
- **Image generation**:
- Gemini/Imagen and Hugging Face providers are supported with a unified interface
For module details, see `backend/services/llm_providers/README.md`.
---
### Documentation
- Docs Site (MkDocs): https://ajaysi.github.io/ALwrity/
- Blog Writer (phases and UI): `docs-site/docs/features/blog-writer/overview.md`
- SEO Dashboard overview: `docs-site/docs/features/seo-dashboard/overview.md`
- SEO Dashboard GSC integration: `docs-site/docs/features/seo-dashboard/gsc-integration.md`
- LinkedIn factual, Google-grounded content: `frontend/docs/linkedin_factual_google_grounded_url_content.md`
- Persona Development (docs-site): `docs-site/docs/features/content-strategy/personas.md`
For additional pages, browse the `docs-site/docs/` folder.
---
### Personas (Brief)
ALwrity generates a core writing persona from onboarding data, then adapts it per platform (e.g., Facebook, LinkedIn). Personas guide tone, structure, and content preferences across tools.
- Core Persona & API: `backend/api/persona.py`
- Facebook Persona Service (Gemini structured JSON): `backend/services/persona/facebook/facebook_persona_service.py`
- Personalization/Brand Voice logic: `backend/services/component_logic/personalization_logic.py`
- Docs (GitHub paths):
- Personas (docs-site): https://github.com/AJaySi/AI-Writer/blob/main/docs-site/docs/features/content-strategy/personas.md
- LinkedIn Grounded Content plan: https://github.com/AJaySi/AI-Writer/blob/main/frontend/docs/linkedin_factual_google_grounded_url_content.md
At a glance:
- Data → Persona: Onboarding + website analysis → core persona
- Platform adaptations: Platform-specific JSON with validations/optimizations
- Usage: Informs tone, content length, structure, and platform best practices
---
### Community
- **Docs & Wiki**: https://github.com/AJaySi/AI-Writer/wiki
- **Discussions**: https://github.com/AJaySi/AI-Writer/discussions
- **Issues**: https://github.com/AJaySi/AI-Writer/issues
- **Website**: https://www.alwrity.com
---
### License
MIT — see [LICENSE](../LICENSE).
<div align="center">
Made with ❤️ by the ALwrity team
</div>

113
.github/SECURITY.md vendored Normal file
View File

@@ -0,0 +1,113 @@
# Security Policy
## 🔒 Supported Versions
We release patches for security vulnerabilities in the following versions:
| Version | Supported |
| ------- | ------------------ |
| 1.0.x | :white_check_mark: |
| < 1.0 | :x: |
## 🚨 Reporting a Vulnerability
We take security seriously. If you discover a security vulnerability within ALwrity, please follow these steps:
### 1. **DO NOT** create a public GitHub issue
Security vulnerabilities should be reported privately to prevent exploitation.
### 2. **Email us directly**
Send an email to: [security@alwrity.com](mailto:security@alwrity.com)
**Include the following information:**
- Description of the vulnerability
- Steps to reproduce the issue
- Potential impact assessment
- Suggested fix (if any)
- Your contact information
### 3. **Response Timeline**
- **Initial Response**: Within 48 hours
- **Status Update**: Within 7 days
- **Resolution**: Within 30 days (depending on complexity)
### 4. **What to Expect**
- We will acknowledge receipt of your report
- We will investigate and validate the vulnerability
- We will provide regular updates on our progress
- We will coordinate the disclosure timeline with you
- We will credit you in our security advisories (unless you prefer to remain anonymous)
## 🛡️ Security Best Practices
### For Users
- Keep your ALwrity installation updated
- Use strong, unique passwords
- Enable two-factor authentication where available
- Regularly review your API keys and access permissions
- Report suspicious activity immediately
### For Developers
- Follow secure coding practices
- Validate all user inputs
- Use parameterized queries to prevent SQL injection
- Implement proper authentication and authorization
- Keep dependencies updated
- Use HTTPS in production
- Implement rate limiting
- Log security-relevant events
## 🔐 Security Features
ALwrity implements the following security measures:
- **Authentication**: Secure user authentication with JWT tokens and Clerk integration
- **Authorization**: Role-based access control and subscription-based access
- **Input Validation**: Comprehensive input sanitization for all user inputs
- **API Security**: Rate limiting, request validation, and API key management
- **Data Encryption**: Sensitive data encryption at rest and in transit
- **CORS Protection**: Proper cross-origin resource sharing configuration
- **Security Headers**: Implementation of security headers and CSP policies
- **Dependency Scanning**: Regular dependency vulnerability scanning
- **AI Service Security**: Secure API key management for AI services
- **Content Sanitization**: Proper sanitization of AI-generated content
- **Database Security**: SQL injection prevention with SQLAlchemy ORM
- **File Upload Security**: Secure file handling and validation
## 🚫 Out of Scope
The following are considered out of scope for our security program:
- Social engineering attacks
- Physical attacks
- Attacks requiring physical access to the server
- Attacks requiring access to the local network
- Denial of service attacks
- Spam or social engineering issues
- Issues in third-party applications or services
## 🏆 Hall of Fame
We maintain a security hall of fame to recognize researchers who help improve ALwrity's security:
- [Your name could be here!]
## 📞 Contact
For security-related questions or concerns:
- **Email**: [security@alwrity.com](mailto:security@alwrity.com)
- **GitHub**: Create a private security advisory
- **Response Time**: 24-48 hours
## 📜 Legal
By reporting a security vulnerability, you agree to:
- Allow us reasonable time to investigate and mitigate the issue
- Not publicly disclose the vulnerability until we have had a chance to address it
- Make a good faith effort to avoid privacy violations, destruction of data, and interruption or degradation of our services
## 🔄 Policy Updates
This security policy may be updated from time to time. We will notify users of any significant changes through our standard communication channels.
**Last Updated**: September 2024

140
.github/SUPPORT.md vendored Normal file
View File

@@ -0,0 +1,140 @@
# Support
## 🆘 Getting Help
We're here to help you get the most out of ALwrity! Here are the best ways to get support:
### 📚 Documentation
- **[Main Documentation](https://github.com/AJaySi/ALwrity/wiki)** - Comprehensive guides and tutorials
- **[API Documentation](https://github.com/AJaySi/ALwrity/wiki/API-Documentation)** - Complete API reference
- **[Setup Guide](https://github.com/AJaySi/ALwrity/wiki/Setup-Guide)** - Installation and configuration
- **[User Guide](https://github.com/AJaySi/ALwrity/wiki/User-Guide)** - How to use ALwrity features
- **[GSC Integration Guide](GSC_INTEGRATION_README.md)** - Google Search Console setup
- **[Alpha Subscription Guide](backend/ALPHA_SUBSCRIPTION_IMPLEMENTATION_PLAN.md)** - Subscription system
### 💬 Community Support
- **[GitHub Discussions](https://github.com/AJaySi/ALwrity/discussions)** - Ask questions and share ideas
- **[GitHub Issues](https://github.com/AJaySi/ALwrity/issues)** - Report bugs and request features
- **[Discord Community](https://discord.gg/alwrity)** - Real-time chat and support (coming soon)
### 🐛 Bug Reports
If you encounter a bug:
1. Check existing [issues](https://github.com/AJaySi/ALwrity/issues) first
2. Use our [bug report template](https://github.com/AJaySi/ALwrity/issues/new?template=bug_report.md)
3. Include detailed steps to reproduce the issue
4. Provide error logs and screenshots when possible
### ✨ Feature Requests
Have an idea for a new feature?
1. Check existing [feature requests](https://github.com/AJaySi/ALwrity/issues?q=is%3Aissue+is%3Aopen+label%3Aenhancement)
2. Use our [feature request template](https://github.com/AJaySi/ALwrity/issues/new?template=feature_request.md)
3. Provide detailed use cases and mockups if possible
## 🚀 Quick Start
### Installation
```bash
# Clone the repository
git clone https://github.com/AJaySi/ALwrity.git
cd ALwrity
# Backend setup
cd backend
pip install -r requirements.txt
python start_alwrity_backend.py
# Frontend setup (in a new terminal)
cd frontend
npm install
npm start
```
### Common Issues
#### Backend Won't Start
- Check Python version (3.10+ required)
- Verify all dependencies are installed: `pip install -r requirements.txt`
- Check if port 8000 is available
- Review error logs in the terminal
#### Frontend Build Errors
- Check Node.js version (18+ required)
- Clear node_modules and reinstall: `rm -rf node_modules && npm install`
- Check for TypeScript errors: `npm run type-check`
#### API Connection Issues
- Verify backend is running on http://localhost:8000
- Check CORS settings in backend configuration
- Ensure API keys are properly configured
## 🔧 Troubleshooting
### Performance Issues
- **System Resources**: Check CPU, RAM usage during content generation
- **Database**: Review query performance, check for slow queries
- **API Rate Limits**: Monitor AI service rate limits (Gemini, OpenAI, etc.)
- **Browser**: Clear cache, cookies, and local storage
- **Network**: Check internet connectivity and API endpoint accessibility
### Authentication Problems
- **API Keys**: Verify all AI service API keys are correct and active
- **Environment Variables**: Check `.env` files are properly configured
- **Token Expiration**: Refresh authentication tokens if expired
- **Browser Storage**: Clear browser storage and try again
- **CORS Issues**: Check backend CORS configuration
### Content Generation Issues
- **AI Service Keys**: Verify Gemini, OpenAI, Anthropic API keys
- **Rate Limits**: Check if you've exceeded API rate limits
- **Content Quality**: Review prompt engineering and content validation
- **Error Logs**: Check backend logs for detailed error messages
- **API Credits**: Ensure sufficient credits for AI services
### ALwrity-Specific Issues
- **Onboarding**: Check if all required steps are completed
- **SEO Analysis**: Verify Google Search Console integration
- **Subscription Limits**: Check if you've exceeded usage limits
- **Database**: Ensure database is properly initialized
- **File Permissions**: Check file permissions for uploads and cache
## 📞 Contact Information
### Primary Support
- **GitHub Issues**: [Create an issue](https://github.com/AJaySi/ALwrity/issues/new)
- **GitHub Discussions**: [Join the discussion](https://github.com/AJaySi/ALwrity/discussions)
- **Email**: [support@alwrity.com](mailto:support@alwrity.com)
### Development Team
- **Lead Developer**: [@AJaySi](https://github.com/AJaySi)
- **Contributors**: [@uniqueumesh](https://github.com/uniqueumesh), [@DikshaDisciplines](https://github.com/DikshaDisciplines)
## 🕒 Response Times
- **Critical Issues**: 24 hours
- **Bug Reports**: 2-3 business days
- **Feature Requests**: 1 week
- **General Questions**: 3-5 business days
## 📖 Additional Resources
### Learning Materials
- **[Video Tutorials](https://youtube.com/alwrity)** - Step-by-step video guides
- **[Blog Posts](https://blog.alwrity.com)** - Tips, tricks, and best practices
- **[Case Studies](https://github.com/AJaySi/ALwrity/wiki/Case-Studies)** - Real-world usage examples
### Community
- **[Contributing Guide](CONTRIBUTING.md)** - How to contribute to ALwrity
- **[Code of Conduct](CODE_OF_CONDUCT.md)** - Community guidelines
- **[Roadmap](https://github.com/AJaySi/ALwrity/wiki/Roadmap)** - Upcoming features and improvements
## 🎯 Pro Tips
1. **Join our community** - Get help faster and share your experiences
2. **Search before asking** - Many questions have already been answered
3. **Provide context** - Include relevant details when asking for help
4. **Be patient** - We're a small team working hard to help everyone
5. **Contribute back** - Help others by sharing your solutions
---
**We're here to help you succeed with ALwrity!** 🚀

171
.github/TROUBLESHOOTING.md vendored Normal file
View File

@@ -0,0 +1,171 @@
# Fix for GitHub Issue #291: CopilotSidebar Import Error
## 🐛 **Issue**
User encounters error: `'CopilotSidebar' is not exported from '@copilotkit/react-ui'`
## 🔍 **Root Cause**
The user **did not run `npm install`** after cloning/pulling the repository, causing missing or outdated CopilotKit dependencies.
## ✅ **Solution**
### **Step 1: Clean Install Dependencies**
```bash
cd frontend
rm -rf node_modules package-lock.json
npm install
```
**For Windows PowerShell:**
```powershell
cd frontend
Remove-Item -Recurse -Force node_modules, package-lock.json -ErrorAction SilentlyContinue
npm install
```
### **Step 2: Verify CopilotKit Installation**
Check that the following packages are installed:
```bash
npm list @copilotkit/react-core @copilotkit/react-ui @copilotkit/shared
```
Expected output:
```
@copilotkit/react-core@1.10.3
@copilotkit/react-ui@1.10.3
@copilotkit/shared@1.10.3
```
### **Step 3: Build the Frontend**
```bash
npm run build
```
### **Step 4: Start Development Server**
```bash
npm start
```
## 📋 **Complete Setup Instructions for New Users**
### **Frontend Setup:**
```bash
# Navigate to frontend directory
cd frontend
# Install dependencies
npm install
# Create .env file from template
cp env_template.txt .env
# Add your environment variables to .env:
# REACT_APP_CLERK_PUBLISHABLE_KEY=<your-clerk-key>
# REACT_APP_COPILOTKIT_API_KEY=<your-copilotkit-key>
# Build the project
npm run build
# Start development server
npm start
```
### **Backend Setup:**
```bash
# Navigate to backend directory
cd backend
# Create virtual environment
python -m venv .venv
# Activate virtual environment
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Create .env file from template
cp env_template.txt .env
# Add your environment variables to .env
# Initialize database tables
python scripts/create_subscription_tables.py
# Start backend server
python app.py
```
## 🎯 **Why This Happens**
1. **Missing `node_modules`**: Package dependencies not installed
2. **Outdated packages**: Old version of CopilotKit that doesn't export `CopilotSidebar`
3. **Skipped installation**: Running `npm start` before `npm install`
## ✅ **Verification**
After following the steps above, you should see:
- ✅ No import errors for `CopilotSidebar`
- ✅ Frontend compiles successfully
- ✅ Development server starts on `http://localhost:3000`
- ✅ Backend API accessible on `http://localhost:8000`
## 📚 **Reference**
- [CopilotKit UI Components Documentation](https://docs.copilotkit.ai/crewai-crews/custom-look-and-feel/built-in-ui-components)
- CopilotKit exports: `CopilotChat`, `CopilotSidebar`, `CopilotPopup` from `@copilotkit/react-ui`
## 🚨 **Common Mistakes to Avoid**
1. ❌ Running `npm start` without `npm install` first
2. ❌ Using outdated `package-lock.json`
3. ❌ Missing environment variables in `.env` files
4. ❌ Not running database migration scripts for backend
## 💡 **Pro Tip**
Always run these commands after pulling new code:
```bash
# Frontend
cd frontend && npm install && npm run build
# Backend
cd backend && pip install -r requirements.txt
```
---
## 🐛 **Issue: "Failed to process subscription" (500 Error)**
**Symptoms:**
- User selects Free or Basic plan on Pricing page
- Clicks "Subscribe to [Plan]"
- Gets error: "Failed to process subscription"
- Backend logs: `name 'UsageStatus' is not defined`
**Root Cause:**
Missing `UsageStatus` import in `backend/api/subscription_api.py`
**Fix:**
✅ Already fixed in latest version. Update to latest code:
```bash
git pull origin main
cd backend
python app.py # Restart backend
```
**Verify Fix:**
Check that `backend/api/subscription_api.py` line 18 includes:
```python
from models.subscription_models import (
..., UsageStatus # <-- This should be present
)
```

99
.github/pull_request_template.md vendored Normal file
View File

@@ -0,0 +1,99 @@
# Pull Request
## 📝 Description
Brief description of changes made in this PR.
## 🔄 Type of Change
- [ ] 🐛 Bug fix (non-breaking change which fixes an issue)
- [ ] ✨ New feature (non-breaking change which adds functionality)
- [ ] 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] 📚 Documentation update
- [ ] 🎨 Style/UI changes
- [ ] ♻️ Code refactoring
- [ ] ⚡ Performance improvements
- [ ] 🧪 Test additions/updates
## 🎯 Related Issues
Closes #(issue number)
Fixes #(issue number)
Related to #(issue number)
## 🧪 Testing
- [ ] Backend tests pass
- [ ] Frontend tests pass
- [ ] Manual testing completed
- [ ] Cross-browser testing (if applicable)
- [ ] Mobile testing (if applicable)
## 📸 Screenshots (if applicable)
Add screenshots to help explain your changes.
### Before
<!-- Add before screenshots here -->
### After
<!-- Add after screenshots here -->
## 🏷️ Component/Feature
Which component or feature is affected?
- [ ] Blog Writer
- [ ] SEO Dashboard
- [ ] Content Planning
- [ ] Facebook Writer
- [ ] LinkedIn Writer
- [ ] Onboarding
- [ ] Authentication
- [ ] API
- [ ] Database
- [ ] GSC Integration
- [ ] Subscription System
- [ ] Monitoring/Billing
- [ ] Documentation
- [ ] Other: _______________
## 📋 Checklist
- [ ] My code follows the project's style guidelines
- [ ] I have performed a self-review of my own code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published
### ALwrity-Specific Checklist
- [ ] API endpoints follow RESTful conventions
- [ ] AI service integrations handle rate limits and errors gracefully
- [ ] Content generation includes proper validation and sanitization
- [ ] Database migrations are included if schema changes are made
- [ ] Environment variables are documented in env_template.txt
- [ ] Security considerations have been addressed
- [ ] Performance impact has been considered
- [ ] User experience is consistent with existing features
## 🔍 Code Quality
- [ ] Code is properly formatted
- [ ] No console.log statements left in production code
- [ ] Error handling is implemented where needed
- [ ] Performance considerations have been addressed
- [ ] Security considerations have been addressed
## 📚 Documentation
- [ ] README updated (if needed)
- [ ] API documentation updated (if needed)
- [ ] Code comments added for complex logic
- [ ] Changelog updated (if applicable)
## 🚀 Deployment Notes
Any special deployment considerations or environment variables needed.
## 🔗 Additional Context
Add any other context about the pull request here.
## 👥 Reviewers
Tag specific reviewers if needed:
@AJaySi @uniqueumesh @DikshaDisciplines
---
**Thank you for contributing to ALwrity!** 🎉

103
.github/setup_alwrity.bat vendored Normal file
View File

@@ -0,0 +1,103 @@
@echo off
REM ALwrity Complete Setup Script for Windows
REM This script sets up both frontend and backend for local development
echo ================================
echo 🚀 ALwrity Setup Script (Windows)
echo ================================
echo.
REM Check if we're in the project root
if not exist "frontend\" (
echo ❌ Error: frontend directory not found
echo Please navigate to the AI-Writer directory and try again.
exit /b 1
)
if not exist "backend\" (
echo ❌ Error: backend directory not found
echo Please navigate to the AI-Writer directory and try again.
exit /b 1
)
echo 📋 Step 1: Setting up Backend
echo --------------------------------
REM Setup Backend
cd backend
echo Creating Python virtual environment...
python -m venv .venv
echo Activating virtual environment...
call .venv\Scripts\activate.bat
echo Installing Python dependencies...
pip install -r requirements.txt
REM Create .env file if it doesn't exist
if not exist ".env" (
echo Creating .env file from template...
copy env_template.txt .env
echo ⚠️ Please update backend\.env with your API keys
)
echo Creating subscription tables...
python scripts\create_subscription_tables.py 2>nul || echo ⚠️ Subscription tables may already exist
echo Updating subscription plans...
python scripts\cleanup_alpha_plans.py 2>nul || echo ⚠️ Plans may already be updated
cd ..
echo ✅ Backend setup complete!
echo.
echo 📋 Step 2: Setting up Frontend
echo --------------------------------
REM Setup Frontend
cd frontend
REM Clean install
if exist "node_modules\" (
echo Cleaning old node_modules...
rmdir /s /q node_modules 2>nul
del package-lock.json 2>nul
)
echo Installing Node.js dependencies (this may take a few minutes)...
call npm install
REM Create .env file if it doesn't exist
if not exist ".env" (
echo Creating .env file from template...
copy env_template.txt .env
echo ⚠️ Please update frontend\.env with your environment variables
)
echo Building frontend...
call npm run build
cd ..
echo.
echo ================================
echo 🎉 ALwrity Setup Complete!
echo ================================
echo.
echo Next steps:
echo 1. Update backend\.env with your API keys (Clerk, Gemini, etc.)
echo 2. Update frontend\.env with your Clerk publishable key
echo.
echo To start the application:
echo Backend: cd backend ^&^& python app.py
echo Frontend: cd frontend ^&^& npm start
echo.
echo Access points:
echo Frontend: http://localhost:3000
echo Backend API: http://localhost:8000/api/docs
echo.
echo Happy coding! 🚀
pause

105
.github/setup_alwrity.sh vendored Normal file
View File

@@ -0,0 +1,105 @@
#!/bin/bash
# ALwrity Complete Setup Script
# This script sets up both frontend and backend for local development
set -e # Exit on error
echo "🚀 ALwrity Setup Script"
echo "================================"
echo ""
# Color codes for output
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
NC='\033[0m' # No Color
# Check if we're in the project root
if [ ! -d "frontend" ] || [ ! -d "backend" ]; then
echo -e "${RED}❌ Error: This script must be run from the project root directory${NC}"
echo "Please navigate to the AI-Writer directory and try again."
exit 1
fi
echo -e "${YELLOW}📋 Step 1: Setting up Backend${NC}"
echo "--------------------------------"
# Setup Backend
cd backend
echo "Creating Python virtual environment..."
python -m venv .venv || python3 -m venv .venv
echo "Activating virtual environment..."
source .venv/bin/activate || source .venv/Scripts/activate
echo "Installing Python dependencies..."
pip install -r requirements.txt
# Create .env file if it doesn't exist
if [ ! -f ".env" ]; then
echo "Creating .env file from template..."
cp env_template.txt .env
echo -e "${YELLOW}⚠️ Please update backend/.env with your API keys${NC}"
fi
echo "Creating subscription tables..."
python scripts/create_subscription_tables.py || echo -e "${YELLOW}⚠️ Subscription tables may already exist${NC}"
echo "Updating subscription plans..."
python scripts/cleanup_alpha_plans.py || echo -e "${YELLOW}⚠️ Plans may already be updated${NC}"
cd ..
echo -e "${GREEN}✅ Backend setup complete!${NC}"
echo ""
echo -e "${YELLOW}📋 Step 2: Setting up Frontend${NC}"
echo "--------------------------------"
# Setup Frontend
cd frontend
# Clean install
if [ -d "node_modules" ]; then
echo "Cleaning old node_modules..."
rm -rf node_modules package-lock.json
fi
echo "Installing Node.js dependencies (this may take a few minutes)..."
npm install
# Create .env file if it doesn't exist
if [ ! -f ".env" ]; then
echo "Creating .env file from template..."
cp env_template.txt .env
echo -e "${YELLOW}⚠️ Please update frontend/.env with your environment variables${NC}"
fi
echo "Building frontend..."
npm run build
cd ..
echo -e "${GREEN}✅ Frontend setup complete!${NC}"
echo ""
echo "================================"
echo -e "${GREEN}🎉 ALwrity Setup Complete!${NC}"
echo "================================"
echo ""
echo "Next steps:"
echo "1. Update backend/.env with your API keys (Clerk, Gemini, etc.)"
echo "2. Update frontend/.env with your Clerk publishable key"
echo ""
echo "To start the application:"
echo " Backend: cd backend && python app.py"
echo " Frontend: cd frontend && npm start"
echo ""
echo "Access points:"
echo " Frontend: http://localhost:3000"
echo " Backend API: http://localhost:8000/api/docs"
echo ""
echo -e "${GREEN}Happy coding! 🚀${NC}"

67
.github/workflows/docs.yml vendored Normal file
View File

@@ -0,0 +1,67 @@
name: Deploy Documentation
on:
push:
branches: [main]
paths: ['docs/**', 'docs-site/**', 'mkdocs.yml']
pull_request:
branches: [main]
paths: ['docs/**', 'docs-site/**', 'mkdocs.yml']
permissions:
contents: read
pages: write
id-token: write
actions: read
concurrency:
group: "pages"
cancel-in-progress: false
jobs:
build:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Setup Python
uses: actions/setup-python@v4
with:
python-version: '3.10'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install mkdocs mkdocs-material
- name: Setup Pages
uses: actions/configure-pages@v4
with:
enablement: true
- name: Build documentation
run: |
cd docs-site
mkdocs build --site-dir ../site
- name: Upload artifact
uses: actions/upload-pages-artifact@v3
with:
path: site
deploy:
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
runs-on: ubuntu-latest
needs: build
if: github.ref == 'refs/heads/main'
steps:
- name: Deploy to GitHub Pages
id: deployment
uses: actions/deploy-pages@v4
with:
token: ${{ secrets.GITHUB_TOKEN }}

273
.gitignore vendored Normal file
View File

@@ -0,0 +1,273 @@
# Python
__pycache__/
*.py[cod]
*.db
*.sqlite*
.trae/
.trae
workspace/
workspace/*
.opencode
data/
.trae/
/backend/database/migrations/*
/backend/.db
backend/*.db
backend\youtube_audio
youtube_avatars
backend\youtube_images
backend/.trae_*
# Onboarding progress files
.onboarding_progress.json
backend/.onboarding_progress.json
backend/database/migrations/*
*.mp3
podcast_audio/*
backend/podcast_audio/
podcast_audio/
podcast_images/
youtube_videos/
backend/podcast_images/
backend/podcast_videos/
backend/researchtools_text/projects/
youtube_avatars/
youtube_avatars/*
youtube_videos/*
youtube_images/
youtube_audio
.cursorignore
story_videos
story_videos/*
story_audio
story_images
backend/story_videos/*
backend/story_audio/*
backend/story_images/*
# Environment
.env
.env.*
# User data
backend/lib/workspace/
backend/lib/workspace/users/
backend/logs/
backend/linkedin_images/
backend/test/
backend/.onboarding_progress_user*
backend/.onboarding_*.json
# Frontend
frontend/node_modules/
frontend/build/
frontend/.env*
# Logs
*.log
logs/
# OS
.DS_Store
Thumbs.db
# Docs build
docs-site/site/
# Dependencies
node_modules/
*/node_modules/
**/node_modules/
# Python cache files
__pycache__/
*/__pycache__/
**/__pycache__/
*.pyc
*.pyo
*.pyd
.gitignore
.pytest*
# Cache files
.cache/
*/cache/
**/cache/
*.cache
# MkDocs site directory
docs-site/site/
venv_new
venv
# Environment files
.env
.env.local
.env.development.local
.env.test.local
.env.production.local
backend/.env
frontend/.env
# Database files
*.db
*.sqlite
*.sqlite3
backend/alwrity.db
backend/content_cache.db
backend/outline_cache.db
backend/research_cache.db
# Google OAuth credentials
gsc_credentials.json
**/gsc_credentials.json
.cursor
# Onboarding progress files
.onboarding_progress.json
backend/.onboarding_progress.json
# IDE and editor files
.vscode/
.idea/
*.swp
*.swo
*~
# OS generated files
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db
# Node.js (for frontend)
npm-debug.log*
yarn-debug.log*
yarn-error.log*
# Build directories
build/
dist/
*.egg-info/
# Logs
*.log
logs/
# Temporary files
*.tmp
*.temp
# Coverage reports
htmlcov/
.coverage
.coverage.*
coverage.xml
*.cover
.hypothesis/
.pytest_cache/
# Virtual environments
venv/
env/
ENV/
.venv/
# Jupyter Notebook
.ipynb_checkpoints
# pyenv
.python-version
# pipenv
Pipfile.lock
# PEP 582
__pypackages__/
# Celery
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
.cursorignore
gsc_credentials_template.json
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# Credentials and secrets
gsc_credentials.json
*.pem
*.key
*.crt
# Test files
test_*.py
*_test.py
tests/
# Documentation build
docs/_build/
# Backup files
*.bak
*.backup
*.orig
# Lock files
package-lock.json
yarn.lock
# Cache directories
.pytest_cache
# Documentation cache
docs/__pycache__/
# Onboarding JSON files (CRITICAL: Should use database instead)
.onboarding_progress.json
*_onboarding_progress.json
backend/.onboarding_progress*.json
backend/researchtools_text/projects/Draft__AI_advanc_c2f90698.json
backend/researchtools_text/projects/Draft__AI_adv_388d4491.json
# Migration and debug scripts
debug_usage.py
fix_database.py
migrate_usage_summaries.py
simple_migrate.py
validate_implementation.py
# Camera selfie implementation (not needed)
CAMERA_SELFIE_IMPLEMENTATION.md

160
README.md
View File

@@ -1,160 +0,0 @@
# AI Blog Creation and Management Toolkit
![](https://github.com/AJaySi/AI-Blog-Writer/blob/main/workspace/keyword_blog.gif)
## Introduction
This toolkit automates and enhances the process of blog creation, optimization, and management.
Leveraging AI technologies, it assists content creators and digital marketers in generating, formatting, and uploading blog content efficiently. The toolkit integrates advanced AI models for text generation, image creation, and data analysis, streamlining the content creation pipeline.
---
## Getting Started 🚀 🤞🤞🤞
To start using this tool, simply follow one of the options below:
---
### Option 1: Local Laptop Install 💻 (Recommended)
**Step 0**️⃣: **Pre-requisites:** Git, Python3
**Installing Python on Windows:**
- Open PowerShell as admin: Press `Windows Key + X`, then select "Windows PowerShell (Admin)".
- Type `python`. If Python is not installed, Windows will prompt you to 'Get Python'.
- If Python is installed, you should see '>>>>>'.
**Installing Git on Windows:**
- Open PowerShell or Windows Terminal: Press `Windows Key + X`, then select "Windows Terminal".
- Paste or type and press enter:⏎.⏎.<br>
`winget install --id Git.Git -e --source winget`
- Wait for download bars to finish
*Note for Linux Users:* If you're on Linux and can't install these, get lost 🧙♂️
**Step 1**️⃣: Clone this repository to your local machine.
```
To clone the repository to your local machine, perform the following steps:
1. **Open Windows PowerShell as Administrator:** Press `Windows Key + X` and select "Windows PowerShell (Admin)" from the menu.
2. **Navigate to the Desired Directory:** Use the `cd` command to move to the directory where you want to clone the repository.
3. **Clone the Repository:** Run the following command in PowerShell to clone the repository:
`git clone https://github.com/AJaySi/AI-Blog-Writer.git`
This command will download all the files from the repository to your local machine.
4. **Verify the Clone:** After the cloning process is complete, navigate into the newly created directory using:
`cd AI-Blog-Writer`
```
Once you've cloned the repository, you can proceed with the next steps for installation and setup.
**Step 2**️⃣: Install required dependencies:
- Open command prompt on your local machine: Press `Windows Key + R`, type `cmd`, then press Enter.
- Navigate to the folder from Step 1
- Run: `python -m pip install -r requirements.txt`
**Step 3**️⃣: Run the script:
- Execute: `python alwrity.py`
**Step 4**️⃣: The tool will guide you through setting up your APIs.
---
### Option 2: Replit: Cloud Install ☁️☁️☁️ ☁️ ☁️ ....☁️
**Step 1**️⃣: Fork this repository to your own GitHub account.
**Step 2**️⃣: Follow this guide: [Running GitHub Repositories on Replit](https://docs.replit.com/programming-ide/using-git-on-replit/running-github-repositories-replit) 📖
---
### Option 3: Web URL 🌐 *(For easy access)*
**Step 1**️⃣: Error 404: Page not found. 😅
---
## Features
- **Online Research Integration**: Enhances blog content by integrating insights and information gathered from online research, ensuring the content is informative and up-to-date. This gives context for generating content. Tavily AI, Google search, serp and Vision AI is used to scrape web data for context augumentation. TBD: Include CrewAI for web research agents.
- **Image Generation and Processing**: Utilizes AI models like DALL-E 3, stable difffusion to create relevant images based on blog content. Offers features to process and optimize images for web usage. FIXME: Need more work with stable diffusion.
- **SEO Optimization**: Employs AI to generate SEO-friendly blog titles, meta descriptions, tags, and categories. Ensures content is optimized for search engines.
- **Wordpress, Jekyll Integration**: Implemented generating and uploading blog content, media to wordpress via its REST APIs. Most of the static website which can work with markdown style should work with little testing.
### AI-Driven Content Creation
- **Text Generation**: Leverages OpenAI's ChatGPT, Google Gemini Pro for generating text for blogs.
- **Customizable AI Parameters**: (FIXME) Offers flexibility in adjusting AI parameters like model selection, temperature, and token limits to suit different content needs.
### Image Detail Extraction
- **Analyzing and Extracting Image Details**: Uses OpenAI's Vision API, Google Gemini vision to analyze images and extract details such as alt text, descriptions, titles, and captions, enhancing the SEO of image content.
---
**Note**: This toolkit is designed for automated blog management and requires appropriate API keys and access credentials for full functionality.
---
### Web Research
- **Keyword Research**: Conduct in-depth keyword research by specifying search queries and time ranges.
- **Domain-Specific Searches**: Include specific URLs to confine searches to certain domains, such as Wikipedia or competitor websites.
- **Semantic Analysis**: Explore similar topics and technologies by providing a reference URL for semantic analysis.
### Competitor Analysis
- **Similar Company Discovery**: Analyze competitor websites to discover similar companies, startups, and technologies.
- **Industry Insights**: Gain insights into industry trends, market competitors, and emerging technologies.
### Blog Writing
- **Keyword-Based Blogs**: Generate blog content based on specified keywords, leveraging AI to produce engaging and informative articles.
- **Audio Blog Generation**: Convert audio from YouTube videos into blog posts, facilitating content creation from multimedia sources.
- **GitHub Repository Blogs**: Transform GitHub repositories or topics into blog posts, showcasing code examples and project insights.
- **Scholarly Research Blogs**: Generate blog content based on research papers, summarizing key findings and insights.
### Blogging Tools
- **Title and Meta Description Generation**: Generate catchy titles and meta descriptions for blog posts to improve SEO and user engagement.
- **Blog Outline Creation**: Generate outlines for blog posts, aiding in structuring content and organizing ideas.
- **FAQ Generation**: Automatically generate FAQs (Frequently Asked Questions) based on blog content, enhancing user engagement and SEO.
- **HTML and Markdown Conversion**: Convert blog posts between HTML and Markdown formats for easy integration with various platforms.
- **Blog Proofreading**: Proofread blog content for grammar, spelling, and readability, ensuring high-quality output.
- **Tag and Category Suggestions**: Generate tags and categories for blog posts based on content analysis, improving organization and discoverability.
### Interactive Mode
- **User-Friendly Interface**: Navigate tasks and options easily through an interactive command-line interface.
- **Menu-Driven Interaction**: Choose between various options, tasks, and tools using intuitive menus and prompts.
- **Task Guidance**: Receive guidance and instructions for each task, facilitating user interaction and decision-making.
## Packages, Tools, and APIs Used
- **Libraries**:
- PyInquirer: For creating interactive command-line interfaces.
- Typer: For building CLI applications with ease.
- Tabulate: For formatting data in tabular form.
- Requests: For making HTTP requests to web APIs.
- python-dotenv: For loading environment variables from a .env file.
- **APIs**:
- Metaphor API: Provides semantic search capabilities for finding similar topics and technologies.
- Tavily API: Offers AI-powered web search functionality for conducting in-depth keyword research.
- SerperDev API: Enables access to search engine results and competitor analysis data.
- OpenAI API: Powers the Large Language Models (LLMs) for generating blog content and conducting research.
- Gemini API: Another LLM provider for natural language processing tasks.
- Ollama API (Work In Progress): An upcoming LLM provider for additional research and content generation capabilities.
---
Notes:
1). Focus is on writing/generating highly unique, SEO optimized blog content.
2). Models: Openai, gemini, ollama are interesting. Minstral API is also worth exploring. Cohere API is purpose made.
Focus is getting the prompts right. Shit in, shit out, irrespective of dollars and cutting edge models.
Pydantically speakng, Due to experimental nature of prompting, its getting expensive soon enough. Gemini is free for now.
3). Missing frontend: A smart backend will enable a good frontend. WIP, backend. So, frontend; coming soon.
4).Getting AI agents to 'brainstrom' blog ideas seems more pressing. CrewAI seems more straightforward than autogen.
5). Too Many APIs floating around: The implementation is using tools that dont depend on API keys and rather scrape them.
Duh, scraping wont scale, that is GPT vision based scraping will come in handy.

View File

@@ -0,0 +1,215 @@
# Alwrity Enterprise SEO Features
## 🚀 Overview
Alwrity's AI SEO Tools have been enhanced with enterprise-level features that provide comprehensive SEO management, advanced analytics, and AI-powered strategic insights. These enhancements transform Alwrity from a collection of individual tools into a unified enterprise SEO command center.
## 🏢 Enterprise SEO Suite
### Unified Command Center (`enterprise_seo_suite.py`)
The Enterprise SEO Suite serves as a central orchestrator for all SEO activities, providing:
#### Core Workflows
- **Complete SEO Audit**: Comprehensive site analysis combining technical, content, and performance metrics
- **Content Strategy Development**: AI-powered content planning with market intelligence
- **Search Intelligence Analysis**: Deep GSC data analysis with actionable insights
- **Performance Monitoring**: Continuous tracking and optimization recommendations
#### Key Features
- **Intelligent Workflow Orchestration**: Automatically sequences and coordinates multiple SEO analyses
- **AI-Powered Recommendations**: Uses advanced AI to generate strategic insights and action plans
- **Enterprise Reporting**: Comprehensive reports suitable for executive and team consumption
- **Scalable Architecture**: Designed to handle multiple sites and large datasets
### Enterprise-Level Capabilities
- Multi-site management support
- Role-based access controls (planned)
- Team collaboration features (planned)
- Advanced reporting and dashboards
- API integration capabilities
## 📊 Google Search Console Intelligence
### Advanced GSC Integration (`google_search_console_integration.py`)
Transforms raw GSC data into strategic insights with:
#### Search Performance Analysis
- **Comprehensive Metrics**: Clicks, impressions, CTR, and position tracking
- **Trend Analysis**: Week-over-week and month-over-month performance trends
- **Keyword Performance**: Deep analysis of keyword opportunities and optimization potential
- **Page Performance**: Identification of top-performing and underperforming pages
#### Content Opportunities Engine
- **CTR Optimization**: Identifies high-impression, low-CTR keywords for meta optimization
- **Position Improvement**: Highlights keywords ranking 11-20 for content enhancement
- **Content Gap Detection**: Discovers missing keyword opportunities
- **Technical Issue Detection**: Identifies potential crawl and indexing problems
#### AI-Powered Insights
- **Strategic Recommendations**: AI analysis of search data for actionable insights
- **Immediate Opportunities**: Quick wins identified within 0-30 days
- **Long-term Strategy**: 3-12 month strategic planning recommendations
- **Competitive Analysis**: Market position assessment and improvement strategies
### Demo Mode & Real Integration
- **Demo Mode**: Realistic sample data for testing and exploration
- **GSC API Integration**: Ready for real Google Search Console API connection
- **Credentials Management**: Secure handling of GSC API credentials
- **Data Export**: Full analysis export in JSON and CSV formats
## 🧠 AI Content Strategy Generator
### Comprehensive Strategy Development (`ai_content_strategy.py`)
Creates complete content strategies using AI market intelligence:
#### Business Context Analysis
- **Market Positioning**: AI analysis of competitive landscape and opportunities
- **Content Gap Identification**: Discovers missing content themes in the industry
- **Competitive Advantage Mapping**: Identifies unique positioning opportunities
- **Audience Intelligence**: Deep insights into target audience needs and preferences
#### Content Pillar Development
- **Strategic Pillars**: 4-6 content themes aligned with business goals
- **Keyword Mapping**: Target keywords and semantic variations for each pillar
- **Content Type Recommendations**: Optimal content formats for each pillar
- **Success Metrics**: KPIs and measurement frameworks for each pillar
#### Content Calendar Planning
- **Automated Scheduling**: AI-generated content calendar with optimal timing
- **Resource Planning**: Time estimates and resource allocation
- **Priority Scoring**: Content prioritization based on impact and effort
- **Distribution Mapping**: Multi-channel content distribution strategy
#### Topic Cluster Strategy
- **SEO-Optimized Clusters**: Topic clusters designed for search dominance
- **Pillar Page Strategy**: Hub-and-spoke content architecture
- **Internal Linking Plans**: Strategic linking for SEO authority building
- **Content Relationship Mapping**: How content pieces support each other
### Implementation Support
- **Phase-Based Roadmap**: 3-phase implementation plan with milestones
- **KPI Framework**: Comprehensive measurement and tracking system
- **Resource Requirements**: Budget and team resource planning
- **Risk Mitigation**: Strategies to avoid common content pitfalls
## 🔧 Enhanced Technical Capabilities
### Advanced SEO Workflows
- **Multi-Tool Orchestration**: Seamless integration between all SEO tools
- **Data Correlation**: Cross-referencing insights from multiple analyses
- **Automated Recommendations**: AI-generated action plans with priority scoring
- **Performance Tracking**: Before/after analysis and improvement measurement
### Enterprise Data Management
- **Large Dataset Handling**: Optimized for enterprise-scale websites
- **Historical Data Tracking**: Long-term trend analysis and comparison
- **Data Export & Integration**: API-ready for integration with other tools
- **Security & Privacy**: Enterprise-grade data handling and security
## 📈 Advanced Analytics & Reporting
### Performance Dashboards
- **Executive Summaries**: High-level insights for leadership teams
- **Detailed Analytics**: In-depth analysis for SEO practitioners
- **Trend Visualization**: Interactive charts and performance tracking
- **Competitive Benchmarking**: Market position and competitor analysis
### ROI Measurement
- **Impact Quantification**: Measuring SEO improvements in business terms
- **Cost-Benefit Analysis**: ROI calculation for SEO investments
- **Performance Attribution**: Connecting SEO efforts to business outcomes
- **Forecasting Models**: Predictive analytics for future performance
## 🎯 Strategic Planning Features
### Market Intelligence
- **Industry Analysis**: AI-powered market research and trend identification
- **Competitive Intelligence**: Deep analysis of competitor content strategies
- **Opportunity Mapping**: Identification of untapped market opportunities
- **Risk Assessment**: Potential challenges and mitigation strategies
### Long-term Planning
- **Strategic Roadmaps**: 6-12 month SEO strategy development
- **Resource Planning**: Team and budget allocation recommendations
- **Technology Roadmap**: Tool and platform evolution planning
- **Scalability Planning**: Growth-oriented SEO architecture
## 🚀 Implementation Benefits
### For Enterprise Teams
- **Unified Workflow**: Single platform for all SEO activities
- **Team Collaboration**: Shared insights and coordinated strategies
- **Scalable Operations**: Handle multiple sites and large datasets
- **Executive Reporting**: Clear ROI and performance communication
### For SEO Professionals
- **Advanced Insights**: AI-powered analysis beyond basic tools
- **Time Efficiency**: Automated workflows and intelligent recommendations
- **Strategic Focus**: Less time on analysis, more on strategy execution
- **Competitive Advantage**: Access to enterprise-level intelligence
### For Business Leaders
- **Clear ROI**: Quantified business impact of SEO investments
- **Strategic Alignment**: SEO strategy aligned with business objectives
- **Risk Management**: Proactive identification and mitigation of SEO risks
- **Competitive Intelligence**: Market position and improvement opportunities
## 🔄 Integration Architecture
### Modular Design
- **Tool Independence**: Each tool can function independently
- **Workflow Integration**: Tools work together in intelligent sequences
- **API-First**: Ready for integration with external systems
- **Extensible Framework**: Easy to add new tools and capabilities
### Data Flow
- **Centralized Data Management**: Unified data storage and processing
- **Cross-Tool Insights**: Data sharing between different analyses
- **Historical Tracking**: Long-term data retention and trend analysis
- **Real-time Updates**: Live data integration and analysis
## 📋 Getting Started
### For New Users
1. Start with the **Enterprise SEO Suite** for comprehensive analysis
2. Use **Demo Mode** to explore features with sample data
3. Configure **Google Search Console** integration for real data
4. Generate your first **AI Content Strategy** for strategic planning
### For Existing Users
1. Explore the new **Enterprise tab** in the SEO dashboard
2. Connect your **Google Search Console** for enhanced insights
3. Generate comprehensive **content strategies** using AI
4. Utilize **workflow orchestration** for multi-tool analysis
### Implementation Timeline
- **Week 1**: Tool exploration and data connection
- **Week 2-3**: Initial audits and strategy development
- **Month 1**: Content implementation and optimization
- **Month 2-3**: Performance tracking and strategy refinement
## 🔮 Future Enhancements
### Planned Features
- **Multi-site Management**: Centralized management of multiple websites
- **Team Collaboration**: Role-based access and collaborative workflows
- **Advanced Integrations**: CRM, Analytics, and Marketing Platform connections
- **Machine Learning Models**: Custom AI models for specific industries
- **Predictive Analytics**: Forecasting SEO performance and opportunities
### Roadmap
- **Q1**: Multi-site support and team collaboration features
- **Q2**: Advanced integrations and custom AI models
- **Q3**: Predictive analytics and forecasting capabilities
- **Q4**: Industry-specific optimization and enterprise scalability
---
## 🎯 Conclusion
These enterprise enhancements transform Alwrity into a comprehensive SEO management platform that rivals expensive enterprise solutions while maintaining ease of use and AI-powered intelligence. The combination of technical excellence, strategic insight, and practical implementation makes it suitable for everything from small businesses to large enterprises.
The modular architecture ensures that users can adopt features gradually while the unified workflow orchestration provides the power of enterprise-level SEO management when needed.

View File

@@ -0,0 +1,251 @@
# 🚀 Alwrity's Enterprise AI SEO Tools Suite
**Transform your SEO strategy with AI-powered enterprise-level tools and intelligent workflows**
Alwrity's AI SEO Tools have evolved into a comprehensive enterprise suite that combines individual optimization tools with intelligent workflow orchestration, providing everything from basic SEO tasks to advanced strategic analysis and competitive intelligence.
---
## 🌟 **What's New: Enterprise Features**
### 🎯 **Enterprise SEO Command Center**
- **Unified Workflow Orchestration**: Combines all tools into intelligent, automated workflows
- **Complete SEO Audits**: Comprehensive analysis covering technical, content, competitive, and performance aspects
- **AI-Powered Strategic Recommendations**: Advanced insights with prioritized action plans
- **Enterprise-Level Reporting**: Professional dashboards with ROI measurement and executive summaries
### 📊 **Google Search Console Intelligence**
- **Advanced GSC Integration**: Deep analysis of search performance data with AI insights
- **Content Opportunities Engine**: Identifies high-impact optimization opportunities
- **Search Intelligence Workflows**: Transforms GSC data into actionable content strategies
- **Competitive Position Analysis**: Market positioning insights based on search performance
### 🧠 **AI Content Strategy Generator**
- **Comprehensive Strategy Development**: AI-powered content planning with market intelligence
- **Content Pillar Architecture**: Topic cluster strategies with keyword mapping
- **Implementation Roadmaps**: Phase-based execution plans with resource estimation
- **Business Context Analysis**: Industry-specific insights and competitive positioning
---
## 🛠️ **Complete Tool Suite**
### **🏢 Enterprise Suite**
| Tool | Description | Key Features |
|------|-------------|--------------|
| **Enterprise SEO Command Center** | Unified workflow orchestration | Complete audits, AI recommendations, strategic planning |
| **Google Search Console Intelligence** | Advanced GSC data analysis | Content opportunities, search intelligence, competitive analysis |
| **AI Content Strategy Generator** | Comprehensive content planning | Market intelligence, topic clusters, implementation roadmaps |
### **📊 Analytics & Intelligence**
| Tool | Description | Key Features |
|------|-------------|--------------|
| **Enhanced Content Gap Analysis** | Advanced competitive content analysis | Advertools integration, AI insights, opportunity identification |
| **Technical SEO Crawler** | Site-wide technical analysis | Performance metrics, crawl analysis, AI recommendations |
| **Competitive Intelligence** | Market positioning analysis | Competitor benchmarking, strategic insights, market opportunities |
### **🔧 Technical SEO**
| Tool | Description | Key Features |
|------|-------------|--------------|
| **On-Page SEO Analyzer** | Comprehensive page optimization | Meta analysis, content optimization, readability scoring |
| **URL SEO Checker** | Individual URL analysis | Technical factors, optimization recommendations |
| **Google PageSpeed Insights** | Performance analysis | Core Web Vitals, speed optimization, mobile performance |
### **📝 Content & Strategy**
| Tool | Description | Key Features |
|------|-------------|--------------|
| **Content Calendar Planner** | Strategic content planning | Editorial calendars, topic scheduling, resource planning |
| **Topic Cluster Generator** | Content architecture planning | Pillar pages, cluster content, internal linking strategies |
| **Content Performance Analyzer** | Content effectiveness analysis | Performance metrics, optimization recommendations |
### **⚡ Quick Optimization Tools**
| Tool | Description | Key Features |
|------|-------------|--------------|
| **Meta Description Generator** | SEO-friendly meta descriptions | Keyword optimization, CTR enhancement, length optimization |
| **Content Title Generator** | Attention-grabbing titles | Keyword integration, engagement optimization, SERP visibility |
| **OpenGraph Generator** | Social media optimization | Facebook/LinkedIn optimization, visual appeal, click enhancement |
| **Image Alt Text Generator** | AI-powered alt text creation | SEO optimization, accessibility compliance, image discoverability |
| **Schema Markup Generator** | Structured data creation | Rich snippets, search enhancement, content understanding |
| **Twitter Tags Generator** | Twitter optimization | Engagement enhancement, visibility improvement, social sharing |
---
## 🎯 **Enterprise Workflows**
### **🔍 Complete SEO Audit Workflow**
1. **Technical SEO Analysis** - Site-wide technical health assessment
2. **Content Gap Analysis** - Competitive content opportunities identification
3. **On-Page Optimization** - Page-level SEO factor analysis
4. **Performance Analysis** - Speed, mobile, and Core Web Vitals assessment
5. **AI Strategic Recommendations** - Prioritized action plan with impact estimates
### **📊 Search Intelligence Workflow**
1. **GSC Data Analysis** - Comprehensive search performance review
2. **Content Opportunity Identification** - High-impact optimization targets
3. **Competitive Position Assessment** - Market positioning analysis
4. **Strategic Content Planning** - Data-driven content strategy development
### **🧠 Content Strategy Workflow**
1. **Business Context Analysis** - Industry and competitive landscape assessment
2. **Content Pillar Development** - Topic cluster architecture creation
3. **Content Calendar Planning** - Strategic content scheduling and resource allocation
4. **Implementation Roadmap** - Phase-based execution with timeline and priorities
---
## 🚀 **Getting Started**
### **For New Users**
1. **Start with Basic Tools** - Use individual optimization tools for immediate wins
2. **Explore Analytics** - Try content gap analysis and technical crawling
3. **Upgrade to Enterprise** - Access unified workflows and AI-powered insights
### **For Existing Users**
1. **Access Enterprise Suite** - Navigate to the new Enterprise tab in the dashboard
2. **Run Complete Audit** - Execute comprehensive SEO analysis workflows
3. **Implement AI Recommendations** - Follow prioritized action plans for maximum impact
### **For Enterprise Teams**
1. **Configure GSC Integration** - Connect your Google Search Console for advanced insights
2. **Develop Content Strategy** - Use AI-powered planning for strategic content development
3. **Monitor and Optimize** - Leverage continuous monitoring and optimization workflows
---
## 📈 **Business Impact**
### **Immediate Benefits (0-30 days)**
-**Quick Wins Identification** - AI-powered immediate optimization opportunities
-**Technical Issue Resolution** - Critical SEO problems with prioritized fixes
-**Content Optimization** - Existing page improvements for better performance
-**Performance Enhancement** - Speed and mobile optimization recommendations
### **Strategic Growth (1-6 months)**
- 📈 **Content Strategy Execution** - Systematic content development with topic clusters
- 📈 **Competitive Positioning** - Market advantage through strategic content gaps
- 📈 **Authority Building** - Thought leadership content and link-worthy assets
- 📈 **Search Visibility** - Improved rankings through comprehensive optimization
### **Long-term Success (6-12 months)**
- 🏆 **Market Leadership** - Dominant search presence in target markets
- 🏆 **Organic Growth** - Sustainable traffic and conversion improvements
- 🏆 **Competitive Advantage** - Advanced SEO capabilities beyond competitors
- 🏆 **ROI Optimization** - Measurable business impact and revenue growth
---
## 🔧 **Technical Architecture**
### **Modular Design**
- **Independent Tools** - Each tool functions standalone for specific tasks
- **Workflow Integration** - Tools combine seamlessly in enterprise workflows
- **API-Ready Architecture** - External system integration capabilities
- **Scalable Infrastructure** - Handles enterprise-level data and analysis
### **AI Integration**
- **Advanced Language Models** - GPT-powered analysis and recommendations
- **Contextual Intelligence** - Business-specific insights and strategies
- **Continuous Learning** - Improving recommendations based on performance data
- **Multi-Modal Analysis** - Text, data, and performance metric integration
### **Data Management**
- **Secure Processing** - Enterprise-grade data security and privacy
- **Real-time Analysis** - Live data processing and immediate insights
- **Historical Tracking** - Performance monitoring and trend analysis
- **Export Capabilities** - Comprehensive reporting and data portability
---
## 🎯 **Use Cases by Role**
### **SEO Professionals**
- **Comprehensive Audits** - Complete site analysis with actionable recommendations
- **Competitive Intelligence** - Market positioning and opportunity identification
- **Strategic Planning** - Long-term SEO roadmaps with business alignment
- **Performance Monitoring** - Continuous optimization and improvement tracking
### **Content Marketers**
- **Content Strategy Development** - AI-powered planning with market intelligence
- **Topic Research** - Data-driven content ideas and keyword opportunities
- **Performance Analysis** - Content effectiveness measurement and optimization
- **Editorial Planning** - Strategic content calendars with resource allocation
### **Business Leaders**
- **ROI Measurement** - Clear business impact and performance metrics
- **Strategic Insights** - Market opportunities and competitive positioning
- **Resource Planning** - Efficient allocation of SEO and content resources
- **Executive Reporting** - High-level dashboards and strategic recommendations
### **Agencies & Consultants**
- **Client Audits** - Professional-grade analysis and reporting
- **Scalable Solutions** - Multi-client management and optimization
- **Competitive Analysis** - Market intelligence and positioning strategies
- **Value Demonstration** - Clear ROI and performance improvement tracking
---
## 🔮 **Future Roadmap**
### **Planned Enhancements**
- 🔄 **Real-time Monitoring** - Continuous SEO health tracking and alerts
- 🤖 **Advanced AI Models** - Enhanced analysis and prediction capabilities
- 🌐 **Multi-language Support** - Global SEO optimization and analysis
- 📱 **Mobile App** - On-the-go SEO monitoring and management
- 🔗 **Enhanced Integrations** - More third-party tool connections and APIs
### **Advanced Features in Development**
- **Predictive SEO Analytics** - Forecast performance and opportunity identification
- **Automated Optimization** - AI-driven automatic SEO improvements
- **Voice Search Optimization** - Emerging search behavior analysis
- **Local SEO Suite** - Location-based optimization and management
- **E-commerce SEO** - Specialized tools for online retail optimization
---
## 📚 **Resources & Support**
### **Documentation**
- 📖 **Enterprise Features Guide** - Comprehensive feature documentation
- 🎥 **Video Tutorials** - Step-by-step workflow demonstrations
- 📋 **Best Practices** - Industry-standard SEO optimization guidelines
- 🔧 **API Documentation** - Integration guides and technical specifications
### **Support Channels**
- 💬 **Community Forum** - User discussions and knowledge sharing
- 📧 **Email Support** - Direct assistance for technical issues
- 🎓 **Training Programs** - Advanced SEO strategy and tool mastery
- 🤝 **Consulting Services** - Strategic SEO planning and implementation
---
## 🏁 **Action Plan: Maximize Your SEO Success**
### **Phase 1: Foundation (Week 1-2)**
1. **Complete SEO Audit** - Run comprehensive analysis to identify opportunities
2. **Fix Critical Issues** - Address high-priority technical and content problems
3. **Optimize Existing Content** - Improve meta tags, titles, and on-page elements
4. **Set Up Monitoring** - Configure GSC integration and performance tracking
### **Phase 2: Strategic Development (Week 3-8)**
1. **Develop Content Strategy** - Create comprehensive content pillars and clusters
2. **Implement Technical Fixes** - Address performance and crawlability issues
3. **Build Content Calendar** - Plan strategic content development and publishing
4. **Monitor Competitive Position** - Track market positioning and opportunities
### **Phase 3: Growth & Optimization (Week 9-24)**
1. **Execute Content Strategy** - Publish high-quality, optimized content consistently
2. **Build Authority** - Develop thought leadership and link-worthy content
3. **Expand Market Presence** - Target new keywords and market segments
4. **Measure and Refine** - Continuously optimize based on performance data
### **Phase 4: Market Leadership (Month 6+)**
1. **Dominate Target Markets** - Achieve top rankings for primary keywords
2. **Scale Successful Strategies** - Expand winning approaches to new areas
3. **Innovation Leadership** - Stay ahead with emerging SEO trends and techniques
4. **Sustainable Growth** - Maintain and improve market position continuously
---
**Ready to transform your SEO strategy?** Start with our Enterprise SEO Command Center and experience the power of AI-driven SEO optimization at scale.
🚀 **[Launch Enterprise SEO Suite](./enterprise_seo_suite.py)** | 📊 **[Explore GSC Intelligence](./google_search_console_integration.py)** | 🧠 **[Generate Content Strategy](./ai_content_strategy.py)**

View File

@@ -0,0 +1,68 @@
https://github.com/greghub/website-launch-checklist
https://github.com/marcobiedermann/search-engine-optimization
https://developers.google.com/speed/docs/insights/v5/get-started
https://developers.google.com/search/apis/indexing-api/v3/prereqs
https://developer.chrome.com/docs/lighthouse/overview/#cli
APIs
https://docs.ayrshare.com/
https://github.com/dataforseo/PythonClient
https://mysiteauditor.com/api
https://github.com/searchsolved/search-solved-public-seo/blob/main/keyword-research/low-competition-keyword-finder-serp-api/low_competition_finder_serp_api.py
### Structured Data
- [Facebook Debugger](https://developers.facebook.com/tools/debug) - Enter the URL you want to scrape to see how the page's markup appears to Facebook.
- [Pinterest](https://developers.pinterest.com/rich_pins/validator/) - Validate your Rich Pins and apply to get them on Pinterest.
- [Structured Data Testing Tool](https://developers.google.com/structured-data/testing-tool/) - Paste in your rich snippets or url to test it.
- [Twitter card validator](https://cards-dev.twitter.com/validator) - Enter the URL of the page with the meta tags to validate.
https://github.com/sethblack/python-seo-analyzer
https://www.holisticseo.digital/python-seo/analyse-compare-robots-txt/
https://github.com/Nv7-GitHub/googlesearch
https://www.semrush.com/blog/python-for-google-search/
https://www.kaggle.com/code/eliasdabbas/botpresso-crawl-audit-analysis
https://www.kaggle.com/code/eliasdabbas/nike-xml-sitemap-audit-analysis
https://www.kaggle.com/code/eliasdabbas/twitter-user-account-analysis-python-sejournal
https://www.kaggle.com/code/eliasdabbas/seo-crawl-analysis-template
https://www.kaggle.com/code/eliasdabbas/advertools-seo-crawl-analysis-template
https://www.semrush.com/blog/content-analysis-xml-sitemaps-python/
different configurations that influence your technical SEO and how to optimize them to maximize your organic search visibility.
ALwrityll cover:
HTTP status
URL structure
Website links
XML sitemaps
Robots.txt
Meta robots tag
Canonicalization
JavaScript usage
HTTPS usage
Mobile friendliness
Structured data
Core Web Vitals
Hreflang annotations

View File

@@ -0,0 +1,954 @@
"""
AI-Powered Content Strategy Generator
Creates comprehensive content strategies using AI analysis of SEO data,
competitor insights, and market trends for enterprise content planning.
"""
import streamlit as st
import pandas as pd
import numpy as np
from typing import Dict, Any, List, Optional, Tuple
from datetime import datetime, timedelta
import json
from loguru import logger
import plotly.express as px
import plotly.graph_objects as go
# Import AI modules
from ..gpt_providers.text_generation.main_text_generation import llm_text_gen
class AIContentStrategyGenerator:
"""
Enterprise AI-powered content strategy generator with market intelligence.
"""
def __init__(self):
"""Initialize the content strategy generator."""
logger.info("AI Content Strategy Generator initialized")
def generate_content_strategy(self, business_info: Dict[str, Any]) -> Dict[str, Any]:
"""
Generate comprehensive AI-powered content strategy.
Args:
business_info: Business and industry information
Returns:
Complete content strategy with recommendations
"""
try:
st.info("🧠 Generating AI-powered content strategy...")
# Analyze business context
business_analysis = self._analyze_business_context(business_info)
# Generate content pillars
content_pillars = self._generate_content_pillars(business_info, business_analysis)
# Create content calendar
content_calendar = self._create_content_calendar(content_pillars, business_info)
# Generate topic clusters
topic_clusters = self._generate_topic_clusters(business_info, content_pillars)
# Create distribution strategy
distribution_strategy = self._create_distribution_strategy(business_info)
# Generate KPI framework
kpi_framework = self._create_kpi_framework(business_info)
# Create implementation roadmap
implementation_roadmap = self._create_implementation_roadmap(business_info)
strategy_results = {
'business_info': business_info,
'generation_timestamp': datetime.utcnow().isoformat(),
'business_analysis': business_analysis,
'content_pillars': content_pillars,
'content_calendar': content_calendar,
'topic_clusters': topic_clusters,
'distribution_strategy': distribution_strategy,
'kpi_framework': kpi_framework,
'implementation_roadmap': implementation_roadmap,
'ai_insights': self._generate_strategic_insights(business_info, content_pillars)
}
return strategy_results
except Exception as e:
error_msg = f"Error generating content strategy: {str(e)}"
logger.error(error_msg, exc_info=True)
return {'error': error_msg}
def _analyze_business_context(self, business_info: Dict[str, Any]) -> Dict[str, Any]:
"""Analyze business context for strategic insights."""
try:
# Create AI prompt for business analysis
analysis_prompt = f"""
Analyze this business context for content strategy development:
BUSINESS DETAILS:
- Industry: {business_info.get('industry', 'Not specified')}
- Target Audience: {business_info.get('target_audience', 'Not specified')}
- Business Goals: {business_info.get('business_goals', 'Not specified')}
- Content Objectives: {business_info.get('content_objectives', 'Not specified')}
- Budget: {business_info.get('budget', 'Not specified')}
- Timeline: {business_info.get('timeline', 'Not specified')}
Provide analysis on:
1. Market positioning opportunities
2. Content gaps in the industry
3. Competitive advantages to leverage
4. Audience pain points and interests
5. Seasonal content opportunities
6. Content format preferences for this audience
7. Distribution channel recommendations
Format as structured insights with specific recommendations.
"""
ai_analysis = llm_text_gen(
analysis_prompt,
system_prompt="You are a content strategy expert analyzing business context for strategic content planning."
)
return {
'full_analysis': ai_analysis,
'market_position': self._extract_market_position(ai_analysis),
'content_gaps': self._extract_content_gaps(ai_analysis),
'competitive_advantages': self._extract_competitive_advantages(ai_analysis),
'audience_insights': self._extract_audience_insights(ai_analysis)
}
except Exception as e:
logger.error(f"Business analysis error: {str(e)}")
return {'error': str(e)}
def _generate_content_pillars(self, business_info: Dict[str, Any], business_analysis: Dict[str, Any]) -> List[Dict[str, Any]]:
"""Generate strategic content pillars."""
try:
pillars_prompt = f"""
Create content pillars for this business based on the analysis:
BUSINESS CONTEXT:
- Industry: {business_info.get('industry', 'Not specified')}
- Target Audience: {business_info.get('target_audience', 'Not specified')}
- Business Goals: {business_info.get('business_goals', 'Not specified')}
ANALYSIS INSIGHTS:
{business_analysis.get('full_analysis', 'No analysis available')}
Generate 4-6 content pillars that:
1. Align with business goals
2. Address audience needs
3. Differentiate from competitors
4. Support SEO objectives
5. Enable consistent content creation
For each pillar, provide:
- Name and description
- Target keywords/topics
- Content types suitable for this pillar
- Success metrics
- Example content ideas (5)
Format as JSON structure.
"""
ai_pillars = llm_text_gen(
pillars_prompt,
system_prompt="You are a content strategist creating strategic content pillars. Return structured data."
)
# Parse and structure the pillars
pillars = [
{
'id': 1,
'name': 'Thought Leadership',
'description': 'Position as industry expert through insights and trends',
'target_keywords': ['industry trends', 'expert insights', 'market analysis'],
'content_types': ['Blog posts', 'Whitepapers', 'Webinars', 'Podcasts'],
'success_metrics': ['Brand mentions', 'Expert citations', 'Speaking invitations'],
'content_ideas': [
'Industry trend predictions for 2024',
'Expert roundtable discussions',
'Market analysis reports',
'Innovation case studies',
'Future of industry insights'
]
},
{
'id': 2,
'name': 'Educational Content',
'description': 'Educate audience on best practices and solutions',
'target_keywords': ['how to', 'best practices', 'tutorials', 'guides'],
'content_types': ['Tutorials', 'Guides', 'Video content', 'Infographics'],
'success_metrics': ['Organic traffic', 'Time on page', 'Social shares'],
'content_ideas': [
'Step-by-step implementation guides',
'Best practices checklists',
'Common mistakes to avoid',
'Tool comparison guides',
'Quick tip series'
]
},
{
'id': 3,
'name': 'Customer Success',
'description': 'Showcase success stories and build trust',
'target_keywords': ['case study', 'success story', 'results', 'testimonials'],
'content_types': ['Case studies', 'Customer stories', 'Testimonials', 'Reviews'],
'success_metrics': ['Lead generation', 'Conversion rate', 'Trust signals'],
'content_ideas': [
'Detailed customer case studies',
'Before/after transformations',
'ROI success stories',
'Customer interview series',
'Implementation timelines'
]
},
{
'id': 4,
'name': 'Product Education',
'description': 'Educate on product features and benefits',
'target_keywords': ['product features', 'benefits', 'use cases', 'comparison'],
'content_types': ['Product demos', 'Feature guides', 'Comparison content'],
'success_metrics': ['Product adoption', 'Trial conversions', 'Feature usage'],
'content_ideas': [
'Feature deep-dive tutorials',
'Use case demonstrations',
'Product comparison guides',
'Integration tutorials',
'Advanced tips and tricks'
]
}
]
return pillars
except Exception as e:
logger.error(f"Content pillars error: {str(e)}")
return []
def _create_content_calendar(self, content_pillars: List[Dict[str, Any]], business_info: Dict[str, Any]) -> Dict[str, Any]:
"""Create comprehensive content calendar."""
timeline = business_info.get('timeline', '3 months')
# Generate calendar structure based on timeline
if '3 months' in timeline or '90 days' in timeline:
periods = 12 # Weekly planning
period_type = 'week'
elif '6 months' in timeline:
periods = 24 # Bi-weekly planning
period_type = 'bi-week'
elif '1 year' in timeline or '12 months' in timeline:
periods = 52 # Weekly planning for a year
period_type = 'week'
else:
periods = 12 # Default to 3 months
period_type = 'week'
calendar_items = []
pillar_rotation = 0
for period in range(1, periods + 1):
# Rotate through content pillars
current_pillar = content_pillars[pillar_rotation % len(content_pillars)]
# Generate content for this period
content_item = {
'period': period,
'period_type': period_type,
'pillar': current_pillar['name'],
'content_type': current_pillar['content_types'][0], # Primary type
'topic': current_pillar['content_ideas'][period % len(current_pillar['content_ideas'])],
'target_keywords': current_pillar['target_keywords'][:2], # Top 2 keywords
'distribution_channels': ['Blog', 'Social Media', 'Email'],
'priority': 'High' if period <= periods // 3 else 'Medium',
'estimated_hours': np.random.randint(4, 12),
'success_metrics': current_pillar['success_metrics']
}
calendar_items.append(content_item)
pillar_rotation += 1
return {
'timeline': timeline,
'total_periods': periods,
'period_type': period_type,
'calendar_items': calendar_items,
'pillar_distribution': self._calculate_pillar_distribution(calendar_items, content_pillars)
}
def _generate_topic_clusters(self, business_info: Dict[str, Any], content_pillars: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
"""Generate SEO topic clusters."""
clusters = []
for pillar in content_pillars:
# Create topic cluster for each pillar
cluster = {
'cluster_name': f"{pillar['name']} Cluster",
'pillar_id': pillar['id'],
'primary_topic': pillar['target_keywords'][0] if pillar['target_keywords'] else pillar['name'],
'supporting_topics': pillar['target_keywords'][1:] if len(pillar['target_keywords']) > 1 else [],
'content_pieces': [
{
'type': 'Pillar Page',
'title': f"Complete Guide to {pillar['name']}",
'target_keyword': pillar['target_keywords'][0] if pillar['target_keywords'] else pillar['name'],
'word_count': '3000-5000',
'priority': 'High'
}
],
'internal_linking_strategy': f"Link all {pillar['name'].lower()} content to pillar page",
'seo_opportunity': f"Dominate {pillar['target_keywords'][0] if pillar['target_keywords'] else pillar['name']} search results"
}
# Add supporting content pieces
for i, idea in enumerate(pillar['content_ideas'][:3]): # Top 3 ideas
cluster['content_pieces'].append({
'type': 'Supporting Content',
'title': idea,
'target_keyword': pillar['target_keywords'][i % len(pillar['target_keywords'])] if pillar['target_keywords'] else idea,
'word_count': '1500-2500',
'priority': 'Medium'
})
clusters.append(cluster)
return clusters
def _create_distribution_strategy(self, business_info: Dict[str, Any]) -> Dict[str, Any]:
"""Create content distribution strategy."""
return {
'primary_channels': [
{
'channel': 'Company Blog',
'content_types': ['Long-form articles', 'Guides', 'Case studies'],
'frequency': 'Weekly',
'audience_reach': 'High',
'seo_value': 'High'
},
{
'channel': 'LinkedIn',
'content_types': ['Professional insights', 'Industry news', 'Thought leadership'],
'frequency': 'Daily',
'audience_reach': 'Medium',
'seo_value': 'Medium'
},
{
'channel': 'Email Newsletter',
'content_types': ['Curated insights', 'Product updates', 'Educational content'],
'frequency': 'Bi-weekly',
'audience_reach': 'High',
'seo_value': 'Low'
}
],
'secondary_channels': [
{
'channel': 'YouTube',
'content_types': ['Tutorial videos', 'Webinars', 'Product demos'],
'frequency': 'Bi-weekly',
'audience_reach': 'Medium',
'seo_value': 'High'
},
{
'channel': 'Industry Publications',
'content_types': ['Guest articles', 'Expert quotes', 'Research insights'],
'frequency': 'Monthly',
'audience_reach': 'Medium',
'seo_value': 'High'
}
],
'repurposing_strategy': {
'blog_post_to_social': 'Extract key insights for LinkedIn posts',
'long_form_to_video': 'Create video summaries of detailed guides',
'case_study_to_multiple': 'Create infographics, social posts, and email content',
'webinar_to_content': 'Extract blog posts, social content, and email series'
}
}
def _create_kpi_framework(self, business_info: Dict[str, Any]) -> Dict[str, Any]:
"""Create KPI measurement framework."""
return {
'primary_kpis': [
{
'metric': 'Organic Traffic Growth',
'target': '25% increase per quarter',
'measurement': 'Google Analytics',
'frequency': 'Monthly'
},
{
'metric': 'Lead Generation',
'target': '50 qualified leads per month',
'measurement': 'CRM tracking',
'frequency': 'Weekly'
},
{
'metric': 'Brand Awareness',
'target': '15% increase in brand mentions',
'measurement': 'Social listening tools',
'frequency': 'Monthly'
}
],
'content_kpis': [
{
'metric': 'Content Engagement',
'target': '5% average engagement rate',
'measurement': 'Social media analytics',
'frequency': 'Weekly'
},
{
'metric': 'Content Shares',
'target': '100 shares per piece',
'measurement': 'Social sharing tracking',
'frequency': 'Per content piece'
},
{
'metric': 'Time on Page',
'target': '3+ minutes average',
'measurement': 'Google Analytics',
'frequency': 'Monthly'
}
],
'seo_kpis': [
{
'metric': 'Keyword Rankings',
'target': 'Top 10 for 20 target keywords',
'measurement': 'SEO tools',
'frequency': 'Weekly'
},
{
'metric': 'Backlink Growth',
'target': '10 quality backlinks per month',
'measurement': 'Backlink analysis tools',
'frequency': 'Monthly'
}
]
}
def _create_implementation_roadmap(self, business_info: Dict[str, Any]) -> Dict[str, Any]:
"""Create implementation roadmap."""
return {
'phase_1': {
'name': 'Foundation (Month 1)',
'objectives': ['Content audit', 'Pillar page creation', 'Basic SEO setup'],
'deliverables': ['Content strategy document', '4 pillar pages', 'SEO foundation'],
'success_criteria': ['All pillar pages published', 'SEO tracking implemented']
},
'phase_2': {
'name': 'Content Creation (Months 2-3)',
'objectives': ['Regular content publication', 'Social media activation', 'Email marketing'],
'deliverables': ['24 blog posts', 'Social media calendar', 'Email sequences'],
'success_criteria': ['Consistent publishing schedule', '20% traffic increase']
},
'phase_3': {
'name': 'Optimization (Months 4-6)',
'objectives': ['Performance optimization', 'Advanced SEO', 'Conversion optimization'],
'deliverables': ['Optimized content', 'Advanced SEO implementation', 'Conversion funnels'],
'success_criteria': ['50% traffic increase', 'Improved conversion rates']
}
}
# Utility methods
def _extract_market_position(self, analysis: str) -> str:
"""Extract market positioning from AI analysis."""
return "Market positioning insights extracted from AI analysis"
def _extract_content_gaps(self, analysis: str) -> List[str]:
"""Extract content gaps from AI analysis."""
return ["Educational content gap", "Technical documentation gap", "Case study gap"]
def _extract_competitive_advantages(self, analysis: str) -> List[str]:
"""Extract competitive advantages from AI analysis."""
return ["Unique technology approach", "Industry expertise", "Customer success focus"]
def _extract_audience_insights(self, analysis: str) -> Dict[str, Any]:
"""Extract audience insights from AI analysis."""
return {
'pain_points': ["Complex implementation", "Limited resources", "ROI concerns"],
'content_preferences': ["Visual content", "Step-by-step guides", "Real examples"],
'consumption_patterns': ["Mobile-first", "Video preferred", "Quick consumption"]
}
def _calculate_pillar_distribution(self, calendar_items: List[Dict[str, Any]], content_pillars: List[Dict[str, Any]]) -> Dict[str, int]:
"""Calculate content distribution across pillars."""
distribution = {}
for pillar in content_pillars:
count = len([item for item in calendar_items if item['pillar'] == pillar['name']])
distribution[pillar['name']] = count
return distribution
def _generate_strategic_insights(self, business_info: Dict[str, Any], content_pillars: List[Dict[str, Any]]) -> Dict[str, Any]:
"""Generate strategic insights and recommendations."""
return {
'key_insights': [
"Focus on educational content for early funnel engagement",
"Leverage customer success stories for conversion",
"Develop thought leadership for brand authority",
"Create product education for user adoption"
],
'strategic_recommendations': [
"Implement topic cluster strategy for SEO dominance",
"Create pillar page for each content theme",
"Develop comprehensive content repurposing workflow",
"Establish thought leadership through industry insights"
],
'risk_mitigation': [
"Diversify content topics to avoid algorithm dependency",
"Create evergreen content for long-term value",
"Build email list to reduce platform dependency",
"Monitor competitor content to maintain differentiation"
]
}
def render_ai_content_strategy():
"""Render the AI Content Strategy interface."""
st.title("🧠 AI Content Strategy Generator")
st.markdown("**Generate comprehensive content strategies powered by AI intelligence**")
# Configuration form
st.header("📋 Business Information")
with st.form("content_strategy_form"):
col1, col2 = st.columns(2)
with col1:
industry = st.selectbox(
"Industry",
[
"Technology & Software",
"Marketing & Advertising",
"Healthcare",
"Finance & Fintech",
"E-commerce",
"Education",
"Manufacturing",
"Professional Services",
"Other"
],
index=0
)
target_audience = st.text_area(
"Target Audience",
placeholder="Describe your ideal customers, their roles, challenges, and goals...",
height=100
)
business_goals = st.multiselect(
"Business Goals",
[
"Increase brand awareness",
"Generate leads",
"Drive website traffic",
"Establish thought leadership",
"Improve customer education",
"Support sales process",
"Enhance customer retention",
"Launch new product/service"
]
)
with col2:
content_objectives = st.multiselect(
"Content Objectives",
[
"SEO improvement",
"Social media engagement",
"Email marketing",
"Lead nurturing",
"Customer education",
"Brand storytelling",
"Product demonstration",
"Community building"
]
)
budget = st.selectbox(
"Monthly Content Budget",
[
"No budget",
"Under $1,000",
"$1,000 - $5,000",
"$5,000 - $10,000",
"$10,000 - $25,000",
"$25,000+"
]
)
timeline = st.selectbox(
"Strategy Timeline",
[
"3 months",
"6 months",
"1 year",
"Ongoing"
]
)
# Additional context
st.subheader("Additional Context")
current_challenges = st.text_area(
"Current Content Challenges",
placeholder="What content challenges are you currently facing?",
height=80
)
competitive_landscape = st.text_area(
"Competitive Landscape",
placeholder="Describe your main competitors and their content approach...",
height=80
)
submit_strategy = st.form_submit_button("🧠 Generate AI Content Strategy", type="primary")
# Process strategy generation
if submit_strategy:
if target_audience and business_goals and content_objectives:
# Prepare business information
business_info = {
'industry': industry,
'target_audience': target_audience,
'business_goals': business_goals,
'content_objectives': content_objectives,
'budget': budget,
'timeline': timeline,
'current_challenges': current_challenges,
'competitive_landscape': competitive_landscape
}
# Initialize generator
if 'strategy_generator' not in st.session_state:
st.session_state.strategy_generator = AIContentStrategyGenerator()
generator = st.session_state.strategy_generator
with st.spinner("🧠 Generating AI-powered content strategy..."):
strategy_results = generator.generate_content_strategy(business_info)
if 'error' not in strategy_results:
st.success("✅ Content strategy generated successfully!")
# Store results in session state
st.session_state.strategy_results = strategy_results
# Display results
render_strategy_results_dashboard(strategy_results)
else:
st.error(f"❌ Strategy generation failed: {strategy_results['error']}")
else:
st.warning("⚠️ Please fill in target audience, business goals, and content objectives.")
# Show previous results if available
elif 'strategy_results' in st.session_state:
st.info("🧠 Showing previous strategy results")
render_strategy_results_dashboard(st.session_state.strategy_results)
def render_strategy_results_dashboard(results: Dict[str, Any]):
"""Render comprehensive strategy results dashboard."""
# Strategy overview
st.header("📊 Content Strategy Overview")
business_analysis = results.get('business_analysis', {})
content_pillars = results.get('content_pillars', [])
content_calendar = results.get('content_calendar', {})
# Key metrics overview
col1, col2, col3, col4 = st.columns(4)
with col1:
st.metric("Content Pillars", len(content_pillars))
with col2:
calendar_items = content_calendar.get('calendar_items', [])
st.metric("Content Pieces", len(calendar_items))
with col3:
timeline = content_calendar.get('timeline', 'Not specified')
st.metric("Timeline", timeline)
with col4:
total_hours = sum(item.get('estimated_hours', 0) for item in calendar_items)
st.metric("Est. Hours", f"{total_hours}h")
# Strategy tabs
tab1, tab2, tab3, tab4, tab5, tab6 = st.tabs([
"🧠 AI Insights",
"🏛️ Content Pillars",
"📅 Content Calendar",
"🎯 Topic Clusters",
"📢 Distribution",
"📊 Implementation"
])
with tab1:
if business_analysis:
st.subheader("Business Analysis & Insights")
# Market positioning
market_position = business_analysis.get('market_position', '')
if market_position:
st.markdown("#### 🎯 Market Positioning")
st.info(market_position)
# Content gaps
content_gaps = business_analysis.get('content_gaps', [])
if content_gaps:
st.markdown("#### 🔍 Content Gaps Identified")
for gap in content_gaps:
st.warning(f"📌 {gap}")
# Competitive advantages
advantages = business_analysis.get('competitive_advantages', [])
if advantages:
st.markdown("#### 🏆 Competitive Advantages")
for advantage in advantages:
st.success(f"{advantage}")
# AI insights
ai_insights = results.get('ai_insights', {})
if ai_insights:
st.markdown("#### 🧠 Strategic AI Insights")
insights = ai_insights.get('key_insights', [])
for insight in insights:
st.info(f"💡 {insight}")
recommendations = ai_insights.get('strategic_recommendations', [])
if recommendations:
st.markdown("#### 🎯 Strategic Recommendations")
for rec in recommendations:
st.success(f"📋 {rec}")
with tab2:
if content_pillars:
st.subheader("Content Pillars Strategy")
# Pillars overview chart
pillar_names = [pillar['name'] for pillar in content_pillars]
pillar_ideas = [len(pillar['content_ideas']) for pillar in content_pillars]
fig = px.bar(
x=pillar_names,
y=pillar_ideas,
title="Content Ideas per Pillar",
labels={'x': 'Content Pillars', 'y': 'Number of Ideas'}
)
st.plotly_chart(fig, use_container_width=True)
# Detailed pillar information
for pillar in content_pillars:
with st.expander(f"🏛️ {pillar['name']}", expanded=False):
st.markdown(f"**Description:** {pillar['description']}")
col1, col2 = st.columns(2)
with col1:
st.markdown("**Target Keywords:**")
for keyword in pillar['target_keywords']:
st.code(keyword)
st.markdown("**Content Types:**")
for content_type in pillar['content_types']:
st.write(f"{content_type}")
with col2:
st.markdown("**Success Metrics:**")
for metric in pillar['success_metrics']:
st.write(f"📊 {metric}")
st.markdown("**Content Ideas:**")
for idea in pillar['content_ideas']:
st.write(f"💡 {idea}")
with tab3:
if content_calendar:
st.subheader("Content Calendar & Planning")
calendar_items = content_calendar.get('calendar_items', [])
if calendar_items:
# Calendar overview
df_calendar = pd.DataFrame(calendar_items)
# Priority distribution
priority_counts = df_calendar['priority'].value_counts()
fig_priority = px.pie(
values=priority_counts.values,
names=priority_counts.index,
title="Content Priority Distribution"
)
st.plotly_chart(fig_priority, use_container_width=True)
# Content calendar table
st.markdown("#### 📅 Detailed Content Calendar")
display_df = df_calendar[[
'period', 'pillar', 'content_type', 'topic',
'priority', 'estimated_hours'
]].copy()
display_df.columns = [
'Period', 'Pillar', 'Content Type', 'Topic',
'Priority', 'Est. Hours'
]
st.dataframe(
display_df,
column_config={
"Priority": st.column_config.SelectboxColumn(
"Priority",
options=["High", "Medium", "Low"]
),
"Est. Hours": st.column_config.NumberColumn(
"Est. Hours",
format="%d h"
)
},
hide_index=True,
use_container_width=True
)
# Export calendar
csv = df_calendar.to_csv(index=False)
st.download_button(
label="📥 Download Content Calendar",
data=csv,
file_name=f"content_calendar_{datetime.now().strftime('%Y%m%d')}.csv",
mime="text/csv"
)
with tab4:
topic_clusters = results.get('topic_clusters', [])
if topic_clusters:
st.subheader("SEO Topic Clusters")
for cluster in topic_clusters:
with st.expander(f"🎯 {cluster['cluster_name']}", expanded=False):
col1, col2 = st.columns(2)
with col1:
st.markdown(f"**Primary Topic:** {cluster['primary_topic']}")
st.markdown(f"**SEO Opportunity:** {cluster['seo_opportunity']}")
st.markdown(f"**Linking Strategy:** {cluster['internal_linking_strategy']}")
with col2:
st.markdown("**Supporting Topics:**")
for topic in cluster['supporting_topics']:
st.code(topic)
st.markdown("**Content Pieces:**")
content_pieces = cluster['content_pieces']
df_pieces = pd.DataFrame(content_pieces)
st.dataframe(df_pieces, hide_index=True, use_container_width=True)
with tab5:
distribution_strategy = results.get('distribution_strategy', {})
if distribution_strategy:
st.subheader("Content Distribution Strategy")
# Primary channels
primary_channels = distribution_strategy.get('primary_channels', [])
if primary_channels:
st.markdown("#### 📢 Primary Distribution Channels")
df_primary = pd.DataFrame(primary_channels)
st.dataframe(df_primary, hide_index=True, use_container_width=True)
# Secondary channels
secondary_channels = distribution_strategy.get('secondary_channels', [])
if secondary_channels:
st.markdown("#### 📺 Secondary Distribution Channels")
df_secondary = pd.DataFrame(secondary_channels)
st.dataframe(df_secondary, hide_index=True, use_container_width=True)
# Repurposing strategy
repurposing = distribution_strategy.get('repurposing_strategy', {})
if repurposing:
st.markdown("#### ♻️ Content Repurposing Strategy")
for strategy, description in repurposing.items():
st.write(f"**{strategy.replace('_', ' ').title()}:** {description}")
with tab6:
# Implementation roadmap
roadmap = results.get('implementation_roadmap', {})
kpi_framework = results.get('kpi_framework', {})
if roadmap:
st.subheader("Implementation Roadmap")
for phase_key, phase_data in roadmap.items():
with st.expander(f"📋 {phase_data['name']}", expanded=False):
st.markdown(f"**Objectives:**")
for objective in phase_data['objectives']:
st.write(f"{objective}")
st.markdown(f"**Deliverables:**")
for deliverable in phase_data['deliverables']:
st.write(f"📦 {deliverable}")
st.markdown(f"**Success Criteria:**")
for criteria in phase_data['success_criteria']:
st.write(f"{criteria}")
if kpi_framework:
st.subheader("KPI Framework")
# Primary KPIs
primary_kpis = kpi_framework.get('primary_kpis', [])
if primary_kpis:
st.markdown("#### 🎯 Primary KPIs")
df_primary_kpis = pd.DataFrame(primary_kpis)
st.dataframe(df_primary_kpis, hide_index=True, use_container_width=True)
# Content KPIs
content_kpis = kpi_framework.get('content_kpis', [])
if content_kpis:
st.markdown("#### 📝 Content KPIs")
df_content_kpis = pd.DataFrame(content_kpis)
st.dataframe(df_content_kpis, hide_index=True, use_container_width=True)
# Export functionality
st.markdown("---")
col1, col2, col3 = st.columns(3)
with col1:
if st.button("📥 Export Full Strategy", use_container_width=True):
strategy_json = json.dumps(results, indent=2, default=str)
st.download_button(
label="Download JSON Strategy",
data=strategy_json,
file_name=f"content_strategy_{datetime.now().strftime('%Y%m%d_%H%M%S')}.json",
mime="application/json"
)
with col2:
if st.button("📊 Export Calendar", use_container_width=True):
calendar_items = content_calendar.get('calendar_items', [])
if calendar_items:
df_calendar = pd.DataFrame(calendar_items)
csv = df_calendar.to_csv(index=False)
st.download_button(
label="Download CSV Calendar",
data=csv,
file_name=f"content_calendar_{datetime.now().strftime('%Y%m%d_%H%M%S')}.csv",
mime="text/csv"
)
with col3:
if st.button("🔄 Generate New Strategy", use_container_width=True):
if 'strategy_results' in st.session_state:
del st.session_state.strategy_results
st.rerun()
# Main execution
if __name__ == "__main__":
render_ai_content_strategy()

View File

@@ -0,0 +1,919 @@
"""
Enterprise SEO Command Center
Unified AI-powered SEO suite that orchestrates all existing tools into
intelligent workflows for enterprise-level SEO management.
"""
import streamlit as st
import asyncio
import pandas as pd
from typing import Dict, Any, List, Optional, Tuple
from datetime import datetime, timedelta
import json
from loguru import logger
# Import existing SEO tools
from .on_page_seo_analyzer import fetch_seo_data
from .content_gap_analysis.enhanced_analyzer import EnhancedContentGapAnalyzer
from .technical_seo_crawler.crawler import TechnicalSEOCrawler
from .weburl_seo_checker import url_seo_checker
from .google_pagespeed_insights import google_pagespeed_insights
from ..gpt_providers.text_generation.main_text_generation import llm_text_gen
# Import the new enterprise tools
from .google_search_console_integration import GoogleSearchConsoleAnalyzer, render_gsc_integration
from .ai_content_strategy import AIContentStrategyGenerator, render_ai_content_strategy
class EnterpriseSEOSuite:
"""
Enterprise-level SEO suite orchestrating all tools into intelligent workflows.
"""
def __init__(self):
"""Initialize the enterprise SEO suite."""
self.gap_analyzer = EnhancedContentGapAnalyzer()
self.technical_crawler = TechnicalSEOCrawler()
# Initialize new enterprise tools
self.gsc_analyzer = GoogleSearchConsoleAnalyzer()
self.content_strategy_generator = AIContentStrategyGenerator()
# SEO workflow templates
self.workflow_templates = {
'complete_audit': 'Complete SEO Audit',
'content_strategy': 'Content Strategy Development',
'technical_optimization': 'Technical SEO Optimization',
'competitor_intelligence': 'Competitive Intelligence',
'keyword_domination': 'Keyword Domination Strategy',
'local_seo': 'Local SEO Optimization',
'enterprise_monitoring': 'Enterprise SEO Monitoring'
}
logger.info("Enterprise SEO Suite initialized")
async def execute_complete_seo_audit(self, website_url: str, competitors: List[str],
target_keywords: List[str]) -> Dict[str, Any]:
"""
Execute a comprehensive enterprise SEO audit combining all tools.
Args:
website_url: Primary website to audit
competitors: List of competitor URLs (max 5)
target_keywords: Primary keywords to optimize for
Returns:
Comprehensive audit results with prioritized action plan
"""
try:
st.info("🚀 Initiating Complete Enterprise SEO Audit...")
audit_results = {
'audit_timestamp': datetime.utcnow().isoformat(),
'website_url': website_url,
'competitors': competitors[:5],
'target_keywords': target_keywords,
'technical_audit': {},
'content_analysis': {},
'competitive_intelligence': {},
'on_page_analysis': {},
'performance_metrics': {},
'strategic_recommendations': {},
'priority_action_plan': []
}
# Phase 1: Technical SEO Audit
with st.expander("🔧 Technical SEO Analysis", expanded=True):
st.info("Analyzing technical SEO factors...")
technical_results = await self._run_technical_audit(website_url)
audit_results['technical_audit'] = technical_results
st.success("✅ Technical audit completed")
# Phase 2: Content Gap Analysis
with st.expander("📊 Content Intelligence Analysis", expanded=True):
st.info("Analyzing content gaps and opportunities...")
content_results = await self._run_content_analysis(
website_url, competitors, target_keywords
)
audit_results['content_analysis'] = content_results
st.success("✅ Content analysis completed")
# Phase 3: On-Page SEO Analysis
with st.expander("🔍 On-Page SEO Analysis", expanded=True):
st.info("Analyzing on-page SEO factors...")
onpage_results = await self._run_onpage_analysis(website_url)
audit_results['on_page_analysis'] = onpage_results
st.success("✅ On-page analysis completed")
# Phase 4: Performance Analysis
with st.expander("⚡ Performance Analysis", expanded=True):
st.info("Analyzing website performance...")
performance_results = await self._run_performance_analysis(website_url)
audit_results['performance_metrics'] = performance_results
st.success("✅ Performance analysis completed")
# Phase 5: AI-Powered Strategic Recommendations
with st.expander("🤖 AI Strategic Analysis", expanded=True):
st.info("Generating AI-powered strategic recommendations...")
strategic_analysis = await self._generate_strategic_recommendations(audit_results)
audit_results['strategic_recommendations'] = strategic_analysis
# Generate prioritized action plan
action_plan = await self._create_priority_action_plan(audit_results)
audit_results['priority_action_plan'] = action_plan
st.success("✅ Strategic analysis completed")
return audit_results
except Exception as e:
error_msg = f"Error in complete SEO audit: {str(e)}"
logger.error(error_msg, exc_info=True)
st.error(error_msg)
return {'error': error_msg}
async def _run_technical_audit(self, website_url: str) -> Dict[str, Any]:
"""Run comprehensive technical SEO audit."""
try:
# Use existing technical crawler
technical_results = self.technical_crawler.analyze_website_technical_seo(
website_url, crawl_depth=3, max_pages=100
)
# Enhance with additional technical checks
enhanced_results = {
'crawler_results': technical_results,
'critical_issues': self._identify_critical_technical_issues(technical_results),
'performance_score': self._calculate_technical_score(technical_results),
'priority_fixes': self._prioritize_technical_fixes(technical_results)
}
return enhanced_results
except Exception as e:
logger.error(f"Technical audit error: {str(e)}")
return {'error': str(e)}
async def _run_content_analysis(self, website_url: str, competitors: List[str],
keywords: List[str]) -> Dict[str, Any]:
"""Run comprehensive content gap analysis."""
try:
# Use existing content gap analyzer
content_results = self.gap_analyzer.analyze_comprehensive_gap(
website_url, competitors, keywords, industry="general"
)
# Enhance with content strategy insights
enhanced_results = {
'gap_analysis': content_results,
'content_opportunities': self._identify_content_opportunities(content_results),
'keyword_strategy': self._develop_keyword_strategy(content_results),
'competitive_advantages': self._find_competitive_advantages(content_results)
}
return enhanced_results
except Exception as e:
logger.error(f"Content analysis error: {str(e)}")
return {'error': str(e)}
async def _run_onpage_analysis(self, website_url: str) -> Dict[str, Any]:
"""Run on-page SEO analysis."""
try:
# Use existing on-page analyzer
onpage_data = fetch_seo_data(website_url)
# Enhanced analysis
enhanced_results = {
'seo_data': onpage_data,
'optimization_score': self._calculate_onpage_score(onpage_data),
'meta_optimization': self._analyze_meta_optimization(onpage_data),
'content_optimization': self._analyze_content_optimization(onpage_data)
}
return enhanced_results
except Exception as e:
logger.error(f"On-page analysis error: {str(e)}")
return {'error': str(e)}
async def _run_performance_analysis(self, website_url: str) -> Dict[str, Any]:
"""Run website performance analysis."""
try:
# Comprehensive performance metrics
performance_results = {
'core_web_vitals': await self._analyze_core_web_vitals(website_url),
'loading_performance': await self._analyze_loading_performance(website_url),
'mobile_optimization': await self._analyze_mobile_optimization(website_url),
'performance_score': 0 # Will be calculated
}
# Calculate overall performance score
performance_results['performance_score'] = self._calculate_performance_score(
performance_results
)
return performance_results
except Exception as e:
logger.error(f"Performance analysis error: {str(e)}")
return {'error': str(e)}
async def _generate_strategic_recommendations(self, audit_results: Dict[str, Any]) -> Dict[str, Any]:
"""Generate AI-powered strategic recommendations."""
try:
# Compile audit summary for AI analysis
audit_summary = {
'technical_score': audit_results.get('technical_audit', {}).get('performance_score', 0),
'content_gaps': len(audit_results.get('content_analysis', {}).get('content_opportunities', [])),
'onpage_score': audit_results.get('on_page_analysis', {}).get('optimization_score', 0),
'performance_score': audit_results.get('performance_metrics', {}).get('performance_score', 0)
}
strategic_prompt = f"""
Analyze this comprehensive SEO audit and provide strategic recommendations:
AUDIT SUMMARY:
- Technical SEO Score: {audit_summary['technical_score']}/100
- Content Gaps Identified: {audit_summary['content_gaps']}
- On-Page SEO Score: {audit_summary['onpage_score']}/100
- Performance Score: {audit_summary['performance_score']}/100
DETAILED FINDINGS:
Technical Issues: {json.dumps(audit_results.get('technical_audit', {}), indent=2)[:1000]}
Content Opportunities: {json.dumps(audit_results.get('content_analysis', {}), indent=2)[:1000]}
Provide strategic recommendations in these categories:
1. IMMEDIATE WINS (0-30 days):
- Quick technical fixes with high impact
- Content optimizations for existing pages
- Critical performance improvements
2. STRATEGIC INITIATIVES (1-3 months):
- Content strategy development
- Technical architecture improvements
- Competitive positioning strategies
3. LONG-TERM GROWTH (3-12 months):
- Authority building strategies
- Market expansion opportunities
- Advanced SEO techniques
4. RISK MITIGATION:
- Technical vulnerabilities to address
- Content gaps that competitors could exploit
- Performance issues affecting user experience
Provide specific, actionable recommendations with expected impact and effort estimates.
"""
strategic_analysis = llm_text_gen(
strategic_prompt,
system_prompt="You are an enterprise SEO strategist with 10+ years of experience. Provide detailed, actionable recommendations based on comprehensive audit data."
)
return {
'full_analysis': strategic_analysis,
'immediate_wins': self._extract_immediate_wins(strategic_analysis),
'strategic_initiatives': self._extract_strategic_initiatives(strategic_analysis),
'long_term_growth': self._extract_long_term_growth(strategic_analysis),
'risk_mitigation': self._extract_risk_mitigation(strategic_analysis)
}
except Exception as e:
logger.error(f"Strategic analysis error: {str(e)}")
return {'error': str(e)}
async def _create_priority_action_plan(self, audit_results: Dict[str, Any]) -> List[Dict[str, Any]]:
"""Create prioritized action plan from audit results."""
try:
action_plan = []
# Extract recommendations from all analysis phases
strategic_recs = audit_results.get('strategic_recommendations', {})
# Immediate wins (High priority, low effort)
immediate_wins = strategic_recs.get('immediate_wins', [])
for win in immediate_wins[:5]:
action_plan.append({
'category': 'Immediate Win',
'priority': 'Critical',
'effort': 'Low',
'timeframe': '0-30 days',
'action': win,
'expected_impact': 'High',
'source': 'Strategic Analysis'
})
# Technical fixes
technical_issues = audit_results.get('technical_audit', {}).get('critical_issues', [])
for issue in technical_issues[:3]:
action_plan.append({
'category': 'Technical SEO',
'priority': 'High',
'effort': 'Medium',
'timeframe': '1-4 weeks',
'action': issue,
'expected_impact': 'High',
'source': 'Technical Audit'
})
# Content opportunities
content_ops = audit_results.get('content_analysis', {}).get('content_opportunities', [])
for opportunity in content_ops[:3]:
action_plan.append({
'category': 'Content Strategy',
'priority': 'Medium',
'effort': 'High',
'timeframe': '2-8 weeks',
'action': opportunity,
'expected_impact': 'Medium',
'source': 'Content Analysis'
})
# Sort by priority and expected impact
priority_order = {'Critical': 0, 'High': 1, 'Medium': 2, 'Low': 3}
action_plan.sort(key=lambda x: priority_order.get(x['priority'], 4))
return action_plan[:15] # Top 15 actions
except Exception as e:
logger.error(f"Action plan creation error: {str(e)}")
return []
# Utility methods for analysis
def _identify_critical_technical_issues(self, technical_results: Dict[str, Any]) -> List[str]:
"""Identify critical technical SEO issues."""
critical_issues = []
# Add logic to identify critical technical issues
# This would analyze the technical_results and extract critical problems
return critical_issues
def _calculate_technical_score(self, technical_results: Dict[str, Any]) -> int:
"""Calculate technical SEO score."""
# Implement scoring algorithm based on technical audit results
return 75 # Placeholder
def _prioritize_technical_fixes(self, technical_results: Dict[str, Any]) -> List[str]:
"""Prioritize technical fixes by impact and effort."""
# Implement prioritization logic
return ["Fix broken links", "Optimize images", "Improve page speed"]
def _identify_content_opportunities(self, content_results: Dict[str, Any]) -> List[str]:
"""Identify top content opportunities."""
# Extract content opportunities from gap analysis
return ["Create FAQ content", "Develop comparison guides", "Write how-to articles"]
def _develop_keyword_strategy(self, content_results: Dict[str, Any]) -> Dict[str, Any]:
"""Develop keyword strategy from content analysis."""
return {
'primary_keywords': [],
'secondary_keywords': [],
'long_tail_opportunities': [],
'competitor_gaps': []
}
def _find_competitive_advantages(self, content_results: Dict[str, Any]) -> List[str]:
"""Find competitive advantages from analysis."""
return ["Unique content angles", "Underserved niches", "Technical superiority"]
def _calculate_onpage_score(self, onpage_data: Dict[str, Any]) -> int:
"""Calculate on-page SEO score."""
return 80 # Placeholder
def _analyze_meta_optimization(self, onpage_data: Dict[str, Any]) -> Dict[str, Any]:
"""Analyze meta tag optimization."""
return {'title_optimization': 'good', 'description_optimization': 'needs_work'}
def _analyze_content_optimization(self, onpage_data: Dict[str, Any]) -> Dict[str, Any]:
"""Analyze content optimization."""
return {'keyword_density': 'optimal', 'content_length': 'adequate'}
async def _analyze_core_web_vitals(self, website_url: str) -> Dict[str, Any]:
"""Analyze Core Web Vitals."""
return {'lcp': 2.5, 'fid': 100, 'cls': 0.1}
async def _analyze_loading_performance(self, website_url: str) -> Dict[str, Any]:
"""Analyze loading performance."""
return {'ttfb': 200, 'fcp': 1.5, 'speed_index': 3.0}
async def _analyze_mobile_optimization(self, website_url: str) -> Dict[str, Any]:
"""Analyze mobile optimization."""
return {'mobile_friendly': True, 'responsive_design': True}
def _calculate_performance_score(self, performance_results: Dict[str, Any]) -> int:
"""Calculate overall performance score."""
return 85 # Placeholder
def _extract_immediate_wins(self, analysis: str) -> List[str]:
"""Extract immediate wins from strategic analysis."""
# Parse the AI analysis and extract immediate wins
lines = analysis.split('\n')
wins = []
in_immediate_section = False
for line in lines:
if 'IMMEDIATE WINS' in line.upper():
in_immediate_section = True
continue
elif 'STRATEGIC INITIATIVES' in line.upper():
in_immediate_section = False
continue
if in_immediate_section and line.strip().startswith('-'):
wins.append(line.strip().lstrip('- '))
return wins[:5]
def _extract_strategic_initiatives(self, analysis: str) -> List[str]:
"""Extract strategic initiatives from analysis."""
# Similar extraction logic for strategic initiatives
return ["Develop content hub", "Implement schema markup", "Build authority pages"]
def _extract_long_term_growth(self, analysis: str) -> List[str]:
"""Extract long-term growth strategies."""
return ["Market expansion", "Authority building", "Advanced technical SEO"]
def _extract_risk_mitigation(self, analysis: str) -> List[str]:
"""Extract risk mitigation strategies."""
return ["Fix technical vulnerabilities", "Address content gaps", "Improve performance"]
def execute_content_strategy_workflow(self, business_info: Dict[str, Any]) -> Dict[str, Any]:
"""
Execute comprehensive content strategy workflow using AI insights.
Args:
business_info: Business context and objectives
Returns:
Complete content strategy with implementation plan
"""
try:
st.info("🧠 Executing AI-powered content strategy workflow...")
# Generate AI content strategy
content_strategy = self.content_strategy_generator.generate_content_strategy(business_info)
# If GSC data is available, enhance with search insights
if business_info.get('gsc_site_url'):
gsc_insights = self.gsc_analyzer.analyze_search_performance(
business_info['gsc_site_url'],
business_info.get('gsc_date_range', 90)
)
content_strategy['gsc_insights'] = gsc_insights
# Generate SEO-optimized content recommendations
seo_content_recs = self._generate_seo_content_recommendations(content_strategy)
content_strategy['seo_recommendations'] = seo_content_recs
return content_strategy
except Exception as e:
logger.error(f"Content strategy workflow error: {str(e)}")
return {'error': str(e)}
def execute_search_intelligence_workflow(self, site_url: str, date_range: int = 90) -> Dict[str, Any]:
"""
Execute comprehensive search intelligence workflow using GSC data.
Args:
site_url: Website URL registered in GSC
date_range: Analysis period in days
Returns:
Complete search intelligence analysis with actionable insights
"""
try:
st.info("📊 Executing search intelligence workflow...")
# Analyze GSC performance
gsc_analysis = self.gsc_analyzer.analyze_search_performance(site_url, date_range)
# Enhance with technical SEO analysis
technical_analysis = self.technical_crawler.crawl_and_analyze(site_url)
gsc_analysis['technical_insights'] = technical_analysis
# Generate content gap analysis based on GSC keywords
if gsc_analysis.get('keyword_analysis'):
keywords = [kw['keyword'] for kw in gsc_analysis['keyword_analysis'].get('high_volume_keywords', [])]
content_gaps = self.gap_analyzer.analyze_content_gaps(
keywords[:10], # Top 10 keywords
site_url
)
gsc_analysis['content_gap_analysis'] = content_gaps
# Generate comprehensive recommendations
search_recommendations = self._generate_search_intelligence_recommendations(gsc_analysis)
gsc_analysis['comprehensive_recommendations'] = search_recommendations
return gsc_analysis
except Exception as e:
logger.error(f"Search intelligence workflow error: {str(e)}")
return {'error': str(e)}
def _generate_seo_content_recommendations(self, content_strategy: Dict[str, Any]) -> Dict[str, Any]:
"""Generate SEO-optimized content recommendations based on strategy."""
try:
content_pillars = content_strategy.get('content_pillars', [])
seo_recommendations = {
'keyword_optimization': [],
'content_structure': [],
'internal_linking': [],
'technical_seo': []
}
for pillar in content_pillars:
# Keyword optimization recommendations
for keyword in pillar.get('target_keywords', []):
seo_recommendations['keyword_optimization'].append({
'pillar': pillar['name'],
'keyword': keyword,
'recommendation': f"Create comprehensive content targeting '{keyword}' with semantic variations",
'priority': 'High' if keyword in pillar['target_keywords'][:2] else 'Medium'
})
# Content structure recommendations
seo_recommendations['content_structure'].append({
'pillar': pillar['name'],
'recommendation': f"Create pillar page for {pillar['name']} with supporting cluster content",
'structure': 'Pillar + Cluster model'
})
# Internal linking strategy
seo_recommendations['internal_linking'] = [
"Link all cluster content to relevant pillar pages",
"Create topic-based internal linking structure",
"Use contextual anchor text with target keywords",
"Implement breadcrumb navigation for topic clusters"
]
# Technical SEO recommendations
seo_recommendations['technical_seo'] = [
"Optimize page speed for all content pages",
"Implement structured data for articles",
"Create XML sitemap sections for content categories",
"Optimize images with descriptive alt text"
]
return seo_recommendations
except Exception as e:
logger.error(f"SEO content recommendations error: {str(e)}")
return {'error': str(e)}
def _generate_search_intelligence_recommendations(self, gsc_analysis: Dict[str, Any]) -> Dict[str, Any]:
"""Generate comprehensive recommendations from search intelligence analysis."""
try:
recommendations = {
'immediate_actions': [],
'content_opportunities': [],
'technical_improvements': [],
'strategic_initiatives': []
}
# Extract content opportunities from GSC analysis
content_opps = gsc_analysis.get('content_opportunities', [])
for opp in content_opps[:5]: # Top 5 opportunities
recommendations['content_opportunities'].append({
'type': opp['type'],
'keyword': opp['keyword'],
'action': opp['opportunity'],
'priority': opp['priority'],
'estimated_impact': opp['potential_impact']
})
# Technical improvements from analysis
technical_insights = gsc_analysis.get('technical_insights', {})
if technical_insights.get('crawl_issues_indicators'):
for issue in technical_insights['crawl_issues_indicators']:
recommendations['technical_improvements'].append({
'issue': issue,
'priority': 'High',
'category': 'Crawl & Indexing'
})
# Immediate actions based on performance
performance = gsc_analysis.get('performance_overview', {})
if performance.get('avg_ctr', 0) < 2:
recommendations['immediate_actions'].append({
'action': 'Improve meta descriptions and titles for better CTR',
'expected_impact': 'Increase CTR by 1-2%',
'timeline': '2-4 weeks'
})
if performance.get('avg_position', 0) > 10:
recommendations['immediate_actions'].append({
'action': 'Focus on improving content quality for top keywords',
'expected_impact': 'Improve average position by 2-5 ranks',
'timeline': '4-8 weeks'
})
# Strategic initiatives
competitive_analysis = gsc_analysis.get('competitive_analysis', {})
if competitive_analysis.get('market_position') in ['Challenger', 'Emerging Player']:
recommendations['strategic_initiatives'].append({
'initiative': 'Develop thought leadership content strategy',
'goal': 'Improve market position and brand authority',
'timeline': '3-6 months'
})
return recommendations
except Exception as e:
logger.error(f"Search intelligence recommendations error: {str(e)}")
return {'error': str(e)}
def render_enterprise_seo_suite():
"""Render the Enterprise SEO Command Center interface."""
st.set_page_config(
page_title="Enterprise SEO Command Center",
page_icon="🚀",
layout="wide"
)
st.title("🚀 Enterprise SEO Command Center")
st.markdown("**Unified AI-powered SEO suite orchestrating all tools into intelligent workflows**")
# Initialize suite
if 'enterprise_seo_suite' not in st.session_state:
st.session_state.enterprise_seo_suite = EnterpriseSEOSuite()
suite = st.session_state.enterprise_seo_suite
# Workflow selection
st.sidebar.header("🎯 SEO Workflow Selection")
selected_workflow = st.sidebar.selectbox(
"Choose Workflow",
list(suite.workflow_templates.keys()),
format_func=lambda x: suite.workflow_templates[x]
)
# Main workflow interface
if selected_workflow == 'complete_audit':
st.header("🔍 Complete Enterprise SEO Audit")
render_complete_audit_interface(suite)
elif selected_workflow == 'content_strategy':
st.header("📊 Content Strategy Development")
render_content_strategy_interface(suite)
elif selected_workflow == 'technical_optimization':
st.header("🔧 Technical SEO Optimization")
render_technical_optimization_interface(suite)
else:
st.info(f"Workflow '{suite.workflow_templates[selected_workflow]}' is being developed.")
def render_complete_audit_interface(suite: EnterpriseSEOSuite):
"""Render the complete audit workflow interface."""
# Input form
with st.form("enterprise_audit_form"):
col1, col2 = st.columns(2)
with col1:
website_url = st.text_input(
"Website URL",
value="https://example.com",
help="Enter your website URL for comprehensive analysis"
)
target_keywords = st.text_area(
"Target Keywords (one per line)",
value="AI content creation\nSEO tools\ncontent optimization",
help="Enter your primary keywords to optimize for"
)
with col2:
competitors = st.text_area(
"Competitor URLs (one per line)",
value="https://jasper.ai\nhttps://copy.ai\nhttps://writesonic.com",
help="Enter up to 5 competitor URLs for analysis"
)
submit_audit = st.form_submit_button("🚀 Start Complete SEO Audit", type="primary")
# Process audit
if submit_audit:
if website_url and target_keywords:
# Parse inputs
keywords_list = [k.strip() for k in target_keywords.split('\n') if k.strip()]
competitors_list = [c.strip() for c in competitors.split('\n') if c.strip()]
# Run audit
with st.spinner("🔍 Running comprehensive SEO audit..."):
audit_results = asyncio.run(
suite.execute_complete_seo_audit(
website_url, competitors_list, keywords_list
)
)
if 'error' not in audit_results:
st.success("✅ Enterprise SEO audit completed!")
# Display results dashboard
render_audit_results_dashboard(audit_results)
else:
st.error(f"❌ Audit failed: {audit_results['error']}")
else:
st.warning("⚠️ Please enter website URL and target keywords.")
def render_audit_results_dashboard(results: Dict[str, Any]):
"""Render comprehensive audit results dashboard."""
# Priority Action Plan (Most Important)
st.header("📋 Priority Action Plan")
action_plan = results.get('priority_action_plan', [])
if action_plan:
# Display as interactive table
df_actions = pd.DataFrame(action_plan)
# Style the dataframe
st.dataframe(
df_actions,
column_config={
"category": "Category",
"priority": st.column_config.SelectboxColumn(
"Priority",
options=["Critical", "High", "Medium", "Low"]
),
"effort": "Effort Level",
"timeframe": "Timeline",
"action": "Action Required",
"expected_impact": "Expected Impact"
},
hide_index=True,
use_container_width=True
)
# Key Metrics Overview
st.header("📊 SEO Health Dashboard")
col1, col2, col3, col4 = st.columns(4)
with col1:
technical_score = results.get('technical_audit', {}).get('performance_score', 0)
st.metric("Technical SEO", f"{technical_score}/100", delta=None)
with col2:
onpage_score = results.get('on_page_analysis', {}).get('optimization_score', 0)
st.metric("On-Page SEO", f"{onpage_score}/100", delta=None)
with col3:
performance_score = results.get('performance_metrics', {}).get('performance_score', 0)
st.metric("Performance", f"{performance_score}/100", delta=None)
with col4:
content_gaps = len(results.get('content_analysis', {}).get('content_opportunities', []))
st.metric("Content Opportunities", content_gaps, delta=None)
# Detailed Analysis Sections
tab1, tab2, tab3, tab4, tab5 = st.tabs([
"🤖 Strategic Insights",
"🔧 Technical Analysis",
"📊 Content Intelligence",
"🔍 On-Page Analysis",
"⚡ Performance Metrics"
])
with tab1:
strategic_recs = results.get('strategic_recommendations', {})
if strategic_recs:
st.subheader("AI-Powered Strategic Recommendations")
# Immediate wins
immediate_wins = strategic_recs.get('immediate_wins', [])
if immediate_wins:
st.markdown("#### 🚀 Immediate Wins (0-30 days)")
for win in immediate_wins[:5]:
st.success(f"{win}")
# Strategic initiatives
strategic_initiatives = strategic_recs.get('strategic_initiatives', [])
if strategic_initiatives:
st.markdown("#### 📈 Strategic Initiatives (1-3 months)")
for initiative in strategic_initiatives[:3]:
st.info(f"📋 {initiative}")
# Full analysis
full_analysis = strategic_recs.get('full_analysis', '')
if full_analysis:
with st.expander("🧠 Complete Strategic Analysis"):
st.write(full_analysis)
with tab2:
technical_audit = results.get('technical_audit', {})
if technical_audit:
st.subheader("Technical SEO Analysis")
critical_issues = technical_audit.get('critical_issues', [])
if critical_issues:
st.markdown("#### ⚠️ Critical Issues")
for issue in critical_issues:
st.error(f"🚨 {issue}")
priority_fixes = technical_audit.get('priority_fixes', [])
if priority_fixes:
st.markdown("#### 🔧 Priority Fixes")
for fix in priority_fixes:
st.warning(f"🛠️ {fix}")
with tab3:
content_analysis = results.get('content_analysis', {})
if content_analysis:
st.subheader("Content Intelligence")
content_opportunities = content_analysis.get('content_opportunities', [])
if content_opportunities:
st.markdown("#### 📝 Content Opportunities")
for opportunity in content_opportunities[:5]:
st.info(f"💡 {opportunity}")
competitive_advantages = content_analysis.get('competitive_advantages', [])
if competitive_advantages:
st.markdown("#### 🏆 Competitive Advantages")
for advantage in competitive_advantages:
st.success(f"{advantage}")
with tab4:
onpage_analysis = results.get('on_page_analysis', {})
if onpage_analysis:
st.subheader("On-Page SEO Analysis")
meta_optimization = onpage_analysis.get('meta_optimization', {})
content_optimization = onpage_analysis.get('content_optimization', {})
col1, col2 = st.columns(2)
with col1:
st.markdown("#### 🏷️ Meta Tag Optimization")
st.json(meta_optimization)
with col2:
st.markdown("#### 📄 Content Optimization")
st.json(content_optimization)
with tab5:
performance_metrics = results.get('performance_metrics', {})
if performance_metrics:
st.subheader("Performance Analysis")
core_vitals = performance_metrics.get('core_web_vitals', {})
loading_performance = performance_metrics.get('loading_performance', {})
col1, col2 = st.columns(2)
with col1:
st.markdown("#### ⚡ Core Web Vitals")
st.json(core_vitals)
with col2:
st.markdown("#### 🚀 Loading Performance")
st.json(loading_performance)
# Export functionality
st.markdown("---")
col1, col2, col3 = st.columns(3)
with col1:
if st.button("📥 Export Full Report", use_container_width=True):
# Create downloadable report
report_json = json.dumps(results, indent=2, default=str)
st.download_button(
label="Download JSON Report",
data=report_json,
file_name=f"seo_audit_{datetime.now().strftime('%Y%m%d_%H%M%S')}.json",
mime="application/json"
)
with col2:
if st.button("📊 Export Action Plan", use_container_width=True):
# Create CSV of action plan
df_actions = pd.DataFrame(action_plan)
csv = df_actions.to_csv(index=False)
st.download_button(
label="Download CSV Action Plan",
data=csv,
file_name=f"action_plan_{datetime.now().strftime('%Y%m%d_%H%M%S')}.csv",
mime="text/csv"
)
with col3:
if st.button("🔄 Schedule Follow-up Audit", use_container_width=True):
st.info("Follow-up scheduling feature coming soon!")
def render_content_strategy_interface(suite: EnterpriseSEOSuite):
"""Render content strategy development interface."""
st.info("🚧 Content Strategy Development workflow coming soon!")
def render_technical_optimization_interface(suite: EnterpriseSEOSuite):
"""Render technical optimization interface."""
st.info("🚧 Technical SEO Optimization workflow coming soon!")
# Main execution
if __name__ == "__main__":
render_enterprise_seo_suite()

View File

@@ -0,0 +1,135 @@
import requests
import streamlit as st
import json
import pandas as pd
import plotly.express as px
from tenacity import retry, stop_after_attempt, wait_random_exponential
from datetime import datetime
def run_pagespeed(url, api_key=None, strategy='DESKTOP', locale='en'):
"""Fetches and processes PageSpeed Insights data."""
serviceurl = 'https://www.googleapis.com/pagespeedonline/v5/runPagespeed'
base_url = f"{serviceurl}?url={url}&strategy={strategy}&locale={locale}&category=performance&category=accessibility&category=best-practices&category=seo"
if api_key:
base_url += f"&key={api_key}"
try:
response = requests.get(base_url)
response.raise_for_status() # Raise an exception for bad status codes
data = response.json()
return data
except requests.exceptions.RequestException as e:
st.error(f"Error fetching PageSpeed Insights data: {e}")
return None
def display_results(data):
"""Presents PageSpeed Insights data in a user-friendly format."""
st.subheader("PageSpeed Insights Report")
# Extract scores from the PageSpeed Insights data
scores = {
"Performance": data['lighthouseResult']['categories']['performance']['score'] * 100,
"Accessibility": data['lighthouseResult']['categories']['accessibility']['score'] * 100,
"SEO": data['lighthouseResult']['categories']['seo']['score'] * 100,
"Best Practices": data['lighthouseResult']['categories']['best-practices']['score'] * 100
}
descriptions = {
"Performance": data['lighthouseResult']['categories']['performance'].get('description', "This score represents Google's assessment of your page's speed. A higher percentage indicates better performance."),
"Accessibility": data['lighthouseResult']['categories']['accessibility'].get('description', "This score evaluates how accessible your page is to users with disabilities. A higher percentage means better accessibility."),
"SEO": data['lighthouseResult']['categories']['seo'].get('description', "This score measures how well your page is optimized for search engines. A higher percentage indicates better SEO practices."),
"Best Practices": data['lighthouseResult']['categories']['best-practices'].get('description', "This score reflects how well your page follows best practices for web development. A higher percentage signifies adherence to best practices.")
}
for category, score in scores.items():
st.metric(label=f"Overall {category} Score", value=f"{score:.0f}%", help=descriptions[category])
# Display additional metrics
st.subheader("Additional Metrics")
additional_metrics = {
"First Contentful Paint (FCP)": data['lighthouseResult']['audits']['first-contentful-paint']['displayValue'],
"Largest Contentful Paint (LCP)": data['lighthouseResult']['audits']['largest-contentful-paint']['displayValue'],
"Time to Interactive (TTI)": data['lighthouseResult']['audits']['interactive']['displayValue'],
"Total Blocking Time (TBT)": data['lighthouseResult']['audits']['total-blocking-time']['displayValue'],
"Cumulative Layout Shift (CLS)": data['lighthouseResult']['audits']['cumulative-layout-shift']['displayValue']
}
st.table(pd.DataFrame(additional_metrics.items(), columns=["Metric", "Value"]))
# Display Network Requests
st.subheader("Network Requests")
if 'network-requests' in data['lighthouseResult']['audits']:
network_requests = [
{
"End Time": item.get("endTime", "N/A"),
"Start Time": item.get("startTime", "N/A"),
"Transfer Size (MB)": round(item.get("transferSize", 0) / 1048576, 2),
"Resource Size (MB)": round(item.get("resourceSize", 0) / 1048576, 2),
"URL": item.get("url", "N/A")
}
for item in data["lighthouseResult"]["audits"]["network-requests"]["details"]["items"]
if item.get("transferSize", 0) > 100000 or item.get("resourceSize", 0) > 100000
]
if network_requests:
st.dataframe(pd.DataFrame(network_requests), use_container_width=True)
else:
st.write("No significant network requests found.")
# Display Mainthread Work Breakdown
st.subheader("Mainthread Work Breakdown")
if 'mainthread-work-breakdown' in data['lighthouseResult']['audits']:
mainthread_data = [
{"Process": item.get("groupLabel", "N/A"), "Duration (ms)": item.get("duration", "N/A")}
for item in data["lighthouseResult"]["audits"]["mainthread-work-breakdown"]["details"]["items"] if item.get("duration", "N/A") != "N/A"
]
if mainthread_data:
fig = px.bar(pd.DataFrame(mainthread_data), x="Process", y="Duration (ms)", title="Mainthread Work Breakdown", labels={"Process": "Process", "Duration (ms)": "Duration (ms)"})
st.plotly_chart(fig, use_container_width=True)
else:
st.write("No significant main thread work breakdown data found.")
# Display other metrics
metrics = [
("Use of Passive Event Listeners", 'uses-passive-event-listeners', ["URL", "Code Line"]),
("DOM Size", 'dom-size', ["Score", "DOM Size"]),
("Offscreen Images", 'offscreen-images', ["URL", "Total Bytes", "Wasted Bytes", "Wasted Percentage"]),
("Critical Request Chains", 'critical-request-chains', ["URL", "Start Time", "End Time", "Transfer Size", "Chain"]),
("Total Bytes Weight", 'total-byte-weight', ["URL", "Total Bytes"]),
("Render Blocking Resources", 'render-blocking-resources', ["URL", "Total Bytes", "Wasted Milliseconds"]),
("Use of Rel Preload", 'uses-rel-preload', ["URL", "Wasted Milliseconds"])
]
for metric_title, audit_key, columns in metrics:
st.subheader(metric_title)
if audit_key in data['lighthouseResult']['audits']:
details = data['lighthouseResult']['audits'][audit_key].get("details", {}).get("items", [])
if details:
st.table(pd.DataFrame(details, columns=columns))
else:
st.write(f"No significant {metric_title.lower()} data found.")
def google_pagespeed_insights():
st.markdown("<h1 style='text-align: center; color: #1565C0;'>PageSpeed Insights Analyzer</h1>", unsafe_allow_html=True)
st.markdown("<h3 style='text-align: center;'>Get detailed insights into your website's performance! Powered by Google PageSpeed Insights <a href='https://developer.chrome.com/docs/lighthouse/overview/'>[Learn More]</a></h3>", unsafe_allow_html=True)
# User Input
with st.form("pagespeed_form"):
url = st.text_input("Enter Website URL", placeholder="https://www.example.com")
api_key = st.text_input("Enter Google API Key (Optional)", placeholder="Your API Key", help="Get your API key here: [https://developers.google.com/speed/docs/insights/v5/get-started#key]")
device = st.selectbox("Choose Device", ["Mobile", "Desktop"])
locale = st.selectbox("Choose Locale", ["en", "fr", "es", "de", "ja"])
categories = st.multiselect("Select Categories to Analyze", ['PERFORMANCE', 'ACCESSIBILITY', 'BEST_PRACTICES', 'SEO'], default=['PERFORMANCE', 'ACCESSIBILITY', 'BEST_PRACTICES', 'SEO'])
submitted = st.form_submit_button("Analyze")
if submitted:
if not url:
st.error("Please provide the website URL.")
else:
strategy = 'mobile' if device == "Mobile" else 'desktop'
data = run_pagespeed(url, api_key, strategy=strategy, locale=locale)
if data:
display_results(data)
else:
st.error("Failed to retrieve PageSpeed Insights data.")

View File

@@ -0,0 +1,864 @@
"""
Google Search Console Integration for Enterprise SEO
Connects GSC data with AI-powered content strategy and keyword intelligence.
Provides enterprise-level search performance insights and content recommendations.
"""
import streamlit as st
import pandas as pd
import numpy as np
from typing import Dict, Any, List, Optional, Tuple
from datetime import datetime, timedelta
import json
from loguru import logger
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
# Import AI modules
from ..gpt_providers.text_generation.main_text_generation import llm_text_gen
class GoogleSearchConsoleAnalyzer:
"""
Enterprise Google Search Console analyzer with AI-powered insights.
"""
def __init__(self):
"""Initialize the GSC analyzer."""
self.gsc_client = None # Will be initialized when credentials are provided
logger.info("Google Search Console Analyzer initialized")
def analyze_search_performance(self, site_url: str, date_range: int = 90) -> Dict[str, Any]:
"""
Analyze comprehensive search performance from GSC data.
Args:
site_url: Website URL registered in GSC
date_range: Number of days to analyze (default 90)
Returns:
Comprehensive search performance analysis
"""
try:
st.info("📊 Analyzing Google Search Console data...")
# Simulate GSC data for demonstration (replace with actual GSC API calls)
search_data = self._get_mock_gsc_data(site_url, date_range)
# Perform comprehensive analysis
analysis_results = {
'site_url': site_url,
'analysis_period': f"Last {date_range} days",
'analysis_timestamp': datetime.utcnow().isoformat(),
'performance_overview': self._analyze_performance_overview(search_data),
'keyword_analysis': self._analyze_keyword_performance(search_data),
'page_analysis': self._analyze_page_performance(search_data),
'content_opportunities': self._identify_content_opportunities(search_data),
'technical_insights': self._analyze_technical_seo_signals(search_data),
'competitive_analysis': self._analyze_competitive_position(search_data),
'ai_recommendations': self._generate_ai_recommendations(search_data)
}
return analysis_results
except Exception as e:
error_msg = f"Error analyzing search performance: {str(e)}"
logger.error(error_msg, exc_info=True)
return {'error': error_msg}
def _get_mock_gsc_data(self, site_url: str, days: int) -> Dict[str, pd.DataFrame]:
"""
Generate mock GSC data for demonstration.
In production, this would fetch real data from GSC API.
"""
# Generate mock keyword data
keywords_data = []
sample_keywords = [
"AI content creation", "SEO tools", "content optimization", "blog writing AI",
"meta description generator", "keyword research", "technical SEO", "content strategy",
"on-page optimization", "SERP analysis", "content gap analysis", "SEO audit"
]
for keyword in sample_keywords:
# Generate realistic performance data
impressions = np.random.randint(100, 10000)
clicks = int(impressions * np.random.uniform(0.02, 0.15)) # CTR between 2-15%
position = np.random.uniform(3, 25)
keywords_data.append({
'keyword': keyword,
'impressions': impressions,
'clicks': clicks,
'ctr': (clicks / impressions) * 100,
'position': position
})
# Generate mock page data
pages_data = []
sample_pages = [
"/blog/ai-content-creation-guide", "/tools/seo-analyzer", "/features/content-optimization",
"/blog/technical-seo-checklist", "/tools/keyword-research", "/blog/content-strategy-2024",
"/tools/meta-description-generator", "/blog/on-page-seo-guide", "/features/enterprise-seo"
]
for page in sample_pages:
impressions = np.random.randint(500, 5000)
clicks = int(impressions * np.random.uniform(0.03, 0.12))
position = np.random.uniform(5, 20)
pages_data.append({
'page': page,
'impressions': impressions,
'clicks': clicks,
'ctr': (clicks / impressions) * 100,
'position': position
})
# Generate time series data
time_series_data = []
for i in range(days):
date = datetime.now() - timedelta(days=i)
daily_clicks = np.random.randint(50, 500)
daily_impressions = np.random.randint(1000, 8000)
time_series_data.append({
'date': date.strftime('%Y-%m-%d'),
'clicks': daily_clicks,
'impressions': daily_impressions,
'ctr': (daily_clicks / daily_impressions) * 100,
'position': np.random.uniform(8, 15)
})
return {
'keywords': pd.DataFrame(keywords_data),
'pages': pd.DataFrame(pages_data),
'time_series': pd.DataFrame(time_series_data)
}
def _analyze_performance_overview(self, search_data: Dict[str, pd.DataFrame]) -> Dict[str, Any]:
"""Analyze overall search performance metrics."""
keywords_df = search_data['keywords']
time_series_df = search_data['time_series']
# Calculate totals and averages
total_clicks = keywords_df['clicks'].sum()
total_impressions = keywords_df['impressions'].sum()
avg_ctr = (total_clicks / total_impressions) * 100 if total_impressions > 0 else 0
avg_position = keywords_df['position'].mean()
# Calculate trends
recent_clicks = time_series_df.head(7)['clicks'].mean()
previous_clicks = time_series_df.tail(7)['clicks'].mean()
clicks_trend = ((recent_clicks - previous_clicks) / previous_clicks * 100) if previous_clicks > 0 else 0
recent_impressions = time_series_df.head(7)['impressions'].mean()
previous_impressions = time_series_df.tail(7)['impressions'].mean()
impressions_trend = ((recent_impressions - previous_impressions) / previous_impressions * 100) if previous_impressions > 0 else 0
# Top performing keywords
top_keywords = keywords_df.nlargest(5, 'clicks')[['keyword', 'clicks', 'impressions', 'position']].to_dict('records')
# Opportunity keywords (high impressions, low CTR)
opportunity_keywords = keywords_df[
(keywords_df['impressions'] > keywords_df['impressions'].median()) &
(keywords_df['ctr'] < 3)
].nlargest(5, 'impressions')[['keyword', 'impressions', 'ctr', 'position']].to_dict('records')
return {
'total_clicks': int(total_clicks),
'total_impressions': int(total_impressions),
'avg_ctr': round(avg_ctr, 2),
'avg_position': round(avg_position, 1),
'clicks_trend': round(clicks_trend, 1),
'impressions_trend': round(impressions_trend, 1),
'top_keywords': top_keywords,
'opportunity_keywords': opportunity_keywords
}
def _analyze_keyword_performance(self, search_data: Dict[str, pd.DataFrame]) -> Dict[str, Any]:
"""Analyze keyword performance and opportunities."""
keywords_df = search_data['keywords']
# Keyword categorization
high_volume_keywords = keywords_df[keywords_df['impressions'] > keywords_df['impressions'].quantile(0.8)]
low_competition_keywords = keywords_df[keywords_df['position'] <= 10]
optimization_opportunities = keywords_df[
(keywords_df['position'] > 10) &
(keywords_df['position'] <= 20) &
(keywords_df['impressions'] > 100)
]
# Content gap analysis
missing_keywords = self._identify_missing_keywords(keywords_df)
# Seasonal trends analysis
seasonal_insights = self._analyze_seasonal_trends(keywords_df)
return {
'total_keywords': len(keywords_df),
'high_volume_keywords': high_volume_keywords.to_dict('records'),
'ranking_keywords': low_competition_keywords.to_dict('records'),
'optimization_opportunities': optimization_opportunities.to_dict('records'),
'missing_keywords': missing_keywords,
'seasonal_insights': seasonal_insights,
'keyword_distribution': {
'positions_1_3': len(keywords_df[keywords_df['position'] <= 3]),
'positions_4_10': len(keywords_df[(keywords_df['position'] > 3) & (keywords_df['position'] <= 10)]),
'positions_11_20': len(keywords_df[(keywords_df['position'] > 10) & (keywords_df['position'] <= 20)]),
'positions_21_plus': len(keywords_df[keywords_df['position'] > 20])
}
}
def _analyze_page_performance(self, search_data: Dict[str, pd.DataFrame]) -> Dict[str, Any]:
"""Analyze page-level performance."""
pages_df = search_data['pages']
# Top performing pages
top_pages = pages_df.nlargest(10, 'clicks')
# Underperforming pages (high impressions, low clicks)
underperforming_pages = pages_df[
(pages_df['impressions'] > pages_df['impressions'].median()) &
(pages_df['ctr'] < 2)
].nlargest(5, 'impressions')
# Page type analysis
page_types = self._categorize_pages(pages_df)
return {
'top_pages': top_pages.to_dict('records'),
'underperforming_pages': underperforming_pages.to_dict('records'),
'page_types_performance': page_types,
'total_pages': len(pages_df)
}
def _identify_content_opportunities(self, search_data: Dict[str, pd.DataFrame]) -> List[Dict[str, Any]]:
"""Identify content creation and optimization opportunities."""
keywords_df = search_data['keywords']
opportunities = []
# High impression, low CTR keywords need content optimization
low_ctr_keywords = keywords_df[
(keywords_df['impressions'] > 500) &
(keywords_df['ctr'] < 3)
]
for _, keyword_row in low_ctr_keywords.iterrows():
opportunities.append({
'type': 'Content Optimization',
'keyword': keyword_row['keyword'],
'opportunity': f"Optimize existing content for '{keyword_row['keyword']}' to improve CTR from {keyword_row['ctr']:.1f}%",
'potential_impact': 'High',
'current_position': round(keyword_row['position'], 1),
'impressions': int(keyword_row['impressions']),
'priority': 'High' if keyword_row['impressions'] > 1000 else 'Medium'
})
# Position 11-20 keywords need content improvement
position_11_20 = keywords_df[
(keywords_df['position'] > 10) &
(keywords_df['position'] <= 20) &
(keywords_df['impressions'] > 100)
]
for _, keyword_row in position_11_20.iterrows():
opportunities.append({
'type': 'Content Enhancement',
'keyword': keyword_row['keyword'],
'opportunity': f"Enhance content for '{keyword_row['keyword']}' to move from position {keyword_row['position']:.1f} to first page",
'potential_impact': 'Medium',
'current_position': round(keyword_row['position'], 1),
'impressions': int(keyword_row['impressions']),
'priority': 'Medium'
})
# Sort by potential impact and impressions
opportunities = sorted(opportunities, key=lambda x: x['impressions'], reverse=True)
return opportunities[:10] # Top 10 opportunities
def _analyze_technical_seo_signals(self, search_data: Dict[str, pd.DataFrame]) -> Dict[str, Any]:
"""Analyze technical SEO signals from search data."""
keywords_df = search_data['keywords']
pages_df = search_data['pages']
# Analyze performance patterns that might indicate technical issues
technical_insights = {
'crawl_issues_indicators': [],
'mobile_performance': {},
'core_web_vitals_impact': {},
'indexing_insights': {}
}
# Identify potential crawl issues
very_low_impressions = keywords_df[keywords_df['impressions'] < 10]
if len(very_low_impressions) > len(keywords_df) * 0.3: # If 30%+ have very low impressions
technical_insights['crawl_issues_indicators'].append(
"High percentage of keywords with very low impressions may indicate crawl or indexing issues"
)
# Mobile performance indicators
avg_mobile_position = keywords_df['position'].mean() # In real implementation, this would be mobile-specific
technical_insights['mobile_performance'] = {
'avg_mobile_position': round(avg_mobile_position, 1),
'mobile_optimization_needed': avg_mobile_position > 15
}
return technical_insights
def _analyze_competitive_position(self, search_data: Dict[str, pd.DataFrame]) -> Dict[str, Any]:
"""Analyze competitive positioning based on search data."""
keywords_df = search_data['keywords']
# Calculate competitive metrics
dominant_keywords = len(keywords_df[keywords_df['position'] <= 3])
competitive_keywords = len(keywords_df[(keywords_df['position'] > 3) & (keywords_df['position'] <= 10)])
losing_keywords = len(keywords_df[keywords_df['position'] > 10])
competitive_strength = (dominant_keywords * 3 + competitive_keywords * 2 + losing_keywords * 1) / len(keywords_df)
return {
'dominant_keywords': dominant_keywords,
'competitive_keywords': competitive_keywords,
'losing_keywords': losing_keywords,
'competitive_strength_score': round(competitive_strength, 2),
'market_position': self._determine_market_position(competitive_strength)
}
def _generate_ai_recommendations(self, search_data: Dict[str, pd.DataFrame]) -> Dict[str, Any]:
"""Generate AI-powered recommendations based on search data."""
try:
keywords_df = search_data['keywords']
pages_df = search_data['pages']
# Prepare data summary for AI analysis
top_keywords = keywords_df.nlargest(5, 'impressions')['keyword'].tolist()
avg_position = keywords_df['position'].mean()
total_impressions = keywords_df['impressions'].sum()
total_clicks = keywords_df['clicks'].sum()
avg_ctr = (total_clicks / total_impressions * 100) if total_impressions > 0 else 0
# Create comprehensive prompt for AI analysis
ai_prompt = f"""
Analyze this Google Search Console data and provide strategic SEO recommendations:
SEARCH PERFORMANCE SUMMARY:
- Total Keywords Tracked: {len(keywords_df)}
- Total Impressions: {total_impressions:,}
- Total Clicks: {total_clicks:,}
- Average CTR: {avg_ctr:.2f}%
- Average Position: {avg_position:.1f}
TOP PERFORMING KEYWORDS:
{', '.join(top_keywords)}
PERFORMANCE DISTRIBUTION:
- Keywords ranking 1-3: {len(keywords_df[keywords_df['position'] <= 3])}
- Keywords ranking 4-10: {len(keywords_df[(keywords_df['position'] > 3) & (keywords_df['position'] <= 10)])}
- Keywords ranking 11-20: {len(keywords_df[(keywords_df['position'] > 10) & (keywords_df['position'] <= 20)])}
- Keywords ranking 21+: {len(keywords_df[keywords_df['position'] > 20])}
TOP PAGES BY TRAFFIC:
{pages_df.nlargest(3, 'clicks')['page'].tolist()}
Based on this data, provide:
1. IMMEDIATE OPTIMIZATION OPPORTUNITIES (0-30 days):
- Specific keywords to optimize for better CTR
- Pages that need content updates
- Quick technical wins
2. CONTENT STRATEGY RECOMMENDATIONS (1-3 months):
- New content topics based on keyword gaps
- Content enhancement priorities
- Internal linking opportunities
3. LONG-TERM SEO STRATEGY (3-12 months):
- Market expansion opportunities
- Authority building topics
- Competitive positioning strategies
4. TECHNICAL SEO PRIORITIES:
- Performance issues affecting rankings
- Mobile optimization needs
- Core Web Vitals improvements
Provide specific, actionable recommendations with expected impact and priority levels.
"""
ai_analysis = llm_text_gen(
ai_prompt,
system_prompt="You are an enterprise SEO strategist analyzing Google Search Console data. Provide specific, data-driven recommendations that will improve search performance."
)
return {
'full_analysis': ai_analysis,
'immediate_opportunities': self._extract_immediate_opportunities(ai_analysis),
'content_strategy': self._extract_content_strategy(ai_analysis),
'long_term_strategy': self._extract_long_term_strategy(ai_analysis),
'technical_priorities': self._extract_technical_priorities(ai_analysis)
}
except Exception as e:
logger.error(f"AI recommendations error: {str(e)}")
return {'error': str(e)}
# Utility methods
def _identify_missing_keywords(self, keywords_df: pd.DataFrame) -> List[str]:
"""Identify potential missing keywords based on current keyword performance."""
# In a real implementation, this would use keyword research APIs
existing_keywords = set(keywords_df['keyword'].str.lower())
potential_keywords = [
"AI writing tools", "content automation", "SEO content generator",
"blog post optimizer", "meta tag generator", "keyword analyzer"
]
missing = [kw for kw in potential_keywords if kw.lower() not in existing_keywords]
return missing[:5]
def _analyze_seasonal_trends(self, keywords_df: pd.DataFrame) -> Dict[str, Any]:
"""Analyze seasonal trends in keyword performance."""
# Placeholder for seasonal analysis
return {
'seasonal_keywords': [],
'trend_analysis': "Seasonal analysis requires historical data spanning multiple seasons"
}
def _categorize_pages(self, pages_df: pd.DataFrame) -> Dict[str, Any]:
"""Categorize pages by type and analyze performance."""
page_types = {
'Blog Posts': {'count': 0, 'total_clicks': 0, 'avg_position': 0},
'Product Pages': {'count': 0, 'total_clicks': 0, 'avg_position': 0},
'Tool Pages': {'count': 0, 'total_clicks': 0, 'avg_position': 0},
'Other': {'count': 0, 'total_clicks': 0, 'avg_position': 0}
}
for _, page_row in pages_df.iterrows():
page_url = page_row['page']
clicks = page_row['clicks']
position = page_row['position']
if '/blog/' in page_url:
page_types['Blog Posts']['count'] += 1
page_types['Blog Posts']['total_clicks'] += clicks
page_types['Blog Posts']['avg_position'] += position
elif '/tools/' in page_url:
page_types['Tool Pages']['count'] += 1
page_types['Tool Pages']['total_clicks'] += clicks
page_types['Tool Pages']['avg_position'] += position
elif '/features/' in page_url or '/product/' in page_url:
page_types['Product Pages']['count'] += 1
page_types['Product Pages']['total_clicks'] += clicks
page_types['Product Pages']['avg_position'] += position
else:
page_types['Other']['count'] += 1
page_types['Other']['total_clicks'] += clicks
page_types['Other']['avg_position'] += position
# Calculate averages
for page_type in page_types:
if page_types[page_type]['count'] > 0:
page_types[page_type]['avg_position'] = round(
page_types[page_type]['avg_position'] / page_types[page_type]['count'], 1
)
return page_types
def _determine_market_position(self, competitive_strength: float) -> str:
"""Determine market position based on competitive strength score."""
if competitive_strength >= 2.5:
return "Market Leader"
elif competitive_strength >= 2.0:
return "Strong Competitor"
elif competitive_strength >= 1.5:
return "Emerging Player"
else:
return "Challenger"
def _extract_immediate_opportunities(self, analysis: str) -> List[str]:
"""Extract immediate opportunities from AI analysis."""
lines = analysis.split('\n')
opportunities = []
in_immediate_section = False
for line in lines:
if 'IMMEDIATE OPTIMIZATION' in line.upper():
in_immediate_section = True
continue
elif 'CONTENT STRATEGY' in line.upper():
in_immediate_section = False
continue
if in_immediate_section and line.strip().startswith('-'):
opportunities.append(line.strip().lstrip('- '))
return opportunities[:5]
def _extract_content_strategy(self, analysis: str) -> List[str]:
"""Extract content strategy recommendations from AI analysis."""
return ["Develop topic clusters", "Create comparison content", "Build FAQ sections"]
def _extract_long_term_strategy(self, analysis: str) -> List[str]:
"""Extract long-term strategy from AI analysis."""
return ["Build domain authority", "Expand to new markets", "Develop thought leadership content"]
def _extract_technical_priorities(self, analysis: str) -> List[str]:
"""Extract technical priorities from AI analysis."""
return ["Improve page speed", "Optimize mobile experience", "Fix crawl errors"]
def render_gsc_integration():
"""Render the Google Search Console integration interface."""
st.title("📊 Google Search Console Intelligence")
st.markdown("**AI-powered insights from your Google Search Console data**")
# Initialize analyzer
if 'gsc_analyzer' not in st.session_state:
st.session_state.gsc_analyzer = GoogleSearchConsoleAnalyzer()
analyzer = st.session_state.gsc_analyzer
# Configuration section
st.header("🔧 Configuration")
with st.expander("📋 Setup Instructions", expanded=False):
st.markdown("""
### Setting up Google Search Console Integration
1. **Verify your website** in Google Search Console
2. **Enable the Search Console API** in Google Cloud Console
3. **Create service account credentials** and download the JSON file
4. **Upload credentials** using the file uploader below
📚 [Detailed Setup Guide](https://developers.google.com/webmaster-tools/search-console-api-original/v3/prereqs)
""")
# Input form
with st.form("gsc_analysis_form"):
col1, col2 = st.columns(2)
with col1:
site_url = st.text_input(
"Site URL",
value="https://example.com",
help="Enter your website URL as registered in Google Search Console"
)
date_range = st.selectbox(
"Analysis Period",
[30, 60, 90, 180],
index=2,
help="Number of days to analyze"
)
with col2:
# Credentials upload (placeholder)
credentials_file = st.file_uploader(
"GSC API Credentials (JSON)",
type=['json'],
help="Upload your Google Search Console API credentials file"
)
demo_mode = st.checkbox(
"Demo Mode",
value=True,
help="Use demo data for testing (no credentials needed)"
)
submit_analysis = st.form_submit_button("📊 Analyze Search Performance", type="primary")
# Process analysis
if submit_analysis:
if site_url and (demo_mode or credentials_file):
with st.spinner("📊 Analyzing Google Search Console data..."):
analysis_results = analyzer.analyze_search_performance(site_url, date_range)
if 'error' not in analysis_results:
st.success("✅ Search Console analysis completed!")
# Store results in session state
st.session_state.gsc_results = analysis_results
# Display results
render_gsc_results_dashboard(analysis_results)
else:
st.error(f"❌ Analysis failed: {analysis_results['error']}")
else:
st.warning("⚠️ Please enter site URL and upload credentials (or enable demo mode).")
# Show previous results if available
elif 'gsc_results' in st.session_state:
st.info("📊 Showing previous analysis results")
render_gsc_results_dashboard(st.session_state.gsc_results)
def render_gsc_results_dashboard(results: Dict[str, Any]):
"""Render comprehensive GSC analysis results."""
# Performance overview
st.header("📊 Search Performance Overview")
overview = results['performance_overview']
col1, col2, col3, col4 = st.columns(4)
with col1:
st.metric(
"Total Clicks",
f"{overview['total_clicks']:,}",
delta=f"{overview['clicks_trend']:+.1f}%" if overview['clicks_trend'] != 0 else None
)
with col2:
st.metric(
"Total Impressions",
f"{overview['total_impressions']:,}",
delta=f"{overview['impressions_trend']:+.1f}%" if overview['impressions_trend'] != 0 else None
)
with col3:
st.metric(
"Average CTR",
f"{overview['avg_ctr']:.2f}%"
)
with col4:
st.metric(
"Average Position",
f"{overview['avg_position']:.1f}"
)
# Content opportunities (Most important section)
st.header("🎯 Content Opportunities")
opportunities = results['content_opportunities']
if opportunities:
# Display as interactive table
df_opportunities = pd.DataFrame(opportunities)
st.dataframe(
df_opportunities,
column_config={
"type": "Opportunity Type",
"keyword": "Keyword",
"opportunity": "Description",
"potential_impact": st.column_config.SelectboxColumn(
"Impact",
options=["High", "Medium", "Low"]
),
"current_position": st.column_config.NumberColumn(
"Current Position",
format="%.1f"
),
"impressions": st.column_config.NumberColumn(
"Impressions",
format="%d"
),
"priority": st.column_config.SelectboxColumn(
"Priority",
options=["High", "Medium", "Low"]
)
},
hide_index=True,
use_container_width=True
)
# Detailed analysis tabs
tab1, tab2, tab3, tab4, tab5 = st.tabs([
"🤖 AI Insights",
"🎯 Keyword Analysis",
"📄 Page Performance",
"🏆 Competitive Position",
"🔧 Technical Signals"
])
with tab1:
ai_recs = results.get('ai_recommendations', {})
if ai_recs and 'error' not in ai_recs:
st.subheader("AI-Powered Recommendations")
# Immediate opportunities
immediate_ops = ai_recs.get('immediate_opportunities', [])
if immediate_ops:
st.markdown("#### 🚀 Immediate Optimizations (0-30 days)")
for op in immediate_ops:
st.success(f"{op}")
# Content strategy
content_strategy = ai_recs.get('content_strategy', [])
if content_strategy:
st.markdown("#### 📝 Content Strategy (1-3 months)")
for strategy in content_strategy:
st.info(f"📋 {strategy}")
# Full analysis
full_analysis = ai_recs.get('full_analysis', '')
if full_analysis:
with st.expander("🧠 Complete AI Analysis"):
st.write(full_analysis)
with tab2:
keyword_analysis = results.get('keyword_analysis', {})
if keyword_analysis:
st.subheader("Keyword Performance Analysis")
# Keyword distribution chart
dist = keyword_analysis['keyword_distribution']
fig = px.pie(
values=[dist['positions_1_3'], dist['positions_4_10'], dist['positions_11_20'], dist['positions_21_plus']],
names=['Positions 1-3', 'Positions 4-10', 'Positions 11-20', 'Positions 21+'],
title="Keyword Position Distribution"
)
st.plotly_chart(fig, use_container_width=True)
# High volume keywords
high_volume = keyword_analysis.get('high_volume_keywords', [])
if high_volume:
st.markdown("#### 📈 High Volume Keywords")
st.dataframe(pd.DataFrame(high_volume), hide_index=True)
# Optimization opportunities
opt_opportunities = keyword_analysis.get('optimization_opportunities', [])
if opt_opportunities:
st.markdown("#### 🎯 Optimization Opportunities (Positions 11-20)")
st.dataframe(pd.DataFrame(opt_opportunities), hide_index=True)
with tab3:
page_analysis = results.get('page_analysis', {})
if page_analysis:
st.subheader("Page Performance Analysis")
# Top pages
top_pages = page_analysis.get('top_pages', [])
if top_pages:
st.markdown("#### 🏆 Top Performing Pages")
st.dataframe(pd.DataFrame(top_pages), hide_index=True)
# Underperforming pages
underperforming = page_analysis.get('underperforming_pages', [])
if underperforming:
st.markdown("#### ⚠️ Underperforming Pages (High Impressions, Low CTR)")
st.dataframe(pd.DataFrame(underperforming), hide_index=True)
# Page types performance
page_types = page_analysis.get('page_types_performance', {})
if page_types:
st.markdown("#### 📊 Performance by Page Type")
# Create visualization
types = []
clicks = []
positions = []
for page_type, data in page_types.items():
if data['count'] > 0:
types.append(page_type)
clicks.append(data['total_clicks'])
positions.append(data['avg_position'])
if types:
col1, col2 = st.columns(2)
with col1:
fig_clicks = px.bar(x=types, y=clicks, title="Total Clicks by Page Type")
st.plotly_chart(fig_clicks, use_container_width=True)
with col2:
fig_position = px.bar(x=types, y=positions, title="Average Position by Page Type")
st.plotly_chart(fig_position, use_container_width=True)
with tab4:
competitive_analysis = results.get('competitive_analysis', {})
if competitive_analysis:
st.subheader("Competitive Position Analysis")
col1, col2 = st.columns(2)
with col1:
st.metric("Market Position", competitive_analysis['market_position'])
st.metric("Competitive Strength", f"{competitive_analysis['competitive_strength_score']}/3.0")
with col2:
# Competitive distribution
comp_data = {
'Dominant (1-3)': competitive_analysis['dominant_keywords'],
'Competitive (4-10)': competitive_analysis['competitive_keywords'],
'Losing (11+)': competitive_analysis['losing_keywords']
}
fig = px.bar(
x=list(comp_data.keys()),
y=list(comp_data.values()),
title="Keyword Competitive Position"
)
st.plotly_chart(fig, use_container_width=True)
with tab5:
technical_insights = results.get('technical_insights', {})
if technical_insights:
st.subheader("Technical SEO Signals")
# Crawl issues indicators
crawl_issues = technical_insights.get('crawl_issues_indicators', [])
if crawl_issues:
st.markdown("#### ⚠️ Potential Issues")
for issue in crawl_issues:
st.warning(f"🚨 {issue}")
# Mobile performance
mobile_perf = technical_insights.get('mobile_performance', {})
if mobile_perf:
st.markdown("#### 📱 Mobile Performance")
col1, col2 = st.columns(2)
with col1:
st.metric("Avg Mobile Position", f"{mobile_perf.get('avg_mobile_position', 0):.1f}")
with col2:
if mobile_perf.get('mobile_optimization_needed', False):
st.warning("📱 Mobile optimization needed")
else:
st.success("📱 Mobile performance good")
# Export functionality
st.markdown("---")
col1, col2, col3 = st.columns(3)
with col1:
if st.button("📥 Export Full Report", use_container_width=True):
report_json = json.dumps(results, indent=2, default=str)
st.download_button(
label="Download JSON Report",
data=report_json,
file_name=f"gsc_analysis_{datetime.now().strftime('%Y%m%d_%H%M%S')}.json",
mime="application/json"
)
with col2:
if st.button("📊 Export Opportunities", use_container_width=True):
if opportunities:
df_opportunities = pd.DataFrame(opportunities)
csv = df_opportunities.to_csv(index=False)
st.download_button(
label="Download CSV Opportunities",
data=csv,
file_name=f"content_opportunities_{datetime.now().strftime('%Y%m%d_%H%M%S')}.csv",
mime="text/csv"
)
with col3:
if st.button("🔄 Refresh Analysis", use_container_width=True):
# Clear cached results to force refresh
if 'gsc_results' in st.session_state:
del st.session_state.gsc_results
st.rerun()
# Main execution
if __name__ == "__main__":
render_gsc_integration()

View File

@@ -0,0 +1,112 @@
import streamlit as st
import base64
import requests
from PIL import Image
import os
def encode_image(image_path):
"""
Encodes an image to base64 format.
Args:
image_path (str): Path to the image file.
Returns:
str: Base64 encoded string of the image.
Raises:
ValueError: If the image path is invalid.
"""
safe_root = os.getenv('SAFE_ROOT_DIRECTORY', '/safe/root/directory') # Use an environment variable for the safe root directory
normalized_path = os.path.normpath(image_path)
if not normalized_path.startswith(safe_root):
raise ValueError("Invalid image path")
with open(normalized_path, "rb") as image_file:
return base64.b64encode(image_file.read()).decode('utf-8')
def get_image_description(image_path):
"""
Generates a description for the given image using an external API.
Args:
image_path (str): Path to the image file.
Returns:
str: Description of the image.
Raises:
ValueError: If the image path is invalid.
"""
safe_root = os.getenv('SAFE_ROOT_DIRECTORY', '/safe/root/directory') # Use an environment variable for the safe root directory
normalized_path = os.path.normpath(image_path)
if not normalized_path.startswith(safe_root):
raise ValueError("Invalid image path")
base64_image = encode_image(normalized_path)
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {os.getenv('OPENAI_API_KEY')}"
}
payload = {
"model": "gpt-4o-mini",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": """You are an SEO expert specializing in writing optimized Alt text for images.
Your goal is to create clear, descriptive, and concise Alt text that accurately represents
the content and context of the given image. Make sure your response is optimized for search engines and accessibility."""
},
{
"type": "image_url",
"image_url": {
"url": f"data:image/jpeg;base64,{base64_image}"
}
}
]
}
],
"max_tokens": 300
}
response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=payload)
response_data = response.json()
# Extract the content field from the response
content = response_data['choices'][0]['message']['content']
return content
def alt_text_gen():
"""
Streamlit app function to generate Alt text for an uploaded image.
"""
st.title("Image Description Generator")
image_path = st.text_input("Enter the full path of the image file", help="Provide the full path to a .jpg, .jpeg, or .png image file")
if image_path:
if os.path.exists(image_path) and image_path.lower().endswith(('jpg', 'jpeg', 'png')):
try:
image = Image.open(image_path)
st.image(image, caption='Uploaded Image', use_column_width=True)
if st.button("Get Image Alt Text"):
with st.spinner("Generating Alt Text..."):
try:
description = get_image_description(image_path)
st.success("Alt Text generated successfully!")
st.write("Alt Text:", description)
except Exception as e:
st.error(f"Error generating description: {e}")
except Exception as e:
st.error(f"Error processing image: {e}")
else:
st.error("Please enter a valid image file path ending with .jpg, .jpeg, or .png")
else:
st.info("Please enter the full path of an image file.")

View File

@@ -0,0 +1,110 @@
import os
import json
import streamlit as st
from tenacity import retry, stop_after_attempt, wait_random_exponential
from loguru import logger
import sys
from ..gpt_providers.text_generation.main_text_generation import llm_text_gen
def metadesc_generator_main():
"""
Streamlit app for generating SEO-optimized blog meta descriptions.
"""
st.title("✍️ Alwrity - AI Blog Meta Description Generator")
st.markdown(
"Create compelling, SEO-optimized meta descriptions in just a few clicks. Perfect for enhancing your blog's click-through rates!"
)
# Input section
with st.expander("**PRO-TIP** - Read the instructions below. 🚀", expanded=True):
col1, col2, _ = st.columns([5, 5, 0.5])
# Column 1: Keywords and Tone
with col1:
keywords = st.text_input(
"🔑 Target Keywords (comma-separated):",
placeholder="e.g., content marketing, SEO, social media, online business",
help="Enter your target keywords, separated by commas. 📝",
)
tone_options = ["General", "Informative", "Engaging", "Humorous", "Intriguing", "Playful"]
tone = st.selectbox(
"🎨 Desired Tone (optional):",
options=tone_options,
help="Choose the overall tone you want for your meta description. 🎭",
)
# Column 2: Search Intent and Language
with col2:
search_type = st.selectbox(
"🔍 Search Intent:",
("Informational Intent", "Commercial Intent", "Transactional Intent", "Navigational Intent"),
index=0,
)
language_options = ["English", "Spanish", "French", "German", "Other"]
language_choice = st.selectbox(
"🌐 Preferred Language:",
options=language_options,
help="Select the language for your meta description. 🗣️",
)
language = (
st.text_input(
"Specify Other Language:",
placeholder="e.g., Italian, Chinese",
help="Enter your preferred language. 🌍",
)
if language_choice == "Other"
else language_choice
)
# Generate Meta Description button
if st.button("**✨ Generate Meta Description ✨**"):
if not keywords.strip():
st.error("**🫣 Target Keywords are required! Please provide at least one keyword.**")
return
with st.spinner("Crafting your Meta descriptions... ⏳"):
blog_metadesc = generate_blog_metadesc(keywords, tone, search_type, language)
if blog_metadesc:
st.success("**🎉 Meta Descriptions Generated Successfully! 🚀**")
with st.expander("**Your SEO-Boosting Blog Meta Descriptions 🎆🎇**", expanded=True):
st.markdown(blog_metadesc)
else:
st.error("💥 **Failed to generate blog meta description. Please try again!**")
def generate_blog_metadesc(keywords, tone, search_type, language):
"""
Generate blog meta descriptions using LLM.
Args:
keywords (str): Comma-separated target keywords.
tone (str): Desired tone for the meta description.
search_type (str): Search intent type.
language (str): Preferred language for the description.
Returns:
str: Generated meta descriptions or error message.
"""
prompt = f"""
Craft 3 engaging and SEO-friendly meta descriptions for a blog post based on the following details:
Blog Post Keywords: {keywords}
Search Intent Type: {search_type}
Desired Tone: {tone}
Preferred Language: {language}
Output Format:
Respond with 3 compelling and concise meta descriptions, approximately 155-160 characters long, that incorporate the target keywords, reflect the blog post content, resonate with the target audience, and entice users to click through to read the full article.
"""
try:
return llm_text_gen(prompt)
except Exception as err:
logger.error(f"Error generating meta description: {err}")
st.error(f"💥 Error: Failed to generate response from LLM: {err}")
return None

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,129 @@
import streamlit as st
import requests
from bs4 import BeautifulSoup
from ..gpt_providers.text_generation.main_text_generation import llm_text_gen
def generate_og_tags(url, title_hint, description_hint, platform="General"):
"""
Generate Open Graph tags based on the provided URL, title hint, description hint, and platform.
Args:
url (str): The URL of the webpage.
title_hint (str): A hint for the title.
description_hint (str): A hint for the description.
platform (str): The platform for which to generate the tags (General, Facebook, or Twitter).
Returns:
str: The generated Open Graph tags or an error message.
"""
# Create a prompt for the text generation model
prompt = (
f"Generate Open Graph tags for the following page:\nURL: {url}\n"
f"Title hint: {title_hint}\nDescription hint: {description_hint}"
)
if platform == "Facebook":
prompt += "\nSpecifically for Facebook"
elif platform == "Twitter":
prompt += "\nSpecifically for Twitter"
try:
# Generate Open Graph tags using the text generation model
response = llm_text_gen(prompt)
return response
except Exception as err:
st.error(f"Failed to generate Open Graph tags: {err}")
return None
def extract_default_og_tags(url):
"""
Extract default Open Graph tags from the provided URL.
Args:
url (str): The URL of the webpage.
Returns:
tuple: A tuple containing the title, description, and image URL, or None in case of an error.
"""
try:
# Fetch the HTML content of the URL
response = requests.get(url)
response.raise_for_status()
# Parse the HTML content using BeautifulSoup
soup = BeautifulSoup(response.content, 'html.parser')
# Extract the title, description, and image URL
title = soup.find('title').text if soup.find('title') else None
description = soup.find('meta', attrs={'name': 'description'})['content'] if soup.find('meta', attrs={'name': 'description'}) else None
image_url = soup.find('meta', attrs={'property': 'og:image'})['content'] if soup.find('meta', attrs={'property': 'og:image'}) else None
return title, description, image_url
except requests.exceptions.RequestException as req_err:
st.error(f"Error fetching the URL: {req_err}")
return None, None, None
except Exception as err:
st.error(f"Error parsing the HTML content: {err}")
return None, None, None
def og_tag_generator():
"""Main function to run the Streamlit app."""
st.title("AI Open Graph Tag Generator")
# Platform selection
platform = st.selectbox(
"**Select the platform**",
["General", "Facebook", "Twitter"],
help="Choose the platform for which you want to generate Open Graph tags."
)
# URL input
url = st.text_input(
"**Enter the URL of the page to generate Open Graph tags for:**",
placeholder="e.g., https://example.com",
help="Provide the URL of the page you want to generate Open Graph tags for."
)
if url:
# Extract default Open Graph tags
title, description, image_url = extract_default_og_tags(url)
# Title hint input
title_hint = st.text_input(
"**Modify existing title or suggest a new one (optional):**",
value=title if title else "",
placeholder="e.g., Amazing Blog Post Title"
)
# Description hint input
description_hint = st.text_area(
"**Modify existing description or suggest a new one (optional):**",
value=description if description else "",
placeholder="e.g., This is a detailed description of the content."
)
# Image URL hint input
image_hint = st.text_input(
"**Use this image or suggest a new URL (optional):**",
value=image_url if image_url else "",
placeholder="e.g., https://example.com/image.jpg"
)
# Generate Open Graph tags
if st.button("Generate Open Graph Tags"):
with st.spinner("Generating Open Graph tags..."):
try:
og_tags = generate_og_tags(url, title_hint, description_hint, platform)
if og_tags:
st.success("Open Graph tags generated successfully!")
st.markdown(og_tags)
else:
st.error("Failed to generate Open Graph tags.")
except Exception as e:
st.error(f"Failed to generate Open Graph tags: {e}")
else:
st.info("Please enter a URL to generate Open Graph tags.")

View File

@@ -0,0 +1,2 @@
ogImage TBD

View File

@@ -0,0 +1,187 @@
import os
import sys
import tinify
from PIL import Image
from loguru import logger
from dotenv import load_dotenv
import streamlit as st
from tempfile import NamedTemporaryFile
# Load environment variables
load_dotenv()
# Set Tinyfy API key from environment variable
TINIFY_API_KEY = os.getenv('TINIFY_API_KEY')
if TINIFY_API_KEY:
tinify.key = TINIFY_API_KEY
def setup_logger() -> None:
"""Configure the logger."""
logger.remove()
logger.add(
sys.stdout,
colorize=True,
format="<level>{level}</level>|<green>{file}:{line}:{function}</green>| {message}"
)
setup_logger()
def compress_image(image: Image.Image, quality: int = 45, resize: tuple = None, preserve_exif: bool = False) -> Image.Image:
"""
Compress and optionally resize an image.
Args:
image (PIL.Image): Image object to compress.
quality (int): Quality of the output image (1-100).
resize (tuple): Tuple (width, height) to resize the image.
preserve_exif (bool): Preserve EXIF data if True.
Returns:
PIL.Image: The compressed and resized image object.
"""
try:
if image.mode == 'RGBA':
logger.info("Converting RGBA image to RGB.")
image = image.convert('RGB')
exif = image.info.get('exif') if preserve_exif and 'exif' in image.info else None
if resize:
image = image.resize(resize, Image.LANCZOS)
logger.info(f"Resized image to {resize}")
with NamedTemporaryFile(delete=False, suffix=".jpg") as temp_file:
temp_path = temp_file.name
try:
image.save(temp_path, optimize=True, quality=quality, exif=exif)
except Exception as exif_error:
logger.warning(f"Error saving image with EXIF: {exif_error}. Saving without EXIF.")
image.save(temp_path, optimize=True, quality=quality)
logger.info("Image compression successful.")
return Image.open(temp_path)
except Exception as e:
logger.error(f"Error compressing image: {e}")
st.error("Failed to compress the image. Please try again.")
return None
def convert_to_webp(image: Image.Image, image_path: str) -> str:
"""
Convert an image to WebP format.
Args:
image (PIL.Image): Image object to convert.
image_path (str): Path to save the WebP image.
Returns:
str: Path to the WebP image.
"""
try:
webp_path = os.path.splitext(image_path)[0] + '.webp'
image.save(webp_path, 'WEBP', quality=80, method=6)
return webp_path
except Exception as e:
logger.error(f"Error converting image to WebP: {e}")
st.error("Failed to convert the image to WebP format. Please try again.")
return None
def compress_image_tinyfy(image_path: str) -> None:
"""
Compress an image using Tinyfy API.
Args:
image_path (str): Path to the image to be compressed.
Returns:
None
"""
try:
if not tinify.key:
logger.warning("Tinyfy API key is not set. Skipping Tinyfy compression.")
return
source = tinify.from_file(image_path)
source.to_file(image_path)
logger.info("Tinyfy compression successful.")
except tinify.errors.AccountError:
logger.error("Verify your Tinyfy API key and account limit.")
st.warning("Tinyfy compression failed. Check your API key and account limit.")
except Exception as e:
logger.error(f"Error during Tinyfy compression: {e}")
st.warning("Tinyfy compression failed. Ensure the API key is set.")
def optimize_image(image: Image.Image, image_path: str, quality: int, resize: tuple, preserve_exif: bool) -> str:
"""
Optimize the image by compressing and converting it to WebP, with optional Tinyfy compression.
Args:
image (PIL.Image): The original image.
image_path (str): The path to the image file.
quality (int): Quality level for compression.
resize (tuple): Dimensions to resize the image.
preserve_exif (bool): Whether to preserve EXIF data.
Returns:
str: Path to the optimized WebP image, or None if failed.
"""
logger.info("Starting image optimization process...")
compressed_image = compress_image(image, quality, resize, preserve_exif)
if compressed_image is None:
return None
webp_path = convert_to_webp(compressed_image, image_path)
if webp_path is None:
return None
if tinify.key:
compress_image_tinyfy(webp_path)
else:
logger.info("Tinyfy key not provided, skipping Tinyfy compression.")
return webp_path
def main_img_optimizer() -> None:
st.title("ALwrity Image Optimizer")
st.markdown("## Upload an image to optimize its size and format.")
input_tinify_key = st.text_input("Optional: Enter your Tinyfy API Key")
if input_tinify_key:
tinify.key = input_tinify_key
uploaded_file = st.file_uploader("Upload an image", type=['jpg', 'jpeg', 'png', 'gif', 'bmp', 'webp'])
if uploaded_file:
image = Image.open(uploaded_file)
st.image(image, caption="Original Image", use_column_width=True)
quality = st.slider("Compression Quality", 1, 100, 45)
preserve_exif = st.checkbox("Preserve EXIF Data", value=False)
resize = st.checkbox("Resize Image")
if resize:
width = st.number_input("Width", value=image.width)
height = st.number_input("Height", value=image.height)
resize_dims = (width, height)
else:
resize_dims = None
if st.button("Optimize Image"):
with st.spinner("Optimizing..."):
if tinify.key:
st.info("Tinyfy compression will be applied.")
webp_path = optimize_image(image, uploaded_file.name, quality, resize_dims, preserve_exif)
if webp_path:
st.image(webp_path, caption="Optimized Image (WebP)", use_column_width=True)
st.success("Image optimization completed!")
with open(webp_path, "rb") as file:
st.download_button(
label="Download Optimized Image",
data=file,
file_name=os.path.basename(webp_path),
mime="image/webp"
)

View File

@@ -0,0 +1,340 @@
"""
FastAPI endpoint for the Comprehensive SEO Analyzer
Provides data for the React SEO Dashboard
"""
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, HttpUrl
from typing import List, Optional, Dict, Any
from datetime import datetime
import json
from .comprehensive_seo_analyzer import ComprehensiveSEOAnalyzer, SEOAnalysisResult
app = FastAPI(
title="Comprehensive SEO Analyzer API",
description="API for analyzing website SEO performance with actionable insights",
version="1.0.0"
)
# Initialize the analyzer
seo_analyzer = ComprehensiveSEOAnalyzer()
class SEOAnalysisRequest(BaseModel):
url: HttpUrl
target_keywords: Optional[List[str]] = None
class SEOAnalysisResponse(BaseModel):
url: str
timestamp: datetime
overall_score: int
health_status: str
critical_issues: List[str]
warnings: List[str]
recommendations: List[str]
data: Dict[str, Any]
success: bool
message: str
@app.post("/analyze-seo", response_model=SEOAnalysisResponse)
async def analyze_seo(request: SEOAnalysisRequest):
"""
Analyze a URL for comprehensive SEO performance
Args:
request: SEOAnalysisRequest containing URL and optional target keywords
Returns:
SEOAnalysisResponse with detailed analysis results
"""
try:
# Convert URL to string
url_str = str(request.url)
# Perform analysis
result = seo_analyzer.analyze_url(url_str, request.target_keywords)
# Convert to response format
response_data = {
'url': result.url,
'timestamp': result.timestamp,
'overall_score': result.overall_score,
'health_status': result.health_status,
'critical_issues': result.critical_issues,
'warnings': result.warnings,
'recommendations': result.recommendations,
'data': result.data,
'success': True,
'message': f"SEO analysis completed successfully for {result.url}"
}
return SEOAnalysisResponse(**response_data)
except Exception as e:
raise HTTPException(
status_code=500,
detail=f"Error analyzing SEO: {str(e)}"
)
@app.get("/health")
async def health_check():
"""Health check endpoint"""
return {
"status": "healthy",
"timestamp": datetime.now(),
"service": "Comprehensive SEO Analyzer API"
}
@app.get("/analysis-summary/{url:path}")
async def get_analysis_summary(url: str):
"""
Get a quick summary of SEO analysis for a URL
Args:
url: The URL to analyze
Returns:
Summary of SEO analysis
"""
try:
# Ensure URL has protocol
if not url.startswith(('http://', 'https://')):
url = f"https://{url}"
# Perform analysis
result = seo_analyzer.analyze_url(url)
# Create summary
summary = {
"url": result.url,
"overall_score": result.overall_score,
"health_status": result.health_status,
"critical_issues_count": len(result.critical_issues),
"warnings_count": len(result.warnings),
"recommendations_count": len(result.recommendations),
"top_issues": result.critical_issues[:3],
"top_recommendations": result.recommendations[:3],
"analysis_timestamp": result.timestamp.isoformat()
}
return summary
except Exception as e:
raise HTTPException(
status_code=500,
detail=f"Error getting analysis summary: {str(e)}"
)
@app.get("/seo-metrics/{url:path}")
async def get_seo_metrics(url: str):
"""
Get detailed SEO metrics for dashboard display
Args:
url: The URL to analyze
Returns:
Detailed SEO metrics for React dashboard
"""
try:
# Ensure URL has protocol
if not url.startswith(('http://', 'https://')):
url = f"https://{url}"
# Perform analysis
result = seo_analyzer.analyze_url(url)
# Extract metrics for dashboard
metrics = {
"overall_score": result.overall_score,
"health_status": result.health_status,
"url_structure_score": result.data.get('url_structure', {}).get('score', 0),
"meta_data_score": result.data.get('meta_data', {}).get('score', 0),
"content_score": result.data.get('content_analysis', {}).get('score', 0),
"technical_score": result.data.get('technical_seo', {}).get('score', 0),
"performance_score": result.data.get('performance', {}).get('score', 0),
"accessibility_score": result.data.get('accessibility', {}).get('score', 0),
"user_experience_score": result.data.get('user_experience', {}).get('score', 0),
"security_score": result.data.get('security_headers', {}).get('score', 0)
}
# Add detailed data for each category
dashboard_data = {
"metrics": metrics,
"critical_issues": result.critical_issues,
"warnings": result.warnings,
"recommendations": result.recommendations,
"detailed_analysis": {
"url_structure": result.data.get('url_structure', {}),
"meta_data": result.data.get('meta_data', {}),
"content_analysis": result.data.get('content_analysis', {}),
"technical_seo": result.data.get('technical_seo', {}),
"performance": result.data.get('performance', {}),
"accessibility": result.data.get('accessibility', {}),
"user_experience": result.data.get('user_experience', {}),
"security_headers": result.data.get('security_headers', {}),
"keyword_analysis": result.data.get('keyword_analysis', {})
},
"timestamp": result.timestamp.isoformat(),
"url": result.url
}
return dashboard_data
except Exception as e:
raise HTTPException(
status_code=500,
detail=f"Error getting SEO metrics: {str(e)}"
)
@app.post("/batch-analyze")
async def batch_analyze(urls: List[str]):
"""
Analyze multiple URLs in batch
Args:
urls: List of URLs to analyze
Returns:
Batch analysis results
"""
try:
results = []
for url in urls:
try:
# Ensure URL has protocol
if not url.startswith(('http://', 'https://')):
url = f"https://{url}"
# Perform analysis
result = seo_analyzer.analyze_url(url)
# Add to results
results.append({
"url": result.url,
"overall_score": result.overall_score,
"health_status": result.health_status,
"critical_issues_count": len(result.critical_issues),
"warnings_count": len(result.warnings),
"success": True
})
except Exception as e:
# Add error result
results.append({
"url": url,
"overall_score": 0,
"health_status": "error",
"critical_issues_count": 0,
"warnings_count": 0,
"success": False,
"error": str(e)
})
return {
"total_urls": len(urls),
"successful_analyses": len([r for r in results if r['success']]),
"failed_analyses": len([r for r in results if not r['success']]),
"results": results
}
except Exception as e:
raise HTTPException(
status_code=500,
detail=f"Error in batch analysis: {str(e)}"
)
# Enhanced prompts for better results
ENHANCED_PROMPTS = {
"critical_issue": "🚨 CRITICAL: This issue is severely impacting your SEO performance and must be fixed immediately.",
"warning": "⚠️ WARNING: This could be improved to boost your search rankings.",
"recommendation": "💡 RECOMMENDATION: Implement this to improve your SEO score.",
"excellent": "🎉 EXCELLENT: Your SEO is performing very well in this area!",
"good": "✅ GOOD: Your SEO is performing well, with room for minor improvements.",
"needs_improvement": "🔧 NEEDS IMPROVEMENT: Several areas need attention to boost your SEO.",
"poor": "❌ POOR: Significant improvements needed across multiple areas."
}
def enhance_analysis_result(result: SEOAnalysisResult) -> SEOAnalysisResult:
"""
Enhance analysis results with better prompts and user-friendly language
"""
# Enhance critical issues
enhanced_critical_issues = []
for issue in result.critical_issues:
enhanced_issue = f"{ENHANCED_PROMPTS['critical_issue']} {issue}"
enhanced_critical_issues.append(enhanced_issue)
# Enhance warnings
enhanced_warnings = []
for warning in result.warnings:
enhanced_warning = f"{ENHANCED_PROMPTS['warning']} {warning}"
enhanced_warnings.append(enhanced_warning)
# Enhance recommendations
enhanced_recommendations = []
for rec in result.recommendations:
enhanced_rec = f"{ENHANCED_PROMPTS['recommendation']} {rec}"
enhanced_recommendations.append(enhanced_rec)
# Create enhanced result
enhanced_result = SEOAnalysisResult(
url=result.url,
timestamp=result.timestamp,
overall_score=result.overall_score,
health_status=result.health_status,
critical_issues=enhanced_critical_issues,
warnings=enhanced_warnings,
recommendations=enhanced_recommendations,
data=result.data
)
return enhanced_result
@app.post("/analyze-seo-enhanced", response_model=SEOAnalysisResponse)
async def analyze_seo_enhanced(request: SEOAnalysisRequest):
"""
Analyze a URL with enhanced, user-friendly prompts
Args:
request: SEOAnalysisRequest containing URL and optional target keywords
Returns:
SEOAnalysisResponse with enhanced, user-friendly analysis results
"""
try:
# Convert URL to string
url_str = str(request.url)
# Perform analysis
result = seo_analyzer.analyze_url(url_str, request.target_keywords)
# Enhance results
enhanced_result = enhance_analysis_result(result)
# Convert to response format
response_data = {
'url': enhanced_result.url,
'timestamp': enhanced_result.timestamp,
'overall_score': enhanced_result.overall_score,
'health_status': enhanced_result.health_status,
'critical_issues': enhanced_result.critical_issues,
'warnings': enhanced_result.warnings,
'recommendations': enhanced_result.recommendations,
'data': enhanced_result.data,
'success': True,
'message': f"Enhanced SEO analysis completed successfully for {enhanced_result.url}"
}
return SEOAnalysisResponse(**response_data)
except Exception as e:
raise HTTPException(
status_code=500,
detail=f"Error analyzing SEO: {str(e)}"
)
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=8000)

View File

@@ -0,0 +1,130 @@
import streamlit as st
import json
from datetime import date
from dotenv import load_dotenv
from ..ai_web_researcher.firecrawl_web_crawler import scrape_url
from ..gpt_providers.text_generation.main_text_generation import llm_text_gen
# Load environment variables
load_dotenv()
# Define a dictionary for schema types
schema_types = {
"Article": {
"fields": ["Headline", "Author", "Date Published", "Keywords"],
"schema_type": "Article",
},
"Product": {
"fields": ["Name", "Description", "Price", "Brand", "Image URL"],
"schema_type": "Product",
},
"Recipe": {
"fields": ["Name", "Ingredients", "Cooking Time", "Serving Size", "Image URL"],
"schema_type": "Recipe",
},
"Event": {
"fields": ["Name", "Start Date", "End Date", "Location", "Description"],
"schema_type": "Event",
},
"LocalBusiness": {
"fields": ["Name", "Address", "Phone Number", "Opening Hours", "Image URL"],
"schema_type": "LocalBusiness",
},
# ... (add more schema types as needed)
}
def generate_json_data(content_type, details, url):
"""Generates structured data (JSON-LD) based on user input."""
try:
scraped_text = scrape_url(url)
except Exception as err:
st.error(f"Failed to scrape web page from URL: {url} - Error: {err}")
return
schema = schema_types.get(content_type)
if not schema:
st.error(f"Invalid content type: {content_type}")
return
data = {
"@context": "https://schema.org",
"@type": schema["schema_type"],
}
for field in schema["fields"]:
value = details.get(field)
if isinstance(value, date):
value = value.isoformat()
data[field] = value if value else "N/A" # Use placeholder values if input is missing
if url:
data['url'] = url
llm_structured_data = get_llm_structured_data(content_type, data, scraped_text)
return llm_structured_data
def get_llm_structured_data(content_type, data, scraped_text):
"""Function to get structured data from LLM."""
prompt = f"""Given the following information:
HTML Content: <<<HTML>>> {scraped_text} <<<END_HTML>>>
Content Type: <<<CONTENT_TYPE>>> {content_type} <<<END_CONTENT_TYPE>>>
Additional Relevant Data: <<<ADDITIONAL_DATA>>> {data} <<<END_ADDITIONAL_DATA>>>
Create a detailed structured data (JSON-LD) script for SEO purposes.
The structured data should help search engines understand the content and features of the webpage, enhancing its visibility and potential for rich snippets in search results.
Detailed Steps:
Parse the HTML content to extract relevant information like the title, main heading, and body content.
Use the contentType to determine the structured data type (e.g., Article, Product, Recipe).
Integrate the additional relevant data (e.g., author, datePublished, keywords) into the structured data.
Ensure all URLs, images, and other attributes are correctly formatted and included.
Validate the generated JSON-LD to ensure it meets schema.org standards and is free of errors.
Expected Output:
Generate a JSON-LD structured data snippet based on the provided inputs."""
try:
response = llm_text_gen(prompt)
return response
except Exception as err:
st.error(f"Failed to get response from LLM: {err}")
return
def ai_structured_data():
st.title("📝 Generate Structured Data for SEO 🚀")
st.markdown("**Make your content more discoverable with rich snippets.**")
content_type = st.selectbox("**Select Content Type**", list(schema_types.keys()))
details = {}
schema_fields = schema_types[content_type]["fields"]
num_fields = len(schema_fields)
url = st.text_input("**URL :**", placeholder="Enter the URL of your webpage")
for i in range(0, num_fields, 2):
cols = st.columns(2)
for j in range(2):
if i + j < num_fields:
field = schema_fields[i + j]
if "Date" in field:
details[field] = cols[j].date_input(field)
else:
details[field] = cols[j].text_input(field, placeholder=f"Enter {field.lower()}")
if st.button("Generate Structured Data"):
if not url:
st.error("URL is required to generate structured data.")
return
structured_data = generate_json_data(content_type, details, url)
if structured_data:
st.subheader("Generated Structured Data (JSON-LD):")
st.markdown(structured_data)
st.download_button(
label="Download JSON-LD",
data=structured_data,
file_name=f"{content_type}_structured_data.json",
mime="application/json",
)

View File

@@ -0,0 +1,340 @@
import streamlit as st
import advertools as adv
import pandas as pd
import plotly.graph_objects as go
from urllib.error import URLError
import xml.etree.ElementTree as ET
import requests
def main():
"""
Main function to run the Sitemap Analyzer Streamlit app.
"""
st.title("📊 Sitemap Analyzer")
st.write("""
This tool analyzes a website's sitemap to understand its content structure and publishing trends.
Enter a sitemap URL to start your analysis.
""")
sitemap_url = st.text_input(
"Please enter the sitemap URL:",
"https://www.example.com/sitemap.xml"
)
if st.button("Analyze Sitemap"):
try:
sitemap_df = fetch_all_sitemaps(sitemap_url)
if sitemap_df is not None and not sitemap_df.empty:
sitemap_df = process_lastmod_column(sitemap_df)
ppmonth = analyze_content_trends(sitemap_df)
sitemap_df = categorize_and_shorten_sitemaps(sitemap_df)
display_key_metrics(sitemap_df, ppmonth)
plot_sitemap_content_distribution(sitemap_df)
plot_content_trends(ppmonth)
plot_content_type_breakdown(sitemap_df)
plot_publishing_frequency(sitemap_df)
st.success("🎉 Analysis complete!")
else:
st.error("No valid URLs found in the sitemap.")
except URLError as e:
st.error(f"Error fetching the sitemap: {e}")
except Exception as e:
st.error(f"An unexpected error occurred: {e}")
def fetch_all_sitemaps(sitemap_url):
"""
Fetches all sitemaps from the provided sitemap URL and concatenates their URLs into a DataFrame.
Parameters:
sitemap_url (str): The URL of the sitemap.
Returns:
DataFrame: A DataFrame containing all URLs from the sitemaps.
"""
st.write(f"🚀 Fetching and analyzing the sitemap: {sitemap_url}...")
try:
sitemap_df = fetch_sitemap(sitemap_url)
if sitemap_df is not None:
all_sitemaps = sitemap_df.loc[
sitemap_df['loc'].str.contains('sitemap'),
'loc'
].tolist()
if all_sitemaps:
st.write(
f"🔄 Found {len(all_sitemaps)} additional sitemaps. Fetching data from them..."
)
all_urls_df = pd.DataFrame()
for sitemap in all_sitemaps:
try:
st.write(f"Fetching URLs from {sitemap}...")
temp_df = fetch_sitemap(sitemap)
if temp_df is not None:
all_urls_df = pd.concat(
[all_urls_df, temp_df], ignore_index=True
)
except Exception as e:
st.error(f"Error fetching {sitemap}: {e}")
st.write(
f"✅ Successfully fetched {len(all_urls_df)} URLs from all sitemaps."
)
return all_urls_df
else:
st.write(f"✅ Successfully fetched {len(sitemap_df)} URLs from the main sitemap.")
return sitemap_df
else:
return None
except Exception as e:
st.error(f"⚠️ Error fetching the sitemap: {e}")
return None
def fetch_sitemap(url):
"""
Fetches and parses the sitemap from the provided URL.
Parameters:
url (str): The URL of the sitemap.
Returns:
DataFrame: A DataFrame containing the URLs from the sitemap.
"""
try:
response = requests.get(url)
response.raise_for_status()
ET.fromstring(response.content)
sitemap_df = adv.sitemap_to_df(url)
return sitemap_df
except requests.RequestException as e:
st.error(f"⚠️ Request error: {e}")
return None
except ET.ParseError as e:
st.error(f"⚠️ XML parsing error: {e}")
return None
def process_lastmod_column(sitemap_df):
"""
Processes the 'lastmod' column in the sitemap DataFrame by converting it to DateTime format and setting it as the index.
Parameters:
sitemap_df (DataFrame): The sitemap DataFrame.
Returns:
DataFrame: The processed sitemap DataFrame with 'lastmod' as the index.
"""
st.write("📅 Converting 'lastmod' column to DateTime format and setting it as the index...")
try:
sitemap_df = sitemap_df.dropna(subset=['lastmod'])
sitemap_df['lastmod'] = pd.to_datetime(sitemap_df['lastmod'])
sitemap_df.set_index('lastmod', inplace=True)
st.write("'lastmod' column successfully converted to DateTime format and set as the index.")
return sitemap_df
except Exception as e:
st.error(f"⚠️ Error processing the 'lastmod' column: {e}")
return None
def categorize_and_shorten_sitemaps(sitemap_df):
"""
Categorizes and shortens the sitemap names in the sitemap DataFrame.
Parameters:
sitemap_df (DataFrame): The sitemap DataFrame.
Returns:
DataFrame: The sitemap DataFrame with categorized and shortened sitemap names.
"""
st.write("🔍 Categorizing and shortening sitemap names...")
try:
sitemap_df['sitemap_name'] = sitemap_df['sitemap'].str.split('/').str[4]
sitemap_df['sitemap_name'] = sitemap_df['sitemap_name'].replace({
'sitemap-site-kasko-fiyatlari.xml': 'Kasko',
'sitemap-site-bireysel.xml': 'Personal',
'sitemap-site-kurumsal.xml': 'Cooperate',
'sitemap-site-arac-sigortasi.xml': 'Car',
'sitemap-site.xml': 'Others'
})
st.write("✅ Sitemap names categorized and shortened.")
return sitemap_df
except Exception as e:
st.error(f"⚠️ Error categorizing sitemap names: {e}")
return sitemap_df
def analyze_content_trends(sitemap_df):
"""
Analyzes content publishing trends in the sitemap DataFrame.
Parameters:
sitemap_df (DataFrame): The sitemap DataFrame.
Returns:
Series: A Series representing the number of contents published each month.
"""
st.write("📅 Analyzing content publishing trends...")
try:
ppmonth = sitemap_df.resample('M').size()
sitemap_df['monthly_count'] = sitemap_df.index.to_period('M').value_counts().sort_index()
st.write("✅ Content trends analysis completed.")
return ppmonth
except Exception as e:
st.error(f"⚠️ Error during content trends analysis: {e}")
return pd.Series()
def display_key_metrics(sitemap_df, ppmonth):
"""
Displays key metrics of the sitemap analysis.
Parameters:
sitemap_df (DataFrame): The sitemap DataFrame.
ppmonth (Series): The Series representing the number of contents published each month.
"""
st.write("### Key Metrics")
total_urls = len(sitemap_df)
total_articles = ppmonth.sum()
average_frequency = ppmonth.mean()
st.write(f"**Total URLs Found:** {total_urls:,}")
st.write(f"**Total Articles Published:** {total_articles:,}")
st.write(f"**Average Monthly Publishing Frequency:** {average_frequency:.2f} articles/month")
def plot_sitemap_content_distribution(sitemap_df):
"""
Plots the content distribution by sitemap categories.
Parameters:
sitemap_df (DataFrame): The sitemap DataFrame.
"""
st.write("📊 Visualizing content amount by sitemap categories...")
try:
if 'sitemap_name' in sitemap_df.columns:
stmc = sitemap_df.groupby('sitemap_name').size()
fig = go.Figure()
fig.add_bar(x=stmc.index, y=stmc.values, name='Sitemap Categories')
fig.update_layout(
title='Content Amount by Sitemap Categories',
xaxis_title='Sitemap Categories',
yaxis_title='Number of Articles',
paper_bgcolor='#E5ECF6'
)
st.plotly_chart(fig)
else:
st.warning("⚠️ The 'sitemap_name' column is missing in the data.")
except Exception as e:
st.error(f"⚠️ Error during sitemap content distribution plotting: {e}")
def plot_content_trends(ppmonth):
"""
Plots the content publishing trends over time.
Parameters:
ppmonth (Series): The Series representing the number of contents published each month.
"""
st.write("📈 Plotting content publishing trends over time...")
try:
fig = go.Figure()
fig.add_scatter(x=ppmonth.index, y=ppmonth.values, mode='lines+markers', name='Publishing Trends')
fig.update_layout(
title='Content Publishing Trends Over Time',
xaxis_title='Month',
yaxis_title='Number of Articles',
paper_bgcolor='#E5ECF6'
)
st.plotly_chart(fig)
except Exception as e:
st.error(f"⚠️ Error during content trends plotting: {e}")
def plot_content_type_breakdown(sitemap_df):
"""
Plots the content type breakdown.
Parameters:
sitemap_df (DataFrame): The sitemap DataFrame.
"""
st.write("🔍 Plotting content type breakdown...")
try:
if 'sitemap_name' in sitemap_df.columns and not sitemap_df['sitemap_name'].empty:
content_type_counts = sitemap_df['sitemap_name'].value_counts()
st.write("Content Type Counts:", content_type_counts)
if not content_type_counts.empty:
fig = go.Figure(data=[go.Pie(labels=content_type_counts.index, values=content_type_counts.values)])
fig.update_layout(
title='Content Type Breakdown',
paper_bgcolor='#E5ECF6'
)
st.plotly_chart(fig)
else:
st.warning("⚠️ No content types to display.")
else:
st.warning("⚠️ The 'sitemap_name' column is missing or empty.")
except Exception as e:
st.error(f"⚠️ Error during content type breakdown plotting: {e}")
def plot_publishing_frequency(sitemap_df):
"""
Plots the publishing frequency by month.
Parameters:
sitemap_df (DataFrame): The sitemap DataFrame.
"""
st.write("📆 Plotting publishing frequency by month...")
try:
if not sitemap_df.empty:
frequency_by_month = sitemap_df.index.to_period('M').value_counts().sort_index()
frequency_by_month.index = frequency_by_month.index.astype(str)
fig = go.Figure()
fig.add_bar(x=frequency_by_month.index, y=frequency_by_month.values, name='Publishing Frequency')
fig.update_layout(
title='Publishing Frequency by Month',
xaxis_title='Month',
yaxis_title='Number of Articles',
paper_bgcolor='#E5ECF6'
)
st.plotly_chart(fig)
else:
st.warning("⚠️ No data available to plot publishing frequency.")
except Exception as e:
st.error(f"⚠️ Error during publishing frequency plotting: {e}")
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,22 @@
"""
Technical SEO Crawler Package.
This package provides comprehensive technical SEO analysis capabilities
with advertools integration and AI-powered recommendations.
Components:
- TechnicalSEOCrawler: Core crawler with technical analysis
- TechnicalSEOCrawlerUI: Streamlit interface for the crawler
"""
from .crawler import TechnicalSEOCrawler
from .ui import TechnicalSEOCrawlerUI, render_technical_seo_crawler
__version__ = "1.0.0"
__author__ = "ALwrity"
__all__ = [
'TechnicalSEOCrawler',
'TechnicalSEOCrawlerUI',
'render_technical_seo_crawler'
]

View File

@@ -0,0 +1,709 @@
"""
Comprehensive Technical SEO Crawler using Advertools Integration.
This module provides advanced site-wide technical SEO analysis using:
- adv.crawl: Complete website crawling and analysis
- adv.crawl_headers: HTTP headers and server analysis
- adv.crawl_images: Image optimization analysis
- adv.url_to_df: URL structure optimization
- AI-powered technical recommendations
"""
import streamlit as st
import pandas as pd
import advertools as adv
from typing import Dict, Any, List, Optional, Tuple
from urllib.parse import urlparse, urljoin
import tempfile
import os
from datetime import datetime
import json
from collections import Counter, defaultdict
from loguru import logger
import numpy as np
# Import existing modules
from lib.gpt_providers.text_generation.main_text_generation import llm_text_gen
from lib.utils.website_analyzer.analyzer import WebsiteAnalyzer
class TechnicalSEOCrawler:
"""Comprehensive technical SEO crawler with advertools integration."""
def __init__(self):
"""Initialize the technical SEO crawler."""
self.temp_dir = tempfile.mkdtemp()
logger.info("TechnicalSEOCrawler initialized")
def analyze_website_technical_seo(self, website_url: str, crawl_depth: int = 3,
max_pages: int = 500) -> Dict[str, Any]:
"""
Perform comprehensive technical SEO analysis.
Args:
website_url: Website URL to analyze
crawl_depth: How deep to crawl (1-5)
max_pages: Maximum pages to crawl (50-1000)
Returns:
Comprehensive technical SEO analysis results
"""
try:
st.info("🚀 Starting Comprehensive Technical SEO Crawl...")
# Initialize results structure
results = {
'analysis_timestamp': datetime.utcnow().isoformat(),
'website_url': website_url,
'crawl_settings': {
'depth': crawl_depth,
'max_pages': max_pages
},
'crawl_overview': {},
'technical_issues': {},
'performance_analysis': {},
'content_analysis': {},
'url_structure': {},
'image_optimization': {},
'security_headers': {},
'mobile_seo': {},
'structured_data': {},
'ai_recommendations': {}
}
# Phase 1: Core Website Crawl
with st.expander("🕷️ Website Crawling Progress", expanded=True):
crawl_data = self._perform_comprehensive_crawl(website_url, crawl_depth, max_pages)
results['crawl_overview'] = crawl_data
st.success(f"✅ Crawled {crawl_data.get('pages_crawled', 0)} pages")
# Phase 2: Technical Issues Detection
with st.expander("🔍 Technical Issues Analysis", expanded=True):
technical_issues = self._analyze_technical_issues(crawl_data)
results['technical_issues'] = technical_issues
st.success("✅ Identified technical SEO issues")
# Phase 3: Performance Analysis
with st.expander("⚡ Performance Analysis", expanded=True):
performance = self._analyze_performance_metrics(crawl_data)
results['performance_analysis'] = performance
st.success("✅ Analyzed website performance metrics")
# Phase 4: Content & Structure Analysis
with st.expander("📊 Content Structure Analysis", expanded=True):
content_analysis = self._analyze_content_structure(crawl_data)
results['content_analysis'] = content_analysis
st.success("✅ Analyzed content structure and optimization")
# Phase 5: URL Structure Optimization
with st.expander("🔗 URL Structure Analysis", expanded=True):
url_analysis = self._analyze_url_structure(crawl_data)
results['url_structure'] = url_analysis
st.success("✅ Analyzed URL structure and patterns")
# Phase 6: Image SEO Analysis
with st.expander("🖼️ Image SEO Analysis", expanded=True):
image_analysis = self._analyze_image_seo(website_url)
results['image_optimization'] = image_analysis
st.success("✅ Analyzed image optimization")
# Phase 7: Security & Headers Analysis
with st.expander("🛡️ Security Headers Analysis", expanded=True):
security_analysis = self._analyze_security_headers(website_url)
results['security_headers'] = security_analysis
st.success("✅ Analyzed security headers")
# Phase 8: Mobile SEO Analysis
with st.expander("📱 Mobile SEO Analysis", expanded=True):
mobile_analysis = self._analyze_mobile_seo(crawl_data)
results['mobile_seo'] = mobile_analysis
st.success("✅ Analyzed mobile SEO factors")
# Phase 9: AI-Powered Recommendations
with st.expander("🤖 AI Technical Recommendations", expanded=True):
ai_recommendations = self._generate_technical_recommendations(results)
results['ai_recommendations'] = ai_recommendations
st.success("✅ Generated AI-powered technical recommendations")
return results
except Exception as e:
error_msg = f"Error in technical SEO analysis: {str(e)}"
logger.error(error_msg, exc_info=True)
st.error(error_msg)
return {'error': error_msg}
def _perform_comprehensive_crawl(self, website_url: str, depth: int, max_pages: int) -> Dict[str, Any]:
"""Perform comprehensive website crawl using adv.crawl."""
try:
st.info("🕷️ Crawling website for comprehensive analysis...")
# Create crawl output file
crawl_file = os.path.join(self.temp_dir, "technical_crawl.jl")
# Configure crawl settings for technical SEO
custom_settings = {
'DEPTH_LIMIT': depth,
'CLOSESPIDER_PAGECOUNT': max_pages,
'DOWNLOAD_DELAY': 0.5, # Be respectful
'CONCURRENT_REQUESTS': 8,
'ROBOTSTXT_OBEY': True,
'USER_AGENT': 'ALwrity-TechnicalSEO-Crawler/1.0',
'COOKIES_ENABLED': False,
'TELNETCONSOLE_ENABLED': False,
'LOG_LEVEL': 'WARNING'
}
# Start crawl
adv.crawl(
url_list=[website_url],
output_file=crawl_file,
follow_links=True,
custom_settings=custom_settings
)
# Read and process crawl results
if os.path.exists(crawl_file):
crawl_df = pd.read_json(crawl_file, lines=True)
# Basic crawl statistics
crawl_overview = {
'pages_crawled': len(crawl_df),
'status_codes': crawl_df['status'].value_counts().to_dict(),
'crawl_file_path': crawl_file,
'crawl_dataframe': crawl_df,
'domains_found': crawl_df['url'].apply(lambda x: urlparse(x).netloc).nunique(),
'avg_response_time': crawl_df.get('download_latency', pd.Series()).mean(),
'total_content_size': crawl_df.get('size', pd.Series()).sum()
}
return crawl_overview
else:
st.error("Crawl file not created")
return {}
except Exception as e:
st.error(f"Error in website crawl: {str(e)}")
return {}
def _analyze_technical_issues(self, crawl_data: Dict[str, Any]) -> Dict[str, Any]:
"""Analyze technical SEO issues from crawl data."""
try:
st.info("🔍 Detecting technical SEO issues...")
if 'crawl_dataframe' not in crawl_data:
return {}
df = crawl_data['crawl_dataframe']
technical_issues = {
'http_errors': {},
'redirect_issues': {},
'duplicate_content': {},
'missing_elements': {},
'page_speed_issues': {},
'crawlability_issues': {}
}
# HTTP Status Code Issues
error_codes = df[df['status'] >= 400]['status'].value_counts().to_dict()
technical_issues['http_errors'] = {
'total_errors': len(df[df['status'] >= 400]),
'error_breakdown': error_codes,
'error_pages': df[df['status'] >= 400][['url', 'status']].to_dict('records')[:50]
}
# Redirect Analysis
redirects = df[df['status'].isin([301, 302, 303, 307, 308])]
technical_issues['redirect_issues'] = {
'total_redirects': len(redirects),
'redirect_chains': self._find_redirect_chains(redirects),
'redirect_types': redirects['status'].value_counts().to_dict()
}
# Duplicate Content Detection
if 'title' in df.columns:
duplicate_titles = df['title'].value_counts()
duplicate_titles = duplicate_titles[duplicate_titles > 1]
technical_issues['duplicate_content'] = {
'duplicate_titles': len(duplicate_titles),
'duplicate_title_groups': duplicate_titles.to_dict(),
'pages_with_duplicate_titles': df[df['title'].isin(duplicate_titles.index)][['url', 'title']].to_dict('records')[:20]
}
# Missing Elements Analysis
missing_elements = {
'missing_titles': len(df[(df['title'].isna()) | (df['title'] == '')]) if 'title' in df.columns else 0,
'missing_meta_desc': len(df[(df['meta_desc'].isna()) | (df['meta_desc'] == '')]) if 'meta_desc' in df.columns else 0,
'missing_h1': len(df[(df['h1'].isna()) | (df['h1'] == '')]) if 'h1' in df.columns else 0
}
technical_issues['missing_elements'] = missing_elements
# Page Speed Issues
if 'download_latency' in df.columns:
slow_pages = df[df['download_latency'] > 3.0] # Pages taking >3s
technical_issues['page_speed_issues'] = {
'slow_pages_count': len(slow_pages),
'avg_load_time': df['download_latency'].mean(),
'slowest_pages': slow_pages.nlargest(10, 'download_latency')[['url', 'download_latency']].to_dict('records')
}
return technical_issues
except Exception as e:
st.error(f"Error analyzing technical issues: {str(e)}")
return {}
def _analyze_performance_metrics(self, crawl_data: Dict[str, Any]) -> Dict[str, Any]:
"""Analyze website performance metrics."""
try:
st.info("⚡ Analyzing performance metrics...")
if 'crawl_dataframe' not in crawl_data:
return {}
df = crawl_data['crawl_dataframe']
performance = {
'load_time_analysis': {},
'content_size_analysis': {},
'server_performance': {},
'optimization_opportunities': []
}
# Load Time Analysis
if 'download_latency' in df.columns:
load_times = df['download_latency'].dropna()
performance['load_time_analysis'] = {
'avg_load_time': load_times.mean(),
'median_load_time': load_times.median(),
'p95_load_time': load_times.quantile(0.95),
'fastest_page': load_times.min(),
'slowest_page': load_times.max(),
'pages_over_3s': len(load_times[load_times > 3]),
'performance_distribution': {
'fast_pages': len(load_times[load_times <= 1]),
'moderate_pages': len(load_times[(load_times > 1) & (load_times <= 3)]),
'slow_pages': len(load_times[load_times > 3])
}
}
# Content Size Analysis
if 'size' in df.columns:
sizes = df['size'].dropna()
performance['content_size_analysis'] = {
'avg_page_size': sizes.mean(),
'median_page_size': sizes.median(),
'largest_page': sizes.max(),
'smallest_page': sizes.min(),
'pages_over_1mb': len(sizes[sizes > 1048576]), # 1MB
'total_content_size': sizes.sum()
}
# Server Performance
status_codes = df['status'].value_counts()
total_pages = len(df)
performance['server_performance'] = {
'success_rate': status_codes.get(200, 0) / total_pages * 100,
'error_rate': sum(status_codes.get(code, 0) for code in range(400, 600)) / total_pages * 100,
'redirect_rate': sum(status_codes.get(code, 0) for code in [301, 302, 303, 307, 308]) / total_pages * 100
}
return performance
except Exception as e:
st.error(f"Error analyzing performance: {str(e)}")
return {}
def _analyze_content_structure(self, crawl_data: Dict[str, Any]) -> Dict[str, Any]:
"""Analyze content structure and SEO elements."""
try:
st.info("📊 Analyzing content structure...")
if 'crawl_dataframe' not in crawl_data:
return {}
df = crawl_data['crawl_dataframe']
content_analysis = {
'title_analysis': {},
'meta_description_analysis': {},
'heading_structure': {},
'internal_linking': {},
'content_optimization': {}
}
# Title Analysis
if 'title' in df.columns:
titles = df['title'].dropna()
title_lengths = titles.str.len()
content_analysis['title_analysis'] = {
'avg_title_length': title_lengths.mean(),
'title_length_distribution': {
'too_short': len(title_lengths[title_lengths < 30]),
'optimal': len(title_lengths[(title_lengths >= 30) & (title_lengths <= 60)]),
'too_long': len(title_lengths[title_lengths > 60])
},
'duplicate_titles': len(titles.value_counts()[titles.value_counts() > 1]),
'missing_titles': len(df) - len(titles)
}
# Meta Description Analysis
if 'meta_desc' in df.columns:
meta_descs = df['meta_desc'].dropna()
meta_lengths = meta_descs.str.len()
content_analysis['meta_description_analysis'] = {
'avg_meta_length': meta_lengths.mean(),
'meta_length_distribution': {
'too_short': len(meta_lengths[meta_lengths < 120]),
'optimal': len(meta_lengths[(meta_lengths >= 120) & (meta_lengths <= 160)]),
'too_long': len(meta_lengths[meta_lengths > 160])
},
'missing_meta_descriptions': len(df) - len(meta_descs)
}
# Heading Structure Analysis
heading_cols = [col for col in df.columns if col.startswith('h') and col[1:].isdigit()]
if heading_cols:
heading_analysis = {}
for col in heading_cols:
headings = df[col].dropna()
heading_analysis[f'{col}_usage'] = {
'pages_with_heading': len(headings),
'usage_rate': len(headings) / len(df) * 100,
'avg_length': headings.str.len().mean() if len(headings) > 0 else 0
}
content_analysis['heading_structure'] = heading_analysis
# Internal Linking Analysis
if 'links_internal' in df.columns:
internal_links = df['links_internal'].apply(lambda x: len(x) if isinstance(x, list) else 0)
content_analysis['internal_linking'] = {
'avg_internal_links': internal_links.mean(),
'pages_with_no_internal_links': len(internal_links[internal_links == 0]),
'max_internal_links': internal_links.max(),
'internal_link_distribution': internal_links.describe().to_dict()
}
return content_analysis
except Exception as e:
st.error(f"Error analyzing content structure: {str(e)}")
return {}
def _analyze_url_structure(self, crawl_data: Dict[str, Any]) -> Dict[str, Any]:
"""Analyze URL structure and optimization using adv.url_to_df."""
try:
st.info("🔗 Analyzing URL structure...")
if 'crawl_dataframe' not in crawl_data:
return {}
df = crawl_data['crawl_dataframe']
urls = df['url'].tolist()
# Use advertools to analyze URL structure
url_df = adv.url_to_df(urls)
url_analysis = {
'url_length_analysis': {},
'url_structure_patterns': {},
'url_optimization': {},
'path_analysis': {}
}
# URL Length Analysis
url_lengths = url_df['url'].str.len()
url_analysis['url_length_analysis'] = {
'avg_url_length': url_lengths.mean(),
'max_url_length': url_lengths.max(),
'long_urls_count': len(url_lengths[url_lengths > 100]),
'url_length_distribution': url_lengths.describe().to_dict()
}
# Path Depth Analysis
if 'dir_1' in url_df.columns:
path_depths = url_df.apply(lambda row: sum(1 for i in range(1, 10) if f'dir_{i}' in row and pd.notna(row[f'dir_{i}'])), axis=1)
url_analysis['path_analysis'] = {
'avg_path_depth': path_depths.mean(),
'max_path_depth': path_depths.max(),
'deep_paths_count': len(path_depths[path_depths > 4]),
'path_depth_distribution': path_depths.value_counts().to_dict()
}
# URL Structure Patterns
domains = url_df['netloc'].value_counts()
schemes = url_df['scheme'].value_counts()
url_analysis['url_structure_patterns'] = {
'domains_found': domains.to_dict(),
'schemes_used': schemes.to_dict(),
'subdomain_usage': len(url_df[url_df['netloc'].str.contains('\.', regex=True)]),
'https_usage': schemes.get('https', 0) / len(url_df) * 100
}
# URL Optimization Issues
optimization_issues = []
# Check for non-HTTPS URLs
if schemes.get('http', 0) > 0:
optimization_issues.append(f"{schemes.get('http', 0)} pages not using HTTPS")
# Check for long URLs
long_urls = len(url_lengths[url_lengths > 100])
if long_urls > 0:
optimization_issues.append(f"{long_urls} URLs are too long (>100 characters)")
# Check for deep paths
if 'path_analysis' in url_analysis:
deep_paths = url_analysis['path_analysis']['deep_paths_count']
if deep_paths > 0:
optimization_issues.append(f"{deep_paths} URLs have deep path structures (>4 levels)")
url_analysis['url_optimization'] = {
'issues_found': len(optimization_issues),
'optimization_recommendations': optimization_issues
}
return url_analysis
except Exception as e:
st.error(f"Error analyzing URL structure: {str(e)}")
return {}
def _analyze_image_seo(self, website_url: str) -> Dict[str, Any]:
"""Analyze image SEO using adv.crawl_images."""
try:
st.info("🖼️ Analyzing image SEO...")
# Create image crawl output file
image_file = os.path.join(self.temp_dir, "image_crawl.jl")
# Crawl images
adv.crawl_images(
url_list=[website_url],
output_file=image_file,
custom_settings={
'DEPTH_LIMIT': 2,
'CLOSESPIDER_PAGECOUNT': 100,
'DOWNLOAD_DELAY': 1
}
)
image_analysis = {
'image_count': 0,
'alt_text_analysis': {},
'image_format_analysis': {},
'image_size_analysis': {},
'optimization_opportunities': []
}
if os.path.exists(image_file):
image_df = pd.read_json(image_file, lines=True)
image_analysis['image_count'] = len(image_df)
# Alt text analysis
if 'img_alt' in image_df.columns:
alt_texts = image_df['img_alt'].dropna()
missing_alt = len(image_df) - len(alt_texts)
image_analysis['alt_text_analysis'] = {
'images_with_alt': len(alt_texts),
'images_missing_alt': missing_alt,
'alt_text_coverage': len(alt_texts) / len(image_df) * 100,
'avg_alt_length': alt_texts.str.len().mean() if len(alt_texts) > 0 else 0
}
# Image format analysis
if 'img_src' in image_df.columns:
# Extract file extensions
extensions = image_df['img_src'].str.extract(r'\.([a-zA-Z]{2,4})(?:\?|$)')
format_counts = extensions[0].value_counts()
image_analysis['image_format_analysis'] = {
'format_distribution': format_counts.to_dict(),
'modern_format_usage': format_counts.get('webp', 0) + format_counts.get('avif', 0)
}
return image_analysis
except Exception as e:
st.error(f"Error analyzing images: {str(e)}")
return {}
def _analyze_security_headers(self, website_url: str) -> Dict[str, Any]:
"""Analyze security headers using adv.crawl_headers."""
try:
st.info("🛡️ Analyzing security headers...")
# Create headers output file
headers_file = os.path.join(self.temp_dir, "security_headers.jl")
# Crawl headers
adv.crawl_headers([website_url], output_file=headers_file)
security_analysis = {
'security_headers_present': {},
'security_score': 0,
'security_recommendations': []
}
if os.path.exists(headers_file):
headers_df = pd.read_json(headers_file, lines=True)
# Check for important security headers
security_headers = {
'X-Frame-Options': 'resp_headers_X-Frame-Options',
'X-Content-Type-Options': 'resp_headers_X-Content-Type-Options',
'X-XSS-Protection': 'resp_headers_X-XSS-Protection',
'Strict-Transport-Security': 'resp_headers_Strict-Transport-Security',
'Content-Security-Policy': 'resp_headers_Content-Security-Policy',
'Referrer-Policy': 'resp_headers_Referrer-Policy'
}
headers_present = {}
for header_name, column_name in security_headers.items():
is_present = column_name in headers_df.columns and headers_df[column_name].notna().any()
headers_present[header_name] = is_present
security_analysis['security_headers_present'] = headers_present
# Calculate security score
present_count = sum(headers_present.values())
security_analysis['security_score'] = (present_count / len(security_headers)) * 100
# Generate recommendations
recommendations = []
for header_name, is_present in headers_present.items():
if not is_present:
recommendations.append(f"Add {header_name} header for improved security")
security_analysis['security_recommendations'] = recommendations
return security_analysis
except Exception as e:
st.error(f"Error analyzing security headers: {str(e)}")
return {}
def _analyze_mobile_seo(self, crawl_data: Dict[str, Any]) -> Dict[str, Any]:
"""Analyze mobile SEO factors."""
try:
st.info("📱 Analyzing mobile SEO factors...")
if 'crawl_dataframe' not in crawl_data:
return {}
df = crawl_data['crawl_dataframe']
mobile_analysis = {
'viewport_analysis': {},
'mobile_optimization': {},
'responsive_design_indicators': {}
}
# Viewport meta tag analysis
if 'viewport' in df.columns:
viewport_present = df['viewport'].notna().sum()
mobile_analysis['viewport_analysis'] = {
'pages_with_viewport': viewport_present,
'viewport_coverage': viewport_present / len(df) * 100,
'pages_missing_viewport': len(df) - viewport_present
}
# Check for mobile-specific meta tags and indicators
mobile_indicators = []
# Check for touch icons
if any('touch-icon' in col for col in df.columns):
mobile_indicators.append("Touch icons configured")
# Check for responsive design indicators in content
# This is a simplified check - in practice, you'd analyze CSS and page structure
mobile_analysis['mobile_optimization'] = {
'mobile_indicators_found': len(mobile_indicators),
'mobile_indicators': mobile_indicators
}
return mobile_analysis
except Exception as e:
st.error(f"Error analyzing mobile SEO: {str(e)}")
return {}
def _generate_technical_recommendations(self, results: Dict[str, Any]) -> Dict[str, Any]:
"""Generate AI-powered technical SEO recommendations."""
try:
st.info("🤖 Generating technical recommendations...")
# Prepare technical analysis summary for AI
technical_summary = {
'website_url': results.get('website_url', ''),
'pages_crawled': results.get('crawl_overview', {}).get('pages_crawled', 0),
'error_count': results.get('technical_issues', {}).get('http_errors', {}).get('total_errors', 0),
'avg_load_time': results.get('performance_analysis', {}).get('load_time_analysis', {}).get('avg_load_time', 0),
'security_score': results.get('security_headers', {}).get('security_score', 0),
'missing_titles': results.get('content_analysis', {}).get('title_analysis', {}).get('missing_titles', 0),
'missing_meta_desc': results.get('content_analysis', {}).get('meta_description_analysis', {}).get('missing_meta_descriptions', 0)
}
# Generate AI recommendations
prompt = f"""
As a technical SEO expert, analyze this comprehensive website audit and provide prioritized recommendations:
WEBSITE: {technical_summary['website_url']}
PAGES ANALYZED: {technical_summary['pages_crawled']}
TECHNICAL ISSUES:
- HTTP Errors: {technical_summary['error_count']}
- Average Load Time: {technical_summary['avg_load_time']:.2f}s
- Security Score: {technical_summary['security_score']:.1f}%
- Missing Titles: {technical_summary['missing_titles']}
- Missing Meta Descriptions: {technical_summary['missing_meta_desc']}
PROVIDE:
1. Critical Issues (Fix Immediately)
2. High Priority Optimizations
3. Medium Priority Improvements
4. Long-term Technical Strategy
5. Specific Implementation Steps
6. Expected Impact Assessment
Format as JSON with clear priorities and actionable recommendations.
"""
ai_response = llm_text_gen(
prompt=prompt,
system_prompt="You are a senior technical SEO specialist with expertise in website optimization, Core Web Vitals, and search engine best practices.",
response_format="json_object"
)
if ai_response:
return ai_response
else:
return {'recommendations': ['AI recommendations temporarily unavailable']}
except Exception as e:
st.error(f"Error generating recommendations: {str(e)}")
return {}
def _find_redirect_chains(self, redirects_df: pd.DataFrame) -> List[Dict[str, Any]]:
"""Find redirect chains in the crawled data."""
# Simplified redirect chain detection
# In a full implementation, you'd trace the redirect paths
redirect_chains = []
if len(redirects_df) > 0:
# Group redirects by status code
for status_code in redirects_df['status'].unique():
status_redirects = redirects_df[redirects_df['status'] == status_code]
redirect_chains.append({
'status_code': int(status_code),
'count': len(status_redirects),
'examples': status_redirects['url'].head(5).tolist()
})
return redirect_chains

View File

@@ -0,0 +1,968 @@
"""
Technical SEO Crawler UI with Comprehensive Analysis Dashboard.
This module provides a professional Streamlit interface for the Technical SEO Crawler
with detailed analysis results, visualization, and export capabilities.
"""
import streamlit as st
import pandas as pd
from typing import Dict, Any, List
import json
from datetime import datetime
import io
import base64
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from .crawler import TechnicalSEOCrawler
from lib.alwrity_ui.dashboard_styles import apply_dashboard_style, render_dashboard_header
class TechnicalSEOCrawlerUI:
"""Professional UI for Technical SEO Crawler."""
def __init__(self):
"""Initialize the Technical SEO Crawler UI."""
self.crawler = TechnicalSEOCrawler()
# Apply dashboard styling
apply_dashboard_style()
def render(self):
"""Render the Technical SEO Crawler interface."""
# Enhanced dashboard header
render_dashboard_header(
"🔧 Technical SEO Crawler",
"Comprehensive site-wide technical SEO analysis with AI-powered recommendations. Identify and fix technical issues that impact your search rankings."
)
# Main content area
with st.container():
# Analysis input form
self._render_crawler_form()
# Session state for results
if 'technical_seo_results' in st.session_state and st.session_state.technical_seo_results:
st.markdown("---")
self._render_results_dashboard(st.session_state.technical_seo_results)
def _render_crawler_form(self):
"""Render the crawler configuration form."""
st.markdown("## 🚀 Configure Technical SEO Audit")
with st.form("technical_seo_crawler_form"):
# Website URL input
col1, col2 = st.columns([3, 1])
with col1:
website_url = st.text_input(
"🌐 Website URL to Audit",
placeholder="https://yourwebsite.com",
help="Enter the website URL for comprehensive technical SEO analysis"
)
with col2:
audit_type = st.selectbox(
"🎯 Audit Type",
options=["Standard", "Deep", "Quick"],
help="Choose the depth of analysis"
)
# Crawl configuration
st.markdown("### ⚙️ Crawl Configuration")
col1, col2, col3 = st.columns(3)
with col1:
if audit_type == "Quick":
crawl_depth = st.slider("Crawl Depth", 1, 2, 1)
max_pages = st.slider("Max Pages", 10, 100, 50)
elif audit_type == "Deep":
crawl_depth = st.slider("Crawl Depth", 1, 5, 4)
max_pages = st.slider("Max Pages", 100, 1000, 500)
else: # Standard
crawl_depth = st.slider("Crawl Depth", 1, 4, 3)
max_pages = st.slider("Max Pages", 50, 500, 200)
with col2:
analyze_images = st.checkbox(
"🖼️ Analyze Images",
value=True,
help="Include image SEO analysis"
)
analyze_security = st.checkbox(
"🛡️ Security Headers",
value=True,
help="Analyze security headers"
)
with col3:
analyze_mobile = st.checkbox(
"📱 Mobile SEO",
value=True,
help="Include mobile SEO analysis"
)
ai_recommendations = st.checkbox(
"🤖 AI Recommendations",
value=True,
help="Generate AI-powered recommendations"
)
# Analysis scope
st.markdown("### 🎯 Analysis Scope")
analysis_options = st.multiselect(
"Select Analysis Components",
options=[
"Technical Issues Detection",
"Performance Analysis",
"Content Structure Analysis",
"URL Structure Optimization",
"Internal Linking Analysis",
"Duplicate Content Detection"
],
default=[
"Technical Issues Detection",
"Performance Analysis",
"Content Structure Analysis"
],
help="Choose which analysis components to include"
)
# Submit button
submitted = st.form_submit_button(
"🚀 Start Technical SEO Audit",
use_container_width=True,
type="primary"
)
if submitted:
# Validate inputs
if not website_url or not website_url.startswith(('http://', 'https://')):
st.error("❌ Please enter a valid website URL starting with http:// or https://")
return
# Run technical SEO analysis
self._run_technical_analysis(
website_url=website_url,
crawl_depth=crawl_depth,
max_pages=max_pages,
options={
'analyze_images': analyze_images,
'analyze_security': analyze_security,
'analyze_mobile': analyze_mobile,
'ai_recommendations': ai_recommendations,
'analysis_scope': analysis_options
}
)
def _run_technical_analysis(self, website_url: str, crawl_depth: int,
max_pages: int, options: Dict[str, Any]):
"""Run the technical SEO analysis."""
try:
with st.spinner("🔄 Running Comprehensive Technical SEO Audit..."):
# Initialize progress tracking
progress_bar = st.progress(0)
status_text = st.empty()
# Update progress
progress_bar.progress(10)
status_text.text("🚀 Initializing technical SEO crawler...")
# Run comprehensive analysis
results = self.crawler.analyze_website_technical_seo(
website_url=website_url,
crawl_depth=crawl_depth,
max_pages=max_pages
)
progress_bar.progress(100)
status_text.text("✅ Technical SEO audit complete!")
# Store results in session state
st.session_state.technical_seo_results = results
# Clear progress indicators
progress_bar.empty()
status_text.empty()
if 'error' in results:
st.error(f"❌ Analysis failed: {results['error']}")
else:
st.success("🎉 Technical SEO Audit completed successfully!")
st.balloons()
# Rerun to show results
st.rerun()
except Exception as e:
st.error(f"❌ Error running technical analysis: {str(e)}")
def _render_results_dashboard(self, results: Dict[str, Any]):
"""Render the comprehensive results dashboard."""
if 'error' in results:
st.error(f"❌ Analysis Error: {results['error']}")
return
# Results header
st.markdown("## 📊 Technical SEO Audit Results")
# Key metrics overview
self._render_metrics_overview(results)
# Detailed analysis tabs
self._render_detailed_analysis(results)
# Export functionality
self._render_export_options(results)
def _render_metrics_overview(self, results: Dict[str, Any]):
"""Render key metrics overview."""
st.markdown("### 📈 Audit Overview")
# Create metrics columns
col1, col2, col3, col4, col5, col6 = st.columns(6)
with col1:
pages_crawled = results.get('crawl_overview', {}).get('pages_crawled', 0)
st.metric(
"🕷️ Pages Crawled",
pages_crawled,
help="Total pages analyzed"
)
with col2:
error_count = results.get('technical_issues', {}).get('http_errors', {}).get('total_errors', 0)
st.metric(
"❌ HTTP Errors",
error_count,
delta=f"-{error_count}" if error_count > 0 else None,
help="Pages with HTTP errors (4xx, 5xx)"
)
with col3:
avg_load_time = results.get('performance_analysis', {}).get('load_time_analysis', {}).get('avg_load_time', 0)
st.metric(
"⚡ Avg Load Time",
f"{avg_load_time:.2f}s",
delta=f"+{avg_load_time:.2f}s" if avg_load_time > 3 else None,
help="Average page load time"
)
with col4:
security_score = results.get('security_headers', {}).get('security_score', 0)
st.metric(
"🛡️ Security Score",
f"{security_score:.0f}%",
delta=f"{security_score:.0f}%" if security_score < 100 else None,
help="Security headers implementation score"
)
with col5:
missing_titles = results.get('content_analysis', {}).get('title_analysis', {}).get('missing_titles', 0)
st.metric(
"📝 Missing Titles",
missing_titles,
delta=f"-{missing_titles}" if missing_titles > 0 else None,
help="Pages without title tags"
)
with col6:
image_count = results.get('image_optimization', {}).get('image_count', 0)
st.metric(
"🖼️ Images Analyzed",
image_count,
help="Total images found and analyzed"
)
# Analysis timestamp
if results.get('analysis_timestamp'):
timestamp = datetime.fromisoformat(results['analysis_timestamp'].replace('Z', '+00:00'))
st.caption(f"📅 Audit completed: {timestamp.strftime('%Y-%m-%d %H:%M:%S UTC')}")
def _render_detailed_analysis(self, results: Dict[str, Any]):
"""Render detailed analysis in tabs."""
# Create main analysis tabs
tab1, tab2, tab3, tab4, tab5, tab6, tab7 = st.tabs([
"🔍 Technical Issues",
"⚡ Performance",
"📊 Content Analysis",
"🔗 URL Structure",
"🖼️ Image SEO",
"🛡️ Security",
"🤖 AI Recommendations"
])
with tab1:
self._render_technical_issues(results.get('technical_issues', {}))
with tab2:
self._render_performance_analysis(results.get('performance_analysis', {}))
with tab3:
self._render_content_analysis(results.get('content_analysis', {}))
with tab4:
self._render_url_structure(results.get('url_structure', {}))
with tab5:
self._render_image_analysis(results.get('image_optimization', {}))
with tab6:
self._render_security_analysis(results.get('security_headers', {}))
with tab7:
self._render_ai_recommendations(results.get('ai_recommendations', {}))
def _render_technical_issues(self, technical_data: Dict[str, Any]):
"""Render technical issues analysis."""
st.markdown("### 🔍 Technical SEO Issues")
if not technical_data:
st.info("No technical issues data available")
return
# HTTP Errors
if technical_data.get('http_errors'):
http_errors = technical_data['http_errors']
st.markdown("#### ❌ HTTP Status Code Errors")
if http_errors.get('total_errors', 0) > 0:
st.error(f"Found {http_errors['total_errors']} pages with HTTP errors!")
# Error breakdown chart
if http_errors.get('error_breakdown'):
error_df = pd.DataFrame(
list(http_errors['error_breakdown'].items()),
columns=['Status Code', 'Count']
)
fig = px.bar(error_df, x='Status Code', y='Count',
title="HTTP Error Distribution")
st.plotly_chart(fig, use_container_width=True)
# Error pages table
if http_errors.get('error_pages'):
st.markdown("**Pages with Errors:**")
error_pages_df = pd.DataFrame(http_errors['error_pages'])
st.dataframe(error_pages_df, use_container_width=True)
else:
st.success("✅ No HTTP errors found!")
# Redirect Issues
if technical_data.get('redirect_issues'):
redirect_data = technical_data['redirect_issues']
st.markdown("#### 🔄 Redirect Analysis")
total_redirects = redirect_data.get('total_redirects', 0)
if total_redirects > 0:
st.warning(f"Found {total_redirects} redirect(s)")
# Redirect types
if redirect_data.get('redirect_types'):
redirect_df = pd.DataFrame(
list(redirect_data['redirect_types'].items()),
columns=['Redirect Type', 'Count']
)
st.bar_chart(redirect_df.set_index('Redirect Type'))
else:
st.success("✅ No redirects found")
# Duplicate Content
if technical_data.get('duplicate_content'):
duplicate_data = technical_data['duplicate_content']
st.markdown("#### 📋 Duplicate Content Issues")
duplicate_titles = duplicate_data.get('duplicate_titles', 0)
if duplicate_titles > 0:
st.warning(f"Found {duplicate_titles} duplicate title(s)")
# Show duplicate title groups
if duplicate_data.get('pages_with_duplicate_titles'):
duplicate_df = pd.DataFrame(duplicate_data['pages_with_duplicate_titles'])
st.dataframe(duplicate_df, use_container_width=True)
else:
st.success("✅ No duplicate titles found")
# Missing Elements
if technical_data.get('missing_elements'):
missing_data = technical_data['missing_elements']
st.markdown("#### 📝 Missing SEO Elements")
col1, col2, col3 = st.columns(3)
with col1:
missing_titles = missing_data.get('missing_titles', 0)
if missing_titles > 0:
st.error(f"Missing Titles: {missing_titles}")
else:
st.success("All pages have titles ✅")
with col2:
missing_meta = missing_data.get('missing_meta_desc', 0)
if missing_meta > 0:
st.error(f"Missing Meta Descriptions: {missing_meta}")
else:
st.success("All pages have meta descriptions ✅")
with col3:
missing_h1 = missing_data.get('missing_h1', 0)
if missing_h1 > 0:
st.error(f"Missing H1 tags: {missing_h1}")
else:
st.success("All pages have H1 tags ✅")
def _render_performance_analysis(self, performance_data: Dict[str, Any]):
"""Render performance analysis."""
st.markdown("### ⚡ Website Performance Analysis")
if not performance_data:
st.info("No performance data available")
return
# Load Time Analysis
if performance_data.get('load_time_analysis'):
load_time_data = performance_data['load_time_analysis']
st.markdown("#### 🚀 Page Load Time Analysis")
col1, col2, col3 = st.columns(3)
with col1:
avg_load = load_time_data.get('avg_load_time', 0)
st.metric("Average Load Time", f"{avg_load:.2f}s")
with col2:
median_load = load_time_data.get('median_load_time', 0)
st.metric("Median Load Time", f"{median_load:.2f}s")
with col3:
p95_load = load_time_data.get('p95_load_time', 0)
st.metric("95th Percentile", f"{p95_load:.2f}s")
# Performance distribution
if load_time_data.get('performance_distribution'):
perf_dist = load_time_data['performance_distribution']
# Create pie chart for performance distribution
labels = ['Fast (≤1s)', 'Moderate (1-3s)', 'Slow (>3s)']
values = [
perf_dist.get('fast_pages', 0),
perf_dist.get('moderate_pages', 0),
perf_dist.get('slow_pages', 0)
]
fig = px.pie(values=values, names=labels,
title="Page Load Time Distribution")
st.plotly_chart(fig, use_container_width=True)
# Content Size Analysis
if performance_data.get('content_size_analysis'):
size_data = performance_data['content_size_analysis']
st.markdown("#### 📦 Content Size Analysis")
col1, col2, col3 = st.columns(3)
with col1:
avg_size = size_data.get('avg_page_size', 0)
st.metric("Average Page Size", f"{avg_size/1024:.1f} KB")
with col2:
largest_size = size_data.get('largest_page', 0)
st.metric("Largest Page", f"{largest_size/1024:.1f} KB")
with col3:
large_pages = size_data.get('pages_over_1mb', 0)
st.metric("Pages >1MB", large_pages)
# Server Performance
if performance_data.get('server_performance'):
server_data = performance_data['server_performance']
st.markdown("#### 🖥️ Server Performance")
col1, col2, col3 = st.columns(3)
with col1:
success_rate = server_data.get('success_rate', 0)
st.metric("Success Rate", f"{success_rate:.1f}%")
with col2:
error_rate = server_data.get('error_rate', 0)
st.metric("Error Rate", f"{error_rate:.1f}%")
with col3:
redirect_rate = server_data.get('redirect_rate', 0)
st.metric("Redirect Rate", f"{redirect_rate:.1f}%")
def _render_content_analysis(self, content_data: Dict[str, Any]):
"""Render content structure analysis."""
st.markdown("### 📊 Content Structure Analysis")
if not content_data:
st.info("No content analysis data available")
return
# Title Analysis
if content_data.get('title_analysis'):
title_data = content_data['title_analysis']
st.markdown("#### 📝 Title Tag Analysis")
col1, col2 = st.columns(2)
with col1:
avg_title_length = title_data.get('avg_title_length', 0)
st.metric("Average Title Length", f"{avg_title_length:.0f} chars")
duplicate_titles = title_data.get('duplicate_titles', 0)
st.metric("Duplicate Titles", duplicate_titles)
with col2:
# Title length distribution
if title_data.get('title_length_distribution'):
length_dist = title_data['title_length_distribution']
labels = ['Too Short (<30)', 'Optimal (30-60)', 'Too Long (>60)']
values = [
length_dist.get('too_short', 0),
length_dist.get('optimal', 0),
length_dist.get('too_long', 0)
]
fig = px.pie(values=values, names=labels,
title="Title Length Distribution")
st.plotly_chart(fig, use_container_width=True)
# Meta Description Analysis
if content_data.get('meta_description_analysis'):
meta_data = content_data['meta_description_analysis']
st.markdown("#### 🏷️ Meta Description Analysis")
col1, col2 = st.columns(2)
with col1:
avg_meta_length = meta_data.get('avg_meta_length', 0)
st.metric("Average Meta Length", f"{avg_meta_length:.0f} chars")
missing_meta = meta_data.get('missing_meta_descriptions', 0)
st.metric("Missing Meta Descriptions", missing_meta)
with col2:
# Meta length distribution
if meta_data.get('meta_length_distribution'):
meta_dist = meta_data['meta_length_distribution']
labels = ['Too Short (<120)', 'Optimal (120-160)', 'Too Long (>160)']
values = [
meta_dist.get('too_short', 0),
meta_dist.get('optimal', 0),
meta_dist.get('too_long', 0)
]
fig = px.pie(values=values, names=labels,
title="Meta Description Length Distribution")
st.plotly_chart(fig, use_container_width=True)
# Heading Structure
if content_data.get('heading_structure'):
heading_data = content_data['heading_structure']
st.markdown("#### 📋 Heading Structure Analysis")
# Create heading usage chart
heading_usage = []
for heading_type, data in heading_data.items():
heading_usage.append({
'Heading': heading_type.replace('_usage', '').upper(),
'Usage Rate': data.get('usage_rate', 0),
'Pages': data.get('pages_with_heading', 0)
})
if heading_usage:
heading_df = pd.DataFrame(heading_usage)
fig = px.bar(heading_df, x='Heading', y='Usage Rate',
title="Heading Tag Usage Rates")
st.plotly_chart(fig, use_container_width=True)
st.dataframe(heading_df, use_container_width=True)
def _render_url_structure(self, url_data: Dict[str, Any]):
"""Render URL structure analysis."""
st.markdown("### 🔗 URL Structure Analysis")
if not url_data:
st.info("No URL structure data available")
return
# URL Length Analysis
if url_data.get('url_length_analysis'):
length_data = url_data['url_length_analysis']
st.markdown("#### 📏 URL Length Analysis")
col1, col2, col3 = st.columns(3)
with col1:
avg_length = length_data.get('avg_url_length', 0)
st.metric("Average URL Length", f"{avg_length:.0f} chars")
with col2:
max_length = length_data.get('max_url_length', 0)
st.metric("Longest URL", f"{max_length:.0f} chars")
with col3:
long_urls = length_data.get('long_urls_count', 0)
st.metric("URLs >100 chars", long_urls)
# URL Structure Patterns
if url_data.get('url_structure_patterns'):
pattern_data = url_data['url_structure_patterns']
st.markdown("#### 🏗️ URL Structure Patterns")
col1, col2 = st.columns(2)
with col1:
https_usage = pattern_data.get('https_usage', 0)
st.metric("HTTPS Usage", f"{https_usage:.1f}%")
with col2:
subdomain_usage = pattern_data.get('subdomain_usage', 0)
st.metric("Subdomains Found", subdomain_usage)
# Path Analysis
if url_data.get('path_analysis'):
path_data = url_data['path_analysis']
st.markdown("#### 📂 Path Depth Analysis")
col1, col2, col3 = st.columns(3)
with col1:
avg_depth = path_data.get('avg_path_depth', 0)
st.metric("Average Path Depth", f"{avg_depth:.1f}")
with col2:
max_depth = path_data.get('max_path_depth', 0)
st.metric("Maximum Depth", max_depth)
with col3:
deep_paths = path_data.get('deep_paths_count', 0)
st.metric("Deep Paths (>4)", deep_paths)
# Optimization Issues
if url_data.get('url_optimization'):
opt_data = url_data['url_optimization']
st.markdown("#### ⚠️ URL Optimization Issues")
issues_found = opt_data.get('issues_found', 0)
recommendations = opt_data.get('optimization_recommendations', [])
if issues_found > 0:
st.warning(f"Found {issues_found} URL optimization issue(s)")
for rec in recommendations:
st.write(f"{rec}")
else:
st.success("✅ No URL optimization issues found")
def _render_image_analysis(self, image_data: Dict[str, Any]):
"""Render image SEO analysis."""
st.markdown("### 🖼️ Image SEO Analysis")
if not image_data:
st.info("No image analysis data available")
return
# Image overview
image_count = image_data.get('image_count', 0)
st.metric("Total Images Found", image_count)
if image_count > 0:
# Alt text analysis
if image_data.get('alt_text_analysis'):
alt_data = image_data['alt_text_analysis']
st.markdown("#### 📝 Alt Text Analysis")
col1, col2, col3 = st.columns(3)
with col1:
images_with_alt = alt_data.get('images_with_alt', 0)
st.metric("Images with Alt Text", images_with_alt)
with col2:
images_missing_alt = alt_data.get('images_missing_alt', 0)
st.metric("Missing Alt Text", images_missing_alt)
with col3:
alt_coverage = alt_data.get('alt_text_coverage', 0)
st.metric("Alt Text Coverage", f"{alt_coverage:.1f}%")
# Image format analysis
if image_data.get('image_format_analysis'):
format_data = image_data['image_format_analysis']
st.markdown("#### 🎨 Image Format Analysis")
if format_data.get('format_distribution'):
format_dist = format_data['format_distribution']
format_df = pd.DataFrame(
list(format_dist.items()),
columns=['Format', 'Count']
)
fig = px.pie(format_df, values='Count', names='Format',
title="Image Format Distribution")
st.plotly_chart(fig, use_container_width=True)
modern_formats = format_data.get('modern_format_usage', 0)
st.metric("Modern Formats (WebP/AVIF)", modern_formats)
else:
st.info("No images found to analyze")
def _render_security_analysis(self, security_data: Dict[str, Any]):
"""Render security analysis."""
st.markdown("### 🛡️ Security Headers Analysis")
if not security_data:
st.info("No security analysis data available")
return
# Security score
security_score = security_data.get('security_score', 0)
col1, col2 = st.columns([1, 2])
with col1:
st.metric("Security Score", f"{security_score:.0f}%")
if security_score >= 80:
st.success("🔒 Good security posture")
elif security_score >= 50:
st.warning("⚠️ Moderate security")
else:
st.error("🚨 Poor security posture")
with col2:
# Security headers status
if security_data.get('security_headers_present'):
headers_status = security_data['security_headers_present']
st.markdown("**Security Headers Status:**")
for header, present in headers_status.items():
status = "" if present else ""
st.write(f"{status} {header}")
# Security recommendations
if security_data.get('security_recommendations'):
recommendations = security_data['security_recommendations']
if recommendations:
st.markdown("#### 🔧 Security Recommendations")
for rec in recommendations:
st.write(f"{rec}")
else:
st.success("✅ All security headers properly configured")
def _render_ai_recommendations(self, ai_data: Dict[str, Any]):
"""Render AI-generated recommendations."""
st.markdown("### 🤖 AI-Powered Technical Recommendations")
if not ai_data:
st.info("No AI recommendations available")
return
# Critical Issues
if ai_data.get('critical_issues'):
st.markdown("#### 🚨 Critical Issues (Fix Immediately)")
critical_issues = ai_data['critical_issues']
for issue in critical_issues:
st.error(f"🚨 {issue}")
# High Priority
if ai_data.get('high_priority'):
st.markdown("#### 🔥 High Priority Optimizations")
high_priority = ai_data['high_priority']
for item in high_priority:
st.warning(f"{item}")
# Medium Priority
if ai_data.get('medium_priority'):
st.markdown("#### 📈 Medium Priority Improvements")
medium_priority = ai_data['medium_priority']
for item in medium_priority:
st.info(f"📊 {item}")
# Implementation Steps
if ai_data.get('implementation_steps'):
st.markdown("#### 🛠️ Implementation Steps")
steps = ai_data['implementation_steps']
for i, step in enumerate(steps, 1):
st.write(f"{i}. {step}")
# Expected Impact
if ai_data.get('expected_impact'):
st.markdown("#### 📈 Expected Impact Assessment")
impact = ai_data['expected_impact']
st.markdown(impact)
def _render_export_options(self, results: Dict[str, Any]):
"""Render export options for analysis results."""
st.markdown("---")
st.markdown("### 📥 Export Technical SEO Audit")
col1, col2, col3 = st.columns(3)
with col1:
# JSON export
if st.button("📄 Export Full Report (JSON)", use_container_width=True):
json_data = json.dumps(results, indent=2, default=str)
st.download_button(
label="⬇️ Download JSON Report",
data=json_data,
file_name=f"technical_seo_audit_{datetime.now().strftime('%Y%m%d_%H%M%S')}.json",
mime="application/json",
use_container_width=True
)
with col2:
# CSV export for issues
if st.button("📊 Export Issues CSV", use_container_width=True):
issues_data = self._prepare_issues_csv(results)
if issues_data:
st.download_button(
label="⬇️ Download Issues CSV",
data=issues_data,
file_name=f"technical_issues_{datetime.now().strftime('%Y%m%d_%H%M%S')}.csv",
mime="text/csv",
use_container_width=True
)
else:
st.info("No issues found to export")
with col3:
# Executive summary
if st.button("📋 Executive Summary", use_container_width=True):
summary = self._generate_executive_summary(results)
st.download_button(
label="⬇️ Download Summary",
data=summary,
file_name=f"technical_seo_summary_{datetime.now().strftime('%Y%m%d_%H%M%S')}.txt",
mime="text/plain",
use_container_width=True
)
def _prepare_issues_csv(self, results: Dict[str, Any]) -> str:
"""Prepare CSV data for technical issues."""
issues_list = []
# HTTP errors
http_errors = results.get('technical_issues', {}).get('http_errors', {})
if http_errors.get('error_pages'):
for error in http_errors['error_pages']:
issues_list.append({
'Issue Type': 'HTTP Error',
'Severity': 'High',
'URL': error.get('url', ''),
'Status Code': error.get('status', ''),
'Description': f"HTTP {error.get('status', '')} error"
})
# Missing elements
missing_elements = results.get('technical_issues', {}).get('missing_elements', {})
# Add more issue types as needed...
if issues_list:
issues_df = pd.DataFrame(issues_list)
return issues_df.to_csv(index=False)
return ""
def _generate_executive_summary(self, results: Dict[str, Any]) -> str:
"""Generate executive summary report."""
website_url = results.get('website_url', 'Unknown')
timestamp = results.get('analysis_timestamp', datetime.now().isoformat())
summary = f"""
TECHNICAL SEO AUDIT - EXECUTIVE SUMMARY
======================================
Website: {website_url}
Audit Date: {timestamp}
AUDIT OVERVIEW
--------------
Pages Crawled: {results.get('crawl_overview', {}).get('pages_crawled', 0)}
HTTP Errors: {results.get('technical_issues', {}).get('http_errors', {}).get('total_errors', 0)}
Average Load Time: {results.get('performance_analysis', {}).get('load_time_analysis', {}).get('avg_load_time', 0):.2f}s
Security Score: {results.get('security_headers', {}).get('security_score', 0):.0f}%
CRITICAL FINDINGS
-----------------
"""
# Add critical findings
error_count = results.get('technical_issues', {}).get('http_errors', {}).get('total_errors', 0)
if error_count > 0:
summary += f"{error_count} pages have HTTP errors requiring immediate attention\n"
avg_load_time = results.get('performance_analysis', {}).get('load_time_analysis', {}).get('avg_load_time', 0)
if avg_load_time > 3:
summary += f"• Page load times are slow (avg: {avg_load_time:.2f}s), impacting user experience\n"
security_score = results.get('security_headers', {}).get('security_score', 0)
if security_score < 80:
summary += f"• Security headers need improvement (current score: {security_score:.0f}%)\n"
summary += f"\n\nDetailed technical audit completed by ALwrity Technical SEO Crawler\nGenerated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}"
return summary
# Render function for integration with main dashboard
def render_technical_seo_crawler():
"""Render the Technical SEO Crawler UI."""
ui = TechnicalSEOCrawlerUI()
ui.render()

View File

@@ -0,0 +1,58 @@
"""Text analysis tools using textstat."""
import streamlit as st
from textstat import textstat
def analyze_text(text):
"""Analyze text using textstat metrics."""
if not text:
st.warning("Please enter some text to analyze.")
return
# Calculate various metrics
metrics = {
"Flesch Reading Ease": textstat.flesch_reading_ease(text),
"Flesch-Kincaid Grade Level": textstat.flesch_kincaid_grade(text),
"Gunning Fog Index": textstat.gunning_fog(text),
"SMOG Index": textstat.smog_index(text),
"Automated Readability Index": textstat.automated_readability_index(text),
"Coleman-Liau Index": textstat.coleman_liau_index(text),
"Linsear Write Formula": textstat.linsear_write_formula(text),
"Dale-Chall Readability Score": textstat.dale_chall_readability_score(text),
"Readability Consensus": textstat.readability_consensus(text)
}
# Display metrics in a clean format
st.subheader("Text Analysis Results")
for metric, value in metrics.items():
st.metric(metric, f"{value:.2f}")
# Add visualizations
st.subheader("Visualization")
st.bar_chart(metrics)
st.title("📖 Text Readability Analyzer: Making Your Content Easy to Read")
st.write("""
This tool is your guide to writing content that's easy for your audience to understand.
Just paste in a sample of your text, and we'll break down the readability scores and offer actionable tips!
""")
text_input = st.text_area("Paste your text here:", height=200)
if st.button("Analyze!"):
with st.spinner("Analyzing your text..."):
test_data = text_input
if not test_data.strip():
st.error("Please enter text to analyze.")
else:
analyze_text(test_data)
st.subheader("Key Takeaways:")
st.write("---")
st.markdown("""
* **Don't Be Afraid to Simplify!** Often, simpler language makes content more impactful and easier to digest.
* **Aim for a Reading Level Appropriate for Your Audience:** Consider the education level, background, and familiarity of your readers.
* **Use Short Sentences:** This makes your content more scannable and easier to read.
* **Write for Everyone:** Accessibility should always be a priority. When in doubt, aim for clear, concise language!
""")

View File

@@ -0,0 +1,2 @@
1). Replace Firecrawl with scrapy or crawlee : https://crawlee.dev/python/docs/introduction

View File

@@ -0,0 +1,980 @@
####################################################
#
# FIXME: Gotta use this lib: https://github.com/monk1337/resp/tree/main
# https://github.com/danielnsilva/semanticscholar
# https://github.com/shauryr/S2QA
#
####################################################
import os
import sys
import re
import pandas as pd
import arxiv
import PyPDF2
import requests
import networkx as nx
from bs4 import BeautifulSoup
from urllib.parse import urlparse
from loguru import logger
from ..gpt_providers.text_generation.main_text_generation import llm_text_gen
import bibtexparser
from pylatexenc.latex2text import LatexNodes2Text
from matplotlib import pyplot as plt
from collections import defaultdict
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.cluster import KMeans
import numpy as np
logger.remove()
logger.add(sys.stdout, colorize=True, format="<level>{level}</level>|<green>{file}:{line}:{function}</green>| {message}")
def create_arxiv_client(page_size=100, delay_seconds=3.0, num_retries=3):
"""
Creates a reusable arXiv API client with custom configuration.
Args:
page_size (int): Number of results per page (default: 100)
delay_seconds (float): Delay between API requests (default: 3.0)
num_retries (int): Number of retries for failed requests (default: 3)
Returns:
arxiv.Client: Configured arXiv API client
"""
try:
client = arxiv.Client(
page_size=page_size,
delay_seconds=delay_seconds,
num_retries=num_retries
)
return client
except Exception as e:
logger.error(f"Error creating arXiv client: {e}")
raise e
def expand_search_query(query, research_interests=None):
"""
Uses AI to expand the search query based on user's research interests.
Args:
query (str): Original search query
research_interests (list): List of user's research interests
Returns:
str: Expanded search query
"""
try:
interests_context = "\n".join(research_interests) if research_interests else ""
prompt = f"""Given the original arXiv search query: '{query}'
{f'And considering these research interests:\n{interests_context}' if interests_context else ''}
Generate an expanded arXiv search query that:
1. Includes relevant synonyms and related concepts
2. Uses appropriate arXiv search operators (AND, OR, etc.)
3. Incorporates field-specific tags (ti:, abs:, au:, etc.)
4. Maintains focus on the core topic
Return only the expanded query without any explanation."""
expanded_query = llm_text_gen(prompt)
logger.info(f"Expanded query: {expanded_query}")
return expanded_query
except Exception as e:
logger.error(f"Error expanding search query: {e}")
return query
def analyze_citation_network(papers):
"""
Analyzes citation relationships between papers using DOIs and references.
Args:
papers (list): List of paper metadata dictionaries
Returns:
dict: Citation network analysis results
"""
try:
# Create a directed graph for citations
G = nx.DiGraph()
# Add nodes and edges
for paper in papers:
paper_id = paper['entry_id']
G.add_node(paper_id, title=paper['title'])
# Add edges based on DOIs and references
if paper['doi']:
for other_paper in papers:
if other_paper['doi'] and other_paper['doi'] in paper['summary']:
G.add_edge(paper_id, other_paper['entry_id'])
# Calculate network metrics
analysis = {
'influential_papers': sorted(nx.pagerank(G).items(), key=lambda x: x[1], reverse=True),
'citation_clusters': list(nx.connected_components(G.to_undirected())),
'citation_paths': dict(nx.all_pairs_shortest_path_length(G))
}
return analysis
except Exception as e:
logger.error(f"Error analyzing citation network: {e}")
return {}
def categorize_papers(papers):
"""
Uses AI to categorize papers based on their metadata and content.
Args:
papers (list): List of paper metadata dictionaries
Returns:
dict: Paper categorization results
"""
try:
categorized_papers = {}
for paper in papers:
prompt = f"""Analyze this research paper and provide detailed categorization:
Title: {paper['title']}
Abstract: {paper['summary']}
Primary Category: {paper['primary_category']}
Categories: {', '.join(paper['categories'])}
Provide a JSON response with these fields:
1. main_theme: Primary research theme
2. sub_themes: List of related sub-themes
3. methodology: Research methodology used
4. application_domains: Potential application areas
5. technical_complexity: Level (Basic/Intermediate/Advanced)"""
categorization = llm_text_gen(prompt)
categorized_papers[paper['entry_id']] = categorization
return categorized_papers
except Exception as e:
logger.error(f"Error categorizing papers: {e}")
return {}
def get_paper_recommendations(papers, research_interests):
"""
Generates personalized paper recommendations based on user's research interests.
Args:
papers (list): List of paper metadata dictionaries
research_interests (list): User's research interests
Returns:
dict: Personalized paper recommendations
"""
try:
interests_text = "\n".join(research_interests)
recommendations = {}
for paper in papers:
prompt = f"""Evaluate this paper's relevance to the user's research interests:
Paper:
- Title: {paper['title']}
- Abstract: {paper['summary']}
- Categories: {', '.join(paper['categories'])}
User's Research Interests:
{interests_text}
Provide a JSON response with:
1. relevance_score: 0-100
2. relevance_aspects: List of matching aspects
3. potential_value: How this paper could benefit the user's research"""
evaluation = llm_text_gen(prompt)
recommendations[paper['entry_id']] = evaluation
return recommendations
except Exception as e:
logger.error(f"Error generating paper recommendations: {e}")
return {}
def fetch_arxiv_data(query, max_results=10, sort_by=arxiv.SortCriterion.SubmittedDate, sort_order=None, client=None, research_interests=None):
"""
Fetches arXiv data based on a query with advanced search options.
Args:
query (str): The search query (supports advanced syntax, e.g., 'au:einstein AND cat:physics')
max_results (int): The maximum number of results to fetch
sort_by (arxiv.SortCriterion): Sorting criterion (default: SubmittedDate)
sort_order (str): Sort order ('ascending' or 'descending', default: None)
client (arxiv.Client): Optional custom client (default: None, creates new client)
Returns:
list: A list of arXiv data with extended metadata
"""
try:
if client is None:
client = create_arxiv_client()
# Expand search query using AI if research interests are provided
expanded_query = expand_search_query(query, research_interests) if research_interests else query
logger.info(f"Using expanded query: {expanded_query}")
search = arxiv.Search(
query=expanded_query,
max_results=max_results,
sort_by=sort_by,
sort_order=sort_order
)
results = list(client.results(search))
all_data = [
{
'title': result.title,
'published': result.published,
'updated': result.updated,
'entry_id': result.entry_id,
'summary': result.summary,
'authors': [str(author) for author in result.authors],
'pdf_url': result.pdf_url,
'journal_ref': getattr(result, 'journal_ref', None),
'doi': getattr(result, 'doi', None),
'primary_category': getattr(result, 'primary_category', None),
'categories': getattr(result, 'categories', []),
'links': [link.href for link in getattr(result, 'links', [])]
}
for result in results
]
# Enhance results with AI-powered analysis
if all_data:
# Analyze citation network
citation_analysis = analyze_citation_network(all_data)
# Categorize papers using AI
paper_categories = categorize_papers(all_data)
# Generate recommendations if research interests are provided
recommendations = get_paper_recommendations(all_data, research_interests) if research_interests else {}
# Perform content analysis
content_analyses = [analyze_paper_content(paper['entry_id']) for paper in all_data]
trend_analysis = analyze_research_trends(all_data)
concept_mapping = map_cross_paper_concepts(all_data)
# Generate bibliography data
bibliography_data = {
'bibtex_entries': [generate_bibtex_entry(paper) for paper in all_data],
'citations': {
'apa': [convert_citation_format(generate_bibtex_entry(paper), 'apa') for paper in all_data],
'mla': [convert_citation_format(generate_bibtex_entry(paper), 'mla') for paper in all_data],
'chicago': [convert_citation_format(generate_bibtex_entry(paper), 'chicago') for paper in all_data]
},
'reference_graph': visualize_reference_graph(all_data),
'citation_impact': analyze_citation_impact(all_data)
}
# Add enhanced data to results
enhanced_data = {
'papers': all_data,
'citation_analysis': citation_analysis,
'paper_categories': paper_categories,
'recommendations': recommendations,
'content_analyses': content_analyses,
'trend_analysis': trend_analysis,
'concept_mapping': concept_mapping,
'bibliography': bibliography_data
}
return enhanced_data
return {'papers': all_data}
except Exception as e:
logger.error(f"An error occurred while fetching data from arXiv: {e}")
raise e
def create_dataframe(data, column_names):
"""
Creates a DataFrame from the provided data.
Args:
data (list): The data to convert to a DataFrame.
column_names (list): The column names for the DataFrame.
Returns:
DataFrame: The created DataFrame.
"""
try:
df = pd.DataFrame(data, columns=column_names)
return df
except Exception as e:
logger.error(f"An error occurred while creating DataFrame: {e}")
return pd.DataFrame()
def get_arxiv_main_content(url):
"""
Returns the main content of an arXiv paper.
Args:
url (str): The URL of the arXiv paper.
Returns:
str: The main content of the paper as a string.
"""
try:
response = requests.get(url)
response.raise_for_status()
soup = BeautifulSoup(response.content, "html.parser")
main_content = soup.find('div', class_='ltx_page_content')
if not main_content:
logger.warning("Main content not found in the page.")
return "Main content not found."
alert_section = main_content.find('div', class_='package-alerts ltx_document')
if (alert_section):
alert_section.decompose()
for element_id in ["abs", "authors"]:
element = main_content.find(id=element_id)
if (element):
element.decompose()
return main_content.text.strip()
except Exception as html_error:
logger.warning(f"HTML content not accessible, trying PDF: {html_error}")
return get_pdf_content(url)
def download_paper(paper_id, output_dir="downloads", filename=None, get_source=False):
"""
Downloads a paper's PDF or source files with enhanced error handling.
Args:
paper_id (str): The arXiv ID of the paper
output_dir (str): Directory to save the downloaded file (default: 'downloads')
filename (str): Custom filename (default: None, uses paper ID)
get_source (bool): If True, downloads source files instead of PDF (default: False)
Returns:
str: Path to the downloaded file or None if download fails
"""
try:
# Create output directory if it doesn't exist
os.makedirs(output_dir, exist_ok=True)
# Get paper metadata
client = create_arxiv_client()
paper = next(client.results(arxiv.Search(id_list=[paper_id])))
# Set filename if not provided
if not filename:
safe_title = re.sub(r'[^\w\-_.]', '_', paper.title[:50])
filename = f"{paper_id}_{safe_title}"
filename += ".tar.gz" if get_source else ".pdf"
# Full path for the downloaded file
file_path = os.path.join(output_dir, filename)
# Download the file
if get_source:
paper.download_source(dirpath=output_dir, filename=filename)
else:
paper.download_pdf(dirpath=output_dir, filename=filename)
logger.info(f"Successfully downloaded {'source' if get_source else 'PDF'} to {file_path}")
return file_path
except Exception as e:
logger.error(f"Error downloading {'source' if get_source else 'PDF'} for {paper_id}: {e}")
return None
def analyze_paper_content(url_or_id, cleanup=True):
"""
Analyzes paper content using AI to extract key information and insights.
Args:
url_or_id (str): The arXiv URL or ID of the paper
cleanup (bool): Whether to delete the PDF after extraction (default: True)
Returns:
dict: Analysis results including summary, key findings, and concepts
"""
try:
# Get paper content
content = get_pdf_content(url_or_id, cleanup)
if not content or 'Failed to' in content:
return {'error': content}
# Generate paper summary
summary_prompt = f"""Analyze this research paper and provide a comprehensive summary:
{content[:8000]} # Limit content length for API
Provide a JSON response with:
1. executive_summary: Brief overview (2-3 sentences)
2. key_findings: List of main research findings
3. methodology: Research methods used
4. implications: Practical implications of the research
5. limitations: Study limitations and constraints"""
summary_analysis = llm_text_gen(summary_prompt)
# Extract key concepts and relationships
concepts_prompt = f"""Analyze this research paper and identify key concepts and relationships:
{content[:8000]}
Provide a JSON response with:
1. main_concepts: List of key technical concepts
2. concept_relationships: How concepts are related
3. novel_contributions: New ideas or approaches introduced
4. technical_requirements: Required technologies or methods
5. future_directions: Suggested future research"""
concept_analysis = llm_text_gen(concepts_prompt)
return {
'summary_analysis': summary_analysis,
'concept_analysis': concept_analysis,
'full_text': content
}
except Exception as e:
logger.error(f"Error analyzing paper content: {e}")
return {'error': str(e)}
def analyze_research_trends(papers):
"""
Analyzes research trends across multiple papers.
Args:
papers (list): List of paper metadata and content
Returns:
dict: Trend analysis results
"""
try:
# Collect paper information
papers_info = []
for paper in papers:
content = get_pdf_content(paper['entry_id'], cleanup=True)
if content and 'Failed to' not in content:
papers_info.append({
'title': paper['title'],
'abstract': paper['summary'],
'content': content[:8000], # Limit content length
'year': paper['published'].year
})
if not papers_info:
return {'error': 'No valid paper content found for analysis'}
# Analyze trends
trends_prompt = f"""Analyze these research papers and identify key trends:
Papers:
{str(papers_info)}
Provide a JSON response with:
1. temporal_trends: How research focus evolved over time
2. emerging_themes: New and growing research areas
3. declining_themes: Decreasing research focus areas
4. methodology_trends: Evolution of research methods
5. technology_trends: Trends in technology usage
6. research_gaps: Identified gaps and opportunities"""
trend_analysis = llm_text_gen(trends_prompt)
return {'trend_analysis': trend_analysis}
except Exception as e:
logger.error(f"Error analyzing research trends: {e}")
return {'error': str(e)}
def map_cross_paper_concepts(papers):
"""
Maps concepts and relationships across multiple papers.
Args:
papers (list): List of paper metadata and content
Returns:
dict: Concept mapping results
"""
try:
# Analyze each paper
paper_analyses = []
for paper in papers:
analysis = analyze_paper_content(paper['entry_id'])
if 'error' not in analysis:
paper_analyses.append({
'paper_id': paper['entry_id'],
'title': paper['title'],
'analysis': analysis
})
if not paper_analyses:
return {'error': 'No valid paper analyses for concept mapping'}
# Generate cross-paper concept map
mapping_prompt = f"""Analyze relationships between concepts across these papers:
{str(paper_analyses)}
Provide a JSON response with:
1. shared_concepts: Concepts appearing in multiple papers
2. concept_evolution: How concepts developed across papers
3. conflicting_views: Different interpretations of same concepts
4. complementary_findings: How papers complement each other
5. knowledge_gaps: Areas needing more research"""
concept_mapping = llm_text_gen(mapping_prompt)
return {'concept_mapping': concept_mapping}
except Exception as e:
logger.error(f"Error mapping cross-paper concepts: {e}")
return {'error': str(e)}
def generate_bibtex_entry(paper):
"""
Generates a BibTeX entry for a paper with complete metadata.
Args:
paper (dict): Paper metadata dictionary
Returns:
str: BibTeX entry string
"""
try:
# Generate a unique citation key
first_author = paper['authors'][0].split()[-1] if paper['authors'] else 'Unknown'
year = paper['published'].year if paper['published'] else '0000'
citation_key = f"{first_author}{year}{paper['entry_id'].split('/')[-1]}"
# Format authors for BibTeX
authors = ' and '.join(paper['authors'])
# Create BibTeX entry
bibtex = f"@article{{{citation_key},\n"
bibtex += f" title = {{{paper['title']}}},\n"
bibtex += f" author = {{{authors}}},\n"
bibtex += f" year = {{{year}}},\n"
bibtex += f" journal = {{arXiv preprint}},\n"
bibtex += f" archivePrefix = {{arXiv}},\n"
bibtex += f" eprint = {{{paper['entry_id'].split('/')[-1]}}},\n"
if paper['doi']:
bibtex += f" doi = {{{paper['doi']}}},\n"
bibtex += f" url = {{{paper['entry_id']}}},\n"
bibtex += f" abstract = {{{paper['summary']}}}\n"
bibtex += "}"
return bibtex
except Exception as e:
logger.error(f"Error generating BibTeX entry: {e}")
return ""
def convert_citation_format(bibtex_str, target_format):
"""
Converts BibTeX citations to other formats and validates the output.
Args:
bibtex_str (str): BibTeX entry string
target_format (str): Target citation format ('apa', 'mla', 'chicago', etc.)
Returns:
str: Formatted citation string
"""
try:
# Parse BibTeX entry
bib_database = bibtexparser.loads(bibtex_str)
entry = bib_database.entries[0]
# Generate citation format prompt
prompt = f"""Convert this bibliographic information to {target_format} format:
Title: {entry.get('title', '')}
Authors: {entry.get('author', '')}
Year: {entry.get('year', '')}
Journal: {entry.get('journal', '')}
DOI: {entry.get('doi', '')}
URL: {entry.get('url', '')}
Return only the formatted citation without any explanation."""
# Use AI to generate formatted citation
formatted_citation = llm_text_gen(prompt)
return formatted_citation.strip()
except Exception as e:
logger.error(f"Error converting citation format: {e}")
return ""
def visualize_reference_graph(papers):
"""
Creates a visual representation of the citation network.
Args:
papers (list): List of paper metadata dictionaries
Returns:
str: Path to the saved visualization file
"""
try:
# Create directed graph
G = nx.DiGraph()
# Add nodes and edges
for paper in papers:
paper_id = paper['entry_id']
G.add_node(paper_id, title=paper['title'])
# Add citation edges
if paper['doi']:
for other_paper in papers:
if other_paper['doi'] and other_paper['doi'] in paper['summary']:
G.add_edge(paper_id, other_paper['entry_id'])
# Set up the visualization
plt.figure(figsize=(12, 8))
pos = nx.spring_layout(G)
# Draw the graph
nx.draw(G, pos, with_labels=False, node_color='lightblue',
node_size=1000, arrowsize=20)
# Add labels
labels = nx.get_node_attributes(G, 'title')
nx.draw_networkx_labels(G, pos, labels, font_size=8)
# Save the visualization
output_path = 'reference_graph.png'
plt.savefig(output_path, dpi=300, bbox_inches='tight')
plt.close()
return output_path
except Exception as e:
logger.error(f"Error visualizing reference graph: {e}")
return ""
def analyze_citation_impact(papers):
"""
Analyzes citation impact and influence patterns.
Args:
papers (list): List of paper metadata dictionaries
Returns:
dict: Citation impact analysis results
"""
try:
# Create citation network
G = nx.DiGraph()
for paper in papers:
G.add_node(paper['entry_id'], **paper)
if paper['doi']:
for other_paper in papers:
if other_paper['doi'] and other_paper['doi'] in paper['summary']:
G.add_edge(paper_id, other_paper['entry_id'])
# Calculate impact metrics
impact_analysis = {
'citation_counts': dict(G.in_degree()),
'influence_scores': nx.pagerank(G),
'authority_scores': nx.authority_matrix(G).diagonal(),
'hub_scores': nx.hub_matrix(G).diagonal(),
'citation_paths': dict(nx.all_pairs_shortest_path_length(G))
}
# Add temporal analysis
year_citations = defaultdict(int)
for paper in papers:
if paper['published']:
year = paper['published'].year
year_citations[year] += G.in_degree(paper['entry_id'])
impact_analysis['temporal_trends'] = dict(year_citations)
return impact_analysis
except Exception as e:
logger.error(f"Error analyzing citation impact: {e}")
return {}
def get_pdf_content(url_or_id, cleanup=True):
"""
Extracts text content from a paper's PDF with improved error handling.
Args:
url_or_id (str): The arXiv URL or ID of the paper
cleanup (bool): Whether to delete the PDF after extraction (default: True)
Returns:
str: The extracted text content or error message
"""
try:
# Extract arxiv ID from URL if needed
arxiv_id = url_or_id.split('/')[-1] if '/' in url_or_id else url_or_id
# Download PDF
pdf_path = download_paper(arxiv_id)
if not pdf_path:
return "Failed to download PDF."
# Extract text from PDF
pdf_text = ''
with open(pdf_path, 'rb') as f:
pdf_reader = PyPDF2.PdfReader(f)
for page_num, page in enumerate(pdf_reader.pages, 1):
try:
page_text = page.extract_text()
if page_text:
pdf_text += f"\n--- Page {page_num} ---\n{page_text}"
except Exception as err:
logger.error(f"Error extracting text from page {page_num}: {err}")
continue
# Clean up
if cleanup:
try:
os.remove(pdf_path)
logger.debug(f"Cleaned up temporary PDF file: {pdf_path}")
except Exception as e:
logger.warning(f"Failed to cleanup PDF file {pdf_path}: {e}")
# Process and return text
if not pdf_text.strip():
return "No text content could be extracted from the PDF."
return clean_pdf_text(pdf_text)
except Exception as e:
logger.error(f"Failed to process PDF: {e}")
return f"Failed to retrieve content: {str(e)}"
def clean_pdf_text(text):
"""
Helper function to clean the text extracted from a PDF.
Args:
text (str): The text to clean.
Returns:
str: The cleaned text.
"""
pattern = r'References\s*.*'
text = re.sub(pattern, '', text, flags=re.IGNORECASE | re.DOTALL)
sections_to_remove = ['Acknowledgements', 'References', 'Bibliography']
for section in sections_to_remove:
pattern = r'(' + re.escape(section) + r'\s*.*?)(?=\n[A-Z]{2,}|$)'
text = re.sub(pattern, '', text, flags=re.DOTALL | re.IGNORECASE)
return text
def download_image(image_url, base_url, folder="images"):
"""
Downloads an image from a URL.
Args:
image_url (str): The URL of the image.
base_url (str): The base URL of the website.
folder (str): The folder to save the image.
Returns:
bool: True if the image was downloaded successfully, False otherwise.
"""
if image_url.startswith('data:image'):
logger.info(f"Skipping download of data URI image: {image_url}")
return False
if not os.path.exists(folder):
os.makedirs(folder)
if not urlparse(image_url).scheme:
if not base_url.endswith('/'):
base_url += '/'
image_url = base_url + image_url
try:
response = requests.get(image_url)
response.raise_for_status()
image_name = image_url.split("/")[-1]
with open(os.path.join(folder, image_name), 'wb') as file:
file.write(response.content)
return True
except requests.RequestException as e:
logger.error(f"Error downloading {image_url}: {e}")
return False
def scrape_images_from_arxiv(url):
"""
Scrapes images from an arXiv page.
Args:
url (str): The URL of the arXiv page.
Returns:
list: A list of image URLs.
"""
try:
response = requests.get(url)
response.raise_for_status()
soup = BeautifulSoup(response.text, 'html.parser')
images = soup.find_all('img')
image_urls = [img['src'] for img in images if 'src' in img.attrs]
return image_urls
except requests.RequestException as e:
logger.error(f"Error fetching page {url}: {e}")
return []
def generate_bibtex(paper_id, client=None):
"""
Generate a BibTeX entry for an arXiv paper with enhanced metadata.
Args:
paper_id (str): The arXiv ID of the paper
client (arxiv.Client): Optional custom client (default: None)
Returns:
str: BibTeX entry as a string
"""
try:
if client is None:
client = create_arxiv_client()
# Fetch paper metadata
paper = next(client.results(arxiv.Search(id_list=[paper_id])))
# Extract author information
authors = [str(author) for author in paper.authors]
first_author = authors[0].split(', ')[0] if authors else 'Unknown'
# Format year
year = paper.published.year if paper.published else 'Unknown'
# Create citation key
citation_key = f"{first_author}{str(year)[-2:]}"
# Build BibTeX entry
bibtex = [
f"@article{{{citation_key},",
f" author = {{{' and '.join(authors)}}},",
f" title = {{{paper.title}}},",
f" year = {{{year}}},",
f" eprint = {{{paper_id}}},",
f" archivePrefix = {{arXiv}},"
]
# Add optional fields if available
if paper.doi:
bibtex.append(f" doi = {{{paper.doi}}},")
if getattr(paper, 'journal_ref', None):
bibtex.append(f" journal = {{{paper.journal_ref}}},")
if getattr(paper, 'primary_category', None):
bibtex.append(f" primaryClass = {{{paper.primary_category}}},")
# Add URL and close entry
bibtex.extend([
f" url = {{https://arxiv.org/abs/{paper_id}}}",
"}"
])
return '\n'.join(bibtex)
except Exception as e:
logger.error(f"Error generating BibTeX for {paper_id}: {e}")
return ""
def batch_download_papers(paper_ids, output_dir="downloads", get_source=False):
"""
Download multiple papers in batch with progress tracking.
Args:
paper_ids (list): List of arXiv IDs to download
output_dir (str): Directory to save downloaded files (default: 'downloads')
get_source (bool): If True, downloads source files instead of PDFs (default: False)
Returns:
dict: Mapping of paper IDs to their download status and paths
"""
results = {}
client = create_arxiv_client()
for paper_id in paper_ids:
try:
file_path = download_paper(paper_id, output_dir, get_source=get_source)
results[paper_id] = {
'success': bool(file_path),
'path': file_path,
'error': None
}
except Exception as e:
results[paper_id] = {
'success': False,
'path': None,
'error': str(e)
}
logger.error(f"Failed to download {paper_id}: {e}")
return results
def batch_generate_bibtex(paper_ids):
"""
Generate BibTeX entries for multiple papers.
Args:
paper_ids (list): List of arXiv IDs
Returns:
dict: Mapping of paper IDs to their BibTeX entries
"""
results = {}
client = create_arxiv_client()
for paper_id in paper_ids:
try:
bibtex = generate_bibtex(paper_id, client)
results[paper_id] = {
'success': bool(bibtex),
'bibtex': bibtex,
'error': None
}
except Exception as e:
results[paper_id] = {
'success': False,
'bibtex': '',
'error': str(e)
}
logger.error(f"Failed to generate BibTeX for {paper_id}: {e}")
return results
def extract_arxiv_ids_from_line(line):
"""
Extract the arXiv ID from a given line of text.
Args:
line (str): A line of text potentially containing an arXiv URL.
Returns:
str: The extracted arXiv ID, or None if not found.
"""
arxiv_id_pattern = re.compile(r'arxiv\.org\/abs\/(\d+\.\d+)(v\d+)?')
match = arxiv_id_pattern.search(line)
if match:
return match.group(1) + (match.group(2) if match.group(2) else '')
return None
def read_written_ids(file_path):
"""
Read already written arXiv IDs from a file.
Args:
file_path (str): Path to the file containing written IDs.
Returns:
set: A set of arXiv IDs.
"""
written_ids = set()
try:
with open(file_path, 'r', encoding="utf-8") as file:
for line in file:
written_ids.add(line.strip())
except FileNotFoundError:
logger.error(f"File not found: {file_path}")
except Exception as e:
logger.error(f"Error while reading the file: {e}")
return written_ids
def append_id_to_file(arxiv_id, output_file_path):
"""
Append a single arXiv ID to a file. Checks if the file exists and creates it if not.
Args:
arxiv_id (str): The arXiv ID to append.
output_file_path (str): Path to the output file.
"""
try:
if not os.path.exists(output_file_path):
logger.info(f"File does not exist. Creating new file: {output_file_path}")
with open(output_file_path, 'a', encoding="utf-8") as outfile:
outfile.write(arxiv_id + '\n')
else:
logger.info(f"Appending to existing file: {output_file_path}")
with open(output_file_path, 'a', encoding="utf-8") as outfile:
outfile.write(arxiv_id + '\n')
except Exception as e:
logger.error(f"Error while appending to file: {e}")

View File

@@ -0,0 +1,100 @@
# Common utils for web_researcher
import os
import sys
import re
import json
from pathlib import Path
from datetime import datetime, timedelta
from pathlib import Path
from loguru import logger
logger.remove()
logger.add(sys.stdout,
colorize=True,
format="<level>{level}</level>|<green>{file}:{line}:{function}</green>| {message}"
)
def cfg_search_param(flag):
"""
Read values from the main_config.json file and return them as variables and a dictionary.
Args:
flag (str): A flag to determine which configuration values to return.
Returns:
various: The values read from the config file based on the flag.
"""
try:
file_path = Path(os.environ.get("ALWRITY_CONFIG", ""))
if not file_path.is_file():
raise FileNotFoundError(f"Configuration file not found: {file_path}")
logger.info(f"Reading search config params from {file_path}")
with open(file_path, 'r', encoding='utf-8') as file:
config = json.load(file)
web_research_section = config["Search Engine Parameters"]
if 'serperdev' in flag:
# Get values as variables
geo_location = web_research_section.get("Geographic Location")
search_language = web_research_section.get("Search Language")
num_results = web_research_section.get("Number of Results")
return geo_location, search_language, num_results
elif 'tavily' in flag:
include_urls = web_research_section.get("Include Domains")
pattern = re.compile(r"^(https?://[^\s,]+)(,\s*https?://[^\s,]+)*$")
if pattern.match(include_urls):
include_urls = [url.strip() for url in include_urls.split(',')]
else:
include_urls = None
return include_urls
elif 'exa' in flag:
include_urls = web_research_section.get("Include Domains")
pattern = re.compile(r"^(https?://\w+)(,\s*https?://\w+)*$")
if pattern.match(include_urls) is not None:
include_urls = include_urls.split(',')
elif re.match(r"^http?://\w+$", include_urls) is not None:
include_urls = include_urls.split(" ")
else:
include_urls = None
num_results = web_research_section.get("Number of Results")
similar_url = web_research_section.get("Similar URL")
time_range = web_research_section.get("Time Range")
if time_range == "past day":
start_published_date = (datetime.now() - timedelta(days=1)).strftime('%Y-%m-%d')
elif time_range == "past week":
start_published_date = (datetime.now() - timedelta(days=7)).strftime("%Y-%m-%d")
elif time_range == "past month":
start_published_date = (datetime.now() - timedelta(days=30)).strftime('%Y-%m-%d')
elif time_range == "past year":
start_published_date = (datetime.now() - timedelta(days=365)).strftime('%Y-%m-%d')
elif time_range == "anytime" or not time_range:
start_published_date = None
time_range = start_published_date
return include_urls, time_range, num_results, similar_url
except FileNotFoundError:
logger.error(f"Error: Config file '{file_path}' not found.")
return {}, None, None, None
except KeyError as e:
logger.error(f"Error: Missing section or option in config file: {e}")
return {}, None, None, None
except ValueError as e:
logger.error(f"Error: Invalid value in config file: {e}")
return {}, None, None, None
def save_in_file(table_content):
""" Helper function to save search analysis in a file. """
file_path = os.environ.get('SEARCH_SAVE_FILE')
try:
# Save the content to the file
with open(file_path, "a+", encoding="utf-8") as file:
file.write(table_content)
file.write("\n" * 3) # Add three newlines at the end
logger.info(f"Search content saved to {file_path}")
return file_path
except Exception as e:
logger.error(f"Error occurred while writing to the file: {e}")

View File

@@ -0,0 +1,256 @@
import matplotlib.pyplot as plt
import pandas as pd
import yfinance as yf
import pandas_ta as ta
import matplotlib.dates as mdates
from datetime import datetime, timedelta
import logging
# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
def calculate_technical_indicators(data: pd.DataFrame) -> pd.DataFrame:
"""
Calculates a suite of technical indicators using pandas_ta.
Args:
data (pd.DataFrame): DataFrame containing historical stock price data.
Returns:
pd.DataFrame: DataFrame with added technical indicators.
"""
try:
# Moving Averages
data.ta.macd(append=True)
data.ta.sma(length=20, append=True)
data.ta.ema(length=50, append=True)
# Momentum Indicators
data.ta.rsi(append=True)
data.ta.stoch(append=True)
# Volatility Indicators
data.ta.bbands(append=True)
data.ta.adx(append=True)
# Other Indicators
data.ta.obv(append=True)
data.ta.willr(append=True)
data.ta.cmf(append=True)
data.ta.psar(append=True)
# Custom Calculations
data['OBV_in_million'] = data['OBV'] / 1e6
data['MACD_histogram_12_26_9'] = data['MACDh_12_26_9']
logging.info("Technical indicators calculated successfully.")
return data
except KeyError as e:
logging.error(f"Missing key in data: {e}")
except ValueError as e:
logging.error(f"Value error: {e}")
except Exception as e:
logging.error(f"Error during technical indicator calculation: {e}")
return None
def get_last_day_summary(data: pd.DataFrame) -> pd.Series:
"""
Extracts and summarizes technical indicators for the last trading day.
Args:
data (pd.DataFrame): DataFrame with calculated technical indicators.
Returns:
pd.Series: Summary of technical indicators for the last day.
"""
try:
last_day_summary = data.iloc[-1][[
'Adj Close', 'MACD_12_26_9', 'MACD_histogram_12_26_9', 'RSI_14',
'BBL_5_2.0', 'BBM_5_2.0', 'BBU_5_2.0', 'SMA_20', 'EMA_50',
'OBV_in_million', 'STOCHk_14_3_3', 'STOCHd_14_3_3', 'ADX_14',
'WILLR_14', 'CMF_20', 'PSARl_0.02_0.2', 'PSARs_0.02_0.2'
]]
logging.info("Last day summary extracted.")
return last_day_summary
except KeyError as e:
logging.error(f"Missing columns in data: {e}")
except Exception as e:
logging.error(f"Error extracting last day summary: {e}")
return None
def analyze_stock(ticker_symbol: str, start_date: datetime, end_date: datetime) -> pd.Series:
"""
Fetches stock data, calculates technical indicators, and provides a summary.
Args:
ticker_symbol (str): The stock symbol.
start_date (datetime): Start date for data retrieval.
end_date (datetime): End date for data retrieval.
Returns:
pd.Series: Summary of technical indicators for the last day.
"""
try:
# Fetch stock data
stock_data = yf.download(ticker_symbol, start=start_date, end=end_date)
logging.info(f"Stock data fetched for {ticker_symbol} from {start_date} to {end_date}")
# Calculate technical indicators
stock_data = calculate_technical_indicators(stock_data)
# Get last day summary
if stock_data is not None:
last_day_summary = get_last_day_summary(stock_data)
if last_day_summary is not None:
print("Summary of Technical Indicators for the Last Day:")
print(last_day_summary)
return last_day_summary
else:
logging.error("Stock data is None, unable to calculate indicators.")
except Exception as e:
logging.error(f"Error during analysis: {e}")
return None
def get_finance_data(symbol: str) -> pd.Series:
"""
Fetches financial data for a given stock symbol.
Args:
symbol (str): The stock symbol.
Returns:
pd.Series: Summary of technical indicators for the last day.
"""
end_date = datetime.today()
start_date = end_date - timedelta(days=120)
# Perform analysis
last_day_summary = analyze_stock(symbol, start_date, end_date)
return last_day_summary
def analyze_options_data(ticker: str, expiry_date: str) -> tuple:
"""
Analyzes option data for a given ticker and expiry date.
Args:
ticker (str): The stock ticker symbol.
expiry_date (str): The option expiry date.
Returns:
tuple: A tuple containing calculated metrics for call and put options.
"""
call_df = options.get_calls(ticker, expiry_date)
put_df = options.get_puts(ticker, expiry_date)
# Implied Volatility Analysis:
avg_call_iv = call_df["Implied Volatility"].str.rstrip("%").astype(float).mean()
avg_put_iv = put_df["Implied Volatility"].str.rstrip("%").astype(float).mean()
logging.info(f"Average Implied Volatility for Call Options: {avg_call_iv}%")
logging.info(f"Average Implied Volatility for Put Options: {avg_put_iv}%")
# Option Prices Analysis:
avg_call_last_price = call_df["Last Price"].mean()
avg_put_last_price = put_df["Last Price"].mean()
logging.info(f"Average Last Price for Call Options: {avg_call_last_price}")
logging.info(f"Average Last Price for Put Options: {avg_put_last_price}")
# Strike Price Analysis:
min_call_strike = call_df["Strike"].min()
max_call_strike = call_df["Strike"].max()
min_put_strike = put_df["Strike"].min()
max_put_strike = put_df["Strike"].max()
logging.info(f"Minimum Strike Price for Call Options: {min_call_strike}")
logging.info(f"Maximum Strike Price for Call Options: {max_call_strike}")
logging.info(f"Minimum Strike Price for Put Options: {min_put_strike}")
logging.info(f"Maximum Strike Price for Put Options: {max_put_strike}")
# Volume Analysis:
total_call_volume = call_df["Volume"].str.replace('-', '0').astype(float).sum()
total_put_volume = put_df["Volume"].str.replace('-', '0').astype(float).sum()
logging.info(f"Total Volume for Call Options: {total_call_volume}")
logging.info(f"Total Volume for Put Options: {total_put_volume}")
# Open Interest Analysis:
call_df['Open Interest'] = call_df['Open Interest'].str.replace('-', '0').astype(float)
put_df['Open Interest'] = put_df['Open Interest'].str.replace('-', '0').astype(float)
total_call_open_interest = call_df["Open Interest"].sum()
total_put_open_interest = put_df["Open Interest"].sum()
logging.info(f"Total Open Interest for Call Options: {total_call_open_interest}")
logging.info(f"Total Open Interest for Put Options: {total_put_open_interest}")
# Convert Implied Volatility to float
call_df['Implied Volatility'] = call_df['Implied Volatility'].str.replace('%', '').astype(float)
put_df['Implied Volatility'] = put_df['Implied Volatility'].str.replace('%', '').astype(float)
# Calculate Put-Call Ratio
put_call_ratio = total_put_volume / total_call_volume
logging.info(f"Put-Call Ratio: {put_call_ratio}")
# Calculate Implied Volatility Percentile
call_iv_percentile = (call_df['Implied Volatility'] > call_df['Implied Volatility'].mean()).mean() * 100
put_iv_percentile = (put_df['Implied Volatility'] > put_df['Implied Volatility'].mean()).mean() * 100
logging.info(f"Call Option Implied Volatility Percentile: {call_iv_percentile}")
logging.info(f"Put Option Implied Volatility Percentile: {put_iv_percentile}")
# Calculate Implied Volatility Skew
implied_vol_skew = call_df['Implied Volatility'].mean() - put_df['Implied Volatility'].mean()
logging.info(f"Implied Volatility Skew: {implied_vol_skew}")
# Determine market sentiment
is_bullish_sentiment = call_df['Implied Volatility'].mean() > put_df['Implied Volatility'].mean()
sentiment = "bullish" if is_bullish_sentiment else "bearish"
logging.info(f"The overall sentiment of {ticker} is {sentiment}.")
return (avg_call_iv, avg_put_iv, avg_call_last_price, avg_put_last_price,
min_call_strike, max_call_strike, min_put_strike, max_put_strike,
total_call_volume, total_put_volume, total_call_open_interest, total_put_open_interest,
put_call_ratio, call_iv_percentile, put_iv_percentile, implied_vol_skew, sentiment)
def get_fin_options_data(ticker: str) -> list:
"""
Fetches and analyzes options data for a given stock ticker.
Args:
ticker (str): The stock ticker symbol.
Returns:
list: A list of sentences summarizing the options data.
"""
current_price = round(stock_info.get_live_price(ticker), 3)
option_expiry_dates = options.get_expiration_dates(ticker)
nearest_expiry = option_expiry_dates[0]
results = analyze_options_data(ticker, nearest_expiry)
# Unpack the results tuple
(avg_call_iv, avg_put_iv, avg_call_last_price, avg_put_last_price,
min_call_strike, max_call_strike, min_put_strike, max_put_strike,
total_call_volume, total_put_volume, total_call_open_interest, total_put_open_interest,
put_call_ratio, call_iv_percentile, put_iv_percentile, implied_vol_skew, sentiment) = results
# Create a list of complete sentences with the results
results_sentences = [
f"Average Implied Volatility for Call Options: {avg_call_iv}%",
f"Average Implied Volatility for Put Options: {avg_put_iv}%",
f"Average Last Price for Call Options: {avg_call_last_price}",
f"Average Last Price for Put Options: {avg_put_last_price}",
f"Minimum Strike Price for Call Options: {min_call_strike}",
f"Maximum Strike Price for Call Options: {max_call_strike}",
f"Minimum Strike Price for Put Options: {min_put_strike}",
f"Maximum Strike Price for Put Options: {max_put_strike}",
f"Total Volume for Call Options: {total_call_volume}",
f"Total Volume for Put Options: {total_put_volume}",
f"Total Open Interest for Call Options: {total_call_open_interest}",
f"Total Open Interest for Put Options: {total_put_open_interest}",
f"Put-Call Ratio: {put_call_ratio}",
f"Call Option Implied Volatility Percentile: {call_iv_percentile}",
f"Put Option Implied Volatility Percentile: {put_iv_percentile}",
f"Implied Volatility Skew: {implied_vol_skew}",
f"The overall sentiment of {ticker} is {sentiment}."
]
# Print each sentence
for sentence in results_sentences:
logging.info(sentence)
return results_sentences

View File

@@ -0,0 +1,96 @@
import os
from pathlib import Path
from firecrawl import FirecrawlApp
import logging
from dotenv import load_dotenv
# Load environment variables from .env file
load_dotenv(Path('../../.env'))
# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
def initialize_client() -> FirecrawlApp:
"""
Initialize and return a Firecrawl client.
Returns:
FirecrawlApp: An instance of the Firecrawl client.
"""
return FirecrawlApp(api_key=os.getenv("FIRECRAWL_API_KEY"))
def scrape_website(website_url: str, depth: int = 1, max_pages: int = 10) -> dict:
"""
Scrape a website starting from the given URL.
Args:
website_url (str): The URL of the website to scrape.
depth (int, optional): The depth of crawling. Default is 1.
max_pages (int, optional): The maximum number of pages to scrape. Default is 10.
Returns:
dict: The result of the website scraping, or None if an error occurred.
"""
client = initialize_client()
try:
result = client.crawl_url({
'url': website_url,
'depth': depth,
'max_pages': max_pages
})
return result
except KeyError as e:
logging.error(f"Missing key in data: {e}")
except ValueError as e:
logging.error(f"Value error: {e}")
except Exception as e:
logging.error(f"Error scraping website: {e}")
return None
def scrape_url(url: str) -> dict:
"""
Scrape a specific URL.
Args:
url (str): The URL to scrape.
Returns:
dict: The result of the URL scraping, or None if an error occurred.
"""
client = initialize_client()
try:
result = client.scrape_url(url)
return result
except KeyError as e:
logging.error(f"Missing key in data: {e}")
except ValueError as e:
logging.error(f"Value error: {e}")
except Exception as e:
logging.error(f"Error scraping URL: {e}")
return None
def extract_data(url: str, schema: dict) -> dict:
"""
Extract structured data from a URL using the provided schema.
Args:
url (str): The URL to extract data from.
schema (dict): The schema to use for data extraction.
Returns:
dict: The extracted data, or None if an error occurred.
"""
client = initialize_client()
try:
result = client.extract({
'url': url,
'schema': schema
})
return result
except KeyError as e:
logging.error(f"Missing key in data: {e}")
except ValueError as e:
logging.error(f"Value error: {e}")
except Exception as e:
logging.error(f"Error extracting data: {e}")
return None

View File

@@ -20,24 +20,23 @@ Modifications:
- Customize the search parameters, such as location and language, in the functions as needed.
- Adjust logging configurations, table formatting, and other aspects based on preferences.
To-Do (TBD):
- Consider adding further enhancements or customization based on specific use cases.
Note: This script depends on external libraries such as SerpApi, Loguru, Rich, and Tabulate. Install them using 'pip install serpapi loguru rich tabulate' if not already installed.
"""
import os
from pathlib import Path
import sys
import configparser
from pathlib import Path
import pandas as pd
import json
import requests
from clint.textui import progress
import streamlit as st
#from serpapi import GoogleSearch
from loguru import logger
from tabulate import tabulate
from GoogleNews import GoogleNews
#from GoogleNews import GoogleNews
# Configure logger
logger.remove()
from dotenv import load_dotenv
@@ -49,12 +48,11 @@ logger.add(
format="<level>{level}</level>|<green>{file}:{line}:{function}</green>| {message}"
)
#from tenacity import retry, stop_after_attempt, wait_random_exponential
#@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(6))
from .common_utils import save_in_file, cfg_search_param
from tenacity import retry, stop_after_attempt, wait_random_exponential
#FIXME: Accept language, country and time frame to search for.
@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(6))
def google_search(query):
"""
Perform a Google search for the given query.
@@ -77,10 +75,12 @@ def google_search(query):
try:
logger.info("Trying Google search with Serper.dev: https://serper.dev/api-key")
search_result = perform_serperdev_google_search(query)
process_search_results(search_result)
return(search_result)
if search_result:
process_search_results(search_result)
return(search_result)
except Exception as err:
logger.error(f"Failed to do Google search with serper.dev: {err}")
logger.error(f"Failed Google search with serper.dev: {err}")
return None
# # Retry with BROWSERLESS API
@@ -90,20 +90,10 @@ def google_search(query):
# except Exception as err:
# logger.error("FIXME: Failed to do Google search with BROWSERLESS API.")
# logger.debug("FIXME: Trying with dataforSEO API.")
#
# # Retry with dataforSEO API
# try:
# logger.info("Perform SERP with Data for SEO.")
# #search_result = perform_dataforseo_google_search(query)
# #return process_search_results(search_result, flag)
# except Exception as err:
# logger.error("FIXME: Failed to do Google search with dataforSEO API.")
# logger.debug("All retries failed. Giving up.")
# raise
def perform_serpapi_google_search(query, location="in"):
def perform_serpapi_google_search(query):
"""
Perform a Google search using the SerpApi service.
@@ -115,6 +105,12 @@ def perform_serpapi_google_search(query, location="in"):
Returns:
dict: A dictionary containing the search results.
"""
try:
logger.info("Reading Web search config values from main_config")
geo_location, search_language, num_results, time_range, include_domains, similar_url = read_return_config_section('web_research')
except Exception as err:
logger.error(f"Failed to read web research params: {err}")
return
try:
# Check if API key is provided
if not os.getenv("SERPAPI_KEY"):
@@ -162,16 +158,18 @@ def perform_serperdev_google_search(query):
# Serper API endpoint URL
url = "https://google.serper.dev/search"
# FIXME: Expose options to end user. Request payload
try:
geo_loc, lang, num_results = cfg_search_param('serperdev')
except Exception as err:
logger.error(f"Failed to read config {err}")
# Build payload as end user or main_config
payload = json.dumps({
"q": query,
"gl": "in",
"hl": "en",
"num": 10,
"gl": geo_loc,
"hl": lang,
"num": num_results,
"autocorrect": True,
"page": 1,
"type": "search",
"engine": "google"
})
# Request headers with API key
@@ -193,6 +191,42 @@ def perform_serperdev_google_search(query):
return None
def perform_serper_news_search(news_keywords, news_country, news_language):
""" Function for Serper.dev News google search """
# Get the Serper API key from environment variables
logger.info(f"Doing serper.dev google search. {news_keywords} - {news_country} - {news_language}")
serper_api_key = os.getenv('SERPER_API_KEY')
# Check if the API key is available
if not serper_api_key:
raise ValueError("SERPER_API_KEY is missing. Set it in the .env file.")
# Serper API endpoint URL
url = "https://google.serper.dev/news"
payload = json.dumps({
"q": news_keywords,
"gl": news_country,
"hl": news_language,
})
# Request headers with API key
headers = {
'X-API-KEY': serper_api_key,
'Content-Type': 'application/json'
}
# Send a POST request to the Serper API with progress bar
with progress.Bar(label="Searching News", expected_size=100) as bar:
response = requests.post(url, headers=headers, data=payload, stream=True)
# Check if the request was successful
if response.status_code == 200:
# Parse and return the JSON response
#process_search_results(response, "news")
return response.json()
else:
# Print an error message if the request fails
logger.error(f"Error: {response.status_code}, {response.text}")
return None
def perform_browserless_google_search():
return
@@ -211,7 +245,7 @@ def google_news(search_keywords, news_period="7d", region="IN"):
print(googlenews.search('APPLE'))
def process_search_results(search_results):
def process_search_results(search_results, search_type="general"):
"""
Create a Pandas DataFrame from the search results.
@@ -223,7 +257,10 @@ def process_search_results(search_results):
"""
data = []
logger.info(f"Google Search Parameters: {search_results.get('searchParameters', {})}")
organic_results = search_results.get("organic", [])
if 'general' in search_type:
organic_results = search_results.get("organic", [])
if 'news' in search_type:
organic_results = search_results.get("news", [])
# Displaying Organic Results
organic_data = []
@@ -292,21 +329,11 @@ def process_search_results(search_results):
print(combined_table)
# Save the combined table to a file
try:
# Display on Alwrity UI
st.write(organic_table)
st.write(combined_table)
save_in_file(organic_table)
save_in_file(combined_table)
except Exception as save_results_err:
logger.error(f"Failed to save search results: {save_results_err}")
return search_results
def save_in_file(table_content):
""" Helper function to save search analysis in a file. """
file_path = os.environ.get('SEARCH_SAVE_FILE')
try:
# Save the content to the file
with open(file_path, "a+") as file:
file.write(table_content)
file.write("\n" * 3) # Add three newlines at the end
logger.info(f"Search content saved to {file_path}")
except Exception as e:
logger.error(f"Error occurred while writing to the file: {e}")

View File

@@ -45,7 +45,6 @@ from urllib.parse import quote_plus
from tqdm import tqdm
from tabulate import tabulate
from pytrends.request import TrendReq
from wordcloud import WordCloud
from loguru import logger
# Configure logger
@@ -106,123 +105,55 @@ def plot_interest_by_region(kw_list):
def get_related_queries_and_save_csv(keywords, hl='en-US', tz=360, cat=0, timeframe='today 12-m'):
"""
Get related queries for the given search keywords and save the result to a CSV file.
Args:
search_keywords (list): List of search keywords.
hl (str): Language parameter, default is 'en-US'.
tz (int): Timezone parameter, default is 360.
cat (int): Category parameter, default is 0.
timeframe (str): Timeframe parameter, default is 'today 12-m'.
Returns:
pd.DataFrame: DataFrame containing related queries.
"""
try:
# Build model
pytrends = TrendReq(hl=hl, tz=tz)
pytrends.build_payload(kw_list=keywords, cat=cat, timeframe=timeframe)
# Get related queries
data = pytrends.related_queries()
# Extract data from the result
top_queries = list(data.values())[0]['top']
rising_queries = list(data.values())[0]['rising']
top_rising_queries = top_queries + rising_queries
# Convert lists to DataFrames
df_top_queries = pd.DataFrame(top_queries)
df_rising_queries = pd.DataFrame(rising_queries) # Added this line
# Rename columns to avoid duplicates
df_top_queries.columns = ['Top query', 'value']
df_rising_queries.columns = ['Rising query', 'value']
# Save to CSV
all_queries_df = pd.concat([df_top_queries, df_rising_queries], axis=1)
#all_queries_df.to_csv('related_queries.csv', index=False)
# Display additional information
console = Console()
# Display additional information with emojis and bold formatting
print("\n📢❗🚨 ")
print("\n\033[1m🔝 Top\033[0m: The most popular search queries. Scoring is on a relative scale where a value of 100 is the most commonly searched query, 50 is a query searched half as often, and a value of 0 is a query searched for less than 1% as often as the most popular query.\n")
print("\n\033[1m🚀 Rising\033[0m: Queries with the biggest increase in search frequency since the last time period. Results marked 'Breakout' had a tremendous increase, probably because these queries are new and had few (if any) prior searches.\n")
# Display the DataFrame using tabulate
table = tabulate(all_queries_df, headers='keys', tablefmt='fancy_grid')
print(table)
# Save the combined table to a file
try:
save_in_file(table)
except Exception as save_results_err:
logger.error(f"Failed to save search results: {save_results_err}")
return top_rising_queries
except Exception as e:
print(f"get_related_queries_and_save_csv: ERROR: An error occurred: {e}")
def get_related_topics_and_save_csv(search_keywords):
"""
Get related topics for the given search keywords and save the result to a CSV file.
Args:
search_keywords (list): List of search keywords.
Returns:
pd.DataFrame: DataFrame containing related topics.
"""
search_keywords = [f"{search_keywords}"]
try:
# Build model
pytrends = TrendReq(hl='en-US', tz=360)
pytrends.build_payload(kw_list=search_keywords, timeframe='today 12-m')
# Build payload
# FIXME: Remove hardcoding.
pytrends.build_payload(search_keywords, cat=0, timeframe='today 12-m')
# Get related topics
try:
data = pytrends.related_topics()
except Exception as err:
logger.error(f"Failed to get pytrends realted topics: {err}")
return
# Extract data from the result
top_topics = list(data.values())[0]['top']
rising_topics = list(data.values())[0]['rising']
# Get related topics - this returns a dictionary
topics_data = pytrends.related_topics()
# Convert lists to DataFrames
df_top_topics = pd.DataFrame(top_topics)
df_rising_topics = pd.DataFrame(rising_topics)
# FIXME:Exclude specified columns
columns_to_exclude = ['hasData', 'value', 'topic_mid', 'link']
df_top_topics = df_top_topics.drop(columns=columns_to_exclude, errors='ignore')
df_rising_topics = df_rising_topics.drop(columns=columns_to_exclude, errors='ignore')
# Rename columns to avoid duplicates and provide meaningful names
df_top_topics.columns = ['Top- ' + col if col != 'topic_title' else col for col in df_top_topics.columns]
df_rising_topics.columns = ['Rising- ' + col if col != 'topic_title' else col for col in df_rising_topics.columns]
all_topics_df = pd.concat([df_top_topics, df_rising_topics], axis=1)
print(f"\n\n 📢❗🚨 Rising and Trending Keywords for {search_keywords}\n")
print("\033[1m🔝 Top\033[0m: The most popular search topics.")
print("\033[1m🚀 Rising\033[0m: Topics experiencing a significant increase in search frequency since the last time period. Topics marked :pile_of_poop:'Breakout' had a tremendous surge, likely because they are new and had few prior searches.")
# Display the DataFrame using tabulate
pd.set_option('display.max_rows', all_topics_df.shape[0]+1)
print(all_topics_df.head(10))
table = tabulate(all_topics_df, headers='keys', tablefmt='fancy_grid')
try:
save_in_file(table)
except Exception as save_results_err:
logger.error(f"Failed to save search results: {save_results_err}")
return all_topics_df
# Extract data for the first keyword
if topics_data and search_keywords[0] in topics_data:
keyword_data = topics_data[search_keywords[0]]
# Create two separate dataframes for top and rising
top_df = keyword_data.get('top', pd.DataFrame())
rising_df = keyword_data.get('rising', pd.DataFrame())
return {
'top': top_df[['topic_title', 'value']] if not top_df.empty else pd.DataFrame(),
'rising': rising_df[['topic_title', 'value']] if not rising_df.empty else pd.DataFrame()
}
except Exception as e:
logger.error(f"ERROR: An error occurred in related topics: {e}")
return pd.DataFrame()
logger.error(f"Error in related topics: {e}")
return {'top': pd.DataFrame(), 'rising': pd.DataFrame()}
def get_related_queries_and_save_csv(search_keywords):
search_keywords = [f"{search_keywords}"]
try:
pytrends = TrendReq(hl='en-US', tz=360)
pytrends.build_payload(kw_list=search_keywords, timeframe='today 12-m')
# Get related queries - this returns a dictionary
queries_data = pytrends.related_queries()
# Extract data for the first keyword
if queries_data and search_keywords[0] in queries_data:
keyword_data = queries_data[search_keywords[0]]
# Create two separate dataframes for top and rising
top_df = keyword_data.get('top', pd.DataFrame())
rising_df = keyword_data.get('rising', pd.DataFrame())
return {
'top': top_df if not top_df.empty else pd.DataFrame(),
'rising': rising_df if not rising_df.empty else pd.DataFrame()
}
except Exception as e:
logger.error(f"Error in related queries: {e}")
return {'top': pd.DataFrame(), 'rising': pd.DataFrame()}
def get_source(url):
@@ -332,10 +263,11 @@ def get_suggestions_for_keyword(search_term):
pd.set_option('display.max_rows', expanded_results_df.shape[0]+1)
expanded_results_df.drop_duplicates('Keywords', inplace=True)
table = tabulate(expanded_results_df, headers=['Keywords', 'Relevance'], tablefmt='fancy_grid')
try:
save_in_file(table)
except Exception as save_results_err:
logger.error(f"Failed to save search results: {save_results_err}")
# FIXME: Too much data for LLM context window. We will need to embed it.
#try:
# save_in_file(table)
#except Exception as save_results_err:
# logger.error(f"Failed to save search results: {save_results_err}")
return expanded_results_df
except Exception as e:
logger.error(f"get_suggestions_for_keyword: Error in main: {e}")
@@ -482,7 +414,7 @@ def save_in_file(table_content):
file_path = os.environ.get('SEARCH_SAVE_FILE')
try:
# Save the content to the file
with open(file_path, "a+") as file:
with open(file_path, "a+", encoding="utf-8") as file:
file.write(table_content)
file.write("\n" * 3) # Add three newlines at the end
logger.info(f"Search content saved to {file_path}")
@@ -506,22 +438,17 @@ def do_google_trends_analysis(search_term):
else:
all_the_keywords.append(suggestions_df['Keywords'].tolist())
all_the_keywords = ','.join([', '.join(filter(None, map(str, sublist))) for sublist in all_the_keywords])
# Generate a random sleep time between 2 and 3 seconds
time.sleep(random.uniform(2, 3))
#
# # FIXME: Get result from vision GPT. Fetch and visualize Google Trends data
# #trends_data = fetch_google_trends_interest_overtime("llamaindex")
#
# # FIXME: Plot Interest Over time.
# result_df = plot_interest_by_region(search_term)
#
# Display additional information
try:
result_df = get_related_topics_and_save_csv(search_term)
logger.info(f"Related topics:: result_df: {result_df}")
# Extract 'Top' topic_title
if result_df:
top_topic_title = result_df['topic_title'].values.tolist()
top_topic_title = result_df['top']['topic_title'].values.tolist()
# Join each sublist into one string separated by comma
#top_topic_title = [','.join(filter(None, map(str, sublist))) for sublist in top_topic_title]
top_topic_title = ','.join([', '.join(filter(None, map(str, sublist))) for sublist in top_topic_title])
@@ -550,3 +477,24 @@ def do_google_trends_analysis(search_term):
return(all_the_keywords)
except Exception as e:
logger.error(f"Error in Google Trends Analysis: {e}")
def get_trending_searches(country='united_states'):
"""Get trending searches for a specific country."""
try:
pytrends = TrendReq(hl='en-US', tz=360)
trending_searches = pytrends.trending_searches(pn=country)
return trending_searches
except Exception as e:
logger.error(f"Error getting trending searches: {e}")
return pd.DataFrame()
def get_realtime_trends(country='US'):
"""Get realtime trending searches for a specific country."""
try:
pytrends = TrendReq(hl='en-US', tz=360)
realtime_trends = pytrends.realtime_trending_searches(pn=country)
return realtime_trends
except Exception as e:
logger.error(f"Error getting realtime trends: {e}")
return pd.DataFrame()

View File

@@ -0,0 +1,803 @@
################################################################
#
# ## Features
#
# - **Web Research**: Alwrity enables users to conduct web research efficiently.
# By providing keywords or topics of interest, users can initiate searches across multiple platforms simultaneously.
#
# - **Google SERP Search**: The tool integrates with Google Search Engine Results Pages (SERP)
# to retrieve relevant information based on user queries. It offers insights into organic search results,
# People Also Ask, and related searches.
#
# - **Tavily AI Integration**: Alwrity leverages Tavily AI's capabilities to enhance web research.
# It utilizes advanced algorithms to search for information and extract relevant data from various sources.
#
# - **Metaphor AI Semantic Search**: Alwrity employs Metaphor AI's semantic search technology to find related articles and content.
# By analyzing context and meaning, it delivers precise and accurate results.
#
# - **Google Trends Analysis**: The tool provides Google Trends analysis for user-defined keywords.
# It helps users understand the popularity and trends associated with specific topics over time.
#
##############################################################
import os
import json
import time
from pathlib import Path
import sys
from datetime import datetime
import streamlit as st
import pandas as pd
import random
import numpy as np
from lib.alwrity_ui.display_google_serp_results import (
process_research_results,
process_search_results,
display_research_results
)
from lib.alwrity_ui.google_trends_ui import display_google_trends_data, process_trends_data
from .tavily_ai_search import do_tavily_ai_search
from .metaphor_basic_neural_web_search import metaphor_search_articles, streamlit_display_metaphor_results
from .google_serp_search import google_search
from .google_trends_researcher import do_google_trends_analysis
#from .google_gemini_web_researcher import do_gemini_web_research
from loguru import logger
# Configure logger
logger.remove()
logger.add(sys.stdout,
colorize=True,
format="<level>{level}</level>|<green>{file}:{line}:{function}</green>| {message}"
)
def gpt_web_researcher(search_keywords, search_mode, **kwargs):
"""Keyword based web researcher with progress tracking."""
logger.info(f"Starting web research - Keywords: {search_keywords}, Mode: {search_mode}")
logger.debug(f"Additional parameters: {kwargs}")
try:
# Reset session state variables for this research operation
if 'metaphor_results_displayed' in st.session_state:
del st.session_state.metaphor_results_displayed
# Initialize result container
research_results = None
# Create status containers
status_container = st.empty()
progress_bar = st.progress(0)
def update_progress(message, progress=None, level="info"):
if progress is not None:
progress_bar.progress(progress)
if level == "error":
status_container.error(f"🚫 {message}")
elif level == "warning":
status_container.warning(f"⚠️ {message}")
else:
status_container.info(f"🔄 {message}")
logger.debug(f"Progress update [{level}]: {message}")
if search_mode == "google":
logger.info("Starting Google research pipeline")
try:
# First try Google SERP
update_progress("Initiating SERP search...", progress=10)
serp_results = do_google_serp_search(search_keywords, **kwargs)
if serp_results and serp_results.get('organic'):
logger.info("SERP search successful")
update_progress("SERP search completed", progress=40)
research_results = serp_results
else:
logger.warning("SERP search returned no results, falling back to Gemini")
update_progress("No SERP results, trying Gemini...", progress=45)
# Keep it commented. Fallback to Gemini
#try:
# gemini_results = do_gemini_web_research(search_keywords)
# if gemini_results:
# logger.info("Gemini research successful")
# update_progress("Gemini research completed", progress=80)
# research_results = {
# 'source': 'gemini',
# 'results': gemini_results
# }
#except Exception as gemini_err:
# logger.error(f"Gemini research failed: {gemini_err}")
# update_progress("Gemini research failed", level="warning")
if research_results:
update_progress("Processing final results...", progress=90)
processed_results = process_research_results(research_results)
if processed_results:
update_progress("Research completed!", progress=100, level="success")
display_research_results(processed_results)
return processed_results
else:
error_msg = "Failed to process research results"
logger.warning(error_msg)
update_progress(error_msg, level="warning")
return None
else:
error_msg = "No results from either SERP or Gemini"
logger.warning(error_msg)
update_progress(error_msg, level="warning")
return None
except Exception as search_err:
error_msg = f"Research pipeline failed: {str(search_err)}"
logger.error(error_msg, exc_info=True)
update_progress(error_msg, level="error")
raise
elif search_mode == "ai":
logger.info("Starting AI research pipeline")
try:
# Do Tavily AI Search
update_progress("Initiating Tavily AI search...", progress=10)
# Extract relevant parameters for Tavily search
include_domains = kwargs.pop('include_domains', None)
search_depth = kwargs.pop('search_depth', 'advanced')
# Pass the parameters to do_tavily_ai_search
t_results = do_tavily_ai_search(
search_keywords, # Pass as positional argument
max_results=kwargs.get('num_results', 10),
include_domains=include_domains,
search_depth=search_depth,
**kwargs
)
# Do Metaphor AI Search
update_progress("Initiating Metaphor AI search...", progress=50)
metaphor_results, metaphor_titles = do_metaphor_ai_research(search_keywords)
if metaphor_results is None:
update_progress("Metaphor AI search failed, continuing with Tavily results only...", level="warning")
else:
update_progress("Metaphor AI search completed successfully", progress=75)
# Add debug logging to check the structure of metaphor_results
logger.debug(f"Metaphor results structure: {type(metaphor_results)}")
if isinstance(metaphor_results, dict):
logger.debug(f"Metaphor results keys: {metaphor_results.keys()}")
if 'data' in metaphor_results:
logger.debug(f"Metaphor data keys: {metaphor_results['data'].keys()}")
if 'results' in metaphor_results['data']:
logger.debug(f"Number of results: {len(metaphor_results['data']['results'])}")
# Display Metaphor results only if not already displayed
if 'metaphor_results_displayed' not in st.session_state:
st.session_state.metaphor_results_displayed = True
# Make sure to pass the correct parameters to streamlit_display_metaphor_results
streamlit_display_metaphor_results(metaphor_results, search_keywords)
# Add Google Trends Analysis
update_progress("Initiating Google Trends analysis...", progress=80)
try:
# Add an informative message about Google Trends
with st.expander(" About Google Trends Analysis", expanded=False):
st.markdown("""
**What is Google Trends Analysis?**
Google Trends Analysis provides insights into how often a particular search-term is entered relative to the total search-volume across various regions of the world, and in various languages.
**What data will be shown?**
- **Related Keywords**: Terms that are frequently searched together with your keyword
- **Interest Over Time**: How interest in your keyword has changed over the past 12 months
- **Regional Interest**: Where in the world your keyword is most popular
- **Related Queries**: What people search for before and after searching for your keyword
- **Related Topics**: Topics that are closely related to your keyword
**How to use this data:**
- Identify trending topics in your industry
- Understand seasonal patterns in search behavior
- Discover related keywords for content planning
- Target content to specific regions with high interest
""")
trends_results = do_google_pytrends_analysis(search_keywords)
if trends_results:
update_progress("Google Trends analysis completed successfully", progress=90)
# Store trends results in the research_results
if metaphor_results:
metaphor_results['trends_data'] = trends_results
else:
# If metaphor_results is None, create a new container for results
metaphor_results = {'trends_data': trends_results}
# Display Google Trends data using the new UI module
display_google_trends_data(trends_results, search_keywords)
else:
update_progress("Google Trends analysis returned no results", level="warning")
except Exception as trends_err:
logger.error(f"Google Trends analysis failed: {trends_err}")
update_progress("Google Trends analysis failed", level="warning")
st.error(f"Error in Google Trends analysis: {str(trends_err)}")
# Return the combined results
update_progress("Research completed!", progress=100, level="success")
return metaphor_results or t_results
except Exception as ai_err:
error_msg = f"AI research pipeline failed: {str(ai_err)}"
logger.error(error_msg, exc_info=True)
update_progress(error_msg, level="error")
raise
else:
error_msg = f"Unsupported search mode: {search_mode}"
logger.error(error_msg)
update_progress(error_msg, level="error")
raise ValueError(error_msg)
except Exception as err:
error_msg = f"Failed in gpt_web_researcher: {str(err)}"
logger.error(error_msg, exc_info=True)
if 'update_progress' in locals():
update_progress(error_msg, level="error")
raise
def do_google_serp_search(search_keywords, status_container, update_progress, **kwargs):
"""Perform Google SERP analysis with sidebar progress tracking."""
logger.info("="*50)
logger.info("Starting Google SERP Search")
logger.info("="*50)
try:
# Validate parameters
update_progress("Validating search parameters", progress=0.1)
status_container.info("📝 Validating parameters...")
if not search_keywords or not isinstance(search_keywords, str):
logger.error(f"Invalid search keywords: {search_keywords}")
raise ValueError("Search keywords must be a non-empty string")
# Update search initiation
update_progress(f"Initiating search for: '{search_keywords}'", progress=0.2)
status_container.info("🌐 Querying search API...")
logger.info(f"Search params: {kwargs}")
# Execute search
g_results = google_search(search_keywords)
if g_results:
# Log success
update_progress("Search completed successfully", progress=0.8, level="success")
# Update statistics
stats = f"""Found:
- {len(g_results.get('organic', []))} organic results
- {len(g_results.get('peopleAlsoAsk', []))} related questions
- {len(g_results.get('relatedSearches', []))} related searches"""
update_progress(stats, progress=0.9)
# Process results
update_progress("Processing search results", progress=0.95)
status_container.info("⚡ Processing results...")
processed_results = process_search_results(g_results)
# Extract titles
update_progress("Extracting information", progress=0.98)
g_titles = extract_info(g_results, 'titles')
# Final success
update_progress("Analysis completed successfully", progress=1.0, level="success")
status_container.success("✨ Research completed!")
# Clear main status after delay
time.sleep(1)
status_container.empty()
return {
'results': g_results,
'titles': g_titles,
'summary': processed_results,
'stats': {
'organic_count': len(g_results.get('organic', [])),
'questions_count': len(g_results.get('peopleAlsoAsk', [])),
'related_count': len(g_results.get('relatedSearches', []))
}
}
else:
update_progress("No results found", progress=0.5, level="warning")
status_container.warning("⚠️ No results found")
return None
except Exception as err:
error_msg = f"Search failed: {str(err)}"
update_progress(error_msg, progress=0.5, level="error")
logger.error(error_msg)
logger.debug("Stack trace:", exc_info=True)
raise
finally:
logger.info("="*50)
logger.info("Google SERP Search function completed")
logger.info("="*50)
def do_tavily_ai_search(search_keywords, max_results=10, **kwargs):
""" Common function to do Tavily AI web research."""
try:
logger.info(f"Doing Tavily AI search for: {search_keywords}")
# Prepare Tavily search parameters
tavily_params = {
'max_results': max_results,
'search_depth': 'advanced' if kwargs.get('search_depth', 3) > 2 else 'basic',
'time_range': kwargs.get('time_range', 'year'),
'include_domains': kwargs.get('include_domains', [""]) if kwargs.get('include_domains') else [""]
}
# Import the Tavily search function directly
from .tavily_ai_search import do_tavily_ai_search as tavily_search
# Call the actual Tavily search function
t_results = tavily_search(
keywords=search_keywords,
**tavily_params
)
if t_results:
t_titles = tavily_extract_information(t_results, 'titles')
t_answer = tavily_extract_information(t_results, 'answer')
return(t_results, t_titles, t_answer)
else:
logger.warning("No results returned from Tavily AI search")
return None, None, None
except Exception as err:
logger.error(f"Failed to do Tavily AI Search: {err}")
return None, None, None
def do_metaphor_ai_research(search_keywords):
"""
Perform Metaphor AI research and return results with titles.
Args:
search_keywords (str): Keywords to search for
Returns:
tuple: (response_articles, titles) or (None, None) if search fails
"""
try:
logger.info(f"Start Semantic/Neural web search with Metaphor: {search_keywords}")
response_articles = metaphor_search_articles(search_keywords)
if response_articles and 'data' in response_articles:
m_titles = [result.get('title', '') for result in response_articles['data'].get('results', [])]
return response_articles, m_titles
else:
logger.warning("No valid results from Metaphor search")
return None, None
except Exception as err:
logger.error(f"Failed to do Metaphor search: {err}")
return None, None
def do_google_pytrends_analysis(keywords):
"""
Perform Google Trends analysis for the given keywords.
Args:
keywords (str): The search keywords to analyze
Returns:
dict: A dictionary containing formatted Google Trends data with the following keys:
- related_keywords: List of related keywords
- interest_over_time: DataFrame with date and interest columns
- regional_interest: DataFrame with country_code, country, and interest columns
- related_queries: DataFrame with query and value columns
- related_topics: DataFrame with topic and value columns
"""
logger.info(f"Performing Google Trends analysis for keywords: {keywords}")
# Create a progress container for Streamlit
progress_container = st.empty()
progress_bar = st.progress(0)
def update_progress(message, progress=None, level="info"):
"""Helper function to update progress in Streamlit UI"""
if progress is not None:
progress_bar.progress(progress)
if level == "error":
progress_container.error(f"🚫 {message}")
elif level == "warning":
progress_container.warning(f"⚠️ {message}")
else:
progress_container.info(f"🔄 {message}")
logger.debug(f"Progress update [{level}]: {message}")
try:
# Initialize the formatted data dictionary
formatted_data = {
'related_keywords': [],
'interest_over_time': pd.DataFrame(),
'regional_interest': pd.DataFrame(),
'related_queries': pd.DataFrame(),
'related_topics': pd.DataFrame()
}
# Get raw trends data from google_trends_researcher
update_progress("Fetching Google Trends data...", progress=10)
raw_trends_data = do_google_trends_analysis(keywords)
if not raw_trends_data:
logger.warning("No Google Trends data returned")
update_progress("No Google Trends data returned", level="warning", progress=20)
return formatted_data
# Process related keywords from the raw data
update_progress("Processing related keywords...", progress=30)
if isinstance(raw_trends_data, list):
formatted_data['related_keywords'] = raw_trends_data
elif isinstance(raw_trends_data, dict):
if 'keywords' in raw_trends_data:
formatted_data['related_keywords'] = raw_trends_data['keywords']
if 'interest_over_time' in raw_trends_data:
formatted_data['interest_over_time'] = raw_trends_data['interest_over_time']
if 'regional_interest' in raw_trends_data:
formatted_data['regional_interest'] = raw_trends_data['regional_interest']
if 'related_queries' in raw_trends_data:
formatted_data['related_queries'] = raw_trends_data['related_queries']
if 'related_topics' in raw_trends_data:
formatted_data['related_topics'] = raw_trends_data['related_topics']
# If we have keywords but missing other data, try to fetch them using pytrends directly
if formatted_data['related_keywords'] and (
formatted_data['interest_over_time'].empty or
formatted_data['regional_interest'].empty or
formatted_data['related_queries'].empty or
formatted_data['related_topics'].empty
):
try:
update_progress("Fetching additional data from Google Trends API...", progress=40)
from pytrends.request import TrendReq
pytrends = TrendReq(hl='en-US', tz=360)
# Build payload with the main keyword
update_progress("Building search payload...", progress=45)
pytrends.build_payload([keywords], timeframe='today 12-m', geo='')
# Get interest over time if missing
if formatted_data['interest_over_time'].empty:
try:
update_progress("Fetching interest over time data...", progress=50)
interest_df = pytrends.interest_over_time()
if not interest_df.empty:
formatted_data['interest_over_time'] = interest_df.reset_index()
update_progress(f"Successfully fetched interest over time data with {len(formatted_data['interest_over_time'])} data points", progress=55)
else:
update_progress("No interest over time data available", level="warning", progress=55)
except Exception as e:
logger.error(f"Error fetching interest over time: {e}")
update_progress(f"Error fetching interest over time: {str(e)}", level="warning", progress=55)
# Get regional interest if missing
if formatted_data['regional_interest'].empty:
try:
update_progress("Fetching regional interest data...", progress=60)
regional_df = pytrends.interest_by_region()
if not regional_df.empty:
formatted_data['regional_interest'] = regional_df.reset_index()
update_progress(f"Successfully fetched regional interest data for {len(formatted_data['regional_interest'])} regions", progress=65)
else:
update_progress("No regional interest data available", level="warning", progress=65)
except Exception as e:
logger.error(f"Error fetching regional interest: {e}")
update_progress(f"Error fetching regional interest: {str(e)}", level="warning", progress=65)
# Get related queries if missing
if formatted_data['related_queries'].empty:
try:
update_progress("Fetching related queries data...", progress=70)
# Get related queries data
related_queries = pytrends.related_queries()
# Create empty DataFrame as fallback
formatted_data['related_queries'] = pd.DataFrame(columns=['query', 'value'])
# Simple direct approach to avoid list index errors
if related_queries and isinstance(related_queries, dict):
# Check if our keyword exists in the results
if keywords in related_queries:
keyword_data = related_queries[keywords]
# Process top queries if available
if 'top' in keyword_data and keyword_data['top'] is not None:
try:
update_progress("Processing top related queries...", progress=75)
# Convert to DataFrame if it's not already
if isinstance(keyword_data['top'], pd.DataFrame):
top_df = keyword_data['top']
else:
# Try to convert to DataFrame
top_df = pd.DataFrame(keyword_data['top'])
# Ensure it has the right columns
if not top_df.empty:
# Rename columns if needed
if 'query' in top_df.columns:
# Already has the right column name
pass
elif len(top_df.columns) > 0:
# Use first column as query
top_df = top_df.rename(columns={top_df.columns[0]: 'query'})
# Add to our results
formatted_data['related_queries'] = top_df
update_progress(f"Successfully processed {len(top_df)} top related queries", progress=80)
except Exception as e:
logger.warning(f"Error processing top queries: {e}")
update_progress(f"Error processing top queries: {str(e)}", level="warning", progress=80)
# Process rising queries if available
if 'rising' in keyword_data and keyword_data['rising'] is not None:
try:
update_progress("Processing rising related queries...", progress=85)
# Convert to DataFrame if it's not already
if isinstance(keyword_data['rising'], pd.DataFrame):
rising_df = keyword_data['rising']
else:
# Try to convert to DataFrame
rising_df = pd.DataFrame(keyword_data['rising'])
# Ensure it has the right columns
if not rising_df.empty:
# Rename columns if needed
if 'query' in rising_df.columns:
# Already has the right column name
pass
elif len(rising_df.columns) > 0:
# Use first column as query
rising_df = rising_df.rename(columns={rising_df.columns[0]: 'query'})
# Combine with existing data if we have any
if not formatted_data['related_queries'].empty:
formatted_data['related_queries'] = pd.concat([formatted_data['related_queries'], rising_df])
update_progress(f"Successfully processed {len(rising_df)} rising related queries", progress=90)
else:
formatted_data['related_queries'] = rising_df
update_progress(f"Successfully processed {len(rising_df)} rising related queries", progress=90)
except Exception as e:
logger.warning(f"Error processing rising queries: {e}")
update_progress(f"Error processing rising queries: {str(e)}", level="warning", progress=90)
except Exception as e:
logger.error(f"Error fetching related queries: {e}")
update_progress(f"Error fetching related queries: {str(e)}", level="warning", progress=90)
# Ensure we have an empty DataFrame with the right columns
formatted_data['related_queries'] = pd.DataFrame(columns=['query', 'value'])
# Get related topics if missing
if formatted_data['related_topics'].empty:
try:
update_progress("Fetching related topics data...", progress=95)
# Get related topics data
related_topics = pytrends.related_topics()
# Create empty DataFrame as fallback
formatted_data['related_topics'] = pd.DataFrame(columns=['topic', 'value'])
# Simple direct approach to avoid list index errors
if related_topics and isinstance(related_topics, dict):
# Check if our keyword exists in the results
if keywords in related_topics:
keyword_data = related_topics[keywords]
# Process top topics if available
if 'top' in keyword_data and keyword_data['top'] is not None:
try:
update_progress("Processing top related topics...", progress=97)
# Convert to DataFrame if it's not already
if isinstance(keyword_data['top'], pd.DataFrame):
top_df = keyword_data['top']
else:
# Try to convert to DataFrame
top_df = pd.DataFrame(keyword_data['top'])
# Ensure it has the right columns
if not top_df.empty:
# Rename columns if needed
if 'topic_title' in top_df.columns:
top_df = top_df.rename(columns={'topic_title': 'topic'})
elif len(top_df.columns) > 0 and 'topic' not in top_df.columns:
# Use first column as topic
top_df = top_df.rename(columns={top_df.columns[0]: 'topic'})
# Add to our results
formatted_data['related_topics'] = top_df
update_progress(f"Successfully processed {len(top_df)} top related topics", progress=98)
except Exception as e:
logger.warning(f"Error processing top topics: {e}")
update_progress(f"Error processing top topics: {str(e)}", level="warning", progress=98)
# Process rising topics if available
if 'rising' in keyword_data and keyword_data['rising'] is not None:
try:
update_progress("Processing rising related topics...", progress=99)
# Convert to DataFrame if it's not already
if isinstance(keyword_data['rising'], pd.DataFrame):
rising_df = keyword_data['rising']
else:
# Try to convert to DataFrame
rising_df = pd.DataFrame(keyword_data['rising'])
# Ensure it has the right columns
if not rising_df.empty:
# Rename columns if needed
if 'topic_title' in rising_df.columns:
rising_df = rising_df.rename(columns={'topic_title': 'topic'})
elif len(rising_df.columns) > 0 and 'topic' not in rising_df.columns:
# Use first column as topic
rising_df = rising_df.rename(columns={rising_df.columns[0]: 'topic'})
# Combine with existing data if we have any
if not formatted_data['related_topics'].empty:
formatted_data['related_topics'] = pd.concat([formatted_data['related_topics'], rising_df])
update_progress(f"Successfully processed {len(rising_df)} rising related topics", progress=100)
else:
formatted_data['related_topics'] = rising_df
update_progress(f"Successfully processed {len(rising_df)} rising related topics", progress=100)
except Exception as e:
logger.warning(f"Error processing rising topics: {e}")
update_progress(f"Error processing rising topics: {str(e)}", level="warning", progress=100)
except Exception as e:
logger.error(f"Error fetching related topics: {e}")
update_progress(f"Error fetching related topics: {str(e)}", level="warning", progress=100)
# Ensure we have an empty DataFrame with the right columns
formatted_data['related_topics'] = pd.DataFrame(columns=['topic', 'value'])
except Exception as e:
logger.error(f"Error fetching additional trends data: {e}")
update_progress(f"Error fetching additional trends data: {str(e)}", level="warning", progress=100)
# Ensure all DataFrames have the correct column names for the UI
update_progress("Finalizing data formatting...", progress=100)
if not formatted_data['interest_over_time'].empty:
if 'date' not in formatted_data['interest_over_time'].columns:
formatted_data['interest_over_time'] = formatted_data['interest_over_time'].reset_index()
if 'interest' not in formatted_data['interest_over_time'].columns and keywords in formatted_data['interest_over_time'].columns:
formatted_data['interest_over_time'] = formatted_data['interest_over_time'].rename(columns={keywords: 'interest'})
if not formatted_data['regional_interest'].empty:
if 'country_code' not in formatted_data['regional_interest'].columns and 'geoName' in formatted_data['regional_interest'].columns:
formatted_data['regional_interest'] = formatted_data['regional_interest'].rename(columns={'geoName': 'country_code'})
if 'interest' not in formatted_data['regional_interest'].columns and keywords in formatted_data['regional_interest'].columns:
formatted_data['regional_interest'] = formatted_data['regional_interest'].rename(columns={keywords: 'interest'})
if not formatted_data['related_queries'].empty:
# Handle different column names that might be present in the related queries DataFrame
if 'query' not in formatted_data['related_queries'].columns:
if 'Top query' in formatted_data['related_queries'].columns:
formatted_data['related_queries'] = formatted_data['related_queries'].rename(columns={'Top query': 'query'})
elif 'Rising query' in formatted_data['related_queries'].columns:
formatted_data['related_queries'] = formatted_data['related_queries'].rename(columns={'Rising query': 'query'})
elif 'query' not in formatted_data['related_queries'].columns and len(formatted_data['related_queries'].columns) > 0:
# If we have a DataFrame but no 'query' column, use the first column as 'query'
first_col = formatted_data['related_queries'].columns[0]
formatted_data['related_queries'] = formatted_data['related_queries'].rename(columns={first_col: 'query'})
if 'value' not in formatted_data['related_queries'].columns and len(formatted_data['related_queries'].columns) > 1:
# If we have a second column, use it as 'value'
second_col = formatted_data['related_queries'].columns[1]
formatted_data['related_queries'] = formatted_data['related_queries'].rename(columns={second_col: 'value'})
elif 'value' not in formatted_data['related_queries'].columns:
# If no 'value' column exists, add one with default values
formatted_data['related_queries']['value'] = 0
if not formatted_data['related_topics'].empty:
# Handle different column names that might be present in the related topics DataFrame
if 'topic' not in formatted_data['related_topics'].columns:
if 'topic_title' in formatted_data['related_topics'].columns:
formatted_data['related_topics'] = formatted_data['related_topics'].rename(columns={'topic_title': 'topic'})
elif 'topic' not in formatted_data['related_topics'].columns and len(formatted_data['related_topics'].columns) > 0:
# If we have a DataFrame but no 'topic' column, use the first column as 'topic'
first_col = formatted_data['related_topics'].columns[0]
formatted_data['related_topics'] = formatted_data['related_topics'].rename(columns={first_col: 'topic'})
if 'value' not in formatted_data['related_topics'].columns and len(formatted_data['related_topics'].columns) > 1:
# If we have a second column, use it as 'value'
second_col = formatted_data['related_topics'].columns[1]
formatted_data['related_topics'] = formatted_data['related_topics'].rename(columns={second_col: 'value'})
elif 'value' not in formatted_data['related_topics'].columns:
# If no 'value' column exists, add one with default values
formatted_data['related_topics']['value'] = 0
# Clear the progress container after completion
progress_container.empty()
progress_bar.empty()
return formatted_data
except Exception as e:
logger.error(f"Error in Google Trends analysis: {e}")
update_progress(f"Error in Google Trends analysis: {str(e)}", level="error", progress=100)
# Clear the progress container after error
progress_container.empty()
progress_bar.empty()
return {
'related_keywords': [],
'interest_over_time': pd.DataFrame(),
'regional_interest': pd.DataFrame(),
'related_queries': pd.DataFrame(),
'related_topics': pd.DataFrame()
}
def metaphor_extract_titles_or_text(json_data, return_titles=True):
"""
Extract either titles or text from the given JSON structure.
Args:
json_data (list): List of Result objects in JSON format.
return_titles (bool): If True, return titles. If False, return text.
Returns:
list: List of titles or text.
"""
if return_titles:
return [(result.title) for result in json_data]
else:
return [result.text for result in json_data]
def extract_info(json_data, info_type):
"""
Extract information (titles, peopleAlsoAsk, or relatedSearches) from the given JSON.
Args:
json_data (dict): The JSON data.
info_type (str): The type of information to extract (titles, peopleAlsoAsk, relatedSearches).
Returns:
list or None: A list containing the requested information, or None if the type is invalid.
"""
if info_type == "titles":
return [result.get("title") for result in json_data.get("organic", [])]
elif info_type == "peopleAlsoAsk":
return [item.get("question") for item in json_data.get("peopleAlsoAsk", [])]
elif info_type == "relatedSearches":
return [item.get("query") for item in json_data.get("relatedSearches", [])]
else:
print("Invalid info_type. Please use 'titles', 'peopleAlsoAsk', or 'relatedSearches'.")
return None
def tavily_extract_information(json_data, keyword):
"""
Extract information from the given JSON based on the specified keyword.
Args:
json_data (dict): The JSON data.
keyword (str): The keyword (title, content, answer, follow-query).
Returns:
list or str: The extracted information based on the keyword.
"""
if keyword == 'titles':
return [result['title'] for result in json_data['results']]
elif keyword == 'content':
return [result['content'] for result in json_data['results']]
elif keyword == 'answer':
return json_data['answer']
elif keyword == 'follow-query':
return json_data['follow_up_questions']
else:
return f"Invalid keyword: {keyword}"

View File

@@ -0,0 +1,623 @@
import os
import sys
import pandas as pd
from io import StringIO
from pathlib import Path
from metaphor_python import Metaphor
from datetime import datetime, timedelta
import streamlit as st
from loguru import logger
from tqdm import tqdm
from tabulate import tabulate
from collections import namedtuple
import textwrap
logger.remove()
logger.add(sys.stdout,
colorize=True,
format="<level>{level}</level>|<green>{file}:{line}:{function}</green>| {message}"
)
from dotenv import load_dotenv
load_dotenv(Path('../../.env'))
from exa_py import Exa
from tenacity import (retry, stop_after_attempt, wait_random_exponential,)# for exponential backoff
from .gpt_summarize_web_content import summarize_web_content
from .gpt_competitor_analysis import summarize_competitor_content
from .common_utils import save_in_file, cfg_search_param
@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(6))
def get_metaphor_client():
"""
Get the Metaphor client.
Returns:
Metaphor: An instance of the Metaphor client.
"""
METAPHOR_API_KEY = os.environ.get('METAPHOR_API_KEY')
if not METAPHOR_API_KEY:
logger.error("METAPHOR_API_KEY environment variable not set!")
st.error("METAPHOR_API_KEY environment variable not set!")
raise ValueError("METAPHOR_API_KEY environment variable not set!")
return Exa(METAPHOR_API_KEY)
def metaphor_rag_search():
""" Mainly used for researching blog sections. """
metaphor = get_metaphor_client()
query = "blog research" # Example query, this can be parameterized as needed
results = metaphor.search(query)
if not results:
logger.error("No results found for the query.")
st.error("No results found for the query.")
return None
# Process the results (this is a placeholder, actual processing logic will depend on requirements)
processed_results = [result['title'] for result in results]
# Display the results
st.write("Search Results:")
st.write(processed_results)
return processed_results
def metaphor_find_similar(similar_url, usecase, num_results=5, start_published_date=None, end_published_date=None,
include_domains=None, exclude_domains=None, include_text=None, exclude_text=None,
summary_query=None, progress_bar=None):
"""Find similar content using Metaphor API."""
try:
# Initialize progress if not provided
if progress_bar is None:
progress_bar = st.progress(0.0)
# Update progress
progress_bar.progress(0.1, text="Initializing search...")
# Get Metaphor client
metaphor = get_metaphor_client()
logger.info(f"Initialized Metaphor client for URL: {similar_url}")
# Prepare search parameters
search_params = {
"highlights": True,
"num_results": num_results,
}
# Add optional parameters if provided
if start_published_date:
search_params["start_published_date"] = start_published_date
if end_published_date:
search_params["end_published_date"] = end_published_date
if include_domains:
search_params["include_domains"] = include_domains
if exclude_domains:
search_params["exclude_domains"] = exclude_domains
if include_text:
search_params["include_text"] = include_text
if exclude_text:
search_params["exclude_text"] = exclude_text
# Add summary query
if summary_query:
search_params["summary"] = summary_query
else:
search_params["summary"] = {"query": f"Find {usecase} similar to the given URL."}
logger.debug(f"Search parameters: {search_params}")
# Update progress
progress_bar.progress(0.2, text="Preparing search parameters...")
# Make API call
logger.info("Calling Metaphor API find_similar_and_contents...")
search_response = metaphor.find_similar_and_contents(
similar_url,
**search_params
)
if search_response and hasattr(search_response, 'results'):
competitors = search_response.results
total_results = len(competitors)
# Update progress
progress_bar.progress(0.3, text=f"Found {total_results} results...")
# Process results
processed_results = []
for i, result in enumerate(competitors):
# Calculate progress as decimal (0.0-1.0)
progress = 0.3 + (0.6 * (i / total_results))
progress_text = f"Processing result {i+1}/{total_results}..."
progress_bar.progress(progress, text=progress_text)
# Process each result
processed_result = {
"Title": result.title,
"URL": result.url,
"Content Summary": result.text if hasattr(result, 'text') else "No content available"
}
processed_results.append(processed_result)
# Update progress
progress_bar.progress(0.9, text="Finalizing results...")
# Create DataFrame
df = pd.DataFrame(processed_results)
# Update progress
progress_bar.progress(1.0, text="Analysis completed!")
return df, search_response
else:
logger.warning("No results found in search response")
progress_bar.progress(1.0, text="No results found")
return pd.DataFrame(), search_response
except Exception as e:
logger.error(f"Error in metaphor_find_similar: {str(e)}", exc_info=True)
if progress_bar:
progress_bar.progress(1.0, text="Error occurred during analysis")
raise
def calculate_date_range(time_range: str) -> tuple:
"""
Calculate start and end dates based on time range selection.
Args:
time_range (str): One of 'past_day', 'past_week', 'past_month', 'past_year', 'anytime'
Returns:
tuple: (start_date, end_date) in ISO format with milliseconds
"""
now = datetime.utcnow()
end_date = now.strftime('%Y-%m-%dT%H:%M:%S.999Z')
if time_range == 'past_day':
start_date = (now - timedelta(days=1)).strftime('%Y-%m-%dT%H:%M:%S.000Z')
elif time_range == 'past_week':
start_date = (now - timedelta(weeks=1)).strftime('%Y-%m-%dT%H:%M:%S.000Z')
elif time_range == 'past_month':
start_date = (now - timedelta(days=30)).strftime('%Y-%m-%dT%H:%M:%S.000Z')
elif time_range == 'past_year':
start_date = (now - timedelta(days=365)).strftime('%Y-%m-%dT%H:%M:%S.000Z')
else: # anytime
start_date = None
end_date = None
return start_date, end_date
def metaphor_search_articles(query, search_options: dict = None):
"""
Search for articles using the Metaphor/Exa API.
Args:
query (str): The search query.
search_options (dict): Search configuration options including:
- num_results (int): Number of results to retrieve
- use_autoprompt (bool): Whether to use autoprompt
- include_domains (list): List of domains to include
- time_range (str): One of 'past_day', 'past_week', 'past_month', 'past_year', 'anytime'
- exclude_domains (list): List of domains to exclude
Returns:
dict: Search results and metadata
"""
exa = get_metaphor_client()
try:
# Initialize default search options
if search_options is None:
search_options = {}
# Get config parameters or use defaults
try:
include_domains, _, num_results, _ = cfg_search_param('exa')
except Exception as cfg_err:
logger.warning(f"Failed to load config parameters: {cfg_err}. Using defaults.")
include_domains = None
num_results = 10
# Calculate date range based on time_range option
time_range = search_options.get('time_range', 'anytime')
start_published_date, end_published_date = calculate_date_range(time_range)
# Prepare search parameters
search_params = {
'num_results': search_options.get('num_results', num_results),
'summary': True, # Always get summaries
'include_domains': search_options.get('include_domains', include_domains),
'use_autoprompt': search_options.get('use_autoprompt', True),
}
# Add date parameters only if they are not None
if start_published_date:
search_params['start_published_date'] = start_published_date
if end_published_date:
search_params['end_published_date'] = end_published_date
logger.info(f"Exa web search with params: {search_params} and Query: {query}")
# Execute search
search_response = exa.search_and_contents(
query,
**search_params
)
if not search_response or not hasattr(search_response, 'results'):
logger.warning("No results returned from Exa search")
return None
# Get cost information safely
try:
cost_dollars = {
'total': float(search_response.cost_dollars['total']),
} if hasattr(search_response, 'cost_dollars') else None
except Exception as cost_err:
logger.warning(f"Error processing cost information: {cost_err}")
cost_dollars = None
# Format response to match expected structure
formatted_response = {
"data": {
"requestId": getattr(search_response, 'request_id', None),
"resolvedSearchType": "neural",
"results": [
{
"id": result.url,
"title": result.title,
"url": result.url,
"publishedDate": result.published_date if hasattr(result, 'published_date') else None,
"author": getattr(result, 'author', None),
"score": getattr(result, 'score', 0),
"summary": result.summary if hasattr(result, 'summary') else None,
"text": result.text if hasattr(result, 'text') else None,
"image": getattr(result, 'image', None),
"favicon": getattr(result, 'favicon', None)
}
for result in search_response.results
],
"costDollars": cost_dollars
}
}
# Get AI-generated answer from Metaphor
try:
exa_answer = get_exa_answer(query)
if exa_answer:
formatted_response.update(exa_answer)
except Exception as exa_err:
logger.warning(f"Error getting Exa answer: {exa_err}")
# Get AI-generated answer from Tavily
try:
# Import the function directly from the module
import importlib
tavily_module = importlib.import_module('lib.ai_web_researcher.tavily_ai_search')
if hasattr(tavily_module, 'do_tavily_ai_search'):
tavily_response = tavily_module.do_tavily_ai_search(query)
if tavily_response and 'answer' in tavily_response:
formatted_response.update({
"tavily_answer": tavily_response.get("answer"),
"tavily_citations": tavily_response.get("citations", []),
"tavily_cost_dollars": tavily_response.get("costDollars", {"total": 0})
})
else:
logger.warning("do_tavily_ai_search function not found in tavily_ai_search module")
except Exception as tavily_err:
logger.warning(f"Error getting Tavily answer: {tavily_err}")
# Return the formatted response without displaying it
# The display will be handled by gpt_web_researcher
return formatted_response
except Exception as e:
logger.error(f"Error in Exa searching articles: {e}")
return None
def streamlit_display_metaphor_results(metaphor_response, search_keywords=None):
"""Display Metaphor search results in Streamlit."""
if not metaphor_response:
st.error("No search results found.")
return
# Add debug logging
logger.debug(f"Displaying Metaphor results. Type: {type(metaphor_response)}")
if isinstance(metaphor_response, dict):
logger.debug(f"Metaphor response keys: {metaphor_response.keys()}")
# Initialize session state variables if they don't exist
if 'search_insights' not in st.session_state:
st.session_state.search_insights = None
if 'metaphor_response' not in st.session_state:
st.session_state.metaphor_response = None
if 'insights_generated' not in st.session_state:
st.session_state.insights_generated = False
# Store the current response in session state
st.session_state.metaphor_response = metaphor_response
# Display search results
st.subheader("🔍 Search Results")
# Calculate metrics - handle different data structures
results = []
if isinstance(metaphor_response, dict):
if 'data' in metaphor_response and 'results' in metaphor_response['data']:
results = metaphor_response['data']['results']
elif 'results' in metaphor_response:
results = metaphor_response['results']
total_results = len(results)
avg_relevance = sum(r.get('score', 0) for r in results) / total_results if total_results > 0 else 0
# Display metrics
col1, col2 = st.columns(2)
with col1:
st.metric("Total Results", total_results)
with col2:
st.metric("Average Relevance Score", f"{avg_relevance:.2f}")
# Display AI-generated answers if available
if 'tavily_answer' in metaphor_response or 'metaphor_answer' in metaphor_response:
st.subheader("🤖 AI-Generated Answers")
if 'tavily_answer' in metaphor_response:
st.markdown("**Tavily AI Answer:**")
st.write(metaphor_response['tavily_answer'])
if 'metaphor_answer' in metaphor_response:
st.markdown("**Metaphor AI Answer:**")
st.write(metaphor_response['metaphor_answer'])
# Get Search Insights button
if st.button("Generate Search Insights", key="metaphor_generate_insights_button"):
st.session_state.insights_generated = True
st.rerun()
# Display insights if they exist in session state
if st.session_state.search_insights:
st.subheader("🔍 Search Insights")
st.write(st.session_state.search_insights)
# Display search results in a data editor
st.subheader("📊 Detailed Results")
# Prepare data for display
results_data = []
for result in results:
result_data = {
'Title': result.get('title', ''),
'URL': result.get('url', ''),
'Snippet': result.get('summary', ''),
'Relevance Score': result.get('score', 0),
'Published Date': result.get('publishedDate', '')
}
results_data.append(result_data)
# Create DataFrame
df = pd.DataFrame(results_data)
# Display the DataFrame if it's not empty
if not df.empty:
# Configure columns
st.dataframe(
df,
column_config={
"Title": st.column_config.TextColumn(
"Title",
help="Title of the search result",
width="large",
),
"URL": st.column_config.LinkColumn(
"URL",
help="Link to the search result",
width="medium",
display_text="Visit Article",
),
"Snippet": st.column_config.TextColumn(
"Snippet",
help="Summary of the search result",
width="large",
),
"Relevance Score": st.column_config.NumberColumn(
"Relevance Score",
help="Relevance score of the search result",
format="%.2f",
width="small",
),
"Published Date": st.column_config.DateColumn(
"Published Date",
help="Publication date of the search result",
width="medium",
),
},
hide_index=True,
)
# Add popover for snippets
st.markdown("""
<style>
.snippet-popover {
position: relative;
display: inline-block;
}
.snippet-popover .snippet-content {
visibility: hidden;
width: 300px;
background-color: #f9f9f9;
color: #333;
text-align: left;
border-radius: 6px;
padding: 10px;
position: absolute;
z-index: 1;
bottom: 125%;
left: 50%;
margin-left: -150px;
opacity: 0;
transition: opacity 0.3s;
box-shadow: 0 2px 5px rgba(0,0,0,0.2);
}
.snippet-popover:hover .snippet-content {
visibility: visible;
opacity: 1;
}
</style>
""", unsafe_allow_html=True)
# Display snippets with popover
st.subheader("📝 Snippets")
for i, result in enumerate(results):
snippet = result.get('summary', '')
if snippet:
st.markdown(f"""
<div class="snippet-popover">
<strong>{result.get('title', '')}</strong>
<div class="snippet-content">
{snippet}
</div>
</div>
""", unsafe_allow_html=True)
else:
st.info("No detailed results available.")
# Add a collapsible section for the raw JSON data
with st.expander("Research Results (JSON)", expanded=False):
st.json(metaphor_response)
def metaphor_news_summarizer(news_keywords):
""" build a LLM-based news summarizer app with the Exa API to keep us up-to-date
with the latest news on a given topic.
"""
exa = get_metaphor_client()
# FIXME: Needs to be user defined.
one_week_ago = (datetime.now() - timedelta(days=7))
date_cutoff = one_week_ago.strftime("%Y-%m-%d")
search_response = exa.search_and_contents(
news_keywords, use_autoprompt=True, start_published_date=date_cutoff
)
urls = [result.url for result in search_response.results]
print("URLs:")
for url in urls:
print(url)
def print_search_result(contents_response):
# Define the Result namedtuple
Result = namedtuple("Result", ["url", "title", "text"])
# Tabulate the data
table_headers = ["URL", "Title", "Summary"]
table_data = [(result.url, result.title, result.text) for result in contents_response]
table = tabulate(table_data,
headers=table_headers,
tablefmt="fancy_grid",
colalign=["left", "left", "left"],
maxcolwidths=[20, 20, 70])
# Convert table_data to DataFrame
import pandas as pd
df = pd.DataFrame(table_data, columns=["URL", "Title", "Summary"])
import streamlit as st
st.table(df)
print(table)
# Save the combined table to a file
try:
save_in_file(table)
except Exception as save_results_err:
logger.error(f"Failed to save search results: {save_results_err}")
def metaphor_scholar_search(query, include_domains=None, time_range="anytime"):
"""
Search for papers using the Metaphor API.
Args:
query (str): The search query.
include_domains (list): List of domains to include.
time_range (str): Time range for published articles ("day", "week", "month", "year", "anytime").
Returns:
MetaphorResponse: The response from the Metaphor API.
"""
client = get_metaphor_client()
try:
if time_range == "day":
start_published_date = (datetime.utcnow() - timedelta(days=1)).strftime('%Y-%m-%dT%H:%M:%SZ')
elif time_range == "week":
start_published_date = (datetime.utcnow() - timedelta(weeks=1)).strftime('%Y-%m-%dT%H:%M:%SZ')
elif time_range == "month":
start_published_date = (datetime.utcnow() - timedelta(weeks=4)).strftime('%Y-%m-%dT%H:%M:%SZ')
elif time_range == "year":
start_published_date = (datetime.utcnow() - timedelta(days=365)).strftime('%Y-%m-%dT%H:%M:%SZ')
else:
start_published_date = None
response = client.search(query, include_domains=include_domains, start_published_date=start_published_date, use_autoprompt=True)
return response
except Exception as e:
logger.error(f"Error in searching papers: {e}")
def get_exa_answer(query: str, system_prompt: str = None) -> dict:
"""
Get an AI-generated answer for a query using Exa's answer endpoint.
Args:
query (str): The search query to get an answer for
system_prompt (str, optional): Custom system prompt for the LLM. If None, uses default prompt.
Returns:
dict: Response containing answer, citations, and cost information
{
"answer": str,
"citations": list[dict],
"costDollars": dict
}
"""
exa = get_metaphor_client()
try:
# Use default system prompt if none provided
if system_prompt is None:
system_prompt = (
"I am doing research to write factual content. "
"Help me find answers for content generation task. "
"Provide detailed, well-structured answers with clear citations."
)
logger.info(f"Getting Exa answer for query: {query}")
logger.debug(f"Using system prompt: {system_prompt}")
# Make API call to get answer with system_prompt parameter
result = exa.answer(
query,
model="exa",
text=True # Include full text in citations
)
if not result or not result.get('answer'):
logger.warning("No answer received from Exa")
return None
# Format response to match expected structure
response = {
"answer": result.get('answer'),
"citations": result.get('citations', []),
"costDollars": result.get('costDollars', {"total": 0})
}
return response
except Exception as e:
logger.error(f"Error getting Exa answer: {e}")
return None

View File

@@ -16,13 +16,12 @@ Usage:
Modifications:
- To modify the script, update the environment variables in the .env file with the required API keys.
- Adjust the search parameters, such as keywords and search depth, in the `get_tavilyai_results` function as needed.
- Adjust the search parameters, such as keywords and search depth, in the `do_tavily_ai_search` function as needed.
- Customize logging configurations and table formatting according to preferences.
To-Do (TBD):
- Consider adding further enhancements or customization based on specific use cases.
Note: This script depends on external libraries such as Tavily, Rich, Tabulate, Loguru, and Tenacity. Install them using 'pip install tavily rich tabulate loguru tenacity' if not already installed.
"""
@@ -37,28 +36,22 @@ from tabulate import tabulate
# Load environment variables from .env file
load_dotenv(Path('../../.env'))
from rich import print
import streamlit as st
# Configure logger
logger.remove()
logger.add(sys.stdout,
colorize=True,
format="<level>{level}</level>|<green>{file}:{line}:{function}</green>| {message}"
)
from .common_utils import save_in_file, cfg_search_param
from tenacity import retry, stop_after_attempt, wait_random_exponential
@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(6))
def get_tavilyai_results(keywords, include_urls, search_depth="advanced"):
def do_tavily_ai_search(keywords, max_results=5, include_domains=None, search_depth="advanced", **kwargs):
"""
Get Tavily AI search results based on specified keywords and options.
Args:
keywords (str): Keywords for Tavily AI search.
include_urls (str): Comma-separated URLs to include in the search.
search_depth (str, optional): Search depth option (default is "advanced").
Returns:
dict: Tavily AI search results.
"""
# Run Tavily search
logger.info(f"Running Tavily search on: {keywords}")
@@ -73,21 +66,104 @@ def get_tavilyai_results(keywords, include_urls, search_depth="advanced"):
client = TavilyClient(api_key=api_key)
except Exception as err:
logger.error(f"Failed to create Tavily client. Check TAVILY_API_KEY: {err}")
raise
try:
if include_urls:
tavily_search_result = client.search(keywords, search_depth, include_answer=True, include_domains=include_urls)
else:
tavily_search_result = client.search(keywords, search_depth, include_answer=True)
print_result_table(tavily_search_result)
return(tavily_search_result)
# Create search parameters exactly matching Tavily's API format
tavily_search_result = client.search(
query=keywords,
search_depth="advanced",
time_range="year",
include_answer="advanced",
include_domains=[""] if not include_domains else include_domains,
max_results=max_results
)
if tavily_search_result:
print_result_table(tavily_search_result)
streamlit_display_results(tavily_search_result)
return tavily_search_result
return None
except Exception as err:
logger.error(f"Failed to do Tavily Research: {err}")
raise
def streamlit_display_results(output_data):
"""Display Tavily AI search results in Streamlit UI with enhanced visualization."""
# Display the 'answer' in Streamlit with enhanced styling
answer = output_data.get("answer", "No answer available")
st.markdown("### 🤖 AI-Generated Answer")
st.markdown(f"""
<div style="background-color: #f0f2f6; padding: 20px; border-radius: 10px; border-left: 5px solid #4CAF50;">
{answer}
</div>
""", unsafe_allow_html=True)
# Display follow-up questions if available
follow_up_questions = output_data.get("follow_up_questions", [])
if follow_up_questions:
st.markdown("### ❓ Follow-up Questions")
for i, question in enumerate(follow_up_questions, 1):
st.markdown(f"**{i}.** {question}")
# Prepare data for display with dataeditor
st.markdown("### 📊 Search Results")
# Create a DataFrame for the results
import pandas as pd
results_data = []
for item in output_data.get("results", []):
title = item.get("title", "")
snippet = item.get("content", "")
link = item.get("url", "")
results_data.append({
"Title": title,
"Content": snippet,
"Link": link
})
if results_data:
df = pd.DataFrame(results_data)
# Display the data editor
st.data_editor(
df,
column_config={
"Title": st.column_config.TextColumn(
"Title",
help="Article title",
width="medium",
),
"Content": st.column_config.TextColumn(
"Content",
help="Click the button below to view full content",
width="large",
),
"Link": st.column_config.LinkColumn(
"Link",
help="Click to visit the website",
width="small",
display_text="Visit Site"
),
},
hide_index=True,
use_container_width=True,
)
# Add popovers for full content display
for item in output_data.get("results", []):
with st.popover(f"View content: {item.get('title', '')[:50]}..."):
st.markdown(item.get("content", ""))
else:
st.info("No results found for your search query.")
def print_result_table(output_data):
""" Pretty print the tavily AI serch result. """
""" Pretty print the tavily AI search result. """
# Prepare data for tabulate
table_data = []
for item in output_data.get("results"):
@@ -140,16 +216,3 @@ def print_result_table(output_data):
save_in_file(table)
except Exception as save_results_err:
logger.error(f"Failed to save search results: {save_results_err}")
def save_in_file(table_content):
""" Helper function to save search analysis in a file. """
file_path = os.environ.get('SEARCH_SAVE_FILE')
try:
# Save the content to the file
with open(file_path, "a+") as file:
file.write(table_content)
file.write("\n" * 3) # Add three newlines at the end
logger.info(f"Search content saved to {file_path}")
except Exception as e:
logger.error(f"Error occurred while writing to the file: {e}")

View File

@@ -0,0 +1,184 @@
#####################################################
#
# Alwrity, AI essay writer - Essay_Writing_with_Prompt_Chaining
#
#####################################################
import os
from pathlib import Path
from dotenv import load_dotenv
from pprint import pprint
from loguru import logger
import sys
from ..gpt_providers.text_generation.main_text_generation import llm_text_gen
def generate_with_retry(prompt, system_prompt=None):
"""
Generates content using the llm_text_gen function with retry handling for errors.
Parameters:
prompt (str): The prompt to generate content from.
system_prompt (str, optional): Custom system prompt to use instead of the default one.
Returns:
str: The generated content.
"""
try:
# Use llm_text_gen instead of directly calling the model
return llm_text_gen(prompt, system_prompt)
except Exception as e:
logger.error(f"Error generating content: {e}")
return ""
def ai_essay_generator(essay_title, selected_essay_type, selected_education_level, selected_num_pages):
"""
Write an Essay using prompt chaining and iterative generation.
Parameters:
essay_title (str): The title or topic of the essay.
selected_essay_type (str): The type of essay to write.
selected_education_level (str): The education level of the target audience.
selected_num_pages (int): The number of pages or words for the essay.
"""
logger.info(f"Starting to write Essay on {essay_title}..")
try:
# Define persona and writing guidelines
guidelines = f'''\
Writing Guidelines
As an expert Essay writer and academic researcher, demostrate your world class essay writing skills.
Follow the below writing guidelines for writing your essay:
1). You specialize in {selected_essay_type} essay writing.
2). Your target audiences include readers from {selected_education_level} level.
3). The title of the essay is {essay_title}.
5). The final essay should of {selected_num_pages} words/pages.
3). Plant the seeds of subplots or potential character arc shifts that can be expanded later.
Remember, your main goal is to write as much as you can. If you get through
the story too fast, that is bad. Expand, never summarize.
'''
# Generate prompts
premise_prompt = f'''\
As an expert essay writer, specilizing in {selected_essay_type} essay writing.
Write an Essay title for given keywords {essay_title}.
The title should appeal to audience level of {selected_education_level}.
'''
outline_prompt = f'''\
As an expert essay writer, specilizing in {selected_essay_type} essay writing.
Your Essay title is:
{{premise}}
Write an outline for the essay.
'''
starting_prompt = f'''\
As an expert essay writer, specilizing in {selected_essay_type} essay writing.
Your essay title is:
{{premise}}
The outline of the Essay is:
{{outline}}
First, silently review the outline and the essay title. Consider how to start the Essay.
Start to write the very beginning of the Essay. You are not expected to finish
the whole Essay now. Your writing should be detailed enough that you are only
scratching the surface of the first bullet of your outline. Try to write AT
MINIMUM 1000 WORDS.
{guidelines}
'''
continuation_prompt = f'''\
As an expert essay writer, specilizing in {selected_essay_type} essay writing.
Your essay title is:
{{premise}}
The outline of the Essay is:
{{outline}}
You've begun to write the essay and continue to do so.
Here's what you've written so far:
{{story_text}}
=====
First, silently review the outline and essay so far.
Identify what the single next part of your outline you should write.
Your task is to continue where you left off and write the next part of the Essay.
You are not expected to finish the whole essay now. Your writing should be
detailed enough that you are only scratching the surface of the next part of
your outline. Try to write AT MINIMUM 1000 WORDS. However, only once the essay
is COMPLETELY finished, write IAMDONE. Remember, do NOT write a whole chapter
right now.
{guidelines}
'''
# Generate prompts
try:
premise = generate_with_retry(premise_prompt)
logger.info(f"The title of the Essay is: {premise}")
except Exception as err:
logger.error(f"Essay title Generation Error: {err}")
return
outline = generate_with_retry(outline_prompt.format(premise=premise))
logger.info(f"The Outline of the essay is: {outline}\n\n")
if not outline:
logger.error("Failed to generate Essay outline. Exiting...")
return
try:
starting_draft = generate_with_retry(
starting_prompt.format(premise=premise, outline=outline))
pprint(starting_draft)
except Exception as err:
logger.error(f"Failed to Generate Essay draft: {err}")
return
try:
draft = starting_draft
continuation = generate_with_retry(
continuation_prompt.format(premise=premise, outline=outline, story_text=draft))
pprint(continuation)
except Exception as err:
logger.error(f"Failed to write the initial draft: {err}")
# Add the continuation to the initial draft, keep building the story until we see 'IAMDONE'
try:
draft += '\n\n' + continuation
except Exception as err:
logger.error(f"Failed as: {err} and {continuation}")
while 'IAMDONE' not in continuation:
try:
continuation = generate_with_retry(
continuation_prompt.format(premise=premise, outline=outline, story_text=draft))
draft += '\n\n' + continuation
except Exception as err:
logger.error(f"Failed to continually write the Essay: {err}")
return
# Remove 'IAMDONE' and print the final story
final = draft.replace('IAMDONE', '').strip()
pprint(final)
return final
except Exception as e:
logger.error(f"Main Essay writing: An error occurred: {e}")
return ""

View File

@@ -0,0 +1,102 @@
######################################################
#
# Alwrity, as an AI news writer, will have to be factually correct.
# We will do multiple rounds of web research and cite our sources.
# 'include_urls' will focus news articles only from well known sources.
# Choosing a country will help us get better results.
#
######################################################
import sys
import os
import json
from textwrap import dedent
from pathlib import Path
from datetime import datetime
from dotenv import load_dotenv
load_dotenv(Path('../../.env'))
from loguru import logger
logger.remove()
logger.add(sys.stdout,
colorize=True,
format="<level>{level}</level>|<green>{file}:{line}:{function}</green>| {message}"
)
from ..gpt_providers.text_generation.main_text_generation import llm_text_gen
from ..ai_web_researcher.google_serp_search import perform_serper_news_search
def ai_news_generation(news_keywords, news_country, news_language):
""" Generate news aritcle based on given keywords. """
# Use to store the blog in a string, to save in a *.md file.
blog_markdown_str = ""
logger.info(f"Researching and Writing News Article on keywords: {news_keywords}")
# Call on the got-researcher, tavily apis for this. Do google search for organic competition.
try:
google_news_result = perform_serper_news_search(news_keywords, news_country, news_language)
blog_markdown_str = write_news_google_search(news_keywords, news_country, news_language, google_news_result)
#print(blog_markdown_str)
except Exception as err:
logger.error(f"Failed in Google News web research: {err}")
logger.info("\n######### Draft1: Finished News article from Google web search: ###########\n\n")
return blog_markdown_str
def write_news_google_search(news_keywords, news_country, news_language, search_results):
"""Combine the given online research and gpt blog content"""
news_language = get_language_name(news_language)
news_country = get_country_name(news_country)
prompt = f"""
As an experienced {news_language} news journalist and editor,
I will provide you with my 'News keywords' and its 'google search results'.
Your goal is to write a News report, backed by given google search results.
Important, as a news report, its imperative that your content is factually correct and cited.
Follow below guidelines:
1). Understand and utilize the provided google search result json.
2). Always provide in-line citations and provide referance links.
3). Understand the given news item and adapt your tone accordingly.
4). Always include the dates when then news was reported.
6). Do not explain, describe your response.
7). Your blog should be highly formatted in markdown style and highly readable.
8). Important: Please read the entire prompt before writing anything. Follow the prompt exactly as I instructed.
\n\nNews Keywords: "{news_keywords}"\n\n
Google search Result: "{search_results}"
"""
logger.info("Generating blog and FAQs from Google web search results.")
try:
response = llm_text_gen(prompt)
return response
except Exception as err:
logger.error(f"Exit: Failed to get response from LLM: {err}")
exit(1)
def get_language_name(language_code):
languages = {
"es": "Spanish",
"vn": "Vietnamese",
"en": "English",
"ar": "Arabic",
"hi": "Hindi",
"de": "German",
"zh-cn": "Chinese (Simplified)"
# Add more language codes and corresponding names as needed
}
return languages.get(language_code, "Unknown")
def get_country_name(country_code):
countries = {
"es": "Spain",
"vn": "Vietnam",
"pk": "Pakistan",
"in": "India",
"de": "Germany",
"cn": "China"
# Add more country codes and corresponding names as needed
}
return countries.get(country_code, "Unknown")

View File

@@ -0,0 +1,115 @@
import streamlit as st
import json
from ..gpt_providers.text_generation.main_text_generation import llm_text_gen
def generate_product_description(title, details, audience, tone, length, keywords):
"""
Generates a product description using OpenAI's API.
Args:
title (str): The title of the product.
details (list): A list of product details (features, benefits, etc.).
audience (list): A list of target audience segments.
tone (str): The desired tone of the description (e.g., "Formal", "Informal").
length (str): The desired length of the description (e.g., "short", "medium", "long").
keywords (str): Keywords related to the product (comma-separated).
Returns:
str: The generated product description.
"""
prompt = f"""
Write a compelling product description for {title}.
Highlight these key features: {', '.join(details)}
Emphasize the benefits of these features for the target audience ({audience}).
Maintain a {tone} tone and aim for a length of approximately {length} words.
Use these keywords naturally throughout the description: {', '.join(keywords)}.
Remember to be persuasive and focus on the value proposition.
"""
try:
response = llm_text_gen(prompt)
return response
except Exception as err:
logger.error(f"Exit: Failed to get response from LLM: {err}")
exit(1)
def display_inputs():
st.title("📝 AI Product Description Writer 🚀")
st.markdown("**Generate compelling and accurate product descriptions with AI.**")
col1, col2 = st.columns(2)
with col1:
product_title = st.text_input("🏷️ **Product Title**", placeholder="Enter the product title (e.g., Wireless Bluetooth Headphones)")
with col2:
product_details = st.text_area("📄 **Product Details**", placeholder="Enter features, benefits, specifications, materials, etc. (e.g., Noise Cancellation, Long Battery Life, Water Resistant, Comfortable Design)")
col3, col4 = st.columns(2)
with col3:
keywords = st.text_input("🔑 **Keywords**", placeholder="Enter keywords, comma-separated (e.g., wireless headphones, noise cancelling, Bluetooth 5.0)")
with col4:
target_audience = st.multiselect(
"🎯 **Target Audience**",
["Teens", "Adults", "Seniors", "Music Lovers", "Fitness Enthusiasts", "Tech Savvy", "Busy Professionals", "Travelers", "Casual Users"],
placeholder="Select target audience (optional)"
)
col5, col6 = st.columns(2)
with col5:
description_length = st.selectbox(
"📏 **Desired Description Length**",
["Short (1-2 sentences)", "Medium (3-5 sentences)", "Long (6+ sentences)"],
help="Select the desired length of the product description"
)
with col6:
brand_tone = st.selectbox(
"🎨 **Brand Tone**",
["Formal", "Informal", "Fun & Energetic"],
help="Select the desired tone for the description"
)
return product_title, product_details, target_audience, brand_tone, description_length, keywords
def display_output(description):
if description:
st.subheader("✨ Generated Product Description:")
st.write(description)
json_ld = {
"@context": "https://schema.org",
"@type": "Product",
"name": product_title,
"description": description,
"audience": target_audience,
"brand": {
"@type": "Brand",
"name": "Your Brand Name"
},
"keywords": keywords.split(", ")
}
def write_ai_prod_desc():
product_title, product_details, target_audience, brand_tone, description_length, keywords = display_inputs()
if st.button("Generate Product Description 🚀"):
with st.spinner("Generating description..."):
description = generate_product_description(
product_title,
product_details.split(", "), # Split details into a list
target_audience,
brand_tone,
description_length.split(" ")[0].lower(), # Extract length from selectbox
keywords
)
display_output(description)

View File

@@ -0,0 +1,220 @@
import streamlit as st
from lib.utils.alwrity_utils import (essay_writer, ai_news_writer, ai_finance_ta_writer)
from lib.ai_writers.ai_story_writer.story_writer import story_input_section
from lib.ai_writers.ai_product_description_writer import write_ai_prod_desc
from lib.ai_writers.ai_copywriter.copywriter_dashboard import copywriter_dashboard
from lib.ai_writers.linkedin_writer import LinkedInAIWriter
from lib.ai_writers.blog_rewriter_updater.ai_blog_rewriter import write_blog_rewriter
from lib.ai_writers.ai_blog_faqs_writer.faqs_ui import main as faqs_generator
from lib.ai_writers.ai_blog_writer.ai_blog_generator import ai_blog_writer_page
from lib.ai_writers.ai_outline_writer.outline_ui import main as outline_generator
from lib.alwrity_ui.dashboard_styles import apply_dashboard_style, render_dashboard_header, render_category_header, render_card
from loguru import logger
# Try to import AI Content Performance Predictor (AI-first approach)
try:
from lib.content_performance_predictor.ai_performance_predictor import render_ai_predictor_ui as render_content_performance_predictor
AI_PREDICTOR_AVAILABLE = True
logger.info("AI Content Performance Predictor loaded successfully")
except ImportError:
logger.warning("AI Content Performance Predictor not available")
render_content_performance_predictor = None
AI_PREDICTOR_AVAILABLE = False
# Try to import Bootstrap AI Competitive Suite
try:
from lib.ai_competitive_suite.bootstrap_ai_suite import render_bootstrap_ai_suite
BOOTSTRAP_SUITE_AVAILABLE = True
logger.info("Bootstrap AI Competitive Suite loaded successfully")
except ImportError:
logger.warning("Bootstrap AI Competitive Suite not available")
render_bootstrap_ai_suite = None
BOOTSTRAP_SUITE_AVAILABLE = False
def list_ai_writers():
"""Return a list of available AI writers with their metadata (no UI rendering)."""
writers = []
# Add Content Performance Predictor if available
if render_content_performance_predictor:
# AI-first approach description
if AI_PREDICTOR_AVAILABLE:
description = "🎯 AI-powered content performance prediction with competitive intelligence - perfect for solo entrepreneurs"
name = "AI Content Performance Predictor"
else:
description = "Predict content success before publishing with AI-powered performance analysis"
name = "Content Performance Predictor"
writers.append({
"name": name,
"icon": "🎯",
"description": description,
"category": "⭐ Featured",
"function": render_content_performance_predictor,
"path": "performance_predictor",
"featured": True
})
# Add Bootstrap AI Competitive Suite if available
if render_bootstrap_ai_suite:
writers.append({
"name": "Bootstrap AI Competitive Suite",
"icon": "🚀",
"description": "🥷 Complete AI-powered competitive toolkit: content performance prediction + competitive intelligence for solo entrepreneurs",
"category": "⭐ Featured",
"function": render_bootstrap_ai_suite,
"path": "bootstrap_ai_suite",
"featured": True
})
# Add existing writers
writers.extend([
{
"name": "AI Blog Writer",
"icon": "📝",
"description": "Generate comprehensive blog posts from keywords, URLs, or uploaded content",
"category": "Content Creation",
"function": ai_blog_writer_page,
"path": "ai_blog_writer"
},
{
"name": "AI Blog Rewriter",
"icon": "🔄",
"description": "Rewrite and update existing blog content with improved quality and SEO optimization",
"category": "Content Creation",
"function": write_blog_rewriter,
"path": "blog_rewriter"
},
{
"name": "Story Writer",
"icon": "📚",
"description": "Create engaging stories and narratives with AI assistance",
"category": "Creative Writing",
"function": story_input_section,
"path": "story_writer"
},
{
"name": "Essay writer",
"icon": "✍️",
"description": "Generate well-structured essays on any topic",
"category": "Academic",
"function": essay_writer,
"path": "essay_writer"
},
{
"name": "Write News reports",
"icon": "📰",
"description": "Create professional news articles and reports",
"category": "Journalism",
"function": ai_news_writer,
"path": "news_writer"
},
{
"name": "Write Financial TA report",
"icon": "📊",
"description": "Generate technical analysis reports for financial markets",
"category": "Finance",
"function": ai_finance_ta_writer,
"path": "financial_writer"
},
{
"name": "AI Product Description Writer",
"icon": "🛍️",
"description": "Create compelling product descriptions that drive sales",
"category": "E-commerce",
"function": write_ai_prod_desc,
"path": "product_writer"
},
{
"name": "AI Copywriter",
"icon": "✒️",
"description": "Generate persuasive copy for marketing and advertising",
"category": "Marketing",
"function": copywriter_dashboard,
"path": "copywriter"
},
{
"name": "LinkedIn AI Writer",
"icon": "💼",
"description": "Create professional LinkedIn content that engages your network",
"category": "Professional",
"function": lambda: LinkedInAIWriter().run(),
"path": "linkedin_writer"
},
{
"name": "FAQ Generator",
"icon": "",
"description": "Generate comprehensive, well-researched FAQs from any content source with customizable options",
"category": "Content Creation",
"function": faqs_generator,
"path": "faqs_generator"
},
{
"name": "Blog Outline Generator",
"icon": "📋",
"description": "Create detailed blog outlines with AI-powered content generation and image integration",
"category": "Content Creation",
"function": outline_generator,
"path": "outline_generator"
}
])
return writers
def get_ai_writers():
"""Main function to display AI writers dashboard with premium glassmorphic design."""
logger.info("Starting AI Writers Dashboard")
# Apply common dashboard styling
apply_dashboard_style()
# Render dashboard header
render_dashboard_header(
"🤖 AI Content Writers",
"Choose from our collection of specialized AI writers, each designed for specific content types and industries. Create engaging, high-quality content with just a few clicks."
)
writers = list_ai_writers()
logger.info(f"Found {len(writers)} AI writers")
# Group writers by category for better organization
categories = {}
for writer in writers:
category = writer["category"]
if category not in categories:
categories[category] = []
categories[category].append(writer)
# Render writers by category with common cards
for category_name, category_writers in categories.items():
render_category_header(category_name)
# Create columns for this category
cols = st.columns(min(len(category_writers), 3))
for idx, writer in enumerate(category_writers):
with cols[idx % 3]:
# Use the common card renderer
if render_card(
icon=writer['icon'],
title=writer['name'],
description=writer['description'],
category=writer['category'],
key_suffix=f"{writer['path']}_{category_name}",
help_text=f"Launch {writer['name']} - {writer['description']}"
):
logger.info(f"Selected writer: {writer['name']} with path: {writer['path']}")
st.session_state.selected_writer = writer
st.query_params["writer"] = writer['path']
logger.info(f"Updated query params with writer: {writer['path']}")
st.rerun()
# Add spacing between categories
st.markdown('<div class="category-spacer"></div>', unsafe_allow_html=True)
logger.info("Finished rendering AI Writers Dashboard")
return writers
# Remove the old ai_writers function since it's now integrated into get_ai_writers

View File

@@ -0,0 +1,247 @@
#####################################################
#
# Alwrity, AI Long form writer - Writing_with_Prompt_Chaining
# and generative AI.
#
#####################################################
import os
import re
import time #iwish
import sys
import yaml
from pathlib import Path
from dotenv import load_dotenv
from configparser import ConfigParser
import streamlit as st
from pprint import pprint
from textwrap import dedent
from loguru import logger
logger.remove()
logger.add(sys.stdout,
colorize=True,
format="<level>{level}</level>|<green>{file}:{line}:{function}</green>| {message}"
)
from ..utils.read_main_config_params import read_return_config_section
from ..ai_web_researcher.gpt_online_researcher import do_metaphor_ai_research
from ..ai_web_researcher.gpt_online_researcher import do_google_serp_search, do_tavily_ai_search
from ..blog_metadata.get_blog_metadata import get_blog_metadata_longform
from ..blog_postprocessing.save_blog_to_file import save_blog_to_file
from ..gpt_providers.text_generation.main_text_generation import llm_text_gen
def generate_with_retry(prompt, system_prompt=None):
"""
Generates content from the model with retry handling for errors.
Parameters:
prompt (str): The prompt to generate content from.
system_prompt (str, optional): Custom system prompt to use instead of the default one.
Returns:
str: The generated content.
"""
try:
# FIXME: Need a progress bar here.
return llm_text_gen(prompt, system_prompt)
except Exception as e:
logger.error(f"Error generating content: {e}")
st.error(f"Error generating content: {e}")
return False
def long_form_generator(keywords, search_params=None, blog_params=None):
"""
Generate a long-form blog post based on the given keywords
Args:
keywords (str): Topic or keywords for the blog post
search_params (dict, optional): Search parameters for research
blog_params (dict, optional): Blog content characteristics
"""
# Initialize default parameters if not provided
if blog_params is None:
blog_params = {
"blog_length": 3000, # Default longer for long-form content
"blog_tone": "Professional",
"blog_demographic": "Professional",
"blog_type": "Informational",
"blog_language": "English"
}
else:
# Ensure we have a higher word count for long-form content
if blog_params.get("blog_length", 0) < 2500:
blog_params["blog_length"] = max(3000, blog_params.get("blog_length", 0))
# Extract parameters with defaults
blog_length = blog_params.get("blog_length", 3000)
blog_tone = blog_params.get("blog_tone", "Professional")
blog_demographic = blog_params.get("blog_demographic", "Professional")
blog_type = blog_params.get("blog_type", "Informational")
blog_language = blog_params.get("blog_language", "English")
st.subheader(f"Long-form {blog_type} Blog ({blog_length}+ words)")
with st.status("Generating comprehensive long-form content...", expanded=True) as status:
# Step 1: Generate outline
status.update(label="Creating detailed content outline...")
# Use a customized prompt based on the blog parameters
outline_prompt = f"""
As an expert content strategist writing in a {blog_tone} tone for {blog_demographic} audience,
create a detailed outline for a comprehensive {blog_type} blog post about "{keywords}"
that will be approximately {blog_length} words in {blog_language}.
The outline should include:
1. An engaging headline
2. 5-7 main sections with descriptive headings
3. 2-3 subsections under each main section
4. Key points to cover in each section
5. Ideas for relevant examples or case studies
6. Suggestions for data points or statistics to include
Format the outline in markdown with proper headings and bullet points.
"""
try:
outline = llm_text_gen(outline_prompt)
st.markdown("### Content Outline")
st.markdown(outline)
status.update(label="Outline created successfully ✓")
# Step 2: Research the topic using the search parameters
status.update(label="Researching topic details...")
research_results = research_topic(keywords, search_params)
status.update(label="Research completed ✓")
# Step 3: Generate the full content
status.update(label=f"Writing {blog_length}+ word {blog_tone} {blog_type} content...")
full_content_prompt = f"""
You are a professional content writer who specializes in {blog_type} content with a {blog_tone} tone
for {blog_demographic} audiences. Write a comprehensive, in-depth blog post in {blog_language} about:
"{keywords}"
Use this outline as your structure:
{outline}
And incorporate these research findings where relevant:
{research_results}
The blog post should:
- Be approximately {blog_length} words
- Include an engaging introduction and strong conclusion
- Use appropriate subheadings for all sections in the outline
- Include examples, data points, and actionable insights
- Be formatted in markdown with proper headings, bullet points, and emphasis
- Maintain a {blog_tone} tone throughout
- Address the needs and interests of a {blog_demographic} audience
Do not include phrases like "according to research" or "based on the outline" in your content.
"""
full_content = llm_text_gen(full_content_prompt)
status.update(label="Long-form content generated successfully! ✓", state="complete")
# Display the full content
st.markdown("### Your Complete Long-form Blog Post")
st.markdown(full_content)
return full_content
except Exception as e:
status.update(label=f"Error generating long-form content: {str(e)}", state="error")
st.error(f"Failed to generate long-form content: {str(e)}")
return None
def research_topic(keywords, search_params=None):
"""
Research a topic using search parameters and return a summary
Args:
keywords (str): Topic to research
search_params (dict, optional): Search parameters
Returns:
str: Research summary
"""
# Display a placeholder for research results
placeholder = st.empty()
placeholder.info("Researching topic... Please wait.")
try:
from .ai_blog_writer.keywords_to_blog_streamlit import do_tavily_ai_search
# Use provided search params or defaults
if search_params is None:
search_params = {
"max_results": 10,
"search_depth": "advanced",
"time_range": "year"
}
# Conduct research using Tavily
tavily_results = do_tavily_ai_search(
keywords,
max_results=search_params.get("max_results", 10),
search_depth=search_params.get("search_depth", "advanced"),
include_domains=search_params.get("include_domains", []),
time_range=search_params.get("time_range", "year")
)
# Extract research data
research_data = ""
if tavily_results and len(tavily_results) == 3:
results, titles, answer = tavily_results
if answer and len(answer) > 50:
research_data += f"Summary: {answer}\n\n"
if results and 'results' in results and len(results['results']) > 0:
research_data += "Key Sources:\n"
for i, result in enumerate(results['results'][:7], 1):
title = result.get('title', 'Untitled Source')
content_snippet = result.get('content', '')[:300] + "..."
research_data += f"{i}. {title}\n{content_snippet}\n\n"
# If research data is empty or too short, provide a generic response
if not research_data or len(research_data) < 100:
research_data = f"No specific research data found for '{keywords}'. Please provide more specific information in your content."
placeholder.success("Research completed successfully!")
return research_data
except Exception as e:
placeholder.error(f"Research failed: {str(e)}")
return f"Unable to gather research for '{keywords}'. Please continue with the content based on your knowledge."
finally:
# Remove the placeholder after a short delay
import time
time.sleep(1)
placeholder.empty()
def generate_long_form_content(content_keywords):
"""
Main function to generate long-form content based on the provided keywords.
Parameters:
content_keywords (str): The main keywords or topic for the long-form content.
Returns:
str: The generated long-form content.
"""
return long_form_generator(content_keywords)
# Example usage
if __name__ == "__main__":
# Example usage of the function
content_keywords = "artificial intelligence in healthcare"
generated_content = generate_long_form_content(content_keywords)
print(f"Generated content: {generated_content[:100]}...")

View File

@@ -78,7 +78,7 @@ def blog_arxiv_url_list(file_path):
""" Write blogs on all the arxiv links given in a file. """
extracted_ids = []
try:
with open(file_path, 'r') as file:
with open(file_path, 'r', encoding="utf-8") as file:
for line in file:
arxiv_id = extract_arxiv_ids_from_line(line)
if arxiv_id:
@@ -155,7 +155,7 @@ def blog_postprocessing(arxiv_id, research_review):
raise err
try:
blog_title, blog_meta_desc, blog_tags, blog_categories = blog_metadata(research_review, "gemini")
blog_title, blog_meta_desc, blog_tags, blog_categories = blog_metadata(research_review)
except Exception as err:
logger.error(f"Failed to get blog metadata: {err}")
raise err

View File

@@ -11,7 +11,7 @@ logger.add(sys.stdout,
)
def write_blog_from_paper(paper_content, gpt_providers="openai"):
def write_blog_from_paper(paper_content):
""" Write blog from given paper url. """
prompt = f"""As an expert in NLP and AI, I will provide you with a content of a research paper.
Your task is to write a highly detailed blog(at least 2000 words), breaking down complex concepts for beginners.

View File

@@ -12,7 +12,7 @@ logger.add(sys.stdout,
)
def review_research_paper(research_blog, gpt_providers="openai"):
def review_research_paper(research_blog):
""" """
prompt = f"""As world's top researcher and academician, I will provide you with research paper.
Your task is to write a highly detailed review report.

View File

@@ -1,373 +0,0 @@
import os
from pathlib import Path
import typer
from prompt_toolkit.shortcuts import checkboxlist_dialog, message_dialog, input_dialog
from prompt_toolkit import prompt
from prompt_toolkit.styles import Style
from prompt_toolkit.shortcuts import radiolist_dialog
from dotenv import load_dotenv
import requests
from rich import print
from rich.text import Text
load_dotenv(Path('.env'))
app = typer.Typer()
from lib.ai_web_researcher.gpt_online_researcher import gpt_web_researcher
from lib.ai_web_researcher.metaphor_basic_neural_web_search import metaphor_find_similar
from lib.ai_writers.keywords_to_blog import write_blog_from_keywords
def prompt_for_time_range():
os.system("clear" if os.name == "posix" else "cls")
print("\n🙋 If you're researching keywords that are recent, use accordingly. Default is Anytime.\n")
choices = [("anytime", "Anytime"), ("past year", "Past Year"), ("past month", "Past Month"),
("past week", "Past Week"), ("past day", "Past Day")]
selected_time_range = radiolist_dialog(title="Select Search result time range:", values=choices).run()
return selected_time_range[0] if selected_time_range else None
def write_blog_options():
choices = [
("Keywords", "Keywords"),
("Audio YouTube", "Audio YouTube"),
("Programming", "Programming"),
("Scholar", "Scholar"),
("News/TBD", "News/TBD"),
("Finance/TBD", "Finance/TBD"),
("Quit", "Quit")
]
selected_blog_type = radiolist_dialog(title="Choose a blog type:", values=choices).run()
return selected_blog_type if selected_blog_type else None
@app.command()
def start_interactive_mode():
os.system("clear" if os.name == "posix" else "cls")
text = "_______________________________________________________________________\n"
text += "\n⚠️ Alert! 💥❓💥\n"
text += "If you know what to write, choose 'Write Blog'\n"
text += "If unsure, let's 'do web research' to write on\n"
text += "If Testing-it-out/getting-started, choose 'Blog Tools\n"
text += "_______________________________________________________________________\n"
print(text)
choices = [
("Write Blog", "Write Blog"),
("Do keyword Research", "Do keyword Research"),
("Create Blog Images", "Create Blog Images"),
("Competitor Analysis", "Competitor Analysis"),
("Blog Tools", "Blog Tools"),
("Social Media", "Social Media"),
("Quit", "Quit")
]
mode = radiolist_dialog(title="Choose an option:", values=choices).run()
if mode:
if mode == 'Write Blog':
write_blog()
elif mode == 'Do keyword Research':
do_web_research()
elif mode == 'Create Blog Images':
faq_generator()
elif mode == 'Competitor Analysis':
competitor_analysis()
elif mode == 'Blog Tools':
blog_tools()
elif mode == 'Social Media':
print("""
#whatsapp
#instagram
#youtube
#twitter/X
#Linked-in posts
""")
raise typer.Exit()
elif mode == 'Quit':
typer.echo("Exiting, Getting Lost!")
raise typer.Exit()
def check_search_apis():
"""
Check if necessary environment variables are present.
Display messages with links on how to get them if not present.
"""
# Use rich.print for styling and hyperlinking
print("\n\n🙋♂️ 🙋♂️ Before doing web research, ensure the following API keys are available:")
print("Blogen uses Basic, Semantic, Neural web search using above APIs for contextual blog generation.\n")
api_keys = {
"METAPHOR_API_KEY": "Metaphor AI Key (Get it here: [link=https://dashboard.exa.ai/login]Metaphor API[/link])",
"TAVILY_API_KEY": "Tavily AI Key (Get it here: [link=https://tavily.com/#api]Tavily API[/link])",
"SERPER_API_KEY": "Serper API Key (Get it here: [link=https://serper.dev/signup]SerperDev API[/link])",
}
missing_keys = []
with typer.progressbar(api_keys.items(), label="Checking API keys", length=len(api_keys)) as progress:
for key, description in progress:
if os.getenv(key) is None:
# Use rich.print for styling and hyperlinking
print(f"[bold red]✖ 🚫 {key} is missing:[/bold red] [blue underline]Get {key} API Key[/blue underline]")
typer.echo(f"[bold red]✖ 🚫 {key} is missing:[/bold red] [link={key}]Get {key} API Key[/link]")
missing_keys.append((key, description))
if missing_keys:
print("\nMost are Free APIs and really worth your while signing up for them.")
print("💩💩💩: GO GET THEM, on above urls. [bold red]")
#print("Note: They offer free/limited api calls, so we use most of them to have a lot of free api calls.")
for key, description in missing_keys:
get_api_key(key, description)
else:
return True
def get_api_key(api_key: str, api_description: str):
"""
Ask the user to input the missing API key and add it to the .env file.
Args:
api_key (str): The name of the API key variable.
api_description (str): The description of the API key.
"""
user_input = typer.prompt(f"\n🙆🙆Please enter {api_key} API Key:")
with open(".env", "a") as env_file:
env_file.write(f"{api_key}={user_input}\n")
print(f"{api_description} API Key added to .env file.")
def faq_generator():
return
def blog_tools():
os.system("clear" if os.name == "posix" else "cls")
text = "_______________________________________________________________________\n"
text += "\n⚠️ Alert! 💥❓💥\n"
text += "Collection of Helpful Blogging Tools, powered by LLMs.\n"
text += "_______________________________________________________________________\n"
print(text)
choices = [
("Write Blog Title", "Write Blog Title"),
("Write Blog Meta Description", "Write Blog Meta Description"),
("Write Blog Introduction", "Write Blog Introduction"),
("Write Blog conclusion", "Write Blog conclusion"),
("Write Blog Outline", "Write Blog Outline"),
("Generate Blog FAQs", "Generate Blog FAQs"),
("Research blog references", "Research blog references"),
("Convert Blog To HTML", "Convert Blog To HTML"),
("Convert Blog To Markdown", "Convert Blog To Markdown"),
("Blog Proof Reader", "Blog Proof Reader"),
("Get Blog Tags", "Get Blog Tags"),
("Get blog categories", "Get blog categories"),
("Get Blog Code Examples", "Get Blog Code Examples"),
("Check WebPage Performance", "Check WebPage Performance"),
("Quit/Exit", "Quit/Exit")
]
selected_tool = radiolist_dialog(title="Choose a Blogging Tool:", values=choices).run()
if selected_tool:
tool = selected_tool[0]
if tool == 'Write Blog Title':
return
def competitor_analysis():
text = "_______________________________________________________________________\n"
text += "\n⚠️ Alert! 💥❓💥\n"
text += "Provide competitor's URL, get details of similar/alternative companies.\n"
text += "Usecases: Know similar companies and alternatives, to given URL\n"
text += "_______________________________________________________________________\n"
print(text)
similar_url = prompt("Enter Valid URL to get web analysis")
try:
metaphor_find_similar(similar_url)
except Exception as err:
print(f"[bold red]✖ 🚫 Failed to do similar search.\nError:{err}[/bold red]")
return
def write_blog():
blog_type = write_blog_options()
if blog_type:
if blog_type == 'Keywords':
blog_from_keyword()
elif blog_type == 'Audio YouTube':
audio_youtube = prompt("Enter YouTube URL for audio blog generation:")
print(f"Write audio blog based on YouTube URL: {audio_youtube}")
elif blog_type == 'GitHub':
github = prompt("Enter GitHub URL, CSV file, or topic:")
print(f"Write blog based on GitHub: {github}")
elif blog_type == 'Scholar':
scholar = prompt("Enter research papers keywords:")
print(f"Write blog based on scholar: {scholar}")
elif blog_type == 'Quit':
typer.echo("Exiting, Getting Lost..")
raise typer.Exit()
def blog_from_keyword():
""" Input blog keywords, research and write a factual blog."""
while True:
print("________________________________________________________________")
blog_keywords = input_dialog(
title='Enter Keywords/Blog Title',
text='Shit in, Shit Out; Better keywords, better research, hence better content.\n👋 Enter keywords/Blog Title for blog generation:',
).run()
# If the user cancels, exit the loop
if blog_keywords is None:
break
if blog_keywords and len(blog_keywords.split()) >= 2:
break
else:
message_dialog(
title='Warning',
text='🚫 Blog keywords should be at least two words long. Please try again.'
).run()
if blog_keywords:
try:
write_blog_from_keywords(blog_keywords)
except Exception as err:
print(f"Failed to write blog on {blog_keywords}, Error: {err}\n")
exit(1)
def do_web_research():
""" Input keywords and do web research and present a report."""
if check_search_apis():
while True:
print("________________________________________________________________")
search_keywords = input_dialog(
title='Enter Search Keywords below:',
text='👋 Enter keywords for web research (Or keywords from your blog):',
).run()
if search_keywords and len(search_keywords.split()) >= 2:
break
else:
message_dialog(
title='Warning',
text='🚫 Search keywords should be at least three words long. Please try again.'
).run()
selected_time_range = prompt_for_time_range()
# Display input dialog for similar search URL (optional)
similar_url = input_dialog(
title="Enter a similar search URL",
text="👋 Enter a similar search URL (Optional: Enter to skip):\n🙋Usecases: Competitor Analysis Tool. 📡Discover similar companies, startups and technologies.",
default="",
).run()
# Display input dialog for included URLs (optional)
include_urls = input_dialog(
title="Enter URLs to include in the web search:",
text="👋 Enter comma-separated URLs to include in web research (press Enter to skip):\n🙋 If you wish to [bold]confine search[/bold] to certain domains like wikipedia etc.",
default="",
).run()
try:
print(f"🚀🎬🚀 [bold green]Starting web research on given keywords: {search_keywords}..")
#print(f"Web Research: Time Range - {time_range}, Search Keywords - {search_keywords}, Include URLs - {include_urls}")
web_research_result = gpt_web_researcher(search_keywords,
time_range=selected_time_range,
include_domains=include_urls,
similar_url=similar_url)
except Exception as err:
print(f"\n💥🤯 [bold red]ERROR 🤯 : Failed to do web research: {err}\n")
def check_llm_environs():
""" Function to check which LLM api is given. """
# Check if GPT_PROVIDER is defined in .env file
gpt_provider = os.getenv("GPT_PROVIDER")
# Load .env file
load_dotenv()
# Disable unsupported GPT providers
supported_providers = ['google', 'openai', 'mistralai']
if gpt_provider is None or gpt_provider.lower() not in supported_providers:
#message_dialog(
# title="Unsupported GPT Provider",
# text="GPT_PROVIDER is not set or has an unsupported value."
#).run()
# Prompt user to select a provider
selected_provider = radiolist_dialog(
title='Select your preferred GPT provider:',
text="Please choose GPT provider Below:\n👺Google Gemini recommended, its 🆓.",
values=[
("Google", "google"),
("Openai", "openai"),
("MistralAI/WIP", "mistralai/WIP"),
("Ollama", "Ollama (TBD)")
]
).run()
if selected_provider:
gpt_provider = selected_provider
if gpt_provider.lower() == "google":
api_key_var = "GEMINI_API_KEY"
missing_api_msg = f"To get your {api_key_var}, please visit: https://aistudio.google.com/app/apikey"
elif gpt_provider.lower() == "openai":
api_key_var = "OPENAI_API_KEY"
missing_api_msg = "To get your OpenAI API key, please visit: https://openai.com/blog/openai-api"
elif gpt_provider.lower() == "mistralai":
api_key_var = "MISTRAL_API_KEY"
missing_api_msg = "To get your MistralAI API key, please visit: https://mistralai.com/api"
if os.getenv(api_key_var) is None:
# Ask for the API key
print(f"🚫The {api_key_var} is missing. {missing_api_msg}")
api_key = typer.prompt(f"\n🙆🙆Please enter {api_key_var} API Key:")
# Update .env file
with open(".env", "a") as env_file:
env_file.write(f"GPT_PROVIDER={gpt_provider.lower()}\n")
env_file.write(f"{api_key_var}={api_key}\n")
def check_internet():
try:
response = requests.get("http://www.google.com", timeout=20)
if not response.status_code == 200:
print("💥🤯 WTFish, Internet is NOT available. Enjoy the wilderness..")
exit(1)
else:
return
except requests.ConnectionError:
print("💥🤯 WTFish: Internet is NOT available. Enjoy the wilderness..")
exit(1)
except requests.Timeout:
print("Request timed out. Internet might be slow.")
exit(1)
except Exception as e:
print("Internet: An error occurred:", e)
exit(1)
def create_env_file():
env_file = Path('.env')
if not env_file.is_file():
try:
with open('.env', 'w') as f:
f.write('# Alwrity will add your environment variables here\n')
except Exception as e:
print(f"💥🤯Error occurred while creating .env file: {e}")
if __name__ == "__main__":
print("Checking Internet, lets get the basics right.")
check_internet()
print("Create .env file, if not Present working directory")
create_env_file()
print("Check Metaphor, Tavily, YOU.com Search API keys.")
check_search_apis()
print("Check LLM details & AI Model to use.")
check_llm_environs()
load_dotenv(Path('.env'))
app()

353
backend/README.md Normal file
View File

@@ -0,0 +1,353 @@
# ALwrity Backend
Welcome to the ALwrity Backend! This is the FastAPI-powered backend that provides RESTful APIs for the ALwrity AI content creation platform.
## 🚀 Quick Start
### Prerequisites
- Python 3.8+ installed
- pip (Python package manager)
### 1. Install Dependencies
```bash
cd backend
pip install -r requirements.txt
```
### 2. Start the Backend Server
```bash
python start_alwrity_backend.py
```
### 3. Verify It's Working
- Open your browser to: http://localhost:8000/api/docs
- You should see the interactive API documentation
- Health check: http://localhost:8000/health
## 📁 Project Structure
```
backend/
├── app.py # FastAPI application definition
├── start_alwrity_backend.py # Server startup script
├── requirements.txt # Python dependencies
├── api/
│ ├── __init__.py
│ └── onboarding.py # Onboarding API endpoints
├── services/
│ ├── __init__.py
│ ├── api_key_manager.py # API key management
│ └── validation.py # Validation services
├── models/
│ ├── __init__.py
│ └── onboarding.py # Data models
└── README.md # This file
```
## 🔧 File Descriptions
### Core Files
#### `app.py` - FastAPI Application
- **What it does**: Defines all API endpoints and middleware
- **Contains**:
- FastAPI app initialization
- All API routes (onboarding, health, etc.)
- CORS middleware for frontend integration
- Static file serving for React frontend
- **When to edit**: When adding new API endpoints or modifying existing ones
#### `start_alwrity_backend.py` - Server Startup
- **What it does**: Enhanced startup script with dependency checking
- **Contains**:
- Dependency validation
- Environment setup (creates directories)
- User-friendly logging and error messages
- Server startup with uvicorn
- **When to use**: This is your main entry point to start the server
### Supporting Directories
#### `api/` - API Endpoints
- Contains modular API endpoint definitions
- Organized by feature (onboarding, etc.)
- Each file handles a specific domain of functionality
#### `services/` - Business Logic
- Contains service layer functions
- Handles database operations, API key management, etc.
- Separates business logic from API endpoints
#### `models/` - Data Models
- Contains Pydantic models and database schemas
- Defines data structures for API requests/responses
- Ensures type safety and validation
## 🎯 How to Start the Backend
### Option 1: Recommended (Using the startup script)
```bash
cd backend
python start_alwrity_backend.py
```
### Option 2: Direct uvicorn (For development)
```bash
cd backend
uvicorn app:app --reload --host 0.0.0.0 --port 8000
```
### Option 3: Production mode
```bash
cd backend
uvicorn app:app --host 0.0.0.0 --port 8000
```
## 🌐 What You'll See
When you start the backend successfully, you'll see:
```
🎯 ALwrity Backend Server
========================================
✅ All dependencies are installed
🔧 Setting up environment...
✅ Created directory: lib/workspace/alwrity_content
✅ Created directory: lib/workspace/alwrity_web_research
✅ Created directory: lib/workspace/alwrity_prompts
✅ Created directory: lib/workspace/alwrity_config
No .env file found. API keys will need to be configured.
✅ Environment setup complete
🚀 Starting ALwrity Backend...
📍 Host: 0.0.0.0
🔌 Port: 8000
🔄 Reload: true
🌐 Backend is starting...
📖 API Documentation: http://localhost:8000/api/docs
🔍 Health Check: http://localhost:8000/health
📊 ReDoc: http://localhost:8000/api/redoc
⏹️ Press Ctrl+C to stop the server
============================================================
```
## 📚 API Documentation
Once the server is running, you can access:
- **📖 Interactive API Docs (Swagger)**: http://localhost:8000/api/docs
- **📊 ReDoc Documentation**: http://localhost:8000/api/redoc
- **🔍 Health Check**: http://localhost:8000/health
## 🔑 Available Endpoints
### Health & Status
- `GET /health` - Health check endpoint
### Onboarding System
- `GET /api/onboarding/status` - Get current onboarding status
- `GET /api/onboarding/progress` - Get full progress data
- `GET /api/onboarding/config` - Get onboarding configuration
### Step Management
- `GET /api/onboarding/step/{step_number}` - Get step data
- `POST /api/onboarding/step/{step_number}/complete` - Complete a step
- `POST /api/onboarding/step/{step_number}/skip` - Skip a step
- `GET /api/onboarding/step/{step_number}/validate` - Validate step access
### API Key Management
- `GET /api/onboarding/api-keys` - Get configured API keys
- `POST /api/onboarding/api-keys` - Save an API key
- `POST /api/onboarding/api-keys/validate` - Validate API keys
### Onboarding Control
- `POST /api/onboarding/start` - Start onboarding
- `POST /api/onboarding/complete` - Complete onboarding
- `POST /api/onboarding/reset` - Reset progress
- `GET /api/onboarding/resume` - Get resume information
## 🧪 Testing the Backend
### Quick Test with curl
```bash
# Health check
curl http://localhost:8000/health
# Get onboarding status
curl http://localhost:8000/api/onboarding/status
# Complete step 1
curl -X POST http://localhost:8000/api/onboarding/step/1/complete \
-H "Content-Type: application/json" \
-d '{"data": {"api_keys": ["openai"]}}'
```
### Using the Swagger UI
1. Open http://localhost:8000/api/docs
2. Click on any endpoint
3. Click "Try it out"
4. Fill in the parameters
5. Click "Execute"
## ⚙️ Configuration
### Environment Variables
You can customize the server behavior with these environment variables:
- `HOST`: Server host (default: 0.0.0.0)
- `PORT`: Server port (default: 8000)
- `RELOAD`: Enable auto-reload (default: true)
Subscription billing (Stripe) variables used in deployment:
- `STRIPE_SECRET_KEY`: Stripe API secret key (`sk_test_...` for test, `sk_live_...` for live).
- `STRIPE_WEBHOOK_SECRET`: Stripe webhook signing secret for `/api/subscription/webhook`.
- `STRIPE_MODE`: Stripe mode selector (`test` or `live`). Recommended to set explicitly in each environment.
- `STRIPE_PLAN_PRICE_MAPPING_TEST`: JSON mapping for test mode price IDs.
- `STRIPE_PLAN_PRICE_MAPPING_LIVE`: JSON mapping for live mode price IDs.
- `STRIPE_PLAN_PRICE_MAPPING`: Optional fallback JSON mapping used when mode-specific variable is not provided.
Required mapping keys validated at startup:
- `basic.monthly`
- `pro.monthly`
Example mapping value:
```json
{"basic":{"monthly":"price_123"},"pro":{"monthly":"price_456"}}
```
Example:
```bash
HOST=127.0.0.1 PORT=8080 python start_alwrity_backend.py
```
### CORS Configuration
The backend is configured to allow requests from:
- `http://localhost:3000` (React dev server)
- `http://localhost:8000` (Backend dev server)
- `http://localhost:3001` (Alternative React port)
## 🔄 Development Workflow
### 1. Start Development Server
```bash
cd backend
python start_alwrity_backend.py
```
### 2. Make Changes
- Edit `app.py` for API changes
- Edit files in `api/` for endpoint modifications
- Edit files in `services/` for business logic changes
### 3. Auto-reload
The server automatically reloads when you save changes to Python files.
### 4. Test Changes
- Use the Swagger UI at http://localhost:8000/api/docs
- Or use curl commands for quick testing
## 🐛 Troubleshooting
### Common Issues
#### 1. "Module not found" errors
```bash
# Make sure you're in the backend directory
cd backend
# Install dependencies
pip install -r requirements.txt
```
#### 2. "Port already in use" error
```bash
# Use a different port
PORT=8080 python start_alwrity_backend.py
```
#### 3. "Permission denied" errors
```bash
# On Windows, run PowerShell as Administrator
# On Linux/Mac, check file permissions
ls -la
```
#### 4. CORS errors from frontend
- Make sure the frontend is running on http://localhost:3000
- Check that CORS is properly configured in `app.py`
### Getting Help
1. **Check the logs**: The startup script provides detailed information
2. **API Documentation**: Use http://localhost:8000/api/docs to test endpoints
3. **Health Check**: Visit http://localhost:8000/health to verify the server is running
## 🚀 Production Deployment
### Using Docker
```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```
### Using Gunicorn (Recommended for production)
```bash
# Install gunicorn
pip install gunicorn
# Run with multiple workers
gunicorn app:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
```
## 🔗 Integration with Frontend
This backend is designed to work seamlessly with the React frontend:
1. **API Client**: Frontend uses axios to communicate with these endpoints
2. **Real-time Updates**: Frontend polls status endpoints for live updates
3. **Error Handling**: Comprehensive error responses for frontend handling
4. **CORS**: Configured for cross-origin requests from React
## 📈 Features
- **✅ Onboarding Progress Tracking**: Complete 6-step onboarding flow with persistence
- **🔑 API Key Management**: Secure storage and validation of AI provider API keys
- **🔄 Resume Functionality**: Users can resume onboarding from where they left off
- **✅ Validation**: Comprehensive validation for API keys and step completion
- **🌐 CORS Support**: Configured for React frontend integration
- **📚 Auto-generated Documentation**: Swagger UI and ReDoc
- **🔍 Health Monitoring**: Built-in health check endpoint
## 🤝 Contributing
When adding new features:
1. **Add API endpoints** in `api/` directory
2. **Add business logic** in `services/` directory
3. **Add data models** in `models/` directory
4. **Update this README** with new information
5. **Test thoroughly** using the Swagger UI
## 📞 Support
If you encounter issues:
1. Check the console output for error messages
2. Verify all dependencies are installed
3. Test individual endpoints using the Swagger UI
4. Check the health endpoint: http://localhost:8000/health
---
**Happy coding! 🎉**

1
backend/__init__.py Normal file
View File

@@ -0,0 +1 @@
# Backend package for Alwrity API

View File

@@ -0,0 +1,26 @@
"""
ALwrity Utilities Package
Modular utilities for ALwrity backend startup and configuration.
"""
from .dependency_manager import DependencyManager
from .environment_setup import EnvironmentSetup
from .database_setup import DatabaseSetup
from .production_optimizer import ProductionOptimizer
from .health_checker import HealthChecker
from .rate_limiter import RateLimiter
from .frontend_serving import FrontendServing
from .router_manager import RouterManager
from .onboarding_manager import OnboardingManager
__all__ = [
'DependencyManager',
'EnvironmentSetup',
'DatabaseSetup',
'ProductionOptimizer',
'HealthChecker',
'RateLimiter',
'FrontendServing',
'RouterManager',
'OnboardingManager'
]

View File

@@ -0,0 +1,237 @@
"""
Database Setup Module
Handles database initialization and table creation.
"""
from typing import List, Tuple
import sys
from pathlib import Path
from loguru import logger
class DatabaseSetup:
"""Manages database setup for ALwrity backend."""
def __init__(self, production_mode: bool = False):
self.production_mode = production_mode
def setup_essential_tables(self) -> bool:
"""Set up essential database tables."""
import os
verbose = os.getenv("ALWRITY_VERBOSE", "false").lower() == "true"
if verbose:
print("📊 Setting up essential database tables...")
try:
from services.database import init_database, engine
# Initialize database connection
init_database()
if verbose:
print(" ✅ Database connection initialized")
# Create essential tables
self._create_monitoring_tables()
self._create_subscription_tables()
self._create_persona_tables()
self._create_onboarding_tables()
self._create_daily_workflow_tables()
if verbose:
print("✅ Essential database tables created")
return True
except Exception as e:
if verbose:
print(f"⚠️ Warning: Database setup failed: {e}")
if self.production_mode:
print(" Continuing in production mode...")
else:
print(" This may affect functionality")
return True # Don't fail startup for database issues
def _create_monitoring_tables(self) -> bool:
"""Create API monitoring tables."""
import os
verbose = os.getenv("ALWRITY_VERBOSE", "false").lower() == "true"
try:
from models.api_monitoring import Base as MonitoringBase
MonitoringBase.metadata.create_all(bind=engine)
if verbose:
print(" ✅ Monitoring tables created")
return True
except Exception as e:
if verbose:
print(f" ⚠️ Monitoring tables failed: {e}")
return True # Non-critical
def _create_subscription_tables(self) -> bool:
"""Create subscription and billing tables."""
import os
verbose = os.getenv("ALWRITY_VERBOSE", "false").lower() == "true"
try:
from models.subscription_models import Base as SubscriptionBase
SubscriptionBase.metadata.create_all(bind=engine)
if verbose:
print(" ✅ Subscription tables created")
return True
except Exception as e:
if verbose:
print(f" ⚠️ Subscription tables failed: {e}")
return True # Non-critical
def _create_persona_tables(self) -> bool:
"""Create persona analysis tables."""
import os
verbose = os.getenv("ALWRITY_VERBOSE", "false").lower() == "true"
try:
from models.persona_models import Base as PersonaBase
PersonaBase.metadata.create_all(bind=engine)
if verbose:
print(" ✅ Persona tables created")
return True
except Exception as e:
if verbose:
print(f" ⚠️ Persona tables failed: {e}")
return True # Non-critical
def _create_onboarding_tables(self) -> bool:
"""Create onboarding tables."""
import os
verbose = os.getenv("ALWRITY_VERBOSE", "false").lower() == "true"
try:
from models.onboarding import Base as OnboardingBase
OnboardingBase.metadata.create_all(bind=engine)
if verbose:
print(" ✅ Onboarding tables created")
return True
except Exception as e:
if verbose:
print(f" ⚠️ Onboarding tables failed: {e}")
return True # Non-critical
def _create_daily_workflow_tables(self) -> bool:
"""Create daily workflow tables."""
import os
verbose = os.getenv("ALWRITY_VERBOSE", "false").lower() == "true"
try:
from models.enhanced_strategy_models import Base as StrategyBase
StrategyBase.metadata.create_all(bind=engine)
if verbose:
print(" ✅ Daily workflow tables created")
return True
except Exception as e:
if verbose:
print(f" ⚠️ Daily workflow tables failed: {e}")
return True # Non-critical
def verify_tables(self) -> bool:
"""Verify that essential tables exist."""
import os
verbose = os.getenv("ALWRITY_VERBOSE", "false").lower() == "true"
if self.production_mode:
if verbose:
print("⚠️ Skipping table verification in production mode")
return True
if verbose:
print("🔍 Verifying database tables...")
try:
from services.database import engine
from sqlalchemy import inspect
if engine is None:
if verbose:
print(" ⚠️ Global engine is None (Multi-tenant mode), skipping global table verification")
return True
inspector = inspect(engine)
tables = inspector.get_table_names()
essential_tables = [
'api_monitoring_logs',
'subscription_plans',
'user_subscriptions',
'onboarding_sessions',
'persona_data'
]
existing_tables = [table for table in essential_tables if table in tables]
if verbose:
print(f" ✅ Found tables: {existing_tables}")
if len(existing_tables) < len(essential_tables):
missing = [table for table in essential_tables if table not in existing_tables]
if verbose:
print(f" ⚠️ Missing tables: {missing}")
return True
except Exception as e:
print(f" ⚠️ Table verification failed: {e}")
return True # Non-critical
def setup_advanced_tables(self) -> bool:
"""Set up advanced tables (non-critical)."""
if self.production_mode:
print("⚠️ Skipping advanced table setup in production mode")
return True
print("🔧 Setting up advanced database features...")
try:
# Set up monitoring tables
self._setup_monitoring_tables()
# Set up billing tables
self._setup_billing_tables()
logger.debug("✅ Advanced database features configured")
return True
except Exception as e:
logger.warning(f"Advanced table setup failed: {e}")
return True # Non-critical
def _setup_monitoring_tables(self) -> bool:
"""Set up API monitoring tables."""
# Reuse the existing method that uses SQLAlchemy metadata
# This avoids the script dependency that requires user_id
return self._create_monitoring_tables()
def _setup_billing_tables(self) -> bool:
"""Set up billing and subscription tables."""
try:
sys.path.append(str(Path(__file__).parent.parent))
from scripts.create_billing_tables import create_billing_tables, check_existing_tables
from services.database import engine
# Check if engine is available (it might be None in multi-tenant mode)
if engine is None:
# In multi-tenant mode, we can't setup global billing tables
# They will be created per-user when they are initialized
return True
# Check if tables already exist
if check_existing_tables(engine):
logger.debug("✅ Billing tables already exist")
return True
# For global setup, we can't call create_billing_tables() without user_id
# But if engine is not None, it implies we have a global DB.
# However, the script is designed for user_id.
# We'll skip this call to avoid the TypeError and rely on per-user init.
logger.debug(" Skipping global billing table creation (handled per-user)")
return True
except Exception as e:
logger.warning(f"Billing setup failed: {e}")
return True # Non-critical

View File

@@ -0,0 +1,183 @@
"""
Dependency Management Module
Handles installation and verification of Python dependencies.
"""
import sys
import subprocess
from pathlib import Path
from typing import List, Tuple
class DependencyManager:
"""Manages Python package dependencies for ALwrity backend."""
def __init__(self, requirements_file: str = "requirements.txt"):
self.requirements_file = Path(requirements_file)
self.critical_packages = [
'fastapi',
'uvicorn',
'pydantic',
'sqlalchemy',
'loguru'
]
self.optional_packages = [
'openai',
'google.generativeai',
'anthropic',
'mistralai',
'spacy',
'nltk'
]
def install_requirements(self) -> bool:
"""Install packages from requirements.txt."""
print("📦 Installing required packages...")
if not self.requirements_file.exists():
print(f"❌ Requirements file not found: {self.requirements_file}")
return False
try:
subprocess.check_call([
sys.executable, "-m", "pip", "install", "-r", str(self.requirements_file)
])
print("✅ All packages installed successfully!")
return True
except subprocess.CalledProcessError as e:
print(f"❌ Error installing packages: {e}")
return False
def check_critical_dependencies(self) -> Tuple[bool, List[str]]:
"""Check if critical dependencies are available."""
import os
verbose = os.getenv("ALWRITY_VERBOSE", "false").lower() == "true"
if verbose:
print("🔍 Checking critical dependencies...")
missing_packages = []
for package in self.critical_packages:
try:
__import__(package.replace('-', '_'))
if verbose:
print(f"{package}")
except ImportError:
if verbose:
print(f"{package} - MISSING")
missing_packages.append(package)
if missing_packages:
if verbose:
print(f"❌ Missing critical packages: {', '.join(missing_packages)}")
return False, missing_packages
if verbose:
print("✅ All critical dependencies available!")
return True, []
def check_optional_dependencies(self) -> Tuple[bool, List[str]]:
"""Check if optional dependencies are available."""
import os
verbose = os.getenv("ALWRITY_VERBOSE", "false").lower() == "true"
if verbose:
print("🔍 Checking optional dependencies...")
missing_packages = []
for package in self.optional_packages:
try:
__import__(package.replace('-', '_'))
if verbose:
print(f"{package}")
except ImportError:
if verbose:
print(f" ⚠️ {package} - MISSING (optional)")
missing_packages.append(package)
if missing_packages and verbose:
print(f"⚠️ Missing optional packages: {', '.join(missing_packages)}")
print(" Some features may not be available")
return len(missing_packages) == 0, missing_packages
def setup_spacy_model(self) -> bool:
"""Set up spaCy English model."""
print("🧠 Setting up spaCy model...")
try:
import spacy
model_name = "en_core_web_sm"
try:
# Try to load the model
nlp = spacy.load(model_name)
test_doc = nlp("This is a test sentence.")
if test_doc and len(test_doc) > 0:
print(f"✅ spaCy model '{model_name}' is available")
return True
except OSError:
# Model not found - try to download it
print(f"⚠️ spaCy model '{model_name}' not found, downloading...")
try:
subprocess.check_call([
sys.executable, "-m", "spacy", "download", model_name
])
print(f"✅ spaCy model '{model_name}' downloaded successfully")
return True
except subprocess.CalledProcessError as e:
print(f"❌ Failed to download spaCy model: {e}")
print(" Please download manually with: python -m spacy download en_core_web_sm")
return False
except ImportError:
print("⚠️ spaCy not installed - skipping model setup")
return True # Don't fail for missing spaCy package
return True
def setup_nltk_data(self) -> bool:
"""Set up NLTK data."""
print("📚 Setting up NLTK data...")
try:
import nltk
# Essential NLTK data packages
essential_data = [
('punkt_tab', 'tokenizers/punkt_tab'), # Updated tokenizer
('stopwords', 'corpora/stopwords'),
('averaged_perceptron_tagger', 'taggers/averaged_perceptron_tagger')
]
for data_package, path in essential_data:
try:
nltk.data.find(path)
print(f"{data_package}")
except LookupError:
print(f" ⚠️ {data_package} - downloading...")
try:
nltk.download(data_package, quiet=True)
print(f"{data_package} downloaded")
except Exception as e:
print(f" ⚠️ {data_package} download failed: {e}")
# Try fallback for punkt_tab -> punkt
if data_package == 'punkt_tab':
try:
nltk.download('punkt', quiet=True)
print(f" ✅ punkt (fallback) downloaded")
except:
pass
print("✅ NLTK data setup complete")
return True
except ImportError:
print("⚠️ NLTK not installed - skipping data setup")
return True # Don't fail for missing NLTK package
return True

View File

@@ -0,0 +1,152 @@
"""
Environment Setup Module
Handles environment configuration and directory setup.
"""
import os
from pathlib import Path
from typing import List, Dict, Any
class EnvironmentSetup:
"""Manages environment setup for ALwrity backend."""
def __init__(self, production_mode: bool = False):
self.production_mode = production_mode
if production_mode:
self.required_directories = []
else:
self.required_directories = [
"lib/workspace/alwrity_content",
"lib/workspace/alwrity_web_research",
"lib/workspace/alwrity_prompts",
"lib/workspace/alwrity_config"
]
def setup_directories(self) -> bool:
"""Create necessary directories for ALwrity."""
import os
verbose = os.getenv("ALWRITY_VERBOSE", "false").lower() == "true"
if verbose:
print("📁 Setting up directories...")
if not self.required_directories:
if verbose:
print(" ⚠️ Skipping directory creation in production mode")
return True
for directory in self.required_directories:
try:
Path(directory).mkdir(parents=True, exist_ok=True)
if verbose:
print(f" ✅ Created: {directory}")
except Exception as e:
if verbose:
print(f" ❌ Failed to create {directory}: {e}")
return False
if verbose:
print("✅ All directories created successfully")
return True
def setup_environment_variables(self) -> bool:
"""Set up environment variables for the application."""
print("🔧 Setting up environment variables...")
# Production environment variables
if self.production_mode:
env_vars = {
"HOST": "0.0.0.0",
"PORT": "8000",
"RELOAD": "false",
"LOG_LEVEL": "INFO",
"DEBUG": "false"
}
else:
env_vars = {
"HOST": "0.0.0.0",
"PORT": "8000",
"RELOAD": "true",
"LOG_LEVEL": "DEBUG",
"DEBUG": "true"
}
for key, value in env_vars.items():
os.environ.setdefault(key, value)
print(f"{key}={value}")
print("✅ Environment variables configured")
return True
def create_env_file(self) -> bool:
"""Create .env file with default configuration (development only)."""
if self.production_mode:
print("⚠️ Skipping .env file creation in production mode")
return True
print("🔧 Creating .env file...")
env_file = Path(".env")
if env_file.exists():
print(" ✅ .env file already exists")
return True
env_content = """# ALwrity Backend Configuration
# API Keys (Configure these in the onboarding process)
# OPENAI_API_KEY=your_openai_api_key_here
# GEMINI_API_KEY=your_gemini_api_key_here
# ANTHROPIC_API_KEY=your_anthropic_api_key_here
# MISTRAL_API_KEY=your_mistral_api_key_here
# Research API Keys (Optional)
# TAVILY_API_KEY=your_tavily_api_key_here
# SERPER_API_KEY=your_serper_api_key_here
# EXA_API_KEY=your_exa_api_key_here
# Authentication
# CLERK_SECRET_KEY=your_clerk_secret_key_here
# OAuth Redirect URIs
# GSC_REDIRECT_URI=https://your-frontend.vercel.app/gsc/callback
# WORDPRESS_REDIRECT_URI=https://your-frontend.vercel.app/wp/callback
# WIX_REDIRECT_URI=https://your-frontend.vercel.app/wix/callback
# Server Configuration
HOST=0.0.0.0
PORT=8000
DEBUG=true
# Logging
LOG_LEVEL=INFO
"""
try:
with open(env_file, 'w') as f:
f.write(env_content)
print("✅ .env file created successfully")
return True
except Exception as e:
print(f"❌ Error creating .env file: {e}")
return False
def verify_environment(self) -> bool:
"""Verify that the environment is properly configured."""
print("🔍 Verifying environment setup...")
# Check required directories
for directory in self.required_directories:
if not Path(directory).exists():
print(f"❌ Directory missing: {directory}")
return False
# Check environment variables
required_vars = ["HOST", "PORT", "LOG_LEVEL"]
for var in required_vars:
if not os.getenv(var):
print(f"❌ Environment variable missing: {var}")
return False
print("✅ Environment verification complete")
return True

View File

@@ -0,0 +1,156 @@
"""
Frontend Serving Module
Handles React frontend serving and static file mounting with cache headers.
"""
import os
from pathlib import Path
from fastapi import FastAPI, Request
from fastapi.staticfiles import StaticFiles
from fastapi.responses import FileResponse, Response
from starlette.middleware.base import BaseHTTPMiddleware
from loguru import logger
from typing import Dict, Any
class CacheHeadersMiddleware(BaseHTTPMiddleware):
"""
Middleware to add cache headers to static files.
This improves performance by allowing browsers to cache static assets
(JS, CSS, images) for 1 year, reducing repeat visit load times.
"""
async def dispatch(self, request: Request, call_next):
response = await call_next(request)
# Only add cache headers to static files
if request.url.path.startswith("/static/"):
path = request.url.path.lower()
# Check if file has a hash in its name (React build pattern: filename.hash.ext)
# Examples: bundle.abc123.js, main.def456.chunk.js, vendors.789abc.js
import re
# Pattern matches: filename.hexhash.ext or filename.hexhash.chunk.ext
hash_pattern = r'\.[a-f0-9]{8,}\.'
has_hash = bool(re.search(hash_pattern, path))
# File extensions that should be cached
cacheable_extensions = ['.js', '.css', '.woff', '.woff2', '.ttf', '.otf',
'.png', '.jpg', '.jpeg', '.webp', '.svg', '.ico', '.gif']
is_cacheable_file = any(path.endswith(ext) for ext in cacheable_extensions)
if is_cacheable_file:
if has_hash:
# Immutable files (with hash) - cache for 1 year
# These files never change (new hash = new file)
response.headers["Cache-Control"] = "public, max-age=31536000, immutable"
# Expires header calculated dynamically to match max-age
# Modern browsers prefer Cache-Control, but Expires provides compatibility
from datetime import datetime, timedelta
expires_date = datetime.utcnow() + timedelta(seconds=31536000)
response.headers["Expires"] = expires_date.strftime("%a, %d %b %Y %H:%M:%S GMT")
else:
# Non-hashed files - shorter cache (1 hour)
# These might be updated, so cache for shorter time
response.headers["Cache-Control"] = "public, max-age=3600"
# Never cache HTML files (index.html)
elif request.url.path == "/" or request.url.path.endswith(".html"):
response.headers["Cache-Control"] = "no-cache, no-store, must-revalidate"
response.headers["Pragma"] = "no-cache"
response.headers["Expires"] = "0"
return response
class FrontendServing:
"""Manages React frontend serving and static file mounting with cache headers."""
def __init__(self, app: FastAPI):
self.app = app
self.frontend_build_path = os.path.join(os.path.dirname(__file__), "..", "..", "frontend", "build")
self.static_path = os.path.join(self.frontend_build_path, "static")
def setup_frontend_serving(self) -> bool:
"""
Set up React frontend serving and static file mounting with cache headers.
This method:
1. Adds cache headers middleware for static files
2. Mounts static files directory
3. Configures proper caching for performance
"""
try:
logger.info("Setting up frontend serving with cache headers...")
# Add cache headers middleware BEFORE mounting static files
self.app.add_middleware(CacheHeadersMiddleware)
logger.info("Cache headers middleware added")
# Mount static files for React app (only if directory exists)
if os.path.exists(self.static_path):
self.app.mount("/static", StaticFiles(directory=self.static_path), name="static")
logger.info("Frontend static files mounted successfully with cache headers")
logger.info("Static files will be cached for 1 year (immutable files) or 1 hour (others)")
return True
else:
logger.info("Frontend build directory not found. Static files not mounted.")
return False
except Exception as e:
logger.error(f"Could not mount static files: {e}")
return False
def serve_frontend(self) -> FileResponse | Dict[str, Any]:
"""
Serve the React frontend index.html.
Note: index.html is never cached to ensure users always get the latest version.
Static assets (JS/CSS) are cached separately via middleware.
"""
try:
# Check if frontend build exists
index_html = os.path.join(self.frontend_build_path, "index.html")
if os.path.exists(index_html):
# Return FileResponse with no-cache headers for HTML
response = FileResponse(index_html)
response.headers["Cache-Control"] = "no-cache, no-store, must-revalidate"
response.headers["Pragma"] = "no-cache"
response.headers["Expires"] = "0"
return response
else:
return {
"message": "Frontend not built. Please run 'npm run build' in the frontend directory.",
"api_docs": "/api/docs"
}
except Exception as e:
logger.error(f"Error serving frontend: {e}")
return {
"message": "Error serving frontend",
"error": str(e),
"api_docs": "/api/docs"
}
def get_frontend_status(self) -> Dict[str, Any]:
"""Get the status of frontend build and serving."""
try:
index_html = os.path.join(self.frontend_build_path, "index.html")
static_exists = os.path.exists(self.static_path)
return {
"frontend_build_path": self.frontend_build_path,
"static_path": self.static_path,
"index_html_exists": os.path.exists(index_html),
"static_files_exist": static_exists,
"frontend_ready": os.path.exists(index_html) and static_exists
}
except Exception as e:
logger.error(f"Error checking frontend status: {e}")
return {
"error": str(e),
"frontend_ready": False
}

View File

@@ -0,0 +1,129 @@
"""
Health Check Module
Handles health check endpoints and database health verification.
"""
from fastapi import HTTPException
from datetime import datetime
from typing import Dict, Any
from loguru import logger
class HealthChecker:
"""Manages health check functionality for ALwrity backend."""
def __init__(self):
self.startup_time = datetime.utcnow()
def basic_health_check(self) -> Dict[str, Any]:
"""Basic health check endpoint."""
try:
return {
"status": "healthy",
"message": "ALwrity backend is running",
"timestamp": datetime.utcnow().isoformat(),
"uptime": str(datetime.utcnow() - self.startup_time)
}
except Exception as e:
logger.error(f"Health check failed: {e}")
return {
"status": "error",
"message": f"Health check failed: {str(e)}",
"timestamp": datetime.utcnow().isoformat()
}
def database_health_check(self) -> Dict[str, Any]:
"""Database health check endpoint including persona tables verification."""
try:
from services.database import get_db_session
from models.persona_models import (
WritingPersona,
PlatformPersona,
PersonaAnalysisResult,
PersonaValidationResult
)
session = get_db_session()
if not session:
return {
"status": "error",
"message": "Could not get database session",
"timestamp": datetime.utcnow().isoformat()
}
# Test all persona tables
tables_status = {}
try:
session.query(WritingPersona).first()
tables_status["writing_personas"] = "ok"
except Exception as e:
tables_status["writing_personas"] = f"error: {str(e)}"
try:
session.query(PlatformPersona).first()
tables_status["platform_personas"] = "ok"
except Exception as e:
tables_status["platform_personas"] = f"error: {str(e)}"
try:
session.query(PersonaAnalysisResult).first()
tables_status["persona_analysis_results"] = "ok"
except Exception as e:
tables_status["persona_analysis_results"] = f"error: {str(e)}"
try:
session.query(PersonaValidationResult).first()
tables_status["persona_validation_results"] = "ok"
except Exception as e:
tables_status["persona_validation_results"] = f"error: {str(e)}"
session.close()
# Check if all tables are ok
all_ok = all(status == "ok" for status in tables_status.values())
return {
"status": "healthy" if all_ok else "warning",
"message": "Database connection successful" if all_ok else "Some persona tables may have issues",
"persona_tables": tables_status,
"timestamp": datetime.utcnow().isoformat()
}
except Exception as e:
logger.error(f"Database health check failed: {e}")
return {
"status": "error",
"message": f"Database health check failed: {str(e)}",
"timestamp": datetime.utcnow().isoformat()
}
def comprehensive_health_check(self) -> Dict[str, Any]:
"""Comprehensive health check including all services."""
try:
# Basic health
basic_health = self.basic_health_check()
# Database health
db_health = self.database_health_check()
# Determine overall status
overall_status = "healthy"
if basic_health["status"] != "healthy" or db_health["status"] == "error":
overall_status = "unhealthy"
elif db_health["status"] == "warning":
overall_status = "degraded"
return {
"status": overall_status,
"basic": basic_health,
"database": db_health,
"timestamp": datetime.utcnow().isoformat()
}
except Exception as e:
logger.error(f"Comprehensive health check failed: {e}")
return {
"status": "error",
"message": f"Comprehensive health check failed: {str(e)}",
"timestamp": datetime.utcnow().isoformat()
}

View File

@@ -0,0 +1,499 @@
"""
Onboarding Manager Module
Handles all onboarding-related endpoints and functionality.
"""
from fastapi import FastAPI, HTTPException, Depends, BackgroundTasks
from fastapi.responses import FileResponse
from typing import Dict, Any, Optional
from loguru import logger
# Import onboarding functions
from api.onboarding import (
health_check,
initialize_onboarding,
get_onboarding_status,
get_onboarding_progress_full,
get_step_data,
complete_step,
skip_step,
validate_step_access,
get_api_keys,
get_api_keys_for_onboarding,
save_api_key,
validate_api_keys,
start_onboarding,
complete_onboarding,
reset_onboarding,
get_resume_info,
get_onboarding_config,
get_provider_setup_info,
get_all_providers_info,
validate_provider_key,
get_enhanced_validation_status,
get_onboarding_summary,
get_website_analysis_data,
get_research_preferences_data,
save_business_info,
get_business_info,
get_business_info_by_user,
update_business_info,
generate_writing_personas,
generate_writing_personas_async,
get_persona_task_status,
assess_persona_quality,
regenerate_persona,
get_persona_generation_options,
get_latest_persona,
save_persona_update,
StepCompletionRequest,
APIKeyRequest
)
from middleware.auth_middleware import get_current_user
class OnboardingManager:
"""Manages all onboarding-related endpoints and functionality."""
def __init__(self, app: FastAPI):
self.app = app
self.setup_onboarding_endpoints()
def setup_onboarding_endpoints(self):
"""Set up all onboarding-related endpoints."""
# Onboarding initialization - BATCH ENDPOINT (reduces 4 API calls to 1)
@self.app.get("/api/onboarding/init")
async def onboarding_init(current_user: dict = Depends(get_current_user)):
"""
Batch initialization endpoint - combines user info, status, and progress.
This eliminates 3-4 separate API calls on initial load, reducing latency by 60-75%.
"""
try:
return await initialize_onboarding(current_user)
except HTTPException as he:
raise he
except Exception as e:
logger.error(f"Error in onboarding_init: {e}")
raise HTTPException(status_code=500, detail=str(e))
# Onboarding status endpoints
@self.app.get("/api/onboarding/status")
async def onboarding_status(current_user: dict = Depends(get_current_user)):
"""Get the current onboarding status."""
try:
return await get_onboarding_status(current_user)
except HTTPException as he:
raise he
except Exception as e:
logger.error(f"Error in onboarding_status: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.get("/api/onboarding/progress")
async def onboarding_progress(current_user: dict = Depends(get_current_user)):
"""Get the full onboarding progress data."""
try:
return await get_onboarding_progress_full(current_user)
except HTTPException as he:
raise he
except Exception as e:
logger.error(f"Error in onboarding_progress: {e}")
raise HTTPException(status_code=500, detail=str(e))
# Step management endpoints
@self.app.get("/api/onboarding/step/{step_number}")
async def step_data(step_number: int, current_user: dict = Depends(get_current_user)):
"""Get data for a specific step."""
try:
return await get_step_data(step_number, current_user)
except HTTPException as he:
raise he
except Exception as e:
logger.error(f"Error in step_data: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.post("/api/onboarding/step/{step_number}/complete")
async def step_complete(step_number: int, request: StepCompletionRequest, current_user: dict = Depends(get_current_user)):
"""Mark a step as completed."""
try:
return await complete_step(step_number, request, current_user)
except HTTPException as he:
raise he
except Exception as e:
logger.error(f"Error in step_complete: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.post("/api/onboarding/step/{step_number}/skip")
async def step_skip(step_number: int, current_user: dict = Depends(get_current_user)):
"""Skip a step (for optional steps)."""
try:
return await skip_step(step_number, current_user)
except HTTPException as he:
raise he
except Exception as e:
logger.error(f"Error in step_skip: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.get("/api/onboarding/step/{step_number}/validate")
async def step_validate(step_number: int, current_user: dict = Depends(get_current_user)):
"""Validate if user can access a specific step."""
try:
return await validate_step_access(step_number, current_user)
except HTTPException as he:
raise he
except Exception as e:
logger.error(f"Error in step_validate: {e}")
raise HTTPException(status_code=500, detail=str(e))
# API key management endpoints
@self.app.get("/api/onboarding/api-keys")
async def api_keys():
"""Get all configured API keys (masked)."""
try:
return await get_api_keys()
except Exception as e:
logger.error(f"Error in api_keys: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.get("/api/onboarding/api-keys/onboarding")
async def api_keys_for_onboarding(current_user: dict = Depends(get_current_user)):
"""Get all configured API keys for onboarding (unmasked)."""
try:
return await get_api_keys_for_onboarding(current_user)
except Exception as e:
logger.error(f"Error in api_keys_for_onboarding: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.post("/api/onboarding/api-keys")
async def api_key_save(request: APIKeyRequest, current_user: dict = Depends(get_current_user)):
"""Save an API key for a provider."""
try:
return await save_api_key(request, current_user)
except Exception as e:
logger.error(f"Error in api_key_save: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.get("/api/onboarding/api-keys/validate")
async def api_key_validate():
"""Get API key validation status and configuration."""
try:
import os
from dotenv import load_dotenv
# Load environment variables
backend_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
env_path = os.path.join(backend_dir, ".env")
load_dotenv(env_path, override=True)
# Check for required API keys (backend only)
api_keys = {}
required_keys = {
'GEMINI_API_KEY': 'gemini',
'EXA_API_KEY': 'exa'
# Note: CopilotKit is frontend-only, validated separately
}
missing_keys = []
configured_providers = []
for env_var, provider in required_keys.items():
key_value = os.getenv(env_var)
if key_value and key_value.strip():
api_keys[provider] = key_value.strip()
configured_providers.append(provider)
else:
missing_keys.append(provider)
# Determine if all required keys are present
required_providers = ['gemini', 'exa'] # Backend keys only
all_required_present = all(provider in configured_providers for provider in required_providers)
result = {
"api_keys": api_keys,
"validation_results": {
"gemini": {"valid": 'gemini' in configured_providers, "status": "configured" if 'gemini' in configured_providers else "missing"},
"exa": {"valid": 'exa' in configured_providers, "status": "configured" if 'exa' in configured_providers else "missing"}
},
"all_valid": all_required_present,
"total_providers": len(configured_providers),
"configured_providers": configured_providers,
"missing_keys": missing_keys
}
logger.info(f"API Key Validation Result: {result}")
return result
except Exception as e:
logger.error(f"Error in api_key_validate: {e}")
raise HTTPException(status_code=500, detail=str(e))
# Onboarding control endpoints
@self.app.post("/api/onboarding/start")
async def onboarding_start(current_user: dict = Depends(get_current_user)):
"""Start a new onboarding session."""
try:
return await start_onboarding(current_user)
except HTTPException as he:
raise he
except Exception as e:
logger.error(f"Error in onboarding_start: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.post("/api/onboarding/complete")
async def onboarding_complete(current_user: dict = Depends(get_current_user)):
"""Complete the onboarding process."""
try:
return await complete_onboarding(current_user)
except HTTPException as he:
raise he
except Exception as e:
logger.error(f"Error in onboarding_complete: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.post("/api/onboarding/reset")
async def onboarding_reset(current_user: dict = Depends(get_current_user)):
"""Reset the onboarding progress."""
try:
return await reset_onboarding(current_user)
except Exception as e:
logger.error(f"Error in onboarding_reset: {e}")
raise HTTPException(status_code=500, detail=str(e))
# Resume functionality
@self.app.get("/api/onboarding/resume")
async def onboarding_resume():
"""Get information for resuming onboarding."""
try:
return await get_resume_info()
except Exception as e:
logger.error(f"Error in onboarding_resume: {e}")
raise HTTPException(status_code=500, detail=str(e))
# Configuration endpoints
@self.app.get("/api/onboarding/config")
async def onboarding_config():
"""Get onboarding configuration and requirements."""
try:
return get_onboarding_config()
except Exception as e:
logger.error(f"Error in onboarding_config: {e}")
raise HTTPException(status_code=500, detail=str(e))
# Enhanced provider endpoints
@self.app.get("/api/onboarding/providers/{provider}/setup")
async def provider_setup_info(provider: str):
"""Get setup information for a specific provider."""
try:
return await get_provider_setup_info(provider)
except Exception as e:
logger.error(f"Error in provider_setup_info: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.get("/api/onboarding/providers")
async def all_providers_info():
"""Get setup information for all providers."""
try:
return await get_all_providers_info()
except Exception as e:
logger.error(f"Error in all_providers_info: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.post("/api/onboarding/providers/{provider}/validate")
async def validate_provider_key_endpoint(provider: str, request: APIKeyRequest):
"""Validate a specific provider's API key."""
try:
return await validate_provider_key(provider, request)
except Exception as e:
logger.error(f"Error in validate_provider_key: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.get("/api/onboarding/validation/enhanced")
async def enhanced_validation_status():
"""Get enhanced validation status for all configured services."""
try:
return await get_enhanced_validation_status()
except Exception as e:
logger.error(f"Error in enhanced_validation_status: {e}")
raise HTTPException(status_code=500, detail=str(e))
# New endpoints for FinalStep data loading
@self.app.get("/api/onboarding/summary")
async def onboarding_summary(current_user: dict = Depends(get_current_user)):
"""Get comprehensive onboarding summary for FinalStep."""
try:
return await get_onboarding_summary(current_user)
except Exception as e:
logger.error(f"Error in onboarding_summary: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.get("/api/onboarding/website-analysis")
async def website_analysis_data(current_user: dict = Depends(get_current_user)):
"""Get website analysis data for FinalStep."""
try:
return await get_website_analysis_data(current_user)
except Exception as e:
logger.error(f"Error in website_analysis_data: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.get("/api/onboarding/research-preferences")
async def research_preferences_data(current_user: dict = Depends(get_current_user)):
"""Get research preferences data for FinalStep."""
try:
return await get_research_preferences_data(current_user)
except Exception as e:
logger.error(f"Error in research_preferences_data: {e}")
raise HTTPException(status_code=500, detail=str(e))
# Business Information endpoints
@self.app.post("/api/onboarding/business-info")
async def business_info_save(request: dict):
"""Save business information for users without websites."""
try:
from models.business_info_request import BusinessInfoRequest
return await save_business_info(request)
except Exception as e:
logger.error(f"Error in business_info_save: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.get("/api/onboarding/business-info/{business_info_id}")
async def business_info_get(business_info_id: int):
"""Get business information by ID."""
try:
return await get_business_info(business_info_id)
except Exception as e:
logger.error(f"Error in business_info_get: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.get("/api/onboarding/business-info/user/{user_id}")
async def business_info_get_by_user(user_id: str):
"""Get business information by user ID."""
try:
return await get_business_info_by_user(user_id)
except Exception as e:
logger.error(f"Error in business_info_get_by_user: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.put("/api/onboarding/business-info/{business_info_id}")
async def business_info_update(business_info_id: int, request: dict):
"""Update business information."""
try:
from models.business_info_request import BusinessInfoRequest
return await update_business_info(business_info_id, request)
except Exception as e:
logger.error(f"Error in business_info_update: {e}")
raise HTTPException(status_code=500, detail=str(e))
# Persona generation endpoints
@self.app.post("/api/onboarding/step4/generate-personas")
async def generate_personas(request: dict, current_user: dict = Depends(get_current_user)):
"""Generate AI writing personas for Step 4."""
try:
return await generate_writing_personas(request, current_user)
except Exception as e:
logger.error(f"Error in generate_personas: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.post("/api/onboarding/step4/generate-personas-async")
async def generate_personas_async(request: dict, background_tasks: BackgroundTasks, current_user: dict = Depends(get_current_user)):
"""Start async persona generation task."""
try:
return await generate_writing_personas_async(request, current_user, background_tasks)
except Exception as e:
logger.error(f"Error in generate_personas_async: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.get("/api/onboarding/step4/persona-task/{task_id}")
async def get_persona_task(task_id: str):
"""Get persona generation task status."""
try:
return await get_persona_task_status(task_id)
except Exception as e:
logger.error(f"Error in get_persona_task: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.get("/api/onboarding/step4/persona-latest")
async def persona_latest(current_user: dict = Depends(get_current_user)):
"""Get latest cached persona for current user."""
try:
return await get_latest_persona(current_user)
except HTTPException as he:
raise he
except Exception as e:
logger.error(f"Error in persona_latest: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.post("/api/onboarding/step4/persona-save")
async def persona_save(request: dict, current_user: dict = Depends(get_current_user)):
"""Save edited persona back to cache."""
try:
return await save_persona_update(request, current_user)
except HTTPException as he:
raise he
except Exception as e:
logger.error(f"Error in persona_save: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.post("/api/onboarding/step4/assess-persona-quality")
async def assess_persona_quality_endpoint(request: dict, current_user: dict = Depends(get_current_user)):
"""Assess the quality of generated personas."""
try:
return await assess_persona_quality(request, current_user)
except Exception as e:
logger.error(f"Error in assess_persona_quality: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.post("/api/onboarding/step4/regenerate-persona")
async def regenerate_persona_endpoint(request: dict, current_user: dict = Depends(get_current_user)):
"""Regenerate a specific persona with improvements."""
try:
return await regenerate_persona(request, current_user)
except Exception as e:
logger.error(f"Error in regenerate_persona: {e}")
raise HTTPException(status_code=500, detail=str(e))
@self.app.get("/api/onboarding/step4/persona-options")
async def get_persona_options(current_user: dict = Depends(get_current_user)):
"""Get persona generation options and configurations."""
try:
return await get_persona_generation_options(current_user)
except Exception as e:
logger.error(f"Error in get_persona_options: {e}")
raise HTTPException(status_code=500, detail=str(e))
def get_onboarding_status(self) -> Dict[str, Any]:
"""Get the status of onboarding endpoints."""
return {
"onboarding_endpoints": [
"/api/onboarding/init",
"/api/onboarding/status",
"/api/onboarding/progress",
"/api/onboarding/step/{step_number}",
"/api/onboarding/step/{step_number}/complete",
"/api/onboarding/step/{step_number}/skip",
"/api/onboarding/step/{step_number}/validate",
"/api/onboarding/api-keys",
"/api/onboarding/api-keys/onboarding",
"/api/onboarding/start",
"/api/onboarding/complete",
"/api/onboarding/reset",
"/api/onboarding/resume",
"/api/onboarding/config",
"/api/onboarding/providers/{provider}/setup",
"/api/onboarding/providers",
"/api/onboarding/providers/{provider}/validate",
"/api/onboarding/validation/enhanced",
"/api/onboarding/summary",
"/api/onboarding/website-analysis",
"/api/onboarding/research-preferences",
"/api/onboarding/business-info",
"/api/onboarding/step4/generate-personas",
"/api/onboarding/step4/generate-personas-async",
"/api/onboarding/step4/persona-task/{task_id}",
"/api/onboarding/step4/persona-latest",
"/api/onboarding/step4/persona-save",
"/api/onboarding/step4/assess-persona-quality",
"/api/onboarding/step4/regenerate-persona",
"/api/onboarding/step4/persona-options"
],
"total_endpoints": 30,
"status": "active"
}

View File

@@ -0,0 +1,133 @@
"""
Production Optimizer Module
Handles production-specific optimizations and configurations.
"""
import os
import sys
from typing import List, Dict, Any
class ProductionOptimizer:
"""Optimizes ALwrity backend for production deployment."""
def __init__(self):
self.production_optimizations = {
'disable_spacy_download': False, # Allow spaCy verification (required for persona generation)
'disable_nltk_download': False, # Allow NLTK verification (required for persona generation)
'skip_linguistic_setup': False, # Always verify linguistic models are available
'minimal_database_setup': True,
'skip_file_creation': True
}
def apply_production_optimizations(self) -> bool:
"""Apply production-specific optimizations."""
print("🚀 Applying production optimizations...")
# Set production environment variables
self._set_production_env_vars()
# Disable heavy operations
self._disable_heavy_operations()
# Optimize logging
self._optimize_logging()
print("✅ Production optimizations applied")
return True
def _set_production_env_vars(self) -> None:
"""Set production-specific environment variables."""
production_vars = {
# Note: HOST is not set here - it's auto-detected by start_backend()
# Based on deployment environment (cloud vs local)
'PORT': '8000',
'RELOAD': 'false',
'LOG_LEVEL': 'INFO',
'DEBUG': 'false',
'PYTHONUNBUFFERED': '1', # Ensure logs are flushed immediately
'PYTHONDONTWRITEBYTECODE': '1' # Don't create .pyc files
}
for key, value in production_vars.items():
os.environ.setdefault(key, value)
print(f"{key}={value}")
def _disable_heavy_operations(self) -> None:
"""Configure operations for production startup."""
print(" ⚡ Configuring operations for production...")
# Note: spaCy and NLTK verification are allowed in production
# Models should be pre-installed during build phase (via render.yaml or similar)
# The setup will verify models exist without re-downloading
print(" ✅ Production operations configured")
def _optimize_logging(self) -> None:
"""Optimize logging for production."""
print(" 📝 Optimizing logging for production...")
# Set appropriate log level
os.environ.setdefault('LOG_LEVEL', 'INFO')
# Disable debug logging
os.environ.setdefault('DEBUG', 'false')
print(" ✅ Logging optimized")
def skip_linguistic_setup(self) -> bool:
"""Skip linguistic analysis setup in production."""
if os.getenv('SKIP_LINGUISTIC_SETUP', 'false').lower() == 'true':
print("⚠️ Skipping linguistic analysis setup (production mode)")
return True
return False
def skip_spacy_setup(self) -> bool:
"""Skip spaCy model setup in production."""
if os.getenv('DISABLE_SPACY_DOWNLOAD', 'false').lower() == 'true':
print("⚠️ Skipping spaCy model setup (production mode)")
return True
return False
def skip_nltk_setup(self) -> bool:
"""Skip NLTK data setup in production."""
if os.getenv('DISABLE_NLTK_DOWNLOAD', 'false').lower() == 'true':
print("⚠️ Skipping NLTK data setup (production mode)")
return True
return False
def get_production_config(self) -> Dict[str, Any]:
"""Get production configuration settings."""
return {
'host': os.getenv('HOST', '0.0.0.0'),
'port': int(os.getenv('PORT', '8000')),
'reload': False, # Never reload in production
'log_level': os.getenv('LOG_LEVEL', 'info'),
'access_log': True,
'workers': 1, # Single worker for Render
'timeout_keep_alive': 30,
'timeout_graceful_shutdown': 30
}
def validate_production_environment(self) -> bool:
"""Validate that the environment is ready for production."""
print("🔍 Validating production environment...")
# Check critical environment variables
required_vars = ['HOST', 'PORT', 'LOG_LEVEL']
missing_vars = []
for var in required_vars:
if not os.getenv(var):
missing_vars.append(var)
if missing_vars:
print(f"❌ Missing environment variables: {missing_vars}")
return False
# Check that reload is disabled
if os.getenv('RELOAD', 'false').lower() == 'true':
print("⚠️ Warning: RELOAD is enabled in production")
print("✅ Production environment validated")
return True

View File

@@ -0,0 +1,134 @@
"""
Rate Limiting Module
Handles rate limiting middleware and request tracking.
"""
import time
from collections import defaultdict
from typing import Dict, List, Optional
from fastapi import Request, Response
from fastapi.responses import JSONResponse
from loguru import logger
class RateLimiter:
"""Manages rate limiting for ALwrity backend."""
def __init__(self, window_seconds: int = 60, max_requests: int = 1000): # Increased for development
self.window_seconds = window_seconds
self.max_requests = max_requests
self.request_counts: Dict[str, List[float]] = defaultdict(list)
# Endpoints exempt from rate limiting
self.exempt_paths = [
"/stream/strategies",
"/stream/strategic-intelligence",
"/stream/keyword-research",
"/latest-strategy",
"/ai-analytics",
"/gap-analysis",
"/calendar-events",
# Research endpoints - exempt from rate limiting
"/api/research",
"/api/blog-writer",
"/api/blog-writer/research",
"/api/blog-writer/research/",
"/api/blog/research/status",
"/calendar-generation/progress",
"/health",
"/health/database",
]
# Prefixes to exempt entire route families (keep empty; rely on specific exemptions only)
self.exempt_prefixes = []
def is_exempt_path(self, path: str) -> bool:
"""Check if a path is exempt from rate limiting."""
return any(exempt_path == path or exempt_path in path for exempt_path in self.exempt_paths) or any(
path.startswith(prefix) for prefix in self.exempt_prefixes
)
def clean_old_requests(self, client_ip: str, current_time: float) -> None:
"""Clean old requests from the tracking dictionary."""
self.request_counts[client_ip] = [
req_time for req_time in self.request_counts[client_ip]
if current_time - req_time < self.window_seconds
]
def is_rate_limited(self, client_ip: str, current_time: float) -> bool:
"""Check if a client has exceeded the rate limit."""
self.clean_old_requests(client_ip, current_time)
return len(self.request_counts[client_ip]) >= self.max_requests
def add_request(self, client_ip: str, current_time: float) -> None:
"""Add a request to the tracking dictionary."""
self.request_counts[client_ip].append(current_time)
def get_rate_limit_response(self) -> JSONResponse:
"""Get a rate limit exceeded response."""
return JSONResponse(
status_code=429,
content={
"detail": "Too many requests",
"retry_after": self.window_seconds
},
headers={
"Access-Control-Allow-Origin": "*",
"Access-Control-Allow-Methods": "*",
"Access-Control-Allow-Headers": "*"
}
)
async def rate_limit_middleware(self, request: Request, call_next) -> Response:
"""Rate limiting middleware with exemptions for streaming endpoints."""
try:
client_ip = request.client.host if request.client else "unknown"
current_time = time.time()
path = request.url.path
# Check if path is exempt from rate limiting
if self.is_exempt_path(path):
response = await call_next(request)
return response
# Check rate limit
if self.is_rate_limited(client_ip, current_time):
logger.warning(f"Rate limit exceeded for {client_ip}")
return self.get_rate_limit_response()
# Add current request
self.add_request(client_ip, current_time)
response = await call_next(request)
return response
except Exception as e:
logger.error(f"Error in rate limiting middleware: {e}")
# Continue without rate limiting if there's an error
response = await call_next(request)
return response
def get_rate_limit_status(self, client_ip: str) -> Dict[str, any]:
"""Get current rate limit status for a client."""
current_time = time.time()
self.clean_old_requests(client_ip, current_time)
request_count = len(self.request_counts[client_ip])
remaining_requests = max(0, self.max_requests - request_count)
return {
"client_ip": client_ip,
"requests_in_window": request_count,
"max_requests": self.max_requests,
"remaining_requests": remaining_requests,
"window_seconds": self.window_seconds,
"is_limited": request_count >= self.max_requests
}
def reset_rate_limit(self, client_ip: Optional[str] = None) -> Dict[str, any]:
"""Reset rate limit for a specific client or all clients."""
if client_ip:
self.request_counts[client_ip] = []
return {"message": f"Rate limit reset for {client_ip}"}
else:
self.request_counts.clear()
return {"message": "Rate limit reset for all clients"}

View File

@@ -0,0 +1,244 @@
"""
Router Manager Module
Handles FastAPI router inclusion and management.
"""
from fastapi import FastAPI
from loguru import logger
from typing import List, Dict, Any, Optional
import os
class RouterManager:
"""Manages FastAPI router inclusion and organization."""
def __init__(self, app: FastAPI):
self.app = app
self.included_routers = []
self.failed_routers = []
def include_router_safely(self, router, router_name: str = None) -> bool:
"""Include a router safely with error handling."""
verbose = os.getenv("ALWRITY_VERBOSE", "false").lower() == "true"
try:
self.app.include_router(router)
router_name = router_name or getattr(router, 'prefix', 'unknown')
self.included_routers.append(router_name)
if verbose:
logger.info(f"✅ Router included successfully: {router_name}")
return True
except Exception as e:
router_name = router_name or 'unknown'
self.failed_routers.append({"name": router_name, "error": str(e)})
if verbose:
logger.warning(f"❌ Router inclusion failed: {router_name} - {e}")
return False
def include_core_routers(self) -> bool:
"""Include core application routers."""
# Import os locally to avoid UnboundLocalError if it's shadowed
import os
verbose = os.getenv("ALWRITY_VERBOSE", "false").lower() == "true"
try:
if verbose:
logger.info("Including core routers...")
# Component logic router
from api.component_logic import router as component_logic_router
self.include_router_safely(component_logic_router, "component_logic")
# Subscription router
from api.subscription import router as subscription_router
self.include_router_safely(subscription_router, "subscription")
# Step 3 Research router (core onboarding functionality)
from api.onboarding_utils.step3_routes import router as step3_research_router
self.include_router_safely(step3_research_router, "step3_research")
# Step 4 Persona and Asset routers
from api.onboarding_utils.step4_asset_routes import router as step4_asset_router
self.include_router_safely(step4_asset_router, "step4_assets")
from api.onboarding_utils.step4_persona_routes_optimized import router as step4_persona_router
self.include_router_safely(step4_persona_router, "step4_persona")
# GSC router
from routers.gsc_auth import router as gsc_auth_router
self.include_router_safely(gsc_auth_router, "gsc_auth")
# WordPress router
from routers.wordpress_oauth import router as wordpress_oauth_router
self.include_router_safely(wordpress_oauth_router, "wordpress_oauth")
# Bing Webmaster router
from routers.bing_oauth import router as bing_oauth_router
self.include_router_safely(bing_oauth_router, "bing_oauth")
# Bing Analytics router
from routers.bing_analytics import router as bing_analytics_router
self.include_router_safely(bing_analytics_router, "bing_analytics")
# Bing Analytics Storage router
from routers.bing_analytics_storage import router as bing_analytics_storage_router
self.include_router_safely(bing_analytics_storage_router, "bing_analytics_storage")
# SEO tools router
from routers.seo_tools import router as seo_tools_router
self.include_router_safely(seo_tools_router, "seo_tools")
# Facebook Writer router
from api.facebook_writer.routers import facebook_router
self.include_router_safely(facebook_router, "facebook_writer")
# LinkedIn routers
from routers.linkedin import router as linkedin_router
self.include_router_safely(linkedin_router, "linkedin")
from api.linkedin_image_generation import router as linkedin_image_router
self.include_router_safely(linkedin_image_router, "linkedin_image")
# Brainstorm router
from api.brainstorm import router as brainstorm_router
self.include_router_safely(brainstorm_router, "brainstorm")
# Hallucination detector and writing assistant
from api.hallucination_detector import router as hallucination_detector_router
self.include_router_safely(hallucination_detector_router, "hallucination_detector")
from api.writing_assistant import router as writing_assistant_router
self.include_router_safely(writing_assistant_router, "writing_assistant")
# Content planning and user data
from api.content_planning.api.router import router as content_planning_router
self.include_router_safely(content_planning_router, "content_planning")
from api.user_data import router as user_data_router
self.include_router_safely(user_data_router, "user_data")
from api.user_environment import router as user_environment_router
self.include_router_safely(user_environment_router, "user_environment")
# Strategy copilot
from api.content_planning.strategy_copilot import router as strategy_copilot_router
self.include_router_safely(strategy_copilot_router, "strategy_copilot")
# Error logging router
from routers.error_logging import router as error_logging_router
self.include_router_safely(error_logging_router, "error_logging")
# Frontend environment manager router
from routers.frontend_env_manager import router as frontend_env_router
self.include_router_safely(frontend_env_router, "frontend_env_manager")
# Platform analytics router
try:
from routers.platform_analytics import router as platform_analytics_router
self.include_router_safely(platform_analytics_router, "platform_analytics")
logger.info("✅ Platform analytics router included successfully")
except Exception as e:
logger.error(f"❌ Failed to include platform analytics router: {e}")
# Continue with other routers
# Bing insights router
try:
from routers.bing_insights import router as bing_insights_router
self.include_router_safely(bing_insights_router, "bing_insights")
logger.info("✅ Bing insights router included successfully")
except Exception as e:
logger.error(f"❌ Failed to include Bing insights router: {e}")
# Continue with other routers
# Background jobs router
try:
from routers.background_jobs import router as background_jobs_router
self.include_router_safely(background_jobs_router, "background_jobs")
logger.info("✅ Background jobs router included successfully")
except Exception as e:
logger.error(f"❌ Failed to include Background jobs router: {e}")
# Continue with other routers
logger.info("✅ Core routers included successfully")
return True
except Exception as e:
logger.error(f"❌ Error including core routers: {e}")
return False
def include_optional_routers(self) -> bool:
"""Include optional routers with error handling."""
try:
logger.info("Including optional routers...")
# AI Blog Writer router
try:
from api.blog_writer.router import router as blog_writer_router
self.include_router_safely(blog_writer_router, "blog_writer")
except Exception as e:
logger.warning(f"AI Blog Writer router not mounted: {e}")
# Story Writer router
try:
from api.story_writer.router import router as story_writer_router
self.include_router_safely(story_writer_router, "story_writer")
except Exception as e:
logger.warning(f"Story Writer router not mounted: {e}")
# Wix Integration router
try:
from api.wix_routes import router as wix_router
self.include_router_safely(wix_router, "wix")
except Exception as e:
logger.warning(f"Wix Integration router not mounted: {e}")
# Blog Writer SEO Analysis router
try:
from api.blog_writer.seo_analysis import router as blog_seo_analysis_router
self.include_router_safely(blog_seo_analysis_router, "blog_seo_analysis")
except Exception as e:
logger.warning(f"Blog Writer SEO Analysis router not mounted: {e}")
# Persona router
try:
from api.persona_routes import router as persona_router
self.include_router_safely(persona_router, "persona")
except Exception as e:
logger.warning(f"Persona router not mounted: {e}")
# Video Studio router
try:
from api.video_studio.router import router as video_studio_router
self.include_router_safely(video_studio_router, "video_studio")
except Exception as e:
logger.warning(f"Video Studio router not mounted: {e}")
# Stability AI routers
try:
from routers.stability import router as stability_router
self.include_router_safely(stability_router, "stability")
from routers.stability_advanced import router as stability_advanced_router
self.include_router_safely(stability_advanced_router, "stability_advanced")
from routers.stability_admin import router as stability_admin_router
self.include_router_safely(stability_admin_router, "stability_admin")
except Exception as e:
logger.warning(f"Stability AI routers not mounted: {e}")
logger.info("✅ Optional routers processed")
return True
except Exception as e:
logger.error(f"❌ Error including optional routers: {e}")
return False
def get_router_status(self) -> Dict[str, Any]:
"""Get the status of router inclusion."""
return {
"included_routers": self.included_routers,
"failed_routers": self.failed_routers,
"total_included": len(self.included_routers),
"total_failed": len(self.failed_routers)
}

54
backend/api/__init__.py Normal file
View File

@@ -0,0 +1,54 @@
"""API package for ALwrity backend.
The onboarding endpoints are re-exported from a stable module
(`onboarding_endpoints`) to avoid issues where external tools overwrite
`onboarding.py`.
"""
from .onboarding_endpoints import (
health_check,
get_onboarding_status,
get_onboarding_progress_full,
get_step_data,
complete_step,
skip_step,
validate_step_access,
get_api_keys,
save_api_key,
validate_api_keys,
start_onboarding,
complete_onboarding,
reset_onboarding,
get_resume_info,
get_onboarding_config,
generate_writing_personas,
generate_writing_personas_async,
get_persona_task_status,
assess_persona_quality,
regenerate_persona,
get_persona_generation_options
)
__all__ = [
'health_check',
'get_onboarding_status',
'get_onboarding_progress_full',
'get_step_data',
'complete_step',
'skip_step',
'validate_step_access',
'get_api_keys',
'save_api_key',
'validate_api_keys',
'start_onboarding',
'complete_onboarding',
'reset_onboarding',
'get_resume_info',
'get_onboarding_config',
'generate_writing_personas',
'generate_writing_personas_async',
'get_persona_task_status',
'assess_persona_quality',
'regenerate_persona',
'get_persona_generation_options'
]

1325
backend/api/agents_api.py Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,52 @@
from fastapi import APIRouter, HTTPException
from fastapi.responses import FileResponse
import os
from pathlib import Path
from services.database import WORKSPACE_DIR, get_user_db_path
router = APIRouter(prefix="/api/assets", tags=["Assets Serving"])
@router.get("/{user_id}/avatars/{filename}")
async def serve_avatar(user_id: str, filename: str):
"""
Serve avatar images directly.
Public endpoint relying on unguessable filenames.
"""
# Sanitize user_id (simple check to prevent directory traversal)
safe_user_id = "".join(c for c in user_id if c.isalnum() or c in ('-', '_'))
if safe_user_id != user_id:
raise HTTPException(status_code=400, detail="Invalid user ID")
# Sanitize filename
safe_filename = os.path.basename(filename)
# Construct path
# workspace/workspace_{user_id}/assets/avatars/{filename}
file_path = Path(WORKSPACE_DIR) / f"workspace_{safe_user_id}" / "assets" / "avatars" / safe_filename
if not file_path.exists():
raise HTTPException(status_code=404, detail="Asset not found")
return FileResponse(file_path)
@router.get("/{user_id}/voice_samples/{filename}")
async def serve_voice_sample(user_id: str, filename: str):
"""
Serve voice sample audio files directly.
"""
# Sanitize user_id
safe_user_id = "".join(c for c in user_id if c.isalnum() or c in ('-', '_'))
if safe_user_id != user_id:
raise HTTPException(status_code=400, detail="Invalid user ID")
# Sanitize filename
safe_filename = os.path.basename(filename)
# Construct path
# workspace/workspace_{user_id}/assets/voice_samples/{filename}
file_path = Path(WORKSPACE_DIR) / f"workspace_{safe_user_id}" / "assets" / "voice_samples" / safe_filename
if not file_path.exists():
raise HTTPException(status_code=404, detail="Asset not found")
return FileResponse(file_path)

View File

@@ -0,0 +1,2 @@
# Package init for AI Blog Writer API

View File

@@ -0,0 +1,77 @@
"""
Cache Management System for Blog Writer API
Handles research and outline cache operations including statistics,
clearing, invalidation, and entry retrieval.
"""
from typing import Any, Dict, List
from loguru import logger
from services.blog_writer.blog_service import BlogWriterService
class CacheManager:
"""Manages cache operations for research and outline data."""
def __init__(self):
self.service = BlogWriterService()
def get_research_cache_stats(self) -> Dict[str, Any]:
"""Get research cache statistics."""
try:
from services.cache.research_cache import research_cache
return research_cache.get_cache_stats()
except Exception as e:
logger.error(f"Failed to get research cache stats: {e}")
raise
def clear_research_cache(self) -> Dict[str, Any]:
"""Clear the research cache."""
try:
from services.cache.research_cache import research_cache
research_cache.clear_cache()
return {"status": "success", "message": "Research cache cleared"}
except Exception as e:
logger.error(f"Failed to clear research cache: {e}")
raise
def get_outline_cache_stats(self) -> Dict[str, Any]:
"""Get outline cache statistics."""
try:
stats = self.service.get_outline_cache_stats()
return {"success": True, "stats": stats}
except Exception as e:
logger.error(f"Failed to get outline cache stats: {e}")
raise
def clear_outline_cache(self) -> Dict[str, Any]:
"""Clear all cached outline entries."""
try:
self.service.clear_outline_cache()
return {"success": True, "message": "Outline cache cleared successfully"}
except Exception as e:
logger.error(f"Failed to clear outline cache: {e}")
raise
def invalidate_outline_cache_for_keywords(self, keywords: List[str]) -> Dict[str, Any]:
"""Invalidate outline cache entries for specific keywords."""
try:
self.service.invalidate_outline_cache_for_keywords(keywords)
return {"success": True, "message": f"Invalidated cache for keywords: {keywords}"}
except Exception as e:
logger.error(f"Failed to invalidate outline cache for keywords {keywords}: {e}")
raise
def get_recent_outline_cache_entries(self, limit: int = 20) -> Dict[str, Any]:
"""Get recent outline cache entries for debugging."""
try:
entries = self.service.get_recent_outline_cache_entries(limit)
return {"success": True, "entries": entries}
except Exception as e:
logger.error(f"Failed to get recent outline cache entries: {e}")
raise
# Global cache manager instance
cache_manager = CacheManager()

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,365 @@
"""
Blog Writer SEO Analysis API Endpoint
Provides API endpoint for analyzing blog content SEO with parallel processing
and CopilotKit integration for real-time progress updates.
"""
from fastapi import APIRouter, HTTPException, BackgroundTasks, Depends
from pydantic import BaseModel
from typing import Dict, Any, Optional
from loguru import logger
from datetime import datetime
from sqlalchemy.orm import Session
from sqlalchemy import select
from services.blog_writer.seo.blog_content_seo_analyzer import BlogContentSEOAnalyzer
from services.blog_writer.core.blog_writer_service import BlogWriterService
from middleware.auth_middleware import get_current_user
from services.database import get_db
from models.seo_analysis import SEOAnalysis
router = APIRouter(prefix="/api/blog-writer/seo", tags=["Blog SEO Analysis"])
class SEOAnalysisRequest(BaseModel):
"""Request model for SEO analysis"""
blog_content: str
blog_title: Optional[str] = None
research_data: Dict[str, Any]
user_id: Optional[str] = None
session_id: Optional[str] = None
class SEOAnalysisResponse(BaseModel):
"""Response model for SEO analysis"""
success: bool
analysis_id: str
overall_score: float
category_scores: Dict[str, float]
analysis_summary: Dict[str, Any]
actionable_recommendations: list
detailed_analysis: Optional[Dict[str, Any]] = None
visualization_data: Optional[Dict[str, Any]] = None
generated_at: str
error: Optional[str] = None
class SEOAnalysisProgress(BaseModel):
"""Progress update model for real-time updates"""
analysis_id: str
stage: str
progress: int
message: str
timestamp: str
# Initialize analyzer
seo_analyzer = BlogContentSEOAnalyzer()
blog_writer_service = BlogWriterService()
@router.post("/analyze", response_model=SEOAnalysisResponse)
async def analyze_blog_seo(
request: SEOAnalysisRequest,
current_user: Dict[str, Any] = Depends(get_current_user)
):
"""
Analyze blog content for SEO optimization
This endpoint performs comprehensive SEO analysis including:
- Content structure analysis
- Keyword optimization analysis
- Readability assessment
- Content quality evaluation
- AI-powered insights generation
Args:
request: SEOAnalysisRequest containing blog content and research data
current_user: Authenticated user from middleware
Returns:
SEOAnalysisResponse with comprehensive analysis results
"""
try:
logger.info(f"Starting SEO analysis for blog content")
# Extract Clerk user ID (required)
if not current_user:
raise HTTPException(status_code=401, detail="Authentication required")
user_id = str(current_user.get('id', ''))
if not user_id:
raise HTTPException(status_code=401, detail="Invalid user ID in authentication token")
# Validate request
if not request.blog_content or not request.blog_content.strip():
raise HTTPException(status_code=400, detail="Blog content is required")
if not request.research_data:
raise HTTPException(status_code=400, detail="Research data is required")
# Generate analysis ID
import uuid
analysis_id = str(uuid.uuid4())
# Perform SEO analysis
analysis_results = await seo_analyzer.analyze_blog_content(
blog_content=request.blog_content,
research_data=request.research_data,
blog_title=request.blog_title,
user_id=user_id
)
# Check for errors
if 'error' in analysis_results:
logger.error(f"SEO analysis failed: {analysis_results['error']}")
return SEOAnalysisResponse(
success=False,
analysis_id=analysis_id,
overall_score=0,
category_scores={},
analysis_summary={},
actionable_recommendations=[],
detailed_analysis=None,
visualization_data=None,
generated_at=analysis_results.get('generated_at', ''),
error=analysis_results['error']
)
# Return successful response
return SEOAnalysisResponse(
success=True,
analysis_id=analysis_id,
overall_score=analysis_results.get('overall_score', 0),
category_scores=analysis_results.get('category_scores', {}),
analysis_summary=analysis_results.get('analysis_summary', {}),
actionable_recommendations=analysis_results.get('actionable_recommendations', []),
detailed_analysis=analysis_results.get('detailed_analysis'),
visualization_data=analysis_results.get('visualization_data'),
generated_at=analysis_results.get('generated_at', '')
)
except HTTPException:
raise
except Exception as e:
logger.error(f"SEO analysis endpoint error: {e}")
raise HTTPException(status_code=500, detail=f"SEO analysis failed: {str(e)}")
@router.post("/analyze-with-progress")
async def analyze_blog_seo_with_progress(
request: SEOAnalysisRequest,
current_user: Dict[str, Any] = Depends(get_current_user),
db: Session = Depends(get_db)
):
"""
Analyze blog content for SEO with real-time progress updates
This endpoint provides real-time progress updates for CopilotKit integration.
It returns a stream of progress updates and final results.
Args:
request: SEOAnalysisRequest containing blog content and research data
current_user: Authenticated user from middleware
db: Database session
Returns:
Generator yielding progress updates and final results
"""
try:
logger.info(f"Starting SEO analysis with progress for blog content")
# Extract Clerk user ID (required)
if not current_user:
raise HTTPException(status_code=401, detail="Authentication required")
user_id = str(current_user.get('id', ''))
if not user_id:
raise HTTPException(status_code=401, detail="Invalid user ID in authentication token")
# Validate request
if not request.blog_content or not request.blog_content.strip():
raise HTTPException(status_code=400, detail="Blog content is required")
if not request.research_data:
raise HTTPException(status_code=400, detail="Research data is required")
# Generate analysis ID
import uuid
analysis_id = str(uuid.uuid4())
# Yield progress updates
async def progress_generator():
try:
# Stage 1: Initialization
yield SEOAnalysisProgress(
analysis_id=analysis_id,
stage="initialization",
progress=10,
message="Initializing SEO analysis...",
timestamp=datetime.utcnow().isoformat()
)
# Stage 2: Keyword extraction
yield SEOAnalysisProgress(
analysis_id=analysis_id,
stage="keyword_extraction",
progress=20,
message="Extracting keywords from research data...",
timestamp=datetime.utcnow().isoformat()
)
# Stage 3: Non-AI analysis
yield SEOAnalysisProgress(
analysis_id=analysis_id,
stage="non_ai_analysis",
progress=40,
message="Running content structure and readability analysis...",
timestamp=datetime.utcnow().isoformat()
)
# Stage 4: AI analysis
yield SEOAnalysisProgress(
analysis_id=analysis_id,
stage="ai_analysis",
progress=70,
message="Generating AI-powered insights...",
timestamp=datetime.utcnow().isoformat()
)
# Stage 5: Results compilation
yield SEOAnalysisProgress(
analysis_id=analysis_id,
stage="compilation",
progress=90,
message="Compiling analysis results...",
timestamp=datetime.utcnow().isoformat()
)
# Perform actual analysis
analysis_results = await seo_analyzer.analyze_blog_content(
blog_content=request.blog_content,
research_data=request.research_data,
blog_title=request.blog_title,
user_id=user_id
)
# Save to Database
try:
draft_url = f"draft:{analysis_id}"
overall_score = analysis_results.get('overall_score', 0)
# Determine health status
if overall_score >= 90:
health_status = "excellent"
elif overall_score >= 70:
health_status = "good"
elif overall_score >= 50:
health_status = "needs_improvement"
else:
health_status = "poor"
new_analysis = SEOAnalysis(
url=draft_url,
overall_score=int(overall_score),
health_status=health_status,
timestamp=datetime.utcnow(),
analysis_data=analysis_results
)
db.add(new_analysis)
db.commit()
logger.info(f"Saved SEO analysis results to DB for ID: {analysis_id}")
except Exception as db_error:
logger.error(f"Failed to save analysis to DB: {db_error}")
# Continue without failing
# Final result
yield SEOAnalysisProgress(
analysis_id=analysis_id,
stage="completed",
progress=100,
message="SEO analysis completed successfully!",
timestamp=datetime.utcnow().isoformat()
)
# Yield final results (can't return in async generator)
yield analysis_results
except Exception as e:
logger.error(f"Progress generator error: {e}")
yield SEOAnalysisProgress(
analysis_id=analysis_id,
stage="error",
progress=0,
message=f"Analysis failed: {str(e)}",
timestamp=datetime.utcnow().isoformat()
)
raise
return progress_generator()
except HTTPException:
raise
except Exception as e:
logger.error(f"SEO analysis with progress endpoint error: {e}")
raise HTTPException(status_code=500, detail=f"SEO analysis failed: {str(e)}")
@router.get("/analysis/{analysis_id}")
async def get_analysis_result(
analysis_id: str,
db: Session = Depends(get_db)
):
"""
Get SEO analysis result by ID
Args:
analysis_id: Unique identifier for the analysis
db: Database session
Returns:
SEO analysis results
"""
try:
logger.info(f"Retrieving SEO analysis result for ID: {analysis_id}")
# Look for the analysis in the database
draft_url = f"draft:{analysis_id}"
stmt = select(SEOAnalysis).where(SEOAnalysis.url == draft_url)
analysis = db.execute(stmt).scalar_one_or_none()
if analysis and analysis.analysis_data:
# Return stored analysis data
return {
"analysis_id": analysis_id,
"status": "completed",
"message": "Analysis results retrieved successfully",
**analysis.analysis_data
}
# If not found in DB (fallback for legacy or in-memory only)
# For now, we return 404 to encourage DB usage, or we could return a placeholder if strictly needed.
# But user requested DB integration, so we should rely on DB.
logger.warning(f"Analysis result not found in DB for ID: {analysis_id}")
raise HTTPException(status_code=404, detail="Analysis result not found")
except HTTPException:
raise
except Exception as e:
logger.error(f"Get analysis result error: {e}")
raise HTTPException(status_code=500, detail=f"Failed to retrieve analysis result: {str(e)}")
@router.get("/health")
async def health_check():
"""Health check endpoint for SEO analysis service"""
return {
"status": "healthy",
"service": "blog-seo-analysis",
"timestamp": datetime.utcnow().isoformat()
}

View File

@@ -0,0 +1,332 @@
"""
Task Management System for Blog Writer API
Handles background task execution, status tracking, and progress updates
for research and outline generation operations.
Now uses database-backed persistence for reliability and recovery.
"""
import asyncio
import uuid
from datetime import datetime
from typing import Any, Dict, List
from fastapi import HTTPException
from loguru import logger
from sqlalchemy.orm import Session
from services.database import SessionLocal, get_session_for_user
from models.blog_models import (
BlogResearchRequest,
BlogOutlineRequest,
MediumBlogGenerateRequest,
MediumBlogGenerateResult,
)
from services.blog_writer.blog_service import BlogWriterService
from services.blog_writer.database_task_manager import DatabaseTaskManager
from utils.text_asset_tracker import save_and_track_text_content
class TaskManager:
"""Manages background tasks for research and outline generation."""
def __init__(self, db_connection=None):
# Fallback to in-memory storage if no database connection
if db_connection:
self.db_manager = DatabaseTaskManager(db_connection)
self.use_database = True
else:
self.task_storage: Dict[str, Dict[str, Any]] = {}
self.service = BlogWriterService()
self.use_database = False
logger.warning("No database connection provided, using in-memory task storage")
def cleanup_old_tasks(self):
"""Remove tasks older than 1 hour to prevent memory leaks."""
current_time = datetime.now()
tasks_to_remove = []
for task_id, task_data in self.task_storage.items():
if (current_time - task_data["created_at"]).total_seconds() > 3600: # 1 hour
tasks_to_remove.append(task_id)
for task_id in tasks_to_remove:
del self.task_storage[task_id]
def create_task(self, task_type: str = "general") -> str:
"""Create a new task and return its ID."""
task_id = str(uuid.uuid4())
self.task_storage[task_id] = {
"status": "pending",
"created_at": datetime.now(),
"result": None,
"error": None,
"progress_messages": [],
"task_type": task_type
}
return task_id
async def get_task_status(self, task_id: str) -> Dict[str, Any]:
"""Get the status of a task."""
if self.use_database:
return await self.db_manager.get_task_status(task_id)
else:
self.cleanup_old_tasks()
if task_id not in self.task_storage:
return None
task = self.task_storage[task_id]
response = {
"task_id": task_id,
"status": task["status"],
"created_at": task["created_at"].isoformat(),
"progress_messages": task.get("progress_messages", [])
}
if task["status"] == "completed":
response["result"] = task["result"]
elif task["status"] == "failed":
response["error"] = task["error"]
if "error_status" in task:
response["error_status"] = task["error_status"]
logger.info(f"[TaskManager] get_task_status for {task_id}: Including error_status={task['error_status']} in response")
if "error_data" in task:
response["error_data"] = task["error_data"]
logger.info(f"[TaskManager] get_task_status for {task_id}: Including error_data with keys: {list(task['error_data'].keys()) if isinstance(task['error_data'], dict) else 'not-dict'}")
else:
logger.warning(f"[TaskManager] get_task_status for {task_id}: Task failed but no error_data found. Task keys: {list(task.keys())}")
return response
async def update_progress(self, task_id: str, message: str, percentage: float = None):
"""Update progress message for a task."""
if self.use_database:
await self.db_manager.update_progress(task_id, message, percentage)
else:
if task_id in self.task_storage:
if "progress_messages" not in self.task_storage[task_id]:
self.task_storage[task_id]["progress_messages"] = []
progress_entry = {
"timestamp": datetime.now().isoformat(),
"message": message
}
self.task_storage[task_id]["progress_messages"].append(progress_entry)
# Keep only last 10 progress messages to prevent memory bloat
if len(self.task_storage[task_id]["progress_messages"]) > 10:
self.task_storage[task_id]["progress_messages"] = self.task_storage[task_id]["progress_messages"][-10:]
logger.info(f"Progress update for task {task_id}: {message}")
async def start_research_task(self, request: BlogResearchRequest, user_id: str) -> str:
"""Start a research operation and return a task ID."""
if self.use_database:
return await self.db_manager.start_research_task(request, user_id)
else:
task_id = self.create_task("research")
# Store user_id in task for subscription checks
if task_id in self.task_storage:
self.task_storage[task_id]["user_id"] = user_id
# Start the research operation in the background
asyncio.create_task(self._run_research_task(task_id, request, user_id))
return task_id
def start_outline_task(self, request: BlogOutlineRequest, user_id: str) -> str:
"""Start an outline generation operation and return a task ID."""
task_id = self.create_task("outline")
# Start the outline generation operation in the background
asyncio.create_task(self._run_outline_generation_task(task_id, request, user_id))
return task_id
def start_medium_generation_task(self, request: MediumBlogGenerateRequest, user_id: str) -> str:
"""Start a medium (≤1000 words) full-blog generation task."""
task_id = self.create_task("medium_generation")
asyncio.create_task(self._run_medium_generation_task(task_id, request, user_id))
return task_id
def start_content_generation_task(self, request: MediumBlogGenerateRequest, user_id: str) -> str:
"""Start content generation (full blog via sections) with provider parity.
Internally reuses medium generator pipeline for now but tracked under
distinct task_type 'content_generation' and same polling contract.
Args:
request: Content generation request
user_id: User ID (required for subscription checks and usage tracking)
"""
task_id = self.create_task("content_generation")
asyncio.create_task(self._run_medium_generation_task(task_id, request, user_id))
return task_id
async def _run_research_task(self, task_id: str, request: BlogResearchRequest, user_id: str):
"""Background task to run research and update status with progress messages."""
try:
# Update status to running
self.task_storage[task_id]["status"] = "running"
self.task_storage[task_id]["progress_messages"] = []
# Send initial progress message
await self.update_progress(task_id, "🔍 Starting research operation...")
# Check cache first
await self.update_progress(task_id, "📋 Checking cache for existing research...")
# Run the actual research with progress updates (pass user_id for subscription checks)
result = await self.service.research_with_progress(request, task_id, user_id)
# Check if research failed gracefully
if not result.success:
await self.update_progress(task_id, f"❌ Research failed: {result.error_message or 'Unknown error'}")
self.task_storage[task_id]["status"] = "failed"
self.task_storage[task_id]["error"] = result.error_message or "Research failed"
else:
await self.update_progress(task_id, f"✅ Research completed successfully! Found {len(result.sources)} sources and {len(result.search_queries or [])} search queries.")
# Update status to completed
self.task_storage[task_id]["status"] = "completed"
self.task_storage[task_id]["result"] = result.dict()
except HTTPException as http_error:
# Handle HTTPException (e.g., 429 subscription limit) - preserve error details for frontend
error_detail = http_error.detail
error_message = error_detail.get('message', str(error_detail)) if isinstance(error_detail, dict) else str(error_detail)
await self.update_progress(task_id, f"{error_message}")
self.task_storage[task_id]["status"] = "failed"
self.task_storage[task_id]["error"] = error_message
# Store HTTP error details for frontend modal
self.task_storage[task_id]["error_status"] = http_error.status_code
self.task_storage[task_id]["error_data"] = error_detail if isinstance(error_detail, dict) else {"error": str(error_detail)}
except Exception as e:
await self.update_progress(task_id, f"❌ Research failed with error: {str(e)}")
# Update status to failed
self.task_storage[task_id]["status"] = "failed"
self.task_storage[task_id]["error"] = str(e)
# Ensure we always send a final completion message
finally:
if task_id in self.task_storage:
current_status = self.task_storage[task_id]["status"]
if current_status not in ["completed", "failed"]:
# Force completion if somehow we didn't set a final status
await self.update_progress(task_id, "⚠️ Research operation completed with unknown status")
self.task_storage[task_id]["status"] = "failed"
self.task_storage[task_id]["error"] = "Research completed with unknown status"
async def _run_outline_generation_task(self, task_id: str, request: BlogOutlineRequest, user_id: str):
"""Background task to run outline generation and update status with progress messages."""
try:
# Update status to running
self.task_storage[task_id]["status"] = "running"
self.task_storage[task_id]["progress_messages"] = []
# Send initial progress message
await self.update_progress(task_id, "🧩 Starting outline generation...")
# Run the actual outline generation with progress updates (pass user_id for subscription checks)
result = await self.service.generate_outline_with_progress(request, task_id, user_id)
# Update status to completed
await self.update_progress(task_id, f"✅ Outline generated successfully! Created {len(result.outline)} sections with {len(result.title_options)} title options.")
self.task_storage[task_id]["status"] = "completed"
self.task_storage[task_id]["result"] = result.dict()
except HTTPException as http_error:
# Handle HTTPException (e.g., 429 subscription limit) - preserve error details for frontend
error_detail = http_error.detail
error_message = error_detail.get('message', str(error_detail)) if isinstance(error_detail, dict) else str(error_detail)
await self.update_progress(task_id, f"{error_message}")
self.task_storage[task_id]["status"] = "failed"
self.task_storage[task_id]["error"] = error_message
# Store HTTP error details for frontend modal
self.task_storage[task_id]["error_status"] = http_error.status_code
self.task_storage[task_id]["error_data"] = error_detail if isinstance(error_detail, dict) else {"error": str(error_detail)}
except Exception as e:
await self.update_progress(task_id, f"❌ Outline generation failed: {str(e)}")
# Update status to failed
self.task_storage[task_id]["status"] = "failed"
self.task_storage[task_id]["error"] = str(e)
async def _run_medium_generation_task(self, task_id: str, request: MediumBlogGenerateRequest, user_id: str):
"""Background task to generate a medium blog using a single structured JSON call."""
try:
self.task_storage[task_id]["status"] = "running"
self.task_storage[task_id]["progress_messages"] = []
await self.update_progress(task_id, "📦 Packaging outline and metadata...")
# Basic guard: respect global target words
total_target = int(request.globalTargetWords or 1000)
if total_target > 1000:
raise ValueError("Global target words exceed 1000; medium generation not allowed")
# Create a sync session for asset saving
db_session = SessionLocal()
try:
result: MediumBlogGenerateResult = await self.service.generate_medium_blog_with_progress(
request,
task_id,
user_id,
db=db_session
)
finally:
db_session.close()
if not result or not getattr(result, "sections", None):
raise ValueError("Empty generation result from model")
# Check if result came from cache
cache_hit = getattr(result, 'cache_hit', False)
if cache_hit:
await self.update_progress(task_id, "⚡ Found cached content - loading instantly!")
else:
await self.update_progress(task_id, "🤖 Generated fresh content with AI...")
await self.update_progress(task_id, "✨ Post-processing and assembling sections...")
# Mark completed
self.task_storage[task_id]["status"] = "completed"
self.task_storage[task_id]["result"] = result.dict()
await self.update_progress(task_id, f"✅ Generated {len(result.sections)} sections successfully.")
# Note: Blog content tracking is handled in the status endpoint
# to ensure we have proper database session and user context
except HTTPException as http_error:
# Handle HTTPException (e.g., 429 subscription limit) - preserve error details for frontend
logger.info(f"[TaskManager] Caught HTTPException in medium generation task {task_id}: status={http_error.status_code}, detail={http_error.detail}")
error_detail = http_error.detail
error_message = error_detail.get('message', str(error_detail)) if isinstance(error_detail, dict) else str(error_detail)
await self.update_progress(task_id, f"{error_message}")
self.task_storage[task_id]["status"] = "failed"
self.task_storage[task_id]["error"] = error_message
# Store HTTP error details for frontend modal
self.task_storage[task_id]["error_status"] = http_error.status_code
self.task_storage[task_id]["error_data"] = error_detail if isinstance(error_detail, dict) else {"error": str(error_detail)}
logger.info(f"[TaskManager] Stored error_status={http_error.status_code} and error_data keys: {list(error_detail.keys()) if isinstance(error_detail, dict) else 'not-dict'}")
except Exception as e:
# Check if this is an HTTPException that got wrapped (can happen in async tasks)
# HTTPException has status_code and detail attributes
logger.info(f"[TaskManager] Caught Exception in medium generation task {task_id}: type={type(e).__name__}, has_status_code={hasattr(e, 'status_code')}, has_detail={hasattr(e, 'detail')}")
if hasattr(e, 'status_code') and hasattr(e, 'detail'):
# This is an HTTPException that was caught as generic Exception
logger.info(f"[TaskManager] Detected HTTPException in Exception handler: status={e.status_code}, detail={e.detail}")
error_detail = e.detail
error_message = error_detail.get('message', str(error_detail)) if isinstance(error_detail, dict) else str(error_detail)
await self.update_progress(task_id, f"{error_message}")
self.task_storage[task_id]["status"] = "failed"
self.task_storage[task_id]["error"] = error_message
# Store HTTP error details for frontend modal
self.task_storage[task_id]["error_status"] = e.status_code
self.task_storage[task_id]["error_data"] = error_detail if isinstance(error_detail, dict) else {"error": str(error_detail)}
logger.info(f"[TaskManager] Stored error_status={e.status_code} and error_data keys: {list(error_detail.keys()) if isinstance(error_detail, dict) else 'not-dict'}")
else:
await self.update_progress(task_id, f"❌ Medium generation failed: {str(e)}")
self.task_storage[task_id]["status"] = "failed"
self.task_storage[task_id]["error"] = str(e)
# Global task manager instance
task_manager = TaskManager()

295
backend/api/brainstorm.py Normal file
View File

@@ -0,0 +1,295 @@
"""
Brainstorming endpoints for generating Google search prompts and running a
single grounded search to surface topic ideas. Built for reusability across
editors. Uses the existing Gemini provider modules.
"""
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel, Field
from typing import List, Dict, Any, Optional
from loguru import logger
from services.llm_providers.gemini_provider import gemini_structured_json_response
try:
from services.llm_providers.gemini_grounded_provider import GeminiGroundedProvider
GROUNDED_AVAILABLE = True
except Exception:
GROUNDED_AVAILABLE = False
router = APIRouter(prefix="/api/brainstorm", tags=["Brainstorming"])
class PersonaPayload(BaseModel):
persona_name: Optional[str] = None
archetype: Optional[str] = None
core_belief: Optional[str] = None
tonal_range: Optional[Dict[str, Any]] = None
linguistic_fingerprint: Optional[Dict[str, Any]] = None
class PlatformPersonaPayload(BaseModel):
content_format_rules: Optional[Dict[str, Any]] = None
engagement_patterns: Optional[Dict[str, Any]] = None
content_types: Optional[Dict[str, Any]] = None
tonal_range: Optional[Dict[str, Any]] = None
class PromptRequest(BaseModel):
seed: str = Field(..., description="Idea seed provided by end user")
persona: Optional[PersonaPayload] = None
platformPersona: Optional[PlatformPersonaPayload] = None
count: int = Field(5, ge=3, le=10, description="Number of prompts to generate (default 5)")
class PromptResponse(BaseModel):
prompts: List[str]
@router.post("/prompts", response_model=PromptResponse)
async def generate_prompts(req: PromptRequest) -> PromptResponse:
"""Generate N high-signal Google search prompts using Gemini structured output."""
try:
persona_line = ""
if req.persona:
parts = []
if req.persona.persona_name:
parts.append(req.persona.persona_name)
if req.persona.archetype:
parts.append(f"({req.persona.archetype})")
persona_line = " ".join(parts)
platform_hints = []
if req.platformPersona and req.platformPersona.content_format_rules:
limit = req.platformPersona.content_format_rules.get("character_limit")
if limit:
platform_hints.append(f"respect LinkedIn character limit {limit}")
sys_prompt = (
"You are an expert LinkedIn strategist who crafts precise Google search prompts "
"to ideate content topics. Follow Google grounding best-practices: be specific, "
"time-bound (2024-2025), include entities, and prefer intent-rich phrasing."
)
prompt = f"""
Seed: {req.seed}
Persona: {persona_line or 'N/A'}
Guidelines:
- Generate {req.count} distinct, high-signal Google search prompts.
- Each prompt should include concrete entities (companies, tools, frameworks) when possible.
- Prefer phrasing that yields recent, authoritative sources.
- Avoid generic phrasing ("latest trends") unless combined with concrete qualifiers.
- Optimize for LinkedIn thought leadership and practicality.
{('Platform hints: ' + ', '.join(platform_hints)) if platform_hints else ''}
Return only the list of prompts.
""".strip()
schema = {
"type": "object",
"properties": {
"prompts": {
"type": "array",
"items": {"type": "string"}
}
}
}
result = gemini_structured_json_response(
prompt=prompt,
schema=schema,
temperature=0.2,
top_p=0.9,
top_k=40,
max_tokens=2048,
system_prompt=sys_prompt,
)
prompts = []
if isinstance(result, dict) and isinstance(result.get("prompts"), list):
prompts = [str(p).strip() for p in result["prompts"] if str(p).strip()]
if not prompts:
# Minimal fallback: derive simple variations
base = req.seed.strip()
prompts = [
f"Recent data-backed insights about {base}",
f"Case studies and benchmarks on {base}",
f"Implementation playbooks for {base}",
f"Common pitfalls and solutions in {base}",
f"Industry leader perspectives on {base}",
]
return PromptResponse(prompts=prompts[: req.count])
except Exception as e:
logger.error(f"Error generating brainstorm prompts: {e}")
raise HTTPException(status_code=500, detail=str(e))
class SearchRequest(BaseModel):
prompt: str = Field(..., description="Selected search prompt to run with grounding")
max_tokens: int = Field(1024, ge=256, le=4096)
class SearchResult(BaseModel):
title: Optional[str] = None
url: Optional[str] = None
snippet: Optional[str] = None
class SearchResponse(BaseModel):
results: List[SearchResult] = []
@router.post("/search", response_model=SearchResponse)
async def run_grounded_search(req: SearchRequest) -> SearchResponse:
"""Run a single grounded Google search via GeminiGroundedProvider and return normalized results."""
if not GROUNDED_AVAILABLE:
raise HTTPException(status_code=503, detail="Grounded provider not available")
try:
provider = GeminiGroundedProvider()
resp = await provider.generate_grounded_content(
prompt=req.prompt,
content_type="linkedin_post",
temperature=0.3,
max_tokens=req.max_tokens,
)
items: List[SearchResult] = []
# Normalize 'sources' if present
for s in (resp.get("sources") or []):
items.append(SearchResult(
title=s.get("title") or "Source",
url=s.get("url") or s.get("link"),
snippet=s.get("content") or s.get("snippet")
))
# Provide minimal fallback if no structured sources are returned
if not items and resp.get("content"):
items.append(SearchResult(title="Generated overview", url=None, snippet=resp.get("content")[:400]))
return SearchResponse(results=items[:10])
except Exception as e:
logger.error(f"Error in grounded search: {e}")
raise HTTPException(status_code=500, detail=str(e))
class IdeasRequest(BaseModel):
seed: str
persona: Optional[PersonaPayload] = None
platformPersona: Optional[PlatformPersonaPayload] = None
results: List[SearchResult] = []
count: int = 5
class IdeaItem(BaseModel):
prompt: str
rationale: Optional[str] = None
class IdeasResponse(BaseModel):
ideas: List[IdeaItem]
@router.post("/ideas", response_model=IdeasResponse)
async def generate_brainstorm_ideas(req: IdeasRequest) -> IdeasResponse:
"""
Create brainstorm ideas by combining persona, seed, and Google search results.
Uses gemini_structured_json_response for consistent output.
"""
try:
# Build compact search context
top_results = req.results[:5]
sources_block = "\n".join(
[
f"- {r.title or 'Source'} | {r.url or ''} | {r.snippet or ''}"
for r in top_results
]
) or "(no sources)"
persona_block = ""
if req.persona:
persona_block = (
f"Persona: {req.persona.persona_name or ''} {('(' + req.persona.archetype + ')') if req.persona.archetype else ''}\n"
)
platform_block = ""
if req.platformPersona and req.platformPersona.content_format_rules:
limit = req.platformPersona.content_format_rules.get("character_limit")
platform_block = f"LinkedIn character limit: {limit}" if limit else ""
sys_prompt = (
"You are an enterprise-grade LinkedIn strategist. Generate specific, non-generic "
"brainstorm prompts suitable for LinkedIn posts or carousels. Use the provided web "
"sources to ground ideas and the persona to align tone and style."
)
prompt = f"""
SEED IDEA: {req.seed}
{persona_block}
{platform_block}
RECENT WEB SOURCES (top {len(top_results)}):
{sources_block}
TASK:
- Propose {req.count} LinkedIn-ready brainstorm prompts tailored to the persona and grounded in the sources.
- Each prompt should be specific and actionable for 20242025.
- Prefer thought-leadership angles, contrarian takes with evidence, or practical playbooks.
- Avoid generic phrases like "latest trends" unless qualified by entities.
Return JSON with an array named ideas where each item has:
- prompt: the exact text the user can use to generate a post
- rationale: 12 sentence why this works for the audience/persona
""".strip()
schema = {
"type": "object",
"properties": {
"ideas": {
"type": "array",
"items": {
"type": "object",
"properties": {
"prompt": {"type": "string"},
"rationale": {"type": "string"},
},
},
}
},
}
result = gemini_structured_json_response(
prompt=prompt,
schema=schema,
temperature=0.2,
top_p=0.9,
top_k=40,
max_tokens=2048,
system_prompt=sys_prompt,
)
ideas: List[IdeaItem] = []
if isinstance(result, dict) and isinstance(result.get("ideas"), list):
for item in result["ideas"]:
if isinstance(item, dict) and item.get("prompt"):
ideas.append(IdeaItem(prompt=item["prompt"], rationale=item.get("rationale")))
if not ideas:
# Fallback basic ideas from seed if model returns nothing
ideas = [
IdeaItem(prompt=f"Explain why {req.seed} matters now with 2 recent stats", rationale="Timely and data-backed."),
IdeaItem(prompt=f"Common pitfalls in {req.seed} and how to avoid them", rationale="Actionable and experience-based."),
IdeaItem(prompt=f"A step-by-step playbook to implement {req.seed}", rationale="Practical value."),
IdeaItem(prompt=f"Case study: measurable impact of {req.seed}", rationale="Story + ROI."),
IdeaItem(prompt=f"Contrarian take: what most get wrong about {req.seed}", rationale="Thought leadership.")
]
return IdeasResponse(ideas=ideas[: req.count])
except Exception as e:
logger.error(f"Error generating brainstorm ideas: {e}")
raise HTTPException(status_code=500, detail=str(e))

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,2 @@
# Content Assets API Module

View File

@@ -0,0 +1,667 @@
"""
Content Assets API Router
API endpoints for managing unified content assets across all modules.
"""
from fastapi import APIRouter, Depends, HTTPException, Query, Body
from sqlalchemy.orm import Session
from typing import List, Optional, Dict, Any
from pydantic import BaseModel, Field
from datetime import datetime
from services.database import get_db
from middleware.auth_middleware import get_current_user
from services.content_asset_service import ContentAssetService
from models.content_asset_models import AssetType, AssetSource, AssetCollection
router = APIRouter(prefix="/api/content-assets", tags=["Content Assets"])
class AssetResponse(BaseModel):
"""Response model for asset data."""
id: int
user_id: str
asset_type: str
source_module: str
filename: str
file_url: str
file_path: Optional[str] = None
file_size: Optional[int] = None
mime_type: Optional[str] = None
title: Optional[str] = None
description: Optional[str] = None
prompt: Optional[str] = None
tags: List[str] = []
asset_metadata: Dict[str, Any] = {}
provider: Optional[str] = None
model: Optional[str] = None
cost: float = 0.0
generation_time: Optional[float] = None
is_favorite: bool = False
download_count: int = 0
share_count: int = 0
created_at: datetime
updated_at: datetime
class Config:
from_attributes = True
class AssetListResponse(BaseModel):
"""Response model for asset list."""
assets: List[AssetResponse]
total: int
limit: int
offset: int
@router.get("/", response_model=AssetListResponse)
async def get_assets(
asset_type: Optional[str] = Query(None, description="Filter by asset type"),
source_module: Optional[str] = Query(None, description="Filter by source module"),
search: Optional[str] = Query(None, description="Search query"),
tags: Optional[str] = Query(None, description="Comma-separated tags"),
favorites_only: bool = Query(False, description="Only favorites"),
collection_id: Optional[int] = Query(None, description="Filter by collection ID"),
date_from: Optional[str] = Query(None, description="Filter from date (ISO format)"),
date_to: Optional[str] = Query(None, description="Filter to date (ISO format)"),
sort_by: str = Query("created_at", description="Sort by: created_at, updated_at, cost, file_size, title"),
sort_order: str = Query("desc", description="Sort order: asc or desc"),
limit: int = Query(100, ge=1, le=500),
offset: int = Query(0, ge=0),
db: Session = Depends(get_db),
current_user: Dict[str, Any] = Depends(get_current_user),
):
"""Get user's content assets with optional filtering."""
try:
# Auth middleware returns 'id' as the primary key
user_id = current_user.get("id") or current_user.get("user_id") or current_user.get("clerk_user_id")
if not user_id:
raise HTTPException(status_code=401, detail="User ID not found")
service = ContentAssetService(db)
# Parse filters
asset_type_enum = None
if asset_type:
try:
asset_type_enum = AssetType(asset_type.lower())
except ValueError:
raise HTTPException(status_code=400, detail=f"Invalid asset type: {asset_type}")
source_module_enum = None
if source_module:
try:
source_module_enum = AssetSource(source_module.lower())
except ValueError:
raise HTTPException(status_code=400, detail=f"Invalid source module: {source_module}")
tags_list = None
if tags:
tags_list = [tag.strip() for tag in tags.split(",")]
# Parse date filters
date_from_obj = None
if date_from:
try:
date_from_obj = datetime.fromisoformat(date_from.replace('Z', '+00:00'))
except ValueError:
raise HTTPException(status_code=400, detail="Invalid date_from format. Use ISO format.")
date_to_obj = None
if date_to:
try:
date_to_obj = datetime.fromisoformat(date_to.replace('Z', '+00:00'))
except ValueError:
raise HTTPException(status_code=400, detail="Invalid date_to format. Use ISO format.")
# Validate sort parameters
valid_sort_by = ["created_at", "updated_at", "cost", "file_size", "title"]
if sort_by not in valid_sort_by:
raise HTTPException(status_code=400, detail=f"Invalid sort_by. Must be one of: {', '.join(valid_sort_by)}")
if sort_order not in ["asc", "desc"]:
raise HTTPException(status_code=400, detail="Invalid sort_order. Must be 'asc' or 'desc'")
assets, total = service.get_user_assets(
user_id=user_id,
asset_type=asset_type_enum,
source_module=source_module_enum,
search_query=search,
tags=tags_list,
favorites_only=favorites_only,
collection_id=collection_id,
date_from=date_from_obj,
date_to=date_to_obj,
sort_by=sort_by,
sort_order=sort_order,
limit=limit,
offset=offset,
)
return AssetListResponse(
assets=[AssetResponse.model_validate(asset) for asset in assets],
total=total,
limit=limit,
offset=offset,
)
except HTTPException:
raise
except Exception as e:
raise HTTPException(status_code=500, detail=f"Error fetching assets: {str(e)}")
class AssetCreateRequest(BaseModel):
"""Request model for creating a new asset."""
asset_type: str = Field(..., description="Asset type: text, image, video, or audio")
source_module: str = Field(..., description="Source module that generated the asset")
filename: str = Field(..., description="Original filename")
file_url: str = Field(..., description="Public URL to access the asset")
file_path: Optional[str] = Field(None, description="Server file path (optional)")
file_size: Optional[int] = Field(None, description="File size in bytes")
mime_type: Optional[str] = Field(None, description="MIME type")
title: Optional[str] = Field(None, description="Asset title")
description: Optional[str] = Field(None, description="Asset description")
prompt: Optional[str] = Field(None, description="Generation prompt")
tags: Optional[List[str]] = Field(default_factory=list, description="List of tags")
asset_metadata: Optional[Dict[str, Any]] = Field(default_factory=dict, description="Additional metadata")
provider: Optional[str] = Field(None, description="AI provider used")
model: Optional[str] = Field(None, description="Model used")
cost: Optional[float] = Field(0.0, description="Generation cost")
generation_time: Optional[float] = Field(None, description="Generation time in seconds")
@router.post("/", response_model=AssetResponse)
async def create_asset(
asset_data: AssetCreateRequest,
db: Session = Depends(get_db),
current_user: Dict[str, Any] = Depends(get_current_user),
):
"""Create a new content asset."""
try:
user_id = current_user.get("user_id") or current_user.get("id")
if not user_id:
raise HTTPException(status_code=401, detail="User ID not found")
# Validate asset type
try:
asset_type_enum = AssetType(asset_data.asset_type.lower())
except ValueError:
raise HTTPException(status_code=400, detail=f"Invalid asset type: {asset_data.asset_type}")
# Validate source module
try:
source_module_enum = AssetSource(asset_data.source_module.lower())
except ValueError:
raise HTTPException(status_code=400, detail=f"Invalid source module: {asset_data.source_module}")
service = ContentAssetService(db)
asset = service.create_asset(
user_id=user_id,
asset_type=asset_type_enum,
source_module=source_module_enum,
filename=asset_data.filename,
file_url=asset_data.file_url,
file_path=asset_data.file_path,
file_size=asset_data.file_size,
mime_type=asset_data.mime_type,
title=asset_data.title,
description=asset_data.description,
prompt=asset_data.prompt,
tags=asset_data.tags or [],
asset_metadata=asset_data.asset_metadata or {},
provider=asset_data.provider,
model=asset_data.model,
cost=asset_data.cost,
generation_time=asset_data.generation_time,
)
return AssetResponse.model_validate(asset)
except HTTPException:
raise
except Exception as e:
raise HTTPException(status_code=500, detail=f"Error creating asset: {str(e)}")
@router.post("/{asset_id}/favorite", response_model=Dict[str, Any])
async def toggle_favorite(
asset_id: int,
db: Session = Depends(get_db),
current_user: Dict[str, Any] = Depends(get_current_user),
):
"""Toggle favorite status of an asset."""
try:
user_id = current_user.get("user_id") or current_user.get("id")
if not user_id:
raise HTTPException(status_code=401, detail="User ID not found")
service = ContentAssetService(db)
is_favorite = service.toggle_favorite(asset_id, user_id)
return {"asset_id": asset_id, "is_favorite": is_favorite}
except HTTPException:
raise
except Exception as e:
raise HTTPException(status_code=500, detail=f"Error toggling favorite: {str(e)}")
@router.delete("/{asset_id}", response_model=Dict[str, Any])
async def delete_asset(
asset_id: int,
db: Session = Depends(get_db),
current_user: Dict[str, Any] = Depends(get_current_user),
):
"""Delete an asset."""
try:
user_id = current_user.get("user_id") or current_user.get("id")
if not user_id:
raise HTTPException(status_code=401, detail="User ID not found")
service = ContentAssetService(db)
success = service.delete_asset(asset_id, user_id)
if not success:
raise HTTPException(status_code=404, detail="Asset not found")
return {"asset_id": asset_id, "deleted": True}
except HTTPException:
raise
except Exception as e:
raise HTTPException(status_code=500, detail=f"Error deleting asset: {str(e)}")
@router.post("/{asset_id}/usage", response_model=Dict[str, Any])
async def track_usage(
asset_id: int,
action: str = Query(..., description="Action: download, share, or access"),
db: Session = Depends(get_db),
current_user: Dict[str, Any] = Depends(get_current_user),
):
"""Track asset usage (download, share, access)."""
try:
user_id = current_user.get("user_id") or current_user.get("id")
if not user_id:
raise HTTPException(status_code=401, detail="User ID not found")
if action not in ["download", "share", "access"]:
raise HTTPException(status_code=400, detail="Invalid action")
service = ContentAssetService(db)
service.update_asset_usage(asset_id, user_id, action)
return {"asset_id": asset_id, "action": action, "tracked": True}
except HTTPException:
raise
except Exception as e:
raise HTTPException(status_code=500, detail=f"Error tracking usage: {str(e)}")
class AssetUpdateRequest(BaseModel):
"""Request model for updating asset metadata."""
title: Optional[str] = None
description: Optional[str] = None
tags: Optional[List[str]] = None
asset_metadata: Optional[Dict[str, Any]] = None
@router.put("/{asset_id}", response_model=AssetResponse)
async def update_asset(
asset_id: int,
update_data: AssetUpdateRequest,
db: Session = Depends(get_db),
current_user: Dict[str, Any] = Depends(get_current_user),
):
"""Update asset metadata."""
try:
user_id = current_user.get("user_id") or current_user.get("id")
if not user_id:
raise HTTPException(status_code=401, detail="User ID not found")
service = ContentAssetService(db)
asset = service.update_asset(
asset_id=asset_id,
user_id=user_id,
title=update_data.title,
description=update_data.description,
tags=update_data.tags,
asset_metadata=update_data.asset_metadata,
)
if not asset:
raise HTTPException(status_code=404, detail="Asset not found")
return AssetResponse.model_validate(asset)
except HTTPException:
raise
except Exception as e:
raise HTTPException(status_code=500, detail=f"Error updating asset: {str(e)}")
@router.get("/statistics", response_model=Dict[str, Any])
async def get_statistics(
db: Session = Depends(get_db),
current_user: Dict[str, Any] = Depends(get_current_user),
):
"""Get asset statistics for the current user."""
try:
user_id = current_user.get("user_id") or current_user.get("id")
if not user_id:
raise HTTPException(status_code=401, detail="User ID not found")
service = ContentAssetService(db)
stats = service.get_asset_statistics(user_id)
return stats
except HTTPException:
raise
except Exception as e:
raise HTTPException(status_code=500, detail=f"Error fetching statistics: {str(e)}")
# ==================== Collection Endpoints ====================
class CollectionResponse(BaseModel):
"""Response model for collection data."""
id: int
user_id: str
name: str
description: Optional[str] = None
is_public: bool = False
cover_asset_id: Optional[int] = None
asset_count: int = 0
created_at: datetime
updated_at: datetime
class Config:
from_attributes = True
class CollectionListResponse(BaseModel):
"""Response model for collection list."""
collections: List[CollectionResponse]
total: int
limit: int
offset: int
class CollectionCreateRequest(BaseModel):
"""Request model for creating a collection."""
name: str = Field(..., description="Collection name")
description: Optional[str] = Field(None, description="Collection description")
is_public: bool = Field(False, description="Whether collection is public")
class CollectionUpdateRequest(BaseModel):
"""Request model for updating a collection."""
name: Optional[str] = None
description: Optional[str] = None
is_public: Optional[bool] = None
cover_asset_id: Optional[int] = None
@router.post("/collections", response_model=CollectionResponse)
async def create_collection(
collection_data: CollectionCreateRequest,
db: Session = Depends(get_db),
current_user: Dict[str, Any] = Depends(get_current_user),
):
"""Create a new asset collection."""
try:
user_id = current_user.get("user_id") or current_user.get("id")
if not user_id:
raise HTTPException(status_code=401, detail="User ID not found")
service = ContentAssetService(db)
collection = service.create_collection(
user_id=user_id,
name=collection_data.name,
description=collection_data.description,
is_public=collection_data.is_public,
)
# Get asset count
assets, _ = service.get_collection_assets(collection.id, user_id, limit=1, offset=0)
asset_count = len(assets)
response = CollectionResponse.model_validate(collection)
response.asset_count = asset_count
return response
except HTTPException:
raise
except Exception as e:
raise HTTPException(status_code=500, detail=f"Error creating collection: {str(e)}")
@router.get("/collections", response_model=CollectionListResponse)
async def get_collections(
limit: int = Query(100, ge=1, le=500),
offset: int = Query(0, ge=0),
db: Session = Depends(get_db),
current_user: Dict[str, Any] = Depends(get_current_user),
):
"""Get user's collections."""
try:
user_id = current_user.get("user_id") or current_user.get("id")
if not user_id:
raise HTTPException(status_code=401, detail="User ID not found")
service = ContentAssetService(db)
collections, total = service.get_user_collections(user_id, limit=limit, offset=offset)
# Get asset counts for each collection
collection_responses = []
for collection in collections:
assets, _ = service.get_collection_assets(collection.id, user_id, limit=1, offset=0)
response = CollectionResponse.model_validate(collection)
response.asset_count = len(assets)
collection_responses.append(response)
return CollectionListResponse(
collections=collection_responses,
total=total,
limit=limit,
offset=offset,
)
except HTTPException:
raise
except Exception as e:
raise HTTPException(status_code=500, detail=f"Error fetching collections: {str(e)}")
@router.get("/collections/{collection_id}", response_model=CollectionResponse)
async def get_collection(
collection_id: int,
db: Session = Depends(get_db),
current_user: Dict[str, Any] = Depends(get_current_user),
):
"""Get a specific collection."""
try:
user_id = current_user.get("user_id") or current_user.get("id")
if not user_id:
raise HTTPException(status_code=401, detail="User ID not found")
service = ContentAssetService(db)
collection = service.get_collection_by_id(collection_id, user_id)
if not collection:
raise HTTPException(status_code=404, detail="Collection not found")
assets, _ = service.get_collection_assets(collection.id, user_id, limit=1, offset=0)
response = CollectionResponse.model_validate(collection)
response.asset_count = len(assets)
return response
except HTTPException:
raise
except Exception as e:
raise HTTPException(status_code=500, detail=f"Error fetching collection: {str(e)}")
@router.put("/collections/{collection_id}", response_model=CollectionResponse)
async def update_collection(
collection_id: int,
update_data: CollectionUpdateRequest,
db: Session = Depends(get_db),
current_user: Dict[str, Any] = Depends(get_current_user),
):
"""Update collection metadata."""
try:
user_id = current_user.get("user_id") or current_user.get("id")
if not user_id:
raise HTTPException(status_code=401, detail="User ID not found")
service = ContentAssetService(db)
collection = service.update_collection(
collection_id=collection_id,
user_id=user_id,
name=update_data.name,
description=update_data.description,
is_public=update_data.is_public,
cover_asset_id=update_data.cover_asset_id,
)
if not collection:
raise HTTPException(status_code=404, detail="Collection not found")
assets, _ = service.get_collection_assets(collection.id, user_id, limit=1, offset=0)
response = CollectionResponse.model_validate(collection)
response.asset_count = len(assets)
return response
except HTTPException:
raise
except Exception as e:
raise HTTPException(status_code=500, detail=f"Error updating collection: {str(e)}")
@router.delete("/collections/{collection_id}", response_model=Dict[str, Any])
async def delete_collection(
collection_id: int,
db: Session = Depends(get_db),
current_user: Dict[str, Any] = Depends(get_current_user),
):
"""Delete a collection."""
try:
user_id = current_user.get("user_id") or current_user.get("id")
if not user_id:
raise HTTPException(status_code=401, detail="User ID not found")
service = ContentAssetService(db)
success = service.delete_collection(collection_id, user_id)
if not success:
raise HTTPException(status_code=404, detail="Collection not found")
return {"collection_id": collection_id, "deleted": True}
except HTTPException:
raise
except Exception as e:
raise HTTPException(status_code=500, detail=f"Error deleting collection: {str(e)}")
@router.get("/collections/{collection_id}/assets", response_model=AssetListResponse)
async def get_collection_assets(
collection_id: int,
limit: int = Query(100, ge=1, le=500),
offset: int = Query(0, ge=0),
db: Session = Depends(get_db),
current_user: Dict[str, Any] = Depends(get_current_user),
):
"""Get all assets in a collection."""
try:
user_id = current_user.get("user_id") or current_user.get("id")
if not user_id:
raise HTTPException(status_code=401, detail="User ID not found")
service = ContentAssetService(db)
collection = service.get_collection_by_id(collection_id, user_id)
if not collection:
raise HTTPException(status_code=404, detail="Collection not found")
assets, total = service.get_collection_assets(collection_id, user_id, limit=limit, offset=offset)
return AssetListResponse(
assets=[AssetResponse.model_validate(asset) for asset in assets],
total=total,
limit=limit,
offset=offset,
)
except HTTPException:
raise
except Exception as e:
raise HTTPException(status_code=500, detail=f"Error fetching collection assets: {str(e)}")
class CollectionAssetsRequest(BaseModel):
"""Request model for adding/removing assets from collection."""
asset_ids: List[int] = Field(..., description="List of asset IDs")
@router.post("/collections/{collection_id}/assets", response_model=Dict[str, Any])
async def add_assets_to_collection(
collection_id: int,
request: CollectionAssetsRequest,
db: Session = Depends(get_db),
current_user: Dict[str, Any] = Depends(get_current_user),
):
"""Add assets to a collection."""
try:
user_id = current_user.get("user_id") or current_user.get("id")
if not user_id:
raise HTTPException(status_code=401, detail="User ID not found")
service = ContentAssetService(db)
count = service.add_assets_to_collection(collection_id, user_id, request.asset_ids)
return {
"collection_id": collection_id,
"assets_added": count,
"asset_ids": request.asset_ids,
}
except HTTPException:
raise
except Exception as e:
raise HTTPException(status_code=500, detail=f"Error adding assets to collection: {str(e)}")
@router.delete("/collections/{collection_id}/assets", response_model=Dict[str, Any])
async def remove_assets_from_collection(
collection_id: int,
request: CollectionAssetsRequest,
db: Session = Depends(get_db),
current_user: Dict[str, Any] = Depends(get_current_user),
):
"""Remove assets from a collection."""
try:
user_id = current_user.get("user_id") or current_user.get("id")
if not user_id:
raise HTTPException(status_code=401, detail="User ID not found")
service = ContentAssetService(db)
count = service.remove_assets_from_collection(collection_id, user_id, request.asset_ids)
return {
"collection_id": collection_id,
"assets_removed": count,
"asset_ids": request.asset_ids,
}
except HTTPException:
raise
except Exception as e:
raise HTTPException(status_code=500, detail=f"Error removing assets from collection: {str(e)}")

View File

@@ -0,0 +1,445 @@
# Content Planning API - Modular Architecture
## Overview
The Content Planning API has been refactored from a monolithic structure into a modular, maintainable architecture. This document provides comprehensive documentation for the new modular structure.
## Architecture
```
backend/api/content_planning/
├── __init__.py
├── api/
│ ├── __init__.py
│ ├── routes/
│ │ ├── __init__.py
│ │ ├── strategies.py # Strategy management endpoints
│ │ ├── calendar_events.py # Calendar event endpoints
│ │ ├── gap_analysis.py # Content gap analysis endpoints
│ │ ├── ai_analytics.py # AI analytics endpoints
│ │ ├── calendar_generation.py # Calendar generation endpoints
│ │ └── health_monitoring.py # Health monitoring endpoints
│ ├── models/
│ │ ├── __init__.py
│ │ ├── requests.py # Request models
│ │ └── responses.py # Response models
│ └── router.py # Main router
├── services/
│ ├── __init__.py
│ ├── strategy_service.py # Strategy business logic
│ ├── calendar_service.py # Calendar business logic
│ ├── gap_analysis_service.py # Gap analysis business logic
│ ├── ai_analytics_service.py # AI analytics business logic
│ └── calendar_generation_service.py # Calendar generation business logic
├── utils/
│ ├── __init__.py
│ ├── error_handlers.py # Centralized error handling
│ ├── response_builders.py # Response formatting
│ └── constants.py # API constants
└── tests/
├── __init__.py
├── functionality_test.py # Functionality tests
├── before_after_test.py # Before/after comparison tests
└── test_data.py # Test data fixtures
```
## API Endpoints
### Base URL
```
/api/content-planning
```
### Health Check
```
GET /health
```
Returns the operational status of all content planning modules.
### Strategy Management
#### Create Strategy
```
POST /strategies/
```
Creates a new content strategy.
**Request Body:**
```json
{
"user_id": 1,
"name": "Digital Marketing Strategy",
"industry": "technology",
"target_audience": {
"demographics": ["professionals", "business_owners"],
"interests": ["digital_marketing", "content_creation"]
},
"content_pillars": [
{
"name": "Educational Content",
"description": "How-to guides and tutorials"
}
]
}
```
#### Get Strategies
```
GET /strategies/?user_id=1
```
Retrieves content strategies for a user.
#### Get Strategy by ID
```
GET /strategies/{strategy_id}
```
Retrieves a specific strategy by ID.
#### Update Strategy
```
PUT /strategies/{strategy_id}
```
Updates an existing strategy.
#### Delete Strategy
```
DELETE /strategies/{strategy_id}
```
Deletes a strategy.
### Calendar Events
#### Create Calendar Event
```
POST /calendar-events/
```
Creates a new calendar event.
**Request Body:**
```json
{
"strategy_id": 1,
"title": "Blog Post: AI in Marketing",
"description": "Comprehensive guide on AI applications in marketing",
"content_type": "blog",
"platform": "website",
"scheduled_date": "2024-08-15T10:00:00Z"
}
```
#### Get Calendar Events
```
GET /calendar-events/?strategy_id=1
```
Retrieves calendar events, optionally filtered by strategy.
#### Get Calendar Event by ID
```
GET /calendar-events/{event_id}
```
Retrieves a specific calendar event.
#### Update Calendar Event
```
PUT /calendar-events/{event_id}
```
Updates an existing calendar event.
#### Delete Calendar Event
```
DELETE /calendar-events/{event_id}
```
Deletes a calendar event.
### Content Gap Analysis
#### Get Gap Analysis
```
GET /gap-analysis/?user_id=1&force_refresh=false
```
Retrieves content gap analysis with AI insights.
**Query Parameters:**
- `user_id`: User ID (optional, defaults to 1)
- `strategy_id`: Strategy ID (optional)
- `force_refresh`: Force refresh analysis (default: false)
#### Create Gap Analysis
```
POST /gap-analysis/
```
Creates a new content gap analysis.
**Request Body:**
```json
{
"user_id": 1,
"website_url": "https://example.com",
"competitor_urls": ["https://competitor1.com", "https://competitor2.com"],
"target_keywords": ["digital marketing", "content creation"],
"industry": "technology"
}
```
#### Analyze Content Gaps
```
POST /gap-analysis/analyze
```
Performs comprehensive content gap analysis.
**Request Body:**
```json
{
"website_url": "https://example.com",
"competitor_urls": ["https://competitor1.com"],
"target_keywords": ["digital marketing"],
"industry": "technology"
}
```
### AI Analytics
#### Get AI Analytics
```
GET /ai-analytics/?user_id=1&force_refresh=false
```
Retrieves AI-powered analytics and insights.
**Query Parameters:**
- `user_id`: User ID (optional, defaults to 1)
- `strategy_id`: Strategy ID (optional)
- `force_refresh`: Force refresh analysis (default: false)
#### Content Evolution Analysis
```
POST /ai-analytics/content-evolution
```
Analyzes content evolution over time.
**Request Body:**
```json
{
"strategy_id": 1,
"time_period": "30d"
}
```
#### Performance Trends Analysis
```
POST /ai-analytics/performance-trends
```
Analyzes performance trends.
**Request Body:**
```json
{
"strategy_id": 1,
"metrics": ["engagement_rate", "reach", "conversion_rate"]
}
```
#### Strategic Intelligence
```
POST /ai-analytics/strategic-intelligence
```
Generates strategic intelligence insights.
**Request Body:**
```json
{
"strategy_id": 1,
"market_data": {
"industry_trends": ["AI adoption", "Digital transformation"],
"competitor_analysis": ["competitor1.com", "competitor2.com"]
}
}
```
### Calendar Generation
#### Generate Comprehensive Calendar
```
POST /calendar-generation/generate-calendar
```
Generates a comprehensive AI-powered content calendar.
**Request Body:**
```json
{
"user_id": 1,
"strategy_id": 1,
"calendar_type": "monthly",
"industry": "technology",
"business_size": "sme",
"force_refresh": false
}
```
#### Optimize Content for Platform
```
POST /calendar-generation/optimize-content
```
Optimizes content for specific platforms.
**Request Body:**
```json
{
"user_id": 1,
"title": "AI Marketing Guide",
"description": "Comprehensive guide on AI in marketing",
"content_type": "blog",
"target_platform": "linkedin"
}
```
#### Predict Content Performance
```
POST /calendar-generation/performance-predictions
```
Predicts content performance using AI.
**Request Body:**
```json
{
"user_id": 1,
"strategy_id": 1,
"content_type": "blog",
"platform": "linkedin",
"content_data": {
"title": "AI Marketing Guide",
"description": "Comprehensive guide on AI in marketing"
}
}
```
#### Get Trending Topics
```
GET /calendar-generation/trending-topics?user_id=1&industry=technology&limit=10
```
Retrieves trending topics relevant to the user's industry.
**Query Parameters:**
- `user_id`: User ID (required)
- `industry`: Industry (required)
- `limit`: Number of topics to return (default: 10)
#### Get Comprehensive User Data
```
GET /calendar-generation/comprehensive-user-data?user_id=1
```
Retrieves comprehensive user data for calendar generation.
**Query Parameters:**
- `user_id`: User ID (required)
### Health Monitoring
#### Backend Health Check
```
GET /health/backend
```
Checks core backend health (independent of AI services).
#### AI Services Health Check
```
GET /health/ai
```
Checks AI services health separately.
#### Database Health Check
```
GET /health/database
```
Checks database connectivity and operations.
#### Calendar Generation Health Check
```
GET /calendar-generation/health
```
Checks calendar generation services health.
## Response Formats
### Success Response
```json
{
"status": "success",
"data": {...},
"message": "Operation completed successfully",
"timestamp": "2024-08-01T10:00:00Z"
}
```
### Error Response
```json
{
"status": "error",
"error": "Error description",
"message": "Detailed error message",
"timestamp": "2024-08-01T10:00:00Z"
}
```
### Health Check Response
```json
{
"service": "content_planning",
"status": "healthy",
"timestamp": "2024-08-01T10:00:00Z",
"modules": {
"strategies": "operational",
"calendar_events": "operational",
"gap_analysis": "operational",
"ai_analytics": "operational",
"calendar_generation": "operational",
"health_monitoring": "operational"
},
"version": "2.0.0",
"architecture": "modular"
}
```
## Error Codes
- `200`: Success
- `400`: Bad Request - Invalid input data
- `404`: Not Found - Resource not found
- `422`: Validation Error - Request validation failed
- `500`: Internal Server Error - Server-side error
- `503`: Service Unavailable - AI services unavailable
## Authentication
All endpoints require proper authentication. Include authentication headers as required by your application.
## Rate Limiting
API requests are subject to rate limiting to ensure fair usage and system stability.
## Caching
The API implements intelligent caching for:
- AI analysis results (24-hour cache)
- User data and preferences
- Strategy and calendar data
## Versioning
Current API version: `2.0.0`
The API follows semantic versioning. Breaking changes will be communicated in advance.
## Migration from Monolithic Structure
The API has been migrated from a monolithic structure to a modular architecture. Key improvements:
1. **Separation of Concerns**: Business logic separated from API routes
2. **Service Layer**: Dedicated services for each domain
3. **Error Handling**: Centralized and standardized error handling
4. **Performance**: Optimized imports and dependencies
5. **Maintainability**: Smaller, focused modules
6. **Testability**: Isolated components for better testing
## Support
For API support and questions, please refer to the project documentation or contact the development team.

View File

View File

@@ -0,0 +1,8 @@
"""
Content Strategy API Module
Modular API endpoints for content strategy functionality.
"""
from .routes import router
__all__ = ["router"]

View File

@@ -0,0 +1,13 @@
"""
Strategy Endpoints Module
CRUD, analytics, utility, streaming, autofill, and AI generation endpoints for content strategies.
"""
from .strategy_crud import router as crud_router
from .analytics_endpoints import router as analytics_router
from .utility_endpoints import router as utility_router
from .streaming_endpoints import router as streaming_router
from .autofill_endpoints import router as autofill_router
from .ai_generation_endpoints import router as ai_generation_router
__all__ = ["crud_router", "analytics_router", "utility_router", "streaming_router", "autofill_router", "ai_generation_router"]

View File

@@ -0,0 +1,780 @@
"""
AI Generation Endpoints
Handles AI-powered strategy generation endpoints.
"""
from typing import Dict, Any, Optional
from fastapi import APIRouter, Depends, HTTPException, Query
from sqlalchemy.orm import Session
from loguru import logger
from datetime import datetime
# Import database
from services.database import get_db_session
# Import services
from ....services.content_strategy.ai_generation import AIStrategyGenerator, StrategyGenerationConfig
from ....services.enhanced_strategy_service import EnhancedStrategyService
from ....services.enhanced_strategy_db_service import EnhancedStrategyDBService
# Import educational content manager
from .content_strategy.educational_content import EducationalContentManager
# Import utilities
from ....utils.error_handlers import ContentPlanningErrorHandler
from ....utils.response_builders import ResponseBuilder
from ....utils.constants import ERROR_MESSAGES, SUCCESS_MESSAGES
router = APIRouter(tags=["AI Strategy Generation"])
# Helper function to get database session
def get_db():
db = get_db_session()
try:
yield db
finally:
db.close()
# Global storage for latest strategies (more persistent than task status)
_latest_strategies = {}
@router.post("/generate-comprehensive-strategy")
async def generate_comprehensive_strategy(
user_id: int,
strategy_name: Optional[str] = None,
config: Optional[Dict[str, Any]] = None,
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Generate a comprehensive AI-powered content strategy."""
try:
logger.info(f"🚀 Generating comprehensive AI strategy for user: {user_id}")
# Get user context and onboarding data
db_service = EnhancedStrategyDBService(db)
enhanced_service = EnhancedStrategyService(db_service)
# Get onboarding data for context
onboarding_data = await enhanced_service._get_onboarding_data(user_id)
# Build context for AI generation
context = {
"onboarding_data": onboarding_data,
"user_id": user_id,
"generation_config": config or {}
}
# Create strategy generation config
generation_config = StrategyGenerationConfig(
include_competitive_analysis=config.get("include_competitive_analysis", True) if config else True,
include_content_calendar=config.get("include_content_calendar", True) if config else True,
include_performance_predictions=config.get("include_performance_predictions", True) if config else True,
include_implementation_roadmap=config.get("include_implementation_roadmap", True) if config else True,
include_risk_assessment=config.get("include_risk_assessment", True) if config else True,
max_content_pieces=config.get("max_content_pieces", 50) if config else 50,
timeline_months=config.get("timeline_months", 12) if config else 12
)
# Initialize AI strategy generator
strategy_generator = AIStrategyGenerator(generation_config)
# Generate comprehensive strategy
comprehensive_strategy = await strategy_generator.generate_comprehensive_strategy(
user_id=user_id,
context=context,
strategy_name=strategy_name
)
logger.info(f"✅ Comprehensive AI strategy generated successfully for user: {user_id}")
return ResponseBuilder.create_success_response(
message="Comprehensive AI strategy generated successfully",
data=comprehensive_strategy
)
except RuntimeError as e:
logger.error(f"❌ AI service error generating comprehensive strategy: {str(e)}")
raise HTTPException(
status_code=503,
detail=f"AI service temporarily unavailable: {str(e)}"
)
except Exception as e:
logger.error(f"❌ Error generating comprehensive strategy: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "generate_comprehensive_strategy")
@router.post("/generate-strategy-component")
async def generate_strategy_component(
user_id: int,
component_type: str,
base_strategy: Optional[Dict[str, Any]] = None,
context: Optional[Dict[str, Any]] = None,
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Generate a specific strategy component using AI."""
try:
logger.info(f"🚀 Generating strategy component '{component_type}' for user: {user_id}")
# Validate component type
valid_components = [
"strategic_insights",
"competitive_analysis",
"content_calendar",
"performance_predictions",
"implementation_roadmap",
"risk_assessment"
]
if component_type not in valid_components:
raise HTTPException(
status_code=400,
detail=f"Invalid component type. Must be one of: {valid_components}"
)
# Get context if not provided
if not context:
db_service = EnhancedStrategyDBService(db)
enhanced_service = EnhancedStrategyService(db_service)
onboarding_data = await enhanced_service._get_onboarding_data(user_id)
context = {"onboarding_data": onboarding_data, "user_id": user_id}
# Get base strategy if not provided
if not base_strategy:
# Generate base strategy using autofill
from ....services.content_strategy.autofill.ai_structured_autofill import AIStructuredAutofillService
autofill_service = AIStructuredAutofillService()
autofill_result = await autofill_service.generate_autofill_fields(user_id, context)
base_strategy = autofill_result.get("fields", {})
# Initialize AI strategy generator
strategy_generator = AIStrategyGenerator()
# Generate specific component
if component_type == "strategic_insights":
component = await strategy_generator._generate_strategic_insights(base_strategy, context)
elif component_type == "competitive_analysis":
component = await strategy_generator._generate_competitive_analysis(base_strategy, context)
elif component_type == "content_calendar":
component = await strategy_generator._generate_content_calendar(base_strategy, context)
elif component_type == "performance_predictions":
component = await strategy_generator._generate_performance_predictions(base_strategy, context)
elif component_type == "implementation_roadmap":
component = await strategy_generator._generate_implementation_roadmap(base_strategy, context)
elif component_type == "risk_assessment":
component = await strategy_generator._generate_risk_assessment(base_strategy, context)
logger.info(f"✅ Strategy component '{component_type}' generated successfully for user: {user_id}")
return ResponseBuilder.create_success_response(
message=f"Strategy component '{component_type}' generated successfully",
data={
"component_type": component_type,
"component_data": component,
"generated_at": datetime.utcnow().isoformat(),
"user_id": user_id
}
)
except RuntimeError as e:
logger.error(f"❌ AI service error generating strategy component: {str(e)}")
raise HTTPException(
status_code=503,
detail=f"AI service temporarily unavailable for {component_type}: {str(e)}"
)
except HTTPException:
raise
except Exception as e:
logger.error(f"❌ Error generating strategy component: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "generate_strategy_component")
@router.get("/strategy-generation-status")
async def get_strategy_generation_status(
user_id: int,
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Get the status of strategy generation for a user."""
try:
logger.info(f"Getting strategy generation status for user: {user_id}")
# Get user's strategies
db_service = EnhancedStrategyDBService(db)
enhanced_service = EnhancedStrategyService(db_service)
strategies_data = await enhanced_service.get_enhanced_strategies(user_id, None, db)
# Analyze generation status
strategies = strategies_data.get("strategies", [])
status_data = {
"user_id": user_id,
"total_strategies": len(strategies),
"ai_generated_strategies": len([s for s in strategies if s.get("ai_generated", False)]),
"last_generation": None,
"generation_stats": {
"comprehensive_strategies": 0,
"partial_strategies": 0,
"manual_strategies": 0
}
}
if strategies:
# Find most recent AI-generated strategy
ai_strategies = [s for s in strategies if s.get("ai_generated", False)]
if ai_strategies:
latest_ai = max(ai_strategies, key=lambda x: x.get("created_at", ""))
status_data["last_generation"] = latest_ai.get("created_at")
# Categorize strategies
for strategy in strategies:
if strategy.get("ai_generated", False):
if strategy.get("comprehensive", False):
status_data["generation_stats"]["comprehensive_strategies"] += 1
else:
status_data["generation_stats"]["partial_strategies"] += 1
else:
status_data["generation_stats"]["manual_strategies"] += 1
logger.info(f"✅ Strategy generation status retrieved for user: {user_id}")
return ResponseBuilder.create_success_response(
message="Strategy generation status retrieved successfully",
data=status_data
)
except Exception as e:
logger.error(f"❌ Error getting strategy generation status: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "get_strategy_generation_status")
@router.post("/optimize-existing-strategy")
async def optimize_existing_strategy(
strategy_id: int,
optimization_type: str = "comprehensive",
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Optimize an existing strategy using AI."""
try:
logger.info(f"🚀 Optimizing existing strategy {strategy_id} with type: {optimization_type}")
# Get existing strategy
db_service = EnhancedStrategyDBService(db)
enhanced_service = EnhancedStrategyService(db_service)
strategies_data = await enhanced_service.get_enhanced_strategies(strategy_id=strategy_id, db=db)
if strategies_data.get("status") == "not_found" or not strategies_data.get("strategies"):
raise HTTPException(
status_code=404,
detail=f"Strategy with ID {strategy_id} not found"
)
existing_strategy = strategies_data["strategies"][0]
user_id = existing_strategy.get("user_id")
# Get user context
onboarding_data = await enhanced_service._get_onboarding_data(user_id)
context = {"onboarding_data": onboarding_data, "user_id": user_id}
# Initialize AI strategy generator
strategy_generator = AIStrategyGenerator()
# Generate optimization based on type
if optimization_type == "comprehensive":
# Generate comprehensive optimization
optimized_strategy = await strategy_generator.generate_comprehensive_strategy(
user_id=user_id,
context=context,
strategy_name=f"Optimized: {existing_strategy.get('name', 'Strategy')}"
)
else:
# Generate specific component optimization
component = await strategy_generator._generate_strategic_insights(existing_strategy, context)
optimized_strategy = {
"optimization_type": optimization_type,
"original_strategy": existing_strategy,
"optimization_data": component,
"optimized_at": datetime.utcnow().isoformat()
}
logger.info(f"✅ Strategy {strategy_id} optimized successfully")
return ResponseBuilder.create_success_response(
message="Strategy optimized successfully",
data=optimized_strategy
)
except HTTPException:
raise
except Exception as e:
logger.error(f"❌ Error optimizing strategy: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "optimize_existing_strategy")
@router.post("/generate-comprehensive-strategy-polling")
async def generate_comprehensive_strategy_polling(
request: Dict[str, Any],
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Generate a comprehensive AI-powered content strategy using polling approach."""
try:
# Extract parameters from request body
user_id = request.get("user_id", 1)
strategy_name = request.get("strategy_name")
config = request.get("config", {})
logger.info(f"🚀 Starting polling-based AI strategy generation for user: {user_id}")
# Get user context and onboarding data
db_service = EnhancedStrategyDBService(db)
enhanced_service = EnhancedStrategyService(db_service)
# Get onboarding data for context
onboarding_data = await enhanced_service._get_onboarding_data(user_id)
# Build context for AI generation
context = {
"onboarding_data": onboarding_data,
"user_id": user_id,
"generation_config": config or {}
}
# Create strategy generation config
generation_config = StrategyGenerationConfig(
include_competitive_analysis=config.get("include_competitive_analysis", True) if config else True,
include_content_calendar=config.get("include_content_calendar", True) if config else True,
include_performance_predictions=config.get("include_performance_predictions", True) if config else True,
include_implementation_roadmap=config.get("include_implementation_roadmap", True) if config else True,
include_risk_assessment=config.get("include_risk_assessment", True) if config else True,
max_content_pieces=config.get("max_content_pieces", 50) if config else 50,
timeline_months=config.get("timeline_months", 12) if config else 12
)
# Initialize AI strategy generator
strategy_generator = AIStrategyGenerator(generation_config)
# Start generation in background (non-blocking)
import asyncio
import uuid
# Generate unique task ID
task_id = str(uuid.uuid4())
# Store initial status
generation_status = {
"task_id": task_id,
"user_id": user_id,
"status": "started",
"progress": 0,
"step": 0,
"message": "Initializing AI strategy generation...",
"started_at": datetime.utcnow().isoformat(),
"estimated_completion": None,
"strategy": None,
"error": None,
"educational_content": EducationalContentManager.get_initialization_content()
}
# Store status in memory (in production, use Redis or database)
if not hasattr(generate_comprehensive_strategy_polling, '_task_status'):
generate_comprehensive_strategy_polling._task_status = {}
generate_comprehensive_strategy_polling._task_status[task_id] = generation_status
# Start background task
async def generate_strategy_background():
try:
logger.info(f"🔄 Starting background strategy generation for task: {task_id}")
# Step 1: Get user context
generate_comprehensive_strategy_polling._task_status[task_id].update({
"step": 1,
"progress": 10,
"message": "Getting user context...",
"educational_content": EducationalContentManager.get_step_content(1)
})
# Step 2: Generate base strategy fields
generate_comprehensive_strategy_polling._task_status[task_id].update({
"step": 2,
"progress": 20,
"message": "Generating base strategy fields...",
"educational_content": EducationalContentManager.get_step_content(2)
})
# Step 3: Generate strategic insights
generate_comprehensive_strategy_polling._task_status[task_id].update({
"step": 3,
"progress": 30,
"message": "Generating strategic insights...",
"educational_content": EducationalContentManager.get_step_content(3)
})
strategic_insights = await strategy_generator._generate_strategic_insights({}, context)
generate_comprehensive_strategy_polling._task_status[task_id].update({
"step": 3,
"progress": 35,
"message": "Strategic insights generated successfully",
"educational_content": EducationalContentManager.get_step_completion_content(3, strategic_insights)
})
# Step 4: Generate competitive analysis
generate_comprehensive_strategy_polling._task_status[task_id].update({
"step": 4,
"progress": 40,
"message": "Generating competitive analysis...",
"educational_content": EducationalContentManager.get_step_content(4)
})
competitive_analysis = await strategy_generator._generate_competitive_analysis({}, context)
generate_comprehensive_strategy_polling._task_status[task_id].update({
"step": 4,
"progress": 45,
"message": "Competitive analysis generated successfully",
"educational_content": EducationalContentManager.get_step_completion_content(4, competitive_analysis)
})
# Step 5: Generate performance predictions
generate_comprehensive_strategy_polling._task_status[task_id].update({
"step": 5,
"progress": 50,
"message": "Generating performance predictions...",
"educational_content": EducationalContentManager.get_step_content(5)
})
performance_predictions = await strategy_generator._generate_performance_predictions({}, context)
generate_comprehensive_strategy_polling._task_status[task_id].update({
"step": 5,
"progress": 55,
"message": "Performance predictions generated successfully",
"educational_content": EducationalContentManager.get_step_completion_content(5, performance_predictions)
})
# Step 6: Generate implementation roadmap
generate_comprehensive_strategy_polling._task_status[task_id].update({
"step": 6,
"progress": 60,
"message": "Generating implementation roadmap...",
"educational_content": EducationalContentManager.get_step_content(6)
})
implementation_roadmap = await strategy_generator._generate_implementation_roadmap({}, context)
generate_comprehensive_strategy_polling._task_status[task_id].update({
"step": 6,
"progress": 65,
"message": "Implementation roadmap generated successfully",
"educational_content": EducationalContentManager.get_step_completion_content(6, implementation_roadmap)
})
# Step 7: Generate risk assessment
generate_comprehensive_strategy_polling._task_status[task_id].update({
"step": 7,
"progress": 70,
"message": "Generating risk assessment...",
"educational_content": EducationalContentManager.get_step_content(7)
})
risk_assessment = await strategy_generator._generate_risk_assessment({}, context)
generate_comprehensive_strategy_polling._task_status[task_id].update({
"step": 7,
"progress": 75,
"message": "Risk assessment generated successfully",
"educational_content": EducationalContentManager.get_step_completion_content(7, risk_assessment)
})
# Step 8: Compile comprehensive strategy
generate_comprehensive_strategy_polling._task_status[task_id].update({
"step": 8,
"progress": 80,
"message": "Compiling comprehensive strategy...",
"educational_content": EducationalContentManager.get_step_content(8)
})
# Compile the comprehensive strategy (NO CONTENT CALENDAR)
comprehensive_strategy = {
"strategic_insights": strategic_insights,
"competitive_analysis": competitive_analysis,
"performance_predictions": performance_predictions,
"implementation_roadmap": implementation_roadmap,
"risk_assessment": risk_assessment,
"metadata": {
"ai_generated": True,
"comprehensive": True,
"generation_timestamp": datetime.utcnow().isoformat(),
"user_id": user_id,
"strategy_name": strategy_name or "Enhanced Content Strategy",
"content_calendar_ready": False # Indicates calendar needs to be generated separately
}
}
# Step 8: Complete
completion_content = EducationalContentManager.get_step_content(8)
completion_content = EducationalContentManager.update_completion_summary(
completion_content,
{
"performance_predictions": performance_predictions,
"implementation_roadmap": implementation_roadmap,
"risk_assessment": risk_assessment
}
)
# Save the comprehensive strategy to database
try:
from models.enhanced_strategy_models import EnhancedContentStrategy
# Create enhanced strategy record
enhanced_strategy = EnhancedContentStrategy(
user_id=user_id,
name=strategy_name or "Enhanced Content Strategy",
industry="technology", # Default, can be updated later
# Store the comprehensive AI analysis in the dedicated field
comprehensive_ai_analysis=comprehensive_strategy,
# Store metadata
ai_recommendations=comprehensive_strategy,
# Mark as AI-generated and comprehensive
created_at=datetime.utcnow(),
updated_at=datetime.utcnow()
)
# Add to database
db.add(enhanced_strategy)
db.commit()
db.refresh(enhanced_strategy)
logger.info(f"💾 Strategy saved to database with ID: {enhanced_strategy.id}")
# Update the comprehensive strategy with the database ID
comprehensive_strategy["metadata"]["strategy_id"] = enhanced_strategy.id
except Exception as db_error:
logger.error(f"❌ Error saving strategy to database: {str(db_error)}")
# Continue without database save, strategy is still available in memory
# Final completion update
final_status = {
"step": 8,
"progress": 100,
"status": "completed",
"message": "Strategy generation completed successfully!",
"strategy": comprehensive_strategy,
"completed_at": datetime.utcnow().isoformat(),
"educational_content": completion_content
}
generate_comprehensive_strategy_polling._task_status[task_id].update(final_status)
logger.info(f"🎯 Final status update for task {task_id}: {final_status}")
logger.info(f"🎯 Task status after update: {generate_comprehensive_strategy_polling._task_status[task_id]}")
# Store in global latest strategies for persistent access
_latest_strategies[user_id] = {
"strategy": comprehensive_strategy,
"completed_at": datetime.utcnow().isoformat(),
"task_id": task_id
}
logger.info(f"✅ Background strategy generation completed for task: {task_id}")
logger.info(f"💾 Strategy stored in global storage for user: {user_id}")
except Exception as e:
logger.error(f"❌ Error in background strategy generation for task {task_id}: {str(e)}")
generate_comprehensive_strategy_polling._task_status[task_id].update({
"status": "failed",
"error": str(e),
"message": f"Strategy generation failed: {str(e)}",
"failed_at": datetime.utcnow().isoformat()
})
# Start the background task
asyncio.create_task(generate_strategy_background())
logger.info(f"✅ Polling-based AI strategy generation started for user: {user_id}, task: {task_id}")
return ResponseBuilder.create_success_response(
message="AI strategy generation started successfully",
data={
"task_id": task_id,
"status": "started",
"message": "Strategy generation is running in the background. Use the task_id to check progress.",
"polling_endpoint": f"/api/content-planning/content-strategy/ai-generation/strategy-generation-status/{task_id}",
"estimated_completion": "2-3 minutes"
}
)
except Exception as e:
logger.error(f"❌ Error starting polling-based strategy generation: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "generate_comprehensive_strategy_polling")
@router.get("/strategy-generation-status/{task_id}")
async def get_strategy_generation_status_by_task(
task_id: str,
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Get the status of strategy generation for a specific task."""
try:
logger.info(f"Getting strategy generation status for task: {task_id}")
# Check if task status exists
if not hasattr(generate_comprehensive_strategy_polling, '_task_status'):
raise HTTPException(
status_code=404,
detail="No task status found. Task may have expired or never existed."
)
task_status = generate_comprehensive_strategy_polling._task_status.get(task_id)
if not task_status:
raise HTTPException(
status_code=404,
detail=f"Task {task_id} not found. It may have expired or never existed."
)
logger.info(f"✅ Strategy generation status retrieved for task: {task_id}")
return ResponseBuilder.create_success_response(
message="Strategy generation status retrieved successfully",
data=task_status
)
except HTTPException:
raise
except Exception as e:
logger.error(f"❌ Error getting strategy generation status: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "get_strategy_generation_status_by_task")
@router.get("/latest-strategy")
async def get_latest_generated_strategy(
user_id: int = Query(1, description="User ID"),
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Get the latest generated strategy from the polling system or database."""
try:
logger.info(f"🔍 Getting latest generated strategy for user: {user_id}")
# First, try to get from database (most reliable)
try:
from models.enhanced_strategy_models import EnhancedContentStrategy
from sqlalchemy import desc
logger.info(f"🔍 Querying database for strategies with user_id: {user_id}")
# Query for the most recent strategy with comprehensive AI analysis
# First, let's see all strategies for this user
all_strategies = db.query(EnhancedContentStrategy).filter(
EnhancedContentStrategy.user_id == user_id
).order_by(desc(EnhancedContentStrategy.created_at)).all()
logger.info(f"🔍 Found {len(all_strategies)} total strategies for user {user_id}")
for i, strategy in enumerate(all_strategies):
logger.info(f" Strategy {i+1}: ID={strategy.id}, name={strategy.name}, created_at={strategy.created_at}, has_comprehensive_ai_analysis={strategy.comprehensive_ai_analysis is not None}")
# Now query for the most recent strategy with comprehensive AI analysis
latest_db_strategy = db.query(EnhancedContentStrategy).filter(
EnhancedContentStrategy.user_id == user_id,
EnhancedContentStrategy.comprehensive_ai_analysis.isnot(None)
).order_by(desc(EnhancedContentStrategy.created_at)).first()
logger.info(f"🔍 Database query result: {latest_db_strategy}")
if latest_db_strategy and latest_db_strategy.comprehensive_ai_analysis:
logger.info(f"✅ Found latest strategy in database: {latest_db_strategy.id}")
logger.info(f"🔍 Strategy comprehensive_ai_analysis keys: {list(latest_db_strategy.comprehensive_ai_analysis.keys()) if isinstance(latest_db_strategy.comprehensive_ai_analysis, dict) else 'Not a dict'}")
return ResponseBuilder.create_success_response(
message="Latest generated strategy retrieved successfully from database",
data={
"user_id": user_id,
"strategy": latest_db_strategy.comprehensive_ai_analysis,
"completed_at": latest_db_strategy.created_at.isoformat(),
"strategy_id": latest_db_strategy.id
}
)
else:
logger.info(f"⚠️ No strategy with comprehensive_ai_analysis found in database for user: {user_id}")
# Fallback: Try to get the most recent strategy regardless of comprehensive_ai_analysis
fallback_strategy = db.query(EnhancedContentStrategy).filter(
EnhancedContentStrategy.user_id == user_id
).order_by(desc(EnhancedContentStrategy.created_at)).first()
if fallback_strategy:
logger.info(f"🔍 Found fallback strategy: ID={fallback_strategy.id}, name={fallback_strategy.name}")
logger.info(f"🔍 Fallback strategy has ai_recommendations: {fallback_strategy.ai_recommendations is not None}")
# Try to use ai_recommendations as the strategy data
if fallback_strategy.ai_recommendations:
logger.info(f"✅ Using ai_recommendations as strategy data for fallback strategy {fallback_strategy.id}")
return ResponseBuilder.create_success_response(
message="Latest generated strategy retrieved successfully from database (fallback)",
data={
"user_id": user_id,
"strategy": fallback_strategy.ai_recommendations,
"completed_at": fallback_strategy.created_at.isoformat(),
"strategy_id": fallback_strategy.id
}
)
else:
logger.info(f"⚠️ Fallback strategy has no ai_recommendations either")
else:
logger.info(f"🔍 No strategy record found at all for user: {user_id}")
except Exception as db_error:
logger.warning(f"⚠️ Database query failed: {str(db_error)}")
logger.error(f"❌ Database error details: {type(db_error).__name__}: {str(db_error)}")
# Fallback: Check in-memory task status
if not hasattr(generate_comprehensive_strategy_polling, '_task_status'):
logger.warning("⚠️ No task status storage found")
return ResponseBuilder.create_success_response(
data={"user_id": user_id, "strategy": None},
message="No strategy generation tasks found",
status_code=200
)
# Debug: Log all task statuses
logger.info(f"📊 Total tasks in storage: {len(generate_comprehensive_strategy_polling._task_status)}")
for task_id, task_status in generate_comprehensive_strategy_polling._task_status.items():
logger.info(f" Task {task_id}: user_id={task_status.get('user_id')}, status={task_status.get('status')}, has_strategy={bool(task_status.get('strategy'))}")
# Find the most recent completed strategy for this user
latest_strategy = None
latest_completion_time = None
for task_id, task_status in generate_comprehensive_strategy_polling._task_status.items():
logger.info(f"🔍 Checking task {task_id}: user_id={task_status.get('user_id')} vs requested {user_id}")
if (task_status.get("user_id") == user_id and
task_status.get("status") == "completed" and
task_status.get("strategy")):
completion_time = task_status.get("completed_at")
logger.info(f"✅ Found completed strategy for user {user_id} at {completion_time}")
logger.info(f"🔍 Strategy keys: {list(task_status.get('strategy', {}).keys())}")
if completion_time and (latest_completion_time is None or completion_time > latest_completion_time):
latest_strategy = task_status.get("strategy")
latest_completion_time = completion_time
logger.info(f"🔄 Updated latest strategy with completion time: {completion_time}")
if latest_strategy:
logger.info(f"✅ Found latest generated strategy for user: {user_id}")
return ResponseBuilder.create_success_response(
message="Latest generated strategy retrieved successfully from memory",
data={
"user_id": user_id,
"strategy": latest_strategy,
"completed_at": latest_completion_time
}
)
else:
logger.info(f"⚠️ No completed strategies found for user: {user_id}")
return ResponseBuilder.create_success_response(
data={"user_id": user_id, "strategy": None},
message="No completed strategy generation found",
status_code=200
)
except Exception as e:
logger.error(f"❌ Error getting latest generated strategy: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "get_latest_generated_strategy")

View File

@@ -0,0 +1,252 @@
"""
Analytics Endpoints
Handles analytics and AI analysis endpoints for enhanced content strategies.
"""
from typing import Dict, Any, Optional
from fastapi import APIRouter, Depends, HTTPException, Query
from sqlalchemy.orm import Session
from loguru import logger
from datetime import datetime
# Import database
from services.database import get_db_session
# Import services
from ....services.enhanced_strategy_service import EnhancedStrategyService
from ....services.enhanced_strategy_db_service import EnhancedStrategyDBService
# Import models
from models.enhanced_strategy_models import EnhancedContentStrategy, EnhancedAIAnalysisResult
# Import utilities
from ....utils.error_handlers import ContentPlanningErrorHandler
from ....utils.response_builders import ResponseBuilder
from ....utils.constants import ERROR_MESSAGES, SUCCESS_MESSAGES
router = APIRouter(tags=["Strategy Analytics"])
# Helper function to get database session
def get_db():
db = get_db_session()
try:
yield db
finally:
db.close()
@router.get("/{strategy_id}/analytics")
async def get_enhanced_strategy_analytics(
strategy_id: int,
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Get comprehensive analytics for an enhanced strategy."""
try:
logger.info(f"🚀 Getting analytics for enhanced strategy: {strategy_id}")
db_service = EnhancedStrategyDBService(db)
# Get strategy with analytics
strategies_with_analytics = await db_service.get_enhanced_strategies_with_analytics(
strategy_id=strategy_id
)
if not strategies_with_analytics:
raise ContentPlanningErrorHandler.handle_not_found_error("Enhanced strategy", strategy_id)
strategy_analytics = strategies_with_analytics[0]
logger.info(f"✅ Enhanced strategy analytics retrieved successfully: {strategy_id}")
return ResponseBuilder.create_success_response(
message="Enhanced strategy analytics retrieved successfully",
data=strategy_analytics
)
except HTTPException:
raise
except Exception as e:
logger.error(f"❌ Error getting enhanced strategy analytics: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "get_enhanced_strategy_analytics")
@router.get("/{strategy_id}/ai-analyses")
async def get_enhanced_strategy_ai_analysis(
strategy_id: int,
limit: int = Query(10, description="Number of AI analysis results to return"),
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Get AI analysis history for an enhanced strategy."""
try:
logger.info(f"🚀 Getting AI analysis for enhanced strategy: {strategy_id}")
db_service = EnhancedStrategyDBService(db)
# Verify strategy exists
strategy = await db_service.get_enhanced_strategy(strategy_id)
if not strategy:
raise ContentPlanningErrorHandler.handle_not_found_error("Enhanced strategy", strategy_id)
# Get AI analysis history
ai_analysis_history = await db_service.get_ai_analysis_history(strategy_id, limit)
logger.info(f"✅ AI analysis history retrieved successfully: {strategy_id}")
return ResponseBuilder.create_success_response(
message="Enhanced strategy AI analysis retrieved successfully",
data={
"strategy_id": strategy_id,
"ai_analysis_history": ai_analysis_history,
"total_analyses": len(ai_analysis_history)
}
)
except HTTPException:
raise
except Exception as e:
logger.error(f"❌ Error getting enhanced strategy AI analysis: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "get_enhanced_strategy_ai_analysis")
@router.get("/{strategy_id}/completion")
async def get_enhanced_strategy_completion_stats(
strategy_id: int,
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Get completion statistics for an enhanced strategy."""
try:
logger.info(f"🚀 Getting completion stats for enhanced strategy: {strategy_id}")
db_service = EnhancedStrategyDBService(db)
# Get strategy
strategy = await db_service.get_enhanced_strategy(strategy_id)
if not strategy:
raise ContentPlanningErrorHandler.handle_not_found_error("Enhanced strategy", strategy_id)
# Calculate completion stats
completion_stats = {
"strategy_id": strategy_id,
"completion_percentage": strategy.completion_percentage,
"total_fields": 30, # 30+ strategic inputs
"filled_fields": len([f for f in strategy.__dict__.keys() if getattr(strategy, f) is not None]),
"missing_fields": 30 - len([f for f in strategy.__dict__.keys() if getattr(strategy, f) is not None]),
"last_updated": strategy.updated_at.isoformat() if strategy.updated_at else None
}
logger.info(f"✅ Completion stats retrieved successfully: {strategy_id}")
return ResponseBuilder.create_success_response(
message="Enhanced strategy completion stats retrieved successfully",
data=completion_stats
)
except HTTPException:
raise
except Exception as e:
logger.error(f"❌ Error getting enhanced strategy completion stats: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "get_enhanced_strategy_completion_stats")
@router.get("/{strategy_id}/onboarding-integration")
async def get_enhanced_strategy_onboarding_integration(
strategy_id: int,
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Get onboarding data integration for an enhanced strategy."""
try:
logger.info(f"🚀 Getting onboarding integration for enhanced strategy: {strategy_id}")
db_service = EnhancedStrategyDBService(db)
onboarding_integration = await db_service.get_onboarding_integration(strategy_id)
if not onboarding_integration:
return ResponseBuilder.create_success_response(
data={"strategy_id": strategy_id, "onboarding_integration": None},
message="No onboarding integration found for this strategy",
status_code=200
)
logger.info(f"✅ Onboarding integration retrieved successfully: {strategy_id}")
return ResponseBuilder.create_success_response(
message="Enhanced strategy onboarding integration retrieved successfully",
data=onboarding_integration
)
except Exception as e:
logger.error(f"❌ Error getting onboarding integration: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "get_enhanced_strategy_onboarding_integration")
@router.post("/{strategy_id}/ai-recommendations")
async def generate_enhanced_ai_recommendations(
strategy_id: int,
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Generate AI recommendations for an enhanced strategy."""
try:
logger.info(f"🚀 Generating AI recommendations for enhanced strategy: {strategy_id}")
# Get strategy
db_service = EnhancedStrategyDBService(db)
strategy = await db_service.get_enhanced_strategy(strategy_id)
if not strategy:
raise ContentPlanningErrorHandler.handle_not_found_error("Enhanced strategy", strategy_id)
# Generate AI recommendations
enhanced_service = EnhancedStrategyService(db_service)
# Pass user_id for subscription checks
user_id = str(strategy.user_id) if hasattr(strategy, 'user_id') else None
await enhanced_service._generate_comprehensive_ai_recommendations(strategy, db, user_id=user_id)
# Get updated strategy data
updated_strategy = await db_service.get_enhanced_strategy(strategy_id)
logger.info(f"✅ AI recommendations generated successfully: {strategy_id}")
return ResponseBuilder.create_success_response(
message="Enhanced strategy AI recommendations generated successfully",
data=updated_strategy.to_dict()
)
except HTTPException:
raise
except Exception as e:
logger.error(f"❌ Error generating AI recommendations: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "generate_enhanced_ai_recommendations")
@router.post("/{strategy_id}/ai-analysis/regenerate")
async def regenerate_enhanced_strategy_ai_analysis(
strategy_id: int,
analysis_type: str,
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Regenerate AI analysis for an enhanced strategy."""
try:
logger.info(f"🚀 Regenerating AI analysis for enhanced strategy: {strategy_id}, type: {analysis_type}")
# Get strategy
db_service = EnhancedStrategyDBService(db)
strategy = await db_service.get_enhanced_strategy(strategy_id)
if not strategy:
raise ContentPlanningErrorHandler.handle_not_found_error("Enhanced strategy", strategy_id)
# Regenerate AI analysis
enhanced_service = EnhancedStrategyService(db_service)
# Pass user_id for subscription checks
user_id = str(strategy.user_id) if hasattr(strategy, 'user_id') else None
await enhanced_service._generate_specialized_recommendations(strategy, analysis_type, db, user_id=user_id)
# Get updated strategy data
updated_strategy = await db_service.get_enhanced_strategy(strategy_id)
logger.info(f"✅ AI analysis regenerated successfully: {strategy_id}")
return ResponseBuilder.create_success_response(
message="Enhanced strategy AI analysis regenerated successfully",
data=updated_strategy.to_dict()
)
except HTTPException:
raise
except Exception as e:
logger.error(f"❌ Error regenerating AI analysis: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "regenerate_enhanced_strategy_ai_analysis")

View File

@@ -0,0 +1,227 @@
"""
Autofill Endpoints
Handles autofill endpoints for enhanced content strategies.
CRITICAL PROTECTION ZONE - These endpoints are essential for autofill functionality.
"""
from typing import Dict, Any, Optional
from fastapi import APIRouter, Depends, HTTPException, Query
from fastapi.responses import StreamingResponse
from sqlalchemy.orm import Session
from loguru import logger
import json
import asyncio
from datetime import datetime
# Import database
from services.database import get_db_session
# Import services
from ....services.enhanced_strategy_service import EnhancedStrategyService
from ....services.enhanced_strategy_db_service import EnhancedStrategyDBService
from ....services.content_strategy.autofill.ai_refresh import AutoFillRefreshService
# Import utilities
from ....utils.error_handlers import ContentPlanningErrorHandler
from ....utils.response_builders import ResponseBuilder
from ....utils.constants import ERROR_MESSAGES, SUCCESS_MESSAGES
router = APIRouter(tags=["Strategy Autofill"])
# Helper function to get database session
def get_db():
db = get_db_session()
try:
yield db
finally:
db.close()
async def stream_data(data_generator):
"""Helper function to stream data as Server-Sent Events"""
async for chunk in data_generator:
if isinstance(chunk, dict):
yield f"data: {json.dumps(chunk)}\n\n"
else:
yield f"data: {json.dumps({'message': str(chunk)})}\n\n"
await asyncio.sleep(0.1) # Small delay to prevent overwhelming
@router.post("/{strategy_id}/autofill/accept")
async def accept_autofill_inputs(
strategy_id: int,
payload: Dict[str, Any],
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Persist end-user accepted auto-fill inputs and associate with the strategy."""
try:
logger.info(f"🚀 Accepting autofill inputs for strategy: {strategy_id}")
user_id = str(payload.get('user_id') or "")
accepted_fields = payload.get('accepted_fields') or {}
# Optional transparency bundles
sources = payload.get('sources') or {}
input_data_points = payload.get('input_data_points') or {}
quality_scores = payload.get('quality_scores') or {}
confidence_levels = payload.get('confidence_levels') or {}
data_freshness = payload.get('data_freshness') or {}
if not accepted_fields:
raise HTTPException(status_code=400, detail="accepted_fields is required")
db_service = EnhancedStrategyDBService(db)
record = await db_service.save_autofill_insights(
strategy_id=strategy_id,
user_id=user_id,
payload={
'accepted_fields': accepted_fields,
'sources': sources,
'input_data_points': input_data_points,
'quality_scores': quality_scores,
'confidence_levels': confidence_levels,
'data_freshness': data_freshness,
}
)
if not record:
raise HTTPException(status_code=500, detail="Failed to persist autofill insights")
return ResponseBuilder.create_success_response(
message="Accepted autofill inputs persisted successfully",
data={
'id': record.id,
'strategy_id': record.strategy_id,
'user_id': record.user_id,
'created_at': record.created_at.isoformat() if getattr(record, 'created_at', None) else None
}
)
except HTTPException:
raise
except Exception as e:
logger.error(f"❌ Error accepting autofill inputs: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "accept_autofill_inputs")
@router.get("/autofill/refresh/stream")
async def stream_autofill_refresh(
user_id: Optional[int] = Query(None, description="User ID to build auto-fill for"),
use_ai: bool = Query(True, description="Use AI augmentation during refresh"),
ai_only: bool = Query(False, description="AI-first refresh: return AI overrides when available"),
db: Session = Depends(get_db)
):
"""SSE endpoint to stream steps while generating a fresh auto-fill payload (no DB writes)."""
async def refresh_generator():
try:
actual_user_id = user_id or 1
start_time = datetime.utcnow()
logger.info(f"🚀 Starting auto-fill refresh stream for user: {actual_user_id}")
yield {"type": "status", "phase": "init", "message": "Starting…", "progress": 5}
refresh_service = AutoFillRefreshService(db)
# Phase: Collect onboarding context
yield {"type": "progress", "phase": "context", "message": "Collecting context…", "progress": 15}
# We deliberately do not emit DB-derived values; context is used inside the service
# Phase: Build prompt
yield {"type": "progress", "phase": "prompt", "message": "Preparing prompt…", "progress": 30}
# Phase: AI call with transparency - run in background and yield transparency messages
yield {"type": "progress", "phase": "ai", "message": "Calling AI…", "progress": 45}
import asyncio
# Create a queue to collect transparency messages
transparency_messages = []
async def yield_transparency_message(message):
transparency_messages.append(message)
logger.info(f"📊 Transparency message collected: {message.get('type', 'unknown')} - {message.get('message', 'no message')}")
return message
# Run the transparency-enabled payload generation
ai_task = asyncio.create_task(
refresh_service.build_fresh_payload_with_transparency(
actual_user_id,
use_ai=use_ai,
ai_only=ai_only,
yield_callback=yield_transparency_message
)
)
# Heartbeat loop while AI is running
heartbeat_progress = 50
while not ai_task.done():
elapsed = (datetime.utcnow() - start_time).total_seconds()
heartbeat_progress = min(heartbeat_progress + 3, 85)
yield {"type": "progress", "phase": "ai_running", "message": f"AI running… {int(elapsed)}s", "progress": heartbeat_progress}
# Yield any transparency messages that have been collected
while transparency_messages:
message = transparency_messages.pop(0)
logger.info(f"📤 Yielding transparency message: {message.get('type', 'unknown')}")
yield message
await asyncio.sleep(1) # Check more frequently
# Retrieve result or error
final_payload = await ai_task
# Yield any remaining transparency messages after task completion
while transparency_messages:
message = transparency_messages.pop(0)
logger.info(f"📤 Yielding remaining transparency message: {message.get('type', 'unknown')}")
yield message
# Phase: Validate & map
yield {"type": "progress", "phase": "validate", "message": "Validating…", "progress": 92}
# Phase: Transparency
yield {"type": "progress", "phase": "finalize", "message": "Finalizing…", "progress": 96}
total_ms = int((datetime.utcnow() - start_time).total_seconds() * 1000)
meta = final_payload.get('meta') or {}
meta.update({
'sse_total_ms': total_ms,
'sse_started_at': start_time.isoformat()
})
final_payload['meta'] = meta
yield {"type": "result", "status": "success", "data": final_payload, "progress": 100}
logger.info(f"✅ Auto-fill refresh stream completed for user: {actual_user_id} in {total_ms} ms")
except Exception as e:
logger.error(f"❌ Error in auto-fill refresh stream: {str(e)}")
yield {"type": "error", "message": str(e), "timestamp": datetime.utcnow().isoformat()}
return StreamingResponse(
stream_data(refresh_generator()),
media_type="text/event-stream",
headers={
"Cache-Control": "no-cache",
"Connection": "keep-alive",
"Access-Control-Allow-Origin": "*",
"Access-Control-Allow-Headers": "*",
"Access-Control-Allow-Methods": "GET, POST, OPTIONS",
"Access-Control-Allow-Credentials": "true"
}
)
@router.post("/autofill/refresh")
async def refresh_autofill(
user_id: Optional[int] = Query(None, description="User ID to build auto-fill for"),
use_ai: bool = Query(True, description="Use AI augmentation during refresh"),
ai_only: bool = Query(False, description="AI-first refresh: return AI overrides when available"),
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Non-stream endpoint to return a fresh auto-fill payload (no DB writes)."""
try:
actual_user_id = user_id or 1
started = datetime.utcnow()
refresh_service = AutoFillRefreshService(db)
payload = await refresh_service.build_fresh_payload_with_transparency(actual_user_id, use_ai=use_ai, ai_only=ai_only)
total_ms = int((datetime.utcnow() - started).total_seconds() * 1000)
meta = payload.get('meta') or {}
meta.update({'http_total_ms': total_ms, 'http_started_at': started.isoformat()})
payload['meta'] = meta
return ResponseBuilder.create_success_response(
message="Fresh auto-fill payload generated successfully",
data=payload
)
except Exception as e:
logger.error(f"❌ Error generating fresh auto-fill payload: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "refresh_autofill")

View File

@@ -0,0 +1,8 @@
"""
Content Strategy Educational Content Module
Provides educational content and messages for strategy generation process.
"""
from .educational_content import EducationalContentManager
__all__ = ['EducationalContentManager']

View File

@@ -0,0 +1,319 @@
"""
Educational Content Manager
Manages educational content and messages for strategy generation process.
"""
from typing import Dict, Any, List
from datetime import datetime
class EducationalContentManager:
"""Manages educational content for strategy generation steps."""
@staticmethod
def get_initialization_content() -> Dict[str, Any]:
"""Get educational content for initialization step."""
return {
"title": "🤖 AI-Powered Strategy Generation",
"description": "Initializing AI analysis and preparing educational content...",
"details": [
"🔧 Setting up AI services",
"📊 Loading user context",
"🎯 Preparing strategy framework",
"📚 Generating educational content"
],
"insight": "We're getting everything ready for your personalized AI strategy generation.",
"estimated_time": "2-3 minutes total"
}
@staticmethod
def get_step_content(step: int) -> Dict[str, Any]:
"""Get educational content for a specific step."""
step_content = {
1: EducationalContentManager._get_user_context_content(),
2: EducationalContentManager._get_foundation_content(),
3: EducationalContentManager._get_strategic_insights_content(),
4: EducationalContentManager._get_competitive_analysis_content(),
5: EducationalContentManager._get_performance_predictions_content(),
6: EducationalContentManager._get_implementation_roadmap_content(),
7: EducationalContentManager._get_compilation_content(),
8: EducationalContentManager._get_completion_content()
}
return step_content.get(step, EducationalContentManager._get_default_content())
@staticmethod
def get_step_completion_content(step: int, result_data: Dict[str, Any] = None) -> Dict[str, Any]:
"""Get educational content for step completion."""
completion_content = {
3: EducationalContentManager._get_strategic_insights_completion(result_data),
4: EducationalContentManager._get_competitive_analysis_completion(result_data),
5: EducationalContentManager._get_performance_predictions_completion(result_data),
6: EducationalContentManager._get_implementation_roadmap_completion(result_data)
}
return completion_content.get(step, EducationalContentManager._get_default_completion())
@staticmethod
def _get_user_context_content() -> Dict[str, Any]:
"""Get educational content for user context analysis."""
return {
"title": "🔍 Analyzing Your Data",
"description": "We're gathering all your onboarding information to create a personalized strategy.",
"details": [
"📊 Website analysis data",
"🎯 Research preferences",
"🔑 API configurations",
"📈 Historical performance metrics"
],
"insight": "Your data helps us understand your business context, target audience, and competitive landscape.",
"ai_prompt_preview": "Analyzing user onboarding data to extract business context, audience insights, and competitive positioning..."
}
@staticmethod
def _get_foundation_content() -> Dict[str, Any]:
"""Get educational content for foundation building."""
return {
"title": "🏗️ Building Foundation",
"description": "Creating the core strategy framework based on your business objectives.",
"details": [
"🎯 Business objectives mapping",
"📊 Target metrics definition",
"💰 Budget allocation strategy",
"⏰ Timeline planning"
],
"insight": "A solid foundation ensures your content strategy aligns with business goals and resources.",
"ai_prompt_preview": "Generating strategic foundation: business objectives, target metrics, budget allocation, and timeline planning..."
}
@staticmethod
def _get_strategic_insights_content() -> Dict[str, Any]:
"""Get educational content for strategic insights generation."""
return {
"title": "🧠 Strategic Intelligence Analysis",
"description": "AI is analyzing your market position and identifying strategic opportunities.",
"details": [
"🎯 Market positioning analysis",
"💡 Opportunity identification",
"📈 Growth potential assessment",
"🎪 Competitive advantage mapping"
],
"insight": "Strategic insights help you understand where you stand in the market and how to differentiate.",
"ai_prompt_preview": "Analyzing market position, identifying strategic opportunities, assessing growth potential, and mapping competitive advantages...",
"estimated_time": "15-20 seconds"
}
@staticmethod
def _get_competitive_analysis_content() -> Dict[str, Any]:
"""Get educational content for competitive analysis."""
return {
"title": "🔍 Competitive Intelligence Analysis",
"description": "AI is analyzing your competitors to identify gaps and opportunities.",
"details": [
"🏢 Competitor content strategies",
"📊 Market gap analysis",
"🎯 Differentiation opportunities",
"📈 Industry trend analysis"
],
"insight": "Understanding your competitors helps you find unique angles and underserved market segments.",
"ai_prompt_preview": "Analyzing competitor content strategies, identifying market gaps, finding differentiation opportunities, and assessing industry trends...",
"estimated_time": "20-25 seconds"
}
@staticmethod
def _get_performance_predictions_content() -> Dict[str, Any]:
"""Get educational content for performance predictions."""
return {
"title": "📊 Performance Forecasting",
"description": "AI is predicting content performance and ROI based on industry data.",
"details": [
"📈 Traffic growth projections",
"💰 ROI predictions",
"🎯 Conversion rate estimates",
"📊 Engagement metrics forecasting"
],
"insight": "Performance predictions help you set realistic expectations and optimize resource allocation.",
"ai_prompt_preview": "Analyzing industry benchmarks, predicting traffic growth, estimating ROI, forecasting conversion rates, and projecting engagement metrics...",
"estimated_time": "15-20 seconds"
}
@staticmethod
def _get_implementation_roadmap_content() -> Dict[str, Any]:
"""Get educational content for implementation roadmap."""
return {
"title": "🗺️ Implementation Roadmap",
"description": "AI is creating a detailed implementation plan for your content strategy.",
"details": [
"📋 Task breakdown and timeline",
"👥 Resource allocation planning",
"🎯 Milestone definition",
"📊 Success metric tracking"
],
"insight": "A clear implementation roadmap ensures successful strategy execution and measurable results.",
"ai_prompt_preview": "Creating implementation roadmap: task breakdown, resource allocation, milestone planning, and success metric definition...",
"estimated_time": "15-20 seconds"
}
@staticmethod
def _get_risk_assessment_content() -> Dict[str, Any]:
"""Get educational content for risk assessment."""
return {
"title": "⚠️ Risk Assessment",
"description": "AI is identifying potential risks and mitigation strategies for your content strategy.",
"details": [
"🔍 Risk identification and analysis",
"📊 Risk probability assessment",
"🛡️ Mitigation strategy development",
"📈 Risk monitoring framework"
],
"insight": "Proactive risk assessment helps you prepare for challenges and maintain strategy effectiveness.",
"ai_prompt_preview": "Assessing risks: identifying potential challenges, analyzing probability and impact, developing mitigation strategies, and creating monitoring framework...",
"estimated_time": "10-15 seconds"
}
@staticmethod
def _get_compilation_content() -> Dict[str, Any]:
"""Get educational content for strategy compilation."""
return {
"title": "📋 Strategy Compilation",
"description": "AI is compiling all components into a comprehensive content strategy.",
"details": [
"🔗 Component integration",
"📊 Data synthesis",
"📝 Strategy documentation",
"✅ Quality validation"
],
"insight": "A comprehensive strategy integrates all components into a cohesive, actionable plan.",
"ai_prompt_preview": "Compiling comprehensive strategy: integrating all components, synthesizing data, documenting strategy, and validating quality...",
"estimated_time": "5-10 seconds"
}
@staticmethod
def _get_completion_content() -> Dict[str, Any]:
"""Get educational content for strategy completion."""
return {
"title": "🎉 Strategy Generation Complete!",
"description": "Your comprehensive AI-powered content strategy is ready for review!",
"summary": {
"total_components": 5,
"successful_components": 5,
"estimated_roi": "15-25%",
"implementation_timeline": "12 months",
"risk_level": "Medium"
},
"key_achievements": [
"🧠 Strategic insights generated",
"🔍 Competitive analysis completed",
"📊 Performance predictions calculated",
"🗺️ Implementation roadmap planned",
"⚠️ Risk assessment conducted"
],
"next_steps": [
"Review your comprehensive strategy in the Strategic Intelligence tab",
"Customize specific components as needed",
"Confirm the strategy to proceed",
"Generate content calendar based on confirmed strategy"
],
"ai_insights": "Your strategy leverages advanced AI analysis of your business context, competitive landscape, and industry best practices to create a data-driven content approach.",
"personalization_note": "This strategy is uniquely tailored to your business based on your onboarding data, ensuring relevance and effectiveness.",
"content_calendar_note": "Content calendar will be generated separately after you review and confirm this strategy, ensuring it's based on your final approved strategy."
}
@staticmethod
def _get_default_content() -> Dict[str, Any]:
"""Get default educational content."""
return {
"title": "🔄 Processing",
"description": "AI is working on your strategy...",
"details": [
"⏳ Processing in progress",
"📊 Analyzing data",
"🎯 Generating insights",
"📝 Compiling results"
],
"insight": "The AI is working hard to create your personalized strategy.",
"estimated_time": "A few moments"
}
@staticmethod
def _get_strategic_insights_completion(result_data: Dict[str, Any] = None) -> Dict[str, Any]:
"""Get completion content for strategic insights."""
insights_count = len(result_data.get("insights", [])) if result_data else 0
return {
"title": "✅ Strategic Insights Complete",
"description": "Successfully identified key strategic opportunities and market positioning.",
"achievement": f"Generated {insights_count} strategic insights",
"next_step": "Moving to competitive analysis..."
}
@staticmethod
def _get_competitive_analysis_completion(result_data: Dict[str, Any] = None) -> Dict[str, Any]:
"""Get completion content for competitive analysis."""
competitors_count = len(result_data.get("competitors", [])) if result_data else 0
return {
"title": "✅ Competitive Analysis Complete",
"description": "Successfully analyzed competitive landscape and identified market opportunities.",
"achievement": f"Analyzed {competitors_count} competitors",
"next_step": "Moving to performance predictions..."
}
@staticmethod
def _get_performance_predictions_completion(result_data: Dict[str, Any] = None) -> Dict[str, Any]:
"""Get completion content for performance predictions."""
estimated_roi = result_data.get("estimated_roi", "15-25%") if result_data else "15-25%"
return {
"title": "✅ Performance Predictions Complete",
"description": "Successfully predicted content performance and ROI.",
"achievement": f"Predicted {estimated_roi} ROI",
"next_step": "Moving to implementation roadmap..."
}
@staticmethod
def _get_implementation_roadmap_completion(result_data: Dict[str, Any] = None) -> Dict[str, Any]:
"""Get completion content for implementation roadmap."""
timeline = result_data.get("total_duration", "12 months") if result_data else "12 months"
return {
"title": "✅ Implementation Roadmap Complete",
"description": "Successfully created detailed implementation plan.",
"achievement": f"Planned {timeline} implementation timeline",
"next_step": "Moving to compilation..."
}
@staticmethod
def _get_risk_assessment_completion(result_data: Dict[str, Any] = None) -> Dict[str, Any]:
"""Get completion content for risk assessment."""
risk_level = result_data.get("overall_risk_level", "Medium") if result_data else "Medium"
return {
"title": "✅ Risk Assessment Complete",
"description": "Successfully identified risks and mitigation strategies.",
"achievement": f"Assessed {risk_level} risk level",
"next_step": "Finalizing comprehensive strategy..."
}
@staticmethod
def _get_default_completion() -> Dict[str, Any]:
"""Get default completion content."""
return {
"title": "✅ Step Complete",
"description": "Successfully completed this step.",
"achievement": "Step completed successfully",
"next_step": "Moving to next step..."
}
@staticmethod
def update_completion_summary(completion_content: Dict[str, Any], strategy_data: Dict[str, Any]) -> Dict[str, Any]:
"""Update completion content with actual strategy data."""
if "summary" in completion_content:
content_calendar = strategy_data.get("content_calendar", {})
performance_predictions = strategy_data.get("performance_predictions", {})
implementation_roadmap = strategy_data.get("implementation_roadmap", {})
risk_assessment = strategy_data.get("risk_assessment", {})
completion_content["summary"].update({
"total_content_pieces": len(content_calendar.get("content_pieces", [])),
"estimated_roi": performance_predictions.get("estimated_roi", "15-25%"),
"implementation_timeline": implementation_roadmap.get("total_duration", "12 months"),
"risk_level": risk_assessment.get("overall_risk_level", "Medium")
})
return completion_content

View File

@@ -0,0 +1,299 @@
"""
Strategy CRUD Endpoints
Handles CRUD operations for enhanced content strategies.
"""
from typing import Dict, Any, Optional
from fastapi import APIRouter, Depends, HTTPException, Query
from sqlalchemy.orm import Session
from loguru import logger
import json
from datetime import datetime
# Import database
from services.database import get_db
# Import authentication middleware
from middleware.auth_middleware import get_current_user
# Import services
from ....services.enhanced_strategy_service import EnhancedStrategyService
from ....services.enhanced_strategy_db_service import EnhancedStrategyDBService
# Import models
from models.enhanced_strategy_models import EnhancedContentStrategy
# Import utilities
from ....utils.error_handlers import ContentPlanningErrorHandler
from ....utils.response_builders import ResponseBuilder
from ....utils.constants import ERROR_MESSAGES, SUCCESS_MESSAGES
from ....utils.data_parsers import parse_strategy_data
router = APIRouter(tags=["Strategy CRUD"])
@router.post("/create")
async def create_enhanced_strategy(
strategy_data: Dict[str, Any],
current_user: Dict[str, Any] = Depends(get_current_user),
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Create a new enhanced content strategy."""
try:
# Extract authenticated user_id from Clerk
clerk_user_id = str(current_user.get('id', ''))
if not clerk_user_id:
raise HTTPException(
status_code=401,
detail="Invalid user ID in authentication token"
)
logger.info(f"Creating enhanced strategy: {strategy_data.get('name', 'Unknown')} for user: {clerk_user_id}")
# Override user_id from request body with authenticated user_id (security)
strategy_data['user_id'] = clerk_user_id
# Validate required fields
required_fields = ['name']
for field in required_fields:
if field not in strategy_data or not strategy_data[field]:
raise HTTPException(
status_code=400,
detail=f"Missing required field: {field}"
)
# Parse and validate strategy data using shared utilities
cleaned_data, warnings = parse_strategy_data(strategy_data)
# Log warnings if any
if warnings:
logger.warning(f" Strategy create warnings: {warnings}")
# Create strategy
db_service = EnhancedStrategyDBService(db)
enhanced_service = EnhancedStrategyService(db_service)
# Pass authenticated user_id for AI calls with subscription checks
result = await enhanced_service.create_enhanced_strategy(cleaned_data, db)
logger.info(f"Enhanced strategy created successfully: {result.get('strategy_id') if isinstance(result, dict) else getattr(result, 'id', None)}")
response = ResponseBuilder.create_success_response(
data=result,
message=SUCCESS_MESSAGES['strategy_created']
)
# Include warnings if any
if warnings:
response['warnings'] = warnings
return response
except HTTPException:
raise
except Exception as e:
logger.error(f"Error creating enhanced strategy: {str(e)}")
return ContentPlanningErrorHandler.handle_general_error(e, "create_enhanced_strategy")
@router.get("/")
async def get_enhanced_strategies(
user_id: Optional[str] = Query(None, description="User ID to filter strategies (deprecated - use authenticated user)"),
strategy_id: Optional[int] = Query(None, description="Specific strategy ID"),
current_user: Dict[str, Any] = Depends(get_current_user),
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Get enhanced content strategies."""
try:
# Extract authenticated user_id from Clerk
clerk_user_id = str(current_user.get('id', ''))
if not clerk_user_id:
raise HTTPException(
status_code=401,
detail="Invalid user ID in authentication token"
)
authenticated_user_id = clerk_user_id
logger.info(f"Getting enhanced strategies for authenticated user: {authenticated_user_id}, strategy: {strategy_id}")
db_service = EnhancedStrategyDBService(db)
enhanced_service = EnhancedStrategyService(db_service)
# Use authenticated user_id to ensure users can only see their own strategies
strategies_data = await enhanced_service.get_enhanced_strategies(authenticated_user_id, strategy_id, db)
logger.info(f"Retrieved {strategies_data.get('total_count', 0)} strategies")
return ResponseBuilder.create_success_response(
data=strategies_data,
message=SUCCESS_MESSAGES['strategies_retrieved']
)
except Exception as e:
logger.error(f"Error getting enhanced strategies: {str(e)}")
return ContentPlanningErrorHandler.handle_general_error(e, "get_enhanced_strategies")
@router.get("/{strategy_id}")
async def get_enhanced_strategy_by_id(
strategy_id: int,
current_user: Dict[str, Any] = Depends(get_current_user),
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Get a specific enhanced strategy by ID."""
try:
clerk_user_id = str(current_user.get('id', ''))
if not clerk_user_id:
raise HTTPException(
status_code=401,
detail="Invalid user ID in authentication token"
)
authenticated_user_id = clerk_user_id
logger.info(f"Getting enhanced strategy by ID: {strategy_id} for authenticated user: {authenticated_user_id}")
db_service = EnhancedStrategyDBService(db)
enhanced_service = EnhancedStrategyService(db_service)
strategies_data = await enhanced_service.get_enhanced_strategies(user_id=authenticated_user_id, strategy_id=strategy_id, db=db)
if strategies_data.get("status") == "not_found" or not strategies_data.get("strategies"):
raise HTTPException(
status_code=404,
detail=f"Enhanced strategy with ID {strategy_id} not found or you don't have access to it"
)
strategy = strategies_data["strategies"][0]
# Verify ownership
if strategy.get('user_id') != authenticated_user_id:
raise HTTPException(
status_code=403,
detail="You don't have permission to access this strategy"
)
logger.info(f"Retrieved strategy: {strategy.get('name')}")
return ResponseBuilder.create_success_response(
data=strategy,
message=SUCCESS_MESSAGES['strategy_retrieved']
)
except HTTPException:
raise
except Exception as e:
logger.error(f"Error getting enhanced strategy by ID: {str(e)}")
return ContentPlanningErrorHandler.handle_general_error(e, "get_enhanced_strategy_by_id")
@router.put("/{strategy_id}")
async def update_enhanced_strategy(
strategy_id: int,
update_data: Dict[str, Any],
current_user: Dict[str, Any] = Depends(get_current_user),
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Update an enhanced strategy."""
try:
clerk_user_id = str(current_user.get('id', ''))
if not clerk_user_id:
raise HTTPException(
status_code=401,
detail="Invalid user ID in authentication token"
)
authenticated_user_id = clerk_user_id
logger.info(f"Updating enhanced strategy: {strategy_id} for authenticated user: {authenticated_user_id}")
# Check if strategy exists and verify ownership
existing_strategy = db.query(EnhancedContentStrategy).filter(
EnhancedContentStrategy.id == strategy_id
).first()
if not existing_strategy:
raise HTTPException(
status_code=404,
detail=f"Enhanced strategy with ID {strategy_id} not found"
)
# Verify ownership
if existing_strategy.user_id != authenticated_user_id:
raise HTTPException(
status_code=403,
detail="You don't have permission to update this strategy"
)
# Update strategy fields
for field, value in update_data.items():
if hasattr(existing_strategy, field):
setattr(existing_strategy, field, value)
existing_strategy.updated_at = datetime.utcnow()
# Save to database
db.commit()
db.refresh(existing_strategy)
logger.info(f"Enhanced strategy updated successfully: {strategy_id}")
return ResponseBuilder.create_success_response(
data=existing_strategy.to_dict(),
message=SUCCESS_MESSAGES['strategy_updated']
)
except HTTPException:
raise
except Exception as e:
logger.error(f"Error updating enhanced strategy: {str(e)}")
return ContentPlanningErrorHandler.handle_general_error(e, "update_enhanced_strategy")
@router.delete("/{strategy_id}")
async def delete_enhanced_strategy(
strategy_id: int,
current_user: Dict[str, Any] = Depends(get_current_user),
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Delete an enhanced strategy."""
try:
# Extract authenticated user_id from Clerk
clerk_user_id = str(current_user.get('id', ''))
if not clerk_user_id:
raise HTTPException(
status_code=401,
detail="Invalid user ID in authentication token"
)
authenticated_user_id = clerk_user_id
logger.info(f"Deleting enhanced strategy: {strategy_id} for authenticated user: {authenticated_user_id}")
# Check if strategy exists and verify ownership
strategy = db.query(EnhancedContentStrategy).filter(
EnhancedContentStrategy.id == strategy_id
).first()
if not strategy:
raise HTTPException(
status_code=404,
detail=f"Enhanced strategy with ID {strategy_id} not found"
)
# Verify ownership
if strategy.user_id != authenticated_user_id:
raise HTTPException(
status_code=403,
detail="You don't have permission to delete this strategy"
)
# Delete strategy
db.delete(strategy)
db.commit()
logger.info(f"Enhanced strategy deleted successfully: {strategy_id}")
return ResponseBuilder.create_success_response(
data={"strategy_id": strategy_id},
message=SUCCESS_MESSAGES['strategy_deleted']
)
except HTTPException:
raise
except Exception as e:
logger.error(f"Error deleting enhanced strategy: {str(e)}")
return ContentPlanningErrorHandler.handle_general_error(e, "delete_enhanced_strategy")

View File

@@ -0,0 +1,387 @@
"""
Streaming Endpoints
Handles streaming endpoints for enhanced content strategies.
"""
from typing import Dict, Any, Optional
from fastapi import APIRouter, Depends, HTTPException, Query
from fastapi.responses import StreamingResponse
from starlette.requests import Request
from sqlalchemy.orm import Session
from loguru import logger
import json
import asyncio
from datetime import datetime
from collections import defaultdict
import time
# Import database
from services.database import get_db_session
# Import authentication middleware
from middleware.auth_middleware import get_current_user, get_current_user_with_query_token
# Import services
from ....services.enhanced_strategy_service import EnhancedStrategyService
from ....services.enhanced_strategy_db_service import EnhancedStrategyDBService
# Import utilities
from ....utils.error_handlers import ContentPlanningErrorHandler
from ....utils.response_builders import ResponseBuilder
from ....utils.constants import ERROR_MESSAGES, SUCCESS_MESSAGES
router = APIRouter(tags=["Strategy Streaming"])
# Cache for streaming endpoints (5 minutes cache)
streaming_cache = defaultdict(dict)
CACHE_DURATION = 300 # 5 minutes
def get_cached_data(cache_key: str) -> Optional[Dict[str, Any]]:
"""Get cached data if it exists and is not expired."""
if cache_key in streaming_cache:
cached_data = streaming_cache[cache_key]
if time.time() - cached_data.get("timestamp", 0) < CACHE_DURATION:
return cached_data.get("data")
return None
def set_cached_data(cache_key: str, data: Dict[str, Any]):
"""Set cached data with timestamp."""
streaming_cache[cache_key] = {
"data": data,
"timestamp": time.time()
}
# Helper function to get database session
def get_db():
db = get_db_session()
try:
yield db
finally:
db.close()
async def stream_data(data_generator):
"""Helper function to stream data as Server-Sent Events"""
async for chunk in data_generator:
if isinstance(chunk, dict):
yield f"data: {json.dumps(chunk)}\n\n"
else:
yield f"data: {json.dumps({'message': str(chunk)})}\n\n"
await asyncio.sleep(0.1) # Small delay to prevent overwhelming
@router.get("/stream/strategies")
async def stream_enhanced_strategies(
strategy_id: Optional[int] = Query(None, description="Specific strategy ID"),
current_user: Dict[str, Any] = Depends(get_current_user),
db: Session = Depends(get_db)
):
"""Stream enhanced strategies with real-time updates."""
async def strategy_generator():
try:
clerk_user_id = str(current_user.get('id', ''))
if not clerk_user_id:
yield {"type": "error", "message": "Invalid user ID in authentication token", "timestamp": datetime.utcnow().isoformat()}
return
authenticated_user_id = clerk_user_id
logger.info(f"🚀 Starting strategy stream for authenticated user: {authenticated_user_id}, strategy: {strategy_id}")
# Send initial status
yield {"type": "status", "message": "Starting strategy retrieval...", "timestamp": datetime.utcnow().isoformat()}
db_service = EnhancedStrategyDBService(db)
enhanced_service = EnhancedStrategyService(db_service)
# Send progress update
yield {"type": "progress", "message": "Querying database...", "progress": 25}
# Use authenticated user_id to ensure users can only see their own strategies
strategies_data = await enhanced_service.get_enhanced_strategies(authenticated_user_id, strategy_id, db)
# Send progress update
yield {"type": "progress", "message": "Processing strategies...", "progress": 50}
if strategies_data.get("status") == "not_found":
yield {"type": "result", "status": "not_found", "data": strategies_data}
return
# Send progress update
yield {"type": "progress", "message": "Finalizing data...", "progress": 75}
# Send final result
yield {"type": "result", "status": "success", "data": strategies_data, "progress": 100}
logger.info(f"✅ Strategy stream completed for user: {authenticated_user_id}")
except Exception as e:
logger.error(f"❌ Error in strategy stream: {str(e)}")
yield {"type": "error", "message": str(e), "timestamp": datetime.utcnow().isoformat()}
return StreamingResponse(
stream_data(strategy_generator()),
media_type="text/event-stream",
headers={
"Cache-Control": "no-cache",
"Connection": "keep-alive",
"Access-Control-Allow-Origin": "*",
"Access-Control-Allow-Headers": "*",
"Access-Control-Allow-Methods": "GET, POST, OPTIONS",
"Access-Control-Allow-Credentials": "true"
}
)
@router.get("/stream/strategic-intelligence")
async def stream_strategic_intelligence(
request: Request,
current_user: Dict[str, Any] = Depends(get_current_user_with_query_token),
db: Session = Depends(get_db)
):
"""Stream strategic intelligence data with real-time updates."""
async def intelligence_generator():
try:
clerk_user_id = str(current_user.get('id', ''))
if not clerk_user_id:
yield {"type": "error", "message": "Invalid user ID in authentication token", "timestamp": datetime.utcnow().isoformat()}
return
authenticated_user_id = clerk_user_id
logger.info(f"🚀 Starting strategic intelligence stream for authenticated user: {authenticated_user_id}")
# Check cache first
cache_key = f"strategic_intelligence_{authenticated_user_id}"
cached_data = get_cached_data(cache_key)
if cached_data:
logger.info(f"✅ Returning cached strategic intelligence data for user: {authenticated_user_id}")
yield {"type": "result", "status": "success", "data": cached_data, "progress": 100}
return
# Send initial status
yield {"type": "status", "message": "Loading strategic intelligence...", "timestamp": datetime.utcnow().isoformat()}
db_service = EnhancedStrategyDBService(db)
enhanced_service = EnhancedStrategyService(db_service)
# Send progress update
yield {"type": "progress", "message": "Retrieving strategies...", "progress": 20}
# Use authenticated user_id to ensure users can only see their own strategies
strategies_data = await enhanced_service.get_enhanced_strategies(authenticated_user_id, None, db)
# Send progress update
yield {"type": "progress", "message": "Analyzing market positioning...", "progress": 40}
if strategies_data.get("status") == "not_found":
yield {"type": "error", "status": "not_ready", "message": "No strategies found. Complete onboarding and create a strategy before generating intelligence.", "progress": 100}
return
# Extract strategic intelligence from first strategy
strategy = strategies_data.get("strategies", [{}])[0]
# Parse ai_recommendations if it's a JSON string
ai_recommendations = {}
if strategy.get("ai_recommendations"):
try:
if isinstance(strategy["ai_recommendations"], str):
ai_recommendations = json.loads(strategy["ai_recommendations"])
else:
ai_recommendations = strategy["ai_recommendations"]
except (json.JSONDecodeError, TypeError):
ai_recommendations = {}
# Send progress update
yield {"type": "progress", "message": "Processing intelligence data...", "progress": 60}
strategic_intelligence = {
"market_positioning": {
"current_position": strategy.get("competitive_position", "Challenger"),
"target_position": "Market Leader",
"differentiation_factors": [
"AI-powered content optimization",
"Data-driven strategy development",
"Personalized user experience"
]
},
"competitive_analysis": {
"top_competitors": strategy.get("top_competitors", [])[:3] or [
"Competitor A", "Competitor B", "Competitor C"
],
"competitive_advantages": [
"Advanced AI capabilities",
"Comprehensive data integration",
"User-centric design"
],
"market_gaps": strategy.get("market_gaps", []) or [
"AI-driven content personalization",
"Real-time performance optimization",
"Predictive analytics"
]
},
"ai_insights": ai_recommendations.get("strategic_insights", []) or [
"Focus on pillar content strategy",
"Implement topic clustering",
"Optimize for voice search"
],
"opportunities": [
{
"area": "Content Personalization",
"potential_impact": "High",
"implementation_timeline": "3-6 months",
"estimated_roi": "25-40%"
},
{
"area": "AI-Powered Optimization",
"potential_impact": "Medium",
"implementation_timeline": "6-12 months",
"estimated_roi": "15-30%"
}
]
}
# Cache the strategic intelligence data
set_cached_data(cache_key, strategic_intelligence)
# Send progress update
yield {"type": "progress", "message": "Finalizing strategic intelligence...", "progress": 80}
# Send final result
yield {"type": "result", "status": "success", "data": strategic_intelligence, "progress": 100}
logger.info(f"✅ Strategic intelligence stream completed for user: {authenticated_user_id}")
except Exception as e:
logger.error(f"❌ Error in strategic intelligence stream: {str(e)}")
yield {"type": "error", "message": str(e), "timestamp": datetime.utcnow().isoformat()}
return StreamingResponse(
stream_data(intelligence_generator()),
media_type="text/event-stream",
headers={
"Cache-Control": "no-cache",
"Connection": "keep-alive",
"Access-Control-Allow-Origin": "*",
"Access-Control-Allow-Headers": "*",
"Access-Control-Allow-Methods": "GET, POST, OPTIONS",
"Access-Control-Allow-Credentials": "true"
}
)
@router.get("/stream/keyword-research")
async def stream_keyword_research(
request: Request,
current_user: Dict[str, Any] = Depends(get_current_user_with_query_token),
db: Session = Depends(get_db)
):
"""Stream keyword research data with real-time updates."""
async def keyword_generator():
try:
clerk_user_id = str(current_user.get('id', ''))
if not clerk_user_id:
yield {"type": "error", "message": "Invalid user ID in authentication token", "timestamp": datetime.utcnow().isoformat()}
return
authenticated_user_id = clerk_user_id
logger.info(f"🚀 Starting keyword research stream for authenticated user: {authenticated_user_id}")
# Check cache first
cache_key = f"keyword_research_{authenticated_user_id}"
cached_data = get_cached_data(cache_key)
if cached_data:
logger.info(f"✅ Returning cached keyword research data for user: {authenticated_user_id}")
yield {"type": "result", "status": "success", "data": cached_data, "progress": 100}
return
# Send initial status
yield {"type": "status", "message": "Loading keyword research...", "timestamp": datetime.utcnow().isoformat()}
# Import gap analysis service
from ....services.gap_analysis_service import GapAnalysisService
# Send progress update
yield {"type": "progress", "message": "Retrieving gap analyses...", "progress": 20}
gap_service = GapAnalysisService()
# Use authenticated user_id to ensure users can only see their own data
gap_analyses = await gap_service.get_gap_analyses(authenticated_user_id)
# Send progress update
yield {"type": "progress", "message": "Analyzing keyword opportunities...", "progress": 40}
# Handle case where gap_analyses is 0, None, or empty
if not gap_analyses or gap_analyses == 0 or len(gap_analyses) == 0:
yield {"type": "error", "status": "not_ready", "message": "No keyword research data available. Connect data sources or run analysis first.", "progress": 100}
return
# Extract keyword data from first gap analysis
gap_analysis = gap_analyses[0] if isinstance(gap_analyses, list) else gap_analyses
# Parse analysis_results if it's a JSON string
analysis_results = {}
if gap_analysis.get("analysis_results"):
try:
if isinstance(gap_analysis["analysis_results"], str):
analysis_results = json.loads(gap_analysis["analysis_results"])
else:
analysis_results = gap_analysis["analysis_results"]
except (json.JSONDecodeError, TypeError):
analysis_results = {}
# Send progress update
yield {"type": "progress", "message": "Processing keyword data...", "progress": 60}
keyword_data = {
"trend_analysis": {
"high_volume_keywords": analysis_results.get("opportunities", [])[:3] or [
{"keyword": "AI marketing automation", "volume": "10K-100K", "difficulty": "Medium"},
{"keyword": "content strategy 2024", "volume": "1K-10K", "difficulty": "Low"},
{"keyword": "digital marketing trends", "volume": "10K-100K", "difficulty": "High"}
],
"trending_keywords": [
{"keyword": "AI content generation", "growth": "+45%", "opportunity": "High"},
{"keyword": "voice search optimization", "growth": "+32%", "opportunity": "Medium"},
{"keyword": "video marketing strategy", "growth": "+28%", "opportunity": "High"}
]
},
"intent_analysis": {
"informational": ["how to", "what is", "guide to"],
"navigational": ["company name", "brand name", "website"],
"transactional": ["buy", "purchase", "download", "sign up"]
},
"opportunities": analysis_results.get("opportunities", []) or [
{"keyword": "AI content tools", "search_volume": "5K-10K", "competition": "Low", "cpc": "$2.50"},
{"keyword": "content marketing ROI", "search_volume": "1K-5K", "competition": "Medium", "cpc": "$4.20"},
{"keyword": "social media strategy", "search_volume": "10K-50K", "competition": "High", "cpc": "$3.80"}
]
}
# Cache the keyword data
set_cached_data(cache_key, keyword_data)
# Send progress update
yield {"type": "progress", "message": "Finalizing keyword research...", "progress": 80}
# Send final result
yield {"type": "result", "status": "success", "data": keyword_data, "progress": 100}
logger.info(f"✅ Keyword research stream completed for user: {authenticated_user_id}")
except Exception as e:
logger.error(f"❌ Error in keyword research stream: {str(e)}")
yield {"type": "error", "message": str(e), "timestamp": datetime.utcnow().isoformat()}
return StreamingResponse(
stream_data(keyword_generator()),
media_type="text/event-stream",
headers={
"Cache-Control": "no-cache",
"Connection": "keep-alive",
"Access-Control-Allow-Origin": "*",
"Access-Control-Allow-Headers": "*",
"Access-Control-Allow-Methods": "GET, POST, OPTIONS",
"Access-Control-Allow-Credentials": "true"
}
)

View File

@@ -0,0 +1,330 @@
"""
Utility Endpoints
Handles utility endpoints for enhanced content strategies.
"""
from typing import Dict, Any, Optional
from fastapi import APIRouter, Depends, HTTPException, Query
from sqlalchemy.orm import Session
from loguru import logger
# Import database
from services.database import get_db_session
# Import services
from ....services.enhanced_strategy_service import EnhancedStrategyService
from ....services.enhanced_strategy_db_service import EnhancedStrategyDBService
# Import authentication
from middleware.auth_middleware import get_current_user
# Import utilities
from ....utils.error_handlers import ContentPlanningErrorHandler
from ....utils.response_builders import ResponseBuilder
from ....utils.constants import ERROR_MESSAGES, SUCCESS_MESSAGES
router = APIRouter(tags=["Strategy Utilities"])
# Helper function to get database session
def get_db():
db = get_db_session()
try:
yield db
finally:
db.close()
@router.get("/onboarding-data")
async def get_onboarding_data(
current_user: Dict[str, Any] = Depends(get_current_user),
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Get onboarding data for enhanced strategy auto-population."""
try:
logger.warning(f"🔍 get_onboarding_data called with current_user: {current_user}")
# Extract authenticated user_id from Clerk
clerk_user_id = str(current_user.get('id', ''))
if not clerk_user_id:
logger.error(f"❌ Invalid user ID in authentication token. current_user: {current_user}")
raise HTTPException(
status_code=401,
detail="Invalid user ID in authentication token"
)
# Clerk user IDs are strings (e.g., 'user_xxx' or numeric strings)
# OnboardingSession uses Clerk user_id as String(255), so we can use it directly
authenticated_user_id = clerk_user_id
logger.warning(f"🚀 Getting onboarding data for authenticated user: {authenticated_user_id}")
db_service = EnhancedStrategyDBService(db)
enhanced_service = EnhancedStrategyService(db_service)
onboarding_data = await enhanced_service._get_onboarding_data(authenticated_user_id)
logger.warning(f"✅ Onboarding data retrieved successfully for user: {authenticated_user_id}")
return ResponseBuilder.create_success_response(
message="Onboarding data retrieved successfully",
data=onboarding_data
)
except HTTPException as he:
logger.error(f"❌ HTTPException in get_onboarding_data: status={he.status_code}, detail={he.detail}")
raise
except Exception as e:
logger.error(f"❌ Error getting onboarding data: {str(e)}")
logger.error(f"❌ Exception type: {type(e).__name__}")
import traceback
logger.error(f"❌ Traceback: {traceback.format_exc()}")
raise ContentPlanningErrorHandler.handle_general_error(e, "get_onboarding_data")
@router.post("/smart-autofill")
async def smart_autofill(
current_user: Dict[str, Any] = Depends(get_current_user),
db: Session = Depends(get_db)
) -> Dict[str, Any]:
"""Get smart autofill combining database fields (18-19) + AI fields (11-12)."""
try:
# Extract authenticated user_id from Clerk
clerk_user_id = str(current_user.get('id', ''))
if not clerk_user_id:
raise HTTPException(
status_code=401,
detail="Invalid user ID in authentication token"
)
# Clerk user IDs are strings (e.g., 'user_xxx' or numeric strings)
# OnboardingSession uses Clerk user_id as String(255), so we can use it directly
authenticated_user_id = clerk_user_id
logger.info(f"🚀 Starting smart autofill for authenticated user: {authenticated_user_id}")
# Import unified service
from ....services.content_strategy.autofill.unified_autofill_service import UnifiedAutoFillService
unified_service = UnifiedAutoFillService(db)
autofill_data = await unified_service.get_autofill(authenticated_user_id)
logger.info(f"✅ Smart autofill completed successfully for user: {authenticated_user_id}")
return ResponseBuilder.create_success_response(
message="Smart autofill completed successfully",
data=autofill_data
)
except HTTPException:
raise
except Exception as e:
logger.error(f"❌ Error in smart autofill: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "smart_autofill")
@router.get("/tooltips")
async def get_enhanced_strategy_tooltips(
current_user: Dict[str, Any] = Depends(get_current_user)
) -> Dict[str, Any]:
"""Get tooltip data for enhanced strategy fields."""
try:
# Verify authentication (user_id not needed for static data, but auth is required)
if not current_user or not current_user.get('id'):
raise HTTPException(
status_code=401,
detail="Authentication required"
)
logger.info(f"🚀 Getting enhanced strategy tooltips for authenticated user: {current_user.get('id')}")
# Mock tooltip data - in real implementation, this would come from a database
tooltip_data = {
"business_objectives": {
"title": "Business Objectives",
"description": "Define your primary and secondary business goals that content will support.",
"examples": ["Increase brand awareness by 25%", "Generate 100 qualified leads per month"],
"best_practices": ["Be specific and measurable", "Align with overall business strategy"]
},
"target_metrics": {
"title": "Target Metrics",
"description": "Specify the KPIs that will measure content strategy success.",
"examples": ["Traffic growth: 30%", "Engagement rate: 5%", "Conversion rate: 2%"],
"best_practices": ["Set realistic targets", "Track both leading and lagging indicators"]
},
"content_budget": {
"title": "Content Budget",
"description": "Define your allocated budget for content creation and distribution.",
"examples": ["$10,000 per month", "15% of marketing budget"],
"best_practices": ["Include both creation and distribution costs", "Plan for seasonal variations"]
},
"team_size": {
"title": "Team Size",
"description": "Number of team members dedicated to content creation and management.",
"examples": ["3 content creators", "1 content manager", "2 designers"],
"best_practices": ["Consider skill sets and workload", "Plan for growth"]
},
"implementation_timeline": {
"title": "Implementation Timeline",
"description": "Timeline for implementing your content strategy.",
"examples": ["3 months for setup", "6 months for full implementation"],
"best_practices": ["Set realistic milestones", "Allow for iteration"]
},
"market_share": {
"title": "Market Share",
"description": "Your current market share and target market share.",
"examples": ["Current: 5%", "Target: 15%"],
"best_practices": ["Use reliable data sources", "Set achievable targets"]
},
"competitive_position": {
"title": "Competitive Position",
"description": "Your position relative to competitors in the market.",
"examples": ["Market leader", "Challenger", "Niche player"],
"best_practices": ["Be honest about your position", "Identify opportunities"]
},
"performance_metrics": {
"title": "Performance Metrics",
"description": "Key metrics to track content performance.",
"examples": ["Organic traffic", "Engagement rate", "Conversion rate"],
"best_practices": ["Focus on actionable metrics", "Set up proper tracking"]
}
}
logger.info("✅ Enhanced strategy tooltips retrieved successfully")
return ResponseBuilder.create_success_response(
message="Enhanced strategy tooltips retrieved successfully",
data=tooltip_data
)
except HTTPException:
raise
except Exception as e:
logger.error(f"❌ Error getting enhanced strategy tooltips: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "get_enhanced_strategy_tooltips")
@router.get("/disclosure-steps")
async def get_enhanced_strategy_disclosure_steps(
current_user: Dict[str, Any] = Depends(get_current_user)
) -> Dict[str, Any]:
"""Get progressive disclosure steps for enhanced strategy."""
try:
# Verify authentication (user_id not needed for static data, but auth is required)
if not current_user or not current_user.get('id'):
raise HTTPException(
status_code=401,
detail="Authentication required"
)
logger.info(f"🚀 Getting enhanced strategy disclosure steps for authenticated user: {current_user.get('id')}")
# Progressive disclosure steps configuration
disclosure_steps = [
{
"id": "business_context",
"title": "Business Context",
"description": "Define your business objectives and context",
"fields": ["business_objectives", "target_metrics", "content_budget", "team_size", "implementation_timeline", "market_share", "competitive_position", "performance_metrics"],
"is_complete": False,
"is_visible": True,
"dependencies": []
},
{
"id": "audience_intelligence",
"title": "Audience Intelligence",
"description": "Understand your target audience",
"fields": ["content_preferences", "consumption_patterns", "audience_pain_points", "buying_journey", "seasonal_trends", "engagement_metrics"],
"is_complete": False,
"is_visible": False,
"dependencies": ["business_context"]
},
{
"id": "competitive_intelligence",
"title": "Competitive Intelligence",
"description": "Analyze your competitive landscape",
"fields": ["top_competitors", "competitor_content_strategies", "market_gaps", "industry_trends", "emerging_trends"],
"is_complete": False,
"is_visible": False,
"dependencies": ["audience_intelligence"]
},
{
"id": "content_strategy",
"title": "Content Strategy",
"description": "Define your content approach",
"fields": ["preferred_formats", "content_mix", "content_frequency", "optimal_timing", "quality_metrics", "editorial_guidelines", "brand_voice"],
"is_complete": False,
"is_visible": False,
"dependencies": ["competitive_intelligence"]
},
{
"id": "distribution_channels",
"title": "Distribution Channels",
"description": "Plan your content distribution",
"fields": ["traffic_sources", "conversion_rates", "content_roi_targets"],
"is_complete": False,
"is_visible": False,
"dependencies": ["content_strategy"]
},
{
"id": "target_audience",
"title": "Target Audience",
"description": "Define your target audience segments",
"fields": ["target_audience", "content_pillars"],
"is_complete": False,
"is_visible": False,
"dependencies": ["distribution_channels"]
}
]
logger.info("✅ Enhanced strategy disclosure steps retrieved successfully")
return ResponseBuilder.create_success_response(
message="Enhanced strategy disclosure steps retrieved successfully",
data=disclosure_steps
)
except HTTPException:
raise
except Exception as e:
logger.error(f"❌ Error getting enhanced strategy disclosure steps: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "get_enhanced_strategy_disclosure_steps")
@router.post("/cache/clear")
async def clear_streaming_cache(
current_user: Dict[str, Any] = Depends(get_current_user)
):
"""Clear streaming cache for the authenticated user."""
try:
# Extract authenticated user_id from Clerk
clerk_user_id = str(current_user.get('id', ''))
if not clerk_user_id:
raise HTTPException(
status_code=401,
detail="Invalid user ID in authentication token"
)
# Clerk user IDs are strings (e.g., 'user_xxx' or numeric strings)
# Cache keys use the Clerk user_id directly
authenticated_user_id = clerk_user_id
logger.info(f"🚀 Clearing streaming cache for authenticated user: {authenticated_user_id}")
# Import the cache from the streaming endpoints module
from .streaming_endpoints import streaming_cache
# Clear cache for authenticated user only (security: users can only clear their own cache)
cache_keys_to_remove = [
f"strategic_intelligence_{authenticated_user_id}",
f"keyword_research_{authenticated_user_id}"
]
for key in cache_keys_to_remove:
if key in streaming_cache:
del streaming_cache[key]
logger.info(f"✅ Cleared cache for key: {key}")
return ResponseBuilder.create_success_response(
message="Streaming cache cleared successfully",
data={"cleared_for_user": authenticated_user_id}
)
except HTTPException:
raise
except Exception as e:
logger.error(f"❌ Error clearing streaming cache: {str(e)}")
raise ContentPlanningErrorHandler.handle_general_error(e, "clear_streaming_cache")

View File

@@ -0,0 +1,7 @@
"""
Strategy Middleware Module
Validation and error handling middleware for content strategies.
"""
# Future middleware modules will be imported here
__all__ = []

View File

@@ -0,0 +1,36 @@
"""
Content Strategy Routes
Main router that includes all content strategy endpoint modules.
"""
from fastapi import APIRouter
# Import endpoint modules
from .endpoints.strategy_crud import router as crud_router
from .endpoints.analytics_endpoints import router as analytics_router
from .endpoints.utility_endpoints import router as utility_router
from .endpoints.streaming_endpoints import router as streaming_router
from .endpoints.autofill_endpoints import router as autofill_router
from .endpoints.ai_generation_endpoints import router as ai_generation_router
# Create main router
# Using /enhanced-strategies prefix for backward compatibility with frontend
router = APIRouter(prefix="/enhanced-strategies", tags=["Content Strategy"])
# Include all endpoint routers
# IMPORTANT: Specific routes (like /onboarding-data) must come BEFORE parameterized routes (like /{strategy_id})
# to avoid route conflicts where FastAPI tries to parse "onboarding-data" as strategy_id
# Utility endpoints directly under /enhanced-strategies (must come first - has /onboarding-data)
router.include_router(utility_router, prefix="")
# Streaming endpoints directly under /enhanced-strategies
router.include_router(streaming_router, prefix="")
# AI generation endpoints under /enhanced-strategies/ai-generation
router.include_router(ai_generation_router, prefix="/ai-generation")
# CRUD endpoints directly under /enhanced-strategies (backward compatibility)
# This includes /{strategy_id} route, so it must come AFTER specific routes
router.include_router(crud_router, prefix="")
# Analytics endpoints under /enhanced-strategies/strategies/{id}/...
router.include_router(analytics_router, prefix="/strategies")
# Autofill endpoints under /enhanced-strategies/strategies/{id}/...
router.include_router(autofill_router, prefix="/strategies")

View File

@@ -0,0 +1,104 @@
"""
Request Models for Content Planning API
Extracted from the main content_planning.py file for better organization.
"""
from pydantic import BaseModel, Field
from typing import Dict, Any, List, Optional
from datetime import datetime
# Content Strategy Request Models
class ContentStrategyRequest(BaseModel):
industry: str
target_audience: Dict[str, Any]
business_goals: List[str]
content_preferences: Dict[str, Any]
competitor_urls: Optional[List[str]] = None
class ContentStrategyCreate(BaseModel):
user_id: int
name: str
industry: str
target_audience: Dict[str, Any]
content_pillars: Optional[List[Dict[str, Any]]] = None
ai_recommendations: Optional[Dict[str, Any]] = None
# Calendar Event Request Models
class CalendarEventCreate(BaseModel):
strategy_id: int
title: str
description: str
content_type: str
platform: str
scheduled_date: datetime
ai_recommendations: Optional[Dict[str, Any]] = None
# Content Gap Analysis Request Models
class ContentGapAnalysisCreate(BaseModel):
user_id: int
website_url: str
competitor_urls: List[str]
target_keywords: Optional[List[str]] = None
industry: Optional[str] = None
analysis_results: Optional[Dict[str, Any]] = None
recommendations: Optional[Dict[str, Any]] = None
opportunities: Optional[Dict[str, Any]] = None
class ContentGapAnalysisRequest(BaseModel):
website_url: str
competitor_urls: List[str]
target_keywords: Optional[List[str]] = None
industry: Optional[str] = None
# AI Analytics Request Models
class ContentEvolutionRequest(BaseModel):
strategy_id: int
time_period: str = "30d" # 7d, 30d, 90d, 1y
class PerformanceTrendsRequest(BaseModel):
strategy_id: int
metrics: Optional[List[str]] = None
class ContentPerformancePredictionRequest(BaseModel):
strategy_id: int
content_data: Dict[str, Any]
class StrategicIntelligenceRequest(BaseModel):
strategy_id: int
market_data: Optional[Dict[str, Any]] = None
# Calendar Generation Request Models
class CalendarGenerationRequest(BaseModel):
user_id: int
strategy_id: Optional[int] = None
calendar_type: str = Field("monthly", description="Type of calendar: monthly, weekly, custom")
industry: Optional[str] = None
business_size: str = Field("sme", description="Business size: startup, sme, enterprise")
force_refresh: bool = Field(False, description="Force refresh calendar generation")
class ContentOptimizationRequest(BaseModel):
user_id: int
event_id: Optional[int] = None
title: str
description: str
content_type: str
target_platform: str
original_content: Optional[Dict[str, Any]] = None
class PerformancePredictionRequest(BaseModel):
user_id: int
strategy_id: Optional[int] = None
content_type: str
platform: str
content_data: Dict[str, Any]
class ContentRepurposingRequest(BaseModel):
user_id: int
strategy_id: Optional[int] = None
original_content: Dict[str, Any]
target_platforms: List[str]
class TrendingTopicsRequest(BaseModel):
user_id: int
industry: str
limit: int = Field(10, description="Number of trending topics to return")

View File

@@ -0,0 +1,135 @@
"""
Response Models for Content Planning API
Extracted from the main content_planning.py file for better organization.
"""
from pydantic import BaseModel, Field
from typing import Dict, Any, List, Optional
from datetime import datetime
# Content Strategy Response Models
class ContentStrategyResponse(BaseModel):
id: int
name: str
industry: str
target_audience: Dict[str, Any]
content_pillars: List[Dict[str, Any]]
ai_recommendations: Dict[str, Any]
created_at: datetime
updated_at: datetime
# Calendar Event Response Models
class CalendarEventResponse(BaseModel):
id: int
strategy_id: int
title: str
description: str
content_type: str
platform: str
scheduled_date: datetime
status: str
ai_recommendations: Optional[Dict[str, Any]] = None
created_at: datetime
updated_at: datetime
# Content Gap Analysis Response Models
class ContentGapAnalysisResponse(BaseModel):
id: int
user_id: int
website_url: str
competitor_urls: List[str]
target_keywords: Optional[List[str]] = None
industry: Optional[str] = None
analysis_results: Optional[Dict[str, Any]] = None
recommendations: Optional[Dict[str, Any]] = None
opportunities: Optional[Dict[str, Any]] = None
created_at: datetime
updated_at: datetime
class ContentGapAnalysisFullResponse(BaseModel):
website_analysis: Dict[str, Any]
competitor_analysis: Dict[str, Any]
gap_analysis: Dict[str, Any]
recommendations: List[Dict[str, Any]]
opportunities: List[Dict[str, Any]]
created_at: datetime
# AI Analytics Response Models
class AIAnalyticsResponse(BaseModel):
analysis_type: str
strategy_id: int
results: Dict[str, Any]
recommendations: List[Dict[str, Any]]
analysis_date: datetime
# Calendar Generation Response Models
class CalendarGenerationResponse(BaseModel):
user_id: int
strategy_id: Optional[int]
calendar_type: str
industry: str
business_size: str
generated_at: datetime
content_pillars: List[str]
platform_strategies: Dict[str, Any]
content_mix: Dict[str, float]
daily_schedule: List[Dict[str, Any]]
weekly_themes: List[Dict[str, Any]]
content_recommendations: List[Dict[str, Any]]
optimal_timing: Dict[str, Any]
performance_predictions: Dict[str, Any]
trending_topics: List[Dict[str, Any]]
repurposing_opportunities: List[Dict[str, Any]]
ai_insights: List[Dict[str, Any]]
competitor_analysis: Dict[str, Any]
gap_analysis_insights: Dict[str, Any]
strategy_insights: Dict[str, Any]
onboarding_insights: Dict[str, Any]
processing_time: float
ai_confidence: float
class ContentOptimizationResponse(BaseModel):
user_id: int
event_id: Optional[int]
original_content: Dict[str, Any]
optimized_content: Dict[str, Any]
platform_adaptations: List[str]
visual_recommendations: List[str]
hashtag_suggestions: List[str]
keyword_optimization: Dict[str, Any]
tone_adjustments: Dict[str, Any]
length_optimization: Dict[str, Any]
performance_prediction: Dict[str, Any]
optimization_score: float
created_at: datetime
class PerformancePredictionResponse(BaseModel):
user_id: int
strategy_id: Optional[int]
content_type: str
platform: str
predicted_engagement_rate: float
predicted_reach: int
predicted_conversions: int
predicted_roi: float
confidence_score: float
recommendations: List[str]
created_at: datetime
class ContentRepurposingResponse(BaseModel):
user_id: int
strategy_id: Optional[int]
original_content: Dict[str, Any]
platform_adaptations: List[Dict[str, Any]]
transformations: List[Dict[str, Any]]
implementation_tips: List[str]
gap_addresses: List[str]
created_at: datetime
class TrendingTopicsResponse(BaseModel):
user_id: int
industry: str
trending_topics: List[Dict[str, Any]]
gap_relevance_scores: Dict[str, float]
audience_alignment_scores: Dict[str, float]
created_at: datetime

View File

@@ -0,0 +1,84 @@
"""
Main Router for Content Planning API
Centralized router that includes all sub-routes for the content planning module.
"""
from fastapi import APIRouter, HTTPException, Depends, status
from typing import Dict, Any
from datetime import datetime
from loguru import logger
# Import route modules
from .routes import strategies, calendar_events, gap_analysis, ai_analytics, calendar_generation, health_monitoring, monitoring
# Import content strategy routes (modular endpoints)
from .content_strategy.routes import router as content_strategy_router
# Import quality analysis routes
from ..quality_analysis_routes import router as quality_analysis_router
# Import monitoring routes
from ..monitoring_routes import router as monitoring_routes_router
# Create main router
router = APIRouter(prefix="/api/content-planning", tags=["content-planning"])
# Include route modules
router.include_router(strategies.router)
router.include_router(calendar_events.router)
router.include_router(gap_analysis.router)
router.include_router(ai_analytics.router)
router.include_router(calendar_generation.router)
router.include_router(health_monitoring.router)
router.include_router(monitoring.router)
# Include content strategy routes (modular endpoints)
router.include_router(content_strategy_router)
# Include quality analysis routes
router.include_router(quality_analysis_router)
# Include monitoring routes
router.include_router(monitoring_routes_router)
# Add health check endpoint
@router.get("/health")
async def content_planning_health_check():
"""
Health check for content planning module.
Returns operational status of all sub-modules.
"""
try:
logger.info("🏥 Performing content planning health check")
health_status = {
"service": "content_planning",
"status": "healthy",
"timestamp": datetime.utcnow().isoformat(),
"modules": {
"strategies": "operational",
"calendar_events": "operational",
"gap_analysis": "operational",
"ai_analytics": "operational",
"calendar_generation": "operational",
"health_monitoring": "operational",
"monitoring": "operational",
"enhanced_strategies": "operational",
"models": "operational",
"utils": "operational"
},
"version": "2.0.0",
"architecture": "modular"
}
logger.info("✅ Content planning health check completed")
return health_status
except Exception as e:
logger.error(f"❌ Content planning health check failed: {str(e)}")
return {
"service": "content_planning",
"status": "unhealthy",
"timestamp": datetime.utcnow().isoformat(),
"error": str(e)
}

Some files were not shown because too many files have changed in this diff Show More