WIP- Try AI-Writer and Web research; working.

This commit is contained in:
AjaySi
2024-02-24 15:15:01 +05:30
parent d89d9ad3d2
commit a87a87a620
21 changed files with 587 additions and 279 deletions

109
README.md

@@ -5,28 +5,15 @@ This toolkit automates and enhances the process of blog creation, optimization,
## Features
### Blog Generation and Optimization
- **YouTube to Blog Conversion**: Converts YouTube videos into detailed blog posts by extracting and transcribing audio, then generating text-based content. TBD: Audio to blog.
- **Online Research Integration**: Enhances blog content by integrating insights and information gathered from online research, so the content is informative and up to date. The research results give context for generating content. Tavily AI, Google search, SERP, and Vision AI are used to scrape web data for context augmentation. TBD: Include CrewAI for web research agents.
- **Image Generation and Processing**: Uses AI models such as DALL-E 3 and Stable Diffusion to create relevant images based on blog content. Offers features to process and optimize images for web usage. FIXME: Need more work with Stable Diffusion.
- **Write Scholarly Article**: Searches for the given keywords or arXiv IDs and writes a review or blog post on the research papers found. Basically, PDF to blog.
- **Write blogs from PDFs**: TBD. The code is there but needs to be abstracted/extracted. There is RAG with LlamaIndex for 'n' PDFs.
- **SEO Optimization**: Employs AI to generate SEO-friendly blog titles, meta descriptions, tags, and categories. Ensures content is optimized for search engines.
- **Blog Output formats**: For easy upload to a website, blogs can be output as plaintext, HTML, or Markdown/MLA format.
- **WordPress Integration**: Generates blog content and uploads it, along with media, to WordPress via its REST API. Most static websites that work with Markdown should work with little testing.
- **WordPress, Jekyll Integration**: Generates blog content and uploads it, along with media, to WordPress via its REST API. Most static websites that work with Markdown should work with little testing.
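
As an illustration of the REST upload mentioned above, here is a minimal sketch that builds (but does not send) a request to the standard WordPress `wp/v2/posts` endpoint. The site URL, user name, application password, and the helper name `build_wp_post_request` are all hypothetical; this is not the toolkit's own upload code.

```python
import base64
import json
import urllib.request


def build_wp_post_request(site_url, user, app_password, title, content, status="draft"):
    """Build (without sending) a request that would create a WordPress post."""
    # Core WordPress REST route for posts.
    endpoint = site_url.rstrip("/") + "/wp-json/wp/v2/posts"
    payload = json.dumps({"title": title, "content": content, "status": status})
    # HTTP Basic auth; assumes an application password is configured for the user.
    token = base64.b64encode(f"{user}:{app_password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        endpoint,
        data=payload.encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical site and credentials, for illustration only.
    req = build_wp_post_request(
        "https://example.com", "editor", "app-password", "Hello", "<p>Generated body</p>"
    )
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the upload. Defaulting to `status="draft"` keeps generated posts unpublished until they are reviewed.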
### Speech-to-Text Conversion
- **Audio Transcription**: Converts speech from video content into text, facilitating the creation of blogs and articles from video sources.
- **AI models used**: OpenAI whisper model, (TBD) AssemblyAI
### AI-Driven Content Creation
- **Text Generation**: Uses OpenAI's ChatGPT and Google Gemini Pro to generate text for blogs.
- **Customizable AI Parameters**: (FIXME) Offers flexibility in adjusting AI parameters like model selection, temperature, and token limits to suit different content needs.
@@ -35,64 +22,62 @@ This toolkit automates and enhances the process of blog creation, optimization,
- **Analyzing and Extracting Image Details**: Uses OpenAI's Vision API and Google Gemini Vision to analyze images and extract details such as alt text, descriptions, titles, and captions, enhancing the SEO of image content.
---
## Installation and Configuration
1. **Clone the Repository**: Clone the toolkit from the provided repository link.
2. **Install Dependencies**: Install necessary Python packages and libraries.
## Installation
---
**Note**: This toolkit is designed for automated blog management and requires appropriate API keys and access credentials for full functionality.
### 1). Prerequisites: pip install -r requirements.txt
```
pip install -r requirements.txt
```
---
### 2). OpenAI, Gemini API keys
Create a `.env` file in the current directory and add your OpenAI and Gemini API keys to it.
FIXME: The code is a little messed up here.
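
A quick way to check the `.env` setup before running the toolkit is sketched below. The key names `OPENAI_API_KEY` and `GEMINI_API_KEY` are assumptions (check the code for the names it actually reads), and the small parser stands in for a dotenv loader so the snippet runs with the standard library only.

```python
import os
from pathlib import Path

# Assumed variable names; the toolkit may read different ones.
REQUIRED_KEYS = ("OPENAI_API_KEY", "GEMINI_API_KEY")


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_file = Path(".env")
    values = parse_env_text(env_file.read_text()) if env_file.exists() else {}
    # Variables already exported in the shell take precedence over the file.
    values.update({k: v for k, v in os.environ.items() if k in REQUIRED_KEYS})
    missing = missing_keys(values)
    print("Missing keys:", ", ".join(missing) if missing else "none")
```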
### Web Research
- **Keyword Research**: Conduct in-depth keyword research by specifying search queries and time ranges.
- **Domain-Specific Searches**: Include specific URLs to confine searches to certain domains, such as Wikipedia or competitor websites.
- **Semantic Analysis**: Explore similar topics and technologies by providing a reference URL for semantic analysis.
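
To make the domain-specific search idea concrete, the sketch below filters a list of result URLs down to a set of allowed domains. It is a client-side illustration under assumed inputs; the function name `restrict_to_domains` is hypothetical, and the search APIs used by the toolkit may apply such restrictions on the server side instead.

```python
from urllib.parse import urlparse


def restrict_to_domains(urls, allowed_domains):
    """Keep URLs whose host equals, or is a subdomain of, an allowed domain."""
    allowed = [domain.lower().lstrip(".") for domain in allowed_domains]
    kept = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        # Match "wikipedia.org" and "en.wikipedia.org", but not "notwikipedia.org".
        if any(host == d or host.endswith("." + d) for d in allowed):
            kept.append(url)
    return kept


if __name__ == "__main__":
    results = [
        "https://en.wikipedia.org/wiki/Search_engine_optimization",
        "https://example.com/blog/seo-tips",
    ]
    print(restrict_to_domains(results, ["wikipedia.org"]))
```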
---
### Competitor Analysis
- **Similar Company Discovery**: Analyze competitor websites to discover similar companies, startups, and technologies.
- **Industry Insights**: Gain insights into industry trends, market competitors, and emerging technologies.
This is in active development and needs ironing out. The main concern is to make it general purpose, for everyone.
Usability and extensibility are major concerns. This section will be updated soon.
### Blog Writing
- **Keyword-Based Blogs**: Generate blog content based on specified keywords, leveraging AI to produce engaging and informative articles.
- **Audio Blog Generation**: Convert audio from YouTube videos into blog posts, facilitating content creation from multimedia sources.
- **GitHub Repository Blogs**: Transform GitHub repositories or topics into blog posts, showcasing code examples and project insights.
- **Scholarly Research Blogs**: Generate blog content based on research papers, summarizing key findings and insights.
usage: pseo_main.py [-h] [--csv CSV] [--keywords KEYWORDS] [--youtube_urls YOUTUBE_URLS] [--scholar SCHOLAR] [--niche] [--wordpress]
[--output_format {plaintext,markdown,html}]
### Blogging Tools
- **Title and Meta Description Generation**: Generate catchy titles and meta descriptions for blog posts to improve SEO and user engagement.
- **Blog Outline Creation**: Generate outlines for blog posts, aiding in structuring content and organizing ideas.
- **FAQ Generation**: Automatically generate FAQs (Frequently Asked Questions) based on blog content, enhancing user engagement and SEO.
- **HTML and Markdown Conversion**: Convert blog posts between HTML and Markdown formats for easy integration with various platforms.
- **Blog Proofreading**: Proofread blog content for grammar, spelling, and readability, ensuring high-quality output.
- **Tag and Category Suggestions**: Generate tags and categories for blog posts based on content analysis, improving organization and discoverability.
options:
-h, --help show this help message and exit
--csv CSV Provide path csv file. Check the template csv for example.
--keywords KEYWORDS Keywords for blog generation.
--youtube_urls YOUTUBE_URLS
Comma-separated YouTube URLs for blog generation.
--scholar SCHOLAR Write blog from latest research papers on given keywords. Use 'arxiv_papers_url' to provide a file arxiv url
list.
--niche Flag to generate niche blogs (default: False).
--wordpress Flag to upload blogs to WordPress (default: False).
--output_format {plaintext,markdown,html}
Output format of the blogs (default: plaintext).
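
For reference, the help text above could be produced by an `argparse` setup along these lines. This is a reconstruction from the printed usage, not the actual contents of `pseo_main.py`; the function name `build_parser` is an assumption.

```python
import argparse


def build_parser():
    """Recreate the command-line interface described by the help text."""
    parser = argparse.ArgumentParser(prog="pseo_main.py")
    parser.add_argument("--csv", help="Provide path csv file. Check the template csv for example.")
    parser.add_argument("--keywords", help="Keywords for blog generation.")
    parser.add_argument("--youtube_urls", help="Comma-separated YouTube URLs for blog generation.")
    parser.add_argument(
        "--scholar",
        help="Write blog from latest research papers on given keywords. "
        "Use 'arxiv_papers_url' to provide a file arxiv url list.",
    )
    # Boolean flags default to False and become True when present.
    parser.add_argument("--niche", action="store_true", help="Flag to generate niche blogs (default: False).")
    parser.add_argument("--wordpress", action="store_true", help="Flag to upload blogs to WordPress (default: False).")
    parser.add_argument(
        "--output_format",
        choices=["plaintext", "markdown", "html"],
        default="plaintext",
        help="Output format of the blogs (default: plaintext).",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```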
### Interactive Mode
- **User-Friendly Interface**: Navigate tasks and options easily through an interactive command-line interface.
- **Menu-Driven Interaction**: Choose between various options, tasks, and tools using intuitive menus and prompts.
- **Task Guidance**: Receive guidance and instructions for each task, facilitating user interaction and decision-making.
---
## Packages, Tools, and APIs Used
**Example Usage:**
- **Keyword usage**:
```
python pseo_main.py --keywords "Writesonic AI SEO-optimized blog writing,PepperType AI virtual content assistant,Copysmith AI enterprise eCommerce content,Copy AI artificial intelligence content generator,Jasper AI creative content platform,Contents generative AI content strategy"
```
**YouTube usage**:
```
python pseo_main.py --youtube_urls "https://www.youtube.com/watch?v=yu27PWzJI_Y,https://www.youtube.com/watch?v=WGzoBD-xthI,https://www.youtube.com/watch?v=zizonToFXDs"
```
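
A minimal sketch of how a comma-separated URL argument like the one above might be split and reduced to video IDs, using only the standard library. The helper names are hypothetical and the toolkit's own parsing may differ.

```python
from urllib.parse import parse_qs, urlparse


def split_youtube_urls(arg_value):
    """Split a comma-separated argument into clean, non-empty URLs."""
    return [url.strip() for url in arg_value.split(",") if url.strip()]


def video_id(url):
    """Return the video ID for youtube.com/watch or youtu.be links, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID in the path.
        return parsed.path.lstrip("/") or None
    if host == "youtube.com" or host.endswith(".youtube.com"):
        # Watch links carry the ID in the "v" query parameter.
        ids = parse_qs(parsed.query).get("v")
        return ids[0] if ids else None
    return None


if __name__ == "__main__":
    arg = "https://www.youtube.com/watch?v=yu27PWzJI_Y,https://www.youtube.com/watch?v=WGzoBD-xthI"
    for url in split_youtube_urls(arg):
        print(video_id(url), url)
```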
**Scholar usage**:
```
python pseo_main.py --scholar "GPT-4 Technical Report"
```
- **Libraries**:
- PyInquirer: For creating interactive command-line interfaces.
- Typer: For building CLI applications with ease.
- Tabulate: For formatting data in tabular form.
- Requests: For making HTTP requests to web APIs.
- python-dotenv: For loading environment variables from a .env file.
- **APIs**:
- Metaphor API: Provides semantic search capabilities for finding similar topics and technologies.
- Tavily API: Offers AI-powered web search functionality for conducting in-depth keyword research.
- SerperDev API: Enables access to search engine results and competitor analysis data.
- OpenAI API: Powers the Large Language Models (LLMs) for generating blog content and conducting research.
- Gemini API: Another LLM provider for natural language processing tasks.
- Ollama API (Work In Progress): An upcoming LLM provider for additional research and content generation capabilities.
## Getting Started
To use this tool, follow these steps:
1. Clone this repository to your local machine.
2. Install the required dependencies using `pip install -r requirements.txt`.
3. Run the script by executing `python blogen.py`.
4. Set up the necessary API keys by following the instructions provided in the script and adding them to the `.env` file.
---
Notes: