diff --git a/README.md b/README.md index e9f50d46..daa2652d 100644 --- a/README.md +++ b/README.md @@ -1,9 +1,9 @@ -# AI Blog Content Generation Toolkit - Alwrity +# AI Content Generation Toolkit - Alwrity ![](https://github.com/AJaySi/AI-Blog-Writer/blob/main/workspace/keyword_blog.gif) ## Introduction -This toolkit automates and enhances the process of blog creation, optimization, and management. +Alwrity automates and enhances the process of blog creation, optimization, and management. Leveraging AI technologies, it assists content creators and digital marketers in generating, formatting, and uploading blog content efficiently. The toolkit integrates advanced AI models for text generation, image creation, and data analysis, streamlining the content creation pipeline. --- @@ -11,11 +11,12 @@ Leveraging AI technologies, it assists content creators and digital marketers in ## Getting Started 🚀 🤞🤞🤞 To start using this tool, simply follow one of the options below: + --- +### Option 1: FOLLOW-ME Local Laptop Install 💻 **(WINDOWS)** -### Option 1: 𝗙𝗼𝗹𝗹𝗼𝘄 𝗺𝗲 Local Laptop Install 💻 (Recommended) - -**Step 0**️⃣: **Pre-requisites:** Git, Python3 +#### Step 0 **Pre-requisites:** Git, Python3 +--- **Installing Python on Windows:🐍🪟** - Open PowerShell as admin: Press `Windows Key + X`, then select "Windows PowerShell (Admin)". @@ -23,6 +24,7 @@ To start using this tool, simply follow one of the options below: - Type `python`. If Python is not installed, Windows will prompt you to 'Get Python'. - If Python is installed, you should see '>>>>>'. +--- **Installing Git on Windows:🛺** - Open PowerShell or Windows Terminal: Press `Windows Key + X`, then select "Windows Terminal". 
@@ -30,55 +32,53 @@ To start using this tool, simply follow one of the options below: winget install --id Git.Git -e --source winget - Wait for download bars to finish -*Note for Linux Users:* If you're on Linux and can't install these, get lost 🧙‍♂️ +--- +### Step 1: Clone this repository to your local machine. -**Step 1**️⃣: Clone this repository to your local machine. - -``` To clone the repository to your local machine, perform the following steps: -1. **Open Windows PowerShell as Administrator:** Press `Windows Key + X` and select "Windows PowerShell (Admin)" from the menu. +1. **Open Windows PowerShell as Administrator:** Open "Windows PowerShell (Admin)" from start menu. -2. **Navigate to the Desired Directory:** Use the `cd` command to move to the directory where you want to clone the repository. +2. **Navigate to the Desired Directory:** Use the 'cd' command to move to the directory where you want to clone the repository. 3. **Clone the Repository:** Run the following command in PowerShell to clone the repository: -git clone https://github.com/AJaySi/AI-Blog-Writer.git +`git clone https://github.com/AJaySi/AI-Blog-Writer.git` This command will download all the files from the repository to your local machine. 4. **Verify the Clone:** After the cloning process is complete, navigate into the newly created directory using: -cd AI-Blog-Writer +`cd AI-Writer` -``` -Once you've cloned the repository, you can proceed with the next steps for installation and setup. - - -**Step 2**️⃣: Install required dependencies: -- Open command prompt on your local machine: Press `Windows Key + R`, type `cmd`, then press Enter. -- Navigate to the folder from Step 1 -- Run: `python -m pip install -r requirements.txt` - -**Step 3**️⃣: Run the script: -- Execute: `python alwrity.py` - -**Step 4**️⃣: The tool will guide you through setting up your APIs. +Congratulations: Once you've cloned the repository, you can proceed with the next steps for installation and setup. 
--- -### Option 2: Replit: Cloud Install ☁️☁️☁️ ☁️ ☁️ ....☁️ +### Step 2: Install required dependencies: -**Step 1**️⃣: Fork this repository to your own GitHub account. - -**Step 2**️⃣: Follow this guide: [Running GitHub Repositories on Replit](https://docs.replit.com/programming-ide/using-git-on-replit/running-github-repositories-replit) 📖 +- Open command prompt on your local machine. +- Navigate to the folder from Step 1 (AI-Writer) +- Run in powershell: +`python -m pip install -r requirements.txt` --- + +### Step 3: Run the script: + +- Run in powershell: +`python alwrity.py` + +--- + +### Step 4: The tool will guide you through setting up your APIs. +--- + + ### Option 3: Web URL 🌐 *(For easy access)* -**Step 1**️⃣: Error 404: Page not found. 😅 - - +Coming Soon.... --- + ## Features - **Online Research Integration**: Enhances blog content by integrating insights and information gathered from online research, ensuring the content is informative and up-to-date. This gives context for generating content. Tavily AI, Google search, serp and Vision AI is used to scrape web data for context augumentation. TBD: Include CrewAI for web research agents. 
diff --git a/lib/ai_web_researcher/arxiv_schlorly_research.py b/lib/ai_web_researcher/arxiv_schlorly_research.py index cd459b60..cb833b22 100644 --- a/lib/ai_web_researcher/arxiv_schlorly_research.py +++ b/lib/ai_web_researcher/arxiv_schlorly_research.py @@ -112,7 +112,7 @@ def get_arxiv_main_content(url): pdf_text = '' # Read the downloaded PDF - with open(pdf_filename, 'rb') as f: + with open(pdf_filename, 'rb', encoding="utf-8") as f: pdf_reader = PyPDF2.PdfReader(f) for page in pdf_reader.pages: @@ -168,7 +168,7 @@ def download_image(image_url, base_url, folder="images"): response.raise_for_status() image_name = image_url.split("/")[-1] - with open(os.path.join(folder, image_name), 'wb') as file: + with open(os.path.join(folder, image_name), 'wb', encoding="utf-8") as file: file.write(response.content) return True @@ -297,7 +297,7 @@ def read_written_ids(file_path): """ written_ids = set() try: - with open(file_path, 'r') as file: + with open(file_path, 'r', encoding="utf-8") as file: for line in file: written_ids.add(line.strip()) except FileNotFoundError: @@ -320,12 +320,12 @@ def append_id_to_file(arxiv_id, output_file_path): if not os.path.exists(output_file_path): logger.info(f"File does not exist. 
Creating new file: {output_file_path}") # Create a new file and append the ID - with open(output_file_path, 'a') as outfile: + with open(output_file_path, 'a', encoding="utf-8") as outfile: outfile.write(arxiv_id + '\n') else: logger.info(f"Appending to existing file: {output_file_path}") # File exists, append the ID - with open(output_file_path, 'a') as outfile: + with open(output_file_path, 'a', encoding="utf-8") as outfile: outfile.write(arxiv_id + '\n') except Exception as e: diff --git a/lib/ai_web_researcher/common_utils.py b/lib/ai_web_researcher/common_utils.py index bdaf3ad6..64694295 100644 --- a/lib/ai_web_researcher/common_utils.py +++ b/lib/ai_web_researcher/common_utils.py @@ -93,7 +93,7 @@ def save_in_file(table_content): file_path = os.environ.get('SEARCH_SAVE_FILE') try: # Save the content to the file - with open(file_path, "a+") as file: + with open(file_path, "a+", encoding="utf-8") as file: file.write(table_content) file.write("\n" * 3) # Add three newlines at the end logger.info(f"Search content saved to {file_path}") diff --git a/lib/ai_web_researcher/google_trends_researcher.py b/lib/ai_web_researcher/google_trends_researcher.py index 53a14743..bea37e4a 100644 --- a/lib/ai_web_researcher/google_trends_researcher.py +++ b/lib/ai_web_researcher/google_trends_researcher.py @@ -482,7 +482,7 @@ def save_in_file(table_content): file_path = os.environ.get('SEARCH_SAVE_FILE') try: # Save the content to the file - with open(file_path, "a+") as file: + with open(file_path, "a+", encoding="utf-8") as file: file.write(table_content) file.write("\n" * 3) # Add three newlines at the end logger.info(f"Search content saved to {file_path}") diff --git a/lib/ai_web_researcher/tavily_ai_search.py b/lib/ai_web_researcher/tavily_ai_search.py index e5829df0..088c9d2a 100644 --- a/lib/ai_web_researcher/tavily_ai_search.py +++ b/lib/ai_web_researcher/tavily_ai_search.py @@ -160,7 +160,7 @@ def save_in_file(table_content): file_path = 
os.environ.get('SEARCH_SAVE_FILE') try: # Save the content to the file - with open(file_path, "a") as file: + with open(file_path, "a", encoding="utf-8") as file: file.write(table_content) file.write("\n" * 3) # Add three newlines at the end logger.info(f"Search content saved to {file_path}") diff --git a/lib/ai_writers/keywords_to_blog.py b/lib/ai_writers/keywords_to_blog.py index 8d16a31f..a618db30 100644 --- a/lib/ai_writers/keywords_to_blog.py +++ b/lib/ai_writers/keywords_to_blog.py @@ -37,26 +37,28 @@ def write_blog_from_keywords(search_keywords, url=None): blog_markdown_str = "" example_blog_titles = [] -# logger.info(f"Researching and Writing Blog on keywords: {search_keywords}") -# # Call on the got-researcher, tavily apis for this. Do google search for organic competition. -# try: -# google_search_result, g_titles = do_google_serp_search(search_keywords) -# example_blog_titles.append(g_titles) -# blog_markdown_str = write_blog_google_serp(search_keywords, google_search_result) -# except Exception as err: -# logger.error(f"Failed in Google web research: {err}") -# # logger.info/check the final blog content. -# logger.info("\n######### Draft1: Finished Blog from Google web search: ###########\n\n") + logger.info(f"Researching and Writing Blog on keywords: {search_keywords}") + # Call on the got-researcher, tavily apis for this. Do google search for organic competition. + try: + google_search_result, g_titles = do_google_serp_search(search_keywords) + example_blog_titles.append(g_titles) + blog_markdown_str = write_blog_google_serp(search_keywords, google_search_result) + except Exception as err: + logger.error(f"Failed in Google web research: {err}") + # logger.info/check the final blog content. + logger.info("\n######### Draft1: Finished Blog from Google web search: ###########\n\n") + exit(1) -# # Do Tavily AI research to augument the above blog. 
-# try: -# tavily_search_result, t_titles = do_tavily_ai_search(search_keywords) -# example_blog_titles.append(t_titles) -# blog_markdown_str = blog_with_research(blog_markdown_str, tavily_search_result) -# logger.info(f"######### Blog content after Tavily AI research: ######### \n\n{blog_markdown_str}\n\n") -# except Exception as err: -# logger.error(f"Failed to do Tavily AI research: {err}") -# logger.info("######### Draft2: Blog content after Tavily AI research: #########\n\n") + + # Do Tavily AI research to augument the above blog. + try: + tavily_search_result, t_titles = do_tavily_ai_search(search_keywords) + example_blog_titles.append(t_titles) + blog_markdown_str = blog_with_research(blog_markdown_str, tavily_search_result) + logger.info(f"######### Blog content after Tavily AI research: ######### \n\n{blog_markdown_str}\n\n") + except Exception as err: + logger.error(f"Failed to do Tavily AI research: {err}") + logger.info("######### Draft2: Blog content after Tavily AI research: #########\n\n") try: # Do Metaphor/Exa AI search. 
diff --git a/lib/blog_postprocessing/save_blog_to_file.py b/lib/blog_postprocessing/save_blog_to_file.py index bf5cdc68..09071cc8 100644 --- a/lib/blog_postprocessing/save_blog_to_file.py +++ b/lib/blog_postprocessing/save_blog_to_file.py @@ -109,7 +109,7 @@ def save_blog_to_file(blog_content, blog_title, blog_meta_desc, blog_tags, blog_ # Write to the file try: - with open(blog_output_path, "w") as f: + with open(blog_output_path, "w", encoding="utf-8") as f: f.write(blog_frontmatter) f.write(blog_content) except Exception as e: diff --git a/lib/blog_postprocessing/save_image.py b/lib/blog_postprocessing/save_image.py index a005aeac..89733b67 100644 --- a/lib/blog_postprocessing/save_image.py +++ b/lib/blog_postprocessing/save_image.py @@ -19,7 +19,7 @@ def save_generated_image(img_generation_response, image_dir): try: response = requests.get(generated_image_url, stream=True) response.raise_for_status() - with open(generated_image_filepath, "wb") as image_file: + with open(generated_image_filepath, "wb", encoding="utf-8") as image_file: image_file.write(response.content) except requests.exceptions.RequestException as e: logger.error(f"Failed to get generated image content: {e}") diff --git a/lib/github_blogs/main_getting_started_blogs.py b/lib/github_blogs/main_getting_started_blogs.py index c3cc20b3..c397fefc 100644 --- a/lib/github_blogs/main_getting_started_blogs.py +++ b/lib/github_blogs/main_getting_started_blogs.py @@ -34,7 +34,7 @@ def blog_from_github(github_opts, flag): elif 'csv' in flag: try: gh_urls = [] - with open(github_opts, 'r') as file: + with open(github_opts, 'r', encoding="utf-8") as file: # Read each line in the file for gh_url in file: gh_urls.append(gh_url.strip()) diff --git a/lib/github_blogs/scrape_github_readme.py b/lib/github_blogs/scrape_github_readme.py index 5bc39536..3e03958c 100644 --- a/lib/github_blogs/scrape_github_readme.py +++ b/lib/github_blogs/scrape_github_readme.py @@ -276,7 +276,7 @@ def 
check_if_already_written(github_url, file_path='papers_already_written_on.tx bool: True if an exact match is found, False otherwise. """ try: - with open(file_path, 'r') as file: + with open(file_path, 'r', encoding="utf-8") as file: # Read each line in the file for line in file: # Check for an exact match diff --git a/lib/gpt_providers/audio_to_text_generation/stt_audio_blog.py b/lib/gpt_providers/audio_to_text_generation/stt_audio_blog.py index ee55be08..957fc8f2 100644 --- a/lib/gpt_providers/audio_to_text_generation/stt_audio_blog.py +++ b/lib/gpt_providers/audio_to_text_generation/stt_audio_blog.py @@ -79,7 +79,7 @@ def speech_to_text(video_url, output_path='.'): logger.info("Transcribing using OpenAI's Whisper model.") transcript = client.audio.transcriptions.create( model="whisper-1", - file=open(audio_file, "rb"), + file=open(audio_file, "rb", encoding="utf-8"), response_format="text" ) logger.info(f"\nYouTube video transcription:\n{yt.title}\n{transcript}\n") @@ -133,7 +133,7 @@ def long_video(temp_file_name): for i, chunk in enumerate(chunks): with tempfile.NamedTemporaryFile(suffix=".mp3", delete=False) as audio_chunk_file: chunk.write_audiofile(audio_chunk_file.name, codec="mp3") - with open(audio_chunk_file.name, "rb") as audio_file: + with open(audio_chunk_file.name, "rb", encoding="utf-8") as audio_file: # Transcribe each chunk using OpenAI's Whisper API app.logger.info(f"Transcribing chunk {i+1}/{len(chunks)}") transcript = openai.Audio.transcribe("whisper-1", audio_file) diff --git a/lib/gpt_providers/image_generation/gen_variation_img.py b/lib/gpt_providers/image_generation/gen_variation_img.py index 2380e886..e0899b98 100644 --- a/lib/gpt_providers/image_generation/gen_variation_img.py +++ b/lib/gpt_providers/image_generation/gen_variation_img.py @@ -35,7 +35,7 @@ def gen_new_from_given_img(img_path, image_dir, num_img=1, img_size="1024x1024", client = OpenAI() variation_response = client.images.create_variation( - image=open(img_path, "rb"), + 
image=open(img_path, "rb", encoding="utf-8"), n=num_img, size=img_size, response_format=response_format diff --git a/lib/gpt_providers/image_to_text_gen/openai_vision_img_details.py b/lib/gpt_providers/image_to_text_gen/openai_vision_img_details.py index a8fe9efb..384ea0db 100644 --- a/lib/gpt_providers/image_to_text_gen/openai_vision_img_details.py +++ b/lib/gpt_providers/image_to_text_gen/openai_vision_img_details.py @@ -32,7 +32,7 @@ def analyze_and_extract_details_from_image(image_path): def encode_image(path): """ Encodes an image to a base64 string. """ - with open(path, "rb") as image_file: + with open(path, "rb", encoding="utf-8") as image_file: return base64.b64encode(image_file.read()).decode('utf-8') base64_image = encode_image(image_path) diff --git a/lib/gpt_providers/text_generation/gemini_pro_text.py b/lib/gpt_providers/text_generation/gemini_pro_text.py index a015dc3b..0e90b5f6 100644 --- a/lib/gpt_providers/text_generation/gemini_pro_text.py +++ b/lib/gpt_providers/text_generation/gemini_pro_text.py @@ -4,6 +4,7 @@ import sys from pathlib import Path import google.generativeai as genai +from google.api_core import retry from dotenv import load_dotenv load_dotenv(Path('../../../.env')) from loguru import logger @@ -13,16 +14,10 @@ logger.add(sys.stdout, format="{level}|{file}:{line}:{function}| {message}" ) -from tenacity import ( - retry, - stop_after_attempt, - wait_random_exponential, -) # for exponential backoff - -@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(6)) def gemini_text_response(prompt, temperature, top_p, n, max_tokens): """ Common functiont to get response from gemini pro Text. 
""" + #FIXME: Include : https://github.com/google-gemini/cookbook/blob/main/quickstarts/rest/System_instructions_REST.ipynb try: genai.configure(api_key=os.getenv('GEMINI_API_KEY')) except Exception as err: @@ -35,10 +30,13 @@ def gemini_text_response(prompt, temperature, top_p, n, max_tokens): "top_k": n, "max_output_tokens": max_tokens } + # FIXME: Expose model_name in main_config model = genai.GenerativeModel(model_name="gemini-1.0-pro", generation_config=generation_config) try: - response = model.generate_content(prompt, stream=True) + # text_response = [] + response = model.generate_content(prompt, stream=True, request_options={'retry':retry.Retry()}) for chunk in response: + # text_response.append(chunk.text) print(chunk.text) return response.text except Exception as err: diff --git a/lib/image_to_text/gpt_vision_image_details.py b/lib/image_to_text/gpt_vision_image_details.py index 890b7f30..6c5e2935 100644 --- a/lib/image_to_text/gpt_vision_image_details.py +++ b/lib/image_to_text/gpt_vision_image_details.py @@ -22,7 +22,7 @@ def analyze_and_extract_details_from_image(image_path, api_key): """ def encode_image(path): """ Encodes an image to a base64 string. """ - with open(path, "rb") as image_file: + with open(path, "rb", encoding="utf-8") as image_file: return base64.b64encode(image_file.read()).decode('utf-8') base64_image = encode_image(image_path) diff --git a/lib/scholar_blogs/main_arxiv_to_blog.py b/lib/scholar_blogs/main_arxiv_to_blog.py index 4689bcde..61f417d9 100644 --- a/lib/scholar_blogs/main_arxiv_to_blog.py +++ b/lib/scholar_blogs/main_arxiv_to_blog.py @@ -78,7 +78,7 @@ def blog_arxiv_url_list(file_path): """ Write blogs on all the arxiv links given in a file. 
""" extracted_ids = [] try: - with open(file_path, 'r') as file: + with open(file_path, 'r', encoding="utf-8") as file: for line in file: arxiv_id = extract_arxiv_ids_from_line(line) if arxiv_id: diff --git a/lib/utils/take_url_screenshot.py b/lib/utils/take_url_screenshot.py index d99324a8..632dd007 100644 --- a/lib/utils/take_url_screenshot.py +++ b/lib/utils/take_url_screenshot.py @@ -47,7 +47,7 @@ def screenshot_api(url, generated_image_filepath): image = client.take(options) # store the screenshot the example.png file - with open(generated_image_filepath, 'wb') as result_file: + with open(generated_image_filepath, 'wb', encoding="utf-8") as result_file: shutil.copyfileobj(image, result_file) # Display the screenshot using Image.show @@ -89,7 +89,7 @@ def take_screenshot(url, generated_image_filepath): screenshot = driver.get_screenshot_as_png() # Save the screenshot to a file - with open(generated_image_filepath, "wb") as f: + with open(generated_image_filepath, "wb", encoding="utf-8") as f: f.write(screenshot) # Display the screenshot using Image.show diff --git a/lib/utils/wordpress_blog_uploader.py b/lib/utils/wordpress_blog_uploader.py index 1211708a..ff346fd6 100644 --- a/lib/utils/wordpress_blog_uploader.py +++ b/lib/utils/wordpress_blog_uploader.py @@ -249,7 +249,7 @@ def upload_media(url, username, password, media_path, alt_text, description, tit 'Content-Disposition': 'attachment; filename={}'.format(os.path.basename(media_path)) } - with open(media_path, 'rb') as media: + with open(media_path, 'rb', encoding="utf-8") as media: media_name = os.path.basename(media_path) files = {'file': (media_name, media, mime_type)}
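
Review note on the `open()` changes in this diff: several hunks add `encoding="utf-8"` to calls that use a binary mode (`'rb'`, `'wb'`). Python's built-in `open()` raises `ValueError` when `encoding` is combined with a binary mode, so those call sites would presumably fail at runtime, while the text-mode hunks (`'r'`, `'a'`, `'a+'`, `'w'`) are unaffected. The sketch below demonstrates the behavior and shows one possible guard. Both `open_with_safe_encoding` and `binary_open_rejects_encoding` are hypothetical names used only for illustration and are not part of the repository.

```python
def open_with_safe_encoding(path, mode="r"):
    """Hypothetical helper (not in the repository): open `path`, passing
    encoding="utf-8" only when `mode` is a text mode."""
    if "b" in mode:
        # Binary modes do not accept an encoding argument, so omit it.
        return open(path, mode)
    return open(path, mode, encoding="utf-8")


def binary_open_rejects_encoding(path):
    """Return True if open(path, 'rb', encoding='utf-8') raises ValueError."""
    try:
        with open(path, "rb", encoding="utf-8"):
            return False
    except ValueError:
        # CPython reports: binary mode doesn't take an encoding argument
        return True
```

If this reading is right, the simplest fix is to drop `encoding="utf-8"` from the `'rb'` and `'wb'` call sites (PDF, image, audio, and media uploads) and keep it on the text-mode ones.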