AI FAQ Generator & github blogs

ajaysi
2025-05-04 17:04:44 +05:30
parent c51e355d26
commit 26b02b9719
9 changed files with 1810 additions and 463 deletions

@@ -0,0 +1,192 @@
# AI-Powered FAQ Generator
An FAQ generation system that builds researched FAQs from pasted text, uploaded files, or URLs. It uses an LLM to analyze the content, runs web research at a selectable depth, and generates FAQs with a configurable audience, style, and output format.
## Features
### Content Processing
- **Multiple Input Sources**
  - Direct text input
  - File uploads (DOCX, TXT)
  - URL content extraction
  - Support for any content type (general, technical, educational, etc.)
### Research Capabilities
- **Multi-level Search Depth**
  - **Basic**: Google Search for quick, general information
  - **Comprehensive**: Tavily AI for detailed, in-depth research
  - **Expert**: Metaphor AI for specialized, expert-level content
### Customization Options
- **Target Audience**
  - Beginner
  - Intermediate
  - Expert
- **FAQ Style**
  - Technical
  - Conversational
  - Professional
- **Advanced Features**
  - Emoji inclusion
  - Code example generation
  - Reference integration
  - Customizable time range for research
  - Multi-language support
### Output Formats
- Interactive preview
- Markdown
- HTML
- JSON
## Installation
1. Clone the repository
2. Install dependencies:
```bash
pip install -r requirements.txt
```
## Usage
### Basic Usage
```python
import asyncio

from lib.ai_writers.ai_blog_faqs_writer.faqs_generator_blog import FAQGenerator, FAQConfig

# Initialize with default configuration
generator = FAQGenerator()

# generate_faqs is a coroutine, so run it inside an event loop
faqs = asyncio.run(generator.generate_faqs("Your content here"))
```
### Advanced Configuration
```python
from lib.ai_writers.ai_blog_faqs_writer.faqs_generator_blog import (
    FAQGenerator, FAQConfig, TargetAudience, FAQStyle, SearchDepth
)

# Custom configuration
config = FAQConfig(
    num_faqs=10,
    target_audience=TargetAudience.INTERMEDIATE,
    faq_style=FAQStyle.TECHNICAL,
    include_emojis=True,
    include_code_examples=True,
    include_references=True,
    search_depth=SearchDepth.COMPREHENSIVE,
    time_range="last_6_months",
    language="English"
)
generator = FAQGenerator(config)
```
### Web Interface
Run the Streamlit interface:
```bash
streamlit run lib/ai_writers/ai_blog_faqs_writer/faqs_ui.py
```
## Research Process
1. **Content Analysis**
   - Identifies key topics and concepts
   - Extracts potential questions
   - Determines research requirements
2. **Web Research**
   - Selects appropriate search function based on depth
   - Gathers relevant information
   - Validates and cross-references data
3. **FAQ Generation**
   - Creates comprehensive questions
   - Provides detailed answers
   - Includes code examples (if applicable)
   - Adds references and citations
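
The three stages can be pictured as a small asynchronous pipeline. The sketch below is illustrative only: `analyze`, `research`, and `generate` are hypothetical stand-ins that call no LLM or search API. It shows how each stage's output feeds the next, not how the project implements them.

```python
import asyncio
from typing import Dict, List


async def analyze(content: str) -> List[str]:
    """Stage 1 (stub): derive research topics, here one topic per non-empty line."""
    return [line.strip() for line in content.splitlines() if line.strip()]


async def research(topics: List[str]) -> Dict[str, List[str]]:
    """Stage 2 (stub): pretend to search; a real backend would return web results."""
    return {topic: [f"note about {topic}"] for topic in topics}


async def generate(findings: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Stage 3 (stub): turn each researched topic into a question/answer pair."""
    return [
        {"question": f"What is {topic}?", "answer": "; ".join(notes)}
        for topic, notes in findings.items()
    ]


async def run_pipeline(content: str) -> List[Dict[str, str]]:
    topics = await analyze(content)
    findings = await research(topics)
    return await generate(findings)


faqs = asyncio.run(run_pipeline("caching\nrate limiting"))
print(faqs[0]["question"])
```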
## Output Structure
Each FAQ item includes:
- Question
- Detailed answer
- Category
- Code example (if applicable)
- References
- Confidence score
- Last updated timestamp
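
As a rough illustration of this structure, the sketch below mirrors the listed fields in a standalone dataclass and serializes one item to JSON. `FAQItemSketch` is a hypothetical name used only for this example, and the sample values are made up.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class FAQItemSketch:
    """Illustrative mirror of the fields listed above (not the project's class)."""
    question: str
    answer: str
    category: str
    code_example: Optional[str] = None
    references: List[Dict[str, str]] = field(default_factory=list)
    confidence_score: float = 0.0
    last_updated: Optional[str] = None


item = FAQItemSketch(
    question="What does the generator return?",
    answer="A list of FAQ items.",
    category="usage",
    confidence_score=0.8,
)

# asdict() recursively converts the dataclass into plain dicts and lists
payload = json.dumps(asdict(item), indent=2)
print(payload)
```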
## Configuration Options
### FAQConfig Parameters
- `num_faqs`: Number of FAQs to generate (default: 5)
- `target_audience`: Target audience level (default: INTERMEDIATE)
- `faq_style`: Writing style (default: PROFESSIONAL)
- `include_emojis`: Whether to include emojis (default: True)
- `include_code_examples`: Whether to include code examples (default: True)
- `include_references`: Whether to include references (default: True)
- `search_depth`: Research depth level (default: COMPREHENSIVE)
- `time_range`: Time range for research (default: "last_6_months")
- `language`: Output language (default: "English")
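
To show how defaults and overrides interact, here is a minimal sketch using a trimmed-down stand-in for the configuration. `ConfigSketch` is hypothetical and carries only three of the parameters. Looking an enum member up by its string value is handy when the setting arrives from a form field.

```python
from dataclasses import dataclass, replace
from enum import Enum


class SearchDepth(Enum):
    BASIC = "basic"
    COMPREHENSIVE = "comprehensive"
    EXPERT = "expert"


@dataclass
class ConfigSketch:
    """Hypothetical, trimmed-down stand-in for FAQConfig."""
    num_faqs: int = 5
    search_depth: SearchDepth = SearchDepth.COMPREHENSIVE
    language: str = "English"


defaults = ConfigSketch()

# Enum members can be looked up by value, e.g. from a form field or CLI flag
quick = replace(defaults, num_faqs=3, search_depth=SearchDepth("basic"))

print(defaults.search_depth.value, quick.search_depth.value)
```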
## Research Depth Options
### Basic (Google Search)
- Quick, general information
- Broad coverage
- Suitable for basic topics
### Comprehensive (Tavily AI)
- Detailed, in-depth research
- Multiple source integration
- Best for most use cases
### Expert (Metaphor AI)
- Specialized, expert-level content
- Advanced topic coverage
- Technical and academic focus
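
One way to picture the depth selection is a lookup table from depth level to search function. In this sketch the three `*_search` coroutines are hypothetical stubs standing in for the Google, Tavily, and Metaphor wrappers, and the fallback to the basic backend is an assumption of the example.

```python
import asyncio
from enum import Enum
from typing import Awaitable, Callable, Dict, List


class Depth(Enum):
    BASIC = "basic"
    COMPREHENSIVE = "comprehensive"
    EXPERT = "expert"


# Hypothetical stand-ins for the Google, Tavily, and Metaphor search wrappers
async def basic_search(query: str) -> List[str]:
    return [f"basic:{query}"]


async def comprehensive_search(query: str) -> List[str]:
    return [f"comprehensive:{query}"]


async def expert_search(query: str) -> List[str]:
    return [f"expert:{query}"]


SearchFn = Callable[[str], Awaitable[List[str]]]

BACKENDS: Dict[Depth, SearchFn] = {
    Depth.BASIC: basic_search,
    Depth.COMPREHENSIVE: comprehensive_search,
    Depth.EXPERT: expert_search,
}


async def search(query: str, depth: Depth) -> List[str]:
    # Fall back to the basic backend if a depth has no registered function
    backend = BACKENDS.get(depth, basic_search)
    return await backend(query)


results = asyncio.run(search("vector databases", Depth.EXPERT))
print(results)
```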
## Best Practices
1. **Content Preparation**
   - Provide clear, well-structured content
   - Include key terms and concepts
   - Specify target audience and style
2. **Research Selection**
   - Use Basic for general topics
   - Choose Comprehensive for detailed analysis
   - Select Expert for technical subjects
3. **Output Review**
   - Verify accuracy of information
   - Check code examples
   - Validate references
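
Part of the output review can be automated as a first pass. The sketch below flags low-confidence items and malformed reference URLs for manual checking. The dict shape, the `review_issues` helper, and the 0.6 threshold are assumptions made for illustration. It does not verify factual accuracy.

```python
from typing import Dict, List
from urllib.parse import urlparse


def review_issues(faq: Dict, min_confidence: float = 0.6) -> List[str]:
    """Return human-readable problems found in one FAQ dict (hypothetical shape)."""
    issues = []
    if faq.get("confidence_score", 0.0) < min_confidence:
        issues.append("low confidence")
    for ref in faq.get("references") or []:
        parsed = urlparse(ref.get("url", ""))
        # A usable link needs an http(s) scheme and a host
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            issues.append(f"bad reference URL: {ref.get('url', '')!r}")
    return issues


faq = {
    "question": "How do I install it?",
    "confidence_score": 0.4,
    "references": [{"url": "https://example.com/docs"}, {"url": "example"}],
}
issues = review_issues(faq)
print(issues)
```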
## Contributing
1. Fork the repository
2. Create a feature branch
3. Commit your changes
4. Push to the branch
5. Create a Pull Request
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Support
For support, please open an issue in the repository or contact the maintainers.
## Acknowledgments
- OpenAI for GPT integration
- Google Search API
- Tavily AI
- Metaphor AI
- BeautifulSoup for web scraping
- Streamlit for UI

@@ -0,0 +1,386 @@
"""
Enhanced FAQ Generator
This module provides a comprehensive FAQ generation system that can create detailed,
well-researched FAQs from various content sources with customizable options.
"""
import sys
import json
from typing import Dict, List, Optional, Union
from pathlib import Path
from enum import Enum
from dataclasses import dataclass
from loguru import logger
from lib.gpt_providers.text_generation.main_text_generation import llm_text_gen
from lib.ai_web_researcher.google_serp_search import google_search
from lib.ai_web_researcher.tavily_ai_search import tavily_search
from lib.ai_web_researcher.metaphor_basic_neural_web_search import metaphor_search_articles
logger.remove()
logger.add(
    sys.stdout,
    colorize=True,
    format="<level>{level}</level>|<green>{file}:{line}:{function}</green>| {message}",
)
class TargetAudience(Enum):
    BEGINNER = "beginner"
    INTERMEDIATE = "intermediate"
    EXPERT = "expert"


class FAQStyle(Enum):
    TECHNICAL = "technical"
    CONVERSATIONAL = "conversational"
    PROFESSIONAL = "professional"


class SearchDepth(Enum):
    BASIC = "basic"
    COMPREHENSIVE = "comprehensive"
    EXPERT = "expert"
@dataclass
class FAQConfig:
    """Configuration for FAQ generation."""
    num_faqs: int = 5
    target_audience: TargetAudience = TargetAudience.INTERMEDIATE
    faq_style: FAQStyle = FAQStyle.PROFESSIONAL
    include_emojis: bool = True
    include_code_examples: bool = True
    include_references: bool = True
    search_depth: SearchDepth = SearchDepth.COMPREHENSIVE
    time_range: str = "last_6_months"
    exclude_domains: Optional[List[str]] = None  # None means no domains are excluded
    language: str = "English"


@dataclass
class FAQItem:
    """Individual FAQ item with metadata."""
    question: str
    answer: str
    category: str
    code_example: Optional[str] = None
    references: Optional[List[Dict[str, str]]] = None
    confidence_score: float = 0.0
    last_updated: Optional[str] = None
class FAQGenerator:
    """Enhanced FAQ Generator with research capabilities."""

    def __init__(self, config: Optional[FAQConfig] = None):
        """Initialize the FAQ generator with optional configuration."""
        self.config = config or FAQConfig()
        self.faqs: List[FAQItem] = []
        self.research_results = {}

    async def generate_faqs(self, content: str, content_type: str = "general") -> List[FAQItem]:
        """Generate FAQs from the given content with research integration."""
        try:
            # Step 1: Research the topic
            research_results = await self._conduct_research(content)
            # Step 2: Generate initial FAQs
            initial_faqs = await self._generate_initial_faqs(content, research_results)
            # Step 3: Enhance FAQs with research
            enhanced_faqs = await self._enhance_faqs_with_research(initial_faqs, research_results)
            # Step 4: Add code examples if requested
            if self.config.include_code_examples:
                enhanced_faqs = await self._add_code_examples(enhanced_faqs)
            # Step 5: Add references if requested
            if self.config.include_references:
                enhanced_faqs = await self._add_references(enhanced_faqs, research_results)
            self.faqs = enhanced_faqs
            return enhanced_faqs
        except Exception as err:
            logger.error(f"Failed to generate FAQs: {err}")
            raise

    async def _conduct_research(self, content: str) -> Dict:
        """Conduct online research based on the content."""
        try:
            research_prompt = f"""Based on the following content, identify key topics and questions for research:
{content}
Please provide a list of research topics and questions that would help create comprehensive FAQs.
Focus on:
1. Key concepts and terms
2. Common questions users might have
3. Technical aspects that need clarification
4. Best practices and recommendations
"""
            research_topics = await llm_text_gen(research_prompt)
            # Conduct research for each topic
            research_results = {}
            for topic in research_topics.split('\n'):
                if topic.strip():
                    # Select search function based on search depth
                    if self.config.search_depth == SearchDepth.BASIC:
                        results = await google_search(topic.strip())
                    elif self.config.search_depth == SearchDepth.COMPREHENSIVE:
                        results = await tavily_search(topic.strip())
                    elif self.config.search_depth == SearchDepth.EXPERT:
                        results = await metaphor_search_articles(topic.strip())
                    else:
                        logger.warning(f"Unknown search depth: {self.config.search_depth}, defaulting to Google search")
                        results = await google_search(topic.strip())
                    research_results[topic.strip()] = results
            return research_results
        except Exception as err:
            logger.error(f"Failed to conduct research: {err}")
            return {}

    async def _generate_initial_faqs(self, content: str, research_results: Dict) -> List[FAQItem]:
        """Generate initial FAQs using LLM."""
        try:
            system_prompt = f"""You are an expert FAQ generator with deep knowledge in content creation and technical writing.
Your task is to create comprehensive FAQs based on the given content and research.
Guidelines:
1. Target Audience: {self.config.target_audience.value}
2. Style: {self.config.faq_style.value}
3. Include emojis: {self.config.include_emojis}
4. Language: {self.config.language}
5. Number of FAQs: {self.config.num_faqs}
Create FAQs that are:
- Clear and concise
- Well-structured
- Technically accurate
- Engaging and informative
- Based on the provided research
- Relevant to the target audience
- Written in the specified style
"""
            # The output format is spelled out so it matches the line prefixes parsed below
            prompt = f"""Content to generate FAQs from:
{content}
Research Results:
{json.dumps(research_results, indent=2)}
Please generate {self.config.num_faqs} FAQs following the guidelines above.
Format each FAQ exactly as follows, one field per line:
Q: <question>
A: <detailed answer>
Category: <category>
Confidence: <score between 0 and 1>
"""
            response = await llm_text_gen(prompt, system_prompt=system_prompt)
            # Parse the response into FAQItem objects
            faqs = []
            current_faq = None
            in_answer = False
            for line in response.split('\n'):
                line = line.strip()
                if line.startswith('Q:'):
                    if current_faq:
                        faqs.append(current_faq)
                    current_faq = FAQItem(question=line[2:].strip(), answer="", category="")
                    in_answer = False
                elif current_faq is None:
                    continue  # Ignore any preamble before the first question
                elif line.startswith('A:'):
                    current_faq.answer = line[2:].strip()
                    in_answer = True
                elif line.startswith('Category:'):
                    current_faq.category = line[9:].strip()
                    in_answer = False
                elif line.startswith('Confidence:'):
                    in_answer = False
                    try:
                        current_faq.confidence_score = float(line[11:].strip())
                    except ValueError:
                        logger.warning(f"Could not parse confidence score: {line!r}")
                elif line and in_answer:
                    # Continuation line of a multi-line answer
                    current_faq.answer = f"{current_faq.answer}\n{line}".strip()
            if current_faq:
                faqs.append(current_faq)
            return faqs
        except Exception as err:
            logger.error(f"Failed to generate initial FAQs: {err}")
            raise

    async def _enhance_faqs_with_research(self, faqs: List[FAQItem], research_results: Dict) -> List[FAQItem]:
        """Enhance FAQs with research findings."""
        try:
            enhanced_faqs = []
            for faq in faqs:
                # Find relevant research for this FAQ
                relevant_research = self._find_relevant_research(faq, research_results)
                if relevant_research:
                    # Enhance the answer with research findings
                    enhancement_prompt = f"""Enhance the following FAQ answer with the provided research:
Question: {faq.question}
Current Answer: {faq.answer}
Research:
{json.dumps(relevant_research, indent=2)}
Please enhance the answer while:
1. Maintaining the original style and tone
2. Adding relevant information from the research
3. Ensuring technical accuracy
4. Keeping the answer concise and clear
"""
                    enhanced_answer = await llm_text_gen(enhancement_prompt)
                    faq.answer = enhanced_answer
                enhanced_faqs.append(faq)
            return enhanced_faqs
        except Exception as err:
            logger.error(f"Failed to enhance FAQs with research: {err}")
            return faqs

    async def _add_code_examples(self, faqs: List[FAQItem]) -> List[FAQItem]:
        """Add code examples to FAQs where applicable."""
        try:
            for faq in faqs:
                if self._is_technical_question(faq.question):
                    code_prompt = f"""Generate a code example for the following FAQ:
Question: {faq.question}
Answer: {faq.answer}
Please provide a relevant code example that:
1. Illustrates the answer clearly
2. Includes comments and explanations
3. Follows best practices
4. Is easy to understand
"""
                    code_example = await llm_text_gen(code_prompt)
                    faq.code_example = code_example
            return faqs
        except Exception as err:
            logger.error(f"Failed to add code examples: {err}")
            return faqs
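
    async def _add_references(self, faqs: List[FAQItem], research_results: Dict) -> List[FAQItem]:
        """Add references to FAQs."""
        try:
            for faq in faqs:
                relevant_research = self._find_relevant_research(faq, research_results)
                references = []
                # relevant_research maps topic -> search results. The result shape is
                # assumed here: either a list of result dicts, or a dict holding such
                # a list under "references" or "results".
                for results in relevant_research.values():
                    if isinstance(results, dict):
                        results = results.get("references") or results.get("results") or []
                    if not isinstance(results, list):
                        continue
                    for ref in results:
                        if isinstance(ref, dict):
                            references.append({
                                "title": ref.get("title", ""),
                                "url": ref.get("url", ""),
                                "source": ref.get("source", ""),
                                "date": ref.get("date", ""),
                            })
                if references:
                    faq.references = references
            return faqs
        except Exception as err:
            logger.error(f"Failed to add references: {err}")
            return faqs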

    def _find_relevant_research(self, faq: FAQItem, research_results: Dict) -> Dict:
        """Find research relevant to a specific FAQ."""
        # Simple keyword matching for now - can be enhanced with semantic search
        relevant_research = {}
        for topic, results in research_results.items():
            if any(keyword in faq.question.lower() for keyword in topic.lower().split()):
                relevant_research[topic] = results
        return relevant_research

    def _is_technical_question(self, question: str) -> bool:
        """Determine if a question is technical and might benefit from a code example."""
        technical_keywords = ["code", "program", "function", "method", "class", "api", "syntax", "error", "debug"]
        return any(keyword in question.lower() for keyword in technical_keywords)

    def to_markdown(self) -> str:
        """Convert FAQs to markdown format."""
        markdown = "# Frequently Asked Questions\n\n"
        for i, faq in enumerate(self.faqs, 1):
            markdown += f"## {i}. {faq.question}\n\n"
            markdown += f"{faq.answer}\n\n"
            if faq.code_example:
                markdown += "```\n"
                markdown += f"{faq.code_example}\n"
                markdown += "```\n\n"
            if faq.references:
                markdown += "### References\n"
                for ref in faq.references:
                    markdown += f"- [{ref['title']}]({ref['url']}) - {ref['source']} ({ref['date']})\n"
                markdown += "\n"
        return markdown

    def to_html(self) -> str:
        """Convert FAQs to HTML format."""
        # Imported locally because the local variable below is also named `html`
        from html import escape

        html = """
<!DOCTYPE html>
<html>
<head>
<title>Frequently Asked Questions</title>
<style>
.faq-container { max-width: 800px; margin: 0 auto; }
.faq-item { margin-bottom: 2em; }
.question { font-weight: bold; font-size: 1.2em; }
.answer { margin: 1em 0; }
.code-example { background: #f5f5f5; padding: 1em; }
.references { margin-top: 1em; font-size: 0.9em; }
</style>
</head>
<body>
<div class="faq-container">
<h1>Frequently Asked Questions</h1>
"""
        for i, faq in enumerate(self.faqs, 1):
            # Escape generated text so stray markup cannot break the page
            html += f"""
<div class="faq-item">
<div class="question">{i}. {escape(faq.question)}</div>
<div class="answer">{escape(faq.answer)}</div>
"""
            if faq.code_example:
                html += f"""
<pre class="code-example">{escape(faq.code_example)}</pre>
"""
            if faq.references:
                html += """
<div class="references">
<h3>References</h3>
<ul>
"""
                for ref in faq.references:
                    html += f"""
<li><a href="{escape(ref['url'])}">{escape(ref['title'])}</a> - {escape(ref['source'])} ({escape(ref['date'])})</li>
"""
                html += """
</ul>
</div>
"""
            html += """
</div>
"""
        html += """
</div>
</body>
</html>
"""
        return html

@@ -0,0 +1,177 @@
"""
Streamlit UI for FAQ Generator
This module provides a user-friendly interface for generating FAQs from various content sources.
"""
import streamlit as st
import asyncio
from pathlib import Path
from typing import Optional
import json
import requests
from bs4 import BeautifulSoup
from .faqs_generator_blog import FAQGenerator, FAQConfig, TargetAudience, FAQStyle, SearchDepth
def fetch_url_content(url):
    """Fetch and extract content from a URL."""
    try:
        # A timeout keeps the UI from hanging on an unresponsive host
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        soup = BeautifulSoup(response.text, 'html.parser')
        # Remove script and style elements
        for script in soup(["script", "style"]):
            script.decompose()
        # Get text
        text = soup.get_text()
        # Break into lines and remove leading and trailing space
        lines = (line.strip() for line in text.splitlines())
        # Break multi-headlines (separated by double spaces) into a line each
        chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
        # Drop blank lines
        text = '\n'.join(chunk for chunk in chunks if chunk)
        return text
    except Exception as e:
        st.error(f"Error fetching URL content: {str(e)}")
        return None


def main():
    st.set_page_config(
        page_title="FAQ Generator",
        page_icon="",
        layout="wide"
    )
    st.title("FAQ Generator")
    st.markdown("Generate comprehensive FAQs from your content with research integration.")
    # Sidebar for configuration
    with st.sidebar:
        st.header("Configuration")
        # Basic settings
        num_faqs = st.slider("Number of FAQs", 1, 20, 5)
        target_audience = st.selectbox(
            "Target Audience",
            [audience.value for audience in TargetAudience]
        )
        faq_style = st.selectbox(
            "FAQ Style",
            [style.value for style in FAQStyle]
        )
        # Advanced settings
        with st.expander("Advanced Settings"):
            include_emojis = st.checkbox("Include Emojis", value=True)
            include_code_examples = st.checkbox("Include Code Examples", value=True)
            include_references = st.checkbox("Include References", value=True)
            search_depth = st.selectbox(
                "Search Depth",
                [depth.value for depth in SearchDepth]
            )
            time_range = st.selectbox(
                "Time Range",
                ["last_month", "last_6_months", "last_year", "all_time"]
            )
            language = st.text_input("Language", value="English")
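    # Main content area
    content_type = st.radio(
        "Content Source",
        ["Direct Input", "File Upload", "URL"]
    )
    content = ""
    if content_type == "Direct Input":
        content = st.text_area("Enter your content", height=300)
    elif content_type == "File Upload":
        uploaded = st.file_uploader("Upload a file", type=["txt", "docx"])
        if uploaded is not None:
            if uploaded.name.lower().endswith(".docx"):
                # DOCX parsing requires the python-docx package
                from docx import Document
                content = "\n".join(p.text for p in Document(uploaded).paragraphs)
            else:
                content = uploaded.getvalue().decode("utf-8", errors="replace")
    elif content_type == "URL":
        url = st.text_input("Enter URL")
        if url:
            content = fetch_url_content(url)
            if content:
                st.text_area("Extracted Content", content, height=300)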
    # Generate button
    if st.button("Generate FAQs") and content:
        try:
            # Create config
            config = FAQConfig(
                num_faqs=num_faqs,
                target_audience=TargetAudience(target_audience),
                faq_style=FAQStyle(faq_style),
                include_emojis=include_emojis,
                include_code_examples=include_code_examples,
                include_references=include_references,
                search_depth=SearchDepth(search_depth),
                time_range=time_range,
                language=language
            )
            # Initialize generator
            generator = FAQGenerator(config)
            # Generate FAQs
            with st.spinner("Generating FAQs..."):
                asyncio.run(generator.generate_faqs(content))
            # st.button is only True on the rerun right after the click, so keep the
            # generator in session state; otherwise the results disappear as soon as
            # the output format is changed or a download button is pressed.
            st.session_state["faq_generator"] = generator
            st.success("FAQs generated successfully!")
        except Exception as e:
            st.error(f"Error generating FAQs: {str(e)}")

    # Display results from the most recent successful run
    generator = st.session_state.get("faq_generator")
    if generator is not None and generator.faqs:
        faqs = generator.faqs
        # Output format selection
        output_format = st.radio(
            "Output Format",
            ["Preview", "Markdown", "HTML", "JSON"]
        )
        if output_format == "Preview":
            for i, faq in enumerate(faqs, 1):
                with st.expander(f"{i}. {faq.question}"):
                    st.markdown(faq.answer)
                    if faq.code_example:
                        st.code(faq.code_example)
                    if faq.references:
                        st.markdown("**References:**")
                        for ref in faq.references:
                            st.markdown(f"- [{ref['title']}]({ref['url']}) - {ref['source']} ({ref['date']})")
        elif output_format == "Markdown":
            st.code(generator.to_markdown(), language="markdown")
            st.download_button(
                "Download Markdown",
                generator.to_markdown(),
                file_name="faqs.md",
                mime="text/markdown"
            )
        elif output_format == "HTML":
            st.code(generator.to_html(), language="html")
            st.download_button(
                "Download HTML",
                generator.to_html(),
                file_name="faqs.html",
                mime="text/html"
            )
        elif output_format == "JSON":
            json_output = json.dumps([faq.__dict__ for faq in faqs], indent=2)
            st.code(json_output, language="json")
            st.download_button(
                "Download JSON",
                json_output,
                file_name="faqs.json",
                mime="application/json"
            )


if __name__ == "__main__":
    main()