AI FAQ Generator & github blogs

2025-05-04 17:04:44 +05:30
parent c51e355d26
commit 26b02b9719
9 changed files with 1810 additions and 463 deletions
--- a/lib/ai_writers/ai_blog_faqs_writer/README.md
+++ b/lib/ai_writers/ai_blog_faqs_writer/README.md
@@ -0,0 +1,192 @@
+# AI-Powered FAQ Generator
+
+Generates FAQs from direct text, uploaded files, or URLs. The tool uses an LLM to analyze the content, runs web research at a selectable depth, and produces FAQs with configurable audience, style, and output format.
+
+## Features
+
+### Content Processing
+- **Multiple Input Sources**
+  - Direct text input
+  - File uploads (DOCX, TXT)
+  - URL content extraction
+  - Support for any content type (general, technical, educational, etc.)
+
+### Research Capabilities
+- **Multi-level Search Depth**
+  - **Basic**: Google Search for quick, general information
+  - **Comprehensive**: Tavily AI for detailed, in-depth research
+  - **Expert**: Metaphor AI for specialized, expert-level content
+
+### Customization Options
+- **Target Audience**
+  - Beginner
+  - Intermediate
+  - Expert
+
+- **FAQ Style**
+  - Technical
+  - Conversational
+  - Professional
+
+- **Advanced Features**
+  - Emoji inclusion
+  - Code example generation
+  - Reference integration
+  - Customizable time range for research
+  - Multi-language support
+
+### Output Formats
+- Interactive preview
+- Markdown
+- HTML
+- JSON
+
+## Installation
+
+1. Clone the repository
+2. Install dependencies:
+```bash
+pip install -r requirements.txt
+```
+
+## Usage
+
+### Basic Usage
+```python
+import asyncio
+from lib.ai_writers.ai_blog_faqs_writer.faqs_generator_blog import FAQGenerator, FAQConfig
+
+# Initialize with default configuration
+generator = FAQGenerator()
+# generate_faqs() is a coroutine, so run it with asyncio.run() from synchronous code
+faqs = asyncio.run(generator.generate_faqs("Your content here"))
+```
+
+### Advanced Configuration
+```python
+from lib.ai_writers.ai_blog_faqs_writer.faqs_generator_blog import (
+    FAQGenerator, FAQConfig, TargetAudience, FAQStyle, SearchDepth
+)
+
+# Custom configuration
+config = FAQConfig(
+    num_faqs=10,
+    target_audience=TargetAudience.INTERMEDIATE,
+    faq_style=FAQStyle.TECHNICAL,
+    include_emojis=True,
+    include_code_examples=True,
+    include_references=True,
+    search_depth=SearchDepth.COMPREHENSIVE,
+    time_range="last_6_months",
+    language="English"
+)
+
+generator = FAQGenerator(config)
+```
+
+### Web Interface
+Run the Streamlit interface:
+```bash
+streamlit run lib/ai_writers/ai_blog_faqs_writer/faqs_ui.py
+```
+
+## Research Process
+
+1. **Content Analysis**
+   - Identifies key topics and concepts
+   - Extracts potential questions
+   - Determines research requirements
+
+2. **Web Research**
+   - Selects appropriate search function based on depth
+   - Gathers relevant information
+   - Validates and cross-references data
+
+3. **FAQ Generation**
+   - Creates comprehensive questions
+   - Provides detailed answers
+   - Includes code examples (if applicable)
+   - Adds references and citations
+
+## Output Structure
+
+Each FAQ item includes:
+- Question
+- Detailed answer
+- Category
+- Code example (if applicable)
+- References
+- Confidence score
+- Last updated timestamp
+
+## Configuration Options
+
+### FAQConfig Parameters
+- `num_faqs`: Number of FAQs to generate (default: 5)
+- `target_audience`: Target audience level (default: INTERMEDIATE)
+- `faq_style`: Writing style (default: PROFESSIONAL)
+- `include_emojis`: Whether to include emojis (default: True)
+- `include_code_examples`: Whether to include code examples (default: True)
+- `include_references`: Whether to include references (default: True)
+- `search_depth`: Research depth level (default: COMPREHENSIVE)
+- `time_range`: Time range for research (default: "last_6_months")
+- `language`: Output language (default: "English")
+
+## Research Depth Options
+
+### Basic (Google Search)
+- Quick, general information
+- Broad coverage
+- Suitable for basic topics
+
+### Comprehensive (Tavily AI)
+- Detailed, in-depth research
+- Multiple source integration
+- Best for most use cases
+
+### Expert (Metaphor AI)
+- Specialized, expert-level content
+- Advanced topic coverage
+- Technical and academic focus
+
+## Best Practices
+
+1. **Content Preparation**
+   - Provide clear, well-structured content
+   - Include key terms and concepts
+   - Specify target audience and style
+
+2. **Research Selection**
+   - Use Basic for general topics
+   - Choose Comprehensive for detailed analysis
+   - Select Expert for technical subjects
+
+3. **Output Review**
+   - Verify accuracy of information
+   - Check code examples
+   - Validate references
+
+## Contributing
+
+1. Fork the repository
+2. Create a feature branch
+3. Commit your changes
+4. Push to the branch
+5. Create a Pull Request
+
+## License
+
+This project is licensed under the MIT License - see the LICENSE file for details.
+
+## Support
+
+For support, please open an issue in the repository or contact the maintainers.
+
+## Acknowledgments
+
+- OpenAI for GPT integration
+- Google Search API
+- Tavily AI
+- Metaphor AI
+- BeautifulSoup for web scraping
+- Streamlit for UI 
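Review note on the README's "Output Structure" and JSON output format: the sketch below shows what one serialized FAQ item might look like. `FAQItemSketch` and `faqs_to_json` are hypothetical names that only mirror the fields of `FAQItem` in `faqs_generator_blog.py`, and the sample values are made up for illustration rather than taken from real generator output.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class FAQItemSketch:
    """Hypothetical stand-in mirroring the fields listed under Output Structure."""
    question: str
    answer: str
    category: str
    code_example: Optional[str] = None
    references: List[Dict[str, str]] = field(default_factory=list)
    confidence_score: float = 0.0
    last_updated: Optional[str] = None


def faqs_to_json(faqs: List[FAQItemSketch]) -> str:
    """Serialize FAQ items to JSON; ensure_ascii=False keeps emojis readable."""
    return json.dumps([asdict(faq) for faq in faqs], indent=2, ensure_ascii=False)


# Illustrative values only, not real output
sample = FAQItemSketch(
    question="What input sources are supported?",
    answer="Direct text, uploaded files, and URLs.",
    category="Usage",
    confidence_score=0.9,
)
json_output = faqs_to_json([sample])
print(json_output)
```

Using `dataclasses.asdict` rather than `__dict__` also converts nested dataclasses, which may matter if reference entries ever become their own type.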
--- a/lib/ai_writers/ai_blog_faqs_writer/faqs_generator_blog.py
+++ b/lib/ai_writers/ai_blog_faqs_writer/faqs_generator_blog.py
@@ -0,0 +1,386 @@
+"""
+Enhanced FAQ Generator
+
+This module provides a comprehensive FAQ generation system that can create detailed,
+well-researched FAQs from various content sources with customizable options.
+"""
+
+import sys
+import json
+from typing import Dict, List, Optional, Union
+from pathlib import Path
+from enum import Enum
+from dataclasses import dataclass
+from loguru import logger
+
+from lib.gpt_providers.text_generation.main_text_generation import llm_text_gen
+from lib.ai_web_researcher.google_serp_search import google_search
+from lib.ai_web_researcher.tavily_ai_search import tavily_search
+from lib.ai_web_researcher.metaphor_basic_neural_web_search import metaphor_search_articles
+
+logger.remove()
+logger.add(sys.stdout,
+          colorize=True,
+          format="<level>{level}</level>|<green>{file}:{line}:{function}</green>| {message}")
+
+class TargetAudience(Enum):
+    BEGINNER = "beginner"
+    INTERMEDIATE = "intermediate"
+    EXPERT = "expert"
+
+class FAQStyle(Enum):
+    TECHNICAL = "technical"
+    CONVERSATIONAL = "conversational"
+    PROFESSIONAL = "professional"
+
+class SearchDepth(Enum):
+    BASIC = "basic"
+    COMPREHENSIVE = "comprehensive"
+    EXPERT = "expert"
+
+@dataclass
+class FAQConfig:
+    """Configuration for FAQ generation."""
+    num_faqs: int = 5
+    target_audience: TargetAudience = TargetAudience.INTERMEDIATE
+    faq_style: FAQStyle = FAQStyle.PROFESSIONAL
+    include_emojis: bool = True
+    include_code_examples: bool = True
+    include_references: bool = True
+    search_depth: SearchDepth = SearchDepth.COMPREHENSIVE
+    time_range: str = "last_6_months"
+    exclude_domains: Optional[List[str]] = None
+    language: str = "English"
+
+@dataclass
+class FAQItem:
+    """Individual FAQ item with metadata."""
+    question: str
+    answer: str
+    category: str
+    code_example: Optional[str] = None
+    references: Optional[List[Dict[str, str]]] = None
+    confidence_score: float = 0.0
+    last_updated: Optional[str] = None
+
+class FAQGenerator:
+    """Enhanced FAQ Generator with research capabilities."""
+    
+    def __init__(self, config: Optional[FAQConfig] = None):
+        """Initialize the FAQ generator with optional configuration."""
+        self.config = config or FAQConfig()
+        self.faqs: List[FAQItem] = []
+        self.research_results = {}
+        
+    async def generate_faqs(self, content: str, content_type: str = "general") -> List[FAQItem]:
+        """Generate FAQs from the given content with research integration."""
+        try:
+            # Step 1: Research the topic
+            research_results = await self._conduct_research(content)
+            
+            # Step 2: Generate initial FAQs
+            initial_faqs = await self._generate_initial_faqs(content, research_results)
+            
+            # Step 3: Enhance FAQs with research
+            enhanced_faqs = await self._enhance_faqs_with_research(initial_faqs, research_results)
+            
+            # Step 4: Add code examples if requested
+            if self.config.include_code_examples:
+                enhanced_faqs = await self._add_code_examples(enhanced_faqs)
+            
+            # Step 5: Add references if requested
+            if self.config.include_references:
+                enhanced_faqs = await self._add_references(enhanced_faqs, research_results)
+            
+            self.faqs = enhanced_faqs
+            return enhanced_faqs
+            
+        except Exception as err:
+            logger.error(f"Failed to generate FAQs: {err}")
+            raise
+    
+    async def _conduct_research(self, content: str) -> Dict:
+        """Conduct online research based on the content."""
+        try:
+            research_prompt = f"""Based on the following content, identify key topics and questions for research:
+            {content}
+            
+            Please provide a list of research topics and questions that would help create comprehensive FAQs.
+            Focus on:
+            1. Key concepts and terms
+            2. Common questions users might have
+            3. Technical aspects that need clarification
+            4. Best practices and recommendations
+            """
+            
+            research_topics = await llm_text_gen(research_prompt)
+            
+            # Conduct research for each topic
+            research_results = {}
+            for topic in research_topics.split('\n'):
+                if topic.strip():
+                    # Select search function based on search depth
+                    if self.config.search_depth == SearchDepth.BASIC:
+                        results = await google_search(topic.strip())
+                    elif self.config.search_depth == SearchDepth.COMPREHENSIVE:
+                        results = await tavily_search(topic.strip())
+                    elif self.config.search_depth == SearchDepth.EXPERT:
+                        results = await metaphor_search_articles(topic.strip())
+                    else:
+                        logger.warning(f"Unknown search depth: {self.config.search_depth}, defaulting to Google search")
+                        results = await google_search(topic.strip())
+                    
+                    research_results[topic.strip()] = results
+            
+            return research_results
+            
+        except Exception as err:
+            logger.error(f"Failed to conduct research: {err}")
+            return {}
+    
+    async def _generate_initial_faqs(self, content: str, research_results: Dict) -> List[FAQItem]:
+        """Generate initial FAQs using LLM."""
+        try:
+            system_prompt = f"""You are an expert FAQ generator with deep knowledge in content creation and technical writing.
+            Your task is to create comprehensive FAQs based on the given content and research.
+
+            Guidelines:
+            1. Target Audience: {self.config.target_audience.value}
+            2. Style: {self.config.faq_style.value}
+            3. Include emojis: {self.config.include_emojis}
+            4. Language: {self.config.language}
+            5. Number of FAQs: {self.config.num_faqs}
+
+            Create FAQs that are:
+            - Clear and concise
+            - Well-structured
+            - Technically accurate
+            - Engaging and informative
+            - Based on the provided research
+            - Relevant to the target audience
+            - Written in the specified style
+            """
+            
+            prompt = f"""Content to generate FAQs from:
+            {content}
+
+            Research Results:
+            {json.dumps(research_results, indent=2, default=str)}
+
+            Please generate {self.config.num_faqs} FAQs following the guidelines above.
+            Format each FAQ as four lines with these exact prefixes:
+            Q: <question>
+            A: <detailed answer on a single line>
+            Category: <category>
+            Confidence: <score between 0 and 1>
+            """
+            
+            response = await llm_text_gen(prompt, system_prompt=system_prompt)
+            
+            # Parse the response into FAQItem objects
+            faqs = []
+            current_faq = None
+            
+            for line in response.split('\n'):
+                if line.startswith('Q:'):
+                    if current_faq:
+                        faqs.append(current_faq)
+                    current_faq = FAQItem(question=line[2:].strip(), answer="", category="")
+                elif line.startswith('A:'):
+                    if current_faq:
+                        current_faq.answer = line[2:].strip()
+                elif line.startswith('Category:'):
+                    if current_faq:
+                        current_faq.category = line[9:].strip()
+                elif line.startswith('Confidence:'):
+                    if current_faq:
+                        current_faq.confidence_score = float(line[11:].strip())
+            
+            if current_faq:
+                faqs.append(current_faq)
+            
+            return faqs
+            
+        except Exception as err:
+            logger.error(f"Failed to generate initial FAQs: {err}")
+            raise
+    
+    async def _enhance_faqs_with_research(self, faqs: List[FAQItem], research_results: Dict) -> List[FAQItem]:
+        """Enhance FAQs with research findings."""
+        try:
+            enhanced_faqs = []
+            
+            for faq in faqs:
+                # Find relevant research for this FAQ
+                relevant_research = self._find_relevant_research(faq, research_results)
+                
+                if relevant_research:
+                    # Enhance the answer with research findings
+                    enhancement_prompt = f"""Enhance the following FAQ answer with the provided research:
+                    
+                    Question: {faq.question}
+                    Current Answer: {faq.answer}
+                    
+                    Research:
+                    {json.dumps(relevant_research, indent=2, default=str)}
+                    
+                    Please enhance the answer while:
+                    1. Maintaining the original style and tone
+                    2. Adding relevant information from the research
+                    3. Ensuring technical accuracy
+                    4. Keeping the answer concise and clear
+                    """
+                    
+                    enhanced_answer = await llm_text_gen(enhancement_prompt)
+                    faq.answer = enhanced_answer
+                
+                enhanced_faqs.append(faq)
+            
+            return enhanced_faqs
+            
+        except Exception as err:
+            logger.error(f"Failed to enhance FAQs with research: {err}")
+            return faqs
+    
+    async def _add_code_examples(self, faqs: List[FAQItem]) -> List[FAQItem]:
+        """Add code examples to FAQs where applicable."""
+        try:
+            for faq in faqs:
+                if self._is_technical_question(faq.question):
+                    code_prompt = f"""Generate a code example for the following FAQ:
+                    
+                    Question: {faq.question}
+                    Answer: {faq.answer}
+                    
+                    Please provide a relevant code example that:
+                    1. Illustrates the answer clearly
+                    2. Includes comments and explanations
+                    3. Follows best practices
+                    4. Is easy to understand
+                    """
+                    
+                    code_example = await llm_text_gen(code_prompt)
+                    faq.code_example = code_example
+            
+            return faqs
+            
+        except Exception as err:
+            logger.error(f"Failed to add code examples: {err}")
+            return faqs
+    
+    async def _add_references(self, faqs: List[FAQItem], research_results: Dict) -> List[FAQItem]:
+        """Add references to FAQs."""
+        try:
+            for faq in faqs:
+                relevant_research = self._find_relevant_research(faq, research_results)
+                if relevant_research:
+                    faq.references = [
+                        {
+                            "title": ref.get("title", ""),
+                            "url": ref.get("url", ""),
+                            "source": ref.get("source", ""),
+                            "date": ref.get("date", "")
+                        }
+                        for results in relevant_research.values() if isinstance(results, dict) for ref in results.get("references", [])
+                    ]
+            
+            return faqs
+            
+        except Exception as err:
+            logger.error(f"Failed to add references: {err}")
+            return faqs
+    
+    def _find_relevant_research(self, faq: FAQItem, research_results: Dict) -> Dict:
+        """Find research relevant to a specific FAQ."""
+        # Simple keyword matching for now - can be enhanced with semantic search
+        relevant_research = {}
+        for topic, results in research_results.items():
+            if any(keyword in faq.question.lower() for keyword in topic.lower().split()):
+                relevant_research[topic] = results
+        return relevant_research
+    
+    def _is_technical_question(self, question: str) -> bool:
+        """Determine if a question is technical and might benefit from a code example."""
+        technical_keywords = ["code", "program", "function", "method", "class", "api", "syntax", "error", "debug"]
+        return any(keyword in question.lower() for keyword in technical_keywords)
+    
+    def to_markdown(self) -> str:
+        """Convert FAQs to markdown format."""
+        markdown = "# Frequently Asked Questions\n\n"
+        
+        for i, faq in enumerate(self.faqs, 1):
+            markdown += f"## {i}. {faq.question}\n\n"
+            markdown += f"{faq.answer}\n\n"
+            
+            if faq.code_example:
+                markdown += "```\n"
+                markdown += f"{faq.code_example}\n"
+                markdown += "```\n\n"
+            
+            if faq.references:
+                markdown += "### References\n"
+                for ref in faq.references:
+                    markdown += f"- [{ref['title']}]({ref['url']}) - {ref['source']} ({ref['date']})\n"
+                markdown += "\n"
+        
+        return markdown
+    
+    def to_html(self) -> str:
+        """Convert FAQs to HTML format."""
+        html = """
+        <!DOCTYPE html>
+        <html>
+        <head>
+            <title>Frequently Asked Questions</title>
+            <style>
+                .faq-container { max-width: 800px; margin: 0 auto; }
+                .faq-item { margin-bottom: 2em; }
+                .question { font-weight: bold; font-size: 1.2em; }
+                .answer { margin: 1em 0; }
+                .code-example { background: #f5f5f5; padding: 1em; }
+                .references { margin-top: 1em; font-size: 0.9em; }
+            </style>
+        </head>
+        <body>
+            <div class="faq-container">
+                <h1>Frequently Asked Questions</h1>
+        """
+        
+        for i, faq in enumerate(self.faqs, 1):
+            html += f"""
+                <div class="faq-item">
+                    <div class="question">{i}. {faq.question}</div>
+                    <div class="answer">{faq.answer}</div>
+            """
+            
+            if faq.code_example:
+                html += f"""
+                    <pre class="code-example">{faq.code_example}</pre>
+                """
+            
+            if faq.references:
+                html += """
+                    <div class="references">
+                        <h3>References</h3>
+                        <ul>
+                """
+                for ref in faq.references:
+                    html += f"""
+                            <li><a href="{ref['url']}">{ref['title']}</a> - {ref['source']} ({ref['date']})</li>
+                    """
+                html += """
+                        </ul>
+                    </div>
+                """
+            
+            html += """
+                </div>
+            """
+        
+        html += """
+            </div>
+        </body>
+        </html>
+        """
+        
+        return html
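Review note on the response parsing in `_generate_initial_faqs`: the loop keeps only the first line of each answer, and a non-numeric `Confidence:` value raises `ValueError`, which aborts the whole run. Below is a minimal sketch of a more tolerant parser, assuming the LLM follows the `Q:` / `A:` / `Category:` / `Confidence:` line prefixes the loop already expects. `ParsedFAQ`, `parse_faq_response`, and `_to_confidence` are hypothetical names and not part of this commit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ParsedFAQ:
    """Hypothetical lightweight record; the module itself uses FAQItem."""
    question: str
    answer: str = ""
    category: str = ""
    confidence_score: float = 0.0


def _to_confidence(raw: str) -> float:
    """Parse a confidence value, falling back to 0.0 and clamping to [0, 1]."""
    try:
        value = float(raw.strip())
    except ValueError:
        return 0.0
    return min(max(value, 0.0), 1.0)


def parse_faq_response(response: str) -> List[ParsedFAQ]:
    """Parse prefixed lines into FAQ records, keeping multi-line answers."""
    faqs: List[ParsedFAQ] = []
    current: Optional[ParsedFAQ] = None
    in_answer = False
    for raw_line in response.splitlines():
        line = raw_line.strip()
        if line.startswith("Q:"):
            current = ParsedFAQ(question=line[2:].strip())
            faqs.append(current)
            in_answer = False
        elif current is None:
            continue  # ignore any preamble before the first question
        elif line.startswith("A:"):
            current.answer = line[2:].strip()
            in_answer = True
        elif line.startswith("Category:"):
            current.category = line[len("Category:"):].strip()
            in_answer = False
        elif line.startswith("Confidence:"):
            current.confidence_score = _to_confidence(line[len("Confidence:"):])
            in_answer = False
        elif in_answer and line:
            # Continuation of a multi-line answer
            current.answer = f"{current.answer}\n{line}" if current.answer else line
    return faqs
```

Falling back to `0.0` for an unparseable score is one possible policy; logging a warning and skipping the item would be an equally reasonable choice.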
--- a/lib/ai_writers/ai_blog_faqs_writer/faqs_ui.py
+++ b/lib/ai_writers/ai_blog_faqs_writer/faqs_ui.py
@@ -0,0 +1,177 @@
+"""
+Streamlit UI for FAQ Generator
+
+This module provides a user-friendly interface for generating FAQs from various content sources.
+"""
+
+import streamlit as st
+import asyncio
+from pathlib import Path
+from typing import Optional
+import json
+import requests
+from bs4 import BeautifulSoup
+
+from lib.ai_writers.ai_blog_faqs_writer.faqs_generator_blog import FAQGenerator, FAQConfig, TargetAudience, FAQStyle, SearchDepth
+
+
+def fetch_url_content(url):
+    """Fetch and extract content from a URL."""
+    try:
+        response = requests.get(url, timeout=10)
+        response.raise_for_status()
+        soup = BeautifulSoup(response.text, 'html.parser')
+        
+        # Remove script and style elements
+        for script in soup(["script", "style"]):
+            script.decompose()
+            
+        # Get text
+        text = soup.get_text()
+        
+        # Break into lines and remove leading and trailing space
+        lines = (line.strip() for line in text.splitlines())
+        # Break multi-headlines into a line each
+        chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
+        # Drop blank lines
+        text = '\n'.join(chunk for chunk in chunks if chunk)
+        
+        return text
+    except Exception as e:
+        st.error(f"Error fetching URL content: {str(e)}")
+        return None
+
+def main():
+    st.set_page_config(
+        page_title="FAQ Generator",
+        page_icon="❓",
+        layout="wide"
+    )
+    
+    st.title("FAQ Generator")
+    st.markdown("Generate comprehensive FAQs from your content with research integration.")
+    
+    # Sidebar for configuration
+    with st.sidebar:
+        st.header("Configuration")
+        
+        # Basic settings
+        num_faqs = st.slider("Number of FAQs", 1, 20, 5)
+        target_audience = st.selectbox(
+            "Target Audience",
+            [audience.value for audience in TargetAudience]
+        )
+        faq_style = st.selectbox(
+            "FAQ Style",
+            [style.value for style in FAQStyle]
+        )
+        
+        # Advanced settings
+        with st.expander("Advanced Settings"):
+            include_emojis = st.checkbox("Include Emojis", value=True)
+            include_code_examples = st.checkbox("Include Code Examples", value=True)
+            include_references = st.checkbox("Include References", value=True)
+            
+            search_depth = st.selectbox(
+                "Search Depth",
+                [depth.value for depth in SearchDepth]
+            )
+            time_range = st.selectbox(
+                "Time Range",
+                ["last_month", "last_6_months", "last_year", "all_time"]
+            )
+            language = st.text_input("Language", value="English")
+    
+    # Main content area
+    content_type = st.radio(
+        "Content Source",
+        ["Direct Input", "File Upload", "URL"]
+    )
+    
+    content = ""
+    if content_type == "Direct Input":
+        content = st.text_area("Enter your content", height=300)
+    
+    elif content_type == "URL":
+        url = st.text_input("Enter URL")
+        if url:
+            content = fetch_url_content(url)
+            if content:
+                st.text_area("Extracted Content", content, height=300)
+    
+    # Generate button
+    if st.button("Generate FAQs") and content:
+        try:
+            # Create config
+            config = FAQConfig(
+                num_faqs=num_faqs,
+                target_audience=TargetAudience(target_audience),
+                faq_style=FAQStyle(faq_style),
+                include_emojis=include_emojis,
+                include_code_examples=include_code_examples,
+                include_references=include_references,
+                search_depth=SearchDepth(search_depth),
+                time_range=time_range,
+                language=language
+            )
+            
+            # Initialize generator
+            generator = FAQGenerator(config)
+            
+            # Generate FAQs
+            with st.spinner("Generating FAQs..."):
+                faqs = asyncio.run(generator.generate_faqs(content))
+            
+            # Display results
+            st.success("FAQs generated successfully!")
+            
+            # Output format selection
+            output_format = st.radio(
+                "Output Format",
+                ["Preview", "Markdown", "HTML", "JSON"]
+            )
+            
+            if output_format == "Preview":
+                for i, faq in enumerate(faqs, 1):
+                    with st.expander(f"{i}. {faq.question}"):
+                        st.markdown(faq.answer)
+                        if faq.code_example:
+                            st.code(faq.code_example)
+                        if faq.references:
+                            st.markdown("**References:**")
+                            for ref in faq.references:
+                                st.markdown(f"- [{ref['title']}]({ref['url']}) - {ref['source']} ({ref['date']})")
+            
+            elif output_format == "Markdown":
+                st.code(generator.to_markdown(), language="markdown")
+                st.download_button(
+                    "Download Markdown",
+                    generator.to_markdown(),
+                    file_name="faqs.md",
+                    mime="text/markdown"
+                )
+            
+            elif output_format == "HTML":
+                st.code(generator.to_html(), language="html")
+                st.download_button(
+                    "Download HTML",
+                    generator.to_html(),
+                    file_name="faqs.html",
+                    mime="text/html"
+                )
+            
+            elif output_format == "JSON":
+                json_output = json.dumps([faq.__dict__ for faq in faqs], indent=2)
+                st.code(json_output, language="json")
+                st.download_button(
+                    "Download JSON",
+                    json_output,
+                    file_name="faqs.json",
+                    mime="application/json"
+                )
+        
+        except Exception as e:
+            st.error(f"Error generating FAQs: {str(e)}")
+
+if __name__ == "__main__":
+    main()
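Review note on the "File Upload" content source: `main()` offers the option but has no branch for it, so `content` stays empty and the generate button does nothing. Below is a minimal sketch of the text extraction such a branch might use, relying only on the standard library. `extract_uploaded_text` is a hypothetical helper, and reading `word/document.xml` directly is a simplification that ignores headers, footnotes, and tables; a dedicated DOCX library may be the better choice in practice.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the DOCX body XML
_W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_uploaded_text(filename: str, data: bytes) -> str:
    """Return plain text from a TXT or DOCX upload (hypothetical helper)."""
    name = filename.lower()
    if name.endswith(".txt"):
        return data.decode("utf-8", errors="replace")
    if name.endswith(".docx"):
        # A DOCX file is a ZIP archive; the body text lives in word/document.xml
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            root = ET.fromstring(archive.read("word/document.xml"))
        paragraphs = []
        for para in root.iter(f"{_W_NS}p"):
            # Join all text runs that belong to one paragraph
            text = "".join(node.text or "" for node in para.iter(f"{_W_NS}t"))
            if text:
                paragraphs.append(text)
        return "\n".join(paragraphs)
    raise ValueError(f"Unsupported file type: {filename}")
```

In the UI, the missing branch could pass the result of `st.file_uploader("Upload a file", type=["txt", "docx"])` to this helper via `uploaded.name` and `uploaded.getvalue()`.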