Detailed Docs & Onboarding improvements

2025-04-21 16:34:18 +05:30
parent 6e60a9fd28
commit c5b47bd32f
42 changed files with 5114 additions and 79 deletions
--- a/docs/architecture/database_schema.rst
+++ b/docs/architecture/database_schema.rst
@@ -0,0 +1,306 @@
+Database Schema
+===============
+
+This document describes the database schema used in the AI-Writer platform, including both the relational database and vector database components.
+
+Relational Database Schema
+--------------------------
+
+AI-Writer uses the SQLAlchemy ORM to interact with the relational database. The schema consists of the following main tables:
+
+User
+~~~~
+
+Stores user information and preferences.
+
+.. code-block:: python
+
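+   from datetime import datetime
+
+   from sqlalchemy import (JSON, Boolean, Column, DateTime, Float,
+                           ForeignKey, Integer, String, Text)
+   from sqlalchemy.orm import declarative_base, relationship
+
+   # Imports and declarative base shared by every model on this page.
+   Base = declarative_base()
+
+   class User(Base):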
+       __tablename__ = "users"
+       
+       id = Column(Integer, primary_key=True)
+       username = Column(String, unique=True, nullable=False)
+       email = Column(String, unique=True, nullable=False)
+       password_hash = Column(String, nullable=False)
+       created_at = Column(DateTime, default=datetime.utcnow)
+       updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)
+       
+       # Relationships
+       api_keys = relationship("ApiKey", back_populates="user")
+       contents = relationship("Content", back_populates="user")
+       settings = relationship("UserSetting", back_populates="user", uselist=False)
+
+ApiKey
+~~~~~~
+
+Stores encrypted API keys for various services.
+
+.. code-block:: python
+
+   class ApiKey(Base):
+       __tablename__ = "api_keys"
+       
+       id = Column(Integer, primary_key=True)
+       user_id = Column(Integer, ForeignKey("users.id"))
+       service_name = Column(String, nullable=False)
+       encrypted_key = Column(String, nullable=False)
+       is_active = Column(Boolean, default=True)
+       created_at = Column(DateTime, default=datetime.utcnow)
+       updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)
+       
+       # Relationships
+       user = relationship("User", back_populates="api_keys")
+
+Content
+~~~~~~~
+
+Stores generated content with metadata.
+
+.. code-block:: python
+
+   class Content(Base):
+       __tablename__ = "contents"
+       
+       id = Column(Integer, primary_key=True)
+       user_id = Column(Integer, ForeignKey("users.id"))
+       title = Column(String, nullable=False)
+       content_type = Column(String, nullable=False)  # blog, linkedin, twitter, etc.
+       content_text = Column(Text, nullable=False)
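+       # "metadata" is a reserved attribute name on declarative classes,
+       # so the column keeps its name but is mapped to another attribute.
+       content_metadata = Column("metadata", JSON)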
+       created_at = Column(DateTime, default=datetime.utcnow)
+       updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)
+       
+       # Relationships
+       user = relationship("User", back_populates="contents")
+       versions = relationship("ContentVersion", back_populates="content")
+       analytics = relationship("ContentAnalytics", back_populates="content")
+
+ContentVersion
+~~~~~~~~~~~~~~
+
+Tracks versions of content for history and rollback.
+
+.. code-block:: python
+
+   class ContentVersion(Base):
+       __tablename__ = "content_versions"
+       
+       id = Column(Integer, primary_key=True)
+       content_id = Column(Integer, ForeignKey("contents.id"))
+       version_number = Column(Integer, nullable=False)
+       content_text = Column(Text, nullable=False)
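+       # "metadata" is a reserved attribute name on declarative classes,
+       # so the column keeps its name but is mapped to another attribute.
+       version_metadata = Column("metadata", JSON)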
+       created_at = Column(DateTime, default=datetime.utcnow)
+       
+       # Relationships
+       content = relationship("Content", back_populates="versions")
+
+ContentAnalytics
+~~~~~~~~~~~~~~~~
+
+Stores analytics data for content performance.
+
+.. code-block:: python
+
+   class ContentAnalytics(Base):
+       __tablename__ = "content_analytics"
+       
+       id = Column(Integer, primary_key=True)
+       content_id = Column(Integer, ForeignKey("contents.id"))
+       views = Column(Integer, default=0)
+       likes = Column(Integer, default=0)
+       shares = Column(Integer, default=0)
+       comments = Column(Integer, default=0)
+       engagement_rate = Column(Float, default=0.0)
+       last_updated = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)
+       
+       # Relationships
+       content = relationship("Content", back_populates="analytics")
+
+UserSetting
+~~~~~~~~~~~
+
+Stores user preferences and settings.
+
+.. code-block:: python
+
+   class UserSetting(Base):
+       __tablename__ = "user_settings"
+       
+       id = Column(Integer, primary_key=True)
+       user_id = Column(Integer, ForeignKey("users.id"), unique=True)
+       preferred_ai_provider = Column(String)
+       default_content_type = Column(String)
+       ui_theme = Column(String, default="light")
+       language = Column(String, default="en")
+       settings_json = Column(JSON)
+       
+       # Relationships
+       user = relationship("User", back_populates="settings")
+
+Template
+~~~~~~~~
+
+Stores reusable content templates.
+
+.. code-block:: python
+
+   class Template(Base):
+       __tablename__ = "templates"
+       
+       id = Column(Integer, primary_key=True)
+       user_id = Column(Integer, ForeignKey("users.id"))
+       name = Column(String, nullable=False)
+       content_type = Column(String, nullable=False)
+       template_text = Column(Text, nullable=False)
+       variables = Column(JSON)
+       created_at = Column(DateTime, default=datetime.utcnow)
+       updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)
+       
+       # Relationships
+       user = relationship("User")
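
To make the one-to-many link between ``users`` and ``contents`` concrete, here is a minimal sketch that recreates just those two tables as plain SQL in an in-memory SQLite database. It uses Python's standard-library ``sqlite3`` module instead of SQLAlchemy so that it runs without extra dependencies. The DDL and the sample rows are hand-written assumptions for illustration, not the output of the real models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        username TEXT NOT NULL UNIQUE,
        email TEXT NOT NULL UNIQUE,
        password_hash TEXT NOT NULL
    );
    CREATE TABLE contents (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users (id),
        title TEXT NOT NULL,
        content_type TEXT NOT NULL,
        content_text TEXT NOT NULL
    );
    """
)

# One user (placeholder values) owning two pieces of content.
conn.execute(
    "INSERT INTO users (username, email, password_hash) VALUES (?, ?, ?)",
    ("alice", "alice@example.com", "not-a-real-hash"),
)
user_id = conn.execute(
    "SELECT id FROM users WHERE username = ?", ("alice",)
).fetchone()[0]
conn.executemany(
    "INSERT INTO contents (user_id, title, content_type, content_text) "
    "VALUES (?, ?, ?, ?)",
    [
        (user_id, "Launch post", "blog", "Long-form text"),
        (user_id, "Launch thread", "twitter", "Short text"),
    ],
)

# Equivalent of reading user.contents through the ORM relationship.
titles = [
    row[0]
    for row in conn.execute(
        "SELECT c.title FROM contents AS c "
        "JOIN users AS u ON u.id = c.user_id "
        "WHERE u.username = ? ORDER BY c.id",
        ("alice",),
    )
]

# A row pointing at a missing user is rejected by the foreign key.
try:
    conn.execute(
        "INSERT INTO contents (user_id, title, content_type, content_text) "
        "VALUES (?, ?, ?, ?)",
        (999, "Orphan", "blog", "No owner"),
    )
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
```

The same pattern applies to the other ``user_id`` and ``content_id`` foreign keys on this page.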
+
+Vector Database Schema
+----------------------
+
+AI-Writer uses ChromaDB for vector storage, which enables semantic search and retrieval of content. The vector database stores:
+
+1. **Content Embeddings**
+   
+   * Generated from content text using embedding models
+   * Used for semantic search and content similarity
+
+2. **Metadata**
+   
+   * Content ID (linking to relational database)
+   * Content type
+   * Creation date
+   * Keywords and tags
+
+3. **Collections**
+   
+   ChromaDB organizes embeddings into collections:
+   
+   * ``content_embeddings``: Main collection for all content
+   * ``user_{user_id}_content``: Per-user content collections
+   * ``{content_type}_embeddings``: Collections by content type
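
The naming scheme above can be sketched as a small helper. The function name ``collection_names`` and the normalisation of the content type are assumptions made for this example; the project may build these names differently.

```python
import re


def collection_names(user_id: int, content_type: str) -> list[str]:
    """Return the collection names a piece of content would belong to."""
    # Lowercase the content type and collapse anything that is not a
    # letter or digit into "_" so it is safe to embed in a name.
    safe_type = re.sub(r"[^a-z0-9]+", "_", content_type.lower()).strip("_")
    return [
        "content_embeddings",           # main collection for all content
        f"user_{user_id}_content",      # per-user collection
        f"{safe_type}_embeddings",      # per-content-type collection
    ]
```

For example, ``collection_names(42, "blog")`` yields ``content_embeddings``, ``user_42_content``, and ``blog_embeddings``.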
+
+Vector Database Operations
+--------------------------
+
+The vector database supports the following operations:
+
+1. **Adding Content**
+   
+   .. code-block:: python
+
+      def add_content_to_vector_db(content_id, content_text, metadata):
+          """Add content to the vector database.
+          
+          Args:
+              content_id: The ID of the content in the relational database.
+              content_text: The text content to embed.
+              metadata: Additional metadata for the content.
+          """
+          embeddings = get_embeddings(content_text)
+          collection = get_collection("content_embeddings")
+          collection.add(
+              ids=[str(content_id)],
+              embeddings=[embeddings],
+              metadatas=[metadata],
+              documents=[content_text]
+          )
+
+2. **Searching Content**
+   
+   .. code-block:: python
+
+      def search_similar_content(query_text, limit=5):
+          """Search for similar content using vector similarity.
+          
+          Args:
+              query_text: The query text to search for.
+              limit: Maximum number of results to return.
+              
+          Returns:
+              The ChromaDB query result: a dict whose "ids", "documents",
+              "metadatas", and "distances" entries each hold one list per query.
+          """
+          query_embedding = get_embeddings(query_text)
+          collection = get_collection("content_embeddings")
+          results = collection.query(
+              query_embeddings=[query_embedding],
+              n_results=limit
+          )
+          return results
+
+3. **Updating Content**
+   
+   .. code-block:: python
+
+      def update_content_in_vector_db(content_id, new_content_text, metadata):
+          """Update content in the vector database.
+          
+          Args:
+              content_id: The ID of the content to update.
+              new_content_text: The updated text content.
+              metadata: Updated metadata.
+          """
+          new_embedding = get_embeddings(new_content_text)
+          collection = get_collection("content_embeddings")
+          collection.update(
+              ids=[str(content_id)],
+              embeddings=[new_embedding],
+              metadatas=[metadata],
+              documents=[new_content_text]
+          )
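
The search operation returns ChromaDB's nested result structure rather than a flat list. The sketch below shows one way to turn a single-query result into per-item records that link back to the relational ``contents`` table. The helper name ``flatten_query_results`` and the sample data are hypothetical; the dict shape follows what ``collection.query`` is documented to return, and a smaller distance means a closer match.

```python
from typing import Any


def flatten_query_results(results: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn a single-query result dict into a list of per-item dicts."""
    # Each field holds one inner list per query embedding. Only one
    # query was sent, so index 0 is the only entry.
    ids = results["ids"][0]
    distances = results["distances"][0]
    documents = results["documents"][0]
    metadatas = results["metadatas"][0]
    return [
        # IDs were stored as str(content_id), so convert them back.
        {"content_id": int(i), "distance": d, "document": doc, "metadata": meta}
        for i, d, doc, meta in zip(ids, distances, documents, metadatas)
    ]


# Hand-written sample with the assumed shape, not real query output.
sample = {
    "ids": [["12", "7"]],
    "distances": [[0.18, 0.42]],
    "documents": [["First post", "Second post"]],
    "metadatas": [[{"content_type": "blog"}, {"content_type": "twitter"}]],
}
items = flatten_query_results(sample)
```

Each ``content_id`` can then be used to load the full ``Content`` row from the relational database.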
+
+Database Migrations
+-------------------
+
+AI-Writer uses Alembic for database migrations. The migration workflow is:
+
+1. **Create Migration**
+   
+   .. code-block:: bash
+
+      alembic revision --autogenerate -m "Description of changes"
+
+2. **Apply Migration**
+   
+   .. code-block:: bash
+
+      alembic upgrade head
+
+3. **Rollback Migration**
+   
+   .. code-block:: bash
+
+      alembic downgrade -1
+
+Database Backup and Restore
+---------------------------
+
+Regular database backups are recommended:
+
+1. **SQLite Backup**
+   
+   .. code-block:: bash
+
+      # Backup
+      sqlite3 data/alwrity.db .dump > backup.sql
+      
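+      # Restore into a fresh file (the dump recreates every table,
+      # so loading it into the existing database fails)
+      mv data/alwrity.db data/alwrity.db.old
+      sqlite3 data/alwrity.db < backup.sql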
+
+2. **Vector Database Backup**
+   
+   ChromaDB persists its data in a directory on disk (``data/vectordb`` in the commands below), so it can be backed up by copying that directory, ideally while the application is stopped:
+   
+   .. code-block:: bash
+
+      # Backup
+      cp -r data/vectordb data/vectordb_backup
+      
+      # Restore
+      rm -rf data/vectordb
+      cp -r data/vectordb_backup data/vectordb
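
The two backups above can also be scripted from Python with the standard library alone. This is a sketch under assumptions: the function names are hypothetical, and the demo runs against throwaway files in a temporary directory rather than the real ``data/`` paths. ``Connection.backup`` uses SQLite's online backup API, which produces a consistent copy even if the database is in use.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def backup_sqlite(db_path: str, backup_path: str) -> None:
    """Copy a SQLite database using SQLite's online backup API."""
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(backup_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()


def backup_directory(src_dir: str, dest_dir: str) -> None:
    """Replace dest_dir with a fresh copy of src_dir."""
    dest = Path(dest_dir)
    if dest.exists():
        # Remove the old backup so the copy does not nest inside it.
        shutil.rmtree(dest)
    shutil.copytree(src_dir, dest)


# Demo against temporary files so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "app.db"
    conn = sqlite3.connect(db)
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.execute("INSERT INTO users (username) VALUES ('alice')")
    conn.commit()
    conn.close()

    backup_sqlite(str(db), str(Path(tmp) / "backup.db"))
    check = sqlite3.connect(Path(tmp) / "backup.db")
    restored_rows = check.execute("SELECT username FROM users").fetchall()
    check.close()

    vec = Path(tmp) / "vectordb"
    vec.mkdir()
    (vec / "index.bin").write_bytes(b"x")
    backup_directory(str(vec), str(Path(tmp) / "vectordb_backup"))
    copied_files = sorted(
        p.name for p in (Path(tmp) / "vectordb_backup").iterdir()
    )
```

Unlike the ``.dump`` approach, the SQLite backup produced here is a binary database file, so restoring it is a matter of copying the file back into place.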