Detailed Docs & Onboarding improvements
docs/architecture/database_schema.rst

Database Schema
===============

This document describes the database schema used in the AI-Writer platform, including both the relational database and vector database components.

Relational Database Schema
--------------------------

AI-Writer uses SQLAlchemy ORM to interact with the relational database. The schema consists of the following main tables:

User
~~~~

Stores user information and preferences.

.. code-block:: python

    # Imports shared by the model snippets in this section. ``Base`` stands in
    # for the project's declarative base so the snippets are self-contained.
    from datetime import datetime

    from sqlalchemy import (JSON, Boolean, Column, DateTime, Float,
                            ForeignKey, Integer, String, Text)
    from sqlalchemy.orm import declarative_base, relationship

    Base = declarative_base()


    class User(Base):
        __tablename__ = "users"

        id = Column(Integer, primary_key=True)
        username = Column(String, unique=True, nullable=False)
        email = Column(String, unique=True, nullable=False)
        password_hash = Column(String, nullable=False)
        created_at = Column(DateTime, default=datetime.utcnow)
        updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)

        # Relationships
        api_keys = relationship("ApiKey", back_populates="user")
        contents = relationship("Content", back_populates="user")
        settings = relationship("UserSetting", back_populates="user", uselist=False)

ApiKey
~~~~~~

Stores encrypted API keys for various services.

.. code-block:: python

    class ApiKey(Base):
        __tablename__ = "api_keys"

        id = Column(Integer, primary_key=True)
        user_id = Column(Integer, ForeignKey("users.id"))
        service_name = Column(String, nullable=False)
        encrypted_key = Column(String, nullable=False)
        is_active = Column(Boolean, default=True)
        created_at = Column(DateTime, default=datetime.utcnow)
        updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)

        # Relationships
        user = relationship("User", back_populates="api_keys")

Content
~~~~~~~

Stores generated content with metadata.

.. code-block:: python

    class Content(Base):
        __tablename__ = "contents"

        id = Column(Integer, primary_key=True)
        user_id = Column(Integer, ForeignKey("users.id"))
        title = Column(String, nullable=False)
        content_type = Column(String, nullable=False)  # blog, linkedin, twitter, etc.
        content_text = Column(Text, nullable=False)
        # The attribute name "metadata" is reserved on declarative classes, so the
        # attribute is renamed while the database column keeps the name "metadata".
        content_metadata = Column("metadata", JSON)
        created_at = Column(DateTime, default=datetime.utcnow)
        updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)

        # Relationships
        user = relationship("User", back_populates="contents")
        versions = relationship("ContentVersion", back_populates="content")
        analytics = relationship("ContentAnalytics", back_populates="content")

ContentVersion
~~~~~~~~~~~~~~

Tracks versions of content for history and rollback.

.. code-block:: python

    class ContentVersion(Base):
        __tablename__ = "content_versions"

        id = Column(Integer, primary_key=True)
        content_id = Column(Integer, ForeignKey("contents.id"))
        version_number = Column(Integer, nullable=False)
        content_text = Column(Text, nullable=False)
        # "metadata" is a reserved attribute name on declarative classes;
        # only the Python attribute is renamed, not the database column.
        content_metadata = Column("metadata", JSON)
        created_at = Column(DateTime, default=datetime.utcnow)

        # Relationships
        content = relationship("Content", back_populates="versions")
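
The schema says versions exist for history and rollback but does not say when a version row is written. The following is a minimal sketch of one possible policy, assuming the current text is snapshotted into ``content_versions`` before each overwrite. It uses the standard-library ``sqlite3`` module instead of the ORM so that it runs without dependencies. The function names ``save_new_version``, ``rollback_to_version`` and ``make_demo_db`` are hypothetical and not part of the codebase.

.. code-block:: python

    import sqlite3


    def save_new_version(conn, content_id, new_text):
        """Snapshot the current text as the next version, then overwrite it."""
        next_version = conn.execute(
            "SELECT COALESCE(MAX(version_number), 0) + 1 "
            "FROM content_versions WHERE content_id = ?",
            (content_id,),
        ).fetchone()[0]
        # Copy the text that is about to be replaced into the history table.
        conn.execute(
            "INSERT INTO content_versions (content_id, version_number, content_text) "
            "SELECT id, ?, content_text FROM contents WHERE id = ?",
            (next_version, content_id),
        )
        conn.execute(
            "UPDATE contents SET content_text = ? WHERE id = ?",
            (new_text, content_id),
        )
        conn.commit()
        return next_version


    def rollback_to_version(conn, content_id, version_number):
        """Restore the text stored under the given version number."""
        row = conn.execute(
            "SELECT content_text FROM content_versions "
            "WHERE content_id = ? AND version_number = ?",
            (content_id, version_number),
        ).fetchone()
        if row is None:
            raise LookupError(f"no version {version_number} for content {content_id}")
        conn.execute(
            "UPDATE contents SET content_text = ? WHERE id = ?",
            (row[0], content_id),
        )
        conn.commit()
        return row[0]


    def make_demo_db():
        """Build an in-memory database with only the columns this sketch needs."""
        conn = sqlite3.connect(":memory:")
        conn.executescript(
            """
            CREATE TABLE contents (
                id INTEGER PRIMARY KEY,
                title TEXT NOT NULL,
                content_text TEXT NOT NULL
            );
            CREATE TABLE content_versions (
                id INTEGER PRIMARY KEY,
                content_id INTEGER REFERENCES contents(id),
                version_number INTEGER NOT NULL,
                content_text TEXT NOT NULL
            );
            """
        )
        conn.execute(
            "INSERT INTO contents (id, title, content_text) "
            "VALUES (1, 'Draft', 'first text')"
        )
        conn.commit()
        return conn

Numbering versions per ``content_id`` keeps ``version_number`` readable in the UI. A unique constraint on ``(content_id, version_number)`` would be a natural addition if concurrent writers are possible.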

ContentAnalytics
~~~~~~~~~~~~~~~~

Stores analytics data for content performance.

.. code-block:: python

    class ContentAnalytics(Base):
        __tablename__ = "content_analytics"

        id = Column(Integer, primary_key=True)
        content_id = Column(Integer, ForeignKey("contents.id"))
        views = Column(Integer, default=0)
        likes = Column(Integer, default=0)
        shares = Column(Integer, default=0)
        comments = Column(Integer, default=0)
        engagement_rate = Column(Float, default=0.0)
        last_updated = Column(DateTime, default=datetime.utcnow)

        # Relationships
        content = relationship("Content", back_populates="analytics")
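
The schema stores ``engagement_rate`` but does not define how it is derived. As an illustration only, the sketch below assumes one common definition: interactions (likes, shares and comments) divided by views. The function name and the formula are assumptions, not the platform's documented calculation.

.. code-block:: python

    def engagement_rate(views, likes, shares, comments):
        """Return interactions per view, or 0.0 when there are no views."""
        if views <= 0:
            # Avoid division by zero for content that has not been viewed yet.
            return 0.0
        return (likes + shares + comments) / views

With this definition, 200 views and 20 interactions in total give a rate of 0.1.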

UserSetting
~~~~~~~~~~~

Stores user preferences and settings.

.. code-block:: python

    class UserSetting(Base):
        __tablename__ = "user_settings"

        id = Column(Integer, primary_key=True)
        user_id = Column(Integer, ForeignKey("users.id"), unique=True)
        preferred_ai_provider = Column(String)
        default_content_type = Column(String)
        ui_theme = Column(String, default="light")
        language = Column(String, default="en")
        settings_json = Column(JSON)

        # Relationships
        user = relationship("User", back_populates="settings")

Template
~~~~~~~~

Stores reusable content templates.

.. code-block:: python

    class Template(Base):
        __tablename__ = "templates"

        id = Column(Integer, primary_key=True)
        user_id = Column(Integer, ForeignKey("users.id"))
        name = Column(String, nullable=False)
        content_type = Column(String, nullable=False)
        template_text = Column(Text, nullable=False)
        variables = Column(JSON)
        created_at = Column(DateTime, default=datetime.utcnow)
        updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)

        # Relationships
        user = relationship("User")

Vector Database Schema
----------------------

AI-Writer uses ChromaDB for vector storage, which enables semantic search and retrieval of content. The vector database stores:

1. **Content Embeddings**

   * Generated from content text using embedding models
   * Used for semantic search and content similarity

2. **Metadata**

   * Content ID (linking to relational database)
   * Content type
   * Creation date
   * Keywords and tags

3. **Collections**

   ChromaDB organizes embeddings into collections:

   * ``content_embeddings``: Main collection for all content
   * ``user_{user_id}_content``: Per-user content collections
   * ``{content_type}_embeddings``: Collections by content type
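
The naming patterns and metadata fields listed above can be sketched as two small helpers. This is an illustration under stated assumptions: the helper names, the metadata key names, and the lowercasing of ``content_type`` are not taken from the codebase. Values are kept as flat scalars (the keyword list is joined into one string) because ChromaDB metadata has generally been limited to scalar values.

.. code-block:: python

    import re
    from datetime import datetime, timezone

    MAIN_COLLECTION = "content_embeddings"


    def collection_names(user_id, content_type):
        """Return the three collection names a piece of content belongs to."""
        # Normalize the content type so it is safe to embed in a collection name.
        slug = re.sub(r"[^a-z0-9_]+", "_", content_type.lower()).strip("_")
        return [MAIN_COLLECTION, f"user_{user_id}_content", f"{slug}_embeddings"]


    def build_metadata(content_id, content_type, created_at, keywords):
        """Build a flat metadata dict that links back to the relational row."""
        return {
            "content_id": content_id,            # primary key in the contents table
            "content_type": content_type,
            "created_at": created_at.isoformat(),
            "keywords": ",".join(keywords),      # flattened: one string, not a list
        }

For example, ``collection_names(42, "LinkedIn")`` yields ``content_embeddings``, ``user_42_content`` and ``linkedin_embeddings``.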

Vector Database Operations
--------------------------

The vector database supports the following operations:

1. **Adding Content**

   .. code-block:: python

       def add_content_to_vector_db(content_id, content_text, metadata):
           """Add content to the vector database.

           Args:
               content_id: The ID of the content in the relational database.
               content_text: The text content to embed.
               metadata: Additional metadata for the content.
           """
           embeddings = get_embeddings(content_text)
           collection = get_collection("content_embeddings")
           collection.add(
               ids=[str(content_id)],
               embeddings=[embeddings],
               metadatas=[metadata],
               documents=[content_text]
           )

2. **Searching Content**

   .. code-block:: python

       def search_similar_content(query_text, limit=5):
           """Search for similar content using vector similarity.

           Args:
               query_text: The query text to search for.
               limit: Maximum number of results to return.

           Returns:
               The raw ChromaDB query result: a dict whose "ids", "documents",
               "metadatas" and "distances" entries each hold one list per
               query embedding. A smaller distance means a closer match.
           """
           query_embedding = get_embeddings(query_text)
           collection = get_collection("content_embeddings")
           results = collection.query(
               query_embeddings=[query_embedding],
               n_results=limit
           )
           return results

3. **Updating Content**

   .. code-block:: python

       def update_content_in_vector_db(content_id, new_content_text, metadata):
           """Update content in the vector database.

           Args:
               content_id: The ID of the content to update.
               new_content_text: The updated text content.
               metadata: Updated metadata.
           """
           new_embedding = get_embeddings(new_content_text)
           collection = get_collection("content_embeddings")
           collection.update(
               ids=[str(content_id)],
               embeddings=[new_embedding],
               metadatas=[metadata],
               documents=[new_content_text]
           )
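
``search_similar_content`` returns ChromaDB's result structure as is, which nests one list per query embedding. The sketch below shows one way a caller might flatten the result for a single query into per-hit records. The helper name ``flatten_query_results`` and the record keys are hypothetical. It assumes IDs were stored as ``str(content_id)``, as in the add operation above.

.. code-block:: python

    def flatten_query_results(results):
        """Turn a single-query ChromaDB result into a list of hit dicts."""
        # Index 0 selects the lists that belong to the first (only) query.
        ids = results["ids"][0]
        distances = results["distances"][0]
        documents = results["documents"][0]
        metadatas = results["metadatas"][0]

        hits = [
            {
                "content_id": int(item_id),  # back to the relational primary key
                "distance": distance,
                "text": document,
                "metadata": metadata,
            }
            for item_id, distance, document, metadata
            in zip(ids, distances, documents, metadatas)
        ]
        # Results normally arrive ordered by distance; sorting is a safeguard.
        return sorted(hits, key=lambda hit: hit["distance"])

The integer ``content_id`` of each hit can then be used to load the full ``Content`` row from the relational database.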

Database Migrations
-------------------

AI-Writer uses Alembic for database migrations. The migration workflow is:

1. **Create Migration**

   .. code-block:: bash

       alembic revision --autogenerate -m "Description of changes"

2. **Apply Migration**

   .. code-block:: bash

       alembic upgrade head

3. **Rollback Migration**

   .. code-block:: bash

       alembic downgrade -1

Database Backup and Restore
---------------------------

Regular database backups are recommended:

1. **SQLite Backup**

   .. code-block:: bash

       # Backup
       sqlite3 data/alwrity.db .dump > backup.sql

       # Restore (into a new or empty database file; replaying the dump
       # over existing tables fails because the tables already exist)
       sqlite3 data/alwrity.db < backup.sql

2. **Vector Database Backup**

   ChromaDB stores its data in the configured persistence directory (``data/vectordb`` in the commands below), so it can be backed up by copying that directory:

   .. code-block:: bash

       # Backup
       cp -r data/vectordb data/vectordb_backup

       # Restore
       rm -rf data/vectordb
       cp -r data/vectordb_backup data/vectordb
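
The same backups can be scripted from Python. The following is a minimal sketch using only the standard library. The function names are hypothetical, and the paths passed in are up to the caller. ``sqlite3.Connection.backup`` copies the database through SQLite's online backup API, which produces a consistent copy even if the source database is in use.

.. code-block:: python

    import shutil
    import sqlite3
    from pathlib import Path


    def backup_sqlite(src_path, dest_path):
        """Copy a SQLite database file using the online backup API."""
        src = sqlite3.connect(str(src_path))
        dest = sqlite3.connect(str(dest_path))
        try:
            src.backup(dest)
        finally:
            dest.close()
            src.close()


    def backup_vector_dir(src_dir, dest_dir):
        """Mirror the vector store directory into dest_dir."""
        dest_dir = Path(dest_dir)
        if dest_dir.exists():
            # Remove the old copy first so stale files do not linger.
            shutil.rmtree(dest_dir)
        shutil.copytree(src_dir, dest_dir)

Copying the vector store directory is safest while the application is stopped, since files that change mid-copy may leave the backup inconsistent.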