ALwrity/docs/developer/coding_standards.rst

Coding Standards
===============

This document outlines the coding standards and best practices for contributing to the AI-Writer project.

Code Style
---------

AI-Writer follows the PEP 8 style guide for Python code with some additional guidelines:

1. **Indentation**

   * Use 4 spaces for indentation (no tabs)
   * Continuation lines should align with the opening delimiter or be indented by 4 spaces

2. **Line Length**

   * Maximum line length is 100 characters
   * For docstrings and comments, limit to 80 characters

3. **Imports**

   * Group imports in the following order:
     1. Standard library imports
     2. Related third-party imports
     3. Local application/library specific imports
   * Within each group, imports should be sorted alphabetically
   * Use absolute imports rather than relative imports

   Example:

   .. code-block:: python

      # Standard library
      import os
      import sys
      from typing import Dict, List, Optional

      # Third-party
      import numpy as np
      import pandas as pd
      import streamlit as st

      # Local
      from lib.database import models
      from lib.utils import helpers

4. **Naming Conventions**

   * Classes: `CamelCase`
   * Functions and variables: `snake_case`
   * Constants: `UPPER_CASE`
   * Private methods and variables: `_leading_underscore`
   * Protected methods and variables: `__double_leading_underscore`

5. **String Formatting**

   * Use f-strings for string formatting when possible
   * For older Python versions, use `.format()` method
   * Avoid using `%` formatting

   Example:

   .. code-block:: python

      # Preferred
      name = "World"
      greeting = f"Hello, {name}!"

      # Acceptable
      greeting = "Hello, {}!".format(name)

      # Avoid
      greeting = "Hello, %s!" % name

Documentation
------------

1. **Docstrings**

   * Use Google-style docstrings
   * All modules, classes, and functions should have docstrings
   * Include type hints in function signatures

   Example:

   .. code-block:: python

      def generate_content(prompt: str, max_tokens: int = 100) -> str:
          """Generate content using the AI model.

          Args:
              prompt: The input prompt for content generation.
              max_tokens: Maximum number of tokens to generate.

          Returns:
              The generated content as a string.

          Raises:
              ValueError: If the prompt is empty or max_tokens is negative.
          """
          if not prompt:
              raise ValueError("Prompt cannot be empty")

          if max_tokens < 0:
              raise ValueError("max_tokens must be a positive integer")

          # Implementation...
          return generated_content

2. **Comments**

   * Use comments sparingly and only when necessary
   * Focus on explaining "why" rather than "what"
   * Keep comments up-to-date with code changes

3. **Type Hints**

   * Use type hints for all function parameters and return values
   * Use `Optional` for parameters that can be None
   * Use `Union` for parameters that can be multiple types
   * Use `Any` only when absolutely necessary

   Example:

   .. code-block:: python

      from typing import Dict, List, Optional, Union

      def process_data(
          data: Union[Dict[str, str], List[str]],
          config: Optional[Dict[str, str]] = None
      ) -> List[str]:
          """Process the input data."""
          # Implementation...
          return processed_data

Error Handling
-------------

1. **Exceptions**

   * Use specific exception types rather than generic exceptions
   * Handle exceptions at the appropriate level
   * Include meaningful error messages
   * Log exceptions with appropriate context

   Example:

   .. code-block:: python

      try:
          result = api_client.fetch_data(query)
      except ConnectionError as e:
          logger.error(f"Failed to connect to API: {e}")
          raise ServiceUnavailableError("API service is currently unavailable") from e
      except ValueError as e:
          logger.warning(f"Invalid query parameter: {e}")
          raise InvalidParameterError(f"Invalid query parameter: {e}") from e

2. **Validation**

   * Validate input parameters early
   * Use assertions for internal checks (not for input validation)
   * Return meaningful error messages for invalid inputs

Testing
------

1. **Test Coverage**

   * Aim for at least 80% test coverage for new code
   * Write unit tests for all new functions and classes
   * Include integration tests for complex interactions

2. **Test Organization**

   * Place tests in the `tests/` directory
   * Mirror the package structure in the test directory
   * Name test files with `test_` prefix

3. **Test Naming**

   * Use descriptive test names that explain what is being tested
   * Follow the pattern `test_<function_name>_<scenario>_<expected_result>`

   Example:

   .. code-block:: python

      def test_generate_content_empty_prompt_raises_value_error():
          """Test that generate_content raises ValueError for empty prompts."""
          with pytest.raises(ValueError, match="Prompt cannot be empty"):
              generate_content("")

Performance Considerations
------------------------

1. **Resource Usage**

   * Be mindful of memory usage, especially for large datasets
   * Use generators and iterators for large data processing
   * Consider using async functions for I/O-bound operations

2. **Optimization**

   * Optimize for readability first, then performance
   * Document performance-critical sections
   * Include benchmarks for performance-sensitive code

Security Best Practices
---------------------

1. **API Keys and Secrets**

   * Never hardcode API keys or secrets
   * Use environment variables or secure storage
   * Implement proper access controls for sensitive data

2. **Input Validation**

   * Validate and sanitize all user inputs
   * Use parameterized queries for database operations
   * Implement proper authentication and authorization

3. **Dependency Management**

   * Keep dependencies up-to-date
   * Regularly check for security vulnerabilities
   * Pin dependency versions for reproducibility