Coding Standards =============== This document outlines the coding standards and best practices for contributing to the AI-Writer project. Code Style --------- AI-Writer follows the PEP 8 style guide for Python code with some additional guidelines: 1. **Indentation** * Use 4 spaces for indentation (no tabs) * Continuation lines should align with the opening delimiter or be indented by 4 spaces 2. **Line Length** * Maximum line length is 100 characters * For docstrings and comments, limit to 80 characters 3. **Imports** * Group imports in the following order: 1. Standard library imports 2. Related third-party imports 3. Local application/library specific imports * Within each group, imports should be sorted alphabetically * Use absolute imports rather than relative imports Example: .. code-block:: python # Standard library import os import sys from typing import Dict, List, Optional # Third-party import numpy as np import pandas as pd import streamlit as st # Local from lib.database import models from lib.utils import helpers 4. **Naming Conventions** * Classes: `CamelCase` * Functions and variables: `snake_case` * Constants: `UPPER_CASE` * Private methods and variables: `_leading_underscore` * Protected methods and variables: `__double_leading_underscore` 5. **String Formatting** * Use f-strings for string formatting when possible * For older Python versions, use `.format()` method * Avoid using `%` formatting Example: .. code-block:: python # Preferred name = "World" greeting = f"Hello, {name}!" # Acceptable greeting = "Hello, {}!".format(name) # Avoid greeting = "Hello, %s!" % name Documentation ------------ 1. **Docstrings** * Use Google-style docstrings * All modules, classes, and functions should have docstrings * Include type hints in function signatures Example: .. code-block:: python def generate_content(prompt: str, max_tokens: int = 100) -> str: """Generate content using the AI model. Args: prompt: The input prompt for content generation. max_tokens: Maximum number of tokens to generate. Returns: The generated content as a string. Raises: ValueError: If the prompt is empty or max_tokens is negative. """ if not prompt: raise ValueError("Prompt cannot be empty") if max_tokens < 0: raise ValueError("max_tokens must be a positive integer") # Implementation... return generated_content 2. **Comments** * Use comments sparingly and only when necessary * Focus on explaining "why" rather than "what" * Keep comments up-to-date with code changes 3. **Type Hints** * Use type hints for all function parameters and return values * Use `Optional` for parameters that can be None * Use `Union` for parameters that can be multiple types * Use `Any` only when absolutely necessary Example: .. code-block:: python from typing import Dict, List, Optional, Union def process_data( data: Union[Dict[str, str], List[str]], config: Optional[Dict[str, str]] = None ) -> List[str]: """Process the input data.""" # Implementation... return processed_data Error Handling ------------- 1. **Exceptions** * Use specific exception types rather than generic exceptions * Handle exceptions at the appropriate level * Include meaningful error messages * Log exceptions with appropriate context Example: .. code-block:: python try: result = api_client.fetch_data(query) except ConnectionError as e: logger.error(f"Failed to connect to API: {e}") raise ServiceUnavailableError("API service is currently unavailable") from e except ValueError as e: logger.warning(f"Invalid query parameter: {e}") raise InvalidParameterError(f"Invalid query parameter: {e}") from e 2. **Validation** * Validate input parameters early * Use assertions for internal checks (not for input validation) * Return meaningful error messages for invalid inputs Testing ------ 1. **Test Coverage** * Aim for at least 80% test coverage for new code * Write unit tests for all new functions and classes * Include integration tests for complex interactions 2. **Test Organization** * Place tests in the `tests/` directory * Mirror the package structure in the test directory * Name test files with `test_` prefix 3. **Test Naming** * Use descriptive test names that explain what is being tested * Follow the pattern `test___` Example: .. code-block:: python def test_generate_content_empty_prompt_raises_value_error(): """Test that generate_content raises ValueError for empty prompts.""" with pytest.raises(ValueError, match="Prompt cannot be empty"): generate_content("") Performance Considerations ------------------------ 1. **Resource Usage** * Be mindful of memory usage, especially for large datasets * Use generators and iterators for large data processing * Consider using async functions for I/O-bound operations 2. **Optimization** * Optimize for readability first, then performance * Document performance-critical sections * Include benchmarks for performance-sensitive code Security Best Practices --------------------- 1. **API Keys and Secrets** * Never hardcode API keys or secrets * Use environment variables or secure storage * Implement proper access controls for sensitive data 2. **Input Validation** * Validate and sanitize all user inputs * Use parameterized queries for database operations * Implement proper authentication and authorization 3. **Dependency Management** * Keep dependencies up-to-date * Regularly check for security vulnerabilities * Pin dependency versions for reproducibility