ALwrity LLM Gateway – Architecture Overview

ALwrity’s LLM Gateway lives under llm_providers and provides a consistent, production‑oriented interface for text, image, audio, and video generation across multiple model providers. It encapsulates provider differences, applies subscription enforcement, and centralizes observability and reliability patterns.

Goals

Unified surface for LLM operations across providers
Strong subscription enforcement and cost awareness
Resilient calls with retries and structured error handling
Extensible provider architecture with clear contracts
Transparent metrics, usage logging, and pricing integration

High‑Level Flow

Entry points route requests to the appropriate capability:
- Text generation via main_text_generation.py
- Image generation and editing via image_generation
- Video generation via video_generation
- Audio/STT via audio_to_text_generation
Subscription enforcement runs before any provider call:
- Uses PricingService and UsageTrackingService to validate tokens/operations
- Blocks requests that exceed limits with actionable error payloads
Provider module performs the call with provider‑specific SDKs/APIs
Results are normalized to ALwrity types and returned upstream
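
The four stages above can be sketched as a small dispatcher. This is a minimal illustration under stated assumptions, not ALwrity's actual code: `handle_request`, `GatewayResult`, `LimitExceeded`, the `HANDLERS` table, and the stub handlers are hypothetical names, and the simple counter check stands in for the validation that PricingService and UsageTrackingService perform.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class GatewayResult:
    """Hypothetical normalized result returned upstream."""
    capability: str
    provider: str
    payload: str
    metadata: Dict[str, str] = field(default_factory=dict)


class LimitExceeded(Exception):
    """Hypothetical error raised when a plan limit would be exceeded."""


def _text_stub(prompt: str) -> GatewayResult:
    # Stand-in for a provider-specific SDK/API call.
    return GatewayResult("text", "stub-provider", f"echo: {prompt}")


def _image_stub(prompt: str) -> GatewayResult:
    return GatewayResult("image", "stub-provider", f"<image for {prompt!r}>")


# Stage 1: entry points keyed by capability (illustrative names only).
HANDLERS: Dict[str, Callable[[str], GatewayResult]] = {
    "text": _text_stub,
    "image": _image_stub,
}


def handle_request(capability: str, prompt: str, used: int, limit: int) -> GatewayResult:
    handler = HANDLERS.get(capability)
    if handler is None:
        raise ValueError(f"unsupported capability: {capability}")
    # Stage 2: enforce limits before the provider is ever called.
    if used >= limit:
        raise LimitExceeded(f"{used}/{limit} operations used")
    # Stages 3 and 4: call the provider and return the normalized result.
    return handler(prompt)
```

The point of the ordering is that a blocked request never reaches a provider, so it cannot incur provider cost.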

Core Components

Text Generation Entry: main_text_generation.py
- Detects available providers via APIKeyManager
- Applies strict subscription checks using PricingService and UsageTrackingService
- Routes to Gemini or Hugging Face implementations
Image Generation Contracts: base.py
- Options and Result dataclasses
- Protocols for generation, edit, and face‑swap providers
Video Generation Contracts: base.py
- Options and Result dataclasses
- Async protocol with progress callbacks
Provider Implementations:
- Gemini text: gemini_provider.py
- Hugging Face text: huggingface_provider.py
- Hugging Face image: hf_provider.py
- WaveSpeed video: wavespeed_provider.py

Provider Abstraction

Image providers conform to:
- ImageGenerationProvider.generate(options) -> ImageGenerationResult
- ImageEditProvider.edit(options) -> ImageGenerationResult
- FaceSwapProvider.swap_face(options) -> ImageGenerationResult
Video providers conform to:
- VideoGenerationProvider.generate_video(options, progress_cb) -> VideoGenerationResult

These contracts ensure consistent options/result types so downstream UI and logging remain stable regardless of provider.
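
As an illustration of how such a contract might look, the sketch below defines an image generation protocol with `typing.Protocol`. Only the names `ImageGenerationProvider`, `generate`, and `ImageGenerationResult` come from the text above; `ImageGenerationOptions`, all dataclass fields, `FakeImageProvider`, and `render` are assumptions made for the example, and the real definitions in base.py may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Protocol, runtime_checkable


@dataclass
class ImageGenerationOptions:
    # Field names are assumptions for illustration.
    prompt: str
    width: int = 1024
    height: int = 1024
    model: Optional[str] = None


@dataclass
class ImageGenerationResult:
    # Field names are assumptions for illustration.
    image_bytes: bytes
    provider: str
    model: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@runtime_checkable
class ImageGenerationProvider(Protocol):
    def generate(self, options: ImageGenerationOptions) -> ImageGenerationResult:
        ...


class FakeImageProvider:
    """Hypothetical provider; conforms structurally without inheriting."""

    def generate(self, options: ImageGenerationOptions) -> ImageGenerationResult:
        return ImageGenerationResult(
            image_bytes=b"\x89PNG",  # placeholder bytes, not a real image
            provider="fake",
            model=options.model or "fake-default",
            metadata={"width": options.width, "height": options.height},
        )


def render(provider: ImageGenerationProvider, prompt: str) -> ImageGenerationResult:
    # Callers depend only on the contract, never on a concrete provider.
    return provider.generate(ImageGenerationOptions(prompt=prompt))
```

Because protocols are structural, a new provider only needs a `generate` method with a compatible signature; it does not have to import or subclass anything from the gateway.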

Subscription Enforcement

Enforcement is performed in the text pipeline entry point before any provider call:
- See enforcement and usage checks in main_text_generation.py
Preflight operations endpoint also validates multi‑operation cost/limits:
- See preflight.py
Image/video modules typically rely on the calling route to validate limits first, then perform provider calls.
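
A limit check of this kind could look roughly like the following. It is a sketch only: `UsageSnapshot`, `SubscriptionLimitError`, `enforce_token_limit`, and the payload keys are hypothetical, and the actual checks in main_text_generation.py go through PricingService and UsageTrackingService rather than a plain dataclass.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class UsageSnapshot:
    """Hypothetical view of a user's current usage against the plan."""
    tokens_used: int
    tokens_limit: int


class SubscriptionLimitError(Exception):
    """Hypothetical structured error; a route can map it to an HTTP 429."""

    def __init__(self, detail: Dict[str, Any]) -> None:
        super().__init__(detail["message"])
        self.status_code = 429
        self.detail = detail


def enforce_token_limit(usage: UsageSnapshot, requested_tokens: int) -> None:
    # Check the projected total, so the request is blocked before it runs.
    projected = usage.tokens_used + requested_tokens
    if projected > usage.tokens_limit:
        raise SubscriptionLimitError({
            "message": "Token limit exceeded for the current plan",
            "tokens_used": usage.tokens_used,
            "tokens_requested": requested_tokens,
            "tokens_limit": usage.tokens_limit,
        })
```

Carrying the usage numbers in the error detail is what makes the payload actionable: the client can show how far over the limit the request would have gone.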

Configuration and Secrets

Gemini: GEMINI_API_KEY
- Loaded and validated in gemini_provider.py
Hugging Face: HF_TOKEN
- Loaded and validated in huggingface_provider.py
Hugging Face image defaults: HF_IMAGE_MODEL
- Used in image_generation/hf_provider.py
Provider clients must never log secrets; logs are provider‑scoped via get_service_logger.
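
One way to keep key validation local to a provider module while keeping secrets out of logs is sketched below. The helper names `require_env` and `describe_key` are hypothetical; only the environment variable names come from this document.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return a non-empty variable value or fail fast with a clear error."""
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    if not value:
        # The error names the variable but never echoes a value.
        raise RuntimeError(f"{name} is not set")
    return value


def describe_key(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Log-safe description: reports presence only, never the secret itself."""
    source = os.environ if env is None else env
    state = "<set>" if source.get(name, "").strip() else "<missing>"
    return f"{name}={state}"
```

A provider module could call `require_env("GEMINI_API_KEY")` at client construction and pass `describe_key(...)` to its logger when reporting configuration.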

Reliability and Error Handling

Exponential backoff retries using tenacity:
- Gemini text: gemini_text_response
- Hugging Face text: huggingface_text_response
Limit breaches raise structured exceptions that surface as HTTP 429 responses carrying usage information
Provider modules return normalized results; callers handle downstream persistence and telemetry
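
The gateway uses tenacity for retries. To show the backoff logic itself without a third-party dependency, the sketch below uses only the standard library; it is a stand-in, not the gateway's implementation. `call_with_backoff`, its parameters, and the default exception types are all assumptions.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying transient failures with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error unchanged
            # Delay doubles each attempt (base, 2*base, 4*base, ...), capped.
            sleep(min(base_delay * 2 ** (attempt - 1), max_delay))
    raise AssertionError("unreachable")
```

Only exception types listed in `retry_on` are retried; anything else, such as a limit breach, propagates immediately so that non-transient errors are not masked by repeated calls.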

Pricing and Cost Awareness

Preflight cost estimation computes operation costs per provider/model:
- See multi‑operation handling in preflight.py
Video cost calculation is provider/model aware:
- See WaveSpeed services and calculate_cost in video_generation/wavespeed_provider.py
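
A multi-operation preflight estimate can be pictured as a lookup keyed by provider and model, summed across operations. The sketch below is hypothetical throughout: `PRICE_TABLE`, `Operation`, `estimate_cost`, and `preflight` are invented names, and every price is a made-up placeholder rather than real provider pricing.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, Iterable, Tuple

# All prices below are made-up placeholders, not real provider pricing.
PRICE_TABLE: Dict[Tuple[str, str], Decimal] = {
    ("example-text-provider", "example-text-model"): Decimal("0.002"),   # per 1K tokens
    ("example-video-provider", "example-video-model"): Decimal("0.05"),  # per second
}


@dataclass
class Operation:
    provider: str
    model: str
    units: Decimal  # thousands of tokens, seconds of video, etc.


def estimate_cost(op: Operation) -> Decimal:
    try:
        unit_price = PRICE_TABLE[(op.provider, op.model)]
    except KeyError:
        # Unknown models fail loudly instead of silently costing zero.
        raise ValueError(f"no pricing for {op.provider}/{op.model}") from None
    return unit_price * op.units


def preflight(ops: Iterable[Operation], budget: Decimal) -> Dict[str, object]:
    # Sum every operation up front so the whole batch is approved or rejected.
    total = sum((estimate_cost(op) for op in ops), Decimal("0"))
    return {"total_cost": total, "allowed": total <= budget}
```

`Decimal` is used instead of `float` so that summed prices do not accumulate binary rounding error.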

Observability

Service‑scoped loggers for each provider/module
Central usage logs recorded via subscription services on the calling routes
Provider metadata normalized in result objects for consistent analytics

Extensibility Guidelines

Implement the appropriate Protocol interface in a new provider module
Normalize options and results to the gateway dataclasses
Keep environment/key validation local to the provider module
Add cost mapping in PricingService and preflight for new operations/models
Wire subscription validation in the calling route before invoking provider

Request Lifecycle (Text)

Client submits prompt to text endpoint
Entry point determines provider (env or APIKeyManager) and validates subscription limits
Provider‑specific function executes with retries and returns normalized text
Caller logs usage and returns response to client
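
The provider-selection step could be sketched as follows. The environment variable names come from the configuration section; the preference order (Gemini first), the `preferred` override, and the function name `select_text_provider` are assumptions, and the real entry point consults APIKeyManager rather than a raw mapping.

```python
from typing import Mapping, Optional

# Insertion order doubles as the (assumed) default preference order.
PROVIDER_KEYS = {"gemini": "GEMINI_API_KEY", "huggingface": "HF_TOKEN"}


def select_text_provider(env: Mapping[str, str], preferred: Optional[str] = None) -> str:
    # A provider counts as available only if its key is present and non-blank.
    available = [
        name for name, key in PROVIDER_KEYS.items() if env.get(key, "").strip()
    ]
    if preferred is not None:
        if preferred not in available:
            raise RuntimeError(f"preferred provider {preferred!r} has no key configured")
        return preferred
    if not available:
        raise RuntimeError("no text provider is configured")
    return available[0]
```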

Request Lifecycle (Media)

Client submits generation/edit/face‑swap request
Route validates plan limits (tokens, requests, or per‑operation limits)
Provider service executes call and produces normalized binary payload and metadata
Caller logs usage and returns media/links to client
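
For video, the provider step is asynchronous and reports progress through a callback. The sketch below shows one possible shape for that interaction. `VideoGenerationProvider`, `generate_video`, `progress_cb`, and `VideoGenerationResult` are named in this document; the callback signature, `VideoGenerationOptions`, the dataclass fields, `FakeVideoProvider`, and `run_demo` are assumptions for illustration.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Protocol

# Assumed callback shape: (fraction complete, stage label).
ProgressCallback = Callable[[float, str], None]


@dataclass
class VideoGenerationOptions:
    prompt: str
    duration_seconds: int = 5


@dataclass
class VideoGenerationResult:
    video_bytes: bytes
    provider: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class VideoGenerationProvider(Protocol):
    async def generate_video(
        self,
        options: VideoGenerationOptions,
        progress_cb: Optional[ProgressCallback] = None,
    ) -> VideoGenerationResult:
        ...


class FakeVideoProvider:
    """Hypothetical provider that simulates a polled, long-running job."""

    async def generate_video(self, options, progress_cb=None):
        for fraction, stage in ((0.0, "queued"), (0.5, "rendering"), (1.0, "done")):
            await asyncio.sleep(0)  # yield to the event loop, as a real poll would
            if progress_cb is not None:
                progress_cb(fraction, stage)
        return VideoGenerationResult(
            b"\x00", "fake", {"duration_seconds": options.duration_seconds}
        )


def run_demo() -> List[str]:
    stages: List[str] = []
    result = asyncio.run(
        FakeVideoProvider().generate_video(
            VideoGenerationOptions("a sunrise"),
            lambda fraction, stage: stages.append(stage),
        )
    )
    assert result.provider == "fake"
    return stages
```

Keeping the callback optional lets batch callers ignore progress while interactive routes forward it to the client.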

This architecture isolates provider variability while standardizing contracts, enabling safe expansion to new models and modalities without destabilizing upstream consumers.
