ALwrity/docs/API_KEY_INJECTION_EXPLAINED.md
2025-10-10 23:19:28 +05:30

API Key Injection - How It Works in Production

🎯 The Problem You Identified

Question: "For production, when we read APIs from database, how will they be exported to the environment?"

Answer: They are temporarily injected into os.environ for each request, then immediately cleaned up.


🔍 The Challenge

Existing Code Pattern:

Most of your codebase uses this pattern:

import os
import google.generativeai as genai

def generate_content(prompt: str):
    # Expects GEMINI_API_KEY in environment
    gemini_key = os.getenv('GEMINI_API_KEY')
    genai.configure(api_key=gemini_key)
    # ...

Production Problem:

User A's request:
  ↓
  os.getenv('GEMINI_API_KEY') → ??? (User A's key in database, not in os.environ)
  
User B's request (simultaneous):
  ↓
  os.getenv('GEMINI_API_KEY') → ??? (User B's key in database, not in os.environ)

Issue: os.environ is global, but we need user-specific keys!


The Solution: Request-Scoped Injection

How It Works:

1. Request arrives with Authorization: Bearer <user_a_token>
   ↓
2. API Key Injection Middleware extracts user_id from token
   ↓
3. Fetch User A's keys from database
   ↓
4. Temporarily inject into os.environ:
   - GEMINI_API_KEY = user_a_gemini_key
   - EXA_API_KEY = user_a_exa_key
   ↓
5. Process request (all os.getenv() calls get User A's keys)
   ↓
6. Request completes
   ↓
7. IMMEDIATELY clean up os.environ (remove User A's keys)

Key Insight:

The injection is request-scoped, not global:

  • User A's keys exist in os.environ ONLY during User A's request
  • Immediately removed after response sent
  • User B's request gets User B's keys injected
  • No conflict, as long as the two requests do not overlap in time
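
The inject-then-clean-up pattern can be sketched as a small context manager. This is a minimal illustration, not the project's actual middleware: the name injected_env and the DEMO_GEMINI_KEY variable are hypothetical and exist only for this example.

```python
import os
from contextlib import contextmanager
from typing import Dict, Iterator


@contextmanager
def injected_env(values: Dict[str, str]) -> Iterator[None]:
    """Temporarily set environment variables, then restore the previous state."""
    # Remember what was there before (None means the variable was unset).
    originals = {name: os.environ.get(name) for name in values}
    try:
        os.environ.update(values)
        yield
    finally:
        # Runs even if the body raises, so injected keys never outlive the block.
        for name, original in originals.items():
            if original is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = original


# Usage: the key is visible only inside the with-block.
os.environ.pop("DEMO_GEMINI_KEY", None)
with injected_env({"DEMO_GEMINI_KEY": "user_a_gemini_key"}):
    inside = os.getenv("DEMO_GEMINI_KEY")
after = os.getenv("DEMO_GEMINI_KEY")
```

Restoring the previous value, rather than always deleting, keeps any server-level key that was already set in the environment.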

🏗️ Architecture

Middleware Flow:

FastAPI Request Pipeline:

┌─────────────────────────────────────────────────────────────┐
│ 1. Rate Limit Middleware                                    │
│    └─> Check rate limits                                    │
└─────────────────────────────────────────────────────────────┘
                         ↓
┌─────────────────────────────────────────────────────────────┐
│ 2. API Key Injection Middleware (NEW!)                      │
│    ├─> Extract user_id from Authorization header            │
│    ├─> Fetch user's API keys from database                  │
│    ├─> Inject into os.environ (temporarily)                 │
│    │   ├─> GEMINI_API_KEY = user_specific_key               │
│    │   ├─> EXA_API_KEY = user_specific_key                  │
│    │   └─> COPILOTKIT_API_KEY = user_specific_key           │
│    └─> [Request processed with user-specific keys]          │
│         ↓                                                    │
│    ├─> [Response generated]                                 │
│    └─> CLEANUP: Remove injected keys from os.environ        │
└─────────────────────────────────────────────────────────────┘
                         ↓
┌─────────────────────────────────────────────────────────────┐
│ 3. Your Endpoint (e.g., /api/blog/generate)                 │
│    └─> Calls service that uses os.getenv('GEMINI_API_KEY')  │
│        └─> Gets user-specific key! ✅                        │
└─────────────────────────────────────────────────────────────┘

💻 Code Example

The Middleware:

import os
from fastapi import Request

# extract_user_from_token, user_api_keys and DEPLOY_ENV are assumed to be
# defined elsewhere in the project

async def __call__(self, request: Request, call_next):
    # 1. Extract user_id from token
    user_id = extract_user_from_token(request)
    
    if not user_id or DEPLOY_ENV == 'local':
        return await call_next(request)  # Skip in local mode or when no user is identified
    
    # 2. Get user-specific keys from database
    with user_api_keys(user_id) as user_keys:
        # 3. Save original environment (None means the variable was not set)
        original_gemini = os.environ.get('GEMINI_API_KEY')
        original_exa = os.environ.get('EXA_API_KEY')
        
        try:
            # 4. Inject user-specific keys (inside try, so a failed injection is still cleaned up)
            os.environ['GEMINI_API_KEY'] = user_keys['gemini']
            os.environ['EXA_API_KEY'] = user_keys['exa']
            
            # 5. Process request with user-specific keys
            return await call_next(request)
        finally:
            # 6. CRITICAL: Restore original environment
            if original_gemini is None:
                os.environ.pop('GEMINI_API_KEY', None)  # pop() tolerates an already-missing key
            else:
                os.environ['GEMINI_API_KEY'] = original_gemini
            
            if original_exa is None:
                os.environ.pop('EXA_API_KEY', None)
            else:
                os.environ['EXA_API_KEY'] = original_exa

📊 Concurrent Requests Example

Scenario: Two Users Generate Content Simultaneously

TIME: 00:00:000
User A request arrives
├─> Extract user_id = "user_a"
├─> Fetch keys from DB: gemini_key = "key_a_123"
├─> os.environ['GEMINI_API_KEY'] = "key_a_123"
│
├─> TIME: 00:00:050 (50ms later)
│   User B request arrives
│   ├─> Extract user_id = "user_b"
│   ├─> Fetch keys from DB: gemini_key = "key_b_456"
│   ├─> os.environ['GEMINI_API_KEY'] = "key_b_456"  ← Overwrites!
│   │
│   ├─> User B's request processes
│   │   os.getenv('GEMINI_API_KEY') → "key_b_456" ✅
│   │
│   └─> TIME: 00:00:100
│       User B response sent
│       os.environ['GEMINI_API_KEY'] restored
│
└─> TIME: 00:00:120
    User A's request processes
    os.getenv('GEMINI_API_KEY') → ??? (Could be wrong!)

⚠️ PROBLEM: Race condition!
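
The scenario above can be reproduced with plain asyncio and no web framework. The sketch below is an illustration under assumptions: DEMO_API_KEY and handle_request are hypothetical names, and asyncio.sleep stands in for the time a real request spends awaiting its endpoint.

```python
import asyncio
import os
from typing import Optional

ENV_NAME = "DEMO_API_KEY"  # hypothetical variable used only for this demo


async def handle_request(user_key: str, start_delay: float, work: float) -> Optional[str]:
    """Inject a key, 'process' the request, and return the key the request saw."""
    await asyncio.sleep(start_delay)          # request arrival time
    original = os.environ.get(ENV_NAME)       # save
    os.environ[ENV_NAME] = user_key           # inject
    try:
        await asyncio.sleep(work)             # other requests run during this await
        return os.environ.get(ENV_NAME)       # what os.getenv() returns mid-request
    finally:
        if original is None:                  # restore
            os.environ.pop(ENV_NAME, None)
        else:
            os.environ[ENV_NAME] = original


async def main():
    os.environ.pop(ENV_NAME, None)
    return await asyncio.gather(
        handle_request("key_a_123", start_delay=0.00, work=0.10),  # User A
        handle_request("key_b_456", start_delay=0.05, work=0.20),  # User B
    )


seen_by_a, seen_by_b = asyncio.run(main())
left_behind = os.environ.get(ENV_NAME)
```

With these timings User A reads User B's key, User B reads nothing because A's cleanup removed the variable, and B's cleanup then puts A's key back into the environment after both requests have finished.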


🔒 Thread Safety Solution

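Python's asyncio in FastAPI only partly protects against this:

# FastAPI runs async endpoints on a single-threaded event loop
# Only one coroutine executes at any instant, but the loop can switch
# to another request at every await
# So the injection is safe only when requests do not overlap

User A request:
  ├─> Inject A's keys
  ├─> await generate_content()  ← Single-threaded, but other requests can run during this await
  └─> Cleanup A's keys

User B request (after A has finished):
  ├─> Inject B's keys
  ├─> await generate_content()
  └─> Cleanup B's keys

BUT: If User B's request arrives while User A's is still awaiting, both share the same os.environ and the race condition above applies. The same is true if your code uses threading or multiprocessing, and FastAPI itself runs plain def endpoints in a thread pool.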
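
If requests really must be handled one at a time while keys sit in os.environ, one option (a swapped-in technique, not something the middleware described here is known to do) is to hold an asyncio.Lock around the whole inject, process, clean-up sequence. The sketch below uses hypothetical names (DEMO_API_KEY, handle_request) and asyncio.sleep in place of real work.

```python
import asyncio
import os
from typing import Optional

ENV_NAME = "DEMO_API_KEY"  # hypothetical variable used only for this demo


async def handle_request(lock: asyncio.Lock, user_key: str,
                         start_delay: float, work: float) -> Optional[str]:
    await asyncio.sleep(start_delay)           # request arrival time
    async with lock:                           # only one request may hold injected keys
        original = os.environ.get(ENV_NAME)
        os.environ[ENV_NAME] = user_key
        try:
            await asyncio.sleep(work)          # stands in for awaiting the endpoint
            return os.environ.get(ENV_NAME)
        finally:
            if original is None:
                os.environ.pop(ENV_NAME, None)
            else:
                os.environ[ENV_NAME] = original


async def main():
    os.environ.pop(ENV_NAME, None)
    lock = asyncio.Lock()                      # created inside the running event loop
    return await asyncio.gather(
        handle_request(lock, "key_a_123", start_delay=0.00, work=0.10),
        handle_request(lock, "key_b_456", start_delay=0.05, work=0.20),
    )


seen_by_a, seen_by_b = asyncio.run(main())
```

The trade-off is throughput: every request that needs injected keys waits for the previous one to finish, and the lock only covers a single process and coroutines on its event loop, not threads.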


🎛️ Modes Compared

Local Mode (DEPLOY_ENV=local):

Request arrives
  ↓
Middleware detects DEPLOY_ENV=local
  ↓
SKIP injection (keys already in .env)
  ↓
os.getenv('GEMINI_API_KEY') → reads from .env file
  ↓
Works! ✅

Production Mode (DEPLOY_ENV=production):

Request arrives with user_id=user_123
  ↓
Middleware detects DEPLOY_ENV=production
  ↓
Fetch user_123's keys from database
  ↓
Inject into os.environ (temporarily)
  ↓
os.getenv('GEMINI_API_KEY') → gets user_123's key
  ↓
Process request
  ↓
Clean up os.environ
  ↓
Works! ✅
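
The decision between the two modes boils down to one check. The helper below is a hypothetical sketch (should_inject is not a name from the codebase), and treating an unset DEPLOY_ENV as "local" is an assumption made for the example.

```python
import os
from typing import Optional


def should_inject(user_id: Optional[str], deploy_env: Optional[str] = None) -> bool:
    """Return True when the middleware should inject per-user keys."""
    # Assumption: an unset DEPLOY_ENV is treated like local mode.
    env = deploy_env if deploy_env is not None else os.getenv("DEPLOY_ENV", "local")
    # Local mode reads keys from .env; unauthenticated requests have no user keys.
    return bool(user_id) and env != "local"


local_result = should_inject("user_123", "local")
production_result = should_inject("user_123", "production")
anonymous_result = should_inject(None, "production")
```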

🚨 Important Caveats

1. Async-Only Safety

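This approach is safe ONLY while requests do not overlap. FastAPI's asyncio event loop is single-threaded, so two handlers never run at the same instant, but the loop still switches between requests at every await.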

If you use:

  • concurrent.futures.ThreadPoolExecutor
  • multiprocessing.Pool
  • threading.Thread

Then environment injection is NOT SAFE and can cause race conditions! The same applies to plain def endpoints, which FastAPI runs in a thread pool.

2. Better Long-Term Approach

For critical services, refactor to pass user_id explicitly:

# Instead of:
def generate(prompt: str):
    key = os.getenv('GEMINI_API_KEY')  # Fragile!
    
# Do this:
def generate(user_id: str, prompt: str):
    with user_api_keys(user_id) as keys:
        key = keys['gemini']  # Explicit and safe!
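
A self-contained version of the explicit approach might look like the sketch below. The in-memory _FAKE_KEY_STORE and this stand-in user_api_keys are assumptions for illustration only; the project's real helper reads from the database, and a real generate would call the provider instead of returning a dict.

```python
from contextlib import contextmanager
from typing import Dict, Iterator

# Hypothetical in-memory stand-in for the table of per-user keys.
_FAKE_KEY_STORE: Dict[str, Dict[str, str]] = {
    "user_a": {"gemini": "key_a_123"},
    "user_b": {"gemini": "key_b_456"},
}


@contextmanager
def user_api_keys(user_id: str) -> Iterator[Dict[str, str]]:
    """Yield a copy of the keys stored for user_id (sketch only)."""
    yield dict(_FAKE_KEY_STORE.get(user_id, {}))


def generate(user_id: str, prompt: str) -> Dict[str, str]:
    """Build the request that would be sent, using the caller's own key."""
    with user_api_keys(user_id) as keys:
        key = keys["gemini"]  # no global state involved
    # A real implementation would call the provider here.
    return {"api_key": key, "prompt": prompt}


request_a = generate("user_a", "Write a blog intro")
request_b = generate("user_b", "Write a blog intro")
```

Because the key travels as a local variable, concurrent requests cannot see each other's keys regardless of how they are scheduled.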

📝 Summary

The Magic:

  1. Request arrives → Middleware extracts user_id
  2. Fetch from DB → Get user's keys
  3. Inject temporarily → os.environ['GEMINI_API_KEY'] = user_key
  4. Process request → All os.getenv() calls get user's key
  5. Cleanup → Remove from os.environ
  6. Next request → Different user, different keys

Why It Works:

  • FastAPI's event loop is single-threaded (requests interleave only at await points)
  • Injection is request-scoped
  • Cleanup is guaranteed (finally block)
  • Existing code works without changes
  • Each user gets their own keys

Limitations:

  • ⚠️ Not safe with threading/multiprocessing, or when concurrent requests overlap at an await
  • ⚠️ Slightly slower (DB query per request)
  • ⚠️ Better to refactor critical services

Bottom Line:

It works! Your existing code that uses os.getenv() will get user-specific keys in production, with zero code changes. The middleware handles everything automatically.


🔄 Migration Path

Phase 1: Now (Compatibility Layer)

  • Middleware injects keys for ALL services
  • No code changes needed
  • Works immediately

Phase 2: Later (Gradual Refactor)

  • Refactor critical services to use UserAPIKeyContext directly
  • Remove dependency on os.getenv()
  • More explicit, safer

Phase 3: Future (Full Migration)

  • All services use user_api_keys(user_id)
  • Remove injection middleware
  • Clean, explicit architecture

For now: Middleware lets you deploy immediately without touching 100+ files! 🎉