ajaysi 01881bb405 "feat:enhance-podcast-topic-ai"

2026-03-11 19:09:27 +05:30

title	updated
SIF and AI Tools model LLM choices	2026-03-11

SIF and AI Tools model LLM choices

This document captures the intended LLM/provider split between:

Premium AI tools (podcast, story writer, blog writer, etc.)
SIF / agents (local-first intelligence workflows)

It also records recent fixes, root causes, and consolidation next steps.

1) Design Intent (Target Behavior)

A) Premium AI Tools

Use the remote premium API path by default.

Primary provider route: Hugging Face router
Preferred premium model: openai/gpt-oss-120b:groq
GPT_PROVIDER values that should map to this premium remote text route:
- huggingface
- hf
- hf_response_api
- wavespeed (alias mapping for premium remote route)

Fallback policy for premium tools:

Keep fallback minimal and explicit.
Do not accidentally inherit SIF low-cost fallback chains.
If the provider is explicitly pinned per call (preferred_provider), avoid cross-provider switching; it produces noisy retries and wastes cost and time.
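
The alias mapping and pinning rule above can be sketched as follows. This is a minimal illustration, not the project's code: `resolve_premium_route`, `PREMIUM_ALIASES`, and the returned tuple shape are hypothetical names, and letting `preferred_provider` take precedence over `GPT_PROVIDER` is an assumption.

```python
from typing import Optional, Tuple

# Hypothetical constants for illustration; values are taken from this document.
PREMIUM_PROVIDER = "huggingface"
PREMIUM_MODEL = "openai/gpt-oss-120b:groq"
PREMIUM_ALIASES = frozenset({"huggingface", "hf", "hf_response_api", "wavespeed"})


def resolve_premium_route(
    gpt_provider: Optional[str],
    preferred_provider: Optional[str] = None,
) -> Tuple[str, str, bool]:
    """Return (provider, model, allow_cross_provider_switch) for a premium call."""
    # Assumption: a per-call pin wins over the GPT_PROVIDER environment value.
    raw = preferred_provider if preferred_provider is not None else gpt_provider
    name = (raw or "").strip().lower()
    if name not in PREMIUM_ALIASES:
        raise ValueError(f"{raw!r} does not map to the premium remote text route")
    # A pinned provider disables cross-provider switching.
    allow_switch = preferred_provider is None
    return PREMIUM_PROVIDER, PREMIUM_MODEL, allow_switch
```

Failing loudly on an unknown value, rather than silently falling through to another chain, is one way to keep the premium fallback "minimal and explicit".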

B) SIF / Agents

Use a local-first strategy.

Primary: local models (where SIF pipeline supports them)
Fallback: smaller remote models (HF + environment-guided provider logic)
Explicit low-cost model lists should be passed by SIF wrappers (e.g., preferred_hf_models) to keep these flows distinct from premium tools.
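
A rough sketch of the local-first strategy, assuming the SIF wrapper receives two callables: one for the local model and one for a remote HF call. `sif_generate`, `local_generate`, and `remote_generate` are hypothetical names; only `preferred_hf_models` comes from this document.

```python
from typing import Callable, List, Optional, Sequence


def sif_generate(
    prompt: str,
    local_generate: Optional[Callable[[str], str]],
    remote_generate: Callable[[str, str], str],
    preferred_hf_models: Sequence[str],
) -> str:
    """Try the local model first, then walk the explicit low-cost list in order."""
    errors: List[str] = []
    if local_generate is not None:
        try:
            return local_generate(prompt)
        except Exception as exc:  # local pipeline unavailable or failed
            errors.append(f"local: {exc}")
    # Only models the SIF wrapper passed explicitly are tried; nothing is injected.
    for model in preferred_hf_models:
        try:
            return remote_generate(prompt, model)
        except Exception as exc:
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all SIF routes failed: " + "; ".join(errors))
```

Because the remote list is an argument rather than a module-level default, a premium caller cannot pick it up by accident.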

2) Current Routing Contract in `llm_text_gen`

llm_text_gen(...) now supports explicit context signals:

preferred_provider: pin provider intent for tool-specific flows
preferred_hf_models: low-cost model list for SIF/agent fallback usage
flow_type: diagnostic tag (premium_tool vs sif_agent)

Flow separation rule

If preferred_hf_models is used (SIF path), that list drives HF model selection/fallback.
Premium tool calls should not pass SIF low-cost lists.
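
The separation rule can be expressed as a small guard. The helper below is a hypothetical sketch, not part of `llm_text_gen`: it assumes the function can see `flow_type` and `preferred_hf_models` together, and rejecting a premium call that carries a SIF list is one possible enforcement choice.

```python
from typing import List, Optional, Sequence


def hf_model_candidates(
    flow_type: str,
    default_model: str,
    preferred_hf_models: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return the ordered HF models to try for one call."""
    if preferred_hf_models:
        if flow_type == "premium_tool":
            # Premium tool calls should not pass SIF low-cost lists.
            raise ValueError("premium_tool call received preferred_hf_models")
        # SIF path: the caller's list drives selection and fallback.
        return list(preferred_hf_models)
    # No list passed: only the default model for this flow is tried.
    return [default_model]
```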

Diagnostics

Logs include:

[llm_text_gen][flow_type=premium_tool] ...
[llm_text_gen][flow_type=sif_agent] ...

This makes mixed routing issues visible immediately.
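
For illustration, a helper that emits the tagged line might look like this. The line format follows the example in the validation checklist of this document; the function name `log_route` and the use of the standard `logging` module are assumptions.

```python
import logging

logger = logging.getLogger("llm_text_gen")


def log_route(flow_type: str, provider: str, model: str) -> str:
    """Log and return one routing line tagged with its flow type."""
    line = (
        f"[llm_text_gen][flow_type={flow_type}] "
        f"Using provider={provider}, model={model}"
    )
    logger.info(line)
    return line
```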

3) Key Issues Found and Fixes Applied

Issue A: Premium/SIF behavior got mixed

Symptoms:

premium calls iterating through low-cost fallback chains
noisy model-not-found logs
wasted latency and confusion over routing

Fix:

made fallback model chain caller-controlled
kept SIF-specific fallback models passed only from SIF wrappers
kept premium calls separate and explicitly tagged

Issue B: Podcast bible generation error (`NoneType` callable)

Symptoms:

services.podcast_bible_service:generate_bible -> 'NoneType' object is not callable

Root cause:

unhandled edge cases in personalization session acquisition and payload handling (the error means a value that was None was called as a function)

Fix:

safe DB session retrieval via user-scoped session function
non-dict guardrails for integrated payload/canonical profile
fallback to defaults instead of crashing
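
The guardrails described above could look roughly like the sketch below. All names here are hypothetical (`open_session`, `canonical_profile`, the `"canonical_profile"` key, and the placeholder defaults); the actual service code is not shown in this document.

```python
from typing import Any, Callable, Dict, Optional

# Placeholder defaults; the real default fields are not specified in this document.
DEFAULT_PROFILE: Dict[str, Any] = {"tone": "neutral"}


def open_session(get_session_for_user: Optional[Callable[[str], Any]], user_id: str) -> Any:
    """Return a user-scoped session, or None when no usable factory exists."""
    if not callable(get_session_for_user):
        # Calling None here is what raises "'NoneType' object is not callable".
        return None
    return get_session_for_user(user_id)


def canonical_profile(payload: Any) -> Dict[str, Any]:
    """Extract the profile from an integrated payload, falling back to defaults."""
    if not isinstance(payload, dict):
        return dict(DEFAULT_PROFILE)
    profile = payload.get("canonical_profile")  # hypothetical key name
    if not isinstance(profile, dict):
        return dict(DEFAULT_PROFILE)
    # Valid profile values override the defaults.
    return {**DEFAULT_PROFILE, **profile}
```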

Issue C: Premium default model drift

Symptoms:

the premium default shifted to a smaller model in recent patches

Fix:

restored premium default model to:
- openai/gpt-oss-120b:groq
kept wavespeed env alias mapped to premium remote text route logic

4) Provider Notes

Hugging Face provider

Accepts explicit fallback_models list.
If fallback_models=[], no broad fallback chain is injected beyond direct model variant handling.
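
One way to read this behavior is sketched below. It is an assumption, not the provider's actual code: "direct model variant handling" is interpreted here as also trying the model id without its `:provider` suffix, and `attempt_order` is a hypothetical helper name.

```python
from typing import List, Sequence


def attempt_order(model: str, fallback_models: Sequence[str]) -> List[str]:
    """Return the models to try: the model, its suffix-less variant, then fallbacks."""
    order = [model]
    # Assumed variant handling: "org/name:provider" also tries "org/name".
    base, sep, _suffix = model.partition(":")
    if sep and base not in order:
        order.append(base)
    # Only caller-supplied fallbacks are appended; an empty list adds nothing.
    for candidate in fallback_models:
        if candidate not in order:
            order.append(candidate)
    return order
```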

Wavespeed

Wavespeed services exist in the codebase and are used for dedicated workloads.
In the text routing context (llm_text_gen), GPT_PROVIDER=wavespeed is treated as an alias for the premium remote text route (HF provider path). This preserves current behavior without introducing a second text-provider implementation in this function.

5) Operational Validation Checklist

When testing /api/podcast/idea/enhance:

Verify the request log and auth token attachment in the frontend.
Verify backend log shows:
- [llm_text_gen][flow_type=premium_tool] Using provider=huggingface, model=openai/gpt-oss-120b:groq
Verify no SIF-specific low-cost model list is being used in this flow.
Verify no repeated broad fallback cascades unless explicitly configured.
Verify podcast bible generation does not crash and gracefully falls back to defaults if onboarding payload is malformed.
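
The log check in this list can be partly automated. The sketch below assumes the log line format shown above and that the caller passes only the lines belonging to the request under test; `check_premium_log` is a hypothetical helper.

```python
import re
from typing import Iterable, List

ROUTE_RE = re.compile(
    r"\[llm_text_gen\]\[flow_type=(?P<flow>\w+)\] "
    r"Using provider=(?P<provider>[^,\s]+), model=(?P<model>\S+)"
)


def check_premium_log(
    lines: Iterable[str],
    expected_provider: str = "huggingface",
    expected_model: str = "openai/gpt-oss-120b:groq",
) -> List[str]:
    """Return a list of problems found in the routing lines (empty means OK)."""
    problems: List[str] = []
    seen = False
    for line in lines:
        match = ROUTE_RE.search(line)
        if match is None:
            continue
        seen = True
        if match["flow"] != "premium_tool":
            problems.append(f"unexpected flow_type: {match['flow']}")
        if match["provider"] != expected_provider:
            problems.append(f"unexpected provider: {match['provider']}")
        if match["model"] != expected_model:
            problems.append(f"unexpected model: {match['model']}")
    if not seen:
        problems.append("no llm_text_gen routing line found")
    return problems
```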

6) Consolidation Next Steps

Centralize routing policy constants
- define premium defaults and SIF defaults in one module
- avoid drift from scattered hardcoded model strings
Add explicit route_intent enum (optional)
- premium_tool, sif_local_first, sif_remote_fallback
- reduce ambiguity vs inferred behavior
Add unit tests for routing matrix
- test combinations of:
  - GPT_PROVIDER
  - preferred_provider
  - preferred_hf_models
  - key presence/absence
Add structured log fields
- route_intent, provider_selected, model_selected, fallback_count
- makes production root-cause analysis (RCA) easier
Document model availability assumptions
- account-level HF router model availability differs across keys/orgs
- include fallback policy per environment (dev/staging/prod)
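
As a starting point for these steps, the enum, the structured log record, and the test matrix could be sketched as below. None of this exists yet according to this document; the class and function names are hypothetical, and only the enum values and log field names are taken from the list above.

```python
import itertools
import json
from enum import Enum
from typing import List, Optional, Sequence, Tuple


class RouteIntent(str, Enum):
    PREMIUM_TOOL = "premium_tool"
    SIF_LOCAL_FIRST = "sif_local_first"
    SIF_REMOTE_FALLBACK = "sif_remote_fallback"


def route_log_record(
    route_intent: RouteIntent,
    provider_selected: str,
    model_selected: str,
    fallback_count: int,
) -> str:
    """Serialize the proposed structured log fields as one JSON line."""
    return json.dumps(
        {
            "route_intent": route_intent.value,
            "provider_selected": provider_selected,
            "model_selected": model_selected,
            "fallback_count": fallback_count,
        },
        sort_keys=True,
    )


def routing_matrix(
    gpt_providers: Sequence[Optional[str]],
    preferred_providers: Sequence[Optional[str]],
    hf_model_lists: Sequence[Optional[Tuple[str, ...]]],
    key_present: Sequence[bool] = (True, False),
) -> List[tuple]:
    """Enumerate every input combination the routing unit tests should cover."""
    return list(
        itertools.product(gpt_providers, preferred_providers, hf_model_lists, key_present)
    )
```

Each tuple from `routing_matrix` can then feed a parametrized test that asserts the expected provider, model, and fallback behavior for that combination.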

7) Practical Rule of Thumb

If the caller is a premium AI tool: call with premium provider intent and do not pass the SIF low-cost list.
If the caller is SIF/agent: go local-first, then explicitly pass the low-cost remote fallback list.
Keep these paths separate in code and logs.
