Prompt Quality Issue Analysis
Date: 2025-01-29
Issue: Quality degradation after prompt builder changes
Status: Investigating
🔍 Problem Statement
The user reports that, after changes to unified_prompt_builder.py, the quality of the AI-generated research intent and Exa/Tavily options has degraded significantly. Results that were previously strong are now poor.
📊 Current Prompt Analysis
Prompt Length & Complexity
Current Unified Prompt: ~500 lines
- Very detailed instructions
- Multiple "CRITICAL" sections
- Extensive provider options documentation
- Complex query linking rules
- Detailed optimization rules
Potential Issues:
- Prompt Too Long: ~500 lines may be overwhelming the LLM
- Too Many Constraints: Multiple "CRITICAL" sections may conflict
- Over-Prescriptive: Too many rules may confuse rather than guide
- Information Overload: Provider options table is very detailed
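To put numbers on the length and "CRITICAL" concerns above, a small measurement helper can be run against both the original and the current prompt text. This is a minimal sketch: `PromptStats` and `measure_prompt` are hypothetical names introduced here for illustration and are not part of unified_prompt_builder.py.

```python
import re
from dataclasses import dataclass


@dataclass
class PromptStats:
    """Rough size and emphasis metrics for a prompt (hypothetical helper type)."""
    line_count: int
    word_count: int
    critical_count: int


def measure_prompt(prompt: str) -> PromptStats:
    """Count lines, whitespace-separated words, and uses of the word CRITICAL."""
    return PromptStats(
        line_count=len(prompt.splitlines()),
        word_count=len(prompt.split()),
        # Whole-word, case-sensitive match, so "CRITICAL:" counts but "critical" does not.
        critical_count=len(re.findall(r"\bCRITICAL\b", prompt)),
    )
```

Running it on the output of both prompt builders would show whether the current prompt really is about 2.5 times longer and how many competing "CRITICAL" markers it carries.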
🔄 What Changed Recently
Based on conversation history, recent changes include:
- Added keyword emphasis: "MUST include user's actual keywords"
- Removed confidence optimization: reverted the confidence instructions
- Added query linking rules: explicit linking to intent fields
- Enhanced provider optimization: more detailed rules
🎯 Key Differences: Original vs Current
Original Intent Prompt (Simple, Working)
- ~200 lines
- Clear, focused instructions
- Simple confidence scoring
- Straightforward query generation
- Basic provider selection
Current Unified Prompt (Complex, Degraded)
- ~500 lines
- Multiple "CRITICAL" sections
- Complex query linking
- Extensive provider documentation
- Detailed optimization rules
💡 Hypothesis
The prompt may be too complex, causing the LLM to:
- Get confused by conflicting instructions
- Focus on the wrong aspects (too many rules)
- Produce lower-quality output because of information overload
- Miss the core task (intent inference) because of the added complexity
🔧 Recommended Fixes
Option 1: Simplify the Prompt (Recommended)
- Reduce prompt length by about 50% (from ~500 to ~250 lines)
- Remove redundant instructions
- Simplify provider documentation
- Focus on core task: intent inference + query generation
Option 2: Split Back to Separate Calls
- Use original intent_prompt_builder.py for intent
- Use separate query generation
- Use separate parameter optimization
- Trade-off: More LLM calls but better quality
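The split approach in Option 2 could look roughly like the sketch below: three smaller, focused calls, each fed the result of the previous one. Everything here is an assumption for illustration. `run_split_pipeline` and the `call_llm` callable are hypothetical, and the one-line prompts stand in for whatever intent_prompt_builder.py and the other builders actually produce.

```python
from typing import Callable, Dict

# Hypothetical signature: takes a prompt string, returns the model's text response.
LLMCall = Callable[[str], str]


def run_split_pipeline(user_input: str, call_llm: LLMCall) -> Dict[str, str]:
    """Run intent inference, query generation, and parameter optimization as separate calls."""
    # Step 1: intent inference only, so the model is not distracted by provider rules.
    intent = call_llm(f"Infer the research intent for: {user_input}")
    # Step 2: query generation, grounded in the inferred intent.
    queries = call_llm(f"Generate search queries for this intent:\n{intent}")
    # Step 3: provider parameter optimization, grounded in the generated queries.
    params = call_llm(f"Suggest provider parameters for these queries:\n{queries}")
    return {"intent": intent, "queries": queries, "provider_params": params}
```

Passing the LLM client in as a callable keeps the sketch testable with a stub, and it makes the trade-off visible: three calls per request instead of one.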
Option 3: Hybrid Approach
- Keep unified call but simplify prompt
- Remove detailed provider documentation (reference only)
- Focus on clear, concise instructions
- Let LLM infer more, prescribe less
📝 Next Steps
- Review original working prompt structure
- Identify what made it work well
- Simplify current prompt while keeping essential features
- Test with same inputs that previously worked
- Compare quality before/after
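The last two steps above can be sketched as a small before/after harness that runs the same inputs through both prompt builders and flags regressions. All names here are hypothetical (`compare_prompts`, `regressions`, `keyword_coverage`), and the keyword-coverage score is only a crude stand-in for a real quality measure such as human review.

```python
from typing import Callable, Iterable, List, Tuple

Row = Tuple[str, float, float]  # (input text, old score, new score)


def keyword_coverage(user_input: str, output: str) -> float:
    """Fraction of the user's words that appear in the output (crude placeholder metric)."""
    words = {w.lower() for w in user_input.split()}
    if not words:
        return 0.0
    lowered = output.lower()
    return sum(w in lowered for w in words) / len(words)


def compare_prompts(
    inputs: Iterable[str],
    build_old: Callable[[str], str],
    build_new: Callable[[str], str],
    generate: Callable[[str], str],
    score: Callable[[str, str], float] = keyword_coverage,
) -> List[Row]:
    """Score the old and new prompt on each input using the same generate function."""
    rows: List[Row] = []
    for text in inputs:
        old_score = score(text, generate(build_old(text)))
        new_score = score(text, generate(build_new(text)))
        rows.append((text, old_score, new_score))
    return rows


def regressions(rows: List[Row]) -> List[str]:
    """Return the inputs where the new prompt scored lower than the old one."""
    return [text for text, old, new in rows if new < old]
```

Using the inputs that previously produced good results as the fixed test set keeps the comparison focused on the prompt change rather than on input variation.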
Status: Ready for prompt simplification