Summarize chat trigger (#1890)

> [!NOTE]
> Adds a context-limit banner with one-click “summarize into new chat,” refactors token counting with react-query, and persists per-message max token usage.
>
> - **Chat UX**
>   - **Context limit banner** (`ContextLimitBanner.tsx`, `MessagesList.tsx`): shows when within 40k tokens of `contextWindow`, with tooltip and action to summarize into a new chat.
>   - **Summarize flow**: extracted to `useSummarizeInNewChat` and used in chat input and banner; new summarize system prompt (`summarize_chat_system_prompt.ts`).
> - **Token usage & counting**
>   - **Persist max tokens used per assistant message**: DB migration (`messages.max_tokens_used`), schema updates, and saving usage during streaming (`chat_stream_handlers.ts`).
>   - **Token counting refactor** (`useCountTokens.ts`): react-query with debounce; returns `estimatedTotalTokens` and `actualMaxTokens`; invalidated on model change and stream end; `TokenBar` updated.
>   - **Surfacing usage**: tooltip on latest assistant message shows total tokens (`ChatMessage.tsx`).
> - **Model/config tweaks**
>   - Set `auto` model `contextWindow` to `200_000` (`language_model_constants.ts`).
>   - Improve chat auto-scroll dependency (`ChatPanel.tsx`).
>   - Fix app path validation regex (`app_handlers.ts`).
> - **Testing & dev server**
>   - E2E tests for banner and summarize (`e2e-tests/context_limit_banner.spec.ts` + fixtures/snapshot).
>   - Fake LLM server streams usage to simulate high token scenarios (`testing/fake-llm-server/*`).
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 2ae16a14d50699cc772407426419192c2fdf2ec3. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup>

---

## Summary by cubic

Adds a “Summarize into new chat” trigger and a context limit banner to help keep conversations focused and avoid hitting model limits. Also tracks and surfaces actual token usage per assistant message, with a token counting refactor for reliability.

- **New Features**
  - Summarize into new chat from the input or banner; improved system prompt with clear output format.
  - Context limit banner shows when within 40k tokens of the model’s context window and offers a one-click summarize action.
  - Tooltip on the latest assistant message shows total tokens used.
- **Refactors**
  - Token counting now uses react-query and returns `estimatedTotalTokens` and `actualMaxTokens`; counts are invalidated on model change and when streaming settles.
  - Persist per-message `max_tokens_used` in the messages table; backend aggregates model usage during streaming and saves it.
  - Adjusted default “Auto” model `contextWindow` to 200k for more realistic limits.
  - Improved chat scrolling while streaming; fixed app path validation regex.

<sup>Written for commit 2ae16a14d50699cc772407426419192c2fdf2ec3. Summary will update automatically on new commits.</sup>
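
The banner rule described above ("within 40k tokens of `contextWindow`") can be illustrated with a minimal sketch. This is not the code from `ContextLimitBanner.tsx`: the helper names, the `ContextUsage` shape, and the inclusive boundary are assumptions made for illustration only.

```typescript
// Hypothetical sketch of the banner visibility rule; names are illustrative
// and are not taken from the PR's source files.
const CONTEXT_LIMIT_THRESHOLD = 40_000; // "within 40k tokens", per the description

interface ContextUsage {
  // Most recent known token usage (for example a value like actualMaxTokens),
  // or null when no usage has been recorded yet.
  tokensUsed: number | null;
  // The model's context window, e.g. 200_000 for the "auto" model.
  contextWindow: number;
}

// Tokens left before the context window is full, never negative.
function tokensRemaining(usage: ContextUsage): number | null {
  if (usage.tokensUsed === null) return null;
  return Math.max(0, usage.contextWindow - usage.tokensUsed);
}

// Assumption: the boundary is inclusive (exactly 40k remaining shows the banner).
function shouldShowContextLimitBanner(
  usage: ContextUsage,
  threshold: number = CONTEXT_LIMIT_THRESHOLD,
): boolean {
  const remaining = tokensRemaining(usage);
  return remaining !== null && remaining <= threshold;
}
```

Under these assumptions, a chat that has used 165,000 of 200,000 tokens has 35,000 remaining and would show the banner, while one at 100,000 would not. Returning `false` when usage is unknown keeps the banner hidden for brand-new chats.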
2025-12-04 23:00:28 -08:00
parent 90c5805b57
commit 6235f7bb9d
24 changed files with 1185 additions and 91 deletions
--- a/src/prompts/summarize_chat_system_prompt.ts
+++ b/src/prompts/summarize_chat_system_prompt.ts
@@ -1,8 +1,42 @@
 export const SUMMARIZE_CHAT_SYSTEM_PROMPT = `
-You are a helpful assistant that understands long conversations and can summarize them in a few bullet points.
+You are a helpful assistant that summarizes AI coding chat sessions with a focus on technical changes and file modifications.

-I want you to write down the gist of the conversation in a few bullet points, focusing on the major changes, particularly
-at the end of the conversation.
+Your task is to analyze the conversation and provide:

-Use <dyad-chat-summary> for setting the chat summary (put this at the end). The chat summary should be less than a sentence, but more than a few words. YOU SHOULD ALWAYS INCLUDE EXACTLY ONE CHAT TITLE
+1. **Chat Summary**: A concise summary (less than a sentence, more than a few words) that captures the primary objective or outcome of the session.
+
+2. **Major Changes**: Identify and highlight:
+   - Major code modifications, refactors, or new features implemented
+   - Critical bug fixes or debugging sessions
+   - Architecture or design pattern changes
+   - Important decisions made during the conversation
+
+3. **Relevant Files**: List the most important files discussed or modified, with brief context:
+   - Files that received significant changes
+   - New files created
+   - Files central to the discussion or problem-solving
+   - Format: \`path/to/file.ext - brief description of changes\`
+
+4. **Focus on Recency**: Prioritize changes and discussions from the latter part of the conversation, as these typically represent the final state or most recent decisions.
+
+**Output Format:**
+
+## Major Changes
+- Bullet point of significant change 1
+- Bullet point of significant change 2
+
+## Important Context
+- Any critical decisions, trade-offs, or next steps discussed
+
+## Relevant Files
+- \`file1.ts\` - Description of changes
+- \`file2.py\` - Description of changes
+
+<dyad-chat-summary>
+[Your concise summary here - less than a sentence, more than a few words]
+</dyad-chat-summary>
+
+**Reminder:**
+
+YOU MUST ALWAYS INCLUDE EXACTLY ONE <dyad-chat-summary> TAG AT THE END.
 `;
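
The new prompt requires exactly one `<dyad-chat-summary>` tag at the end of the response. As a rough sketch of how a consumer might enforce that contract, the hypothetical helper below extracts the summary and rejects output with zero or multiple tags. The function name and the decision to return `null` on violations are assumptions, not part of this commit.

```typescript
// Hypothetical helper (not from the PR): pulls the chat summary out of model
// output that follows the format requested by SUMMARIZE_CHAT_SYSTEM_PROMPT.
function extractChatSummary(output: string): string | null {
  // Non-greedy match so that two separate tags are counted as two matches.
  const tagPattern = /<dyad-chat-summary>([\s\S]*?)<\/dyad-chat-summary>/g;
  const summaries: string[] = [];

  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(output)) !== null) {
    summaries.push(match[1].trim());
  }

  // The prompt asks for exactly one tag; treat anything else as invalid.
  if (summaries.length !== 1) return null;

  const summary = summaries[0];
  return summary.length > 0 ? summary : null;
}

// Example input shaped like the prompt's "Output Format" section.
const sampleOutput = [
  "## Major Changes",
  "- Added a context limit banner",
  "",
  "<dyad-chat-summary>",
  "Add context limit banner and summarize flow",
  "</dyad-chat-summary>",
].join("\n");

const sampleSummary = extractChatSummary(sampleOutput);
```

Returning `null` rather than the first match is one possible design choice: it surfaces prompt violations to the caller instead of silently picking a title.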