15 Commits

Author SHA1 Message Date
Will Chen
6235f7bb9d Summarize chat trigger (#1890)
> [!NOTE]
> Adds a context-limit banner with one-click “summarize into new chat,” refactors token counting with react-query, and persists per-message max token usage.
>
> - **Chat UX**
>   - **Context limit banner** (`ContextLimitBanner.tsx`, `MessagesList.tsx`): shows when within 40k tokens of `contextWindow`, with a tooltip and an action to summarize into a new chat.
>   - **Summarize flow**: extracted to `useSummarizeInNewChat` and used in the chat input and the banner; new summarize system prompt (`summarize_chat_system_prompt.ts`).
> - **Token usage & counting**
>   - **Persist max tokens used per assistant message**: DB migration (`messages.max_tokens_used`), schema updates, and saving usage during streaming (`chat_stream_handlers.ts`).
>   - **Token counting refactor** (`useCountTokens.ts`): react-query with debounce; returns `estimatedTotalTokens` and `actualMaxTokens`; invalidated on model change and stream end; `TokenBar` updated.
>   - **Surfacing usage**: a tooltip on the latest assistant message shows total tokens (`ChatMessage.tsx`).
> - **Model/config tweaks**
>   - Set the `auto` model's `contextWindow` to `200_000` (`language_model_constants.ts`).
>   - Improve the chat auto-scroll dependency (`ChatPanel.tsx`).
>   - Fix the app path validation regex (`app_handlers.ts`).
> - **Testing & dev server**
>   - E2E tests for the banner and summarize flow (`e2e-tests/context_limit_banner.spec.ts` plus fixtures/snapshot).
>   - Fake LLM server streams usage to simulate high-token scenarios (`testing/fake-llm-server/*`); see the usage-chunk sketch after this note.
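
As a rough illustration (not the fake server's actual code), an OpenAI-style streaming chunk that reports inflated usage might look like this:

```ts
// Hypothetical chunk shape for a fake LLM server: empty choices plus an
// oversized usage block, so e2e tests can push a chat near the context window.
const usageChunk = {
  id: "chatcmpl-fake",
  object: "chat.completion.chunk",
  choices: [],
  usage: {
    prompt_tokens: 190_000,
    completion_tokens: 500,
    total_tokens: 190_500,
  },
};
```
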
---
## Summary by cubic
Adds a “Summarize into new chat” trigger and a context limit banner to
help keep conversations focused and avoid hitting model limits. Also
tracks and surfaces actual token usage per assistant message, with a
token counting refactor for reliability.

- **New Features**
  - Summarize into new chat from the input or the banner; improved system prompt with a clear output format.
  - Context limit banner shows when within 40k tokens of the model’s context window and offers a one-click summarize action (see the sketch after this list).
  - Tooltip on the latest assistant message shows total tokens used.
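
A minimal sketch of the banner's trigger condition, assuming the token estimate comes from the counting hook; names are illustrative, not Dyad's actual code:

```ts
// Hypothetical check: show the banner once the estimated conversation size is
// within 40k tokens of the model's context window.
const CONTEXT_LIMIT_BUFFER = 40_000;

export function shouldShowContextLimitBanner(
  estimatedTotalTokens: number,
  contextWindow: number,
): boolean {
  return contextWindow - estimatedTotalTokens <= CONTEXT_LIMIT_BUFFER;
}
```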

- **Refactors**
  - Token counting now uses react-query and returns `estimatedTotalTokens` and `actualMaxTokens`; counts are invalidated on model change and when streaming settles (see the hook sketch below).
  - Persist per-message `max_tokens_used` in the messages table; the backend aggregates model usage during streaming and saves it (see the schema sketch below).
  - Adjusted the default “Auto” model `contextWindow` to 200k for more realistic limits.
  - Improved chat scrolling while streaming; fixed the app path validation regex.
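
A hedged sketch of debounced token counting with react-query; `countTokens` is a hypothetical stand-in for the real counting call, and this is not the actual `useCountTokens.ts`:

```ts
import { useEffect, useState } from "react";
import { useQuery } from "@tanstack/react-query";

// Hypothetical counting call (e.g. an IPC round trip to a tokenizer).
declare function countTokens(input: string): Promise<number>;

function useDebounced<T>(value: T, delayMs: number): T {
  const [debounced, setDebounced] = useState(value);
  useEffect(() => {
    const id = setTimeout(() => setDebounced(value), delayMs);
    return () => clearTimeout(id);
  }, [value, delayMs]);
  return debounced;
}

export function useCountTokensSketch(input: string, modelId: string) {
  const debouncedInput = useDebounced(input, 300);
  // Keying on modelId re-runs the count on model change; callers can also
  // invalidate the "count-tokens" key when a stream settles.
  const { data } = useQuery({
    queryKey: ["count-tokens", modelId, debouncedInput],
    queryFn: () => countTokens(debouncedInput),
  });
  return { estimatedTotalTokens: data ?? 0 };
}
```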

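A hedged sketch of the new column, assuming a drizzle-orm SQLite schema; the real schema and migration may differ:

```ts
import { integer, sqliteTable, text } from "drizzle-orm/sqlite-core";

export const messages = sqliteTable("messages", {
  id: integer("id").primaryKey(),
  role: text("role").notNull(),
  content: text("content").notNull(),
  // Max tokens observed for this assistant message, saved while streaming.
  maxTokensUsed: integer("max_tokens_used"),
});
```
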
2025-12-04 23:00:28 -08:00
Tanner-Maasen
2ffbbbca8f Add Azure OpenAI Custom Model Integration (#1001)
Fixes #710 

This PR implements Azure OpenAI integration for Dyad, enabling users to use Azure OpenAI models through environment variable configuration. It adds Azure as a supported provider, fully integrated into the existing language model architecture, including support for GPT-5 models. Key features include environment-based configuration using `AZURE_API_KEY` and `AZURE_RESOURCE_NAME`, specialized UI components that provide clear setup instructions and status indicators, and seamless integration with Dyad's existing provider system. The Azure provider uses the @ai-sdk/azure package (v1.3.25) for compatibility with the current TypeScript language model interfaces.

The implementation includes robust error handling for missing configuration, comprehensive test coverage with 9 new unit tests covering critical paths such as model client creation and error scenarios, and an E2E test for the Azure-specific settings UI.
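
A minimal sketch of environment-based client creation using `createAzure` from @ai-sdk/azure; the helper name and error handling are illustrative, not Dyad's actual code:

```ts
import { createAzure } from "@ai-sdk/azure";

// Hypothetical helper: build a model client from the env vars this PR uses.
function createAzureModelClient(deploymentName: string) {
  const apiKey = process.env.AZURE_API_KEY;
  const resourceName = process.env.AZURE_RESOURCE_NAME;
  if (!apiKey || !resourceName) {
    throw new Error(
      "Azure OpenAI requires AZURE_API_KEY and AZURE_RESOURCE_NAME to be set.",
    );
  }
  const azure = createAzure({ resourceName, apiKey });
  // The Azure deployment name (e.g. a GPT-5 deployment) selects the model.
  return azure(deploymentName);
}
```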

<img width="1510" height="908" alt="Screenshot 2025-08-18 at 9 14 32 PM"
src="https://github.com/user-attachments/assets/04aa99e1-1590-4bb0-86c9-a67b97bc7500"
/>

---------

Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com>
Co-authored-by: Will Chen <willchen90@gmail.com>
2025-08-30 20:47:25 -07:00
Will Chen
d535db6251 Upgrade to AI sdk with codemod (#1000) 2025-08-18 22:21:27 -07:00
Will Chen
bd809a010d GitHub workflows (#428)
Fixes #348 
Fixes #274 
Fixes #149 

- Connect to existing repos
- Push to other branches on GitHub besides main
- Allow force push, with a confirmation dialog

---------

Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com>
2025-06-17 16:59:26 -07:00
Will Chen
30b5c0d0ef Replace thinking with native Gemini thinking summaries (#400)
This uses Gemini's native [thinking
summaries](https://cloud.google.com/vertex-ai/generative-ai/docs/thinking#thought-summaries)
which were recently added to the API.

Why? The grafted thinking would sometimes cause weird issues where the
model, especially Gemini 2.5 Flash, got confused and put dyad tags like
`<dyad-write>` inside the `<think>` tags.

This also improves the UX because you can see the native thoughts rather
than having the Gemini response load for a while without any feedback.

I also tried adding Anthropic extended thinking, but it requires temperature to be set to 1, which isn't ideal for Dyad's use case, where we need precise syntax following.
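
For illustration, a hedged sketch of requesting native thought summaries through the AI SDK's Google provider; option names and stream part shapes depend on the SDK version, and this is not the PR's actual code:

```ts
import { google } from "@ai-sdk/google";
import { streamText } from "ai";

const result = streamText({
  model: google("gemini-2.5-flash"),
  prompt: "Build a todo app...",
  providerOptions: {
    // Assumed option shape: ask Gemini to include summarized thoughts.
    google: { thinkingConfig: { includeThoughts: true } },
  },
});

// Thought summaries arrive as separate stream parts, so the UI can show
// reasoning progress instead of an empty loading state.
for await (const part of result.fullStream) {
  if (part.type.startsWith("reasoning")) {
    console.log("thought part:", part);
  }
}
```
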
2025-06-16 17:29:32 -07:00
Will Chen
c227a08d11 Gateway e2e (#323) 2025-06-03 16:35:46 -07:00
Will Chen
fc1ebe9e8a e2e tests for engine (#322) 2025-06-03 16:11:16 -07:00
Will Chen
c0adf8d3f2 Attach image e2e tests (#301) 2025-06-01 00:44:19 -07:00
Will Chen
8a743ca4f5 LM studio e2e test (#297) 2025-05-31 23:04:28 -07:00
Will Chen
af7d6fa9f8 Create ollama e2e test (#296) 2025-05-31 22:01:48 -07:00
Will Chen
efb814ec95 Create tests: dumps message, "retry" (#281) 2025-05-31 21:15:41 -07:00
Will Chen
647fd0169e make it easy to write multiple e2e tests (#280) 2025-05-29 00:03:51 -07:00
Will Chen
509e044137 Boilerplate free tests (#277) 2025-05-28 22:55:54 -07:00
Will Chen
f4c7d614bd Escape dyad tags inside thinking blocks (#229) 2025-05-22 16:06:28 -07:00
Will Chen
069c221292 Implement saver mode (#154) 2025-05-13 15:34:41 -07:00