This uses Gemini's native [thinking
summaries](https://cloud.google.com/vertex-ai/generative-ai/docs/thinking#thought-summaries)
which were recently added to the API.
Why? The grafted thinking would sometimes cause weird issues where the
model, especially Gemini 2.5 Flash, got confused and put dyad tags like
`<dyad-write>` inside the `<think>` tags.
This also improves the UX because you can see the native thoughts rather
than having the Gemini response load for a while without any feedback.
I tried adding Anthropic extended thinking, however it requires temp to
be set at 1, which isn't ideal for Dyad's use case where we need precise
syntax following.
- [x] add e2e test - happy case (make sure it clears selection and next
prompt is empty, and preview is cleared); de-selection case
- [x] shim - old & new file
- [x] upgrade path
- [x] add docs
- [x] add try-catch to parser script
- [x] make it work for next.js
- [x] extract npm package
- [x] make sure plugin doesn't apply in prod