This uses Gemini's native [thinking
summaries](https://cloud.google.com/vertex-ai/generative-ai/docs/thinking#thought-summaries)
which were recently added to the API.
Why? The grafted thinking sometimes confused the model, especially Gemini
2.5 Flash, into putting Dyad tags like `<dyad-write>` inside the `<think>`
tags.
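
To make the failure mode above concrete, here is a minimal sketch of how leaked tags could be detected in a raw response. The helper name `findDyadTagsInsideThink` is hypothetical, and treating every tag that starts with `dyad-` as a Dyad tag is an assumption; this is not the parser Dyad actually uses.

```typescript
// Hypothetical helper (not part of Dyad): lists Dyad tags that appear
// inside <think>...</think> blocks of a model response.
// Assumption: Dyad tag names all start with "dyad-".
function findDyadTagsInsideThink(response: string): string[] {
  const leaked: string[] = [];
  // Non-greedy match of each <think> block, including newlines.
  const thinkBlock = /<think>([\s\S]*?)<\/think>/g;
  let block: RegExpExecArray | null;
  while ((block = thinkBlock.exec(response)) !== null) {
    // Opening tags only: closing tags start with "</" and do not match.
    const dyadTag = /<(dyad-[a-z-]+)/g;
    let tag: RegExpExecArray | null;
    while ((tag = dyadTag.exec(block[1])) !== null) {
      leaked.push(tag[1]);
    }
  }
  return leaked;
}
```

A non-empty result means the model mixed actionable tags into its reasoning, which is the confusion native thinking is meant to avoid.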
This also improves the UX because you can see the native thoughts rather
than having the Gemini response load for a while without any feedback.
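
As a rough illustration of what native thought summaries look like at the API level: the Gemini API accepts `thinkingConfig.includeThoughts` in the generation config and marks thought-summary parts in the response with `thought: true`. The sketch below separates those parts from the answer text. The `GeminiPart` interface and the `splitThoughts` helper are hypothetical names for illustration, and how Dyad wires this up (for example through an SDK) is not shown here.

```typescript
// Minimal shape of a Gemini response part; only the fields used here.
interface GeminiPart {
  text?: string;
  thought?: boolean; // true when the part is a thought summary
}

// Request fragment asking Gemini to return thought summaries.
const generationConfig = {
  thinkingConfig: { includeThoughts: true },
};

// Hypothetical helper: separates thought summaries from answer text so the
// UI can render thoughts as soon as they arrive.
function splitThoughts(parts: GeminiPart[]): { thoughts: string; answer: string } {
  let thoughts = "";
  let answer = "";
  for (const part of parts) {
    if (!part.text) continue; // skip parts without text
    if (part.thought) {
      thoughts += part.text;
    } else {
      answer += part.text;
    }
  }
  return { thoughts, answer };
}
```

Because thoughts arrive as separate parts rather than inline `<think>` markup, the answer text can be parsed for Dyad tags without first stripping the reasoning.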
I tried adding Anthropic extended thinking; however, it requires the
temperature to be set to 1, which isn't ideal for Dyad's use case, where we
need precise syntax following.
- [x] add e2e test - happy case (make sure it clears selection and next
prompt is empty, and preview is cleared); de-selection case
- [x] shim - old & new file
- [x] upgrade path
- [x] add docs
- [x] add try-catch to parser script
- [x] make it work for next.js
- [x] extract npm package
- [x] make sure plugin doesn't apply in prod

Things to test:
- [x] allow real URL to open in new window
- [x] packaging in electron?
- [ ] does it work on windows?
- [x] make sure it works with older apps
- [x] what about cache / reuse? - maybe use a bigger range of ports??