<!-- This is an auto-generated description by cubic. -->
## Summary by cubic
Parameterized the system prompt and tokenized it in e2e dumps to make
snapshots smaller and stable. No runtime behavior changes; future prompt
edits won’t churn tests.
- **Refactors**
- Exported BUILD_SYSTEM_PREFIX and BUILD_SYSTEM_POSTFIX from
system_prompt.ts.
- Updated test_helper to replace the full prompt with
${BUILD_SYSTEM_PREFIX}/${BUILD_SYSTEM_POSTFIX} tokens in message dumps.
- Regenerated e2e snapshots to use tokens, reducing ~270 lines per
snapshot.
<!-- End of auto-generated description by cubic. -->
This uses Gemini's native [thinking
summaries](https://cloud.google.com/vertex-ai/generative-ai/docs/thinking#thought-summaries)
which were recently added to the API.
Why? The grafted thinking would sometimes cause weird issues where the
model, especially Gemini 2.5 Flash, got confused and put dyad tags like
`<dyad-write>` inside the `<think>` tags.
This also improves the UX because you can see the native thoughts rather
than having the Gemini response load for a while without any feedback.
I tried adding Anthropic extended thinking, however it requires temp to
be set at 1, which isn't ideal for Dyad's use case where we need precise
syntax following.
- [x] add e2e test - happy case (make sure it clears selection and next
prompt is empty, and preview is cleared); de-selection case
- [x] shim - old & new file
- [x] upgrade path
- [x] add docs
- [x] add try-catch to parser script
- [x] make it work for next.js
- [x] extract npm package
- [x] make sure plugin doesn't apply in prod