Fixes #1405
<!-- This is an auto-generated description by cubic. -->
## Summary by cubic
Adds GPT-5 Codex (OpenAI and Azure) and Claude 4.5 Sonnet to the model
options to enable newer coding models and larger contexts. Also
increases Claude 4 Sonnet max output tokens to 32k.
<!-- End of auto-generated description by cubic. -->
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Adds GPT‑5 Codex (OpenAI/Azure) and Claude 4.5 Sonnet, and increases
Claude 4 Sonnet max output tokens to 32k across providers and tests.
>
> - **Models**:
>   - **OpenAI**: add `gpt-5-codex` (400k context, default temp 1).
>   - **Anthropic**:
>     - add `claude-sonnet-4-5-20250929` (1M context, `maxOutputTokens: 32_000`).
>     - update `claude-sonnet-4-20250514` `maxOutputTokens` from `16_000` to `32_000`.
>   - **Azure**: add `gpt-5-codex` (400k context, `maxOutputTokens: 128_000`).
>   - **Bedrock**:
>     - add `us.anthropic.claude-sonnet-4-5-20250929-v1:0` (1M context, `maxOutputTokens: 32_000`).
>     - update `us.anthropic.claude-sonnet-4-20250514-v1:0` `maxOutputTokens` to `32_000`.
> - **E2E tests**:
>   - Update snapshots to reflect `max_tokens` increased to `32000` for `anthropic/claude-sonnet-4-20250514` in engine and gateway tests.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
73298d2da0c833468f957bb436f1e33400307483. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
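As a rough illustration of the entries listed above, the sketch below models them as a plain lookup table. The `ModelOption` shape, the `MODEL_OPTIONS` table, and the `findModel` helper are assumptions made for this example and may not match the actual type or file layout in the codebase; only the model names and numbers come from the summary.

```typescript
// Hypothetical shape for a model entry; the real type may differ.
interface ModelOption {
  name: string;
  contextWindow?: number; // omitted where the summary does not state it
  maxOutputTokens?: number;
  temperature?: number;
}

// Values taken from the summary above; the table layout is illustrative.
const MODEL_OPTIONS: Record<string, ModelOption[]> = {
  openai: [{ name: "gpt-5-codex", contextWindow: 400_000, temperature: 1 }],
  anthropic: [
    { name: "claude-sonnet-4-5-20250929", contextWindow: 1_000_000, maxOutputTokens: 32_000 },
    { name: "claude-sonnet-4-20250514", maxOutputTokens: 32_000 }, // was 16_000
  ],
  azure: [{ name: "gpt-5-codex", contextWindow: 400_000, maxOutputTokens: 128_000 }],
  bedrock: [
    {
      name: "us.anthropic.claude-sonnet-4-5-20250929-v1:0",
      contextWindow: 1_000_000,
      maxOutputTokens: 32_000,
    },
    { name: "us.anthropic.claude-sonnet-4-20250514-v1:0", maxOutputTokens: 32_000 },
  ],
};

// Hypothetical helper: look up one model entry by provider and name.
function findModel(provider: string, name: string): ModelOption | undefined {
  const options = MODEL_OPTIONS[provider];
  return options ? options.find((m) => m.name === name) : undefined;
}
```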
<!-- This is an auto-generated description by cubic. -->
## Summary by cubic
Set “balanced” as the default smart context mode. Users now get balanced
when Smart Files Context is enabled and no mode is set; “conservative”
must be explicitly selected.
- **Refactors**
  - Default fallback to balanced in UI and engine (`proSmartContextOption`
    undefined -> "balanced").
  - `ProModeSelector` saves "conservative" explicitly; selector reads
    undefined as balanced.
  - Updated schema and types to allow "balanced" | "conservative".
  - Engine payload now includes `smart_context_mode` with "balanced" by
    default; e2e tests and snapshots updated.
- **Migration**
  - No action needed. Existing users without an explicit mode will use
    balanced by default; selecting conservative persists.
<!-- End of auto-generated description by cubic. -->
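The fallback described above can be sketched roughly as follows. `proSmartContextOption` and `smart_context_mode` are named in the summary; the `UserSettings` shape and the `resolveSmartContextMode` and `buildSmartContextPayload` helpers are hypothetical names used only for this illustration.

```typescript
type SmartContextMode = "balanced" | "conservative";

// Hypothetical settings shape; only proSmartContextOption is named in the PR.
interface UserSettings {
  proSmartContextOption?: SmartContextMode;
}

// An unset option is read as "balanced"; "conservative" must be stored explicitly.
function resolveSmartContextMode(settings: UserSettings): SmartContextMode {
  return settings.proSmartContextOption ?? "balanced";
}

// Hypothetical payload fragment sent to the engine.
function buildSmartContextPayload(settings: UserSettings): { smart_context_mode: SmartContextMode } {
  return { smart_context_mode: resolveSmartContextMode(settings) };
}
```

With this shape, existing settings that never stored a mode need no migration, because the default is applied when the value is read rather than when it is written.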
Fixes #1037
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Enable JSON file support in codebase scanning so common configs and data
(e.g., package.json, vercel.json, translations) are included. Adds .json
to the allowed extensions and removes special-casing for
package.json/vercel.json.
<!-- End of auto-generated description by cubic. -->
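A minimal sketch of an extension allow-list with `.json` included is shown below. The `ALLOWED_EXTENSIONS` set and the `isAllowedFile` helper are assumptions for illustration, and every entry other than `.json` is a placeholder rather than the project's actual list.

```typescript
// Illustrative allow-list; only ".json" is confirmed by the summary above.
const ALLOWED_EXTENSIONS = new Set<string>([".ts", ".tsx", ".js", ".jsx", ".css", ".json"]);

// Hypothetical helper: decide whether a file is scanned, based on its extension.
function isAllowedFile(filePath: string): boolean {
  const baseName = filePath.slice(filePath.lastIndexOf("/") + 1);
  const dot = baseName.lastIndexOf(".");
  if (dot <= 0) return false; // no extension, or a dotfile such as ".gitignore"
  return ALLOWED_EXTENSIONS.has(baseName.slice(dot).toLowerCase());
}
```

Because `.json` is in the set, files such as package.json and vercel.json pass the same check as any other file and need no special case.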
Fixes #554 and #1049
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Stop mutating package.json when reading files. readFileWithCache now
returns raw content so all fields are preserved (e.g., packageManager).
Fixes#554 and #1049.
- **Bug Fixes**
  - Removed package.json "cleaning" logic and the cleanContent helper.
  - Return and cache unmodified file content from both fs and
    virtualFileSystem.
<!-- End of auto-generated description by cubic. -->
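The behavior after this fix can be sketched as a cache that stores and returns exactly what was read. The signature below is an assumption: the injected `read` callback stands in for either fs or virtualFileSystem, and the real `readFileWithCache` may take different arguments.

```typescript
// Hypothetical reader abstraction standing in for fs or virtualFileSystem.
type FileReader = (filePath: string) => string;

const fileCache = new Map<string, string>();

// Returns the raw file content and caches it unmodified (no package.json "cleaning").
function readFileWithCache(filePath: string, read: FileReader): string {
  const cached = fileCache.get(filePath);
  if (cached !== undefined) return cached;
  const content = read(filePath);
  fileCache.set(filePath, content);
  return content;
}
```

Since nothing rewrites the content between the read and the cache, fields such as `packageManager` reach the caller untouched.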
This uses Gemini's native [thinking
summaries](https://cloud.google.com/vertex-ai/generative-ai/docs/thinking#thought-summaries),
which were recently added to the API.
Why? The grafted thinking sometimes confused the model, especially
Gemini 2.5 Flash, which then put dyad tags like `<dyad-write>` inside
the `<think>` tags.
This also improves the UX because you can see the native thoughts rather
than having the Gemini response load for a while without any feedback.
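To make the failure mode with grafted thinking concrete, here is a small sketch of a check for dyad tags that ended up inside a `<think>` block. `hasDyadTagInsideThink` is a hypothetical helper written for this description and is not part of the change itself.

```typescript
// Hypothetical diagnostic: true if any <think>...</think> block contains a dyad tag.
function hasDyadTagInsideThink(response: string): boolean {
  const thinkBlock = /<think>([\s\S]*?)<\/think>/g;
  let match: RegExpExecArray | null;
  while ((match = thinkBlock.exec(response)) !== null) {
    // match[1] is the text between the opening and closing think tags.
    if (/<dyad-[a-z-]+/.test(match[1])) return true;
  }
  return false;
}
```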
I tried adding Anthropic extended thinking; however, it requires the
temperature to be set to 1, which isn't ideal for Dyad's use case, where
we need the model to follow syntax precisely.