Fixes #1405
## Summary by cubic
Adds GPT-5 Codex (OpenAI and Azure) and Claude 4.5 Sonnet to the model options, bringing newer coding models and larger context windows. Also increases Claude 4 Sonnet's max output tokens to 32k.
---
> [!NOTE]
> Adds GPT‑5 Codex (OpenAI/Azure) and Claude 4.5 Sonnet, and increases Claude 4 Sonnet max output tokens to 32k across providers and tests.
>
> - **Models**:
>   - **OpenAI**: add `gpt-5-codex` (400k context, default temperature 1).
>   - **Anthropic**:
>     - add `claude-sonnet-4-5-20250929` (1M context, `maxOutputTokens: 32_000`).
>     - update `claude-sonnet-4-20250514` `maxOutputTokens` from `16_000` to `32_000`.
>   - **Azure**: add `gpt-5-codex` (400k context, `maxOutputTokens: 128_000`).
>   - **Bedrock**:
>     - add `us.anthropic.claude-sonnet-4-5-20250929-v1:0` (1M context, `maxOutputTokens: 32_000`).
>     - update `us.anthropic.claude-sonnet-4-20250514-v1:0` `maxOutputTokens` to `32_000`.
> - **E2E tests**:
>   - Update snapshots to reflect `max_tokens` increased to `32000` for `anthropic/claude-sonnet-4-20250514` in engine and gateway tests.
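For reference, here is a minimal sketch of what a couple of these model entries might look like in the TypeScript model list. The `ModelOption` interface and field names are assumptions for illustration, not dyad's actual schema; the numbers come from the summary above.

```ts
// Hypothetical model-entry shape; dyad's actual schema may differ.
interface ModelOption {
  name: string;            // provider model ID
  displayName: string;
  contextWindow: number;   // context size, in tokens
  maxOutputTokens: number;
  temperature?: number;    // omit to use the app default
}

// New OpenAI/Azure entry: 400k context, default temperature of 1.
const gpt5Codex: ModelOption = {
  name: "gpt-5-codex",
  displayName: "GPT-5 Codex",
  contextWindow: 400_000,
  maxOutputTokens: 128_000, // value used by the Azure entry
  temperature: 1,
};

// New Anthropic entry: 1M context, 32k max output tokens.
const claudeSonnet45: ModelOption = {
  name: "claude-sonnet-4-5-20250929",
  displayName: "Claude 4.5 Sonnet",
  contextWindow: 1_000_000,
  maxOutputTokens: 32_000,
};
```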
Fixes #554 and #1049
---
## Summary by cubic
Stop mutating package.json when reading files. `readFileWithCache` now returns raw content, so all fields are preserved (e.g., `packageManager`). Fixes #554 and #1049.
- **Bug Fixes**
  - Removed the package.json "cleaning" logic and the `cleanContent` helper.
  - Return and cache unmodified file content from both `fs` and `virtualFileSystem` (see the sketch below).
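A rough before/after sketch of the fix, assuming a cache keyed by file path (the names and structure here are illustrative, not the exact dyad code):

```ts
import { readFile } from "node:fs/promises";

// Illustrative cache; the real implementation may be structured differently.
const fileCache = new Map<string, string>();

async function readFileWithCache(
  filePath: string,
  virtualFileSystem?: Map<string, string>,
): Promise<string> {
  const cached = fileCache.get(filePath);
  if (cached !== undefined) return cached;

  // Before: package.json content was run through a cleanContent helper that
  // parsed and re-serialized the JSON, silently dropping fields such as
  // `packageManager`. After: the raw content is returned and cached as-is.
  const content =
    virtualFileSystem?.get(filePath) ?? (await readFile(filePath, "utf8"));

  fileCache.set(filePath, content);
  return content;
}
```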
This uses Gemini's native [thinking summaries](https://cloud.google.com/vertex-ai/generative-ai/docs/thinking#thought-summaries), which were recently added to the API.
Why? The grafted thinking would sometimes cause weird issues where the
model, especially Gemini 2.5 Flash, got confused and put dyad tags like
`<dyad-write>` inside the `<think>` tags.
This also improves the UX because you can see the native thoughts rather
than having the Gemini response load for a while without any feedback.
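For context, enabling thought summaries with Google's Gen AI SDK looks roughly like this. This is a minimal sketch; Dyad's actual integration goes through its own provider layer, and the model and prompt here are placeholders:

```ts
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: "gemini-2.5-flash",
  contents: "Refactor this component to use hooks.",
  config: {
    // Ask the API to include summarized thoughts alongside the answer.
    thinkingConfig: { includeThoughts: true },
  },
});

// Thought-summary parts are flagged with `thought: true`, so they can be
// surfaced in the UI separately from the final answer.
for (const part of response.candidates?.[0]?.content?.parts ?? []) {
  if (part.thought) {
    console.log("[thinking]", part.text);
  } else {
    console.log(part.text);
  }
}
```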
I tried adding Anthropic extended thinking; however, it requires the temperature to be set to 1, which isn't ideal for Dyad's use case, where we need precise syntax following.
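For the record, this is roughly what the rejected Anthropic setup would look like: extended thinking is enabled per request, and the API does not accept a custom temperature alongside it. A sketch, not code from this PR:

```ts
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const message = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 32_000,
  // Extended thinking is opted into per request with a token budget.
  thinking: { type: "enabled", budget_tokens: 10_000 },
  // temperature: 0, // rejected: extended thinking requires temperature 1
  messages: [
    { role: "user", content: "Write the file using <dyad-write> tags." },
  ],
});
```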