
Fake LLM Server

A simple server that mimics the OpenAI chat completions API (streaming and non-streaming) for testing purposes.

Features

  • Implements a basic version of the OpenAI chat completions API
  • Supports both streaming and non-streaming responses
  • Always responds with a "hello world" message
  • Simulates a 429 rate limit error when the last message is "[429]"
  • Configurable through environment variables

Installation

npm install

Usage

Start the server:

# Development mode
npm run dev

# Production mode
npm run build
npm start

Example usage

curl -X POST http://localhost:3500/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Say something"}],"model":"any-model","stream":true}'

The server will be available at http://localhost:3500 by default.
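Because the server speaks the OpenAI wire format, you can also point an OpenAI-compatible SDK at it. A minimal TypeScript sketch using the official openai npm package, assuming it is installed in your test project and that the fake server ignores the API key:

import OpenAI from "openai";

// Point the client at the local fake server instead of api.openai.com.
const client = new OpenAI({
  baseURL: "http://localhost:3500/v1",
  apiKey: "test-key", // assumed: the fake server does not validate keys
});

async function main() {
  const response = await client.chat.completions.create({
    model: "any-model",
    messages: [{ role: "user", content: "Say something" }],
  });
  console.log(response.choices[0].message.content); // "hello world"
}

main();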

API Endpoints

POST /v1/chat/completions

This endpoint mimics OpenAI's chat completions API.

Request Format

{
  "messages": [{ "role": "user", "content": "Your prompt here" }],
  "model": "any-model",
  "stream": true
}
  • Set stream: true to receive a streaming response
  • Set stream: false or omit it for a regular JSON response
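For example, a non-streaming variant of the earlier curl request:

curl -X POST http://localhost:3500/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Say something"}],"model":"any-model","stream":false}'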

Response

For non-streaming requests, you'll get a standard JSON response:

{
  "id": "chatcmpl-123456789",
  "object": "chat.completion",
  "created": 1699000000,
  "model": "fake-model",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "hello world"
      },
      "finish_reason": "stop"
    }
  ]
}

For streaming requests, you'll receive a series of server-sent events (SSE), each containing a chunk of the response.
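Each event carries a chat.completion.chunk object whose delta holds a piece of the message content, followed by a terminating [DONE] event. An illustrative excerpt (the values below are representative of the OpenAI chunk format, not guaranteed output of this server):

data: {"id":"chatcmpl-123456789","object":"chat.completion.chunk","created":1699000000,"model":"fake-model","choices":[{"index":0,"delta":{"content":"hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-123456789","object":"chat.completion.chunk","created":1699000000,"model":"fake-model","choices":[{"index":0,"delta":{"content":" world"},"finish_reason":"stop"}]}

data: [DONE]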

Simulating Rate Limit Errors

To test how your application handles rate limiting, send a request whose last message has content exactly equal to [429]:

{
  "messages": [{ "role": "user", "content": "[429]" }],
  "model": "any-model"
}

This will return a 429 status code with the following response:

{
  "error": {
    "message": "Too many requests. Please try again later.",
    "type": "rate_limit_error",
    "param": null,
    "code": "rate_limit_exceeded"
  }
}
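In a test, you can assert that your client surfaces this error. A minimal TypeScript sketch with the openai npm package, which throws an APIError carrying the HTTP status (same assumptions as the client example above):

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:3500/v1",
  apiKey: "test-key", // assumed: the fake server does not validate keys
});

async function expectRateLimit() {
  try {
    await client.chat.completions.create({
      model: "any-model",
      messages: [{ role: "user", content: "[429]" }],
    });
    throw new Error("expected a 429 rate limit error");
  } catch (error) {
    // openai v4 wraps non-2xx responses in OpenAI.APIError with a status field.
    if (error instanceof OpenAI.APIError && error.status === 429) {
      console.log("got expected rate limit error:", error.message);
    } else {
      throw error;
    }
  }
}

expectRateLimit();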

Configuration

The server listens on port 3500 by default. Set the PORT environment variable to change it, or modify the PORT variable in the code.
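For example, assuming the PORT environment variable is honored:

PORT=4000 npm run dev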

Use Case

This server is primarily intended for testing applications that integrate with OpenAI's API, allowing you to develop and test without making actual API calls to OpenAI.