Fake LLM Server

A simple server that mimics the OpenAI chat completions API (streaming and non-streaming) for testing purposes.

Features

  • Implements a basic version of the OpenAI chat completions API
  • Supports both streaming and non-streaming responses
  • Always responds with a "hello world" message
  • Simulates a 429 rate limit error when the last message is "[429]"
  • Configurable through environment variables

Installation

npm install

Usage

Start the server:

# Development mode
npm run dev

# Production mode
npm run build
npm start

Example usage

curl -X POST http://localhost:3500/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Say something"}],"model":"any-model","stream":true}'

The server will be available at http://localhost:3500 by default.

API Endpoints

POST /v1/chat/completions

This endpoint mimics OpenAI's chat completions API.

Request Format

{
  "messages": [{ "role": "user", "content": "Your prompt here" }],
  "model": "any-model",
  "stream": true
}
  • Set stream: true to receive a streaming response
  • Set stream: false or omit it for a regular JSON response
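
If you instead want a single JSON response, the same curl request shown above works with stream set to false (or omitted):

curl -X POST http://localhost:3500/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Say something"}],"model":"any-model","stream":false}'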

Response

For non-streaming requests, you'll get a standard JSON response:

{
  "id": "chatcmpl-123456789",
  "object": "chat.completion",
  "created": 1699000000,
  "model": "fake-model",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "hello world"
      },
      "finish_reason": "stop"
    }
  ]
}

For streaming requests, you'll receive a series of server-sent events (SSE), each containing a chunk of the response.
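
The exact chunk boundaries depend on the server implementation, but each event follows OpenAI's chat.completion.chunk format and the stream ends with a [DONE] sentinel. A stream might look roughly like this (illustrative, not byte-exact):

data: {"id":"chatcmpl-123456789","object":"chat.completion.chunk","created":1699000000,"model":"fake-model","choices":[{"index":0,"delta":{"content":"hello world"},"finish_reason":null}]}

data: {"id":"chatcmpl-123456789","object":"chat.completion.chunk","created":1699000000,"model":"fake-model","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]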

Simulating Rate Limit Errors

To test how your application handles rate limiting, send a request whose last message has content exactly equal to [429]:

{
  "messages": [{ "role": "user", "content": "[429]" }],
  "model": "any-model"
}

This will return a 429 status code with the following response:

{
  "error": {
    "message": "Too many requests. Please try again later.",
    "type": "rate_limit_error",
    "param": null,
    "code": "rate_limit_exceeded"
  }
}
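
For example, with curl (the -i flag prints the status line so you can see the 429):

curl -i -X POST http://localhost:3500/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"[429]"}],"model":"any-model"}'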

Configuration

The server listens on port 3500 by default; to use a different port, change the PORT variable in the code.
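
If your copy of the server reads PORT from the environment (the Features list mentions environment-variable configuration, but this depends on the implementation), you can also override the port at launch:

# Assumes the server reads process.env.PORT; otherwise edit PORT in the source
PORT=4000 npm start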

Use Case

This server is primarily intended for testing applications that integrate with OpenAI's API, allowing you to develop and test without making actual API calls to OpenAI.
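
As an illustrative sketch (not part of this repository), an application under test could point the official openai Node SDK at the fake server; the model name and API key below are arbitrary, since the fake server is assumed to ignore them:

import OpenAI from "openai";

// Point the official SDK at the fake server instead of api.openai.com.
const client = new OpenAI({
  baseURL: "http://localhost:3500/v1",
  apiKey: "not-a-real-key", // assumption: the fake server does not validate API keys
});

async function main() {
  const stream = await client.chat.completions.create({
    model: "any-model",
    messages: [{ role: "user", content: "Say something" }],
    stream: true,
  });

  // Print the streamed "hello world" response as it arrives.
  for await (const chunk of stream) {
    process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
  }
}

main();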