Auto-sync from website-creator
This commit is contained in:
57
skills/image-analyze/SKILL.md
Normal file
57
skills/image-analyze/SKILL.md
Normal file
@@ -0,0 +1,57 @@
|
||||
---
|
||||
name: image-analyze
|
||||
description: Analyze images using vision AI when the current model doesn't support image input. Use this skill when you need to understand, describe, or extract information from images.
|
||||
---
|
||||
|
||||
# Image Analyze
|
||||
|
||||
Analyze images with vision AI via `python3 scripts/analyze_image.py <image_path> [prompt]`.
|
||||
|
||||
## Commands
|
||||
|
||||
| Command | Args | Description |
|
||||
|---------|------|-------------|
|
||||
| `analyze` | `<image_path> [prompt]` | Analyze image with optional custom prompt |
|
||||
|
||||
## Options
|
||||
|
||||
| Option | Default | Description |
|
||||
|--------|---------|-------------|
|
||||
| `--max-tokens` | 1024 | Maximum tokens in response |
|
||||
| `--temperature` | 0.7 | Response creativity (0-2) |
|
||||
| `--model` | moonshotai/Kimi-K2.5-TEE | Vision model to use |
|
||||
|
||||
## Examples
|
||||
|
||||
```bash
|
||||
# Basic analysis
|
||||
python3 scripts/analyze_image.py photo.jpg
|
||||
|
||||
# With custom prompt
|
||||
python3 scripts/analyze_image.py diagram.png "Extract all text and explain the workflow"
|
||||
|
||||
# Detailed analysis
|
||||
python3 scripts/analyze_image.py screenshot.png "Describe all UI elements and their positions"
|
||||
|
||||
# OCR-like extraction
|
||||
python3 scripts/analyze_image.py document.jpg "Transcribe all text exactly as shown"
|
||||
```
|
||||
|
||||
## Workflow
|
||||
|
||||
1. Provide image path (PNG, JPG, JPEG, GIF, WEBP, BMP)
|
||||
2. Optionally provide custom analysis prompt
|
||||
3. Script converts image to base64 and sends to vision API
|
||||
4. Returns detailed analysis text
|
||||
|
||||
## Output Format
|
||||
|
||||
- Success: Analysis text directly
|
||||
- Error: `Error: message` (to stderr)
|
||||
|
||||
## Notes
|
||||
|
||||
- Requires `CHUTES_API_TOKEN` in environment
|
||||
- Uses Kimi-K2.5-TEE vision model via Chutes AI
|
||||
- Supports common image formats
|
||||
- Best for: image description, OCR, UI analysis, diagram interpretation
|
||||
Reference in New Issue
Block a user