feat(oi): add open interest scraper module

add new oi_scraper directory for collecting open interest data
and update the main EA to integrate with the scraper functionality
Kunthawat Greethong
2026-01-04 17:35:14 +07:00
parent d79bf572ef
commit 28a4546cd8
9 changed files with 1278 additions and 0 deletions

oi_scraper/WINDOWS_SETUP.md Normal file

@@ -0,0 +1,658 @@
# CME OI Scraper - Windows Setup Guide
Complete guide for setting up and running the CME OI scraper on Windows with automatic daily updates.
## Table of Contents
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Configuration](#configuration)
- [Manual Testing](#manual-testing)
- [Automatic Daily Updates](#automatic-daily-updates)
- [MetaTrader 5 Integration](#metatrader-5-integration)
- [Troubleshooting](#troubleshooting)
- [Advanced Options](#advanced-options)
- [Summary](#summary)
- [Support](#support)
---
## Prerequisites
### Required Software
1. **Python 3.9 or higher**
   - Download: https://www.python.org/downloads/
   - During installation: ✅ Check "Add Python to PATH"
2. **CME Group QuikStrike Account**
   - Free account required: https://www.cmegroup.com/
   - Register for QuikStrike access
   - Save your username and password
3. **MetaTrader 5** (for EA integration)
   - Download: https://www.metatrader5.com/
   - Install on your Windows machine
### Verify Python Installation
```cmd
python --version
```
Expected output: `Python 3.9.x` or higher.
If `python` is not recognized, try the Python launcher (`py --version`) or reinstall Python with "Add Python to PATH" checked.
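The same check can be done from inside Python, which is handy when several interpreters are installed and it is unclear which one `python` resolves to. This is a minimal sketch; the helper name `python_is_supported` is made up for illustration and is not part of the scraper.

```python
import sys

MIN_VERSION = (3, 9)  # minimum version stated in this guide


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True when the interpreter meets the guide's minimum version."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    status = "OK" if python_is_supported() else "too old"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```

Saving this as a script and running it with `python` and then with `py` shows whether both commands point at a suitable interpreter.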
---
## Installation
### Step 1: Navigate to Scraper Directory
Open Command Prompt (cmd) and navigate:
```cmd
cd C:\Users\YourUsername\Gitea\MeanRevisionEA\oi_scraper
```
Replace `YourUsername` with your actual Windows username.
### Step 2: Create Environment File
```cmd
copy .env.example .env
```
### Step 3: Edit .env File
Open `.env` with Notepad:
```cmd
notepad .env
```
Update with your credentials:
```env
# CME Group QuikStrike Login Credentials
CME_USERNAME=your_actual_username_here
CME_PASSWORD=your_actual_password_here
# Product Configuration (Gold)
PRODUCT_URL=https://cmegroup.quikstrike.net/User/QuikStrikeView.aspx?pid=40&viewitemid=IntegratedOpenInterestTool
# Output Settings
CSV_OUTPUT_PATH=./oi_data.csv
TOP_N_STRIKES=3
# Scraping Settings
HEADLESS=false
TIMEOUT_SECONDS=30
RETRY_ATTEMPTS=3
# Logging
LOG_LEVEL=INFO
```
**Save and close** (Ctrl+S, then Alt+F4).
### Step 4: Install Python Dependencies
```cmd
pip install -r requirements.txt
```
Expected output: the final line begins with `Successfully installed` and lists playwright, python-dotenv, and pandas, along with their versions and dependencies.
### Step 5: Install Playwright Browser
```cmd
playwright install chromium
```
Expected output: Downloading Chromium... [progress bar]
---
## Configuration
### Available Products
**Gold (XAUUSD/COMEX Gold - OG|GC):**
```env
PRODUCT_URL=https://cmegroup.quikstrike.net/User/QuikStrikeView.aspx?pid=40&viewitemid=IntegratedOpenInterestTool
```
**Silver:**
```env
PRODUCT_URL=https://cmegroup.quikstrike.net/User/QuikStrikeView.aspx?pid=41&viewitemid=IntegratedOpenInterestTool
```
**SOFR (3M SOFR):**
```env
PRODUCT_URL=https://cmegroup.quikstrike.net/User/QuikStrikeView.aspx?pid=476&viewitemid=IntegratedOpenInterestTool
```
### Configuration Options
| Setting | Description | Default |
|----------|-------------|---------|
| `TOP_N_STRIKES` | Number of top strikes to export | 3 |
| `HEADLESS` | Run browser without window (true/false) | false |
| `TIMEOUT_SECONDS` | Page load timeout in seconds | 30 |
| `CSV_OUTPUT_PATH` | Output CSV file path | ./oi_data.csv |
| `LOG_LEVEL` | DEBUG, INFO, WARNING, ERROR | INFO |
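To make the defaults in the table concrete, here is a hedged sketch of how settings like these could be read from the environment. The `ScraperSettings` class and `load_settings` function are hypothetical names for illustration; the real `main.py` may parse its configuration differently.

```python
import os
from dataclasses import dataclass


@dataclass
class ScraperSettings:
    # Defaults mirror the Configuration Options table in this guide
    top_n_strikes: int = 3
    headless: bool = False
    timeout_seconds: int = 30
    csv_output_path: str = "./oi_data.csv"
    log_level: str = "INFO"


def load_settings(env=None):
    """Build settings from a mapping of environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    defaults = ScraperSettings()
    return ScraperSettings(
        top_n_strikes=int(env.get("TOP_N_STRIKES", defaults.top_n_strikes)),
        # Only the literal string "true" (any case) enables headless mode
        headless=str(env.get("HEADLESS", "false")).strip().lower() == "true",
        timeout_seconds=int(env.get("TIMEOUT_SECONDS", defaults.timeout_seconds)),
        csv_output_path=env.get("CSV_OUTPUT_PATH", defaults.csv_output_path),
        log_level=str(env.get("LOG_LEVEL", defaults.log_level)).upper(),
    )
```

With python-dotenv installed, calling `load_dotenv()` from the `dotenv` package before `load_settings()` would populate `os.environ` from the `.env` file.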
---
## Manual Testing
### Run Scraper Manually
```cmd
python main.py
```
Expected output:
```
INFO:__main__:Cookies loaded from file
INFO:__main__:Using existing session (cookies)
INFO:__main__:Navigating to OI Heatmap: https://...
INFO:__main__:Extracting OI data from Gold matrix table...
INFO:__main__:Extracted 6 OI levels
INFO:__main__:Exported OI data to ./oi_data.csv
```
### Check Output
**1. Verify CSV created:**
```cmd
dir oi_data.csv
```
**2. View CSV content:**
```cmd
notepad oi_data.csv
```
Expected format:
```csv
Type,Strike,OI
CALL,4400,6193
CALL,4300,3826
CALL,4350,1983
PUT,4400,5559
PUT,4300,2988
PUT,4350,1214
```
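A quick structural check can catch a truncated or malformed file before the EA reads it. The sketch below assumes the three-column layout shown above (`Type,Strike,OI` with `CALL`/`PUT` rows); `validate_oi_csv` is an illustrative helper, not a function from the scraper.

```python
import csv
import io

EXPECTED_HEADER = ["Type", "Strike", "OI"]


def validate_oi_csv(text):
    """Return a list of problems found in the CSV text; an empty list means it looks valid."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows or rows[0] != EXPECTED_HEADER:
        return [f"line 1: header should be {','.join(EXPECTED_HEADER)}"]
    problems = []
    for line_no, row in enumerate(rows[1:], start=2):
        if not row:
            continue  # ignore blank lines
        if len(row) != 3:
            problems.append(f"line {line_no}: expected 3 columns, got {len(row)}")
            continue
        kind, strike, oi = (cell.strip() for cell in row)
        if kind not in ("CALL", "PUT"):
            problems.append(f"line {line_no}: Type must be CALL or PUT, got {kind!r}")
        try:
            float(strike)
            if int(oi) < 0:
                raise ValueError("negative OI")
        except ValueError:
            problems.append(f"line {line_no}: Strike and OI must be non-negative numbers")
    return problems
```

For a file on disk, something like `validate_oi_csv(open("oi_data.csv", encoding="utf-8-sig").read())` would apply the check; the `utf-8-sig` encoding is used here so a byte-order mark, if present, does not break the header comparison.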
### Check Logs
```cmd
type scraper.log
```
Or view in Notepad:
```cmd
notepad scraper.log
```
---
## Automatic Daily Updates
### Option 1: Windows Task Scheduler (Recommended)
#### Step 1: Create Batch File Wrapper
Create `run_scraper.bat` in the oi_scraper directory:
```cmd
@echo off
REM Run from the scraper directory so relative paths (.env, oi_data.csv) resolve
cd /d C:\Users\YourUsername\Gitea\MeanRevisionEA\oi_scraper
echo Starting CME OI Scraper at %date% %time% >> scraper.log
echo ---------------------------------------- >> scraper.log
REM Append both stdout and stderr of the scraper to the log
python main.py >> scraper.log 2>&1
if %ERRORLEVEL% EQU 0 (
    echo %date% %time%: Scraper completed successfully >> scraper.log
) else (
    echo %date% %time%: Scraper failed with error %ERRORLEVEL% >> scraper.log
)
echo ---------------------------------------- >> scraper.log
```
Replace `YourUsername` with your actual username.
#### Step 2: Open Task Scheduler
Press `Win + R`, type `taskschd.msc`, press Enter
Or: Start → Windows Administrative Tools → Task Scheduler
#### Step 3: Create Task
1. Click **"Create Basic Task"** on right sidebar
2. **Name:** `CME OI Scraper - Daily`
3. **Description:** `Update OI data from CME QuikStrike every day at 9 AM`
4. Click **Next**
#### Step 4: Set Trigger
1. **Trigger:** Select "Daily"
2. **Start date:** Today's date
3. **Start time:** 9:00:00 AM (or your preferred time)
4. Click **Next**
#### Step 5: Set Action
1. **Action:** Select "Start a program"
2. **Program/script:**
```
C:\Users\YourUsername\Gitea\MeanRevisionEA\oi_scraper\run_scraper.bat
```
3. **Start in (optional):**
```
C:\Users\YourUsername\Gitea\MeanRevisionEA\oi_scraper
```
4. Click **Next**
#### Step 6: Finish
1. Review settings
2. Check "Open the Properties dialog for this task when I click Finish"
3. Click **Finish**
#### Step 7: Configure Advanced Settings (Optional)
In the Properties dialog:
- **General tab:**
  - ✅ Run whether user is logged on or not (the task then runs without a desktop session and no browser window can be displayed, so `HEADLESS=true` in `.env` is the safer setting for scheduled runs)
  - ✅ Do not store password (if using Windows authentication)
  - ✅ Run with highest privileges
- **Conditions tab:**
  - ✅ Start the task only if the computer is on AC power
  - ✅ Stop if the computer switches to battery power
  - ✅ Wake the computer to run this task
- **Settings tab:**
  - ✅ Allow task to be run on demand
  - ❌ Stop the task if it runs longer than: 30 minutes
  - ✅ If the task fails, restart every: 5 minutes (up to 3 times)
Click **OK** to save settings.
#### Step 8: Test Task
1. In Task Scheduler, find "CME OI Scraper - Daily"
2. Right-click → **Run**
3. Check `scraper.log` after a minute:
```cmd
type scraper.log
```
---
### Option 2: PowerShell Script (Advanced)
#### Step 1: Create PowerShell Script
Save as `run_scraper.ps1`:
```powershell
# Script configuration
$scriptPath = "C:\Users\YourUsername\Gitea\MeanRevisionEA\oi_scraper"
$logFile = "$scriptPath\scraper.log"
$timestamp = Get-Date -Format "yyyy-MM-dd HH:mm:ss"
# Navigate to script directory
Set-Location $scriptPath
try {
    # Run Python scraper, appending stdout and stderr to the log.
    # The braces in ${timestamp} are required: a bare $timestamp followed by ":" is a parse error.
    "${timestamp}: Starting CME OI Scraper" | Add-Content $logFile
    & python main.py 2>&1 | Add-Content $logFile
    # A non-zero exit code does not throw on its own, so check it explicitly
    if ($LASTEXITCODE -ne 0) {
        throw "python exited with code $LASTEXITCODE"
    }
    # Check that the CSV exists
    if (Test-Path "oi_data.csv") {
        $fileInfo = Get-Item "oi_data.csv"
        "${timestamp}: Scraper completed successfully (CSV updated: $($fileInfo.LastWriteTime))" | Add-Content $logFile
    } else {
        "${timestamp}: WARNING - CSV file not created" | Add-Content $logFile
    }
} catch {
    $errorMsg = $_.Exception.Message
    "${timestamp}: ERROR - $errorMsg" | Add-Content $logFile
    exit 1
}
```
#### Step 2: Update Task Scheduler to Use PowerShell
Same steps as Option 1, but:
- **Program/script:** `powershell.exe`
- **Add arguments:**
```
-ExecutionPolicy Bypass -File "C:\Users\YourUsername\Gitea\MeanRevisionEA\oi_scraper\run_scraper.ps1"
```
---
## MetaTrader 5 Integration
### Find MT5 Files Directory
MT5 data directory location:
```
C:\Users\YourUsername\AppData\Roaming\MetaQuotes\Terminal\[Terminal_ID]\MQL5\Files\
```
**To find your Terminal_ID:**
1. Open MT5
2. Click **File** → **Open Data Folder**
3. In the folder that opens, go to `MQL5\Files\`. The long folder name directly under `Terminal\` in the address bar is your Terminal_ID.
### Update Batch File to Copy to MT5
Edit `run_scraper.bat`:
```cmd
@echo off
cd /d C:\Users\YourUsername\Gitea\MeanRevisionEA\oi_scraper
echo Starting CME OI Scraper at %date% %time% >> scraper.log
echo ---------------------------------------- >> scraper.log
python main.py >> scraper.log 2>&1
REM "if errorlevel 1" is evaluated at run time. %ERRORLEVEL% inside a nested
REM block is expanded when the block is parsed, so it would miss copy's result.
if errorlevel 1 (
    echo %date% %time%: ERROR - Scraper failed with error %ERRORLEVEL% >> scraper.log
    goto END
)
if not exist oi_data.csv (
    echo %date% %time%: ERROR - oi_data.csv not found >> scraper.log
    goto END
)
echo Copying OI data to MT5... >> scraper.log
copy /Y oi_data.csv "C:\Users\YourUsername\AppData\Roaming\MetaQuotes\Terminal\[Your_Terminal_ID]\MQL5\Files\oi_data.csv" >> scraper.log 2>&1
if errorlevel 1 (
    echo %date% %time%: ERROR - Failed to copy to MT5 >> scraper.log
) else (
    echo %date% %time%: Scraper completed - OI data copied to MT5 >> scraper.log
)
:END
echo ---------------------------------------- >> scraper.log
```
Replace `[Your_Terminal_ID]` with your actual MT5 terminal ID.
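The copy step could also be done from Python, which makes the two failure cases (missing CSV, wrong terminal folder) explicit. This is a sketch under the assumption that the destination is the `MQL5\Files` directory described above; `copy_to_mt5` is a hypothetical helper and is not part of the scraper.

```python
import shutil
from pathlib import Path


def copy_to_mt5(csv_path, mt5_files_dir):
    """Copy the scraper's CSV into the MT5 Files directory and return the new path."""
    src = Path(csv_path)
    dest_dir = Path(mt5_files_dir)
    if not src.is_file():
        raise FileNotFoundError(f"{src} not found; run the scraper first")
    if not dest_dir.is_dir():
        raise NotADirectoryError(f"{dest_dir} is not a directory; check the Terminal_ID")
    # copy2 also preserves the modification time, which helps when checking freshness
    return Path(shutil.copy2(src, dest_dir / src.name))
```

Raising on a missing destination rather than creating it is deliberate in this sketch: a mistyped Terminal_ID would otherwise produce a new folder that MT5 never reads.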
### Update EA Configuration
In your EA (`OI_MeanReversion_Pro_XAUUSD_A.mq5`), set:
```mql5
input ENUM_OI_SOURCE InpOISource = OI_SOURCE_CSV_FILE; // Load from CSV file
```
The EA will automatically read `oi_data.csv` from its Files directory.
---
## Troubleshooting
### Python Not Found
**Error:** `'python' is not recognized as an internal or external command`
**Solutions:**
1. Use full path to Python:
```cmd
C:\Users\YourUsername\AppData\Local\Programs\Python\Python312\python.exe main.py
```
2. Use `py` launcher:
```cmd
py main.py
```
3. Reinstall Python with "Add to PATH" option
### Module Import Errors
**Error:** `ModuleNotFoundError: No module named 'playwright'`
**Solution:**
```cmd
pip install -r requirements.txt
```
### Login Fails
**Error:** `Login failed - still on login page`
**Solutions:**
1. Check credentials in `.env` file:
```cmd
notepad .env
```
2. Check login screenshots:
- `login_failed.png` - Shows login page
- `login_error.png` - Shows error during login
- `login_success.png` - Confirms successful login
3. Manually test login at: https://www.cmegroup.com/
4. Check if 2FA is required (CME may require additional authentication)
### No Data Extracted
**Warning:** `No CALL OI data extracted` or `No PUT OI data extracted`
**Solutions:**
1. Check if you're logged in:
   - Delete `cookies.json` to force fresh login
   - Run scraper manually with `HEADLESS=false` in `.env`
2. Check if page structure changed:
   - View screenshots to see actual page content
   - Check if Gold product URL is correct
3. Increase timeout:
   ```env
   TIMEOUT_SECONDS=60
   ```
### Task Not Running
**Issue:** Task Scheduler doesn't execute the task
**Solutions:**
1. Check task history:
   - Task Scheduler → Right-click task → Properties → History tab
   - Look for errors in the log
2. Test manually:
   - Right-click task → Run
   - Check `scraper.log` for output
3. Check account permissions:
   - Ensure task is set to run with your Windows account
   - Check "Run whether user is logged on or not"
4. Check Windows Event Viewer:
   - Event Viewer → Windows Logs → Application
   - Look for Task Scheduler errors
### Session Expiration
**Issue:** Session expires after some time
**Solution:**
The scraper will automatically re-login when cookies expire. No manual action needed.
To force fresh login:
```cmd
del cookies.json
```
### Check Logs
**View recent logs:**
```cmd
type scraper.log | more
```
**View last 20 lines:**
```cmd
powershell "Get-Content scraper.log -Tail 20"
```
**Search for errors:**
```cmd
findstr /C:"ERROR" scraper.log
```
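For a filtered view that covers warnings as well as errors, a few lines of Python do the same job as `findstr`. The sketch assumes log lines carry the level name as plain text, as in the `INFO:__main__:...` sample earlier in this guide; `find_problem_lines` is an illustrative name.

```python
def find_problem_lines(log_text, keywords=("ERROR", "WARNING"), last=20):
    """Return up to `last` of the most recent log lines containing any keyword."""
    hits = [
        line
        for line in log_text.splitlines()
        if any(keyword in line for keyword in keywords)
    ]
    return hits[-last:]
```

A typical call would be `find_problem_lines(open("scraper.log", errors="replace").read())`, where `errors="replace"` keeps the read from failing on stray bytes written by the batch wrapper.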
### Verify CSV Output
**Check if CSV is valid:**
```cmd
python -c "import pandas as pd; print(pd.read_csv('oi_data.csv'))"
```
**Check file size:**
```cmd
dir oi_data.csv
```
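File size alone does not show whether the scheduled run actually refreshed the data. A hedged sketch of a freshness check follows; the 24-hour threshold matches the daily schedule in this guide but is otherwise an assumption, and `csv_is_fresh` is a hypothetical helper.

```python
from datetime import datetime, timedelta
from pathlib import Path


def csv_is_fresh(path, max_age_hours=24, now=None):
    """Return True when the file exists, is non-empty, and was modified recently."""
    csv_file = Path(path)
    if not csv_file.is_file() or csv_file.stat().st_size == 0:
        return False
    now = now or datetime.now()
    modified = datetime.fromtimestamp(csv_file.stat().st_mtime)
    return now - modified <= timedelta(hours=max_age_hours)
```

The optional `now` parameter exists only so the function can be exercised with a fixed clock; in normal use `csv_is_fresh("oi_data.csv")` is enough.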
---
## Advanced Options
### Run Multiple Times Per Day
**Edit Task Scheduler Trigger:**
1. Open task properties → Triggers tab
2. Click **New** to add an additional trigger (or **Edit** to change the existing one)
3. Set different times:
- 9:00 AM
- 12:00 PM
- 3:00 PM
- 6:00 PM
### Run on Market Days Only
**Create separate batch file:**
```cmd
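@echo off
cd /d C:\Users\YourUsername\Gitea\MeanRevisionEA\oi_scraper
REM Get the day of week as a number (0=Sunday, 1=Monday, 5=Friday, 6=Saturday).
REM PowerShell is used because wmic is deprecated and its /value output
REM ("DayOfWeek=3") cannot be compared as a number without extra parsing.
for /f %%a in ('powershell -NoProfile -Command "(Get-Date).DayOfWeek.value__"') do set DAY=%%a
REM Skip weekends
if %DAY% LSS 1 goto END
if %DAY% GTR 5 goto END
REM Run scraper
python main.py >> scraper.log 2>&1
:END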
```
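The weekday test is also easy to express in Python, for example as a guard near the start of a wrapper script. This sketch only distinguishes Monday to Friday from the weekend; it does not know about exchange holidays, and `is_market_weekday` is an illustrative name.

```python
from datetime import date


def is_market_weekday(day=None):
    """Return True for Monday through Friday (exchange holidays are not considered)."""
    day = day or date.today()
    return day.weekday() < 5  # date.weekday(): Monday=0 ... Sunday=6
```

Note that Python's numbering starts at Monday=0, which differs from the 0=Sunday convention used in the batch file above.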
### Email Notifications
**Use PowerShell to send email on completion:**
```powershell
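# Add to run_scraper.ps1 at the end
# Note: run_scraper.ps1 as written never sets $errorOccurred and its catch block
# exits before reaching this code. Set $errorOccurred = $true in the catch block
# and move "exit 1" after the email is sent so failures are reported too.
$smtpServer = "smtp.gmail.com"
$smtpPort = 587
$smtpUser = "your_email@gmail.com"
$smtpPass = "your_password"  # Gmail typically requires an app password here
$from = "CME OI Scraper <your_email@gmail.com>"
$to = "your_email@gmail.com"
# %date% is batch syntax and is not expanded by PowerShell, so use Get-Date
$subject = "CME OI Scraper - $(Get-Date -Format 'yyyy-MM-dd')"
if ($errorOccurred) {
    $body = "CME OI Scraper failed. Check logs for details."
} else {
    $body = "CME OI Scraper completed successfully.`n`nUpdated files:`n- oi_data.csv"
}
$message = New-Object System.Net.Mail.MailMessage $from, $to
$message.Subject = $subject
$message.Body = $body
$smtp = New-Object System.Net.Mail.SmtpClient $smtpServer, $smtpPort
$smtp.EnableSsl = $true
$smtp.Credentials = New-Object System.Net.NetworkCredential $smtpUser, $smtpPass
$smtp.Send($message)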
```
---
## Summary
**Quick Start Checklist:**
- [ ] Python 3.9+ installed
- [ ] CME QuikStrike account created
- [ ] `.env` file configured with credentials
- [ ] Dependencies installed (`pip install -r requirements.txt`)
- [ ] Playwright browser installed (`playwright install chromium`)
- [ ] Manual test successful (`python main.py`)
- [ ] `oi_data.csv` created and valid
- [ ] Task Scheduler task created
- [ ] Task tested manually
- [ ] CSV copied to MT5 Files directory
- [ ] EA configured to use CSV file
**Daily Workflow:**
1. Task Scheduler runs at 9:00 AM
2. Batch file executes Python scraper
3. Scraper logs in with saved cookies (or fresh login)
4. OI data extracted and saved to `oi_data.csv`
5. CSV copied to MT5 Files directory
6. EA reads updated OI data
7. EA uses new OI levels for trading
---
## Support
For issues or questions:
1. Check `scraper.log` for detailed error messages
2. Review screenshots (login_failed.png, login_error.png)
3. Verify `.env` configuration
4. Test manually without Task Scheduler
5. Check Windows Event Viewer for system errors
---
**Last Updated:** January 4, 2026
**Version:** 1.0
**Platform:** Windows 10/11