refactor(oi): improve data extraction and consolidate documentation

- Fix MQL5 API usage in EA to use correct CopyRates and POSITION_TYPE enums
- Refactor scraper data extraction to use drop_duplicates for unique strikes
- Consolidate Windows setup guide into main README
- Add virtual environment batch files for easier setup and execution
- Simplify run_scraper.bat to focus on core execution
- Normalize lot calculation to use SymbolInfo.LotsStep()
@@ -1,179 +1,268 @@
# CME OI Scraper

Python scraper to pull Open Interest data from CME Group QuikStrike and current gold price from investing.com.
Python scraper that extracts Open Interest data from CME Group QuikStrike and current gold price from investing.com.

## What It Extracts

1. **OI Levels (from CME QuikStrike):**
   - Top 3 CALL strikes by OI volume
   - Top 3 PUT strikes by OI volume
   - Top 3 CALL strikes by OI volume (unique strikes)
   - Top 3 PUT strikes by OI volume (unique strikes)

2. **Gold Price (from investing.com):**
   - Current gold futures price (e.g., 4345.50)
   - Current gold futures price (e.g., 4476.50)
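
The commit message mentions `drop_duplicates`, which suggests the scraper deduplicates strikes with pandas before ranking them. The same idea can be sketched in plain Python, assuming rows arrive as `(strike, oi)` pairs. The function name `top_unique_strikes`, the keep-the-largest-OI rule for repeated strikes, and the descending sort order are illustrative assumptions, not taken from `main.py`.

```python
from typing import Dict, Iterable, List, Tuple


def top_unique_strikes(
    rows: Iterable[Tuple[float, int]], n: int = 3
) -> List[Tuple[float, int]]:
    """Return the n strikes with the highest OI, one entry per strike."""
    best: Dict[float, int] = {}
    for strike, oi in rows:
        # Assumed rule: when a strike repeats, keep its largest OI value.
        if oi > best.get(strike, -1):
            best[strike] = oi
    # Highest OI first; ties fall back to the lower strike for determinism.
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Hypothetical rows: strike 4500.0 appears twice and must be counted once.
calls = [(4500.0, 176), (4450.0, 173), (4500.0, 120), (4375.0, 147), (4400.0, 90)]
top_calls = top_unique_strikes(calls)
```

Without the deduplication step, the repeated 4500.0 row could occupy two of the three slots and push a distinct strike out of the result.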

## Prerequisites

- Python 3.9+
- CME Group QuikStrike account with login credentials
- Python 3.9 or higher
- CME Group QuikStrike account (free registration at https://www.cmegroup.com)
- Windows 10/11 (for batch files) or Linux/macOS

## Installation
## Quick Start

1. Copy environment variables:
```bash
cp .env.example .env
```
### Windows

2. Edit `.env` and add your CME credentials:
```bash
CME_USERNAME=your_username
CME_PASSWORD=your_password
```
1. **Run one-time setup:**
```cmd
cd C:\Path\To\oi_scraper
setup_env.bat
```

3. Install dependencies:
```bash
pip install -r requirements.txt
playwright install chromium
```
2. **Run the scraper:**
```cmd
run_with_venv.bat
```

## Usage
### Linux/macOS

### Basic Scraping
1. **Setup:**
```bash
cd /path/to/oi_scraper
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
playwright install chromium
```

```bash
python main.py
```

This will:
- Log in to CME QuikStrike
- Navigate to OI Heatmap
- Extract top 3 CALL and PUT strikes by OI volume
- Scrape current gold price from investing.com
- Export to `oi_data.csv`

### Session Persistence

The scraper automatically saves your login session to `cookies.json`. This means:

- **First run**: Logs in with your credentials, saves cookies
- **Subsequent runs**: Uses saved cookies if session is still valid
- **Session expired**: Automatically logs in again and saves new cookies

Benefits for scheduled runs:
- Faster execution (skips login when session is valid)
- Reduces login attempts to CME servers
- CME sessions typically last several days/weeks

To force a fresh login, delete `cookies.json`:
```bash
rm cookies.json
```

### Output Format

The CSV output is compatible with the EA's `LoadOIFromCSV()` and `LoadFuturePriceFromCSV()` functions:

```csv
Type,Strike,OI
CALL,4345,155398
CALL,4350,229137
CALL,4360,90649
PUT,4300,227936
PUT,4290,270135
PUT,4280,65839

[Price]
FuturePrice,4345.50
```

**Note:** The `[Price]` section contains the current gold futures price scraped from investing.com. The EA reads this value for Delta calculation.
2. **Run:**
```bash
source venv/bin/activate
python main.py
```

## Configuration

Edit `.env` to customize:
### Edit `.env` File

- `PRODUCT_URL` - QuikStrike product page URL (requires login)
- `CME_LOGIN_URL` - CME login page URL (default: SSO URL)
- `TOP_N_STRIKES` - Number of top strikes to export (default: 3)
- `HEADLESS` - Run browser in headless mode (default: false for debugging)
- `CSV_OUTPUT_PATH` - Output CSV file path
- `TIMEOUT_SECONDS` - Page load timeout
Copy and edit the environment file:

### Available Products

**Gold (XAUUSD/COMEX Gold - OG|GC):**
```
PRODUCT_URL=https://cmegroup.quikstrike.net/User/QuikStrikeView.aspx?pid=40&viewitemid=IntegratedOpenInterestTool
```cmd
copy .env.example .env
notepad .env
```

**Silver:**
```
PRODUCT_URL=https://cmegroup.quikstrike.net/User/QuikStrikeView.aspx?pid=41&viewitemid=IntegratedOpenInterestTool
Required settings:
```env
CME_USERNAME=your_cme_username
CME_PASSWORD=your_cme_password
```

**SOFR (3M SOFR):**
```
PRODUCT_URL=https://cmegroup.quikstrike.net/User/QuikStrikeView.aspx?pid=476&viewitemid=IntegratedOpenInterestTool
Optional settings:
```env
# Number of top strikes to export (default: 3)
TOP_N_STRIKES=3

# Run browser without window (default: false)
HEADLESS=false

# Page timeout in seconds (default: 30)
TIMEOUT_SECONDS=30

# Output CSV path
CSV_OUTPUT_PATH=./oi_data.csv

# Logging level: DEBUG, INFO, WARNING, ERROR
LOG_LEVEL=INFO
```
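
How these optional settings might be read in Python can be sketched as follows. This is a minimal sketch that assumes the `.env` values have already been loaded into the process environment (for example by a dotenv loader). The `Settings` class and `load_settings` function are hypothetical names, and the fallbacks for `CSV_OUTPUT_PATH` and `LOG_LEVEL` simply reuse the sample values above rather than documented defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    top_n_strikes: int
    headless: bool
    timeout_seconds: int
    csv_output_path: str
    log_level: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, falling back to defaults."""
    env = os.environ if env is None else env
    return Settings(
        top_n_strikes=int(env.get("TOP_N_STRIKES", "3")),
        # Accept common truthy spellings; anything else counts as false.
        headless=env.get("HEADLESS", "false").strip().lower() in {"1", "true", "yes"},
        timeout_seconds=int(env.get("TIMEOUT_SECONDS", "30")),
        csv_output_path=env.get("CSV_OUTPUT_PATH", "./oi_data.csv"),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )


# Example with an explicit mapping instead of the real environment.
settings = load_settings({"HEADLESS": "True", "TIMEOUT_SECONDS": "60"})
```

Passing a plain dictionary, as in the last line, keeps the parsing testable without touching real environment variables.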

**Note:** You must be logged in to access QuikStrike data. The scraper will automatically login using credentials from `.env`.
## Output Format

The scraper exports to `oi_data.csv`:

```csv
Type,Strike,OI
CALL,4375.0,147
CALL,4450.0,173
CALL,4500.0,176
PUT,4435.0,49
PUT,4400.0,102
PUT,4515.0,150

[Price]
FuturePrice,4467.8
```

The `[Price]` section contains the current gold futures price scraped from investing.com.
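
To sanity-check a generated file before handing it to the EA, the two-part layout can be read back with the standard library. The sketch below assumes the layout is exactly as in the sample: a `Type,Strike,OI` table, a blank line, then a `[Price]` marker followed by a `FuturePrice` row. The function name `parse_oi_csv` is hypothetical and is not part of `main.py` or the EA.

```python
import csv
import io
from typing import List, Optional, Tuple


def parse_oi_csv(text: str) -> Tuple[List[Tuple[str, float, int]], Optional[float]]:
    """Split the file into (type, strike, oi) rows and the future price."""
    levels: List[Tuple[str, float, int]] = []
    price: Optional[float] = None
    in_price_section = False
    for row in csv.reader(io.StringIO(text)):
        if not row or not row[0].strip():
            continue  # skip the blank separator line
        key = row[0].strip()
        if key == "[Price]":
            in_price_section = True
        elif in_price_section:
            if key == "FuturePrice" and len(row) > 1:
                price = float(row[1])
        elif key != "Type":  # skip the header row
            levels.append((key, float(row[1]), int(row[2])))
    return levels, price


SAMPLE = """Type,Strike,OI
CALL,4375.0,147
PUT,4435.0,49

[Price]
FuturePrice,4467.8
"""

levels, price = parse_oi_csv(SAMPLE)
```

A missing `[Price]` section leaves `price` as `None`, which a caller can treat as a failed price scrape.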

## Session Persistence

The scraper saves login sessions to `cookies.json`:

- **First run:** Logs in with credentials, saves cookies
- **Subsequent runs:** Uses saved cookies if session is valid
- **Session expired:** Automatically logs in again and saves new cookies

This makes scheduled runs faster and reduces login attempts to CME servers.

To force a fresh login:
```cmd
del cookies.json
```
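
The load-or-login flow described in this section can be sketched with the standard library alone. The helper names `load_cookies` and `save_cookies` are hypothetical, and the assumption that `cookies.json` holds a JSON list of cookie objects is not confirmed by the source. In a Playwright script, the loaded list would typically be passed to `context.add_cookies(...)`, and the list to save would come from `context.cookies()`.

```python
import json
import tempfile
from pathlib import Path
from typing import List, Optional


def load_cookies(path: Path) -> Optional[List[dict]]:
    """Return saved cookies, or None when the file is missing or unusable."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return None  # caller should fall back to a credential login
    return data if isinstance(data, list) and data else None


def save_cookies(path: Path, cookies: List[dict]) -> None:
    """Persist cookies so the next run can skip the login step."""
    path.write_text(json.dumps(cookies, indent=2), encoding="utf-8")


# Demonstration in a temporary directory: the first run finds nothing,
# the second run reuses what the first one saved.
with tempfile.TemporaryDirectory() as tmp:
    cookie_file = Path(tmp) / "cookies.json"
    first_run = load_cookies(cookie_file)
    save_cookies(cookie_file, [{"name": "session", "value": "abc"}])
    second_run = load_cookies(cookie_file)
```

Treating a corrupt or empty file the same as a missing one means a damaged `cookies.json` degrades to a normal login instead of a crash.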

## Integration with EA

The EA reads OI data from CSV when `InpOISource = OI_SOURCE_CSV_FILE`.
The EA reads OI data from CSV when configured:
```mql5
input ENUM_OI_SOURCE InpOISource = OI_SOURCE_CSV_FILE;
```

Place the generated `oi_data.csv` in MetaTrader's `MQL5/Files` directory.
Copy `oi_data.csv` to your MT5 `MQL5/Files` directory:
```
C:\Users\YourUsername\AppData\Roaming\MetaQuotes\Terminal\Common\MQL5\Files\oi_data.csv
```

## Scheduling
## Automatic Daily Scheduling

Use cron or Windows Task Scheduler to run periodically:
### Windows Task Scheduler

1. **Create scheduled task:**
   - Open Task Scheduler (`taskschd.msc`)
   - Click "Create Task"

2. **Configure General tab:**
   - Name: `CME OI Scraper - Daily`
   - ✅ Run whether user is logged on or not
   - ✅ Run with highest privileges

3. **Configure Triggers tab:**
   - New → On a schedule → Daily
   - Start time: 9:00 AM (or your preferred time)
   - ✅ Enabled

4. **Configure Actions tab:**
   - Action: Start a program
   - Program/script:
     ```
     C:\Path\To\oi_scraper\run_scheduled.bat
     ```
   - Start in:
     ```
     C:\Path\To\oi_scraper
     ```

5. **Click OK to save**

### Linux/macOS (cron)

```bash
# Run every hour
0 * * * * cd /path/to/oi_scraper && python main.py
# Edit crontab
crontab -e

# Add line to run every day at 9 AM
0 9 * * * cd /path/to/oi_scraper && /path/to/venv/bin/python main.py
```

## Batch Files Reference

| File | Purpose |
|------|---------|
| `setup_env.bat` | One-time setup (creates virtual environment) |
| `run_with_venv.bat` | Manual run with visible window |
| `run_scheduled.bat` | For Task Scheduler (no window, no pause) |

## Troubleshooting

**Login fails:**
### Module Not Found Errors

**Error:** `ModuleNotFoundError: No module named 'playwright'`

**Solution:**
```cmd
run_with_venv.bat
```

The virtual environment ensures all dependencies are isolated.

### Login Fails

- Verify credentials in `.env`
- Check if CME requires 2FA
- Set `HEADLESS=false` to see what's happening
- Check screenshots: `login_failed.png`, `login_error.png`, `login_success.png`
- Check if CME requires 2FA (manual intervention needed)
- Set `HEADLESS=false` to see browser activity
- Check screenshots: `login_failed.png`, `login_error.png`

**No data extracted:**
- Check if table structure changed
- Increase `TIMEOUT_SECONDS`
- Check logs for detailed errors
- Screenshot saved as `login_debug.png` or `login_failed.png`
### No Data Extracted

**Login page selectors changed:**
- If the scraper can't find username/password inputs, CME may have updated their login page
- Update the selectors in `login_to_cme()` function in `main.py`:
```python
# Example: update to match current CME login form
page.fill('input[id="username"]', CME_USERNAME)
page.fill('input[id="password"]', CME_PASSWORD)
page.click('button[type="submit"]')
```
- Check if CME table structure changed
- Increase `TIMEOUT_SECONDS=60` in `.env`
- Check logs for errors
- Screenshot saved as `login_debug.png`

**Browser issues:**
- Install Chromium dependencies: `playwright install chromium`
- Try different browser: Change `p.chromium.launch()` to `p.firefox.launch()`
### Browser Issues

```cmd
REM Reinstall Chromium
python -m playwright install chromium
```

### Session Expires Frequently

Delete cookies to force fresh login:
```cmd
del cookies.json
```

### Check Python Path Issues (Windows)

```cmd
REM Check which Python is being used
where python

REM Use Python launcher
py -3 main.py

REM Or use the virtual environment
run_with_venv.bat
```

## Finding Product IDs

To scrape other instruments (Silver, Crude Oil, etc.):

1. Visit CME QuikStrike OI Heatmap
2. Log in to your CME account
3. Select a product from the dropdown
4. The URL updates with the `pid` parameter
5. Note: This scraper is configured for Gold by default
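
Once a `pid` is known, a `PRODUCT_URL` for another instrument can be derived from the Gold URL by swapping the query parameter. This is a minimal sketch using `urllib.parse`. The helper names `with_pid` and `get_pid` are hypothetical, and the sketch assumes other products use the same URL shape as the Gold URL documented in this README.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

GOLD_URL = (
    "https://cmegroup.quikstrike.net/User/QuikStrikeView.aspx"
    "?pid=40&viewitemid=IntegratedOpenInterestTool"
)


def get_pid(product_url: str) -> Optional[int]:
    """Extract the pid query parameter, or None if it is absent."""
    values = parse_qs(urlsplit(product_url).query).get("pid")
    return int(values[0]) if values else None


def with_pid(product_url: str, pid: int) -> str:
    """Return the same URL with its pid parameter replaced."""
    parts = urlsplit(product_url)
    query = parse_qs(parts.query)
    query["pid"] = [str(pid)]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


# pid=41 is the Silver product ID listed in this README.
silver_url = with_pid(GOLD_URL, 41)
```

The resulting string can be pasted into `.env` as the new `PRODUCT_URL` value.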

## Notes

- The scraper targets the OI Heatmap table structure
- Only exports top N strikes by OI volume
- Login session is not persisted (login each run)
- Cookies could be saved for faster subsequent runs
- Targets the OI Heatmap table structure
- Exports top N unique strikes by OI volume
- Uses session cookies for faster subsequent runs
- CME sessions typically last several days to weeks
- Virtual environment recommended to avoid Python path conflicts

### Finding Product IDs
## Files

To find product IDs for other instruments:
1. Visit https://www.cmegroup.com/tools-information/quikstrike/open-interest-heatmap.html
2. Log in to your CME account
3. Select a product from the "Products" menu
4. The URL will update with the `pid` parameter
5. Copy that URL to your `.env` file

Example: `https://www.cmegroup.com/tools-information/quikstrike/open-interest-heatmap.html?pid=40` (Gold)
```
oi_scraper/
├── main.py              # Main scraper script
├── requirements.txt     # Python dependencies
├── .env.example         # Environment template
├── .env                 # Your credentials (create from example)
├── setup_env.bat        # Windows: Create virtual environment
├── run_with_venv.bat    # Windows: Manual run
├── run_scheduled.bat    # Windows: Task Scheduler run
├── oi_data.csv          # Output file (generated)
├── cookies.json         # Session cookies (generated)
└── scraper.log          # Log file (generated)
```