Initial commit: Arbitrage

newkle3r
2026-05-15 15:06:30 +02:00
commit c9406410ae
12 changed files with 51372 additions and 0 deletions
+7
@@ -0,0 +1,7 @@
# The Odds API — https://the-odds-api.com/
# Copy to .env and set your key before running odds fetch scripts.
ODDS_API_KEY=your_api_key_here
# Default regions and markets (optional; fetch_odds.py currently takes these from its --regions and --markets flags)
ODDS_REGIONS=uk,eu
ODDS_MARKETS=h2h
+50
@@ -0,0 +1,50 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
# Virtual environments
.venv/
venv/
arb-venv/
env/
ENV/
# Secrets and local config
.env
.env.*
!.env.example
# IDE
.idea/
*.swp
*.swo
# OS
.DS_Store
Thumbs.db
# Jupyter
.ipynb_checkpoints/
# pytest / coverage
.pytest_cache/
.coverage
htmlcov/
+14
@@ -0,0 +1,14 @@
{
"files.exclude": {
"**/.git": true,
"**/.svn": true,
"**/.hg": true,
"**/.DS_Store": true,
"**/Thumbs.db": true,
"**/CVS": true,
"**/.retool_types/**": true,
"**/*tsconfig.json": true,
".cache": true,
"retool.config.json": true
}
}
+62
@@ -0,0 +1,62 @@
# Arbitrage
Tools and data for exploring **sports betting odds** and (eventually) **arbitrage opportunities** across bookmakers.
## Project status
| Area | Status |
|------|--------|
| Odds API sample data (boxing) | Included under `odds/data/samples/` |
| Odds fetch script | `odds/scripts/fetch_odds.py` |
| Arbitrage detection engine | Not implemented yet |
## Repository layout
```
arbitrage/
├── docs/ # Architecture and data schema
├── odds/
│ ├── data/samples/ # Cached API responses (JSON)
│ └── scripts/ # fetch_odds.py
├── requirements.txt
└── .env.example # ODDS_API_KEY template
```
## Quick start
### 1. Clone and create a virtual environment
```bash
python -m venv .venv
# Windows
.venv\Scripts\activate
# macOS / Linux
source .venv/bin/activate
pip install -r requirements.txt
```
### 2. Configure The Odds API (for live pulls)
1. Sign up at [The Odds API](https://the-odds-api.com/).
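2. Copy `.env.example` to `.env`, set `ODDS_API_KEY`, and export it in your shell. `fetch_odds.py` reads the variable from the process environment and does not load `.env` itself.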
3. Fetch fresh odds:
```bash
python odds/scripts/fetch_odds.py --sport boxing_boxing --out odds/data/samples/boxing_odds.json
```
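`fetch_odds.py` reads `ODDS_API_KEY` from the process environment only, so values kept in `.env` have to reach the environment somehow. One dependency-free way to do that is sketched below. `load_env_file` and `export_env` are hypothetical helpers that are not part of this repository, and the parser handles only simple `KEY=value` lines.

```python
import os
from pathlib import Path


def load_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=value lines (hypothetical helper, not in the repo)."""
    values: dict[str, str] = {}
    if not path.exists():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def export_env(values: dict[str, str]) -> None:
    """Copy parsed values into os.environ without overriding existing ones."""
    for key, value in values.items():
        os.environ.setdefault(key, value)


if __name__ == "__main__":
    export_env(load_env_file(Path(".env")))
    print("ODDS_API_KEY set:", bool(os.environ.get("ODDS_API_KEY")))
```

Variables already present in the shell take precedence here, which keeps a one-off `ODDS_API_KEY=... python ...` override working.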
## Documentation
- [Architecture](docs/ARCHITECTURE.md) — how modules fit together
- [Odds data schema](docs/DATA_SCHEMA.md) — JSON structure from The Odds API
- [Odds module](odds/README.md) — fetching and sample data
## What is not in this repo
- Virtual environments (`arb-venv/`, `.venv/`) — create locally; see `.gitignore`
- API keys — use `.env` (never commit)
## License
Add a license file before publishing if you intend open-source distribution.
+53
@@ -0,0 +1,53 @@
# Architecture
## Overview
```mermaid
flowchart LR
subgraph external [External]
API[The Odds API]
end
subgraph repo [Repository]
Fetch[odds/scripts/fetch_odds.py]
Samples[(odds/data/samples/*.json)]
Arb[Arbitrage engine - planned]
end
API --> Fetch --> Samples
Samples --> Arb
```
## Odds pipeline (current and planned)
### Current
1. Operator sets `ODDS_API_KEY` in `.env` and exports it to the process environment
2. `fetch_odds.py` calls `GET /v4/sports/{sport_key}/odds`
3. Response JSON is stored under `odds/data/samples/`
### Planned
| Stage | Responsibility |
|-------|----------------|
| Loader | Read JSON snapshots or live API |
| Normalizer | One row per (event, bookmaker, market, outcome) |
| Arb scanner | Compare implied probabilities across books |
| Reporter | CLI or export of opportunities above margin threshold |
## Configuration
| Variable | Used by | Description |
|----------|---------|-------------|
| `ODDS_API_KEY` | `fetch_odds.py` | The Odds API authentication |
| `ODDS_REGIONS` | (optional future) | Default regions for fetch |
| `ODDS_MARKETS` | (optional future) | Default markets for fetch |
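`ODDS_REGIONS` and `ODDS_MARKETS` are not read by the current script, which takes `--regions` and `--markets` flags instead. If they are wired in later, one option would be to use them as defaults for those flags. The sketch below is an assumption about that future behaviour, and `env_default` is a hypothetical helper.

```python
import os


def env_default(name: str, fallback: str) -> str:
    """Return a non-empty environment value, else the fallback (hypothetical helper)."""
    value = os.environ.get(name, "").strip()
    return value or fallback


# Fallbacks mirror DEFAULT_REGIONS and DEFAULT_MARKETS in fetch_odds.py.
regions = env_default("ODDS_REGIONS", "uk,eu")
markets = env_default("ODDS_MARKETS", "h2h")
print(regions, markets)
```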
## Dependencies
See root `requirements.txt`. Odds fetch uses `requests`.
## Out of scope for git
- `arb-venv/`, `.venv/` — local virtualenvs
- `.env` — secrets
+102
@@ -0,0 +1,102 @@
# Odds data schema
Sample file: `odds/data/samples/boxing_odds.json`
Source: [The Odds API](https://the-odds-api.com/) v4 — `GET /sports/{sport}/odds`
## Top level
The response is a **JSON array** of event objects.
```json
[
  {
    "id": "7605c958b8ffbe29c0dcb81e4d2c8a10",
    "sport_key": "boxing_boxing",
    "sport_title": "Boxing",
    "commence_time": "2025-05-10T16:00:00Z",
    "home_team": "Charlie Senior",
    "away_team": "Cesar Ignacio Paredes",
    "bookmakers": [ ... ]
  }
]
```
## Event fields
| Field | Type | Description |
|-------|------|-------------|
| `id` | string | Unique event id |
| `sport_key` | string | API sport identifier |
| `sport_title` | string | Human-readable sport name |
| `commence_time` | ISO 8601 UTC | Scheduled start |
| `home_team` | string | Home / first fighter name |
| `away_team` | string | Away / second fighter name |
| `bookmakers` | array | List of books offering lines on this event |
## Bookmaker object
| Field | Type | Description |
|-------|------|-------------|
| `key` | string | Bookmaker slug (e.g. `unibet_uk`) |
| `title` | string | Display name |
| `last_update` | ISO 8601 | When this book's lines were updated |
| `link` | string \| null | Deep link to event on book site |
| `sid` | string \| null | Book-specific event id |
| `markets` | array | Markets offered for this event |
## Market object
| Field | Type | Description |
|-------|------|-------------|
| `key` | string | Market type: `h2h` (moneyline), `h2h_lay` (exchange lay), etc. |
| `last_update` | ISO 8601 | Market update time |
| `outcomes` | array | Selections and prices |
## Outcome object
| Field | Type | Description |
|-------|------|-------------|
| `name` | string | Fighter name or `Draw` |
| `price` | number | **Decimal** odds (European format) |
| `link` | string \| null | Selection deep link |
| `sid` | string \| null | Selection id at book |
| `bet_limit` | number \| null | Max stake if provided |
## Implied probability (for arbitrage)
For decimal odds `d`:
```
implied_probability = 1 / d
```
For a **single** book's full market (all mutually exclusive outcomes), a sum of implied probabilities greater than 1 indicates the book's margin (overround).
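As a small worked illustration, the sketch below converts one book's prices to implied probabilities and sums them. The function names are hypothetical and the prices are made-up values, not taken from the sample file.

```python
def implied_probability(decimal_odds: float) -> float:
    """Implied probability of a single outcome priced in decimal odds."""
    if decimal_odds <= 0:
        raise ValueError("decimal odds must be positive")
    return 1.0 / decimal_odds


def overround(prices: list[float]) -> float:
    """Sum of implied probabilities minus 1 for one book's full market."""
    return sum(implied_probability(p) for p in prices) - 1.0


# Illustrative two-way market from a single book (made-up prices).
book_prices = [1.80, 2.00]
print(f"overround: {overround(book_prices):.4f}")
```

A positive result is the book's margin; a fair two-way market priced at 2.00 on both sides would give zero.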
**Cross-book arbitrage** (simplified): for each outcome, take the **best** (highest) decimal odds across books, then:
```
arb_sum = sum(1 / best_odds_i for each outcome i)
```
If `arb_sum < 1`, a risk-free profit is theoretically possible before fees and limits.
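A minimal sketch of the check described above, assuming quotes have already been reduced to `(bookmaker_key, outcome_name, price)` tuples for one event and one market. The function names, bookmaker keys, and prices are illustrative assumptions, not code or data from this repository.

```python
from typing import Iterable


def best_prices(quotes: Iterable[tuple[str, str, float]]) -> dict[str, tuple[str, float]]:
    """Keep the highest decimal price per outcome, remembering which book offers it."""
    best: dict[str, tuple[str, float]] = {}
    for book, outcome, price in quotes:
        if outcome not in best or price > best[outcome][1]:
            best[outcome] = (book, price)
    return best


def arb_sum(best: dict[str, tuple[str, float]]) -> float:
    """Sum of inverse best odds; a value below 1 signals a theoretical arbitrage."""
    return sum(1.0 / price for _, price in best.values())


# Made-up quotes for one two-outcome event: (bookmaker_key, outcome_name, price).
quotes = [
    ("book_a", "Fighter A", 2.10),
    ("book_a", "Fighter B", 1.80),
    ("book_b", "Fighter A", 1.90),
    ("book_b", "Fighter B", 2.05),
]
best = best_prices(quotes)
total = arb_sum(best)
print(best)
print(f"arb_sum = {total:.4f}, arbitrage: {total < 1}")
```

The sum is only meaningful when every possible outcome of the market is covered (including `Draw` where it is offered), and back prices from `h2h` should not be mixed with lay prices from `h2h_lay`.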
## Normalized row (recommended for future code)
| Column | Example |
|--------|---------|
| `event_id` | `7605c958...` |
| `commence_time` | `2025-05-10T16:00:00Z` |
| `home_team` | `Charlie Senior` |
| `away_team` | `Cesar Ignacio Paredes` |
| `bookmaker_key` | `unibet_uk` |
| `market_key` | `h2h` |
| `outcome_name` | `Charlie Senior` |
| `price` | `1.03` |
| `implied_prob` | `0.9709` |
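One possible way to produce rows in this shape from the API response is sketched below. `normalize_events` is a hypothetical function, since no normalizer exists in the repository yet. The inline event reuses the id and names from the examples above, while the second price is an assumed value for illustration.

```python
from typing import Iterator


def normalize_events(events: list[dict]) -> Iterator[dict]:
    """Yield one flat row per (event, bookmaker, market, outcome)."""
    for event in events:
        for book in event.get("bookmakers", []):
            for market in book.get("markets", []):
                for outcome in market.get("outcomes", []):
                    price = outcome["price"]
                    yield {
                        "event_id": event["id"],
                        "commence_time": event["commence_time"],
                        "home_team": event["home_team"],
                        "away_team": event["away_team"],
                        "bookmaker_key": book["key"],
                        "market_key": market["key"],
                        "outcome_name": outcome["name"],
                        "price": price,
                        "implied_prob": round(1.0 / price, 4),
                    }


# Trimmed event in the documented shape; the 11.0 price is made up.
sample = [{
    "id": "7605c958b8ffbe29c0dcb81e4d2c8a10",
    "commence_time": "2025-05-10T16:00:00Z",
    "home_team": "Charlie Senior",
    "away_team": "Cesar Ignacio Paredes",
    "bookmakers": [{
        "key": "unibet_uk",
        "markets": [{
            "key": "h2h",
            "outcomes": [
                {"name": "Charlie Senior", "price": 1.03},
                {"name": "Cesar Ignacio Paredes", "price": 11.0},
            ],
        }],
    }],
}]
rows = list(normalize_events(sample))
for row in rows:
    print(row["bookmaker_key"], row["outcome_name"], row["price"], row["implied_prob"])
```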
## API usage notes
- Responses count against your API quota; check the `x-requests-remaining` response header.
- Region and market query params filter payload size; sample uses `uk,eu` and `h2h`.
- Prices are decimal in this project (`oddsFormat=decimal` in fetch script).
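The quota header mentioned in these notes can be read from the response headers. The sketch below is self-contained and does not call the API. `remaining_quota` is a hypothetical helper that accepts any mapping of headers, so it can be tried with a plain dict.

```python
from typing import Mapping, Optional


def remaining_quota(headers: Mapping[str, str]) -> Optional[int]:
    """Return x-requests-remaining as an int, or None if absent or malformed."""
    raw = headers.get("x-requests-remaining")
    if raw is None:
        return None
    try:
        return int(float(raw))
    except ValueError:
        return None


# Plain dict standing in for response.headers (the value is made up).
print(remaining_quota({"x-requests-remaining": "487"}))
```

With a real `requests` response, `response.headers` can be passed directly; its lookups are case-insensitive, unlike the plain dict used here.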
+49
@@ -0,0 +1,49 @@
# Odds module
Handles **ingestion and storage** of sports odds from [The Odds API](https://the-odds-api.com/). Arbitrage logic will consume normalized odds from this layer once implemented.
## Contents
| Path | Description |
|------|-------------|
| `data/samples/boxing_odds.json` | Snapshot of boxing `h2h` odds (~1 MB) |
| `scripts/fetch_odds.py` | CLI to pull live odds and save JSON |
Legacy path `api-pull/Odds/boxing_odds.json` may still exist locally; prefer `data/samples/` for new work.
## Fetching odds
```bash
# From repository root, with ODDS_API_KEY set
python odds/scripts/fetch_odds.py \
--sport boxing_boxing \
--regions uk,eu \
--markets h2h \
--out odds/data/samples/boxing_odds.json
```
### Script functions
| Function | Purpose |
|----------|---------|
| `get_api_key()` | Reads `ODDS_API_KEY` from the environment |
| `fetch_odds(sport, regions, markets, api_key)` | HTTP GET to `/v4/sports/{sport}/odds` |
| `save_odds(data, output_path)` | Writes pretty-printed JSON |
| `main()` | Parses CLI args and runs fetch + save |
## Sample data
`boxing_odds.json` is an array of **events**. Each event has `home_team`, `away_team`, `commence_time`, and a `bookmakers` list. Each bookmaker exposes `markets` (e.g. `h2h` head-to-head) with `outcomes` and decimal `price` values.
See [docs/DATA_SCHEMA.md](../docs/DATA_SCHEMA.md) for the full field reference.
## Next steps (arbitrage)
Planned pipeline:
1. **Normalize** — flatten bookmaker/outcome rows per event
2. **Implied probability** — `1 / decimal_odds` per outcome
3. **Cross-book compare** — find outcomes where sum of best inverse odds < 1 (arb margin)
4. **Stake split** — optional Kelly / equal-profit calculators
None of steps 2–4 exist in code yet; this folder is ingestion-only today.
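Step 4 could, for example, use an equal-profit split: stake each outcome in proportion to its inverse odds so that the payout is the same whichever outcome wins. The sketch below is one such calculator under that assumption. It ignores fees, stake limits, and rounding; the function name is hypothetical and the prices are made up.

```python
def equal_profit_stakes(best_odds: dict[str, float], total_stake: float) -> dict[str, float]:
    """Split total_stake so every outcome returns the same payout."""
    inverse = {name: 1.0 / price for name, price in best_odds.items()}
    arb_sum = sum(inverse.values())
    return {name: total_stake * inv / arb_sum for name, inv in inverse.items()}


# Best available decimal odds per outcome across books (made-up values).
odds = {"Fighter A": 2.10, "Fighter B": 2.05}
stakes = equal_profit_stakes(odds, total_stake=100.0)
payouts = {name: stakes[name] * odds[name] for name in odds}

for name in odds:
    print(f"{name}: stake {stakes[name]:.2f}, payout {payouts[name]:.2f}")
```

Each payout equals `total_stake / arb_sum`, so the split shows a profit only when the sum of inverse best odds is below 1.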
File diff suppressed because it is too large
+9
@@ -0,0 +1,9 @@
# Legacy path
Sample odds were originally stored here. Use the current path instead:
| Old | New |
|-----|-----|
| `api-pull/Odds/boxing_odds.json` | `odds/data/samples/boxing_odds.json` |
You may delete `api-pull/` locally after confirming the copy exists under `odds/data/samples/`.
File diff suppressed because it is too large
+103
@@ -0,0 +1,103 @@
"""
Fetch sports odds from The Odds API and save JSON snapshots.

Requires ODDS_API_KEY in the process environment. This script does not load
.env itself, so export the values from .env first (see project .env.example).

Usage:
    python odds/scripts/fetch_odds.py --sport boxing_boxing --out odds/data/samples/boxing_odds.json
"""

from __future__ import annotations

import argparse
import json
import os
import sys
from pathlib import Path

try:
    import requests
except ImportError:
    print("Install dependencies: pip install -r requirements.txt", file=sys.stderr)
    raise

API_BASE = "https://api.the-odds-api.com/v4"
DEFAULT_SPORT = "boxing_boxing"
DEFAULT_REGIONS = "uk,eu"
DEFAULT_MARKETS = "h2h"


def get_api_key() -> str:
    """Read API key from ODDS_API_KEY environment variable."""
    key = os.environ.get("ODDS_API_KEY", "").strip()
    if not key:
        raise SystemExit(
            "Missing ODDS_API_KEY. Copy .env.example to .env, set your key, "
            "and export it to the environment before running."
        )
    return key


def fetch_odds(
    sport: str,
    regions: str,
    markets: str,
    api_key: str,
) -> list[dict]:
    """
    GET odds for a sport from The Odds API.

    Args:
        sport: Sport key (e.g. boxing_boxing).
        regions: Comma-separated region codes.
        markets: Comma-separated market keys.
        api_key: The Odds API key.

    Returns:
        Parsed JSON list of events.

    Raises:
        requests.HTTPError: On non-2xx response.
    """
    url = f"{API_BASE}/sports/{sport}/odds"
    params = {
        "apiKey": api_key,
        "regions": regions,
        "markets": markets,
        "oddsFormat": "decimal",
    }
    response = requests.get(url, params=params, timeout=60)
    response.raise_for_status()
    return response.json()


def save_odds(data: list[dict], output_path: Path) -> None:
    """Write odds JSON to disk with stable formatting."""
    # Create the target directory on first run so --out can point anywhere.
    output_path.parent.mkdir(parents=True, exist_ok=True)
    with output_path.open("w", encoding="utf-8") as f:
        json.dump(data, f, indent=4, ensure_ascii=False)
    print(f"Saved {len(data)} events to {output_path}")


def parse_args() -> argparse.Namespace:
    """Parse command-line options; every flag has a default."""
    parser = argparse.ArgumentParser(description="Fetch odds from The Odds API")
    parser.add_argument("--sport", default=DEFAULT_SPORT, help="Sport key")
    parser.add_argument("--regions", default=DEFAULT_REGIONS, help="Region codes")
    parser.add_argument("--markets", default=DEFAULT_MARKETS, help="Market keys")
    parser.add_argument(
        "--out",
        type=Path,
        default=Path("odds/data/samples/boxing_odds.json"),
        help="Output JSON path",
    )
    return parser.parse_args()


def main() -> None:
    """Fetch odds for the requested sport and write them to --out."""
    args = parse_args()
    api_key = get_api_key()
    data = fetch_odds(args.sport, args.regions, args.markets, api_key)
    save_odds(data, args.out)


if __name__ == "__main__":
    main()
+1
@@ -0,0 +1 @@
requests>=2.31.0