Documentation Index Fetch the complete documentation index at: https://docs.fieldfunded.com/llms.txt
Use this file to discover all available pages before exploring further.
Train Sports Prediction Models with Odds Data
Odds carry more predictive signal than any stats dataset. A bookmaker’s closing line is the market’s consensus probability for an outcome, refined by millions of dollars of action. This guide shows how to use that signal.
What You’ll Use
| Endpoint | Purpose |
| --- | --- |
| GET /v1/events | Get pre-match events with odds |
| GET /v1/events/{id}/odds | Track odds over time |
| GET /v1/events/{id}/result | Get actual outcomes for training labels |
| GET /v1/settlements | Verify settlement for accuracy |
Why Odds > Stats
| Feature source | Problem |
| --- | --- |
| Player stats | Doesn't account for team dynamics, injuries, motivation |
| Historical records | Past performance ≠ future results |
| Bookmaker odds | Already incorporates ALL available info + money flows |
Implied probability from odds = the market’s best estimate. Your model’s job is to find where the market is slightly wrong.
Step 1: Collect Pre-Match Odds (Python)
import requests
import sqlite3
import time
from datetime import datetime
# --- Configuration ---
API_KEY = "your_api_key"  # replace with your real key
BASE = "https://api.fieldfunded.com/v1"
HEADERS = {"X-API-Key": API_KEY}

# --- Local storage: one SQLite file holds both odds snapshots and results ---
db = sqlite3.connect("odds_data.db")

# Each row is one (event, market, selection) odds observation at a point in
# time, so repeated polling builds a per-selection time series.
db.execute("""
CREATE TABLE IF NOT EXISTS snapshots (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TEXT,
    event_id TEXT,
    home_team TEXT,
    away_team TEXT,
    league TEXT,
    market TEXT,
    selection TEXT,
    odds REAL
)
""")

# Final outcomes, one row per event -- these become the training labels.
db.execute("""
CREATE TABLE IF NOT EXISTS results (
    event_id TEXT PRIMARY KEY,
    home_score INTEGER,
    away_score INTEGER,
    status TEXT,
    winner TEXT
)
""")
def snapshot_odds(sport="soccer"):
    """Record the current pre-match odds for upcoming *sport* events.

    Fetches events starting within 48h, then for each of the first 30
    events pulls its odds and inserts one snapshot row per selection in
    the main markets. Call repeatedly (e.g. from cron) to build a time
    series of odds per selection.
    """
    resp = requests.get(
        f"{BASE}/events",
        headers=HEADERS,
        params={"sport": sport, "status": "prematch", "starts_within": "48h"},
        timeout=30,  # never let a polling job hang on a stalled connection
    )
    resp.raise_for_status()  # fail loudly on auth/rate-limit errors
    events = resp.json().get("events", [])
    for event in events[:30]:
        detail = requests.get(
            f"{BASE}/events/{event['id']}/odds",
            headers=HEADERS,
            timeout=30,
        ).json()
        for market in detail.get("markets", [])[:3]:  # main markets only
            for sel in market.get("selections", []):
                db.execute(
                    "INSERT INTO snapshots (timestamp, event_id, home_team, away_team, league, market, selection, odds) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
                    (datetime.now().isoformat(), event["id"], event["home_team"],
                     event["away_team"], event["league"], market["name"],
                     sel["name"], sel["odds"]),
                )
        time.sleep(0.2)  # brief pause between per-event calls to respect rate limits
    db.commit()
Step 2: Collect Results (Training Labels)
def collect_results(sport="soccer"):
    """Fetch final scores for ended *sport* events and store them as labels.

    Derives ``winner`` ('home' / 'away' / 'draw') from the score so the
    label lines up with the Match Winner selections used for features.
    Safe to re-run: results are upserted by event_id.
    """
    resp = requests.get(
        f"{BASE}/events",
        headers=HEADERS,
        params={"sport": sport, "status": "ended"},
        timeout=30,  # bound every network call
    )
    for event in resp.json().get("events", []):
        result = requests.get(
            f"{BASE}/events/{event['id']}/result",
            headers=HEADERS,
            timeout=30,
        ).json()
        # Skip events with no score payload (voided / not yet settled).
        if result.get("score"):
            home = result["score"]["home"]
            away = result["score"]["away"]
            winner = "home" if home > away else "away" if away > home else "draw"
            # INSERT OR REPLACE keeps the table idempotent across re-runs.
            db.execute(
                "INSERT OR REPLACE INTO results (event_id, home_score, away_score, status, winner) VALUES (?, ?, ?, ?, ?)",
                (event["id"], home, away, result["status"], winner),
            )
    db.commit()
import pandas as pd
def build_features():
    """Join closing Match Winner odds with results; one row per event.

    Returns a DataFrame with event_id, home_team, away_team, winner, the
    raw Home/Draw/Away closing odds, and normalized implied probabilities
    (prob_home / prob_draw / prob_away) that sum to 1 per event.
    """
    # "Closing" odds = the last snapshot per (event, market, selection);
    # MAX(id) works because the autoincrement id is monotonically increasing.
    df = pd.read_sql("""
        SELECT s.event_id, s.home_team, s.away_team, s.market, s.selection, s.odds,
               r.winner, r.home_score, r.away_score
        FROM snapshots s
        JOIN results r ON s.event_id = r.event_id
        WHERE s.market = 'Match Winner'
          AND s.id IN (
              SELECT MAX(id) FROM snapshots
              GROUP BY event_id, market, selection
          )
    """, db)

    # Pivot: one row per event with one odds column per selection.
    pivoted = df.pivot_table(
        index=['event_id', 'home_team', 'away_team', 'winner'],
        columns='selection',
        values='odds',
    ).reset_index()

    # Raw implied probability is 1/odds ...
    for col in ['Home', 'Draw', 'Away']:
        if col in pivoted.columns:
            pivoted[f'prob_{col.lower()}'] = 1 / pivoted[col]

    # ... then normalize so the probabilities sum to 1, removing the
    # bookmaker's overround (their built-in margin).
    prob_cols = [c for c in pivoted.columns if c.startswith('prob_')]
    total = pivoted[prob_cols].sum(axis=1)
    for col in prob_cols:
        pivoted[col] = pivoted[col] / total
    return pivoted
Step 4: Train a Simple Model
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
# Build the dataset: one row per event, the market's implied probabilities
# as features, the actual winner as the label.
df = build_features()

X = df[['prob_home', 'prob_draw', 'prob_away']]
y = df['winner']

# Hold out 20% for evaluation; a fixed seed keeps the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, predictions):.3f}")
print(classification_report(y_test, predictions))
Step 5: Evaluate — Did You Beat the Market?
The real test isn't accuracy — it's whether your model finds profitable edges:
# Simulate flat-stake betting on the model's test-set predictions.
bankroll = 1000
stake = 10

for i, row in X_test.iterrows():
    pred = model.predict([row])[0]
    actual = y_test.loc[i]
    # Look up the closing odds for the outcome the model picked; X_test
    # keeps df's original index, so .loc[i] lines the rows up.
    odds_col = {'home': 'Home', 'draw': 'Draw', 'away': 'Away'}[pred]
    odds = df.loc[i, odds_col]
    if pred == actual:
        bankroll += stake * (odds - 1)  # a winning bet pays odds - 1 per unit staked
    else:
        bankroll -= stake  # a losing bet forfeits the stake

print(f"Final bankroll: ${bankroll:.2f} (started at $1000)")
print(f"ROI: {((bankroll - 1000) / 1000 * 100):.1f}%")
What’s Next
Line movement as features : Track odds over time, use the change velocity as input
LSTM on sequences : Feed sequences of odds snapshots into a recurrent network
Cross-sport transfer : Train on high-volume sports (soccer), apply to lower-volume
Events API Reference Event listing with filters →
Results API Reference Get final scores →
Get Your Free API Key Start collecting odds data today — 10,000 free requests/month