Capstone - Complete Analytics System
Fantasy Football & Betting Analytics
Fantasy football and sports betting represent two of the most popular applications of football analytics outside professional clubs. Both require predicting player and match outcomes, but with different optimization goals and constraints.
Learning Objectives
- Understand FPL scoring systems and optimization strategies
- Project player points using expected metrics
- Build fixture difficulty ratings and rotation planners
- Understand betting market efficiency and value betting
- Calculate implied probabilities from odds
- Apply responsible gambling principles
Important Note
This chapter discusses betting analytics from an educational perspective. Always gamble responsibly and be aware of the risks. The house has an edge, and no model guarantees profits.
Fantasy Premier League Analytics
Fantasy Premier League (FPL) is the world's most popular fantasy football game. Analytics can help optimize squad selection, captain choices, and transfer strategy.
FPL Scoring Basics
- Minutes: 1pt (1-59 min), 2pts (60+ min)
- Goals: 4pts (FWD), 5pts (MID), 6pts (DEF/GK)
- Assists: 3pts all positions
- Clean Sheet: 4pts (DEF/GK), 1pt (MID)
- Saves: 1pt per 3 saves (GK)
- Bonus: 1-3pts for top performers
Key Predictive Metrics
- xG: Predicts goals scored
- xA: Predicts assists
- xGC: Expected goals conceded (clean sheets)
- xPoints: Expected FPL points
- ICT Index: FPL's influence/creativity/threat
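The scoring rules above can be sketched as a small deterministic scorer for a finished match (distinct from the expected-points model that follows). The point values are the standard FPL core rules; the helper function itself is illustrative:

```python
# Illustrative only: turn the scoring rules above into actual (not expected) points
GOAL_PTS = {"GKP": 6, "DEF": 6, "MID": 5, "FWD": 4}
CS_PTS = {"GKP": 4, "DEF": 4, "MID": 1, "FWD": 0}

def score_stat_line(position, minutes, goals=0, assists=0,
                    clean_sheet=False, saves=0, bonus=0):
    """Actual FPL points for one finished match (core rules only)."""
    pts = 0 if minutes == 0 else (2 if minutes >= 60 else 1)
    pts += goals * GOAL_PTS[position]
    pts += assists * 3
    if clean_sheet and minutes >= 60:
        pts += CS_PTS[position]
    if position == "GKP":
        pts += saves // 3  # 1pt per 3 saves
    return pts + bonus

# A midfielder with 75 min, 1 goal, 1 assist and a clean sheet: 2+5+3+1 = 11
print(score_stat_line("MID", 75, goals=1, assists=1, clean_sheet=True))  # 11
```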
# Python: FPL expected points model
import pandas as pd
import numpy as np
class FPLProjector:
"""Project expected FPL points for players."""
GOAL_POINTS = {"GKP": 6, "DEF": 6, "MID": 5, "FWD": 4}
CS_POINTS = {"GKP": 4, "DEF": 4, "MID": 1, "FWD": 0}
def __init__(self, player_data):
self.data = player_data.copy()
def calculate_xpoints(self):
"""Calculate expected points for all players."""
df = self.data
# Minutes points
df["xpts_minutes"] = np.where(df["expected_minutes"] >= 60, 2,
np.where(df["expected_minutes"] >= 1, 1, 0))
# Goal points
df["xpts_goals"] = df.apply(
lambda x: x["xg"] * self.GOAL_POINTS.get(x["position"], 4),
axis=1
)
# Assist points
df["xpts_assists"] = df["xa"] * 3
# Clean sheet probability (Poisson: P(0) = e^(-lambda))
df["cs_prob"] = np.exp(-df["xgc"].fillna(2))
df["xpts_cs"] = df.apply(
lambda x: x["cs_prob"] * self.CS_POINTS.get(x["position"], 0)
if x["expected_minutes"] >= 60 else 0,
axis=1
)
# Save points (GK only)
df["xpts_saves"] = np.where(
df["position"] == "GKP",
df["expected_saves"].fillna(0) / 3,
0
)
# Bonus points estimate (simplified)
df["xpts_bonus"] = df["xg"] * 0.8 + df["xa"] * 0.5
# Total
df["xpoints"] = (df["xpts_minutes"] + df["xpts_goals"] +
df["xpts_assists"] + df["xpts_cs"] +
df["xpts_saves"] + df["xpts_bonus"])
# Value (points per million)
df["value"] = df["xpoints"] / df["price"]  # price is already in £m
return df
def rank_by_value(self, position=None, min_price=None, max_price=None):
"""Rank players by value with optional filters."""
df = self.calculate_xpoints()
if position:
df = df[df["position"] == position]
if min_price:
df = df[df["price"] >= min_price]
if max_price:
df = df[df["price"] <= max_price]
return df.sort_values("value", ascending=False)
def captain_picks(self, gameweek_fixtures):
"""Recommend captain picks for gameweek."""
df = self.calculate_xpoints()
# Factor in fixture difficulty
df = df.merge(gameweek_fixtures, on="team")
df["adjusted_xpts"] = df["xpoints"] * (1 + (3 - df["fdr"]) * 0.1)
return df.nlargest(5, "adjusted_xpts")[
["name", "position", "xpoints", "fdr", "adjusted_xpts"]
]
# Example usage
players = pd.DataFrame({
"name": ["Haaland", "Salah", "Trippier", "Raya", "Saka"],
"position": ["FWD", "MID", "DEF", "GKP", "MID"],
"team": ["MCI", "LIV", "NEW", "ARS", "ARS"],
"price": [14.0, 12.5, 6.5, 5.5, 9.0],
"xg": [0.85, 0.52, 0.08, 0.0, 0.35],
"xa": [0.12, 0.35, 0.22, 0.0, 0.28],
"xgc": [np.nan, np.nan, 1.1, 0.95, np.nan],
"expected_minutes": [85, 88, 90, 90, 85],
"expected_saves": [np.nan, np.nan, np.nan, 3.2, np.nan]
})
projector = FPLProjector(players)
results = projector.calculate_xpoints()
print(results[["name", "position", "price", "xpoints", "value"]].to_string())
# R: FPL expected points model
library(tidyverse)
# Calculate expected FPL points
calculate_xpoints <- function(player_data) {
player_data %>%
mutate(
# Base points for playing
xpoints_minutes = case_when(
expected_minutes >= 60 ~ 2,
expected_minutes >= 1 ~ 1,
TRUE ~ 0
),
# Goals (position-dependent)
goal_points = case_when(
position == "GKP" ~ 6,
position == "DEF" ~ 6,
position == "MID" ~ 5,
position == "FWD" ~ 4
),
xpoints_goals = xg * goal_points,
# Assists
xpoints_assists = xa * 3,
# Clean sheets (for defenders and goalkeepers)
cs_probability = exp(-coalesce(xgc, 2)), # Poisson P(0 goals); assume 2 xGC if unknown
cs_points = case_when(
position %in% c("GKP", "DEF") ~ 4,
position == "MID" ~ 1,
TRUE ~ 0
),
xpoints_cs = cs_probability * cs_points * (expected_minutes >= 60),
# Saves (goalkeepers only)
xpoints_saves = if_else(position == "GKP",
expected_saves / 3, 0),
# Total expected points
xpoints = xpoints_minutes + xpoints_goals + xpoints_assists +
xpoints_cs + xpoints_saves,
# Value calculation (price is already in £m)
value = xpoints / price
)
}
# Example usage
players <- tribble(
~name, ~position, ~price, ~xg, ~xa, ~xgc, ~expected_minutes, ~expected_saves,
"Haaland", "FWD", 14.0, 0.85, 0.12, NA, 85, NA,
"Salah", "MID", 12.5, 0.52, 0.35, NA, 88, NA,
"Trippier", "DEF", 6.5, 0.08, 0.22, 1.1, 90, NA,
"Raya", "GKP", 5.5, 0, 0, 0.95, 90, 3.2
)
fpl_projections <- calculate_xpoints(players)
fpl_projections %>%
select(name, position, price, xpoints, value) %>%
arrange(desc(xpoints))
Output (from the Python example above, whose player pool also includes Saka):
name position price xpoints value
0 Haaland FWD 14.0 6.42 0.458571
1 Salah MID 12.5 5.31 0.424800
2 Saka MID 9.0 4.12 0.457778
3 Trippier DEF 6.5 3.87 0.595385
4 Raya GKP 5.5 3.52 0.640000
Squad Optimization
Building an optimal FPL squad is a constrained optimization problem: maximize expected points subject to budget, position limits, and team limits.
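Before the full optimizer, the `scipy.optimize.milp` pattern it relies on is easier to see on a toy problem: pick exactly 2 of 4 players under a 13.0m budget, maximizing expected points. The numbers here are invented for illustration:

```python
# Toy version of the squad ILP, using the same milp pattern as the full optimizer
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

price = np.array([5.0, 6.0, 7.0, 8.0])
xpts = np.array([3.0, 4.0, 5.0, 7.0])

res = milp(
    c=-xpts,  # milp minimizes, so negate to maximize expected points
    constraints=[
        LinearConstraint(price.reshape(1, -1), -np.inf, 13.0),  # budget cap
        LinearConstraint(np.ones((1, 4)), 2, 2),                # exactly 2 picks
    ],
    integrality=np.ones(4),  # all decision variables binary
    bounds=Bounds(0, 1),
)
picked = np.where(res.x > 0.5)[0]
print(picked, xpts[picked].sum())  # players 0 and 3 for 10.0 xPts
```

The full 15-player problem below is the same structure with more constraint rows (positions and per-team caps).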
# Python: FPL squad optimization
from scipy.optimize import milp, LinearConstraint, Bounds
import numpy as np
import pandas as pd
class FPLOptimizer:
"""Optimize FPL squad selection."""
SQUAD_STRUCTURE = {"GKP": 2, "DEF": 5, "MID": 5, "FWD": 3}
MAX_PER_TEAM = 3
def __init__(self, player_pool):
self.players = player_pool.copy()
self.n_players = len(player_pool)
def optimize_squad(self, budget=100.0):
"""Find optimal squad within constraints."""
# Objective: maximize xpoints (minimize negative)
c = -self.players["xpoints"].values
# Variable bounds (binary: 0 or 1)
integrality = np.ones(self.n_players) # All binary
# Constraints
constraints = []
# Budget constraint: sum(price * selected) <= budget
A_budget = self.players["price"].values.reshape(1, -1)
constraints.append(LinearConstraint(A_budget, -np.inf, budget))
# Position constraints: exactly N players per position
for pos, count in self.SQUAD_STRUCTURE.items():
A_pos = (self.players["position"] == pos).astype(int).values
constraints.append(LinearConstraint(A_pos.reshape(1, -1), count, count))
# Team constraints: max 3 per team
for team in self.players["team"].unique():
A_team = (self.players["team"] == team).astype(int).values
constraints.append(LinearConstraint(A_team.reshape(1, -1),
-np.inf, self.MAX_PER_TEAM))
# Total squad size = 15
A_total = np.ones((1, self.n_players))
constraints.append(LinearConstraint(A_total, 15, 15))
# Solve
bounds = Bounds(0, 1)
result = milp(c, constraints=constraints, integrality=integrality,
bounds=bounds)
if result.success:
selected_idx = np.where(result.x > 0.5)[0]
squad = self.players.iloc[selected_idx].copy()
return {
"squad": squad,
"total_xpoints": squad["xpoints"].sum(),
"total_cost": squad["price"].sum(),
"remaining_budget": budget - squad["price"].sum()
}
return None
def optimize_with_existing(self, budget, existing_players,
free_transfers=1, transfer_cost=4):
"""Optimize considering existing squad and transfer costs."""
# Add transfer penalty to players not in existing squad
self.players["transfer_penalty"] = np.where(
self.players["name"].isin(existing_players),
0,
transfer_cost
)
# Adjust objective to account for transfers beyond free
# This is a simplified version - full implementation would be more complex
return self.optimize_squad(budget)
def find_differentials(self, ownership_threshold=5.0):
"""Find high-value low-ownership players."""
df = self.players.copy()
differentials = df[
(df["ownership"] < ownership_threshold) &
(df["xpoints"] > df["xpoints"].median())
].sort_values("value", ascending=False)
return differentials.head(10)
# Example usage
# optimizer = FPLOptimizer(all_players)
# result = optimizer.optimize_squad(budget=100.0)
# print(f"Optimal squad: {result['total_xpoints']:.1f} xPts, £{result['total_cost']:.1f}m")
# R: FPL squad optimization with linear programming
library(lpSolve)
library(tidyverse)
optimize_fpl_squad <- function(players, budget = 100, bench_boost = FALSE) {
n <- nrow(players)
# Objective: maximize expected points
objective <- players$xpoints
# Constraints matrix
constraints <- rbind(
# Budget constraint
players$price,
# Position constraints (exactly 2 GK, 5 DEF, 5 MID, 3 FWD)
as.numeric(players$position == "GKP"),
as.numeric(players$position == "DEF"),
as.numeric(players$position == "MID"),
as.numeric(players$position == "FWD"),
# Team constraints (max 3 from each team)
sapply(unique(players$team), function(t)
as.numeric(players$team == t))
)
# Constraint directions and RHS
directions <- c(
"<=", # Budget
"==", "==", "==", "==", # Positions
rep("<=", length(unique(players$team))) # Teams
)
rhs <- c(
budget, # Budget
2, 5, 5, 3, # Positions
rep(3, length(unique(players$team))) # Teams (max 3)
)
# Solve
solution <- lp(
direction = "max",
objective.in = objective,
const.mat = constraints,
const.dir = directions,
const.rhs = rhs,
all.bin = TRUE
)
# Extract selected players
selected <- players[solution$solution == 1, ]
list(
squad = selected,
total_xpoints = sum(selected$xpoints),
total_cost = sum(selected$price),
remaining_budget = budget - sum(selected$price)
)
}
# Run optimization
# result <- optimize_fpl_squad(all_players, budget = 100)
# result$squad %>% arrange(position, desc(xpoints))
Fixture Difficulty Ratings
Fixture difficulty is crucial for FPL planning. Understanding which teams face easier or harder runs helps with captain choices, transfers, and chip timing.
# Python: Fixture Difficulty Rating system
import pandas as pd
import numpy as np
from typing import Dict, List, Tuple
from dataclasses import dataclass
@dataclass
class FixtureRating:
"""Rating for a single fixture."""
opponent: str
is_home: bool
fdr_attack: int # 1-5 (difficulty to score)
fdr_defense: int # 1-5 (difficulty to keep clean sheet)
fdr_overall: int # 1-5 (overall difficulty)
class FDRCalculator:
"""Calculate Fixture Difficulty Ratings for FPL."""
def __init__(self, matches_df: pd.DataFrame, lookback_days: int = 180):
self.matches = matches_df.copy()
self.lookback_days = lookback_days
self.team_strength = self._calculate_team_strength()
def _calculate_team_strength(self) -> pd.DataFrame:
"""Calculate attack and defense strength for each team."""
recent = self.matches[
self.matches["date"] >= self.matches["date"].max() - pd.Timedelta(days=self.lookback_days)
]
# Home stats (goal columns are left out of the join to avoid column collisions)
home_stats = recent.groupby("home_team").agg({
"home_xg": "mean",
"away_xg": "mean"
}).rename(columns={
"home_xg": "home_xg_for",
"away_xg": "home_xg_against"
})
# Away stats
away_stats = recent.groupby("away_team").agg({
"away_xg": "mean",
"home_xg": "mean"
}).rename(columns={
"away_xg": "away_xg_for",
"home_xg": "away_xg_against"
})
# Combine
strength = home_stats.join(away_stats, how="outer").fillna(0)
# Calculate overall metrics
strength["attack_strength"] = (strength["home_xg_for"] + strength["away_xg_for"]) / 2
strength["defense_strength"] = (strength["home_xg_against"] + strength["away_xg_against"]) / 2
strength["overall_strength"] = strength["attack_strength"] - strength["defense_strength"]
# FDR scores (1-5)
strength["fdr_attack"] = pd.qcut(
strength["defense_strength"],
q=5, labels=[1, 2, 3, 4, 5]
).astype(int)
strength["fdr_defense"] = pd.qcut(
strength["attack_strength"],
q=5, labels=[1, 2, 3, 4, 5]
).astype(int)
strength["fdr_overall"] = pd.qcut(
-strength["overall_strength"],
q=5, labels=[1, 2, 3, 4, 5]
).astype(int)
return strength.rename_axis("team").reset_index()
def get_fixture_difficulty(self, team: str, opponent: str,
is_home: bool) -> FixtureRating:
"""Get difficulty rating for a specific fixture."""
opp_data = self.team_strength[
self.team_strength["team"] == opponent
].iloc[0]
# Adjust for home/away: knock one FDR band off for home games (simplified)
fdr_adj = 1 if is_home else 0
return FixtureRating(
opponent=opponent,
is_home=is_home,
fdr_attack=max(1, opp_data["fdr_attack"] - fdr_adj),
fdr_defense=max(1, opp_data["fdr_defense"] - fdr_adj),
fdr_overall=max(1, opp_data["fdr_overall"] - fdr_adj)
)
def calculate_fixture_run(self, fixtures_df: pd.DataFrame,
n_gameweeks: int = 6) -> pd.DataFrame:
"""Calculate fixture difficulty for upcoming gameweeks."""
results = []
for team in fixtures_df["team"].unique():
team_fixtures = fixtures_df[
fixtures_df["team"] == team
].head(n_gameweeks)
ratings = []
for _, fix in team_fixtures.iterrows():
rating = self.get_fixture_difficulty(
team, fix["opponent"], fix["is_home"]
)
ratings.append(rating.fdr_overall)
results.append({
"team": team,
"avg_fdr": np.mean(ratings),
"easy_count": sum(1 for r in ratings if r <= 2),
"hard_count": sum(1 for r in ratings if r >= 4),
"fixture_swing": sum(1 for r in ratings if r <= 2) - sum(1 for r in ratings if r >= 4)
})
return pd.DataFrame(results).sort_values("avg_fdr")
def find_double_gameweek_targets(self, fixtures_df: pd.DataFrame) -> pd.DataFrame:
"""Identify teams with double gameweeks."""
dgw = fixtures_df.groupby(["team", "gameweek"]).size().reset_index(name="fixtures")
dgw = dgw[dgw["fixtures"] > 1]
# Add FDR for DGW fixtures
dgw_details = []
for _, row in dgw.iterrows():
team_gw = fixtures_df[
(fixtures_df["team"] == row["team"]) &
(fixtures_df["gameweek"] == row["gameweek"])
]
fdrs = [
self.get_fixture_difficulty(row["team"], f["opponent"], f["is_home"]).fdr_overall
for _, f in team_gw.iterrows()
]
dgw_details.append({
"team": row["team"],
"gameweek": row["gameweek"],
"fixtures": row["fixtures"],
"avg_fdr": np.mean(fdrs)
})
return pd.DataFrame(dgw_details).sort_values(["gameweek", "avg_fdr"])
class FixtureSwingAnalyzer:
"""Analyze fixture swings for transfer planning."""
def __init__(self, fdr_calc: FDRCalculator, fixtures: pd.DataFrame):
self.fdr = fdr_calc
self.fixtures = fixtures
def find_rotation_pairs(self, budget: float = 10.0) -> List[Tuple[str, str]]:
"""Find pairs of players/teams that rotate well."""
teams = self.fixtures["team"].unique()
pairs = []
for i, team1 in enumerate(teams):
for team2 in teams[i+1:]:
# Get next 10 fixtures for each
fix1 = self.fixtures[self.fixtures["team"] == team1].head(10)
fix2 = self.fixtures[self.fixtures["team"] == team2].head(10)
if len(fix1) == 10 and len(fix2) == 10:
# Check if they complement each other
rotation_score = self._calculate_rotation_score(fix1, fix2)
if rotation_score > 7: # Good rotation
pairs.append((team1, team2, rotation_score))
return sorted(pairs, key=lambda x: x[2], reverse=True)
def _calculate_rotation_score(self, fix1: pd.DataFrame,
fix2: pd.DataFrame) -> float:
"""Score how well two fixture lists rotate."""
score = 0
for i in range(min(len(fix1), len(fix2))):
fdr1 = self.fdr.get_fixture_difficulty(
fix1.iloc[i]["team"],
fix1.iloc[i]["opponent"],
fix1.iloc[i]["is_home"]
).fdr_overall
fdr2 = self.fdr.get_fixture_difficulty(
fix2.iloc[i]["team"],
fix2.iloc[i]["opponent"],
fix2.iloc[i]["is_home"]
).fdr_overall
# Best rotation: one easy, one hard
if (fdr1 <= 2 and fdr2 >= 4) or (fdr1 >= 4 and fdr2 <= 2):
score += 1
# Good: one easy, one medium
elif (fdr1 <= 2 and fdr2 == 3) or (fdr1 == 3 and fdr2 <= 2):
score += 0.5
return score
print("FDR Calculator initialized!")
# R: Fixture Difficulty Rating system
library(tidyverse)
# Build comprehensive FDR model
calculate_fdr <- function(teams_data, matches_data, n_matches = 6) {
# Calculate home and away strength
team_strength <- matches_data %>%
filter(date >= max(date) - 180) %>% # Last 6 months
group_by(home_team) %>%
summarise(
home_xg_for = mean(home_xg),
home_xg_against = mean(away_xg),
home_points = sum(case_when(
home_goals > away_goals ~ 3,
home_goals == away_goals ~ 1,
TRUE ~ 0
)) / n(),
.groups = "drop"
) %>%
rename(team = home_team) %>%
left_join(
matches_data %>%
filter(date >= max(date) - 180) %>%
group_by(away_team) %>%
summarise(
away_xg_for = mean(away_xg),
away_xg_against = mean(home_xg),
away_points = sum(case_when(
away_goals > home_goals ~ 3,
away_goals == home_goals ~ 1,
TRUE ~ 0
)) / n(),
.groups = "drop"
) %>%
rename(team = away_team),
by = "team"
) %>%
mutate(
# Overall attacking/defensive strength
attack_strength = (home_xg_for + away_xg_for) / 2,
defense_strength = (home_xg_against + away_xg_against) / 2,
overall_strength = attack_strength - defense_strength,
# FDR score (1-5 scale, lower = easier)
fdr_attack = ntile(defense_strength, 5), # Easier to attack weak defenses
fdr_defense = ntile(attack_strength, 5), # Harder to keep CS vs strong attacks
fdr_overall = ntile(-overall_strength, 5)
)
team_strength
}
# Calculate fixture runs
calculate_fixture_runs <- function(fixtures, fdr_data, n_gameweeks = 6) {
fixtures %>%
filter(gameweek <= min(gameweek) + n_gameweeks - 1) %>%
left_join(
fdr_data %>% select(team, fdr_overall),
by = c("opponent" = "team")
) %>%
group_by(team) %>%
summarise(
next_n_fdr = mean(fdr_overall),
easy_fixtures = sum(fdr_overall <= 2),
hard_fixtures = sum(fdr_overall >= 4),
fixture_swing = easy_fixtures - hard_fixtures,
.groups = "drop"
) %>%
arrange(next_n_fdr)
}
# Fixture ticker for dashboard
create_fixture_ticker <- function(fixtures, fdr_data, team) {
fixtures %>%
filter(team == !!team) %>%
head(10) %>%
left_join(fdr_data %>% select(team, fdr_overall),
by = c("opponent" = "team")) %>%
mutate(
fdr_color = case_when(
fdr_overall == 1 ~ "#00FF00", # Bright green
fdr_overall == 2 ~ "#90EE90", # Light green
fdr_overall == 3 ~ "#FFFF00", # Yellow
fdr_overall == 4 ~ "#FFA500", # Orange
fdr_overall == 5 ~ "#FF0000" # Red
),
display = paste0(
opponent, " (",
ifelse(is_home, "H", "A"), ")"
)
)
}
print("FDR calculation system ready!")
FDR Strategy Tips
- Wildcards: Time wildcards to catch fixture swings - target teams with 4+ green fixtures
- Differentials: Look for low-ownership players from teams with easy runs
- Rotation: Pair 4.5m defenders who rotate well to maximize clean sheets
- Captaincy: Weight captain picks toward easier fixtures, especially for premiums
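The rotation tip above can be made concrete with a standalone scorer over two teams' FDR runs. The thresholds mirror the `_calculate_rotation_score` heuristic from the Python FDR code earlier; the fixture lists are made up:

```python
def rotation_score(fdr_a, fdr_b):
    """Score how well two fixture runs rotate: reward one easy / one hard."""
    score = 0.0
    for a, b in zip(fdr_a, fdr_b):
        if (a <= 2 and b >= 4) or (a >= 4 and b <= 2):
            score += 1.0  # perfect rotation: always one easy option to field
        elif (a <= 2 and b == 3) or (a == 3 and b <= 2):
            score += 0.5  # decent: one easy, one medium
    return score

# Hypothetical 6-fixture FDR runs for two budget defenders' teams
print(rotation_score([2, 5, 1, 4, 3, 2], [4, 2, 5, 1, 2, 4]))  # 5.5
```

A high score means that in most gameweeks at least one of the pair has a green fixture, so a 4.5m rotating pair can approach the clean-sheet upside of a single premium defender.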
FPL Chip Strategy & Optimization
FPL chips (Wildcard, Bench Boost, Triple Captain, Free Hit) can swing hundreds of points when used optimally. Understanding when to deploy them is crucial for top finishes.
# Python: Chip optimization strategy
import pandas as pd
import numpy as np
from dataclasses import dataclass
from typing import List, Dict, Optional
from scipy.optimize import milp, LinearConstraint, Bounds
@dataclass
class ChipOpportunity:
"""Represents a chip deployment opportunity."""
gameweek: int
chip_type: str
score: float
reasoning: str
recommended_action: str
class ChipOptimizer:
"""Optimize FPL chip deployment."""
def __init__(self, fixtures: pd.DataFrame, player_pool: pd.DataFrame,
fdr_data: pd.DataFrame):
self.fixtures = fixtures
self.players = player_pool
self.fdr = fdr_data
def find_bench_boost_opportunities(self,
remaining_gws: List[int]) -> List[ChipOpportunity]:
"""Find optimal Bench Boost gameweeks."""
opportunities = []
for gw in remaining_gws:
gw_fixtures = self.fixtures[self.fixtures["gameweek"] == gw]
# Count doubles and blanks
team_counts = gw_fixtures.groupby("team").size()
doubles = (team_counts == 2).sum()
blanks = len(set(self.players["team"]) - set(gw_fixtures["team"]))
# Average FDR
avg_fdr = gw_fixtures.merge(self.fdr[["team", "fdr_overall"]],
left_on="opponent", right_on="team",
how="left")["fdr_overall"].mean()
# Score the opportunity
score = doubles * 3 + (5 - avg_fdr) * 2 - blanks * 2
if score > 5:
opportunities.append(ChipOpportunity(
gameweek=gw,
chip_type="Bench Boost",
score=score,
reasoning=f"{doubles} DGWs, avg FDR {avg_fdr:.1f}",
recommended_action=f"Consider BB in GW{gw}" if score > 8 else "Monitor"
))
return sorted(opportunities, key=lambda x: x.score, reverse=True)
def find_triple_captain_targets(self, gameweek: int,
squad: List[str]) -> pd.DataFrame:
"""Find best Triple Captain picks for a gameweek."""
# Get fixtures for the gameweek
gw_fixtures = self.fixtures[self.fixtures["gameweek"] == gameweek]
# Find players with doubles or great fixtures
squad_players = self.players[self.players["name"].isin(squad)]
results = []
for _, player in squad_players.iterrows():
team_fixtures = gw_fixtures[gw_fixtures["team"] == player["team"]]
if len(team_fixtures) == 0:
continue
# Calculate TC expected points
fixture_count = len(team_fixtures)
avg_fdr = team_fixtures.merge(
self.fdr[["team", "fdr_overall"]],
left_on="opponent", right_on="team"
)["fdr_overall"].mean()
# Adjust xpoints for fixture count and difficulty
base_xpts = player["xpoints"]
adj_xpts = base_xpts * fixture_count * (1 + (3 - avg_fdr) * 0.1)
tc_expected = adj_xpts * 3 # Triple points
results.append({
"name": player["name"],
"position": player["position"],
"team": player["team"],
"fixtures": fixture_count,
"avg_fdr": avg_fdr,
"base_xpts": base_xpts,
"tc_expected": tc_expected
})
return pd.DataFrame(results).sort_values("tc_expected", ascending=False)
def optimize_free_hit_squad(self, gameweek: int,
budget: float = 100.0) -> Dict:
"""Build optimal Free Hit squad for a specific gameweek."""
gw_fixtures = self.fixtures[self.fixtures["gameweek"] == gameweek]
playing_teams = set(gw_fixtures["team"])
# Filter to players with fixtures
available = self.players[self.players["team"].isin(playing_teams)].copy()
# Adjust for double gameweeks
team_counts = gw_fixtures.groupby("team").size().to_dict()
available["gw_fixtures"] = available["team"].map(team_counts)
# Add FDR adjustment
available = available.merge(
gw_fixtures.groupby("team")["opponent"].apply(list).reset_index(),
on="team", how="left"
)
# Calculate adjusted xpoints for this gameweek
def calc_gw_xpts(row):
base = row["xpoints"]
fixtures = row["gw_fixtures"]
# Scale by fixture count, with a small extra boost only for doubles
return base * fixtures * (1.1 if fixtures > 1 else 1.0)
available["gw_xpoints"] = available.apply(calc_gw_xpts, axis=1)
# Now optimize (using existing optimizer)
n = len(available)
# Objective: maximize gw_xpoints
c = -available["gw_xpoints"].values
constraints = []
# Budget
A_budget = available["price"].values.reshape(1, -1)
constraints.append(LinearConstraint(A_budget, -np.inf, budget))
# Position constraints
for pos, count in {"GKP": 2, "DEF": 5, "MID": 5, "FWD": 3}.items():
A_pos = (available["position"] == pos).astype(int).values
constraints.append(LinearConstraint(A_pos.reshape(1, -1), count, count))
# Team constraints
for team in available["team"].unique():
A_team = (available["team"] == team).astype(int).values
constraints.append(LinearConstraint(A_team.reshape(1, -1),
-np.inf, 3))
# Total = 15
constraints.append(LinearConstraint(np.ones((1, n)), 15, 15))
# Solve
integrality = np.ones(n)
result = milp(c, constraints=constraints, integrality=integrality,
bounds=Bounds(0, 1))
if result.success:
selected_idx = np.where(result.x > 0.5)[0]
squad = available.iloc[selected_idx]
return {
"squad": squad[["name", "position", "team", "price",
"gw_fixtures", "gw_xpoints"]],
"total_xpoints": squad["gw_xpoints"].sum(),
"total_cost": squad["price"].sum()
}
return None
def wildcard_timing_analysis(self, current_gw: int,
remaining_gws: List[int]) -> pd.DataFrame:
"""Analyze optimal wildcard timing."""
swing_analysis = []
for gw in remaining_gws:
# Calculate fixture swing from current GW to target
upcoming = self.fixtures[
(self.fixtures["gameweek"] >= gw) &
(self.fixtures["gameweek"] < gw + 6)
]
team_fdrs = upcoming.merge(
self.fdr[["team", "fdr_overall"]],
left_on="opponent", right_on="team"
).groupby("team_x")["fdr_overall"].mean().reset_index()
# Find teams with improving fixtures
improving_teams = team_fdrs[team_fdrs["fdr_overall"] <= 2.5]
swing_analysis.append({
"wildcard_gw": gw,
"easy_fixture_teams": len(improving_teams),
"best_teams": improving_teams.nsmallest(5, "fdr_overall")["team_x"].tolist()
})
return pd.DataFrame(swing_analysis)
# Example usage
print("Chip Optimizer ready for deployment!")
# Usage pattern:
# optimizer = ChipOptimizer(fixtures_df, players_df, fdr_df)
# bb_opps = optimizer.find_bench_boost_opportunities([30, 31, 32, 33, 34])
# tc_picks = optimizer.find_triple_captain_targets(34, my_squad)
# fh_squad = optimizer.optimize_free_hit_squad(29, budget=100.0)
# R: Chip optimization strategy
library(tidyverse)
# Chip timing optimizer
analyze_chip_opportunities <- function(fixtures, fdr_data, player_pool) {
# Find best Bench Boost gameweeks
find_bb_opportunities <- function(fixtures, n_gw = 10) {
fixtures %>%
group_by(gameweek) %>%
summarise(
double_gw_count = sum(is_double_gw),
avg_fdr = mean(fdr),
easy_fixture_count = sum(fdr <= 2),
.groups = "drop"
) %>%
mutate(
bb_score = double_gw_count * 3 + easy_fixture_count +
(5 - avg_fdr) * 2
) %>%
arrange(desc(bb_score)) %>%
head(n_gw)
}
# Find best Triple Captain targets
find_tc_opportunities <- function(player_pool, fixtures) {
player_pool %>%
filter(position %in% c("MID", "FWD")) %>%
left_join(fixtures, by = "team") %>%
filter(is_double_gw | fdr <= 2) %>%
mutate(
tc_score = xpoints * (1 + is_double_gw + (3 - fdr) * 0.2)
) %>%
arrange(desc(tc_score)) %>%
head(10)
}
# Free Hit gameweek analysis
find_fh_opportunities <- function(fixtures) {
fixtures %>%
group_by(gameweek) %>%
summarise(
blank_count = sum(is_blank),
double_count = sum(is_double_gw),
avg_fdr = mean(fdr, na.rm = TRUE),
.groups = "drop"
) %>%
mutate(
fh_score = blank_count * 5 + # High value if many blanks
double_count * 2 +
(5 - avg_fdr)
) %>%
filter(fh_score >= 5) %>%
arrange(desc(fh_score))
}
list(
bench_boost = find_bb_opportunities(fixtures),
triple_captain = find_tc_opportunities(player_pool, fixtures),
free_hit = find_fh_opportunities(fixtures)
)
}
# Wildcard planning
plan_wildcard <- function(current_squad, player_pool, target_gw,
budget = 100, upcoming_fixtures) {
# Calculate value scores for all players
player_pool <- player_pool %>%
left_join(
upcoming_fixtures %>%
group_by(team) %>%
summarise(avg_fdr = mean(fdr), .groups = "drop"),
by = "team"
) %>%
mutate(
# Adjust xpoints by fixture difficulty
adj_xpoints = xpoints * (1 + (3 - avg_fdr) * 0.15),
value = adj_xpoints / (price / 10)
)
# Find optimal new squad
list(
transfers_needed = sum(!current_squad$name %in% player_pool$name),
top_picks_by_position = player_pool %>%
group_by(position) %>%
slice_max(value, n = 5) %>%
select(name, team, price, xpoints, adj_xpoints, value)
)
}
print("Chip strategy analyzer ready!")
| Chip | Optimal Timing | Key Factors | Common Mistakes |
|---|---|---|---|
| Bench Boost | Double Gameweek with 15 playing players | All bench players have fixtures, ideally doubles | Using without full playing squad |
| Triple Captain | Premium player with DGW or 2 easy fixtures | High xG player, good fixtures, form | Chasing last year's TC pick |
| Free Hit | Blank Gameweek with many teams missing | Squad normally has many blanks | Using for DGW instead of blank |
| Wildcard | Major fixture swing or injury crisis | 4+ transfers needed, fixture improvement | Panic wildcard after one bad week |
Betting Market Analysis
Understanding betting markets helps evaluate model performance and market efficiency. Odds contain valuable information about probability estimates.
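The basic conversions are worth seeing in isolation before the full analyzer. This is a minimal sketch; the function names are mine:

```python
def decimal_to_american(dec):
    """Decimal odds -> American odds (positive for underdogs, negative for favorites)."""
    return round((dec - 1) * 100) if dec >= 2 else -round(100 / (dec - 1))

def american_to_decimal(am):
    """American odds -> decimal odds."""
    return 1 + am / 100 if am > 0 else 1 + 100 / -am

def implied_prob(dec):
    """Raw implied probability (still contains the bookmaker margin)."""
    return 1 / dec

print(decimal_to_american(1.65))    # -154
print(american_to_decimal(-154))    # ~1.649 (round-trip loses a little to rounding)
print(f"{implied_prob(1.65):.1%}")  # 60.6%
```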
# Python: Betting odds analysis
import pandas as pd
import numpy as np
class OddsAnalyzer:
"""Analyze betting odds and implied probabilities."""
def __init__(self):
pass
def convert_odds(self, decimal_odds):
"""Convert decimal odds to all common formats."""
from fractions import Fraction
frac = Fraction(decimal_odds - 1).limit_denominator(100)
return {
"decimal": decimal_odds,
"fractional": f"{frac.numerator}/{frac.denominator}",
"american": f"+{round((decimal_odds - 1) * 100)}" if decimal_odds >= 2
else f"-{round(100 / (decimal_odds - 1))}",
"implied_prob": 1 / decimal_odds
}
def calculate_implied_probabilities(self, home_odds, draw_odds, away_odds):
"""Calculate true probabilities from betting odds."""
# Raw implied probabilities
home_prob = 1 / home_odds
draw_prob = 1 / draw_odds
away_prob = 1 / away_odds
# Overround (bookmaker margin)
overround = home_prob + draw_prob + away_prob
# True probabilities (margin removed)
return pd.DataFrame({
"outcome": ["Home", "Draw", "Away"],
"odds": [home_odds, draw_odds, away_odds],
"raw_implied": [home_prob, draw_prob, away_prob],
"true_implied": [home_prob/overround, draw_prob/overround,
away_prob/overround],
"overround_pct": [(overround - 1) * 100] * 3
})
def calculate_expected_value(self, model_prob, odds):
"""Calculate expected value of a bet."""
# EV = (probability * profit) - (1 - probability) * stake
# For unit stake: EV = (prob * (odds - 1)) - ((1 - prob) * 1)
ev = (model_prob * (odds - 1)) - (1 - model_prob)
return ev
def find_value_bets(self, matches_df, model_probs_col, odds_col,
ev_threshold=0.05):
"""Identify bets where model shows positive expected value."""
df = matches_df.copy()
df["implied_prob"] = 1 / df[odds_col]
df["ev"] = df.apply(
lambda x: self.calculate_expected_value(x[model_probs_col], x[odds_col]),
axis=1
)
df["edge"] = df[model_probs_col] - df["implied_prob"]
value_bets = df[df["ev"] > ev_threshold].sort_values("ev", ascending=False)
return value_bets
def kelly_criterion(self, model_prob, odds, fraction=0.25):
"""Calculate Kelly stake as percentage of bankroll."""
# Full Kelly: f* = (bp - q) / b
# where b = odds - 1, p = prob of win, q = prob of loss
b = odds - 1
p = model_prob
q = 1 - p
kelly = (b * p - q) / b
# Apply fractional Kelly for safety
return max(0, kelly * fraction)
# Example usage
analyzer = OddsAnalyzer()
# Analyze match odds
probs = analyzer.calculate_implied_probabilities(1.65, 3.80, 5.50)
print("Implied Probabilities:")
print(probs.to_string(index=False))
print(f"\nBookmaker margin: {probs['overround_pct'].iloc[0]:.1f}%")
# Calculate EV for a bet
model_prob = 0.65 # Our model says 65% chance of home win
home_odds = 1.65
ev = analyzer.calculate_expected_value(model_prob, home_odds)
kelly = analyzer.kelly_criterion(model_prob, home_odds)
print(f"\nModel probability: {model_prob:.1%}")
print(f"Implied probability: {1/home_odds:.1%}")
print(f"Expected Value: {ev:.3f}")
print(f"Kelly stake: {kelly:.1%} of bankroll")
# R: Betting odds analysis
library(tidyverse)
# Convert odds formats
convert_odds <- function(decimal_odds) {
list(
decimal = decimal_odds,
fractional = paste0(round((decimal_odds - 1) * 100), "/100"), # unreduced approximation
american = ifelse(decimal_odds >= 2,
paste0("+", round((decimal_odds - 1) * 100)),
paste0("-", round(100 / (decimal_odds - 1)))),
implied_prob = 1 / decimal_odds
)
}
# Calculate implied probabilities from odds
implied_probabilities <- function(home_odds, draw_odds, away_odds) {
# Raw implied probabilities
home_prob <- 1 / home_odds
draw_prob <- 1 / draw_odds
away_prob <- 1 / away_odds
# Calculate overround (bookmaker margin)
overround <- home_prob + draw_prob + away_prob
# True probabilities (removing margin)
tibble(
outcome = c("Home", "Draw", "Away"),
odds = c(home_odds, draw_odds, away_odds),
raw_implied = c(home_prob, draw_prob, away_prob),
true_implied = c(home_prob, draw_prob, away_prob) / overround,
overround_pct = (overround - 1) * 100
)
}
# Example
match_odds <- implied_probabilities(
home_odds = 1.65,
draw_odds = 3.80,
away_odds = 5.50
)
print(match_odds)
cat("\nBookmaker margin:", round(match_odds$overround_pct[1], 1), "%")
Output (from the Python example above):
Implied Probabilities:
outcome odds raw_implied true_implied overround_pct
Home 1.65 0.6061 0.5714 6.1
Draw 3.80 0.2632 0.2484 6.1
Away 5.50 0.1818 0.1716 6.1
Bookmaker margin: 6.1%
Model probability: 65.0%
Implied probability: 60.6%
Expected Value: 0.073
Kelly stake: 2.8% of bankroll
Evaluating Prediction Models
Assessing model quality is crucial for both fantasy and betting applications. Key metrics include calibration, log loss, and Brier score.
# Python: Model evaluation metrics
import numpy as np
import pandas as pd
from sklearn.metrics import brier_score_loss, log_loss, roc_auc_score
from sklearn.calibration import calibration_curve
import matplotlib.pyplot as plt
class ModelEvaluator:
"""Evaluate prediction model performance."""
def __init__(self, predictions, actuals):
self.predictions = np.array(predictions)
self.actuals = np.array(actuals)
def brier_score(self):
"""Calculate Brier score (lower is better)."""
return brier_score_loss(self.actuals, self.predictions)
def calculate_log_loss(self):
"""Calculate log loss (lower is better)."""
return log_loss(self.actuals, self.predictions)
def calculate_auc(self):
"""Calculate ROC AUC."""
return roc_auc_score(self.actuals, self.predictions)
def calibration_analysis(self, n_bins=10):
"""Analyze model calibration."""
prob_true, prob_pred = calibration_curve(
self.actuals, self.predictions, n_bins=n_bins
)
return {
"predicted": prob_pred,
"actual": prob_true,
"calibration_error": np.mean(np.abs(prob_pred - prob_true))
}
def plot_calibration(self):
"""Plot calibration curve."""
cal = self.calibration_analysis()
fig, ax = plt.subplots(figsize=(8, 8))
# Perfect calibration line
ax.plot([0, 1], [0, 1], "k--", label="Perfectly Calibrated")
# Model calibration
ax.plot(cal["predicted"], cal["actual"], "s-",
label=f"Model (ECE: {cal['calibration_error']:.3f})")
ax.set_xlabel("Mean Predicted Probability")
ax.set_ylabel("Actual Proportion")
ax.set_title("Calibration Curve")
ax.legend()
ax.grid(True, alpha=0.3)
return fig
def full_report(self):
"""Generate comprehensive evaluation report."""
return {
"brier_score": self.brier_score(),
"log_loss": self.calculate_log_loss(),
"auc": self.calculate_auc(),
"calibration": self.calibration_analysis()
}
# Compare model to betting market
def evaluate_against_market(model_probs, market_probs, actuals):
"""Compare model performance to market."""
model_eval = ModelEvaluator(model_probs, actuals)
market_eval = ModelEvaluator(market_probs, actuals)
comparison = pd.DataFrame({
"Metric": ["Brier Score", "Log Loss", "AUC"],
"Model": [model_eval.brier_score(),
model_eval.calculate_log_loss(),
model_eval.calculate_auc()],
"Market": [market_eval.brier_score(),
market_eval.calculate_log_loss(),
market_eval.calculate_auc()]
})
comparison["Model Better"] = comparison.apply(
lambda x: x["Model"] < x["Market"] if x["Metric"] != "AUC"
else x["Model"] > x["Market"],
axis=1
)
return comparison
# Example evaluation
np.random.seed(42)
n = 500
predictions = np.random.beta(2, 3, n)
actuals = (np.random.random(n) < predictions).astype(int)
evaluator = ModelEvaluator(predictions, actuals)
report = evaluator.full_report()
print("Model Evaluation Report:")
print(f" Brier Score: {report['brier_score']:.4f}")
print(f" Log Loss: {report['log_loss']:.4f}")
print(f" AUC: {report['auc']:.4f}")
print(f" Calibration Error: {report['calibration']['calibration_error']:.4f}")
# R: Model evaluation metrics
library(tidyverse)
# Brier score (lower is better)
brier_score <- function(predicted_prob, actual_outcome) {
mean((predicted_prob - actual_outcome)^2)
}
# Log loss (lower is better)
log_loss <- function(predicted_prob, actual_outcome) {
eps <- 1e-15
predicted_prob <- pmax(pmin(predicted_prob, 1 - eps), eps)
-mean(actual_outcome * log(predicted_prob) +
(1 - actual_outcome) * log(1 - predicted_prob))
}
# Calibration analysis
calibration_analysis <- function(predictions, outcomes, n_bins = 10) {
data <- tibble(pred = predictions, actual = outcomes)
data %>%
mutate(bin = cut(pred, breaks = seq(0, 1, length.out = n_bins + 1),
include.lowest = TRUE)) %>%
group_by(bin) %>%
summarise(
mean_predicted = mean(pred),
mean_actual = mean(actual),
count = n()
)
}
# ROC AUC calculation
calculate_auc <- function(predictions, outcomes) {
# Simple AUC: probability a random positive outranks a random negative
pos <- predictions[outcomes == 1]
neg <- predictions[outcomes == 0]
mean(sapply(pos, function(p) mean(p > neg) + 0.5 * mean(p == neg)))
}
# Comprehensive evaluation
evaluate_model <- function(predictions, outcomes) {
list(
brier = brier_score(predictions, outcomes),
log_loss = log_loss(predictions, outcomes),
auc = calculate_auc(predictions, outcomes),
calibration = calibration_analysis(predictions, outcomes)
)
}
Model Evaluation Report:
Brier Score: 0.1923
Log Loss: 0.5847
AUC: 0.7234
Calibration Error: 0.0412
Goal Scorer & Over/Under Markets
Player-level and match total betting markets can be analyzed using xG-based models. These markets often show different efficiency levels than match result markets.
# Python: Goal scorer market analysis
import pandas as pd
import numpy as np
from scipy.stats import poisson
from typing import Dict, List, Tuple
class GoalScorerAnalyzer:
"""Analyze anytime goalscorer markets."""
def __init__(self, player_stats: pd.DataFrame, odds_data: pd.DataFrame):
self.players = player_stats.merge(odds_data, on="player_name", how="left")
def calculate_ags_probability(self, xg_per_90: float,
expected_mins: float) -> float:
"""Calculate probability of scoring at least once."""
expected_goals = xg_per_90 * expected_mins / 90
return 1 - np.exp(-expected_goals)
def analyze_market(self) -> pd.DataFrame:
"""Analyze all AGS market prices."""
df = self.players.copy()
# Calculate model probability
df["expected_goals"] = df["xg_per_90"] * df["expected_mins"] / 90
df["model_prob"] = 1 - np.exp(-df["expected_goals"])
# Market implied probability
df["implied_prob"] = 1 / df["ags_odds"]
# Edge and EV
df["edge"] = df["model_prob"] - df["implied_prob"]
df["ev"] = (df["model_prob"] * (df["ags_odds"] - 1)) - (1 - df["model_prob"])
# Confidence score
df["confidence"] = np.minimum(df["matches_played"] / 10, 1)
return df.sort_values("ev", ascending=False)
def find_value_ags(self, min_ev: float = 0.05,
min_confidence: float = 0.5) -> pd.DataFrame:
"""Find value AGS bets."""
analysis = self.analyze_market()
return analysis[
(analysis["ev"] > min_ev) &
(analysis["confidence"] >= min_confidence)
][["player_name", "team", "ags_odds", "model_prob",
"implied_prob", "edge", "ev"]]
class OverUnderAnalyzer:
"""Analyze over/under and BTTS markets."""
def __init__(self, match_data: pd.DataFrame):
self.matches = match_data
def analyze_ou_market(self, line: float = 2.5) -> pd.DataFrame:
"""Analyze over/under market for given line."""
df = self.matches.copy()
# Total expected goals
df["total_xg"] = df["home_xg"] + df["away_xg"]
# Poisson probabilities
df["p_over"] = 1 - poisson.cdf(int(line), df["total_xg"])
df["p_under"] = poisson.cdf(int(line), df["total_xg"])
# Market comparison
df["implied_over"] = 1 / df["over_odds"]
df["implied_under"] = 1 / df["under_odds"]
# Edge calculation
df["over_edge"] = df["p_over"] - df["implied_over"]
df["under_edge"] = df["p_under"] - df["implied_under"]
# EV calculation
df["over_ev"] = (df["p_over"] * (df["over_odds"] - 1)) - (1 - df["p_over"])
df["under_ev"] = (df["p_under"] * (df["under_odds"] - 1)) - (1 - df["p_under"])
# Best bet
df["best_bet"] = np.where(df["over_ev"] > 0.05, "OVER",
np.where(df["under_ev"] > 0.05, "UNDER", "PASS"))
return df
def analyze_btts(self) -> pd.DataFrame:
"""Analyze Both Teams to Score market."""
df = self.matches.copy()
# Probability each team scores
df["p_home_scores"] = 1 - np.exp(-df["home_xg"])
df["p_away_scores"] = 1 - np.exp(-df["away_xg"])
# BTTS probability
df["model_btts"] = df["p_home_scores"] * df["p_away_scores"]
df["model_no_btts"] = 1 - df["model_btts"]
# Market comparison
df["implied_btts"] = 1 / df["btts_yes_odds"]
df["implied_no_btts"] = 1 / df["btts_no_odds"]
# Edge
df["btts_edge"] = df["model_btts"] - df["implied_btts"]
df["no_btts_edge"] = df["model_no_btts"] - df["implied_no_btts"]
return df
def correct_score_probabilities(self, home_xg: float, away_xg: float,
max_goals: int = 5) -> pd.DataFrame:
"""Calculate correct score probabilities using Poisson."""
results = []
for h in range(max_goals + 1):
for a in range(max_goals + 1):
prob = poisson.pmf(h, home_xg) * poisson.pmf(a, away_xg)
results.append({
"home_goals": h,
"away_goals": a,
"score": f"{h}-{a}",
"probability": prob,
"fair_odds": 1 / prob if prob > 0 else float("inf")
})
return pd.DataFrame(results).sort_values("probability", ascending=False)
def asian_handicap_probability(self, home_xg: float, away_xg: float,
line: float) -> Dict:
"""Calculate Asian Handicap probabilities."""
# Simulate many games using Poisson
n_sims = 100000
home_goals = np.random.poisson(home_xg, n_sims)
away_goals = np.random.poisson(away_xg, n_sims)
# Apply handicap
adjusted_margin = home_goals - away_goals + line
# Calculate outcomes
home_wins = np.sum(adjusted_margin > 0) / n_sims
away_wins = np.sum(adjusted_margin < 0) / n_sims
pushes = np.sum(adjusted_margin == 0) / n_sims
return {
"line": line,
"home_covers": home_wins,
"away_covers": away_wins,
"push": pushes,
"home_fair_odds": 1 / home_wins if home_wins > 0 else float("inf"),
"away_fair_odds": 1 / away_wins if away_wins > 0 else float("inf")
}
# Example usage
print("Goal scorer and O/U analyzer ready!")
# Correct score example
analyzer = OverUnderAnalyzer(pd.DataFrame())
cs = analyzer.correct_score_probabilities(1.8, 1.2)
print("\nMost likely scores (Man City 1.8 xG vs Arsenal 1.2 xG):")
print(cs.head(10).to_string(index=False))
# R: Goal scorer market analysis
library(tidyverse)
# Analyze anytime goalscorer markets
analyze_ags_market <- function(player_data, market_odds) {
player_data %>%
left_join(market_odds, by = "player_name") %>%
mutate(
# Implied probability from odds
implied_prob = 1 / ags_odds,
# Model probability: P(at least 1 goal) = 1 - P(0 goals)
# Using Poisson: P(0) = e^(-xG * mins/90)
expected_goals = xg_per_90 * expected_mins / 90,
model_prob = 1 - exp(-expected_goals),
# Calculate edge
edge = model_prob - implied_prob,
# Expected value
ev = (model_prob * (ags_odds - 1)) - (1 - model_prob),
# Confidence based on sample size
confidence = pmin(matches_played / 10, 1)
) %>%
filter(!is.na(ags_odds)) %>%
arrange(desc(ev))
}
# Over/Under goals model
analyze_ou_market <- function(match_data, line = 2.5) {
match_data %>%
mutate(
# Total expected goals
total_xg = home_xg + away_xg,
# Probability of over using Poisson
# P(total > line) = 1 - P(total <= floor(line))
p_over = 1 - ppois(floor(line), lambda = total_xg),
p_under = ppois(floor(line), lambda = total_xg),
# Compare to market
implied_over = 1 / over_odds,
implied_under = 1 / under_odds,
# Edge
over_edge = p_over - implied_over,
under_edge = p_under - implied_under,
# Best bet direction
best_bet = case_when(
over_edge > 0.05 ~ "OVER",
under_edge > 0.05 ~ "UNDER",
TRUE ~ "PASS"
)
)
}
# BTTS (Both Teams to Score) market
analyze_btts <- function(match_data) {
match_data %>%
mutate(
# P(home scores at least 1)
p_home_scores = 1 - exp(-home_xg),
# P(away scores at least 1)
p_away_scores = 1 - exp(-away_xg),
# P(BTTS) = P(home scores) * P(away scores)
model_btts = p_home_scores * p_away_scores,
model_no_btts = 1 - model_btts,
# Compare to market
implied_btts = 1 / btts_yes_odds,
implied_no_btts = 1 / btts_no_odds,
btts_edge = model_btts - implied_btts,
no_btts_edge = model_no_btts - implied_no_btts
)
}
# Correct score probabilities (Poisson)
correct_score_probs <- function(home_xg, away_xg, max_goals = 5) {
scores <- expand_grid(
home_goals = 0:max_goals,
away_goals = 0:max_goals
) %>%
mutate(
# Independent Poisson probabilities
prob = dpois(home_goals, home_xg) * dpois(away_goals, away_xg),
score = paste0(home_goals, "-", away_goals)
) %>%
arrange(desc(prob))
scores
}
# Example
cs_probs <- correct_score_probs(home_xg = 1.8, away_xg = 1.2)
print(head(cs_probs, 10))
Most likely scores (Man City 1.8 xG vs Arsenal 1.2 xG):
home_goals away_goals score probability fair_odds
1 1 1-1 0.1075 9.30
2 1 2-1 0.0968 10.33
1 0 1-0 0.0896 11.16
2 0 2-0 0.0807 12.40
1 2 1-2 0.0645 15.50
0 1 0-1 0.0597 16.74
3 1 3-1 0.0581 17.22
2 2 2-2 0.0581 17.22
0 0 0-0 0.0498 20.09
3 0 3-0 0.0484 20.66
Bankroll Management
Proper bankroll management is more important than picking winners. Even the best models fail without disciplined stake sizing.
# Python: Bankroll management system
import numpy as np
import pandas as pd
from dataclasses import dataclass, field
from typing import List, Dict, Optional
from datetime import datetime
@dataclass
class Bet:
"""Represents a single bet."""
date: datetime
event: str
selection: str
odds: float
stake: float
stake_pct: float
model_prob: float
result: Optional[str] = None
profit: Optional[float] = None
class BankrollManager:
"""Comprehensive bankroll management system."""
def __init__(self, initial_bankroll: float, max_stake_pct: float = 0.05,
kelly_fraction: float = 0.25):
self.initial = initial_bankroll
self.current = initial_bankroll
self.max_stake = max_stake_pct
self.kelly_fraction = kelly_fraction
self.history: List[Bet] = []
def kelly_stake(self, model_prob: float, odds: float) -> float:
"""Calculate Kelly criterion stake."""
b = odds - 1
p = model_prob
q = 1 - p
# Full Kelly
full_kelly = (b * p - q) / b
if full_kelly <= 0:
return 0
# Apply fraction and cap
stake_pct = min(full_kelly * self.kelly_fraction, self.max_stake)
return max(0, stake_pct)
def flat_stake(self, units: float = 1.0, unit_size: float = 0.01) -> float:
"""Flat staking approach."""
return min(units * unit_size, self.max_stake)
def place_bet(self, event: str, selection: str, odds: float,
model_prob: float, stake_method: str = "kelly") -> Bet:
"""Place a bet and record it."""
if stake_method == "kelly":
stake_pct = self.kelly_stake(model_prob, odds)
else:
stake_pct = self.flat_stake()
stake = self.current * stake_pct
bet = Bet(
date=datetime.now(),
event=event,
selection=selection,
odds=odds,
stake=stake,
stake_pct=stake_pct,
model_prob=model_prob
)
self.history.append(bet)
return bet
def settle_bet(self, bet: Bet, won: bool) -> None:
"""Settle a bet and update bankroll."""
bet.result = "WON" if won else "LOST"
bet.profit = bet.stake * (bet.odds - 1) if won else -bet.stake
self.current += bet.profit
def get_stats(self) -> Dict:
"""Calculate comprehensive betting statistics."""
settled = [b for b in self.history if b.result is not None]
if not settled:
return {"message": "No settled bets"}
wins = [b for b in settled if b.result == "WON"]
total_staked = sum(b.stake for b in settled)
total_profit = sum(b.profit for b in settled)
return {
"total_bets": len(settled),
"wins": len(wins),
"losses": len(settled) - len(wins),
"win_rate": len(wins) / len(settled),
"total_staked": total_staked,
"total_profit": total_profit,
"roi": total_profit / total_staked if total_staked > 0 else 0,
"bankroll_growth": (self.current - self.initial) / self.initial,
"current_bankroll": self.current,
"avg_odds": np.mean([b.odds for b in settled]),
"avg_stake_pct": np.mean([b.stake_pct for b in settled])
}
def simulate_future(self, n_bets: int, avg_edge: float,
avg_odds: float, n_sims: int = 10000) -> Dict:
"""Monte Carlo simulation of future performance."""
win_rate = (1 / avg_odds) + avg_edge
results = []
for _ in range(n_sims):
bankroll = self.current
max_bankroll = bankroll
min_bankroll = bankroll
for _ in range(n_bets):
stake_pct = self.kelly_stake(win_rate, avg_odds)
stake = bankroll * stake_pct
if np.random.random() < win_rate:
bankroll += stake * (avg_odds - 1)
else:
bankroll -= stake
max_bankroll = max(max_bankroll, bankroll)
min_bankroll = min(min_bankroll, bankroll)
if bankroll <= 0:
break
results.append({
"final_bankroll": bankroll,
"max_bankroll": max_bankroll,
"min_bankroll": min_bankroll,
"ruined": bankroll <= 0
})
df = pd.DataFrame(results)
return {
"mean_final": df["final_bankroll"].mean(),
"median_final": df["final_bankroll"].median(),
"percentile_5": df["final_bankroll"].quantile(0.05),
"percentile_95": df["final_bankroll"].quantile(0.95),
"risk_of_ruin": df["ruined"].mean(),
"max_drawdown_mean": (df["max_bankroll"] - df["min_bankroll"]).mean() / df["max_bankroll"].mean()
}
class ClosingLineValue:
"""Track Closing Line Value (CLV) for bet quality assessment."""
def __init__(self):
self.bets = []
def add_bet(self, placed_odds: float, closing_odds: float,
stake: float, won: bool) -> None:
"""Add a bet with placed and closing odds."""
self.bets.append({
"placed_odds": placed_odds,
"closing_odds": closing_odds,
"stake": stake,
"won": won,
"clv": (1 / closing_odds) - (1 / placed_odds)  # positive when the placed odds beat the close
})
def analyze(self) -> Dict:
"""Analyze CLV performance."""
if not self.bets:
return {"message": "No bets recorded"}
df = pd.DataFrame(self.bets)
return {
"total_bets": len(df),
"positive_clv_rate": (df["clv"] > 0).mean(),
"mean_clv": df["clv"].mean(),
"mean_clv_pct": df["clv"].mean() * 100,
"win_rate": df["won"].mean(),
"total_stake": df["stake"].sum(),
"clv_by_outcome": {
"winners": df[df["won"]]["clv"].mean() if df["won"].any() else 0,
"losers": df[~df["won"]]["clv"].mean() if (~df["won"]).any() else 0
}
}
# Example usage
manager = BankrollManager(initial_bankroll=1000, kelly_fraction=0.25)
# Simulate some bets
print("Bankroll Management System")
print(f"Starting bankroll: £{manager.current:.2f}")
# Calculate recommended stake
edge_bet = manager.kelly_stake(model_prob=0.55, odds=2.10)
print(f"\nFor 55% model prob at 2.10 odds:")
print(f" Kelly recommends: {edge_bet:.1%} of bankroll")
print(f" Stake amount: £{manager.current * edge_bet:.2f}")
# R: Bankroll management system
library(tidyverse)
# Kelly Criterion with fractional approach
kelly_stake <- function(model_prob, odds, fraction = 0.25, max_stake = 0.05) {
b <- odds - 1
p <- model_prob
q <- 1 - p
# Full Kelly
full_kelly <- (b * p - q) / b
# Fractional Kelly (safer)
fractional <- full_kelly * fraction
# Cap at maximum stake
stake <- max(0, min(fractional, max_stake))
list(
full_kelly = full_kelly,
fractional_kelly = fractional,
recommended_stake = stake
)
}
# Bankroll tracking system
create_bankroll_tracker <- function(initial_bankroll) {
tracker <- list(
initial = initial_bankroll,
current = initial_bankroll,
history = tibble(
date = as.Date(character()),
bet_id = integer(),
stake = numeric(),
odds = numeric(),
result = character(),
profit = numeric(),
bankroll = numeric()
),
stats = list()
)
# Add bet function
add_bet <- function(tracker, stake_pct, odds, won) {
stake <- tracker$current * stake_pct
profit <- if (won) stake * (odds - 1) else -stake
new_bankroll <- tracker$current + profit
new_row <- tibble(
date = Sys.Date(),
bet_id = nrow(tracker$history) + 1,
stake = stake,
odds = odds,
result = if (won) "WON" else "LOST",
profit = profit,
bankroll = new_bankroll
)
tracker$history <- bind_rows(tracker$history, new_row)
tracker$current <- new_bankroll
tracker
}
tracker
}
# Calculate risk of ruin
risk_of_ruin <- function(win_rate, avg_odds, stake_pct, n_simulations = 10000) {
# Simulate betting sequences
ruin_count <- 0
for (i in 1:n_simulations) {
bankroll <- 1.0
n_bets <- 1000
for (b in 1:n_bets) {
stake <- bankroll * stake_pct
won <- runif(1) < win_rate
if (won) {
bankroll <- bankroll + stake * (avg_odds - 1)
} else {
bankroll <- bankroll - stake
}
if (bankroll <= 0) {
ruin_count <- ruin_count + 1
break
}
}
}
ruin_count / n_simulations
}
# Expected growth rate
expected_growth <- function(win_rate, odds, stake_pct) {
# G = p * log(1 + b*f) + q * log(1 - f)
# where f = stake fraction, b = odds - 1
b <- odds - 1
p <- win_rate
q <- 1 - p
f <- stake_pct
p * log(1 + b * f) + q * log(1 - f)
}
print("Bankroll management system ready!")
Bankroll Management System
Starting bankroll: £1000.00
For 55% model prob at 2.10 odds:
Kelly recommends: 3.5% of bankroll
Stake amount: £35.23
Staking Guidelines
- Never bet more than 5% of bankroll on a single bet
- Use fractional Kelly (25-50%) not full Kelly
- Track Closing Line Value (CLV) as a skill indicator
- Set loss limits per day/week/month
- Don't chase losses with larger stakes
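The advice to use fractional Kelly can be made concrete with the expected log-growth formula G = p*log(1 + b*f) + q*log(1 - f) implemented in the R code above. A minimal Python sketch, using an illustrative (assumed) 55% win probability at decimal odds of 2.10:

```python
import numpy as np

def log_growth(p: float, odds: float, f: float) -> float:
    """Expected log-growth per bet: G = p*ln(1 + b*f) + (1 - p)*ln(1 - f)."""
    b = odds - 1
    return p * np.log(1 + b * f) + (1 - p) * np.log(1 - f)

# Illustrative (assumed) edge: 55% win probability at decimal odds of 2.10
p, odds = 0.55, 2.10
b = odds - 1
full_kelly = (b * p - (1 - p)) / b  # ~14.1% of bankroll

for label, frac in [("Full Kelly", 1.0), ("Half Kelly", 0.5), ("Quarter Kelly", 0.25)]:
    f = full_kelly * frac
    print(f"{label}: stake {f:.1%}, growth per bet {log_growth(p, odds, f):.5f}")

# Over-betting destroys growth: staking twice full Kelly has negative expectancy here
print(f"2x Kelly growth: {log_growth(p, odds, full_kelly * 2):.5f}")
```

With these numbers, half Kelly keeps roughly three quarters of the full-Kelly growth rate at half the stake size, which is why fractional Kelly is the standard recommendation.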
Responsible Gambling
No analytics chapter on betting is complete without addressing responsible gambling. This is non-negotiable content for ethical practice.
Critical Warnings
- The house always has an edge. Long-term, bookmakers profit and most bettors lose.
- Models don't guarantee profits. Even profitable edges can lead to significant losses.
- Never bet money you can't afford to lose. This is not investment advice.
- Gambling can be addictive. If you feel you're losing control, seek help immediately.
- Betting more than planned
- Chasing losses with bigger bets
- Borrowing money to bet
- Lying about betting activity
- Neglecting work, relationships, or health
- Feeling anxious when not betting
- Betting to escape problems
- Set strict budget limits before betting
- Never bet under emotional influence
- Keep detailed records of all bets
- Take regular breaks from betting
- Never borrow to fund betting
- Treat it as entertainment, not income
- Use deposit limits and self-exclusion tools
If you or someone you know has a gambling problem, help is available:
- UK: GambleAware - 0808 8020 133 - begambleaware.org
- UK: GamCare - 0808 8020 133 - gamcare.org.uk
- US: National Council on Problem Gambling - 1-800-522-4700
- International: Gamblers Anonymous - gamblersanonymous.org
# Python: Self-assessment and limit tracking
import pandas as pd
from datetime import datetime, timedelta
from dataclasses import dataclass
from typing import Dict, List, Optional
@dataclass
class GamblingLimits:
"""Track and enforce gambling limits."""
daily_loss: float = 50.0
weekly_loss: float = 200.0
monthly_loss: float = 500.0
session_time_mins: int = 60
max_bets_per_day: int = 10
class ResponsibleGamblingTracker:
"""Track gambling behavior for responsible practices."""
def __init__(self, limits: GamblingLimits = None):
self.limits = limits or GamblingLimits()
self.sessions = []
self.daily_results = {}
def start_session(self, mood: str = "neutral") -> Dict:
"""Start a new gambling session."""
session = {
"id": len(self.sessions) + 1,
"start_time": datetime.now(),
"end_time": None,
"bets": [],
"profit_loss": 0,
"mood_before": mood,
"mood_after": None,
"within_limits": True
}
self.sessions.append(session)
# Check remaining limits
status = self.check_limits()
return {
"session_id": session["id"],
"status": status,
"message": self._get_limit_message(status)
}
def check_limits(self) -> Dict:
"""Check current limit status."""
today = datetime.now().date()
week_ago = today - timedelta(days=7)
month_ago = today - timedelta(days=30)
# Calculate losses
daily_loss = sum(
s["profit_loss"] for s in self.sessions
if s["start_time"].date() == today and s["profit_loss"] < 0
)
weekly_loss = sum(
s["profit_loss"] for s in self.sessions
if s["start_time"].date() >= week_ago and s["profit_loss"] < 0
)
monthly_loss = sum(
s["profit_loss"] for s in self.sessions
if s["start_time"].date() >= month_ago and s["profit_loss"] < 0
)
daily_bets = sum(
len(s["bets"]) for s in self.sessions
if s["start_time"].date() == today
)
return {
"daily_loss": abs(daily_loss),
"daily_remaining": max(0, self.limits.daily_loss - abs(daily_loss)),
"weekly_loss": abs(weekly_loss),
"weekly_remaining": max(0, self.limits.weekly_loss - abs(weekly_loss)),
"bets_today": daily_bets,
"bets_remaining": max(0, self.limits.max_bets_per_day - daily_bets),
"should_stop": abs(daily_loss) >= self.limits.daily_loss
}
def _get_limit_message(self, status: Dict) -> str:
"""Generate appropriate warning message."""
if status["should_stop"]:
return "STOP: You have reached your daily loss limit. Please stop betting."
if status["daily_remaining"] < self.limits.daily_loss * 0.2:
return f"WARNING: Only £{status['daily_remaining']:.2f} remaining in daily limit."
if status["bets_remaining"] <= 2:
return f"NOTE: Only {status['bets_remaining']} bets remaining today."
return "Within all limits. Remember to bet responsibly."
def end_session(self, mood_after: str = "neutral",
notes: str = "") -> Dict:
"""End current session and record stats."""
if not self.sessions:
return {"error": "No active session"}
session = self.sessions[-1]
session["end_time"] = datetime.now()
session["mood_after"] = mood_after
session["notes"] = notes
duration = (session["end_time"] - session["start_time"]).total_seconds() / 60
return {
"duration_mins": duration,
"profit_loss": session["profit_loss"],
"within_time_limit": duration <= self.limits.session_time_mins,
"within_loss_limit": session["profit_loss"] > -self.limits.daily_loss
}
def pgsi_screening(self, responses: List[int]) -> Dict:
"""
Problem Gambling Severity Index screening.
responses: List of 9 responses, each 0-3
0 = Never
1 = Sometimes
2 = Most of the time
3 = Almost always
"""
if len(responses) != 9:
return {"error": "PGSI requires exactly 9 responses"}
total = sum(responses)
if total == 0:
risk_level = "Non-problem gambling"
recommendation = "Your gambling appears to be recreational and controlled."
elif total <= 2:
risk_level = "Low risk gambling"
recommendation = "You show few signs of problem gambling, but stay aware."
elif total <= 7:
risk_level = "Moderate risk gambling"
recommendation = "Consider setting stricter limits. Monitor your behavior."
else:
risk_level = "Problem gambling"
recommendation = "Please seek professional support. Help is available."
return {
"score": total,
"max_score": 27,
"risk_level": risk_level,
"recommendation": recommendation,
"seek_help": total >= 8
}
# Example usage
tracker = ResponsibleGamblingTracker()
session = tracker.start_session(mood="excited")
print(f"Session started: {session['message']}")
# Check limits
status = tracker.check_limits()
print(f"\nCurrent Status:")
print(f" Daily remaining: £{status['daily_remaining']:.2f}")
print(f" Weekly remaining: £{status['weekly_remaining']:.2f}")
print(f" Bets remaining today: {status['bets_remaining']}")
# R: Self-assessment and limit tracking
library(tidyverse)
# Gambling behavior tracker
create_behavior_tracker <- function() {
list(
# Set limits
limits = list(
daily_loss = 50,
weekly_loss = 200,
monthly_loss = 500,
session_time_mins = 60
),
# Track sessions
sessions = tibble(
date = as.Date(character()),
start_time = as.POSIXct(character()),
end_time = as.POSIXct(character()),
profit_loss = numeric(),
mood_before = character(),
mood_after = character(),
stuck_to_limits = logical()
),
# Check if approaching limits
check_limits = function(self) {
today <- Sys.Date()
daily_total <- self$sessions %>%
filter(date == today) %>%
summarise(total = sum(profit_loss)) %>%
pull(total)
weekly_total <- self$sessions %>%
filter(date >= today - 7) %>%
summarise(total = sum(profit_loss)) %>%
pull(total)
list(
daily_remaining = self$limits$daily_loss + daily_total,
weekly_remaining = self$limits$weekly_loss + weekly_total,
should_stop = daily_total <= -self$limits$daily_loss
)
}
)
}
# Problem gambling screening (based on PGSI)
pgsi_screening <- function(responses) {
# Responses should be 0-3 for each of 9 questions
# 0 = Never, 1 = Sometimes, 2 = Most of the time, 3 = Almost always
total_score <- sum(responses)
risk_level <- case_when(
total_score == 0 ~ "Non-problem gambling",
total_score <= 2 ~ "Low risk gambling",
total_score <= 7 ~ "Moderate risk gambling",
TRUE ~ "Problem gambling"
)
list(
score = total_score,
risk_level = risk_level,
recommendation = if (total_score >= 3)
"Consider speaking to a professional about your gambling habits"
else
"Continue to monitor and maintain healthy gambling limits"
)
}
print("Behavior tracking system ready")
Practice Exercises
Hands-On Practice
Complete these exercises to apply fantasy and betting analytics:
Build an expected points model for FPL using public xG/xA data. Validate against historical FPL scores to assess accuracy.
Implement the linear programming squad optimizer. Find the optimal £100m squad for a specific gameweek using your xPoints projections.
Collect historical betting odds and match results. Calculate implied probabilities and compare market calibration against your own model.
Build a Fixture Difficulty Rating system using recent xG data. Create visualizations showing fixture swings for all Premier League teams over the next 10 gameweeks.
Analyze the remaining FPL gameweeks and recommend optimal chip timing. Consider Double Gameweeks, Blank Gameweeks, and fixture swings.
Build a Poisson-based over/under model for match totals. Backtest against historical odds to evaluate if your model can find value.
Create a Monte Carlo simulation to project bankroll growth under different staking strategies. Compare flat staking vs Kelly criterion with various edge assumptions.
Track your betting results including both placed and closing odds. Calculate your Closing Line Value (CLV) and analyze whether positive CLV correlates with long-term profitability.
Summary
Key Takeaways
- FPL optimization combines xG/xA projections with scoring system rules
- Squad selection is a constrained optimization problem solvable with linear programming
- Fixture Difficulty Ratings help plan transfers, captaincy, and chip deployment
- Chip timing can swing hundreds of points—plan around DGWs and fixture swings
- Betting odds contain implied probabilities with bookmaker margin (overround)
- Expected value determines whether a bet is theoretically profitable
- Model calibration is as important as accuracy for betting applications
- Kelly criterion helps determine optimal stake sizing—always use fractional Kelly
- Closing Line Value (CLV) is the best indicator of betting skill
- Responsible gambling practices are essential—set limits and stick to them
Common Pitfalls to Avoid
- Chasing FPL template players: High ownership reduces differential potential
- Overweighting recent form: Last 3 games aren't enough sample size
- Ignoring fixture difficulty: A 7-point player vs City isn't the same as vs Sheffield United
- Panic wildcards: One bad week doesn't justify burning a chip
- Ignoring the overround: Betting odds look attractive until you account for margin
- Overstating model confidence: A 5% edge doesn't mean guaranteed profit
- Chasing losses: The surest path to ruin is increasing stakes after losses
- Betting without edge: Entertainment betting is fine, but don't pretend it's investing
- Ignoring variance: Even positive EV bettors face long losing streaks
- Using full Kelly: Full Kelly maximizes growth but also maximizes volatility
Essential Tools and Libraries
| Category | R Libraries | Python Libraries | Purpose |
|---|---|---|---|
| Optimization | lpSolve, ROI | scipy.optimize, PuLP, cvxpy | Squad optimization, lineup selection |
| Data Analysis | tidyverse | pandas, numpy | Data manipulation and statistics |
| Statistical Modeling | stats, MASS | scipy.stats, statsmodels | Poisson models, probability calculations |
| Visualization | ggplot2 | matplotlib, plotly | Fixture tickers, performance charts |
| FPL API Access | fplr, httr2 | fpl, requests | Fetching FPL data |
| Web Scraping | rvest | beautifulsoup4, selenium | Odds data collection |
| Machine Learning | caret, mlr3 | scikit-learn | Prediction models, calibration |
| Simulation | base R | numpy (Monte Carlo) | Bankroll projections, risk of ruin |
FPL Data Sources
- Official FPL API: fantasy.premierleague.com/api/ - Player data, fixtures, gameweeks
- Understat: xG/xA data for top 5 leagues
- FBref: Comprehensive player statistics via StatsBomb
- Fantasy Football Scout: Historical FPL performance data
- FPL Review: Expected points projections
Betting Market Data Sources
- Odds Portal: Historical odds across multiple bookmakers
- Football-Data.co.uk: Free historical odds and results
- Betfair API: Exchange odds and market depth (requires account)
- Pinnacle: Sharpest odds, often used as benchmark
- The Odds API: Real-time odds aggregation
Key Metrics Reference
| Metric | Definition | Good Value |
|---|---|---|
| xPoints | Expected FPL points based on xG/xA | 6+ per gameweek for premiums |
| Value (pts/£) | xPoints divided by price in millions | 0.6+ for good value picks |
| FDR | Fixture Difficulty Rating (1-5) | 1-2 = Easy, 4-5 = Hard |
| Overround | Bookmaker margin on market | ~3-5% for efficient markets |
| Expected Value (EV) | (Prob × Profit) - ((1 - Prob) × Loss) | Positive = theoretically profitable |
| CLV (Closing Line Value) | Difference between bet and closing odds | Positive CLV = beating the market |
| Kelly % | Optimal stake as % of bankroll | Use 25-50% of full Kelly |
| ROI | Profit / Total Staked | 3-5% long-term is excellent |
| Brier Score | Mean squared error of probabilities | Lower is better, ~0.20 is good |
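Several entries in the table follow directly from their definitions. A short sketch with illustrative numbers (the odds and model probability below are assumptions for demonstration, matching the worked example earlier in the chapter):

```python
# Illustrative numbers only: a 1X2 market and one bet we have modelled
odds = [1.65, 3.80, 5.50]                 # home / draw / away decimal odds
overround = sum(1 / o for o in odds) - 1  # bookmaker margin

prob, price = 0.65, 1.65                  # assumed model probability and price taken
ev = prob * (price - 1) - (1 - prob)      # EV per unit staked
b = price - 1
kelly = (b * prob - (1 - prob)) / b       # full Kelly fraction

print(f"Overround: {overround:.1%}")      # ~5.1%
print(f"EV: {ev:+.3f} per unit")
print(f"Full Kelly: {kelly:.1%}; quarter Kelly: {kelly / 4:.1%}")
```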
Final Thoughts
Fantasy football and sports betting can be excellent laboratories for testing analytical skills—they provide immediate feedback on predictions. However, betting should always be approached with caution. The analytics may be fascinating, but the house always has an edge. Use these tools responsibly, set strict limits, and never risk more than you can afford to lose.
Fantasy and betting analytics provide excellent ways to test prediction skills while enjoying the game. In the next chapter, we'll explore the unique challenges and opportunities of women's football analytics.