Introduction to Fixture Congestion Analytics
Modern football clubs competing across multiple competitions face unprecedented fixture congestion. Top teams can play 60+ matches per season, with periods of intense scheduling that challenge player welfare and team performance. Analytics provides essential tools for managing this complexity.
The Modern Scheduling Challenge
Elite clubs now navigate domestic leagues, cup competitions, and continental tournaments simultaneously. Understanding the performance impact of fixture density and optimizing rotation strategies can be the difference between silverware and squad breakdown.
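Fixture density can be made concrete as the number of matches that fall inside a trailing window. The sketch below is a minimal illustration using only the standard library. The 14-day window and the helper name `matches_in_window` are assumptions chosen for the example, not values prescribed by this chapter.

```python
from datetime import date, timedelta

def matches_in_window(match_dates, window_days=14):
    """For each match, count the matches played in the trailing window.

    The window is inclusive of the match day itself, so every match
    counts at least itself.
    """
    ordered = sorted(match_dates)
    counts = []
    for current in ordered:
        window_start = current - timedelta(days=window_days - 1)
        counts.append(sum(window_start <= d <= current for d in ordered))
    return list(zip(ordered, counts))

# Same dates as the sample fixture list used in this section
sample_dates = [
    date(2024, 8, 17), date(2024, 8, 24), date(2024, 8, 28),
    date(2024, 8, 31), date(2024, 9, 14), date(2024, 9, 17),
    date(2024, 9, 21), date(2024, 9, 24), date(2024, 9, 28),
]

density = matches_in_window(sample_dates)
for match_day, count in density:
    print(match_day.isoformat(), count)
```

On the sample dates the count peaks at four matches in 14 days in late September, which is where the League Cup and Champions League fixtures overlap with the league schedule.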
# Python: Analyzing Fixture Congestion
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
# Sample fixture data
fixtures = pd.DataFrame({
"date": ["2024-08-17", "2024-08-24", "2024-08-28", "2024-08-31",
"2024-09-14", "2024-09-17", "2024-09-21", "2024-09-24", "2024-09-28"],
"competition": ["Premier League", "Premier League", "League Cup", "Premier League",
"Premier League", "Champions League", "Premier League",
"League Cup", "Premier League"],
"opponent": ["Chelsea", "Brighton", "Fulham", "Newcastle", "Everton",
"AC Milan", "Bournemouth", "West Ham", "Wolves"],
"location": ["Home", "Away", "Home", "Home", "Away", "Away", "Home", "Away", "Away"],
"result": ["W", "W", "W", "D", "W", "W", "W", "W", "L"]
})
# Convert dates
fixtures["date"] = pd.to_datetime(fixtures["date"])
# Calculate rest days
fixtures["days_since_last"] = fixtures["date"].diff().dt.days
fixtures["days_to_next"] = fixtures["date"].diff(-1).dt.days.abs()
# Categorize rest periods
def categorize_rest(days):
    """Bucket the number of rest days into a labelled category."""
    if pd.isna(days):
        return "N/A"
    elif days <= 3:
        return "Short (<=3 days)"
    elif days <= 5:
        return "Medium (4-5 days)"
    elif days <= 7:
        return "Normal (6-7 days)"
    else:
        return "Extended (>7 days)"

fixtures["rest_category"] = fixtures["days_since_last"].apply(categorize_rest)
# Month and week
fixtures["month"] = fixtures["date"].dt.month_name()
fixtures["week"] = fixtures["date"].dt.isocalendar().week
# Monthly congestion analysis
monthly_congestion = fixtures.groupby("month").agg({
"date": "count",
"days_since_last": ["mean", "min"],
"competition": "nunique"
}).reset_index()
monthly_congestion.columns = ["month", "matches", "avg_rest", "min_rest", "competitions"]
print("Monthly Fixture Congestion:")
print(monthly_congestion)
# Performance by rest days
def get_points(result):
    """Map a result code to league points (W=3, D=1, L=0)."""
    return 3 if result == "W" else (1 if result == "D" else 0)

fixtures["points"] = fixtures["result"].apply(get_points)
rest_performance = fixtures[fixtures["days_since_last"].notna()].groupby("rest_category").agg({
"date": "count",
"points": "mean",
"result": lambda x: (x == "W").mean() * 100
}).reset_index()
rest_performance.columns = ["rest_category", "matches", "avg_points", "win_rate"]
print("\nPerformance by Rest Period:")
print(rest_performance)
# R: Analyzing Fixture Congestion
library(tidyverse)
library(lubridate)
# Sample fixture data
fixtures <- tribble(
~date, ~competition, ~opponent, ~location, ~result,
"2024-08-17", "Premier League", "Chelsea", "Home", "W",
"2024-08-24", "Premier League", "Brighton", "Away", "W",
"2024-08-28", "League Cup", "Fulham", "Home", "W",
"2024-08-31", "Premier League", "Newcastle", "Home", "D",
"2024-09-14", "Premier League", "Everton", "Away", "W",
"2024-09-17", "Champions League", "AC Milan", "Away", "W",
"2024-09-21", "Premier League", "Bournemouth", "Home", "W",
"2024-09-24", "League Cup", "West Ham", "Away", "W",
"2024-09-28", "Premier League", "Wolves", "Away", "L"
)
# Convert dates and calculate rest days
fixtures <- fixtures %>%
mutate(
date = as.Date(date),
days_since_last = as.numeric(date - lag(date)),
days_to_next = as.numeric(lead(date) - date),
# Categorize rest periods
    # The first fixture has no previous match, so label it explicitly
    rest_category = case_when(
      is.na(days_since_last) ~ "N/A",
      days_since_last <= 3 ~ "Short (<=3 days)",
      days_since_last <= 5 ~ "Medium (4-5 days)",
      days_since_last <= 7 ~ "Normal (6-7 days)",
      TRUE ~ "Extended (>7 days)"
    ),
# Week number for congestion analysis
week = week(date),
month = month(date, label = TRUE)
)
# Congestion metrics by month
monthly_congestion <- fixtures %>%
group_by(month) %>%
summarise(
matches = n(),
avg_rest = mean(days_since_last, na.rm = TRUE),
min_rest = min(days_since_last, na.rm = TRUE),
competitions = n_distinct(competition),
.groups = "drop"
)
print("Monthly Fixture Congestion:")
print(monthly_congestion)
# Performance by rest days
rest_performance <- fixtures %>%
filter(!is.na(days_since_last)) %>%
mutate(
points = case_when(
result == "W" ~ 3,
result == "D" ~ 1,
TRUE ~ 0
)
) %>%
group_by(rest_category) %>%
summarise(
matches = n(),
avg_points = mean(points),
win_rate = mean(result == "W") * 100,
.groups = "drop"
)
cat("\nPerformance by Rest Period:\n")
print(rest_performance)
Player Workload Monitoring
Effective squad rotation requires understanding individual player workload. By tracking minutes played, match intensity, and recovery status, analysts can provide data-driven rotation recommendations.
- Minutes Played: Cumulative and rolling
- Match Load: High-intensity actions
- Travel Burden: Distance and time zones
- Recovery Time: Days between appearances
- Acute:Chronic Ratio: Training load spikes
- Consecutive Starts: Fatigue accumulation
- Age Factor: Recovery rates by age
- Injury History: Vulnerability patterns
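The acute:chronic ratio in the list above compares a player's most recent load with their longer-term average. The sketch below is one minimal way to compute it from dated minutes, assuming a 7-day acute window and a 28-day chronic window expressed as an average week (the 28-day total divided by 4). The function names are hypothetical, and minutes played stand in for a full training-load measure.

```python
from datetime import date, timedelta

def window_minutes(appearances, as_of, days):
    """Sum minutes for appearances in the `days`-day window ending on `as_of`.

    The window is (as_of - days, as_of]: the start is exclusive and the
    end is inclusive.
    """
    start = as_of - timedelta(days=days)
    return sum(minutes for played_on, minutes in appearances
               if start < played_on <= as_of)

def acute_chronic_ratio(appearances, as_of):
    """Return 7-day load divided by average weekly load over 28 days."""
    acute = window_minutes(appearances, as_of, 7)
    chronic_weekly = window_minutes(appearances, as_of, 28) / 4
    # No chronic load means the ratio is undefined rather than infinite
    return acute / chronic_weekly if chronic_weekly > 0 else None

# Illustrative appearance log: (date, minutes played)
player_a = [
    (date(2024, 8, 17), 90), (date(2024, 8, 24), 90), (date(2024, 8, 28), 70),
    (date(2024, 8, 31), 90), (date(2024, 9, 14), 0), (date(2024, 9, 17), 90),
]

ratio = acute_chronic_ratio(player_a, date(2024, 9, 17))
print(f"ACWR on 2024-09-17: {ratio:.2f}")
```

Early in a season the chronic window is only partly filled, so the ratio is inflated until 28 days of history exist. It is worth treating those early values with caution.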
# Python: Player Workload Tracking System
import pandas as pd
import numpy as np
# Player appearance data
appearances = pd.DataFrame({
"player": ["Player A"]*6 + ["Player B"]*6,
"date": ["2024-08-17", "2024-08-24", "2024-08-28", "2024-08-31",
"2024-09-14", "2024-09-17"] * 2,
"minutes": [90, 90, 70, 90, 0, 90, 90, 65, 0, 90, 90, 75],
"competition": ["League", "League", "Cup", "League", "League", "UCL"] * 2,
"started": [True, True, True, True, False, True,
True, True, False, True, True, True]
})
appearances["date"] = pd.to_datetime(appearances["date"])
class WorkloadTracker:
    """Track and analyze player workload"""

    def __init__(self, appearances_df):
        self.data = appearances_df.sort_values(["player", "date"])

    def calculate_metrics(self):
        """Calculate comprehensive workload metrics"""
        df = self.data.copy()
        metrics = []
        for player in df["player"].unique():
            player_data = df[df["player"] == player].copy()
            # Rolling minutes over calendar windows (7 and 28 days),
            # rather than over a fixed number of appearances
            by_date = player_data.set_index("date")["minutes"]
            player_data["minutes_7d"] = by_date.rolling("7D").sum().to_numpy()
            player_data["minutes_28d"] = by_date.rolling("28D").sum().to_numpy()
            # ACWR: 7-day load vs. average weekly load over 28 days
            chronic_weekly = player_data["minutes_28d"] / 4
            player_data["acwr"] = (
                player_data["minutes_7d"] / chronic_weekly.replace(0, np.nan)
            )
            # Consecutive starts (the count resets after any non-start)
            player_data["consecutive_starts"] = player_data["started"].astype(int).groupby(
                (~player_data["started"]).cumsum()
            ).cumsum()
            # Days rest
            player_data["days_rest"] = player_data["date"].diff().dt.days
            # Cumulative minutes
            player_data["season_minutes"] = player_data["minutes"].cumsum()
            metrics.append(player_data)
        return pd.concat(metrics)

    def get_current_status(self):
        """Get current workload status for all players"""
        metrics = self.calculate_metrics()
        # Take each player's most recent row as a whole
        current = metrics.groupby("player").tail(1).reset_index(drop=True)

        # Risk assessment
        def assess_fatigue(row):
            if row["acwr"] > 1.5:
                return "High"
            elif row["acwr"] > 1.2:
                return "Medium"
            return "Low"

        def assess_status(row):
            if row["season_minutes"] > 1000 and row["consecutive_starts"] > 4:
                return "Needs Rest"
            elif row["acwr"] > 1.3:
                return "Monitor Closely"
            return "Good"

        current["fatigue_risk"] = current.apply(assess_fatigue, axis=1)
        current["workload_status"] = current.apply(assess_status, axis=1)
        return current[["player", "season_minutes", "minutes_7d",
                        "acwr", "consecutive_starts", "days_rest",
                        "fatigue_risk", "workload_status"]]

    def rotation_recommendation(self):
        """Generate rotation recommendations"""
        status = self.get_current_status()
        priority_map = {"Needs Rest": 1, "Monitor Closely": 2, "Good": 3}
        status["priority"] = status["workload_status"].map(priority_map)
        recommendation_map = {
            1: "Rest recommended",
            2: "Consider rotation",
            3: "Available for selection"
        }
        status["recommendation"] = status["priority"].map(recommendation_map)
        return status.sort_values("priority")
# Usage
tracker = WorkloadTracker(appearances)
current_status = tracker.get_current_status()
print("Current Player Workload Status:")
print(current_status.to_string(index=False))
print("\nRotation Recommendations:")
print(tracker.rotation_recommendation()[["player", "recommendation"]].to_string(index=False))
# R: Player Workload Tracking System
library(tidyverse)
library(zoo)
# Player appearance data
appearances <- tribble(
~player, ~date, ~minutes, ~competition, ~started,
"Player A", "2024-08-17", 90, "League", TRUE,
"Player A", "2024-08-24", 90, "League", TRUE,
"Player A", "2024-08-28", 70, "Cup", TRUE,
"Player A", "2024-08-31", 90, "League", TRUE,
"Player A", "2024-09-14", 0, "League", FALSE,
"Player A", "2024-09-17", 90, "UCL", TRUE,
"Player B", "2024-08-17", 90, "League", TRUE,
"Player B", "2024-08-24", 65, "League", TRUE,
"Player B", "2024-08-28", 0, "Cup", FALSE,
"Player B", "2024-08-31", 90, "League", TRUE,
"Player B", "2024-09-14", 90, "League", TRUE,
"Player B", "2024-09-17", 75, "UCL", TRUE
)
# Calculate workload metrics
workload_metrics <- appearances %>%
mutate(date = as.Date(date)) %>%
arrange(player, date) %>%
group_by(player) %>%
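  mutate(
    # Rolling minutes over calendar windows (7 and 28 days),
    # rather than over a fixed number of appearances
    minutes_7d = sapply(seq_along(date), function(i)
      sum(minutes[date > date[i] - 7 & date <= date[i]])),
    minutes_28d = sapply(seq_along(date), function(i)
      sum(minutes[date > date[i] - 28 & date <= date[i]])),
    # Acute:Chronic ratio: 7-day load vs. average weekly load over 28 days
    acwr = if_else(minutes_28d > 0, minutes_7d / (minutes_28d / 4), NA_real_),
    # Consecutive starts (the count resets after any non-start)
    consecutive_starts = cumsum(started) - cummax(cumsum(started) * !started),
    # Days since last appearance
    days_rest = as.numeric(date - lag(date)),
    # Cumulative season minutes
    season_minutes = cumsum(minutes)
  ) %>%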
ungroup()
# Current workload status
current_status <- workload_metrics %>%
group_by(player) %>%
slice_tail(n = 1) %>%
select(player, season_minutes, minutes_7d, acwr, consecutive_starts, days_rest) %>%
mutate(
# Risk assessment
fatigue_risk = case_when(
acwr > 1.5 ~ "High",
acwr > 1.2 ~ "Medium",
TRUE ~ "Low"
),
workload_status = case_when(
season_minutes > 1000 & consecutive_starts > 4 ~ "Needs Rest",
acwr > 1.3 ~ "Monitor Closely",
TRUE ~ "Good"
)
)
print("Current Player Workload Status:")
print(current_status)
# Rotation recommendation
rotation_recommendation <- function(player_status, upcoming_matches) {
player_status %>%
mutate(
priority_score = case_when(
workload_status == "Needs Rest" ~ 1,
workload_status == "Monitor Closely" ~ 2,
TRUE ~ 3
),
recommendation = case_when(
priority_score == 1 ~ "Rest recommended",
priority_score == 2 ~ "Consider rotation",
TRUE ~ "Available for selection"
)
) %>%
arrange(priority_score)
}
Squad Rotation Optimization
Optimizing squad rotation requires balancing multiple objectives: maximizing match performance, minimizing injury risk, and ensuring squad development. Mathematical optimization provides a framework for these complex decisions.
# Python: Squad Rotation Optimization
import pandas as pd
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds
# Squad data
squad_data = {
"player": ["GK1", "GK2", "CB1", "CB2", "CB3", "CB4", "LB1", "LB2",
"RB1", "RB2", "CM1", "CM2", "CM3", "CM4", "WG1", "WG2",
"WG3", "ST1", "ST2"],
"position": ["GK", "GK", "CB", "CB", "CB", "CB", "LB", "LB",
"RB", "RB", "CM", "CM", "CM", "CM", "WG", "WG",
"WG", "ST", "ST"],
"quality": [85, 72, 88, 85, 78, 75, 82, 74, 80, 73, 90, 86, 80, 75, 88, 82, 76, 90, 82],
"fatigue": [0.2, 0.1, 0.6, 0.3, 0.2, 0.1, 0.5, 0.2, 0.4, 0.2,
0.7, 0.4, 0.2, 0.1, 0.5, 0.3, 0.2, 0.6, 0.3],
"form": [80, 75, 82, 85, 78, 72, 80, 76, 78, 74, 88, 84, 80, 72, 86, 80, 74, 88, 80],
"age": [28, 24, 30, 27, 23, 21, 29, 22, 28, 23, 31, 27, 24, 20, 26, 25, 22, 29, 24],
"injury_risk": [0.1, 0.1, 0.3, 0.2, 0.1, 0.1, 0.25, 0.1, 0.2, 0.1,
0.35, 0.2, 0.15, 0.1, 0.2, 0.15, 0.1, 0.25, 0.15]
}
squad = pd.DataFrame(squad_data)
class RotationOptimizer:
    """Optimize squad rotation for fixture congestion"""

    def __init__(self, squad_df):
        self.squad = squad_df.copy()
        self._calculate_metrics()

    def _calculate_metrics(self):
        """Calculate effective quality metrics"""
        df = self.squad
        df["fatigue_penalty"] = df["fatigue"] * 15  # Up to 15 point reduction
        df["effective_quality"] = df["quality"] - df["fatigue_penalty"] + (df["form"] - 80) * 0.5
        df["match_fitness"] = df["effective_quality"] * (1 - df["injury_risk"])

    def optimize_lineup(self, formation="4-3-3", match_importance="high"):
        """Select optimal XI based on formation and match importance"""
        formations = {
            "4-3-3": {"GK": 1, "CB": 2, "LB": 1, "RB": 1, "CM": 3, "WG": 2, "ST": 1},
            "4-4-2": {"GK": 1, "CB": 2, "LB": 1, "RB": 1, "CM": 4, "WG": 0, "ST": 2},
            "3-5-2": {"GK": 1, "CB": 3, "LB": 0, "RB": 0, "CM": 5, "WG": 0, "ST": 2}
        }
        reqs = formations[formation]
        # Importance weighting
        weights = {"high": 0.8, "medium": 0.5, "low": 0.2}
        w = weights.get(match_importance, 0.5)
        # Selection score
        self.squad["selection_score"] = (
            self.squad["effective_quality"] * w +
            (100 - self.squad["fatigue"] * 100) * (1 - w)
        )
        # Greedy selection by position
        selected = []
        selected_players = set()
        for pos, needed in reqs.items():
            if needed > 0:
                candidates = self.squad[
                    (self.squad["position"] == pos) &
                    (~self.squad["player"].isin(selected_players))
                ].nlargest(needed, "selection_score")
                selected.append(candidates)
                selected_players.update(candidates["player"].tolist())
        return pd.concat(selected).sort_values(
            "position",
            key=lambda x: pd.Categorical(x, ["GK", "CB", "LB", "RB", "CM", "WG", "ST"])
        )

    def weekly_rotation_plan(self, fixtures):
        """Generate rotation plan for upcoming fixtures"""
        plans = []
        for fixture in fixtures:
            lineup = self.optimize_lineup(
                formation=fixture.get("formation", "4-3-3"),
                match_importance=fixture.get("importance", "medium")
            )
            plans.append({
                "match": fixture["opponent"],
                "lineup": lineup["player"].tolist(),
                "avg_quality": lineup["effective_quality"].mean()
            })
            # Update fatigue for selected players
            for player in lineup["player"]:
                idx = self.squad[self.squad["player"] == player].index[0]
                self.squad.loc[idx, "fatigue"] = min(1.0,
                    self.squad.loc[idx, "fatigue"] + 0.15)
            # Rest effect for non-selected
            non_selected = ~self.squad["player"].isin(lineup["player"])
            self.squad.loc[non_selected, "fatigue"] = (
                self.squad.loc[non_selected, "fatigue"] * 0.7
            )
            self._calculate_metrics()
        return plans
# Usage
optimizer = RotationOptimizer(squad)
print("High Importance Match XI:")
high_xi = optimizer.optimize_lineup("4-3-3", "high")
print(high_xi[["player", "position", "effective_quality", "fatigue"]].to_string(index=False))
optimizer2 = RotationOptimizer(squad) # Fresh instance
print("\nRotation Match XI:")
rotation_xi = optimizer2.optimize_lineup("4-3-3", "low")
print(rotation_xi[["player", "position", "effective_quality", "fatigue"]].to_string(index=False))
# R: Squad Rotation Optimization
library(tidyverse)
library(lpSolve)
# Squad data with current status
squad <- tribble(
~player, ~position, ~quality, ~fatigue, ~form, ~age, ~injury_risk,
"GK1", "GK", 85, 0.2, 80, 28, 0.1,
"GK2", "GK", 72, 0.1, 75, 24, 0.1,
"CB1", "CB", 88, 0.6, 82, 30, 0.3,
"CB2", "CB", 85, 0.3, 85, 27, 0.2,
"CB3", "CB", 78, 0.2, 78, 23, 0.1,
"CB4", "CB", 75, 0.1, 72, 21, 0.1,
"LB1", "LB", 82, 0.5, 80, 29, 0.25,
"LB2", "LB", 74, 0.2, 76, 22, 0.1,
"RB1", "RB", 80, 0.4, 78, 28, 0.2,
"RB2", "RB", 73, 0.2, 74, 23, 0.1,
"CM1", "CM", 90, 0.7, 88, 31, 0.35,
"CM2", "CM", 86, 0.4, 84, 27, 0.2,
"CM3", "CM", 80, 0.2, 80, 24, 0.15,
"CM4", "CM", 75, 0.1, 72, 20, 0.1,
"WG1", "WG", 88, 0.5, 86, 26, 0.2,
"WG2", "WG", 82, 0.3, 80, 25, 0.15,
"WG3", "WG", 76, 0.2, 74, 22, 0.1,
"ST1", "ST", 90, 0.6, 88, 29, 0.25,
"ST2", "ST", 82, 0.3, 80, 24, 0.15
)
# Calculate effective quality (accounting for fatigue)
squad <- squad %>%
mutate(
fatigue_penalty = fatigue * 15, # Up to 15 point reduction
effective_quality = quality - fatigue_penalty + (form - 80) * 0.5,
# Match fitness score
match_fitness = effective_quality * (1 - injury_risk)
)
# Optimization: Select best XI with constraints
optimize_lineup <- function(squad, formation = "4-3-3",
match_importance = "high") {
# Formation requirements
formations <- list(
"4-3-3" = c(GK = 1, CB = 2, LB = 1, RB = 1, CM = 3, WG = 2, ST = 1),
"4-4-2" = c(GK = 1, CB = 2, LB = 1, RB = 1, CM = 4, WG = 0, ST = 2),
"3-5-2" = c(GK = 1, CB = 3, LB = 0, RB = 0, CM = 5, WG = 0, ST = 2)
)
reqs <- formations[[formation]]
# Weight objective by match importance
importance_weight <- case_when(
match_importance == "high" ~ 0.8, # Prioritize quality
match_importance == "medium" ~ 0.5, # Balance
TRUE ~ 0.2 # Prioritize rest
)
# Objective: maximize effective quality while managing fatigue
squad <- squad %>%
mutate(
selection_score = effective_quality * importance_weight +
(100 - fatigue * 100) * (1 - importance_weight)
)
# Simple greedy selection by position
selected <- tibble()
for (pos in names(reqs)) {
needed <- reqs[pos]
if (needed > 0) {
pos_players <- squad %>%
filter(position == pos, !player %in% selected$player) %>%
arrange(desc(selection_score)) %>%
head(needed)
selected <- bind_rows(selected, pos_players)
}
}
selected %>%
arrange(factor(position, levels = c("GK", "CB", "LB", "RB", "CM", "WG", "ST")))
}
# Generate lineup for different scenarios
high_importance <- optimize_lineup(squad, "4-3-3", "high")
rotation_match <- optimize_lineup(squad, "4-3-3", "low")
cat("High Importance Match XI:\n")
print(high_importance %>% select(player, position, effective_quality, fatigue))
cat("\nRotation Match XI:\n")
print(rotation_match %>% select(player, position, effective_quality, fatigue))
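The weighted selection score used above can be traced by hand for two of the sample midfielders. The sketch below re-implements only the two formulas from this section (effective quality and selection score) in plain Python. The function names are illustrative, and the inputs are the CM1 and CM4 rows of the sample squad.

```python
def effective_quality(quality, fatigue, form):
    """Quality minus a fatigue penalty (up to 15 points) plus a form adjustment."""
    return quality - fatigue * 15 + (form - 80) * 0.5

def selection_score(quality, fatigue, form, importance_weight):
    """Blend effective quality with freshness, weighted by match importance."""
    freshness = 100 - fatigue * 100
    return (effective_quality(quality, fatigue, form) * importance_weight
            + freshness * (1 - importance_weight))

# Sample squad rows: a tired first-choice midfielder and a fresh backup
players = {
    "CM1": dict(quality=90, fatigue=0.7, form=88),
    "CM4": dict(quality=75, fatigue=0.1, form=72),
}

scores = {}
for label, weight in [("high", 0.8), ("low", 0.2)]:
    for name, attrs in players.items():
        scores[(label, name)] = selection_score(**attrs, importance_weight=weight)
        print(f"{label:>4} importance, {name}: {scores[(label, name)]:.1f}")
```

With these sample numbers the fresh backup narrowly outscores the tired first-choice player even at the "high" weight of 0.8, because freshness spans a much wider range (30 to 90) than effective quality (69.5 to 83.5). If quality is meant to dominate in important matches, the freshness term may need rescaling or a larger weight gap.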
Competition Prioritization
Not all competitions carry equal weight. Managers must make strategic decisions about where to allocate their best players, sometimes sacrificing cup runs to preserve league positions or vice versa.
Competition Value Framework
Consider these factors when prioritizing competitions:
- Trophy Value: Prestige and historical significance
- Financial Impact: Prize money, TV revenue, qualification bonuses
- Strategic Position: Current standing and qualification implications
- Opponent Difficulty: Expected challenge level
- Schedule Impact: How it affects upcoming key matches
# Python: Competition Prioritization Model
import pandas as pd
import numpy as np
# Competition values
competitions = pd.DataFrame({
"competition": ["Premier League", "Champions League", "FA Cup",
"League Cup", "Europa League", "Community Shield"],
"trophy_value": [100, 100, 70, 40, 60, 20],
"financial_value": [150, 200, 30, 20, 80, 10],
"prestige": [95, 100, 80, 50, 70, 30]
})
# Current context
context = pd.DataFrame({
"competition": ["Premier League", "Champions League", "FA Cup", "League Cup"],
"current_position": ["3", "QF", "R16", "Out"],
"matches_remaining": [15, 3, 4, 0]
})
class CompetitionPrioritizer:
    """Calculate dynamic competition priorities"""

    def __init__(self, comp_values, season_context):
        self.comp_values = comp_values
        self.context = season_context

    def calculate_priorities(self):
        """Calculate priority scores for each competition"""
        df = self.comp_values.merge(self.context, on="competition", how="left")
        # Base score
        df["base_score"] = (df["trophy_value"] + df["financial_value"] + df["prestige"]) / 3

        # Position factor
        def position_factor(row):
            if row["competition"] == "Premier League":
                try:
                    pos = int(row["current_position"])
                    if pos <= 4:
                        return 1.3
                    elif pos <= 6:
                        return 1.1
                except (TypeError, ValueError):
                    # Non-numeric or missing position: fall through to the defaults
                    pass
            if row["competition"] == "Champions League" and row["current_position"] == "QF":
                return 1.4
            if row["current_position"] == "Out":
                return 0
            return 1.0

        df["position_factor"] = df.apply(position_factor, axis=1)

        # Proximity bonus
        def proximity_bonus(matches):
            if pd.isna(matches):
                return 1.0
            if 0 < matches <= 3:
                return 1.2
            if 0 < matches <= 5:
                return 1.1
            return 1.0

        df["proximity_bonus"] = df["matches_remaining"].apply(proximity_bonus)
        # Final priority
        df["priority_score"] = df["base_score"] * df["position_factor"] * df["proximity_bonus"]
        df["priority_rank"] = df["priority_score"].rank(ascending=False, method="dense")
        return df.sort_values("priority_rank")

    def match_importance(self, competition, opponent_strength, match_type="regular"):
        """Calculate importance of specific match"""
        priorities = self.calculate_priorities()
        comp_row = priorities[priorities["competition"] == competition]
        if len(comp_row) == 0:
            return 0
        comp_priority = comp_row["priority_score"].values[0]
        # Match type multiplier
        type_mult = {
            "final": 1.5,
            "semifinal": 1.3,
            "knockout": 1.2,
            "title_decider": 1.5,
            "regular": 1.0
        }.get(match_type, 1.0)
        # Opponent factor (strength on a 0-100 scale)
        opponent_factor = 0.5 + (opponent_strength / 200)
        return comp_priority * type_mult * opponent_factor
# Usage
prioritizer = CompetitionPrioritizer(competitions, context)
priorities = prioritizer.calculate_priorities()
print("Competition Priority Rankings:")
print(priorities[["competition", "base_score", "priority_score", "priority_rank"]].to_string(index=False))
# Example matches
example_matches = [
("Premier League", "Liverpool", 95, "title_decider"),
("Champions League", "Bayern", 92, "knockout"),
("FA Cup", "Luton", 60, "regular"),
]
print("\nUpcoming Match Importance:")
for comp, opp, strength, mtype in example_matches:
    importance = prioritizer.match_importance(comp, strength, mtype)
    print(f" {comp} vs {opp}: {importance:.1f}")
# R: Competition Prioritization Model
library(tidyverse)
# Define competition values
competitions <- tribble(
~competition, ~trophy_value, ~financial_value, ~prestige,
"Premier League", 100, 150, 95,
"Champions League", 100, 200, 100,
"FA Cup", 70, 30, 80,
"League Cup", 40, 20, 50,
"Europa League", 60, 80, 70,
"Community Shield", 20, 10, 30
)
# Current season context
# current_position is stored as text so league places and cup rounds
# can share one column (tribble() cannot mix numbers and strings)
season_context <- tribble(
  ~competition, ~current_position, ~qualification_gap, ~matches_remaining,
  "Premier League", "3", 5, 15,
  "Champions League", "QF", NA, 3,
  "FA Cup", "R16", NA, 4,
  "League Cup", "Out", NA, 0
)
# Calculate dynamic priority
calculate_priority <- function(comp_values, context) {
  comp_values %>%
    left_join(context, by = "competition") %>%
    mutate(
      # League position as a number (NA for cup rounds such as "QF")
      league_position = suppressWarnings(as.integer(current_position)),
      # Base importance
      base_score = (trophy_value + financial_value + prestige) / 3,
      # Context adjustments
      position_factor = case_when(
        competition == "Premier League" & league_position <= 4 ~ 1.3,
        competition == "Premier League" & league_position <= 6 ~ 1.1,
        competition == "Champions League" & current_position == "QF" ~ 1.4,
        current_position == "Out" ~ 0,
        TRUE ~ 1.0
      ),
# Achievement proximity bonus
proximity_bonus = case_when(
matches_remaining <= 3 & matches_remaining > 0 ~ 1.2,
matches_remaining <= 5 & matches_remaining > 0 ~ 1.1,
TRUE ~ 1.0
),
# Final priority score
priority_score = base_score * position_factor * proximity_bonus,
priority_rank = dense_rank(desc(priority_score))
) %>%
arrange(priority_rank)
}
priority_table <- calculate_priority(competitions, season_context)
print("Competition Priority Rankings:")
print(priority_table %>%
select(competition, base_score, priority_score, priority_rank))
# Match importance calculator
calculate_match_importance <- function(competition, opponent_strength,
match_type = "regular") {
# Get competition priority
comp_priority <- priority_table %>%
filter(competition == !!competition) %>%
pull(priority_score)
# Match type multiplier
type_mult <- case_when(
match_type == "final" ~ 1.5,
match_type == "semifinal" ~ 1.3,
match_type == "knockout" ~ 1.2,
match_type == "title_decider" ~ 1.5,
TRUE ~ 1.0
)
# Opponent factor (0-100 scale)
opponent_factor <- 0.5 + (opponent_strength / 200)
# Final importance
importance <- comp_priority * type_mult * opponent_factor
return(importance)
}
# Example matches
example_matches <- tribble(
~competition, ~opponent, ~opponent_strength, ~match_type,
"Premier League", "Liverpool", 95, "title_decider",
"Champions League", "Bayern", 92, "knockout",
"FA Cup", "Luton", 60, "regular",
"League Cup", "Wigan", 45, "regular"
)
example_matches <- example_matches %>%
rowwise() %>%
mutate(
importance = calculate_match_importance(competition, opponent_strength, match_type)
) %>%
ungroup() %>%
arrange(desc(importance))
cat("\nUpcoming Match Importance:\n")
print(example_matches)
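The priority arithmetic above is easy to verify by hand. The sketch below recomputes three priority scores and two match-importance values from the sample inputs in this section, using standalone functions whose names are illustrative. The factors passed in (for example 1.4 for a Champions League quarter-final) are the ones the model assigns to the sample context.

```python
def priority_score(trophy, financial, prestige, position_factor, proximity_bonus):
    """Base score (mean of the three values) scaled by the context factors."""
    base = (trophy + financial + prestige) / 3
    return base * position_factor * proximity_bonus

def match_importance(priority, type_multiplier, opponent_strength):
    """Scale competition priority by match type and opponent strength (0-100)."""
    return priority * type_multiplier * (0.5 + opponent_strength / 200)

# Champions League: quarter-final (1.4), 3 matches remaining (1.2)
ucl = priority_score(100, 200, 100, position_factor=1.4, proximity_bonus=1.2)
# Premier League: 3rd place (1.3), 15 matches remaining (1.0)
pl = priority_score(100, 150, 95, position_factor=1.3, proximity_bonus=1.0)
# FA Cup: round of 16 (1.0), 4 matches remaining (1.1)
fa = priority_score(70, 30, 80, position_factor=1.0, proximity_bonus=1.1)
print(f"UCL {ucl:.1f}, PL {pl:.1f}, FA Cup {fa:.1f}")

ucl_bayern = match_importance(ucl, type_multiplier=1.2, opponent_strength=92)
pl_liverpool = match_importance(pl, type_multiplier=1.5, opponent_strength=95)
print(f"UCL vs Bayern {ucl_bayern:.1f}, PL vs Liverpool {pl_liverpool:.1f}")
```

Because the financial value is on a larger scale than the other two inputs (up to 200 rather than 100), it carries more weight in the unweighted mean. Normalizing the three inputs first would be one way to avoid that if it is not intended.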
Travel Impact Analysis
Travel can significantly impact player performance, particularly for teams competing in European competitions. Understanding travel burden helps in planning appropriate rest and rotation.
# Python: Travel Impact Analysis
import pandas as pd
import numpy as np
from math import radians, sin, cos, sqrt, atan2
# Stadium locations
stadiums = pd.DataFrame({
"city": ["London", "Manchester", "Liverpool", "Munich",
"Milan", "Madrid", "Istanbul", "Moscow"],
"lat": [51.5074, 53.4631, 53.4308, 48.2188,
45.4781, 40.4530, 41.0370, 55.8166],
"lon": [-0.1278, -2.2913, -2.9608, 11.6247,
9.1240, -3.6883, 28.9850, 37.5554],
"timezone_offset": [0, 0, 0, 1, 1, 1, 3, 3] # Hours from GMT
})
def haversine_distance(lat1, lon1, lat2, lon2):
    """Calculate great-circle distance between two points in km"""
    R = 6371  # Earth radius in km
    lat1, lon1, lat2, lon2 = map(radians, [lat1, lon1, lat2, lon2])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * atan2(sqrt(a), sqrt(1-a))
    return R * c
class TravelAnalyzer:
    """Analyze travel impact on squad"""

    def __init__(self, stadiums_df):
        self.stadiums = stadiums_df

    def calculate_travel_burden(self, from_city, to_city):
        """Calculate travel burden between cities"""
        from_loc = self.stadiums[self.stadiums["city"] == from_city].iloc[0]
        to_loc = self.stadiums[self.stadiums["city"] == to_city].iloc[0]
        # Distance
        distance = haversine_distance(
            from_loc["lat"], from_loc["lon"],
            to_loc["lat"], to_loc["lon"]
        )
        # Timezone difference
        tz_diff = abs(from_loc["timezone_offset"] - to_loc["timezone_offset"])
        # Travel burden score
        burden = (distance / 500) + (tz_diff * 2)
        return {
            "from": from_city,
            "to": to_city,
            "distance_km": round(distance),
            "timezone_diff": tz_diff,
            "travel_burden": round(burden, 2)
        }

    def analyze_fixtures(self, fixtures_df):
        """Analyze travel burden for fixture list"""
        results = []
        for _, fixture in fixtures_df.iterrows():
            # The away side travels from its own city to the host city
            travel = self.calculate_travel_burden(
                fixture["away_city"],
                fixture["home_city"]
            )
            travel["match_date"] = fixture["match_date"]
            travel["competition"] = fixture["competition"]
            results.append(travel)
        df = pd.DataFrame(results)

        # Recovery recommendations
        def recovery_days(burden):
            if burden > 5:
                return 4
            elif burden > 3:
                return 3
            elif burden > 1:
                return 2
            return 1

        def rotation_rec(burden):
            if burden > 5:
                return "Heavy rotation recommended"
            elif burden > 3:
                return "Consider rotation"
            return "Normal selection"

        df["recovery_days"] = df["travel_burden"].apply(recovery_days)
        df["rotation_recommendation"] = df["travel_burden"].apply(rotation_rec)
        return df
# Example fixtures
fixtures = pd.DataFrame({
"match_date": ["2024-09-17", "2024-09-21", "2024-10-01", "2024-10-05"],
"home_city": ["Munich", "London", "Istanbul", "London"],
"away_city": ["London", "Manchester", "London", "Liverpool"],
"competition": ["UCL", "PL", "UCL", "PL"]
})
analyzer = TravelAnalyzer(stadiums)
travel_analysis = analyzer.analyze_fixtures(fixtures)
print("Travel Burden Analysis:")
print(travel_analysis[["match_date", "competition", "from", "to",
"distance_km", "travel_burden"]].to_string(index=False))
print("\nRecovery Recommendations:")
print(travel_analysis[["match_date", "to", "recovery_days",
"rotation_recommendation"]].to_string(index=False))
# R: Travel Impact Analysis
library(tidyverse)
library(geosphere)
# Stadium locations (example coordinates)
stadiums <- tribble(
~city, ~lat, ~lon, ~timezone,
"London", 51.5074, -0.1278, "Europe/London",
"Manchester", 53.4631, -2.2913, "Europe/London",
"Liverpool", 53.4308, -2.9608, "Europe/London",
"Munich", 48.2188, 11.6247, "Europe/Berlin",
"Milan", 45.4781, 9.1240, "Europe/Rome",
"Madrid", 40.4530, -3.6883, "Europe/Madrid",
"Istanbul", 41.0370, 28.9850, "Europe/Istanbul",
"Moscow", 55.8166, 37.5554, "Europe/Moscow"
)
# Calculate travel distances and time zone changes
calculate_travel_burden <- function(from_city, to_city, stadiums_df) {
from_loc <- stadiums_df %>% filter(city == from_city)
to_loc <- stadiums_df %>% filter(city == to_city)
# Distance in km
distance <- distHaversine(
c(from_loc$lon, from_loc$lat),
c(to_loc$lon, to_loc$lat)
) / 1000
# Timezone difference (simplified)
tz_diff <- abs(as.numeric(difftime(
as.POSIXct("2024-01-01 12:00", tz = from_loc$timezone),
as.POSIXct("2024-01-01 12:00", tz = to_loc$timezone),
units = "hours"
)))
# Travel burden score
burden <- (distance / 500) + (tz_diff * 2)
tibble(
from = from_city,
to = to_city,
distance_km = round(distance),
timezone_diff = tz_diff,
travel_burden = round(burden, 2)
)
}
# Example travel analysis
fixtures <- tribble(
~match_date, ~home_city, ~away_city, ~competition,
"2024-09-17", "Munich", "London", "UCL",
"2024-09-21", "London", "Manchester", "PL",
"2024-10-01", "Istanbul", "London", "UCL",
"2024-10-05", "London", "Liverpool", "PL"
)
travel_analysis <- fixtures %>%
rowwise() %>%
mutate(
    # The away side travels from its own city to the host city
    travel_info = list(calculate_travel_burden(away_city, home_city, stadiums))
) %>%
unnest(travel_info)
print("Travel Burden Analysis:")
print(travel_analysis %>%
select(match_date, competition, from, to, distance_km, travel_burden))
# Recovery recommendations based on travel
travel_analysis <- travel_analysis %>%
mutate(
recovery_days = case_when(
travel_burden > 5 ~ 4,
travel_burden > 3 ~ 3,
travel_burden > 1 ~ 2,
TRUE ~ 1
),
rotation_recommendation = case_when(
travel_burden > 5 ~ "Heavy rotation recommended",
travel_burden > 3 ~ "Consider rotation",
TRUE ~ "Normal selection"
)
)
cat("\nRecovery Recommendations:\n")
print(travel_analysis %>%
select(match_date, to, recovery_days, rotation_recommendation))
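As a quick sanity check on the burden formula above, the sketch below works through a single London to Istanbul trip with the standard library only. It assumes the sample coordinates from this section and a three-hour clock difference (the winter offset). The function names are illustrative.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * asin(sqrt(a))

def travel_burden(distance_km, timezone_diff_hours):
    """One point per 500 km travelled plus two points per hour of clock shift."""
    return distance_km / 500 + timezone_diff_hours * 2

london = (51.5074, -0.1278)
istanbul = (41.0370, 28.9850)

distance = haversine_km(*london, *istanbul)
burden = travel_burden(distance, timezone_diff_hours=3)
print(f"Distance: {distance:.0f} km, burden: {burden:.2f}")
```

The trip lands well above the threshold of 5, so it falls in the "Heavy rotation recommended" band. Note that a fixed offset ignores daylight saving time: in September the London to Istanbul difference is two hours rather than three, which would lower the score by 2.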
Injury Risk Prediction During Congestion
Fixture congestion significantly increases injury risk. Understanding the relationship between workload, fatigue, and injury allows for proactive prevention strategies.
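Before fitting a model, it helps to see how a logistic relationship turns workload factors into a probability. The sketch below is a hand-rolled illustration, not a fitted model. It assumes the same illustrative coefficients as the synthetic data generator in this section and the 0.15 and 0.30 risk cut-offs used in the assessment step. The function names are hypothetical.

```python
from math import exp

def injury_probability(acwr, minutes_28d, consecutive_starts,
                       days_since_last_match, age, previous_injuries,
                       fatigue_index):
    """Convert risk factors to a probability via assumed log-odds weights."""
    log_odds = (
        -3
        + (acwr - 1) * 2
        + (minutes_28d - 450) / 200
        + consecutive_starts * 0.1
        + (0.5 if days_since_last_match < 4 else 0.0)
        + (0.3 if age > 30 else 0.0)
        + previous_injuries * 0.2
        + fatigue_index / 100
    )
    # Logistic (sigmoid) function maps log-odds to a value in (0, 1)
    return 1 / (1 + exp(-log_odds))

def risk_level(probability):
    """Apply the illustrative thresholds: >0.30 high, >0.15 medium, else low."""
    if probability > 0.3:
        return "High"
    if probability > 0.15:
        return "Medium"
    return "Low"

# A well-managed player versus an overloaded one (made-up inputs)
managed = injury_probability(1.0, 450, 3, 5, 25, 1, 50)
overloaded = injury_probability(1.5, 650, 6, 3, 31, 2, 75)
print(f"Managed: {managed:.3f} ({risk_level(managed)})")
print(f"Overloaded: {overloaded:.3f} ({risk_level(overloaded)})")
```

Each factor shifts the log-odds additively, so several moderate risks together can move a player from the low band to the high band even when no single factor looks extreme.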
# Python: Injury risk prediction model
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report, roc_auc_score
from dataclasses import dataclass
from typing import Dict, List
@dataclass
class InjuryRiskAssessment:
    """Assessment result for a player."""
    player_id: str
    injury_probability: float
    risk_level: str
    recommendation: str
    key_factors: List[str]
class InjuryPredictor:
    """Predict injury risk based on workload and player factors."""

    def __init__(self):
        self.model = None
        self.scaler = StandardScaler()
        self.feature_names = [
            "acwr", "cumulative_minutes_28d", "consecutive_starts",
            "days_since_last_match", "age", "previous_injuries",
            "muscle_fatigue_index", "match_intensity", "travel_burden"
        ]

    def generate_training_data(self, n_samples: int = 2000) -> pd.DataFrame:
        """Generate synthetic training data (n_samples must be a multiple of 50)."""
        np.random.seed(42)
        data = pd.DataFrame({
            "player_id": np.repeat(range(50), n_samples // 50),
            "acwr": np.random.normal(1.0, 0.3, n_samples),
            "cumulative_minutes_28d": np.random.normal(450, 150, n_samples),
            "consecutive_starts": np.random.poisson(3, n_samples),
            "days_since_last_match": np.random.choice(range(2, 11), n_samples),
            "age": np.repeat(np.random.choice(range(20, 36), 50), n_samples // 50),
            "previous_injuries": np.repeat(np.random.poisson(1.5, 50), n_samples // 50),
            "muscle_fatigue_index": np.random.normal(50, 15, n_samples),
            "match_intensity": np.random.normal(70, 10, n_samples),
            "travel_burden": np.random.exponential(2, n_samples)
        })
        # Generate injury labels based on features
        log_odds = (
            -3 +
            (data["acwr"] - 1) * 2 +
            (data["cumulative_minutes_28d"] - 450) / 200 +
            data["consecutive_starts"] * 0.1 +
            (data["days_since_last_match"] < 4).astype(int) * 0.5 +
            (data["age"] > 30).astype(int) * 0.3 +
            data["previous_injuries"] * 0.2 +
            data["muscle_fatigue_index"] / 100
        )
        prob = 1 / (1 + np.exp(-log_odds))
        data["injury"] = (np.random.random(n_samples) < prob).astype(int)
        return data
    def train(self, data: pd.DataFrame) -> Dict:
        """Train the injury prediction model."""
        X = data[self.feature_names]
        y = data["injury"]
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.2, random_state=42
        )
        # Scale features (fit on the training split only)
        X_train_scaled = self.scaler.fit_transform(X_train)
        X_test_scaled = self.scaler.transform(X_test)
        # Train model
        self.model = LogisticRegression(
            class_weight="balanced",
            max_iter=1000
        )
        self.model.fit(X_train_scaled, y_train)
        # Evaluate
        y_pred = self.model.predict(X_test_scaled)
        y_prob = self.model.predict_proba(X_test_scaled)[:, 1]
        return {
            "auc": roc_auc_score(y_test, y_prob),
            "report": classification_report(y_test, y_pred),
            "feature_importance": dict(zip(
                self.feature_names,
                self.model.coef_[0]
            ))
        }
def assess_risk(self, player_data: pd.DataFrame) -> List[InjuryRiskAssessment]:
"""Assess injury risk for players."""
if self.model is None:
raise ValueError("Model not trained")
X = player_data[self.feature_names]
X_scaled = self.scaler.transform(X)
probabilities = self.model.predict_proba(X_scaled)[:, 1]
assessments = []
for i, (_, row) in enumerate(player_data.iterrows()):
prob = probabilities[i]
# Determine risk level
if prob > 0.3:
risk_level = "High"
recommendation = "REST - Do not play"
elif prob > 0.15:
risk_level = "Medium"
recommendation = "MONITOR - Limited minutes only"
else:
risk_level = "Low"
recommendation = "AVAILABLE - Normal selection"
# Identify key risk factors
key_factors = []
if row["acwr"] > 1.3:
key_factors.append(f"High ACWR ({row['acwr']:.2f})")
if row["consecutive_starts"] > 5:
key_factors.append(f"Many consecutive starts ({row['consecutive_starts']})")
if row["days_since_last_match"] < 4:
key_factors.append(f"Short recovery ({row['days_since_last_match']} days)")
if row["muscle_fatigue_index"] > 70:
key_factors.append(f"High fatigue ({row['muscle_fatigue_index']:.0f})")
assessments.append(InjuryRiskAssessment(
player_id=str(row.get("player_id", i)),
injury_probability=prob,
risk_level=risk_level,
recommendation=recommendation,
key_factors=key_factors
))
return assessments
def acwr_zone(self, acwr: float) -> str:
"""Classify ACWR into risk zones."""
if acwr < 0.8:
return "Undertraining"
elif acwr <= 1.3:
return "Sweet Spot"
elif acwr <= 1.5:
return "Danger Zone"
else:
return "High Risk"
# Example usage
predictor = InjuryPredictor()
# Train on synthetic data
training_data = predictor.generate_training_data()
results = predictor.train(training_data)
print("Injury Prediction Model Results:")
print(f"AUC: {results['auc']:.3f}")
print("\nFeature Importance:")
for feature, importance in sorted(results["feature_importance"].items(),
                                  key=lambda x: abs(x[1]), reverse=True):
    print(f" {feature}: {importance:.3f}")
# R: Injury risk prediction model
library(tidyverse)
library(caret)
# Generate training data (historical player-match data)
create_injury_dataset <- function() {
set.seed(42)
n <- 2000
tibble(
player_id = rep(1:50, each = 40),
match_id = 1:n,
# Workload features
acwr = rnorm(n, 1.0, 0.3),
cumulative_minutes_28d = rnorm(n, 450, 150),
consecutive_starts = rpois(n, 3),
days_since_last_match = sample(2:10, n, replace = TRUE),
# Player features
age = rep(sample(20:35, 50, replace = TRUE), each = 40),
previous_injuries = rep(rpois(50, 1.5), each = 40),
muscle_fatigue_index = rnorm(n, 50, 15),
# Match features
match_intensity = rnorm(n, 70, 10),
travel_burden = rexp(n, 0.5),
# Target: Injury occurred (binary)
injury = rbinom(n, 1, prob = plogis(
-3 +
(acwr - 1) * 2 +
(cumulative_minutes_28d - 450) / 200 +
consecutive_starts * 0.1 +
(days_since_last_match < 4) * 0.5 +
(age > 30) * 0.3 +
previous_injuries * 0.2 +
muscle_fatigue_index / 100
))
)
}
# Train injury prediction model
train_injury_model <- function(data) {
# Split data
set.seed(123)
train_idx <- createDataPartition(data$injury, p = 0.8, list = FALSE)
train_data <- data[train_idx, ]
test_data <- data[-train_idx, ]
# Train logistic regression
model <- glm(
injury ~ acwr + cumulative_minutes_28d + consecutive_starts +
days_since_last_match + age + previous_injuries +
muscle_fatigue_index + match_intensity + travel_burden,
data = train_data,
family = binomial
)
# Evaluate
predictions <- predict(model, test_data, type = "response")
predicted_class <- ifelse(predictions > 0.3, 1, 0)
confusion <- table(Predicted = predicted_class, Actual = test_data$injury)
list(
model = model,
confusion_matrix = confusion,
auc = pROC::auc(test_data$injury, predictions)
)
}
# Risk assessment function
assess_injury_risk <- function(model, player_data) {
# Predict probability
prob <- predict(model, player_data, type = "response")
player_data %>%
mutate(
injury_probability = prob,
risk_level = case_when(
injury_probability > 0.3 ~ "High",
injury_probability > 0.15 ~ "Medium",
TRUE ~ "Low"
),
recommendation = case_when(
risk_level == "High" ~ "REST - Do not play",
risk_level == "Medium" ~ "MONITOR - Limited minutes only",
TRUE ~ "AVAILABLE - Normal selection"
)
)
}
# ACWR danger zone visualization
plot_acwr_zones <- function() {
tibble(
acwr = seq(0.5, 2.0, 0.01)
) %>%
mutate(
zone = case_when(
acwr < 0.8 ~ "Undertraining",
acwr <= 1.3 ~ "Sweet Spot",
acwr <= 1.5 ~ "Danger Zone",
TRUE ~ "High Risk"
),
injury_risk = case_when(
zone == "Sweet Spot" ~ 0.1,
zone == "Undertraining" ~ 0.15,
zone == "Danger Zone" ~ 0.25,
TRUE ~ 0.4
)
) %>%
ggplot(aes(x = acwr, y = injury_risk, fill = zone)) +
geom_area(alpha = 0.6) +
geom_vline(xintercept = c(0.8, 1.3, 1.5), linetype = "dashed") +
scale_fill_manual(values = c(
"Undertraining" = "orange",
"Sweet Spot" = "green",
"Danger Zone" = "yellow",
"High Risk" = "red"
)) +
labs(
title = "ACWR Zones and Injury Risk",
x = "Acute:Chronic Workload Ratio",
y = "Injury Risk"
) +
theme_minimal()
}
print("Injury prediction model framework ready!")
ACWR Danger Zones
- < 0.8: Undertraining - Player may not be match-ready
- 0.8 - 1.3: Sweet Spot - Optimal training load
- 1.3 - 1.5: Danger Zone - Increased injury risk
- > 1.5: High Risk - Significant injury likelihood, recommend rest
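The zones above classify a ratio that still has to be computed from load data. The following is a minimal sketch, assuming the rolling-average variant of ACWR (mean daily load over the last 7 days divided by mean daily load over the last 28 days). The function names `compute_acwr` and `classify_acwr` and the sample loads are hypothetical and not part of the chapter's code.

```python
from typing import List


def compute_acwr(daily_loads: List[float],
                 acute_days: int = 7,
                 chronic_days: int = 28) -> float:
    """Rolling-average ACWR: mean recent load divided by mean longer-term load."""
    if len(daily_loads) < chronic_days:
        raise ValueError("Need at least chronic_days of load history")
    acute = sum(daily_loads[-acute_days:]) / acute_days
    chronic = sum(daily_loads[-chronic_days:]) / chronic_days
    if chronic == 0:
        raise ValueError("Chronic load is zero; ACWR is undefined")
    return acute / chronic


def classify_acwr(acwr: float) -> str:
    """Map a ratio onto the four zones listed above."""
    if acwr < 0.8:
        return "Undertraining"
    if acwr <= 1.3:
        return "Sweet Spot"
    if acwr <= 1.5:
        return "Danger Zone"
    return "High Risk"


# Hypothetical load history: three steady weeks, then one week at 1.5x load
loads = [300.0] * 21 + [450.0] * 7
ratio = compute_acwr(loads)
print(f"{ratio:.2f}", classify_acwr(ratio))  # 1.33 Danger Zone
```

A single heavy week after a steady block is enough to push the ratio out of the sweet spot, which is why congested periods tend to trigger the warning zones.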
Youth Integration During Congestion
Fixture congestion creates opportunities for youth development. Understanding when and how to integrate young players helps build squad depth while managing first-team workload.
# Python: Youth integration planning
import pandas as pd
import numpy as np
from dataclasses import dataclass
from typing import Dict, List
@dataclass
class YouthPlayer:
    """Youth player profile."""
    name: str
    position: str
    age: int
    quality: int
    potential: int
    experience_level: str
    ready_for: str

class YouthIntegrationPlanner:
    """Plan youth player integration during fixture congestion."""

    def __init__(self, youth_players: pd.DataFrame):
        self.youth = youth_players

    def calculate_development_value(self, match_context: str) -> pd.DataFrame:
        """Calculate development value per 90 minutes in given context."""
        df = self.youth.copy()
        # Base development rate
        df["base_dev_rate"] = (df["potential"] - df["quality"]) / 10
        # Competition multiplier
        comp_mult = {
            "League": 1.5,
            "UCL": 2.0,
            "Cup": 1.2,
            "Friendly": 0.5
        }
        df["comp_mult"] = comp_mult.get(match_context, 1.0)
        # Development value per 90
        df["dev_value_per_90"] = df["base_dev_rate"] * df["comp_mult"]

        # Readiness check
        def is_ready(row):
            if row["experience_level"] == "First team fringe":
                return True
            if row["experience_level"] == "Reserves" and match_context in ["Cup", "League"]:
                return True
            if row["experience_level"] == "Academy" and match_context == "Cup":
                return True
            return False

        df["is_ready"] = df.apply(is_ready, axis=1)
        return df

    def find_integration_windows(self, fixtures: pd.DataFrame) -> pd.DataFrame:
        """Identify matches suitable for youth integration."""
        df = fixtures.copy()

        def is_opportunity(row):
            if row["competition"] == "League Cup":
                return True
            if row["competition"] == "FA Cup" and row.get("round", 4) <= 3:
                return True
            if row.get("importance_score", 50) < 50:
                return True
            if row.get("days_until_key_match", 3) > 5:
                return True
            return False

        df["is_integration_opportunity"] = df.apply(is_opportunity, axis=1)

        def youth_minutes(row):
            if row["is_integration_opportunity"]:
                return 270  # 3 starters
            if row.get("importance_score", 70) < 70:
                return 90  # 1 starter
            return 30  # Late sub

        df["youth_minutes_available"] = df.apply(youth_minutes, axis=1)
        return df[df["is_integration_opportunity"]]

    def create_development_plan(self) -> pd.DataFrame:
        """Create season-long development plan for youth players."""
        df = self.youth.copy()
        # Target minutes by experience level
        target_map = {
            "First team fringe": 1500,
            "Reserves": 800,
            "Academy": 400
        }
        df["season_target"] = df["experience_level"].map(target_map).fillna(200)
        # Preferred competitions
        pref_map = {
            "First team fringe": "League, Cup, UCL groups",
            "Reserves": "Cup, League rotation",
            "Academy": "Cup early rounds"
        }
        df["preferred_comps"] = df["experience_level"].map(pref_map)
        # Integration priority score
        df["priority"] = df["potential"] - df["quality"] + (21 - df["age"]) * 2
        return df.sort_values("priority", ascending=False)

    def recommend_lineup_youth(self, match_info: Dict,
                               first_team_fatigue: pd.DataFrame) -> List[Dict]:
        """Recommend which youth players to include in lineup."""
        # Note: first_team_fatigue is accepted but not used in this version
        dev_values = self.calculate_development_value(match_info.get("competition", "Cup"))
        ready_youth = dev_values[dev_values["is_ready"]]
        # Take the three ready players with the highest development value
        recommended = ready_youth.nlargest(3, "dev_value_per_90")
        return recommended[["name", "position", "dev_value_per_90"]].to_dict("records")

# Example usage
youth_data = pd.DataFrame({
"name": ["Youth A", "Youth B", "Youth C", "Youth D", "Youth E"],
"position": ["CM", "RB", "ST", "CB", "WG"],
"age": [18, 19, 17, 20, 18],
"quality": [65, 70, 60, 72, 68],
"potential": [88, 82, 90, 78, 85],
"experience_level": ["Academy", "Reserves", "Academy", "First team fringe", "Academy"],
"ready_for": ["Cup", "League rotation", "Late sub", "League starter", "Cup"]
})
planner = YouthIntegrationPlanner(youth_data)
dev_plan = planner.create_development_plan()
print("Youth Development Priority List:")
print(dev_plan[["name", "position", "age", "quality", "potential",
                "priority", "season_target"]].to_string(index=False))
# R: Youth integration planning
library(tidyverse)
# Define youth player pool
youth_players <- tribble(
~player, ~position, ~age, ~quality, ~potential, ~experience_level, ~ready_for,
"Youth A", "CM", 18, 65, 88, "Academy", "Cup matches",
"Youth B", "RB", 19, 70, 82, "Reserves", "League rotation",
"Youth C", "ST", 17, 60, 90, "Academy", "Late substitute",
"Youth D", "CB", 20, 72, 78, "First team fringe", "League starter",
"Youth E", "WG", 18, 68, 85, "Academy", "Cup matches",
"Youth F", "GK", 19, 62, 80, "Reserves", "Cup matches"
)
# Calculate development value of minutes
calculate_development_value <- function(player_data, match_context) {
player_data %>%
mutate(
# Base development rate
base_dev_rate = (potential - quality) / 10,
# Context multipliers
competition_mult = case_when(
match_context == "League" ~ 1.5,
match_context == "UCL" ~ 2.0,
match_context == "Cup" ~ 1.2,
TRUE ~ 1.0
),
opponent_mult = case_when(
match_context == "Top 6" ~ 1.8,
match_context == "Mid-table" ~ 1.3,
match_context == "Lower" ~ 1.0,
TRUE ~ 1.0
),
# Development value per 90 minutes
dev_value_per_90 = base_dev_rate * competition_mult * opponent_mult,
# Readiness check
is_ready = case_when(
experience_level == "First team fringe" ~ TRUE,
experience_level == "Reserves" & match_context %in% c("Cup", "League") ~ TRUE,
experience_level == "Academy" & match_context == "Cup" ~ TRUE,
TRUE ~ FALSE
)
)
}
# Identify optimal integration opportunities
find_integration_windows <- function(fixtures, first_team_workload, youth_players) {
fixtures %>%
mutate(
# Low-stakes matches good for youth
is_integration_opportunity = case_when(
competition == "League Cup" ~ TRUE,
competition == "FA Cup" & round <= 3 ~ TRUE,
importance_score < 50 ~ TRUE,
days_until_key_match > 5 ~ TRUE,
TRUE ~ FALSE
),
# How many minutes available for youth
youth_minutes_available = case_when(
is_integration_opportunity ~ 270, # 3 starters
importance_score < 70 ~ 90, # 1 starter
TRUE ~ 30 # Late sub only
)
) %>%
filter(is_integration_opportunity)
}
# Season-long youth development plan
create_development_plan <- function(youth_players, season_fixtures, target_minutes) {
youth_players %>%
mutate(
# Target minutes based on readiness
season_target = case_when(
experience_level == "First team fringe" ~ 1500,
experience_level == "Reserves" ~ 800,
experience_level == "Academy" ~ 400,
TRUE ~ 200
),
# Preferred competitions
preferred_comps = case_when(
experience_level == "First team fringe" ~ "League, Cup, UCL groups",
experience_level == "Reserves" ~ "Cup, League rotation",
experience_level == "Academy" ~ "Cup early rounds",
TRUE ~ "Reserves only"
),
# Integration priority
priority = potential - quality + (21 - age) * 2
) %>%
arrange(desc(priority))
}
# Example
youth_plan <- create_development_plan(youth_players, NULL, 1000)
print("Youth Development Priority List:")
print(youth_plan %>%
select(player, position, age, quality, potential, priority, season_target))
Cup Early Rounds
- Ideal for academy players
- Lower pressure environment
- Full 90 minutes possible
League Rotation
- First team fringe players
- Mix with experienced starters
- Against lower-table opposition
Late Substitutes
- Safe introduction method
- Any readiness level
- When result is secure
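The three integration routes above can be encoded as a small decision helper. This is a sketch under stated assumptions, not the chapter's method: the function name `suggest_integration_mode`, its parameters, and the string labels are hypothetical, and cup early rounds are treated as open to every readiness level because academy players are the least experienced group.

```python
def suggest_integration_mode(experience_level: str,
                             competition: str,
                             opponent_tier: str,
                             result_secure: bool) -> str:
    """Pick an integration route for one youth player in one match."""
    # Cup early rounds: lower pressure, full 90 minutes possible
    if competition == "Cup early round":
        return "Start (full 90 possible)"
    # League rotation: fringe players, mixed with experienced starters,
    # against lower-table opposition
    if (competition == "League"
            and experience_level == "First team fringe"
            and opponent_tier == "Lower"):
        return "Start alongside experienced players"
    # Late substitute: any readiness level, only once the result is secure
    if result_secure:
        return "Late substitute"
    return "No youth minutes"


print(suggest_integration_mode("Academy", "Cup early round", "Lower", False))
print(suggest_integration_mode("First team fringe", "League", "Lower", False))
print(suggest_integration_mode("Academy", "League", "Top 6", True))
```

How "result is secure" is decided (for example a goal margin at a given minute) is left to the caller, since the text does not define it.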
In-Season Squad Planning
Dynamic squad planning throughout the season ensures optimal resource allocation. Regular reviews of workload distribution, injury patterns, and form allow for proactive adjustments.
# Python: In-season squad planning system
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
from typing import Dict, List
class InSeasonPlanner:
    """Dynamic in-season squad planning system."""

    def __init__(self, squad_data: pd.DataFrame):
        self.data = squad_data

    def assess_squad_status(self, recent_matches: int = 10) -> pd.DataFrame:
        """Assess current squad status across all players."""
        # Keep each player's most recent matches in chronological order,
        # so that "last" below picks up the latest fatigue reading
        recent = (
            self.data.sort_values("date")
            .groupby("player")
            .tail(recent_matches)
        )
        # Aggregate by player
        status = recent.groupby(["player", "position"]).agg({
            "minutes": ["sum", "mean"],
            "started": "sum",
            "rating": "mean",
            "goals": "sum",
            "assists": "sum",
            "fatigue": "last",
            "date": "max"
        }).reset_index()
        status.columns = ["player", "position", "total_minutes", "avg_minutes",
                          "starts", "avg_rating", "goals", "assists",
                          "current_fatigue", "last_match"]

        # Workload status
        def workload_status(row):
            if row["total_minutes"] > 800 and row["current_fatigue"] > 0.6:
                return "Overworked"
            elif row["total_minutes"] > 700:
                return "High Load"
            elif row["total_minutes"] > 500:
                return "Optimal"
            elif row["total_minutes"] > 300:
                return "Underused"
            return "Minimal Involvement"

        status["workload_status"] = status.apply(workload_status, axis=1)

        # Form status
        def form_status(rating):
            if rating > 7.5:
                return "Excellent"
            elif rating > 7.0:
                return "Good"
            elif rating > 6.5:
                return "Average"
            return "Poor"

        status["form_status"] = status["avg_rating"].apply(form_status)

        # Action required
        def action_needed(row):
            if row["workload_status"] == "Overworked":
                return "Immediate rest needed"
            if row["workload_status"] == "Underused" and row["form_status"] == "Good":
                return "Increase minutes"
            if row["form_status"] == "Poor":
                return "Review/Drop"
            return "Continue current plan"

        status["action"] = status.apply(action_needed, axis=1)
        return status

    def identify_squad_gaps(self) -> pd.DataFrame:
        """Identify positions needing reinforcement."""
        status = self.assess_squad_status()
        position_summary = status.groupby("position").agg({
            "player": "count",
            "workload_status": lambda x: (x != "Overworked").sum(),
            "form_status": lambda x: x.isin(["Excellent", "Good"]).sum(),
            "total_minutes": "std"
        }).reset_index()
        position_summary.columns = ["position", "total_players",
                                    "healthy_available", "in_form", "minutes_spread"]
        # Identify concerns
        position_summary["has_depth_issue"] = position_summary["healthy_available"] < 2
        position_summary["has_form_issue"] = position_summary["in_form"] < 1

        def concern_level(row):
            if row["has_depth_issue"] and row["has_form_issue"]:
                return "Critical"
            elif row["has_depth_issue"] or row["has_form_issue"]:
                return "Moderate"
            elif row["minutes_spread"] > 300:
                return "Minor"
            return "OK"

        position_summary["concern_level"] = position_summary.apply(concern_level, axis=1)

        def recommendation(level):
            if level == "Critical":
                return "Consider January signing"
            elif level == "Moderate":
                return "Promote youth player"
            return "No action needed"

        position_summary["recommendation"] = position_summary["concern_level"].apply(recommendation)
        return position_summary

    def monthly_review(self, month: int, year: int) -> pd.DataFrame:
        """Review squad utilization for a specific month."""
        month_data = self.data[
            (self.data["date"].dt.month == month) &
            (self.data["date"].dt.year == year)
        ]
        review = month_data.groupby("player").agg({
            "match_id": "count",
            "started": "sum",
            "minutes": "sum",
            "rating": "mean"
        }).reset_index()
        review.columns = ["player", "matches_available", "matches_started",
                          "total_minutes", "avg_rating"]
        review["utilization"] = review["matches_started"] / review["matches_available"]

        def classify_utilization(util):
            if util > 0.9:
                return "Over-relied upon"
            elif util > 0.7:
                return "Key player"
            elif util > 0.4:
                return "Rotation option"
            elif util > 0.2:
                return "Underutilized"
            return "Unused"

        review["status"] = review["utilization"].apply(classify_utilization)
        return review.sort_values("total_minutes", ascending=False)

    def project_remaining_season(self, remaining_matches: int) -> Dict:
        """Project squad capacity for remaining season."""
        status = self.assess_squad_status()
        # Capacity by workload status
        capacity_multiplier = {
            "Optimal": 1.0,
            "Underused": 1.0,
            "High Load": 0.5,
            "Overworked": 0.2,
            "Minimal Involvement": 0.8
        }
        status["capacity"] = status["workload_status"].map(capacity_multiplier)
        status["estimated_available_matches"] = status["capacity"] * remaining_matches * 0.7
        # Position-level summary
        position_capacity = status.groupby("position").agg({
            "estimated_available_matches": "sum",
            "player": "count"
        }).reset_index()
        position_capacity["matches_needed"] = remaining_matches
        position_capacity["coverage_ratio"] = (
            position_capacity["estimated_available_matches"] /
            position_capacity["matches_needed"]
        )
        position_capacity["needs_reinforcement"] = position_capacity["coverage_ratio"] < 1.5
        return {
            "player_projections": status[["player", "position", "capacity",
                                          "estimated_available_matches"]],
            "position_capacity": position_capacity
        }

# Example usage
print("In-Season Squad Planner initialized")
print("Methods: assess_squad_status(), identify_squad_gaps(), monthly_review(), project_remaining_season()")
# R: In-season squad planning system
library(tidyverse)
# Comprehensive squad status assessment
assess_squad_status <- function(squad_data, recent_matches = 10) {
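squad_data %>%
# Sort chronologically so slice_tail() and last() refer to the most recent matches
arrange(date) %>%
group_by(player, position) %>%
slice_tail(n = recent_matches) %>%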
summarise(
# Minutes analysis
total_minutes = sum(minutes),
avg_minutes = mean(minutes),
starts = sum(started),
# Form
avg_rating = mean(rating, na.rm = TRUE),
goals = sum(goals),
assists = sum(assists),
# Workload
current_fatigue = last(fatigue),
days_since_last = as.numeric(Sys.Date() - max(date)),
.groups = "drop"
) %>%
mutate(
# Status classification
workload_status = case_when(
total_minutes > 800 & current_fatigue > 0.6 ~ "Overworked",
total_minutes > 700 ~ "High Load",
total_minutes > 500 ~ "Optimal",
total_minutes > 300 ~ "Underused",
TRUE ~ "Minimal Involvement"
),
form_status = case_when(
avg_rating > 7.5 ~ "Excellent",
avg_rating > 7.0 ~ "Good",
avg_rating > 6.5 ~ "Average",
TRUE ~ "Poor"
),
# Action required
action = case_when(
workload_status == "Overworked" ~ "Immediate rest needed",
workload_status == "Underused" & form_status == "Good" ~ "Increase minutes",
form_status == "Poor" ~ "Review/Drop",
TRUE ~ "Continue current plan"
)
)
}
# Identify squad gaps and needs
identify_squad_gaps <- function(squad_status) {
# Position-level analysis
position_summary <- squad_status %>%
group_by(position) %>%
summarise(
total_players = n(),
healthy_available = sum(workload_status != "Overworked"),
in_form = sum(form_status %in% c("Excellent", "Good")),
avg_minutes_spread = sd(total_minutes),
# Depth concerns
has_depth_issue = healthy_available < 2,
has_form_issue = in_form < 1,
.groups = "drop"
)
# Flag concerning positions
position_summary %>%
mutate(
concern_level = case_when(
has_depth_issue & has_form_issue ~ "Critical",
has_depth_issue | has_form_issue ~ "Moderate",
avg_minutes_spread > 300 ~ "Minor",
TRUE ~ "OK"
),
recommendation = case_when(
concern_level == "Critical" ~ "Consider January signing",
concern_level == "Moderate" ~ "Promote youth player",
TRUE ~ "No action needed"
)
)
}
# Monthly rotation review
monthly_review <- function(month_data) {
month_data %>%
group_by(player) %>%
summarise(
matches_available = n(),
matches_started = sum(started),
total_minutes = sum(minutes),
avg_rating = mean(rating),
.groups = "drop"
) %>%
mutate(
utilization = matches_started / matches_available,
status = case_when(
utilization > 0.9 ~ "Over-relied upon",
utilization > 0.7 ~ "Key player",
utilization > 0.4 ~ "Rotation option",
utilization > 0.2 ~ "Underutilized",
TRUE ~ "Unused"
)
) %>%
arrange(desc(total_minutes))
}
# Project remaining season needs
project_season_needs <- function(current_status, remaining_fixtures) {
matches_left <- nrow(remaining_fixtures)
# Estimate minutes needed per position
total_minutes_needed <- matches_left * 90 * 11 # Starting XI minutes
current_status %>%
group_by(position) %>%
summarise(
current_capacity = sum(case_when(
workload_status %in% c("Optimal", "Underused") ~ 1,
workload_status == "High Load" ~ 0.5,
TRUE ~ 0
)),
# Minutes distribution capacity
estimated_available = current_capacity * matches_left * 70,
.groups = "drop"
) %>%
mutate(
# Gap analysis
coverage = estimated_available / (matches_left * 90),
needs_reinforcement = coverage < 1.5
)
}
print("In-season planning system ready!")
| Review Period | Focus Areas | Actions |
|---|---|---|
| Weekly | Individual workload, immediate fatigue | Lineup decisions, training load adjustments |
| Monthly | Utilization balance, form trends | Rotation strategy adjustments |
| Quarterly | Squad gaps, youth integration progress | Transfer window planning |
| Mid-Season | Remaining capacity projection | January signings, loan recalls |
Practice Exercises
Exercise 46.1: Workload Dashboard
Build a comprehensive workload monitoring dashboard that tracks all squad members' minutes, calculates ACWR, and generates automated rotation recommendations based on upcoming fixtures.
- Include rolling 7-day and 28-day minute totals
- Add visual indicators for players in danger zones
- Factor in match importance when generating recommendations
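As a possible starting point for this exercise, the sketch below computes the rolling 7-day and 28-day minute totals and a danger-zone flag for one player. It is a simplified illustration: the names `rolling_minutes` and `workload_summary` are hypothetical, the minutes are invented sample values, and ACWR is approximated from minutes alone (last 7 days against the weekly average over 28 days).

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple

Appearance = Tuple[date, int]  # (match date, minutes played)


def rolling_minutes(appearances: List[Appearance], as_of: date, window_days: int) -> int:
    """Sum minutes played in the window_days ending on as_of (inclusive)."""
    start = as_of - timedelta(days=window_days - 1)
    return sum(minutes for day, minutes in appearances if start <= day <= as_of)


def workload_summary(appearances: List[Appearance], as_of: date) -> Dict:
    """Rolling totals plus a minutes-based ACWR and a danger-zone flag."""
    m7 = rolling_minutes(appearances, as_of, 7)
    m28 = rolling_minutes(appearances, as_of, 28)
    # Acute week compared with the average week over the chronic window
    acwr = m7 / (m28 / 4) if m28 > 0 else None
    return {
        "minutes_7d": m7,
        "minutes_28d": m28,
        "acwr": acwr,
        "danger_flag": acwr is not None and acwr > 1.3,
    }


# Hypothetical minutes for one player across a congested September
appearances = [
    (date(2024, 9, 14), 90),
    (date(2024, 9, 17), 90),
    (date(2024, 9, 21), 75),
    (date(2024, 9, 24), 90),
    (date(2024, 9, 28), 90),
]
summary = workload_summary(appearances, as_of=date(2024, 9, 28))
print(summary)
```

A full dashboard would run this per player per day and combine the flag with match importance before recommending rotation.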
Exercise 46.2: Season Rotation Simulator
Create a simulation that models an entire season of fixtures. Test different rotation strategies and measure their impact on total points, injury rates, and squad fatigue.
- Model fatigue accumulation and recovery rates
- Include probabilistic injury occurrence based on workload
- Compare "minimal rotation" vs "heavy rotation" strategies
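One way to begin this exercise is a deliberately small simulation such as the sketch below. Every number in it is an assumption chosen for illustration (fatigue gain per start, recovery per rest day, injury probability proportional to fatigue squared, a five-match absence per injury), and `simulate_season` is a hypothetical name. It does not model points, only fatigue and injuries.

```python
import random
from typing import Dict


def simulate_season(strategy: str, n_matches: int = 50, squad_size: int = 16,
                    starters: int = 10, rest_days: int = 3,
                    injury_scale: float = 0.08, seed: int = 0) -> Dict[str, float]:
    """Simulate fatigue and injuries under 'minimal' or 'heavy' rotation."""
    rng = random.Random(seed)
    fatigue = [0.0] * squad_size   # 0 = fresh, 1 = exhausted
    out_until = [0] * squad_size   # match index at which a player is available again
    injuries = 0
    kickoff_fatigue = []
    for match in range(n_matches):
        available = [p for p in range(squad_size) if out_until[p] <= match]
        if strategy == "heavy":
            # Heavy rotation: always field the freshest available players
            available.sort(key=lambda p: fatigue[p])
        # Minimal rotation: keep squad order, so the same players start when fit
        lineup = available[:starters]
        kickoff_fatigue.extend(fatigue[p] for p in lineup)
        for p in lineup:
            # Assumed injury model: risk grows with the square of fatigue
            if rng.random() < injury_scale * fatigue[p] ** 2:
                injuries += 1
                out_until[p] = match + 1 + 5   # misses the next five matches
            fatigue[p] = min(1.0, fatigue[p] + 0.15)
        # Everyone recovers between matches
        fatigue = [max(0.0, f - 0.04 * rest_days) for f in fatigue]
    return {
        "injuries": injuries,
        "avg_kickoff_fatigue": sum(kickoff_fatigue) / len(kickoff_fatigue),
    }


for strategy in ("minimal", "heavy"):
    print(strategy, simulate_season(strategy, seed=42))
```

Extending it to the full exercise would mean adding a match-outcome model that depends on lineup quality and fatigue, so that the cost of rotation in points can be weighed against the injury savings.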
Exercise 46.3: Multi-Competition Optimizer
Develop an optimization model that allocates squad resources across multiple competitions simultaneously, maximizing expected trophies while respecting workload constraints.
- Define utility functions for each competition outcome
- Include position-specific depth constraints
- Model knockout competition uncertainty
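A toy version of the allocation problem can be solved by brute force, as sketched below. It assumes a fixed budget of full-strength lineups (a stand-in for the workload constraint), knockout competitions where the trophy requires winning every remaining round, and invented utilities and win probabilities. The name `best_allocation` and all figures are hypothetical, and position-specific depth is not modeled.

```python
from itertools import product
from typing import Dict, Tuple

# name -> (trophy utility, rounds remaining, win prob at full strength, win prob when rotated)
Competition = Tuple[float, int, float, float]


def expected_utility(comps: Dict[str, Competition], alloc: Dict[str, int]) -> float:
    """Sum over competitions of utility times probability of winning all remaining rounds."""
    total = 0.0
    for name, (utility, rounds, p_strong, p_rotated) in comps.items():
        strong = alloc[name]
        total += utility * (p_strong ** strong) * (p_rotated ** (rounds - strong))
    return total


def best_allocation(comps: Dict[str, Competition], budget: int) -> Tuple[Dict[str, int], float]:
    """Try every split of full-strength lineups that respects the budget."""
    names = list(comps)
    best_alloc, best_value = None, float("-inf")
    for counts in product(*(range(comps[n][1] + 1) for n in names)):
        if sum(counts) > budget:
            continue  # violates the workload budget
        alloc = dict(zip(names, counts))
        value = expected_utility(comps, alloc)
        if value > best_value:
            best_alloc, best_value = alloc, value
    return best_alloc, best_value


competitions = {
    "Champions League": (10.0, 3, 0.6, 0.3),
    "FA Cup": (4.0, 2, 0.8, 0.6),
}
allocation, value = best_allocation(competitions, budget=3)
print(allocation, round(value, 2))
```

With these invented numbers the search concentrates all three full-strength lineups on the higher-utility competition. Brute force only scales to a handful of competitions and rounds, so a realistic model would need an integer programming or dynamic programming formulation.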
Summary
Key Takeaways
- Fixture Density: Modern scheduling creates unprecedented workload challenges that require systematic analytical approaches
- Workload Monitoring: Track individual player metrics including ACWR, consecutive starts, and cumulative minutes
- Rotation Optimization: Balance match importance, player fatigue, and squad depth using mathematical optimization
- Competition Priorities: Dynamically adjust priorities based on season context and strategic position
- Travel Impact: Factor in travel burden when planning rotation, especially for European competitions
Effective fixture congestion management combines data-driven workload monitoring with strategic competition prioritization. The best rotation strategies protect player welfare while maximizing competitive outcomes across all competitions.