Capstone - Complete Analytics System
Learning Objectives
- Understand the complete match day analytics workflow
- Build pre-match scouting and preparation reports
- Implement live match tracking and halftime analysis
- Create automated post-match reports and visualizations
- Design dashboards for coaching staff consumption
Match day is when analytics meets reality. The work done before, during, and after matches forms the backbone of data-driven decision making in football. This chapter walks through the complete workflow that professional clubs use to prepare, analyze, and learn from every match.
The Match Day Analytics Cycle
Modern football analytics departments operate on a continuous cycle around match days. Each phase has specific deliverables and timelines.
| Phase | Timing | Key Deliverables | Primary Audience |
|---|---|---|---|
| Pre-Match | MD-7 to MD-1 | Opposition report, set piece analysis, team selection insights | Coaching staff, players |
| Match Day | MD (before kickoff) | Final lineup analysis, weather/conditions, key matchups | Coaching staff |
| Live | During match | Real-time stats, halftime analysis, tactical adjustments | Technical staff, bench |
| Post-Match | MD+1 to MD+3 | Performance report, video clips, individual reports | Coaching staff, players, board |
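The MD-n / MD+n notation in the table can be derived from two dates. The following is a minimal sketch, not the chapter's own implementation: the helper names `md_label` and `phase_for` are hypothetical, and the window boundaries (MD-7 to MD-1, MD+1 to MD+3) are taken directly from the table above. The Live phase cannot be inferred from a calendar date alone, so it is not returned here.

```python
from datetime import date


def md_label(match_date: date, today: date) -> str:
    """Return the staff-style label for `today` relative to match day (MD)."""
    offset = (today - match_date).days  # negative before the match, positive after
    if offset == 0:
        return "MD"
    return f"MD{offset:+d}"  # "+d" forces an explicit sign: MD-5, MD+2


def phase_for(match_date: date, today: date) -> str:
    """Map a calendar date onto the phases in the table (hypothetical helper)."""
    offset = (today - match_date).days
    if -7 <= offset <= -1:
        return "pre_match"
    if offset == 0:
        return "match_day"
    if 1 <= offset <= 3:
        return "post_match"
    return "outside_cycle"  # earlier than MD-7 or later than MD+3


match_date = date(2024, 3, 9)
for d in (date(2024, 3, 4), date(2024, 3, 9), date(2024, 3, 11)):
    print(d.isoformat(), md_label(match_date, d), phase_for(match_date, d))
```

Returning an explicit `outside_cycle` value keeps dates far from the fixture from being silently labelled as pre-match or post-match work.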
# Python: Define match day workflow structure
from dataclasses import dataclass, field
from datetime import date, datetime, timedelta
from typing import Dict, List, Optional, Any
from enum import Enum


class MatchPhase(Enum):
    PRE_MATCH = "pre_match"
    MATCH_DAY = "match_day"
    LIVE = "live"
    POST_MATCH = "post_match"


@dataclass
class PreMatchData:
    """Container for pre-match analytics."""
    opposition_report: Optional[Dict] = None
    set_piece_analysis: Optional[Dict] = None
    selection_insights: Optional[Dict] = None
    key_matchups: Optional[List[Dict]] = None
    tactical_plan: Optional[Dict] = None


@dataclass
class LiveMatchData:
    """Container for live match analytics."""
    first_half_stats: Optional[Dict] = None
    halftime_analysis: Optional[Dict] = None
    second_half_stats: Optional[Dict] = None
    substitution_recommendations: Optional[List[Dict]] = None


@dataclass
class PostMatchData:
    """Container for post-match analytics."""
    performance_report: Optional[Dict] = None
    player_ratings: Optional[Dict[str, float]] = None
    video_clips: Optional[List[str]] = None
    xg_timeline: Optional[List[Dict]] = None


@dataclass
class MatchDayWorkflow:
    """Complete match day analytics workflow."""
    match_id: str
    opponent: str
    match_date: date
    home_away: str
    competition: str
    pre_match: PreMatchData = field(default_factory=PreMatchData)
    live_match: LiveMatchData = field(default_factory=LiveMatchData)
    post_match: PostMatchData = field(default_factory=PostMatchData)

    def days_until_match(self) -> int:
        """Calculate days until match."""
        return (self.match_date - date.today()).days

    def get_current_phase(self) -> MatchPhase:
        """Determine current workflow phase.

        Only the calendar date is used, so MatchPhase.LIVE is never returned here.
        """
        days = self.days_until_match()
        if days > 0:
            return MatchPhase.PRE_MATCH
        elif days == 0:
            return MatchPhase.MATCH_DAY
        else:
            return MatchPhase.POST_MATCH


# Example initialization
next_match = MatchDayWorkflow(
    match_id="2024_001",
    opponent="Manchester City",
    match_date=date.today() + timedelta(days=5),
    home_away="Away",
    competition="Premier League"
)
print(f"Match: vs {next_match.opponent} ({next_match.home_away})")
print(f"Days until match: {next_match.days_until_match()}")
print(f"Current phase: {next_match.get_current_phase().value}")

# R: Define match day workflow structure
library(tidyverse)
library(R6)
# Match Day Analytics Workflow
MatchDayWorkflow <- R6Class("MatchDayWorkflow",
public = list(
match_id = NULL,
opponent = NULL,
match_date = NULL,
home_away = NULL,
competition = NULL,
# Workflow stages
pre_match = NULL,
live_match = NULL,
post_match = NULL,
initialize = function(match_id, opponent, match_date,
home_away, competition) {
self$match_id <- match_id
self$opponent <- opponent
self$match_date <- match_date
self$home_away <- home_away
self$competition <- competition
# Initialize stage containers
self$pre_match <- list(
opposition_report = NULL,
set_piece_analysis = NULL,
selection_insights = NULL,
key_matchups = NULL
)
self$live_match <- list(
first_half_stats = NULL,
halftime_analysis = NULL,
second_half_stats = NULL
)
self$post_match <- list(
performance_report = NULL,
player_ratings = NULL,
video_clips = NULL,
next_match_prep = NULL
)
},
get_days_until_match = function() {
as.numeric(self$match_date - Sys.Date())
},
get_current_phase = function() {
days <- self$get_days_until_match()
if (days > 0) return("pre_match")
if (days == 0) return("match_day")
return("post_match")
}
)
)
# Example initialization
next_match <- MatchDayWorkflow$new(
match_id = "2024_001",
opponent = "Manchester City",
match_date = Sys.Date() + 5,
home_away = "Away",
competition = "Premier League"
)
cat(sprintf("Match: vs %s (%s)\n", next_match$opponent, next_match$home_away))
cat(sprintf("Days until match: %d\n", next_match$get_days_until_match()))
cat(sprintf("Current phase: %s\n", next_match$get_current_phase()))

Output:
Match: vs Manchester City (Away)
Days until match: 5
Current phase: pre_match

Pre-Match Analysis
Pre-match preparation typically begins 5-7 days before the match. The analytics team produces opposition reports, identifies key tactical patterns, and provides insights for team selection.
# Python: Build pre-match opposition report
import pandas as pd
import numpy as np
from typing import Dict, List
from dataclasses import dataclass
from datetime import date


@dataclass
class OppositionReport:
    """Structured opposition report."""
    opponent: str
    generated_date: date
    team_style: Dict
    formations: pd.DataFrame
    key_players: Dict[str, pd.DataFrame]
    recent_results: pd.DataFrame
    set_pieces: Dict
    weaknesses: List[str]


class PreMatchAnalyzer:
    """Generate pre-match analysis reports."""

    def generate_opposition_report(self, opponent_name: str,
                                   matches_data: pd.DataFrame,
                                   players_data: pd.DataFrame) -> OppositionReport:
        """Generate comprehensive opposition report."""
        # Filter to the opponent's 10 most recent matches
        opponent_matches = matches_data[
            matches_data["team"] == opponent_name
        ].sort_values("date", ascending=False).head(10)

        # Calculate team style metrics
        team_style = {
            "avg_possession": opponent_matches["possession"].mean(),
            "avg_xg_for": opponent_matches["xg"].mean(),
            "avg_xg_against": opponent_matches["xg_against"].mean(),
            "avg_ppda": opponent_matches["ppda"].mean(),
            "avg_field_tilt": opponent_matches["field_tilt"].mean(),
            "pressing_style": self._classify_pressing(opponent_matches),
            "build_up_style": self._classify_buildup(opponent_matches)
        }

        # Formation analysis
        formations = opponent_matches.groupby("formation").size().reset_index()
        formations.columns = ["formation", "count"]
        formations["pct"] = formations["count"] / formations["count"].sum() * 100
        formations = formations.sort_values("count", ascending=False)

        # Key players: restrict to the 15 players with the most minutes
        opponent_players = players_data[
            players_data["team"] == opponent_name
        ].sort_values("minutes", ascending=False).head(15)
        key_players = {
            "scorers": opponent_players.nlargest(3, "goals"),
            "creators": opponent_players.nlargest(3, "xa"),
            "progressive": opponent_players.nlargest(3, "progressive_passes")
        }

        # Identify weaknesses
        weaknesses = self._identify_weaknesses(team_style, opponent_matches)

        return OppositionReport(
            opponent=opponent_name,
            generated_date=date.today(),
            team_style=team_style,
            formations=formations,
            key_players=key_players,
            recent_results=opponent_matches[["date", "opponent", "result", "score"]],
            set_pieces=self._analyze_set_pieces(opponent_matches),
            weaknesses=weaknesses
        )

    def _classify_pressing(self, matches: pd.DataFrame) -> str:
        """Classify team pressing style (lower PPDA means more intense pressing)."""
        avg_ppda = matches["ppda"].mean()
        if avg_ppda < 8:
            return "High Press"
        elif avg_ppda < 12:
            return "Medium Press"
        return "Low Block"

    def _classify_buildup(self, matches: pd.DataFrame) -> str:
        """Classify build-up style."""
        avg_possession = matches["possession"].mean()
        if avg_possession > 55:
            return "Possession-Based"
        elif avg_possession > 48:
            return "Balanced"
        return "Direct/Counter"

    def _identify_weaknesses(self, style: Dict,
                             matches: pd.DataFrame) -> List[str]:
        """Identify potential weaknesses."""
        weaknesses = []
        if style["avg_xg_against"] > 1.5:
            weaknesses.append("Defensive vulnerability (high xGA)")
        if style["avg_possession"] > 60:
            weaknesses.append("Potentially vulnerable on counter-attacks")
        if matches["goals_conceded_set_pieces"].mean() > 0.5:
            weaknesses.append("Set piece defending issues")
        return weaknesses

    def _analyze_set_pieces(self, matches: pd.DataFrame) -> Dict:
        """Analyze set piece patterns."""
        return {
            "corners_per_game": matches["corners"].mean(),
            "set_piece_goals": matches["goals_from_set_pieces"].sum(),
            "set_piece_goals_conceded": matches["goals_conceded_set_pieces"].sum()
        }

    def format_report(self, report: OppositionReport) -> str:
        """Format report for display."""
        # chr(10) is a newline; it avoids a backslash inside the f-string expression
        output = f"""
=== OPPOSITION REPORT: {report.opponent} ===
Generated: {report.generated_date}

TEAM STYLE:
  Possession: {report.team_style["avg_possession"]:.1f}%
  Build-up: {report.team_style["build_up_style"]}
  Pressing: {report.team_style["pressing_style"]}
  xG For: {report.team_style["avg_xg_for"]:.2f}
  xG Against: {report.team_style["avg_xg_against"]:.2f}

FORMATIONS:
{report.formations.head(3).to_string(index=False)}

KEY THREATS:
  Scorers: {", ".join(report.key_players["scorers"]["player"].tolist())}
  Creators: {", ".join(report.key_players["creators"]["player"].tolist())}

IDENTIFIED WEAKNESSES:
{chr(10).join([" - " + w for w in report.weaknesses])}
"""
        return output


analyzer = PreMatchAnalyzer()
print("Pre-match analyzer ready")

# R: Build pre-match opposition report
library(tidyverse)
# Opposition Analysis Generator
generate_opposition_report <- function(opponent_name, matches_data, players_data) {
# Filter opponent data
opponent_matches <- matches_data %>%
filter(team == opponent_name) %>%
arrange(desc(date)) %>%
head(10) # Last 10 matches
# Team style metrics
team_style <- opponent_matches %>%
summarise(
avg_possession = mean(possession, na.rm = TRUE),
avg_xg_for = mean(xg, na.rm = TRUE),
avg_xg_against = mean(xg_against, na.rm = TRUE),
avg_ppda = mean(ppda, na.rm = TRUE),
avg_field_tilt = mean(field_tilt, na.rm = TRUE),
home_record = sprintf("%d-%d-%d",
sum(result == "W" & venue == "Home"),
sum(result == "D" & venue == "Home"),
sum(result == "L" & venue == "Home")
),
away_record = sprintf("%d-%d-%d",
sum(result == "W" & venue == "Away"),
sum(result == "D" & venue == "Away"),
sum(result == "L" & venue == "Away")
)
)
# Formation analysis
formations <- opponent_matches %>%
count(formation) %>%
arrange(desc(n)) %>%
mutate(pct = n / sum(n) * 100)
# Key players
opponent_players <- players_data %>%
filter(team == opponent_name) %>%
arrange(desc(minutes)) %>%
head(15)
top_creators <- opponent_players %>%
arrange(desc(xa)) %>%
head(3)
top_scorers <- opponent_players %>%
arrange(desc(goals)) %>%
head(3)
# Compile report
report <- list(
opponent = opponent_name,
generated_date = Sys.Date(),
team_style = team_style,
formations = formations,
key_players = list(
creators = top_creators,
scorers = top_scorers
),
recent_results = opponent_matches %>%
select(date, opponent, result, score, xg, xg_against)
)
return(report)
}
# Print formatted report
print_opposition_report <- function(report) {
cat(sprintf("\n=== OPPOSITION REPORT: %s ===\n", report$opponent))
cat(sprintf("Generated: %s\n\n", report$generated_date))
cat("TEAM STYLE:\n")
cat(sprintf(" Avg Possession: %.1f%%\n", report$team_style$avg_possession))
cat(sprintf(" Avg xG For: %.2f | Against: %.2f\n",
report$team_style$avg_xg_for, report$team_style$avg_xg_against))
cat(sprintf(" PPDA: %.1f | Field Tilt: %.1f%%\n",
report$team_style$avg_ppda, report$team_style$avg_field_tilt))
cat("\nFORMATIONS USED:\n")
for (i in 1:min(3, nrow(report$formations))) {
f <- report$formations[i,]
cat(sprintf(" %s: %.0f%% (%d matches)\n", f$formation, f$pct, f$n))
}
cat("\nKEY THREATS:\n")
cat(" Top Scorers:", paste(report$key_players$scorers$player, collapse = ", "), "\n")
cat(" Top Creators:", paste(report$key_players$creators$player, collapse = ", "), "\n")
}

Key Matchups Analysis
# Python: Identify key matchups
import pandas as pd
from typing import Dict, List, Optional
from dataclasses import dataclass


@dataclass
class Matchup:
    """Individual player matchup."""
    our_player: str
    opponent: str
    our_stats: Dict
    opponent_stats: Dict
    advantage: str
    key_metric: str


class MatchupAnalyzer:
    """Analyze key player matchups."""

    def analyze_matchups(self, our_players: pd.DataFrame,
                         opponent_players: pd.DataFrame,
                         our_formation: str,
                         opp_formation: str) -> List[Matchup]:
        """Identify key matchups based on formations."""
        matchups = []

        # Striker vs Center-backs
        striker_matchup = self._analyze_striker_matchup(
            our_players, opponent_players
        )
        if striker_matchup:
            matchups.append(striker_matchup)

        # Winger vs Full-back
        winger_matchups = self._analyze_winger_matchups(
            our_players, opponent_players
        )
        matchups.extend(winger_matchups)

        # Midfield battle
        midfield_matchup = self._analyze_midfield(
            our_players, opponent_players
        )
        if midfield_matchup:
            matchups.append(midfield_matchup)

        return matchups

    def _analyze_striker_matchup(self, our_players: pd.DataFrame,
                                 opp_players: pd.DataFrame) -> Optional[Matchup]:
        """Analyze striker vs center-back matchup."""
        our_strikers = our_players[
            our_players["position"].str.contains("FW|CF|ST", na=False)
        ].sort_values("minutes", ascending=False)
        opp_cbs = opp_players[
            opp_players["position"].str.contains("CB|DC", na=False)
        ].sort_values("minutes", ascending=False)

        if our_strikers.empty or opp_cbs.empty:
            return None

        # Compare the most-used striker with the most-used center-back
        striker = our_strikers.iloc[0]
        cb = opp_cbs.iloc[0]

        # Determine advantage (missing metrics fall back to neutral defaults)
        aerial_diff = striker.get("aerial_win_pct", 50) - cb.get("aerial_win_pct", 50)
        pace_diff = striker.get("pace", 70) - cb.get("pace", 70)

        if aerial_diff > 10:
            advantage = "Aerial advantage - target with crosses"
            key_metric = "aerial"
        elif pace_diff > 5:
            advantage = "Pace advantage - play balls in behind"
            key_metric = "pace"
        elif pace_diff < -5:
            advantage = "Opponent has pace advantage - hold up play"
            key_metric = "pace"
        else:
            advantage = "Even matchup - vary approach"
            key_metric = "balanced"

        return Matchup(
            our_player=striker["player"],
            opponent=cb["player"],
            our_stats={"aerial": striker.get("aerial_win_pct", 50),
                       "pace": striker.get("pace", 70)},
            opponent_stats={"aerial": cb.get("aerial_win_pct", 50),
                            "pace": cb.get("pace", 70)},
            advantage=advantage,
            key_metric=key_metric
        )

    def _analyze_winger_matchups(self, our_players: pd.DataFrame,
                                 opp_players: pd.DataFrame) -> List[Matchup]:
        """Analyze winger vs full-back matchups."""
        matchups = []
        # Implementation similar to striker matchup
        return matchups

    def _analyze_midfield(self, our_players: pd.DataFrame,
                          opp_players: pd.DataFrame) -> Optional[Matchup]:
        """Analyze midfield battle."""
        # Implementation
        return None

    def format_matchups(self, matchups: List[Matchup]) -> str:
        """Format matchups for display."""
        output = "\n=== KEY MATCHUPS ===\n"
        for m in matchups:
            output += f"""
{m.our_player} vs {m.opponent}
  Our stats: {m.our_stats}
  Their stats: {m.opponent_stats}
  Assessment: {m.advantage}
"""
        return output


print("Matchup analyzer ready")

# R: Identify key matchups
library(tidyverse)
# Find key player matchups
analyze_key_matchups <- function(our_players, opponent_players, formation) {
matchups <- list()
# Striker vs Center-backs
our_strikers <- our_players %>%
filter(grepl("FW|CF|ST", position)) %>%
arrange(desc(minutes))
opp_cbs <- opponent_players %>%
filter(grepl("CB|DC", position)) %>%
arrange(desc(minutes))
if (nrow(our_strikers) > 0 && nrow(opp_cbs) > 0) {
matchups$striker_vs_cb <- tibble(
our_player = our_strikers$player[1],
opponent = opp_cbs$player[1],
our_aerial_win_pct = our_strikers$aerial_win_pct[1],
opp_aerial_win_pct = opp_cbs$aerial_win_pct[1],
our_pace = our_strikers$pace[1],
opp_pace = opp_cbs$pace[1],
matchup_advantage = case_when(
our_strikers$aerial_win_pct[1] > opp_cbs$aerial_win_pct[1] + 10 ~ "Aerial advantage",
our_strikers$pace[1] > opp_cbs$pace[1] + 5 ~ "Pace advantage",
TRUE ~ "Neutral"
)
)
}
# Wingers vs Full-backs
our_wingers <- our_players %>%
filter(grepl("LW|RW|LM|RM", position)) %>%
arrange(desc(dribbles_completed))
opp_fbs <- opponent_players %>%
filter(grepl("LB|RB|FB", position)) %>%
arrange(desc(minutes))
# Midfield battle
our_mids <- our_players %>%
filter(grepl("CM|CDM|CAM", position)) %>%
arrange(desc(minutes))
opp_mids <- opponent_players %>%
filter(grepl("CM|CDM|CAM", position)) %>%
arrange(desc(minutes))
return(matchups)
}
# Generate matchup summary
print_matchup_summary <- function(matchups) {
cat("\n=== KEY MATCHUPS ===\n\n")
if (!is.null(matchups$striker_vs_cb)) {
m <- matchups$striker_vs_cb
cat(sprintf("STRIKER VS CB: %s vs %s\n", m$our_player, m$opponent))
cat(sprintf(" Aerial: %.0f%% vs %.0f%%\n",
m$our_aerial_win_pct, m$opp_aerial_win_pct))
cat(sprintf(" Assessment: %s\n\n", m$matchup_advantage))
}
}

Live Match Analytics
During the match, the analytics team tracks real-time statistics and prepares halftime insights. Speed is crucial: the analysis must be ready within 2-3 minutes of the halftime whistle.
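One way to meet a tight halftime deadline is to keep summaries cheap enough to recompute on demand. The sketch below is an illustration under stated assumptions rather than a prescribed method: the function name `xg_by_interval` is hypothetical, events are assumed to be plain dicts with `minute`, `team`, `event_type`, and `xg` keys, and the sample team names and xG values are made up. Stoppage-time shots are folded into the final bin.

```python
def xg_by_interval(events, team, width=15, end_minute=45):
    """Sum shot xG for `team` in fixed-width minute bins covering the first half."""
    n_bins = end_minute // width
    totals = [0.0] * n_bins
    for e in events:
        if e["team"] != team or e["event_type"] != "shot":
            continue
        # Minutes at or beyond end_minute (stoppage time) go into the last bin
        idx = min(e["minute"] // width, n_bins - 1)
        totals[idx] += e.get("xg", 0.0)
    return {
        f"{i * width}-{(i + 1) * width}": round(total, 2)
        for i, total in enumerate(totals)
    }


# Made-up first-half events for illustration only
first_half = [
    {"minute": 7, "team": "Home FC", "event_type": "shot", "xg": 0.08},
    {"minute": 22, "team": "Away FC", "event_type": "shot", "xg": 0.31},
    {"minute": 29, "team": "Home FC", "event_type": "shot", "xg": 0.12},
    {"minute": 33, "team": "Home FC", "event_type": "foul"},
    {"minute": 41, "team": "Home FC", "event_type": "shot", "xg": 0.45},
    {"minute": 45, "team": "Home FC", "event_type": "shot", "xg": 0.05},
]

home_bins = xg_by_interval(first_half, "Home FC")
away_bins = xg_by_interval(first_half, "Away FC")
print(home_bins)
print(away_bins)
```

A per-interval breakdown like this shows at a glance whether chances were spread across the half or concentrated in one spell, which a single xG total hides.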
# Python: Live match tracking system
from dataclasses import dataclass, field
from typing import Dict, List, Optional
from datetime import datetime
import pandas as pd


@dataclass
class LiveStats:
    """Running match statistics."""
    team: str
    shots: int = 0
    shots_on_target: int = 0
    xg: float = 0.0
    passes: int = 0
    pass_accuracy: float = 0.0
    possession: float = 50.0
    corners: int = 0
    fouls: int = 0
    yellows: int = 0
    reds: int = 0


class LiveMatchTracker:
    """Track and analyze match in real-time."""

    def __init__(self, match_id: str, home_team: str, away_team: str):
        self.match_id = match_id
        self.home_team = home_team
        self.away_team = away_team
        self.events: List[Dict] = []
        self.score = {"home": 0, "away": 0}
        self.current_minute = 0
        self.home_stats = LiveStats(team=home_team)
        self.away_stats = LiveStats(team=away_team)

    def add_event(self, minute: int, event_type: str, team: str,
                  player: Optional[str] = None, **details):
        """Add match event and update stats."""
        event = {
            "minute": minute,
            "event_type": event_type,
            "team": team,
            "player": player,
            "timestamp": datetime.now(),
            **details
        }
        self.events.append(event)
        self.current_minute = minute
        self._update_stats(event)

    def _update_stats(self, event: Dict):
        """Update running statistics based on event."""
        stats = self.home_stats if event["team"] == self.home_team else self.away_stats
        if event["event_type"] == "shot":
            stats.shots += 1
            if event.get("on_target"):
                stats.shots_on_target += 1
            stats.xg += event.get("xg", 0)
        elif event["event_type"] == "goal":
            if event["team"] == self.home_team:
                self.score["home"] += 1
            else:
                self.score["away"] += 1
        elif event["event_type"] == "corner":
            stats.corners += 1
        elif event["event_type"] == "foul":
            stats.fouls += 1
        elif event["event_type"] == "yellow_card":
            stats.yellows += 1
        elif event["event_type"] == "red_card":
            # Keep the reds counter in step with the red_card key events reported below
            stats.reds += 1

    def generate_halftime_report(self) -> Dict:
        """Generate halftime analysis report."""
        xg_diff = self.home_stats.xg - self.away_stats.xg
        return {
            "score": self.score,
            "xg": {
                "home": round(self.home_stats.xg, 2),
                "away": round(self.away_stats.xg, 2),
                "diff": round(xg_diff, 2)
            },
            "shots": {
                "home": f"{self.home_stats.shots} ({self.home_stats.shots_on_target} on target)",
                "away": f"{self.away_stats.shots} ({self.away_stats.shots_on_target} on target)"
            },
            "key_events": [e for e in self.events
                           if e["event_type"] in ["goal", "yellow_card", "red_card"]],
            "recommendations": self._generate_recommendations()
        }

    def _generate_recommendations(self) -> List[str]:
        """Generate tactical recommendations (written from the home team's perspective)."""
        recs = []
        xg_diff = self.home_stats.xg - self.away_stats.xg
        score_diff = self.score["home"] - self.score["away"]
        if xg_diff < -0.5 and score_diff >= 0:
            recs.append("Overperforming xG - consider defensive consolidation")
        if xg_diff > 0.5 and score_diff <= 0:
            recs.append("Creating chances but not converting - maintain approach")
        if self.away_stats.xg > 1.0 and self.score["away"] == 0:
            recs.append("Opponent creating quality chances - tighten defense")
        return recs

    def format_halftime_display(self) -> str:
        """Format halftime report for display."""
        report = self.generate_halftime_report()
        return f"""
=== HALFTIME REPORT ===
Score: {self.home_team} {report["score"]["home"]} - {report["score"]["away"]} {self.away_team}
xG: {report["xg"]["home"]} - {report["xg"]["away"]}
Shots: {report["shots"]["home"]} - {report["shots"]["away"]}

Recommendations:
{chr(10).join([" - " + r for r in report["recommendations"]])}
"""


print("Live match tracker ready")

# R: Live match tracking system
library(tidyverse)
library(R6)  # R6Class() below comes from the R6 package
# Live Match Tracker
LiveMatchTracker <- R6Class("LiveMatchTracker",
public = list(
match_id = NULL,
events = NULL,
current_score = NULL,
current_minute = 0,
# Running statistics
home_stats = NULL,
away_stats = NULL,
initialize = function(match_id, home_team, away_team) {
self$match_id <- match_id
self$events <- tibble()
self$current_score <- c(home = 0, away = 0)
# Initialize stats
self$home_stats <- list(
team = home_team,
shots = 0, shots_on_target = 0,
xg = 0, passes = 0, pass_accuracy = 0,
possession = 50, corners = 0,
fouls = 0, yellows = 0, reds = 0
)
self$away_stats <- self$home_stats
self$away_stats$team <- away_team
},
# Add event
add_event = function(minute, event_type, team, player = NULL,
details = list()) {
new_event <- tibble(
minute = minute,
event_type = event_type,
team = team,
player = player,
timestamp = Sys.time()
)
# Bind additional details
for (name in names(details)) {
new_event[[name]] <- details[[name]]
}
self$events <- bind_rows(self$events, new_event)
self$current_minute <- minute
# Update running stats
self$update_stats(new_event)
},
update_stats = function(event) {
stats <- if (event$team == self$home_stats$team) self$home_stats else self$away_stats
switch(event$event_type,
"shot" = {
stats$shots <- stats$shots + 1
if (!is.null(event$on_target) && event$on_target) {
stats$shots_on_target <- stats$shots_on_target + 1
}
if (!is.null(event$xg)) {
stats$xg <- stats$xg + event$xg
}
},
"goal" = {
if (event$team == self$home_stats$team) {
self$current_score["home"] <- self$current_score["home"] + 1
} else {
self$current_score["away"] <- self$current_score["away"] + 1
}
},
"corner" = { stats$corners <- stats$corners + 1 },
"foul" = { stats$fouls <- stats$fouls + 1 },
"yellow_card" = { stats$yellows <- stats$yellows + 1 }
)
if (event$team == self$home_stats$team) {
self$home_stats <- stats
} else {
self$away_stats <- stats
}
},
# Generate halftime report
generate_halftime_report = function() {
list(
score = self$current_score,
home = self$home_stats,
away = self$away_stats,
xg_diff = self$home_stats$xg - self$away_stats$xg,
key_events = self$events %>%
filter(event_type %in% c("goal", "yellow_card", "red_card", "penalty")),
recommendations = self$generate_recommendations()
)
},
generate_recommendations = function() {
recs <- character()
# Based on xG difference
xg_diff <- self$home_stats$xg - self$away_stats$xg
if (xg_diff < -0.5 && self$current_score["home"] >= self$current_score["away"]) {
recs <- c(recs, "Overperforming xG - consider more defensive approach")
}
if (xg_diff > 0.5 && self$current_score["home"] <= self$current_score["away"]) {
recs <- c(recs, "Underperforming xG - continue attacking approach")
}
return(recs)
}
)
)
cat("Live match tracker ready\n")

Post-Match Analysis
Post-match analysis typically begins immediately after the final whistle and continues for 2-3 days. The goal is to produce actionable insights for the next match cycle.
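A common post-match artifact is a cumulative xG timeline, which matches the `xg_timeline` field declared in `PostMatchData` earlier. The sketch below shows one possible way to build it with the standard library. The function name `build_xg_timeline`, the dict keys, and the sample shots are assumptions made for illustration.

```python
from itertools import accumulate


def build_xg_timeline(shots, team):
    """Return one entry per shot by `team`, carrying the running xG total."""
    team_shots = sorted(
        (s for s in shots if s["team"] == team),
        key=lambda s: s["minute"],
    )
    # accumulate() yields the running sum of xG in chronological order
    running = accumulate(s["xg"] for s in team_shots)
    return [
        {"minute": s["minute"], "player": s["player"], "cumulative_xg": round(total, 2)}
        for s, total in zip(team_shots, running)
    ]


# Made-up shots for illustration only
shots = [
    {"minute": 34, "team": "Home FC", "player": "Player B", "xg": 0.25},
    {"minute": 12, "team": "Home FC", "player": "Player A", "xg": 0.10},
    {"minute": 51, "team": "Away FC", "player": "Player X", "xg": 0.60},
    {"minute": 78, "team": "Home FC", "player": "Player A", "xg": 0.40},
]

timeline = build_xg_timeline(shots, "Home FC")
for point in timeline:
    print(point)
```

Plotting `cumulative_xg` against `minute` as a step chart for both teams gives the familiar xG race view used in many post-match reports.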
# Python: Comprehensive post-match report generator
import pandas as pd
import numpy as np
from typing import Dict, List
from dataclasses import dataclass


@dataclass
class PostMatchReport:
    """Complete post-match analysis report."""
    match_id: str
    result: str
    score: str
    xg_analysis: Dict
    player_ratings: pd.DataFrame
    key_moments: pd.DataFrame
    improvement_areas: List[str]
    tactical_notes: List[str]


class PostMatchAnalyzer:
    """Generate comprehensive post-match analysis."""

    def generate_report(self, match_events: pd.DataFrame,
                        player_stats: pd.DataFrame,
                        our_team: str,
                        opponent: str) -> PostMatchReport:
        """Generate complete post-match report."""
        # Calculate match result
        our_goals = len(match_events[
            (match_events["event_type"] == "goal") &
            (match_events["team"] == our_team)
        ])
        opp_goals = len(match_events[
            (match_events["event_type"] == "goal") &
            (match_events["team"] == opponent)
        ])
        if our_goals > opp_goals:
            result = "Win"
        elif our_goals < opp_goals:
            result = "Loss"
        else:
            result = "Draw"

        # xG analysis
        our_xg = match_events[match_events["team"] == our_team]["xg"].sum()
        opp_xg = match_events[match_events["team"] == opponent]["xg"].sum()
        xg_analysis = {
            "our_xg": round(our_xg, 2),
            "opponent_xg": round(opp_xg, 2),
            "xg_difference": round(our_xg - opp_xg, 2),
            "goal_difference": our_goals - opp_goals,
            "performance": self._assess_performance(
                our_goals - opp_goals, our_xg - opp_xg
            )
        }

        # Player ratings
        player_ratings = self._calculate_player_ratings(
            player_stats[player_stats["team"] == our_team]
        )

        # Key moments
        key_moments = match_events[
            match_events["event_type"].isin(["goal", "red_card", "penalty"])
        ][["minute", "event_type", "team", "player"]]

        # Improvement areas
        improvement_areas = self._identify_improvements(match_events, our_team)

        return PostMatchReport(
            match_id=match_events["match_id"].iloc[0] if not match_events.empty else "N/A",
            result=result,
            score=f"{our_goals} - {opp_goals}",
            xg_analysis=xg_analysis,
            player_ratings=player_ratings,
            key_moments=key_moments,
            improvement_areas=improvement_areas,
            tactical_notes=[]
        )

    def _assess_performance(self, goal_diff: int, xg_diff: float) -> str:
        """Assess performance vs expectation."""
        actual_vs_expected = goal_diff - xg_diff
        if actual_vs_expected > 0.5:
            return "Overperformed expectations"
        elif actual_vs_expected < -0.5:
            return "Underperformed expectations"
        return "Performed as expected"

    def _calculate_player_ratings(self, stats: pd.DataFrame) -> pd.DataFrame:
        """Calculate player match ratings."""
        df = stats.copy()
        # Base rating + contributions (a missing column contributes 0)
        df["rating"] = 6.0  # Base
        df["rating"] += df.get("goals", 0) * 0.5
        df["rating"] += df.get("assists", 0) * 0.3
        df["rating"] += df.get("key_passes", 0) * 0.1
        df["rating"] += df.get("tackles_won", 0) * 0.05
        df["rating"] -= df.get("errors", 0) * 0.3
        # Clamp ratings 4-10
        df["rating"] = df["rating"].clip(4, 10)
        return df.sort_values("rating", ascending=False)[
            ["player", "position", "minutes", "rating"]
        ]

    def _identify_improvements(self, events: pd.DataFrame,
                               our_team: str) -> List[str]:
        """Identify areas for improvement."""
        improvements = []
        # Check for goals conceded from set pieces
        goals_conceded = events[
            (events["event_type"] == "goal") &
            (events["team"] != our_team)
        ]
        if "situation" in goals_conceded.columns:
            set_piece_goals = goals_conceded[
                goals_conceded["situation"].str.contains(
                    "corner|free kick", case=False, na=False
                )
            ]
            if len(set_piece_goals) > 0:
                improvements.append(
                    f"Set piece defending: {len(set_piece_goals)} goal(s) conceded"
                )
        return improvements

    def format_report(self, report: PostMatchReport) -> str:
        """Format report for display."""
        return f"""
================================================================================
POST-MATCH ANALYSIS REPORT
================================================================================
Result: {report.result} ({report.score})

xG ANALYSIS:
  Our xG: {report.xg_analysis["our_xg"]}
  Opponent xG: {report.xg_analysis["opponent_xg"]}
  xG Difference: {report.xg_analysis["xg_difference"]:+.2f}
  Assessment: {report.xg_analysis["performance"]}

PLAYER RATINGS:
{report.player_ratings.head(5).to_string(index=False)}

AREAS FOR IMPROVEMENT:
{chr(10).join([" - " + a for a in report.improvement_areas]) or " None identified"}
================================================================================
"""


print("Post-match analyzer ready")

# R: Comprehensive post-match report generator
library(tidyverse)
library(ggplot2)
# Post-match Report Generator
generate_post_match_report <- function(match_events, player_stats,
our_team, opponent) {
report <- list()
# Match summary
our_goals <- sum(match_events$event_type == "goal" &
match_events$team == our_team)
opp_goals <- sum(match_events$event_type == "goal" &
match_events$team == opponent)
report$summary <- list(
result = case_when(
our_goals > opp_goals ~ "Win",
our_goals < opp_goals ~ "Loss",
TRUE ~ "Draw"
),
score = sprintf("%d - %d", our_goals, opp_goals),
our_xg = sum(match_events$xg[match_events$team == our_team], na.rm = TRUE),
opp_xg = sum(match_events$xg[match_events$team == opponent], na.rm = TRUE)
)
# Performance vs expectation
report$xg_analysis <- list(
xg_difference = report$summary$our_xg - report$summary$opp_xg,
goal_difference = our_goals - opp_goals,
performance = case_when(
(our_goals - opp_goals) > (report$summary$our_xg - report$summary$opp_xg) + 0.5 ~
"Overperformed xG",
(our_goals - opp_goals) < (report$summary$our_xg - report$summary$opp_xg) - 0.5 ~
"Underperformed xG",
TRUE ~ "Performed as expected"
)
)
# Player ratings
report$player_ratings <- player_stats %>%
filter(team == our_team) %>%
mutate(
rating = 6.0 + # Base rating
(goals * 0.5) +
(assists * 0.3) +
(key_passes * 0.1) +
(successful_dribbles * 0.05) +
(tackles_won * 0.05) -
(errors_leading_to_shot * 0.3)
) %>%
mutate(rating = pmin(pmax(rating, 4), 10)) %>% # Clamp 4-10
arrange(desc(rating)) %>%
select(player, position, minutes, rating, goals, assists)
# Key moments
report$key_moments <- match_events %>%
filter(event_type %in% c("goal", "red_card", "penalty_miss", "var_decision")) %>%
select(minute, event_type, team, player, details)
# Areas for improvement
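report$improvement_areas <- identify_improvement_areas(match_events, player_stats, our_team)
return(report)
}
# Identify tactical improvements
# our_team is passed in explicitly because the filters below compare against it
identify_improvement_areas <- function(events, player_stats, our_team) {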
areas <- character()
# Check set piece defending
set_piece_goals_conceded <- events %>%
filter(event_type == "goal", team != our_team,
grepl("corner|free kick|throw", situation)) %>%
nrow()
if (set_piece_goals_conceded > 0) {
areas <- c(areas, sprintf("Set piece defending: %d goals conceded",
set_piece_goals_conceded))
}
# Check transition defending
counter_goals <- events %>%
filter(event_type == "goal", team != our_team,
grepl("counter|fast break", situation)) %>%
nrow()
if (counter_goals > 0) {
areas <- c(areas, "Counter-attack vulnerability")
}
return(areas)
}
cat("Post-match report generator ready\n")

Practice Exercises
Implement the complete match day workflow using real StatsBomb open data. Create pre-match opposition reports, simulate live tracking, and generate post-match analysis for a Champions League match.
Design a one-page halftime dashboard that displays: current score, xG comparison, shot maps, possession flow, and key events. The dashboard should update automatically and be optimized for tablet viewing on the touchline.
Create a system that takes match event data and generates timestamp markers for video analysis. Automatically tag key moments like goals, big chances, defensive errors, and tactical patterns for video review sessions.
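As a possible starting point for the video-tagging exercise, the sketch below turns tagged events into clip windows. Everything here is an assumption for illustration: the function names `to_clock` and `clip_markers`, the `minute`/`second` event keys, the pre-roll and post-roll defaults, and the sample events. It also assumes the match clock lines up with the video clock. Real footage usually needs a per-half offset, which is left out.

```python
def to_clock(seconds: int) -> str:
    """Format a second count as mm:ss."""
    return f"{seconds // 60:02d}:{seconds % 60:02d}"


def clip_markers(events, tagged_types, pre_roll=10, post_roll=5):
    """Build start/end markers around each event whose type is in `tagged_types`."""
    markers = []
    for e in sorted(events, key=lambda ev: (ev["minute"], ev["second"])):
        if e["event_type"] not in tagged_types:
            continue
        t = e["minute"] * 60 + e["second"]
        markers.append({
            "label": f"{e['event_type']}: {e['player']}",
            "start": to_clock(max(0, t - pre_roll)),  # never before 00:00
            "end": to_clock(t + post_roll),
        })
    return markers


# Made-up events for illustration only
events = [
    {"minute": 23, "second": 14, "event_type": "goal", "player": "Player A"},
    {"minute": 0, "second": 6, "event_type": "big_chance", "player": "Player B"},
    {"minute": 10, "second": 30, "event_type": "pass", "player": "Player C"},
]

markers = clip_markers(events, tagged_types={"goal", "big_chance"})
for m in markers:
    print(m)
```

Keeping the tagged event types as a parameter lets the same function serve different review sessions, for example a set-piece session versus a defensive-errors session.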
Summary
Key Takeaways
- Structured workflow: Match day analytics follows a clear pre/live/post cycle
- Pre-match: Opposition reports, key matchups, and tactical preparation
- Live analysis: Real-time tracking with rapid halftime insights
- Post-match: Performance assessment, player ratings, and improvement identification
- Speed matters: Halftime analysis must be ready in 2-3 minutes
Deliverables by Phase
| Phase | Key Deliverables |
|---|---|
| Pre-Match | Opposition report, set piece analysis, team selection insights |
| Live | Running stats, halftime analysis, substitution recommendations |
| Post-Match | Performance report, player ratings, video clips, improvement areas |