Chapter 60

Capstone - Complete Analytics System

Intermediate 30 min read 5 sections 10 code examples
Learning Objectives
  • Understand the complete match day analytics workflow
  • Build pre-match scouting and preparation reports
  • Implement live match tracking and halftime analysis
  • Create automated post-match reports and visualizations
  • Design dashboards for coaching staff consumption

Match day is when analytics meets reality. The work done before, during, and after matches forms the backbone of data-driven decision making in football. This chapter walks through the complete workflow that professional clubs use to prepare, analyze, and learn from every match.

The Match Day Analytics Cycle

Modern football analytics departments operate on a continuous cycle around match days. Each phase has specific deliverables and timelines.

Match Day Timeline

Phase      | Timing              | Key Deliverables                                                | Primary Audience
Pre-Match  | MD-7 to MD-1        | Opposition report, set piece analysis, team selection insights | Coaching staff, players
Match Day  | MD (before kickoff) | Final lineup analysis, weather/conditions, key matchups        | Coaching staff
Live       | During match        | Real-time stats, halftime analysis, tactical adjustments       | Technical staff, bench
Post-Match | MD+1 to MD+3        | Performance report, video clips, individual reports            | Coaching staff, players, board

match_day_workflow.R / match_day_workflow.py
# Python: Define match day workflow structure
from dataclasses import dataclass, field
from datetime import date, datetime, timedelta
from typing import Dict, List, Optional, Any
from enum import Enum

class MatchPhase(Enum):
    PRE_MATCH = "pre_match"
    MATCH_DAY = "match_day"
    LIVE = "live"
    POST_MATCH = "post_match"

@dataclass
class PreMatchData:
    """Container for pre-match analytics."""
    opposition_report: Optional[Dict] = None
    set_piece_analysis: Optional[Dict] = None
    selection_insights: Optional[Dict] = None
    key_matchups: Optional[List[Dict]] = None
    tactical_plan: Optional[Dict] = None

@dataclass
class LiveMatchData:
    """Container for live match analytics."""
    first_half_stats: Optional[Dict] = None
    halftime_analysis: Optional[Dict] = None
    second_half_stats: Optional[Dict] = None
    substitution_recommendations: Optional[List[Dict]] = None

@dataclass
class PostMatchData:
    """Container for post-match analytics."""
    performance_report: Optional[Dict] = None
    player_ratings: Optional[Dict[str, float]] = None
    video_clips: Optional[List[str]] = None
    xg_timeline: Optional[List[Dict]] = None

@dataclass
class MatchDayWorkflow:
    """Complete match day analytics workflow."""
    match_id: str
    opponent: str
    match_date: date
    home_away: str
    competition: str
    pre_match: PreMatchData = field(default_factory=PreMatchData)
    live_match: LiveMatchData = field(default_factory=LiveMatchData)
    post_match: PostMatchData = field(default_factory=PostMatchData)

    def days_until_match(self) -> int:
        """Calculate days until match."""
        return (self.match_date - date.today()).days

    def get_current_phase(self) -> MatchPhase:
        """Determine current workflow phase from the calendar date.

        Works at day granularity, so MatchPhase.LIVE is never returned here.
        """
        days = self.days_until_match()
        if days > 0:
            return MatchPhase.PRE_MATCH
        elif days == 0:
            return MatchPhase.MATCH_DAY
        else:
            return MatchPhase.POST_MATCH

# Example initialization
next_match = MatchDayWorkflow(
    match_id="2024_001",
    opponent="Manchester City",
    match_date=date.today() + timedelta(days=5),
    home_away="Away",
    competition="Premier League"
)

print(f"Match: vs {next_match.opponent} ({next_match.home_away})")
print(f"Days until match: {next_match.days_until_match()}")
print(f"Current phase: {next_match.get_current_phase().value}")
# R: Define match day workflow structure
library(tidyverse)
library(R6)

# Match Day Analytics Workflow
MatchDayWorkflow <- R6Class("MatchDayWorkflow",
  public = list(
    match_id = NULL,
    opponent = NULL,
    match_date = NULL,
    home_away = NULL,
    competition = NULL,

    # Workflow stages
    pre_match = NULL,
    live_match = NULL,
    post_match = NULL,

    initialize = function(match_id, opponent, match_date,
                         home_away, competition) {
      self$match_id <- match_id
      self$opponent <- opponent
      self$match_date <- match_date
      self$home_away <- home_away
      self$competition <- competition

      # Initialize stage containers
      self$pre_match <- list(
        opposition_report = NULL,
        set_piece_analysis = NULL,
        selection_insights = NULL,
        key_matchups = NULL
      )
      self$live_match <- list(
        first_half_stats = NULL,
        halftime_analysis = NULL,
        second_half_stats = NULL
      )
      self$post_match <- list(
        performance_report = NULL,
        player_ratings = NULL,
        video_clips = NULL,
        next_match_prep = NULL
      )
    },

    get_days_until_match = function() {
      as.numeric(self$match_date - Sys.Date())
    },

    get_current_phase = function() {
      days <- self$get_days_until_match()
      if (days > 0) return("pre_match")
      if (days == 0) return("match_day")
      return("post_match")
    }
  )
)

# Example initialization
next_match <- MatchDayWorkflow$new(
  match_id = "2024_001",
  opponent = "Manchester City",
  match_date = Sys.Date() + 5,
  home_away = "Away",
  competition = "Premier League"
)

cat(sprintf("Match: vs %s (%s)\n", next_match$opponent, next_match$home_away))
cat(sprintf("Days until match: %d\n", next_match$get_days_until_match()))
cat(sprintf("Current phase: %s\n", next_match$get_current_phase()))
Output
Match: vs Manchester City (Away)
Days until match: 5
Current phase: pre_match
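
Both get_current_phase implementations above compare calendar dates only, so the Live phase from the timeline is never reported. The following is a minimal sketch of one way to close that gap, assuming the kickoff time is known as a datetime and that a 120-minute window is long enough to cover a match. The names resolve_phase and MATCH_WINDOW are illustrative and not part of the chapter's classes.

```python
from datetime import datetime, timedelta
from enum import Enum


class MatchPhase(Enum):
    PRE_MATCH = "pre_match"
    MATCH_DAY = "match_day"
    LIVE = "live"
    POST_MATCH = "post_match"


# Assumption: 120 minutes covers both halves, halftime and stoppage time.
MATCH_WINDOW = timedelta(minutes=120)


def resolve_phase(kickoff: datetime, now: datetime) -> MatchPhase:
    """Return the workflow phase for `now` relative to `kickoff`."""
    if now >= kickoff + MATCH_WINDOW:
        return MatchPhase.POST_MATCH
    if now >= kickoff:
        return MatchPhase.LIVE
    if now.date() == kickoff.date():
        return MatchPhase.MATCH_DAY
    return MatchPhase.PRE_MATCH


# Hypothetical kickoff used only to exercise each branch.
kickoff = datetime(2024, 8, 17, 15, 0)
checks = [
    ("MD-5", kickoff - timedelta(days=5)),
    ("morning of match", kickoff - timedelta(hours=6)),
    ("minute 30", kickoff + timedelta(minutes=30)),
    ("MD+1", kickoff + timedelta(days=1)),
]
for label, moment in checks:
    print(f"{label}: {resolve_phase(kickoff, moment).value}")
```

Passing `now` explicitly, rather than calling datetime.now() inside the function, keeps the logic easy to test against fixed timestamps.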

Pre-Match Analysis

Pre-match preparation typically begins 5-7 days before the match. The analytics team produces opposition reports, identifies key tactical patterns, and provides insights for team selection.

pre_match_analysis.R / pre_match_analysis.py
# Python: Build pre-match opposition report
import pandas as pd
import numpy as np
from typing import Dict, List
from dataclasses import dataclass
from datetime import date

@dataclass
class OppositionReport:
    """Structured opposition report."""
    opponent: str
    generated_date: date
    team_style: Dict
    formations: pd.DataFrame
    key_players: Dict[str, pd.DataFrame]
    recent_results: pd.DataFrame
    set_pieces: Dict
    weaknesses: List[str]

class PreMatchAnalyzer:
    """Generate pre-match analysis reports."""

    def generate_opposition_report(self, opponent_name: str,
                                    matches_data: pd.DataFrame,
                                    players_data: pd.DataFrame) -> OppositionReport:
        """Generate comprehensive opposition report."""

        # Filter to opponent matches
        opponent_matches = matches_data[
            matches_data["team"] == opponent_name
        ].sort_values("date", ascending=False).head(10)

        # Calculate team style metrics
        team_style = {
            "avg_possession": opponent_matches["possession"].mean(),
            "avg_xg_for": opponent_matches["xg"].mean(),
            "avg_xg_against": opponent_matches["xg_against"].mean(),
            "avg_ppda": opponent_matches["ppda"].mean(),
            "avg_field_tilt": opponent_matches["field_tilt"].mean(),
            "pressing_style": self._classify_pressing(opponent_matches),
            "build_up_style": self._classify_buildup(opponent_matches)
        }

        # Formation analysis
        formations = opponent_matches.groupby("formation").size().reset_index()
        formations.columns = ["formation", "count"]
        formations["pct"] = formations["count"] / formations["count"].sum() * 100
        formations = formations.sort_values("count", ascending=False)

        # Key players
        opponent_players = players_data[
            players_data["team"] == opponent_name
        ].sort_values("minutes", ascending=False).head(15)

        key_players = {
            "scorers": opponent_players.nlargest(3, "goals"),
            "creators": opponent_players.nlargest(3, "xa"),
            "progressive": opponent_players.nlargest(3, "progressive_passes")
        }

        # Identify weaknesses
        weaknesses = self._identify_weaknesses(team_style, opponent_matches)

        return OppositionReport(
            opponent=opponent_name,
            generated_date=date.today(),
            team_style=team_style,
            formations=formations,
            key_players=key_players,
            recent_results=opponent_matches[["date", "opponent", "result", "score"]],
            set_pieces=self._analyze_set_pieces(opponent_matches),
            weaknesses=weaknesses
        )

    def _classify_pressing(self, matches: pd.DataFrame) -> str:
        """Classify team pressing style."""
        avg_ppda = matches["ppda"].mean()
        if avg_ppda < 8:
            return "High Press"
        elif avg_ppda < 12:
            return "Medium Press"
        return "Low Block"

    def _classify_buildup(self, matches: pd.DataFrame) -> str:
        """Classify build-up style."""
        avg_possession = matches["possession"].mean()
        if avg_possession > 55:
            return "Possession-Based"
        elif avg_possession > 48:
            return "Balanced"
        return "Direct/Counter"

    def _identify_weaknesses(self, style: Dict,
                             matches: pd.DataFrame) -> List[str]:
        """Identify potential weaknesses."""
        weaknesses = []
        if style["avg_xg_against"] > 1.5:
            weaknesses.append("Defensive vulnerability (high xGA)")
        if style["avg_possession"] > 60:
            weaknesses.append("Potentially vulnerable on counter-attacks")
        if matches["goals_conceded_set_pieces"].mean() > 0.5:
            weaknesses.append("Set piece defending issues")
        return weaknesses

    def _analyze_set_pieces(self, matches: pd.DataFrame) -> Dict:
        """Analyze set piece patterns."""
        return {
            "corners_per_game": matches["corners"].mean(),
            "set_piece_goals": matches["goals_from_set_pieces"].sum(),
            "set_piece_goals_conceded": matches["goals_conceded_set_pieces"].sum()
        }

    def format_report(self, report: OppositionReport) -> str:
        """Format report for display."""
        output = f"""
=== OPPOSITION REPORT: {report.opponent} ===
Generated: {report.generated_date}

TEAM STYLE:
  Possession: {report.team_style["avg_possession"]:.1f}%
  Build-up: {report.team_style["build_up_style"]}
  Pressing: {report.team_style["pressing_style"]}
  xG For: {report.team_style["avg_xg_for"]:.2f}
  xG Against: {report.team_style["avg_xg_against"]:.2f}

FORMATIONS:
{report.formations.head(3).to_string(index=False)}

KEY THREATS:
  Scorers: {", ".join(report.key_players["scorers"]["player"].tolist())}
  Creators: {", ".join(report.key_players["creators"]["player"].tolist())}

IDENTIFIED WEAKNESSES:
{chr(10).join(["  - " + w for w in report.weaknesses])}
"""
        return output

analyzer = PreMatchAnalyzer()
print("Pre-match analyzer ready")
# R: Build pre-match opposition report
library(tidyverse)

# Opposition Analysis Generator
generate_opposition_report <- function(opponent_name, matches_data, players_data) {

  # Filter opponent data
  opponent_matches <- matches_data %>%
    filter(team == opponent_name) %>%
    arrange(desc(date)) %>%
    head(10)  # Last 10 matches

  # Team style metrics
  team_style <- opponent_matches %>%
    summarise(
      avg_possession = mean(possession, na.rm = TRUE),
      avg_xg_for = mean(xg, na.rm = TRUE),
      avg_xg_against = mean(xg_against, na.rm = TRUE),
      avg_ppda = mean(ppda, na.rm = TRUE),
      avg_field_tilt = mean(field_tilt, na.rm = TRUE),
      home_record = sprintf("%d-%d-%d",
        sum(result == "W" & venue == "Home"),
        sum(result == "D" & venue == "Home"),
        sum(result == "L" & venue == "Home")
      ),
      away_record = sprintf("%d-%d-%d",
        sum(result == "W" & venue == "Away"),
        sum(result == "D" & venue == "Away"),
        sum(result == "L" & venue == "Away")
      )
    )

  # Formation analysis
  formations <- opponent_matches %>%
    count(formation) %>%
    arrange(desc(n)) %>%
    mutate(pct = n / sum(n) * 100)

  # Key players
  opponent_players <- players_data %>%
    filter(team == opponent_name) %>%
    arrange(desc(minutes)) %>%
    head(15)

  top_creators <- opponent_players %>%
    arrange(desc(xa)) %>%
    head(3)

  top_scorers <- opponent_players %>%
    arrange(desc(goals)) %>%
    head(3)

  # Compile report
  report <- list(
    opponent = opponent_name,
    generated_date = Sys.Date(),
    team_style = team_style,
    formations = formations,
    key_players = list(
      creators = top_creators,
      scorers = top_scorers
    ),
    recent_results = opponent_matches %>%
      select(date, opponent, result, score, xg, xg_against)
  )

  return(report)
}

# Print formatted report
print_opposition_report <- function(report) {
  cat(sprintf("\n=== OPPOSITION REPORT: %s ===\n", report$opponent))
  cat(sprintf("Generated: %s\n\n", report$generated_date))

  cat("TEAM STYLE:\n")
  cat(sprintf("  Avg Possession: %.1f%%\n", report$team_style$avg_possession))
  cat(sprintf("  Avg xG For: %.2f | Against: %.2f\n",
              report$team_style$avg_xg_for, report$team_style$avg_xg_against))
  cat(sprintf("  PPDA: %.1f | Field Tilt: %.1f%%\n",
              report$team_style$avg_ppda, report$team_style$avg_field_tilt))

  cat("\nFORMATIONS USED:\n")
  # seq_len() avoids the 1:0 pitfall when no formations are available
  for (i in seq_len(min(3, nrow(report$formations)))) {
    f <- report$formations[i,]
    cat(sprintf("  %s: %.0f%% (%d matches)\n", f$formation, f$pct, f$n))
  }

  cat("\nKEY THREATS:\n")
  cat("  Top Scorers:", paste(report$key_players$scorers$player, collapse = ", "), "\n")
  cat("  Top Creators:", paste(report$key_players$creators$player, collapse = ", "), "\n")
}
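
To make the style thresholds in the Python analyzer concrete, here is a small worked example that applies the same PPDA and possession cut-offs to five matches of a hypothetical opponent. The match values are made up for illustration, and the sketch uses plain Python lists instead of pandas so that it runs without extra dependencies.

```python
from statistics import mean

# Made-up values for five recent matches of a hypothetical opponent.
recent_matches = [
    {"possession": 61.0, "ppda": 7.2},
    {"possession": 58.5, "ppda": 8.9},
    {"possession": 63.0, "ppda": 6.8},
    {"possession": 55.5, "ppda": 9.4},
    {"possession": 60.0, "ppda": 7.2},
]


def classify_pressing(avg_ppda: float) -> str:
    """Lower PPDA means fewer opponent passes allowed per defensive action."""
    if avg_ppda < 8:
        return "High Press"
    if avg_ppda < 12:
        return "Medium Press"
    return "Low Block"


def classify_buildup(avg_possession: float) -> str:
    """Bucket average possession share into a build-up label."""
    if avg_possession > 55:
        return "Possession-Based"
    if avg_possession > 48:
        return "Balanced"
    return "Direct/Counter"


avg_ppda = mean(m["ppda"] for m in recent_matches)
avg_possession = mean(m["possession"] for m in recent_matches)

print(f"PPDA {avg_ppda:.1f} -> {classify_pressing(avg_ppda)}")
print(f"Possession {avg_possession:.1f}% -> {classify_buildup(avg_possession)}")
```

With these numbers the averages are 7.9 PPDA and 59.6% possession, so the opponent is labelled a high-pressing, possession-based side. Because the cut-offs are hard thresholds, an average that sits close to one (such as 7.9 against a limit of 8) is worth flagging as borderline in the report.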

Key Matchups Analysis

matchup_analysis.R / matchup_analysis.py
# Python: Identify key matchups
import pandas as pd
from typing import Dict, List, Optional
from dataclasses import dataclass

@dataclass
class Matchup:
    """Individual player matchup."""
    our_player: str
    opponent: str
    our_stats: Dict
    opponent_stats: Dict
    advantage: str
    key_metric: str

class MatchupAnalyzer:
    """Analyze key player matchups."""

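    def analyze_matchups(self, our_players: pd.DataFrame,
                         opponent_players: pd.DataFrame,
                         our_formation: str,
                         opp_formation: str) -> List[Matchup]:
        """Identify key matchups by positional group.

        our_formation and opp_formation are accepted but are not
        currently used by the pairing logic below.
        """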
        matchups = []

        # Striker vs Center-backs
        striker_matchup = self._analyze_striker_matchup(
            our_players, opponent_players
        )
        if striker_matchup:
            matchups.append(striker_matchup)

        # Winger vs Full-back
        winger_matchups = self._analyze_winger_matchups(
            our_players, opponent_players
        )
        matchups.extend(winger_matchups)

        # Midfield battle
        midfield_matchup = self._analyze_midfield(
            our_players, opponent_players
        )
        if midfield_matchup:
            matchups.append(midfield_matchup)

        return matchups

    def _analyze_striker_matchup(self, our_players: pd.DataFrame,
                                  opp_players: pd.DataFrame) -> Optional[Matchup]:
        """Analyze striker vs center-back matchup."""
        our_strikers = our_players[
            our_players["position"].str.contains("FW|CF|ST", na=False)
        ].sort_values("minutes", ascending=False)

        opp_cbs = opp_players[
            opp_players["position"].str.contains("CB|DC", na=False)
        ].sort_values("minutes", ascending=False)

        if our_strikers.empty or opp_cbs.empty:
            return None

        striker = our_strikers.iloc[0]
        cb = opp_cbs.iloc[0]

        # Determine advantage
        aerial_diff = striker.get("aerial_win_pct", 50) - cb.get("aerial_win_pct", 50)
        pace_diff = striker.get("pace", 70) - cb.get("pace", 70)

        if aerial_diff > 10:
            advantage = "Aerial advantage - target with crosses"
            key_metric = "aerial"
        elif pace_diff > 5:
            advantage = "Pace advantage - play balls in behind"
            key_metric = "pace"
        elif pace_diff < -5:
            advantage = "Opponent has pace advantage - hold up play"
            key_metric = "pace"
        else:
            advantage = "Even matchup - vary approach"
            key_metric = "balanced"

        return Matchup(
            our_player=striker["player"],
            opponent=cb["player"],
            our_stats={"aerial": striker.get("aerial_win_pct", 50),
                       "pace": striker.get("pace", 70)},
            opponent_stats={"aerial": cb.get("aerial_win_pct", 50),
                           "pace": cb.get("pace", 70)},
            advantage=advantage,
            key_metric=key_metric
        )

    def _analyze_winger_matchups(self, our_players: pd.DataFrame,
                                  opp_players: pd.DataFrame) -> List[Matchup]:
        """Analyze winger vs full-back matchups."""
        matchups = []
        # Implementation similar to striker matchup
        return matchups

    def _analyze_midfield(self, our_players: pd.DataFrame,
                          opp_players: pd.DataFrame) -> Optional[Matchup]:
        """Analyze midfield battle."""
        # Implementation
        return None

    def format_matchups(self, matchups: List[Matchup]) -> str:
        """Format matchups for display."""
        output = "\n=== KEY MATCHUPS ===\n"
        for m in matchups:
            output += f"""
{m.our_player} vs {m.opponent}
  Our stats: {m.our_stats}
  Their stats: {m.opponent_stats}
  Assessment: {m.advantage}
"""
        return output

print("Matchup analyzer ready")
# R: Identify key matchups
library(tidyverse)

# Find key player matchups
analyze_key_matchups <- function(our_players, opponent_players, formation) {

  matchups <- list()

  # Striker vs Center-backs
  our_strikers <- our_players %>%
    filter(grepl("FW|CF|ST", position)) %>%
    arrange(desc(minutes))

  opp_cbs <- opponent_players %>%
    filter(grepl("CB|DC", position)) %>%
    arrange(desc(minutes))

  if (nrow(our_strikers) > 0 && nrow(opp_cbs) > 0) {
    matchups$striker_vs_cb <- tibble(
      our_player = our_strikers$player[1],
      opponent = opp_cbs$player[1],
      our_aerial_win_pct = our_strikers$aerial_win_pct[1],
      opp_aerial_win_pct = opp_cbs$aerial_win_pct[1],
      our_pace = our_strikers$pace[1],
      opp_pace = opp_cbs$pace[1],
      matchup_advantage = case_when(
        our_strikers$aerial_win_pct[1] > opp_cbs$aerial_win_pct[1] + 10 ~ "Aerial advantage",
        our_strikers$pace[1] > opp_cbs$pace[1] + 5 ~ "Pace advantage",
        TRUE ~ "Neutral"
      )
    )
  }

  # Wingers vs Full-backs
  our_wingers <- our_players %>%
    filter(grepl("LW|RW|LM|RM", position)) %>%
    arrange(desc(dribbles_completed))

  opp_fbs <- opponent_players %>%
    filter(grepl("LB|RB|FB", position)) %>%
    arrange(desc(minutes))

  # Midfield battle
  our_mids <- our_players %>%
    filter(grepl("CM|CDM|CAM", position)) %>%
    arrange(desc(minutes))

  opp_mids <- opponent_players %>%
    filter(grepl("CM|CDM|CAM", position)) %>%
    arrange(desc(minutes))

  return(matchups)
}

# Generate matchup summary
print_matchup_summary <- function(matchups) {
  cat("\n=== KEY MATCHUPS ===\n\n")

  if (!is.null(matchups$striker_vs_cb)) {
    m <- matchups$striker_vs_cb
    cat(sprintf("STRIKER VS CB: %s vs %s\n", m$our_player, m$opponent))
    cat(sprintf("  Aerial: %.0f%% vs %.0f%%\n",
                m$our_aerial_win_pct, m$opp_aerial_win_pct))
    cat(sprintf("  Assessment: %s\n\n", m$matchup_advantage))
  }
}
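
The winger versus full-back analysis is left as a stub in the Python class and is not completed in the R function. The following is a rough sketch of how it could be filled in, under several assumptions that do not come from the chapter: each player record carries a single position code, a left winger mostly faces the opposing right-back (and vice versa), and the hypothetical columns dribble_success_pct and tackle_success_pct are comparable enough to rank a duel. All player data is invented.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumption: a winger mainly faces the full-back on the opposite flank.
OPPOSING_FLANK = {"LW": "RB", "RW": "LB"}


@dataclass
class FlankMatchup:
    winger: str
    full_back: str
    dribble_success_pct: float
    tackle_success_pct: float
    assessment: str


def most_used(players: List[Dict], position: str) -> Optional[Dict]:
    """Return the player with the most minutes at `position`, if any."""
    candidates = [p for p in players if p["position"] == position]
    return max(candidates, key=lambda p: p["minutes"], default=None)


def analyze_flanks(our_players: List[Dict], opp_players: List[Dict],
                   margin: float = 10.0) -> List[FlankMatchup]:
    """Pair each of our wingers with the opposing full-back and rate the duel."""
    matchups = []
    for wing, back in OPPOSING_FLANK.items():
        winger = most_used(our_players, wing)
        full_back = most_used(opp_players, back)
        if winger is None or full_back is None:
            continue  # No data for this flank
        diff = winger["dribble_success_pct"] - full_back["tackle_success_pct"]
        if diff > margin:
            assessment = "Winger favoured - isolate 1v1"
        elif diff < -margin:
            assessment = "Full-back favoured - use overlaps and combinations"
        else:
            assessment = "Even matchup - vary approach"
        matchups.append(FlankMatchup(
            winger=winger["player"],
            full_back=full_back["player"],
            dribble_success_pct=winger["dribble_success_pct"],
            tackle_success_pct=full_back["tackle_success_pct"],
            assessment=assessment,
        ))
    return matchups


# Invented squads for illustration only.
our_players = [
    {"player": "Winger A", "position": "LW", "minutes": 1800, "dribble_success_pct": 62.0},
    {"player": "Winger B", "position": "RW", "minutes": 1500, "dribble_success_pct": 48.0},
]
opp_players = [
    {"player": "Full-back X", "position": "RB", "minutes": 2000, "tackle_success_pct": 45.0},
    {"player": "Full-back Y", "position": "LB", "minutes": 1900, "tackle_success_pct": 55.0},
]

matchups = analyze_flanks(our_players, opp_players)
for m in matchups:
    print(f"{m.winger} vs {m.full_back}: {m.assessment}")
```

The 10-point margin mirrors the aerial threshold used in the striker matchup, but it is an arbitrary choice here and would need calibrating against real data.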

Live Match Analytics

During the match, the analytics team tracks real-time statistics and prepares halftime insights. Speed is crucial: the analysis must be ready within 2-3 minutes of the halftime whistle.

live_match_tracker.R / live_match_tracker.py
# Python: Live match tracking system
from dataclasses import dataclass, field
from typing import Dict, List, Optional
from datetime import datetime
import pandas as pd

@dataclass
class LiveStats:
    """Running match statistics."""
    team: str
    shots: int = 0
    shots_on_target: int = 0
    xg: float = 0.0
    passes: int = 0
    pass_accuracy: float = 0.0
    possession: float = 50.0
    corners: int = 0
    fouls: int = 0
    yellows: int = 0
    reds: int = 0

class LiveMatchTracker:
    """Track and analyze match in real-time."""

    def __init__(self, match_id: str, home_team: str, away_team: str):
        self.match_id = match_id
        self.home_team = home_team
        self.away_team = away_team

        self.events: List[Dict] = []
        self.score = {"home": 0, "away": 0}
        self.current_minute = 0

        self.home_stats = LiveStats(team=home_team)
        self.away_stats = LiveStats(team=away_team)

    def add_event(self, minute: int, event_type: str, team: str,
                  player: Optional[str] = None, **details):
        """Add match event and update stats."""
        event = {
            "minute": minute,
            "event_type": event_type,
            "team": team,
            "player": player,
            "timestamp": datetime.now(),
            **details
        }
        self.events.append(event)
        self.current_minute = minute
        self._update_stats(event)

    def _update_stats(self, event: Dict):
        """Update running statistics based on event."""
        stats = self.home_stats if event["team"] == self.home_team else self.away_stats

        if event["event_type"] == "shot":
            stats.shots += 1
            if event.get("on_target"):
                stats.shots_on_target += 1
            stats.xg += event.get("xg", 0)

        elif event["event_type"] == "goal":
            if event["team"] == self.home_team:
                self.score["home"] += 1
            else:
                self.score["away"] += 1

        elif event["event_type"] == "corner":
            stats.corners += 1

        elif event["event_type"] == "foul":
            stats.fouls += 1

        elif event["event_type"] == "yellow_card":
            stats.yellows += 1

        elif event["event_type"] == "red_card":
            stats.reds += 1

    def generate_halftime_report(self) -> Dict:
        """Generate halftime analysis report."""
        xg_diff = self.home_stats.xg - self.away_stats.xg

        return {
            "score": self.score,
            "xg": {
                "home": round(self.home_stats.xg, 2),
                "away": round(self.away_stats.xg, 2),
                "diff": round(xg_diff, 2)
            },
            "shots": {
                "home": f"{self.home_stats.shots} ({self.home_stats.shots_on_target} on target)",
                "away": f"{self.away_stats.shots} ({self.away_stats.shots_on_target} on target)"
            },
            "key_events": [e for e in self.events
                          if e["event_type"] in ["goal", "yellow_card", "red_card"]],
            "recommendations": self._generate_recommendations()
        }

    def _generate_recommendations(self) -> List[str]:
        """Generate tactical recommendations from the home team's perspective."""
        recs = []
        xg_diff = self.home_stats.xg - self.away_stats.xg
        score_diff = self.score["home"] - self.score["away"]

        if xg_diff < -0.5 and score_diff >= 0:
            recs.append("Overperforming xG - consider defensive consolidation")
        if xg_diff > 0.5 and score_diff <= 0:
            recs.append("Creating chances but not converting - maintain approach")
        if self.away_stats.xg > 1.0 and self.score["away"] == 0:
            recs.append("Opponent creating quality chances - tighten defense")

        return recs

    def format_halftime_display(self) -> str:
        """Format halftime report for display."""
        report = self.generate_halftime_report()
        return f"""
=== HALFTIME REPORT ===
Score: {self.home_team} {report["score"]["home"]} - {report["score"]["away"]} {self.away_team}

xG: {report["xg"]["home"]} - {report["xg"]["away"]}
Shots: {report["shots"]["home"]} - {report["shots"]["away"]}

Recommendations:
{chr(10).join(["  - " + r for r in report["recommendations"]])}
"""

print("Live match tracker ready")
# R: Live match tracking system
library(tidyverse)
library(R6)  # Provides R6Class(), used below

# Live Match Tracker
LiveMatchTracker <- R6Class("LiveMatchTracker",
  public = list(
    match_id = NULL,
    events = NULL,
    current_score = NULL,
    current_minute = 0,

    # Running statistics
    home_stats = NULL,
    away_stats = NULL,

    initialize = function(match_id, home_team, away_team) {
      self$match_id <- match_id
      self$events <- tibble()
      self$current_score <- c(home = 0, away = 0)

      # Initialize stats
      self$home_stats <- list(
        team = home_team,
        shots = 0, shots_on_target = 0,
        xg = 0, passes = 0, pass_accuracy = 0,
        possession = 50, corners = 0,
        fouls = 0, yellows = 0, reds = 0
      )
      self$away_stats <- self$home_stats
      self$away_stats$team <- away_team
    },

    # Add event
    add_event = function(minute, event_type, team, player = NULL,
                         details = list()) {
      new_event <- tibble(
        minute = minute,
        event_type = event_type,
        team = team,
        player = player,
        timestamp = Sys.time()
      )

      # Bind additional details
      for (name in names(details)) {
        new_event[[name]] <- details[[name]]
      }

      self$events <- bind_rows(self$events, new_event)
      self$current_minute <- minute

      # Update running stats
      self$update_stats(new_event)
    },

    update_stats = function(event) {
      stats <- if (event$team == self$home_stats$team) self$home_stats else self$away_stats

      switch(event$event_type,
        "shot" = {
          stats$shots <- stats$shots + 1
          # [[ ]] yields NULL for a detail column that was not supplied
          if (isTRUE(event[["on_target"]])) {
            stats$shots_on_target <- stats$shots_on_target + 1
          }
          if (!is.null(event[["xg"]])) {
            stats$xg <- stats$xg + event[["xg"]]
          }
        },
        "goal" = {
          if (event$team == self$home_stats$team) {
            self$current_score["home"] <- self$current_score["home"] + 1
          } else {
            self$current_score["away"] <- self$current_score["away"] + 1
          }
        },
        "corner" = { stats$corners <- stats$corners + 1 },
        "foul" = { stats$fouls <- stats$fouls + 1 },
        "yellow_card" = { stats$yellows <- stats$yellows + 1 },
        "red_card" = { stats$reds <- stats$reds + 1 }
      )

      if (event$team == self$home_stats$team) {
        self$home_stats <- stats
      } else {
        self$away_stats <- stats
      }
    },

    # Generate halftime report
    generate_halftime_report = function() {
      list(
        score = self$current_score,
        home = self$home_stats,
        away = self$away_stats,
        xg_diff = self$home_stats$xg - self$away_stats$xg,
        key_events = self$events %>%
          filter(event_type %in% c("goal", "yellow_card", "red_card", "penalty")),
        recommendations = self$generate_recommendations()
      )
    },

    generate_recommendations = function() {
      recs <- character()

      # Based on xG difference
      xg_diff <- self$home_stats$xg - self$away_stats$xg
      if (xg_diff < -0.5 && self$current_score["home"] >= self$current_score["away"]) {
        recs <- c(recs, "Overperforming xG - consider more defensive approach")
      }
      if (xg_diff > 0.5 && self$current_score["home"] <= self$current_score["away"]) {
        recs <- c(recs, "Underperforming xG - continue attacking approach")
      }

      return(recs)
    }
  )
)

cat("Live match tracker ready\n")
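
The trackers above keep a single running xG total per team. For halftime and post-match work it is often useful to see how that total built up over time and how it splits between halves. Below is a minimal sketch using plain Python and invented shot data. The function names xg_timeline and xg_by_half are illustrative, and splitting on minute 45 is a simplification that would misplace first-half stoppage-time shots recorded with a minute above 45.

```python
from collections import defaultdict
from typing import Dict, List

# Invented shot events for illustration only.
shots = [
    {"minute": 12, "team": "Home", "xg": 0.08},
    {"minute": 27, "team": "Away", "xg": 0.31},
    {"minute": 41, "team": "Home", "xg": 0.45},
    {"minute": 58, "team": "Away", "xg": 0.12},
    {"minute": 77, "team": "Home", "xg": 0.22},
]


def xg_timeline(events: List[Dict]) -> List[Dict]:
    """Running xG total for the shooting team after each shot, in minute order."""
    running = defaultdict(float)
    timeline = []
    for e in sorted(events, key=lambda e: e["minute"]):
        running[e["team"]] += e["xg"]
        timeline.append({
            "minute": e["minute"],
            "team": e["team"],
            "cumulative_xg": round(running[e["team"]], 2),
        })
    return timeline


def xg_by_half(events: List[Dict], halftime_minute: int = 45) -> Dict[str, Dict[str, float]]:
    """Total xG per team, split into first and second half."""
    totals = {"first_half": defaultdict(float), "second_half": defaultdict(float)}
    for e in events:
        half = "first_half" if e["minute"] <= halftime_minute else "second_half"
        totals[half][e["team"]] += e["xg"]
    return {half: {team: round(total, 2) for team, total in teams.items()}
            for half, teams in totals.items()}


timeline = xg_timeline(shots)
halves = xg_by_half(shots)

for point in timeline:
    print(f"{point['minute']:>2}' {point['team']}: {point['cumulative_xg']:.2f}")
print(halves)
```

Each timeline entry has the same shape as a step in an xG step chart, so the list could be stored in a field such as the xg_timeline attribute of PostMatchData defined earlier.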

Post-Match Analysis

Post-match analysis typically begins immediately after the final whistle and continues for 2-3 days. The goal is to produce actionable insights for the next match cycle.

post_match_analysis.R / post_match_analysis.py
# Python: Comprehensive post-match report generator
import pandas as pd
import numpy as np
from typing import Dict, List
from dataclasses import dataclass

@dataclass
class PostMatchReport:
    """Complete post-match analysis report."""
    match_id: str
    result: str
    score: str
    xg_analysis: Dict
    player_ratings: pd.DataFrame
    key_moments: pd.DataFrame
    improvement_areas: List[str]
    tactical_notes: List[str]

class PostMatchAnalyzer:
    """Generate comprehensive post-match analysis."""

    def generate_report(self, match_events: pd.DataFrame,
                        player_stats: pd.DataFrame,
                        our_team: str,
                        opponent: str) -> PostMatchReport:
        """Generate complete post-match report."""

        # Calculate match result
        our_goals = len(match_events[
            (match_events["event_type"] == "goal") &
            (match_events["team"] == our_team)
        ])
        opp_goals = len(match_events[
            (match_events["event_type"] == "goal") &
            (match_events["team"] == opponent)
        ])

        if our_goals > opp_goals:
            result = "Win"
        elif our_goals < opp_goals:
            result = "Loss"
        else:
            result = "Draw"

        # xG analysis
        our_xg = match_events[match_events["team"] == our_team]["xg"].sum()
        opp_xg = match_events[match_events["team"] == opponent]["xg"].sum()

        xg_analysis = {
            "our_xg": round(our_xg, 2),
            "opponent_xg": round(opp_xg, 2),
            "xg_difference": round(our_xg - opp_xg, 2),
            "goal_difference": our_goals - opp_goals,
            "performance": self._assess_performance(
                our_goals - opp_goals, our_xg - opp_xg
            )
        }

        # Player ratings
        player_ratings = self._calculate_player_ratings(
            player_stats[player_stats["team"] == our_team]
        )

        # Key moments
        key_moments = match_events[
            match_events["event_type"].isin(["goal", "red_card", "penalty"])
        ][["minute", "event_type", "team", "player"]]

        # Improvement areas
        improvement_areas = self._identify_improvements(match_events, our_team)

        return PostMatchReport(
            match_id=match_events["match_id"].iloc[0] if not match_events.empty else "N/A",
            result=result,
            score=f"{our_goals} - {opp_goals}",
            xg_analysis=xg_analysis,
            player_ratings=player_ratings,
            key_moments=key_moments,
            improvement_areas=improvement_areas,
            tactical_notes=[]
        )

    def _assess_performance(self, goal_diff: int, xg_diff: float) -> str:
        """Assess performance vs expectation."""
        actual_vs_expected = goal_diff - xg_diff
        if actual_vs_expected > 0.5:
            return "Overperformed expectations"
        elif actual_vs_expected < -0.5:
            return "Underperformed expectations"
        return "Performed as expected"

    def _calculate_player_ratings(self, stats: pd.DataFrame) -> pd.DataFrame:
        """Calculate player match ratings."""
        df = stats.copy()

        # Base rating + contributions
        df["rating"] = 6.0  # Base
        df["rating"] += df.get("goals", 0) * 0.5
        df["rating"] += df.get("assists", 0) * 0.3
        df["rating"] += df.get("key_passes", 0) * 0.1
        df["rating"] += df.get("tackles_won", 0) * 0.05
        df["rating"] -= df.get("errors", 0) * 0.3

        # Clamp ratings 4-10
        df["rating"] = df["rating"].clip(4, 10)

        return df.sort_values("rating", ascending=False)[
            ["player", "position", "minutes", "rating"]
        ]

    def _identify_improvements(self, events: pd.DataFrame,
                               our_team: str) -> List[str]:
        """Identify areas for improvement."""
        improvements = []

        # Check for goals conceded from set pieces
        goals_conceded = events[
            (events["event_type"] == "goal") &
            (events["team"] != our_team)
        ]

        if "situation" in goals_conceded.columns:
            set_piece_goals = goals_conceded[
                goals_conceded["situation"].str.contains(
                    "corner|free kick", case=False, na=False
                )
            ]
            if len(set_piece_goals) > 0:
                improvements.append(
                    f"Set piece defending: {len(set_piece_goals)} goal(s) conceded"
                )

        return improvements

    def format_report(self, report: PostMatchReport) -> str:
        """Format report for display."""
        return f"""
================================================================================
                      POST-MATCH ANALYSIS REPORT
================================================================================
Result: {report.result} ({report.score})

xG ANALYSIS:
  Our xG: {report.xg_analysis["our_xg"]}
  Opponent xG: {report.xg_analysis["opponent_xg"]}
  xG Difference: {report.xg_analysis["xg_difference"]:+.2f}
  Assessment: {report.xg_analysis["performance"]}

PLAYER RATINGS:
{report.player_ratings.head(5).to_string(index=False)}

AREAS FOR IMPROVEMENT:
{chr(10).join(["  - " + a for a in report.improvement_areas]) or "  None identified"}
================================================================================
"""

print("Post-match analyzer ready")
# R: Comprehensive post-match report generator
library(tidyverse)  # attaches dplyr and ggplot2, so no separate library(ggplot2) call is needed

# Post-match Report Generator
generate_post_match_report <- function(match_events, player_stats,
                                        our_team, opponent) {

  report <- list()

  # Match summary
  our_goals <- sum(match_events$event_type == "goal" &
                   match_events$team == our_team)
  opp_goals <- sum(match_events$event_type == "goal" &
                   match_events$team == opponent)

  report$summary <- list(
    result = case_when(
      our_goals > opp_goals ~ "Win",
      our_goals < opp_goals ~ "Loss",
      TRUE ~ "Draw"
    ),
    score = sprintf("%d - %d", our_goals, opp_goals),
    our_xg = sum(match_events$xg[match_events$team == our_team], na.rm = TRUE),
    opp_xg = sum(match_events$xg[match_events$team == opponent], na.rm = TRUE)
  )

  # Performance vs expectation
  report$xg_analysis <- list(
    xg_difference = report$summary$our_xg - report$summary$opp_xg,
    goal_difference = our_goals - opp_goals,
    performance = case_when(
      (our_goals - opp_goals) > (report$summary$our_xg - report$summary$opp_xg) + 0.5 ~
        "Overperformed xG",
      (our_goals - opp_goals) < (report$summary$our_xg - report$summary$opp_xg) - 0.5 ~
        "Underperformed xG",
      TRUE ~ "Performed as expected"
    )
  )

  # Player ratings
  report$player_ratings <- player_stats %>%
    filter(team == our_team) %>%
    mutate(
      rating = 6.0 +  # Base rating
        (goals * 0.5) +
        (assists * 0.3) +
        (key_passes * 0.1) +
        (successful_dribbles * 0.05) +
        (tackles_won * 0.05) -
        (errors_leading_to_shot * 0.3)
    ) %>%
    mutate(rating = pmin(pmax(rating, 4), 10)) %>%  # Clamp 4-10
    arrange(desc(rating)) %>%
    select(player, position, minutes, rating, goals, assists)

  # Key moments
  report$key_moments <- match_events %>%
    filter(event_type %in% c("goal", "red_card", "penalty_miss", "var_decision")) %>%
    select(minute, event_type, team, player, details)

  # Areas for improvement
  report$improvement_areas <- identify_improvement_areas(match_events, player_stats,
                                                         our_team)

  return(report)
}

# Identify tactical improvements
# our_team is passed in explicitly because it is not in scope inside this function
identify_improvement_areas <- function(events, player_stats, our_team) {
  areas <- character()

  # Check set piece defending (case-insensitive match on the situation label)
  set_piece_goals_conceded <- events %>%
    filter(event_type == "goal", team != our_team,
           grepl("corner|free kick|throw", situation, ignore.case = TRUE)) %>%
    nrow()

  if (set_piece_goals_conceded > 0) {
    areas <- c(areas, sprintf("Set piece defending: %d goals conceded",
                              set_piece_goals_conceded))
  }

  # Check transition defending
  counter_goals <- events %>%
    filter(event_type == "goal", team != our_team,
           grepl("counter|fast break", situation, ignore.case = TRUE)) %>%
    nrow()

  if (counter_goals > 0) {
    areas <- c(areas, "Counter-attack vulnerability")
  }

  return(areas)
}

cat("Post-match report generator ready\n")
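
To make the thresholds in the listings above concrete, here is a small worked example in plain Python. It mirrors the 0.5 tolerance used by `_assess_performance` and the rating weights used by the Python `_calculate_player_ratings`. The standalone function names, the `tolerance` parameter, and the sample scorelines are illustrative assumptions, not part of the chapter's classes.

```python
# Weights copied from the Python listing; the dict itself is an illustrative helper.
RATING_WEIGHTS = {
    "goals": 0.5,
    "assists": 0.3,
    "key_passes": 0.1,
    "tackles_won": 0.05,
    "errors": -0.3,
}


def assess_performance(goal_diff: int, xg_diff: float, tolerance: float = 0.5) -> str:
    """Compare the actual goal difference with the xG difference."""
    delta = goal_diff - xg_diff
    if delta > tolerance:
        return "Overperformed expectations"
    if delta < -tolerance:
        return "Underperformed expectations"
    return "Performed as expected"


def match_rating(**stats: float) -> float:
    """Base rating of 6.0 plus weighted contributions, clamped to 4-10."""
    raw = 6.0 + sum(RATING_WEIGHTS[name] * value for name, value in stats.items())
    return max(4.0, min(10.0, raw))


# 2-1 win while losing the xG battle 1.1 to 1.8:
# goal difference +1, xG difference -0.7, gap of about 1.7 > 0.5
print(assess_performance(2 - 1, 1.1 - 1.8))   # Overperformed expectations

# 0-0 draw with a slight xG edge (1.4 to 1.2): gap of about -0.2, inside the tolerance
print(assess_performance(0, 1.4 - 1.2))       # Performed as expected

# The clamp keeps extreme stat lines inside the 4-10 scale
print(match_rating(goals=10))                 # 10.0
print(match_rating(errors=10))                # 4.0
print(f"{match_rating(goals=1, assists=1):.1f}")
```

The first case shows why the assessment compares differences rather than raw totals: the win looks comfortable on the scoreboard, but the xG margin points the other way, so the result is flagged as overperformance.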

Practice Exercises

Exercise 1: Build a Complete Match Day System

Implement the complete match day workflow using real StatsBomb open data. Create pre-match opposition reports, simulate live tracking, and generate post-match analysis for a Champions League match.

Exercise 2: Halftime Dashboard

Design a one-page halftime dashboard that displays: current score, xG comparison, shot maps, possession flow, and key events. The dashboard should update automatically and be optimized for tablet viewing on the touchline.
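
One possible starting point for the data layer behind such a dashboard is a function that reduces first-half events to per-team totals. This is a minimal sketch under stated assumptions: events are plain dictionaries with the `team`, `event_type`, and `xg` fields used earlier in the chapter, plus a hypothetical `period` field (1 for the first half) so that first-half stoppage time is not lost by filtering on minute 45. It also assumes a goal is recorded as a single event that counts as a shot.

```python
from collections import defaultdict
from typing import Dict, List


def halftime_summary(events: List[dict]) -> Dict[str, dict]:
    """Aggregate first-half goals, shots, and xG per team."""
    totals = defaultdict(lambda: {"goals": 0, "shots": 0, "xg": 0.0})

    for event in events:
        # "period" is an assumed field; skip anything outside the first half
        if event.get("period") != 1:
            continue

        team_totals = totals[event["team"]]
        if event["event_type"] in ("shot", "goal"):
            team_totals["shots"] += 1
            team_totals["xg"] += event.get("xg") or 0.0  # treat missing xG as 0
        if event["event_type"] == "goal":
            team_totals["goals"] += 1

    # Round xG only at the end so rounding error does not accumulate
    return {
        team: {**values, "xg": round(values["xg"], 2)}
        for team, values in totals.items()
    }


# Hypothetical sample feed
sample_events = [
    {"period": 1, "minute": 12, "team": "Home", "event_type": "shot", "xg": 0.08},
    {"period": 1, "minute": 31, "team": "Home", "event_type": "goal", "xg": 0.35},
    {"period": 1, "minute": 47, "team": "Away", "event_type": "shot", "xg": 0.12},
    {"period": 2, "minute": 52, "team": "Away", "event_type": "goal", "xg": 0.40},
]

for team, values in halftime_summary(sample_events).items():
    print(team, values)
```

Keeping the aggregation separate from the rendering means the same summary can feed a tablet view, a printed sheet, or a message to the bench.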

Exercise 3: Automated Video Clip Tagging

Create a system that takes match event data and generates timestamp markers for video analysis. Automatically tag key moments like goals, big chances, defensive errors, and tactical patterns for video review sessions.
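
As a hedged starting sketch, the core of such a system can be a function that turns each tagged event into a clip window with some lead-in and follow-through. The padding values, the `CLIP_PADDING` table, the `second` field, and the `video_offset_s` parameter (seconds between the start of the recording and kickoff) are all assumptions for illustration. A real setup would also need a separate offset for the second half, because the match clock does not include the halftime break.

```python
from typing import Dict, List, Tuple

# Hypothetical padding per tag: (seconds before the event, seconds after it)
CLIP_PADDING: Dict[str, Tuple[int, int]] = {
    "goal": (15, 10),
    "big_chance": (10, 5),
    "error": (10, 5),
}


def to_timestamp(total_seconds: int) -> str:
    """Format seconds as MM:SS, never going below 00:00."""
    minutes, seconds = divmod(max(total_seconds, 0), 60)
    return f"{minutes:02d}:{seconds:02d}"


def build_clip_markers(events: List[dict], video_offset_s: int = 0) -> List[dict]:
    """Create clip windows for every event whose type has a padding rule."""
    markers = []
    for event in events:
        padding = CLIP_PADDING.get(event["event_type"])
        if padding is None:
            continue  # not a tagged event type

        before, after = padding
        event_s = event["minute"] * 60 + event.get("second", 0) + video_offset_s
        start_s = max(event_s - before, 0)
        markers.append({
            "tag": event["event_type"],
            "player": event.get("player"),
            "start_s": start_s,
            "start": to_timestamp(start_s),
            "end": to_timestamp(event_s + after),
        })

    # Sort on numeric seconds so ordering stays correct past 99 minutes
    return sorted(markers, key=lambda marker: marker["start_s"])


# Hypothetical sample events
sample_events = [
    {"minute": 23, "second": 10, "event_type": "goal", "player": "Player A"},
    {"minute": 0, "second": 5, "event_type": "big_chance", "player": "Player B"},
    {"minute": 50, "second": 0, "event_type": "pass", "player": "Player C"},
]

for marker in build_clip_markers(sample_events):
    print(marker["tag"], marker["start"], marker["end"], marker["player"])
```

Tactical patterns are harder to tag than discrete events, since they span sequences of actions; one option is to detect them upstream and feed them into the same function as synthetic events with their own padding rule.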

Summary

Key Takeaways
  • Structured workflow: Match day analytics follows a clear pre/live/post cycle
  • Pre-match: Opposition reports, key matchups, and tactical preparation
  • Live analysis: Real-time tracking with rapid halftime insights
  • Post-match: Performance assessment, player ratings, and improvement identification
  • Speed matters: Halftime analysis must be ready in 2-3 minutes

Deliverables by Phase
  • Pre-Match: Opposition report, set piece analysis, team selection insights
  • Live: Running stats, halftime analysis, substitution recommendations
  • Post-Match: Performance report, player ratings, video clips, improvement areas