Chapter 60

Capstone - Complete Analytics System


The Pitch as Your Canvas

Data visualization transforms raw numbers into compelling stories. In football analytics, the pitch itself becomes our primary canvas—a 105m × 68m space where every pass, shot, and movement can be mapped, colored, and analyzed.

Great football visualizations do more than display data—they reveal patterns invisible in spreadsheets, communicate complex tactical concepts instantly, and make analytics accessible to coaches, players, and fans alike. From Opta's iconic chalkboards to modern expected goals timelines, visualization has become the language of football analytics.

Why Visualization Matters
  • Pattern Recognition: See pressing triggers, passing lanes, and defensive vulnerabilities
  • Communication: Explain complex analysis to non-technical stakeholders
  • Storytelling: Build narratives around match events and player performances
  • Discovery: Uncover insights that statistics alone might miss

Visualization Libraries We'll Use

Both R and Python have excellent libraries specifically designed for football visualization:

Python Libraries
  • mplsoccer: The gold standard for pitch plots
  • matplotlib: Foundation for all visualizations
  • seaborn: Statistical visualizations
  • plotly: Interactive charts
R Libraries
  • ggsoccer: ggplot2 extension for pitches
  • ggplot2: Grammar of graphics foundation
  • ggrepel: Smart label positioning
  • patchwork: Combining multiple plots
# Install required packages (run in a shell, not inside Python):
#   pip install mplsoccer matplotlib seaborn plotly pandas numpy

# Import libraries
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from mplsoccer import Pitch, VerticalPitch, Sbopen
import seaborn as sns
import pandas as pd
import numpy as np

# For interactive plots
import plotly.express as px
import plotly.graph_objects as go

# Check mplsoccer version
import mplsoccer
print(f"mplsoccer version: {mplsoccer.__version__}")

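# Install required packages
install.packages(c("ggplot2", "ggsoccer", "ggrepel", "patchwork", "dplyr"))
# StatsBombR is not on CRAN; it is installed from GitHub:
# devtools::install_github("statsbomb/StatsBombR")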

# Load libraries
library(ggplot2)
library(ggsoccer)
library(ggrepel)
library(patchwork)
library(dplyr)

# Also useful for StatsBomb data
library(StatsBombR)

# Check ggsoccer version
packageVersion("ggsoccer")
Installing visualization libraries

Drawing the Pitch

Before plotting any data, we need to understand how to draw a football pitch. Both libraries handle the complexity of penalty areas, center circles, and goal areas automatically.

Basic Pitch Creation

# Basic pitch with mplsoccer
pitch = Pitch(pitch_color="grass", line_color="white")
fig, ax = pitch.draw(figsize=(12, 8))
ax.set_title("Standard Football Pitch", fontsize=16)
plt.show()

# Vertical orientation (common for shot maps)
pitch = VerticalPitch(pitch_color="#1a472a", line_color="white", half=False)
fig, ax = pitch.draw(figsize=(8, 12))
ax.set_title("Vertical Pitch View", fontsize=16)
plt.show()

# Half pitch (useful for attacking analysis)
pitch = VerticalPitch(pitch_color="grass", line_color="white", half=True)
fig, ax = pitch.draw(figsize=(8, 8))
ax.set_title("Attacking Half Only", fontsize=16)
plt.show()

# Basic pitch with ggsoccer
ggplot() +
  annotate_pitch(colour = "white",
                 fill = "springgreen4") +
  theme_pitch() +
  coord_flip() +
  ggtitle("Standard Football Pitch")

# Horizontal orientation
ggplot() +
  annotate_pitch(colour = "white",
                 fill = "#1a472a") +
  theme_pitch() +
  ggtitle("Horizontal Pitch View")

# Half pitch (useful for attacking analysis)
ggplot() +
  annotate_pitch(colour = "white",
                 fill = "springgreen4") +
  theme_pitch() +
  coord_flip(xlim = c(50, 100)) +
  ggtitle("Attacking Half Only")
Creating basic pitch visualizations

Coordinate Systems

Different data providers use different coordinate systems. Understanding these is crucial for accurate plotting:

Provider    X Range   Y Range   Origin        Notes
StatsBomb   0-120     0-80      Top-left      Y inverted (0 at top)
Opta        0-100     0-100     Bottom-left   Percentage-based
Wyscout     0-100     0-100     Top-left      Y inverted from Opta
UEFA        0-105     0-68      Bottom-left   Real meters
# Converting coordinates with mplsoccer
from mplsoccer import Standardizer

# Create standardizer for Opta to StatsBomb
opta_to_sb = Standardizer(pitch_from="opta", pitch_to="statsbomb")

# Convert a single point (transform takes array-likes and returns arrays)
opta_x, opta_y = 88, 50
sb_x, sb_y = opta_to_sb.transform([opta_x], [opta_y])
print(f"Opta ({opta_x}, {opta_y}) -> StatsBomb ({sb_x[0]:.1f}, {sb_y[0]:.1f})")

# Convert multiple points at once
opta_coords = pd.DataFrame({
    "x": [88, 75, 92],
    "y": [50, 30, 70]
})
sb_xs, sb_ys = opta_to_sb.transform(opta_coords["x"], opta_coords["y"])
print(f"Converted x: {sb_xs}, y: {sb_ys}")

# Converting coordinates example
# Opta (0-100, origin bottom-left) to StatsBomb (0-120, 0-80, y = 0 at top)
convert_opta_to_statsbomb <- function(x, y) {
  new_x <- x * 1.2            # 100 -> 120
  new_y <- (100 - y) * 0.8    # flip the y-axis, then 100 -> 80
  return(data.frame(x = new_x, y = new_y))
}

# StatsBomb to Opta
convert_statsbomb_to_opta <- function(x, y) {
  new_x <- x / 1.2            # 120 -> 100
  new_y <- 100 - y / 0.8      # 80 -> 100, then flip the y-axis
  return(data.frame(x = new_x, y = new_y))
}

# Example usage
opta_shot <- data.frame(x = 88, y = 50)
sb_coords <- convert_opta_to_statsbomb(opta_shot$x, opta_shot$y)
print(sb_coords)  # x: 105.6, y: 40
Converting between coordinate systems
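
As a dependency-free illustration of the table above, the sketch below rescales a single Opta point to StatsBomb units and flips the y-axis to account for the different origins. The function name `opta_to_statsbomb` is hypothetical, and the conversion is a plain linear rescale; a library conversion that aligns pitch markings may return slightly different values.

```python
def opta_to_statsbomb(x, y):
    """Linearly rescale an Opta point to StatsBomb units.

    Assumes Opta uses 0-100 on both axes with the origin at the bottom-left,
    and StatsBomb uses 120 x 80 with y = 0 at the top of the pitch.
    """
    sb_x = x * 120 / 100
    # Opta y grows upward, StatsBomb y grows downward, so flip before scaling
    sb_y = (100 - y) * 80 / 100
    return sb_x, sb_y


if __name__ == "__main__":
    print(opta_to_statsbomb(88, 50))
    print(opta_to_statsbomb(0, 0))
```

A point on the centre line of the pitch width (y = 50) keeps its relative position, while a point on Opta's bottom touchline (y = 0) lands on StatsBomb's y = 80.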

Pitch Customization

# Custom pitch styling with mplsoccer
# Dark theme pitch
pitch = Pitch(pitch_color="#1a1a2e", line_color="#cccccc",
              linewidth=1, goal_type="box")
fig, ax = pitch.draw(figsize=(12, 8))
fig.patch.set_facecolor("#1a1a2e")
ax.set_title("Dark Theme Pitch", color="white", fontsize=16)
plt.show()

# Team colors (Manchester City)
pitch = Pitch(pitch_color="#6CABDD", line_color="white",
              stripe=True, stripe_color="#5BA8D8")
fig, ax = pitch.draw(figsize=(12, 8))
ax.set_title("Manchester City Theme", fontsize=16)
plt.show()

# Different pitch types
pitch_types = ["statsbomb", "opta", "wyscout", "uefa"]
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
for ax, ptype in zip(axes.flatten(), pitch_types):
    pitch = Pitch(pitch_type=ptype, pitch_color="grass", line_color="white")
    pitch.draw(ax=ax)
    ax.set_title(f"{ptype.upper()} Coordinates", fontsize=12)
plt.tight_layout()
plt.show()

# Custom pitch styling
# Dark theme pitch
dark_pitch <- ggplot() +
  annotate_pitch(colour = "#cccccc",
                 fill = "#1a1a2e") +
  theme_pitch() +
  theme(panel.background = element_rect(fill = "#1a1a2e"),
        plot.background = element_rect(fill = "#1a1a2e"),
        plot.title = element_text(color = "white")) +
  ggtitle("Dark Theme Pitch")

# Team colors pitch (Manchester City)
city_pitch <- ggplot() +
  annotate_pitch(colour = "white",
                 fill = "#6CABDD") +
  theme_pitch() +
  ggtitle("Manchester City Theme")

# Draw the pitch in another provider's coordinate system
# (here Wyscout, via the pitch_wyscout dimensions shipped with ggsoccer)
wyscout_pitch <- ggplot() +
  annotate_pitch(colour = "white",
                 fill = "#228B22",
                 dimensions = pitch_wyscout) +
  theme_pitch() +
  ggtitle("Wyscout Dimensions")
Customizing pitch appearance

Creating Shot Maps

Shot maps are perhaps the most iconic football visualization. They show where shots were taken, their outcomes, and increasingly, their expected goals (xG) values.

Basic Shot Map

# Load StatsBomb data
from mplsoccer import Sbopen

# Initialize parser
parser = Sbopen()

# Get free competition data
competitions = parser.competition()
matches = parser.match(competition_id=43, season_id=3)  # World Cup 2018

# Get events from a match
events, related, freeze, tactics = parser.event(matches.iloc[0]["match_id"])

# Filter for shots
shots = events[events["type_name"] == "Shot"].copy()

# Basic shot map
pitch = VerticalPitch(pitch_color="grass", line_color="white", half=True)
fig, ax = pitch.draw(figsize=(10, 10))

# Color by outcome
colors = {"Goal": "yellow", "Saved": "white", "Off T": "red",
          "Blocked": "orange", "Post": "purple"}

for outcome, group in shots.groupby("outcome_name"):
    color = colors.get(outcome, "gray")
    # Use pitch.scatter (not ax.scatter) so x/y are rotated for the vertical pitch
    pitch.scatter(group["x"], group["y"], ax=ax, c=color, s=100,
                  label=outcome, alpha=0.8, edgecolors="black")

ax.legend(loc="upper left", fontsize=10)
ax.set_title("Shot Map", fontsize=16)
plt.show()

# Load StatsBomb data
library(StatsBombR)

# Get shots from a match
matches <- FreeMatches(Competitions = FreeCompetitions())
# allclean() adds the flattened location.x / location.y columns used below
events <- get.matchFree(matches[1, ]) %>%
  allclean()

# Filter for shots
shots <- events %>%
  filter(type.name == "Shot") %>%
  select(location.x, location.y, shot.outcome.name,
         shot.statsbomb_xg, player.name)

# Basic shot map
ggplot(shots) +
  annotate_pitch(colour = "white", fill = "springgreen4",
                 dimensions = pitch_statsbomb) +  # data is in StatsBomb units
  geom_point(aes(x = location.x, y = location.y,
                 color = shot.outcome.name),
             size = 4, alpha = 0.8) +
  scale_color_manual(values = c("Goal" = "yellow",
                                "Saved" = "white",
                                "Off T" = "red",
                                "Blocked" = "orange",
                                "Post" = "purple")) +
  theme_pitch() +
  coord_flip(xlim = c(60, 120)) +
  labs(title = "Shot Map",
       color = "Outcome") +
  theme(legend.position = "bottom")
Creating a basic shot map

xG Shot Map with Size Encoding

Professional shot maps encode xG values through point size—larger circles represent higher quality chances:

# xG shot map with sized points
shots_with_xg = shots[shots["shot_statsbomb_xg"].notna()].copy()

pitch = VerticalPitch(pitch_color="#1a472a", line_color="white", half=True)
fig, ax = pitch.draw(figsize=(10, 10))

# Split goals from all other shots
goals_df = shots_with_xg[shots_with_xg["outcome_name"] == "Goal"]
non_goals = shots_with_xg[shots_with_xg["outcome_name"] != "Goal"]

# Plot non-goals (pitch.scatter rotates x/y for the vertical pitch)
pitch.scatter(non_goals["x"], non_goals["y"], ax=ax,
              s=non_goals["shot_statsbomb_xg"] * 500,
              c="#CCCCCC", alpha=0.6, edgecolors="black", label="No Goal")

# Plot goals
pitch.scatter(goals_df["x"], goals_df["y"], ax=ax,
              s=goals_df["shot_statsbomb_xg"] * 500,
              c="#FFD700", alpha=0.9, edgecolors="black", label="Goal")

# Add xG summary. ax.text works in the rotated axes: the horizontal value is
# the pitch width (centre = 40), the vertical value is the pitch length.
total_xg = shots_with_xg["shot_statsbomb_xg"].sum()
total_goals = len(goals_df)
ax.text(40, 62, f"xG: {total_xg:.2f} | Goals: {total_goals}",
        fontsize=14, ha="center", color="white",
        bbox=dict(boxstyle="round", facecolor="#333333"))

ax.legend(loc="upper left")
ax.set_title("xG Shot Map\nPoint size = xG value", fontsize=16)
plt.show()

# xG shot map with sized points
shots_with_xg <- shots %>%
  filter(!is.na(shot.statsbomb_xg))

ggplot(shots_with_xg) +
  annotate_pitch(colour = "white", fill = "#1a472a",
                 dimensions = pitch_statsbomb) +  # data is in StatsBomb units
  geom_point(aes(x = location.x, y = location.y,
                 size = shot.statsbomb_xg,
                 color = shot.outcome.name == "Goal"),
             alpha = 0.7) +
  scale_size_continuous(range = c(2, 12),
                        name = "xG Value") +
  scale_color_manual(values = c("FALSE" = "#CCCCCC",
                                "TRUE" = "#FFD700"),
                     labels = c("No Goal", "Goal"),
                     name = "Result") +
  theme_pitch() +
  coord_flip(xlim = c(60, 120)) +
  labs(title = "xG Shot Map",
       subtitle = "Point size represents expected goals value") +
  theme(legend.position = "right",
        plot.title = element_text(size = 16, face = "bold"),
        plot.subtitle = element_text(size = 12))

# Add total xG annotation
total_xg <- sum(shots_with_xg$shot.statsbomb_xg, na.rm = TRUE)
goals <- sum(shots_with_xg$shot.outcome.name == "Goal")
Creating an xG shot map with size encoding
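
The size encoding used above can be isolated into a small helper. This is a minimal sketch, assuming a linear mapping from xG to marker area; the function name `xg_to_marker_area`, the default scale of 500 (the factor used in the Python example), and the minimum area of 20 are illustrative choices rather than fixed conventions.

```python
def xg_to_marker_area(xg, scale=500.0, min_area=20.0):
    """Map an xG value in [0, 1] to a scatter marker area.

    matplotlib's `s` argument is an area in points squared, so scaling xG
    linearly makes the visible area proportional to chance quality.
    `min_area` (an assumed floor) keeps very low xG shots visible.
    """
    if not 0.0 <= xg <= 1.0:
        raise ValueError(f"xG must be between 0 and 1, got {xg}")
    return max(min_area, xg * scale)


if __name__ == "__main__":
    for value in (0.02, 0.10, 0.50):
        print(value, xg_to_marker_area(value))
```

A floor on the area is a design choice: without it, speculative long-range shots can shrink to near-invisible dots, which hides shot volume.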

Professional Shot Map with Annotations

def create_shot_map(shots_df, team_name, match_info):
    """Create a professional-style shot map."""
    # Calculate statistics
    total_shots = len(shots_df)
    goals = len(shots_df[shots_df["outcome_name"] == "Goal"])
    total_xg = shots_df["shot_statsbomb_xg"].sum()

    # Setup pitch
    pitch = VerticalPitch(pitch_color="#1a1a2e", line_color="#808080",
                          half=True, pad_bottom=0.5)
    fig, ax = pitch.draw(figsize=(10, 10))
    fig.patch.set_facecolor("#1a1a2e")

    # Split data
    goals_df = shots_df[shots_df["outcome_name"] == "Goal"]
    non_goals = shots_df[shots_df["outcome_name"] != "Goal"]

    # Plot non-goals (pitch.scatter rotates x/y for the vertical pitch)
    if len(non_goals) > 0:
        pitch.scatter(non_goals["x"], non_goals["y"], ax=ax,
                      s=non_goals["shot_statsbomb_xg"] * 500,
                      c="#666666", alpha=0.7, edgecolors="#444444", zorder=2)

    # Plot goals with glow effect
    if len(goals_df) > 0:
        # Glow
        pitch.scatter(goals_df["x"], goals_df["y"], ax=ax,
                      s=goals_df["shot_statsbomb_xg"] * 800,
                      c="#FFD700", alpha=0.3, zorder=3)
        # Main point
        pitch.scatter(goals_df["x"], goals_df["y"], ax=ax,
                      s=goals_df["shot_statsbomb_xg"] * 500,
                      c="#FFD700", alpha=0.9, edgecolors="black", zorder=4)

    # Title
    ax.set_title(f"{team_name}\n{match_info}", color="white",
                 fontsize=18, fontweight="bold", pad=20)

    # Stats annotation, placed near the halfway line inside the visible half
    # (horizontal = pitch width, vertical = pitch length in the rotated axes)
    stats_text = f"Shots: {total_shots} | xG: {total_xg:.2f} | Goals: {goals}"
    ax.text(40, 62, stats_text, ha="center", va="center",
            fontsize=12, color="white",
            bbox=dict(boxstyle="round,pad=0.5", facecolor="#333333",
                      edgecolor="#FFD700", alpha=0.9))

    return fig, ax

# Usage: pass only the shots of the team named in the title
england_shots = shots[shots["team_name"] == "England"]
fig, ax = create_shot_map(england_shots, "England", "vs Sweden - Quarter Final")
plt.show()

# Professional-style shot map
create_shot_map <- function(shots_df, team_name, match_info) {

  # Calculate statistics
  total_shots <- nrow(shots_df)
  goals <- sum(shots_df$shot.outcome.name == "Goal")
  total_xg <- sum(shots_df$shot.statsbomb_xg, na.rm = TRUE)

  # Create plot
  p <- ggplot(shots_df) +
    annotate_pitch(colour = "#808080", fill = "#1a1a2e",
                   dimensions = pitch_statsbomb) +  # data is in StatsBomb units

    # Non-goals
    geom_point(data = filter(shots_df, shot.outcome.name != "Goal"),
               aes(x = location.x, y = location.y,
                   size = shot.statsbomb_xg),
               color = "#666666", alpha = 0.7) +

    # Goals with highlight
    geom_point(data = filter(shots_df, shot.outcome.name == "Goal"),
               aes(x = location.x, y = location.y,
                   size = shot.statsbomb_xg),
               color = "#FFD700", alpha = 0.9) +
    geom_point(data = filter(shots_df, shot.outcome.name == "Goal"),
               aes(x = location.x, y = location.y,
                   size = shot.statsbomb_xg * 1.5),
               color = "#FFD700", alpha = 0.3) +  # Glow effect

    scale_size_continuous(range = c(3, 15), guide = "none") +
    theme_pitch() +
    coord_flip(xlim = c(60, 122)) +

    # Title and annotations
    labs(title = team_name,
         subtitle = match_info) +

    # Stats box
    annotate("text", x = 65, y = 5,
             label = paste0("Shots: ", total_shots),
             color = "white", hjust = 0, size = 4) +
    annotate("text", x = 65, y = 75,
             label = paste0("xG: ", round(total_xg, 2)),
             color = "white", hjust = 1, size = 4) +
    annotate("text", x = 62, y = 40,
             label = paste0("Goals: ", goals),
             color = "#FFD700", hjust = 0.5, size = 5,
             fontface = "bold") +

    theme(plot.background = element_rect(fill = "#1a1a2e"),
          plot.title = element_text(color = "white", size = 18,
                                    face = "bold", hjust = 0.5),
          plot.subtitle = element_text(color = "#888888", size = 12,
                                       hjust = 0.5))

  return(p)
}

# Usage
shot_map <- create_shot_map(shots, "England", "vs Sweden - Quarter Final")
Creating a professional shot map with annotations

Pass Maps and Networks

Pass maps visualize the flow of ball movement, revealing team structure, key passing lanes, and player involvement in buildup play.

Individual Pass Map

# Individual player pass map
player_passes = events[
    (events["type_name"] == "Pass") &
    (events["player_name"] == "Kevin De Bruyne")
].copy()

# Remove passes without end location
# (Sbopen stores pass end coordinates in the end_x / end_y columns)
player_passes = player_passes.dropna(subset=["end_x", "end_y"])

# Create pitch
pitch = Pitch(pitch_color="#1a472a", line_color="white")
fig, ax = pitch.draw(figsize=(12, 8))

# Separate complete and incomplete passes
# (StatsBomb leaves the outcome empty for completed passes)
complete = player_passes[player_passes["outcome_name"].isna()]
incomplete = player_passes[player_passes["outcome_name"].notna()]

# Draw complete passes (green)
if len(complete) > 0:
    pitch.arrows(complete["x"], complete["y"],
                 complete["end_x"], complete["end_y"],
                 ax=ax, color="#98FB98", alpha=0.7,
                 width=2, headwidth=5, headlength=5)

# Draw incomplete passes (red)
if len(incomplete) > 0:
    pitch.arrows(incomplete["x"], incomplete["y"],
                 incomplete["end_x"], incomplete["end_y"],
                 ax=ax, color="#FF6B6B", alpha=0.7,
                 width=2, headwidth=5, headlength=5)

ax.set_title(f"Kevin De Bruyne - Pass Map\n{len(player_passes)} passes",
             fontsize=16)
plt.show()

# Individual player pass map
player_passes <- events %>%
  filter(type.name == "Pass",
         player.name == "Kevin De Bruyne") %>%
  filter(!is.na(pass.end_location.x)) %>%
  # StatsBomb leaves pass.outcome.name empty (NA) for completed passes;
  # every non-empty outcome (Incomplete, Out, ...) is a failed pass
  mutate(pass_result = if_else(is.na(pass.outcome.name),
                               "Complete", "Incomplete"))

# Create pass map
ggplot(player_passes) +
  annotate_pitch(colour = "white", fill = "#1a472a",
                 dimensions = pitch_statsbomb) +

  # Draw passes as arrows
  geom_segment(aes(x = location.x, y = location.y,
                   xend = pass.end_location.x,
                   yend = pass.end_location.y,
                   color = pass_result),
               arrow = arrow(length = unit(0.15, "cm")),
               alpha = 0.7, linewidth = 0.8) +

  scale_color_manual(values = c("Complete" = "#98FB98",
                                "Incomplete" = "#FF6B6B")) +

  theme_pitch() +
  coord_flip() +
  labs(title = "Kevin De Bruyne - Pass Map",
       subtitle = paste(nrow(player_passes), "passes attempted"),
       color = "Pass Result") +
  theme(legend.position = "bottom")
Creating an individual player pass map

Pass Network

Pass networks show connections between players, revealing team structure and key relationships:

# Calculate pass network
# Get successful passes between players
team_passes = events[
    (events["type_name"] == "Pass") &
    (events["team_name"] == "England") &
    (events["outcome_name"].isna())  # Complete passes
].copy()

# Count passes between player pairs
pass_pairs = team_passes.groupby(
    ["player_name", "pass_recipient_name"]
).size().reset_index(name="passes")

# Filter for significant connections (>2 passes)
pass_pairs = pass_pairs[pass_pairs["passes"] > 2]

# Get average positions
avg_positions = events[
    (events["team_name"] == "England") &
    (events["x"].notna())
].groupby("player_name").agg({
    "x": "mean",
    "y": "mean",
    "type_name": "count"
}).rename(columns={"type_name": "touches"}).reset_index()

# Create pitch
pitch = Pitch(pitch_color="#1a472a", line_color="white")
fig, ax = pitch.draw(figsize=(12, 8))

# Merge positions with pass pairs
pass_network = pass_pairs.merge(
    avg_positions, left_on="player_name", right_on="player_name"
).merge(
    avg_positions, left_on="pass_recipient_name", right_on="player_name",
    suffixes=("", "_end")
)

# Draw edges (lines between players)
for _, row in pass_network.iterrows():
    ax.plot([row["x"], row["x_end"]],
            [row["y"], row["y_end"]],
            color="white", alpha=0.4,
            linewidth=row["passes"] / 3)

# Draw nodes (player positions)
scatter = ax.scatter(avg_positions["x"], avg_positions["y"],
                     s=avg_positions["touches"] * 10,
                     c="#FFD700", alpha=0.9, edgecolors="black", zorder=5)

# Add player labels
for _, row in avg_positions.iterrows():
    ax.annotate(row["player_name"].split()[-1],  # Last name only
                (row["x"], row["y"] - 4),
                ha="center", fontsize=8, color="white")

ax.set_title("England Pass Network", fontsize=16)
plt.show()

# Calculate pass network
library(igraph)

# Get successful passes between players
pass_pairs <- events %>%
  filter(type.name == "Pass",
         is.na(pass.outcome.name),  # Complete passes
         team.name == "England") %>%
  select(player.name, pass.recipient.name) %>%
  filter(!is.na(pass.recipient.name)) %>%
  group_by(player.name, pass.recipient.name) %>%
  summarise(passes = n(), .groups = "drop")

# Get average positions
avg_positions <- events %>%
  filter(team.name == "England",
         !is.na(location.x)) %>%
  group_by(player.name) %>%
  summarise(
    x = mean(location.x, na.rm = TRUE),
    y = mean(location.y, na.rm = TRUE),
    touches = n()
  )

# Create network plot
ggplot() +
  annotate_pitch(colour = "white", fill = "#1a472a",
                 dimensions = pitch_statsbomb) +  # data is in StatsBomb units

  # Draw edges (passes)
  geom_segment(data = pass_pairs %>%
                 left_join(avg_positions, by = c("player.name" = "player.name")) %>%
                 left_join(avg_positions, by = c("pass.recipient.name" = "player.name"),
                           suffix = c("", "_end")),
               aes(x = x, y = y, xend = x_end, yend = y_end,
                   linewidth = passes),
               alpha = 0.6, color = "white") +

  # Draw nodes (players)
  geom_point(data = avg_positions,
             aes(x = x, y = y, size = touches),
             color = "#FFD700", alpha = 0.9) +

  # Player labels
  geom_text(data = avg_positions,
            aes(x = x, y = y - 3, label = player.name),
            color = "white", size = 2.5) +

  scale_linewidth_continuous(range = c(0.5, 4)) +
  scale_size_continuous(range = c(4, 15)) +
  theme_pitch() +
  coord_flip() +
  labs(title = "England Pass Network") +
  theme(legend.position = "none")
Creating a team pass network visualization
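
The edge weights in a pass network are simply counts of passer-recipient pairs. The sketch below shows that counting step on its own with the standard library, using made-up placeholder players "A", "B", and "C" and a hypothetical helper `count_pass_pairs`; the threshold of three passes mirrors the "more than 2 passes" filter in the example above.

```python
from collections import Counter


def count_pass_pairs(passes, min_passes=3):
    """Count directed passer -> recipient pairs and keep the frequent ones.

    `passes` is an iterable of (passer, recipient) tuples for completed
    passes. Pairs with fewer than `min_passes` passes are dropped so the
    network plot is not cluttered by one-off connections.
    """
    counts = Counter(passes)
    return {pair: n for pair, n in counts.items() if n >= min_passes}


if __name__ == "__main__":
    # Placeholder data for illustration only
    completed = [("A", "B"), ("A", "B"), ("A", "B"),
                 ("B", "A"), ("B", "C"), ("B", "C")]
    print(count_pass_pairs(completed))
```

Note that the pairs are directed: passes from A to B and from B to A are counted separately, which matches grouping by passer and recipient.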

Progressive Pass Map

Progressive passes move the ball significantly toward the opponent's goal. They're key indicators of attacking intent:

# Define progressive pass criteria
passes = events[
    (events["type_name"] == "Pass") &
    (events["outcome_name"].isna())  # Complete passes
].copy()

# Calculate progression toward the goal at x = 120
# (Sbopen stores pass end coordinates in the end_x / end_y columns)
passes["start_dist"] = 120 - passes["x"]
passes["end_dist"] = 120 - passes["end_x"]
passes["progression"] = passes["start_dist"] - passes["end_dist"]

# Filter for progressive passes (10+ yards, ends in final third)
progressive = passes[
    (passes["progression"] >= 10) &
    (passes["end_x"] >= 80)
].copy()

# Create plot
pitch = Pitch(pitch_color="#1a472a", line_color="white")
fig, ax = pitch.draw(figsize=(12, 8))

# Color by progression distance: the values to colormap are passed
# positionally after the coordinates, as with matplotlib's quiver
arrows = pitch.arrows(progressive["x"], progressive["y"],
                      progressive["end_x"], progressive["end_y"],
                      progressive["progression"],
                      ax=ax, cmap="YlOrRd",
                      width=2, headwidth=6, alpha=0.8)

# Add colorbar
cbar = fig.colorbar(arrows, ax=ax, shrink=0.6)
cbar.set_label("Yards Gained", fontsize=10)

ax.set_title(f"Progressive Passes into Final Third\n{len(progressive)} passes",
             fontsize=16)
plt.show()

# Define progressive pass criteria
# A pass is progressive if it moves the ball at least 10 yards
# toward the opponent goal and ends in the final third

progressive_passes <- events %>%
  filter(type.name == "Pass",
         is.na(pass.outcome.name)) %>%  # Complete passes
  mutate(
    # Calculate progression (toward goal at x=120)
    start_dist = 120 - location.x,
    end_dist = 120 - pass.end_location.x,
    progression = start_dist - end_dist,

    # Progressive if moved 10+ yards forward and ends beyond x=80
    is_progressive = progression >= 10 & pass.end_location.x >= 80
  ) %>%
  filter(is_progressive)

# Plot progressive passes
ggplot(progressive_passes) +
  annotate_pitch(colour = "white", fill = "#1a472a",
                 dimensions = pitch_statsbomb) +  # data is in StatsBomb units
  geom_segment(aes(x = location.x, y = location.y,
                   xend = pass.end_location.x,
                   yend = pass.end_location.y,
                   color = progression),
               arrow = arrow(length = unit(0.2, "cm")),
               linewidth = 1, alpha = 0.8) +
  scale_color_gradient(low = "#90EE90", high = "#FF4500",
                       name = "Yards Gained") +
  theme_pitch() +
  coord_flip() +
  labs(title = "Progressive Passes into Final Third",
       subtitle = paste(nrow(progressive_passes), "progressive passes"))
Visualizing progressive passes
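
The progressive-pass rule used in this section (at least 10 yards gained toward the goal at x = 120, ending beyond x = 80) can be written as a single predicate. This is a sketch of that one definition; other providers and analysts use different thresholds, and the function and parameter names here are illustrative.

```python
def is_progressive(start_x, end_x, min_gain=10.0, final_third_x=80.0,
                   goal_x=120.0):
    """Return True if a completed pass counts as progressive.

    Uses StatsBomb-style coordinates where the attacking goal is at
    x = goal_x. A pass qualifies when it reduces the distance to the goal
    line by at least `min_gain` and ends at or beyond `final_third_x`.
    """
    start_dist = goal_x - start_x
    end_dist = goal_x - end_x
    progression = start_dist - end_dist
    return progression >= min_gain and end_x >= final_third_x


if __name__ == "__main__":
    print(is_progressive(65, 85))  # long forward pass into the final third
    print(is_progressive(75, 82))  # ends in the final third, small gain
    print(is_progressive(50, 70))  # large gain, ends short of the final third
```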

Heat Maps and Touch Maps

Heat maps show density of actions across the pitch, revealing where players or teams concentrate their activity.

Basic Heat Map

# Player action heat map using mplsoccer
player_actions = events[
    (events["player_name"] == "Lionel Messi") &
    (events["x"].notna()) &
    (events["y"].notna())
].copy()

# Create pitch with heatmap
pitch = Pitch(pitch_color="#1a472a", line_color="white", line_zorder=2)
fig, ax = pitch.draw(figsize=(12, 8))

# Kernel density estimate heatmap
pitch.kdeplot(player_actions["x"], player_actions["y"], ax=ax,
              cmap="Reds", fill=True, levels=100, thresh=0, alpha=0.7)
ax.set_title("Lionel Messi - Action Heat Map", fontsize=16)
plt.show()

# Alternative: Bin statistics (hexbin or grid)
fig, axes = plt.subplots(1, 2, figsize=(16, 7))

# Hexbin
pitch = Pitch(pitch_color="#1a472a", line_color="white")
pitch.draw(ax=axes[0])
hexbin = pitch.hexbin(player_actions["x"], player_actions["y"],
                      ax=axes[0], cmap="YlOrRd", gridsize=(15, 8))
axes[0].set_title("Hexbin Heatmap", fontsize=14)

# Grid bins
pitch.draw(ax=axes[1])
bin_stats = pitch.bin_statistic(player_actions["x"], player_actions["y"],
                                statistic="count", bins=(12, 8))
pitch.heatmap(bin_stats, ax=axes[1], cmap="YlOrRd", edgecolors="#1a472a")
axes[1].set_title("Grid Heatmap", fontsize=14)

plt.tight_layout()
plt.show()

# Player action heat map
player_actions <- events %>%
  filter(player.name == "Lionel Messi",
         !is.na(location.x),
         !is.na(location.y))

# Using stat_density_2d for heat map
ggplot(player_actions) +
  annotate_pitch(colour = "white", fill = "#1a472a",
                 dimensions = pitch_statsbomb) +  # data is in StatsBomb units
  stat_density_2d(aes(x = location.x, y = location.y,
                      fill = after_stat(level)),
                  geom = "polygon", alpha = 0.7,
                  bins = 10) +
  scale_fill_gradient(low = "transparent", high = "#FF6B6B") +
  theme_pitch() +
  coord_flip() +
  labs(title = "Lionel Messi - Action Heat Map") +
  theme(legend.position = "none")

# Alternative: using geom_bin_2d for discrete cells
ggplot(player_actions) +
  annotate_pitch(colour = "white", fill = "#1a472a",
                 dimensions = pitch_statsbomb) +  # data is in StatsBomb units
  geom_bin_2d(aes(x = location.x, y = location.y),
              binwidth = c(10, 10), alpha = 0.8) +
  scale_fill_gradient(low = "#FFFF00", high = "#FF0000") +
  theme_pitch() +
  coord_flip() +
  labs(title = "Messi Action Density (Grid)") +
  theme(legend.position = "right")
Creating player heat maps
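
Behind a grid heat map is a simple binning step: each action is assigned to a cell and the cells are counted. The following is a minimal pure-Python sketch of that idea on a 120 x 80 pitch with a 12 x 8 grid (the grid used in the example above); `bin_counts` is a hypothetical helper, not a library function, and it assumes all points lie on the pitch.

```python
def bin_counts(points, x_bins=12, y_bins=8, length=120.0, width=80.0):
    """Count (x, y) points per grid cell on a length x width pitch.

    Returns a list of rows (one per y bin), each a list of counts (one per
    x bin). Points exactly on the far boundary are placed in the last cell.
    """
    grid = [[0] * x_bins for _ in range(y_bins)]
    for x, y in points:
        col = min(int(x / length * x_bins), x_bins - 1)
        row = min(int(y / width * y_bins), y_bins - 1)
        grid[row][col] += 1
    return grid


if __name__ == "__main__":
    # Placeholder coordinates for illustration only
    actions = [(5, 5), (6, 7), (119, 79), (120, 80)]
    grid = bin_counts(actions)
    print(grid[0][0], grid[7][11])
```

Coarser grids give smoother, more readable maps for a single match; finer grids need more events per cell before the pattern is trustworthy.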

Touch Map

# Touch map with action types
player_touches = events[
    (events["player_name"] == "Mohamed Salah") &
    (events["x"].notna())
].copy()

# Categorize actions
def categorize_action(row):
    if row["type_name"] == "Shot":
        return "Shots"
    elif row["type_name"] == "Pass":
        return "Passes"
    elif row["type_name"] == "Dribble":
        return "Dribbles"
    elif row["type_name"] == "Ball Receipt*":
        return "Receives"
    return "Other"

player_touches["action_group"] = player_touches.apply(categorize_action, axis=1)
player_touches = player_touches[player_touches["action_group"] != "Other"]

# Create touch map
pitch = Pitch(pitch_color="#1a472a", line_color="white")
fig, ax = pitch.draw(figsize=(12, 8))

# Color and marker mapping
colors = {"Shots": "#FFD700", "Passes": "#87CEEB",
          "Dribbles": "#98FB98", "Receives": "#DDA0DD"}
markers = {"Shots": "*", "Passes": "o", "Dribbles": "^", "Receives": "s"}

for action, group in player_touches.groupby("action_group"):
    ax.scatter(group["x"], group["y"],
               c=colors[action], marker=markers[action],
               s=80, alpha=0.7, label=action, edgecolors="black")

ax.legend(loc="upper left", fontsize=10)
ax.set_title(f"Mohamed Salah - Touch Map\n{len(player_touches)} actions",
             fontsize=16)
plt.show()

# Touch map with action types
player_touches <- events %>%
  filter(player.name == "Mohamed Salah",
         !is.na(location.x)) %>%
  mutate(action_group = case_when(
    type.name == "Shot" ~ "Shots",
    type.name == "Pass" ~ "Passes",
    type.name == "Dribble" ~ "Dribbles",
    type.name == "Ball Receipt*" ~ "Receives",
    TRUE ~ "Other"
  )) %>%
  filter(action_group != "Other")

# Touch map by action type
ggplot(player_touches) +
  annotate_pitch(colour = "white", fill = "#1a472a",
                 dimensions = pitch_statsbomb) +  # data is in StatsBomb units
  geom_point(aes(x = location.x, y = location.y,
                 color = action_group, shape = action_group),
             size = 3, alpha = 0.7) +
  scale_color_manual(values = c("Shots" = "#FFD700",
                                "Passes" = "#87CEEB",
                                "Dribbles" = "#98FB98",
                                "Receives" = "#DDA0DD")) +
  theme_pitch() +
  coord_flip() +
  labs(title = "Mohamed Salah - Touch Map",
       subtitle = paste(nrow(player_touches), "total actions"),
       color = "Action Type", shape = "Action Type") +
  theme(legend.position = "bottom")
Creating a touch map with action type categories

Zone Control Heat Map

# Team zone control / territorial dominance
team_actions = events[
    (events["x"].notna()) &
    (events["type_name"].isin(["Pass", "Carry", "Dribble",
                               "Ball Receipt*", "Shot"]))
].copy()

home = team_actions[team_actions["team_name"] == "England"]
away = team_actions[team_actions["team_name"] != "England"]

# Create bins for zone control
pitch = Pitch(pitch_color="#333333", line_color="white", line_zorder=3)

# Calculate zone statistics for both teams.
# StatsBomb records every team as attacking left to right, so the opponent's
# coordinates are flipped to put both teams in England's frame of reference.
zone_stats_home = pitch.bin_statistic(
    home["x"], home["y"],
    statistic="count", bins=(6, 4)
)
zone_stats_away = pitch.bin_statistic(
    120 - away["x"], 80 - away["y"],
    statistic="count", bins=(6, 4)
)

# Calculate control share per zone (0 where neither team had an action)
total = zone_stats_home["statistic"] + zone_stats_away["statistic"]
control = np.divide(zone_stats_home["statistic"], total,
                    where=total > 0,
                    out=np.zeros_like(total, dtype=float))

# Reuse the home bin object, replacing the counts with the control share
zone_stats_home["statistic"] = control

# Plot
fig, ax = pitch.draw(figsize=(12, 8))
heatmap = pitch.heatmap(zone_stats_home, ax=ax, cmap="RdYlBu_r",
                        edgecolors="#333333", vmin=0, vmax=1)

# Add colorbar (values are shares between 0 and 1)
cbar = fig.colorbar(heatmap, ax=ax, shrink=0.6)
cbar.set_label("England share of actions", fontsize=10)

ax.set_title("England - Zone Control", fontsize=16)
plt.show()

# Team zone control / territorial dominance
team_actions <- events %>%
  filter(!is.na(location.x),
         type.name %in% c("Pass", "Carry", "Dribble",
                          "Ball Receipt*", "Shot"))

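# Calculate actions per zone
# StatsBomb records every team as attacking left to right, so flip the
# opponent's coordinates into England's frame before binning
zones <- team_actions %>%
  mutate(
    zx = if_else(team.name == "England", location.x, 120 - location.x),
    zy = if_else(team.name == "England", location.y, 80 - location.y),
    zone_x = cut(zx, breaks = seq(0, 120, 20), labels = FALSE,
                 include.lowest = TRUE),
    zone_y = cut(zy, breaks = seq(0, 80, 20), labels = FALSE,
                 include.lowest = TRUE)
  ) %>%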
  group_by(team.name, zone_x, zone_y) %>%
  summarise(actions = n(), .groups = "drop") %>%
  group_by(zone_x, zone_y) %>%
  mutate(
    total = sum(actions),
    control = actions / total
  ) %>%
  ungroup()

# Plot for one team
team_control <- zones %>%
  filter(team.name == "England")

ggplot(team_control) +
  annotate_pitch(colour = "white", fill = "#333333",
                 dimensions = pitch_statsbomb) +  # data is in StatsBomb units
  geom_tile(aes(x = (zone_x - 0.5) * 20,
                y = (zone_y - 0.5) * 20,
                fill = control),
            width = 18, height = 18, alpha = 0.8) +
  scale_fill_gradient2(low = "#1E90FF", mid = "#333333",
                       high = "#FF4500",
                       midpoint = 0.5,
                       labels = scales::percent) +
  theme_pitch() +
  coord_flip() +
  labs(title = "England - Zone Control",
       fill = "Possession %")
Creating team zone control heat map
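
Zone control reduces to two small operations per zone: putting both teams' actions in one frame of reference, then taking one team's share of the combined count. The sketch below shows both, assuming StatsBomb-style data in which each team's coordinates are recorded as if it attacks left to right; the helper names are hypothetical.

```python
def flip_to_shared_frame(x, y, length=120.0, width=80.0):
    """Mirror an opponent's point so both teams share one pitch orientation.

    Assumes each team's events are stored as if that team attacks left to
    right, so the opponent's locations must be rotated by 180 degrees.
    """
    return length - x, width - y


def control_share(team_count, opponent_count):
    """Return the team's share of actions in a zone (0.0 for an empty zone)."""
    total = team_count + opponent_count
    return team_count / total if total > 0 else 0.0


if __name__ == "__main__":
    print(flip_to_shared_frame(100, 30))
    print(control_share(3, 1))
    print(control_share(0, 0))
```

Returning 0.0 for an empty zone follows the example above; plotting such zones as missing instead would avoid implying that the opponent controlled them.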

xG Timeline Charts

xG timelines show how a match evolved, revealing momentum shifts, dominant periods, and comparing actual goals to expected outcomes.

# xG Timeline (cumulative)
import matplotlib.pyplot as plt
import pandas as pd

# Get shots with timing
shots = events[events["type_name"] == "Shot"].copy()
shots = shots.sort_values("minute")

# Calculate cumulative xG per team
teams = shots["team_name"].unique()
timeline_data = {}

for team in teams:
    team_shots = shots[shots["team_name"] == team].copy()
    team_shots["cumulative_xg"] = team_shots["shot_statsbomb_xg"].cumsum()
    timeline_data[team] = team_shots

# Create plot
fig, ax = plt.subplots(figsize=(14, 7))

colors = {"England": "#1E90FF", "Sweden": "#FFD700"}

for team, data in timeline_data.items():
    # Add starting point at 0
    minutes = [0] + data["minute"].tolist()
    xg = [0] + data["cumulative_xg"].tolist()

    # Draw step line
    ax.step(minutes, xg, where="post", linewidth=2.5,
            color=colors.get(team, "gray"), label=team)

    # Mark goals
    goals = data[data["outcome_name"] == "Goal"]
    if len(goals) > 0:
        ax.scatter(goals["minute"], goals["cumulative_xg"],
                   s=150, c="white", edgecolors=colors.get(team, "gray"),
                   linewidth=2, zorder=5)

    # End annotation
    final_xg = xg[-1]
    ax.annotate(f"{final_xg:.2f} xG", (minutes[-1] + 2, final_xg),
                fontsize=11, fontweight="bold",
                color=colors.get(team, "gray"))

ax.set_xlim(0, 100)
ax.set_xlabel("Minute", fontsize=12)
ax.set_ylabel("Cumulative xG", fontsize=12)
ax.set_title("Match xG Timeline\nCircles indicate goals scored",
             fontsize=16, fontweight="bold")
ax.legend(loc="upper left", fontsize=11)
ax.grid(True, alpha=0.3)
ax.set_xticks(range(0, 91, 15))

plt.tight_layout()
plt.show()

# xG Timeline (cumulative)
library(ggplot2)
library(dplyr)

# Get shots with timing
shots_timeline <- events %>%
  filter(type.name == "Shot") %>%
  arrange(minute) %>%
  group_by(team.name) %>%
  mutate(cumulative_xg = cumsum(shot.statsbomb_xg)) %>%
  ungroup()

# Add start and end points for complete timeline
timeline_data <- shots_timeline %>%
  bind_rows(
    data.frame(team.name = unique(shots_timeline$team.name),
               minute = 0, cumulative_xg = 0)
  ) %>%
  arrange(team.name, minute)

# Get goals for markers
goals <- shots_timeline %>%
  filter(shot.outcome.name == "Goal")

# Create timeline plot
ggplot(timeline_data, aes(x = minute, y = cumulative_xg,
                          color = team.name)) +
  geom_step(linewidth = 1.5) +
  geom_point(data = goals, aes(x = minute, y = cumulative_xg),
             size = 5, shape = 21, fill = "white", stroke = 2) +

  # Add final xG annotations
  geom_text(data = timeline_data %>%
              group_by(team.name) %>%
              slice_tail(n = 1),
            aes(label = sprintf("%.2f xG", cumulative_xg)),
            hjust = -0.2, fontface = "bold") +

  scale_color_manual(values = c("England" = "#1E90FF",
                                "Sweden" = "#FFD700")) +
  scale_x_continuous(breaks = seq(0, 90, 15),
                     limits = c(0, 100)) +
  labs(title = "Match xG Timeline",
       subtitle = "Circles indicate goals scored",
       x = "Minute", y = "Cumulative xG",
       color = "Team") +
  theme_minimal() +
  theme(
    legend.position = "bottom",
    plot.title = element_text(size = 16, face = "bold"),
    panel.grid.minor = element_blank()
  )

Output: Creating an xG timeline chart

xG Match Summary Plot

# Complete xG match summary with shot strips
fig, axes = plt.subplots(2, 1, figsize=(14, 8), sharex=True)

shots = events[events["type_name"] == "Shot"].copy()
teams = ["England", "Sweden"]
colors = ["#1E90FF", "#FFD700"]

for ax, team, color in zip(axes, teams, colors):
    team_shots = shots[shots["team_name"] == team]

    # Shot lollipops
    ax.vlines(team_shots["minute"], 0, team_shots["shot_statsbomb_xg"],
              colors=color, linewidth=4, alpha=0.8)

    # Goals as circles
    goals = team_shots[team_shots["outcome_name"] == "Goal"]
    if len(goals) > 0:
        ax.scatter(goals["minute"], goals["shot_statsbomb_xg"],
                   s=200, c="white", edgecolors=color,
                   linewidth=3, zorder=5)

    # Styling
    ax.set_ylabel("xG", fontsize=11)
    ax.set_ylim(0, 1)
    ax.set_title(team, fontsize=14, fontweight="bold")
    ax.grid(True, alpha=0.3, axis="x")

    # Add total xG
    total_xg = team_shots["shot_statsbomb_xg"].sum()
    total_goals = len(goals)
    ax.text(92, 0.8, f"xG: {total_xg:.2f}\nGoals: {total_goals}",
            fontsize=11, ha="left")

axes[1].set_xlabel("Minute", fontsize=12)
axes[0].set_xlim(0, 95)

fig.suptitle("xG Match Summary - Shot Quality by Minute",
             fontsize=16, fontweight="bold", y=1.02)
plt.tight_layout()
plt.show()

# Complete xG match summary with shot strips
library(patchwork)

# Shot strip function
create_shot_strip <- function(shots_df, team_name, team_color) {
  team_shots <- shots_df %>% filter(team.name == team_name)

  ggplot(team_shots) +
    geom_segment(aes(x = minute, xend = minute,
                     y = 0, yend = shot.statsbomb_xg),
                 color = team_color, linewidth = 3) +
    geom_point(data = filter(team_shots, shot.outcome.name == "Goal"),
               aes(x = minute, y = shot.statsbomb_xg),
               size = 4, color = "white", shape = 21,
               fill = team_color, stroke = 2) +
    scale_x_continuous(limits = c(0, 95), breaks = seq(0, 90, 15)) +
    scale_y_continuous(limits = c(0, 1)) +
    labs(x = NULL, y = "xG") +
    theme_minimal() +
    theme(panel.grid.minor = element_blank())
}

# Create combined plot
shots_data <- events %>% filter(type.name == "Shot")

p1 <- create_shot_strip(shots_data, "England", "#1E90FF") +
  ggtitle("England") +
  theme(plot.title = element_text(hjust = 0.5))

p2 <- create_shot_strip(shots_data, "Sweden", "#FFD700") +
  ggtitle("Sweden") +
  scale_y_reverse(limits = c(1, 0)) +  # Flip for opposition; this replaces the y scale, so restate the limits
  theme(plot.title = element_text(hjust = 0.5))

# Combine with patchwork
combined <- p1 / p2 +
  plot_annotation(
    title = "xG Match Summary",
    subtitle = "Shot quality by minute (goals circled)",
    theme = theme(plot.title = element_text(size = 18, face = "bold"))
  )

print(combined)

Output: Creating a complete xG match summary

Radar Charts for Player Profiles

Radar charts (also called spider charts) are excellent for comparing players across multiple metrics simultaneously. They're widely used in scouting and player analysis.

# Player comparison radar chart using mplsoccer
from mplsoccer import Radar, FontManager
import matplotlib.pyplot as plt

# Define parameters and values
params = ["Goals", "Assists", "Key Passes", "Dribbles",
          "Tackles", "Interceptions", "Aerial Duels", "Pass %"]

# Percentile ranks for two players
player_a = [92, 65, 70, 88, 45, 40, 60, 55]
player_b = [78, 85, 92, 55, 25, 22, 35, 75]

# Create radar
radar = Radar(params, min_range=[0]*8, max_range=[100]*8,
              round_int=[False]*8,
              num_rings=4, ring_width=1, center_circle_radius=1)

# Plot
fig, ax = radar.setup_axis()
rings_inner = radar.draw_circles(ax=ax, facecolor="#1a1a2e",
                                 edgecolor="#808080")

# Draw radar shapes
radar_poly1, rings1, vertices1 = radar.draw_radar(
    player_a, ax=ax,
    kwargs_radar={"facecolor": "#E63946", "alpha": 0.3},
    kwargs_rings={"facecolor": "#E63946", "alpha": 0.1}
)
radar_poly2, rings2, vertices2 = radar.draw_radar(
    player_b, ax=ax,
    kwargs_radar={"facecolor": "#457B9D", "alpha": 0.3},
    kwargs_rings={"facecolor": "#457B9D", "alpha": 0.1}
)

# Draw parameter labels
radar.draw_param_labels(ax=ax, fontsize=11, color="white")
radar.draw_range_labels(ax=ax, fontsize=8, color="#808080")

# Title and legend
ax.set_title("Player Comparison Radar\nPercentile Ranks (per 90)",
             fontsize=16, fontweight="bold", color="white", pad=20)

# Custom legend
from matplotlib.patches import Patch
legend_elements = [Patch(facecolor="#E63946", alpha=0.5, label="Player A"),
                   Patch(facecolor="#457B9D", alpha=0.5, label="Player B")]
ax.legend(handles=legend_elements, loc="lower right", fontsize=10)

fig.patch.set_facecolor("#1a1a2e")
ax.set_facecolor("#1a1a2e")
plt.show()

# Player comparison radar chart
library(ggplot2)
library(tidyr)

# Sample player statistics (per 90 minutes)
player_stats <- data.frame(
  metric = c("Goals", "Assists", "Key Passes", "Dribbles",
             "Tackles", "Interceptions", "Aerial Duels", "Pass %"),
  Player_A = c(0.65, 0.25, 2.1, 3.2, 1.1, 0.8, 1.5, 85),
  Player_B = c(0.45, 0.55, 3.4, 1.8, 0.6, 0.5, 0.9, 89),
  # Percentile ranks (0-100)
  Player_A_pct = c(92, 65, 70, 88, 45, 40, 60, 55),
  Player_B_pct = c(78, 85, 92, 55, 25, 22, 35, 75)
)

# Prepare data for radar
radar_data <- player_stats %>%
  select(metric, Player_A_pct, Player_B_pct) %>%
  pivot_longer(cols = -metric, names_to = "player", values_to = "value") %>%
  mutate(player = gsub("_pct", "", player))

# Create radar using coord_polar
ggplot(radar_data, aes(x = metric, y = value,
                       group = player, color = player)) +
  geom_polygon(aes(fill = player), alpha = 0.2, linewidth = 1.5) +
  geom_point(size = 3) +
  coord_polar() +
  scale_y_continuous(limits = c(0, 100)) +
  scale_color_manual(values = c("Player_A" = "#E63946",
                                "Player_B" = "#457B9D")) +
  scale_fill_manual(values = c("Player_A" = "#E63946",
                               "Player_B" = "#457B9D")) +
  labs(title = "Player Comparison Radar",
       subtitle = "Percentile ranks (per 90 minutes)") +
  theme_minimal() +
  theme(
    axis.text.x = element_text(size = 10),
    axis.text.y = element_blank(),
    axis.title = element_blank(),
    legend.position = "bottom",
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5)
  )

Output: Creating a player comparison radar chart
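
The radar examples above take percentile ranks as given. As a minimal sketch of where such numbers could come from, the snippet below ranks per-90 values within a small peer group using pandas. The `peers` table, its column names, and every value in it are hypothetical and purely illustrative; a real comparison would use a much larger group of players in the same position.

```python
import pandas as pd

# Hypothetical per-90 values for a small peer group (illustrative numbers only)
peers = pd.DataFrame(
    {
        "player": ["A", "B", "C", "D", "E"],
        "goals_p90": [0.65, 0.45, 0.30, 0.10, 0.50],
        "key_passes_p90": [2.1, 3.4, 1.2, 0.8, 2.5],
    }
).set_index("player")

# rank(pct=True) expresses each player's rank as a fraction of the group size;
# multiplying by 100 puts it on the 0-100 scale used by the radar
percentiles = (peers.rank(pct=True) * 100).round(0)

print(percentiles)
```

With this definition the best player in the group always scores 100 and the worst scores 100 divided by the group size, so the choice of peer group strongly shapes the resulting radar.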

Pizza Chart Alternative

Pizza charts are a modern alternative to radar charts, popularized by FBref and StatsBomb:

# Pizza chart using mplsoccer
from mplsoccer import PyPizza
import matplotlib.pyplot as plt

# Parameters and values
params = ["Goals", "xG", "Shots", "Assists", "xA",
          "Key Passes", "Tackles", "Interceptions", "Blocks"]
values = [85, 78, 72, 45, 52, 68, 35, 42, 38]

# Slice colors by category
slice_colors = ["#E63946"]*3 + ["#457B9D"]*3 + ["#2A9D8F"]*3
text_colors = ["white"]*9

# Create pizza chart
baker = PyPizza(
    params=params,
    background_color="#1a1a2e",
    straight_line_color="#1a1a2e",
    straight_line_lw=1,
    last_circle_color="#1a1a2e",
    last_circle_lw=1,
    other_circle_lw=0,
    inner_circle_size=20
)

fig, ax = baker.make_pizza(
    values,
    figsize=(10, 10),
    color_blank_space="same",
    slice_colors=slice_colors,
    value_colors=text_colors,
    value_bck_colors=slice_colors,
    blank_alpha=0.4,
    kwargs_slices=dict(edgecolor="#1a1a2e", zorder=2, linewidth=1),
    kwargs_params=dict(color="white", fontsize=12),
    kwargs_values=dict(color="white", fontsize=12, fontweight="bold",
                       bbox=dict(edgecolor="white", facecolor="cornflowerblue",
                                 boxstyle="round,pad=0.2", lw=1))
)

# Title
fig.text(0.5, 0.97, "Marcus Rashford", fontsize=18,
         fontweight="bold", ha="center", color="white")
fig.text(0.5, 0.93, "Percentile Ranks vs. Position",
         fontsize=12, ha="center", color="#808080")

# Legend
from matplotlib.patches import Patch
legend_elements = [
    Patch(facecolor="#E63946", label="Attacking"),
    Patch(facecolor="#457B9D", label="Creative"),
    Patch(facecolor="#2A9D8F", label="Defensive")
]
ax.legend(handles=legend_elements, loc="lower center",
          bbox_to_anchor=(0.5, -0.05), ncol=3, fontsize=10)

plt.show()

# Pizza chart in R
# This is a stylized bar chart in polar coordinates

create_pizza_chart <- function(stats_df, player_name) {
  # stats_df should have: metric, value (percentile), category

  ggplot(stats_df, aes(x = reorder(metric, value), y = value,
                       fill = category)) +
    geom_bar(stat = "identity", width = 0.9) +
    coord_polar(theta = "x") +
    ylim(0, 100) +

    # Add value labels
    geom_text(aes(label = value), hjust = -0.3, size = 3) +

    scale_fill_manual(values = c("Attacking" = "#E63946",
                                 "Creative" = "#457B9D",
                                 "Defensive" = "#2A9D8F")) +
    labs(title = player_name,
         subtitle = "Percentile Ranks vs. Position") +
    theme_minimal() +
    theme(
      axis.text.y = element_blank(),
      axis.title = element_blank(),
      panel.grid = element_blank(),
      legend.position = "bottom",
      plot.title = element_text(hjust = 0.5, size = 16, face = "bold")
    )
}

# Example data
player_pizza <- data.frame(
  metric = c("Goals", "xG", "Shots", "Assists", "xA",
             "Key Passes", "Tackles", "Interceptions", "Blocks"),
  value = c(85, 78, 72, 45, 52, 68, 35, 42, 38),
  category = c(rep("Attacking", 3), rep("Creative", 3), rep("Defensive", 3))
)

create_pizza_chart(player_pizza, "Marcus Rashford")

Output: Creating a pizza chart for player profiles

Visualization Best Practices

Do
  • Use consistent color schemes - team colors, semantic colors (goals = gold)
  • Add context - include sample sizes, time periods, competition level
  • Label clearly - titles, axes, legends should be self-explanatory
  • Consider your audience - coaches need different details than fans
  • Use appropriate chart types - shot maps for locations, timelines for match flow
  • Include data sources - always credit StatsBomb, FBref, etc.
Don't
  • Overload with information - one visualization, one message
  • Use misleading scales - bar charts should always start the y-axis at 0
  • Ignore color blindness - use colorblind-safe palettes
  • Compare incomparable data - different leagues, sample sizes
  • Over-design - clarity beats aesthetics
  • Forget mobile users - ensure readability at small sizes

Color Palettes for Football

# Colorblind-safe palettes
import matplotlib.pyplot as plt
import seaborn as sns

# View seaborn palettes
sns.palplot(sns.color_palette("colorblind"))
sns.palplot(sns.color_palette("Set2"))

# Custom football palette
football_colors = {
    "goal": "#FFD700",           # Gold
    "shot_saved": "#FFFFFF",     # White
    "shot_blocked": "#FFA500",   # Orange
    "shot_off": "#FF6B6B",       # Light red
    "pass_complete": "#90EE90",  # Light green
    "pass_incomplete": "#FF6B6B",
    "home_team": "#1E90FF",      # Blue
    "away_team": "#DC143C"       # Red
}

# Colorblind-safe alternatives
cb_safe = {
    "blue": "#0077BB",
    "orange": "#EE7733",
    "green": "#009988",
    "red": "#CC3311",
    "purple": "#AA3377",
    "yellow": "#DDCC77"
}

# Usage
plt.scatter(x, y, c=[football_colors["goal"] if g else football_colors["shot_saved"]
                     for g in is_goal])

# Colorblind-safe palettes
library(RColorBrewer)

# View available palettes
display.brewer.all(colorblindFriendly = TRUE)

# Good palettes for football
# Sequential: Blues, Greens, Oranges, Reds
# Diverging: RdYlBu, RdYlGn (avoid red-green only)
# Qualitative: Set2, Paired

# Custom football palette
football_colors <- c(
  "goal" = "#FFD700",        # Gold
  "shot_saved" = "#FFFFFF",   # White
  "shot_blocked" = "#FFA500", # Orange
  "shot_off" = "#FF6B6B",     # Light red
  "pass_complete" = "#90EE90", # Light green
  "pass_incomplete" = "#FF6B6B",
  "home_team" = "#1E90FF",    # Blue
  "away_team" = "#DC143C"     # Red
)

# Usage in ggplot
scale_color_manual(values = football_colors)

Output: Setting up colorblind-safe palettes
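
Palette choice also affects plain readability, especially on the dark backgrounds used throughout this chapter. The sketch below computes a WCAG-style contrast ratio between a marker color and a background. It is an illustrative helper rather than part of any plotting library, the function names are made up for this example, and a high contrast ratio does not by itself make a palette colorblind-safe.

```python
def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a six-digit sRGB hex color such as '#FFD700'."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma curve before weighting the channels
    linear = [
        c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a: str, color_b: str) -> float:
    """Contrast ratio between two colors, from 1 (identical) to 21 (black/white)."""
    lum_a, lum_b = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


background = "#1a1a2e"  # dark background used in the examples above
for name, color in {"goal": "#FFD700", "away_team": "#DC143C"}.items():
    print(f"{name}: {contrast_ratio(color, background):.1f}:1")
```

The helper expects six-digit hex codes, so shorthand values such as "#333" would need expanding first.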

Saving High-Quality Figures

# Save high-quality figures in Python
import matplotlib.pyplot as plt

# Create your figure
fig, ax = plt.subplots(...)

# Save as PNG (for web) - 300 DPI for print quality
fig.savefig("shot_map.png", dpi=300, bbox_inches="tight",
            facecolor=fig.get_facecolor(), edgecolor="none")

# Save as SVG (for print/editing)
fig.savefig("shot_map.svg", format="svg", bbox_inches="tight")

# Save as PDF (for publications)
fig.savefig("shot_map.pdf", format="pdf", bbox_inches="tight")

# For transparent background
fig.savefig("shot_map_transparent.png", dpi=300,
            bbox_inches="tight", transparent=True)

# Close figure to free memory
plt.close(fig)

# Save high-quality figures in R
library(ggplot2)

# Create your plot
p <- ggplot(...) + ...

# Save as PNG (for web)
ggsave("shot_map.png", p,
       width = 12, height = 8, dpi = 300,
       bg = "white")

# Save as SVG (for print/editing)
ggsave("shot_map.svg", p,
       width = 12, height = 8)

# Save as PDF (for publications)
ggsave("shot_map.pdf", p,
       width = 12, height = 8)

# For dark backgrounds
ggsave("shot_map_dark.png", p,
       width = 12, height = 8, dpi = 300,
       bg = "#1a1a2e")

Output: Exporting publication-quality figures
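
When the same figure is needed in several formats, the export calls can be driven from one loop with shared options. The following is a small sketch under stated assumptions: the plotted data is a placeholder, and the figure is written to in-memory buffers (`io.BytesIO`) so the example runs without touching the file system. Passing a filename instead of the buffer would write to disk.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot([0, 45, 90], [0.0, 0.8, 1.9])  # placeholder data
ax.set_title("Placeholder figure")

# Options shared by every export
save_kwargs = {"bbox_inches": "tight", "facecolor": fig.get_facecolor()}

outputs = {}
for fmt in ("png", "svg", "pdf"):
    buffer = io.BytesIO()
    fig.savefig(buffer, format=fmt, dpi=300, **save_kwargs)
    outputs[fmt] = buffer.getvalue()

# Close the figure to free memory
plt.close(fig)

for fmt, data in outputs.items():
    print(fmt, len(data), "bytes")
```

Keeping the options in one dictionary helps the background color and cropping stay consistent across formats.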

Chapter Summary

Key Takeaways
  • The pitch is your canvas - mplsoccer and ggsoccer make drawing pitches easy
  • Coordinate systems matter - always know your data provider's system
  • Shot maps tell stories - use size for xG, color for outcomes
  • Pass networks reveal structure - connections between players show tactics
  • Heat maps show density - where players operate, where teams dominate
  • xG timelines capture match flow - momentum, dominance, crucial moments
  • Radar charts compare players - multiple metrics at a glance
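
To make the point about coordinate systems concrete, here is a minimal sketch that converts a StatsBomb location to a 0-100 percentage pitch. It assumes StatsBomb's 120 × 80 pitch with the origin at the top-left and an Opta-style target with the origin at the bottom-left; the function name is hypothetical, and you should confirm both conventions against your provider's documentation before relying on it.

```python
def statsbomb_to_percent(x: float, y: float) -> tuple[float, float]:
    """Convert a StatsBomb (120 x 80, top-left origin) location
    to a 0-100 pitch with a bottom-left origin."""
    x_pct = x / 120 * 100
    y_pct = (80 - y) / 80 * 100  # flip the y-axis
    return x_pct, y_pct


print(statsbomb_to_percent(60, 40))   # centre spot
print(statsbomb_to_percent(120, 0))   # far end of the pitch, top edge in StatsBomb terms
```

Skipping the y-axis flip is an easy mistake to make: every event would then appear mirrored across the length of the pitch.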

Practice Exercises

Exercise 4.1: Create a Team Shot Map

Task: Create a professional shot map for a World Cup team with xG-sized points, goal highlighting, and summary statistics.

# Exercise 4.1: Team Shot Map with xG
from statsbombpy import sb
import matplotlib.pyplot as plt
from mplsoccer import VerticalPitch
import pandas as pd

# Load World Cup 2018 data
matches = sb.matches(competition_id=43, season_id=3)
all_events = pd.concat([
    sb.events(mid).assign(match_id=mid)
    for mid in matches["match_id"]
])

# Filter for Belgium shots
team_name = "Belgium"
shots = all_events[
    (all_events["team"] == team_name) &
    (all_events["type"] == "Shot")
].copy()
shots["is_goal"] = shots["shot_outcome"] == "Goal"

# statsbombpy stores coordinates as an [x, y] list in "location"
shots["x"] = shots["location"].apply(lambda loc: loc[0])
shots["y"] = shots["location"].apply(lambda loc: loc[1])

# Summary stats
total_xg = shots["shot_statsbomb_xg"].sum()
goals = shots["is_goal"].sum()
total_shots = len(shots)

# Create pitch
pitch = VerticalPitch(pitch_color="#1a1a2e", line_color="white", half=True)
fig, ax = pitch.draw(figsize=(10, 10))
fig.patch.set_facecolor("#1a1a2e")

# Use pitch.scatter (not ax.scatter) so coordinates are flipped
# correctly for the vertical pitch

# Non-goals
non_goals = shots[~shots["is_goal"]]
pitch.scatter(non_goals["x"], non_goals["y"], ax=ax,
              s=non_goals["shot_statsbomb_xg"] * 500,
              c="#666666", alpha=0.7, edgecolors="#444444")

# Goals with glow
goals_df = shots[shots["is_goal"]]
pitch.scatter(goals_df["x"], goals_df["y"], ax=ax,
              s=goals_df["shot_statsbomb_xg"] * 800,
              c="#FFD700", alpha=0.3)
pitch.scatter(goals_df["x"], goals_df["y"], ax=ax,
              s=goals_df["shot_statsbomb_xg"] * 500,
              c="#FFD700", alpha=0.9, edgecolors="black")

# Summary (pitch coordinates: x=65 is just inside the plotted half)
pitch.annotate(f"xG: {total_xg:.2f} | Goals: {goals} | Shots: {total_shots}",
               xy=(65, 40), ax=ax,
               ha="center", fontsize=12, color="white",
               bbox=dict(boxstyle="round", facecolor="#333", edgecolor="#FFD700"))

ax.set_title(f"{team_name} - World Cup 2018 Shot Map",
             color="white", fontsize=16, fontweight="bold")

plt.savefig("belgium_shotmap.png", dpi=300, bbox_inches="tight",
            facecolor="#1a1a2e")
plt.show()

# Exercise 4.1: Team Shot Map with xG
library(StatsBombR)
library(ggplot2)
library(ggsoccer)
library(dplyr)

# Load World Cup 2018 data
comps <- FreeCompetitions() %>%
  filter(competition_id == 43, season_id == 3)
matches <- FreeMatches(comps)
events <- free_allevents(MatchesDF = matches)
events <- allclean(events)  # adds location.x / location.y and other cleaned columns

# Filter for Belgium shots
team_name <- "Belgium"
shots <- events %>%
  filter(team.name == team_name, type.name == "Shot") %>%
  mutate(is_goal = shot.outcome.name == "Goal")

# Calculate summary stats
total_xg <- sum(shots$shot.statsbomb_xg, na.rm = TRUE)
goals <- sum(shots$is_goal)
total_shots <- nrow(shots)

# Create shot map
ggplot(shots) +
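  # StatsBomb data uses a 120 x 80 pitch, so pass the matching dimensions
  annotate_pitch(dimensions = pitch_statsbomb,
                 colour = "#FFFFFF", fill = "#1a1a2e") +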

  # Non-goals
  geom_point(data = filter(shots, !is_goal),
             aes(x = location.x, y = location.y,
                 size = shot.statsbomb_xg),
             color = "#666666", alpha = 0.7) +

  # Goals with glow effect
  geom_point(data = filter(shots, is_goal),
             aes(x = location.x, y = location.y,
                 size = shot.statsbomb_xg * 1.5),
             color = "#FFD700", alpha = 0.3) +
  geom_point(data = filter(shots, is_goal),
             aes(x = location.x, y = location.y,
                 size = shot.statsbomb_xg),
             color = "#FFD700", alpha = 0.9) +

  scale_size_continuous(range = c(3, 15), guide = "none") +
  theme_pitch() +
  coord_flip(xlim = c(60, 122)) +

  # Summary annotation
  annotate("text", x = 65, y = 40,
           label = sprintf("xG: %.2f | Goals: %d | Shots: %d",
                          total_xg, goals, total_shots),
           color = "white", size = 4) +

  labs(title = paste(team_name, "- World Cup 2018 Shot Map"),
       subtitle = "Gold = Goals | Size = xG value") +
  theme(plot.background = element_rect(fill = "#1a1a2e"),
        plot.title = element_text(color = "white", face = "bold", size = 16),
        plot.subtitle = element_text(color = "#888888", size = 12))

ggsave("belgium_shotmap.png", width = 12, height = 8, dpi = 300)

Output: Exercise 4.1: Create professional team shot map

Exercise 4.2: Build a Pass Network

Task: Create a pass network visualization with player positions, connection strengths, and identify the most connected player.

# Exercise 4.2: Pass Network Visualization
from statsbombpy import sb
import matplotlib.pyplot as plt
from mplsoccer import Pitch
import pandas as pd

team_name = "France"

# Get match data: pick a match that involves the chosen team
# (`matches` is the World Cup 2018 table loaded in Exercise 4.1)
team_matches = matches[
    (matches["home_team"] == team_name) |
    (matches["away_team"] == team_name)
]
match_id = team_matches["match_id"].iloc[0]
match_events = sb.events(match_id=match_id)

# Completed passes with recipients
team_passes = match_events[
    (match_events["team"] == team_name) &
    (match_events["type"] == "Pass") &
    (match_events["pass_outcome"].isna()) &
    (match_events["pass_recipient"].notna())
].copy()

# Count pairs (min 3 passes)
pass_pairs = team_passes.groupby(
    ["player", "pass_recipient"]
).size().reset_index(name="passes")
pass_pairs = pass_pairs[pass_pairs["passes"] >= 3]

# Average positions
team_events = match_events[
    (match_events["team"] == team_name) &
    (match_events["location"].notna())
].copy()
team_events["x"] = team_events["location"].apply(lambda l: l[0])
team_events["y"] = team_events["location"].apply(lambda l: l[1])

avg_pos = team_events.groupby("player").agg(
    x=("x", "mean"), y=("y", "mean"), touches=("type", "count")
).reset_index()

# Find most connected (by passes made along the retained edges)
connections = pass_pairs.groupby("player")["passes"].sum().reset_index()
most_connected = connections.loc[connections["passes"].idxmax(), "player"]

# Create visualization
pitch = Pitch(pitch_color="#1B5E20", line_color="white")
fig, ax = pitch.draw(figsize=(12, 8))

# Draw edges
for _, row in pass_pairs.iterrows():
    start = avg_pos[avg_pos["player"] == row["player"]]
    end = avg_pos[avg_pos["player"] == row["pass_recipient"]]
    if len(start) > 0 and len(end) > 0:
        ax.plot([start["x"].values[0], end["x"].values[0]],
                [start["y"].values[0], end["y"].values[0]],
                color="white", alpha=0.4, linewidth=row["passes"]/2)

# Draw nodes
ax.scatter(avg_pos["x"], avg_pos["y"], s=avg_pos["touches"] * 15,
           c="#75AADB", alpha=0.9, edgecolors="black", zorder=5)

# Highlight most connected
mc_pos = avg_pos[avg_pos["player"] == most_connected]
ax.scatter(mc_pos["x"], mc_pos["y"], s=500, c="#FFD700", alpha=0.3, zorder=4)

# Labels
for _, row in avg_pos.iterrows():
    ax.annotate(row["player"].split()[-1], (row["x"], row["y"] - 4),
                ha="center", fontsize=8, color="white")

ax.set_title(f"{team_name} Pass Network\nMost Connected: {most_connected}",
             fontsize=14, fontweight="bold")

plt.savefig("pass_network.png", dpi=150, bbox_inches="tight")
plt.show()

# Exercise 4.2: Pass Network Visualization
library(StatsBombR)
library(ggplot2)
library(ggsoccer)
library(dplyr)

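team_name <- "France"

# Get a single match involving the chosen team
# (team-name columns as returned by FreeMatches)
team_matches <- matches %>%
  filter(home_team.home_team_name == team_name |
           away_team.away_team_name == team_name)
match_id <- team_matches$match_id[1]
match_events <- events %>% filter(match_id == !!match_id)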

# Get completed passes with recipients
team_passes <- match_events %>%
  filter(team.name == team_name,
         type.name == "Pass",
         is.na(pass.outcome.name),
         !is.na(pass.recipient.name))

# Count pass pairs (min 3 passes for edge)
pass_pairs <- team_passes %>%
  group_by(player.name, pass.recipient.name) %>%
  summarise(passes = n(), .groups = "drop") %>%
  filter(passes >= 3)

# Average positions
avg_pos <- match_events %>%
  filter(team.name == team_name, !is.na(location.x)) %>%
  group_by(player.name) %>%
  summarise(x = mean(location.x), y = mean(location.y), touches = n())

# Join positions to edges
edges <- pass_pairs %>%
  left_join(avg_pos, by = "player.name") %>%
  left_join(avg_pos, by = c("pass.recipient.name" = "player.name"),
            suffix = c("", "_end"))

# Find most connected player
most_connected <- avg_pos %>%
  left_join(
    pass_pairs %>%
      group_by(player.name) %>%
      summarise(connections = sum(passes)),
    by = "player.name"
  ) %>%
  arrange(desc(connections)) %>%
  slice(1)

# Create visualization
ggplot() +
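  # StatsBomb data uses a 120 x 80 pitch, so pass the matching dimensions
  annotate_pitch(dimensions = pitch_statsbomb,
                 colour = "white", fill = "#1B5E20") +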

  # Edges
  geom_segment(data = edges,
               aes(x = x, y = y, xend = x_end, yend = y_end,
                   linewidth = passes),
               color = "white", alpha = 0.5) +

  # Nodes
  geom_point(data = avg_pos,
             aes(x = x, y = y, size = touches),
             color = "#75AADB", alpha = 0.9) +

  # Labels
  geom_text(data = avg_pos,
            aes(x = x, y = y - 4, label = gsub(".* ", "", player.name)),
            color = "white", size = 3) +

  # Highlight most connected
  geom_point(data = most_connected,
             aes(x = x, y = y), size = 20,
             color = "#FFD700", alpha = 0.3) +

  scale_linewidth_continuous(range = c(0.5, 4)) +
  scale_size_continuous(range = c(5, 15)) +
  theme_pitch() +
  coord_flip() +
  labs(title = paste(team_name, "Pass Network"),
       subtitle = paste("Most Connected:", most_connected$player.name)) +
  theme(legend.position = "none")

ggsave("pass_network.png", width = 12, height = 8)

Output: Exercise 4.2: Build pass network with connections

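
The solutions above rank players by passes made. One alternative reading of "most connected" counts passes made and received together, which networkx can compute as a weighted degree. The sketch below uses a made-up edge list with placeholder player labels; it illustrates the idea and is not the method used in the solution code.

```python
import networkx as nx

# Hypothetical pass counts: (passer, recipient, completed passes)
pass_counts = [
    ("A", "B", 12),
    ("B", "A", 9),
    ("B", "C", 7),
    ("C", "B", 4),
    ("A", "C", 3),
]

graph = nx.DiGraph()
graph.add_weighted_edges_from(pass_counts)  # counts are stored under "weight"

# Weighted degree of a node = passes made + passes received
involvement = dict(graph.degree(weight="weight"))
most_connected = max(involvement, key=involvement.get)

print(involvement)
print("Most connected:", most_connected)
```

In a real match the edge list could be built from the `pass_pairs` table, with one tuple per row.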
Exercise 4.3: Multi-Panel Match Report

Task: Create a comprehensive match report combining xG timeline, shot maps for both teams, and key statistics in a single figure.

# Exercise 4.3: Multi-Panel Match Report
import matplotlib.pyplot as plt
from mplsoccer import VerticalPitch
import pandas as pd
from statsbombpy import sb

# Get match data
match_id = matches["match_id"].iloc[0]
match_events = sb.events(match_id=match_id)
teams = match_events["team"].dropna().unique()

# Get shots
shots = match_events[match_events["type"] == "Shot"].copy()
shots = shots.sort_values("minute")

# statsbombpy stores coordinates as an [x, y] list in "location"
shots["x"] = shots["location"].apply(lambda loc: loc[0])
shots["y"] = shots["location"].apply(lambda loc: loc[1])

# Create figure with subplots
fig = plt.figure(figsize=(16, 12))
fig.patch.set_facecolor("#1a1a2e")

# 1. xG Timeline (top)
ax1 = fig.add_subplot(2, 2, (1, 2))
ax1.set_facecolor("#1a1a2e")

colors = {"team1": "#1E90FF", "team2": "#DC143C"}
for i, team in enumerate(teams):
    team_shots = shots[shots["team"] == team].copy()
    team_shots["cumxg"] = team_shots["shot_statsbomb_xg"].cumsum()
    minutes = [0] + team_shots["minute"].tolist()
    xg = [0] + team_shots["cumxg"].tolist()
    color = list(colors.values())[i]
    ax1.step(minutes, xg, where="post", linewidth=2.5, color=color, label=team)

    goals = team_shots[team_shots["shot_outcome"] == "Goal"]
    ax1.scatter(goals["minute"], goals["cumxg"], s=150, c="white",
                edgecolors=color, linewidth=2, zorder=5)

ax1.set_xlabel("Minute", color="white", fontsize=12)
ax1.set_ylabel("Cumulative xG", color="white", fontsize=12)
ax1.set_title("xG Timeline", color="white", fontsize=14, fontweight="bold")
ax1.legend(facecolor="#333", labelcolor="white")
ax1.tick_params(colors="white")
ax1.grid(alpha=0.3)

# 2 & 3. Shot maps
pitch = VerticalPitch(pitch_color="#333", line_color="white", half=True)

for i, team in enumerate(teams):
    ax = fig.add_subplot(2, 2, 3 + i)
    pitch.draw(ax=ax)

    team_shots = shots[shots["team"] == team]
    color = list(colors.values())[i]

    # Use pitch.scatter so coordinates are flipped for the vertical pitch
    # Non-goals
    non_goals = team_shots[team_shots["shot_outcome"] != "Goal"]
    pitch.scatter(non_goals["x"], non_goals["y"], ax=ax,
                  s=non_goals["shot_statsbomb_xg"] * 400,
                  c=color, alpha=0.5, edgecolors="white")

    # Goals
    goals = team_shots[team_shots["shot_outcome"] == "Goal"]
    pitch.scatter(goals["x"], goals["y"], ax=ax,
                  s=goals["shot_statsbomb_xg"] * 400,
                  c=color, alpha=1, edgecolors="white", linewidth=2)

    # Stats
    total_xg = team_shots["shot_statsbomb_xg"].sum()
    goal_count = len(goals)
    ax.set_title(f"{team}\nxG: {total_xg:.2f} | Goals: {goal_count}",
                 color="white", fontsize=12)

plt.suptitle(f"{teams[0]} vs {teams[1]}\nMatch Analysis Report",
             color="white", fontsize=16, fontweight="bold", y=0.98)

plt.tight_layout(rect=[0, 0, 1, 0.95])
plt.savefig("match_report.png", dpi=150, bbox_inches="tight",
            facecolor="#1a1a2e")
plt.show()

# Exercise 4.3: Multi-Panel Match Report
library(StatsBombR)
library(ggplot2)
library(ggsoccer)
library(dplyr)
library(patchwork)

# Get match data
match_id <- matches$match_id[1]
match_events <- events %>% filter(match_id == !!match_id)
teams <- unique(match_events$team.name[!is.na(match_events$team.name)])

# 1. xG Timeline
shots <- match_events %>%
  filter(type.name == "Shot") %>%
  arrange(minute) %>%
  group_by(team.name) %>%
  mutate(cumulative_xg = cumsum(shot.statsbomb_xg)) %>%
  ungroup()

p1 <- ggplot(shots, aes(x = minute, y = cumulative_xg, color = team.name)) +
  geom_step(linewidth = 1.5) +
  geom_point(data = filter(shots, shot.outcome.name == "Goal"),
             size = 4, shape = 21, fill = "white", stroke = 2) +
  scale_color_manual(values = c("#1E90FF", "#DC143C")) +
  labs(title = "xG Timeline", x = "Minute", y = "Cumulative xG", color = "") +
  theme_minimal() +
  theme(legend.position = "bottom")

# 2. Shot maps for each team
create_shot_map <- function(team, color) {
  team_shots <- shots %>% filter(team.name == team)
  ggplot(team_shots) +
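    # StatsBomb data uses a 120 x 80 pitch, so pass the matching dimensions
    annotate_pitch(dimensions = pitch_statsbomb,
                   colour = "white", fill = "#333") +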
    geom_point(aes(x = location.x, y = location.y,
                   size = shot.statsbomb_xg,
                   alpha = shot.outcome.name == "Goal"),
               color = color) +
    scale_size_continuous(range = c(2, 10)) +
    scale_alpha_manual(values = c(0.5, 1)) +
    theme_pitch() +
    coord_flip(xlim = c(60, 120)) +
    labs(title = team) +
    theme(legend.position = "none",
          plot.title = element_text(hjust = 0.5, color = "white"),
          plot.background = element_rect(fill = "#333"))
}

p2 <- create_shot_map(teams[1], "#1E90FF")
p3 <- create_shot_map(teams[2], "#DC143C")

# 3. Stats table
stats <- match_events %>%
  group_by(team.name) %>%
  summarise(
    Shots = sum(type.name == "Shot"),
    `On Target` = sum(type.name == "Shot" & shot.outcome.name %in% c("Goal", "Saved")),
    Goals = sum(type.name == "Shot" & shot.outcome.name == "Goal"),
    xG = round(sum(shot.statsbomb_xg[type.name == "Shot"], na.rm = TRUE), 2),
    Passes = sum(type.name == "Pass"),
    `Pass %` = round(sum(type.name == "Pass" & is.na(pass.outcome.name)) /
                    sum(type.name == "Pass") * 100, 1)
  )

# Combine with patchwork
final_plot <- (p1) / (p2 | p3) +
  plot_annotation(
    title = paste(teams[1], "vs", teams[2]),
    subtitle = "Match Analysis Report",
    theme = theme(plot.title = element_text(size = 20, face = "bold"),
                  plot.subtitle = element_text(size = 14))
  )

ggsave("match_report.png", final_plot, width = 14, height = 12, dpi = 300)

Output: Exercise 4.3: Create multi-panel match report
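
The task also asks for key statistics, which the R solution computes as a summary table. A comparable table can be sketched in pandas as follows. The toy `events` frame uses statsbombpy-style column names with invented values, and it assumes that a missing `pass_outcome` marks a completed pass, as in the pass network exercise.

```python
import pandas as pd

# Toy events using statsbombpy-style column names (values are illustrative)
events = pd.DataFrame({
    "team": ["Home", "Home", "Home", "Away", "Away", "Away"],
    "type": ["Shot", "Pass", "Pass", "Shot", "Shot", "Pass"],
    "shot_outcome": ["Goal", None, None, "Saved", "Off T", None],
    "shot_statsbomb_xg": [0.35, None, None, 0.10, 0.05, None],
    "pass_outcome": [None, None, "Incomplete", None, None, None],
})

is_shot = events["type"] == "Shot"
is_pass = events["type"] == "Pass"

# One boolean (or numeric) column per statistic, then sum per team
flags = pd.DataFrame({
    "team": events["team"],
    "Shots": is_shot,
    "On Target": is_shot & events["shot_outcome"].isin(["Goal", "Saved"]),
    "Goals": is_shot & (events["shot_outcome"] == "Goal"),
    "xG": events["shot_statsbomb_xg"].fillna(0.0),
    "Passes": is_pass,
    "Completed": is_pass & events["pass_outcome"].isna(),
})

stats = flags.groupby("team").sum()
stats["Pass %"] = (stats["Completed"] / stats["Passes"] * 100).round(1)
stats = stats.drop(columns="Completed")

print(stats)
```

The resulting frame could be added to the report figure as a table panel or printed alongside it.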
