Chapter 60

Capstone - Complete Analytics System


The Language of the Game

Passing is football's fundamental action. Every team's style—from tiki-taka possession to direct counter-attacks—is defined by how they pass. This chapter explores advanced passing analytics beyond simple completion rates.

Modern passing analysis goes far beyond "Player X completed 87% of passes." We can now measure pass progression, network centrality, pass value, and how players perform under pressure. These insights reveal true playmaking quality.

What Passing Analytics Can Tell Us
  • Ball progression: Who moves the ball forward effectively?
  • Team structure: How does the team connect through passing?
  • Under pressure: Who maintains quality when pressed?
  • Pass value: Which passes increase goal probability most?

Progressive Passes

Not all passes are equal. A progressive pass moves the ball significantly toward the opponent's goal, breaking lines and creating attacking opportunities.

Defining Progressive Passes

Common definitions:

  • FBref/StatsBomb: Completed passes that move the ball at least 10 yards toward the opponent's goal (or any pass into the penalty area)
  • Opta: Forward passes into the final third or penalty area
  • Wyscout: Passes that move the ball significantly closer to goal
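As a quick sanity check before running the full pipeline, the 10-yard rule can be tried on a few hand-made passes. This is a minimal sketch, assuming StatsBomb-style coordinates (120 x 80 pitch, attacking goal at x = 120); the helper names and the toy passes are illustrative and not part of any library.

```python
# Toy check of the 10-yard progressive-pass rule.
# Assumes StatsBomb-style coordinates: pitch is 120 x 80, attacking goal at x = 120.

def is_progressive(start_x, end_x, min_gain=10, min_end_x=40):
    """True if the pass gains at least `min_gain` yards and ends at or beyond `min_end_x`."""
    return (end_x - start_x) >= min_gain and end_x >= min_end_x

def is_into_box(end_x, end_y):
    """True if the pass ends inside the attacking penalty area."""
    return end_x >= 102 and 18 <= end_y <= 62

# (start_x, start_y, end_x, end_y): hand-made examples, not real data
toy_passes = {
    "short sideways pass": (50, 40, 52, 60),
    "long pass in own third": (10, 40, 30, 40),
    "line-breaking pass": (55, 30, 72, 35),
    "cutback inside the box": (112, 20, 108, 40),
}

for label, (sx, sy, ex, ey) in toy_passes.items():
    print(label, "| progressive:", is_progressive(sx, ex),
          "| into box:", is_into_box(ex, ey))
```

Only the line-breaking pass counts as progressive here: the long pass gains 20 yards but ends at x = 30, still inside the defensive third, and the cutback ends in the box but moves backward.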
# Calculate progressive passes
from statsbombpy import sb
import pandas as pd
import numpy as np

# Load data
matches = sb.matches(competition_id=43, season_id=106)
all_events = pd.concat([sb.events(mid) for mid in matches["match_id"]])

# Filter completed passes (pass_outcome is empty for completed passes)
passes = all_events[
    (all_events["type"] == "Pass") &
    (all_events["pass_outcome"].isna())  # Completed
].copy()

# Extract coordinates
passes["start_x"] = passes["location"].apply(lambda l: l[0] if l else None)
passes["start_y"] = passes["location"].apply(lambda l: l[1] if l else None)
passes["end_x"] = passes["pass_end_location"].apply(lambda l: l[0] if l else None)
passes["end_y"] = passes["pass_end_location"].apply(lambda l: l[1] if l else None)

# Calculate progression (toward goal at x=120)
passes["forward_progress"] = passes["end_x"] - passes["start_x"]

# Progressive pass flags
passes["is_progressive"] = (
    (passes["forward_progress"] >= 10) &
    (passes["end_x"] > passes["start_x"]) &
    (passes["end_x"] >= 40)  # Not in own defensive third
)
passes["is_into_final_third"] = (
    (passes["start_x"] < 80) & (passes["end_x"] >= 80)
)
passes["is_into_box"] = (
    (passes["end_x"] >= 102) &
    (passes["end_y"] >= 18) &
    (passes["end_y"] <= 62)
)

# Aggregate by player
player_prog = passes.groupby(["player", "team"]).agg(
    matches=("match_id", "nunique"),
    total_passes=("type", "count"),
    progressive=("is_progressive", "sum"),
    into_final_third=("is_into_final_third", "sum"),
    into_box=("is_into_box", "sum")
).reset_index()

player_prog["progressive_pct"] = (
    player_prog["progressive"] / player_prog["total_passes"] * 100).round(1)
# Per match played: a proxy for per 90, since minutes are not used here
player_prog["prog_per_90"] = (
    player_prog["progressive"] / player_prog["matches"]).round(2)

player_prog = player_prog[player_prog["total_passes"] >= 50]

print("Progressive Passing Leaders:")
print(player_prog.sort_values("prog_per_90", ascending=False).head(15))
# Calculate progressive passes
library(StatsBombR)
library(dplyr)

# Load data
comps <- FreeCompetitions() %>%
  filter(competition_id == 43, season_id == 106)
matches <- FreeMatches(comps)
events <- free_allevents(MatchesDF = matches)

# Filter completed passes
passes <- events %>%
  filter(type.name == "Pass",
         is.na(pass.outcome.name)) %>%  # Completed passes only
  mutate(
    # Start and end positions
    start_x = location.x,
    start_y = location.y,
    end_x = pass.end_location.x,
    end_y = pass.end_location.y,

    # Forward progress (toward goal at x=120)
    forward_progress = end_x - start_x,

    # Progressive pass definition
    # Must move at least 10 yards toward goal
    # AND end in a more advanced position
    is_progressive = (forward_progress >= 10) &
                     (end_x > start_x) &
                     (end_x >= 40),  # Not in own defensive third

    # Into final third
    is_into_final_third = (start_x < 80) & (end_x >= 80),

    # Into penalty area
    is_into_box = (end_x >= 102) & (end_y >= 18) & (end_y <= 62)
  )

# Player progressive passing stats
player_progressive <- passes %>%
  group_by(player.name, team.name) %>%
  summarise(
    matches = n_distinct(match_id),
    total_passes = n(),
    progressive = sum(is_progressive, na.rm = TRUE),
    into_final_third = sum(is_into_final_third, na.rm = TRUE),
    into_box = sum(is_into_box, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(
    progressive_pct = round(progressive / total_passes * 100, 1),
    # Per match played: a proxy for per 90, since minutes are not used here
    prog_per_90 = round(progressive / matches, 2)
  ) %>%
  filter(total_passes >= 50) %>%
  arrange(desc(prog_per_90))

print("Progressive Passing Leaders:")
print(head(player_progressive, 15))

Progressive Carries

Don't forget ball carries—dribbling toward goal is also valuable progression:

# Calculate progressive carries
carries = all_events[all_events["type"] == "Carry"].copy()
carries["start_x"] = carries["location"].apply(lambda l: l[0] if l else None)
carries["end_x"] = carries["carry_end_location"].apply(lambda l: l[0] if l else None)
carries["forward_progress"] = carries["end_x"] - carries["start_x"]

carries["is_progressive_carry"] = (
    (carries["forward_progress"] >= 10) &
    (carries["end_x"] > carries["start_x"]) &
    (carries["end_x"] >= 40)
)

# Combine passes and carries
prog_passes = passes[passes["is_progressive"]][["player", "team", "match_id"]].copy()
prog_passes["action_type"] = "Pass"

prog_carries = carries[carries["is_progressive_carry"]][["player", "team", "match_id"]].copy()
prog_carries["action_type"] = "Carry"

all_progressive = pd.concat([prog_passes, prog_carries])

player_progression = all_progressive.groupby(["player", "team"]).agg(
    matches=("match_id", "nunique"),
    prog_passes=("action_type", lambda x: (x == "Pass").sum()),
    prog_carries=("action_type", lambda x: (x == "Carry").sum()),
    total_progressive=("action_type", "count")
).reset_index()

# Per match played: a proxy for per 90, since minutes are not used here
player_progression["prog_per_90"] = (
    player_progression["total_progressive"] / player_progression["matches"]).round(2)

print("Ball Progression Leaders (Passes + Carries):")
print(player_progression.sort_values("prog_per_90", ascending=False).head(15))
# Calculate progressive carries
carries <- events %>%
  filter(type.name == "Carry") %>%
  mutate(
    start_x = location.x,
    end_x = carry.end_location.x,
    forward_progress = end_x - start_x,

    is_progressive_carry = (forward_progress >= 10) &
                           (end_x > start_x) &
                           (end_x >= 40)
  )

# Combined progressive actions
player_ball_progression <- passes %>%
  select(player.name, team.name, match_id, is_progressive) %>%
  mutate(action_type = "Pass") %>%
  bind_rows(
    carries %>%
      select(player.name, team.name, match_id,
             is_progressive = is_progressive_carry) %>%
      mutate(action_type = "Carry")
  ) %>%
  filter(is_progressive) %>%
  group_by(player.name, team.name) %>%
  summarise(
    matches = n_distinct(match_id),
    progressive_passes = sum(action_type == "Pass"),
    progressive_carries = sum(action_type == "Carry"),
    total_progressive = n(),
    .groups = "drop"
  ) %>%
  mutate(
    # Per match played: a proxy for per 90, since minutes are not used here
    prog_actions_per_90 = round(total_progressive / matches, 2)
  ) %>%
  arrange(desc(prog_actions_per_90))

print("Ball Progression Leaders (Passes + Carries):")
print(head(player_ball_progression, 15))
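The "per 90" columns in the code above divide by matches played, which is really a per-appearance rate and can understate substitutes. A true per-90 figure needs minutes played. The sketch below shows the arithmetic on made-up numbers; the `per_90` helper and the minutes values are assumptions for illustration, and minutes would have to be derived separately (for example from lineups and substitution events).

```python
def per_90(count, minutes):
    """Scale a raw count to a per-90-minutes rate; None if no minutes are recorded."""
    if minutes <= 0:
        return None
    return round(count / minutes * 90, 2)

# Hypothetical players: (progressive actions, matches played, minutes played)
toy_players = {
    "Starter": (12, 3, 270),
    "Substitute": (5, 3, 45),
}

for name, (actions, matches, minutes) in toy_players.items():
    per_match = round(actions / matches, 2)
    print(name, "| per match:", per_match, "| per 90:", per_90(actions, minutes))
```

The starter's two rates agree because they played every minute, while the substitute's per-match rate (1.67) hides a much higher output per 90 minutes (10.0). Small minute totals make per-90 rates noisy, so a minimum-minutes filter is usually applied alongside them.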

Pass Networks

Pass networks reveal team structure—who passes to whom, who is central to build-up, and how the team organizes tactically.

Building Pass Networks

# Build pass network for a team
import networkx as nx
import pandas as pd

# Filter for one team in one match
# (pick the first match that actually involves the team)
team_name = "Argentina"
team_matches = matches[
    (matches["home_team"] == team_name) | (matches["away_team"] == team_name)
]
match_id = team_matches["match_id"].iloc[0]
match_events = sb.events(match_id=match_id)

match_passes = match_events[
    (match_events["type"] == "Pass") &
    (match_events["team"] == team_name) &
    (match_events["pass_outcome"].isna()) &
    (match_events["pass_recipient"].notna())
].copy()

# Count pass pairs
pass_pairs = match_passes.groupby(
    ["player", "pass_recipient"]).size().reset_index(name="passes")
pass_pairs = pass_pairs[pass_pairs["passes"] >= 2]  # Minimum 2 passes for edge

# Calculate average positions
avg_positions = match_events[
    (match_events["team"] == team_name) &
    (match_events["location"].notna())
].copy()
avg_positions["x"] = avg_positions["location"].apply(lambda l: l[0])
avg_positions["y"] = avg_positions["location"].apply(lambda l: l[1])

player_positions = avg_positions.groupby("player").agg(
    avg_x=("x", "mean"),
    avg_y=("y", "mean"),
    touches=("type", "count")
).reset_index()

# Create directed graph
G = nx.DiGraph()

# Add nodes with positions
for _, row in player_positions.iterrows():
    G.add_node(row["player"],
               pos=(row["avg_x"], row["avg_y"]),
               touches=row["touches"])

# Add edges
for _, row in pass_pairs.iterrows():
    G.add_edge(row["player"], row["pass_recipient"], weight=row["passes"])

# Calculate centrality metrics
degree = dict(G.degree())  # In-edges plus out-edges
betweenness = nx.betweenness_centrality(G)
pagerank = nx.pagerank(G)

# Combine metrics
network_metrics = pd.DataFrame({
    "player": list(G.nodes()),
    "touches": [G.nodes[n].get("touches", 0) for n in G.nodes()],
    "degree": [degree[n] for n in G.nodes()],
    "betweenness": [round(betweenness[n], 4) for n in G.nodes()],
    "pagerank": [round(pagerank[n], 4) for n in G.nodes()]
})

print("Pass Network Centrality Metrics:")
print(network_metrics.sort_values("betweenness", ascending=False))
# Build pass network for a team
library(igraph)
library(dplyr)

# Filter for one team in one match
# (pick the first match that actually involves the team)
team_name <- "Argentina"
team_match_id <- matches %>%
  filter(home_team.home_team_name == team_name |
           away_team.away_team_name == team_name) %>%
  pull(match_id) %>%
  first()

match_passes <- passes %>%
  filter(team.name == team_name,
         match_id == team_match_id,
         !is.na(pass.recipient.name))

# Count pass pairs
pass_pairs <- match_passes %>%
  group_by(player.name, pass.recipient.name) %>%
  summarise(passes = n(), .groups = "drop") %>%
  filter(passes >= 2)  # Minimum 2 passes for edge

# Calculate average positions
avg_positions <- events %>%
  filter(team.name == team_name,
         match_id == team_match_id,
         !is.na(location.x)) %>%
  group_by(player.name) %>%
  summarise(
    avg_x = mean(location.x),
    avg_y = mean(location.y),
    touches = n()
  )

# Create network graph
g <- graph_from_data_frame(pass_pairs,
                           vertices = avg_positions,
                           directed = TRUE)

# Calculate network metrics
V(g)$degree <- degree(g, mode = "all")
V(g)$in_degree <- degree(g, mode = "in")
V(g)$out_degree <- degree(g, mode = "out")
V(g)$betweenness <- betweenness(g)
V(g)$pagerank <- page_rank(g)$vector

# Display metrics
network_metrics <- data.frame(
  player = V(g)$name,
  touches = V(g)$touches,
  degree = V(g)$degree,
  betweenness = round(V(g)$betweenness, 2),
  pagerank = round(V(g)$pagerank, 4)
) %>%
  arrange(desc(betweenness))

print("Pass Network Centrality Metrics:")
print(network_metrics)

Network Centrality Metrics Explained

Metric      | What It Measures                                          | High Value Indicates
Degree      | Number of unique passing connections                      | Well-connected player, involved in many partnerships
Betweenness | How often a player lies on the shortest path between others | Central hub, ball flows through this player
PageRank    | Importance based on who passes to you                     | Key outlet for important players
Closeness   | Average distance to all other players                     | Can reach anyone quickly with passes
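To make the degree and betweenness rows concrete, here is a minimal sketch that computes both by hand on a made-up five-player network, using only the standard library. The pass counts are invented, and the betweenness routine is a plain unweighted, unnormalized Brandes-style implementation written for this illustration, so its values will differ in scale from library functions that normalize by default.

```python
from collections import deque

# Hypothetical completed-pass counts: (passer, recipient) -> passes
toy_passes = {
    ("CB", "DM"): 14, ("DM", "CB"): 9,
    ("DM", "LW"): 7, ("DM", "RW"): 6,
    ("LW", "DM"): 4, ("RW", "DM"): 5,
    ("LW", "ST"): 3, ("RW", "ST"): 4,
}

players = sorted({p for pair in toy_passes for p in pair})
adj = {p: [] for p in players}          # directed adjacency list
for passer, recipient in toy_passes:
    adj[passer].append(recipient)

def unique_partners(player):
    """Degree as in the table: distinct teammates passed to or received from."""
    partners = set(adj[player])
    partners |= {p for p in players if player in adj[p]}
    return len(partners)

def betweenness(adj):
    """Unnormalized betweenness for a directed, unweighted graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma)
        stack, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

betweenness_scores = betweenness(adj)
for p in players:
    print(p, "| partners:", unique_partners(p),
          "| betweenness:", betweenness_scores[p])
```

In this toy network the defensive midfielder scores 7.0 on betweenness because every route from the centre-back to the attackers, and between the two wingers, runs through them, while each winger scores 1.0 for sharing the two routes to the striker.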

Passing Under Pressure

Anyone can complete passes with time and space. True quality shows when opponents press:

# Analyze passing under pressure
# Start from all pass attempts: `passes` holds completed passes only,
# which would make every completion rate 100%
pass_attempts = all_events[all_events["type"] == "Pass"].copy()
pass_attempts["under_pressure"] = (
    pass_attempts["under_pressure"].fillna(False).astype(bool))
pass_attempts["completed"] = pass_attempts["pass_outcome"].isna()
pass_attempts["pressured_completed"] = (
    pass_attempts["under_pressure"] & pass_attempts["completed"])
pass_attempts["unpressured_completed"] = (
    ~pass_attempts["under_pressure"] & pass_attempts["completed"])

pressure_stats = pass_attempts.groupby(["player", "team"]).agg(
    total_passes=("type", "count"),
    pressured_passes=("under_pressure", "sum"),
    pressured_completed=("pressured_completed", "sum"),
    unpressured_completed=("unpressured_completed", "sum")
).reset_index()
pressure_stats["unpressured_passes"] = (
    pressure_stats["total_passes"] - pressure_stats["pressured_passes"])

# Require a minimum sample of pressured passes
pressure_stats = pressure_stats[pressure_stats["pressured_passes"] >= 20].copy()

# Completion rates
pressure_stats["pressured_completion"] = (
    pressure_stats["pressured_completed"] / pressure_stats["pressured_passes"] * 100).round(1)
pressure_stats["unpressured_completion"] = (
    pressure_stats["unpressured_completed"] / pressure_stats["unpressured_passes"] * 100).round(1)

# Calculate pressure drop (lower = better under pressure)
pressure_stats["pressure_drop"] = (
    pressure_stats["unpressured_completion"] - pressure_stats["pressured_completion"]).round(1)

print("Passing Under Pressure Analysis:")
print(pressure_stats.sort_values("pressured_completion", ascending=False).head(15))
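# Analyze passing under pressure
# Start from all pass attempts: `passes` holds completed passes only,
# which would make every completion rate 100%
pressure_passing <- events %>%
  filter(type.name == "Pass") %>%
  mutate(
    under_pressure = !is.na(under_pressure) & under_pressure == TRUE
  ) %>%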
  group_by(player.name, team.name) %>%
  summarise(
    total_passes = n(),
    pressured_passes = sum(under_pressure),
    pressured_completed = sum(under_pressure & is.na(pass.outcome.name)),

    unpressured_passes = sum(!under_pressure),
    unpressured_completed = sum(!under_pressure & is.na(pass.outcome.name)),

    .groups = "drop"
  ) %>%
  filter(pressured_passes >= 20) %>%
  mutate(
    pressured_completion = round(pressured_completed / pressured_passes * 100, 1),
    unpressured_completion = round(unpressured_completed / unpressured_passes * 100, 1),
    pressure_drop = unpressured_completion - pressured_completion
  ) %>%
  arrange(pressure_drop)  # Lowest drop = best under pressure

print("Passing Under Pressure Analysis:")
print(head(pressure_passing %>%
             select(player.name, pressured_passes,
                    pressured_completion, unpressured_completion,
                    pressure_drop), 15))
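The pressure-drop arithmetic is easy to verify by hand. The sketch below uses invented counts for two hypothetical players; the key point is that each rate needs attempts (completed plus failed passes) in the denominator, so the calculation only makes sense on data that still includes incomplete passes.

```python
def completion_rate(completed, attempts):
    """Completion percentage, or None when there are no attempts."""
    if attempts == 0:
        return None
    return round(completed / attempts * 100, 1)

def pressure_drop(press_att, press_comp, free_att, free_comp):
    """Unpressured completion % minus pressured completion % (lower is better)."""
    return round(completion_rate(free_comp, free_att)
                 - completion_rate(press_comp, press_att), 1)

# Hypothetical counts:
# (pressured attempts, pressured completed, unpressured attempts, unpressured completed)
toy_players = {
    "Player A": (40, 32, 160, 144),
    "Player B": (25, 15, 100, 88),
}

for name, counts in toy_players.items():
    press_att, press_comp, free_att, free_comp = counts
    print(name,
          "| pressured:", completion_rate(press_comp, press_att),
          "| unpressured:", completion_rate(free_comp, free_att),
          "| drop:", pressure_drop(*counts))
```

Player A falls from 90.0% to 80.0% (a 10-point drop), while Player B falls from 88.0% to 60.0% (a 28-point drop), even though their unpressured rates are nearly identical.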

Pass Type Analysis

Different pass types serve different tactical purposes. Analyzing the mix reveals playing style:

# Analyze pass types
passes["pass_length"] = np.sqrt(
    (passes["end_x"] - passes["start_x"])**2 +
    (passes["end_y"] - passes["start_y"])**2
)

def is_true(value):
    """Missing flags are NaN, which is truthy in Python, so test explicitly."""
    return pd.notna(value) and bool(value)

def categorize_pass(row):
    if is_true(row.get("pass_through_ball")):
        return "Through Ball"
    elif is_true(row.get("pass_cross")):
        return "Cross"
    elif is_true(row.get("pass_switch")):
        return "Switch"
    elif is_true(row.get("pass_cut_back")):
        return "Cutback"
    elif row["pass_length"] > 30:
        return "Long Ball"
    elif row["forward_progress"] > 5:
        return "Forward"
    elif row["forward_progress"] < -5:
        return "Backward"
    return "Lateral"

passes["pass_category"] = passes.apply(categorize_pass, axis=1)

# Team pass style
team_style = passes.groupby(["team", "pass_category"]).size().unstack(fill_value=0)
team_style_pct = team_style.div(team_style.sum(axis=1), axis=0) * 100

print("Team Pass Style (% of each type):")
print(team_style_pct.round(1))
# Analyze pass types
library(tidyr)  # for pivot_wider
pass_type_analysis <- passes %>%
  mutate(
    pass_category = case_when(
      pass.through_ball == TRUE ~ "Through Ball",
      pass.cross == TRUE ~ "Cross",
      pass.switch == TRUE ~ "Switch",
      pass.cut_back == TRUE ~ "Cutback",
      sqrt((end_x - start_x)^2 + (end_y - start_y)^2) > 30 ~ "Long Ball",
      forward_progress > 5 ~ "Forward",
      forward_progress < -5 ~ "Backward",
      TRUE ~ "Lateral"
    )
  ) %>%
  group_by(team.name, pass_category) %>%
  summarise(
    passes = n(),
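    # `passes` holds completed passes only, so this is 100 by construction;
    # compute it from all pass attempts to get a meaningful rate
    completion_rate = mean(is.na(pass.outcome.name)) * 100,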
    .groups = "drop"
  )

# Pivot for team comparison
team_pass_style <- pass_type_analysis %>%
  group_by(team.name) %>%
  mutate(pct = passes / sum(passes) * 100) %>%
  select(team.name, pass_category, pct) %>%
  pivot_wider(names_from = pass_category, values_from = pct, values_fill = 0)

print("Team Pass Style (% of each type):")
print(head(team_pass_style, 10))
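One pitfall when categorizing passes in Python is that missing boolean flags usually arrive as NaN, and NaN is truthy, so a bare `if flag:` test would label almost every pass with the first category checked. The sketch below shows the issue and a rule-ordered classifier on plain dictionaries; the flag names and toy values are illustrative assumptions, not the exact column names of any data provider.

```python
import math

def is_flag_set(value):
    """True only for a real truthy flag; None and NaN count as not set."""
    if value is None:
        return False
    if isinstance(value, float) and math.isnan(value):
        return False
    return bool(value)

def categorize_pass(flags, length, forward_progress):
    """Apply the rules in priority order: explicit flags first, then geometry."""
    for key, label in (("through_ball", "Through Ball"),
                       ("cross", "Cross"),
                       ("switch", "Switch"),
                       ("cut_back", "Cutback")):
        if is_flag_set(flags.get(key)):
            return label
    if length > 30:
        return "Long Ball"
    if forward_progress > 5:
        return "Forward"
    if forward_progress < -5:
        return "Backward"
    return "Lateral"

nan = float("nan")
print("bool(NaN) is", bool(nan))  # NaN is truthy, hence the explicit check

# Hand-made passes: (flags, length in yards, forward progress in yards)
toy_passes = [
    ({"through_ball": nan}, 12, 8),
    ({"cross": True}, 25, 3),
    ({}, 35, 20),
    ({}, 8, -7),
    ({}, 6, 1),
]
for flags, length, progress in toy_passes:
    print(categorize_pass(flags, length, progress))
```

Because the rules are checked in order, a 35-yard forward ball is reported as "Long Ball" rather than "Forward"; reordering the checks changes the style percentages, so the priority should be chosen deliberately.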

Chapter Summary

Key Takeaways
  • Progressive passes move the ball 10+ yards toward goal
  • Combine passes and carries for total ball progression
  • Pass networks reveal team structure and key connectors
  • Betweenness centrality identifies midfield hubs
  • Pressure resistance (a small completion-rate drop when pressed) separates elite passers from average ones
  • Pass type mix defines team style (direct vs. possession)

Practice Exercises

Put your passing analytics knowledge into practice with these exercises.

Exercise 9.1: Progressive Passing Analysis

Task: Identify the top ball progressors in a tournament by analyzing both progressive passes and progressive carries. Create a combined metric that ranks players by total ball progression.

Definition: A progressive action moves the ball at least 10 yards toward the opponent's goal and ends at or beyond x = 40 on StatsBomb's 120-yard pitch, that is, outside the team's own defensive third.

# Exercise 9.1 Solution: Progressive Ball Movement
from statsbombpy import sb
import pandas as pd

# Load data
matches = sb.matches(competition_id=43, season_id=106)
all_events = pd.concat([
    sb.events(mid).assign(match_id=mid)
    for mid in matches["match_id"]
])

# Extract coordinates
def get_coords(events_df, loc_col, end_col):
    df = events_df.copy()
    df["start_x"] = df[loc_col].apply(lambda l: l[0] if l else None)
    df["end_x"] = df[end_col].apply(lambda l: l[0] if l else None)
    return df

# Progressive passes
passes = all_events[
    (all_events["type"] == "Pass") &
    (all_events["pass_outcome"].isna())
].copy()
passes = get_coords(passes, "location", "pass_end_location")
passes["progress"] = passes["end_x"] - passes["start_x"]
passes["is_prog"] = (passes["progress"] >= 10) & (passes["end_x"] >= 40)

# .copy() avoids assigning into a slice of `passes`
prog_passes = passes[passes["is_prog"]][["player", "team", "match_id", "progress"]].copy()
prog_passes["action"] = "Pass"

# Progressive carries
carries = all_events[all_events["type"] == "Carry"].copy()
carries = get_coords(carries, "location", "carry_end_location")
carries["progress"] = carries["end_x"] - carries["start_x"]
carries["is_prog"] = (carries["progress"] >= 10) & (carries["end_x"] >= 40)

prog_carries = carries[carries["is_prog"]][["player", "team", "match_id", "progress"]].copy()
prog_carries["action"] = "Carry"

# Combine
all_prog = pd.concat([prog_passes, prog_carries])

# Aggregate
progressors = all_prog.groupby(["player", "team"]).agg(
    matches=("match_id", "nunique"),
    prog_passes=("action", lambda x: (x == "Pass").sum()),
    prog_carries=("action", lambda x: (x == "Carry").sum()),
    total_prog=("action", "count"),
    yards_progressed=("progress", "sum")
).reset_index()

# Per match played: a proxy for per 90, since minutes are not used here
progressors["prog_per_90"] = (progressors["total_prog"] / progressors["matches"]).round(2)
progressors["yards_per_90"] = (progressors["yards_progressed"] / progressors["matches"]).round(1)

progressors = progressors[progressors["matches"] >= 4]

print("Top Ball Progressors:")
print(progressors.sort_values("prog_per_90", ascending=False).head(15))
# Exercise 9.1 Solution: Progressive Ball Movement
library(StatsBombR)
library(dplyr)

# Load World Cup data
comps <- FreeCompetitions() %>%
  filter(competition_id == 43, season_id == 106)
matches <- FreeMatches(comps)
events <- free_allevents(MatchesDF = matches)

# Progressive passes
prog_passes <- events %>%
  filter(type.name == "Pass", is.na(pass.outcome.name)) %>%
  mutate(
    start_x = location.x,
    end_x = pass.end_location.x,
    progress = end_x - start_x,
    is_progressive = progress >= 10 & end_x > start_x & end_x >= 40
  ) %>%
  filter(is_progressive) %>%
  select(player.name, team.name, match_id, progress, type = type.name)

# Progressive carries
prog_carries <- events %>%
  filter(type.name == "Carry") %>%
  mutate(
    start_x = location.x,
    end_x = carry.end_location.x,
    progress = end_x - start_x,
    is_progressive = progress >= 10 & end_x > start_x & end_x >= 40
  ) %>%
  filter(is_progressive) %>%
  select(player.name, team.name, match_id, progress, type = type.name)

# Combine and aggregate
all_progressive <- bind_rows(prog_passes, prog_carries)

ball_progressors <- all_progressive %>%
  group_by(player.name, team.name) %>%
  summarise(
    matches = n_distinct(match_id),
    prog_passes = sum(type == "Pass"),
    prog_carries = sum(type == "Carry"),
    total_progressive = n(),
    total_yards_progressed = sum(progress),
    .groups = "drop"
  ) %>%
  mutate(
    prog_per_90 = round(total_progressive / matches, 2),
    yards_per_90 = round(total_yards_progressed / matches, 1),
    pass_carry_ratio = round(prog_passes / (prog_carries + 0.1), 2)
  ) %>%
  filter(matches >= 4) %>%
  arrange(desc(prog_per_90))

print("Top Ball Progressors:")
print(head(ball_progressors, 15))

# Categorize style
ball_progressors <- ball_progressors %>%
  mutate(
    style = case_when(
      pass_carry_ratio > 5 ~ "Pass Progressor",
      pass_carry_ratio < 2 ~ "Carry Progressor",
      TRUE ~ "Balanced"
    )
  )

print("\nProgression Style Distribution:")
print(table(ball_progressors$style))
Exercise 9.2: Build a Pass Network

Task: Build and visualize a pass network for a team. Calculate centrality metrics (degree, betweenness) to identify the most influential players in the team's passing structure.

Requirements:

  • Plot player nodes at their average positions
  • Draw edges between players who passed to each other (min. 3 passes)
  • Size nodes by degree centrality
  • Identify the player with highest betweenness (the "hub")

# Exercise 9.2 Solution: Pass Network Visualization
from statsbombpy import sb
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
from mplsoccer import Pitch

# Select match and team (first match that actually involves the team)
team_name = "Argentina"
matches = sb.matches(competition_id=43, season_id=106)
team_matches = matches[
    (matches["home_team"] == team_name) | (matches["away_team"] == team_name)
]
match_id = team_matches["match_id"].iloc[0]
match_events = sb.events(match_id=match_id)

# Completed passes with recipients
team_passes = match_events[
    (match_events["team"] == team_name) &
    (match_events["type"] == "Pass") &
    (match_events["pass_outcome"].isna()) &
    (match_events["pass_recipient"].notna())
].copy()

# Count pairs (min 3 passes)
pass_pairs = team_passes.groupby(
    ["player", "pass_recipient"]
).size().reset_index(name="passes")
pass_pairs = pass_pairs[pass_pairs["passes"] >= 3]

# Average positions
team_events = match_events[
    (match_events["team"] == team_name) &
    (match_events["location"].notna())
].copy()
team_events["x"] = team_events["location"].apply(lambda l: l[0])
team_events["y"] = team_events["location"].apply(lambda l: l[1])

avg_pos = team_events.groupby("player").agg(
    x=("x", "mean"),
    y=("y", "mean"),
    touches=("type", "count")
).reset_index()

# Build network (undirected: A->B and B->A share one edge)
G = nx.Graph()
for _, row in avg_pos.iterrows():
    G.add_node(row["player"], pos=(row["x"], row["y"]))
for _, row in pass_pairs.iterrows():
    u, v = row["player"], row["pass_recipient"]
    if G.has_edge(u, v):
        G[u][v]["weight"] += row["passes"]  # Sum both directions
    else:
        G.add_edge(u, v, weight=row["passes"])

# Centrality
degree = dict(G.degree())
betweenness = nx.betweenness_centrality(G)
hub = max(betweenness, key=betweenness.get)
print(f"Team Hub: {hub}")

# Visualize
pitch = Pitch(pitch_type="statsbomb", pitch_color="#1B5E20", line_color="white")
fig, ax = pitch.draw(figsize=(12, 8))

# Draw edges
for u, v, d in G.edges(data=True):
    pos_u = G.nodes[u]["pos"]
    pos_v = G.nodes[v]["pos"]
    ax.plot([pos_u[0], pos_v[0]], [pos_u[1], pos_v[1]],
            color="white", alpha=0.5, linewidth=d["weight"]/2)

# Draw nodes
for node in G.nodes():
    pos = G.nodes[node]["pos"]
    size = degree[node] * 50
    ax.scatter(pos[0], pos[1], s=size, c="#75AADB", zorder=5, alpha=0.9)
    ax.annotate(node.split()[-1], (pos[0], pos[1]+3),
                ha="center", fontsize=8, color="white")

ax.set_title(f"{team_name} Pass Network\nHub: {hub}", fontsize=14)
plt.savefig("pass_network.png", dpi=150, bbox_inches="tight")
plt.show()
# Exercise 9.2 Solution: Pass Network Visualization
library(StatsBombR)
library(dplyr)
library(igraph)
library(ggplot2)
library(ggsoccer)

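# Select a match that actually involves the team
team_name <- "Argentina"
selected_match_id <- matches %>%
  filter(home_team.home_team_name == team_name |
           away_team.away_team_name == team_name) %>%
  pull(match_id) %>%
  first()

match_events <- events %>%
  filter(match_id == selected_match_id)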

# Get completed passes with recipients
team_passes <- match_events %>%
  filter(team.name == team_name,
         type.name == "Pass",
         is.na(pass.outcome.name),
         !is.na(pass.recipient.name))

# Count pass pairs
pass_pairs <- team_passes %>%
  group_by(player.name, pass.recipient.name) %>%
  summarise(passes = n(), .groups = "drop") %>%
  filter(passes >= 3)

# Average positions
avg_pos <- match_events %>%
  filter(team.name == team_name, !is.na(location.x)) %>%
  group_by(player.name) %>%
  summarise(
    x = mean(location.x),
    y = mean(location.y),
    touches = n()
  )

# Build network
g <- graph_from_data_frame(pass_pairs, vertices = avg_pos, directed = FALSE)

# Calculate centrality
V(g)$degree <- degree(g)
V(g)$betweenness <- betweenness(g, normalized = TRUE)

# Find hub player
hub_player <- V(g)$name[which.max(V(g)$betweenness)]
cat("Team Hub (highest betweenness):", hub_player, "\n")

# Prepare for ggplot
edges_df <- pass_pairs %>%
  left_join(avg_pos, by = c("player.name")) %>%
  rename(x_start = x, y_start = y) %>%
  left_join(avg_pos, by = c("pass.recipient.name" = "player.name")) %>%
  rename(x_end = x, y_end = y)

nodes_df <- data.frame(
  player = V(g)$name,
  x = avg_pos$x[match(V(g)$name, avg_pos$player.name)],
  y = avg_pos$y[match(V(g)$name, avg_pos$player.name)],
  degree = V(g)$degree,
  betweenness = V(g)$betweenness
)

# Plot
ggplot() +
  annotate_pitch(dimensions = pitch_statsbomb,  # 120 x 80, matches the event coordinates
                 colour = "white", fill = "#1B5E20") +
  geom_segment(data = edges_df,
               aes(x = x_start, y = y_start, xend = x_end, yend = y_end,
                   linewidth = passes),
               color = "white", alpha = 0.6) +
  geom_point(data = nodes_df,
             aes(x = x, y = y, size = degree),
             color = "#75AADB", alpha = 0.9) +
  geom_text(data = nodes_df,
            aes(x = x, y = y + 3,
                label = gsub(".* ", "", player)),  # Last name
            color = "white", size = 3) +
  scale_size_continuous(range = c(5, 15)) +
  scale_linewidth_continuous(range = c(0.5, 3)) +
  labs(title = paste(team_name, "Pass Network"),
       subtitle = paste("Hub:", hub_player)) +
  theme_pitch() +
  coord_flip(xlim = c(0, 120), ylim = c(0, 80))

ggsave("pass_network.png", width = 12, height = 8)
Exercise 9.3: Pressure Resistance Score

Task: Create a "Pressure Resistance Score" that measures how well players maintain passing quality when under pressure. Compare completion rates under pressure vs. without pressure.

Scoring:

  • Calculate pressured completion rate and unpressured rate
  • Pressure Drop = Unpressured Rate - Pressured Rate
  • Lower drop = better pressure resistance

# Exercise 9.3 Solution: Pressure Resistance Score
import pandas as pd
import matplotlib.pyplot as plt

# Analyze passes (all attempts; `all_events` is loaded as in Exercise 9.1)
passes = all_events[all_events["type"] == "Pass"].copy()
# Missing values mean "not under pressure"; cast so that ~ works as boolean NOT
passes["under_pressure"] = passes["under_pressure"].fillna(False).astype(bool)
passes["completed"] = passes["pass_outcome"].isna()

# Aggregate
def calc_rates(group):
    pressured = group[group["under_pressure"]]
    unpressured = group[~group["under_pressure"]]
    return pd.Series({
        "matches": group["match_id"].nunique(),
        "total_passes": len(group),
        "pressured_attempts": len(pressured),
        "pressured_completed": pressured["completed"].sum(),
        "unpressured_attempts": len(unpressured),
        "unpressured_completed": unpressured["completed"].sum()
    })

pressure_stats = passes.groupby(["player", "team"]).apply(calc_rates).reset_index()

# Filter and calculate rates
pressure_stats = pressure_stats[
    (pressure_stats["pressured_attempts"] >= 20) &
    (pressure_stats["total_passes"] >= 100)
].copy()

pressure_stats["pressured_rate"] = (
    pressure_stats["pressured_completed"] / pressure_stats["pressured_attempts"] * 100
).round(1)
pressure_stats["unpressured_rate"] = (
    pressure_stats["unpressured_completed"] / pressure_stats["unpressured_attempts"] * 100
).round(1)
pressure_stats["pressure_drop"] = (
    pressure_stats["unpressured_rate"] - pressure_stats["pressured_rate"]
).round(1)

# Rating
def get_rating(drop):
    if drop < 5:
        return "Elite"
    elif drop < 10:
        return "Good"
    elif drop < 15:
        return "Average"
    return "Poor"

pressure_stats["rating"] = pressure_stats["pressure_drop"].apply(get_rating)

print("Pressure Resistance Leaders:")
print(pressure_stats.sort_values("pressure_drop").head(15)[
    ["player", "pressured_rate", "unpressured_rate", "pressure_drop", "rating"]])

# Visualization
colors = {"Elite": "#1B5E20", "Good": "#4CAF50",
          "Average": "#FFC107", "Poor": "#F44336"}

fig, ax = plt.subplots(figsize=(10, 8))
for rating in colors:
    subset = pressure_stats[pressure_stats["rating"] == rating]
    ax.scatter(subset["unpressured_rate"], subset["pressured_rate"],
               c=colors[rating], label=rating, alpha=0.7, s=50)

ax.plot([60, 100], [60, 100], "k--", alpha=0.5, label="Equal performance")
ax.set_xlabel("Unpressured Completion %")
ax.set_ylabel("Pressured Completion %")
ax.set_title("Pressure Resistance Analysis")
ax.legend()
plt.savefig("pressure_resistance.png", dpi=150)
plt.show()
# Exercise 9.3 Solution: Pressure Resistance Score
library(StatsBombR)
library(dplyr)
library(ggplot2)

# Analyze passes with pressure info
passes <- events %>%
  filter(type.name == "Pass") %>%
  mutate(
    under_pressure = !is.na(under_pressure) & under_pressure == TRUE,
    completed = is.na(pass.outcome.name)
  )

# Calculate pressure resistance by player
pressure_resistance <- passes %>%
  group_by(player.name, team.name) %>%
  summarise(
    matches = n_distinct(match_id),
    total_passes = n(),

    # Pressured stats
    pressured_attempts = sum(under_pressure),
    pressured_completed = sum(under_pressure & completed),

    # Unpressured stats
    unpressured_attempts = sum(!under_pressure),
    unpressured_completed = sum(!under_pressure & completed),

    .groups = "drop"
  ) %>%
  filter(pressured_attempts >= 20, total_passes >= 100) %>%
  mutate(
    pressured_rate = round(pressured_completed / pressured_attempts * 100, 1),
    unpressured_rate = round(unpressured_completed / unpressured_attempts * 100, 1),
    pressure_drop = round(unpressured_rate - pressured_rate, 1),

    # Pressure resistance score (inverted: lower drop = higher score)
    resistance_score = round(100 - pressure_drop, 1),

    # Rating
    rating = case_when(
      pressure_drop < 5 ~ "Elite",
      pressure_drop < 10 ~ "Good",
      pressure_drop < 15 ~ "Average",
      TRUE ~ "Poor"
    )
  ) %>%
  arrange(pressure_drop)

print("Pressure Resistance Leaders (lowest drop):")
print(head(pressure_resistance %>%
             select(player.name, pressured_rate, unpressured_rate,
                    pressure_drop, rating), 15))

# Visualization
ggplot(pressure_resistance, aes(x = unpressured_rate, y = pressured_rate)) +
  geom_point(aes(color = rating), size = 3, alpha = 0.7) +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
  geom_abline(slope = 1, intercept = -10, linetype = "dotted", color = "red") +
  scale_color_manual(values = c("Elite" = "#1B5E20", "Good" = "#4CAF50",
                                "Average" = "#FFC107", "Poor" = "#F44336")) +
  labs(
    title = "Pressure Resistance Analysis",
    subtitle = "Dashed line = equal completion with and without pressure; dotted line = 10-point drop",
    x = "Unpressured Completion %",
    y = "Pressured Completion %",
    color = "Rating"
  ) +
  theme_minimal()

ggsave("pressure_resistance.png", width = 10, height = 8)

Next: Defensive Analytics

Learn to evaluate defenders with possession-adjusted metrics, pressing analysis, and defensive value models.
