The Language of the Game
Passing is football's fundamental action. Every team's style—from tiki-taka possession to direct counter-attacks—is defined by how they pass. This chapter explores advanced passing analytics beyond simple completion rates.
Modern passing analysis goes far beyond "Player X completed 87% of passes." We can now measure pass progression, network centrality, pass value, and how players perform under pressure. These insights reveal true playmaking quality.
What Passing Analytics Can Tell Us
- Ball progression: Who moves the ball forward effectively?
- Team structure: How does the team connect through passing?
- Under pressure: Who maintains quality when pressed?
- Pass value: Which passes increase goal probability most?
Progressive Passes
Not all passes are equal. A progressive pass moves the ball significantly toward the opponent's goal, breaking lines and creating attacking opportunities.
Defining Progressive Passes
Common definitions:
- FBref/StatsBomb: Completed passes that move the ball at least 10 yards toward the opponent's goal (or any pass into the penalty area)
- Opta: Forward passes into the final third or penalty area
- Wyscout: Passes that move the ball significantly closer to goal
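As a minimal sketch of the FBref/StatsBomb-style rule, the snippet below classifies a few made-up passes. It assumes StatsBomb-style coordinates (a 120 x 80 pitch with the attacking goal at x = 120). The helper `is_progressive_pass()` and its thresholds are hypothetical illustrations, not part of any library.

```r
# Toy passes in StatsBomb-style coordinates (120 x 80 pitch, attacking goal at x = 120).
# All values are made up for illustration.
toy_passes <- data.frame(
  start_x = c(30, 50, 96, 60, 20),
  start_y = c(40, 20, 10, 40, 40),
  end_x   = c(45, 55, 105, 58, 32),
  end_y   = c(40, 30, 40, 45, 40)
)

# Hypothetical helper: TRUE when a pass gains at least `min_gain` yards toward goal
# and ends at or beyond `min_end_x`, or when it ends inside the penalty area.
is_progressive_pass <- function(start_x, end_x, end_y,
                                min_gain = 10, min_end_x = 40) {
  gains_ground <- (end_x - start_x) >= min_gain & end_x >= min_end_x
  into_box     <- end_x >= 102 & end_y >= 18 & end_y <= 62
  gains_ground | into_box
}

toy_passes$progressive <- with(
  toy_passes,
  is_progressive_pass(start_x, end_x, end_y)
)
print(toy_passes)
```

The third pass gains only 9 yards but still counts because it ends in the penalty area. The fifth gains 12 yards but ends inside the team's own defensive third, so it does not.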
# Calculate progressive passes
library(StatsBombR)
library(dplyr)

# Load data
comps <- FreeCompetitions() %>%
  filter(competition_id == 43, season_id == 106)
matches <- FreeMatches(comps)
events <- free_allevents(MatchesDF = matches)
# Unpack nested locations into columns such as location.x and pass.end_location.x
events <- allclean(events)

# Filter completed passes
passes <- events %>%
  filter(type.name == "Pass",
         is.na(pass.outcome.name)) %>% # Completed passes only
  mutate(
    # Start and end positions
    start_x = location.x,
    start_y = location.y,
    end_x = pass.end_location.x,
    end_y = pass.end_location.y,
    # Forward progress (toward goal at x=120)
    forward_progress = end_x - start_x,
    # Progressive pass definition
    # Must move at least 10 yards toward goal
    # AND end in a more advanced position
    is_progressive = (forward_progress >= 10) &
      (end_x > start_x) &
      (end_x >= 40), # Not in own defensive third
    # Into final third
    is_into_final_third = (start_x < 80) & (end_x >= 80),
    # Into penalty area
    is_into_box = (end_x >= 102) & (end_y >= 18) & (end_y <= 62)
  )

# Player progressive passing stats
player_progressive <- passes %>%
  group_by(player.name, team.name) %>%
  summarise(
    matches = n_distinct(match_id),
    total_passes = n(),
    progressive = sum(is_progressive, na.rm = TRUE),
    into_final_third = sum(is_into_final_third, na.rm = TRUE),
    into_box = sum(is_into_box, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(
    progressive_pct = round(progressive / total_passes * 100, 1),
    # Per match played, a rough stand-in for per-90 (minutes are not accounted for)
    prog_per_90 = round(progressive / matches, 2)
  ) %>%
  filter(total_passes >= 50) %>%
  arrange(desc(prog_per_90))

print("Progressive Passing Leaders:")
print(head(player_progressive, 15))
Progressive Carries
Don't forget ball carries—dribbling toward goal is also valuable progression:
# Calculate progressive carries
carries <- events %>%
  filter(type.name == "Carry") %>%
  mutate(
    start_x = location.x,
    end_x = carry.end_location.x,
    forward_progress = end_x - start_x,
    is_progressive_carry = (forward_progress >= 10) &
      (end_x > start_x) &
      (end_x >= 40)
  )

# Combined progressive actions
player_ball_progression <- passes %>%
  select(player.name, team.name, match_id, is_progressive) %>%
  mutate(action_type = "Pass") %>%
  bind_rows(
    carries %>%
      select(player.name, team.name, match_id,
             is_progressive = is_progressive_carry) %>%
      mutate(action_type = "Carry")
  ) %>%
  filter(is_progressive) %>%
  group_by(player.name, team.name) %>%
  summarise(
    # Counts only matches in which the player made at least one progressive action
    matches = n_distinct(match_id),
    progressive_passes = sum(action_type == "Pass"),
    progressive_carries = sum(action_type == "Carry"),
    total_progressive = n(),
    .groups = "drop"
  ) %>%
  mutate(
    # Per match, a rough stand-in for per-90
    prog_actions_per_90 = round(total_progressive / matches, 2)
  ) %>%
  arrange(desc(prog_actions_per_90))

print("Ball Progression Leaders (Passes + Carries):")
print(head(player_ball_progression, 15))
Pass Networks
Pass networks reveal team structure—who passes to whom, who is central to build-up, and how the team organizes tactically.
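The "who passes to whom" idea reduces to a passer-by-recipient count matrix. The sketch below builds one in base R from a made-up pass log; the player labels and the `toy_log` data frame are placeholders, not real match data.

```r
# Made-up pass log: one row per completed pass (player labels are placeholders).
toy_log <- data.frame(
  passer    = c("A", "A", "B", "B", "B", "C", "C", "A"),
  recipient = c("B", "B", "C", "A", "C", "A", "B", "C")
)

# Rows are passers, columns are recipients; each cell counts passes along that link.
pass_matrix <- table(toy_log$passer, toy_log$recipient)
print(pass_matrix)

# Passes made (row sums) and passes received (column sums) per player.
passes_made     <- rowSums(pass_matrix)
passes_received <- colSums(pass_matrix)
print(passes_made)
print(passes_received)
```

Each non-zero cell becomes a directed edge in the network, with the count as its weight.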
Building Pass Networks
# Build pass network for a team
library(igraph)
library(dplyr)

# Filter for one team in one match
team_name <- "Argentina"
# Pick the first match in the data that this team actually played
team_match_id <- events %>%
  filter(team.name == team_name) %>%
  pull(match_id) %>%
  first()

match_passes <- passes %>%
  filter(team.name == team_name,
         match_id == team_match_id,
         !is.na(pass.recipient.name))

# Count pass pairs
pass_pairs <- match_passes %>%
  group_by(player.name, pass.recipient.name) %>%
  summarise(passes = n(), .groups = "drop") %>%
  filter(passes >= 2) # Minimum 2 passes for edge

# Calculate average positions
avg_positions <- events %>%
  filter(team.name == team_name,
         match_id == team_match_id,
         !is.na(player.name),
         !is.na(location.x)) %>%
  group_by(player.name) %>%
  summarise(
    avg_x = mean(location.x),
    avg_y = mean(location.y),
    touches = n()
  )

# Create network graph
g <- graph_from_data_frame(pass_pairs,
                           vertices = avg_positions,
                           directed = TRUE)

# Calculate network metrics
V(g)$degree <- degree(g, mode = "all")
V(g)$in_degree <- degree(g, mode = "in")
V(g)$out_degree <- degree(g, mode = "out")
V(g)$betweenness <- betweenness(g)
V(g)$pagerank <- page_rank(g)$vector

# Display metrics
network_metrics <- data.frame(
  player = V(g)$name,
  touches = V(g)$touches,
  degree = V(g)$degree,
  betweenness = round(V(g)$betweenness, 2),
  pagerank = round(V(g)$pagerank, 4)
) %>%
  arrange(desc(betweenness))

print("Pass Network Centrality Metrics:")
print(network_metrics)
Network Centrality Metrics Explained
| Metric | What It Measures | High Value Indicates |
|---|---|---|
| Degree | Number of unique passing connections | Well-connected player, involved in many partnerships |
| Betweenness | How often player is on shortest path between others | Central hub, ball flows through this player |
| PageRank | Importance based on who passes to you | Key outlet for important players |
| Closeness | Inverse of the average path length (in passing links) to all other players | Can reach anyone quickly with passes |
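To make the table concrete, here is a small sketch that computes degree, betweenness, and closeness on a made-up four-player network with igraph. The edge list is invented for illustration, and the graph is treated as undirected and unweighted to keep the numbers easy to check by hand.

```r
library(igraph)

# Made-up undirected passing links: A-B, B-C, C-D, B-D
toy_edges <- data.frame(
  from = c("A", "B", "C", "B"),
  to   = c("B", "C", "D", "D")
)
g_toy <- graph_from_data_frame(toy_edges, directed = FALSE)

toy_metrics <- data.frame(
  player      = V(g_toy)$name,
  degree      = as.numeric(degree(g_toy)),
  betweenness = as.numeric(betweenness(g_toy)),
  # Unnormalized closeness: 1 / sum of shortest-path lengths to all other players
  closeness   = as.numeric(closeness(g_toy))
)
print(toy_metrics)
```

Player B links to everyone directly, so B has the highest degree and closeness, and B is the only player sitting on a shortest path between two others (A to C and A to D).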
Passing Under Pressure
Anyone can complete passes with time and space. True quality shows when opponents press:
# Analyze passing under pressure
# Start from all attempted passes: the `passes` table above holds completed
# passes only, which would make every completion rate 100%
pressure_passing <- events %>%
  filter(type.name == "Pass") %>%
  mutate(
    under_pressure = !is.na(under_pressure) & under_pressure == TRUE
  ) %>%
  group_by(player.name, team.name) %>%
  summarise(
    total_passes = n(),
    pressured_passes = sum(under_pressure),
    pressured_completed = sum(under_pressure & is.na(pass.outcome.name)),
    unpressured_passes = sum(!under_pressure),
    unpressured_completed = sum(!under_pressure & is.na(pass.outcome.name)),
    .groups = "drop"
  ) %>%
  filter(pressured_passes >= 20) %>%
  mutate(
    pressured_completion = round(pressured_completed / pressured_passes * 100, 1),
    unpressured_completion = round(unpressured_completed / unpressured_passes * 100, 1),
    pressure_drop = unpressured_completion - pressured_completion
  ) %>%
  arrange(pressure_drop) # Lowest drop = best under pressure

print("Passing Under Pressure Analysis:")
print(head(pressure_passing %>%
             select(player.name, pressured_passes,
                    pressured_completion, unpressured_completion,
                    pressure_drop), 15))
Pass Type Analysis
Different pass types serve different tactical purposes. Analyzing the mix reveals playing style:
# Analyze pass types
library(tidyr) # for pivot_wider()

# Start from all attempted passes so completion rates are meaningful
pass_type_analysis <- events %>%
  filter(type.name == "Pass") %>%
  mutate(
    start_x = location.x,
    start_y = location.y,
    end_x = pass.end_location.x,
    end_y = pass.end_location.y,
    forward_progress = end_x - start_x,
    pass_category = case_when(
      pass.through_ball == TRUE ~ "Through Ball",
      pass.cross == TRUE ~ "Cross",
      pass.switch == TRUE ~ "Switch",
      pass.cut_back == TRUE ~ "Cutback",
      sqrt((end_x - start_x)^2 + (end_y - start_y)^2) > 30 ~ "Long Ball",
      forward_progress > 5 ~ "Forward",
      forward_progress < -5 ~ "Backward",
      TRUE ~ "Lateral"
    )
  ) %>%
  group_by(team.name, pass_category) %>%
  summarise(
    passes = n(),
    completion_rate = mean(is.na(pass.outcome.name)) * 100,
    .groups = "drop"
  )

# Pivot for team comparison
team_pass_style <- pass_type_analysis %>%
  group_by(team.name) %>%
  mutate(pct = passes / sum(passes) * 100) %>%
  select(team.name, pass_category, pct) %>%
  pivot_wider(names_from = pass_category, values_from = pct, values_fill = 0)

print("Team Pass Style (% of each type):")
print(head(team_pass_style, 10))
Chapter Summary
Key Takeaways
- Progressive passes move the ball 10+ yards toward goal
- Combine passes and carries for total ball progression
- Pass networks reveal team structure and key connectors
- Betweenness centrality identifies midfield hubs
- Pressure resistance separates elite from average
- Pass type mix defines team style (direct vs. possession)
Practice Exercises
Put your passing analytics knowledge into practice with these exercises.
Exercise 9.1: Progressive Passing Analysis
Task: Identify the top ball progressors in a tournament by analyzing both progressive passes and progressive carries. Create a combined metric that ranks players by total ball progression.
Definition: A progressive action moves the ball at least 10 yards toward the opponent's goal and ends in a more advanced position (beyond the 40-yard line).
# Exercise 9.1 Solution: Progressive Ball Movement
library(StatsBombR)
library(dplyr)

# Load World Cup data
comps <- FreeCompetitions() %>%
  filter(competition_id == 43, season_id == 106)
matches <- FreeMatches(comps)
events <- free_allevents(MatchesDF = matches)
# Unpack nested locations into columns such as location.x and pass.end_location.x
events <- allclean(events)

# Progressive passes
prog_passes <- events %>%
  filter(type.name == "Pass", is.na(pass.outcome.name)) %>%
  mutate(
    start_x = location.x,
    end_x = pass.end_location.x,
    progress = end_x - start_x,
    is_progressive = progress >= 10 & end_x > start_x & end_x >= 40
  ) %>%
  filter(is_progressive) %>%
  select(player.name, team.name, match_id, progress, type = type.name)

# Progressive carries
prog_carries <- events %>%
  filter(type.name == "Carry") %>%
  mutate(
    start_x = location.x,
    end_x = carry.end_location.x,
    progress = end_x - start_x,
    is_progressive = progress >= 10 & end_x > start_x & end_x >= 40
  ) %>%
  filter(is_progressive) %>%
  select(player.name, team.name, match_id, progress, type = type.name)

# Combine and aggregate
all_progressive <- bind_rows(prog_passes, prog_carries)

ball_progressors <- all_progressive %>%
  group_by(player.name, team.name) %>%
  summarise(
    matches = n_distinct(match_id),
    prog_passes = sum(type == "Pass"),
    prog_carries = sum(type == "Carry"),
    total_progressive = n(),
    total_yards_progressed = sum(progress),
    .groups = "drop"
  ) %>%
  mutate(
    # Per match, a rough stand-in for per-90 (minutes are not accounted for)
    prog_per_90 = round(total_progressive / matches, 2),
    yards_per_90 = round(total_yards_progressed / matches, 1),
    # The 0.1 offset avoids division by zero for players with no progressive carries
    pass_carry_ratio = round(prog_passes / (prog_carries + 0.1), 2)
  ) %>%
  filter(matches >= 4) %>%
  arrange(desc(prog_per_90))

print("Top Ball Progressors:")
print(head(ball_progressors, 15))

# Categorize style
ball_progressors <- ball_progressors %>%
  mutate(
    style = case_when(
      pass_carry_ratio > 5 ~ "Pass Progressor",
      pass_carry_ratio < 2 ~ "Carry Progressor",
      TRUE ~ "Balanced"
    )
  )

cat("\nProgression Style Distribution:\n")
print(table(ball_progressors$style))
Exercise 9.2: Build a Pass Network
Task: Build and visualize a pass network for a team. Calculate centrality metrics (degree, betweenness) to identify the most influential players in the team's passing structure.
Requirements:
- Plot player nodes at their average positions
- Draw edges between players who passed to each other (min. 3 passes)
- Size nodes by degree centrality
- Identify the player with highest betweenness (the "hub")
# Exercise 9.2 Solution: Pass Network Visualization
library(StatsBombR)
library(dplyr)
library(igraph)
library(ggplot2)
library(ggsoccer)

# Select the first match in the data that this team actually played
team_name <- "Argentina"
selected_match_id <- events %>%
  filter(team.name == team_name) %>%
  pull(match_id) %>%
  first()

match_events <- events %>%
  filter(match_id == selected_match_id)

# Get completed passes with recipients
team_passes <- match_events %>%
  filter(team.name == team_name,
         type.name == "Pass",
         is.na(pass.outcome.name),
         !is.na(pass.recipient.name))

# Count pass pairs
pass_pairs <- team_passes %>%
  group_by(player.name, pass.recipient.name) %>%
  summarise(passes = n(), .groups = "drop") %>%
  filter(passes >= 3)

# Average positions
avg_pos <- match_events %>%
  filter(team.name == team_name,
         !is.na(player.name),
         !is.na(location.x)) %>%
  group_by(player.name) %>%
  summarise(
    x = mean(location.x),
    y = mean(location.y),
    touches = n()
  )

# Build network
g <- graph_from_data_frame(pass_pairs, vertices = avg_pos, directed = FALSE)
# Merge A-to-B and B-to-A into one link so degree counts unique partners
g <- igraph::simplify(g, edge.attr.comb = "sum")

# Calculate centrality
V(g)$degree <- degree(g)
V(g)$betweenness <- betweenness(g, normalized = TRUE)

# Find hub player
hub_player <- V(g)$name[which.max(V(g)$betweenness)]
cat("Team Hub (highest betweenness):", hub_player, "\n")

# Prepare for ggplot
edges_df <- pass_pairs %>%
  left_join(avg_pos, by = c("player.name")) %>%
  rename(x_start = x, y_start = y) %>%
  left_join(avg_pos, by = c("pass.recipient.name" = "player.name")) %>%
  rename(x_end = x, y_end = y)

nodes_df <- data.frame(
  player = V(g)$name,
  x = avg_pos$x[match(V(g)$name, avg_pos$player.name)],
  y = avg_pos$y[match(V(g)$name, avg_pos$player.name)],
  degree = V(g)$degree,
  betweenness = V(g)$betweenness
)

# Plot (StatsBomb pitch dimensions, 120 x 80, to match the event coordinates)
ggplot() +
  annotate_pitch(dimensions = pitch_statsbomb,
                 colour = "white", fill = "#1B5E20") +
  geom_segment(data = edges_df,
               aes(x = x_start, y = y_start, xend = x_end, yend = y_end,
                   linewidth = passes),
               color = "white", alpha = 0.6) +
  geom_point(data = nodes_df,
             aes(x = x, y = y, size = degree),
             color = "#75AADB", alpha = 0.9) +
  geom_text(data = nodes_df,
            aes(x = x, y = y + 3,
                label = gsub(".* ", "", player)), # Last name
            color = "white", size = 3) +
  scale_size_continuous(range = c(5, 15)) +
  scale_linewidth_continuous(range = c(0.5, 3)) +
  labs(title = paste(team_name, "Pass Network"),
       subtitle = paste("Hub:", hub_player)) +
  theme_pitch() +
  coord_flip(xlim = c(0, 120), ylim = c(0, 80))

ggsave("pass_network.png", width = 12, height = 8)
Exercise 9.3: Pressure Resistance Score
Task: Create a "Pressure Resistance Score" that measures how well players maintain passing quality when under pressure. Compare completion rates under pressure vs. without pressure.
Scoring:
- Calculate pressured completion rate and unpressured rate
- Pressure Drop = Unpressured Rate - Pressured Rate
- Lower drop = better pressure resistance
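The scoring arithmetic can be checked on a pair of made-up players before running it on real data. The counts below are invented for illustration only.

```r
# Made-up pass counts for two hypothetical players.
toy_pressure <- data.frame(
  player                = c("Player A", "Player B"),
  pressured_attempts    = c(40, 50),
  pressured_completed   = c(34, 35),
  unpressured_attempts  = c(200, 150),
  unpressured_completed = c(180, 135)
)

# Completion rates in percent
toy_pressure$pressured_rate <-
  100 * toy_pressure$pressured_completed / toy_pressure$pressured_attempts
toy_pressure$unpressured_rate <-
  100 * toy_pressure$unpressured_completed / toy_pressure$unpressured_attempts

# Pressure Drop = Unpressured Rate - Pressured Rate
toy_pressure$pressure_drop <-
  toy_pressure$unpressured_rate - toy_pressure$pressured_rate

print(toy_pressure[, c("player", "pressured_rate",
                       "unpressured_rate", "pressure_drop")])
```

Both players complete 90% of unpressured passes, but Player A drops 5 points under pressure (to 85%) while Player B drops 20 (to 70%), so Player A is the more pressure-resistant passer by this measure.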
# Exercise 9.3 Solution: Pressure Resistance Score
library(StatsBombR)
library(dplyr)
library(ggplot2)

# Analyze passes with pressure info (all attempts, completed or not)
passes <- events %>%
  filter(type.name == "Pass") %>%
  mutate(
    under_pressure = !is.na(under_pressure) & under_pressure == TRUE,
    completed = is.na(pass.outcome.name)
  )

# Calculate pressure resistance by player
pressure_resistance <- passes %>%
  group_by(player.name, team.name) %>%
  summarise(
    matches = n_distinct(match_id),
    total_passes = n(),
    # Pressured stats
    pressured_attempts = sum(under_pressure),
    pressured_completed = sum(under_pressure & completed),
    # Unpressured stats
    unpressured_attempts = sum(!under_pressure),
    unpressured_completed = sum(!under_pressure & completed),
    .groups = "drop"
  ) %>%
  filter(pressured_attempts >= 20, total_passes >= 100) %>%
  mutate(
    pressured_rate = round(pressured_completed / pressured_attempts * 100, 1),
    unpressured_rate = round(unpressured_completed / unpressured_attempts * 100, 1),
    pressure_drop = round(unpressured_rate - pressured_rate, 1),
    # Pressure resistance score (inverted: lower drop = higher score)
    resistance_score = round(100 - pressure_drop, 1),
    # Rating
    rating = case_when(
      pressure_drop < 5 ~ "Elite",
      pressure_drop < 10 ~ "Good",
      pressure_drop < 15 ~ "Average",
      TRUE ~ "Poor"
    )
  ) %>%
  arrange(pressure_drop)

print("Pressure Resistance Leaders (lowest drop):")
print(head(pressure_resistance %>%
             select(player.name, pressured_rate, unpressured_rate,
                    pressure_drop, rating), 15))

# Visualization
ggplot(pressure_resistance, aes(x = unpressured_rate, y = pressured_rate)) +
  geom_point(aes(color = rating), size = 3, alpha = 0.7) +
  # Dashed line: pressured rate equals unpressured rate (no drop)
  geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
  # Dotted line: a 10-point drop under pressure
  geom_abline(slope = 1, intercept = -10, linetype = "dotted", color = "red") +
  scale_color_manual(values = c("Elite" = "#1B5E20", "Good" = "#4CAF50",
                                "Average" = "#FFC107", "Poor" = "#F44336")) +
  labs(
    title = "Pressure Resistance Analysis",
    subtitle = "Dashed line = no drop under pressure; dotted red line = 10-point drop",
    x = "Unpressured Completion %",
    y = "Pressured Completion %",
    color = "Rating"
  ) +
  theme_minimal()

ggsave("pressure_resistance.png", width = 10, height = 8)
Next: Defensive Analytics
Learn to evaluate defenders with possession-adjusted metrics, pressing analysis, and defensive value models.