Capstone - Complete Analytics System
The Pitch as Your Canvas
Data visualization transforms raw numbers into compelling stories. In football analytics, the pitch itself becomes our primary canvas—a 105m × 68m space where every pass, shot, and movement can be mapped, colored, and analyzed.
Great football visualizations do more than display data—they reveal patterns invisible in spreadsheets, communicate complex tactical concepts instantly, and make analytics accessible to coaches, players, and fans alike. From Opta's iconic chalkboards to modern expected goals timelines, visualization has become the language of football analytics.
Why Visualization Matters
- Pattern Recognition: See pressing triggers, passing lanes, and defensive vulnerabilities
- Communication: Explain complex analysis to non-technical stakeholders
- Storytelling: Build narratives around match events and player performances
- Discovery: Uncover insights that statistics alone might miss
Visualization Libraries We'll Use
Both R and Python have excellent libraries for football visualization.
Python:
- mplsoccer: Purpose-built pitch plotting on top of matplotlib
- matplotlib: Foundation for all visualizations
- seaborn: Statistical visualizations
- plotly: Interactive charts
R:
- ggsoccer: ggplot2 extension for pitches
- ggplot2: Grammar of graphics foundation
- ggrepel: Smart label positioning
- patchwork: Combining multiple plots
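# Install required packages
install.packages(c("ggplot2", "ggsoccer", "ggrepel", "patchwork", "dplyr"))
# StatsBombR is not on CRAN; install it from GitHub
# install.packages("devtools")
# devtools::install_github("statsbomb/StatsBombR")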
# Load libraries
library(ggplot2)
library(ggsoccer)
library(ggrepel)
library(patchwork)
library(dplyr)
# Also useful for StatsBomb data
library(StatsBombR)
# Check ggsoccer version
packageVersion("ggsoccer")
Drawing the Pitch
Before plotting any data, we need to understand how to draw a football pitch. Both libraries handle the complexity of penalty areas, center circles, and goal areas automatically.
Basic Pitch Creation
# Basic pitch with ggsoccer
ggplot() +
annotate_pitch(colour = "white",
fill = "springgreen4") +
theme_pitch() +
coord_flip() +
ggtitle("Standard Football Pitch")
# Horizontal orientation
ggplot() +
annotate_pitch(colour = "white",
fill = "#1a472a") +
theme_pitch() +
ggtitle("Horizontal Pitch View")
# Half pitch (useful for attacking analysis)
ggplot() +
annotate_pitch(colour = "white",
fill = "springgreen4") +
theme_pitch() +
coord_flip(xlim = c(50, 100)) +
ggtitle("Attacking Half Only")
Coordinate Systems
Different data providers use different coordinate systems. Understanding these is crucial for accurate plotting:
| Provider | X Range | Y Range | Origin | Notes |
|---|---|---|---|---|
| StatsBomb | 0-120 | 0-80 | Top-left | Y inverted (0 at top) |
| Opta | 0-100 | 0-100 | Bottom-left | Percentage-based |
| Wyscout | 0-100 | 0-100 | Top-left | Y inverted from Opta |
| UEFA | 0-105 | 0-68 | Bottom-left | Real meters |
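One way to keep these conventions straight is to convert every provider's coordinates to metres with a single bottom-left origin. The sketch below is a minimal base R illustration, not a library function: `provider_specs` and `to_metres()` are hypothetical names, and it assumes StatsBomb and Wyscout measure y from the top touchline while Opta measures it from the bottom.

```r
# Hypothetical lookup of provider coordinate systems (assumed values).
# y_top_origin = TRUE means y = 0 lies on the top touchline.
provider_specs <- list(
  statsbomb = list(x_max = 120, y_max = 80,  y_top_origin = TRUE),
  opta      = list(x_max = 100, y_max = 100, y_top_origin = FALSE),
  wyscout   = list(x_max = 100, y_max = 100, y_top_origin = TRUE)
)

# Convert provider coordinates to metres on a 105 x 68 pitch,
# with the origin at the bottom-left corner.
to_metres <- function(x, y, provider, length_m = 105, width_m = 68) {
  spec <- provider_specs[[provider]]
  if (is.null(spec)) stop("Unknown provider: ", provider)
  y_frac <- y / spec$y_max
  if (spec$y_top_origin) y_frac <- 1 - y_frac  # flip so y grows upward
  data.frame(x = x / spec$x_max * length_m, y = y_frac * width_m)
}

# The same physical location expressed in two systems
sb <- to_metres(108, 20, "statsbomb")
op <- to_metres(90, 75, "opta")
```

Both calls should land on the same point in metres, which makes this a quick sanity check whenever data from two providers is combined.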
# Converting coordinates example
# Opta (0-100, y = 0 at bottom) to StatsBomb (0-120 x 0-80, y = 0 at top)
convert_opta_to_statsbomb <- function(x, y) {
  new_x <- x * 1.2 # 100 -> 120
  new_y <- (100 - y) * 0.8 # flip the y-axis, then 100 -> 80
  return(data.frame(x = new_x, y = new_y))
}
# StatsBomb to Opta
convert_statsbomb_to_opta <- function(x, y) {
  new_x <- x / 1.2 # 120 -> 100
  new_y <- 100 - y / 0.8 # 80 -> 100, then flip the y-axis back
  return(data.frame(x = new_x, y = new_y))
}
# Example usage
opta_shot <- data.frame(x = 88, y = 50)
sb_coords <- convert_opta_to_statsbomb(opta_shot$x, opta_shot$y)
print(sb_coords) # x: 105.6, y: 40
Pitch Customization
# Custom pitch styling
# Dark theme pitch
dark_pitch <- ggplot() +
annotate_pitch(colour = "#cccccc",
fill = "#1a1a2e") +
theme_pitch() +
theme(panel.background = element_rect(fill = "#1a1a2e"),
plot.background = element_rect(fill = "#1a1a2e"),
plot.title = element_text(color = "white")) +
ggtitle("Dark Theme Pitch")
# Team colors pitch (Manchester City)
city_pitch <- ggplot() +
annotate_pitch(colour = "white",
fill = "#6CABDD") +
theme_pitch() +
ggtitle("Manchester City Theme")
# Draw the pitch in another provider's coordinate system
# (ggsoccer ships pitch_opta, pitch_statsbomb, and pitch_wyscout)
wyscout_pitch <- ggplot() +
annotate_pitch(colour = "white",
fill = "#228B22",
dimensions = pitch_wyscout) +
theme_pitch() +
ggtitle("Wyscout Dimensions")
Creating Shot Maps
Shot maps are perhaps the most iconic football visualization. They show where shots were taken, their outcomes, and increasingly, their expected goals (xG) values.
Basic Shot Map
# Load StatsBomb data
library(StatsBombR)
# Get shots from a match
matches <- FreeMatches(Competitions = FreeCompetitions())
# allclean() unpacks the raw events into columns such as location.x and location.y
events <- allclean(get.matchFree(matches[1, ]))
# Filter for shots
shots <- events %>%
filter(type.name == "Shot") %>%
select(location.x, location.y, shot.outcome.name,
shot.statsbomb_xg, player.name)
# Basic shot map
ggplot(shots) +
annotate_pitch(colour = "white", fill = "springgreen4",
dimensions = pitch_statsbomb) + # match the 120 x 80 StatsBomb data
geom_point(aes(x = location.x, y = location.y,
color = shot.outcome.name),
size = 4, alpha = 0.8) +
scale_color_manual(values = c("Goal" = "yellow",
"Saved" = "white",
"Off T" = "red",
"Blocked" = "orange",
"Post" = "purple")) +
theme_pitch() +
coord_flip(xlim = c(60, 120)) +
labs(title = "Shot Map",
color = "Outcome") +
theme(legend.position = "bottom")
xG Shot Map with Size Encoding
Professional shot maps encode xG values through point size—larger circles represent higher quality chances:
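To see why size encoding should be read as area rather than radius, consider the small base R sketch below. It uses a hypothetical helper, `xg_to_radius()`, that is not part of any library; ggplot2's `scale_size_continuous()` likewise maps values to point area, though its `range` argument sets a non-zero minimum, so the proportionality there is only approximate.

```r
# Radius that makes circle AREA proportional to xG
# (radius grows with the square root of the value).
xg_to_radius <- function(xg, max_radius = 6) {
  max_radius * sqrt(xg / max(xg))
}

xg <- c(0.1, 0.4)          # a low-quality and a high-quality chance
radii <- xg_to_radius(xg)  # radii of the two circles

# Area scales with radius squared, so the 0.4 xG shot covers
# four times the area of the 0.1 xG shot.
area_ratio <- (radii[2] / radii[1])^2
```

Scaling the radius directly would instead make the better chance look sixteen times larger, which overstates the difference.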
# xG shot map with sized points
shots_with_xg <- shots %>%
filter(!is.na(shot.statsbomb_xg))
ggplot(shots_with_xg) +
annotate_pitch(colour = "white", fill = "#1a472a",
dimensions = pitch_statsbomb) +
geom_point(aes(x = location.x, y = location.y,
size = shot.statsbomb_xg,
color = shot.outcome.name == "Goal"),
alpha = 0.7) +
scale_size_continuous(range = c(2, 12),
name = "xG Value") +
scale_color_manual(values = c("FALSE" = "#CCCCCC",
"TRUE" = "#FFD700"),
labels = c("No Goal", "Goal"),
name = "Result") +
theme_pitch() +
coord_flip(xlim = c(60, 120)) +
labs(title = "xG Shot Map",
subtitle = "Point size represents expected goals value") +
theme(legend.position = "right",
plot.title = element_text(size = 16, face = "bold"),
plot.subtitle = element_text(size = 12))
# Add total xG annotation
total_xg <- sum(shots_with_xg$shot.statsbomb_xg, na.rm = TRUE)
goals <- sum(shots_with_xg$shot.outcome.name == "Goal")
Professional Shot Map with Annotations
# Professional-style shot map
create_shot_map <- function(shots_df, team_name, match_info) {
# Calculate statistics
total_shots <- nrow(shots_df)
goals <- sum(shots_df$shot.outcome.name == "Goal")
total_xg <- sum(shots_df$shot.statsbomb_xg, na.rm = TRUE)
# Create plot
p <- ggplot(shots_df) +
annotate_pitch(colour = "#808080", fill = "#1a1a2e",
dimensions = pitch_statsbomb) +
# Non-goals
geom_point(data = filter(shots_df, shot.outcome.name != "Goal"),
aes(x = location.x, y = location.y,
size = shot.statsbomb_xg),
color = "#666666", alpha = 0.7) +
# Goals with highlight
geom_point(data = filter(shots_df, shot.outcome.name == "Goal"),
aes(x = location.x, y = location.y,
size = shot.statsbomb_xg),
color = "#FFD700", alpha = 0.9) +
geom_point(data = filter(shots_df, shot.outcome.name == "Goal"),
aes(x = location.x, y = location.y,
size = shot.statsbomb_xg * 1.5),
color = "#FFD700", alpha = 0.3) + # Glow effect
scale_size_continuous(range = c(3, 15), guide = "none") +
theme_pitch() +
coord_flip(xlim = c(60, 122)) +
# Title and annotations
labs(title = team_name,
subtitle = match_info) +
# Stats box
annotate("text", x = 65, y = 5,
label = paste0("Shots: ", total_shots),
color = "white", hjust = 0, size = 4) +
annotate("text", x = 65, y = 75,
label = paste0("xG: ", round(total_xg, 2)),
color = "white", hjust = 1, size = 4) +
annotate("text", x = 62, y = 40,
label = paste0("Goals: ", goals),
color = "#FFD700", hjust = 0.5, size = 5,
fontface = "bold") +
theme(plot.background = element_rect(fill = "#1a1a2e"),
plot.title = element_text(color = "white", size = 18,
face = "bold", hjust = 0.5),
plot.subtitle = element_text(color = "#888888", size = 12,
hjust = 0.5))
return(p)
}
# Usage
shot_map <- create_shot_map(shots, "England", "vs Sweden - Quarter Final")
Pass Maps and Networks
Pass maps visualize the flow of ball movement, revealing team structure, key passing lanes, and player involvement in buildup play.
Individual Pass Map
# Individual player pass map
player_passes <- events %>%
filter(type.name == "Pass",
player.name == "Kevin De Bruyne") %>%
filter(!is.na(pass.end_location.x))
# Create pass map
ggplot(player_passes) +
annotate_pitch(colour = "white", fill = "#1a472a",
dimensions = pitch_statsbomb) +
# Draw passes as arrows
geom_segment(aes(x = location.x, y = location.y,
xend = pass.end_location.x,
yend = pass.end_location.y,
color = pass.outcome.name),
arrow = arrow(length = unit(0.15, "cm")),
alpha = 0.7, linewidth = 0.8) +
scale_color_manual(values = c("Complete" = "#98FB98",
"Incomplete" = "#FF6B6B"),
na.value = "#98FB98") +
theme_pitch() +
coord_flip() +
labs(title = "Kevin De Bruyne - Pass Map",
subtitle = paste(nrow(player_passes), "passes attempted"),
color = "Pass Result") +
theme(legend.position = "bottom")
Pass Network
Pass networks show connections between players, revealing team structure and key relationships:
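Before the full plot, the underlying bookkeeping can be illustrated on a toy pass log in base R. The players and passes below are made up, and "involvement" (passes made plus passes received) is just one simple, assumed way to define the most connected player.

```r
# Toy pass log (hypothetical players), one row per completed pass
passes <- data.frame(
  passer    = c("A", "A", "B", "A", "C", "B", "A"),
  recipient = c("B", "B", "C", "C", "A", "A", "B"),
  stringsAsFactors = FALSE
)

# Edge weights: number of passes for each passer -> recipient pair
edges <- aggregate(n ~ passer + recipient,
                   data = transform(passes, n = 1), FUN = sum)
edges <- edges[order(-edges$n), ]  # strongest connection first

# Involvement per player: passes made plus passes received
involvement <- table(c(passes$passer, passes$recipient))
most_connected <- names(which.max(involvement))
```

The `edges` table corresponds to the line widths in a network plot, and `involvement` to the node sizes.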
# Calculate pass network
library(igraph)
# Get successful passes between players
pass_pairs <- events %>%
filter(type.name == "Pass",
is.na(pass.outcome.name), # Complete passes
team.name == "England") %>%
select(player.name, pass.recipient.name) %>%
filter(!is.na(pass.recipient.name)) %>%
group_by(player.name, pass.recipient.name) %>%
summarise(passes = n(), .groups = "drop")
# Get average positions
avg_positions <- events %>%
filter(team.name == "England",
!is.na(location.x)) %>%
group_by(player.name) %>%
summarise(
x = mean(location.x, na.rm = TRUE),
y = mean(location.y, na.rm = TRUE),
touches = n()
)
# Create network plot
ggplot() +
annotate_pitch(colour = "white", fill = "#1a472a",
dimensions = pitch_statsbomb) +
# Draw edges (passes)
geom_segment(data = pass_pairs %>%
left_join(avg_positions, by = c("player.name" = "player.name")) %>%
left_join(avg_positions, by = c("pass.recipient.name" = "player.name"),
suffix = c("", "_end")),
aes(x = x, y = y, xend = x_end, yend = y_end,
linewidth = passes),
alpha = 0.6, color = "white") +
# Draw nodes (players)
geom_point(data = avg_positions,
aes(x = x, y = y, size = touches),
color = "#FFD700", alpha = 0.9) +
# Player labels
geom_text(data = avg_positions,
aes(x = x, y = y - 3, label = player.name),
color = "white", size = 2.5) +
scale_linewidth_continuous(range = c(0.5, 4)) +
scale_size_continuous(range = c(4, 15)) +
theme_pitch() +
coord_flip() +
labs(title = "England Pass Network") +
theme(legend.position = "none")
Progressive Pass Map
Progressive passes move the ball significantly toward the opponent's goal. They're key indicators of attacking intent:
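As a quick illustration of the rule, the sketch below flags progressive passes from start and end x-coordinates alone. It assumes StatsBomb-style coordinates (attacking goal at x = 120) and the thresholds used in this section's code: at least 10 units gained and an end point at x >= 80. The helper `is_progressive()` is a hypothetical name, and other definitions of "progressive" exist.

```r
# Flag progressive passes on a 120-unit-long pitch (attacking toward x = 120)
is_progressive <- function(x_start, x_end, min_gain = 10, final_third_x = 80) {
  gain <- x_end - x_start  # positive when the ball moves toward goal
  gain >= min_gain & x_end >= final_third_x
}

# Four toy passes: start and end x-coordinates
x_start <- c(60, 75, 85, 50)
x_end   <- c(85, 82, 95, 70)
flags <- is_progressive(x_start, x_end)
```

The second pass gains too little ground and the fourth ends short of the final third, so only the first and third qualify.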
# Define progressive pass criteria
# A pass is progressive if it moves the ball at least 10 yards
# toward the opponent goal and ends in the final third
progressive_passes <- events %>%
filter(type.name == "Pass",
is.na(pass.outcome.name)) %>% # Complete passes
mutate(
# Calculate progression (toward goal at x=120)
start_dist = 120 - location.x,
end_dist = 120 - pass.end_location.x,
progression = start_dist - end_dist,
# Progressive if moved 10+ yards forward and ends beyond x=80
is_progressive = progression >= 10 & pass.end_location.x >= 80
) %>%
filter(is_progressive)
# Plot progressive passes
ggplot(progressive_passes) +
annotate_pitch(colour = "white", fill = "#1a472a",
dimensions = pitch_statsbomb) +
geom_segment(aes(x = location.x, y = location.y,
xend = pass.end_location.x,
yend = pass.end_location.y,
color = progression),
arrow = arrow(length = unit(0.2, "cm")),
linewidth = 1, alpha = 0.8) +
scale_color_gradient(low = "#90EE90", high = "#FF4500",
name = "Yards Gained") +
theme_pitch() +
coord_flip() +
labs(title = "Progressive Passes into Final Third",
subtitle = paste(nrow(progressive_passes), "progressive passes"))
Heat Maps and Touch Maps
Heat maps show density of actions across the pitch, revealing where players or teams concentrate their activity.
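At its simplest, a heat map is a count of actions per pitch zone. The base R sketch below bins a few made-up action locations on a 120 x 80 pitch into 20 x 20 zones; the zone size and the sample points are assumptions chosen for illustration.

```r
# Toy action locations on a 120 x 80 pitch (hypothetical values)
x <- c(0, 15, 35, 35, 110, 120)
y <- c(0, 10, 45, 50, 70, 80)

# Assign each action to a 20 x 20 zone.
# include.lowest = TRUE keeps actions exactly at 0 in the first bin.
zone_x <- cut(x, breaks = seq(0, 120, 20), include.lowest = TRUE, labels = FALSE)
zone_y <- cut(y, breaks = seq(0, 80, 20), include.lowest = TRUE, labels = FALSE)

# Count actions per zone, keeping empty zones in the grid
counts <- table(zone_x = factor(zone_x, levels = 1:6),
                zone_y = factor(zone_y, levels = 1:4))
density_share <- counts / sum(counts)  # share of all actions in each zone
```

Functions such as `geom_bin_2d()` perform this kind of counting internally; smoothed heat maps replace the hard zone edges with a kernel density estimate.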
Basic Heat Map
# Player action heat map
player_actions <- events %>%
filter(player.name == "Lionel Messi",
!is.na(location.x),
!is.na(location.y))
# Using stat_density_2d for heat map
ggplot(player_actions) +
annotate_pitch(colour = "white", fill = "#1a472a",
dimensions = pitch_statsbomb) +
stat_density_2d(aes(x = location.x, y = location.y,
fill = after_stat(level)),
geom = "polygon", alpha = 0.7,
bins = 10) +
scale_fill_gradient(low = "transparent", high = "#FF6B6B") +
theme_pitch() +
coord_flip() +
labs(title = "Lionel Messi - Action Heat Map") +
theme(legend.position = "none")
# Alternative: using geom_bin_2d for discrete cells
ggplot(player_actions) +
annotate_pitch(colour = "white", fill = "#1a472a",
dimensions = pitch_statsbomb) +
geom_bin_2d(aes(x = location.x, y = location.y),
binwidth = c(10, 10), alpha = 0.8) +
scale_fill_gradient(low = "#FFFF00", high = "#FF0000") +
theme_pitch() +
coord_flip() +
labs(title = "Messi Action Density (Grid)") +
theme(legend.position = "right")
Touch Map
# Touch map with action types
player_touches <- events %>%
filter(player.name == "Mohamed Salah",
!is.na(location.x)) %>%
mutate(action_group = case_when(
type.name == "Shot" ~ "Shots",
type.name == "Pass" ~ "Passes",
type.name == "Dribble" ~ "Dribbles",
type.name == "Ball Receipt*" ~ "Receives",
TRUE ~ "Other"
)) %>%
filter(action_group != "Other")
# Touch map by action type
ggplot(player_touches) +
annotate_pitch(colour = "white", fill = "#1a472a",
dimensions = pitch_statsbomb) +
geom_point(aes(x = location.x, y = location.y,
color = action_group, shape = action_group),
size = 3, alpha = 0.7) +
scale_color_manual(values = c("Shots" = "#FFD700",
"Passes" = "#87CEEB",
"Dribbles" = "#98FB98",
"Receives" = "#DDA0DD")) +
theme_pitch() +
coord_flip() +
labs(title = "Mohamed Salah - Touch Map",
subtitle = paste(nrow(player_touches), "total actions"),
color = "Action Type", shape = "Action Type") +
theme(legend.position = "bottom")
Zone Control Heat Map
# Team zone control / territorial dominance
team_actions <- events %>%
filter(!is.na(location.x),
type.name %in% c("Pass", "Carry", "Dribble",
"Ball Receipt*", "Shot"))
# Calculate actions per zone
zones <- team_actions %>%
mutate(
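# include.lowest = TRUE keeps actions at exactly x = 0 or y = 0 in the first zone
zone_x = cut(location.x, breaks = seq(0, 120, 20), labels = FALSE,
include.lowest = TRUE),
zone_y = cut(location.y, breaks = seq(0, 80, 20), labels = FALSE,
include.lowest = TRUE)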
) %>%
group_by(team.name, zone_x, zone_y) %>%
summarise(actions = n(), .groups = "drop") %>%
group_by(zone_x, zone_y) %>%
mutate(
total = sum(actions),
control = actions / total
) %>%
ungroup()
# Plot for one team
team_control <- zones %>%
filter(team.name == "England")
ggplot(team_control) +
annotate_pitch(colour = "white", fill = "#333333",
dimensions = pitch_statsbomb) +
geom_tile(aes(x = (zone_x - 0.5) * 20,
y = (zone_y - 0.5) * 20,
fill = control),
width = 18, height = 18, alpha = 0.8) +
scale_fill_gradient2(low = "#1E90FF", mid = "#333333",
high = "#FF4500",
midpoint = 0.5,
labels = scales::percent) +
theme_pitch() +
coord_flip() +
labs(title = "England - Zone Control",
fill = "Possession %")
xG Timeline Charts
xG timelines show how a match evolved, revealing momentum shifts, dominant periods, and comparing actual goals to expected outcomes.
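The core of an xG timeline is a running total of shot xG per team. The base R sketch below shows that calculation on a handful of invented shots; the team labels and xG values are placeholders, not real match data.

```r
# Toy shot log (hypothetical values)
shots <- data.frame(
  team   = c("Home", "Away", "Home", "Away", "Home"),
  minute = c(12, 30, 44, 67, 80),
  xg     = c(0.10, 0.35, 0.50, 0.05, 0.25),
  stringsAsFactors = FALSE
)

# Order by time, then accumulate xG within each team
shots <- shots[order(shots$minute), ]
shots$cumulative_xg <- ave(shots$xg, shots$team, FUN = cumsum)

# Final xG per team (the right-hand end of each timeline)
final_xg <- tapply(shots$xg, shots$team, sum)
```

Plotting `cumulative_xg` against `minute` as a step line gives the familiar staircase shape, with each step height equal to the xG of one shot.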
# xG Timeline (cumulative)
library(ggplot2)
library(dplyr)
# Get shots with timing
shots_timeline <- events %>%
filter(type.name == "Shot") %>%
arrange(minute) %>%
group_by(team.name) %>%
mutate(cumulative_xg = cumsum(shot.statsbomb_xg)) %>%
ungroup()
# Add a zero starting point for each team so every line begins at minute 0
timeline_data <- shots_timeline %>%
bind_rows(
data.frame(team.name = unique(shots_timeline$team.name),
minute = 0, cumulative_xg = 0)
) %>%
arrange(team.name, minute)
# Get goals for markers
goals <- shots_timeline %>%
filter(shot.outcome.name == "Goal")
# Create timeline plot
ggplot(timeline_data, aes(x = minute, y = cumulative_xg,
color = team.name)) +
geom_step(linewidth = 1.5) +
geom_point(data = goals, aes(x = minute, y = cumulative_xg),
size = 5, shape = 21, fill = "white", stroke = 2) +
# Add final xG annotations
geom_text(data = timeline_data %>%
group_by(team.name) %>%
slice_tail(n = 1),
aes(label = sprintf("%.2f xG", cumulative_xg)),
hjust = -0.2, fontface = "bold") +
scale_color_manual(values = c("England" = "#1E90FF",
"Sweden" = "#FFD700")) +
scale_x_continuous(breaks = seq(0, 90, 15),
limits = c(0, 100)) +
labs(title = "Match xG Timeline",
subtitle = "Circles indicate goals scored",
x = "Minute", y = "Cumulative xG",
color = "Team") +
theme_minimal() +
theme(
legend.position = "bottom",
plot.title = element_text(size = 16, face = "bold"),
panel.grid.minor = element_blank()
)
xG Match Summary Plot
# Complete xG match summary with shot strips
library(patchwork)
# Shot strip function
create_shot_strip <- function(shots_df, team_name, team_color) {
team_shots <- shots_df %>% filter(team.name == team_name)
ggplot(team_shots) +
geom_segment(aes(x = minute, xend = minute,
y = 0, yend = shot.statsbomb_xg),
color = team_color, linewidth = 3) +
geom_point(data = filter(team_shots, shot.outcome.name == "Goal"),
aes(x = minute, y = shot.statsbomb_xg),
size = 4, color = "white", shape = 21,
fill = team_color, stroke = 2) +
scale_x_continuous(limits = c(0, 95), breaks = seq(0, 90, 15)) +
scale_y_continuous(limits = c(0, 1)) +
labs(x = NULL, y = "xG") +
theme_minimal() +
theme(panel.grid.minor = element_blank())
}
# Create combined plot
shots_data <- events %>% filter(type.name == "Shot")
p1 <- create_shot_strip(shots_data, "England", "#1E90FF") +
ggtitle("England") +
theme(plot.title = element_text(hjust = 0.5))
p2 <- create_shot_strip(shots_data, "Sweden", "#FFD700") +
ggtitle("Sweden") +
scale_y_reverse(limits = c(1, 0)) + # Flip for opposition, keeping the 0-1 range
theme(plot.title = element_text(hjust = 0.5))
# Combine with patchwork
combined <- p1 / p2 +
plot_annotation(
title = "xG Match Summary",
subtitle = "Shot quality by minute (goals circled)",
theme = theme(plot.title = element_text(size = 18, face = "bold"))
)
print(combined)
Radar Charts for Player Profiles
Radar charts (also called spider charts) are excellent for comparing players across multiple metrics simultaneously. They're widely used in scouting and player analysis.
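Radar charts usually plot percentile ranks rather than raw values, so that metrics on different scales share one 0-100 axis. The sketch below shows one common way to compute them in base R with `ecdf()`; the six players and their per-90 figures are invented, and `percentile_rank()` is a hypothetical helper.

```r
# Hypothetical goals per 90 for six players at the same position
goals_p90 <- c(P1 = 0.10, P2 = 0.25, P3 = 0.40, P4 = 0.55, P5 = 0.65, P6 = 0.30)

# Percentile rank: percentage of the peer group at or below each value
percentile_rank <- function(x) {
  setNames(round(100 * ecdf(x)(x)), names(x))
}

pct <- percentile_rank(goals_p90)
```

With this definition the top player in the group always scores 100, so the choice of peer group (position, league, minutes threshold) matters as much as the formula.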
# Player comparison radar chart
library(ggplot2)
library(tidyr)
# Sample player statistics (per 90 minutes)
player_stats <- data.frame(
metric = c("Goals", "Assists", "Key Passes", "Dribbles",
"Tackles", "Interceptions", "Aerial Duels", "Pass %"),
Player_A = c(0.65, 0.25, 2.1, 3.2, 1.1, 0.8, 1.5, 85),
Player_B = c(0.45, 0.55, 3.4, 1.8, 0.6, 0.5, 0.9, 89),
# Percentile ranks (0-100)
Player_A_pct = c(92, 65, 70, 88, 45, 40, 60, 55),
Player_B_pct = c(78, 85, 92, 55, 25, 22, 35, 75)
)
# Prepare data for radar
radar_data <- player_stats %>%
select(metric, Player_A_pct, Player_B_pct) %>%
pivot_longer(cols = -metric, names_to = "player", values_to = "value") %>%
mutate(player = gsub("_pct", "", player))
# Create radar using coord_polar
ggplot(radar_data, aes(x = metric, y = value,
group = player, color = player)) +
geom_polygon(aes(fill = player), alpha = 0.2, linewidth = 1.5) +
geom_point(size = 3) +
coord_polar() +
scale_y_continuous(limits = c(0, 100)) +
scale_color_manual(values = c("Player_A" = "#E63946",
"Player_B" = "#457B9D")) +
scale_fill_manual(values = c("Player_A" = "#E63946",
"Player_B" = "#457B9D")) +
labs(title = "Player Comparison Radar",
subtitle = "Percentile ranks (per 90 minutes)") +
theme_minimal() +
theme(
axis.text.x = element_text(size = 10),
axis.text.y = element_blank(),
axis.title = element_blank(),
legend.position = "bottom",
plot.title = element_text(size = 16, face = "bold", hjust = 0.5)
)
Pizza Chart Alternative
Pizza charts are a modern alternative to radar charts, popularized by FBref and StatsBomb:
# Pizza chart in R
# This is a stylized bar chart in polar coordinates
create_pizza_chart <- function(stats_df, player_name) {
# stats_df should have: metric, value (percentile), category
ggplot(stats_df, aes(x = reorder(metric, value), y = value,
fill = category)) +
geom_bar(stat = "identity", width = 0.9) +
coord_polar(theta = "x") +
ylim(0, 100) +
# Add value labels
geom_text(aes(label = value), hjust = -0.3, size = 3) +
scale_fill_manual(values = c("Attacking" = "#E63946",
"Creative" = "#457B9D",
"Defensive" = "#2A9D8F")) +
labs(title = player_name,
subtitle = "Percentile Ranks vs. Position") +
theme_minimal() +
theme(
axis.text.y = element_blank(),
axis.title = element_blank(),
panel.grid = element_blank(),
legend.position = "bottom",
plot.title = element_text(hjust = 0.5, size = 16, face = "bold")
)
}
# Example data
player_pizza <- data.frame(
metric = c("Goals", "xG", "Shots", "Assists", "xA",
"Key Passes", "Tackles", "Interceptions", "Blocks"),
value = c(85, 78, 72, 45, 52, 68, 35, 42, 38),
category = c(rep("Attacking", 3), rep("Creative", 3), rep("Defensive", 3))
)
create_pizza_chart(player_pizza, "Marcus Rashford")
Visualization Best Practices
Do:
- Use consistent color schemes - team colors, semantic colors (goals = gold)
- Add context - include sample sizes, time periods, competition level
- Label clearly - titles, axes, legends should be self-explanatory
- Consider your audience - coaches need different details than fans
- Use appropriate chart types - shot maps for locations, timelines for match flow
- Include data sources - always credit StatsBomb, FBref, etc.
Don't:
- Overload with information - one visualization, one message
- Use misleading scales - always start y-axis at 0 for bar charts
- Ignore color blindness - use colorblind-safe palettes
- Compare incomparable data - different leagues, sample sizes
- Over-design - clarity beats aesthetics
- Forget mobile users - ensure readability at small sizes
Color Palettes for Football
# Colorblind-safe palettes
library(RColorBrewer)
# View available palettes
display.brewer.all(colorblindFriendly = TRUE)
# Good palettes for football
# Sequential: Blues, Greens, Oranges, Reds
# Diverging: RdYlBu, PuOr (avoid red-green palettes such as RdYlGn)
# Qualitative: Set2, Paired
# Custom football palette
football_colors <- c(
"goal" = "#FFD700", # Gold
"shot_saved" = "#FFFFFF", # White
"shot_blocked" = "#FFA500", # Orange
"shot_off" = "#FF6B6B", # Light red
"pass_complete" = "#90EE90", # Light green
"pass_incomplete" = "#FF6B6B",
"home_team" = "#1E90FF", # Blue
"away_team" = "#DC143C" # Red
)
# Usage in ggplot
scale_color_manual(values = football_colors)
Saving High-Quality Figures
# Save high-quality figures in R
library(ggplot2)
# Create your plot
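# Any ggplot object works; a blank pitch is used here as a placeholder
p <- ggplot() +
ggsoccer::annotate_pitch() +
ggsoccer::theme_pitch()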
# Save as PNG (for web)
ggsave("shot_map.png", p,
width = 12, height = 8, dpi = 300,
bg = "white")
# Save as SVG (for print/editing; requires the svglite package)
ggsave("shot_map.svg", p,
width = 12, height = 8)
# Save as PDF (for publications)
ggsave("shot_map.pdf", p,
width = 12, height = 8)
# For dark backgrounds
ggsave("shot_map_dark.png", p,
width = 12, height = 8, dpi = 300,
bg = "#1a1a2e")
Chapter Summary
Key Takeaways
- The pitch is your canvas - mplsoccer and ggsoccer make drawing pitches easy
- Coordinate systems matter - always know your data provider's system
- Shot maps tell stories - use size for xG, color for outcomes
- Pass networks reveal structure - connections between players show tactics
- Heat maps show density - where players operate, where teams dominate
- xG timelines capture match flow - momentum, dominance, crucial moments
- Radar charts compare players - multiple metrics at a glance
Practice Exercises
Exercise 4.1: Create a Team Shot Map
Task: Create a professional shot map for a World Cup team with xG-sized points, goal highlighting, and summary statistics.
# Exercise 4.1: Team Shot Map with xG
library(StatsBombR)
library(ggplot2)
library(ggsoccer)
library(dplyr)
# Load World Cup 2018 data
comps <- FreeCompetitions() %>%
filter(competition_id == 43, season_id == 3)
matches <- FreeMatches(comps)
# allclean() adds location.x / location.y and other derived columns
events <- allclean(free_allevents(MatchesDF = matches))
# Filter for Belgium shots
team_name <- "Belgium"
shots <- events %>%
filter(team.name == team_name, type.name == "Shot") %>%
mutate(is_goal = shot.outcome.name == "Goal")
# Calculate summary stats
total_xg <- sum(shots$shot.statsbomb_xg, na.rm = TRUE)
goals <- sum(shots$is_goal)
total_shots <- nrow(shots)
# Create shot map
ggplot(shots) +
annotate_pitch(colour = "#FFFFFF", fill = "#1a1a2e",
dimensions = pitch_statsbomb) +
# Non-goals
geom_point(data = filter(shots, !is_goal),
aes(x = location.x, y = location.y,
size = shot.statsbomb_xg),
color = "#666666", alpha = 0.7) +
# Goals with glow effect
geom_point(data = filter(shots, is_goal),
aes(x = location.x, y = location.y,
size = shot.statsbomb_xg * 1.5),
color = "#FFD700", alpha = 0.3) +
geom_point(data = filter(shots, is_goal),
aes(x = location.x, y = location.y,
size = shot.statsbomb_xg),
color = "#FFD700", alpha = 0.9) +
scale_size_continuous(range = c(3, 15), guide = "none") +
theme_pitch() +
coord_flip(xlim = c(60, 122)) +
# Summary annotation
annotate("text", x = 65, y = 40,
label = sprintf("xG: %.2f | Goals: %d | Shots: %d",
total_xg, goals, total_shots),
color = "white", size = 4) +
labs(title = paste(team_name, "- World Cup 2018 Shot Map"),
subtitle = "Gold = Goals | Size = xG value") +
theme(plot.background = element_rect(fill = "#1a1a2e"),
plot.title = element_text(color = "white", face = "bold", size = 16),
plot.subtitle = element_text(color = "#888888", size = 12))
ggsave("belgium_shotmap.png", width = 12, height = 8, dpi = 300)
Exercise 4.2: Build a Pass Network
Task: Create a pass network visualization with player positions, connection strengths, and identify the most connected player.
# Exercise 4.2: Pass Network Visualization
library(StatsBombR)
library(ggplot2)
library(ggsoccer)
library(dplyr)
# Get a single match
match_id <- matches$match_id[1]
match_events <- events %>% filter(match_id == !!match_id)
# Pick a team that actually played in this match (for example "France")
team_name <- unique(match_events$team.name)[1]
# Get completed passes with recipients
team_passes <- match_events %>%
filter(team.name == team_name,
type.name == "Pass",
is.na(pass.outcome.name),
!is.na(pass.recipient.name))
# Count pass pairs (min 3 passes for edge)
pass_pairs <- team_passes %>%
group_by(player.name, pass.recipient.name) %>%
summarise(passes = n(), .groups = "drop") %>%
filter(passes >= 3)
# Average positions
avg_pos <- match_events %>%
filter(team.name == team_name, !is.na(location.x)) %>%
group_by(player.name) %>%
summarise(x = mean(location.x), y = mean(location.y), touches = n())
# Join positions to edges
edges <- pass_pairs %>%
left_join(avg_pos, by = "player.name") %>%
left_join(avg_pos, by = c("pass.recipient.name" = "player.name"),
suffix = c("", "_end"))
# Find most connected player
most_connected <- avg_pos %>%
left_join(
pass_pairs %>%
group_by(player.name) %>%
summarise(connections = sum(passes)),
by = "player.name"
) %>%
arrange(desc(connections)) %>%
slice(1)
# Create visualization
ggplot() +
annotate_pitch(colour = "white", fill = "#1B5E20",
dimensions = pitch_statsbomb) +
# Edges
geom_segment(data = edges,
aes(x = x, y = y, xend = x_end, yend = y_end,
linewidth = passes),
color = "white", alpha = 0.5) +
# Nodes
geom_point(data = avg_pos,
aes(x = x, y = y, size = touches),
color = "#75AADB", alpha = 0.9) +
# Labels
geom_text(data = avg_pos,
aes(x = x, y = y - 4, label = gsub(".* ", "", player.name)),
color = "white", size = 3) +
# Highlight most connected
geom_point(data = most_connected,
aes(x = x, y = y), size = 20,
color = "#FFD700", alpha = 0.3) +
scale_linewidth_continuous(range = c(0.5, 4)) +
scale_size_continuous(range = c(5, 15)) +
theme_pitch() +
coord_flip() +
labs(title = paste(team_name, "Pass Network"),
subtitle = paste("Most Connected:", most_connected$player.name)) +
theme(legend.position = "none")
ggsave("pass_network.png", width = 12, height = 8)
Exercise 4.3: Multi-Panel Match Report
Task: Create a comprehensive match report combining xG timeline, shot maps for both teams, and key statistics in a single figure.
# Exercise 4.3: Multi-Panel Match Report
library(StatsBombR)
library(ggplot2)
library(ggsoccer)
library(dplyr)
library(patchwork)
# Get match data
match_id <- matches$match_id[1]
match_events <- events %>% filter(match_id == !!match_id)
teams <- unique(match_events$team.name[!is.na(match_events$team.name)])
# 1. xG Timeline
shots <- match_events %>%
filter(type.name == "Shot") %>%
arrange(minute) %>%
group_by(team.name) %>%
mutate(cumulative_xg = cumsum(shot.statsbomb_xg)) %>%
ungroup()
p1 <- ggplot(shots, aes(x = minute, y = cumulative_xg, color = team.name)) +
geom_step(linewidth = 1.5) +
geom_point(data = filter(shots, shot.outcome.name == "Goal"),
size = 4, shape = 21, fill = "white", stroke = 2) +
scale_color_manual(values = c("#1E90FF", "#DC143C")) +
labs(title = "xG Timeline", x = "Minute", y = "Cumulative xG", color = "") +
theme_minimal() +
theme(legend.position = "bottom")
# 2. Shot maps for each team
create_shot_map <- function(team, color) {
team_shots <- shots %>% filter(team.name == team)
ggplot(team_shots) +
annotate_pitch(colour = "white", fill = "#333333",
dimensions = pitch_statsbomb) +
geom_point(aes(x = location.x, y = location.y,
size = shot.statsbomb_xg,
alpha = shot.outcome.name == "Goal"),
color = color) +
scale_size_continuous(range = c(2, 10)) +
scale_alpha_manual(values = c(0.5, 1)) +
theme_pitch() +
coord_flip(xlim = c(60, 120)) +
labs(title = team) +
theme(legend.position = "none",
plot.title = element_text(hjust = 0.5, color = "white"),
plot.background = element_rect(fill = "#333333"))
}
p2 <- create_shot_map(teams[1], "#1E90FF")
p3 <- create_shot_map(teams[2], "#DC143C")
# 3. Stats table
stats <- match_events %>%
group_by(team.name) %>%
summarise(
Shots = sum(type.name == "Shot"),
`On Target` = sum(type.name == "Shot" & shot.outcome.name %in% c("Goal", "Saved")),
Goals = sum(type.name == "Shot" & shot.outcome.name == "Goal"),
xG = round(sum(shot.statsbomb_xg[type.name == "Shot"], na.rm = TRUE), 2),
Passes = sum(type.name == "Pass"),
`Pass %` = round(sum(type.name == "Pass" & is.na(pass.outcome.name)) /
sum(type.name == "Pass") * 100, 1)
)
# Combine with patchwork
final_plot <- (p1) / (p2 | p3) +
plot_annotation(
title = paste(teams[1], "vs", teams[2]),
subtitle = "Match Analysis Report",
theme = theme(plot.title = element_text(size = 20, face = "bold"),
plot.subtitle = element_text(size = 14))
)
ggsave("match_report.png", final_plot, width = 14, height = 12, dpi = 300)
Ready for Chapter 5?
Learn about traditional football statistics - possession, shots, pass completion, and how to calculate and interpret them.