Capstone - Complete Analytics System
- Understand different stakeholder perspectives and communication needs
- Master data storytelling techniques for football analytics
- Create effective visualizations for technical and non-technical audiences
- Build compelling presentations for coaches, executives, and scouts
- Write clear, actionable analytics reports
- Design intuitive dashboards for real-time decision support
- Handle objections and build trust with analytics skeptics
- Develop a communication strategy for your analytics department
The Communication Challenge
The most sophisticated analysis is worthless if it doesn't influence decisions. Analytics communication is about translating complex statistical insights into actionable information that coaches, scouts, executives, and players can understand and trust. This is often the biggest barrier to analytics adoption in football.
The Analytics Gap
Many clubs have invested heavily in data infrastructure and analysts but struggle to see returns because insights don't reach decision-makers in usable form. The best analysts are not just technically skilled—they're effective communicators who understand their audience.
Coaches
Want tactical insights, quickly
Executives
Need business impact, ROI
Scouts
Seek player comparisons, context
Players
Personal, actionable feedback
# Python: Communication Style Framework
import pandas as pd

# Define audience profiles and their preferences
audience_profiles = pd.DataFrame({
    "audience": ["Head Coach", "DOF", "Scout", "Owner", "Analyst"],
    "time_available": ["2 minutes", "10 minutes", "15 minutes", "5 minutes", "30+ minutes"],
    "technical_level": ["Low", "Medium", "Medium-High", "Low", "High"],
    "primary_concern": ["Tactical edge", "Value & risk", "Player fit", "ROI", "Methodology"],
    "preferred_format": ["Video + key stat", "Dashboard summary", "Detailed report",
                         "Executive summary", "Full technical doc"]
})

def format_insight(insight, audience):
    """Format analytical insight for specific audience."""
    # NOTE: create_headline, select_viz_type, determine_detail and create_cta
    # are placeholder helpers; they are not defined in this chapter.
    profile = audience_profiles[audience_profiles["audience"] == audience].iloc[0]
    formatted = {
        "headline": create_headline(insight, profile["primary_concern"]),
        "visualization": select_viz_type(insight, profile["technical_level"]),
        "detail_level": determine_detail(profile["time_available"]),
        "call_to_action": create_cta(insight, profile["audience"])
    }
    return formatted

print(audience_profiles)
# R: Communication Style Framework
library(tidyverse)

# Define audience profiles and their preferences
audience_profiles <- tribble(
  ~audience, ~time_available, ~technical_level, ~primary_concern, ~preferred_format,
  "Head Coach", "2 minutes", "Low", "Tactical edge", "Video + key stat",
  "DOF", "10 minutes", "Medium", "Value & risk", "Dashboard summary",
  "Scout", "15 minutes", "Medium-High", "Player fit", "Detailed report",
  "Owner", "5 minutes", "Low", "ROI", "Executive summary",
  "Analyst", "30+ minutes", "High", "Methodology", "Full technical doc"
)

# Map insight to appropriate format
format_insight <- function(insight, audience) {
  profile <- audience_profiles %>% filter(audience == !!audience)
  formatted <- list(
    headline = create_headline(insight, profile$primary_concern),
    visualization = select_viz_type(insight, profile$technical_level),
    detail_level = determine_detail(profile$time_available),
    call_to_action = create_cta(insight, profile$audience)
  )
  return(formatted)
}
print(audience_profiles)
     audience time_available technical_level primary_concern    preferred_format
0  Head Coach      2 minutes             Low   Tactical edge    Video + key stat
1         DOF     10 minutes          Medium    Value & risk   Dashboard summary
2       Scout     15 minutes     Medium-High      Player fit     Detailed report
3       Owner      5 minutes             Low             ROI   Executive summary
4     Analyst    30+ minutes            High     Methodology  Full technical doc
Data Storytelling
Effective analytics communication follows narrative structures. Rather than presenting data dumps, tell a story with a clear beginning (context), middle (analysis), and end (recommendation).
Situation
Set the context: What's the problem or opportunity? Why does it matter now?
Task
Define the question: What did we analyze? What was our objective?
Analysis
Present findings: What did the data reveal? Key insights only.
Recommendation
Action items: What should we do? Be specific and actionable.
# Python: Structure Analytics Story
import pandas as pd

def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 3 -> '3rd'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

def create_analytics_story(analysis_result, context):
    """Create structured analytics story using STAR framework."""
    story = {}

    # SITUATION: Set context
    story["situation"] = (
        f"Our {context['metric_name']} has dropped by {abs(context['change_pct'])}% "
        f"over the last {context['period']} matches. "
        f"This puts us {ordinal(context['league_rank'])} in the league for this metric."
    )

    # TASK: Define the question
    story["task"] = (
        f"We analyzed {len(analysis_result['data'])} matches to identify "
        "the root cause of this decline and potential solutions."
    )

    # ANALYSIS: Key findings (limit to 3)
    top_insights = analysis_result["insights"].nlargest(3, "impact_score")
    story["findings"] = [
        f"{i+1}. {row['category']}: {row['description']} (Impact: {row['impact_level']})"
        for i, (_, row) in enumerate(top_insights.iterrows())
    ]

    # RECOMMENDATION: Specific actions
    story["recommendation"] = [
        f"- {row['action']} ({row['expected_outcome']})"
        for _, row in analysis_result["recommendations"].iterrows()
    ]
    return story

# Example output
example_story = {
    "situation": "Our pressing success rate has dropped by 12% over the last 8 matches.",
    "task": "We analyzed pressing patterns across 24 matches.",
    "findings": [
        "1. Trigger timing: Pressing 0.5s too late on average",
        "2. Compactness: Team shape stretched 8m wider",
        "3. Recovery runs: 15% fewer recovery sprints"
    ],
    "recommendation": [
        "- Earlier trigger on opposition first touch",
        "- Narrower defensive width in middle third"
    ]
}

print("=== ANALYTICS STORY EXAMPLE ===\n")
print(f"SITUATION:\n{example_story['situation']}\n")
print(f"TASK:\n{example_story['task']}\n")
print("KEY FINDINGS:")
print("\n".join(example_story["findings"]))
print("\nRECOMMENDATIONS:")
print("\n".join(example_story["recommendation"]))
# R: Structure Analytics Story
library(tidyverse)

create_analytics_story <- function(analysis_result, context) {
  story <- list()

  # SITUATION: Set context
  # ordinal() lives in the scales package, which library(tidyverse) does not attach
  story$situation <- sprintf(
    paste0(
      "Our %s has dropped by %d%% over the last %d matches.\n",
      "This puts us %s in the league for this metric."
    ),
    context$metric_name,
    abs(context$change_pct),
    context$period,
    scales::ordinal(context$league_rank)
  )

  # TASK: Define the question
  story$task <- sprintf(
    paste0(
      "We analyzed %d matches to identify the root cause of ",
      "this decline and potential solutions."
    ),
    nrow(analysis_result$data)
  )

  # ANALYSIS: Key findings (limit to 3)
  story$findings <- analysis_result$insights %>%
    arrange(desc(impact_score)) %>%
    head(3) %>%
    mutate(
      finding = sprintf(
        "%d. %s: %s (Impact: %s)",
        row_number(),
        category,
        description,
        impact_level
      )
    ) %>%
    pull(finding)

  # RECOMMENDATION: Specific actions
  story$recommendation <- analysis_result$recommendations %>%
    mutate(
      action = sprintf("- %s (%s)", action, expected_outcome)
    ) %>%
    pull(action)

  return(story)
}

# Example output structure
example_story <- list(
  situation = "Our pressing success rate has dropped by 12% over the last 8 matches.",
  task = "We analyzed pressing patterns across 24 matches.",
  findings = c(
    "1. Trigger timing: Pressing 0.5s too late on average",
    "2. Compactness: Team shape stretched 8m wider",
    "3. Recovery runs: 15% fewer recovery sprints"
  ),
  recommendation = c(
    "- Earlier trigger on opposition first touch",
    "- Narrower defensive width in middle third"
  )
)

cat("=== ANALYTICS STORY EXAMPLE ===\n\n")
cat("SITUATION:\n", example_story$situation, "\n\n")
cat("TASK:\n", example_story$task, "\n\n")
cat("KEY FINDINGS:\n")
cat(paste(example_story$findings, collapse = "\n"), "\n\n")
cat("RECOMMENDATIONS:\n")
cat(paste(example_story$recommendation, collapse = "\n"))
=== ANALYTICS STORY EXAMPLE ===
SITUATION:
Our pressing success rate has dropped by 12% over the last 8 matches.
TASK:
We analyzed pressing patterns across 24 matches.
KEY FINDINGS:
1. Trigger timing: Pressing 0.5s too late on average
2. Compactness: Team shape stretched 8m wider
3. Recovery runs: 15% fewer recovery sprints
RECOMMENDATIONS:
- Earlier trigger on opposition first touch
- Narrower defensive width in middle third
The Pyramid Principle
Start with the conclusion, then provide supporting evidence. Busy stakeholders may only read the first line—make it count.
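The ordering can be made mechanical. Below is a minimal sketch, assuming a hypothetical helper named `pyramid_message` and placeholder supporting points; it puts the conclusion on the first line and caps the evidence at three points.

```python
def pyramid_message(conclusion, supporting_points,
                    detail_note="Detailed methodology available on request."):
    """Order a message conclusion-first, with at most three supporting points."""
    lines = [conclusion]                      # the first line carries the key message
    for i, point in enumerate(supporting_points[:3], start=1):
        lines.append(f"  {i}. {point}")       # evidence follows, capped at three
    lines.append(detail_note)                 # methodology is offered, not forced
    return "\n".join(lines)

# Placeholder content for illustration only
message = pyramid_message(
    "We should sign Player X.",
    ["Supporting point A", "Supporting point B",
     "Supporting point C", "Supporting point D"],
)
print(message)
```

A reader who stops after the first line still gets the recommendation; the fourth point is dropped rather than diluting the message.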
"We analyzed 2,847 shots from this season. Using logistic regression, we modeled shot quality based on 15 variables including distance, angle, defender proximity... After controlling for various factors... The model suggests... Therefore, we recommend signing Player X."
The reader must wade through methodology to find the point.
"We should sign Player X. He would add 8-12 goals above our current options at the same cost. Here's why: [3 supporting points]. Detailed methodology available on request."
The key message is immediate. Details follow for those who want them.
Visualization for Different Audiences
The right visualization depends on your audience. Technical analysts may appreciate complex plots; coaches need simple, immediate clarity.
| Audience | Recommended Visualizations | Avoid |
|---|---|---|
| Coaches | Pitch maps, shot maps, video clips, simple bar charts, player comparison tables | Dense scatter plots, statistical distributions, complex multi-panel figures |
| Executives | Trend lines, KPI dashboards, financial projections, league rankings | Raw data tables, overly technical metrics without context |
| Scouts | Radar charts, percentile rankings, heat maps, similar player comparisons | Aggregate stats without per-90 normalization |
| Players | Personal highlight clips with stats overlay, progress charts, specific zone analysis | League-wide comparisons that might demotivate |
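The table can also be encoded as a small lookup so a reporting script can sanity-check a chart type before it goes to a given audience. This is only a sketch: `VIZ_GUIDE` and `check_chart` are hypothetical names, and the dictionary holds a subset of the table entries.

```python
# Hypothetical lookup built from a subset of the table above.
VIZ_GUIDE = {
    "Coaches": {
        "recommended": {"pitch map", "shot map", "video clip", "simple bar chart"},
        "avoid": {"dense scatter plot", "statistical distribution"},
    },
    "Executives": {
        "recommended": {"trend line", "kpi dashboard", "league ranking"},
        "avoid": {"raw data table"},
    },
    "Scouts": {
        "recommended": {"radar chart", "percentile ranking", "heat map"},
        "avoid": {"aggregate stats without per-90 normalization"},
    },
}

def check_chart(audience, chart_type):
    """Return 'recommended', 'avoid', or 'unlisted' for a chart type."""
    guide = VIZ_GUIDE.get(audience)
    if guide is None:
        raise KeyError(f"No guidance recorded for audience: {audience}")
    chart = chart_type.lower()                # compare case-insensitively
    if chart in guide["recommended"]:
        return "recommended"
    if chart in guide["avoid"]:
        return "avoid"
    return "unlisted"

print(check_chart("Coaches", "Shot map"))
print(check_chart("Coaches", "Dense scatter plot"))
```

An "unlisted" result is a prompt to think about the audience, not a verdict on the chart.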
# Python: Audience-Appropriate Visualizations
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import numpy as np

def create_coach_shot_map(shots_data, player_name):
    """Create simple, clear shot map for coaches."""
    fig, ax = plt.subplots(figsize=(12, 8))

    # Pitch background: drawn as a filled rectangle because ax.axis("off")
    # below also hides the axes face colour
    ax.add_patch(patches.Rectangle((0, 0), 120, 80, facecolor="#2E7D32",
                                   edgecolor="white", linewidth=2, zorder=0))
    # Goal area (six-yard box) on a 120x80 pitch
    ax.add_patch(patches.Rectangle((114, 30), 6, 20, fill=False,
                                   edgecolor="white", linewidth=3))

    # Plot shots with colors by result, sized by xG
    colors = {"Goal": "#FFD700", "Saved": "white",
              "Blocked": "gray", "Off Target": "red"}
    for _, shot in shots_data.iterrows():
        color = colors.get(shot["result"], "white")
        size = shot["xg"] * 300 + 50
        ax.scatter(shot["x"], shot["y"], c=color, s=size,
                   alpha=0.8, edgecolors="black", linewidths=0.5, zorder=3)

    # Title and stats
    goals = (shots_data["result"] == "Goal").sum()
    total_xg = shots_data["xg"].sum()
    n_shots = len(shots_data)
    ax.text(60, 75, f"{player_name} - Shot Map",
            ha="center", fontsize=16, fontweight="bold", color="white")
    ax.text(60, 5, f"Goals: {goals} | xG: {total_xg:.1f} | Shots: {n_shots}",
            ha="center", fontsize=12, color="white")

    ax.set_xlim(0, 120)
    ax.set_ylim(0, 80)
    ax.set_aspect("equal")
    ax.axis("off")

    # Legend (empty scatters act as legend handles)
    for result, color in colors.items():
        ax.scatter([], [], c=color, s=100, label=result, edgecolors="black")
    ax.legend(loc="lower center", ncol=4, frameon=False)

    plt.tight_layout()
    return fig

def create_executive_trend(performance_data, metric, target):
    """Create trend chart with target for executives."""
    fig, ax = plt.subplots(figsize=(10, 6))
    x = performance_data["matchweek"]
    y = performance_data[metric]

    # Target line
    ax.axhline(y=target, linestyle="--", color="#1B5E20",
               linewidth=2, label=f"Target: {target}")
    # Trend line
    ax.plot(x, y, color="#2E7D32", linewidth=2)

    # Points colored by above/below target
    above = y >= target
    ax.scatter(x[above], y[above], color="#1B5E20", s=80, zorder=5)
    ax.scatter(x[~above], y[~above], color="#D32F2F", s=80, zorder=5)

    # Current status
    current = y.iloc[-1]
    gap = current - target

    ax.set_title(f"Performance: {metric}", fontsize=14, fontweight="bold")
    ax.set_xlabel("Matchweek")
    ax.set_ylabel(metric)

    # Add summary annotation
    ax.text(0.02, 0.98,
            f"Current: {current:.1f} | Target: {target} | Gap: {gap:+.1f}",
            transform=ax.transAxes, fontsize=10, verticalalignment="top",
            bbox=dict(boxstyle="round", facecolor="wheat", alpha=0.5))
    ax.legend(loc="upper right")
    ax.grid(True, alpha=0.3)

    plt.tight_layout()
    return fig
# R: Audience-Appropriate Visualizations
library(tidyverse)
library(ggplot2)
library(patchwork)

# COACH VERSION: Simple, clear shot map
create_coach_shot_map <- function(shots_data, player_name) {
  ggplot(shots_data, aes(x = x, y = y)) +
    # Pitch background
    annotate("rect", xmin = 0, xmax = 120, ymin = 0, ymax = 80,
             fill = "#2E7D32", color = "white") +
    # Goal area (six-yard box) on a 120x80 pitch
    annotate("rect", xmin = 114, xmax = 120, ymin = 30, ymax = 50,
             fill = NA, color = "white", linewidth = 2) +
    # Shots
    geom_point(aes(size = xg, color = result),
               alpha = 0.8) +
    scale_color_manual(values = c("Goal" = "#FFD700", "Saved" = "white",
                                  "Blocked" = "gray", "Off Target" = "red")) +
    scale_size_continuous(range = c(2, 8), guide = "none") +
    # Simple annotations
    annotate("text", x = 60, y = 75,
             label = sprintf("%s - Shot Map", player_name),
             size = 6, fontface = "bold", color = "white") +
    annotate("text", x = 60, y = 5,
             label = sprintf("Goals: %d | xG: %.1f | Shots: %d",
                             sum(shots_data$result == "Goal"),
                             sum(shots_data$xg),
                             nrow(shots_data)),
             size = 4, color = "white") +
    coord_fixed() +
    theme_void() +
    theme(legend.position = "bottom",
          legend.text = element_text(color = "black"),
          plot.background = element_rect(fill = "white"))
}

# EXECUTIVE VERSION: Trend with context
create_executive_trend <- function(performance_data, metric, target) {
  ggplot(performance_data, aes(x = matchweek, y = !!sym(metric))) +
    geom_hline(yintercept = target, linetype = "dashed",
               color = "#1B5E20", linewidth = 1) +
    geom_line(color = "#2E7D32", linewidth = 1.5) +
    geom_point(aes(color = !!sym(metric) >= target), size = 3) +
    scale_color_manual(values = c("TRUE" = "#1B5E20", "FALSE" = "#D32F2F"),
                       guide = "none") +
    annotate("text", x = max(performance_data$matchweek), y = target,
             label = "Target", hjust = -0.1, color = "#1B5E20") +
    labs(title = sprintf("Performance: %s", metric),
         subtitle = sprintf("Current: %.1f | Target: %.1f | Gap: %+.1f",
                            tail(performance_data[[metric]], 1),
                            target,
                            tail(performance_data[[metric]], 1) - target),
         x = "Matchweek", y = metric) +
    theme_minimal() +
    theme(plot.title = element_text(face = "bold"))
}
The "So What?" Test
Every visualization should pass the "So What?" test. After viewing, the audience should immediately understand what it means for them.
# Python: Add "So What" Context to Visualizations
import matplotlib.pyplot as plt

def add_so_what_context(fig, ax, insight, action):
    """Add interpretive context to make visualization actionable."""
    # Add insight as subtitle
    ax.text(0.5, 1.02, insight,
            transform=ax.transAxes, ha="center", fontsize=10,
            style="italic", color="#1B5E20")
    # Add action as footer
    fig.text(0.1, 0.02, f"Recommendation: {action}",
             fontsize=9, color="#D32F2F", fontweight="bold")
    plt.subplots_adjust(bottom=0.12, top=0.88)
    return fig

# Example: xG difference chart with context
# (match_data is assumed to be a DataFrame with "matchweek" and "xg_diff" columns)
fig, ax = plt.subplots(figsize=(10, 6))
colors = ["#1B5E20" if x > 0 else "#D32F2F" for x in match_data["xg_diff"]]
ax.bar(match_data["matchweek"], match_data["xg_diff"], color=colors)
ax.axhline(y=0, color="black", linewidth=0.5)
ax.set_xlabel("Matchweek")
ax.set_ylabel("xG Difference")
# Extra padding keeps the title clear of the insight line added below
ax.set_title("xG Difference by Match", fontweight="bold", pad=20)

# Add the "So What?"
fig = add_so_what_context(
    fig, ax,
    insight="Creating more chances than conceding - underlying performance is strong",
    action="Maintain current tactical approach; results will likely improve"
)
plt.show()
# R: Add "So What" Context to Visualizations
add_so_what_context <- function(plot, insight, action) {
  # Add interpretive subtitle and actionable caption
  plot +
    labs(
      subtitle = insight, # What does this mean?
      caption = paste("Recommendation:", action) # What should we do?
    ) +
    theme(
      plot.subtitle = element_text(color = "#1B5E20", face = "italic"),
      plot.caption = element_text(color = "#D32F2F", face = "bold",
                                  hjust = 0, size = 10)
    )
}

# Example: Adding context to xG chart
xg_plot <- ggplot(match_data, aes(x = matchweek, y = xg_diff)) +
  geom_col(aes(fill = xg_diff > 0)) +
  scale_fill_manual(values = c("TRUE" = "#1B5E20", "FALSE" = "#D32F2F"),
                    guide = "none") +
  labs(title = "xG Difference by Match", x = "Matchweek", y = "xG Difference")

# Add the "So What?"
xg_plot_with_context <- add_so_what_context(
  xg_plot,
  insight = "We're creating more chances than conceding - underlying performance is strong",
  action = "Maintain current tactical approach; results will likely improve"
)
print(xg_plot_with_context)
Presenting to Coaches
Coaches are time-poor and action-oriented. They need insights that directly inform training sessions, team selection, and match tactics. Build trust by speaking their language and respecting their expertise.
Do
- Lead with video clips supported by data
- Use football terminology, not statistical jargon
- Provide specific, actionable recommendations
- Acknowledge what you don't know
- Respect their observational expertise
- Keep it brief: 2 minutes max for key points
- Offer deeper dives on request
Don't
- Lecture on statistical methodology
- Present data that contradicts their observation without humility
- Overwhelm with too many metrics
- Use percentiles without context
- Claim certainty where uncertainty exists
- Ignore their questions or pushback
- Present without understanding the tactical context
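The two-minute limit can be checked before walking into the room. A rough sketch, assuming a speaking rate of 150 words per minute (an illustrative assumption, not a figure from this chapter) and hypothetical helper names:

```python
def estimated_speaking_minutes(text, words_per_minute=150):
    """Estimate how long it takes to say `text` aloud (assumed speaking rate)."""
    return len(text.split()) / words_per_minute

def fits_time_budget(key_points, budget_minutes=2.0, words_per_minute=150):
    """Return (fits, total_minutes) for a list of talking points."""
    total = sum(estimated_speaking_minutes(p, words_per_minute) for p in key_points)
    return total <= budget_minutes, total

# Illustrative talking points
points = [
    "Liverpool leave space behind fullbacks when pressing high",
    "They concede 0.42 xG/90 from wide areas - 35% above league average",
    "Quick switches of play to exploit advancing fullbacks",
]
ok, minutes = fits_time_budget(points)
print(f"Fits two-minute budget: {ok} ({minutes:.2f} min)")
```

Word count is a crude proxy; video clips and questions will take most of the real time, so treat the result as a lower bound.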
# Python: Create Coach-Ready Presentation Packet
from dataclasses import dataclass
from typing import List

@dataclass
class CoachPacket:
    headline: str
    key_stat: str
    tactical_suggestion: str
    video_clips: List[str]
    key_numbers: List[str]

def create_coach_packet(match_analysis, opposition_name):
    """Create presentation packet optimized for coaching staff."""
    # 1. One-page summary
    headline = (
        f"{opposition_name} tend to leave space "
        f"{match_analysis['vulnerability_zone']} when pressing high"
    )
    key_stat = (
        f"They concede {match_analysis['xg_conceded_zone']:.1f} xG/90 "
        f"from {match_analysis['vulnerability_zone']} - "
        f"{match_analysis['above_avg_pct']:.0f}% above league average"
    )
    tactical_suggestion = (
        f"Consider {match_analysis['suggested_tactic']} "
        "to exploit this. Video examples attached."
    )

    # 2. Key numbers (max 5)
    key_numbers = [
        f"{row['metric_name']}: {row['value_formatted']}"
        for _, row in match_analysis["top_insights"].head(5).iterrows()
    ]

    packet = CoachPacket(
        headline=headline,
        key_stat=key_stat,
        tactical_suggestion=tactical_suggestion,
        video_clips=match_analysis["example_clips"][:3],  # Top 3 clips
        key_numbers=key_numbers
    )
    return packet

# Example output
example_packet = CoachPacket(
    headline="Liverpool leave space behind fullbacks when pressing high",
    key_stat="They concede 0.42 xG/90 from wide areas - 35% above league average",
    tactical_suggestion="Quick switches of play to exploit advancing fullbacks",
    video_clips=["clip_001.mp4", "clip_002.mp4", "clip_003.mp4"],
    key_numbers=[
        "High press trigger: 65% of opposition goal kicks",
        "Recovery time: 4.2 seconds average",
        "xG from counters: 0.31/90 (league avg: 0.22)"
    ]
)

print("=== COACH PRESENTATION PACKET ===\n")
print(f"HEADLINE:\n{example_packet.headline}\n")
print(f"KEY STAT:\n{example_packet.key_stat}\n")
print(f"SUGGESTION:\n{example_packet.tactical_suggestion}\n")
print("KEY NUMBERS:")
for num in example_packet.key_numbers:
    print(f"- {num}")
# R: Create Coach-Ready Presentation Packet
library(tidyverse)
library(officer)
library(rvg)

create_coach_packet <- function(match_analysis, opposition_name) {
  packet <- list()

  # 1. One-page summary
  packet$summary <- list(
    headline = sprintf(
      "%s tend to leave space %s when pressing high",
      opposition_name,
      match_analysis$vulnerability_zone
    ),
    key_stat = sprintf(
      "They concede %.1f xG/90 from %s - %.0f%% above league average",
      match_analysis$xg_conceded_zone,
      match_analysis$vulnerability_zone,
      match_analysis$above_avg_pct
    ),
    tactical_suggestion = sprintf(
      "Consider %s to exploit this. Video examples attached.",
      match_analysis$suggested_tactic
    ),
    video_clips = match_analysis$example_clips[1:3] # Top 3 clips
  )

  # 2. Simple pitch graphic
  # (create_zone_heatmap is a placeholder helper, not defined in this chapter)
  packet$pitch_graphic <- create_zone_heatmap(
    zones = match_analysis$dangerous_zones,
    title = sprintf("Where %s are vulnerable", opposition_name),
    subtitle = "Red = higher xG conceded"
  )

  # 3. Key numbers (max 5)
  packet$key_numbers <- match_analysis$top_insights %>%
    head(5) %>%
    mutate(
      formatted = sprintf("%s: %s", metric_name, value_formatted)
    )

  return(packet)
}

# Example packet content
example_packet <- list(
  headline = "Liverpool leave space behind fullbacks when pressing high",
  key_stat = "They concede 0.42 xG/90 from wide areas - 35% above league average",
  tactical_suggestion = "Quick switches of play to exploit advancing fullbacks",
  key_numbers = c(
    "High press trigger: 65% of opposition goal kicks",
    "Recovery time: 4.2 seconds average",
    "xG from counters: 0.31/90 (league avg: 0.22)"
  )
)

cat("=== COACH PRESENTATION PACKET ===\n\n")
cat("HEADLINE:\n", example_packet$headline, "\n\n")
cat("KEY STAT:\n", example_packet$key_stat, "\n\n")
cat("SUGGESTION:\n", example_packet$tactical_suggestion, "\n\n")
cat("KEY NUMBERS:\n")
cat(paste("-", example_packet$key_numbers, collapse = "\n"))
=== COACH PRESENTATION PACKET ===
HEADLINE:
Liverpool leave space behind fullbacks when pressing high
KEY STAT:
They concede 0.42 xG/90 from wide areas - 35% above league average
SUGGESTION:
Quick switches of play to exploit advancing fullbacks
KEY NUMBERS:
- High press trigger: 65% of opposition goal kicks
- Recovery time: 4.2 seconds average
- xG from counters: 0.31/90 (league avg: 0.22)
Writing Effective Reports
Written reports need clear structure, progressive disclosure of detail, and actionable conclusions. Different report types serve different purposes.
Executive Summary
Purpose: Quick overview for busy decision-makers
Structure:
- Bottom line: One sentence recommendation
- Key findings: 3 bullet points max
- Risk/confidence: What could go wrong
- Next steps: Specific actions needed
Scouting Report
Purpose: Detailed player assessment for recruitment
Structure:
- Player overview: Bio, current situation, market value
- Statistical profile: Key metrics with percentile ranks
- Strengths: 3-4 areas with data + video evidence
- Weaknesses: 2-3 areas with honest assessment
- Fit assessment: How they'd fit our system
- Comparable players: Similar profiles (successful & unsuccessful)
- Recommendation: Sign/don't sign with confidence level
Technical Report
Purpose: Full methodology for analysts/archival
Structure:
- Executive summary: 1 page overview
- Methodology: Data sources, models, assumptions
- Analysis: Full results with all visualizations
- Validation: Model accuracy, backtesting
- Limitations: What this analysis can't tell us
- Appendices: Code, raw data references
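The shortest of these structures, the one-page overview for decision-makers, is easy to enforce in code. A minimal sketch, assuming a hypothetical `ExecutiveSummary` class and placeholder content; it rejects more than three key findings and renders the four parts in order.

```python
from dataclasses import dataclass, field
from typing import List

MAX_FINDINGS = 3  # "3 bullet points max" from the structure above

@dataclass
class ExecutiveSummary:
    bottom_line: str
    key_findings: List[str]
    risks: List[str] = field(default_factory=list)
    next_steps: List[str] = field(default_factory=list)

    def __post_init__(self):
        # Fail early rather than silently truncating the analyst's findings
        if len(self.key_findings) > MAX_FINDINGS:
            raise ValueError(f"At most {MAX_FINDINGS} key findings allowed")

    def render(self):
        """Render the summary as plain text, bottom line first."""
        lines = [f"Bottom line: {self.bottom_line}", "", "Key findings:"]
        lines += [f"- {item}" for item in self.key_findings]
        lines += ["", "Risk/confidence:"] + [f"- {item}" for item in self.risks]
        lines += ["", "Next steps:"] + [f"- {item}" for item in self.next_steps]
        return "\n".join(lines)

# Placeholder content for illustration only
summary = ExecutiveSummary(
    bottom_line="We should sign Player X.",
    key_findings=["Finding A", "Finding B", "Finding C"],
    risks=["Risk A"],
    next_steps=["Next step A"],
)
print(summary.render())
```

Raising an error instead of truncating forces the author to choose which findings matter most.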
# Python: Generate Structured Report
def generate_scouting_report(player_analysis):
    """Generate structured scouting report as markdown sections."""
    pa = player_analysis
    report = {}

    # Section 1: Overview
    report["overview"] = "\n".join([
        "## Player Overview",
        f"**Name:** {pa['name']}",
        f"**Age:** {pa['age']} | **Position:** {pa['position']} | **Club:** {pa['current_club']}",
        f"**Contract Expires:** {pa['contract_end']} | **Est. Value:** {pa['market_value']}",
        "",
        pa["summary_paragraph"],
    ])

    # Section 2: Statistical Profile
    metrics_display = [
        f"- {row['metric_name']}: **{row['value']:.2f}** ({row['percentile']}th percentile)"
        for _, row in pa["key_metrics"].iterrows()
    ]
    report["stats"] = "\n".join(
        [f"## Statistical Profile (vs {pa['comparison_group']})"] + metrics_display
    )

    # Section 3: Strengths
    strengths_sections = [
        f"### {s['title']}\n{s['description']}\n*Evidence: {s['video_link']}*"
        for _, s in pa["strengths"].iterrows()
    ]
    report["strengths"] = "\n".join(["## Strengths"] + strengths_sections)

    # Section 4: Recommendation
    report["recommendation"] = "\n".join([
        "## Recommendation",
        f"**Verdict:** {pa['verdict']}",
        f"**Confidence:** {pa['confidence']}",
        f"**Risk Level:** {pa['risk']}",
        "",
        pa["final_comments"],
    ])
    return report

def export_report_markdown(report, filename):
    """Export report to markdown file."""
    full_report = "\n\n".join([
        report["overview"],
        report["stats"],
        report["strengths"],
        report["recommendation"]
    ])
    with open(filename, "w", encoding="utf-8") as f:
        f.write(full_report)
    print(f"Report exported to {filename}")
# R: Generate Structured Report
library(tidyverse)
library(rmarkdown)

generate_scouting_report <- function(player_analysis) {
  report <- list()

  # Section 1: Overview
  report$overview <- sprintf(
    paste(
      "## Player Overview\n",
      "**Name:** %s",
      "**Age:** %d | **Position:** %s | **Club:** %s",
      "**Contract Expires:** %s | **Est. Value:** %s",
      "%s",
      sep = "\n"
    ),
    player_analysis$name,
    player_analysis$age,
    player_analysis$position,
    player_analysis$current_club,
    player_analysis$contract_end,
    player_analysis$market_value,
    player_analysis$summary_paragraph
  )

  # Section 2: Statistical Profile
  metrics_table <- player_analysis$key_metrics %>%
    mutate(
      display = sprintf("%s: **%.2f** (%sth percentile)",
                        metric_name, value, percentile)
    )
  report$stats <- sprintf(
    "## Statistical Profile (vs %s)\n\n%s",
    player_analysis$comparison_group,
    paste(metrics_table$display, collapse = "\n")
  )

  # Section 3: Strengths
  strengths <- player_analysis$strengths %>%
    mutate(
      section = sprintf(
        "### %s\n%s\n*Evidence: %s*\n",
        title, description, video_link
      )
    )
  report$strengths <- sprintf(
    "## Strengths\n%s",
    paste(strengths$section, collapse = "\n")
  )

  # Section 4: Recommendation
  report$recommendation <- sprintf(
    paste(
      "## Recommendation\n",
      "**Verdict:** %s",
      "**Confidence:** %s",
      "**Risk Level:** %s",
      "%s",
      sep = "\n"
    ),
    player_analysis$verdict,
    player_analysis$confidence,
    player_analysis$risk,
    player_analysis$final_comments
  )

  return(report)
}
Dashboard Design
Dashboards provide real-time decision support. Good dashboard design follows principles of visual hierarchy, progressive disclosure, and actionability.
Layout Principles
- Most important metrics top-left
- Group related information
- Consistent spacing and alignment
- Clear visual hierarchy
- Mobile-responsive design
Visual Design
- Limited color palette (3-5 colors)
- Semantic colors (red=bad, green=good)
- Minimize chart junk
- Clear labels and titles
- Consistent styling across charts
Interactivity
- Filters for different views
- Drill-down capability
- Tooltips for details
- Export functionality
- Real-time updates where needed
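Semantic colour and a limited palette are easiest to keep consistent when every chart asks one helper for its colours. A minimal sketch, assuming hypothetical names `PALETTE` and `semantic_color`; the green and red reuse the hex values from this chapter's charts, while the neutral grey and the `tolerance` parameter are assumptions added for illustration.

```python
# Three-colour palette: green and red match the chapter's charts,
# the neutral grey is an assumption for "roughly on target".
PALETTE = {"good": "#1B5E20", "bad": "#D32F2F", "neutral": "#757575"}

def semantic_color(value, target, higher_is_better=True, tolerance=0.0):
    """Pick a palette colour by comparing a metric against its target."""
    gap = value - target
    if not higher_is_better:
        gap = -gap                      # e.g. goals conceded: lower is better
    if abs(gap) <= tolerance:
        return PALETTE["neutral"]       # within tolerance of the target
    return PALETTE["good"] if gap > 0 else PALETTE["bad"]

print(semantic_color(1.8, 1.5))                          # above target
print(semantic_color(1.6, 1.2, higher_is_better=False))  # conceding more than target
print(semantic_color(1.52, 1.5, tolerance=0.05))         # roughly on target
```

Routing every chart through one function means a palette change happens in one place, which helps keep styling consistent across the dashboard.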
# Python: Create Interactive Dashboard with Streamlit
import streamlit as st
import plotly.express as px
import plotly.graph_objects as go
import pandas as pd

def create_match_dashboard():
    """Create interactive match analytics dashboard."""
    st.set_page_config(page_title="Match Analytics", layout="wide")

    # Sidebar controls
    st.sidebar.title("Match Analytics")
    match = st.sidebar.selectbox("Select Match", ["Match 1", "Match 2", "Match 3"])
    metric = st.sidebar.selectbox("Primary Metric", ["xG", "Possession", "PPDA"])
    show_benchmark = st.sidebar.checkbox("Show League Average", True)

    # Load data based on selection
    # (load_match_data and create_shot_map are placeholder helpers, not defined here)
    match_data = load_match_data(match)

    # Row 1: Key metrics
    col1, col2, col3, col4 = st.columns(4)
    with col1:
        st.metric("Expected Goals",
                  f"{match_data['xg']:.2f}",
                  delta=f"{match_data['xg'] - 1.5:.2f} vs avg")
    with col2:
        st.metric("Possession",
                  f"{match_data['possession']:.0f}%")
    with col3:
        # Shots on target is context, not a change, so disable delta colouring
        st.metric("Shots",
                  match_data["shots"],
                  delta=f"{match_data['shots_on_target']} on target",
                  delta_color="off")
    with col4:
        st.metric("Pass Completion",
                  f"{match_data['pass_pct']:.1f}%")

    # Row 2: Main visualizations
    col_left, col_right = st.columns([2, 1])
    with col_left:
        st.subheader("xG Flow")
        fig = go.Figure()
        fig.add_trace(go.Scatter(
            x=match_data["xg_timeline"]["minute"],
            y=match_data["xg_timeline"]["cumulative_xg"],
            mode="lines",
            name="Our xG",
            line=dict(color="#1B5E20", width=2)
        ))
        if show_benchmark:
            fig.add_hline(y=1.5, line_dash="dash",
                          annotation_text="League Avg")
        st.plotly_chart(fig, use_container_width=True)
    with col_right:
        st.subheader("Shot Map")
        shot_fig = create_shot_map(match_data["shots_data"])
        st.pyplot(shot_fig)

    # Row 3: Player table
    st.subheader("Player Performance")
    st.dataframe(match_data["player_stats"],
                 use_container_width=True,
                 hide_index=True)

# Run: streamlit run dashboard.py
if __name__ == "__main__":
    create_match_dashboard()
# R: Create Interactive Dashboard with Shiny
library(shiny)
library(shinydashboard)
library(plotly)

# Dashboard UI
ui <- dashboardPage(
  dashboardHeader(title = "Match Analytics"),
  dashboardSidebar(
    selectInput("match", "Select Match:",
                choices = c("Match 1", "Match 2", "Match 3")),
    selectInput("metric", "Primary Metric:",
                choices = c("xG", "Possession", "PPDA")),
    checkboxInput("show_benchmark", "Show League Average", TRUE)
  ),
  dashboardBody(
    # Row 1: Key metrics
    fluidRow(
      valueBoxOutput("xg_box", width = 3),
      valueBoxOutput("possession_box", width = 3),
      valueBoxOutput("shots_box", width = 3),
      valueBoxOutput("passes_box", width = 3)
    ),
    # Row 2: Main visualization
    fluidRow(
      box(title = "xG Flow", status = "primary", solidHeader = TRUE,
          width = 8, plotlyOutput("xg_flow_plot")),
      box(title = "Shot Map", status = "success", solidHeader = TRUE,
          width = 4, plotOutput("shot_map"))
    ),
    # Row 3: Detailed tables
    fluidRow(
      box(title = "Player Performance", width = 12,
          DT::dataTableOutput("player_table"))
    )
  )
)

# Dashboard Server
server <- function(input, output) {
  # Reactive data sources. load_match_data() is a placeholder helper (not
  # defined here), assumed to return a list with xg and xg_timeline elements
  # as in the Python version.
  match_data <- reactive(load_match_data(input$match))
  xg_data <- reactive(match_data()$xg_timeline)

  output$xg_box <- renderValueBox({
    valueBox(
      sprintf("%.2f", match_data()$xg),
      "Expected Goals",
      icon = icon("futbol"),
      color = if (match_data()$xg > 1.5) "green" else "yellow"
    )
  })

  output$xg_flow_plot <- renderPlotly({
    plot_ly(xg_data(), x = ~minute, y = ~cumulative_xg,
            type = "scatter", mode = "lines",
            line = list(color = "#1B5E20", width = 2)) %>%
      layout(title = "Cumulative xG",
             xaxis = list(title = "Minute"),
             yaxis = list(title = "xG"))
  })

  # The other outputs declared in the UI (possession_box, shots_box,
  # passes_box, shot_map, player_table) are omitted here.
}

shinyApp(ui, server)
Handling Objections and Building Trust
Analytics skepticism is common in football. Building trust requires patience, humility, and consistent delivery of actionable insights.
| Common Objection | Response Strategy |
|---|---|
| "I've watched football for 30 years—I don't need numbers to tell me who's good." | Acknowledge their expertise. Position analytics as a complement: "Your eye test caught things the data confirms. But data can also flag things happening in parts of the pitch you weren't watching." |
| "Stats don't capture the intangibles—leadership, mentality." | Agree that some things are hard to quantify. "You're right—we focus on the measurable to free you up to evaluate those intangibles. Together it's a fuller picture." |
| "Your model said X would happen and it didn't." | Explain probabilities vs certainties. "A 70% chance means it won't happen 30% of the time. Over many decisions, following 70% probabilities wins more than gut feel." |
| "Football is too complex to reduce to numbers." | "You're right that models simplify. But every decision simplifies—the question is whether informed simplification beats uninformed intuition." |
| "We tried analytics before and it didn't work." | Ask what went wrong. Often it's communication, not the analytics. "Let's focus on the specific decisions you need support on and build from there." |
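The probability argument in the table is easier to make with the arithmetic in hand. A small sketch, assuming independent decisions that each succeed with the same probability (a simplification) and hypothetical function names:

```python
def expected_hits(n_decisions, p_success):
    """Expected number of correct calls over n independent decisions."""
    return n_decisions * p_success

def prob_at_least_one_miss(n_decisions, p_success):
    """Chance that at least one of n independent calls goes wrong."""
    return 1 - p_success ** n_decisions

n, p = 10, 0.70
print(f"Expected correct calls in {n}: {expected_hits(n, p):.1f}")
print(f"Chance of at least one miss: {prob_at_least_one_miss(n, p):.1%}")
```

Under these assumptions, ten 70% calls are expected to produce about seven hits, and at least one miss is close to certain (about 97%). A single visible failure is therefore what a well-calibrated model should produce, not evidence against it.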
# Python: Track Analytics Impact for Credibility Building
import pandas as pd
import uuid
from datetime import date

class RecommendationTracker:
    """Track analytics recommendations and outcomes for credibility."""

    def __init__(self):
        self.recommendations = pd.DataFrame(columns=[
            "id", "date", "category", "description", "confidence",
            "decision_maker", "was_followed", "outcome", "outcome_date"
        ])

    def log_recommendation(self, recommendation):
        """Log a new analytics recommendation."""
        new_rec = {
            "id": str(uuid.uuid4()),
            "date": date.today(),
            "category": recommendation["category"],
            "description": recommendation["description"],
            "confidence": recommendation["confidence"],
            "decision_maker": recommendation["presented_to"],
            "was_followed": None,
            "outcome": None,
            "outcome_date": None
        }
        self.recommendations = pd.concat([
            self.recommendations,
            pd.DataFrame([new_rec])
        ], ignore_index=True)
        return new_rec["id"]

    def update_outcome(self, rec_id, was_followed, outcome):
        """Update recommendation with outcome."""
        mask = self.recommendations["id"] == rec_id
        self.recommendations.loc[mask, "was_followed"] = was_followed
        self.recommendations.loc[mask, "outcome"] = outcome
        self.recommendations.loc[mask, "outcome_date"] = date.today()

    def generate_credibility_report(self):
        """Generate report showing analytics track record."""
        # Only recommendations with a recorded outcome count
        completed = self.recommendations.dropna(subset=["outcome"])

        # Key metric: outcomes when followed vs not
        followed_vs_not = completed.groupby("was_followed").agg(
            n=("id", "count"),
            positive_rate=("outcome", lambda x: (x == "positive").mean())
        ).reset_index()

        # By category
        by_category = completed.groupby(["category", "was_followed"]).agg(
            n=("id", "count"),
            positive_rate=("outcome", lambda x: (x == "positive").mean()),
            avg_confidence=("confidence", "mean")
        ).reset_index()

        print("=== ANALYTICS CREDIBILITY REPORT ===\n")
        print("Recommendations followed vs not followed:")
        print(followed_vs_not)
        print("\nBy category:")
        print(by_category)
        return {"followed_vs_not": followed_vs_not, "by_category": by_category}

# Usage
tracker = RecommendationTracker()
rec_id = tracker.log_recommendation({
    "category": "signing",
    "description": "Sign Player X - projects to add 0.15 xG/90",
    "confidence": 4,
    "presented_to": "Director of Football"
})
# Later...
tracker.update_outcome(rec_id, was_followed=True, outcome="positive")
# R: Track Analytics Impact for Credibility Building
library(tidyverse)
# Log analytics recommendations and outcomes
track_recommendation <- function(rec_db, recommendation) {
  new_rec <- data.frame(
    id = uuid::UUIDgenerate(),
    date = Sys.Date(),
    category = recommendation$category,        # signing, tactical, lineup
    description = recommendation$description,
    confidence = recommendation$confidence,    # 1-5
    decision_maker = recommendation$presented_to,
    was_followed = NA,                         # To be filled later
    outcome = NA_character_,                   # To be filled later
    outcome_date = as.Date(NA)
  )
  rec_db <- rbind(rec_db, new_rec)
  return(rec_db)
}
# Update with outcome
update_outcome <- function(rec_db, rec_id, was_followed, outcome) {
  # .env$ selects the function arguments; the bare names would resolve to the
  # columns of the same name and leave the data unchanged
  rec_db <- rec_db %>%
    mutate(
      was_followed = if_else(id == rec_id, .env$was_followed, was_followed),
      outcome = if_else(id == rec_id, .env$outcome, outcome),
      outcome_date = if_else(id == rec_id, Sys.Date(), outcome_date)
    )
  return(rec_db)
}
# Generate credibility report
generate_credibility_report <- function(rec_db) {
completed <- rec_db %>%
filter(!is.na(outcome))
summary <- completed %>%
group_by(category, was_followed) %>%
summarize(
n = n(),
pct_positive = mean(outcome == "positive"),
avg_confidence = mean(confidence),
.groups = "drop"
)
# Key metric: Do recommendations followed have better outcomes?
followed_vs_not <- completed %>%
group_by(was_followed) %>%
summarize(
n = n(),
positive_rate = mean(outcome == "positive"),
.groups = "drop"
)
cat("=== ANALYTICS CREDIBILITY REPORT ===\n\n")
cat("Recommendations followed vs not followed:\n")
print(followed_vs_not)
cat("\nBy category:\n")
print(summary)
return(list(summary = summary, followed_vs_not = followed_vs_not))
}
Building Trust Over Time
The Trust-Building Playbook
- Start small: Begin with low-stakes insights that prove your value without threatening anyone's role
- Be right about something: Identify a prediction you're confident in and track it visibly
- Admit when wrong: Acknowledge failures openly—it builds credibility more than claiming perfection
- Speak their language: Learn the terminology coaches and scouts use; translate your findings
- Make them look good: Position insights as supporting their decisions, not replacing them
- Be present: Attend training, watch matches live, show you understand the game beyond spreadsheets
- Track your record: Keep evidence of recommendations and outcomes to demonstrate value over time
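When you present a track record, small samples deserve visible uncertainty: 7 good outcomes from 10 followed recommendations is encouraging but far from conclusive. One common way to express this is a Wilson score interval for a proportion. The sketch below is illustrative only; the function name `wilson_interval` and the counts are assumptions, not part of the tracker above.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    if n == 0:
        return (0.0, 1.0)  # no data: every rate is still plausible
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return (max(0.0, centre - half_width), min(1.0, centre + half_width))

# Hypothetical record: 7 positive outcomes from 10 followed recommendations
low, high = wilson_interval(7, 10)
print(f"Observed positive rate 70%, plausible range {low:.0%} to {high:.0%}")
```

Reporting the range alongside the headline rate is one way to practise "admit when wrong" before anyone asks.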
Developing a Communication Strategy
A systematic approach to analytics communication ensures consistent quality and builds organizational capability over time.
# Python: Analytics Communication Strategy Framework
import pandas as pd
# Define communication channels and cadence
communication_framework = pd.DataFrame({
"deliverable": ["Match Report", "Weekly Summary", "Transfer Target Report",
"Executive Dashboard", "Player Feedback", "Season Review"],
"audience": ["Coaches", "Technical Director", "DOF, Scouts",
"CEO, Board", "Individual Players", "All Staff"],
"frequency": ["Post-match", "Weekly", "On request",
"Monthly", "Monthly", "End of season"],
"owner": ["Performance Analyst", "Head of Analytics", "Recruitment Analyst",
"Head of Analytics", "Performance Analyst", "Analytics Team"],
"format": ["1-page + video", "Dashboard", "5-page PDF",
"Interactive", "1-on-1 meeting", "Presentation"]
})
# Quality checklist for deliverables
quality_checklist = {
"before_release": [
"Data accuracy verified by second analyst",
"Key insight highlighted in first 30 seconds",
"Visualizations pass the 5-second test",
"Recommendation is specific and actionable",
"Uncertainty/caveats clearly stated",
"Proofread for typos and errors"
],
"after_release": [
"Collect feedback from recipients",
"Track whether recommendation was followed",
"Log outcome when available",
"Schedule follow-up if needed"
]
}
print("Communication Framework:")
print(communication_framework)
print("\n=== QUALITY CHECKLIST ===")
print("\nBefore Release:")
for item in quality_checklist["before_release"]:
    print(f"[ ] {item}")
print("\nAfter Release:")
for item in quality_checklist["after_release"]:
    print(f"[ ] {item}")
# R: Analytics Communication Strategy Framework
library(tidyverse)
# Define communication channels and cadence
communication_framework <- tribble(
~deliverable, ~audience, ~frequency, ~owner, ~format,
"Match Report", "Coaches", "Post-match", "Performance Analyst", "1-page + video",
"Weekly Summary", "Technical Director", "Weekly", "Head of Analytics", "Dashboard",
"Transfer Target Report", "DOF, Scouts", "On request", "Recruitment Analyst", "5-page PDF",
"Executive Dashboard", "CEO, Board", "Monthly", "Head of Analytics", "Interactive",
"Player Feedback", "Individual Players", "Monthly", "Performance Analyst", "1-on-1 meeting",
"Season Review", "All Staff", "End of season", "Analytics Team", "Presentation"
)
# Quality checklist for deliverables
quality_checklist <- list(
before_release = c(
"Data accuracy verified by second analyst",
"Key insight highlighted in first 30 seconds",
"Visualizations pass the 5-second test",
"Recommendation is specific and actionable",
"Uncertainty/caveats clearly stated",
"Proofread for typos and errors"
),
after_release = c(
"Collect feedback from recipients",
"Track whether recommendation was followed",
"Log outcome when available",
"Schedule follow-up if needed"
)
)
print("Communication Framework:")
print(communication_framework)
cat("\n=== QUALITY CHECKLIST ===\n")
cat("\nBefore Release:\n")
cat(paste("[ ]", quality_checklist$before_release, collapse = "\n"))
cat("\n\nAfter Release:\n")
cat(paste("[ ]", quality_checklist$after_release, collapse = "\n"))
Communication Framework:
deliverable audience frequency owner format
0 Match Report Coaches Post-match Performance Analyst 1-page + video
1 Weekly Summary Technical Director Weekly Head of Analytics Dashboard
2 Transfer Target Report DOF, Scouts On request Recruitment Analyst 5-page PDF
3 Executive Dashboard CEO, Board Monthly Head of Analytics Interactive
4 Player Feedback Individual Players Monthly Performance Analyst 1-on-1 meeting
5 Season Review All Staff End of season Analytics Team Presentation
=== QUALITY CHECKLIST ===
Before Release:
[ ] Data accuracy verified by second analyst
[ ] Key insight highlighted in first 30 seconds
[ ] Visualizations pass the 5-second test
[ ] Recommendation is specific and actionable
[ ] Uncertainty/caveats clearly stated
[ ] Proofread for typos and errors
After Release:
[ ] Collect feedback from recipients
[ ] Track whether recommendation was followed
[ ] Log outcome when available
[ ] Schedule follow-up if needed
Practice Exercises
Exercise 30.1: Automated Match Report Generator
Task: Build an automated system that generates comprehensive post-match reports with key metrics, visualizations, and narrative summaries tailored for different audiences.
Requirements:
- Generate executive summary with key findings in plain language
- Create visualizations for shots, passing networks, and territory control
- Produce audience-specific versions (coach, analyst, media)
- Include comparison against benchmarks and season averages
- Export to HTML/PDF format with consistent branding
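The benchmark requirement above can be approached by placing each match metric next to the team's season average and a z-score. The following is a rough sketch under stated assumptions: `compare_to_season` and `season_history` are hypothetical names, and the numbers are invented. It assumes you already hold one row of team metrics per previous match.

```python
import pandas as pd

def compare_to_season(match_stats, season_history):
    """Return match value, season mean, and z-score for each shared metric."""
    rows = []
    for metric, value in match_stats.items():
        if metric not in season_history.columns:
            continue  # skip metrics with no season baseline
        mean = season_history[metric].mean()
        std = season_history[metric].std(ddof=1)
        # Guard against zero or undefined spread (e.g. a single prior match)
        z = (value - mean) / std if std > 0 else 0.0
        rows.append({"metric": metric, "match": value,
                     "season_avg": mean, "z_score": z})
    return pd.DataFrame(rows)

# Hypothetical season log: one row per previous match
season_history = pd.DataFrame({
    "xg": [1.0, 1.5, 2.0, 1.5],
    "shots": [10, 12, 14, 12],
})
match_stats = {"xg": 2.5, "shots": 12}
comparison = compare_to_season(match_stats, season_history)
print(comparison)
```

For a coaching audience, the z-score would typically be translated into plain language such as "well above our usual chance creation" rather than shown as a number.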
import pandas as pd
import numpy as np
from statsbombpy import sb
import matplotlib.pyplot as plt
from matplotlib.patches import Arc, Rectangle, Circle
from datetime import datetime
import warnings
warnings.filterwarnings('ignore')

class MatchReportGenerator:
    """Automated match report generator with multiple output formats."""

    def __init__(self, match_id):
        self.match_id = match_id
        self.events = sb.events(match_id=match_id)
        self.lineups = sb.lineups(match_id=match_id)
        self.match_info = self._get_match_info()
        self.team_stats = self._calculate_team_stats()

    def _get_match_info(self):
        """Extract basic match information."""
        # Assumes the first team in the event data is the home side;
        # 'date' is the report date, not the match date
        teams = self.events['team'].unique()
        return {
            'home_team': teams[0] if len(teams) > 0 else 'Team A',
            'away_team': teams[1] if len(teams) > 1 else 'Team B',
            'date': datetime.now().strftime('%Y-%m-%d')
        }

    def _calculate_team_stats(self):
        """Calculate comprehensive team statistics."""
        stats = {}
        for team in [self.match_info['home_team'], self.match_info['away_team']]:
            team_events = self.events[self.events['type'] == team] if False else self.events[self.events['team'] == team]
            shots = team_events[team_events['type'] == 'Shot']
            # Tackles are duels with duel_type 'Tackle'; guard against the column being absent
            if 'duel_type' in team_events.columns:
                tackles = ((team_events['type'] == 'Duel') &
                           (team_events['duel_type'].fillna('') == 'Tackle')).sum()
            else:
                tackles = 0
            stats[team] = {
                'shots': len(shots),
                'shots_on_target': len(shots[shots['shot_outcome'].isin(['Goal', 'Saved'])]),
                'goals': len(shots[shots['shot_outcome'] == 'Goal']),
                'xg': shots['shot_statsbomb_xg'].sum() if 'shot_statsbomb_xg' in shots.columns else 0,
                'passes': len(team_events[team_events['type'] == 'Pass']),
                'pass_completion': self._calc_pass_completion(team_events),
                # Share of all events, used as a rough possession proxy
                'possession_pct': len(team_events) / len(self.events) * 100,
                'pressures': len(team_events[team_events['type'] == 'Pressure']),
                'tackles': int(tackles),
                'interceptions': len(team_events[team_events['type'] == 'Interception'])
            }
        return stats

    def _calc_pass_completion(self, team_events):
        """Calculate pass completion percentage."""
        passes = team_events[team_events['type'] == 'Pass']
        if len(passes) == 0:
            return 0
        # In this data a missing pass_outcome marks a completed pass
        completed = passes['pass_outcome'].isna().sum()
        return (completed / len(passes)) * 100
    def generate_executive_summary(self):
        """Generate plain-language executive summary."""
        home = self.match_info['home_team']
        away = self.match_info['away_team']
        h_stats = self.team_stats[home]
        a_stats = self.team_stats[away]
        # Determine result
        if h_stats['goals'] > a_stats['goals']:
            result = f"{home} won"
        elif a_stats['goals'] > h_stats['goals']:
            result = f"{away} won"
        else:
            result = "The match ended in a draw"
        # xG winner
        xg_winner = home if h_stats['xg'] > a_stats['xg'] else away
        # Possession dominance
        if h_stats['possession_pct'] > 55:
            poss_leader = home
        elif a_stats['possession_pct'] > 55:
            poss_leader = away
        else:
            poss_leader = "Neither team"
        summary = f"""
MATCH OVERVIEW
==============
{result} {h_stats['goals']}-{a_stats['goals']}.
KEY TAKEAWAYS:
- {xg_winner} created the better chances (xG: {h_stats['xg']:.2f} vs {a_stats['xg']:.2f})
- {poss_leader} dominated possession ({h_stats['possession_pct']:.1f}% vs {a_stats['possession_pct']:.1f}%)
- Shot efficiency: {home} {h_stats['shots_on_target']}/{h_stats['shots']} on target vs {away} {a_stats['shots_on_target']}/{a_stats['shots']}
PERFORMANCE RATING:
{home}: {self._rate_performance(h_stats)}
{away}: {self._rate_performance(a_stats)}
"""
        return summary.strip()

    def _rate_performance(self, stats):
        """Rate team performance based on key metrics."""
        score = 0
        # xG efficiency
        if stats['goals'] >= stats['xg']:
            score += 2
        elif stats['goals'] >= stats['xg'] * 0.8:
            score += 1
        # Pass completion
        if stats['pass_completion'] > 85:
            score += 2
        elif stats['pass_completion'] > 75:
            score += 1
        # Shots on target ratio
        if stats['shots'] > 0 and stats['shots_on_target'] / stats['shots'] > 0.4:
            score += 1
        if score >= 5:
            return "Excellent"
        elif score >= 3:
            return "Good"
        elif score >= 2:
            return "Average"
        else:
            return "Below Par"

    def generate_coach_summary(self):
        """Generate tactical analysis for coaching staff."""
        home = self.match_info['home_team']
        away = self.match_info['away_team']
        # Pressing analysis by zone
        press_by_zone = self._analyze_pressing()
        # Progressive passes
        prog_passes = self._analyze_progressive_passes()
        summary = f"""
TACTICAL ANALYSIS
=================
PRESSING:
- {home}: {press_by_zone.get(home, {}).get('High', 0)} high presses
- {away}: {press_by_zone.get(away, {}).get('High', 0)} high presses
PROGRESSIVE PLAY:
- {home}: {prog_passes.get(home, 0)} progressive passes
- {away}: {prog_passes.get(away, 0)} progressive passes
RECOMMENDATIONS:
{self._generate_recommendations()}
"""
        return summary.strip()
    def _analyze_pressing(self):
        """Analyze pressing by zone (thirds of the 120-unit pitch length)."""
        presses = self.events[self.events['type'] == 'Pressure'].copy()
        if len(presses) == 0:
            return {}
        presses['zone'] = presses['location'].apply(
            lambda loc: 'High' if isinstance(loc, list) and loc[0] > 80
            else ('Mid' if isinstance(loc, list) and loc[0] > 40 else 'Low')
        )
        result = {}
        for team in presses['team'].unique():
            team_presses = presses[presses['team'] == team]
            result[team] = team_presses.groupby('zone').size().to_dict()
        return result

    def _analyze_progressive_passes(self):
        """Count passes that move the ball more than 10 pitch units forward.

        This is a simplified definition; it counts attempted passes,
        whether or not they were completed.
        """
        passes = self.events[self.events['type'] == 'Pass'].copy()

        def calc_progress(row):
            if not isinstance(row['location'], list) or not isinstance(row.get('pass_end_location'), list):
                return 0
            return row['pass_end_location'][0] - row['location'][0]

        passes['progress'] = passes.apply(calc_progress, axis=1)
        result = {}
        for team in passes['team'].unique():
            team_passes = passes[passes['team'] == team]
            result[team] = int((team_passes['progress'] > 10).sum())
        return result

    def _generate_recommendations(self):
        """Generate data-driven recommendations for the home team."""
        recs = []
        home = self.match_info['home_team']
        h_stats = self.team_stats[home]
        if h_stats['goals'] < h_stats['xg'] * 0.8:
            recs.append(f"- Review finishing; created {h_stats['xg']:.2f} xG but only scored {h_stats['goals']}")
        if h_stats['pass_completion'] < 75:
            recs.append(f"- Passing under pressure needs work ({h_stats['pass_completion']:.1f}% completion)")
        if not recs:
            recs.append("- Strong all-around performance, maintain current approach")
        return '\n'.join(recs)
    def create_shot_map(self, team_name, ax=None):
        """Create shot map visualization."""
        if ax is None:
            fig, ax = plt.subplots(figsize=(10, 7))
        # Draw pitch (simplified)
        ax.set_xlim(60, 120)
        ax.set_ylim(0, 80)
        ax.set_facecolor('darkgreen')
        ax.add_patch(Rectangle((102, 18), 18, 44, fill=False, color='white', lw=2))  # Box
        ax.axvline(x=120, color='white', lw=2)  # Goal line
        # Plot shots
        shots = self.events[(self.events['type'] == 'Shot') &
                            (self.events['team'] == team_name)]
        for _, shot in shots.iterrows():
            if isinstance(shot['location'], list):
                x, y = shot['location']
                xg = shot.get('shot_statsbomb_xg', 0.1)
                is_goal = shot.get('shot_outcome') == 'Goal'
                ax.scatter(x, y, s=xg*500,
                           c='red' if is_goal else 'steelblue',
                           alpha=0.7, edgecolors='white', linewidths=1)
        total_xg = shots['shot_statsbomb_xg'].sum() if 'shot_statsbomb_xg' in shots.columns else 0
        ax.set_title(f"{team_name} - Shot Map\nTotal xG: {total_xg:.2f}", fontsize=12, fontweight='bold')
        ax.set_xticks([])
        ax.set_yticks([])
        return ax

    def create_full_report(self, output_path='match_report.html'):
        """Generate complete HTML report."""
        # Create visualizations
        fig, axes = plt.subplots(1, 2, figsize=(14, 6))
        self.create_shot_map(self.match_info['home_team'], axes[0])
        self.create_shot_map(self.match_info['away_team'], axes[1])
        plt.tight_layout()
        plt.savefig('shots.png', dpi=150, bbox_inches='tight', facecolor='white')
        plt.close()
        # Build HTML
        html = f"""
<!DOCTYPE html>
<html>
<head>
<title>Match Report</title>
<style>
body {{ font-family: Arial, sans-serif; max-width: 900px; margin: 0 auto; padding: 20px; }}
.header {{ background: #1B5E20; color: white; padding: 20px; text-align: center; }}
.section {{ margin: 20px 0; padding: 15px; background: #f5f5f5; border-radius: 5px; }}
pre {{ white-space: pre-wrap; font-family: inherit; }}
img {{ max-width: 100%; }}
</style>
</head>
<body>
<div class="header">
<h1>Match Report</h1>
<h2>{self.match_info['home_team']} vs {self.match_info['away_team']}</h2>
</div>
<div class="section">
<h3>Executive Summary</h3>
<pre>{self.generate_executive_summary()}</pre>
</div>
<div class="section">
<h3>Shot Maps</h3>
<img src="shots.png" alt="Shot Maps">
</div>
<div class="section">
<h3>Tactical Analysis</h3>
<pre>{self.generate_coach_summary()}</pre>
</div>
<footer>
<p>Generated: {datetime.now().strftime('%Y-%m-%d %H:%M')}</p>
</footer>
</body>
</html>
"""
        # Write as UTF-8 so accented team and player names survive
        with open(output_path, 'w', encoding='utf-8') as f:
            f.write(html)
        print(f"Report saved to {output_path}")
        return output_path

# Example usage
print("=== MATCH REPORT GENERATOR ===\n")
# Demo with a sample match
try:
    # Get a sample match
    competitions = sb.competitions()
    la_liga = competitions[competitions['competition_name'] == 'La Liga'].iloc[0]
    matches = sb.matches(competition_id=la_liga['competition_id'],
                         season_id=la_liga['season_id'])
    sample_match = matches.iloc[0]['match_id']
    # Generate report
    reporter = MatchReportGenerator(match_id=sample_match)
    print("EXECUTIVE SUMMARY:")
    print(reporter.generate_executive_summary())
    print("\n" + "="*50 + "\n")
    print("COACH SUMMARY:")
    print(reporter.generate_coach_summary())
    # Create visualizations
    fig, ax = plt.subplots(figsize=(10, 7))
    reporter.create_shot_map(reporter.match_info['home_team'], ax)
    plt.savefig('sample_shot_map.png', dpi=150, bbox_inches='tight')
    plt.show()
except Exception as e:
    print(f"Demo output (sample data not available): {e}")
    print("\nTo use: reporter = MatchReportGenerator(match_id=12345)")
    print("reporter.generate_executive_summary()")
library(StatsBombR)
library(tidyverse)
library(ggplot2)
library(gridExtra)
library(knitr)
# Match Report Generator Class/Functions
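generate_match_report <- function(match_id, competition_name = "La Liga") {
  # Load match data
  # .env$ refers to the function argument; a bare competition_name on both
  # sides would compare the column with itself and keep every competition
  competitions <- FreeCompetitions() %>%
    filter(competition_name == .env$competition_name)
  matches <- FreeMatches(competitions)
  events <- StatsBombFreeEvents(MatchesDF = matches %>% filter(match_id == !!match_id))
  # allclean() adds the location.x / location.y and pass.end_location.x columns used below
  events <- allclean(events)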
# Get match info
match_info <- matches %>% filter(match_id == !!match_id)
report <- list()
report$match_info <- match_info
# Calculate team statistics
report$team_stats <- events %>%
group_by(team.name) %>%
summarise(
shots = sum(type.name == "Shot"),
shots_on_target = sum(type.name == "Shot" &
shot.outcome.name %in% c("Goal", "Saved")),
goals = sum(type.name == "Shot" & shot.outcome.name == "Goal"),
xg = sum(ifelse(type.name == "Shot", shot.statsbomb_xg, 0), na.rm = TRUE),
passes = sum(type.name == "Pass"),
pass_completion = mean(ifelse(type.name == "Pass", is.na(pass.outcome.name), NA), na.rm = TRUE) * 100,
possession_pct = n() / nrow(events) * 100,
pressures = sum(type.name == "Pressure"),
tackles = sum(type.name == "Duel" & duel.type.name == "Tackle"),
interceptions = sum(type.name == "Interception"),
.groups = "drop"
)
# Key events
report$key_events <- events %>%
filter(type.name %in% c("Shot", "Goal", "Substitution")) %>%
arrange(minute) %>%
select(minute, team.name, player.name, type.name, shot.outcome.name)
# Generate narrative summary
home_team <- match_info$home_team.home_team_name
away_team <- match_info$away_team.away_team_name
home_stats <- report$team_stats %>% filter(team.name == home_team)
away_stats <- report$team_stats %>% filter(team.name == away_team)
report$executive_summary <- generate_executive_summary(
home_team, away_team, home_stats, away_stats
)
report$coach_summary <- generate_coach_summary(
events, home_team, away_team, home_stats, away_stats
)
report
}
# Executive Summary Generator (plain language)
generate_executive_summary <- function(home_team, away_team, home_stats, away_stats) {
# Determine winner by xG
xg_winner <- ifelse(home_stats$xg > away_stats$xg, home_team, away_team)
xg_margin <- abs(home_stats$xg - away_stats$xg)
# Determine actual winner
if (home_stats$goals > away_stats$goals) {
result <- paste(home_team, "won")
goal_diff <- home_stats$goals - away_stats$goals
} else if (away_stats$goals > home_stats$goals) {
result <- paste(away_team, "won")
goal_diff <- away_stats$goals - home_stats$goals
} else {
result <- "The match ended in a draw"
goal_diff <- 0
}
# Build narrative
summary <- paste0(
"MATCH OVERVIEW\n",
"==============\n\n",
result, " ", home_stats$goals, "-", away_stats$goals, ".\n\n",
"KEY TAKEAWAYS:\n",
"- ", xg_winner, " created the better chances (xG: ",
round(home_stats$xg, 2), " vs ", round(away_stats$xg, 2), ")\n",
"- ", ifelse(home_stats$possession_pct > 55, home_team,
ifelse(away_stats$possession_pct > 55, away_team, "Neither team")),
" dominated possession (",
round(home_stats$possession_pct, 1), "% vs ",
round(away_stats$possession_pct, 1), "%)\n",
"- Shot efficiency: ", home_team, " ", home_stats$shots_on_target, "/",
home_stats$shots, " on target vs ", away_team, " ",
away_stats$shots_on_target, "/", away_stats$shots, "\n\n",
"PERFORMANCE RATING:\n",
home_team, ": ", rate_performance(home_stats), "\n",
away_team, ": ", rate_performance(away_stats)
)
summary
}
# Performance rating helper
rate_performance <- function(stats) {
score <- 0
# xG efficiency
if (stats$goals >= stats$xg) score <- score + 2
else if (stats$goals >= stats$xg * 0.8) score <- score + 1
# Pass completion
if (stats$pass_completion > 85) score <- score + 2
else if (stats$pass_completion > 75) score <- score + 1
# Shots on target ratio
  # Guard against division by zero when a team had no shots
  if (stats$shots > 0 && stats$shots_on_target / stats$shots > 0.4) score <- score + 1
case_when(
score >= 5 ~ "Excellent",
score >= 3 ~ "Good",
score >= 2 ~ "Average",
TRUE ~ "Below Par"
)
}
# Coach Summary (tactical focus)
generate_coach_summary <- function(events, home_team, away_team, home_stats, away_stats) {
# Pressing analysis
press_by_zone <- events %>%
filter(type.name == "Pressure") %>%
mutate(
zone = case_when(
location.x > 80 ~ "High",
location.x > 40 ~ "Mid",
TRUE ~ "Low"
)
) %>%
group_by(team.name, zone) %>%
summarise(presses = n(), .groups = "drop") %>%
pivot_wider(names_from = zone, values_from = presses, values_fill = 0)
# Progressive passes
prog_passes <- events %>%
filter(type.name == "Pass") %>%
mutate(
progress = pass.end_location.x - location.x
) %>%
group_by(team.name) %>%
summarise(
progressive_passes = sum(progress > 10, na.rm = TRUE),
avg_progress = mean(progress, na.rm = TRUE),
.groups = "drop"
)
# Build coach brief
summary <- paste0(
"TACTICAL ANALYSIS\n",
"=================\n\n",
"PRESSING:\n",
"- ", home_team, ": ", press_by_zone$High[press_by_zone$team.name == home_team],
" high presses\n",
"- ", away_team, ": ", press_by_zone$High[press_by_zone$team.name == away_team],
" high presses\n\n",
"PROGRESSIVE PLAY:\n",
"- ", home_team, ": ", prog_passes$progressive_passes[prog_passes$team.name == home_team],
" progressive passes\n",
"- ", away_team, ": ", prog_passes$progressive_passes[prog_passes$team.name == away_team],
" progressive passes\n\n",
"RECOMMENDATIONS:\n",
generate_recommendations(home_stats, away_stats, home_team, away_team)
)
summary
}
# Generate recommendations based on data
generate_recommendations <- function(home_stats, away_stats, home_team, away_team) {
recs <- c()
  # If underperformed xG
  # paste0() avoids the extra spaces that paste() would insert between pieces
  if (home_stats$goals < home_stats$xg * 0.8) {
    recs <- c(recs, paste0("- Review ", home_team, "'s finishing; created ",
                           round(home_stats$xg, 2), " xG but only scored ",
                           home_stats$goals))
  }
  # If low pass completion
  if (home_stats$pass_completion < 75) {
    recs <- c(recs, paste0("- ", home_team, " passing under pressure needs work (",
                           round(home_stats$pass_completion, 1), "% completion)"))
  }
  # If allowing too many shots
  if (away_stats$shots > 15) {
    recs <- c(recs, paste0("- Defensive shape allowed ", away_stats$shots,
                           " opposition shots"))
  }
if (length(recs) == 0) {
recs <- "- Strong all-around performance, maintain current approach"
}
paste(recs, collapse = "\n")
}
# Create visualization panel
create_report_visuals <- function(events, team_name) {
# Shot map
shots <- events %>%
filter(type.name == "Shot", team.name == team_name)
p1 <- ggplot(shots, aes(x = location.x, y = location.y)) +
annotate("rect", xmin = 0, xmax = 120, ymin = 0, ymax = 80,
fill = "darkgreen", alpha = 0.3) +
geom_point(aes(size = shot.statsbomb_xg,
color = shot.outcome.name == "Goal"),
alpha = 0.7) +
scale_color_manual(values = c("FALSE" = "steelblue", "TRUE" = "red"),
labels = c("No Goal", "Goal"), name = "Outcome") +
scale_size_continuous(range = c(2, 8), name = "xG") +
coord_fixed(ratio = 80/120) +
labs(title = paste(team_name, "- Shot Map"),
subtitle = paste("Total xG:", round(sum(shots$shot.statsbomb_xg, na.rm = TRUE), 2))) +
theme_minimal() +
theme(legend.position = "bottom")
# Pass network simplified
passes <- events %>%
filter(type.name == "Pass", team.name == team_name, is.na(pass.outcome.name))
player_positions <- passes %>%
group_by(player.name) %>%
summarise(
x = mean(location.x),
y = mean(location.y),
passes = n(),
.groups = "drop"
) %>%
filter(passes >= 5)
p2 <- ggplot(player_positions, aes(x = x, y = y)) +
annotate("rect", xmin = 0, xmax = 120, ymin = 0, ymax = 80,
fill = "darkgreen", alpha = 0.3) +
geom_point(aes(size = passes), color = "steelblue", alpha = 0.7) +
geom_text(aes(label = substr(player.name, 1, 10)), size = 2, vjust = -1) +
scale_size_continuous(range = c(3, 10), name = "Passes") +
coord_fixed(ratio = 80/120) +
labs(title = paste(team_name, "- Average Positions"),
subtitle = "Size = pass volume") +
theme_minimal()
# Combine
grid.arrange(p1, p2, ncol = 2)
}
# Example usage
cat("=== MATCH REPORT GENERATOR ===\n\n")
cat("To generate a report, run:\n")
cat("report <- generate_match_report(match_id = 3773585)\n")
cat("cat(report$executive_summary)\n")
cat("cat(report$coach_summary)\n")
cat("create_report_visuals(events, 'Barcelona')\n")
# Demo with sample output
demo_output <- "
MATCH OVERVIEW
==============
Barcelona won 3-1.
KEY TAKEAWAYS:
- Barcelona created the better chances (xG: 2.34 vs 0.87)
- Barcelona dominated possession (67.2% vs 32.8%)
- Shot efficiency: Barcelona 5/12 on target vs Real Madrid 2/8
PERFORMANCE RATING:
Barcelona: Excellent
Real Madrid: Below Par
"
cat(demo_output)
Exercise 30.2: Interactive Player Comparison Dashboard
Task: Build an interactive dashboard that compares player performance across multiple metrics with percentile rankings, radar charts, and trend analysis.
Requirements:
- Calculate percentile rankings against position-specific benchmarks
- Create radar charts for visual comparison
- Show performance trends over recent matches
- Include context with league averages and top performer benchmarks
- Generate exportable comparison reports
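The trend requirement above can be sketched with a per-player rolling mean over recent matches. This is only an illustration: the function `rolling_trend`, the match-level log, and its columns (`player_name`, `match_date`, `xg`) are assumptions and the values are invented.

```python
import pandas as pd

def rolling_trend(match_log, metric, window=5):
    """Add a per-player rolling mean of `metric`, ordered by match date."""
    ordered = match_log.sort_values(["player_name", "match_date"]).copy()
    ordered[f"{metric}_rolling"] = (
        ordered.groupby("player_name")[metric]
               .transform(lambda s: s.rolling(window, min_periods=1).mean())
    )
    return ordered

# Hypothetical match-level log for one player
match_log = pd.DataFrame({
    "player_name": ["A"] * 4,
    "match_date": pd.to_datetime(["2024-01-01", "2024-01-08",
                                  "2024-01-15", "2024-01-22"]),
    "xg": [0.2, 0.4, 0.6, 0.8],
})
trend = rolling_trend(match_log, "xg", window=2)
print(trend[["match_date", "xg", "xg_rolling"]])
```

Setting `min_periods=1` keeps the first matches in the chart instead of leaving them blank, at the cost of noisier early values.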
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
from scipy import stats
import warnings
warnings.filterwarnings('ignore')

class PlayerComparisonDashboard:
    """Interactive dashboard for comparing player performance."""

    def __init__(self, player_data):
        """
        Initialize with player data DataFrame.
        Expected columns: player_id, player_name, position, minutes,
        goals_p90, assists_p90, xg_p90, xa_p90, etc.
        """
        self.player_data = player_data
        self.metrics = ['goals_p90', 'assists_p90', 'xg_p90', 'xa_p90',
                        'passes_p90', 'key_passes_p90', 'dribbles_p90']

    def calculate_percentiles(self, player_ids, position='Forward', min_minutes=900):
        """Calculate percentile rankings against position peers."""
        # Filter benchmark population
        benchmark = self.player_data[
            (self.player_data['position'] == position) &
            (self.player_data['minutes'] >= min_minutes)
        ]
        # Get selected players
        selected = self.player_data[
            self.player_data['player_id'].isin(player_ids)
        ].copy()
        # Calculate percentiles
        for metric in self.metrics:
            if metric in benchmark.columns:
                selected[f'{metric}_pct'] = selected[metric].apply(
                    lambda x: stats.percentileofscore(benchmark[metric].dropna(), x)
                )
        return selected
    def create_radar_chart(self, percentile_data, ax=None):
        """Create radar chart comparing players."""
        if ax is None:
            fig, ax = plt.subplots(figsize=(8, 8), subplot_kw=dict(projection='polar'))
        # Prepare data
        pct_cols = [f'{m}_pct' for m in self.metrics if f'{m}_pct' in percentile_data.columns]
        # Build labels from the columns actually present so ticks and labels always match
        labels = [c.replace('_p90_pct', '').replace('_', ' ').title() for c in pct_cols]
        # Number of variables
        num_vars = len(pct_cols)
        angles = np.linspace(0, 2 * np.pi, num_vars, endpoint=False).tolist()
        angles += angles[:1]  # Complete the loop
        # Colors for each player
        colors = ['#1B5E20', '#C62828', '#1565C0', '#FF8F00']
        for idx, (_, player) in enumerate(percentile_data.iterrows()):
            values = player[pct_cols].values.tolist()
            values += values[:1]  # Complete the loop
            ax.plot(angles, values, 'o-', linewidth=2,
                    label=player['player_name'], color=colors[idx % len(colors)])
            ax.fill(angles, values, alpha=0.15, color=colors[idx % len(colors)])
        # Configure chart
        ax.set_xticks(angles[:-1])
        ax.set_xticklabels(labels, size=10)
        ax.set_ylim(0, 100)
        ax.set_yticks([25, 50, 75, 90])
        ax.set_yticklabels(['25%', '50%', '75%', '90%'], size=8)
        ax.legend(loc='upper right', bbox_to_anchor=(1.3, 1.0))
        ax.set_title('Player Comparison - Percentile Rankings', size=14, fontweight='bold', y=1.08)
        return ax
    def create_percentile_bars(self, percentile_data, ax=None):
        """Create horizontal bar chart of percentile rankings."""
        if ax is None:
            fig, ax = plt.subplots(figsize=(10, 8))
        pct_cols = [f'{m}_pct' for m in self.metrics if f'{m}_pct' in percentile_data.columns]
        labels = [m.replace('_p90_pct', '').replace('_', ' ').title() for m in pct_cols]
        x = np.arange(len(labels))
        width = 0.25
        colors = ['#1B5E20', '#C62828', '#1565C0']
        for idx, (_, player) in enumerate(percentile_data.iterrows()):
            values = player[pct_cols].values
            offset = (idx - len(percentile_data) / 2 + 0.5) * width
            ax.barh(x + offset, values, width, label=player['player_name'],
                    color=colors[idx % len(colors)], alpha=0.8)
        # Add reference lines
        for pct in [25, 50, 75, 90]:
            ax.axvline(pct, color='gray', linestyle='--', alpha=0.5, linewidth=1)
        ax.set_yticks(x)
        ax.set_yticklabels(labels)
        ax.set_xlabel('Percentile')
        ax.set_xlim(0, 100)
        ax.legend(loc='lower right')
        ax.set_title('Percentile Rankings by Metric', fontweight='bold')
        # Add tier labels
        ax.text(12.5, len(labels), 'Poor', ha='center', va='bottom', fontsize=8, color='gray')
        ax.text(37.5, len(labels), 'Below\nAvg', ha='center', va='bottom', fontsize=8, color='gray')
        ax.text(62.5, len(labels), 'Avg', ha='center', va='bottom', fontsize=8, color='gray')
        ax.text(82.5, len(labels), 'Above\nAvg', ha='center', va='bottom', fontsize=8, color='gray')
        ax.text(95, len(labels), 'Elite', ha='center', va='bottom', fontsize=8, color='gray')
        return ax
    def generate_summary(self, percentile_data):
        """Generate text summary of comparison."""
        pct_cols = [f'{m}_pct' for m in self.metrics if f'{m}_pct' in percentile_data.columns]
        summary = "PLAYER COMPARISON SUMMARY\n"
        summary += "=" * 25 + "\n\n"

        for _, player in percentile_data.iterrows():
            values = {col.replace('_p90_pct', ''): player[col]
                      for col in pct_cols if col in player.index}

            # Sort to find strengths and weaknesses
            sorted_metrics = sorted(values.items(), key=lambda x: x[1], reverse=True)
            strengths = sorted_metrics[:2]
            weaknesses = sorted_metrics[-2:]

            summary += f"{player['player_name']}\n"
            summary += f"Strengths: {strengths[0][0]} ({strengths[0][1]:.0f}%), "
            summary += f"{strengths[1][0]} ({strengths[1][1]:.0f}%)\n"
            summary += f"Weaknesses: {weaknesses[0][0]} ({weaknesses[0][1]:.0f}%), "
            summary += f"{weaknesses[1][0]} ({weaknesses[1][1]:.0f}%)\n\n"

        return summary

    def create_full_dashboard(self, player_ids, position='Forward'):
        """Create complete comparison dashboard."""
        # Calculate percentiles
        percentile_data = self.calculate_percentiles(player_ids, position)

        # Create figure with subplots
        fig = plt.figure(figsize=(16, 10))

        # Radar chart
        ax1 = fig.add_subplot(121, projection='polar')
        self.create_radar_chart(percentile_data, ax1)

        # Percentile bars
        ax2 = fig.add_subplot(122)
        self.create_percentile_bars(percentile_data, ax2)

        plt.suptitle(f'Player Comparison Dashboard - {position}s', fontsize=16, fontweight='bold', y=1.02)
        plt.tight_layout()

        # Print summary
        print(self.generate_summary(percentile_data))
        return fig, percentile_data

# Demo with simulated data
np.random.seed(42)
# Create sample player data
n_players = 100
player_data = pd.DataFrame({
'player_id': range(1, n_players + 1),
'player_name': [f'Player_{i}' for i in range(1, n_players + 1)],
'position': np.random.choice(['Forward', 'Midfielder', 'Defender'], n_players,
p=[0.3, 0.4, 0.3]),
'minutes': np.random.randint(500, 3000, n_players),
'goals_p90': np.abs(np.random.normal(0.3, 0.15, n_players)),
'assists_p90': np.abs(np.random.normal(0.2, 0.1, n_players)),
'xg_p90': np.abs(np.random.normal(0.35, 0.12, n_players)),
'xa_p90': np.abs(np.random.normal(0.18, 0.08, n_players)),
'passes_p90': np.abs(np.random.normal(40, 10, n_players)),
'key_passes_p90': np.abs(np.random.normal(1.5, 0.5, n_players)),
'dribbles_p90': np.abs(np.random.normal(2, 0.8, n_players))
})
# Name specific players
player_data.loc[0, 'player_name'] = 'Messi Jr.'
player_data.loc[14, 'player_name'] = 'Ronaldo Jr.'
player_data.loc[41, 'player_name'] = 'Mbappe Jr.'
# Create dashboard
print("=== PLAYER COMPARISON DASHBOARD ===\n")
dashboard = PlayerComparisonDashboard(player_data)
fig, percentile_data = dashboard.create_full_dashboard(
player_ids=[1, 15, 42],
position='Forward'
)
plt.savefig('player_comparison_dashboard.png', dpi=150, bbox_inches='tight')
plt.show()
# Show the comparison data
print("\nPercentile Rankings:")
pct_cols = [col for col in percentile_data.columns if '_pct' in col]
print(percentile_data[['player_name'] + pct_cols].round(1).to_string(index=False))
library(tidyverse)
library(fmsb)
library(ggplot2)
library(gridExtra)
# Player Comparison Dashboard
create_player_comparison_dashboard <- function(player_data, player_ids, position = "Forward") {
# Filter to selected players
selected <- player_data %>%
filter(player_id %in% player_ids)
# Get position benchmarks
benchmarks <- player_data %>%
filter(position == !!position, minutes >= 900) %>%
summarise(across(where(is.numeric), list(
mean = ~mean(., na.rm = TRUE),
p90 = ~quantile(., 0.9, na.rm = TRUE),
p10 = ~quantile(., 0.1, na.rm = TRUE)
)))
# Calculate percentiles for each player
metrics <- c("goals_p90", "assists_p90", "xg_p90", "xa_p90",
"passes_p90", "key_passes_p90", "dribbles_p90")
percentile_data <- selected %>%
rowwise() %>%
mutate(across(all_of(metrics), ~{
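# !!position injects the function argument; a bare `position` here would be
# read as the current row's position column by the data mask
all_values <- player_data[[cur_column()]][player_data$position == !!position &
player_data$minutes >= 900]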
ecdf(all_values)(.) * 100
}, .names = "{.col}_pct")) %>%
ungroup()
list(
players = selected,
percentiles = percentile_data,
benchmarks = benchmarks,
radar_data = prepare_radar_data(percentile_data, metrics)
)
}
# Prepare radar chart data
prepare_radar_data <- function(percentile_data, metrics) {
metric_pcts <- paste0(metrics, "_pct")
radar_df <- percentile_data %>%
select(player_name, all_of(metric_pcts)) %>%
column_to_rownames("player_name")
# Radar chart needs max and min rows
radar_df <- rbind(
rep(100, ncol(radar_df)), # Max
rep(0, ncol(radar_df)), # Min
radar_df
)
# Clean column names for display
colnames(radar_df) <- gsub("_p90_pct", "", colnames(radar_df)) %>%
gsub("_", " ", .) %>%
str_to_title()
radar_df
}
# Create radar chart comparison
plot_radar_comparison <- function(radar_data, colors = c("#1B5E20", "#C62828", "#1565C0")) {
n_players <- nrow(radar_data) - 2
# Set up plot parameters
par(mfrow = c(1, 1), mar = c(2, 2, 2, 2))
radarchart(radar_data,
axistype = 1,
pcol = colors[1:n_players],
pfcol = adjustcolor(colors[1:n_players], alpha = 0.2),
plwd = 2,
plty = 1,
cglcol = "grey",
cglty = 1,
axislabcol = "grey",
caxislabels = c("0%", "25%", "50%", "75%", "100%"),
vlcex = 0.8,
title = "Player Comparison - Percentile Ranks")
legend("topright",
legend = rownames(radar_data)[3:nrow(radar_data)],
col = colors[1:n_players],
lty = 1, lwd = 2,
cex = 0.8)
}
# Create trend analysis
plot_performance_trends <- function(match_data, player_ids, metric = "xg") {
trend_data <- match_data %>%
filter(player_id %in% player_ids) %>%
arrange(player_name, match_date) %>%
group_by(player_name) %>%
mutate(
rolling_avg = zoo::rollmean(get(metric), k = 5, fill = NA, align = "right"),
match_num = row_number()
) %>%
ungroup()
ggplot(trend_data, aes(x = match_num, y = rolling_avg, color = player_name)) +
geom_line(linewidth = 1.2) +
geom_point(aes(y = get(metric)), alpha = 0.4, size = 2) +
labs(
title = paste("Performance Trend -", str_to_title(metric)),
subtitle = "5-match rolling average",
x = "Match Number",
y = metric,
color = "Player"
) +
theme_minimal() +
theme(legend.position = "bottom") +
scale_color_manual(values = c("#1B5E20", "#C62828", "#1565C0"))
}
# Create percentile bar chart
plot_percentile_bars <- function(percentile_data, metrics) {
metric_pcts <- paste0(metrics, "_pct")
long_data <- percentile_data %>%
select(player_name, all_of(metric_pcts)) %>%
pivot_longer(-player_name, names_to = "metric", values_to = "percentile") %>%
mutate(
metric = gsub("_p90_pct", "", metric) %>% gsub("_", " ", .) %>% str_to_title(),
tier = case_when(
percentile >= 90 ~ "Elite (90%+)",
percentile >= 75 ~ "Above Avg (75-90%)",
percentile >= 50 ~ "Average (50-75%)",
percentile >= 25 ~ "Below Avg (25-50%)",
TRUE ~ "Poor (<25%)"
)
)
ggplot(long_data, aes(x = reorder(metric, percentile), y = percentile, fill = player_name)) +
geom_col(position = "dodge", alpha = 0.8) +
geom_hline(yintercept = c(25, 50, 75, 90), linetype = "dashed", alpha = 0.5) +
coord_flip() +
labs(
title = "Percentile Rankings by Metric",
x = NULL, y = "Percentile",
fill = "Player"
) +
scale_fill_manual(values = c("#1B5E20", "#C62828", "#1565C0")) +
scale_y_continuous(breaks = c(0, 25, 50, 75, 90, 100)) +
theme_minimal() +
theme(legend.position = "bottom")
}
# Generate comparison summary
generate_comparison_summary <- function(dashboard_data) {
players <- dashboard_data$percentiles
summary <- "PLAYER COMPARISON SUMMARY\n"
summary <- paste0(summary, "=========================\n\n")
for (i in 1:nrow(players)) {
p <- players[i, ]
# Find strengths and weaknesses
pct_cols <- grep("_pct$", names(p), value = TRUE)
pct_values <- as.numeric(p[, pct_cols])
names(pct_values) <- gsub("_p90_pct", "", pct_cols)
top_2 <- sort(pct_values, decreasing = TRUE)[1:2]
bottom_2 <- sort(pct_values)[1:2]
summary <- paste0(summary, p$player_name, "\n")
summary <- paste0(summary, "Strengths: ", paste(names(top_2), collapse = ", "),
" (", paste(round(top_2), collapse = "%, "), "%)\n")
summary <- paste0(summary, "Weaknesses: ", paste(names(bottom_2), collapse = ", "),
" (", paste(round(bottom_2), collapse = "%, "), "%)\n\n")
}
cat(summary)
invisible(summary)
}
# Demo with simulated data
set.seed(42)
# Create sample player data
n_players <- 100
player_data <- tibble(
player_id = 1:n_players,
player_name = paste0("Player_", 1:n_players),
position = sample(c("Forward", "Midfielder", "Defender"), n_players, replace = TRUE,
prob = c(0.3, 0.4, 0.3)),
minutes = sample(500:3000, n_players),
goals_p90 = abs(rnorm(n_players, 0.3, 0.15)),
assists_p90 = abs(rnorm(n_players, 0.2, 0.1)),
xg_p90 = abs(rnorm(n_players, 0.35, 0.12)),
xa_p90 = abs(rnorm(n_players, 0.18, 0.08)),
passes_p90 = abs(rnorm(n_players, 40, 10)),
key_passes_p90 = abs(rnorm(n_players, 1.5, 0.5)),
dribbles_p90 = abs(rnorm(n_players, 2, 0.8))
)
# Create dashboard for 3 players
cat("=== PLAYER COMPARISON DASHBOARD ===\n\n")
# Select players to compare
compare_ids <- c(1, 15, 42)
player_data$player_name[compare_ids] <- c("Messi Jr.", "Ronaldo Jr.", "Mbappe Jr.")
dashboard <- create_player_comparison_dashboard(
player_data, compare_ids, position = "Forward"
)
# Generate summary
generate_comparison_summary(dashboard)
# Create visualizations
par(mfrow = c(1, 1))
plot_radar_comparison(dashboard$radar_data)
# Percentile bars
metrics <- c("goals_p90", "assists_p90", "xg_p90", "xa_p90",
"passes_p90", "key_passes_p90", "dribbles_p90")
p <- plot_percentile_bars(dashboard$percentiles, metrics)
print(p)
Exercise 30.3: Pre-Match Opposition Scout Report Generator
Task: Build an automated system that generates comprehensive opposition scout reports with tactical analysis, key players to watch, and data-driven recommendations.
Requirements:
- Analyze opposition's recent form and key metrics
- Identify tactical patterns (formation, pressing style, build-up)
- Highlight top performers and their tendencies
- Generate specific tactical recommendations with supporting data
- Create visual pitch diagrams showing key areas
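Recommendations that quote a percentage of attacks per zone only make sense if the zone figures are shares of one total. The sketch below is a minimal illustration of that step, assuming raw attack counts per channel are available; the function names `zone_shares` and `primary_zone` and the sample counts are hypothetical and not part of the exercise data.

```python
def zone_shares(zone_counts):
    """Convert raw per-zone attack counts into percentage shares summing to 100."""
    total = sum(zone_counts.values())
    if total <= 0:
        raise ValueError("zone_counts must contain at least one positive value")
    return {zone: 100.0 * count / total for zone, count in zone_counts.items()}


def primary_zone(zone_counts):
    """Return the zone with the largest share together with that share."""
    shares = zone_shares(zone_counts)
    zone = max(shares, key=shares.get)
    return zone, shares[zone]


# Hypothetical attack counts per channel over recent matches
counts = {"left_wing": 48, "center": 30, "right_wing": 42}
zone, share = primary_zone(counts)
print(f"Primary attack zone: {zone} ({share:.1f}% of attacks)")
```

Normalising first means a line such as "40.0% attacks from left" can be read literally by a coach, instead of being an average of a raw metric with a percent sign attached.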
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime, timedelta
import warnings
warnings.filterwarnings('ignore')
class ScoutReportGenerator:
    """Automated opposition scout report generator."""

    def __init__(self, match_data, player_data):
        self.match_data = match_data
        self.player_data = player_data

    def generate_report(self, team_name):
        """Generate comprehensive scout report for opposition team."""
        report = {
            'team': team_name,
            'generated': datetime.now()
        }

        # Filter to team's recent matches
        team_matches = self.match_data[
            self.match_data['team'] == team_name
        ].sort_values('date', ascending=False).head(5)

        # Calculate team profile
        report['profile'] = self._calculate_team_profile(team_matches)

        # Key players
        report['key_players'] = self._identify_key_players(team_name)

        # Tactical patterns
        report['tactics'] = self._analyze_tactical_patterns(team_matches)

        # Recommendations
        report['recommendations'] = self._generate_recommendations(report)
        return report

    def _calculate_team_profile(self, matches):
        """Calculate team's recent profile."""
        recent_form = {
            'wins': (matches['result'] == 'W').sum(),
            'draws': (matches['result'] == 'D').sum(),
            'losses': (matches['result'] == 'L').sum(),
            'goals_for': matches['goals_for'].sum(),
            'goals_against': matches['goals_against'].sum(),
            'xg_for': matches['xg_for'].sum(),
            'xg_against': matches['xg_against'].sum(),
            'avg_possession': matches['possession'].mean()
        }

        style = {
            'avg_ppda': matches['ppda'].mean(),  # Pressing intensity
            'avg_passes': matches['passes'].mean(),
            'avg_shots': matches['shots'].mean(),
            'avg_crosses': matches['crosses'].mean(),
            'pass_completion': matches['pass_completion'].mean(),
            'directness': (matches['direct_attacks'] / matches['passes'] * 100).mean()  # % direct play
        }
        return {'recent_form': recent_form, 'style': style}

    def _identify_key_players(self, team_name, top_n=5):
        """Identify most dangerous players."""
        team_players = self.player_data[
            (self.player_data['team'] == team_name) &
            (self.player_data['minutes'] >= 450)
        ].copy()

        # Calculate contribution score
        team_players['contribution_score'] = (
            team_players['goals'] * 3 +
            team_players['assists'] * 2 +
            team_players['key_passes'] * 0.5 +
            team_players['successful_dribbles'] * 0.3 +
            team_players['tackles_won'] * 0.2
        )

        # Assign threat level (thresholds are quantiles within the top_n players)
        team_players = team_players.sort_values('contribution_score', ascending=False).head(top_n)
        threshold_high = team_players['contribution_score'].quantile(0.8)
        threshold_med = team_players['contribution_score'].quantile(0.5)
        team_players['threat_level'] = team_players['contribution_score'].apply(
            lambda x: 'High' if x > threshold_high else ('Medium' if x > threshold_med else 'Monitor')
        )

        return team_players[['player_name', 'position', 'minutes', 'goals', 'assists',
                             'xg', 'xa', 'key_passes', 'contribution_score', 'threat_level']]

    def _analyze_tactical_patterns(self, matches):
        """Analyze team's tactical patterns."""
        # Primary formation
        formation = matches['formation'].mode().iloc[0] if len(matches) > 0 else '4-3-3'

        # Build-up style
        if matches['passes'].mean() > 500 and matches['possession'].mean() > 55:
            build_up = 'Possession-based'
        elif matches['direct_attacks'].mean() > 20:
            build_up = 'Direct/Counter'
        else:
            build_up = 'Balanced'

        # Pressing style
        avg_ppda = matches['ppda'].mean()
        if avg_ppda < 10:
            pressing = 'High Press'
        elif avg_ppda > 15:
            pressing = 'Low Block'
        else:
            pressing = 'Mid Block'

        # Attack zones
        attack_zones = {
            'left_wing': matches['attacks_left'].mean(),
            'center': matches['attacks_center'].mean(),
            'right_wing': matches['attacks_right'].mean()
        }
        primary_attack = max(attack_zones, key=attack_zones.get)

        return {
            'formation': formation,
            'build_up_style': build_up,
            'pressing_style': pressing,
            'primary_attack_zone': primary_attack,
            'attack_zones': attack_zones,
            'set_piece_threat': matches['set_piece_goals'].mean() > 0.5
        }

    def _generate_recommendations(self, report):
        """Generate tactical recommendations based on analysis."""
        recs = {}

        # Based on pressing style
        if report['tactics']['pressing_style'] == 'High Press':
            recs['against_press'] = {
                'recommendation': 'Play through or over the press',
                'detail': 'They press high - use quick passing combinations or long balls behind',
                'supporting_data': f"PPDA: {report['profile']['style']['avg_ppda']:.1f}"
            }
        elif report['tactics']['pressing_style'] == 'Low Block':
            recs['against_block'] = {
                'recommendation': 'Patient build-up, exploit wide areas',
                'detail': 'They sit deep - circulate ball and stretch them horizontally',
                'supporting_data': f"PPDA: {report['profile']['style']['avg_ppda']:.1f}"
            }

        # Based on build-up style
        if report['tactics']['build_up_style'] == 'Possession-based':
            recs['disrupt_possession'] = {
                'recommendation': 'Press their build-up triggers',
                'detail': 'Target their center-backs and defensive midfielder when receiving',
                'supporting_data': f"Avg passes: {report['profile']['style']['avg_passes']:.0f}"
            }

        # Based on attack zones
        zones = report['tactics']['attack_zones']
        if report['tactics']['primary_attack_zone'] == 'left_wing':
            recs['defend_flank'] = {
                'recommendation': 'Reinforce right defensive side',
                'detail': 'Their primary threat comes down the left - double up if needed',
                'supporting_data': f"{zones['left_wing']:.1f}% attacks from left"
            }
        elif report['tactics']['primary_attack_zone'] == 'right_wing':
            recs['defend_flank'] = {
                'recommendation': 'Reinforce left defensive side',
                'detail': 'Their primary threat comes down the right - double up if needed',
                'supporting_data': f"{zones['right_wing']:.1f}% attacks from right"
            }

        # Set pieces (mirrors the R version, which uses the set_piece_threat flag)
        if report['tactics']['set_piece_threat']:
            recs['set_pieces'] = {
                'recommendation': 'Extra attention on set pieces',
                'detail': 'Significant goal threat from corners and free kicks',
                'supporting_data': 'Above average set piece goals'
            }

        # Key player focus
        key_player = report['key_players'].iloc[0]
        recs['key_man'] = {
            'recommendation': f"Man-mark or limit service to {key_player['player_name']}",
            'detail': f"Their most dangerous player with {int(key_player['goals'])}G and {int(key_player['assists'])}A",
            'supporting_data': f"Contribution score: {key_player['contribution_score']:.1f}"
        }
        return recs

    def format_report(self, report):
        """Format report for display."""
        output = f"""
═══════════════════════════════════════════════
OPPOSITION SCOUT REPORT: {report['team'].upper()}
═══════════════════════════════════════════════

RECENT FORM (Last 5 Matches)
──────────────────────────────
Record: {report['profile']['recent_form']['wins']}W-{report['profile']['recent_form']['draws']}D-{report['profile']['recent_form']['losses']}L
Goals: {report['profile']['recent_form']['goals_for']} scored, {report['profile']['recent_form']['goals_against']} conceded
xG: {report['profile']['recent_form']['xg_for']:.2f} for, {report['profile']['recent_form']['xg_against']:.2f} against
Avg Possession: {report['profile']['recent_form']['avg_possession']:.1f}%

TACTICAL PROFILE
──────────────────────────────
Formation: {report['tactics']['formation']}
Build-up: {report['tactics']['build_up_style']}
Defensive: {report['tactics']['pressing_style']}
Primary Attack: {report['tactics']['primary_attack_zone']}

KEY PLAYERS TO WATCH
──────────────────────────────"""

        for idx, (_, player) in enumerate(report['key_players'].head(3).iterrows(), 1):
            output += f"""
{idx}. {player['player_name']} ({player['position']})
   {int(player['goals'])}G, {int(player['assists'])}A | Threat: {player['threat_level']}"""

        output += """

TACTICAL RECOMMENDATIONS
──────────────────────────────"""

        for rec_name, rec in report['recommendations'].items():
            output += f"""
▸ {rec['recommendation']}
  {rec['detail']}
  [{rec['supporting_data']}]
"""

        output += f"""
═══════════════════════════════════════════════
Generated: {report['generated'].strftime('%Y-%m-%d %H:%M')}
"""
        return output

# Demo with simulated data
np.random.seed(42)
# Create simulated match data
dates = [datetime.now() - timedelta(days=x*4) for x in range(20)]
match_data = pd.DataFrame({
'match_id': range(1, 21),
'date': dates,
'team': ['Barcelona'] * 10 + ['Real Madrid'] * 10,
'opponent': ['Real Madrid'] * 10 + ['Barcelona'] * 10,
'result': np.random.choice(['W', 'D', 'L'], 20, p=[0.5, 0.25, 0.25]),
'goals_for': np.random.randint(0, 5, 20),
'goals_against': np.random.randint(0, 4, 20),
'xg_for': np.random.uniform(0.8, 2.5, 20),
'xg_against': np.random.uniform(0.5, 2.0, 20),
'possession': np.random.uniform(45, 70, 20),
'passes': np.random.randint(400, 600, 20),
'pass_completion': np.random.uniform(75, 90, 20),
'shots': np.random.randint(8, 20, 20),
'crosses': np.random.randint(10, 30, 20),
'ppda': np.random.uniform(8, 18, 20),
'direct_attacks': np.random.randint(10, 25, 20),
'attacks_left': np.random.uniform(25, 40, 20),
'attacks_center': np.random.uniform(20, 35, 20),
'attacks_right': np.random.uniform(25, 40, 20),
'set_piece_goals': np.random.randint(0, 3, 20),
'formation': np.random.choice(['4-3-3', '4-4-2', '3-5-2'], 20, p=[0.6, 0.3, 0.1])
})
# Create simulated player data
player_data = pd.DataFrame({
'player_name': ['Lewandowski', 'Pedri', 'Gavi', 'Araujo', 'Raphinha',
'Vinicius Jr.', 'Bellingham', 'Rodrygo', 'Valverde', 'Modric'],
'team': ['Barcelona'] * 5 + ['Real Madrid'] * 5,
'position': ['Forward', 'Midfielder', 'Midfielder', 'Defender', 'Forward',
'Forward', 'Midfielder', 'Forward', 'Midfielder', 'Midfielder'],
'minutes': np.random.randint(800, 2500, 10),
'goals': [15, 5, 3, 2, 8, 12, 10, 7, 4, 3],
'assists': [4, 8, 6, 1, 5, 7, 6, 5, 8, 9],
'xg': [12.5, 4.2, 2.8, 1.5, 7.2, 10.8, 8.5, 6.2, 3.5, 2.8],
'xa': [3.2, 7.5, 5.8, 0.8, 4.5, 6.2, 5.5, 4.8, 7.2, 8.5],
'key_passes': [25, 65, 48, 12, 42, 55, 52, 38, 58, 62],
'successful_dribbles': [18, 35, 42, 8, 52, 85, 45, 55, 28, 22],
'tackles_won': [12, 45, 52, 85, 18, 15, 42, 20, 55, 38]
})
# Generate report
print("=== OPPOSITION SCOUT REPORT GENERATOR ===\n")
generator = ScoutReportGenerator(match_data, player_data)
report = generator.generate_report('Real Madrid')
formatted_report = generator.format_report(report)
print(formatted_report)
library(tidyverse)
library(ggplot2)
library(gridExtra)
# Opposition Scout Report Generator
generate_scout_report <- function(team_name, match_data, player_data) {
report <- list()
report$team <- team_name
report$generated <- Sys.time()
# Filter to team's recent matches
team_matches <- match_data %>%
filter(team == team_name) %>%
arrange(desc(date)) %>%
head(5)
# Calculate team profile
report$profile <- calculate_team_profile(team_matches)
# Key players
report$key_players <- identify_key_players(player_data, team_name)
# Tactical patterns
report$tactics <- analyze_tactical_patterns(team_matches)
# Recommendations
report$recommendations <- generate_tactical_recommendations(report)
report
}
# Team Profile Calculator
calculate_team_profile <- function(matches) {
list(
# Recent form
recent_form = matches %>%
summarise(
wins = sum(result == "W"),
draws = sum(result == "D"),
losses = sum(result == "L"),
goals_for = sum(goals_for),
goals_against = sum(goals_against),
xg_for = sum(xg_for),
xg_against = sum(xg_against),
avg_possession = mean(possession)
),
# Style metrics
style = matches %>%
summarise(
avg_ppda = mean(ppda, na.rm = TRUE), # Pressing intensity
avg_passes = mean(passes),
avg_shots = mean(shots),
avg_crosses = mean(crosses),
pass_completion = mean(pass_completion),
directness = mean(direct_attacks / passes * 100) # % direct play
)
)
}
# Key Players Identifier
identify_key_players <- function(player_data, team_name, top_n = 5) {
team_players <- player_data %>%
filter(team == team_name, minutes >= 450) %>%
mutate(
# Calculate contribution score
contribution_score = goals * 3 + assists * 2 + key_passes * 0.5 +
successful_dribbles * 0.3 + tackles_won * 0.2
) %>%
arrange(desc(contribution_score)) %>%
head(top_n)
team_players %>%
select(player_name, position, minutes, goals, assists, xg, xa,
key_passes, contribution_score) %>%
mutate(
threat_level = case_when(
contribution_score > quantile(contribution_score, 0.8) ~ "High",
contribution_score > quantile(contribution_score, 0.5) ~ "Medium",
TRUE ~ "Monitor"
)
)
}
# Tactical Pattern Analyzer
analyze_tactical_patterns <- function(matches) {
# Determine primary formation
formation <- matches$formation %>%
table() %>%
sort(decreasing = TRUE) %>%
names() %>%
head(1)
# Build-up style
if (mean(matches$passes) > 500 && mean(matches$possession) > 55) {
build_up <- "Possession-based"
} else if (mean(matches$direct_attacks) > 20) {
build_up <- "Direct/Counter"
} else {
build_up <- "Balanced"
}
# Pressing style
if (mean(matches$ppda, na.rm = TRUE) < 10) {
pressing <- "High Press"
} else if (mean(matches$ppda, na.rm = TRUE) > 15) {
pressing <- "Low Block"
} else {
pressing <- "Mid Block"
}
# Attack zones
attack_zones <- list(
left_wing = mean(matches$attacks_left, na.rm = TRUE),
center = mean(matches$attacks_center, na.rm = TRUE),
right_wing = mean(matches$attacks_right, na.rm = TRUE)
)
primary_attack <- names(attack_zones)[which.max(unlist(attack_zones))]
list(
formation = formation,
build_up_style = build_up,
pressing_style = pressing,
primary_attack_zone = primary_attack,
attack_zones = attack_zones,
set_piece_threat = mean(matches$set_piece_goals, na.rm = TRUE) > 0.5
)
}
# Tactical Recommendations Generator
generate_tactical_recommendations <- function(report) {
recs <- list()
# Based on pressing style
if (report$tactics$pressing_style == "High Press") {
recs$against_press <- list(
recommendation = "Play through or over the press",
detail = "They press high - use quick passing combinations or long balls behind",
supporting_data = paste("PPDA:", round(report$profile$style$avg_ppda, 1))
)
} else if (report$tactics$pressing_style == "Low Block") {
recs$against_block <- list(
recommendation = "Patient build-up, exploit wide areas",
detail = "They sit deep - circulate ball and stretch them horizontally",
supporting_data = paste("PPDA:", round(report$profile$style$avg_ppda, 1))
)
}
# Based on build-up style
if (report$tactics$build_up_style == "Possession-based") {
recs$disrupt_possession <- list(
recommendation = "Press their build-up triggers",
detail = "Target their center-backs and defensive midfielder when receiving",
supporting_data = paste("Avg passes:", round(report$profile$style$avg_passes))
)
}
# Based on attack zones
if (report$tactics$primary_attack_zone == "left_wing") {
recs$defend_left <- list(
recommendation = "Reinforce right defensive side",
detail = "Their primary threat comes down the left - double up if needed",
supporting_data = paste0(round(report$tactics$attack_zones$left_wing, 1), "% attacks from left")
)
} else if (report$tactics$primary_attack_zone == "right_wing") {
recs$defend_right <- list(
recommendation = "Reinforce left defensive side",
detail = "Their primary threat comes down the right - double up if needed",
supporting_data = paste0(round(report$tactics$attack_zones$right_wing, 1), "% attacks from right")
)
}
# Set pieces
if (report$tactics$set_piece_threat) {
recs$set_pieces <- list(
recommendation = "Extra attention on set pieces",
detail = "Significant goal threat from corners and free kicks",
supporting_data = "Above average set piece goals"
)
}
# Key player focus
top_player <- report$key_players$player_name[1]
recs$key_man <- list(
recommendation = paste("Man-mark or limit service to", top_player),
detail = paste("Their most dangerous player with",
report$key_players$goals[1], "goals and",
report$key_players$assists[1], "assists"),
supporting_data = paste("Contribution score:", round(report$key_players$contribution_score[1], 1))
)
recs
}
# Format report for output
format_scout_report <- function(report) {
output <- paste0(
"═══════════════════════════════════════════════\n",
" OPPOSITION SCOUT REPORT: ", toupper(report$team), "\n",
"═══════════════════════════════════════════════\n\n",
"RECENT FORM (Last 5 Matches)\n",
"──────────────────────────────\n",
"Record: ", report$profile$recent_form$wins, "W-",
report$profile$recent_form$draws, "D-",
report$profile$recent_form$losses, "L\n",
"Goals: ", report$profile$recent_form$goals_for, " scored, ",
report$profile$recent_form$goals_against, " conceded\n",
"xG: ", round(report$profile$recent_form$xg_for, 2), " for, ",
round(report$profile$recent_form$xg_against, 2), " against\n",
"Avg Possession: ", round(report$profile$recent_form$avg_possession, 1), "%\n\n",
"TACTICAL PROFILE\n",
"──────────────────────────────\n",
"Formation: ", report$tactics$formation, "\n",
"Build-up: ", report$tactics$build_up_style, "\n",
"Defensive: ", report$tactics$pressing_style, "\n",
"Primary Attack: ", report$tactics$primary_attack_zone, "\n\n",
"KEY PLAYERS TO WATCH\n",
"──────────────────────────────\n"
)
for (i in 1:min(3, nrow(report$key_players))) {
p <- report$key_players[i, ]
output <- paste0(output,
i, ". ", p$player_name, " (", p$position, ")\n",
" ", p$goals, "G, ", p$assists, "A | ",
"Threat: ", p$threat_level, "\n")
}
output <- paste0(output,
"\nTACTICAL RECOMMENDATIONS\n",
"──────────────────────────────\n")
for (rec_name in names(report$recommendations)) {
rec <- report$recommendations[[rec_name]]
output <- paste0(output,
"▸ ", rec$recommendation, "\n",
" ", rec$detail, "\n",
" [", rec$supporting_data, "]\n\n")
}
output <- paste0(output,
"═══════════════════════════════════════════════\n",
"Generated: ", format(report$generated, "%Y-%m-%d %H:%M"), "\n")
cat(output)
invisible(output)
}
# Demo with simulated data
set.seed(42)
# Simulated match data
match_data <- tibble(
match_id = 1:20,
date = seq(Sys.Date() - 40, Sys.Date(), length.out = 20),
team = rep(c("Barcelona", "Real Madrid"), each = 10),
opponent = rep(c("Real Madrid", "Barcelona"), each = 10),
result = sample(c("W", "D", "L"), 20, replace = TRUE, prob = c(0.5, 0.25, 0.25)),
goals_for = sample(0:4, 20, replace = TRUE),
goals_against = sample(0:3, 20, replace = TRUE),
xg_for = runif(20, 0.8, 2.5),
xg_against = runif(20, 0.5, 2.0),
possession = runif(20, 45, 70),
passes = sample(400:600, 20, replace = TRUE),
pass_completion = runif(20, 75, 90),
shots = sample(8:20, 20, replace = TRUE),
crosses = sample(10:30, 20, replace = TRUE),
ppda = runif(20, 8, 18),
direct_attacks = sample(10:25, 20, replace = TRUE),
attacks_left = runif(20, 25, 40),
attacks_center = runif(20, 20, 35),
attacks_right = runif(20, 25, 40),
set_piece_goals = sample(0:2, 20, replace = TRUE),
formation = sample(c("4-3-3", "4-4-2", "3-5-2"), 20, replace = TRUE, prob = c(0.6, 0.3, 0.1))
)
# Simulated player data
player_data <- tibble(
player_name = c("Lewandowski", "Pedri", "Gavi", "Araujo", "Raphinha",
"Vinicius Jr.", "Bellingham", "Rodrygo", "Valverde", "Modric"),
team = rep(c("Barcelona", "Real Madrid"), each = 5),
position = c("Forward", "Midfielder", "Midfielder", "Defender", "Forward",
"Forward", "Midfielder", "Forward", "Midfielder", "Midfielder"),
minutes = sample(800:2500, 10, replace = TRUE),
goals = c(15, 5, 3, 2, 8, 12, 10, 7, 4, 3),
assists = c(4, 8, 6, 1, 5, 7, 6, 5, 8, 9),
xg = c(12.5, 4.2, 2.8, 1.5, 7.2, 10.8, 8.5, 6.2, 3.5, 2.8),
xa = c(3.2, 7.5, 5.8, 0.8, 4.5, 6.2, 5.5, 4.8, 7.2, 8.5),
key_passes = c(25, 65, 48, 12, 42, 55, 52, 38, 58, 62),
successful_dribbles = c(18, 35, 42, 8, 52, 85, 45, 55, 28, 22),
tackles_won = c(12, 45, 52, 85, 18, 15, 42, 20, 55, 38)
)
# Generate report
cat("=== OPPOSITION SCOUT REPORT GENERATOR ===\n\n")
report <- generate_scout_report("Real Madrid", match_data, player_data)
format_scout_report(report)
Chapter Summary
Key Takeaways
- Know your audience: Coaches, executives, scouts, and players all need different formats and levels of detail
- Lead with the conclusion: Use a pyramid structure, since busy readers may get no further than the first line
- Tell a story: STAR framework (Situation, Task, Analysis, Recommendation) creates compelling narratives
- Visualize appropriately: Simple for coaches, detailed for analysts—every chart should pass the "So What?" test
- Build trust gradually: Start small, be right about something, admit when wrong, speak their language
- Track your impact: Log recommendations and outcomes to demonstrate value over time
- Systematize communication: Define deliverables, audiences, frequencies, and quality standards
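The "track your impact" takeaway can be made concrete with a very small log. The following is a minimal sketch, assuming each recommendation is recorded with whether it was adopted and, once known, whether the outcome was positive; the `Recommendation` class, the `impact_summary` function, and all sample entries are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Recommendation:
    date: str                                # ISO date the advice was given
    audience: str                            # e.g. "Head Coach"
    advice: str
    adopted: bool = False
    outcome_positive: Optional[bool] = None  # None until the outcome is known


def impact_summary(log: List[Recommendation]) -> dict:
    """Summarise adoption and success rates from a recommendation log."""
    adopted = [r for r in log if r.adopted]
    # Only adopted recommendations with a known outcome count towards success
    resolved = [r for r in adopted if r.outcome_positive is not None]
    wins = sum(1 for r in resolved if r.outcome_positive)
    return {
        "total": len(log),
        "adoption_rate": len(adopted) / len(log) if log else 0.0,
        "success_rate": wins / len(resolved) if resolved else None,
    }


# Hypothetical log entries for illustration
log = [
    Recommendation("2025-01-10", "Head Coach", "Press their left centre-back", True, True),
    Recommendation("2025-01-17", "DOF", "Pass on the transfer target", True, None),
    Recommendation("2025-01-24", "Head Coach", "Switch to a back three", False),
    Recommendation("2025-02-01", "Scout", "Prioritise the winger shortlist", True, False),
]
print(impact_summary(log))
```

Keeping unresolved outcomes as `None` instead of counting them as failures avoids understating the success rate while decisions are still playing out.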
Communication Principles
- Clarity over comprehensiveness
- Actionable over interesting
- Evidence over opinion
- Humble confidence over certainty
- Visual over verbal when possible
- Progressive disclosure of detail
- Consistent style and format
- Follow up on recommendations
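Progressive disclosure can be sketched as a single insight stored at several depths, with each audience shown only as much as it asks for. This is an illustrative sketch, not a prescribed format: the `render_insight` function, the three level names, and the sample insight text are all assumptions made for the example.

```python
# Hypothetical insight record: keys and wording are illustrative only
INSIGHT = {
    "headline": "Press their left centre-back on goal kicks.",
    "evidence": "He misplaced 8 of 20 passes under pressure in the last 5 matches.",
    "method": "Pressure means an opponent within 2 m at the moment of the pass.",
}

# Ordered from least to most detail
LEVELS = ["headline", "evidence", "method"]


def render_insight(insight, level="headline"):
    """Return the insight up to and including the requested level of detail."""
    if level not in LEVELS:
        raise ValueError(f"level must be one of {LEVELS}")
    depth = LEVELS.index(level) + 1
    return "\n".join(insight[key] for key in LEVELS[:depth])


# A coach might see only the headline; an analyst might read all three lines
print(render_insight(INSIGHT))
print()
print(render_insight(INSIGHT, "method"))
```

Because the conclusion is always the first line, the same record also satisfies the "lead with the conclusion" principle at every level of detail.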
Congratulations!
You've completed the Soccer Analytics textbook. The technical skills you've learned are powerful, but your impact will ultimately depend on how well you communicate insights to decision-makers. Keep practicing both the analysis and the communication—they're equally important for success in football analytics.