Soccer Analytics Resources
Everything you need to master football data analysis
Setup Guides
Python Environment Setup
Complete guide to setting up Python for soccer analytics, including Anaconda installation, virtual environments, and essential packages.
Quick Install (Recommended):
pip install pandas numpy matplotlib seaborn
pip install mplsoccer statsbombpy soccerdata
pip install scikit-learn scipy
With Conda:
conda create -n soccer python=3.11
conda activate soccer
pip install mplsoccer statsbombpy
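After installing, it can help to confirm that the packages are actually visible to the interpreter you plan to use. The following is a minimal sketch using only the standard library; the `REQUIRED` list simply mirrors the install commands above, and the helper name `installed_versions` is illustrative rather than part of any package.

```python
from importlib import metadata

# Distribution names taken from the pip commands above.
REQUIRED = [
    "pandas", "numpy", "matplotlib", "seaborn",
    "mplsoccer", "statsbombpy", "soccerdata",
    "scikit-learn", "scipy",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name:15s} {version or 'MISSING'}")
```

Run it inside the activated environment; any package reported as MISSING was installed somewhere else or not at all.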
R Environment Setup
Complete guide to setting up R and RStudio for soccer analytics, including package management and tidyverse configuration.
Essential Packages:
# Core packages
install.packages(c("tidyverse", "ggplot2", "dplyr"))
# Football-specific
install.packages("worldfootballR")
install.packages("ggsoccer")
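# StatsBombR is not on CRAN; install it from GitHub (requires devtools)
install.packages("devtools")
devtools::install_github("statsbomb/StatsBombR")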
Free Data Sources
Event Data
StatsBomb Open Data
The gold standard for learning. Free event-level data for FIFA World Cups (2018, 2022), FA Women's Super League, La Liga (Messi seasons), and more. Includes freeze frames.
Event Data Freeze Frames Free
Wyscout Open Data
Event data for the 2017/18 season across five major European leagues, plus World Cup 2018. Published via Figshare for academic research.
Event Data Academic Free
Metrica Sports Sample
Three complete matches with synchronized tracking data (25fps) and event data. Perfect for learning pitch control and off-ball analysis.
Tracking Data Event Data Free
Aggregate Statistics
FBref
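Comprehensive statistics for 100+ leagues worldwide. Advanced data was originally supplied by StatsBomb and, since 2022, by Opta. Includes xG, progressive passes, and detailed per-90 stats; StatsBomb-specific metrics such as pressures were dropped with the switch.
Aggregate Stats xG Data Free
Understat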
Shot-level xG data for six European leagues (the big five plus the Russian Premier League) since 2014/15. Includes xG timelines, shot maps, and player season histories.
xG Data Shot-Level Free
Fotmob
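Match statistics, lineups, and live data for leagues worldwide. Useful for real-time data and match previews, though it has no officially documented public API; packages such as worldfootballR scrape it instead.
Match Stats Live Data Free
Transfermarkt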
Player market values, transfer history, contract information, injury history, and detailed squad data for leagues worldwide.
Market Values Transfer Data Free
WhoScored
Match statistics, player ratings, and tactical data powered by Opta. Includes heat maps, pass maps, and detailed match statistics.
Match Stats Ratings Free
SofaScore
Live scores, statistics, and player ratings for football and other sports. Includes heat maps and shot maps for individual matches.
Live Scores Ratings Free
Historical & Results Data
Football-Data.co.uk
Historical match results and betting odds for 25+ leagues, dating back to the 1990s. Essential for prediction models and betting analytics.
Betting Odds Historical Free
European Football Database
25,000+ matches from 11 European countries (2008-2016) with betting odds, team attributes from FIFA video games, and player attributes.
Historical FIFA Attributes Free
Open Football Data
Open data project with match results, standings, and fixtures for leagues worldwide. Available in JSON, CSV, and SQL formats.
Results Multiple Formats Free
APIs
API-Football
RESTful API covering 900+ leagues with live scores, fixtures, standings, and statistics. Free tier available with rate limits.
REST API Freemium
Football-Data.org
Free football data API for major competitions. Covers Premier League, La Liga, Bundesliga, Serie A, Ligue 1, and more.
REST API Free
OpenLigaDB
Community-driven database with German football focus. Free API with match results, standings, and team information.
REST API Free
Python Libraries
Football-Specific
| Library | Purpose | Install |
|---|---|---|
| mplsoccer | Comprehensive soccer visualization library. Pitch plots, shot maps, pass networks, heat maps, Voronoi diagrams, and more. | pip install mplsoccer |
| statsbombpy | Official StatsBomb library for accessing their open data and API. Event data with freeze frames. | pip install statsbombpy |
| soccerdata | Scrape data from FBref, Understat, WhoScored, SoFIFA, ESPN, and ClubElo. | pip install soccerdata |
| socceraction | VAEP and xT implementations. Convert event streams to SPADL format. | pip install socceraction |
| kloppy | Unified interface for loading event and tracking data from multiple providers (StatsBomb, Wyscout, Metrica, etc.) | pip install kloppy |
| codeball | Football analytics and machine learning tools. Includes expected goals models. | pip install codeball |
| matplotsoccer | Alternative soccer pitch visualization library with different pitch types. | pip install matplotsoccer |
| floodlight | High-level tracking data analysis. Includes pitch control and movement analysis. | pip install floodlight |
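To give a feel for what the event-data libraries above work with, here is a minimal standard-library sketch that totals shots, goals, and xG per team from StatsBomb-style event records. It assumes the raw-file layout of the `statsbomb/open-data` GitHub repository and the usual event fields (`type.name`, `team.name`, `shot.statsbomb_xg`, `shot.outcome.name`); the function names and the hand-made sample events are illustrative only, and in practice statsbombpy or kloppy handle the loading for you.

```python
import json
from collections import defaultdict
from urllib.request import urlopen

# Assumption: raw-file layout of the statsbomb/open-data GitHub repository.
BASE_URL = "https://raw.githubusercontent.com/statsbomb/open-data/master/data"


def fetch_events(match_id):
    """Download the raw event list for one open-data match (needs network access)."""
    with urlopen(f"{BASE_URL}/events/{match_id}.json") as response:
        return json.load(response)


def shot_summary(events):
    """Return {team: {"shots", "goals", "xg"}} from StatsBomb-style event dicts."""
    summary = defaultdict(lambda: {"shots": 0, "goals": 0, "xg": 0.0})
    for event in events:
        if event.get("type", {}).get("name") != "Shot":
            continue  # ignore passes, carries, and every other event type
        shot = event.get("shot", {})
        team = summary[event["team"]["name"]]
        team["shots"] += 1
        team["xg"] += shot.get("statsbomb_xg", 0.0)
        if shot.get("outcome", {}).get("name") == "Goal":
            team["goals"] += 1
    return dict(summary)


# Hand-made events in the same shape, so the sketch runs offline.
sample_events = [
    {"type": {"name": "Pass"}, "team": {"name": "Home"}},
    {"type": {"name": "Shot"}, "team": {"name": "Home"},
     "shot": {"statsbomb_xg": 0.25, "outcome": {"name": "Goal"}}},
    {"type": {"name": "Shot"}, "team": {"name": "Away"},
     "shot": {"statsbomb_xg": 0.5, "outcome": {"name": "Saved"}}},
    {"type": {"name": "Shot"}, "team": {"name": "Home"},
     "shot": {"statsbomb_xg": 0.25, "outcome": {"name": "Off T"}}},
]

print(shot_summary(sample_events))
```

With network access, `shot_summary(fetch_events(match_id))` would apply the same aggregation to a real match, where `match_id` comes from the repository's match listings.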
Data Science Essentials
| Library | Purpose | Install |
|---|---|---|
| pandas | Data manipulation and analysis. DataFrames, data cleaning, groupby operations. | pip install pandas |
| numpy | Numerical computing. Arrays, linear algebra, mathematical functions. | pip install numpy |
| matplotlib | Base plotting library. Foundation for mplsoccer and most visualizations. | pip install matplotlib |
| seaborn | Statistical data visualization. Heatmaps, distributions, regression plots. | pip install seaborn |
| scikit-learn | Machine learning. Classification, regression, clustering, model evaluation. | pip install scikit-learn |
| scipy | Scientific computing. Statistics, optimization, signal processing. | pip install scipy |
| plotly | Interactive visualizations. Web-ready charts and dashboards. | pip install plotly |
| xgboost | Gradient boosting. Often used for xG models and match prediction. | pip install xgboost |
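Models such as those built with scikit-learn or xgboost need numeric features, and two inputs commonly used for xG are the distance to goal and the angle the goal mouth subtends from the shot location. The sketch below computes both with the standard library only. It assumes StatsBomb-style pitch coordinates (120 x 80, attacking the goal at x = 120, posts at y = 36 and y = 44); the function name `shot_features` is illustrative, and other providers use different coordinate systems.

```python
import math

# Assumption: StatsBomb-style pitch, 120 x 80 units, attacking the goal at x = 120.
GOAL_X = 120.0
LEFT_POST_Y, RIGHT_POST_Y = 36.0, 44.0
GOAL_CENTRE_Y = 40.0


def shot_features(x, y):
    """Return (distance to goal centre, angle in radians subtended by the goal mouth)."""
    dx = GOAL_X - x
    distance = math.hypot(dx, GOAL_CENTRE_Y - y)

    # Vectors from the shot location to each post.
    to_left = (dx, LEFT_POST_Y - y)
    to_right = (dx, RIGHT_POST_Y - y)

    # Angle between the two vectors via atan2(|cross|, dot), which stays
    # well-behaved for shots taken from very tight angles.
    cross = to_left[0] * to_right[1] - to_left[1] * to_right[0]
    dot = to_left[0] * to_right[0] + to_left[1] * to_right[1]
    angle = math.atan2(abs(cross), dot)
    return distance, angle


# Example: the penalty spot, 12 units from the goal line on the centre line.
distance, angle = shot_features(108.0, 40.0)
print(f"distance={distance:.1f}, angle={math.degrees(angle):.1f} degrees")
```

Feature columns like these can then be passed to a classifier of your choice; the geometry itself does not depend on which modelling library you use.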
R Packages
Football-Specific
| Package | Purpose | Install |
|---|---|---|
| worldfootballR | Comprehensive scraping for FBref, Transfermarkt, Understat, and Fotmob. The most complete R football data package. | install.packages("worldfootballR") |
| StatsBombR | Official StatsBomb package for accessing open data and API. Event data with advanced statistics. | devtools::install_github("statsbomb/StatsBombR") |
| ggsoccer | Soccer pitch visualizations with ggplot2. Includes multiple pitch types and annotation layers. | install.packages("ggsoccer") |
| ggshakeR | Advanced football visualizations. Pizza charts, radar plots, bump charts, and shot maps. | devtools::install_github("abhiamishra/ggshakeR") |
| soccerAnimate | Create animated visualizations of tracking data and event sequences. | devtools::install_github("Dato-Futbol/soccerAnimate") |
| understatr | Access Understat data directly. Shot-level xG data for top European leagues. | devtools::install_github("ewenme/understatr") |
Data Science Essentials
| Package | Purpose | Install |
|---|---|---|
| tidyverse | Collection of essential packages: dplyr, ggplot2, tidyr, readr, purrr, stringr. | install.packages("tidyverse") |
| ggplot2 | Grammar of graphics visualization. Foundation for all R football visualizations. | install.packages("ggplot2") |
| dplyr | Data manipulation verbs: filter, select, mutate, summarize, group_by. | install.packages("dplyr") |
| tidymodels | Machine learning framework. Unified interface for modeling and evaluation. | install.packages("tidymodels") |
| gt | Grammar of tables. Create publication-ready tables for reports. | install.packages("gt") |
| gganimate | Animation extension for ggplot2. Create animated match sequences. | install.packages("gganimate") |
Books
Foundational Reading
The Numbers Game
Chris Anderson & David Sally
The foundational book on soccer analytics. Explains why analytics matters and challenges conventional wisdom about football.
Essential Beginner
Soccermatics
David Sumpter
Mathematical modeling of football. Covers probability, geometry, and physics applied to the beautiful game.
Essential Intermediate
Expected Goals
Rory Smith
The story of how data conquered football. Chronicles the rise of analytics from outsiders to the heart of major clubs.
Essential Beginner
Football Hackers
Christoph Biermann
Inside the revolution of data and football. Features interviews with pioneers like Ralf Rangnick and Ted Knutson.
Essential Beginner
Zonal Marking
Michael Cox
Tactical history of European football. Essential context for understanding how analytics fits into tactical evolution.
Tactical Beginner
The Mixer
Michael Cox
The story of Premier League tactics. Traces tactical evolution from 1992 to 2017.
Tactical Beginner
Technical & Statistical
Soccernomics
Simon Kuper & Stefan Szymanski
Economics and statistics of football. Transfer markets, wage optimization, and data-driven decision making.
Economics Intermediate
Moneyball
Michael Lewis
The classic sports analytics story. While about baseball, the principles apply directly to football analytics.
Classic Beginner
Python for Data Analysis
Wes McKinney
Definitive guide to pandas and data manipulation. Essential for any Python-based analytics work.
Technical Intermediate
R for Data Science
Hadley Wickham & Garrett Grolemund
The tidyverse bible. Essential for R-based analytics. Available free online.
Technical Intermediate
An Introduction to Statistical Learning
James, Witten, Hastie, Tibshirani
Machine learning fundamentals. Covers regression, classification, and clustering used in xG models.
Technical Advanced
Analyzing Baseball Data with R
Max Marchi & Jim Albert
While baseball-focused, demonstrates analytical techniques directly applicable to football.
Technical Intermediate
Podcasts
The Double Pivot
Hosted by Mike Goodman and Michael Caley. Weekly tactical and analytical discussions on football trends.
Tactical Active
Fanalytics
Analytics-focused podcast discussing data and football. Features industry professionals and academics.
Analytics Active
Tifo Football Podcast
Tactical and analytical discussions from the Tifo team. Accessible explanations of complex concepts.
Tactical Active
StatsBomb Podcast
Official StatsBomb podcast. Deep dives into analytics topics and interviews with industry leaders.
Analytics Occasional
Expected Value
American Soccer Analysis podcast. MLS-focused but covers universal analytics concepts.
Analytics MLS
The Analyst
Opta's official podcast. Features their analysts discussing trends and statistics.
Analytics Active
YouTube Channels
Tifo Football
Excellent tactical and analytical explainers. Great visualizations and accessible content.
Tactical 2M+ Subs
McKay Johns
Python tutorials for soccer analytics. Shot maps, radar charts, and data visualization guides.
Tutorials Python
Friends of Tracking
Academic tracking data tutorials. Lectures from leading researchers on pitch control and analysis.
Advanced Tracking Data
The Analyst
Opta's official channel. Match analysis, player profiles, and statistical breakdowns.
Analytics Official
MRKT Insights
Player scouting and performance analysis. Detailed breakdowns of player styles.
Scouting Players
Soccermatics (David Sumpter)
The author of Soccermatics shares mathematical football analysis and tutorials.
Advanced Mathematical
Online Courses
Mathematical Modeling of Football
Uppsala University (Free)
David Sumpter's course based on Soccermatics. Covers probability, expected goals, network analysis, and more. Python and R code provided.
Free Intermediate
Friends of Tracking Course
Friends of Tracking (Free)
Comprehensive tracking data course. Pitch control, off-ball analysis, and expected possession value. Taught by leading researchers.
Free Advanced
FC Python Courses
FC Python
Python tutorials specifically for football analytics. From basics to advanced visualizations and machine learning.
Free Beginner
DataCamp Sports Analytics
DataCamp (Paid)
Professional courses on sports analytics using Python and R. Includes football-specific projects and case studies.
Paid Intermediate
Blogs & Websites
StatsBomb Articles
Industry-leading analysis and research. Introduces new metrics and methodologies.
American Soccer Analysis
MLS-focused but universally applicable methodology. Excellent explanations of metrics.
The Athletic - Analytics
Premium analysis combining data with tactical insight. Worth the subscription.
Karun Singh's Blog
Creator of expected threat (xT). Technical deep-dives into advanced concepts.
Differentgame
Paul Riley's visualizations and analysis. Beautiful charts and insightful commentary.
Soccerment Blog
Football analytics and data visualization tutorials with practical examples.
Communities
Football Analytics Twitter/X
The primary hub for soccer analytics discussion. Follow @StatsBomb, @OptaAnalyst, @Soccermatics, and #FootballAnalytics.
Football Analytics Slack
Active community for discussions, Q&A, and networking. Channels for Python, R, tracking data, and career advice.
Soccer Analytics Discord
Growing Discord community for real-time discussions, project sharing, and feedback.
r/SoccerAnalytics
Reddit community for soccer analytics discussions, questions, and sharing work.
GitHub
Explore open-source projects, contribute to libraries, and share your own work.
LinkedIn Groups
Professional networking groups for football analytics. Great for career connections.
Conferences & Events
StatsBomb Conference
Annual conference featuring cutting-edge research and industry presentations. The premier event in football analytics.
Annual Industry
Opta Forum
Stats Perform's annual event showcasing research and applications of football data.
Annual Industry
MIT Sloan Sports Analytics Conference
Premier sports analytics conference. Football research papers and presentations alongside other sports.
Annual Academic
ECML Sports Analytics Workshop
Academic workshop at the European Conference on Machine Learning. Peer-reviewed research papers.
Annual Academic
Football Data International Forum
Industry event focusing on practical applications of data in football operations.
Annual Industry
Barça Sports Analytics Summit
FC Barcelona's annual summit featuring research from their Innovation Hub.
Annual Industry
Career Resources
Types of Roles
- Data Analyst - Match analysis, reporting, visualizations
- Data Scientist - Model building, xG, player prediction
- Video Analyst - Match coding, opponent scouting
- Scout/Recruitment Analyst - Player identification, profiling
- Performance Analyst - Physical data, GPS tracking
- Research Scientist - R&D, new methodologies
Key Skills
- Programming - Python, R, SQL
- Statistics - Regression, probability, ML
- Domain Knowledge - Football understanding
- Visualization - Clear communication of insights
- Communication - Explain data to non-technical staff
- Video Analysis - Coding, annotation tools
Job Boards & Resources
Global Sports Jobs
Dedicated sports industry job board with analytics positions at clubs and organizations.
LinkedIn Jobs
Search "football analytics", "soccer data", or "sports data scientist" for current openings.
Club Careers Pages
Many clubs post directly to their websites. Check your target clubs regularly.
Pro Tips for Breaking In
- Build a portfolio - Share analysis on Twitter, GitHub, or a personal blog
- Contribute to open source - Help improve libraries like mplsoccer or worldfootballR
- Enter competitions - Kaggle, StatsBomb challenges, academic paper contests
- Network actively - Engage on Twitter, attend conferences, join Slack communities
- Start local - Volunteer with local clubs, amateur teams, or youth academies
- Learn the game - Watch matches analytically, understand tactics and context
Key Academic Papers
| Paper | Authors | Topic | Year |
|---|---|---|---|
| Actions Speak Louder Than Goals | Decroos et al. | VAEP framework introduction | 2019 |
| Wide Open Spaces | Fernandez & Bornn | Pitch control and off-ball value | 2018 |
| A Framework for the Fine-Grained Evaluation of the Instantaneous Expected Value of Soccer Possessions | Fernandez, Bornn, Cervone | EPV model | 2021 |
| A Multiresolution Stochastic Process Model for Predicting Basketball Possession Outcomes | Cervone et al. | Expected possession value (basketball, applicable concepts) | 2016 |
| Searching for a Unique Style in Soccer | Gyarmati et al. | Pass network analysis | 2014 |
| Beyond Expected Goals | Spearman | Off-ball scoring opportunity model | 2018 |
| Physics-Based Modeling of Pass Probabilities in Soccer | Spearman et al. | Pass probability models | 2017 |
Most papers are available on arXiv, Google Scholar, or the authors' personal websites.
Tools & Software
Tableau
Industry-standard visualization tool. Free Public version available. Used by many clubs for dashboards.
Visualization Freemium
Power BI
Microsoft's visualization tool. Strong integration with Excel. Free desktop version.
Visualization Freemium
Jupyter Notebooks
Interactive Python environment. Standard for data analysis and sharing work.
Python Free
RStudio
IDE for R programming. Essential for R-based analysis. Free desktop version.
R Free
VS Code
Modern code editor with excellent Python and R extensions. Great for larger projects.
Editor Free
Google Colab
Free cloud notebooks with GPU access. Great for learning without local setup.
Python Free