Build a Rolling xG Form Chart in Python
A few lines of pandas and matplotlib to see form, not just totals.
A league table is a season-long average wearing a disguise. It tells you where a team sits, but it flattens the last two months into the same number as the first two, so a side surging into form and a side quietly collapsing can occupy the same row. A rolling expected-goals chart pulls those two stories apart. It takes about thirty lines of Python, and once you have it you will never read a table the same way again.
Why a rolling window beats the season total
Season totals are the right tool for a verdict and the wrong tool for a trend. A team that started terribly and is now excellent has a mediocre season-long xG difference, and the table shows mediocre — even though the only thing that matters for the next match is the “now excellent” part. The information you want is recent and directional: is the team creating more and conceding less than it was a month ago, or the reverse?
A rolling average answers exactly that. Instead of summing the whole season, you average over a sliding window of the most recent matches (the last six, say) and recompute it after every game. Plot that rolling number across the season and you get a line that rises and falls with form, smoothing out the single-match noise (a freak scoreline, one hot afternoon) while still responding to genuine change. Choose the window with the trade-off in mind: a short window (around six) reacts quickly but wobbles; a longer one (ten or so) is smoother but slower to register a real shift. Six to ten matches is a sensible starting range for league form; treat it as a rule of thumb and adjust it to your data.
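To make the window trade-off concrete, here is a minimal sketch in plain Python. The trailing_mean helper and the eight xG values are invented for illustration (they are not part of the tutorial's script), and the windows are shortened to two and four so the effect shows up on a tiny series.

```python
def trailing_mean(values, window):
    """Average the most recent `window` values at each position.

    Early positions average however many values exist so far.
    """
    out = []
    for i in range(len(values)):
        recent = values[max(0, i - window + 1): i + 1]
        out.append(sum(recent) / len(recent))
    return out


# Made-up per-match xG: a weak start followed by a clear improvement.
xg = [0.5, 0.5, 0.5, 0.5, 2.0, 2.0, 2.0, 2.0]

short_line = trailing_mean(xg, window=2)
long_line = trailing_mean(xg, window=4)

print("window=2:", [round(v, 2) for v in short_line])
print("window=4:", [round(v, 2) for v in long_line])
```

The shorter window reaches the new level two matches after the change, while the longer one needs four. That lag is the price of the extra smoothing.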
The two series worth plotting are xG for (how good the chances you create are) and xG against (how good the chances you concede are). Drawn together, the gap between the lines is your underlying form: widening in your favour means you are getting better, converging or crossing means trouble. If you have not built a season-long version first, the spreadsheet walk-through in build a simple xG-difference league table is a good companion to this one.
The data you need
This tutorial is deliberately source-agnostic: it works on any CSV with one row per match and an xG-for and xG-against column. The minimum schema is four columns — a match date, an opponent (optional, just for readability), the xG you created, and the xG you conceded:
date,opponent,xg_for,xg_against
2025-08-16,Opponent A,1.8,0.7
2025-08-23,Opponent B,0.9,1.4
2025-08-31,Opponent C,2.3,1.1
...
The numbers above are made-up sample rows, included only so the code has something to chew on — do not read them as any real team’s data. For real figures, the free StatsBomb open data is an excellent source: it ships shot-level events from which you can sum each match’s xG for and against, and the statsbombpy package makes pulling it straightforward. The setup walk-through is in getting started with StatsBomb open data. For this chart, all that matters is that you end up with a per-match CSV in the shape above; where the xG numbers come from is your choice.
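If your source gives you shot-level rows rather than per-match totals, the aggregation might look something like the sketch below. The shots table, its column names (side, shot_xg) and its values are all hypothetical stand-ins, not the schema of StatsBomb or any other provider; adapt the names to whatever your data actually contains.

```python
import pandas as pd

# Hypothetical shot-level rows. "side" marks whether the shot was ours or theirs.
shots = pd.DataFrame(
    [
        ("2025-08-16", "Opponent A", "us", 0.40),
        ("2025-08-16", "Opponent A", "us", 0.15),
        ("2025-08-16", "Opponent A", "them", 0.30),
        ("2025-08-23", "Opponent B", "us", 0.10),
        ("2025-08-23", "Opponent B", "them", 0.55),
        ("2025-08-23", "Opponent B", "them", 0.20),
    ],
    columns=["date", "opponent", "side", "shot_xg"],
)

# Sum shot xG per match and per side, then spread the sides into two columns.
per_match = (
    shots.pivot_table(index=["date", "opponent"], columns="side",
                      values="shot_xg", aggfunc="sum", fill_value=0.0)
    .rename(columns={"us": "xg_for", "them": "xg_against"})
    .reset_index()[["date", "opponent", "xg_for", "xg_against"]]
)
per_match.columns.name = None  # drop the leftover "side" label

# Print in the CSV shape the rest of the tutorial expects.
print(per_match.to_csv(index=False))
```

To save the result instead of printing it, pass a file name to to_csv, for example per_match.to_csv("matches.csv", index=False).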
Load the data and sort it
Everything here uses pandas and matplotlib. Install them with pip install pandas matplotlib if you have not already. The first step is to read the CSV, parse the date column into real dates, and sort by date — a rolling window is meaningless if the rows are out of order.
import pandas as pd
import matplotlib.pyplot as plt
# A small, clearly made-up sample so the script runs on its own.
# Replace this with: df = pd.read_csv("matches.csv", parse_dates=["date"])
sample = [
    ("2025-08-16", "Opponent A", 1.8, 0.7),
    ("2025-08-23", "Opponent B", 0.9, 1.4),
    ("2025-08-31", "Opponent C", 2.3, 1.1),
    ("2025-09-13", "Opponent D", 1.1, 1.2),
    ("2025-09-20", "Opponent E", 0.6, 2.0),
    ("2025-09-27", "Opponent F", 1.4, 0.9),
    ("2025-10-04", "Opponent G", 2.0, 0.8),
    ("2025-10-18", "Opponent H", 1.7, 1.0),
    ("2025-10-25", "Opponent I", 2.4, 0.6),
    ("2025-11-01", "Opponent J", 1.9, 1.3),
]
df = pd.DataFrame(sample, columns=["date", "opponent", "xg_for", "xg_against"])
df["date"] = pd.to_datetime(df["date"])
df = df.sort_values("date").reset_index(drop=True)
print(df.head())
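If you are loading your own file, delete the sample list and the pd.DataFrame(...) line that builds df from it, then uncomment the read_csv line. The parse_dates argument turns the date strings into datetimes so the x-axis plots correctly and the sort behaves; the pd.to_datetime line that follows then becomes redundant, but it is harmless to leave in.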
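For a real file it can help to fail early when a column is missing. The sketch below wraps the load step in a small function; load_matches is a hypothetical helper name, and the in-memory CSV text (made-up rows, deliberately out of order) stands in for a file on disk so the example runs on its own.

```python
import io

import pandas as pd

# Stand-in for matches.csv: made-up rows, deliberately out of date order.
csv_text = """date,opponent,xg_for,xg_against
2025-08-23,Opponent B,0.9,1.4
2025-08-16,Opponent A,1.8,0.7
2025-08-31,Opponent C,2.3,1.1
"""


def load_matches(source):
    """Read a per-match CSV, check the required columns, and sort by date."""
    df = pd.read_csv(source)
    missing = {"date", "xg_for", "xg_against"} - set(df.columns)
    if missing:
        raise ValueError(f"CSV is missing columns: {sorted(missing)}")
    df["date"] = pd.to_datetime(df["date"])
    return df.sort_values("date").reset_index(drop=True)


matches = load_matches(io.StringIO(csv_text))
print(matches)
```

With a file on disk you would call load_matches("matches.csv") instead of passing the StringIO object.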
Compute the rolling average
This is the heart of the tutorial, and it is a single pandas method. rolling(window=6) creates a sliding window of six rows; .mean() averages each window. The min_periods=1 argument tells pandas to start producing a value from the very first match rather than waiting until it has a full window of six — so the early-season points are an average of however many matches have been played so far, and the window “fills up” as the season progresses.
WINDOW = 6
df["xg_for_roll"] = df["xg_for"].rolling(window=WINDOW, min_periods=1).mean()
df["xg_against_roll"] = df["xg_against"].rolling(window=WINDOW, min_periods=1).mean()
# The gap between the two lines is the rolling xG difference (the form signal).
df["xg_diff_roll"] = df["xg_for_roll"] - df["xg_against_roll"]
print(df[["date", "xg_for_roll", "xg_against_roll", "xg_diff_roll"]].round(2))
That is the entire computation. Each row now carries the average xG-for and xG-against over the trailing six matches (or over however many have been played, for the first five rows), plus their difference. Want a smoother, slower line? Set WINDOW = 10. Want it twitchier? Drop it to four. Because the window is a single constant, you can experiment freely without touching anything else.
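To see what min_periods actually changes, the sketch below runs the same rolling mean twice on the first seven made-up xg_for values from the sample: once with min_periods=1 and once with the pandas default, where min_periods equals the window size. The column labels in the comparison table are just illustrative names.

```python
import pandas as pd

# First seven made-up xg_for values from the sample data.
xg_for = pd.Series([1.8, 0.9, 2.3, 1.1, 0.6, 1.4, 2.0])

# Starts producing values from the first match.
partial = xg_for.rolling(window=6, min_periods=1).mean()

# Default behaviour: NaN until six matches are available.
strict = xg_for.rolling(window=6).mean()

comparison = pd.DataFrame(
    {"min_periods=1": partial, "full window only": strict}
).round(2)
print(comparison)
```

The two columns agree from the sixth match onward. Before that, the strict version is empty and the partial version is an average over fewer matches, so early points on the chart are noisier than later ones and deserve less weight when you read them.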
Plot it
Now draw the two rolling series against the date. Plotting the rolling lines (not the raw per-match values) is the whole point — the raw numbers are jagged noise; the rolling lines are the trend. A light touch helps readability: the raw points can go on faintly underneath so you can still see the matches that drive the line.
fig, ax = plt.subplots(figsize=(11, 6))
# Faint raw per-match points, for context.
ax.scatter(df["date"], df["xg_for"], color="#2e7d32", alpha=0.25, s=25)
ax.scatter(df["date"], df["xg_against"], color="#b0413e", alpha=0.25, s=25)
# The rolling lines: the actual story.
ax.plot(df["date"], df["xg_for_roll"], color="#2e7d32", linewidth=2.5,
        label=f"xG for ({WINDOW}-match rolling avg)")
ax.plot(df["date"], df["xg_against_roll"], color="#b0413e", linewidth=2.5,
        label=f"xG against ({WINDOW}-match rolling avg)")
ax.set_title("Rolling expected-goals form")
ax.set_ylabel("Expected goals per match")
ax.set_xlabel("Date")
ax.legend()
ax.grid(True, alpha=0.2)
fig.autofmt_xdate() # angle the date labels so they don't overlap
fig.tight_layout()
fig.savefig("rolling_xg_form.png", dpi=144)
plt.show()
Run the script and you get a PNG and an on-screen chart. Read it the way you would read a pulse: where the green (xG for) line sits clearly above the red (xG against), the team is generating better chances than it concedes; where they converge or swap, form is turning. The crossings and the widening gaps — not any single match — are the signal.
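If you would rather have the crossings listed than spot them by eye, one possible approach is to look for sign changes in the rolling difference. The sketch below uses a short made-up series so it runs on its own; in the tutorial's script you would use df["xg_diff_roll"] instead. An exact zero is counted as "not ahead" here, which is an arbitrary choice.

```python
import pandas as pd

# Made-up rolling xG difference (for minus against), one value per match.
xg_diff_roll = pd.Series([0.6, 0.3, 0.1, -0.2, -0.4, -0.1, 0.2, 0.5])

ahead = xg_diff_roll > 0

# A crossing is a match whose sign differs from the previous match's.
# The first match is compared with itself, so it never counts as a crossing.
crossed = ahead != ahead.shift(1, fill_value=ahead.iloc[0])

for i in xg_diff_roll.index[crossed]:
    direction = "xG for moves above xG against" if ahead[i] else "xG against moves above xG for"
    print(f"match {i + 1}: {direction} (rolling diff {xg_diff_roll[i]:+.2f})")
```

A single crossing inside a wobbly stretch is weak evidence on its own; it is more convincing when the gap keeps widening over the following matches.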
Where to take it next
A few small extensions make the chart genuinely useful as a season-long monitor. Shade the area between the two lines so the rolling xG difference reads at a glance — ax.fill_between(df["date"], df["xg_for_roll"], df["xg_against_roll"], alpha=0.12) does it in one line. Plot two teams’ rolling xG-difference lines on the same axes to compare trajectories before a fixture. Or swap the date axis for a matchweek index if your competition plays in tidy rounds. The mechanics never change: sort by time, rolling().mean(), plot the lines.
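As a sketch of two of those extensions together, the example below plots against a matchweek index and shades the gap in a different colour depending on which line is on top. It relies on the where and interpolate parameters of fill_between; the rolling values are made up, and the output file name is just a placeholder.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up rolling values, indexed by matchweek rather than date.
form = pd.DataFrame({
    "xg_for_roll":     [1.8, 1.4, 1.2, 1.0, 1.1, 1.5, 1.9],
    "xg_against_roll": [0.7, 1.0, 1.3, 1.4, 1.2, 1.0, 0.8],
})
form["matchweek"] = range(1, len(form) + 1)

fig, ax = plt.subplots(figsize=(9, 5))
ax.plot(form["matchweek"], form["xg_for_roll"], color="#2e7d32", label="xG for")
ax.plot(form["matchweek"], form["xg_against_roll"], color="#b0413e", label="xG against")

# Shade green where xG for is on top, red where xG against is on top.
ahead = form["xg_for_roll"] >= form["xg_against_roll"]
ax.fill_between(form["matchweek"], form["xg_for_roll"], form["xg_against_roll"],
                where=ahead, interpolate=True, color="#2e7d32", alpha=0.12)
ax.fill_between(form["matchweek"], form["xg_for_roll"], form["xg_against_roll"],
                where=~ahead, interpolate=True, color="#b0413e", alpha=0.12)

ax.set_xlabel("Matchweek")
ax.set_ylabel("Expected goals per match")
ax.legend()
fig.tight_layout()
fig.savefig("rolling_xg_shaded.png", dpi=144)
```

Setting interpolate=True makes the shading meet at the point where the lines cross instead of stopping at the nearest matchweek on either side.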
If this is your first time drawing football data with matplotlib, the pitch-plotting basics — pass maps and shot maps with the mplsoccer library — are covered in draw your first pass map and shot map with mplsoccer, and they pair naturally with a form chart: the rolling line tells you when a team’s form turned, and a shot map tells you how. Between them you have moved from reading the table to interrogating it.
Sources & further reading
- Free textbook: Chapter 4: Python Tools for Soccer Analytics — the theory behind this, at DataField.dev.
- StatsBomb open data — free shot-level event data you can aggregate into the per-match xG-for and xG-against this chart needs.
- StatsBomb — the statsbombpy package and documentation for pulling that data into pandas.
- Understat — an alternative source of per-match xG figures for the major European leagues.
- FBref — match-level xG data (via Opta) across many competitions, exportable to CSV.