I got a very cool e-mail this week (I'll leave the name out to protect the innocent):
I have a quick question for you. Why do I so rarely see time series in basketball? I work in climate science, and you can barely poke a stick without looking at a timeseries.
It just seems to me that it would be a great way to convey some things. A scorer's average points over a season or career, a team's w/l ratio... You could use a moving average, or plot against anomaly rather than absolute values, or look for trends to see who's improving and who's not.
My guess is that people subconsciously believe in the 'true' value of basketball stats, so while they are happy to delve into many different kinds of summaries of performance, they still tend to focus on the 'reality', which is a single snapshot rather than a value over time. I'm sure most people intuitively understand that these things move around over a season or career, after all you often read about player X having a bad January, or player Y being on fire since the All Star Break. Things like that would - should - jump out beautifully on some kind of time series.
For my next email, I'll move on to box plots. Just kidding.
First, anyone who jokes about box plots is ok by me.
Second, he has a good point. I rarely see this kind of "moving average" data over the course of a season. Instead, most people focus on whatever the up-to-the-minute scoring average, FG%, or name-your-favorite-stat-here happens to be. Less often, we might see a stat broken down by month or something like that.
This guy got me thinking to the point that I'd like to start working on this problem and, hopefully, add a feature to nbawowy in the future that lets people investigate temporal trends. I want to give a taste of the idea in this article. And of course, my guinea pigs are always the Golden State Warriors, lucky for you GSoMers.
Here I'll show charts for the top 6 scorers on the team last season, including Jarrett Jack and Carl Landry, who are no longer with the team (in case you missed it!). Hopefully, the charts mostly speak for themselves. Essentially, I'm plotting every shot during the season as a function of time and points scored. Instead of plotting symbols, I came up with the clever idea (patting myself on the back as I type this up) of plotting the name of the opponent, so you can (fairly) easily have a reference for the games.
On top of those points, I plotted a LOESS smoother, which is a way to perform a local regression. Think of this as a fancy sort of moving average: to get the smoothed value at a certain date, the points closest to that date are given the most weight, and points farther away are given progressively less weight. There is a "fudge factor" that controls how wide that weighting window is, and for this post, I just picked something that "looked right" to me. Of course, if we want to be more rigorous about it, that factor could be varied and the value that turns out to be the most predictive could be chosen. But that's all work for the future.
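For the curious, here is a rough sketch of that kind of smoother in plain Python. It assumes tricube weights and a local straight-line fit, which are common LOESS choices but not necessarily what produced the charts in this post. The function names, the span argument (standing in for the "fudge factor"), and the tiny data set are all made up for illustration.

```python
def tricube(u):
    """Tricube weight: 1 at the target date, falling to 0 at the window edge."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def loess_point(x0, xs, ys, span=0.5):
    """Weighted straight-line fit around x0, evaluated at x0."""
    n = len(xs)
    k = min(n, max(2, round(span * n)))           # points inside the window
    h = sorted(abs(x - x0) for x in xs)[k - 1]    # window half-width
    w = [tricube((x - x0) / h) if h > 0 else float(x == x0) for x in xs]
    if sum(w) == 0:                               # every neighbour sits on the edge
        w = [float(abs(x - x0) <= h) for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0         # fall back to a weighted mean
    return my + slope * (x0 - mx)


def loess(xs, ys, span=0.5):
    """Smoothed value at every x in xs."""
    return [loess_point(x, xs, ys, span) for x in xs]


# Invented example (not real shot data): day of season vs. points on each shot.
days = [1, 1, 3, 3, 5, 8, 8, 10, 12, 12, 15, 15]
points = [0, 2, 3, 0, 2, 2, 3, 0, 0, 2, 3, 2]
smoothed = loess(days, points, span=0.6)
```

A larger span pulls in more neighboring shots and gives a flatter curve, while a smaller one follows hot and cold streaks more closely. Picking the span by how well it predicts held-out games would be one way to make the choice less of a judgment call.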
What's visually striking is how variable the efficiency for each player is during the course of a season, even very good ones like Stephen Curry. I mean, I guess we all know this. But until you actually see it, it's not really appreciated. You can literally see how on fire Curry was around the time of the "54 point game" against the Knicks (his peak for the season) and how he severely trailed off by the end of the San Antonio series. And of all the players here, it looks to my eyes like Harrison Barnes had the most ups and downs of any of them. Not too surprising, right? But there it is on paper. Anyway, folks, enjoy the data!