

Measuring Coaching Ability - Applying Lessons from FootballStudyHall to USC and Its Rivals

FootballStudyHall came out with a fascinating look at how to measure the ability of coaches. While there are some minor concerns with the methodology (some of which can't really be helped), it overall helps advance the discourse in college football about how much recruiting and coaching matter. We analyze those conclusions, and look at how Spurrier (and select opponents) have measured up over the years.

Who the hell is ChickenHoops? And why does he buy into data that says Tommy Bowden out-coached me on game day in 2007 and 2008?
To start with, you might like to know who you're reading.  I'm Kyle, formerly of the website ChickenHoops, where I was somewhat well-known (emphasis on the "somewhat," less on the "well-known") in the blogging community for statistical analysis of Gamecock basketball and football (to a lesser extent), as well as my disdain for bunting and lazy media tropes.

GamecockMan and the rest of the fine folks over here at GABA were kind enough to ask me to come over and write for the site, and I couldn't be more excited to be here.

So you know my perspective: I'm a 2006 graduate of the university who was born and raised in Irmo, and spent large swathes of my childhood at Williams-Brice, the Carolina Coliseum, and Sarge Frye Field.  I moved to Washington, DC seven years ago, where I became a corporate attorney who enjoys his job but would occasionally rather analyze Connor Shaw's propensity to take sacks than transfers of intangible property.

Because of that background, I'll probably spend a lot of football season tinkering with data provided by the savants over at, as they do an incredible job of compiling data and providing top-notch analysis of college football.  One thing I hope to bring to that is additional analysis of what they do that focuses on the Gamecocks, which takes their work and brings it closer to our mission here.  And that's exactly how I spent my first post here at GABA, which I hope you enjoy.

1.  Measuring Coaching Effect

This week, RedmondLonghorn wrote a guest post that analyzed how the apparent talent of a team tracked its ultimate results, as measured by F/+, which combines two rating systems - FEI (which evaluates teams on a possession-by-possession basis) and S&P+ (which analyzes both play-by-play and drive data).  His methodology was relatively simple: he took the average star rating of each program's recruiting classes, and then compared two measures of his definition of "talent" ("pure talent," which takes the simple average of the 4 years, and "adjusted talent," which counts the juniors and seniors more) against the F/+ each team put together for each year.

The key next step that Redmond took was to then credit this over-performance or under-performance of a team to the coach, on the following assumption: Team Quality = Apparent Talent + Coaching Effect + "Noise".

Once we take out each team's apparent talent, we can see which teams over- and under-performed their expectations to discern how well coaches are doing.  This works if we assume that Noise is essentially random and ultimately evens out over time.  Thus, while one year may be explained by something not captured in this data, over time, we should be able to discern the best and worst coaches (at least, before accounting for the ability to recruit itself).
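For the code-inclined, here's a rough sketch of that idea in Python.  To be clear, this is my own illustration and not Redmond's actual code: the function names, the plain least-squares fit, and the made-up numbers are all assumptions.  It fits a line that predicts team quality from talent, then treats whatever is left over (the residual) as coaching effect plus noise.

```python
from statistics import mean

def fit_line(talent, quality):
    """Ordinary least squares fit of quality = intercept + slope * talent."""
    t_bar, q_bar = mean(talent), mean(quality)
    sxx = sum((t - t_bar) ** 2 for t in talent)
    sxy = sum((t - t_bar) * (q - q_bar) for t, q in zip(talent, quality))
    slope = sxy / sxx
    return q_bar - slope * t_bar, slope

def coaching_residuals(talent, quality):
    """Actual quality minus the quality that talent alone would predict."""
    intercept, slope = fit_line(talent, quality)
    return [q - (intercept + slope * t) for t, q in zip(talent, quality)]

# Hypothetical team-seasons: average star rating and an F/+ style rating
talent = [2.5, 3.0, 3.5, 4.0]
quality = [-10.0, 5.0, 0.0, 25.0]

# Positive residual = did more with the talent than the line expects
residuals = coaching_residuals(talent, quality)
```

A single season's residual mixes coaching with noise, which is why the averaging-over-time assumption matters so much.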

I've done my best to explain his analysis here so you don't have to go over and read the article, but you should.  However, if you decide to charge ahead here, hopefully that basic description will suffice so that you can see what his data had to say about Spurrier, Swinney, and a few of our other rivals.

2.  Some Minor Issues With Methodology

There are a few issues I had with the methodology, though these are mere quibbles and do not cause me to reject the results generally.  First, it would obviously be optimal if we could adjust the talent portion of our analysis for attrition.  This is particularly relevant for coaches that take over programs and face a mass exodus of talented players who were loyal to the prior coaching staff (think Frank Martin's 2012-13 season).  I also don't think a simple average of classes was the best approach, since it over-weights classes with fewer players and under-weights classes with more players (e.g., a class with 8 players who were all 4 stars is weighted equally to a class with 24 players who were all 2 stars, so while those 32 players average 2.5 stars, this methodology would treat them as averaging 3 stars).
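To make the class-size quibble concrete, here's a small Python sketch of the two ways to average.  The function names are mine and purely illustrative; the numbers are the two classes from the example above.

```python
def simple_class_average(classes):
    """Average of the per-class averages; every class counts the same."""
    return sum(avg for _, avg in classes) / len(classes)

def player_weighted_average(classes):
    """Average over every player; bigger classes count for more."""
    total_players = sum(n for n, _ in classes)
    return sum(n * avg for n, avg in classes) / total_players

# (number of signees, average star rating) for each class
classes = [(8, 4.0), (24, 2.0)]

print(simple_class_average(classes))     # 3.0
print(player_weighted_average(classes))  # 2.5
```

The half-star gap comes entirely from treating an 8-man class as equal to a 24-man class.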

Again, there's much to be said for taking shortcuts to account for the brevity of life, and I think the data still tells a very interesting story, but these are a few areas where the methods could be improved (also, as mentioned in the comments to the article, I would've used 247's Composite rating instead of Scout).  But let's put those things aside for now and focus on all the cool things we can learn from this project.

3.  So What Did We Learn?

Redmond summarizes his findings thusly:

Not surprisingly, there was a definite and fairly strong correlation between Apparent Talent and team quality (as measured by F/+). The correlation coefficient between the simple 4 year Average star rating and the Predictor was 0.51. Interestingly enough, the correlation with my age-weighted model was actually slightly lower, at 0.49. This was based on sample of 466 observations: all seasons for teams in the ACC, Big East, Big 10, Big XII, Pac-10/12, and SEC from 2006 through 2012.

This is clearly a strong correlation, and shows us that we can learn a lot about a team by knowing how much talent they have on hand at any given time.

I appreciate this is not a surprise to you.  But trust me, more is coming.

4.  Why Does Talent Matter Less the More You Have?

One interesting thing from the data is that "the leverage of apparent talent on teams quality is a lot less than a 1:1 ratio."  Which means, in layman's terms, that the more talent you have, the less each additional highly rated recruit you reel in matters.  Redmond uses the following example: a team with 40% better than average talent would be expected to be 18% better than the average team.
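Here's the arithmetic behind that example as a quick Python sketch.  The big assumption, and it's mine rather than Redmond's, is that the relationship is a straight line, so the same ratio applies at every talent level; the function names are also just for illustration.

```python
def implied_leverage(talent_edge, quality_edge):
    """Quality edge gained per unit of talent edge."""
    return quality_edge / talent_edge

def expected_quality_edge(talent_edge, leverage):
    """Quality edge predicted for a talent edge, assuming a straight line."""
    return leverage * talent_edge

# Redmond's example: 40% better talent goes with an 18% better team
leverage = implied_leverage(0.40, 0.18)            # about 0.45, well short of 1:1

# Under the same assumption, a 20% talent edge buys roughly a 9% better team
half_edge = expected_quality_edge(0.20, leverage)
```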

I think the most likely reason for this is that most teams only play 30-40 players for meaningful snaps on any given day, and so the 84th and 85th recruit shouldn't be as important as the 1st, 2nd, or even 30th most-talented player.  It would be interesting to see the story told by the data if we only looked at the top half of a team's talent base.

5.  Recruiting Matters

One conclusion that is inescapable from this data?  Recruiting matters.  While there are people out there (looking at you, message boards) who still argue that recruiting ratings don't matter, this data belies that point, though that point has been thoroughly debunked for years now.  While you can surely find anecdotal instances where a team excelled despite lesser talent (or failed despite great talent), the general rule is that to be a good football team you need good football players.  I'll not dispute that there are potentially other avenues to success, but you'd better be pretty good at everything else if you're going to account for that discrepancy.  If all else is even, the team with the more talented players will win.  It's not rocket science.

6.  But So Does Coaching

While recruiting of course matters, the fact that it doesn't explain everything tells us that there's more to what makes up a good football team than just the talent on the roster.  As we saw in the equation earlier, if a team's ability is equal to its talent + its coaching (+ noise, which is everything else that matters), then once we've controlled for talent, we can see how well our coaches are doing.

This is an important caveat.  As the article itself explains:

In this theoretical model, the "noise" factor includes a whole host of stuff ranging from key players being suspended, scandals turning the mojo around a program decidedly poor, to a whole cohort of players turning out to be grossly overrated. As anybody who follows college football knows, this stuff happens and can definitely affect how a team performs on the field, but from a statistical perspective it is truly unknowable, random and therefore can’t be considered.

But let's not forget the ultimate point, which is well summarized in this paragraph:

It stands to reason that over time, strong coaching staffs ought to consistently and measurably outperform (or at least perform in line) with the available talent on hand and weaker staffs will underachieve (sometimes spectacularly). Using the Apparent Talent metric and the simple model described above, we can statistically measure how well a staff performs, both in single season and over time. Of course even good coaches can have an outlier year to the downside and even lousy coaches are fortunate once in a while (cough, cough…Gene Chizik), but consistent deviation from what is predicted by the model ought to be statistically unlikely enough that it should indicate something about coaching performance. And as with most data of this kind, the really large outliers are very interesting.

7.  Before We Analyze, Some Caveats.

So this is all really interesting stuff.  However, there are a few assumptions I wanted to unpack before we plow ahead:

(a)  This assumes that noise in college football is random.  I'm not sure that I agree.  For instance, Oklahoma State and Oregon have some advantages that may go beyond just their ability to attract talent (which is captured in the talent portion of our equation), but that shouldn't be credited to the coach.  In general, it's hard to separate a coach from the athletic program in a lot of cases, and so we may risk over-crediting coaches for their successes or failures.

(b)  Obviously, the talent you have on hand is a direct result of the coach in many instances, and so he should be held responsible for that level of talent.  I agree, and so let's just acknowledge that here.  However, we need to also acknowledge that not all of the talent on-hand for a coach is a result purely of his efforts.  Steve Spurrier consistently had better talent at Florida than at Duke, and that wasn't because he suddenly learned how to recruit better between 1989 and 1990.  If we could figure out a way to understand what the baseline recruiting for any specific program is and then credit the increase (or decrease) from that baseline to the coach, we'd do it.  But that's pretty impossible to separate out, especially when you've had coaches at a school for quite some time, so let's just acknowledge the imperfection and move on.

(c)  Of course, there will be some exceptions.  But in some instances, perhaps the exception proves the rule.  As many will note, this analysis does not think well of one Gene Chizik, and yet his season with Cam Newton is one of the best in the chart.  However, doesn't that just go to show that it takes that sort of outlier to impact the data?  If we can all agree that this analysis cannot account for the presence of a once-in-a-generation freak of nature on offense, then we'll all be happier.  Also, to be clear, I'm similarly still extremely angry about that 2010 Auburn team, but let's try to focus here.

8.  Enough Nonsense - How Do the Coaches Stack Up?

The final chart in the post looks at how each season for every BCS program from 2006-2012 has gone from a coaching perspective.  Wisely, Redmond measures how a coach performs by how many standard deviations above or below expectations his team finishes.  As shown at this link, if the results are roughly normally distributed, think of 1 standard deviation above expectations as being around the 84th percentile, 2 standard deviations as about the 98th percentile, and 3 standard deviations as beyond the 99th percentile.  If you want to know more about standard deviations, I strongly encourage you to go read something other than a football blog.
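That said, if you'd rather check the percentile math yourself, here's a quick Python sketch.  It assumes the over- and under-performance numbers are roughly normally distributed, which is my assumption and not something the post verifies; the function name is just for illustration.

```python
from math import erf, sqrt

def percentile_from_z(z):
    """Share of a normal distribution that falls below z standard deviations."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Percentile for a season 1, 2, or 3 standard deviations above expectations
for z in (1, 2, 3):
    print(z, round(100 * percentile_from_z(z), 1))
```

Plug in a negative number and you get the other tail: a season 1 standard deviation below expectations lands around the 16th percentile.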

You can scroll through all the data at your leisure back on the main post, but I wanted to specifically analyze how Spurrier's done here during his time at Carolina.  Then, for fun, let's look at how some of our main rivals have stacked up over the years.

Spurrier has an interesting trajectory at Carolina, and I think it goes more to show that while the Gamecocks have recruited well throughout his tenure, it's only been recently that we've been able to turn that recruiting into success.  Simply put, the teams we put on the field from 2007-2009 were about as talented as the teams we fielded in the last 3 years.  However, you don't need me to tell you how improved our performance has been when you compare those two three-year spans, and the chart below clearly indicates that the improvement doesn't stem from a talent influx (though there's been some of that, to be sure) as much as an influx of player development.  Connor Shaw wasn't as highly touted as Stephen Garcia.


One of my favorite charts is Dabo Swinney at Clemson.  As you all well know, he replaced Tommy Bowden in 2009.  In his 4 years thus far at Clemson, he has never done as good a coaching job on game day as Bowden did in 2007 or 2008.  Of course, since Swinney has strongly increased Clemson's recruiting take, this has still resulted in the Tigers improving as a team during his reign.  However, I can't help but chuckle at the fact that his worst coaching season just so happened to be the year the Tigers won a weakened ACC (and got obliterated by Carolina in Williams-Brice):


Let's look at a few other schools for fun as well.  As you can see from the following chart, Georgia's Mark Richt has been getting it done with talent down in Athens, not necessarily exceptional coaching, though last year was his best work in the seven years captured by this data:


Florida has also had a roller coaster ride that you can see coming without this data - the last year of Meyer and the first year of Muschamp went terribly; otherwise, a team with nothing but talent has been well-coached, which leads to elite performance.  But man, 2010 and 2011 were just brutal years for the Old Boys from Florida.


I could go on forever.  There's some amazingly interesting data - Chizik does nothing without Newton, Petrino tears it up at both Louisville and Arkansas, Harbaugh doesn't put together a great season until Andrew Luck (so was it Luck, or Harbaugh?), Marrone never puts together a positive season until last year, Kiffin has never put together a season that was over 0.05 standard deviations above his talent... we could be here all night.  In fact, one great thing to discuss would be the list found in this comment of the coaches who have done the best and worst over the last seven years, according to this data (in the top 14 - Tommy Bowden and Butch Jones; not pictured, Steve Spurrier.  You decide how that makes you feel about the information, as well as about Tennessee's hire of Jones).

It will be interesting to see how much further Redmond and others can take this type of work.  But for now, I encourage you to take a look at it yourself, and share anything interesting you find with us in the comments.
