Monday, November 9, 2009

Under the Hood: Updates

Over the last few weeks I've been getting feedback on my system and kicking around some new ideas with a colleague of mine.  He's off creating his own system -- and hopefully will start posting here soon, too -- and while I'm not adopting his methodology wholesale, I am making a few tweaks in response to his comments.

For a quick refresher on how my system works, check out Pomeroy's primer on tempo-free statistics and my translation of those principles to college football.

The first change I'm making is reducing the exponent in the Pythagorean expectation formula from 3.0 to 2.7.  This may seem like a small change, but it flattens the winning percentage distribution, pulling the extreme teams back toward 0.500.  In the Week 10 rankings the team in the dead center of our distribution was Minnesota with a 0.5001 winning percentage, which makes them our example of a perfectly average college football team.  All winning percentages and odds can therefore be expressed in terms of "games against Minnesota".
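To make the flattening concrete, here is a minimal sketch of the standard Pythagorean expectation formula.  The function name and the efficiency values fed into it are made up for illustration; the real system works from opponent-adjusted offensive and defensive efficiencies, so this will not reproduce the published rankings.

```python
def pythagorean_expectation(offense: float, defense: float, exponent: float) -> float:
    """Expected winning percentage from an offensive and a defensive rate.

    `offense` is what the team produces and `defense` is what it allows,
    in the same units. Standard form: off^k / (off^k + def^k).
    """
    produced = offense ** exponent
    allowed = defense ** exponent
    return produced / (produced + allowed)


# Hypothetical efficiencies (not taken from the actual rankings).
teams = {
    "strong": (3.0, 1.0),   # produces three times what it allows
    "average": (2.0, 2.0),  # produces exactly what it allows
    "weak": (1.0, 3.0),     # allows three times what it produces
}

for name, (offense, defense) in teams.items():
    old = pythagorean_expectation(offense, defense, 3.0)
    new = pythagorean_expectation(offense, defense, 2.7)
    print(f"{name:8s} exponent 3.0: {old:.3f}   exponent 2.7: {new:.3f}")
```

In this sketch the average team stays at exactly 0.500 under either exponent, while the strong and weak teams both move toward 0.500 when the exponent drops, without crossing it.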

For example, in the Week 10 rankings under the previous formula, Florida (0.972 winning percentage) would have a 97.2% chance of defeating Minnesota.  The Golden Gophers would have to play 25 games against Florida in order to have better than 50-50 odds of winning at least one game.  That's two whole seasons of futile college ball against a single team.  Changing the exponent to 2.7 means that Florida's winning percentage drops from 0.972 to 0.963, a modest slip.  However, this brings a small glimmer of hope to Minnesota.  Now they would "only" have to play 18 games against Florida to have 50-50 odds of winning at least one game.
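The "games against Minnesota" arithmetic can be sketched as follows.  This assumes every game is independent and that the favorite's win probability is the same each time, which is a simplification; the function names are my own.

```python
import math


def break_even_games(favorite_win_pct: float) -> float:
    """Fractional number of games at which the underdog's chance of
    winning at least once reaches exactly 50%."""
    return math.log(0.5) / math.log(favorite_win_pct)


def games_for_even_odds(favorite_win_pct: float) -> int:
    """Smallest whole number of games that gives the underdog a
    better-than-even chance of winning at least once."""
    games = 1
    # The underdog loses all `games` games with probability p ** games.
    while favorite_win_pct ** games >= 0.5:
        games += 1
    return games


for pct in (0.972, 0.963):
    print(pct, round(break_even_games(pct), 1), games_for_even_odds(pct))
```

With the rounded 0.972 this gives a break-even point of about 24.4 games, so 25 games are needed, matching the figure above.  With the rounded 0.963 the break-even point is about 18.4 games: 18 games leaves Minnesota just shy of even odds, and the 19th pushes them past it.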

To be clear: this change will not affect the favorite in a matchup, simply the odds that the favorite will win.  It reflects that there is more uncertainty in college ball than predicted under the old system.

The second tweak is an adjustment to the home field advantage factors.  The previous values under-estimated teams' ability to win at home early in the season and over-estimated it in November and December (odd side note: home teams actually have a losing record starting in mid-November, going 276-316).  Unlike the exponent change, this one can affect which team is predicted to win a given game, since it adjusts the offensive and defensive efficiencies.

Over the next week I also plan to explore the predictive effect (if any) of strength of schedule.  Namely, if two teams with similar winning percentages but vastly different strengths of schedule play each other, do any statistically significant trends emerge?  Is there actually an advantage to being "battle-tested"?  Is it possible to play too rigorous a schedule and get worn down?

Further out I hope to explore the effect of two teams playing each other several times (e.g., conference foes).  Does repeated exposure to an opponent level the playing field?  Or do we only notice these upsets -- such as Stanford-Oregon -- more often because of the hype the press puts on these matchups?  More to come, hopefully with pretty charts and graphs.