Friday, March 21, 2008

March Madness 2008 - 1st Round Simulation

Well, unfortunately I didn't get a chance to add the database backend to my simulation program. I got caught up in a couple of other web programming projects for a friend's internet radio station. I guess there's always next year...

...but wait! Against all odds (and sensibility), I decided to simulate the tournament anyway -- entering all the stats into the source code by hand. Well, not completely by hand. That would just be plain old nuts. I created an Excel spreadsheet to help me format the lines of code (arrays) properly. It's still waaay labor intensive, hence I've only simulated the first round so far.

So how is my program doing? Not so hot it turns out.

At the end of the first round, it has predicted 17 of the 32 games correctly -- about 53%, just barely over half. Basically no better than random chance. Ouch.
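To put a number on "no better than random chance," here is a quick sanity check. It assumes the simplest possible baseline, a coin flip on every game, and asks how often pure guessing would get at least 17 of 32 right. The function name `prob_at_least` is made up for this illustration and is not part of the simulation program.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of at least k successes in n independent tries,
    each succeeding with probability p (binomial upper tail)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

correct, games = 17, 32  # first-round record from this post

accuracy = correct / games
# Assumption: the "random chance" baseline is a fair coin flip per game.
coin_flip_tail = prob_at_least(correct, games)

print(f"accuracy: {accuracy:.1%}")
print(f"chance a coin flipper gets >= {correct}/{games}: {coin_flip_tail:.1%}")
```

A coin flipper matches or beats 17 of 32 roughly 43% of the time, so this record really can't be told apart from guessing.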

Keep in mind these simulations are based purely on box score statistics, which can be deceiving. Box score stats leave out a lot of factors, not the least of which is the level of competition a team faced to achieve those stats. In general, teams in weaker conferences tend to have their stats "over-valued," or artificially inflated, compared to teams that have played stronger competition throughout the year. This is a fundamental limitation of my program, and I've been putting a lot of thought recently into how to remedy it. Not an easy task. There are other limitations, and even some known flaws in the code that need to be fixed, so I wasn't expecting a whole lot anyway. I was just curious to see how it would do in its current state.

I'm not going to post the predicted scores of each game, because it's pretty pointless with the known issues (ahem, bugs). Instead, I'll post predicted winning percentages, which is probably still somewhat pointless, but hides the known issues a little better. ;)
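For anyone wondering what a "predicted winning percentage" means here, the rough idea is to play the same matchup many times and count how often each team wins. The sketch below is only a toy stand-in, not the actual program: the `simulate_game` model (scores drawn from a bell curve around a scoring average), the `ppg` and `spread` fields, and the two example teams are all invented for illustration. The real program works from box score stats in ways this post doesn't go into.

```python
import random

def simulate_game(team_a, team_b, rng):
    """Return True if team_a wins one simulated game.

    Hypothetical model: each score is a normal draw around the team's
    scoring average. Ties are redrawn instead of modeling overtime.
    """
    while True:
        score_a = round(rng.gauss(team_a["ppg"], team_a["spread"]))
        score_b = round(rng.gauss(team_b["ppg"], team_b["spread"]))
        if score_a != score_b:
            return score_a > score_b

def win_percentage(team_a, team_b, n_games=1000, seed=None):
    """Percentage of n_games simulated games won by team_a."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    wins = sum(simulate_game(team_a, team_b, rng) for _ in range(n_games))
    return 100.0 * wins / n_games

# Made-up teams and numbers, purely for illustration.
team_a = {"name": "Team A", "ppg": 75.0, "spread": 10.0}
team_b = {"name": "Team B", "ppg": 70.0, "spread": 10.0}

pct = win_percentage(team_a, team_b, n_games=1000, seed=2008)
print(f"{team_a['name']} wins {pct:.1f}% of simulated games")
```

A percentage hides a lot of the ugliness: even if the individual simulated scores are off, the share of games each team wins only depends on which side comes out ahead.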

Yellow Highlight: Outcome predicted correctly
Red Text: Outcome not predicted correctly
Percentage in parentheses is predicted winning percentage
1000 games simulated per matchup

EAST



MIDWEST


SOUTH


WEST


So, as you can see...not so hot. Definitely room for improvement.

As I said before, I'm doing this semi-manually, so it's taking a good amount of time. I don't have anything beyond the first round simulated yet, and will be gone this weekend for Easter. I'm hoping to get the rest of the tournament simulated before the Sweet 16 games start next week.

I can't wait to see who this program picks to win it all...could be interesting. :P

Stay tuned!