Jump right to our Machine Learning Bracket Predictions or Betting Odds or Try the Model Now!
You can download the updated training dataset here.
Madness! Chaos! Well, our AI-driven March Madness bracket is just as busted as yours. The first two rounds of the NCAA Tourney were just as unexpected and crazy as life in the last year. Upsets reigned. The first two rounds produced 12 (!!) upsets in which the winner was seeded at least 5 spots lower than its opponent. And while we were disappointed to see our model’s pick to win it all (Ohio State) lose in the first round, we couldn’t help but cheer the underdogs along - even those our model didn’t pick.
We can’t help but take another bite at the apple. We’ve taken a look at our model, made some adjustments that we’ll describe here, and have re-forecasted from the Sweet Sixteen forward.
There is no sugarcoating it - our first NCAA bracket was a bust. As the dust cleared on the round of 32, none of our Final Four teams had even made it to the Sweet 16.
Amid all the carnage there were a few places where our model performed well. In the First Four, we correctly predicted 3 of the 4 play-in games. In the first round, we got 23 of 32 picks correct. In the second round, we correctly predicted 8 of the 16 games. Breaking down the first and second round in a bit more detail:
Despite upsets taking out all of our Final Four picks, our bracket is currently in the top 8% of entries in ESPN’s bracket challenge.
In the sage words of Chumbawamba, “I get knocked down, but I get up again.” After we took a break from our misery to watch that music video, we set out to build a new model and earn back our dignity. We made three big changes to the training data - all of which are intended to improve model performance while avoiding teaching it the “wrong” things.
Taken together these changes had a substantial impact on predicted outcomes. The new model prioritizes the strength of schedule, points scored, and offensive rebounds.
We pressure tested the new model against the first two rounds and found it performed similarly to our old model (20/32 correct in round 1 and 8/16 in round 2, versus 23/32 and 8/16 for the old model). That said, all four of its Final Four teams are still alive, whereas the first model has none of its Final Four left. With bracket crushing in mind, this is a real improvement, because correct picks in later rounds are worth far more in bracket scoring than correct picks in early rounds.
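For anyone who wants to check the arithmetic, here is a minimal Python sketch that turns the pick counts quoted in this post into accuracy figures and illustrates why later rounds matter more. The win counts come straight from the text above. The scoring rule is an assumption for illustration only: we assume points per correct pick start at 10 and double every round, which is a common bracket-pool convention, and the function names are ours rather than part of any real scoring tool.

```python
# Correct picks per round as reported in this post: (correct, games played).
results = {
    "old model": {"round 1": (23, 32), "round 2": (8, 16)},
    "new model": {"round 1": (20, 32), "round 2": (8, 16)},
}


def accuracy(correct, total):
    """Fraction of games predicted correctly."""
    return correct / total


def combined_accuracy(rounds):
    """Accuracy pooled across every round in the dict."""
    correct = sum(c for c, _ in rounds.values())
    total = sum(t for _, t in rounds.values())
    return correct / total


def points_per_pick(round_number, base=10):
    """ASSUMPTION: points double each round, starting at `base` in round 1."""
    return base * 2 ** (round_number - 1)


for name, rounds in results.items():
    for round_name, (correct, total) in rounds.items():
        print(f"{name}, {round_name}: {accuracy(correct, total):.1%}")
    print(f"{name}, combined: {combined_accuracy(rounds):.1%}")

# Under the assumed doubling rule, a team reaching the Final Four is a
# correct round-4 pick, worth several first-round picks.
ratio = points_per_pick(4) // points_per_pick(1)
print(f"One correct round-4 pick = {ratio} correct round-1 picks")
```

Under that assumed rule, a model that loses a few first-round games but keeps its late-round teams alive can still outscore one that was sharper early, which is the tradeoff described above.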
You can predict any matchup you like by entering the year and the two team names below and clicking “predict”.
Pretty cool that you can run predictions against a cloud-deployed machine learning model right in this blog post! To save you some time we did the heavy lifting of running all of the matchups from here on out and have shared them below.
This new model ends up following conventional wisdom - it picks the higher seed to win in every game. The seed input only accounts for a very small percentage of the predictive weight of the model, so essentially, we’re seeing that the underlying statistics support the seeding of the teams.
At first glance, this is now a pretty boring bracket. But given how turbulent and upset-prone the first two rounds were, it would be wild if there were no more upsets to come in the remaining games.
This is the last model we will be making for the NCAA tournament this year, but we will post individual game projections on the Akkio Twitter as those games happen. If the model gets some of its Final Four picks wrong, we will run and share its predictions for each of the remaining games.