Generating brackets through statistics in Sports Stat
Every spring, March Madness overtakes both the world of sports and the classroom. The basketball championship draws more than 35 million U.S. adults who fill out brackets and bet an estimated $3.1 billion, and it also consumes the Sports Statistics class.
Taught by David Stein, the second-semester class focuses on real-world applications of statistical ideas rather than theory. As part of the course, Blazers take part in a tradition that dates back to 1977: predicting the wins and losses of the teams in the three-week, 67-game tournament.
Instead of filling out their brackets by hand, they use Minitab, a statistical software package, to build a model that generates the bracket for them. Working in teams of four, each group combs through previous years' data on the 68 competing teams to select variables that will help them create an accurate model.
Senior Jared Ramirez comments on the difficulty of choosing only a small subset of variables to generate an accurate model. "There are so many variables in [a] game like basketball that you could actually potentially test, and we had to narrow it down to maybe four to six variables that we thought could accurately predict the entire tournament," Ramirez says.
The Blazers determined which variables they thought would matter most and repeatedly tested them against the dataset. Ramirez emphasizes the significant amount of time that this process took. "What that meant was testing different variable combinations, over and over again, until we actually found something we were happy with, [which] took quite a bit of time," Ramirez says.
After devising and optimizing their models, students in the class uploaded them to a competition on Kaggle, a data science and machine learning website, to see how they would perform. Some have been fairly accurate so far, with the model from Ramirez's group even predicting Saint Peter's upset of Kentucky on March 17.
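For readers curious what "testing different variable combinations" can look like in practice, here is a minimal sketch in Python. It is not the class's Minitab workflow. The variable names, the toy game records, and the simple "higher total wins" rule are all hypothetical stand-ins chosen for illustration. The sketch tries every combination of candidate variables within a size range and keeps the one that picks the most past winners correctly.

```python
from itertools import combinations

def pick_winner(team_a, team_b, variables):
    """Pick the team with the higher total across the chosen variables."""
    score_a = sum(team_a[v] for v in variables)
    score_b = sum(team_b[v] for v in variables)
    return "a" if score_a >= score_b else "b"

def accuracy(games, variables):
    """Fraction of past games where the pick matches the recorded winner."""
    correct = sum(
        pick_winner(g["a"], g["b"], variables) == g["winner"] for g in games
    )
    return correct / len(games)

def best_subset(games, candidates, min_size=4, max_size=6):
    """Try every combination of candidate variables within the size range."""
    best_vars, best_acc = None, -1.0
    for size in range(min_size, max_size + 1):
        for subset in combinations(candidates, size):
            acc = accuracy(games, subset)
            if acc > best_acc:  # keep the first subset that reaches a new high
                best_vars, best_acc = subset, acc
    return best_vars, best_acc

# Hypothetical toy data: made-up, already-standardized stats (higher = better).
toy_games = [
    {"a": {"off": 1.0, "def": -0.5}, "b": {"off": 0.0, "def": 0.5}, "winner": "b"},
    {"a": {"off": 0.2, "def": 0.8}, "b": {"off": 0.9, "def": 0.1}, "winner": "a"},
    {"a": {"off": 0.5, "def": 0.3}, "b": {"off": 0.4, "def": 0.6}, "winner": "b"},
]

chosen, score = best_subset(toy_games, ["off", "def"], min_size=1, max_size=2)
print(chosen, score)
```

With dozens of candidate statistics, the number of four-to-six-variable combinations grows quickly, which is one reason a search like this can take "quite a bit of time."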
March Madness brackets are not the only project in Sports Statistics. The Blazers also look at the statistics behind decisions football coaches make in high-pressure situations, specifically on fourth down. In these cases, coaches have three choices: punt the ball, attempt a field goal, or go for the first down.
Senior Leila Faraday is surprised at how often the coaches are not making the most statistically optimal choice. "It was interesting to look into why that was, [since] these coaches or people have trained their whole life and should know exactly what they're doing. It's kinda cool to look at why human brains sometimes choose to do things that aren't statistically beneficial just because of human nature to choose what feels safe," Faraday says.
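One common way to frame a "statistically optimal" fourth-down choice is to compare the expected points of each option. The sketch below assumes that framing. The article does not say which method the class uses, and every probability and point value in the example is made up for illustration, not a real football figure.

```python
def expected_points(p_success, value_success, value_failure):
    """Weighted average of the outcomes of one choice."""
    return p_success * value_success + (1 - p_success) * value_failure

def best_fourth_down_choice(options):
    """options maps a choice name to (p_success, value_success, value_failure)."""
    values = {name: expected_points(*params) for name, params in options.items()}
    best = max(values, key=values.get)  # choice with the highest expected points
    return best, values

# Hypothetical inputs for a single fourth-down situation.
example_options = {
    "punt": (1.0, -0.5, -0.5),
    "field goal": (0.5, 3.0, -1.0),
    "go for it": (0.6, 2.5, -1.5),
}

choice, values = best_fourth_down_choice(example_options)
print(choice, values)
```

A coach who punts in a situation like this one would be picking the option that feels safest rather than the one with the highest expected value, which is the kind of gap Faraday describes.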
An avid sports fan, Ramirez appreciates the opportunity to analyze the game in a different way. "Unlike most classes I've done in school, we actually get a chance to take what we learned in a previous class and apply it to a very real-world situation like March Madness… It's been just a very fresh, very exciting and invigorating experience all around," Ramirez says.
Though not much of a sports lover herself, Faraday nonetheless enjoys applying the concepts she learned in AP Statistics the year before to a real-world scenario. "In math class, they use these perfect examples to describe things, but usually, things are messy and weird. I think this class has given me a good understanding of things when [they] get messy and weird and how real-life statisticians or sports experts are handling [them]," Faraday says.
Whether it is a new way of seeing the game for sports aficionados or a real-world application of statistical topics, Sports Statistics has something to offer everyone.
If interested, rising seniors can add Sports Statistics to their schedule by contacting their counselor.
Isabelle Yang. Hi! I'm Isabelle (she/her). Outside of SCO, I love to listen to music, hike and solve puzzles.