When I was in middle school, I picked up the newspaper to read an article on the opponent our varsity football team was playing that night (for our younger audience, this is what a newspaper is). The article said the opposing team’s running back was averaging almost 6 yards per carry, which is very high. I decided that I wanted to keep stats during the game to see if he was able to average 6 yards per carry against our defense. I made a template that would let me record if the play was a pass or a run and how many yards the play gained. That night I calculated that our defense held the other...
A few weeks ago, I wrote about how fantasy football scores of quarterbacks have been rising the last six years. This led to the conclusion that you should draft an elite QB early in your draft. But, much like how you need to carefully consider all aspects of a process during a Six Sigma project, we now have to consider everything else that the quarterbacks can affect. Specifically, wide receivers and tight ends. After all, if QBs are throwing for more yards and touchdowns, somebody has to be catching those passes and getting more fantasy points.
Let’s use data and statistical software to see...
“Study the past if you would divine the future” – Confucius
In just a handful of words, Confucius was able to adequately summarize the world of statistics. Because when you think about it, the main use of statistics is to study the past so we can divine the future. An engineer performs a reliability analysis on defective refrigerator compressors in order to predict warranty claims. A quality engineer studies whether the can-filling process meets specifications to ensure future cans are made correctly.
Confucius’s quote could apply to many situations, but I’m pretty sure I know what specific area...
If there is one thing just about everybody agrees on when it comes to fantasy football, it’s that you need to spend your first pick on a running back. It’s even common for your first two picks to be running backs. And there is good reason. Last year I did some data analysis and confirmed that you should still be drafting a running back with your first pick. But that was 365 days ago! There has been an entire season’s worth of data since then. Just like in the quality improvement world, the landscape of sports is constantly changing. You'd better keep your information up to date, or you’ll end...
On October 14th, 1992, the Pittsburgh Pirates and Atlanta Braves were set to play Game 7 of the National League Championship Series. The winner went to the World Series, the loser went home.
I was a mere 8 years old, and as much of a Pirates fan as an 8-year-old can be. I also had a friend who was a Braves fan, and in the school bus line we argued over who was going to win the entire series. I had yet to hear of statistics or Minitab, so my arguments consisted of little more...
A few weeks ago, I used Minitab to calculate the odds of throwing a perfect game. The results were surprising. I found that the number of perfect games that have occurred since 1900 is vastly greater than the number we would have expected. And whether you're doing a Six Sigma project or a simple baseball data analysis, it's always good to go back and make sure you did everything correctly whenever you find surprising results.
To determine the number of perfect games we would have expected to occur since 1900, I calculated the probability of getting 27 outs in a row. To do this, I used the...
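As a rough illustration of the "27 outs in a row" idea, here is a minimal Python sketch. It assumes every batter is retired independently with the same probability, and it uses a made-up on-base percentage of 0.320 purely as a placeholder. Neither the function name nor that number comes from the post, and the post's actual method is cut off above.

```python
def perfect_game_probability(on_base_pct, outs_needed=27):
    """Chance of recording `outs_needed` consecutive outs.

    Assumes each plate appearance is independent and ends in an out
    with probability (1 - on_base_pct). This is a simplification.
    """
    out_prob = 1.0 - on_base_pct
    return out_prob ** outs_needed


# Hypothetical league-wide on-base percentage, for illustration only.
ILLUSTRATIVE_OBP = 0.320

p = perfect_game_probability(ILLUSTRATIVE_OBP)
print(f"P(27 straight outs) = {p:.2e}")
print(f"Roughly 1 in {1 / p:,.0f} starts")
```

Multiplying this per-start probability by the number of starts since 1900 would give an expected count of perfect games under the same assumptions.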
If you like baseball pitching statistics, then you've loved the month of June. On the first of the month, Johan Santana pitched the first no-hitter in Mets history. Then a week later, the Seattle Mariners used 6 different pitchers to do the same thing. That tied the MLB record for most pitchers used in a no-hitter. And finally, 5 days after that, Matt Cain pitched the 22nd perfect game in major league history. And we're only halfway through June! It doesn't take a Six Sigma Black Belt to realize it's been a crazy month.
But as a stat nerd, the question I have is how crazy has June really been?...
It’s early June, and the Pittsburgh Pirates aren’t mathematically eliminated from the playoffs yet! But seriously, the Pirates are above 0.500, which is a big deal because they’ve had a record 19 consecutive losing seasons.
A winning season would finally allow them to break their embarrassing losing streak. But there is something else interesting about the Pirates season this year. They’ve actually been outscored by 22 runs! In their 56 games, they've scored 179 runs and given up 201. You would expect a team to have a losing record when they’ve given up more runs than they’ve scored, not the...
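One common way to put a number on this mismatch is Bill James's Pythagorean expectation, which estimates winning percentage from runs scored and runs allowed. The sketch below applies it to the 179 and 201 runs quoted above. The classic exponent of 2 is an assumption, and this is not necessarily the analysis the post goes on to use.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from run totals (Pythagorean expectation)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


games = 56
expected_pct = pythagorean_win_pct(179, 201)
expected_wins = expected_pct * games

print(f"Expected winning percentage: {expected_pct:.3f}")
print(f"Expected wins in {games} games: {expected_wins:.1f}")
```

Under these assumptions the expected winning percentage falls below 0.500, which is what makes a winning record with a negative run differential stand out.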
It’s May, the one month of the year that Americans are most likely to tune into horse racing. But this year is different than most. I’ll Have Another has a chance to become the first horse since 1978 to win the Triple Crown! This makes me wonder a couple of things:
- How often do horses win both the Kentucky Derby and Preakness (first two races in the Triple Crown)?
- Where do horses that won the Derby and Preakness usually place in the Belmont Stakes?
- What are the odds that no horse has won the Triple Crown since 1978?
- What is the best horse name of all time?
Ok, well I can use Minitab to answer those...
This past week, the History Channel premiered a new show called the "United Stats of America." No, that's not a typo. It's a show hosted by twin brothers who are both standup comedians and obsessed with statistics. Since I'm also obsessed with statistics (I'm still working on the standup comedy part), I thought I'd check it out to see if I could relate any of their stats to common applications of Minitab Statistical Software.
The show attempts to reveal some of the most interesting and surprising statistics in America. For example, only 8% of teenage boys use soap when they wash their hands....
In my last blog post, I used Minitab Statistical Software to try to determine whether Mel Kiper or Todd McShay is better at predicting the 1st round of the NFL draft. Just like you might need to decide between suppliers in a quality improvement situation, I needed to decide who to go with when I filled out my 2012 NFL Mock Draft Office Pool. Well, as it turns out there wasn't any statistical difference between the two, and the cost was exactly the same. So I just flipped a coin to decide who to go with, and it came up Kiper!
Now that the first round of this year's draft is complete, I can do...
It’s a common occurrence in any quality improvement situation: You have 2 (or more) suppliers that offer you the same product. How do you decide which one to choose? You could just flip a coin. But that wouldn’t be very sensible, would it? No, instead it’s probably best to do some data analysis to help you make your decision.
So what does this have to do with the NFL draft? Well, everybody and their brother are drawing up mock drafts right now. There are different suppliers offering me the same product! How do I know which one to use when I fill out my NFL Mock Draft Office Pool (that’s a...
There are just 3 games left in the 2012 NCAA basketball tournament. This means we have 64 games that we can do some data analysis on! So before the games get underway this weekend, I'm going to use Minitab to break down some of the craziest things that happened in this year's tournament.
How do I define crazy? Well, since I'm a stat nerd, I'm going to use probability! The lower the probability, the crazier the event. All of these probabilities are based on predictions by the regression model I developed over the last few weeks.
NOTE: The regression model calculates probabilities based on...
Finally! Not only is it March, but the NCAA tournament brackets are out! It's almost like being a kid opening a brand new toy on Christmas morning. Hey, I said almost!
Anyway, in case you missed it, over the past few weeks I’ve used Minitab to create a regression model that predicts the probability one basketball team has of beating another, then improved that model, and tested the model. But there is one little bit of housekeeping that has to be done before we break down the brackets.
A Model for Neutral Site Games
The previous model took into account which team was playing at home. So...
It’s finally March, which means it’s almost time for the NCAA basketball tournament. I’ve spent my last two posts finding a regression model that predicts the probability that a college basketball team has of beating another team. But our data analysis shouldn't stop once we have our model. Just like in any quality improvement situation, we should test our model to ensure that it actually works!
What, it’s still not March? Blasted February, why won't you just end already! Oh well, at least it gives us time for some more data analysis.
In my last post, I used Minitab’s Fitted Line Plot to create a regression model that predicted the probability of a home team winning a basketball game based on the difference in ranks between the two teams. This model had an r-squared value of 95.2%, which is great. But since it’s still February, let’s spend some time trying to improve on that number.
Improving the Regression Model
My last model used the difference in ranks between two teams. This assumes...
I know, I know. It’s not March yet. But it’s never too early to start thinking about your bracket. Can Murray St. be this year’s Butler? Which elite team really is the best? Can some double-digit seed give us an upset celebration as good as Hampton in 2001? There are so many questions. Let’s see if we can use Minitab Statistical Software to answer them.
Now, there are many statisticians out there who are already using data analysis to rank college basketball teams. These rankings can easily be used to predict the winner of a game by just looking at which team is ranked higher. But which set...
Five weeks into the NFL season, I used Minitab's regression analysis to predict player performance for the rest of the year. But when you make predictions, it's always good to go back after the fact and see how well you did. In my previous post I used a fitted line plot to compare my predictions and the players' final averages. Now I'm only going to look at the top 10 players at each position to see how many my model correctly predicted.
NOTE: The regression model predicted the top players based on points per game, not overall points. Because of this, the final rankings are also based on...
Five weeks into the NFL season, I used Minitab's regression analysis to predict player performance for the rest of the year. Now that the regular season is over, it's time to go back and see how accurate the regression analysis was! I'm going to use a fitted line plot to compare my predictions and the players' final...
Back in October, I ranked the top 25 players at each position and predicted the average number of fantasy points per game they would finish the season with. So how accurate were those predictions? Minitab’s fitted line plot will let us see how close my predictions were,...
I recently read an article that talked about the randomness of this year's shortened NBA season. Because of the lockout, the season will be only 66 games long, instead of 82. The article says that having a sample size that's 16 games fewer than normal means there's a lot of uncertainty about how the season will play out. But just how much more uncertainty will there be?
Fortunately, Minitab Statistical Software has an entire set of Power and Sample Size tools that can answer that question!
We want to investigate how the sample size affects the margin of error around the proportion of games a...
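To get a feel for how season length affects that margin of error, here is a small Python sketch using the textbook normal-approximation formula for a proportion, z * sqrt(p(1 - p) / n). The 0.500 winning percentage and the 95% z-value of 1.96 are assumptions chosen for illustration, and Minitab's Power and Sample Size tools may use a different (for example, exact) method.

```python
import math


def margin_of_error(n_games, win_pct=0.5, z=1.96):
    """Normal-approximation margin of error for a winning percentage."""
    return z * math.sqrt(win_pct * (1 - win_pct) / n_games)


# Compare the lockout-shortened season with a full one.
for n in (66, 82):
    print(f"{n} games: +/- {margin_of_error(n):.3f}")
```

Because the margin shrinks with the square root of the number of games, cutting 16 games widens it only modestly under these assumptions.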
The calendars have just turned to December, so you know what time of the year it is. No, not the holiday season! It's time for people to complain about the Bowl Championship Series.
This year the main complaint is that Alabama, not Oklahoma State, gets to play Louisiana State (LSU) in the title game. Some voters in the Harris Poll don't even have Oklahoma State in the top 3! This has led some to call out certain voters in the Harris Poll. But before we go crying "conspiracy," it's best to do some data analysis. So I'm going to use Minitab Statistical Software to give us some descriptive...