“Fear leads to anger, anger leads to hate, hate leads to
suffering.” Yoda
This quotation sums up the way most people’s relationship to
statistics develops. But not me.
My name is Cody Steele, and I’m the guy who Eston
Martz warned you about. From the time my 6-year-old self
decided that Superman’s most amazing power was his intelligence,
through buying a calculus book after college so that I could
integrate functions for fun, I always liked math. After I took my
first statistics course, I was hooked. It astounded me that
statistics could help you make decisions when all you had to go on
was...
My family moved to Los Angeles in 1987, just as the Los Angeles Lakers were in the midst of winning back-to-back championships. While I don’t consider myself a huge basketball fan, the NBA finals always hold some interest for me. If you get to watch James Worthy, Michael Cooper, Byron Scott, A.C. Green, Magic Johnson, and Kareem Abdul-Jabbar win championships, it sticks with you.
So now that the Spurs and Heat are competing for the 2013 edition of the NBA championships, I get a little drawn in by the excitement. One of the interesting occurrences from this year’s finals is that the two teams...
Previously I wrote about using a decision matrix to help make a decision. Matrices are nice tools for collecting your thoughts and visualizing a decision. But complex decisions could involve collecting and synthesizing input from a number of different people.
Quality Companion (Minitab's process improvement software) uses ballots to let team members record their input to a decision matrix. If you’ve already made the matrix, setting up the ballot is easy. The ballot simplifies data collection and organization, even among team members who are dispersed in space and time. You can follow along in...
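To make the idea of combining ballots into one decision matrix concrete, here is a minimal sketch in Python. It is not how Quality Companion computes anything; the options, criteria, weights, and ratings are all hypothetical, and averaging each criterion across ballots before weighting is simply one reasonable assumption.

```python
from statistics import mean

# Hypothetical criteria weights (they sum to 1) and hypothetical ballots.
weights = {"cost": 0.5, "fun": 0.3, "travel_time": 0.2}
ballots = {
    "voter_1": {
        "zoo": {"cost": 3, "fun": 4, "travel_time": 5},
        "museum": {"cost": 4, "fun": 2, "travel_time": 4},
    },
    "voter_2": {
        "zoo": {"cost": 5, "fun": 4, "travel_time": 3},
        "museum": {"cost": 4, "fun": 4, "travel_time": 2},
    },
}


def score_options(weights, ballots):
    """Average each rating across ballots, then take the weighted sum per option."""
    options = list(next(iter(ballots.values())))
    scores = {}
    for option in options:
        scores[option] = sum(
            weight * mean(ballot[option][criterion] for ballot in ballots.values())
            for criterion, weight in weights.items()
        )
    return scores


scores = score_options(weights, ballots)
best_option = max(scores, key=scores.get)
print(scores, best_option)
```

Keeping each ballot separate until the final step means a late ballot can be added without touching anyone else's ratings.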
Normally, I tell you about ways to practice with Minitab Statistical Software so that you can boost your confidence with statistical analysis. But over the last few days in my house, we’ve been planning some activities for the family. That planning has given me a chance to have some fun with Quality Companion.
Quality Companion is a substantial piece of software: everything that you need to manage a quality improvement project in one application. Quality Companion provides project management tools so that you can make and communicate decisions.
My favorite tools in Quality Companion, with...
One of the more misunderstood concepts in statistics is alpha, more formally known as the significance level. Alpha is typically set before you conduct an experiment. When the p-value calculated from a hypothesis test is less than the significance level (α), results at least as extreme as the ones you observed would be so unlikely to happen by chance alone that the more plausible explanation is the effect being studied. That the results are unlikely to happen by chance is what we mean by the phrase “statistical significance,” not to be confused with practical significance.
There was a wonderful example...
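The comparison between a p-value and alpha can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the data (16 heads in 20 coin flips) are hypothetical, the test is an exact two-sided binomial test against a fair coin, and alpha is fixed at 0.05 before the p-value is calculated.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, p_null: float = 0.5) -> float:
    """Exact two-sided p-value: total null probability of every outcome
    that is no more likely than the observed one."""
    probs = [
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = probs[successes]
    # The small tolerance keeps outcomes that tie with the observed one.
    return min(1.0, sum(p for p in probs if p <= observed * (1 + 1e-9)))


ALPHA = 0.05  # significance level, chosen before looking at the data

p_value = binomial_two_sided_p(successes=16, trials=20)  # hypothetical data
significant = p_value < ALPHA
print(f"p-value = {p_value:.4f}, alpha = {ALPHA}, significant: {significant}")
```

Note that the decision rule says nothing about whether the size of the effect matters in practice; that judgment is the practical significance part.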
My holy of holies is the human body, health, intelligence, talent, inspiration, love, and the most absolute freedom imaginable, freedom from violence and lies, no matter what form the latter two take.
-Anton Chekhov
Normally, I write about subjects that are generally of interest to me when I do a blog post: you may have seen the list before. So I’ll have to start out this post with the admission that until I was leaving the office on Monday, I had no idea that the Boston Marathon was going on that day. I had no clue that the estimate of the number of spectators would be over...
I have a good time putting together simple data sets that you can use to build your confidence in statistics. I tend to like fairly old things: Shakespeare (1564-1616), Poe (1809-1849) and gummi bears (invented 1922). But I have some modern interests too. One of those, appearing in about 2009, is Minecraft.
If you like Minecraft, then here’s a data set that you can use to practice a few things in Minitab Statistical Software. One of the nicest things about Minitab is that even with this spreadsheet, saved in Google Docs, you can copy and paste directly into Minitab.
Change Data with the...
Normally, I like to talk about fun statistical things to build your confidence: gummi bears, poetry, and movies, just to name a few. But building your confidence also means getting comfortable with Minitab Statistical Software. One of the features that makes it easy to view your results and data in a snap is Minitab's Project Manager.
My favorite way to use the Project Manager is through the toolbar:
Click the leftmost button once, and you see all of the output in your project. Click the second button once, and you see all of the worksheets in your project. Click the third button once, and...
Statistician-to-the-Stars William Briggs deserves credit for his correct prediction of the Best Picture Oscar the day before the ceremonies. And while Mr. Briggs would never encourage anyone to misuse his model this way, I feel my statistics heartstrings strummed by the desire to remind everyone about a particular common and dangerous statistical mistake: Correlation does not equal causation.
Mr. Briggs correctly predicted Argo would be selected as Best Picture from among the nominated films and noted that "The key reasons for its victory will be: the lead actor is at least forty, the other...
It’s an amazing thing when a mass of rock and iron streaks through space and enters Earth’s atmosphere. So naturally, the Chelyabinsk meteor has attracted a great deal of attention. We’re fascinated by the images and captivated by the stories. And, if you’re interested in statistical analysis, you start to wonder a little bit about meteorites.
The nice thing is that the Meteoritical Society has a large database with information about meteorites recovered on Earth. The database has over 50,000 records.
It’s particularly neat to see where people find meteorites with recoverable masses. A Pareto...
Various commercials valiantly vied for the attention and dollars of football and Super Bowl commercial fans on Sunday. The game was decided objectively (or by the referees), but the drama of which commercials won lives on. Because we love data analysis, I gathered a little data to see which efforts really stood out. Then I plotted the data in Minitab to explore the results.
Three top-ten lists attracted my attention:
- True Reach by Visible Measures: The number of views on any website
- Bluefin Labs: The number of mentions on Twitter and public Facebook comments
- TiVo Rank: The rank for the number of...
Entities should not be multiplied unnecessarily.
— William of Occam, Quodlibeta Septem
We’ve had a chance now to explore Best Subsets Regression and Stepwise Regression in Minitab Statistical Software. Both techniques are ways of quickly looking at lots of candidate models so that you can identify promising ones.
We’ve seen that statistical significance and model fit statistics can’t guarantee a fit as good as we’re looking for. With stepwise regression, we found a model with unsatisfactory residuals until we considered extra terms. With best subsets regression, we had to understand that there...
Orlando: And wilt thou have me?
Rosalind: Ay, and twenty such.
Orlando: What sayest thou?
Rosalind: Are you not good?
Orlando: I hope so.
Rosalind: Why then, can one desire too much of a good thing?
William Shakespeare, As You Like It, Act IV, Scene I.
When looking at best subsets regression, Shakespeare’s question about whether one can desire too much of a good thing becomes immediately important. With the power of best subsets regression, you can quickly explore models with twenty such. For the gummi bear data, you could even try models with thirty-such! And as...
Last time, we used stepwise regression to come up with models for the gummi bear data. Stepwise regression is a great tool, but it has a downside: when we use stepwise selection in design of experiments, especially if we focus on only the last step, we can miss interesting models that might be useful.
One way to look at more models is to use Minitab’s Best Subsets feature. Instead of identifying a single model based on statistical significance, Minitab’s Best Subsets feature shows a number of different models, as well as some statistics to help us compare those models.
To get the idea, let’s...
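The general idea behind looking at many candidate models at once can be sketched in plain Python. This is not Minitab's algorithm or output; it is an illustration that fits ordinary least squares to every subset of predictors and reports R² and adjusted R², which are my choice of comparison statistics. The predictor names (`x1`, `x2`, `x3`) and all data values are hypothetical.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(columns, y):
    """Fit y on an intercept plus the given columns (normal equations); return R^2."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def best_subsets(predictors, y):
    """Return (size, names, R^2, adjusted R^2) for every non-empty predictor subset."""
    n = len(y)
    results = []
    for size in range(1, len(predictors) + 1):
        for names in combinations(predictors, size):
            r2 = r_squared([predictors[name] for name in names], y)
            adj = 1 - (1 - r2) * (n - 1) / (n - size - 1)
            results.append((size, names, r2, adj))
    return results


# Hypothetical data: eight runs of three coded predictors and a made-up response.
predictors = {
    "x1": [-1, -1, -1, -1, 1, 1, 1, 1],
    "x2": [-1, -1, 1, 1, -1, -1, 1, 1],
    "x3": [-1, 1, -1, 1, -1, 1, -1, 1],
}
y = [5.2, 4.9, 9.1, 8.7, 11.3, 11.0, 14.8, 15.1]

results = best_subsets(predictors, y)
for size, names, r2, adj in results:
    print(size, ", ".join(names), f"R2={r2:.3f}", f"adjR2={adj:.3f}")
```

R² can never go down when you add a term, which is why the sketch prints adjusted R² alongside it: the adjustment penalizes extra terms, so the two statistics together give a fairer comparison across model sizes.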
We’ve used design of experiments to look at the data. We’ve seen that the center points are statistically insignificant. We’ve seen that blocks help account for the unstable conditions during the collection of the data. Now for the exciting part: let’s choose a model to use to predict where the gummi bears will land when we launch them.
Various criteria exist for how to choose a model, so we’re not going to settle on a single model right away. We’ll do three steps:
- Come up with some candidate models.
- Check for reasons to discard the candidate models.
- Check how the models perform when we go back...
Last time I used design of experiments to look at the gummi bear data, I interpreted the center point data. The data say that I won’t need any square or cubic terms to get a good fit to the data. Traditionally, the next effect to look at in design of experiments is the block effect.
I was worried that there would be a wearout effect acting on my catapult, so I changed popsicle sticks and rubber bands periodically. I also simply didn’t have time to collect all of my data at the same time, so the blocks represent different days. Moreover, I collected the data for the third block in a...
I was in 11th grade when Mrs. Barrett was my English teacher at Poway High School. Around Halloween time, we studied “The Fall of the House of Usher” and she mentioned that the punctuation changes the rhythms of the story to create the intensity of the climax. To celebrate Halloween, and to honor the fine work done by thousands of educators teaching Edgar Allan Poe this time of year, here’s some statistical analysis about what Mrs. Barrett said. Look at the count statistics by paragraph for the punctuation that Poe reserves for just the right moment:
There are some noticeably unusual moments on...
When I chose a full factorial design for my gummi bear experiment, I was using traditional design of experiments practice to try to learn the most from the least amount of data. I wanted to see if I could save myself the 10 or more data points I would need to add to the design to estimate nonlinear effects. Now that I have some data, the first thing I’m going to learn is: Do I need to collect more data?
I hope I don't, because I would have to go buy more gummi bears. I already ate the bears I didn’t throw away.
I talked about the role of center points in design of experiments earlier. When we...
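For readers who want to see what a two-level full factorial design with center points looks like as a run list, here is a minimal sketch in Python. The factor names and the coded levels (-1 and +1, with center points at 0) are hypothetical placeholders, not the actual settings from the gummi bear experiment, and the runs are listed in standard order rather than randomized.

```python
from itertools import product


def full_factorial(factors, center_points=0):
    """Build every combination of the low/high levels, then append center-point runs."""
    names = list(factors)
    runs = [
        dict(zip(names, levels))
        for levels in product(*(factors[name] for name in names))
    ]
    # A center point sits halfway between the low and high level of each factor.
    center = {name: (factors[name][0] + factors[name][1]) / 2 for name in names}
    runs.extend(dict(center) for _ in range(center_points))
    return runs


# Hypothetical factors in coded units: (low, high).
factors = {"factor_a": (-1, 1), "factor_b": (-1, 1), "factor_c": (-1, 1)}

design = full_factorial(factors, center_points=3)
for run_number, run in enumerate(design, start=1):
    print(run_number, run)
```

The corner runs estimate the linear effects and interactions. The center points show whether the response in the middle of the design differs from what the corners predict, which is the signal that more data for nonlinear terms might be needed.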
Back when I chose the factors to study for my gummi bear design of experiments, I was thinking about the fact that something like the position of the gummi bear and the position of the fulcrum would probably interact. When I finished collecting the data, I was eager to see if that effect showed up in my analysis.
Before we look at the distance parallel to the catapult, let's look at the distance perpendicular to the catapult. I didn’t change any factors with the express purpose of making the gummi bear go left or right, so I was hoping all of these factors would be statistically insignificant....
I collected my first block of data for the gummi bear design of experiments this week. Why not all of it? Well, there’s lots you can learn when you start collecting data for real. Here are some of my thoughts:
Enter data quickly and accurately for design of experiments
If you’re going to do anything with your data, it’s a lot easier to have it in Minitab. If you followed my lead for doing design of experiments, you have a piece of paper that looks like this:
Accuracy will be much easier if the same person who wrote the data also enters it in the computer, so they can figure out if that number in...
Recently, we’ve discussed how to do the design and factor setup for design of experiments in Minitab Statistical Software. We’re almost ready to launch some gummi bears. But there’s something else to consider. When we produce the data for design of experiments, how does the data get from the measuring device to Minitab?
If you’re lucky, you have an electronic thingamajig that takes measurements and beams them directly to a desktop computer where they’re stored in an analyzable format. But that’s Joan-Ginther-or-Jim-Frost-lucky. The rest of us are probably going to have to use a classic...

