Happy St. Patrick’s Day! You might have found yourself thinking about four-leaf clovers and trying to find one yourself lately. According to Irish tradition, those who find a four-leaf clover are destined for good luck, as each leaf in the clover symbolizes good omens for faith, hope, love, and luck for the finder.
A lesser-known fact about four-leaf clovers is that they aren’t the luckiest symbol after all. Irish legend indicates that those who find a five-leaf clover will actually have more luck and financial success than those who just find a four-leaf clover.
However, good luck can be hard to come by if you’re only relying on finding a four-leaf clover, let alone a five-leaf clover! The estimated statistical odds of finding a four-leaf clover on your first try is 10,000 to 1, and the odds skyrocket to 1,000,000 to 1 if you’re looking to find a five-leaf clover on your first try. (The majority of the clovers you see outside have only three leaves.)
Suppose you want to delve deeper into descriptive statistics and compare the odds of finding a four-leaf clover to the odds of finding a five-leaf clover. In what is known as an odds ratio, you can compare the odds of two events, where the odds of an event equals the probability-the-event-occurs divided by the probability-it-does-not-occur.
In this case, you would find that your odds ratio is 100, showing that it is much more likely for you to find a four-leaf than a five-leaf clover. You can conclude that the odds of finding a four-leaf clover are 100 times greater than your odds of finding a five-leaf clover. (But keep in mind the odds here: it’s still pretty difficult to find a four-leaf clover, especially over a three-leaf clover!)
Odds ratios are not just important for comparing the odds of two events (like we did above with the clovers) — they also play an important role in logistic regression. With binary logistic regression (Stat > Regression > Binary Logistic Regression in Minitab Statistical Software), you can investigate the relationship between a binary response and one or more predictors. You can then use the odds ratio for the predictors to quantify how each predictor affects the probabilities of each response.
For example, suppose you are analyzing data from people who have found four-leaf clovers to determine whether the finder’s gender and age affect their finding abilities. You could create a logistic regression model with the following variables:
Variable | Type | Description |
Find | Binary response | Equals 0 if the person did not find, and 1 if the person did find |
Gender | Binary response | Equals 0 if the person is male, and 1 if the person is female |
Age | Continuous predictor | Equals the person’s age |
Suppose the logistic regression procedure declares both predictors to be significant. If gender has an odds ratio of 2.0, you conclude that the odds of a woman finding a four-leaf clover is twice the odds of a man finding a four-leaf clover. If age has an odds ratio of 1.05, you conclude that, for each additional year of the finder’s age, the odds of finding increase by 5%.
For each predictor variable in the logistic regression model, Minitab displays an odds ratio and a confidence interval for the odds ratio.
Of course, this is only a hypothetical example! I’m not sure your age or gender really matters—I think finding a four-leaf clover is all about LUCK. And that's pretty difficult to quantify!