This post might come across a little bit geeky, but hey, I like to apply my statistical acumen to my day-to-day life—especially if it gets me brownie points with my wife! On a more serious note, what I am about to show you can be very helpful in Lean Six Sigma initiatives, particularly if you are trying to improve customer satisfaction ratings across a range of products, services or sites.
But first, let me tell you how this all started… Every year, whether it’s my wife’s birthday or our anniversary, I have to keep my wife happy and mark the occasion with something special and perhaps something romantic (as I do need to live up to the French reputation). So I started browsing the Internet looking for somewhere posh and classy to take my wife for a romantic dinner. Having seen celebrity chef Gordon Ramsay hijack our TV screens for the past few years, I thought: Why not check out customer reviews for some of his allegedly outstanding restaurants in London? That could potentially guarantee me the "Wow!" factor when celebration time comes!
So I searched for Gordon Ramsay’s restaurant reviews on one of the U.K.’s most trusted travel and leisure Web sites. There are essentially 5 restaurants to choose from in London: Gordon Ramsay at Claridge's, Maze Gordon Ramsay, Restaurant Gordon Ramsay, The Savoy Grill-Gordon Ramsay, and Petrus. Users’ opinions are tallied and split by the following outcomes: Excellent, Very good, Average, Poor, and Terrible.
Although the Web site provides overall rankings based on the observed user counts for each outcome, the statistician in me wanted to dig a little deeper. Were the differences in rating between the restaurants statistically significant or simply due to random variation at the time I checked the reviews? Ultimately, I want to choose the restaurant that is statistically more likely to make my wife happy!
Minitab Statistical Software provides a great tool that does just that and more, so watch out, Gordon!
Using the Chi-Square Test for Association for Customer Satisfaction Data
The chi-square test for association (Assistant > Hypothesis Tests > Compare more than 2 samples > Chi-square test for association) determines whether the percentage of observations in each outcome category differs significantly across two or more samples.
So I used the power of the Assistant menu to analyse customer satisfaction data from the site and generate the lovely reports below. Here's a screenshot of the actual dataset used for the analysis:
I went to Assistant > Hypothesis Tests > Compare more than 2 samples > Chi-square test for association. The trick here is to mimic the layout of the actual worksheet, as illustrated below:
I clicked “OK” and Minitab generated 3 separate outputs. The first one is the Summary report:
The p-value at the top left corner of the output is displayed as 0.000, which means it is below 0.001 after rounding to three decimals. Since that is well under the 0.05 significance level, the Assistant concludes that differences among the outcome percentage profiles are statistically significant. In other words, there is an association between customer satisfaction and the actual restaurant visited.
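For readers without Minitab, here is a minimal sketch of the calculation behind that kind of p-value, written in plain Python. The counts are invented for illustration (they are not the figures from the review site), the restaurant labels are placeholders, and the helper names `expected_counts`, `chi2_sf` and `chi_square_association` are my own. I am assuming the standard Pearson chi-square statistic here, not claiming this is exactly what the Assistant does internally.

```python
import math

def expected_counts(table):
    """Counts expected in each cell if ratings were independent of restaurant."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi2_sf(x, df):
    """Upper-tail chi-square probability via the series for the incomplete gamma."""
    a, z = df / 2.0, x / 2.0
    if z <= 0:
        return 1.0
    term = total = 1.0 / a
    n = 1
    while term > total * 1e-15:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def chi_square_association(table):
    """Return (Pearson chi-square statistic, degrees of freedom, p-value)."""
    expected = expected_counts(table)
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, chi2_sf(stat, df)

# Invented counts: Excellent, Very good, Average, Poor, Terrible
counts = {
    "Restaurant A": [80, 12, 4, 2, 2],
    "Restaurant B": [50, 25, 12, 8, 5],
    "Restaurant C": [45, 30, 15, 6, 4],
}

stat, df, p_value = chi_square_association(list(counts.values()))
print(f"chi-square = {stat:.2f}, df = {df}, p = {p_value:.4f}")
```

With these made-up numbers the p-value falls well below 0.05, which is the same kind of result the Summary report flags as significant.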
So I have part of the answer to my question: statistically, these restaurants are not able to get consistent customer ratings based on the sample ratings collected from the Web site. However, this still does not tell me which restaurants perform better than others—and, ultimately, which one I should pick for my romantic dinner.
Now let’s take a look at the Percentage profiles chart at the bottom left corner of the output.
Looking at the average customer satisfaction profile for Gordon Ramsay’s restaurants in general, we can see that the outcome that gets the largest proportion is “Excellent” with 67% (blue bar at the top of the chart). In contrast, only 3% of the users rated the Gordon Ramsay experience as “Terrible.” So at first glance, it looks fairly good for me and my plans, regardless of the restaurant I end up choosing.
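The percentage profiles themselves are simple to reproduce: each count is divided by its restaurant's total, and the average profile pools all restaurants together. A rough sketch, again with invented counts and placeholder restaurant labels (the function name `percentage_profile` is hypothetical):

```python
ratings = ["Excellent", "Very good", "Average", "Poor", "Terrible"]

# Invented counts, in the same order as `ratings`
counts = {
    "Restaurant A": [80, 12, 4, 2, 2],
    "Restaurant B": [50, 25, 12, 8, 5],
    "Restaurant C": [45, 30, 15, 6, 4],
}

def percentage_profile(row):
    """Convert one row of counts into percentages of that row's total."""
    total = sum(row)
    return [100.0 * c / total for c in row]

# One profile per restaurant, plus the pooled (average) profile
profiles = {name: percentage_profile(row) for name, row in counts.items()}
overall = percentage_profile([sum(col) for col in zip(*counts.values())])

for name, profile in profiles.items():
    print(name, ", ".join(f"{r}: {p:.0f}%" for r, p in zip(ratings, profile)))
print("Average", ", ".join(f"{r}: {p:.0f}%" for r, p in zip(ratings, overall)))
```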
Let’s now focus on individual restaurants. I notice that we seem to have 2 clear winners, with the “Excellent” rating accounting for over 80% of their reviews: Restaurant Gordon Ramsay with 82% and Petrus with 86%! The chart on the bottom right, which represents the % difference between observed and expected counts (that is, the counts we would expect if there were no differences between the restaurants), also identifies the same 2 restaurants as the obvious top 2 choices. Looking at the bars on the negative (left) side of the chart, the “Terrible” outcome occurs 88% less frequently than expected for Petrus and 79% less frequently than expected for Restaurant Gordon Ramsay. Those are also the only 2 restaurants where the “Excellent” rating occurs more frequently than expected (blue bars on the positive side of the chart).
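To make the "% difference" idea concrete, here is a small sketch. I am assuming the chart plots 100 × (observed − expected) / expected for each cell, which matches the wording of the chart title but is my reading rather than Minitab's documented formula. The counts are invented and the function names are hypothetical.

```python
def expected_counts(table):
    """Counts expected in each cell if ratings were independent of restaurant."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def pct_difference(table):
    """Assumed definition: 100 * (observed - expected) / expected, per cell."""
    expected = expected_counts(table)
    return [[100.0 * (o - e) / e for o, e in zip(obs_row, exp_row)]
            for obs_row, exp_row in zip(table, expected)]

ratings = ["Excellent", "Very good", "Average", "Poor", "Terrible"]
names = ["Restaurant A", "Restaurant B", "Restaurant C"]
table = [                      # invented counts, not the review-site data
    [80, 12, 4, 2, 2],
    [50, 25, 12, 8, 5],
    [45, 30, 15, 6, 4],
]

diffs = pct_difference(table)
for name, row in zip(names, diffs):
    print(name, ", ".join(f"{r}: {d:+.0f}%" for r, d in zip(ratings, row)))
```

A negative value for “Terrible” and a positive value for “Excellent” on the same row is the pattern that marks a restaurant out as one of the better ones.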
The other 3 restaurants are in an almost symmetrical opposition, with some of the less flattering outcomes (Average, Poor, and Terrible) occurring much more frequently than expected. So based on this output, I would be tempted to assume that the significant difference found in this analysis (remember, the p-value is 0.000) probably comes down to the obvious contrast between the better restaurants (Restaurant Gordon Ramsay and Petrus) and the “not as good” restaurants: Gordon Ramsay at Claridge's, Maze Gordon Ramsay, and The Savoy Grill-Gordon Ramsay.
Before I go any further, I do need to ensure the validity of the p-value found above, by checking for potential violations in the second output Minitab's Assistant feature provides: the Diagnostic report.
Because all the expected counts are greater than 1, no cells are flagged with the “*” that indicates a violation, so it’s all good. The third output is the Report card, which again confirms the validity of the test with a nice check mark.
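The same sanity check can be sketched outside Minitab by listing any cells whose expected count is too small. The threshold of 1 below simply mirrors the rule of thumb mentioned above; I am treating it as an assumption rather than the Assistant's exact criterion, and the function name `low_expected_cells` and both tables are made up for illustration.

```python
def expected_counts(table):
    """Counts expected in each cell if ratings were independent of restaurant."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def low_expected_cells(table, minimum=1.0):
    """Return (row, column, expected) for cells whose expected count is below `minimum`."""
    expected = expected_counts(table)
    return [(i, j, e)
            for i, row in enumerate(expected)
            for j, e in enumerate(row)
            if e < minimum]

# Invented counts: every expected count is comfortably above 1
healthy = [
    [80, 12, 4, 2, 2],
    [50, 25, 12, 8, 5],
    [45, 30, 15, 6, 4],
]
# Invented sparse table: the first row has only one review in total
sparse = [
    [1, 0],
    [50, 50],
]

print(low_expected_cells(healthy))  # no flagged cells expected here
print(low_expected_cells(sparse))   # both cells of the first row are flagged
```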
So I've narrowed my choices to 2. Now, looking back at the Percentage profiles chart in the Summary report, it is actually very hard to establish whether there are any significant differences between the customer ratings of Restaurant Gordon Ramsay and Petrus. The profiles look almost identical.
Restaurant Gordon Ramsay Vs Petrus: Who Wins in the Chi-Square Showdown?
Which one shall I choose then? Does it really matter? Let’s find out by performing another chi-square test for association, this time comparing only the profiles of the 2 restaurants of interest. I'll use the Assistant menu again (Assistant > Hypothesis Tests > Compare 2 samples with each other > Chi-square test for association). I get the following output:
This time, we get a p-value of 0.373, which is greater than the significance level of 0.05. The Assistant therefore concludes that there is no evidence of differences among the outcome percentage profiles at the 0.05 level of significance. Looking at the % Difference between Observed and Expected Counts chart, we could argue that Petrus gets the “Excellent” rating more frequently than expected (2% positive difference) and the “Average” and “Terrible” ratings less frequently than expected (25% negative difference). But, as stated above, these differences are not statistically significant.
So as far as my date is concerned, the verdict is.......................flip a coin and hope for the best!