Grocery shopping. For some, it's the most dreaded household activity. For others, it's fun, or perhaps just a “necessary evil.”
Personally, I enjoy it! My co-worker opened my eyes to something that made me love grocery shopping even more: she shared the data behind her family’s shopping trips. Being something of a data nerd, I really geeked out over the ability to analyze spending habits at the grocery store!
So how did she collect her data? What I find especially interesting is that she didn’t have to save her receipts or manually transfer any information from them onto a spreadsheet. As a loyal Wegmans grocery store shopper, she was able to access over a year’s worth of her receipts just by signing up for a Wegmans.com account and using her Wegmans Shoppers Club card. The data she had access to includes the date, time of day, and total spent for each trip, as well as each item purchased, the grocery store department the item came from (e.g., dairy, produce, frozen foods), and whether a discount was applied. As long as she used her card, each purchase was tracked and accessible. Cool stuff!
She created a Minitab worksheet with her grocery receipt data from Wegmans for a several-month period, and shared it with me to see what kinds of Minitab analysis we could do and what we might be able to uncover about her shopping habits.
Using Time Series Plots to See Trends
Time series plots are great for evaluating patterns and behavior in data over time, so a time series plot was a natural first step in helping us look for any initial trends in my co-worker Ginger’s shopping behavior. Here’s how her Minitab worksheet looked:
And here’s a time series plot that shows her spending over time:
To create this time series plot in Minitab, we navigated to Graph > Time Series Plot. It is easy to see that the spending appears random over time, with several higher-dollar orders (likely her weekly bulk trip to stock up) and several smaller orders (things forgotten or extras needed throughout the week). There doesn’t appear to be a trend or pattern. Almost all of her spending remained under $200 per trip, which is pretty good considering that many of her trips looked to be weekly bulk orders to feed her family of four. There were also very few outlier points with extremely high spending, away from her usual pattern of spending between $100 and $150 three to four times per month.
However, you’ll notice that the graph above isn’t the simplest to read. To make it easier to zero in on monthly spending habits, we used the graph paneling feature in Minitab to divide the graph into more manageable pieces:
The paneled graph makes it even easier to see that spending appears to be random, but consistently random! For more on paneling, check out this help topic on Graph Paneling.
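If you want to try the same idea outside Minitab, paneling by month boils down to grouping each trip under its calendar month. Here is a minimal Python sketch of that grouping. The dates and totals are made-up sample values, not Ginger’s actual receipts, and the function name `panel_by_month` is just a hypothetical helper for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample trips as (date, total spent); NOT the real receipt data.
trips = [
    (date(2013, 1, 5), 142.10),
    (date(2013, 1, 9), 18.75),
    (date(2013, 1, 13), 120.40),
    (date(2013, 2, 2), 131.00),
    (date(2013, 2, 6), 22.30),
]

def panel_by_month(trips):
    """Group (date, total) pairs into one chronologically ordered list per month."""
    panels = defaultdict(list)
    for day, total in sorted(trips):
        panels[(day.year, day.month)].append((day, total))
    return dict(panels)

panels = panel_by_month(trips)
for (year, month), rows in panels.items():
    totals = [total for _, total in rows]
    print(f"{year}-{month:02d}: {len(rows)} trips, totals {totals}")
```

Each key of `panels` corresponds to one panel of the graph, so each month’s trips could then be plotted or summarized separately.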
Visualizing Spending Data by Day of the Week
To chart grocery spending by day of the week, we created a simple boxplot in Minitab (Graph > Boxplot):
It’s pretty easy to see that higher-spending trips took place on Saturdays, Sundays, Mondays and Tuesdays, with the greatest spread of spending (high, low, and in-between) occurring on Tuesdays. Wednesday appeared to be a low-spending day, with what looks to be quick trips to pick up just a few items.
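For readers curious about what a boxplot summarizes, here is a rough Python sketch that groups trip totals by day of the week and computes the five numbers a box is drawn from (minimum, first quartile, median, third quartile, maximum). The trips below are invented sample values, and the helper names are hypothetical. Quartile conventions also vary between tools, so these numbers would not necessarily match Minitab’s exactly.

```python
from collections import defaultdict
from datetime import date
from statistics import quantiles

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical sample trips as (date, total spent); NOT the real receipt data.
trips = [
    (date(2013, 1, 5), 142.10),   # Saturday
    (date(2013, 1, 6), 110.00),   # Sunday
    (date(2013, 1, 9), 18.75),    # Wednesday
    (date(2013, 1, 12), 95.50),   # Saturday
    (date(2013, 1, 13), 130.25),  # Sunday
    (date(2013, 1, 16), 12.40),   # Wednesday
]

def spending_by_weekday(trips):
    """Collect trip totals under the name of the weekday they fell on."""
    groups = defaultdict(list)
    for day, total in trips:
        groups[WEEKDAYS[day.weekday()]].append(total)
    return dict(groups)

def box_summary(values):
    """Return (min, Q1, median, Q3, max) using the 'inclusive' quartile method."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, q2, q3, max(values))

groups = spending_by_weekday(trips)
for name in WEEKDAYS:
    if len(groups.get(name, [])) >= 2:  # quantiles() needs at least two values
        print(name, box_summary(groups[name]))
```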
How about the number of trips occurring each day of the week? To see this, we created a simple bar chart in Minitab (Graph > Bar Chart):
The highest number of trips to Wegmans occurred on Sunday (35) and Saturday (26), which isn’t really a surprise considering that many people do the majority of their grocery shopping on the weekends when they have time off from work. It’s also neat to see that many of her trips on Wednesday and Thursday were likely smaller-dollar trips (according to our boxplot from earlier in the post). I can definitely relate to those pesky mid-week trips to get items forgotten earlier in the week!
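The counts behind a bar chart like this are simple to reproduce: tally one trip per date under its weekday. The Python sketch below does that with a handful of invented dates (not the real data) and prints a rough text version of the chart. The function name `trips_per_weekday` is a hypothetical helper.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical trip dates; NOT the real receipt data.
trip_dates = [
    date(2013, 1, 5),   # Saturday
    date(2013, 1, 6),   # Sunday
    date(2013, 1, 9),   # Wednesday
    date(2013, 1, 12),  # Saturday
    date(2013, 1, 13),  # Sunday
    date(2013, 1, 20),  # Sunday
]

def trips_per_weekday(dates):
    """Count trips per weekday, keeping days with zero trips in the result."""
    counts = Counter(WEEKDAYS[d.weekday()] for d in dates)
    return {name: counts.get(name, 0) for name in WEEKDAYS}

counts = trips_per_weekday(trip_dates)
for name, n in counts.items():
    print(f"{name:<9} {'#' * n} ({n})")
```

Keeping the zero-trip days in the result means the chart still shows all seven days, which makes gaps in the week easy to spot.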
Visualizing Spending Data by Department
And finally, what grocery store department does she purchase the most items from? To figure this out, we created a Pareto chart in Minitab (Stat > Quality Tools > Pareto Chart):
You can see that the highest number of items purchased is classified under OTHER, which we found to be a catch-all for items that don’t fit neatly into any of the other categories. Looking through the raw item descriptions classified as OTHER, I found everything from personal care items like toothbrushes to paper plates and specialty food items. The GROCERY category is also ambiguous, but it seems to be largely made up of canned and convenience foods (think applesauce, cereal, crackers, etc.). The rest of the categories (dairy, produce, beverages) seem pretty self-explanatory.
The Pareto analysis is helpful because it can bring perspective to the types of foods being bought. Healthier items will likely be in the produce and dairy categories, so it’s good to see that these categories have high counts and percentages in the Pareto chart above.
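A Pareto chart is built from a simple table: categories sorted from most to least frequent, with a running cumulative percentage. As a rough illustration, the Python sketch below builds that table from a short, made-up list of item departments. The department labels echo the ones in this post, but the counts are invented and `pareto_table` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical department label for each purchased item; NOT the real data.
items = ["OTHER", "GROCERY", "PRODUCE", "OTHER", "DAIRY",
         "GROCERY", "OTHER", "PRODUCE", "OTHER", "BEVERAGES"]

def pareto_table(departments):
    """Return rows of (department, count, percent, cumulative percent),
    sorted by descending count (ties broken alphabetically)."""
    counts = Counter(departments)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for dept, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        cumulative += count
        rows.append((dept, count, 100 * count / total, 100 * cumulative / total))
    return rows

rows = pareto_table(items)
for dept, count, pct, cum_pct in rows:
    print(f"{dept:<10} {count:>3} {pct:6.1f}% {cum_pct:6.1f}%")
```

The bars of the chart correspond to the counts, and the line drawn above them corresponds to the cumulative percentage column.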
Grocery Stores Love Data, Too
It’s certainly no surprise that grocery stores love to track consumer buying behaviors through store discount cards. This helps stores to better target consumers and offer them promotions they are more likely to take advantage of. But it’s also great that grocery stores like Wegmans are sharing the wealth and giving consumers the ability to easily access their own spending data and draw their own conclusions!