When the Covid-19 pandemic hit in 2020, mass transit organizations reallocated resources. Many cut train schedules and focused on track construction, infrastructure, and safety training.

As workers began to return to the office on a hybrid or full-time basis, many mass transit organizations failed to adequately adjust schedules. This led to safety concerns, unsanitary conditions, bypassed stations, overcrowded trains, and disappointed passengers.

As a regular commuter to my office in downtown Chicago, I've noticed changes over the past few years and thought I should investigate. With the help of Minitab Workspace and Minitab Statistical Software, these problems can be solved. Here’s how.

## A Use Case: What’s Bugging Passengers?

In a hypothetical scenario, a major Midwest transit authority conducted a customer survey to gauge customer satisfaction and identify areas for improvement. To their surprise, most customers were not happy. Several reasons were noted. The team used Minitab Workspace to visualize the most common complaints from customers:

The most common complaint they noticed was “peak hour congestion.” This was especially true during the morning rush hour (5 AM – 10 AM Monday through Friday). The next step was to use Minitab Statistical Software to visualize this data.
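The tally behind a complaint chart of this kind can be sketched in a few lines. This is a minimal illustration, not the team's Minitab Workspace output: the category names echo problems mentioned in this article, but the counts are invented.

```python
from collections import Counter

# Hypothetical survey responses; the counts are invented for illustration.
responses = (
    ["peak hour congestion"] * 42
    + ["cleanliness"] * 21
    + ["lack of seating"] * 17
    + ["bypassed stations"] * 12
    + ["safety concerns"] * 8
)

def pareto_table(items):
    """Return (category, count, cumulative percent) rows, largest count first."""
    counts = Counter(items).most_common()
    total = sum(n for _, n in counts)
    rows, running = [], 0
    for category, n in counts:
        running += n
        rows.append((category, n, round(100 * running / total, 1)))
    return rows

rows = pareto_table(responses)
for category, n, cum_pct in rows:
    print(f"{category:<22} {n:>3} {cum_pct:>6}%")
```

Sorting by count and tracking the cumulative percentage is what lets a team see at a glance which few complaints account for most of the dissatisfaction.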

## Data Visualization: When Do Riders Use Transit?

Generally, trains on the line we analyzed run every twenty minutes during rush hour. The team spent several weeks collecting data to estimate how many people rode each train. Once they had collected all the relevant data, they created two visualizations in Minitab: a boxplot and a scatterplot. Here is what their data looked like:

Their data showed that Tuesday, Wednesday, and Thursday had the heaviest ridership, with the Wednesday 8:20 AM and 8:40 AM trains the most heavily used. This fits the spread of hybrid work schedules, in which employees tend to work remotely on Mondays and Fridays.
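The summaries a boxplot draws on can be approximated with a short script. The sketch below assumes per-train passenger counts keyed by weekday and departure time. All numbers are hypothetical and cover only three trains on three days.

```python
from statistics import median

# Hypothetical passenger counts per train, keyed by (weekday, departure time).
counts = {
    ("Mon", "8:00"): 310, ("Mon", "8:20"): 355, ("Mon", "8:40"): 340,
    ("Wed", "8:00"): 480, ("Wed", "8:20"): 560, ("Wed", "8:40"): 555,
    ("Fri", "8:00"): 290, ("Fri", "8:20"): 330, ("Fri", "8:40"): 325,
}

def summarize_by_day(counts):
    """Group counts by weekday and return {day: (min, median, max)}."""
    by_day = {}
    for (day, _time), riders in counts.items():
        by_day.setdefault(day, []).append(riders)
    return {day: (min(v), median(v), max(v)) for day, v in by_day.items()}

summary = summarize_by_day(counts)
busiest = max(counts, key=counts.get)  # (weekday, time) of the fullest train

for day, (low, mid, high) in summary.items():
    print(f"{day}: min={low} median={mid} max={high}")
print("Busiest train:", busiest)
```

A full boxplot would also show quartiles and outliers; this version keeps only the minimum, median, and maximum to stay short.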

## How Can We Solve This Problem?

Leadership at the transit authority then split the data by day and used regression analysis to better understand each day's ridership trend. Here is the fitted line plot Minitab produced for Wednesdays:

Leadership can use the fitted equation to predict Wednesday ridership at any time of day, including off-peak hours.
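To show how a fitted equation turns a departure time into a ridership estimate, here is a minimal least-squares sketch. The article does not give the actual equation, so the data points are hypothetical and the model is assumed to be a straight line. A quadratic or cubic fit, which Minitab's fitted line plot also offers, may describe real rush-hour data better.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(intercept, slope, minutes_after_5am):
    """Estimated riders on a train leaving the given number of minutes after 5:00 AM."""
    return intercept + slope * minutes_after_5am

# Hypothetical Wednesday observations: minutes after 5:00 AM and riders per train.
minutes = [0, 60, 120, 180]
riders = [100, 220, 340, 460]

intercept, slope = fit_line(minutes, riders)
estimate = predict(intercept, slope, 90)  # a 6:30 AM departure
print(f"riders = {intercept:.1f} + {slope:.2f} * minutes")
print(f"Estimated riders at 6:30 AM: {estimate:.0f}")
```

Predictions far outside the observed time range should be treated with caution, since the fit only reflects the trains that were actually counted.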

Perhaps more importantly, the team wanted to see where there were clear statistical shifts in the pattern of ridership. To do this, they used MARS Regression, located in Minitab’s Predictive Analytics module, to break the data into segments where a clear shift in patterns can be observed. Here is their data for Wednesdays:

This one-predictor partial dependence plot added interesting context: although the heaviest volume of passengers traveled between 8:20 and 9:00 AM, the largest shift in the pattern of ridership occurred on the 7:40 AM train. With MARS, the team can also click the Predict button to obtain forecasts for each day of the week.
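MARS builds its model from hinge functions of the form max(0, x - knot), which is what lets it locate the point where a trend changes slope. The sketch below is a deliberately simplified, single-knot version of that idea, not Minitab's algorithm: it tries each interior departure time as a knot and keeps the one with the smallest squared error. The data are synthetic, constructed with a slope change at 160 minutes after 5:00 AM, and every function name here is an assumption made for illustration.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_hinge(x, y, knot):
    """Least-squares fit of y = b0 + b1*x + b2*max(0, x - knot); returns (coefs, sse)."""
    rows = [(1.0, float(xi), max(0.0, xi - knot)) for xi in x]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    coefs = solve(xtx, xty)
    sse = sum((yi - sum(c * v for c, v in zip(coefs, r))) ** 2
              for r, yi in zip(rows, y))
    return coefs, sse

def best_knot(x, y):
    """Try every interior x value as the knot; return (knot, coefs, sse) with the lowest SSE."""
    candidates = sorted(set(x))[1:-1]  # endpoints would make the fit degenerate
    fits = [(k, *fit_hinge(x, y, k)) for k in candidates]
    return min(fits, key=lambda t: t[2])

# Synthetic Wednesday data: trains every 20 minutes from 5:00 to 9:00 AM.
minutes = list(range(0, 241, 20))
riders = [50 + 0.5 * m + 3 * max(0, m - 160) for m in minutes]

knot, coefs, sse = best_knot(minutes, riders)
print(f"Slope change at {5 + knot // 60}:{knot % 60:02d} AM")
print(f"Slope before: {coefs[1]:.2f}, slope after: {coefs[1] + coefs[2]:.2f}")
```

Full MARS goes further: it adds hinge pairs in a forward pass, can place several knots across several predictors, and prunes terms afterwards, so this sketch only conveys the core mechanism.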

## So, What’s the Application?

Without this data, most transit organizations would advocate adding a train at the peak of rush hour, probably around 8:30 or 8:40 AM. At a granular level, however, the analysis suggests the transit system would reduce overcrowding more effectively by adding trains around 8:00 AM, shortly after the pattern of ridership shifts, rather than later at the peak.

Empowered with this data, the transit system should not need to redo its schedule twice, and it can be smarter about where it spends its limited resources. This step can be repeated for every day on which ridership surpasses the permissible threshold, to find the times where adding a train would have the most significant impact.
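Repeating the check across days amounts to filtering predicted ridership against a capacity limit. The following sketch assumes a hypothetical threshold of 500 passengers per train and invented predictions. Neither figure comes from the scenario above.

```python
THRESHOLD = 500  # hypothetical permissible passengers per train

# Hypothetical predicted riders per train, by weekday and departure time.
predicted = {
    "Tue": {"7:40": 430, "8:00": 505, "8:20": 540, "8:40": 530},
    "Wed": {"7:40": 470, "8:00": 520, "8:20": 560, "8:40": 555},
    "Fri": {"7:40": 280, "8:00": 310, "8:20": 330, "8:40": 325},
}

def trains_over_threshold(predicted, threshold):
    """Return {day: [departure times above threshold]} for days with at least one such train."""
    flagged = {}
    for day, trains in predicted.items():
        over = [time for time, riders in trains.items() if riders > threshold]
        if over:
            flagged[day] = over
    return flagged

flagged = trains_over_threshold(predicted, THRESHOLD)
for day, times in flagged.items():
    print(f"{day}: over threshold at {', '.join(times)}")
```

Days with no flagged trains drop out of the result, which keeps attention on the schedules that actually need another train.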

The team also surmised that fixing this problem first might ease some of the other problems, such as overcrowded stations, a lack of seating, and poor cleanliness.

In the end, the result will be happier riders, who are less likely to look for other ways to commute to work or school or to travel for pleasure.

## Data-Driven Problem-Solving for Transit

Public transportation is vital for many reasons, including its positive impact on the environment, economic benefits for riders, reduction of traffic congestion for all, and promotion of social equity. When these systems encounter problems, the effects reach not only riders but also the entire citywide ecosystem that counts on reliable transit.

Minitab can help mass-transit systems operate more efficiently, reliably, and safely by providing powerful data analysis tools to identify and correct issues or address them proactively. By leveraging Minitab's capabilities, transit authorities can optimize routes, improve maintenance schedules, and enhance overall service quality, ensuring a smoother and more dependable experience for all riders.