# The Great Minitab Miles-Per-Gallon Project

For those of you who have travelled to the UK before, it will come as no surprise that our fuel costs are amongst the highest in Europe. It appears that driving a car could soon become a luxury at this rate.

Unfortunately, most of us here at Minitab Ltd. rely on our cars to get to the office every day. So, unless we can relocate closer to the office (which may not be the easiest option), use public transport (notoriously pricey and at times unreliable) or influence petrol prices (very unlikely in the current economic climate), there is really little we can do to avoid feeling the pinch.

Okay, let's not be too cynical: there may be light at the end of the tunnel, in the form of a powerful statistical analysis tool.

Hail the power of Minitab!

We thought it would be interesting to monitor our fuel consumption habits over time, so we put together some real data in Minitab 16 and ran some graphical and statistical analyses, with the aim of finding ways to increase our mileage per gallon.

This is what the data looks like:

The "Average Mph" (average speed) and "Average Mpg" (average mileage per gallon) values are recorded from the onboard trip computer at every fuel stop. Amongst other things, we also take into account the type of engine (Petrol/Diesel) and the garage/petrol station where the refuelling took place.

The first thing we wanted to identify was the type of relationship between the average Speed and average Mileage per gallon.

We first generated a scatter plot of “Average Mpg” Vs “Average speed” (Graph > Scatterplot > Simple in Minitab). Once the graph was generated, we looked for possible trends/patterns in the data by observing the plotted points. We concluded that adding a quadratic regression fit (a fitted curve as opposed to a straight line) could possibly give us a good idea of what this relationship looks like. In Minitab, we used Editor > Add > Regression Fit, and selected Quadratic:

While this fit is far from great, it does look like there's a quadratic relationship, which would suggest that the average mileage per gallon is lower at very low and very high average speeds. However, some of the points do fall quite far away from the curve. This is probably to be expected, as we combined the results from all the drivers to generate the graph.
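For anyone curious about what a quadratic regression fit does behind the scenes, here is a minimal sketch in plain Python. It fits a curve of the form Mpg = b0 + b1*Mph + b2*Mph^2 by ordinary least squares. The function names and the speed/Mpg readings are made up for illustration; they are not our data and not Minitab's implementation.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the caller's lists are left untouched.
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest entry in this column for stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - known) / aug[i][i]
    return solution


def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2; returns [b0, b1, b2]."""
    # Normal equations (X^T X) b = X^T y for the columns 1, x, x**2.
    power_sums = [sum(x ** k for x in xs) for k in range(5)]
    xtx = [[power_sums[i + j] for j in range(3)] for i in range(3)]
    xty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Made-up illustrative readings, NOT the data from this post.
speeds_mph = [18.0, 24.0, 31.0, 37.0, 42.0, 48.0, 55.0]
mpg_values = [33.0, 38.5, 42.0, 43.5, 43.0, 40.5, 36.0]

b0, b1, b2 = fit_quadratic(speeds_mph, mpg_values)
print(f"Mpg = {b0:.3f} + {b1:.3f}*Mph + {b2:.5f}*Mph^2")
```

A negative coefficient on the squared term corresponds to the "hill" shape described above: Mpg rises with speed, peaks, then falls again.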

To make sure, we can look at the offending points in more detail using the brush tool, setting ID variables to display the drivers’ names associated with these points (Editor > Brush, select an area of interest, then Editor > Set ID Variables, select “Driver,” and click OK):

It appears that the area with an unusually high average Mpg can be attributed exclusively to Martyn and Nick. Interestingly, they are the only two drivers who proudly own diesel cars.

In contrast, Sandeep has an unusually low average Mpg at lower speeds, which is perhaps due to the fact that he lives very close to the office and regularly takes short journeys.

And finally, just for fun, it looks like Isaac is a rather fast driver--which is not surprising either, as he drives a sporty BMW.

It is quite amazing how much information you can get out of a simple scatter plot!

So to get back to the point, perhaps it would make more sense to look at data for a single driver instead. We'll use Gill's data, as she has been ever so meticulous and organised when recording results. Following the same steps, we also get a scatter plot that suggests a quadratic relationship between the average speed and average Mpg.

So now, it is time to describe this relationship and establish whether it is statistically significant using Minitab's general regression tools: Stat > Regression > General Regression.

We selected “Average Mpg” for the response. For the model, we selected “Average Mph,” “Average Mph” squared (“Average Mph”*“Average Mph”), and we also decided to include the garage/petrol station to check if it would play a part. As “Garage” is a categorical factor, we also need to enter it in the Categorical predictors section:

Here is a section of the output given by Minitab:

If we look for P values lower than the significance level of 0.05 in the “Coefficients” section, we see that “Average Mph” and “Average Mph*Average Mph” are both significant (0.001 and 0.007). However, Garage does not seem to have a significant impact on Average Mpg.
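As an aside, it may help to see how a categorical factor such as Garage can sit in the same model as the speed terms. The sketch below builds one row of a design matrix with an intercept, Mph, Mph squared, and 0/1 indicator columns for the garage. The garage names are hypothetical, and the 0/1 coding is an assumption made for illustration; Minitab may code categorical predictors differently.

```python
def design_row(mph, garage, garage_levels):
    """Return [1, mph, mph**2, indicators...] for one fuel stop.

    The first entry of garage_levels is treated as the reference level,
    so it gets no indicator column of its own.
    """
    if garage not in garage_levels:
        raise ValueError(f"unknown garage: {garage!r}")
    non_reference = garage_levels[1:]
    indicators = [1.0 if garage == level else 0.0 for level in non_reference]
    return [1.0, mph, mph * mph] + indicators


# Hypothetical garage names, NOT the ones in our worksheet.
levels = ["Garage A", "Garage B", "Garage C"]

for mph, garage in [(30.0, "Garage A"), (30.0, "Garage B"), (45.0, "Garage C")]:
    print(garage, design_row(mph, garage, levels))
```

With this coding, each indicator coefficient estimates how far that garage's Mpg sits from the reference garage at the same speed, which is the kind of effect the P values for Garage are testing.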

So if we reduce the model by excluding Garage, we get the final output below, which not only confirms our original assumption but also gives us a regression equation that describes the relationship between Average speed and Average Mileage per gallon.

Note: You need to check the residuals before validating any regression analysis, which we did (Stat > Regression > General Regression > Graphs, and check Four in one).
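For readers wondering what those residual plots are built from, a residual is simply the observed value minus the value predicted by the fitted equation. Here is a small sketch, assuming a quadratic model; the coefficients and readings are made up for illustration and are not the ones from our output.

```python
def quadratic_residuals(speeds, mpgs, coefficients):
    """Return observed Mpg minus fitted Mpg for each reading."""
    b0, b1, b2 = coefficients
    return [
        mpg - (b0 + b1 * speed + b2 * speed * speed)
        for speed, mpg in zip(speeds, mpgs)
    ]


# Hypothetical coefficients and readings, for illustration only.
example_coefficients = (10.0, 1.6, -0.02)
example_speeds = [25.0, 35.0, 45.0, 55.0]
example_mpgs = [38.0, 41.0, 42.0, 37.0]

for speed, res in zip(
    example_speeds,
    quadratic_residuals(example_speeds, example_mpgs, example_coefficients),
):
    print(f"{speed:.0f} mph: residual {res:+.2f}")
```

Residuals that scatter randomly around zero, with no obvious curve or funnel shape, are what we hope to see.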

Based on the regression analysis of Gill's data, it would appear that the average mileage per gallon can be increased by aiming for an average speed which is not too low, but not too high either.
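To put a number on "not too low, but not too high", a quadratic equation Mpg = b0 + b1*Mph + b2*Mph^2 with a negative b2 reaches its maximum where its slope is zero, at Mph = -b1 / (2*b2). The sketch below computes that speed. The function name and the example coefficients are hypothetical, not Gill's actual results.

```python
def peak_speed(b1, b2):
    """Speed at which b0 + b1*x + b2*x**2 is highest (requires b2 < 0)."""
    if b2 >= 0:
        raise ValueError("the curve has no maximum unless b2 is negative")
    # Setting the derivative b1 + 2*b2*x to zero gives the turning point.
    return -b1 / (2.0 * b2)


# Hypothetical coefficients, for illustration only.
print(peak_speed(b1=1.6, b2=-0.02))
```

The intercept b0 shifts the whole curve up or down but does not change where the peak falls, which is why it does not appear in the calculation.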

In practical terms this could mean, on one hand, avoiding too much inner-city driving with all the problems that come with it -- i.e., traffic lights, congestion, etc. -- and also avoiding too much high-speed driving on motorways, for instance. As far as the garages/petrol stations are concerned, they do not seem to have as much impact as they would like us to think.
