Under pressure to conserve energy, industrial companies are turning to data analytics to identify their main sources of energy consumption. One way to reduce energy use and save money is to analyze consumption data at each factory station.
Adjusting variables like machine speed, throughput, temperature, or equipment can make a big difference for a manufacturer. The key is to be as efficient as possible while maintaining overall quality, and one of the ways to do that is by using stepwise regression.
Stepwise regression is an appropriate analysis when you have many variables and you’re interested in identifying a useful subset of the predictors. In Minitab, the standard stepwise regression procedure both adds and removes predictors one at a time. Minitab stops when all variables not included in the model have p-values that are greater than a specified Alpha-to-Enter value and when all variables that are in the model have p-values that are less than or equal to a specified Alpha-to-Remove value.
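The add-and-remove loop described above can be sketched outside of Minitab. The Python below is a minimal illustration, not Minitab's implementation: the function names, the 0.15 thresholds, the synthetic data, and the column names are all assumptions made for the example, and the p-values use a normal approximation to the t distribution, which is only reasonable for large samples.

```python
import math
import numpy as np

def p_values(X, y):
    """Two-sided p-values for the OLS slopes (intercept excluded).

    Assumption of this sketch: a normal approximation replaces the
    exact t distribution, so results are only rough for small samples.
    """
    n = len(y)
    A = np.column_stack([np.ones(n), X])           # add an intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares fit
    resid = y - A @ beta
    sigma2 = resid @ resid / (n - A.shape[1])      # residual variance
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
    t = beta / se
    p = np.array([math.erfc(abs(v) / math.sqrt(2)) for v in t])
    return p[1:]

def stepwise(X, y, names, alpha_enter=0.15, alpha_remove=0.15):
    """Add or remove one predictor at a time until nothing changes."""
    selected = []
    for _ in range(4 * len(names)):                # guard against cycling
        changed = False
        # Entry step: add the excluded predictor with the smallest
        # p-value, provided it does not exceed alpha_enter.
        best_j, best_p = None, alpha_enter
        for j in range(len(names)):
            if j in selected:
                continue
            p = p_values(X[:, selected + [j]], y)[-1]
            if p <= best_p:
                best_j, best_p = j, p
        if best_j is not None:
            selected.append(best_j)
            changed = True
        # Removal step: drop the included predictor with the largest
        # p-value if it exceeds alpha_remove.
        if selected:
            p = p_values(X[:, selected], y)
            worst = int(np.argmax(p))
            if p[worst] > alpha_remove:
                selected.pop(worst)
                changed = True
        if not changed:
            break
    return [names[j] for j in selected]

# Synthetic data (not the plant's data): only the first two columns
# actually drive the response.
rng = np.random.default_rng(0)
names = ["run_time", "max_temp", "staff", "sun_pct", "min_temp"]
X = rng.normal(size=(200, 5))
y = 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(size=200)

chosen = stepwise(X, y, names)
print(chosen)
```

With thresholds as loose as 0.15, a pure-noise column can occasionally enter the model by chance, so the printed subset should be read as a candidate list rather than a final answer.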
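In addition to the standard stepwise method, Minitab offers two other types of stepwise procedures: forward selection, which starts with an empty model and adds the most significant predictor at each step, and backward elimination, which starts with all predictors in the model and removes the least significant predictor at each step.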
In this example of using stepwise regression to identify the major sources of energy usage, analysts from the manufacturing plant considered the following predictor variables: total units produced, total equipment run time, staff size, mean outside temperature, minimum outside temperature, maximum outside temperature, percentage of sun, and mean equipment age. This example uses only eight predictors, but stepwise regression becomes especially helpful when you have more than 100 predictor variables.
Their goal was to narrow these variables down to a list of the top predictors of energy usage. To get a final model, the analysts chose Stat > Regression > Regression > Fit Regression Model in Minitab Statistical Software and completed the dialog box by entering the response ‘Energy’ and the above list of predictors in the Continuous predictors field, as shown in the screenshot below.
Click Stepwise in the dialog box and complete the sub-dialog box as shown below.
The result was the following model, which included total equipment run time, maximum outside temperature, and mean equipment age as predictors. Minitab left the other variables out of the model because their p-values were greater than the Alpha-to-Enter value.
To access the residual charts, press Ctrl+E to recall the last dialog box you filled in, click Graphs, and in the sub-dialog box, tick Pareto and, under Residual Charts, select Four in One, as shown below.
The regression equation below indicates that energy usage increases as total equipment run time, maximum outside temperature, and mean equipment age increase:
Total equipment run time has the largest impact, judging by the absolute values of the t-statistics. Maximum outside temperature is second, followed by mean equipment age.
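To make the ranking step concrete, the following sketch fits an ordinary least-squares model and sorts the predictors by the absolute value of their t-statistics. It is an illustration only: the function name, the column names, and the synthetic data are assumptions for the example and do not reproduce the plant's data or Minitab's output.

```python
import numpy as np

def rank_by_t(X, y, names):
    """Return (name, t-statistic) pairs sorted by descending |t|."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])           # add an intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares fit
    resid = y - A @ beta
    sigma2 = resid @ resid / (n - A.shape[1])      # residual variance
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
    t = (beta / se)[1:]                            # drop the intercept
    order = np.argsort(-np.abs(t))                 # largest |t| first
    return [(names[i], float(t[i])) for i in order]

# Synthetic data (hypothetical): effects deliberately differ in strength.
rng = np.random.default_rng(1)
names = ["run_time", "max_temp", "equip_age"]
X = rng.normal(size=(300, 3))
y = 5 * X[:, 0] + 2 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(size=300)

ranking = rank_by_t(X, y, names)
for name, t in ranking:
    print(f"{name}: t = {t:.1f}")
```

A t-statistic is the coefficient divided by its standard error, so this ranking reflects how strong the evidence is for each effect rather than the size of the effect in physical units.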
With this analysis, the analysts were able to conclude that energy usage is significantly higher on hot days because of extensive air conditioner usage, and that newer equipment appears to reduce energy usage. The plant might want to limit running equipment during peak times when air conditioning use is consistently high, and consider purchasing new equipment before the summer season.
While a lot can be learned with stepwise regression, there are some potential pitfalls to be aware of:
If you'd like to work with this data set yourself, download the data on Scribd.
Are you hoping to achieve better energy efficiency throughout your organization? Watch our on-demand webinar to explore how optimized processes can increase efficiency, improve equipment and material usage, and lower costs.