Data Analysis Software | Minitab
Minitab: Data Analysis Software
http://blog.minitab.com/blog/data-analysis-software/rss
Wed, 30 Jul 2014 00:54:20 +0000
FeedCreator 1.7.3
Two-Way ANOVA in Minitab 17
http://blog.minitab.com/blog/marilyn-wheatleys-blog/two-way-anova-in-minitab-17
<p><span style="line-height: 1.6;">After upgrading to the latest and greatest version of our statistical software, Minitab 17, some users have contacted tech support to ask "Wait a minute, where is that Two-Way ANOVA option in Minitab 17?" </span></p>
<p><span style="line-height: 1.6;">The answer is that it’s not there. That’s right! The 2-Way ANOVA option that was available in Minitab 16 and prior versions was removed from Minitab 17.</span> Why would this feature be removed from the new version? Shouldn’t the new version have more features instead of fewer? </p>
<p>Two-Way ANOVA was removed from Minitab 17 because you can get the same output by using the <a href="http://support.minitab.com/en-us/minitab/17/topic-library/modeling-statistics/anova/basics/what-is-a-general-linear-model/">General Linear Model</a> option in the <a href="http://blog.minitab.com/blog/understanding-statistics/you-dont-need-a-weatherman-using-anova-graphs-and-regression-to-fact-check-the-forecasts">ANOVA</a> menu. Removing the separate Two-Way ANOVA menu choice reduces redundancy and makes the workflow more consistent across the linear model options.</p>
<p>Let's look at an example that shows how to replicate the Two-Way ANOVA output from Minitab 16 using Minitab 17.</p>
<p>The data shown below is a sample dataset used for 2-Way ANOVA in Minitab 16: <em>You as a biologist are studying how zooplankton live in two lakes. You set up twelve tanks in your laboratory, six each with water from one of the two lakes. You add one of three nutrient supplements to each tank and after 30 days you count the zooplankton in a unit volume of water. You use two-way ANOVA to test whether there is significant evidence of interactions and main effects.</em></p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/f6d0da32-ba1d-41d4-ace1-af34dcb51351/Image/391474f6a45afe676d1d403b477a9547/1.png" style="width: 242px; height: 309px; border-width: 1px; border-style: solid;" /></p>
<p>The Two-Way ANOVA option in Minitab 16 yields the following output:</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/f6d0da32-ba1d-41d4-ace1-af34dcb51351/Image/f9a2412b309c5e35fb7e446ef68c74bd/2.png" style="border-width: 1px; border-style: solid; width: 399px; height: 164px;" /></p>
<p>To replicate the Two-Way ANOVA output from Minitab 16 using Minitab 17, use <strong>Stat</strong> > <strong>ANOVA</strong> > <strong>General Linear Model</strong> > <strong>Fit General Linear Model</strong>:</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/f6d0da32-ba1d-41d4-ace1-af34dcb51351/Image/9db638c9ed6424030c6fb1e1c06f115c/3.png" style="border-width: 1px; border-style: solid; width: 471px; height: 274px;" /></p>
<p><span style="line-height: 1.6;">Using GLM, we can enter our response column (Zooplankton) in the </span><strong style="line-height: 1.6;">Responses</strong><span style="line-height: 1.6;"> field and our two factors in the </span><strong style="line-height: 1.6;">Factors</strong><span style="line-height: 1.6;"> field without the need to specify one factor as the row and one as the column factor:</span></p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/f6d0da32-ba1d-41d4-ace1-af34dcb51351/Image/cd2fd2232e010dc13cc154e948bb8cfa/4.png" style="border-width: 1px; border-style: solid; width: 580px; height: 439px;" /></p>
<p>Minitab 16's Two-Way ANOVA option also shows the two-factor interaction, so in Minitab 17 we need to manually add the interaction by clicking the <strong>Model</strong> button in the GLM dialog box. There we can highlight the factors listed on the left side (step 1 below); when we do that, the <strong>Add</strong> button on the right will become available. To add the interaction, click <strong>Add</strong> (step 2) and the interaction will be shown at the bottom under <strong>Terms in the model</strong> (step 3).</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/f6d0da32-ba1d-41d4-ace1-af34dcb51351/Image/596dd69d168cfcac30e0aa77f683ed4f/capture.PNG" style="border-width: 1px; border-style: solid; width: 522px; height: 544px;" /></p>
<p>Click <strong>OK </strong>in the Model dialog box to return to the main GLM dialog.</p>
<p>By default, Minitab 17 will provide more detailed output than Two-Way ANOVA in Minitab 16. To make the results match, we can remove the additional output by clicking the <strong>Results</strong> button within the GLM dialog box. Unchecking the additional options so that only <strong>Analysis of variance</strong> and <strong>Model summary</strong> are selected (as shown below) will make the output match Minitab 16’s Two-Way ANOVA results.</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/f6d0da32-ba1d-41d4-ace1-af34dcb51351/Image/06cbefa1d2c4c21fc8323308a74e7467/5.png" style="border-width: 1px; border-style: solid; width: 487px; height: 422px;" /></p>
<p>The results from General Linear Model in Minitab 17 now match the output from Two-Way ANOVA in Minitab 16:</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/f6d0da32-ba1d-41d4-ace1-af34dcb51351/Image/49008c4124ecd0fabb99f70a3afb663e/6.png" style="border-width: 1px; border-style: solid; width: 472px; height: 258px;" /></p>
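<p>For readers who want to see what sits behind an ANOVA table of this kind, here is a minimal Python sketch of a balanced two-way ANOVA with an interaction term. It is not Minitab's implementation, and the tank counts are hypothetical placeholders (the zooplankton values in the worksheet image are not reproduced here). The function name <code>two_way_anova</code> and the factor labels are assumptions made for illustration, and p-values are left out because the Python standard library has no F distribution.</p>

```python
from itertools import product
from statistics import mean


def two_way_anova(cells):
    """Balanced two-way ANOVA with interaction (illustrative sketch).

    cells maps (level_a, level_b) -> list of replicate observations.
    Returns {source: {"df", "ss", "ms", "f"}}; p-values are omitted.
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    keys = list(product(levels_a, levels_b))
    n = len(cells[keys[0]])  # replicates per cell
    if n < 2 or any(len(cells[k]) != n for k in keys):
        raise ValueError("sketch needs a balanced design with replicates")

    grand = mean(y for k in keys for y in cells[k])
    mean_a = {a: mean(y for b in levels_b for y in cells[(a, b)]) for a in levels_a}
    mean_b = {b: mean(y for a in levels_a for y in cells[(a, b)]) for b in levels_b}
    cell_mean = {k: mean(cells[k]) for k in keys}

    # Sums of squares for the two main effects, the interaction, and error.
    ss = {
        "A": n * len(levels_b) * sum((mean_a[a] - grand) ** 2 for a in levels_a),
        "B": n * len(levels_a) * sum((mean_b[b] - grand) ** 2 for b in levels_b),
        "A*B": n * sum(
            (cell_mean[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2 for a, b in keys
        ),
        "Error": sum((y - cell_mean[k]) ** 2 for k in keys for y in cells[k]),
    }
    df = {
        "A": len(levels_a) - 1,
        "B": len(levels_b) - 1,
        "A*B": (len(levels_a) - 1) * (len(levels_b) - 1),
        "Error": len(keys) * (n - 1),
    }
    ms_error = ss["Error"] / df["Error"]
    table = {}
    for source in ("A", "B", "A*B", "Error"):
        ms = ss[source] / df[source]
        f_stat = None if source == "Error" else ms / ms_error
        table[source] = {"df": df[source], "ss": ss[source], "ms": ms, "f": f_stat}
    table["Total"] = {"df": sum(df.values()), "ss": sum(ss.values()), "ms": None, "f": None}
    return table


# Hypothetical counts: 2 lakes x 3 supplements x 2 tanks per combination.
tanks = {
    ("Lake A", "Supplement 1"): [10, 12],
    ("Lake A", "Supplement 2"): [14, 16],
    ("Lake A", "Supplement 3"): [18, 20],
    ("Lake B", "Supplement 1"): [20, 22],
    ("Lake B", "Supplement 2"): [20, 22],
    ("Lake B", "Supplement 3"): [20, 22],
}
table = two_way_anova(tanks)
for source, row in table.items():
    print(source, row)
```

<p>In this sketch "A" is the lake factor and "B" is the supplement factor. As in the GLM dialog, neither factor has to be designated as the row or column factor; the interaction is simply one more term in the model.</p>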
<p>If you're wondering how to do something with Minitab, our <a href="http://www.minitab.com/support/">technical support team</a> is always ready to help you. Our technical support representatives are knowledgeable in statistics, quality improvement, and computer systems. Best of all, our assistance is free.<br />
</p>
<p> </p>
Data Analysis, Statistics, Stats
Wed, 16 Jul 2014 12:00:00 +0000
http://blog.minitab.com/blog/marilyn-wheatleys-blog/two-way-anova-in-minitab-17
Marilyn Wheatley
Guest Post: Did Ma's Diabetes Get Cured by Back Surgery?
http://blog.minitab.com/blog/voice-of-the-customer/guest-post%3a-did-mas-diabetes-get-cured-by-back-surgery
<p><strong><em>The Minitab Fan section of the Minitab blog is your chance to share with our readers! We always love to hear how you are using Minitab products for quality improvement projects, Lean Six Sigma initiatives, research and data analysis, and more. If our software has helped you, please <a href="http://blog.minitab.com/blog/landing-pages/share-your-story-about-minitab/n"> share your Minitab story</a>, too!</em></strong></p>
<p>Once my Mom was diagnosed with Diabetes Type II, I began to track her blood sugar readings in Minitab Statistical Software.</p>
<p>I did it three times a day before meals...over weeks, then months, then years. At each doctor's appointment I would take in her 'book' of readings, and I would take my charts, too.</p>
<p>The <a href="http://blog.minitab.com/blog/real-world-quality-improvement/three-ways-individual-value-plots-can-help-you-analyze-data">individual value plot</a> chart was very telling. Her blood sugars increased with each meal during the day. The doctor changed her insulin based on the undeniable visual trends.</p>
<p><img src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/84d328f1-1b81-41d1-aad4-fc00026edd38/Image/570a16b79c624b3e794fefea622d6570_w480.jpeg" /></p>
<p>Then the biggest surprise came. In June 2013, over a year after blood sugar tracking began, she decided to get back surgery to alleviate leg pain. The day after her back surgery, still in the hospital, her blood sugar dropped approx. 75 points. After a few months of this obvious transition and new trend, the doctor removed her from her diabetes medicine and insulin.</p>
<p><img src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/84d328f1-1b81-41d1-aad4-fc00026edd38/Image/40523346df4516a0c168896c5255e06a_w480.jpeg" /></p>
<p>Minitab charts help guide my Mom's health, even in her 80s. She is 86 today, and is still off insulin and diabetes medicine. And many Minitab <a href="http://www.minitab.com/products/minitab">Statistical Software</a> charts are a part of my Mom's health history, kept in her doctor's office records.<br />
<br />
Paul Kelly<br />
Black Belt<br />
Air Products<br />
Trexlertown, Pa.</p>
<p> </p>
Data Analysis, Health Care Quality Improvement, Six Sigma, Stats
Mon, 14 Jul 2014 12:00:00 +0000
http://blog.minitab.com/blog/voice-of-the-customer/guest-post%3a-did-mas-diabetes-get-cured-by-back-surgery
Minitab Fan
The 6 coolest tools on Minitab's toolbars
http://blog.minitab.com/blog/statistics-and-quality-improvement/the-6-coolest-th-on-minitabs-toolbars
<p>Toolbars are there to make your life easier, but if you don’t take the time to hover over each button and wait for a description, it’s pretty easy to never know that there’s a faster way to do something.</p>
<p>The toolbars in Minitab Statistical Software include some pretty nifty shortcuts. Here are my six favorites:</p>
<ol>
<li>
<p><img alt="StatGuide Button" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/22791f44-517c-42aa-9f28-864c95cb4e27/Image/8944fa16dcbba19e329e595fb9388298/statguide_button.png" style="width: 23px; height: 27px;" /> StatGuide</p>
</li>
</ol>
<p>As soon as you have results in Minitab, the <a href="http://blog.minitab.com/blog/understanding-statistics/hidden-helpers-in-minitab-statistical-software">StatGuide</a> button becomes active on your toolbar. Click the button, and the StatGuide opens directly to guidance for the analysis that you’re looking at. Minitab saves you the time you would have spent looking for information about your results so that you have more time to get things done.</p>
<ol>
<li value="2">
<p><img alt="Edit Last Dialog" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/22791f44-517c-42aa-9f28-864c95cb4e27/Image/b742205b500fb457c0c67551860916fd/edit_last_dialog_button.png" style="width: 23px; height: 21px;" /> Edit Last Dialog</p>
</li>
</ol>
<p>To repeat an analysis, either because you want to run it on a different column or because you want to change a setting, all you have to do is click a button. Even better, most of Minitab’s analyses will remember what you entered the last time the dialog box was open. Make the small adjustments you need to make, and you’re ready to perform your new analysis.</p>
<ol>
<li value="3">
<p><img alt="Show Session Folder" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/22791f44-517c-42aa-9f28-864c95cb4e27/Image/92dec208bff467280065fc63c36f426c/project_manager_button.png" style="width: 21px; height: 23px;" /> Show Session Folder</p>
</li>
</ol>
<p>When you’ve run several analyses in Minitab, it can be nice to have a quick way to find the results of a particular analysis. Minitab’s project manager is the best way to find the results of an analysis quickly, and that’s why it’s so nice that it’s accessible from the toolbar. Click the button, and you get a list of all of the analyses and graphs in your Minitab project.</p>
<ol>
<li value="4">
<p><img alt="Current Data Window" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/22791f44-517c-42aa-9f28-864c95cb4e27/Image/872fea599c8786f805e4990cbab7a8eb/cycle_worksheets_button.png" style="width: 21px; height: 20px;" /> Current Data Window</p>
</li>
</ol>
<p>If you have a lot of worksheets open, you might want to be able to see both your worksheet and your results at the same time. When you click the button, the current worksheet comes to the front, without maximizing to hide your results. Click it again, and the next worksheet comes to the front. Click it again, and the next worksheet comes to the front. You can quickly cycle through the worksheets to find the one that you want, while still being able to see the results from your analysis.</p>
<ol>
<li value="5">
<p><img alt="Assign Formula To Column" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/22791f44-517c-42aa-9f28-864c95cb4e27/Image/1067f139e7036a2f3393897fbc76f1b3/add_formula_to_column_button.png" style="width: 23px; height: 22px;" /> Assign Formula To Column</p>
</li>
</ol>
<p>The <a href="http://blog.minitab.com/blog/real-world-quality-improvement/two-tip-tuesday-getting-the-most-out-of-your-text-data-in-minitab">Minitab calculator</a>’s a nice tool, but with the toolbar, you can use it even faster. The best part of all is that when you use the toolbar, you specify which column will have the formula without having to tell the calculator. Plus, when you’re in a complicated series of formulas, the column where the formula goes is not in the list of columns to select, so you can never get a recursive formula error.</p>
<p><img alt="No field to indicate where to store the formula." src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/22791f44-517c-42aa-9f28-864c95cb4e27/Image/f340a34e53923ed30389555c88e1aefb/without_column_to_store_in.png" style="float: left; width: 261px; height: 200px;" /><img alt="With a field where you can select where to store the formula." src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/22791f44-517c-42aa-9f28-864c95cb4e27/Image/144c0597ac4e208158470d7bc6f100a8/with_column_to_store_in.png" style="width: 282px; height: 200px;" /></p>
<ol>
<li value="6">
<p><img alt="Show Info" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/22791f44-517c-42aa-9f28-864c95cb4e27/Image/fcc5c9baee2de507effaab606f07b1ed/info_button.png" style="width: 20px; height: 22px;" /> Show Info</p>
</li>
</ol>
<p>Especially after you open or copy data from Excel, it can be helpful to get a quick snapshot of the columns in your worksheet. When you click the Info button, the Project Manager shows the column names, the lengths, the number of missing values, and the format of the columns. You can investigate why a column that contains numeric data is formatted as text. If any of the columns are the wrong length because of missing values at the end, you know right where to look. You won't have to spend your time scrolling around the worksheet looking for things that are amiss, so that you can get to your analysis faster.</p>
<p><img alt="The Project Manager shows the Id, length, number of missing values, and type for each column." src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/22791f44-517c-42aa-9f28-864c95cb4e27/Image/1ed15fa7eafa8e0d393d68f156380ae2/info_window.png" style="width: 410px; height: 172px;" /></p>
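<p>The kind of per-column snapshot described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Minitab builds its Info view: the <code>column_info</code> function, the dictionary-of-lists worksheet, and the use of <code>None</code> for a missing cell are all assumptions made for the example.</p>

```python
def column_info(worksheet):
    """Summarize each column of a hypothetical worksheet.

    worksheet maps a column name to a list of cell values, where None
    marks a missing cell. Returns one summary dict per column.
    """
    info = []
    for name, cells in worksheet.items():
        present = [c for c in cells if c is not None]
        # A column counts as numeric only if every non-missing cell is a number.
        is_numeric = all(isinstance(c, (int, float)) for c in present)
        info.append({
            "name": name,
            "count": len(cells),
            "missing": len(cells) - len(present),
            "type": "numeric" if is_numeric else "text",
        })
    return info


# Hypothetical data: "Batch" holds numbers that arrived as text.
worksheet = {
    "Strength": [10.2, None, 11.0],
    "Batch": ["1", "2", "3"],
}
summary = column_info(worksheet)
for row in summary:
    print(row)
```

<p>A summary like this makes the two problems mentioned above easy to spot: a numeric column that arrived as text, and a column whose length or missing count differs from its neighbors.</p>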
Faster than you used to be
<p>The only thing better than doing fearless data analysis is doing fearless data analysis even faster. Minitab’s toolbars come ready with shortcuts that help you analyze your data faster, from generating your results to interpreting them. Of course, the toolbars that everyone uses can’t be perfect for everyone. If you’re feeling emboldened, check out how to <a href="http://support.minitab.com/en-us/minitab/17/topic-library/minitab-environment/interface/customize-the-minitab-interface/customize-menus-toolbars-and-shortcut-keys/">customize the existing toolbars</a> or even to <a href="http://support.minitab.com/en-us/minitab/17/getting-started/customizing-minitab/">create your own toolbars</a>!</p>
Data Analysis, Statistics, Statistics Help
Wed, 11 Jun 2014 16:17:11 +0000
http://blog.minitab.com/blog/statistics-and-quality-improvement/the-6-coolest-th-on-minitabs-toolbars
Cody Steele
Guest Post: Analysis of Road Accidents in Hyderabad
http://blog.minitab.com/blog/voice-of-the-customer/guest-post%3a-analysis-of-road-accidents-in-hyderabad
<p><strong><em>The Minitab Fan section of the Minitab blog is your chance to share with our readers! We always love to hear how you are using Minitab products for quality improvement projects, Lean Six Sigma initiatives, research and data analysis, and more. If our software has helped you, please <a href="http://blog.minitab.com/blog/landing-pages/share-your-story-about-minitab/n"> share your Minitab story</a>, too!</em></strong></p>
An Analysis of Road Accidents in Hyderabad, India
<p>The data taken for this study is obtained from the official website of Hyderabad Traffic Police (<a href="http://www.htp.gov.in/Default.htm" rel="nofollow">http://www.htp.gov.in/Default.htm</a>). Also note that the data for 2014 covers only the period until April.</p>
<p>Reviewing the time series plot I obtained using <a href="http://www.minitab.com/products/minitab">Minitab 17</a> indicates that the number of accidents steadily decreased every year from 2011-2013, but there seems to be a rise from January-April 2014.</p>
<p><img src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/84d328f1-1b81-41d1-aad4-fc00026edd38/Image/ad29eb472012769f8ebb3a1c61a7623a_w480.jpeg" /></p>
<img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/b239c867c06151e8bfc2d65f22614195/crash.jpg" style="border-width: 1px; border-style: solid; margin: 10px 15px; float: right; width: 250px; height: 250px;" />
<p>As I was brought up in the city of Hyderabad, my experience has been that the following factors influence road accidents here:</p>
<ul>
<li>Increasing vehicle population leading to heavy traffic during peak hours</li>
<li>Drunken driving</li>
<li>Speed limit violation</li>
<li>Lack of properly laid roads</li>
<li>Violation of traffic and safety rules</li>
<li>Roads getting water logged during rainy season</li>
<li>Using cell phone while driving</li>
<li>Not wearing seat belts</li>
<li>Unwanted hurrying/negligence of the driver</li>
<li>Inattention while backing the vehicle</li>
<li>Not getting a clear picture of surroundings (lack of signage)</li>
<li>Using high beam light</li>
<li>Driving without a helmet</li>
<li>Speed driving on the flyovers and the Outer Ring Road</li>
<li>Tripping of heavy load vehicles in the city during the day time</li>
</ul>
<p>Following is a time series plot of the 852 accidents that took place from January-April 2014 according to the days of the week. This graph clearly indicates that the number of accidents occurring over the weekends is high.</p>
<p><img src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/84d328f1-1b81-41d1-aad4-fc00026edd38/Image/f3e309f8cf3ca640ff4bd561d619fe5e_w480.jpeg" /></p>
<p>The increase in the number of accidents over the weekend is a serious concern which requires attention since these accidents may be preventable by awareness campaigns targeted to the youth of the city.</p>
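<p>A tally by day of the week like the one plotted above can be sketched in Python as follows. The dates here are hypothetical placeholders, not the Hyderabad Traffic Police records, and the function name <code>accidents_by_weekday</code> is an assumption made for illustration.</p>

```python
from collections import Counter
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def accidents_by_weekday(accident_dates):
    """Count accident records per day of the week, Monday through Sunday."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    counts = Counter(DAY_NAMES[d.weekday()] for d in accident_dates)
    return {day: counts.get(day, 0) for day in DAY_NAMES}


# Hypothetical accident dates, one entry per recorded accident.
sample_dates = [
    date(2014, 1, 4),
    date(2014, 1, 5),
    date(2014, 1, 5),
    date(2014, 1, 6),
]
weekday_counts = accidents_by_weekday(sample_dates)
print(weekday_counts)
```

<p>Keeping every day in the result, including days with zero accidents, keeps the plot's horizontal axis complete.</p>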
Conclusion
<p>Based on the results of the above analysis, preventive actions that I believe could be taken by the concerned authorities are:</p>
<ol>
<li>Make citizens aware of the importance of strictly adhering to the traffic rules, and impose fines on those who do not abide by them.</li>
<li>Issue driving licenses only as per age limits, and only after the person clears all the tests.</li>
<li>Inspect vehicles to make sure they are road-worthy.</li>
<li>Increase the number of traffic police in areas of heavy traffic.</li>
<li>Make sure the timers installed at traffic signals function properly.</li>
<li>Analyze the major accident-prone areas scientifically to reduce the rate of occurrence.</li>
<li>Check medians, footpaths, and curvatures carefully.</li>
<li>Use paint to clearly mark humps on the roads.</li>
<li>Remove attention-seeking boards, banners, and advertisements.</li>
</ol>
<p><br />
<strong>Dhatry Yaso Kala</strong><br />
Independent Consultant and Lean Six Sigma Black Belt<br />
Hyderabad, India<br />
</p>
Fri, 13 Jun 2014 12:00:00 +0000
http://blog.minitab.com/blog/voice-of-the-customer/guest-post%3a-analysis-of-road-accidents-in-hyderabad
Minitab Fan
The Five Coolest Things You Can Do When You Right-click a Graph in Minitab Statistical Software
http://blog.minitab.com/blog/statistics-and-quality-improvement/the-five-coolest-things-you-can-do-when-you-right-click-a-graph-in-minitab-statistical-software
<p>Minitab graphs are powerful tools for investigating your process further and removing any doubt about the steps you should take to improve it. With that in mind, you’ll want to know every feature about Minitab graphs that can help you share and communicate your results effectively. While many ways to modify your graph are on the <strong>Editor</strong> menu, some of the best features become available when you right-click your graph.</p>
<p>Here are the five coolest things you can do when you right-click a graph in Minitab Statistical Software.</p>
Send graph to...
<p>Once your graph is ready for your report or presentation, you’ll want to put the graph in your document. Minitab makes this easy because you can right-click your graph and select either <strong>Send Graph to Microsoft Word</strong> or <strong>Send Graph to Microsoft PowerPoint</strong>. With that, you’re all set to go.</p>
<p> <img alt="The right-click menu, with &quot;Send Graph to Microsoft Word&quot; highlighted." src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/22791f44-517c-42aa-9f28-864c95cb4e27/Image/5c2d6174d12ee1a9bfd306566bc0f5f7/context_menu.png" style="width: 170px; height: 341px;" /> </p>
<p>When you use the Minitab menu to transfer your graph to a presentation document, Minitab automatically selects the format that provides the clearest graph. In the case of PowerPoint, Minitab also makes sure that the graph is automatically fit to fill the receiving slide.</p>
StatGuide™
<p>Getting your graph into a report is an important step, but you also want to be ready to explain your results. That’s where Minitab’s <a href="http://blog.minitab.com/blog/understanding-statistics/five-ways-to-get-help-with-statistics">StatGuide</a>™ comes into play. Right-click your graph, and the last menu item is always going to be <strong>StatGuide</strong>. Select <strong>StatGuide</strong> and you’ll be taken directly to a page about the graph that you’re examining. Minitab saves you the time you would have spent looking for information about the output so that you have more time to get things done.</p>
<p><img alt="The residuals versus order plot has a pattern in it." src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/22791f44-517c-42aa-9f28-864c95cb4e27/Image/c285b252141dee3ab6b83b869e227850/residuals_vs_order_for_yield.jpg" style="width: 297px; height: 198px;" /><img alt="StatGuide contains information to help you interpret and explain your graph." src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/22791f44-517c-42aa-9f28-864c95cb4e27/Image/84b2dbfcab19b03b56a09dddfccdad6c/statguide.png" style="width: 226px; height: 358px;" /></p>
Copy Text
<p>Graphs are excellent tools for exploring and communicating, but that doesn’t mean that you never want to see the exact numbers. Getting the numbers from a graph is as easy as selecting an individual component and choosing <strong>Copy Text</strong>.</p>
<p>For example, you have a boxplot and would like to see the exact statistics for the graph. The tooltip for the boxplot includes the mean, quartiles, minimum, maximum, interquartile range, and sample size. Select the box with a right-click and <strong>Copy Text</strong> is active in the context menu. You can even paste a text box directly onto the graph with the information from the tooltip!</p>
<p><img alt="The tooltip shows the statistics Minitab uses to draw the boxplot." src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/22791f44-517c-42aa-9f28-864c95cb4e27/Image/07b066ff8c50b5e6c312521963b9c488/boxplot_tooltip_w640.png" style="width: 384px; height: 243px;" /><img alt="The statistics from the tooltip are pasted directly on the graph." height="243" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/22791f44-517c-42aa-9f28-864c95cb4e27/Image/231d2fb919e6facd416289bc46222d5c/boxplot_of_strength.jpg" width="366" /></p>
Switch to worksheet
<p>If you have a lot of graphs open, your Minitab window can sometimes get a little full. On those occasions, the right-click menu makes it easy for you to compare what you see on your graph with what’s in your data. For example, say that you’re looking at a residuals vs. fits plot and you brush the rows with the largest fits. Right-click the graph and choose <strong>Switch to</strong>, and you can quickly match up the brushed rows with the data on your graph. Here, you can easily see that while the catalyst changes between rows 4 and 8, the settings for time and temperature are the same.</p>
<p><img alt="The brushed points are the two largest fits." src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/22791f44-517c-42aa-9f28-864c95cb4e27/Image/7ae0ef21bfec4c03b08d64dd9218cdfe/burshed_graph_w640.png" style="width: 500px; height: 334px;" /><img alt="The black dots indicate the brushed points in the worksheet." height="249" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/22791f44-517c-42aa-9f28-864c95cb4e27/Image/1d64afb4426b880c264d573da24ba2d5/worksheet.png" width="569" /></p>
Go to Session Line
<p>Sometimes you have to produce a lot of output in Minitab to understand your data. When you get a mix of lots of statistics and graphs, you’ll want to be able to easily find the Coefficients table for a particular residual plot or the p-value for a t-test shown on a specific individual value plot. Right-clicking a graph can save you again. When you right-click and select <strong>Go to Session Line</strong>, you’re taken to the portion of the session window where the graph was made. Any tables or statistics that Minitab produced at the same time as the graph are right above that point in the session window!</p>
Ready to go
<p>Graphs are an important tool for making sure that everyone understands the results of your data analysis. When you right-click a graph in Minitab, you’ll find a number of tools that make it easier to share and understand your results. The right-click menu is one more step on your path to <a href="http://www.minitab.com/en-us/products/minitab/features/?WT.ac=EN_WIL">fearless data analysis</a>.</p>
Data Analysis, Statistics, Statistics Help, Stats
Wed, 28 May 2014 15:48:00 +0000
http://blog.minitab.com/blog/statistics-and-quality-improvement/the-five-coolest-things-you-can-do-when-you-right-click-a-graph-in-minitab-statistical-software
Cody Steele
Can I Just Delete Some Values to Reduce the Standard Variation in My ANOVA?
http://blog.minitab.com/blog/understanding-statistics/can-i-just-delete-some-values-to-reduce-the-standard-variation-in-my-anova
<p>We received the following question via social media recently:</p>
<p style="margin-left: 40px;"><em>I am using Minitab 17 for ANOVA. I calculated the mean and standard deviation for these 15 values, but the standard deviation is very high. If I delete some values, I can reduce the standard deviation. Is there an option in Minitab that will automatically indicate values that are out of range and delete them so that the standard deviation is low?</em></p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/730b09b8826f9404e8661473c1e91ec0/throwing_out_data.gif" style="margin: 10px 15px; float: right; width: 177px; height: 177px;" />In other words, this person wanted a way to automatically eliminate certain values to lower the standard deviation.</p>
<p>Fortunately, Minitab 17 does <em>not </em>have the functionality that this person was looking for.</p>
<p>Why is that fortunate? Because cherry-picking data isn’t a statistically sound practice. In fact, if you do it <em>specifically</em> to reduce variability, removing data points can amount to fraud.</p>
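<p>To see why this shortcut is tempting, and why it proves nothing, consider a minimal Python sketch. The 15 values are hypothetical (the reader's data were not shared), and <code>drop_farthest</code> is an illustrative helper, not a Minitab feature. Deleting the points farthest from the mean shrinks the standard deviation in this example without any evidence that those points were bad.</p>

```python
from statistics import mean, stdev


def drop_farthest(values, k):
    """Return the values with the k points farthest from the mean removed.

    The mean is computed once, before anything is dropped.
    """
    center = mean(values)
    by_distance = sorted(values, key=lambda v: abs(v - center))
    return by_distance[: len(values) - k]


# Hypothetical sample of 15 measurements.
values = [48, 52, 50, 47, 53, 49, 51, 50, 46, 54, 50, 49, 51, 62, 38]
trimmed = drop_farthest(values, 2)

print("standard deviation, all 15 values:", stdev(values))
print("standard deviation, 2 values deleted:", stdev(trimmed))
```

<p>The smaller number describes a data set that was never collected. Nothing about the process being measured has changed.</p>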
When <em>Is </em>It OK to Remove Data Points?
<p>So that raises a question: is it <em>ever </em>acceptable to remove data? The answer is yes. If you know, for a fact, that some values in your data were obtained improperly, then it is okay to remove these bad data points. For example, if <a href="http://blog.minitab.com/blog/statistics-and-quality-data-analysis/using-minitab-to-weed-out-bloopers">data entry errors</a> resulted in a few data points from Sample A being entered under Sample B, it would make sense to remove those data points from the analysis of Sample B.</p>
<p>But you may encounter other suggestions for removing data. Some people will use a "trimmed" data set. This means you remove the top and bottom 1-2 samples. Depending upon what the data is, and how you plan to use it, this too can be fraud.</p>
<p>Some people will use the term "Data Cleansing." When they do this, they remove a few data points from a large data set. The effect on the analysis tends to be minimal. But when the removal changes the end results of an analysis, it again can amount to fraud.</p>
<p>The bottom line? If you don't know for certain that the data points are bad, removing them—especially to change the outcome of an analysis—is virtually impossible to defend.</p>
Finding and Handling Outliers in Your Data
<p>Minitab 17 won't automatically delete values to make your standard deviation small. However, our statistical software does make it easy to identify potential outliers that may be skewing your data, so that you can investigate them. You can access the outlier detection tests at <strong>Stat > Basic Statistics > Outlier Test…</strong></p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/bd0c3a768510bd4f6c68907f8155daad/outlier_menu.gif" style="width: 351px; height: 184px;" /></p>
<p>You can also look at specific statistical measures that indicate the presence of <a href="http://support.minitab.com/minitab/17/topic-library/modeling-statistics/regression-and-correlation/model-assumptions/ways-to-identify-outliers/">outliers in regression and ANOVA</a>.</p>
<p>Of course, before removing any data points you need to make sure that the values are really outliers. First, think about whether those values were collected under the same conditions as the other values. Was there a substitute lab technician working on the day that the potential outliers were collected? If so, did this technician do something differently than the other technicians? Or could the digits in a value be reversed? That is, was 48 recorded as 84?</p>
<p>If you have just one factor in an ANOVA, try using <strong>Assistant > Hypothesis Tests > One-Way ANOVA…</strong> Outliers will be flagged in the output automatically:</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/97f93bd1c0b42f2ff85db7cce08bcc90/assistant_outlier_flagged_w640.png" style="width: 640px; height: 94px;" /></p>
<p>You could then run the analysis again after manually removing outliers as appropriate.</p>
<p>You also can use a boxplot chart to identify outliers:</p>
<p><img alt="Finding Outliers in a Boxplot" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/d1235a60856d3bde4bb8638116845f9a/outlier_boxplot_1_.png" style="width: 260px; height: 173px;" /></p>
<p>As you can see above, Minitab's boxplot uses an asterisk (*) symbol to identify outliers, defined as observations that are at least 1.5 times the interquartile range from the edge of the box. You can easily identify the unwanted data point by clicking on the outlier symbols so you can investigate further. After editing the worksheet you can update the boxplot, perhaps finding more outliers to remove.</p>
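<p>For readers who want to see the 1.5 × IQR rule outside of a boxplot, here is a minimal sketch in Python. It is not Minitab's implementation: the function name and the sample values are hypothetical, and quartile conventions differ between packages, so the fences may not match a given boxplot exactly.</p>

```python
from statistics import quantiles


def boxplot_outliers(values, k=1.5):
    """Return values at least k * IQR away from the edges of the box."""
    # quantiles() sorts the data itself; its default 'exclusive' method is
    # one of several quartile conventions (an assumption of this sketch).
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v <= lower_fence or v >= upper_fence]


# Hypothetical measurements; 84 could be a 48 with its digits reversed.
measurements = [48, 50, 51, 52, 53, 54, 55, 84]
outliers = boxplot_outliers(measurements)
print(outliers)
```

<p>A flagged value is only a candidate for investigation; the function deliberately returns the points instead of deleting them.</p>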
Are Your Outliers "Keepers"?
<p>While Minitab won't offer an automated "make my data look acceptable" tool, the software does make it easy to find specific data points that may take the results of your analysis in an inaccurate or unwanted direction.</p>
<p>However, before removing any "bad" data points you should understand their causes and be sure you can avoid recurrence of those causes in the actual process. If the "bad" data could contribute to a more accurate understanding of the actual process, removing them from the calculation will produce wrong results. </p>
Data AnalysisLearningStatisticsStatistics HelpStatsMon, 30 Jun 2014 12:00:00 +0000http://blog.minitab.com/blog/understanding-statistics/can-i-just-delete-some-values-to-reduce-the-standard-variation-in-my-anovaEston MartzCommon Statistical Mistakes You Should Avoid
http://blog.minitab.com/blog/real-world-quality-improvement/common-statistical-mistakes-you-should-avoid
<p>It's all too easy to make mistakes involving statistics. Powerful statistical software can remove a lot of the difficulty surrounding statistical calculation, reducing the risk of mathematical errors—but correctly interpreting the results of an analysis can be even more challenging. </p>
<p>No one knows that better than <a href="http://www.minitab.com/training/trainers/" target="_blank">Minitab's technical trainers</a>. All of our trainers are seasoned statisticians with years of quality improvement experience. They spend most of the year traveling around the country (and around the world) to help people learn to make the best use of Minitab software for analyzing data and improving quality. </p>
<p>A few years ago, Minitab trainers compiled a list of common statistical mistakes—the ones they encountered over and over again. Below are a few of their most commonly observed mistakes that involve drawing an incorrect conclusion from the results of analysis. </p>
Statistical Mistake 1: Misinterpreting Overlapping Confidence Intervals
<p>When comparing multiple means, statistical practitioners are sometimes advised to compare the results from confidence intervals and determine whether the intervals overlap. When 95% confidence intervals for the means of two independent populations don’t overlap, there will indeed be a statistically significant difference between the means (at the 0.05 level of significance). <a href="http://www.minitab.com/en-us/Published-Articles/Some-Misconceptions-about-Confidence-Intervals/" style="font-size: 13px; line-height: 1.6;" target="_blank"><strong>However, the opposite is not necessarily true.</strong></a> CIs may overlap, yet there may be a statistically significant difference between the means.</p>
<p>Take this example:</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ccb8f6d6-3464-4afb-a432-56c623a7b437/Image/5de2bc0040f8e8aab8b5b97e87c55dd9/ci_plot_w640.jpeg" style="width: 440px; height: 293px;" /><br />
<br />
Two 95% confidence intervals that overlap may be significantly different at the 95% confidence level.</p>
<p>What does the t-test P-value tell us? In this case it is less than 0.05 (0.049 < 0.05), which indicates a statistically significant difference between the means, even though the CIs overlap considerably. </p>
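<p>Overlap and significance can disagree because the standard error of a difference is the square root of the sum of the squared standard errors, which is smaller than the sum of the two standard errors. The sketch below illustrates this with made-up means and standard errors (not the data in the plot above), and it uses normal-theory intervals and a z-test as a simplifying assumption in place of the t-test.</p>

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z95 = STD_NORMAL.inv_cdf(0.975)  # about 1.96


def ci95(mean, se):
    """Normal-theory 95% confidence interval for a mean."""
    return (mean - Z95 * se, mean + Z95 * se)


def two_sided_p(mean_a, se_a, mean_b, se_b):
    """Two-sided z-test p-value for the difference of two independent means."""
    se_diff = sqrt(se_a**2 + se_b**2)
    z = abs(mean_a - mean_b) / se_diff
    return 2 * (1 - STD_NORMAL.cdf(z))


# Hypothetical summary statistics for two independent samples.
mean_a, se_a = 10.0, 0.5
mean_b, se_b = 11.5, 0.5

ci_a = ci95(mean_a, se_a)
ci_b = ci95(mean_b, se_b)
intervals_overlap = ci_a[1] >= ci_b[0] and ci_b[1] >= ci_a[0]
p_value = two_sided_p(mean_a, se_a, mean_b, se_b)

print(ci_a, ci_b, intervals_overlap, round(p_value, 4))
```

<p>With these assumed numbers the two intervals overlap, yet the p-value for the difference falls below 0.05.</p>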
Statistical Mistake 2: Making Incorrect Inferences about the Population
<p>With statistics, we can analyze a small sample to make inferences about the entire population. But there are a few situations where you should avoid making inferences about a population that the sample does not represent:</p>
<ul>
<li>In <a href="http://blog.minitab.com/blog/starting-out-with-statistical-software/starting-out-with-capability-analysis" target="_blank"><strong>capability analysis</strong></a>, data from a single day is sometimes inappropriately used to estimate the capability of the entire manufacturing process.</li>
<li>In <a href="http://support.minitab.com/en-us/minitab/17/topic-library/quality-tools/acceptance-sampling/basics/what-is-acceptance-sampling/" target="_blank"><strong>acceptance sampling</strong></a>, samples from one section of the lot are selected for the entire analysis.</li>
<li>A common and severe case occurs in a <a href="http://blog.minitab.com/blog/understanding-statistics/choosing-the-right-distribution-model-for-reliability-data" target="_blank"><strong>reliability analysis</strong></a> when only the units that failed are included in an analysis and the population is all units produced.</li>
</ul>
<p>To avoid these situations, define the population before sampling and take a sample that truly represents the population.</p>
Statistical Mistake 3: Assuming Correlation = Causation
<p>It’s sometimes overused, but “correlation does not imply causation” is a good reminder when you’re dealing with statistics. Correlation between two variables does not mean that one variable causes a change in the other, especially if correlation statistics are the only statistics you are using in your data analysis.</p>
<p>For example, data analysis has shown a strong positive correlation between shirt size and shoe size. As shirt size goes up, so does shoe size. Does this mean that wearing big shirts causes you to wear bigger shoes? Of course not! There could be other “hidden” factors at work here, such as height. (Tall people tend to wear bigger clothes and shoes.)</p>
<p>Take a look at this scatterplot that shows that HIV antibody false negative rates are correlated with patient age:</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ccb8f6d6-3464-4afb-a432-56c623a7b437/Image/985458eb7adf5fd68aa4ede7da69cb29/scatterplot.jpg" style="width: 576px; height: 384px;" /><br />
<br />
Does this show that the HIV antibody test does not work as well on older patients? Well, maybe …</p>
<p>But you can’t stop there and assume that just because patients are older, age is the factor that is causing them to receive a false negative test result (a false negative is when a patient tests negative on the test, but is confirmed to have the disease).</p>
<p><em>Let’s dig a little deeper.</em> Below you see that patient age and days elapsed between at-risk exposure and test are correlated:</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ccb8f6d6-3464-4afb-a432-56c623a7b437/Image/5f93449fe79a55047e0636c48bcf751a/scatterplot_2.jpg" style="width: 576px; height: 384px;" /><br />
<br />
Older patients got tested faster … before the HIV antibodies were able to fully develop and show a positive test result.</p>
<p>Keep the idea that “correlation does not imply causation” in your mind when reading some of the many <a href="http://blog.minitab.com/blog/real-world-quality-improvement/cell-phones-and-cancer-correlation-is-not-causation" target="_blank"><strong>studies publicized in the media</strong></a>. Intentionally or not, the media frequently imply that a study has revealed some cause-and-effect relationship, even when the study's authors detail precisely the limitations of their research.</p>
Statistical Mistake 4: Not Distinguishing Between Statistical Significance and Practical Significance
<p>It's important to remember that using statistics, we can find a statistically significant difference that has no discernible effect in the "real world." In other words, just because a difference <em>exists </em>doesn't make the difference <em>important</em>. And you can waste a lot of time and money trying to "correct" a statistically significant difference that doesn't matter. </p>
<p>Let's say you love Tastee-O's cereal. The factory that makes them weighs every cereal box at the end of the filling line using an automated measuring system. Say that 18,000 boxes are filled per shift, with a target fill weight of 360 grams and a standard deviation of 2.5 grams. </p>
<p>Using statistics, the factory can detect a shift of 0.06 grams in the mean fill weight 90% of the time. But just because that 0.06 gram shift is statistically significant doesn't mean it's practically significant. A 0.06 gram difference probably amounts to two or three Tastee-O’s—not enough to make you, the customer, notice or care. </p>
<p>In most hypothesis tests, we know that the null hypothesis is not <em>exactly</em> true. In this case, we don’t expect the mean fill weight to be precisely 360 grams -- we are just trying to see if there is a <em>meaningful</em> difference. Instead of a hypothesis test, the cereal maker could use a confidence interval to see how large the difference might be and decide if action is needed.</p>
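<p>The 90% figure can be checked with a short power calculation. The post does not say which test the factory uses, so this sketch assumes a two-sided, one-sample z-test at the 0.05 level with a known standard deviation; the function name is hypothetical.</p>

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(shift, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test to detect a mean shift."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Size of the shift measured in standard errors of the sample mean.
    effect = abs(shift) / (sigma / sqrt(n))
    upper_tail = 1 - std_normal.cdf(z_crit - effect)
    lower_tail = std_normal.cdf(-z_crit - effect)
    return upper_tail + lower_tail


# Values from the cereal example: 18,000 boxes, sigma = 2.5 g, shift = 0.06 g.
power = z_test_power(shift=0.06, sigma=2.5, n=18_000)
print(f"{power:.3f}")
```

<p>Under these assumptions the power comes out at roughly 0.90, which is consistent with the claim that a shift this small is detected about 90% of the time, and says nothing about whether the shift matters to customers.</p>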
Statistical Mistake 5: Stating That You've Proved the Null Hypothesis
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ccb8f6d6-3464-4afb-a432-56c623a7b437/Image/b8c8b7bfc58720fb6e45d20b5da06df6/coin_flip.JPG" style="float: right; width: 291px; height: 167px; border-width: 1px; border-style: solid; margin-left: 7px; margin-right: 7px;" />In a hypothesis test, you pose a null hypothesis (H0) and an alternative hypothesis (H1). Then you collect data, analyze it, and use statistics to assess whether or not the data support the alternative hypothesis. A p-value above 0.05 indicates “there is not enough evidence to conclude H1 at the .05 significance/alpha level”.</p>
<p>In other words, even if we do not have enough evidence in favor of the alternative hypothesis, the null hypothesis may or may not be true. </p>
<p>For example, we could flip a fair coin 3 times and test:</p>
<p>H0: Proportion of Heads = 0.40 </p>
<p>H1: Proportion of Heads ≠ 0.40</p>
<p>In this case, we are guaranteed to get a p-value higher than 0.05. Therefore we cannot conclude H1. But not being able to conclude H1 doesn't prove that H0 is correct or true! This is why we say we "fail to reject" the null hypothesis, rather than we "accept" the null hypothesis. </p>
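<p>To see why the p-value can never drop below 0.05 here, you can enumerate all four possible outcomes. This sketch assumes an exact two-sided binomial test that sums the probabilities of every outcome no more likely than the observed one; the post does not specify the test, and other two-sided conventions exist.</p>

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_two_sided_p(k, n, p0):
    """Sum the probabilities of all outcomes no more likely than k under H0."""
    observed = binomial_pmf(k, n, p0)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(
        prob
        for prob in (binomial_pmf(i, n, p0) for i in range(n + 1))
        if prob <= observed * tolerance
    )


# Three flips, H0: proportion of heads = 0.40.
p_values = {heads: exact_two_sided_p(heads, 3, 0.40) for heads in range(4)}
smallest_p = min(p_values.values())

for heads, p in p_values.items():
    print(heads, round(p, 3))
```

<p>Even the most extreme result, three heads out of three, has probability 0.4<sup>3</sup> = 0.064 under H0, so no outcome of this tiny experiment can reject H0 at the 0.05 level.</p>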
Statistical Mistake 6: Not Seeking the Advice of an Expert
<p>One final mistake we’ll cover here is not knowing when to seek the advice of a statistical expert. Sometimes, employees are placed in statistical training programs with the expectation that they will come out immediately as experienced statisticians. While this training is excellent for basic statistical projects, it’s usually not enough to handle more advanced issues that may come about. After all, most skilled statisticians have had 4-8 years of education in statistics and at least 10 years of real-world experience!</p>
<p>If you’re in need of some help, you can hire a Minitab statistician. Learn more about Minitab’s Mentoring service by visiting <a href="http://www.minitab.com/training/" target="_blank">http://www.minitab.com/training/</a>. </p>
<p><em><a href="http://blog.minitab.com/blog/understanding-statistics" target="_blank">Eston Martz</a> and <a href="http://blog.minitab.com/blog/michelle-paret" target="_blank">Michelle Paret</a> contributed to the content of this post.</em></p>
<p><strong>Tell us in the comments below: Have you ever jumped to the wrong conclusion after looking at statistics? </strong></p>
<p> </p>
StatisticsStatistics HelpFri, 23 May 2014 14:20:12 +0000http://blog.minitab.com/blog/real-world-quality-improvement/common-statistical-mistakes-you-should-avoidCarly BarryGage This or Gage That? How the Number of Distinct Categories Relates to the %Study Variation
http://blog.minitab.com/blog/michelle-paret/gage-this-or-gage-that-how-the-number-of-distinct-categories-relates-to-the-study-variation
<p>We cannot improve what we cannot measure. Therefore, it is critical that we conduct a measurement systems analysis (MSA) before we start analyzing our data to make any kind of decisions.</p>
<p>When conducting an MSA for continuous measurements, we typically use a Gage R&R Study. And in these Gage R&R Studies, we look at output such as the <a href="http://blog.minitab.com/blog/quality-data-analysis-and-statistics/how-to-interpret-gage-output-part-2">percentage study variation</a> (%Study Var, or %SV) and the <a href="http://blog.minitab.com/blog/quality-data-analysis-and-statistics/understanding-your-gage-randr-output">Number of Distinct Categories</a> (ndc) to assess whether our measurement system is adequate.</p>
<p>Looking at these 2 values to assess a measurement system often leads to questions like "Should I look at both values? Will both values simultaneously indicate if my measurement system is poor? Are these 2 values related?" </p>
<p>The answer to all of these questions is "Yes," and here's why.</p>
How Are NDC and %Study Var Related?
<p>To clearly understand how number of distinct categories and percentage study variation are related, first consider how they are mathematically defined:</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/d840d539abbf0f0cc70c3cb03c823cb1/equation1.jpg" style="width: 401px; float: left; height: 72px; margin-left: 50px; margin-right: 50px" /></p>
<p> </p>
<p> </p>
<p><br />
<span face="">where sigma represents the square root of the variance components. </span></p>
<p><span face="">Using substitution, we can express the relationship between ndc and %SV as:</span></p>
<p><span face=""><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/b8624dccb97d74650d8f3389eef2db64/equation2.jpg" style="width: 350px; float: left; height: 152px; margin-left: 50px; margin-right: 50px" /></span></p>
<p> </p>
<p> </p>
<p> </p>
<p> </p>
<p> </p>
<p><span face="">The last equation shows that ndc and %SV are inversely proportional: the larger %SV is, the smaller the ndc is, and vice-versa. However, it also suggests that the value of ndc depends not only on %SV, but on the variance components as well.</span></p>
NDC as a Function of %SV
<p>To simplify the equation and represent ndc solely as a function of %SV, we can express the variance components in another way. The total variance is the sum of two variance components, one corresponding to gage repeatability and reproducibility and the other to part-to-part variation:</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/8cdb02ebc3a57a05010fe627dfe8bb45/equation3.jpg" style="width: 222px; float: left; height: 36px; margin-left: 50px; margin-right: 50px" /></p>
<p> </p>
<p> </p>
<p>Solving for sigma-squared for part and dividing each side of the equation by sigma-squared for total yields:</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/cfe7fe6042e0c688436844d14f9c9460/equation4.jpg" style="width: 193px; float: left; height: 73px; margin-left: 50px; margin-right: 50px" /></p>
<p> </p>
<p> </p>
<p> </p>
<p><span face="">Because %SV / 100 = sigma gage / sigma total, the equation above can be rewritten as:</span></p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/89d53946a51b100b7a4573c9677b3cf7/equation6.jpg" style="width: 350px; float: left; height: 82px; margin-left: 50px; margin-right: 50px" /></p>
<p> </p>
<p> </p>
<p> </p>
<p>Substituting this value into the previous equation for ndc gives the following simplified formula:</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/00cecc8f1fff70ad94e17d0c785253b9/equation7.jpg" style="width: 330px; float: left; height: 144px; margin-left: 50px; margin-right: 50px" /></p>
<p> </p>
<p> </p>
<p> </p>
<p> </p>
<p> </p>
<p>This equation clearly shows the relationship between ndc and %SV and can be used to calculate the number of distinct categories for a given percentage study variation. As shown in Table 1, the calculated ndc value is then truncated to obtain a whole number (integer).</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/394a73d4dd88ac618ceb3fe68a18922b/equation8.jpg" style="width: 270px; float: left; height: 268px; margin-left: 50px; margin-right: 50px" /></p>
<p> </p>
<p> </p>
<p> </p>
<p> </p>
<p> </p>
<p> </p>
<p> </p>
<p> </p>
<p>For example, if the calculated value is 15.8, mathematically you are not quite capable of differentiating between 16 categories. Therefore, Minitab <a href="http://www.minitab.com/products/minitab">Statistical Software</a> is conservative and truncates the number of distinct categories to 15. For practical purposes, you can also round the calculated ndc values to obtain the number of distinct categories.</p>
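<p>The relationship can be sketched in a few lines of Python. The equations in this post are images, so the formula below is an assumption reconstructed from the derivation described in the text: ndc = √2 × σ<sub>part</sub> / σ<sub>gage</sub>, with σ<sub>gage</sub> / σ<sub>total</sub> = %SV / 100. The function name is hypothetical.</p>

```python
from math import sqrt, trunc


def ndc_from_percent_sv(percent_sv):
    """Untruncated number of distinct categories for a given %Study Var."""
    if not 0 < percent_sv < 100:
        raise ValueError("percent_sv must be between 0 and 100 (exclusive)")
    ratio = percent_sv / 100  # sigma_gage / sigma_total
    # sigma_part / sigma_gage = sqrt(1 - ratio^2) / ratio
    return sqrt(2) * sqrt(1 - ratio**2) / ratio


for sv in (5, 10, 20, 28, 30, 32):
    exact = ndc_from_percent_sv(sv)
    # trunc() drops the fractional part, the conservative choice described above.
    print(f"%SV = {sv:>2}  ndc = {exact:6.3f}  truncated = {trunc(exact)}")
```

<p>Printing both the exact and the truncated value makes it easy to see how far a measurement system is from the next whole category.</p>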
Guidelines and Limitations for Evaluating a Measurement System Using NDC
<p>You can evaluate a measurement system by looking only at the number of distinct categories and using the following guidelines (based on the truncation method used by Minitab):</p>
<ul>
<li>≥ <strong>14 distinct categories </strong>– the measurement system is acceptable</li>
<li><strong>4-13 distinct categories </strong>– the measurement system is marginally acceptable, depending on the importance of the application, cost of measurement device, cost of repair, and other factors</li>
<li><strong>≤ 3 distinct categories </strong>– the measurement system is unacceptable and should be improved</li>
</ul>
<p>These guidelines have some limitations. For example, in some cases when the %SV is over 30% the number of distinct categories is 4. Therefore, a measurement system with 32% study variation, which is unacceptable under the AIAG criteria for %SV, is acceptable under the ndc criteria. To avoid this discrepancy, some authors suggest only accepting a measurement system when it can distinguish between 5 or more categories. Although this fixes the original problem, it makes measurement systems with a 28-30% study variation unacceptable, because their corresponding ndc value equals 4.</p>
<p>To resolve this issue you can establish more specific guidelines based on the exact calculated ndc values, without truncating or rounding. For example, you could define an unacceptable measurement system based on an ndc < 4.497.</p>
<p>And that is how the number of distinct categories is related to %Study Var.</p>
Data AnalysisLean Six SigmaLearningQuality ImprovementSix SigmaStatisticsStatistics HelpStatsMon, 19 May 2014 12:00:00 +0000http://blog.minitab.com/blog/michelle-paret/gage-this-or-gage-that-how-the-number-of-distinct-categories-relates-to-the-study-variationMichelle ParetWhen Will I Ever See This Statistics Software Again?
http://blog.minitab.com/blog/understanding-statistics/when-will-i-ever-see-this-statistics-software-again
<p>Minitab Statistical Software was born out of a desire to make statistics easier to learn: by making the calculations faster and easier with computers, the trio of educators who created the first version of Minitab sought to free students from intensive computations to focus on learning key statistical concepts. That approach resonated with statistics instructors, and today Minitab is the standard for teaching and learning statistics at more than 4,000 universities all over the world.</p>
<p><img alt="Minitab is used around the world." src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/a2d0a2736e05ac7a3b256248e43e2fdb/earth.jpg" style="border-width: 1px; border-style: solid; margin: 10px 15px; float: right; width: 200px; height: 200px;" />But many students seem to believe Minitab is used <em>only </em>in education. Search Twitter for "Minitab," and you're likely to find a few students grousing that nobody uses Minitab Statistical Software in the "real world."</p>
<p>Those students are in for a big shock after they graduate. Organizations like Boeing, Dell, General Electric, Microsoft, Walt Disney, and <a href="http://www.minitab.com/company/customers/">thousands more worldwide</a> rely on Minitab software to help them improve the quality of their products and services.</p>
<p>Savvy instructors already know <a href="http://www.minitab.com/academic/help-your-students-find-a-job/">learning with Minitab can give students an advantage in the job market</a>.</p>
Stories of How Data Analysis Made a Real-World Difference
<p>In my job, I get to talk with professionals about how they use our software in their work. I've interviewed scientists, engineers, miners, shop stewards, foresters, Six Sigma experts, service managers, bankers, utility executives, soldiers, civil servants, and dozens of others.</p>
<p>The statistical methods they use vary widely, but a common thread running through all of their experiences reveals a critical link between Minitab's popularity in the academic world and its widespread application in so many different businesses and industries. Virtually every person I talk to about our software mentions something about "ease of use." </p>
<p>That makes a lot of sense: Minitab wasn't the first statistical software package, but it was the first <a href="http://blog.minitab.com/blog/voice-of-the-customer/making-statistics-easy">statistical software package designed with the express goal of being easy to use</a>. That led to its quick adoption by instructors and students, and those students brought Minitab with them into the workplace. And for more than 40 years, professionals have been using Minitab to solve challenges in the real world.</p>
<p>In case you're looking for examples, here are several of our favorite stories about how people have used Minitab: </p>
<p style="text-align: center;"><strong>Case Study</strong></p>
<p style="text-align: center;"><strong>Industry</strong></p>
<p style="text-align: center;"><strong>Methods and Tools</strong></p>
<p style="text-align: center;"><a href="http://www.minitab.com/Case-Studies/US-Army/">U.S. Army</a></p>
<p style="text-align: center;">Military</p>
<p style="text-align: center;">Pareto, Before/After Capability</p>
<p style="text-align: center;"><a href="http://www.minitab.com/Case-Studies/Red-Cross-Hospital-and-Canisius-Wilhelmina-Ziekenhuis/">Rode Kruis and CWZ</a></p>
<p style="text-align: center;">Hospital</p>
<p style="text-align: center;">Boxplot, Pareto Chart</p>
<p style="text-align: center;"><a href="http://www.minitab.com/Case-Studies/Belgian-Red-Cross/">Belgian Red Cross</a></p>
<p style="text-align: center;">Healthcare</p>
<p style="text-align: center;">Histogram, Probability Plot</p>
<p style="text-align: center;"><a href="http://www.minitab.com/Case-Studies/Betfair/">BetFair</a></p>
<p style="text-align: center;">Sports Betting</p>
<p style="text-align: center;">Interaction Plot, Capability Analysis, I-MR Chart</p>
<p style="text-align: center;"><a href="http://www.minitab.com/Case-Studies/Ford-Motor-Company-DOE/">Ford Motor Company</a></p>
<p style="text-align: center;">Automotive</p>
<p style="text-align: center;">Design of Experiments (DOE)</p>
<p style="text-align: center;"><a href="http://www.minitab.com/Case-Studies/US-Bowling-Congress/">U.S. Bowling Congress</a></p>
<p style="text-align: center;">Sports and Leisure</p>
<p style="text-align: center;">Scatterplot</p>
<p style="text-align: center;"><a href="http://www.minitab.com/Case-Studies/Six-Sigma-Ranch-Vineyards-and-Winery/">Six Sigma Ranch</a></p>
<p style="text-align: center;">Wine</p>
<p style="text-align: center;">Attribute Agreement Analysis, I-MR Chart</p>
<p style="text-align: center;"><a href="http://www.minitab.com/Case-Studies/Newcrest-Mining-Ltd/">Newcrest Mining</a></p>
<p style="text-align: center;">Mining</p>
<p style="text-align: center;">Individual Value Plot</p>
<p style="text-align: center;"><a href="http://www.minitab.com/Case-Studies/NASCAR/">NASCAR</a></p>
<p style="text-align: center;">Car Racing</p>
<p style="text-align: center;">Design of Experiments (DOE)</p>
<p>Have you used Minitab software on the job? We'd love to hear your story!</p>
Lean Six SigmaLearningSix SigmaWed, 07 May 2014 12:00:00 +0000http://blog.minitab.com/blog/understanding-statistics/when-will-i-ever-see-this-statistics-software-againEston Martz"Hidden Helpers" in Minitab Statistical Software
http://blog.minitab.com/blog/understanding-statistics/hidden-helpers-in-minitab-statistical-software
<p>Minitab <a href="http://www.minitab.com/products/minitab">Statistical Software</a> offers many features that can save you time and effort when you’re learning statistics or analyzing data. However, when we demonstrate many of these short cuts, tools, and capabilities at shows and events, we find that even some longtime users aren’t aware of them.</p>
<p>I asked members of our sales team and technical support staff to list some of Minitab’s most helpful, yet frequently overlooked features. How many do you use—or want to start using?</p>
Can You Repeat That?
<p>Frequently, you’ll need to modify or re-run some part of an analysis you conducted. You can easily return to your last dialog box by pressing <strong>CTRL+E</strong>. </p>
<p>What if you need more than 1 version of a graph? Maybe you're presenting your results to two different audiences, and you'd like to highlight different factors for each. Use <strong>Editor > Duplicate Graph</strong> to create identical copies of the original graph, which you can then tailor to suit each of your audiences.</p>
<p><img alt="duplicate graphs in minitab" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/abf88e79842996c17703804fb44c0f3b/duplicate_graphs.jpg" style="width: 300px; height: 212px;" /></p>
<p>It’s also easy to create new graphs using different variables while retaining all of your graph edits. With a graph or control chart active, choose <strong>Editor > Make Similar Graph</strong> to make a graph that retains all properties of your original graph but uses different columns.</p>
Have It Your Way
<p>To customize menus and toolbars, choose <strong>Tools > Customize</strong>. You can add, delete, move, or edit menus and toolbars; add buttons to Minitab that you can simply click on to run macros; and set keystrokes for commands.</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/ef8af26c15c132d2b9235026a7872695/custom_menu_1_.gif" style="width: 160px; height: 92px;" /></p>
You Can Take It With You
<p>You can specify default settings using <strong>Tools > Options. </strong>Then store all your personalized settings and customizations in a profile (using <strong>Tools > Manage Profiles</strong>) that you can use whenever you choose and share with colleagues.</p>
Manipulating Data
<p>Need to change the format of a column? For example, do you need to convert a text column to numeric format for your analysis? Just choose<strong> Data > Change Data Type</strong> and select the appropriate option.</p>
May I Take Your Order?
<p>Have you ever created a graph and wished you could switch the order of the results? For instance, you might want to change “High, Medium, Low” to “Low, Medium, High”. To display your results in a specific order, right-click on the column used to generate the output and choose<strong> Column > Value Order</strong>. This lets you set the value order for a text column using an order you define. The value order lets you control the order of groups on bar charts and other graphs, as well as tables and other Session Window output.</p>
Help Is Just a Click Away
<p>If you’ve never clicked on Minitab’s <strong>Help</strong> menu, you’re missing a tremendous collection of resources. Of course you’ll find guidance about how to use Minitab software there, including step-by-step tutorials. In addition, you’ll also find:</p>
<ul>
<li>Minitab’s <strong>Statistical Glossary.</strong> This comprehensive, illustrated glossary covers all areas of Minitab statistics. Each definition contains practical, easy-to-understand information.</li>
<li><strong>StatGuide™.</strong> You’ve run an analysis, but what does it <em>mean</em>? StatGuide explains how to interpret Minitab results, using preselected examples to explain your output.</li>
<li>A list of <strong>Methods and Formulas</strong>. </li>
<li>Links to helpful Internet resources, including our extensive <strong>Answers Knowledgebase</strong>.</li>
</ul>
<p>And if you don’t find the answers you need, you can contact Minitab’s free <strong>Technical Support</strong> for assistance from highly-skilled specialists with expertise in both computing and statistics.</p>
<p>Do you have any favorite "hidden helpers" in Minitab? </p>
<p> </p>
Statistics HelpTue, 27 May 2014 12:00:00 +0000http://blog.minitab.com/blog/understanding-statistics/hidden-helpers-in-minitab-statistical-softwareEston MartzHockey Penalties, Fans Booing, and Independent Trials
http://blog.minitab.com/blog/the-statistics-game/hockey-penalties-fans-booing-and-independent-trials
<p><img alt="Ref" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/fe2c58f6-2410-4b6f-b687-d378929b1f9b/Image/5fd1da09658e5c01a18ccf50e9358568/referee.jpg" style="float: right; width: 250px; height: 280px;" />We’re in the thick of the Stanley Cup playoffs, which means hockey fans are doing what seems to be every sports fan's favorite hobby...complaining about the refs! While most complaints, such as “We’re not getting any of the close calls!” are subjective and hard to get data for, there's one question that we should be able to answer objectively with a statistical analysis: Are hockey penalties independent trials? That is, does the team that the next penalty will be called on <em>depend</em> on the team that any previous penalties were called on?</p>
<p>Think of flipping a coin. Even if it comes up heads 10 times in a row, the probability of getting heads on the next flip is still 50%. In theory, you would think penalties in hockey work the same way. Both teams are playing hard, and should be equally likely to commit the next penalty. Maybe a single player would be less likely to commit a second penalty right after he just committed one because he’ll play more cautiously. But at the team level, you would expect the odds of the next penalty going against either team to be 50/50.</p>
<p>But players aren’t the only ones who affect the outcome of a penalty. Referees are ultimately the people who decide when to let things go and when to call a penalty. And you can only imagine what the crowd and coaches would do to the ref if the home team had 10 penalties called against them in a row.</p>
<p>So let’s dig into the data with Minitab <a href="http://www.minitab.com/products/minitab">Statistical Software</a> and see if refs call penalties independently, or if the team they call it on depends on which team they called previous penalties on.</p>
The Data
<p>I’m only going to include playoff games in my sample, because those are the games where the stakes are the highest. For every Stanley Cup Playoff game from 2013 and an additional 21 games from this year, I collected the team the penalty was called on (either home or away) and the order in which they were called. I only included penalties where one team got a power play. So if matching penalties were assessed to players on opposite teams, I didn’t include those since it didn't give either team an advantage. I also didn’t include penalties for fights that occurred late in hockey games that were blowouts. (By that point the game was effectively over, so the penalties didn’t matter.) In total I had 732 penalties that occurred in 106 playoff hockey games.</p>
The First Penalty
<p>No penalties have been called at this point, so there shouldn’t be any bias. We should expect an equal number of first penalties to be called on the home and away teams.</p>
<p><img alt="Tally" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/fe2c58f6-2410-4b6f-b687-d378929b1f9b/Image/6529f826f721a5a7353c90f857d8ccf0/penalty_1.jpg" style="width: 380px; height: 131px;" /></p>
<p>Sure enough the penalties are just about 50/50. So far, so good, refs!</p>
The 2nd and 3rd Penalties
<p>Now let’s see what happens with the next two penalties, starting with the second. Is the team that the second penalty is called on independent of the team the first was called on? We can perform a <a href="http://blog.minitab.com/blog/understanding-statistics/chi-square-analysis-of-halloween-and-friday-the-13th-is-there-a-slasher-movie-gender-gap">chi-square test of association</a> to get an answer.</p>
<p><img alt="Chi-square test" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/fe2c58f6-2410-4b6f-b687-d378929b1f9b/Image/b5c572324b08587e6994de258896c562/penalty_2.jpg" style="width: 623px; height: 293px;" /></p>
<p>Looking at the table, we see that of the 51 times the first penalty was called on the away team, the next penalty was called on the home team 33 times (65%). And of the 55 times the first penalty was called on the home team, the next penalty was called on the away team 32 times (58%). It appears that the second penalty depends on the first, but we should examine the results of the chi-square tests to be certain.</p>
<p>The p-values for both chi-square tests are 0.018. Because this value is less than 0.05, we can conclude that there is an association between the first and second penalty. So if your team gets called for the first penalty, odds are the next one is going against the other team.</p>
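<p>For readers who want to check the arithmetic outside Minitab, here is a minimal Python sketch of the Pearson chi-square test of association, using only the counts quoted above. The cells 18 and 23 are not quoted directly; they are inferred as 51 - 33 and 55 - 32. The sketch covers the Pearson statistic only, not the likelihood-ratio version that Minitab also reports.</p>

```python
import math

# Observed counts taken from the text above.
# Rows: team called for the first penalty (away, home).
# Columns: team called for the second penalty (away, home).
observed = [
    [18, 33],  # first penalty on away (51 games); 18 is inferred as 51 - 33
    [32, 23],  # first penalty on home (55 games); 23 is inferred as 55 - 32
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (count - expected) ** 2 / expected

# With 1 degree of freedom, the upper-tail chi-square probability
# reduces to erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.3f}, p-value = {p_value:.3f}")
```

<p>The p-value from this sketch rounds to 0.018, in line with the Minitab output.</p>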
<p>Will this trend continue for the third penalty? Let’s start by thinking about the number of penalties called on the home team. There could be 0, 1, 2, or 3 penalties called on them. If the penalties are independent of each other, then the probability of any single penalty being called on the home team is 0.5. We can use this to calculate the probability of each possible number of home-team penalties among the first three.</p>
<table style="margin-left: auto; margin-right: auto; text-align: center;">
<tr><th>Penalties on the home team</th><th>Equation</th><th>Probability</th></tr>
<tr><td>0</td><td>.5<sup>3</sup></td><td>0.125</td></tr>
<tr><td>1</td><td>3*(.5)*(.5<sup>2</sup>)</td><td>0.375</td></tr>
<tr><td>2</td><td>3*(.5<sup>2</sup>)*(.5)</td><td>0.375</td></tr>
<tr><td>3</td><td>.5<sup>3</sup></td><td>0.125</td></tr>
</table>
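<p>The probabilities above are binomial probabilities with 3 trials and a success probability of 0.5. A short Python sketch (the variable names are only illustrative) reproduces them:</p>

```python
from math import comb

n_penalties = 3
p_home = 0.5  # assumes each penalty is equally likely to go against either team

# Binomial probability of exactly k home-team penalties among the first three.
probabilities = [
    comb(n_penalties, k) * p_home**k * (1 - p_home) ** (n_penalties - k)
    for k in range(n_penalties + 1)
]

for k, prob in enumerate(probabilities):
    print(k, prob)
```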
<p> <br />
Now we just have to summarize our data, and see how many times each of these actually occurred. Then we can use a chi-square goodness-of-fit test to compare our observed values to the expected probabilities that we calculated above.</p>
<p><img alt="Chi-square goodness-of-fit test" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/fe2c58f6-2410-4b6f-b687-d378929b1f9b/Image/a72c80dc267c10d23b80e183b59784bd/penalty_3_w640.jpeg" style="width: 640px; height: 261px;" /></p>
<p>If you look at N in the bottom left corner, you’ll see that our sample size dropped to 103 games. That’s because there were 3 games where only 2 penalties were called, so they couldn’t be included in this analysis.</p>
<p>Now let’s focus on the table. In a sample of 103 games, we would expect there to be about 13 games where the home team had 0 penalties and another 13 where they had 3. But the Observed column shows us that there were far fewer. In fact, there were only 3 games where the first 3 penalties all went to the home team. It seems like the refs were reluctant to get the home crowd angry at them.</p>
<p>The 1 and 2 penalty categories suggest that the refs appear reluctant to have<em> anybody</em> get mad at them. While we would expect about 39 games to occur in each category, there were 46 and 47 instead!</p>
<p>The p-value for the chi-square test is 0.004. This means we can conclude that the data do not follow the proportions we would expect if the trials were independent. When it comes to the first 3 penalties of the game, the refs are reluctant to give either team too much of an advantage (especially the away team), instead opting to make the penalties as even as possible.</p>
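<p>The goodness-of-fit calculation can be sketched in Python as well. Only 46, 47, and 3 are quoted in the text; the count of 7 games with 0 home penalties is inferred as 103 - 46 - 47 - 3, and which of the two middle categories holds 46 versus 47 is an assumption (the statistic is the same either way because both have the same expected count).</p>

```python
import math

n_games = 103
expected_probs = [0.125, 0.375, 0.375, 0.125]

# Games by number of home-team penalties (0, 1, 2, 3) among the first three.
# 7 is inferred from the total; the 46/47 ordering is an assumption.
observed = [7, 46, 47, 3]

expected = [n_games * p for p in expected_probs]
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Closed-form upper-tail probability for a chi-square distribution
# with 3 degrees of freedom (4 categories - 1).
p_value = (
    math.erfc(math.sqrt(chi_square / 2))
    + math.sqrt(2 * chi_square / math.pi) * math.exp(-chi_square / 2)
)

print(f"expected = {expected}")
print(f"chi-square = {chi_square:.2f}, p-value = {p_value:.4f}")
```

<p>Under these assumptions the p-value comes out just under 0.004, consistent with the value reported above.</p>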
The Entire Game
<p>Now let’s move on from the first few penalties and try to determine what happens throughout the entire hockey game. For every penalty in a game, I assigned a 1 if it was called on the home team and a -1 if it was called on the away team. Then I added these values up as the game progressed to keep track of the “count”. So if the first 3 penalties were called on the away team, the count is at -3. And if 3 of the first 5 penalties were called on the home team, the count would be at 1. Here is a histogram of the counts.</p>
<p><img alt="Histogram" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/fe2c58f6-2410-4b6f-b687-d378929b1f9b/Image/5ab7e0c843709600d1cb72583d598124/histogram_of_count_w640.jpeg" style="width: 600px; height: 400px;" /></p>
<p>Notice that the count rarely moves far from 0. Counts above 2 or below -2 are uncommon, once again showing that refs don’t want to seem too biased toward either team. Because counts of 3 or higher (or -3 or lower) were so rare, their sample sizes are too small to draw any conclusions from. So I combined them with the 2 and -2 categories to increase the sample size.</p>
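<p>As a rough illustration of the bookkeeping, the running count and the grouping of extreme counts could be computed like this in Python. The function names and the "home"/"away" labels are hypothetical, chosen only for this sketch.</p>

```python
from itertools import accumulate


def running_count(penalties):
    """Return the count after each penalty: +1 for home, -1 for away."""
    scores = [1 if team == "home" else -1 for team in penalties]
    return list(accumulate(scores))


def clamp(count, limit=2):
    """Fold counts beyond +/-limit into the +/-limit category."""
    return max(-limit, min(limit, count))


# First 3 penalties all on the away team: the count ends at -3.
print(running_count(["away", "away", "away"]))

# 3 of the first 5 penalties on the home team: the count ends at 1.
print(running_count(["home", "away", "home", "home", "away"]))

# A count of -3 is grouped with the -2 category.
print(clamp(-3))
```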
<p>Once I had the count for every penalty in the game, I recorded the team the next penalty was called on. Then for each count, we can see the proportions for the team the next penalty was called on!</p>
<p><img alt="Tabulated Statistics" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/fe2c58f6-2410-4b6f-b687-d378929b1f9b/Image/380348fd342bcbf1510b72c5f11af0ac/count.jpg" style="width: 610px; height: 499px;" /></p>
<p>Let’s start with the negative counts. This means the away team has had more penalties called on them than the home team. We see that in these cases, the home team is slightly more likely to be called for the next penalty, but the probabilities are still close to 50%. Any bias the refs show in calling penalties does not favor the away team.</p>
<p>It’s quite a different story when we look at the home team. When the home team has been called for 2 or more penalties than the away team, the next penalty goes on the away team 75% of the time! And in case you think this is just a result of combining all the counts of 2 or more, I can tell you that if you take a count of 2 as its own category, the percentage of the next penalty going to the away team was still 74.65%.</p>
<p>So what is causing this? It’s unlikely to be the players or the coach of the home team yelling at the ref. After all, I’m sure the players and coach of the away team yell at the refs just as much when the count is negative, and <em>they </em>don’t see an advantage.</p>
<p>That brings us back to the fans. Remember when I said every sports fan's favorite hobby seems to be complaining about the refs? Well, refs are humans, too, and it definitely seems plausible that when most of the calls are going against the home team, the crowd noise might make the refs more prone to call the next penalty on the away team. Of course, this analysis can’t <em>prove</em> that this is the cause, but it adds some credence to a nice theory!</p>
<p>So if you ever find yourself at a playoff hockey game, feel free to boo as loud as you want when the ref calls a penalty on your team. Even if it was a good call! You just might be tipping the odds in your team’s favor!</p>
Fun Statistics | Fri, 25 Apr 2014 16:34:10 +0000 | http://blog.minitab.com/blog/the-statistics-game/hockey-penalties-fans-booing-and-independent-trials | Kevin Rudy
What Can Classical Chinese Poetry Teach Us About Graphical Analysis?
http://blog.minitab.com/blog/statistics-and-quality-data-analysis/views-of-mist-covered-mountains-of-data
<p><img alt="mountains" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/0e36446611c739e73745751956f6fb21/lushan_w640.jpeg" style="float: right; border-width: 1px; border-style: solid; margin: 10px 15px; width: 174px; height: 223px;" />A famous classical Chinese poem from the Song dynasty describes the views of a mist-covered mountain called Lushan.</p>
<p>The poem was inscribed on the wall of a Buddhist monastery by <a href="http://www.chinaonlinemuseum.com/calligraphy-su-shi.php" target="_blank">Su Shi</a>, a renowned poet, artist, and calligrapher of the 11th century.</p>
<p>Deceptively simple, the poem captures the illusory nature of human perception.<br />
</p>
<p><img alt="poem" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/c1ddf575ee939d684c3b76089ce92eb8/chinese_poem.jpg" style="float: left; border-width: 1px; border-style: solid; margin: 10px 15px; width: 186px; height: 203px;" /><em> <strong>Written on the Wall of West Forest Temple</strong></em></p>
<p><em> --Su Shi<br />
<br />
From the side, it's a mountain ridge.<br />
Looking up, it's a single peak.<br />
Far or near, high or low, it never looks the same.<br />
You can't know the true face of Lu Mountain<br />
When you're in its midst.</em></p>
<p> </p>
<p>Our perception of reality, the poem suggests, is limited by our vantage point, which constantly changes.</p>
<p>In fact, there are probably as many interpretations of this famous poem as there are views of Mt. Lu.</p>
<p><em>Centuries after the end of the Song dynasty, imagine you are traversing a misty mountain of data using the Chinese language version of Minitab 17...</em></p>
<em>Written in the Graphs Folder in Minitab Statistical Software </em>
<p>From the interval plot, you are extremely (95%) confident that the population mean is within the interval bounds.</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/4a9fe65554ea6a8fc40fd49801727e9d/process_data_interval_____________.jpg" style="width: 576px; height: 384px;" /></p>
<p>From the individual value plot, the data may contain an outlier (which could bias the estimate of the mean).</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/4c84668b71122b685ae830451f5a6cbd/process_data_ivp______________.jpg" style="width: 576px; height: 384px;" /></p>
<p>From the boxplot, the data appear to be extremely skewed (making the confidence interval and mean estimate unreliable).</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/060dd171117360654198c375fc67395b/process_data_____________.jpg" style="width: 576px; height: 384px;" /></p>
<p>From the histogram, the data are bimodal (which makes the estimate of the mean utterly ...er..<em>.mean</em>ingless)</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/0a424c4834674c7e857dcdd7b58cfc0c/process_data__________.jpg" style="width: 576px; height: 384px;" /></p>
<p>From the time series plot, the data show an order effect, with increasing variation and downward drift.</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/6312aa1170196ee92a4d452aa4458b50/process_data_ts___________________.jpg" style="width: 576px; height: 384px;" /></p>
<p>From the individuals and moving range charts with stages, the data appear stable and in control:</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/7ee8e7da8a931d5d50426cfe81eb4eb3/process_data_____stage_____i_mr__________.jpg" style="width: 576px; height: 384px;" /></p>
<p>These graphs are all of <a href="//cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/File/eeb4fa8dbd44c1544dac9c3490441169/lushan.MPJ" target="_blank">the same data set</a>.</p>
<p>Take it from Su Shi. Don't rely on a single graphical view to capture the true reality of your data.</p>
<p><em>Image of Lushan licensed by <a href="http://en.wikipedia.org/wiki/File:Mount_Lushan_-_fog.JPG" target="_blank">Wikimedia Commons</a>.</em></p>
Data Analysis | Mon, 14 Apr 2014 10:57:00 +0000 | http://blog.minitab.com/blog/statistics-and-quality-data-analysis/views-of-mist-covered-mountains-of-data | Patrick Runkel
Did Welch’s ANOVA Make Fisher's Classic One-Way ANOVA Obsolete?
http://blog.minitab.com/blog/adventures-in-statistics/did-welchs-anova-make-fishers-classic-one-way-anova-obsolete
<p><img alt="Interval plot of group means" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/742d7708-efd3-492c-abff-6044d78e3bbd/Image/15105d100842a6143f9c699464b781c6/intplot_steps.gif" style="border-width: 1px; border-style: solid; margin: 10px 15px; float: right; width: 300px; height: 200px;" />One-way ANOVA can detect differences between the means of three or more groups. It’s such a classic statistical analysis that it’s hard to imagine it changing much.</p>
<p>However, a revolution has been under way for a while now. Fisher's classic one-way ANOVA, which is taught in Stats 101 courses everywhere, may well be obsolete thanks to Welch’s ANOVA.</p>
<p>In this post, I not only want to introduce you to Welch’s ANOVA, but also highlight some interesting research that we perform here at Minitab that guides the implementation of features in our <a href="http://www.minitab.com/en-us/products/minitab/" target="_blank">statistical software</a>.</p>
One-Way ANOVA Assumptions
<p>Like any statistical test, one-way ANOVA has several assumptions. However, some of these assumptions are stringent requirements, while others can be waived. Simulation studies can determine which assumptions are true requirements.</p>
<p>For one-way ANOVA, we’ll look at two major assumptions. One of these assumptions is a true requirement, and understanding that explains why Welch’s ANOVA beats the traditional one-way ANOVA.</p>
<p>The discussion below is a summary of simulation studies conducted by Rob Kelly, a senior statistician here at Minitab. You can read the full results in the <a href="http://support.minitab.com/en-us/minitab/17/Assistant_One_Way_ANOVA.pdf" target="_blank">one-way ANOVA white paper</a>. You can also peruse all of our <a href="http://support.minitab.com/en-us/minitab/17/technical-papers/" target="_blank">technical white papers</a> to see the research we conduct to develop methodology throughout the Assistant and Minitab.</p>
<p><strong>Assumption: Samples are drawn from normally distributed populations</strong></p>
<p>One-way ANOVA assumes that the data are normal. However, the simulations show that the test is accurate with nonnormal data when the sample sizes are large enough. The sample size guidelines are:</p>
<ul>
<li>If you have 2-9 groups, the sample size for each group should be at least 15.</li>
<li>If you have 10-12 groups, the sample size for each group should be at least 20.</li>
</ul>
<p><strong>Assumption: The populations have equal standard deviations (or variances)</strong></p>
<p>One-way ANOVA also assumes that all groups share a common standard deviation even if they have different means. The simulations show that this assumption is stricter than the normality assumption. You can’t waive it just because you have a large sample size.</p>
<p>What happens if you violate the assumption of equal variances?</p>
<p>For hypothesis tests like ANOVA, you set a significance level. The significance level is the probability that the test rejects the null hypothesis when the null hypothesis is actually true (Type I error). This error causes you to incorrectly conclude that the group means are different.</p>
<p>If you set the significance level to the common value of 0.05, about 1 out of 20 tests performed when the group means are truly equal should produce this error.</p>
<p>Rob ran 10,000 simulation runs for each of 50 different conditions to compare the observed error rate to the target level. Ideally, if you set the significance level to 0.05, the observed error rate is also 0.05.</p>
<p>The greater the difference between the target and actual error rate, the more sensitive one-way ANOVA is to violations of the equal variances assumption.</p>
Simulation results for unequal variances
<p>The simulations show that unequal standard deviations cause the actual error rate to diverge from the target rate for the traditional one-way ANOVA.</p>
<p>The best case scenario for unequal standard deviations is when group sizes are equal. With a significance level of 0.05, the observed error rate ranges from 0.02 to 0.08.</p>
<p>For unequal group sizes, the results varied greatly depending on the standard deviations of the larger and smaller groups. The error rates for unequal group sizes extend up to 0.22!</p>
Solutions to this Problem
<p>Clearly you need to be wary when you perform one-way ANOVA and your group standard deviations are potentially different. Fortunately, there are two approaches you can try.</p>
<strong>Test for equal variances</strong>
<p>In Minitab, you can perform a test to determine whether the standard deviations of the groups are significantly different: <strong>Stat > ANOVA > Test for Equal Variances</strong>. If the test’s p-value is greater than 0.05, there is insufficient evidence to conclude that the standard deviations are different.</p>
<p>However, there is a big caveat. Even if you meet the sample size guidelines for one-way ANOVA, the test for equal variances may have <a href="http://blog.minitab.com/blog/understanding-statistics/how-much-data-do-you-really-need-check-power-and-sample-size" target="_blank">insufficient power</a>. In this case, your groups can have unequal standard deviations but the test will be unlikely to detect the difference. In general, failing to reject the null hypothesis is not the best method to determine that groups are equal.</p>
<p>However, if you have an adequate sample size and if the variance test’s p-value is greater than 0.05, you can trust the results from the traditional one-way ANOVA.</p>
<strong>Welch’s ANOVA</strong>
<p>What do you do if the test for equal variances indicates that the standard deviations are different? Or that the test has insufficient power? Or, perhaps you just don’t want to have to worry about performing and explaining this extra test? Let me introduce you to Welch’s ANOVA!</p>
<p>Welch’s ANOVA is an elegant solution because it is a form of one-way ANOVA that does not assume equal variances. And the simulations show that it works great!</p>
<p>When the group standard deviations are unequal and the significance level is set at 0.05, the simulation error rate for:</p>
<ul>
<li>The traditional one-way ANOVA ranges from 0.02 to 0.22, while</li>
<li>Welch’s ANOVA has a much smaller range, from 0.046 to 0.054.</li>
</ul>
<p>Additionally, for cases where the group standard deviations are equal, there is only a negligible difference in statistical power between these two procedures.</p>
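<p>To make the idea concrete, here is a minimal Python sketch of the Welch F statistic and its degrees of freedom. It follows the standard published form of Welch's test and is not taken from Minitab's implementation; the three small groups are made-up numbers for illustration. Converting F to a p-value requires the F distribution, which this sketch leaves out.</p>

```python
from statistics import mean, variance


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA on a list of samples."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]

    # Each group is weighted by n / s^2, so groups with large
    # variances contribute less to the weighted grand mean.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_weight = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_weight

    between = sum(
        w * (m - weighted_mean) ** 2 for w, m in zip(weights, means)
    ) / (k - 1)
    lam = sum(
        (1 - w / total_weight) ** 2 / (n - 1) for w, n in zip(weights, sizes)
    )

    f_stat = between / (1 + 2 * (k - 2) * lam / (k**2 - 1))
    df1 = k - 1
    df2 = (k**2 - 1) / (3 * lam)
    return f_stat, df1, df2


# Hypothetical data: three groups of three observations each.
f_stat, df1, df2 = welch_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_stat, df1, df2)
```

<p>Unlike the classic F-test, no pooled standard deviation appears anywhere in the calculation, which is why the procedure does not depend on the equal variances assumption.</p>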
Where to Find Welch’s ANOVA in Minitab
<p>You might be using Welch’s ANOVA already without realizing it. Because of the advantages described above, the <a href="http://blog.minitab.com/blog/understanding-statistics/what-statistical-hypothesis-test-should-i-use" target="_blank">Assistant</a> only performs Welch’s ANOVA.</p>
<p>Starting in Minitab 17, you can also perform Welch’s ANOVA outside of the Assistant. Go to <strong>Stat > ANOVA > One-Way</strong>. Click <strong>Options</strong>, and uncheck <strong>Assume equal variances</strong>. You can also perform <a href="http://blog.minitab.com/blog/statistics-and-quality-data-analysis/keep-that-special-someone-happy-when-you-perform-multiple-comparisons" target="_blank">multiple comparisons</a> using the Games-Howell method to identify differences between pairs of groups.</p>
<p>Below is example output for Welch's ANOVA from the Assistant. Just like the classic one-way ANOVA, look at the p-value to determine significance and use the Means Comparison Chart to look for differences between specific groups.</p>
<p><img alt="One-Way ANOVA in Minitab's Assistant" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/742d7708-efd3-492c-abff-6044d78e3bbd/Image/69fbdb8397b249428e357eb894e81204/asst_one_way_w640.gif" style="width: 550px; height: 412px;" /></p>
<p>The low p-value (< 0.001) indicates that at least one mean is different. The chart shows that each mean is different from the other two means.</p>
Thu, 03 Apr 2014 11:00:00 +0000 | http://blog.minitab.com/blog/adventures-in-statistics/did-welchs-anova-make-fishers-classic-one-way-anova-obsolete | Jim Frost
Equivalence Testing for Quality Analysis (Part I): What are You Trying to Prove?
http://blog.minitab.com/blog/statistics-and-quality-data-analysis/equivalence-testing-for-quality-analysis-part-i-what-are-you-trying-to-prove
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/b922d8ae294ef7be358b1b2abdc06eab/scales.jpg" style="float: right; border-width: 1px; border-style: solid; margin: 10px 15px; width: 250px; height: 244px;" />With more options, come more decisions.</p>
<p>With equivalence testing added to Minitab 17, you now have more statistical tools to test a sample mean against a target value or another sample mean.</p>
<p>Equivalence testing is extensively used in the biomedical field. Pharmaceutical manufacturers often need to test whether the biological activity of a generic drug is equivalent to that of a brand name drug that has already been through the regulatory approval process.</p>
<p>But in the field of quality improvement, why might you want to use an equivalence test instead of a standard t-test?</p>
Interpreting Hypothesis Tests: A Common Pitfall
<p>Suppose a manufacturer finds a new supplier that offers a less expensive material that could be substituted for a costly material currently used in the production process. This new material is <em>supposed to be</em> just as good as the material currently used. It should make the product neither too pliable nor too rigid.</p>
<p>To make sure the substitution doesn’t negatively impact quality, an analyst collects two random samples from the production process (which is stable): one using the new material and one using the current material.</p>
<p>The analyst then uses a standard 2-sample t-test (<strong>Stat > Basic Statistics > 2-Sample t </strong>in Minitab <a href="http://www.minitab.com/products/minitab">Statistical Software</a>) to assess whether the mean pliability of the product is the same using both materials:</p>
<p style="margin-left: 40px;">________________________________________</p>
<p style="margin-left: 40px;"><strong>Two-Sample T-Test and CI: Current, New </strong></p>
<p style="margin-left: 40px;">Two-sample T for Current vs New<br />
N Mean StDev SE Mean<br />
Current 9 34.092 0.261 0.087<br />
New 10 33.971 0.581 0.18</p>
<p style="margin-left: 40px;">Difference = μ (Current) - μ (New)<br />
Estimate for difference: 0.121<br />
95% CI for difference: (-0.322, 0.564)<br />
T-Test of difference = 0 (vs ≠): T-Value = 0.60 <strong><span style="color:#FF0000;">P-Value = 0.562</span></strong> DF = 12<br />
________________________________________</p>
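<p>The T-Value and DF in this output can be reproduced from the summary statistics alone. The Python sketch below assumes the test does not pool the variances and that the Welch-Satterthwaite degrees of freedom are truncated to an integer, which is consistent with the DF = 12 shown above.</p>

```python
import math

# Summary statistics copied from the output above.
n_cur, mean_cur, sd_cur = 9, 34.092, 0.261
n_new, mean_new, sd_new = 10, 33.971, 0.581

# Squared standard error of each sample mean.
var_cur = sd_cur**2 / n_cur
var_new = sd_new**2 / n_new

se_diff = math.sqrt(var_cur + var_new)
t_value = (mean_cur - mean_new) / se_diff

# Welch-Satterthwaite approximation for the degrees of freedom.
df_exact = (var_cur + var_new) ** 2 / (
    var_cur**2 / (n_cur - 1) + var_new**2 / (n_new - 1)
)
df = math.floor(df_exact)  # assumption: truncated to an integer

print(f"T-Value = {t_value:.2f}, DF = {df}")
```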
<p>Because the p-value is not less than the <a href="http://blog.minitab.com/blog/statistics-and-quality-data-analysis/alpha-male-vs-alpha-female">alpha level</a> (0.05), the analyst concludes that the means do not differ. Based on these results, the company switches suppliers for the material, confident that statistical analysis has proven that they can save money with the new material without compromising the quality of their product.</p>
<p>The test results make everyone happy. High-fives. Group hugs. Popping champagne corks. There’s only one minor problem.</p>
<p>Their statistical analysis didn’t really <em>prove</em> that the means are the same.</p>
Consider Where to Place the Burden of Proof
<p>In hypothesis testing, H1 is the alternative hypothesis, and it bears the burden of proof. Usually, the alternative hypothesis is what you’re hoping to prove or demonstrate. When you perform a standard 2-sample t-test, you’re really asking: “Do I have enough evidence to <em>prove</em>, beyond a reasonable doubt (your alpha level), that the population means are different?”</p>
<p>To do that, the hypotheses are set up as follows:</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/102c226ce0c8cffe1a37cb7e43a969eb/2samplet_w640.jpeg" style="width: 640px; height: 271px;" /></p>
<p>If the p-value is less than alpha, you conclude that the means significantly differ. But if the p-value is not less than alpha, you haven’t <em>proven</em> that the means are equal. You just don’t have enough evidence to prove that they’re not equal.</p>
<p>The absence of evidence for a statement is not proof of its opposite. If you don’t have sufficient evidence to claim that A is true, you haven’t <em>proven</em> that A is false.</p>
<p>Equivalence tests were specifically developed to address this issue. In a 2-sample equivalence test, the null and alternative hypotheses are reversed from a standard 2-sample t test.</p>
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/1d72f62da34d5c7d7d7719973c3927ac/2smple_equiv_image_w640.jpeg" style="width: 640px; height: 271px;" /></p>
<p>This switches the burden of proof for the test. It also reverses the consequences of incorrectly assuming that the null hypothesis (H0) is true.</p>
Case in Point: The Presumption of Innocence vs. Guilt
<p>This rough analogy may help illustrate the concept.</p>
<p>In the court of law, the burden of proof rests on proving guilt. The suspect is presumed innocent (H0), until proven guilty (H1). In the news media, the burden of proof is often reversed: The suspect is presumed guilty (H0), until proven innocent (H1).</p>
<p>Shifting the burden of proof can yield different conclusions. That’s why the news media often express outrage when a suspect who is presumed to be guilty is let go because there was not sufficient evidence to prove the suspect’s guilt in the courtroom. As long as news media and the courtroom reverse their null and alternative hypotheses, they’ll sometimes draw different conclusions based on the same evidence.</p>
<p>Why do they set up their hypotheses differently in the first place? Because each seems to have a different idea of what’s a worse error to make. The judicial system believes the worse error is to convict an innocent person, rather than let a guilty person go free. The news media seem to believe the contrary. (Maybe because the presumption of guilt sells more papers than presumption of innocence?)</p>
When the Burden of Proof Shifts, the Conclusion May Change
<p>Back to our quality analyst in the first example. To avoid losing customers, the company would rather err by assuming that the quality was not the same using the cheaper material--when it actually was--than err by assuming it was the same, when it actually was not.</p>
<p>To more rigorously demonstrate that the means are the same, the analyst performs a 2-sample equivalence test (<strong>Stat > Equivalence Tests > Two Sample</strong>).</p>
<p style="margin-left: 40px;">________________________________________</p>
<p style="margin-left: 40px;"><strong>Equivalence Test: Mean(New) - Mean(Current) </strong></p>
<p style="margin-left: 40px;">Test<br />
Null hypothesis: Difference ≤ -0.4 or Difference ≥ 0.4<br />
Alternative hypothesis: -0.4 < Difference < 0.4<br />
α level: 0.05</p>
<p style="margin-left: 40px;">Null Hypothesis DF T-Value P-Value<br />
Difference ≤ -0.4 12 1.3717 0.098<br />
Difference ≥ 0.4 12 -2.5646 0.012</p>
<p style="margin-left: 40px;"><strong><span style="color:#FF0000;">The greater of the two P-Values is 0.098. Cannot claim equivalence.</span></strong><br />
________________________________________</p>
<p>Using the equivalence test on the same data, the results now indicate that there<em> isn't</em> sufficient evidence to claim that the means are the same. The company <em>cannot</em><em> </em>be confident that product quality will not suffer if they substitute the less expensive material. By using an equivalence test, the company has raised the bar for evaluating a possible shift in the process mean.</p>
<p><strong>Note:</strong> If you look at the above output, you'll see another way that the equivalence test differs from a standard t-test. Two one-sided t-tests are used to test the null hypothesis. In addition, the test uses a zone of equivalence that defines what size difference between the means you consider to be practically insignificant. <a href="http://blog.minitab.com/blog/statistics-and-quality-data-analysis/equivalence-testing-for-quality-analysis-part-ii-what-difference-does-the-difference-make">We’ll look at that in more detail in my next post</a>.</p>
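<p>The two one-sided tests can be sketched in Python from the summary statistics shown earlier. This is an approximation under stated assumptions: it uses the rounded means and standard deviations from the t-test output, takes DF = 12 and the equivalence limits of -0.4 and 0.4 from the output above, and computes the t-distribution tail area by numerical integration. The helper name is hypothetical.</p>

```python
import math


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for Student's t with t >= 0, via Simpson's rule on the density."""
    const = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3


diff = 33.971 - 34.092  # Mean(New) - Mean(Current)
se_diff = math.sqrt(0.261**2 / 9 + 0.581**2 / 10)
df = 12
lower_limit, upper_limit = -0.4, 0.4

# Test 1: null hypothesis is Difference <= -0.4 (reject for large t).
t_lower = (diff - lower_limit) / se_diff
p_lower = t_upper_tail(t_lower, df)

# Test 2: null hypothesis is Difference >= 0.4 (reject for very negative t).
t_upper = (diff - upper_limit) / se_diff
p_upper = t_upper_tail(-t_upper, df)  # lower tail, by symmetry

# Equivalence can be claimed only if BOTH one-sided tests reject.
p_max = max(p_lower, p_upper)
print(f"t = {t_lower:.3f}, {t_upper:.3f}; p = {p_lower:.3f}, {p_upper:.3f}")
print("Can claim equivalence" if p_max < 0.05 else "Cannot claim equivalence")
```

<p>Because the inputs are rounded, the T-values differ slightly from Minitab's in the third decimal place, but the conclusion is the same: the larger p-value is above 0.05, so equivalence cannot be claimed.</p>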
Quick Summary
<p>To choose between an equivalence test and a standard t-test, consider what you hope to prove or demonstrate. Whatever you hope to prove true should be set up as the alternative hypothesis for the test and bear the burden of proof. Whatever you deem to be the less harmful incorrect assumption to make should be the null hypothesis. If you’re trying to rigorously prove that two means are equal, or that a mean equals a target value, you may want to use an equivalence test rather than a standard t-test.</p>
Hypothesis Testing, Quality Improvement | Mon, 31 Mar 2014 12:39:00 +0000 | http://blog.minitab.com/blog/statistics-and-quality-data-analysis/equivalence-testing-for-quality-analysis-part-i-what-are-you-trying-to-prove | Patrick Runkel
The Best European Football League: What the CTQ’s and Minitab Can Tell Us
http://blog.minitab.com/blog/statistics-in-the-field/the-best-european-football-league-what-the-ctqs-and-minitab-can-tell-us
<p><em>by Laerte de Araujo Lima, guest blogger </em></p>
<p><img alt="football" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/dde28685b4777941d546f12cd8d6b6a3/football.jpg" style="border-width: 1px; border-style: solid; margin: 10px 15px; float: right; width: 296px; height: 285px;" />In a previous post (<a href="http://blog.minitab.com/blog/statistics-in-the-field/how-data-analysis-can-help-us-predict-this-years-champions-league">How Data Analysis Can Help Us Predict This Year's Champions League</a>), I shared how I used Minitab <a href="http://www.minitab.com/products/minitab">Statistical Software</a> to predict the 2013-2014 season of the UEFA Champions league. This involved the regression analysis of main critical-to-quality (CTQ) factors, which I identified using the “voice of the customer” suggestions of some friends.</p>
<p>Since that post was published, my friends have stopped discussing the UEFA Champions League; they were convinced by the results I shared.</p>
<p>But now they’ve challenged me to use Six Sigma tools to quantify which European football league is best. In other words, which league gives its fans the best value (average per game) in terms of the CTQ factors that make games fun to watch?</p>
Critical to the CTQ—Voice of the Customer
<p>This analysis will be based on the same CTQ factors used in my previous post. I debriefed my friends (in a bar, while watching a football match, of course) after publishing that post, and they agreed that these CTQs really match their expectations about what should and should not happen in a match.</p>
<p>However, I did add one new CTQ factor in this study, “Average Number of Yellow and Red Cards,” since these data were available in a new database.</p>
<table>
<tr><th>CTQ – Voice of Customer</th><th>UEFA CL database variable associated with CTQ</th></tr>
<tr><td>More goals per game to make game more fun!</td><td><strong>↑ Average goals scored per game</strong></td></tr>
<tr><td>Offensive strategy, with more attempts to score goals.</td><td><strong>↑ Average attempts on target per game</strong><br /><strong>↑ Average goals scored per game</strong><br /><strong>↓ Average fouls committed per game</strong></td></tr>
<tr><td>More effective use of game time.</td><td><strong>↓ Average fouls committed per game</strong></td></tr>
<tr><td>More “fair play” and protection for players with high football skills.</td><td><strong>↓ Average fouls committed per game</strong><br /><strong>↓ Average number of Yellow and Red cards</strong></td></tr>
</table>
European Football League Database
<div style="clear:both;">
<p>The hardest part of this study was finding a reliable and complete database. For this, my friends at <a href="http://www.whoscored.com">www.whoscored.com</a> proved best. In this database, I could find all the variables associated with the previously defined CTQs.</p>
<p>I apologize to my Portuguese and French friends, but as I noted in my previous post, only the most dominant countries and leagues (Italy, England, Germany, and Spain) in the last 12 UEFA Champions League seasons are considered in this scenario.</p>
<table>
<tr><th>Country</th><th>League</th><th># of teams</th><th># of matches per team</th><th>Web site</th></tr>
<tr><td align="center">Spain (ES)</td><td align="center">LIGA BBVA</td><td align="center">20</td><td align="center">38</td><td><a href="http://www.ligabbva.com">http://www.ligabbva.com</a></td></tr>
<tr><td align="center">Italy (IT)</td><td align="center">Serie A</td><td align="center">20</td><td align="center">38</td><td><a href="http://www.legaseriea.it/">http://www.legaseriea.it/</a></td></tr>
<tr><td align="center">England (UK)</td><td align="center">Barclays Premier League</td><td align="center">20</td><td align="center">38</td><td><a href="http://www.premierleague.com/en-gb.html">http://www.premierleague.com/en-gb.html</a></td></tr>
<tr><td align="center">Germany (GE)</td><td align="center">Bundesliga</td><td align="center">18</td><td align="center">34</td><td><a href="http://www.bundesliga.de/de/index.php">http://www.bundesliga.de/de/index.php</a></td></tr>
</table>
<div style="clear:both;"> </div>
Ranking Criteria and Methodology
<p>Based on the CTQ factors, I performed an analysis of each league, looking at each team’s individual average values for each CTQ. This lets me compare not only the overall league averages, but also the averages of the teams within each league.</p>
<p>To perform this analysis, I used the statistical tool called Analysis of Variance (ANOVA). ANOVA tests the hypothesis that the means of two or more populations are equal.</p>
<p>ANOVA evaluates the importance of one or more factors by comparing the response variable means at the different factor levels. The null hypothesis states that all population means (factor level means) are equal while the alternative hypothesis states that at least one is different.</p>
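<p>To make the comparison of factor-level means concrete, here is a minimal Python sketch of the one-way ANOVA F statistic, computed from the between-group and within-group sums of squares. Everything in it is an assumption made for illustration: the function names, the generic league labels, and the numbers, which are invented and are not the whoscored.com data. To stay within the standard library, the p-value comes from a permutation test (shuffling observations across groups), not from the F distribution.</p>

```python
import random
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    values = [x for g in groups for x in g]
    grand_mean = mean(values)
    group_means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def permutation_p_value(groups, n_permutations=2000, seed=0):
    """Approximate the p-value by shuffling observations across groups."""
    rng = random.Random(seed)
    observed = one_way_anova_f(groups)[0]
    pooled = [x for g in groups for x in g]
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for g in groups:
            shuffled.append(pooled[start:start + len(g)])
            start += len(g)
        # Count shuffles whose F is at least as large as the observed F.
        if one_way_anova_f(shuffled)[0] >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Invented per-team averages for three hypothetical leagues.
per_team_average = {
    "League A": [14.2, 13.8, 15.1, 14.6, 13.9],
    "League B": [14.9, 15.4, 14.1, 15.8, 15.0],
    "League C": [10.8, 11.5, 10.2, 11.1, 10.9],
}
samples = list(per_team_average.values())
f_stat, df_b, df_w = one_way_anova_f(samples)
p_value = permutation_p_value(samples)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}, permutation p = {p_value:.4f}")
```

<p>A large F means the league means are spread far apart relative to the team-to-team variation inside each league, which is the evidence against the null hypothesis of equal means.</p>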
<p>For this analysis I used the Assistant to perform a One-Way ANOVA: <strong>Assistant > Hypothesis Tests > One-Way ANOVA.</strong></p>
<p><img alt="ANOVA Chooser in the Minitab 17 Assistant" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/c75d8adad791e928fff41bb45623a940/anova_chooser.png" style="width: 500px; height: 391px;" /></p>
<p>Based on the One-Way ANOVA results, I’m able to rank the leagues by last season’s average values of the CTQ variables per match.</p>
<p>Then, after compiling all results, I used a <a href="http://asq.org/learn-about-quality/decision-making-tools/overview/decision-matrix.html" target="_blank">Decision Matrix</a> (another Six Sigma tool) to assess each league on the CTQ variables. The position of the league in each analysis and its associated weight (1 / 5 / 10) give a final score for each league.</p>
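<p>The scoring step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mapping of a league's position to 10, 5, or 1 points, the tier names, the CTQ keys, and the league labels are all hypothetical, since the exact scoring rule is shown only in the decision matrix image.</p>

```python
# Assumed mapping from a league's position on one CTQ to its points.
POINTS = {"best": 10, "middle": 5, "worst": 1}


def decision_matrix_totals(tiers_by_ctq, points=POINTS):
    """Sum the points each league earns across all CTQ variables."""
    totals = {}
    for tiers in tiers_by_ctq.values():
        for league, tier in tiers.items():
            totals[league] = totals.get(league, 0) + points[tier]
    return totals


def rank_leagues(totals):
    """Return league names ordered from highest to lowest total score."""
    return sorted(totals, key=lambda league: totals[league], reverse=True)


# Hypothetical positions for three placeholder leagues on two CTQ variables.
example = {
    "cards": {"League A": "best", "League B": "middle", "League C": "worst"},
    "fouls": {"League A": "best", "League B": "worst", "League C": "middle"},
}
totals = decision_matrix_totals(example)
print(totals)
print(rank_leagues(totals))
```

<p>When the ANOVA finds no significant difference for a variable, every league would simply be assigned the same tier for that variable, so it does not change the ranking.</p>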
Average Number of Yellow & Red Cards
<p><img alt="One Way ANOVA for Average Yellow and Red Cards" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/eaa407ca3b8f2652749bfddd85e0b1be/average_yellow_red_cards_w640.png" style="width: 640px; height: 480px;" /></p>
<p>As the Assistant's output makes clear, the One-Way ANOVA p-value (0.001) is less than the significance threshold (0.05), which indicates that there is a difference among the means. The table to the right of the p-value shows which means differ from the others, and the means comparison chart gives a graphical view of the statistical analysis. </p>
<p><strong>Conclusion</strong>: The U.K. football league has the lowest average of cards (Yellow & Red) per game.</p>
Average Fouls Committed per Game
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/53d33522ab0a908315c8f030fd73c9c8/anova_fouls_per_game_w640.png" style="width: 640px; height: 480px;" /></p>
<p>The p-value (0.001) is less than the threshold (0.05), telling us that there is a difference in means. In this case, based on the comparison chart, it’s evident that the U.K. league has by far the lowest average number of fouls per game. Among the remaining three leagues, there’s no statistically significant difference between the Spanish and Italian leagues, nor between the Italian and German leagues. In this situation, I’ve decided to give the Spanish league a different score than the German and Italian leagues.</p>
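<p>A significant ANOVA only says that some means differ; pairwise comparisons say which ones. The Python sketch below illustrates that follow-up step using pairwise permutation tests on the difference in means with a Bonferroni adjustment. This is a swapped-in technique chosen because it needs only the standard library, and it is not the comparison method the Assistant uses. The function names, league labels, and numbers are invented for illustration.</p>

```python
import random
from itertools import combinations
from statistics import mean


def perm_test_mean_diff(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the difference between two means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


def pairwise_comparisons(samples, alpha=0.05):
    """Compare every pair of groups; return {(g1, g2): (adjusted p, differs)}."""
    pairs = list(combinations(samples, 2))
    results = {}
    for first, second in pairs:
        p = perm_test_mean_diff(samples[first], samples[second])
        # Bonferroni: multiply by the number of comparisons, cap at 1.
        p_adj = min(1.0, p * len(pairs))
        results[(first, second)] = (p_adj, p_adj < alpha)
    return results


# Invented per-team averages for three hypothetical leagues.
per_team_average = {
    "League A": [14.2, 13.8, 15.1, 14.6, 13.9, 14.4, 15.0, 14.1],
    "League B": [14.9, 15.4, 14.1, 15.8, 15.0, 14.6, 15.2, 14.8],
    "League C": [10.8, 11.5, 10.2, 11.1, 10.9, 11.3, 10.6, 11.0],
}
for pair, (p_adj, differs) in pairwise_comparisons(per_team_average).items():
    print(pair, f"adjusted p = {p_adj:.4f}", "differs" if differs else "no evidence of difference")
```

<p>Pairwise results need not be transitive, which is how one league can be indistinguishable from two others that differ from each other, as in the overlapping groups described above.</p>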
Average Goals Scored per Game
<p><img alt="" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/b7430894afa46004b25b8f1228a754a6/anova_goals_per_game_w640.png" style="width: 640px; height: 480px;" /></p>
<p>The p-value (0.729) is greater than the threshold (0.05), indicating there is no significant difference in the means.<br />
<br />
<strong>Conclusion: </strong>No matter which league you watch, the number of goals per match will be, on average, the same.</p>
Average Attempts on Target per Game
<p><img alt="ANOVA of attempts per game" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/28a6dfefba59412d636232d1672bd19d/anova_attempts_per_game_w640.png" style="width: 640px; height: 480px;" /></p>
<p>Again, the p-value (0.891) is greater than the threshold (0.05), telling us that there is no statistically significant difference in the average attempts on target per game.</p>
<p><strong>Conclusion:</strong> All four leagues will receive the same score for this variable.</p>
<p>The final decision matrix helps us see the results of all of these analyses:</p>
<p><img alt="decision matrix" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/98adf388cdfaae91f6c46bb85aa3f09d/decision_matrix.png" style="width: 602px; height: 282px;" /></p>
<p><img alt="chart of critical to quality factors" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/99a5b5a7dc72d2b777047ad650ed1ada/ctq.png" style="width: 550px; height: 370px;" /></p>
Conclusion: The Best European Football League
<p>Based on the results shown above, we can conclude the following.</p>
<ol>
<li>Normality is not an issue. Except for the German league (18 teams), every league has a sample size of 20 teams, and the data are normally distributed.</li>
<li value="2">The U.K. league is the best in terms of “Fair Play.” Both its average fouls per game and its average cards (yellow and red) are lower than those of the other leagues. (Now I can understand why English supporters criticize the referee when an English team plays against another European team.)</li>
<li value="3">There is no difference among the leagues in terms of average attempts on target per game, nor average goals per match.</li>
<li value="4">Under the premise that the best European football league should have the best performance regarding the variables related to the selected CTQ, England’s football league comes out on top.</li>
</ol>
<p>If a good league is one that meets the “customer” expectations (CTQ), the exercise performed in this post shows that England’s supporters (of all teams in the Premier League) should be the most satisfied supporters in Europe.</p>
<p>Unfortunately, this analysis is based only on last season’s data, so it may reflect a single season rather than a trend. But at a minimum, this analysis indicates that there are significant differences among the leagues, especially in the “fair play” CTQ factor.</p>
<p> </p>
<p><strong>About the Guest Blogger: </strong></p>
<p><em>Laerte de Araujo Lima is a Supplier Development Manager for Airbus (France). He has previously worked as product quality engineer for Ford (Brazil), a Project Manager in MGI Coutier (Spain), and Quality Manager in IKF-Imerys (Spain). He earned a bachelor's degree in mechanical engineering from the University of Campina Grande (Brazil) and a master's degree in energy and sustainability from the Vigo University (Spain). He has 10 years of experience in applying Lean Six Sigma to product and process development/improvement. To get in touch with Laerte, please follow him on Twitter @laertelima or on</em> <a href="http://www.linkedin.com/pub/laerte-lima/7/46b/443" target="_blank"><strong><em>LinkedIn</em></strong></a><em>.</em></p>
</div>
Fun StatisticsThu, 27 Mar 2014 12:54:26 +0000http://blog.minitab.com/blog/statistics-in-the-field/the-best-european-football-league-what-the-ctqs-and-minitab-can-tell-usGuest Blogger