Data Analysis Software | Minitab
Blog posts and articles with tips for using statistical software to analyze data for quality improvement.
http://blog.minitab.com/blog/data-analysis-software/rss
Fri, 24 Feb 2017 01:16:11 +0000
Three Common P-Value Mistakes You'll Never Have to Make
http://blog.minitab.com/blog/understanding-statistics/three-common-p-value-mistakes-youll-never-have-to-make
<p>Statistics can be challenging, especially if you're not analyzing data and interpreting the results every day. <a href="http://www.minitab.com/products/minitab/" title="statistical software for analyzing quality data">Statistical software</a> makes things easier by handling the arduous mathematical work involved in statistics. But ultimately, we're responsible for correctly interpreting and communicating what the results of our analyses show.</p>
<p>The p-value is probably the most frequently cited statistic. We use p-values to interpret the results of regression analysis, hypothesis tests, and many other methods. Every introductory statistics student and every Lean Six Sigma Green Belt learns about p-values. </p>
<p>Yet this common statistic is misinterpreted so often that at least one scientific journal has abandoned its use.</p>
What Does a P-value Tell You?
<p>Typically, a p-value is defined as "the probability of observing an effect at least as extreme as the one in your sample data, <em>if the <span><a href="http://blog.minitab.com/blog/understanding-statistics/why-shrewd-experts-fail-to-reject-the-null-every-time">null hypothesis</a></span> is true</em>." Thus, the only question a p-value can answer is this one:</p>
<p><em>How likely is it that I would get the data I have, assuming the null hypothesis is true?</em></p>
<p>If your p-value is less than your selected <span><a href="http://blog.minitab.com/blog/adventures-in-statistics-2/understanding-hypothesis-tests%3A-significance-levels-alpha-and-p-values-in-statistics">alpha level</a></span> (typically 0.05), you <em>reject the null hypothesis</em> in favor of the alternative hypothesis. If the p-value is above your alpha value, you <em>fail to reject</em> the null hypothesis. It's important to note that the null hypothesis is never accepted; we can only <em>reject </em>or <em>fail to reject</em> it. </p>
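<p>One way to make the "if the null hypothesis is true" part concrete is a permutation test, sketched below in Python. This is an illustration of the definition, not the method Minitab uses, and the measurements are hypothetical. If the null hypothesis is true, the group labels carry no information, so we shuffle them many times and count how often the shuffled difference in means is at least as large as the one observed.</p>

```python
import random
from statistics import mean


def permutation_p_value(sample_a, sample_b, n_shuffles=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in means by shuffling labels."""
    rng = random.Random(seed)
    observed = abs(mean(sample_a) - mean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        # Under the null hypothesis, any split of the pooled data is equally likely.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles


# Hypothetical measurements (oz), invented for this sketch.
group_a = [14.3, 14.1, 14.6, 14.4, 14.2, 14.5, 14.0, 14.4]
group_b = [13.9, 14.0, 13.7, 14.1, 13.8, 14.2, 13.6, 13.9]

alpha = 0.05
p = permutation_p_value(group_a, group_b)
decision = "reject" if p < alpha else "fail to reject"
print(f"estimated p-value = {p:.4f}; we {decision} the null hypothesis")
```

<p>Note that the two possible outcomes are "reject" and "fail to reject"; the sketch never reports that the null hypothesis was accepted.</p>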
The P-Value in a 2-Sample t-Test
<p>Consider a typical hypothesis test: say, a 2-sample t-test of the mean weight of boxes of cereal filled on production lines at two different facilities. We collect and weigh 50 boxes from each line to check whether both lines fill boxes to the same mean weight. The listed package weight is 14 oz. </p>
<p>Our null hypothesis is that the two means are equal. Our alternative hypothesis is that they are <em>not </em>equal. </p>
<p>To run this test in Minitab, we enter our data in a worksheet and select <strong>Stat > Basic Statistics > 2-Sample T-test</strong>. If you'd like to follow along, you can download the <a href="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/2edc594cf40ec4931e5cd0021df6703e/cereal_weight.mtw">data</a> and, if you don't already have it, get the <a href="http://www.minitab.com/products/minitab/free-trial/">30-day trial of Minitab</a>. In the t-test dialog box, select<em> Both samples are in one column</em> from the drop-down menu, and choose "Weight" for Samples, and "Facility" for Sample IDs.</p>
<p style="margin-left: 40px;"><img alt="t test for the mean" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/1a090752bef395f3b227511c6e57946d/dialog.png" style="width: 424px; height: 296px;" /></p>
<p>Minitab gives us the following output, and I've highlighted the p-value for the hypothesis test:</p>
<p style="margin-left: 40px;"><img alt="t-test output" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/3b27f14d1859460a1875c81384c52ccb/t_test_output.png" style="width: 544px; height: 222px;" /></p>
<p>So we have a p-value of 0.029, which is less than our selected alpha value of 0.05. Therefore, we reject the null hypothesis that the means of Line A and Line B are equal. Note also that while the evidence indicates the means are different, that difference is estimated at 0.338 oz—a pretty small amount of cereal. </p>
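<p>For readers who want to see the arithmetic behind output like this, here is a rough Python sketch of a 2-sample t-test that does not assume equal variances (Welch's test). The summary statistics are hypothetical: the means are chosen so the difference is 0.338 oz, but the standard deviations are invented, so the result will only be in the neighborhood of the p-value shown above. The p-value is obtained by numerically integrating the t density, which is an approximation chosen to keep the sketch free of third-party libraries.</p>

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return the t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def t_pdf(x, df):
    """Density of Student's t distribution, computed on the log scale."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the area under the density between -|t| and |t|."""
    b = abs(t)
    h = b / steps  # Simpson's rule on [0, |t|]; steps must be even
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)


# Hypothetical summary statistics (oz): mean, standard deviation, sample size.
t_stat, df = welch_t(14.100, 0.77, 50, 13.762, 0.77, 50)
p_value = two_sided_p(t_stat, df)
print(f"t = {t_stat:.2f}, df = {df:.0f}, p = {p_value:.3f}")
```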
<p>So far, so good. But this is the point at which trouble often starts.</p>
Three Frequent Misstatements about P-Values
<p>The p-value of 0.029 means we reject the null hypothesis that the means are equal. But that doesn't mean any of the following statements are accurate:</p>
<ol>
<li><strong>"There is a 2.9% probability the means are the same, and a 97.1% probability they are different." </strong><br />
We don't know that at all. The p-value only says that <strong><em>if </em></strong>the null hypothesis is true, the sample data collected would exhibit a difference this large or larger only 2.9% of the time. Remember that the p-value doesn't tell you the probability that either hypothesis is true. Instead, it tells you the <em>probability </em>of seeing data like yours when the null hypothesis holds. </li>
<br />
<li><strong>"The p-value is low, which indicates there's an important difference in the means." </strong><br />
Based on the 0.029 p-value shown above, we can conclude that a statistically significant difference between the means exists. But the estimated size of that difference is less than a half-ounce, and won't matter to customers. A p-value may indicate a difference exists, but it tells you nothing about its practical impact.</li>
<br />
<li><strong>"The low p-value shows the alternative hypothesis is true."</strong><br />
A low p-value provides statistical evidence to reject the null hypothesis, but that doesn't prove the truth of the alternative hypothesis. If your alpha level is 0.05 and the null hypothesis is actually true, there's a 5% chance you will incorrectly reject it. Or to put it another way, if a jury fails to convict a defendant, it doesn't prove the defendant is <em>innocent</em>: it only means the prosecution failed to prove the defendant's guilt beyond a reasonable doubt. </li>
</ol>
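<p>The third point can be checked with a small simulation. The Python sketch below draws both samples from the <em>same</em> population, so the null hypothesis is true by construction, and counts how often a test at alpha = 0.05 rejects it anyway. To keep it short it uses a z-test with a known standard deviation instead of a t-test, and every number in it (mean, standard deviation, number of trials) is an assumption made for illustration.</p>

```python
import random
from statistics import NormalDist, fmean


def false_rejection_rate(n_tests=2000, n_per_group=50, alpha=0.05, seed=1):
    """Share of tests that reject a null hypothesis that is actually true."""
    rng = random.Random(seed)
    sigma = 0.8  # assumed known population standard deviation (oz)
    std_error = sigma * (2 / n_per_group) ** 0.5
    standard_normal = NormalDist()
    rejections = 0
    for _ in range(n_tests):
        # Both lines have the same true mean, so any "difference" is pure noise.
        a = [rng.gauss(14.0, sigma) for _ in range(n_per_group)]
        b = [rng.gauss(14.0, sigma) for _ in range(n_per_group)]
        z_stat = (fmean(a) - fmean(b)) / std_error
        p = 2 * (1 - standard_normal.cdf(abs(z_stat)))
        if p < alpha:
            rejections += 1
    return rejections / n_tests


rate = false_rejection_rate()
print(f"rejected a true null hypothesis in {rate:.1%} of simulated tests")
```

<p>The rate should land near 5%, which is why a single low p-value is evidence against the null hypothesis and not proof of the alternative.</p>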
<p>These misinterpretations happen frequently enough to be a concern, but that doesn't mean that we shouldn't use p-values to help interpret data. The p-value remains a very useful tool, as long as we're interpreting and communicating its significance accurately.</p>
P-Value Results in Plain Language
<p>It's one thing to keep all of this straight if you're doing data analysis and statistics all the time. It's another thing if you only analyze data occasionally, and need to do many other things in between, like most of us. "Use it or lose it" is certainly true about statistical knowledge, which could well be another factor that contributes to misinterpreted p-values. </p>
<p>If you're leery of that happening to you, a good way to avoid that possibility is to use the Assistant in Minitab to perform your analyses. If you haven't used it yet, the Assistant menu guides you through your analysis from start to finish. The dialog boxes and output are all in plain language, so it's easy to figure out what you need to do and what the results mean, even if it's been a while since your last analysis. (But even expert statisticians tell us they like using the Assistant because the output is so clear and easy to understand, regardless of an audience's statistical background.) </p>
<p>So let's redo the analysis above using the Assistant, to see what that output looks like and how it can help you avoid misinterpreting your results—or having them be misunderstood by others!</p>
<p>Start by selecting <strong>Assistant > Hypothesis Test...</strong> from the Minitab menu. Note that a window pops up to explain exactly what a hypothesis test does. </p>
<p style="margin-left: 40px;"><img alt="assistant hypothesis test" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/f26601f26db3576a7cf2b5bc3178f9ca/assistant_hypothesis_test.png" style="width: 420px; height: 252px;" /></p>
<p>The Assistant asks what we're trying to do, and gives us three options to choose from.</p>
<p style="margin-left: 40px;"><img alt="hypothesis test chooser" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/fba2ee28b10063e1c5f0f00eb77db1b2/assistant_hypothesis_test_chooser.png" style="width: 600px; height: 472px;" /></p>
<p>We know we want to compare a sample from Line A with a sample from Line B, but what if we can't remember which of the 5 available tests is the appropriate one in this situation? We can get guidance by clicking "Help Me Choose."</p>
<p style="margin-left: 40px;"><img alt="help me choose the right hypothesis test" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/51bb23fbb44603efff50fe4fa1d9dbd1/assistant_hypothesis_test_decision_tree.png" style="width: 700px; height: 551px;" /></p>
<p>The choices on the diagram direct us to the appropriate test. In this case, we choose continuous data instead of attribute (and even if we'd forgotten the difference, clicking on the diamond would explain it). We're comparing two means instead of two standard deviations, and we're measuring two different sets of items since our boxes came from different production lines. </p>
<p>Now we know what test to use, but suppose you want to make sure you don't miss anything that's important about the test, like requirements that must be met? Click the "more..." link and you'll get those details. </p>
<p style="margin-left: 40px;"><img alt="more info about the 2-Sample t-Test" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/1b4f09a2438b0aaef14e8da6564524cf/assistant_hypothesis_test_more_info.png" style="width: 700px; height: 526px;" /></p>
<p>Now we can proceed to the Assistant's dialog box. Again, statistical jargon is minimized and everything is put in straightforward language. We just need to answer a few questions, as shown. Note that the Assistant even lets us tell it how big a difference needs to be for us to consider it practically important. In this case, we'll enter 2 ounces.</p>
<p style="margin-left: 40px;"><img alt="Assistant 2-sample t-Test dialog" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/994d9172bf788282258f765d4d08aefa/assistant_hypothesis_test_dialog.png" style="width: 641px; height: 495px;" /></p>
<p>When we press OK, the Assistant performs the t-test and delivers three reports. The first of these is a summary report, which includes summary statistics, confidence intervals, histograms of both samples, and more. And interpreting the results couldn't be more straightforward than what we see in the top left quadrant of the diagram. In response to the question, "Do the means differ?" we can see the p-value of 0.029 marked on the bar, very far toward the "Yes" end of the scale. </p>
<p style="margin-left: 40px;"><img alt="2-Sample t-Test summary report" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/8927b8bc833551678715f68149dd18ad/assistant_hypothesis_test_summary.png" style="width: 700px; height: 526px;" /></p>
<p>Next is the Diagnostic Report, which provides additional information about the test. </p>
<p style="margin-left: 40px;"><img alt="2-Sample t-Test diagnostic report" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/6467a0be0ba60329f2be282e14b9be33/assistant_hypothesis_test_diagnostic.png" style="width: 700px; height: 526px;" /></p>
<p>In addition to letting us check for outliers, the diagnostic report shows us the size of the observed difference, as well as the chances that our test could detect a practically significant difference of 2 oz. </p>
<p>The final piece of output the Assistant provides is the report card, which flags any problems or concerns about the test that we would need to be aware of. In this case, all of the boxes are green and checked (instead of red and x'ed). </p>
<p style="margin-left: 40px;"><img alt="2-Sample t-Test report card" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/0e4cd0dce832a8251701f8175de9a037/assistant_hypothesis_test_report_card.png" style="width: 700px; height: 526px;" /></p>
<p>When you're not doing statistics all the time, the Assistant makes it a breeze to find the right analysis for your situation and to make sure you interpret your results the right way. Using it is a great way to make sure you're not attaching too much, or too little, importance to the results of your analyses.</p>
Hypothesis Testing, Statistics, Statistics Help, Stats
Wed, 22 Feb 2017 14:00:00 +0000
http://blog.minitab.com/blog/understanding-statistics/three-common-p-value-mistakes-youll-never-have-to-make
Eston Martz
Chi-Square Analysis: Powerful, Versatile, Statistically Objective
http://blog.minitab.com/blog/michelle-paret/chi-square-analysis-powerful-versatile-statistically-objective
<p style="line-height: 20.7999992370605px;">To make objective decisions about the processes that are critical to your organization, you often need to examine categorical data. You may know how to use a t-test or ANOVA when you’re comparing measurement data (like weight, length, <span style="line-height: 1.6;">revenue, </span><span style="line-height: 1.6;">and so on), but do you know how to compare attribute or counts data? It’s easy to do with <a href="http://www.minitab.com/products/minitab">statistical software</a> like Minitab. </span></p>
<p style="line-height: 20.7999992370605px;"><img alt="failures per production line" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/19b2bd8557279d21284a23e2174fef88/chisquare_onevariable_revision.jpg" style="line-height: 20.8px; width: 400px; height: 267px; float: right; margin: 10px 15px;" /></p>
<p style="line-height: 20.7999992370605px;">One person may look at this bar chart and decide that the production lines performed similarly<span style="line-height: 1.6;">. But another person may focus on the small difference between the bars and decide that one of the lines has outperformed the others. Without an appropriate statistical analysis, how can you know which person is right?</span></p>
<p style="line-height: 20.7999992370605px;">When time, money, and quality depend on your answers, you can’t rely on subjective visual assessments alone. To answer questions like these with statistical objectivity, you can use a Chi-Square analysis.</p>
Which Analysis Is Right for Me?
<p style="line-height: 20.7999992370605px;">Minitab offers three Chi-Square tests. The appropriate analysis depends on the number of variables that you want to examine. And for all three options, the data can be formatted either as raw data or summarized counts.</p>
<strong>Chi-Square Goodness-of-Fit Test – 1 Variable</strong>
<p style="line-height: 20.7999992370605px;">Use Minitab’s <strong>Stat > Tables > Chi-Square Goodness-of-Fit Test (One Variable)</strong> when you have just one variable.</p>
<p style="line-height: 20.7999992370605px;">The Chi-Square Goodness-of-Fit Test can test if the proportions for all groups are equal. It can also be used to test if the proportions for groups are equal to specific values. For example:</p>
<ul style="line-height: 20.7999992370605px;">
<li>A bottle cap manufacturer operates three production lines and records the number of defective caps for each line. The manufacturer uses the <strong>Chi-Square Goodness-of-Fit Test</strong> to determine if the proportion of defectives is equal across all three lines.</li>
<li>A bottle cap manufacturer operates three production lines and records the number of defective caps and the total number produced for each line. One line runs at high speed and produces twice as many caps as the other two lines that run at a slower speed. The manufacturer uses the <strong>Chi-Square Goodness-of-Fit Test</strong> to determine if the number of defective units for each line is proportional to the volume of caps it produces.</li>
</ul>
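<p style="line-height: 20.7999992370605px;">To show what the test computes, here is a minimal Python sketch of the goodness-of-fit statistic for both bottle cap scenarios. The defect counts are hypothetical. With three categories the test has 2 degrees of freedom, and for that one case the chi-square p-value has the simple closed form exp(-statistic / 2), so the sketch deliberately refuses any other number of categories.</p>

```python
import math


def chi_square_gof(observed, expected_proportions):
    """Chi-square goodness-of-fit statistic and p-value for exactly 3 categories."""
    if len(observed) != 3 or len(expected_proportions) != 3:
        raise ValueError("this sketch only handles 3 categories (2 degrees of freedom)")
    total = sum(observed)
    expected = [total * p for p in expected_proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.exp(-statistic / 2)  # exact only for 2 degrees of freedom
    return statistic, p_value


# Hypothetical defective-cap counts for lines 1, 2, and 3.
defects = [30, 18, 12]

# Scenario 1: are defects spread equally across the three lines?
equal_stat, equal_p = chi_square_gof(defects, [1 / 3, 1 / 3, 1 / 3])

# Scenario 2: line 1 makes twice as many caps, so expect a 2:1:1 split.
volume_stat, volume_p = chi_square_gof(defects, [0.5, 0.25, 0.25])

print(f"equal shares:  chi-square = {equal_stat:.2f}, p = {equal_p:.3f}")
print(f"by volume:     chi-square = {volume_stat:.2f}, p = {volume_p:.3f}")
```

<p style="line-height: 20.7999992370605px;">With these made-up counts, the same data look unusual under the "equal shares" hypothesis but entirely consistent with defects being proportional to volume, which is why choosing the right expected proportions matters.</p>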
<strong>Chi-Square Test for Association – 2 Variables</strong>
<p style="line-height: 20.7999992370605px;">Use Minitab’s <strong>Stat > Tables > Chi-Square Test for Association</strong> when you have two variables.</p>
<p style="line-height: 20.7999992370605px;">The Chi-Square Test for Association can tell you if there’s an association between two variables. In other words, it can test if two variables are independent or not. For example:</p>
<ul style="line-height: 20.7999992370605px;">
<li>A paint manufacturer operates two production lines across three shifts and records the number of defective units per line per shift. The manufacturer uses the <strong>Chi-Square Test for Association</strong> to determine if the percent defective is similar across all shifts and production lines. Or, are certain lines during certain shifts more prone to issues?<br />
<br />
<img alt="Defectives per line per shift" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/8f78b557ef93b1390b79b866787d5503/chisquare_twovariables_revision.jpg" style="width: 600px; height: 400px;" /><br />
<br />
</li>
<li>A call center randomly samples 100 incoming calls each day of the week for each of its three locations, for a total of 1500 calls. They then record the number of abandoned calls per location per day. The call center uses a Chi-Square Test to determine if there is any association between location and day of the week with respect to abandoned calls.</li>
</ul>
<p style="margin-left: 40px;"><img alt="call center data" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/e60774e6ddac893694e7b8a1a39a47b4/callcenterdata.jpg" style="width: 265px; height: 133px;" /><br />
</p>
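<p style="line-height: 20.7999992370605px;">The calculation behind a test for association can be sketched in a few lines of Python. The table below is hypothetical: rows are two production lines, columns are three shifts, and each cell is a count of defective units. Expected counts come from the row and column totals, on the assumption that line and shift are independent. A 2 x 3 table has 2 degrees of freedom, so the same exp(-statistic / 2) shortcut for the p-value applies; other table sizes would need a proper chi-square distribution function.</p>

```python
import math


def chi_square_association(table):
    """Chi-square statistic and degrees of freedom for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical defect counts: rows = Line 1, Line 2; columns = shifts 1, 2, 3.
defects = [
    [20, 30, 50],
    [30, 30, 40],
]

statistic, df = chi_square_association(defects)
p_value = math.exp(-statistic / 2)  # valid here only because df == 2
print(f"chi-square = {statistic:.2f}, df = {df}, p = {p_value:.3f}")
```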
<strong>Cross Tabulation and Chi-Square – 2 or more variables</strong>
<p style="line-height: 20.7999992370605px;">Use Minitab’s <strong>Stat > Tables > Cross Tabulation and Chi-Square </strong>when you have two or more variables.</p>
<p style="line-height: 20.7999992370605px;">If you simply want to test for associations between two variables, you can use either <strong>Cross Tabulation and Chi-Square</strong> or <strong>Chi-Square Test for Association</strong>. However, <span><a href="http://blog.minitab.com/blog/understanding-statistics/using-cross-tabulation-and-chi-square-the-survey-says">Cross Tabulation and Chi-Square</a></span> also lets you control for the effect of additional variables. Here’s an example:</p>
<ul style="line-height: 20.7999992370605px;">
<li>A tire manufacturer records the number of failed tires for four different tire sizes across two production lines and three shifts. The plant uses a Cross Tabulation and Chi-Square analysis to look for failure dependencies between the tire sizes and production lines, while controlling for any shift effect. Perhaps a particular production line for a certain tire size is more prone to failures, but only during the first shift.</li>
</ul>
<p style="line-height: 20.7999992370605px;">This analysis also offers advanced options. For example, if your categories are ordinal (good, better, best or small, medium, large) you can include a special test for concordance.</p>
Conducting a Chi-Square Analysis in Minitab
<p style="line-height: 20.7999992370605px;">Each of these analyses is easy to run in Minitab. For more examples that include step-by-step instructions, just navigate to the Chi-Square menu of your choice and then click Help > example.</p>
<p style="line-height: 20.7999992370605px;">It can be tempting to make subjective assessments about a given set of data, their makeup, and possible interdependencies, but why risk an error in judgment when you can be sure with a Chi-Square test?</p>
<p style="line-height: 20.7999992370605px;">Whether you’re interested in one variable, two variables, or more, a Chi-Square analysis can help you make a clear, statistically sound assessment.</p>
Data Analysis, Hypothesis Testing, Lean Six Sigma, Quality Improvement, Six Sigma, Statistics, Statistics Help
Fri, 17 Feb 2017 13:16:00 +0000
http://blog.minitab.com/blog/michelle-paret/chi-square-analysis-powerful-versatile-statistically-objective
Michelle Paret
A Field Guide to Statistical Distributions
http://blog.minitab.com/blog/statistics-in-the-field/a-field-guide-to-statistical-distributions
<p><em><span style="line-height: 1.6;">by Matthew Barsalou, guest blogger. </span></em></p>
<p>The old saying “if it walks like a duck, quacks like a duck and looks like a duck, then it must be a duck” may be appropriate in bird watching; however, the same idea can’t be applied when observing a statistical distribution. The dedicated ornithologist is often armed with binoculars and a field guide to the local birds and this should be sufficient. A statologist (I just made the word up, feel free to use it) on the other hand, is ill-equipped for the visual identification of his or her targets.</p>
Normal, Student's t, Chi-Square, and F Distributions
<p>Notice the upper two distributions in figure 1. The <span><a href="http://blog.minitab.com/blog/fun-with-statistics/normal-the-kevin-bacon-of-distributions">normal distribution</a></span> and Student’s t distribution may appear similar. However, the standard normal distribution has a single fixed shape, while the shape of <a href="http://blog.minitab.com/blog/michelle-paret/guinness-t-tests-and-proving-a-pint-really-does-taste-better-in-ireland">Student’s t distribution</a> depends on its degrees of freedom, n-1 for a sample of size n. This may appear to be a minor difference, but when n is small, Student’s t distribution has a lower peak and noticeably heavier tails. Student’s t distribution approaches the normal distribution as the sample size increases, but for any finite sample size it never exactly matches the shape of the normal distribution.</p>
<p>Observe the Chi-square and F distribution in the lower half of figure 1. The shapes of the distributions can vary and even the most astute observer will not be able to differentiate between them by eye. Many distributions can be sneaky like that. It is a part of their nature that we must accept as we can’t change it.</p>
<p align="center"><img alt="Distribution Field Guide Figure 1" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/b5c12365f066b6ca3d255bcd458314e1/distribution_field_guide_1.gif" style="width: 605px; height: 352px;" /><em><span style="line-height: 1.6;">Figure 1</span></em></p>
Binomial, Hypergeometric, Poisson, and Laplace Distributions
<p>Notice the distributions illustrated in figure 2. A bird watcher may suddenly encounter four birds sitting in a tree; a quick check of a reference book may help to determine that they are all of a different species. The same can’t always be said for statistical distributions. <a href="http://blog.minitab.com/blog/adventures-in-statistics/understanding-and-using-discrete-distributions">Observe the binomial distribution, hypergeometric distribution and Poisson distribution</a>. We can’t even be sure the three are not the same distribution. If they are together with a Laplace distribution, an observer may conclude “one of these does not appear to be the same as the others.” But they <em>are </em>all different, which our eyes alone may fail to tell us.</p>
<p align="center"><img alt="Distribution Field Guide Figure 2" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/b9011bf86767f49c3e7ec47c76d20631/distribution_field_guide_2.gif" style="width: 605px; height: 352px;" /><em><span style="line-height: 1.6;">Figure 2</span></em></p>
Weibull, Cauchy, Loglogistic, and Logistic Distributions
<p>Suppose we observe the four distributions in figure 3. What are they? Could you tell if they were not labeled? We must identify them correctly before we can do anything with them. One is a Weibull distribution, but all four could conceivably be various Weibull distributions. The shape of the Weibull distribution varies based upon the shape parameter (κ) and scale parameter (λ). The Weibull distribution is a useful, but potentially devious distribution that can be much like the double-barred finch, which may be mistaken for an owl upon first glance.</p>
<p align="center"><img alt="Distribution Field Guide Figure 3" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/2b606d88ff9ae159f94dcac04748c3e2/distribution_field_guide_3.gif" style="width: 605px; height: 351px;" /><em><span style="line-height: 1.6;">Figure 3</span></em></p>
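<p>To see how much the shape parameter changes the Weibull distribution, the Python sketch below evaluates the standard two-parameter Weibull density and its mode for a few shape values. The particular shape values are arbitrary choices for illustration and are not taken from the figures.</p>

```python
import math


def weibull_pdf(x, shape, scale=1.0):
    """Density of the two-parameter Weibull distribution for x > 0."""
    if x <= 0:
        raise ValueError("this sketch evaluates the density only for x > 0")
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))


def weibull_mode(shape, scale=1.0):
    """Location of the peak: at zero for shape <= 1, otherwise to the right of zero."""
    if shape <= 1:
        return 0.0
    return scale * ((shape - 1) / shape) ** (1 / shape)


for shape in (0.5, 1.0, 2.0, 3.5):
    print(f"shape = {shape}: mode at x = {weibull_mode(shape):.3f}, "
          f"density at x = 1 is {weibull_pdf(1.0, shape):.3f}")
```

<p>With a shape of 1 the curve is an exponential decay, while shapes well above 1 give a humped curve that is easy to mistake for a different distribution.</p>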
<p>Attempting to visually identify a statistical distribution can be very risky. Many distributions such as the Chi-Square and F distribution change shape drastically based on the number of degrees of freedom. Figure 4 shows various shapes for the Chi-Square, F distribution and the Weibull distribution. Figure 4 also compares a standard normal distribution with a standard deviation of one to a t distribution with 27 degrees of freedom; notice how the shapes overlap to the point where it is no longer possible to tell the two distributions apart.</p>
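<p>The overlap between the normal curve and a t distribution with 27 degrees of freedom can be quantified. The Python sketch below, which uses only the textbook density formulas, finds the largest vertical gap between the two curves on a grid from -5 to 5, and repeats the comparison for 3 degrees of freedom. The grid range and the choice of 3 as the small-sample case are assumptions made for illustration.</p>

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def max_gap(df, lo=-5.0, hi=5.0, steps=1000):
    """Largest absolute difference between the t and normal densities on a grid."""
    grid = (lo + i * (hi - lo) / steps for i in range(steps + 1))
    return max(abs(t_pdf(x, df) - normal_pdf(x)) for x in grid)


for df in (3, 27):
    print(f"df = {df}: peak height {t_pdf(0.0, df):.4f} "
          f"(normal: {normal_pdf(0.0):.4f}), largest gap {max_gap(df):.4f}")
```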
<p>Although there is no definitive Field Guide to Statistical Distributions to guide us, there are formulas available to correctly identify statistical distributions. We can also use <a href="http://www.minitab.com/products/minitab">Minitab Statistical Software</a> to identify our distribution.</p>
<p align="center"><img alt="Distribution Field Guide Figure 4" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/aa4be49733e980c8c7e26395c5e8262a/distribution_field_guide_4.gif" style="width: 605px; height: 351px;" /><em style="line-height: 1.6;">Figure 4</em></p>
<p>Go to <strong>Stat > Quality Tools > Individual Distribution Identification...</strong> and enter the column containing the data and the subgroup size. The results can be observed in either the session window (figure 5) or the graphical outputs shown in figures 6 through 9.</p>
<p>In this case, the 3-parameter Weibull distribution is a reasonable choice: its p-value of 0.364 is well above 0.05, so we cannot reject the hypothesis that the data follow that distribution.</p>
<p align="center"><img alt="Distribution Field Guide Figure 5" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/29448180c3ff01cae81cfaf250a60115/distribution_field_guide_5.gif" style="width: 547px; height: 739px;" /></p>
<p align="center"><em>Figure 5</em></p>
<p style="text-align: center;"><img alt="Distribution Field Guide Figure 6" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/781c7a83b14261ae062c63a07479b10d/distribution_field_guide_6.png" style="width: 576px; height: 384px;" /><em style="line-height: 1.6;">Figure 6</em></p>
<p style="text-align: center;"><img alt="Distribution Field Guide Figure 7" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/fcf5a7b56b859e6861ae8d96e8273fe1/distribution_field_guide_7.png" style="width: 576px; height: 384px;" /><em><span style="line-height: 1.6;">Figure 7</span></em></p>
<p style="text-align: center;"><em><img alt="Distribution Field Guide Figure 8" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/a13530fb7ec7ee8e3fe90143772eefbc/distribution_field_guide_8.png" style="width: 576px; height: 384px;" /><span style="line-height: 1.6;">Figure 8</span></em></p>
<p style="text-align: center;"><em><img alt="Distribution Field Guide Figure 9" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/6f28cb199afaee379ccc2244a955557f/distribution_field_guide_9.png" style="width: 576px; height: 384px;" /><span style="line-height: 1.6;">Figure 9</span></em></p>
<div>
<p style="line-height: 20.7999992370605px;"><strong>About the Guest Blogger</strong></p>
<p style="line-height: 20.7999992370605px;"><em><a href="https://www.linkedin.com/pub/matthew-barsalou/5b/539/198" target="_blank">Matthew Barsalou</a> is a statistical problem resolution Master Black Belt at <a href="http://www.3k-warner.de/" target="_blank">BorgWarner</a> Turbo Systems Engineering GmbH. He is a Smarter Solutions certified Lean Six Sigma Master Black Belt, ASQ-certified Six Sigma Black Belt, quality engineer, and quality technician, and a TÜV-certified quality manager, quality management representative, and auditor. He has a bachelor of science in industrial sciences, a master of liberal studies with emphasis in international business, and a master of science in business administration and engineering from the Wilhelm Büchner Hochschule in Darmstadt, Germany. He is author of the books <a href="http://www.amazon.com/Root-Cause-Analysis-Step---Step/dp/148225879X/ref=sr_1_1?ie=UTF8&qid=1416937278&sr=8-1&keywords=Root+Cause+Analysis%3A+A+Step-By-Step+Guide+to+Using+the+Right+Tool+at+the+Right+Time" target="_blank">Root Cause Analysis: A Step-By-Step Guide to Using the Right Tool at the Right Time</a>, <a href="http://asq.org/quality-press/display-item/index.html?item=H1472" target="_blank">Statistics for Six Sigma Black Belts</a> and <a href="http://asq.org/quality-press/display-item/index.html?item=H1473&xvl=76115763" target="_blank">The ASQ Pocket Guide to Statistics for Six Sigma Black Belts</a>.</em></p>
</div>
Fun Statistics, Statistics, Statistics Help, Stats
Fri, 10 Feb 2017 13:00:00 +0000
http://blog.minitab.com/blog/statistics-in-the-field/a-field-guide-to-statistical-distributions
Guest Blogger
Statistical Tools for Process Validation, Stage 2: Process Qualification
http://blog.minitab.com/blog/michelle-paret/statistical-tools-for-process-validation-stage-2-process-qualification
<p>In its industry guidance to companies that manufacture drugs and biological products for people and animals, the Food and Drug Administration (FDA) recommends three stages for process validation.<img alt="Process Validation Stages" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/26c294a2e9b5b993bfd0f571be11113d/processvalidationstages.jpg" style="width: 220px; height: 235px; margin: 10px 15px; float: right;" /> While my last post covered <a href="http://blog.minitab.com/blog/michelle-paret/statistical-tools-for-process-validation,-stage-1:-process-design">statistical tools for the Process Design stage</a>, here we will focus on the statistical techniques typically utilized for the second stage, Process Qualification.</p>
Stage 2: Process Qualification
<p>During this stage, the process design is evaluated to determine if it is capable of reproducible commercial manufacture. Successful completion of Stage 2 is necessary before commercial distribution.</p>
<span style="color:#008080;"><strong>Example: Evaluate Acceptance Criteria with Capability Analysis</strong></span>
<p>Suppose the active ingredient amount in a tranquilizer needs to be between 360 and 370 mg/mL and you need to assess the quality level, where a minimum Cpk of 1.33 is defined as the acceptance criterion. To assess process performance and determine if measurements are within specification, you can use capability analysis, available in <a href="http://www.minitab.com/products/minitab/">Minitab Statistical Software</a>.</p>
<p>Five samples are randomly selected from each of 50 batches and the amount of active ingredient is measured. The data are then analyzed relative to the 360 mg/mL minimum and 370 mg/mL maximum.</p>
<p style="margin-left: 40px;"><img alt="Process Capability" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/c48fee09e2caab2f6499c5e1ee74a867/processcapability.jpg" style="width: 400px; height: 300px; margin-left: 15px; margin-right: 15px;" /></p>
<p>The capability analysis reveals a Cpk of 0.53, which fails to meet the acceptance criterion of 1.33. The active ingredient amounts for this tranquilizer are not acceptable. So how can we improve the process? The <a href="http://blog.minitab.com/blog/michelle-paret/how-to-improve-cpk">Cp value</a> of 1.41 and the graph both reveal that, although the variability is acceptable with respect to the width of the specification limits, the process average needs to be shifted to a higher mg/mL in order to achieve an acceptable Cpk.</p>
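<p>The relationship between Cp and Cpk described above can be sketched in a few lines of Python. This is a minimal illustration using the textbook formulas, not Minitab's own calculation (which estimates within-subgroup variation from the data). The mean and standard deviation below are hypothetical values chosen only so that the results land near the Cp and Cpk reported in the example.</p>

```python
def cp(lsl, usl, sigma):
    """Potential capability: spec width relative to the 6-sigma process spread."""
    return (usl - lsl) / (6 * sigma)


def cpk(mean, sigma, lsl, usl):
    """Actual capability: distance from the mean to the nearer spec limit, in 3-sigma units."""
    return min(usl - mean, mean - lsl) / (3 * sigma)


LSL, USL = 360.0, 370.0        # specification limits from the example (mg/mL)
mean, sigma = 361.88, 1.182    # hypothetical values, not taken from the post's data set

print(f"Cp  = {cp(LSL, USL, sigma):.2f}")
print(f"Cpk = {cpk(mean, sigma, LSL, USL):.2f}")

# Centering the same process at the midpoint of the spec limits makes Cpk equal to Cp.
centered = cpk((LSL + USL) / 2, sigma, LSL, USL)
print(f"Cpk if centered at 365 = {centered:.2f}")
```

<p>Because Cp ignores where the process is centered, a Cp well above a low Cpk is the signature of an off-center process, which is the situation the example describes.</p>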
<span style="color:#008080;"><strong>Example: Conduct Variation Analysis across Batches</strong></span>
<p>Suppose we want to assess content uniformity, a critical quality characteristic, across 3 batches at 10 locations. To visualize the intra-batch (within-batch) variation and the inter-batch (between-batch) variation, we can create boxplots for each batch.</p>
<p>A boxplot can help us visually assess both the intra- and inter-batch variation, and identify any outliers. This specific graph shows a homogeneous dispersion of measurements both within each batch and between batches. And there are no <a href="http://blog.minitab.com/blog/michelle-paret/how-to-identify-outliers-and-get-rid-of-them">outliers</a>, which Minitab would flag with an asterisk (*). </p>
<p><img alt="Boxplot" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/f7f242d7f0f0ea0b793c91c4cf710ca8/boxplots.jpg" style="width: 400px; height: 267px; margin-left: 15px; margin-right: 15px;" /></p>
<p>Although boxplots are useful tools for visual assessment, we can also statistically assess whether the variation differs significantly between batches using an equal variances test. The test reveals a p-value greater than an alpha-level of 0.05 (or whatever alpha-level you prefer), so we fail to reject the null hypothesis of equal variances: the data provide no evidence of inconsistency between batches.</p>
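<p>To make the idea of an equal variances test concrete, here is a rough sketch of one common choice, the Brown-Forsythe (median-centered Levene) statistic, written with only the Python standard library. The post does not say which test produced its p-value, so treat the choice of test as an assumption, and note that the batch values below are hypothetical. The sketch stops at the test statistic W; a p-value would come from comparing W to an F distribution with (k - 1, N - k) degrees of freedom.</p>

```python
from statistics import mean, median


def brown_forsythe_statistic(groups):
    """Return the Brown-Forsythe W statistic for a list of groups of measurements."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every value from the median of its own group.
    z = [[abs(x - median(g)) for x in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical content-uniformity results for three batches (not the post's data).
batches = [
    [99.1, 100.4, 98.7, 101.2, 100.0],
    [100.3, 99.5, 101.0, 98.9, 100.6],
    [99.8, 100.9, 99.2, 100.1, 101.3],
]
print(f"W = {brown_forsythe_statistic(batches):.3f}")
```

<p>A small W indicates that the spread within each batch is similar; a large W relative to the F distribution suggests that at least one batch varies more than the others.</p>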
<span style="color:#008080;"><strong>Example: Various Applications for Tolerance Intervals</strong></span>
<p>Another useful tool for Process Qualification is the tolerance interval. This tool has multiple applications. For example, tolerance intervals can be used to compare your process to specifications, profile the outcome of a process, or establish acceptance criteria.</p>
<p>For a given product characteristic, a tolerance interval provides a range of values that likely covers a specified proportion of the population (for example, 95%) at a specified confidence level (like 99%).</p>
<p>For example, suppose we want to know how the active ingredient values in the manufacturing process compare to our specification limits. Based on a dose-response study, the limits are 360 to 370 mg/mL.</p>
<p><img alt="Tolerance Interval" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/9a8a69e60f00528975c7b40d52fb8206/toleranceinterval.jpg" style="width: 400px; height: 267px; margin-left: 15px; margin-right: 15px;" /></p>
<p>For this particular data set, Minitab reveals that we can be 99% confident that 95% of the units will be between 362.272 and 367.468 mg/mL. The process bounds therefore indicate that we can meet the requirements of 360 to 370, and we can conclude with high confidence that the process variation is less than the allowable variation, defined by the specification limits.</p>
<p>Or perhaps we need to assess content uniformity using 99% confidence and 99% coverage. We sample 30 tablets and calculate a tolerance interval, revealing that we can be 99% certain that 99% of the tablets will have a content uniformity within some range, calculated using Minitab.</p>
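<p>For readers curious about what sits behind a normal-based tolerance interval, the sketch below computes an approximate two-sided tolerance factor k, so that the interval is the sample mean plus or minus k standard deviations. It uses Howe's approximation for k together with the Wilson-Hilferty approximation for the chi-square quantile, so that only the Python standard library is needed. Both approximations are assumptions made for this illustration and are not necessarily the method Minitab uses; the function names are hypothetical.</p>

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def chi2_quantile_wh(prob, df):
    """Approximate chi-square quantile (Wilson-Hilferty cube-root approximation)."""
    z = NormalDist().inv_cdf(prob)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * sqrt(a)) ** 3


def tolerance_factor(n, coverage, confidence):
    """Approximate two-sided normal tolerance factor k (Howe's method)."""
    df = n - 1
    z = NormalDist().inv_cdf((1.0 + coverage) / 2.0)
    chi2 = chi2_quantile_wh(1.0 - confidence, df)
    return sqrt(df * (1.0 + 1.0 / n) * z * z / chi2)


def tolerance_interval(values, coverage=0.95, confidence=0.99):
    """Return (lower, upper) bounds assuming the values are normally distributed."""
    k = tolerance_factor(len(values), coverage, confidence)
    m, s = mean(values), stdev(values)
    return m - k * s, m + k * s


# Factor for 30 tablets with 99% coverage and 99% confidence.
print(f"k = {tolerance_factor(30, 0.99, 0.99):.3f}")
```

<p>The factor grows as the required coverage or confidence increases and shrinks as the sample size grows, which is why a 99%/99% interval from 30 tablets is noticeably wider than a 95%/99% interval from a larger sample.</p>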
<p>And that’s how you can use various statistical tools to support Process Qualification. In the final post in this series, we’ll explore the Continued Process Verification stage!</p>
Capability Analysis, Data Analysis, Quality Improvement, Statistics, Statistics Help, Stats
Fri, 03 Feb 2017 13:00:00 +0000
http://blog.minitab.com/blog/michelle-paret/statistical-tools-for-process-validation-stage-2-process-qualification
Michelle Paret
How to Use Data to Understand and Resolve Differences in Opinion, Part 3
http://blog.minitab.com/blog/understanding-statistics/how-to-use-data-to-understand-and-resolve-differences-in-opinion-part-3
<p>In the first part of this series, we saw how <a href="http://blog.minitab.com/blog/understanding-statistics/how-to-use-data-to-understand-and-resolve-differences-in-opinion-part-1">conflicting opinions about a subjective factor</a> can create business problems. In part 2, we used Minitab's Assistant feature to <a href="http://blog.minitab.com/blog/understanding-statistics/how-to-use-data-to-understand-and-resolve-differences-in-opinion-part-2">set up an attribute agreement analysis study</a> that will provide a better understanding of where and when such disagreements occur. </p>
<p>We asked four loan application reviewers to reject or approve 30 selected applications, two times apiece. Now that we've collected that data, we can analyze it. If you'd like to follow along, you can download the data set <a href="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/6999a9517a572ff2fc3df681c36b3e44/loan_application_attribute_agreement_analysis.mtw">here</a>.</p>
<p>As is so often the case, you don't need statistical software to do this analysis—but with 240 data points to contend with, a computer and software such as <a href="http://www.minitab.com/products/minitab">Minitab</a> will make it much easier. </p>
Entering the Attribute Agreement Analysis Study Data
<p>Last time, we showed that the only data we need to record is whether each appraiser approved or rejected the sample application in each case. Using the data collection forms and the worksheet generated by Minitab, it's very easy to fill in the Results column of the worksheet. </p>
<p style="margin-left: 40px;"><img alt="attribute agreement analysis worksheet data entry" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/41387399781177418d2cc236755a4f41/attribute_agreement_worksheet_data_entry.png" style="width: 448px; height: 324px;" /></p>
Analyzing the Attribute Agreement Analysis Data
<p>The next step is to use statistics to better understand how well the reviewers agree with each other's assessments, and how consistently they judge the same application when they evaluate it again. Choose <strong>Assistant > Measurement Systems Analysis (MSA)...</strong> and press the <em>Attribute Agreement Analysis</em> button to bring up the appropriate dialog box: </p>
<p style="margin-left: 40px;"><img alt="attribute agreement analysis assistant selection" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/36dfe0d806a026f66083efd4e4e8e3be/assistant_msa_dialog.png" style="width: 500px; height: 393px;" /></p>
<p>The resulting dialog couldn't be easier to fill out. Assuming you used the Assistant to create your worksheet, just select the columns that correspond to each item in the dialog box, as shown: </p>
<p style="margin-left: 40px;"><img alt="attribute agreement analysis dialog box" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/a3b70ac28bd2783a6414e6f3ed6583ae/attribute_agreement_analysis_dialog.png" style="width: 500px; height: 285px;" /></p>
<p>If you set up your worksheet manually, or renamed the columns, just choose the appropriate column for each item. Select the value for good or acceptable items—"Accept," in this case—then press OK to analyze the data. </p>
Interpreting the Results of the Attribute Agreement Analysis
<p>Minitab's Assistant generates four reports as part of its attribute agreement analysis. The first is a summary report, shown below: </p>
<p style="margin-left: 40px;"><img alt="attribute agreement analysis summary report" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/b7d0dab58d734d058f6a34830d81f0af/attribute_agreement_analysis_summary_report.png" style="width: 600px; height: 471px;" /></p>
<p>The green bar at top left of the report indicates that overall, the error rate of the application reviewers is 15.8%. That's not as bad as it could be, but it certainly indicates that there's room for improvement! The report also shows that 13% of the time, the reviewers rejected applications that should be accepted, and they accepted applications that should be rejected 18% of the time. In addition, the reviewers rated the same item two different ways almost 22% of the time.</p>
<p>The bar graph in the lower left indicates that Javier and Julia have the lowest accuracy percentages among the reviewers at 71.7% and 78.3%, respectively. Jim has the highest accuracy, with 96%, followed by Jill at 90%.</p>
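<p>The arithmetic behind these percentages is simple enough to check by hand. In the sketch below, each reviewer made 60 ratings (30 applications, two trials). The counts of correct ratings are back-calculated from the percentages reported above; they are an assumption for illustration and are not read from the data set.</p>

```python
# Each reviewer rated 30 applications twice.
RATINGS_PER_REVIEWER = 30 * 2

# Correct ratings per reviewer: assumed counts, back-calculated from the reported percentages.
correct = {"Javier": 43, "Julia": 47, "Jill": 54, "Jim": 58}

accuracy = {name: 100 * c / RATINGS_PER_REVIEWER for name, c in correct.items()}
total_ratings = RATINGS_PER_REVIEWER * len(correct)  # 240 data points in all
overall_error = 100 * (total_ratings - sum(correct.values())) / total_ratings

for name, pct in accuracy.items():
    print(f"{name}: {pct:.1f}% accurate")
print(f"Overall error rate: {overall_error:.1f}%")
```

<p>Under these assumed counts, 38 of the 240 ratings disagree with the standard, which matches the 15.8% overall error rate in the summary report.</p>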
<p>The second report from the Assistant, shown below, provides a graphic summary of the accuracy rates for the analysis.</p>
<p style="margin-left: 40px;"><img alt="attribute agreement analysis accuracy report" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/e0797dc24089c1624fdc9cd4f881b391/attribute_agreement_analysis_accuracy_report.png" style="width: 600px; height: 471px;" /></p>
<p>This report illustrates the 95% confidence intervals for each reviewer in the top left, and further breaks them down by standard (accept or reject) in the graphs on the right side of the report. Reviewers whose intervals don't overlap are likely to differ in accuracy. We can see that Javier and Jim have different overall accuracy percentages. In addition, Javier and Jim have different accuracy percentages when it comes to assessing those applications that should be rejected. However, most of the other confidence intervals overlap, suggesting that the reviewers share similar abilities. Javier clearly has the most room for improvement, but none of the reviewers are performing terribly when compared to the others. </p>
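<p>The overlap check can be sketched with a confidence interval for a proportion. The example below uses the Wilson score interval because it needs only the standard library; the post does not say which interval method the Assistant uses, so the bounds here will not necessarily match the report. The counts of correct ratings (43 and 58 out of 60) are assumptions back-calculated from the reported accuracy percentages.</p>

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def intervals_overlap(a, b):
    """True when two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


javier = wilson_interval(43, 60)  # assumed count of correct ratings
jim = wilson_interval(58, 60)     # assumed count of correct ratings

print("Javier:", tuple(round(100 * x, 1) for x in javier))
print("Jim:   ", tuple(round(100 * x, 1) for x in jim))
print("Overlap:", intervals_overlap(javier, jim))
```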
<p>The Assistant's third report shows the most frequently misclassified items, and individual reviewers' misclassification rates:</p>
<p style="margin-left: 40px;"><img alt="attribute agreement analysis misclassification report" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/42c982c39fb1eb3835fe1382d4ccc1f0/attribute_agreement_analysis_misclassification_report.png" style="width: 600px; height: 471px;" /></p>
<p>This report shows that App 9 gave the reviewers the most difficulty, as it was misclassified almost 80% of the time. (A check of the application revealed that this was indeed a borderline application, so the fact that it proved challenging is not surprising.) Among the reject applications that were mistakenly accepted, App 5 was misclassified about half of the time. </p>
<p>The individual appraiser misclassification graphs show that Javier and Julia both misclassified acceptable applications as rejects about 20% of the time, but Javier accepted "reject" applications nearly 40% of the time, compared to roughly 20% for Julia. However, Julia rated items both ways nearly 40% of the time, compared to 30% for Javier. </p>
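<p>The three rates in these graphs can be computed directly from one appraiser's ratings. The sketch below is an illustration under stated assumptions: the record layout (standard, first rating, second rating), the function name, and the sample records are all hypothetical, and the denominators follow the plain-language descriptions above rather than any documented Minitab formula.</p>

```python
def misclassification_rates(records):
    """Compute percentage rates for one appraiser.

    records: list of (standard, trial1, trial2) tuples, where each value
    is "Accept" or "Reject".
    """
    accept_total = reject_total = false_reject = false_accept = mixed = 0
    for standard, *trials in records:
        # An item is "rated both ways" when the appraiser's ratings disagree across trials.
        if len(set(trials)) > 1:
            mixed += 1
        for rating in trials:
            if standard == "Accept":
                accept_total += 1
                false_reject += rating == "Reject"
            else:
                reject_total += 1
                false_accept += rating == "Accept"
    return {
        "accept_rated_reject": 100 * false_reject / accept_total,
        "reject_rated_accept": 100 * false_accept / reject_total,
        "rated_both_ways": 100 * mixed / len(records),
    }


# Hypothetical ratings for five applications.
sample = [
    ("Accept", "Accept", "Accept"),
    ("Accept", "Accept", "Reject"),
    ("Reject", "Reject", "Reject"),
    ("Reject", "Accept", "Reject"),
    ("Reject", "Accept", "Accept"),
]
print(misclassification_rates(sample))
```

<p>Note that the first two rates are computed per rating, while the third is computed per item, since consistency can only be judged by comparing an appraiser's trials on the same application.</p>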
<p>The last item produced as part of the Assistant's analysis is the report card:</p>
<p style="margin-left: 40px;"><img alt="attribute agreement analysis report card" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/01e31b93979be4847836cfa28fc176bc/attribute_agreement_analysis_report_card.png" style="width: 600px; height: 471px;" /></p>
<p>This report card provides general information about the analysis, including how accuracy percentages are calculated. It also can alert you to potential problems with your analysis (for instance, an imbalance between the numbers of acceptable and rejectable items being evaluated); in this case, there are no alerts we need to be concerned about. </p>
Moving Forward from the Attribute Agreement Analysis
<p>The results of this attribute agreement analysis give the bank a clear indication of how the reviewers can improve their overall accuracy. Based on the results, the loan department provided additional training for Javier and Julia (who also were the least experienced reviewers on the team), and also conducted a general review session for all of the reviewers to refresh their understanding about which factors on an application were most important. </p>
<p>However, training may not always solve problems with inconsistent assessments. In many cases, the criteria on which decisions should be based are either unclear or nonexistent. "Use your common sense" is not a defined guideline! In this case, the loan officers decided to create very specific checklists that the reviewers could refer to when they encountered borderline cases. </p>
<p>After the additional training sessions were complete and the new tools were implemented, the bank conducted a second attribute agreement analysis, which verified improvements in the reviewers' accuracy. </p>
<p>If your organization is challenged by honest disagreements over "judgment calls," an attribute agreement analysis may be just the tool you need to get everyone back on the same page. </p>
Data Analysis, Lean Six Sigma, Quality Improvement, Six Sigma, Statistics
Mon, 30 Jan 2017 13:04:00 +0000
http://blog.minitab.com/blog/understanding-statistics/how-to-use-data-to-understand-and-resolve-differences-in-opinion-part-3
Eston Martz
How to Use Data to Understand and Resolve Differences in Opinion, Part 2
http://blog.minitab.com/blog/understanding-statistics/how-to-use-data-to-understand-and-resolve-differences-in-opinion-part-2
<p>Previously, I discussed how business <a href="http://blog.minitab.com/blog/understanding-statistics/how-to-use-data-to-understand-and-resolve-differences-in-opinion-part-1">problems arise when people have conflicting opinions about a subjective factor</a>, such as whether something is the right color or not, or whether a job applicant is qualified for a position. The key to resolving such honest disagreements and handling future decisions more consistently is a statistical tool called attribute agreement analysis. In this post, we'll cover how to set up and conduct an attribute agreement analysis. </p>
Does This Applicant Qualify, or Not?
<p>A busy loan office for a major financial institution processed many applications each day. A team of four reviewers inspected each application and categorized it as Approved, in which case it went on to a loan officer for further handling, or Rejected, in which case the applicant received a polite note declining to fulfill the request. <img alt="filling out an application" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/64918ddbce0abb888c85031374765491/filling_out_paper.png" style="width: 300px; height: 223px; margin: 10px 15px; float: right; border-width: 1px; border-style: solid;" /></p>
<p>The loan officers began noticing inconsistency in approved applications, so the bank decided to conduct an attribute agreement analysis on the application reviewers.</p>
<p>Two outcomes were possible: </p>
<p style="margin-left: 40px;"><strong>1. The reviewers make the right choice most of the time.</strong> If this is the case, loan officers can be confident that the reviewers do a good job, rejecting risky applicants and approving applicants with potential to be good borrowers. </p>
<p style="margin-left: 40px;"><strong>2. The reviewers too often choose incorrectly.</strong> In this case, the loan officers might not be focusing their time on the best applications, and some people who may be qualified may be rejected incorrectly. </p>
<p>One particularly useful thing about an attribute agreement analysis: even if reviewers make the wrong choice too often, the results will indicate where the reviewers make mistakes. The bank can then use that information to help improve the reviewers' performance. </p>
The Basic Structure of an Attribute Agreement Analysis
<p>A typical attribute agreement analysis asks individual appraisers to evaluate multiple samples, which have been selected to reflect the range of variation they are likely to observe. The appraisers review each sample item several times, so the analysis reveals not only how well individual appraisers agree with each other, but also how consistently each appraiser evaluates the same item. </p>
<p>For this study, the loan officers selected 30 applications, half of which the officers agreed should receive approval and half of which should be rejected. These included both obvious and borderline applications. </p>
<p>Next, each of the four reviewers was asked to approve or reject the 30 applications two times. These evaluation sessions took place one week apart, to make it less likely they would remember how they'd classified them the first time. The applications were randomly ordered each time.</p>
<p>The reviewers did not know how the applications had been rated by the loan officers. In addition, they were asked not to talk about the applications until after the analysis was complete, to avoid biasing one another. </p>
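<p>The study design above (four reviewers, 30 applications, two trials, with the applications randomly reordered each time) can be sketched as a small worksheet generator. This is only an illustration of the randomization idea, not a reproduction of the worksheet Minitab creates; the function name and the exact column layout are assumptions.</p>

```python
import random


def build_worksheet(appraisers, items, trials=2, seed=None):
    """Return one row per appraiser, trial, and item, shuffled within each session."""
    rng = random.Random(seed)
    rows = []
    for trial in range(1, trials + 1):
        for appraiser in appraisers:
            order = list(items)
            rng.shuffle(order)  # fresh random order for every evaluation session
            rows.extend(
                {"Appraiser": appraiser, "Trial": trial, "Test Items": item, "Results": ""}
                for item in order
            )
    return rows


appraisers = ["Javier", "Julia", "Jill", "Jim"]
items = [f"App {i}" for i in range(1, 31)]
worksheet = build_worksheet(appraisers, items, trials=2, seed=1)
print(len(worksheet), "rows to fill in")
```

<p>Passing a seed makes the random order reproducible, which is convenient if the data collection forms need to be regenerated later.</p>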
Using Software to Set Up the Attribute Agreement Analysis
<p>You don't <em>need </em>to use software to perform an Attribute Agreement Analysis, but a program like <a href="http://www.minitab.com/products/minitab">Minitab</a> makes it easier to plan the study, gather the data, and analyze the data after you have it. There are two ways to set up your study in Minitab. </p>
<p>The first way is to go to <strong>Stat > Quality Tools > Create Attribute Agreement Analysis Worksheet...</strong> as shown here: </p>
<p style="margin-left: 40px;"><img alt="create attribute agreement analysis worksheet" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/3925c8ebdb8bb03a73638f78dfbc0c3d/attribute_agreement_stat_menu.png" style="width: 510px; height: 495px;" /></p>
<p>This option calls up an easy-to-follow dialog box that will set up your study, randomize the order of reviewer evaluations, and permit you to print out data collection forms for each evaluation session. </p>
<p>But it's even easier to use Minitab's Assistant. In the menu, select <strong>Assistant > Measurement Systems Analysis...</strong>, then click the <em>Attribute Agreement Worksheet</em> button:</p>
<p style="margin-left: 40px;"><img alt="Assistant MSA Dialog" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/78f07ea29a7339b689361b91f602de45/assistant_msa_dialog1.png" style="width: 500px; height: 393px;" /></p>
<p>That brings up the following dialog box, which walks you through setting up your worksheet and printing out data collection forms, if desired. For this analysis, the Assistant dialog box is filled out as shown here: </p>
<p style="margin-left: 40px;"><img alt="Create Attribute Agreement Analysis Worksheet" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/bae016bb52b30ab527fc82f471ae056f/attribute_agreement_setup_dialog.png" style="width: 500px; height: 492px;" /></p>
<p>After you press OK, Minitab creates a worksheet for you and gives you the option to print out data collection forms for each reviewer and each trial. As you can see in the "Test Items" column below, Minitab randomizes the order of the observed items in each trial automatically, and the worksheet is arranged so you need only enter the reviewers' judgments in the "Results" column. </p>
<p style="margin-left: 40px;"><img alt="attribute agreement analysis worksheet" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/697bab496ec1eebf6dfb10ba4a27b15f/attribute_worksheet.png" style="width: 451px; height: 475px;" /></p>
<p>In my next post, we'll analyze the data collected in this attribute agreement analysis. </p>
Data Analysis, Lean Six Sigma, Quality Improvement, Six Sigma, Statistics
Mon, 23 Jan 2017 13:03:00 +0000
http://blog.minitab.com/blog/understanding-statistics/how-to-use-data-to-understand-and-resolve-differences-in-opinion-part-2
Eston Martz
DMAIC Tools and Techniques: The Measure Phase
http://blog.minitab.com/blog/michelle-paret/dmaic-tools-and-techniques%3A-the-measure-phase
<p>In my last post on <a href="http://blog.minitab.com/blog/michelle-paret/dmaic-tools-and-techniques:-the-define-phase">DMAIC tools for the Define phase</a>, we reviewed various graphs and stats typically used to <em>define</em> project goals and customer deliverables. Let’s now move along to the tools you can use in <a href="http://www.minitab.com/products/minitab/">Minitab Statistical Software</a> to conduct the Measure phase.</p>
Measure Phase Methodology
<p>The goal of this phase is to <em>measure</em> the process to determine its current performance and quantify the problem. This includes validating the measurement system and establishing a baseline process capability (i.e., sigma level).</p>
I. Tools for Continuous Data
<strong><img alt="Gage RandR" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/6ff6ed7f4c0940a9eb1a548487b72b2b/gagerr.jpg" style="width: 350px; height: 263px; float: right; margin: 10px 15px;" /></strong>
Gage R&R
<p>Before you analyze your data, you should first make sure you can trust it, which is why successful Lean Six Sigma projects begin the Measure phase with Gage R&R. This measurement systems analysis tool assesses if measurements are both <a href="http://blog.minitab.com/blog/michelle-paret/do-you-know-the-truth-about-gage-repeatability-and-reproducibility">repeatable and reproducible</a>. And there are Gage R&R studies available in Minitab for both <a href="http://blog.minitab.com/blog/michelle-paret/a-simple-guide-to-gage-randr-for-destructive-testing">destructive and non-destructive tests</a>.</p>
<p>Minitab location:<strong> </strong><strong><em>Stat > Quality Tools > Gage Study > Gage R&R Study</em></strong> OR <strong><em>Assistant > Measurement Systems Analysis</em>.</strong></p>
Gage Linearity and Bias
<p>When assessing the validity of our data, we need to consider both <a href="http://blog.minitab.com/blog/real-world-quality-improvement/accuracy-vs-precision-whats-the-difference">precision and accuracy</a>. While Gage R&R assesses precision, it’s Gage Linearity and Bias that tells us if our measurements are accurate or are biased.</p>
<p>Minitab location: <em><strong>Stat > Quality Tools > Gage Study > Gage Linearity and Bias Study</strong>.</em></p>
<p style="margin-left: 40px;"><img alt="Gage Linearity and Bias" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/85a16583e2d97dd638b6ff21071a61dd/gage_linearity_and_bias.jpg" style="width: 350px; height: 263px;" /></p>
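<p>As a quick illustration of what "bias" means here: it is the difference between the average of repeated measurements and a known reference value. The sketch below shows that calculation only (a linearity study would go on to examine how bias changes across the measurement range). The function name, reference value, and readings are hypothetical.</p>

```python
from statistics import mean


def gage_bias(measurements, reference):
    """Bias: average measured value minus the known reference value."""
    return mean(measurements) - reference


# Hypothetical repeated readings of a part whose reference value is 10.00.
readings = [10.02, 10.05, 9.98, 10.04, 10.01]
bias = gage_bias(readings, reference=10.00)
print(f"Bias = {bias:+.3f}")
```

<p>A precise gage can still be biased: the readings above are tightly grouped, yet on average they sit slightly above the reference value.</p>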
Distribution Identification
<p>Many statistical tools and p-values assume that your data follow a specific distribution, commonly the normal distribution, so it’s good practice to assess the distribution of your data before analyzing it. And if your data don’t follow a normal distribution, do not fear as there are various <a href="http://www.minitab.com/en-us/lp/Non-Normal-Data-Tips-And-Tricks">techniques for analyzing non-normal data</a>.</p>
<p>Minitab location: <strong><em>Stat > Basic Statistics > Normality Test</em></strong> OR <strong><em>Stat > Quality Tools > Individual Distribution Identification.</em></strong></p>
<p style="margin-left: 40px;"><img alt="Distribution Identification" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/1e6b3763f36f991cf5cf1eb142b0f8d0/distribution_id_plot.jpg" style="width: 350px; height: 233px;" /></p>
Capability Analysis
<p>Capability analysis is arguably the crux of “Six Sigma” because it’s the tool for calculating your sigma level. Is your process at a 1 Sigma, 2 Sigma, etc.? It reveals just how good or bad a process is relative to specification limit(s). And in the Measure phase, it’s important to use this tool to establish a baseline before making any improvements.</p>
<p>Minitab location: <strong><em>Stat > Quality Tools > Capability Analysis/Sixpack</em><em> </em></strong>OR <strong><em>Assistant > Capability Analysis.</em></strong></p>
<p style="margin-left: 40px;"><img alt="Process Capability Analysis" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/7f8e9183ad3a5b3ee66e0fadca51aea4/process_capability_sixpack_report.jpg" style="width: 350px; height: 263px;" /></p>
II. Tools for Categorical (Attribute) Data
Attribute Agreement Analysis
<strong><img alt="Attribute Agreement Analysis" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/5d1759e9ef4da886e677bb7c2a7b2c79/attribute_agreement_analysis.jpg" style="width: 300px; height: 233px; float: right; margin: 10px 15px;" /></strong>
<p>Like Gage R&R and Gage Linearity and Bias studies mentioned above for continuous measurements, this tool helps you <a href="http://blog.minitab.com/blog/statistics-and-quality-data-analysis/the-lady-tasting-beer-evaluating-a-gono-go-gage-part-ii">assess if you can trust categorical measurements</a>, such as pass/fail ratings. This tool is available for <a href="http://blog.minitab.com/blog/adventures-in-statistics-2/understanding-and-using-discrete-distributions">binary, ordinal, and nominal data types</a>.</p>
<p>Minitab location: <strong><em>Stat > Quality Tools > Attribute Agreement Analysis</em> </strong>OR <strong><em>Assistant > Measurement Systems Analysis.</em></strong></p>
Capability Analysis (Binomial and Poisson)
<p>If you’re counting the number of defective items, where each item is classified as either pass/fail, go/no-go, etc., and you want to compute parts per million (PPM) defective, then you can use binomial capability analysis to assess the current state of the process.</p>
<p>Or if you’re counting the number of defects, where each item can have multiple flaws, then you can use Poisson capability analysis to establish your baseline performance.</p>
<p>Minitab location:<em> <strong>Stat > Quality Tools > Capability Analysis</strong></em> OR <strong><em>Assistant > Capability Analysis.</em></strong></p>
<p style="margin-left: 40px;"><img alt="Binomial Process Capability" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/4aad5a79836d8105d3adba60388b16b1/binomial_process_capability.jpg" style="width: 350px; height: 263px;" /></p>
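<p>The two baseline figures mentioned above reduce to simple ratios. The sketch below shows parts per million (PPM) defective for pass/fail data and defects per unit (DPU) for defect counts. The function names and the counts are hypothetical, and a full capability analysis would also check process stability and report confidence intervals.</p>

```python
def ppm_defective(defectives, inspected):
    """Binomial case: defective items per million items inspected."""
    return defectives / inspected * 1_000_000


def defects_per_unit(defects, units):
    """Poisson case: average number of defects found on each unit."""
    return defects / units


# Hypothetical baseline counts.
print(f"PPM defective: {ppm_defective(12, 4000):.0f}")
print(f"Defects per unit: {defects_per_unit(30, 200):.2f}")
```

<p>The distinction matters because one defective item can carry several defects: PPM defective counts items, while DPU counts flaws.</p>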
Variation is Everywhere
<p>As I mentioned in my last post on the Define phase, Six Sigma projects can vary. Not every project uses the same tool set, so the tools above merely serve as a guide to the types of analyses you may need to use. And there are other tools to consider, such as flowcharts to map the process, which you can complete using Minitab’s cousin, <a href="http://www.minitab.com/products/quality-companion/">Quality Companion</a>.</p>
Capability Analysis, Data Analysis, Lean Six Sigma, Project Tools, Quality Improvement, Six Sigma, Statistics, Stats
Wed, 18 Jan 2017 13:00:00 +0000
http://blog.minitab.com/blog/michelle-paret/dmaic-tools-and-techniques%3A-the-measure-phase
Michelle Paret
How to Use Data to Understand and Resolve Differences in Opinion, Part 1
http://blog.minitab.com/blog/understanding-statistics/how-to-use-data-to-understand-and-resolve-differences-in-opinion-part-1
<p>People frequently have different opinions. Usually that's fine—if everybody thought the same way, life would be pretty boring—but many business decisions are based on opinion. And when different people in an organization reach different conclusions about the same business situation, problems follow. </p>
<img alt="difference of opinion" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/ad85e799b88c440d589cfc6b82caef8f/honest_disagreement.png" style="width: 300px; height: 200px; margin: 10px 15px; float: right;" />
<div>
<p>Inconsistency and poor quality result when people being asked to make yes / no, pass / fail, and similar decisions don't share the same opinions, or base their decisions on divergent standards. Consider the following examples. </p>
<p style="margin-left: 40px;"><strong>Manufacturing:</strong> Is this part acceptable? </p>
<p style="margin-left: 40px;"><strong>Billing and Purchasing:</strong> Are we paying or charging an appropriate amount for this project? </p>
<p style="margin-left: 40px;"><strong>Lending:</strong> Does this person qualify for a new credit line? </p>
<p style="margin-left: 40px;"><strong>Supervising:</strong> Is this employee's performance satisfactory or unsatisfactory? </p>
<p style="margin-left: 40px;"><strong>Teaching:</strong> Are essays being graded consistently by teaching assistants?</p>
<p>It's easy to see how differences in judgment can have serious impacts. I wrote about a situation encountered by the recreational equipment manufacturer <a href="http://www.minitab.com/burley">Burley</a>. Pass/fail decisions of inspectors at a manufacturing facility in China began to conflict with those of inspectors at Burley's U.S. headquarters. To make sure no products reached the market unless the company's strict quality standards were met, Burley acted quickly to ensure that inspectors at both facilities were making consistent decisions about quality evaluations. </p>
Sometimes We <em>Can't </em>Just Agree to Disagree
<p>The challenge is that people can have honest differences of opinion about, well, nearly everything—including different aspects of quality. So how do you get people to make business decisions based on a common viewpoint, or standard?</p>
<p>Fortunately, there's a statistical tool that can help businesses and other organizations figure out how, where, and why people evaluate the same thing in different ways. From there, problematic inconsistencies can be minimized. Also, inspectors and others who need to make tough judgment calls can be confident they are basing their decisions on a clearly defined, agreed-upon set of standards. </p>
<p>That statistical tool is called "Attribute Agreement Analysis," and using it is easier than you might think—especially with <a href="http://www.minitab.com/products/minitab">data analysis software such as Minitab</a>. </p>
What Does "Attribute Agreement Analysis" Mean?
<p>Statistical terms can be confusing, but "attribute agreement analysis" is exactly what it sounds like: a tool that helps you gather and <em>analyze </em>data about how much <em>agreement </em>individuals have on a given <em>attribute</em>.</p>
<p>So, what is an attribute? Basically, any characteristic that entails a <span><a href="http://blog.minitab.com/blog/understanding-statistics/got-good-judgment-prove-it-with-attribute-agreement-analysis">judgment call</a></span>, or requires us to classify items as <em>this </em>or <em>that</em>. We can't measure an attribute with an objective scale like a ruler or thermometer. The following statements concern such attributes:</p>
<ul>
<li>This soup is <strong><a href="http://www.minitab.com/products/minitab/quick-start/soup/">spicy</a></strong>.</li>
<li>The bill for that repair is <strong>low</strong>. </li>
<li>That dress is <strong>red</strong>. </li>
<li>The carpet is <strong>rough</strong>. </li>
<li>That part is <strong>acceptable</strong>. </li>
<li>This candidate is <strong>unqualified</strong>. </li>
</ul>
<p>Attribute agreement analysis uses data to understand how different people assess a particular item's attribute and how consistently the same person assesses the same item on multiple occasions, and it compares both to the "right" assessment. </p>
<img alt="pass-fail" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/5868c6194234ef965e9320d10c7dab9e/pass_fail.png" style="width: 252px; height: 204px; margin: 10px 15px; float: right;" />
<p>This method can be applied to any situation where people need to appraise or rate things. In a typical quality improvement scenario, you might take a number of manufactured parts and ask multiple inspectors to assess each part more than once. The parts being inspected should include a roughly equal mix of good and bad items, which have been identified by an expert such as a senior inspector or supervisor. </p>
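<p>To make the three comparisons concrete, here is a minimal Python sketch that computes simple percent agreement. The ratings, the appraiser names, and the function names are all made up for illustration; this is not Minitab's output, and a full analysis would typically also report kappa statistics.</p>

```python
from collections import defaultdict

# Hypothetical pass/fail ratings, keyed by (appraiser, part, trial).
ratings = {
    ("A", 1, 1): "pass", ("A", 1, 2): "pass",
    ("A", 2, 1): "fail", ("A", 2, 2): "fail",
    ("A", 3, 1): "pass", ("A", 3, 2): "fail",
    ("A", 4, 1): "fail", ("A", 4, 2): "fail",
    ("B", 1, 1): "pass", ("B", 1, 2): "pass",
    ("B", 2, 1): "pass", ("B", 2, 2): "pass",
    ("B", 3, 1): "pass", ("B", 3, 2): "pass",
    ("B", 4, 1): "fail", ("B", 4, 2): "fail",
}
# Hypothetical "right" assessment from an expert, keyed by part.
standard = {1: "pass", 2: "fail", 3: "pass", 4: "fail"}


def _group(ratings, key):
    """Collect ratings into lists keyed by key(appraiser, part)."""
    groups = defaultdict(list)
    for (appraiser, part, _trial), rating in ratings.items():
        groups[key(appraiser, part)].append(rating)
    return groups


def within_appraiser(ratings):
    """Share of parts each appraiser rated the same way on every trial."""
    hits, totals = defaultdict(int), defaultdict(int)
    for (appraiser, _part), rs in _group(ratings, lambda a, p: (a, p)).items():
        totals[appraiser] += 1
        hits[appraiser] += len(set(rs)) == 1
    return {a: hits[a] / totals[a] for a in totals}


def versus_standard(ratings, standard):
    """Share of parts where all of an appraiser's trials match the standard."""
    hits, totals = defaultdict(int), defaultdict(int)
    for (appraiser, part), rs in _group(ratings, lambda a, p: (a, p)).items():
        totals[appraiser] += 1
        hits[appraiser] += all(r == standard[part] for r in rs)
    return {a: hits[a] / totals[a] for a in totals}


def between_appraisers(ratings):
    """Share of parts on which every appraiser agreed on every trial."""
    groups = _group(ratings, lambda a, p: p)
    return sum(len(set(rs)) == 1 for rs in groups.values()) / len(groups)
```

<p>With these made-up ratings, appraiser A is inconsistent on one part, appraiser B is perfectly consistent but disagrees with the standard on one part, and the two appraisers fully agree on only half of the parts.</p>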
<p>In my next post, we'll look at an example from the financial industry to see how a loan department used this statistical method to make sure that applications for loans were accepted or rejected appropriately and consistently. </p>
</div>
Data AnalysisLean Six SigmaProject ToolsQuality ImprovementSix SigmaStatisticsMon, 16 Jan 2017 13:00:00 +0000http://blog.minitab.com/blog/understanding-statistics/how-to-use-data-to-understand-and-resolve-differences-in-opinion-part-1Eston MartzStatistical Tools for Process Validation, Stage 1: Process Design
http://blog.minitab.com/blog/michelle-paret/statistical-tools-for-process-validation%2C-stage-1%3A-process-design
<p>Process validation is vital to the success of companies that manufacture drugs and biological products for people and animals. According to the FDA guidelines published by the U.S. Department of Health and Human Services:<img alt="Process Validation Stages" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/26c294a2e9b5b993bfd0f571be11113d/processvalidationstages.jpg" style="width: 280px; height: 299px; float: right; margin: 10px 15px;" /></p>
<p style="margin-left: 40px;"><em>“Process validation is defined as the collection and evaluation of data, from the process design stage through commercial production, which establishes scientific evidence that a process is capable of consistently delivering quality product.”<br />
— Food and Drug Administration</em></p>
<p>The FDA recommends three stages for process validation. In this 3-part series, we will briefly explore the stage goals and the types of activities and statistical techniques typically conducted within each. For complete FDA guidelines, see <a href="http://www.fda.gov" target="_blank">www.fda.gov</a>. </p>
Stage 1: Process Design
<p>The goal of this stage is to design a process suitable for routine commercial manufacturing that can consistently deliver a product that meets its quality attributes. It is important to demonstrate an understanding of the process and characterize how it responds to various inputs within Process Design.</p>
Example: Identify Critical Process Parameters with DOE
<p>Suppose you need to identify the critical process parameters for an immediate-release tablet. There are three process input variables that you want to examine: filler%, disintegrant%, and particle size. You want to find which inputs and input settings will maximize the dissolution percentage at 30 minutes.</p>
<p>To conduct this analysis, you can use <a href="http://blog.minitab.com/blog/statistics-and-quality-data-analysis/design-of-experiment-doe:-searching-for-a-selfie-fountain-of-youth">design of experiments</a> (DOE). DOE provides an efficient data collection strategy, during which inputs are simultaneously adjusted, to identify if relationships exist between inputs and output(s). Once you collect the data and analyze it to identify important inputs, you can then use DOE to pinpoint optimal settings.</p>
<strong>Running the Experiment</strong>
<p>The first step in DOE is to identify the inputs and corresponding input ranges you want to explore. The next step is to use statistical software, such as <a href="http://www.minitab.com">Minitab</a>, to create an experimental design that serves as your data collection plan.</p>
<p>According to the design shown below, we first want to use a particle size of 10, disintegrant of 1%, and filler (MCC) at 33.3%, and then record the corresponding average dissolution% using six tablets from a batch:</p>
<p style="margin-left: 40px;"><img alt="DOE Experiment" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/92bb269cff6b75b9a700a7e19bec78d2/doe_experiment.jpg" style="width: 250px; height: 189px;" /></p>
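<p>If you want to see what a data collection plan of this kind looks like outside of Minitab, the sketch below builds a two-level full factorial design in Python. The low levels (10, 1%, and 33.3%) come from the first run described above; the high levels and the names are assumptions made for illustration, and the actual design in the worksheet may use different levels or a different design type.</p>

```python
from itertools import product

# Low level taken from the first run in the text; high level is hypothetical.
factors = {
    "particle_size": (10, 40),
    "disintegrant_pct": (1.0, 5.0),
    "filler_mcc_pct": (33.3, 66.7),
}


def full_factorial(factors):
    """Return every combination of factor levels as a list of run dicts."""
    names = list(factors)
    level_sets = [factors[name] for name in names]
    return [dict(zip(names, levels)) for levels in product(*level_sets)]


design = full_factorial(factors)
for run_number, run in enumerate(design, start=1):
    print(run_number, run)
```

<p>Three factors at two levels give eight runs in standard order. In practice the run order would normally be randomized before collecting data.</p>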
<strong>Analyzing the Data</strong>
<p>Using Minitab’s DOE analysis and p-values, we are ready to identify which X's are critical. Based on the bars that cross the red significance line, we can conclude that particle size and disintegrant% significantly affect the dissolution%, as does the interaction between these two factors. Filler% is not significant.</p>
<p style="margin-left: 40px;"><img alt="Pareto chart" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/2b32669ef0ad0071a038fa7b5ffa25b7/paretochart.jpg" style="width: 350px; height: 233px;" /></p>
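<p>The bars in a Pareto chart of effects are based on estimated effects. As a rough sketch of the idea, the Python below estimates main effects and one interaction from a coded two-level design. The dissolution values are invented for illustration and are not the data behind the chart above, and the sketch does not compute the significance line, which requires an estimate of experimental error.</p>

```python
from itertools import product
from statistics import mean

# Coded design: -1 = low, +1 = high; columns are
# (particle size, disintegrant%, filler%).
runs = list(product((-1, 1), repeat=3))

# Made-up average dissolution% for each run, in the same order as `runs`.
dissolution = [82.5, 83.5, 86.5, 87.5, 68.5, 69.5, 80.5, 81.5]


def effect(contrast, response):
    """Mean response at the high level minus mean response at the low level."""
    high = [y for c, y in zip(contrast, response) if c > 0]
    low = [y for c, y in zip(contrast, response) if c < 0]
    return mean(high) - mean(low)


size = [r[0] for r in runs]
disintegrant = [r[1] for r in runs]
filler = [r[2] for r in runs]
# The interaction contrast is the elementwise product of the two columns.
size_x_disintegrant = [a * b for a, b in zip(size, disintegrant)]

effects = {
    "particle size": effect(size, dissolution),
    "disintegrant%": effect(disintegrant, dissolution),
    "filler%": effect(filler, dissolution),
    "particle size x disintegrant%": effect(size_x_disintegrant, dissolution),
}
```

<p>In this invented data set, the filler effect is small compared with the other three, which is the kind of pattern the Pareto chart makes visible at a glance.</p>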
<strong>Optimizing Product Quality</strong>
<p>Now that we've identified the critical X's, we're ready to determine the optimal settings for those inputs. Using a contour plot, we can easily identify the process window for the particle size and disintegrant% settings needed to achieve a percent dissolution of 80% or greater.</p>
<img alt="Contour plot" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/89f74b68916fd451deca51e832a72591/doe_contourplot.jpg" style="width: 350px; height: 233px;" />
<p>And that's how you can use design of experiments to conduct the Process Design stage. Next in this series, we'll look at the statistical tools and techniques commonly used for Process Qualification!</p>
Data AnalysisDesign of ExperimentsStatisticsFri, 13 Jan 2017 13:00:00 +0000http://blog.minitab.com/blog/michelle-paret/statistical-tools-for-process-validation%2C-stage-1%3A-process-designMichelle ParetStrangest Capability Study: Super-Zooper-Flooper-Do Broom Boom
http://blog.minitab.com/blog/statistics-in-the-field/strangest-capability-study%3A-super-zooper-flooper-do-broom-boom
<p><em>by Matthew Barsalou, guest blogger</em></p>
<p>The great Dr. Seuss tells of Mr. Plunger, who is the custodian at Diffendoofer School on the corner of Dinkzoober and Dinzott in the town of Dinkerville. The good Mr. Plunger “<a href="http://www.seussville.com/books/book_detail.php?isbn=9780679890089" target="_blank">keeps the whole school clean</a>” using a super-zooper-flooper-do.</p>
<p>Unfortunately, Dr. Seuss fails to tell us where the super-zooper-flooper-do came from and whether the production process was capable.</p>
<p><img alt="supper-zooper" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/05c32de7c03ea0d764f792f96ec3e8aa/supper_zooper_400w.jpg" style="width: 400px; height: 300px; margin: 10px 15px; float: right;" />Let’s assume the broom boom length was the most critical dimension on the super-zooper-flooper-do. The broom boom length drawing calls for a length of 55.0 mm with a tolerance of +/- 0.5 mm. The quality engineer has checked three super-zooper-flooper-do broom booms and all were in specification, so he concludes that there is no reason to worry about the process producing out-of-specification parts. But we know this is not true. Perhaps the fourth super-zooper-flooper-do broom boom <em>will </em>be out of specification. Or maybe the 1,000th.</p>
<p>It’s time for a capability study, but don’t fire up your <a href="http://www.minitab.com/products/minitab">Minitab Statistical Software</a> just yet. First we need to plan the capability study. Each day the super-zooper-flooper-do factory produces super-zooper-flooper-do broom booms with a change in broom boom material batch every 50th part. A capability study should have a minimum of 100 values and 25 subgroups. The subgroups should be rational: that means the variability within each subgroup should be less than the variability between subgroups. We can anticipate more variation between material batches than within a material batch, so we will use the batches as subgroups, with a sample size of four.</p>
<p>Once the <a href="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/666380b753452544acf5ebfa58ac27e6/supper_zooper_worksheet.mtw">data</a> has been collected, we can crank up our Minitab and perform a capability study by going to <strong>Stat > Quality Tools > Capability Analysis > Normal</strong>. Enter the column containing the measurement values. Then either enter the column containing the subgroup or type the size of the subgroup. Enter the lower specification limit and the upper specification limit, and click OK.</p>
<p style="margin-left: 40px;"><img alt="Process Capability Report for Broom Boom Length" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/81e504d93a72524caaf009f90fdc031c/process_capability_report_boom_length.png" style="border-width: 0px; border-style: solid; width: 605px; height: 454px;" /></p>
<p>We now have the results for the super-zooper-flooper-do broom boom lengths, but can we trust our results? A capability study has requirements that must be met. We should have a minimum of 100 values and 25 subgroups, which we have. But the data should also be normally distributed and in a state of statistical control; otherwise, we either need to transform the data, or identify the distribution of the data and perform a capability study for nonnormal data.</p>
<p>Dr. Seuss has never discussed transforming data so perhaps we should be hesitant if the data do not fit a distribution. Before performing a transformation, we should determine if there is a reason the data do not fit any distribution.</p>
<p>We can use the Minitab Capability Sixpack to determine if the data is normally distributed and in a state of statistical control. Go to <strong>Stat > Quality Tools > Capability Sixpack > Normal</strong>. Enter the column containing the measurement values. Then either enter the column containing the subgroup or type the size of the subgroup. Enter the lower specification limit and the upper specification limit and click OK.</p>
<p style="margin-left: 40px;"><img alt="Process Capability Sixpack Report for Broom Boom Length" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/8743bc7fe1898b69990e2de7594a7934/process_capability_sixpack_boom_length.png" style="border-width: 0px; border-style: solid; width: 605px; height: 454px;" /></p>
<p>There are no out-of-control points in the control chart, and the normal probability plot follows a straight line and has a P value greater than 0.05, so we fail to reject the null hypothesis that the data follow a normal distribution. The data are suitable for a capability study.</p>
<p>The within-subgroup variation is also known as short-term capability and is indicated by <span><a href="http://blog.minitab.com/blog/statistics-and-quality-improvement/process-capability-statistics-cp-and-cpk-working-together">Cp and Cpk</a></span>. The overall variation, which also includes the between-subgroup variability, is known as long-term capability and is given by Pp and Ppk. Cp and Cpk fail to account for the variability that will occur between batches; Pp and Ppk tell us what we can expect from the process over time.</p>
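<p>The difference between the two kinds of variation can be sketched in a few lines of Python. The measurements below are invented, and the function names are hypothetical; the within-subgroup estimate here is a plain pooled standard deviation, so statistical software that applies additional unbiasing constants may give slightly different numbers.</p>

```python
from math import sqrt
from statistics import stdev, variance

# Made-up broom boom lengths in mm: three batches (subgroups) of four parts.
subgroups = [
    [55.0, 55.1, 55.0, 55.1],
    [55.2, 55.3, 55.2, 55.3],
    [55.1, 55.2, 55.1, 55.2],
]


def within_sigma(subgroups):
    """Pooled standard deviation: variation inside subgroups only."""
    numerator = sum((len(g) - 1) * variance(g) for g in subgroups)
    degrees_of_freedom = sum(len(g) - 1 for g in subgroups)
    return sqrt(numerator / degrees_of_freedom)


def overall_sigma(subgroups):
    """Standard deviation of all values, ignoring the subgroup structure."""
    return stdev([x for g in subgroups for x in g])


print(round(within_sigma(subgroups), 4), round(overall_sigma(subgroups), 4))
```

<p>Because the batch means differ in this made-up data, the overall estimate is larger than the within-subgroup estimate. That gap is why the long-term indices come out lower than the short-term ones when batches shift.</p>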
<p>Both Cp and Pp compare the spread of the process with the width of the specification limits. In this case, a Cp of 1.63 tells us the spread of the data is much narrower than the width of the specification limits, and that is a good thing. But Cp and Pp alone are not sufficient, because they ignore where the process is centered. Cpk and Ppk indicate how far the process mean sits from the specification limits relative to the spread of the data. There is an upper and a lower Cpk and Ppk; however, we are generally only concerned with the lower of the two values.</p>
<p>In the super-zooper-flooper-do broom boom length example, a Cpk of 1.10 is an indication that the process is off center. The Cp is 1.63, so we can reduce the number of potentially out-of-specification super-zooper-flooper-do broom booms if we shift the process mean down to center the process while maintaining the current variation. This is a fortunate situation as it is often easier to shift the process mean than to reduce the process variation.</p>
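<p>The relationship between Cp and Cpk is easy to check with the standard textbook formulas. In the Python sketch below, the specification limits come from the drawing (55.0 mm +/- 0.5 mm), while the process mean and standard deviation are hypothetical values chosen for illustration; they are not the values from the capability report.</p>

```python
LSL, USL = 54.5, 55.5  # specification limits from the drawing, in mm


def capability_indices(mean_value, sigma, lsl=LSL, usl=USL):
    """Return Cp, the upper and lower one-sided indices, and Cpk."""
    cp = (usl - lsl) / (6 * sigma)
    cpu = (usl - mean_value) / (3 * sigma)  # room above the mean
    cpl = (mean_value - lsl) / (3 * sigma)  # room below the mean
    return {"Cp": cp, "CPU": cpu, "CPL": cpl, "Cpk": min(cpu, cpl)}


# Hypothetical process sitting above the center of the specification.
off_center = capability_indices(mean_value=55.2, sigma=0.1)

# Same spread, but with the mean shifted down to the nominal 55.0 mm.
centered = capability_indices(mean_value=55.0, sigma=0.1)

print(off_center)
print(centered)
```

<p>Shifting the mean to the center leaves Cp unchanged and raises Cpk until it equals Cp, which is the improvement the paragraph above describes.</p>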
<p>Once improvements are implemented and verified, we can be sure that the next super-zooper-flooper-do the Diffendoofer School purchases for Mr. Plunger will have a broom boom that is in specification if only common cause variation is present.</p>
<p> </p>
<div>
<p><strong>About the Guest Blogger</strong></p>
<p><em><a href="https://www.linkedin.com/pub/matthew-barsalou/5b/539/198" target="_blank">Matthew Barsalou</a> is a statistical problem resolution Master Black Belt at <a href="http://www.3k-warner.de/" target="_blank">BorgWarner</a> Turbo Systems Engineering GmbH. He is a Smarter Solutions certified Lean Six Sigma Master Black Belt, ASQ-certified Six Sigma Black Belt, quality engineer, and quality technician, and a TÜV-certified quality manager, quality management representative, and auditor. He has a bachelor of science in industrial sciences, a master of liberal studies with emphasis in international business, and a master of science in business administration and engineering from the Wilhelm Büchner Hochschule in Darmstadt, Germany. He is author of the books <a href="http://www.amazon.com/Root-Cause-Analysis-Step---Step/dp/148225879X/ref=sr_1_1?ie=UTF8&qid=1416937278&sr=8-1&keywords=Root+Cause+Analysis%3A+A+Step-By-Step+Guide+to+Using+the+Right+Tool+at+the+Right+Time" target="_blank">Root Cause Analysis: A Step-By-Step Guide to Using the Right Tool at the Right Time</a>, <a href="http://asq.org/quality-press/display-item/index.html?item=H1472" target="_blank">Statistics for Six Sigma Black Belts</a> and <a href="http://asq.org/quality-press/display-item/index.html?item=H1473&xvl=76115763" target="_blank">The ASQ Pocket Guide to Statistics for Six Sigma Black Belts</a>.</em></p>
</div>
<div style="clear:both;"> </div>
Capability AnalysisFun StatisticsMon, 09 Jan 2017 13:04:00 +0000http://blog.minitab.com/blog/statistics-in-the-field/strangest-capability-study%3A-super-zooper-flooper-do-broom-boomGuest BloggerHow to Make Your Statistical Software Fit You Perfectly
http://blog.minitab.com/blog/understanding-statistics/how-to-make-your-statistical-software-fit-you-perfectly
<p>Did you ever get a pair of jeans or a shirt that you liked, but didn't quite fit you perfectly? That happened to me a few months ago. The jeans looked good, and they were very well made, but it took a while before I was comfortable wearing them.</p>
<p>I much prefer it when I can get a pair with a perfect fit that feels like I was born in them, with no period of "adjustment." <img alt="jeans" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/f66ff501555082011e6457aac70ea720/jeans.jpg" style="width: 250px; height: 184px; margin: 10px 15px; float: right;" /></p>
<p>So which pair do you think I wear more often...the older pair that fits me like a glove, or the newer ones that aren't quite as comfortable? You already know the answer, because I'll bet <em>you </em>have a favorite pair of jeans, too. </p>
<p>So what does all this have to do with statistical software? Just this: if you can get statistical software that's perfectly matched to how you're going to use it, you're going to feel more comfortable, confident, and at ease from the second you open it. </p>
<p>We do strive to make Minitab Statistical Software very easy to use from the first time you launch it. Our roots lie in providing tools that make data analysis easier, and that's still our mission today. But we know a little bit of tailoring can make a garment that feels very good into one that feels <em>great</em>. </p>
<p>So if you want to tailor your Minitab software to fit you <em>perfectly</em>, we also make that easy—even if you have multiple people using Minitab on the same computer. </p>
A Set of Statistical Tools Made Just for You (or Me)
<p>If you're like most people, you want software that gives you the options you want, when you want them. You want a menu that has everything organized just the way you like it. And while we're at it, how about a toolbar that gives you immediate access to the tools you know you'll be using most frequently? </p>
<p>We don't think that's too much to ask. </p>
<p>In my job, I frequently need to perform a series of analyses on data about marketing and online traffic. It's easy enough to access those tools through Minitab's default menus, but one day I realized I didn't even need to do that—I could just make myself a menu in Minitab that includes the tools I use most frequently. </p>
<img alt="customize statistical software menu" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/b475d15563643b83a5f32c071a0870cf/customize.jpg" style="width: 479px; height: 208px; margin: 10px 15px; float: right;" />
<p>Taking this thought from idea to execution was a breeze. I simply right-clicked on the menu bar and selected the "Customize" option. </p>
<div>That brought up the dialog box shown below. All I had to do was select the "New Menu" command and drag it from the "Commands" window to the menu bar, and voilà! A new menu. </div>
<div> </div>
<div style="margin-left: 40px;"><img alt="customize dialog box" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/b5c665d4aebcb2c0ea1500d72a9dfd7f/customize_dialog.jpg" style="width: 447px; height: 375px;" /></div>
<div>
<p>From there, a right-click and the "Rename Button" command let me rename my new menu "Eston's Tools." I was then able to simply drag and drop the tools I use most frequently from the customization dialog box into my new menu: </p>
<p style="margin-left: 40px; "><img alt="customized statistics menu" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/9cf58f81de2ecc232bf2296e0f87bc9a/estonsmenu.jpg" style="width: 264px; height: 246px;" /></p>
<p>Pretty nifty. I could even customize the icons, were I inclined to do so. </p>
</div>
<p>There are many more ways you can <a href="http://support.minitab.com/en-us/minitab/17/topic-library/minitab-environment/interface/customize-the-minitab-interface/customize-menus-toolbars-and-shortcut-keys/">customize Minitab to suit your needs</a>, including the creation of <span><a href="http://blog.minitab.com/blog/starting-out-with-statistical-software/have-it-your-way-how-to-create-a-custom-toolbar-in-minitab">customized toolbars</a></span> and individual profiles, which are great if you share your computer with someone who would like to have Minitab customized to <em>their </em>preferences, too. </p>
<p>Let us know what you've done to customize Minitab so it fits <em>you </em>perfectly!</p>
Data AnalysisProject ToolsStatisticsTue, 03 Jan 2017 13:00:00 +0000http://blog.minitab.com/blog/understanding-statistics/how-to-make-your-statistical-software-fit-you-perfectlyEston MartzThe Difference Between Right-, Left- and Interval-Censored Data
http://blog.minitab.com/blog/michelle-paret/the-difference-between-right-left-and-interval-censored-data
<p><a href="http://blog.minitab.com/blog/statistics-and-quality-data-analysis/reliability-and-survival-the-high-stakes-of-product-performance">Reliability analysis</a> is the perfect tool for calculating the proportion of items that you can expect to survive for a specified period of time under identical operating conditions. Light bulbs—or lamps—are a classic example. Want to calculate the number of light bulbs expected to fail within 1000 hours? Reliability analysis can help you answer this type of question.</p>
<p>But to conduct the analysis properly, we need to understand the difference between the three types of censoring.</p>
What is censored data?
<p>When you perform reliability analysis, you may not have exact failure times for all items. In fact, lifetime data are often "censored." Using the light bulb example, perhaps not all the light bulbs have failed by the time your study ends. The time data for those bulbs that have not yet failed are referred to as censored.</p>
<img alt="baby" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/913ae1dbf78dd9728367bf0dead44f45/baby.jpg" style="width: 250px; height: 244px; margin: 10px 15px; float: right;" />
<p>It is important to include the censored observations in your analysis because the fact that these items have not yet failed has a big impact on your reliability estimates.</p>
Right-censored data
<p>Let’s move from light bulbs to newborns, inspired by my colleague who’s at the “you’re <em>still </em>here?” stage of pregnancy.</p>
<p>Suppose you’re conducting a study on pregnancy duration. You’re ready to complete the study and run your analysis, but some women in the study are still pregnant, so you don’t know exactly how long their pregnancies will last. These observations would be <em>right-censored</em>. The “failure,” or birth in this case, will occur after the recorded time.</p>
<p style="margin-left: 40px;"><img alt="Right censored" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/c75961d3d78018da3800683ab233c989/right_censored.png" style="width: 291px; height: 241px;" /></p>
Left-censored data
<p>Now suppose you survey some women in your study at the 250-day mark, but they already had their babies. You know they had their babies before 250 days, but don’t know <em>exactly </em>when. These are therefore <em>left-censored</em> observations, where the “failure” occurred before a particular time.</p>
<p style="margin-left: 40px;"><img alt="Left censored" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/7279d0487d0b3d08120e224456bafc2f/left_censored.png" style="width: 237px; height: 242px;" /></p>
Interval-censored data
<p>If we don’t know exactly when some babies were born but we know it was within some interval of time, these observations would be <em>interval-censored</em>. We know the “failure” occurred within some given time period. For example, we might survey expectant mothers every 7 days and then count the number who had a baby within that given week.</p>
<p style="margin-left: 40px;"><img alt="Interval censored" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/6060c2db-f5d9-449b-abe2-68eade74814a/Image/deb69487d6f3256172beefe22b4ecbf6/intervalcensored.png" style="width: 253px; height: 241px;" /></p>
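<p>One convenient way to think about all three cases is to record each observation as a pair of bounds on the unknown event time. The Python sketch below illustrates that idea; the class, the function, and the example day counts other than the 250-day survey and the 7-day interval are hypothetical, and this is not how any particular software package requires you to lay out a worksheet.</p>

```python
from dataclasses import dataclass
from math import inf


@dataclass(frozen=True)
class Observation:
    lower: float  # latest day on which the birth was known NOT to have happened
    upper: float  # earliest day by which the birth was known to have happened


def censoring_type(obs):
    """Classify an observation by what its bounds tell us about the event time."""
    if obs.lower == obs.upper:
        return "exact"
    if obs.upper == inf:
        return "right-censored"  # event happens some time after `lower`
    if obs.lower == 0:
        return "left-censored"  # event happened some time before `upper`
    return "interval-censored"  # event happened between the two bounds


still_pregnant = Observation(lower=270, upper=inf)  # study ended on day 270
before_survey = Observation(lower=0, upper=250)  # already born at day 250
weekly_check = Observation(lower=273, upper=280)  # born between two visits
known_date = Observation(lower=275, upper=275)  # exact duration recorded

for obs in (still_pregnant, before_survey, weekly_check, known_date):
    print(obs, censoring_type(obs))
```

<p>Seen this way, right- and left-censoring are just interval-censoring with one open end, which is why the same analysis can handle a mix of all three.</p>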
<p>Once you set up your data, running the analysis is easy with <a href="http://www.minitab.com/products/minitab/">Minitab Statistical Software</a>. For more information on how to run the analysis and interpret your results, see <a href="http://blog.minitab.com/blog/fun-with-statistics/what-i-learned-from-treating-childbirth-as-a-failure">this blog post</a>, which—coincidentally—is baby-related, too.</p>
Lean Six SigmaQuality ImprovementReliability AnalysisSix SigmaWed, 07 Dec 2016 14:03:00 +0000http://blog.minitab.com/blog/michelle-paret/the-difference-between-right-left-and-interval-censored-dataMichelle ParetCommon Assumptions about Data Part 3: Stability and Measurement Systems
http://blog.minitab.com/blog/quality-business/common-assumptions-about-data-part-3-stability-and-measurement-systems
<p><img alt="Cart before the horse" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/1a474c8c-3979-4eba-b70c-1e5a3f1d6601/Image/8230e7c2bc193a831158677a70eb0146/chile_road_sign_po_4.svg" style="width: 101px; height: 101px; float: right; margin: 10px 15px;" />In Parts <span><a href="http://blog.minitab.com/blog/quality-business/common-assumptions-about-data-part-1-random-samples-and-statistical-independence">1</a></span> and <span><a href="http://blog.minitab.com/blog/quality-business/common-assumptions-about-data-part-2-normality-and-equal-variance">2</a></span> of this blog series, I wrote about how statistical inference uses data from a sample of individuals to reach conclusions about the whole population. That’s a very powerful tool, but you must check your assumptions when you make statistical inferences. Violating any of these assumptions can result in false positives or false negatives, thus invalidating your results. </p>
<p>The common data assumptions are: random samples, independence, normality, equal variance, stability, and that your measurement system is accurate and precise. I addressed random samples and statistical independence in Part 1, and normality and equal variance in Part 2. Now let’s consider the assumptions of stability and measurement systems.</p>
What Is the Assumption of Stability?
<p>A stable process is one in which the inputs and conditions are consistent over time. When a process is stable, it is said to be “in control.” This means the sources of variation are consistent over time, and the process does not exhibit unpredictable variation. In contrast, if a process is unstable and changing over time, the sources of variation are inconsistent and unpredictable. As a result of the instability, you cannot be confident in your statistical test results.</p>
<p>Use one of the various types of <span><a href="http://blog.minitab.com/blog/understanding-statistics/what-control-chart-should-i-use">control charts</a></span> available in Minitab <a href="http://www.minitab.com/products/minitab/">Statistical Software</a> to assess the stability of your data set. The Assistant menu can walk you through the choices to select the appropriate control chart based on your data and subgroup size. You can get advice about collecting and using data by clicking the “more” link.</p>
<p><img alt="Choose a Control Chart" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/1a474c8c-3979-4eba-b70c-1e5a3f1d6601/Image/6ec77f5dbc070eb0c2070ce6bcf8144c/1_control_chart.png" style="border-width: 0px; border-style: solid; width: 474px; height: 338px; margin: 10px 15px;" /></p>
<p><img alt="I-MR Control Chart" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/1a474c8c-3979-4eba-b70c-1e5a3f1d6601/Image/3d69fc444cd5dd09a962a11e645a3a2e/2_control_chart.png" style="border-width: 0px; border-style: solid; width: 474px; height: 338px; margin: 10px 15px;" /></p>
<p>In addition to preparing the control chart, Minitab tests for out-of-control or non-random patterns based on the <a href="http://blog.minitab.com/blog/statistics-in-the-field/using-the-nelson-rules-for-control-charts-in-minitab">Nelson Rules</a> and provides an assessment in easy-to-read Summary and Stability reports. The Report Card, depending on the control chart selected, will automatically check your assumptions of stability, normality, amount of data, and correlation, and will suggest alternative charts to further analyze your data.</p>
<p><img alt="Report Card" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/1a474c8c-3979-4eba-b70c-1e5a3f1d6601/Image/195741e519156b95ee5feee8b521041f/3_control_chart.jpg" style="border-width: 0px; border-style: solid; width: 464px; height: 348px; margin: 10px 15px;" /></p>
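<p>To show what a basic stability check involves, here is a minimal Python sketch of the limits for an individuals (I) chart, using the conventional constants 2.66 and 3.267 for moving ranges of two consecutive points. The data are made up, the function names are hypothetical, and the sketch applies only the simplest test (a point beyond the control limits), not the full set of Nelson Rules.</p>

```python
from statistics import mean


def imr_limits(values):
    """Control limits for an individuals chart and its moving range chart."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    center = mean(values)
    return {
        "center": center,
        "lcl": center - 2.66 * mr_bar,  # 2.66 = 3 / d2, with d2 = 1.128
        "ucl": center + 2.66 * mr_bar,
        "mr_ucl": 3.267 * mr_bar,  # D4 for a moving range of span 2
    }


def out_of_control(values):
    """Indexes of points that fall outside the individuals chart limits."""
    limits = imr_limits(values)
    return [
        i for i, x in enumerate(values)
        if not limits["lcl"] <= x <= limits["ucl"]
    ]


# Made-up measurements; the eighth value is an obvious spike.
values = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.1, 12.5, 10.0, 9.9]
stable = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.1, 10.0, 9.9]

print(imr_limits(values))
print(out_of_control(values), out_of_control(stable))
```

<p>The spike is flagged in the first series and nothing is flagged in the second. Note that a single extreme point also widens the limits, which is one reason dedicated software checks for additional patterns.</p>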
What Is the Assumption for Measurement Systems?
<p>All the other assumptions I’ve described “assume” the data reflects reality. But does it?</p>
<p>The <span><a href="http://blog.minitab.com/blog/understanding-statistics/explaining-quality-statistics-so-my-boss-will-understand-measurement-systems-analysis-msa">measurement system</a> </span>is one potential source of variability when measuring a product or process. When a measurement system is poor, you lose the ability to truthfully “see” process performance. A poor measurement system leads to incorrect conclusions and flawed implementation. </p>
<p>Minitab can perform a Gage R&R test for both measurement and appraisal data, depending on your measurement system. You can use the Assistant in Minitab to help you select the most appropriate test based on the type of measurement system you have.</p>
<p><img alt="Choose a MSA" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/1a474c8c-3979-4eba-b70c-1e5a3f1d6601/Image/3ff089fcee9ab280c8e8d1da1c56d610/4_msa.png" style="border-width: 0px; border-style: solid; width: 474px; height: 345px; margin: 10px 15px;" /></p>
<p>There are two assumptions that should be satisfied when performing a Gage R&R for measurement data: </p>
<ol>
<li>The measurement device should be calibrated.</li>
<li>The parts to be measured should be selected from a stable process and cover approximately 80% of the possible operating range. </li>
</ol>
<p>When using a measurement device, make sure it is properly calibrated and check for linearity, bias, and stability over time. The device should produce accurate measurements, compared to a standard value, through the entire range of measurements and throughout the life of the device. Many companies have a metrology or calibration department responsible for calibrating and maintaining gauges. </p>
<p>Both these assumptions must be satisfied. If they are not, you cannot be sure that your data accurately reflect reality. And that means you’ll risk not understanding the sources of variation that influence your process outcomes. </p>
The Real Reason You Need to Check the Assumptions
<p>Collecting and analyzing data requires a lot of time and effort on your part. After all the work you put into your analysis, you want to be able to reach correct conclusions. Some analyses are robust to departures from these assumptions, but take the safe route and check! You want to be confident you can tell whether observed differences between data samples are simply due to chance, or if the populations are indeed different! </p>
<p>It’s easy to put the cart before the horse and just plunge into the data collection and analysis, but it’s much wiser to take the time to understand which data assumptions apply to the statistical tests you will be using, and plan accordingly.</p>
<p>Thank you for reading my blog. I hope this information helps you with your data analysis mission!</p>
Data Analysis, Hypothesis Testing, Quality Improvement, Statistics
Mon, 05 Dec 2016 13:00:00 +0000
http://blog.minitab.com/blog/quality-business/common-assumptions-about-data-part-3-stability-and-measurement-systems
Bonnie K. Stone
The Joy of Playing in Endless Backyards with Statistics
http://blog.minitab.com/blog/adventures-in-statistics-2/the-joy-of-playing-in-endless-backyards-with-statistics
<p>Dear Readers,</p>
<p><img alt="Jim Frost" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/742d7708-efd3-492c-abff-6044d78e3bbd/Image/1ae3640a9bb3396a48ee4478020340d5/avatar.png" style="width: 131px; height: 186px; float: right; margin: 10px 15px;" />As 2016 comes to a close, it’s time to reflect on the passage of time and changes. As I’m sure you’ve guessed, I love statistics and analyzing data! I also love talking and writing about it. In fact, I’ve been writing statistical blog posts for over five years, and it’s been an absolute blast. John Tukey, the renowned statistician, once said, “The best thing about being a statistician is that you get to play in everyone’s backyard.” I enthusiastically agree!</p>
<p>However, when I first started writing the blog, I wondered about being able to keep up a constant supply of fresh blog posts. And, when I first mentioned to some non-statistician friends that I’d be writing a statistical blog, I noticed a certain lack of enthusiasm. For instance, I heard a variety of comments like, “So, you’ll be writing things along the lines of 9 out of 10 dentists recommend . . .” Would readers even be interested in what I had to say about statistics?</p>
<p>It turns out that with a curious mind, statistical knowledge, data, and a powerful tool like <a href="http://www.minitab.com/en-us/products/minitab/" target="_blank">Minitab statistical software</a>, the possibilities are endless. You <em>can</em> play in a wide variety of fascinating backyards! </p>
<p>The most surprising statistic is that <a href="http://blog.minitab.com/blog/adventures-in-statistics-2" target="_blank">my blog posts</a> have received over 5.5 million views in the past year alone. Never in my wildest dreams did I imagine so many readers when I wrote <a href="http://blog.minitab.com/blog/adventures-in-statistics/three-measurement-system-analysis-questions-to-ask-before-you-take-a-single-measurement" target="_blank">my first post</a>! It’s a real testament to the growing importance of data analysis that so many people are interested in a blog dedicated to statistics. Thank you all for reading!</p>
Endless Backyards . . .
<p><img alt="Dolphin" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/742d7708-efd3-492c-abff-6044d78e3bbd/Image/f9c1d0c9fbd374b7272f5ee2ee2716c0/dolphin.jpg" style="width: 225px; height: 150px; float: right; margin: 10px 15px;" />Some of the topics I've written about are out of this world. I’ve assessed <a href="http://blog.minitab.com/blog/adventures-in-statistics/using-statistics-to-analyze-words" target="_blank">dolphin communications</a> and compared them to the search for extraterrestrial intelligence, and I’ve analyzed <a href="http://blog.minitab.com/blog/adventures-in-statistics/exoplanet-statistics-and-the-search-for-earth-twins" target="_blank">exoplanet data</a> in the search for the Earth’s twin! (As an aside, my analysis showed that my writing style is similar to dolphin communications. I'll take that as a compliment!)</p>
<p>For more earthly subjects, I’ve studied the relationship between <a href="http://blog.minitab.com/blog/adventures-in-statistics/size-matters-metabolic-rate-and-longevity" target="_blank">mammal size and their metabolic rate and longevity</a>. I’ve analyzed raw research data to assess the <a href="http://blog.minitab.com/blog/adventures-in-statistics/how-effective-are-flu-shots" target="_blank">effectiveness of flu shots</a> firsthand. I’ve downloaded economic data to assess patterns in both the <a href="http://blog.minitab.com/blog/adventures-in-statistics/reassessing-gdp-growth-part-1" target="_blank">U.S. GDP</a> and <a href="http://blog.minitab.com/blog/adventures-in-statistics/us-job-growth-assessing-the-numbers-and-making-predictions" target="_blank">U.S. job growth</a>. For a Thanksgiving Day post, I analyzed world income data to answer the question of <a href="http://blog.minitab.com/blog/adventures-in-statistics/statistically-how-thankful-should-we-be-a-look-at-global-income-distributions-part-1" target="_blank">how thankful we should be statistically</a>. As for <a href="http://blog.minitab.com/blog/adventures-in-statistics/when-is-easter-for-the-next-2086-years" target="_blank">Easter</a>, I can tell you the date on which it falls in any of 2,517 years, along with which dates are the most and least common.</p>
<p><img alt="Mythbusters" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/742d7708-efd3-492c-abff-6044d78e3bbd/Image/7b3b8859da99d60dd3e9c7932faefba3/mythbusters.jpg" style="width: 225px; height: 149px; float: right; margin: 10px 15px;" />In the world of politics, I’ve used data to <a href="http://blog.minitab.com/blog/adventures-in-statistics/predicting-the-us-presidential-election-evaluating-two-models-part-one" target="_blank">predict the 2012 U.S. Presidential election</a>, <a href="http://blog.minitab.com/blog/adventures-in-statistics/statistical-analyses-of-the-house-freedom-caucus-and-the-search-for-a-new-speaker" target="_blank">analyzed the House Freedom Caucus and the search for the new Speaker of the House</a>, assessed the <a href="http://blog.minitab.com/blog/adventures-in-statistics/great-presidents-revisited-does-history-provide-a-different-perspective" target="_blank">factors that make a great President</a>, and even <a href="http://blog.minitab.com/blog/adventures-in-statistics/using-the-solution-desirability-matrix-to-help-mitt-romney-choose-the-vp-candidate" target="_blank">helped Mitt Romney pick a running mate</a>. Everyone talks about the weather, so of course I had to <a href="http://blog.minitab.com/blog/adventures-in-statistics/are-atlantas-winters-getting-colder-and-snowier" target="_blank">analyze that</a>. My family loves the Mythbusters, and it was fun applying statistical analyses to some of the myths that they tested (<a href="http://blog.minitab.com/blog/adventures-in-statistics/busting-the-mythbusters-are-yawns-contagious" target="_blank">here</a> and <a href="http://blog.minitab.com/blog/adventures-in-statistics/using-hypothesis-tests-to-bust-myths-about-the-battle-of-the-sexes" target="_blank">here</a>). That's my family and me meeting them in the picture to the right!</p>
<p>Some of my posts have even been a bit surreal. I took my turn at attempting to explain the statistical illusion of the <a href="http://blog.minitab.com/blog/adventures-in-statistics/the-monty-hall-problem-and-the-importance-of-checking-your-assumptions" target="_blank">infamous Monty Hall problem</a>. I’ve compared <a href="http://blog.minitab.com/blog/adventures-in-statistics/world-travel-bumpy-roads-and-adjusting-your-graph-scales" target="_blank">world travel to adjusting scales in graphs</a> (seriously). I wrote a true story about how <a href="http://blog.minitab.com/blog/adventures-in-statistics/lessons-in-quality-during-a-long-and-strange-journey-home" target="_blank">I drove a planeload of passengers 200 miles to their homes</a> in the context of <img alt="ghost hunting" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/742d7708-efd3-492c-abff-6044d78e3bbd/Image/51587c9ccc575874d23335f607e520a0/nightshot.jpg" style="width: 225px; height: 127px; float: right; margin: 10px 15px;" />quality improvement! For Halloween-themed posts, I showed how to go <a href="http://blog.minitab.com/blog/adventures-in-statistics/how-to-be-a-ghost-hunter-with-a-statistical-mindset" target="_blank">ghost hunting with a statistical mindset</a> and how <a href="http://blog.minitab.com/blog/adventures-in-statistics/beware-of-phantom-degrees-of-freedom-that-haunt-your-regression-models" target="_blank">regression models can be haunted by phantom degrees of freedom</a>. I analyzed the <a href="http://blog.minitab.com/blog/adventures-in-statistics/using-data-analysis-to-assess-fatality-rates-in-star-trek-the-original-series" target="_blank">fatality rates in the original Star Trek TV series</a>. I explored how some people can <a href="http://blog.minitab.com/blog/adventures-in-statistics/the-odds-of-finding-a-four-leaf-clover-revisited-how-do-some-people-find-so-many" target="_blank">find so many four-leaf clovers despite their rarity</a>. 
And I wondered whether <a href="http://blog.minitab.com/blog/adventures-in-statistics/can-a-statistician-say-that-age-is-just-a-number" target="_blank">a statistician can say that age is just a number</a>.</p>
<p>See, not a mention of those dentists...well, not until now. By this point, 9 out of 10 dentists are probably feeling neglected!</p>
Helping Others Perform Their Own Analyses
<p>I’ve also written many posts aimed at helping those who are learning and performing statistical analyses. I described <a href="http://blog.minitab.com/blog/adventures-in-statistics/working-at-the-edge-of-human-knowledge-part-one" target="_blank">why statistics is cool</a> based on my own personal experiences and how the whole <a href="http://blog.minitab.com/blog/adventures-in-statistics/why-statistics-is-important" target="_blank">field of statistics is growing in importance</a>. I showed how <a href="http://blog.minitab.com/blog/adventures-in-statistics/why-anecdotal-evidence-is-unreliable" target="_blank">anecdotal evidence is unreliable</a> and explained why it fails so badly. And, I took a look forward at how <a href="http://blog.minitab.com/blog/adventures-in-statistics/expanding-the-role-of-statistics-to-areas-traditionally-dominated-by-expert-judgment" target="_blank">statistical analyses are expanding into areas traditionally ruled by expert judgment</a>.</p>
<p>I zoomed in to cover the details about how to perform and interpret statistical analyses. Some might think that covering the nitty-gritty of statistical best practices is boring. Yet, you’d be surprised by the lively discussions we’ve had. We’ve had heated debates and philosophical discussions about <a href="http://blog.minitab.com/blog/adventures-in-statistics/how-to-correctly-interpret-p-values" target="_blank">how to correctly interpret p-values</a> and what <a href="http://blog.minitab.com/blog/adventures-in-statistics/understanding-hypothesis-tests:-significance-levels-alpha-and-p-values-in-statistics" target="_blank">statistical significance</a> does and does not tell you. This reached a fever pitch when a psychology journal actually <a href="http://blog.minitab.com/blog/adventures-in-statistics/banned-p-values-and-confidence-intervals-a-rebuttal-part-1" target="_blank">banned p-values</a>!</p>
<p><img alt="Regression residuals" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/742d7708-efd3-492c-abff-6044d78e3bbd/Image/58964ccf1cb00ead2ee1735ca54886d9/residual_illustration.gif" style="width: 221px; height: 149px; float: right; border-width: 0px; border-style: solid; margin: 10px 15px;" />We had our difficult questions and surprising topics to grapple with. <a href="http://blog.minitab.com/blog/adventures-in-statistics/how-high-should-r-squared-be-in-regression-analysis" target="_blank">How high should R-squared be</a>? <a href="http://blog.minitab.com/blog/adventures-in-statistics/choosing-between-a-nonparametric-test-and-a-parametric-test" target="_blank">Should I use a parametric or nonparametric analysis</a>? <a href="http://blog.minitab.com/blog/adventures-in-statistics/how-to-interpret-a-regression-model-with-low-r-squared-and-low-p-values" target="_blank">How is it possible that a regression model can have significant variables but still have a low R-squared</a>? I even had the nerve to suggest that <a href="http://blog.minitab.com/blog/adventures-in-statistics/regression-analysis-how-to-interpret-s-the-standard-error-of-the-regression" target="_blank">R-squared is overrated</a>! And, I made the unusual case that control charts are also <a href="http://blog.minitab.com/blog/adventures-in-statistics/control-charts-not-just-for-statistical-process-control-spc-anymore" target="_blank">very important outside the realm of quality improvement</a>. Then, there is the whole frequentist versus Bayesian debate, but let’s not go there!</p>
<p>However, it’s true that not all topics about how to perform statistical analyses are riveting. I still love these topics. The world is becoming an increasingly data-driven place, and to produce trustworthy results, you must analyze your data correctly. After all, it’s surprisingly easy to <a href="http://blog.minitab.com/blog/adventures-in-statistics/applied-regression-analysis-how-to-present-and-use-the-results-to-avoid-costly-mistakes-part-1" target="_blank">make a costly mistake</a> if you don’t know what you’re doing.</p>
<p><img alt="F-distribution with probability" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/742d7708-efd3-492c-abff-6044d78e3bbd/Image/6303a2314437d8fcf2f72d9a56b1293a/f_distribution_probability.png" style="width: 250px; height: 167px; float: right; margin: 10px 15px;" />A data-driven world requires an analyst to understand seemingly esoteric details such as: the <a href="http://blog.minitab.com/blog/adventures-in-statistics/curve-fitting-with-linear-and-nonlinear-regression" target="_blank">different methods of fitting curves</a>, <a href="http://blog.minitab.com/blog/adventures-in-statistics/the-danger-of-overfitting-regression-models" target="_blank">the dangers of overfitting your model</a>, <a href="http://blog.minitab.com/blog/adventures-in-statistics/regression-analysis-how-do-i-interpret-r-squared-and-assess-the-goodness-of-fit" target="_blank">assessing goodness-of-fit</a>, <a href="http://blog.minitab.com/blog/adventures-in-statistics/why-you-need-to-check-your-residual-plots-for-regression-analysis" target="_blank">checking your residual plots</a>, and how to check for and correct <a href="http://blog.minitab.com/blog/adventures-in-statistics/what-are-the-effects-of-multicollinearity-and-when-can-i-ignore-them" target="_blank">multicollinearity</a> and <a href="http://blog.minitab.com/blog/adventures-in-statistics/curing-heteroscedasticity-with-weighted-regression-in-minitab-statistical-software" target="_blank">heteroscedasticity</a>. How do you <a href="http://blog.minitab.com/blog/adventures-in-statistics/how-to-choose-the-best-regression-model" target="_blank">choose the best model</a>? Do you need to <a href="http://blog.minitab.com/blog/adventures-in-statistics/when-is-it-crucial-to-standardize-the-variables-in-a-regression-model" target="_blank">standardize your variables</a> before performing the analysis? 
Maybe you need a <a href="http://blog.minitab.com/blog/adventures-in-statistics/regression-analysis-tutorial-and-examples" target="_blank">regression tutorial</a>?</p>
<p>You may need to know <a href="http://blog.minitab.com/blog/adventures-in-statistics/how-to-identify-the-distribution-of-your-data-using-minitab" target="_blank">how to identify the distribution of your data</a>. And just <a href="http://blog.minitab.com/blog/adventures-in-statistics/understanding-hypothesis-tests:-why-we-need-to-use-hypothesis-tests-in-statistics" target="_blank">how do hypothesis tests work</a> anyway? <a href="http://blog.minitab.com/blog/adventures-in-statistics/understanding-analysis-of-variance-anova-and-the-f-test" target="_blank">F-tests</a>? <a href="http://blog.minitab.com/blog/adventures-in-statistics/understanding-t-tests-t-values-and-t-distributions" target="_blank">T-tests</a>? How do you <a href="http://blog.minitab.com/blog/adventures-in-statistics/how-to-test-your-discrete-distribution" target="_blank">test discrete data</a>? <a href="http://blog.minitab.com/blog/adventures-in-statistics/when-should-i-use-confidence-intervals-prediction-intervals-and-tolerance-intervals" target="_blank">Should you use a confidence interval, prediction interval, or a tolerance interval</a>? <a href="http://blog.minitab.com/blog/adventures-in-statistics/use-random-assignment-in-experiments-to-combat-confounding-variables" target="_blank">How do you know when X causes a change in Y</a>? <a href="http://blog.minitab.com/blog/adventures-in-statistics/confound-it-some-more-how-a-factor-that-wasnt-there-hampered-my-analysis" target="_blank">Is a confounding variable distorting your results</a>? <a href="http://blog.minitab.com/blog/adventures-in-statistics/repeated-measures-designs-benefits-challenges-and-an-anova-example" target="_blank">What are the pros and cons of using a repeated measures design</a>? <a href="http://blog.minitab.com/blog/adventures-in-statistics/did-welchs-anova-make-fishers-classic-one-way-anova-obsolete" target="_blank">Fisher’s or Welch’s ANOVA</a>? 
<a href="http://blog.minitab.com/blog/adventures-in-statistics/the-power-of-multivariate-anova-manova" target="_blank">ANOVA or MANOVA</a>? <a href="http://blog.minitab.com/blog/adventures-in-statistics/linear-or-nonlinear-regression-that-is-the-question" target="_blank">Linear or nonlinear regression?</a></p>
<p>These may not be “sexy” topics but they are the meat and potatoes of being able to draw sound conclusions from your data. And, based on numerous blog comments, they have been well received by many people. In fact, the most rewarding aspect of writing blog posts has been the interactions I've had with all of you. I've communicated with literally hundreds and hundreds of students learning statistics and practitioners performing statistics in the field. I’ve had the pleasure of learning how you use statistical analyses, understanding the difficulties you face, and helping you resolve those issues.</p>
<p>It's been an amazing journey and I hope that my blog posts have allowed you to see statistics through my eyes―as a key that can unlock discoveries that are trapped in your data. After all, that's the reason why I titled my blog <em>Adventures in Statistics</em>. Discovery is a bumpy road. There can be statistical challenges en route, but even those can be interesting, and perhaps even rewarding, to resolve. Sometimes it is the <a href="http://blog.minitab.com/blog/adventures-in-statistics/the-mysteries-of-variability-and-power" target="_blank">perplexing mystery in your data that prompts you to play detective and leads you to surprising new discoveries</a>!</p>
<p>To close out the old year, it's good to remember that change is constant. There are bound to be many new and exciting adventures in the New Year. I wish you all the best in your endeavors. </p>
<p>“We will open the book. Its pages are blank. We are going to put words on them ourselves. The book is called Opportunity and its first chapter is New Year's Day.” <em>― Edith Lovejoy Pierce </em></p>
<p>May you all find happiness in 2017! Onward and upward!</p>
<p>Jim</p>
Data Analysis, Statistics, Statistics Help, Stats
Wed, 30 Nov 2016 15:00:00 +0000
http://blog.minitab.com/blog/adventures-in-statistics-2/the-joy-of-playing-in-endless-backyards-with-statistics
Jim Frost
Mutant Trees Lay Waste to the Landscape and Reveal Mother Nature's Lean Design
http://blog.minitab.com/blog/data-analysis-and-quality-improvement-and-stuff/mutant-trees-lay-waste-to-the-landscape-and-reveal-mother-natures-lean-design
<p>The season of change is upon us here at Minitab's World Headquarters. The air is crisp and clear and the landscape is ablaze in vibrant fall colors. As I drove to work one recent morning, I couldn't help but soak in the beauty surrounding me and think, "Too bad everything they taught me as a kid was a lie."</p>
<p><img alt="fall trees" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/c2cb2bd427165df25e0ca2b38ef59381/trees.jpg" style="width: 208px; height: 182px; margin: 10px 15px; float: right;" />You see, as a boy growing up in New Hampshire, I was told that the sublime beauty of autumn was just a happy accident. As the days become shorter, the trees succumb to their own version of seasonal affective disorder; they stop producing chlorophyll because... well, what's the point? As a result of this photosynthetic funk, the green begins to drain from the leaves and the less pragmatic pigments prevail, if briefly.</p>
<p>But thanks to mutant trees, I now know the truth. Or at least one possible explanation. I refer, of course, to the findings of Hoch, Singsaas, and McCown, in their 2003 paper, "<a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC281624/" target="_blank">Resorption Protection. Anthocyanins Facilitate Nutrient Recovery in Autumn by Shielding Leaves from Potentially Damaging Light Levels.</a>"</p>
<p>In truth, I shouldn't say that what I learned as a kid was a <em>lie</em>. The theory of autumn by chromatic attrition might still be true to some extent. But I was intrigued to discover recently that newer theories posit a more adaptive role for the annual display. For example, one theory suggests that the bright displays evolved to inform potentially injurious insects that they are barking up the wrong tree. (For more information, see Archetti and Brown 2004, "<a href="http://harvardforest.fas.harvard.edu/sites/harvardforest.fas.harvard.edu/files/leaves/Archetti_%20Brown_2004.pdf" target="_blank">The coevolution theory of autumn colours</a>".)</p>
<p>But most interesting to me was the discovery that red pigments aren't just late-season hold-outs—production of these pigments is actually ramped up in the fall. Obviously, the "Accidental Autumn" explanation doesn't hold in this case. In their paper, Hoch and colleagues present evidence that anthocyanins, which produce red fall colors, actually help trees prepare for winter.</p>
<p>Here's where the mutants come in. The theory is that the anthocyanins act as a kind of sunblock that protects the leaves while the tree recovers valuable nutrients from them, before the tree sends the leaves downward and duffward.</p>
<p>To test this theory, the scientists sampled leaves from normal (wild) trees and from mutant trees that possessed superhuman powers. Well, actually, all trees possess superhuman powers because all trees can produce food from sunlight. (I've yet to meet a human who can do that.) But in this case, affected trees had a mutation that prevented them from producing anthocyanins and turning red in the fall. </p>
<p>It's always easier to understand what your data are showing you when you can look at the results of your analysis in a graph. I used <a href="http://www.minitab.com/products/minitab">Minitab Statistical Software</a> to create a couple of graphs that illustrate some of the results shared in the paper. </p>
Before and after nitrogen levels
<p>The scientists measured the nitrogen levels in the leaves before and after the period when the trees normally recover as much of that valuable nutrient as they can. This graph shows the before and after nitrogen levels for mutant and wild-type specimens of 3 different tree species. The graph shows that the nitrogen levels in the leaves tend to drop more for the wild trees, indicating that they are more successful at recovering the nitrogen than the mutant trees. </p>
<p style="margin-left: 40px;"><img alt="Line plot of before and after nitrogen levels" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/d0620bb01ef55623402cd4b603e3f861/lineplotbeforeafter.jpg" style="width: 459px; height: 306px;" /></p>
Resorption efficiency
<p>This <span><a href="http://blog.minitab.com/blog/quality-data-analysis-and-statistics/bar-charts-decoded">bar chart</a></span> shows the same data, but expressed as "Resorption Efficiency," which is just the percent change between the before and after nitrogen levels. The graph suggests that the lack of anthocyanins hampered the ability of the mutant trees to recover the nitrogen from their leaves. </p>
<p style="margin-left: 40px;"><img alt="Bar chart of resorption efficiency" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/f42ddf69c4e21e9804b053dabef3623c/barchartresorptionefficiency.jpg" style="width: 459px; height: 306px;" /></p>
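<p>If you want to reproduce this kind of calculation on your own data, here is a minimal Python sketch. It assumes resorption efficiency is the percent drop from the before level to the after level, so a larger value means more nitrogen was recovered. The function name and the nitrogen values are hypothetical and are not taken from the paper.</p>

```python
def resorption_efficiency(before, after):
    """Percent of leaf nitrogen recovered between the two measurements.

    Assumed definition: (before - after) / before * 100.
    """
    if before <= 0:
        raise ValueError("the 'before' nitrogen level must be positive")
    return (before - after) / before * 100.0


# Hypothetical (before, after) nitrogen levels in arbitrary units.
samples = {
    "wild-type": (2.0, 0.8),
    "mutant": (2.0, 1.4),
}

efficiency = {
    tree: resorption_efficiency(before, after)
    for tree, (before, after) in samples.items()
}

for tree, value in efficiency.items():
    print(f"{tree}: {value:.1f}% of leaf nitrogen recovered")
```

<p>With these made-up numbers, the wild-type tree recovers a larger share of its leaf nitrogen than the mutant, which is the same direction of difference the bar chart shows.</p>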
<p>So, rather than simply accepting seasonal spikes in scrap waste, it appears that mother nature is a much better quality engineer than we had given her credit for. In addition to dazzling us with some beautiful color before winter sets in, those brilliant reds are actually adding value to the process by helping to reduce waste.</p>
<p>My newfound appreciation for nature's lean genius inspired me to do a little exploring around Minitab's World Headquarters and capture some images of industrious anthocyanins hard at work improving plant profitability. Along with some cows. If you've never had the opportunity to see trees do this—and even if you have—perhaps you'll enjoy the images shared below. </p>
<p>Happy Autumn! </p>
<p><em>Corn rows weave under undulating clouds</em><br />
<img alt="Harvest has come" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/bbd7dc445005ffda4874c3ae424ab730/maze_2.jpg" style="width: 500px; height: 378px;" /></p>
<p><em>Rusty barns rest after the harvest</em><br />
<img alt="Barn" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/292dfb266a78f7fae35a28648b6b33a4/barn__enhanced.jpg" style="width: 500px; height: 262px;" /></p>
<p><em>Rustling stalks spread from road to ridge</em><br />
<img alt="Ridge and meadow" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/f1f734dd3153923e80039eab70dba9e9/ridge_and_meadow.jpg" style="font-size: 13px; width: 500px; height: 269px;" /></p>
<p><em>Heifers forage contentedly under a calm fall sky</em><br />
<img alt="Cows" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/46f84b64ecc05fb461d3c9fe7c67d5c5/cows.jpg" style="font-size: 13px; width: 500px; height: 545px;" /></p>
<p><em>Autumn finery frames the fabled Beaver Stadium </em><br />
<img alt="Fabled Beaver Stadium" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/5a6bde627d438a786ca7edaf37f2ca27/stadium_framed_by_field_and_tree.jpg" style="width: 500px; height: 555px;" /></p>
<p><em>Scenic splendor surrounds majestic Mount Nittany </em><br />
<img alt="Mount Nittany" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/4abbb7f7b6a09d5e1a3e92b9205511ee/flaming_frame__bright.jpg" style="width: 500px; height: 258px;" /></p>
<p><em>Wary hawk takes wing amid wild autumn hues</em><br />
<img alt="Hawk on the wing" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/a1fd4a9278394d7917dd4928c6b09c13/soar_2.jpg" style="width: 500px; height: 528px;" /></p>
<p><em>Opportunistic apparitions hang around to haunt passersby</em><br />
<img alt="Ghosts" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/3c8869125bd9b7f17deef5a9506018c0/ghosts2.jpg" style="width: 500px; height: 209px;" /></p>
<p><em>Minitab World Headquarters looms large on the landscape</em><br />
<img alt="Minitab World Headquarters" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/a7e45f704289b992aa7f69eb39a92a8d/peeper_tab.jpg" style="width: 500px; height: 293px;" /></p>
Fun Statistics, Statistics, Statistics in the News
Fri, 18 Nov 2016 13:00:00 +0000
http://blog.minitab.com/blog/data-analysis-and-quality-improvement-and-stuff/mutant-trees-lay-waste-to-the-landscape-and-reveal-mother-natures-lean-design
Greg Fox