Hypothesis Testing | Minitab
Blog posts and articles about hypothesis testing, especially in the course of Lean Six Sigma quality improvement projects.
http://blog.minitab.com/blog/hypothesis-testing-2/rss
Fri, 09 Dec 2016 03:52:37 +0000
FeedCreator 1.7.3
Common Assumptions about Data Part 3: Stability and Measurement Systems
http://blog.minitab.com/blog/quality-business/common-assumptions-about-data-part-3-stability-and-measurement-systems
<p><img alt="Cart before the horse" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/1a474c8c-3979-4eba-b70c-1e5a3f1d6601/Image/8230e7c2bc193a831158677a70eb0146/chile_road_sign_po_4.svg" style="width: 101px; height: 101px; float: right; margin: 10px 15px;" />In Parts <span><a href="http://blog.minitab.com/blog/quality-business/common-assumptions-about-data-part-1-random-samples-and-statistical-independence">1</a></span> and <span><a href="http://blog.minitab.com/blog/quality-business/common-assumptions-about-data-part-2-normality-and-equal-variance">2</a></span> of this blog series, I wrote about how statistical inference uses data from a sample of individuals to reach conclusions about the whole population. That’s a very powerful tool, but you must check your assumptions when you make statistical inferences. Violating any of these assumptions can result in false positives or false negatives, thus invalidating your results. </p>
<p>The common data assumptions are: random samples, independence, normality, equal variance, stability, and that your measurement system is accurate and precise. I addressed random samples, statistical independence, normality, and equal variance in Parts 1 and 2. Now let’s consider the assumptions of stability and measurement systems.</p>
What Is the Assumption of Stability?
<p>A stable process is one in which the inputs and conditions are consistent over time. When a process is stable, it is said to be “in control.” This means the sources of variation are consistent over time, and the process does not exhibit unpredictable variation. In contrast, if a process is unstable and changing over time, the sources of variation are inconsistent and unpredictable. As a result of the instability, you cannot be confident in your statistical test results.</p>
<p>Use one of the various types of <span><a href="http://blog.minitab.com/blog/understanding-statistics/what-control-chart-should-i-use">control charts</a></span> available in Minitab <a href="http://www.minitab.com/products/minitab/">Statistical Software</a> to assess the stability of your data set. The Assistant menu can walk you through the choices to select the appropriate control chart based on your data and subgroup size. You can get advice about collecting and using data by clicking the “more” link.</p>
<p><img alt="Choose a Control Chart" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/1a474c8c-3979-4eba-b70c-1e5a3f1d6601/Image/6ec77f5dbc070eb0c2070ce6bcf8144c/1_control_chart.png" style="border-width: 0px; border-style: solid; width: 474px; height: 338px; margin: 10px 15px;" /></p>
<p><img alt="I-MR Control Chart" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/1a474c8c-3979-4eba-b70c-1e5a3f1d6601/Image/3d69fc444cd5dd09a962a11e645a3a2e/2_control_chart.png" style="border-width: 0px; border-style: solid; width: 474px; height: 338px; margin: 10px 15px;" /></p>
<p>In addition to preparing the control chart, Minitab tests for out-of-control or non-random patterns based on the <a href="http://blog.minitab.com/blog/statistics-in-the-field/using-the-nelson-rules-for-control-charts-in-minitab">Nelson Rules</a> and provides an assessment in easy-to-read Summary and Stability reports. The Report Card, depending on the control chart selected, automatically checks your assumptions about stability, normality, amount of data, and correlation, and suggests alternative charts to further analyze your data.</p>
<p><img alt="Report Card" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/1a474c8c-3979-4eba-b70c-1e5a3f1d6601/Image/195741e519156b95ee5feee8b521041f/3_control_chart.jpg" style="border-width: 0px; border-style: solid; width: 464px; height: 348px; margin: 10px 15px;" /></p>
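<p>To make the idea of control limits concrete, here is a minimal Python sketch of the arithmetic behind an I-MR chart. It is not Minitab's implementation: the function names and the data are hypothetical, and it applies only the simplest test (a point beyond the control limits) rather than the full set of Nelson Rules. The constants 2.66 and 3.267 are the standard ones for a moving range of length 2.</p>

```python
from statistics import mean


def imr_limits(values):
    """Return control limits for an individuals (I) and moving range (MR) chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Moving ranges: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    mr_bar = mean(moving_ranges)
    # 2.66 = 3 / d2 and 3.267 = D4 for a moving range of length 2.
    return {
        "center": center,
        "ucl": center + 2.66 * mr_bar,
        "lcl": center - 2.66 * mr_bar,
        "mr_bar": mr_bar,
        "mr_ucl": 3.267 * mr_bar,
    }


def points_beyond_limits(values):
    """Indexes of observations that fall outside the I-chart control limits."""
    limits = imr_limits(values)
    return [i for i, x in enumerate(values)
            if x > limits["ucl"] or x < limits["lcl"]]


# Hypothetical measurements; the ninth value is a deliberate spike.
data = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 13.0, 10.0, 10.1, 9.9]
print(imr_limits(data))
print(points_beyond_limits(data))  # [8]
```

<p>A flagged index is a prompt to investigate that observation, not proof that the process is unstable.</p>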
What Is the Assumption for Measurement Systems?
<p>All the other assumptions I’ve described “assume” the data reflects reality. But does it?</p>
<p>The <span><a href="http://blog.minitab.com/blog/understanding-statistics/explaining-quality-statistics-so-my-boss-will-understand-measurement-systems-analysis-msa">measurement system</a> </span>is one potential source of variability when measuring a product or process. When a measurement system is poor, you lose the ability to truthfully “see” process performance. A poor measurement system leads to incorrect conclusions and flawed implementation. </p>
<p>Minitab can perform a Gage R&R test for both measurement and appraisal data, depending on your measurement system. You can use the Assistant in Minitab to help you select the most appropriate test based on the type of measurement system you have.</p>
<p><img alt="Choose a MSA" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/1a474c8c-3979-4eba-b70c-1e5a3f1d6601/Image/3ff089fcee9ab280c8e8d1da1c56d610/4_msa.png" style="border-width: 0px; border-style: solid; width: 474px; height: 345px; margin: 10px 15px;" /></p>
<p>There are two assumptions that should be satisfied when performing a Gage R&R for measurement data: </p>
<ol>
<li>The measurement device should be calibrated.</li>
<li>The parts to be measured should be selected from a stable process and cover approximately 80% of the possible operating range. </li>
</ol>
<p>When using a measurement device, make sure it is properly calibrated, and check for linearity, bias, and stability over time. The device should produce accurate measurements, compared to a standard value, through the entire range of measurements and throughout the life of the device. Many companies have a metrology or calibration department responsible for calibrating and maintaining gauges. </p>
<p>Both these assumptions must be satisfied. If they are not, you cannot be sure that your data accurately reflect reality. And that means you’ll risk not understanding the sources of variation that influence your process outcomes. </p>
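<p>As a rough illustration of comparing a device to a standard value, the Python sketch below computes bias (average reading minus the reference) and the spread of repeated readings. This is a simplified check, not a Gage R&R study; the function name, the reference value, and the readings are all hypothetical.</p>

```python
from statistics import mean, stdev


def bias_and_repeatability(readings, reference):
    """Bias = average reading minus the reference; repeatability = std dev of the readings."""
    return mean(readings) - reference, stdev(readings)


# Hypothetical data: one appraiser measures a 25.00 mm reference standard ten times.
readings = [25.02, 25.01, 25.03, 25.02, 25.00, 25.02, 25.01, 25.03, 25.02, 25.04]
bias, repeatability = bias_and_repeatability(readings, reference=25.0)
print(f"bias = {bias:+.3f}, repeatability (std dev) = {repeatability:.3f}")
```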
The Real Reason You Need to Check the Assumptions
<p>Collecting and analyzing data requires a lot of time and effort on your part. After all the work you put into your analysis, you want to be able to reach correct conclusions. Some analyses are robust to departures from these assumptions, but take the safe route and check! You want to be confident you can tell whether observed differences between data samples are simply due to chance, or if the populations are indeed different! </p>
<p>It’s easy to put the cart before the horse and just plunge in to the data collection and analysis, but it’s much wiser to take the time to understand which data assumptions apply to the statistical tests you will be using, and plan accordingly.</p>
<p>Thank you for reading my blog. I hope this information helps you with your data analysis mission!</p>
Data Analysis, Hypothesis Testing, Quality Improvement, Statistics
Mon, 05 Dec 2016 13:00:00 +0000
http://blog.minitab.com/blog/quality-business/common-assumptions-about-data-part-3-stability-and-measurement-systems
Bonnie K. Stone
Picking the Perfect Plot to Communicate Your Data
http://blog.minitab.com/blog/data-analysis-and-quality-improvement-and-stuff/picking-the-perfect-plot-to-communicate-your-data
<p>At the inaugural Minitab Insights Conference in September, presenters Benjamin Turcan and Jennifer Berner discussed <a href="http://blog.minitab.com/blog/data-analysis-and-quality-improvement-and-stuff/5-questions-to-ask-before-you-present-statistical-results">how to present data effectively</a>. Among the considerations they discussed was choosing the right graph.</p>
<p>Different graphs are good for different things. Of course, opinions about which graph is best can, and do, differ. Dotplot devotees might decide that they are demonstrably advantageous for all applications. On the other hand, determined dotplot detractors might beg to differ, and declare that they are decidedly good for nothing. (The dotplots that is, not the devotees. But I digress.)</p>
<p>In their presentation, Turcan and Berner divided the many uses for graphs into four broad categories:</p>
<ol>
<li>Examining relationships between variables.</li>
<li>Comparing groups.</li>
<li>Assessing how the parts comprise the whole.</li>
<li>Looking at how values are distributed.</li>
</ol>
<p>In this post I'll explore some examples of how Minitab's many marvelous graphs match up with this matrix.</p>
Examining relationships between variables
<p><img alt="Bubble plot" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/5ab0ae04699d1ac64484efba1be41204/bubblegraph.png" style="width: 209px; height: 119px; margin: 10px 15px; float: left; border-width: 1px; border-style: solid;" />Is your scrap rate higher on days with higher humidity? Do hospital admissions increase or decrease when the weather gets warmer? Does your pulse pound faster the more trips you make to the coffee machine? </p>
<p>Questions like these involve examining pairs of measurements. For example, you might record the high temperature each day as well as the number of patients admitted to a hospital, and then use one of the following graphs to look for a pattern. </p>
<p><strong>Scatterplot and Fitted Line Plot</strong></p>
<p>The following post shows how to use both a scatterplot and a fitted line plot to good effect: <a href="http://blog.minitab.com/blog/the-statistics-game/march-madnesswith-minitab" target="_blank">March Madness…with Minitab</a>.</p>
<div>
<p><strong>Matrix Plot</strong></p>
<p>What if you want to evaluate several different pairs of variables? Instead of creating a bunch of separate scatterplots, you can use Minitab's convenient Matrix Plot functionality as discussed in this fine post, <a href="http://blog.minitab.com/blog/data-analysis-and-quality-improvement-and-stuff/the-matrix-its-a-complex-plot" target="_blank">The Matrix, It's a Complex Plot</a>.</p>
<p><strong>Contour Plot, 3D Scatterplot, 3D Surface Plot, and Bubble Plot</strong></p>
<p>Minitab also includes several graphs that allow you to explore the relationships among three variables at the same time, such as those discussed in <a href="http://blog.minitab.com/blog/real-world-quality-improvement/3-ways-to-graph-3-variables-in-minitab">3 Ways to Graph 3 Variables in Minitab</a> and <a href="http://blog.minitab.com/blog/starting-out-with-statistical-software/introducing-the-bubble-plot" target="_blank">Introducing the Bubble Plot</a>.</p>
Comparing groups
<p><img alt="Line plot" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/ef962ecb3d4a0d55a373ea9b4ae25a23/line_plot.png" style="width: 218px; height: 127px; margin: 10px 15px; float: right;" />Which shift produces the most scrap? Is it the same every day of the week, or does the first shift generate the most scrap on Mondays, and the last shift generates the most scrap on Fridays? Which wing of a hospital has the most empty beds? Is that the same for all four seasons of the year, or is the ER most crowded in the winter, while the maternity ward is most crowded in the spring?</p>
<p>These are the kinds of questions you can answer by comparing measurements across groups. The following graphs are well suited for this purpose.</p>
<p><strong>Bar Chart</strong></p>
<p>The following post shows how to use a bar chart to compare the means of different groups: <a href="http://blog.minitab.com/blog/starting-out-with-statistical-software/investigating-starfighters-with-bar-charts3a-function-of-a-variable">Investigating Starfighters with Bar Charts: Function of a Variable</a>.</p>
<p><em>Fun fact: </em>Did you know that Minitab's Bar Chart feature can create both a bar chart and a column chart? By default, Minitab orients the bars vertically. But you can easily flip (or "transpose") the axes to display the bars horizontally. Just double-click an axis and choose <strong>Transpose value and category scales</strong>. (For more helpful information on customizing axes, see <a href="http://support.minitab.com/en-us/minitab/17/topic-library/basic-statistics-and-graphs/graph-options/graph-framework-elements/modifying-graph-scales/" target="_blank">Modifying graph scales</a>.)</p>
<p><strong>Line Plot</strong></p>
<p>Another way to visualize differences between groups is with a line plot, as shown in this post: <a href="http://blog.minitab.com/blog/understanding-statistics/how-to-explore-interactions-with-line-plots">How to Explore Interactions with Line Plots</a>.</p>
Assessing how the parts comprise the whole
<p><img alt="Pie chart" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/6e100b0a0828d55bd6e8b8fece69cdb1/pie_chart.jpg" style="width: 346px; height: 207px; float: left; margin: 10px 15px; border-width: 1px; border-style: solid;" />Are scratches, chips, and blisters all equally likely to mar the surface of a new car that rolls off your assembly line? Or is one defect more common than the others?</p>
<p>Do customers seem to call for help with each of your products equally often? Or does one of the products prove more troublesome than the others?</p>
<p>The following graphs can help you break down a variable into its constituent categories. </p>
<p><strong>Pie Chart, Stacked Bar Chart, Pareto Chart</strong></p>
<p>The post <a href="http://blog.minitab.com/blog/applying-statistics-in-quality-projects/analyzing-qualitative-data-part-1-pareto-pie-and-stacked-bar-charts" target="_blank">Analyzing Qualitative Data, part 1: Pareto, Pie, and Stacked Bar Charts</a> does a good job of comparing the relative merits of these useful plots.</p>
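<p>The numbers behind a Pareto chart are simple to compute: sort the categories by count and accumulate the percentages. Here is a small Python sketch of that tabulation. The function name and the defect log are hypothetical and exist only for illustration.</p>

```python
from collections import Counter


def pareto_table(observations):
    """Rows of (category, count, cumulative percent), most frequent category first."""
    counts = Counter(observations).most_common()
    total = sum(count for _, count in counts)
    rows, running = [], 0
    for category, count in counts:
        running += count
        rows.append((category, count, round(100 * running / total, 1)))
    return rows


# Hypothetical defect log for illustration only.
defects = ["scratch"] * 12 + ["chip"] * 5 + ["blister"] * 3
for row in pareto_table(defects):
    print(row)
# ('scratch', 12, 60.0)
# ('chip', 5, 85.0)
# ('blister', 3, 100.0)
```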
<p><strong>Area Graph</strong></p>
<p>As the post <a href="http://blog.minitab.com/blog/starting-out-with-statistical-software/area-graphs-an-underutilized-tool" target="_blank">Area Graphs: An Underutilized Tool</a> describes, an area graph is a great way to view multiple time series when each series is part of one whole. </p>
Looking at how values are distributed
<p><img alt="Histogram" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/279158ac94f9afdefdfc076d1cf47c48/histogramoutlier.jpg" style="width: 232px; height: 140px; margin: 10px 15px; float: right; border-width: 1px; border-style: solid;" />What is the range of values in my sample? Are the data distributed the same way this time as they were last time? Are there any unusual points that I should investigate?</p>
<p>The following graphs can help you answer these questions. </p>
<p><strong>Histogram and Dotplot</strong></p>
<p>For continuous data, you can use a histogram or a dotplot to look at the distribution. For examples, check out <a href="http://blog.minitab.com/blog/michelle-paret/3-things-a-histogram-can-tell-you" target="_blank">3 Things a Histogram Can Tell You</a> and <a href="http://blog.minitab.com/blog/real-world-quality-improvement/managing-diabetes-with-six-sigma-and-statistics-part-i" target="_blank">Managing Diabetes with Six Sigma and Statistics, Part I</a>.</p>
<p><strong>Bar Chart </strong></p>
<p>For discrete data, you can use a bar chart to look at the relative frequencies for each category. For example, see <a href="http://blog.minitab.com/blog/statistics-and-quality-data-analysis/lost-baggage-its-all-relative">Analyzing Data about Lost Baggage: It’s All Relative (Frequency)</a>.</p>
What Are Your Go-To Graphs?
<p>These are just some possibilities of how you can use the many graphs available in Minitab Statistical Software to learn about your data and help present what you learn to others. You can find many other great examples on the Minitab Blog.</p>
<p>What are the graphs you like to use when presenting different kinds of data? Let us know in the comments! </p>
</div>
Data Analysis, Hypothesis Testing, Quality Improvement, Statistics, Statistics Help
Fri, 11 Nov 2016 13:00:00 +0000
http://blog.minitab.com/blog/data-analysis-and-quality-improvement-and-stuff/picking-the-perfect-plot-to-communicate-your-data
Greg Fox
Common Assumptions about Data (Part 2: Normality and Equal Variance)
http://blog.minitab.com/blog/quality-business/common-assumptions-about-data-part-2-normality-and-equal-variance
<p>In Part 1 of this <a href="http://blog.minitab.com/blog/quality-business/common-assumptions-about-data-part-1-random-samples-and-statistical-independence">blog</a> series, I wrote about how statistical inference uses data from a sample of individuals to reach conclusions about the whole population. That’s a very powerful tool, but you must check your assumptions when you make statistical inferences. Violating any of these assumptions can result in false positives or false negatives, thus invalidating your results. <img alt="Horse and Cart sign" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/1a474c8c-3979-4eba-b70c-1e5a3f1d6601/Image/8230e7c2bc193a831158677a70eb0146/chile_road_sign_po_4.svg" style="width: 101px; height: 101px; margin: 10px 15px; float: right;" /></p>
<p>The common data assumptions are: random samples, independence, normality, equal variance, stability, and that your measurement system is accurate and precise.</p>
<p>I addressed random samples and statistical independence last time. Now let’s consider the assumptions of Normality and Equal Variance.</p>
What Is the Assumption of Normality?
<p>Before you perform a statistical test, you should find out the distribution of your data. If you don’t, you risk selecting an inappropriate statistical test. Many statistical methods start with the assumption your data follow the normal distribution, including the 1- and 2-Sample t tests, Process Capability, I-MR, and ANOVA. If you don’t have normally distributed data, you might use an <a href="http://blog.minitab.com/blog/understanding-statistics/data-not-normal-try-letting-it-be-with-a-nonparametric-hypothesis-test">equivalent non-parametric test</a> based on the median instead of the mean, or try the Box-Cox or Johnson Transformation to transform your non-normal data into a normal distribution.</p>
<p align="center"><img alt="Normal and Skewed Curves" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/1a474c8c-3979-4eba-b70c-1e5a3f1d6601/Image/01451195cce5757849948e3871c28187/1_curves.png" style="border-width: 0px; border-style: solid; width: 554px; height: 179px; margin: 10px 15px;" /></p>
<p>Keep in mind, however, that many statistical tools based on the assumption of normality do not actually <em>require</em> normally distributed data if the sample sizes are at least 15 or 20. If sample sizes are less than 15 and the data are not normally distributed, the p-value may be inaccurate and you should interpret the results with caution.</p>
<p>There are several methods to determine normality in Minitab, and I’ll discuss two of the tools in this post: the Normality Test and the Graphical Summary. </p>
<p>Minitab’s Normality Test will generate a probability plot and perform a one-sample hypothesis test to determine whether the population from which you draw your sample is non-normal. The null hypothesis states that the population is normal. The alternative hypothesis states that the population is non-normal.</p>
<p>Choose <strong>Stat > Basic Statistics > Normality Test</strong></p>
<p align="center"><img alt="Normality Test" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/1a474c8c-3979-4eba-b70c-1e5a3f1d6601/Image/363dab1dcf97061dd0075ab38aae2ee3/2_normality_test.png" style="border-width: 0px; border-style: solid; width: 583px; height: 306px; margin: 10px 15px;" /></p>
<p>If the data are normal, you can expect the following when evaluating the distribution fit for the normality test:</p>
<ul>
<li>The plotted points will roughly form a straight line. Some departure from the straight line at the tails may be okay as long as it stays within the confidence limits.</li>
<li>The plotted points should fall close to the fitted distribution line and pass the “fat pencil” test. Imagine a "fat pencil" lying on top of the fitted line: If it covers all the data points on the plot, the data are probably normal.</li>
<li>The associated Anderson-Darling statistic will be small.</li>
<li>The associated p-value will be larger than your chosen α-level (commonly chosen levels for α include 0.05 and 0.10).</li>
</ul>
<p>The Anderson-Darling statistic is a measure of how far the plot points fall from the fitted line in a probability plot. The statistic is a weighted squared distance from the plot points to the fitted line with larger weights in the tails of the distribution. For a specified data set and distribution, the better the distribution fits the data, the smaller this statistic will be.</p>
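<p>For readers who want to see that weighted squared distance as arithmetic, here is a minimal Python sketch of the usual computational formula for the Anderson-Darling statistic against a normal distribution whose mean and standard deviation are estimated from the sample. It is an illustration, not Minitab's code; the function name and both data sets are hypothetical, and the p-value calculation is left out.</p>

```python
from math import log
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(sample):
    """Anderson-Darling A-squared for a normal fit with estimated mean and std dev."""
    x = sorted(sample)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))
    cdf = [fitted.cdf(v) for v in x]
    # Each term pairs the i-th smallest point with the i-th largest,
    # so departures in either tail carry extra weight.
    total = sum((2 * i + 1) * (log(cdf[i]) + log(1 - cdf[n - 1 - i]))
                for i in range(n))
    return -n - total / n


# Hypothetical samples built from evenly spaced quantiles.
n = 30
probs = [(i + 0.5) / n for i in range(n)]
bell_shaped = [NormalDist().inv_cdf(p) for p in probs]   # normal quantiles
right_skewed = [-log(1 - p) for p in probs]              # exponential quantiles

a_bell = anderson_darling_normal(bell_shaped)
a_skew = anderson_darling_normal(right_skewed)
print(a_bell, a_skew)  # the skewed sample yields the larger statistic
```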
<p>Minitab’s Descriptive Statistics with the Graphical Summary will generate a nice visual display of your data and calculate the Anderson-Darling statistic and p-value. The graphical summary displays four graphs: histogram of data with an overlaid normal curve, boxplot, and 95% confidence intervals for both the mean and the median.</p>
<p>Choose <strong>Stat > Basic Statistics > Graphical Summary</strong></p>
<p align="center"><img alt="Probability Plot" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/1a474c8c-3979-4eba-b70c-1e5a3f1d6601/Image/9681575c2cdb6cfebde643d73a5e5ca0/3_probability_plot.png" style="border-width: 0px; border-style: solid; width: 599px; height: 350px; margin: 10px 15px;" /></p>
<p>When interpreting a graphical summary report for normality: </p>
<ul>
<li>The data will be displayed as a histogram. Look at how your data are distributed (normal or skewed), how the data are spread across the graph, and whether there are outliers.</li>
<li>The associated Anderson-Darling statistic will be small.</li>
<li>The associated p-value will be larger than your chosen α-level (commonly chosen levels for α include 0.05 and 0.10).</li>
</ul>
<p>For some processes, such as time and cycle data, the data will never be normally distributed. Non-normal data are fine for some statistical methods, but make sure your data satisfy the <a href="http://blog.minitab.com/blog/fun-with-statistics/forget-statistical-assumptions-just-check-the-requirements">requirements</a> for your particular analysis.</p>
What Is the Assumption of Equal Variance?
<p>In simple terms, variance refers to the data spread or scatter. Statistical tests, such as analysis of variance (ANOVA), assume that although different samples can come from populations with different means, they have the same variance. Variances are equal (homoscedasticity) when they are approximately the same across the samples. Unequal variances (heteroscedasticity) can affect the Type I error rate and lead to false positives. If you are comparing two or more sample means, as in the 2-Sample t-test and ANOVA, a significantly different variance could overshadow the differences between means and lead to incorrect conclusions. </p>
<p>Minitab offers several methods to test for equal variances. Consult <a href="http://support.minitab.com/en-us/minitab/17/topic-library/modeling-statistics/anova/basics/understanding-test-for-equal-variances/">Minitab Help</a> to decide which method to use based on the type of data you have. You can also use the Minitab Assistant to check this assumption for you. (Tip: When using the Assistant, click “more” to see data collection tips and important information about how Minitab calculates your results.)</p>
<p align="center"><img alt="Hypothesis Assistant" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/1a474c8c-3979-4eba-b70c-1e5a3f1d6601/Image/cd958e1efe31a3a0c3acdc818971100c/4_hypothesis_assistant.png" style="border-width: 0px; border-style: solid; width: 402px; height: 318px; margin: 10px 15px;" /></p>
<p>After the analysis is performed, check the Diagnostic Report for the test interpretation and the Report Card for alerts to unusual data points or assumptions that were not met. (Tip: When performing the 2-Sample t test and ANOVA, the Assistant takes a more conservative approach and uses calculations that do not depend on the assumption of equal variance.)</p>
<p><img alt="Assistant Reports" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/1a474c8c-3979-4eba-b70c-1e5a3f1d6601/Image/a1d56fd284c40360bc62f96e04e69e59/5_assistant_reports.png" style="border-width: 0px; border-style: solid; width: 656px; height: 245px; margin: 10px 15px;" /></p>
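<p>One well-known calculation that does not depend on equal variances is Welch's t-test. The Python sketch below shows its test statistic and the Welch-Satterthwaite degrees of freedom. Treat it as an illustration of the idea only: that it matches the Assistant's calculation in every detail is an assumption, and the function name and data are hypothetical.</p>

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample; variances are never pooled.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical samples with visibly different spreads.
tight = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2]
wide = [21.5, 18.9, 22.4, 19.6, 23.0, 20.4, 21.8]
t_stat, dof = welch_t(tight, wide)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

<p>The degrees of freedom always land between the smaller sample's n − 1 and the pooled n<sub>1</sub> + n<sub>2</sub> − 2, shrinking as the variances become more unequal.</p>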
The Real Reason You Need to Check the Assumptions
<p>You will be putting a lot of time and effort into collecting and analyzing data. After all the work you put into the analysis, you want to be able to reach correct conclusions. Some analyses are robust to departures from these assumptions, but take the safe route and check! You want to be confident that you can tell whether observed differences between data samples are simply due to chance, or if the populations are indeed different! </p>
<p>It’s easy to put the cart before the horse and just plunge in to the data collection and analysis, but it’s much wiser to take the time to understand which data assumptions apply to the statistical tests you will be using, and plan accordingly.</p>
<p>In my next blog post, I will review the <a href="http://blog.minitab.com/blog/quality-business/common-assumptions-about-data-part-3-stability-and-measurement-systems">common assumptions about stability and the measurement system</a>. </p>
Data Analysis, Hypothesis Testing, Statistics, Statistics Help, Stats
Mon, 07 Nov 2016 15:36:00 +0000
http://blog.minitab.com/blog/quality-business/common-assumptions-about-data-part-2-normality-and-equal-variance
Bonnie K. Stone
What Are T Values and P Values in Statistics?
http://blog.minitab.com/blog/statistics-and-quality-data-analysis/what-are-t-values-and-p-values-in-statistics
<p><img alt="" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/6f4053a89257952fef0b9998547dffe2/tweedle_tweedledum.jpg" style="line-height: 20.8px; float: right; width: 248px; height: 255px; margin: 10px 15px;" /></p>
<p>If you’re not a statistician, looking through statistical output can sometimes make you feel a bit like <em>Alice in Wonderland</em>. Suddenly, you step into a fantastical world where strange and mysterious phantasms appear out of nowhere. </p>
<p>For example, consider the T and P in your t-test results.</p>
<p>“Curiouser and curiouser!” you might exclaim, like Alice, as you gaze at your output.</p>
<p><img alt="One-Sample T test output" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/1e5a4c064f43f19169121222402e4560/t_test_results_one_sided.jpg" style="width: 467px; height: 121px;" /></p>
<p>What are these values, really? Where do they come from? Even if you’ve used the p-value to interpret the statistical significance of your results umpteen times, its actual origin may remain murky to you.</p>
T & P: The Tweedledee and Tweedledum of a T-test
<p>T and P are inextricably linked. They go arm in arm, like Tweedledee and Tweedledum. Here's why.</p>
<p>When you perform a t-test, you're usually trying to find evidence of a significant difference between population means (2-sample t) or between the population mean and a hypothesized value (1-sample t). <a href="http://blog.minitab.com/blog/statistics-and-quality-data-analysis/what-is-a-t-test-and-why-is-it-like-telling-a-kid-to-clean-up-that-mess-in-the-kitchen">The t-value measures the size of the difference relative to the variation in your sample data</a>. Put another way, T is simply the calculated difference represented in units of standard error. The greater the magnitude of T (it can be either positive or negative), the greater the evidence <em>against </em>the null hypothesis that there is no difference. The closer T is to 0, the weaker the evidence of a significant difference.</p>
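<p>To see "difference in units of standard error" as arithmetic, here is a short Python sketch of the 1-sample t-value. The function name and the sample are hypothetical; the data are not the ones behind the output shown earlier.</p>

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, hypothesized_mean):
    """t = (sample mean - hypothesized mean) / standard error of the mean."""
    standard_error = stdev(sample) / sqrt(len(sample))
    return (mean(sample) - hypothesized_mean) / standard_error


# Hypothetical sample, tested against a hypothesized mean of 5.
sample = [5.2, 5.9, 4.8, 6.1, 5.5, 5.0, 6.3, 5.7]
print(one_sample_t(sample, 5))
```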
<p>Remember, the t-value in your output is calculated from only one sample from the entire population. If you took repeated random samples of data from the same population, you'd get slightly different t-values each time, due to random sampling error (which is really not a mistake of any kind–it's just the random variation expected in the data).</p>
<p>How different could you expect the t-values from many random samples from the same population to be? And how does the t-value from your sample data compare to those expected t-values?</p>
<p>You can use a t-distribution to find out.</p>
Using a t-distribution to calculate probability
<p>For the sake of illustration, assume that you're using a 1-sample t-test to determine whether the population mean is greater than a hypothesized value, such as 5, based on a sample of 20 observations, as shown in the above t-test output.</p>
<ol>
<li>In Minitab, choose <strong>Graph > Probability Distribution Plot</strong>.</li>
<li>Select <strong>View Probability</strong>, then click <strong>OK</strong>.</li>
<li>From <strong>Distribution</strong>, select <strong>t</strong>.</li>
<li>In <strong>Degrees of freedom</strong>, enter <em>19</em>. (For a 1-sample t test, the degrees of freedom equals the sample size minus 1).</li>
<li>Click <strong>Shaded Area</strong>. Select <strong>X Value</strong>. Select <strong>Right Tail</strong>.</li>
<li>In <strong>X Value</strong>, enter <em>2.8</em> (the t-value), then click <strong>OK</strong>.</li>
</ol>
<p><img alt="" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/bc5183a42a169d45632fd4f6c0b153b3/distribution_plot_t_2.8" style="width: 576px; height: 384px;" /></p>
<p>The highest part (peak) of the distribution curve shows you where you can expect most of the t-values to fall. Most of the time, you’d expect to get t-values close to 0. That makes sense, right? Because if you randomly select representative samples from a population, the mean of most of those random samples from the population should be close to the overall population mean, making their differences (and thus the calculated t-values) close to 0.</p>
T values, P values, and poker hands
<p>T values of larger magnitudes (either negative or positive) are less likely. The far left and right "tails" of the distribution curve represent instances of obtaining extreme values of t, far from 0. For example, the shaded region represents the probability of obtaining a t-value of 2.8 or greater. Imagine a magical dart that could be thrown to land randomly anywhere under the distribution curve. What's the chance it would land in the shaded region? The calculated probability is 0.005712... which rounds to 0.006... which is... the p-value obtained in the t-test results! <img alt="" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/5633b267494c2017d6d7c7544247d57d/poker_picture.jpg" style="float: right; width: 200px; height: 164px; margin: 10px 15px;" /></p>
<p>In other words, the probability of obtaining a t-value of 2.8 or higher, when sampling from the same population (here, a population with a hypothesized mean of 5), is approximately 0.006.</p>
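<p>If you'd like to check that shaded area without Minitab, the Python sketch below integrates the t-distribution's density numerically with Simpson's rule. It uses only the standard library; the function names are hypothetical, and the approach is a simple stand-in for the exact routines a statistics package would use.</p>

```python
from math import gamma, pi, sqrt


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coefficient = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coefficient * (1 + t * t / df) ** (-(df + 1) / 2)


def right_tail_probability(t_value, df, steps=2000):
    """P(T >= t_value) for t_value >= 0, using Simpson's rule on [0, t_value]."""
    if t_value < 0:
        raise ValueError("this sketch only handles t_value >= 0")
    h = t_value / steps  # steps must be even for Simpson's rule
    area = t_pdf(0, df) + t_pdf(t_value, df)
    for k in range(1, steps):
        area += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area *= h / 3
    # The curve is symmetric about 0, so half the total area lies to the right of 0.
    return 0.5 - area


p = right_tail_probability(2.8, df=19)
print(round(p, 3))  # 0.006
```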
<p>How likely is that? Not very! For comparison, the probability of being dealt 3-of-a-kind in a 5-card poker hand is over three times as high (≈ 0.021).</p>
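<p>The poker figure comes from straightforward counting, sketched here in Python with hypothetical variable names: choose the rank and suits of the three matching cards, then two other cards of different ranks.</p>

```python
from math import comb

# Three of a kind: pick the rank and 3 of its 4 suits, then two other
# distinct ranks (so no pair and no fourth card), each in any of 4 suits.
three_of_a_kind_hands = comb(13, 1) * comb(4, 3) * comb(12, 2) * 4 ** 2
all_hands = comb(52, 5)
p_three_of_a_kind = three_of_a_kind_hands / all_hands

print(three_of_a_kind_hands, all_hands, round(p_three_of_a_kind, 3))
# 54912 2598960 0.021
```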
<p>Given that the probability of obtaining a t-value this high or higher when sampling from this population is so low, what’s more likely? It’s more likely this sample doesn’t come from this population (with the hypothesized mean of 5). It's much more likely that this sample comes from a different population, one with a mean greater than 5.</p>
<p>To wit: Because the p-value is very low (< alpha level), you reject the null hypothesis and conclude that there's a statistically significant difference.</p>
<p>In this way, T and P are inextricably linked. Consider them simply different ways to quantify the "extremeness" of your results under the null hypothesis. You can’t change the value of one without changing the other.</p>
<p>The larger the absolute value of the t-value, the smaller the p-value, and the greater the evidence against the null hypothesis. (You can verify this by entering lower and higher t-values for the t-distribution in step 6 above.)</p>
Try this two-tailed follow up...
<p>The t-distribution example shown above is based on a one-tailed t-test to determine whether the mean of the population is greater than a hypothesized value. Therefore the t-distribution example shows the probability associated with the t-value of 2.8 only in one direction (the right tail of the distribution).</p>
<p>How would you use the t-distribution to find the p-value associated with a t-value of 2.8 for a two-tailed t-test (in both directions)?</p>
<p><strong>Hint:</strong> In Minitab, adjust the options in step 5 to find the probability for both tails. If you don't have a copy of Minitab, download a free <a href="http://www.minitab.com/en-us/products/minitab/free-trial/" target="_blank">30-day trial version</a>.</p>
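<p>Another way to approach the two-tailed question is to simulate the "magical dart." The sketch below repeatedly draws samples from a population whose mean really is 5, computes the t-value for each, and counts how often it lands at least 2.8 away from 0 in either direction. The sample size of 20, the normal population, and the function names are assumptions made for illustration. Because the t-distribution is symmetric, the result should come out at roughly twice the one-tailed probability.</p>

```python
import math
import random


def one_sample_t(sample, hypothesized_mean):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (mean - hypothesized_mean) / math.sqrt(variance / n)


def simulated_two_tailed_p(t_observed, n, trials=20_000, seed=1):
    """Fraction of null-hypothesis samples with |t| >= |t_observed|."""
    rng = random.Random(seed)
    mu, sigma = 5.0, 1.0  # sigma is arbitrary: the t-value does not depend on it
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(one_sample_t(sample, mu)) >= abs(t_observed):
            hits += 1
    return hits / trials


# Assumption: 20 observations per sample (19 degrees of freedom).
print(simulated_two_tailed_p(2.8, n=20))
```

<p>A simulation only approximates the exact area under the curve, so expect the printed value to wobble a little if you change the seed or the number of trials.</p>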
Hypothesis Testing
Fri, 04 Nov 2016 12:10:00 +0000
http://blog.minitab.com/blog/statistics-and-quality-data-analysis/what-are-t-values-and-p-values-in-statistics
Patrick Runkel
Problems Using Data Mining to Build Regression Models, Part Two
http://blog.minitab.com/blog/adventures-in-statistics/problems-using-data-mining-to-build-regression-models-part-two
<p>Data mining can be helpful in the exploratory phase of an analysis. If you're in the early stages and you're just figuring out which predictors are potentially correlated with your response variable, data mining can help you identify candidates. However, there are problems associated with using data mining to select variables.</p>
<p>In my <a href="http://blog.minitab.com/blog/adventures-in-statistics/problems-using-data-mining-to-build-regression-models" target="_blank">previous post</a>, we used data mining to settle on the following model and graphed one of the relationships between the response (C1) and a predictor (C7). It all looks great! The only problem is that all of these data are randomly generated! No true relationships are present. </p>
<p style="margin-left: 40px;"><img alt="Regression output for data mining example" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/742d7708-efd3-492c-abff-6044d78e3bbd/Image/24e98167e2dfd848b346292af371acf3/regression_swo.png" style="width: 364px; height: 278px;" /></p>
<p style="margin-left: 40px;"><img alt="Scatter plot for data mining example" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/742d7708-efd3-492c-abff-6044d78e3bbd/Image/6e4dfb991b33031738756d4b2d1c77e4/scatterplot.png" style="width: 576px; height: 384px;" /></p>
<p>If you didn't already know there was no true relationship between these variables, these results could lead you to a very inaccurate conclusion.</p>
<p>Let's explore how these problems happen, and how to avoid them.</p>
Why <em>Do </em>These Problems Occur with Data Mining?
<p>The problem with data mining is that you fit many different models, trying lots of different variables, and you pick your final model based mainly on statistical significance, rather than being guided by theory.</p>
<p>What's wrong with that approach? The problem is that every statistical test you perform has a chance of a false positive. A false positive in this context means that the <a href="http://blog.minitab.com/blog/adventures-in-statistics/how-to-correctly-interpret-p-values" target="_blank">p-value</a> is statistically significant but there really is no relationship between the variables at the population level. If you set the <a href="http://blog.minitab.com/blog/adventures-in-statistics/understanding-hypothesis-tests:-significance-levels-alpha-and-p-values-in-statistics" target="_blank">significance level at 0.05</a>, you can expect that in 5% of the cases where the null hypothesis is true, you'll have a false positive.</p>
<p>Because of this false positive rate, if you analyze many different models with many different variables you will inevitably find false positives. And if you're guided mainly by statistical significance, you'll leave the false positives in your model. If you keep going with this approach, you'll fill your model with these false positives. That’s exactly what happened in our example. We had 100 candidate predictor variables and the stepwise procedure literally dredged through hundreds and hundreds of potential models to arrive at our final model.</p>
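<p>To get a feel for how quickly false positives pile up, here is a small sketch. It treats each test as independent and every null hypothesis as true, which is a simplification: the tests inside a stepwise procedure are not independent. Under a true null, a p-value is equally likely to fall anywhere between 0 and 1, so the simulation simply draws uniform random numbers. The function names are hypothetical.</p>

```python
import random

ALPHA = 0.05


def chance_of_any_false_positive(num_tests, alpha=ALPHA):
    """P(at least one false positive) across independent tests of true nulls."""
    return 1 - (1 - alpha) ** num_tests


def count_false_positives(num_tests, alpha=ALPHA, rng=random):
    """Simulate tests of true nulls: each p-value is uniform on [0, 1)."""
    return sum(rng.random() < alpha for _ in range(num_tests))


for k in (1, 10, 100):
    print(k, round(chance_of_any_false_positive(k), 3))

# Repeat the "100 candidate predictors" scenario many times.
rng = random.Random(42)
counts = [count_false_positives(100, rng=rng) for _ in range(1000)]
print(sum(counts) / len(counts))  # average number of "significant" predictors
```

<p>With 100 candidate predictors and no real relationships, you should expect about five to look significant purely by chance, and it is almost certain that at least one will.</p>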
<p>As we’ve seen, data mining problems can be hard to detect. The numeric results and graph all look great. However, these results don’t represent true relationships but instead are chance correlations that are bound to occur with enough opportunities.</p>
<p>If I had to name my favorite R-squared, it would be <a href="http://blog.minitab.com/blog/adventures-in-statistics/multiple-regession-analysis-use-adjusted-r-squared-and-predicted-r-squared-to-include-the-correct-number-of-variables" target="_blank">predicted R-squared</a>, without a doubt. However, even predicted R-squared can't detect all problems. Ultimately, even though the predicted R-squared is moderate for our model, the ability of this model to predict accurately for an entirely new data set is practically zero.</p>
Theory, the Alternative to Data Mining
<p>Data mining can have a role in the exploratory stages of an analysis. However, for all variables that you identify through data mining, you should perform a confirmation study using newly collected data to verify the relationships in the new sample. Failure to do so can be very costly. Just imagine if we had made decisions based on the model above!</p>
<p>An alternative to data mining is to use theory as a guide in terms of both the models you fit and the evaluation of your results. Look at what others have done and incorporate those findings when building your model. Before beginning the regression analysis, develop an idea of what the important variables are, along with their expected relationships, coefficient signs, and effect magnitudes.</p>
<p>Building on the results of others makes it easier both to collect the correct data and to specify the best regression model without the need for data mining. The difference is the process by which you fit and evaluate the models. When you’re guided by theory, you reduce the number of models you fit and you assess properties beyond just statistical significance.</p>
<p>Theoretical considerations should not be discarded based solely on statistical measures.</p>
<ul>
<li>Compare the coefficient signs to theory. If any of the signs contradict theory, investigate and either change your model or explain the inconsistency.</li>
<li>Use <a href="http://www.minitab.com/en-us/products/minitab/" target="_blank">Minitab statistical software</a> to create factorial plots based on your model to see if all the effects match theory.</li>
<li>Compare the <a href="http://blog.minitab.com/blog/adventures-in-statistics/regression-analysis-how-do-i-interpret-r-squared-and-assess-the-goodness-of-fit" target="_blank">R-squared</a> for your study to those of similar studies. If your R-squared is very different from those in similar studies, it's a sign that your model may have a problem.</li>
</ul>
<p>If you’re interested in learning more about these issues, read my post about <a href="http://blog.minitab.com/blog/adventures-in-statistics/beware-of-phantom-degrees-of-freedom-that-haunt-your-regression-models">how using too many <em>phantom</em> degrees of freedom is related to data mining problems</a>.</p>
Data Analysis, Hypothesis Testing, Learning, Regression Analysis, Statistics, Statistics Help
Wed, 19 Oct 2016 12:00:00 +0000
http://blog.minitab.com/blog/adventures-in-statistics/problems-using-data-mining-to-build-regression-models-part-two
Jim Frost
Why Shrewd Experts "Fail to Reject the Null" Every Time
http://blog.minitab.com/blog/understanding-statistics/why-shrewd-experts-fail-to-reject-the-null-every-time
<p><img alt="nulls angels: the toughest statisticians around!" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/d2c0571a-acbd-48c7-84f4-222276c293fe/Image/509959f8406d59b3bb31f686aeb3b6b0/nulls_angels.jpg" style="margin: 10px 15px; float: right; width: 175px; height: 198px;" />I watched an old <a href="https://en.wikipedia.org/wiki/The_Wild_Angels" target="_blank">motorcycle flick from the 1960s</a> the other night, and I was struck by the bikers' slang. They had a language all their own. Just like statisticians, whose manner of speaking often confounds those who aren't hep to the lingo of data analysis.</p>
<p>It got me thinking...what if there were an all-statistician biker gang? Call them the Nulls Angels. Imagine them in their colors, tearing across the countryside, analyzing data and asking the people they encounter on the road about whether they "fail to reject the null hypothesis."</p>
<p>If you point out how strange that phrase sounds, the Nulls Angels will <em>know</em> you're not cool...and not very aware of statistics.</p>
<p>Speaking purely as an editor, I acknowledge that "failing to reject the null hypothesis" <em>is</em> cringe-worthy. "Failing to reject" seems like an overly complicated equivalent to <em>accept</em>. At minimum, it's clunky phrasing.</p>
<p>But it turns out those rough-and-ready statisticians in the Nulls Angels have good reason to talk like that. From a <em>statistical</em> perspective, it's undeniably accurate—and replacing "failure to reject" with "accept" would just be wrong.</p>
What <em>Is </em>the Null Hypothesis, Anyway?
<p>Hypothesis tests include one- and two-sample t-tests, tests for association, tests for normality, and many more. (All of these tests are available under the <strong>Stat</strong><span> menu in Minitab <a href="http://www.minitab.com">statistical software</a>. Or, if you want a little more <a href="http://www.minitab.com/en-us/products/minitab/assistant">statistical guidance</a>, the Assistant can lead you through common hypothesis tests step-by-step.)</span></p>
<p>A hypothesis test examines two propositions: the null hypothesis (or H0 for short), and the alternative (H1). The <em>alternative </em>hypothesis is what we hope to support. We presume that the null hypothesis is true, unless the data provide sufficient evidence that it is not.</p>
<p>You've heard the phrase "Innocent until proven guilty." That means the defendant's innocence is taken for granted until guilt is proved. In statistics, the null hypothesis is taken for granted until the alternative is proved true.</p>
So Why Do We "Fail to Reject" the Null Hypothesis?
<p>That brings up the issue of "proof."</p>
<p>The degree of statistical evidence we need in order to “prove” the alternative hypothesis is the <a href="http://blog.minitab.com/blog/michelle-paret/alphas-p-values-confidence-intervals-oh-my">confidence level</a>. The confidence level is 1 minus our risk of committing a Type I error, which occurs when you incorrectly reject a null hypothesis that's true. Statisticians call this risk alpha, and also refer to it as the significance level. The typical alpha of 0.05 corresponds to a 95% confidence level: we're accepting a 5% chance of rejecting the null even if it is true. (In life-or-death matters, we might <a href="http://blog.minitab.com/blog/statistics-and-quality-data-analysis/alpha-male-vs-alpha-female">lower the risk of a Type I error to 1% or less</a>.)</p>
<p>Regardless of the alpha level we choose, any hypothesis test has only two possible outcomes:</p>
<ol>
<li><strong>Reject the null hypothesis</strong> and conclude that the alternative hypothesis is true at the 95% confidence level (or whatever level you've selected).<br />
</li>
<li><strong>Fail to reject the null hypothesis</strong> and conclude that <em>not</em> enough evidence is available to suggest the null is false at the 95% confidence level.</li>
</ol>
<p>We often use a <a href="http://blog.minitab.com/blog/understanding-statistics/three-things-the-p-value-cant-tell-you-about-your-hypothesis-test">p-value</a> to decide if the data support the null hypothesis or not. If the test's p-value is less than our selected alpha level, we reject the null. Or, as statisticians say, "When the p-value's low, the null must go."</p>
<p>This still doesn't explain <em>why</em> a statistician won't "accept the null hypothesis." Here's the bottom line: failing to reject the null hypothesis does not prove the null hypothesis <em>is</em> true. That's because a hypothesis test does not determine <em>which</em> hypothesis is true, or even which is most likely: it <em>only</em> assesses whether evidence exists to reject the null hypothesis.</p>
<img alt="&quot;My hypothesis is Null until proven Alternative, sir!&quot;" src="http://cdn2.content.compendiumblog.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/d2c0571a-acbd-48c7-84f4-222276c293fe/Image/a07b85370986a3dd126ac4d021775d13/trial.jpg" style="border-width: 1px; border-style: solid; margin: 10px 15px; float: right; width: 300px; height: 200px;" />
"Null Until Proved Alternative"
<p>Hark back to "innocent until proven guilty." As the data analyst, you are the judge. The hypothesis test is the trial, and the null hypothesis is the defendant. The alternative hypothesis is the prosecution, which needs to make its case <em>beyond a reasonable doubt</em> (say, with 95% certainty).</p>
<p>If the trial evidence does not show the defendant is guilty, neither has it proved that the defendant <em>is</em> innocent. However, based on the available evidence, you can't reject that <em>possibility</em>. So how would you announce your verdict?</p>
<p>"Not guilty."</p>
<p>That phrase is perfect: "Not guilty" doesn't say the defendant <em>is</em> innocent, because that has not been proved. It just says the prosecution couldn't convince the judge to abandon the assumption of innocence.</p>
<p>So "failure to reject the null" is the statistical equivalent of "not guilty." In a trial, the burden of proof falls to the prosecution. When analyzing data, the entire burden of proof falls to your sample data. "Not guilty" does not mean "innocent," and "failing to reject" the null hypothesis is quite distinct from "accepting" it. </p>
<p>So if a group of marauding statisticians in their Nulls Angels leathers ever asks, keep yourself in their good graces, and show that you know "failing to reject the null" is not "accepting the null."</p>
Fun Statistics, Hypothesis Testing, Statistics, Statistics Help
Mon, 03 Oct 2016 12:00:00 +0000
http://blog.minitab.com/blog/understanding-statistics/why-shrewd-experts-fail-to-reject-the-null-every-time
Eston Martz
Descriptive vs. Inferential Statistics: When Is a P-value Superfluous?
http://blog.minitab.com/blog/statistics-and-quality-data-analysis/descriptive-vs-inferential-statistics-when-is-a-p-value-superfluous
<p>True or false: When comparing a parameter for two sets of measurements, you should always use a hypothesis test to determine whether the difference is statistically significant.</p>
<p>The answer? (<em>drumroll...</em>) True!</p>
<p>...and False!</p>
<p>To understand this paradoxical answer, you need to keep in mind the difference between samples, populations, and descriptive and inferential statistics. </p>
Descriptive Statistics and Populations
<p>Consider the fictional countries of Glumpland and Dolmania.</p>
<p style="text-align: center;"><img alt="Welcome to Glumpland!" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/c1f88e0e6d3e4e55684392ec5a8069e8/glumpland.jpg" style="width: 350px; height: 232px;" /></p>
<img alt="wkshet" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/47e5470dd8123218763ac3666f64bbdd/glumpland_dolmania_wkshet.jpg" style="line-height: 20.8px; width: 222px; height: 579px; float: right;" />
<p>The population of Glumpland is 8,442,012. The population of Dolmania is 6,977,201. For each country, the age of every citizen (to the nearest tenth) <a href="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/080981611ba11403dc8fde411e81d150/glumpland_and_dolmania_ages.mpj">is recorded in a cell of a Minitab worksheet</a>. </p>
<p>Using <strong>Stat > Basic Statistics > Display Descriptive Statistics</strong> we can quickly calculate the mean age of each country.</p>
<p style="margin-left: 40px;"><img alt="desc stats" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/1a791dd23ba85673193f20c2c9971fa4/mean_age_glump_and_dol.jpg" style="width: 316px; height: 96px;" /></p>
<p>It looks like Dolmanians are, on average, more youthful than Glumplanders. But is this difference in means statistically significant?</p>
<p>To find out, we might be tempted to evaluate these data using a <span><a href="http://blog.minitab.com/blog/adventures-in-statistics/understanding-t-tests%3A-1-sample%2C-2-sample%2C-and-paired-t-tests">2-sample t-test</a></span>.</p>
<p>Except for one thing: there's absolutely no point in doing that.</p>
<p>That's because these calculated means <em>are</em> the means of the entire populations. So we already know that the population means differ.</p>
<p>Another example. Suppose a baseball player gets 213 hits in 680 at bats in 2015, and 178 hits in 532 at bats in 2016.</p>
<p>Would you need a 2-proportions test to determine whether the difference in batting averages (.313 vs .335) is statistically significant? Of course not.</p>
<p>You've already calculated the proportions using all the data for the entire two seasons. There's nothing more to extrapolate. And yet you often see a hypothesis test applied in this type of situation, in the mistaken belief that if there's no p-value, the results aren't "solid" or "statistical" enough.</p>
<p>But if you've collected every possible piece of data for a population, that's about as solid as you can get!</p>
Inferential Statistics and Random Samples
<p>Now suppose that draconian budget cuts have made it infeasible to track and record the age of every resident in Glumpland and Dolmania. <span style="line-height: 1.6;">What can they do? </span></p>
<p><span style="line-height: 1.6;">Quite a lot, actually. They can apply inferential statistics, which is based on random sampling, to make reliable estimates without those millions of data values they don't have.</span></p>
<p>To see how it works, use <strong>Calc > Random Data > Sample from columns</strong> in Minitab. Randomly sample 50 values from the 8,442,012 values in column C1, which includes the ages of the entire population of Glumpland. Then use descriptive statistics to calculate the mean of the sample.</p>
<p>Here are the results for one random sample of 50:</p>
<p style="margin-left: 40px;"><strong>Descriptive Statistics: GPLND (50)</strong><br />
<span style="line-height: 1.6;">Variable Mean</span><br />
<span style="line-height: 1.6;">GPLND(50) 52.37</span></p>
<p>The sample mean, 52.37, is slightly less than the true mean age of 53 for the entire population of Glumpland. What about another random sample of 50?</p>
<p style="margin-left: 40px;"><strong>Descriptive Statistics: GPLND (50) </strong><br />
<span style="line-height: 1.6;">Variable Mean</span><br />
<span style="line-height: 1.6;">GPLND(50) 54.11</span></p>
<p>Hmm. This sample mean of 54.11 slightly <em>overshoots</em> the true population mean of 53.</p>
<p>Even though the sample estimates are in the ballpark of the true population mean, we're seeing some variation. <span style="line-height: 1.6;">How much variation can we expect? Using descriptive statistics alone, we have no inkling of how "close" a sample estimate might be to the truth. </span></p>
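<p>You can reproduce this kind of sample-to-sample wobble outside Minitab. The sketch below uses a made-up population of 100,000 ages as a stand-in for the Glumpland worksheet column (the real data aren't reproduced here, and the normal shape and spread are assumptions), then draws several random samples of 50 without replacement and compares each sample mean with the population mean.</p>

```python
import random
import statistics

rng = random.Random(2016)

# Hypothetical stand-in for the Glumpland column: 100,000 ages centered
# near 53, clamped to a plausible range. Not the data from the worksheet.
population = [min(max(rng.gauss(53, 18), 0.0), 105.0) for _ in range(100_000)]
population_mean = statistics.fmean(population)

# Draw five random samples of 50 values each, without replacement.
sample_means = [statistics.fmean(rng.sample(population, 50)) for _ in range(5)]

print(f"population mean: {population_mean:.2f}")
for m in sample_means:
    print(f"sample mean: {m:.2f}  (off by {m - population_mean:+.2f})")
```

<p>Each sample mean lands near the population mean, but none of them tells you by itself how far off it might be.</p>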
Enter...the Confidence Interval
<p>To quantify the precision of a sample estimate for the population, we can use a powerful tool in inferential statistics: the confidence interval.</p>
<p>Suppose you take random samples of size 5, 10, 20, 50, and 100 from Glumpland and Dolmania using <strong>Calc > Random Data > Sample from columns</strong>. Then use <strong>Graph > Interval Plot > Multiple Ys</strong> to display the 95% confidence intervals for the mean of each sample.</p>
<p>Here's what the interval plots look like for the random samples in my worksheet.</p>
<p style="margin-left: 40px;"><img alt="interval plot Glumpland" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/262031cc398ee9d48031fe1f43b38bdf/interval_plot_of_glumpland.jpg" style="line-height: 20.8px; width: 576px; height: 384px;" /></p>
<p style="margin-left: 40px;"><img alt="Interval plot Dolmania" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/75440d94eaff64a63e338b480029945b/interval_plot_of_dolmania.jpg" style="width: 576px; height: 384px;" /></p>
<p>Your plots will look different based on your random samples, but you should notice a similar pattern: The sample mean estimates (the blue dots) tend to vary more from the population mean as the sample sizes decrease. To compensate for this, the intervals "stretch out" more and more, to ensure the same 95% overall probability of "capturing" the true population mean.</p>
<p>The larger samples produce narrower intervals. In fact, using only 50-100 data values, we can closely estimate the mean of over 8.4 million values, and get a general sense of how precise the estimate is likely to be. That's the incredible power of random sampling and inferential statistics!</p>
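<p>The narrowing of the intervals follows directly from the formula for a 95% confidence interval of the mean: sample mean ± t × s / √n. Here is a minimal sketch of that calculation for the same five sample sizes. The critical t-values are taken from a standard t-table, and the population is a hypothetical stand-in, not the worksheet data from this post.</p>

```python
import math
import random
import statistics

# Two-sided 95% critical values of Student's t for df = n - 1,
# copied from a standard t-table (keys are sample sizes).
T_CRIT_95 = {5: 2.776, 10: 2.262, 20: 2.093, 50: 2.010, 100: 1.984}


def mean_ci_95(sample):
    """Return (low, high) of the 95% confidence interval for the mean."""
    n = len(sample)
    half_width = T_CRIT_95[n] * statistics.stdev(sample) / math.sqrt(n)
    center = statistics.fmean(sample)
    return center - half_width, center + half_width


rng = random.Random(7)
# Hypothetical population of ages; an assumption for illustration only.
population = [min(max(rng.gauss(53, 18), 0.0), 105.0) for _ in range(100_000)]

for n in T_CRIT_95:
    low, high = mean_ci_95(rng.sample(population, n))
    print(f"n = {n:>3}: 95% CI = ({low:5.1f}, {high:5.1f}), width = {high - low:4.1f}")
```

<p>Two things shrink the interval as n grows: the critical t-value gets smaller, and the standard error s / √n gets smaller much faster.</p>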
<p>To display side-by-side confidence intervals of the mean estimates for Glumpland and Dolmania, you can use an interval plot with groups.</p>
<p style="margin-left: 40px;"><img alt="interval plot side by side" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/9e6348c87befdaf6434dbe80e8257516/interval_plot_of_age_side_by_side.jpg" style="width: 576px; height: 384px;" /></p>
<p>Now, you might be tempted to use these results to infer whether there's a statistically significant difference in the mean age of the populations of Glumpland and Dolmania. But don't. Confidence intervals can be misleading for that purpose.</p>
<p>For that, we need another powerful tool of inferential statistics...</p>
Enter...the hypothesis test and p-value
<p>The 2-sample t-test is used to determine whether there is a statistically significant difference in the means of the populations from which the two random samples were drawn. The following table shows the t-test results for each pair of same-sized samples from Glumpland and Dolmania. As the sample size increases, notice what happens to the p-value and the confidence interval for the difference between the population means.</p>
<p style="margin-left: 40px;"><img alt="t tests" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/ba6a552e-3bc0-4eed-9c9a-eae3ade49498/Image/7c1bf45756a7fb621094086e5350fef9/2_sample_t_test.jpg" style="width: 526px; height: 757px;" /></p>
<p>Again, the confidence intervals tend to get wider as the samples get smaller. With smaller samples, we're less certain of the precision of the estimate for the difference.</p>
<p>In fact, only for the two largest random samples (N=50 and N=100) is the p-value less than a 0.05 level of significance, allowing us to conclude that the mean ages of Glumplanders and Dolmanians are statistically different. For the three smallest samples (N=20, N=10, N=5), the p-value is greater than 0.05, and the confidence interval for each of these small samples includes 0. Therefore, we cannot conclude that there is a difference in the population means.</p>
<p>But remember, we already know that the true population means actually <em>do</em> differ by 5.4 years. We just can't statistically "prove" it with the small samples. That's why statisticians bristle when someone says, "The p-value is not less than 0.05. Therefore, there's no significant difference between the groups." There might very well be. So it's safer to say, especially with small samples, "<em>we don't have enough evidence </em>to conclude that there's a significant difference between the groups."</p>
<p>It's not just a matter of nit-picky semantics. It's simply the truth, as you can see when you take random samples of various sizes from the same known populations and test them for a difference.</p>
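<p>A quick simulation makes the same point. The sketch below repeatedly draws pairs of samples from two normal populations whose means truly differ by 5.4 years and records how often a 2-sample t-test rejects the null at the 0.05 level. Several things here are assumptions for illustration: the standard deviation of 15, the normal shape, and the use of a pooled-variance t-test with table critical values (Minitab's own 2-sample t output may use a different variant).</p>

```python
import math
import random

# Two-sided 5% critical values of Student's t for df = 2n - 2,
# copied from a standard t-table (keys are per-group sample sizes).
T_CRIT = {5: 2.306, 10: 2.101, 20: 2.024, 50: 1.984, 100: 1.972}


def pooled_t(a, b):
    """2-sample t statistic with pooled variance (equal sample sizes)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / (n - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (n - 1)
    pooled_var = (var_a + var_b) / 2
    return (mean_a - mean_b) / math.sqrt(pooled_var * 2 / n)


def rejection_rate(n, true_diff=5.4, sd=15.0, trials=1000, seed=0):
    """Share of simulated studies in which the test rejects the null."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(50.0 + true_diff, sd) for _ in range(n)]
        b = [rng.gauss(50.0, sd) for _ in range(n)]
        if abs(pooled_t(a, b)) > T_CRIT[n]:
            rejections += 1
    return rejections / trials


for n in T_CRIT:
    print(f"n = {n:>3} per group: rejected the null in {rejection_rate(n):.0%} of runs")
```

<p>The difference between the populations is real in every run, yet with small samples the test usually fails to detect it. That is why "not enough evidence" is the accurate wording.</p>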
Wrap-up
<p>If you have a random sample, you should always accompany estimates of statistical parameters with a confidence interval and p-value, whenever possible. Without them, there's no way to know whether you can safely extrapolate to the entire population. But if you already know every value of the population, you're good to go. You don't need a p-value, a t-test, or a CI, any more than you need a clue to determine what's inside a box if you already know what's in it.</p>
Data Analysis, Hypothesis Testing, Learning, Statistics
Fri, 23 Sep 2016 12:08:00 +0000
http://blog.minitab.com/blog/statistics-and-quality-data-analysis/descriptive-vs-inferential-statistics-when-is-a-p-value-superfluous
Patrick Runkel
Creating Value from Your Data
http://blog.minitab.com/blog/applying-statistics-in-quality-projects/creating-value-from-your-data
<p>There may be huge potential benefits waiting in the data in your servers. These data may be used for many different purposes. Better data allows better decisions, of course. Banks, insurance firms, and telecom companies already own a large amount of data about their customers. These resources are useful for building a more personal relationship with each customer.</p>
<p>Some organizations already use data from agricultural fields to build complex and customized models based on a very extensive number of input variables (soil characteristics, weather, plant types, etc.) in order to improve crop yields. Airline companies and large hotel chains use dynamic pricing models to improve their yield management. Data is increasingly being referred to as the new “gold mine” of the 21st century.</p>
<p>A couple of factors underlie the rising prominence of data (and, therefore, data analysis):</p>
<p><img alt="" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/31b80fb2-db66-4edf-a753-74d4c9804ab8/File/de034e63187d191e1666721fa12a8880/de034e63187d191e1666721fa12a8880.png" style="width: 283px; height: 212px; margin: 10px 15px; float: right;" /></p>
Huge volumes of data
<p><span style="line-height: 1.6;">Data acquisition has never been easier (sensors in manufacturing plants, sensors in connected objects, data from internet usage and web clicks, credit cards, loyalty cards, Customer Relationship Management databases, satellite images, etc.), and the data can be stored at lower cost than ever before (huge storage capacity is now available on the cloud and elsewhere). The amount of data being collected is not only huge, it is growing exponentially.</span></p>
Unprecedented velocity
<p>Connected devices, like our smart phones, provide data in almost real time and it can be processed very quickly. It is now possible to react to any change…almost immediately.</p>
Incredible variety
<p>The data collected is not restricted to billing information; every source of data is potentially valuable for a business. Not only is numeric data being collected on a massive scale, but so is unstructured data such as videos and pictures, in a large variety of situations.</p>
<p>But the explosion of data available to us is prompting every business to wrestle with an extremely complicated problem:</p>
How can we create value from these resources?
<p>Very simple methods, such as counting words used in queries submitted to company web sites, do provide a good insight as to the general mood of your customers and its evolution. Simple statistical correlations are often used by web vendors to suggest a purchase just after buying a product on the web. Very simple descriptive statistics are also useful.</p>
<p>Just imagine what could be achieved with advanced regression models or powerful multivariate statistical techniques, which can be applied easily with <a href="http://www.minitab.com/products/minitab/">statistical software packages like Minitab</a>.</p>
A simple example of the benefits of analyzing an enormous database
<p>Let's consider an example of how one company benefited from analyzing a very large database.</p>
<p>Many steps are needed (security and safety checks, cleaning the cabin, etc.) before a plane can depart. Since delays negatively impact customer perceptions and also affect productivity, airline companies routinely collect a very large amount of data related to flight delays and times required to perform tasks before departure. Some times are automatically collected, others are manually recorded.</p>
<p>A major worldwide airline company intended to use this data to identify the crucial milestones among a very large number of preparation steps, and which ones often triggered delays in departure times. The company used Minitab's <span><a href="http://blog.minitab.com/blog/adventures-in-statistics/regression-smackdown-stepwise-versus-best-subsets">stepwise regression analysis</a></span> to quickly focus on the few variables that played a major role among a large number of potential inputs. Many variables turned out to be statistically significant, but two among them clearly seemed to make a major contribution (X6 and X10).</p>
<p style="margin-left: 40px;">Analysis of Variance</p>
<p style="margin-left: 40px;">Source DF Seq SS <strong><span style="color: rgb(0, 0, 128);">Contribution </span></strong> Adj SS Adj MS F-Value P-Value</p>
<p style="margin-left: 40px;"><span style="line-height: 1.6;"> X6 1 337394 </span><span style="line-height: 1.6; color: rgb(0, 0, 128);"><strong>53.54%</strong></span><span style="line-height: 1.6;"> 2512 2512.2 29.21 0.000</span></p>
<p style="margin-left: 40px;"><span style="line-height: 1.6;"> X10 1 112911 </span><strong style="line-height: 1.6;"><span style="color: rgb(0, 0, 128);"> 17.92%</span> </strong><span style="line-height: 1.6;"> 66357 66357.1 771.46 0.000</span></p>
<p>When huge databases are used, statistical analyses may become overly sensitive and <a href="http://blog.minitab.com/blog/the-stats-cat/sample-size-statistical-power-and-the-revenge-of-the-zombie-salmon-the-stats-cat">detect even very small differences</a> (due to the large sample and power of the analysis). P values often tend to be quite small (p < 0.05) for a large number of predictors.</p>
<p>However, in Minitab, if you click Results in the regression dialog box and select Expanded tables, the contribution from each variable is displayed. Together, X6 and X10 accounted for more than 70% of the overall variability (53.54% + 17.92% = 71.46%, with the largest F-values by far); the contributions from the remaining factors were much smaller. The airline then ran a residual analysis to check the final model. </p>
<p>In addition, a Principal Component Analysis (<a href="http://blog.minitab.com/blog/applying-statistics-in-quality-projects/use-statistics-to-better-understand-your-customers">PCA, a multivariate technique</a>) was performed in Minitab to describe the relations between the most important predictors and the response. Milestones were expected to be strongly correlated to the subsequent steps.</p>
<p style="margin-left: 40px;"><img src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/31b80fb2-db66-4edf-a753-74d4c9804ab8/File/c023d71140ea4ee2b5b22480712a55a4/c023d71140ea4ee2b5b22480712a55a4.png" /></p>
<p>The graph above is a Loading Plot from a principal component analysis. Lines that go in the same direction and are close to one another indicate how the variables may be grouped. Variables are visually grouped together according to their statistical correlations and how closely they are related.</p>
<p>A group of nine variables turned out to be strongly correlated to the most important inputs (X6 and X10) and to the final delay times (Y). Delays at the X6 stage obviously affected the X7 and X8 stages (subsequent operations), and delays from X10 affected the subsequent X11 and X12 operations.</p>
Conclusion
<p>This analysis provided simple rules that this airline's crews can follow in order to avoid delays, making passengers' next flight more pleasant. </p>
<p>The airline can repeat this analysis periodically to search for the next most important causes of delays. Such an approach can propel innovation and help organizations replace traditional and intuitive decision-making methods with data-driven ones.</p>
<p>What's more, the use of data to make things better is not restricted to the corporate world. More and more public administrations and non-governmental organizations are making large, open databases easily accessible to communities and to virtually anyone. </p>
ANOVA, Data Analysis, Hypothesis Testing, Regression Analysis, Statistics, Statistics in the News
Tue, 06 Sep 2016 13:19:00 +0000
http://blog.minitab.com/blog/applying-statistics-in-quality-projects/creating-value-from-your-data
Bruno Scibilia
Sunny Day for A Statistician vs. Dark Day for A Householder with Solar Panels
http://blog.minitab.com/blog/using-data-and-statistics/sunny-day-for-a-statistician-vs-dark-day-for-a-householder-with-solar-panels
<p>In 2011 we had solar panels fitted on our property. In the last few months we have noticed a few problems with the inverter (the equipment that converts the electricity generated by the panels from DC to AC, and manages the transfer of unused electricity to the power company). It was shutting down at various times throughout the day, typically when it was very sunny, resulting in no electricity being generated.<img alt="solar panels" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/0ee09d62f414b4bd79601d23995458bf/solar.jpg" style="width: 400px; height: 267px; margin: 10px 15px; float: right;" /></p>
<p>I contacted the inverter manufacturer for some help to diagnose the problem. They asked me to download their monitoring app, called Sunny Portal. I did this and started a communication process with the inverter via Bluetooth, which not only showed me the error code but also delivered a time series of the electricity generated by the hour since the panels were installed.</p>
<p>I thought I had gone to statistician heaven! By using this data, I could establish if this problem was significantly reducing the amount of electricity generated and, consequently, reducing the amount of cash I was being paid for generating electricity. </p>
<p>The Sunny Portal does have some basic bar charts to plot <span><a href="http://blog.minitab.com/blog/real-world-quality-improvement/3-ways-to-examine-data-over-time">time series</a></span> by the month, day, and 5-minute interval; however, each chart automatically works out its scale according to the data, so it is difficult to compare time periods. </p>
<div>
<p><strong>Top Minitab Tip</strong>: If you want to compare multiple charts measuring the same thing for different time periods or groups, make sure the Y-axis scales are the same. In many Minitab graphs and charts, if you select the Multiple Graphs button you will be given the option to select the same Y-axis scale.</p>
</div>
Getting the Data into Minitab
<p>I realized that I could output the data to text files, which meant I could use my statistical skills and Minitab to answer my questions. For each month between Sept 2011 and June 2016 I exported a file like the example shown below. For each day I have the date, the cumulative units generated since the inverter was commissioned, and the daily generation.</p>
<p style="margin-left: 40px;"><img src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/65ceccab-4e9a-4eba-8ce7-73b8b3d4d078/File/06a0cad69d2d8bd7cc169fb1ccb039fc/06a0cad69d2d8bd7cc169fb1ccb039fc.png" /></p>
<p>These were easily read into Minitab, using <strong>File > Open</strong>, specifying the first row of data as row 9, and changing the delimiter from comma to semicolon. </p>
<p style="margin-left: 40px;"><img src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/65ceccab-4e9a-4eba-8ce7-73b8b3d4d078/File/a533773b174b01e721e6bae8f3240cdb/a533773b174b01e721e6bae8f3240cdb.png" style="line-height: 20.8px;" /></p>
<p>I read all of these monthly files into individual Minitab worksheets and then used <strong>Data > Stack Worksheets</strong> to create a single worksheet that contained all the data. </p>
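<p>For readers who want to script the same import outside Minitab, here is a minimal Python sketch. It assumes each export has eight header lines before the data, uses semicolons as delimiters, and holds three columns (date, cumulative units, daily units) with a point as the decimal separator; the sample contents are hypothetical, not taken from the actual files.</p>

```python
import csv
import io

HEADER_ROWS = 8  # data starts on row 9, as in the import settings above


def read_monthly_export(text: str) -> list[tuple[str, float, float]]:
    """Parse one semicolon-delimited export into (date, cumulative, daily) rows."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    rows = list(reader)[HEADER_ROWS:]
    return [(r[0], float(r[1]), float(r[2])) for r in rows if len(r) >= 3]


# Hypothetical file contents: eight header lines followed by daily records.
sample = "\n".join(
    [f"header line {i}" for i in range(1, 9)]
    + ["01/06/2016;12345.6;7.1", "02/06/2016;12351.0;5.4"]
)

# "Stacking" is just concatenating the rows parsed from every monthly file.
stacked = []
for monthly_text in (sample, sample):
    stacked.extend(read_monthly_export(monthly_text))

print(len(stacked), stacked[0])
```

<p>In practice each monthly text would be read from its own file, and the resulting list plays the role of the single stacked worksheet.</p>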
Creating and Reviewing the Time Series Plots
<p>Using <strong>Graph > Time Series Plot, </strong>I created the following time series plots. To get each year in different colours, I double-clicked on an individual data point in the chart, chose the "Groups" tab in the Edit Symbols dialog box, and put Year as the grouping variable.</p>
<p style="margin-left: 40px;"><img src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/65ceccab-4e9a-4eba-8ce7-73b8b3d4d078/File/d8735a9c83b4d1fab3b48b1d850cab38/d8735a9c83b4d1fab3b48b1d850cab38.png" style="line-height: 20.8px;" /></p>
<p>Looking at this plot, it was clear that the most electricity is generated in the summer months and least in the winter months, but it was not easy to identify if the amount of electricity generated had been declining. I needed to consider another analytical approach.</p>
<p>Since I have only noticed this problem in the last 6 months (Jan to June 2016), I decided to compare the electricity generated in the first 6 months of the year for the years 2012–2016. I did this using <strong>Assistant > Hypothesis Tests > One-Way ANOVA</strong>. The descriptive results were as follows:</p>
<p style="margin-left: 40px;"><img src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/65ceccab-4e9a-4eba-8ce7-73b8b3d4d078/File/535915f4f060684ccbfb1bf1cf34475b/535915f4f060684ccbfb1bf1cf34475b.png" style="line-height: 1.6;" /></p>
<p>Just looking at the summary statistics, I can clearly see that the average number of units generated per day for the first six months of 2016 is much lower, at 5.71 units, than it was in the previous years, which range between 8.15 in 2012 and 9.22 in 2014. However, by using the results from the one-way ANOVA I can work out if 2016 is <em>significantly </em>worse than previous years. </p>
<p>From this chart, you can see that the p-value is less than 0.001. Hence, we can conclude that not all the group means are equal. By using the Means Comparison Chart, shown below, I can also see that 2016 is significantly lower than all the other years.</p>
<p style="margin-left: 40px;"><img src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/65ceccab-4e9a-4eba-8ce7-73b8b3d4d078/File/c3604a8dd269c552d10b231ad9e28f50/c3604a8dd269c552d10b231ad9e28f50.png" /></p>
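<p>The grouping behind this comparison, daily generation for January to June averaged by year, can be sketched in a few lines of Python. The records below are hypothetical stand-ins for the stacked worksheet, and the ISO date format is an assumption made for the example.</p>

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (date, daily units) records; real data would come from the stacked worksheet.
records = [
    ("2015-03-14", 9.8), ("2015-06-02", 14.1), ("2015-11-20", 2.3),
    ("2016-03-14", 6.0), ("2016-06-02", 7.4), ("2016-08-09", 11.0),
]

by_year = defaultdict(list)
for date, units in records:
    year, month = int(date[:4]), int(date[5:7])
    if month <= 6:  # keep January to June only
        by_year[year].append(units)

first_half_means = {year: mean(values) for year, values in sorted(by_year.items())}
print(first_half_means)
```

<p>These per-year means are the quantities a one-way ANOVA compares; the test itself then asks whether the differences between them are larger than day-to-day variation would explain.</p>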
<p>However, you might be thinking that the first six months of 2016 in England were darker than an average year, and that there has been significantly less UV light. This might be a fair point, so to check this I looked at data produced by the UK Met Office (<a href="http://www.metoffice.gov.uk/climate/uk/summaries/anomalygraphs">www.metoffice.gov.uk/climate/uk/summaries/anomalygraphs</a>). These charts, called anomaly graphs, compare the sunshine levels by month for particular years to the average sunshine levels for the previous decade.</p>
<p>The results for 2016 and 2012, the two worst years for average electricity generated per day, are as follows: </p>
<p style="margin-left: 40px;"><img src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/65ceccab-4e9a-4eba-8ce7-73b8b3d4d078/File/2a6f05e175bfc75a8fdf9ccb91037eef/2a6f05e175bfc75a8fdf9ccb91037eef.png" /></p>
<p style="margin-left: 40px;"><img src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/65ceccab-4e9a-4eba-8ce7-73b8b3d4d078/File/1803e9942cfdad869abf51aad522a874/1803e9942cfdad869abf51aad522a874.png" /></p>
<p>When I compare Met Office data for the amount of sunshine in the first six months of 2016 in England (red bar) with 2012, the second-worst year according to my summary statistics, I can see that only Jan and March were better in 2012. It should also be noted that you generate more electricity when there are more daylight hours. So a bad June has a bigger influence on electricity generated than a bad January, and June in 2012 was worse than in 2016.</p>
<p>Consequently, I can see that the English weather cannot be blamed for the lower electricity generation figures and the fault is with my inverter. The next steps are to determine when this problem with the inverter started, and estimate what it has cost. </p>
<p>After I shared my results, the helpdesk at the manufacturer identified the problem with the inverter: it had been set up with German power grid settings, and apparently the UK grid has more voltage fluctuation. The settings were changed on 15th July, and I'm looking forward to collecting more data and analyzing it in Minitab to determine whether this problem has been solved.</p>
ANOVA, Data Analysis, Fun Statistics, Hypothesis Testing, Statistics
Fri, 26 Aug 2016 12:00:00 +0000
http://blog.minitab.com/blog/using-data-and-statistics/sunny-day-for-a-statistician-vs-dark-day-for-a-householder-with-solar-panels
Gillian Groom
Data Not Normal? Try Letting It Be, with a Nonparametric Hypothesis Test
http://blog.minitab.com/blog/understanding-statistics/data-not-normal-try-letting-it-be-with-a-nonparametric-hypothesis-test
<p>So the data you nurtured, that you worked so hard to format and make useful, failed the normality test.</p>
<img alt="not-normal" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/c6e92e8046f3fcee28e7cf505fb77005/data_freak_flag_300.jpg" style="line-height: 20.8px; width: 300px; height: 293px; margin: 10px 15px; float: right;" />
<p>Time to face the truth: despite your best efforts, that data set is <em>never </em>going to measure up to the assumption you may have been trained to fervently look for.</p>
<p>Your data's lack of normality seems to make it poorly suited for analysis. Now what?</p>
<p>Take it easy. Don't get uptight. Just let your data be what they are, go to the <strong>Stat </strong>menu in Minitab Statistical Software, and choose "Nonparametrics."</p>
<p style="margin-left: 40px;"><img alt="nonparametrics menu" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/fbebf763ac6bd92b40c0d241b7c4029c/nonparametrics_menu.png" style="width: 367px; height: 309px;" /></p>
<p>If you're stymied by your data's lack of normality, nonparametric statistics might help you find answers. And if the word "nonparametric" looks like five syllables' worth of trouble, don't be intimidated—it's just a big word that usually refers to "tests that don't assume your data follow a normal distribution."</p>
<p>In fact, nonparametric statistics don't assume your data follow <em>any distribution at all</em>. The following table lists common parametric tests, their equivalent nonparametric tests, and the main characteristics of each.</p>
<p style="margin-left: 40px;"><img alt="correspondence table for parametric and nonparametric tests" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/4a69043809861f5187be271de67f8161/parametric_correspondence_table.png" style="width: 661px; height: 488px;" /></p>
<p>Nonparametric analyses free your data from the straitjacket of the <span style="line-height: 20.8px;">normality </span><span style="line-height: 1.6;">assumption. So choosing a nonparametric analysis is sort of like removing your data from a stifling, </span><a href="https://www.verywell.com/the-asch-conformity-experiments-2794996" style="line-height: 1.6;" target="_blank">conformist environment</a><span style="line-height: 1.6;">, and putting it into </span><a href="https://en.wikipedia.org/wiki/Utopia" style="line-height: 1.6;" target="_blank">a judgment-free, groovy idyll</a><span style="line-height: 1.6;">, where your data set can just be what it is, with no hassles about its unique and beautiful shape. How cool is </span><em style="line-height: 1.6;">that</em><span style="line-height: 1.6;">, man? Can you dig it?</span></p>
<p>Of course, it's not <em>quite </em>that carefree. Just like the 1960s encompassed both <a href="https://en.wikipedia.org/wiki/Woodstock" target="_blank">Woodstock</a> and <a href="https://en.wikipedia.org/wiki/Altamont_Free_Concert" target="_blank">Altamont</a>, so nonparametric tests offer both compelling advantages and serious limitations.</p>
Advantages of Nonparametric Tests
<p>Both parametric and nonparametric tests draw inferences about populations based on samples, but parametric tests focus on population parameters like the mean and the standard deviation, and make various assumptions about your data—for example, that they follow a normal distribution, and that samples include a minimum number of data points.</p>
<p>In contrast, nonparametric tests are unaffected by the distribution of your data. Nonparametric tests also accommodate many conditions that parametric tests do not handle, including small sample sizes, ordered outcomes, and outliers.</p>
<p>Consequently, they can be used in a wider range of situations and with more types of data than traditional parametric tests. Many people also feel that nonparametric analyses are more intuitive.</p>
Drawbacks of Nonparametric Tests
<p><span style="line-height: 20.8px;">But nonparametric tests are not </span><em style="line-height: 20.8px;">completely </em><span style="line-height: 20.8px;">free from assumptions—they do require data to be an independent random sample, for example.</span></p>
<p>And nonparametric tests aren't a cure-all. For starters, they typically have less <a href="http://blog.minitab.com/blog/starting-out-with-statistical-software/how-powerful-am-i-power-and-sample-size-in-minitab">statistical power</a> than parametric equivalents. Power is the probability that you will correctly reject the null hypothesis when it is false. That means you have an increased chance of making a Type II error with these tests.</p>
<p>In practical terms, that means nonparametric tests are <em>less </em>likely to detect an effect or association when one really exists.</p>
<p>So if you want to draw conclusions with the same confidence level you'd get using an equivalent parametric test, you will need larger sample sizes. </p>
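<p>To put a rough number on "larger sample sizes," one common yardstick is asymptotic relative efficiency (ARE). The sketch below uses the textbook ARE values for normally distributed data, 3/π for the Wilcoxon signed-rank test and 2/π for the sign test, each relative to the t-test. These are large-sample approximations that hold only when the data really are normal; the function name and the baseline of 100 observations are hypothetical.</p>

```python
from math import ceil, pi

# Asymptotic relative efficiency versus the t-test when the data really are normal.
ARE_VS_T_TEST = {
    "Wilcoxon signed-rank": 3 / pi,  # about 0.955
    "sign test": 2 / pi,             # about 0.637
}


def equivalent_sample_size(n_parametric: int, are: float) -> int:
    """Approximate n a nonparametric test needs to match the t-test's power."""
    return ceil(n_parametric / are)


for name, are in ARE_VS_T_TEST.items():
    print(name, equivalent_sample_size(100, are))
```

<p>Under these assumptions the Wilcoxon test gives up very little, while the sign test needs noticeably more data. For other distributions the ratios differ, so treat this as a guide rather than a rule.</p>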
<p>Nonparametric tests are not a one-size-fits-all solution for non-normal data, but they can yield good answers in situations where parametric statistics just won't work.</p>
Is Parametric or Nonparametric the Right Choice for You?
<p>I've briefly outlined differences between parametric and nonparametric hypothesis tests, looked at which tests are equivalent, and considered some of their advantages and disadvantages. If you're waiting for me to tell you which direction you should choose...well, all I can say is, "It depends..." But I can give you some established rules of thumb to consider when you're looking at the specifics of your situation.</p>
<p>Keep in mind that <strong>nonnormal data does not immediately disqualify your data for a parametric test</strong>. What's your sample size? <span style="line-height: 20.8px;">As long as a certain minimum sample size is met, most parametric tests will be </span><a href="http://blog.minitab.com/blog/fun-with-statistics/forget-statistical-assumptions-just-check-the-requirements" style="line-height: 20.8px;">robust to the normality assumption</a><span style="line-height: 20.8px;">. </span><span style="line-height: 1.6;">For example, the Assistant in Minitab (which uses Welch's t-test) points out that </span><span style="line-height: 1.6;">while the 2-sample t-test is based on the assumption that the data are normally distributed, this assumption is not critical when the sample sizes are at least 15. And Bonett's 2-sample standard deviation test performs well for nonnormal data even when sample sizes are as small as 20. </span></p>
<p><span style="line-height: 1.6;">In addition, while they may not require normal data, many nonparametric tests have other assumptions that you can’t disregard.</span> <span style="line-height: 20.8px;">For example, the Kruskal-Wallis test assumes your samples come from populations that have similar shapes and equal variances. </span><span style="line-height: 1.6;">And the 1-sample Wilcoxon test does not assume a particular population distribution, but it does assume the distribution is symmetrical. </span></p>
<p><span style="line-height: 1.6;">In most cases, your choice between parametric and nonparametric tests ultimately comes down to sample size, and whether the center of your data's distribution is better reflected by the mean or the median.</span></p>
<ul>
<li>If the mean accurately represents the center of your distribution and your sample size is large enough, a parametric test offers you better accuracy and more power. </li>
<li>If your sample size is small, you'll likely need to go with a nonparametric test. But if the median better represents the center of your distribution, a nonparametric test may be a better option even for a large sample.</li>
</ul>
Data Analysis, Hypothesis Testing, Statistics, Statistics Help
Mon, 22 Aug 2016 12:00:00 +0000
http://blog.minitab.com/blog/understanding-statistics/data-not-normal-try-letting-it-be-with-a-nonparametric-hypothesis-test
Eston Martz
Have You Accidentally Done Statistics?
http://blog.minitab.com/blog/statistics-and-quality/have-you-accidentally-done-statistics
<p>Have you ever accidentally done statistics? Not all of us can (or would want to) be “stat nerds,” but the word “statistics” shouldn’t be scary. In fact, we all analyze things that happen to us every day. Sometimes we don’t realize that we are compiling data and analyzing it, but that’s exactly what we are doing. Yes, there are advanced statistical concepts that can be difficult to understand—but there are many concepts that we use every day that we don’t realize are statistics.</p>
<p>I consider myself a student of baseball, so my example of unknowingly performing statistical procedures concerns my own experiences playing that game.</p>
<p>My baseball career ended as a 5’7” college freshman walk-on. When I realized that my ceiling as a catcher was a lot lower than that of my 6’0”-6’5” teammates, I hung up my spikes. As an adult, while finishing my degree in Business Statistics, I had the opportunity to shadow a couple of scouts from the Major League Baseball Scouting Bureau. Yes, I’ve seen <a href="http://blog.minitab.com/blog/the-statistics-game/moneyball-shows-the-power-of-statistics"><em>Moneyball </em></a>and I know that traditional scouting methods are reputed to conflict with the methods of stat nerds like myself, but as a former player I wanted to see what these scouts were looking at. </p>
<p><img alt="baseball statistics" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/076e1f8132a222e6204e393eb0d3e9a2/baseball_stats.jpg" style="width: 278px; height: 313px; margin: 10px 15px; float: right;" />My first day with the scouts I found out they were traditional baseball guys. They didn’t believe data could tell how good a player is better than observation could, and ultimately they didn't think statistics were important to what they do. </p>
<p>I found their thinking to be a little off, and a little funny. Although they didn’t believe in statistics, the tools they use for their jobs actually quantify a player's attributes. I watched as they used a radar gun to measure pitch speed, a stopwatch to measure running speed, and a notepad to record their measurements (they didn’t realize they were compiling data). As one of the scouts was conversing with me, asking how statistics are going to be brought into baseball, he was making a dot plot by hand of the pitcher's pitches by speed to find the velocity distribution of the pitcher.</p>
<p style="margin-left: 40px;"><img height="343" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/b51a0c86-e2dd-456e-878a-4196c7381c3a/File/8361f15f80b379a88187b539c124cad0/8361f15f80b379a88187b539c124cad0.png" width="514" /></p>
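<p>A hand-drawn dot plot like the scout's is easy to mimic in code. The following Python sketch prints one dot per pitch next to each speed; the speeds are hypothetical values invented for illustration, not the pitcher's actual data.</p>

```python
from collections import Counter

# Hypothetical pitch speeds in mph, rounded to whole numbers as a scout might jot them down.
speeds = [84, 85, 85, 86, 93, 94, 94, 94, 95, 95, 96]

counts = Counter(speeds)
lines = [f"{mph:>3} | {'.' * counts[mph]}" for mph in range(min(speeds), max(speeds) + 1)]
print("\n".join(lines))
```

<p>With data like these, two separate clusters of dots appear, one for slower pitches and one for faster ones, which is the kind of shape a scout would spot on paper.</p>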
<p>After I explained to him that he was unknowingly creating a dot plot (like the one I created for Raisel Iglesias using Minitab, which has a <a href="http://support.minitab.com/minitab/17/topic-library/basic-statistics-and-graphs/summary-statistics/measures-of-central-tendency/">bimodal distribution</a>), we started talking about grading players’ skills. The scouts would grade how players hit, their power, how they run, arm strength, and fielding ability. They used a numeric grading system from 20-80 for each of the characteristics, with 20 being the lowest, 50 being average, and 80 being elite. After they compiled this data they would give the players grades through analysis, and they would create a report with these grades to convey to others what they saw in the player.</p>
<p style="margin-left: 40px;"><img height="401" src="https://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/b51a0c86-e2dd-456e-878a-4196c7381c3a/File/a57bd643816872de2fee895f303c0ddc/a57bd643816872de2fee895f303c0ddc.png" width="602" /></p>
<p>I was amazed at how these scouts—true, old-school baseball guys who said stats weren’t important for their jobs—were compiling data and analyzing it for their reports. </p>
<p>A few of the other statistical ideas the scouts were (accidentally) concerned about included the sample size of observations of a player, comparison analysis, and predicting where a player falls within their physical development (regression).</p>
<p>Like the baseball scouts, many of us are unwittingly doing statistics. Just like these scouts, we run into data all day long without recognizing that we can compile and analyze it. In work we worry about customer satisfaction, wait time, average transaction value, cost ratios, efficiency, etc. And while many people get intimidated when we use the word "statistics," we don’t need advanced degrees to embrace observing, compiling data, and making solid decisions based on our analysis.</p>
<p>So, are <em>you </em>accidentally doing statistics? If you want to get beyond accidentally doing statistics and analyze a little more deliberately, Minitab has many tools, like the <a href="http://www.minitab.com/products/minitab/assistant/">Assistant menu</a> and StatGuide, to help you on your stats journey.</p>
Data Analysis, Fun Statistics, Hypothesis Testing, Statistics, Statistics in the News, Stats
Tue, 02 Aug 2016 12:00:00 +0000
http://blog.minitab.com/blog/statistics-and-quality/have-you-accidentally-done-statistics
Joseph Hartsock
One-Sample t-test: Calculating the t-statistic is not really a bear
http://blog.minitab.com/blog/marilyn-wheatleys-blog/one-sample-t-test-calculating-the-t-statistic-is-not-really-a-bear
<p>While some posts in our Minitab blog focus on <a href="http://blog.minitab.com/blog/adventures-in-statistics/understanding-t-tests-t-values-and-t-distributions">understanding t-tests and t-distributions</a>, this post will focus more simply on how to hand-calculate the t-value for a one-sample t-test (and how to replicate the p-value that Minitab gives us). </p>
<p>The formulas used in this post are available within <a href="http://www.minitab.com/en-us/products/minitab/">Minitab Statistical Software</a> by choosing the following menu path: <strong>Help</strong> > <strong>Methods and Formulas</strong> > <strong>Basic Statistics</strong> > <strong>1-sample t</strong>.</p>
<p>The null and three alternative hypotheses for a one-sample t-test are shown below:</p>
<p style="margin-left: 40px;"><img border="0" height="184" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/f6d0da32-ba1d-41d4-ace1-af34dcb51351/File/553bfcce02e2394b13b5175655c99df6/553bfcce02e2394b13b5175655c99df6.png" width="368" /></p>
<p>The default alternative hypothesis is the last one listed: the true population mean is not equal to the hypothesized mean. This is the option used in this example.</p>
<p><img alt="bear" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/88db51bd8ccbfcbb306372bb65fa4902/bear.jpg" style="margin: 10px 15px; float: right; width: 400px; height: 290px;" />To understand the calculations, we’ll use a sample data set available within Minitab. The name of the dataset is <strong>Bears.MTW</strong>, because the calculation is not a huge bear to wrestle (plus who can resist a dataset with that name?). The path to access the sample data from within Minitab depends on the version of the software. </p>
<p>For the current version of Minitab, <a href="http://www.minitab.com/en-us/products/minitab/whats-new/">Minitab 17.3.1</a>, the sample data is available by choosing <strong>Help</strong> > <strong>Sample Data</strong>.</p>
<p>For previous versions of Minitab, the data set is available by choosing <strong>File</strong> > <strong>Open Worksheet</strong> and clicking the <strong>Look in Minitab Sample Data folder</strong> button at the bottom of the window.</p>
<p>For this example, we will use column C2, titled Age, in the Bears.MTW data set, and we will test the hypothesis that the average age of bears is 40. First, we’ll use <strong>Stat</strong> > <strong>Basic Statistics</strong> > <strong>1-sample t</strong> to test the hypothesis:</p>
<p style="margin-left: 40px;"><img border="0" height="315" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/f6d0da32-ba1d-41d4-ace1-af34dcb51351/File/d3336e100a9a4a91501ed1206c8e807f/d3336e100a9a4a91501ed1206c8e807f.png" width="400" /></p>
<p>After clicking <strong>OK</strong> above we see the following results in the session window:</p>
<p style="margin-left: 40px;"><img border="0" height="118" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/f6d0da32-ba1d-41d4-ace1-af34dcb51351/File/e62a2a776614c60eff0dd6383f66e5f5/e62a2a776614c60eff0dd6383f66e5f5.png" width="464" /></p>
<p>With a high p-value of 0.361, we don’t have enough evidence to conclude that the average age of bears is significantly different from 40. </p>
<p>Now we’ll see how to calculate the T value above by hand.</p>
<p>The T value (0.92) shown above is calculated using the following formula in Minitab:</p>
<p style="margin-left: 40px;"><img border="0" height="172" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/f6d0da32-ba1d-41d4-ace1-af34dcb51351/File/701f9c0efa98a38fb397f3c3ec459b66/701f9c0efa98a38fb397f3c3ec459b66.png" width="247" /></p>
<p>The output from the 1-sample t test above gives us all the information we need to plug the values into our formula:</p>
<p style="margin-left: 40px;">Sample mean: 43.43</p>
<p style="margin-left: 40px;">Sample standard deviation: 34.02</p>
<p style="margin-left: 40px;">Sample size: 83</p>
<p>We also know that our target or hypothesized value for the mean is 40.</p>
<p>Using the numbers above to calculate the t-statistic we see:</p>
<p style="margin-left: 40px;">t = (43.43 − 40) / (34.02 / √83) = <strong>0.918542</strong><br />
(which rounds to 0.92, as shown in Minitab’s 1-sample t-test output)</p>
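<p>The same hand calculation can be checked with a few lines of Python. This is only a sketch using the rounded summary values from the output above; the function name is hypothetical and is not part of Minitab.</p>

```python
from math import sqrt


def one_sample_t(sample_mean: float, hypothesized_mean: float, sd: float, n: int) -> float:
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (sd / sqrt(n))


t = one_sample_t(sample_mean=43.43, hypothesized_mean=40, sd=34.02, n=83)
print(round(t, 6))
```

<p>Because the inputs are the rounded mean and standard deviation, the result matches the hand calculation rather than the full-precision value Minitab holds internally.</p>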
<p>Now, we <em>could </em>dust off a statistics textbook and use it to compare our calculated t of 0.918542 to the corresponding critical value in a t-table, but that seems like a pretty big bear to wrestle when we can easily get the p-value from Minitab instead. To do that, I’ve used <strong>Graph</strong> > <strong>Probability Distribution Plot</strong> > <strong>View Probability</strong>:</p>
<p style="margin-left: 40px;"><img border="0" height="382" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/f6d0da32-ba1d-41d4-ace1-af34dcb51351/File/e43510dc233e71f22b93f190deb5e523/e43510dc233e71f22b93f190deb5e523.png" width="419" /></p>
<p>In the dialog above, we’re using the t distribution with 82 degrees of freedom (we had an N = 83, so the degrees of freedom for a 1-sample t-test is N-1). Next, I’ve selected the <strong>Shaded Area</strong> tab:</p>
<p style="margin-left: 40px;"><img border="0" height="383" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/f6d0da32-ba1d-41d4-ace1-af34dcb51351/File/e36572b6cead5cf393763d880b6f229a/e36572b6cead5cf393763d880b6f229a.png" width="414" /></p>
<p>In the dialog box above, we’re defining the shaded area by the X value (the calculated t-statistic), and I’ve typed in the t-value we calculated in the <strong>X value</strong> field. This was a 2-tailed test, so I’ve selected <strong>Both Tails</strong> in the dialog above.</p>
<p>After clicking <strong>OK</strong> in the window above, we see:</p>
<p style="margin-left: 40px;"><img border="0" height="384" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/f6d0da32-ba1d-41d4-ace1-af34dcb51351/File/a12abfcbe5ecea6902e4a138e96a53a6/a12abfcbe5ecea6902e4a138e96a53a6.png" width="576" /></p>
<p>We add together the probabilities from both tails, 0.1805 + 0.1805, which equals 0.361, the same p-value that Minitab gave us for the 1-sample t test. </p>
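<p>For anyone without Minitab at hand, the two-tailed area can also be approximated numerically. The Python sketch below integrates the density of the t distribution with 82 degrees of freedom using Simpson's rule. This is a swapped-in numerical method chosen for illustration, not how Minitab computes the probability, and the function names are hypothetical.</p>

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with `df` degrees of freedom."""
    norm = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-sided p-value: area in both tails beyond |t|, via Simpson's rule."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # area between 0 and |t|
    return 2 * (0.5 - central_half)


p = two_sided_p(0.918542, df=82)
print(round(p, 3))
```

<p>The approach relies on the symmetry of the t distribution: half the total area lies above zero, so subtracting the area between 0 and |t| leaves one tail, and doubling that gives the two-tailed p-value.</p>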
<p>That wasn’t so bad—not a difficult bear to wrestle at all!</p>
Data Analysis, Fun Statistics, Hypothesis Testing, Learning, Statistics, Statistics Help, Stats
Wed, 27 Jul 2016 17:57:00 +0000
http://blog.minitab.com/blog/marilyn-wheatleys-blog/one-sample-t-test-calculating-the-t-statistic-is-not-really-a-bear
Marilyn Wheatley
Understanding Analysis of Variance (ANOVA) and the F-test
http://blog.minitab.com/blog/adventures-in-statistics/understanding-analysis-of-variance-anova-and-the-f-test
<p>Analysis of variance (ANOVA) can determine whether the means of three or more groups are different. ANOVA uses F-tests to statistically test the equality of means. In this post, I’ll show you how ANOVA and F-tests work using a one-way ANOVA example.</p>
<p>But wait a minute...have you ever stopped to wonder why you’d use an analysis of <em>variance</em> to determine whether <em>means</em> are different? I'll also show how variances provide information about means.</p>
<p>As in my posts about <a href="http://blog.minitab.com/blog/adventures-in-statistics/understanding-t-tests:-1-sample,-2-sample,-and-paired-t-tests" target="_blank">understanding t-tests</a>, I’ll focus on concepts and graphs rather than equations to explain ANOVA F-tests.</p>
What are F-statistics and the F-test?
<p>The F-test is named after its test statistic, F, which was named in honor of Sir Ronald Fisher. The F-statistic is simply a ratio of two variances. Variances are a measure of dispersion, or how far the data are scattered from the mean. Larger values represent greater dispersion.</p>
<img alt="F is for F-test" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/2176eecdb5dee3586bf90f5dc2ca0007/f.gif" style="line-height: 20.8px; margin: 10px 15px; float: right; width: 200px; height: 221px;" />
<p>Variance is the square of the standard deviation. For us humans, standard deviations are easier to understand than variances because they’re in the same units as the data rather than squared units. However, many analyses actually use variances in the calculations.</p>
<p>F-statistics are based on the ratio of mean squares. The term “<a href="http://support.minitab.com/minitab/17/topic-library/modeling-statistics/anova/anova-statistics/understanding-mean-squares/" target="_blank">mean squares</a>” may sound confusing but it is simply an estimate of population variance that accounts for the <a href="http://support.minitab.com/minitab/17/topic-library/basic-statistics-and-graphs/introductory-concepts/basic-concepts/df/" target="_blank">degrees of freedom (DF)</a> used to calculate that estimate.</p>
<p>Despite being a ratio of variances, you can use F-tests in a wide variety of situations. Unsurprisingly, the F-test can assess the equality of variances. However, by changing the variances that are included in the ratio, the F-test becomes a very flexible test. For example, you can use F-statistics and F-tests to <a href="http://blog.minitab.com/blog/adventures-in-statistics/what-is-the-f-test-of-overall-significance-in-regression-analysis" target="_blank">test the overall significance for a regression model</a>, to compare the fits of different models, to test specific regression terms, and to test the equality of means.</p>
Using the F-test in One-Way ANOVA
<p>To use the F-test to determine whether group means are equal, it’s just a matter of including the correct variances in the ratio. In one-way ANOVA, the F-statistic is this ratio:</p>
<p style="margin-left: 40px;"><strong>F = variation between sample means / variation within the samples</strong></p>
<p>The best way to understand this ratio is to walk through a one-way ANOVA example.</p>
<p>We’ll analyze four samples of plastic to determine whether they have different mean strengths. You can download the <a href="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/742d7708-efd3-492c-abff-6044d78e3bbd/File/a8a9c678090ccac0f3be61be91cf8012/plasticstrength.mtw">sample data</a> if you want to follow along. (If you don't have Minitab, you can download a <a href="http://www.minitab.com/en-us/products/minitab/free-trial/" target="_blank">free 30-day trial</a>.) I'll refer back to the one-way ANOVA output as I explain the concepts.</p>
<p>In Minitab, choose <strong>Stat > ANOVA > One-Way ANOVA...</strong> In the dialog box, choose "Strength" as the response, and "Sample" as the factor. Press OK, and Minitab's Session Window displays the following output: </p>
<p style="margin-left: 40px;"><img alt="Output for Minitab's one-way ANOVA" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/742d7708-efd3-492c-abff-6044d78e3bbd/Image/42587221b52ed940d53478106c134ebc/1way_swo.png" style="width: 315px; height: 322px;" /></p>
Numerator: Variation Between Sample Means
<p>One-way ANOVA has calculated a mean for each of the four samples of plastic. The group means are: 11.203, 8.938, 10.683, and 8.838. These group means are distributed around the overall mean for all 40 observations, which is 9.915. If the group means are clustered close to the overall mean, their variance is low. However, if the group means are spread out further from the overall mean, their variance is higher.</p>
<p>Clearly, if we want to show that the group means are different, it helps if the means are further apart from each other. In other words, we want higher variability among the means.</p>
<p>Imagine that we perform two different one-way ANOVAs where each analysis has four groups. The graph below shows the spread of the means. Each dot represents the mean of an entire group. The further the dots are spread out, the higher the value of the variability in the numerator of the F-statistic.</p>
<p style="margin-left: 40px;"><img alt="Dot plot that shows high and low variability between group means" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/742d7708-efd3-492c-abff-6044d78e3bbd/Image/f9a100946675098ca09c4440a7907230/group_means_dot_plot.png" style="width: 576px; height: 86px;" /></p>
<p>What value do we use to measure the variance between sample means for the plastic strength example? In the one-way ANOVA output, we’ll use the adjusted mean square (Adj MS) for Factor, which is 14.540. Don’t try to interpret this number because it won’t make sense on its own. It’s the sum of the squared deviations of the group means from the overall mean, weighted by group size and divided by the factor DF. Just keep in mind that the further apart the group means are, the larger this number becomes.</p>
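<p>As a rough check, the Factor Adj MS can be approximated from the four group means quoted above. This is only a sketch: it assumes the 40 observations are split evenly into 10 per group, which the post does not state explicitly, and the rounded means give a value that differs slightly from Minitab's 14.540.</p>

```python
# Group means reported in the one-way ANOVA output above.
group_means = [11.203, 8.938, 10.683, 8.838]

# Assumption: the 40 observations are split evenly, 10 per group.
n_per_group = 10

# With equal group sizes, the overall mean is the mean of the group means.
grand_mean = sum(group_means) / len(group_means)

# Sum of squares between groups: squared distance of each group mean
# from the overall mean, weighted by the group size.
ss_between = sum(n_per_group * (m - grand_mean) ** 2 for m in group_means)

df_factor = len(group_means) - 1  # 4 groups -> 3 DF
ms_between = ss_between / df_factor

print(f"between-group mean square = {ms_between:.2f}")
```

<p>Moving the group means further apart in <code>group_means</code> makes <code>ms_between</code> grow, which is the behavior described above.</p>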
Denominator: Variation Within the Samples
<p>We also need an estimate of the variability within each sample. To calculate this variance, we need to calculate how far each observation is from its group mean for all 40 observations. Technically, it is the sum of the squared deviations of each observation from its group mean divided by the error DF.</p>
<p>If the observations for each group are close to the group mean, the variance within the samples is low. However, if the observations for each group are further from the group mean, the variance within the samples is higher.</p>
<p style="margin-left: 40px;"><img alt="Plot that shows high and low variability within groups" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/742d7708-efd3-492c-abff-6044d78e3bbd/Image/9ef2eae1cf6bba97ccb1b664356d0d0a/within_group_dplot.png" style="width: 576px; height: 384px;" /></p>
<p>In the graph, the panel on the left shows low variation in the samples while the panel on the right shows high variation. The more spread out the observations are from their group mean, the higher the value in the denominator of the F-statistic.</p>
<p>If we’re hoping to show that the means are different, it's good when the within-group variance is low. You can think of the within-group variance as the background noise that can obscure a difference between means.</p>
<p>For this one-way ANOVA example, the value that we’ll use for the variance within samples is the Adj MS for Error, which is 4.402. It is considered “error” because it is the variability that is not explained by the factor.</p>
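<p>The two mean squares can be sketched for any set of samples as follows. The function name <code>mean_squares</code> and the tiny data set are made up for illustration; they are not the plastic strength data.</p>

```python
def mean_squares(groups):
    """Return (between-group MS, within-group MS) for a list of samples."""
    all_obs = [x for g in groups for x in g]
    grand_mean = sum(all_obs) / len(all_obs)
    group_means = [sum(g) / len(g) for g in groups]

    # Numerator: spread of the group means around the overall mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Denominator: spread of each observation around its own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_factor = len(groups) - 1
    df_error = len(all_obs) - len(groups)
    return ss_between / df_factor, ss_within / df_error


# Hypothetical toy data: three groups of three observations each.
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
ms_between, ms_within = mean_squares(toy_groups)
print(ms_between, ms_within)
```

<p>Here every observation sits within 1 unit of its group mean, so the within-group mean square is small compared with the between-group mean square.</p>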
The F-Statistic: Variation Between Sample Means / Variation Within the Samples
<p>The F-statistic is the <a href="http://support.minitab.com/en-us/minitab/17/topic-library/basic-statistics-and-graphs/hypothesis-tests/basics/what-is-a-test-statistic/" target="_blank">test statistic</a> for F-tests. In general, an F-statistic is a ratio of two quantities that are expected to be roughly equal under the null hypothesis, which produces an F-statistic of approximately 1.</p>
<p>The F-statistic incorporates both measures of variability discussed above. Let's take a look at how these measures can work together to produce low and high F-values. Look at the graphs below and compare the width of the spread of the group means to the width of the spread within each group.</p>
<img alt="Graph that shows sample data that produce a low F-value" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/742d7708-efd3-492c-abff-6044d78e3bbd/Image/a8faab4bb32bf1a1f5864d34d96e8d56/low_f_dplot.png" style="width: 350px; height: 233px;" />
<img alt="Graph that shows sample data that produce a high F-value" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/742d7708-efd3-492c-abff-6044d78e3bbd/Image/054b86eb1e48803baba2cff9c78028ab/high_f_dplot.png" style="width: 350px; height: 233px;" />
<p>The low F-value graph shows a case where the group means are close together (low variability) relative to the variability within each group. The high F-value graph shows a case where the variability of group means is large relative to the within group variability. In order to reject the null hypothesis that the group means are equal, we need a high F-value.</p>
<p>For our plastic strength example, we'll use the Factor Adj MS for the numerator (14.540) and the Error Adj MS for the denominator (4.402), which gives us an F-value of 3.30.</p>
<p>Is our F-value high enough? A single F-value is hard to interpret on its own. We need to place our F-value into a larger context before we can interpret it. To do that, we’ll use the F-distribution to calculate probabilities.</p>
F-distributions and Hypothesis Testing
<p>For one-way ANOVA, the ratio of the between-group variability to the within-group variability follows an <a href="http://support.minitab.com/en-us/minitab/17/topic-library/basic-statistics-and-graphs/probability-distributions-and-random-data/distributions/f-distribution/" target="_blank">F-distribution</a> when the null hypothesis is true.</p>
<p>When you perform a one-way ANOVA for a single study, you obtain a single F-value. However, if we drew multiple random samples of the same size from the same population and performed the same one-way ANOVA, we would obtain many F-values and we could plot a distribution of all of them. This type of distribution is known as a <a href="http://support.minitab.com/en-us/minitab/17/topic-library/basic-statistics-and-graphs/introductory-concepts/basic-concepts/sampling-distribution/" target="_blank">sampling distribution</a>.</p>
<p>Because the F-distribution assumes that the null hypothesis is true, we can place the F-value from our study in the F-distribution to determine how consistent our results are with the null hypothesis and to calculate probabilities.</p>
<p>The probability that we want to calculate is the probability of observing an F-statistic that is at least as high as the value that our study obtained. That probability allows us to determine how common or rare our F-value is under the assumption that the null hypothesis is true. If the probability is low enough, we can conclude that our data are inconsistent with the null hypothesis. The evidence in the sample data is then strong enough to reject the null hypothesis for the entire population.</p>
<p>This probability that we’re calculating is also known as the p-value!</p>
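<p>The idea of repeating the study many times under the null hypothesis can be imitated with a small simulation. This sketch assumes four groups of 10 observations drawn from one and the same normal population (so the null hypothesis is true by construction); the seed and the number of repetitions are arbitrary choices.</p>

```python
import random


def f_statistic(groups):
    """One-way ANOVA F-value: between-group MS divided by within-group MS."""
    all_obs = [x for g in groups for x in g]
    grand_mean = sum(all_obs) / len(all_obs)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (len(all_obs) - len(groups))
    return ms_between / ms_within


random.seed(1)       # arbitrary seed, only for repeatability
n_sims = 10_000      # number of simulated "studies"

f_values = []
for _ in range(n_sims):
    # Every group comes from the same population, so all true means are equal.
    groups = [[random.gauss(0, 1) for _ in range(10)] for _ in range(4)]
    f_values.append(f_statistic(groups))

# Share of simulated F-values at least as large as the one from our study.
share = sum(f >= 3.30 for f in f_values) / n_sims
print(f"share of F-values >= 3.30: {share:.3f}")
```

<p>The printed share is a simulation-based estimate of the p-value, so it will wobble a little from seed to seed.</p>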
<p>To plot the F-distribution for our plastic strength example, I’ll use Minitab’s <a href="http://support.minitab.com/en-us/minitab/17/topic-library/basic-statistics-and-graphs/graphs/graphs-of-distributions/probability-distribution-plots/probability-distribution-plot/" target="_blank">probability distribution plots</a>. In order to graph the F-distribution that is appropriate for our specific design and sample size, we'll need to specify the correct number of DF. Looking at our one-way ANOVA output, we can see that we have 3 DF for the numerator and 36 DF for the denominator.</p>
<p><img alt="Probability distribution plot for an F-distribution with a probability" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/742d7708-efd3-492c-abff-6044d78e3bbd/Image/6303a2314437d8fcf2f72d9a56b1293a/f_distribution_probability.png" style="width: 576px; height: 384px;" /></p>
<p>The graph displays the distribution of F-values that we'd obtain if the null hypothesis is true and we repeat our study many times. The shaded area represents the probability of observing an F-value that is at least as large as the F-value our study obtained. F-values fall within this shaded region about 3.1% of the time when the null hypothesis is true. This probability is low enough to reject the null hypothesis using the common <a href="http://blog.minitab.com/blog/adventures-in-statistics/understanding-hypothesis-tests:-significance-levels-alpha-and-p-values-in-statistics" target="_blank">significance level</a> of 0.05. We can conclude that not all the group means are equal.</p>
<p><a href="http://blog.minitab.com/blog/adventures-in-statistics/how-to-correctly-interpret-p-values" target="_blank">Learn how to correctly interpret the p-value.</a></p>
Assessing Means by Analyzing Variation
<p>ANOVA uses the F-test to determine whether the variability between group means is larger than the variability of the observations within the groups. If the ratio of the between-group variability to the within-group variability is sufficiently large, you can conclude that not all the means are equal.</p>
<p><span style="line-height: 20.8px;">This brings us back to why we analyze variation to make judgments about means. </span>Think about the question: "Are the group means different?" You are implicitly asking about the variability of the means. After all, if the group means <em>don't </em>vary, or don't vary by more than random chance allows, then you can't say the means are different. And that's why you use analysis of variance to test the means.</p>
ANOVA, Data Analysis, Hypothesis Testing, Learning, Statistics Help
Wed, 18 May 2016 12:00:00 +0000
http://blog.minitab.com/blog/adventures-in-statistics/understanding-analysis-of-variance-anova-and-the-f-test
Jim Frost
An Overview of Discriminant Analysis
http://blog.minitab.com/blog/starting-out-with-statistical-software/an-overview-of-discriminant-analysis
<p>Among the most underutilized statistical tools in Minitab, and I think in general, are multivariate tools. Minitab offers a number of different multivariate tools, including principal component analysis, factor analysis, <span><a href="http://blog.minitab.com/blog/quality-data-analysis-and-statistics/cluster-analysis-tips-part-2">clustering</a></span>, and more. In this post, my goal is to give you a better understanding of the multivariate tool called discriminant analysis, and how it can be used.</p>
<p>Discriminant analysis is used to classify observations into two or more groups if you have a sample with known groups. Essentially, it's a way to handle a classification problem, where two or more groups, clusters, or populations are known up front, and one or more new observations are placed into one of these known classifications based on the measured characteristics. Discriminant analysis can also be used to investigate how variables contribute to group separation.</p>
<p>An area where this is especially useful is species classification. We'll use that as an example to explore how this all works. If you want to follow along and you don't already have Minitab, you can get it <a href="http://www.minitab.com/products/minitab/free-trial/">free for 30 days</a>. </p>
Discriminant Analysis in Action
<img alt="Arctic wolf" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/43484b551c0cc2eacb1b848678d666be/wolf.jpg" style="line-height: 20.8px; margin: 10px 15px; float: right; width: 241px; height: 300px;" />
<div>
<p>I have a <a href="//cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/479b4fbd-f8c0-4011-9409-f4109cc4c745/Image/9429cbd678e906f6bbbda0793aa859f6/discrimdata.mtw">data set</a> with variables containing data on both Rocky Mountain and Arctic wolves. We already know which species each observation belongs to; the main goal of this analysis is to find out how the data we have contribute to the groupings, and then to use this information to help us classify new individuals. </p>
<p>In Minitab, we set up our worksheet to be column-based, as usual. We have a column denoting the species of wolf, as well as 9 other columns containing measurements for each individual on a number of different features.</p>
<p>Once we have our continuous predictors and a group identifier column in our worksheet, we can go to <strong>Stat > Multivariate > Discriminant Analysis</strong>. Here's how we'd fill out the dialog:</p>
<p style="margin-left: 40px;"><img alt="dialog" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/732ead34-1005-4470-b034-d7f8b87fabcf/Image/bbfff731ce2f30923c064a73324dba1e/discrimdia.png" style="width: 448px; height: 336px;" /></p>
<p>'Groups' is where you enter the column that identifies which group each observation falls into. In this case, "Location" is the species ID column. Our predictors, in my case X1-X9, represent the measurements of the individual wolves for each of 9 categories; we'll use these to find out which characteristics drive the groupings.</p>
<p>One note before we click OK: we're using a Linear discriminant function for simplicity. This assumes that the covariance matrices are equal for all groups, which is something we can verify using Bartlett's Test (also available in Minitab). Once we have our dialog filled out, we can click OK and see our results.</p>
Using the Linear Discriminant Function to Classify New Observations
<p>One of the most important parts of the output we get is called the Linear Discriminant Function. In our example, it looks like this:</p>
<p style="margin-left: 40px;"><img alt="function" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/732ead34-1005-4470-b034-d7f8b87fabcf/Image/a3f3b5199c25010c69d3b19843c31b0e/function.PNG" style="width: 303px; height: 208px;" /></p>
<p>This is the function we will use to classify new observations into groups. Its coefficients let us determine which group provides the best fit for a new individual's measurements. Minitab can do this in the "Options" subdialog. For example, suppose we have a new observation with a certain vector of measurements (X1,...,X9). If we ask Minitab to classify that observation, we get output like this:</p>
<p style="margin-left: 40px;"><img alt="pred" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/732ead34-1005-4470-b034-d7f8b87fabcf/Image/49873dcbc94d8aa1ae75a45474aaf147/predic.PNG" style="width: 421px; height: 119px;" /></p>
<p>This gives us the probability that a particular new observation falls into each of our groups. In our case, it was an easy one: the probability that it belongs to the AR species was 1. We're reasonably sure, based on the data, that this is the case. In some cases, you may get probabilities much closer to each other, meaning it isn't as clear cut.</p>
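<p>The general idea of classifying with linear discriminant functions can be sketched in a few lines. Everything numeric here is hypothetical: the constants, the coefficients, the three predictors, and the "RM" label are invented for illustration and are not the values from the output above. The sketch also assumes that each group's score is its log-posterior up to a shared constant, so that group probabilities follow from a softmax of the scores; whether Minitab reports its function on exactly that scale is an assumption.</p>

```python
import math

# Hypothetical constants and coefficients for two groups and three
# predictors; real values would come from the discriminant analysis output.
discriminant_functions = {
    "AR": {"constant": -10.0, "coefficients": [1.2, 0.5, 2.0]},
    "RM": {"constant": -8.0, "coefficients": [0.9, 0.8, 1.1]},
}


def classify(measurements):
    """Return (best group, group probabilities) for one new observation."""
    # Linear score per group: constant + sum of coefficient * measurement.
    scores = {
        group: fn["constant"]
        + sum(c * x for c, x in zip(fn["coefficients"], measurements))
        for group, fn in discriminant_functions.items()
    }
    best_group = max(scores, key=scores.get)

    # Softmax of the scores; subtracting the top score avoids overflow.
    weights = {g: math.exp(s - scores[best_group]) for g, s in scores.items()}
    total = sum(weights.values())
    probabilities = {g: w / total for g, w in weights.items()}
    return best_group, probabilities


# Hypothetical measurements for one new individual.
best_group, probabilities = classify([4.0, 3.0, 5.0])
print(best_group, probabilities)
```

<p>When the scores of two groups are nearly equal, the probabilities come out close to each other, which is the "not as clear cut" situation described above.</p>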
<p>I hope this gives you some idea of the usefulness of discriminant analysis, and how you can use it in Minitab to make decisions.</p>
</div>
Data Analysis, Hypothesis Testing, Statistics
Mon, 16 May 2016 12:00:00 +0000
http://blog.minitab.com/blog/starting-out-with-statistical-software/an-overview-of-discriminant-analysis
Eric Heckman
Tests of 2 Standard Deviations? Side Effects May Include Paradoxical Dissociations
http://blog.minitab.com/blog/data-analysis-and-quality-improvement-and-stuff/tests-of-2-standard-deviations-side-effects-may-include-paradoxical-dissociations
<p>Once upon a time, when people wanted to compare the standard deviations of two samples, they had two handy tests available, the F-test and Levene's test.</p>
<p>Statistical lore has it that the F-test is so named because <a href="##footnote">it so frequently fails you.<sup>1</sup></a> Although the F-test is suitable for data that are normally distributed, its sensitivity to departures from <span><a href="http://blog.minitab.com/blog/the-statistical-mentor/anderson-darling-ryan-joiner-or-kolmogorov-smirnov-which-normality-test-is-the-best">normality</a></span> limits when and where it can be used.</p>
<p><a name="#back"></a>Levene’s test was developed as an antidote to the F-test's extreme sensitivity to nonnormality. However, Levene's test<span style="line-height: 1.6;"> is sometimes accompanied by a troubling side effect: paradoxical </span>dissociations<span style="line-height: 1.6;">. To see what I mean, take a look at these results from an </span><span style="line-height: 1.6;">actual </span><span style="line-height: 1.6;">test of 2 standard deviations that I actually ran in Minitab 16 using actual data that I actually made up:</span></p>
<p style="margin-left: 40px;"><img alt="Ratio of the standard deviations in Release 16" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/313db9f57725eeb074002df423c4415e/16_ratio.jpg" style="width: 286px; height: 99px;" /></p>
<p>Nothing surprising so far. The ratio of the standard deviations from samples 1 and 2 (s1/s2) is <span style="line-height: 20.8px;">1.414 / 1.575 = 0.898. This ratio is </span>our best "point estimate" for the ratio of the standard deviations from populations 1 and 2 (Ps1/Ps2).</p>
<p>Note that the ratio is less than 1, which suggests that Ps2 is greater than Ps1. </p>
<p>Now, let's have a look at the confidence interval (CI) for the population ratio. The CI gives us a range of likely values for the ratio of Ps1/Ps2. The CI <span style="line-height: 20.8px;">below</span><span style="line-height: 1.6;"> labeled "Continuous" is the one calculated using Levene's method:</span></p>
<p style="margin-left: 40px;"><img alt="Confidence interval for the ratio in Release 16" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/aee886880d52d5aed7150abd242b5d61/16_ci.jpg" style="width: 338px; height: 114px;" /></p>
<p><span style="line-height: 1.6;">What in Gauss' name is going on here?!? The range of likely values for Ps1/Ps2—1.046 to 1.566—doesn't include the point estimate of 0.898?!? In fact, the CI suggests that Ps1/Ps2 is </span><em style="line-height: 1.6;">greater </em><span style="line-height: 1.6;">than 1. Which suggests that Ps1 is actually </span><em style="line-height: 20.8px;">greater </em><span style="line-height: 1.6;">than Ps2. </span></p>
<p><span style="line-height: 1.6;">But the point estimate suggests the exact opposite! Which suggests that </span><span style="line-height: 20.8px;">something odd is going on here. Or that</span><span style="line-height: 1.6;"> I might be losing my mind (which wouldn't be that odd). Or both.</span></p>
<p>As it turns out, the very elements that make Levene's test robust to departures from normality also leave the test susceptible to paradoxical dissociations like this one. You see, Levene's test isn't <em>actually </em>based on the standard deviation. Instead, the test is based on a statistic called the <em>mean absolute deviation from the median</em>, or MADM. The MADM is much less affected by nonnormality and outliers than is the standard deviation. And even though the MADM and the <span style="line-height: 20.8px;">standard deviation of a sample </span>can be very different, the <em>ratio </em>of MADM1/MADM2 is nevertheless a good approximation for the <em>ratio </em>of Ps1/Ps2. </p>
<p><span style="line-height: 1.6;">However, in extreme cases, outliers can affect the sample standard deviations so much that s1/s2 can fall completely outside of Levene's CI. And that's when you're left with an awkward and confusing case of paradoxical dissociation. </span></p>
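<p>A small made-up example shows how the two ratios can point in opposite directions. The samples below are hypothetical, not the data from the Minitab 16 output, and the sketch only compares the two ratios; it does not reproduce Levene's confidence interval.</p>

```python
from statistics import median, stdev


def madm(sample):
    """Mean absolute deviation from the median."""
    center = median(sample)
    return sum(abs(x - center) for x in sample) / len(sample)


# Hypothetical data: sample_1 is evenly spread out, while sample_2 is
# tightly clustered except for a single outlier (25).
sample_1 = [8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
sample_2 = [11, 11.5, 12, 12, 12.5, 12.5, 13, 13, 13.5, 25]

sd_ratio = stdev(sample_1) / stdev(sample_2)
madm_ratio = madm(sample_1) / madm(sample_2)

print(f"s1/s2       = {sd_ratio:.3f}")
print(f"MADM1/MADM2 = {madm_ratio:.3f}")
```

<p>The outlier inflates the standard deviation of the second sample far more than its MADM, so s1/s2 falls below 1 while MADM1/MADM2 stays above 1.</p>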
<p><span style="line-height: 1.6;">Fortunately (and this may be the first and last time that you'll ever hear this next phrase), our </span><span style="line-height: 1.6;">statisticians have made things a lot less awkward. </span><span style="line-height: 1.6;">One of the brave folks in Minitab's R&amp;D department toiled against all odds, and at considerable personal peril, to solve this enigma. The result, which has been incorporated into Minitab 17, is an effective, elegant, and </span>non-enigmatic<span style="line-height: 1.6;"> test that we call Bonett's test. </span></p>
<p style="margin-left: 40px;"><span style="line-height: 1.6;"><img alt="Confidence interval in Release 17" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/3c014cdea970a3f1f6a540119ef3b533/bonnet_results.jpg" style="width: 310px; height: 170px;" /></span></p>
<p>Like Levene's test, Bonett's test can be used with nonnormal data. But <em>unlike </em>Levene's test, Bonett's test is actually based on the actual standard deviations of the actual samples. Which means that Bonett's test is not subject to the same awkward and confusing paradoxical dissociations that can accompany Levene's test. And I don't know about you, but I try to avoid paradoxical dissociations whenever I can. (Especially as I get older, ... I just don't bounce back the way I used to.) </p>
<p><span style="line-height: 20.8px;">When you compare two standard deviations in Minitab 17, you get a handy graphical report </span><span style="line-height: 20.8px;">that quickly and clearly summarizes the results of your test, including the point estimate and the CI from Bonett's test. Which means n</span><span style="line-height: 20.8px;">o more awkward and confusing paradoxical dissociations. </span></p>
<p style="margin-left: 40px;"><img alt="Summary plot in Release 17" src="http://cdn.app.compendium.com/uploads/user/458939f4-fe08-4dbc-b271-efca0f5a2682/8de770ba-a50a-4f6b-9144-9713c3b99f66/Image/b785749b3292df1aa6d32abe4e430b63/17_summary_plot.jpg" style="width: 578px; height: 386px;" /></p>
<p><span style="line-height: 1.6;">------------------------------------------------------------</span></p>
<p><a name="#footnote"> </a></p>
<p>1 So, that bit about the name of the F-test—I kind of made that up. Fortunately, there is a better source of information for the genuinely curious. Our white paper, <a href="http://support.minitab.com/en-us/minitab/17/bonetts_method_two_variances.pdf">Bonett's Method</a>, includes all kinds of details about these tests and comparisons between the CIs calculated with each. Enjoy.</p>
<p> <br />
<em><a href="##back">return to text of post</a></em></p>
Hypothesis Testing, Statistics, Stats
Wed, 11 May 2016 12:00:00 +0000
http://blog.minitab.com/blog/data-analysis-and-quality-improvement-and-stuff/tests-of-2-standard-deviations-side-effects-may-include-paradoxical-dissociations
Greg Fox