Jim Frost

Data analysis gives you the keys to how to manufacture the best product, provide the best services, or answer an academic research question. I’ll share practical tidbits that may help you do just that.

Previously, I’ve written about how to interpret regression coefficients and their individual P values. I’ve also written about how to interpret R-squared to assess the strength of the relationship between your model and the response variable. Recently I've been asked, how does the F-test of the overall significance and its P value fit in with these other statistics? That’s the topic of this post! In... Continue Reading
Scientists who use the Hubble Space Telescope to explore the galaxy receive a stream of digitized images in the form of binary code. In this state, the information is essentially worthless; these 1s and 0s must first be converted into pictures before the scientists can learn anything from them. The same is true of statistical distributions and parameters that are used to describe sample data. They... Continue Reading
In my previous post, I wrote about the hypothesis testing ban in the Journal of Basic and Applied Social Psychology. I showed how P values and confidence intervals provide important information that descriptive statistics alone don’t provide. In this post, I'll cover the editors’ concerns about hypothesis testing and how to avoid the problems they describe. The editors describe hypothesis testing... Continue Reading
Banned! In February 2015, editor David Trafimow and associate editor Michael Marks of the Journal of Basic and Applied Social Psychology declared that the null hypothesis statistical testing procedure is invalid. They promptly banned P values, confidence intervals, and hypothesis testing from the journal. The journal now requires descriptive statistics and effect sizes. They also encourage large... Continue Reading
The 2016 presidential race is becoming more real. We’ve had several announcements with Ted Cruz, Rand Paul, Hillary Clinton, and Marco Rubio officially entering the race to be President. While the prospective Democratic candidates are down to one, or at most a few, the Republican field is extra-large this election cycle. The first order of business for a GOP candidate is to survive the nomination... Continue Reading
In this series of posts, I show how hypothesis tests and confidence intervals work by focusing on concepts and graphs rather than equations and numbers. Previously, I used graphs to show what statistical significance really means. In this post, I’ll explain both confidence intervals and confidence levels, and how they’re closely related to P values and significance levels. How to Correctly... Continue Reading
This is a companion post for a series of blog posts about understanding hypothesis tests. In this series, I create a graphical equivalent to a 1-sample t-test and confidence interval to help you understand how it works more intuitively. This post focuses entirely on the steps required to create the graphs. It’s a fairly technical and task-oriented post designed for those who need to create the... Continue Reading
What do significance levels and P values mean in hypothesis tests? What is statistical significance anyway? In this post, I’ll continue to focus on concepts and graphs to help you gain a more intuitive understanding of how hypothesis tests work in statistics. To bring it to life, I’ll add the significance level and P value to the graph in my previous post in order to perform a graphical version of... Continue Reading
Hypothesis testing is an essential procedure in statistics. A hypothesis test evaluates two mutually exclusive statements about a population to determine which statement is best supported by the sample data. When we say that a finding is statistically significant, it’s thanks to a hypothesis test. How do these tests really work and what does statistical significance actually mean? In this series of... Continue Reading
It’s safe to say that most people who use statistics are more familiar with parametric analyses than nonparametric analyses. Nonparametric tests are also called distribution-free tests because they don’t assume that your data follow a specific distribution. You may have heard that you should use nonparametric tests when your data don’t meet the assumptions of the parametric test, especially the... Continue Reading
As someone who has collected and analyzed real data for a living, I find that the idea of using simulated data for a Monte Carlo simulation sounds a bit odd. How can you improve a real product with simulated data? In this post, I’ll help you understand the methods behind Monte Carlo simulation and walk you through a simulation example using Devize. What is Devize, you ask? Devize is Minitab's exciting new,... Continue Reading
Choosing the correct linear regression model can be difficult. After all, the world and how it works is complex. Trying to model it with only a sample doesn’t make it any easier. In this post, I'll review some common statistical methods for selecting models, describe complications you may face, and provide some practical advice for choosing the best regression model. It starts when a researcher wants to... Continue Reading
Last fall I had a birthday. It wasn’t one of those tougher birthdays where the number ends in a zero. Still, the birthday got me thinking. In response, I told myself, age is just a number. Then I did a mental double-take. Can a statistician say that? After all, numbers are how I understand the world and the way it works. Can age just be a number? After some musing, I concluded that age is just a... Continue Reading
Stepwise regression and best subsets regression are both automatic tools that help you identify useful predictors during the exploratory stages of model building for linear regression. These two procedures use different methods and present you with different output. An obvious question arises. Does one procedure pick the true model more often than the other? I’ll tackle that question in this post. Fi... Continue Reading
Analysis of variance (ANOVA) is great when you want to compare the differences between group means. For example, you can use ANOVA to assess how three different alloys are related to the mean strength of a product. However, most ANOVA tests assess one response variable at a time, which can be a big problem in certain situations. Fortunately, Minitab statistical software offers a... Continue Reading
Using a sample to estimate the properties of an entire population is common practice in statistics. For example, the mean from a random sample estimates that parameter for an entire population. In linear regression analysis, we’re used to the idea that the regression coefficients are estimates of the true parameters. However, it’s easy to forget that R-squared (R²) is also an estimate.... Continue Reading
I’ve written about the importance of checking your residual plots when performing linear regression analysis. If you don’t satisfy the assumptions for an analysis, you might not be able to trust the results. One of the assumptions for regression analysis is that the residuals are normally distributed. Typically, you assess this assumption using the normal probability plot of the residuals. Are... Continue Reading
Astronomy is cool! And, it’s gotten even more exciting with the search for exoplanets. You’ve probably heard about newly discovered exoplanets that are extremely different from Earth. These include hot Jupiters, super-cold iceballs, super-heated hellholes, very-low-density puffballs, and ultra-speedy planets that orbit their star in just hours. And then there is PSR J1719-1438 which has the mass... Continue Reading
In my previous post, I described how I was asked to weigh in on the ethics of researchers (DeStefano et al. 2004) who reportedly discarded data and potentially set scientific knowledge back a decade. I assessed the study in question and found that no data was discarded and that the researchers used good statistical practices. In this post, I assess a study by Brian S. Hooker that was... Continue Reading
The other day I received a request from a friend to look into a new study in a peer-reviewed journal that found a link between MMR vaccinations and an increased risk of autism in African American boys. To draw this conclusion, the new study reanalyzed data that was discarded a decade ago by a previous study. My friend wanted to know, from a statistical perspective, was it unethical for the... Continue Reading