Get Back to Marketing Basics: Give A/B Email Testing a Try

Agnes Ogee and Robert Collis | 3/29/2024

Topics: Minitab Statistical Software, Data Analysis

If you are a marketer, you know that sending promotional emails and simply hoping for the best is not best practice. Ideally, you should take the time to explore how your emails perform and uncover measures you can take to improve that performance.

 

With a little effort, you can send two versions of a marketing email to a sample of your readers and compare their success metrics before sending the best version to the whole audience. The success metrics used to identify the winning version are typically the open rate and/or the click through rate.

The open rate is the percentage of people who opened your email out of those who received it. The click through rate is the percentage of contacts who received your email and clicked on at least one link in the message, demonstrating more engagement than simply opening the email.
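If you like to sanity-check these metrics outside your email platform, here is a minimal Python sketch of the two definitions. The counts are made-up placeholders for illustration, not figures from this campaign.

```python
def open_rate(opens: int, delivered: int) -> float:
    """Percentage of delivered emails that were opened."""
    return 100 * opens / delivered


def click_through_rate(clicks: int, delivered: int) -> float:
    """Percentage of delivered emails with at least one link click."""
    return 100 * clicks / delivered


# Hypothetical counts, for illustration only
print(open_rate(opens=2_500, delivered=10_000))          # 25.0 (%)
print(click_through_rate(clicks=34, delivered=10_000))   # 0.34 (%)
```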

We are going to share a use case of how this worked for our marketing team at Minitab. We ran an A/B test to compare two versions of an email and reviewed the click through rates as success metrics.

[Image: A/B test email process mapped using the Minitab Workspace process mapping tool]
The team documented the test with a process map to outline the steps.

 


Learn more about process maps in Minitab Workspace® in our blog, Getting Started with Process Maps >

Read the Blog


 

Before we ran the test, we asked the following questions:

1. How can we determine the sample sizes required to test two versions of an email, so that the success metrics are estimated with sufficient precision?

2. How can we compare two versions of an email, so that a particular difference in the success metric can be detected?

Both questions require a baseline in terms of success metrics, so we used historical data to make estimates. We wanted to know what the benchmark success metrics were for a similar marketing email aimed at the same target audience. In this example, our benchmark is a similar email sent to a lookalike audience where 340 readers out of 100,000 clicked on one or several links inserted in the email. The baseline click through rate is 0.34%.

PUT IT INTO PRACTICE:

1st way to calculate sample sizes for comparing the success of two versions of an email:

You can calculate the number of readers to sample using the Sample Size for Estimation command in Minitab® Statistical Software to estimate your click through rate within a certain margin of error.

For example, let’s take sample sizes of 10,000 contacts from a population at least 10 times larger, that is, at least 100,000 readers in total. If the expected click through rate is our benchmark of 0.34%, the lower bound of the 95% confidence interval would be 0.24% and the upper bound would be 0.47%.

[Image: Sample size for margin of error estimation in Minitab Statistical Software]

Use Stat>Power and Sample Size>Sample Size for Estimation in Minitab Statistical Software®. The 95% confidence interval would be between 0.24% and 0.47% for the click through rate.

Because the sample size is small compared to the size of the population, using either the binomial or the hypergeometric distribution to model the sampled data would provide similar results. It is your call to decide whether that margin of error for the click through rate is acceptable and will estimate this success metric with sufficient precision.
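If you would like to reproduce this kind of interval outside Minitab, the sketch below uses the statsmodels Python library’s exact (Clopper-Pearson) binomial confidence interval, treating the expected 34 clicks out of a 10,000-contact sample as if they had been observed. That is our assumption about how to mirror the calculation in code; Minitab’s Sample Size for Estimation output remains the reference, and the bounds may differ slightly.

```python
from statsmodels.stats.proportion import proportion_confint

clicks, sample_size = 34, 10_000   # expected clicks at the 0.34% benchmark rate

# Exact (Clopper-Pearson) 95% confidence interval for the click through rate
lower, upper = proportion_confint(count=clicks, nobs=sample_size,
                                  alpha=0.05, method="beta")
print(f"95% CI for the click through rate: {lower:.2%} to {upper:.2%}")
# The bounds should land in the neighborhood of the 0.24% and 0.47% reported
# by Minitab's Sample Size for Estimation output above.
```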

2nd way to calculate sample sizes for comparing the success of two versions of an email:

Using the power and sample size calculation for the 2 Proportions test (the test itself is found in the Stat>Basic Statistics>2 Proportions menu of Minitab Statistical Software), you can determine what sample size is required to detect a certain difference between the click through rates of the two email versions with a required probability, or power.

The benchmark click through rate for your first email campaign is 0.34%. With a sample size of 10,000 contacts for each email version, you could detect a change in the click through rate from 0.34% to 0.63% in 90% of cases.

[Image: Sample size calculation with the test for two proportions in Minitab Statistical Software]

Use Stat>Power and Sample Size>2 Proportions to determine the sample size.

If you wanted to detect a smaller difference between the click through rates, you could increase the sample size to see what impact it has on the test sensitivity.
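For readers who want a rough cross-check of this kind of power calculation in code, the sketch below uses statsmodels’ normal-approximation power analysis for two proportions (via Cohen’s effect size h) with a one-sided alternative. This is not the exact formula Minitab uses for 2 Proportions, so expect figures in the same ballpark as the 10,000-contact recommendation rather than an exact match.

```python
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

baseline_ctr = 0.0034   # 0.34% benchmark click through rate (version A)
target_ctr = 0.0063     # click through rate we would like to be able to detect

# Cohen's h: standardized effect size for comparing two proportions
effect = proportion_effectsize(target_ctr, baseline_ctr)

power_analysis = NormalIndPower()

# Approximate power to detect the increase with 10,000 contacts per version,
# testing one-sided (is the new version's rate larger?)
power = power_analysis.power(effect_size=effect, nobs1=10_000,
                             alpha=0.05, ratio=1.0, alternative="larger")
print(f"Approximate power with 10,000 contacts per version: {power:.0%}")

# Or solve for the per-version sample size that reaches 90% power
n_per_version = power_analysis.solve_power(effect_size=effect, alpha=0.05,
                                           power=0.90, ratio=1.0,
                                           alternative="larger")
print(f"Contacts needed per version for 90% power: {n_per_version:.0f}")
```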


Ready to determine the sample size for your own comparisons? Start your free trial of Minitab Statistical Software >

Start Free Trial


What statistical test can help to compare the success of each of the two email versions sent?

Now that the test has run, we’d like to compare results to select the best version of the email.

[Image: A/B test email results showing the open rate and click through rate for each version]

Number of emails delivered, opened, and clicked through for each of the two versions of the email

Although the percentages appear higher for version B in this descriptive table, it’s important to use a statistical test to check whether the difference is significant for the success metric the team is looking at in this example, i.e. the click through rate.

The statistical test is called 2-Sample % Defective and is accessible in the Assistant menu of Minitab Statistical Software. Classically, this test is used to detect a difference in the defect rates of two populations via sampling. In this example, the reference event is not a defective unit, it’s a click through. So, in this analysis, a high "defect" rate means a high click through rate.
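The Assistant handles this comparison for you, but if you are curious what a comparable calculation looks like in code, here is a sketch using a standard two-proportion z-test from statsmodels. This is not necessarily the same method the Assistant’s 2-Sample % Defective report uses under the hood, and the click and delivery counts below are placeholders rather than our campaign’s actual numbers.

```python
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

# Placeholder counts -- substitute the actual number of click throughs and
# delivered emails for each version from your email platform
clicks = np.array([28, 41])           # version A, version B
delivered = np.array([8_000, 8_000])  # version A, version B

# One-sided test: is version A's click through rate lower than version B's?
stat, p_value = proportions_ztest(count=clicks, nobs=delivered,
                                  alternative="smaller")
print(f"z = {stat:.2f}, p-value = {p_value:.3f}")
# Compare the p-value to your chosen significance level (the Assistant report
# in this example works at the 10% level).
```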

[Image: A/B test summary report for the click through rate difference]

Use Assistant>Hypothesis Tests>Compare two samples with each other>2-Sample % Defective to compare the click through rates of the two versions of the email: version A and version B

[Image: A/B test diagnostic report for the click through rate difference]

[Image: A/B test report card for the click through rate difference]

The output provides guidelines on how to interpret the data.

The statistical output reveals that the data does not provide sufficient evidence, at the 10% significance level, to conclude that the % defective, here the % of click throughs, of version A is significantly lower than that of version B.

The reason is simple: each email version was not delivered to the recommended samples of 10,000 readers. As in many real-life experiments, the test could not be run under recommended conditions.

The sample sizes used only enable detection of a real increase in the click through rate of about 1% from email version A to version B, when that difference exists.
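If you want to estimate how the sample size you actually achieved limits the detectable difference, you can invert the same kind of power calculation. The sketch below reuses the Cohen’s h approximation from statsmodels with a made-up per-version delivery count, so treat it as an illustration of the approach rather than a reproduction of our numbers.

```python
import numpy as np
from statsmodels.stats.power import NormalIndPower

baseline_ctr = 0.0034   # version A's benchmark click through rate
n_delivered = 2_000     # hypothetical contacts actually reached per version

# Smallest standardized effect (Cohen's h) detectable with 90% power
h = NormalIndPower().solve_power(effect_size=None, nobs1=n_delivered,
                                 alpha=0.05, power=0.90, ratio=1.0,
                                 alternative="larger")

# Convert the effect size back into a version B click through rate
detectable_ctr = np.sin(np.arcsin(np.sqrt(baseline_ctr)) + h / 2) ** 2
print(f"Smallest version B rate detectable with 90% power: {detectable_ctr:.2%}")
```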

The team can now decide either to select additional samples to increase the power of the test, or to select one version of the email over the other, even though the difference in click through rates was not shown to be statistically significant. The team can also compare the open rates using a similar method to get another element for comparison.


Interested in more content about A/B email testing, or even A/B/C email testing? Read our blog!

Read the Blog