A/B Testing

This is page three of the Growth Marketing Guide.

This page helps you improve the rate at which visitors/users advance to the next step in your growth funnel.

Specifically, we're focused on increasing the rate at which people sign up and purchase.

The process of improving conversion is called A/B testing: it's the science of testing changes to see if they improve performance.

For example, you could rewrite the top half of your landing page or you could switch from a paid trial to a free trial. These changes might increase your sign up rate.

Your job is to figure out what's worth testing.

We'll cover:

  • Deciding what to A/B test
  • Finding the most valuable tests
  • Asking your team to implement tests

A/B testing is fundamental to growth

In a test, each thing you're testing is called a variant. For example, your existing site may be Variant A. The change you're comparing it to may be called Variant B.

Hence, "A/B" testing.

Testing makes or breaks growth. I've worked with many companies that couldn't get Facebook ads to run profitably, then later achieved success through three months' worth of landing page A/B testing: they continuously made their visuals more enticing and their messaging clearer.

The A/B testing cycle

Here's the testing cycle:

  1. Decide what change to test.
  2. Use Google Optimize (an A/B testing tool) to show half your visitors the change.
  3. Run this test until you reach a statistically significant sample of visitors.
  4. When enough data is collected, Google Optimize will report the likelihood that your change had a significant effect on conversion. If it caused a significant positive difference, you should consider implementing it.
  5. Log the design and results of your experiment to inform future experiments.
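Step 2 above relies on splitting visitors evenly between variants. Google Optimize does this for you, so the following is only a minimal sketch of the idea, not how that tool works internally. The function name, the experiment label, and the visitor IDs are all hypothetical.

```python
import hashlib


def assign_variant(visitor_id: str, experiment: str = "landing-page-test") -> str:
    """Deterministically place a visitor in variant "A" or "B" (50/50 split).

    Hypothetical helper for illustration; a testing tool normally does this.
    """
    # Hash the experiment name together with the visitor ID so the same
    # visitor always lands in the same variant for a given experiment.
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # a number from 0 to 99
    return "A" if bucket < 50 else "B"


# Example: the assignment is stable across repeat visits.
for visitor in ["visitor-1", "visitor-2", "visitor-3"]:
    print(visitor, assign_variant(visitor))
```

Hashing rather than picking randomly on each page load means a returning visitor keeps seeing the same variant, which keeps the two groups cleanly separated.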

Repeat these steps until you run out of variant ideas. Never have downtime; every day of the year, a test should be running—or you're letting traffic go to waste.

A/B testing isn't about striving for perfection with each variant. It's about iteration.

Sourcing A/B ideas

Here's where I source ideas from:

Testing the growth funnel

An A/B variant is only better if it increases your bottom line.

If you discover that a variant motivates visitors to click a button 10x more, but button clicking doesn’t actually lead to greater signups or purchases, then your variant isn’t better than the original. All it's done is distract users into clicking a button.

For each A/B test, keep your eye on the prize: What is the meaningful funnel metric you're trying to increase? Often, it's email captures, sign ups, purchases, and retention.
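To make the button-click example concrete, here is a small sketch that compares two variants on both a vanity metric and a funnel metric. All counts are made up for illustration, and the metric names are assumptions.

```python
def conversion_rate(conversions: int, visitors: int) -> float:
    """Share of visitors who completed the action (0.0 if there were no visitors)."""
    return conversions / visitors if visitors else 0.0


# Hypothetical results: Variant B gets 10x the button clicks of Variant A.
results = {
    "A": {"visitors": 1000, "button_clicks": 50, "signups": 30},
    "B": {"visitors": 1000, "button_clicks": 500, "signups": 29},
}


def summarize(results: dict, metric: str) -> dict:
    """Return the conversion rate per variant for one metric."""
    return {
        name: conversion_rate(counts[metric], counts["visitors"])
        for name, counts in results.items()
    }


click_rates = summarize(results, "button_clicks")
signup_rates = summarize(results, "signups")

print("click rate:", click_rates)
print("signup rate:", signup_rates)
```

Judged by clicks, Variant B looks like a runaway winner. Judged by sign ups, the metric that matters, it is no better than the original.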

Of these, you'll more often A/B test earlier parts of the funnel—for two reasons:

Product changes are as important as early-funnel changes, but they're outside the scope of this handbook.

What to A/B test on your landing page

There are two types of variants: micros and macros.

Micro variants are small, quick changes. They're unlikely to have a large impact. For example, changing a button's color (a micro variant) typically won't have more than 2% conversion impact—at best.

Macro variants, on the other hand, are significant rethinkings of your asset. Entirely rewriting a landing page can increase conversion by 50-300%. This happens often, although you'll usually get only a couple of boosts before facing diminishing returns.

Your goal is to focus on big, macro impacts—because every A/B test has an opportunity cost: you're usually only running one test per audience at a time.

Macro variants

Macro variants require considerable effort: It’s hard to repeatedly summon the focus and company-wide collaboration needed to wholly rethink your assets. 

But macros are the only way to see the forest for the trees.

Since the biggest obstacle to testing macros is committing the resources, I urge you to create an A/B testing calendar and adhere to it: Create a recurring event for, say, every two months. On that day, spend a couple of hours brainstorming a macro variant for a step in your growth funnel.

You can do so using one of five approaches:

Micro variants

Now here are micro ideas.

Despite micros being less important, I'm including them because if you piece together enough micros, you sometimes have yourself a macro.

The best micro

When you run out of macros, this is the micro with the greatest impact: change your above-the-fold content.

Every page has an above-the-fold (ATF) section. This is what visitors see before scrolling to the rest of a page. The content placed in your ATF in part determines whether visitors continue scrolling.

Specifically, rewrite your header and subheader copy. Header text is the first hook visitors encounter for your product. So, if you've been unknowingly showing visitors unenticing messaging here, fixing it can have a noticeable impact on conversion.

If you're looking for a third party to help you with this process, my growth training program, Demand Curve, can assist.

Prioritizing A/B tests

An A/B test has an opportunity cost; you only have so many visitors to test against. So prioritize thoughtfully.

Here are the factors I consider:

Setting up A/B tests

Two things to understand about proper test design:

Google Optimize handles all this A/B testing logic for you.

Consider only targeting new users

When setting up tests, consider who should be included in them. It doesn't have to be everyone.

For example, consider only showing an experiment to visitors arriving at your site for the first time. This ensures that everyone in the test has the same base level of familiarity with your product.

To target only new users in Google Optimize, follow Example 1 in these instructions:

How to configure targeting settings in a Google Optimize experiment

Professional growth services

If you think this page is in-depth, growth marketing gets much, much deeper. My team will train you or your company in growth marketing. See our programs at Demand Curve.

Assessing A/B test results

For test results to be statistically valid, you need to reach a sufficiently large sample size. The math is simple:

The implication is that if you don’t have a lot of traffic, the opportunity cost is too great to run micro variants, which tend to show conversion increases in just the 1-5% range. Meanwhile, macros have the potential to produce 10-20%+ improvements, which is well above the 6.3% threshold.
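The figures this page uses (a 6.3% lift paired with 1,000 sessions, and small lifts needing 10,000 sessions) are consistent with a simple rule of thumb: the smallest lift you can validate is roughly 2 divided by the square root of your session count. Treat that formula as an assumption inferred from those figures, not as a statement of how any testing tool computes significance. A rough sketch:

```python
import math


def min_detectable_lift(sessions: int) -> float:
    """Smallest relative conversion lift considered detectable.

    Assumed rule of thumb: 2 / sqrt(sessions). This is an approximation
    inferred from the thresholds quoted on this page.
    """
    if sessions <= 0:
        raise ValueError("sessions must be positive")
    return 2 / math.sqrt(sessions)


def sessions_needed(lift: float) -> int:
    """Sessions required before a given relative lift clears the same rule of thumb."""
    if lift <= 0:
        raise ValueError("lift must be positive")
    return math.ceil((2 / lift) ** 2)


for n in (1_000, 10_000):
    print(f"{n:>6} sessions -> lifts of about {min_detectable_lift(n):.1%} or more")
```

Under this assumption, low-traffic sites can only confirm large lifts, which is why macro variants are the better use of limited traffic.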

Below is an example of an experiment I ran using Google Optimize:

Read Google's docs (parts one and two) to learn how to interpret these results.

Above, our page had 1,724 views throughout the testing period. Our test variant showed a roughly 30% (29/22) improvement over our baseline.

This 30% number is likely inaccurate, by the way. It's just a reference for the variant's maximum potential. We don't yet have enough sessions to validate this conversion improvement with certainty. But 30% is likely good enough to validate that we improved conversion by at least 6.3% (the number from earlier).

Pay attention to the Google Optimize column labeled Probability to be Best. If a variant’s probability is 70%+ and it has sufficient sessions (e.g. 1,000 and 10,000 as I indicated above in the sample size thresholds), the results are likely statistically sound, and the winning variant should be considered for implementation.
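For intuition about what a "probability to be best" figure represents, here is one common way to estimate it: a Bayesian simulation over each variant's conversion rate. This is a sketch under stated assumptions, not necessarily how Google Optimize calculates its column. The conversion counts are hypothetical, and the 70% and 1,000-session defaults simply mirror the thresholds mentioned above.

```python
import random


def probability_b_beats_a(conv_a: int, n_a: int, conv_b: int, n_b: int,
                          draws: int = 20_000, seed: int = 0) -> float:
    """Estimate P(rate_B > rate_A) by sampling Beta posteriors (uniform prior)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    wins = 0
    for _ in range(draws):
        # Beta(1 + conversions, 1 + non-conversions) is the posterior for a
        # conversion rate when starting from a uniform prior.
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        if rate_b > rate_a:
            wins += 1
    return wins / draws


def worth_considering(prob_best: float, sessions: int,
                      min_prob: float = 0.70, min_sessions: int = 1_000) -> bool:
    """Apply the decision rule: high enough probability AND enough sessions."""
    return prob_best >= min_prob and sessions >= min_sessions


# Hypothetical test: 40 of 1,000 convert on A, 60 of 1,000 convert on B.
p = probability_b_beats_a(40, 1_000, 60, 1_000)
print(f"P(B beats A) is about {p:.2f}; consider implementing: {worth_considering(p, 2_000)}")
```

Both conditions have to hold. A high probability on a few hundred sessions can still be noise.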

Now you can decide if the labor and implementation externalities are worth the 6.3%+ improvement in conversion.

Sample sizes and revenue

What if our results weren't conclusive? What if we didn't surpass a 70% certainty?

Had the experiment revealed merely a 3% increase, for example, we would have to dismiss the sample size of 1,724 as too small for the 3% to be statistically valid. 

We would either end the experiment if we had low confidence in it, or accept the testing opportunity cost and continue until we reached 10,000 sessions. If the 3% increase remained after 10,000 sessions, we'd conclude it's likely valid.

But, as mentioned in the previous section, if you have little traffic to begin with, don't risk waiting on a small, 3% improvement. Instead, consider a new test.

However, if that small change is tied to a meaningful revenue objective (e.g. purchases) as opposed to, say, people providing their email addresses, then perhaps it's worth continuing.

In other words, the closer an experiment's conversion objective is to revenue, the more worthwhile it may be to confirm small conversion boosts.

Don't implement negligible wins

Don't implement A/B variants that win negligibly. The unknown downsides of implementation often outweigh the expected value of the gain.

For example, a change may introduce unforeseen funnel consequences that won't be obvious for a few months. It'll later be difficult to identify this as the root cause.

Consider degree of intent

However, sometimes negligible wins are worth re-running on a new audience.

Consider this: when running A/B tests to improve conversion, you'll get diminishing returns on conversion gains for already high-intent traffic (e.g. organic search, referrals, and word of mouth). Those visitors came looking for you of their own accord. They're already interested. The onus is on you to reassure them that you sell what they're expecting, and to not scare them off.

In contrast, A/B testing has the potential to provide much larger returns for, say, ad traffic. These visitors are medium-intent at best; often they clicked your ad on a whim. They're looking for excuses to dismiss your value props and leave immediately.

This is where A/B tests shine: they're more effective at significantly improving conversion rates for low-to-medium intent traffic—because there's a greater interest gap to cover.

Here’s the implication: If you only A/B test against high-intent traffic, you may not notice a significant improvement and may mistakenly dismiss your test as a global failure. When this happens, but you're confident the variant does have potential, retry the test on paid traffic. That’s where the improvement may be large enough to register as significant.
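One way to check for this is to break results down by traffic source instead of looking only at the blended number. The sketch below uses made-up visitor and conversion counts, and the segment names are assumptions.

```python
def relative_lift(rate_variant: float, rate_baseline: float) -> float:
    """Relative improvement of the variant over the baseline (0.10 means +10%)."""
    return rate_variant / rate_baseline - 1


# Hypothetical (visitors, conversions) per variant for each traffic source.
segments = {
    "organic_search": {"A": (5000, 500), "B": (5000, 510)},
    "paid_ads": {"A": (5000, 100), "B": (5000, 140)},
}


def lift_by_segment(segments: dict) -> dict:
    """Compute Variant B's relative lift over Variant A within each segment."""
    lifts = {}
    for name, variants in segments.items():
        visitors_a, conversions_a = variants["A"]
        visitors_b, conversions_b = variants["B"]
        lifts[name] = relative_lift(
            conversions_b / visitors_b, conversions_a / visitors_a
        )
    return lifts


lifts = lift_by_segment(segments)
for name, lift in lifts.items():
    print(f"{name}: {lift:+.0%}")
```

In this invented data set the same variant barely moves high-intent organic traffic but lifts paid traffic substantially, which is the pattern described above.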

How to share results with your team

I use a task management tool, like Trello, to track A/B tests. I note the following:

When the test is finished, I further make note of:

Refer to these past tests before running new ones. Learn from your past mistakes.

Here's the point

Three takeaways:

Next: Onboarding

How to excite new users into using your product. And how virality works.