A/B Testing Playbook
Table of contents
Welcome to the world of web experimentation
A/B testing types
Conclusion
About Contentsquare
Welcome to the world of web experimentation

…and products, but it’s just one piece of a very fragile puzzle: budgets are often tight, users are unpredictable, and changes are risky.

A/B testing is a user experience (UX) research and web experimentation methodology that proves your hypothesis right or wrong. Running a test enables you to launch variations of web pages, apps, or products that you know will yield the best results for a given conversion goal.

But the biggest A/B testing myth is that the benefits end there. Hidden in the losing version of your test are revenue-boosting insights capable of informing future tests and adding real business value—you just need to know how to look for them.

That’s why we’ve created this A/B testing tool kit. We’re here to help you mine every nugget of insight from your test so you can give your audience a digital experience that’s easy, intuitive, and exactly what they need.
At its most basic level, A/B testing allows product, UX, marketing, and ecommerce teams to compare two versions of a web page, app, product, or feature and then determine the top-performing candidate. During the testing period, these two versions—called a control and variation—launch simultaneously to different audiences. Statistical analysis determines the winning version. The version that yields the best result for a given conversion goal is then rolled out to the entire audience.
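To make the statistical analysis step concrete, here is a minimal sketch of how a winner could be identified with a two-proportion z-test. This is an illustrative approach rather than the method used by any particular testing tool, and the visitor and conversion counts are made-up example numbers.

```python
from statistics import NormalDist

def ab_test_p_value(control_visitors, control_conversions,
                    variant_visitors, variant_conversions):
    """Two-sided two-proportion z-test comparing conversion rates."""
    p_control = control_conversions / control_visitors
    p_variant = variant_conversions / variant_visitors
    # Pooled rate under the null hypothesis that both versions convert equally
    p_pool = (control_conversions + variant_conversions) / (control_visitors + variant_visitors)
    standard_error = (p_pool * (1 - p_pool) * (1 / control_visitors + 1 / variant_visitors)) ** 0.5
    z = (p_variant - p_control) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_control, p_variant, p_value

# Made-up example: 10,000 visitors per version
control_rate, variant_rate, p = ab_test_p_value(10_000, 400, 10_000, 460)
print(f"control {control_rate:.2%} vs variant {variant_rate:.2%}, p-value {p:.3f}")
# A p-value below 0.05 would let you call the variant the winner at a 95% confidence level.
```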
…engaged with the review stars. Space NK hypothesized that getting more customers to read reviews would increase conversion and revenue.

They tested two variations of the product review stars and CTA, emphasizing the color and style. They left variation 1 as the control, and in variation 2 they changed the color of the stars and updated the CTA copy. We found that variation 2 encouraged more visitors to click on review stars and interact with customer reviews. This ultimately increased conversion by +30%.

Results: +30% increase in overall conversions
A/B testing types

While A/B testing is the most common example of experimentation, there are several additional testing methodologies you can use to get the data you need, depending on your role, industry, and business size.

Here are 6 of the most useful experimentation types. Which ones are non-negotiable for your experimentation program? Once you know the answer, you’ll be better positioned to choose a testing tool.

A/B test
• Usually tests 2 versions of a web page, app, product, or feature hosted on the same URL
• Lets you answer targeted questions about highly specific modifications
Example: Comparing a blue CTA button to an orange one to see which gets more clicks

Split URL test
• Compares 2 versions of a web page, but each is hosted on a different URL, and traffic is equally distributed between them
• Recommended for bigger tests that require more significant design or backend changes
Example: Comparing 2 vastly different landing page designs that offer different page experiences

Multivariate test
• Tests numerous combinations of variables at once, so may include dozens of versions of a web page
• Used to determine which combination of elements produces the most conversions
Example: Comparing 2 image options and 4 CTA colors in 8 different combinations
Multi-page/multi-funnel test
• An extensive test that looks at users’ journeys through multiple pages that are all part of a particular funnel
• Used to optimize the design of several pages to encourage bottom-of-funnel conversions

Full-stack test
• Tests high-impact non-user interface (UI) variations, like different algorithms, architectures, and other backend variables
• Used by developers to go beyond visual changes and compare complex product variables, rendering tests on the server instead of the user’s browser

Mobile app test
• A server-side test that compares different versions of an in-app experience on mobile
• Used to improve overall mobile app experience and encourage users to convert
Choosing an A/B testing tool
Tool 1
Who it’s for: Product, marketing, engineering, and ecommerce teams (enterprise-grade)
Tests supported: A/B, split URL
Pros: Powerful segmentation engine; includes surveys for additional qualitative insights
Cons: Can be confusing for less technical users; complex user interface compared to other tools
Are they in the Contentsquare partner ecosystem? No

Tool 2
Who it’s for: Marketing, UX, and ecommerce teams (enterprise-grade)
Tests supported: A/B, split URL, multivariate, multi-page/multi-funnel, and mobile app
Pros: A robust customer data platform for in-depth insights; clear and concise results dashboard
Cons: Steep learning curve compared to some other tools; on the pricier side
Are they in the Contentsquare partner ecosystem? Technology Partner

Tool 3
Who it’s for: Marketing teams (best for SMBs)
Tests supported: A/B and multivariate tests (specifically for Unbounce landing pages)
Pros: Integrates with many popular marketing tools; automatically directs visitors to the best-performing variant based on real-time data; intuitive drag-and-drop system
Cons: Limited access to features in lower plans; can’t split test existing landing pages created outside of Unbounce
Are they in the Contentsquare partner ecosystem? No

Tool 4
Who it’s for: Marketing, UX, and ecommerce teams (enterprise-grade)
Tests supported: A/B, split URL, multivariate, multi-page
Pros: Full-stack capabilities included in every plan
Cons: Has some interface usability issues; onboarding is limited
Are they in the Contentsquare partner ecosystem? Technology Partner
*G2 is a tech marketplace that compares software and services based on user ratings and social
data. The G2 rating is a standardized score used to compare products within the same category.
For example, the Space NK team’s hypothesis for their customer review stars test we mentioned earlier could look like this:

“Based on insights from Contentsquare’s analysis on product reviews, we believe that getting more customers to read reviews will result in increased conversion and revenue.”
An evidence-based hypothesis that follows this format ensures you don’t waste time
going down rabbit holes that ultimately lead nowhere. So, instead of immediately
launching into a test, spend some time researching the business metrics
that need improvement. Dig into past initiatives that may have impacted these numbers
and look at existing behavior analytics data. Seeing what users currently do and how
they feel will help inform your A/B test.
…their site. Mitre 10 wanted to A/B test this “no results” page to prevent users from quickly bouncing off the site.

The variation resulted in a +42.3% uplift in revenue for the search audience segment.

Results: +42.3% uplift in revenue for the search audience segment
Abandonment rate refers to the percentage of tasks users start but don’t complete, like leaving a survey midway or adding an item to a shopping cart but not purchasing. A/B testing to prevent abandonment will ultimately increase revenue for any business selling online.
Changes to test
• Replacing product imagery on listings
• Reducing checkout steps to simplify UX
• Comparing a multi- and single-page checkout experience
• Fully redesigning a page to improve its experience

Bounce rate—the percentage of visitors entering and quickly leaving your website without taking additional action—is a good indicator of visitor interest. A high bounce rate is often indicative of website design issues or a content mismatch, giving you more insight into the effectiveness of your experiment.
Changes to test
• Adjusting the messaging and placement of website copy that communicates the value of your website
• Improving page load speed to avoid user frustration

Retention rate is the percentage of users revisiting a website or specific page after a certain period and is a valuable sign of customer loyalty. Comparing retention rates between different A/B test variations helps you understand what encourages users to return and engage with your website or product.
Changes to test
• Experimenting with in-app messaging to guide users to valuable product features
• Trying out different onboarding flows to improve users’ understanding of a product from the get-go
• Testing loyalty programs, referral programs, or incentive offers to encourage repeat purchases

Exposure rate in Heatmaps shows how far down a page users scroll. It’s a key metric to track when A/B testing, as it can help you better understand how much users scroll, especially to elements you want them to engage with. With this information, you can make data-driven, empirical sizing and placement adjustments.
Changes to test
• Redesigning the content hierarchy to compare average time on page
• Placing important information above the fold so users get important information, faster
• Experimenting with design elements like headings, colors, and image placement
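For a rough sense of how these four metrics boil down to simple ratios, here is a small sketch. The formulas follow the definitions above, and the session counts are invented for illustration rather than drawn from any case study.

```python
def rate(part, whole):
    """Return part/whole as a percentage, guarding against empty segments."""
    return 100 * part / whole if whole else 0.0

# Made-up counts for one A/B test variation
sessions = 12_000
bounced_sessions = 5_400      # left without taking additional action
checkouts_started = 3_000
checkouts_completed = 1_950
returning_users = 2_600       # revisited within the chosen period
users_in_period = 9_000
reached_element = 7_300       # scrolled far enough to see the tested zone

print(f"Bounce rate:      {rate(bounced_sessions, sessions):.1f}%")
print(f"Abandonment rate: {rate(checkouts_started - checkouts_completed, checkouts_started):.1f}%")
print(f"Retention rate:   {rate(returning_users, users_in_period):.1f}%")
print(f"Exposure rate:    {rate(reached_element, sessions):.1f}%")
```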
…and engaged with the most. …was located beneath the fold.

Assumption
Based on this evidence, the team made an educated guess that the banner size and page structure could be reorganized.

Result
Running an A/B test with a shorter banner resulted in a +44% increase to the exposure rate (from 57% in the control to 82% in the variant). The team also noticed a +24% increase in click rate, and overall revenue attributed to the zone increased by +35%.

Results: +44% increase to the exposure rate
Results: +35% increase in overall revenue attributed to the zone
Testing, 1, 2, 3: a 3-step plan for successful tests
Preliminary research
Case study
See it in action
U.K. online grocery business Ocado Retail conducted a Heatmap analysis in our live browser extension on their Christmas promotional page. The team found that 0.76% of desktop users and 1.24% of mobile users were clicking on the unclickable hero image and that scroll depth was low.

A zoning analysis of Ocado’s Christmas Wondermarket page, showing click rate on desktop (L) and mobile (R) devices.

This led the team to simplify the page with a smaller hero image, remove the category tiles, and use only product tiles to help boost product visibility. They hypothesized that by removing the hero image, the product tiles would sit above the fold, making it easier for their customers to find the products they were looking for.

They then ran an A/B test with a variation of the page without the hero banner. This variation saw no impact on order conversion, but it did have a positive uplift on their secondary metrics.

Ocado’s A/B test with the hero banner (variation 1, the control) and without it (variation 2).
A confidence level determines how sure you can be of your results. As the researcher, this
value is totally up to you. In market research, it’s common to use a 95% confidence level.
This means if you ran the experiment 20 times, you’d get the same results (with a margin of
error) about 19 times. A higher confidence level provides greater certainty but might require
a larger sample size to achieve, while a lower confidence level may be more practical but
comes with the risk of drawing inaccurate conclusions.
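To illustrate the trade-off between confidence level and sample size described above, here is a sketch that estimates how many visitors each variation might need before a two-proportion test can detect a given lift. The 4% baseline conversion rate, +1% target lift, and 80% statistical power are assumed example inputs, not figures from this playbook.

```python
from statistics import NormalDist

def sample_size_per_variation(baseline_rate, minimum_lift, confidence=0.95, power=0.80):
    """Approximate visitors needed per variation to detect an absolute lift
    in conversion rate at the given confidence level and statistical power."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p1 = baseline_rate
    p2 = baseline_rate + minimum_lift
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return int(((z_alpha + z_beta) ** 2 * variance) / (minimum_lift ** 2)) + 1

# Example: 4% baseline conversion, hoping to detect an absolute +1% lift
for confidence in (0.90, 0.95, 0.99):
    n = sample_size_per_variation(0.04, 0.01, confidence=confidence)
    print(f"{confidence:.0%} confidence: ~{n:,} visitors per variation")
```

Running this shows the pattern the paragraph describes: moving from 90% to 99% confidence roughly doubles the traffic each variation needs before you can trust the result.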