Important Questions about Online Experimentation

To supplement our reading this week, I wanted to highlight some important points that the article brings up but that I feel deserve a bit more time and explanation.

Why do most online experiments test a single variable at a time?

Well, this comes down to basic scientific principles. To determine causality of an outcome, we must be able to narrow down what change in the environment led to that outcome. 

Maybe you’ve heard the Latin phrase “ceteris paribus” or, more likely, its English translation “holding all else equal”. Well, that’s what we’re talking about: holding all variables except our focal variable equal, or in their current state. For example, if I want to figure out whether adding a rocket ship emoji to my email subject line will result in a higher number of email opens, I would send the exact same email to two statistically equivalent audiences (which we usually create by randomly assigning recipients to each group), and the only difference would be that minor addition of the rocket ship emoji. If we then see a statistically significant change in our email opens, we can infer that the emoji addition actually caused the uptick in email opens. At this point we’d probably integrate that change into our email campaign and send the email with the emoji addition to the remainder of our email marketing list.
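To make the emoji example concrete, here is a minimal Python sketch of the two steps involved: randomly splitting a list into two groups, then checking whether the difference in open rates is statistically significant. It assumes a two-proportion z-test, which is one common choice for comparing open rates and not necessarily what any particular email platform uses. The function names and the counts in the example are made up for illustration.

```python
import math
import random


def split_audience(recipients, seed=42):
    """Randomly split recipients into two groups whose sizes differ by at most one."""
    shuffled = list(recipients)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def two_proportion_z_test(opens_a, sent_a, opens_b, sent_b):
    """Return (z, two-sided p-value) for the difference in open rates, B minus A."""
    rate_a = opens_a / sent_a
    rate_b = opens_b / sent_b
    # Pooled open rate under the assumption that the emoji makes no difference.
    pooled = (opens_a + opens_b) / (sent_a + sent_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: group A gets the plain subject line, group B gets the emoji.
z, p = two_proportion_z_test(opens_a=1000, sent_a=5000, opens_b=1100, sent_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With these invented numbers (a 20% open rate versus 22%), the p-value falls below the conventional 0.05 threshold, so we would treat the uptick as statistically significant. With much smaller groups, the same two-point difference might not be.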

Now suppose we change multiple elements of our email. For example, we add the rocket ship emoji and also change the font used in the body of the email, or the call-to-action label, or heck, even the time of day we send the email. Then, with most email management systems, we have no way of knowing whether it was the emoji or one of the other changes that caused the uptick in email opens.

So, if you’re going to do online experimentation, only experiment with a single variable at a time. Once you’ve established the source of a positive effect, incorporate that change and then test another variable. Firms like Google and Amazon run thousands of these types of experiments every single day. Can you run experiments with multiple variables simultaneously? Yes! But systems that can properly analyze them are expensive, and these experiments are more difficult to interpret.
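One reason multi-variable experiments are harder to run can be seen by counting the versions you would need. The sketch below assumes a full factorial design, where every combination of options gets its own group. The element names, options, and list size are hypothetical.

```python
from itertools import product

# Hypothetical email elements and options, chosen only for illustration.
elements = {
    "subject_emoji": ["none", "rocket"],
    "body_font": ["serif", "sans-serif"],
    "send_time": ["morning", "evening"],
}


def factorial_cells(elements):
    """List every combination of options, one dict per email version."""
    names = list(elements)
    return [dict(zip(names, combo)) for combo in product(*elements.values())]


cells = factorial_cells(elements)
list_size = 12000  # hypothetical number of recipients available for testing
per_cell = list_size // len(cells)
print(f"{len(cells)} versions, about {per_cell} recipients each")
```

Three elements with two options each already produce eight versions, so the same list is spread far thinner than in a simple A/B test, where each group would get half the list.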

And there are many different variables you could choose to manipulate to see if there is room to improve the outcomes of your email campaigns. We’ve talked about the subject line. You can also change the sender name. Perhaps one name might resonate better with an audience than another. Calls-to-action are of course a really important element of promotional emails. What wording will work best for your customers? What about the button size or color? Personalization is another really important element. People tend to respond more favorably when their name is used, but what about incorporating information related to purchases they’ve made in the past? Retargeting emails like abandoned cart notifications fall into this area. Day and time can make a big difference. When do your customers tend to open their emails? The design of your email is a big one. More likely than not you would be testing more than one simple tweak as part of a design change, so you may not be able to know whether it was the color scheme change, the typography change, or the use of images that made the difference, but you would know whether one design was more effective than another. Frequency, or how often you send your emails, could be another thing to test, though this would be a little different than a typical A/B test. Finally, target audience is another element that can be tested. Think about breaking your audience lists into groups based on how they respond to your emails, and you might also find other similarities that can help you define a new audience segment.

What is A/B/C testing?

We’ve established that the correct way to test elements to determine causality is to change only one variable and hold all else equal. So then what is A/B/C testing? Wouldn’t that be testing more than one variable?

Well, no. Recall that when we engage in A/B testing we have one variable, again let’s say email subject line. But that one variable has two states or treatments: with the rocket ship emoji and without the emoji. 

Given a sufficiently large sample, we could actually test three, maybe even four, treatments of the same variable. So I could try the standard version of the subject line, one with a rocket ship added, and maybe one with a puppy emoji added: A, B, and C. This is still only testing one variable, but with more than two treatments.
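As a rough illustration of how three treatments of one variable could be compared, the sketch below assumes a chi-square test of equal open rates across the three groups. That is one standard option, not the only one. With three groups and an opened/not-opened outcome there are two degrees of freedom, so the p-value has a simple closed form. The function name and the counts are made up.

```python
import math


def chi_square_three_way(opens, sent):
    """Chi-square test of equal open rates across exactly three treatments.

    Returns (chi2, p_value). The p-value formula used here is only valid
    for 2 degrees of freedom, i.e. three groups and a binary outcome.
    """
    if len(opens) != 3 or len(sent) != 3:
        raise ValueError("this sketch handles exactly three treatments")
    overall_rate = sum(opens) / sum(sent)
    chi2 = 0.0
    for opened, n in zip(opens, sent):
        expected_opened = n * overall_rate
        expected_unopened = n * (1 - overall_rate)
        chi2 += (opened - expected_opened) ** 2 / expected_opened
        chi2 += ((n - opened) - expected_unopened) ** 2 / expected_unopened
    # Survival function of the chi-square distribution with 2 degrees of freedom.
    p_value = math.exp(-chi2 / 2)
    return chi2, p_value


# Hypothetical counts for A (no emoji), B (rocket ship), and C (puppy).
chi2, p = chi_square_three_way(opens=[800, 880, 840], sent=[4000, 4000, 4000])
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

With these invented counts (open rates of 20%, 22%, and 21%), the p-value lands above 0.05, so we could not conclude that the subject lines differ. Splitting the list three ways leaves fewer recipients per treatment, which is why the sample needs to be sufficiently large.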

This is a common occurrence with real-time online testing. For example, Amazon might run a simultaneous experiment where they divvy up their online customers into multiple groups. The first group sees the current variation of a page design (for example, one showing vehicle body styles in black and white), while the other groups get different variations, including an alternate version where the vehicles are in full color. With real-time online experiments like this, a firm has the ability to immediately shift course and serve up the version of the page that is more successful, or in other words, leads to more conversions.

Do online experiments need to be driven by an underlying theory?

In the scientific method, we often start by forming a hypothesis about something based on existing theories. We then devise an experiment that we can use to test it, run the experiment, and analyze the results. Theory is the lifeblood of scientific research. But an email marketing campaign or a search engine website are commercial endeavors. We can use the scientific method to test variables, and we should, but that doesn’t necessarily mean that we have to have a theory that drives our initial interest in experimentation. 

Business is a fast-moving, highly complicated environment. As marketers, we’re continuously looking for an edge to help improve our outcomes. Sometimes we have a gut instinct about what might appeal to our customers, and that is a completely valid starting point for experimentation in the business world. If you’re in the healthcare industry and your gut feeling is that your customers will likely stay on your website longer if you use a color palette of blues and not yellows, then that’s an easy thing to test, and you don’t need to dive headlong into color theory to find justification for an experiment. These types of experiments are inexpensive and quick, and they can often reveal surprising outcomes.