A/B Testing and Experimentation: Tactics From Matt Gershoff

Learn how to strategize, implement, and scale A/B testing and experimentation from web analytics and optimization expert Matt Gershoff, founder of Conductrics.
Diana Ellegaard-Daia
Head of Content Marketing at Accutics

Matt Gershoff, optimization expert and founder of Conductrics, shares thoughts about A/B testing and experimentation in this interview.

Common Misconceptions About A/B Testing and Experimentation

D.D.: You have a lot of experience helping companies with A/B testing and experimentation. What would you say are some of the issues that keep organizations from getting the most value out of their experimentation programs?

Think First About the Problem, Rather Than the Solution

M.G.: At a high level, experimentation via A/B testing is a fairly accessible idea. You roll out competing options in parallel and see if any of them perform better (or differently) than your existing way of doing things. At a deeper level, though, the statistics behind A/B testing rest on complex abstractions and counterintuitive thinking. Treating A/B testing as a black box can often lead to confusion and unmet expectations. This black-box thinking is a form of solutionism.

Solutionism is when we hyperfocus on a solution, or tool, without:

  1. thinking through the details of the problem we want to solve; and
  2. fully understanding the limits of what the solution is capable of.

Unfortunately, this can lead to treating A/B testing as an input-output procedure that is able to:

  1. provide definitive answers for transactional decisions; and
  2. free us from having to understand the inner workings of hypothesis testing.

Unfortunately, this is not the case. Most of the work, and value, in experimentation is in:

  1. how well the problem is defined upfront;
  2. how well the experiment is designed; and
  3. how well the team is able to understand and interpret the experiment's results.

Both 2 and 3 require at least some basic understanding of the statistical ideas underpinning A/B testing. For example, we sometimes see companies running A/B tests with 20, 30, or even more conversion objectives. Having that many conversion objectives is often an indication that what exactly is being tested hasn't been fully thought through. This lack of focus tends to happen when the team has their analytics hat on, where data collection is of the just-in-case (JIC) variety: all data is collected just in case a question comes up down the road. Taking the just-in-case approach can lead to incorrect interpretation, extra effort in gleaning insights, and ultimately a failed experiment. More effective A/B testing takes a just-in-time (JIT) approach to data collection: we collect THIS data to answer THIS specific, well-defined question.

For example, collapsing several separate conversion objectives into a single conversion goal after running the experiment requires care in recalculating the standard errors (a single user with two conversions will contribute more to the variance than two users with one conversion each). It will also most likely make the original minimum detectable effect (MDE) and power used in calculating the sample size (assuming this was even done) inconsistent with the final analysis, invalidating or at least compromising the quality of the test. An even bigger issue is the loss of Type I error control that comes from cherry-picking which conversion events to include in the analysis based on whichever ones show a significant result.
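To make the variance point concrete, here is a minimal sketch in Python (the conversion rates are made up, not from the interview) comparing the standard error of a combined per-user conversion count against a simple "converted on anything" flag. Users who convert on both goals contribute a value of 2, which inflates the variance of the combined count relative to the binary metric.

```python
import numpy as np

# Hypothetical illustration: per-user conversions for two separate goals.
rng = np.random.default_rng(42)
n = 10_000
goal_a = rng.binomial(1, 0.05, size=n)   # e.g. completed checkout (made-up rate)
goal_b = rng.binomial(1, 0.03, size=n)   # e.g. signed up for the newsletter (made-up rate)

combined_count = goal_a + goal_b                      # 0, 1, or 2 conversions per user
converted_any = (combined_count > 0).astype(float)    # 0/1 "converted on anything" flag

def std_error(x):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return x.std(ddof=1) / np.sqrt(len(x))

print("SE of mean, per-user conversion count :", std_error(combined_count))
print("SE of mean, converted-on-anything flag:", std_error(converted_any))
```

In this illustration the combined count has the larger standard error, which is why a post-hoc switch between the two metrics cannot reuse the original analysis plan.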

This is not to say that it is always a bad idea to experiment without a strong hypothesis. There is nothing wrong with running a discovery/exploratory experiment to get a sense of whether certain types of interventions might be worth investing more time in developing. However, it is good practice to follow up these results with a more formal, confirmatory A/B test.

In general, a solution, if it exists, will be found within the problem. So, rather than focus on the solutions (A/B testing, bandits, machine learning, etc.), we should first focus most of our energies on defining the problem and how much value we think there will be in solving it.

The value of A/B testing and experimentation comes not in the tests themselves, but in how well the company understands their customers' wants and needs, and in the skill of the people running the experiments.

Incremental versus Transformative View of Value

Another issue with the input/output way of thinking is that it leads to the expectation that the experimentation program will directly generate an incremental profit. This is what I call the incremental value view of experimentation.

“We run X experiments and we expect Y incremental revenue.” This way of seeing experimentation calculates value based on both direct gains and loss avoidance. Direct gains are what we tend to think of as optimization - changes made to the customer experience that increase revenue over the original experience. Loss avoidance is a counterfactual that asks, “What decisions would we have made that would have made things worse had we not had our experimentation program in place?” Often loss avoidance makes up the lion’s share of this calculation and can make an experimentation program worth the cost and effort.
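As a back-of-the-envelope illustration of that calculation (all numbers below are invented, not from the interview), the incremental-value view adds direct gains and avoided losses before netting out the cost of running the program:

```python
# Hypothetical figures for illustrating the incremental-value view.
shipped_winners_lift = 120_000   # annualized lift from variants that won and shipped
losses_avoided = 300_000         # estimated revenue protected by not shipping losing variants
program_cost = 150_000           # tooling, people, and engineering time

incremental_value = shipped_winners_lift + losses_avoided - program_cost
print(f"Estimated incremental value of the program: {incremental_value:,}")
```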

Although the incremental view is useful, I think the larger payoff is in a transformative view. The transformative view is about unlocking the full potential of the company to learn and deliver products and experiences that customers want in an ongoing way. A well-functioning experimentation program with a transformative view allows the organization to tap into unrealized productivity. Put another way, by using a process that provides strong guarantees against making catastrophic mistakes, you increase the liquidity of ideas flowing from employees into the product and customer experience - a flow that would not have occurred without this protection. The transformative view is about culture change, which is difficult to measure explicitly but can deliver tremendous value nonetheless.

Active vs Passive Data Collection

D.D.: In several industry presentations, you talk about the difference between active vs passive data collection. Would you explain a bit more what you mean by that?

M.G.: Sure. This is related to the point I made earlier. I think that the main difference between experimentation and other approaches in analytics is that experiments explicitly collect the data needed to answer a specific question - active data collection - whereas analytics tends to collect a lot of data before the questions are known, just in case it is needed to answer some currently unknown future question. In active data collection, you need to know the question before you collect the data. You can think of active data as just-in-time data.

Thinking about experiments as exercises in active data collection also helps us think about the marginal cost of data, or alternatively, the marginal value of answering a given question. So one can ask, ‘Is the answer to this question going to be worth the cost in time and energy of collecting the data required to answer it?’

Tough Truths About A/B Testing and Experimentation

D.D.: What are the tough truths that we don't talk a lot about in experimentation and optimization?

M.G.: It’s not the martech tools that provide the answers. Technology requires human expertise to get anything out of it - the value is in the skill of the operator. If you don’t know what you want to learn, the technology is not going to be helpful.

Challenges With Implementing an A/B Testing and Experimentation Program

D.D.: You’ve helped some of the world’s largest brands optimize their A/B testing and experimentation programs. What would you say are some of the most common hurdles when implementing or expanding a testing and experimentation program?

M.G.: In addition to what I mentioned above, I would say:

  1. Get alignment within the organization about what is expected from the A/B testing and experimentation program. Experimentation is often seen as slowing down decision-making; however, when done right, it should enable more and better decisions.
  2. Establish a process for efficiently managing: the business ideas that will drive which experiments to run; the design and execution of the experiments; and finally the analysis and dissemination of learnings.

What's Next In A/B Testing and Experimentation

D.D.: Are there any new ideas or methods on the horizon for testing and experimentation?

M.G.: We expect a continued expansion of the experimentation and A/B testing tool kit. Specifically, for data-rich environments, we expect greater familiarity with, and demand for, the adaptive methods that Conductrics provides, such as contextual multi-armed bandits and reinforcement learning for multi-step problems.
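For readers unfamiliar with the idea, here is a minimal epsilon-greedy contextual bandit sketch in Python. It is a generic illustration of the technique, not Conductrics' implementation; the contexts, arms, and conversion rates are all made up.

```python
import random

random.seed(0)

contexts = ["mobile", "desktop"]
arms = ["A", "B", "C"]
eps = 0.1  # probability of exploring a random arm

counts = {(c, a): 0 for c in contexts for a in arms}
values = {(c, a): 0.0 for c in contexts for a in arms}  # estimated conversion rate

def choose(context):
    """Epsilon-greedy: usually pick the best-known arm for this context, sometimes explore."""
    if random.random() < eps:
        return random.choice(arms)
    return max(arms, key=lambda a: values[(context, a)])

def update(context, arm, reward):
    """Incrementally update the estimated reward for the chosen context/arm pair."""
    key = (context, arm)
    counts[key] += 1
    values[key] += (reward - values[key]) / counts[key]

# Hypothetical environment: the best arm differs by context.
true_rates = {("mobile", "A"): 0.04, ("mobile", "B"): 0.06, ("mobile", "C"): 0.03,
              ("desktop", "A"): 0.07, ("desktop", "B"): 0.05, ("desktop", "C"): 0.04}

for _ in range(50_000):
    ctx = random.choice(contexts)
    arm = choose(ctx)
    reward = 1.0 if random.random() < true_rates[(ctx, arm)] else 0.0
    update(ctx, arm, reward)

for c in contexts:
    print(c, "->", max(arms, key=lambda a: values[(c, a)]))
```

The point of the context is that the policy can learn a different "winner" for mobile and desktop traffic rather than a single winner for everyone.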

Interestingly, we also expect additional tools for more limited data environments. For example, with browsers limiting how long event data is remembered, it will become increasingly challenging to run the classic Randomized Controlled Trials (RCTs) that are the foundation of A/B tests. This will encourage clients to use other methods. Causal inference methods are better suited to account for the inability to assign an individual user directly to a particular treatment or experience. Conductrics is working on providing tools in the next several months that will help our clients ask causal inference questions in these situations.
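As one example of the kind of method this points to (a generic illustration with invented numbers, not a description of what Conductrics is building), difference-in-differences estimates an effect without user-level random assignment by comparing the change in an exposed group against the change in an unexposed group:

```python
# Difference-in-differences with made-up conversion rates. Under the
# parallel-trends assumption, the unexposed group's change estimates what
# would have happened anyway; the remainder is attributed to the intervention.

exposed_before, exposed_after = 0.040, 0.052   # group that received the new experience
control_before, control_after = 0.041, 0.044   # comparable group that did not

did_estimate = (exposed_after - exposed_before) - (control_after - control_before)
print(f"Estimated lift attributable to the change: {did_estimate:.3f}")
```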

About Matt Gershoff:

Matt Gershoff is an optimization expert and co-founder of Conductrics, an A/B Testing and Optimization platform. He specializes in Web Analytics, Database Marketing, Decision Optimization, Data Mining, and Adaptive Agent models.

