Beyond A/B Testing: Optimization Tactics From Kelly Wortham

Get A/B testing optimization tactics from Kelly Wortham, Senior Director of Optimization at Search Discovery, Test & Learn Community founder, and CRO evangelist.
Diana Daia
September 3, 2020

Kelly Wortham of Search Discovery joins the fireside to discuss the role of experimentation, accurate data, and reporting in A/B testing optimization.


Who is Kelly Wortham?

Kelly Wortham is the Senior Director of Optimization at Search Discovery, the founder of the Test & Learn Community, and the former CRO lead and evangelist at EY.

She is a controlled experimentation professional with expertise developed over the last 15+ years in digital optimization, ideation and project execution, account management, presentation and training, analytics, and personalization. 

D.D.: You have 15+ years in analytics-driven A/B testing optimization and controlled experimentation. What are the 3 key things you’ve learned on your journey?

K.W.:

  • “Best practice” is bogus - “right practice” depends on the company and the problem
  • Organizational culture & leadership buy-in are crucial to a program’s success
  • 90+% of optimization work is herding cats and gaining alignment

90+% of optimization work is herding cats and gaining alignment.

Kelly Wortham
Search Discovery

D.D.: “Anyone can ‘do’ A/B testing. It takes something more to truly optimize” is something that you believe in. How can companies leverage A/B testing optimization techniques to drive more successful digital campaigns?

K.W.: A/B testing is easy - optimizing is hard. The difference is in the strategy and planning that go into the test design. Junk in still leads to junk out for testing. So if you don’t have solid evidence supporting the reason you’re wanting to use experimentation to make a better decision, you’re not going to have better data with which to make that better decision when the test is over.

Optimization requires a grounding not only in the customer need and the potential solutions that can resolve that need, but also in exactly how you can use split testing to validate that solution and ensure the customer's need is being met. That’s how you learn and scale those insights across the organization. Otherwise, you have only learned that experience B is better than A. That’s not scalable.

D.D.: In the A/B testing optimization space - and digital marketing space at large, there’s a lot of focus on marketing data quality. Why is it important to have a solid data foundation in place for driving optimization successfully?

K.W.: Regardless of where testing is occurring, quality data is needed - both for evidence gathering and evaluation and interpretation of the results. Not having the necessary data to support ideation leads to fewer ideas grounded in evidence and likely to move the needle. Not having solid data for evaluating the results leads to misinterpretation of test results and bad decision-making or missed opportunities.


D.D.: You’ve successfully helped companies build a solid optimization strategy that is tied to business goals. How would you advise organizations that are in the early stages of establishing a program to get started successfully? What are the important steps to keep in mind?

K.W.: Well, as already stated above, it’s super important to have organizational buy-in and executive sponsors who completely support and empower you with testing. Sometimes it’s hard to start with that - but when you can - you can sorta leapfrog many of the other steps required to prove the idea of optimization. Outside of that - you need the basics to make any program work. People, Process, and Technology. And the “right” version of each of those for your organization. At a minimum within “People”, you need a leader with plenty of experience in optimization who has helped to build and/or grow a program before. You also need smart and experienced developers, analysts, and project managers - but their experiences with testing can be considerably less.

On the Process front, it’s important that you have a process that works for your organization. A lot of programs have processes that add so much red tape and heavy lifting that it becomes a nightmare of friction and testing dies a slow death. Done right, process should actually reduce cycle times and make it easier and faster to test, while also ensuring high-impact, high-quality experimentation is taking place. Lastly, Technology. Just like “best practice” is a myth, so too is the idea of the perfect technology. (Sorry, vendors.) Are there technologies with more capabilities and customization options and integrations… of course.

The one thing I have learned that tends to be true is that every platform has pros and cons and every platform empowers better decision making than no platform at all.

Junk in still leads to junk out for testing. So if you don’t have solid evidence supporting the reason you’re wanting to use experimentation to make a better decision, you’re not going to have better data with which to make that better decision when the test is over.

Kelly Wortham
Search Discovery

D.D.: You’ve been talking at great length about the issues with over-reporting. Why is sharing complex statistics detrimental to communicating the true value of A/B testing optimization? What key steps can practitioners take to avoid that?

K.W.: My colleague and friend, Val Kroll, explained it best in her presentation on sharing results: when presenting results to executives and stakeholders, they just need to know the time. You don’t have to explain how the watch was made. Doing so just leads to more confusion and questions and a rabbit-hole conversation that is completely unhelpful to getting alignment on the action the results recommend.

When you go to a doctor for testing, you don’t ask them how they know the results mean X or Y. You just ask them what actions you need to take to get better. It’s the same here. We should stop trying to explain what confidence means (and what it absolutely does NOT mean) and start focusing on making recommendations. When I teach workshops on experimentation right practice, I talk about the 3 components of any analysis. The “What”, “So What”, and “Now What”. Most of us spend a ton of time explaining the “what” - which is the least important and least interesting part of the analysis. The tables of numbers. Yes - that’s what happened. But what matters is why those numbers are interesting. Those are the insights in the “So what” - the “Why should I care?”. And the MOST important component is the “Now What” - the next steps and recommendations. And when we spend all our time going over the tables of numbers and explaining what they mean - we risk not making it to the “Now What” - or at least - we risk not bringing the audience along with us.

D.D.: When it comes to ‘best practices’ in experimentation, you’re a skeptic about generalizing. Why don't ‘one size fits all’ assumptions work? From your research, what meaningful examples could you share here?

K.W.: I think of best practice as the peak of Mount Stupid on the Dunning-Kruger effect chart. The more experience you have in experimentation, the more you learn that it just doesn’t exist.

You have to start with the problem. What are you trying to learn, to decide, to impact? I’m sorry, but an A/B test isn’t always going to be the right answer. Sometimes you might have to run a multivariate test (MVT). Sometimes controlling for false positives might be crucial; other times, not so much. Where is the risk in rolling out new copy that doesn’t perform better, so long as it doesn’t perform worse? If you can control for sign error, and the only risk is rolling out a “flat” winner that has no cost for development, then the cost is minimal, and arguably making a decision faster with lower confidence makes sense here. But clearly not everywhere.
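The tradeoff described here, relaxing the confidence bar when the only real downside is shipping a "flat" winner, can be sketched with a one-sided two-proportion z-test. This is an illustrative sketch only: the function, the traffic figures, and the 0.20 alpha are hypothetical examples, not anything prescribed in the interview.

```python
from statistics import NormalDist

def one_sided_ztest(conv_a, n_a, conv_b, n_b):
    """p-value for H1: B's conversion rate is higher than A's.

    A one-sided test controls for sign error (shipping a variant
    that is actually *worse*), which is the risk Kelly describes.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    return 1 - NormalDist().cdf(z)

# Hypothetical numbers: 500/10,000 vs 540/10,000 conversions.
p = one_sided_ztest(500, 10_000, 540, 10_000)

# For a zero-cost copy change, a relaxed alpha (say 0.20) may be
# defensible; for an expensive rebuild, demand 0.05 or stricter.
ship = p < 0.20
```

The point of the sketch is that alpha is a business decision, not a constant: the cheaper a wrong "flat winner" rollout is, the more confidence you can reasonably trade away for speed.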

The “two-week” minimum test is another example that even I still quote - but it’s not really always true. It depends on your business cycles, the problem, and what type of errors you want to control for - not to mention the cost associated with a wrong decision.
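How long a test actually needs depends on the baseline rate, the smallest lift worth detecting, and daily traffic, which is why "two weeks" can only ever be a heuristic. A rough sketch of the standard two-proportion sample-size calculation, with hypothetical baseline, lift, and traffic figures:

```python
import math
from statistics import NormalDist

def sample_size_per_arm(p_base, mde_rel, alpha=0.05, power=0.8):
    """Approximate visitors per arm for a two-sided two-proportion test."""
    p2 = p_base * (1 + mde_rel)          # rate under the minimum detectable lift
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p_base + p2) / 2
    n = ((z_a * (2 * p_bar * (1 - p_bar)) ** 0.5
          + z_b * (p_base * (1 - p_base) + p2 * (1 - p2)) ** 0.5) ** 2
         / (p2 - p_base) ** 2)
    return math.ceil(n)

# Hypothetical: 5% baseline conversion, 10% relative lift, 5,000 visitors/day.
n = sample_size_per_arm(0.05, 0.10)
days = n * 2 / 5_000   # both arms' visitors divided by daily traffic
```

With these inputs the required duration lands near two weeks, but halve the traffic or the detectable lift and it stretches to weeks or months; the "rule" is an output of the math, not an input.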

Not having the necessary data to support ideation leads to fewer ideas grounded in evidence and likely to move the needle. Not having solid data for evaluating the results leads to misinterpretation of test results and bad decision-making or missed opportunities.

Kelly Wortham
Search Discovery

D.D.: You’re also the founder & moderator of the Test & Learn Community (TLC), a growing community of experimentation industry practitioners and experts with monthly virtual panel conversations. How did the idea emerge and what have you learned so far?

K.W.: The TLC was created after a great conversation during a “Huddle” at the X Change conference, which had an all-conversation format. (That format now lives on in the Digital Analytics Hub, which definitely should be on any digital analyst or CRO’s “must attend” list.) In that huddle, a group of about 15 experimentation practitioners and professionals were having a wonderful deep conversation that we weren’t ready to end just because our time was up. So we passed around a sheet of paper, everyone provided email addresses, and we continued the conversation over the next year between conferences. Eventually, we decided to get on the phone and tried a teleconference. New people joined... and conversations became difficult, so we converted to Zoom. Eventually, even that became unwieldy and we decided to convert to a panel-style conversation and opened the floodgates to invite anyone who wanted to join, with the same rule from that first X Change conference: no sales or solicitation allowed.

Honestly, I learn more every day from the TLC members and our conversations than I could have ever hoped to learn from just my own work experiences. Everyone asks great questions and provides earnest answers. It’s such a helpful community. Rising waters lift all boats and we know the industry has a ways to go, so it’s extremely important that communities like these exist.

D.D.: From the talks that you’ve had with analytics and marketing professionals through TLC, what are pressing industry challenges that people are facing? What are the hard nuts to crack?

K.W.: Data access and quality, of course, continue to be massive challenges. And measurement as a whole has become increasingly difficult with all the browser and privacy changes. So figuring out how to continue to optimize within the restrictions these changes place on us has become one of the biggest challenges for the industry.

The more experience you have in experimentation, the more you learn that best practice just doesn’t exist. You have to start with the problem. What are you trying to learn, to decide, to impact?

D.D.: If you were to predict the upcoming marketing analytics trends, what would those be?

K.W.: I’m going to go way out on a limb here and think pretty far outside the box (er tree?) and say that what we think of today with “targeting” and “personalization” will become more “customization”. The difference being - customers will opt-in (or not) to marketing preferences, perhaps getting some kind of benefit (other than relevancy), and this type of personalized marketing will become crucial to a company’s success. And figuring out how to analyze and optimize these “customizations” will keep us all up at night.

