Kelly Wortham is the Senior Director of Optimization at Search Discovery, the founder of the Test & Learn Community, and the former CRO lead and evangelist at EY.
She is a controlled experimentation professional with expertise developed over the last 15+ years in digital optimization, ideation and project execution, account management, presentation and training, analytics, and personalization.
K.W.: A/B testing is easy - optimizing is hard. The difference is in the strategy and planning that go into the test design. Junk in still leads to junk out for testing. So if you don’t have solid evidence supporting the reason you’re wanting to use experimentation to make a better decision, you’re not going to have better data with which to make that better decision when the test is over.
Optimization requires a grounding in both the customer-need and the potential solutions that can resolve that need, but also in exactly how you can use split testing to validate that solution and ensure the customer's need is being met. That’s how you learn and scale those insights across the organization. Else - you have only learned that experience B is better than A. That’s not scalable.
K.W.: Regardless of where testing is occurring, quality data is needed - both for evidence gathering and evaluation and interpretation of the results. Not having the necessary data to support ideation leads to fewer ideas grounded in evidence and likely to move the needle. Not having solid data for evaluating the results leads to misinterpretation of test results and bad decision-making or missed opportunities.
K.W.: Well, as already stated above, it’s super important to have organizational buy-in and executive sponsors who completely support and empower you with testing. Sometimes it’s hard to start with that - but when you can - you can sorta leapfrog many of the other steps required to prove the idea of optimization. Outside of that - you need the basics to make any program work. People, Process, and Technology. And the “right” version of each of those for your organization. At a minimum within “People”, you need a leader with plenty of experience in optimization who has helped to build and/or grow an experimentation program before. You also need smart and experienced developers, analysts, and project managers - but their experiences with testing can be considerably less.
On the Process front, it’s important that you have a process that works for your organization. A lot of programs have processes that add so much red tape and heavy lifting that it becomes a nightmare of friction and testing dies a slow death. Done right, process should actually reduce cycle times and make it easier and faster to test - while also ensuring high-impact, high-quality experimentation is taking place. Lastly, Technology. Just like “best practice” is a myth - so too is the idea of the perfect technology. (Sorry, vendors.) Are there technologies with more capabilities and customization options and integrations? Of course.
The one thing I have learned that tends to be true is that every platform has pros and cons and every platform empowers better decision making than no platform at all.
K.W.: My colleague and friend, Val Kroll, explained it best in her presentation on sharing results - when presenting results to executives and stakeholders - they just need to know the time. You don’t have to explain how the watch was made. Doing so just leads to more confusion and questions, and to a rabbit-hole conversation that is completely unhelpful for getting alignment on the action the results recommend.
When you go to a doctor for testing, you don’t ask them how they know the results mean X or Y. You just ask them what actions you need to take to get better. It’s the same here. We should stop trying to explain what confidence means (and what it absolutely does NOT mean) and start focusing on making recommendations. When I teach workshops on experimentation right practice, I talk about the 3 components of any analysis. The “What”, “So What”, and “Now What”. Most of us spend a ton of time explaining the “what” - which is the least important and least interesting part of the analysis. The tables of numbers. Yes - that’s what happened. But what matters is why those numbers are interesting. Those are the insights in the “So what” - the “Why should I care?”. And the MOST important component is the “Now What” - the next steps and recommendations. And when we spend all our time going over the tables of numbers and explaining what they mean - we risk not making it to the “Now What” - or at least - we risk not bringing the audience along with us.
K.W.: I think of best practice as the peak of Mount Stupid on the Dunning-Kruger effect chart. The more experience you have in experimentation, the more you learn that it just doesn’t exist.
You have to start with the problem. What are you trying to learn, to decide, to impact? I’m sorry, but an A/B test isn’t always going to be the right answer. Sometimes - you might have to run a multivariate test (MVT). Sometimes, controlling for false positives might be crucial - other times - not so much. Where is the risk in rolling out new copy that doesn’t perform better, so long as it doesn’t perform worse? If you can control the sign error, and your only risk is rolling out a “flat” winner that has no cost for development, your cost is minimal and, arguably, making a decision faster with lower confidence makes sense here. But clearly not everywhere.
The “two-week” minimum test is another example that even I still quote - but it’s not really always true. It depends on your business cycles, the problem, and what type of errors you want to control for - not to mention the cost associated with a wrong decision.
K.W.: The TLC was created after a great conversation during a “Huddle” at the former X Change conference, which used an all-conversation format. (A conference inspired by that “all conversation” format now lives on in the Digital Analytics Hub and definitely should be on any digital analyst or CRO’s “must attend” list.) In that huddle - a group of about 15 experimentation practitioners and professionals were having a wonderful deep conversation that we weren’t ready to end just because our time was up. So we passed around a sheet of paper, everyone provided email addresses, and we continued the conversation over the next year between conferences. Eventually, we decided to get on the phone and tried a teleconference. New people joined... and conversations became difficult, so we converted to Zoom. Eventually, even that became unwieldy and we decided to convert to a panel-style conversation and opened the floodgates to invite any who wanted to join - with the same rules from that first X Change conference - no sales or solicitation allowed.
Honestly, I learn more every day from the TLC members and our conversations than I could have ever hoped to learn from just my own work experiences. Everyone asks great questions and provides earnest answers. It’s such a helpful community. Rising waters lift all boats and we know the industry has a ways to go, so it’s extremely important that communities like these exist.
K.W.: Data access and quality - of course - continue to be massive challenges. And measurement as a whole has become increasingly difficult with all the browser and privacy changes. So figuring out how to continue to optimize - but within the restrictions these changes place on us - has become one of the biggest challenges for the industry.
K.W.: I’m going to go way out on a limb here and think pretty far outside the box (er tree?) and say that what we think of today with “targeting” and “personalization” will become more “customization”. The difference being - customers will opt-in (or not) to marketing preferences, perhaps getting some kind of benefit (other than relevancy), and this type of personalized marketing will become crucial to a company’s success. And figuring out how to analyze and optimize these “customizations” will keep us all up at night.