You are an authoritative digital marketer at a leading digital organization. You run frequent A/B tests to measure the effectiveness of your marketing. I know you are running tests. Why? Because Google told you. And BCG told you. And Bain told you. It’s your “test-and-iterate” approach that makes you such an authority.
Bad news for you: Major marketing test mistakes confound your efforts. Your business impact lags behind its high-powered potential.
The good news: You can quickly identify and remedy these mistakes. The fixes are broadly applicable. They are strategic, not technical. Let’s dive in.
Mistake #1: Running A/B tests without defining impact on future business decisions.
I learn so many curious facts when I talk to businesses about their marketing measurement test results.
We saw that our five-part creative combination – when viewed in order – lifted conversion rates by almost 22%.
Our sales data correlated to changes in air temperature – with an R-squared of .76.
Customers who read our Monday email campaigns before 9 AM spend $25 more per transaction!
Moving 20% of our campaign budget into over-the-top addressable media results in 3.5% incremental reach.
Those are all incredibly specific and interesting insights! Here’s how those conversations always play out afterwards.
That’s impressive. How are you planning to build this narrative sequence into your broader digital strategy?
Oh, we don’t have a current digital audience strategy partner in mind for always-on execution. Our VP didn’t think the test vendor’s fees were within next year’s budget.
Great insight. What’s your automation strategy for optimizing to weather events?
We weren’t aware you could optimize to weather events. Is that possible?
How are you re-designing your email marketing campaigns to better capture this Monday-morning effect?
I don’t think our email marketing cadence will change much next year. Our current program is managed by a third-party, and we are locked into our contract until 2022.
You get the drift. Bottom line – marketing analytics are used to build way too much trivia, and don’t influence nearly enough decisions.
Mistake #2: Changing tactics during strategy tests, and vice versa.
Just about every marketing test I’ve ever seen involves botched tactics. A launch happens late. A setting was missed until the second day. Sometimes these issues are prohibitive, but most of the time they are the expected realities of managing any substantial change. This mistake is not about those known issues. You’ll never launch a flawless marketing test, so insisting on developing one is a fool’s errand.
What this mistake is about are the explicit, intentional, sometimes well-meaning changes our teams deploy that corrupt the basic relationship your test design attempts to isolate.
Example: In the early 2010s, many multi-channel retailers were unsure if their digital media drove store sales. (ed. – This seems like a quaint, almost adorably naive belief in 2020.) Finance teams approved pretty substantial media testing budgets and built in-depth, omnichannel measurement plans, all in an attempt to determine once and for all if their digital media campaigns drove store sales improvement.
Unfortunately, these same companies were often under substantial revenue and earnings pressure overall. Many still are. This created an environment where business results from every channel were always under scrutiny, and efforts that improved results from any sales channel were quickly rewarded and deployed.
This one-two punch doomed effective marketing test design. Many large online-to-store media tests went into market across all of US multi-channel retail between 2010 and 2016, but media tactic improvements and optimizations – such as improved use of automation and machine learning – found their way into every digital media investment, including the large online-to-store test campaigns. The tactical optimization improvements worked – overall media ROI improved and revenues increased. Online-to-store campaigns worked, too! Store sales lifted where retailers committed serious test investments to O2S.
So why did things go wrong? Since tactical improvements were applied to online-to-store test strategies, marketing analysts couldn’t definitively answer the question: Did store sales increase because of our online-to-store investments, or did they increase because we improved our overall media tactics?
Did marketers and executives say, “Looks like both work! Let’s invest more in online-to-store strategies and also continue to push our machine learning and automation capabilities forward”?
Of course not. They are human beings, after all. And when humans are asked to choose between two really good options, they end up waffling back and forth, unable to commit to either. Just about all of these multi-channel retailers ran incredibly successful tests from a business perspective. Almost all of them reverted to old strategies the following year, repeating the process a few times before finally accepting that online-to-store is, in fact, worth investing in.
Mistake #2 isn’t problematic because “test data are corrupted by confounding variables” or “test and control populations co-mingled” or even “unexpected metric covariance.”
Mistake #2 is a big problem because when given too many choices, your organization will paralyze itself with indecision. You need to commit to learning one important thing with each marketing test. Avoid the urge to “win” the test by capturing every opportunity and angle for victory. The fix here is simple: test one change, and let everything else go.
Mistake #3: Testing that yields “insights” without logic.
This mistake would be laugh-out-loud funny if it weren’t also so darn tragic. These are the tests that take three months to design and require 40 graduate degrees and a 20-person video conference to approve. Despite intense planning, they do not produce the insight you thought you were building towards.
I’ve read 15-page marketing test design documents before. I’ve been asked to comment on any best practices they were missing! The only things I can think to say in cases like this are things I can’t say out loud. None of these tests – ever – make any sense whatsoever. Somewhere along the way in these situations, the entire purpose of marketing analytics gets lost. Teams forget that they are trying to improve a business outcome, affect a strategic decision, and make the business more money.
Worse yet — over-engineered test design usually means a drastic overemphasis on statistical rigor and the scientific method, and not nearly enough emphasis on domain experience and common sense. Variables like “ad impressions”, “users”, “clicks”, “conversions”, “exposures”, “cookie-match rates”, and other marketing analytics jargon stop holding meaning for the teams designing the test, and become anonymous variables in a statistics equation. Soon enough, the media being tested is simply a vehicle for generating anonymous variables and data tables. Rigorous, correct statistics are applied to those tables without any thought to what the data are, the concepts they represent, or whether the premise of the test makes any sense. When that occurs, your company has just funded a very expensive data science sandbox for statisticians to play around in. You’ve also probably just “proven” profound insights like: “We can say with 95% confidence that the best way for us to grow our business is to have more people search for the term [buy our product now].” Great job, everyone. SMH.
Somehow we need to get through to the marketing analytics and decision science community that business isn’t a hard science like physics. We aren’t smashing particles in a collider; we are trying to decide the best course of action in consumer and business markets. This involves socioeconomic factors. This involves auction and game theory. This involves psychology. This involves some serious complexity science. Mimicking the lab-like design rigor and statistical precision of tests in the hard sciences diminishes your marketing tests’ quality and impact on your business, because doing so usually means abandoning practical utility.
We need to stop over-engineering and over-complicating the process of getting strategies and tactics tested in-market because we want to sound smarter, or appear more credible to “hard science” types and others who look down on marketing analytics. Frankly, screw those people. Marketing analytics and decision science are incredibly hard precisely because they aren’t measurable in a lab. Sure, you can do one-off lab-like tests when you constrain your hypothesis and expectations sufficiently. (You can also do this if your business isn’t under existential pressure and your coffers are flush with cash.) But marketing analytics teams are asked to test strategies and tactics in real time, while managing massive in-market programs, in an often ridiculously competitive environment with limited capital and increasingly shorter deadlines to deliver measurable results. Marketing analytics exists to develop and evaluate useful, insightful and profitable marketing strategies. Let’s discuss criteria for doing just that.
How To Fix Marketing Test Mistakes
Fix #1: Secure commitment on business decisions ahead of the test launch.
Cassie Kozyrkov, Chief Decision Scientist at Google, calls out “Define your Objectives” as the #1 action item in every statistics & artificial intelligence “testing talk” I’ve seen her give. While that has slightly broader implications, the point remains the same: every single time you run an A/B test, define why you are running it and what your business will do based on the results.
- Write it down.
- Keep a library of A/B test design documents.
- Share the design document with your stakeholders and business decision makers.
- Get sign-off on the decision statements prior to putting a single test dollar into market.
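The checklist above can live in a template as simple as a small data structure. A minimal sketch — the field names here are my own, hypothetical choices, not a standard:

```python
from dataclasses import dataclass, field

@dataclass
class TestDesignDoc:
    """Hypothetical skeleton for an A/B test design document.

    Every field must be filled in, and sign-off collected,
    before a single test dollar goes into market.
    """
    objective: str                 # the business question being answered
    hypothesis: str                # the belief driving the test
    decision_if_accepted: str      # what the business does if it holds
    decision_if_rejected: str      # what the business does if it doesn't
    decision_if_null: str          # what the business does on a null result
    signed_off_by: list[str] = field(default_factory=list)

    def ready_to_launch(self) -> bool:
        # No stakeholders on record means no launch.
        return len(self.signed_off_by) > 0
```

Keeping these in a library (a shared folder or repo of such documents) makes the commitment auditable after the fact.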
How should you frame your A/B Testing hypothesis and decision statements? Flex between two approaches:
- Appease the hard scientists in your cohort – if you have (are) them – by stating the belief driving the test (the hypothesis), and then aligning decision statements to “null hypothesis”, “hypothesis accepted” and “hypothesis rejected” test outcomes.
- If you are just working with regular folks like you and me, feel free to use more approachable language, like: “If version B wins, we shift the budget to B. If version A wins, we stay the course. If we can’t tell the difference, we decide whether the question merits a bigger test.”
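For the hard-science framing, here is a minimal sketch of how the three decision statements map onto a standard two-proportion z-test. The thresholds and decision strings are illustrative placeholders, not prescriptions:

```python
from math import sqrt
from statistics import NormalDist

def ab_decision(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Map a two-proportion z-test onto three pre-committed decisions.

    conv_a/conv_b: conversion counts; n_a/n_b: visitors per variant.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled proportion under the null hypothesis (no difference).
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided test
    if p_value < alpha and p_b > p_a:
        return "hypothesis accepted: shift budget to variant B"
    if p_value < alpha:
        return "hypothesis rejected: keep variant A"
    return "null result: no budget change; revisit sample size before retesting"
```

The point is that each branch returns a business decision, not just a statistic — the numbers only exist to pick one of the statements your stakeholders already signed off on.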
Securing commitment to business decisions before you run the test makes your marketing test useful. Nothing improves the quality and impact of your marketing measurement and analytics practice more than becoming more useful to your business.
That’s Great, But My Boss Won’t Commit & Told Me To Run This Test Anyway
Let’s address the elephant in the room. By and large, your marketing analytics & testing organization wasn’t trying to avoid utility and impact. You know you should prioritize outcomes — as head of marketing analytics at your organization, making mistake #1 isn’t your fault. In the real world, most marketing tests are launched because of a business question asked by the C-Suite, boardroom, or even the CEO herself. The motivations for running the test aren’t entirely about improving the business – your bosses want to look smarter.
They know your credibility. In their minds, if you run a test it will not only give them the answer they need – it will make them unassailable. So what do you do? Are you doomed to making Mistake #1 until you become CEO?
Don’t despair! There’s another path. Here’s where you can use Fix #2.
Fix #2: When you know the answer to the question, share the knowledge.
Revisiting the multi-channel retailer vignette above: while it’s true online-to-store testing was pervasive between 2010 and 2016, there were already plenty of datasets available at these same organizations well before 2010 that demonstrably verified that online media indeed impacted in-store sales. Sometimes, these datasets were more robust than the eCommerce business lines themselves. And so while the default thinking was that there was no impact to store sales from digital ads, and most digital marketing analytics folks jumped at the chance to deploy these tests to prove otherwise, they should have done something else.
Digital marketing analytics folks knew the truth was far from the default thinking: they were sitting on customer-level interaction data from their digital assets and initiatives, where the same exact customers were purchasing in store. These datasets had thousands of rows; some had tens of millions. Were there efforts to show these data to business executives to educate and illuminate their thinking? I’m sure there were some, but by and large, we all thought that the easiest path forward was simply to run the tests instead.
When your boss asks for a test for something you already have a definitive dataset on, book time with her and show. her. that. data.
Your managers might not be able to say it in team meetings or one-on-one conversations, but what they really need is your insight. You can save yourself and the company a lot of time, make your boss look like a genius, and make some serious money by simply offering to share the expertise you know – and by being willing to take the time to help others truly understand it as well as you do.
Fix #3: Walk the process & draw a causal diagram before committing to A/B test design.
Will customers respond to your new marketing initiative? How will the A/B test go? While you can’t predict the future, you know a ton about your business and the world it lives in. For instance:
- An ad impression can’t drive a sale unless someone sees it.
- A website visit doesn’t drive a new transaction if it happened after the sale.
- For you programmatic media experts: you can’t check if someone is on your list if you don’t know who you are looking for and don’t have a list to check against.
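The second statement above translates directly into a guard you can put in attribution code. A sketch with hypothetical field names (your touchpoint records will differ):

```python
from datetime import datetime

def causal_touchpoints(touchpoints, sale_time):
    """Drop touchpoints recorded after the sale.

    A visit that happened after the transaction cannot have caused it,
    so it must never be credited in an attribution model.
    """
    return [t for t in touchpoints if t["time"] < sale_time]
```

A one-line filter like this encodes the “no duh” logic before any statistics run, so impossible attributions never reach the model in the first place.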
These feel like “no duh” statements. Yet they represent profound insight beyond anything pure advanced statistics can offer. A poorly designed measurement model, without a valid causal diagram underpinning it, can produce statistically significant support for impossible concepts and preposterous hypotheses. If a test design’s measurement framework violates a basic logical process for how something could work, then stop the test design in its tracks and find a path that retains basic common sense.
What’s an example? A marketing test that concludes “The best way for us to grow our business is to have more people search for the term [buy our product now].” is a problem! You control your marketing – you don’t control users. This statement tells marketers nothing about how to influence users. Evaluations like this are nonsensical and fraught with risk.
If you don’t know what a causal diagram is, I highly recommend The Book of Why by Judea Pearl. Do you need to physically draw the causal diagrams? Not at all. What you do need to do is use your domain experts – who know your industry’s dynamics, and how things work – to ensure that the hypothesis you are testing could conceivably be true and useful given what is known about your business or the world. It’s very possible that your marketing analytics or data science teams do not have enough domain expertise to verify causal diagrams. That’s okay! There’s a simple solution to this — If you don’t know the domain, get a domain expert in the room to help. No data science or graduate-level statistics degree required.
I hope these fixes help your marketing analytics strategy this year and beyond. Testing remains your most powerful tool, and you are the chosen ones armed with the ability to wield it. Please be wary of testing pitfalls – keep these fixes in mind, socialize them and push for their adoption into your organization’s best practice. My next post on marketing analytics & testing will cover the three criteria that define world-class marketing analytics practice. I hope many of you are back to check that out and share it if you find it valuable. Until then, come check us out if you want to improve your marketing analytics, strategy, test design & more.