If I were in your shoes, the very first thing I’d do is install a session-recording tool and watch at least 50 user sessions before I even think about an A/B test.
Here’s why I’d start there. The biggest mistake you can make is testing for the sake of testing. You’ll waste time and traffic testing button colors when the real problem is that your shipping estimator is broken on mobile. The goal of A/B testing isn’t just to find a winner; it’s to validate a hypothesis about your customers. And you can’t form a good hypothesis without observing real behavior first. Jon MacDonald of The Good talks constantly about analyzing user behavior to uncover conversion roadblocks, and he’s right: watching recordings is the fastest way to see your store through your customers’ eyes.
So, in week one, I’d just watch. I’d grab a coffee, open up Hotjar or a similar tool, and look for patterns. Where do people hesitate? Where do they rage-click? Where do they go back and forth? I’d focus entirely on mobile checkout, since that’s probably where most of your users are. By the end of the week, I’d have a list of friction points and a single hypothesis I want to test. Maybe it’s "Removing the discount code box from the first page of checkout will increase conversion because it stops people from leaving to search for a coupon."
Then in month one, I’d run that single, simple test. You need tools that make it easy, and there are plenty of Shopify apps built for exactly this. The key is to test a meaningful change, not something tiny. This is where I think of Kurt Elster, who on the My Wife Quit Her Job podcast talked about making counterintuitive changes that can have a big impact. Removing a discount box feels scary, but it’s a great example of a high-impact test.
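Whatever app you use, the core mechanic of a clean test is deterministic bucketing: the same visitor must always see the same variant, across sessions, or your data gets muddy. Here’s a minimal sketch of how that works under the hood (the test and variant names are made up for illustration; a real Shopify app handles this for you):

```python
import hashlib

def assign_variant(visitor_id: str, test_name: str = "remove-discount-box") -> str:
    """Deterministically bucket a visitor into variant A or B.

    Hashing visitor_id together with the test name means the same visitor
    always lands in the same bucket, with no server-side state, and a new
    test name reshuffles everyone independently of previous tests.
    """
    digest = hashlib.sha256(f"{test_name}:{visitor_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # a stable number from 0 to 99
    return "B-no-discount-box" if bucket < 50 else "A-control"
```

The point of the sketch is the property, not the code: re-running `assign_variant("visitor-123")` always returns the same answer, so a shopper who leaves and comes back never flips between checkout experiences mid-test.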
What I’d ignore completely at the start is any advice that tells you to copy Amazon or another giant retailer. Your checkout flow has different problems and different customer expectations. I’d also ignore any test that I couldn’t explain the "why" behind in a single sentence.
The biggest trap to avoid is getting impatient and stopping a test too early. You need statistical significance, and that takes traffic and time. It’s better to run one clean test for three weeks than three messy tests in one week. Let the data mature. Your goal isn’t a quick hack; it’s building a system of continuous improvement, which is exactly what Jon MacDonald covers in his episode Conversion Rate Optimisation For eCom Businesses.
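To make "takes traffic and time" concrete, here’s a standard two-proportion z-test in plain Python (stdlib only, illustrative numbers). The same 1% lift that looks like noise at 1,000 visitors per variant becomes clearly significant at 10,000:

```python
from math import erf, sqrt

def ab_significance(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-proportion z-test for an A/B test.

    conv_a/conv_b: number of conversions in each variant.
    n_a/n_b: number of visitors in each variant.
    Returns (z_score, two_sided_p_value).
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled conversion rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the normal CDF: Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Same 3% -> 4% lift, two sample sizes:
_, p_small = ab_significance(30, 1000, 40, 1000)      # p is well above 0.05
_, p_large = ab_significance(300, 10000, 400, 10000)  # p is well below 0.05
```

This is why stopping early is a trap: the small-sample result above is indistinguishable from chance, even though the underlying lift is real. The data only separates signal from noise once enough visitors have flowed through.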


