E-commerce CRO · March 31, 2026 · 7 min read

Stop A/B Testing Your Store Until You've Fixed These 7 Things First

By Jonathan · Founder, PageGains

Most e-commerce teams treat A/B testing like a magic wand. Sales are flat, so they split-test button colors, tweak headlines, and run experiments for weeks — only to find no winner, or worse, a false positive that disappears the moment they ship it. The problem usually isn't the test. It's that the store was never ready to be tested in the first place.

Your Analytics Are Lying to You (And You Don't Know It)

Before you split a single visitor, verify that your data is actually trustworthy. Google Analytics 4 setups are misconfigured all the time: duplicate purchase events, tags that fire twice, checkout steps that never track at all. If your funnel data has holes, every experiment you run is measuring noise.

The fix is boring but non-negotiable: audit your tracking before anything else. Open GA4's DebugView and walk through your own checkout. Check that each step fires exactly once. Pull your revenue numbers from GA4 and compare them against your Shopify or WooCommerce backend. If they're off by more than 5–10%, you have a tracking problem, not a conversion problem. Fix it now, or every winning test you think you've found will be built on sand.
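The revenue comparison is simple arithmetic, and it is worth scripting so you can rerun it every month. Here is a minimal sketch in Python. The function names, the revenue figures, and the transaction IDs are all made up for illustration, and the 5% cutoff is just the lower end of the range mentioned above; you would feed in your own GA4 and backend exports.

```python
from collections import Counter


def revenue_gap(analytics_revenue: float, backend_revenue: float) -> float:
    """Relative gap between analytics revenue and backend revenue.

    The store backend is treated as the source of truth.
    """
    if backend_revenue <= 0:
        raise ValueError("backend_revenue must be positive")
    return abs(analytics_revenue - backend_revenue) / backend_revenue


def duplicate_transactions(transaction_ids: list[str]) -> dict[str, int]:
    """Return every transaction ID that was recorded more than once."""
    counts = Counter(transaction_ids)
    return {tid: n for tid, n in counts.items() if n > 1}


# Hypothetical numbers, for illustration only.
gap = revenue_gap(analytics_revenue=46_200.0, backend_revenue=42_000.0)
dupes = duplicate_transactions(["T1001", "T1002", "T1002", "T1003"])

print(f"Revenue gap: {gap:.1%}")
if gap > 0.05:  # assumed cutoff: lower end of the 5-10% range
    print("Tracking problem: audit your events before running any test")
print(f"Double-counted transactions: {dupes}")
```

A gap on its own tells you something is wrong; the duplicate list tells you where to start looking.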

One client was declaring experiment winners based on GA4 purchase events that were double-counting mobile transactions. Their "winning" variant was actually 3% worse — they just couldn't see it. Two hours of QA would have saved them months of bad decisions.

Your Sample Size Math Is Wrong

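Here's a number most teams get wrong: as a rule of thumb, you need roughly 1,000 conversions per variant to detect a 10% relative lift with 95% confidence. If your store converts at 2% and you're testing two variants, that means 50,000 visitors per variant, or 100,000 in total, before you can trust the result. Insist on the conventional 80% statistical power as well and the requirement climbs higher still.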

Run a sample size calculator (Evan Miller's is free and accurate) before you start any test. Plug in your current conversion rate, the minimum lift you care about detecting, and your desired confidence level. If you'd need six months of traffic to reach significance, don't run an A/B test — fix something obvious instead, measure the before/after, and move on.
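If you want to see what a calculator like that does under the hood, here is a rough sketch using the textbook formula for comparing two proportions. It is not Evan Miller's exact method. The 80% power default and the weekly traffic figure are my assumptions, not numbers from your store.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variant(
    baseline_rate: float,
    relative_lift: float,
    alpha: float = 0.05,   # 95% confidence, two-sided
    power: float = 0.80,   # assumed; a common default
) -> int:
    """Approximate visitors needed in EACH variant of a two-variant test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


n = visitors_per_variant(baseline_rate=0.02, relative_lift=0.10)

weekly_visitors = 10_000  # hypothetical traffic to the tested page
weeks_needed = 2 * n / weekly_visitors

print(f"Visitors per variant: {n:,}")
print(f"Weeks to finish at {weekly_visitors:,} visitors/week: {weeks_needed:.1f}")
```

At a 2% baseline and a 10% relative lift, this asks for roughly 80,000 visitors per variant, which is more than the rule of thumb suggests because it also demands 80% power. The exact figure moves with the power you pick, so treat it as a planning estimate.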

The reason underpowered tests are dangerous isn't just that they waste time. It's that they produce false winners. A test that ends at 300 conversions per variant has a roughly 1-in-3 chance of calling a loser a winner due to random variance. You ship the "winning" variant, see no improvement, and blame the idea — when the real culprit was math.
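To get a feel for how shaky a result at that scale is, you can run a standard two-proportion z-test on the raw counts. The sketch below uses invented numbers: 15,000 visitors per variant, 300 conversions for the control, and 330 for the variant, which looks like a 10% lift on a dashboard.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical test: 2.0% vs 2.2% conversion, about 300 conversions each.
p_value = two_proportion_p_value(conv_a=300, n_a=15_000, conv_b=330, n_b=15_000)

print(f"p-value: {p_value:.3f}")
print("Significant at 95%" if p_value < 0.05 else "Not significant: keep collecting data")
```

With these counts the p-value lands well above 0.05, so the apparent 10% lift is indistinguishable from noise. Shipping it on that evidence is a coin flip dressed up as a result.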

Your Page Speed Is Bleeding Conversions Before Any Test Can Help

Google's data is consistent and brutal: every additional second of load time on mobile reduces conversions by 4–8%. If your product pages load in 5 seconds on a mid-range Android device, you're already starting with a massive handicap that no headline test will fix.

Run your store through PageSpeed Insights on mobile. Focus on Largest Contentful Paint (LCP) and Cumulative Layout Shift (CLS). LCP above 2.5 seconds means your hero image or above-the-fold content is loading too slowly. CLS above 0.1 means elements are jumping around as the page loads, which destroys trust and causes visitors to tap buttons they didn't mean to press.
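If you track these numbers across many pages, a tiny helper keeps the verdicts consistent. This sketch uses the 2.5-second and 0.1 cutoffs above as the "good" boundary, plus Google's published "poor" boundaries of 4.0 seconds for LCP and 0.25 for CLS. The function names and the sample measurements are hypothetical.

```python
def rate_lcp(seconds: float) -> str:
    """Classify Largest Contentful Paint, measured in seconds."""
    if seconds <= 2.5:
        return "good"
    if seconds <= 4.0:
        return "needs improvement"
    return "poor"


def rate_cls(score: float) -> str:
    """Classify Cumulative Layout Shift, a unitless score."""
    if score <= 0.1:
        return "good"
    if score <= 0.25:
        return "needs improvement"
    return "poor"


# Hypothetical mobile measurements for three page types.
pages = {
    "home": (2.1, 0.05),
    "product": (5.0, 0.18),
    "checkout": (3.2, 0.31),
}

report = {name: (rate_lcp(lcp), rate_cls(cls)) for name, (lcp, cls) in pages.items()}

for name, (lcp_rating, cls_rating) in report.items():
    print(f"{name}: LCP {lcp_rating}, CLS {cls_rating}")
```

Anything that isn't "good" on both metrics goes on the fix list before it goes on the test list.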

The quick wins: compress images with a tool like Squoosh or switch to WebP format, defer non-critical JavaScript, and remove any third-party scripts (chat widgets, pop-up tools, review apps) that load synchronously. Don't A/B test a slow page. Fix the speed, then test.

GET YOUR OWN AUDIT

Find these issues on your own page

PageGains analyzes any URL and surfaces these exact problems in ~60 seconds. First audit from $3.99.

Analyze my page →

Your Mobile Checkout Has Deal-Breaking Friction

Pull up your checkout flow on a real phone, then hand it to someone who's never seen your store. Don't run this test yourself: you use the store every day and have muscle memory for it. Watch where they hesitate, pinch to zoom, or tap the wrong button.

Common killers that show up constantly: address fields with no autocomplete, credit card inputs that don't trigger a numeric keyboard, order summary sections that are hidden by default so customers don't know what they're paying for, and "Guest checkout" options buried below a login wall. Any one of these adds seconds and confusion. Together, they tank mobile conversion rates to half what desktop achieves — and mobile now drives 60–70% of traffic for most e-commerce stores.

Fix these before you split-test your checkout button label. A friction-free checkout with a generic CTA will outsell a perfectly tested CTA on a broken checkout every single time.

Your Value Proposition Isn't Clear in the First 5 Seconds

Open your homepage on a fresh browser tab and set a timer. Ask yourself: within five seconds, can a first-time visitor answer three questions — what do you sell, who is it for, and why should they buy from you instead of someone else?

Most stores fail question three completely. "High quality products delivered fast" is not a value proposition. It's a placeholder. What actually makes you different? Is it a 90-day no-hassle return policy? Handmade in Portugal? Dermatologist-tested formulas? Whatever the real answer is, it should be in the headline or the subheadline — not buried in a paragraph below the fold.

If your value proposition is vague, every A/B test you run is testing around the real problem instead of at it. You might find that headline A beats headline B, but if both headlines are weak, you've just found the least-bad version of a structural mistake.

Your Product Pages Are Missing the Persuasion Basics

Before you test which product image style converts better, make sure the page has the fundamentals in place. Specifically: social proof that's prominent and specific, a clear answer to the top objection a buyer would have, and scarcity or urgency that's real — not fake countdown timers that reset every time the page loads.

Social proof means more than a star rating. It means a review that names the specific benefit the customer got ("I've had chronic back pain for years and this chair is the first one that's actually helped"). That kind of specificity converts. Generic "Love this product!" reviews barely move the needle.

The top objection varies by category. For apparel it's usually fit. For supplements it's safety and efficacy. For electronics it's compatibility. Find yours by reading your 3-star reviews and your customer support tickets — that's where buyers tell you exactly what almost stopped them from buying. Answer that objection directly on the page, above the fold if possible.

You Have No Idea Why Visitors Are Actually Leaving

A/B testing tells you which version wins. It doesn't tell you why visitors are leaving in the first place. Without knowing the "why," you're guessing at what to test — and most guesses are wrong.

Three tools that give you actual answers in 1–2 weeks: Hotjar or Microsoft Clarity for session recordings and heatmaps, a simple on-exit survey asking "What stopped you from completing your purchase today?" (Hotjar, Survicate, or even a simple Typeform triggered on exit intent), and a 5-person usability test where you watch real people shop your store on Zoom.

The usability test alone will surface more winning test ideas than a month of heatmap analysis. You'll see exactly where people get confused, what copy they misread, and which trust signals they actually look for. When you know what's broken, you test the fix — not a random variation. That's how you get 20% lifts instead of 2%.

The Bottom Line

A/B testing is one of the most powerful tools in CRO. But it's a measurement tool, not a repair tool. Running experiments on a store with broken tracking, slow page speed, vague positioning, and checkout friction is like taking a temperature with a broken thermometer: the reading is meaningless.

The seven issues above aren't glamorous. Fixing your analytics isn't a story you can tell in a case study. Compressing images doesn't make for a great LinkedIn post. But fixing them moves the baseline — and a higher baseline means every future experiment you run starts from a stronger position and reaches significance faster.

Get the foundation right first. Then test. When you do, you'll find your experiments start producing clearer winners, your conversion rates stop bouncing around unpredictably, and the lifts you see in testing actually hold up after you ship. That's what good CRO feels like — and it starts long before you ever split a single visitor.