Turn hunches into proven results by designing A/B tests that deliver valid insights for your marketing or product. Get help creating clear hypotheses, determining how many visitors you need for reliable data, and understanding exactly how to interpret the outcome. Use this whenever you need to validate a new headline, design change, or feature before rolling it out to everyone.
---
name: ab-test-setup
description: When the user wants to plan, design, or implement an A/B test or experiment. Also use when the user mentions "A/B test," "split test," "experiment," "test this change," "variant copy," "multivariate test," "hypothesis," "conversion experiment," "statistical significance," or "test this." For tracking implementation, see analytics-tracking.
license: MIT
metadata:
  version: 1.0.0
  author: Alireza Rezvani
  category: marketing
  updated: 2026-03-06
---
# A/B Test Setup
You are an expert in experimentation and A/B testing. Your goal is to help design tests that produce statistically valid, actionable results.
## Initial Assessment
Check for product marketing context first: if `.claude/product-marketing-context.md` exists, read it before asking questions. Use that context and only ask for information not already covered or specific to this task.

Before designing a test, understand:

- **Test Context**: What are you trying to improve? What change are you considering?
- **Current State**: What is the baseline conversion rate? What is the current traffic volume?
## Hypothesis Format

Because [observation/data], we believe [change] will cause [expected outcome] for [audience]. We'll know this is true when [metrics].
### Example
- **Weak**: “Changing the button color might increase clicks.”
- **Strong**: “Because users report difficulty finding the CTA (per heatmaps and feedback), we believe making the button larger and using a contrasting color will increase CTA clicks by 15%+ for new visitors. We’ll measure click-through rate from page view to signup start.”
Looking at results before reaching the target sample size and stopping early leads to false positives and wrong decisions. Pre-commit to a sample size and trust the process.
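The cost of peeking is easy to show with an A/A simulation, where both variants share the same true conversion rate, so every "significant" result is by definition a false positive. A small illustrative sketch in plain Python (all parameters are arbitrary, not a recommendation):

```python
import random
from itertools import accumulate
from statistics import NormalDist


def z_test_p(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a pooled two-proportion z-test.
    (Normal approximation; crude at small counts, fine for a demo.)"""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate(n_sims=400, n_per_arm=2000, peeks=20, rate=0.05, seed=1):
    """A/A simulation: both arms share the same true conversion rate,
    so any 'significant' result is a false positive."""
    rng = random.Random(seed)
    fp_fixed = fp_peek = 0
    step = n_per_arm // peeks
    for _ in range(n_sims):
        # Cumulative conversion counts per visitor for each arm
        a = list(accumulate(rng.random() < rate for _ in range(n_per_arm)))
        b = list(accumulate(rng.random() < rate for _ in range(n_per_arm)))
        # Peeking: test at every interim look, stop at the first "win"
        fp_peek += any(
            z_test_p(a[k - 1], k, b[k - 1], k) < 0.05
            for k in range(step, n_per_arm + 1, step)
        )
        # Fixed horizon: one test at the pre-committed sample size
        fp_fixed += z_test_p(a[-1], n_per_arm, b[-1], n_per_arm) < 0.05
    return fp_fixed / n_sims, fp_peek / n_sims


fixed, peeking = simulate()
print(f"false-positive rate, fixed horizon: {fixed:.1%}")
print(f"false-positive rate, peeking 20x:   {peeking:.1%}")
```

The fixed-horizon test stays near the nominal 5% error rate, while checking twenty times and stopping at the first "win" inflates it several-fold.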
Rank experiments by expected impact and feasibility before committing to a test.
## Communication
All outputs should meet the quality standard: clear hypothesis, pre-registered metrics, and documented decisions. Avoid presenting inconclusive results as wins. Every test should produce a learning, even if the variant loses. Reference marketing-context for product and audience framing before designing experiments.
## Related Skills
- **page-cro** — USE when you need ideas for what to test; NOT when you already have a hypothesis and just need test design.
- **analytics-tracking** — USE to set up measurement infrastructure before running tests; NOT as a substitute for defining primary metrics upfront.
- **campaign-analytics** — USE after tests conclude to fold results into broader campaign attribution; NOT during the test itself.
- **pricing-strategy** — USE when test results affect pricing decisions; NOT to replace a controlled test with pure strategic reasoning.
- **marketing-context** — USE as foundation before any test design to ensure hypotheses align with ICP and positioning; always load first.
## Sample Size Guide
Reference for calculating sample sizes and test duration.
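A rough per-variant sample size can be computed with the standard normal-approximation formula for comparing two proportions. The sketch below uses only the Python standard library; treat it as a planning estimate and cross-check against your testing tool's calculator:

```python
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_mde, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant for a two-proportion test.

    baseline:     current conversion rate, e.g. 0.05 for 5%
    relative_mde: minimum detectable effect as a relative lift,
                  e.g. 0.15 to detect a 15% improvement
    """
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return int(numerator / (p2 - p1) ** 2) + 1  # round up


# 5% baseline, detect a 15% relative lift at 95% confidence / 80% power
print(sample_size_per_variant(0.05, 0.15))  # roughly 14,000 per variant
```

Note how quickly the requirement grows at low baselines: halving the baseline rate roughly doubles the sample you need for the same relative lift.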
### Sequential Testing

If you must check results before reaching the target sample size, use a sequential testing method: a statistical approach that adjusts significance thresholds for multiple looks at the data.

**When to use it:**

- High-risk changes
- Need to stop bad variants early
- Time-sensitive decisions

**Tools that support it:**

- Optimizely (Stats Accelerator)
- VWO (SmartStats)
- PostHog (Bayesian approach)

**Tradeoffs:**

- More flexibility to stop early
- Slightly larger sample size requirement
- More complex analysis
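If your platform supports none of the above, a crude but safe fallback is to split the alpha budget evenly across planned looks (a Bonferroni correction). This is more conservative than proper sequential bounds such as Pocock or O'Brien-Fleming; the function name and numbers below are illustrative:

```python
def first_significant_look(p_values_at_looks, alpha=0.05):
    """Bonferroni-style alpha spending: split the error budget evenly
    across planned looks. Returns the 1-based index of the first look
    whose p-value clears the corrected threshold, or None."""
    threshold = alpha / len(p_values_at_looks)
    for look, p in enumerate(p_values_at_looks, start=1):
        if p < threshold:
            return look
    return None


# Five planned looks -> per-look threshold of 0.05 / 5 = 0.01.
# The p of 0.02 at look 2 would have stopped a naive peeker but does
# not clear the corrected bar; the p of 0.008 at look 3 does.
print(first_significant_look([0.04, 0.02, 0.008, 0.30, 0.50]))  # 3
```

The price of the stricter per-look threshold is exactly the "slightly larger sample size requirement" listed above.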
### Quick Decision Framework
**Can I run this test?**

- Daily traffic to page: _____
- Baseline conversion rate: _____
- MDE I care about: _____
- Sample needed per variant: _____ (from tables above)
- Days to run: Sample / Daily traffic = _____

Then:

- If days > 60: consider alternatives
- If days > 30: acceptable for high-impact tests
- If days < 14: likely feasible
- If days < 7: easy to run, but consider running longer anyway
## A/B Test Templates Reference
Templates for planning, documenting, and analyzing experiments.
### Test Plan Template
```markdown
# A/B Test: [Name]

## Overview

- **Owner**: [Name]
- **Test ID**: [ID in testing tool]
- **Page/Feature**: [What's being tested]
- **Planned dates**: [Start] - [End]

## Hypothesis

Because [observation/data], we believe [change] will cause [expected outcome] for [audience]. We'll know this is true when [metrics].

## Test Design

| Element | Details |
|---------|---------|
| Test type | A/B / A/B/n / MVT |
| Duration | X weeks |
| Sample size | X per variant |
| Traffic allocation | 50/50 |
| Tool | [Tool name] |
| Implementation | Client-side / Server-side |

## Variants

### Control (A)

[Screenshot]

- Current experience
- [Key details about current state]

### Variant (B)

[Screenshot or mockup]

- [Specific change #1]
- [Specific change #2]
- Rationale: [Why we think this will win]

## Metrics

### Primary

- **Metric**: [metric name]
- **Definition**: [how it's calculated]
- **Current baseline**: [X%]
- **Minimum detectable effect**: [X%]

### Secondary

- [Metric 1]: [what it tells us]
- [Metric 2]: [what it tells us]
- [Metric 3]: [what it tells us]

### Guardrails

- [Metric that shouldn't get worse]
- [Another safety metric]

## Segment Analysis Plan

- Mobile vs. desktop
- New vs. returning visitors
- Traffic source
- [Other relevant segments]

## Success Criteria

- Winner: [Primary metric improves by X% with 95% confidence]
- Loser: [Primary metric decreases significantly]
- Inconclusive: [What we'll do if no significant result]

## Pre-Launch Checklist

- [ ] Hypothesis documented and reviewed
- [ ] Primary metric defined and trackable
- [ ] Sample size calculated
- [ ] Test duration estimated
- [ ] Variants implemented correctly
- [ ] Tracking verified in all variants
- [ ] QA completed on all variants
- [ ] Stakeholders informed
- [ ] Calendar hold for analysis date
```
### Results Documentation Template
```markdown
# A/B Test Results: [Name]

## Summary

| Element | Value |
|---------|-------|
| Test ID | [ID] |
| Dates | [Start] - [End] |
| Duration | X days |
| Result | Winner / Loser / Inconclusive |
| Decision | [What we're doing] |

## Hypothesis (Reminder)

[Copy from test plan]

## Results

### Sample Size

| Variant | Target | Actual | % of target |
|---------|--------|--------|-------------|
| Control | X | Y | Z% |
| Variant | X | Y | Z% |

### Primary Metric: [Metric Name]

| Variant | Value | 95% CI | vs. Control |
|---------|-------|--------|-------------|
| Control | X% | [X%, Y%] | — |
| Variant | X% | [X%, Y%] | +X% |

**Statistical significance**: p = X.XX (95% = sig / not sig)
**Practical significance**: [Is this lift meaningful for the business?]

### Secondary Metrics

| Metric | Control | Variant | Change | Significant? |
|--------|---------|---------|--------|--------------|
| [Metric 1] | X | Y | +Z% | Yes/No |
| [Metric 2] | X | Y | +Z% | Yes/No |

### Guardrail Metrics

| Metric | Control | Variant | Change | Concern? |
|--------|---------|---------|--------|----------|
| [Metric 1] | X | Y | +Z% | Yes/No |

### Segment Analysis

**Mobile vs. Desktop**

| Segment | Control | Variant | Lift |
|---------|---------|---------|------|
| Mobile | X% | Y% | +Z% |
| Desktop | X% | Y% | +Z% |

**New vs. Returning**

| Segment | Control | Variant | Lift |
|---------|---------|---------|------|
| New | X% | Y% | +Z% |
| Returning | X% | Y% | +Z% |

## Interpretation

### What happened?

[Explanation of results in plain language]

### Why do we think this happened?

[Analysis and reasoning]

### Caveats

[Any limitations, external factors, or concerns]

## Decision

**Winner**: [Control / Variant]
**Action**: [Implement variant / Keep control / Re-test]
**Timeline**: [When changes will be implemented]

## Learnings

### What we learned

- [Key insight 1]
- [Key insight 2]

### What to test next

- [Follow-up test idea 1]
- [Follow-up test idea 2]

### Impact

- **Projected lift**: [X% improvement in Y metric]
- **Business impact**: [Revenue, conversions, etc.]
```
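The primary-metric numbers (rates, relative lift, confidence interval, p-value) can be computed directly from raw counts. A minimal Python sketch using a pooled two-proportion z-test; the example counts are invented, and your testing tool's stats engine remains the source of truth:

```python
from statistics import NormalDist


def analyze(conv_c, n_c, conv_v, n_v, alpha=0.05):
    """Compute what the primary-metric table asks for: rates, relative
    lift, a 95% CI on the difference, and a two-sided p-value from a
    pooled two-proportion z-test."""
    p_c, p_v = conv_c / n_c, conv_v / n_v
    diff = p_v - p_c
    # Pooled standard error for the hypothesis test
    p_pool = (conv_c + conv_v) / (n_c + n_v)
    se_pool = (p_pool * (1 - p_pool) * (1 / n_c + 1 / n_v)) ** 0.5
    p_value = 2 * (1 - NormalDist().cdf(abs(diff / se_pool)))
    # Unpooled standard error for the CI on the difference
    se = (p_c * (1 - p_c) / n_c + p_v * (1 - p_v) / n_v) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return {
        "control_rate": p_c,
        "variant_rate": p_v,
        "relative_lift": diff / p_c,
        "ci_95_diff": (diff - z_crit * se, diff + z_crit * se),
        "p_value": p_value,
        "significant": p_value < alpha,
    }


# Hypothetical counts: 500/10,000 control vs. 575/10,000 variant
result = analyze(conv_c=500, n_c=10000, conv_v=575, n_v=10000)
print(f"lift: {result['relative_lift']:+.1%}, p = {result['p_value']:.3f}")
```

Report both numbers: statistical significance (the p-value) and practical significance (whether the lift and its CI are meaningful for the business).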
### Test Repository Entry Template
For tracking all tests in a central location:
| Test ID | Name | Page | Dates | Primary Metric | Result | Lift | Link |
|---------|------|------|-------|----------------|--------|------|------|
| 001 | Hero headline test | Homepage | 1/1-1/15 | CTR | Winner | +12% | [Link] |
| 002 | Pricing table layout | Pricing | 1/10-1/31 | Plan selection | Loser | -5% | [Link] |
| 003 | Signup form fields | Signup | 2/1-2/14 | Completion | Inconclusive | +2% | [Link] |
### Quick Test Brief Template
For simple tests that don't need full documentation: