20bits: Statistical Analysis and A/B Testing

  • Chris · 11 months ago
    Nice post. Social Media had a similar blog entry a while back.
    http://blog.socialmedia.com/crafting-a-statisti...
  • Jesse · 11 months ago
    Chris,

    That's a nice video. I wonder what software they were using?
  • Chris · 11 months ago
    They were using an Excel spreadsheet and were kind enough to make it public.
    http://blog.socialmedia.com/wp-content/uploads/...

    I just tried it and had to watch the video a couple of times to understand how to use the spreadsheet.
  • Brendan O'Connor · 11 months ago
    The most important thing to know is a software package to use -- you don't want to muck around coding this yourself. R's t.test() is a good choice. (I've heard Excel can do it too I suppose.)
  • hadley · 11 months ago
    If you're going to use R, why not actually use the appropriate test? In this case it would be prop.test(), which tests the difference between two proportions.
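
For readers without R, here is a minimal Python sketch of the kind of test hadley is pointing at: a pooled two-proportion z-test. It is not prop.test() itself. By default prop.test() reports a chi-squared statistic with a continuity correction; with correct=FALSE that statistic is the square of the z computed here. The function name and the visitor counts are hypothetical, chosen only for illustration.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-sided z-test for a difference between two conversion rates.

    conv_a / conv_b are conversion counts, n_a / n_b are visitor counts.
    Relies on the normal approximation, so it assumes reasonably large samples.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Under the null hypothesis both arms share one rate, so pool the data.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical numbers: control converts 100 of 1000, treatment 150 of 1000.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```

A positive z here means the second arm converted at a higher rate than the first.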
  • Jesse Farmer · 11 months ago
    Where did my comments go? :(
  • Jesse Farmer · 11 months ago
    Oh, that was weird.
  • hadley · 11 months ago
    "The conversion rate for each treatment is a normally distributed random variable" - are you sure??
  • Jesse Farmer · 11 months ago
    "The conversion rate for each treatment approximates a normally distributed random variable" is more correct.
  • Yvonne · 9 months ago
    If, for example, you had a treatment D with a z-score of -2.94, would you then be 95% confident that treatment D is worse than the control?
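
One way to check a question like this, sketched under the assumption that the z-score is standard normal when the treatment and control truly perform the same: compute the one-sided tail probability below the observed z. The helper name normal_cdf is made up for this example.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, P(Z <= z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

z = -2.94  # the z-score from the question

# Probability of a z-score this low or lower if there were no real difference.
p_lower = normal_cdf(z)

# A one-sided test at the 95% level rejects when p_lower < 0.05,
# which corresponds to z below roughly -1.645.
print(f"one-sided p = {p_lower:.4f}")
print("worse than control at the 95% level:", p_lower < 0.05)
```

Since -2.94 lies well below -1.645, this one-sided reading would call treatment D worse than the control at the 95% level, and the tail probability is small enough to clear stricter thresholds too.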
  • Traveller_Adventure · 3 months ago
    This is quite impressive, I am pleased to read this post, keep posts like this coming, you totally rock!
    Cheers,
    Blog Review
  • Loïc d'Anterroches · 2 months ago
    Thanks a lot! You helped me get a 50% boost in conversion rate for my website:
    http://www.ceondo.com/ecte/2009/08/ab-testing-b...

    I really recommend that everybody do some A/B testing. My article links to the PHP code for the tests, if people are interested.
  • jtregister · 2 months ago
    This is a great resource; thanks. I'm beginning to look into the statistics behind A/B testing and have some questions. This is well after the initial post, so hopefully Jesse and others will see this.

    For the distribution of the conversion rate, it seems like it should be a binomial distribution, which can be approximated by the normal distribution (as Jesse asserts in the comments) at scale, i.e. with a large enough sample.
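
A quick way to see how close the normal approximation gets is to compare it with the exact binomial probability. The sketch below does that for made-up numbers (1000 visitors, a 5% true conversion rate); n, p, k and the helper names are illustrative assumptions, not values from the post.

```python
import math

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def normal_cdf(x, mean, sd):
    """P(Y <= x) for Y ~ Normal(mean, sd)."""
    return 0.5 * math.erfc(-(x - mean) / (sd * math.sqrt(2)))

n, p, k = 1000, 0.05, 60  # hypothetical: 1000 visitors, 5% rate, at most 60 conversions

mean = n * p                     # expected number of conversions
sd = math.sqrt(n * p * (1 - p))  # binomial standard deviation

exact = binomial_cdf(k, n, p)
approx = normal_cdf(k + 0.5, mean, sd)  # +0.5 is the usual continuity correction

print(f"exact binomial: {exact:.4f}")
print(f"normal approx:  {approx:.4f}")
```

The two figures should agree closely at this sample size; the approximation degrades as n shrinks or as p moves toward 0 or 1.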

    But how about if we take this one step further and look to measure this on an e-commerce website, where there's not just conversion rate but also average order value to consider? (Really, we want to look at the contribution margin, but let's assume -- admittedly incorrectly -- that we have a 100% margin on the shopping cart.) This considers contribution per visitor, a broader metric of an e-commerce website than simply conversion rate. (And of course the subsequent step is to follow the impact on lifetime customer value, but let's not go there for now.)

    Now consider the distribution of order value on a typical e-commerce website, where often ~95% of visitors do not convert. Among those who do convert, order values typically follow a roughly normal distribution. But if you plot order value across all visitors, including those who don't convert, there's a huge 'peak' at zero followed by a normal bell curve. This is a more complicated distribution than a simple normal distribution.

    Does anyone have insights on how to analyze the A/B results for contribution per visitor given this type of distribution? Seems like perhaps a compound Poisson, or something similarly complex. Or can someone perhaps provide a good justification of why this level of complexity is unnecessary in the analysis?

    Thanks,
    Jonathan
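
One possible answer, offered as a sketch rather than a settled method: the per-visitor revenue distribution is far from normal, but the test only needs the sample mean of revenue per visitor to be approximately normal, and with enough visitors the central limit theorem makes that a reasonable assumption. Under that assumption you can record revenue for every visitor (zero for non-converters) and compare the two means with a Welch-style statistic. The function name and the toy data below are hypothetical; with small samples or very heavy-tailed order values, a bootstrap or permutation test would be a safer choice.

```python
import math
import statistics

def welch_z(revenue_a, revenue_b):
    """Compare mean revenue per visitor between two arms.

    Each list holds one entry per visitor, with 0.0 for non-converters.
    Uses unequal-variance (Welch) standard errors and a normal approximation
    for the p-value, which assumes large samples.
    """
    mean_a = statistics.mean(revenue_a)
    mean_b = statistics.mean(revenue_b)
    var_a = statistics.variance(revenue_a)  # sample variance
    var_b = statistics.variance(revenue_b)
    se = math.sqrt(var_a / len(revenue_a) + var_b / len(revenue_b))
    z = (mean_b - mean_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return z, p_value

# Toy data: 100 visitors per arm, every order worth 100.
control = [0.0] * 95 + [100.0] * 5      # 5% convert
treatment = [0.0] * 90 + [100.0] * 10   # 10% convert

z, p_value = welch_z(control, treatment)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

The zeros stay in the data on purpose: they are what links conversion rate and order value into a single revenue-per-visitor figure.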