Suppose you make an x% improvement followed by a y% improvement. Together do they make an (x + y)% improvement? Maybe.
The business principle of kaizen, based on the Japanese ?? for improvement, is based on the assumption that incremental improvements accumulate. But quantifying how improvements accumulate takes some care.
Add or Multiply?
Two successive 1% improvements amount to a 2% improvement. But two successive 50% improvements amount to a 125% improvement. So sometimes you can add, and sometimes you cannot. What’s going on?
An x% improvement multiplies something by 1 + x/100. For example, if you earn 5% interest on a principle of P dollars, you now have 1.05 P dollars.
So an x% improvement followed by a y% improvement multiplies by
(1 + x/100)(1 + y/100) = 1 + (x + y)/100 + xy/10000.
If x and y are small, then xy/10000 is negligible. But if x and y are large, the product term may not be negligible, depending on context. I go into this further in this post: Small probabilities add, big ones don’t.
Interactions
Now let’s look at a variation. Suppose doing one thing by itself brings an x% improvement and doing another thing by itself makes a y% improvement. How much improvement could you expect from doing both?
For example, suppose you find through A/B testing that changing the font on a page increases conversions by 10%. And you find in a separate A/B test that changing an image on the page increases conversions by 15%. If you change the font and the image, would you expect a 25% increase in conversions?
The issue here is not so much whether it is appropriate to add percentages. Since
1.1 × 1.15 = 1.265
you don’t get a much different answer whether you multiply or add. But maybe you could change the font and the image and conversions increase 12%. Maybe either change alone creates a better impression, but together they don’t make a better impression than doing one of the changes. Or maybe the new font and the new image clash somehow and doing both changes together lowers conversions.
The statistical term for what’s going on is interaction effects. A sequence of small improvements creates an additive effect if the improvements are independent. But the effects could be dependent, in which case the whole is less than the sum of the parts. This is typical. Assuming that improvements are independent is often overly optimistic. But sometimes you run into a synergistic effect and the whole is greater than the sum of the parts.
Sequential Testing
In the example above, we imagine testing the effect of a font change and an image change separately. What if we first changed the font, then with the new font tested the image? That’s better. If there were a clash between the new font and the new image we’d know it.
But we’re missing something here. If we had tested the image first and then tested the new font with the new image, we might have gotten different results. In general, the order of sequential testing matters.
Factorial Testing
If you have a small number of things to test, you can discover interaction effects by doing a factorial design, either a full factorial design or a fractional factorial design.
If you have a large number of things to test, you’ll have to do some sort of sequential testing. Maybe you do some combination of sequential and factorial testing, guided by which effects you have reason to believe will be approximately independent.
In practice, a testing plan needs to balance simplicity and statistical power. Sequentially testing one option at a time is simple, and may be fine if interaction effects are small. But if interaction effects are large, sequential testing may be leaving money on the table.