This is going to be another long post on the seemingly boring topic of optimisation. If you ask a hundred affiliates how they optimise a campaign, 90% of them will tell you they just get a feel for it. Very few affiliates have a set process or mathematical formula they use when optimising, whether that's creatives such as banners, site IDs, or mobile devices.
I would say that almost all affiliates, myself included, over-optimise at some point. It's just too easy to look at a campaign that is losing money or breaking even and cut out the worst 30% of the traffic. The problem is that if this is a campaign you intend to run for the long term, you are leaving a lot of money on the table by optimising this way.
To look at this in more detail and work on my own formulas, I used the following PHP code to set a random number between 0 and 9 for one of my tracking c variables.
$c7 = rand(0, 9); // assign each visitor a random bucket, 0-9, as tracking variable c7
I then collected data on that just to see how much variance shows up in a purely random field. What I was trying to emulate was 10 banners/landing pages/devices that all converted exactly the same. In theory the leads should be identical for each variable, but they obviously weren't.
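If you want to watch this effect happen without spending money on traffic, here's a rough simulation along the same lines. It's a sketch, not the script from the live test: the 0.1% conversion rate and the 150,000 impressions are made-up numbers.

```php
<?php
// Simulate 10 variables that all convert at exactly the same rate,
// then count how many leads each one picks up purely by chance.
$conversionRate = 0.001;   // made-up flat conversion rate
$impressions    = 150000;  // made-up traffic volume

$leads = array_fill(0, 10, 0);
for ($i = 0; $i < $impressions; $i++) {
    $cvar = rand(0, 9); // same random split as the live test
    if (mt_rand() / mt_getrandmax() < $conversionRate) {
        $leads[$cvar]++;
    }
}

foreach ($leads as $cvar => $count) {
    echo "cvar $cvar: $count leads\n";
}
```

Run it a few times and you'll see the "best" and "worst" variables shuffle around on every run, exactly like the tables below.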
Here's the data. I've gone through and marked what I think most affiliates would normally act on: a red box for the worst performers that would get cut when negatively optimising, a green box for the best performers that would often be whitelisted when positively optimising, and the rest neutral.
**31 leads (2 hours of data)**

| cvar | Impressions | Leads |
|------|------------:|------:|
| 0 | 1301 | 3 |
| 1 | 1354 | 4 |
| 2 | 1325 | 4 |
| 3 | 1296 | 5 |
| 4 | 1316 | 3 |
| 5 | 1347 | 1 |
| 6 | 1352 | 2 |
| 7 | 1319 | 4 |
| 8 | 1325 | 3 |
| 9 | 1287 | 2 |
| **Total** | 13222 | 31 |

**367 leads (1 day of data)**

| cvar | Impressions | Leads |
|------|------------:|------:|
| 0 | 15191 | 34 |
| 1 | 15266 | 40 |
| 2 | 15342 | 32 |
| 3 | 15162 | 45 |
| 4 | 15396 | 41 |
| 5 | 15308 | 39 |
| 6 | 15115 | 27 |
| 7 | 15358 | 41 |
| 8 | 15187 | 34 |
| 9 | 15276 | 34 |
| **Total** | 152601 | 367 |

**1359 leads (week 1)**

| cvar | Impressions | Leads |
|------|------------:|------:|
| 0 | 154779 | 139 |
| 1 | 155098 | 155 |
| 2 | 155657 | 119 |
| 3 | 155238 | 127 |
| 4 | 155178 | 146 |
| 5 | 154699 | 139 |
| 6 | 154862 | 141 |
| 7 | 155579 | 117 |
| 8 | 155242 | 152 |
| 9 | 155665 | 124 |
| **Total** | 1551997 | 1359 |

**3597 leads (week 2)**

| cvar | Impressions | Leads |
|------|------------:|------:|
| 0 | 219959 | 333 |
| 1 | 220168 | 365 |
| 2 | 218989 | 331 |
| 3 | 220844 | 379 |
| 4 | 219957 | 395 |
| 5 | 219921 | 381 |
| 6 | 218962 | 351 |
| 7 | 219464 | 363 |
| 8 | 219146 | 348 |
| 9 | 220077 | 351 |
| **Total** | 2197487 | 3597 |
Looks pretty familiar, doesn't it? 🙂 I'm sure that nearly everyone, myself included, has looked at a set of data like this and drawn the same conclusions.
Every one of the optimisations above is incorrect. There is simply not enough data in any of those tables to support the conclusion. It's obvious when you know they are just random variables, but how many times have you done the same thing with banners or site IDs on a traffic network?
OK, so the first table covers 31 leads, an average of 3.1 leads per variable. The highest variable has 5 conversions and the lowest has 1. If they were landing pages you would be thinking "yep, LP3 converts way better than LP5, let's get rid of LP5". But the truth is there is nowhere near enough data to say that one converts better than the other. The difference (swing) between the average and the highest/lowest is around 60-70% (5 is 61% above the 3.1 average; 1 is 68% below it).
The second table shows 367 leads, an average of 36.7 leads per variable. The highest is 45 and the lowest is 27. This is starting to get closer; the swing is now around 25% either way. Yet one random variable is still outperforming another by more than 50% (45 vs 27 leads). Would you optimise at this stage? I have in the past.
The third table has 1359 leads, an average of ~136 per variable, worst/best = 117/155, and a swing of about 15% in each direction.
The final table has 3597 leads, an average of ~360 per variable, worst/best = 331/395, and a swing of roughly 10%.
Combining the two full weeks gives 4956 leads in total (1359 + 3597), an average of ~496 per variable, worst/best = 472/541, and a swing of about 9%.
So from this we can work out the likely swing for a given number of conversions.
The chart below shows the amount of random swing you are likely to see affecting your stats.
So let’s draw some lines on the chart and see what swing we should expect for common optimisation situations.
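Those lines track a simple square-root rule surprisingly well. If you assume lead counts per variable are roughly Poisson distributed (a reasonable assumption for rare conversion events), the swing either side of the average at roughly the 95% level is about 2/√n, where n is the average leads per variable. Here's a quick sketch using the averages from the tables above:

```php
<?php
// Rule-of-thumb swing, assuming lead counts per variable are roughly
// Poisson distributed: the ~95% swing either side of the average is
// about 2 / sqrt(n), where n = average leads per variable.
function expectedSwing(float $avgLeads): float
{
    return 2 / sqrt($avgLeads);
}

// Averages from the tables above, plus the combined two-week total.
foreach ([3.1, 36.7, 136, 360, 496] as $avg) {
    printf("%5.1f leads/variable: +/- %.0f%% swing\n",
        $avg, expectedSwing($avg) * 100);
}
```

That prints roughly ±114%, ±33%, ±17%, ±11% and ±9%, which lines up well with the observed swings in the larger tables; at very small counts the extremes of only ten variables won't usually reach the full 95% range, which is why the first table came in under the rule.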
So if you are averaging 10 leads per variable, you can expect up to a 50% swing either way purely from the "luck factor". That means you should only be optimising out variables that have fewer than 5 conversions.
With 100 leads per variable you can expect up to roughly a 20% swing in the data. So you should only be blacklisting variables with significantly fewer than 80 leads, and whitelisting variables with 120 or more.
So the optimisation process I recommend goes something like this: at around 10 conversions per variable, start by cutting any variables with no conversions at all. At around 50 conversions per variable, take out any with fewer than 25. At around 100 conversions per variable, look really closely at the data and blacklist anything with fewer than 70 conversions. (There's a sketch of this in code just below.)
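Here's that staged process as code. The thresholds are the ones from the paragraph above; treat them as starting points, not gospel.

```php
<?php
// Staged blacklisting rules from the process described above.
// $avgLeads = average leads per variable so far,
// $leads    = this variable's lead count.
function shouldBlacklist(float $avgLeads, int $leads): bool
{
    if ($avgLeads >= 100) {
        return $leads < 70;  // enough data: cut anything well below average
    }
    if ($avgLeads >= 50) {
        return $leads < 25;  // cut anything at less than half the average
    }
    if ($avgLeads >= 10) {
        return $leads === 0; // early on, only cut total non-converters
    }
    return false;            // too little data: don't optimise at all yet
}

var_dump(shouldBlacklist(100, 65));  // bool(true)  - blacklist it
var_dump(shouldBlacklist(36.7, 27)); // bool(false) - leave it alone
```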
This ultimately depends on the campaign itself though, as I am still far more aggressive with a short-term mobile campaign than with a long-term traffic source running to my own offer.
The traffic I used for this test was a simple DOI (double opt-in) conversion on a fairly large campaign. What chance do you have if you are optimising for a high-payout CPA or CPS offer? The worst over-optimisers in the world are affiliate managers who run direct affiliate offers, particularly in dating, where the CPA is $10 and the CPS is around $100-200. If you send them source data they will start trying to optimise your campaigns for you on just 2-3 sales. How many times have you heard "this siteid doesn't convert lead to sale", and when you go look they've only received 20 leads and probably haven't had a single sale? The moral of this story: don't send source data to anyone. Reps do this as a day job and have neither the skills nor the inclination required to optimise your traffic. Work with your own data; it's the only way to make sure you aren't over-optimising your campaigns.
If you have a high-CPA offer like a binary options or insurance type thing, it is often worthwhile setting up a second pixel: one that fires on completion of the offer, and another that fires much more often. One thing I like to do is insert a JavaScript function that loads a conversion pixel when the visitor reaches the bottom of the page. This gives you a good indication of siteid quality long before you have enough data to optimise on a $100 CPA.
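A minimal version of that scroll-triggered pixel might look something like this. It has to run client-side, so it's JavaScript rather than PHP, and the pixel URL is a placeholder for whatever your tracker actually uses:

```javascript
// Fire a secondary "engagement" pixel the first time the visitor
// scrolls to (near) the bottom of the page. The URL is a placeholder;
// swap in your own tracker's pixel with its usual parameters.
(function () {
    var fired = false;
    window.addEventListener('scroll', function () {
        if (fired) return;
        var scrolled = window.innerHeight + window.pageYOffset;
        if (scrolled >= document.body.offsetHeight - 50) {
            fired = true;
            new Image().src = 'https://tracker.example.com/pixel?type=scroll';
        }
    });
})();
```

Because these soft conversions arrive at many times the rate of real sales, you hit a usable sample size per siteid far sooner.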
Optimisation is a core skill for affiliate marketers and anyone buying traffic online. Understanding, and being able to predict, the conversion variance caused by good or bad luck leads to more professional analysis of your data and, in turn, more successful long-term campaigns and bigger profits.
Update 1/4/15: I came across the site below, which prints out some lovely graphs that might help visualise what I'm trying to get at, and gives a mathematically more rigorous optimisation model.
Additional resource: http://glimmer.rstudio.com/odnl/ab-test-calculator-bayes/